There are three main components to Mask2Former. The first is a Swin backbone, which accepts an image and creates a low-resolution image feature map from 3 consecutive 3x3 convolutions.
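To make "low-resolution" concrete, here is a minimal sketch of how stacking 3x3 convolutions shrinks the spatial size of a feature map. It uses only the standard convolution output-size formula. The stride of 2 and padding of 1 are assumptions chosen for illustration, and the helper names `conv_out_size` and `stacked_conv_out_size` are hypothetical; none of these are taken from the Mask2Former implementation.

```python
def conv_out_size(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Spatial output size of one convolution along a single dimension.

    Standard formula: floor((size + 2 * padding - kernel) / stride) + 1.
    The default stride and padding are illustrative assumptions.
    """
    return (size + 2 * padding - kernel) // stride + 1


def stacked_conv_out_size(size: int, num_layers: int = 3, **conv_kwargs: int) -> int:
    """Apply the same convolution geometry num_layers times in a row."""
    for _ in range(num_layers):
        size = conv_out_size(size, **conv_kwargs)
    return size


if __name__ == "__main__":
    # With the assumed stride of 2, each layer halves the resolution:
    # 224 -> 112 -> 56 -> 28.
    print(stacked_conv_out_size(224))  # prints 28
```

Under these assumptions, a 224x224 input ends up as a 28x28 feature map, an 8x reduction per side. With a stride of 1 and the same padding, the spatial size would stay unchanged, so the amount of downsampling depends on the stride and not on the kernel size.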