There are three main components to Mask2Former: | |
A Swin backbone accepts an image and creates a low-resolution image feature map from 3 consecutive 3x3 convolutions. |
There are three main components to Mask2Former: | |
A Swin backbone accepts an image and creates a low-resolution image feature map from 3 consecutive 3x3 convolutions. |