There are three main components to Mask2Former:
A Swin backbone accepts an image and creates a low-resolution image feature map from 3 consecutive 3x3 convolutions.