The model pairs a hierarchical Transformer encoder with a lightweight all-MLP decode head and achieves strong
results on image segmentation benchmarks such as ADE20K and Cityscapes.
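
To make the encoder/decoder split concrete, the following is a minimal shape-only sketch of how a hierarchical encoder and an all-MLP decode head might fit together. It performs no real computation and is not the model's actual implementation. The stage strides (4, 8, 16, 32), the channel widths (32, 64, 160, 256), the embedding size of 256, and the 150-class output are illustrative assumptions that the text above does not state, and the names `FeatureMap`, `encoder_shapes`, and `decode_head_shapes` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FeatureMap:
    """Shape of one feature map: channels x height x width."""
    channels: int
    height: int
    width: int


def encoder_shapes(height, width,
                   strides=(4, 8, 16, 32),        # assumed per-stage downsampling
                   channels=(32, 64, 160, 256)):  # assumed per-stage widths
    """Return one feature-map shape per encoder stage (coarser and wider as we go)."""
    return [FeatureMap(c, height // s, width // s)
            for s, c in zip(strides, channels)]


def decode_head_shapes(features, embed_dim=256, num_classes=150):
    """Trace shapes through an all-MLP head: project, upsample, concatenate, classify."""
    target_h, target_w = features[0].height, features[0].width

    # 1) A per-stage linear layer maps every stage to the same channel width.
    projected = [FeatureMap(embed_dim, f.height, f.width) for f in features]

    # 2) Every stage is upsampled to the resolution of the finest stage.
    upsampled = [FeatureMap(p.channels, target_h, target_w) for p in projected]

    # 3) Concatenating along the channel axis sums the channel counts.
    fused_channels = sum(u.channels for u in upsampled)

    # 4) A final linear layer fuses the channels and predicts per-pixel class logits.
    logits = FeatureMap(num_classes, target_h, target_w)
    return fused_channels, logits


features = encoder_shapes(512, 512)
fused_channels, logits = decode_head_shapes(features)

for stage, f in enumerate(features, start=1):
    print(f"stage {stage}: {f.channels} x {f.height} x {f.width}")
print(f"concatenated channels: {fused_channels}")
print(f"logits: {logits.channels} x {logits.height} x {logits.width}")
```

Under these assumptions, the decode head contains only linear projections and upsampling, which is what keeps it lightweight compared with the encoder.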