Following BERT developed in the natural language processing area, we propose a masked image | |
modeling task to pretrain vision Transformers. |
Following BERT developed in the natural language processing area, we propose a masked image | |
modeling task to pretrain vision Transformers. |