---
datasets:
- imagenet-1k
tags:
- mae
- crossmae
pipeline_tag: image-classification
library_name: pytorch
license: cc-by-nc-4.0
---

## CrossMAE: Rethinking Patch Dependence for Masked Autoencoders

by Letian Fu*, Long Lian*, Renhao Wang, Baifeng Shi, Xudong Wang, Adam Yala†, Trevor Darrell†, Alexei A. Efros†, Ken Goldberg† at UC Berkeley and UCSF

[[Paper](https://arxiv.org/abs/2401.14391)] | [[Project Page](https://crossmae.github.io/)] | [[Citation](#citation)]
| | ViT-Small | ViT-Base | ViT-Base448 | ViT-Large | ViT-Huge |
| --- | --- | --- | --- | --- | --- |
| pretrained checkpoint | download | download | download | download | download |
| fine-tuned checkpoint | download | download | download | download | download |
| Reference ImageNet accuracy (ours) | 79.318 | 83.722 | 84.598 | 85.432 | 86.256 |
| MAE ImageNet accuracy (baseline) | – | – | 84.8 | 85.9 | – |
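
A minimal sketch of loading a fine-tuned checkpoint for ImageNet classification. This assumes the fine-tuned weights are a standard timm-style ViT with the weights nested under a `"model"` key (the usual MAE convention); the local file name `crossmae_vitb_finetuned.pth` and the `strict=False` loading are illustrative assumptions, not confirmed by this card.

```python
import timm
import torch

# Hypothetical local path: replace with a fine-tuned checkpoint downloaded from the table above.
ckpt_path = "crossmae_vitb_finetuned.pth"

# Assumption: the fine-tuned CrossMAE ViT-Base uses the standard timm ViT-B/16 architecture.
model = timm.create_model("vit_base_patch16_224", num_classes=1000)

checkpoint = torch.load(ckpt_path, map_location="cpu")
# MAE-style checkpoints typically store the weights under "model"; fall back to the raw dict otherwise.
state_dict = checkpoint.get("model", checkpoint)
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("missing keys:", missing)
print("unexpected keys:", unexpected)

# Quick sanity check on a dummy 224x224 input.
model.eval()
with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 1000])
```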