The abstract from the paper is the following:
We introduce a self-supervised vision representation model BEiT, which stands for Bidirectional Encoder representation
from Image Transformers.