DiT applies the self-supervised objective of BEiT (BERT pre-training of Image Transformers) to 42 million document images, allowing for state-of-the-art results on tasks including:
document image classification: the RVL-CDIP dataset (a collection of
400,000 images, each belonging to one of 16 classes).
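To make the 16-way classification target concrete, here is a minimal sketch of turning one image's 16 raw class scores into a predicted label. It uses only the Python standard library and does not load DiT itself. The label names and their order are an assumption based on the commonly published RVL-CDIP class list, and `predict_label` is a hypothetical helper name; a fine-tuned checkpoint's own label mapping should take precedence over this list.

```python
import math

# Assumption: the commonly published RVL-CDIP class list, in its usual order.
# A fine-tuned checkpoint's own label mapping should be preferred if available.
RVL_CDIP_LABELS = [
    "letter", "form", "email", "handwritten", "advertisement",
    "scientific report", "scientific publication", "specification",
    "file folder", "news article", "budget", "invoice",
    "presentation", "questionnaire", "resume", "memo",
]


def predict_label(logits):
    """Return (label, probability) for one image's 16 raw class scores.

    Hypothetical helper for illustration; not part of any library.
    """
    if len(logits) != len(RVL_CDIP_LABELS):
        raise ValueError(
            f"expected {len(RVL_CDIP_LABELS)} scores, got {len(logits)}"
        )
    # Numerically stable softmax: subtract the maximum before exponentiating.
    highest = max(logits)
    exps = [math.exp(score - highest) for score in logits]
    total = sum(exps)
    probs = [value / total for value in exps]
    # Pick the index with the largest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return RVL_CDIP_LABELS[best], probs[best]
```

In practice the scores would come from a classification head on top of the pre-trained encoder; the softmax and argmax step shown here stays the same regardless of how the scores are produced.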