[ViTForImageClassification] adds an image classification head on top of the base [ViTModel]. The head is a single linear layer applied to the final hidden state of the CLS token.
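The relationship between the base model and the head can be checked with a minimal sketch. It assumes a small, randomly initialized configuration (the sizes below are illustrative, not those of any released checkpoint) so that nothing is downloaded, and it relies on the `vit` and `classifier` attribute names used by the current `transformers` implementation, which may differ between versions.

```python
import torch
from transformers import ViTConfig, ViTForImageClassification

# Illustrative tiny configuration (assumed values, not a released checkpoint).
config = ViTConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
    image_size=32,
    patch_size=8,
    num_labels=3,
)
model = ViTForImageClassification(config).eval()

# One random RGB image of shape (batch, channels, height, width).
pixel_values = torch.randn(1, 3, 32, 32)

with torch.no_grad():
    # Full model: base ViT followed by the classification head.
    logits = model(pixel_values=pixel_values).logits

    # Same computation by hand: run the base model, take the CLS token
    # (position 0) of the final hidden state, apply the linear layer.
    hidden = model.vit(pixel_values=pixel_values).last_hidden_state
    manual_logits = model.classifier(hidden[:, 0, :])

print(logits.shape)         # one score per label
print(hidden.shape)         # (batch, 1 CLS token + 16 patches, hidden_size)
print(torch.allclose(logits, manual_logits, atol=1e-6))
```

Only the first position of the sequence feeds the classifier; the patch token states are ignored by this head.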