ViT has a global receptive field: thanks to its attention mechanism, every image patch can attend to every other patch, so the model can see more of an image at once.
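As a rough illustration of the global receptive field mentioned above, the sketch below computes single-head self-attention weights over a grid of patch embeddings and compares how many positions one attention query reaches with how many a single 3x3 convolution reaches. It is a toy model, not ViT's actual implementation: the 14x14 patch grid (a 224x224 image with 16x16 patches), the embedding size, the random embeddings, the identity query/key projections, and the helper names `attention_weights` and `conv_receptive_field` are all assumptions made for the example.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(patches):
    """Single-head self-attention weights.

    Assumption: queries and keys are the patch embeddings themselves
    (identity projections), which keeps the sketch short.
    """
    dim = len(patches[0])
    scale = 1.0 / math.sqrt(dim)
    weights = []
    for q in patches:
        # Every query is scored against every patch in the image.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in patches]
        weights.append(softmax(scores))
    return weights


def conv_receptive_field(grid, row, col, kernel=3):
    """Grid positions a single kernel x kernel convolution sees at (row, col)."""
    half = kernel // 2
    return [
        (r, c)
        for r in range(row - half, row + half + 1)
        for c in range(col - half, col + half + 1)
        if 0 <= r < grid and 0 <= c < grid
    ]


random.seed(0)
GRID = 14  # assumed: 224x224 image split into 16x16 patches
DIM = 8    # assumed toy embedding size, far smaller than a real model

patches = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(GRID * GRID)]
weights = attention_weights(patches)

# Count how many patches receive non-zero attention from the first patch.
attended = sum(1 for w in weights[0] if w > 0)
conv_seen = len(conv_receptive_field(GRID, 7, 7))

print(f"attention: patch 0 attends to {attended} of {GRID * GRID} patches")
print(f"3x3 conv: one position sees {conv_seen} of {GRID * GRID} positions")
```

Because softmax assigns a positive weight to every key, a single attention layer already mixes information from the whole image, whereas a convolution has to stack layers to widen its receptive field.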