To feed images to the Transformer encoder, each image is split into a sequence of fixed-size non-overlapping patches,
which are then linearly embedded.
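
The patch step described above can be sketched in a few lines of NumPy. This is only an illustration, not the library's implementation: the function names `patchify` and `embed_patches`, the 224x224 input, the 16x16 patch size, the hidden size of 64, and the randomly initialised projection are all assumptions chosen for the example.

```python
import numpy as np


def patchify(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping square patches."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    grid_h, grid_w = h // patch_size, w // patch_size
    # (grid_h, P, grid_w, P, C) -> (grid_h, grid_w, P, P, C) -> (N, P*P*C)
    patches = image.reshape(grid_h, patch_size, grid_w, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(grid_h * grid_w, patch_size * patch_size * c)


def embed_patches(patches, weight, bias):
    """Linearly project each flattened patch to the hidden dimension."""
    return patches @ weight + bias


rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))   # illustrative input size
patches = patchify(image, 16)       # 14 x 14 = 196 patches, each 16*16*3 = 768 values

hidden_size = 64                    # illustrative; real models use larger values
weight = rng.normal(scale=0.02, size=(patches.shape[1], hidden_size))
bias = np.zeros(hidden_size)
tokens = embed_patches(patches, weight, bias)

print(patches.shape, tokens.shape)
```

Each row of `tokens` corresponds to one patch, read left to right and top to bottom, so the encoder receives a sequence whose length is the number of patches. In a trained model the projection weights are learned rather than drawn at random.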