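Pix2Struct checkpoints are fine-tuned per task, so pick the one that matches your use case. For example, if you want to use Pix2Struct for image captioning, use the checkpoint fine-tuned on a natural image captioning dataset; the same rule applies to the other tasks.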
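A minimal sketch of task-based checkpoint selection follows. The task labels, the `CHECKPOINTS` mapping, and the helpers `pick_checkpoint` and `caption_image` are hypothetical names introduced for illustration. The checkpoint identifiers are assumed to be the ones published under the `google` organization on the Hugging Face Hub, so verify them before relying on this.

```python
# Hypothetical mapping from a task label to a task-specific Pix2Struct checkpoint.
# The checkpoint names are assumptions; confirm them on the Hugging Face Hub.
CHECKPOINTS = {
    "image_captioning": "google/pix2struct-textcaps-base",
    "document_vqa": "google/pix2struct-docvqa-base",
    "screen_summarization": "google/pix2struct-screen2words-base",
}


def pick_checkpoint(task: str) -> str:
    """Return the checkpoint fine-tuned for `task`, or raise if none is listed."""
    try:
        return CHECKPOINTS[task]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"No checkpoint listed for task {task!r}; known tasks: {known}")


def caption_image(image_path: str, task: str = "image_captioning") -> str:
    """Generate a caption for the image at `image_path` with the matching checkpoint."""
    # Imported here so the mapping above can be used without these packages installed.
    from PIL import Image
    from transformers import Pix2StructForConditionalGeneration, Pix2StructProcessor

    name = pick_checkpoint(task)
    processor = Pix2StructProcessor.from_pretrained(name)
    model = Pix2StructForConditionalGeneration.from_pretrained(name)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    generated_ids = model.generate(**inputs, max_new_tokens=50)
    return processor.decode(generated_ids[0], skip_special_tokens=True)
```

Calling `caption_image("photo.jpg")` (a placeholder path) would download the captioning checkpoint on first use and return the generated caption as a string. Keeping the mapping separate from the loading code makes it easy to swap in a different checkpoint for another task.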