CLIP takes a different approach and makes a pair prediction of (image, text).
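
As a rough illustration of what a pair prediction over (image, text) could look like, the sketch below scores every image embedding against every text embedding with cosine similarity and turns each row into a softmax distribution over texts. The function names (`normalize`, `pair_scores`), the `temperature` parameter, and the hand-made toy embeddings are assumptions for illustration only; a real system would obtain the embeddings from trained image and text encoders, and this is not CLIP's actual implementation.

```python
import math


def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def pair_scores(image_embs, text_embs, temperature=1.0):
    """For each image, return a softmax distribution over the candidate texts.

    `temperature` is a hypothetical scaling knob: smaller values sharpen
    the distribution.
    """
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    probs = []
    for img in imgs:
        # Cosine similarity between this image and every text.
        logits = [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        # Numerically stable softmax over the texts.
        top = max(logits)
        exps = [math.exp(l - top) for l in logits]
        total = sum(exps)
        probs.append([e / total for e in exps])
    return probs


# Toy embeddings, made up for this example (not produced by any real encoder).
image_embs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2]]
text_embs = [[0.9, 0.0, 0.1], [0.1, 0.8, 0.0], [0.0, 0.0, 1.0]]

probs = pair_scores(image_embs, text_embs, temperature=0.1)
# Index of the highest-scoring text for each image.
best = [row.index(max(row)) for row in probs]
```

Here `best[i]` is the index of the text judged most likely to be paired with image `i`; with these toy vectors each image is matched to the text pointing in roughly the same direction.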