We demonstrate that the simple pre-training task of predicting which caption goes
with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400
million (image, text) pairs collected from the internet.
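
The caption-matching task described above can be illustrated with a small sketch. This is one common way to formulate such an objective, a symmetric cross-entropy over a batch similarity matrix, and not necessarily the exact training code behind the result. The function names, the NumPy implementation, and the temperature value of 0.07 are assumptions chosen for illustration; the embeddings are assumed to come from separate image and text encoders that are not shown.

```python
import numpy as np


def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def caption_matching_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss for a batch of n matched (image, text) pairs.

    image_emb, text_emb: arrays of shape (n, d); row i of each is one pair.
    temperature: illustrative scaling value (an assumption, not from the text).
    """
    # L2-normalise so that dot products are cosine similarities.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # logits[i, j] scores image i against caption j.
    logits = image_emb @ text_emb.T / temperature
    idx = np.arange(logits.shape[0])

    # Image -> text: for each image, pick the right caption among all n.
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Text -> image: for each caption, pick the right image among all n.
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return (loss_i2t + loss_t2i) / 2
```

Under these assumptions, a batch whose images and captions are correctly paired yields a low loss, while the same batch with captions shuffled yields a high one, which is the signal that would drive the encoders toward aligned representations.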