It can be used for image-text similarity and for zero-shot image classification.
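
As a rough illustration of these two uses, the sketch below assumes the image and each candidate caption have already been encoded into vectors in a shared embedding space. The function names, the toy vectors, and the `scale` value are hypothetical and are not taken from any particular model's API. Image-text similarity is computed as cosine similarity, and zero-shot classification picks the caption whose embedding is most similar to the image embedding.

```python
import math


def normalize(vec):
    """Return vec scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Image-text similarity: dot product of the unit-length embeddings."""
    a_n, b_n = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def zero_shot_classify(image_emb, text_embs, labels, scale=100.0):
    """Pick the label whose text embedding is most similar to the image.

    `scale` is an assumed temperature-like factor that sharpens the softmax;
    the right value depends on the model and is not specified here.
    """
    logits = [scale * cosine_similarity(image_emb, t) for t in text_embs]
    # Numerically stable softmax over the scaled similarities.
    max_logit = max(logits)
    exps = [math.exp(l - max_logit) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs


# Toy embeddings standing in for real encoder outputs (assumption).
image_emb = [0.9, 0.1, 0.0]
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
labels = ["a photo of a cat", "a photo of a dog"]

label, probs = zero_shot_classify(image_emb, text_embs, labels)
print(label, probs)
```

No class-specific training is involved: the candidate classes are supplied only as text at inference time, which is what makes the classification zero-shot.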