[ViltProcessor] wraps a BERT tokenizer and ViLT image processor into a convenient single processor:
from transformers import ViltProcessor
processor = ViltProcessor.from_pretrained(model_checkpoint)
To preprocess the data, we need to encode both the images and the questions using the [ViltProcessor].
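As a rough illustration of that encoding step, the sketch below passes one image-question pair through a processor in a single call. The helper name `encode_example` and the default `max_length=40` (chosen to match ViLT's text length limit) are assumptions made for this example and are not taken from the guide itself.

```python
from typing import Any


def encode_example(processor: Any, image: Any, question: str, max_length: int = 40) -> Any:
    """Encode one image-question pair with a ViltProcessor-style callable.

    `encode_example` is a hypothetical helper for illustration only.
    `processor` is expected to behave like `ViltProcessor`: it takes the
    image(s) first, the text second, and tokenizer options as keywords.
    """
    return processor(
        image,
        question,
        padding="max_length",   # pad every question to the same length
        truncation=True,        # cut questions longer than max_length
        max_length=max_length,  # assumed default; adjust for your checkpoint
        return_tensors="pt",    # PyTorch tensors, ready for the model
    )
```

With the `processor` loaded above and a PIL image, `encode_example(processor, image, "What is in the picture?")` should return an encoding that holds the tokenized question (`input_ids`, `token_type_ids`, `attention_mask`) alongside the processed image (`pixel_values`, `pixel_mask`).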