We further propose two visually-grounded language model objectives for
pre-training VisualBERT on image caption data.