We also show the generalizability of our
pretrained cross-modality model by adapting it to a challenging visual-reasoning task, NLVR, and improve the previous
best result by 22% absolute (54% to 76%).