Further, while the cross-modality encoder
contains self-attention for each respective modality as well as cross-attention, only the cross-attention is returned;
both self-attention outputs are disregarded.
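
The behaviour described above can be illustrated with a minimal NumPy sketch. It is not the library's implementation: it assumes single-head attention with no learned projections, and the names `attention` and `cross_modality_layer` are hypothetical. It only shows the idea that one layer computes both cross-attention and per-modality self-attention but hands back only the cross-attention weights.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(query, context):
    """Single-head scaled dot-product attention (no learned weights).

    Returns the attended output and the attention weights.
    """
    d = query.shape[-1]
    scores = query @ context.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ context, weights


def cross_modality_layer(lang, vis):
    """Hypothetical, simplified cross-modality layer.

    lang: (num_tokens, dim) language features
    vis:  (num_regions, dim) visual features
    """
    # Cross-attention: each modality attends to the other one.
    lang_x, cross_weights = attention(lang, vis)
    vis_x, _ = attention(vis, lang)

    # Self-attention within each modality; the weights are discarded.
    lang_out, _lang_self_weights = attention(lang_x, lang_x)
    vis_out, _vis_self_weights = attention(vis_x, vis_x)

    # Only the cross-attention weights are returned with the outputs.
    return lang_out, vis_out, cross_weights


rng = np.random.default_rng(0)
lang = rng.normal(size=(4, 8))  # 4 language tokens, 8 features each
vis = rng.normal(size=(3, 8))   # 3 visual regions, 8 features each
lang_out, vis_out, cross_weights = cross_modality_layer(lang, vis)
print(cross_weights.shape)
```

Here `cross_weights` has one row per language token and one column per visual region; the self-attention weights never leave the function, which mirrors the point made in the text.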