Further, while the cross-modality encoder
  contains self-attention for each respective modality as well as cross-attention, only the cross-attention is returned and
  both self-attention outputs are disregarded.
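
The behaviour described above can be illustrated with a toy sketch in plain Python. This is not the library's implementation: `attention`, `mix` and `cross_modality_layer` are hypothetical helpers, learned projections and feed-forward sublayers are omitted, and returning the language-to-vision weights specifically is an assumption made for the illustration.

```python
import math


def attention(queries, keys):
    """Softmax-normalised dot-product attention weights, one row per query."""
    weights = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q)) for k in keys]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def mix(weights, values):
    """Weighted sum of value vectors for each row of attention weights."""
    dim = len(values[0])
    return [[sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
            for row in weights]


def cross_modality_layer(lang, vis):
    # Cross-attention: each modality attends to the other one.
    cross_probs = attention(lang, vis)  # assumed to be the weights that are kept
    lang_x = mix(cross_probs, vis)
    vis_x = mix(attention(vis, lang), lang)

    # Self-attention per modality: used to compute the outputs,
    # but the weights themselves are never returned.
    lang_out = mix(attention(lang_x, lang_x), lang_x)
    vis_out = mix(attention(vis_x, vis_x), vis_x)

    return lang_out, vis_out, cross_probs


# Three language tokens and two visual regions, each a 2-dimensional vector.
lang = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
vis = [[0.5, 0.5], [1.0, 0.0]]
lang_out, vis_out, cross_probs = cross_modality_layer(lang, vis)
```

Here `cross_probs` has one row per language token and one column per visual region. The self-attention weights exist only as intermediate values inside the function, which mirrors why they cannot be inspected from the returned outputs.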