The novel convolution heads, together with the remaining self-attention heads, form a new mixed attention block that is more efficient at learning both global and local context.
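
The sketch below illustrates the general idea of such a mixed attention block: part of the hidden dimension is routed through standard self-attention heads (global context) and the rest through depthwise convolution heads (local context). This is a minimal, hypothetical PyTorch sketch; the class and parameter names (`MixedAttentionBlock`, `num_attn_heads`, `num_conv_heads`, `conv_kernel_size`) are illustrative assumptions and do not reflect the repository's actual implementation, which uses span-based dynamic convolution.

```python
# Hypothetical sketch of a mixed attention block (not the repo's actual API).
import torch
import torch.nn as nn


class MixedAttentionBlock(nn.Module):
    def __init__(self, d_model=768, num_attn_heads=6, num_conv_heads=6,
                 conv_kernel_size=9):
        super().__init__()
        head_dim = d_model // (num_attn_heads + num_conv_heads)
        self.attn_dim = num_attn_heads * head_dim   # channels for attention heads
        self.conv_dim = num_conv_heads * head_dim   # channels for convolution heads

        # Global-context branch: ordinary multi-head self-attention
        # over a slice of the hidden dimension.
        self.attn = nn.MultiheadAttention(self.attn_dim, num_attn_heads,
                                          batch_first=True)

        # Local-context branch: depthwise 1-D convolution acting as the
        # "convolution heads" over the remaining slice.
        self.conv = nn.Conv1d(self.conv_dim, self.conv_dim,
                              kernel_size=conv_kernel_size,
                              padding=conv_kernel_size // 2,
                              groups=self.conv_dim)

        self.out_proj = nn.Linear(self.attn_dim + self.conv_dim, d_model)

    def forward(self, x):
        # x: (batch, seq_len, d_model)
        x_attn = x[..., :self.attn_dim]
        x_conv = x[..., self.attn_dim:self.attn_dim + self.conv_dim]

        # Self-attention heads capture global dependencies.
        attn_out, _ = self.attn(x_attn, x_attn, x_attn)

        # Convolution heads capture local dependencies; Conv1d expects
        # (batch, channels, seq_len), so transpose around the call.
        conv_out = self.conv(x_conv.transpose(1, 2)).transpose(1, 2)

        # Concatenate both head groups and project back to the model dimension.
        return self.out_proj(torch.cat([attn_out, conv_out], dim=-1))


if __name__ == "__main__":
    block = MixedAttentionBlock()
    hidden = torch.randn(2, 16, 768)
    print(block(hidden).shape)  # torch.Size([2, 16, 768])
```

Splitting the heads this way keeps the total parameter and compute budget comparable to a pure self-attention block, while the convolution branch handles local patterns that would otherwise occupy attention heads.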