Although all of its attention heads query the whole input sequence to generate the attention map from a global perspective, we observe that some heads only need to learn local dependencies, which implies the existence of computational redundancy.
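To make this observation concrete, the sketch below (our own illustration, not the paper's procedure; the window size `window`, the tensor layout, and the function name are assumptions) measures, for each head, the average fraction of attention probability that a query places on keys within a small local window of its own position. Heads whose mass concentrates near the diagonal are effectively learning local dependencies, so letting them query the entire sequence is largely redundant computation.

```python
import torch

def local_attention_mass(attn, window=8):
    """attn: attention probabilities of shape (heads, seq_len, seq_len),
    with each row summing to 1. Returns one score per head: the average
    fraction of attention mass each query assigns to keys within
    +/- `window` positions of itself."""
    heads, seq_len, _ = attn.shape
    idx = torch.arange(seq_len)
    # Band mask that is True where |query_pos - key_pos| <= window
    local = (idx[None, :] - idx[:, None]).abs() <= window
    # Sum probability over the local band for every query, then average
    # over queries to obtain a locality score in [0, 1] for each head
    return (attn * local).sum(dim=-1).mean(dim=-1)

# Example with random attention maps: 4 heads over a 64-token sequence
scores = torch.softmax(torch.randn(4, 64, 64), dim=-1)
print(local_attention_mass(scores, window=8))
```

A head with a locality score close to 1 could be restricted to a windowed attention pattern with little loss, which is one way such redundancy can be exploited.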