When working with long contexts, models apply various optimizations to keep attention cost from scaling quadratically with sequence length.
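One such optimization (an illustrative choice; the text does not name a specific technique) is sliding-window attention, where each token attends only to a fixed-size window of recent tokens, reducing cost from O(n²) to O(n·w). A minimal NumPy sketch:

```python
import numpy as np

def sliding_window_attention(q, k, v, window):
    """Causal attention where query i attends only to the `window`
    most recent positions, so cost is O(n * window), not O(n^2)."""
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo = max(0, i - window + 1)          # start of the local window
        scores = q[i] @ k[lo:i + 1].T / np.sqrt(d)
        weights = np.exp(scores - scores.max())  # stable softmax
        weights /= weights.sum()
        out[i] = weights @ v[lo:i + 1]
    return out
```

With a window at least as long as the sequence, this reduces to ordinary causal attention, so the window size trades attention span for compute.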