To address this limitation, we introduce the Longformer with an attention
mechanism that scales linearly with sequence length, making it easy to process documents of thousands of tokens or
longer.
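
The passage above does not spell out the mechanism, so the following is only a rough sketch of why a local attention pattern scales linearly. It assumes a fixed-size sliding window, which is the local pattern the Longformer is known for. The function names and the `window` parameter are hypothetical, chosen for illustration, and are not taken from any library. The sketch counts how many (query, key) pairs are scored under full self-attention versus windowed attention.

```python
def full_attention_pairs(n: int) -> int:
    """Number of (query, key) pairs scored by full self-attention: n * n."""
    return n * n


def windowed_attention_pairs(n: int, window: int) -> int:
    """Number of (query, key) pairs when each token attends only to itself
    and to the tokens within window // 2 positions on either side.
    The window is clipped at the sequence boundaries."""
    half = window // 2
    total = 0
    for i in range(n):
        lo = max(0, i - half)        # leftmost position token i attends to
        hi = min(n - 1, i + half)    # rightmost position token i attends to
        total += hi - lo + 1
    return total


if __name__ == "__main__":
    # Illustrative window size (an assumption, not a value from the text).
    for n in (1000, 2000, 4000):
        print(n, full_attention_pairs(n), windowed_attention_pairs(n, window=512))
```

Under these assumptions, doubling the sequence length quadruples the full-attention count, while the windowed count grows by a fixed amount per added token (about `window + 1` pairs), which is the sense in which the cost is linear in sequence length.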