LED works very well on long-range sequence-to-sequence tasks where the `input_ids` are much longer than 1024 tokens.
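As a rough illustration of such a task, the sketch below feeds a long document to an LED checkpoint through the `transformers` library. Several things here are assumptions rather than part of this card: the checkpoint name `allenai/led-base-16384`, the 16384-token input limit, the choice to put global attention on the first token only, and the helper names `build_global_attention_mask` and `summarize_long_text`, which are hypothetical.

```python
from typing import List


def build_global_attention_mask(seq_len: int) -> List[int]:
    """Return a per-token mask: 1 = global attention, 0 = local attention.

    Assumption: only the first token gets global attention.
    """
    if seq_len <= 0:
        return []
    return [1] + [0] * (seq_len - 1)


def summarize_long_text(
    text: str,
    checkpoint: str = "allenai/led-base-16384",  # assumed checkpoint name
    max_input_tokens: int = 16384,               # assumed input limit
) -> str:
    # Imported lazily so the helper above can be used without these packages.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

    # Tokenize the full document; inputs well beyond 1024 tokens are expected.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    seq_len = inputs["input_ids"].shape[1]

    # Shape (1, seq_len), matching the single-example batch above.
    global_attention_mask = torch.tensor([build_global_attention_mask(seq_len)])

    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        global_attention_mask=global_attention_mask,
        max_new_tokens=128,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize_long_text(long_document)` downloads the checkpoint on first use. A checkpoint that has not been fine-tuned for the target task may not produce useful output, so treat this as a shape-and-API sketch only.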