To improve efficiency, we examine the
much-overlooked redundancy in maintaining a full-length token-level representation, especially for tasks that only
require a single-vector representation of the sequence.