To improve efficiency, we examine the
much-overlooked redundancy in maintaining a full-length token-level representation, especially for tasks that only
require a single-vector representation of the sequence.