5fa1a76
1
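DeBERTa introduced a disentangled attention mechanism in which each token is represented by two separate vectors, one encoding its content and one encoding its relative position, and attention scores are computed from both.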
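The following is a minimal single-head NumPy sketch of how such a score could be assembled from content-to-content, content-to-position, and position-to-content terms. The function names (`relative_bucket`, `disentangled_attention`), the weight-matrix arguments, and the maximum relative distance `k` are illustrative assumptions, not the reference implementation; multi-head splitting, masking, and dropout are omitted.

```python
import numpy as np

def relative_bucket(n, k):
    """Map each pair (i, j) to a relative-position index in [0, 2k)."""
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    # Distances beyond +/-k are clipped into the outermost buckets.
    return np.clip(i - j + k, 0, 2 * k - 1)

def disentangled_attention(H, P, Wq_c, Wk_c, Wv_c, Wq_r, Wk_r, k):
    """Single-head sketch (hypothetical helper).

    H: (n, d_model) content vectors, one per token.
    P: (2k, d_model) relative-position embeddings, shared by all tokens.
    """
    n, d = H.shape[0], Wq_c.shape[1]
    Qc, Kc, Vc = H @ Wq_c, H @ Wk_c, H @ Wv_c   # content projections
    Qr, Kr = P @ Wq_r, P @ Wk_r                 # position projections
    delta = relative_bucket(n, k)               # delta[i, j] = bucket of (i, j)

    # content-to-content: Qc_i . Kc_j
    c2c = Qc @ Kc.T
    # content-to-position: Qc_i . Kr_{delta(i, j)}
    c2p = np.take_along_axis(Qc @ Kr.T, delta, axis=1)
    # position-to-content: Kc_j . Qr_{delta(j, i)}
    p2c = np.take_along_axis(Kc @ Qr.T, delta, axis=1).T

    # Three terms are summed, so the scale uses 3 * d instead of d.
    scores = (c2c + c2p + p2c) / np.sqrt(3 * d)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ Vc, weights

# Usage with random inputs (shapes chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
n, d_model, d, k = 5, 8, 4, 2
H = rng.normal(size=(n, d_model))
P = rng.normal(size=(2 * k, d_model))
Ws = [rng.normal(size=(d_model, d)) for _ in range(5)]
out, weights = disentangled_attention(H, P, *Ws, k=k)
```

Keeping the position vectors in a separate table `P`, rather than adding them to `H`, is what lets the score treat content and relative position as distinct inputs.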