The result is a new attention mechanism we call {\em Transient Global}
(TGlobal), which mimics ETC's local/global attention mechanism, but without requiring additional side-inputs.