The result is a new attention mechanism we call {\em Transient Global} | |
(TGlobal), which mimics ETC's local/global attention mechanism, but without requiring additional side-inputs. |
The result is a new attention mechanism we call {\em Transient Global} | |
(TGlobal), which mimics ETC's local/global attention mechanism, but without requiring additional side-inputs. |