In addition, to perform token-level predictions as required by common pretraining
objectives, Funnel-Transformer is able to recover a deep representation for each token from the reduced hidden sequence
via a decoder.
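
To make the recovery step concrete, here is a minimal sketch in plain Python. It assumes the reduced sequence is brought back to full length by repeating each pooled vector and is then combined with an earlier full-length hidden sequence through a residual sum. The helper names (`pool`, `upsample`, `decoder_input`), the mean pooling, and the stride of 2 are illustrative assumptions, not the library's API. The Transformer layers that a real decoder would apply after this step are left out.

```python
from typing import List

Vector = List[float]


def pool(hidden: List[Vector], stride: int = 2) -> List[Vector]:
    """Mean-pool consecutive groups of `stride` token vectors (encoder-side reduction)."""
    pooled = []
    for start in range(0, len(hidden), stride):
        group = hidden[start:start + stride]
        # Average each feature column over the tokens in the group.
        pooled.append([sum(col) / len(group) for col in zip(*group)])
    return pooled


def upsample(reduced: List[Vector], stride: int, target_len: int) -> List[Vector]:
    """Repeat each reduced vector `stride` times, then trim to the original length."""
    repeated = [list(vec) for vec in reduced for _ in range(stride)]
    return repeated[:target_len]


def decoder_input(full: List[Vector], reduced: List[Vector], stride: int) -> List[Vector]:
    """Add the upsampled reduced sequence to a full-length hidden sequence.

    Assumption: `full` stands in for hidden states computed before any pooling,
    so every token keeps its own token-specific signal.
    """
    up = upsample(reduced, stride, len(full))
    return [[a + b for a, b in zip(f, u)] for f, u in zip(full, up)]


# Five tokens with two features each (toy values).
tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0], [9.0, 8.0]]
reduced = pool(tokens, stride=2)                      # 3 vectors
recovered = decoder_input(tokens, reduced, stride=2)  # 5 vectors again
print(len(reduced), len(recovered))
```

The point of the sketch is the shape bookkeeping: the encoder works on the shorter sequence, while the output again has one vector per input token, which is what a token-level pretraining objective needs.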