Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into
pretraining.