The resulting model, the Reformer, performs on par with Transformer models
while being much more memory-efficient and much faster on long sequences.