At some point, you will notice a difference between the two implementations, which should point you to the bug
in the 🤗 Transformers implementation.
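
One way to locate that difference is to record the intermediate outputs of both implementations in forward order and look for the first layer where they disagree. The snippet below is a minimal sketch of that idea in plain Python, not part of the 🤗 Transformers API. The helper name `first_divergence`, the layer names, the sample values, and the `1e-3` tolerance are all assumptions made for illustration. In practice you would fill the lists with the flattened activations you captured from each model.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical container: (layer name, flattened activations of that layer).
LayerOutput = Tuple[str, Sequence[float]]


def first_divergence(
    original: Sequence[LayerOutput],
    ported: Sequence[LayerOutput],
    atol: float = 1e-3,  # assumed tolerance, adjust to your model
) -> Optional[str]:
    """Return the name of the first layer whose outputs differ, or None."""
    for (name, ref), (_, out) in zip(original, ported):
        # A size mismatch already means the two layers disagree.
        if len(ref) != len(out):
            return name
        # Largest element-wise absolute difference for this layer.
        max_diff = max((abs(a - b) for a, b in zip(ref, out)), default=0.0)
        if max_diff > atol:
            return name
    return None


# Made-up activations, listed in forward order for both implementations.
original = [
    ("embeddings", [0.10, 0.20]),
    ("layer_0", [0.30, 0.40]),
    ("layer_1", [0.50, 0.60]),
]
ported = [
    ("embeddings", [0.10, 0.20]),
    ("layer_0", [0.30, 0.45]),  # differs from the original here
    ("layer_1", [0.70, 0.10]),  # later layers differ as a consequence
]

print(first_divergence(original, ported))  # layer_0
```

Only the first mismatching layer is reported because every layer after it receives different inputs and will usually differ as well, so the earliest mismatch is the most likely location of the bug.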