This is most likely caused by one of the following: you used incorrect
parameters in BrandNewBertConfig(), the architecture of the 🤗 Transformers
implementation is wrong, there is a bug in the init() function of one of the
components of the 🤗 Transformers implementation, or one of the checkpoint
weights needs to be transposed.
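
One way to narrow down which of these causes applies is to compare the shape of every parameter in the new model against the shape stored in the original checkpoint. The following is only a minimal sketch under stated assumptions: diagnose_shapes is a hypothetical helper that is not part of 🤗 Transformers, it works on plain dictionaries mapping parameter names to shape tuples, and the layer names and sizes are invented for illustration.

```python
# Hypothetical helper, not part of 🤗 Transformers. It only compares shapes.
def diagnose_shapes(model_shapes, checkpoint_shapes):
    """Classify each model parameter by how its shape relates to the checkpoint."""
    report = {}
    for name, expected in model_shapes.items():
        found = checkpoint_shapes.get(name)
        if found is None:
            # No checkpoint entry under this name: likely a naming problem.
            report[name] = "missing in checkpoint (check the name mapping)"
        elif tuple(found) == tuple(expected):
            report[name] = "ok"
        elif len(found) == 2 and tuple(reversed(found)) == tuple(expected):
            # Same two dimensions in swapped order: a transpose would fix it.
            report[name] = "transpose needed"
        else:
            # Different sizes: wrong config parameters or wrong architecture.
            report[name] = "shape mismatch (check config or architecture)"
    return report


# Illustrative shapes only; these names and sizes are assumptions.
model_shapes = {
    "encoder.dense.weight": (3072, 768),
    "encoder.dense.bias": (3072,),
    "embeddings.word.weight": (30522, 768),
}
checkpoint_shapes = {
    "encoder.dense.weight": (768, 3072),     # stored as (in, out)
    "encoder.dense.bias": (3072,),
    "embeddings.word.weight": (32000, 768),  # different vocabulary size
}

report = diagnose_shapes(model_shapes, checkpoint_shapes)
for name, status in report.items():
    print(f"{name}: {status}")
```

With a real model, the two dictionaries could be built from the parameter shapes of the randomly initialized model and of the loaded checkpoint. A shape comparison cannot detect a missing transpose on a square matrix, so matching shapes alone do not prove that the values were assigned correctly.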