This is because DeepSpeed's state_dict contains placeholders instead of the real weights, so you won't be able to load the weights from it.
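One possible way around the placeholder problem is to ask DeepSpeed to gather the full 16-bit weights when the model is saved. The sketch below assumes ZeRO stage 3 is in use, which the text above does not state, and writes the relevant config fragment as a Python dict. The variable name `ds_config` and the choice to print it as JSON are illustrative only; the rest of a real config (optimizer, precision, batch sizes) is omitted.

```python
import json

# Assumption: ZeRO stage 3, where parameters are partitioned across ranks
# and the plain state_dict therefore holds placeholders, not real weights.
ds_config = {
    "zero_optimization": {
        "stage": 3,
        # Ask DeepSpeed to consolidate the 16-bit weights at save time
        # so that the saved file contains the real tensors.
        "stage3_gather_16bit_weights_on_model_save": True,
    }
}

# Serialize the fragment as it would appear in a DeepSpeed JSON config file.
ds_config_json = json.dumps(ds_config, indent=2)
print(ds_config_json)
```

Gathering the weights costs extra time and memory at save time, so whether to enable it depends on the training setup.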