Ahmadzei's picture
added 3 more tables for large emb model
5fa1a76
Attempted loss scale: 1, reducing to 1
[]
This means the DeepSpeed loss scaler is unable to find a scaling coefficient to overcome loss overflow.