This is because DeepSpeed's state_dict contains placeholders instead of the real weights, so the weights cannot be loaded from it. |
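
One way to notice the problem early is to scan a state_dict for entries that hold no data. The sketch below is only an illustration and assumes that placeholders show up as zero-element tensors, which may not hold for every DeepSpeed configuration. `find_placeholder_entries` and `FakeTensor` are hypothetical names introduced here, not part of DeepSpeed or PyTorch; `FakeTensor` only mimics the `numel()` method of `torch.Tensor` so the example runs without extra dependencies.

```python
def find_placeholder_entries(state_dict):
    """Return the names of entries whose tensor holds no elements.

    Assumption: a placeholder appears as a tensor with numel() == 0.
    """
    empty = []
    for name, tensor in state_dict.items():
        numel = getattr(tensor, "numel", None)
        if callable(numel) and numel() == 0:
            empty.append(name)
    return empty


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor (only numel() is mimicked)."""

    def __init__(self, n):
        self._n = n

    def numel(self):
        return self._n


# Toy state_dict: one placeholder entry and one entry with real data.
example = {"embed.weight": FakeTensor(0), "head.bias": FakeTensor(8)}
placeholders = find_placeholder_entries(example)
```

With a real model, the same function could be called on the result of `model.state_dict()`. A non-empty result would suggest that the weights have to be gathered or exported some other way before saving.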