For example, if your checkpoint folder looked like this:
```bash
$ ls -l output_dir/checkpoint-1/
-rw-rw-r-- 1 stas stas 1.4K Mar 27 20:42 config.json
drwxrwxr-x 2 stas stas 4.0K Mar 25 19:52 global_step1/
-rw-rw-r-- 1 stas stas   12 Mar 27 13:16 latest
-rw-rw-r-- 1 stas stas 827K Mar 27 20:42 optimizer.pt
-rw-rw-r-- 1 stas stas 231M Mar 27 20:42 pytorch_model.bin
-rw-rw-r-- 1 stas stas  623 Mar 27 20:42 scheduler.pt
-rw-rw-r-- 1 stas stas 1.8K Mar 27 20:42 special_tokens_map.json
-rw-rw-r-- 1 stas stas 774K Mar 27 20:42 spiece.model
-rw-rw-r-- 1 stas stas 1.9K Mar 27 20:42 tokenizer_config.json
-rw-rw-r-- 1 stas stas  339 Mar 27 20:42 trainer_state.json
-rw-rw-r-- 1 stas stas 2.3K Mar 27 20:42 training_args.bin
-rwxrw-r-- 1 stas stas 5.5K Mar 27 13:16 zero_to_fp32.py*
```
then, to reconstruct the fp32 weights from the DeepSpeed (ZeRO-2 or ZeRO-3) checkpoint subfolder `global_step1`, run the following command; it consolidates the sharded weights from multiple GPUs into a single `pytorch_model.bin` file.
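The exact arguments can vary between DeepSpeed versions, but a typical invocation of the bundled script, run from inside the checkpoint folder, looks like the sketch below (the script uses the `latest` file to locate the `global_step1` subfolder; check `python zero_to_fp32.py --help` for the options your copy supports):

```bash
cd output_dir/checkpoint-1
# Consolidate the sharded ZeRO checkpoint under global_step1/ (located via
# the `latest` file) into a single fp32 state_dict saved as pytorch_model.bin.
python zero_to_fp32.py . pytorch_model.bin
```

The resulting `pytorch_model.bin` is a regular PyTorch state dict, so it can be loaded without DeepSpeed installed.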