For example, to run the run_glue.py training script with the FSDP configuration:
```bash
accelerate launch \
./examples/pytorch/text-classification/run_glue.py \
--model_name_or_path google-bert/bert-base-cased \
--task_name $TASK_NAME \
--do_train \
--do_eval \
--max_seq_length 128 \
--per_device_train_batch_size 16 \
--learning_rate 5e-5 \
--num_train_epochs 3 \
--output_dir /tmp/$TASK_NAME/ \
--overwrite_output_dir
```
You could also specify the parameters from the config_file.yaml file directly on the command line:
```bash
accelerate launch --num_processes=2 \
--use_fsdp \
--mixed_precision=bf16 \
--fsdp_auto_wrap_policy=TRANSFORMER_BASED_WRAP \
--fsdp_transformer_layer_cls_to_wrap="BertLayer" \
--fsdp_sharding_strategy=1 \
--fsdp_state_dict_type=FULL_STATE_DICT \
./examples/pytorch/text-classification/run_glue.py \
--model_name_or_path google-bert/bert-base-cased \
--task_name $TASK_NAME \
--do_train \
--do_eval \
--max_seq_length 128 \
--per_device_train_batch_size 16 \
--learning_rate 5e-5 \
--num_train_epochs 3 \
--output_dir /tmp/$TASK_NAME/ \
--overwrite_output_dir
```
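For reference, these command-line flags mirror the keys of the config file that `accelerate config` generates. A minimal sketch of what such a config_file.yaml could contain is shown below; the key names follow the Accelerate FSDP config format, but the exact layout can vary between Accelerate versions, so treat it as an illustration rather than the canonical file:

```bash
# Sketch of a config_file.yaml equivalent to the flags above (assumed key names;
# the canonical file is produced interactively by `accelerate config` and may
# differ between Accelerate versions).
cat > config_file.yaml << 'EOF'
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
mixed_precision: bf16
num_processes: 2
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_transformer_layer_cls_to_wrap: BertLayer
  fsdp_sharding_strategy: 1
  fsdp_state_dict_type: FULL_STATE_DICT
EOF
```

A file like this can be passed explicitly with `accelerate launch --config_file config_file.yaml`, in which case the individual FSDP flags on the command line are no longer needed.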
Check out the Launching your Accelerate scripts tutorial to learn more about accelerate_launch and custom configurations.
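For a quick reminder of the available launch arguments, including the FSDP-related ones, you can also consult the launcher's built-in help:

```bash
# Print all arguments accepted by the launcher
accelerate launch --help
```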