Begin by running the following command to create and save a configuration file: | |
accelerate config | |
Test your setup to make sure it is configured correctly: | |
accelerate test | |
Now you are ready to launch the training: | |
accelerate launch run_summarization_no_trainer.py \ | |
--model_name_or_path google-t5/t5-small \ | |
--dataset_name cnn_dailymail \ | |
--dataset_config "3.0.0" \ | |
--source_prefix "summarize: " \ | |
--output_dir ~/tmp/tst-summarization | |
Use a custom dataset | |
The summarization script supports custom datasets as long as they are a CSV or JSON Line file. |