Instead, we'll only look at the loss:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_finetuned_voxpopuli_nl",  # change to a repo name of your choice
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=1e-5,
    warmup_steps=500,
    max_steps=4000,
    gradient_checkpointing=True,
    fp16=True,
    evaluation_strategy="steps",
    per_device_eval_batch_size=2,
    save_steps=1000,
    eval_steps=1000,
    logging_steps=25,
    report_to=["tensorboard"],
    load_best_model_at_end=True,
    greater_is_better=False,
    label_names=["labels"],
    push_to_hub=True,
)
```

Instantiate the `Trainer` object and pass the model, dataset, and data collator to it.
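A minimal sketch of that step, assuming the `model`, `dataset` splits, `data_collator`, and `processor` objects were prepared in the earlier parts of this guide:

```python
from transformers import Seq2SeqTrainer

# model, dataset, data_collator, and processor are assumed to be the objects
# created in the preceding steps of this guide.
trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    data_collator=data_collator,
    tokenizer=processor,  # the processor handles both tokenization and feature extraction
)
```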