training_args = TrainingArguments( output_dir="my_awesome_asr_mind_model", per_device_train_batch_size=8, gradient_accumulation_steps=2, learning_rate=1e-5, warmup_steps=500, max_steps=2000, gradient_checkpointing=True, fp16=True, group_by_length=True, evaluation_strategy="steps", per_device_eval_batch_size=8, save_steps=1000, eval_steps=1000, logging_steps=25, load_best_model_at_end=True, metric_for_best_model="wer", greater_is_better=False, push_to_hub=True, ) trainer = Trainer( model=model, args=training_args, train_dataset=encoded_minds["train"], eval_dataset=encoded_minds["test"], tokenizer=processor, data_collator=data_collator, compute_metrics=compute_metrics, ) trainer.train() Once training is completed, share your model to the Hub with the [~transformers.Trainer.push_to_hub] method so everyone can use your model: trainer.push_to_hub() For a more in-depth example of how to finetune a model for automatic speech recognition, take a look at this blog post for English ASR and this post for multilingual ASR.