Since mT5 was pre-trained in an unsupervised fashion, there is no real advantage to using a task prefix during single-task
fine-tuning.
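As a minimal sketch of the difference, compare how inputs might be formatted for a T5-style checkpoint (which expects a task prefix) versus mT5 single-task fine-tuning (which does not). The prefix string and example sentence here are purely illustrative, not taken from any particular dataset:

```python
def build_t5_input(text: str, prefix: str = "summarize: ") -> str:
    # T5 checkpoints were pre-trained with a supervised task mixture,
    # so a prefix tells the model which task to perform.
    return prefix + text


def build_mt5_input(text: str) -> str:
    # mT5 was pre-trained purely unsupervised, so a task prefix adds
    # nothing during single-task fine-tuning; pass the text as-is.
    return text


article = "Der neue Bahnhof wurde gestern eröffnet."
print(build_t5_input(article))   # prefixed, T5-style
print(build_mt5_input(article))  # raw text, mT5-style
```

For multi-task setups the situation can differ, since the model then needs some signal to distinguish tasks.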