5fa1a76
1
2
Since ByT5 was pre-trained unsupervisedly, there's no real advantage to using a task prefix during single-task fine-tuning.