Since ByT5 was pre-trained in an unsupervised fashion, there's no real advantage to using a task prefix during
single-task fine-tuning.
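
As a rough illustration of what this means when preparing fine-tuning inputs, the sketch below builds the source string with and without a prefix. The helper `format_source` and its `task_prefix` parameter are hypothetical names introduced here, not part of any library, and the `"prefix: text"` layout is assumed from the convention used in T5 examples.

```python
from typing import Optional


def format_source(text: str, task_prefix: Optional[str] = None) -> str:
    """Hypothetical helper: prepend a task prefix only when one is given."""
    if task_prefix:
        # Assumed "prefix: text" layout, following common T5 examples.
        return f"{task_prefix}: {text}"
    # Single-task fine-tuning: pass the raw text through unchanged.
    return text


# Single-task fine-tuning: no prefix is needed.
single_task = format_source("Life is like a box of chocolates.")

# With several tasks in one model, a prefix can tell the tasks apart.
multi_task = format_source(
    "Life is like a box of chocolates.",
    task_prefix="translate English to German",
)
```

The resulting strings would then be passed to the tokenizer as usual; only the string construction differs between the two setups.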