Because T5 has been trained with the span-mask denoising objective, | |
it can be used to predict the sentinel (masked-out) tokens during inference. |
Because T5 has been trained with the span-mask denoising objective, | |
it can be used to predict the sentinel (masked-out) tokens during inference. |