Note that T5 uses the pad_token_id as the decoder_start_token_id, so when doing generation without using | |
[~generation.GenerationMixin.generate], make sure you start it with the pad_token_id. |
Note that T5 uses the pad_token_id as the decoder_start_token_id, so when doing generation without using | |
[~generation.GenerationMixin.generate], make sure you start it with the pad_token_id. |