ByT5: ByT5 is a T5 model pre-trained on byte sequences rather than SentencePiece subword token sequences.
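The byte-level scheme can be illustrated without any learned vocabulary. The sketch below, a hedged approximation of ByT5-style tokenization, maps each UTF-8 byte to an id shifted by the number of reserved special tokens (pad, eos, unk), so byte `b` becomes id `b + 3`; the constant and function names here are illustrative, not the library's API:

```python
# Sketch of byte-level tokenization in the style of ByT5: no subword vocab,
# each UTF-8 byte is an id, offset to leave room for special tokens.
SPECIAL_TOKENS = 3  # assumed reserved ids: pad=0, eos=1, unk=2

def encode(text: str) -> list[int]:
    """Encode text as shifted UTF-8 byte ids."""
    return [b + SPECIAL_TOKENS for b in text.encode("utf-8")]

def decode(ids: list[int]) -> str:
    """Invert encode(), skipping ids reserved for special tokens."""
    return bytes(i - SPECIAL_TOKENS for i in ids if i >= SPECIAL_TOKENS).decode("utf-8")

ids = encode("héllo")
print(ids)          # one id per UTF-8 byte ("é" takes two bytes)
print(decode(ids))  # round-trips back to "héllo"
```

Because the "vocabulary" is just the 256 byte values plus special tokens, this approach is language-agnostic and has no out-of-vocabulary tokens, at the cost of longer input sequences than subword tokenization.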