One can use [T5ForConditionalGeneration] (or the TensorFlow/Flax variant), which includes the
language modeling head on top of the decoder.
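
As a minimal sketch of what the language modeling head provides, the snippet below builds a tiny, randomly initialized model and runs one forward pass with labels. The configuration sizes and the random token ids are assumptions chosen only to keep the example small and offline. In practice one would typically load pretrained weights with `from_pretrained` and tokenize real text instead.

```python
import torch
from transformers import T5Config, T5ForConditionalGeneration

# Hypothetical tiny configuration: the sizes are illustrative only and
# no pretrained checkpoint is downloaded.
config = T5Config(
    vocab_size=128,
    d_model=32,
    d_kv=8,
    d_ff=64,
    num_layers=2,
    num_heads=4,
    decoder_start_token_id=0,  # T5 starts decoding from the pad token
)
model = T5ForConditionalGeneration(config)
model.eval()

# Random token ids stand in for tokenized source and target text.
input_ids = torch.randint(2, config.vocab_size, (2, 7))  # (batch, source_len)
labels = torch.randint(2, config.vocab_size, (2, 5))     # (batch, target_len)

with torch.no_grad():
    # When labels are passed, the model derives the decoder inputs by
    # shifting them right and returns a cross-entropy loss with the logits.
    outputs = model(input_ids=input_ids, labels=labels)

# The head projects each decoder position onto the vocabulary.
logits_shape = tuple(outputs.logits.shape)  # (batch, target_len, vocab_size)
loss_value = outputs.loss.item()
print(logits_shape, loss_value)
```

The logits have one row of vocabulary scores per target position, which is what the loss is computed over during training and what is decoded from at generation time.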