One can then place a regular language modeling head on top, to project the last dimension to the
vocabulary size of the model, i.e.