All models are trained on sequences of 16k tokens and show improvements on inputs with up to 100k tokens.