Ahmadzei's picture
added 3 more tables for large emb model
5fa1a76
In masked language modeling, some percentage of the input tokens are randomly masked, and the model needs to predict these.