In masked language modeling, a fraction of the input tokens (commonly around 15%, as in BERT) is randomly masked, and the model is trained to predict the original tokens at those positions from the surrounding context.
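
The masking step can be sketched in a few lines of plain Python. This is a minimal illustration, not any particular library's implementation: the function name `mask_tokens`, the `MASK_TOKEN` string, and the default `mask_prob` of 0.15 are assumptions chosen for the example, and real setups typically operate on token IDs rather than strings.

```python
import random

# Assumed placeholder string; real tokenizers define their own mask token.
MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Randomly mask tokens for a masked-language-modeling example.

    Returns (masked, labels), where labels[i] holds the original token
    at each masked position and None everywhere else.
    """
    rng = random.Random(seed)  # local RNG so the global state is untouched
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            # Hide the token from the model and keep it as the target.
            masked.append(MASK_TOKEN)
            labels.append(tok)
        else:
            # Leave the token visible; no prediction is required here.
            masked.append(tok)
            labels.append(None)
    return masked, labels


tokens = "the cat sat on the mat".split()
masked, labels = mask_tokens(tokens, mask_prob=0.15, seed=0)
print(masked)
print(labels)
```

The model would receive `masked` as input and be scored only on the positions where `labels` is not `None`, which is what "the model needs to predict these" amounts to in practice.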