In PyTorch and TensorFlow, this can be done by replacing them with -100, which is the default
ignore_index of PyTorch's CrossEntropyLoss.
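
As a minimal sketch of the idea, assume the values being replaced are padding token ids inside a batch of label sequences. The pure-Python code below swaps them for -100 and then averages a cross-entropy loss only over the remaining positions. The names `mask_padding`, `cross_entropy`, and `PAD_TOKEN_ID = 0` are hypothetical and chosen for illustration. The loss function here imitates what an `ignore_index` does and is not the library's implementation.

```python
import math

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss
PAD_TOKEN_ID = 0     # hypothetical padding id, chosen only for this example


def mask_padding(labels, pad_token_id=PAD_TOKEN_ID):
    """Replace padding ids in a batch of label sequences with IGNORE_INDEX."""
    return [
        [IGNORE_INDEX if tok == pad_token_id else tok for tok in seq]
        for seq in labels
    ]


def cross_entropy(logits, targets, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over positions whose target is not ignore_index."""
    total, count = 0.0, 0
    for row, target in zip(logits, targets):
        if target == ignore_index:
            continue  # ignored positions contribute nothing to the loss
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_norm = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_norm - row[target]
        count += 1
    return total / count if count else float("nan")


labels = [[5, 7, 2, 0, 0], [4, 2, 0, 0, 0]]
masked = mask_padding(labels)
```

With real tensors in PyTorch, the same masking is usually written as `labels[labels == pad_token_id] = -100`, after which `torch.nn.CrossEntropyLoss()` skips those positions without any extra argument.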