Ahmadzei's picture
added 3 more tables for large emb model
5fa1a76
The prediction, whether it is the next sentence or not, is passed to a feedforward network with a softmax over the two classes (IsNext and NotNext).