It's a
bidirectional transformer pretrained with a combination of masked language modeling (MLM) and next sentence
prediction (NSP) objectives on a large corpus comprising the Toronto Book Corpus and Wikipedia.
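
The MLM objective can be exercised directly from the pretrained checkpoint. Below is a minimal sketch using the Hugging Face `fill-mask` pipeline; the `bert-base-uncased` checkpoint is an assumption here and any pretrained BERT masked-LM checkpoint would work the same way.

```python
from transformers import pipeline

# The fill-mask pipeline loads the pretrained weights together with the
# MLM head that was trained during pretraining.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# The model predicts the token hidden behind [MASK] using context from
# both directions (bidirectional attention).
for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```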