It builds on BERT and modifies key hyperparameters, removing the next-sentence pretraining objective and training with
much larger mini-batches and learning rates.