As a result, our best model establishes new state-of-the-art results on the GLUE, RACE, and
SQuAD benchmarks while having fewer parameters compared to BERT-large.