We find that BERT was significantly undertrained, and can match or exceed the performance of every
model published after it.