Implications
Transformers models based on the BERT (Bidirectional Encoder Representations from Transformers) architecture, or its variants such as DistilBERT and RoBERTa, run best on Inf1 for non-generative tasks such as extractive question answering, sequence classification, and token classification.