Implications

Transformers models based on the BERT (Bidirectional Encoder Representations from Transformers) architecture, or its variants such as DistilBERT and RoBERTa, run best on Inf1 for non-generative tasks such as extractive question answering, sequence classification, and token classification.
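As a minimal sketch of what compiling such a model for Inf1 can look like, assuming an Inf1 instance with the AWS Neuron SDK for PyTorch (torch-neuron) and transformers installed; the checkpoint name, sequence length, and output filename below are illustrative, not prescribed by this document:

```python
import torch
import torch_neuron  # AWS Neuron SDK for Inf1; registers torch.neuron.trace
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative sequence-classification checkpoint (a DistilBERT variant).
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# torchscript=True makes the model return plain tuples, which tracing requires.
model = AutoModelForSequenceClassification.from_pretrained(model_id, torchscript=True)
model.eval()

# Neuron compiles a static graph, so trace with fixed-shape example inputs
# (here, padding every sequence to a fixed max_length).
inputs = tokenizer(
    "Neuron compiles BERT-style encoders well.",
    padding="max_length",
    max_length=128,
    return_tensors="pt",
)
example = (inputs["input_ids"], inputs["attention_mask"])

# Compile the model for Inf1.
neuron_model = torch.neuron.trace(model, example)

# Run inference with the compiled model; the first tuple element holds the logits.
outputs = neuron_model(*example)
logits = outputs[0]

# Save the compiled TorchScript module for deployment.
neuron_model.save("distilbert_sst2_neuron.pt")
```

The fixed-shape tracing step hints at why the non-generative tasks listed above are a good fit: encoder-only workloads like classification run the same static graph on every input, whereas autoregressive generation involves variable-length, step-by-step decoding that does not map as naturally onto a single compiled graph.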