Preprocess

The next step is to load a DistilBERT tokenizer to preprocess the tokens field:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")

As you saw in the example tokens field above, the input appears to have already been tokenized into words.
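Why pre-tokenized input matters can be shown with a small, self-contained sketch. It does not call the DistilBERT tokenizer. `toy_subword_split`, `encode_pretokenized`, and the sample words are hypothetical stand-ins, used only to illustrate that one input word may become several subword pieces, so each piece needs to remember which word it came from. With the real tokenizer, a list of words would typically be passed with `is_split_into_words=True`.

```python
from typing import Callable, List, Optional, Tuple


def toy_subword_split(word: str, max_len: int = 4) -> List[str]:
    """Hypothetical stand-in for a subword tokenizer.

    Lowercases the word and chops it into pieces of at most `max_len`
    characters, prefixing every piece after the first with '##'.
    """
    word = word.lower()
    pieces = [word[i:i + max_len] for i in range(0, len(word), max_len)]
    return [p if i == 0 else "##" + p for i, p in enumerate(pieces)]


def encode_pretokenized(
    words: List[str],
    split: Callable[[str], List[str]] = toy_subword_split,
) -> Tuple[List[str], List[Optional[int]]]:
    """Split each word into pieces and record the source word index.

    Special tokens get a word index of None, since they belong to no word.
    """
    tokens: List[str] = ["[CLS]"]
    word_ids: List[Optional[int]] = [None]
    for idx, word in enumerate(words):
        pieces = split(word)
        tokens.extend(pieces)
        word_ids.extend([idx] * len(pieces))
    tokens.append("[SEP]")
    word_ids.append(None)
    return tokens, word_ids


# Assumed sample input: a sentence that is already split into words.
words = ["Empire", "State", "Building"]
tokens, word_ids = encode_pretokenized(words)
print(tokens)
print(word_ids)
```

The output has more tokens than the three input words. That is the mismatch to handle when per-word labels have to line up with subword tokens.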