If you trained your own tokenizer, you can create one from your vocabulary file:

    from transformers import DistilBertTokenizer

    my_tokenizer = DistilBertTokenizer(vocab_file="my_vocab_file.txt", do_lower_case=False, padding_side="left")

It is important to remember that the vocabulary from a custom tokenizer will be different from the vocabulary generated by a pretrained model's tokenizer, so the token IDs it produces will not line up with the IDs a pretrained model was trained on.
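To make the vocabulary mismatch concrete, here is a minimal sketch in plain Python that does not call the Transformers library. It assumes the usual BERT-style vocabulary layout (one token per line, with the line index used as the token ID). The helpers `load_vocab` and `encode`, the file name `pretrained_like_vocab.txt`, and both toy vocabularies are hypothetical and exist only for illustration.

```python
import tempfile
from pathlib import Path


def load_vocab(vocab_file):
    """Map each token to its line index (assumed BERT-style layout: one token per line)."""
    with open(vocab_file, encoding="utf-8") as f:
        return {line.rstrip("\n"): index for index, line in enumerate(f)}


def encode(tokens, vocab, unk_token="[UNK]"):
    """Look up the ID of each token, falling back to the unknown-token ID."""
    return [vocab.get(token, vocab[unk_token]) for token in tokens]


# Two hypothetical toy vocabularies: the same words appear at different positions.
custom_tokens = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "hello", "world"]
pretrained_like_tokens = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "the", "world", "hello"]

with tempfile.TemporaryDirectory() as tmp:
    custom_path = Path(tmp) / "my_vocab_file.txt"
    other_path = Path(tmp) / "pretrained_like_vocab.txt"
    custom_path.write_text("\n".join(custom_tokens) + "\n", encoding="utf-8")
    other_path.write_text("\n".join(pretrained_like_tokens) + "\n", encoding="utf-8")

    custom_vocab = load_vocab(custom_path)
    other_vocab = load_vocab(other_path)

words = ["hello", "world"]
custom_ids = encode(words, custom_vocab)
other_ids = encode(words, other_vocab)

# The same words map to different IDs under the two vocabularies.
print("custom vocabulary IDs:", custom_ids)
print("other vocabulary IDs: ", other_ids)
```

Because a model's embedding table is indexed by these IDs, inputs encoded with one vocabulary are generally meaningless to a model trained with another. This is why a pretrained model is normally paired with its own tokenizer.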