Once you have a working tokenization script that uses the original repository, create an analogous script for 🤗 Transformers and confirm that both produce the same input IDs for the same input string.
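The 🤗 Transformers side of that comparison could be sketched as follows. This is a minimal sketch, not the definitive script: it assumes the tokenizer files have already been converted and saved to a local folder that `AutoTokenizer` can load, and the names `tokenize_with_transformers`, `first_mismatch`, and the folder path in the usage comment are hypothetical, introduced here only for illustration.

```python
from typing import List, Optional, Sequence


def tokenize_with_transformers(text: str, tokenizer_dir: str) -> List[int]:
    """Return the input IDs that the 🤗 Transformers tokenizer produces for `text`."""
    # Imported lazily so that the comparison helper below can be used
    # even in an environment where transformers is not installed.
    from transformers import AutoTokenizer

    # `tokenizer_dir` is assumed to contain the converted tokenizer files.
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_dir)
    return tokenizer(text).input_ids


def first_mismatch(expected: Sequence[int], actual: Sequence[int]) -> Optional[int]:
    """Return the index of the first differing ID, or None if the sequences match."""
    for index, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            return index
    # One sequence is a strict prefix of the other: they diverge where the shorter ends.
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    return None


# Usage (hypothetical path; `original_ids` comes from the original repository's script):
#   hf_ids = tokenize_with_transformers(input_str, "/path/to/tokenizer/folder/")
#   assert first_mismatch(original_ids, hf_ids) is None
```

Reporting the index of the first mismatch, rather than only a pass/fail result, makes it easier to see which part of the input string the two tokenizers treat differently.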