File size: 755 Bytes
57bdca5 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
Utilities for Tokenizers This page lists all the utility functions used by the tokenizers, mainly the class [~tokenization_utils_base.PreTrainedTokenizerBase] that implements the common methods between [PreTrainedTokenizer] and [PreTrainedTokenizerFast] and the mixin [~tokenization_utils_base.SpecialTokensMixin]. Most of those are only useful if you are studying the code of the tokenizers in the library. PreTrainedTokenizerBase [[autodoc]] tokenization_utils_base.PreTrainedTokenizerBase - call - all SpecialTokensMixin [[autodoc]] tokenization_utils_base.SpecialTokensMixin Enums and namedtuples [[autodoc]] tokenization_utils_base.TruncationStrategy [[autodoc]] tokenization_utils_base.CharSpan [[autodoc]] tokenization_utils_base.TokenSpan |