|
|
|
Utilities for Tokenizers |
|
This page lists all the utility functions used by the tokenizers, mainly the class |
|
[~tokenization_utils_base.PreTrainedTokenizerBase] that implements the common methods between |
|
[PreTrainedTokenizer] and [PreTrainedTokenizerFast] and the mixin |
|
[~tokenization_utils_base.SpecialTokensMixin]. |
|
Most of those are only useful if you are studying the code of the tokenizers in the library. |
|
PreTrainedTokenizerBase |
|
[[autodoc]] tokenization_utils_base.PreTrainedTokenizerBase |
|
- call |
|
- all |
|
SpecialTokensMixin |
|
[[autodoc]] tokenization_utils_base.SpecialTokensMixin |
|
Enums and namedtuples |
|
[[autodoc]] tokenization_utils_base.TruncationStrategy |
|
[[autodoc]] tokenization_utils_base.CharSpan |
|
[[autodoc]] tokenization_utils_base.TokenSpan |