File size: 755 Bytes
57bdca5
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17

Utilities for Tokenizers
This page lists all the utility functions used by the tokenizers, mainly the class
[~tokenization_utils_base.PreTrainedTokenizerBase] that implements the common methods between
[PreTrainedTokenizer] and [PreTrainedTokenizerFast] and the mixin
[~tokenization_utils_base.SpecialTokensMixin].
Most of those are only useful if you are studying the code of the tokenizers in the library.
PreTrainedTokenizerBase
[[autodoc]] tokenization_utils_base.PreTrainedTokenizerBase
    - call
    - all
SpecialTokensMixin
[[autodoc]] tokenization_utils_base.SpecialTokensMixin
Enums and namedtuples
[[autodoc]] tokenization_utils_base.TruncationStrategy
[[autodoc]] tokenization_utils_base.CharSpan
[[autodoc]] tokenization_utils_base.TokenSpan