The tokenizer returns this mask as the "token_type_ids" entry:

```python
>>> encoded_dict["token_type_ids"]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```

The first sequence, the "context" used for the question, has all its tokens represented by a 0, whereas the second
sequence, corresponding to the "question", has all its tokens represented by a 1.
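
To make the layout of the mask concrete, here is a minimal pure-Python sketch that does not call the tokenizer. It assumes a BERT-style pair encoding of the form `[CLS] first [SEP] second [SEP]`, in which `[CLS]` and the first `[SEP]` are counted with the first sequence and the final `[SEP]` with the second. The helper names `build_token_type_ids` and `split_by_segment` are hypothetical and are not part of any library; the token counts (8 and 8) are chosen only because they yield a mask of the same shape as the one shown above.

```python
def build_token_type_ids(first_len, second_len):
    """Build a segment mask for a pair of sequences.

    Assumption: the pair is laid out as [CLS] first [SEP] second [SEP].
    [CLS] and the first [SEP] get segment 0, the final [SEP] gets segment 1.
    """
    return [0] * (first_len + 2) + [1] * (second_len + 1)


def split_by_segment(tokens, token_type_ids):
    """Split a token list into its two segments using the mask."""
    first = [t for t, s in zip(tokens, token_type_ids) if s == 0]
    second = [t for t, s in zip(tokens, token_type_ids) if s == 1]
    return first, second


# Eight tokens in each sequence: ten 0s followed by nine 1s.
mask = build_token_type_ids(first_len=8, second_len=8)
print(mask)
```

Under these assumptions, `split_by_segment` can be applied to any token list of the same length as the mask to recover which tokens belong to the context and which to the question.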