To indicate those tokens are not separate words but parts of the same word, a double-hash prefix is added for "RA" and "M":

```python
print(tokenized_sequence)
# ['A', 'Titan', 'R', '##T', '##X', 'has', '24', '##GB', 'of', 'V', '##RA', '##M']
```

These tokens can then be converted into IDs which are understandable by the model.
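As a minimal sketch of the full round trip, assuming the sequence was tokenized with a cased BERT checkpoint such as `bert-base-cased` (an assumption; the snippet above does not name the tokenizer), the tokens can be produced with `tokenize` and mapped to IDs with `convert_tokens_to_ids`:

```python
from transformers import AutoTokenizer

# Assumption: a cased BERT tokenizer; the original does not specify the checkpoint.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

sequence = "A Titan RTX has 24GB of VRAM"

# Split the sentence into (sub)word tokens; pieces of a split word carry the "##" prefix.
tokenized_sequence = tokenizer.tokenize(sequence)
print(tokenized_sequence)
# ['A', 'Titan', 'R', '##T', '##X', 'has', '24', '##GB', 'of', 'V', '##RA', '##M']

# Map each token to its integer ID from the model vocabulary.
ids = tokenizer.convert_tokens_to_ids(tokenized_sequence)
print(ids)
```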