Ahmadzei's picture
added 3 more tables for large emb model
5fa1a76
encoded_sequence_a = tokenizer(sequence_a)["input_ids"]
encoded_sequence_b = tokenizer(sequence_b)["input_ids"]
The encoded versions have different lengths:
thon
len(encoded_sequence_a), len(encoded_sequence_b)
(8, 19)
Therefore, we can't put them together in the same tensor as-is.