Now the output for the second sequence matches the output it produces when run on its own.
By default, the tokenizer creates an attention_mask for you, following its own defaults.
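As a minimal sketch of this behavior (the `bert-base-uncased` checkpoint and the example sentences are assumptions chosen only for illustration), padding a batch of unequal-length sequences yields an `attention_mask` alongside the `input_ids`:

```py
from transformers import AutoTokenizer

# Checkpoint chosen purely for illustration; any tokenizer behaves the same way.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

sequences = [
    "A short sequence.",
    "A noticeably longer sequence that forces the first one to be padded.",
]

# padding=True pads the shorter sequence to the batch's maximum length, and the
# tokenizer returns an attention_mask with 1s for real tokens and 0s for padding.
encoded = tokenizer(sequences, padding=True, return_tensors="pt")

print(encoded["input_ids"])
print(encoded["attention_mask"])  # 0s mark the padded positions the model should ignore
```

Passing this `attention_mask` to the model is what keeps the padded positions from influencing the second sequence's output.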