```python
attention_mask = torch.tensor([[1, 1, 1, 1, 1, 1], [1, 0, 0, 0, 0, 0]])
output = model(input_ids, attention_mask=attention_mask)
print(output.logits)
```

```
tensor([[ 0.0082, -0.2307],
        [-0.1008, -0.4061]], grad_fn=<AddmmBackward0>)
```
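In practice you rarely build this mask by hand: when the tokenizer pads a batch for you, it returns the matching attention_mask alongside the input_ids. Below is a minimal sketch of that workflow, assuming a sentiment-classification checkpoint such as distilbert-base-uncased-finetuned-sst-2-english (the exact model is not named in this excerpt):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed checkpoint for illustration; swap in whichever model you are using.
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

# Padding the batch through the tokenizer returns the matching attention_mask,
# so the mask does not have to be built by hand as above.
batch = tokenizer(
    ["Hello, how are you doing today?", "Hi"],
    padding=True,
    return_tensors="pt",
)
print(batch["attention_mask"])  # 1 for real tokens, 0 for padding positions
output = model(**batch)
print(output.logits)
```

With the mask applied, the logits for the shorter, padded sequence match what the model would produce for that sentence on its own, which is exactly what masking the padding tokens is meant to guarantee.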
🤗 Transformers doesn't automatically create an attention_mask to mask a padding token, even if one is present in the input, because:
- Some models don't have a padding token.