Update README.md #22, opened by RithwikChhugani
🚨 Suggested Correction to Model Usage Example
In the current example, we recommend removing `pad_token_id=0` from the `generate()` call, as it does not provide any functional value in this context.
🔍 Why This Matters
- The tokenizer in the example does not have a default padding token set.
- If you provide multiple inputs as a batch, the tokenizer cannot automatically pad them to the same length unless:
  - `padding=True` is specified, and
  - a valid `pad_token_id` is configured.
- Without both, calling `generate()` may result in errors or unexpected behavior.
- Setting `pad_token_id=0` without configuring the tokenizer may silently introduce incorrect behavior, especially if `0` corresponds to a meaningful token (e.g., `<unk>` or `<eos>`); see the sanity check after this list.
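As a quick sanity check, you can inspect what token id `0` actually maps to before reusing it as a pad id. A minimal sketch (the checkpoint name is an assumption; substitute the model this README belongs to):

```python
from transformers import AutoTokenizer

# Hypothetical checkpoint for illustration; use the model this README documents
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")

print(tokenizer.convert_ids_to_tokens(0))  # id 0 may be a real vocabulary token, not padding
print(tokenizer.pad_token)                 # None: no padding token is configured by default
```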
✅ Recommendation
Update the example using one of the following options:
**Option 1: If batching is not intended**

Remove `pad_token_id=0` from the `generate()` call:

```python
output = model.generate(
    input_ids,
    max_new_tokens=20
)
```
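For context, here is a self-contained single-input sketch (the model and tokenizer names are assumptions, not taken from the README):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint for illustration
model_id = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A single input needs no padding, so no pad_token_id is required
input_ids = tokenizer("Hello!", return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```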
**Option 2: If batching is intended**

Set `tokenizer.pad_token` to the model's pad token. For example, the `tokenizer.json` contains:

```json
{
  "id": 128004,
  "content": "<|finetune_right_pad_id|>",
  "single_word": false,
  "lstrip": false,
  "rstrip": false,
  "normalized": false,
  "special": true
}
```

You should set:

```python
tokenizer.pad_token = "<|finetune_right_pad_id|>"
```
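After this assignment, the pad token resolves to the id shown in the `tokenizer.json` entry above; a quick check (a sketch, assuming the tokenizer is already loaded as `tokenizer`):

```python
tokenizer.pad_token = "<|finetune_right_pad_id|>"
print(tokenizer.pad_token_id)  # 128004, matching the tokenizer.json entry above
```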
Example: Using `apply_chat_template` and `generate` with batching
```python
# Configure padding before tokenizing (see Option 2 above);
# decoder-only models should be left-padded for generation
tokenizer.pad_token = "<|finetune_right_pad_id|>"
tokenizer.padding_side = "left"

# Prepare a batch of chat messages
messages_batch = [
    [{"role": "user", "content": "Hello!"}],
    [{"role": "user", "content": "How are you?"}]
]

# Apply chat template and tokenize with padding
# (apply_chat_template has no pad_token_id argument; it uses tokenizer.pad_token)
inputs = tokenizer.apply_chat_template(
    messages_batch,
    add_generation_prompt=True,
    padding=True,
    return_dict=True,
    return_tensors="pt"
)

# Generate outputs with the correct pad_token_id and attention mask
outputs = model.generate(
    **inputs,
    max_new_tokens=20,
    pad_token_id=tokenizer.pad_token_id
)
```
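To read the results, you can strip the (left-padded) prompts and decode only the newly generated tokens; a short sketch continuing from the code above:

```python
# generate() returns prompt + completion, so slice off the padded prompt portion
prompt_length = inputs["input_ids"].shape[1]
generated = outputs[:, prompt_length:]
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```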