# Blenderbot

## Overview

The Blender chatbot model was proposed in [Recipes for building an open-domain chatbot](https://arxiv.org/abs/2004.13637) by Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston on 30 Apr 2020.

The abstract of the paper is the following:

*Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results, we show that other ingredients are important for a high-performing chatbot. Good conversation requires a number of skills that an expert conversationalist blends in a seamless way: providing engaging talking points and listening to their partners, and displaying knowledge, empathy and personality appropriately, while maintaining a consistent persona. We show that large scale models can learn these skills when given appropriate training data and choice of generation strategy. We build variants of these recipes with 90M, 2.7B and 9.4B parameter models, and make our models and code publicly available. Human evaluations show our best models are superior to existing approaches in multi-turn dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing failure cases of our models.*

This model was contributed by [sshleifer](https://huggingface.co/sshleifer). The authors' code can be found [here](https://github.com/facebookresearch/ParlAI).

## Usage tips and example

Blenderbot is a model with absolute position embeddings, so it's usually advised to pad the inputs on the right rather than the left.

An example:

```python
>>> from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration

>>> mname = "facebook/blenderbot-400M-distill"
>>> model = BlenderbotForConditionalGeneration.from_pretrained(mname)
>>> tokenizer = BlenderbotTokenizer.from_pretrained(mname)
>>> UTTERANCE = "My friends are cool but they eat too many carbs."
>>> inputs = tokenizer([UTTERANCE], return_tensors="pt")
>>> reply_ids = model.generate(**inputs)
>>> print(tokenizer.batch_decode(reply_ids))
["<s> That's unfortunate. Are they trying to lose weight or are they just trying to be healthier?</s>"]
```

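Because generation is often batched, here is a minimal sketch of the padding advice above; it reuses `model` and `tokenizer` from the previous block and relies on the tokenizer's default right-padding (the second utterance is illustrative):

```python
>>> # Batch two utterances of different lengths; the shorter one is padded on the right
>>> batch = tokenizer(
...     ["My friends are cool but they eat too many carbs.", "Hello, how are you?"],
...     padding=True,
...     return_tensors="pt",
... )
>>> reply_ids = model.generate(**batch)
>>> print(tokenizer.batch_decode(reply_ids, skip_special_tokens=True))
```
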
## Implementation Notes

- Blenderbot uses a standard [seq2seq model transformer](https://arxiv.org/pdf/1706.03762.pdf) based architecture.
- Available checkpoints can be found in the [model hub](https://huggingface.co/models?search=blenderbot).
- This is the *default* Blenderbot model class. However, some smaller checkpoints, such as `facebook/blenderbot_small_90M`, have a different architecture and consequently should be used with [BlenderbotSmall], as sketched after this list.

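As a minimal sketch of that last note, the 90M checkpoint would be loaded with the dedicated small classes rather than the ones documented on this page:

```python
>>> from transformers import BlenderbotSmallForConditionalGeneration, BlenderbotSmallTokenizer

>>> # facebook/blenderbot_small_90M uses the BlenderbotSmall architecture, not Blenderbot
>>> small_model = BlenderbotSmallForConditionalGeneration.from_pretrained("facebook/blenderbot_small_90M")
>>> small_tokenizer = BlenderbotSmallTokenizer.from_pretrained("facebook/blenderbot_small_90M")
```
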
## Resources

- [Causal language modeling task guide](../tasks/language_modeling)
- [Translation task guide](../tasks/translation)
- [Summarization task guide](../tasks/summarization)

## BlenderbotConfig

[[autodoc]] BlenderbotConfig

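A minimal sketch of the usual configuration pattern in `transformers`; instantiating `BlenderbotConfig()` with no arguments is assumed to yield the library's default Blenderbot configuration:

```python
>>> from transformers import BlenderbotConfig, BlenderbotModel

>>> # Initialize a default configuration, then a randomly initialized model from it
>>> configuration = BlenderbotConfig()
>>> model = BlenderbotModel(configuration)

>>> # The model's configuration can be read back at any time
>>> configuration = model.config
```
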
## BlenderbotTokenizer

[[autodoc]] BlenderbotTokenizer
    - build_inputs_with_special_tokens

## BlenderbotTokenizerFast

[[autodoc]] BlenderbotTokenizerFast
    - build_inputs_with_special_tokens

## BlenderbotModel

See [`~transformers.BartModel`] for arguments to *forward* and *generate*.

[[autodoc]] BlenderbotModel
    - forward

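A minimal sketch of a bare forward pass following the Bart-style API referenced above; the input and decoder prompt sentences are illustrative:

```python
>>> from transformers import AutoTokenizer, BlenderbotModel

>>> model = BlenderbotModel.from_pretrained("facebook/blenderbot-400M-distill")
>>> tokenizer = AutoTokenizer.from_pretrained("facebook/blenderbot-400M-distill")

>>> inputs = tokenizer("Studies have shown that owning a dog is good for you", return_tensors="pt")
>>> decoder_input_ids = tokenizer("Studies show that", return_tensors="pt").input_ids
>>> outputs = model(input_ids=inputs.input_ids, decoder_input_ids=decoder_input_ids)
>>> last_hidden_states = outputs.last_hidden_state  # decoder hidden states of the last layer
```
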
## BlenderbotForConditionalGeneration

See [`~transformers.BartForConditionalGeneration`] for arguments to *forward* and *generate*.

[[autodoc]] BlenderbotForConditionalGeneration
    - forward

## BlenderbotForCausalLM

[[autodoc]] BlenderbotForCausalLM
    - forward

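A minimal sketch of using the decoder standalone as a causal language model; the `add_cross_attention=False` override and the sample sentence are illustrative assumptions:

```python
>>> from transformers import AutoTokenizer, BlenderbotForCausalLM

>>> tokenizer = AutoTokenizer.from_pretrained("facebook/blenderbot-400M-distill")
>>> # Load only the decoder, configured without cross-attention to the (absent) encoder
>>> model = BlenderbotForCausalLM.from_pretrained(
...     "facebook/blenderbot-400M-distill", add_cross_attention=False
... )
>>> inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
>>> outputs = model(**inputs)
>>> logits = outputs.logits  # next-token prediction scores
```
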
## TFBlenderbotModel

[[autodoc]] TFBlenderbotModel
    - call

## TFBlenderbotForConditionalGeneration

[[autodoc]] TFBlenderbotForConditionalGeneration
    - call

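For TensorFlow, a minimal sketch mirroring the PyTorch example earlier on this page (same checkpoint, tensors returned as `tf`):

```python
>>> from transformers import BlenderbotTokenizer, TFBlenderbotForConditionalGeneration

>>> mname = "facebook/blenderbot-400M-distill"
>>> model = TFBlenderbotForConditionalGeneration.from_pretrained(mname)
>>> tokenizer = BlenderbotTokenizer.from_pretrained(mname)
>>> inputs = tokenizer("My friends are cool but they eat too many carbs.", return_tensors="tf")
>>> reply_ids = model.generate(**inputs)
>>> print(tokenizer.batch_decode(reply_ids, skip_special_tokens=True))
```
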
## FlaxBlenderbotModel

[[autodoc]] FlaxBlenderbotModel
    - __call__
    - encode
    - decode

## FlaxBlenderbotForConditionalGeneration

[[autodoc]] FlaxBlenderbotForConditionalGeneration
    - __call__
    - encode
    - decode

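For Flax, a minimal sketch of the same round trip; it assumes Flax weights exist for the checkpoint (otherwise pass `from_pt=True` to `from_pretrained`), and Flax `generate` returns an output object whose `sequences` field holds the generated ids:

```python
>>> from transformers import BlenderbotTokenizer, FlaxBlenderbotForConditionalGeneration

>>> mname = "facebook/blenderbot-400M-distill"
>>> model = FlaxBlenderbotForConditionalGeneration.from_pretrained(mname)
>>> tokenizer = BlenderbotTokenizer.from_pretrained(mname)
>>> inputs = tokenizer("My friends are cool but they eat too many carbs.", return_tensors="np")
>>> reply_ids = model.generate(**inputs).sequences
>>> print(tokenizer.batch_decode(reply_ids, skip_special_tokens=True))
```
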