Blenderbot
Overview
The Blender chatbot model was proposed in Recipes for building an open-domain chatbot by Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu,
Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau and Jason Weston on 30 Apr 2020.
The abstract of the paper is the following:
Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that
scaling neural models in the number of parameters and the size of the data they are trained on gives improved results,
we show that other ingredients are important for a high-performing chatbot. Good conversation requires a number of
skills that an expert conversationalist blends in a seamless way: providing engaging talking points and listening to
their partners, and displaying knowledge, empathy and personality appropriately, while maintaining a consistent
persona. We show that large scale models can learn these skills when given appropriate training data and choice of
generation strategy. We build variants of these recipes with 90M, 2.7B and 9.4B parameter models, and make our models
and code publicly available. Human evaluations show our best models are superior to existing approaches in multi-turn
dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing
failure cases of our models.
This model was contributed by sshleifer. The authors' code can be found here.
Usage tips and example
Blenderbot uses absolute position embeddings, so it is usually advisable to pad the inputs on the right
rather than on the left.
An example:
```python
from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration

mname = "facebook/blenderbot-400M-distill"
model = BlenderbotForConditionalGeneration.from_pretrained(mname)
tokenizer = BlenderbotTokenizer.from_pretrained(mname)

UTTERANCE = "My friends are cool but they eat too many carbs."
inputs = tokenizer([UTTERANCE], return_tensors="pt")
reply_ids = model.generate(**inputs)
print(tokenizer.batch_decode(reply_ids))
# [" That's unfortunate. Are they trying to lose weight or are they just trying to be healthier?"]
```
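The right-padding tip matters once several utterances of different lengths are batched together. With absolute position embeddings, each token receives the embedding of the slot it occupies, so left padding pushes the real tokens of a short sequence to higher position indices than they would have on their own. The following is a minimal, library-free sketch of that effect; the token ids, `PAD_ID = 0`, and the helper names `pad_batch` and `real_token_positions` are illustrative assumptions, not part of the transformers API.

```python
def pad_batch(sequences, pad_id, side="right"):
    """Pad lists of token ids to a common length and build attention masks."""
    if side not in ("right", "left"):
        raise ValueError("side must be 'right' or 'left'")
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        pads, real = [pad_id] * n_pad, [1] * len(seq)
        if side == "right":
            input_ids.append(list(seq) + pads)
            attention_mask.append(real + [0] * n_pad)
        else:
            input_ids.append(pads + list(seq))
            attention_mask.append([0] * n_pad + real)
    return input_ids, attention_mask


def real_token_positions(mask):
    """Absolute position indices occupied by non-padding tokens."""
    return [i for i, m in enumerate(mask) if m == 1]


# Hypothetical token ids; PAD_ID = 0 is an assumption made for this sketch.
PAD_ID = 0
batch = [[11, 12, 13, 14], [21, 22]]

right_ids, right_mask = pad_batch(batch, PAD_ID, side="right")
left_ids, left_mask = pad_batch(batch, PAD_ID, side="left")

# Right padding keeps the short sequence at positions 0 and 1;
# left padding moves the same two tokens to positions 2 and 3.
print(real_token_positions(right_mask[1]))
print(real_token_positions(left_mask[1]))
```

With the real tokenizer, the equivalent step is a call such as `tokenizer(list_of_utterances, padding=True, return_tensors="pt")`, which pads on the side given by `tokenizer.padding_side`.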
Implementation Notes
Blenderbot uses a standard transformer-based seq2seq architecture.
Available checkpoints can be found in the model hub.
This is the default Blenderbot model class. However, some smaller checkpoints, such as
facebook/blenderbot_small_90M, have a different architecture and consequently should be used with
BlenderbotSmall.
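One way to act on this note is to choose the class family before loading a checkpoint. The sketch below is an illustration under stated assumptions: the helper names `blenderbot_class_names` and `load_blenderbot` are hypothetical, and deciding by the `blenderbot_small` name prefix is a heuristic invented for this example, not a rule of the library. The class names it returns (`BlenderbotSmallTokenizer`, `BlenderbotSmallForConditionalGeneration`, `BlenderbotTokenizer`, `BlenderbotForConditionalGeneration`) are existing transformers classes.

```python
import importlib


def blenderbot_class_names(checkpoint):
    """Return (tokenizer_class_name, model_class_name) for a checkpoint id.

    Assumption: small checkpoints are recognised by a 'blenderbot_small'
    prefix in the repository name. This is a heuristic for the sketch only.
    """
    repo_name = checkpoint.split("/")[-1].lower()
    if repo_name.startswith("blenderbot_small"):
        return "BlenderbotSmallTokenizer", "BlenderbotSmallForConditionalGeneration"
    return "BlenderbotTokenizer", "BlenderbotForConditionalGeneration"


def load_blenderbot(checkpoint):
    """Load the tokenizer and model with the class family chosen above.

    transformers is imported lazily so the name-selection helper can be
    used (and tested) without the library or network access.
    """
    tokenizer_name, model_name = blenderbot_class_names(checkpoint)
    transformers = importlib.import_module("transformers")
    tokenizer = getattr(transformers, tokenizer_name).from_pretrained(checkpoint)
    model = getattr(transformers, model_name).from_pretrained(checkpoint)
    return tokenizer, model


# Only the name selection runs here; nothing is downloaded.
small_classes = blenderbot_class_names("facebook/blenderbot_small_90M")
default_classes = blenderbot_class_names("facebook/blenderbot-400M-distill")
print(small_classes)
print(default_classes)
```

Calling `load_blenderbot("facebook/blenderbot-400M-distill")` would then return a tokenizer and model that can be used exactly as in the usage example above.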
Resources
Causal language modeling task guide
Translation task guide
Summarization task guide
BlenderbotConfig
[[autodoc]] BlenderbotConfig
BlenderbotTokenizer
[[autodoc]] BlenderbotTokenizer
- build_inputs_with_special_tokens
BlenderbotTokenizerFast
[[autodoc]] BlenderbotTokenizerFast
- build_inputs_with_special_tokens
BlenderbotModel
See [~transformers.BartModel] for arguments to forward and generate
[[autodoc]] BlenderbotModel
- forward
BlenderbotForConditionalGeneration
See [~transformers.BartForConditionalGeneration] for arguments to forward and generate
[[autodoc]] BlenderbotForConditionalGeneration
- forward
BlenderbotForCausalLM
[[autodoc]] BlenderbotForCausalLM
- forward
TFBlenderbotModel
[[autodoc]] TFBlenderbotModel
- call
TFBlenderbotForConditionalGeneration
[[autodoc]] TFBlenderbotForConditionalGeneration
- call
FlaxBlenderbotModel
[[autodoc]] FlaxBlenderbotModel
- call
- encode
- decode
FlaxBlenderbotForConditionalGeneration
[[autodoc]] FlaxBlenderbotForConditionalGeneration
- call
- encode
- decode