|
|
|
# RAG
|
|
|
## Overview
|
Retrieval-augmented generation ("RAG") models combine the strengths of pretrained dense passage retrieval (DPR) and
|
sequence-to-sequence models. RAG models retrieve documents, pass them to a seq2seq model, then marginalize to generate |
|
outputs. The retriever and seq2seq modules are initialized from pretrained models, and fine-tuned jointly, allowing |
|
both retrieval and generation to adapt to downstream tasks. |
|
It is based on the paper [Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks](https://arxiv.org/abs/2005.11401) by Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir
|
Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela. |
|
The abstract from the paper is the following: |
|
Large pre-trained language models have been shown to store factual knowledge in their parameters, and achieve |
|
state-of-the-art results when fine-tuned on downstream NLP tasks. However, their ability to access and precisely |
|
manipulate knowledge is still limited, and hence on knowledge-intensive tasks, their performance lags behind |
|
task-specific architectures. Additionally, providing provenance for their decisions and updating their world knowledge |
|
remain open research problems. Pre-trained models with a differentiable access mechanism to explicit nonparametric |
|
memory can overcome this issue, but have so far been only investigated for extractive downstream tasks. We explore a |
|
general-purpose fine-tuning recipe for retrieval-augmented generation (RAG) — models which combine pre-trained |
|
parametric and non-parametric memory for language generation. We introduce RAG models where the parametric memory is a |
|
pre-trained seq2seq model and the non-parametric memory is a dense vector index of Wikipedia, accessed with a |
|
pre-trained neural retriever. We compare two RAG formulations, one which conditions on the same retrieved passages |
|
across the whole generated sequence, and another which can use different passages per token. We fine-tune and evaluate our
|
models on a wide range of knowledge-intensive NLP tasks and set the state-of-the-art on three open domain QA tasks, |
|
outperforming parametric seq2seq models and task-specific retrieve-and-extract architectures. For language generation |
|
tasks, we find that RAG models generate more specific, diverse and factual language than a state-of-the-art |
|
parametric-only seq2seq baseline. |
|
This model was contributed by ola13. |
|
## Usage tips
|
Retrieval-augmented generation ("RAG") models combine the strengths of pretrained dense passage retrieval (DPR) and seq2seq models.
|
RAG models retrieve documents, pass them to a seq2seq model, then marginalize to generate outputs. The retriever and seq2seq
|
modules are initialized from pretrained models, and fine-tuned jointly, allowing both retrieval and generation to adapt |
|
to downstream tasks. |
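The sketch below walks through that loop end to end with `RagSequenceForGeneration`. It assumes the `facebook/rag-sequence-nq` checkpoint, needs the `datasets` and `faiss` libraries installed, and uses the small dummy index (`use_dummy_dataset=True`) rather than the full Wikipedia index, so the answer is only illustrative.

```python
from transformers import RagRetriever, RagSequenceForGeneration, RagTokenizer

# Tokenizer bundles the DPR question-encoder tokenizer and the BART generator tokenizer.
tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")

# Retriever backed by a tiny dummy index; drop use_dummy_dataset to download the full Wikipedia index.
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)

# Question encoder + seq2seq generator, wired to the retriever.
model = RagSequenceForGeneration.from_pretrained("facebook/rag-sequence-nq", retriever=retriever)

# Encode the question, retrieve passages, and generate an answer marginalized over them.
inputs = tokenizer("who holds the record in 100m freestyle", return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```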
|
## RagConfig
|
[[autodoc]] RagConfig |
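A short sketch of composing a `RagConfig` from the two sub-model configs; the `from_question_encoder_generator_configs` classmethod and the `n_docs` field are assumed from the `RagConfig` API, and the configs shown are only illustrative defaults.

```python
from transformers import BartConfig, DPRConfig, RagConfig

# RAG wraps two sub-models: a DPR question encoder and a BART generator.
question_encoder_config = DPRConfig()
generator_config = BartConfig()

# n_docs controls how many passages are retrieved and marginalized over per query.
config = RagConfig.from_question_encoder_generator_configs(
    question_encoder_config, generator_config, n_docs=5
)
print(config.n_docs)
```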
|
## RagTokenizer
|
[[autodoc]] RagTokenizer |
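`RagTokenizer` wraps two tokenizers: the question-encoder tokenizer, used to encode the input question, and the generator tokenizer, used to decode generated answers. A minimal sketch, assuming the `facebook/rag-sequence-nq` checkpoint:

```python
from transformers import RagTokenizer

tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")

# Questions are encoded with the DPR question-encoder tokenizer ...
inputs = tokenizer("who holds the record in 100m freestyle", return_tensors="pt")
print(inputs["input_ids"].shape)

# ... while ids produced by model.generate(...) are decoded with the generator tokenizer
# via tokenizer.batch_decode (see the generation examples below).
```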
|
## Rag specific outputs
|
[[autodoc]] models.rag.modeling_rag.RetrievAugLMMarginOutput |
|
[[autodoc]] models.rag.modeling_rag.RetrievAugLMOutput |
|
## RagRetriever
|
[[autodoc]] RagRetriever |
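The retriever can also be instantiated on its own; a sketch assuming the `facebook/rag-sequence-nq` checkpoint and the `datasets`/`faiss` libraries. The commented-out custom-index variant is a hedged illustration of the `index_name="custom"` option with made-up paths.

```python
from transformers import RagRetriever

# Tiny dummy index for quick experiments; omit use_dummy_dataset to pull the
# full precomputed Wikipedia index (tens of GB).
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)

# A local knowledge source can be plugged in instead (illustrative paths):
# retriever = RagRetriever.from_pretrained(
#     "facebook/rag-sequence-nq",
#     index_name="custom",
#     passages_path="path/to/my_knowledge_dataset",
#     index_path="path/to/my_knowledge_dataset_index.faiss",
# )
```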
|
|
|
## RagModel
|
[[autodoc]] RagModel |
|
- forward |
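A forward-pass sketch with the headless `RagModel`, assuming the `facebook/rag-token-base` checkpoint and the dummy index; the printed shapes are only meant to show which outputs carry the per-document generator logits and the retrieval scores.

```python
import torch
from transformers import AutoTokenizer, RagModel, RagRetriever

tokenizer = AutoTokenizer.from_pretrained("facebook/rag-token-base")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-base", index_name="exact", use_dummy_dataset=True
)
model = RagModel.from_pretrained("facebook/rag-token-base", retriever=retriever)

inputs = tokenizer("How many people live in Paris?", return_tensors="pt")
with torch.no_grad():
    outputs = model(input_ids=inputs["input_ids"])

# Generator logits are computed once per retrieved document; doc_scores holds the
# retrieval scores that the generation heads marginalize over.
print(outputs.logits.shape)
print(outputs.doc_scores.shape)
```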
|
## RagSequenceForGeneration
|
[[autodoc]] RagSequenceForGeneration |
|
- forward |
|
- generate |
|
## RagTokenForGeneration
|
[[autodoc]] RagTokenForGeneration |
|
- forward |
|
- generate |
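The token-level formulation is used the same way as the sequence-level one shown earlier; a sketch assuming the `facebook/rag-token-nq` checkpoint and the dummy index:

```python
from transformers import AutoTokenizer, RagRetriever, RagTokenForGeneration

tokenizer = AutoTokenizer.from_pretrained("facebook/rag-token-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True
)
model = RagTokenForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)

# Generation marginalizes over the retrieved documents at every decoding step,
# so different passages can inform different tokens of the answer.
inputs = tokenizer("who holds the record in 100m freestyle", return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```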
|
|
|
## TFRagModel
|
[[autodoc]] TFRagModel |
|
- call |
|
## TFRagSequenceForGeneration
|
[[autodoc]] TFRagSequenceForGeneration |
|
- call |
|
- generate |
|
## TFRagTokenForGeneration
|
[[autodoc]] TFRagTokenForGeneration |
|
- call |
|
- generate |
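The TensorFlow classes mirror the PyTorch API. A sketch with `TFRagTokenForGeneration`, assuming the `facebook/rag-token-nq` checkpoint (converted from the released PyTorch weights via `from_pt=True`) and the dummy index:

```python
from transformers import AutoTokenizer, RagRetriever, TFRagTokenForGeneration

tokenizer = AutoTokenizer.from_pretrained("facebook/rag-token-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True
)
# from_pt=True converts the released PyTorch checkpoint on the fly.
model = TFRagTokenForGeneration.from_pretrained(
    "facebook/rag-token-nq", retriever=retriever, from_pt=True
)

inputs = tokenizer("who holds the record in 100m freestyle", return_tensors="tf")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```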
|
|
|
|