Update Readme.md: add links to ReAct and HotpotQA papers
README.md
CHANGED
@@ -17,9 +17,9 @@ This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://hu
 
 ## Model description
 
-This model has been generated in the context of the [Araft](https://github.com/FDeRubeis/Araft) project. The Araft project consists in fine-tuning a Llama2-7B model to enable the use of the ReAct pattern for Wikipedia-augmented question-answering. This model is the product of the first training step: SFT training.
+This model has been generated in the context of the [Araft](https://github.com/FDeRubeis/Araft) project. The Araft project consists in fine-tuning a Llama2-7B model to enable the use of the [ReAct](https://arxiv.org/abs/2210.03629) pattern for Wikipedia-augmented question-answering. This model is the product of the first training step: SFT training.
 
-In the SFT training step, the trajectories from the [Araft dataset](https://huggingface.co/datasets/FDeRubeis/araft) have been used to fine-tune the model, using each step as a desired output for the previous part of the trajectory. The model achieves a 16% performace (f1 score) on the HotpotQA dataset.
+In the SFT training step, the trajectories from the [Araft dataset](https://huggingface.co/datasets/FDeRubeis/araft) have been used to fine-tune the model, using each step as a desired output for the previous part of the trajectory. The model achieves a 16% performace (f1 score) on the [HotpotQA dataset](https://hotpotqa.github.io/).
 
 For further information, please see the [Araft](https://github.com/FDeRubeis/Araft) github repo.
 