FDeRubeis committed
Commit 833e3cb · verified · 1 Parent(s): 1e9a02e

Update Readme.md: add links to ReAct and HotpotQA papers

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -17,9 +17,9 @@ This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://hu
 
  ## Model description
 
- This model has been generated in the context of the [Araft](https://github.com/FDeRubeis/Araft) project. The Araft project consists in fine-tuning a Llama2-7B model to enable the use of the ReAct pattern for Wikipedia-augmented question-answering. This model is the product of the first training step: SFT training.
+ This model has been generated in the context of the [Araft](https://github.com/FDeRubeis/Araft) project. The Araft project consists in fine-tuning a Llama2-7B model to enable the use of the [ReAct](https://arxiv.org/abs/2210.03629) pattern for Wikipedia-augmented question-answering. This model is the product of the first training step: SFT training.
 
- In the SFT training step, the trajectories from the [Araft dataset](https://huggingface.co/datasets/FDeRubeis/araft) have been used to fine-tune the model, using each step as a desired output for the previous part of the trajectory. The model achieves a 16% performace (f1 score) on the HotpotQA dataset.
+ In the SFT training step, the trajectories from the [Araft dataset](https://huggingface.co/datasets/FDeRubeis/araft) have been used to fine-tune the model, using each step as a desired output for the previous part of the trajectory. The model achieves a 16% performace (f1 score) on the [HotpotQA dataset](https://hotpotqa.github.io/).
 
  For further information, please see the [Araft](https://github.com/FDeRubeis/Araft) github repo.
 
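
The README text in this diff says each trajectory step is used as the desired output for the preceding part of the trajectory. A minimal sketch of that data construction might look like the following. This is an illustration only, not the Araft project's actual code: the function name `build_sft_pairs`, the `Question:` prefix, and the representation of a trajectory as a list of plain strings are all assumptions, and the source does not say whether observations are included as training targets.

```python
from typing import Dict, List


def build_sft_pairs(question: str, steps: List[str]) -> List[Dict[str, str]]:
    """Turn one trajectory into prompt/completion pairs (hypothetical helper).

    Every step becomes the target for the text that precedes it:
    the question plus all earlier steps.
    """
    pairs: List[Dict[str, str]] = []
    context = f"Question: {question}\n"  # assumed prompt layout
    for step in steps:
        pairs.append({"prompt": context, "completion": step})
        context += step + "\n"  # the step joins the context of the next pair
    return pairs


if __name__ == "__main__":
    # Invented example trajectory, for illustration only.
    example = build_sft_pairs(
        "Who wrote Some Book?",
        ["Thought: I should search for the book.", "Action: Search[Some Book]"],
    )
    for pair in example:
        print(repr(pair["prompt"]), "->", repr(pair["completion"]))
```

A trajectory with N steps yields N training examples under this scheme, with prompts that grow by one step each time.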