Update README.md
README.md
CHANGED
@@ -19,7 +19,7 @@ This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://hu
This model has been generated in the context of the [Araft](https://github.com/FDeRubeis/Araft) project. The Araft project consists of fine-tuning a Llama2-7B model to adapt it to use the ReAct pattern for Wikipedia-augmented question-answering. This model is the product of the first training step: SFT training.

- In the SFT training step, the trajectories from the [Araft dataset](https://huggingface.co/datasets/FDeRubeis/araft) have been used to fine-tune the model, using each step as a desired output for the previous part of the trajectory. The model
+ In the SFT training step, the trajectories from the [Araft dataset](https://huggingface.co/datasets/FDeRubeis/araft) have been used to fine-tune the model, using each step as a desired output for the previous part of the trajectory. The model achieves an F1 score of 16% on the HotpotQA dataset.

For further information, please see the [Araft](https://github.com/FDeRubeis/Araft) GitHub repo.
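The added line above summarizes the SFT objective: every step of a trajectory serves as the target completion for the trajectory prefix that precedes it. Below is a minimal sketch of that pairing logic, following the sentence literally; the record fields (`question`, `steps`) and the example trajectory are illustrative assumptions, not the actual schema of the Araft dataset, and details such as whether Observation steps are kept as targets or masked are defined in the Araft repo.

```python
# Illustrative sketch: turn one ReAct trajectory into supervised (prompt, completion)
# pairs, where the prompt is the trajectory so far and the completion is the next step.
# The field names ("question", "steps") and the example record are assumed for
# illustration; they are not the actual columns of the FDeRubeis/araft dataset.

def trajectory_to_sft_pairs(record):
    pairs = []
    context = f"Question: {record['question']}\n"
    for step in record["steps"]:              # Thought / Action / Observation strings
        pairs.append({"prompt": context, "completion": step})
        context += step + "\n"                # the step becomes part of the next prompt
    return pairs

example = {
    "question": "Which country hosted the 1992 Summer Olympics?",
    "steps": [
        "Thought: I need to look up the 1992 Summer Olympics.",
        "Action: search[1992 Summer Olympics]",
        "Observation: The 1992 Summer Olympics were held in Barcelona, Spain.",
        "Thought: The host country is Spain.",
        "Action: finish[Spain]",
    ],
}

for pair in trajectory_to_sft_pairs(example):
    print(repr(pair["completion"]))
```

Pairs in this form can then be fed to any standard causal-LM supervised fine-tuning loop.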
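Since the model is trained to follow the ReAct pattern (interleaved Thought, Action, and Observation steps, with Wikipedia lookups supplying the observations), it is meant to be driven by an agent loop rather than used as a plain chat model. The snippet below is a hedged sketch of a single generation call with the `transformers` API; the repo id `FDeRubeis/araft-sft`, the prompt layout, and the generation settings are assumptions for illustration, not the conventions of the Araft repo.

```python
# Minimal sketch (not the official Araft agent loop): query the SFT checkpoint
# with a ReAct-style prompt. The repo id, prompt layout, and stop handling are
# illustrative assumptions; see the Araft GitHub repo for the real implementation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "FDeRubeis/araft-sft"  # hypothetical id; replace with this model's repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

prompt = (
    "Answer the question by interleaving Thought, Action and Observation steps.\n"
    "Question: Which country hosted the 1992 Summer Olympics?\n"
    "Thought:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

In the full agent loop, generation would stop after each Action, the Wikipedia query would be executed, and its result would be appended to the prompt as an Observation before generating again.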