license: apache-2.0
---

# Schaapje-2B-Chat-V1.0

## Model description

This is the DPO-aligned model based on the SFT-trained model [robinsmits/Schaapje-2B-Chat-SFT-V1.0](https://huggingface.co/robinsmits/Schaapje-2B-Chat-SFT-V1.0).

General Dutch chat and instruction following work quite well with this model.

## Model usage

A basic example of how to use this DPO-aligned model for chat or instruction following:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = 'cuda'
model_name = 'robinsmits/Schaapje-2B-Chat-V1.0'

# Load the model in bfloat16 together with its tokenizer
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             device_map = "auto",
                                             torch_dtype = torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build a prompt with the model's chat template
messages = [{"role": "user", "content": "Hoi hoe gaat het ermee?"}]
chat = tokenizer.apply_chat_template(messages,
                                     tokenize = False,
                                     add_generation_prompt = True)

# Tokenize the prompt and generate a response
input_tokens = tokenizer(chat, return_tensors = "pt").to(device)
output = model.generate(**input_tokens,
                        max_new_tokens = 512,
                        do_sample = True)

output = tokenizer.decode(output[0], skip_special_tokens = False)
print(output)
```
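The example above decodes the full returned sequence, which includes the prompt. If you only want the model's reply, slice off the prompt tokens before decoding. A minimal sketch of that slicing, with plain lists standing in for token-id tensors (the ids below are made up):

```python
# generate() returns the prompt ids followed by the newly generated ids.
# With real tensors this is: output[0][input_tokens["input_ids"].shape[1]:]
prompt_ids = [101, 2023, 2003, 1037]             # hypothetical prompt token ids
generated = prompt_ids + [7592, 2088, 999, 102]  # what generate() would return

reply_ids = generated[len(prompt_ids):]          # keep only the new tokens
print(reply_ids)  # → [7592, 2088, 999, 102]
```

Decoding `reply_ids` with `tokenizer.decode(...)` then yields only the assistant's answer.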

## Intended uses & limitations

As with all LLMs, this model can exhibit bias and hallucinations. Regardless of how you use this model, always perform the necessary testing and validation.

## Datasets and Licenses

The following dataset was used for DPO alignment:

- [BramVanroy/ultra_feedback_dutch_cleaned](https://huggingface.co/datasets/BramVanroy/ultra_feedback_dutch_cleaned): apache-2.0
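DPO alignment trains on preference pairs. A minimal sketch of what one record in such a preference dataset looks like (the field names follow the common DPO convention and the Dutch text is made up; neither is copied from the actual dataset):

```python
# Illustrative DPO preference record. The "prompt"/"chosen"/"rejected"
# field names are the usual DPO convention; the values are hypothetical.
record = {
    "prompt": "Wat is de hoofdstad van Nederland?",
    "chosen": "De hoofdstad van Nederland is Amsterdam.",
    "rejected": "De hoofdstad van Nederland is Rotterdam.",
}

# A DPO trainer consumes these (prompt, chosen, rejected) triples.
print(sorted(record))  # → ['chosen', 'prompt', 'rejected']
```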

## Model Training

The notebook used to train this DPO-aligned model is available here: [Schaapje-2B-Chat-DPO-V1.0](https://github.com/RobinSmits/Schaapje/blob/main/Schaapje-2B-Chat-DPO-V1.0.ipynb)
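For reference, the objective behind DPO alignment can be sketched numerically. The helper below computes the standard DPO loss for a single preference pair from (hypothetical) log-probabilities; it illustrates the formula only and is not code taken from the training notebook:

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta = 0.1):
    # Log-probability margins of the policy over the reference model,
    # for the chosen and rejected responses of one preference pair.
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # DPO loss: -log(sigmoid(margin))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# With equal log-probabilities the loss starts at log(2) ≈ 0.693; it falls
# once the policy prefers the chosen answer more strongly than the reference.
print(dpo_loss(-5.0, -9.0, -6.0, -8.0))  # below log(2)
```

Minimizing this loss pushes the policy to assign relatively more probability to the chosen response than the reference model does, which is what the alignment step above optimizes.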