CreitinGameplays committed dec316b (verified) · 1 Parent(s): 5f78f5b

Update README.md

  1. README.md +57 -1
README.md CHANGED
@@ -1,4 +1,60 @@
  ---
+ license: mit
+ datasets:
+ - >-
+   CreitinGameplays/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B-filtered-mistral
+ language:
+ - en
+ base_model:
+ - mistralai/Mistral-Nemo-Instruct-2407
  pipeline_tag: text-generation
  library_name: transformers
- ---
+ ---
+
+ ## Mistral Nemo 12B R1
+ ![mistralthink](https://autumn.revolt.chat/attachments/zIqa-Q6gKlwm7BbOvKvFFRLHDdy5OOy30KcU5iFle1/image.png)
+
+ Fine-tuning took **12 hours** on **1x Nvidia H100** with the following settings:
+ - Batch size: 26
+ - Gradient accumulation steps: 1
+ - Epochs: 1
+ - Learning rate: 2e-5
+ - Warmup ratio: 0.1
+
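The settings listed above fix the number of optimizer steps and the warmup length once the dataset size is known. A minimal sketch of that arithmetic follows. The README does not state the number of training examples, so `250_000` is only a placeholder suggested by the dataset name. The `schedule` helper is hypothetical, and rounding warmup up with a ceiling is an assumption that may differ from what a given trainer does.

```python
import math

def schedule(num_examples, batch_size=26, grad_accum=1, epochs=1,
             warmup_ratio=0.1, num_gpus=1):
    """Return (effective batch size, total optimizer steps, warmup steps)."""
    # Examples consumed per optimizer step across all devices.
    effective_batch = batch_size * grad_accum * num_gpus
    # The last partial batch still counts as a step (assumption).
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    # Warmup covers the first `warmup_ratio` fraction of training.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return effective_batch, total_steps, warmup_steps

# Placeholder dataset size; the real filtered dataset may be smaller.
print(schedule(250_000))  # (26, 9616, 962)
```

With gradient accumulation at 1 on a single GPU, the effective batch size equals the per-device batch size, so the learning rate of 2e-5 applies to batches of 26 examples.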
+ Run the model:
+ ```python
+ import torch
+ from transformers import pipeline
+
+ model_id = "CreitinGameplays/Mistral-Nemo-12B-R1-v0.2"
+
+ pipe = pipeline(
+     "text-generation",
+     model=model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto"
+ )
+
+ messages = [
+     {"role": "system", "content": "You are a helpful AI assistant."},
+     {"role": "user", "content": "How many r's are in strawberry?"}
+ ]
+
+ outputs = pipe(
+     messages,
+     temperature=0.6,
+     top_p=1.0,
+     top_k=50,
+     repetition_penalty=1.1,
+     max_new_tokens=2048
+ )
+
+ print(outputs[0]["generated_text"][-1])
+ ```
+
+ Recommended system prompt:
+ ```
+ You are an AI focused on providing systematic, well-reasoned responses. Response Structure: - Format: <think>{reasoning}</think>{answer} - Process: Think first, then answer.
+ ```
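Since the recommended prompt asks for output in the form `<think>{reasoning}</think>{answer}`, a caller will usually want to separate the two parts. The following is a minimal sketch using only the standard library. The `split_reasoning` helper is hypothetical and not part of the model or of `transformers`, and the sample string is hand-written for illustration, not real model output.

```python
import re

# Non-greedy match so only the first <think>...</think> span is captured.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text):
    """Split a response into (reasoning, answer).

    If no complete <think> block is present, the whole text is
    treated as the answer and the reasoning is empty.
    """
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

# Hand-written sample in the documented format.
sample = "<think>The letter r appears at positions 3, 8 and 9.</think>There are 3 r's."
reasoning, answer = split_reasoning(sample)
print(answer)  # There are 3 r's.
```

Falling back to the raw text when the tags are missing keeps the helper usable even if the model skips or truncates the reasoning block.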
+
+ **Note**: The model was fine-tuned mainly on an English dataset, so it may not perform well in other languages. It may also enter an infinite response loop after the reasoning step.