---
license: mit
datasets:
- >-
  CreitinGameplays/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B-filtered-mistral
language:
- en
base_model:
- mistralai/Mistral-Nemo-Instruct-2407
pipeline_tag: text-generation
library_name: transformers
---

## Mistral Nemo 12B R1
![mistralthink](https://autumn.revolt.chat/attachments/zIqa-Q6gKlwm7BbOvKvFFRLHDdy5OOy30KcU5iFle1/image.png)

Fine-tuning took **12 hours** on **1x Nvidia H100** with the following settings:
- Batch size: 26
- Gradient accumulation steps: 1
- Epochs: 1
- Learning rate: 2e-5
- Warmup ratio: 0.1
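
To make the settings above concrete, here is a rough sketch of how they translate into optimizer steps and warmup steps. The example count of 250,000 is only an assumption read off the "250K" in the dataset name; the actual size of the filtered dataset is not stated here, so treat the resulting numbers as illustrative. The helper name `schedule_summary` is hypothetical.

```python
import math

# Settings listed above.
BATCH_SIZE = 26
GRAD_ACCUM_STEPS = 1
EPOCHS = 1
WARMUP_RATIO = 0.1

# ASSUMPTION: illustrative example count taken from "250K" in the dataset
# name; the real size of the filtered dataset may differ.
NUM_EXAMPLES = 250_000


def schedule_summary(num_examples, batch_size, grad_accum, epochs, warmup_ratio):
    """Return (effective_batch_size, total_steps, warmup_steps) for one GPU."""
    effective_batch = batch_size * grad_accum
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return effective_batch, total_steps, warmup_steps


effective_batch, total_steps, warmup_steps = schedule_summary(
    NUM_EXAMPLES, BATCH_SIZE, GRAD_ACCUM_STEPS, EPOCHS, WARMUP_RATIO
)
print(f"effective batch size: {effective_batch}")
print(f"total optimizer steps: {total_steps}")
print(f"warmup steps: {warmup_steps}")
```

With gradient accumulation set to 1, the effective batch size equals the per-device batch size, so every forward/backward pass is also an optimizer step.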

Run the model:
```python
import torch
from transformers import pipeline

model_id = "CreitinGameplays/Mistral-Nemo-12B-R1-v0.2"

pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful AI assistant named Mistral Nemo."},
    {"role": "user", "content": "How many r's are in strawberry?"}
]

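outputs = pipe(
    messages,
    do_sample=True,  # enable sampling explicitly so temperature/top_p/top_k are applied
    temperature=0.6,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1,
    max_new_tokens=2048
)

# The last message in the returned conversation is the assistant's reply.
print(outputs[0]["generated_text"][-1]["content"])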
```

### Example response generated with this model:

- User prompt:
```
how many r's are in strawberry?
```
- Model output:

```
<think>
To determine the number of 'r's in the word "strawberry," I will start by breaking down the word into its component parts: "straw" and "berry."

Next, I will count the number of 'r's in each part. The word "straw" contains one 'r,' and the word "berry" contains two 'r's.

Finally, I will add the number of 'r's from both parts together, which gives me a total of three 'r's in the word "strawberry."
</think>

To determine the number of 'r's in the word "strawberry," let's break it down step by step:

1. **Break down the word:**
   
   The word "strawberry" can be divided into two parts: "straw" and "berry."

2. **Count the 'r's in each part:**
   
   - In "straw": There is **1** 'r'.
   - In "berry": There are **2** 'r's.

3. **Add the 'r's together:**
   
   \( 1 \text{ (from straw)} + 2 \text{ (from berry)} = 3 \)

Therefore, the word "strawberry" contains **3** 'r's.

\[
\boxed{3}
\]</s>
```

Recommended system prompt to activate the reasoning step:
```
You are an AI focused on providing systematic, well-reasoned responses. Response Structure: - Format: <think>{reasoning}</think>{answer} - Reasoning: Minimum 6 logical steps only when it required in <think> block - Process: Think first, then answer.
```
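
Because the response follows the `<think>{reasoning}</think>{answer}` format, downstream code will often want to separate the two parts. The following is a minimal sketch of one way to do that with a regular expression; the helper name `split_reasoning` is hypothetical, and it assumes the end-of-sequence marker appears literally as `</s>`, as in the example output above.

```python
import re

# Matches the first <think>...</think> block, including newlines inside it.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str, eos_token: str = "</s>") -> tuple[str, str]:
    """Split a model response into (reasoning, answer).

    If no <think> block is present, reasoning is an empty string and the
    whole text is treated as the answer.
    """
    text = text.replace(eos_token, "")
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


# Shortened, made-up sample in the same format as the example response.
sample = "<think>\nstraw has 1 r, berry has 2.\n</think>\n\nThe word contains **3** r's.</s>"
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

Returning an empty reasoning string when the tags are missing keeps the helper usable with the plain system prompt, where the model may skip the `<think>` block.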

**Note**: The model was fine-tuned mainly on an English dataset, so it may not perform well in other languages.