---
license: mit
datasets:
- >-
  CreitinGameplays/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B-filtered-mistral
language:
- en
base_model:
- mistralai/Mistral-Nemo-Instruct-2407
pipeline_tag: text-generation
library_name: transformers
---

## Mistral Nemo 12B R1

![mistralthink](https://autumn.revolt.chat/attachments/zIqa-Q6gKlwm7BbOvKvFFRLHDdy5OOy30KcU5iFle1/image.png)

Fine-tuning took **12 hours** on **1x NVIDIA H100** with the following settings (a hedged configuration sketch is given at the end of this card):

- Batch size: 26
- Gradient accumulation steps: 1
- Epochs: 1
- Learning rate: 2e-5
- Warmup ratio: 0.1

Run the model:

```python
import torch
from transformers import pipeline

model_id = "CreitinGameplays/Mistral-Nemo-12B-R1-v0.2"

# Load the model in bfloat16 and spread it across available devices.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful AI assistant named Mistral Nemo."},
    {"role": "user", "content": "How many r's are in strawberry?"}
]

# do_sample=True is required for temperature/top_p/top_k to take effect.
outputs = pipe(
    messages,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1,
    max_new_tokens=2048
)

# The pipeline returns the full chat history; the last message is the model's reply.
print(outputs[0]["generated_text"][-1])
```

Recommended system prompt:

```
You are an AI focused on providing systematic, well-reasoned responses.
Response Structure:
- Format: {reasoning}{answer}
- Process: Think first, then answer.
```

**Note**: The model was finetuned mainly on an English dataset, so it may not perform well in other languages.
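With that system prompt, completions follow the `{reasoning}{answer}` pattern, so the chain of thought can be stripped from the final answer. The sketch below is illustrative rather than part of this card's tooling: it assumes the reasoning is wrapped in `<think>...</think>` tags (common for R1-distilled data), and the `split_reasoning` helper is hypothetical. Adjust the pattern if your outputs use different delimiters.

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Hypothetical helper: split a completion into (reasoning, answer).

    Assumes the chain of thought is wrapped in <think>...</think> tags;
    change the pattern if this model emits different delimiters.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No tags found: treat the whole completion as the answer.
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

# Continuing from the generation example above:
reply = outputs[0]["generated_text"][-1]["content"]
reasoning, answer = split_reasoning(reply)
print(answer)
```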
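For anyone trying to reproduce a similar run, the hyperparameters listed above map onto a standard TRL configuration. This is a minimal sketch, not the actual training script (which isn't published here): the choice of TRL's `SFTTrainer`, the `bf16` flag, and the output path are assumptions; only the hyperparameter values, dataset, and base model come from this card.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Dataset named in this card's metadata; assumed to be in a chat format
# that SFTTrainer can consume directly.
dataset = load_dataset(
    "CreitinGameplays/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B-filtered-mistral",
    split="train",
)

config = SFTConfig(
    output_dir="Mistral-Nemo-12B-R1",  # assumed path
    per_device_train_batch_size=26,    # batch size from this card
    gradient_accumulation_steps=1,
    num_train_epochs=1,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    bf16=True,                         # assumed; typical on an H100
)

trainer = SFTTrainer(
    model="mistralai/Mistral-Nemo-Instruct-2407",  # base model from this card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```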