CreitinGameplays/Mistral-Nemo-12B-R1-v0.2

CreitinGameplays committed 22 days ago

Commit dec316b · verified · 1 Parent(s): 5f78f5b

Update README.md

Files changed (1)

README.md +57 -1

@@ -1,4 +1,60 @@
 ---
 pipeline_tag: text-generation
 library_name: transformers
----

 ---
+license: mit
+datasets:
+- >-
+  CreitinGameplays/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B-filtered-mistral
+language:
+- en
+base_model:
+- mistralai/Mistral-Nemo-Instruct-2407
 pipeline_tag: text-generation
 library_name: transformers
+---
+## Mistral Nemo 12B R1
+![mistralthink](https://autumn.revolt.chat/attachments/zIqa-Q6gKlwm7BbOvKvFFRLHDdy5OOy30KcU5iFle1/image.png)
+Finetuning took **12 hours** on **1x Nvidia H100** with the following settings:
+- Batch size: 26
+- Gradient accumulation steps: 1
+- Epochs: 1
+- Learning rate: 2e-5
+- Warmup ratio: 0.1
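To make the settings above concrete, here is a minimal sketch of how they translate into optimizer steps and warmup steps. The `FinetuneSettings` class, the `schedule_summary` helper, and the `num_examples` value are hypothetical: the README does not state the size of the filtered dataset, so 250,000 is only the nominal figure suggested by the dataset name, and rounding warmup steps up is an assumption that a given training library may not share.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FinetuneSettings:
    """Hyperparameters as listed in the README (container name is hypothetical)."""
    batch_size: int = 26
    gradient_accumulation_steps: int = 1
    epochs: int = 1
    learning_rate: float = 2e-5
    warmup_ratio: float = 0.1


def schedule_summary(settings: FinetuneSettings, num_examples: int) -> dict:
    """Derive step counts from the settings for a dataset of num_examples rows."""
    # Examples consumed per optimizer update on a single GPU.
    effective_batch = settings.batch_size * settings.gradient_accumulation_steps
    # The last partial batch still counts as one step.
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    total_steps = steps_per_epoch * settings.epochs
    # Assumption: warmup steps are rounded up.
    warmup_steps = math.ceil(total_steps * settings.warmup_ratio)
    return {
        "effective_batch": effective_batch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# ASSUMPTION: 250_000 is a placeholder, not the real filtered dataset size.
summary = schedule_summary(FinetuneSettings(), num_examples=250_000)
print(summary)
```

With a gradient accumulation of 1, the effective batch size equals the per-device batch size, so the learning rate warms up over roughly the first tenth of the single epoch.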
+Run the model:
+```python
+import torch
+from transformers import pipeline
+model_id = "CreitinGameplays/Mistral-Nemo-12B-R1-v0.2"
+pipe = pipeline(
+    "text-generation",
+    model=model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto"
+)
+messages = [
+    {"role": "system", "content": "You are a helpful AI assistant."},
+    {"role": "user", "content": "How many r's are in strawberry?"}
+]
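+outputs = pipe(
+    messages,
+    do_sample=True,  # sampling must be enabled for temperature/top_p/top_k to apply
+    temperature=0.6,
+    top_p=1.0,
+    top_k=50,
+    repetition_penalty=1.1,
+    max_new_tokens=2048  # also bounds the output if the model starts looping
+)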
+print(outputs[0]["generated_text"][-1])
+```
+Recommended system prompt:
+```
+You are an AI focused on providing systematic, well-reasoned responses. Response Structure: - Format: <think>{reasoning}</think>{answer} - Process: Think first, then answer.
+```
+**Note**: The model was mainly finetuned on an English dataset, so it may not perform well in other languages. It may also enter an infinite response loop after the reasoning step.
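
The recommended system prompt asks for output shaped as `<think>{reasoning}</think>{answer}`. A minimal sketch of separating the two parts on the client side is shown below. The `split_reasoning` helper is hypothetical and not part of the model repository, and it assumes the tags appear as literal text in the decoded output. It also covers the case where generation stops before `</think>` is emitted, for example when `max_new_tokens` is reached.

```python
from typing import Optional, Tuple

OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def split_reasoning(text: str) -> Tuple[Optional[str], str]:
    """Split model output into (reasoning, answer).

    Returns (None, text) when no <think> tag is present, and
    (reasoning, "") when the reasoning block was never closed.
    """
    start = text.find(OPEN_TAG)
    if start == -1:
        # The model answered directly without a reasoning block.
        return None, text.strip()

    body = text[start + len(OPEN_TAG):]
    end = body.find(CLOSE_TAG)
    if end == -1:
        # Unclosed block: generation was probably cut off mid-reasoning.
        return body.strip(), ""

    reasoning = body[:end].strip()
    answer = body[end + len(CLOSE_TAG):].strip()
    return reasoning, answer


# Hand-written sample string, not real model output.
sample = "<think>Spell it out: s-t-r-a-w-b-e-r-r-y.</think>There are 3 r's."
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

An empty answer alongside non-empty reasoning is a cheap signal that the response was truncated and may be worth retrying with different sampling settings.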