CreitinGameplays/Mistral-Nemo-12B-R1-v0.1

CreitinGameplays committed on Feb 21

Commit d80ff86 · verified · 1 Parent(s): daca9b4

Update README.md

Files changed (1)

README.md +37 -1


@@ -11,4 +11,40 @@ library_name: transformers
 ---
 ## Mistral Nemo 12B R1
-![mistralthink](https://autumn.revolt.chat/attachments/m3u9-NhgOseyqu_oz7PANOgw4f1zz3_g8YLLE2O_gQ/an_orange_robot_thinking.jpeg)

+![mistralthink](https://autumn.revolt.chat/attachments/m3u9-NhgOseyqu_oz7PANOgw4f1zz3_g8YLLE2O_gQ/an_orange_robot_thinking.jpeg)
+Took **96 hours** to finetune on **2x Nvidia RTX A6000** with the following settings:
+- Batch size: 3
+- Gradient accumulation steps: 1
+- Epochs: 1
+- Learning rate: 1e-4
+- Warmup ratio: 0.1
+Run the model:
+```python
+import torch
+from transformers import pipeline
+model_id = "CreitinGameplays/Mistral-Nemo-12B-R1-v0.1"
+pipe = pipeline(
+    "text-generation",
+    model=model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto"
+)
+messages = [
+    {"role": "system", "content": "You are a helpful AI assistant."},
+    {"role": "user", "content": "How many r's are in strawberry?"}
+]
+outputs = pipe(
+    messages,
+    temperature=0.4,
+    repetition_penalty=1.1,
+    max_new_tokens=2048
+)
+print(outputs[0]["generated_text"][-1])
+```
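
The last line of the snippet prints the final chat message as a dictionary with `role` and `content` keys rather than as plain text. Below is a minimal sketch of pulling out just the reply text, assuming the chat-style result structure that the `transformers` text-generation pipeline returns when given a list of messages. The helper `last_reply_text` and the `sample_outputs` stand-in are hypothetical names used for illustration and are not part of the README.

```python
def last_reply_text(outputs):
    """Return the text of the final message in a chat-style pipeline result.

    Assumes `outputs` looks like [{"generated_text": [{role, content}, ...]}],
    i.e. the conversation with the newly generated message appended last.
    """
    final_message = outputs[0]["generated_text"][-1]
    return final_message["content"]


# Hypothetical stand-in for a real pipeline result (no model is loaded here).
sample_outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "You are a helpful AI assistant."},
            {"role": "user", "content": "How many r's are in strawberry?"},
            {"role": "assistant", "content": "(model reply goes here)"},
        ]
    }
]

print(last_reply_text(sample_outputs))
```

With the real pipeline from the README, `print(last_reply_text(outputs))` should show only the assistant's text instead of the whole message dictionary.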