metadata
license: mit
datasets:
- CreitinGameplays/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70Bmistral
language:
- en
base_model:
- mistralai/Mistral-Nemo-Instruct-2407
pipeline_tag: text-generation
library_name: transformers
Mistral Nemo 12B R1
Took 96 hours to finetune on 2x Nvidia RTX A6000 with the following settings:
- Batch size: 3
- Gradient accumulation steps: 1
- Epochs: 1
- Learning rate: 1e-4
- Warmup ratio: 0.1
Run the model:
import torch
from transformers import pipeline
model_id = "CreitinGameplays/Mistral-Nemo-12B-R1-v0.1"
pipe = pipeline(
"text-generation",
model=model_id,
torch_dtype=torch.bfloat16,
device_map="auto"
)
messages = [
{"role": "system", "content": "You are a helpful AI assistant."},
{"role": "user", "content": "How many r's are in strawberry?"}
]
outputs = pipe(
messages,
temperature=0.4,
repetition_penalty=1.1,
max_new_tokens=2048
)
print(outputs[0]["generated_text"][-1])
System prompt:
You are an AI focused on providing systematic, well-reasoned responses. Response Structure: - Format: <think>{reasoning}</think>{answer} - Process: Think first, then answer. Always use your reasoning capabilities.