---
license: apache-2.0
base_model: unsloth/DeepSeek-R1-Distill-Llama-8B
tags:
- text-generation
- mathematics
- reasoning
- chain-of-thought
- deepseek
- unsloth
- fine-tuned
language:
- en
pipeline_tag: text-generation
---

# Adv Mathematics Reasoning Model

This model is a fine-tuned version of [unsloth/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B) specialized for mathematical reasoning and problem-solving.

## Model Description

- **Base Model**: DeepSeek-R1-Distill-Llama-8B
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Dataset**: Mathematical reasoning dataset with chain-of-thought explanations
- **Specialization**: Mathematical problem-solving with step-by-step reasoning

## Features

- **Chain-of-Thought Reasoning**: The model thinks through problems step by step before providing an answer
- **Mathematical Expertise**: Trained on mathematical problems and worked solutions
- **Structured Responses**: Provides both the reasoning process and a final answer (a parsing sketch is included in the appendix at the end of this card)

## Usage

### Direct Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Soumyajit-7/adv-mathematics-reasoning-8b",
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place the model on GPU if available
)
tokenizer = AutoTokenizer.from_pretrained("Soumyajit-7/adv-mathematics-reasoning-8b")

# Define the prompt format
prompt = '''Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a mathematics expert with advanced knowledge in problem-solving, logical reasoning, and mathematical concepts.
Please solve the following mathematics problem.

### Question:
{}

### Response:
'''

# Example usage
question = "If x + 5 = 12, what is the value of x?"
inputs = tokenizer([prompt.format(question)], return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=500,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response.split("### Response:")[1])
```

### Using with Unsloth

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Soumyajit-7/adv-mathematics-reasoning-8b",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

# Use the model for inference...
```

## Training Details

- **Training Framework**: Unsloth + TRL
- **LoRA Rank**: 16
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Learning Rate**: 2e-4
- **Batch Size**: 2 (with gradient accumulation)
- **Optimizer**: AdamW 8-bit

A code sketch showing how these settings map onto an Unsloth + TRL script is given in the Training Configuration Sketch section below.

## Model Performance

This model excels at:

- Mathematical problem-solving
- Step-by-step reasoning
- Chain-of-thought explanations
- Arithmetic and algebraic problems
- Logical reasoning tasks

## Limitations

- Specialized for mathematical reasoning; may not perform as well on general tasks
- Requires the prompt format shown above for optimal performance
- Limited to problems similar to the training data

## License

This model is released under the Apache 2.0 license.
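## Training Configuration Sketch

A minimal sketch of how the hyperparameters listed under Training Details map onto an Unsloth + TRL script. The dataset path, dropout, gradient-accumulation steps, and epoch count are placeholders or assumptions (the exact training script and data are not published with this card), and the `SFTTrainer` call follows the older TRL signature used in Unsloth notebooks; newer TRL releases move several of these arguments into `SFTConfig`.

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model in 4-bit (assumption: mirrors the inference snippet above).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

# Attach LoRA adapters with the rank, alpha, and target modules listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0,                        # assumption: not stated on this card
    bias="none",
    use_gradient_checkpointing="unsloth",
)

# Placeholder dataset: a JSONL file with a single "text" column holding examples
# already rendered with the prompt template from the Usage section.
dataset = load_dataset("json", data_files="math_cot_train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,     # batch size listed above
        gradient_accumulation_steps=4,     # assumption: exact value not stated
        learning_rate=2e-4,
        optim="adamw_8bit",                # AdamW 8-bit optimizer
        num_train_epochs=1,                # assumption
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()
```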
## Citation

If you use this model, please cite:

```bibtex
@misc{adv-mathematics-reasoning,
  title={Adv Mathematics Reasoning Model},
  author={Soumyajit Biswas},
  year={2025},
  howpublished={\url{https://huggingface.co/Soumyajit-7/adv-mathematics-reasoning-8b}},
}
```

## Acknowledgments

- Base model: [DeepSeek-AI](https://huggingface.co/deepseek-ai)
- Fine-tuning framework: [Unsloth](https://github.com/unslothai/unsloth)
- Training framework: [TRL](https://github.com/huggingface/trl)
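## Appendix: Separating Reasoning from the Final Answer

Since responses contain both a reasoning process and a final answer, the helper below is a hedged sketch for splitting the two. It assumes the fine-tune preserved the base model's `<think>...</think>` delimiters; if a completion has no such tags, the whole completion is returned as the answer. `response` is the decoded output from the Direct Usage example.

```python
import re

def split_reasoning(completion: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> tags appear."""
    match = re.search(r"<think>(.*?)</think>", completion, flags=re.DOTALL)
    if match:
        return match.group(1).strip(), completion[match.end():].strip()
    return "", completion.strip()

reasoning, answer = split_reasoning(response.split("### Response:")[1])
print("Reasoning:\n", reasoning)
print("Answer:\n", answer)
```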