Soumyajit-7
/

adv-mathematics-reasoning-8b

@@ -1,23 +1,140 @@
 ---
-base_model: unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit
 tags:
-- text-generation-inference
-- transformers
 - unsloth
-- llama
-- trl
-- sft
-license: apache-2.0
 language:
 - en
 ---
-# Uploaded  model
-- **Developed by:** Soumyajit-7
-- **License:** apache-2.0
-- **Finetuned from model :** unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit
-This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 ---
+license: apache-2.0
+base_model: unsloth/DeepSeek-R1-Distill-Llama-8B
 tags:
+- text-generation
+- mathematics
+- reasoning
+- chain-of-thought
+- deepseek
 - unsloth
+- fine-tuned
 language:
 - en
+pipeline_tag: text-generation
 ---
+# DeepSeek R1 Math Reasoning Model
+This model is a fine-tuned version of [unsloth/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B) specialized for mathematical reasoning and problem-solving.
+## Model Description
+- **Base Model**: DeepSeek-R1-Distill-Llama-8B
+- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
+- **Dataset**: Mathematical reasoning dataset with chain-of-thought explanations
+- **Specialization**: Mathematical problem-solving with step-by-step reasoning
+## Features
+- **Chain-of-Thought Reasoning**: The model thinks through problems step-by-step before providing answers
+- **Mathematical Expertise**: Trained on mathematical problems and solutions
+- **Structured Responses**: Provides both reasoning process and final answers
+## Usage
+### Direct Usage
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+# Load the model and tokenizer
+model = AutoModelForCausalLM.from_pretrained("Soumyajit-7/adv-mathematics-reasoning-8b")
+tokenizer = AutoTokenizer.from_pretrained("Soumyajit-7/adv-mathematics-reasoning-8b")
+# Define the prompt format
+prompt = '''Below is an instruction that describes a task, paired with an input that provides further context.
+Write a response that appropriately completes the request.
+Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.
+### Instruction:
+You are a mathematics expert with advanced knowledge in problem-solving, logical reasoning, and mathematical concepts.
+Please solve the following mathematics problem.
+### Question:
+{}
+### Response:
+<think>'''
+# Example usage
+question = "If x + 5 = 12, what is the value of x?"
+inputs = tokenizer([prompt.format(question)], return_tensors="pt")
+with torch.no_grad():
+    outputs = model.generate(
+        **inputs,
+        max_new_tokens=500,
+        temperature=0.7,
+        do_sample=True,
+        pad_token_id=tokenizer.eos_token_id
+    )
+response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print(response.split("### Response:")[1])
+```
+### Using with Unsloth
+```python
+from unsloth import FastLanguageModel
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name="Soumyajit-7/adv-mathematics-reasoning-8b",
+    max_seq_length=2048,
+    dtype=None,
+    load_in_4bit=True,
+)
+FastLanguageModel.for_inference(model)
+# Use the model for inference...
+```
+## Training Details
+- **Training Framework**: Unsloth + TRL
+- **LoRA Rank**: 16
+- **LoRA Alpha**: 16
+- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+- **Learning Rate**: 2e-4
+- **Batch Size**: 2 (with gradient accumulation)
+- **Optimizer**: AdamW 8-bit
+## Model Performance
+This model excels at:
+- Mathematical problem-solving
+- Step-by-step reasoning
+- Chain-of-thought explanations
+- Arithmetic and algebraic problems
+- Logical reasoning tasks
+## Limitations
+- Specialized for mathematical reasoning; may not perform as well on general tasks
+- Requires specific prompt format for optimal performance
+- Limited to problems similar to the training data
+## License
+This model is released under the Apache 2.0 license.
+## Citation
+If you use this model, please cite:
+```bibtex
+@misc{deepseek-r1-math-reasoning,
+  title={DeepSeek R1 Math Reasoning Model},
+  author={Your Name},
+  year={2025},
+  howpublished={\url{https://huggingface.co/Soumyajit-7/adv-mathematics-reasoning-8b}},
+}
+```
+## Acknowledgments
+- Base model: [DeepSeek-AI](https://huggingface.co/deepseek-ai)
+- Fine-tuning framework: [Unsloth](https://github.com/unslothai/unsloth)
+- Training framework: [TRL](https://github.com/huggingface/trl)