---
license: apache-2.0
base_model: unsloth/DeepSeek-R1-Distill-Llama-8B
tags:
- text-generation
- mathematics
- reasoning
- chain-of-thought
- deepseek
- unsloth
- fine-tuned
language:
- en
pipeline_tag: text-generation
---

# Adv Mathematics Reasoning Model

This model is a fine-tuned version of [unsloth/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B) specialized for mathematical reasoning and problem-solving.

## Model Description

- **Base Model**: DeepSeek-R1-Distill-Llama-8B
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Dataset**: Mathematical reasoning dataset with chain-of-thought explanations
- **Specialization**: Mathematical problem-solving with step-by-step reasoning

## Features

- **Chain-of-Thought Reasoning**: The model thinks through problems step by step before providing an answer
- **Mathematical Expertise**: Trained on mathematical problems and worked solutions
- **Structured Responses**: Provides both the reasoning process and a final answer (a parsing sketch is included in the appendix at the end of this card)

## Usage

### Direct Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Soumyajit-7/adv-mathematics-reasoning-8b",
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place the model on GPU if available
)
tokenizer = AutoTokenizer.from_pretrained("Soumyajit-7/adv-mathematics-reasoning-8b")

# Define the prompt format
prompt = '''Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a mathematics expert with advanced knowledge in problem-solving, logical reasoning, and mathematical concepts.
Please solve the following mathematics problem.

### Question:
{}

### Response:
'''

# Example usage
question = "If x + 5 = 12, what is the value of x?"
inputs = tokenizer([prompt.format(question)], return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=500,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response.split("### Response:")[1])
```

### Using with Unsloth

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Soumyajit-7/adv-mathematics-reasoning-8b",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

# Use the model for inference...
```

## Training Details

- **Training Framework**: Unsloth + TRL
- **LoRA Rank**: 16
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Learning Rate**: 2e-4
- **Batch Size**: 2 (with gradient accumulation)
- **Optimizer**: AdamW 8-bit

A code sketch showing how these settings map onto an Unsloth + TRL script is given in the Training Configuration Sketch section below.

## Model Performance

This model excels at:

- Mathematical problem-solving
- Step-by-step reasoning
- Chain-of-thought explanations
- Arithmetic and algebraic problems
- Logical reasoning tasks

## Limitations

- Specialized for mathematical reasoning; may not perform as well on general tasks
- Requires the prompt format shown above for optimal performance
- Limited to problems similar to the training data

## License

This model is released under the Apache 2.0 license.
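## Training Configuration Sketch

A minimal sketch of how the hyperparameters listed under Training Details map onto an Unsloth + TRL script. The dataset path, dropout, gradient-accumulation steps, and epoch count are placeholders or assumptions (the exact training script and data are not published with this card), and the `SFTTrainer` call follows the older TRL signature used in Unsloth notebooks; newer TRL releases move several of these arguments into `SFTConfig`.

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model in 4-bit (assumption: mirrors the inference snippet above).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

# Attach LoRA adapters with the rank, alpha, and target modules listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0,                        # assumption: not stated on this card
    bias="none",
    use_gradient_checkpointing="unsloth",
)

# Placeholder dataset: a JSONL file with a single "text" column holding examples
# already rendered with the prompt template from the Usage section.
dataset = load_dataset("json", data_files="math_cot_train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,     # batch size listed above
        gradient_accumulation_steps=4,     # assumption: exact value not stated
        learning_rate=2e-4,
        optim="adamw_8bit",                # AdamW 8-bit optimizer
        num_train_epochs=1,                # assumption
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()
```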
## Citation

If you use this model, please cite:

```bibtex
@misc{adv-mathematics-reasoning,
  title={Adv Mathematics Reasoning Model},
  author={Soumyajit Biswas},
  year={2025},
  howpublished={\url{https://huggingface.co/Soumyajit-7/adv-mathematics-reasoning-8b}},
}
```

## Acknowledgments

- Base model: [DeepSeek-AI](https://huggingface.co/deepseek-ai)
- Fine-tuning framework: [Unsloth](https://github.com/unslothai/unsloth)
- Training framework: [TRL](https://github.com/huggingface/trl)
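## Appendix: Separating Reasoning from the Final Answer

Since responses contain both a reasoning process and a final answer, the helper below is a hedged sketch for splitting the two. It assumes the fine-tune preserved the base model's `<think>...</think>` delimiters; if a completion has no such tags, the whole completion is returned as the answer. `response` is the decoded output from the Direct Usage example.

```python
import re

def split_reasoning(completion: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> tags appear."""
    match = re.search(r"<think>(.*?)</think>", completion, flags=re.DOTALL)
    if match:
        return match.group(1).strip(), completion[match.end():].strip()
    return "", completion.strip()

reasoning, answer = split_reasoning(response.split("### Response:")[1])
print("Reasoning:\n", reasoning)
print("Answer:\n", answer)
```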