Soumyajit-7 committed on
Commit 9286522 · verified · 1 Parent(s): 46eb39b

Upload README.md with huggingface_hub

Files changed (1): README.md +130 -13
README.md CHANGED
@@ -1,23 +1,140 @@
  ---
- base_model: unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit
  tags:
- - text-generation-inference
- - transformers
  - unsloth
- - llama
- - trl
- - sft
- license: apache-2.0
  language:
  - en
  ---

- # Uploaded model

- - **Developed by:** Soumyajit-7
- - **License:** apache-2.0
- - **Finetuned from model:** unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit

- This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

  ---
+ license: apache-2.0
+ base_model: unsloth/DeepSeek-R1-Distill-Llama-8B
  tags:
+ - text-generation
+ - mathematics
+ - reasoning
+ - chain-of-thought
+ - deepseek
  - unsloth
+ - fine-tuned
  language:
  - en
+ pipeline_tag: text-generation
  ---

+ # DeepSeek R1 Math Reasoning Model
+
+ This model is a fine-tuned version of [unsloth/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B) specialized for mathematical reasoning and problem-solving.
+
+ ## Model Description
+
+ - **Base Model**: DeepSeek-R1-Distill-Llama-8B
+ - **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
+ - **Dataset**: Mathematical reasoning dataset with chain-of-thought explanations
+ - **Specialization**: Mathematical problem-solving with step-by-step reasoning
+
+ ## Features
+
+ - **Chain-of-Thought Reasoning**: The model thinks through problems step-by-step before providing answers
+ - **Mathematical Expertise**: Trained on mathematical problems and solutions
+ - **Structured Responses**: Provides both the reasoning process and the final answer (see the illustrative example below)
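+
+ For illustration, a well-formed completion has the following shape. This is a hand-written sketch of the output format, not a verbatim model transcript:
+
+ ```
+ <think>
+ x + 5 = 12, so x = 12 - 5 = 7.
+ </think>
+ The value of x is 7.
+ ```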
+
+ ## Usage
+
+ ### Direct Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ # Load the model and tokenizer (fp16 + device_map="auto" keeps an 8B model
+ # practical on a single GPU; requires the accelerate package)
+ model = AutoModelForCausalLM.from_pretrained(
+     "Soumyajit-7/adv-mathematics-reasoning-8b",
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+ tokenizer = AutoTokenizer.from_pretrained("Soumyajit-7/adv-mathematics-reasoning-8b")
+
+ # Define the prompt format the model was trained on
+ prompt = '''Below is an instruction that describes a task, paired with an input that provides further context.
+ Write a response that appropriately completes the request.
+ Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.
+
+ ### Instruction:
+ You are a mathematics expert with advanced knowledge in problem-solving, logical reasoning, and mathematical concepts.
+ Please solve the following mathematics problem.
+
+ ### Question:
+ {}
+
+ ### Response:
+ <think>'''
+
+ # Example usage
+ question = "If x + 5 = 12, what is the value of x?"
+ inputs = tokenizer([prompt.format(question)], return_tensors="pt").to(model.device)
+
+ with torch.no_grad():
+     outputs = model.generate(
+         **inputs,
+         max_new_tokens=500,
+         temperature=0.7,
+         do_sample=True,
+         pad_token_id=tokenizer.eos_token_id,
+     )
+
+ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+ print(response.split("### Response:")[1])
+ ```
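+
+ The completion places the model's reasoning inside `<think>` tags, followed by the final answer. Continuing from the example above, a minimal way to separate the two, assuming the model closes its reasoning with `</think>` (the usual convention for DeepSeek-R1 distills):
+
+ ```python
+ # Split the reasoning trace from the final answer.
+ # Assumes a completion of the form "<think> ... </think> final answer".
+ completion = response.split("### Response:")[1]
+ if "</think>" in completion:
+     reasoning, answer = completion.split("</think>", 1)
+     print("Reasoning:", reasoning.replace("<think>", "").strip())
+     print("Answer:", answer.strip())
+ else:
+     # Fallback: the model never emitted a closing tag
+     print(completion.strip())
+ ```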
+
+ ### Using with Unsloth
+
+ ```python
+ from unsloth import FastLanguageModel
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="Soumyajit-7/adv-mathematics-reasoning-8b",
+     max_seq_length=2048,
+     dtype=None,          # auto-detects float16/bfloat16 for the GPU
+     load_in_4bit=True,   # 4-bit quantization to reduce VRAM usage
+ )
+
+ # Switch on Unsloth's optimized inference mode
+ FastLanguageModel.for_inference(model)
+
+ # Inference then follows the same pattern as the transformers example above,
+ # reusing the same prompt template and question:
+ inputs = tokenizer([prompt.format(question)], return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=500, temperature=0.7, do_sample=True)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ ## Training Details
+
+ - **Training Framework**: Unsloth + TRL
+ - **LoRA Rank**: 16
+ - **LoRA Alpha**: 16
+ - **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+ - **Learning Rate**: 2e-4
+ - **Batch Size**: 2 (with gradient accumulation)
+ - **Optimizer**: AdamW 8-bit
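+
+ A minimal sketch of how these hyperparameters plug into an Unsloth + TRL training run. This is a reconstruction from the values listed above, not the published training script: the dataset shown is a stand-in (the actual math chain-of-thought data is not included here), and the gradient-accumulation value is assumed.
+
+ ```python
+ from unsloth import FastLanguageModel
+ from trl import SFTTrainer
+ from transformers import TrainingArguments
+ from datasets import Dataset
+
+ # Stand-in dataset: replace with the real math chain-of-thought data.
+ dataset = Dataset.from_dict({"text": [
+     "### Question:\nIf x + 5 = 12, what is x?\n\n### Response:\n<think>Subtract 5 from both sides.</think>\nx = 7",
+ ]})
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="unsloth/DeepSeek-R1-Distill-Llama-8B",
+     max_seq_length=2048,
+     load_in_4bit=True,
+ )
+
+ # Attach LoRA adapters with the rank/alpha/target modules listed above
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=16,
+     lora_alpha=16,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
+                     "gate_proj", "up_proj", "down_proj"],
+ )
+
+ trainer = SFTTrainer(
+     model=model,
+     tokenizer=tokenizer,
+     train_dataset=dataset,
+     dataset_text_field="text",
+     args=TrainingArguments(
+         per_device_train_batch_size=2,
+         gradient_accumulation_steps=4,  # assumed; the card only says "with gradient accumulation"
+         learning_rate=2e-4,
+         optim="adamw_8bit",
+         output_dir="outputs",
+     ),
+ )
+ trainer.train()
+ ```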
+
+ ## Model Performance
+
+ This model excels at:
+ - Mathematical problem-solving
+ - Step-by-step reasoning
+ - Chain-of-thought explanations
+ - Arithmetic and algebraic problems
+ - Logical reasoning tasks
+
+ ## Limitations
+
+ - Specialized for mathematical reasoning; may not perform as well on general tasks
+ - Requires the specific prompt format shown above for optimal performance
+ - Limited to problems similar to the training data
+
+ ## License
+
+ This model is released under the Apache 2.0 license.
+
+ ## Citation
+
+ If you use this model, please cite:
+
+ ```bibtex
+ @misc{deepseek-r1-math-reasoning,
+   title={DeepSeek R1 Math Reasoning Model},
+   author={Soumyajit-7},
+   year={2025},
+   howpublished={\url{https://huggingface.co/Soumyajit-7/adv-mathematics-reasoning-8b}},
+ }
+ ```
 
+ ## Acknowledgments
+
+ - Base model: [DeepSeek-AI](https://huggingface.co/deepseek-ai)
+ - Fine-tuning framework: [Unsloth](https://github.com/unslothai/unsloth)
+ - Training framework: [TRL](https://github.com/huggingface/trl)