ToT-Reasoner-Qwen3-1.7B

Model Description

This model is a fine-tuned version of Qwen/Qwen3-1.7B using Supervised Fine-Tuning (SFT) on the HuggingFaceH4/MATH-500 dataset. It is optimized for mathematical reasoning and problem-solving tasks. The fine-tuning process was performed by EKAGRATA TECH PRIVATE LIMITED.

Training Data

  • Source: HuggingFaceH4/MATH-500 (50 samples).
  • Format: Each example pairs a problem prompt with a completion in the <reasoning>...</reasoning><answer>...</answer> structure (see the example after this list).
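
The exact serialization is not published; the following is a minimal sketch of what one training example might look like in this format (the problem text and field names are illustrative):

example = {
    "prompt": "Solve the equation 2x + 3 = 7.",
    "completion": (
        "<reasoning>Subtract 3 from both sides to get 2x = 4, "
        "then divide both sides by 2 to get x = 2. "
        "Check: 2*2 + 3 = 7.</reasoning>"
        "<answer>x = 2</answer>"
    ),
}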

Fine-Tuning Process

  • Method: Incremental SFT, with the 50 samples split into batches of 10 and each batch trained for 1 epoch at a learning rate of 1e-5 (a sketch of this procedure follows this list).
  • Setup: Google Colab Pro with an A100 GPU.
  • Uploaded: 07:23 AM on Friday, July 4, 2025.
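
The training script itself is not published; the following is a minimal sketch of the incremental procedure under the stated hyperparameters, using the Hugging Face Trainer. The prompt serialization and the choice of data collator are assumptions for illustration.

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "Qwen/Qwen3-1.7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# MATH-500 ships a single "test" split with "problem", "solution",
# and "answer" fields; take the first 50 rows.
dataset = load_dataset("HuggingFaceH4/MATH-500", split="test").select(range(50))

def tokenize(example):
    # Assumed serialization: flatten each row into one training string
    # matching the <reasoning>...</reasoning><answer>...</answer> format.
    text = (f"USER: {example['problem']}\n"
            f"<reasoning>{example['solution']}</reasoning>"
            f"<answer>{example['answer']}</answer>")
    return tokenizer(text, truncation=True, max_length=1024)

train_ds = dataset.map(tokenize, remove_columns=dataset.column_names)

# Causal-LM collator pads each batch and derives labels from input_ids.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

# Incremental SFT: each batch of 10 samples is trained for 1 epoch.
for start in range(0, len(train_ds), 10):
    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir="sft-out",
            learning_rate=1e-5,
            num_train_epochs=1,
            per_device_train_batch_size=10,
            report_to="none",
        ),
        train_dataset=train_ds.select(range(start, start + 10)),
        data_collator=collator,
    )
    trainer.train()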

Usage

To use this model for mathematical reasoning:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("ziadrone/oneplusaries55")
tokenizer = AutoTokenizer.from_pretrained("ziadrone/oneplusaries55")

SYSTEM_PROMPT = """You are a large language model trained to solve mathematical, logical, physics, and general reasoning problems. You must follow the following steps to solve the problem:
1. Carefully analyze the question and identify the key information.
2. Develop a clear and concise plan to approach the problem.
3. Execute your plan step-by-step, providing detailed explanations and intermediate calculations.
4. Verify your solution to ensure it is accurate and makes sense in the context of the problem.
5. Present your final answer in a clear and concise format.
6. Always enclose the reasoning process within <reasoning>...</reasoning> tags.
7. Always enclose the final answer within <answer>...</answer> tags.
8. Do not use any other tags besides <reasoning> and <answer>.
9. Do not include any extra information outside of the reasoning or answer tags."""

prompt = f"SYSTEM: {SYSTEM_PROMPT}\nUSER: Solve the equation 2x + 3 = 7."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
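
Because the model is trained to wrap its result in <answer> tags, the final answer can be pulled out of the generation with a small regex. The snippet below also shows an alternative prompt built with tokenizer.apply_chat_template, assuming the base Qwen3 chat template is preserved by the fine-tune; the extract_answer helper is illustrative, not part of the model's API.

import re

def extract_answer(generated: str) -> str:
    # Illustrative helper: return the contents of the first <answer> tag,
    # falling back to the raw text if the model did not emit one.
    match = re.search(r"<answer>(.*?)</answer>", generated, re.DOTALL)
    return match.group(1).strip() if match else generated.strip()

# Alternative prompt construction via the tokenizer's chat template
# (assumes the base Qwen3 template survived fine-tuning).
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Solve the equation 2x + 3 = 7."},
]
chat_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(chat_prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, then extract the answer.
completion = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(extract_answer(completion))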

Performance

No formal benchmark results are reported. Because the model was fine-tuned on MATH-500-style problems, it is expected to do best on tasks that, like its training data, reward step-by-step logical reasoning; performance elsewhere is untested.

Limitations

  • The model was trained on a limited dataset (50 samples).
  • Performance may vary on problems significantly different from the training data.
  • Always verify mathematical results for critical applications.

License

This model is released under the Apache 2.0 license.
