ToT-Reasoner-Qwen3-1.7B

Model Description

This model is a fine-tuned version of Qwen/Qwen3-1.7B using Supervised Fine-Tuning (SFT) on the HuggingFaceH4/MATH-500 dataset. It is optimized for mathematical reasoning and problem-solving tasks. The fine-tuning process was performed by EKAGRATA TECH PRIVATE LIMITED.

Training Data

  • Source: HuggingFaceH4/MATH-500 (50 samples).
  • Format: Each example pairs a problem prompt with a completion in the <reasoning>...</reasoning><answer>...</answer> structure (see the example after this list).
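
The exact serialization is not published; the following is a minimal sketch of what one training example might look like in this format (the problem text and field names are illustrative):

example = {
    "prompt": "Solve the equation 2x + 3 = 7.",
    "completion": (
        "<reasoning>Subtract 3 from both sides to get 2x = 4, "
        "then divide both sides by 2 to get x = 2. "
        "Check: 2*2 + 3 = 7.</reasoning>"
        "<answer>x = 2</answer>"
    ),
}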

Fine-Tuning Process

  • Method: Incremental SFT, with the 50 samples split into batches of 10 and each batch trained for 1 epoch at a learning rate of 1e-5 (a sketch of this procedure follows this list).
  • Setup: Google Colab Pro with an A100 GPU.
  • Uploaded: 07:23 AM on Friday, July 4, 2025.
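
The training script itself is not published; the following is a minimal sketch of the incremental procedure under the stated hyperparameters, using the Hugging Face Trainer. The prompt serialization and the choice of data collator are assumptions for illustration.

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "Qwen/Qwen3-1.7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# MATH-500 ships a single "test" split with "problem", "solution",
# and "answer" fields; take the first 50 rows.
dataset = load_dataset("HuggingFaceH4/MATH-500", split="test").select(range(50))

def tokenize(example):
    # Assumed serialization: flatten each row into one training string
    # matching the <reasoning>...</reasoning><answer>...</answer> format.
    text = (f"USER: {example['problem']}\n"
            f"<reasoning>{example['solution']}</reasoning>"
            f"<answer>{example['answer']}</answer>")
    return tokenizer(text, truncation=True, max_length=1024)

train_ds = dataset.map(tokenize, remove_columns=dataset.column_names)

# Causal-LM collator pads each batch and derives labels from input_ids.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

# Incremental SFT: each batch of 10 samples is trained for 1 epoch.
for start in range(0, len(train_ds), 10):
    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir="sft-out",
            learning_rate=1e-5,
            num_train_epochs=1,
            per_device_train_batch_size=10,
            report_to="none",
        ),
        train_dataset=train_ds.select(range(start, start + 10)),
        data_collator=collator,
    )
    trainer.train()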

Usage

To use this model for mathematical reasoning:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("ziadrone/oneplusaries55")
tokenizer = AutoTokenizer.from_pretrained("ziadrone/oneplusaries55")

SYSTEM_PROMPT = """You are a large language model trained to solve mathematical, logical, physics, and general reasoning problems. You must follow the following steps to solve the problem:
1. Carefully analyze the question and identify the key information.
2. Develop a clear and concise plan to approach the problem.
3. Execute your plan step-by-step, providing detailed explanations and intermediate calculations.
4. Verify your solution to ensure it is accurate and makes sense in the context of the problem.
5. Present your final answer in a clear and concise format.
6. Always enclose the reasoning process within <reasoning>...</reasoning> tags.
7. Always enclose the final answer within <answer>...</answer> tags.
8. Do not use any other tags besides <reasoning> and <answer>.
9. Do not include any extra information outside of the reasoning or answer tags."""

prompt = f"SYSTEM: {SYSTEM_PROMPT}\nUSER: Solve the equation 2x + 3 = 7."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
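
Because the model is trained to wrap its result in <answer> tags, the final answer can be pulled out of the generation with a small regex. The snippet below also shows an alternative prompt built with tokenizer.apply_chat_template, assuming the base Qwen3 chat template is preserved by the fine-tune; the extract_answer helper is illustrative, not part of the model's API.

import re

def extract_answer(generated: str) -> str:
    # Illustrative helper: return the contents of the first <answer> tag,
    # falling back to the raw text if the model did not emit one.
    match = re.search(r"<answer>(.*?)</answer>", generated, re.DOTALL)
    return match.group(1).strip() if match else generated.strip()

# Alternative prompt construction via the tokenizer's chat template
# (assumes the base Qwen3 template survived fine-tuning).
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Solve the equation 2x + 3 = 7."},
]
chat_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(chat_prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, then extract the answer.
completion = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(extract_answer(completion))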

Performance

No formal benchmark results are reported. Because the model was fine-tuned on MATH-500-style problems, it is expected to do best on tasks that, like its training data, reward step-by-step logical reasoning; performance elsewhere is untested.

Limitations

  • The model was trained on a limited dataset (50 samples).
  • Performance may vary on problems significantly different from the training data.
  • Always verify mathematical results for critical applications.

License

This model is released under the Apache 2.0 license.
