
	---
	library_name: transformers
	license: other
	base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
	tags:
	- llama-factory
	- full
	- generated_from_trainer
	model-index:
	- name: ReasonFlux-F1-14B
	  results: []
	---

	# ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
	Our template-augmented reasoning paradigm empowers a 32B model to outperform o1-mini and DeepSeek-R1 distilled models on reasoning tasks.

	| Task/Pass@1 | ReasonFlux-F1-32B | ReasonFlux-Zero-32B | R1-Distill-32B | o1-mini | LIMO-32B | s1-32B |
	| :----------- | :---------------: | :-----------------: | :------------: | :-----: | :------: | :----: |
	| MATH500 | 96.0 | 91.2 | 94.3 | 90.0 | 90.6 | 93.0 |
	| AIME 2024 | 76.7 | 56.7 | 72.6 | 56.7 | 50.0 | 56.7 |
	| AIME 2025 | 53.3 | 37.2 | 46.67 | 50.8 | 37.2 | 49.3 |
	| GPQA-Diamond | 67.2 | 61.2 | 62.1 | 60.0 | 65.2 | 59.6 |

	# ReasonFlux-F1-14B

	> ReasonFlux-F1-14B is our SOTA-level reasoning LLM, fine-tuned on the template-augmented reasoning trajectories from our [ReasonFlux-Zero](https://arxiv.org/abs/2502.06772).

	* GitHub Repository: [Gen-Verse/ReasonFlux](https://github.com/Gen-Verse/ReasonFlux)
	* Paper: [ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates](https://arxiv.org/abs/2502.06772)
	* Dataset: [Gen-Verse/ReasonFlux-F1-SFT](https://huggingface.co/datasets/Gen-Verse/ReasonFlux-F1-SFT)


	## Evaluation
	We present the evaluation results of our ReasonFlux-F1-32B on challenging reasoning tasks: AIME2024, AIME2025, MATH500, and GPQA-Diamond. For a fair comparison, all results below were obtained by running each LLM through our evaluation scripts in [ReasonFlux-F1](https://github.com/Gen-Verse/ReasonFlux).

	| Model | AIME2024@pass1 | AIME2025@pass1 | MATH500@pass1 | GPQA@pass1 |
	| :------------------ | :------------: | :------------: | :-----------: | :--------: |
	| QwQ-32B-Preview | 46.7 | 37.2 | 90.6 | 65.2 |
	| LIMO-32B | 56.3 | 44.5 | 94.8 | 58.1 |
	| s1-32B | 56.7 | 49.3 | 93.0 | 59.6 |
	| OpenThinker-32B | 66.0 | 53.3 | 94.8 | 60.1 |
	| R1-Distill-32B | 70.0 | 46.7 | 92.0 | 59.6 |
	| ReasonFlux-Zero-32B | 56.7 | 37.2 | 91.2 | 61.2 |
	| ReasonFlux-F1-32B | 76.7 | 53.3 | 96.0 | 67.2 |
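The card does not spell out how the pass@1 columns are computed. A common convention is to grade one or more sampled answers per problem and average the per-problem accuracy; the sketch below assumes that convention. The function name `pass_at_1` and the graded data are hypothetical, and this is not the repository's evaluation script.

```python
from typing import Sequence


def pass_at_1(correct: Sequence[Sequence[bool]]) -> float:
    """Mean per-problem accuracy over sampled answers, as a percentage.

    correct[i][j] is True if sample j for problem i was graded correct.
    """
    if not correct:
        raise ValueError("need at least one problem")
    per_problem = [sum(samples) / len(samples) for samples in correct]
    return 100.0 * sum(per_problem) / len(per_problem)


# Made-up grading results: 3 problems, 4 samples each.
graded = [
    [True, True, True, True],      # always solved
    [True, False, True, False],    # solved half the time
    [False, False, False, False],  # never solved
]
score = pass_at_1(graded)
print(f"pass@1 = {score:.1f}")
```

With a single sample per problem this reduces to plain accuracy, so a benchmark with few problems moves in coarse steps.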


	## Quick start with vLLM
	```python
	from vllm import LLM, SamplingParams
	from transformers import AutoTokenizer

	model_id = 'Gen-Verse/ReasonFlux-F1-14B'

	# Shard the model across 8 GPUs; adjust tensor_parallel_size to your hardware.
	model = LLM(
	    model_id,
	    tensor_parallel_size=8,
	)
	# The tokenizer is loaded for convenience; the prompt below is built by hand.
	tokenizer = AutoTokenizer.from_pretrained(model_id)

	# Leave room for long reasoning traces.
	sampling_params = SamplingParams(
	    max_tokens=32768,
	)
	# 2022 AIME I Problems/Problem 15
	# Raw string, so LaTeX backslashes (\begin, \frac, ...) are not read as escape sequences.
	question = r"""Let \(x, y\), and \(z\) be positive real numbers satisfying the system of equations:
	\[
	\begin{array}{c}
	\sqrt{2 x-x y}+\sqrt{2 y-x y}=1 \\
	\sqrt{2 y-y z}+\sqrt{2 z-y z}=\sqrt{2} \\
	\sqrt{2 z-z x}+\sqrt{2 x-z x}=\sqrt{3} .
	\end{array}
	\]
	Then \(\left[(1-x)(1-y)(1-z)\right]^{2}\) can be written as \(\frac{m}{n}\), where \(m\) and \(n\) are relatively prime positive integers. Find \(m+n\)."""
	# Single-turn prompt in the DeepSeek-R1 distill format.
	ds_prompt = "<｜User｜>\n" + question + "<｜Assistant｜>\n"
	output = model.generate(ds_prompt, sampling_params=sampling_params)
	print(output[0].outputs[0].text)

	```
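For more than one question, it can help to wrap the prompt format in a small helper and to pull the final answer out of the generated text. The sketch below reuses the exact `ds_prompt` layout from the quick start. The helper names `build_prompt` and `extract_boxed` are hypothetical, and it is an assumption (not stated in this card) that the model writes its final answer inside `\boxed{...}`.

```python
from typing import Optional

USER_TAG = "<｜User｜>"
ASSISTANT_TAG = "<｜Assistant｜>"


def build_prompt(question: str) -> str:
    """Wrap a question in the same single-turn format as ds_prompt."""
    return f"{USER_TAG}\n{question}{ASSISTANT_TAG}\n"


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None.

    Assumes the final answer is boxed; handles nested braces.
    """
    marker = r"\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    out = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(out)
        out.append(ch)
    return None  # unbalanced braces


# Made-up completion text, only to exercise the parser.
sample = r"Adding the two parts gives \boxed{\frac{1}{2}}."
print(extract_boxed(sample))
```

The result of `build_prompt(question)` can be passed to `model.generate` in place of `ds_prompt`; `extract_boxed` would then be applied to `output[0].outputs[0].text`.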
	## Citation

	```bibtex
	@article{yang2025reasonflux,
	  title={ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates},
	  author={Yang, Ling and Yu, Zhaochen and Cui, Bin and Wang, Mengdi},
	  journal={arXiv preprint arXiv:2502.06772},
	  year={2025}
	}
	```