Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models Paper • 2503.21380 • Published 27 days ago • 37
An Empirical Study on Eliciting and Improving R1-like Reasoning Models Paper • 2503.04548 • Published Mar 6 • 8
Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint Paper • 2401.06081 • Published Jan 11, 2024 • 1