AetherCode: Evaluating LLMs' Ability to Win In Premier Programming Competitions Paper • 2508.16402 • Published 4 days ago • 9
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models Paper • 2508.09834 • Published 13 days ago • 48
Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models Paper • 2508.10751 • Published 12 days ago • 25
GeRe: Towards Efficient Anti-Forgetting in Continual Learning of LLM via General Samples Replay Paper • 2508.04676 • Published 19 days ago • 4
MathReal: We Keep It Real! A Real Scene Benchmark for Evaluating Math Reasoning in Multimodal Large Language Models Paper • 2508.06009 • Published 18 days ago • 15
Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment Paper • 2508.07750 • Published 15 days ago • 19
Can LLM-Generated Textual Explanations Enhance Model Classification Performance? An Empirical Study Paper • 2508.09776 • Published 13 days ago • 3
StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs through Knowledge-Reasoning Fusion Paper • 2508.04440 • Published 20 days ago • 9
Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning Paper • 2508.03501 • Published 21 days ago • 53
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens Paper • 2508.01191 • Published 24 days ago • 225
Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models Paper • 2508.02120 • Published 22 days ago • 18
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published 18 days ago • 159
Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published 20 days ago • 61
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving Paper • 2507.23726 • Published 25 days ago • 108
Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning Paper • 2507.17512 • Published Jul 23 • 36