UserBench: An Interactive Gym Environment for User-Centric Agents • Paper • 2507.22034 • Published 24 days ago • 29
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers • Paper • 2506.23918 • Published Jun 30 • 86
Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance • Paper • 2506.06444 • Published Jun 6 • 74
MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning • Paper • 2505.24846 • Published May 30 • 15
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time • Paper • 2505.24863 • Published May 30 • 97
s3: You Don't Need That Much Data to Train a Search Agent via RL • Paper • 2505.14146 • Published May 20 • 18
Transformer Copilot: Learning from The Mistake Log in LLM Fine-tuning • Paper • 2505.16270 • Published May 22 • 6
Time-R1: Towards Comprehensive Temporal Reasoning in LLMs • Paper • 2505.13508 • Published May 16 • 14