NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation Paper • 2504.13055 • Published 6 days ago
Efficient Process Reward Model Training via Active Learning Paper • 2504.10559 • Published 9 days ago