VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning Paper • 2504.08837 • Published 13 days ago • 42
Grounded Image Text Matching with Mismatched Relation Reasoning Paper • 2308.01236 • Published Aug 2, 2023
AURORA: Automated Training Framework of Universal Process Reward Models via Ensemble Prompting and Reverse Verification Paper • 2502.11520 • Published Feb 17
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values Paper • 2504.05535 • Published 16 days ago • 44