VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning Paper • 2504.08837 • Published 10 days ago • 40 • 2
ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations Paper • 2504.00824 • Published 19 days ago • 39 • 2
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search Paper • 2503.10582 • Published Mar 13 • 22 • 2
ABC: Achieving Better Control of Multimodal Embeddings using VLMs Paper • 2503.00329 • Published Mar 1 • 18 • 4