Pixels, Patterns, but No Poetry: To See The World like Humans Paper • 2507.16863 • Published Jul 21 • 68 • 6
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning Paper • 2505.11049 • Published May 16 • 61 • 5
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation Paper • 2503.19622 • Published Mar 25 • 31 • 4
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation Paper • 2503.19622 • Published Mar 25 • 31 • 4