IAG: Input-aware Backdoor Attack on VLMs for Visual Grounding • Paper • 2508.09456 • Published 9 days ago • 7
Mol-R1: Towards Explicit Long-CoT Reasoning in Molecule Discovery • Paper • 2508.08401 • Published 10 days ago • 37
Phi-Ground Tech Report: Advancing Perception in GUI Grounding • Paper • 2507.23779 • Published 21 days ago • 41
TOMG-Bench: Evaluating LLMs on Text-based Open Molecule Generation • Paper • 2412.14642 • Published Dec 19, 2024 • 4
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning • Paper • 2411.18203 • Published Nov 27, 2024 • 41
Seeing and Understanding: Bridging Vision with Chemical Knowledge Via ChemVLM • Paper • 2408.07246 • Published Aug 14, 2024 • 22
Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes • Paper • 2407.10957 • Published Jul 15, 2024 • 25