Submitted by GenuineWWD · 56 · VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models · 13 authors · 1
Submitted by Alon77777 · 37 · DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning · 8 authors · 7
Submitted by StarThomas1002 · 17 · PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models · 52 authors · 1
Submitted by Swtheking · 16 · Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model · 5 authors · 1
Submitted by Kaichengalex · 14 · Decoupled Global-Local Alignment for Improving Compositional Understanding · 6 authors · 1
Submitted by igitman · 9 · AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset · 8 authors · 1
Submitted by Ningyu · 9 · A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment · 82 authors · 1
Submitted by USTCYu · 9 · Rethinking the Generation of High-Quality CoT Data from the Perspective of LLM-Adaptive Question Difficulty Grading · 10 authors · 2
Submitted by mturski · 4 · Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA · 3 authors · 1
Submitted by anirudhkhatry · 3 · CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation · 7 authors · 1
Submitted by jcwang0602 · 1 · Progressive Language-guided Visual Learning for Multi-Task Visual Grounding · 6 authors · 1