GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding Paper • 2503.10596 • Published Mar 13 • 18
Knowledge Mining with Scene Text for Fine-Grained Recognition Paper • 2203.14215 • Published Mar 27, 2022
GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding Paper • 2412.13193 • Published Dec 17, 2024
Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation Paper • 2502.13145 • Published Feb 18 • 38
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models Paper • 2501.01423 • Published Jan 2 • 42