VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model Paper • 2504.07615 • Published 14 days ago • 30
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published 13 days ago • 121
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation Paper • 2504.02160 • Published 21 days ago • 35
One-Minute Video Generation with Test-Time Training Paper • 2504.05298 • Published 17 days ago • 98
SkyReels-A2: Compose Anything in Video Diffusion Transformers Paper • 2504.02436 • Published 21 days ago • 35
VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step Paper • 2504.01956 • Published 22 days ago • 40
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation Paper • 2503.20672 • Published 29 days ago • 14
Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models Paper • 2503.18446 • Published about 1 month ago • 10
Unleashing Vecset Diffusion Model for Fast Shape Generation Paper • 2503.16302 • Published Mar 20 • 44
Concat-ID: Towards Universal Identity-Preserving Video Synthesis Paper • 2503.14151 • Published Mar 18 • 10
Personalize Anything for Free with Diffusion Transformer Paper • 2503.12590 • Published Mar 16 • 44
Learning Few-Step Diffusion Models by Trajectory Distribution Matching Paper • 2503.06674 • Published Mar 9 • 7
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published Mar 14 • 137
Pinco: Position-induced Consistent Adapter for Diffusion Transformer in Foreground-conditioned Inpainting Paper • 2412.03812 • Published Dec 5, 2024 • 1