MR. Video: "MapReduce" is the Principle for Long Video Understanding • Paper • 2504.16082 • Published 1 day ago • 3
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation • Paper • 2504.14396 • Published 4 days ago • 27
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation • Paper • 2504.14899 • Published 3 days ago • 14
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation • Paper • 2504.12626 • Published 7 days ago • 45
WORLDMEM: Long-term Consistent World Simulation with Memory • Paper • 2504.12369 • Published 7 days ago • 30
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models • Paper • 2504.10479 • Published 9 days ago • 239
REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers • Paper • 2504.10483 • Published 9 days ago • 20
Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting • Paper • 2504.11092 • Published 9 days ago • 10
Efficient Generative Model Training via Embedded Representation Warmup • Paper • 2504.10188 • Published 10 days ago • 12
GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography • Paper • 2504.07083 • Published 14 days ago • 23
OmniSVG: A Unified Scalable Vector Graphics Generation Model • Paper • 2504.06263 • Published 15 days ago • 149
SkyReels-A2: Compose Anything in Video Diffusion Transformers • Paper • 2504.02436 • Published 21 days ago • 35
VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step • Paper • 2504.01956 • Published 21 days ago • 40
MoCha: Towards Movie-Grade Talking Character Synthesis • Paper • 2503.23307 • Published 25 days ago • 128
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors • Paper • 2504.01016 • Published 22 days ago • 29