InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework Paper • 2504.12395 • Published 6 days ago • 15
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation Paper • 2504.12626 • Published 6 days ago • 45
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL Paper • 2504.11455 • Published 7 days ago • 12
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft Paper • 2504.08388 • Published 12 days ago • 39
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published 11 days ago • 120
FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation Paper • 2504.07405 • Published 13 days ago • 11
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought Paper • 2504.05599 • Published 15 days ago • 80
One-Minute Video Generation with Test-Time Training Paper • 2504.05298 • Published 15 days ago • 96
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation Paper • 2504.02160 • Published 20 days ago • 35
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving Paper • 2404.16771 • Published Apr 25, 2024 • 20
IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models Paper • 2403.13535 • Published Mar 20, 2024 • 24
InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Paper • 2404.19427 • Published Apr 30, 2024 • 75
DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability Paper • 2503.06505 • Published Mar 9 • 1
ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement Paper • 2504.01934 • Published 20 days ago • 23
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning Paper • 2504.02949 • Published 19 days ago • 20
Inference-Time Scaling for Generalist Reward Modeling Paper • 2504.02495 • Published 20 days ago • 53
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation Paper • 2504.02782 • Published 19 days ago • 55
SkyReels-A2: Compose Anything in Video Diffusion Transformers Paper • 2504.02436 • Published 20 days ago • 35