Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging Paper • 2504.08635 • Published 11 days ago • 5
ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration Paper • 2504.08591 • Published 11 days ago • 18
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation Paper • 2504.08736 • Published 11 days ago • 47
Have we unified image generation and understanding yet? An empirical study of GPT-4o's image generation ability Paper • 2504.08003 • Published 13 days ago • 47
Efficient Generative Model Training via Embedded Representation Warmup Paper • 2504.10188 • Published 8 days ago • 12
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL Paper • 2504.11455 • Published 7 days ago • 12
DataDecide: How to Predict Best Pretraining Data with Small Experiments Paper • 2504.11393 • Published 7 days ago • 15
REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers Paper • 2504.10483 • Published 8 days ago • 20
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding Paper • 2504.13180 • Published 5 days ago • 16
Perception Encoder: The best visual embeddings are not at the output of the network Paper • 2504.13181 • Published 5 days ago • 27
Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images Paper • 2504.09621 • Published 9 days ago • 10