Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published 4 days ago • 84
D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation Paper • 2504.09454 • Published 10 days ago • 11
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation Paper • 2504.08736 • Published 11 days ago • 47
Hugging Face and Cloudflare Partner to Make Real-Time Speech and Video Seamless with FastRTC Article • 14 days ago • 21
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization Paper • 2504.00999 • Published 21 days ago • 83
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation Paper • 2503.16430 • Published Mar 20 • 35
Unified Reward Model for Multimodal Understanding and Generation Paper • 2503.05236 • Published Mar 7 • 120
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Paper • 2502.18411 • Published Feb 25 • 73
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published Feb 20 • 191
Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound Paper • 2502.05139 • Published Feb 7 • 1
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation Paper • 2502.13128 • Published Feb 18 • 42
Learning Getting-Up Policies for Real-World Humanoid Robots Paper • 2502.12152 • Published Feb 17 • 42