The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13 • 195
Diffusion Priors for Dynamic View Synthesis from Monocular Videos Paper • 2401.05583 • Published Jan 10, 2024 • 11
Object-Centric Diffusion for Efficient Video Editing Paper • 2401.05735 • Published Jan 11, 2024 • 11
TRIPS: Trilinear Point Splatting for Real-Time Radiance Field Rendering Paper • 2401.06003 • Published Jan 11, 2024 • 26
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models Paper • 2401.06066 • Published Jan 11, 2024 • 55
PALP: Prompt Aligned Personalization of Text-to-Image Models Paper • 2401.06105 • Published Jan 11, 2024 • 50
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding? Paper • 2501.05510 • Published Jan 9 • 44
VideoRAG: Retrieval-Augmented Generation over Video Corpus Paper • 2501.05874 • Published Jan 10 • 72
Diffusion Adversarial Post-Training for One-Step Video Generation Paper • 2501.08316 • Published Jan 14 • 34
PokerBench: Training Large Language Models to become Professional Poker Players Paper • 2501.08328 • Published Jan 14 • 17
Potential and Perils of Large Language Models as Judges of Unstructured Textual Data Paper • 2501.08167 • Published Jan 14 • 6
Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models Paper • 2501.06751 • Published Jan 12 • 33
MangaNinja: Line Art Colorization with Precise Reference Following Paper • 2501.08332 • Published Jan 14 • 60
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 286