T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models • Paper • 2504.04718 • Published 17 days ago • 39
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation • Paper • 2504.02542 • Published 21 days ago • 41
MoCha: Towards Movie-Grade Talking Character Synthesis • Paper • 2503.23307 • Published 25 days ago • 129
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models • Paper • 2503.18352 • Published Mar 24 • 6
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models • Paper • 2503.18886 • Published about 1 month ago • 21
Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models • Paper • 2503.09669 • Published Mar 12 • 36
FedRand: Enhancing Privacy in Federated Learning with Randomized LoRA Subparameter Updates • Paper • 2503.07216 • Published Mar 10 • 32
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching • Paper • 2503.05179 • Published Mar 7 • 46
One Shot, One Talk: Whole-body Talking Avatar from a Single Image • Paper • 2412.01106 • Published Dec 2, 2024 • 20
Steering Rectified Flow Models in the Vector Field for Controlled Image Generation • Paper • 2412.00100 • Published Nov 27, 2024 • 16
FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait • Paper • 2412.01064 • Published Dec 2, 2024 • 30