JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization Paper • 2503.23377 • Published 24 days ago • 52
OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting Paper • 2503.08677 • Published Mar 11 • 29
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Mar 12 • 398
Article LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone! Mar 7 • 53
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data Paper • 2502.14397 • Published Feb 20 • 42
Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita Feb 18 • 98
Phantom: Subject-consistent video generation via cross-modal alignment Paper • 2502.11079 • Published Feb 16 • 59
Hibiki fr-en Collection Hibiki is a model for streaming speech translation, which can run on device! See https://github.com/kyutai-labs/hibiki. • 5 items • Updated Feb 6 • 52
AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models Paper • 2412.04146 • Published Dec 5, 2024 • 23