OpenVLA: An Open-Source Vision-Language-Action Model Paper • 2406.09246 • Published Jun 13, 2024 • 42
Article Arc Virtual Cell Challenge: A Primer By FL33TW00D-HF and 1 other • Jul 18 • 54
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders By thomwolf and 1 other • Jul 9 • 654
Article Kimi-VL-A3B-Thinking-2506: A Quick Navigation By moonshotai and 1 other • Jun 21 • 67
Article SmolVLM2: Bringing Video Understanding to Every Device By orrzohar and 6 others • Feb 20 • 295
LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences Paper • 2412.01292 • Published Dec 2, 2024 • 13
Article Releasing the largest multilingual open pretraining dataset By Pclanglais and 2 others • Nov 13, 2024 • 102
SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation Paper • 2409.06633 • Published Sep 10, 2024 • 15
Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold Paper • 2408.14608 • Published Aug 26, 2024 • 8
SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners Paper • 2408.16768 • Published Aug 29, 2024 • 29
ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model Paper • 2408.16767 • Published Aug 29, 2024 • 33
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Paper • 2408.16532 • Published Aug 29, 2024 • 52
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers Paper • 2401.11605 • Published Jan 21, 2024 • 23
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA By ybelkada and 4 others • May 24, 2023 • 162
Article Google releases Gemma 2 2B, ShieldGemma and Gemma Scope By Xenova and 3 others • Jul 31, 2024 • 60