InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published 8 days ago • 239
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves quality similar to half precision while using 3x less memory. • 19 items • Updated 4 days ago • 22
MAI-DS-R1 Collection MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team. • 2 items • Updated 5 days ago • 8