SmolVLM: Redefining small and efficient multimodal models • Paper • 2504.05299 • Published 15 days ago • 168
Vision Language Models Quantization • Collection • Vision Language Models (VLMs) quantized by Neural Magic • 20 items • Updated Mar 4 • 6
MambaVision • Collection • MambaVision: A Hybrid Mamba-Transformer Vision Backbone, including both 1K and 21K pretrained models • 13 items • Updated 8 days ago • 31
MoshiVis v0.1 • Collection • MoshiVis is a Vision Speech Model built as a perceptually augmented version of Moshi v0.1 for conversing about image inputs • 8 items • Updated Mar 21 • 22
Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • Article • Mar 12 • 398