Hugging Face
Models: 1,373 • Sort: Trending • Active filters: multimodal
Qwen/Qwen2.5-VL-7B-Instruct • Image-Text-to-Text • 8B • Updated Apr 6 • 5.39M • 1.08k
NCSOFT/VARCO-VISION-2.0-14B • Image-Text-to-Text • 15B • Updated about 15 hours ago • 9.03k • 25
Qwen/Qwen2.5-Omni-7B • Any-to-Any • 11B • Updated Apr 30 • 118k • 1.72k
ByteDance-Seed/UI-TARS-1.5-7B • Image-Text-to-Text • 8B • Updated Apr 18 • 86k • 327
stepfun-ai/Step1X-Edit • Image-to-Image • Updated 16 days ago • 251 • 308
Qwen/Qwen2.5-VL-3B-Instruct • Image-Text-to-Text • 4B • Updated Apr 6 • 3.9M • 467
Kwai-Keye/Keye-VL-8B-Preview • Video-Text-to-Text • 9B • Updated 19 days ago • 41.3k • 70
lingshu-medical-mllm/Lingshu-7B • Image-Text-to-Text • 8B • Updated about 1 month ago • 6.2k • 49
Qwen/Qwen2.5-VL-72B-Instruct • Image-Text-to-Text • 73B • Updated Jun 6 • 474k • 516
Qwen/Qwen2.5-VL-32B-Instruct • Image-Text-to-Text • 33B • Updated Apr 14 • 484k • 411
Qwen/Qwen2.5-Omni-3B • Any-to-Any • 6B • Updated Apr 30 • 179k • 257
ByteDance/Dolphin • Image-Text-to-Text • 0.4B • Updated 9 days ago • 13.6k • 436
DocReRank/DocReRank-Reranker • Visual Document Retrieval • Updated 3 days ago • 4
HuggingFaceM4/Idefics3-8B-Llama3 • Image-Text-to-Text • 8B • Updated Dec 2, 2024 • 48.3k • 290
jinaai/jina-clip-v2 • Feature Extraction • 0.9B • Updated Apr 28 • 28.1k • 264
openvla/openvla-7b • Image-Text-to-Text • 8B • Updated Sep 16, 2024 • 436k • 131
robotics-diffusion-transformer/rdt-1b • Robotics • Updated Oct 17, 2024 • 754 • 89
rhymes-ai/Aria • Image-Text-to-Text • 25B • Updated Apr 23 • 22.1k • 633
ByteDance-Seed/UI-TARS-7B-SFT • Image-Text-to-Text • 8B • Updated Jan 25 • 14.3k • 176
Qwen/Qwen2.5-VL-3B-Instruct-AWQ • Image-Text-to-Text • 1B • Updated Apr 6 • 25.8k • 46
unsloth/Qwen2.5-VL-7B-Instruct-GGUF • Image-Text-to-Text • 8B • Updated May 12 • 6.82k • 14
unsloth/Qwen2.5-Omni-7B-GGUF • Any-to-Any • 8B • Updated May 28 • 11.7k • 19
NCSOFT/VARCO-VISION-2.0-1.7B • Image-Text-to-Text • Updated 9 days ago • 8
imageomics/bioclip • Zero-Shot Image Classification • Updated May 17, 2024 • 163k • 51
Goekdeniz-Guelmez/J.O.S.I.E.v4o • Any-to-Any • Updated Oct 29, 2024 • 26
qnguyen3/nanoLLaVA-1.5 • Image-Text-to-Text • 1B • Updated Sep 21, 2024 • 114 • 111
lmms-lab/llava-onevision-qwen2-0.5b-si • Text Generation • 0.9B • Updated Sep 2, 2024 • 3.15k • 14
lmms-lab/llava-onevision-qwen2-0.5b-ov • Text Generation • 0.9B • Updated Sep 2, 2024 • 39.5k • 20
lmms-lab/LLaVA-Video-7B-Qwen2 • Video-Text-to-Text • 8B • Updated Oct 25, 2024 • 34.5k • 105
Qwen/Qwen2-VL-2B • Image-Text-to-Text • 2B • Updated Dec 6, 2024 • 125k • 49