This model is an FP4 (w4a16_fp4: 4-bit FP4 weights, 16-bit activations) compressed version of DeepSeek's R1-0528 distill of Qwen3 8B (deepseek-ai/DeepSeek-R1-0528-Qwen3-8B).
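
For context, checkpoints like this are typically produced with llm-compressor's one-shot flow. The sketch below is an assumption about how such a quantization might be made, not the author's actual recipe; the scheme name `NVFP4A16` and the top-level `oneshot` import depend on the llm-compressor version.

```python
# Hypothetical sketch: producing a w4a16_fp4-style checkpoint with llm-compressor.
# NOT the actual recipe used for this model; "NVFP4A16" is an assumed scheme name
# and may differ across llm-compressor / compressed-tensors versions.
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

recipe = QuantizationModifier(
    targets="Linear",     # quantize all Linear layers...
    ignore=["lm_head"],   # ...except the output head, a common exclusion
    scheme="NVFP4A16",    # FP4 weights, 16-bit activations (assumed preset name)
)

oneshot(
    model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",  # the BF16 base model
    recipe=recipe,
    output_dir="DeepSeek-R1-0528-Qwen3-8B-FP4",
)
```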

vLLM starter:

```bash
python3 -m vllm.entrypoints.openai.api_server \
  --host 0.0.0.0 --port 8000 \
  --model miike-ai/DeepSeek-R1-0528-Qwen3-8B-FP4 \
  --enable-auto-tool-choice --tool-call-parser hermes \
  --enable-reasoning --reasoning-parser deepseek_r1
```

Note: on newer vLLM versions `--enable-reasoning` is deprecated and passing `--reasoning-parser deepseek_r1` alone is sufficient.
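
The server exposes an OpenAI-compatible API, so any OpenAI client can talk to it. A minimal sketch, assuming the `openai` Python package and the server running on localhost:8000 as started above:

```python
# Minimal client sketch against the vLLM OpenAI-compatible endpoint above.
from openai import OpenAI

# vLLM does not check the API key by default; any placeholder works.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="miike-ai/DeepSeek-R1-0528-Qwen3-8B-FP4",
    messages=[{"role": "user", "content": "What is 17 * 23?"}],
)

# With --reasoning-parser deepseek_r1, vLLM returns the chain of thought in a
# separate `reasoning_content` field; getattr guards against versions without it.
print(getattr(resp.choices[0].message, "reasoning_content", None))
print(resp.choices[0].message.content)
```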
Checkpoint: Safetensors, 5.15B params. Tensor types: F32, BF16, F8_E4M3, U8 (the FP4 weights are stored packed as U8, likely alongside F8_E4M3 scales).