FP4 Collection · FP4 compressed models · 7 items
This model is an FP4-compressed (w4a16_fp4) version of DeepSeek-R1-0528-Qwen3-8B, DeepSeek's R1 distillation of Qwen3 8B.
vLLM starter:

```shell
python3 -m vllm.entrypoints.openai.api_server \
  --host 0.0.0.0 --port 8000 \
  --model miike-ai/DeepSeek-R1-0528-Qwen3-5B-FP4 \
  --enable-auto-tool-choice --tool-call-parser hermes \
  --enable-reasoning --reasoning-parser deepseek_r1
```
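Once launched, the server exposes an OpenAI-compatible API at `http://localhost:8000/v1`. A minimal sketch of a chat-completions request, assuming the server is running locally with the command above (the actual POST is commented out so the snippet stands alone):

```python
import json
import urllib.request

# OpenAI-compatible chat-completions payload; "model" must match the
# --model flag passed to the vLLM server.
payload = {
    "model": "miike-ai/DeepSeek-R1-0528-Qwen3-5B-FP4",
    "messages": [{"role": "user", "content": "What is 7 * 8?"}],
    "max_tokens": 512,
    "temperature": 0.6,
}

def build_request(host: str = "http://localhost:8000") -> urllib.request.Request:
    # Standard OpenAI-style endpoint served by vllm.entrypoints.openai.api_server.
    return urllib.request.Request(
        f"{host}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_request()
# To actually send the request (requires the server to be up):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

With `--reasoning-parser deepseek_r1` enabled, the response's message object separates the chain-of-thought from the final answer, so clients can render or discard the reasoning independently.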
Base model: `deepseek-ai/DeepSeek-R1-0528-Qwen3-8B`