cpatonn/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit

Text Generation

compressed-tensors


cpatonn committed 20 days ago

Commit 0bbe640 · verified · 1 Parent(s): e72982a

Update README.md

Files changed (1)

README.md +1 -1

README.md CHANGED

@@ -24,7 +24,7 @@ recipe = [
 ### vllm
 Please load the model into vllm and sglang as float16 data type for AWQ support and use `tensor_parallel_size <= 2` i.e.,
 ```
-vllm serve cpatonn/Qwen3-Coder-30B-A3B-Instruct-AWQ --dtype float16 --tensor-parallel-size 2 --pipeline-parallel-size 2
+vllm serve cpatonn/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit --dtype float16 --tensor-parallel-size 2 --pipeline-parallel-size 2
 ```
 # Qwen3-Coder-30B-A3B-Instruct
 <a href="https://chat.qwen.ai/" target="_blank" style="margin: 2px;">
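
The README guidance in this hunk (float16 data type, `tensor_parallel_size <= 2`) can be sketched as a small Python helper that assembles the same `vllm serve` command line. This is only an illustration: `build_serve_command` and `MODEL_ID` are hypothetical names that do not come from the repository, and the limit check simply mirrors the README's wording rather than any constraint enforced by vLLM itself. The flags are the ones shown in the diff.

```python
import shlex

# Renamed repository ID from the "+" line of the diff.
MODEL_ID = "cpatonn/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit"


def build_serve_command(model=MODEL_ID, tensor_parallel_size=2,
                        pipeline_parallel_size=2):
    """Return the argv list for `vllm serve` as shown in the README.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    # The README asks for tensor_parallel_size <= 2; reject anything else.
    if not 1 <= tensor_parallel_size <= 2:
        raise ValueError("README recommends tensor_parallel_size <= 2")
    if pipeline_parallel_size < 1:
        raise ValueError("pipeline_parallel_size must be at least 1")
    return [
        "vllm", "serve", model,
        "--dtype", "float16",  # float16 is requested for AWQ support
        "--tensor-parallel-size", str(tensor_parallel_size),
        "--pipeline-parallel-size", str(pipeline_parallel_size),
    ]


if __name__ == "__main__":
    # Print a shell-ready command; pass the list to subprocess.run to launch.
    print(shlex.join(build_serve_command()))
```

Building the command as a list keeps each flag and value separate, so it can be handed to `subprocess.run` without shell quoting concerns.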