Type of quantization
#1 opened by systemicAnomaly
Hi,
Is this an AWQ or a GPTQ W4A16 quantization?
Do you need to be able to load the entire model into VRAM to quantize it with llmcompressor? I was looking for an AWQ quant.
I wanted to create an AWQ quant while leaving the gates in full precision, but it looks like you do need to be able to load the full model into at least system RAM, if not VRAM, so I could not do it. This quant, by contrast, I could run sequentially, layer by layer, as sketched below.
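For reference, here is a minimal sketch of the sequential GPTQ-style W4A16 path, assuming the `oneshot` API from the llmcompressor README. The model id and the gate regex are placeholders, not the actual recipe used for this repo:

```python
# Minimal sketch of a sequential W4A16 (GPTQ-style) quant with llmcompressor.
# Assumes the oneshot API from the llmcompressor README; the model id and the
# gate ignore pattern are hypothetical — adjust them to your architecture.
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.transformers import oneshot

MODEL_ID = "some-org/some-moe-model"  # placeholder model id

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# W4A16 on all Linear layers, keeping lm_head and the MoE router gates
# in full precision via regex ignore patterns.
recipe = GPTQModifier(
    targets="Linear",
    scheme="W4A16",
    ignore=["lm_head", "re:.*mlp.gate$"],  # gate pattern is architecture-specific
)

# oneshot calibrates and quantizes layer by layer, onloading one layer at a
# time to the GPU, so the whole model does not need to fit in VRAM at once.
oneshot(
    model=model,
    dataset="open_platypus",
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)

SAVE_DIR = MODEL_ID.split("/")[-1] + "-W4A16"
model.save_pretrained(SAVE_DIR, save_compressed=True)
tokenizer.save_pretrained(SAVE_DIR)
```

This layer-by-layer calibration is why the GPTQ path fits on a single GPU even when the full model would not, whereas an AWQ run that needs the whole model resident in system RAM or VRAM may not.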