jusjinuk/Llama-3.1-70B-Instruct-3bit-GuidedQuant-QTIP-skip_0_v

Model Card

Base model: meta-llama/Llama-3.1-70B-Instruct
Quantization method: BlockLDLQ with GuidedQuant Hessian
Target bit-width: 3
Backend kernel: QTIP kernel (HYB variant)
Calibration data: RedPajama (1024 sentences / 4096 tokens)
Calibration objective: Next-token prediction
num_groups (for GuidedQuant Hessian): 1
skip_list: 0_v (the 0_v layer is left unquantized, following the YAQA paper)
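
To put the 3-bit target in context, here is a rough estimate of weight storage at different bit-widths. It is only a back-of-the-envelope sketch: the parameter count of 70e9 is the nominal figure implied by the model name, every weight is assumed to be stored at the target bit-width, and the unquantized parts (for example the skipped 0_v layer and embeddings) as well as any codebook or metadata overhead are ignored. The function name is hypothetical.

```python
def weight_storage_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given bit-width."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumption: nominal parameter count taken from the "70B" in the model name.
NUM_PARAMS = 70e9

fp16_gib = weight_storage_gib(NUM_PARAMS, 16)  # 16-bit baseline
q3_gib = weight_storage_gib(NUM_PARAMS, 3)     # 3-bit target from this card

print(f"16-bit weights: ~{fp16_gib:.1f} GiB")
print(f" 3-bit weights: ~{q3_gib:.1f} GiB")
print(f"compression ratio: ~{fp16_gib / q3_gib:.2f}x")
```

The actual checkpoint size will differ somewhat, since not every tensor is quantized.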

How to run

Follow the instructions in the GuidedQuant repository (https://github.com/snu-mllab/GuidedQuant) and the QTIP repository (https://github.com/Cornell-RelaxML/qtip).
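
If you want to fetch the checkpoint files ahead of time, the following minimal sketch uses `snapshot_download` from the `huggingface_hub` package. It only downloads the repository contents; loading the quantized weights and running inference still require the code and kernels from the two repositories above, whose scripts may handle the download themselves. The helper name `download_checkpoint` and the default `local_dir` are illustrative assumptions, not part of either project.

```python
REPO_ID = "jusjinuk/Llama-3.1-70B-Instruct-3bit-GuidedQuant-QTIP-skip_0_v"


def download_checkpoint(local_dir: str = "./guidedquant-3bit-70b") -> str:
    """Download all files of the model repository and return the local path.

    `local_dir` is a hypothetical default; pick any directory with enough space.
    """
    # Imported lazily so the module can be imported without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Usage (requires network access and `pip install huggingface_hub`):
#   path = download_checkpoint()
#   print(path)
```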

References

Model Paper
