jusjinuk/Llama-3.1-70B-Instruct-3bit-GuidedQuant-QTIP-skip_0_v
arxiv: 2505.07004
License: llama3.1
Model Card
Base model: meta-llama/Llama-3.1-70B-Instruct
Quantization method: BlockLDLQ with GuidedQuant Hessian
Target bit-width: 3
Backend kernel: QTIP kernel (HYB variant)
Calibration data: RedPajama (1024 sequences of 4096 tokens each)
Calibration objective: Next-token prediction
num_groups (for GuidedQuant Hessian): 1
skip_list: 0_v (the 0_v layer is left unquantized, following the YAQA paper)
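As a rough illustration of what a 3-bit target means for storage, the sketch below compares weight memory at 16 bits and 3 bits per weight. The parameter count of 70e9 is an assumed nominal figure, not a value from this card, and the estimate ignores anything kept at higher precision (such as the skipped 0_v layer) as well as kernel metadata. It also counts calibration tokens, reading the calibration entry as 1024 sequences of 4096 tokens each, which is an interpretation.

```python
# Back-of-the-envelope numbers only; every constant here is an assumption.
N_PARAMS = 70e9  # assumed nominal parameter count for a "70B" model


def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Storage needed for n_params weights at bits_per_weight, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


fp16_gib = weight_gib(N_PARAMS, 16)  # unquantized 16-bit baseline
q3_gib = weight_gib(N_PARAMS, 3)     # 3-bit target from the model card
print(f"16-bit: {fp16_gib:.1f} GiB, 3-bit: {q3_gib:.1f} GiB")

# Calibration set size, assuming 1024 sequences of 4096 tokens each.
calib_tokens = 1024 * 4096
print(f"calibration tokens: {calib_tokens}")
```

The ratio between the two estimates is 16/3 regardless of the assumed parameter count, so the relative saving holds even if the absolute numbers are off.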
How to run
Follow the instructions in https://github.com/snu-mllab/GuidedQuant and https://github.com/Cornell-RelaxML/qtip
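Before following those instructions, the checkpoint files can be fetched locally. The sketch below uses `snapshot_download` from the `huggingface_hub` package; the helper name `download_checkpoint` is hypothetical, and whether the GuidedQuant or QTIP scripts expect a pre-downloaded directory is an assumption, so check the two repositories first.

```python
from typing import Optional

REPO_ID = "jusjinuk/Llama-3.1-70B-Instruct-3bit-GuidedQuant-QTIP-skip_0_v"


def download_checkpoint(local_dir: Optional[str] = None) -> str:
    """Download every file of the quantized checkpoint; return the local path.

    Hypothetical helper: it only fetches files and does not load the model.
    """
    # Imported lazily so the module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Usage (downloads a large checkpoint, so it is not run automatically):
# path = download_checkpoint("./guidedquant-70b-3bit")
```

Running the model itself still requires the QTIP kernel setup described in the linked repositories.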
References
Model Paper
Collection including this model: Instruction-tuned models (GuidedQuant), 40 items, updated Jun 19