jusjinuk/Llama-2-70b-hf-4bit-LNQ


base_model:
  - meta-llama/Llama-2-70b-hf
base_model_relation: quantized
license: llama2

Model Card

Base model: meta-llama/Llama-2-70b-hf
Quantization method: LNQ
Target bit-width: 4
Backend kernel: Any-Precision-LLM kernel (ap-gemv)
Calibration data: RedPajama (1024 sequences of 4096 tokens each)
Calibration objective: Next-token prediction
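
As a rough sanity check on the numbers above, the sketch below estimates weight storage at 16 and 4 bits per weight and the total calibration token budget. It assumes a nominal 70e9 parameters, ignores quantization codebooks and any layers left unquantized, and reads the calibration entry as 1024 sequences of 4096 tokens each. None of these figures are measurements from this checkpoint.

```python
# Back-of-the-envelope estimates; every constant here is an assumption,
# not a value measured from this checkpoint.
N_PARAMS = 70e9  # assumed nominal parameter count for a "70b" model


def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB, ignoring codebooks and unquantized layers."""
    return n_params * bits_per_weight / 8 / 2**30


def calibration_tokens(n_sequences: int, tokens_per_sequence: int) -> int:
    """Total tokens seen during calibration, assuming fixed-length sequences."""
    return n_sequences * tokens_per_sequence


fp16_gib = weight_gib(N_PARAMS, 16)
q4_gib = weight_gib(N_PARAMS, 4)
print(f"fp16 weights: {fp16_gib:.1f} GiB, 4-bit weights: {q4_gib:.1f} GiB")
print(calibration_tokens(1024, 4096))  # 4194304
```

The real on-disk size will differ somewhat, since the estimate leaves out lookup tables, embeddings, and file-format overhead.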

How to run

Follow the instructions at https://github.com/snu-mllab/GuidedQuant.
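
Before following those instructions, the checkpoint files can be fetched locally. The snippet below is a minimal sketch using `snapshot_download` from the `huggingface_hub` package. The helper name `download_checkpoint` and the default `local_dir` are hypothetical, and the snippet only downloads files. Loading and running the model still requires the setup described in the GuidedQuant repository.

```python
REPO_ID = "jusjinuk/Llama-2-70b-hf-4bit-LNQ"


def download_checkpoint(local_dir: str = "./Llama-2-70b-hf-4bit-LNQ") -> str:
    """Download all files of the model repo and return the local path.

    Hypothetical helper: it only fetches files and does not load the model.
    """
    # Imported lazily so that defining the helper does not require the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Usage (needs network access and enough free disk space):
# path = download_checkpoint()
```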

References

Model Paper