GGUF quants of tencent/Hunyuan-A13B-Instruct

Using llama.cpp b5844 (commit 17a1f0d2d407040ee242e18dd79be8bb212cfcef)

The importance matrix was generated with calibration_datav3.txt.

All quants, including the K quants, were generated with the imatrix calibration applied.

Quantized from BF16.
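As a rough sketch of how such quants are typically produced, the standard llama.cpp workflow is: convert the checkpoint to a BF16 GGUF, compute the importance matrix against the calibration text, then quantize with that imatrix. The exact commands used for this repo are not published; the output file names and the Q4_K_M target below are illustrative assumptions.

```shell
# Sketch of the usual llama.cpp imatrix quantization workflow (assumed, not
# the author's exact commands). Requires a llama.cpp build (e.g. b5844) and
# the original tencent/Hunyuan-A13B-Instruct checkpoint on disk.

# 1. Convert the HF checkpoint to a BF16 GGUF (source precision per this card).
python convert_hf_to_gguf.py ./Hunyuan-A13B-Instruct \
  --outtype bf16 --outfile Hunyuan-A13B-Instruct-BF16.gguf

# 2. Generate the importance matrix from the calibration file named above.
./llama-imatrix -m Hunyuan-A13B-Instruct-BF16.gguf \
  -f calibration_datav3.txt -o imatrix.dat

# 3. Quantize with the imatrix; it is applied to K quants as well.
#    Q4_K_M is an example target type.
./llama-quantize --imatrix imatrix.dat \
  Hunyuan-A13B-Instruct-BF16.gguf \
  Hunyuan-A13B-Instruct-Q4_K_M.gguf Q4_K_M
```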

Model size: 80.4B params
Architecture: hunyuan-moe


Model tree: redponike/Hunyuan-A13B-Instruct-GGUF, quantized from tencent/Hunyuan-A13B-Instruct.