This repository explores the extreme compression limits of the model, so only low-bit quantized models are provided. They are all quantized from F16.
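For reference, a minimal sketch of how low-bit quants like these are typically produced with llama.cpp's quantize tool. The binary name, file names, and imatrix path below are assumptions, not the exact commands used for this repository.

```python
# Sketch: produce i-quants from an F16 GGUF with llama.cpp's quantize binary.
# Paths and file names are assumptions; adjust to your own build and files.
import subprocess

F16_MODEL = "model-F16.gguf"      # hypothetical source file
IMATRIX = "imatrix.dat"           # i-quants generally need an importance matrix
QUANT_TYPES = ["IQ2_M", "IQ2_S", "IQ2_XS", "IQ2_XXS", "IQ1_M"]

for qtype in QUANT_TYPES:
    out_path = f"model-{qtype}.gguf"
    # llama-quantize usage: [--imatrix FILE] <input.gguf> <output.gguf> <type>
    subprocess.run(
        ["./llama-quantize", "--imatrix", IMATRIX, F16_MODEL, out_path, qtype],
        check=True,
    )
```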
Model | Size | Perplexity (PPL) |
---|---|---|
F16 | 15G | 8.3662 +/- 0.06216 |
IQ2_M | 2.8G | 10.2360 +/- 0.07470 |
IQ2_S | 2.6G | 11.3735 +/- 0.08396 |
IQ2_XS | 2.5G | 12.3081 +/- 0.08961 |
IQ2_XXS | 2.3G | 15.9081 +/- 0.11701 |
IQ1_M | 2.1G | 26.5610 +/- 0.19391 |
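A minimal usage sketch, assuming the llama-cpp-python bindings (`pip install llama-cpp-python`) and a locally downloaded IQ2_M file; the file name is hypothetical.

```python
from llama_cpp import Llama

# Load one of the quantized GGUF files; substitute the quant you downloaded.
llm = Llama(model_path="model-IQ2_M.gguf", n_ctx=2048)

# Run a short completion to sanity-check the model.
out = llm("Q: What does a low perplexity indicate? A:", max_tokens=64)
print(out["choices"][0]["text"])
```

Lower-bit quants trade perplexity for size, as the table above shows, so pick the smallest quant whose quality is still acceptable for your use case.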