Transformers
GGUF
vllm
conversational
marco-giachin committed
Commit 1c09b08 · verified · 1 parent: edfbbe0

Update README.md


Added quantization info

Files changed (1)
  1. README.md (+4 -8)
README.md CHANGED
@@ -18,20 +18,13 @@ base_model:
 - Almawave/Velvet-14B
 ---
 
-# Our quantization process
-
 ## DESCRIPTION
-
-**This is a test quantization of Velvet-14B**, converted to GGUF format by modifying the llama.cpp script to make possible the utilization of tokenizer.json.
-**Note:** As of today, llama.cpp does not support this model or this chat template https://github.com/ggerganov/llama.cpp/pull/11716.
 **This model does not represent the intended quality of the original product.**
 
-Tool used: <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> commit b4677.
-
 To perform this quantization, we started with llama.cpp as our base, modifying the file convert_hf_to_gguf_update.py to support this model.
 For modifying this file, we based our work on what was seen in the PR https://github.com/ggerganov/llama.cpp/pull/11716.
 
-Original Model: https://huggingface.co/Almawave/Velvet-14B
+**Note:** As of today, llama.cpp does not support this model or this chat template https://github.com/ggerganov/llama.cpp/pull/11716.
 
 ## PROMPT FORMAT

@@ -59,6 +52,9 @@ Prompt format with system message:
 | Q6_K | [Almawave-Velvet-14B-Q6_K.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q6_K.gguf) |
 | Q8_0 | [Almawave-Velvet-14B-Q8_0.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q8_0.gguf) |
 
+
+Original Model: https://huggingface.co/Almawave/Velvet-14B
+
 # Model Card for Velvet-14B
 
 Velvet is an Italian family of large language models, developed from scratch, featuring a dense architecture. This model was trained on the HPC Leonardo infrastructure hosted by [CINECA](https://www.cineca.it/en), utilizing public data that underwent extensive curation.
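The README's table lists Q6_K and Q8_0 GGUF builds. As a rough sanity check on the expected download sizes, one can multiply the parameter count by the nominal bits-per-weight of each quantization scheme. This is a minimal sketch: the ~14B parameter count is inferred from the model name, and the bits-per-weight figures are llama.cpp's approximate nominal values, not numbers taken from this card.

```python
# Rough GGUF file-size estimate: parameters x bits-per-weight / 8 bytes.
# Assumptions (not from the card): ~14e9 parameters; nominal bpw values
# for llama.cpp quant types (Q8_0 stores blocks of 32 int8 weights plus
# one fp16 scale, i.e. 34 bytes per 32 weights = 8.5 bpw).
PARAMS = 14e9

BPW = {
    "Q6_K": 6.5625,  # 6-bit k-quant, approximate
    "Q8_0": 8.5,     # 8-bit blocks with fp16 scale
}

def estimate_gb(quant: str, params: float = PARAMS) -> float:
    """Estimated file size in gigabytes (1 GB = 1e9 bytes)."""
    return params * BPW[quant] / 8 / 1e9

for q in BPW:
    print(f"{q}: ~{estimate_gb(q):.1f} GB")
```

This ignores metadata and non-quantized tensors (e.g. embeddings are often kept at higher precision), so real files can deviate by a gigabyte or so; it is only a plausibility check, not a spec.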