Update README.md
Added quantization info

README.md CHANGED
@@ -18,20 +18,13 @@ base_model:
 - Almawave/Velvet-14B
 ---
 
-# Our quantization process
-
 ## DESCRIPTION
-
-**This is a test quantization of Velvet-14B**, converted to GGUF format by modifying the llama.cpp conversion script so that it can use tokenizer.json.
-**Note:** As of today, llama.cpp does not support this model or its chat template; see https://github.com/ggerganov/llama.cpp/pull/11716.
 **This model does not represent the intended quality of the original product.**
 
-Tool used: <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> commit b4677.
-
 To perform this quantization, we started from llama.cpp, modifying the file convert_hf_to_gguf_update.py to support this model (sketched below).
 These modifications are based on the changes proposed in PR https://github.com/ggerganov/llama.cpp/pull/11716.
 
-
+**Note:** As of today, llama.cpp does not support this model or its chat template; see https://github.com/ggerganov/llama.cpp/pull/11716.
 
 ## PROMPT FORMAT
 
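For context, convert_hf_to_gguf_update.py recognizes each supported tokenizer by encoding a fixed test string and hashing the resulting token IDs, so adding a model means registering it in the script's `models` list and regenerating that hash. Below is a minimal sketch of that mechanism, not the actual patch from the PR: the entry name "velvet", the string-valued "tokt" field (the real script uses an enum), and the shortened test string are illustrative assumptions.

```python
# Minimal sketch of the tokenizer-registration mechanism in
# convert_hf_to_gguf_update.py; assumptions are noted below, this is not the real patch.
from hashlib import sha256

from transformers import AutoTokenizer  # pip install transformers

# Shape of an entry in the script's `models` list; "velvet" is a hypothetical
# name, and the real script uses a TOKENIZER_TYPE enum rather than a string.
new_entry = {"name": "velvet", "tokt": "BPE",
             "repo": "https://huggingface.co/Almawave/Velvet-14B"}

# The real script encodes a long, fixed multilingual test string; this short
# stand-in only illustrates the idea.
chktxt = "Hello world! à è ì 1234567890"

tokenizer = AutoTokenizer.from_pretrained("Almawave/Velvet-14B")
chkhsh = sha256(str(tokenizer.encode(chktxt)).encode()).hexdigest()
print(f"{new_entry['name']}: chkhsh = {chkhsh}")
```

The generated hash is what convert_hf_to_gguf.py later matches to select the right pre-tokenizer settings for the model.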
@@ -59,6 +52,9 @@ Prompt format with system message:
 | Q6_K | [Almawave-Velvet-14B-Q6_K.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q6_K.gguf) |
 | Q8_0 | [Almawave-Velvet-14B-Q8_0.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q8_0.gguf) |
 
+
+Original Model: https://huggingface.co/Almawave/Velvet-14B
+
 # Model Card for Velvet-14B
 
 Velvet is an Italian family of large language models, developed from scratch, featuring a dense architecture. This model was trained on the HPC Leonardo infrastructure hosted by [CINECA](https://www.cineca.it/en), utilizing public data that underwent extensive curation.
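For readers who want to reproduce a comparable pipeline, the sketch below outlines the standard llama.cpp conversion and quantization steps that would produce files like those in the table above. It is an assumption-laden outline, not the authors' published commands: the local paths, the build directory, and the F16 intermediate filename are all placeholders.

```python
# Hypothetical reproduction sketch; not the authors' exact commands. Assumes a
# llama.cpp checkout (with the modified converter) built into ./build/bin and
# the original model downloaded to ./Velvet-14B.
import subprocess

# 1. Convert the Hugging Face checkpoint to a full-precision GGUF file.
subprocess.run(
    ["python", "convert_hf_to_gguf.py", "./Velvet-14B",
     "--outtype", "f16", "--outfile", "Almawave-Velvet-14B-F16.gguf"],
    check=True,
)

# 2. Quantize the F16 file to a released type (Q6_K shown; repeat with "Q8_0").
subprocess.run(
    ["./build/bin/llama-quantize",
     "Almawave-Velvet-14B-F16.gguf", "Almawave-Velvet-14B-Q6_K.gguf", "Q6_K"],
    check=True,
)
```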