Transformers
GGUF
vllm
conversational
marco-giachin committed
Commit 1c09b08 · verified · 1 parent: edfbbe0

Update README.md


Added quantization info

Files changed (1)
  1. README.md (+4 -8)
README.md CHANGED
@@ -18,20 +18,13 @@ base_model:
 - Almawave/Velvet-14B
 ---
 
-# Our quantization process
-
 ## DESCRIPTION
-
-**This is a test quantization of Velvet-14B**, converted to GGUF format by modifying the llama.cpp script to make possible the utilization of tokenizer.json.
-**Note:** As of today, llama.cpp does not support this model or this chat template https://github.com/ggerganov/llama.cpp/pull/11716.
 **This model does not represent the intended quality of the original product.**
 
-Tool used: <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> commit b4677.
-
 To perform this quantization, we started with llama.cpp as our base, modifying the file convert_hf_to_gguf_update.py to support this model.
 For modifying this file, we based our work on what was seen in the PR https://github.com/ggerganov/llama.cpp/pull/11716.
 
-Original Model: https://huggingface.co/Almawave/Velvet-14B
+**Note:** As of today, llama.cpp does not support this model or this chat template https://github.com/ggerganov/llama.cpp/pull/11716.
 
 ## PROMPT FORMAT

@@ -59,6 +52,9 @@ Prompt format with system message:
 | Q6_K | [Almawave-Velvet-14B-Q6_K.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q6_K.gguf) |
 | Q8_0 | [Almawave-Velvet-14B-Q8_0.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q8_0.gguf) |
 
+
+Original Model: https://huggingface.co/Almawave/Velvet-14B
+
 # Model Card for Velvet-14B
 
 Velvet is an Italian family of large language models, developed from scratch, featuring a dense architecture. This model was trained on the HPC Leonardo infrastructure hosted by [CINECA](https://www.cineca.it/en), utilizing public data that underwent extensive curation.
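The README's table lists Q6_K and Q8_0 GGUF builds. As a rough sanity check on the expected download sizes, one can multiply the parameter count by the nominal bits-per-weight of each quantization scheme. This is a minimal sketch: the ~14B parameter count is inferred from the model name, and the bits-per-weight figures are llama.cpp's approximate nominal values, not numbers taken from this card.

```python
# Rough GGUF file-size estimate: parameters x bits-per-weight / 8 bytes.
# Assumptions (not from the card): ~14e9 parameters; nominal bpw values
# for llama.cpp quant types (Q8_0 stores blocks of 32 int8 weights plus
# one fp16 scale, i.e. 34 bytes per 32 weights = 8.5 bpw).
PARAMS = 14e9

BPW = {
    "Q6_K": 6.5625,  # 6-bit k-quant, approximate
    "Q8_0": 8.5,     # 8-bit blocks with fp16 scale
}

def estimate_gb(quant: str, params: float = PARAMS) -> float:
    """Estimated file size in gigabytes (1 GB = 1e9 bytes)."""
    return params * BPW[quant] / 8 / 1e9

for q in BPW:
    print(f"{q}: ~{estimate_gb(q):.1f} GB")
```

This ignores metadata and non-quantized tensors (e.g. embeddings are often kept at higher precision), so real files can deviate by a gigabyte or so; it is only a plausibility check, not a spec.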