Transformers · GGUF · vllm · conversational

marco-giachin committed · Commit edfbbe0 · verified · 1 parent: 5407e25

Update README.md

Added quantization details

Files changed (1): README.md (+39, -1)
README.md CHANGED

extra_gated_description: >-
  our <a href="https://www.almawave.com/privacy-policy/">Privacy Policy</a>.
tags:
- vllm
base_model:
- Almawave/Velvet-14B
---

# Our quantization process

## DESCRIPTION

**This is a test quantization of Velvet-14B**, converted to GGUF format by modifying the llama.cpp conversion script so that it can use the model's tokenizer.json.

**Note:** As of today, llama.cpp does not support this model or its chat template (see https://github.com/ggerganov/llama.cpp/pull/11716).

**This model does not represent the intended quality of the original product.**

Tool used: <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a>, commit b4677.

To perform this quantization, we started from llama.cpp and modified the file convert_hf_to_gguf_update.py to support this model.
Our modifications are based on the approach in PR https://github.com/ggerganov/llama.cpp/pull/11716.

Original model: https://huggingface.co/Almawave/Velvet-14B
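
For reference, a minimal sketch of the usual llama.cpp convert-then-quantize flow is shown below. It assumes the stock convert_hf_to_gguf.py entry point (with the modifications described above applied); the local checkpoint path and output filenames are illustrative, not the exact commands we ran:

```
# Convert a locally downloaded HF checkpoint to a BF16 GGUF
# (assumes the patched conversion script discussed above)
python convert_hf_to_gguf.py ./Velvet-14B --outtype bf16 --outfile Almawave-Velvet-14B-BF16.gguf

# Quantize the BF16 GGUF down to Q4_K_M
./llama-quantize Almawave-Velvet-14B-BF16.gguf Almawave-Velvet-14B-Q4_K_M.gguf Q4_K_M
```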

## PROMPT FORMAT

Basic prompt format:

```
<s><instruction>{prompt}</instruction>
```

Prompt format with system message:

```
<s><instruction>{system_prompt}
{prompt}</instruction>
```
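
As an illustration, the basic format can be passed directly to llama.cpp's llama-cli; the model path and prompt text below are placeholders:

```
# Run a one-shot completion with the quantized model
./llama-cli -m Almawave-Velvet-14B-Q4_K_M.gguf \
  -p "<s><instruction>What is the capital of Italy?</instruction>" -n 128
```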

## DOWNLOAD

| Quant | Link |
| ------ | ---- |
| BF16 | [Almawave-Velvet-14B-BF16.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-BF16.gguf) |
| F16 | [Almawave-Velvet-14B-F16.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-F16.gguf) |
| Q4_K_M | [Almawave-Velvet-14B-Q4_K_M.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q4_K_M.gguf) |
| Q5_K_M | [Almawave-Velvet-14B-Q5_K_M.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q5_K_M.gguf) |
| Q6_K | [Almawave-Velvet-14B-Q6_K.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q6_K.gguf) |
| Q8_0 | [Almawave-Velvet-14B-Q8_0.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q8_0.gguf) |
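
A single file can be fetched with the Hugging Face CLI, for example (the filename matches the table above):

```
# Download one quant from this repository into the current directory
huggingface-cli download SistInf/Velvet-14B-GGUF Almawave-Velvet-14B-Q4_K_M.gguf --local-dir .
```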
 
# Model Card for Velvet-14B