our <a href="https://www.almawave.com/privacy-policy/">Privacy Policy</a>.
tags:
- vllm
base_model:
- Almawave/Velvet-14B
---
# Our quantization process

## DESCRIPTION
**This is a test quantization of Velvet-14B**, converted to GGUF format by modifying the llama.cpp conversion script so that it can use the model's tokenizer.json.

**Note:** As of this writing, llama.cpp does not support this model or its chat template; see https://github.com/ggerganov/llama.cpp/pull/11716.

**This model does not represent the intended quality of the original product.**

Tool used: <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a>, commit b4677.

To perform this quantization, we started from llama.cpp and modified the file convert_hf_to_gguf_update.py to support this model. Our changes are based on those proposed in PR https://github.com/ggerganov/llama.cpp/pull/11716.

Original model: https://huggingface.co/Almawave/Velvet-14B
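The update script keeps a registry of models whose tokenizers it knows how to fetch and fingerprint. As a rough illustration of the kind of entry such a change adds (the enum and entry below are simplified stand-ins, not the exact contents of the PR):

```python
from enum import Enum, auto

# Simplified stand-in for the tokenizer-type enum used by the
# llama.cpp conversion scripts (illustrative only).
class TOKENIZER_TYPE(Enum):
    SPM = auto()  # SentencePiece
    BPE = auto()  # byte-pair encoding (tokenizer.json)

# Hypothetical registry entry teaching the update script about Velvet-14B,
# so it can download tokenizer.json and record the tokenizer's fingerprint.
models = [
    {"name": "velvet", "tokt": TOKENIZER_TYPE.BPE,
     "repo": "https://huggingface.co/Almawave/Velvet-14B"},
]
```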
## PROMPT FORMAT

Basic prompt format:

```
<s><instruction>{prompt}</instruction>
```

Prompt format with system message:

```
<s><instruction>{system_prompt}
{prompt}</instruction>
```
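Both formats can be assembled with a small helper; this function is our own illustration, not part of llama.cpp or the model's tooling:

```python
def build_velvet_prompt(prompt, system_prompt=None):
    """Assemble a Velvet-14B prompt string in the format shown above."""
    if system_prompt:
        # The system message comes first, separated from the
        # user prompt by a newline inside the instruction tags.
        return f"<s><instruction>{system_prompt}\n{prompt}</instruction>"
    return f"<s><instruction>{prompt}</instruction>"
```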
## DOWNLOAD

| Quant | Link |
| ------ | ---- |
| BF16 | [Almawave-Velvet-14B-BF16.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-BF16.gguf) |
| F16 | [Almawave-Velvet-14B-F16.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-F16.gguf) |
| Q4_K_M | [Almawave-Velvet-14B-Q4_K_M.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q4_K_M.gguf) |
| Q5_K_M | [Almawave-Velvet-14B-Q5_K_M.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q5_K_M.gguf) |
| Q6_K | [Almawave-Velvet-14B-Q6_K.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q6_K.gguf) |
| Q8_0 | [Almawave-Velvet-14B-Q8_0.gguf](https://huggingface.co/SistInf/Velvet-14B-GGUF/blob/main/Almawave-Velvet-14B-Q8_0.gguf) |
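The table links point at Hugging Face blob pages; for direct downloads, Hugging Face serves raw files from the `resolve` endpoint instead. A small helper to build those URLs (our own convenience function, assuming the file naming used in the table):

```python
REPO_ID = "SistInf/Velvet-14B-GGUF"

def gguf_download_url(quant):
    """Direct-download URL for one of the quantizations listed above."""
    return (f"https://huggingface.co/{REPO_ID}/resolve/main/"
            f"Almawave-Velvet-14B-{quant}.gguf")
```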
# Model Card for Velvet-14B