---

license: mit
datasets:
- fka/awesome-chatgpt-prompts
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
base_model:
- Qwen/Qwen2.5-1.5B-Instruct
pipeline_tag: text-generation
---

# Quantized Qwen2.5-1.5B-Instruct

This repository contains 8-bit and 4-bit quantized versions of the Qwen2.5-1.5B-Instruct model using GPTQ. Quantization significantly reduces the model's size and memory footprint, enabling faster inference on resource-constrained devices while maintaining reasonable performance.


## Model Description

Qwen2.5-1.5B-Instruct is an instruction-tuned language model developed by the Qwen team. The quantized versions in this repository offer a more efficient way to deploy and use it.


## Quantization Details

* **Quantization Method:** GPTQ, a one-shot post-training quantization technique for generative pre-trained transformers
* **Quantization Bits:** 8-bit and 4-bit versions are available.
* **Calibration Dataset:** a subset of the "fka/awesome-chatgpt-prompts" dataset served as calibration data.
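
For reference, the snippet below is a minimal sketch of how such a quantization run can be reproduced with the `GPTQConfig` API in `transformers`. The calibration column name and sample count are assumptions for illustration, not a record of the exact settings used for these checkpoints.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "Qwen/Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Calibration texts from the prompts dataset; the "prompt" column and the
# sample count of 128 are assumptions for illustration.
calib = load_dataset("fka/awesome-chatgpt-prompts", split="train")
calib_texts = [row["prompt"] for row in calib.select(range(128))]

# bits=4 produces the 4-bit variant; use bits=8 for the 8-bit one.
gptq_config = GPTQConfig(bits=4, dataset=calib_texts, tokenizer=tokenizer)

# Quantization happens at load time and needs a CUDA-capable GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=gptq_config,
    device_map="auto",
)
model.save_pretrained("Qwen2.5-1.5B-Instruct-GPTQ-4bit")
tokenizer.save_pretrained("Qwen2.5-1.5B-Instruct-GPTQ-4bit")
```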


## Usage

To use the quantized models, follow these steps:

**Install Dependencies:**
```bash
pip install transformers accelerate bitsandbytes auto-gptq optimum
```
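
**Load a Quantized Model:**

The snippet below is a minimal inference sketch. The `repo_id` is a placeholder, not the actual checkpoint name; substitute the 8-bit or 4-bit checkpoint from this repository.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder name; point this at the actual quantized checkpoint.
repo_id = "your-username/Qwen2.5-1.5B-Instruct-GPTQ-4bit"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

messages = [{"role": "user", "content": "Explain GPTQ quantization in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Because the GPTQ settings are stored in the checkpoint's config, `from_pretrained` picks them up automatically; no extra flags are needed at load time.
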
## Performance

The quantized models are substantially smaller than the original: 4-bit GPTQ roughly quarters the weight storage of a 16-bit checkpoint, and 8-bit roughly halves it. Quantization typically costs some accuracy, but the trade-off is often worthwhile for deployment on devices with limited resources.
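
To verify the savings on your own hardware, `transformers` exposes a per-model memory report via `get_memory_footprint()`; the quantized repo name below is again a placeholder.

```python
from transformers import AutoModelForCausalLM

# Baseline footprint vs. the quantized checkpoint; a GPU is assumed for
# the GPTQ load, and the quantized repo name is a placeholder.
baseline = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
print(f"baseline  : {baseline.get_memory_footprint() / 1e9:.2f} GB")

quantized = AutoModelForCausalLM.from_pretrained(
    "your-username/Qwen2.5-1.5B-Instruct-GPTQ-4bit", device_map="auto"
)
print(f"4-bit GPTQ: {quantized.get_memory_footprint() / 1e9:.2f} GB")
```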


## Disclaimer

These quantized models are provided for research and experimentation purposes. We do not guarantee their performance or suitability for specific applications.


## Acknowledgements

* **Qwen:** For developing the original Qwen2.5-1.5B-Instruct model.
* **Hugging Face:** For providing the platform and tools for model sharing and quantization.
* **GPTQ Authors:** For developing the GPTQ quantization method.