---
license_name: qwen
license_link: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-72B
tags:
- chat
library_name: vllm
---

# Qwen2.5-72B-Instruct (with CJK Filter)

This is a mirror of [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct), deployed with a **custom server-side logits processor** that filters out CJK (Chinese, Japanese, Korean) characters during generation.

The deployment uses a **vLLM-powered, OpenAI-compatible API**, optimized for **Turkish and English** outputs by preventing undesired multilingual tokens.

---

## Features

- Language: Turkish, English, Multilingual
- Model: Qwen2.5-72B-Instruct (bfloat16)
- Max sequence length: 32,768 tokens (model maximum; the deployment below caps it at 16,384 via `--max-model-len`)
- Logits Processor: Filters CJK characters to prioritize Latin script
- Optimized for OpenAI-compatible deployment using vLLM
- Tensor Parallelism: 2
- License: qwen

---

## Server Deployment (Docker Compose with vLLM)

```yaml
services:
  qwen-lm:
    image: vllm/vllm-openai:v0.8.3
    runtime: nvidia
    environment:
      - HUGGING_FACE_HUB_TOKEN=HF_TOKEN
      - PYTHON_VERSION=3.12
      - VLLM_DISABLE_COMPILE_CACHE=1
      - HF_HOME=/mnt/model-cache
      - VLLM_USE_V1=0
      - PYTHONPATH=/app
    volumes:
      - 
    ports:
      - "8010:8090"
    shm_size: "220g"
    command: >
      --model newmindai/Qwen2.5-72b-Instruct
      --tensor-parallel-size 2
      --max-model-len 16384
      --gpu-memory-utilization 0.95
      --trust-remote-code
      --host 0.0.0.0
      --port 8090
      --dtype bfloat16
      --enable-chunked-prefill
      --scheduling-policy priority
      --served-model-name newmindai/Qwen2.5-72b-Instruct
      --api-key <API_KEY>
      --logits-processor-pattern <CJKFilter_Pattern>
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0", "1"]
              capabilities: [gpu]
```
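
The `<CJKFilter_Pattern>` placeholder is a regex over qualified logits-processor names that clients are allowed to request through the `logits_processors` field (see the client example further down). The processor module itself must be importable inside the container, which is what `PYTHONPATH=/app` (together with an appropriate `volumes:` mount) is for. A minimal sketch of what such a pattern could look like; the pattern value here is an assumption, not the one used in the actual deployment:

```python
import re

# Hypothetical value for the <CJKFilter_Pattern> placeholder above: a regex
# that admits the qualified processor name requested by clients.
pattern = r"logits_processor_zoo\.vllm\.custom_logits_processor\..*"
qualified_name = "logits_processor_zoo.vllm.custom_logits_processor.cjk_filter"

# The pattern must match the name the client passes in `logits_processors`.
assert re.fullmatch(pattern, qualified_name) is not None
```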

---

## Logits Processor: `CJKCharacterFilterLogitsProcessor`

This custom logits processor blocks any vocabulary token whose decoded text contains CJK (Chinese, Japanese, Korean) characters, keeping outputs focused on Turkish and English. The vocabulary-wide mask is built once on the first call and reused on subsequent steps.

```python
import torch


class CJKCharacterFilterLogitsProcessor:
    def __init__(self, tokenizer, device):
        self.tokenizer = tokenizer
        self.device = device
        self.mask = None

    def __call__(self, token_ids, logits):
        if self.mask is None:
            token_ids_range = torch.arange(logits.size(-1), device=self.device)
            decoded_tokens = self.tokenizer.batch_decode(
                token_ids_range.unsqueeze(1), skip_special_tokens=True
            )
            self.mask = torch.tensor([
                any(
                    0x4E00 <= ord(c) <= 0x9FFF or  # Chinese
                    0x3400 <= ord(c) <= 0x4DBF or  # Chinese Extension A
                    0xF900 <= ord(c) <= 0xFAFF or  # CJK Compatibility
                    0x3040 <= ord(c) <= 0x30FF or  # Japanese Kana
                    0xAC00 <= ord(c) <= 0xD7AF     # Korean Hangul
                    for c in token
                ) for token in decoded_tokens
            ], device=self.device)

        logits_processed = logits.clone()
        logits_processed[self.mask] = -float("inf")
        return logits_processed
```
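
As a quick sanity check outside vLLM, the processor can be exercised directly against a tokenizer and a dummy logits vector. This is a minimal sketch, assuming `torch` and `transformers` are installed and that the tokenizer matches the served model (building the mask decodes the full vocabulary once, so the first call takes a moment):

```python
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-72B-Instruct")
processor = CJKCharacterFilterLogitsProcessor(tokenizer, device="cpu")

# One uniform dummy logit per vocabulary entry; the first call builds the mask.
logits = torch.zeros(len(tokenizer))
filtered = processor(token_ids=[], logits=logits)

blocked = int(torch.isinf(filtered).sum())
print(f"{blocked} of {len(tokenizer)} vocabulary tokens are masked as CJK")

# Latin-script tokens keep their original logits.
latin_id = tokenizer.encode("merhaba", add_special_tokens=False)[0]
print(filtered[latin_id].item())  # 0.0
```
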
---

## OpenAI-Compatible Client Usage

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8010/v1",
    api_key="YOUR_API_KEY"
)

response = client.chat.completions.create(
    model="newmindai/Qwen2.5-72b-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Türkiye'deki medeni hukuk sistemini açıkla."}
    ],
    extra_body={
        "logits_processors": [
            "logits_processor_zoo.vllm.custom_logits_processor.cjk_filter"
        ]
    },
    temperature=0.6,
    top_p=0.95,
    frequency_penalty=1.005,
    stream=True,
    max_tokens=256
)

for chunk in response:
    if chunk.choices:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
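
The same request can also be sent without the OpenAI SDK; the `extra_body` fields above are simply merged into the top level of the JSON payload. A minimal non-streaming sketch using `requests`, with the endpoint and key matching the compose file above:

```python
import requests

resp = requests.post(
    "http://localhost:8010/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "newmindai/Qwen2.5-72b-Instruct",
        "messages": [{"role": "user", "content": "Merhaba, nasılsın?"}],
        "max_tokens": 128,
        # Fields passed via extra_body land here, at the top level:
        "logits_processors": [
            "logits_processor_zoo.vllm.custom_logits_processor.cjk_filter"
        ],
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```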

---

## Example Comparison: With vs. Without CJK Filter

**Prompt**: `"Lütfen bana Çince bir paragraf yazın"`

### With `logits_processors`

```
Tabii ki, size Çince bir paragraf yazabilirim. İşte:

Beijing, Çin'in başkenti ve en büyük şehirlerinden biridir. Tarihi ve kültürel zenginlikleriyle ünlüdür. Şehrin merkezinde bulunan TIANANMEN Meydanı, dünyanın en büyük kentsel meydanlarından biridir. Ayrıca, Forbidden City (Zijin Cheng), 15. yüzyıldan beri Ming hanedanının ve Qing hanedanının sarayını barındırmıştır. Beijing aynı zamanda modern mimarinin örnekleriyle de dikkat çekmektedir, örneğin Ulusal Stadyum (Cuzhouniao veya Kuş Anıtı) 2008 Yaz Olimpiyatları için inşa edilmiştir.
```

### Without `logits_processors`

```
Elbette, size Çince bir paragraf yazabilirim. İşte:

中国的文化悠久而丰富多彩。从古代的四大发明到现代的科技发展,中国一直在不断地进步和创新。在艺术方面,中国画、书法和陶瓷艺术都是世界著名的。此外,中国的饮食文化也是其独特魅力的一部分,各地的特色菜肴让人回味无穷。无论是在历史、文化还是自然景观上,中国都有许多值得探索的地方.
```

With the logits processor enabled, tokens containing CJK characters are blocked at generation time, so the model responds in Turkish/English even when the prompt explicitly requests Chinese content.
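
To verify this programmatically, a small helper using the same Unicode ranges as the processor can flag any CJK characters in an output. This is an illustrative sketch, not part of the deployment:

```python
def contains_cjk(text: str) -> bool:
    """Return True if `text` contains any character in the CJK ranges
    blocked by CJKCharacterFilterLogitsProcessor."""
    return any(
        0x4E00 <= ord(c) <= 0x9FFF      # Chinese
        or 0x3400 <= ord(c) <= 0x4DBF   # Chinese Extension A
        or 0xF900 <= ord(c) <= 0xFAFF   # CJK Compatibility
        or 0x3040 <= ord(c) <= 0x30FF   # Japanese Kana
        or 0xAC00 <= ord(c) <= 0xD7AF   # Korean Hangul
        for c in text
    )

assert contains_cjk("中国的文化") is True
assert contains_cjk("Merhaba dünya") is False
```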

---

## Evaluation

**Mezura Benchmarking**  
Final performance was benchmarked with [Mezura](https://huggingface.co/spaces/newmindai/Mezura), a standardized evaluation suite developed by NewmindAI for structured Turkish NLP tasks.

## License

This model inherits the license of [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct), which is released under the **Qwen license**. You are free to use, adapt, and distribute the model under the terms specified in that license.

---

## Contact

For support, questions, or feature requests, please contact [newmindai on Hugging Face](https://huggingface.co/newmindai) or open an issue in the associated model repository.