YOYO-AI committed · Commit 56920e4 · verified · 1 Parent(s): c77b370

Update README.md

  1. README.md +37 -0
README.md CHANGED
@@ -11,4 +11,41 @@ pipeline_tag: text-generation
  tags:
  - merge
  ---
+ # QwQ-Coder-instruct

+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e174e202fa032de4143324/hHMN168t4-JhJwo0tCM8d.png)
+
+ ## Introduction
+
+ This merge integrates **Qwen2.5-Coder-32B-instruct** into the **QwQ** model, significantly enhancing its **coding abilities** and **instruction-following skills** without compromising QwQ's long-chain reasoning capabilities.
+
+ In my own practical tests, the results were exceptionally impressive!
+ ## Merge
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ### Merge Method
+
+ This model was merged with the [SCE](https://arxiv.org/abs/2408.07990) merge method, using [Qwen/Qwen2.5-Coder-32B](https://huggingface.co/Qwen/Qwen2.5-Coder-32B) as the base model.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [Qwen/Qwen2.5-Coder-32B-instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-instruct)
+ * [Qwen/QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview)
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ merge_method: sce
+ models:
+   - model: Qwen/QwQ-32B-Preview
+   - model: Qwen/Qwen2.5-Coder-32B-instruct
+ base_model: Qwen/Qwen2.5-Coder-32B
+ parameters:
+   select_topk: 1
+ dtype: bfloat16
+ normalize: true
+ ```
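
Note (not part of the commit): to make the `sce` method and the `select_topk` parameter in the configuration above more concrete, here is a toy sketch of the three SCE steps (select, calculate, erase) on flat lists of floats. It is a simplified reading of the linked paper, not mergekit's implementation; the function name `sce_merge` and the list-based representation are assumptions made for illustration only.

```python
def sce_merge(base, models, select_topk=1.0):
    """Toy SCE merge over flat lists of floats (hypothetical helper, not mergekit code)."""
    n = len(base)
    # Task vectors: each fine-tuned model's delta from the base weights.
    deltas = [[m[i] - base[i] for i in range(n)] for m in models]

    # Select: keep only the top `select_topk` fraction of positions,
    # ranked by the variance of the deltas across models.
    if select_topk < 1.0:
        def variance(i):
            vals = [d[i] for d in deltas]
            mean = sum(vals) / len(vals)
            return sum((v - mean) ** 2 for v in vals) / len(vals)

        k = int(select_topk * n)
        keep = set(sorted(range(n), key=variance, reverse=True)[:k])
        deltas = [[d[i] if i in keep else 0.0 for i in range(n)] for d in deltas]

    # Calculate: weight each model by its share of the total squared delta magnitude.
    energy = [sum(v * v for v in d) for d in deltas]
    total = sum(energy)
    weights = [e / total if total > 0 else 1.0 / len(deltas) for e in energy]

    # Erase: at each position, drop deltas whose sign disagrees with the
    # sign of the summed delta, then take a weighted average of the rest.
    merged = []
    for i in range(n):
        col = [d[i] for d in deltas]
        majority = 1.0 if sum(col) >= 0 else -1.0
        num = den = 0.0
        for w, v in zip(weights, col):
            if v * majority > 0:
                num += w * v
                den += w
        merged.append(base[i] + (num / den if den > 0 else 0.0))
    return merged


# Two toy "models" fine-tuned from the same two-parameter base.
merged = sce_merge([0.0, 0.0], [[1.0, 2.0], [3.0, -1.0]])
print(merged)
```

Under this reading, `select_topk: 1` means the select step keeps every position, so only the weighting and sign-erasure steps shape the result; a value below 1 would first zero out the low-variance positions.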