Text Generation, Transformers, Safetensors, qwen3_moe, programming, code generation, code, codeqwen, Mixture of Experts, coding, coder, qwen2, chat, qwen, qwen-coder, Qwen3-30B-A3B-Thinking-2507, Qwen3-30B-A3B, 128 experts, 10 active experts, 256k context, qwen3, finetune, brainstorm 40x, brainstorm, thinking, reasoning, conversational
Update README.md
README.md
CHANGED
@@ -73,6 +73,7 @@ For coding, programming set expert to:
 - 10 for moderate work.
 - 12-16 for complex work, long projects, complex coding.
 - And for longer context, and/or multi-turn -> increase experts by 1-2 to help with longer context/multi turn understanding.
+- Suggest min context 8k-16k for thinking/output.
 
 Recommended settings - general:
 - Rep pen 1.05 to 1.1 ; however rep pen of 1 will work well (may need to raise it for lower quants/fewer activated experts)
@@ -80,12 +81,14 @@ Recommended settings - general:
 - Topk of 20, 40 or 100
 - Topp of .95 / min p of .05
 - System prompt (optional) to focus the model better.
+- Suggest min context 8k-16k for thinking/output.
 
 Creative Use Cases:
 - Rep pen of 1.09 or higher, especially if using a lower quant / lower temps.
 - Temps of .8 to 2 suggested.
 - Also use rep pen of 1.1 or higher with very short prompts.
 - You can set active experts as low as "4" for creative use cases.
+- Suggest min context 8k-16k for thinking/output.
 - NOTE: The 20x/42B version may be better for creative use cases.
 
 This is the refined version -V1.4- from this project (see this repo for all settings, details, system prompts, example generations etc etc):
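The recommendations touched by this diff can be collected into one small lookup, which may help when scripting runs. The following is a minimal sketch and not part of the README: the names `SamplerSettings` and `suggest_settings`, the field names, and the use-case labels are all hypothetical, and wherever the README gives a range or a list of options the sketch simply picks one value (noted in the comments). It also assumes "8k" means 8 × 1024 tokens. Field names would need to be mapped onto whatever your inference app actually calls these options.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SamplerSettings:
    """Hypothetical container; field names are illustrative only."""
    active_experts: int
    repeat_penalty: float
    temperature: Optional[float]  # None: no temperature is given in the visible lines
    top_k: int
    top_p: float
    min_p: float
    min_context_tokens: int


def suggest_settings(use_case: str, long_context: bool = False) -> SamplerSettings:
    """Pick one starting point from the README's ranges for a use case.

    use_case is one of "moderate", "complex" (both coding) or "creative".
    """
    if use_case == "moderate":
        experts, rep_pen, temp = 10, 1.05, None   # rep pen range: 1.05 to 1.1
    elif use_case == "complex":
        experts, rep_pen, temp = 12, 1.05, None   # experts range: 12-16
    elif use_case == "creative":
        experts, rep_pen, temp = 4, 1.1, 0.8      # "as low as 4"; temps .8 to 2
    else:
        raise ValueError(f"unknown use case: {use_case!r}")

    # The README suggests +1 to +2 experts for long context / multi-turn;
    # that advice sits in the coding section, so it is applied only there.
    if long_context and use_case != "creative":
        experts += 1

    return SamplerSettings(
        active_experts=experts,
        repeat_penalty=rep_pen,
        temperature=temp,
        top_k=40,                     # README lists 20, 40 or 100
        top_p=0.95,
        min_p=0.05,
        min_context_tokens=8 * 1024,  # lower end of the suggested 8k-16k
    )


if __name__ == "__main__":
    print(suggest_settings("complex", long_context=True))
```

These values are starting points only; the README itself notes that rep pen may need raising for lower quants or fewer activated experts.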