Update README.md
README.md (changed)
@@ -14,6 +14,8 @@ datasets:
- open-r1/Mixture-of-Thoughts
---

+
+
# **Theta-Crucis-0.6B-Turbo1**

> **Theta-Crucis-0.6B-Turbo1** is a compact, high-performance model designed for **code generation**, **technical reasoning**, and **structured output tasks**. Fine-tuned from **Qwen3-0.6B** using the **Mixture of Thoughts (MoT)** dataset with an emphasis on **code expert clusters**, this model delivers agile and accurate coding assistance in low-resource environments. At only **0.6B parameters**, it offers strong fluency in programming, structured syntax, and technical language generation.
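The card's full usage snippet (the `print(response)` visible in the second hunk's context) is not included in this diff. As background, Qwen-family models such as the Qwen3-0.6B base use a ChatML-style chat template; the sketch below shows how such a prompt is assembled by hand. The `build_chatml_prompt` helper and the exact template details are illustrative assumptions, not the card's own code — in practice the tokenizer's `apply_chat_template` handles this.

```python
def build_chatml_prompt(messages):
    # ChatML-style template (illustrative): each turn is wrapped in
    # <|im_start|>role ... <|im_end|> markers, and the prompt ends with
    # an opened assistant turn to cue the model's reply.
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that reverses a string."},
])
print(prompt)
```

With the actual tokenizer, `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` produces the model's canonical version of this string.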
@@ -108,4 +110,4 @@ print(response)
1. [Qwen2.5 Technical Report (2024)](https://arxiv.org/pdf/2412.15115)
2. [YaRN: Efficient Context Window Extension of Large Language Models](https://arxiv.org/pdf/2309.00071)
-
+3. [open-r1/Mixture-of-Thoughts](https://huggingface.co/datasets/open-r1/Mixture-of-Thoughts)