fix typo in readme.md
README.md CHANGED

@@ -23,7 +23,7 @@ Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we rele
 - Training Stage: Pretraining
 - Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
 - Number of Parameters: 7.61B
-- Number of
+- Number of Parameters (Non-Embedding): 6.53B
 - Number of Layers: 28
 - Number of Attention Heads (GQA): 28 for Q and 4 for KV
 - Context Length: 131,072 tokens
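The spec bullets edited above correspond to fields in the model's Hugging Face config. A minimal sketch for checking them against the README, assuming the repo ID is Qwen/Qwen2.5-7B (not stated in this diff) and the standard Qwen2 config field names in transformers:

```python
from transformers import AutoConfig

# Assumed repo ID; adjust to the actual model repository this README belongs to.
cfg = AutoConfig.from_pretrained("Qwen/Qwen2.5-7B")

print(cfg.num_hidden_layers)        # number of layers, expected 28
print(cfg.num_attention_heads)      # query heads, expected 28
print(cfg.num_key_value_heads)      # key/value heads (GQA), expected 4
print(cfg.max_position_embeddings)  # context length, expected 131072
```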