Update README.md

#1
by wping - opened
Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -51,7 +51,7 @@ print(outputs[0]["generated_text"][-1])
 ## Model Card
 
 * Base model: [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
-* Continued Pretraining: 1B tokens on 4M Per-source upsampled Pretraining data.
+* Continued Pretraining: The training data consists of 1B tokens sourced from a pretraining corpus using per-domain upsampling based on sample length. The model was trained for 150 iterations with a sequence length of 4M tokens and a global batch size of 2.
 * Supervised fine-tuning (SFT): 1B tokens on open-source instruction datasets across general, mathematics, and code domains.
 * Maximum context window: 4M tokens
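As a quick sanity check of the revised bullet (a sketch based only on the numbers it quotes, not part of the PR itself), the stated training configuration can be multiplied out to confirm it is consistent with the ~1B-token continued-pretraining budget:

```python
# Sanity check of the continued-pretraining token budget quoted in the
# revised README bullet: 150 iterations, 4M-token sequences, global batch size 2.
iterations = 150
sequence_length = 4_000_000   # 4M tokens per sequence
global_batch_size = 2

total_tokens = iterations * sequence_length * global_batch_size
print(f"{total_tokens:,} tokens (~{total_tokens / 1e9:.1f}B)")
```

This comes to 1.2B tokens, which lines up (to rounding) with the "1B tokens" figure in the bullet.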