nvidia/Llama-3.1-Nemotron-8B-UltraLong-4M-Instruct

xp1992slz committed 24 days ago

Commit e984511 · verified · 1 Parent(s): 45e12ef

Update README.md

Files changed (1)

README.md +1 -1


@@ -50,7 +50,7 @@ print(outputs[0]["generated_text"][-1])
 ## Model Card
 * Base model: [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
-* Continued Pretraining: 1B tokens on 4M Per-source upsampled SlimPajama data.
+* Continued Pretraining: 1B tokens on 4M Per-source upsampled Pretraining data.
 * Supervised fine-tuning (SFT): 1B tokens on open-source instruction datasets across general, mathematics, and code domains.
 * Maximum context window: 4M tokens