Zhangchen Xu committed
Update README.md

README.md CHANGED
@@ -8,8 +8,8 @@ tags:
 - dpo
 - generated_from_trainer
 datasets:
-- Magpie-Align/MagpieLM-4B-SFT-v0.1
-- Magpie-Align/MagpieLM-4B-DPO-v0.1
+- Magpie-Align/MagpieLM-4B-SFT-Data-v0.1
+- Magpie-Align/MagpieLM-4B-DPO-Data-v0.1
 model-index:
 - name: MagpieLM-4B-Chat-v0.1
   results: []
@@ -27,12 +27,12 @@ model-index:

 This model is an aligned version of [Llama-3.1-Minitron-4B-Width](https://huggingface.co/nvidia/Llama-3.1-Minitron-4B-Width-Base), which achieves state-of-the-art performance among open-aligned SLMs. It even outperforms larger open-weight models including Llama-3-8B-Instruct, Llama-3.1-8B-Instruct and Qwen-2-7B-Instruct.

-We apply the following standard alignment pipeline with two carefully crafted synthetic datasets.
+We apply the following standard alignment pipeline with two carefully crafted synthetic datasets. Feel free to use these datasets and reproduce our model, or make your own friendly chatbots :)

-We first perform SFT using [Magpie-Align/MagpieLM-4B-SFT-v0.1](https://huggingface.co/datasets/Magpie-Align/MagpieLM-4B-SFT-v0.1).
+We first perform SFT using [Magpie-Align/MagpieLM-4B-SFT-Data-v0.1](https://huggingface.co/datasets/Magpie-Align/MagpieLM-4B-SFT-Data-v0.1).
 * **SFT Model Checkpoint:** [Magpie-Align/MagpieLM-4B-SFT-v0.1](https://huggingface.co/Magpie-Align/MagpieLM-4B-SFT-v0.1)

-We then perform DPO on the [Magpie-Align/MagpieLM-4B-DPO-v0.1](https://huggingface.co/datasets/Magpie-Align/MagpieLM-4B-DPO-v0.1) dataset.
+We then perform DPO on the [Magpie-Align/MagpieLM-4B-DPO-Data-v0.1](https://huggingface.co/datasets/Magpie-Align/MagpieLM-4B-DPO-Data-v0.1) dataset.

 ## 🔥 Benchmark Performance

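For readers who want to inspect the two renamed datasets before reproducing the pipeline, here is a minimal sketch using the `datasets` library (assuming a recent `datasets` release and Hub access); it only prints the splits and columns each dataset exposes.

```python
from datasets import load_dataset

# Hedged sketch: pull the SFT and DPO datasets referenced in the updated card
# and list their splits and column names to see what the alignment pipeline consumes.
for repo_id in [
    "Magpie-Align/MagpieLM-4B-SFT-Data-v0.1",
    "Magpie-Align/MagpieLM-4B-DPO-Data-v0.1",
]:
    ds = load_dataset(repo_id)  # DatasetDict keyed by split name
    for split, data in ds.items():
        print(repo_id, split, len(data), data.column_names)
```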
@@ -62,7 +62,7 @@ You can then run conversational inference using the Transformers `pipeline` abstraction
 import transformers
 import torch

-model_id = "
+model_id = "MagpieLM-4B-Chat-v0.1"

 pipeline = transformers.pipeline(
     "text-generation",
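The hunk above only shows a fragment of the card's inference snippet. A self-contained sketch of how the `pipeline` call might be completed is below; the full Hub id, the example messages, and the generation settings are illustrative assumptions rather than the card's exact values.

```python
import torch
import transformers

# Assumed full Hub id; the diff itself only shows "MagpieLM-4B-Chat-v0.1".
model_id = "Magpie-Align/MagpieLM-4B-Chat-v0.1"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Recent transformers versions accept chat-style message lists directly.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi! Who are you?"},
]
outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1]["content"])
```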
@@ -107,7 +107,7 @@ load_in_4bit: false
 strict: false

 datasets:
-  - path: Magpie-Align/MagpieLM-4B-SFT-v0.1
+  - path: Magpie-Align/MagpieLM-4B-SFT-Data-v0.1
     type: sharegpt
     conversation: llama3
 dataset_prepared_path: last_run_prepared
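In this axolotl config, `type: sharegpt` with `conversation: llama3` means ShareGPT-style conversation records are rendered with the Llama-3 chat template. A hedged sketch of what that rendering looks like, using the SFT checkpoint named in the card (the conversation content is made up for illustration):

```python
from transformers import AutoTokenizer

# Tokenizer of the SFT checkpoint listed in the card; the message below is a made-up example.
tokenizer = AutoTokenizer.from_pretrained("Magpie-Align/MagpieLM-4B-SFT-v0.1")

messages = [
    {"role": "user", "content": "What does 'conversation: llama3' mean in the axolotl config?"},
]

# Render with the Llama-3 chat template, appending the assistant header so a
# model would generate the reply next.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```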
@@ -223,7 +223,7 @@ output_dir: alignment_handbook_out/MagpieLM-4B-Chat-v0.1
 run_name: MagpieLM-4B-Chat-v0.1

 dataset_mixer:
-  Magpie-Align/MagpieLM-4B-DPO-v0.1: 1.0
+  Magpie-Align/MagpieLM-4B-DPO-Data-v0.1: 1.0
 dataset_splits:
 - train
 - test
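The `dataset_mixer` and `dataset_splits` keys come from the alignment-handbook DPO config. As a rough illustration of the config's intent (not the handbook's actual loader), a weight of 1.0 keeps every example of each listed split, while a smaller weight would subsample proportionally; split names here are assumptions guarded at runtime.

```python
from datasets import load_dataset

# Illustration only: mirror the dataset_mixer / dataset_splits settings above.
dataset_mixer = {"Magpie-Align/MagpieLM-4B-DPO-Data-v0.1": 1.0}
dataset_splits = ["train", "test"]

for repo_id, frac in dataset_mixer.items():
    ds = load_dataset(repo_id)
    for split in dataset_splits:
        if split in ds:  # guard: the available split names are an assumption
            keep = int(len(ds[split]) * frac)
            print(f"{repo_id} [{split}]: keeping {keep} of {len(ds[split])} examples")
```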