Prikshit7766/distilbert-finetuned-imdb-mlm-accelerate

@@ -1,79 +1,90 @@
-# DistilBERT Fine-Tuned on IMDB for Masked Language Modeling (Accelerate)
-## Model Description
-This model is a fine-tuned version of [**`distilbert-base-uncased`**](https://huggingface.co/distilbert/distilbert-base-uncased) for the masked language modeling (MLM) task. It has been trained on the IMDb dataset using the Hugging Face 🤗 Accelerate library.
----
-## Model Training Details
-### Training Dataset
-- **Dataset:** [IMDB dataset](https://huggingface.co/datasets/imdb) from Hugging Face.
-- **Dataset Splits:**
-  - Train: 25,000 samples
-  - Test: 25,000 samples
-  - Unsupervised: 50,000 samples
-- **Training Strategy:**
-  - Combined the train and unsupervised splits for training, resulting in 75,000 training examples.
-  - Applied fixed random masking to the evaluation set to ensure consistent perplexity scores.
----
-### Training Configuration
-The model was trained using the following parameters:
-- **Number of Training Epochs:** `10`
-- **Batch Size:** `64` (per device).
-- **Learning Rate:** `5e-5`
-- **Weight Decay:** `0.01`
-- **Evaluation Strategy:** After each epoch.
-- **Early Stopping:** Enabled (Patience = `3`).
-- **Metric for Best Model:** `eval_loss`
-  - **Direction:** Lower `eval_loss` is better (`greater_is_better = False`).
-- **Learning Rate Scheduler:** Linear decay with no warmup steps.
-- **Mixed Precision Training:** Enabled (FP16).
----
-## Model Results
-### Best Epoch Performance
-- **Best Epoch:** `9`
-- **Loss:** `2.0173`
-- **Perplexity:** `7.5178`
-### Early Stopping
-- The training ran for the full `10` epochs as the evaluation loss continued to improve.
----
-## Model Usage
-This fine-tuned model can be used for masked language modeling tasks using the `fill-mask` pipeline from Hugging Face. Below is an example:
-```python
-from transformers import pipeline
-mask_filler = pipeline("fill-mask", model="Prikshit7766/distilbert-finetuned-imdb-mlm-accelerate")
-text = "This is a great [MASK]."
-predictions = mask_filler(text)
-for pred in predictions:
-    print(f">>> {pred['sequence']}")
-```
-**Example Output:**
-```text
->>> This is a great movie.
->>> This is a great film.
->>> This is a great show.
->>> This is a great story.
->>> This is a great documentary.
-```

+---
+datasets:
+- stanfordnlp/imdb
+language:
+- en
+metrics:
+- perplexity
+base_model:
+- distilbert/distilbert-base-uncased
+pipeline_tag: fill-mask
+---
+# DistilBERT Fine-Tuned on IMDB for Masked Language Modeling (Accelerate)
+## Model Description
+This model is a fine-tuned version of [**`distilbert-base-uncased`**](https://huggingface.co/distilbert/distilbert-base-uncased) for masked language modeling (MLM). It was trained on the IMDB dataset with the Hugging Face 🤗 Accelerate library.
+---
+## Model Training Details
+### Training Dataset
+- **Dataset:** [IMDB dataset](https://huggingface.co/datasets/imdb) from Hugging Face.
+- **Dataset Splits:**
+  - Train: 25,000 samples
+  - Test: 25,000 samples
+  - Unsupervised: 50,000 samples
+- **Training Strategy:**
+  - Combined the train and unsupervised splits for training, resulting in 75,000 training examples.
+  - Applied random masking to the evaluation set once and kept it fixed, so every evaluation scores the same masked tokens and perplexity is comparable across epochs.
+---
+### Training Configuration
+The model was trained using the following parameters:
+- **Number of Training Epochs:** `10`
+- **Batch Size:** `64` (per device).
+- **Learning Rate:** `5e-5`
+- **Weight Decay:** `0.01`
+- **Evaluation Strategy:** After each epoch.
+- **Early Stopping:** Enabled (Patience = `3`).
+- **Metric for Best Model:** `eval_loss`
+  - **Direction:** Lower `eval_loss` is better (`greater_is_better = False`).
+- **Learning Rate Scheduler:** Linear decay with no warmup steps.
+- **Mixed Precision Training:** Enabled (FP16).
+---
+## Model Results
+### Best Epoch Performance
+- **Best Epoch:** `9`
+- **Loss:** `2.0173`
+- **Perplexity:** `7.5178`
+### Early Stopping
+- Early stopping did not trigger: training ran for the full `10` epochs because the evaluation loss continued to improve.
+---
+## Model Usage
+This fine-tuned model can fill in masked tokens through the 🤗 Transformers `fill-mask` pipeline. For example:
+```python
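+from transformers import pipeline
+# Load the fine-tuned checkpoint from the Hugging Face Hub.
+mask_filler = pipeline("fill-mask", model="Prikshit7766/distilbert-finetuned-imdb-mlm-accelerate")
+text = "This is a great [MASK]."
+# By default the pipeline returns the 5 highest-scoring candidates for the masked token.
+predictions = mask_filler(text)
+# Each prediction is a dict; "sequence" holds the input text with [MASK] filled in.
+for pred in predictions:
+    print(f">>> {pred['sequence']}")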
+```
+**Example Output:**
+```text
+>>> This is a great movie.
+>>> This is a great film.
+>>> This is a great show.
+>>> This is a great story.
+>>> This is a great documentary.
+```
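
The reported loss and perplexity in the Model Results section can be related with a short check. This is a minimal sketch that assumes the card's perplexity is the exponential of the mean evaluation cross-entropy loss, which is the usual convention for MLM evaluation but is not stated in the card itself. The function name `perplexity_from_loss` is hypothetical and used only for illustration.

```python
import math


def perplexity_from_loss(mean_loss: float) -> float:
    """Convert a mean cross-entropy loss (in nats) to perplexity.

    Assumption: perplexity = exp(mean evaluation loss). The model card
    does not state its formula; this is the common convention.
    """
    return math.exp(mean_loss)


# Values copied from the model card's "Best Epoch Performance" list.
reported_loss = 2.0173
reported_perplexity = 7.5178

computed = perplexity_from_loss(reported_loss)

# The loss is rounded to four decimals in the card, so only approximate
# agreement with the reported perplexity can be expected.
print(f"computed perplexity: {computed:.4f}")
print(f"difference from reported value: {abs(computed - reported_perplexity):.4f}")
```

Under that assumption, the two reported numbers agree to within the rounding of the loss, which suggests the perplexity was derived directly from the evaluation loss rather than measured separately.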