zirak-ai/pashto-bert-v1

ijazulhaq committed on Feb 6

Commit fafe39f · verified · 1 parent: 4dc5e64

Update README.md

Files changed (1)

README.md +53 -3

@@ -1,3 +1,53 @@
----
-license: mit
----

+# Pashto BERT (BERT-Base)
+## Model Overview
+This is a monolingual **Pashto BERT (BERT-Base)** model trained on a large **Pashto corpus**. As an encoder-only model, it produces contextual representations of **Pashto** text rather than generating text, which makes it a suitable base for fine-tuning on downstream **Natural Language Processing (NLP) tasks**.
+## Model Details
+- **Architecture:** BERT-Base (12 layers, 768 hidden size, 12 attention heads, 110M parameters)
+- **Language:** Pashto (ps)
+- **Training Corpus:** A diverse set of Pashto text data, including news articles, books, and web content.
+- **Special Tokens:** `[CLS]`, `[SEP]`, `[PAD]`, `[MASK]`, `[UNK]`
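The 110M figure in the architecture line can be checked with a little arithmetic. The sketch below counts the weights of a standard BERT encoder (embeddings, 12 Transformer layers, pooler) from the sizes listed above. The vocabulary size, the 512-position limit, the 3072 feed-forward width, and the function name `bert_param_count` are assumptions, since this card does not state them; 30,522 is the vocabulary size of the original English BERT-Base and is used here only for illustration.

```python
def bert_param_count(vocab_size, hidden=768, layers=12,
                     intermediate=3072, max_positions=512, type_vocab=2):
    """Count the weights of a standard BERT encoder with a pooler."""
    layer_norm = 2 * hidden  # scale and bias

    # Token, position, and segment embeddings, followed by LayerNorm
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm

    # Self-attention: query, key, value, and output projections (weights + biases)
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Feed-forward block: up-projection and down-projection (weights + biases)
    feed_forward = ((hidden * intermediate + intermediate)
                    + (intermediate * hidden + hidden) + layer_norm)

    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler

# Assumed vocabulary size (that of the original English BERT-Base)
total = bert_param_count(vocab_size=30522)
print(f"{total:,} parameters")
```

With the assumed vocabulary the count comes to 109,482,240, which rounds to the quoted 110M. A different Pashto vocabulary size changes only the embedding term; the number of attention heads does not affect the total.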
+## Intended Use
+This model can be **fine-tuned** for various Pashto-specific NLP tasks, such as:
+- **Sequence Classification:** Sentiment analysis, topic classification, and document categorization.
+- **Sequence Tagging:** Named entity recognition (NER) and part-of-speech (POS) tagging.
+- **Text Understanding:** Extractive question answering. Generative tasks such as text summarization and machine translation require pairing this encoder with a decoder.
+## How to Use
+This model can be loaded using the `transformers` library from Hugging Face:
+```python
+from transformers import AutoModel, AutoTokenizer
+
+# Repository ID of this model on the Hugging Face Hub
+model_name = "zirak-ai/pashto-bert-v1"
+
+# Load the tokenizer and the encoder from the same repository
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModel.from_pretrained(model_name)
+
+text = "ستاسو نننۍ ورځ څنګه وه؟"
+tokens = tokenizer(text, return_tensors="pt")
+
+# out.last_hidden_state holds one 768-dimensional vector per input token
+out = model(**tokens)
+```
+## Training Details
+- **Optimizer:** AdamW
+- **Sequence Length:** 128
+- **Warmup Steps:** 10,000
+- **Warmup Ratio:** 0.06
+- **Learning Rate:** 1e-4
+- **Weight Decay:** 0.01
+- **Adam Optimizer Parameters:**
+  - **Epsilon:** 1e-8
+  - **Betas:** (0.9, 0.999)
+- **Gradient Accumulation Steps:** 1
+- **Max Gradient Norm:** 1.0
+- **Scheduler:** `linear_schedule_with_warmup`
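To make the warmup and decay settings above concrete, here is a minimal pure-Python sketch of a linear schedule with warmup: the learning rate rises linearly from 0 to its peak over the warmup steps, then decays linearly to 0. The total number of training steps is not given in this card; `TOTAL_STEPS` below is a hypothetical value obtained by assuming that the 10,000 warmup steps and the 0.06 warmup ratio describe the same run. The function name `lr_at` is also illustrative.

```python
PEAK_LR = 1e-4          # learning rate from the list above
WARMUP_STEPS = 10_000   # warmup steps from the list above
# Hypothetical: assumes warmup_steps / total_steps == 0.06
TOTAL_STEPS = round(WARMUP_STEPS / 0.06)

def lr_at(step, peak_lr=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step (linear warmup, linear decay)."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    remaining = max(0, total - step)
    return peak_lr * remaining / max(1, total - warmup)

for step in (0, 5_000, 10_000, 90_000, TOTAL_STEPS):
    print(f"step {step:>7,}: lr = {lr_at(step):.2e}")
```

In the `transformers` library, `get_linear_schedule_with_warmup` provides a schedule of this shape for a PyTorch optimizer.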
+## Limitations & Biases
+- The model may reflect biases present in the training data.
+- **Low-resource or domain-specific tasks** may require additional fine-tuning to reach acceptable performance.
+- The model was not trained on **code-switched text** (e.g., Pashto mixed with English or other languages).
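Because of the code-switching limitation, it can help to flag mixed-script inputs before passing them to the model. The sketch below is one possible heuristic, not part of this model or its tokenizer: it counts Arabic-script and Latin-script letters using Unicode character names. The function names and the threshold of three Latin letters are assumptions. Script mixing is only a proxy, so it will not detect switching into other Arabic-script languages such as Dari or Urdu.

```python
import unicodedata

def script_counts(text):
    """Count alphabetic characters by script, using Unicode character names."""
    counts = {"arabic": 0, "latin": 0, "other": 0}
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, spaces, and punctuation
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC"):
            counts["arabic"] += 1
        elif name.startswith("LATIN"):
            counts["latin"] += 1
        else:
            counts["other"] += 1
    return counts

def looks_code_switched(text, min_latin_letters=3):
    """Heuristic: Arabic-script text that also contains several Latin letters."""
    counts = script_counts(text)
    return counts["arabic"] > 0 and counts["latin"] >= min_latin_letters

print(looks_code_switched("ستاسو نننۍ ورځ څنګه وه؟"))   # Pashto only
print(looks_code_switched("ستاسو meeting څنګه وه؟"))   # Pashto with an English word
```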