ijazulhaq committed (verified) · Commit fafe39f · 1 Parent(s): 4dc5e64

Update README.md

Files changed (1): README.md (+53, -3)
README.md CHANGED
@@ -1,3 +1,53 @@

Removed (old YAML front matter):

    ---
    license: mit
    ---

Added:

# Pashto BERT (BERT-Base)

## Model Overview
This is a monolingual **Pashto BERT (BERT-Base)** model trained on a large **Pashto corpus**. As an encoder-only model, it produces contextual representations of **Pashto** text, making it suitable for a wide range of downstream **Natural Language Processing (NLP)** tasks.

## Model Details
- **Architecture:** BERT-Base (12 layers, hidden size 768, 12 attention heads, 110M parameters)
- **Language:** Pashto (ps)
- **Training Corpus:** A diverse set of Pashto text data, including news articles, books, and web content.
- **Special Tokens:** `[CLS]`, `[SEP]`, `[PAD]`, `[MASK]`, `[UNK]`
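
These details can be checked directly against the published checkpoint. The snippet below is a minimal sketch; the repository id is the same placeholder used in the usage example further down and must be replaced with the actual model id:

```python
from transformers import AutoConfig, AutoTokenizer

# Placeholder repo id (not the real model id); replace before running.
model_name = "your-huggingface-username/pashto-bert-base"

config = AutoConfig.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# BERT-Base dimensions listed above: 12 layers, hidden size 768, 12 heads.
print(config.num_hidden_layers, config.hidden_size, config.num_attention_heads)

# Special tokens registered in the tokenizer, e.g. [CLS], [SEP], [PAD], [MASK], [UNK].
print(tokenizer.all_special_tokens)
```
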
## Intended Use
This model can be **fine-tuned** for a variety of Pashto-specific NLP tasks (a minimal fine-tuning sketch follows this list), such as:
- **Sequence Classification:** Sentiment analysis, topic classification, and document categorization.
- **Sequence Tagging:** Named entity recognition (NER) and part-of-speech (POS) tagging.
- **Text Understanding:** Extractive question answering; the encoder can also serve as a component in summarization or machine-translation pipelines.
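
As a rough illustration of the sequence-classification use case, the sketch below attaches a classification head to the encoder with `AutoModelForSequenceClassification`. The repository id is a placeholder, and the two-label setup and the example sentence/label are purely hypothetical; real fine-tuning requires a labelled Pashto dataset and a training loop (e.g. `Trainer`):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder repo id (not the real model id); replace before running.
model_name = "your-huggingface-username/pashto-bert-base"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# A randomly initialised 2-class head is added on top of the pre-trained encoder;
# it only becomes useful after fine-tuning on labelled data.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# One hypothetical example: the sentence from the usage section with a dummy label.
batch = tokenizer(["ستاسو نننۍ ورځ څنګه وه؟"], return_tensors="pt",
                  padding=True, truncation=True, max_length=128)
labels = torch.tensor([1])

outputs = model(**batch, labels=labels)
print(outputs.loss, outputs.logits.shape)
```
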
## How to Use
The model can be loaded with the Hugging Face `transformers` library:

```python
from transformers import AutoModel, AutoTokenizer

# Placeholder repo id; replace with the actual model id on the Hub.
model_name = "your-huggingface-username/pashto-bert-base"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Example Pashto sentence: "How was your day today?"
text = "ستاسو نننۍ ورځ څنګه وه؟"
tokens = tokenizer(text, return_tensors="pt")
out = model(**tokens)
print(out.last_hidden_state.shape)  # (batch, sequence_length, 768)
```
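
Because the model was pre-trained with masked-language modelling, it can also be queried through the `fill-mask` pipeline. This assumes the uploaded checkpoint still includes the MLM head; if only the bare encoder was saved, `transformers` will warn that the head is newly initialised:

```python
from transformers import pipeline

# Placeholder repo id (not the real model id); replace before running.
fill_mask = pipeline("fill-mask", model="your-huggingface-username/pashto-bert-base")

# Predict the most likely tokens for the [MASK] position.
for prediction in fill_mask("ستاسو نننۍ ورځ څنګه [MASK]؟"):
    print(prediction["token_str"], prediction["score"])
```
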

## Training Details
The model was pre-trained with the following hyperparameters (a sketch mapping them onto `transformers.TrainingArguments` follows the list):
- **Optimizer:** AdamW
- **Sequence Length:** 128
- **Warmup Steps:** 10,000
- **Warmup Ratio:** 0.06
- **Learning Rate:** 1e-4
- **Weight Decay:** 0.01
- **Adam Optimizer Parameters:**
  - **Epsilon:** 1e-8
  - **Betas:** (0.9, 0.999)
- **Gradient Accumulation Steps:** 1
- **Max Gradient Norm:** 1.0
- **Scheduler:** `linear_schedule_with_warmup`
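
For reference, the hyperparameters above map onto `transformers.TrainingArguments` roughly as follows. This is only a sketch of that mapping, not the original pre-training script; the output directory and batch size are hypothetical, and the 128-token sequence length is applied at tokenisation time rather than here:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="pashto-bert-pretraining",  # hypothetical path
    learning_rate=1e-4,
    weight_decay=0.01,
    adam_epsilon=1e-8,
    adam_beta1=0.9,
    adam_beta2=0.999,
    warmup_steps=10_000,                   # takes precedence over warmup_ratio (0.06)
    max_grad_norm=1.0,
    gradient_accumulation_steps=1,
    lr_scheduler_type="linear",            # linear schedule with warmup
    per_device_train_batch_size=32,        # hypothetical; not reported in this card
)
```
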
## Limitations & Biases
- The model may reflect biases present in the training data.
- Performance on **low-resource or domain-specific tasks** may require additional fine-tuning.
- It is not trained for **code-switching scenarios** (e.g., mixing Pashto with English or other languages).