---

datasets:
- bible-translation
- custom-corpus
language:
- bbj
- en
library_name: transformers
license: apache-2.0
metrics:
- sacrebleu
model_name: DS4H-ICTU/english-ghomala-translation-model-encoderdecoder
pipeline_tag: translation
tags:
- translation
- seq2seq
- low-resource
- ghomala
- encoder-decoder
model_type: encoder-decoder
---

# Ghomala Translation Model

This neural machine translation model is fine-tuned to translate from **English to Ghomala**, a Grassfields (Bamileke) language spoken in western Cameroon.

## 🚀 Architecture
- **Encoder**: `UBC-NLP/serengeti-E250`
- **Decoder**: `gpt2`
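
The two checkpoints listed above can be paired through the `EncoderDecoderModel` class in `transformers`. The following is a minimal sketch of that wiring, not the authors' training script: the helper name `build_model` is hypothetical, and it assumes both checkpoints are accessible from your environment. The call adds cross-attention layers to the decoder with newly initialized weights, which is why fine-tuning on parallel data is needed before the model can translate.

```python
# Checkpoint names taken from the Architecture list above.
ENCODER_NAME = "UBC-NLP/serengeti-E250"
DECODER_NAME = "gpt2"


def build_model(encoder_name: str = ENCODER_NAME, decoder_name: str = DECODER_NAME):
    """Hypothetical helper: pair a pretrained encoder with a pretrained decoder."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import EncoderDecoderModel

    # Loads both checkpoints and adds cross-attention layers to the decoder.
    return EncoderDecoderModel.from_encoder_decoder_pretrained(encoder_name, decoder_name)
```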

## 🏋️ Training Details
- Fine-tuned on a custom English-Ghomala parallel corpus (Bible translations plus additional text)
- Epochs: 10
- Learning rate: 2e-5
- BLEU score (`sacrebleu`) tracked with the `evaluate` library
- Batch size: 2 (with gradient accumulation)
- Optimizer: AdamW
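
With a batch size of 2, gradient accumulation determines how many examples contribute to each optimizer update. The sketch below works through that arithmetic. The accumulation step count and corpus size are hypothetical placeholders, since this card does not state them, and training frameworks may round a final partial batch differently.

```python
import math


def effective_batch_size(per_device_batch: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer update."""
    return per_device_batch * accumulation_steps * num_devices


def optimizer_steps(num_examples: int, effective_batch: int, epochs: int) -> int:
    """Total optimizer updates, rounding up a final partial batch in each epoch."""
    return math.ceil(num_examples / effective_batch) * epochs


# Batch size 2 and 10 epochs come from the list above; the rest are assumptions.
ACCUMULATION_STEPS = 8   # hypothetical value
NUM_EXAMPLES = 10_000    # hypothetical corpus size

batch = effective_batch_size(2, ACCUMULATION_STEPS)
steps = optimizer_steps(NUM_EXAMPLES, batch, epochs=10)
print(batch, steps)
```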

## 📌 Usage Example

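```python
from transformers import pipeline

# Load the English → Ghomala model from the Hugging Face Hub.
translator = pipeline("translation", model="DS4H-ICTU/english-ghomala-translation-model-encoderdecoder")
result = translator("The woman gave water to the prophet.")

# The pipeline returns a list of dicts; the text is under "translation_text".
print(result[0]["translation_text"])
```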
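
The translation pipeline returns a list of dictionaries, each holding its output under the `translation_text` key. A small wrapper can turn that into plain strings for several sentences at once. This is only a sketch: `translate_all` is a hypothetical helper, and it accepts any callable that behaves like the pipeline, so a stand-in function is used here instead of downloading the model.

```python
from typing import Callable, List


def translate_all(translator: Callable, sentences: List[str]) -> List[str]:
    """Hypothetical helper: return one translated string per input sentence."""
    outputs = translator(sentences)
    return [item["translation_text"] for item in outputs]


def fake_translator(sentences):
    # Stand-in that mimics the pipeline's output shape; it does not translate.
    return [{"translation_text": s.upper()} for s in sentences]


print(translate_all(fake_translator, ["hello", "water"]))
```

With the real model, pass the `translator` object created in the usage example in place of `fake_translator`.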

## 🎯 Intended Use

* Cultural and educational preservation
* Language learning and community translation tools

## ⚠️ Limitations

* Trained on a limited amount of Ghomala data, so translation quality is uneven
* May hallucinate content or produce repetitive output
* Supports only the English → Ghomala direction for now
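
Because outputs may be repetitive, it can help to flag suspicious translations before showing them to users. The heuristic below is one possible sketch and is not part of the model: `repetition_ratio`, `looks_repetitive`, and the 0.5 threshold are hypothetical choices, and whitespace tokenization is a simplification.

```python
from collections import Counter


def repetition_ratio(text: str, n: int = 2) -> float:
    """Fraction of word n-grams that repeat an earlier n-gram in the text."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Each occurrence beyond the first counts as a repeat.
    repeated = sum(c - 1 for c in counts.values())
    return repeated / len(ngrams)


def looks_repetitive(text: str, threshold: float = 0.5) -> bool:
    """Hypothetical check: True when at least `threshold` of the bigrams repeat."""
    return repetition_ratio(text) >= threshold
```

Generation arguments such as `no_repeat_ngram_size` or `repetition_penalty`, which `transformers` accepts at generation time, may also reduce repetition, though their effect on this model has not been measured here.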

## 📚 Citation

```
@misc{ghomala_translation_model,
  title={Ghomala Translation Model},
  author={Group 2},
  howpublished={\url{https://huggingface.co/DS4H-ICTU/english-ghomala-translation-model-encoderdecoder}},
  year={2025}
}
```
