---
datasets:
- bible-translation
- custom-corpus
language:
- bbj
- en
library_name: transformers
license: apache-2.0
metrics:
- sacrebleu
model_name: DS4H-ICTU/english-ghomala-translation-model-encoderdecoder
pipeline_tag: translation
tags:
- translation
- seq2seq
- low-resource
- ghomala
- encoder-decoder
model_type: encoder-decoder
---
# Ghomala Translation Model
This is a neural machine translation model fine-tuned to translate from **English to Ghomala**, a Grassfields Bantu (Bamileke) language spoken in western Cameroon.
## 🚀 Architecture
- **Encoder**: `UBC-NLP/serengeti-E250`
- **Decoder**: `gpt2`
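
As a rough illustration of how an encoder and a decoder checkpoint can be paired, the sketch below uses `EncoderDecoderModel.from_encoder_decoder_pretrained` from `transformers`. It is not the authors' training code: the function name `build_encoder_decoder` is hypothetical, and the choice of start and padding tokens is an assumption based on common practice for GPT-2 decoders.

```python
ENCODER_ID = "UBC-NLP/serengeti-E250"
DECODER_ID = "gpt2"


def build_encoder_decoder(encoder_id: str = ENCODER_ID, decoder_id: str = DECODER_ID):
    """Pair a pretrained encoder with a pretrained decoder (downloads weights)."""
    # Imported lazily so that defining this function does not require the library.
    from transformers import AutoTokenizer, EncoderDecoderModel

    # Cross-attention layers are added to the decoder and start out randomly
    # initialised, so the combined model needs fine-tuning before it is useful.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(encoder_id, decoder_id)

    decoder_tokenizer = AutoTokenizer.from_pretrained(decoder_id)
    # Assumption: GPT-2 ships without a pad token, so its end-of-text token is reused.
    model.config.decoder_start_token_id = decoder_tokenizer.bos_token_id
    model.config.pad_token_id = decoder_tokenizer.eos_token_id
    return model
```

Calling `build_encoder_decoder()` returns an untrained pairing of the two checkpoints. The published model already contains the fine-tuned weights, so this step is only relevant when reproducing the setup from scratch.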
## 🏋️ Training Details
- Fine-tuned on a custom parallel corpus of Bible and other text data
- Epochs: 10
- Learning rate: 2e-5
- BLEU score tracked with `evaluate`
- Batch size: 2 (with gradient accumulation)
- Optimizer: AdamW
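
To make the interaction between the small batch size and gradient accumulation concrete, here is a minimal arithmetic sketch. The batch size and epoch count come from the list above. The number of accumulation steps and the corpus size are not stated in this card, so the values used for them are purely hypothetical.

```python
import math

PER_DEVICE_BATCH_SIZE = 2    # from the training details above
NUM_EPOCHS = 10              # from the training details above
GRAD_ACCUM_STEPS = 8         # hypothetical: not stated in this model card
NUM_TRAIN_EXAMPLES = 10_000  # hypothetical corpus size


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def optimizer_steps(num_examples: int, per_device: int, accum_steps: int,
                    epochs: int, num_devices: int = 1) -> int:
    """Approximate number of optimizer updates over the whole run."""
    micro_batches = math.ceil(num_examples / (per_device * num_devices))
    steps_per_epoch = math.ceil(micro_batches / accum_steps)
    return steps_per_epoch * epochs
```

With the assumed values, `effective_batch_size(2, 8)` gives 16 and `optimizer_steps(10_000, 2, 8, 10)` gives 6250. The exact step count in a real run depends on how the training loop handles the last partial batch.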
## 📌 Usage Example
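```python
from transformers import pipeline

# Load the fine-tuned English -> Ghomala model from the Hugging Face Hub
translator = pipeline("translation", model="DS4H-ICTU/english-ghomala-translation-model-encoderdecoder")

# The pipeline returns a list of dicts, one per input sentence
result = translator("The woman gave water to the prophet.")
print(result)
```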
## 🎯 Intended Use
* Cultural and educational preservation
* Language learning and community translation tools
## ⚠️ Limitations
* Trained on a limited amount of Ghomala data, so translation quality is still uneven
* May hallucinate content or produce repetitive output
* Supports only the English → Ghomala direction for now
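
Repetitive output can sometimes be reduced at decoding time. The sketch below passes standard `generate()` arguments through the translation pipeline. The specific values are untuned assumptions, and the helper names `get_translator` and `translate` are hypothetical, so treat this as a starting point rather than a recommended configuration.

```python
from functools import lru_cache

MODEL_ID = "DS4H-ICTU/english-ghomala-translation-model-encoderdecoder"

# Assumed, untuned decoding settings; all are standard `generate()` arguments.
GENERATION_KWARGS = {
    "max_new_tokens": 64,       # cap the output length
    "num_beams": 4,             # beam search instead of greedy decoding
    "no_repeat_ngram_size": 3,  # forbid repeating any 3-gram
    "repetition_penalty": 1.2,  # down-weight tokens already generated
    "early_stopping": True,     # stop once enough beams have finished
}


@lru_cache(maxsize=1)
def get_translator():
    """Load the pipeline once and reuse it (downloads the model on first call)."""
    from transformers import pipeline  # lazy import: only needed when translating
    return pipeline("translation", model=MODEL_ID)


def translate(text: str, **overrides):
    """Translate `text`, letting callers override any decoding setting."""
    settings = {**GENERATION_KWARGS, **overrides}
    return get_translator()(text, **settings)
```

For example, `translate("The woman gave water to the prophet.", num_beams=2)` keeps the other defaults and changes only the beam width. These settings cannot fix hallucinated content, which comes from the limited training data.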
## 📚 Citation
```
@misc{ghomala_translation_model,
title={Ghomala Translation Model},
author={Group 2},
howpublished={\url{https://huggingface.co/DS4H-ICTU/english-ghomala-translation-model-encoderdecoder}},
year={2025}
}
```