metadata
datasets:
- bible-translation
- custom-corpus
language:
- bbj
- en
library_name: transformers
license: apache-2.0
metrics:
- sacrebleu
model_name: DS4H-ICTU/english-ghomala-translation-model-encoderdecoder
pipeline_tag: translation
tags:
- translation
- seq2seq
- low-resource
- ghomala
- encoder-decoder
model_type: encoder-decoder
Ghomala Translation Model
This is a neural machine translation model fine-tuned to translate from English to Ghomala (ISO 639-3: bbj), a Grassfields Bantu language spoken in western Cameroon.
🚀 Architecture
- Encoder: UBC-NLP/serengeti-E250
- Decoder: gpt2
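As an illustration of how an encoder and decoder like these can be combined, here is a minimal sketch using the `EncoderDecoderModel` class from transformers. It is not the authors' training script. The helper name `build_encoder_decoder`, the use of two separate tokenizers, and the choice of GPT-2's EOS token as the padding token are all assumptions made for the example.

```python
ENCODER_NAME = "UBC-NLP/serengeti-E250"
DECODER_NAME = "gpt2"


def build_encoder_decoder(encoder_name=ENCODER_NAME, decoder_name=DECODER_NAME):
    """Warm-start an encoder-decoder model from two pretrained checkpoints.

    Hypothetical helper: the model card does not describe the exact setup.
    """
    # Imported lazily so this module can be read and imported
    # without transformers installed or network access.
    from transformers import AutoTokenizer, EncoderDecoderModel

    # Loads both checkpoints; the decoder gets cross-attention layers
    # that are randomly initialised and must be learned in fine-tuning.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        encoder_name, decoder_name
    )

    # Assumption: source text uses the encoder's tokenizer and
    # target text uses the decoder's tokenizer.
    src_tokenizer = AutoTokenizer.from_pretrained(encoder_name)
    tgt_tokenizer = AutoTokenizer.from_pretrained(decoder_name)

    # GPT-2 ships without a padding token; reusing EOS is a common
    # workaround (assumption, not confirmed by the model card).
    tgt_tokenizer.pad_token = tgt_tokenizer.eos_token

    # Generation needs to know how to start, pad and stop sequences.
    model.config.decoder_start_token_id = tgt_tokenizer.bos_token_id
    model.config.pad_token_id = tgt_tokenizer.pad_token_id
    model.config.eos_token_id = tgt_tokenizer.eos_token_id
    return model, src_tokenizer, tgt_tokenizer
```

Calling `build_encoder_decoder()` downloads both checkpoints, so it needs network access and enough memory for the two models.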
🏋️ Training Details
- Fine-tuned on a custom English-Ghomala parallel corpus combining Bible translations and additional text
- Epochs: 10
- Learning rate: 2e-5
- BLEU score tracked with the evaluate library
- Batch size: 2 (with gradient accumulation)
- Optimizer: AdamW
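To make the interaction between batch size and gradient accumulation concrete, the sketch below collects the listed hyperparameters in a dictionary and estimates the effective batch size and the number of optimizer steps. The keys mirror argument names of `Seq2SeqTrainingArguments` in transformers. The accumulation factor (8) and the corpus size (10,000 sentence pairs) are hypothetical, since the card does not state them, and the step count is only a rough estimate.

```python
import math

# Values from the model card, except where marked hypothetical.
TRAINING_CONFIG = {
    "learning_rate": 2e-5,
    "num_train_epochs": 10,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 8,  # hypothetical: not stated in the card
}


def estimate_optimizer_steps(num_examples, config, num_devices=1):
    """Return (effective batch size, approximate total optimizer steps)."""
    effective_batch = (
        config["per_device_train_batch_size"]
        * config["gradient_accumulation_steps"]
        * num_devices
    )
    # One optimizer step is taken per effective batch.
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch * config["num_train_epochs"]


# Hypothetical corpus of 10,000 sentence pairs.
effective_batch, total_steps = estimate_optimizer_steps(10_000, TRAINING_CONFIG)
print(effective_batch, total_steps)
```

With these assumed numbers, each weight update averages gradients over 16 sentence pairs even though only 2 are held in memory at a time.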
📌 Usage Example
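from transformers import pipeline

# Load the translation pipeline with the fine-tuned English → Ghomala checkpoint
translator = pipeline("translation", model="DS4H-ICTU/english-ghomala-translation-model-encoderdecoder")
# The pipeline returns a list of dicts, each with a "translation_text" key
result = translator("The woman gave water to the prophet.")
print(result[0]["translation_text"])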
🎯 Intended Use
- Cultural and educational preservation
- Language learning and community translation tools
⚠️ Limitations
- Trained on a small amount of Ghomala data, so translation quality is still limited
- May hallucinate content or produce repetitive output
- Translates only from English to Ghomala for now
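Because repeated output is a known failure mode, downstream tools may want to flag suspicious translations before showing them to users. The following is a simple heuristic sketch, not part of the model: the function `has_repeated_ngram` and its thresholds are hypothetical, and splitting on whitespace is only a rough stand-in for real tokenisation.

```python
def has_repeated_ngram(text, n=3, max_repeats=2):
    """Return True if any word n-gram occurs more than max_repeats times."""
    tokens = text.split()
    counts = {}
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        counts[gram] = counts.get(gram, 0) + 1
        if counts[gram] > max_repeats:
            return True
    return False


# Hypothetical generation settings that often reduce repetition.
# no_repeat_ngram_size and num_beams are standard generate() parameters;
# the specific values here are illustrative, not tuned for this model.
GENERATION_KWARGS = {"num_beams": 4, "no_repeat_ngram_size": 3}

looping = has_repeated_ngram("a b c a b c a b c")
clean = has_repeated_ngram("the woman gave water to the prophet")
print(looping, clean)
```

A flagged output could be regenerated with settings such as those in `GENERATION_KWARGS`, which can typically be passed as extra keyword arguments when calling the pipeline.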
📚 Citation
@misc{ghomala_translation_model,
title={Ghomala Translation Model},
author={Group 2},
howpublished={\url{https://huggingface.co/DS4H-ICTU/english-ghomala-translation-model-encoderdecoder}},
year={2025}
}