datasets:
  - bible-translation
  - custom-corpus
language:
  - bbj
  - en
library_name: transformers
license: apache-2.0
metrics:
  - sacrebleu
model_name: DS4H-ICTU/english-ghomala-translation-model-encoderdecoder
pipeline_tag: translation
tags:
  - translation
  - seq2seq
  - low-resource
  - ghomala
  - encoder-decoder
model_type: encoder-decoder

Ghomala Translation Model

This is a neural machine translation model fine-tuned to translate from English to Ghomala (ISO 639-3: bbj), a Grassfields Bantu language of the Bamileke group spoken in western Cameroon.

🚀 Architecture

  • Encoder: UBC-NLP/serengeti-E250
  • Decoder: gpt2
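
One way to pair these two checkpoints is the `EncoderDecoderModel` class from transformers. The following is a minimal sketch, not the authors' confirmed setup: the use of GPT-2's tokenizer on the target side and the special-token choices are assumptions, and `build_encoder_decoder` is a hypothetical helper name.

```python
ENCODER_ID = "UBC-NLP/serengeti-E250"  # encoder named in this model card
DECODER_ID = "gpt2"                    # decoder named in this model card


def build_encoder_decoder():
    """Pair the two checkpoints into one seq2seq model (downloads weights)."""
    # Imported lazily so the constants above can be read without transformers installed.
    from transformers import AutoTokenizer, EncoderDecoderModel

    # Loads both checkpoints and adds randomly initialised cross-attention
    # layers to the decoder; those layers are learned during fine-tuning.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(ENCODER_ID, DECODER_ID)

    # Assumption: GPT-2's own tokenizer is used on the target side. GPT-2 ships
    # without a pad token, so its end-of-text token is reused for padding here.
    target_tokenizer = AutoTokenizer.from_pretrained(DECODER_ID)
    target_tokenizer.pad_token = target_tokenizer.eos_token

    # Generation needs to know how to start, pad, and stop a target sequence.
    model.config.decoder_start_token_id = target_tokenizer.bos_token_id
    model.config.pad_token_id = target_tokenizer.pad_token_id
    model.config.eos_token_id = target_tokenizer.eos_token_id
    return model, target_tokenizer
```

Calling `build_encoder_decoder()` returns an untrained pairing; it would still need fine-tuning on parallel data before producing useful translations.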

🏋️ Training Details

  • Fine-tuned on a custom English-Ghomala parallel corpus of Bible translations plus additional text data
  • Epochs: 10
  • Learning rate: 2e-5
  • BLEU score (sacrebleu) tracked with the evaluate library
  • Batch size: 2 (with gradient accumulation)
  • Optimizer: AdamW
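
The settings above could be expressed as `Seq2SeqTrainingArguments` roughly as follows. This is a sketch under stated assumptions: the number of gradient accumulation steps is not given here, so `ASSUMED_ACCUMULATION_STEPS = 8` is a hypothetical value, and `output_dir="ghomala-mt"` is a placeholder path.

```python
# Values stated in this model card, keyed by Seq2SeqTrainingArguments field names.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "num_train_epochs": 10,
    "per_device_train_batch_size": 2,
    "optim": "adamw_torch",  # AdamW
}

# Hypothetical: the card mentions gradient accumulation but not the step count.
ASSUMED_ACCUMULATION_STEPS = 8


def effective_batch_size(per_device, accumulation_steps, n_devices=1):
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation_steps * n_devices


def build_training_args(output_dir="ghomala-mt"):
    """Create trainer arguments; output_dir is a placeholder path."""
    # Imported lazily so the arithmetic above runs without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        gradient_accumulation_steps=ASSUMED_ACCUMULATION_STEPS,
        predict_with_generate=True,  # generate text at eval time so BLEU can be scored
        **HYPERPARAMS,
    )


# With the assumed 8 accumulation steps on one device: 2 * 8 * 1
print(effective_batch_size(HYPERPARAMS["per_device_train_batch_size"],
                           ASSUMED_ACCUMULATION_STEPS))  # 16
```

Gradient accumulation lets a per-device batch of 2 behave like a larger batch for each optimizer update, at the cost of more forward and backward passes per step.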

📌 Usage Example

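from transformers import pipeline

# Load the English → Ghomala translation pipeline from the Hugging Face Hub
translator = pipeline("translation", model="DS4H-ICTU/english-ghomala-translation-model-encoderdecoder")

# The pipeline returns a list with one dict per input sentence
result = translator("The woman gave water to the prophet.")
print(result[0]["translation_text"])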

🎯 Intended Use

  • Cultural and educational preservation
  • Language learning and community translation tools

⚠️ Limitations

  • Trained on limited Ghomala data, so quality may drop on text unlike the training corpus
  • May hallucinate content or produce repetitive output
  • Works only in English → Ghomala direction for now
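
Repetitive output can sometimes be reduced at decoding time and flagged afterwards. The sketch below is one possible approach, not a tested recipe for this model: `has_repeated_ngram` and `translate` are hypothetical helpers, and the decoding values (`max_length=128`, `num_beams=4`, `no_repeat_ngram_size=3`) are illustrative assumptions.

```python
MODEL_ID = "DS4H-ICTU/english-ghomala-translation-model-encoderdecoder"


def has_repeated_ngram(text, n=3):
    """Return True if any run of n whitespace-separated tokens occurs twice."""
    tokens = text.split()
    seen = set()
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        if gram in seen:
            return True
        seen.add(gram)
    return False


def translate(text):
    """Translate with decoding settings that discourage loops (downloads the model)."""
    # Imported lazily so the checker above works without transformers installed.
    from transformers import pipeline

    translator = pipeline("translation", model=MODEL_ID)
    # Extra keyword arguments are forwarded to the model's generate() call.
    result = translator(text, max_length=128, num_beams=4, no_repeat_ngram_size=3)
    return result[0]["translation_text"]
```

Note that `no_repeat_ngram_size` acts on subword tokens during generation, while `has_repeated_ngram` checks whole words in the finished string, so the two can disagree. A flagged translation is a candidate for human review rather than proof of an error.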

📚 Citation

@misc{ghomala_translation_model,
  title={Ghomala Translation Model},
  author={Group 2},
  howpublished={\url{https://huggingface.co/DS4H-ICTU/english-ghomala-translation-model-encoderdecoder}},
  year={2025}
}