whisper large v3 finetuning using our own dataset
#189, opened by rifasca
I encountered issues while fine-tuning the Whisper-large-v3 model on a 100-hour Arabic dataset using the LoRA-PEFT approach. The resulting transcriptions were highly inaccurate, with excessive hallucinations and frequent duplication of characters.
Hello, I think your LoRA setup is only fine-tuning the query and value projections (named q_proj and v_proj in the Hugging Face Whisper implementation). You could try applying LoRA to all linear layers instead. Also, I believe the Whisper-large-v3 tokenizer performs poorly for low-resource languages.
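For reference, here is a minimal sketch of what "all linear layers" could look like with PEFT. The module names are the ones used in the Hugging Face Transformers Whisper implementation; the helper name build_lora_config and the values for r, lora_alpha, and lora_dropout are illustrative assumptions, not settings from this thread or recommended values.

```python
# Linear layer names inside each Whisper encoder/decoder block
# (Hugging Face Transformers implementation).
ATTENTION_QV = ["q_proj", "v_proj"]  # a common minimal LoRA target set
ALL_LINEAR = ["q_proj", "k_proj", "v_proj", "out_proj", "fc1", "fc2"]


def build_lora_config(target_modules, r=32, lora_alpha=64, lora_dropout=0.05):
    """Hypothetical helper: build a PEFT LoraConfig for the given modules.

    The rank, alpha, and dropout defaults are placeholders to tune,
    not values taken from the original post.
    """
    # Imported lazily so the module lists can be inspected without PEFT installed.
    from peft import LoraConfig

    return LoraConfig(
        r=r,
        lora_alpha=lora_alpha,
        target_modules=list(target_modules),
        lora_dropout=lora_dropout,
        bias="none",
    )


# Usage, assuming `model` is an already loaded WhisperForConditionalGeneration:
#   from peft import get_peft_model
#   model = get_peft_model(model, build_lora_config(ALL_LINEAR))
#   model.print_trainable_parameters()
```

Targeting every linear layer increases the number of trainable parameters compared with the query/value-only setup, so it may be worth comparing both on a held-out Arabic set before committing to a full run.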