---
library_name: transformers
license: apache-2.0
base_model: openai/whisper-base
tags:
- whisper-event
- generated_from_trainer
datasets:
- asierhv/composite_corpus_eu_v2.1
metrics:
- wer
model-index:
- name: Whisper Base Basque
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Mozilla Common Voice 18.0
      type: mozilla-foundation/common_voice_18_0
    metrics:
    - name: Wer
      type: wer
      value: 10.78
language:
- eu
---

# Whisper Base Basque

This model is a fine-tuned version of [openai/whisper-base](https://huggingface.co/openai/whisper-base) for Basque (eu) Automatic Speech Recognition (ASR). It was trained on the [asierhv/composite_corpus_eu_v2.1](https://huggingface.co/datasets/asierhv/composite_corpus_eu_v2.1) dataset, a composite corpus designed to improve Basque ASR performance.

**Key improvements and results compared to the base model:**

* **Significant WER reduction:** The fine-tuned model reaches a Word Error Rate (WER) of 12.3080 on the validation set of the `asierhv/composite_corpus_eu_v2.1` dataset at the final checkpoint (step 10,000), down from 25.7525 at step 1,000, an improvement over the base `whisper-base` model for Basque.
* **Performance on Common Voice:** When evaluated on the Mozilla Common Voice 18.0 dataset, the model achieved a WER of 10.78. This indicates that the model generalizes to other Basque speech datasets, and reflects the accuracy gained from the larger capacity of `whisper-base` relative to `whisper-tiny`.
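
For readers who want to reproduce a WER figure, the metric can be sketched as word-level edit distance divided by the number of reference words, reported here as a percentage. This is a minimal pure-Python illustration, not the evaluation script used for this card: the function names are hypothetical, and the text normalization applied before scoring (casing, punctuation) is not stated in the card, so none is applied.

```python
from typing import List, Sequence


def edit_distance(ref_words: Sequence[str], hyp_words: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def corpus_wer(references: List[str], hypotheses: List[str]) -> float:
    """Corpus-level WER in percent: total edits over total reference words."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        total_edits += edit_distance(ref_words, hyp.split())
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return 100.0 * total_edits / total_words
```

Note that errors are pooled over the whole corpus before dividing, so long utterances weigh more than short ones; averaging per-utterance WER would give a different number.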

## Model description

This model builds upon the `whisper-base` architecture, known for its strong performance in multilingual speech recognition. By fine-tuning this model on a dedicated Basque speech corpus, it specializes in accurately transcribing Basque speech. The `whisper-base` model offers a larger capacity than `whisper-tiny`, resulting in higher accuracy, albeit with increased computational requirements.

## Intended uses & limitations

**Intended uses:**

* High-accuracy automatic transcription of Basque speech.
* Development of advanced Basque speech-based applications.
* Research in Basque speech processing requiring higher accuracy.
* Professional transcription services for the Basque language.
* Applications where slightly higher computational cost is acceptable for improved accuracy.

**Limitations:**

* Performance remains dependent on audio quality, with challenges posed by background noise and poor recording conditions.
* Accuracy may still be affected by highly dialectal or informal Basque speech.
* While demonstrating improved performance, the model may still produce errors, especially with complex linguistic structures.
* `whisper-base` is larger than `whisper-tiny`, so inference is slower and requires more memory and compute.
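
A minimal inference sketch for the transcription uses listed above, assuming the `transformers` ASR pipeline with `ffmpeg` available for decoding audio files. The repository ID is a hypothetical placeholder, since this card does not state where the checkpoint is published; replace it with the actual model ID. Passing `language` and `task` through `generate_kwargs` is an assumption about how you want decoding configured, not something the card prescribes.

```python
# Hypothetical placeholder: substitute the real Hub repository ID of this model.
MODEL_ID = "your-namespace/whisper-base-basque"

# Assumed decoding settings: force Basque transcription rather than
# relying on Whisper's automatic language detection.
GENERATE_KWARGS = {"language": "basque", "task": "transcribe"}


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # split long recordings into 30-second windows
    )
    result = asr(audio_path, generate_kwargs=GENERATE_KWARGS)
    return result["text"]
```

Calling `transcribe("sample.wav")` would download the checkpoint on first use; for batches of files it is cheaper to build the pipeline once and reuse it.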

## Training and evaluation data

* **Training dataset:** [asierhv/composite_corpus_eu_v2.1](https://huggingface.co/datasets/asierhv/composite_corpus_eu_v2.1). This dataset is a carefully curated compilation of Basque speech data, designed to enhance the effectiveness of Basque ASR systems.
* **Evaluation dataset:** The `test` portion of `asierhv/composite_corpus_eu_v2.1`.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

* **learning_rate:** 2.5e-05
* **train_batch_size:** 32
* **eval_batch_size:** 16
* **seed:** 42
* **optimizer:** AdamW with betas=(0.9, 0.999) and epsilon=1e-08
* **lr_scheduler_type:** linear
* **lr_scheduler_warmup_steps:** 500
* **training_steps:** 10000
* **mixed_precision_training:** Native AMP
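
To make the schedule concrete, the sketch below computes the learning rate at a given step from the values listed above (peak 2.5e-05, 500 warmup steps, 10,000 total steps). It assumes the usual meaning of the `linear` scheduler: a linear ramp from zero during warmup, then a linear decay to zero at the final step. The function and constant names are illustrative.

```python
PEAK_LR = 2.5e-5       # learning_rate
WARMUP_STEPS = 500     # lr_scheduler_warmup_steps
TOTAL_STEPS = 10_000   # training_steps


def linear_lr(step: int) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < WARMUP_STEPS:
        # Warmup: ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay: fall linearly from the peak to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 250, 500, 5_250, 10_000):
        print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate is at half its peak both midway through warmup (step 250) and midway through decay (step 5,250).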

### Training results

| Training Loss | Epoch | Step  | Validation Loss | WER      |
|---------------|-------|-------|-----------------|----------|
| 0.4816        | 0.1   | 1000  | 0.5136          | 25.7525  |
| 0.2515        | 0.2   | 2000  | 0.4336          | 19.9950  |
| 0.1792        | 0.3   | 3000  | 0.4054          | 17.6408  |
| 0.2485        | 0.4   | 4000  | 0.3804          | 16.3794  |
| 0.1007        | 0.5   | 5000  | 0.4056          | 15.2554  |
| 0.1296        | 0.6   | 6000  | 0.3731          | 15.3241  |
| 0.1555        | 0.7   | 7000  | 0.3764          | 13.3820  |
| 0.1140        | 0.8   | 8000  | 0.3097          | 12.7513  |
| 0.0775        | 0.9   | 9000  | 0.3170          | 12.4578  |
| 0.0836        | 1.0   | 10000 | 0.3183          | 12.3080  |
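
The table can be read programmatically to compare checkpoints. The short sketch below copies the step, validation loss, and WER columns from the table and picks the best row by each criterion; the variable names are illustrative and nothing here comes from the original training code.

```python
# (step, validation_loss, wer) copied from the training results table.
ROWS = [
    (1000, 0.5136, 25.7525),
    (2000, 0.4336, 19.9950),
    (3000, 0.4054, 17.6408),
    (4000, 0.3804, 16.3794),
    (5000, 0.4056, 15.2554),
    (6000, 0.3731, 15.3241),
    (7000, 0.3764, 13.3820),
    (8000, 0.3097, 12.7513),
    (9000, 0.3170, 12.4578),
    (10000, 0.3183, 12.3080),
]

best_by_wer = min(ROWS, key=lambda row: row[2])
best_by_loss = min(ROWS, key=lambda row: row[1])

# Relative WER reduction between the first and last logged checkpoints.
relative_reduction = (ROWS[0][2] - ROWS[-1][2]) / ROWS[0][2]

if __name__ == "__main__":
    print("lowest WER at step", best_by_wer[0])
    print("lowest validation loss at step", best_by_loss[0])
    print(f"relative WER reduction: {relative_reduction:.1%}")
```

The two criteria disagree: validation loss bottoms out at step 8,000 while WER keeps improving through step 10,000, so selecting a checkpoint by WER rather than loss matters here.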

### Framework versions

* Transformers 4.49.0.dev0
* PyTorch 2.6.0+cu124
* Datasets 3.3.1.dev0
* Tokenizers 0.21.0