asahi417 committed
Commit a245fe1 (verified) · 1 Parent(s): 1b5afd1

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -47,7 +47,7 @@ We compare our kotoba-whisper-bilingual with OpenAI whisper models, kotoba-whisp
47
  OpenAI whisper is not trained for English to Japanese speech-to-text translation, and other models are specific to the Task (eg. kotoba-whisper is Japanese ASR and
48
  distil whisper is English ASR only).
49
 
50
- ### Speech2Text Translation (Japanese->English)
51
 
52
  | model | [CoVoST2 (Ja->En)](https://huggingface.co/datasets/japanese-asr/ja2en.s2t_translation)| [Fleurs (Ja->En)](https://huggingface.co/datasets/japanese-asr/ja2en.s2t_translation) |
53
  |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------:|
@@ -65,7 +65,7 @@ distil whisper is English ASR only).
65
  | [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 377.2 | 474 |
66
 
67
 
68
- ### Speech2Text Translation (English->Japanese)
69
 
70
  | model | [CoVoST2 (En->Ja)](https://huggingface.co/datasets/japanese-asr/en2ja.s2t_translation)| [Fleurs (En->JA)](https://huggingface.co/datasets/japanese-asr/en2ja.s2t_translation) |
71
  |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------:|
@@ -83,7 +83,7 @@ distil whisper is English ASR only).
83
  | [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 185.2 | 200.5 |
84
 
85
 
86
- ### ASR (Japanese)
87
 
88
  | model | [CommonVoice 8 (Japanese test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) | [JSUT Basic 5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) | [ReazonSpeech (held out test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
89
  |:--------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------:|----------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------------:|
@@ -101,7 +101,7 @@ distil whisper is English ASR only).
101
 
102
 
103
 
104
- ### ASR (English)
105
 
106
  | model | [ESB](https://huggingface.co/datasets/japanese-asr/en_asr.esb_eval) (ami) | [ESB](https://huggingface.co/datasets/japanese-asr/en_asr.esb_eval) (earnings22) | [ESB](https://huggingface.co/datasets/japanese-asr/en_asr.esb_eval) (librispeech) | [ESB](https://huggingface.co/datasets/japanese-asr/en_asr.esb_eval) (tedlium) | [ESB](https://huggingface.co/datasets/japanese-asr/en_asr.esb_eval) (voxpopuli) |
107
  |:----------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------:|-----------------------------------------------------------------------------------:|------------------------------------------------------------------------------------:|--------------------------------------------------------------------------------:|----------------------------------------------------------------------------------:|
 
47
  OpenAI whisper is not trained for English to Japanese speech-to-text translation, and other models are specific to the Task (eg. kotoba-whisper is Japanese ASR and
48
  distil whisper is English ASR only).
49
 
50
+ ### Speech2Text Translation (Japanese->English): WER
51
 
52
  | model | [CoVoST2 (Ja->En)](https://huggingface.co/datasets/japanese-asr/ja2en.s2t_translation)| [Fleurs (Ja->En)](https://huggingface.co/datasets/japanese-asr/ja2en.s2t_translation) |
53
  |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------:|
 
65
  | [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 377.2 | 474 |
66
 
67
 
68
+ ### Speech2Text Translation (English->Japanese): CER
69
 
70
  | model | [CoVoST2 (En->Ja)](https://huggingface.co/datasets/japanese-asr/en2ja.s2t_translation)| [Fleurs (En->JA)](https://huggingface.co/datasets/japanese-asr/en2ja.s2t_translation) |
71
  |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------:|
 
83
  | [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 185.2 | 200.5 |
84
 
85
 
86
+ ### ASR (Japanese): CER
87
 
88
  | model | [CommonVoice 8 (Japanese test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) | [JSUT Basic 5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) | [ReazonSpeech (held out test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
89
  |:--------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------:|----------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------------:|
 
101
 
102
 
103
 
104
+ ### ASR (English): WER
105
 
106
  | model | [ESB](https://huggingface.co/datasets/japanese-asr/en_asr.esb_eval) (ami) | [ESB](https://huggingface.co/datasets/japanese-asr/en_asr.esb_eval) (earnings22) | [ESB](https://huggingface.co/datasets/japanese-asr/en_asr.esb_eval) (librispeech) | [ESB](https://huggingface.co/datasets/japanese-asr/en_asr.esb_eval) (tedlium) | [ESB](https://huggingface.co/datasets/japanese-asr/en_asr.esb_eval) (voxpopuli) |
107
  |:----------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------:|-----------------------------------------------------------------------------------:|------------------------------------------------------------------------------------:|--------------------------------------------------------------------------------:|----------------------------------------------------------------------------------:|
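The commit labels each results table with its metric: WER (word error rate) for English output and CER (character error rate) for Japanese output. As a minimal sketch of how such scores are commonly computed, assuming whitespace tokenization for WER and plain character splitting for CER, the example below derives both from an edit distance. The function names are hypothetical, and the evaluation pipeline actually used for the README (including any text normalization) is not shown in this diff, so its numbers may be produced differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance as a percentage of the reference length."""
    if not ref_units:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_units, hyp_units) / len(ref_units)


def wer(reference, hypothesis):
    # Assumption: words are separated by whitespace (reasonable for English).
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    # Assumption: spaces are ignored and every remaining character is one unit,
    # which avoids the need for word segmentation in Japanese.
    return error_rate(list(reference.replace(" ", "")),
                      list(hypothesis.replace(" ", "")))


if __name__ == "__main__":
    print(wer("the cat", "the cat sat on the mat"))
    print(cer("こんにちは", "こんばんは"))
```

Because the denominator is the reference length, a hypothesis with many inserted units can score above 100, which is consistent with values such as 377.2 and 474 in the tables.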