The models were trained using connectionist temporal classification (CTC) so the model output has to be decoded using | |
[Wav2Vec2CTCTokenizer]. |
The models were trained using connectionist temporal classification (CTC) so the model output has to be decoded using | |
[Wav2Vec2CTCTokenizer]. |