The model accepts log-mel filter bank features extracted from the audio waveform and is pretrained autoregressively to generate a transcript or translation.
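To make the input representation concrete, here is a minimal sketch of how log-mel filter bank features could be computed from a raw waveform. It is not the model's actual feature extractor: the function names, the 16 kHz sample rate, the 256-sample Hann window, the 128-sample hop, and the 8 mel bins are all illustrative assumptions, and a naive DFT is used only to keep the example dependency-free (a real pipeline would rely on an FFT-based audio library).

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_bank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    # n_mels + 2 edge points: each filter spans three consecutive points.
    edges_hz = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Power of the non-negative frequency bins, via a naive O(n^2) DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_mel_features(waveform, sample_rate=16000, n_fft=256, hop=128, n_mels=8):
    """Return one list of n_mels log energies per analysis frame.

    All default values are assumptions chosen for illustration.
    """
    # Symmetric Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / (n_fft - 1))
              for t in range(n_fft)]
    bank = mel_filter_bank(n_mels, n_fft, sample_rate)
    features = []
    for start in range(0, len(waveform) - n_fft + 1, hop):
        frame = [w * x for w, x in zip(window, waveform[start:start + n_fft])]
        power = power_spectrum(frame)
        # Small constant avoids log(0) for empty filters.
        features.append([math.log(sum(wt * p for wt, p in zip(row, power)) + 1e-10)
                         for row in bank])
    return features


# Usage: a 1024-sample 440 Hz sine at 16 kHz yields 7 frames of 8 features.
tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1024)]
feats = log_mel_features(tone)
print(len(feats), len(feats[0]))  # 7 8
```

The resulting frames-by-bins matrix is the kind of sequence an encoder would consume in place of raw samples; the decoder then emits output tokens one at a time, each conditioned on the tokens generated so far.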