The model accepts log-mel filter bank features extracted from the audio waveform and is pretrained autoregressively to generate a transcript or translation.
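To make "log-mel filter bank features" concrete, here is a minimal sketch in standard-library Python of how they can be computed from a waveform: split the signal into overlapping windowed frames, take each frame's power spectrum, pool it with triangular filters spaced evenly on the mel scale, and take the logarithm. All function names and the frame size, hop, and filter count are illustrative assumptions chosen to keep the example small. They are not the model's actual preprocessing settings, and a real pipeline would use an FFT library rather than the naive DFT shown here.

```python
import math


def hz_to_mel(f):
    # Common HTK-style mel formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale, one row per filter."""
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2)
    # n_mels + 2 edge frequencies in Hz: each filter spans three consecutive edges.
    edges = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (slow, for illustration only)."""
    n = len(frame)
    out = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        out.append(re * re + im * im)
    return out


def log_mel_features(signal, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Return a list of frames, each a list of n_mels log10 mel energies."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / n_fft) for t in range(n_fft)]
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    feats = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(n_fft)]
        spec = power_spectrum(frame)
        energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        # Floor the energy before the log to avoid log(0) on silent bands.
        feats.append([math.log10(max(e, 1e-10)) for e in energies])
    return feats


# Toy input: 256 samples of a 1 kHz sine sampled at 16 kHz.
sample_rate = 16000
signal = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
features = log_mel_features(signal, sample_rate)
print(len(features), len(features[0]))
```

With these toy settings the result is a small frames-by-filters matrix. A matrix of this kind, computed with the model's own settings, is what the model consumes in place of raw audio samples.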