def prepare_dataset(batch):
    audio = batch["audio"]
    batch = processor(audio["array"], sampling_rate=audio["sampling_rate"], text=batch["transcription"])
    batch["input_length"] = len(batch["input_values"][0])
    return batch

To apply the preprocessing function over the entire dataset, use the 🤗 Datasets [~datasets.Dataset.map] function.
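
To see what `prepare_dataset` returns for a single example, the following is a minimal sketch that runs without any model download. `FakeProcessor` is a hypothetical stand-in, not a library class: it only assumes that the real `processor` returns a dict-like object with an `input_values` entry holding one sequence per audio clip. The toy example values and the `labels` encoding are likewise made up for illustration.

```python
# Hypothetical stand-in for the real processor (an assumption, not a library class).
# It mimics only the shape of the output: "input_values" holds one sequence per clip.
class FakeProcessor:
    def __call__(self, audio_array, sampling_rate, text):
        return {
            "input_values": [list(audio_array)],
            "labels": [ord(ch) for ch in text],  # toy label encoding, illustration only
        }


processor = FakeProcessor()


def prepare_dataset(batch):
    audio = batch["audio"]
    # `batch` is rebound to the processor output, so the returned dict
    # contains only the processor's keys plus "input_length".
    batch = processor(audio["array"], sampling_rate=audio["sampling_rate"], text=batch["transcription"])
    batch["input_length"] = len(batch["input_values"][0])
    return batch


# A made-up example with the same keys the function reads.
example = {
    "audio": {"array": [0.0, 0.1, -0.1, 0.2], "sampling_rate": 16000},
    "transcription": "HI",
}

prepared = prepare_dataset(example)
print(sorted(prepared))          # keys of the returned dict
print(prepared["input_length"])  # number of samples in the clip

# With a real 🤗 Datasets object (called `dataset` here, an assumed name),
# the function would typically be applied to every example like this:
# encoded = dataset.map(prepare_dataset, remove_columns=dataset.column_names)
```

Because `Dataset.map` merges the returned dict into each example, the original `audio` and `transcription` columns stay in the dataset unless they are dropped, which is why the commented call passes `remove_columns`.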