	Specify a maximum sample length, and the feature extractor will either pad or truncate the sequences to match it:

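	def preprocess_function(examples):
	    # Collect the raw waveform of every example in the batch.
	    audio_arrays = [x["array"] for x in examples["audio"]]
	    # `feature_extractor` is assumed to have been loaded earlier.
	    inputs = feature_extractor(
	        audio_arrays,
	        sampling_rate=16000,
	        padding=True,
	        max_length=100000,
	        truncation=True,
	    )
	    return inputs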

	Apply the preprocess_function to the first few examples in the dataset:

	processed_dataset = preprocess_function(dataset[:5])

	The sample lengths are now the same and match the specified maximum length.
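
To make the pad-or-truncate idea concrete, here is a minimal pure-Python sketch. It is not the feature extractor's actual implementation; the helper name `pad_or_truncate` and its `padding_value` argument are hypothetical and chosen only for illustration. It assumes each sample is a plain list of floats and that every output should have exactly `max_length` values.

```python
def pad_or_truncate(samples, max_length, padding_value=0.0):
    """Return copies of `samples` that all contain exactly `max_length` values."""
    fixed = []
    for sample in samples:
        # Truncate: keep at most the first `max_length` values.
        clipped = list(sample[:max_length])
        # Pad: append `padding_value` until the target length is reached.
        clipped.extend([padding_value] * (max_length - len(clipped)))
        fixed.append(clipped)
    return fixed


# Hypothetical toy batch: one sample longer and one shorter than the target.
batch = [[0.1, 0.2, 0.3, 0.4, 0.5], [0.6, 0.7]]
result = pad_or_truncate(batch, max_length=3)
lengths = [len(sample) for sample in result]
print(lengths)
```

In this toy batch the first sample is cut down and the second is filled with zeros, so both end up with the same length, which mirrors the outcome described above.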