	Preprocess
	The next step is to load a Wav2Vec2 processor to process the audio signal:

	from transformers import AutoProcessor
	processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base")

	The MInDS-14 dataset has a sampling rate of 8000 Hz (you can find this information in its dataset card). The pretrained Wav2Vec2 model was trained on 16000 Hz audio, so you'll need to resample the dataset to 16000 Hz before using it:

	from datasets import Audio
	minds = minds.cast_column("audio", Audio(sampling_rate=16_000))
	minds["train"][0]
	{'audio': {'array': array([-2.38064706e-04, -1.58618059e-04, -5.43987835e-06, ...,
	2.78103951e-04, 2.38446111e-04, 1.18740834e-04], dtype=float32),
	'path': '/root/.cache/huggingface/datasets/downloads/extracted/f14948e0e84be638dd7943ac36518a4cf3324e8b7aa331c5ab11541518e9368c/en-US~APP_ERROR/602ba9e2963e11ccd901cd4f.wav',
	'sampling_rate': 16000},
	'path': '/root/.cache/huggingface/datasets/downloads/extracted/f14948e0e84be638dd7943ac36518a4cf3324e8b7aa331c5ab11541518e9368c/en-US~APP_ERROR/602ba9e2963e11ccd901cd4f.wav',
	'transcription': "hi I'm trying to use the banking app on my phone and currently my checking and savings account balance is not refreshing"}
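To build some intuition for what resampling does to the array, here is a minimal sketch that doubles the sampling rate of a synthetic tone using plain linear interpolation. It is only an illustration: the helper `resample_linear` and the test tone are hypothetical, and this is not the method `datasets` uses internally, which relies on a proper audio resampling backend. The point is simply that the duration stays the same while the number of samples scales with the sampling rate.

```python
import math


def resample_linear(samples, orig_rate, target_rate):
    """Resample a mono signal by linear interpolation (illustrative only)."""
    if not samples:
        return []
    # The duration is fixed, so the sample count scales with the rate.
    n_out = round(len(samples) * target_rate / orig_rate)
    step = orig_rate / target_rate  # input samples advanced per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), len(samples) - 1)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        # Blend the two neighbouring input samples.
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


# One second of a 440 Hz tone sampled at 8000 Hz (synthetic stand-in data).
tone_8k = [math.sin(2 * math.pi * 440 * t / 8_000) for t in range(8_000)]
tone_16k = resample_linear(tone_8k, orig_rate=8_000, target_rate=16_000)

print(len(tone_8k), len(tone_16k))  # 8000 16000
```

In practice you would let `cast_column` handle this, since linear interpolation is a crude approximation compared with a filter-based resampler.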

	As you can see in the transcription above, the text contains a mix of upper and lowercase characters.
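Mixed case matters if the tokenizer's vocabulary only covers one case. Assuming an uppercase-only vocabulary (an assumption here; check the vocabulary of the checkpoint you load), one way to normalize the text is sketched below. The helper name `uppercase_transcription` is hypothetical, and the example dict is a shortened stand-in for a dataset row.

```python
def uppercase_transcription(example):
    """Return a copy of the example with its transcription uppercased."""
    # `example` is assumed to be a dict with a "transcription" string.
    return {**example, "transcription": example["transcription"].upper()}


sample = {"transcription": "hi I'm trying to use the banking app on my phone"}
print(uppercase_transcription(sample)["transcription"])
# HI I'M TRYING TO USE THE BANKING APP ON MY PHONE
```

A function of this shape could then be applied to every row with the dataset's `map` method, for example `minds.map(uppercase_transcription)`.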