Audio format problem.

#194
by pantorn - opened

import os

from dotenv import load_dotenv
from huggingface_hub import InferenceClient

# Load HUGGINGFACE_TOKEN from a local .env file
load_dotenv()

client = InferenceClient(
    provider="hf-inference",
    api_key=os.getenv("HUGGINGFACE_TOKEN"),
)

# Transcribe a local audio file with Whisper
output = client.automatic_speech_recognition("what.flac", model="openai/whisper-large-v3")

print("Transcription:", output)

Above is my code.

This is the error:
Traceback (most recent call last):
  File "/Users/pantornchuavallee/satori/kaizen/English-Proficiency-Voice-Recognition/app.py", line 27, in <module>
    output = client.automatic_speech_recognition("what.flac", model="openai/whisper-large-v3")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pantornchuavallee/satori/kaizen/English-Proficiency-Voice-Recognition/.venv/lib/python3.12/site-packages/huggingface_hub/inference/_client.py", line 443, in automatic_speech_recognition
    response = self._inner_post(request_parameters)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pantornchuavallee/satori/kaizen/English-Proficiency-Voice-Recognition/.venv/lib/python3.12/site-packages/huggingface_hub/inference/_client.py", line 279, in _inner_post
    hf_raise_for_status(response)
  File "/Users/pantornchuavallee/satori/kaizen/English-Proficiency-Voice-Recognition/.venv/lib/python3.12/site-packages/huggingface_hub/utils/_http.py", line 465, in hf_raise_for_status
    raise _format(BadRequestError, message, response) from e
huggingface_hub.errors.BadRequestError: (Request ID: Root=1-684fa17d-6af07d610825a92b76b83f21;30ef6708-9a78-4816-9bd9-c7fdf0818483)

Bad request:
Content type "None" not supported.
Supported content types are:
application/json, application/json; charset=UTF-8, text/csv, text/plain, image/png, image/jpeg, image/jpg, image/tiff, image/bmp, image/gif, image/webp, image/x-image, audio/x-flac, audio/flac, audio/mpeg, audio/x-mpeg-3, audio/wave, audio/wav, audio/x-wav, audio/ogg, audio/x-audio, audio/webm, audio/webm;codecs=opus, audio/AMR, audio/amr, audio/AMR-WB, audio/AMR-WB+, audio/m4a, audio/x-m4a

I tried both FLAC and WAV files and got the same error with each.
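For anyone trying to reproduce this, here is a rough sketch of a local check that maps a file extension to a content type and compares it against the audio types listed in the error message. It only helps confirm that the files themselves are in a listed format. It does not call the API and says nothing about how `huggingface_hub` sets the request header. The extension mapping and the helper names `guess_audio_content_type` and `is_supported` are my own assumptions for illustration, not part of any library.

```python
from pathlib import Path
from typing import Optional

# Audio content types copied from the "Supported content types" error message.
SUPPORTED_AUDIO_TYPES = {
    "audio/x-flac", "audio/flac", "audio/mpeg", "audio/x-mpeg-3",
    "audio/wave", "audio/wav", "audio/x-wav", "audio/ogg", "audio/x-audio",
    "audio/webm", "audio/webm;codecs=opus", "audio/AMR", "audio/amr",
    "audio/AMR-WB", "audio/AMR-WB+", "audio/m4a", "audio/x-m4a",
}

# Hypothetical extension-to-type mapping (an assumption, not from the API).
EXTENSION_TO_CONTENT_TYPE = {
    ".flac": "audio/flac",
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".ogg": "audio/ogg",
    ".webm": "audio/webm",
    ".m4a": "audio/m4a",
    ".amr": "audio/amr",
}


def guess_audio_content_type(path: str) -> Optional[str]:
    """Return a content type based on the file extension, or None if unknown."""
    return EXTENSION_TO_CONTENT_TYPE.get(Path(path).suffix.lower())


def is_supported(content_type: Optional[str]) -> bool:
    """True if the content type appears in the list from the error message."""
    return content_type in SUPPORTED_AUDIO_TYPES


if __name__ == "__main__":
    for name in ("what.flac", "what.wav"):
        content_type = guess_audio_content_type(name)
        print(name, content_type, is_supported(content_type))
```

Both of my files pass a check like this, which is why the `"None"` in the error is confusing: the formats are on the supported list, yet the request seems to arrive without a content type.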
