Official website demo

#36
by Daemontatox - opened

Is there code available for the inference used on the main site, or something similar? The cookbooks are nowhere near as fast, or anywhere near real time.
Maybe I am doing something wrong, but the model is very slow. I tried it with flash_attn_2 and sdpa, and both are very slow. My use cases were text-to-audio and audio-to-audio.

The same. The API used in the official demo supports streaming decoding.
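
For anyone comparing the two approaches, here is a minimal sketch of why streaming decoding feels faster. It is not the demo's actual API: `decode_steps`, `batch_decode`, and `steps_until_first_output` are hypothetical names, and the decoder is a stand-in that yields one placeholder chunk per step. It counts how many decoding steps must finish before the caller receives anything.

```python
from typing import Iterator, List


def decode_steps(n_chunks: int) -> Iterator[str]:
    """Hypothetical decoder: yields one output chunk per decoding step."""
    for i in range(n_chunks):
        yield f"chunk-{i}"


def batch_decode(n_chunks: int) -> List[str]:
    # Non-streaming: the caller sees nothing until every step has finished.
    return list(decode_steps(n_chunks))


def steps_until_first_output(n_chunks: int, streaming: bool) -> int:
    """Count decoding steps completed before the caller receives any output."""
    if streaming:
        steps = 0
        for _ in decode_steps(n_chunks):
            steps += 1
            return steps  # the first chunk can be played or shown right away
        return steps  # nothing was produced (n_chunks == 0)
    # Non-streaming: all chunks arrive together after the final step.
    return len(batch_decode(n_chunks))


if __name__ == "__main__":
    print(steps_until_first_output(50, streaming=True))
    print(steps_until_first_output(50, streaming=False))
```

Total work is the same in both cases; streaming only changes when the first chunk becomes available, which is what matters for near-real-time audio.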

@shanhaidexiamo thanks for the response, but I was talking about the Qwen Chat usage and implementation of the model; it's a version of WebRTC.

Also, the official demo on Hugging Face is not working, and I have tried everything I could think of from the GitHub cookbook and web demo files.
If you have working code, it would be great if you could share it here.
