inputs = processor(text=text, return_tensors="pt")
Create a spectrogram with your model:
spectrogram = model.generate_speech(inputs["input_ids"], speaker_embeddings)
Optionally, visualize the spectrogram:
import matplotlib.pyplot as plt

plt.figure()
plt.imshow(spectrogram.T)
plt.show()
Finally, use the vocoder to turn the spectrogram into sound.
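The vocoder step can be sketched as follows. This is a minimal illustration, not the tutorial's own code: `spectrogram_to_wav` and `fake_vocoder` are hypothetical names, and the sketch assumes the vocoder is a callable that maps a spectrogram to a 1-D waveform of floats in [-1, 1] sampled at 16 kHz. With a real PyTorch vocoder you would likely also want to run the call under `torch.no_grad()`.

```python
import math
import struct
import wave


def spectrogram_to_wav(spectrogram, vocoder, path, sample_rate=16000):
    """Run `vocoder` on `spectrogram` and write the result as a 16-bit mono WAV file.

    Assumes `vocoder(spectrogram)` returns a 1-D sequence of floats in [-1, 1].
    Returns the number of samples written.
    """
    waveform = vocoder(spectrogram)
    # Tensor-like outputs expose detach()/tolist(); plain lists pass through unchanged.
    if hasattr(waveform, "detach"):
        waveform = waveform.detach()
    if hasattr(waveform, "tolist"):
        waveform = waveform.tolist()
    # Clip to [-1, 1] and scale to signed 16-bit integers.
    pcm = [int(max(-1.0, min(1.0, sample)) * 32767) for sample in waveform]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)  # assumed 16 kHz output
        wav_file.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
    return len(pcm)


def fake_vocoder(spectrogram):
    """Hypothetical stand-in for a real vocoder: 256 samples of a 440 Hz tone per frame."""
    n_samples = 256 * len(spectrogram)
    return [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(n_samples)]
```

To try it without loading any model, call `spectrogram_to_wav([[0.0] * 80] * 4, fake_vocoder, "speech.wav")`. To use it in this walkthrough, pass the `spectrogram` and your loaded vocoder instead of the stand-ins.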