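from IPython.display import Audio

# Convert the spectrogram to a waveform; no gradients are needed at inference.
with torch.no_grad():
    speech = vocoder(spectrogram)

# Play the generated waveform at a 16 kHz sampling rate.
Audio(speech.numpy(), rate=16000)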

In our experience, obtaining satisfactory results from this model can be challenging.