---
license: mit
datasets:
- ryota-komatsu/libritts-r-mhubert-2000units
language:
- en
base_model:
- ryota-komatsu/fastspeech2_conformer_hifigan
---
This repository hosts a [Conditional Flow Matching-based acoustic model](https://arxiv.org/abs/2306.15687) paired with a [HiFi-GAN](https://arxiv.org/abs/2010.05646) vocoder.
It is the model repository for the [speech_resynth GitHub project](https://github.com/ryota-komatsu/speech_resynth).
The model was trained on HuBERT units extracted from [LibriTTS-R](https://arxiv.org/abs/2305.18802) and [EXPRESSO](https://arxiv.org/abs/2308.05725) audio downsampled to 16 kHz.
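
Because training used 16 kHz audio, input recorded at another sample rate would presumably need to be resampled before unit extraction. The sketch below is a minimal illustration, not code from the project: the helper names `resample_factors` and `resampled_length` are hypothetical, and the 24 kHz and 48 kHz source rates are illustrative assumptions. It only computes the integer up/down factors for rational resampling and the resulting sample count.

```python
from fractions import Fraction
from typing import Tuple

TARGET_SR = 16_000  # target sample rate stated in this model card


def resample_factors(source_sr: int, target_sr: int = TARGET_SR) -> Tuple[int, int]:
    """Return reduced (up, down) integer factors for rational resampling.

    Hypothetical helper for illustration; not part of the speech_resynth project.
    """
    if source_sr <= 0 or target_sr <= 0:
        raise ValueError("sample rates must be positive")
    ratio = Fraction(target_sr, source_sr)  # Fraction reduces to lowest terms
    return ratio.numerator, ratio.denominator


def resampled_length(num_samples: int, source_sr: int, target_sr: int = TARGET_SR) -> int:
    """Number of samples after resampling, rounded up to a whole sample."""
    up, down = resample_factors(source_sr, target_sr)
    return -(-num_samples * up // down)  # ceiling division on integers


if __name__ == "__main__":
    # Assumed source rates, chosen only to illustrate the arithmetic.
    for sr in (24_000, 48_000):
        up, down = resample_factors(sr)
        print(f"{sr} Hz -> {TARGET_SR} Hz: up={up}, down={down}")
```

The resulting factors could then be handed to a polyphase resampler such as `scipy.signal.resample_poly(x, up, down)`; whether the project itself resamples this way is not stated here.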