---
license: mit
datasets:
- ryota-komatsu/libritts-r-mhubert-2000units
language:
- en
base_model:
- ryota-komatsu/fastspeech2_conformer_hifigan
---

A [conditional flow matching-based acoustic model](https://arxiv.org/abs/2306.15687) with a [HiFi-GAN](https://arxiv.org/abs/2010.05646) vocoder.

This repository hosts the model for the [speech_resynth](https://github.com/ryota-komatsu/speech_resynth) GitHub project.

The model was trained on HuBERT units extracted from [LibriTTS-R](https://arxiv.org/abs/2305.18802) and [EXPRESSO](https://arxiv.org/abs/2308.05725) audio downsampled to 16 kHz.
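
To give a feel for what 16 kHz downsampling and unit extraction imply in practice, here is a minimal bookkeeping sketch. It is not taken from the project's code. The helper names `resample_factors` and `num_units` are hypothetical, the 24 kHz source rate is an assumption based on LibriTTS-R's native sampling rate, and the convolution layout is assumed to be that of the standard HuBERT Base front end, which may differ from the unit extractor actually used.

```python
from fractions import Fraction

TARGET_HZ = 16_000  # sampling rate stated in this model card

# Assumption: (kernel, stride) pairs of the standard HuBERT Base conv front end.
HUBERT_CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]


def resample_factors(source_hz: int, target_hz: int = TARGET_HZ) -> tuple[int, int]:
    """Return (up, down) integer factors for rational resampling."""
    ratio = Fraction(target_hz, source_hz)
    return ratio.numerator, ratio.denominator


def num_units(num_samples: int) -> int:
    """Count the frames (one unit each, before any deduplication) for a waveform."""
    length = num_samples
    for kernel, stride in HUBERT_CONV_LAYERS:
        if length < kernel:
            return 0  # too short to pass through this layer
        length = (length - kernel) // stride + 1
    return length


if __name__ == "__main__":
    # Assumption: 24 kHz source audio (the native rate of LibriTTS-R).
    up, down = resample_factors(24_000)
    print(f"24 kHz -> 16 kHz: upsample by {up}, downsample by {down}")
    print(f"1 s of 16 kHz audio -> {num_units(TARGET_HZ)} units")
```

Under these assumptions, the total stride is 320 samples, so the unit sequence runs at roughly 50 units per second of 16 kHz audio.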