havok2/Kartoffelbox-v0.1_0.65h2

havok2 committed 25 days ago

Commit 80ac382 · verified · 1 Parent(s): d5be189

Update README.md

Files changed (1)

README.md +6 -6

@@ -16,19 +16,19 @@ tags:
 ## Model Description
-This repository contains an experimental, **standalone** German Text-to-Speech model based on the [Chatterbox-TTS](https://github.com/anotherjesse/Chatterbox-TTS) framework.
 This model is a **hybrid** created by merging two fine-tuned models:
 1.  The well-known German TTS "patch" [SebastianBodza/Kartoffelbox-v0.1](https://huggingface.co/SebastianBodza/Kartoffelbox-v0.1).
-2.  A custom model extensively fine-tuned on a specific [male/female] German voice.
-The goal was to combine the natural German prosody of `Kartoffelbox` with the unique vocal identity and robustness of the custom-trained model. The final weights are a **65/35 merge**, favoring the custom model. Unlike patch-based models, this is a complete, self-contained model that can be loaded directly.
 **Key Features:**
 - **Language:** German
-- **Type:** Standalone, Merged Hybrid Model
-- **Capabilities:** High-quality speech synthesis and Zero-Shot Voice Cloning.
-- **Vocal Characteristics:** [Describe what you hear here. E.g., A clear, male voice with a very natural German intonation, sounding less robotic than many standard models.]
 ## How to Use the Model

 ## Model Description
+This repository contains an experimental, **standalone** German Text-to-Speech model based on the [Chatterbox](https://github.com/resemble-ai/chatterbox) framework.
 This model is a **hybrid** created by merging two fine-tuned models:
 1.  The well-known German TTS "patch" [SebastianBodza/Kartoffelbox-v0.1](https://huggingface.co/SebastianBodza/Kartoffelbox-v0.1).
+2.  A custom model extensively fine-tuned on a large, diverse dataset of German voices (~12,000 samples).
+The goal was to create a robust, general-purpose German TTS model by combining the natural prosody of `Kartoffelbox` with a model trained on a wide variety of voices and data types. The final weights are a **65/35 merge**, favoring the custom-trained, multi-speaker model. Unlike patch-based models, this is a complete, self-contained model that can be loaded directly.
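The README does not say how the 65/35 merge was computed. A minimal sketch, assuming it is a plain linear interpolation of matching parameters, might look like the following. The function name `merge_weights`, the `alpha` parameter, and the dict-of-lists stand-in for real tensors are all hypothetical and chosen only for illustration; they are not taken from this repository or from Chatterbox.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of floats.
# A real checkpoint would hold tensors, but the arithmetic is the same.
StateDict = Dict[str, List[float]]


def merge_weights(custom: StateDict, base: StateDict, alpha: float = 0.65) -> StateDict:
    """Linearly interpolate two checkpoints with identical parameter names.

    alpha is the share given to `custom`; (1 - alpha) goes to `base`.
    alpha=0.65 mirrors the "65/35 merge, favoring the custom model" wording,
    under the assumption that the merge is a simple weighted average.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if custom.keys() != base.keys():
        raise KeyError("both checkpoints must contain the same parameter names")

    merged: StateDict = {}
    for name, custom_values in custom.items():
        base_values = base[name]
        if len(custom_values) != len(base_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted average of the two parameter vectors.
        merged[name] = [
            alpha * c + (1.0 - alpha) * b
            for c, b in zip(custom_values, base_values)
        ]
    return merged
```

With `alpha=0.65`, each merged value sits 65% of the way toward the custom model's value; swapping the two arguments or passing `alpha=0.35` would instead favor `Kartoffelbox`.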
 **Key Features:**
 - **Language:** German
+- **Type:** Standalone, Multi-Speaker, Merged Hybrid Model
+- **Capabilities:** High-quality speech synthesis and Zero-Shot Voice Cloning for **variable German voices**.
+- **Robustness:** Specifically trained to handle numbers, dates, and other complex data formats (which works only some of the time :D).
 ## How to Use the Model