Kartoffelbox-v0.1_0.65h2: A Merged German Chatterbox-TTS Model

Model Description

This repository contains an experimental, standalone German Text-to-Speech model based on the Chatterbox framework.

This model is a hybrid created by merging two fine-tuned models:

The well-known German TTS "patch" SebastianBodza/Kartoffelbox-v0.1.
A custom model extensively fine-tuned on a large, diverse dataset of German voices (~12,000 samples).

The goal was to create a robust, general-purpose German TTS model by combining the natural prosody of Kartoffelbox with a model trained on a wide variety of voices and data types. The final weights are a 65/35 merge, favoring the custom-trained, multi-speaker model. Unlike patch-based models, this is a complete, self-contained model that can be loaded directly.
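To illustrate what a 65/35 weight merge can look like, here is a minimal sketch. It assumes the merge is a plain per-parameter linear interpolation between two checkpoints that share the same parameter names; the exact procedure used for this model is not documented here, and `merge_state_dicts` and `custom_weight` are hypothetical names chosen for the example.

```python
def merge_state_dicts(custom, kartoffelbox, custom_weight=0.65):
    """Linearly interpolate two checkpoints that share the same parameter names.

    Assumption: a simple weighted average per parameter. The values can be
    floats, NumPy arrays, or torch tensors, since only * and + are used.
    """
    if custom.keys() != kartoffelbox.keys():
        raise ValueError("checkpoints must share the same parameter names")
    other_weight = 1.0 - custom_weight  # 0.35 for the default 65/35 split
    return {
        name: custom_weight * custom[name] + other_weight * kartoffelbox[name]
        for name in custom
    }


# Toy example with scalar "weights"; real checkpoints would hold tensors.
custom = {"layer.weight": 1.0, "layer.bias": 0.0}
kartoffelbox = {"layer.weight": 0.0, "layer.bias": 2.0}
merged = merge_state_dicts(custom, kartoffelbox)
```

With real models, the two dictionaries would typically be the `state_dict()` of each fine-tuned checkpoint, and the merged result would be saved as a new standalone checkpoint.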

Key Features:

Language: German
Type: Standalone, Multi-Speaker, Merged Hybrid Model
Capabilities: High-quality speech synthesis and zero-shot voice cloning across a variety of German voices.
Robustness: Trained specifically to handle numbers, dates, and other complex data formats (which works some of the time :D).

How to Use the Model

This is a complete model and does not require manual patching. You will need the chatterbox library from Resemble AI to run it.

1. Installation

# Clone the official Chatterbox repository and install its dependencies
git clone https://github.com/resemble-ai/chatterbox.git
cd chatterbox
pip install -e .