# Chatterbox TTS Gradio App

This Gradio application provides a user interface for text-to-speech generation using the Chatterbox TTS model. It supports both single utterance generation and multi-speaker dialog generation with configurable silence gaps.

## Features

- **Single Utterance Generation**: Generate speech from text using a selected speaker
- **Dialog Generation**: Create multi-speaker conversations with configurable silence gaps
- **Speaker Management**: Add/remove speakers with custom audio samples
- **Memory Optimization**: Automatic model cleanup after generation
- **Output Organization**: Files saved in `single_output/` and `dialog_output/` directories

## Getting Started

1. Clone the repository:
   ```bash
   git clone https://github.com/your-username/chatterbox-test.git
   ```
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Prepare speaker samples:
   - Create a `speaker_samples/` directory
   - Add audio samples (WAV format) for each speaker
   - Update `speakers.yaml` with speaker names and file paths
4. Run the app:
   ```bash
   python gradio_app.py
   ```

## Usage

### Single Utterance Tab

- Select a speaker from the dropdown
- Enter text to synthesize
- Adjust generation parameters as needed
- Click "Generate Speech"

### Dialog Generation Tab

1. Add speakers using the speaker configuration section
2. Enter dialog in the format:
   ```
   Speaker1: "Hello, how are you?"
   Speaker2: "I'm doing well!"
   Silence: 0.5
   Speaker1: "What are your plans for today?"
   ```
3. Set output base name
4. Click "Generate Dialog"

## File Organization

- Generated single utterances are saved to `single_output/`
- Dialog generation files are saved to `dialog_output/`
- Concatenated dialog files have `_concatenated.wav` suffix
- All files are zipped together for download

## Memory Management

The app automatically:

- Cleans up the TTS model after each generation
- Frees GPU memory (for CUDA/MPS devices)
- Deletes intermediate tensors to minimize memory footprint

## Troubleshooting

- **"Skipping unknown speaker"**: Add the speaker first using the speaker configuration
- **"Sample file not found"**: Verify the audio file exists in `speaker_samples/`
- **Memory issues**: Try enabling "Re-initialize model each line" for long dialogs
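
## Example: Parsing the Dialog Format

To make the dialog format from the Dialog Generation Tab concrete, here is a minimal sketch of how such a script could be split into speech lines and silence gaps. It is not taken from `gradio_app.py`; the names `Speech`, `Silence`, and `parse_dialog` are hypothetical, and the handling of quotes and malformed lines is an assumption about reasonable behavior.

```python
import re
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Speech:
    speaker: str
    text: str


@dataclass
class Silence:
    seconds: float


# "<name>: <body>", split at the first colon on the line.
LINE_RE = re.compile(r'^\s*([^:]+?)\s*:\s*(.+?)\s*$')


def parse_dialog(script: str) -> List[Union[Speech, Silence]]:
    """Turn a dialog script into an ordered list of Speech and Silence items."""
    items: List[Union[Speech, Silence]] = []
    for raw in script.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # assumption: blank or malformed lines are skipped
        name, body = match.groups()
        if name.lower() == "silence":
            # Raises ValueError if the duration is not a number.
            items.append(Silence(float(body)))
        else:
            # Strip one pair of surrounding double quotes, if present.
            if len(body) >= 2 and body[0] == '"' and body[-1] == '"':
                body = body[1:-1]
            items.append(Speech(name, body))
    return items
```

Each `Speech` item would then be sent to the TTS model with the matching speaker sample, and each `Silence` item would become a gap of that many seconds in the concatenated output.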
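
## Example: Concatenating Clips with Silence Gaps

The `_concatenated.wav` file mentioned under File Organization joins the per-line clips with the configured silence gaps. The sketch below shows one way to do that with only Python's standard `wave` module. It is an illustration, not the app's own code: `concatenate_wavs` and `silence_bytes` are hypothetical names, and the sketch assumes all clips are signed PCM (for example 16-bit) with identical channel count, sample width, and sample rate.

```python
import wave
from typing import List, Union


def silence_bytes(seconds: float, params) -> bytes:
    """Zero-valued frames lasting `seconds`; zero is silence for signed PCM."""
    n_frames = int(round(seconds * params.framerate))
    return b"\x00" * (n_frames * params.sampwidth * params.nchannels)


def concatenate_wavs(segments: List[Union[str, float]], out_path: str) -> None:
    """Join WAV files (str paths) and silence gaps (seconds as float) in order."""
    params = None
    chunks = []
    for seg in segments:
        if isinstance(seg, (int, float)):
            if params is None:
                raise ValueError("a silence gap needs an earlier clip to copy its format from")
            chunks.append(silence_bytes(seg, params))
            continue
        with wave.open(seg, "rb") as clip:
            p = clip.getparams()
            if params is None:
                params = p  # the first clip defines the output format
            elif (p.nchannels, p.sampwidth, p.framerate) != (
                params.nchannels, params.sampwidth, params.framerate
            ):
                raise ValueError(f"format mismatch in {seg}")
            chunks.append(clip.readframes(clip.getnframes()))
    if params is None:
        raise ValueError("no audio clips given")
    with wave.open(out_path, "wb") as out:
        out.setnchannels(params.nchannels)
        out.setsampwidth(params.sampwidth)
        out.setframerate(params.framerate)
        out.writeframes(b"".join(chunks))
```

A call such as `concatenate_wavs(["line_1.wav", 0.5, "line_2.wav"], "demo_concatenated.wav")` would place half a second of silence between the two clips; the file names here are placeholders.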