# Chatterbox TTS Gradio App
This Gradio application provides a user interface for text-to-speech generation using the Chatterbox TTS model. It supports both single utterance generation and multi-speaker dialog generation with configurable silence gaps.
## Features
- **Single Utterance Generation**: Generate speech from text using a selected speaker
- **Dialog Generation**: Create multi-speaker conversations with configurable silence gaps
- **Speaker Management**: Add/remove speakers with custom audio samples
- **Memory Optimization**: Automatic model cleanup after generation
- **Output Organization**: Files saved in `single_output/` and `dialog_output/` directories
## Getting Started
1. Clone the repository:
```bash
git clone https://github.com/your-username/chatterbox-test.git
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Prepare speaker samples:
- Create a `speaker_samples/` directory
- Add audio samples (WAV format) for each speaker
- Update `speakers.yaml` with speaker names and file paths
4. Run the app:
```bash
python gradio_app.py
```
## Usage
### Single Utterance Tab
1. Select a speaker from the dropdown
2. Enter the text to synthesize
3. Adjust generation parameters as needed
4. Click "Generate Speech"
### Dialog Generation Tab
1. Add speakers using the speaker configuration section
2. Enter dialog in the format:
```
Speaker1: "Hello, how are you?"
Speaker2: "I'm doing well!"
Silence: 0.5
Speaker1: "What are your plans for today?"
```
3. Set output base name
4. Click "Generate Dialog"
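
The dialog format above can be read line by line. The following is a minimal parsing sketch, not the app's actual parser: the function name `parse_dialog`, the tuple layout, and the regular expression are all assumptions made for illustration.

```python
import re
from typing import List, Tuple, Union

# One parsed item: ("speech", speaker, text) or ("silence", seconds).
# This layout is hypothetical, chosen only for the example.
DialogItem = Union[Tuple[str, str, str], Tuple[str, float]]

# Assumed line shape: a name, a colon, then the rest of the line.
LINE_RE = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def parse_dialog(script: str) -> List[DialogItem]:
    """Parse dialog text into speech and silence items (illustrative only)."""
    items: List[DialogItem] = []
    for raw in script.splitlines():
        if not raw.strip():
            continue  # ignore blank lines
        match = LINE_RE.match(raw)
        if match is None:
            raise ValueError(f"Unrecognized dialog line: {raw!r}")
        name, value = match.groups()
        if name.lower() == "silence":
            # "Silence: 0.5" becomes a gap of 0.5 seconds.
            items.append(("silence", float(value)))
        else:
            # Strip the surrounding double quotes, if present.
            items.append(("speech", name, value.strip('"')))
    return items


example = '''Speaker1: "Hello, how are you?"
Speaker2: "I'm doing well!"
Silence: 0.5
Speaker1: "What are your plans for today?"'''

for item in parse_dialog(example):
    print(item)
```

Splitting on the first colon only means that utterances containing a colon still parse as a single line of speech.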
## File Organization
- Generated single utterances are saved to `single_output/`
- Dialog generation files are saved to `dialog_output/`
- Concatenated dialog files have a `_concatenated.wav` suffix
- All files are zipped together for download
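
For readers who want to reproduce the download bundle outside the app, here is a rough sketch of zipping a set of generated files with the standard library. The helper name `bundle_outputs`, the flat archive layout, and the `demo*` file names are assumptions, not the app's actual behavior.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import Iterable


def bundle_outputs(wav_paths: Iterable[Path], zip_path: Path) -> Path:
    """Write the given files into one ZIP archive and return its path."""
    zip_path.parent.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for wav in wav_paths:
            # Store only the file name so the archive has a flat layout.
            archive.write(wav, arcname=wav.name)
    return zip_path


# Demo in a temporary directory so nothing is written to the real output folders.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "dialog_output"
    out_dir.mkdir()
    for name in ("demo_line_1.wav", "demo_concatenated.wav"):
        (out_dir / name).write_bytes(b"")  # empty placeholder files
    archive_path = bundle_outputs(sorted(out_dir.glob("demo*.wav")), out_dir / "demo.zip")
    with zipfile.ZipFile(archive_path) as archive:
        print(archive.namelist())
```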
## Memory Management
The app automatically:
- Cleans up the TTS model after each generation
- Frees GPU memory (for CUDA/MPS devices)
- Deletes intermediate tensors to minimize memory footprint
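
The cleanup steps listed above might look roughly like the sketch below. It assumes the model runs on PyTorch, and the function name `free_accelerator_memory` is hypothetical; the app's own cleanup code may differ.

```python
import gc
from typing import List


def free_accelerator_memory() -> List[str]:
    """Run garbage collection and empty PyTorch's CUDA/MPS caches if present.

    Returns the list of cleanup steps that ran, which is handy for logging.
    """
    steps = ["gc"]
    gc.collect()  # drop unreachable Python objects, including tensors
    try:
        import torch  # optional here so the sketch also runs without PyTorch
    except ImportError:
        return steps
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # return cached CUDA blocks to the driver
        steps.append("cuda")
    mps_backend = getattr(torch.backends, "mps", None)
    if mps_backend is not None and mps_backend.is_available() and hasattr(torch, "mps"):
        torch.mps.empty_cache()  # same idea for Apple Silicon
        steps.append("mps")
    return steps


# Typical call site: drop every reference to the model first, then clean up.
model = object()  # stand-in for a loaded TTS model
del model
print(free_accelerator_memory())
```

Emptying the cache only helps once no Python reference to the model or its tensors remains, which is why the `del` comes first.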
## Troubleshooting
- **"Skipping unknown speaker"**: Add the speaker first using the speaker configuration section
- **"Sample file not found"**: Verify that the audio file exists in `speaker_samples/` and that its path in `speakers.yaml` is correct
- **Memory issues**: Try enabling "Re-initialize model each line" for long dialogs
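
To catch "Sample file not found" before launching the app, a quick pre-flight check along these lines may help. This is only a sketch: the `find_missing_samples` helper and the name-to-path mapping are assumptions, since the exact layout of `speakers.yaml` is not described here.

```python
from pathlib import Path
from typing import Dict, List


def find_missing_samples(speakers: Dict[str, str], base_dir: Path = Path(".")) -> List[str]:
    """Return the names of speakers whose sample file does not exist."""
    missing = []
    for name, sample_path in speakers.items():
        if not (base_dir / sample_path).is_file():
            missing.append(name)
    return missing


# Hypothetical mapping of speaker name to sample path, for illustration only.
speakers = {
    "Speaker1": "speaker_samples/speaker1.wav",
    "Speaker2": "speaker_samples/speaker2.wav",
}
for name in find_missing_samples(speakers):
    print(f"Sample file not found for {name}")
```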