
Steve White 4a7c1ea6a1 Added per-line generation and playback; currently regenerates when you hit 'Generate Audio'		2025-06-06 08:44:21 -05:00
.note	Working layout.	2025-06-05 17:38:12 -05:00
backend	added single line generation to the backend	2025-06-06 08:26:15 -05:00
frontend	Added per-line generation and playback; currently regenerates when you hit 'Generate Audio'	2025-06-06 08:44:21 -05:00
speaker_data	added single line generation to the backend	2025-06-06 08:26:15 -05:00
.gitignore	chore: remove node_modules from git tracking and add to .gitignore	2025-06-05 17:40:27 -05:00
README-dialog-generator.md	Gradio app added, cbx-dialog-generate.py added	2025-06-04 08:30:07 -05:00
README.md	Major update: Enhanced memory management, configurable silence gaps, and file organization	2025-06-04 12:37:52 -05:00
babel.config.cjs	Working layout.	2025-06-05 17:38:12 -05:00
cbx-dialog-generate.py	Gradio app added, cbx-dialog-generate.py added	2025-06-04 08:30:07 -05:00
cbx-generate.py	Gradio app added, cbx-dialog-generate.py added	2025-06-04 08:30:07 -05:00
chatterbox-test.py	Gradio app added, cbx-dialog-generate.py added	2025-06-04 08:30:07 -05:00
chatterbox_tts.py.bak	Gradio app added, cbx-dialog-generate.py added	2025-06-04 08:30:07 -05:00
gradio_app.py	Major update: Enhanced memory management, configurable silence gaps, and file organization	2025-06-04 12:37:52 -05:00
package-lock.json	Working layout.	2025-06-05 17:38:12 -05:00
package.json	Working layout.	2025-06-05 17:38:12 -05:00
sample-dialog.md	Gradio app added, cbx-dialog-generate.py added	2025-06-04 08:30:07 -05:00
speakers.yaml	Major update: Enhanced memory management, configurable silence gaps, and file organization	2025-06-04 12:37:52 -05:00
storage_service.py	Updated note directory- gradio interface working.	2025-06-05 09:20:19 -05:00
test1-wav	Gradio app added, cbx-dialog-generate.py added	2025-06-04 08:30:07 -05:00

README.md

Chatterbox TTS Gradio App

This Gradio application provides a user interface for text-to-speech generation using the Chatterbox TTS model. It supports both single utterance generation and multi-speaker dialog generation with configurable silence gaps.

Features

- Single Utterance Generation: Generate speech from text using a selected speaker
- Dialog Generation: Create multi-speaker conversations with configurable silence gaps
- Speaker Management: Add/remove speakers with custom audio samples
- Memory Optimization: Automatic model cleanup after generation
- Output Organization: Files saved in single_output/ and dialog_output/ directories

Getting Started

Clone the repository:

git clone https://github.com/your-username/chatterbox-test.git

Install dependencies:
```
pip install -r requirements.txt
```
Prepare speaker samples:
- Create a speaker_samples/ directory
- Add audio samples (WAV format) for each speaker
- Update speakers.yaml with speaker names and file paths
Run the app:
```
python gradio_app.py
```

Usage

Single Utterance Tab

Select a speaker from the dropdown
Enter text to synthesize
Adjust generation parameters as needed
Click "Generate Speech"

Dialog Generation Tab

Add speakers using the speaker configuration section

Enter dialog in the format:

Speaker1: "Hello, how are you?"
Speaker2: "I'm doing well!"
Silence: 0.5
Speaker1: "What are your plans for today?"

Set the output base name
Click "Generate Dialog"
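The dialog format shown above can be parsed in a few lines. The following is a minimal sketch, not the parser used by gradio_app.py: the names `Utterance`, `Silence`, and `parse_dialog` are hypothetical, and it assumes one entry per line, with `Silence:` followed by a duration in seconds.

```python
import re
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Utterance:
    speaker: str
    text: str


@dataclass
class Silence:
    seconds: float


# One dialog line: a name, a colon, then the rest of the line.
LINE_RE = re.compile(r'^\s*([^:]+?)\s*:\s*(.+?)\s*$')


def parse_dialog(script: str) -> List[Union[Utterance, Silence]]:
    """Turn a dialog script into an ordered list of utterances and silence gaps."""
    items: List[Union[Utterance, Silence]] = []
    for raw in script.splitlines():
        if not raw.strip():
            continue  # ignore blank lines
        match = LINE_RE.match(raw)
        if match is None:
            raise ValueError(f"Unrecognized dialog line: {raw!r}")
        name, body = match.group(1), match.group(2)
        if name.lower() == "silence":
            # Assumption: the value after "Silence:" is a duration in seconds.
            items.append(Silence(float(body)))
        else:
            # Strip one pair of surrounding double quotes, if present.
            if len(body) >= 2 and body[0] == '"' and body[-1] == '"':
                body = body[1:-1]
            items.append(Utterance(name, body))
    return items
```

Keeping silence gaps as their own items in the list preserves their position relative to the spoken lines, which is what a later concatenation step needs.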

File Organization

- Generated single utterances are saved to single_output/
- Dialog generation files are saved to dialog_output/
- Concatenated dialog files have a _concatenated.wav suffix
- All files are zipped together for download
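To illustrate how a concatenated dialog file with silence gaps might be assembled, here is a sketch that uses only the standard-library `wave` module. It is an assumption about the approach, not the app's actual code: `concatenate_with_silence` is a hypothetical helper, and it assumes all inputs are PCM WAV files with the same channel count, sample width, and sample rate.

```python
import wave
from pathlib import Path
from typing import Sequence, Union


def concatenate_with_silence(
    segments: Sequence[Union[str, Path, float]],
    out_path: Union[str, Path],
) -> int:
    """Join WAV files and silence gaps into one file; return the frames written.

    Each segment is either a path to a WAV file or a number of seconds of silence.
    """
    params = None
    chunks = []
    for seg in segments:
        if isinstance(seg, (int, float)):
            chunks.append(float(seg))  # resolved to bytes once the format is known
            continue
        with wave.open(str(seg), "rb") as wav:
            fmt = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
            if params is None:
                params = fmt
            elif fmt != params:
                raise ValueError(f"{seg} does not match the format of earlier files")
            chunks.append(wav.readframes(wav.getnframes()))
    if params is None:
        raise ValueError("at least one WAV file is required")

    channels, width, rate = params
    frames_written = 0
    with wave.open(str(out_path), "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(width)
        out.setframerate(rate)
        for chunk in chunks:
            if isinstance(chunk, float):
                # Zero bytes are silence for signed PCM (16-bit and wider).
                chunk = b"\x00" * (int(round(chunk * rate)) * channels * width)
            out.writeframes(chunk)
            frames_written += len(chunk) // (channels * width)
    return frames_written
```

A call such as `concatenate_with_silence(["line1.wav", 0.5, "line2.wav"], "dialog_output/demo_concatenated.wav")` would write both lines with half a second of silence between them; the file names here are placeholders.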

Memory Management

The app automatically:

- Cleans up the TTS model after each generation
- Frees GPU memory (for CUDA/MPS devices)
- Deletes intermediate tensors to minimize memory footprint
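The cleanup steps listed above commonly look something like the sketch below in PyTorch-based apps. This is an illustration under assumptions, not the code in gradio_app.py: `release_model_memory` is a hypothetical helper, and the PyTorch import is guarded so the snippet also runs where PyTorch is not installed.

```python
import gc

try:
    import torch  # optional here so the sketch still runs without PyTorch
except ImportError:
    torch = None


def release_model_memory() -> str:
    """Run garbage collection and clear accelerator caches; return the backend cleared."""
    gc.collect()  # drop unreachable Python objects, including unreferenced tensors
    if torch is None:
        return "cpu"
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # release cached, unused CUDA memory
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available() and hasattr(torch, "mps"):
        torch.mps.empty_cache()  # same idea on Apple Silicon
        return "mps"
    return "cpu"
```

Emptying the cache only releases memory that is no longer referenced, so the caller would first need to drop its own references to the model and any intermediate tensors (for example with `del`) before calling the helper.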

Troubleshooting

- "Skipping unknown speaker": Add the speaker first using the speaker configuration
- "Sample file not found": Verify the audio file exists in speaker_samples/
- Memory issues: Try enabling "Re-initialize model each line" for long dialogs
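The first two problems can be caught before a long generation run with a quick pre-flight check. The sketch below is hypothetical: `check_speakers` is not part of the app, and it assumes the speaker configuration has already been loaded into a plain dict that maps each speaker name to a file name inside speaker_samples/ (the actual layout of speakers.yaml may differ).

```python
from pathlib import Path
from typing import Dict, Iterable, List


def check_speakers(
    used: Iterable[str],
    speakers: Dict[str, str],
    sample_dir: str = "speaker_samples",
) -> List[str]:
    """Return human-readable problems for the speakers a dialog refers to."""
    problems: List[str] = []
    for name in dict.fromkeys(used):  # de-duplicate while keeping first-seen order
        if name not in speakers:
            problems.append(f"unknown speaker: {name}")
            continue
        sample = Path(sample_dir) / speakers[name]
        if not sample.is_file():
            problems.append(f"sample file not found: {sample}")
    return problems
```

An empty list means every speaker named in the dialog is configured and has a sample file on disk.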