Steve White 4a7c1ea6a1 Added per-line generation and playback; currently regenerates when you hit 'Generate Audio' 2025-06-06 08:44:21 -05:00
.note Working layout. 2025-06-05 17:38:12 -05:00
backend added single line generation to the backend 2025-06-06 08:26:15 -05:00
frontend Added per-line generation and playback; currently regenerates when you hit 'Generate Audio' 2025-06-06 08:44:21 -05:00
speaker_data added single line generation to the backend 2025-06-06 08:26:15 -05:00
.gitignore chore: remove node_modules from git tracking and add to .gitignore 2025-06-05 17:40:27 -05:00
README-dialog-generator.md Gradio app added, cbx-dialog-generate.py added 2025-06-04 08:30:07 -05:00
README.md Major update: Enhanced memory management, configurable silence gaps, and file organization 2025-06-04 12:37:52 -05:00
babel.config.cjs Working layout. 2025-06-05 17:38:12 -05:00
cbx-dialog-generate.py Gradio app added, cbx-dialog-generate.py added 2025-06-04 08:30:07 -05:00
cbx-generate.py Gradio app added, cbx-dialog-generate.py added 2025-06-04 08:30:07 -05:00
chatterbox-test.py Gradio app added, cbx-dialog-generate.py added 2025-06-04 08:30:07 -05:00
chatterbox_tts.py.bak Gradio app added, cbx-dialog-generate.py added 2025-06-04 08:30:07 -05:00
gradio_app.py Major update: Enhanced memory management, configurable silence gaps, and file organization 2025-06-04 12:37:52 -05:00
package-lock.json Working layout. 2025-06-05 17:38:12 -05:00
package.json Working layout. 2025-06-05 17:38:12 -05:00
sample-dialog.md Gradio app added, cbx-dialog-generate.py added 2025-06-04 08:30:07 -05:00
speakers.yaml Major update: Enhanced memory management, configurable silence gaps, and file organization 2025-06-04 12:37:52 -05:00
storage_service.py Updated note directory- gradio interface working. 2025-06-05 09:20:19 -05:00
test1-wav Gradio app added, cbx-dialog-generate.py added 2025-06-04 08:30:07 -05:00

README.md

Chatterbox TTS Gradio App

This Gradio application provides a user interface for text-to-speech generation with the Chatterbox TTS model. It supports single-utterance generation and multi-speaker dialog generation with configurable silence gaps.

Features

  • Single Utterance Generation: Generate speech from text using a selected speaker
  • Dialog Generation: Create multi-speaker conversations with configurable silence gaps
  • Speaker Management: Add/remove speakers with custom audio samples
  • Memory Optimization: Automatic model cleanup after generation
  • Output Organization: Files saved in single_output/ and dialog_output/ directories

Getting Started

  1. Clone the repository:

    git clone https://github.com/your-username/chatterbox-test.git
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Prepare speaker samples:

    • Create a speaker_samples/ directory
    • Add audio samples (WAV format) for each speaker
    • Update speakers.yaml with speaker names and file paths
    
  4. Run the app:

    python gradio_app.py
    

Usage

Single Utterance Tab

  • Select a speaker from the dropdown
  • Enter text to synthesize
  • Adjust generation parameters as needed
  • Click "Generate Speech"

Dialog Generation Tab

  1. Add speakers using the speaker configuration section
  2. Enter dialog in the format:
    Speaker1: "Hello, how are you?"
    Speaker2: "I'm doing well!"
    Silence: 0.5
    Speaker1: "What are your plans for today?"
    
  3. Set the output base name
  4. Click "Generate Dialog"
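
To make the dialog format above concrete, here is a minimal sketch of how such a script could be parsed into utterances and silence gaps. It is not the app's actual parser: the names `Utterance`, `Silence`, and `parse_dialog` and both regular expressions are assumptions made for illustration, and the real code may accept other line forms.

```python
import re
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Utterance:
    speaker: str
    text: str


@dataclass
class Silence:
    seconds: float


# Hypothetical patterns inferred from the example format, not taken from the app.
SILENCE_RE = re.compile(r'^Silence:\s*([0-9]*\.?[0-9]+)\s*$', re.IGNORECASE)
LINE_RE = re.compile(r'^([^:]+):\s*"(.*)"\s*$')


def parse_dialog(script: str) -> List[Union[Utterance, Silence]]:
    """Turn a dialog script into an ordered list of utterances and gaps."""
    items: List[Union[Utterance, Silence]] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        match = SILENCE_RE.match(line)
        if match:
            items.append(Silence(float(match.group(1))))
            continue
        match = LINE_RE.match(line)
        if match:
            items.append(Utterance(match.group(1).strip(), match.group(2)))
            continue
        raise ValueError(f"Unrecognized dialog line: {line!r}")
    return items
```

Silence lines are checked first so that "Silence" is never mistaken for a speaker name. This sketch raises on malformed lines; the app may instead skip them.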

File Organization

  • Generated single utterances are saved to single_output/
  • Dialog generation files are saved to dialog_output/
  • Concatenated dialog files have a _concatenated.wav suffix
  • All files are zipped together for download
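
As an illustration of how per-line audio and silence gaps could end up in one `_concatenated.wav` file, the sketch below joins WAV segments using only the standard library. It assumes 16-bit mono PCM segments that all share one sample rate. The function name `concatenate_with_gaps` and its parameters are hypothetical, and the app itself may work on tensors rather than files.

```python
import wave
from pathlib import Path
from typing import Sequence, Union

Part = Union[str, Path, float]  # a WAV path, or a silence gap in seconds


def concatenate_with_gaps(
    parts: Sequence[Part],
    out_dir: Union[str, Path],
    base_name: str,
    sample_rate: int,
) -> Path:
    """Write <out_dir>/<base_name>_concatenated.wav from segments and gaps."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{base_name}_concatenated.wav"
    with wave.open(str(out_path), "wb") as out:
        out.setnchannels(1)   # assumption: mono
        out.setsampwidth(2)   # assumption: 16-bit PCM
        out.setframerate(sample_rate)
        for part in parts:
            if isinstance(part, (int, float)):
                # Silence is a run of zero-valued 16-bit samples.
                out.writeframes(b"\x00\x00" * int(round(part * sample_rate)))
                continue
            with wave.open(str(part), "rb") as seg:
                fmt = (seg.getnchannels(), seg.getsampwidth(), seg.getframerate())
                if fmt != (1, 2, sample_rate):
                    raise ValueError(f"Unexpected audio format in {part}: {fmt}")
                out.writeframes(seg.readframes(seg.getnframes()))
    return out_path
```

A call such as `concatenate_with_gaps(["line1.wav", 0.5, "line2.wav"], "dialog_output", "demo", sample_rate)` would write `dialog_output/demo_concatenated.wav`; the input file names here are placeholders.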

Memory Management

The app automatically:

  • Cleans up the TTS model after each generation
  • Frees GPU memory (for CUDA/MPS devices)
  • Deletes intermediate tensors to minimize memory footprint
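
The cleanup steps listed above typically look something like the following in PyTorch. This is a sketch under assumptions rather than the app's actual code: `release_memory` is a hypothetical helper, and the caller is expected to drop its own references to the model and tensors (for example with `del model`) before calling it.

```python
import gc

try:
    import torch
except ImportError:  # lets the sketch run where PyTorch is not installed
    torch = None


def release_memory() -> None:
    """Collect garbage, then release cached accelerator memory if any."""
    gc.collect()
    if torch is None:
        return
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    mps_backend = getattr(torch.backends, "mps", None)
    if mps_backend is not None and mps_backend.is_available() and hasattr(torch, "mps"):
        torch.mps.empty_cache()


# Usage: drop references first, then release caches.
model = object()  # stand-in for the loaded TTS model
del model
release_memory()
```

Emptying the cache only returns memory that is no longer referenced, which is why deleting the model and intermediate tensors comes first.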

Troubleshooting

  • "Skipping unknown speaker": Add the speaker first using the speaker configuration
  • "Sample file not found": Verify the audio file exists in speaker_samples/
  • Memory issues: Try enabling "Re-initialize model each line" for long dialogs
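
The first two problems can be caught before generation with a small pre-flight check. The sketch below is illustrative only: the `preflight` function is hypothetical, and it assumes the speaker configuration can be reduced to a mapping from speaker name to a file name inside speaker_samples/, which may not match the real layout of speakers.yaml.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Union


def preflight(
    dialog_speakers: Iterable[str],
    speakers: Dict[str, str],
    samples_dir: Union[str, Path] = "speaker_samples",
) -> List[str]:
    """Return a list of problems; an empty list means the dialog can run."""
    problems: List[str] = []
    for name in sorted(set(dialog_speakers)):
        if name not in speakers:
            problems.append(f"unknown speaker: {name}")
            continue
        sample = Path(samples_dir) / speakers[name]
        if not sample.is_file():
            problems.append(f"sample file not found: {sample}")
    return problems
```

Running this against the speaker names used in a dialog reports every missing speaker or sample at once, instead of one warning per generated line.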