Project Overview: Chatterbox TTS Application Migration

1. Current System

The project is currently a Gradio-based application named "Chatterbox TTS Gradio App". Its primary function is to provide a user interface for text-to-speech (TTS) generation using the Chatterbox TTS model.

Key features of the current Gradio application include:

  • Single utterance TTS generation.
  • Multi-speaker dialog generation with configurable silence gaps.
  • Speaker management (adding/removing speakers with custom audio samples).
  • Automatic memory optimization (model cleanup after generation).
  • Organized output file storage (single_output/ and dialog_output/).
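
As a rough illustration of what "configurable silence gaps" means for dialog generation, the sketch below joins per-line audio segments with a fixed stretch of silence between them. It is not taken from gradio_app.py: the function name `join_with_silence`, the use of plain Python lists for samples, and the 24 kHz default sample rate are all assumptions made for the example.

```python
from typing import List, Sequence

SAMPLE_RATE = 24_000  # assumed sample rate in Hz, for illustration only


def join_with_silence(
    segments: Sequence[Sequence[float]],
    gap_seconds: float,
    sample_rate: int = SAMPLE_RATE,
) -> List[float]:
    """Concatenate audio segments, inserting silence between consecutive ones."""
    if gap_seconds < 0:
        raise ValueError("gap_seconds must be non-negative")
    # A silence gap is just a run of zero-valued samples.
    gap = [0.0] * int(round(gap_seconds * sample_rate))
    dialog: List[float] = []
    for index, segment in enumerate(segments):
        if index > 0:
            dialog.extend(gap)  # silence only between lines, not at the ends
        dialog.extend(segment)
    return dialog


# Two hypothetical utterances (3 and 2 samples) joined with a 0.5 s gap at 8 Hz.
line_a = [0.1, 0.2, 0.3]
line_b = [0.4, 0.5]
mixed = join_with_silence([line_a, line_b], gap_seconds=0.5, sample_rate=8)
print(len(mixed))  # 3 + 4 + 2 = 9 samples
```

A real implementation would more likely operate on tensors or NumPy arrays produced by the model, but the structure (generate each line, then interleave silence) would be similar.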

2. Project Goal: Migration to Modern Web Stack

The primary goal of this project is to re-implement the Chatterbox TTS application, specifically its dialog generation capabilities, on a new architecture that replaces the current Gradio framework.

The new architecture will consist of:

  • Frontend: Vanilla JavaScript
  • Backend: FastAPI (Python)

This migration aims to address limitations of the Gradio framework (audio playback issues, limited UI control, and state management complexity) and to provide a more robust, performant, and professional user experience.
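
To make the frontend/backend split more concrete, here is a framework-agnostic sketch of the request/response contract a dialog-generation endpoint might expose. Everything in it is hypothetical: the class names (`DialogLineIn`, `DialogRequest`, `DialogResponse`), the field names, the default gap, and the `dialog.wav` file name are placeholders, not details from the existing plan. In a FastAPI backend, the dataclasses would typically become Pydantic models and `handle_generate_dialog` a route handler.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class DialogLineIn:
    speaker_id: str  # hypothetical key identifying a registered speaker
    text: str


@dataclass
class DialogRequest:
    lines: List[DialogLineIn]
    silence_gap_s: float = 0.5  # assumed default gap between lines


@dataclass
class DialogResponse:
    output_path: str
    line_count: int
    errors: List[str] = field(default_factory=list)


def validate_dialog_request(request: DialogRequest, known_speakers: Set[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []
    if not request.lines:
        errors.append("dialog must contain at least one line")
    if request.silence_gap_s < 0:
        errors.append("silence_gap_s must be non-negative")
    for number, line in enumerate(request.lines, start=1):
        if line.speaker_id not in known_speakers:
            errors.append(f"line {number}: unknown speaker '{line.speaker_id}'")
        if not line.text.strip():
            errors.append(f"line {number}: text is empty")
    return errors


def handle_generate_dialog(request: DialogRequest, known_speakers: Set[str]) -> DialogResponse:
    """Validate the request and return where the generated audio would live."""
    errors = validate_dialog_request(request, known_speakers)
    if errors:
        return DialogResponse(output_path="", line_count=0, errors=errors)
    # Placeholder: a real backend would run TTS here and write the audio file.
    return DialogResponse(
        output_path="dialog_output/dialog.wav",  # hypothetical file name
        line_count=len(request.lines),
    )


speakers = {"alice", "bob"}
ok = handle_generate_dialog(
    DialogRequest([DialogLineIn("alice", "Hello."), DialogLineIn("bob", "Hi!")]),
    speakers,
)
bad = handle_generate_dialog(DialogRequest([DialogLineIn("carol", "")]), speakers)
print(ok.output_path, ok.line_count)
print(bad.errors)
```

Keeping validation in a plain function, separate from any web framework, means the same checks can be unit-tested without starting a server and reused if the API layer changes.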

3. High-Level Plan & Existing Documentation

A comprehensive implementation plan for this migration already exists and should be consulted. This plan (Memory ID c20c2cce-46d4-453f-9bc3-c18e05dbc66f) outlines:

  • A 4-phase implementation (Backend API, Frontend Development, Integration & Testing, Production Features).
  • The complete technical architecture.
  • A detailed component system (DialogLine, AudioPlayer, ProjectManager).
  • Features like real-time status updates and drag-and-drop functionality.
  • Migration strategies.
  • Expected benefits (e.g., faster responsiveness, better audio reliability).
  • An estimated timeline.

4. Scope of Current Work

The immediate next step, as requested by the user, is to:

  1. Review the existing gradio_app.py.
  2. Refine or detail the plan for re-implementing the dialog generation functionality with the new stack, leveraging the existing comprehensive plan.

This document will be updated as the project progresses to reflect new decisions, architectural changes, and milestones.