# Project Overview: Chatterbox TTS Application Migration

## 1. Current System

The project is currently a Gradio-based application named "Chatterbox TTS Gradio App".
Its primary function is to provide a user interface for text-to-speech (TTS) generation using the Chatterbox TTS model.

Key features of the current Gradio application include:
- Single utterance TTS generation.
- Multi-speaker dialog generation with configurable silence gaps.
- Speaker management (adding/removing speakers with custom audio samples).
- Automatic memory optimization (model cleanup after generation).
- Organized output file storage (`single_output/` and `dialog_output/`).
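
The multi-speaker dialog feature listed above can be pictured as concatenating per-line audio with a fixed stretch of silence between consecutive lines. The following is a minimal, dependency-free sketch of that idea. The function name `join_with_silence`, the 24 kHz default sample rate, and the use of plain Python lists for samples are assumptions made for illustration; none of them are taken from `gradio_app.py`.

```python
from typing import List, Sequence


def join_with_silence(
    clips: Sequence[Sequence[float]],
    sample_rate: int = 24_000,  # assumed sample rate, not confirmed by the source
    gap_seconds: float = 0.5,   # the configurable silence gap between lines
) -> List[float]:
    """Concatenate audio clips, inserting silence between consecutive clips."""
    if gap_seconds < 0:
        raise ValueError("gap_seconds must be non-negative")

    # A run of zero-valued samples represents the silence gap.
    gap = [0.0] * int(round(sample_rate * gap_seconds))

    joined: List[float] = []
    for index, clip in enumerate(clips):
        if index > 0:
            # Only add silence between clips, never before the first one.
            joined.extend(gap)
        joined.extend(clip)
    return joined
```

For example, joining two clips at a toy sample rate of 4 Hz with a 0.5 second gap places two zero samples between them. A real implementation would more likely operate on NumPy arrays or tensors, but the structure would be the same.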

## 2. Project Goal: Migration to Modern Web Stack

The primary goal of this project is to re-implement the Chatterbox TTS application, specifically its dialog generation capabilities, by migrating from the current Gradio framework to a new architecture.

The new architecture will consist of:
- **Frontend**: Vanilla JavaScript
- **Backend**: FastAPI (Python)

The migration addresses known limitations of the Gradio framework (audio playback issues, limited UI control, and state management complexity) and aims to provide a more robust, performant, and professional user experience.
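
To make the backend half of this split more concrete, here is a hedged sketch of the kind of payload a dialog-generation endpoint might accept from the JavaScript frontend. It uses plain dataclasses so that it runs without dependencies; in a FastAPI service these would typically be expressed as Pydantic models instead. All names (`DialogLineRequest`, `DialogRequest`, `parse_dialog_request`, `speaker_id`, `silence_gap_seconds`) are hypothetical and are not drawn from the existing plan.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class DialogLineRequest:
    speaker_id: str  # hypothetical identifier for a registered speaker
    text: str        # the line of dialog to synthesize


@dataclass
class DialogRequest:
    lines: List[DialogLineRequest] = field(default_factory=list)
    silence_gap_seconds: float = 0.5  # assumed default gap between lines


def parse_dialog_request(payload: Dict[str, Any]) -> DialogRequest:
    """Validate a decoded JSON body and convert it into a DialogRequest."""
    raw_lines = payload.get("lines")
    if not isinstance(raw_lines, list) or not raw_lines:
        raise ValueError("'lines' must be a non-empty list")

    lines: List[DialogLineRequest] = []
    for position, raw in enumerate(raw_lines):
        if not isinstance(raw, dict):
            raise ValueError(f"line {position} must be an object")
        speaker_id = str(raw.get("speaker_id", "")).strip()
        text = str(raw.get("text", "")).strip()
        if not speaker_id or not text:
            raise ValueError(f"line {position} needs 'speaker_id' and 'text'")
        lines.append(DialogLineRequest(speaker_id=speaker_id, text=text))

    gap = float(payload.get("silence_gap_seconds", 0.5))
    if gap < 0:
        raise ValueError("'silence_gap_seconds' must be non-negative")
    return DialogRequest(lines=lines, silence_gap_seconds=gap)
```

Keeping validation in one place like this means the frontend can hold its own UI state and send a complete, self-describing request, which is one way to sidestep the state management complexity mentioned above.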

## 3. High-Level Plan & Existing Documentation

A comprehensive implementation plan for this migration already exists and should be consulted. This plan (Memory ID c20c2cce-46d4-453f-9bc3-c18e05dbc66f) outlines:
- A 4-phase implementation (Backend API, Frontend Development, Integration & Testing, Production Features).
- The complete technical architecture.
- A detailed component system (DialogLine, AudioPlayer, ProjectManager).
- Features like real-time status updates and drag-and-drop functionality.
- Migration strategies.
- Expected benefits (e.g., faster responsiveness, better audio reliability).
- An estimated timeline.

## 4. Scope of Current Work

The immediate next steps, as requested by the user, are to:
1. Review the existing `gradio_app.py`.
2. Refine or detail the plan for re-implementing the dialog generation functionality on the new stack, building on the existing comprehensive plan.

This document will be updated as the project progresses to reflect new decisions, architectural changes, and milestones.