Update docs in .note

Steve White 2025-06-05 09:22:54 -05:00
parent b781d8abcf
commit 9d1dc330ea
2 changed files with 68 additions and 67 deletions

@@ -15,5 +15,6 @@
- Awaiting your feedback on the detailed migration plan (see `.note/detailed_migration_plan.md`).
**Next Steps (pending your approval of plan):**
- Begin Phase 1: Backend API Development (FastAPI).
- Task 1.1: Project Setup (FastAPI project structure, `requirements.txt`).

@@ -2,7 +2,7 @@
This plan outlines the steps to re-implement the dialog generation features of the Chatterbox TTS application, moving from the current Gradio-based implementation to a FastAPI backend and a vanilla JavaScript frontend. It incorporates findings from `gradio_app.py` and aligns with the existing high-level strategy (MEMORY[c20c2cce-46d4-453f-9bc3-c18e05dbc66f]).
### 1. Backend (FastAPI) Development
## 1. Backend (FastAPI) Development
**Objective:** Create a robust API to handle TTS generation, speaker management, and file delivery.
@@ -44,7 +44,7 @@ This plan outlines the steps to re-implement the dialog generation features of t
7. **Configuration:** Manage paths (`speakers.yaml`, sample storage, output directories) and TTS settings.
8. **Testing:** Thoroughly test all API endpoints using tools like Postman or `curl`.
### 2. Frontend (Vanilla JavaScript) Development
## 2. Frontend (Vanilla JavaScript) Development
**Objective:** Create an intuitive UI for dialog construction, speaker management, and interaction with the backend.
@@ -52,7 +52,7 @@ This plan outlines the steps to re-implement the dialog generation features of t
* **HTML (`index.html`):** Structure for dialog editor, speaker controls, results display.
* **CSS (`style.css`):** Styling for a clean and usable interface.
* **JavaScript (`app.js`, `api.js`, `ui.js`):
* **JavaScript (`app.js`, `api.js`, `ui.js`):**
* `api.js`: Functions for all backend API communications (`fetch`).
* `ui.js`: DOM manipulation for dynamic dialog lines, speaker lists, and results rendering.
* `app.js`: Main application logic, event handling, state management (for dialog lines, speaker data).
@@ -72,20 +72,20 @@ This plan outlines the steps to re-implement the dialog generation features of t
* "Generate Dialog" button to submit data via `api.js`.
* Display generation log, audio player for concatenated output, and download link for ZIP file.
### 3. Integration & Testing (Phase 3)
## 3. Integration & Testing (Phase 3)
1. **Full System Connection:** Ensure seamless frontend-backend communication.
2. **End-to-End Testing:** Test various dialog scenarios, speaker configurations, and error conditions.
3. **Performance & Memory:** Profile backend memory usage during generation; refine `TTSService` memory strategies if needed.
4. **UX Refinement:** Iterate on UI/UX based on testing feedback.
### 4. Advanced Features & Deployment (Phase 4)
## 4. Advanced Features & Deployment (Phase 4)
* (As per MEMORY[c20c2cce-46d4-453f-9bc3-c18e05dbc66f])
* **Real-time Updates:** Consider WebSockets for live progress during generation.
* **Deployment Strategy:** Plan for deploying the FastAPI application and serving the static frontend assets.
### Key Considerations from `gradio_app.py` Analysis:
## Key Considerations from `gradio_app.py` Analysis
* **Memory Management for TTS Model:** This is critical. The `reinit_each_line` option and explicit cleanup in `generate_audio` highlight this. The FastAPI backend must handle this robustly.
* **Text Chunking:** The `split_text_at_sentence_boundaries` (max 300 chars) logic is essential and must be replicated.
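
The text chunking consideration above could be sketched roughly as follows. This is a minimal illustration, not the logic from `gradio_app.py`: the function name `chunk_text_by_sentence`, the regex used to detect sentence ends, and the hard-split fallback for oversized sentences are all assumptions. Only the 300-character limit comes from the plan.

```python
import re


def chunk_text_by_sentence(text: str, max_len: int = 300) -> list[str]:
    """Group sentences into chunks of at most max_len characters.

    Hypothetical helper: the real split_text_at_sentence_boundaries
    in gradio_app.py may behave differently.
    """
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # Fallback (assumption): hard-split a single sentence that exceeds
        # max_len on its own. This may cut in the middle of a word.
        while len(sentence) > max_len:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_len])
            sentence = sentence[max_len:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_len:
            # The sentence still fits in the chunk being built.
            current = candidate
        else:
            # Close the current chunk and start a new one with this sentence.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be passed to the TTS model separately, which keeps individual generation calls under the length limit noted above.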