Chatterbox TTS Migration: Backend Development (FastAPI)

Primary Goal: Implement the FastAPI backend for TTS dialog generation.

Recent Accomplishments (Phase 1, Step 2 - Speaker Management):

  • Created Pydantic models for speaker data (speaker_models.py).
  • Implemented SpeakerManagementService (speaker_service.py) for CRUD operations on speakers (metadata in speakers.yaml, samples in speaker_samples/).
  • Created FastAPI router (routers/speakers.py) with endpoints: GET /api/speakers, POST /api/speakers, GET /api/speakers/{id}, DELETE /api/speakers/{id}.
  • Integrated speaker router into the main FastAPI app (main.py).
  • Successfully tested all speaker API endpoints using curl.
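
As a rough illustration of the CRUD semantics behind those four endpoints, here is a framework-free sketch. It is not the code in speaker_service.py: the names `Speaker`, `SpeakerNotFound`, and `InMemorySpeakerStore` and the fields `name` and `sample_path` are hypothetical, and persistence to speakers.yaml is left out so the example runs on the standard library alone.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Speaker:
    """Hypothetical speaker record; field names are assumptions."""
    id: str
    name: str
    sample_path: str  # assumed to point at a file under speaker_samples/


class SpeakerNotFound(KeyError):
    """Raised for unknown ids; a router would typically map this to HTTP 404."""


class InMemorySpeakerStore:
    """Keeps speakers in a dict; the real service stores metadata on disk."""

    def __init__(self) -> None:
        self._speakers: Dict[str, Speaker] = {}

    def list_speakers(self) -> List[Speaker]:
        # Corresponds to GET /api/speakers
        return list(self._speakers.values())

    def add_speaker(self, name: str, sample_path: str) -> Speaker:
        # Corresponds to POST /api/speakers
        speaker = Speaker(id=uuid.uuid4().hex, name=name, sample_path=sample_path)
        self._speakers[speaker.id] = speaker
        return speaker

    def get_speaker(self, speaker_id: str) -> Speaker:
        # Corresponds to GET /api/speakers/{id}
        try:
            return self._speakers[speaker_id]
        except KeyError:
            raise SpeakerNotFound(speaker_id) from None

    def delete_speaker(self, speaker_id: str) -> None:
        # Corresponds to DELETE /api/speakers/{id}
        if self._speakers.pop(speaker_id, None) is None:
            raise SpeakerNotFound(speaker_id)
```

Raising a dedicated exception from the service keeps HTTP status codes out of the storage layer, so the router alone decides how a missing speaker is reported.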

Current Task (Phase 1, Step 3 - TTS Core):

  • Develop TTSService in backend/app/services/tts_service.py.
    • Focus on ChatterboxTTS model loading, inference, and memory management (the critical concern).
    • Define methods for speech generation using speaker samples.
    • Manage TTS parameters (exaggeration, cfg_weight, temperature).
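
One possible shape for the service described above, offered as a sketch and not as the planned implementation. The class layout, the `TTSParams` defaults, and the injectable `loader` argument are assumptions. The calls `ChatterboxTTS.from_pretrained(device=...)` and `generate(..., audio_prompt_path=...)` reflect our understanding of the chatterbox package and should be checked against the installed version.

```python
import gc
from dataclasses import asdict, dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class TTSParams:
    """The three parameters named in the notes; default values are assumptions."""
    exaggeration: float = 0.5
    cfg_weight: float = 0.5
    temperature: float = 0.8

    def __post_init__(self) -> None:
        for name, value in asdict(self).items():
            if value < 0:
                raise ValueError(f"{name} must be non-negative, got {value}")


def _load_chatterbox(device: str) -> Any:
    # Imported lazily so this module can be imported without the model installed.
    from chatterbox.tts import ChatterboxTTS  # assumed import path
    return ChatterboxTTS.from_pretrained(device=device)


class TTSService:
    """Loads the model on first use and releases it on request."""

    def __init__(self, device: str = "cpu",
                 loader: Callable[[str], Any] = _load_chatterbox) -> None:
        self._device = device
        self._loader = loader
        self._model: Optional[Any] = None

    @property
    def is_loaded(self) -> bool:
        return self._model is not None

    def load_model(self) -> None:
        if self._model is None:
            self._model = self._loader(self._device)

    def unload_model(self) -> None:
        # Drop our reference, then ask Python (and CUDA, if present) to free memory.
        self._model = None
        gc.collect()
        try:
            import torch
        except ImportError:
            return
        if torch.cuda.is_available():
            torch.cuda.empty_cache()

    def generate(self, text: str, speaker_sample_path: str,
                 params: Optional[TTSParams] = None) -> Any:
        self.load_model()
        kwargs = asdict(params or TTSParams())
        return self._model.generate(text, audio_prompt_path=speaker_sample_path, **kwargs)
```

Passing the loader in as a callable lets tests substitute a lightweight fake model, so the loading and unloading logic can be exercised without a GPU or the real weights.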

Next Immediate Steps:

  1. Finalize and test the initial implementation of TTSService.
  2. Proceed to Phase 1, Step 4: Dialog Processing - Implement DialogProcessorService including text splitting logic.
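
The text splitting logic in step 2 is not specified yet. One minimal approach, sketched here under assumptions, is to pack whole sentences into chunks up to a character limit and to hard-wrap any single sentence that exceeds it. The function name `split_text`, the `max_chars` parameter, and its default of 300 are hypothetical.

```python
import re
from typing import List

# Split after ., ! or ? when followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_text(text: str, max_chars: int = 300) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    chunks: List[str] = []
    current = ""
    for sentence in _SENTENCE_END.split(text.strip()):
        if not sentence:
            continue
        # A single sentence longer than the limit is wrapped at the last space
        # that fits, or cut mid-word if it contains no usable space.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            cut = sentence.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars
            chunks.append(sentence[:cut].rstrip())
            sentence = sentence[cut:].lstrip()
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

For example, `split_text("Hello there. How are you? Fine!", max_chars=20)` returns `["Hello there.", "How are you? Fine!"]`: the second and third sentences fit together within 20 characters, while the first and second do not.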