# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Common Development Commands

### Backend (FastAPI)

```bash
# Install backend dependencies (run from project root)
pip install -r backend/requirements.txt

# Run backend development server (run from project root)
uvicorn backend.app.main:app --reload --host 0.0.0.0 --port 8000

# Run backend tests
python backend/run_api_test.py

# Backend API accessible at http://127.0.0.1:8000
# API docs at http://127.0.0.1:8000/docs
```

### Frontend Testing

```bash
# Run frontend tests
npm test
```

### Alternative Interfaces

```bash
# Run Gradio interface (standalone TTS app)
python gradio_app.py
```

## Architecture Overview

This is a full-stack TTS (Text-to-Speech) application with three interfaces:

1. **Modern web frontend** (vanilla JS) - Interactive dialog editor at `frontend/`
2. **FastAPI backend** - REST API at `backend/`
3. **Gradio interface** - Alternative UI in `gradio_app.py`

### Frontend-Backend Communication

- **Frontend**: Vanilla JS (ES6 modules) serving on port 8001
- **Backend**: FastAPI serving on port 8000
- **API Base**: `http://localhost:8000/api`
- **CORS**: Configured for frontend communication
- **File Serving**: Generated audio served via `/generated_audio/` endpoint

### Key API Endpoints

- `/api/speakers/` - Speaker CRUD operations
- `/api/dialog/generate/` - Full dialog generation
- `/api/dialog/generate_line/` - Single line generation
- `/generated_audio/` - Static audio file serving

### Backend Service Architecture

Located in `backend/app/services/`:

- **TTSService**: Chatterbox TTS model lifecycle management
- **SpeakerManagementService**: Speaker data and sample management
- **DialogProcessorService**: Dialog script to audio processing
- **AudioManipulationService**: Audio concatenation and ZIP creation

### Frontend Architecture

- **Modular design**: `api.js` (API layer) + `app.js` (app logic)
- **No framework**: Modern vanilla JavaScript with ES6+ features
- **Interactive editor**: Table-based dialog creation with drag-drop reordering

### Data Flow

1. User creates dialog in frontend table editor
2. Frontend sends dialog items to `/api/dialog/generate/`
3. Backend processes speech/silence items via services
4. TTS generates audio, segments concatenated
5. ZIP archive created with all outputs
6. Frontend receives URLs for playback/download

### Speaker Configuration

- **Location**: `speaker_data/speakers.yaml` and `speaker_data/speaker_samples/`
- **Format**: YAML config referencing WAV audio samples
- **Management**: Both API endpoints and file-based configuration

### Output Organization

- `dialog_output/` - Generated dialog files
- `single_output/` - Single utterance outputs
- `tts_outputs/` - Raw TTS generation files
- Generated ZIPs contain organized file structure

## Development Setup Notes

- Python virtual environment expected at project root (`.venv`)
- Backend commands run from project root, not `backend/` directory
- Frontend served separately (typically port 8001)
- Speaker samples must be WAV format in `speaker_data/speaker_samples/`
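
## Example: Building a Dialog Generation Request

As a rough illustration of the API base, the listed endpoints, and step 2 of the data flow, the sketch below assembles the URL and JSON body a client might send to `/api/dialog/generate/`. The endpoint paths and API base come from this file. Everything about the payload is an assumption made for illustration: the `dialog_items` key, the `type`, `speaker_id`, `text`, and `duration` fields, and the helper names are hypothetical, so check the request models under `backend/` or the API docs at `http://127.0.0.1:8000/docs` before relying on them.

```python
import json
from urllib.parse import urljoin

# API base from this document; the trailing slash lets urljoin append paths.
API_BASE = "http://localhost:8000/api/"

# Endpoint paths listed under "Key API Endpoints".
ENDPOINTS = {
    "speakers": "speakers/",
    "generate_dialog": "dialog/generate/",
    "generate_line": "dialog/generate_line/",
}


def endpoint_url(name: str) -> str:
    """Return the absolute URL for a named endpoint."""
    return urljoin(API_BASE, ENDPOINTS[name])


def speech_item(speaker_id: str, text: str) -> dict:
    # Field names here are hypothetical, not the backend's confirmed schema.
    return {"type": "speech", "speaker_id": speaker_id, "text": text}


def silence_item(duration: float) -> dict:
    # "duration" (in seconds) is likewise an assumed field name.
    return {"type": "silence", "duration": duration}


def build_dialog_request(items: list[dict]) -> tuple[str, bytes]:
    """Return the target URL and JSON body for a full dialog generation call."""
    # The top-level "dialog_items" key is an assumption for this sketch.
    body = json.dumps({"dialog_items": items}).encode("utf-8")
    return endpoint_url("generate_dialog"), body


if __name__ == "__main__":
    url, body = build_dialog_request([
        speech_item("speaker_1", "Hello there."),
        silence_item(0.5),
    ])
    print(url)
    print(body.decode("utf-8"))
```

The sketch stops short of sending the request, so it runs without the backend. With the server started as described under Common Development Commands, the URL and body could be passed to any HTTP client as a POST with a `Content-Type: application/json` header, once the real schema is confirmed.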