# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Common Development Commands

### Backend (FastAPI)

```bash
# Install backend dependencies (run from project root)
pip install -r backend/requirements.txt

# Run backend development server (run from project root)
uvicorn backend.app.main:app --reload --host 0.0.0.0 --port 8000

# Run backend tests
python backend/run_api_test.py

# Backend API accessible at http://127.0.0.1:8000
# API docs at http://127.0.0.1:8000/docs
```

### Frontend Testing

```bash
# Run frontend tests
npm test
```

### Alternative Interfaces

```bash
# Run Gradio interface (standalone TTS app)
python gradio_app.py
```

## Architecture Overview

This is a full-stack TTS (Text-to-Speech) application with three components:

1. **Modern web frontend** (vanilla JS) - Interactive dialog editor in `frontend/`
2. **FastAPI backend** - REST API in `backend/`
3. **Gradio interface** - Alternative UI in `gradio_app.py`

### Frontend-Backend Communication

- **Frontend**: Vanilla JS (ES6 modules) served on port 8001
- **Backend**: FastAPI serving on port 8000
- **API Base**: `http://localhost:8000/api`
- **CORS**: Configured on the backend so the frontend origin can call the API
- **File Serving**: Generated audio served via `/generated_audio/` endpoint

### Key API Endpoints

- `/api/speakers/` - Speaker CRUD operations
- `/api/dialog/generate/` - Full dialog generation
- `/api/dialog/generate_line/` - Single line generation
- `/generated_audio/` - Static audio file serving
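
As a quick reference, the endpoints above can be collected into one small Python helper, for example when writing a script against the API. This is a minimal sketch: the paths and base URL come from this file, while `ENDPOINTS`, `audio_url`, and the sample filename are hypothetical names introduced only for illustration.

```python
from urllib.parse import quote

# Base URLs as documented in this file.
BACKEND_ROOT = "http://localhost:8000"
API_BASE = f"{BACKEND_ROOT}/api"

# Hypothetical lookup table; the keys are illustrative, the paths are from this file.
ENDPOINTS = {
    "speakers": f"{API_BASE}/speakers/",
    "dialog_generate": f"{API_BASE}/dialog/generate/",
    "dialog_generate_line": f"{API_BASE}/dialog/generate_line/",
    "generated_audio": f"{BACKEND_ROOT}/generated_audio/",
}


def audio_url(filename: str) -> str:
    """Build a URL for a generated audio file (filename is an assumption)."""
    return ENDPOINTS["generated_audio"] + quote(filename)
```

Note that `/generated_audio/` sits at the backend root rather than under `/api`.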

### Backend Service Architecture

Located in `backend/app/services/`:

- **TTSService**: Chatterbox TTS model lifecycle management
- **SpeakerManagementService**: Speaker data and sample management
- **DialogProcessorService**: Dialog script to audio processing
- **AudioManipulationService**: Audio concatenation and ZIP creation
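
To illustrate the kind of work AudioManipulationService is described as doing, here is a standard-library sketch of WAV concatenation and ZIP creation. It is not the service's actual code: the function names, the 24 kHz mono 16-bit format, and the archive layout are all assumptions made for the example.

```python
import io
import wave
import zipfile


def make_silence(seconds: float, sample_rate: int = 24000) -> bytes:
    """Return WAV bytes of mono 16-bit silence (format is an assumption)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


def concatenate_wavs(segments: list[bytes]) -> bytes:
    """Join WAV byte strings that share channels, sample width, and rate."""
    if not segments:
        raise ValueError("at least one segment is required")
    out = io.BytesIO()
    params = None
    with wave.open(out, "wb") as w:
        for seg in segments:
            with wave.open(io.BytesIO(seg), "rb") as r:
                if params is None:
                    params = r.getparams()
                    w.setparams(params)
                elif r.getparams()[:3] != params[:3]:
                    raise ValueError("segments must share channels, width, and rate")
                w.writeframes(r.readframes(r.getnframes()))
    return out.getvalue()


def zip_outputs(files: dict[str, bytes]) -> bytes:
    """Pack named byte payloads into an in-memory ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()
```

Silence items map naturally onto `make_silence`, so a dialog with pauses can be assembled from speech segments and generated silence in a single pass.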

### Frontend Architecture

- **Modular design**: `api.js` (API layer) + `app.js` (app logic)
- **No framework**: Modern vanilla JavaScript with ES6+ features
- **Interactive editor**: Table-based dialog creation with drag-drop reordering

### Data Flow

1. User creates dialog in frontend table editor
2. Frontend sends dialog items to `/api/dialog/generate/`
3. Backend processes speech/silence items via services
4. TTS generates audio for each speech item, and the segments are concatenated
5. A ZIP archive is created with all outputs
6. Frontend receives URLs for playback/download
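
Step 2 of this flow can be sketched from a Python client's point of view. The endpoint path is from this file, but the request schema is not documented here, so every field name below (`output_base_name`, `dialog_items`, `type`, `speaker_id`, `text`, `duration`) is an assumption; check the backend's request models or `/docs` for the real shape.

```python
import json
from urllib.request import Request

API_BASE = "http://localhost:8000/api"  # API base documented in this file


def build_generate_request(items: list[dict], output_base_name: str) -> Request:
    """Build (but do not send) a POST to the dialog generation endpoint.

    The payload keys are hypothetical; only the URL comes from this file.
    """
    payload = {"output_base_name": output_base_name, "dialog_items": items}
    return Request(
        f"{API_BASE}/dialog/generate/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical dialog: one speech item followed by half a second of silence.
items = [
    {"type": "speech", "speaker_id": "speaker_1", "text": "Hello there."},
    {"type": "silence", "duration": 0.5},
]
request = build_generate_request(items, "demo_dialog")
```

Passing `request` to `urllib.request.urlopen` would send it to a running backend; the response is expected to carry the URLs mentioned in step 6.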

### Speaker Configuration

- **Location**: `speaker_data/speakers.yaml` and `speaker_data/speaker_samples/`
- **Format**: YAML config referencing WAV audio samples
- **Management**: Through both the API endpoints and the configuration files
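
Because samples are referenced as WAV files, a quick pre-flight check can catch files that are mislabeled or corrupt before the backend loads them. The sketch below uses only the standard library; `is_wav` and `invalid_samples` are hypothetical helpers, not part of this repository. The `wave` module mainly understands PCM data, so some otherwise valid WAV encodings may be reported as invalid.

```python
import wave
from pathlib import Path

# Sample directory as documented in this file.
SAMPLES_DIR = Path("speaker_data/speaker_samples")


def is_wav(path) -> bool:
    """Return True if the stdlib wave module can open the file."""
    try:
        with wave.open(str(path), "rb") as w:
            return w.getnframes() >= 0
    except (wave.Error, EOFError, OSError):
        return False


def invalid_samples(directory=SAMPLES_DIR) -> list[str]:
    """List file names in the directory that do not parse as WAV."""
    return sorted(
        p.name for p in Path(directory).iterdir() if p.is_file() and not is_wav(p)
    )
```

Calling `invalid_samples()` from the project root returns an empty list when every sample parses.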

### Output Organization

- `dialog_output/` - Generated dialog files
- `single_output/` - Single utterance outputs
- `tts_outputs/` - Raw TTS generation files
- Generated ZIPs contain an organized file structure
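
If a script needs these directories before the backend has run, they can be created up front. This is a minimal sketch: the directory names are from the list above, while `ensure_output_dirs` is a hypothetical helper, and the backend may already create these directories itself.

```python
from pathlib import Path

# Output directory names as documented in this file.
OUTPUT_DIRS = ("dialog_output", "single_output", "tts_outputs")


def ensure_output_dirs(project_root=".") -> list[Path]:
    """Create the output directories under project_root if they are missing."""
    root = Path(project_root)
    paths = []
    for name in OUTPUT_DIRS:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        paths.append(path)
    return paths
```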

## Development Setup Notes

- Python virtual environment expected at project root (`.venv`)
- Backend commands run from project root, not `backend/` directory
- Frontend served separately (typically port 8001)
- Speaker samples must be WAV format in `speaker_data/speaker_samples/`
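
This file does not say how the frontend is served. Since browsers load ES6 modules over HTTP rather than from `file://` URLs, one option is Python's built-in static file server. The sketch below assumes the `frontend/` directory and port 8001 mentioned above; `make_frontend_server` is a hypothetical helper, not a script in this repository.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_frontend_server(directory="frontend", port=8001) -> ThreadingHTTPServer:
    """Create (but do not start) a static file server for the frontend."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    # Bind to loopback only; this is intended for local development.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_frontend_server().serve_forever()` from the project root starts the server; the command-line equivalent is `python -m http.server 8001 --directory frontend`.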