Compare commits: dev-single...main (14 commits)

Commits: 733c9d1b5f, 9c605cd3a0, d3ac8bf4eb, 75a2a37252, b28a9bcf58, 4f47d69aaa, f095bb14e5, 93e0407eac, c9593fe6cc, cbc164c7a3, 41f95cdee3, b62eb0211f, 948712bb3f, aeb0f7b638

@@ -22,3 +22,4 @@ backend/tts_generated_dialogs/

 # Node.js dependencies
 node_modules/
+.aider*

@@ -0,0 +1,188 @@

# Chatterbox TTS Backend: Bounded Concurrency + File I/O Offload Plan

Date: 2025-08-14
Owner: Backend
Status: Proposed (ready to implement)

## Goals

- Increase GPU utilization and reduce wall-clock time for dialog generation.
- Keep the model lifecycle stable (leveraging the current `ModelManager`).
- Minimal-risk changes: no API shape changes for clients.

## Scope

- Implement bounded concurrency for per-line speech chunk generation within a single dialog request.
- Offload audio file writes to threads to overlap GPU compute and disk I/O.
- Add configuration knobs to tune concurrency.

## Current State (References)

- `backend/app/services/dialog_processor_service.py`
  - `DialogProcessorService.process_dialog()` iterates items and awaits `tts_service.generate_speech(...)` sequentially (lines ~171–201).
- `backend/app/services/tts_service.py`
  - `TTSService.generate_speech()` runs the TTS forward pass and calls `torchaudio.save(...)` on the event loop thread (blocking).
- `backend/app/services/model_manager.py`
  - `ModelManager.using()` tracks active work and prevents idle eviction during requests.
- `backend/app/routers/dialog.py`
  - `process_dialog_flow()` expects ordered `segment_files` and then concatenates; keep this order stable.

## Design Overview

1) Bounded concurrency at the dialog level

- Plan all output segments with a stable `segment_idx` (including speech chunks, silence, and reused audio).
- For speech chunks, schedule concurrent async tasks gated by a global semaphore sized by the config setting `TTS_MAX_CONCURRENCY` (start at 3–4).
- Await all tasks and collate results by `segment_idx` to preserve order.

2) File I/O offload

- Replace direct `torchaudio.save(...)` with `await asyncio.to_thread(torchaudio.save, ...)` in `TTSService.generate_speech()`.
- This lets the next GPU forward pass start while previous file writes happen on worker threads.

## Configuration

Add to `backend/app/config.py`:

- `TTS_MAX_CONCURRENCY: int` (default: `int(os.getenv("TTS_MAX_CONCURRENCY", "3"))`).
- Optional (future): `TTS_ENABLE_AMP_ON_CUDA: bool = True` to allow mixed precision on CUDA only.
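The two knobs above might look like this in `backend/app/config.py` (a sketch; the string-comparison parse for the boolean is a suggested convention, not existing code):

```python
import os

# Maximum number of per-chunk TTS generations allowed in flight at once.
# Low default (3) to limit VRAM pressure; raise cautiously after profiling.
TTS_MAX_CONCURRENCY: int = int(os.getenv("TTS_MAX_CONCURRENCY", "3"))

# Future knob: enable mixed precision only when running on CUDA.
# Note: a naive bool(os.getenv(...)) would treat "false" as truthy,
# so compare against accepted string values instead.
TTS_ENABLE_AMP_ON_CUDA: bool = (
    os.getenv("TTS_ENABLE_AMP_ON_CUDA", "true").lower() in ("1", "true", "yes")
)
```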

## Implementation Steps

### A. Dialog-level concurrency

- File: `backend/app/services/dialog_processor_service.py`
- Function: `DialogProcessorService.process_dialog()`

1. Planning pass to assign indices

- Iterate `dialog_items` and build a list of `planned_segments` entries:
  - For silence or reuse: immediately append a final result with the assigned `segment_idx` and continue.
  - For speech: split into `text_chunks`; for each chunk create a planned entry: `{ segment_idx, type: 'speech', speaker_id, text_chunk, abs_speaker_sample_path, tts_params }`.
- Increment `segment_idx` for every planned segment (speech chunk or silence/reuse) to preserve final order.
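A sketch of the planning pass (item shapes and field names beyond those listed above are assumptions for illustration):

```python
def plan_segments(dialog_items, split_text):
    """Assign a stable segment_idx to every output segment up front."""
    planned_speech = []   # entries that still need TTS
    final_results = {}    # segment_idx -> already-final payloads (silence/reuse)
    segment_idx = 0
    for item in dialog_items:
        if item["type"] in ("silence", "reuse"):
            final_results[segment_idx] = {**item, "segment_idx": segment_idx}
            segment_idx += 1
            continue
        # Speech: one planned entry per text chunk, each with its own index.
        for chunk in split_text(item["text"]):
            planned_speech.append({
                "segment_idx": segment_idx,
                "type": "speech",
                "speaker_id": item["speaker_id"],
                "text_chunk": chunk,
            })
            segment_idx += 1
    return planned_speech, final_results

items = [
    {"type": "speech", "speaker_id": "s1", "text": "hello world"},
    {"type": "silence", "duration": 0.5},
    {"type": "speech", "speaker_id": "s2", "text": "bye"},
]
speech, finals = plan_segments(items, split_text=lambda t: t.split())
# speech chunks hold indices 0, 1 and 3; the silence holds index 2
```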

2. Concurrency setup

- Create `sem = asyncio.Semaphore(config.TTS_MAX_CONCURRENCY)`.
- For each planned speech segment, create a task with an inner wrapper:
```python
async def run_one(planned):
    async with sem:
        try:
            out_path = await self.tts_service.generate_speech(
                text=planned.text_chunk,
                speaker_sample_path=planned.abs_speaker_sample_path,
                output_filename_base=planned.filename_base,
                output_dir=dialog_temp_dir,
                exaggeration=planned.exaggeration,
                cfg_weight=planned.cfg_weight,
                temperature=planned.temperature,
            )
            return planned.segment_idx, {
                "type": "speech",
                "path": str(out_path),
                "speaker_id": planned.speaker_id,
                "text_chunk": planned.text_chunk,
            }
        except Exception as e:
            return planned.segment_idx, {
                "type": "error",
                "message": f"Error generating speech: {e}",
                "text_chunk": planned.text_chunk,
            }
```

- Schedule with `asyncio.create_task(run_one(p))` and collect the tasks.

3. Await and collate

- `results_map = {}`; for each completed task, set `results_map[idx] = payload`.
- Merge: start with all previously finalized (silence/reuse/error) entries placed by `segment_idx`, then fill speech results by `segment_idx` into a single `segment_results` list sorted ascending by index.
- Keep `processing_log` entries for each planned segment (queued, started, finished, errors).
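The merge step, sketched independently (assumes a dict of already-final entries and `(segment_idx, payload)` pairs from the TTS tasks, as described above):

```python
def collate(final_results, speech_results):
    """Merge already-final entries with completed speech tasks, ordered by index.

    final_results: dict of segment_idx -> payload (silence/reuse/early errors)
    speech_results: iterable of (segment_idx, payload) pairs from the TTS tasks
    """
    merged = dict(final_results)
    for idx, payload in speech_results:
        merged[idx] = payload
    # Sorting by index restores the planned order no matter when tasks finished.
    return [merged[i] for i in sorted(merged)]

segment_files = collate(
    {1: {"type": "silence"}},
    [(0, {"type": "speech", "path": "a.wav"}),
     (2, {"type": "speech", "path": "b.wav"})],
)
# segment_files is ordered: speech "a.wav", silence, speech "b.wav"
```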

4. Return value unchanged

- Return `{"log": ..., "segment_files": segment_results, "temp_dir": str(dialog_temp_dir)}`. This maintains router and concatenator behavior.

### B. Offload audio writes

- File: `backend/app/services/tts_service.py`
- Function: `TTSService.generate_speech()`

1. After obtaining the `wav` tensor, replace:
```python
torchaudio.save(str(output_file_path), wav, self.model.sr)
```

with:

```python
await asyncio.to_thread(torchaudio.save, str(output_file_path), wav, self.model.sr)
```

- Keep the rest of the cleanup logic (deleting `wav`, `gc.collect()`, cache emptying) unchanged.

2. Optional (CUDA-only AMP)

- If CUDA is used and `config.TTS_ENABLE_AMP_ON_CUDA` is true, wrap the forward pass with AMP:

```python
with torch.cuda.amp.autocast(dtype=torch.float16):
    wav = self.model.generate(...)
```

- Leave the MPS/CPU code paths as-is.

## Error Handling & Ordering

- Every planned segment owns a unique `segment_idx`.
- On failure, insert an error record at that index; downstream concatenation already skips missing/nonexistent paths.
- Preserve the exact output order expected by `routers/dialog.py::process_dialog_flow()`.

## Performance Expectations

- GPU utilization should increase from ~50% to 75–90%, depending on dialog size and line lengths.
- Wall-clock reduction is workload-dependent; target 1.5–2.5x on multi-line dialogs.

## Metrics & Instrumentation

- Add timestamped log entries per segment: planned → queued → started → saved.
- Log effective concurrency (max in-flight) and cumulative GPU time if available.
- Optionally add a simple timing summary at the end of `process_dialog()`.

## Testing Plan

1. Unit-ish

- Small dialog (3 speech lines, 1 silence). Ensure ordering is stable and files exist.
- Introduce an invalid speaker to verify that error propagation doesn't break the rest.

2. Integration

- POST `/api/dialog/generate` with 20–50 mixed-length lines and a couple of silences.
- Validate: response OK, concatenated file exists, zip contains all generated speech segments, order preserved.
- Compare runtime vs. the sequential baseline (before/after).

3. Stress/limits

- Long lines split into many chunks; verify no OOM with `TTS_MAX_CONCURRENCY=3`.
- Try `TTS_MAX_CONCURRENCY=1` to simulate sequential operation; compare metrics.

## Rollout & Config Defaults

- Default `TTS_MAX_CONCURRENCY=3`.
- Expose via environment variable; no client changes needed.
- If instability is observed, set `TTS_MAX_CONCURRENCY=1` to revert to sequential behavior quickly.

## Risks & Mitigations

- OOM under high concurrency → mitigate with a low default, easy rollback, and the chunking already in place.
- Disk I/O saturation → offload to threads; if disk is a bottleneck, decrease concurrency.
- Model thread safety → `model.generate` is called concurrently only up to the semaphore cap; if the underlying library is not thread-safe for forward passes, consider serializing forwards while still overlapping with file I/O. Early logs will reveal this.

## Follow-up (Out of Scope for This Change)

- Dynamic batching queue inside `TTSService` for further GPU efficiency.
- CUDA AMP enablement and profiling.
- Per-speaker sub-queues if batching requires same-speaker inputs.

## Acceptance Criteria

- `TTS_MAX_CONCURRENCY` is configurable; default = 3.
- File writes occur via `asyncio.to_thread`.
- Order of `segment_files` is unchanged relative to sequential output.
- End-to-end works for both small and large dialogs; error cases are logged.
- Observed GPU utilization and runtime improve on a representative dialog.
@@ -0,0 +1,138 @@

# Frontend Review and Recommendations

Date: 2025-08-12T11:32:16-05:00
Scope: `frontend/` of the `chatterbox-test` monorepo

---

## Summary

- Static vanilla JS frontend served by `frontend/start_dev_server.py`, interacting with the FastAPI backend under `/api`.
- Solid feature set (speaker management, dialog editor, per-line generation, full dialog generation, save/load) with robust error handling.
- Key issues: inconsistent API trailing slashes, a Jest/babel-jest version mismatch, minor state duplication, `alert`/`confirm` UX, an overly dark border color, and a token embedded in the `package.json` repository URL.

---

## Findings

- **Framework/structure**
  - `frontend/` is static vanilla JS. Main files: `index.html`, `js/app.js`, `js/api.js`, `js/config.js`, `css/style.css`.
  - Dev server: `frontend/start_dev_server.py` (CORS, env-based port/host).

- **API client vs. backend routes (trailing slashes)**
  - Frontend `frontend/js/api.js` currently uses:
    - `getSpeakers()`: `${API_BASE_URL}/speakers/` (trailing).
    - `addSpeaker()`: `${API_BASE_URL}/speakers/` (trailing).
    - `deleteSpeaker()`: `${API_BASE_URL}/speakers/${speakerId}/` (trailing).
    - `generateLine()`: `${API_BASE_URL}/dialog/generate_line`.
    - `generateDialog()`: `${API_BASE_URL}/dialog/generate`.
  - Backend routes:
    - `backend/app/routers/speakers.py`: `GET/POST /` and `DELETE /{speaker_id}` (no trailing slash on delete when prefixed under `/api/speakers`).
    - `backend/app/routers/dialog.py`: `/generate_line` and `/generate` (match the frontend).
  - Tests in `frontend/tests/api.test.js` expect no trailing slashes for `/speakers` and `/speakers/{id}`.
  - Implication: inconsistent trailing slashes can cause test failures and possible 404s on delete.

- **Payload schema inconsistencies**
  - The `generateDialog()` JSDoc shows `silence` as `{ duration_ms: 500 }`, but the backend expects `duration` (seconds). The UI also uses `duration` in seconds.

- **Form fields alignment**
  - Speaker add uses `name` and `audio_file`, which match the backend (`Form` and `File`).

- **State management duplication in `frontend/js/app.js`**
  - `dialogItems` and `availableSpeakersCache` are defined at module scope and again inside `initializeDialogEditor()`, creating shadowing risk. Consolidate to a single source of truth.

- **UX considerations**
  - Heavy use of `alert()`/`confirm()`. Prefer inline notifications/banners and per-row error chips (`item.error` is already rendered).
  - Add global loading/disabled states for long actions (e.g., full dialog generation, speaker add/delete).

- **CSS theme issue**
  - `--border-light` is `#1b0404` (dark red); semantically, a light gray fits better and improves contrast harmony.

- **Testing/Jest/Babel config**
  - The root `package.json` uses `jest@^29.7.0` with `babel-jest@^30.0.0-beta.3` (major-version mismatch). Align the versions.
  - There is no `jest.config.cjs` to configure `transform` via `babel-jest` for ESM modules.

- **Security**
  - The `package.json` `repository.url` embeds a token. Remove secrets from VCS immediately.

- **Dev scripts**
  - Only `"test": "jest"` is present. Add scripts to run the frontend dev server and test config explicitly.

- **Response handling consistency**
  - `generateLine()` parses via `response.text()` then `JSON.parse()`; the others use `response.json()`. Standardize for consistency.
---

## Recommended Actions (Phase 1: Quick Wins)

- **Normalize API paths in `frontend/js/api.js`**
  - Use no trailing slashes:
    - `GET/POST`: `${API_BASE_URL}/speakers`
    - `DELETE`: `${API_BASE_URL}/speakers/${speakerId}`
  - Keep the dialog endpoints unchanged.

- **Fix the JSDoc for `generateDialog()`**
  - Use `silence: { duration: number }` (seconds), not `duration_ms`.

- **Refactor `frontend/js/app.js` state**
  - Remove the duplicate `dialogItems`/`availableSpeakersCache` declarations. Choose module scope or function scope, and pass references.

- **Improve UX**
  - Replace `alert`/`confirm` with inline banners near `#results-display` and per-row error chips (extend the existing `.line-error-msg`).
  - Add disabled/loading states for global generate and speaker actions.

- **CSS tweak**
  - Set `--border-light: #e5e7eb;` (or similar) to reflect a light border.

- **Harden tests/Jest config**
  - Align versions: either Jest 29 + `babel-jest` 29, or upgrade both to 30 stable together.
  - Add a `jest.config.cjs` with `transform` using `babel-jest` and a suitable `testEnvironment`.
  - Ensure tests expect the normalized API paths (recommended: change the code to match the tests).

- **Dev scripts**
  - Add to the root `package.json`:
    - `"frontend:dev": "python3 frontend/start_dev_server.py"`
    - `"test:frontend": "jest --config ./jest.config.cjs"`

- **Sanitize the repository URL**
  - Remove the embedded token from `package.json`.

- **Standardize response parsing**
  - Switch `generateLine()` to `response.json()` unless the backend returns `text/plain`.
---

## Backend Endpoint Confirmation

- `speakers` router (`backend/app/routers/speakers.py`):
  - List/Create: `GET /`, `POST /` (when mounted under `/api/speakers` → `/api/speakers/`).
  - Delete: `DELETE /{speaker_id}` (→ `/api/speakers/{speaker_id}`), no trailing slash.
- `dialog` router (`backend/app/routers/dialog.py`):
  - `POST /generate_line`, `POST /generate` (mounted under `/api/dialog`).

---

## Proposed Implementation Plan

- **Phase 1 (1–2 hours)**
  - Normalize API paths in `api.js`.
  - Fix the JSDoc for `generateDialog`.
  - Consolidate dialog state in `app.js`.
  - Adjust `--border-light` to a light gray.
  - Add `jest.config.cjs`; align Jest/babel-jest versions.
  - Add dev/test scripts.
  - Remove the token from `package.json`.

- **Phase 2 (2–4 hours)**
  - Inline notifications and comprehensive loading/disabled states.

- **Phase 3 (optional)**
  - ESLint + Prettier.
  - Consider a Vite migration (HMR, proxy to backend, improved DX).

---

## Notes

- Current local time captured for this review: 2025-08-12T11:32:16-05:00.
- Frontend config (`frontend/js/config.js`) supports env overrides for the API base and dev server port.
- Tests (`frontend/tests/api.test.js`) currently assume endpoints without trailing slashes.
@@ -0,0 +1,204 @@

# Unload Model on Idle: Implementation Plan

## Goals

- Automatically unload large TTS model(s) when idle to reduce RAM/VRAM usage.
- Lazy-load on demand without breaking API semantics.
- Configurable timeout and safety controls.

## Requirements

- Config-driven idle timeout and poll interval.
- Thread-/async-safe across concurrent requests.
- No unload while an inference is in progress.
- Clear logs and metrics for load/unload events.

## Configuration

File: `backend/app/config.py`

- Add:
  - `MODEL_IDLE_TIMEOUT_SECONDS: int = 900` (0 disables eviction)
  - `MODEL_IDLE_CHECK_INTERVAL_SECONDS: int = 60`
  - `MODEL_EVICTION_ENABLED: bool = True`
- Bind to env: `MODEL_IDLE_TIMEOUT_SECONDS`, `MODEL_IDLE_CHECK_INTERVAL_SECONDS`, `MODEL_EVICTION_ENABLED`.

## Design

### ModelManager (Singleton)

File: `backend/app/services/model_manager.py` (new)

- Responsibilities:
  - Manage the lifecycle (load/unload) of the TTS model/pipeline.
  - Provide `get()`, which returns a ready model (lazy-loading if needed) and updates `last_used`.
  - Track the active request count to block eviction while it is > 0.
- Internals:
  - `self._model` (or components), `self._last_used: float`, `self._active: int`.
  - Locks: an `asyncio.Lock` for load/unload; an `asyncio.Lock` or `asyncio.Semaphore` for the counters.
  - Optional CUDA cleanup: `torch.cuda.empty_cache()` after unload.
- API:
  - `async def get(self) -> Model`: ensures the model is loaded; bumps `last_used`.
  - `async def load(self)`: idempotent; guarded by the lock.
  - `async def unload(self)`: only when `self._active == 0`; clears refs and caches.
  - `def touch(self)`: update `last_used`.
  - Context helper `def using(self)`: an async context manager that increments/decrements `active` safely.

### Idle Reaper Task

Registration: FastAPI startup (e.g., in `backend/app/main.py`)

- Background task loop every `MODEL_IDLE_CHECK_INTERVAL_SECONDS`:
  - If eviction is enabled, the timeout is > 0, the model is loaded, `active == 0`, and `now - last_used >= timeout`, call `unload()`.
  - Handle cancellation on shutdown.

### API Integration

- Replace direct model access in endpoints with:
```python
manager = ModelManager.instance()
async with manager.using():
    model = await manager.get()
    # perform inference
```

- Optionally call `manager.touch()` at request start for non-inference paths that still need the model resident.
## Pseudocode

```python
# services/model_manager.py
import asyncio
import time
from typing import Optional

from ..config import settings


class ModelManager:
    _instance: Optional["ModelManager"] = None

    def __init__(self):
        self._model = None
        self._last_used = time.time()
        self._active = 0
        self._lock = asyncio.Lock()
        self._counter_lock = asyncio.Lock()

    @classmethod
    def instance(cls):
        if not cls._instance:
            cls._instance = cls()
        return cls._instance

    async def load(self):
        async with self._lock:
            if self._model is not None:
                return
            # ... load model/pipeline here ...
            self._model = await load_pipeline()
            self._last_used = time.time()

    async def unload(self):
        async with self._lock:
            if self._model is None:
                return
            if self._active > 0:
                return  # safety: do not unload while in use
            # ... free resources ...
            self._model = None
            try:
                import torch
                torch.cuda.empty_cache()
            except Exception:
                pass

    async def get(self):
        if self._model is None:
            await self.load()
        self._last_used = time.time()
        return self._model

    async def _inc(self):
        async with self._counter_lock:
            self._active += 1

    async def _dec(self):
        async with self._counter_lock:
            self._active = max(0, self._active - 1)
            self._last_used = time.time()

    def last_used(self):
        return self._last_used

    def is_loaded(self):
        return self._model is not None

    def active(self):
        return self._active

    def using(self):
        manager = self

        class _Ctx:
            async def __aenter__(self):
                await manager._inc()
                return manager

            async def __aexit__(self, exc_type, exc, tb):
                await manager._dec()

        return _Ctx()


# main.py (startup) -- assumes `app`, a module-level `logger`,
# and `import contextlib` are defined
@app.on_event("startup")
async def start_reaper():
    async def reaper():
        while True:
            try:
                await asyncio.sleep(settings.MODEL_IDLE_CHECK_INTERVAL_SECONDS)
                if not settings.MODEL_EVICTION_ENABLED:
                    continue
                timeout = settings.MODEL_IDLE_TIMEOUT_SECONDS
                if timeout <= 0:
                    continue
                m = ModelManager.instance()
                if m.is_loaded() and m.active() == 0 and (time.time() - m.last_used()) >= timeout:
                    await m.unload()
            except asyncio.CancelledError:
                break
            except Exception as e:
                logger.exception("Idle reaper error: %s", e)

    app.state._model_reaper_task = asyncio.create_task(reaper())


@app.on_event("shutdown")
async def stop_reaper():
    task = getattr(app.state, "_model_reaper_task", None)
    if task:
        task.cancel()
        with contextlib.suppress(Exception):
            await task
```
## Observability

- Logs: model load/unload, reaper decisions, active count.
- Metrics (optional): counters and gauges (load events, active count, residency time).

## Safety & Edge Cases

- Avoid unload when `active > 0`.
- Guard multiple loads/unloads with the lock.
- Multi-worker servers: each worker manages its own model.
- Cold-start latency: document the expected additional latency for the first request after an idle unload.

## Testing

- Unit tests for `ModelManager`: load/unload idempotency, counter behavior.
- Simulated reaper triggering with short timeouts.
- Endpoint tests: concurrency (N simultaneous inferences); ensure no unload mid-flight.
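The counter/eviction interplay can be unit-tested against a trimmed-down stand-in for the manager (a sketch; the real class lives in `model_manager.py` and would be exercised the same way):

```python
import asyncio

class FakeManager:
    """Minimal stand-in mirroring the ModelManager contract under test."""
    def __init__(self):
        self.model = None
        self.active = 0

    async def load(self):
        # Idempotent: loading twice keeps the same model object.
        if self.model is None:
            self.model = object()

    async def unload(self):
        # Eviction is refused while any request is in flight.
        if self.active == 0:
            self.model = None

async def test_no_unload_while_active():
    m = FakeManager()
    await m.load()
    m.active = 1
    await m.unload()          # must be a no-op mid-flight
    assert m.model is not None
    m.active = 0
    await m.unload()          # now eviction proceeds
    assert m.model is None
    return True

passed = asyncio.run(test_no_unload_while_active())
```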

## Rollout Plan

1. Introduce config + `ModelManager` (no reaper); switch endpoints to `using()`.
2. Enable the reaper with a long timeout in staging; observe logs/metrics.
3. Tune the timeout; enable in production.

## Tasks Checklist

- [ ] Add config flags and defaults in `backend/app/config.py`.
- [ ] Create `backend/app/services/model_manager.py`.
- [ ] Register the startup/shutdown reaper in app init (`backend/app/main.py`).
- [ ] Refactor endpoints to use `ModelManager.instance().using()` and `get()`.
- [ ] Add logs and optional metrics.
- [ ] Add unit/integration tests.
- [ ] Update README/ops docs.

## Alternatives Considered

- Gunicorn/uvicorn worker preloading with an external idle supervisor: more complexity, less portability.
- OS-level cgroup memory-pressure eviction: opaque and risky for correctness.

## Configuration Examples

```
MODEL_EVICTION_ENABLED=true
MODEL_IDLE_TIMEOUT_SECONDS=900
MODEL_IDLE_CHECK_INTERVAL_SECONDS=60
```
AGENTS.md (11 changes)

@@ -10,6 +10,9 @@ python backend/run_api_test.py # Run all backend tests

 # Frontend
 npm test                             # Run all frontend tests
 npx jest frontend/tests/api.test.js  # Run single test file
+
+# Alternative UI
+python gradio_app.py                 # Run Gradio interface
 ```

 ## Code Style Guidelines

@@ -38,4 +41,10 @@ npx jest frontend/tests/api.test.js # Run single test file

 ### Naming Conventions
 - Python: snake_case for variables/functions, PascalCase for classes
 - JavaScript: camelCase for variables/functions
 - Descriptive, intention-revealing names
+
+### Architecture Notes
+- Backend: FastAPI on port 8000, structured as routers/models/services
+- Frontend: Vanilla JS (ES6+) on port 8001, modular design
+- API Base URL: http://localhost:8000/api
+- Speaker data in YAML format at speaker_data/speakers.yaml

@@ -359,7 +359,7 @@ The API uses the following directory structure (configurable in `app/config.py`)

 - **Temporary Files**: `{PROJECT_ROOT}/tts_temp_outputs/`

 ### CORS Settings
-- Allowed Origins: `http://localhost:8001`, `http://127.0.0.1:8001`
+- Allowed Origins: `http://localhost:8001`, `http://127.0.0.1:8001` (plus any `FRONTEND_HOST:FRONTEND_PORT` when using `start_servers.py`)
 - Allowed Methods: All
 - Allowed Headers: All
 - Credentials: Enabled
CLAUDE.md (65 changes)

@@ -20,11 +20,40 @@ python backend/run_api_test.py

 # API docs at http://127.0.0.1:8000/docs
 ```

-### Frontend Testing
+### Frontend Development

 ```bash
+# Install frontend dependencies
+npm install
+
 # Run frontend tests
 npm test
+
+# Start frontend dev server separately
+cd frontend && python start_dev_server.py
+```
+
+### Integrated Development Environment
+
+```bash
+# Start both backend and frontend servers concurrently
+python start_servers.py
+
+# Or alternatively, run backend startup script from backend directory
+cd backend && python start_server.py
+```
+
+### Command-Line TTS Generation
+
+```bash
+# Generate single utterance with CLI
+python cbx-generate.py --sample speaker_samples/voice.wav --output output.wav --text "Hello world"
+
+# Generate dialog from script
+python cbx-dialog-generate.py --dialog dialog.md --output dialog_output
+
+# Generate audiobook from text file
+python cbx-audiobook.py --input book.txt --output audiobook --speaker speaker_name
 ```

 ### Alternative Interfaces
@@ -100,3 +129,37 @@ Located in `backend/app/services/`:

 - Backend commands run from project root, not `backend/` directory
 - Frontend served separately (typically port 8001)
 - Speaker samples must be WAV format in `speaker_data/speaker_samples/`
+
+## Environment Configuration
+
+### Quick Setup
+```bash
+# Run automated setup (creates .env files)
+python setup.py
+
+# Install dependencies
+pip install -r backend/requirements.txt
+npm install
+```
+
+### Manual Environment Variables
+Key environment variables that can be configured in `.env` files:
+
+- `PROJECT_ROOT`: Base project directory
+- `BACKEND_PORT`/`FRONTEND_PORT`: Server ports (default: 8000/8001)
+- `DEVICE`: TTS model device (`auto`, `cpu`, `cuda`, `mps`)
+- `CORS_ORIGINS`: Allowed frontend origins for CORS
+- `SPEAKER_SAMPLES_DIR`: Directory for speaker audio files
+
+### Configuration Files Structure
+- `.env`: Global configuration
+- `backend/.env`: Backend-specific settings
+- `frontend/.env`: Frontend-specific settings
+- `speaker_data/speakers.yaml`: Speaker configuration
+
+## CLI Tools Overview
+
+- `cbx-generate.py`: Single utterance generation
+- `cbx-dialog-generate.py`: Multi-speaker dialog generation
+- `cbx-audiobook.py`: Long-form audiobook generation
+- `start_servers.py`: Integrated development server launcher
@@ -58,7 +58,7 @@ The application uses environment variables for configuration. Three `.env` files

 - `VITE_DEV_SERVER_HOST`: Frontend development server host

 #### CORS Configuration
-- `CORS_ORIGINS`: Comma-separated list of allowed origins
+- `CORS_ORIGINS`: Comma-separated list of allowed origins. When using `start_servers.py` with the default `FRONTEND_HOST=0.0.0.0` and no explicit `CORS_ORIGINS`, CORS will allow all origins (wildcard) to simplify development.

 #### Device Configuration
 - `DEVICE`: Device for TTS model (auto, cpu, cuda, mps)

@@ -101,7 +101,7 @@ CORS_ORIGINS=http://localhost:3000

 ### Common Issues

 1. **Permission Errors**: Ensure the `PROJECT_ROOT` directory is writable
-2. **CORS Errors**: Check that your frontend URL is in `CORS_ORIGINS`
+2. **CORS Errors**: Check that your frontend URL is in `CORS_ORIGINS`. (When using `start_servers.py`, your specified `FRONTEND_HOST:FRONTEND_PORT` will be auto-included.)
 3. **Model Loading Errors**: Verify `DEVICE` setting matches your hardware
 4. **Path Errors**: Ensure all path variables point to existing, accessible directories
OpenCode.md
@@ -0,0 +1,62 @@
# OpenCode.md - Chatterbox UI Development Guide

## Build & Run Commands

```bash
# Backend (FastAPI)
pip install -r backend/requirements.txt
uvicorn backend.app.main:app --reload --host 0.0.0.0 --port 8000

# Frontend
python frontend/start_dev_server.py  # Serves on port 8001

# Run backend tests
python backend/run_api_test.py

# Run frontend tests
npm test

# Run specific frontend test
npx jest frontend/tests/api.test.js

# Run Gradio interface
python gradio_app.py

# Run utility scripts
python cbx-audiobook.py --list-speakers  # List available speakers
python cbx-audiobook.py sample-audiobook.txt --speaker <speaker_id>  # Generate audiobook
python cbx-dialog-generate.py sample-dialog.md  # Generate dialog
```

## Code Style Guidelines

### Python
- Use type hints (`from typing import Optional, List, Dict`, etc.)
- Error handling: use try/except with specific exceptions
- Async/await for I/O operations
- Docstrings for functions and classes
- PEP 8 naming: snake_case for functions/variables, PascalCase for classes

### JavaScript
- ES6 modules with import/export
- Async/await for API calls
- JSDoc comments for functions
- Error handling: try/catch with detailed error messages
- camelCase for variables/functions

## Import Structure
- When importing from scripts, use `import import_helper` first to fix the Python path
- Backend modules use relative imports within the app package
- Services are in `backend.app.services`
- Models are in `backend.app.models`
- Configuration is in `backend.app.config`

## Project Structure
- Backend: FastAPI with service-oriented architecture
- Frontend: vanilla JS with modular design (api.js, app.js, config.js)
- Speaker data in YAML format with WAV samples
- Output directories: dialog_output/, single_output/, tts_outputs/

## Common Issues
- Import errors: make sure to use `import import_helper` in scripts
- Speaker samples must be WAV format in `speaker_data/speaker_samples/`
- TTS model requires GPU (CUDA) or Apple Silicon (MPS)
README.md
@@ -1,73 +1,203 @@
-# Chatterbox TTS Gradio App
+# Chatterbox TTS Application

-This Gradio application provides a user interface for text-to-speech generation using the Chatterbox TTS model. It supports both single utterance generation and multi-speaker dialog generation with configurable silence gaps.
+A comprehensive text-to-speech application with multiple interfaces for generating speech from text using the Chatterbox TTS model. Supports single utterance generation, multi-speaker dialogs, and long-form audiobook generation.

## Features

+- **Multiple Interfaces**: Web UI, FastAPI backend, Gradio interface, and CLI tools
- **Single Utterance Generation**: Generate speech from text using a selected speaker
- **Dialog Generation**: Create multi-speaker conversations with configurable silence gaps
+- **Audiobook Generation**: Convert long-form text into narrated audiobooks
- **Speaker Management**: Add/remove speakers with custom audio samples
+- **Paste Script (JSONL) Import**: Paste a dialog script as JSONL directly into the editor via a modal
- **Memory Optimization**: Automatic model cleanup after generation
-- **Output Organization**: Files saved in `single_output/` and `dialog_output/` directories
+- **Output Organization**: Files saved in organized directories with ZIP packaging

## Getting Started

-1. Clone the repository:
-```bash
-git clone https://github.com/your-username/chatterbox-test.git
-```
-
-2. Install dependencies:
+### Quick Setup
+
+1. Clone the repository and install dependencies:
```bash
+git clone https://github.com/your-username/chatterbox-ui.git
+cd chatterbox-ui
pip install -r requirements.txt
+npm install
+```
+
+2. Run automated setup:
+```bash
+python setup.py
```

3. Prepare speaker samples:
-   - Create a `speaker_samples/` directory
-   - Add audio samples (WAV format) for each speaker
-   - Update `speakers.yaml` with speaker names and file paths
+   - Add audio samples (WAV format) to `speaker_data/speaker_samples/`
+   - Configure speakers in `speaker_data/speakers.yaml`

-4. Run the app:
-```bash
-python gradio_app.py
-```
+### Windows Quick Start
+
+On Windows, a PowerShell setup script is provided to automate environment setup and startup.
+
+```powershell
+# From the repository root in PowerShell
+./setup-windows.ps1
+
+# First time only, if scripts are blocked:
+# Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
+```
+
+What it does:
+- Creates/uses `.venv`
+- Upgrades pip and installs deps from `backend/requirements.txt` and root `requirements.txt`
+- Creates a default `.env` with sensible ports if missing
+- Starts both servers via `start_servers.py`
+
+### Running the Application
+
+**Full-Stack Web Application:**
+```bash
+# Start both backend and frontend servers
+python start_servers.py
+```
+
+On Windows, you can also use the one-liner PowerShell script:
+
+```powershell
+./setup-windows.ps1
+```
+
+**Individual Components:**
+```bash
+# Backend only (FastAPI)
+uvicorn backend.app.main:app --reload --host 0.0.0.0 --port 8000
+
+# Frontend only
+cd frontend && python start_dev_server.py
+
+# Gradio interface
+python gradio_app.py
+```

## Usage

-### Single Utterance Tab
-- Select a speaker from the dropdown
-- Enter text to synthesize
-- Adjust generation parameters as needed
-- Click "Generate Speech"
-
-### Dialog Generation Tab
-1. Add speakers using the speaker configuration section
-2. Enter dialog in the format:
-```
-Speaker1: "Hello, how are you?"
-Speaker2: "I'm doing well!"
-Silence: 0.5
-Speaker1: "What are your plans for today?"
-```
-3. Set output base name
-4. Click "Generate Dialog"
-
-## File Organization
-
-- Generated single utterances are saved to `single_output/`
-- Dialog generation files are saved to `dialog_output/`
-- Concatenated dialog files have `_concatenated.wav` suffix
-- All files are zipped together for download
+### Web Interface
+Access the modern web UI at `http://localhost:8001` for interactive dialog creation.
+
+#### Paste Script (JSONL) in Dialog Editor
+Quickly load a dialog by pasting JSONL (one JSON object per line):
+
+1. Click `Paste Script` in the Dialog Editor.
+2. Paste JSONL content, for example:
+
+```jsonl
+{"type":"speech","speaker_id":"dummy_speaker","text":"Hello there!"}
+{"type":"silence","duration":0.5}
+{"type":"speech","speaker_id":"dummy_speaker","text":"This is the second line."}
+```
+
+3. Click `Load` and confirm replacement if prompted.
+
+Notes:
+- Input is validated per line; errors report line numbers.
+- The dialog is saved to localStorage, so it persists across refreshes.
+- Unknown `speaker_id`s will still load; add speakers later if needed.
+
+### CLI Tools
+
+**Single utterance generation:**
+```bash
+python cbx-generate.py --sample speaker_samples/voice.wav --output output.wav --text "Hello world"
+```
+
+**Dialog generation:**
+```bash
+python cbx-dialog-generate.py --dialog dialog.md --output dialog_output
+```
+
+**Audiobook generation:**
+```bash
+python cbx-audiobook.py --input book.txt --output audiobook --speaker speaker_name
+```
+
+### Gradio Interface
+- **Single Utterance Tab**: Select speaker, enter text, adjust parameters, generate
+- **Dialog Generation Tab**: Configure speakers and create multi-speaker conversations
+- Dialog format:
+```
+Speaker1: "Hello, how are you?"
+Speaker2: "I'm doing well!"
+Silence: 0.5
+Speaker1: "What are your plans for today?"
+```
+
+## Architecture Overview
+
+### Application Structure
+- **Frontend**: Modern vanilla JavaScript web UI (`frontend/`)
+- **Backend**: FastAPI REST API (`backend/`)
+- **CLI Tools**: Command-line utilities (`cbx-*.py`)
+- **Gradio Interface**: Alternative web UI (`gradio_app.py`)
+
+### New Files and Features
+- **`cbx-audiobook.py`**: Generate long-form audiobooks from text files
+- **`import_helper.py`**: Utility for managing imports and dependencies
+- **Backend Services**: Enhanced dialog processing, speaker management, and TTS services
+- **Web Frontend**: Interactive dialog editor with drag-and-drop functionality
+
+### File Organization
+- `single_output/` - Single utterance generations
+- `dialog_output/` - Multi-speaker dialog files
+- `tts_outputs/` - Raw TTS generation files
+- `speaker_data/` - Speaker configurations and audio samples
+- Generated files packaged in ZIP archives for download
+
+### API Endpoints
+- `/api/speakers/` - Speaker CRUD operations
+- `/api/dialog/generate/` - Full dialog generation
+- `/api/dialog/generate_line/` - Single line generation
+- `/generated_audio/` - Static audio file serving
+
+## Configuration
+
+### Environment Setup
+Key configuration files:
+- `.env` - Global settings
+- `backend/.env` - Backend-specific settings
+- `frontend/.env` - Frontend-specific settings
+- `speaker_data/speakers.yaml` - Speaker configuration
+
+### Development Commands
+```bash
+# Run tests
+python backend/run_api_test.py
+npm test
+
+# Backend development
+uvicorn backend.app.main:app --reload --host 0.0.0.0 --port 8000

+# Access points
+# Web UI: http://localhost:8001
+# API: http://localhost:8000
+# API Docs: http://localhost:8000/docs
+```

## Memory Management

-The app automatically:
+The application automatically:
- Cleans up the TTS model after each generation
-- Frees GPU memory (for CUDA/MPS devices)
-- Deletes intermediate tensors to minimize memory footprint
+- Manages GPU memory (CUDA/MPS devices)
+- Optimizes memory usage for long-form content

## Troubleshooting

-- **"Skipping unknown speaker"**: Add the speaker first using the speaker configuration
-- **"Sample file not found"**: Verify the audio file exists in `speaker_samples/`
-- **Memory issues**: Try enabling "Re-initialize model each line" for long dialogs
+- **"Skipping unknown speaker"**: Configure speaker in `speaker_data/speakers.yaml`
+- **"Sample file not found"**: Verify audio files exist in `speaker_data/speaker_samples/`
+- **Memory issues**: Use model reinitialization options for long content
+- **CORS errors**: Check frontend/backend port configuration (frontend origin is auto-included when using `start_servers.py`)
+- **Import errors**: Run `python import_helper.py` to check dependencies
+
+### Windows-specific
+- If PowerShell blocks script execution, run once:
+```powershell
+Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
+```
+- If Windows Firewall prompts the first time you run servers, allow access on your private network.
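The JSONL paste format above is validated line by line, with errors reported by line number. As a rough illustration of that validation (a hypothetical Python helper; the actual validator lives in the frontend JavaScript):

```python
import json

def parse_dialog_jsonl(text: str):
    """Parse a JSONL dialog script; report errors with 1-based line numbers."""
    items, errors = [], []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        try:
            obj = json.loads(line)
        except json.JSONDecodeError as e:
            errors.append(f"line {lineno}: invalid JSON ({e.msg})")
            continue
        kind = obj.get("type")
        if kind == "speech" and obj.get("speaker_id") and obj.get("text"):
            items.append(obj)
        elif kind == "silence" and isinstance(obj.get("duration"), (int, float)):
            items.append(obj)
        else:
            errors.append(f"line {lineno}: unknown or incomplete item: {line[:40]}")
    return items, errors

script = """{"type":"speech","speaker_id":"dummy_speaker","text":"Hello there!"}
{"type":"silence","duration":0.5}
{"type":"speech","speaker_id":"dummy_speaker","text":"This is the second line."}"""
items, errors = parse_dialog_jsonl(script)
print(len(items), len(errors))
```

Note that unknown `speaker_id`s are deliberately accepted here, matching the README's note that they can be configured later.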
backend/app/config.py
@@ -6,20 +6,34 @@ from dotenv import load_dotenv
load_dotenv()

# Project root - can be overridden by environment variable
-PROJECT_ROOT = Path(os.getenv("PROJECT_ROOT", Path(__file__).parent.parent.parent)).resolve()
+PROJECT_ROOT = Path(
+    os.getenv("PROJECT_ROOT", Path(__file__).parent.parent.parent)
+).resolve()

# Directory paths
-SPEAKER_DATA_BASE_DIR = Path(os.getenv("SPEAKER_DATA_BASE_DIR", str(PROJECT_ROOT / "speaker_data")))
-SPEAKER_SAMPLES_DIR = Path(os.getenv("SPEAKER_SAMPLES_DIR", str(SPEAKER_DATA_BASE_DIR / "speaker_samples")))
-SPEAKERS_YAML_FILE = Path(os.getenv("SPEAKERS_YAML_FILE", str(SPEAKER_DATA_BASE_DIR / "speakers.yaml")))
+SPEAKER_DATA_BASE_DIR = Path(
+    os.getenv("SPEAKER_DATA_BASE_DIR", str(PROJECT_ROOT / "speaker_data"))
+)
+SPEAKER_SAMPLES_DIR = Path(
+    os.getenv("SPEAKER_SAMPLES_DIR", str(SPEAKER_DATA_BASE_DIR / "speaker_samples"))
+)
+SPEAKERS_YAML_FILE = Path(
+    os.getenv("SPEAKERS_YAML_FILE", str(SPEAKER_DATA_BASE_DIR / "speakers.yaml"))
+)

# TTS temporary output path (used by DialogProcessorService)
-TTS_TEMP_OUTPUT_DIR = Path(os.getenv("TTS_TEMP_OUTPUT_DIR", str(PROJECT_ROOT / "tts_temp_outputs")))
+TTS_TEMP_OUTPUT_DIR = Path(
+    os.getenv("TTS_TEMP_OUTPUT_DIR", str(PROJECT_ROOT / "tts_temp_outputs"))
+)

# Final dialog output path (used by Dialog router and served by main app)
# These are stored within the 'backend' directory to be easily servable.
DIALOG_OUTPUT_PARENT_DIR = PROJECT_ROOT / "backend"
-DIALOG_GENERATED_DIR = Path(os.getenv("DIALOG_GENERATED_DIR", str(DIALOG_OUTPUT_PARENT_DIR / "tts_generated_dialogs")))
+DIALOG_GENERATED_DIR = Path(
+    os.getenv(
+        "DIALOG_GENERATED_DIR", str(DIALOG_OUTPUT_PARENT_DIR / "tts_generated_dialogs")
+    )
+)

# Alias for clarity and backward compatibility
DIALOG_OUTPUT_DIR = DIALOG_GENERATED_DIR

@@ -29,11 +43,41 @@ HOST = os.getenv("HOST", "0.0.0.0")
PORT = int(os.getenv("PORT", "8000"))
RELOAD = os.getenv("RELOAD", "true").lower() == "true"

-# CORS configuration
-CORS_ORIGINS = [origin.strip() for origin in os.getenv("CORS_ORIGINS", "http://localhost:8001,http://127.0.0.1:8001").split(",")]
+# CORS configuration: determine allowed origins based on env & frontend binding
+_cors_env = os.getenv("CORS_ORIGINS", "")
+_frontend_host = os.getenv("FRONTEND_HOST")
+_frontend_port = os.getenv("FRONTEND_PORT")
+
+# If the dev server is bound to 0.0.0.0 (all interfaces), allow all origins
+if _frontend_host == "0.0.0.0":  # dev convenience when binding wildcard
+    CORS_ORIGINS = ["*"]
+elif _cors_env:
+    # parse comma-separated origins, strip whitespace
+    CORS_ORIGINS = [origin.strip() for origin in _cors_env.split(",") if origin.strip()]
+else:
+    # default to allow all origins in development
+    CORS_ORIGINS = ["*"]
+
+# Auto-include specific frontend origin when not using wildcard CORS
+if CORS_ORIGINS != ["*"] and _frontend_host and _frontend_port:
+    _frontend_origin = f"http://{_frontend_host.strip()}:{_frontend_port.strip()}"
+    if _frontend_origin not in CORS_ORIGINS:
+        CORS_ORIGINS.append(_frontend_origin)

# Device configuration
DEVICE = os.getenv("DEVICE", "auto")

+# Concurrency configuration
+# Max number of concurrent TTS generation tasks per dialog request
+TTS_MAX_CONCURRENCY = int(os.getenv("TTS_MAX_CONCURRENCY", "3"))
+
+# Model idle eviction configuration
+# Enable/disable idle-based model eviction
+MODEL_EVICTION_ENABLED = os.getenv("MODEL_EVICTION_ENABLED", "true").lower() == "true"
+# Unload model after this many seconds of inactivity (0 disables eviction)
+MODEL_IDLE_TIMEOUT_SECONDS = int(os.getenv("MODEL_IDLE_TIMEOUT_SECONDS", "900"))
+# How often the reaper checks for idleness
+MODEL_IDLE_CHECK_INTERVAL_SECONDS = int(os.getenv("MODEL_IDLE_CHECK_INTERVAL_SECONDS", "60"))

# Ensure directories exist
SPEAKER_SAMPLES_DIR.mkdir(parents=True, exist_ok=True)
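The CORS selection logic in this diff can be viewed as a pure function, which makes its three branches (wildcard binding, explicit origin list, development default) easy to verify. A sketch mirroring the diff's logic, not code from the repository:

```python
from typing import List, Optional

def resolve_cors_origins(cors_env: str,
                         frontend_host: Optional[str],
                         frontend_port: Optional[str]) -> List[str]:
    """Mirror of the config.py branching: wildcard when binding all interfaces
    or when nothing is configured; otherwise parse the list and auto-include
    the specific frontend origin."""
    if frontend_host == "0.0.0.0":
        return ["*"]
    if not cors_env:
        return ["*"]
    origins = [o.strip() for o in cors_env.split(",") if o.strip()]
    if frontend_host and frontend_port:
        frontend_origin = f"http://{frontend_host.strip()}:{frontend_port.strip()}"
        if frontend_origin not in origins:
            origins.append(frontend_origin)
    return origins

print(resolve_cors_origins("", "0.0.0.0", "8001"))  # ['*']
print(resolve_cors_origins("http://localhost:3000", "localhost", "8001"))
```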
backend/app/main.py
@@ -2,6 +2,10 @@ from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles
from fastapi.middleware.cors import CORSMiddleware
from pathlib import Path
+import asyncio
+import contextlib
+import logging
+import time
from app.routers import speakers, dialog  # Import the routers
from app import config

@@ -38,3 +42,47 @@ config.DIALOG_GENERATED_DIR.mkdir(parents=True, exist_ok=True)
app.mount("/generated_audio", StaticFiles(directory=config.DIALOG_GENERATED_DIR), name="generated_audio")

# Further endpoints for speakers, dialog generation, etc., will be added here.
+
+# --- Background task: idle model reaper ---
+logger = logging.getLogger("app.model_reaper")
+
+@app.on_event("startup")
+async def _start_model_reaper():
+    from app.services.model_manager import ModelManager
+
+    async def reaper():
+        while True:
+            try:
+                await asyncio.sleep(config.MODEL_IDLE_CHECK_INTERVAL_SECONDS)
+                if not getattr(config, "MODEL_EVICTION_ENABLED", True):
+                    continue
+                timeout = getattr(config, "MODEL_IDLE_TIMEOUT_SECONDS", 0)
+                if timeout <= 0:
+                    continue
+                m = ModelManager.instance()
+                if m.is_loaded() and m.active() == 0 and (time.time() - m.last_used()) >= timeout:
+                    logger.info("Idle timeout reached (%.0fs). Unloading model...", timeout)
+                    await m.unload()
+            except asyncio.CancelledError:
+                break
+            except Exception:
+                logger.exception("Model reaper encountered an error")
+
+    # Log eviction configuration at startup
+    logger.info(
+        "Model Eviction -> enabled: %s | idle_timeout: %ss | check_interval: %ss",
+        getattr(config, "MODEL_EVICTION_ENABLED", True),
+        getattr(config, "MODEL_IDLE_TIMEOUT_SECONDS", 0),
+        getattr(config, "MODEL_IDLE_CHECK_INTERVAL_SECONDS", 60),
+    )
+
+    app.state._model_reaper_task = asyncio.create_task(reaper())
+
+
+@app.on_event("shutdown")
+async def _stop_model_reaper():
+    task = getattr(app.state, "_model_reaper_task", None)
+    if task:
+        task.cancel()
+        with contextlib.suppress(Exception):
+            await task
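The reaper unloads only when three conditions hold: the model is loaded, no requests are active, and the idle window has elapsed. A self-contained sketch of that loop against a stub manager (the real `ModelManager` API is assumed from the diff, not shown here):

```python
import asyncio
import time

class StubManager:
    """Minimal stand-in exposing only the methods the reaper relies on."""
    def __init__(self):
        self.loaded = True
        self._last_used = time.time()
        self._active = 0

    def is_loaded(self):
        return self.loaded

    def active(self):
        return self._active

    def last_used(self):
        return self._last_used

    async def unload(self):
        self.loaded = False

async def reaper(manager, idle_timeout, check_interval):
    # Same shape as the startup task: sleep, then unload once idle long enough
    while manager.is_loaded():
        await asyncio.sleep(check_interval)
        if manager.active() == 0 and (time.time() - manager.last_used()) >= idle_timeout:
            await manager.unload()

manager = StubManager()
# With a 0.2s idle timeout and 0.05s checks, the stub unloads almost immediately
asyncio.run(asyncio.wait_for(reaper(manager, idle_timeout=0.2, check_interval=0.05), timeout=5))
print("loaded:", manager.is_loaded())
```

Because the production loop never exits on its own, shutdown relies on task cancellation, which is why the diff catches `asyncio.CancelledError` and cancels the task in the shutdown hook.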
backend/app/routers/dialog.py
@@ -9,6 +9,8 @@ from app.services.speaker_service import SpeakerManagementService
from app.services.dialog_processor_service import DialogProcessorService
from app.services.audio_manipulation_service import AudioManipulationService
from app import config
+from typing import AsyncIterator
+from app.services.model_manager import ModelManager

router = APIRouter()

@@ -16,9 +18,12 @@ router = APIRouter()
# These can be more sophisticated with a proper DI container or FastAPI's Depends system if services had complex init.
# For now, direct instantiation or simple Depends is fine.

-def get_tts_service():
-    # Consider making device configurable
-    return TTSService(device="mps")
+async def get_tts_service() -> AsyncIterator[TTSService]:
+    """Dependency that holds a usage token for the duration of the request."""
+    manager = ModelManager.instance()
+    async with manager.using():
+        service = await manager.get_service()
+        yield service

def get_speaker_management_service():
    return SpeakerManagementService()

@@ -32,7 +37,7 @@ def get_dialog_processor_service(
def get_audio_manipulation_service():
    return AudioManipulationService()

-# --- Helper function to manage TTS model loading/unloading ---
+# --- Helper imports ---

from app.models.dialog_models import SpeechItem, SilenceItem
from app.services.tts_service import TTSService

@@ -128,19 +133,7 @@ async def generate_line(
        detail=error_detail
    )

-async def manage_tts_model_lifecycle(tts_service: TTSService, task_function, *args, **kwargs):
-    """Loads TTS model, executes task, then unloads model."""
-    try:
-        print("API: Loading TTS model...")
-        tts_service.load_model()
-        return await task_function(*args, **kwargs)
-    except Exception as e:
-        # Log or handle specific exceptions if needed before re-raising
-        print(f"API: Error during TTS model lifecycle or task execution: {e}")
-        raise
-    finally:
-        print("API: Unloading TTS model...")
-        tts_service.unload_model()
+# Removed per-request load/unload in favor of ModelManager idle eviction.

async def process_dialog_flow(
    request: DialogRequest,

@@ -274,12 +267,10 @@ async def generate_dialog_endpoint(
    - Concatenates all audio segments into a single file.
    - Creates a ZIP archive of all individual segments and the concatenated file.
    """
-    # Wrap the core processing logic with model loading/unloading
-    return await manage_tts_model_lifecycle(
-        tts_service,
-        process_dialog_flow,
-        request=request,
-        dialog_processor=dialog_processor,
-        audio_manipulator=audio_manipulator,
-        background_tasks=background_tasks
-    )
+    # Execute core processing; ModelManager dependency keeps the model marked "in use".
+    return await process_dialog_flow(
+        request=request,
+        dialog_processor=dialog_processor,
+        audio_manipulator=audio_manipulator,
+        background_tasks=background_tasks,
+    )
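The rewritten `get_tts_service` relies on FastAPI's generator-dependency pattern: everything after the `yield` runs at request teardown, so the usage token stays held for the whole request and the idle reaper sees `active() > 0` while work is in flight. The pattern in isolation, with a toy manager standing in for the repo's `ModelManager`:

```python
import asyncio
import contextlib

class ToyManager:
    """Counts active users so an idle reaper can tell when unloading is safe."""
    def __init__(self):
        self._active = 0

    @contextlib.asynccontextmanager
    async def using(self):
        self._active += 1
        try:
            yield
        finally:
            self._active -= 1

    def active(self):
        return self._active

manager = ToyManager()

async def get_service():
    # FastAPI-style dependency: the token is held across the yield
    async with manager.using():
        yield "service"

async def handle_request():
    dep = get_service()
    service = await dep.__anext__()   # dependency setup (FastAPI does this)
    during = manager.active()         # token held while the request runs
    with contextlib.suppress(StopAsyncIteration):
        await dep.__anext__()         # dependency teardown after the response
    return during, manager.active()

during, after = asyncio.run(handle_request())
print(during, after)  # 1 0
```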
@ -1,10 +1,16 @@
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from typing import List, Dict, Any, Union
|
from typing import List, Dict, Any, Union
|
||||||
import re
|
import re
|
||||||
|
import asyncio
|
||||||
|
from datetime import datetime
|
||||||
|
|
||||||
from .tts_service import TTSService
|
from .tts_service import TTSService
|
||||||
from .speaker_service import SpeakerManagementService
|
from .speaker_service import SpeakerManagementService
|
||||||
from app import config
|
try:
|
||||||
|
from app import config
|
||||||
|
except ModuleNotFoundError:
|
||||||
|
# When imported from scripts at project root
|
||||||
|
from backend.app import config
|
||||||
# Potentially models for dialog structure if we define them
|
# Potentially models for dialog structure if we define them
|
||||||
# from ..models.dialog_models import DialogItem # Example
|
# from ..models.dialog_models import DialogItem # Example
|
||||||
|
|
||||||
|
@ -88,24 +94,72 @@ class DialogProcessorService:
|
||||||
|
|
||||||
import shutil
|
import shutil
|
||||||
segment_idx = 0
|
segment_idx = 0
|
||||||
|
tasks = []
|
||||||
|
results_map: Dict[int, Dict[str, Any]] = {}
|
||||||
|
sem = asyncio.Semaphore(getattr(config, "TTS_MAX_CONCURRENCY", 2))
|
||||||
|
|
||||||
|
async def run_one(planned: Dict[str, Any]):
|
||||||
|
async with sem:
|
||||||
|
text_chunk = planned["text_chunk"]
|
||||||
|
speaker_id = planned["speaker_id"]
|
||||||
|
abs_speaker_sample_path = planned["abs_speaker_sample_path"]
|
||||||
|
filename_base = planned["filename_base"]
|
||||||
|
params = planned["params"]
|
||||||
|
seg_idx = planned["segment_idx"]
|
||||||
|
start_ts = datetime.now()
|
||||||
|
start_line = (
|
||||||
|
f"[{start_ts.isoformat(timespec='seconds')}] [TTS-TASK] START seg_idx={seg_idx} "
|
||||||
|
f"speaker={speaker_id} chunk_len={len(text_chunk)} base={filename_base}"
|
||||||
|
)
|
||||||
|
try:
|
||||||
|
out_path = await self.tts_service.generate_speech(
|
||||||
|
text=text_chunk,
|
||||||
|
speaker_id=speaker_id,
|
||||||
|
speaker_sample_path=str(abs_speaker_sample_path),
|
||||||
|
output_filename_base=filename_base,
|
||||||
|
output_dir=dialog_temp_dir,
|
||||||
|
exaggeration=params.get('exaggeration', 0.5),
|
||||||
|
cfg_weight=params.get('cfg_weight', 0.5),
|
||||||
|
temperature=params.get('temperature', 0.8),
|
||||||
|
)
|
||||||
|
end_ts = datetime.now()
|
||||||
|
duration = (end_ts - start_ts).total_seconds()
|
||||||
|
end_line = (
|
||||||
|
f"[{end_ts.isoformat(timespec='seconds')}] [TTS-TASK] END seg_idx={seg_idx} "
|
||||||
|
f"dur={duration:.2f}s -> {out_path}"
|
||||||
|
)
|
||||||
|
return seg_idx, {
|
||||||
|
"type": "speech",
|
||||||
|
"path": str(out_path),
|
||||||
|
"speaker_id": speaker_id,
|
||||||
|
"text_chunk": text_chunk,
|
||||||
|
}, start_line + "\n" + f"Successfully generated segment: {out_path}" + "\n" + end_line
|
||||||
|
except Exception as e:
|
||||||
|
end_ts = datetime.now()
|
||||||
|
err_line = (
|
||||||
|
f"[{end_ts.isoformat(timespec='seconds')}] [TTS-TASK] ERROR seg_idx={seg_idx} "
|
||||||
|
f"speaker={speaker_id} err={repr(e)}"
|
||||||
|
)
|
||||||
|
return seg_idx, {
|
||||||
|
"type": "error",
|
||||||
|
"message": f"Error generating speech for chunk '{text_chunk[:50]}...': {repr(e)}",
|
||||||
|
"text_chunk": text_chunk,
|
||||||
|
}, err_line
|
||||||
|
|
||||||
         for i, item in enumerate(dialog_items):
             item_type = item.get("type")
             processing_log.append(f"Processing item {i+1}: type='{item_type}'")

-            # --- Universal: Handle reuse of existing audio for both speech and silence ---
+            # --- Handle reuse of existing audio ---
             use_existing_audio = item.get("use_existing_audio", False)
             audio_url = item.get("audio_url")
             if use_existing_audio and audio_url:
-                # Determine source path (handle both absolute and relative)
-                # Map web URL to actual file location in tts_generated_dialogs
                 if audio_url.startswith("/generated_audio/"):
                     src_audio_path = config.DIALOG_OUTPUT_DIR / audio_url[len("/generated_audio/"):]
                 else:
                     src_audio_path = Path(audio_url)
                     if not src_audio_path.is_absolute():
-                        # Assume relative to the generated audio root dir
                         src_audio_path = config.DIALOG_OUTPUT_DIR / audio_url.lstrip("/\\")
-                # Now src_audio_path should point to the real file in tts_generated_dialogs
                 if src_audio_path.is_file():
                     segment_filename = f"{output_base_name}_seg{segment_idx}_reused.wav"
                     dest_path = (self.temp_audio_dir / output_base_name / segment_filename)
@@ -119,22 +173,18 @@ class DialogProcessorService:
                             processing_log.append(f"[REUSE] Destination audio file was not created: {dest_path}")
                         else:
                             processing_log.append(f"[REUSE] Destination audio file created: {dest_path}, size={dest_path.stat().st_size} bytes")
-                            # Only include 'type' and 'path' so the concatenator always includes this segment
-                            segment_results.append({
-                                "type": item_type,
-                                "path": str(dest_path)
-                            })
+                            results_map[segment_idx] = {"type": item_type, "path": str(dest_path)}
                         processing_log.append(f"Reused existing audio for item {i+1}: copied from {src_audio_path} to {dest_path}")
                     except Exception as e:
                         error_message = f"Failed to copy reused audio for item {i+1}: {e}"
                         processing_log.append(error_message)
-                        segment_results.append({"type": "error", "message": error_message})
+                        results_map[segment_idx] = {"type": "error", "message": error_message}
                     segment_idx += 1
                     continue
                 else:
                     error_message = f"Audio file for reuse not found at {src_audio_path} for item {i+1}."
                     processing_log.append(error_message)
-                    segment_results.append({"type": "error", "message": error_message})
+                    results_map[segment_idx] = {"type": "error", "message": error_message}
                     segment_idx += 1
                     continue

@@ -143,70 +193,81 @@ class DialogProcessorService:
                 text = item.get("text")
                 if not speaker_id or not text:
                     processing_log.append(f"Skipping speech item {i+1} due to missing speaker_id or text.")
-                    segment_results.append({"type": "error", "message": "Missing speaker_id or text"})
+                    results_map[segment_idx] = {"type": "error", "message": "Missing speaker_id or text"}
+                    segment_idx += 1
                     continue

-                # Validate speaker_id and get speaker_sample_path
                 speaker_info = self.speaker_service.get_speaker_by_id(speaker_id)
                 if not speaker_info:
                     processing_log.append(f"Speaker ID '{speaker_id}' not found. Skipping item {i+1}.")
-                    segment_results.append({"type": "error", "message": f"Speaker ID '{speaker_id}' not found"})
+                    results_map[segment_idx] = {"type": "error", "message": f"Speaker ID '{speaker_id}' not found"}
+                    segment_idx += 1
                     continue
                 if not speaker_info.sample_path:
                     processing_log.append(f"Speaker ID '{speaker_id}' has no sample path defined. Skipping item {i+1}.")
-                    segment_results.append({"type": "error", "message": f"Speaker ID '{speaker_id}' has no sample path defined"})
+                    results_map[segment_idx] = {"type": "error", "message": f"Speaker ID '{speaker_id}' has no sample path defined"}
+                    segment_idx += 1
                     continue

-                # speaker_info.sample_path is relative to config.SPEAKER_DATA_BASE_DIR
                 abs_speaker_sample_path = config.SPEAKER_DATA_BASE_DIR / speaker_info.sample_path
                 if not abs_speaker_sample_path.is_file():
                     processing_log.append(f"Speaker sample file not found or is not a file at '{abs_speaker_sample_path}' for speaker ID '{speaker_id}'. Skipping item {i+1}.")
-                    segment_results.append({"type": "error", "message": f"Speaker sample not a file or not found: {abs_speaker_sample_path}"})
+                    results_map[segment_idx] = {"type": "error", "message": f"Speaker sample not a file or not found: {abs_speaker_sample_path}"}
+                    segment_idx += 1
                     continue

                 text_chunks = self._split_text(text)
                 processing_log.append(f"Split text for speaker '{speaker_id}' into {len(text_chunks)} chunk(s).")

                 for chunk_idx, text_chunk in enumerate(text_chunks):
-                    segment_filename_base = f"{output_base_name}_seg{segment_idx}_spk{speaker_id}_chunk{chunk_idx}"
-                    processing_log.append(f"Generating speech for chunk: '{text_chunk[:50]}...' using speaker '{speaker_id}'")
-                    try:
-                        segment_output_path = await self.tts_service.generate_speech(
-                            text=text_chunk,
-                            speaker_id=speaker_id,  # For metadata, actual sample path is used by TTS
-                            speaker_sample_path=str(abs_speaker_sample_path),
-                            output_filename_base=segment_filename_base,
-                            output_dir=dialog_temp_dir,  # Save to the dialog's temp dir
-                            exaggeration=item.get('exaggeration', 0.5),  # Default from Gradio, Pydantic model should provide this
-                            cfg_weight=item.get('cfg_weight', 0.5),  # Default from Gradio, Pydantic model should provide this
-                            temperature=item.get('temperature', 0.8)  # Default from Gradio, Pydantic model should provide this
-                        )
-                        segment_results.append({
-                            "type": "speech",
-                            "path": str(segment_output_path),
-                            "speaker_id": speaker_id,
-                            "text_chunk": text_chunk
-                        })
-                        processing_log.append(f"Successfully generated segment: {segment_output_path}")
-                    except Exception as e:
-                        error_message = f"Error generating speech for chunk '{text_chunk[:50]}...': {repr(e)}"
-                        processing_log.append(error_message)
-                        segment_results.append({"type": "error", "message": error_message, "text_chunk": text_chunk})
+                    filename_base = f"{output_base_name}_seg{segment_idx}_spk{speaker_id}_chunk{chunk_idx}"
+                    processing_log.append(f"Queueing TTS for chunk: '{text_chunk[:50]}...' using speaker '{speaker_id}'")
+                    planned = {
+                        "segment_idx": segment_idx,
+                        "speaker_id": speaker_id,
+                        "text_chunk": text_chunk,
+                        "abs_speaker_sample_path": abs_speaker_sample_path,
+                        "filename_base": filename_base,
+                        "params": {
+                            'exaggeration': item.get('exaggeration', 0.5),
+                            'cfg_weight': item.get('cfg_weight', 0.5),
+                            'temperature': item.get('temperature', 0.8),
+                        },
+                    }
+                    tasks.append(asyncio.create_task(run_one(planned)))
                     segment_idx += 1

             elif item_type == "silence":
                 duration = item.get("duration")
                 if duration is None or duration < 0:
                     processing_log.append(f"Skipping silence item {i+1} due to invalid duration.")
-                    segment_results.append({"type": "error", "message": "Invalid duration for silence"})
+                    results_map[segment_idx] = {"type": "error", "message": "Invalid duration for silence"}
+                    segment_idx += 1
                     continue
-                segment_results.append({"type": "silence", "duration": float(duration)})
+                results_map[segment_idx] = {"type": "silence", "duration": float(duration)}
                 processing_log.append(f"Added silence of {duration}s.")
+                segment_idx += 1

             else:
                 processing_log.append(f"Unknown item type '{item_type}' at item {i+1}. Skipping.")
-                segment_results.append({"type": "error", "message": f"Unknown item type: {item_type}"})
+                results_map[segment_idx] = {"type": "error", "message": f"Unknown item type: {item_type}"}
+                segment_idx += 1

+        # Await all TTS tasks and merge results
+        if tasks:
+            processing_log.append(
+                f"Dispatching {len(tasks)} TTS task(s) with concurrency limit "
+                f"{getattr(config, 'TTS_MAX_CONCURRENCY', 2)}"
+            )
+            completed = await asyncio.gather(*tasks, return_exceptions=False)
+            for idx, payload, maybe_log in completed:
+                results_map[idx] = payload
+                if maybe_log:
+                    processing_log.append(maybe_log)
+
+        # Build ordered list
+        for idx in sorted(results_map.keys()):
+            segment_results.append(results_map[idx])

         # Log the full segment_results list for debugging
         processing_log.append("[DEBUG] Final segment_results list:")
@@ -216,7 +277,7 @@ class DialogProcessorService:
         return {
             "log": "\n".join(processing_log),
             "segment_files": segment_results,
-            "temp_dir": str(dialog_temp_dir)  # For cleanup or zipping later
+            "temp_dir": str(dialog_temp_dir)
         }


 if __name__ == "__main__":
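The fan-out pattern in the diff above — one `asyncio.create_task` per chunk, results keyed by segment index, a single `gather` at the end — can be sketched in isolation. This is a hedged illustration of the bounded-concurrency idea described in the plan, not the repo's code: `run_one` here is a stand-in for the real TTS call, and the cap is passed directly instead of being read from `config.TTS_MAX_CONCURRENCY`.

```python
import asyncio

async def generate_all(chunks, max_concurrency=2):
    # Semaphore caps how many chunks run at once; results are keyed by index
    # so the final list stays in dialog order regardless of completion order.
    sem = asyncio.Semaphore(max_concurrency)

    async def run_one(idx, chunk):
        async with sem:
            await asyncio.sleep(0.01)  # stand-in for the real TTS generation
            return idx, {"type": "speech", "text_chunk": chunk}

    tasks = [asyncio.create_task(run_one(i, c)) for i, c in enumerate(chunks)]
    results_map = dict(await asyncio.gather(*tasks))
    return [results_map[i] for i in sorted(results_map)]

ordered = asyncio.run(generate_all(["a", "b", "c"]))
print([r["text_chunk"] for r in ordered])  # ['a', 'b', 'c']
```

Keying by index rather than appending on completion is what lets the concatenation step in `process_dialog_flow` keep its ordered `segment_files` contract.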
@@ -0,0 +1,170 @@
+import asyncio
+import time
+import logging
+from typing import Optional
+import gc
+import os
+
+_proc = None
+try:
+    import psutil  # type: ignore
+    _proc = psutil.Process(os.getpid())
+except Exception:
+    psutil = None  # type: ignore
+
+
+def _rss_mb() -> float:
+    """Return current process RSS in MB, or -1.0 if unavailable."""
+    global _proc
+    try:
+        if _proc is None and psutil is not None:
+            _proc = psutil.Process(os.getpid())
+        if _proc is not None:
+            return _proc.memory_info().rss / (1024 * 1024)
+    except Exception:
+        return -1.0
+    return -1.0
+
+
+try:
+    import torch  # Optional; used for cache cleanup metrics
+except Exception:  # pragma: no cover - torch may not be present in some envs
+    torch = None  # type: ignore
+
+from app import config
+from app.services.tts_service import TTSService
+
+logger = logging.getLogger(__name__)
+
+
+class ModelManager:
+    _instance: Optional["ModelManager"] = None
+
+    def __init__(self):
+        self._service: Optional[TTSService] = None
+        self._last_used: float = time.time()
+        self._active: int = 0
+        self._lock = asyncio.Lock()
+        self._counter_lock = asyncio.Lock()
+
+    @classmethod
+    def instance(cls) -> "ModelManager":
+        if not cls._instance:
+            cls._instance = cls()
+        return cls._instance
+
+    async def _ensure_service(self) -> None:
+        if self._service is None:
+            # Use configured device, default is handled by TTSService itself
+            device = getattr(config, "DEVICE", "auto")
+            # TTSService presently expects explicit device like "mps"/"cpu"/"cuda"; map "auto" to "mps" on Mac otherwise cpu
+            if device == "auto":
+                try:
+                    import torch
+                    if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
+                        device = "mps"
+                    elif torch.cuda.is_available():
+                        device = "cuda"
+                    else:
+                        device = "cpu"
+                except Exception:
+                    device = "cpu"
+            self._service = TTSService(device=device)
+
+    async def load(self) -> None:
+        async with self._lock:
+            await self._ensure_service()
+            if self._service and self._service.model is None:
+                before_mb = _rss_mb()
+                logger.info(
+                    "Loading TTS model (device=%s)... (rss_before=%.1f MB)",
+                    self._service.device,
+                    before_mb,
+                )
+                self._service.load_model()
+                after_mb = _rss_mb()
+                if after_mb >= 0 and before_mb >= 0:
+                    logger.info(
+                        "TTS model loaded (rss_after=%.1f MB, delta=%.1f MB)",
+                        after_mb,
+                        after_mb - before_mb,
+                    )
+            self._last_used = time.time()
+
+    async def unload(self) -> None:
+        async with self._lock:
+            if not self._service:
+                return
+            if self._active > 0:
+                logger.debug("Skip unload: %d active operations", self._active)
+                return
+            if self._service.model is not None:
+                before_mb = _rss_mb()
+                logger.info(
+                    "Unloading idle TTS model... (rss_before=%.1f MB, active=%d)",
+                    before_mb,
+                    self._active,
+                )
+                self._service.unload_model()
+                # Drop the service instance as well to release any lingering refs
+                self._service = None
+                # Force GC and attempt allocator cache cleanup
+                try:
+                    gc.collect()
+                finally:
+                    if torch is not None:
+                        try:
+                            if hasattr(torch, "cuda") and torch.cuda.is_available():
+                                torch.cuda.empty_cache()
+                        except Exception:
+                            logger.debug("cuda.empty_cache() failed", exc_info=True)
+                        try:
+                            # MPS empty_cache may exist depending on torch version
+                            mps = getattr(torch, "mps", None)
+                            if mps is not None and hasattr(mps, "empty_cache"):
+                                mps.empty_cache()
+                        except Exception:
+                            logger.debug("mps.empty_cache() failed", exc_info=True)
+                after_mb = _rss_mb()
+                if after_mb >= 0 and before_mb >= 0:
+                    logger.info(
+                        "Idle unload complete (rss_after=%.1f MB, delta=%.1f MB)",
+                        after_mb,
+                        after_mb - before_mb,
+                    )
+            self._last_used = time.time()
+
+    async def get_service(self) -> TTSService:
+        if not self._service or self._service.model is None:
+            await self.load()
+        self._last_used = time.time()
+        return self._service  # type: ignore[return-value]
+
+    async def _inc(self) -> None:
+        async with self._counter_lock:
+            self._active += 1
+
+    async def _dec(self) -> None:
+        async with self._counter_lock:
+            self._active = max(0, self._active - 1)
+            self._last_used = time.time()
+
+    def last_used(self) -> float:
+        return self._last_used
+
+    def is_loaded(self) -> bool:
+        return bool(self._service and self._service.model is not None)
+
+    def active(self) -> int:
+        return self._active
+
+    def using(self):
+        manager = self
+
+        class _Ctx:
+            async def __aenter__(self):
+                await manager._inc()
+                return manager
+
+            async def __aexit__(self, exc_type, exc, tb):
+                await manager._dec()
+
+        return _Ctx()
@@ -7,8 +7,13 @@ from pathlib import Path
 from typing import List, Dict, Optional, Any

 from fastapi import UploadFile, HTTPException
-from app.models.speaker_models import Speaker, SpeakerCreate
-from app import config
+try:
+    from app.models.speaker_models import Speaker, SpeakerCreate
+    from app import config
+except ModuleNotFoundError:
+    # When imported from scripts at project root
+    from backend.app.models.speaker_models import Speaker, SpeakerCreate
+    from backend.app import config

 class SpeakerManagementService:
     def __init__(self):
@@ -1,14 +1,21 @@
 import torch
 import torchaudio
+import asyncio
 from typing import Optional
 from chatterbox.tts import ChatterboxTTS
 from pathlib import Path
 import gc  # Garbage collector for memory management
 import os
 from contextlib import contextmanager
+from datetime import datetime
+import time

 # Import configuration
-from app.config import TTS_TEMP_OUTPUT_DIR, SPEAKER_SAMPLES_DIR
+try:
+    from app.config import TTS_TEMP_OUTPUT_DIR, SPEAKER_SAMPLES_DIR
+except ModuleNotFoundError:
+    # When imported from scripts at project root
+    from backend.app.config import TTS_TEMP_OUTPUT_DIR, SPEAKER_SAMPLES_DIR

 # Use configuration for TTS output directory
 TTS_OUTPUT_DIR = TTS_TEMP_OUTPUT_DIR
@@ -88,6 +95,7 @@ class TTSService:
         exaggeration: float = 0.5,  # Default from Gradio
         cfg_weight: float = 0.5,  # Default from Gradio
         temperature: float = 0.8,  # Default from Gradio
+        unload_after: bool = False,  # Whether to unload the model after generation
     ) -> Path:
         """
         Generates speech from text using the loaded TTS model and a speaker sample.
@@ -109,26 +117,51 @@ class TTSService:
         # output_filename_base from DialogProcessorService is expected to be comprehensive (e.g., includes speaker_id, segment info)
         output_file_path = target_output_dir / f"{output_filename_base}.wav"

-        print(f"Generating audio for text: \"{text[:50]}...\" with speaker sample: {speaker_sample_path}")
+        start_ts = datetime.now()
+        print(f"[{start_ts.isoformat(timespec='seconds')}] [TTS] START generate+save base={output_filename_base} len={len(text)} sample={speaker_sample_path}")
         try:
-            with torch.no_grad():  # Important for inference
-                wav = self.model.generate(
-                    text=text,
-                    audio_prompt_path=str(speaker_sample_p),  # Must be a string path
-                    exaggeration=exaggeration,
-                    cfg_weight=cfg_weight,
-                    temperature=temperature,
-                )
-
-            torchaudio.save(str(output_file_path), wav, self.model.sr)
-            print(f"Audio saved to: {output_file_path}")
-            return output_file_path
+            def _gen_and_save() -> Path:
+                t0 = time.perf_counter()
+                wav = None
+                try:
+                    with torch.no_grad():  # Important for inference
+                        wav = self.model.generate(
+                            text=text,
+                            audio_prompt_path=str(speaker_sample_p),  # Must be a string path
+                            exaggeration=exaggeration,
+                            cfg_weight=cfg_weight,
+                            temperature=temperature,
+                        )
+                    # Save the audio synchronously in the same thread
+                    torchaudio.save(str(output_file_path), wav, self.model.sr)
+                    t1 = time.perf_counter()
+                    print(f"[TTS-THREAD] Saved {output_file_path.name} in {t1 - t0:.2f}s")
+                    return output_file_path
+                finally:
+                    # Cleanup in the same thread that created the tensor
+                    if wav is not None:
+                        del wav
+                    gc.collect()
+                    if self.device == "cuda":
+                        torch.cuda.empty_cache()
+                    elif self.device == "mps":
+                        if hasattr(torch.mps, "empty_cache"):
+                            torch.mps.empty_cache()
+
+            out_path = await asyncio.to_thread(_gen_and_save)
+            end_ts = datetime.now()
+            print(f"[{end_ts.isoformat(timespec='seconds')}] [TTS] END generate+save base={output_filename_base} dur={(end_ts - start_ts).total_seconds():.2f}s -> {out_path}")
+
+            # Optionally unload model after generation
+            if unload_after:
+                print("Unloading TTS model after generation...")
+                self.unload_model()
+
+            return out_path
         except Exception as e:
             print(f"Error during TTS generation or saving: {e}")
             raise
-        finally:
-            # For now, we keep it loaded. Memory management might need refinement.
-            pass

 # Example usage (for testing, not part of the service itself)
 if __name__ == "__main__":
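The core change in the hunk above is wrapping the blocking generate-and-save work in `asyncio.to_thread`, so the event loop stays free to schedule other chunks while GPU and disk work proceed. A minimal standalone sketch of that pattern, with `time.sleep` standing in for `model.generate` plus `torchaudio.save`:

```python
import asyncio
import time

def _gen_and_save(name: str) -> str:
    # Blocking work: stands in for TTS inference + torchaudio.save
    time.sleep(0.05)
    return f"{name}.wav"

async def main():
    t0 = time.perf_counter()
    # Two blocking jobs run in worker threads and overlap, instead of
    # serializing on the event-loop thread.
    paths = await asyncio.gather(
        asyncio.to_thread(_gen_and_save, "seg0"),
        asyncio.to_thread(_gen_and_save, "seg1"),
    )
    elapsed = time.perf_counter() - t0
    return list(paths), elapsed

paths, elapsed = asyncio.run(main())
print(paths)  # ['seg0.wav', 'seg1.wav']
```

Note the diff also keeps tensor cleanup (`del wav`, `gc.collect()`, cache emptying) inside the same thread that created the tensor, which matters for MPS/CUDA allocator behavior.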
@@ -14,6 +14,14 @@ if __name__ == "__main__":
     print(f"CORS Origins: {config.CORS_ORIGINS}")
     print(f"Project Root: {config.PROJECT_ROOT}")
     print(f"Device: {config.DEVICE}")
+    # Idle eviction settings
+    print(
+        "Model Eviction -> enabled: {} | idle_timeout: {}s | check_interval: {}s".format(
+            getattr(config, "MODEL_EVICTION_ENABLED", True),
+            getattr(config, "MODEL_IDLE_TIMEOUT_SECONDS", 0),
+            getattr(config, "MODEL_IDLE_CHECK_INTERVAL_SECONDS", 60),
+        )
+    )

     uvicorn.run(
         "app.main:app",
@ -0,0 +1,496 @@
|
||||||
|
#!/usr/bin/env python
|
||||||
|
"""
|
||||||
|
Chatterbox Audiobook Generator
|
||||||
|
|
||||||
|
This script converts a text file into an audiobook using the Chatterbox TTS system.
|
||||||
|
It parses the text file into manageable chunks, generates audio for each chunk,
|
||||||
|
and assembles them into a complete audiobook.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import asyncio
|
||||||
|
import gc
|
||||||
|
import os
|
||||||
|
import re
|
||||||
|
import subprocess
|
||||||
|
import sys
|
||||||
|
import torch
|
||||||
|
from pathlib import Path
|
||||||
|
import uuid
|
||||||
|
|
||||||
|
# Import helper to fix Python path
|
||||||
|
import import_helper
|
||||||
|
|
||||||
|
# Import backend services
|
||||||
|
from backend.app.services.tts_service import TTSService
|
||||||
|
from backend.app.services.speaker_service import SpeakerManagementService
|
||||||
|
from backend.app.services.audio_manipulation_service import AudioManipulationService
|
||||||
|
from backend.app.config import DIALOG_GENERATED_DIR, TTS_TEMP_OUTPUT_DIR
|
||||||
|
|
||||||
|
class AudiobookGenerator:
|
||||||
|
def __init__(self, speaker_id, output_base_name, device="mps",
|
||||||
|
exaggeration=0.5, cfg_weight=0.5, temperature=0.8,
|
||||||
|
pause_between_sentences=0.5, pause_between_paragraphs=1.0,
|
||||||
|
keep_model_loaded=False, cleanup_interval=10, use_subprocess=False):
|
||||||
|
"""
|
||||||
|
Initialize the audiobook generator.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
speaker_id: ID of the speaker to use
|
||||||
|
output_base_name: Base name for output files
|
||||||
|
device: Device to use for TTS (mps, cuda, cpu)
|
||||||
|
exaggeration: Controls expressiveness (0.0-1.0)
|
||||||
|
cfg_weight: Controls alignment with speaker characteristics (0.0-1.0)
|
||||||
|
temperature: Controls randomness in generation (0.0-1.0)
|
||||||
|
pause_between_sentences: Pause duration between sentences in seconds
|
||||||
|
pause_between_paragraphs: Pause duration between paragraphs in seconds
|
||||||
|
keep_model_loaded: If True, keeps model loaded across chunks (more efficient but uses more memory)
|
||||||
|
cleanup_interval: How often to perform deep cleanup when keep_model_loaded=True
|
||||||
|
use_subprocess: If True, uses separate processes for each chunk (slower but guarantees memory release)
|
||||||
|
"""
|
||||||
|
self.speaker_id = speaker_id
|
||||||
|
self.output_base_name = output_base_name
|
||||||
|
self.device = device
|
||||||
|
self.exaggeration = exaggeration
|
||||||
|
self.cfg_weight = cfg_weight
|
||||||
|
self.temperature = temperature
|
||||||
|
self.pause_between_sentences = pause_between_sentences
|
||||||
|
self.pause_between_paragraphs = pause_between_paragraphs
|
||||||
|
self.keep_model_loaded = keep_model_loaded
|
||||||
|
self.cleanup_interval = cleanup_interval
|
||||||
|
self.use_subprocess = use_subprocess
|
||||||
|
self.chunk_counter = 0
|
||||||
|
|
||||||
|
# Initialize services
|
||||||
|
self.tts_service = TTSService(device=device)
|
||||||
|
self.speaker_service = SpeakerManagementService()
|
||||||
|
self.audio_manipulator = AudioManipulationService()
|
||||||
|
|
||||||
|
# Create output directories
|
||||||
|
self.output_dir = DIALOG_GENERATED_DIR / output_base_name
|
||||||
|
self.output_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
self.temp_dir = TTS_TEMP_OUTPUT_DIR / output_base_name
|
||||||
|
self.temp_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
|
||||||
|
# Validate speaker
|
||||||
|
self._validate_speaker()
|
||||||
|
|
||||||
|
def _validate_speaker(self):
|
||||||
|
"""Validate that the specified speaker exists."""
|
||||||
|
speaker_info = self.speaker_service.get_speaker_by_id(self.speaker_id)
|
||||||
|
if not speaker_info:
|
||||||
|
raise ValueError(f"Speaker ID '{self.speaker_id}' not found.")
|
||||||
|
if not speaker_info.sample_path:
|
||||||
|
raise ValueError(f"Speaker ID '{self.speaker_id}' has no sample path defined.")
|
||||||
|
|
||||||
|
# Store speaker info for later use
|
||||||
|
self.speaker_info = speaker_info
|
||||||
|
|
||||||
|
def _cleanup_memory(self):
|
||||||
|
"""Force memory cleanup and garbage collection."""
|
||||||
|
print("Performing memory cleanup...")
|
||||||
|
|
||||||
|
# Force garbage collection multiple times for thorough cleanup
|
||||||
|
for _ in range(3):
|
||||||
|
gc.collect()
|
||||||
|
|
||||||
|
# Clear device-specific caches
|
||||||
|
if self.device == "cuda" and torch.cuda.is_available():
|
||||||
|
torch.cuda.empty_cache()
|
||||||
|
torch.cuda.synchronize()
|
||||||
|
# Additional CUDA cleanup
|
||||||
|
try:
|
||||||
|
torch.cuda.reset_peak_memory_stats()
|
||||||
|
except:
|
||||||
|
pass
|
||||||
|
elif self.device == "mps" and torch.backends.mps.is_available():
|
||||||
|
if hasattr(torch.mps, "empty_cache"):
|
||||||
|
torch.mps.empty_cache()
|
||||||
|
if hasattr(torch.mps, "synchronize"):
|
||||||
|
torch.mps.synchronize()
|
||||||
|
# Try to free MPS memory more aggressively
|
||||||
|
try:
|
||||||
|
import os
|
||||||
|
# This forces MPS to release memory back to the system
|
||||||
|
if hasattr(torch.mps, "set_per_process_memory_fraction"):
|
||||||
|
current_allocated = torch.mps.current_allocated_memory() if hasattr(torch.mps, "current_allocated_memory") else 0
|
||||||
|
if current_allocated > 0:
|
||||||
|
torch.mps.empty_cache()
|
||||||
|
except:
|
||||||
|
pass
|
||||||
|
|
||||||
|
# Additional aggressive cleanup
|
||||||
|
if hasattr(torch, '_C') and hasattr(torch._C, '_cuda_clearCublasWorkspaces'):
|
||||||
|
try:
|
||||||
|
torch._C._cuda_clearCublasWorkspaces()
|
||||||
|
except:
|
||||||
|
pass
|
||||||
|
|
||||||
|
print("Memory cleanup completed.")
|
||||||
|
|
||||||
|
async def _generate_chunk_subprocess(self, chunk, segment_filename_base, speaker_sample_path):
|
||||||
|
"""
|
||||||
|
Generate a single chunk using cbx-generate.py in a subprocess.
|
||||||
|
This guarantees memory is released when the process exits.
|
||||||
|
"""
|
||||||
|
output_file = self.temp_dir / f"{segment_filename_base}.wav"
|
||||||
|
|
||||||
|
# Use cbx-generate.py for single chunk generation
|
||||||
|
cmd = [
|
||||||
|
sys.executable, "cbx-generate.py",
|
||||||
|
"--sample", str(speaker_sample_path),
|
||||||
|
"--output", str(output_file),
|
||||||
|
"--text", chunk,
|
||||||
|
"--device", self.device
|
||||||
|
]
|
||||||
|
|
||||||
|
print(f"Running subprocess: {' '.join(cmd[:4])} ... (text truncated)")
|
||||||
|
|
||||||
|
try:
|
||||||
|
result = subprocess.run(
|
||||||
|
cmd,
|
||||||
|
capture_output=True,
|
||||||
|
text=True,
|
||||||
|
timeout=300, # 5 minute timeout per chunk
|
||||||
|
cwd=Path(__file__).parent # Run from project root
|
||||||
|
)
|
||||||
|
|
||||||
|
if result.returncode != 0:
|
||||||
|
raise RuntimeError(f"Subprocess failed: {result.stderr}")
|
||||||
|
|
||||||
|
if not output_file.exists():
|
||||||
|
raise RuntimeError(f"Output file not created: {output_file}")
|
||||||
|
|
||||||
|
print(f"Subprocess completed successfully: {output_file}")
|
||||||
|
return output_file
|
||||||
|
|
||||||
|
except subprocess.TimeoutExpired:
|
||||||
|
raise RuntimeError(f"Subprocess timed out after 5 minutes")
|
||||||
|
except Exception as e:
|
||||||
|
raise RuntimeError(f"Subprocess error: {e}")
|
||||||
|
|
||||||
|
    def split_text_into_chunks(self, text, max_length=300):
        """
        Split text into chunks suitable for TTS processing.

        This uses the same logic as the DialogProcessorService._split_text method
        but adds additional paragraph handling.
        """
        # Split text into paragraphs first
        paragraphs = re.split(r'\n\s*\n', text)
        paragraphs = [p.strip() for p in paragraphs if p.strip()]

        all_chunks = []

        for paragraph in paragraphs:
            # Split paragraph into sentences
            sentences = re.split(r'(?<=[.!?\u2026])\s+|(?<=[.!?\u2026])(?=[\"\')\]\}\u201d\u2019])|(?<=[.!?\u2026])$', paragraph.strip())
            sentences = [s.strip() for s in sentences if s and s.strip()]

            chunks = []
            current_chunk = ""

            for sentence in sentences:
                if not sentence:
                    continue
                if not current_chunk:  # First sentence for this chunk
                    current_chunk = sentence
                elif len(current_chunk) + len(sentence) + 1 <= max_length:
                    current_chunk += " " + sentence
                else:
                    chunks.append(current_chunk)
                    current_chunk = sentence

            if current_chunk:  # Add the last chunk
                chunks.append(current_chunk)

            # Further split any chunks that are still too long
            paragraph_chunks = []
            for chunk in chunks:
                if len(chunk) > max_length:
                    # Simple split by length if a sentence itself is too long
                    for i in range(0, len(chunk), max_length):
                        paragraph_chunks.append(chunk[i:i+max_length])
                else:
                    paragraph_chunks.append(chunk)

            # Add paragraph marker
            if paragraph_chunks:
                all_chunks.append({"type": "paragraph", "chunks": paragraph_chunks})

        return all_chunks
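The sentence-packing loop above can be exercised in isolation. A minimal standalone sketch (the `chunk_paragraph` helper is hypothetical and uses a simplified sentence regex, not the fuller pattern above):

```python
import re

def chunk_paragraph(paragraph, max_length=300):
    # Hypothetical standalone version of the sentence-packing loop:
    # greedily pack sentences into chunks no longer than max_length.
    sentences = re.split(r'(?<=[.!?])\s+', paragraph.strip())
    chunks, current = [], ""
    for sentence in (s.strip() for s in sentences if s.strip()):
        if not current:
            current = sentence
        elif len(current) + len(sentence) + 1 <= max_length:
            current += " " + sentence
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    # Hard-split any chunk that is still longer than max_length
    return [c[i:i + max_length] for c in chunks for i in range(0, len(c), max_length)]

print(chunk_paragraph("One. Two. Three.", max_length=10))  # ['One. Two.', 'Three.']
```

The greedy pack plus the final hard split guarantees every returned chunk fits `max_length`, even when a single sentence exceeds it.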
    async def generate_audiobook(self, text_file_path):
        """
        Generate an audiobook from a text file.

        Args:
            text_file_path: Path to the text file to convert

        Returns:
            Path to the generated audiobook file
        """
        # Read the text file
        text_path = Path(text_file_path)
        if not text_path.exists():
            raise FileNotFoundError(f"Text file not found: {text_file_path}")

        with open(text_path, 'r', encoding='utf-8') as f:
            text = f.read()

        print(f"Processing text file: {text_file_path}")
        print(f"Text length: {len(text)} characters")

        # Split text into chunks
        paragraphs = self.split_text_into_chunks(text)
        total_chunks = sum(len(p["chunks"]) for p in paragraphs)
        print(f"Split into {len(paragraphs)} paragraphs with {total_chunks} total chunks")

        # Generate audio for each chunk
        segment_results = []
        chunk_count = 0

        # Pre-load model if keeping it loaded
        if self.keep_model_loaded:
            print("Pre-loading TTS model for batch processing...")
            self.tts_service.load_model()

        try:
            for para_idx, paragraph in enumerate(paragraphs):
                print(f"Processing paragraph {para_idx+1}/{len(paragraphs)}")

                for chunk_idx, chunk in enumerate(paragraph["chunks"]):
                    chunk_count += 1
                    self.chunk_counter += 1
                    print(f"  Generating audio for chunk {chunk_count}/{total_chunks}: {chunk[:50]}...")

                    # Generate unique filename for this chunk
                    segment_filename_base = f"{self.output_base_name}_p{para_idx}_c{chunk_idx}_{uuid.uuid4().hex[:8]}"

                    try:
                        # Get absolute speaker sample path
                        speaker_sample_path = Path(self.speaker_info.sample_path)
                        if not speaker_sample_path.is_absolute():
                            from backend.app.config import SPEAKER_DATA_BASE_DIR
                            speaker_sample_path = SPEAKER_DATA_BASE_DIR / speaker_sample_path

                        # Generate speech for this chunk
                        if self.use_subprocess:
                            # Use subprocess for guaranteed memory release
                            segment_output_path = await self._generate_chunk_subprocess(
                                chunk=chunk,
                                segment_filename_base=segment_filename_base,
                                speaker_sample_path=speaker_sample_path
                            )
                        else:
                            # Load model for this chunk (if not keeping loaded)
                            if not self.keep_model_loaded:
                                print("Loading TTS model...")
                                self.tts_service.load_model()

                            # Generate speech using the TTS service
                            segment_output_path = await self.tts_service.generate_speech(
                                text=chunk,
                                speaker_id=self.speaker_id,
                                speaker_sample_path=str(speaker_sample_path),
                                output_filename_base=segment_filename_base,
                                output_dir=self.temp_dir,
                                exaggeration=self.exaggeration,
                                cfg_weight=self.cfg_weight,
                                temperature=self.temperature
                            )

                        # Memory management strategy based on model lifecycle
                        if self.use_subprocess:
                            # No memory management needed - subprocess handles it
                            pass
                        elif self.keep_model_loaded:
                            # Light cleanup after each chunk; deep cleanup periodically
                            if self.chunk_counter % self.cleanup_interval == 0:
                                print(f"Performing periodic deep cleanup (chunk {self.chunk_counter})")
                                self._cleanup_memory()
                        else:
                            # Explicit memory cleanup after generation
                            self._cleanup_memory()

                            # Unload model after generation
                            print("Unloading TTS model...")
                            self.tts_service.unload_model()

                            # Additional memory cleanup after model unload
                            self._cleanup_memory()

                        # Add to segment results
                        segment_results.append({
                            "type": "speech",
                            "path": str(segment_output_path)
                        })

                        # Add pause between sentences
                        if chunk_idx < len(paragraph["chunks"]) - 1:
                            segment_results.append({
                                "type": "silence",
                                "duration": self.pause_between_sentences
                            })

                    except Exception as e:
                        print(f"Error generating speech for chunk: {e}")
                        # Ensure model is unloaded if there was an error and not using subprocess
                        if not self.use_subprocess:
                            if not self.keep_model_loaded and self.tts_service.model is not None:
                                print("Unloading TTS model after error...")
                                self.tts_service.unload_model()
                            # Force cleanup after error
                            self._cleanup_memory()
                        # Continue with next chunk

                # Add longer pause between paragraphs
                if para_idx < len(paragraphs) - 1:
                    segment_results.append({
                        "type": "silence",
                        "duration": self.pause_between_paragraphs
                    })

        finally:
            # Always unload model at the end if it was kept loaded
            if self.keep_model_loaded and self.tts_service.model is not None:
                print("Final cleanup: Unloading TTS model...")
                self.tts_service.unload_model()
                self._cleanup_memory()

        # Concatenate all segments
        print("Concatenating audio segments...")
        concatenated_filename = f"{self.output_base_name}_audiobook.wav"
        concatenated_path = self.output_dir / concatenated_filename

        self.audio_manipulator.concatenate_audio_segments(
            segment_results=segment_results,
            output_concatenated_path=concatenated_path
        )

        # Create ZIP archive with all files
        print("Creating ZIP archive...")
        zip_filename = f"{self.output_base_name}_audiobook.zip"
        zip_path = self.output_dir / zip_filename

        # Collect all speech segment files
        speech_segment_paths = [
            Path(s["path"]) for s in segment_results
            if s["type"] == "speech" and Path(s["path"]).exists()
        ]

        self.audio_manipulator.create_zip_archive(
            segment_file_paths=speech_segment_paths,
            concatenated_audio_path=concatenated_path,
            output_zip_path=zip_path
        )

        print("Audiobook generation complete!")
        print(f"Audiobook file: {concatenated_path}")
        print(f"ZIP archive: {zip_path}")

        # Ensure model is unloaded at the end (just in case)
        if self.tts_service.model is not None:
            print("Final check: Unloading TTS model...")
            self.tts_service.unload_model()

        return concatenated_path
async def main():
    parser = argparse.ArgumentParser(description="Generate an audiobook from a text file using Chatterbox TTS")

    # Create a mutually exclusive group for the main operation vs listing speakers
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--list-speakers", action="store_true", help="List available speakers and exit")
    group.add_argument("text_file", nargs="?", help="Path to the text file to convert")

    # Other arguments
    parser.add_argument("--speaker", "-s", help="ID of the speaker to use")
    parser.add_argument("--output", "-o", help="Base name for output files (default: derived from text filename)")
    parser.add_argument("--device", default="mps", choices=["mps", "cuda", "cpu"], help="Device to use for TTS (default: mps)")
    parser.add_argument("--exaggeration", type=float, default=0.5, help="Controls expressiveness (0.0-1.0, default: 0.5)")
    parser.add_argument("--cfg-weight", type=float, default=0.5, help="Controls alignment with speaker (0.0-1.0, default: 0.5)")
    parser.add_argument("--temperature", type=float, default=0.8, help="Controls randomness (0.0-1.0, default: 0.8)")
    parser.add_argument("--sentence-pause", type=float, default=0.5, help="Pause between sentences in seconds (default: 0.5)")
    parser.add_argument("--paragraph-pause", type=float, default=1.0, help="Pause between paragraphs in seconds (default: 1.0)")
    parser.add_argument("--keep-model-loaded", action="store_true", help="Keep model loaded between chunks (faster but uses more memory)")
    parser.add_argument("--cleanup-interval", type=int, default=10, help="How often to perform deep cleanup when keeping model loaded (default: 10)")
    parser.add_argument("--force-cpu-on-oom", action="store_true", help="Automatically switch to CPU if MPS/CUDA runs out of memory")
    parser.add_argument("--max-chunk-length", type=int, default=300, help="Maximum chunk length for text splitting (default: 300)")
    parser.add_argument("--use-subprocess", action="store_true", help="Use separate processes for each chunk (guarantees memory release but slower)")

    args = parser.parse_args()

    # List speakers if requested
    if args.list_speakers:
        speaker_service = SpeakerManagementService()
        speakers = speaker_service.get_speakers()
        print("Available speakers:")
        for speaker in speakers:
            print(f"  {speaker.id}: {speaker.name}")
        return

    # Validate required arguments for audiobook generation
    if not args.text_file:
        parser.error("text_file is required when not using --list-speakers")

    if not args.speaker:
        parser.error("--speaker/-s is required when not using --list-speakers")

    # Determine output base name if not provided
    if not args.output:
        text_path = Path(args.text_file)
        args.output = text_path.stem

    try:
        # Create audiobook generator
        generator = AudiobookGenerator(
            speaker_id=args.speaker,
            output_base_name=args.output,
            device=args.device,
            exaggeration=args.exaggeration,
            cfg_weight=args.cfg_weight,
            temperature=args.temperature,
            pause_between_sentences=args.sentence_pause,
            pause_between_paragraphs=args.paragraph_pause,
            keep_model_loaded=args.keep_model_loaded,
            cleanup_interval=args.cleanup_interval,
            use_subprocess=args.use_subprocess
        )

        # Generate audiobook with automatic fallback
        try:
            await generator.generate_audiobook(args.text_file)
        except (RuntimeError, torch.OutOfMemoryError) as e:
            if args.force_cpu_on_oom and "out of memory" in str(e).lower() and args.device != "cpu":
                print(f"\n⚠️ {args.device.upper()} out of memory: {e}")
                print("🔄 Automatically switching to CPU and retrying...")

                # Create new generator with CPU
                generator = AudiobookGenerator(
                    speaker_id=args.speaker,
                    output_base_name=args.output,
                    device="cpu",
                    exaggeration=args.exaggeration,
                    cfg_weight=args.cfg_weight,
                    temperature=args.temperature,
                    pause_between_sentences=args.sentence_pause,
                    pause_between_paragraphs=args.paragraph_pause,
                    keep_model_loaded=args.keep_model_loaded,
                    cleanup_interval=args.cleanup_interval,
                    use_subprocess=args.use_subprocess
                )

                await generator.generate_audiobook(args.text_file)
                print("✅ Successfully completed using CPU fallback!")
            else:
                raise

    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        return 1

    return 0


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
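The `--force-cpu-on-oom` branch above reduces to catch-the-OOM-and-retry-on-CPU. A minimal sketch of that control flow under illustrative names (`run_with_cpu_fallback` and `fake_generate` are not part of the script):

```python
def run_with_cpu_fallback(task, device):
    """Run task(device); on an out-of-memory error, retry once on CPU."""
    try:
        return task(device)
    except RuntimeError as e:
        # Same check as the script: message-based OOM detection,
        # and never retry if we are already on CPU.
        if "out of memory" in str(e).lower() and device != "cpu":
            print(f"{device.upper()} out of memory, retrying on CPU...")
            return task("cpu")
        raise

# Example: a fake task that only succeeds on CPU
def fake_generate(device):
    if device != "cpu":
        raise RuntimeError("MPS backend out of memory")
    return f"generated on {device}"

print(run_with_cpu_fallback(fake_generate, "mps"))  # generated on cpu
```

Matching on the exception message is fragile but mirrors the script's approach; any non-OOM `RuntimeError` is re-raised unchanged.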
@@ -6,6 +6,9 @@ import yaml
 import torchaudio as ta
 from chatterbox.tts import ChatterboxTTS
 
+# Import helper to fix Python path
+import import_helper
+
 def split_text_at_sentence_boundaries(text, max_length=300):
     """
     Split text at sentence boundaries, ensuring each chunk is <= max_length.
@@ -1,22 +1,77 @@
 import argparse
+import gc
+import torch
 import torchaudio as ta
 from chatterbox.tts import ChatterboxTTS
+from contextlib import contextmanager
+
+# Import helper to fix Python path
+import import_helper
+
+def safe_load_chatterbox_tts(device):
+    """
+    Safely load ChatterboxTTS model with device mapping to handle CUDA->MPS/CPU conversion.
+    This patches torch.load temporarily to map CUDA tensors to the appropriate device.
+    """
+    @contextmanager
+    def patch_torch_load(target_device):
+        original_load = torch.load
+
+        def patched_load(*args, **kwargs):
+            # Add map_location to handle device mapping
+            if 'map_location' not in kwargs:
+                if target_device == "mps" and torch.backends.mps.is_available():
+                    kwargs['map_location'] = torch.device('mps')
+                else:
+                    kwargs['map_location'] = torch.device('cpu')
+            return original_load(*args, **kwargs)
+
+        torch.load = patched_load
+        try:
+            yield
+        finally:
+            torch.load = original_load
+
+    with patch_torch_load(device):
+        return ChatterboxTTS.from_pretrained(device=device)
+
 def main():
     parser = argparse.ArgumentParser(description="Chatterbox TTS audio generation")
     parser.add_argument('--sample', required=True, type=str, help='Prompt/reference audio file (e.g. .wav, .mp3) for the voice')
     parser.add_argument('--output', required=True, type=str, help='Output audio file path (should end with .wav)')
     parser.add_argument('--text', required=True, type=str, help='Text to synthesize')
+    parser.add_argument('--device', default="mps", choices=["mps", "cuda", "cpu"], help='Device to use for TTS (default: mps)')
     args = parser.parse_args()
 
-    # Load model on MPS (for Apple Silicon)
-    model = ChatterboxTTS.from_pretrained(device="mps")
+    model = None
+    wav = None
+
+    try:
+        # Load model with safe device mapping
+        model = safe_load_chatterbox_tts(args.device)
 
-    # Generate the audio
-    wav = model.generate(args.text, audio_prompt_path=args.sample)
-    # Save to output .wav
-    ta.save(args.output, wav, model.sr)
-    print(f"Generated audio saved to {args.output}")
+        # Generate the audio
+        with torch.no_grad():
+            wav = model.generate(args.text, audio_prompt_path=args.sample)
+
+        # Save to output .wav
+        ta.save(args.output, wav, model.sr)
+        print(f"Generated audio saved to {args.output}")
+
+    finally:
+        # Explicit cleanup
+        if wav is not None:
+            del wav
+        if model is not None:
+            del model
+
+        # Force cleanup
+        gc.collect()
+        if args.device == "cuda" and torch.cuda.is_available():
+            torch.cuda.empty_cache()
+        elif args.device == "mps" and torch.backends.mps.is_available():
+            if hasattr(torch.mps, "empty_cache"):
+                torch.mps.empty_cache()
+
 if __name__ == '__main__':
     main()
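`safe_load_chatterbox_tts` works by temporarily replacing a module-level function and restoring it in `finally`, so the patch cannot leak even if loading raises. The same pattern in isolation, with a stand-in `load` function instead of `torch.load`:

```python
from contextlib import contextmanager

def load(path):
    # Stand-in for torch.load in this sketch
    return f"loaded {path}"

@contextmanager
def patched(namespace, name, replacement):
    # Swap a global, guaranteeing restoration even if the body raises.
    original = namespace[name]
    namespace[name] = replacement
    try:
        yield
    finally:
        namespace[name] = original

with patched(globals(), "load", lambda p: f"loaded {p} (map_location=cpu)"):
    inside = load("model.pt")   # sees the patched version
outside = load("model.pt")      # original restored after the with-block

print(inside, "|", outside)
```

The `try`/`finally` inside the generator is what makes the restoration unconditional; the diff above applies exactly this shape to `torch.load`.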
@@ -0,0 +1,2 @@
+# yaml-language-server: $schema=https://raw.githubusercontent.com/antinomyhq/forge/refs/heads/main/forge.schema.json
+model: qwen/qwen3-coder
@@ -24,7 +24,7 @@
   --text-blue-darker: #205081;
 
   /* Border Colors */
-  --border-light: #1b0404;
+  --border-light: #e5e7eb;
   --border-medium: #cfd8dc;
   --border-blue: #b5c6df;
   --border-gray: #e3e3e3;
@@ -55,7 +55,7 @@ body {
 }
 
 .container {
-  max-width: 1100px;
+  max-width: 1280px;
   margin: 0 auto;
   padding: 0 18px;
 }
@@ -134,6 +134,17 @@ main {
   font-size: 1rem;
 }
 
+/* Allow wrapping for Text/Duration (3rd) column */
+#dialog-items-table td:nth-child(3),
+#dialog-items-table td.dialog-editable-cell {
+  white-space: pre-wrap;      /* wrap text and preserve newlines */
+  overflow: visible;          /* override global overflow hidden */
+  text-overflow: clip;        /* no ellipsis */
+  word-break: break-word;     /* wrap long words/URLs */
+  color: var(--text-primary); /* darker text for readability */
+  font-weight: 350;           /* slightly heavier than 300, lighter than 400 */
+}
+
 /* Make the Speaker (2nd) column narrower */
 #dialog-items-table th:nth-child(2), #dialog-items-table td:nth-child(2) {
   width: 60px;
@@ -142,11 +153,11 @@ main {
   text-align: center;
 }
 
-/* Make the Actions (4th) column narrower */
+/* Actions (4th) column sizing */
 #dialog-items-table th:nth-child(4), #dialog-items-table td:nth-child(4) {
-  width: 110px;
-  min-width: 90px;
-  max-width: 130px;
+  width: 200px;
+  min-width: 180px;
+  max-width: 280px;
   text-align: left;
   padding-left: 0;
   padding-right: 0;
@@ -186,8 +197,22 @@ main {
 
 #dialog-items-table td.actions {
   text-align: left;
-  min-width: 110px;
-  white-space: nowrap;
+  min-width: 200px;
+  white-space: normal; /* allow wrapping so we don't see ellipsis */
+  overflow: visible;   /* override table cell default from global rule */
+  text-overflow: clip; /* no ellipsis */
+}
+
+/* Allow wrapping of action buttons on smaller screens */
+@media (max-width: 900px) {
+  #dialog-items-table th:nth-child(4), #dialog-items-table td:nth-child(4) {
+    width: auto;
+    min-width: 160px;
+    max-width: none;
+  }
+  #dialog-items-table td.actions {
+    white-space: normal;
+  }
 }
 
 /* Collapsible log details */
@@ -346,7 +371,7 @@ button {
   margin-right: 10px;
 }
 
-.generate-line-btn, .play-line-btn {
+.generate-line-btn, .play-line-btn, .stop-line-btn {
   background: var(--bg-blue-light);
   color: var(--text-blue);
   border: 1.5px solid var(--border-blue);
@@ -363,7 +388,7 @@ button {
   vertical-align: middle;
 }
 
-.generate-line-btn:disabled, .play-line-btn:disabled {
+.generate-line-btn:disabled, .play-line-btn:disabled, .stop-line-btn:disabled {
   opacity: 0.45;
   cursor: not-allowed;
 }
@@ -374,7 +399,7 @@ button {
   border-color: var(--warning-border);
 }
 
-.generate-line-btn:hover, .play-line-btn:hover {
+.generate-line-btn:hover, .play-line-btn:hover, .stop-line-btn:hover {
   background: var(--bg-blue-lighter);
   color: var(--text-blue-darker);
   border-color: var(--text-blue);
@@ -449,6 +474,72 @@ footer {
   border-top: 3px solid var(--primary-blue);
 }
 
+/* Inline Notification */
+.notice {
+  max-width: 1280px;
+  margin: 16px auto 0;
+  padding: 12px 16px;
+  border-radius: 6px;
+  border: 1px solid var(--border-medium);
+  background: var(--bg-white);
+  color: var(--text-primary);
+  display: flex;
+  align-items: center;
+  gap: 12px;
+  box-shadow: 0 1px 2px var(--shadow-light);
+}
+
+.notice--info {
+  border-color: var(--border-blue);
+  background: var(--bg-blue-light);
+}
+
+.notice--success {
+  border-color: #A7F3D0;
+  background: #ECFDF5;
+}
+
+.notice--warning {
+  border-color: var(--warning-border);
+  background: var(--warning-bg);
+}
+
+.notice--error {
+  border-color: var(--error-bg-dark);
+  background: #FEE2E2;
+}
+
+.notice__content {
+  flex: 1;
+}
+
+.notice__actions {
+  display: flex;
+  gap: 8px;
+}
+
+.notice__actions button {
+  padding: 6px 12px;
+  border-radius: 4px;
+  border: 1px solid var(--border-medium);
+  background: var(--bg-white);
+  cursor: pointer;
+}
+
+.notice__actions .btn-primary {
+  background: var(--primary-blue);
+  color: var(--text-white);
+  border: none;
+}
+
+.notice__close {
+  background: none;
+  border: none;
+  font-size: 18px;
+  cursor: pointer;
+  color: var(--text-secondary);
+}
+
 @media (max-width: 900px) {
   .panel-grid {
     flex-direction: column;
@@ -11,8 +11,38 @@
   <div class="container">
     <h1>Chatterbox TTS</h1>
   </div>
+
+  <!-- Paste Script Modal -->
+  <div id="paste-script-modal" class="modal" style="display: none;">
+    <div class="modal-content">
+      <div class="modal-header">
+        <h3>Paste Dialog Script</h3>
+        <button class="modal-close" id="paste-script-close">&times;</button>
+      </div>
+      <div class="modal-body">
+        <p>Paste JSONL content (one JSON object per line). Example lines:</p>
+        <pre style="white-space:pre-wrap; background:#f6f8fa; padding:8px; border-radius:4px;">
+{"type":"speech","speaker_id":"alice","text":"Hello there!"}
+{"type":"silence","duration":0.5}
+{"type":"speech","speaker_id":"bob","text":"Hi!"}
+        </pre>
+        <textarea id="paste-script-text" rows="10" style="width:100%;" placeholder='Paste JSONL here'></textarea>
+      </div>
+      <div class="modal-footer">
+        <button id="paste-script-load" class="btn-primary">Load</button>
+        <button id="paste-script-cancel" class="btn-secondary">Cancel</button>
+      </div>
+    </div>
+  </div>
 </header>
 
+<!-- Global inline notification area -->
+<div id="global-notice" class="notice" role="status" aria-live="polite" style="display:none;">
+  <div class="notice__content" id="global-notice-content"></div>
+  <div class="notice__actions" id="global-notice-actions"></div>
+  <button class="notice__close" id="global-notice-close" aria-label="Close notification">&times;</button>
+</div>
+
 <main class="container" role="main">
   <div class="panel-grid">
     <section id="dialog-editor" class="panel full-width-panel" aria-labelledby="dialog-editor-title">
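The modal expects JSONL, one dialog item per line, with `speech` items carrying `speaker_id`/`text` and `silence` items carrying a numeric `duration`. A sketch of the validation a loader for this format could perform (a Python stand-in for illustration; the actual frontend code may validate differently):

```python
import json

def parse_dialog_jsonl(raw):
    """Parse JSONL dialog script text into a list of validated items."""
    items = []
    for lineno, line in enumerate(raw.splitlines(), 1):
        if not line.strip():
            continue  # skip blank lines
        item = json.loads(line)  # raises ValueError on malformed JSON
        if item.get("type") == "speech":
            if not item.get("speaker_id") or not item.get("text"):
                raise ValueError(f"line {lineno}: speech needs speaker_id and text")
        elif item.get("type") == "silence":
            if not isinstance(item.get("duration"), (int, float)):
                raise ValueError(f"line {lineno}: silence needs numeric duration")
        else:
            raise ValueError(f"line {lineno}: unknown type {item.get('type')!r}")
        items.append(item)
    return items

script = '{"type":"speech","speaker_id":"alice","text":"Hello there!"}\n{"type":"silence","duration":0.5}'
print(parse_dialog_jsonl(script))
```

Reporting the offending line number makes paste errors easy to fix, since JSONL parsing fails per line rather than for the whole document.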
@@ -48,6 +78,7 @@
   <button id="save-script-btn">Save Script</button>
   <input type="file" id="load-script-input" accept=".jsonl" style="display: none;">
   <button id="load-script-btn">Load Script</button>
+  <button id="paste-script-btn">Paste Script</button>
 </div>
 </section>
 </div>
@@ -101,8 +132,8 @@
 </div>
 </footer>
 
 <!-- TTS Settings Modal -->
 <div id="tts-settings-modal" class="modal" style="display: none;">
   <div class="modal-content">
     <div class="modal-header">
       <h3>TTS Settings</h3>
@@ -10,7 +10,7 @@ const API_BASE_URL = API_BASE_URL_WITH_PREFIX;
  * @throws {Error} If the network response is not ok.
  */
 export async function getSpeakers() {
-  const response = await fetch(`${API_BASE_URL}/speakers/`);
+  const response = await fetch(`${API_BASE_URL}/speakers`);
   if (!response.ok) {
     const errorData = await response.json().catch(() => ({ message: response.statusText }));
     throw new Error(`Failed to fetch speakers: ${errorData.detail || errorData.message || response.statusText}`);
@@ -26,12 +26,12 @@ export async function getSpeakers() {
  * Adds a new speaker.
  * @param {FormData} formData - The form data containing speaker name and audio file.
  * Example: formData.append('name', 'New Speaker');
- *          formData.append('audio_sample_file', fileInput.files[0]);
+ *          formData.append('audio_file', fileInput.files[0]);
  * @returns {Promise<Object>} A promise that resolves to the new speaker object.
  * @throws {Error} If the network response is not ok.
  */
 export async function addSpeaker(formData) {
-  const response = await fetch(`${API_BASE_URL}/speakers/`, {
+  const response = await fetch(`${API_BASE_URL}/speakers`, {
     method: 'POST',
     body: formData, // FormData sets Content-Type to multipart/form-data automatically
   });
@@ -86,7 +86,7 @@ export async function addSpeaker(formData) {
  * @throws {Error} If the network response is not ok.
  */
 export async function deleteSpeaker(speakerId) {
-  const response = await fetch(`${API_BASE_URL}/speakers/${speakerId}/`, {
+  const response = await fetch(`${API_BASE_URL}/speakers/${speakerId}`, {
     method: 'DELETE',
   });
   if (!response.ok) {
@ -124,18 +124,8 @@ export async function generateLine(line) {
|
||||||
const errorData = await response.json().catch(() => ({ message: response.statusText }));
|
const errorData = await response.json().catch(() => ({ message: response.statusText }));
|
||||||
throw new Error(`Failed to generate line audio: ${errorData.detail || errorData.message || response.statusText}`);
|
throw new Error(`Failed to generate line audio: ${errorData.detail || errorData.message || response.statusText}`);
|
||||||
}
|
}
|
||||||
|
const data = await response.json();
|
||||||
const responseText = await response.text();
|
return data;
|
||||||
console.log('Raw response text:', responseText);
|
|
||||||
|
|
||||||
try {
|
|
||||||
const jsonData = JSON.parse(responseText);
|
|
||||||
console.log('Parsed JSON:', jsonData);
|
|
||||||
return jsonData;
|
|
||||||
} catch (parseError) {
|
|
||||||
console.error('JSON parse error:', parseError);
|
|
||||||
throw new Error(`Invalid JSON response: ${responseText}`);
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
@ -146,7 +136,7 @@ export async function generateLine(line) {
|
||||||
* output_base_name: "my_dialog",
|
* output_base_name: "my_dialog",
|
||||||
* dialog_items: [
|
* dialog_items: [
|
||||||
* { type: "speech", speaker_id: "speaker1", text: "Hello world.", exaggeration: 1.0, cfg_weight: 2.0, temperature: 0.7 },
|
* { type: "speech", speaker_id: "speaker1", text: "Hello world.", exaggeration: 1.0, cfg_weight: 2.0, temperature: 0.7 },
|
||||||
* { type: "silence", duration_ms: 500 },
|
* { type: "silence", duration: 0.5 },
|
||||||
* { type: "speech", speaker_id: "speaker2", text: "How are you?" }
|
* { type: "speech", speaker_id: "speaker2", text: "How are you?" }
|
||||||
* ]
|
* ]
|
||||||
* }
|
* }
|
||||||
|
|
|
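The doc-comment hunk above changes the silence-item schema from milliseconds (`duration_ms: 500`) to seconds (`duration: 0.5`). A hedged sketch of a migration helper for old-style items follows; the name `normalizeSilenceDuration` is mine, not from the repo:

```javascript
// Hypothetical helper (not in the repo): accept both the old `duration_ms`
// field and the new `duration` (seconds) field, and emit the new shape.
function normalizeSilenceDuration(item) {
    if (item.type !== 'silence') return item;
    const { duration_ms, duration, ...rest } = item;
    // Prefer the new field when present; otherwise convert ms -> s.
    const seconds = duration !== undefined ? duration : duration_ms / 1000;
    return { ...rest, duration: seconds };
}
```

This keeps scripts saved under the old schema loadable without touching items that already use seconds.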
@@ -1,6 +1,69 @@
 import { getSpeakers, addSpeaker, deleteSpeaker, generateDialog } from './api.js';
 import { API_BASE_URL, API_BASE_URL_FOR_FILES } from './config.js';

+// Shared per-line audio playback state to prevent overlapping playback
+let currentLineAudio = null;
+let currentLinePlayBtn = null;
+let currentLineStopBtn = null;
+
+// --- Global Inline Notification Helpers --- //
+const noticeEl = document.getElementById('global-notice');
+const noticeContentEl = document.getElementById('global-notice-content');
+const noticeActionsEl = document.getElementById('global-notice-actions');
+const noticeCloseBtn = document.getElementById('global-notice-close');
+
+function hideNotice() {
+    if (!noticeEl) return;
+    noticeEl.style.display = 'none';
+    noticeEl.className = 'notice';
+    if (noticeContentEl) noticeContentEl.textContent = '';
+    if (noticeActionsEl) noticeActionsEl.innerHTML = '';
+}
+
+function showNotice(message, type = 'info', options = {}) {
+    if (!noticeEl || !noticeContentEl || !noticeActionsEl) {
+        console[type === 'error' ? 'error' : 'log']('[NOTICE]', message);
+        return () => {};
+    }
+    const { timeout = null, actions = [] } = options;
+    noticeEl.className = `notice notice--${type}`;
+    noticeContentEl.textContent = message;
+    noticeActionsEl.innerHTML = '';
+
+    actions.forEach(({ text, primary = false, onClick }) => {
+        const btn = document.createElement('button');
+        btn.textContent = text;
+        if (primary) btn.classList.add('btn-primary');
+        btn.onclick = () => {
+            try { onClick && onClick(); } finally { hideNotice(); }
+        };
+        noticeActionsEl.appendChild(btn);
+    });
+
+    if (noticeCloseBtn) noticeCloseBtn.onclick = hideNotice;
+    noticeEl.style.display = 'flex';
+
+    let timerId = null;
+    if (timeout && Number.isFinite(timeout)) {
+        timerId = window.setTimeout(hideNotice, timeout);
+    }
+    return () => {
+        if (timerId) window.clearTimeout(timerId);
+        hideNotice();
+    };
+}
+
+function confirmAction(message) {
+    return new Promise((resolve) => {
+        showNotice(message, 'warning', {
+            actions: [
+                { text: 'Cancel', primary: false, onClick: () => resolve(false) },
+                { text: 'Confirm', primary: true, onClick: () => resolve(true) },
+            ],
+        });
+    });
+}
+
 document.addEventListener('DOMContentLoaded', async () => {
     console.log('DOM fully loaded and parsed');
     initializeSpeakerManagement();
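The hunk above replaces blocking `confirm()` calls with a Promise-based `confirmAction()` built on the inline notice UI: the notice exposes action buttons, and the promise settles when one fires. The pattern can be sketched DOM-free; `makeConfirm` and `makeRecordingNotify` are illustrative names, not part of the repo:

```javascript
// DOM-free sketch of the confirm-as-Promise pattern: `notify` stands in for
// the real DOM-backed showNotice(message, type, { actions }).
function makeConfirm(notify) {
    return function confirmAction(message) {
        return new Promise((resolve) => {
            notify(message, 'warning', {
                actions: [
                    { text: 'Cancel', primary: false, onClick: () => resolve(false) },
                    { text: 'Confirm', primary: true, onClick: () => resolve(true) },
                ],
            });
        });
    };
}

// Minimal stub that records calls so a caller (or a test) can simulate a
// button press by invoking one of the recorded action handlers.
function makeRecordingNotify() {
    const calls = [];
    const notify = (message, type, options) => calls.push({ message, type, options });
    return { notify, calls };
}
```

Because the handlers simply call `resolve`, call sites can `await confirmAction(...)` instead of branching on a synchronous `confirm()` return value.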
@@ -23,18 +86,24 @@ function initializeSpeakerManagement() {
         const audioFile = formData.get('audio_file');

         if (!speakerName || !audioFile || audioFile.size === 0) {
-            alert('Please provide a speaker name and an audio file.');
+            showNotice('Please provide a speaker name and an audio file.', 'warning', { timeout: 4000 });
             return;
         }

         try {
+            const submitBtn = addSpeakerForm.querySelector('button[type="submit"]');
+            const prevText = submitBtn ? submitBtn.textContent : null;
+            if (submitBtn) { submitBtn.disabled = true; submitBtn.textContent = 'Adding…'; }
             const newSpeaker = await addSpeaker(formData);
-            alert(`Speaker added: ${newSpeaker.name} (ID: ${newSpeaker.id})`);
+            showNotice(`Speaker added: ${newSpeaker.name} (ID: ${newSpeaker.id})`, 'success', { timeout: 3000 });
             addSpeakerForm.reset();
             loadSpeakers(); // Refresh speaker list
         } catch (error) {
             console.error('Failed to add speaker:', error);
-            alert('Error adding speaker: ' + error.message);
+            showNotice('Error adding speaker: ' + error.message, 'error');
+        } finally {
+            const submitBtn = addSpeakerForm.querySelector('button[type="submit"]');
+            if (submitBtn) { submitBtn.disabled = false; submitBtn.textContent = 'Add Speaker'; }
         }
     });
 }
@@ -79,23 +148,24 @@ async function loadSpeakers() {
     } catch (error) {
         console.error('Failed to load speakers:', error);
         speakerListUL.innerHTML = '<li>Error loading speakers. See console for details.</li>';
-        alert('Error loading speakers: ' + error.message);
+        showNotice('Error loading speakers: ' + error.message, 'error');
     }
 }

 async function handleDeleteSpeaker(speakerId) {
     if (!speakerId) {
-        alert('Cannot delete speaker: Speaker ID is missing.');
+        showNotice('Cannot delete speaker: Speaker ID is missing.', 'warning', { timeout: 4000 });
         return;
     }
-    if (!confirm(`Are you sure you want to delete speaker ${speakerId}?`)) return;
+    const ok = await confirmAction(`Are you sure you want to delete speaker ${speakerId}?`);
+    if (!ok) return;
     try {
         await deleteSpeaker(speakerId);
-        alert(`Speaker ${speakerId} deleted successfully.`);
+        showNotice(`Speaker ${speakerId} deleted successfully.`, 'success', { timeout: 3000 });
         loadSpeakers(); // Refresh speaker list
     } catch (error) {
         console.error(`Failed to delete speaker ${speakerId}:`, error);
-        alert(`Error deleting speaker: ${error.message}`);
+        showNotice(`Error deleting speaker: ${error.message}`, 'error');
     }
 }

@@ -131,6 +201,12 @@ async function initializeDialogEditor() {
     const saveScriptBtn = document.getElementById('save-script-btn');
     const loadScriptBtn = document.getElementById('load-script-btn');
     const loadScriptInput = document.getElementById('load-script-input');
+    const pasteScriptBtn = document.getElementById('paste-script-btn');
+    const pasteModal = document.getElementById('paste-script-modal');
+    const pasteText = document.getElementById('paste-script-text');
+    const pasteLoadBtn = document.getElementById('paste-script-load');
+    const pasteCancelBtn = document.getElementById('paste-script-cancel');
+    const pasteCloseBtn = document.getElementById('paste-script-close');

     // Results Display Elements
     const generationLogPre = document.getElementById('generation-log-content'); // Corrected ID
|
||||||
const zipArchivePlaceholder = document.getElementById('zip-archive-placeholder');
|
const zipArchivePlaceholder = document.getElementById('zip-archive-placeholder');
|
||||||
const resultsDisplaySection = document.getElementById('results-display');
|
const resultsDisplaySection = document.getElementById('results-display');
|
||||||
|
|
||||||
let dialogItems = [];
|
|
||||||
let availableSpeakersCache = []; // Cache for speaker names and IDs
|
|
||||||
|
|
||||||
// Load speakers at startup
|
// Load speakers at startup
|
||||||
try {
|
try {
|
||||||
availableSpeakersCache = await getSpeakers();
|
availableSpeakersCache = await getSpeakers();
|
||||||
|
@@ -152,6 +225,48 @@ async function initializeDialogEditor() {
         // Continue without speakers - they'll be loaded when needed
     }

+    // --- LocalStorage persistence helpers ---
+    const LS_KEY = 'dialogEditor.items.v1';
+
+    function saveDialogToLocalStorage() {
+        try {
+            const exportData = dialogItems.map(item => {
+                const obj = { type: item.type };
+                if (item.type === 'speech') {
+                    obj.speaker_id = item.speaker_id;
+                    obj.text = item.text;
+                    if (item.exaggeration !== undefined) obj.exaggeration = item.exaggeration;
+                    if (item.cfg_weight !== undefined) obj.cfg_weight = item.cfg_weight;
+                    if (item.temperature !== undefined) obj.temperature = item.temperature;
+                    if (item.audioUrl) obj.audioUrl = item.audioUrl; // keep existing audio reference if present
+                } else if (item.type === 'silence') {
+                    obj.duration = item.duration;
+                }
+                return obj;
+            });
+            localStorage.setItem(LS_KEY, JSON.stringify({ items: exportData }));
+        } catch (e) {
+            console.warn('Failed to save dialog to localStorage:', e);
+        }
+    }
+
+    function loadDialogFromLocalStorage() {
+        try {
+            const raw = localStorage.getItem(LS_KEY);
+            if (!raw) return;
+            const parsed = JSON.parse(raw);
+            if (!parsed || !Array.isArray(parsed.items)) return;
+            const loaded = parsed.items.map(normalizeDialogItem);
+            dialogItems.splice(0, dialogItems.length, ...loaded);
+            console.log(`Restored ${loaded.length} dialog items from localStorage`);
+        } catch (e) {
+            console.warn('Failed to load dialog from localStorage:', e);
+        }
+    }
+
+    // Attempt to restore saved dialog before first render
+    loadDialogFromLocalStorage();
+
     // Function to render the current dialogItems array to the DOM as table rows
     function renderDialogItems() {
         if (!dialogItemsContainer) return;
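The persistence hunk above whitelists which fields survive a save: transient UI state (`isGenerating`, `error`) is dropped, optional tuning knobs are kept only when explicitly set. That mapping can be factored as a pure function; the name `serializeDialogItems` is hypothetical, but the logic mirrors `saveDialogToLocalStorage` in the diff:

```javascript
// Pure version of the field filtering done in saveDialogToLocalStorage():
// only whitelisted fields are persisted per item type.
function serializeDialogItems(dialogItems) {
    return dialogItems.map(item => {
        const obj = { type: item.type };
        if (item.type === 'speech') {
            obj.speaker_id = item.speaker_id;
            obj.text = item.text;
            if (item.exaggeration !== undefined) obj.exaggeration = item.exaggeration;
            if (item.cfg_weight !== undefined) obj.cfg_weight = item.cfg_weight;
            if (item.temperature !== undefined) obj.temperature = item.temperature;
            if (item.audioUrl) obj.audioUrl = item.audioUrl; // keep existing audio reference
        } else if (item.type === 'silence') {
            obj.duration = item.duration;
        }
        return obj;
    });
}
```

Factoring it out would also let the save path and the JSONL export share one serializer, though the diff keeps them separate.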
@@ -184,6 +299,8 @@ async function initializeDialogEditor() {
                 });
                 speakerSelect.onchange = (e) => {
                     dialogItems[index].speaker_id = e.target.value;
+                    // Persist change
+                    saveDialogToLocalStorage();
                 };
                 speakerTd.appendChild(speakerSelect);
             } else {
@@ -195,8 +312,7 @@ async function initializeDialogEditor() {
             const textTd = document.createElement('td');
             textTd.className = 'dialog-editable-cell';
             if (item.type === 'speech') {
-                let txt = item.text.length > 60 ? item.text.substring(0, 57) + '…' : item.text;
-                textTd.textContent = `"${txt}"`;
+                textTd.textContent = `"${item.text}"`;
                 textTd.title = item.text;
             } else {
                 textTd.textContent = `${item.duration}s`;
@@ -243,6 +359,8 @@ async function initializeDialogEditor() {
                     if (!isNaN(val) && val > 0) dialogItems[index].duration = val;
                     dialogItems[index].audioUrl = null;
                 }
+                // Persist changes before re-render
+                saveDialogToLocalStorage();
                 renderDialogItems();
             }
         };
@@ -261,6 +379,7 @@ async function initializeDialogEditor() {
             upBtn.onclick = () => {
                 if (index > 0) {
                     [dialogItems[index - 1], dialogItems[index]] = [dialogItems[index], dialogItems[index - 1]];
+                    saveDialogToLocalStorage();
                     renderDialogItems();
                 }
             };
@@ -275,6 +394,7 @@ async function initializeDialogEditor() {
             downBtn.onclick = () => {
                 if (index < dialogItems.length - 1) {
                     [dialogItems[index], dialogItems[index + 1]] = [dialogItems[index + 1], dialogItems[index]];
+                    saveDialogToLocalStorage();
                     renderDialogItems();
                 }
             };
@@ -288,6 +408,7 @@ async function initializeDialogEditor() {
             removeBtn.title = 'Remove';
             removeBtn.onclick = () => {
                 dialogItems.splice(index, 1);
+                saveDialogToLocalStorage();
                 renderDialogItems();
             };
             actionsTd.appendChild(removeBtn);
@@ -314,6 +435,8 @@ async function initializeDialogEditor() {
                     if (result && result.audio_url) {
                         dialogItems[index].audioUrl = result.audio_url;
                         console.log('Set audioUrl to:', result.audio_url);
+                        // Persist newly generated audio reference
+                        saveDialogToLocalStorage();
                     } else {
                         console.error('Invalid result structure:', result);
                         throw new Error('Invalid response: missing audio_url');
@@ -321,7 +444,7 @@ async function initializeDialogEditor() {
                 } catch (err) {
                     console.error('Error in generateLine:', err);
                     dialogItems[index].error = err.message || 'Failed to generate audio.';
-                    alert(dialogItems[index].error);
+                    showNotice(dialogItems[index].error, 'error');
                 } finally {
                     dialogItems[index].isGenerating = false;
                     renderDialogItems();
@@ -330,19 +453,107 @@ async function initializeDialogEditor() {
             actionsTd.appendChild(generateBtn);

             // --- NEW: Per-line Play button ---
-            const playBtn = document.createElement('button');
-            playBtn.innerHTML = '⏵';
-            playBtn.title = item.audioUrl ? 'Play generated audio' : 'No audio generated yet';
-            playBtn.className = 'play-line-btn';
-            playBtn.disabled = !item.audioUrl;
-            playBtn.onclick = () => {
-                if (!item.audioUrl) return;
-                let audioUrl = item.audioUrl.startsWith('http') ? item.audioUrl : `${API_BASE_URL_FOR_FILES}${item.audioUrl}`;
-                // Use a shared audio element or create one per play
-                let audio = new window.Audio(audioUrl);
-                audio.play();
+            const playPauseBtn = document.createElement('button');
+            playPauseBtn.innerHTML = '⏵';
+            playPauseBtn.title = item.audioUrl ? 'Play' : 'No audio generated yet';
+            playPauseBtn.className = 'play-line-btn';
+            playPauseBtn.disabled = !item.audioUrl;
+
+            const stopBtn = document.createElement('button');
+            stopBtn.innerHTML = '⏹';
+            stopBtn.title = 'Stop';
+            stopBtn.className = 'stop-line-btn';
+            stopBtn.disabled = !item.audioUrl;
+
+            const setBtnStatesForPlaying = () => {
+                try {
+                    playPauseBtn.innerHTML = '⏸';
+                    playPauseBtn.title = 'Pause';
+                    stopBtn.disabled = false;
+                } catch (e) { /* detached */ }
             };
-            actionsTd.appendChild(playBtn);
+            const setBtnStatesForPausedOrStopped = () => {
+                try {
+                    playPauseBtn.innerHTML = '⏵';
+                    playPauseBtn.title = 'Play';
+                } catch (e) { /* detached */ }
+            };
+
+            const stopCurrent = () => {
+                if (currentLineAudio) {
+                    try { currentLineAudio.pause(); currentLineAudio.currentTime = 0; } catch (e) { /* noop */ }
+                }
+                if (currentLinePlayBtn) {
+                    try { currentLinePlayBtn.innerHTML = '⏵'; currentLinePlayBtn.title = 'Play'; } catch (e) { /* detached */ }
+                }
+                if (currentLineStopBtn) {
+                    try { currentLineStopBtn.disabled = true; } catch (e) { /* detached */ }
+                }
+                currentLineAudio = null;
+                currentLinePlayBtn = null;
+                currentLineStopBtn = null;
+            };
+
+            playPauseBtn.onclick = () => {
+                if (!item.audioUrl) return;
+                const audioUrl = item.audioUrl.startsWith('http') ? item.audioUrl : `${API_BASE_URL_FOR_FILES}${item.audioUrl}`;
+
+                // If controlling the same line
+                if (currentLineAudio && currentLinePlayBtn === playPauseBtn) {
+                    if (currentLineAudio.paused) {
+                        // Resume
+                        currentLineAudio.play().then(() => setBtnStatesForPlaying()).catch(err => {
+                            console.error('Audio resume failed:', err);
+                            showNotice('Could not resume audio.', 'error', { timeout: 2000 });
+                        });
+                    } else {
+                        // Pause
+                        try { currentLineAudio.pause(); } catch (e) { /* noop */ }
+                        setBtnStatesForPausedOrStopped();
+                    }
+                    return;
+                }
+
+                // Switching to a different line: stop previous
+                if (currentLineAudio) {
+                    stopCurrent();
+                }
+
+                // Start new audio
+                const audio = new window.Audio(audioUrl);
+                currentLineAudio = audio;
+                currentLinePlayBtn = playPauseBtn;
+                currentLineStopBtn = stopBtn;
+
+                const clearState = () => {
+                    if (currentLineAudio === audio) {
+                        setBtnStatesForPausedOrStopped();
+                        try { stopBtn.disabled = true; } catch (e) { /* detached */ }
+                        currentLineAudio = null;
+                        currentLinePlayBtn = null;
+                        currentLineStopBtn = null;
+                    }
+                };
+
+                audio.addEventListener('ended', clearState, { once: true });
+                audio.addEventListener('error', clearState, { once: true });
+
+                audio.play().then(() => setBtnStatesForPlaying()).catch(err => {
+                    console.error('Audio play failed:', err);
+                    clearState();
+                    showNotice('Could not play audio.', 'error', { timeout: 2000 });
+                });
+            };
+
+            stopBtn.onclick = () => {
+                // Only acts if this line is the active one
+                if (currentLineAudio && currentLinePlayBtn === playPauseBtn) {
+                    stopCurrent();
+                }
+            };
+
+            actionsTd.appendChild(playPauseBtn);
+            actionsTd.appendChild(stopBtn);

             // --- NEW: Settings button for speech items ---
             if (item.type === 'speech') {
@@ -383,13 +594,13 @@ async function initializeDialogEditor() {
             try {
                 availableSpeakersCache = await getSpeakers();
             } catch (error) {
-                alert('Could not load speakers. Please try again.');
+                showNotice('Could not load speakers. Please try again.', 'error');
                 console.error('Error fetching speakers for dialog:', error);
                 return;
             }
         }
         if (availableSpeakersCache.length === 0) {
-            alert('No speakers available. Please add a speaker first.');
+            showNotice('No speakers available. Please add a speaker first.', 'warning', { timeout: 4000 });
             return;
         }

@@ -419,10 +630,11 @@ async function initializeDialogEditor() {
             const speakerId = speakerSelect.value;
             const text = textInput.value.trim();
             if (!speakerId || !text) {
-                alert('Please select a speaker and enter text.');
+                showNotice('Please select a speaker and enter text.', 'warning', { timeout: 4000 });
                 return;
             }
             dialogItems.push(normalizeDialogItem({ type: 'speech', speaker_id: speakerId, text: text }));
+            saveDialogToLocalStorage();
             renderDialogItems();
             clearTempInputArea();
         };
@@ -461,10 +673,11 @@ async function initializeDialogEditor() {
         addButton.onclick = () => {
             const duration = parseFloat(durationInput.value);
             if (isNaN(duration) || duration <= 0) {
-                alert('Invalid duration. Please enter a positive number.');
+                showNotice('Invalid duration. Please enter a positive number.', 'warning', { timeout: 4000 });
                 return;
             }
             dialogItems.push(normalizeDialogItem({ type: 'silence', duration: duration }));
+            saveDialogToLocalStorage();
             renderDialogItems();
             clearTempInputArea();
         };
@@ -486,15 +699,18 @@ async function initializeDialogEditor() {
     generateDialogBtn.addEventListener('click', async () => {
         const outputBaseName = outputBaseNameInput.value.trim();
         if (!outputBaseName) {
-            alert('Please enter an output base name.');
+            showNotice('Please enter an output base name.', 'warning', { timeout: 4000 });
             outputBaseNameInput.focus();
             return;
         }
         if (dialogItems.length === 0) {
-            alert('Please add at least one speech or silence line to the dialog.');
+            showNotice('Please add at least one speech or silence line to the dialog.', 'warning', { timeout: 4000 });
             return; // Prevent further execution if no dialog items
         }

+        const prevText = generateDialogBtn.textContent;
+        generateDialogBtn.disabled = true;
+        generateDialogBtn.textContent = 'Generating…';
         // Smart dialog-wide generation: use pre-generated audio where present
         const dialogItemsToGenerate = dialogItems.map(item => {
             // Only send minimal fields for items that need generation
@@ -546,7 +762,11 @@ async function initializeDialogEditor() {
         } catch (error) {
             console.error('Dialog generation failed:', error);
             if (generationLogPre) generationLogPre.textContent = `Error generating dialog: ${error.message}`;
-            alert(`Error generating dialog: ${error.message}`);
+            showNotice(`Error generating dialog: ${error.message}`, 'error');
+        }
+        finally {
+            generateDialogBtn.disabled = false;
+            generateDialogBtn.textContent = prevText;
         }
     });
 }
@@ -554,7 +774,7 @@ async function initializeDialogEditor() {
     // --- Save/Load Script Functionality ---
     function saveDialogScript() {
         if (dialogItems.length === 0) {
-            alert('No dialog items to save. Please add some speech or silence lines first.');
+            showNotice('No dialog items to save. Please add some speech or silence lines first.', 'warning', { timeout: 4000 });
             return;
         }

@@ -599,11 +819,12 @@ async function initializeDialogEditor() {
         URL.revokeObjectURL(url);

         console.log(`Dialog script saved as $(unknown)`);
+        showNotice(`Dialog script saved as $(unknown)`, 'success', { timeout: 3000 });
     }

     function loadDialogScript(file) {
         if (!file) {
-            alert('Please select a file to load.');
+            showNotice('Please select a file to load.', 'warning', { timeout: 4000 });
             return;
         }

@ -626,19 +847,19 @@ async function initializeDialogEditor() {
|
||||||
}
|
}
|
||||||
} catch (parseError) {
|
} catch (parseError) {
|
||||||
console.error(`Error parsing line ${i + 1}:`, parseError);
|
console.error(`Error parsing line ${i + 1}:`, parseError);
|
||||||
alert(`Error parsing line ${i + 1}: ${parseError.message}`);
|
showNotice(`Error parsing line ${i + 1}: ${parseError.message}`, 'error');
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if (loadedItems.length === 0) {
|
if (loadedItems.length === 0) {
|
||||||
alert('No valid dialog items found in the file.');
|
showNotice('No valid dialog items found in the file.', 'warning', { timeout: 4000 });
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
// Confirm replacement if existing items
|
// Confirm replacement if existing items
|
||||||
if (dialogItems.length > 0) {
|
if (dialogItems.length > 0) {
|
||||||
const confirmed = confirm(
|
const confirmed = await confirmAction(
|
||||||
`This will replace your current dialog (${dialogItems.length} items) with the loaded script (${loadedItems.length} items). Continue?`
|
`This will replace your current dialog (${dialogItems.length} items) with the loaded script (${loadedItems.length} items). Continue?`
|
||||||
);
|
);
|
||||||
if (!confirmed) return;
|
if (!confirmed) return;
|
||||||
|
@ -650,30 +871,97 @@ async function initializeDialogEditor() {
|
||||||
availableSpeakersCache = await getSpeakers();
|
availableSpeakersCache = await getSpeakers();
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
console.error('Error fetching speakers:', error);
|
console.error('Error fetching speakers:', error);
|
||||||
alert('Could not load speakers. Dialog loaded but speaker names may not display correctly.');
|
showNotice('Could not load speakers. Dialog loaded but speaker names may not display correctly.', 'warning', { timeout: 5000 });
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
// Replace current dialog
|
// Replace current dialog
|
||||||
dialogItems.splice(0, dialogItems.length, ...loadedItems);
|
dialogItems.splice(0, dialogItems.length, ...loadedItems);
|
||||||
|
// Persist loaded script
|
||||||
|
saveDialogToLocalStorage();
|
||||||
renderDialogItems();
|
renderDialogItems();
|
||||||
|
|
||||||
console.log(`Loaded ${loadedItems.length} dialog items from script`);
|
console.log(`Loaded ${loadedItems.length} dialog items from script`);
|
||||||
alert(`Successfully loaded ${loadedItems.length} dialog items.`);
|
showNotice(`Successfully loaded ${loadedItems.length} dialog items.`, 'success', { timeout: 3000 });
|
||||||
|
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
console.error('Error loading dialog script:', error);
|
console.error('Error loading dialog script:', error);
|
||||||
alert(`Error loading dialog script: ${error.message}`);
|
showNotice(`Error loading dialog script: ${error.message}`, 'error');
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
reader.onerror = function() {
|
reader.onerror = function() {
|
||||||
alert('Error reading file. Please try again.');
|
showNotice('Error reading file. Please try again.', 'error');
|
||||||
};
|
};
|
||||||
|
|
||||||
reader.readAsText(file);
|
reader.readAsText(file);
|
||||||
}
|
}
|
||||||
|
|
||||||
// Load dialog script from pasted JSONL text
async function loadDialogScriptFromText(text) {
  if (!text || !text.trim()) {
    showNotice('Please paste JSONL content to load.', 'warning', { timeout: 4000 });
    return false;
  }
  try {
    const lines = text.trim().split('\n');
    const loadedItems = [];

    for (let i = 0; i < lines.length; i++) {
      const line = lines[i].trim();
      if (!line) continue; // Skip empty lines
      try {
        const item = JSON.parse(line);
        const validatedItem = validateDialogItem(item, i + 1);
        if (validatedItem) {
          loadedItems.push(normalizeDialogItem(validatedItem));
        }
      } catch (parseError) {
        console.error(`Error parsing line ${i + 1}:`, parseError);
        showNotice(`Error parsing line ${i + 1}: ${parseError.message}`, 'error');
        return false;
      }
    }

    if (loadedItems.length === 0) {
      showNotice('No valid dialog items found in the pasted content.', 'warning', { timeout: 4000 });
      return false;
    }

    // Confirm replacement if existing items
    if (dialogItems.length > 0) {
      const confirmed = await confirmAction(
        `This will replace your current dialog (${dialogItems.length} items) with the pasted script (${loadedItems.length} items). Continue?`
      );
      if (!confirmed) return false;
    }

    // Ensure speakers are loaded before rendering
    if (availableSpeakersCache.length === 0) {
      try {
        availableSpeakersCache = await getSpeakers();
      } catch (error) {
        console.error('Error fetching speakers:', error);
        showNotice('Could not load speakers. Dialog loaded but speaker names may not display correctly.', 'warning', { timeout: 5000 });
      }
    }

    // Replace current dialog
    dialogItems.splice(0, dialogItems.length, ...loadedItems);
    // Persist loaded script
    saveDialogToLocalStorage();
    renderDialogItems();

    console.log(`Loaded ${loadedItems.length} dialog items from pasted text`);
    showNotice(`Successfully loaded ${loadedItems.length} dialog items.`, 'success', { timeout: 3000 });
    return true;
  } catch (error) {
    console.error('Error loading dialog script from text:', error);
    showNotice(`Error loading dialog script: ${error.message}`, 'error');
    return false;
  }
}
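The loader above parses JSONL one line at a time so a single malformed line can be reported with its 1-based line number instead of aborting with an opaque error. A minimal standalone sketch of just that parsing step (`parseJsonlLines` is a hypothetical name for illustration; the original's `validateDialogItem`/`normalizeDialogItem` hooks are omitted):

```javascript
// Minimal JSONL parsing sketch: returns { items } on success, or
// { error, line } identifying the first malformed line (1-based).
function parseJsonlLines(text) {
  const items = [];
  const lines = text.trim().split('\n');
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    if (!line) continue; // skip blank lines, as the loader does
    try {
      items.push(JSON.parse(line));
    } catch (e) {
      return { error: e.message, line: i + 1 };
    }
  }
  return { items };
}
```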
function validateDialogItem(item, lineNumber) {
  if (!item || typeof item !== 'object') {
    throw new Error(`Line ${lineNumber}: Invalid item format`);
@@ -729,12 +1017,75 @@ async function initializeDialogEditor() {
      const file = e.target.files[0];
      if (file) {
        loadDialogScript(file);
        // Reset input so same file can be loaded again
        e.target.value = '';
      }
    });
  }
  // --- Paste Script (JSONL) Modal Handlers ---
  if (pasteScriptBtn && pasteModal && pasteText && pasteLoadBtn && pasteCancelBtn && pasteCloseBtn) {
    let escHandler = null;
    const closePasteModal = () => {
      pasteModal.style.display = 'none';
      pasteLoadBtn.onclick = null;
      pasteCancelBtn.onclick = null;
      pasteCloseBtn.onclick = null;
      pasteModal.onclick = null;
      if (escHandler) {
        document.removeEventListener('keydown', escHandler);
        escHandler = null;
      }
    };
    const openPasteModal = () => {
      pasteText.value = '';
      pasteModal.style.display = 'flex';
      escHandler = (e) => { if (e.key === 'Escape') closePasteModal(); };
      document.addEventListener('keydown', escHandler);
      pasteModal.onclick = (e) => { if (e.target === pasteModal) closePasteModal(); };
      pasteCloseBtn.onclick = closePasteModal;
      pasteCancelBtn.onclick = closePasteModal;
      pasteLoadBtn.onclick = async () => {
        const ok = await loadDialogScriptFromText(pasteText.value);
        if (ok) closePasteModal();
      };
    };
    pasteScriptBtn.addEventListener('click', openPasteModal);
  }

  // --- Clear Dialog Button ---
  let clearDialogBtn = document.getElementById('clear-dialog-btn');
  if (!clearDialogBtn) {
    clearDialogBtn = document.createElement('button');
    clearDialogBtn.id = 'clear-dialog-btn';
    clearDialogBtn.textContent = 'Clear Dialog';
    // Insert next to Save/Load if possible
    const saveLoadContainer = saveScriptBtn ? saveScriptBtn.parentElement : null;
    if (saveLoadContainer) {
      saveLoadContainer.appendChild(clearDialogBtn);
    } else {
      // Fallback: append near the add buttons container
      const addBtnsContainer = addSpeechLineBtn ? addSpeechLineBtn.parentElement : null;
      if (addBtnsContainer) addBtnsContainer.appendChild(clearDialogBtn);
    }
  }

  if (clearDialogBtn) {
    clearDialogBtn.addEventListener('click', async () => {
      if (dialogItems.length === 0) {
        showNotice('Dialog is already empty.', 'info', { timeout: 2500 });
        return;
      }
      const ok = await confirmAction(`This will remove ${dialogItems.length} dialog item(s). Continue?`);
      if (!ok) return;
      // Clear any transient input UI
      if (typeof clearTempInputArea === 'function') clearTempInputArea();
      // Clear state and persistence
      dialogItems.splice(0, dialogItems.length);
      try { localStorage.removeItem(LS_KEY); } catch (e) { /* ignore */ }
      renderDialogItems();
      showNotice('Dialog cleared.', 'success', { timeout: 2500 });
    });
  }
  console.log('Dialog Editor Initialized');
  renderDialogItems(); // Initial render (empty)
@@ -781,6 +1132,8 @@ async function initializeDialogEditor() {
    dialogItems[index].audioUrl = null;

    closeModal();
    // Persist settings change
    saveDialogToLocalStorage();
    renderDialogItems(); // Re-render to reflect changes
    console.log('TTS settings updated for item:', dialogItems[index]);
  };
@@ -13,8 +13,15 @@ const getEnvVar = (name, defaultValue) => {
};

// API Configuration
// Default to the same hostname as the frontend, on port 8000 (override via VITE_API_BASE_URL*)
const _defaultHost = (typeof window !== 'undefined' && window.location?.hostname) || 'localhost';
const _defaultPort = getEnvVar('VITE_API_BASE_URL_PORT', '8000');
const _defaultBase = `http://${_defaultHost}:${_defaultPort}`;
export const API_BASE_URL = getEnvVar('VITE_API_BASE_URL', _defaultBase);
export const API_BASE_URL_WITH_PREFIX = getEnvVar(
  'VITE_API_BASE_URL_WITH_PREFIX',
  `${_defaultBase}/api`
);

// For file serving (same as API_BASE_URL since files are served from the same server)
export const API_BASE_URL_FOR_FILES = API_BASE_URL;
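The config change above prefers an explicit env value and otherwise derives the API base from the page's hostname, so the frontend works when served from a machine other than localhost. The fallback logic can be sketched in isolation like this (`deriveApiBase` is a hypothetical helper name for illustration, not part of the repo):

```javascript
// Sketch of the fallback: use the env-provided URL when set, otherwise
// build one from the current hostname (defaulting to localhost:8000).
function deriveApiBase(envValue, hostname, port = '8000') {
  if (envValue) return envValue;
  return `http://${hostname || 'localhost'}:${port}`;
}
```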
@@ -0,0 +1,31 @@
"""
Import helper module for Chatterbox UI.

This module provides a function to add the project root to the Python path,
which helps resolve import issues when running scripts from different locations.
"""

import sys
import os
from pathlib import Path


def setup_python_path():
    """
    Add the project root to the Python path.

    This allows imports to work correctly regardless of where the script is run from.
    """
    # Get the project root (parent of the directory containing this file)
    project_root = Path(__file__).resolve().parent

    # Add the project root to the Python path if it's not already there
    if str(project_root) not in sys.path:
        sys.path.insert(0, str(project_root))
        print(f"Added {project_root} to Python path")

    # Set environment variable for other modules to use
    os.environ["PROJECT_ROOT"] = str(project_root)

    return project_root


# Run setup when this module is imported
project_root = setup_python_path()
@@ -0,0 +1,9 @@
// jest.config.cjs
module.exports = {
  testEnvironment: 'node',
  transform: {
    '^.+\\.js$': 'babel-jest',
  },
  moduleFileExtensions: ['js', 'json'],
  roots: ['<rootDir>/frontend/tests', '<rootDir>'],
};
@@ -5,11 +5,13 @@
  "main": "index.js",
  "type": "module",
  "scripts": {
    "test": "jest",
    "test:frontend": "jest --config ./jest.config.cjs",
    "frontend:dev": "python3 frontend/start_dev_server.py"
  },
  "repository": {
    "type": "git",
    "url": "https://gitea.r8z.us/stwhite/chatterbox-ui.git"
  },
  "keywords": [],
  "author": "",
@@ -17,7 +19,7 @@
  "devDependencies": {
    "@babel/core": "^7.27.4",
    "@babel/preset-env": "^7.27.2",
    "babel-jest": "^29.7.0",
    "jest": "^29.7.0"
  }
}
@@ -0,0 +1,129 @@
He watched from the pickup, his feet dangling off of the end of the tailgate as he sipped a beer and swung his boots back and forth. He adjusted the Royals baseball cap and leaned back on his left hand, languid in the warm summer evening, the last bit of sun having disappeared just ten minutes ago, bringing surcease from the ridiculous August heat in Missouri. The high, thin clouds were that beautiful shade of salmon that made this the end of the “Golden Hour”. His black tank top was damp from sweat but his jeans were clean and he still smelled good.
The girl got out of a blue Prius that looked black under the flickering yellow pall of a high-pressure sodium light in the next row of the parking lot. She had glossy, dark hair that fell in waves to her shoulders that were bared by a ribbed white tank top, its hems decorated with lace, baring her gold belly-button-ring. She wore cutoff jeans shorts - they just missed the “daisy” appellation, but were short enough, with a frill of loose cotton threads neatly trimmed into a small white fringe at the bottom of each cut-off leg - and brown sandals with wedge heels and leather laces up her ankles to her calves. A tiny brown leather clutch swung from a strap across her body, and she tucked her car keys in it, snapped it closed, and let it fall to her side. She was very lightly tanned, much paler than most he’d seen around here. Her body was shapely; no bra, full breasts, narrow waist. Supple curves arced from her hips to her thighs and toned calves. Her fingers seemed both slender and strong, and she managed to look both muscular and soft. She started off towards the entrance to the fair at a brisk pace, her footfalls light, but she jiggled and bounced in interesting - and likely, intentional - ways.
He admired the way the muscles of her calves flexed to maintain her balance. She was breathtaking.
He hopped off of the tailgate, his boots crunching in the gravel, then closed it, silently, and began to follow her to the gate. He knew she’d turn to the right after going through the gate; he wasn’t worried he would lose her. He didn’t know how he knew - he could just tell. Hell, even if she didn’t, it was a big county fair, but not so big he couldn’t find one pretty little brunette. As he walked, hips loose, boots crunching in the gravel, fifty feet back, he wondered if she was meeting anyone. He was far enough away that she didn’t even glance back. He knew from experience if he broke into a run she’d look back with the instinctive fear of a prey animal - a lovely, country-flavored gazelle or white tail on her cloven hooves of braided leather and cork. He didn’t want her to know she was being pursued - not yet. That wasn’t the game.
The fresh gray gravel was still hot from the sun, radiating heat into his feet and the air above it, along with the scent of dust and rock. It was mounded in the center, pressed down in the places where car tires rolled over it. Grasses pushed up to the edge of the gravel parking lot of the county fair. A slight breeze brought the distant smells of funnel cakes, hotdogs, and cotton candy. The lights of the fair were visible beyond the surrounding fence. He felt sweat in the small of his back and on his upper lip. Night was coming, though, and the temperature would continue to drop.
She reached the ticket ‘booth’ - which was a folding card table flanked by bales of straw, manned by a fat, middle-aged, bleach-blonde woman and what he presumed was her fat, bored offspring. Mouth breathers, he observed. He could smell them from here, stale sweat, cigarette smoke, cheap cologne. He heard the girl’s voice, a rich, lilting contralto that made him feel like salivating. “Just one, please.”
“Adult?” asked the bored woman, not even looking up, just staring at the roll of tickets, the money box, the electronic payment device. The girl laughed and it rang through him like a bell, inflaming a hunger he knew well. “Yes, please.” she replied, waving her phone at the point-of-sale payment device. It chimed and the woman handed her a ticket and a bright orange wristband, then waved her on. “Have fun!” she called after the girl, in a voice so empty of enthusiasm it seemed to suck happiness from the very air around it. She had a KC Chiefs tee shirt and black jeans stretched to their tensile limit. He assumed she had some boots on as well, a rural affectation common in this county where there was more forest than cattle.
When he arrived at the table, she regarded him with dead, watery faded blue eyes. “Adult?”
“Yep.” He didn’t even bother to laugh.
“Ten bucks.” She picked up the ticket and wrist band and waited expectantly. He pulled a ten from his pocket and laid it on the table in front of her. She took it and handed it to the kid, who was wearing a tee-shirt emblazoned with the words “Let’s go Brandon”. “Have fun.” she repeated, just as enthusiastically as before, tucking a damp lock of chemically altered hair behind her left ear. He grunted noncommittally and strolled after his gazelle, who’d gone out of sight - to his right, of course. Towards the paved area of the fair, where carnival rides and games blasted forth a cacophony of light and noise into the hot midwestern night, the smell of hot dogs, popcorn, and cotton candy vying for the attention of the press of fairgoers in their cowboy boots, jeans, and short skirts. Tee shirts here and there, and sometimes a pair of overalls, but it could have been a uniform. The people here were largely overweight, trending to dangerously obese, massive instances of humanity that lumbered, stomped, or waddled from game to game and food cart to food cart. He watched a dark-haired, overall-clad man who was at least six foot six and had to weigh 400 lbs on the hoof consume an enormous hot dog as though it were a light snack, in three quick bites, grease, mustard, and cheese running down his hand. He licked cheese from the back of his hand and wiped his hands on his capacious pants legs. He had a handgun in a high hip holster. Open carry was in evidence everywhere, peppered across demographics, from shapely young women with Glocks to octogenarians sporting well-worn 1911s and white flat-tops. It was Missouri, after all. He didn’t need or want a gun, and it wouldn’t do them any good if he turned his attention to them.
Cops were scattered through the fairground. Some were clearly private security, others might have been local police, sheriffs, or even highway patrol, for all he knew. There were at least four uniforms represented. Cops didn’t concern him. He didn’t look dangerous or threatening, and none looked at him directly, no scanning eyes paused on him or tracked his progress across the straw-strewn asphalt. It could get inconvenient if police became involved, of course, but he didn’t worry much.
He’d gotten distracted, and was surprised as he nearly ran into his gazelle as she came around the end of a food cart, and he stopped suddenly to avoid bowling her over. She smiled and said “Excuse me!” and kept walking. “No problem!” he called after her, grinning at her back. It was good, he thought; interaction was key to breaking the ice later. Folks often walk the same direction through an exhibition like a fair, so being in the same general area over time wasn’t unusual and she’d never know he was stalking her. They never figured it out, not before he wanted them to figure it out. He was an attractive, friendly looking man with an open, disarming smile, medium brown hair, a strong, muscular body, capable, competent, without being threatening. He was tall, but not surprisingly so - six feet nothing, maybe a hair more in his boots. A hundred and eighty pounds on most days, no belly but not sporting a sharp six pack either. Women found him attractive but not threatening, which was his intention. His eyes were blue and he had a well-trimmed mustache and the slightest hint of stubble. He watched her without looking at her, noting that she was alone, but kept checking her phone, occasionally texting someone. If her friends didn’t show up it would make it easier for him to get her attention, to draw her in.
He floated near her, just exploring the fair in the same sequence, seemingly by chance. He paid $3 to play a game of chess against a fellow who was playing 12 people simultaneously. Overhead light from an LED lamp on a pole lit a rectangle of narrow tables, four chess boards on each. The man playing chess was dressed like one might imagine Sherlock Holmes, with a pipe clamped in his teeth. Sherlock walked clockwise around the rectangle making a move on each board as he came to it. He crushed the “chess master” in eighteen moves and moved on before the man could comment. He threw darts at balloons while watching her from the corner of his eye as she tried to ring a bell by swinging a hammer. He saw her check her phone again and look exasperated, her full lips pursing in frustration at something she read on the screen. She shrugged and looked around, almost catching him staring. Her eyes roamed the area and paused for the tiniest second on his profile, then swept along to take in the rest of the area. He strolled slowly to the next attraction, which was a booth where one could pay $5 to throw three hatchets at targets for prizes. There was a roof held up by four-by-fours spaced every five or six feet; each pair of four-by-fours made a lane for throwing axes, and there was a big target at the end of each lane, maybe twenty feet away. Five lanes, the ubiquitous straw strewn over the asphalt - to give that barnyard feel, he thought. He stepped up and handed the barker a twenty. The man was cajoling onlookers, almost chanting, about trying your luck and winning prizes throwing the axes, and his voice never faltered. He had a belt pouch that contained change, and was wearing worn jeans, worn athletic shoes, and a worn tee-shirt from a rock concert of a band long forgotten in this day and age. Belt Pouch put three axes in the basket next to the lane opening, put three fives on the small change shelf and stepped aside, making the twenty vanish into the pouch.
He picked up the first ax and measured its weight in his hand. He judged the distance and tossed the ax overhand in a smooth gesture. It struck head-first with a loud thump and fell to the ground, the head clanging against the asphalt. He picked up the next ax and tossed it without any theatrics and it stuck solid, outside the bullseye. He flipped the third after it almost nonchalantly and it stuck next to its sibling, this time at the edge of the red circle. Belt Pouch paused for an instant, and retrieved the thrown axes, offering them to him, and he accepted with a nod. The carnie’s patter changed, saying something about watching an expert at work. He tossed all three, rapidly, one after the other, and they lined up on the bullseye, separated by a hair’s breadth. The carnie laughed, and he heard a low whistle. A breeze swirled some loose bits of straw and cooled the light sweat on his back.
“Impressive.” she said, her voice rich and beautifully textured.
He shrugged. The carnie gathered the axes and offered them to him again. He nodded, not paying any attention to the man. “Wanna try it?” he asked the gazelle.
Her eyes were ice blue - he had expected them to be brown! - and long, dark lashes veiled them when she blinked. Her makeup was understated, but perfect - a dash of color and shadow. She cocked her head to one side, evaluating him, her lips curving slightly at the corners, the smile staying mostly in her eyes. She seemed to come to a decision and shrugged, then nodded. “Sure, why not? You make it look pretty easy!” She stepped up next to him and he yielded some space to allow her the center of the throwing lane. A couple of men in jeans and cowboy boots had stopped to watch, idly glancing from the target to him, then to her, their thumbs hooked in their belt loops. Their eyes lingered carefully on her, he could see, and they missed nothing. But she was his now. They would know better. The same way a jackal knew that the lion’s food was not for him.
She held out her hand and said, “Kim.” He took it, smooth and warm, and nodded. “Dave.” It wasn’t his name. Hell, hers probably wasn’t “Kim”. He knew how this sort of thing went. If he’d been a normal man, at the end of the night she’d have written a fake phone number on his palm and made him promise to call. “Nice to meet you, Dave.” She smiled a little and held out a hand. He passed her one of the hatchets and she bounced it in her hand, holding it like a hammer. “Heavier than it looks!” she observed.
“Have you done this before?”
She shook her head. “Is there a trick to it?”
“Isn’t there always?”
She laughed and shrugged, then concentrated. She drew back, holding it more like he had, concentrating with a small frown, and smoothly flung it down-range. It struck handle-first and fell to the floor. Boom-clang. “Shit.”
“It’s your first try! Don’t be so hard on yourself.” he said, offering her the second worn ax, handle first. She took it and grinned. He glanced around, noting that at least four men were watching her carefully now, along with Belt Pouch, who’d resumed his half-hearted patter about trying your luck and winning prizes, but was watching the couple with interest. Another breeze stirred some loose straw and made her hair flutter a bit as she turned and set her feet. She scuffed a foot on the asphalt. “I’d probably do better with sneakers or boots. High heels aren’t really ideal for this sort of thing, I bet.” Concentrating, she drew back the ax, holding it almost exactly as he had, and smoothly tossing it downrange, where it stuck. Not in the bullseye, but on the target.
“Not too shabby!” He nodded approvingly, offering her the last ax. She flashed him a grin and took it, shifting her stance and her grip, then in one smooth motion, the ax sailed smoothly to the target and stuck, on the very edge of the red circle, just outside the bullseye. “Nice!” he said, grinning.
“I guess you made it look too easy.” She leaned against the 4x4, looking at him speculatively. “Win something for me.” She grinned, white teeth with the slightest hint of irregularity shining in the LED light. “A teddy bear, or a beer hat, or, you know, something fair-appropriate. You can do it, right?”
He paused for a moment, regarding her. “Perhaps.” He glanced at the carnie and jerked his head, and the carnie correctly interpreted the motion and retrieved the axes and picked up the last five. “What can I win?”
“You’ve already got some points racked up, so one bullseye will get you anything on this shelf.” He indicated a shelf littered with various sorts of toys, stuffed animals, lighters, and the like.
“What about three bullseyes?”
“That’s this shelf.” There was nothing obviously different about the two shelves except the “points” on the label, and the fact that it was the highest one, but he nodded. He turned back downrange and tossed all three in a smooth, mechanical sequence, and they once again lined up on the bullseye, thunk-thunk-thunk. The carnie looked at him, his gaze unreadable, and pointed at the highest shelf. “What can I get you?”
‘Dave’ glanced at the gazelle. “Kim? Choose your prize.” He grinned.
Her eyes flashed a grin in return and she stepped up to the rail, pointing. “That, right there.” It wasn’t a teddy bear. It was a cheap ripoff of a Zippo lighter with a praying mantis enameled onto the front in green, yellow, and black. The carnie shrugged and plucked it from the shelf and deposited it in her hand. She weighed it in her palm and flicked it open and closed a few times.
“It won’t work.” the carnie said. “No fluid in it. You’ll have to load it up when you get home.”
She nodded and turned back to ‘Dave’. “Wanna get a beer?”
He nodded. “Sure. Just one, though. I’m driving.” Together they threaded through the crowd to a place that had beer signs on posts. He noted the eyes of strangers on her as they made their way, and he grinned to himself. There was lust and jealousy and frustration in the eyes of the men. She really was quite attractive. A couple of women looked irritated, the way women sometimes do when a beautiful woman draws the attention of a man they feel belongs to them.
The “bar” was a roped off area set with high bar tables and stools, looking over a broad bit of straw-strewn ground where someone had erected a mechanical bull. It was surrounded with layers of foam pads a couple of inches thick, laid out so that the drunks tossed from the bull’s back wouldn’t end up traumatized in the emergency room, or worse. A couple of huge, slowly turning fans created a constant moderate breeze that felt good in the humid night air. Her hair fluttered as she hooked a foot into a stool and swung up onto the stool, to put her elbows down on the round tabletop, which was a mosaic of beer bottle caps entombed in some scuffed, clear plastic resin. Napkins, ketchup, mustard, and other condiments inhabited a little rack, along with salt and pepper packets. A waitress materialized at his elbow and mumbled something that ended in “... getcha?” He could smell a fryer and the aromas of bar food. Hot wings, french fries, hamburgers, nachos. He wasn’t interested in that sort of thing, though.
Kim glanced at the beer menu clipped in a metal ring on the condiment carrier and tapped one - a mass market IPA. He held up two fingers, the waitress said, “Got it” and turned away. He hadn’t wanted any food but was momentarily irritated that the mousy, pale woman hadn’t asked him or his date. Kim grinned at him as though she could read his thoughts. One manicured finger tapped the table top and she cocked her head to one side again. “So, ‘Dave’, what do you do?”
He crossed his arms and met her gaze. What was the right answer for this one? Hard working laborer, or executive out to play? Salesman, computer nerd, actor? “Guess” he finally said. “What do you think I do?”
“Go to county fairs to meet women.” Her reply was immediate, as though she’d known what he was going to say. “Professional ax thrower. Maybe you’re secretly a carnie on a night off?”
She wore a tiny cross on a chain and a pair of stud earrings that were just bright golden spheres against her earlobes. He decided she wasn’t the sort to see herself as a gold digger and shrugged. “I work in a warehouse. Drive a forklift.”
“A workin’ man, eh? Union? I hear forklift driving is a decent gig if it's a union job.”
“Decent enough.” He shrugged. “Paid for my truck, keeps me in meals. It isn’t for everyone, but I like it.” He shifted on the stool. “You?”
Just then the waitress returned with two bottles on a tray. “Ten dollars.” He took the bottles and dropped a ten and a five on the tray and she vanished without a word. Kim took a sip from the condensation-shrouded bottle and said “I do books. Taxes, accounting, that sort of thing. Got a four year degree in accounting and left for the big city - Lebanon, Missouri. It pays the bills.”
“Ever think about getting out?” he asked. “Heading for the big city? New York, Paris, you know. Bright lights and parties?” It was the question every rural and small town dweller asked themselves at some point. Cities were too dangerous for his kind, of course, but he knew how these people thought.
“Nah, not much. Nothing there for me. I have friends and family here.”
“That why you’re here alone?”
“My sister’s car broke down. Shit happens. And I’m not alone, right?”
He shrugged, nodded, then took a sip of cold, bitter, hoppy beer. “Why’d you pick that thing?” he asked, suddenly, pointing at the cheap zippo ripoff.
She shrugged. “I’ve just always loved praying mantises. They seem intelligent. They turn their head to watch you, and sometimes they’ll dance with you.” She turned the lighter so the mantis was up, and opened the top. “They’re related to walking sticks. There’s one in Indonesia that looks like an orchid; it’s evolved to pretend to be an orchid until the food gets close to what it sees as a flower. It’s gorgeous, the same pastel colors as the orchids it sits on, all pinks and blues and purples.” She shut the lighter suddenly. “Then snap, the mantis moves like lightning and … dinner!”
“Sounds dangerous!” He grinned and tossed back most of the beer. He could feel her relaxing, the darkness that drove him a burning hunger in his chest. His skin felt like it was rippling with electricity and he could smell her, delicate, rich, delicious. For a moment he saw a glowing outline around her as his hunger grew. He set the bottle back down and waved away the waitress as she stepped forward to see if he wanted more. Kim took a long pull on hers and tossed the almost empty bottle into the trash bin a few feet away.
“Let’s go wander around, see what there is to see.” She slid off the stool and stretched fetchingly, her tiny purse bouncing against her trim belly. They slipped out of the roped-off ‘bar’ area into the crowd. They watched a few drunks get dumped off the mechanical bull and laughed. She refused his challenge to get on it, and he refused hers in turn. They wandered through the fair, watching people and talking. She put her hand on his arm and pointed. “Let’s go get our fortunes read!” There was a squared-off trailer with a sign that said “Tarot, fortunes told, palms read, loved ones contacted”. It was painted lots of colors and there was a small sign over the door that said “Entrance”. There were the ubiquitous straw bales delineating a small courtyard with eight chairs - all empty except one, inhabited by a tall, slender woman with frosted blond hair and dark eyes. She was cajoling passersby with promises of answers about love, life, and the future. When they turned into the tiny “courtyard” of hay bales, the woman stepped in front of them, holding up her hands. She shook her head. “This is not for you,” she said, eyes hooded and giving up nothing. “We don’t need your money.” He thought for a moment he saw a glow in her eyes, and there were definitely faint glowing outlines around the door.
“What was that all about?” Kim asked, looking back over her shoulder, her voice betraying some mild irritation. “This is not for you!” she mimicked the woman’s voice derisively. “What did I do? Do you know them? I’ve never seen them before.”
He shook his head. “I don’t know them.” But he did. He knew her kind. The darkness inside him gave her a name. Witch. But Kim wouldn’t find that amusing if spoken aloud. He glanced back and saw the woman making a hand gesture at their backs, her thumb clamped between her index and middle finger. It couldn’t hurt him, but witches had been known to have … helpers; helpers who could hurt him.
“Eh, it’s just as well,” she said. “It’s getting late, I’m tired, and I should probably head home.”
“So soon?” He let disappointment creep into his voice. “Can I see you again?” It was the game, and he played it well.
She smiled and shook her head. “This wasn’t that kind of date, ‘Dave’. You know it, I know it. I won’t even write a fake number on your hand and implore you to call me sometime.”
He looked at her, a bit surprised. “You too good for a forklift driver?”
Her eyebrow raised and her blue eyes sparkled. “No, of course not. I just have a policy about men I meet alone at the fair. I know why you came here alone.”
He shrugged. It wouldn’t matter anyway. He’d catch her in the parking lot and it wouldn’t make any difference at all. She couldn’t get away; he’d already chosen her. He would just miss that delicious moment where the prey, having surrendered her trust, would suddenly recognize the error she had made and comfort would turn to terror, her heart leaping in fear and hammering against her ribs, her eyes going wide and her breasts heaving, nipples erect with fear and adrenaline as he forced her down with hands too strong for his size and build. This one would already be frightened when he got his hands on her. It would still be delicious, though. She might survive. Some did, empty husks, devoid of everything that makes life rich and beautiful, empty of life, of love, soulless in a sense. Most did not survive, giving up the spark of life along with the flame that he took.
She must have seen something in his eyes, then, because she looked a little uncomfortable. She waved her hand at him and started towards the gate, walking quickly without looking back. He could feel the tension in her body, the fear. She was already telling herself she was being ridiculous though, telling herself that he wasn’t a danger to her. She turned the corner around a food cart onto one of the fair’s rows of games and shops and walked out of sight, carefully not looking back. He knew that she’d glance over her shoulder as soon as she thought she was out of sight. He moved, quickly, but not running, not drawing undue attention. He slipped between a couple of trailers and stepped over the mobile rail that marked off the fair from the fields around it and moved out of the light. Then he ran, his feet light, his heart beating, the thrill of the hunt coursing through his veins and the darkness within him crying out a wordless “YES”.
He rounded a large red shipping container that marked the edge of the parking lot and slipped in between the rows of trucks and SUVs. There weren’t many people there, but there was Kim, walking from the gate and almost trotting towards her Prius, glancing back over her shoulder furtively. He took a deep breath and could smell her rich scent, now tinged with fear and exertion, making his skin tingle and buzz with energy. He ducked low and paced along silently, just behind the row of cars where her Prius sat waiting. She gained the car and he was mere feet from her when he stood and said “Hi.”
She yelped, a sharp, bright sound, and bolted, sprinting between the cars and out towards the open field and the woods a hundred feet beyond. He laughed and didn’t even care that two cops had heard her and were running after him. He trotted lightly after her, wanting her to make it to the trees before he caught her, but the cops were faster than her and were gaining on him. “On the ground!” one shouted, and drew a taser. ‘Dave’ juked to one side, turned suddenly, faster than humanly possible, and drove a rigid hand into the neck of the pursuing cop, crushing his trachea and driving a shockwave into his spine. The cop was unconscious before he hit the ground, and ‘Dave’ was ducking and rolling toward the other cop who hadn’t quite realized what had happened. The second cop got his gun out but ‘Dave’ had his hand on it before it cleared the holster, and he stripped it away, taking some of the cop’s hand with it and silencing the man’s sudden shout of pain with another vicious blow to the throat, the butt of the pistol crushing through cartilage and driving a vertebra so far out of alignment with the rest of his spine it severed the cord, the magic string, and he fell to the ground like roast and potatoes spilled from a platter. The world fell silent again except for the sound of her running feet, getting close to the trees.
His blood was singing and the darkness in him filled him to bursting, rendering the night in sharp relief, enabling him to see in this blackness as well as he could during the day. He could see her in the trees, a glowing body of beauty and heat and life, scrambling between the trees and trying to put distance between them. He moved silently, but fast, too fast for a human, for he was not merely human, not at all. He was a predator, a hunter, and she was his meat. He was not a vampire, nor an incubus, but those legends might have originated with tales of creatures like him, creatures of darkness and stealth that lived on the delicious life of the prey they had hunted through the ages.
He drew even with her, silent in the darkness, and he could see her as though it were noon. Her eyes were wide and staring, rolling back and forth - he knew she couldn’t see him at all. He stepped down hard to break a twig and she froze at the sudden snap, staring around, trying to keep from breathing too loudly. She crept forward, trying to be quiet, trying to escape, without knowing it was already far too late. He stepped close to her and touched her neck with a gentle finger. Her entire body spasmed and she made a quiet, breathless whimpering sound, lunging away from his touch. He could see her trying to produce the scream trapped in her mind, but terror stole her breath and all that escaped was a croaking sound. He stepped close and ripped her tank top from her in a single move, exposing her body to his vision and his alone. She covered her breasts and whimpered, backing away from where she thought he was. He took two steps and ran his hand down her torso, gently, caressing, and she thrashed again and let out a little shout. He grabbed her by the throat, lifted her, and slammed her to the ground, driving the air from her lungs, and lay on her, his face close to her ear. “No screaming!” he said, quietly, and she turned her head away and tried to push him away, ineffectual and weak. He held her down by the slender throat and clawed her shorts off with the other hand and she sobbed, trying to cover herself. He grabbed one of her wrists in each hand and spread them as far apart as he could, forcing his knees between her legs and pressing her body down with his torso. She tried to buck but it didn’t matter, it didn’t move him. Not him. She was his prey, and he was here to consume her, not to be pushed away.
He looked into her wide, staring eyes, and thrust himself inside her. Or tried. Something was wrong - he’d missed some bit of clothing… He’d encountered something hard, like she’d been wearing some kind of chastity belt or … what the fuck? He transferred both of her wrists to his right hand and held them above her head and started to reach down to investigate, and at that moment her legs lifted and snapped around him, strong, hard, crushing him to her, his hips locked into place by legs he should have been able to push away easily but instead held him like iron bands, urging him closer. And the thing he’d mistaken for a chastity belt opened - he felt it, oh shit oh shit oh shit - and took in what he’d tried to thrust into her and bit into it with sharp teeth like hypodermic needles - he felt the loss and the rush of blood and release of pressure - and an immense, empty cold began to flood into him at that junction between them, a vacuum that sucked out of him everything he was or had ever been, and the darkness in him gibbered and capered in terror it had never before known. It was his turn to wrestle weakly and ineffectually, trying to break the deadly embrace. Her arms, suddenly as strong as hydraulic presses, pulled easily from his grasp and embraced him, pulling him close to her, pressing him to her body, once soft and supple, now hard and glossy. The coldness and emptiness grew in him, emptying him, and in the eldritch vision the darkness granted him he saw it: the dark, chitinous, triumphant, enormous body just on the other side of the veil, disguised in this world as a pale, soft, attractive girl… exactly the kind of girl he sought out, he hunted, he consumed. His thoughts spun, fear gripping him, his arms flailing uselessly as the emptiness consumed everything that was him. Then, at last, there was final darkness, and he felt himself evaporating into it, and was no more.
There was silence for a moment in the trees, and everything was still and quiet. Something stirred, something pale and slender. His body was tossed aside, empty now of everything important, and the girl stood, naked but for lace-up wedge-heeled sandals, her body soft and supple again. Her clothing reappeared over her flesh as though it were extruded from another place, and her makeup restored itself, the smears and streaks fading back into perfect order. She smoothed the ribbed tank top, now clean again and free of leaves or litter, ran a slender hand through her hair, and started back towards her car.
@@ -3,3 +3,4 @@ PyYAML>=6.0
 torch>=2.0.0
 torchaudio>=2.0.0
 numpy>=1.21.0
+chatterbox-tts

@@ -0,0 +1,21 @@
# The Importance of Text-to-Speech Technology

Text-to-speech (TTS) technology has become increasingly important in our digital world. It enables computers and other devices to convert written text into spoken words, making content more accessible to a wider audience.

## Applications of TTS

TTS has numerous applications across various fields. In education, it helps students with reading difficulties by allowing them to listen to text. For people with visual impairments, TTS serves as a crucial tool for accessing digital content.

Mobile devices use TTS for navigation instructions, allowing drivers to keep their eyes on the road. Voice assistants like Siri and Alexa rely on TTS to communicate with users, answering questions and providing information.

## Recent Advancements

Recent advancements in neural network-based TTS systems have dramatically improved the quality of synthesized speech. Modern TTS voices sound more natural and expressive than ever before, with proper intonation, rhythm, and emphasis.

Chatterbox TTS represents the cutting edge of this technology, offering highly realistic voice synthesis that can be customized for different speakers and styles. This makes it ideal for creating audiobooks, podcasts, and other spoken content with a personal touch.

## Future Directions

The future of TTS technology looks promising, with ongoing research focused on making synthesized voices even more natural and emotionally expressive. We can expect to see TTS systems that can adapt to different contexts, conveying appropriate emotions and speaking styles based on the content.

As TTS technology continues to evolve, it will play an increasingly important role in human-computer interaction, accessibility, and content consumption.

@@ -0,0 +1,123 @@
#Requires -Version 5.1
<#!
Chatterbox TTS - Windows setup script

What it does:
- Creates a Python virtual environment in .venv (if missing)
- Upgrades pip
- Installs dependencies from backend/requirements.txt and requirements.txt
- Creates a default .env with sensible ports if not present
- Launches start_servers.py using the venv's Python

Usage:
- Right-click this file and "Run with PowerShell" OR from PowerShell:
    ./setup-windows.ps1
- Optional flags:
    -NoInstall  -> Skip installing dependencies (just start servers)
    -NoStart    -> Prepare env but do not start servers

Notes:
- You may need to allow script execution once:
    Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
- Press Ctrl+C in the console to stop both servers.
!#>

param(
    [switch]$NoInstall,
    [switch]$NoStart
)

$ErrorActionPreference = 'Stop'

function Write-Info($msg) { Write-Host "[INFO] $msg" -ForegroundColor Cyan }
function Write-Ok($msg)   { Write-Host "[ OK ] $msg" -ForegroundColor Green }
function Write-Warn($msg) { Write-Host "[WARN] $msg" -ForegroundColor Yellow }
function Write-Err($msg)  { Write-Host "[FAIL] $msg" -ForegroundColor Red }

$root = Split-Path -Parent $MyInvocation.MyCommand.Path
Set-Location $root

$venvDir = Join-Path $root ".venv"
$venvPython = Join-Path $venvDir "Scripts/python.exe"

# 1) Ensure Python available
function Get-BasePython {
    try {
        $pyExe = (Get-Command py -ErrorAction SilentlyContinue)
        if ($pyExe) { return 'py -3' }
    } catch { }
    try {
        $pyExe = (Get-Command python -ErrorAction SilentlyContinue)
        if ($pyExe) { return 'python' }
    } catch { }
    throw "Python not found. Please install Python 3.x and add it to PATH."
}

# 2) Create venv if missing
if (-not (Test-Path $venvPython)) {
    Write-Info "Creating virtual environment in .venv"
    $basePy = Get-BasePython
    if ($basePy -eq 'py -3') {
        & py -3 -m venv .venv
    } else {
        & python -m venv .venv
    }
    Write-Ok "Virtual environment created"
} else {
    Write-Info "Using existing virtual environment: $venvDir"
}

if (-not (Test-Path $venvPython)) {
    throw ".venv python not found at $venvPython"
}

# 3) Install dependencies
if (-not $NoInstall) {
    Write-Info "Upgrading pip"
    & $venvPython -m pip install --upgrade pip

    # Backend requirements
    $backendReq = Join-Path $root 'backend/requirements.txt'
    if (Test-Path $backendReq) {
        Write-Info "Installing backend requirements"
        & $venvPython -m pip install -r $backendReq
    } else {
        Write-Warn "backend/requirements.txt not found"
    }

    # Root requirements (optional frontend / project libs)
    $rootReq = Join-Path $root 'requirements.txt'
    if (Test-Path $rootReq) {
        Write-Info "Installing root requirements"
        & $venvPython -m pip install -r $rootReq
    } else {
        Write-Warn "requirements.txt not found at repo root"
    }

    Write-Ok "Dependency installation complete"
}

# 4) Ensure .env exists with sensible defaults
$envPath = Join-Path $root '.env'
if (-not (Test-Path $envPath)) {
    Write-Info "Creating default .env"
    @(
        'BACKEND_PORT=8000',
        'BACKEND_HOST=127.0.0.1',
        'FRONTEND_PORT=8001',
        'FRONTEND_HOST=127.0.0.1'
    ) -join "`n" | Out-File -FilePath $envPath -Encoding utf8 -Force
    Write-Ok ".env created"
} else {
    Write-Info ".env already exists; leaving as-is"
}

# 5) Start servers
if ($NoStart) {
    Write-Info "-NoStart specified; setup complete. You can start later with:"
    Write-Host "  `"$venvPython`" `"$root\start_servers.py`"" -ForegroundColor Gray
    exit 0
}

Write-Info "Starting servers via start_servers.py"
& $venvPython "$root/start_servers.py"

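The default `.env` written by the setup script is plain `KEY=VALUE` lines, later consumed by `python-dotenv` in `start_servers.py`. A minimal stdlib-only sketch of how such a file is parsed (`load_env` is an illustrative helper, not the project's code; python-dotenv additionally handles quoting, export prefixes, and interpolation):

```python
import io

def load_env(text: str) -> dict:
    """Parse KEY=VALUE lines like those the setup script writes to .env."""
    env = {}
    for line in io.StringIO(text):
        line = line.strip()
        # Skip blanks and comments; split on the first '=' only.
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env

default_env = "BACKEND_PORT=8000\nBACKEND_HOST=127.0.0.1\nFRONTEND_PORT=8001\nFRONTEND_HOST=127.0.0.1\n"
print(load_env(default_env)["BACKEND_PORT"])  # 8000
```

Because the script only writes these defaults when `.env` is missing, an existing file always wins, matching the "leaving as-is" branch above.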
@@ -28,3 +28,9 @@ dd3552d9-f4e8-49ed-9892-f9e67afcf23c:
 2cdd6d3d-c533-44bf-a5f6-cc83bd089d32:
   name: Grace
   sample_path: speaker_samples/2cdd6d3d-c533-44bf-a5f6-cc83bd089d32.wav
+3d3e85db-3d67-4488-94b2-ffc189fbb287:
+  name: RCB
+  sample_path: speaker_samples/3d3e85db-3d67-4488-94b2-ffc189fbb287.wav
+f754cf35-892c-49b6-822a-f2e37246623b:
+  name: Jim
+  sample_path: speaker_samples/f754cf35-892c-49b6-822a-f2e37246623b.wav

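The speakers file is a plain YAML map keyed by speaker id, each entry carrying a display name and a reference sample path. A minimal sketch of loading it, assuming PyYAML (already listed in the backend requirements); `load_speakers` is a hypothetical helper, not the project's actual service API:

```python
import yaml  # PyYAML, as pinned in backend/requirements.txt

def load_speakers(yaml_text: str) -> dict:
    """Parse a speakers YAML map into {speaker_id: {"name": ..., "sample_path": ...}}."""
    # safe_load returns None for an empty document; normalize to an empty map.
    return yaml.safe_load(yaml_text) or {}

sample = """
2cdd6d3d-c533-44bf-a5f6-cc83bd089d32:
  name: Grace
  sample_path: speaker_samples/2cdd6d3d-c533-44bf-a5f6-cc83bd089d32.wav
"""
speakers = load_speakers(sample)
print(speakers["2cdd6d3d-c533-44bf-a5f6-cc83bd089d32"]["name"])  # Grace
```

Keying on UUIDs rather than names lets entries like Grace, RCB, and Jim be renamed without invalidating references held by stored dialogs.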
@@ -14,101 +14,109 @@ from pathlib import Path
 # Try to load environment variables, but don't fail if dotenv is not available
 try:
     from dotenv import load_dotenv
+
     load_dotenv()
 except ImportError:
     print("python-dotenv not installed, using system environment variables only")
 
 # Configuration
-BACKEND_PORT = int(os.getenv('BACKEND_PORT', '8000'))
-BACKEND_HOST = os.getenv('BACKEND_HOST', '0.0.0.0')
-FRONTEND_PORT = int(os.getenv('FRONTEND_PORT', '8001'))
-FRONTEND_HOST = os.getenv('FRONTEND_HOST', '127.0.0.1')
+BACKEND_PORT = int(os.getenv("BACKEND_PORT", "8000"))
+BACKEND_HOST = os.getenv("BACKEND_HOST", "0.0.0.0")
+# Frontend host/port (for dev server binding)
+FRONTEND_PORT = int(os.getenv("FRONTEND_PORT", "8001"))
+FRONTEND_HOST = os.getenv("FRONTEND_HOST", "0.0.0.0")
+
+# Export frontend host/port so backend CORS config can pick them up automatically
+os.environ["FRONTEND_HOST"] = FRONTEND_HOST
+os.environ["FRONTEND_PORT"] = str(FRONTEND_PORT)
 
 # Get project root directory
 PROJECT_ROOT = Path(__file__).parent.absolute()
 
 
 def run_backend():
     """Run the backend FastAPI server"""
     os.chdir(PROJECT_ROOT / "backend")
     cmd = [
-        sys.executable, "-m", "uvicorn",
-        "app.main:app",
-        "--reload",
-        f"--host={BACKEND_HOST}",
-        f"--port={BACKEND_PORT}"
+        sys.executable,
+        "-m",
+        "uvicorn",
+        "app.main:app",
+        "--reload",
+        f"--host={BACKEND_HOST}",
+        f"--port={BACKEND_PORT}",
     ]
 
     print(f"\n{'='*50}")
     print(f"Starting Backend Server at http://{BACKEND_HOST}:{BACKEND_PORT}")
     print(f"API docs available at http://{BACKEND_HOST}:{BACKEND_PORT}/docs")
     print(f"{'='*50}\n")
 
     return subprocess.Popen(
         cmd,
         stdout=subprocess.PIPE,
         stderr=subprocess.STDOUT,
         universal_newlines=True,
-        bufsize=1
+        bufsize=1,
     )
 
 
 def run_frontend():
     """Run the frontend development server"""
     frontend_dir = PROJECT_ROOT / "frontend"
     os.chdir(frontend_dir)
 
     cmd = [sys.executable, "start_dev_server.py"]
     env = os.environ.copy()
     env["VITE_DEV_SERVER_HOST"] = FRONTEND_HOST
     env["VITE_DEV_SERVER_PORT"] = str(FRONTEND_PORT)
 
     print(f"\n{'='*50}")
     print(f"Starting Frontend Server at http://{FRONTEND_HOST}:{FRONTEND_PORT}")
     print(f"{'='*50}\n")
 
     return subprocess.Popen(
         cmd,
         env=env,
         stdout=subprocess.PIPE,
         stderr=subprocess.STDOUT,
         universal_newlines=True,
-        bufsize=1
+        bufsize=1,
     )
 
 
 def print_process_output(process, prefix):
     """Print process output with a prefix"""
-    for line in iter(process.stdout.readline, ''):
+    for line in iter(process.stdout.readline, ""):
         if not line:
             break
-        print(f"{prefix} | {line}", end='')
+        print(f"{prefix} | {line}", end="")
 
 
 def main():
     """Main function to start both servers"""
     print("\n🚀 Starting Chatterbox UI Development Environment")
 
     # Start the backend server
     backend_process = run_backend()
 
     # Give the backend a moment to start
     time.sleep(2)
 
     # Start the frontend server
     frontend_process = run_frontend()
 
     # Create threads to monitor and print output
     backend_monitor = threading.Thread(
-        target=print_process_output,
-        args=(backend_process, "BACKEND"),
-        daemon=True
+        target=print_process_output, args=(backend_process, "BACKEND"), daemon=True
     )
     frontend_monitor = threading.Thread(
-        target=print_process_output,
-        args=(frontend_process, "FRONTEND"),
-        daemon=True
+        target=print_process_output, args=(frontend_process, "FRONTEND"), daemon=True
     )
 
     backend_monitor.start()
     frontend_monitor.start()
 
     # Setup signal handling for graceful shutdown
     def signal_handler(sig, frame):
         print("\n\n🛑 Shutting down servers...")

@@ -117,16 +125,16 @@ def main():
         # Threads are daemon, so they'll exit when the main thread exits
         print("✅ Servers stopped successfully")
         sys.exit(0)
 
     signal.signal(signal.SIGINT, signal_handler)
 
     # Print access information
     print("\n📋 Access Information:")
     print(f"   • Frontend: http://{FRONTEND_HOST}:{FRONTEND_PORT}")
     print(f"   • Backend API: http://{BACKEND_HOST}:{BACKEND_PORT}/api")
     print(f"   • API Documentation: http://{BACKEND_HOST}:{BACKEND_PORT}/docs")
     print("\n⚠️ Press Ctrl+C to stop both servers\n")
 
     # Keep the main process running
     try:
         while True:

@@ -134,5 +142,6 @@ def main():
     except KeyboardInterrupt:
         signal_handler(None, None)
 
+
 if __name__ == "__main__":
     main()
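The `start_servers.py` change exports `FRONTEND_HOST`/`FRONTEND_PORT` into the environment so the backend's CORS configuration can derive its allowed origins. A minimal stdlib-only sketch of that derivation (`allowed_origins` is illustrative, not the backend's actual code); note that a `0.0.0.0` bind address is not itself a usable browser origin:

```python
import os

def allowed_origins() -> list:
    """Build CORS origins from the env vars exported by start_servers.py."""
    host = os.getenv("FRONTEND_HOST", "127.0.0.1")
    port = os.getenv("FRONTEND_PORT", "8001")
    # 0.0.0.0 means "bind all interfaces"; browsers connect via localhost names instead.
    hosts = {"localhost", "127.0.0.1"} if host == "0.0.0.0" else {host}
    return sorted(f"http://{h}:{port}" for h in hosts)

os.environ["FRONTEND_HOST"] = "0.0.0.0"
os.environ["FRONTEND_PORT"] = "8001"
print(allowed_origins())  # ['http://127.0.0.1:8001', 'http://localhost:8001']
```

Exporting via `os.environ` before spawning the backend subprocess works because `subprocess.Popen` inherits the parent's environment by default.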