CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Development Commands

Quick Start (Recommended)

./run_dev.sh

This single script starts both the frontend (port 11111) and the backend (port 11110), then opens the browser automatically.

Manual Development Setup

Frontend (Svelte):

cd frontend
npm install
npm run dev          # Development server on :11111
npm run build        # Production build
npm run check        # Type checking

Backend (FastAPI):

cd backend
pip install -r requirements.txt
python -m hfstudio.cli --dev  # Development server on :11110
# OR
pip install -e .
hfstudio --dev

Port Configuration

  • Frontend: 11111 (configured in vite.config.js)
  • Backend: 11110 (configured in cli.py)
  • CORS is configured for these specific ports in server.py
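As a rough sketch of how a port-specific CORS allow-list could be derived, the helper below builds the origins for the frontend port. The function name `dev_origins` and the choice to list both `localhost` and `127.0.0.1` are assumptions for illustration and are not taken from server.py.

```python
def dev_origins(frontend_port: int = 11111) -> list[str]:
    """Return the browser origins the dev frontend may be served from."""
    # Browsers treat localhost and 127.0.0.1 as different origins, so this
    # sketch lists both. Whether server.py does the same is an assumption.
    hosts = ("localhost", "127.0.0.1")
    return [f"http://{host}:{frontend_port}" for host in hosts]
```

A list like this would typically be passed as `allow_origins` to the CORS middleware, so changing the frontend port means updating one value.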

Architecture Overview

Frontend Structure (SvelteKit + TailwindCSS)

  • Single Page App: Main interface in src/routes/+page.svelte
  • Layout: Global layout with sidebar in src/routes/+layout.svelte
  • Design System: HuggingFace brand colors (#FFD21E, #FF9D00) used throughout
  • State Management: Local component state with reactive variables
  • Audio Handling: Custom HTML5 audio element with manual progress tracking

Key UI Components

  • Three-panel layout: Sidebar (56 units) + Main content + Settings panel (80 units)
  • Fixed bottom button: Generate button positioned absolutely at page bottom
  • Mini audio player: Compact controls in generated audio card
  • Full audio player: Expanded controls with ElevenLabs-style design
  • Custom pause icon: CSS-only filled bars instead of outline

Backend Structure (FastAPI)

  • Main server: server.py with CORS configured for development ports
  • CLI interface: cli.py with typer for command-line control
  • Pydantic models: TTSRequest, TTSResponse, Voice, Model
  • Current implementation: Mock TTS generation using placeholder audio

API Endpoints

GET  /                  - Health check
GET  /api/status        - Mode and availability info
GET  /api/voices        - Available voice list
GET  /api/models        - Available model list
POST /api/tts/generate  - Generate speech from text
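For quick manual testing, the endpoints above can be exercised from a small standard-library client. This is only a sketch: the request field names (`text`, `voice_id`, `model_id`) are hypothetical, and it assumes the endpoints return JSON. Check the `TTSRequest` model in server.py for the real schema.

```python
import json
import urllib.request

API_BASE = "http://localhost:11110"  # dev backend port used in this document


def build_tts_request(text: str, voice_id: str = "novia",
                      model_id: str = "chatterbox",
                      base_url: str = API_BASE) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/tts/generate."""
    # Hypothetical field names; the real ones are defined by TTSRequest.
    payload = {"text": text, "voice_id": voice_id, "model_id": model_id}
    return urllib.request.Request(
        f"{base_url}/api/tts/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_json(path: str, base_url: str = API_BASE):
    """GET an endpoint such as /api/status, assuming a JSON response."""
    with urllib.request.urlopen(f"{base_url}{path}", timeout=5) as resp:
        return json.load(resp)
```

With the backend running, `get_json("/api/voices")` would list the voices, and passing the result of `build_tts_request("Hello")` to `urllib.request.urlopen` would trigger a generation.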

Design Patterns & Conventions

Frontend Patterns

  • HF Brand Integration: Uses official logo (/assets/hf-logo.png) and gradient colors
  • Responsive Controls: All sliders use custom .slider-hf class with HF colors
  • Audio State Management: Manual synchronization between UI state and HTML5 audio element
  • Progressive Enhancement: Settings always visible, no hidden toggles

Backend Patterns

  • Development Mode: Auto-reload enabled with --dev flag
  • Mock Implementation: Currently returns /samples/harvard.wav for testing
  • CORS Configuration: Explicitly configured for development ports
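The mock behaviour can be pictured with the minimal sketch below. It uses standard-library dataclasses as a stand-in for the Pydantic models so that it runs without dependencies. The field names and the empty-text check are assumptions for illustration; only the model names and the sample path come from this document.

```python
from dataclasses import dataclass

PLACEHOLDER_AUDIO = "/samples/harvard.wav"  # sample path named in this document


@dataclass
class TTSRequest:
    # Hypothetical fields; the real Pydantic model lives in server.py.
    text: str
    voice_id: str = "novia"
    model_id: str = "chatterbox"


@dataclass
class TTSResponse:
    # Hypothetical fields, chosen only for illustration.
    audio_url: str
    model_id: str
    mock: bool = True


def mock_generate(request: TTSRequest) -> TTSResponse:
    """Ignore the text and point at the bundled sample, as a mock would."""
    if not request.text.strip():
        # Assumed validation; the actual server may behave differently.
        raise ValueError("text must not be empty")
    return TTSResponse(audio_url=PLACEHOLDER_AUDIO, model_id=request.model_id)
```

Keeping the mock behind the same request and response shapes means a real TTS backend can later replace `mock_generate` without frontend changes.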

Styling Conventions

  • TailwindCSS: Primary styling framework
  • Custom CSS: Limited to audio sliders and pause icon in app.css
  • Color Scheme: Light theme with HF amber/orange accents
  • Typography: System fonts with careful spacing

Key Implementation Details

Audio System

The app uses a hidden HTML5 <audio> element controlled by custom UI:

  • Real audio playback through bound audioElement
  • Manual progress tracking via timeupdate events
  • Auto-play when generation completes
  • Custom pause icon using CSS pseudo-elements
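The arithmetic behind manual progress tracking is small. The sketch below shows it in Python for consistency with the backend examples, although the real logic lives in the Svelte component and runs on each timeupdate event. The function names are hypothetical.

```python
import math


def progress_percent(current_time: float, duration: float) -> float:
    """Map a playback position to a 0-100 value for a progress bar."""
    # An audio element reports NaN for duration until metadata has loaded,
    # so guard against non-finite and zero values before dividing.
    if not math.isfinite(duration) or duration <= 0:
        return 0.0
    return max(0.0, min(100.0, current_time / duration * 100.0))


def format_time(seconds: float) -> str:
    """Format a position in seconds as m:ss for a time label."""
    if not math.isfinite(seconds):
        return "0:00"
    total = max(0, int(seconds))
    return f"{total // 60}:{total % 60:02d}"
```

Clamping the result keeps the slider within bounds even if the reported position briefly overshoots the duration.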

Voice & Model Data

// Current models (in +page.svelte)
const models = [
  { id: 'chatterbox', name: 'Chatterbox', badge: 'recommended' },
  { id: 'kokoro', name: 'Kokoro', badge: 'faster but lower quality' }
];

// Current voices with descriptions
const voices = [
  { id: 'novia', name: 'Novia', description: 'Warm, conversational voice' },
  // ... etc
];

Configuration Files

  • Frontend env: .env with PUBLIC_API_URL=http://localhost:11110
  • Vite config: Custom port (11111) and host settings
  • TailwindCSS: Custom colors and slider styling
  • Backend requirements: FastAPI, numpy, soundfile for audio processing

Static Assets

  • Logo: HuggingFace logo at /assets/hf-logo.png
  • Sample audio: Harvard sample at /samples/harvard.wav for testing
  • Favicon: Uses HF logo

Development Workflow

Making UI Changes

  1. Edit components in frontend/src/routes/+page.svelte
  2. Layout changes go in frontend/src/routes/+layout.svelte
  3. Global styles go in frontend/src/app.css
  4. Hot reload shows changes instantly

Adding API Features

  1. Add Pydantic models in server.py
  2. Create new endpoints following existing patterns
  3. Update CORS origins if needed
  4. Test at http://localhost:11110/docs

Troubleshooting

  • Port conflicts: Change ports in vite.config.js and cli.py
  • CORS issues: Update allow_origins in server.py
  • Audio not playing: Check that the audio file exists at /samples/harvard.wav
  • Dependencies: Run npm install in frontend and pip install -r requirements.txt in backend
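When a port conflict is suspected, a quick check can confirm whether something is already listening before editing any configuration. The helper names below are hypothetical, and the sketch only tests TCP connectivity on the loopback interface.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def report(ports=(11111, 11110)) -> dict:
    """Map each dev port from this document to whether it is occupied."""
    return {port: port_in_use(port) for port in ports}
```

Calling `report()` before `./run_dev.sh` shows which of the two development ports, if any, is already taken.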