mirror of https://github.com/justinlime/Fatterbox.git synced 2025-12-11 17:02:46 -06:00

No description

Find a file

justinlime 50701aaea3 cleanup		2025-12-11 16:10:35 -06:00
src	cleanup	2025-12-11 16:10:35 -06:00
.gitignore	cleanup	2025-12-08 15:37:34 -06:00
docker_init.py	docker	2025-12-09 22:55:21 -06:00
Dockerfile	cleanup	2025-12-11 16:10:35 -06:00
main.py	cleanup	2025-12-11 16:10:35 -06:00
pyproject.toml	Voices Dir, OpenAPI fix	2025-12-10 12:52:46 -06:00
README.md	docker	2025-12-09 22:55:21 -06:00
uv.lock	Voices Dir, OpenAPI fix	2025-12-10 12:52:46 -06:00

README.md

Fatterbox TTS server

A lightweight, production-ready TTS server that supports OpenAI-compatible API and Wyoming Protocol for voice synthesis. Built for easy deployment with Docker and minimal configuration.

🐳 Docker Image

Use the pre-built Docker image from Docker Hub:

docker pull docker.io/justinlime/fatterbox:latest

This image includes all dependencies, optimized for GPU/CPU inference and ready to run.

🚀 Quick Start

Run the Container

docker run -p 5002:5002 -p 10200:10200 \
  -e HOST=0.0.0.0 \
  -e OPENAI_PORT=5002 \
  -e WYOMING_PORT=10200 \
  -e DEVICE=cuda \
  -e AUDIO_PROMPT=/audio/prompt.wav \
  -e LANGUAGE_ID=en \
  -e STREAM=true \
  -e EXAGGERATION=0.5 \
  -e TEMPERATURE=0.8 \
  -e MIN_P=0.05 \
  -e TOP_P=1.0 \
  -e REPETITION_PENALTY=1.2 \
  -e DTYPE=bfloat16 \
  -v /path/to/audio/prompt.wav:/audio/prompt.wav \
  --name chatterbox-tts \
  docker.io/justinlime/fatterbox:latest

Replace /path/to/audio/prompt.wav with the actual path to your audio prompt file.

🛠️ Environment Variables

Variable	Description	Default
`HOST`	Server IP	`0.0.0.0`
`OPENAI_PORT`	OpenAI API port	`5002`
`WYOMING_PORT`	Wyoming protocol port	`10200`
`DEBUG`	Enable debug logs	`false`
`DEVICE`	`cpu`, `cuda`, or `mps`	`cpu`
`STREAM`	Enable streaming	`true`
`LANGUAGE_ID`	Language ID	`en`
`AUDIO_PROMPT`	Path to audio prompt	(required)
`EXAGGERATION`	Voice expression	`0.5`
`TEMPERATURE`	Sampling temperature	`0.8`
`MIN_P`	Nucleus sampling	`0.05`
`TOP_P`	Top-p sampling	`1.0`
`REPETITION_PENALTY`	Repetition control	`1.2`
`DTYPE`	Model precision	`bfloat16`

🚀 Usage

OpenAI-Compatible API

curl -X POST http://localhost:5002/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Hello, this is a test.",
    "voice": "alloy",
    "response_format": "mp3"
  }'

Wyoming Protocol

Connect via:

wyoming-client --server tcp://localhost:10200 --text "Hello world"

🧩 Features

✅ OpenAI API compatibility
✅ Wyoming Protocol support
✅ Voice cloning with audio prompt
✅ Multi-lingual support
✅ GPU acceleration (CUDA/MPS)
✅ Mixed precision (bfloat16/float16)
✅ Streaming & batch inference
✅ Low-latency design

🧩 Model Requirements

Chatterbox TTS model (pretrained)
Audio prompt file (for voice cloning)
Supported languages: en, de, fr, es, etc.

🧪 Testing

Test API

curl -X POST http://localhost:5002/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "input": "The quick brown fox jumps over the lazy dog.",
    "voice": "nova",
    "response_format": "mp3"
  }' > output.mp3

Test Wyoming

wyoming-client --server tcp://localhost:10200 --text "Hello world"

🧩 Build & Run

Use Pre-Built Image

docker pull docker.io/justinlime/fatterbox:latest
docker run -p 5002:5002 -p 10200:10200 -e DEVICE=cuda -v /path/to/audio/prompt.wav:/audio/prompt.wav docker.io/justinlime/fatterbox:latest