AI/TLDR

Piper

Fast, local neural text-to-speech that runs offline on your own machine

Overview

Piper is an open-source neural text-to-speech engine that turns written text into natural-sounding speech. It is built to be fast and to run fully on your own machine, so you do not need to send text to a cloud service.

Piper uses the espeak-ng project to convert words into phonemes, then a neural voice model to produce the audio. It ships ready-made voices in many languages, works well even on small hardware like a Raspberry Pi, and is used by projects such as Home Assistant and screen readers for the visually impaired.

What it does

  • Local, offline speech synthesis with no cloud dependency, so your text stays on your own device
  • Fast neural voices that run on modest hardware, including small boards like the Raspberry Pi
  • Pre-trained voices in many languages that you can list and download on demand
  • Command-line interface, HTTP web server, Python API, and a C/C++ library for different integration needs
  • Optional GPU acceleration through onnxruntime-gpu for higher throughput
  • Tunable output, including volume, speaking speed, audio variation, and raw espeak-ng phoneme injection
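
As a rough sketch of the tuning options in the list above, the snippet below converts a speed multiplier into a length scale and passes it to Piper's Python API together with a volume setting. The `SynthesisConfig` class, its `volume` and `length_scale` fields, and the `syn_config` keyword follow the project's Python API documentation as understood at the time of writing; treat them as assumptions and verify them against your installed version. The helper names `length_scale_for_speed` and `speak_tuned` are hypothetical.

```python
import wave


def length_scale_for_speed(speed: float) -> float:
    """Convert a speed multiplier (2.0 = twice as fast) to a length scale.

    Assumption: a larger length scale stretches phoneme durations, so it
    is the inverse of the speaking speed.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    return 1.0 / speed


def speak_tuned(model_path: str, text: str, out_path: str,
                speed: float = 1.0, volume: float = 1.0) -> None:
    """Synthesize text to a WAV file with adjusted speed and volume."""
    # Imported lazily so the helper above works without piper-tts installed.
    from piper import PiperVoice, SynthesisConfig  # assumed API names

    voice = PiperVoice.load(model_path)
    config = SynthesisConfig(
        volume=volume,
        length_scale=length_scale_for_speed(speed),
    )
    with wave.open(out_path, "wb") as wav_file:
        voice.synthesize_wav(text, wav_file, syn_config=config)
```

Calling `speak_tuned("/path/to/en_US-lessac-medium.onnx", "Hello.", "slow.wav", speed=0.5)` would then request speech at roughly half speed, assuming the model file exists at that path.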

Getting started

Install Piper from PyPI, download a voice, then generate a WAV file from text. The Python package name is piper-tts.

Install Piper

Install the Piper package from PyPI with pip.

pip install piper-tts
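
To confirm the install worked, one option is to ask Python's packaging metadata for the installed version. This is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical helper name, and the only Piper-specific detail is the package name `piper-tts` given above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "piper-tts") -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(f"piper-tts {version}" if version else "piper-tts is not installed")
```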

Download a voice

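Choose a voice, then download it. The command below downloads an English (US) voice into the current directory; according to the project documentation, running the same module without a voice name lists the voices that are available.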

python3 -m piper.download_voices en_US-lessac-medium
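
After the download, it can help to check that the voice files actually arrived. The sketch below assumes a voice is stored as a `<voice>.onnx` model with a `<voice>.onnx.json` config file next to it, and that voice names follow the `language-speaker-quality` pattern seen in `en_US-lessac-medium`. Both are assumptions to verify against the files you receive, and the helper names are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def parse_voice_name(voice: str) -> Tuple[str, str, str]:
    """Split a voice name like 'en_US-lessac-medium' into its three parts."""
    language, rest = voice.split("-", 1)
    speaker, quality = rest.rsplit("-", 1)
    return language, speaker, quality


def missing_voice_files(voice: str, data_dir: str = ".") -> List[Path]:
    """Return the expected voice files that are not present in data_dir.

    Assumption: a voice is a '<voice>.onnx' model plus a
    '<voice>.onnx.json' config file stored side by side.
    """
    base = Path(data_dir)
    expected = [base / f"{voice}.onnx", base / f"{voice}.onnx.json"]
    return [path for path in expected if not path.is_file()]
```

An empty list from `missing_voice_files("en_US-lessac-medium")` suggests the voice is ready to use from the current directory.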

Generate speech from the command line

Run Piper with a voice model and write the spoken text to a WAV file. Here it creates test.wav from a short sentence.

python3 -m piper -m en_US-lessac-medium -f test.wav -- 'This is a test.'
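
If you want to drive the same command from a script, a thin wrapper around `subprocess` is one way to do it. This sketch reuses only the flags shown above (`-m`, `-f`, and `--` before the text); `build_piper_command` and `synthesize_cli` are hypothetical helper names, and the call only succeeds if `piper-tts` and the voice are installed.

```python
import subprocess
import sys
from typing import List


def build_piper_command(model: str, out_path: str, text: str) -> List[str]:
    """Build the argument list for the `python3 -m piper` command above."""
    # sys.executable runs Piper with the same interpreter as this script.
    return [sys.executable, "-m", "piper", "-m", model, "-f", out_path, "--", text]


def synthesize_cli(model: str, out_path: str, text: str) -> None:
    """Run Piper as a subprocess; raises CalledProcessError on failure."""
    subprocess.run(build_piper_command(model, out_path, text), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the text contains apostrophes or other special characters.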

Use the Python API

You can also call Piper from Python with PiperVoice.synthesize_wav to write audio to a WAV file.

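import wave
from piper import PiperVoice

# Load the voice model (the .onnx file) from disk.
voice = PiperVoice.load("/path/to/en_US-lessac-medium.onnx")

# Open the output file for binary writing and write the synthesized audio into it.
with wave.open("test.wav", "wb") as wav_file:
    voice.synthesize_wav("Welcome to the world of speech synthesis!", wav_file)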
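
As a quick sanity check on the result, the standard-library `wave` module can report the basic properties of the file that was written. This is a minimal sketch that is independent of Piper itself; `wav_summary` is a hypothetical helper name.

```python
import wave


def wav_summary(path: str) -> dict:
    """Read basic properties of a WAV file using the standard library."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_seconds": frames / rate,
        }
```

Running `wav_summary("test.wav")` after the synthesis step shows whether the file contains audio and how long it is.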

Commands and code are distilled from the project's own documentation. Always check the official repo for the latest version.

When to use it

  • Adding offline voice output to home automation setups such as Home Assistant
  • Powering screen readers and accessibility tools that read text aloud for visually impaired users
  • Generating narration or voiceovers for videos and apps without relying on a paid cloud service
  • Running a local text-to-speech web server for repeated synthesis from your own software
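
For the web server use case, a client can be as small as one JSON POST. The sketch below assumes a Piper HTTP server is already running locally (the project documentation describes starting one with `python3 -m piper.http_server -m en_US-lessac-medium`), that it listens on port 5000, and that it accepts a JSON body with a `text` field and returns WAV audio. All of these details are assumptions to verify against the official repo, and `build_tts_request` and `fetch_speech` are hypothetical helper names.

```python
import json
import urllib.request


def build_tts_request(text: str, url: str = "http://localhost:5000") -> urllib.request.Request:
    """Build a POST request carrying the text as JSON.

    Assumption: the server accepts {"text": ...} at the root URL on port 5000.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_speech(text: str, out_path: str, url: str = "http://localhost:5000") -> None:
    """Send text to a running server and save the returned audio bytes."""
    with urllib.request.urlopen(build_tts_request(text, url)) as response:
        with open(out_path, "wb") as out_file:
            out_file.write(response.read())
```

Keeping the server running means the voice model is loaded once, and each `fetch_speech` call then pays only for synthesis.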

How Piper compares

Piper alongside the other open-source audio, music, and voice tools that AI/TLDR tracks, ranked by GitHub stars.

| Tool | Stars | What it does |
| --- | --- | --- |
| Whisper | ★ 104k | OpenAI's speech recognition model that transcribes and translates audio across many languages. |
| GPT-SoVITS | ★ 59k | An open-source WebUI that clones a voice from a short audio sample and turns text into speech, with zero-shot and few-shot fine-tuning. |
| VibeVoice | ★ 49.6k | Microsoft's text-to-speech model for generating long, expressive multi-speaker audio like podcasts. |
| Coqui TTS | ★ 45.6k | A library of text-to-speech models including the multilingual XTTS voice-cloning model. |
| ChatTTS | ★ 39.5k | An open-source text-to-speech model tuned for dialogue, with multi-speaker support and fine-grained control over laughter, pauses, and prosody. |
| MockingBird | ★ 36.9k | An open-source PyTorch toolbox that clones a voice from a short sample and generates Mandarin Chinese speech, with a web app, desktop toolbox, and command line. |
| OpenVoice | ★ 36.8k | Clones a voice from a short reference clip and speaks in multiple languages, with control over emotion, accent, rhythm, and intonation. |
| Piper | ★ 11.1k | Fast, local neural text-to-speech that runs offline on your own machine. |