AI/TLDR

edge-tts

Use Microsoft Edge's online text-to-speech from Python or the command line

Overview

edge-tts is a Python module that lets you use Microsoft Edge's online text-to-speech service straight from your own code, or through the bundled edge-tts and edge-playback commands. You give it text, and it returns spoken audio in many languages and voices.

Because it talks to Microsoft's hosted service, you do not need to download or run any large speech model yourself. You can generate an MP3 file, write matching subtitles, and tune the rate, volume and pitch of the result with simple options.

What it does

  • Convert text to spoken audio using Microsoft Edge's online text-to-speech voices
  • Choose from a large list of voices across many languages and genders with the --list-voices option
  • Write the generated audio to an MP3 file and matching subtitles to an SRT file
  • Adjust the rate, volume and pitch of the speech with the --rate, --volume and --pitch options
  • Play audio back immediately with the edge-playback command (uses the mpv player on non-Windows systems)
  • Use the module directly from Python, or run the edge-tts and edge-playback command-line tools
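
As a minimal sketch of calling the module from Python, the snippet below formats the rate, volume and pitch offsets as signed strings and passes them to the library. It assumes the `edge_tts.Communicate` class and its `save` coroutine behave as I understand them. The `Hz` unit for pitch, the helper names, and the default voice `en-US-AriaNeural` are assumptions not stated in the text above.

```python
import asyncio


def tuning_options(rate_pct: int = 0, volume_pct: int = 0, pitch_hz: int = 0) -> dict:
    """Format signed offsets as strings such as "-50%" or "+10Hz" (hypothetical helper)."""
    return {
        "rate": f"{rate_pct:+d}%",
        "volume": f"{volume_pct:+d}%",
        "pitch": f"{pitch_hz:+d}Hz",  # assumption: pitch is given in Hz
    }


async def synthesize(text: str, out_path: str,
                     voice: str = "en-US-AriaNeural", **tuning: int) -> None:
    """Send text to the hosted service and write the returned audio to out_path."""
    # Imported lazily so tuning_options() works even where edge-tts is not installed.
    import edge_tts

    communicate = edge_tts.Communicate(text, voice, **tuning_options(**tuning))
    await communicate.save(out_path)  # requires network access


# Example call (hypothetical file name):
# asyncio.run(synthesize("Hello, world!", "hello.mp3", rate_pct=-20))
```

Keeping the string formatting in its own function makes the sign handling easy to check without contacting the service.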

Getting started

Install edge-tts with pip, then generate speech from the command line. If you only want the command-line tools, pipx is a cleaner choice.

Install with pip

Install the module and its commands with pip.

pip install edge-tts

Install the commands only with pipx

If you only need the edge-tts and edge-playback commands, pipx keeps them isolated from your other Python packages.

pipx install edge-tts

Generate speech and subtitles

Pass your text and choose where to write the audio and the subtitles.

edge-tts --text "Hello, world!" --write-media hello.mp3 --write-subtitles hello.srt
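
If you want to drive the same command from a script, one possible approach is to build the argument list in Python and hand it to `subprocess`. This sketch uses only the flags shown above. The function names are hypothetical, and it assumes the `edge-tts` command is on your `PATH`.

```python
import subprocess
from typing import List, Optional


def build_command(text: str, media_path: str,
                  subtitles_path: Optional[str] = None) -> List[str]:
    """Assemble the edge-tts argument list; subtitles are written only if a path is given."""
    cmd = ["edge-tts", "--text", text, "--write-media", media_path]
    if subtitles_path is not None:
        cmd += ["--write-subtitles", subtitles_path]
    return cmd


def run_edge_tts(text: str, media_path: str,
                 subtitles_path: Optional[str] = None) -> None:
    """Run the command and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(text, media_path, subtitles_path), check=True)


# Example call (hypothetical file names):
# run_edge_tts("Hello, world!", "hello.mp3", "hello.srt")
```

Passing the arguments as a list avoids shell quoting problems when the text contains spaces or punctuation.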

Pick a voice and tune the output

List the available voices, then select one and adjust the rate, volume or pitch. When using a negative value, write the option with an equals sign, for example --rate=-50%, so it is not read as a separate flag.

edge-tts --list-voices
edge-tts --rate=-50% --text "Hello, world!" --write-media slower.mp3
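
To see why the equals sign matters, here is a small stand-in parser built with Python's `argparse`. It is not the edge-tts parser itself. It only illustrates, on the assumption that the command parses its options in a similar way, how a value beginning with a minus sign is mistaken for another option unless it is attached with `=`.

```python
import argparse
import contextlib
import io
from typing import List, Optional


def parse_rate(argv: List[str]) -> Optional[str]:
    """Return the parsed --rate value, or None if argparse rejects the arguments."""
    parser = argparse.ArgumentParser(prog="demo")  # stand-in, not edge-tts itself
    parser.add_argument("--rate", default="+0%")
    try:
        # Hide the usage message argparse prints before exiting on an error.
        with contextlib.redirect_stderr(io.StringIO()):
            return parser.parse_args(argv).rate
    except SystemExit:
        return None


attached = parse_rate(["--rate=-50%"])     # value attached with "="
separate = parse_rate(["--rate", "-50%"])  # value passed as its own argument
```

In this sketch `attached` holds `"-50%"`, while `separate` is `None` because the parser treats `-50%` as an unknown option and reports that `--rate` is missing its value.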

Commands and code are distilled from the project's own documentation. Always check the official repo for the latest.

When to use it

  • Generate voiceovers or narration files for videos, slideshows and other media
  • Add spoken audio output to your own apps and scripts by calling the Python module
  • Create audio versions of articles or notes so you can listen instead of read
  • Produce sample clips of many voices and languages to help pick the right voice for a project
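
The last use case can be sketched roughly as follows: filter the voice list by locale, then render one clip per voice. This assumes `edge_tts.list_voices()` returns dictionaries with `ShortName`, `Locale` and `Gender` keys, which matches my understanding of the library but is not stated above. The helper names, file naming scheme and sample entries are illustrative only.

```python
import asyncio
from typing import Dict, List, Optional

# Illustrative entries shaped like the voice records the library is assumed to return.
SAMPLE_VOICES = [
    {"ShortName": "en-US-AriaNeural", "Locale": "en-US", "Gender": "Female"},
    {"ShortName": "en-GB-RyanNeural", "Locale": "en-GB", "Gender": "Male"},
    {"ShortName": "de-DE-KatjaNeural", "Locale": "de-DE", "Gender": "Female"},
]


def pick_voices(voices: List[Dict[str, str]], locale_prefix: str,
                gender: Optional[str] = None) -> List[str]:
    """Return the short names of voices matching a locale prefix and optional gender."""
    return [
        v["ShortName"]
        for v in voices
        if v["Locale"].startswith(locale_prefix)
        and (gender is None or v["Gender"] == gender)
    ]


def clip_name(short_name: str) -> str:
    """Hypothetical naming scheme for sample clips."""
    return f"sample-{short_name}.mp3"


async def render_samples(text: str, locale_prefix: str) -> None:
    """Fetch the live voice list and write one sample clip per matching voice."""
    import edge_tts  # imported lazily; both calls below need network access

    voices = await edge_tts.list_voices()
    for name in pick_voices(voices, locale_prefix):
        await edge_tts.Communicate(text, name).save(clip_name(name))


# Example call:
# asyncio.run(render_samples("Hello, world!", "en"))
```

Separating the filtering from the network calls lets you check the selection logic against saved voice data before generating any audio.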

How edge-tts compares

edge-tts alongside the other open-source audio, music and voice tools that AI/TLDR tracks, ranked by GitHub stars.

| Tool | Stars | What it does |
| --- | --- | --- |
| Whisper | ★ 104k | OpenAI's speech recognition model that transcribes and translates audio across many languages. |
| GPT-SoVITS | ★ 59k | An open-source WebUI that clones a voice from a short audio sample and turns text into speech, with zero-shot and few-shot fine-tuning. |
| VibeVoice | ★ 49.6k | Microsoft's text-to-speech model for generating long, expressive multi-speaker audio like podcasts. |
| Coqui TTS | ★ 45.6k | A library of text-to-speech models including the multilingual XTTS voice-cloning model. |
| ChatTTS | ★ 39.5k | ChatTTS is an open-source text-to-speech model tuned for dialogue, with multi-speaker support and fine-grained control over laughter, pauses, and prosody. |
| MockingBird | ★ 36.9k | An open-source PyTorch toolbox that clones a voice from a short sample and generates Mandarin Chinese speech, with a web app, desktop toolbox, and command line. |
| OpenVoice | ★ 36.8k | OpenVoice clones a voice from a short reference clip and speaks in multiple languages, with control over emotion, accent, rhythm, and intonation. |
| edge-tts | ★ 11.4k | Use Microsoft Edge's online text-to-speech from Python or the command line |