OpenVoice

Instant voice cloning that copies a tone color and speaks in many languages

github.com/myshell-ai/OpenVoice · ★ 36.7k · research.myshell.ai/open-voice

Overview

OpenVoice is an open-source voice cloning model built by researchers at MIT, Tsinghua University, and MyShell. It copies the tone color of a reference speaker from a short audio clip and then uses that voice to generate new speech.

The reference clip can be in any language, and the cloned voice can speak in languages other than the one in the clip. Version 2 natively supports English, Spanish, French, Chinese, Japanese, and Korean. Beyond copying a voice, OpenVoice gives you control over style details such as emotion, accent, rhythm, pauses, and intonation.

Both V1 and V2 are released under the MIT License, so the project is free for commercial and research use. It has powered the instant voice cloning feature on the MyShell platform.

What it does

Accurate tone color cloning from a short reference audio clip
Zero-shot cross-lingual cloning: the reference and output languages need not match
Flexible style control over emotion, accent, rhythm, pauses, and intonation
Native multi-lingual support in V2 for English, Spanish, French, Chinese, Japanese, and Korean
MIT licensed and free for both commercial and research use
Local Gradio demo and example notebooks for trying cloning end to end

Getting started

OpenVoice targets developers and researchers comfortable with Linux, Python, and PyTorch. Create a Conda environment, install the package, then download the model checkpoints. For V2 you also install MeloTTS for multi-lingual speech.

Create the environment and install OpenVoice

Make a Python 3.9 Conda environment, clone the repository, and install the package in editable mode. This same install works for both V1 and V2.

conda create -n openvoice python=3.9
conda activate openvoice
git clone git@github.com:myshell-ai/OpenVoice.git
cd OpenVoice
pip install -e .

Add MeloTTS for V2 multi-lingual speech

OpenVoice V2 uses MeloTTS to generate base speech across languages. Install it and download the dictionary data.

pip install git+https://github.com/myshell-ai/MeloTTS.git
python -m unidic download
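
After both installs, a quick import check can confirm the environment is usable before any model is loaded. The sketch below is not part of OpenVoice or MeloTTS and is not taken from their documentation; it only asks Python whether each top-level module can be located. The module names `openvoice` and `melo` are assumptions about what the two packages install (`unidic` comes from the command above), and `find_missing_modules` is a hypothetical helper name.

```python
import importlib.util

# Assumed top-level import names; adjust if the packages install differently.
REQUIRED_MODULES = ("openvoice", "melo", "unidic")


def find_missing_modules(names=REQUIRED_MODULES):
    """Return the module names that Python cannot locate in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report the result without importing (and therefore without loading) anything.
missing = find_missing_modules()
print("Missing modules:", ", ".join(missing) if missing else "none")
```

Using `find_spec` rather than a plain `import` keeps the check fast, since it locates each module without executing it.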

Try the local Gradio demo

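After downloading the checkpoints and extracting them into the checkpoints folder, launch the minimal local Gradio demo. The example notebooks demo_part1, demo_part2, and demo_part3 show full cloning workflows; in the repository, demo_part1 and demo_part2 cover V1 (style control and cross-lingual cloning) and demo_part3 covers V2.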

python -m openvoice_app --share
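
Because the demo depends on the checkpoints being in place, a small pre-flight check can save a failed launch. The following is a minimal sketch, not something shipped with OpenVoice: `checkpoints_ready` is a hypothetical helper, and the default folder name `checkpoints` is taken from the text above. It only tests that the folder exists and contains at least one file; it does not verify which files are present or that they are valid.

```python
from pathlib import Path


def checkpoints_ready(folder="checkpoints"):
    """Return True if `folder` exists and holds at least one file at any depth."""
    root = Path(folder)
    if not root.is_dir():
        return False
    # rglob walks subfolders too, so nested checkpoint layouts are counted.
    return any(path.is_file() for path in root.rglob("*"))


if checkpoints_ready():
    print("Checkpoint files found; the demo can be launched.")
else:
    print("No checkpoint files found in ./checkpoints; download them first.")
```

Run it from the repository root so the relative path resolves to the same folder the demo uses, or pass a different folder if your checkpoints live elsewhere.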

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

Clone a narrator's voice from a short clip and generate audiobook or video narration in several languages
Build multilingual voice assistants or chatbots that keep a consistent voice across English, Spanish, French, Chinese, Japanese, and Korean
Add emotion, accent, and rhythm control to text-to-speech output for more natural-sounding dialogue
Localize content by speaking the same voice in a language the original speaker never recorded
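
The localization use case comes down to pairing one reference clip with a translated script per target language. The sketch below only organizes those inputs into a job list; it does not call OpenVoice or any TTS engine. Everything in it is illustrative: `plan_localization`, the file names, and the output layout are hypothetical, and the two-letter language tags are assumptions that should be checked against whatever TTS backend you use.

```python
from pathlib import Path

# Assumed language tags for illustration; check the TTS backend for the real ones.
TARGET_LANGUAGES = {
    "English": "EN",
    "Spanish": "ES",
    "French": "FR",
    "Chinese": "ZH",
    "Japanese": "JP",
    "Korean": "KR",
}


def plan_localization(reference_clip, scripts, output_dir="outputs"):
    """Build one job per translated script, all sharing the same reference clip."""
    jobs = []
    for language, text in scripts.items():
        if language not in TARGET_LANGUAGES:
            raise ValueError(f"Unsupported language: {language}")
        tag = TARGET_LANGUAGES[language]
        jobs.append({
            "reference_clip": str(reference_clip),
            "language": tag,
            "text": text,
            "output_path": str(Path(output_dir) / f"narration_{tag.lower()}.wav"),
        })
    return jobs


# Hypothetical inputs: one narrator sample, two translated scripts.
jobs = plan_localization(
    "narrator_sample.wav",
    {"English": "Welcome to the tour.", "Spanish": "Bienvenidos al recorrido."},
)
for job in jobs:
    print(job["language"], "->", job["output_path"])
```

Each job carries the same reference clip, which mirrors the cross-lingual idea: the voice comes from one recording while the language comes from the script.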

How OpenVoice compares

OpenVoice alongside other open-source audio, music, and voice tools tracked by AI/TLDR, ranked by GitHub stars.

Tool	Stars	What it does
Whisper	★ 103k	OpenAI's speech recognition model that transcribes and translates audio across many languages.
GPT-SoVITS	★ 58.9k	An open-source WebUI that clones a voice from a short audio sample and turns text into speech, with zero-shot and few-shot fine-tuning.
VibeVoice	★ 49.5k	Microsoft's text-to-speech model for generating long, expressive multi-speaker audio like podcasts.
Coqui TTS	★ 45.6k	A library of text-to-speech models including the multilingual XTTS voice-cloning model.
ChatTTS	★ 39.5k	An open-source text-to-speech model tuned for dialogue, with multi-speaker support and fine-grained control over laughter, pauses, and prosody.
MockingBird	★ 36.9k	An open-source PyTorch toolbox that clones a voice from a short sample and generates Mandarin Chinese speech, with a web app, desktop toolbox, and command line.
OpenVoice	★ 36.7k	Instant voice cloning that copies a tone color and speaks in many languages.
VoxCPM	★ 31k	An open-source text-to-speech system that generates natural multilingual speech, designs voices from text descriptions, and clones any voice from a short clip.
