DeepL · 2026-04-16 · notable
DeepL Voice-to-Voice — Real-Time Speech Translation Suite
DeepL Voice-to-Voice launches with coverage of 40+ languages for meetings (Zoom/Teams), in-person conversations, and a developer API. In an independent evaluation, 96% of linguists preferred it over native translation from Google, Microsoft, and Zoom.

DeepL extends its translation API to live voice — real-time speech translation for meetings, conversations, and custom apps across 40+ languages.
Key specs
| Spec | Value |
|---|---|
| Languages supported | 40+ |
| Linguist preference vs. Google/Microsoft/Zoom | 96% |
What is it?
DeepL Voice-to-Voice is a real-time spoken translation suite from DeepL, the company behind the translation API used by millions of developers. It launches with four products: Voice for Conversations (mobile and web, generally available now), Group Conversations (QR-code access for frontline teams, April 30), Voice for Meetings (Zoom and Teams captions, early access in June), and a Voice-to-Voice API for embedding real-time speech translation in custom business apps. Coverage totals 40+ languages, including all 24 official EU languages plus Arabic, Thai, Hebrew, Bengali, Vietnamese, Tagalog, and Norwegian.
How does it work?
The pipeline converts speech to text, applies DeepL's proprietary translation models tuned specifically for spoken language, then synthesizes speech output. DeepL's in-house LLM handles the fragmentation and incomplete sentences typical in live conversation. A terminology customization hub lets teams add product names, company jargon, and technical vocabulary so domain-specific speech translates accurately. DeepL is developing a fully end-to-end voice model (bypassing the text step entirely) as an ongoing AI Labs research project.
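The cascade described above (speech to text, translation, speech synthesis) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not DeepL's implementation or API: the class `CascadedVoiceTranslator`, the three stage signatures, and the `fake_*` stand-ins are all hypothetical. Glossary terms are protected with placeholder tokens before translation, a common generic technique used here only as a stand-in for the terminology hub.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Hypothetical stage signatures for illustration; none of these are DeepL API calls.
Recognizer = Callable[[bytes], str]           # audio -> transcript
Translator = Callable[[str, str, str], str]   # text, source, target -> text
Synthesizer = Callable[[str, str], bytes]     # text, language -> audio


@dataclass
class CascadedVoiceTranslator:
    recognize: Recognizer
    translate: Translator
    synthesize: Synthesizer
    # Source term -> required target rendering (assumed glossary shape).
    glossary: Dict[str, str] = field(default_factory=dict)

    def _protect(self, text: str) -> Tuple[str, Dict[str, str]]:
        """Swap glossary terms for opaque tokens so the translator skips them."""
        slots: Dict[str, str] = {}
        for i, (term, rendering) in enumerate(self.glossary.items()):
            if term in text:
                token = f"__TERM{i}__"
                text = text.replace(term, token)
                slots[token] = rendering
        return text, slots

    def run(self, audio: bytes, source: str, target: str) -> bytes:
        transcript = self.recognize(audio)                      # 1. speech to text
        protected, slots = self._protect(transcript)
        translated = self.translate(protected, source, target)  # 2. translate
        for token, rendering in slots.items():                  # restore terms
            translated = translated.replace(token, rendering)
        return self.synthesize(translated, target)              # 3. text to speech


# Toy stand-ins so the sketch runs without any speech or translation service.
def fake_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio bytes are the transcript


def fake_translate(text: str, source: str, target: str) -> str:
    lexicon = {"hello": "hola", "team": "equipo"}
    return " ".join(lexicon.get(word.lower(), word) for word in text.split())


def fake_synthesize(text: str, language: str) -> bytes:
    return text.encode("utf-8")  # pretend the encoded text is audio


pipeline = CascadedVoiceTranslator(
    fake_recognize, fake_translate, fake_synthesize,
    glossary={"Team Hub": "Team Hub"},  # keep the product name untranslated
)
print(pipeline.run(b"hello Team Hub", "en", "es"))  # b'hola Team Hub'
```

Without the glossary entry, the toy translator would render the product name as "equipo Hub", which is the kind of domain-term error a terminology list is meant to prevent. A production system would also stream audio in chunks rather than process one complete utterance, which this sketch leaves out.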
Why does it matter?
DeepL has 200,000+ business customers who already trust its written translation API. The Voice-to-Voice API is the most developer-relevant part: it lets teams embed live speech translation directly into contact center software, kiosk apps, or field applications without building a multi-step STT → translate → TTS pipeline themselves. An independent evaluation by Slator found that 96% of professional linguists preferred DeepL Voice over native translation from Google, Microsoft, and Zoom.
Who is it for?
Developers building multilingual voice apps and enterprises running global customer-facing operations.
Try it
https://www.deepl.com/en/products/voice