AI/TLDR

DeepL · 2026-04-16 · notable

DeepL Voice-to-Voice — Real-Time Speech Translation Suite

DeepL Voice-to-Voice ships with 40+ language coverage for meetings (Zoom/Teams), in-person conversations, and a developer API. Independent evaluation: 96% of linguists preferred it over Google, Microsoft, and Zoom native translation.

DeepL Voice for Meetings interface showing real-time speech translation in a video call

DeepL extends its translation API to live voice — real-time speech translation for meetings, conversations, and custom apps across 40+ languages.

Key specs

Languages supported: 40+
Linguist preference vs Google/Microsoft/Zoom: 96%

What is it?

DeepL Voice-to-Voice is a real-time spoken translation suite from DeepL, the company behind the translation API used by millions of developers. It launches with four products: Voice for Conversations (mobile and web, generally available now), Group Conversations (QR-code access for frontline teams, available April 30), Voice for Meetings (Zoom and Teams captions, early access in June), and a Voice-to-Voice API for embedding real-time speech translation in custom business apps. Supported languages include all 24 official EU languages as well as Arabic, Thai, Hebrew, Bengali, Vietnamese, Tagalog, and Norwegian, for a total of 40+.

How does it work?

The pipeline converts speech to text, applies DeepL's proprietary translation models tuned specifically for spoken language, then synthesizes speech output. DeepL's in-house LLM handles the fragmentation and incomplete sentences typical in live conversation. A terminology customization hub lets teams add product names, company jargon, and technical vocabulary so domain-specific speech translates accurately. DeepL is developing a fully end-to-end voice model (bypassing the text step entirely) as an ongoing AI Labs research project.
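
To make the cascaded flow concrete, here is a minimal Python sketch of a speech-to-text, translate, text-to-speech chain with a terminology step. Every name in it (`transcribe`, `translate`, `synthesize`, `translate_with_glossary`, `TOY_LEXICON`) is a hypothetical stand-in, not a DeepL API call, and the placeholder trick used for terminology is only one simple way to protect terms; DeepL has not published how its customization hub works internally.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for the three stages; none of these are DeepL calls.
TOY_LEXICON = {"open": "abre", "the": "el", "dashboard": "tablero"}


def transcribe(audio: bytes) -> str:
    """Pretend speech-to-text: treat the 'audio' bytes as UTF-8 text."""
    return audio.decode("utf-8")


def translate(text: str) -> str:
    """Toy word-by-word translation; unknown words pass through unchanged."""
    return " ".join(TOY_LEXICON.get(w.lower(), w) for w in text.split())


def synthesize(text: str) -> bytes:
    """Pretend text-to-speech: encode the text back to bytes."""
    return text.encode("utf-8")


def translate_with_glossary(
    text: str, glossary: Dict[str, str], translate_fn: Callable[[str], str]
) -> str:
    """Shield glossary terms with placeholders, translate, then restore them."""
    restored = {}
    for i, (source_term, target_term) in enumerate(glossary.items()):
        token = f"__TERM{i}__"
        if source_term in text:
            text = text.replace(source_term, token)
            restored[token] = target_term
    translated = translate_fn(text)
    for token, target_term in restored.items():
        translated = translated.replace(token, target_term)
    return translated


def speech_to_speech(audio: bytes, glossary: Dict[str, str]) -> bytes:
    """Run the cascaded pipeline: STT -> glossary-aware translation -> TTS."""
    source_text = transcribe(audio)
    target_text = translate_with_glossary(source_text, glossary, translate)
    return synthesize(target_text)


if __name__ == "__main__":
    # The glossary keeps the (made-up) product name intact in the output.
    glossary = {"Acme Dashboard": "Acme Dashboard"}
    print(speech_to_speech(b"open the Acme Dashboard", glossary))
```

Without the glossary entry, the toy translator would render "Dashboard" as "tablero"; with it, the product name passes through unchanged. A production system would stream audio in chunks rather than process a whole utterance at once.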

Why does it matter?

DeepL has 200,000+ business customers who already trust its written translation API. The Voice-to-Voice API is the most developer-relevant part: it lets teams embed live speech translation directly into contact center software, kiosk apps, or field applications without building a multi-step STT → translate → TTS pipeline themselves. An independent evaluation by Slator found that 96% of professional linguists preferred DeepL Voice over native translation from Google, Microsoft, and Zoom.

Who is it for?

Developers building multilingual voice apps and enterprises running global customer-facing operations.

Try it

https://www.deepl.com/en/products/voice

Tags

  • deepl
  • voice
  • translation
  • speech
  • real-time
  • multilingual
  • api
  • meetings
  • enterprise
