Deepgram Nova-3 Boosts Multilingual Speech-to-Text Accuracy

Deepgram's latest Nova-3 update dramatically improves multilingual speech recognition, especially for mixed-language conversations.

Deepgram has released an updated Nova-3 Multilingual speech-to-text model, significantly reducing word error rates. This update particularly enhances performance in 'code-switching' scenarios where languages are mixed within a single conversation, making it easier for businesses to understand global customers.

By Sarah Kline

February 14, 2026

4 min read

Key Facts

  • Deepgram released an updated Nova-3 Multilingual speech-to-text model.
  • The update delivers a ~34% relative reduction in batch mean WER and a ~21% relative reduction in streaming mean WER.
  • Significant accuracy gains were achieved in 'code-switching' scenarios (mixing languages mid-sentence).
  • No API or configuration changes are required for users.
  • Nova-3 Multilingual supports 10 languages, including English, Spanish, French, German, and Japanese.

Why You Care

Ever tried to use voice AI and found it stumbled over a mixed-language sentence? It’s frustrating, right? What if your voice assistant could seamlessly understand you, even when you switch languages mid-sentence?

Deepgram’s latest Nova-3 Multilingual speech-to-text model is here to solve that problem. This update promises to make voice AI far more accurate and useful for anyone dealing with diverse language interactions. Your ability to communicate across linguistic boundaries just got a significant upgrade.

What Actually Happened

Deepgram has rolled out an updated Nova-3 Multilingual speech-to-text model, according to the announcement. The new version brings substantial accuracy improvements across all supported languages, with a focus on real-world multilingual speech recognition, especially inputs where languages are combined within a single utterance or conversation. 'Code-switching' scenarios, where speakers mix languages within one sentence, are now handled much better. The best part? No API or configuration changes are required for existing users, as mentioned in the release, so the updated model is live and ready to use immediately.
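Because no API changes are involved, an existing integration picks up the update automatically. As a rough sketch, a pre-recorded transcription request to Deepgram's REST endpoint looks something like the following; the endpoint and `model`/`language` parameters reflect Deepgram's public API as generally documented, but the API key and audio URL are placeholders, so treat this as an illustrative assumption rather than verified sample code:

```python
# Sketch of a pre-recorded transcription request using the multilingual
# Nova-3 model. The endpoint and query parameters follow Deepgram's
# public REST API as commonly documented; key and audio URL are placeholders.
from urllib.parse import urlencode

DEEPGRAM_API_KEY = "YOUR_API_KEY"  # placeholder, not a real key

params = {
    "model": "nova-3",    # same model name as before the update
    "language": "multi",  # multilingual / code-switching transcription
}
url = "https://api.deepgram.com/v1/listen?" + urlencode(params)

# The actual call (requires the `requests` package and a valid key):
# import requests
# resp = requests.post(
#     url,
#     headers={"Authorization": f"Token {DEEPGRAM_API_KEY}",
#              "Content-Type": "application/json"},
#     json={"url": "https://example.com/bilingual-call.wav"},  # hypothetical audio
# )
# transcript = (resp.json()["results"]["channels"][0]
#                          ["alternatives"][0]["transcript"])

print(url)
```

The point is simply that the request body and parameters stay identical; the improved model is served behind the same `nova-3` identifier.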

Why This Matters to You

Speech recognition in the real world is often complex and unpredictable. People frequently switch languages mid-sentence, use varied accents, and mix vocabulary, as the company reports. This makes multilingual speech recognition a significant challenge for automatic speech recognition (ASR) systems. Imagine you’re running a global customer support line. Your customers might naturally blend languages when explaining an issue. This update directly addresses that difficulty.

Consider a bilingual English/Spanish speaker saying, “I was charged twice, pero solo hice una compra.” This translates to: “I was charged twice, but I only made one purchase.” Historically, these transitions have been challenging for multilingual systems. The Nova-3 update means your voice AI can now correctly recognize words as speakers switch languages mid-sentence. This leads to more accurate transcriptions and better understanding.

Key Improvements for Your Business:

  • Reduced Word Error Rate (WER): Lower errors in both batch and streaming transcription.
  • Enhanced Code-Switching: Better handling of mixed-language conversations.
  • No Integration Hassle: The updated model works without any API changes.
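To make the headline numbers concrete, a "relative" WER reduction scales the existing error rate down by that fraction. A small worked example, using a hypothetical 10% baseline WER (the baseline is illustrative; only the ~34% and ~21% figures come from the announcement):

```python
# Worked example of a *relative* WER reduction. The 10% baseline below is
# hypothetical; the 34% and 21% relative cuts are the announced figures.
def apply_relative_reduction(baseline_wer: float, relative_cut: float) -> float:
    """Return the new WER after applying a relative reduction."""
    return baseline_wer * (1 - relative_cut)

baseline = 0.10  # hypothetical 10% word error rate
batch_new = apply_relative_reduction(baseline, 0.34)      # ~34% batch cut
streaming_new = apply_relative_reduction(baseline, 0.21)  # ~21% streaming cut

print(f"batch:     {baseline:.1%} -> {batch_new:.2%}")      # 10.0% -> 6.60%
print(f"streaming: {baseline:.1%} -> {streaming_new:.2%}")  # 10.0% -> 7.90%
```

In other words, a system that previously mis-transcribed one word in ten would mis-transcribe roughly one word in fifteen in batch mode.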

How much smoother would your international communications be with this improved accuracy?

The Surprising Finding

Here’s the twist: the most significant gains from this update appear in ‘code-switching’ scenarios. While overall accuracy improved, the team revealed that the largest gains came specifically in situations where languages are mixed. This challenges the common assumption that general accuracy improvements distribute uniformly across all use cases. Instead, Deepgram specifically targeted, and achieved substantial progress on, these complex mixed-language interactions. The announcement reports a ~34% relative reduction in batch mean WER and a ~21% relative reduction in streaming mean WER. These figures highlight a focused improvement where it’s arguably most needed for real-world multilingual communication: the model isn’t just generally better; it’s specifically much better at handling the messiness of actual human speech across languages.

What Happens Next

This update sets the stage for more global communication tools. Businesses using Deepgram’s Nova-3 can expect benefits in their multilingual operations, according to the announcement. For example, call centers handling international clients will see more accurate transcriptions starting today, which could lead to better customer service and more efficient data analysis. You can start building more effective voice AI applications for diverse linguistic environments right now.

Deepgram’s continued focus on real-world speech complexity indicates future enhancements will likely build on these specialized improvements. We might see even more nuanced handling of accents and dialects within mixed-language contexts in the coming months. For developers, the actionable advice is to explore how this enhanced multilingual capability can improve your current or upcoming projects. Think about how your applications can better serve a global audience. The industry implications are clear: higher accuracy in multilingual speech-to-text will drive wider adoption of voice AI across international markets. This is a crucial step towards truly global voice AI solutions, as the company reports.
