Language Support

Edesy voice agents support 50+ languages and automatically select the STT, TTS, and LLM providers best suited to each one.

Supported Languages

Tier 1 (Excellent Support)

Language	Code	STT	TTS	LLM	Notes
English (US)	en-US	All	All	All	Best support
English (UK)	en-GB	All	All	All	British accent
English (IN)	en-IN	All	Azure	All	Indian accent
Spanish	es	All	All	All	Latin America + Spain
French	fr	All	All	All	France + Canada
German	de	All	All	All
Portuguese	pt-BR	All	All	All	Brazil
Italian	it	All	All	All
Japanese	ja	All	All	All
Korean	ko	All	All	All
Chinese (Mandarin)	zh-CN	All	All	All	Simplified

Tier 2 (Good Support)

Language	Code	Best STT	Best TTS	Notes
Hindi	hi-IN	Google Chirp	Azure	Excellent support
Bengali	bn-IN	Google Chirp	Azure
Tamil	ta-IN	Google Chirp	Azure
Telugu	te-IN	Google Chirp	Azure
Marathi	mr-IN	Google Chirp	Azure
Gujarati	gu-IN	Google Chirp	Azure
Kannada	kn-IN	Google Chirp	Azure
Malayalam	ml-IN	Google Chirp	Azure
Punjabi	pa-IN	Google Chirp	Azure
Dutch	nl	Deepgram	Azure
Polish	pl	Deepgram	Azure
Russian	ru	Deepgram	Azure
Turkish	tr	Deepgram	Azure
Arabic	ar	Google	Azure	MSA + dialects
Hebrew	he	Google	Azure

Tier 3 (Basic Support)

Language	Code	Best STT	Best TTS	Notes
Assamese	as-IN	ElevenLabs	Azure	Limited vocabulary
Odia	or-IN	Google	Azure
Nepali	ne	Google	Azure
Sinhala	si	Google	Azure
Thai	th	Google	Azure
Vietnamese	vi	Google	Azure
Indonesian	id	Deepgram	Azure
Malay	ms	Google	Azure
Filipino	fil	Google	Azure
Ukrainian	uk	Google	Azure
Czech	cs	Deepgram	Azure
Romanian	ro	Deepgram	Azure
Hungarian	hu	Deepgram	Azure

Configuration

Basic Language Setup

{
  "agent": {
    "name": "Hindi Support Agent",
    "language": "hi-IN",
    "llmProvider": "gemini-2.5",
    "sttProvider": "google",
    "sttModel": "chirp_2",
    "ttsProvider": "azure",
    "ttsVoice": "hi-IN-SwaraNeural"
  }
}
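
A minimal sketch of how a client might load and sanity-check this configuration in Go. The `AgentConfig` struct and `parseAgentConfig` function are assumptions made for illustration, not part of a published Edesy SDK, and treating `language` as required is likewise an assumption:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AgentConfig mirrors the fields of the JSON example above.
// The struct is hypothetical; it is not an official Edesy schema.
type AgentConfig struct {
	Name        string `json:"name"`
	Language    string `json:"language"`
	LLMProvider string `json:"llmProvider"`
	STTProvider string `json:"sttProvider"`
	STTModel    string `json:"sttModel"`
	TTSProvider string `json:"ttsProvider"`
	TTSVoice    string `json:"ttsVoice"`
}

// parseAgentConfig decodes {"agent": {...}} and rejects a missing language
// (assumed here to be mandatory).
func parseAgentConfig(data []byte) (AgentConfig, error) {
	var wrapper struct {
		Agent AgentConfig `json:"agent"`
	}
	if err := json.Unmarshal(data, &wrapper); err != nil {
		return AgentConfig{}, fmt.Errorf("decode agent config: %w", err)
	}
	if wrapper.Agent.Language == "" {
		return AgentConfig{}, fmt.Errorf("agent.language is required")
	}
	return wrapper.Agent, nil
}

func main() {
	raw := []byte(`{"agent": {"name": "Hindi Support Agent", "language": "hi-IN",
		"sttProvider": "google", "sttModel": "chirp_2",
		"ttsProvider": "azure", "ttsVoice": "hi-IN-SwaraNeural"}}`)
	cfg, err := parseAgentConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Language, cfg.STTProvider, cfg.TTSVoice) // hi-IN google hi-IN-SwaraNeural
}
```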

Auto-Detection

Enable language detection for multi-lingual support:

{
  "agent": {
    "name": "Multi-lingual Agent",
    "language": "auto",
    "supportedLanguages": ["en-US", "hi-IN", "es"],
    "sttConfig": {
      "enableLanguageDetection": true
    }
  }
}
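
A detector may report a tag that is not in `supportedLanguages` exactly, for example `en-GB` when only `en-US` is listed. The sketch below shows one possible way to map a detected tag onto the configured list; `resolveLanguage` and its matching rules are assumptions for illustration, not documented Edesy behavior:

```go
package main

import (
	"fmt"
	"strings"
)

// baseLanguage returns the primary subtag in lower case: "en-US" -> "en".
func baseLanguage(tag string) string {
	base, _, _ := strings.Cut(tag, "-")
	return strings.ToLower(base)
}

// resolveLanguage maps a detected tag onto the supported list.
// It prefers an exact (case-insensitive) match, then the first entry that
// shares the base language, and reports false when nothing matches.
func resolveLanguage(detected string, supported []string) (string, bool) {
	for _, s := range supported {
		if strings.EqualFold(s, detected) {
			return s, true
		}
	}
	for _, s := range supported {
		if baseLanguage(s) == baseLanguage(detected) {
			return s, true
		}
	}
	return "", false
}

func main() {
	supported := []string{"en-US", "hi-IN", "es"}
	for _, detected := range []string{"hi-IN", "en-GB", "es-MX", "fr"} {
		lang, ok := resolveLanguage(detected, supported)
		fmt.Printf("%s -> %q (supported: %v)\n", detected, lang, ok)
	}
}
```

When no match is found, the caller can keep the current language rather than switching to one the agent is not configured for.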

Provider Recommendations by Language

English (en-US, en-GB)

{
  "sttProvider": "deepgram",
  "sttModel": "nova-3",
  "ttsProvider": "cartesia",
  "llmProvider": "gemini-2.5"
}

Why: Deepgram Nova-3 has the lowest latency and the highest English accuracy of the providers compared here (8.4% WER for Deepgram in the table under Accuracy Considerations). Cartesia provides natural-sounding voices with fast streaming.

Hindi (hi-IN)

{
  "sttProvider": "google",
  "sttModel": "chirp_2",
  "ttsProvider": "azure",
  "ttsVoice": "hi-IN-SwaraNeural",
  "llmProvider": "gemini-2.5"
}

Why: Google Chirp 2 is optimized for Indic languages, and Google has the lowest Hindi WER of the providers compared under Accuracy Considerations (12%). Azure has the best Hindi voices.

Available Hindi Voices (Azure)

Voice	Gender	Style
hi-IN-SwaraNeural	Female	Warm, conversational
hi-IN-MadhurNeural	Male	Professional
hi-IN-AnanyaNeural	Female	Clear, formal
hi-IN-ArjunNeural	Male	Young, friendly

Tamil (ta-IN)

{
  "sttProvider": "google",
  "sttModel": "chirp_2",
  "ttsProvider": "azure",
  "ttsVoice": "ta-IN-PallaviNeural",
  "llmProvider": "gemini-2.5"
}

Bengali (bn-IN)

{
  "sttProvider": "google",
  "sttModel": "chirp_2",
  "ttsProvider": "azure",
  "ttsVoice": "bn-IN-TanishaaNeural",
  "llmProvider": "gemini-2.5"
}

Assamese (as-IN)

{
  "sttProvider": "elevenlabs",
  "ttsProvider": "azure",
  "ttsVoice": "as-IN-YashicaNeural",
  "llmProvider": "gemini-2.5"
}

Note: Assamese has limited STT support. ElevenLabs Scribe provides the best accuracy of the providers compared here (20% WER, against 30% for Google).

Language-Specific Configuration

Script Direction

For right-to-left (RTL) languages such as Arabic and Hebrew, set the script direction in the TTS config:

{
  "language": "ar",
  "ttsConfig": {
    "scriptDirection": "rtl"
  }
}
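
If configurations are generated programmatically, the direction can be derived from the language code. This is a small sketch under stated assumptions: the helper `scriptDirection` is hypothetical, and `"ltr"` as the value for all other languages is assumed, since only `"rtl"` appears above.

```go
package main

import (
	"fmt"
	"strings"
)

// rtlLanguages lists the right-to-left languages named in this section.
var rtlLanguages = map[string]bool{"ar": true, "he": true}

// scriptDirection returns "rtl" or "ltr" for a tag such as "ar" or "ar-EG".
// The "ltr" value is an assumption; the document only shows "rtl".
func scriptDirection(lang string) string {
	base, _, _ := strings.Cut(lang, "-")
	if rtlLanguages[strings.ToLower(base)] {
		return "rtl"
	}
	return "ltr"
}

func main() {
	for _, lang := range []string{"ar", "he", "hi-IN", "en-US"} {
		fmt.Printf("%s: %s\n", lang, scriptDirection(lang))
	}
}
```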

Number Formatting

Configure how numbers are spoken:

// Language-specific number formatting
type NumberFormat struct {
    Language        string
    GroupingSep     string  // "," or "."
    DecimalSep      string  // "." or ","
    SpokenFormat    string  // "one two three" vs "hundred twenty three"
}

var formats = map[string]NumberFormat{
    "en-US": {Language: "en-US", GroupingSep: ",", DecimalSep: ".", SpokenFormat: "natural"},
    "hi-IN": {Language: "hi-IN", GroupingSep: ",", DecimalSep: ".", SpokenFormat: "indian"},
    "de":    {Language: "de", GroupingSep: ".", DecimalSep: ",", SpokenFormat: "natural"},
}
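
To make the difference between the formats concrete, the sketch below groups digits using the separators from the table above. It assumes that the `"indian"` format corresponds to lakh/crore grouping (last three digits, then groups of two); the source does not define it, and `groupDigits` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// groupDigits inserts sep between digit groups of n.
// With indian=false digits are grouped in threes (1,234,567).
// With indian=true the last three digits form one group and the rest are
// grouped in twos (12,34,567), which matches lakh/crore notation.
func groupDigits(n int64, sep string, indian bool) string {
	digits := strconv.FormatInt(n, 10)
	sign := ""
	if strings.HasPrefix(digits, "-") {
		sign, digits = "-", digits[1:]
	}
	if len(digits) <= 3 {
		return sign + digits
	}
	head, tail := digits[:len(digits)-3], digits[len(digits)-3:]
	size := 3
	if indian {
		size = 2
	}
	var groups []string
	for len(head) > size {
		// Peel groups off the right-hand side of head.
		groups = append([]string{head[len(head)-size:]}, groups...)
		head = head[:len(head)-size]
	}
	groups = append([]string{head}, groups...)
	groups = append(groups, tail)
	return sign + strings.Join(groups, sep)
}

func main() {
	fmt.Println(groupDigits(1234567, ",", false)) // en-US: 1,234,567
	fmt.Println(groupDigits(1234567, ",", true))  // hi-IN: 12,34,567
	fmt.Println(groupDigits(1234567, ".", false)) // de:    1.234.567
}
```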

Honorifics

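For languages that distinguish formal from informal address, state the expected register in the prompt and set the formality level:

{
  "language": "hi-IN",
  "prompt": "Always use 'आप' (respectful you) instead of 'तुम'. Address customers with the 'जी' suffix.",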
  "llmConfig": {
    "formalityLevel": "formal"
  }
}

Multi-lingual Agents

Code-Switching Support

Handle users who switch languages mid-conversation:

{
  "agent": {
    "name": "Bilingual Support",
    "primaryLanguage": "en-US",
    "secondaryLanguage": "hi-IN",
    "codeSwitchingEnabled": true,
    "prompt": "You can respond in both English and Hindi. Match the language the customer uses. Hinglish (mixed) is acceptable if the customer uses it."
  }
}

Language Detection

type LanguageDetector struct {
    stt STTProvider
}

func (d *LanguageDetector) Detect(audio []byte) (string, float32) {
    result := d.stt.DetectLanguage(audio)
    return result.Language, result.Confidence
}

// In pipeline
func (p *Pipeline) handleLanguageDetection(audio []byte) {
    lang, confidence := p.langDetector.Detect(audio)

    if confidence > 0.8 && lang != p.currentLanguage {
        p.switchLanguage(lang)
    }
}

func (p *Pipeline) switchLanguage(lang string) {
    // Update STT language
    p.stt.SetLanguage(lang)

    // Update TTS voice
    p.tts.SetVoice(getVoiceForLanguage(lang))

    // Update LLM context
    p.llm.SetLanguageContext(lang)
}
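
The code above calls `getVoiceForLanguage` without defining it. One possible sketch follows, using Azure voice names that appear elsewhere in this document. Which voice serves as the default for each language, and the fallback to an English voice for unknown languages, are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// voiceEntry pairs a language tag with a default voice.
type voiceEntry struct{ lang, voice string }

// voiceTable is ordered by preference so lookups are deterministic.
// The choice of defaults is an assumption made for this sketch.
var voiceTable = []voiceEntry{
	{"en-US", "en-US-JennyNeural"},
	{"hi-IN", "hi-IN-SwaraNeural"},
	{"ta-IN", "ta-IN-PallaviNeural"},
	{"bn-IN", "bn-IN-TanishaaNeural"},
	{"as-IN", "as-IN-YashicaNeural"},
	{"es-MX", "es-MX-DaliaNeural"},
	{"es-ES", "es-ES-ElviraNeural"},
}

// getVoiceForLanguage tries an exact match, then the first entry sharing the
// base language, and finally falls back to the first table entry.
func getVoiceForLanguage(lang string) string {
	for _, e := range voiceTable {
		if strings.EqualFold(e.lang, lang) {
			return e.voice
		}
	}
	base, _, _ := strings.Cut(lang, "-")
	for _, e := range voiceTable {
		if b, _, _ := strings.Cut(e.lang, "-"); strings.EqualFold(b, base) {
			return e.voice
		}
	}
	return voiceTable[0].voice
}

func main() {
	for _, lang := range []string{"hi-IN", "es", "fr"} {
		fmt.Printf("%s -> %s\n", lang, getVoiceForLanguage(lang))
	}
}
```

A slice is used instead of a map so that a bare code such as `es` always resolves to the same regional voice.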

Voice Selection by Language

English Voices

Provider	Voice	Style	Best For
Cartesia	Sophie	Warm, friendly	Customer support
ElevenLabs	Rachel	Professional	Sales
Azure	en-US-JennyNeural	Clear	IVR

Hindi Voices

Provider	Voice	Style	Gender
Azure	hi-IN-SwaraNeural	Conversational	Female
Azure	hi-IN-MadhurNeural	Professional	Male
Google	hi-IN-Wavenet-A	Clear	Female

Spanish Voices

Provider	Voice	Accent	Gender
Azure	es-MX-DaliaNeural	Mexican	Female
Azure	es-ES-ElviraNeural	Castilian	Female
ElevenLabs	Valentina	Neutral	Female

Accuracy Considerations

Word Error Rate (WER) by Language

Language	Deepgram	Google	Azure	ElevenLabs
en-US	8.4%	10.2%	11.5%	12.1%
hi-IN	22%	12%	15%	18%
ta-IN	-	14%	18%	-
as-IN	-	30%	-	20%
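
The table does not say how these figures were measured. WER is conventionally defined as (substitutions + deletions + insertions) divided by the number of words in the reference transcript, which equals the word-level edit distance over the reference length. A minimal sketch for measuring it on your own recordings, with `wordErrorRate` as a hypothetical helper that does no text normalization:

```go
package main

import (
	"fmt"
	"strings"
)

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// wordErrorRate returns the word-level Levenshtein distance between
// reference and hypothesis, divided by the number of reference words.
func wordErrorRate(reference, hypothesis string) float64 {
	ref := strings.Fields(reference)
	hyp := strings.Fields(hypothesis)
	if len(ref) == 0 {
		if len(hyp) == 0 {
			return 0
		}
		return 1 // convention chosen for this sketch
	}
	prev := make([]int, len(hyp)+1)
	for j := range prev {
		prev[j] = j // turning an empty reference into j words costs j insertions
	}
	for i := 1; i <= len(ref); i++ {
		curr := make([]int, len(hyp)+1)
		curr[0] = i // i deletions
		for j := 1; j <= len(hyp); j++ {
			cost := 1
			if ref[i-1] == hyp[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev = curr
	}
	return float64(prev[len(hyp)]) / float64(len(ref))
}

func main() {
	// One substituted word out of four reference words.
	fmt.Printf("%.2f\n", wordErrorRate("please transfer my call", "please transfer the call")) // 0.25
}
```

In practice, lower-casing and stripping punctuation before comparison keeps formatting differences from counting as errors.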

Improving Accuracy

Custom Vocabulary: Add domain-specific terms

{
  "sttConfig": {
    "keywords": ["Edesy:2", "voice agent:2"]
  }
}
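
The entries above appear to follow a `term:boost` pattern. Assuming that reading is correct, and assuming a default boost of 1 when no weight is given, a client could validate them before sending the configuration. `Keyword` and `parseKeyword` are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Keyword is a vocabulary term with its boost weight.
type Keyword struct {
	Term  string
	Boost float64
}

// parseKeyword splits "term:boost" at the last colon.
// A missing boost defaults to 1, which is an assumption of this sketch.
func parseKeyword(s string) (Keyword, error) {
	i := strings.LastIndex(s, ":")
	if i < 0 {
		return Keyword{Term: strings.TrimSpace(s), Boost: 1}, nil
	}
	boost, err := strconv.ParseFloat(strings.TrimSpace(s[i+1:]), 64)
	if err != nil {
		return Keyword{}, fmt.Errorf("keyword %q: invalid boost: %w", s, err)
	}
	return Keyword{Term: strings.TrimSpace(s[:i]), Boost: boost}, nil
}

func main() {
	for _, raw := range []string{"Edesy:2", "voice agent:2"} {
		kw, err := parseKeyword(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%+v\n", kw)
	}
}
```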

Prompt Engineering: Help the LLM understand context

"If the user's speech is unclear, ask them to repeat.
Commonly misheard words in Hindi: 'हां' (yes) and 'नहीं' (no)."

Post-Processing: Correct common errors

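import "strings"

// languageCorrections maps a language code to misrecognitions and their fixes (assumed to be populated elsewhere).
var languageCorrections = map[string]map[string]string{}

// correctTranscript applies every correction registered for language.
// Go randomizes map iteration order, so entries should not overlap.
func correctTranscript(text, language string) string {
    for wrong, right := range languageCorrections[language] {
        text = strings.ReplaceAll(text, wrong, right)
    }
    return text
}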
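
Because Go iterates over maps in random order, the loop above can produce different results between runs when corrections overlap (for example, one wrong phrase that contains another). A possible alternative is a single-pass `strings.Replacer` built with longer phrases first. The helper `newCorrector` and the sample corrections are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// newCorrector builds a single-pass replacer from a corrections map.
// Keys are sorted longest first because strings.NewReplacer compares old
// strings in argument order, so the longest phrase wins at each position.
func newCorrector(corrections map[string]string) *strings.Replacer {
	wrongs := make([]string, 0, len(corrections))
	for w := range corrections {
		if w != "" { // an empty key would match at every position
			wrongs = append(wrongs, w)
		}
	}
	sort.Slice(wrongs, func(i, j int) bool {
		if len(wrongs[i]) != len(wrongs[j]) {
			return len(wrongs[i]) > len(wrongs[j])
		}
		return wrongs[i] < wrongs[j]
	})
	pairs := make([]string, 0, 2*len(wrongs))
	for _, w := range wrongs {
		pairs = append(pairs, w, corrections[w])
	}
	return strings.NewReplacer(pairs...)
}

func main() {
	// Hypothetical misrecognitions, for illustration only.
	c := newCorrector(map[string]string{
		"eddy sea": "Edesy",
		"eddy":     "Eddie",
	})
	fmt.Println(c.Replace("call eddy sea support")) // call Edesy support
}
```

A single pass also means that text produced by one correction is never rewritten by another.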

Next Steps

Hindi Configuration - Detailed Hindi setup
STT Providers - Provider comparison
TTS Providers - Voice selection