Perfect timing for AI conversations. Configure how long the AI waits before responding. No more cutting off callers mid-thought or awkward silences.
VAD Profiles
Fastest Detection
Maximum Patience
Fewer Interruptions
Voice Activity Detection determines when the caller has finished speaking. Get it wrong and conversations feel robotic or frustrating.
VAD Too Aggressive (Short)
"I need to change my order to—"
"I can help you with your order. What would you like to change?"
"Wait, I wasn't done! I was saying I need to change it to a different size..."
AI jumped in too soon, caller frustrated
VAD Just Right
"I need to change my order to... um... let me think... yeah, a medium size instead."
VAD: Detected pause... waiting... caller continued... now done.
"I can change your order to a medium size. Let me update that for you now."
AI waited, heard full request, responded appropriately
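The two scenarios above can be reproduced with a small simulation. This is a minimal sketch, not the platform's actual detector: it assumes a hypothetical stream of (is_speech, duration_ms) frames and uses silence duration alone, with the thresholds this page lists for each profile. Real VAD also relies on audio models and context.

```python
from typing import List, Optional, Tuple

# Silence thresholds in milliseconds, as listed for each profile on this page.
SILENCE_THRESHOLD_MS = {"low_latency": 100, "balanced": 200, "conservative": 350}


def end_of_turn_ms(frames: List[Tuple[bool, int]], profile: str) -> Optional[int]:
    """Return the elapsed time (ms) at which the turn is judged finished.

    `frames` is a hypothetical list of (is_speech, duration_ms) pairs.
    Returns None if the silence threshold is never reached.
    """
    threshold = SILENCE_THRESHOLD_MS[profile]
    elapsed = 0
    silence = 0
    heard_speech = False
    for is_speech, duration in frames:
        elapsed += duration
        if is_speech:
            heard_speech = True
            silence = 0  # caller resumed, so reset the silence counter
        elif heard_speech:
            silence += duration
            if silence >= threshold:
                return elapsed
    return None


# Illustrative call: 1000ms of speech, a 150ms thinking pause,
# 800ms more speech, then 400ms of trailing silence (50ms frames).
EXAMPLE_FRAMES = (
    [(True, 50)] * 20 + [(False, 50)] * 3 + [(True, 50)] * 16 + [(False, 50)] * 8
)

for name in SILENCE_THRESHOLD_MS:
    print(name, end_of_turn_ms(EXAMPLE_FRAMES, name))
```

In this made-up example, the 100ms profile ends the turn during the thinking pause (at 1100ms), while the 200ms and 350ms profiles wait until the caller has actually finished.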
Pre-configured profiles for different conversation needs
Low Latency
100ms silence threshold
Responds as fast as possible. AI jumps in after brief pause.
Best for:
Sales conversations
Fast-paced interactions
Native English speakers
Yes/no question flows
Not ideal for:
Thoughtful responses needed
Indian languages
Elderly callers
Complex explanations
Trade-off: May cut off callers mid-thought, especially if they pause to think.
Balanced
200ms silence threshold
Good balance of responsiveness and patience. Works for most cases.
Best for:
General customer support
Mixed query types
Standard conversations
Most use cases
Not ideal for:
Ultra-low latency needs
Languages with longer pauses
Callers who need time
Trade-off: Middle ground - not the fastest, not the most patient.
Conservative
350ms silence threshold
Waits longer before responding. More patient for complex conversations.
Best for:
Hindi, Gujarati, Marathi
Assamese, Malayalam
Elderly callers
Complex topics requiring thought
Healthcare and sensitive calls
Not ideal for:
Fast-paced sales calls
Time-sensitive queries
Callers who expect instant response
Trade-off: Slightly slower response, but far fewer interruptions.
Different languages have different natural pause patterns
| Language | Recommended Profile |
|---|---|
| English | Low Latency / Balanced |
| Hindi | Conservative |
| Tamil | Balanced / Conservative |
| Telugu | Balanced / Conservative |
| Gujarati | Conservative |
| Marathi | Conservative |
| Bengali | Balanced |
| Kannada | Balanced |
| Malayalam | Conservative |
| Punjabi | Balanced |
| Assamese | Conservative |
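If you pick profiles programmatically, the table can be encoded as a lookup. The sketch below is illustrative only: the function name, the `prefer_patience` flag, and the lowercase language keys are assumptions, and the fallback to "balanced" reflects its status as the default profile.

```python
from typing import Dict, Tuple

# Recommended profiles per language, ordered from fastest to most patient.
RECOMMENDED_PROFILES: Dict[str, Tuple[str, ...]] = {
    "english": ("low_latency", "balanced"),
    "hindi": ("conservative",),
    "tamil": ("balanced", "conservative"),
    "telugu": ("balanced", "conservative"),
    "gujarati": ("conservative",),
    "marathi": ("conservative",),
    "bengali": ("balanced",),
    "kannada": ("balanced",),
    "malayalam": ("conservative",),
    "punjabi": ("balanced",),
    "assamese": ("conservative",),
}

DEFAULT_PROFILE = "balanced"


def recommended_profile(language: str, prefer_patience: bool = True) -> str:
    """Return a profile name for `language` (hypothetical helper).

    When the table lists two options, `prefer_patience` picks the more
    patient one; otherwise the faster one is returned.
    """
    options = RECOMMENDED_PROFILES.get(language.strip().lower())
    if options is None:
        return DEFAULT_PROFILE  # language not in the table
    return options[-1] if prefer_patience else options[0]


print(recommended_profile("Tamil"))
print(recommended_profile("English", prefer_patience=False))
```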
Fine-tune voice detection for your use case
AI distinguishes between 'thinking pauses' and 'I'm done speaking' silence based on context and duration patterns.
Conservative profiles prevent the AI from jumping in while callers are still forming thoughts or mid-sentence.
Different languages have different natural pause patterns. Pre-optimized settings for 24 languages.
Configure for elderly callers, non-native speakers, or other demographics that benefit from more patience.
Different agents can have different VAD profiles. Sales = low latency, support = balanced, healthcare = conservative.
VAD settings can be changed instantly in agent settings. No redeploy needed.
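The per-agent guidance above (sales = low latency, support = balanced, healthcare = conservative, with more patience for some demographics) can be written as a simple rule. This is a sketch under stated assumptions: the function, the agent type strings, and the `callers_need_patience` flag are hypothetical and not part of the product.

```python
# Agent type to profile mapping, taken from the guidance on this page.
AGENT_TYPE_PROFILES = {
    "sales": "low_latency",
    "support": "balanced",
    "healthcare": "conservative",
}


def choose_vad_profile(agent_type: str, callers_need_patience: bool = False) -> str:
    """Pick a VAD profile name for an agent (hypothetical helper).

    `callers_need_patience` covers elderly callers, non-native speakers,
    and similar groups, and always selects the most patient profile.
    """
    if callers_need_patience:
        return "conservative"
    # Unknown agent types fall back to the default profile.
    return AGENT_TYPE_PROFILES.get(agent_type.strip().lower(), "balanced")


print(choose_vad_profile("sales"))
print(choose_vad_profile("sales", callers_need_patience=True))
```

Treat the result as a starting point and adjust after listening to real calls.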
How different industries should configure VAD
Fast responses keep momentum. Sales conversations benefit from energetic pacing where the AI responds quickly.
Note: Set caller expectations for fast responses, and plan a fallback for complex questions.
Support calls have mixed query complexity. Balanced profile handles both quick questions and longer explanations.
Note: Monitor for interruption complaints and adjust if needed.
Patients discussing health concerns need time to explain. Never rush medical conversations.
Note: Callers may be anxious, confused, or elderly. Extra patience is essential.
Indian languages have different pause patterns than English. Conservative profile prevents mid-sentence interruptions.
Note: Test thoroughly with native speakers. Regional accents may vary.
Get voice detection right for better conversations
No Interruptions
AI waits for caller to finish
Natural Flow
Feels like talking to a human
Less Frustration
Callers feel heard
Higher CSAT
Better conversation quality
Per-Agent Control
Different settings per use case
Easy Testing
Switch profiles instantly
Language Flexibility
Optimized for 24 languages
Fewer Escalations
Proper VAD reduces complaints
Simple configuration in agent settings
Agent Configuration (JSON)
Options for "vadProfile": "low_latency", "balanced", "conservative"
{
  "llmConfig": {
    "gemini-live-2.5": {
      "vadProfile": "conservative"
    }
  }
}
"low_latency": 100ms silence
"balanced": 200ms silence (default)
"conservative": 350ms silence
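If you generate agent configuration from code, it helps to validate the profile name before serializing. The sketch below reproduces the structure shown above; the helper name and the idea of building the config in Python are assumptions, and how the JSON is submitted to the platform is not covered here. Note that strict JSON does not allow comments, so the output contains only the keys.

```python
import json

# Profile names accepted by the "vadProfile" setting, per this page.
VALID_VAD_PROFILES = ("low_latency", "balanced", "conservative")


def build_agent_config(vad_profile: str = "balanced",
                       model_key: str = "gemini-live-2.5") -> dict:
    """Build the llmConfig fragment shown above (hypothetical helper)."""
    if vad_profile not in VALID_VAD_PROFILES:
        raise ValueError(
            f"Unknown vadProfile {vad_profile!r}; expected one of {VALID_VAD_PROFILES}"
        )
    return {"llmConfig": {model_key: {"vadProfile": vad_profile}}}


# Serialize to plain JSON for pasting into agent settings.
print(json.dumps(build_agent_config("conservative"), indent=2))
```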
Common questions about Voice Activity Detection
VAD determines when a caller has finished speaking and it's the AI's turn to respond. It measures silence duration after speech to decide if the caller is done or just pausing. Too aggressive (short) = AI interrupts. Too passive (long) = awkward delays. The right VAD profile balances responsiveness with not cutting people off.
Start with 'Balanced' (200ms) for most cases. If you're handling Indian languages (Hindi, Gujarati, Marathi, etc.), elderly callers, or complex topics, use 'Conservative' (350ms). Only use 'Low Latency' (100ms) for fast-paced sales calls with native English speakers who expect rapid responses.
Hindi speakers naturally pause mid-sentence more than English speakers due to sentence structure and thinking patterns. A 100ms silence in Hindi often means 'still thinking' while in English it often means 'done speaking'. Conservative mode (350ms) prevents the AI from jumping in during these natural Hindi pauses.
Yes. VAD profile is an agent setting that takes effect immediately. Change from 'Low Latency' to 'Conservative' in the dashboard and the next call uses the new setting. No code changes or deployment needed.
Signs of VAD too aggressive (too short): callers say 'wait' or 'let me finish', callers repeat themselves, callers seem frustrated. Signs of VAD too passive (too long): awkward silences, callers say 'hello?' thinking AI disconnected, conversations feel slow. Listen to recordings or use browser testing to diagnose.
Often, yes. Sales agents benefit from low-latency (fast, energetic). Support agents work well with balanced. Healthcare or elderly-focused agents need conservative. Configure each agent independently based on its use case.
They're related but different. VAD determines end-of-turn detection (when the AI should start responding). Barge-in is whether the caller can interrupt the AI mid-response. On Gemini Live 2.5, barge-in is always enabled. VAD controls how quickly the AI responds after the caller stops.
Poor connections can add variable latency that affects perceived VAD timing. If you're serving callers with slow connections, lean toward Conservative profile to account for network delays. Browser testing doesn't simulate this - test with actual phone calls in target conditions.
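The warning signs listed above (callers saying 'wait' or 'let me finish' versus callers saying 'hello?') can be roughly counted in call transcripts. This is a heuristic sketch only: it assumes you have caller utterances as plain text with punctuation preserved, and the phrase lists, function name, and `min_hits` threshold are illustrative. It does not replace listening to recordings.

```python
import re
from typing import Iterable

# Phrases taken from the warning signs described on this page.
TOO_AGGRESSIVE_PATTERNS = (r"\bwait\b", r"\blet me finish\b", r"\bi wasn't done\b")
TOO_PASSIVE_PATTERNS = (r"\bhello\?",)


def diagnose_vad(caller_utterances: Iterable[str], min_hits: int = 2) -> str:
    """Classify a set of caller utterances (hypothetical helper).

    Returns "too_aggressive", "too_passive", or "inconclusive".
    Each utterance counts at most once per category.
    """
    aggressive = 0
    passive = 0
    for utterance in caller_utterances:
        text = utterance.lower()
        if any(re.search(p, text) for p in TOO_AGGRESSIVE_PATTERNS):
            aggressive += 1
        if any(re.search(p, text) for p in TOO_PASSIVE_PATTERNS):
            passive += 1
    if aggressive >= min_hits and aggressive > passive:
        return "too_aggressive"
    if passive >= min_hits and passive > aggressive:
        return "too_passive"
    return "inconclusive"


print(diagnose_vad(["Wait, I wasn't done!", "Let me finish please", "A medium size."]))
```

A "too_aggressive" result suggests trying a more patient profile; "too_passive" suggests a faster one.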
Get voice detection right for natural, frustration-free conversations
Real demo calls showcasing low latency and natural conversations in multiple Indian languages
AI voice agent qualifying B2B leads for corporate gifting. Ultra-low latency with 1-2 second response time. Bilingual conversation in Hindi and English.
AI voice agent handling admission inquiries and appointment booking for educational institutes in Malayalam language.
AI voice agent handling admission inquiries and appointment booking for educational institutes in Tamil language.
AI voice agent qualifying leads for solar installation company in Assamese language. Natural conversation flow with product inquiry handling.
AI voice bot helping patients book hospital appointments in Bengali. Natural conversation with availability checking and confirmation.
AI voice bot helping patients book hospital appointments in Hindi. Handles doctor selection, time slot booking, and confirmation.
AI voice bot helping patients book hospital appointments in Telugu. Natural conversation flow for healthcare scheduling.
Best AI voice agent pricing worldwide - from ₹4/min ($0.04) | 40% more affordable than US alternatives