Provider Costs Breakdown
Approximate per-minute cost tiers for each provider, ranked from Lowest to Premium.
STT Providers
| Provider | Cost/min | Best For |
|---|---|---|
| Deepgram | Lowest | English, speed |
| OpenAI Whisper | Low | Quality |
| ElevenLabs Scribe | Low | Multilingual |
| AssemblyAI | Medium | English |
| Azure | Medium | Enterprise |
| Google Chirp | Higher | Accuracy |
TTS Providers
| Provider | Cost/min | Best For |
|---|---|---|
| Deepgram | Lowest | Speed |
| Sarvam | Low | Indian languages |
| Cartesia | Low | Low latency |
| Azure | Medium | Regional languages |
| OpenAI | Medium | Multilingual |
| Google | Medium | General |
| ElevenLabs | Premium | Highest quality |
LLM Providers
| Provider | Cost/min | Best For |
|---|---|---|
| Gemini 2.5 Flash-Lite | Lowest | Real-time voice |
| Gemini 2.0 Flash | Low | Indian languages |
| GPT-4o-mini | Low | Budget quality |
| Claude 3 Haiku | Low | Fast |
| GPT-4o | Medium | Best reasoning |
| Claude 3.5 Sonnet | Medium | Complex tasks |
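The tiers in the STT, TTS, and LLM tables above are qualitative, but they can still be compared programmatically. The following is a minimal sketch, not part of any provider SDK: the names `TIER_RANK`, `PROVIDERS`, `cheapest`, and `within_budget` are hypothetical, and the ordering of "Higher" below "Premium" is an assumption, since the two labels never appear in the same table.

```python
# Ordinal ranks for the qualitative tiers (lower = cheaper).
# Assumption: "Premium" ranks above "Higher"; the tables never compare them directly.
TIER_RANK = {"Lowest": 0, "Low": 1, "Medium": 2, "Higher": 3, "Premium": 4}

# Tier labels copied from the STT, TTS, and LLM tables.
PROVIDERS = {
    "stt": {
        "Deepgram": "Lowest",
        "OpenAI Whisper": "Low",
        "ElevenLabs Scribe": "Low",
        "AssemblyAI": "Medium",
        "Azure": "Medium",
        "Google Chirp": "Higher",
    },
    "tts": {
        "Deepgram": "Lowest",
        "Sarvam": "Low",
        "Cartesia": "Low",
        "Azure": "Medium",
        "OpenAI": "Medium",
        "Google": "Medium",
        "ElevenLabs": "Premium",
    },
    "llm": {
        "Gemini 2.5 Flash-Lite": "Lowest",
        "Gemini 2.0 Flash": "Low",
        "GPT-4o-mini": "Low",
        "Claude 3 Haiku": "Low",
        "GPT-4o": "Medium",
        "Claude 3.5 Sonnet": "Medium",
    },
}


def cheapest(stage: str) -> list[str]:
    """Return every provider in the stage that shares the lowest tier."""
    tiers = PROVIDERS[stage]
    best = min(TIER_RANK[tier] for tier in tiers.values())
    return [name for name, tier in tiers.items() if TIER_RANK[tier] == best]


def within_budget(stage: str, max_tier: str) -> list[str]:
    """Return providers in the stage whose tier is at or below max_tier."""
    limit = TIER_RANK[max_tier]
    return [
        name
        for name, tier in PROVIDERS[stage].items()
        if TIER_RANK[tier] <= limit
    ]


if __name__ == "__main__":
    for stage in PROVIDERS:
        print(stage, cheapest(stage), within_budget(stage, "Low"))
```

Both helpers return lists rather than a single name because several providers share a tier; the "Best For" column is what breaks the tie in practice.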
Native Audio Models
| Model | Cost/min | Notes |
|---|---|---|
| Gemini Live 2.0 | Low | Standard |
| Gemini Live 2.5 HD | Low | Premium features |
| OpenAI Realtime Mini | Lowest | Budget |
| OpenAI Realtime | Medium | Premium |
Cost Optimization Tips
Budget-Conscious Setup
STT: Deepgram
LLM: Gemini 2.5 Flash-Lite
TTS: Sarvam or similar
────────────────────────────────────
Total: Very low cost per minute
Quality-First Setup
STT: ElevenLabs Scribe
LLM: GPT-4o
TTS: ElevenLabs
────────────────────────────────────
Total: Higher cost, premium quality
Native Audio (Lowest Latency)
Gemini Live 2.5 HD
(Includes voice input and output)
Lowest latency, competitive cost
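The three setups above can be captured as named presets so an application can switch between them by priority. This is a hedged sketch only: `VoiceSetup`, `SETUPS`, and `pick_setup` are hypothetical names, not part of any provider's API, and "Sarvam" stands in for the "Sarvam or similar" TTS choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VoiceSetup:
    """One provider combination (hypothetical structure for illustration)."""
    name: str
    stt: Optional[str] = None
    llm: Optional[str] = None
    tts: Optional[str] = None
    native_audio: Optional[str] = None  # single model handling voice in and out

    def provider_hops(self) -> int:
        """Count the separate models a conversation turn passes through."""
        if self.native_audio is not None:
            return 1
        return sum(part is not None for part in (self.stt, self.llm, self.tts))


# Presets copied from the three setups listed above.
SETUPS = {
    "budget": VoiceSetup(
        name="Budget-Conscious Setup",
        stt="Deepgram",
        llm="Gemini 2.5 Flash-Lite",
        tts="Sarvam",  # the text says "Sarvam or similar"
    ),
    "quality": VoiceSetup(
        name="Quality-First Setup",
        stt="ElevenLabs Scribe",
        llm="GPT-4o",
        tts="ElevenLabs",
    ),
    "latency": VoiceSetup(
        name="Native Audio (Lowest Latency)",
        native_audio="Gemini Live 2.5 HD",
    ),
}


def pick_setup(priority: str) -> VoiceSetup:
    """Look up a preset by priority: 'budget', 'quality', or 'latency'."""
    try:
        return SETUPS[priority]
    except KeyError:
        options = ", ".join(sorted(SETUPS))
        raise ValueError(f"unknown priority {priority!r}; expected one of: {options}")


if __name__ == "__main__":
    for key in SETUPS:
        setup = pick_setup(key)
        print(f"{setup.name}: {setup.provider_hops()} model(s) per turn")
```

The `provider_hops` count reflects why the native audio option is listed as lowest latency: one model handles both voice input and output, whereas the other two setups chain three separate providers.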