Custom Voice Stack Assembly (Self-Rolled)
Bespoke voice agent stack assembled from individual OSS components (Whisper, Llama, Coqui, Sarvam, custom).
Flat fee: Rs 2,99,999
Delivery timeline: 3-4 weeks
Deliverables: 10
GST invoice included
Razorpay Verified · Fixed Price · Scoping Call First
For sophisticated buyers who don't want a framework like Pipecat or LiveKit and instead want individual components stitched together for maximum control: voice as a core product capability, not a feature bought from a vendor. We pick the right STT (Whisper / Sarvam / AI4Bharat / custom), LLM (Llama / Mistral via Ollama, GPT-4 / Anthropic via cloud API, or hybrid), TTS (Coqui / ElevenLabs / AI4Bharat / custom), telephony layer (Twilio / Plivo / Exotel / custom SIP), and orchestration approach (state machine, LLM-driven, or hybrid). Then we build the whole assembly in your repo, with no framework dependency. This is the highest-complexity, longest-timeline, most deeply customized package in the catalog.
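To make "individual components stitched together" concrete, here is a minimal sketch of what a framework-free assembly could look like: each stage sits behind a small interface so that one STT, LLM, or TTS choice can be swapped for another. All names here (`SpeechToText`, `VoicePipeline`, the stand-in classes) are hypothetical illustrations, not the code delivered in the engagement, and the stand-ins replace real vendor SDKs so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Protocol


class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LanguageModel(Protocol):
    def reply(self, transcript: str) -> str: ...


class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoicePipeline:
    """One conversational turn: audio in -> text -> answer -> audio out."""
    stt: SpeechToText
    llm: LanguageModel
    tts: TextToSpeech

    def handle_turn(self, audio_in: bytes) -> bytes:
        transcript = self.stt.transcribe(audio_in)
        answer = self.llm.reply(transcript)
        return self.tts.synthesize(answer)


# Stand-in components (assumptions) so the sketch runs without any vendor SDK.
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class CannedLLM:
    def reply(self, transcript: str) -> str:
        return f"You said: {transcript}"


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


pipeline = VoicePipeline(stt=EchoSTT(), llm=CannedLLM(), tts=BytesTTS())
audio_out = pipeline.handle_turn(b"namaste")
print(audio_out)
```

Because the pipeline depends only on the three interfaces, replacing a stage (say, a local model for a cloud one) would touch a single adapter class rather than the call flow.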
What's included
Every item below is delivered before final payment.
- Component selection report with rationale
- Custom voice stack code in your Git repo (no framework dependency)
- STT integration (Whisper / Sarvam / custom)
- LLM integration (local / cloud / hybrid)
- TTS integration (Coqui / ElevenLabs / custom)
- Telephony layer (Twilio / Plivo / Exotel / custom SIP)
- Orchestration code (state machine or LLM-driven)
- Monitoring + alerting
- Deployment runbook + architecture documentation
- 60 days post-launch support
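As an illustration of the orchestration deliverable, the sketch below shows one way a state-machine approach with an optional LLM fallback (the hybrid option) might be structured. The states, event names, and the `llm_fallback` hook are assumptions made up for this example; a real call flow would be defined during scoping.

```python
from enum import Enum, auto


class State(Enum):
    GREETING = auto()
    COLLECTING = auto()
    CONFIRMING = auto()
    DONE = auto()


# Explicit transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.GREETING, "caller_spoke"): State.COLLECTING,
    (State.COLLECTING, "details_complete"): State.CONFIRMING,
    (State.CONFIRMING, "confirmed"): State.DONE,
    (State.CONFIRMING, "rejected"): State.COLLECTING,
}


def next_state(state, event, llm_fallback=None):
    """Use the table when it has an entry; otherwise defer to the
    optional fallback (hybrid mode) or stay in the current state."""
    key = (state, event)
    if key in TRANSITIONS:
        return TRANSITIONS[key]
    if llm_fallback is not None:
        return llm_fallback(state, event)
    return state


def run(events, llm_fallback=None):
    state = State.GREETING
    for event in events:
        state = next_state(state, event, llm_fallback)
    return state


final = run(["caller_spoke", "details_complete", "rejected",
             "details_complete", "confirmed"])
print(final)
```

A pure state machine keeps every transition auditable; the fallback hook is where an LLM-driven decision could be slotted in for inputs the table does not cover.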
What's in scope (and what isn't)
Honest framing of the engagement. Self-hosted deployments give you control — and the ongoing infra bills come with that control.
In scope:
- Architecture design and infrastructure provisioning runbook
- Stack setup and configuration in your repo
- Voice agent code, integration code, and webhook handlers
- India language tuning (Hindi default; regional + niche on add-on)
- Production hardening: error handling, retry logic, monitoring
- Deployment + post-launch support window
Not in scope:
- Cloud infrastructure account (AWS / GCP / Azure / your own)
- LLM API account and bills (OpenAI / Anthropic / Gemini / etc.)
- STT API account and bills (Deepgram / Sarvam / Azure / etc.)
- TTS API account and bills (ElevenLabs / Cartesia / Sarvam / etc.)
- Telephony account and bills (Twilio / Plivo / Exotel / etc.)
- Ongoing infrastructure operations (covered if you add the maintenance retainer)
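The "retry logic" part of production hardening could, for example, take the form of a small wrapper around calls to external STT, LLM, TTS, or telephony APIs. The sketch below uses exponential backoff with jitter; the function name `call_with_retry`, its defaults, and the choice of which exceptions to retry are assumptions for illustration only.

```python
import random
import time


def call_with_retry(fn, *, attempts=3, base_delay=0.2, max_delay=2.0,
                    retry_on=(ConnectionError, TimeoutError),
                    sleep=time.sleep):
    """Call fn(); on a retryable error wait and try again.

    The delay doubles per attempt (capped at max_delay), with random
    jitter added so many callers do not retry in lockstep.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))


# Demo with a hypothetical flaky upstream that fails twice, then succeeds.
calls = {"count": 0}


def flaky_upstream():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("upstream unavailable")
    return "ok"


result = call_with_retry(flaky_upstream, sleep=lambda seconds: None)
print(result, calls["count"])
```

Passing `sleep` as a parameter keeps the wrapper testable without real waiting; in a live voice call the retry budget would need to stay small, since the caller is waiting on the line.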
Best for: deep-tech teams building voice as a core product capability who reject framework abstractions.
Plus 18% GST at Razorpay checkout. Add your GSTIN to claim Input Tax Credit. Per-minute platform usage billed separately on prepaid wallet at voice-agent.edesy.in.
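For budgeting, the checkout amount can be worked out from the flat fee and the 18% GST rate quoted on this page. The snippet below is only an arithmetic sketch; the helper name `gst_breakdown` is hypothetical, and the rounding on the actual invoice may differ.

```python
from decimal import Decimal, ROUND_HALF_UP

BASE_FEE = Decimal("299999")  # Rs 2,99,999 flat fee quoted on this page
GST_RATE = Decimal("0.18")    # 18% GST added at checkout


def gst_breakdown(base, rate=GST_RATE):
    """Return (gst_amount, total_payable), rounded to paise."""
    gst = (base * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return gst, base + gst


gst, total = gst_breakdown(BASE_FEE)
print(f"GST: Rs {gst}, total at checkout: Rs {total}")
```

Under these assumptions the GST comes to Rs 53,999.82 and the checkout total to Rs 3,53,998.82. Per-minute platform usage is billed separately and is not part of this figure.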
How this works
Buy Now triggers Razorpay checkout. Inquire First books a scoping call.
Confirm fit, deliverables, success criteria, and go-live date. No payment is due yet on the Inquire First path.
Rs 2,99,999 + 18% GST. Add GSTIN to claim Input Tax Credit.
3-4 weeks from kickoff. Weekly check-ins; daily near launch.
Acceptance review, knowledge transfer, runbook handover.
Related Services
Other OSS packages you might need
- Rs 1,49,999 (2 weeks)
- Rs 1,49,999 (2 weeks)
- Rs 99,999 (10 days)
Ready to start Custom Voice Stack Assembly (Self-Rolled)?
Buy directly via Razorpay (GST invoice included), or talk to sales first if you need a custom scope.