Build custom voice AI pipelines with Pipecat's open-source framework. Modular, production-ready components for real-time audio processing, STT, LLM orchestration, and TTS -- assembled exactly the way your application needs.
Real-Time Processing
Open Source
Modular Architecture
Trusted by businesses worldwide
Assemble voice AI pipelines from interchangeable components. Swap STT, LLM, and TTS providers without changing application logic. Each component is a self-contained processor in the audio pipeline.
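As a minimal sketch of this idea (illustrative stand-ins, not Pipecat's actual API), interchangeable components can be modeled as processors that share one interface, so swapping a provider changes only the component list:

```python
from typing import Protocol


class Processor(Protocol):
    """One stage in the pipeline: takes a frame dict, returns a frame dict."""
    def process(self, frame: dict) -> dict: ...


class DeepgramSTT:
    def process(self, frame: dict) -> dict:
        # Stand-in transcription; a real STT service would decode frame["audio"].
        return {**frame, "transcript": "hello"}


class WhisperSTT:
    def process(self, frame: dict) -> dict:
        return {**frame, "transcript": "hello"}


class EchoLLM:
    def process(self, frame: dict) -> dict:
        return {**frame, "reply": f"You said: {frame['transcript']}"}


def run_pipeline(processors: list[Processor], frame: dict) -> dict:
    """Application logic only knows about the pipeline, never the vendors."""
    for p in processors:
        frame = p.process(frame)
    return frame


# Swapping DeepgramSTT for WhisperSTT changes one entry in this list:
result = run_pipeline([DeepgramSTT(), EchoLLM()], {"audio": b""})
```

Because every stage satisfies the same `process` contract, the pipeline code above never changes when a provider does.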
Pipecat processes audio at the frame level, enabling precise control over timing, interruptions, and turn-taking, and keeping latency low enough for natural, responsive voice interactions.
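To make the frame-level model concrete, here is a hedged sketch of chunking audio into short frames (the 16 kHz rate and 20 ms frame size are illustrative choices, not Pipecat defaults):

```python
SAMPLE_RATE = 16_000  # samples per second (illustrative)
FRAME_MS = 20         # frame duration in milliseconds (illustrative)
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000  # 320 samples per frame


def split_into_frames(pcm: list[int]) -> list[list[int]]:
    """Chop a PCM buffer into fixed-duration frames for streaming.

    Downstream stages receive audio every 20 ms instead of waiting for a
    whole utterance, so they can react to an interruption within a frame
    or two rather than after seconds of buffered speech.
    """
    return [pcm[i:i + SAMPLES_PER_FRAME]
            for i in range(0, len(pcm), SAMPLES_PER_FRAME)]


# One second of audio becomes 50 frames of 20 ms each.
frames = split_into_frames([0] * SAMPLE_RATE)
```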
Fully open-source with an active community. Inspect the pipeline internals, add custom processors, contribute improvements, and build on a foundation that evolves with the voice AI ecosystem.
Built-in integrations with Deepgram, OpenAI, ElevenLabs, Azure, Google, Anthropic, and more. Switch providers by changing a single configuration without rewriting pipeline code.
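One way to picture the single-configuration swap (a hypothetical registry pattern; the class names are stand-ins, not Pipecat's real service classes):

```python
class DeepgramSTT:
    name = "deepgram"


class WhisperSTT:
    name = "whisper"


# Registry mapping config strings to STT components (illustrative).
STT_PROVIDERS = {"deepgram": DeepgramSTT, "whisper": WhisperSTT}


def build_stt(config: dict):
    """Instantiate whichever STT stage the config names.

    Pipeline code downstream never changes; only the config value does.
    """
    try:
        return STT_PROVIDERS[config["stt"]]()
    except KeyError:
        raise ValueError(f"Unknown STT provider: {config['stt']!r}")


config = {"stt": "whisper"}  # the one line that changes per provider swap
stt = build_stt(config)
```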
Battle-tested error handling, automatic reconnection, graceful degradation, and comprehensive logging. Built for production workloads with the reliability voice AI requires.
Write custom pipeline processors for domain-specific logic -- sentiment analysis, compliance filtering, dynamic prompt injection, or audio watermarking -- and insert them anywhere in the pipeline.
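For example, a compliance filter placed between the STT and LLM stages could redact card-number-like digit runs from transcripts before the model ever sees them. This is a hypothetical processor with a deliberately simple regex, not a production redactor:

```python
import re


class ComplianceFilter:
    """Redacts 13-16 digit card-number-like runs from transcript frames.

    Inserted between STT and LLM, so raw sensitive data never reaches
    the model. The pattern is intentionally simplistic for illustration.
    """
    CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")

    def process(self, frame: dict) -> dict:
        text = frame.get("transcript", "")
        return {**frame, "transcript": self.CARD_RE.sub("[REDACTED]", text)}


frame = {"transcript": "My card is 4111 1111 1111 1111 thanks"}
clean = ComplianceFilter().process(frame)
# clean["transcript"] == "My card is [REDACTED] thanks"
```

Because the filter implements the same processor contract as every other stage, it can be dropped into the pipeline list at any position without touching the surrounding stages.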
Frame Processing
Open Source
Provider Components
Production Ready
Define your voice AI pipeline by selecting and ordering components: transport (telephony), STT, LLM, TTS, and any custom processors for your specific use case.
Pipecat connects to your telephony provider (Twilio, Exotel, Jambonz, or WebRTC) via transport adapters. Audio frames flow into the pipeline from the transport layer.
Audio frames flow through the pipeline in real time. Each processor handles its function -- transcription, language understanding, response generation, speech synthesis -- passing frames to the next stage.
Synthesized audio frames are streamed back through the transport layer to the caller. Pipecat handles interruption detection, turn-taking, and audio buffering automatically.
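The four steps above can be sketched as a single toy loop (event names and behavior are illustrative assumptions, not Pipecat internals): frames arrive from the transport, flow through a stand-in STT-LLM-TTS chain, and stream back out, with queued bot audio dropped when the caller interrupts.

```python
def run_call(events: list[dict]) -> str:
    """Toy end-to-end loop: transport in -> STT -> LLM -> TTS -> transport out."""
    outbound: list[str] = []  # synthesized "audio frames" awaiting streaming
    sent: list[str] = []
    for ev in events:
        if ev["type"] == "user_speech":
            outbound.clear()                # interruption: drop queued bot audio
            reply = f"echo:{ev['text']}"    # stand-in for the STT -> LLM -> TTS chain
            outbound = list(reply)          # pretend each character is one frame
        elif ev["type"] == "tick" and outbound:
            sent.append(outbound.pop(0))    # stream the next frame to the caller
    return "".join(sent)


# The caller interrupts after only two frames of the first reply have played:
out = run_call([
    {"type": "user_speech", "text": "hi"},
    {"type": "tick"}, {"type": "tick"},
    {"type": "user_speech", "text": "stop"},
    *[{"type": "tick"}] * 20,
])
# out == "ec" + "echo:stop", i.e. the first reply is cut off mid-stream
```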
"Pipecat's modular architecture let us add a custom compliance filter between STT and LLM. Every transcription is checked for sensitive data before the AI sees it. Could not do this with a black-box solution."
Custom Compliance Layer
Financial Services
CTO
"We benchmark different STT and TTS providers monthly. With Pipecat, swapping Deepgram for Whisper or ElevenLabs for Azure is a one-line config change. Keeps us on the cutting edge."
1-Line Provider Swap
Technology
Lead Developer
Resources to help you evaluate and implement