Interruption Handling
Interruption handling (also known as barge-in) allows users to speak while the bot is talking, creating a more natural conversation experience.
Why Interruptions Matter
Without interruption handling:
─────────────────────────────────────────────────────────────
Bot: "I can help you with that. Your order number one two three
four five was shipped on December twentieth and is currently
in transit. The estimated delivery date is December twenty..."
User: "WAIT! That's the wrong order!"
Bot: "...ninth. Is there anything else I can help you with?"
← Bot continues, ignoring user
With interruption handling:
─────────────────────────────────────────────────────────────
Bot: "I can help you with that. Your order number one two three—"
User: "Wait, that's the wrong order!"
Bot: [Stops immediately] "I apologize. What's the correct order number?"
← Natural conversation
How Interruption Detection Works
                  ┌─────────────────────────────────────────┐
                  │                Pipeline                 │
                  │                                         │
User Audio ──────►│──► VAD ─────────────────────────────────│
                  │         │                               │
                  │         ▼                               │
                  │         Is bot speaking?                │
                  │         │                               │
                  │         ├── No:  Normal processing      │
                  │         │                               │
                  │         └── Yes: INTERRUPTION!          │
                  │                    │                    │
Bot Speaking ◄────│◄───────────────────┤                    │
                  │                    ▼                    │
                  │            1. Stop TTS playback         │
                  │            2. Cancel LLM generation     │
                  │            3. Clear audio queue         │
                  │            4. Process user speech       │
                  │                                         │
                  └─────────────────────────────────────────┘
Configuration
Enable/Disable Per Agent
{
"agent": {
"name": "Customer Support",
"allowInterruptions": true,
"interruptionConfig": {
"minSpeechDuration": 200,
"confidenceThreshold": 0.85
}
}
}
Configuration Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| `allowInterruptions` | bool | `true` | Enable barge-in |
| `minSpeechDuration` | int | `200` | Minimum speech duration (ms) before interrupting |
| `confidenceThreshold` | float | `0.8` | VAD confidence required to trigger |
| `ignoreFillerWords` | bool | `true` | Don't interrupt for "um", "uh" |
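For reference, here is one way these options could map onto a Go config struct. This is a sketch; the struct and field names are assumptions, not the framework's actual API.

// Hypothetical mapping of the JSON options above onto Go types.
type InterruptionConfig struct {
    MinSpeechDuration   int     `json:"minSpeechDuration"`   // milliseconds
    ConfidenceThreshold float32 `json:"confidenceThreshold"` // 0.0 to 1.0
    IgnoreFillerWords   bool    `json:"ignoreFillerWords"`
}

type AgentConfig struct {
    Name               string             `json:"name"`
    AllowInterruptions bool               `json:"allowInterruptions"`
    InterruptionConfig InterruptionConfig `json:"interruptionConfig"`
}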
Implementation
Interruption Handler
type InterruptionHandler struct {
pipeline *Pipeline
enabled bool
minDuration time.Duration
threshold float32
callback func()
}
func (h *InterruptionHandler) OnVADEvent(event VADEvent) {
if !h.enabled {
return
}
if event.Type != SpeechStart {
return
}
// Check if bot is currently speaking
if !h.pipeline.IsBotSpeaking() {
return
}
// Wait for the minimum speech duration to filter out false positives.
// Note: sleeping blocks this event path; a non-blocking, timestamp-based
// variant is shown under "Speech Duration Check" below.
time.Sleep(h.minDuration)
// Re-check if still speaking
if !h.pipeline.IsUserSpeaking() {
return // Brief noise, not real interruption
}
// Trigger interruption
h.triggerInterruption()
}
func (h *InterruptionHandler) triggerInterruption() {
log.Debug("Interruption detected, stopping bot output")
// 1. Stop audio playback immediately
h.pipeline.ClearAudioQueue()
// 2. Cancel LLM generation
h.pipeline.CancelLLMGeneration()
// 3. Stop TTS synthesis
h.pipeline.StopTTS()
// 4. Mark bot as not speaking
h.pipeline.SetBotSpeaking(false)
// 5. Enable STT for user speech
h.pipeline.UnmuteStt()
// 6. Notify callback
if h.callback != nil {
h.callback()
}
// Track metric
metrics.RecordCounter("pipeline.interruptions", 1)
}
Pipeline Integration
func (p *Pipeline) setupInterruptionHandling() {
if !p.config.AllowInterruptions {
return
}
handler := &InterruptionHandler{
pipeline: p,
enabled: true,
minDuration: time.Duration(p.config.MinSpeechDuration) * time.Millisecond,
threshold: p.config.InterruptionThreshold,
callback: p.onInterruption,
}
p.vadProcessor.SetInterruptionHandler(handler)
}
func (p *Pipeline) onInterruption() {
// Add context for LLM about interruption
p.context.AddEvent(InterruptionEvent{
Timestamp: time.Now(),
BotWasSaying: p.lastBotUtterance,
})
}
Audio Queue Management
When interrupted, clear pending audio immediately:
type AudioQueue struct {
queue [][]byte
mu sync.Mutex
playing bool
}
func (q *AudioQueue) Clear() {
q.mu.Lock()
defer q.mu.Unlock()
q.queue = nil
q.playing = false
}
func (q *AudioQueue) Enqueue(audio []byte) {
q.mu.Lock()
defer q.mu.Unlock()
q.queue = append(q.queue, audio)
}
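The queue also needs a consumer. A minimal sketch of the Dequeue a playback loop might call (not part of the snippet above; shown for completeness):

// Hypothetical consumer: returns the next chunk, or false when the queue
// is empty, e.g. right after Clear() runs during an interruption.
func (q *AudioQueue) Dequeue() ([]byte, bool) {
    q.mu.Lock()
    defer q.mu.Unlock()
    if len(q.queue) == 0 {
        q.playing = false
        return nil, false
    }
    chunk := q.queue[0]
    q.queue = q.queue[1:]
    q.playing = true
    return chunk, true
}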
// In telephony provider
func (p *TwilioProvider) ClearPlayback() {
// Clear local queue
p.audioQueue.Clear()
// Send clear message to Twilio
p.sendMessage(TwilioMessage{
Event: "clear",
StreamSid: p.streamSid,
})
}
LLM Cancellation
Cancel in-flight LLM requests:
type LLMProcessor struct {
cancelFunc context.CancelFunc
generating bool
}
func (l *LLMProcessor) Cancel() {
if l.cancelFunc != nil {
l.cancelFunc()
}
l.generating = false
}
func (l *LLMProcessor) Generate(ctx context.Context, messages []Message) <-chan string {
// Create cancellable context
ctx, l.cancelFunc = context.WithCancel(ctx)
l.generating = true
tokenChan := make(chan string)
go func() {
defer close(tokenChan)
defer func() { l.generating = false }()
for token := range l.llm.StreamGenerate(ctx, messages) {
select {
case <-ctx.Done():
return // Cancelled
case tokenChan <- token:
}
}
}()
return tokenChan
}
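A usage sketch tying the two together: the response loop drains the token channel, and Cancel (called from triggerInterruption) unblocks the generator goroutine through ctx.Done(). The tts.Speak call is a placeholder for whatever feeds TTS in your pipeline:

// Hypothetical call site: stream tokens into TTS until done or cancelled.
func (p *Pipeline) respond(ctx context.Context, messages []Message) {
    for token := range p.llm.Generate(ctx, messages) {
        p.tts.Speak(token) // placeholder TTS entry point
    }
    // The channel closes on completion or after Cancel(), ending the loop.
}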
Handling Interruption Context
Help the LLM understand the interruption:
func (p *Pipeline) processInterruption(userSpeech string) {
// What the bot was saying when interrupted
botContext := p.lastBotUtterance
// Build context message for LLM
interruptContext := fmt.Sprintf(
"[User interrupted. Bot was saying: \"%s\". User said: \"%s\"]",
truncate(botContext, 100),
userSpeech,
)
// Add to message history
p.context.AddMessage(Message{
Role: "system",
Content: interruptContext,
})
}
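The truncate helper isn't shown above; a straightforward implementation:

// Truncate to at most n runes, appending an ellipsis when cut short.
func truncate(s string, n int) string {
    r := []rune(s)
    if len(r) <= n {
        return s
    }
    return string(r[:n]) + "..."
}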
LLM Prompt for Handling Interruptions
systemPrompt := `You are a customer support agent.
INTERRUPTION HANDLING:
- If user interrupts, acknowledge briefly and address their concern
- Don't repeat what you were saying
- Stay focused on what the user wants
- Examples:
* "Got it, let me help with that instead."
* "Of course, what would you like to know?"
* "Sure, I'll look into that for you."
DO NOT:
- Say "I was saying..." or "As I was mentioning..."
- Apologize excessively for being interrupted
- Repeat the interrupted content`
Avoiding False Positives
Filler Word Detection
func (h *InterruptionHandler) isFillerWord(transcript string) bool {
fillers := []string{
"um", "uh", "hmm", "ah", "er", "like",
"you know", "i mean", "okay", "right",
}
lower := strings.ToLower(strings.TrimSpace(transcript))
for _, filler := range fillers {
if lower == filler {
return true
}
}
return false
}
func (h *InterruptionHandler) OnTranscript(transcript TranscriptEvent) {
if h.isFillerWord(transcript.Text) {
return // Don't interrupt for filler words
}
// Proceed with interruption handling
if transcript.IsFinal && h.pendingInterruption {
h.confirmInterruption()
}
}
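confirmInterruption isn't defined in these snippets; a plausible implementation clears the pending flag and reuses the trigger path from earlier:

// Hypothetical glue between the transcript/VAD checks and triggerInterruption.
func (h *InterruptionHandler) confirmInterruption() {
    if !h.pendingInterruption {
        return
    }
    h.pendingInterruption = false
    h.triggerInterruption()
}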
Speech Duration Check
func (h *InterruptionHandler) OnVADEvent(event VADEvent) {
if event.Type == SpeechStart {
h.speechStartTime = time.Now()
h.pendingInterruption = true
}
if event.Type == SpeechEnd {
duration := time.Since(h.speechStartTime)
if duration < h.minDuration {
// Too short, likely noise
h.pendingInterruption = false
return
}
// Real interruption
h.confirmInterruption()
}
}
Gemini Live Interruptions
Gemini Live has built-in interruption handling:
func (g *GeminiLiveClient) HandleInterruption() error {
// Send turn complete signal
msg := map[string]any{
"clientContent": map[string]any{
"turnComplete": true,
},
}
return g.conn.WriteJSON(msg)
}
// Gemini Live automatically:
// 1. Stops generating audio
// 2. Clears pending output
// 3. Listens for new user input
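On the receive side, the Live API signals barge-ins itself. Below is a sketch of reading that signal; the serverContent.interrupted field follows the BidiGenerateContent message format, but treat the exact shape (and the audioQueue field) as assumptions to verify against the current API docs.

// Sketch: watch for the server's interruption signal and drop queued audio.
func (g *GeminiLiveClient) readLoop() error {
    for {
        var msg struct {
            ServerContent struct {
                Interrupted bool `json:"interrupted"`
            } `json:"serverContent"`
        }
        if err := g.conn.ReadJSON(&msg); err != nil {
            return err
        }
        if msg.ServerContent.Interrupted {
            g.audioQueue.Clear() // assumed local queue; mirrors ClearPlayback()
        }
    }
}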
Best Practices
1. Buffer Before Confirming
Wait briefly before confirming interruption:
// Wait 200ms of continuous speech before interrupting
if speechDuration > 200*time.Millisecond {
triggerInterruption()
}
2. Don't Interrupt During Critical Info
func (p *Pipeline) ShouldAllowInterruption() bool {
// Don't allow during confirmation
if p.state == StateConfirmingAction {
return false
}
// Don't allow during sensitive info
if p.lastMessage.ContainsSensitiveInfo {
return false
}
return p.config.AllowInterruptions
}
3. Track Interruption Patterns
type InterruptionMetrics struct {
TotalInterruptions int
FalsePositives int
AveragePosition float64 // % into bot utterance
CommonPhrases map[string]int
}
func (m *InterruptionMetrics) Record(event InterruptionEvent) {
    m.TotalInterruptions++
    // Incremental running mean of how far into its utterance the bot was
    position := float64(event.BotPosition) / float64(event.BotTotalLength)
    m.AveragePosition += (position - m.AveragePosition) / float64(m.TotalInterruptions)
    if m.CommonPhrases == nil {
        m.CommonPhrases = make(map[string]int)
    }
    m.CommonPhrases[event.UserFirstWords]++
}
Troubleshooting
| Issue | Cause | Solution |
|---|---|---|
| Too many false interruptions | `minSpeechDuration` or `confidenceThreshold` too low | Increase `minSpeechDuration` |
| Missed interruptions | `confidenceThreshold` too high | Decrease `confidenceThreshold` |
| Delayed response after barge-in | Audio queue not cleared | Verify `ClearPlayback()` is called |
| Bot repeats itself | No interruption context in history | Add interruption context to the LLM |
Next Steps
- VAD Configuration - Tune voice detection
- Turn Detection - Conversation flow
- Gemini Live - Built-in interruptions