# Anthropic Claude LLM

Claude 3.5 Sonnet provides excellent instruction following and nuanced responses, making it ideal for complex customer interactions.

## Why Claude?
| Feature | Claude 3.5 Sonnet | GPT-4o |
|---|---|---|
| Instruction Following | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Context Window | 200K tokens | 128K tokens |
| Time to First Token | ~220ms | ~250ms |
| Safety | Excellent | Good |
| Cost (input / output per 1M tokens) | $3 / $15 | $5 / $15 |
**Best for:** complex conversations requiring nuance and safety-critical applications.
## Configuration

### Basic Setup

```json
{
  "agent": {
    "name": "Premium Support",
    "llmProvider": "anthropic",
    "llmModel": "claude-3-5-sonnet-20241022",
    "llmTemperature": 0.7,
    "prompt": "You are a helpful customer support agent..."
  }
}
```
### Environment Variables

```bash
ANTHROPIC_API_KEY=your_anthropic_api_key
```
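Read the key from the environment at startup rather than hard-coding it. A minimal sketch, assuming the `NewAnthropicLLM` constructor from the implementation section below (imports `log` and `os`):

```go
apiKey := os.Getenv("ANTHROPIC_API_KEY")
if apiKey == "" {
	log.Fatal("ANTHROPIC_API_KEY is not set")
}
llm := NewAnthropicLLM(apiKey, "claude-3-5-sonnet-20241022")
```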
### Advanced Configuration

```json
{
  "llmProvider": "anthropic",
  "llmModel": "claude-3-5-sonnet-20241022",
  "llmConfig": {
    "temperature": 0.7,
    "max_tokens": 500,
    "top_p": 0.9,
    "top_k": 40
  }
}
```
## Model Comparison
| Model | Speed | Intelligence | Cost | Best For |
|---|---|---|---|---|
| claude-3-5-sonnet | 🚀 Fast | ⭐⭐⭐⭐⭐ | $$$ | Production voice |
| claude-3-opus | 🐢 Slower | ⭐⭐⭐⭐⭐ | $$$$ | Complex reasoning |
| claude-3-haiku | ⚡ Fastest | ⭐⭐⭐⭐ | $ | Cost-sensitive |
## Implementation

### Streaming Response

```go
type AnthropicLLM struct {
	client *anthropic.Client
	model  string
}

func NewAnthropicLLM(apiKey, model string) *AnthropicLLM {
	return &AnthropicLLM{
		client: anthropic.NewClient(apiKey),
		model:  model,
	}
}

// StreamGenerate yields tokens on the returned channel as they arrive.
// The channel is closed when the stream ends or an error occurs.
func (a *AnthropicLLM) StreamGenerate(ctx context.Context, messages []Message) <-chan string {
	tokenChan := make(chan string)
	go func() {
		defer close(tokenChan)

		// Convert to Anthropic's message format.
		anthropicMessages := convertMessages(messages)

		stream, err := a.client.Messages.Stream(ctx, anthropic.MessageCreateParams{
			Model:     a.model,
			MaxTokens: 500,
			Messages:  anthropicMessages,
		})
		if err != nil {
			return // closing the channel signals the end; surface the error in production code
		}
		defer stream.Close()

		for {
			event, err := stream.Recv()
			if err == io.EOF {
				return // stream finished normally
			}
			if err != nil {
				return
			}
			if delta, ok := event.(anthropic.ContentBlockDelta); ok {
				if text := delta.Delta.Text; text != "" {
					select {
					case tokenChan <- text:
					case <-ctx.Done():
						return // don't block if the caller has gone away
					}
				}
			}
		}
	}()
	return tokenChan
}
```
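The `convertMessages` helper above is referenced but not defined; here is a minimal sketch, assuming the wrapper's `Message` and the SDK's message type both carry `Role` and `Content` strings:

```go
// Sketch only: maps the wrapper's Message type onto the SDK's. The
// Anthropic API takes the system prompt as a top-level field, so
// system messages are skipped rather than sent as conversation turns.
func convertMessages(messages []Message) []anthropic.Message {
	out := make([]anthropic.Message, 0, len(messages))
	for _, m := range messages {
		if m.Role == "system" {
			continue
		}
		out = append(out, anthropic.Message{
			Role:    m.Role,
			Content: m.Content,
		})
	}
	return out
}
```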
### Function Calling (Tools)

```go
func (a *AnthropicLLM) GenerateWithTools(ctx context.Context, messages []Message, tools []Tool) (*Response, error) {
	// Convert tools to Anthropic's format.
	anthropicTools := make([]anthropic.Tool, len(tools))
	for i, tool := range tools {
		anthropicTools[i] = anthropic.Tool{
			Name:        tool.Name,
			Description: tool.Description,
			InputSchema: tool.Parameters,
		}
	}

	resp, err := a.client.Messages.Create(ctx, anthropic.MessageCreateParams{
		Model:     a.model,
		MaxTokens: 500,
		Messages:  convertMessages(messages),
		Tools:     anthropicTools,
	})
	if err != nil {
		return nil, err
	}

	// If the model decided to call a tool, return the first tool call.
	for _, block := range resp.Content {
		if toolUse, ok := block.(anthropic.ToolUseBlock); ok {
			return &Response{
				ToolCalls: []ToolCall{{
					ID:        toolUse.ID,
					Name:      toolUse.Name,
					Arguments: toolUse.Input,
				}},
			}, nil
		}
	}

	// Otherwise extract the plain text response.
	return extractTextResponse(resp), nil
}
```
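A hypothetical calling pattern for reference: the `get_order_status` name matches the system prompt example below, `Tool`, `ToolCall`, and `Response` are the wrapper types above, and `Parameters` is assumed to accept a JSON Schema as a map:

```go
tools := []Tool{{
	Name:        "get_order_status",
	Description: "Look up the status of an order by order number",
	Parameters: map[string]any{
		"type": "object",
		"properties": map[string]any{
			"order_id": map[string]any{"type": "string"},
		},
		"required": []string{"order_id"},
	},
}}

resp, err := llm.GenerateWithTools(ctx, messages, tools)
if err != nil {
	// Handle or retry; see the Error Handling section below.
}
for _, call := range resp.ToolCalls {
	if call.Name == "get_order_status" {
		// Run the lookup, then send the result back in a follow-up turn.
	}
}
```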
## System Prompt Best Practices

Claude excels at following detailed instructions:

```go
systemPrompt := `You are Alex, a customer support agent for Acme Corp.

<persona>
- Warm and professional tone
- Patient with frustrated customers
- Admits uncertainty rather than guessing
</persona>

<voice_guidelines>
- Keep responses to 1-2 sentences
- Use conversational language
- Avoid bullet points and lists
- Numbers: "one two three" not "123"
</voice_guidelines>

<boundaries>
- Only discuss Acme products and services
- Don't make promises about refunds over $100
- Transfer to human for: legal, compliance, security issues
</boundaries>

<tools>
You have access to:
- get_order_status: Look up order information
- transfer_call: Connect to human agent
</tools>

<examples>
User: "Where's my order?"
Good: "I'd be happy to check that for you. What's your order number?"
Bad: "I can help with that! To look up your order status, I'll need your order number. Our orders typically ship within 2-3 business days and..."
</examples>`
```
## Claude's Unique Strengths

### 1. Nuanced Understanding

Claude handles ambiguous requests well:

```
User: "I'm not sure if I want this anymore"
Claude: "I understand you're having second thoughts. Would you like me to explain the return process, or would you prefer to discuss what's making you hesitate?"
```
### 2. Safety and Boundaries

Claude naturally respects boundaries:

```go
systemPrompt := `You are a bank support agent.

NEVER:
- Share account numbers over the phone
- Process transactions without verification
- Discuss other customers' accounts

When asked to do these things, politely explain you cannot.`
// Claude will refuse gracefully without being preachy.
```
### 3. Long Context

Use the 200K-token context window for detailed history:

```go
func buildContext(history []Message, documents []Document) []Message {
	messages := []Message{{Role: "system", Content: systemPrompt}}

	// Add relevant documents (Claude handles long context well).
	for _, doc := range documents {
		messages = append(messages, Message{
			Role:    "user",
			Content: fmt.Sprintf("<document name=\"%s\">%s</document>", doc.Name, doc.Content),
		})
	}

	// Add conversation history.
	messages = append(messages, history...)
	return messages
}
```
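A long window is not free: larger payloads cost more and raise time-to-first-token, which matters on a live call. A rough sketch of capping history size, assuming the `Message` type above and the common (approximate) heuristic of four characters per token:

```go
// trimToBudget keeps the system prompt plus the most recent turns that
// fit within roughly maxTokens. Character count / 4 is an estimate, not
// an exact token count.
func trimToBudget(messages []Message, maxTokens int) []Message {
	budget := maxTokens * 4 // approximate character budget
	kept := []Message{}
	if len(messages) > 0 && messages[0].Role == "system" {
		kept = append(kept, messages[0]) // always keep the system prompt
		budget -= len(messages[0].Content)
		messages = messages[1:]
	}
	// Walk backwards so the most recent turns survive.
	start := len(messages)
	total := 0
	for i := len(messages) - 1; i >= 0; i-- {
		total += len(messages[i].Content)
		if total > budget {
			break
		}
		start = i
	}
	return append(kept, messages[start:]...)
}
```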
## Latency Optimization

### 1. Use Haiku for Simple Tasks

```go
func selectModel(intent string) string {
	switch intent {
	case "greeting", "confirmation", "farewell":
		return "claude-3-haiku-20240307" // fast and cheap
	case "complex_query", "reasoning":
		return "claude-3-5-sonnet-20241022" // more capable
	default:
		return "claude-3-haiku-20240307"
	}
}
```
### 2. Prompt Caching (Beta)

Reduce latency for repeated prompts:

```go
// Cache the system prompt across requests.
resp, err := client.Messages.Create(ctx, anthropic.MessageCreateParams{
	Model: "claude-3-5-sonnet-20241022",
	System: anthropic.SystemPrompt{
		Text: systemPrompt,
		CacheControl: &anthropic.CacheControl{
			Type: "ephemeral",
		},
	},
	Messages: messages,
})
```
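At the time of writing, ephemeral cache entries expire after roughly five minutes of inactivity, cache writes are billed above the normal input rate, and cache reads well below it, so caching pays off for long, stable system prompts reused across many turns.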
### 3. Shorter Responses

```go
systemPrompt := `Keep all responses under 30 words.
Be direct and actionable.
Ask only one question at a time.`
```
## Error Handling

```go
func (a *AnthropicLLM) generateWithRetry(ctx context.Context, messages []Message) (*Response, error) {
	maxRetries := 3
	backoff := 500 * time.Millisecond

	for i := 0; i < maxRetries; i++ {
		resp, err := a.generate(ctx, messages)
		if err == nil {
			return resp, nil
		}

		var apiErr *anthropic.APIError
		if errors.As(err, &apiErr) {
			switch apiErr.StatusCode {
			case 429:
				// Rate limited: honor the server's Retry-After hint.
				time.Sleep(parseRetryAfter(apiErr.Headers))
				continue
			case 529, 500, 502, 503:
				// Overloaded or transient server error: exponential backoff.
				time.Sleep(backoff)
				backoff *= 2
				continue
			default:
				// Client errors (4xx) won't succeed on retry.
				return nil, err
			}
		}
		return nil, err
	}
	return nil, fmt.Errorf("anthropic: max retries exceeded")
}
```
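The `parseRetryAfter` helper above is left to the reader; a minimal sketch, assuming the SDK surfaces response headers as an `http.Header` (imports `net/http`, `strconv`, and `time`):

```go
// Sketch only: falls back to one second when the Retry-After header
// is absent or malformed.
func parseRetryAfter(headers http.Header) time.Duration {
	if v := headers.Get("Retry-After"); v != "" {
		if secs, err := strconv.Atoi(v); err == nil && secs > 0 {
			return time.Duration(secs) * time.Second
		}
	}
	return time.Second
}
```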
## Cost Tracking

```go
func (a *AnthropicLLM) trackUsage(resp *anthropic.MessageResponse) {
	inputTokens := resp.Usage.InputTokens
	outputTokens := resp.Usage.OutputTokens

	metrics.RecordCounter("llm.anthropic.input_tokens", int64(inputTokens))
	metrics.RecordCounter("llm.anthropic.output_tokens", int64(outputTokens))

	// Claude 3.5 Sonnet pricing at the time of writing.
	inputCost := float64(inputTokens) * 0.000003   // $3 per 1M tokens
	outputCost := float64(outputTokens) * 0.000015 // $15 per 1M tokens

	// Counters take integers, so track cost in micro-dollars.
	metrics.RecordCounter("llm.anthropic.cost_microusd", int64((inputCost+outputCost)*1e6))
}
```
## Comparison with GPT-4o

| Aspect | Claude 3.5 Sonnet | GPT-4o |
|---|---|---|
| Instruction following | Excellent | Excellent |
| Function calling | Good | Best |
| Long context | 200K tokens | 128K tokens |
| Safety | More cautious | Less cautious |
| Creativity | More creative | More factual |
| Latency | Similar | Similar |
### When to Choose Claude
- Complex, nuanced conversations
- Safety-critical applications
- Long document context needed
- Creative problem-solving
### When to Choose GPT-4o
- Heavy function calling
- Structured output requirements
- Multi-modal (image) needs
## Next Steps
- OpenAI Configuration - GPT-4o setup
- Gemini Configuration - Faster alternative
- Function Calling - Add tools