# Tutorial 3: LLM Routing
Route LLM requests to optimal providers based on task type and budget.
## Prerequisites

- Complete 02-basic-agents.md

## Learning Objectives

- Configure multiple LLM providers
- Route requests to optimal providers
- Understand provider pricing
- Handle fallback providers
## Provider Options
| Provider | Cost | Quality | Speed | Best For |
|---|---|---|---|---|
| Claude | $15/1M | Highest | Good | Complex reasoning |
| GPT-4 | $10/1M | Very High | Good | General purpose |
| Gemini | $5/1M | Good | Excellent | Budget-friendly |
| Ollama | Free | Good | Depends | Local, privacy |
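The table above can be modeled as plain data, which makes routing decisions testable. This is an illustrative sketch only; the `Provider` struct and `cheapest_paid` helper are not part of the `vapora-llm-router` API:

```rust
// Illustrative model of the provider table (not the vapora-llm-router API).
struct Provider {
    name: &'static str,
    cost_per_million: f64, // USD per 1M tokens; 0.0 means local/free
}

const PROVIDERS: [Provider; 4] = [
    Provider { name: "claude", cost_per_million: 15.0 },
    Provider { name: "gpt-4", cost_per_million: 10.0 },
    Provider { name: "gemini", cost_per_million: 5.0 },
    Provider { name: "ollama", cost_per_million: 0.0 },
];

/// Cheapest hosted provider, ignoring free/local options.
fn cheapest_paid() -> &'static str {
    PROVIDERS
        .iter()
        .filter(|p| p.cost_per_million > 0.0)
        .min_by(|a, b| a.cost_per_million.partial_cmp(&b.cost_per_million).unwrap())
        .map(|p| p.name)
        .unwrap_or("ollama")
}

fn main() {
    // Prints "cheapest paid provider: gemini"
    println!("cheapest paid provider: {}", cheapest_paid());
}
```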
## Step 1: Create Router

```rust
use vapora_llm_router::LLMRouter;

let router = LLMRouter::default();
```
## Step 2: Configure Providers

```rust
use std::collections::HashMap;

let mut rules = HashMap::new();
rules.insert("coding", "claude");        // Complex tasks → Claude
rules.insert("testing", "gpt-4");        // Testing → GPT-4
rules.insert("documentation", "ollama"); // Local, free → Ollama
```
## Step 3: Select Provider

```rust
let provider = if let Some(rule) = rules.get("coding") {
    *rule
} else {
    "claude" // default
};
println!("Selected provider: {}", provider);
```
## Step 4: Cost Estimation

```rust
// Token usage
let input_tokens = 1500;
let output_tokens = 800;

// Claude pricing: $3/1M input tokens, $15/1M output tokens
let input_cost = (input_tokens as f64 * 3.0) / 1_000_000.0;
let output_cost = (output_tokens as f64 * 15.0) / 1_000_000.0;
let total_cost = input_cost + output_cost;
println!("Estimated cost: ${:.4}", total_cost);
```
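The arithmetic above generalizes to a small helper that takes rates as parameters, so the same function works for any provider in the table. The `estimate_cost` name is illustrative, not part of `vapora-llm-router`:

```rust
/// Estimated request cost in USD, given per-1M-token rates.
/// (Illustrative helper; rates below are the Claude figures from this tutorial.)
fn estimate_cost(input_tokens: u64, output_tokens: u64, input_rate: f64, output_rate: f64) -> f64 {
    (input_tokens as f64 * input_rate + output_tokens as f64 * output_rate) / 1_000_000.0
}

fn main() {
    let cost = estimate_cost(1500, 800, 3.0, 15.0);
    println!("Estimated cost: ${:.4}", cost); // Estimated cost: $0.0165
}
```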
## Running the Example

```bash
cargo run --example 01-provider-selection -p vapora-llm-router
```
## Expected Output

```text
=== LLM Provider Selection Example ===

Available Providers:
  1. claude (models: claude-opus-4-5, claude-sonnet-4)
     - Use case: Complex reasoning, code generation
     - Cost: $15 per 1M input tokens
  2. gpt-4 (models: gpt-4-turbo, gpt-4)
     - Use case: General-purpose, multimodal
     - Cost: $10 per 1M input tokens
  3. ollama (models: llama2, mistral)
     - Use case: Local execution, no cost
     - Cost: $0.00 (local/on-premise)

Task: code_analysis
Selected provider: claude
Model: claude-opus-4-5
Cost: $0.075 per 1K tokens
Fallback: gpt-4 (if budget exceeded)
```
## Routing Strategies

### Rule-Based Routing

```rust
let provider = match task_type {
    "code_generation" => "claude",
    "documentation" => "ollama", // Free
    "analysis" => "gpt-4",
    _ => "claude", // default
};
```
### Cost-Aware Routing

```rust
let provider = if budget_remaining < 50.0 { // dollars
    "gemini" // cheaper
} else if budget_remaining < 100.0 {
    "gpt-4"
} else {
    "claude" // most capable
};
```
### Quality-Aware Routing

```rust
let provider = match complexity_score {
    s if s > 0.8 => "claude", // Best quality
    s if s > 0.5 => "gpt-4",  // Good balance
    _ => "ollama",            // Fast & cheap
};
```
### Fallback Strategy

Always configure a fallback for when the budget is critical:

```rust
let primary = "claude";
let fallback = "ollama";

let provider = if budget_exceeded {
    fallback
} else {
    primary
};
```
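A single primary/fallback pair extends naturally to a preference-ordered chain: try each provider in turn and take the first one whose estimated cost fits the remaining budget. This is a hedged sketch; `select_with_fallback` and the toy cost estimates are illustrative, not `vapora-llm-router` functions:

```rust
/// First provider in `chain` whose estimated cost fits the budget.
/// (Illustrative helper, not part of vapora-llm-router.)
fn select_with_fallback<'a>(
    chain: &[&'a str],
    budget_remaining: f64,
    estimated_cost: impl Fn(&str) -> f64,
) -> Option<&'a str> {
    chain
        .iter()
        .copied()
        .find(|&p| estimated_cost(p) <= budget_remaining)
}

fn main() {
    let chain = ["claude", "gpt-4", "ollama"];
    // Toy per-request cost estimates in dollars (assumed values).
    let cost = |p: &str| match p {
        "claude" => 0.08,
        "gpt-4" => 0.05,
        _ => 0.0, // ollama runs locally, free
    };
    // With only $0.06 left, claude is skipped and gpt-4 is selected.
    let provider = select_with_fallback(&chain, 0.06, cost);
    println!("Selected: {:?}", provider); // Selected: Some("gpt-4")
}
```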
## Common Patterns

### Cost Optimization

```rust
// Use cheaper models for high-volume tasks
let provider = if task_count > 100 {
    "gemini" // $5/1M (cheaper than Claude at $15/1M)
} else {
    "claude"
};
```
### Multi-Step Tasks

```rust
// Step 1: Claude (expensive, high quality)
let analysis = route_to("claude", "analyze_code");

// Step 2: GPT-4 (medium cost)
let design = route_to("gpt-4", "design_solution");

// Step 3: Ollama (free)
let formatting = route_to("ollama", "format_output");
```
## Troubleshooting

**Q: "Provider not available"**
A: Check that the API keys are set in your environment:

```bash
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
```

**Q: "Budget exceeded"**
A: Use the fallback provider or wait for the budget reset.
## Next Steps

- Tutorial 4: Learning Profiles
- Example: `crates/vapora-llm-router/examples/02-budget-enforcement.rs`
## Reference

- Source: `crates/vapora-llm-router/src/router.rs`
- API docs: `cargo doc --open -p vapora-llm-router`