# Tutorial 3: LLM Routing
Route LLM requests to optimal providers based on task type and budget.
## Prerequisites

- Complete `02-basic-agents.md`
## Learning Objectives
- Configure multiple LLM providers
- Route requests to optimal providers
- Understand provider pricing
- Handle fallback providers
## Provider Options
| Provider | Cost (per 1M tokens) | Quality | Speed | Best For |
|---|---|---|---|---|
| Claude | $15/1M | Highest | Good | Complex reasoning |
| GPT-4 | $10/1M | Very High | Good | General purpose |
| Gemini | $5/1M | Good | Excellent | Budget-friendly |
| Ollama | Free | Good | Depends | Local, privacy |
## Step 1: Create Router

```rust
use vapora_llm_router::LLMRouter;

let router = LLMRouter::default();
```
## Step 2: Configure Providers

```rust
use std::collections::HashMap;

let mut rules = HashMap::new();
rules.insert("coding", "claude");        // Complex tasks → Claude
rules.insert("testing", "gpt-4");        // Testing → GPT-4
rules.insert("documentation", "ollama"); // Local → Ollama
```
## Step 3: Select Provider

```rust
let provider = if let Some(rule) = rules.get("coding") {
    *rule
} else {
    "claude" // default
};
println!("Selected provider: {}", provider);
```
## Step 4: Cost Estimation

```rust
// Token usage
let input_tokens = 1500;
let output_tokens = 800;

// Claude pricing: $3/1M input, $15/1M output
let input_cost = (input_tokens as f64 * 3.0) / 1_000_000.0;
let output_cost = (output_tokens as f64 * 15.0) / 1_000_000.0;
let total_cost = input_cost + output_cost;

println!("Estimated cost: ${:.4}", total_cost);
```
## Running the Example

```bash
cargo run --example 01-provider-selection -p vapora-llm-router
```
## Expected Output

```text
=== LLM Provider Selection Example ===

Available Providers:
1. claude (models: claude-opus-4-5, claude-sonnet-4)
   - Use case: Complex reasoning, code generation
   - Cost: $15 per 1M input tokens
2. gpt-4 (models: gpt-4-turbo, gpt-4)
   - Use case: General-purpose, multimodal
   - Cost: $10 per 1M input tokens
3. ollama (models: llama2, mistral)
   - Use case: Local execution, no cost
   - Cost: $0.00 (local/on-premise)

Task: code_analysis
Selected provider: claude
Model: claude-opus-4-5
Cost: $0.075 per 1K tokens
Fallback: gpt-4 (if budget exceeded)
```
## Routing Strategies

### Rule-Based Routing
```rust
let provider = match task_type {
    "code_generation" => "claude",
    "documentation" => "ollama", // Free
    "analysis" => "gpt-4",
    _ => "claude", // default
};
```
### Cost-Aware Routing
```rust
let provider = if budget_remaining < 50 { // dollars
    "gemini" // cheaper
} else if budget_remaining < 100 {
    "gpt-4"
} else {
    "claude" // most capable
};
```
### Quality-Aware Routing
```rust
let provider = match complexity_score {
    high if high > 0.8 => "claude",    // Best quality
    medium if medium > 0.5 => "gpt-4", // Good balance
    _ => "ollama",                     // Fast & cheap
};
```
## Fallback Strategy
Always have a fallback when budget is critical:
```rust
let primary = "claude";
let fallback = "ollama";

let provider = if budget_exceeded { fallback } else { primary };
```
## Common Patterns

### Cost Optimization
```rust
// Use cheaper models for high-volume tasks
let provider = if task_count > 100 {
    "gemini" // $5/1M (cheaper than Claude at $15/1M)
} else {
    "claude"
};
```
### Multi-Step Tasks
```rust
// `route_to(provider, task)` is assumed to dispatch the task to the named provider.

// Step 1: Claude (expensive, high quality)
let analysis = route_to("claude", "analyze_code");

// Step 2: GPT-4 (medium cost)
let design = route_to("gpt-4", "design_solution");

// Step 3: Ollama (free)
let formatting = route_to("ollama", "format_output");
```
## Troubleshooting
**Q: "Provider not available"**

A: Check API keys in your environment:

```bash
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
```
**Q: "Budget exceeded"**

A: Use a fallback provider or wait for the budget reset.
## Next Steps
- Tutorial 4: Learning Profiles
- Example: `crates/vapora-llm-router/examples/02-budget-enforcement.rs`
## Reference

- Source: `crates/vapora-llm-router/src/router.rs`
- API: `cargo doc --open -p vapora-llm-router`