# Tutorial 3: LLM Routing

Route LLM requests to optimal providers based on task type and budget.

Prerequisites

Learning Objectives

  • Configure multiple LLM providers
  • Route requests to optimal providers
  • Understand provider pricing
  • Handle fallback providers

Provider Options

| Provider | Cost    | Quality   | Speed     | Best For          |
|----------|---------|-----------|-----------|-------------------|
| Claude   | $15/1M  | Highest   | Good      | Complex reasoning |
| GPT-4    | $10/1M  | Very High | Good      | General purpose   |
| Gemini   | $5/1M   | Good      | Excellent | Budget-friendly   |
| Ollama   | Free    | Good      | Depends   | Local, privacy    |
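The table above can also be encoded as data so routing code can consult it programmatically. A minimal sketch — the `Provider` struct, `PROVIDERS` table, and `cheapest` helper are illustrative, not part of the vapora-llm-router API:

```rust
/// Illustrative encoding of the provider comparison table.
struct Provider {
    name: &'static str,
    cost_per_million_usd: f64, // input-token price; 0.0 = local/free
    best_for: &'static str,
}

const PROVIDERS: [Provider; 4] = [
    Provider { name: "claude", cost_per_million_usd: 15.0, best_for: "complex reasoning" },
    Provider { name: "gpt-4",  cost_per_million_usd: 10.0, best_for: "general purpose" },
    Provider { name: "gemini", cost_per_million_usd: 5.0,  best_for: "budget-friendly" },
    Provider { name: "ollama", cost_per_million_usd: 0.0,  best_for: "local, privacy" },
];

/// Returns the name of the lowest-cost provider in the table.
fn cheapest() -> &'static str {
    PROVIDERS
        .iter()
        .min_by(|a, b| a.cost_per_million_usd.partial_cmp(&b.cost_per_million_usd).unwrap())
        .map(|p| p.name)
        .unwrap()
}

fn main() {
    println!("Cheapest provider: {}", cheapest()); // ollama (free)
}
```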

Step 1: Create Router

```rust
use vapora_llm_router::LLMRouter;

let router = LLMRouter::default();
```

Step 2: Configure Providers

```rust
use std::collections::HashMap;

let mut rules = HashMap::new();
rules.insert("coding", "claude");        // Complex tasks → Claude
rules.insert("testing", "gpt-4");        // Testing → GPT-4
rules.insert("documentation", "ollama"); // Local → Ollama
```

Step 3: Select Provider

```rust
let provider = if let Some(rule) = rules.get("coding") {
    *rule
} else {
    "claude" // default
};

println!("Selected provider: {}", provider);
```

Step 4: Cost Estimation

```rust
// Token usage
let input_tokens = 1500;
let output_tokens = 800;

// Claude pricing: $3/1M input, $15/1M output
let input_cost = (input_tokens as f64 * 3.0) / 1_000_000.0;
let output_cost = (output_tokens as f64 * 15.0) / 1_000_000.0;
let total_cost = input_cost + output_cost;

println!("Estimated cost: ${:.4}", total_cost);
```
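The arithmetic above generalizes to a small helper that prices any request from per-million token rates. A sketch using the same numbers as Step 4 — `estimate_cost` is an illustrative helper, not a vapora-llm-router API:

```rust
// Prices a request from per-million token rates.
// Rates here are the tutorial's Claude example: $3/1M input, $15/1M output.
fn estimate_cost(
    input_tokens: u64,
    output_tokens: u64,
    input_rate_per_m: f64,
    output_rate_per_m: f64,
) -> f64 {
    (input_tokens as f64 * input_rate_per_m
        + output_tokens as f64 * output_rate_per_m)
        / 1_000_000.0
}

fn main() {
    // Same request as Step 4: 1500 input + 800 output tokens on Claude.
    let cost = estimate_cost(1500, 800, 3.0, 15.0);
    println!("Estimated cost: ${:.4}", cost); // $0.0165
}
```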

Running the Example

```bash
cargo run --example 01-provider-selection -p vapora-llm-router
```

Expected Output

```text
=== LLM Provider Selection Example ===

Available Providers:
1. claude (models: claude-opus-4-5, claude-sonnet-4)
   - Use case: Complex reasoning, code generation
   - Cost: $15 per 1M input tokens

2. gpt-4 (models: gpt-4-turbo, gpt-4)
   - Use case: General-purpose, multimodal
   - Cost: $10 per 1M input tokens

3. ollama (models: llama2, mistral)
   - Use case: Local execution, no cost
   - Cost: $0.00 (local/on-premise)

Task: code_analysis
  Selected provider: claude
  Model: claude-opus-4-5
  Cost: $0.075 per 1K tokens
  Fallback: gpt-4 (if budget exceeded)
```

Routing Strategies

Rule-Based Routing

```rust
let provider = match task_type {
    "code_generation" => "claude",
    "documentation" => "ollama", // Free
    "analysis" => "gpt-4",
    _ => "claude", // default
};
```

Cost-Aware Routing

```rust
let provider = if budget_remaining < 50 { // dollars
    "gemini" // cheaper
} else if budget_remaining < 100 {
    "gpt-4"
} else {
    "claude" // most capable
};
```

Quality-Aware Routing

```rust
let provider = match complexity_score {
    high if high > 0.8 => "claude",    // Best quality
    medium if medium > 0.5 => "gpt-4", // Good balance
    _ => "ollama",                     // Fast & cheap
};
```

Fallback Strategy

Always keep a fallback provider ready for when the budget runs out:

```rust
let primary = "claude";
let fallback = "ollama";

let provider = if budget_exceeded {
    fallback
} else {
    primary
};
```
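The two-provider fallback extends naturally to an ordered chain that is tried until an available provider is found. A sketch, where `is_available` is a hypothetical stand-in for a real availability/budget check:

```rust
// Hypothetical availability check: assume paid providers become
// unavailable once the budget is exceeded, while local ollama stays up.
fn is_available(provider: &str, budget_exceeded: bool) -> bool {
    !(budget_exceeded && provider != "ollama")
}

// Walk the chain in preference order; return the first available provider.
fn select(chain: &[&'static str], budget_exceeded: bool) -> Option<&'static str> {
    chain.iter().copied().find(|&p| is_available(p, budget_exceeded))
}

fn main() {
    let chain = ["claude", "gpt-4", "ollama"];
    println!("{:?}", select(&chain, true));  // Some("ollama"): fell through the chain
    println!("{:?}", select(&chain, false)); // Some("claude"): primary is available
}
```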

Common Patterns

Cost Optimization

```rust
// Use cheaper models for high-volume tasks
let provider = if task_count > 100 {
    "gemini" // $5/1M (vs. Claude at $15/1M)
} else {
    "claude"
};
```

Multi-Step Tasks

```rust
// Step 1: Claude (expensive, high quality)
let analysis = route_to("claude", "analyze_code");

// Step 2: GPT-4 (medium cost)
let design = route_to("gpt-4", "design_solution");

// Step 3: Ollama (free)
let formatting = route_to("ollama", "format_output");
```

Troubleshooting

Q: "Provider not available"
A: Check that the API keys are set in your environment:

```bash
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
```

Q: "Budget exceeded"
A: Use the fallback provider or wait for the budget to reset.

Next Steps

  • Tutorial 4: Learning Profiles
  • Example: crates/vapora-llm-router/examples/02-budget-enforcement.rs

Reference

  • Source: crates/vapora-llm-router/src/router.rs
  • API: cargo doc --open -p vapora-llm-router