Vapora/docs/tutorials/03-llm-routing.md
Jesús Pérez 7110ffeea2
2026-01-12 03:32:47 +00:00

# Tutorial 3: LLM Routing

Route LLM requests to optimal providers based on task type and budget.

Prerequisites

Learning Objectives

  • Configure multiple LLM providers
  • Route requests to optimal providers
  • Understand provider pricing
  • Handle fallback providers

Provider Options

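| Provider | Cost   | Quality   | Speed     | Best For          |
|----------|--------|-----------|-----------|-------------------|
| Claude   | $15/1M | Highest   | Good      | Complex reasoning |
| GPT-4    | $10/1M | Very High | Good      | General purpose   |
| Gemini   | $5/1M  | Good      | Excellent | Budget-friendly   |
| Ollama   | Free   | Good      | Depends   | Local, privacy    |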
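
One way to make the table actionable is to encode it as data and query it. The sketch below is illustrative only: `ProviderInfo`, `Quality`, and `cheapest_with_quality` are hypothetical names, not part of `vapora-llm-router`, and the prices are simply the figures from the table.

```rust
// Hypothetical types for illustration; not part of vapora-llm-router.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Quality {
    // Variant order defines the ordering: Good < VeryHigh < Highest.
    Good,
    VeryHigh,
    Highest,
}

#[derive(Debug, Clone, Copy)]
struct ProviderInfo {
    name: &'static str,
    cost_per_million: f64, // USD per 1M tokens, as listed in the table
    quality: Quality,
}

static PROVIDERS: [ProviderInfo; 4] = [
    ProviderInfo { name: "claude", cost_per_million: 15.0, quality: Quality::Highest },
    ProviderInfo { name: "gpt-4", cost_per_million: 10.0, quality: Quality::VeryHigh },
    ProviderInfo { name: "gemini", cost_per_million: 5.0, quality: Quality::Good },
    ProviderInfo { name: "ollama", cost_per_million: 0.0, quality: Quality::Good },
];

/// Returns the cheapest provider whose quality is at least `min_quality`.
fn cheapest_with_quality(min_quality: Quality) -> Option<&'static ProviderInfo> {
    PROVIDERS
        .iter()
        .filter(|p| p.quality >= min_quality)
        .min_by(|a, b| a.cost_per_million.total_cmp(&b.cost_per_million))
}
```

For example, `cheapest_with_quality(Quality::VeryHigh)` selects `gpt-4`: Claude also qualifies but costs more.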

Step 1: Create Router

use vapora_llm_router::LLMRouter;

let router = LLMRouter::default();

Step 2: Configure Providers

use std::collections::HashMap;

let mut rules = HashMap::new();
rules.insert("coding", "claude");        // Coding (complex) → Claude
rules.insert("testing", "gpt-4");        // Testing → GPT-4
rules.insert("documentation", "ollama"); // Documentation → Ollama (local)

Step 3: Select Provider

// Look up the rule for the task type; fall back to Claude if none matches.
let provider = rules.get("coding").copied().unwrap_or("claude");

println!("Selected provider: {}", provider);
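
Steps 2 and 3 can be wrapped into a small reusable lookup. This is a minimal sketch, assuming rules are plain string pairs as shown above; `default_rules` and `select_provider` are hypothetical helpers, not functions exported by `vapora-llm-router`.

```rust
use std::collections::HashMap;

/// Builds the rule table from Step 2 (task type -> provider name).
fn default_rules() -> HashMap<&'static str, &'static str> {
    let mut rules = HashMap::new();
    rules.insert("coding", "claude");
    rules.insert("testing", "gpt-4");
    rules.insert("documentation", "ollama");
    rules
}

/// Looks up the provider for `task_type`, falling back to
/// `default_provider` when no rule matches.
fn select_provider<'a>(
    rules: &HashMap<&'a str, &'a str>,
    task_type: &str,
    default_provider: &'a str,
) -> &'a str {
    rules.get(task_type).copied().unwrap_or(default_provider)
}
```

Passing the default as an argument keeps the fallback choice visible at the call site, for example `select_provider(&default_rules(), "review", "claude")`.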

Step 4: Cost Estimation

// Token usage
let input_tokens = 1500;
let output_tokens = 800;

// Claude pricing: $3/1M input, $15/1M output
let input_cost = (input_tokens as f64 * 3.0) / 1_000_000.0;
let output_cost = (output_tokens as f64 * 15.0) / 1_000_000.0;
let total_cost = input_cost + output_cost;

println!("Estimated cost: ${:.4}", total_cost);
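
The same arithmetic can be factored into a function so that other price points are easy to plug in. This is a sketch under the pricing quoted in Step 4; `Pricing`, `CLAUDE`, and `estimate_cost` are hypothetical names introduced here for illustration.

```rust
/// Hypothetical pricing record; both prices are USD per 1M tokens.
struct Pricing {
    input_per_million: f64,
    output_per_million: f64,
}

/// Claude figures quoted in Step 4 ($3/1M input, $15/1M output).
const CLAUDE: Pricing = Pricing {
    input_per_million: 3.0,
    output_per_million: 15.0,
};

/// Estimates the USD cost of one request from its token counts.
fn estimate_cost(pricing: &Pricing, input_tokens: u64, output_tokens: u64) -> f64 {
    let input_cost = input_tokens as f64 * pricing.input_per_million / 1_000_000.0;
    let output_cost = output_tokens as f64 * pricing.output_per_million / 1_000_000.0;
    input_cost + output_cost
}
```

With 1500 input and 800 output tokens, the input side contributes $0.0045 and the output side $0.0120, so `estimate_cost(&CLAUDE, 1500, 800)` comes to about $0.0165.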

Running the Example

cargo run --example 01-provider-selection -p vapora-llm-router

Expected Output

=== LLM Provider Selection Example ===

Available Providers:
1. claude (models: claude-opus-4-5, claude-sonnet-4)
   - Use case: Complex reasoning, code generation
   - Cost: $15 per 1M input tokens

2. gpt-4 (models: gpt-4-turbo, gpt-4)
   - Use case: General-purpose, multimodal
   - Cost: $10 per 1M input tokens

3. ollama (models: llama2, mistral)
   - Use case: Local execution, no cost
   - Cost: $0.00 (local/on-premise)

Task: code_analysis
  Selected provider: claude
  Model: claude-opus-4-5
  Cost: $0.075 per 1K tokens
  Fallback: gpt-4 (if budget exceeded)

Routing Strategies

Rule-Based Routing

let provider = match task_type {
    "code_generation" => "claude",
    "documentation" => "ollama", // Free
    "analysis" => "gpt-4",
    _ => "claude", // Default
};
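
String keys are easy to mistype, so one possible refinement is to parse the task type into an enum once and let the compiler check that every variant is routed. The `TaskType` enum and `route_by_rule` below are hypothetical and only mirror the rules listed above.

```rust
/// Hypothetical typed task categories mirroring the string rules.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TaskType {
    CodeGeneration,
    Documentation,
    Analysis,
    Other,
}

impl From<&str> for TaskType {
    /// Unknown strings map to `Other` instead of failing.
    fn from(s: &str) -> Self {
        match s {
            "code_generation" => TaskType::CodeGeneration,
            "documentation" => TaskType::Documentation,
            "analysis" => TaskType::Analysis,
            _ => TaskType::Other,
        }
    }
}

/// Exhaustive match: adding a variant without a route is a compile error.
fn route_by_rule(task: TaskType) -> &'static str {
    match task {
        TaskType::CodeGeneration => "claude",
        TaskType::Documentation => "ollama", // free
        TaskType::Analysis => "gpt-4",
        TaskType::Other => "claude", // default
    }
}
```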

Cost-Aware Routing

let provider = if budget_remaining < 50 { // Less than $50 left
    "gemini" // Cheaper
} else if budget_remaining < 100 {
    "gpt-4"
} else {
    "claude" // Most capable
};
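
Wrapped in a function, the thresholds become easy to test at their boundaries. This sketch assumes the budget is tracked in dollars as a floating-point value; `route_by_budget` is a hypothetical helper, and the $50 and $100 cut-offs are the ones used above.

```rust
/// Picks a provider from the remaining budget in USD (hypothetical helper).
fn route_by_budget(budget_remaining: f64) -> &'static str {
    if budget_remaining < 50.0 {
        "gemini" // cheaper
    } else if budget_remaining < 100.0 {
        "gpt-4"
    } else {
        "claude" // most capable
    }
}
```

Both comparisons are strict, so a budget of exactly $50 routes to `gpt-4` and exactly $100 routes to `claude`.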

Quality-Aware Routing

let provider = match complexity_score {
    score if score > 0.8 => "claude", // Best quality
    score if score > 0.5 => "gpt-4",  // Good balance
    _ => "ollama",                    // Fast & cheap
};
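
The guard-based match has two edge cases worth pinning down: values that sit exactly on a threshold, and `NaN`. The sketch below assumes the complexity score is an `f64` (typically in the range 0.0 to 1.0); `route_by_complexity` is a hypothetical name.

```rust
/// Picks a provider from a complexity score (hypothetical helper).
fn route_by_complexity(score: f64) -> &'static str {
    match score {
        s if s > 0.8 => "claude", // best quality
        s if s > 0.5 => "gpt-4",  // good balance
        // Everything else lands here, including NaN, because any
        // comparison with NaN evaluates to false.
        _ => "ollama",
    }
}
```

Because the guards use `>`, a score of exactly 0.8 routes to `gpt-4` and exactly 0.5 routes to `ollama`.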

Fallback Strategy

Always define a fallback provider when the budget is a hard limit:

let primary = "claude";
let fallback = "ollama";

let provider = if budget_exceeded {
    fallback
} else {
    primary
};
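
A two-way switch generalizes to an ordered fallback chain: try providers in order of preference and take the first one that is currently usable. The sketch below is one possible shape, not the router's actual mechanism; `first_usable` and the `is_usable` predicate (which could check budget, API keys, or health) are assumptions.

```rust
/// Returns the first provider in `chain` accepted by `is_usable`,
/// or `None` if every provider is rejected.
fn first_usable<'a>(
    chain: &[&'a str],
    is_usable: impl Fn(&str) -> bool,
) -> Option<&'a str> {
    chain.iter().copied().find(|&provider| is_usable(provider))
}
```

With the chain `["claude", "gpt-4", "ollama"]` and a predicate that rejects only `"claude"` (say, because its budget is exhausted), the result is `Some("gpt-4")`. Returning `Option` forces the caller to decide what happens when nothing is usable.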

Common Patterns

Cost Optimization

// Use cheaper models for high-volume tasks
let provider = if task_count > 100 {
    "gemini" // $5/1M (cheaper than Claude at $15/1M)
} else {
    "claude"
};

Multi-Step Tasks

// Step 1: Claude (expensive, high quality)
let analysis = route_to("claude", "analyze_code");

// Step 2: GPT-4 (medium cost)
let design = route_to("gpt-4", "design_solution");

// Step 3: Ollama (free)
let formatting = route_to("ollama", "format_output");
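
The snippet above does not define `route_to`. To experiment with the pattern, a stand-in can be stubbed out and the pipeline expressed as data. Everything below is hypothetical: this `route_to` only formats a label and makes no provider call, and `run_pipeline` is an illustrative helper.

```rust
/// Hypothetical stand-in for a real dispatch call: it only records
/// which provider would handle which step.
fn route_to(provider: &str, step: &str) -> String {
    format!("[{provider}] {step}")
}

/// Runs each (provider, step) pair in order and collects the results.
fn run_pipeline(steps: &[(&str, &str)]) -> Vec<String> {
    steps
        .iter()
        .map(|&(provider, step)| route_to(provider, step))
        .collect()
}
```

Keeping the steps in a slice such as `[("claude", "analyze_code"), ("gpt-4", "design_solution"), ("ollama", "format_output")]` makes it straightforward to swap a provider for one stage without touching the others.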

Troubleshooting

Q: "Provider not available"
A: Check that the API keys are set in your environment:

export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
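
The same check can be done from Rust at startup, so a missing key is reported before any request is routed. This is a minimal sketch using `std::env::var`; `missing_keys` is a hypothetical helper, and the variable names are the ones exported above.

```rust
use std::env;

/// Returns the names of required variables that are unset, not valid
/// Unicode, or blank.
fn missing_keys(required: &[&str]) -> Vec<String> {
    let mut missing = Vec::new();
    for &name in required {
        let is_set = env::var(name)
            .map(|value| !value.trim().is_empty())
            .unwrap_or(false);
        if !is_set {
            missing.push(name.to_string());
        }
    }
    missing
}
```

Calling `missing_keys(&["ANTHROPIC_API_KEY", "OPENAI_API_KEY"])` returns an empty vector when both keys are present; otherwise it lists the names to fix.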

Q: "Budget exceeded"
A: Switch to a fallback provider or wait for the budget to reset.

Next Steps

  • Tutorial 4: Learning Profiles
  • Example: crates/vapora-llm-router/examples/02-budget-enforcement.rs

Reference

  • Source: crates/vapora-llm-router/src/router.rs
  • API: cargo doc --open -p vapora-llm-router