# Tutorial 3: LLM Routing
Route LLM requests to optimal providers based on task type and budget.
## Prerequisites

- Complete 02-basic-agents.md

## Learning Objectives

- Configure multiple LLM providers
- Route requests to optimal providers
- Understand provider pricing
- Handle fallback providers
## Provider Options
| Provider | Cost | Quality | Speed | Best For |
|---|---|---|---|---|
| Claude | $15/1M | Highest | Good | Complex reasoning |
| GPT-4 | $10/1M | Very High | Good | General purpose |
| Gemini | $5/1M | Good | Excellent | Budget-friendly |
| Ollama | Free | Good | Depends | Local, privacy |
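The table above can be modeled as plain data, which makes routing decisions testable. This is an illustrative sketch only; the `Provider` struct and `cheapest_paid` helper are not part of the `vapora-llm-router` API:

```rust
// Illustrative model of the provider table (not the vapora-llm-router API).
struct Provider {
    name: &'static str,
    cost_per_million: f64, // USD per 1M tokens; 0.0 means local/free
}

const PROVIDERS: [Provider; 4] = [
    Provider { name: "claude", cost_per_million: 15.0 },
    Provider { name: "gpt-4", cost_per_million: 10.0 },
    Provider { name: "gemini", cost_per_million: 5.0 },
    Provider { name: "ollama", cost_per_million: 0.0 },
];

/// Cheapest hosted provider, ignoring free/local options.
fn cheapest_paid() -> &'static str {
    PROVIDERS
        .iter()
        .filter(|p| p.cost_per_million > 0.0)
        .min_by(|a, b| a.cost_per_million.partial_cmp(&b.cost_per_million).unwrap())
        .map(|p| p.name)
        .unwrap_or("ollama")
}

fn main() {
    // Prints "cheapest paid provider: gemini"
    println!("cheapest paid provider: {}", cheapest_paid());
}
```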
## Step 1: Create Router

```rust
use vapora_llm_router::LLMRouter;

let router = LLMRouter::default();
```
## Step 2: Configure Providers

```rust
use std::collections::HashMap;

let mut rules = HashMap::new();
rules.insert("coding", "claude");        // Complex tasks → Claude
rules.insert("testing", "gpt-4");        // Testing → GPT-4
rules.insert("documentation", "ollama"); // Local, free → Ollama
```
## Step 3: Select Provider

```rust
let provider = if let Some(rule) = rules.get("coding") {
    *rule
} else {
    "claude" // default
};
println!("Selected provider: {}", provider);
```
## Step 4: Cost Estimation

```rust
// Token usage
let input_tokens = 1500;
let output_tokens = 800;

// Claude pricing: $3/1M input tokens, $15/1M output tokens
let input_cost = (input_tokens as f64 * 3.0) / 1_000_000.0;
let output_cost = (output_tokens as f64 * 15.0) / 1_000_000.0;
let total_cost = input_cost + output_cost;
println!("Estimated cost: ${:.4}", total_cost);
```
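The arithmetic above generalizes to a small helper that takes rates as parameters, so the same function works for any provider in the table. The `estimate_cost` name is illustrative, not part of `vapora-llm-router`:

```rust
/// Estimated request cost in USD, given per-1M-token rates.
/// (Illustrative helper; rates below are the Claude figures from this tutorial.)
fn estimate_cost(input_tokens: u64, output_tokens: u64, input_rate: f64, output_rate: f64) -> f64 {
    (input_tokens as f64 * input_rate + output_tokens as f64 * output_rate) / 1_000_000.0
}

fn main() {
    let cost = estimate_cost(1500, 800, 3.0, 15.0);
    println!("Estimated cost: ${:.4}", cost); // Estimated cost: $0.0165
}
```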
## Running the Example

```bash
cargo run --example 01-provider-selection -p vapora-llm-router
```
## Expected Output

```text
=== LLM Provider Selection Example ===

Available Providers:
  1. claude (models: claude-opus-4-5, claude-sonnet-4)
     - Use case: Complex reasoning, code generation
     - Cost: $15 per 1M input tokens
  2. gpt-4 (models: gpt-4-turbo, gpt-4)
     - Use case: General-purpose, multimodal
     - Cost: $10 per 1M input tokens
  3. ollama (models: llama2, mistral)
     - Use case: Local execution, no cost
     - Cost: $0.00 (local/on-premise)

Task: code_analysis
Selected provider: claude
Model: claude-opus-4-5
Cost: $0.075 per 1K tokens
Fallback: gpt-4 (if budget exceeded)
```
## Routing Strategies

### Rule-Based Routing

```rust
let provider = match task_type {
    "code_generation" => "claude",
    "documentation" => "ollama", // Free
    "analysis" => "gpt-4",
    _ => "claude", // default
};
```
### Cost-Aware Routing

```rust
let provider = if budget_remaining < 50.0 { // dollars
    "gemini" // cheaper
} else if budget_remaining < 100.0 {
    "gpt-4"
} else {
    "claude" // most capable
};
```
### Quality-Aware Routing

```rust
let provider = match complexity_score {
    s if s > 0.8 => "claude", // Best quality
    s if s > 0.5 => "gpt-4",  // Good balance
    _ => "ollama",            // Fast & cheap
};
```
### Fallback Strategy

Always configure a fallback for when the budget is critical:

```rust
let primary = "claude";
let fallback = "ollama";

let provider = if budget_exceeded {
    fallback
} else {
    primary
};
```
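A single primary/fallback pair extends naturally to a preference-ordered chain: try each provider in turn and take the first one whose estimated cost fits the remaining budget. This is a hedged sketch; `select_with_fallback` and the toy cost estimates are illustrative, not `vapora-llm-router` functions:

```rust
/// First provider in `chain` whose estimated cost fits the budget.
/// (Illustrative helper, not part of vapora-llm-router.)
fn select_with_fallback<'a>(
    chain: &[&'a str],
    budget_remaining: f64,
    estimated_cost: impl Fn(&str) -> f64,
) -> Option<&'a str> {
    chain
        .iter()
        .copied()
        .find(|&p| estimated_cost(p) <= budget_remaining)
}

fn main() {
    let chain = ["claude", "gpt-4", "ollama"];
    // Toy per-request cost estimates in dollars (assumed values).
    let cost = |p: &str| match p {
        "claude" => 0.08,
        "gpt-4" => 0.05,
        _ => 0.0, // ollama runs locally, free
    };
    // With only $0.06 left, claude is skipped and gpt-4 is selected.
    let provider = select_with_fallback(&chain, 0.06, cost);
    println!("Selected: {:?}", provider); // Selected: Some("gpt-4")
}
```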
## Common Patterns

### Cost Optimization

```rust
// Use cheaper models for high-volume tasks
let provider = if task_count > 100 {
    "gemini" // $5/1M (cheaper than Claude at $15/1M)
} else {
    "claude"
};
```
### Multi-Step Tasks

```rust
// Step 1: Claude (expensive, high quality)
let analysis = route_to("claude", "analyze_code");

// Step 2: GPT-4 (medium cost)
let design = route_to("gpt-4", "design_solution");

// Step 3: Ollama (free)
let formatting = route_to("ollama", "format_output");
```
## Troubleshooting

**Q: "Provider not available"**
A: Check that the API keys are set in your environment:

```bash
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
```

**Q: "Budget exceeded"**
A: Use the fallback provider or wait for the budget reset.
## Next Steps

- Tutorial 4: Learning Profiles
- Example: `crates/vapora-llm-router/examples/02-budget-enforcement.rs`
## Reference

- Source: `crates/vapora-llm-router/src/router.rs`
- API docs: `cargo doc --open -p vapora-llm-router`