ADR-006: Rig Framework for LLM Agent Orchestration

Status: Accepted | Implemented
Date: 2024-11-01
Deciders: LLM Architecture Team
Technical Story: Selecting a Rust-native framework for LLM agent tool calling and streaming


Decision

Use rig-core 0.15 for LLM agent orchestration (not LangChain, not direct provider SDKs).


Rationale

  1. Rust-Native: No Python dependencies; compiles to a standalone binary
  2. Tool Calling Support: First-class abstraction for function calling
  3. Streaming: Built-in response streaming
  4. Minimal Abstraction: Thin wrapper over provider APIs (no over-engineering)
  5. Type Safety: Automatic schemas for tool definitions (see the sketch after this list)

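Point 5 in practice: rig hands each tool its arguments as typed Rust data, so a malformed payload from the model fails at the serde boundary instead of deep inside handler logic. A minimal sketch, assuming a hypothetical CalculateArgs type (illustrative, not part of rig):

use serde::Deserialize;

// Typed view of the JSON arguments an LLM sends when calling a tool.
#[derive(Debug, Deserialize)]
struct CalculateArgs {
    expression: String,
}

fn parse_tool_args(raw: &str) -> Result<CalculateArgs, serde_json::Error> {
    // {"expression": "2 + 2"} parses; {"expr": 3} is rejected here.
    serde_json::from_str(raw)
}
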
Alternatives Considered

❌ LangChain (Python Bridge)

  • Pros: Very mature, extensive tooling
  • Cons: Requires a Python runtime, IPC complexity

❌ Direct Provider SDKs (Claude, OpenAI, etc.)

  • Pros: Full control
  • Cons: Reimplementing tool calling, streaming, and error handling once per provider

✅ Rig Framework (CHOSEN)

  • Rust-native, thin abstraction
  • Tool calling built-in
  • Streaming support

Trade-offs

Pros:

  • ✅ Rust-native (no Python dependency)
  • ✅ Lean tool-calling abstraction
  • ✅ Streaming responses
  • ✅ Type-safe schemas
  • ✅ Minimal memory footprint

Cons:

  • ⚠️ Smaller community than LangChain
  • ⚠️ Fewer examples/tutorials available
  • ⚠️ Less frequent updates than alternatives

Implementation

Agent with Tool Calling:

// crates/vapora-llm-router/src/providers.rs
// Sketch against rig-core's Tool trait and agent builder; exact
// signatures shift between rig releases, so treat this as illustrative.
use rig::completion::{Prompt, ToolDefinition};
use rig::providers::openai;
use rig::tool::Tool;
use serde::Deserialize;
use serde_json::json;

// Typed arguments: malformed tool-call JSON fails at deserialization.
#[derive(Deserialize)]
struct CalculateArgs {
    expression: String,
}

struct Calculator;

impl Tool for Calculator {
    const NAME: &'static str = "calculate";
    type Error = std::convert::Infallible;
    type Args = CalculateArgs;
    type Output = String;

    // Schema the model sees when deciding whether to call the tool.
    async fn definition(&self, _prompt: String) -> ToolDefinition {
        ToolDefinition {
            name: Self::NAME.to_string(),
            description: "Perform arithmetic calculation".to_string(),
            parameters: json!({
                "type": "object",
                "properties": {
                    "expression": {"type": "string"}
                }
            }),
        }
    }

    async fn call(&self, args: Self::Args) -> Result<Self::Output, Self::Error> {
        // Real expression evaluation elided.
        Ok(format!("evaluated: {}", args.expression))
    }
}

// In an async context, with `api_key` in scope:
let client = openai::Client::new(&api_key);
let agent = client
    .agent("gpt-4")
    .preamble("You are a helpful assistant")
    .tool(Calculator)
    .build();

let response = agent.prompt("What is 2 + 2?").await?;

Streaming Responses:

// Stream chunks as they arrive. `stream_prompt` is rig's streaming
// entry point; the concrete chunk type varies across rig versions,
// so the match below is a sketch.
use futures::StreamExt;
use rig::streaming::StreamingPrompt;

let mut stream = agent.stream_prompt(prompt).await?;

while let Some(chunk) = stream.next().await {
    match chunk {
        Ok(text) => print!("{}", text),
        Err(e) => eprintln!("stream error: {:?}", e),
    }
}

Key Files:

  • /crates/vapora-llm-router/src/providers.rs (provider implementations)
  • /crates/vapora-llm-router/src/router.rs (routing logic; sketched below)
  • /crates/vapora-agents/src/executor.rs (agent task execution)

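For orientation, the routing step in router.rs can be pictured as a model-name dispatch. This is a hypothetical sketch of the idea, not the actual vapora-llm-router code:

// Hypothetical: map a requested model name to a provider backend.
enum Provider {
    OpenAI,
    Claude,
}

fn route(model: &str) -> Option<Provider> {
    match model {
        m if m.starts_with("gpt-") => Some(Provider::OpenAI),
        m if m.starts_with("claude-") => Some(Provider::Claude),
        _ => None,
    }
}
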
Verification

# Test tool calling
cargo test -p vapora-llm-router test_tool_calling

# Test streaming
cargo test -p vapora-llm-router test_streaming_response

# Integration test with real provider
cargo test -p vapora-llm-router test_agent_execution -- --nocapture

# Benchmark tool calling latency
cargo bench -p vapora-llm-router bench_tool_response_time

Expected Output:

  • Tools invoked correctly with parameters (hypothetical test sketch after this list)
  • Streaming chunks received in order
  • Agent executes tasks and returns results
  • Latency < 100ms per tool call
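
The first check can be exercised without a live provider by calling the tool handler directly with typed arguments. A hypothetical sketch of what test_tool_calling might look like, reusing the Calculator tool from the Implementation section:

// Hypothetical: invoke the tool handler directly and assert on its output.
#[tokio::test]
async fn test_tool_calling() {
    let tool = Calculator;
    let args = CalculateArgs { expression: "2 + 2".to_string() };
    let out = tool.call(args).await.unwrap();
    assert!(out.contains("2 + 2"));
}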

Consequences

Developer Workflow

  • Tool schemas defined in code (type-safe)
  • No Python bridge debugging complexity
  • Single-language stack (all Rust)

Performance

  • Minimal latency (direct to provider APIs)
  • Streaming reduces perceived latency
  • Tool calling has <50 ms overhead (benchmark sketch after this list)
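
The overhead claim can be checked without network round trips by benchmarking only the local dispatch path. A hedged criterion sketch; the real bench_tool_response_time may measure more:

use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Hypothetical: measure local overhead (argument parsing/dispatch) only,
// so provider network latency is excluded.
fn bench_tool_dispatch(c: &mut Criterion) {
    c.bench_function("tool_args_roundtrip", |b| {
        b.iter(|| {
            let raw = r#"{"expression": "2 + 2"}"#;
            let v: serde_json::Value = serde_json::from_str(raw).unwrap();
            black_box(v)
        })
    });
}

criterion_group!(benches, bench_tool_dispatch);
criterion_main!(benches);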

Future Extensibility

  • Adding new providers: implement the LLMClient trait (sketch after this list)
  • Custom tools: define schema + handler in Rust
  • See ADR-007 (Multi-Provider Support)
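
A sketch of that extension point, assuming LLMClient is an async trait with a single completion method; the shape below is an assumption, and the real definition lives in providers.rs:

use async_trait::async_trait;

// Assumed shape of the LLMClient extension point; see providers.rs
// for the real definition.
#[async_trait]
pub trait LLMClient: Send + Sync {
    async fn complete(&self, prompt: &str) -> anyhow::Result<String>;
}

pub struct NewProvider {
    pub api_key: String,
}

#[async_trait]
impl LLMClient for NewProvider {
    async fn complete(&self, prompt: &str) -> anyhow::Result<String> {
        // Delegate to the provider's rig client here.
        todo!("call the provider API with {prompt}")
    }
}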

References

  • Rig Framework Documentation
  • /crates/vapora-llm-router/src/providers.rs (provider abstractions)
  • /crates/vapora-agents/src/executor.rs (agent execution)

Related ADRs: ADR-007 (Multi-Provider LLM), ADR-001 (Workspace)