ADR-006: Rig Framework for LLM Agent Orchestration
Status: Accepted | Implemented
Date: 2024-11-01
Deciders: LLM Architecture Team
Technical Story: Selecting a Rust-native framework for LLM agent tool calling and streaming
Decision
Use rig-core 0.15 for LLM agent orchestration (not LangChain, not direct provider SDKs).
Rationale
- Rust-Native: No Python dependencies; compiles to a standalone binary
- Tool Calling Support: First-class abstraction for function calling
- Streaming: Built-in response streaming
- Minimal Abstraction: Thin wrapper over provider APIs (no over-engineering)
- Type Safety: Automatic schemas for tool definitions (see the sketch after this list)
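To illustrate the type-safety point, a minimal sketch (not rig's API; `CalculateArgs` and `parse_calculate_args` are hypothetical names): tool arguments live in a plain Rust struct with a serde derive, and the JSON schema handed to the provider sits next to it, so malformed arguments fail at the deserialization boundary instead of deep inside a handler.

```rust
// Sketch only: illustrates typed tool arguments; names are hypothetical.
use serde::Deserialize;
use serde_json::json;

/// Arguments the model must supply when it invokes the "calculate" tool.
#[derive(Debug, Deserialize)]
struct CalculateArgs {
    expression: String,
}

/// JSON schema advertised to the provider for the "calculate" tool.
fn calculate_schema() -> serde_json::Value {
    json!({
        "type": "object",
        "properties": { "expression": { "type": "string" } },
        "required": ["expression"]
    })
}

/// Deserialize the raw arguments the model returned; errors surface here.
fn parse_calculate_args(raw: &str) -> Result<CalculateArgs, serde_json::Error> {
    serde_json::from_str(raw)
}

fn main() {
    let args = parse_calculate_args(r#"{"expression": "2 + 2"}"#).unwrap();
    println!("expression = {}", args.expression);
    assert!(calculate_schema()["properties"]["expression"].is_object());
}
```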
Alternatives Considered
❌ LangChain (Python Bridge)
- Pros: Very mature, extensive tooling
- Cons: Requires a Python runtime; IPC complexity
❌ Direct Provider SDKs (Claude, OpenAI, etc.)
- Pros: Full control
- Cons: Re-implementing tool calling, streaming, and error handling multiple times
✅ Rig Framework (CHOSEN)
- Rust-native, thin abstraction
- Tool calling built-in
- Streaming support
Trade-offs
Pros:
- ✅ Rust-native (no Python dependency)
- ✅ Lean tool-calling abstraction
- ✅ Streaming responses
- ✅ Type-safe schemas
- ✅ Minimal memory footprint
Cons:
- ⚠️ Smaller community than LangChain
- ⚠️ Fewer examples/tutorials available
- ⚠️ Less frequent updates than the alternatives
Implementation
Agent with Tool Calling:
```rust
// crates/vapora-llm-router/src/providers.rs
use rig::client::Client;
use rig::completion::Prompt;
use serde_json::json;

let client = rig::client::OpenAIClient::new(&api_key);

// Define tool schema
let calculate_tool = rig::tool::Tool {
    name: "calculate".to_string(),
    description: "Perform arithmetic calculation".to_string(),
    schema: json!({
        "type": "object",
        "properties": {
            "expression": {"type": "string"}
        }
    }),
};

// Call with tool
let response = client
    .post_chat()
    .preamble("You are a helpful assistant")
    .user_message("What is 2 + 2?")
    .tool(calculate_tool)
    .call()
    .await?;
```
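The snippet above stops at sending the tool definition. A minimal sketch of the dispatch side, assuming the provider returns the tool name plus an arguments JSON string (the `dispatch_tool_call` helper is hypothetical and not part of rig):

```rust
// Hypothetical dispatch glue: route a tool call returned by the model to a handler.
use serde_json::Value;

fn dispatch_tool_call(name: &str, raw_args: &str) -> Result<String, String> {
    let args: Value = serde_json::from_str(raw_args).map_err(|e| e.to_string())?;
    match name {
        "calculate" => {
            // Extract the argument declared in the tool schema.
            let expr = args["expression"]
                .as_str()
                .ok_or_else(|| "missing 'expression'".to_string())?;
            // Evaluate the expression here (e.g. with an expression-parser crate)
            // and return the result as text for the model's next turn.
            Ok(format!("evaluated: {expr}"))
        }
        other => Err(format!("unknown tool: {other}")),
    }
}
```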
Streaming Responses:
```rust
// Stream chunks as they arrive
use futures::StreamExt;

let mut stream = client
    .post_chat()
    .user_message(prompt)
    .stream()
    .await?;

while let Some(chunk) = stream.next().await {
    match chunk {
        Ok(text) => println!("{}", text),
        Err(e) => eprintln!("Error: {:?}", e),
    }
}
```
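In the router, chunks usually need to be fanned out to a consumer while the full response is still assembled for logging or persistence. A generic sketch, assuming the tokio and futures crates (the `forward_stream` helper is hypothetical):

```rust
// Hypothetical helper (not part of rig): forward streamed chunks to a channel
// while accumulating the complete response text.
use futures::{Stream, StreamExt};
use tokio::sync::mpsc;

pub async fn forward_stream<S, E>(
    mut stream: S,
    tx: mpsc::Sender<String>,
) -> Result<String, E>
where
    S: Stream<Item = Result<String, E>> + Unpin,
{
    let mut full = String::new();
    while let Some(chunk) = stream.next().await {
        let text = chunk?;            // propagate provider errors
        full.push_str(&text);         // keep the complete response
        let _ = tx.send(text).await;  // best-effort forward to the consumer
    }
    Ok(full)
}
```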
Key Files:
- /crates/vapora-llm-router/src/providers.rs (provider implementations)
- /crates/vapora-llm-router/src/router.rs (routing logic)
- /crates/vapora-agents/src/executor.rs (agent task execution)
Verification
```bash
# Test tool calling
cargo test -p vapora-llm-router test_tool_calling

# Test streaming
cargo test -p vapora-llm-router test_streaming_response

# Integration test with real provider
cargo test -p vapora-llm-router test_agent_execution -- --nocapture

# Benchmark tool calling latency
cargo bench -p vapora-llm-router bench_tool_response_time
```
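As a rough sketch of what the first test could assert (the real test_tool_calling may differ): deserialize a captured tool-call payload and check that the parameters arrived intact.

```rust
// Hypothetical shape of test_tool_calling; the real test may differ.
#[cfg(test)]
mod tests {
    use serde_json::Value;

    #[test]
    fn test_tool_calling() {
        // A captured tool-call payload as a provider would return it.
        let payload = r#"{"name": "calculate", "arguments": {"expression": "2 + 2"}}"#;
        let call: Value = serde_json::from_str(payload).unwrap();

        assert_eq!(call["name"], "calculate");
        assert_eq!(call["arguments"]["expression"], "2 + 2");
    }
}
```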
Expected Output:
- Tools invoked correctly with parameters
- Streaming chunks received in order
- Agent executes tasks and returns results
- Latency < 100ms per tool call
Consequences
Developer Workflow
- Tool schemas defined in code (type-safe)
- No Python bridge debugging complexity
- Single-language stack (all Rust)
Performance
- Minimal latency (direct to provider APIs)
- Streaming reduces perceived latency
- Tool calling has <50ms overhead
Future Extensibility
- Adding new providers: implement the LLMClient trait (see the sketch after this list)
- Custom tools: define schema + handler in Rust
- See ADR-007 (Multi-Provider Support)
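A sketch of what that extension point could look like, assuming the async-trait and anyhow crates; the actual LLMClient trait in router.rs/providers.rs may differ:

```rust
// Sketch only: the real LLMClient trait in vapora-llm-router may differ.
use futures::stream::BoxStream;

#[async_trait::async_trait]
pub trait LLMClient: Send + Sync {
    /// One-shot completion with a system preamble and a user prompt.
    async fn complete(&self, preamble: &str, prompt: &str) -> anyhow::Result<String>;

    /// Streaming completion: yields text chunks as the provider produces them.
    async fn stream(
        &self,
        preamble: &str,
        prompt: &str,
    ) -> anyhow::Result<BoxStream<'static, anyhow::Result<String>>>;
}
```

Under this shape, adding a provider is one impl block in providers.rs; the router itself stays untouched.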
References
- Rig Framework Documentation
- /crates/vapora-llm-router/src/providers.rs (provider abstractions)
- /crates/vapora-agents/src/executor.rs (agent execution)
Related ADRs: ADR-007 (Multi-Provider LLM), ADR-001 (Workspace)