ADR-006: Rig Framework for LLM Agent Orchestration
Status: Accepted | Implemented
Date: 2024-11-01
Deciders: LLM Architecture Team
Technical Story: Selecting a Rust-native framework for LLM agent tool calling and streaming
Decision
Use rig-core 0.15 for LLM agent orchestration (not LangChain, not direct provider SDKs).
Rationale
- Rust-Native: No Python dependencies; compiles to a standalone binary
- Tool Calling Support: First-class abstraction for function calling
- Streaming: Built-in response streaming
- Minimal Abstraction: Thin wrapper over provider APIs (no over-engineering)
- Type Safety: Automatic schemas for tool definitions (see the sketch after this list)
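To illustrate the type-safety point, a minimal sketch (not rig's API; `CalculateArgs` and `parse_calculate_args` are hypothetical names): tool arguments live in a plain Rust struct with a serde derive, and the JSON schema handed to the provider sits next to it, so malformed arguments fail at the deserialization boundary instead of deep inside a handler.

```rust
// Sketch only: illustrates typed tool arguments; names are hypothetical.
use serde::Deserialize;
use serde_json::json;

/// Arguments the model must supply when it invokes the "calculate" tool.
#[derive(Debug, Deserialize)]
struct CalculateArgs {
    expression: String,
}

/// JSON schema advertised to the provider for the "calculate" tool.
fn calculate_schema() -> serde_json::Value {
    json!({
        "type": "object",
        "properties": { "expression": { "type": "string" } },
        "required": ["expression"]
    })
}

/// Deserialize the raw arguments the model returned; errors surface here.
fn parse_calculate_args(raw: &str) -> Result<CalculateArgs, serde_json::Error> {
    serde_json::from_str(raw)
}

fn main() {
    let args = parse_calculate_args(r#"{"expression": "2 + 2"}"#).unwrap();
    println!("expression = {}", args.expression);
    assert!(calculate_schema()["properties"]["expression"].is_object());
}
```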
Alternatives Considered
❌ LangChain (Python Bridge)
- Pros: Very mature, extensive tooling
- Cons: Requires a Python runtime; IPC complexity
❌ Direct Provider SDKs (Claude, OpenAI, etc.)
- Pros: Full control
- Cons: Re-implementing tool calling, streaming, and error handling multiple times
✅ Rig Framework (CHOSEN)
- Rust-native, thin abstraction
- Tool calling built-in
- Streaming support
Trade-offs
Pros:
- ✅ Rust-native (no Python dependency)
- ✅ Lean tool-calling abstraction
- ✅ Streaming responses
- ✅ Type-safe schemas
- ✅ Minimal memory footprint
Cons:
- ⚠️ Smaller community than LangChain
- ⚠️ Fewer examples/tutorials available
- ⚠️ Less frequent updates than the alternatives
Implementation
Agent with Tool Calling:
```rust
// crates/vapora-llm-router/src/providers.rs
use rig::client::Client;
use rig::completion::Prompt;
use serde_json::json;

let client = rig::client::OpenAIClient::new(&api_key);

// Define tool schema
let calculate_tool = rig::tool::Tool {
    name: "calculate".to_string(),
    description: "Perform arithmetic calculation".to_string(),
    schema: json!({
        "type": "object",
        "properties": {
            "expression": {"type": "string"}
        }
    }),
};

// Call with tool
let response = client
    .post_chat()
    .preamble("You are a helpful assistant")
    .user_message("What is 2 + 2?")
    .tool(calculate_tool)
    .call()
    .await?;
```
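The snippet above stops at sending the tool definition. A minimal sketch of the dispatch side, assuming the provider returns the tool name plus an arguments JSON string (the `dispatch_tool_call` helper is hypothetical and not part of rig):

```rust
// Hypothetical dispatch glue: route a tool call returned by the model to a handler.
use serde_json::Value;

fn dispatch_tool_call(name: &str, raw_args: &str) -> Result<String, String> {
    let args: Value = serde_json::from_str(raw_args).map_err(|e| e.to_string())?;
    match name {
        "calculate" => {
            // Extract the argument declared in the tool schema.
            let expr = args["expression"]
                .as_str()
                .ok_or_else(|| "missing 'expression'".to_string())?;
            // Evaluate the expression here (e.g. with an expression-parser crate)
            // and return the result as text for the model's next turn.
            Ok(format!("evaluated: {expr}"))
        }
        other => Err(format!("unknown tool: {other}")),
    }
}
```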
Streaming Responses:
```rust
// Stream chunks as they arrive
use futures::StreamExt;

let mut stream = client
    .post_chat()
    .user_message(prompt)
    .stream()
    .await?;

while let Some(chunk) = stream.next().await {
    match chunk {
        Ok(text) => println!("{}", text),
        Err(e) => eprintln!("Error: {:?}", e),
    }
}
```
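In the router, chunks usually need to be fanned out to a consumer while the full response is still assembled for logging or persistence. A generic sketch, assuming the tokio and futures crates (the `forward_stream` helper is hypothetical):

```rust
// Hypothetical helper (not part of rig): forward streamed chunks to a channel
// while accumulating the complete response text.
use futures::{Stream, StreamExt};
use tokio::sync::mpsc;

pub async fn forward_stream<S, E>(
    mut stream: S,
    tx: mpsc::Sender<String>,
) -> Result<String, E>
where
    S: Stream<Item = Result<String, E>> + Unpin,
{
    let mut full = String::new();
    while let Some(chunk) = stream.next().await {
        let text = chunk?;            // propagate provider errors
        full.push_str(&text);         // keep the complete response
        let _ = tx.send(text).await;  // best-effort forward to the consumer
    }
    Ok(full)
}
```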
Key Files:
- /crates/vapora-llm-router/src/providers.rs (provider implementations)
- /crates/vapora-llm-router/src/router.rs (routing logic)
- /crates/vapora-agents/src/executor.rs (agent task execution)
Verification
```bash
# Test tool calling
cargo test -p vapora-llm-router test_tool_calling

# Test streaming
cargo test -p vapora-llm-router test_streaming_response

# Integration test with real provider
cargo test -p vapora-llm-router test_agent_execution -- --nocapture

# Benchmark tool calling latency
cargo bench -p vapora-llm-router bench_tool_response_time
```
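As a rough sketch of what the first test could assert (the real test_tool_calling may differ): deserialize a captured tool-call payload and check that the parameters arrived intact.

```rust
// Hypothetical shape of test_tool_calling; the real test may differ.
#[cfg(test)]
mod tests {
    use serde_json::Value;

    #[test]
    fn test_tool_calling() {
        // A captured tool-call payload as a provider would return it.
        let payload = r#"{"name": "calculate", "arguments": {"expression": "2 + 2"}}"#;
        let call: Value = serde_json::from_str(payload).unwrap();

        assert_eq!(call["name"], "calculate");
        assert_eq!(call["arguments"]["expression"], "2 + 2");
    }
}
```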
Expected Output:
- Tools invoked correctly with parameters
- Streaming chunks received in order
- Agent executes tasks and returns results
- Latency < 100ms per tool call
Consequences
Developer Workflow
- Tool schemas defined in code (type-safe)
- No Python bridge debugging complexity
- Single-language stack (all Rust)
Performance
- Minimal latency (direct to provider APIs)
- Streaming reduces perceived latency
- Tool calling has <50ms overhead
Future Extensibility
- Adding new providers: implement the LLMClient trait (see the sketch after this list)
- Custom tools: define schema + handler in Rust
- See ADR-007 (Multi-Provider Support)
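A sketch of what that extension point could look like, assuming the async-trait and anyhow crates; the actual LLMClient trait in router.rs/providers.rs may differ:

```rust
// Sketch only: the real LLMClient trait in vapora-llm-router may differ.
use futures::stream::BoxStream;

#[async_trait::async_trait]
pub trait LLMClient: Send + Sync {
    /// One-shot completion with a system preamble and a user prompt.
    async fn complete(&self, preamble: &str, prompt: &str) -> anyhow::Result<String>;

    /// Streaming completion: yields text chunks as the provider produces them.
    async fn stream(
        &self,
        preamble: &str,
        prompt: &str,
    ) -> anyhow::Result<BoxStream<'static, anyhow::Result<String>>>;
}
```

Under this shape, adding a provider is one impl block in providers.rs; the router itself stays untouched.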
References
- Rig Framework Documentation
- /crates/vapora-llm-router/src/providers.rs (provider abstractions)
- /crates/vapora-agents/src/executor.rs (agent execution)
Related ADRs: ADR-007 (Multi-Provider LLM), ADR-001 (Workspace)