ADR-006: Rig Framework for LLM Agent Orchestration
Status: Accepted | Implemented
Date: 2024-11-01
Deciders: LLM Architecture Team
Technical Story: Selecting a Rust-native framework for LLM agent tool calling and streaming
Decision
Use rig-core 0.15 for LLM agent orchestration (not LangChain, not direct provider SDKs).
Rationale
- Rust-Native: No Python dependencies; compiles to a standalone binary
- Tool Calling Support: First-class abstraction for function calling
- Streaming: Built-in response streaming
- Minimal Abstraction: Thin wrapper over provider APIs, no over-engineering (see the sketch after this list)
- Type Safety: Automatic schemas for tool definitions
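To illustrate how thin the wrapper is, the sketch below builds an agent and sends a single prompt. It assumes rig's OpenAI provider; the model id and error handling are illustrative, and exact builder names vary by rig-core release.
// Minimal agent sketch (illustrative, not the project's actual configuration)
use rig::completion::Prompt;
use rig::providers::openai;

async fn ask() -> anyhow::Result<String> {
    let client = openai::Client::from_env(); // reads the provider API key from the environment
    let agent = client
        .agent("gpt-4")                      // provider-specific model id
        .preamble("You are a helpful assistant")
        .build();
    Ok(agent.prompt("What is 2 + 2?").await?)
}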
Alternatives Considered
❌ LangChain (Python Bridge)
- Pros: Very mature, extensive tooling
- Cons: Requires a Python runtime; adds IPC complexity
❌ Direct Provider SDKs (Claude, OpenAI, etc.)
- Pros: Full control
- Cons: Reimplementing tool calling, streaming, and error handling for each provider
✅ Rig Framework (CHOSEN)
- Rust-native, thin abstraction
- Tool calling built-in
- Streaming support
Trade-offs
Pros:
- ✅ Rust-native (no Python dependency)
- ✅ Minimal tool calling abstraction
- ✅ Streaming responses
- ✅ Type-safe schemas
- ✅ Minimal memory footprint
Cons:
- ⚠️ Smaller community than LangChain
- ⚠️ Fewer examples and tutorials available
- ⚠️ Less frequent updates than the alternatives
Implementation
Agent with Tool Calling (a sketch following rig's documented Tool trait pattern; exact module paths and builder methods vary across rig-core 0.x releases):
// crates/vapora-llm-router/src/providers.rs
use rig::completion::{Prompt, ToolDefinition};
use rig::providers::openai;
use rig::tool::Tool;
use serde::Deserialize;
use serde_json::json;

// Typed arguments: rig deserializes the model's tool-call JSON into this struct.
#[derive(Deserialize)]
struct CalculateArgs {
    expression: String,
}

struct Calculate;

impl Tool for Calculate {
    const NAME: &'static str = "calculate";
    type Error = std::convert::Infallible;
    type Args = CalculateArgs;
    type Output = String;

    // Schema advertised to the provider for function calling.
    async fn definition(&self, _prompt: String) -> ToolDefinition {
        ToolDefinition {
            name: Self::NAME.to_string(),
            description: "Perform arithmetic calculation".to_string(),
            parameters: json!({
                "type": "object",
                "properties": { "expression": { "type": "string" } }
            }),
        }
    }

    async fn call(&self, args: Self::Args) -> Result<Self::Output, Self::Error> {
        Ok(format!("evaluated: {}", args.expression)) // real evaluation stubbed for brevity
    }
}

// Call with the tool registered on the agent
let client = openai::Client::new(&api_key);
let agent = client
    .agent("gpt-4")
    .preamble("You are a helpful assistant")
    .tool(Calculate)
    .build();
let response = agent.prompt("What is 2 + 2?").await?;
Streaming Responses (a sketch; the streaming trait and chunk types differ between rig-core releases, so treat the signatures as approximate):
// Stream chunks as they arrive instead of blocking on the full completion.
use futures::StreamExt;
use rig::streaming::StreamingPrompt;

let mut stream = agent.stream_prompt(prompt).await;
while let Some(chunk) = stream.next().await {
    match chunk {
        Ok(text) => print!("{text}"),                 // partial text chunk
        Err(e) => eprintln!("stream error: {:?}", e), // surface provider errors mid-stream
    }
}
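In the agent executor, streamed chunks are typically forwarded to the caller while the full completion is accumulated. The following generic relay is a sketch of that pattern; the function name and the String chunk type are illustrative, not the actual executor API.
// Hypothetical helper: forward streamed chunks to a channel while
// accumulating the full response for the caller.
use futures::StreamExt;
use tokio::sync::mpsc;

async fn relay_stream(
    mut stream: impl futures::Stream<Item = Result<String, anyhow::Error>> + Unpin,
    tx: mpsc::Sender<String>,
) -> anyhow::Result<String> {
    let mut full = String::new();
    while let Some(chunk) = stream.next().await {
        let text = chunk?;
        full.push_str(&text);
        // Ignore send errors: the consumer may have disconnected mid-stream.
        let _ = tx.send(text).await;
    }
    Ok(full)
}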
Key Files:
- /crates/vapora-llm-router/src/providers.rs (provider implementations)
- /crates/vapora-llm-router/src/router.rs (routing logic)
- /crates/vapora-agents/src/executor.rs (agent task execution)
Verification
# Test tool calling
cargo test -p vapora-llm-router test_tool_calling
# Test streaming
cargo test -p vapora-llm-router test_streaming_response
# Integration test with real provider
cargo test -p vapora-llm-router test_agent_execution -- --nocapture
# Benchmark tool calling latency
cargo bench -p vapora-llm-router bench_tool_response_time
Expected Output:
- Tools invoked correctly with parameters
- Streaming chunks received in order
- Agent executes tasks and returns results
- Latency < 100ms per tool call
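A minimal sketch of what the test_tool_calling check might assert, calling the hypothetical Calculate tool from the implementation section directly (illustrative; the real test also exercises the provider round-trip):
use rig::tool::Tool;

#[tokio::test]
async fn test_tool_calling() {
    // Arguments deserialize into the typed struct and the handler returns its
    // output without going through an LLM provider.
    let out = Calculate
        .call(CalculateArgs { expression: "2 + 2".into() })
        .await
        .unwrap();
    assert!(out.contains("2 + 2"));
}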
Consequences
Developer Workflow
- Tool schemas defined in code (type-safe)
- No Python bridge debugging complexity
- Single-language stack (all Rust)
Performance
- Minimal latency (direct to provider APIs)
- Streaming reduces perceived latency
- Tool calling has <50ms overhead (see the benchmark sketch below)
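A sketch of how bench_tool_response_time could measure local tool-dispatch overhead with criterion, assuming the hypothetical Calculate tool above is exported from the crate; it deliberately excludes the provider round-trip.
// benches/bench_tool_response_time.rs (illustrative)
use criterion::{criterion_group, criterion_main, Criterion};
use rig::tool::Tool;

fn bench_tool_dispatch(c: &mut Criterion) {
    let rt = tokio::runtime::Runtime::new().unwrap();
    c.bench_function("tool_dispatch_overhead", |b| {
        b.iter(|| {
            // Measures argument handling + handler execution only, not network latency.
            rt.block_on(Calculate.call(CalculateArgs { expression: "2 + 2".into() }))
        })
    });
}

criterion_group!(benches, bench_tool_dispatch);
criterion_main!(benches);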
Future Extensibility
- Adding new providers: implement the LLMClient trait (sketched below)
- Custom tools: define schema + handler in Rust
- See ADR-007 (Multi-Provider Support)
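A rough sketch of what such a provider trait could look like in vapora-llm-router; the trait name comes from this ADR, but the method set shown here is an assumption, not the actual interface.
use async_trait::async_trait;

// Hypothetical provider abstraction; see ADR-007 for the multi-provider design.
#[async_trait]
pub trait LLMClient: Send + Sync {
    /// Single-shot completion for a prompt.
    async fn complete(&self, prompt: &str) -> anyhow::Result<String>;

    /// Streamed completion, yielding partial text chunks.
    async fn complete_stream(
        &self,
        prompt: &str,
    ) -> anyhow::Result<futures::stream::BoxStream<'static, anyhow::Result<String>>>;
}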
References
- Rig Framework Documentation
- /crates/vapora-llm-router/src/providers.rs (provider abstractions)
- /crates/vapora-agents/src/executor.rs (agent execution)
Related ADRs: ADR-007 (Multi-Provider LLM), ADR-001 (Workspace)