# ADR-006: Rig Framework for LLM Agent Orchestration

**Status**: Accepted | Implemented

**Date**: 2024-11-01

**Deciders**: LLM Architecture Team

**Technical Story**: Selecting a Rust-native framework for LLM agent tool calling and streaming

---

## Decision

Use **rig-core 0.15** for LLM agent orchestration (not LangChain, and not direct provider SDKs).
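A minimal sketch of the corresponding manifest entry, assuming `rig-core` is pulled straight from crates.io (any feature flags are omitted here):

```toml
# Cargo.toml
[dependencies]
rig-core = "0.15"
```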
---
## Rationale

1. **Rust-Native**: No Python dependencies; compiles to a standalone binary
2. **Tool Calling Support**: First-class abstraction for function calling
3. **Streaming**: Built-in response streaming
4. **Minimal Abstraction**: Thin wrapper over provider APIs, no over-engineering (see the sketch after this list)
5. **Type Safety**: Automatic schemas for tool definitions
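As an illustration of point 4, a minimal one-shot completion sketched against rig's agent-builder API (the model name and exact signatures are assumptions and may differ by rig version):

```rust
use rig::completion::Prompt;
use rig::providers::openai;

// One-shot completion: rig stays close to the provider's own chat API,
// with no chains or graph machinery in between.
let client = openai::Client::new(&std::env::var("OPENAI_API_KEY")?);
let answer = client
    .agent("gpt-4")
    .build()
    .prompt("Summarize this ADR in one sentence.")
    .await?;
println!("{answer}");
```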
---
## Alternatives Considered

### ❌ LangChain (Python Bridge)

- **Pros**: Very mature, extensive tooling
- **Cons**: Requires a Python runtime; adds IPC complexity

### ❌ Direct Provider SDKs (Claude, OpenAI, etc.)

- **Pros**: Full control
- **Cons**: Tool calling, streaming, and error handling must be reimplemented for every provider

### ✅ Rig Framework (CHOSEN)

- Rust-native, thin abstraction
- Tool calling built-in
- Streaming support

---
## Trade-offs

**Pros**:

- ✅ Rust-native (no Python dependency)
- ✅ Lean tool-calling abstraction
- ✅ Streaming responses
- ✅ Type-safe schemas
- ✅ Minimal memory footprint

**Cons**:

- ⚠️ Smaller community than LangChain
- ⚠️ Fewer examples and tutorials available
- ⚠️ Less frequent releases than the alternatives

---
## Implementation

**Agent with Tool Calling** (a sketch using rig's `Tool` trait and agent builder; exact signatures may vary between rig versions):

```rust
// crates/vapora-llm-router/src/providers.rs
use rig::completion::{Prompt, ToolDefinition};
use rig::providers::openai;
use rig::tool::Tool;
use serde::Deserialize;
use serde_json::json;

// Typed arguments: rig deserializes the model's JSON arguments into this
// struct, so a malformed tool call fails at the boundary.
#[derive(Deserialize)]
struct CalcArgs {
    expression: String,
}

struct Calculate;

impl Tool for Calculate {
    const NAME: &'static str = "calculate";
    type Error = std::convert::Infallible;
    type Args = CalcArgs;
    type Output = String;

    async fn definition(&self, _prompt: String) -> ToolDefinition {
        ToolDefinition {
            name: Self::NAME.to_string(),
            description: "Perform arithmetic calculation".to_string(),
            parameters: json!({
                "type": "object",
                "properties": {
                    "expression": { "type": "string" }
                }
            }),
        }
    }

    async fn call(&self, args: Self::Args) -> Result<Self::Output, Self::Error> {
        // Real expression evaluation elided for brevity.
        Ok(format!("evaluated: {}", args.expression))
    }
}

// Inside an async fn: build an agent with the tool attached, then prompt it.
let client = openai::Client::new(&api_key);
let agent = client
    .agent("gpt-4")
    .preamble("You are a helpful assistant")
    .tool(Calculate)
    .build();

let response = agent.prompt("What is 2 + 2?").await?;
```
**Streaming Responses** (a sketch; the streaming chunk type differs across rig releases):

```rust
use futures::StreamExt;
use rig::streaming::{StreamingChoice, StreamingPrompt};

// Stream chunks as they arrive instead of waiting for the full completion.
let mut stream = agent.stream_prompt(prompt).await?;

while let Some(chunk) = stream.next().await {
    match chunk {
        // Print text deltas as they arrive.
        Ok(StreamingChoice::Message(text)) => print!("{}", text),
        // Tool-call chunks and other variants are ignored in this sketch.
        Ok(_) => {}
        Err(e) => eprintln!("Error: {:?}", e),
    }
}
```
**Key Files**:

- `/crates/vapora-llm-router/src/providers.rs` (provider implementations)
- `/crates/vapora-llm-router/src/router.rs` (routing logic)
- `/crates/vapora-agents/src/executor.rs` (agent task execution)

---
## Verification

```bash
# Test tool calling
cargo test -p vapora-llm-router test_tool_calling

# Test streaming
cargo test -p vapora-llm-router test_streaming_response

# Integration test with real provider
cargo test -p vapora-llm-router test_agent_execution -- --nocapture

# Benchmark tool calling latency
cargo bench -p vapora-llm-router bench_tool_response_time
```
**Expected Output**:

- Tools invoked correctly with parameters
- Streaming chunks received in order
- Agent executes tasks and returns results
- Latency < 100ms per tool call
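For reference, a hypothetical shape for `test_tool_calling`, exercising the `Tool` implementation directly without touching a real provider (assumes tokio as the async test runtime; the repository's actual test may differ):

```rust
use rig::tool::Tool;

#[tokio::test]
async fn test_tool_calling() {
    // Invoke the tool handler directly with typed arguments.
    let out = Calculate
        .call(CalcArgs { expression: "2 + 2".to_string() })
        .await
        .expect("tool call should succeed");
    assert!(out.contains("2 + 2"));
}
```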
---
## Consequences

### Developer Workflow

- Tool schemas defined in code (type-safe)
- No Python bridge debugging complexity
- Single-language stack (all Rust)

### Performance

- Minimal latency (requests go straight to provider APIs)
- Streaming reduces perceived latency
- Tool calling adds <50ms overhead

### Future Extensibility

- Adding new providers: implement the `LLMClient` trait (see the sketch after this list)
- Custom tools: define schema + handler in Rust
- See ADR-007 (Multi-Provider Support)
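A minimal sketch of what that provider trait could look like (hypothetical: the `LLMClient` name comes from this ADR, but the method set, along with the `async_trait` and `anyhow` dependencies, is an assumption rather than the actual trait in `vapora-llm-router`):

```rust
use async_trait::async_trait;

// Hypothetical provider abstraction: each new provider implements this
// trait and registers with the router.
#[async_trait]
pub trait LLMClient: Send + Sync {
    /// Identifier used by the routing logic (e.g. "openai", "claude").
    fn provider_name(&self) -> &'static str;

    /// Single-shot completion for a prompt.
    async fn complete(&self, prompt: &str) -> anyhow::Result<String>;
}
```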
---
## References

- [Rig Framework Documentation](https://github.com/0xPlaygrounds/rig)
- `/crates/vapora-llm-router/src/providers.rs` (provider abstractions)
- `/crates/vapora-agents/src/executor.rs` (agent execution)

---

**Related ADRs**: ADR-007 (Multi-Provider LLM), ADR-001 (Workspace)