# ADR-006: Rig Framework for LLM Agent Orchestration
**Status**: Accepted | Implemented
**Date**: 2024-11-01
**Deciders**: LLM Architecture Team
**Technical Story**: Selecting Rust-native framework for LLM agent tool calling and streaming
---
## Decision
Use **rig-core 0.15** for LLM agent orchestration (not LangChain, not direct provider SDKs).
---
## Rationale
1. **Rust-Native**: No Python dependencies; compiles to a standalone binary
2. **Tool Calling Support**: First-class abstraction for function calling
3. **Streaming**: Built-in response streaming
4. **Minimal Abstraction**: Thin wrapper over provider APIs (no over-engineering)
5. **Type Safety**: Automatic schemas for tool definitions
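The type-safety point can be illustrated with a minimal sketch. The names below (`ToolSchema`, `CalculateParams`) are illustrative, not the actual rig API: the idea is that each tool's parameters are a strongly typed struct, and the JSON schema is produced by a trait implementation, so schema and handler cannot drift apart silently.

```rust
// Std-only sketch of type-safe tool schemas; in practice rig/serde
// would derive the schema instead of it being written by hand.
#[derive(Debug)]
struct CalculateParams {
    expression: String,
}

trait ToolSchema {
    /// Name the LLM uses to invoke the tool.
    fn name() -> &'static str;
    /// JSON schema describing the tool's parameters.
    fn schema() -> String;
}

impl ToolSchema for CalculateParams {
    fn name() -> &'static str {
        "calculate"
    }
    fn schema() -> String {
        r#"{"type":"object","properties":{"expression":{"type":"string"}}}"#.to_string()
    }
}

fn main() {
    // The handler receives typed parameters, not raw JSON.
    let params = CalculateParams { expression: "2 + 2".to_string() };
    println!("tool: {}", CalculateParams::name());
    println!("schema: {}", CalculateParams::schema());
    println!("expression: {}", params.expression);
}
```

Because `CalculateParams` is an ordinary Rust struct, renaming a field breaks both the handler and the schema impl at compile time rather than failing at runtime inside the provider call.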
---
## Alternatives Considered
### ❌ LangChain (Python Bridge)
- **Pros**: Very mature, extensive tooling
- **Cons**: Requires a Python runtime; IPC complexity
### ❌ Direct Provider SDKs (Claude, OpenAI, etc.)
- **Pros**: Full control
- **Cons**: Reimplementing tool calling, streaming, and error handling multiple times
### ✅ Rig Framework (CHOSEN)
- Rust-native, thin abstraction
- Tool calling built-in
- Streaming support
---
## Trade-offs
**Pros**:
- ✅ Rust-native (no Python dependency)
- ✅ Minimal tool calling abstraction
- ✅ Streaming responses
- ✅ Type-safe schemas
- ✅ Minimal memory footprint
**Cons**:
- ⚠️ Smaller community than LangChain
- ⚠️ Fewer examples/tutorials available
- ⚠️ Less frequent updates than alternatives
---
## Implementation
**Agent with Tool Calling**:
```rust
// crates/vapora-llm-router/src/providers.rs
use rig::client::Client;
use rig::completion::Prompt;
use serde_json::json;

let client = rig::client::OpenAIClient::new(&api_key);

// Define the tool schema
let calculate_tool = rig::tool::Tool {
    name: "calculate".to_string(),
    description: "Perform an arithmetic calculation".to_string(),
    schema: json!({
        "type": "object",
        "properties": {
            "expression": {"type": "string"}
        }
    }),
};

// Call with the tool attached
let response = client
    .post_chat()
    .preamble("You are a helpful assistant")
    .user_message("What is 2 + 2?")
    .tool(calculate_tool)
    .call()
    .await?;
```
**Streaming Responses**:
```rust
use futures::StreamExt;

// Stream chunks as they arrive
let mut stream = client
    .post_chat()
    .user_message(prompt)
    .stream()
    .await?;

while let Some(chunk) = stream.next().await {
    match chunk {
        // print! (not println!) so chunks concatenate into one response
        Ok(text) => print!("{}", text),
        Err(e) => eprintln!("Error: {:?}", e),
    }
}
```
**Key Files**:
- `/crates/vapora-llm-router/src/providers.rs` (provider implementations)
- `/crates/vapora-llm-router/src/router.rs` (routing logic)
- `/crates/vapora-agents/src/executor.rs` (agent task execution)
---
## Verification
```bash
# Test tool calling
cargo test -p vapora-llm-router test_tool_calling
# Test streaming
cargo test -p vapora-llm-router test_streaming_response
# Integration test with real provider
cargo test -p vapora-llm-router test_agent_execution -- --nocapture
# Benchmark tool calling latency
cargo bench -p vapora-llm-router bench_tool_response_time
```
**Expected Output**:
- Tools invoked correctly with parameters
- Streaming chunks received in order
- Agent executes tasks and returns results
- Latency < 100ms per tool call
---
## Consequences
### Developer Workflow
- Tool schemas defined in code (type-safe)
- No Python bridge debugging complexity
- Single-language stack (all Rust)
### Performance
- Minimal latency (direct to provider APIs)
- Streaming reduces perceived latency
- Tool calling has <50ms overhead
### Future Extensibility
- Adding new providers: implement `LLMClient` trait
- Custom tools: define schema + handler in Rust
- See ADR-007 (Multi-Provider Support)
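A sketch of what that extensibility path could look like. The `LLMClient` trait name comes from this ADR, but the method names and synchronous signatures below are simplified assumptions; the real router code is async.

```rust
// Hypothetical provider abstraction: each provider implements one trait,
// and the router dispatches over trait objects.
trait LLMClient {
    fn provider_name(&self) -> &'static str;
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

struct MockProvider;

impl LLMClient for MockProvider {
    fn provider_name(&self) -> &'static str {
        "mock"
    }
    fn complete(&self, prompt: &str) -> Result<String, String> {
        // A real implementation would call the provider's HTTP API here.
        Ok(format!("echo: {prompt}"))
    }
}

fn route(clients: &[Box<dyn LLMClient>], prompt: &str) -> Result<String, String> {
    // Trivial routing: first registered provider wins. The real routing
    // logic lives in vapora-llm-router/src/router.rs.
    clients
        .first()
        .ok_or_else(|| "no providers registered".to_string())?
        .complete(prompt)
}

fn main() {
    let clients: Vec<Box<dyn LLMClient>> = vec![Box::new(MockProvider)];
    let reply = route(&clients, "ping").unwrap();
    println!("{reply}"); // echo: ping
}
```

Adding a new provider is then a single `impl LLMClient for NewProvider` block; the router and agents stay unchanged.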
---
## References
- [Rig Framework Documentation](https://github.com/0xPlaygrounds/rig)
- `/crates/vapora-llm-router/src/providers.rs` (provider abstractions)
- `/crates/vapora-agents/src/executor.rs` (agent execution)
---
**Related ADRs**: ADR-007 (Multi-Provider LLM), ADR-001 (Workspace)