Vapora/docs/adrs/0008-tokio-runtime.md
Jesús Pérez 7110ffeea2
chore: extend doc: adr, tutorials, operations, etc
2026-01-12 03:32:47 +00:00


# ADR-008: Tokio Multi-Threaded Runtime
**Status**: Accepted | Implemented
**Date**: 2024-11-01
**Deciders**: Runtime Architecture Team
**Technical Story**: Selecting async runtime for I/O-heavy workload (API, DB, LLM calls)
---
## Decision
Use the **Tokio multi-threaded runtime** with its default configuration (not the single-threaded flavor, and no custom thread pool).
---
## Rationale
1. **I/O-Heavy Workload**: VAPORA makes many concurrent calls (SurrealDB, NATS, LLM APIs, WebSockets)
2. **Multi-Core Scalability**: The multi-threaded runtime distributes work across cores efficiently
3. **Production-Ready**: Tokio is the de facto standard in the Rust async ecosystem
4. **Minimal Config Overhead**: The default settings are tuned for most use cases
---
## Alternatives Considered
### ❌ Single-Threaded Tokio (`#[tokio::main(flavor = "current_thread")]`)
- **Pros**: Simpler to debug, predictable ordering
- **Cons**: Single core only, no scaling, inadequate for concurrent workload
### ❌ Custom ThreadPool
- **Pros**: Full control
- **Cons**: Manual scheduling, error-prone, maintenance burden
### ✅ Tokio Multi-Threaded (CHOSEN)
- Production-ready, well-tuned, scales across cores
---
## Trade-offs
**Pros**:
- ✅ Scales across all CPU cores
- ✅ Efficient I/O multiplexing (epoll on Linux, kqueue on macOS)
- ✅ Proven in production systems
- ✅ Built-in task spawning with `tokio::spawn`
- ✅ Graceful shutdown handling
**Cons**:
- ⚠️ More complex debugging (multiple threads)
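- ⚠️ Spawned futures must be `Send + 'static`; the compiler enforces this, but shared state still needs explicit synchronization, and `unsafe` code can reintroduce data races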
- ⚠️ Memory overhead (per-thread stacks)
---
## Implementation
**Runtime Configuration**:
```rust
// crates/vapora-backend/src/main.rs:26
#[tokio::main]
async fn main() -> Result<()> {
    // Default: worker threads = number of available CPU cores, thread stack size = 2 MiB
    // Equivalent to:
    // let rt = tokio::runtime::Builder::new_multi_thread()
    //     .worker_threads(num_cpus::get())
    //     .enable_all()
    //     .build()?;
    // rt.block_on(async { /* body of main */ })
    Ok(())
}
```
**Async Task Spawning**:
```rust
// Spawn independent task (runs concurrently on available worker)
tokio::spawn(async {
    let result = expensive_operation().await;
    handle_result(result).await;
});
```
**Blocking Code in Async Context**:
```rust
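// Run blocking code on the current worker thread; Tokio first hands the worker's
// other queued tasks to another thread (multi-threaded runtime only)
let result = tokio::task::block_in_place(|| {
    // CPU-bound work or blocking I/O (file system, etc.)
    expensive_computation()
});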
```
**Graceful Shutdown**:
```rust
// Listen for Ctrl+C
let shutdown = tokio::signal::ctrl_c();
tokio::select! {
    _ = shutdown => {
        info!("Shutting down gracefully...");
        // Cancel in-flight tasks, drain channels, close connections
    }
    _ = run_server() => {}
}
```
**Key Files**:
- `/crates/vapora-backend/src/main.rs:26` (Tokio main)
- `/crates/vapora-agents/src/bin/server.rs` (Agent server with Tokio)
- `/crates/vapora-llm-router/src/router.rs` (Concurrent LLM calls via tokio::spawn)
---
## Verification
```bash
# Check runtime worker threads at startup
RUST_LOG=tokio=debug cargo run -p vapora-backend 2>&1 | grep "worker"
# Monitor CPU usage across cores
top -H -p $(pgrep -f vapora-backend)
# Test concurrent task spawning
cargo test -p vapora-backend test_concurrent_requests
# Profile thread behavior
cargo flamegraph --bin vapora-backend -- --profile cpu
# Stress test with load generator
wrk -t 4 -c 100 -d 30s http://localhost:8001/health
# Check task wakeups and efficiency
cargo run -p vapora-backend --release
# In another terminal:
perf record -p $(pgrep -f vapora-backend) sleep 5
perf report | grep -i "wakeup\|context"
```
**Expected Output**:
- Worker threads = number of CPU cores
- Concurrent requests handled efficiently
- CPU usage distributed across cores
- Low context switching overhead
- Latency p99 < 100ms for simple endpoints
---
## Consequences
### Concurrency Model
- Use `Arc<>` for shared state (cheap clones)
- Use `tokio::sync::RwLock`, `Mutex`, `broadcast` for synchronization
- Avoid blocking operations in async code (wrap them in `spawn_blocking` or `block_in_place`)
### Error Handling
- A panic in a spawned task does not bring down the runtime; it surfaces as a `JoinError` when the task's `JoinHandle` is awaited
- Use `.await?` for proper error propagation
- Set panic hook for graceful degradation
### Monitoring
- Track task queue depth (available via `tokio-console`)
- Monitor executor CPU usage
- Alert if thread starvation detected
### Performance Tuning
- Default settings adequate for most workloads
- Only customize if profiling shows bottleneck
- Typical: num_workers = num_cpus, stack size = 2MB
---
## References
- [Tokio Documentation](https://tokio.rs/tokio/tutorial)
- [Tokio Runtime Configuration](https://docs.rs/tokio/latest/tokio/runtime/struct.Builder.html)
- `/crates/vapora-backend/src/main.rs` (runtime entry point)
- `/crates/vapora-agents/src/bin/server.rs` (agent runtime)
---
**Related ADRs**: ADR-001 (Workspace), ADR-005 (NATS JetStream)