# ADR-008: Tokio Multi-Threaded Runtime

**Status**: Accepted | Implemented
**Date**: 2024-11-01
**Deciders**: Runtime Architecture Team
**Technical Story**: Selecting async runtime for I/O-heavy workload (API, DB, LLM calls)

---

## Decision

Use the **Tokio multi-threaded runtime** with its default configuration (not the single-threaded flavor, and no custom thread pool).

---

## Rationale

1. **I/O-Heavy Workload**: VAPORA makes many concurrent calls (SurrealDB, NATS, LLM APIs, WebSockets)
2. **Multi-Core Scalability**: The multi-threaded scheduler distributes work across cores efficiently
3. **Production-Ready**: Tokio is the de facto standard in the Rust async ecosystem
4. **Minimal Config Overhead**: Default settings are tuned for most use cases

---

## Alternatives Considered

### ❌ Single-Threaded Tokio (`#[tokio::main(flavor = "current_thread")]`)
- **Pros**: Simpler to debug, predictable ordering
- **Cons**: Runs every task on a single thread, so it cannot scale across cores; inadequate for this concurrent workload

### ❌ Custom ThreadPool
- **Pros**: Full control
- **Cons**: Manual scheduling, error-prone, maintenance burden

### ✅ Tokio Multi-Threaded (CHOSEN)
- Production-ready, well-tuned, scales across cores

---

## Trade-offs

**Pros**:
- ✅ Scales across all CPU cores
- ✅ Efficient I/O multiplexing (epoll on Linux, kqueue on macOS)
- ✅ Proven in production systems
- ✅ Built-in task spawning with `tokio::spawn`
- ✅ Graceful shutdown handling

**Cons**:
- ⚠️ More complex debugging (multiple threads)
- ⚠️ Spawned futures must be `Send + 'static`; the compiler rejects violations in safe code, so data races remain a risk only where `unsafe` code bypasses `Send`/`Sync`
- ⚠️ Memory overhead (per-thread stacks)

---

## Implementation

**Runtime Configuration**:
```rust
// crates/vapora-backend/src/main.rs:26
// `Result` is assumed to be the crate's own alias (for example `anyhow::Result`).
#[tokio::main]
async fn main() -> Result<()> {
    // Default: worker threads = number of CPU cores, thread stack size = 2 MiB
    // Roughly equivalent to:
    // let rt = tokio::runtime::Builder::new_multi_thread()
    //     .worker_threads(num_cpus::get())
    //     .enable_all()
    //     .build()?;
    // rt.block_on(async { /* body of main */ })
    Ok(())
}
```

**Async Task Spawning**:
```rust
// Spawn independent task (runs concurrently on available worker)
tokio::spawn(async {
    let result = expensive_operation().await;
    handle_result(result).await;
});
```

**Blocking Code in Async Context**:
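```rust
// Run blocking code without stalling the rest of the executor.
// `block_in_place` requires the multi-threaded runtime and panics on a
// current-thread runtime; other futures in the same task are paused meanwhile.
let result = tokio::task::block_in_place(|| {
    // CPU-bound work or blocking I/O (file system, etc.)
    expensive_computation()
});
```
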
**Graceful Shutdown**:
```rust
// Listen for Ctrl+C
let shutdown = tokio::signal::ctrl_c();

tokio::select! {
    _ = shutdown => {
        info!("Shutting down gracefully...");
        // Cancel in-flight tasks, drain channels, close connections
    }
    _ = run_server() => {}
}
```

**Key Files**:
- `/crates/vapora-backend/src/main.rs:26` (Tokio main)
- `/crates/vapora-agents/src/bin/server.rs` (Agent server with Tokio)
- `/crates/vapora-llm-router/src/router.rs` (Concurrent LLM calls via `tokio::spawn`)

---

## Verification

```bash
# Check runtime worker threads at startup
RUST_LOG=tokio=debug cargo run -p vapora-backend 2>&1 | grep "worker"

# Monitor CPU usage across cores
top -H -p $(pgrep -f vapora-backend)

# Test concurrent task spawning
cargo test -p vapora-backend test_concurrent_requests

# Profile thread behavior
cargo flamegraph --bin vapora-backend -- --profile cpu

# Stress test with load generator
wrk -t 4 -c 100 -d 30s http://localhost:8001/health

# Check task wakeups and efficiency
cargo run -p vapora-backend --release
# In another terminal:
perf record -p $(pgrep -f vapora-backend) sleep 5
perf report | grep -i "wakeup\|context"
```

**Expected Output**:
- Worker threads = number of CPU cores
- Concurrent requests handled efficiently
- CPU usage distributed across cores
- Low context switching overhead
- Latency p99 < 100ms for simple endpoints

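To check the p99 target against raw measurements, the percentile can be computed from collected latency samples. This is a minimal sketch using the nearest-rank method; `percentile_ms` is a hypothetical helper, and the samples are assumed to be whole milliseconds gathered by whatever load tool is in use.

```rust
/// Nearest-rank percentile over latency samples in milliseconds.
/// Returns `None` for an empty slice or a percentile above 100.
fn percentile_ms(samples_ms: &[u64], pct: usize) -> Option<u64> {
    if samples_ms.is_empty() || pct > 100 {
        return None;
    }
    let mut sorted = samples_ms.to_vec();
    sorted.sort_unstable();
    // Integer ceiling of pct/100 * n, which avoids floating-point rounding.
    let rank = (pct * sorted.len() + 99) / 100;
    // Ranks are 1-based; clamp so that pct = 0 maps to the smallest sample.
    Some(sorted[rank.max(1) - 1])
}
```
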
---

## Consequences

### Concurrency Model
- Use `Arc<T>` for shared state (cloning only bumps a reference count)
- Use `tokio::sync::RwLock`, `Mutex`, `broadcast` for synchronization
- Avoid blocking operations in async code (wrap unavoidable ones in `block_in_place`)

### Error Handling
- Panics in spawned tasks don't kill the runtime; they surface as a `JoinError` when the task's `JoinHandle` is awaited
- Use `.await?` for proper error propagation
- Set a panic hook for graceful degradation

### Monitoring
- Track task queue depth (available via `tokio-console`)
- Monitor executor CPU usage
- Alert if thread starvation is detected

### Performance Tuning
- Default settings are adequate for most workloads
- Only customize if profiling shows a bottleneck
- Typical: num_workers = num_cpus, stack size = 2MB

---

## References

- [Tokio Documentation](https://tokio.rs/tokio/tutorial)
- [Tokio Runtime Configuration](https://docs.rs/tokio/latest/tokio/runtime/struct.Builder.html)
- `/crates/vapora-backend/src/main.rs` (runtime entry point)
- `/crates/vapora-agents/src/bin/server.rs` (agent runtime)

---

**Related ADRs**: ADR-001 (Workspace), ADR-005 (NATS JetStream)