# ADR-008: Tokio Multi-Threaded Runtime

**Status**: Accepted | Implemented
**Date**: 2024-11-01
**Deciders**: Runtime Architecture Team
**Technical Story**: Selecting async runtime for I/O-heavy workload (API, DB, LLM calls)

---

## Decision

Use the **Tokio multi-threaded runtime** with its default configuration (not the single-threaded flavor, and no custom thread pool).

---

## Rationale

1. **I/O-Heavy Workload**: VAPORA makes many concurrent calls (SurrealDB, NATS, LLM APIs, WebSockets)
2. **Multi-Core Scalability**: The multi-threaded scheduler distributes work efficiently across cores
3. **Production-Ready**: Tokio is the de facto standard in the Rust async ecosystem
4. **Minimal Config Overhead**: The default settings are tuned for most use cases

---

## Alternatives Considered

### ❌ Single-Threaded Tokio (`#[tokio::main(flavor = "current_thread")]`)

- **Pros**: Simpler to debug, predictable ordering
- **Cons**: Single core only, no scaling, inadequate for concurrent workload

### ❌ Custom ThreadPool

- **Pros**: Full control
- **Cons**: Manual scheduling, error-prone, maintenance burden

### ✅ Tokio Multi-Threaded (CHOSEN)

- Production-ready, well-tuned, scales across cores

---

## Trade-offs

**Pros**:

- ✅ Scales across all CPU cores
- ✅ Efficient I/O multiplexing (epoll on Linux, kqueue on macOS)
- ✅ Proven in production systems
- ✅ Built-in task spawning with `tokio::spawn`
- ✅ Graceful shutdown handling

**Cons**:

- ⚠️ More complex debugging (multiple threads)
- ⚠️ Spawned futures must be `Send + 'static`; the compiler enforces this, but logical races (ordering, lock contention) across threads are still possible
- ⚠️ Memory overhead (per-thread stacks)

---

## Implementation

**Runtime Configuration**:

```rust
// crates/vapora-backend/src/main.rs:26
#[tokio::main]
async fn main() -> Result<()> {
    // Default: worker threads = number of CPU cores, thread stack size = 2 MiB.
    // Roughly equivalent to:
    // let rt = tokio::runtime::Builder::new_multi_thread()
    //     .worker_threads(num_cpus::get())
    //     .enable_all()
    //     .build()?;
    // rt.block_on(async { /* application body */ })
    Ok(()) // application startup elided
}
```

**Async Task Spawning**:

```rust
// Spawn an independent task (runs concurrently on any available worker thread)
tokio::spawn(async {
    let result = expensive_operation().await;
    handle_result(result).await;
});
```

**Blocking Code in Async Context**:

```rust
// Run synchronous code on the current worker without stalling the rest of the executor.
// `block_in_place` requires the multi-threaded runtime; it panics on a current-thread runtime.
let result = tokio::task::block_in_place(|| {
    // CPU-bound work or blocking I/O (file system, etc.)
    expensive_computation()
});
```

**Graceful Shutdown**:

```rust
// Listen for Ctrl+C
let shutdown = tokio::signal::ctrl_c();

tokio::select! {
    _ = shutdown => {
        info!("Shutting down gracefully...");
        // Cancel in-flight tasks, drain channels, close connections
    }
    _ = run_server() => {}
}
```

**Key Files**:

- `/crates/vapora-backend/src/main.rs:26` (Tokio main)
- `/crates/vapora-agents/src/bin/server.rs` (Agent server with Tokio)
- `/crates/vapora-llm-router/src/router.rs` (Concurrent LLM calls via `tokio::spawn`)

---

## Verification

```bash
# Check runtime worker threads at startup
RUST_LOG=tokio=debug cargo run -p vapora-backend 2>&1 | grep "worker"

# Monitor CPU usage across cores
top -H -p $(pgrep -f vapora-backend)

# Test concurrent task spawning
cargo test -p vapora-backend test_concurrent_requests

# Profile thread behavior
cargo flamegraph --bin vapora-backend -- --profile cpu

# Stress test with load generator
wrk -t 4 -c 100 -d 30s http://localhost:8001/health

# Check task wakeups and efficiency
cargo run -p vapora-backend --release
# In another terminal:
perf record -p $(pgrep -f vapora-backend) sleep 5
perf report | grep -i "wakeup\|context"
```

**Expected Output**:

- Worker threads = number of CPU cores
- Concurrent requests handled without errors or stalls
- CPU usage distributed across cores
- Low context switching overhead
- Latency p99 < 100ms for simple endpoints

---

## Consequences

### Concurrency Model

- Use `Arc<>` for shared state (cheap clones)
- Use `tokio::sync::RwLock`, `Mutex`, `broadcast` for synchronization
- Avoid blocking calls directly in async code; wrap them in `block_in_place` (or move them to `tokio::task::spawn_blocking`)

### Error Handling

- A panic in a spawned task does not kill the runtime; it surfaces as an error when the task's `JoinHandle` is awaited
- Use `.await?` for proper error propagation
- Set a panic hook for graceful degradation

### Monitoring

- Track task queue depth (available via `tokio-console`)
- Monitor executor CPU usage
- Alert if thread starvation is detected

### Performance Tuning

- Default settings are adequate for most workloads
- Only customize if profiling shows a bottleneck
- Typical: worker threads = number of CPU cores, thread stack size = 2 MiB

---

## References

- [Tokio Documentation](https://tokio.rs/tokio/tutorial)
- [Tokio Runtime Configuration](https://docs.rs/tokio/latest/tokio/runtime/struct.Builder.html)
- `/crates/vapora-backend/src/main.rs` (runtime entry point)
- `/crates/vapora-agents/src/bin/server.rs` (agent runtime)

---

**Related ADRs**: ADR-001 (Workspace), ADR-005 (NATS JetStream)
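
The concurrency model listed under Consequences (shared state behind `Arc`, guarded by a lock) can be sketched as follows. This is a minimal illustration, not code from the VAPORA repository: `AppState`, `record_request`, and `run_workers` are hypothetical names, and the sketch deliberately uses only the standard library (`std::thread`, `std::sync::RwLock`) so that it runs without dependencies. In the async crates the same shape would presumably use `tokio::spawn` and `tokio::sync::RwLock`, with `.await` on lock acquisition.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical shared state; not a type from the VAPORA codebase.
#[derive(Default)]
struct AppState {
    request_counts: RwLock<HashMap<String, u64>>,
}

/// Record one request against `route` under a short-lived write lock.
fn record_request(state: &AppState, route: &str) {
    let mut counts = state.request_counts.write().expect("lock poisoned");
    *counts.entry(route.to_string()).or_insert(0) += 1;
}

/// Spawn `workers` threads that each record `per_worker` requests,
/// then return the total recorded for the "/health" route.
fn run_workers(workers: usize, per_worker: u64) -> u64 {
    let state = Arc::new(AppState::default());

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Cloning the Arc copies a pointer and bumps a reference count;
            // the underlying map is not copied.
            let state = Arc::clone(&state);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    record_request(&state, "/health");
                }
            })
        })
        .collect();

    // Wait for every worker; a panic in a worker surfaces here as an Err.
    for handle in handles {
        handle.join().expect("worker panicked");
    }

    let counts = state.request_counts.read().expect("lock poisoned");
    counts.get("/health").copied().unwrap_or(0)
}

fn main() {
    let total = run_workers(4, 1_000);
    println!("total requests recorded: {total}");
}
```

Each worker holds the write lock only for a single increment, which keeps contention low. The same discipline matters under Tokio, where a lock held for a long time can stall other tasks scheduled on the same worker threads.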