
ADR-008: Tokio Multi-Threaded Runtime

Status: Accepted | Implemented
Date: 2024-11-01
Deciders: Runtime Architecture Team
Technical Story: Selecting an async runtime for an I/O-heavy workload (API, DB, LLM calls)


Decision

Use the Tokio multi-threaded runtime with its default configuration (no single-threaded flavor, no custom thread pool).


Rationale

  1. I/O-Heavy Workload: VAPORA makes many concurrent calls (SurrealDB, NATS, LLM APIs, WebSockets)
  2. Multi-Core Scalability: the multi-threaded scheduler distributes work across cores efficiently
  3. Production-Ready: Tokio is the de-facto standard in the Rust async ecosystem
  4. Minimal Config Overhead: the default settings are tuned for most workloads

Alternatives Considered

Single-Threaded Tokio (#[tokio::main(flavor = "current_thread")])

  • Pros: Simpler to debug, predictable ordering
  • Cons: Single core only, no scaling, inadequate for concurrent workload

Custom ThreadPool

  • Pros: Full control
  • Cons: Manual scheduling, error-prone, maintenance burden

Tokio Multi-Threaded (CHOSEN)

  • Production-ready, well-tuned, scales across cores

Trade-offs

Pros:

  • Scales across all CPU cores
  • Efficient I/O multiplexing (epoll on Linux, kqueue on macOS)
  • Proven in production systems
  • Built-in task spawning with tokio::spawn
  • Graceful shutdown handling

Cons:

  • ⚠️ More complex debugging (multiple threads)
  • ⚠️ Potential data races if Send/Sync bounds are not respected
  • ⚠️ Memory overhead (per-thread stacks)

Implementation

Runtime Configuration:

// crates/vapora-backend/src/main.rs:26
#[tokio::main]
async fn main() -> Result<()> {
    // Default: worker threads = number of CPU cores, worker stack size = 2 MiB
    // Equivalent to:
    // let rt = tokio::runtime::Builder::new_multi_thread()
    //     .worker_threads(num_cpus::get())
    //     .enable_all()
    //     .build()?;
    // ... start server ...
    Ok(())
}

Async Task Spawning:

// Spawn independent task (runs concurrently on available worker)
tokio::spawn(async {
    let result = expensive_operation().await;
    handle_result(result).await;
});

Blocking Code in Async Context:

// Run blocking sync code without stalling the other tasks on this worker.
// Note: block_in_place is only available on the multi-threaded runtime.
let result = tokio::task::block_in_place(|| {
    // CPU-bound work or blocking I/O (file system, etc.)
    expensive_computation()
});

Graceful Shutdown:

// Listen for Ctrl+C
let shutdown = tokio::signal::ctrl_c();

tokio::select! {
    _ = shutdown => {
        info!("Shutting down gracefully...");
        // Cancel in-flight tasks, drain channels, close connections
    }
    _ = run_server() => {}
}

Key Files:

  • /crates/vapora-backend/src/main.rs:26 (Tokio main)
  • /crates/vapora-agents/src/bin/server.rs (Agent server with Tokio)
  • /crates/vapora-llm-router/src/router.rs (Concurrent LLM calls via tokio::spawn)

Verification

# Check runtime worker threads at startup
RUST_LOG=tokio=debug cargo run -p vapora-backend 2>&1 | grep "worker"

# Monitor CPU usage across cores
top -H -p $(pgrep -f vapora-backend)

# Test concurrent task spawning
cargo test -p vapora-backend test_concurrent_requests

# Profile thread behavior
cargo flamegraph --bin vapora-backend -- --profile cpu

# Stress test with load generator
wrk -t 4 -c 100 -d 30s http://localhost:8001/health

# Check task wakeups and efficiency
cargo run -p vapora-backend --release
# In another terminal:
perf record -p $(pgrep -f vapora-backend) sleep 5
perf report | grep -i "wakeup\|context"

Expected Output:

  • Worker threads = number of CPU cores
  • Concurrent requests handled efficiently
  • CPU usage distributed across cores
  • Low context switching overhead
  • Latency p99 < 100ms for simple endpoints

Consequences

Concurrency Model

  • Use Arc<T> for shared state (cheap clones)
  • Use tokio::sync::RwLock, Mutex, broadcast for synchronization
  • Avoid blocking operations in async code (use block_in_place or spawn_blocking)

Error Handling

  • Panics in spawned tasks don't kill runtime (captured via JoinHandle)
  • Use .await? for proper error propagation
  • Set panic hook for graceful degradation

Monitoring

  • Track task queue depth (available via tokio-console)
  • Monitor executor CPU usage
  • Alert if thread starvation detected

Performance Tuning

  • Default settings adequate for most workloads
  • Only customize if profiling shows bottleneck
  • Typical: num_workers = num_cpus, stack size = 2MB

References


Related ADRs: ADR-001 (Workspace), ADR-005 (NATS JetStream)