# ADR-026: Arc-Based Shared State Management

**Status**: Accepted | Implemented

**Date**: 2024-11-01

**Deciders**: Backend Architecture Team

**Technical Story**: Managing thread-safe shared state across async Tokio handlers

---

## Decision

Implement **Arc-wrapped shared state** with `RwLock` (read-heavy) and `Mutex` (write-heavy) for inter-handler coordination.

---

## Rationale

1. **Cheap Clones**: `Arc` enables sharing without duplicating the underlying data
2. **Thread-Safe**: `RwLock`/`Mutex` provide safe concurrent access
3. **Async-Native**: Works with Tokio async/await
4. **Handler Distribution**: Each handler gets an `Arc` clone (scales across threads)

---

## Alternatives Considered

### ❌ Direct Shared References

- **Pros**: Simple
- **Cons**: Borrow-checker conflicts across `.await` points and spawned tasks; workarounds require `unsafe`

### ❌ Message Passing Only (Channels)

- **Pros**: Avoids shared state
- **Cons**: Overkill for read-heavy state, adds latency

### ✅ `Arc<RwLock<T>>` / `Arc<Mutex<T>>` (CHOSEN)

- Right balance of simplicity and safety

---

## Trade-offs

**Pros**:

- ✅ Cheap clones via `Arc`
- ✅ Type-safe via the Rust borrow checker
- ✅ Works seamlessly with async/await
- ✅ `RwLock` for read-heavy workloads (multiple readers)
- ✅ `Mutex` for write-heavy/simple cases

**Cons**:

- ⚠️ Lock contention possible under high concurrency
- ⚠️ Deadlock risk if not careful (nested locks)
- ⚠️ Poisoned lock handling needed wherever `std::sync` locks are used (Tokio's locks do not poison)

---

## Implementation

**Shared State Definition**:

```rust
// crates/vapora-backend/src/api/state.rs
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::{Mutex, RwLock};

// Clone is cheap: every field is an Arc, so a clone copies pointers only.
#[derive(Clone)]
pub struct AppState {
    pub project_service: Arc<ProjectService>,
    pub task_service: Arc<TaskService>,
    pub agent_service: Arc<AgentService>,

    // Shared mutable state
    pub task_queue: Arc<Mutex<Vec<Task>>>,
    // Value type name `Agent` is assumed; it was elided in the source.
    pub agent_registry: Arc<RwLock<HashMap<String, Agent>>>,
    pub metrics: Arc<RwLock<Metrics>>,
}

impl AppState {
    pub fn new(
        project_service: ProjectService,
        task_service: TaskService,
        agent_service: AgentService,
    ) -> Self {
        Self {
            project_service: Arc::new(project_service),
            task_service: Arc::new(task_service),
            agent_service: Arc::new(agent_service),
            task_queue: Arc::new(Mutex::new(Vec::new())),
            agent_registry: Arc::new(RwLock::new(HashMap::new())),
            metrics: Arc::new(RwLock::new(Metrics::default())),
        }
    }
}
```

**Using Arc in Handlers**:

```rust
// Handlers receive State<AppState>; its fields are already Arc-wrapped,
// so extracting it clones pointers, not data.
pub async fn create_task(
    State(app_state): State<AppState>,
    // Request type name is assumed; it was elided in the source.
    Json(req): Json<CreateTaskRequest>,
) -> Result<Json<Task>, ApiError> {
    let task = app_state
        .task_service
        .create_task(&req)
        .await?;

    // Push to shared queue
    let mut queue = app_state.task_queue.lock().await;
    queue.push(task.clone());

    Ok(Json(task))
}
```

**RwLock Pattern (Read-Heavy)**:

```rust
// crates/vapora-backend/src/swarm/registry.rs
pub async fn get_agent_status(
    app_state: &AppState,
    agent_id: &str,
) -> Result<AgentStatus> {
    // Multiple concurrent readers can hold the read lock
    let registry = app_state.agent_registry.read().await;

    let agent = registry
        .get(agent_id)
        .ok_or(VaporaError::NotFound)?;

    // Clone so the value outlives the read guard
    Ok(agent.status.clone())
}

pub async fn update_agent_status(
    app_state: &AppState,
    agent_id: &str,
    new_status: AgentStatus,
) -> Result<()> {
    // Exclusive write lock
    let mut registry = app_state.agent_registry.write().await;

    if let Some(agent) = registry.get_mut(agent_id) {
        agent.status = new_status;
        Ok(())
    } else {
        Err(VaporaError::NotFound)
    }
}
```

**Mutex Pattern (Write-Heavy)**:

```rust
// crates/vapora-backend/src/api/task_queue.rs
pub async fn dequeue_task(
    app_state: &AppState,
) -> Option<Task> {
    let mut queue = app_state.task_queue.lock().await;
    queue.pop()
}

pub async fn enqueue_task(
    app_state: &AppState,
    task: Task,
) {
    let mut queue = app_state.task_queue.lock().await;
    queue.push(task);
}
```

**Avoiding Deadlocks**:

```rust
// ✅ GOOD: Single lock acquisition
pub async fn safe_operation(app_state: &AppState) {
    let mut registry = app_state.agent_registry.write().await;
    // Do work
    // Lock automatically released when the guard is dropped
}

// ❌ BAD: Nested locks with no agreed order (can deadlock)
pub async fn unsafe_operation(app_state: &AppState) {
    let mut registry = app_state.agent_registry.write().await;
    let mut queue = app_state.task_queue.lock().await;
    // Risk: lock order inversion
    // If another task acquires these locks in the opposite order, deadlock!
}

// ✅ GOOD: Consistent lock order prevents deadlocks
// Always acquire: agent_registry → task_queue
pub async fn safe_nested(app_state: &AppState) {
    let mut registry = app_state.agent_registry.write().await;
    let mut queue = app_state.task_queue.lock().await;
    // Same order everywhere
    // Safe from deadlock
}
```

**Poisoned Lock Handling**:

```rust
use std::sync::Mutex as StdMutex;

// Tokio's Mutex and RwLock never poison: `lock().await` returns the guard
// directly, so there is no Result to match on. Poisoning applies only to
// `std::sync` locks, which is why this helper takes a std Mutex.
pub fn handle_poisoned_lock(
    queue: &StdMutex<Vec<Task>>,
) -> Result<Vec<Task>> {
    match queue.lock() {
        Ok(queue) => Ok(queue.clone()),
        Err(poisoned) => {
            // Lock was poisoned (a thread panicked while holding it)
            // Recover by taking the guard out of the error
            let queue = poisoned.into_inner();
            Ok(queue.clone())
        }
    }
}
```

**Key Files**:

- `/crates/vapora-backend/src/api/state.rs` (state definition)
- `/crates/vapora-backend/src/main.rs` (state creation)
- `/crates/vapora-backend/src/api/` (handlers using Arc)

---

## Verification

```bash
# Test concurrent access to shared state
cargo test -p vapora-backend test_concurrent_state_access

# Test RwLock read-heavy performance
cargo test -p vapora-backend test_rwlock_concurrent_reads

# Test Mutex write-heavy correctness
cargo test -p vapora-backend test_mutex_exclusive_writes

# Integration: multiple handlers accessing shared state
cargo test -p vapora-backend test_shared_state_integration

# Stress test: high concurrency
cargo test -p vapora-backend test_shared_state_stress
```

**Expected Output**:

- Concurrent reads successful (RwLock)
- Exclusive writes correct (Mutex)
- No data races (guaranteed by safe Rust)
- Deadlock-free (consistent lock ordering)
- High throughput under load

---

## Consequences

### Performance

- Read locks: low contention (multiple readers)
- Write locks: exclusive (single writer)
- Mutex: simple but serializes all access

### Concurrency Model

- Handlers clone `Arc` (cheap: a pointer-sized handle, 8 bytes on 64-bit targets, plus an atomic reference-count increment)
- Multiple threads access the same data
- Lock guards released when dropped

### Debugging

- Data races impossible in safe Rust (enforced by the compiler)
- Deadlocks are not caught by the compiler; they are prevented by lock-ordering discipline
- Poisoned locks rare, and only possible with `std::sync` locks (panic handling)

### Scaling

- Read-heavy paths scale well across cores
- Write contention becomes the bottleneck under heavy writes
- Sharding is an option for write-heavy state

---

## References

- [Arc Documentation](https://doc.rust-lang.org/std/sync/struct.Arc.html)
- [RwLock Documentation](https://docs.rs/tokio/latest/tokio/sync/struct.RwLock.html)
- [Mutex Documentation](https://docs.rs/tokio/latest/tokio/sync/struct.Mutex.html)
- `/crates/vapora-backend/src/api/state.rs` (implementation)

---

**Related ADRs**: ADR-008 (Tokio Runtime), ADR-024 (Service Architecture)
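
---

## Appendix: Sharding Sketch

The Scaling section lists sharding as an option for write-heavy state. The block below is a minimal sketch of that idea, not the project's implementation: `ShardedQueue`, `shard_for`, and `run_demo` are hypothetical names, and it uses `std::sync::Mutex` with OS threads instead of Tokio's locks and tasks so that it compiles with the standard library alone. Writers hash a key to pick one of several independently locked `Vec`s, so two writers contend only when their keys land on the same shard.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical sharded queue: several independently locked Vecs.
pub struct ShardedQueue<T> {
    shards: Vec<Mutex<Vec<T>>>,
}

impl<T> ShardedQueue<T> {
    pub fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "need at least one shard");
        let shards = (0..shard_count).map(|_| Mutex::new(Vec::new())).collect();
        Self { shards }
    }

    /// Pick a shard by hashing the key.
    fn shard_for<K: Hash>(&self, key: &K) -> &Mutex<Vec<T>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let idx = (hasher.finish() as usize) % self.shards.len();
        &self.shards[idx]
    }

    /// Lock only the shard that owns `key`, then push.
    pub fn push<K: Hash>(&self, key: &K, item: T) {
        // std locks can poison; recover the guard instead of panicking.
        let mut shard = self
            .shard_for(key)
            .lock()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        shard.push(item);
    }

    /// Total items across all shards (locks each shard briefly, one at a time).
    pub fn len(&self) -> usize {
        self.shards
            .iter()
            .map(|s| s.lock().unwrap_or_else(|p| p.into_inner()).len())
            .sum()
    }
}

/// Eight writer threads push 100 items each into a four-shard queue.
pub fn run_demo() -> usize {
    let queue = Arc::new(ShardedQueue::<u32>::new(4));
    let handles: Vec<_> = (0..8u32)
        .map(|worker| {
            let queue = Arc::clone(&queue);
            thread::spawn(move || {
                for i in 0..100u32 {
                    queue.push(&(worker, i), worker * 100 + i);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    queue.len()
}
```

Calling `run_demo()` returns the total number of items pushed, which should equal 8 × 100 regardless of how keys distribute across shards. In the async service the same shape would presumably use `tokio::sync::Mutex` per shard; because `len` takes one shard lock at a time and never nests them, it does not introduce a lock-ordering concern.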