ADR-026: Arc-Based Shared State Management

Status: Accepted | Implemented
Date: 2024-11-01
Deciders: Backend Architecture Team
Technical Story: Managing thread-safe shared state across async Tokio handlers


Decision

Implement Arc-wrapped shared state with RwLock (read-heavy) and Mutex (write-heavy) for inter-handler coordination.


Rationale

  1. Cheap Clones: Arc enables sharing without duplication
  2. Thread-Safe: RwLock/Mutex provide safe concurrent access
  3. Async-Native: Works with Tokio async/await
  4. Handler Distribution: Each handler gets Arc clone (scales across threads)
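
The "cheap clones" point can be illustrated with a small sketch. It uses only the standard library, and the function names are hypothetical, not taken from the vapora-backend crate. It shows that an Arc handle is one pointer wide regardless of the size of the shared value, and that every clone refers to the same allocation.

```rust
use std::mem::size_of;
use std::sync::Arc;
use std::thread;

/// Size of an `Arc` handle to a 4 KiB value: one pointer, not 4 KiB.
pub fn arc_handle_size() -> usize {
    size_of::<Arc<[u8; 4096]>>()
}

/// Number of live handles after making `n` clones (original + n).
pub fn handle_count_after_clones(n: usize) -> usize {
    let shared = Arc::new(0u8);
    let _clones: Vec<Arc<u8>> = (0..n).map(|_| Arc::clone(&shared)).collect();
    Arc::strong_count(&shared)
}

/// Gives each worker thread its own handle and counts how many of them
/// observed the same underlying allocation as the original.
pub fn share_with_workers(workers: usize) -> usize {
    let shared = Arc::new(String::from("shared config"));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Cloning copies the pointer and atomically bumps a counter;
            // the String itself is never duplicated.
            let local = Arc::clone(&shared);
            let original = Arc::clone(&shared);
            thread::spawn(move || Arc::ptr_eq(&local, &original))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|same| *same)
        .count()
}
```

Calling `share_with_workers(4)` should report that all four workers saw the same allocation.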

Alternatives Considered

❌ Direct Shared References

  • Pros: Simple
  • Cons: Spawned async tasks require 'static data, so plain references are rejected by the borrow checker unless unsafe code is used

❌ Message Passing Only (Channels)

  • Pros: Avoids shared state
  • Cons: Overkill for read-heavy state; every read pays a request/reply round trip

✅ Arc<RwLock<>> / Arc<Mutex<>> (CHOSEN)

  • Right balance of simplicity and safety
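
For comparison, the rejected channel-only alternative might look roughly like the sketch below. It is an illustration under assumptions, not code from the project: it uses `std::sync::mpsc` and a plain thread instead of Tokio channels and tasks, and the `Command` enum and function names are invented. It makes the latency cost visible: even a simple read has to send a message and wait for a reply.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Messages understood by the thread that owns the registry (hypothetical).
pub enum Command {
    Set { id: String, status: String },
    Get { id: String, reply: mpsc::Sender<Option<String>> },
}

/// Spawns the single owner of the map; nobody else can touch it directly.
pub fn spawn_registry_owner() -> (mpsc::Sender<Command>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<Command>();
    let handle = thread::spawn(move || {
        let mut registry: HashMap<String, String> = HashMap::new();
        // The loop ends once every Sender has been dropped.
        for cmd in rx {
            match cmd {
                Command::Set { id, status } => {
                    registry.insert(id, status);
                }
                Command::Get { id, reply } => {
                    // Ignore the error if the requester went away.
                    let _ = reply.send(registry.get(&id).cloned());
                }
            }
        }
    });
    (tx, handle)
}

/// A read is a full round trip: request out, reply back.
pub fn read_status(tx: &mpsc::Sender<Command>, id: &str) -> Option<String> {
    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(Command::Get { id: id.to_string(), reply: reply_tx }).ok()?;
    reply_rx.recv().ok()?
}
```

With a shared `RwLock`, the same read is a lock acquisition and a map lookup, which is why the ADR treats channels as overkill for read-heavy state.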

Trade-offs

Pros:

  • ✅ Cheap clones via Arc
  • ✅ Type-safe via Rust borrow checker
  • ✅ Works seamlessly with async/await
  • ✅ RwLock for read-heavy workloads (multiple readers)
  • ✅ Mutex for write-heavy/simple cases

Cons:

  • ⚠️ Lock contention possible under high concurrency
  • ⚠️ Deadlock risk if not careful (nested locks)
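  • ⚠️ Panic handling needed: std::sync locks become poisoned after a panic; tokio::sync locks do not poison, so a panic mid-update can leave inconsistent data behind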

Implementation

Shared State Definition:

// crates/vapora-backend/src/api/state.rs

use std::collections::HashMap;
use std::sync::Arc;

// The async locks are assumed to be Tokio's, since callers use `.lock().await`.
use tokio::sync::{Mutex, RwLock};

// ProjectService, TaskService, AgentService, Task, AgentState and Metrics
// are defined elsewhere in the crate (import paths not shown here).

// Cloning AppState is cheap: every field is an Arc, so only pointers are copied.
#[derive(Clone)]
pub struct AppState {
    pub project_service: Arc<ProjectService>,
    pub task_service: Arc<TaskService>,
    pub agent_service: Arc<AgentService>,

    // Shared mutable state
    pub task_queue: Arc<Mutex<Vec<Task>>>,
    pub agent_registry: Arc<RwLock<HashMap<String, AgentState>>>,
    pub metrics: Arc<RwLock<Metrics>>,
}

impl AppState {
    pub fn new(
        project_service: ProjectService,
        task_service: TaskService,
        agent_service: AgentService,
    ) -> Self {
        Self {
            project_service: Arc::new(project_service),
            task_service: Arc::new(task_service),
            agent_service: Arc::new(agent_service),
            task_queue: Arc::new(Mutex::new(Vec::new())),
            agent_registry: Arc::new(RwLock::new(HashMap::new())),
            metrics: Arc::new(RwLock::new(Metrics::default())),
        }
    }
}

Using Arc in Handlers:

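// The State/Json extractor pattern below is assumed to be Axum's.
use axum::{extract::State, Json};

// Each handler receives its own copy of the state. The copy is cheap
// because AppState's fields are Arcs, so only pointers are duplicated.
pub async fn create_task(
    State(app_state): State<AppState>,  // AppState itself is not an Arc; its fields are
    Json(req): Json<CreateTaskRequest>,
) -> Result<Json<Task>, ApiError> {
    let task = app_state
        .task_service
        .create_task(&req)
        .await?;

    // Push to the shared queue; the guard is dropped when the function returns
    let mut queue = app_state.task_queue.lock().await;
    queue.push(task.clone());

    Ok(Json(task))
}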

RwLock Pattern (Read-Heavy):

// crates/vapora-backend/src/swarm/registry.rs

pub async fn get_agent_status(
    app_state: &AppState,
    agent_id: &str,
) -> Result<AgentStatus> {
    // Multiple concurrent readers can hold the read lock at the same time
    let registry = app_state.agent_registry.read().await;

    let agent = registry
        .get(agent_id)
        .ok_or(VaporaError::NotFound)?;

    // `agent` is borrowed from the guard, so the status must be copied out
    // (assumes AgentStatus implements Clone)
    Ok(agent.status.clone())
}

pub async fn update_agent_status(
    app_state: &AppState,
    agent_id: &str,
    new_status: AgentStatus,
) -> Result<()> {
    // Exclusive write lock: waits until all readers and writers have finished
    let mut registry = app_state.agent_registry.write().await;

    if let Some(agent) = registry.get_mut(agent_id) {
        agent.status = new_status;
        Ok(())
    } else {
        Err(VaporaError::NotFound)
    }
}
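
The reader/writer behaviour can be checked in isolation with a dependency-free sketch. It swaps in `std::sync::RwLock` for Tokio's async lock, and the `Registry` alias and function name are hypothetical. It shows that a second reader is admitted while a read guard is alive, whereas a writer is refused until the readers are gone.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Simplified stand-in for the agent registry: agent id -> status text.
pub type Registry = Arc<RwLock<HashMap<String, String>>>;

/// Returns true if a second reader is admitted and a writer is refused
/// while one read guard is still alive.
pub fn readers_share_writers_wait(registry: &Registry) -> bool {
    let first_reader = registry.read().unwrap();

    // Non-blocking attempts: they report what would happen instead of waiting.
    let second_reader_admitted = registry.try_read().is_ok();
    let writer_refused = registry.try_write().is_err();

    drop(first_reader); // release the read lock explicitly
    second_reader_admitted && writer_refused
}
```

Once the read guard is dropped, a write attempt succeeds again, which matches the "multiple readers or one writer" rule the registry relies on.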

Mutex Pattern (Write-Heavy):

// crates/vapora-backend/src/api/task_queue.rs

pub async fn dequeue_task(
    app_state: &AppState,
) -> Option<Task> {
    let mut queue = app_state.task_queue.lock().await;
    // Vec::pop removes the most recently pushed task (LIFO order).
    // For FIFO order, a VecDeque with pop_front would be needed.
    queue.pop()
}

pub async fn enqueue_task(
    app_state: &AppState,
    task: Task,
) {
    let mut queue = app_state.task_queue.lock().await;
    queue.push(task);
}
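
If tasks are meant to be processed in arrival order, a `VecDeque` behind the same kind of lock gives first-in, first-out behaviour. The sketch below is an assumption about intent, not the project's implementation: it uses `std::sync::Mutex` and plain `u32` task ids so that it stays self-contained, and the `TaskQueue` alias is hypothetical.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// Simplified FIFO queue of task ids shared between threads.
pub type TaskQueue = Arc<Mutex<VecDeque<u32>>>;

/// Appends a task at the back of the queue.
pub fn enqueue(queue: &TaskQueue, task_id: u32) {
    // The temporary guard is dropped at the end of the statement.
    queue.lock().unwrap().push_back(task_id);
}

/// Removes the oldest task from the front, or returns None if empty.
pub fn dequeue(queue: &TaskQueue) -> Option<u32> {
    queue.lock().unwrap().pop_front()
}
```

Both operations are O(1) on a `VecDeque`, so switching from `Vec` changes the ordering without adding time spent under the lock.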

Avoiding Deadlocks:

// ✅ GOOD: Single lock acquisition
pub async fn safe_operation(app_state: &AppState) {
    let mut registry = app_state.agent_registry.write().await;
    // Do work
    // Lock is released automatically when `registry` is dropped
}

// ❌ BAD: Nested locks with no agreed order (can deadlock)
pub async fn unsafe_operation(app_state: &AppState) {
    let mut registry = app_state.agent_registry.write().await;
    let mut queue = app_state.task_queue.lock().await;  // Risk: lock order inversion
    // If another task takes task_queue first and then agent_registry,
    // each task waits forever for the lock the other one holds.
}

// ✅ GOOD: Consistent lock order prevents deadlocks
// Always acquire: agent_registry → task_queue
pub async fn safe_nested(app_state: &AppState) {
    let mut registry = app_state.agent_registry.write().await;
    let mut queue = app_state.task_queue.lock().await;  // Same order everywhere
    // Safe from deadlock only as long as every code path uses this order
}
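
Another way to sidestep lock-order problems is to never hold two locks at once: copy what is needed under the first lock, release it, then take the second. The sketch below illustrates that idea with `std::sync` locks and a hypothetical `Shared` struct; it is not taken from the codebase.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, RwLock};

/// Simplified shared state (hypothetical): agent id -> "is idle" flag,
/// plus a queue of agent ids that should receive work.
pub struct Shared {
    pub registry: RwLock<HashMap<String, bool>>,
    pub queue: Mutex<Vec<String>>,
}

/// Queues every idle agent and returns the resulting queue length.
pub fn queue_work_for_idle_agents(shared: &Shared) -> usize {
    // Phase 1: snapshot the idle agents, then release the registry lock
    // when the inner block ends.
    let idle_ids: Vec<String> = {
        let registry = shared.registry.read().unwrap();
        let ids: Vec<String> = registry
            .iter()
            .filter(|(_, idle)| **idle)
            .map(|(id, _)| id.clone())
            .collect();
        ids
    };

    // Phase 2: only the queue lock is held here, so no ordering rule applies.
    let mut queue = shared.queue.lock().unwrap();
    queue.extend(idle_ids);
    queue.len()
}
```

The trade-off is that the snapshot can be slightly stale by the time the queue is updated, which is acceptable only when the two structures do not need to change atomically together.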

Poisoned Lock Handling:

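// Poisoning exists only for std::sync::Mutex / RwLock, whose lock() returns
// a LockResult that can be recovered with PoisonError::into_inner().
// tokio::sync::Mutex is never poisoned: lock().await returns the guard
// directly, so there is no Err arm to match on.
pub async fn handle_poisoned_lock(
    app_state: &AppState,
) -> Result<Vec<Task>> {
    // If a task panicked while holding this lock, the lock was simply
    // released and the data may be half-updated; validate it here if needed.
    let queue = app_state.task_queue.lock().await;
    Ok(queue.clone())
}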
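
Poison recovery itself is a feature of the standard library's blocking locks. The sketch below uses `std::sync::Mutex` with hypothetical function names: one helper deliberately panics while holding the lock, and the other recovers the data with `PoisonError::into_inner`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons the mutex by panicking in another thread while the guard is held.
pub fn poison(queue: &Arc<Mutex<Vec<u32>>>) {
    let q = Arc::clone(queue);
    // join() returns Err because the thread panicked; that is expected here.
    let _ = thread::spawn(move || {
        let mut guard = q.lock().unwrap();
        guard.push(1);
        panic!("simulated failure while holding the lock");
    })
    .join();
}

/// Returns a copy of the queue, recovering the data if the lock is poisoned.
pub fn snapshot(queue: &Mutex<Vec<u32>>) -> Vec<u32> {
    match queue.lock() {
        Ok(guard) => guard.clone(),
        // The data is still there; poisoning only warns that a panic
        // may have interrupted an update.
        Err(poisoned) => poisoned.into_inner().clone(),
    }
}
```

Recovering with `into_inner` is only safe when a partially applied update cannot violate the invariants of the data, as with a single `push` here.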

Key Files:

  • /crates/vapora-backend/src/api/state.rs (state definition)
  • /crates/vapora-backend/src/main.rs (state creation)
  • /crates/vapora-backend/src/api/ (handlers using Arc)

Verification

# Test concurrent access to shared state
cargo test -p vapora-backend test_concurrent_state_access

# Test RwLock read-heavy performance
cargo test -p vapora-backend test_rwlock_concurrent_reads

# Test Mutex write-heavy correctness
cargo test -p vapora-backend test_mutex_exclusive_writes

# Integration: multiple handlers accessing shared state
cargo test -p vapora-backend test_shared_state_integration

# Stress test: high concurrency
cargo test -p vapora-backend test_shared_state_stress

Expected Output:

  • Concurrent reads successful (RwLock)
  • Exclusive writes correct (Mutex)
  • No data races (Rust guarantees)
  • Deadlock-free (consistent lock ordering)
  • High throughput under load
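
The bodies of the tests named above are not shown in this ADR. As a rough idea of what an exclusive-write check could look like, the sketch below (hypothetical function name, standard-library threads and `Mutex` rather than Tokio tasks) has several threads increment one counter and returns the final total.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Runs `threads` workers that each add 1 to a shared counter
/// `per_thread` times, then returns the final value.
pub fn concurrent_increments(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Lock, increment, and release on every iteration.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```

If any increment were lost to a race, the total would fall short of `threads * per_thread`; with the mutex in place it should always match.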

Consequences

Performance

  • Read locks: low contention (multiple readers)
  • Write locks: exclusive (single writer)
  • Mutex: simple, but serializes all access, reads included

Concurrency Model

  • Handlers clone Arc (cheap: copies one pointer, 8 bytes on 64-bit targets, and atomically increments a reference count)
  • Multiple threads access same data
  • Lock guards released when dropped

Debugging

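  • Data races ruled out at compile time in safe Rust (Send/Sync checks)
  • Deadlocks not caught by the compiler; prevented only by lock-ordering discipline
  • Poisoning occurs only with std::sync locks after a panic; tokio::sync locks never poison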

Scaling

  • Per-core scalability excellent (read-heavy)
  • Write contention bottleneck (if heavy)
  • Sharding option for write-heavy
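
The sharding option could be sketched as follows. This is an illustrative design under assumptions, not an existing component: `ShardedMap` is a hypothetical type, it uses `std::sync::RwLock`, and keys are assigned to shards with the standard library's `DefaultHasher`. Writers to different shards no longer contend for the same lock.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

/// A map split into independently locked shards (hypothetical type).
pub struct ShardedMap {
    shards: Vec<RwLock<HashMap<String, u64>>>,
}

impl ShardedMap {
    /// Creates a map with `shard_count` shards; panics if it is zero.
    pub fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "at least one shard is required");
        let shards = (0..shard_count)
            .map(|_| RwLock::new(HashMap::new()))
            .collect();
        Self { shards }
    }

    /// Picks the shard for a key by hashing it.
    fn shard_for(&self, key: &str) -> &RwLock<HashMap<String, u64>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let index = (hasher.finish() as usize) % self.shards.len();
        &self.shards[index]
    }

    /// Write-locks only the shard that owns `key`.
    pub fn insert(&self, key: &str, value: u64) {
        self.shard_for(key)
            .write()
            .unwrap()
            .insert(key.to_string(), value);
    }

    /// Read-locks only the shard that owns `key`.
    pub fn get(&self, key: &str) -> Option<u64> {
        self.shard_for(key).read().unwrap().get(key).copied()
    }
}
```

Operations that span all keys, such as counting entries, would need to visit every shard, so sharding pays off mainly when most operations touch a single key.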

References


Related ADRs: ADR-008 (Tokio Runtime), ADR-024 (Service Architecture)