ADR-026: Arc-Based Shared State Management
Status: Accepted | Implemented
Date: 2024-11-01
Deciders: Backend Architecture Team
Technical Story: Managing thread-safe shared state across async Tokio handlers
Decision
Implement Arc-wrapped shared state with RwLock (read-heavy) and Mutex (write-heavy) for inter-handler coordination.
Rationale
- Cheap Clones: Arc enables sharing without duplication
- Thread-Safe: RwLock/Mutex provide safe concurrent access
- Async-Native: Works with Tokio async/await
- Handler Distribution: Each handler gets Arc clone (scales across threads)
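The "cheap clone" point can be illustrated with a minimal sketch. It uses std threads rather than Tokio tasks so that it runs without dependencies; `Config`, `share_across_threads`, and `handle_size` are hypothetical names chosen for illustration and are not part of the codebase.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a service stored in AppState.
struct Config {
    name: String,
}

/// Gives each of `n` threads its own Arc clone and counts how many
/// of them read the expected value from the single shared allocation.
fn share_across_threads(n: usize) -> usize {
    let shared = Arc::new(Config { name: "vapora".to_string() });
    let handles: Vec<_> = (0..n)
        .map(|_| {
            // Cloning copies a pointer and bumps an atomic counter;
            // the Config itself is never duplicated.
            let local = Arc::clone(&shared);
            thread::spawn(move || local.name == "vapora")
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|ok| *ok)
        .count()
}

/// Size of one Arc handle: a single pointer for a sized payload.
fn handle_size() -> usize {
    std::mem::size_of::<Arc<Config>>()
}
```

Calling `share_across_threads(4)` hands out four handles to the same `Config`. Only the handle is copied per clone, whatever the size of the data behind it.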
Alternatives Considered
❌ Direct Shared References
- Pros: Simple
- Cons: Borrowed data rarely satisfies the 'static bound that spawned async tasks require, so sharing by plain reference would need unsafe code
❌ Message Passing Only (Channels)
- Pros: Avoids shared state
- Cons: Overkill for read-heavy state; every read becomes a request/response round trip, which adds latency
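For comparison, here is a rough sketch of what the rejected channel-only design could look like, using std::sync::mpsc so that it runs standalone. `Command`, `spawn_registry_actor`, and `get_status` are hypothetical names, and statuses are plain strings for brevity. Note that even a simple read needs its own reply channel and a round trip through the owning thread.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Messages understood by the (hypothetical) registry actor.
enum Command {
    Set { id: String, status: String },
    Get { id: String, reply: mpsc::Sender<Option<String>> },
}

/// Spawns a thread that owns the registry and serves commands until
/// every sender has been dropped.
fn spawn_registry_actor() -> mpsc::Sender<Command> {
    let (tx, rx) = mpsc::channel::<Command>();
    thread::spawn(move || {
        let mut registry: HashMap<String, String> = HashMap::new();
        for cmd in rx {
            match cmd {
                Command::Set { id, status } => {
                    registry.insert(id, status);
                }
                Command::Get { id, reply } => {
                    // Ignore the error if the requester has gone away.
                    let _ = reply.send(registry.get(&id).cloned());
                }
            }
        }
    });
    tx
}

/// A read is a full request/response exchange with the actor.
fn get_status(actor: &mpsc::Sender<Command>, id: &str) -> Option<String> {
    let (reply_tx, reply_rx) = mpsc::channel();
    actor
        .send(Command::Get { id: id.to_string(), reply: reply_tx })
        .ok()?;
    reply_rx.recv().ok()?
}
```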
✅ Arc<RwLock<>> / Arc<Mutex<>> (CHOSEN)
- Right balance of simplicity and safety
Trade-offs
Pros:
- ✅ Cheap clones via Arc
- ✅ Type-safe via Rust borrow checker
- ✅ Works seamlessly with async/await
- ✅ RwLock for read-heavy workloads (multiple readers)
- ✅ Mutex for write-heavy/simple cases
Cons:
- ⚠️ Lock contention possible under high concurrency
- ⚠️ Deadlock risk if not careful (nested locks)
- ⚠️ Poisoned lock handling needed wherever std::sync locks are used (tokio::sync locks do not poison)
Implementation
Shared State Definition:
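// crates/vapora-backend/src/api/state.rs
use std::collections::HashMap;
use std::sync::Arc;
// Assumed: the `.lock().await` calls in the handlers imply Tokio's async locks.
use tokio::sync::{Mutex, RwLock};

// Clone is needed so the State extractor can hand each handler its own copy
// (assumed to be axum's extractor, judging by the handler signatures).
// Cloning only bumps the Arc reference counts.
#[derive(Clone)]
pub struct AppState {
    pub project_service: Arc<ProjectService>,
    pub task_service: Arc<TaskService>,
    pub agent_service: Arc<AgentService>,
    // Shared mutable state
    pub task_queue: Arc<Mutex<Vec<Task>>>,
    pub agent_registry: Arc<RwLock<HashMap<String, AgentState>>>,
    pub metrics: Arc<RwLock<Metrics>>,
}

impl AppState {
    pub fn new(
        project_service: ProjectService,
        task_service: TaskService,
        agent_service: AgentService,
    ) -> Self {
        Self {
            project_service: Arc::new(project_service),
            task_service: Arc::new(task_service),
            agent_service: Arc::new(agent_service),
            task_queue: Arc::new(Mutex::new(Vec::new())),
            agent_registry: Arc::new(RwLock::new(HashMap::new())),
            metrics: Arc::new(RwLock::new(Metrics::default())),
        }
    }
}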
Using Arc in Handlers:
// Each handler receives its own AppState clone through the State extractor;
// the Arc fields inside make that clone cheap.
pub async fn create_task(
    State(app_state): State<AppState>, // AppState holds Arcs; it is not itself an Arc
    Json(req): Json<CreateTaskRequest>,
) -> Result<Json<Task>, ApiError> {
    let task = app_state
        .task_service
        .create_task(&req)
        .await?;

    // Push to shared queue
    let mut queue = app_state.task_queue.lock().await;
    queue.push(task.clone());

    Ok(Json(task))
}
RwLock Pattern (Read-Heavy):
// crates/vapora-backend/src/swarm/registry.rs
pub async fn get_agent_status(
    app_state: &AppState,
    agent_id: &str,
) -> Result<AgentStatus> {
    // Multiple concurrent readers can hold read lock
    let registry = app_state.agent_registry.read().await;
    let agent = registry
        .get(agent_id)
        .ok_or(VaporaError::NotFound)?;
    // Returning by value assumes AgentStatus is Copy
    Ok(agent.status)
}

pub async fn update_agent_status(
    app_state: &AppState,
    agent_id: &str,
    new_status: AgentStatus,
) -> Result<()> {
    // Exclusive write lock
    let mut registry = app_state.agent_registry.write().await;
    if let Some(agent) = registry.get_mut(agent_id) {
        agent.status = new_status;
        Ok(())
    } else {
        Err(VaporaError::NotFound)
    }
}
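The claim that several readers can hold the read lock at once can be checked with a small standalone sketch. It uses std::sync::RwLock and std threads instead of the Tokio types so that it needs no dependencies; the function name and the string-valued registry are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Barrier, RwLock};
use std::thread;

/// Starts `n` reader threads that all hold a read guard at the same time
/// and returns how many of them saw the expected entry.
fn concurrent_readers(n: usize) -> usize {
    let registry: Arc<RwLock<HashMap<String, String>>> =
        Arc::new(RwLock::new(HashMap::new()));
    registry
        .write()
        .unwrap()
        .insert("agent-1".to_string(), "idle".to_string());

    // The barrier opens only once all `n` threads are waiting on it,
    // and each thread waits while still holding its read guard.
    let barrier = Arc::new(Barrier::new(n));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let registry = Arc::clone(&registry);
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                let guard = registry.read().unwrap();
                barrier.wait(); // would never return if read locks were exclusive
                let ok = guard.get("agent-1").map(String::as_str) == Some("idle");
                ok
            })
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|ok| *ok)
        .count()
}
```

With a Mutex in place of the RwLock, the same program would hang at the barrier, because only one thread could hold the guard at a time.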
Mutex Pattern (Write-Heavy):
// crates/vapora-backend/src/api/task_queue.rs
pub async fn dequeue_task(
    app_state: &AppState,
) -> Option<Task> {
    let mut queue = app_state.task_queue.lock().await;
    // Vec::pop removes the most recently pushed task (LIFO order)
    queue.pop()
}

pub async fn enqueue_task(
    app_state: &AppState,
    task: Task,
) {
    let mut queue = app_state.task_queue.lock().await;
    queue.push(task);
}
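Because Vec::pop removes the last element, the functions above hand out the newest task first. If first-in, first-out order is wanted, one option is a VecDeque behind the same kind of lock. The sketch below uses std::sync::Mutex so that it runs standalone; `Task` and `TaskQueue` here are simplified, hypothetical stand-ins and not the project's types.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Simplified, hypothetical task type.
#[derive(Debug, Clone, PartialEq)]
struct Task {
    id: u32,
}

/// FIFO queue guarded by a single Mutex.
struct TaskQueue {
    inner: Mutex<VecDeque<Task>>,
}

impl TaskQueue {
    fn new() -> Self {
        Self { inner: Mutex::new(VecDeque::new()) }
    }

    /// Appends at the back; the guard is dropped at the end of the statement.
    fn enqueue(&self, task: Task) {
        self.inner.lock().unwrap().push_back(task);
    }

    /// Removes from the front, so the oldest task comes out first.
    fn dequeue(&self) -> Option<Task> {
        self.inner.lock().unwrap().pop_front()
    }
}
```

Both operations take the lock for a single statement, which keeps the critical section as short as possible.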
Avoiding Deadlocks:
// ✅ GOOD: Single lock acquisition
pub async fn safe_operation(app_state: &AppState) {
    let mut registry = app_state.agent_registry.write().await;
    // Do work
    // Lock automatically released when the guard is dropped
}

// ❌ BAD: Nested locks with no agreed acquisition order (can deadlock)
pub async fn unsafe_operation(app_state: &AppState) {
    let mut registry = app_state.agent_registry.write().await;
    let mut queue = app_state.task_queue.lock().await; // Risk: lock order inversion
    // If another task acquires these locks in the opposite order, both can wait forever
}

// ✅ GOOD: Consistent lock order prevents deadlocks
// Always acquire: agent_registry → task_queue
pub async fn safe_nested(app_state: &AppState) {
    let mut registry = app_state.agent_registry.write().await;
    let mut queue = app_state.task_queue.lock().await; // Same order everywhere
    // Safe from deadlock as long as every caller follows the documented order
}
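One way to make the ordering rule hard to break is to take both locks in exactly one helper. The following is a minimal sketch of that idea with std locks and threads; `Shared`, `with_both`, and `run_workers` are hypothetical names, and the payload types are simplified.

```rust
use std::sync::{Arc, Mutex, RwLock};
use std::thread;

/// Simplified stand-in for the two pieces of shared state.
struct Shared {
    registry: RwLock<Vec<String>>,
    queue: Mutex<Vec<u32>>,
}

impl Shared {
    /// The only place where both locks are held together.
    /// Order is fixed: registry first, then queue.
    fn with_both<R>(&self, f: impl FnOnce(&mut Vec<String>, &mut Vec<u32>) -> R) -> R {
        let mut registry = self.registry.write().unwrap();
        let mut queue = self.queue.lock().unwrap();
        f(&mut *registry, &mut *queue)
    }
}

/// Runs `n` threads that each update both structures through the helper
/// and returns the final lengths.
fn run_workers(n: u32) -> (usize, usize) {
    let shared = Arc::new(Shared {
        registry: RwLock::new(Vec::new()),
        queue: Mutex::new(Vec::new()),
    });
    let handles: Vec<_> = (0..n)
        .map(|i| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                shared.with_both(|registry, queue| {
                    registry.push(format!("agent-{}", i));
                    queue.push(i);
                });
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    shared.with_both(|registry, queue| (registry.len(), queue.len()))
}
```

Since no caller can acquire the pair in any other order, the lock order inversion described above cannot arise through this helper.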
Poisoned Lock Handling:
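// Poisoning exists only for std::sync locks. tokio::sync::Mutex and RwLock
// never poison: `lock().await` returns the guard directly, not a Result.
// The recovery below applies if a field is backed by std::sync::Mutex.
pub fn handle_poisoned_lock(
    task_queue: &std::sync::Mutex<Vec<Task>>,
) -> Vec<Task> {
    match task_queue.lock() {
        Ok(queue) => queue.clone(),
        Err(poisoned) => {
            // Lock was poisoned (a thread panicked while holding it)
            // Recover by taking the guard out of the error
            let queue = poisoned.into_inner();
            queue.clone()
        }
    }
}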
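Lock poisoning is a property of the std::sync locks; Tokio's async Mutex and RwLock do not poison. The sketch below shows, under that assumption, how a std::sync::Mutex becomes poisoned and how its data can still be read afterwards. `poison_and_recover` is a hypothetical name, and the queue holds plain integers for brevity.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons a std Mutex by panicking while its guard is held, then
/// recovers the data. Returns (was_poisoned, recovered_contents).
fn poison_and_recover() -> (bool, Vec<u32>) {
    let queue = Arc::new(Mutex::new(vec![1, 2, 3]));

    let worker_queue = Arc::clone(&queue);
    let result = thread::spawn(move || {
        let _guard = worker_queue.lock().unwrap();
        panic!("simulated failure while holding the lock");
    })
    .join();
    // The panic surfaces as an Err from join().
    assert!(result.is_err());

    let was_poisoned = queue.is_poisoned();
    let snapshot = match queue.lock() {
        Ok(guard) => guard.clone(),
        // PoisonError::into_inner hands back the guard despite the poison flag.
        Err(poisoned) => poisoned.into_inner().clone(),
    };
    (was_poisoned, snapshot)
}
```

Whether recovering is appropriate depends on whether the panicking code could have left the data half-updated; here the vector was never modified, so reading it is safe.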
Key Files:
- /crates/vapora-backend/src/api/state.rs (state definition)
- /crates/vapora-backend/src/main.rs (state creation)
- /crates/vapora-backend/src/api/ (handlers using Arc)
Verification
# Test concurrent access to shared state
cargo test -p vapora-backend test_concurrent_state_access
# Test RwLock read-heavy performance
cargo test -p vapora-backend test_rwlock_concurrent_reads
# Test Mutex write-heavy correctness
cargo test -p vapora-backend test_mutex_exclusive_writes
# Integration: multiple handlers accessing shared state
cargo test -p vapora-backend test_shared_state_integration
# Stress test: high concurrency
cargo test -p vapora-backend test_shared_state_stress
Expected Output:
- Concurrent reads successful (RwLock)
- Exclusive writes correct (Mutex)
- No data races (Rust guarantees)
- Deadlock-free (consistent lock ordering)
- High throughput under load
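The test names above refer to the project's own suite, which is not shown here. As an illustration only, a check in the spirit of the "exclusive writes" test could look like the following std-only sketch; `concurrent_increments` is a hypothetical helper, not the actual test.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Has `threads` workers each add 1 to a shared counter `per_thread` times.
/// If the Mutex serializes writes correctly, no increment is lost.
fn concurrent_increments(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Lock, increment, and release within one statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```

The expected total is simply `threads * per_thread`; any smaller value would indicate a lost update.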
Consequences
Performance
- Read locks: low contention (multiple readers)
- Write locks: exclusive (single writer)
- Mutex: simple, but serializes all access, reads included
Concurrency Model
- Handlers clone Arc (cheap: an atomic reference-count increment plus a pointer-sized copy, 8 bytes on 64-bit targets)
- Multiple threads access same data
- Lock guards released when dropped
Debugging
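- Data races ruled out at compile time (safe Rust); logical race conditions are still possible
- Deadlocks not caught by the compiler; prevented by lock-ordering discipline
- Poisoned locks occur only with std::sync locks, after a panic while a guard is held; tokio::sync locks do not poison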
Scaling
- Per-core scalability excellent (read-heavy)
- Write contention bottleneck (if heavy)
- Sharding option for write-heavy
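The sharding option mentioned above can be sketched as follows: keys are hashed onto a fixed number of independently locked maps, so writers to different shards do not block each other. This is an illustrative, std-only sketch; `ShardedRegistry` and its methods are hypothetical and not part of the codebase.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

/// Registry split into independently locked shards.
struct ShardedRegistry {
    shards: Vec<RwLock<HashMap<String, String>>>,
}

impl ShardedRegistry {
    fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "at least one shard is required");
        let shards = (0..shard_count)
            .map(|_| RwLock::new(HashMap::new()))
            .collect();
        Self { shards }
    }

    /// Picks the shard for a key by hashing it.
    fn shard_for(&self, key: &str) -> &RwLock<HashMap<String, String>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let index = (hasher.finish() as usize) % self.shards.len();
        &self.shards[index]
    }

    /// Locks only the shard that owns `key` for writing.
    fn insert(&self, key: &str, value: &str) {
        self.shard_for(key)
            .write()
            .unwrap()
            .insert(key.to_string(), value.to_string());
    }

    /// Locks only the shard that owns `key` for reading.
    fn get(&self, key: &str) -> Option<String> {
        self.shard_for(key).read().unwrap().get(key).cloned()
    }
}
```

The trade-off is that operations spanning all entries, such as listing every agent, must visit each shard in turn.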
References
- Arc Documentation
- RwLock Documentation
- Mutex Documentation
- /crates/vapora-backend/src/api/state.rs (implementation)
Related ADRs: ADR-008 (Tokio Runtime), ADR-024 (Service Architecture)