# VAPORA Architecture Decision Records (ADRs)
Documentation of the key architectural decisions of the VAPORA project.
**Status**: Complete (32 ADRs documented)
**Last Updated**: 2026-02-17
**Format**: Custom VAPORA (Decision, Rationale, Alternatives, Trade-offs, Implementation, Verification, Consequences)
---
## 📑 ADRs by Category
---
## 🗄️ Database & Persistence (1 ADR)
Decisions on data storage and persistence.
| ID | Title | Decision | Status |
|----|-------|----------|--------|
| [004](./0004-surrealdb-database.md) | SurrealDB as the Single Database | SurrealDB 2.3 multi-model (relational + graph + document) | ✅ Accepted |
---
## 🏗️ Core Architecture (6 ADRs)
Fundamental decisions on the technology stack and base structure of the project.
| ID | Title | Decision | Status |
|----|-------|----------|--------|
| [001](./0001-cargo-workspace.md) | Cargo Workspace with 13 Crates | Monorepo with a Cargo workspace | ✅ Accepted |
| [002](./0002-axum-backend.md) | Axum as Backend Framework | Axum 0.8.6 REST API + composable middleware | ✅ Accepted |
| [003](./0003-leptos-frontend.md) | Leptos CSR-Only Frontend | Leptos 0.8.12 WASM (Client-Side Rendering) | ✅ Accepted |
| [006](./0006-rig-framework.md) | Rig Framework for LLM Agents | rig-core 0.15 for agent orchestration | ✅ Accepted |
| [008](./0008-tokio-runtime.md) | Tokio Multi-Threaded Runtime | Tokio async runtime with default configuration | ✅ Accepted |
| [013](./0013-knowledge-graph.md) | Knowledge Graph Temporal | SurrealDB temporal KG + learning curves | ✅ Accepted |
---
## 🔄 Agent Coordination & Messaging (5 ADRs)
Decisions on coordination between agents and message communication.
| ID | Title | Decision | Status |
|----|-------|----------|--------|
| [005](./0005-nats-jetstream.md) | NATS JetStream for Agent Coordination | async-nats 0.45 with JetStream (at-least-once delivery) | ✅ Accepted |
| [007](./0007-multi-provider-llm.md) | Multi-Provider LLM Support | Claude + OpenAI + Gemini + Ollama with automatic fallback | ✅ Accepted |
| [030](./0030-a2a-protocol-implementation.md) | A2A Protocol Implementation | Axum JSON-RPC 2.0 server + resilient client with exponential backoff | ✅ Implemented |
| [031](./0031-kubernetes-deployment-kagent.md) | Kubernetes Deployment Strategy for kagent | Kustomize + StatefulSet with dev/prod overlays | ✅ Accepted |
| [032](./0032-a2a-error-handling-json-rpc.md) | A2A Error Handling and JSON-RPC 2.0 Compliance | Two-layer: thiserror domain errors + JSON-RPC 2.0 protocol conversion | ✅ Implemented |
---
## ☁️ Infrastructure & Security (4 ADRs)
Decisions on Kubernetes infrastructure, security, and secrets management.
| ID | Title | Decision | Status |
|----|-------|----------|--------|
| [009](./0009-istio-service-mesh.md) | Istio Service Mesh | Istio for mTLS + traffic management + observability | ✅ Accepted |
| [010](./0010-cedar-authorization.md) | Cedar Policy Engine | Cedar policies for declarative RBAC | ✅ Accepted |
| [011](./0011-secretumvault.md) | SecretumVault Secrets Management | Post-quantum crypto for secrets management | ✅ Accepted |
| [012](./0012-llm-routing-tiers.md) | Three-Tier LLM Routing | Rules-based + Dynamic + Manual Override | ✅ Accepted |
---
## 🚀 VAPORA Innovations (10 ADRs)
Decisions unique to VAPORA that set it apart from other multi-agent orchestration platforms.
| ID | Title | Decision | Status |
|----|-------|----------|--------|
| [014](./0014-learning-profiles.md) | Learning Profiles with Recency Bias | Exponential recency weighting (3× for the last 7 days) | ✅ Accepted |
| [015](./0015-budget-enforcement.md) | Three-Tier Budget Enforcement | Monthly + weekly limits with auto-fallback to Ollama | ✅ Accepted |
| [016](./0016-cost-efficiency-ranking.md) | Cost Efficiency Ranking | Formula: `(quality_score * 100) / (cost_cents + 1)` | ✅ Accepted |
| [017](./0017-confidence-weighting.md) | Confidence Weighting | `min(1.0, executions/20)` prevents lucky streaks | ✅ Accepted |
| [018](./0018-swarm-load-balancing.md) | Swarm Load-Balanced Assignment | `assignment_score = success_rate / (1 + load)` | ✅ Accepted |
| [019](./0019-temporal-execution-history.md) | Temporal Execution History | Daily windowed aggregations for learning curves | ✅ Accepted |
| [020](./0020-audit-trail.md) | Audit Trail for Compliance | Complete event logging + queryability | ✅ Accepted |
| [021](./0021-websocket-updates.md) | Real-Time WebSocket Updates | `tokio::sync::broadcast` for efficient pub/sub | ✅ Accepted |
| [028](./0028-workflow-orchestrator.md) | Workflow Orchestrator for Multi-Agent Pipelines | Short-lived agent contexts + artifact passing to reduce cache tokens by 95% | ✅ Accepted |
| [029](./0029-rlm-recursive-language-models.md) | Recursive Language Models (RLM) | Custom Rust engine: BM25 + semantic hybrid search + distributed LLM dispatch + WASM/Docker sandbox | ✅ Accepted |
---
## 🔧 Development Patterns (6 ADRs)
Development and architecture patterns used throughout the codebase.
| ID | Title | Decision | Status |
|----|-------|----------|--------|
| [022](./0022-error-handling.md) | Two-Tier Error Handling | thiserror domain errors + ApiError HTTP wrapper | ✅ Accepted |
| [023](./0023-testing-strategy.md) | Multi-Layer Testing Strategy | Unit tests (inline) + Integration (tests/) + Real DB | ✅ Accepted |
| [024](./0024-service-architecture.md) | Service-Oriented Architecture | API layer (thin) + Services layer (thick business logic) | ✅ Accepted |
| [025](./0025-multi-tenancy.md) | SurrealDB Scope-Based Multi-Tenancy | `tenant_id` fields + database scopes for defense-in-depth | ✅ Accepted |
| [026](./0026-shared-state.md) | Arc-Based Shared State | `Arc<RwLock<>>` for read-heavy, `Arc<Mutex<>>` for write-heavy | ✅ Accepted |
| [027](./0027-documentation-layers.md) | Three-Layer Documentation System | .coder/ (session) + .claude/ (operational) + docs/ (product) | ✅ Accepted |
---
## Documentation by Category
### 🗄️ Database & Persistence
- **SurrealDB**: Multi-model database (relational + graph + document) unifies all VAPORA data needs with native multi-tenancy support via scopes
### 🏗️ Core Architecture
- **Workspace**: Monorepo structure with 13 specialized crates enables independent testing, parallel development, code reuse
- **Backend**: Axum provides composable middleware, type-safe routing, direct Tokio ecosystem integration
- **Frontend**: Leptos CSR enables fine-grained reactivity and WASM performance (no SEO needed for platform)
- **LLM Framework**: Rig enables tool calling and streaming with minimal abstraction
- **Runtime**: Tokio multi-threaded optimized for I/O-heavy workloads (API, DB, LLM calls)
- **Knowledge Graph**: Temporal history with learning curves enables collective agent learning via SurrealDB
### 🔄 Agent Coordination & Messaging
- **NATS JetStream**: Provides persistent, reliable at-least-once delivery for agent task coordination
- **Multi-Provider LLM**: Support 4 providers (Claude, OpenAI, Gemini, Ollama) with automatic fallback chain
- **A2A Protocol**: JSON-RPC 2.0 over HTTP enables interoperability with Google kagent and other A2A-compliant agents
- **kagent Kubernetes Deployment**: Kustomize StatefulSet with stable pod identities for predictable A2A endpoint addressing
- **A2A Error Handling**: Two-layer strategy (domain `thiserror` + JSON-RPC 2.0 protocol conversion) specializes ADR-0022 for A2A
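
The two-layer A2A error strategy can be sketched roughly as follows. This is a minimal illustration, not the code from ADR-0032: the `A2aError` variants, the `JsonRpcError` struct, and the `-32001` application code are hypothetical names chosen for the example, and the domain error implements `Display` by hand so the snippet needs only the standard library (the ADR itself uses `thiserror`). The `-32602` and `-32603` codes are the standard JSON-RPC 2.0 "Invalid params" and "Internal error" codes.

```rust
use std::fmt;

/// Layer 1: hypothetical domain error, free of any protocol concerns.
#[derive(Debug)]
pub enum A2aError {
    TaskNotFound(String),
    InvalidParams(String),
    Internal(String),
}

impl fmt::Display for A2aError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            A2aError::TaskNotFound(id) => write!(f, "task not found: {id}"),
            A2aError::InvalidParams(msg) => write!(f, "invalid params: {msg}"),
            A2aError::Internal(msg) => write!(f, "internal error: {msg}"),
        }
    }
}

impl std::error::Error for A2aError {}

/// Layer 2: the JSON-RPC 2.0 error object (`code` + `message`).
#[derive(Debug, PartialEq)]
pub struct JsonRpcError {
    pub code: i32,
    pub message: String,
}

impl From<A2aError> for JsonRpcError {
    fn from(err: A2aError) -> Self {
        let code = match &err {
            // Standard JSON-RPC 2.0 codes.
            A2aError::InvalidParams(_) => -32602,
            A2aError::Internal(_) => -32603,
            // Assumed application-specific code, taken from the
            // implementation-defined server range -32000..=-32099.
            A2aError::TaskNotFound(_) => -32001,
        };
        JsonRpcError { code, message: err.to_string() }
    }
}
```

With this shape, handlers can return domain errors and convert only at the protocol boundary, for example `let rpc: JsonRpcError = A2aError::InvalidParams("missing id".into()).into();`.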
### ☁️ Infrastructure & Security
- **Istio Service Mesh**: Provides zero-trust security (mTLS), traffic management, observability for inter-service communication
- **Cedar Authorization**: Declarative, auditable RBAC policies for fine-grained access control
- **SecretumVault**: Post-quantum cryptography future-proofs API key and credential storage
- **Three-Tier LLM Routing**: Balances predictability (rules-based) with flexibility (dynamic scoring) and manual override capability
### 🚀 Innovations Unique to VAPORA
- **Learning Profiles**: Recency-biased expertise tracking (3× weight for last 7 days) adapts agent selection to current capability
- **Budget Enforcement**: Dual time windows (monthly + weekly) with three enforcement states + auto-fallback prevent both long-term and short-term overspend
- **Cost Efficiency Ranking**: Quality-to-cost formula `(quality_score * 100) / (cost_cents + 1)` prevents overfitting to cheap providers
- **Confidence Weighting**: `min(1.0, executions/20)` prevents new agents from being selected on lucky streaks
- **Swarm Load Balancing**: `success_rate / (1 + load)` balances agent expertise with availability
- **Temporal Execution History**: Daily windowed aggregations identify improvement trends and enable collective learning
- **Audit Trail**: Complete event logging for compliance, incident investigation, and event sourcing potential
- **Real-Time WebSocket Updates**: Broadcast channels for efficient multi-client workflow progress updates
- **Workflow Orchestrator**: Short-lived agent contexts + artifact passing reduce cache token costs ~95% vs monolithic sessions
- **Recursive Language Models (RLM)**: Hybrid BM25+semantic search + distributed LLM dispatch + WASM/Docker sandbox enables reasoning over 100k+ token documents
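
The three scoring formulas quoted above translate directly into code. The sketch below is illustrative only: the function names and the choice of `f64` and `u32` are assumptions, and how VAPORA combines these scores with each other is defined in the individual ADRs, not here.

```rust
/// Cost efficiency: (quality_score * 100) / (cost_cents + 1).
/// The `+ 1` keeps the score finite for free providers (cost_cents == 0).
pub fn cost_efficiency(quality_score: f64, cost_cents: f64) -> f64 {
    (quality_score * 100.0) / (cost_cents + 1.0)
}

/// Confidence: min(1.0, executions / 20).
/// Grows linearly and saturates at 1.0 once an agent has 20 executions.
pub fn confidence(executions: u32) -> f64 {
    (f64::from(executions) / 20.0).min(1.0)
}

/// Swarm assignment score: success_rate / (1 + load).
/// An idle agent (load == 0) keeps its full success rate.
pub fn assignment_score(success_rate: f64, load: f64) -> f64 {
    success_rate / (1.0 + load)
}
```

For instance, `confidence(5)` is 0.25 while `confidence(40)` is capped at 1.0, and an agent with a 0.8 success rate and a load of 1 gets an assignment score of 0.4, half of what it would score when idle.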
### 🔧 Development Patterns
- **Two-Tier Error Handling**: Domain errors (`VaporaError`) separate from HTTP responses (`ApiError`) for reusability
- **Multi-Layer Testing**: Unit tests (inline) + Integration tests (tests/ dir) + Real database connections = 218+ tests
- **Service-Oriented Architecture**: Thin API layer delegates to thick services layer containing business logic
- **Scope-Based Multi-Tenancy**: `tenant_id` fields + SurrealDB scopes provide defense-in-depth tenant isolation
- **Arc-Based Shared State**: `Arc<RwLock<>>` for read-heavy, `Arc<Mutex<>>` for write-heavy state management
- **Three-Layer Documentation**: `.coder/` (session) + `.claude/` (operational) + `docs/` (product) separates concerns
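
The Arc-based shared state pattern might look like the sketch below. It uses the standard library's `RwLock` and `Mutex` so it runs without dependencies; an async codebase may prefer the `tokio::sync` equivalents. The `AgentRegistry` and `TaskCounter` aliases and both helper functions are hypothetical names for illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};
use std::thread;

/// Read-heavy state: many concurrent readers, occasional writer.
pub type AgentRegistry = Arc<RwLock<HashMap<String, String>>>;

/// Write-heavy state: every access mutates, so a Mutex is the simpler fit.
pub type TaskCounter = Arc<Mutex<u64>>;

/// Takes a shared read lock; other readers are not blocked.
pub fn lookup_role(registry: &AgentRegistry, agent_id: &str) -> Option<String> {
    let guard = registry.read().expect("registry lock poisoned");
    guard.get(agent_id).cloned()
}

/// Spawns `workers` threads that each increment the counter `per_worker`
/// times, then returns the final value.
pub fn count_tasks_concurrently(counter: &TaskCounter, workers: usize, per_worker: u64) -> u64 {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own handle to the same counter.
            let counter = Arc::clone(counter);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    *counter.lock().expect("counter lock poisoned") += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    let total = *counter.lock().expect("counter lock poisoned");
    total
}
```

The split reflects the trade-off named in the bullet: `RwLock` lets lookups proceed in parallel, while a counter that is written on every access gains nothing from reader sharing.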
---
## How to Use These ADRs
### For Team Members
1. **Understanding Architecture**: Start with the foundational ADRs (001-013), which cover core architecture, persistence, messaging, and infrastructure, to understand technology choices
2. **Learning VAPORA's Unique Features**: Read the Innovations ADRs (014-021, 028-029) to understand what makes VAPORA different
3. **Writing New Code**: Reference the relevant ADRs in the Patterns section (022-027) when implementing features
### For New Hires
1. Read the foundational ADRs (001-013) first: ~30 minutes to understand the stack
2. Read Innovations (014-021, 028-029): ~45 minutes to understand VAPORA's differentiators
3. Reference Patterns (022-027) as you write your first contributions
### For Architectural Decisions
When making new architectural decisions:
1. Check existing ADRs to understand previous choices and trade-offs
2. Create a new ADR following the Custom VAPORA format
3. Reference existing ADRs that influenced your decision
4. Get team review before implementation
### For Troubleshooting
When debugging or optimizing:
1. Find the ADR for the relevant component
2. Review the "Implementation" section for key files
3. Check "Verification" for testing commands
4. Review "Consequences" for known limitations
---
## Format
Each ADR follows the Custom VAPORA format:
```markdown
# ADR-XXX: [Title]
**Status**: Accepted | Implemented
**Date**: YYYY-MM-DD
**Deciders**: [Team/Role]
**Technical Story**: [Context/Issue]
---
## Decision
[Clear description of the decision]
## Rationale
[Why this decision was made]
## Alternatives Considered
[Options evaluated and why they were rejected]
## Trade-offs
**Pros**: [Benefits]
**Cons**: [Costs]
## Implementation
[Where it is implemented, key files, code examples]
## Verification
[How to verify that the decision is correctly implemented]
## Consequences
[Long-term impact, dependencies, maintenance]
## References
[Links to docs, code, issues]
```
---
## Integration with Project Documentation
- **docs/operations/**: Deployment, disaster recovery, operational runbooks
- **docs/disaster-recovery/**: Backup strategy, recovery procedures, business continuity
- **.claude/guidelines/**: Development conventions (Rust, Nushell, Nickel)
- **.claude/CLAUDE.md**: Project-specific constraints and patterns
---
## Maintenance
### When to Update ADRs
- ❌ Do NOT create new ADRs for minor code changes
- ✅ DO create ADRs for significant architectural decisions (framework changes, new patterns, major refactoring)
- ✅ DO update ADRs if a decision changes (mark as "Superseded" and create new ADR)
### Review Process
- ADRs should be reviewed before major architectural changes
- Use ADRs as reference during code reviews to ensure consistency
- Update ADRs if they don't reflect current reality (source of truth = code)
### Quarterly Review
- Review all ADRs quarterly to ensure they're still accurate
- Update "Date" field if reviewed and still valid
- Mark as "Superseded" if implementation has changed
---
## Statistics
- **Total ADRs**: 32
- **Database & Persistence**: 1 (3%)
- **Core Architecture**: 6 (19%)
- **Agent Coordination**: 5 (16%)
- **Infrastructure**: 4 (12%)
- **Innovations**: 10 (31%)
- **Patterns**: 6 (19%)
- **Production Status**: All Accepted and Implemented
---
## Related Resources
- [VAPORA Architecture Overview](../README.md#architecture)
- [Development Guidelines](./../.claude/guidelines/rust.md)
- [Deployment Guide](./operations/deployment-runbook.md)
- [Disaster Recovery](./disaster-recovery/README.md)
---
**Generated**: January 12, 2026
**Status**: Production-Ready
**Last Reviewed**: 2026-02-17