VAPORA Architecture

Multi-Agent, Multi-AI Cloud-Native Platform

Status: Production Ready (v1.2.0)
Date: January 2026

📊 Executive Summary

VAPORA is a cloud-native platform for multi-agent software development:

✅ 12 specialized agents working in parallel (Architect, Developer, Reviewer, Tester, Documenter, etc.)
✅ Multi-AI routing (Claude, OpenAI, Gemini, Ollama) optimized per task
✅ Full-stack Rust (Backend, Frontend, Agents, Infrastructure)
✅ Kubernetes-native deployment via Provisioning
✅ Self-hosted - no SaaS dependencies
✅ Cedar-based RBAC for teams and access control
✅ NATS JetStream for inter-agent coordination
✅ Learning-based agent selection with task-type expertise
✅ Budget-enforced LLM routing with automatic fallback
✅ Knowledge Graph for execution history and learning curves

🏗️ 4-Layer Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                         Frontend Layer                              │
│              Leptos CSR (WASM) + UnoCSS Glassmorphism               │
│                                                                     │
│  Kanban Board  │  Projects  │  Agents Marketplace  │  Settings      │
└──────────────────────────────┬──────────────────────────────────────┘
                               │
                        Istio Ingress (mTLS)
                               │
┌──────────────────────────────┴──────────────────────────────────────┐
│                         API Layer                                   │
│              Axum REST API + WebSocket (Async Rust)                 │
│                                                                     │
│      /tasks  │  /agents  │  /workflows  │  /auth  │  /projects      │
│      Rate Limiting  │  Auth (JWT)  │  Compression                   │
└──────────────────────────────┬──────────────────────────────────────┘
                               │
          ┌────────────────────┼────────────────────┐
          │                    │                    │
┌─────────▼────────┐ ┌────────▼────────┐ ┌────────▼─────────┐
│   Agent Service  │ │  LLM Router     │ │   MCP Gateway    │
│   Orchestration  │ │  (Multi-AI)     │ │  (Plugin System) │
└────────┬─────────┘ └────────┬────────┘ └────────┬─────────┘
         │                    │                   │
         └────────────────────┼───────────────────┘
                              │
         ┌────────────────────┼───────────────────┐
         │                    │                   │
    ┌────▼─────┐      ┌──────▼──────┐      ┌────▼──────┐
    │SurrealDB │      │NATS Jet     │      │RustyVault │
    │(MultiTen)│      │Stream (Jobs)│      │(Secrets)  │
    └──────────┘      └─────────────┘      └───────────┘
                              │
                    ┌─────────▼─────────┐
                    │ Observability     │
                    │ Prometheus/Grafana│
                    │ Loki/Tempo (Logs) │
                    └───────────────────┘

📋 Component Overview

Frontend (Leptos WASM)

Kanban Board: Drag-drop task management with real-time updates
Project Dashboard: Project overview, metrics, team stats
Agent Marketplace: Browse, install, configure agent plugins
Settings: User preferences, workspace configuration

Tech: Leptos (reactive), UnoCSS (styling), WebSocket (real-time)

API Layer (Axum)

REST Endpoints (40+): Full CRUD for projects, tasks, agents, workflows
WebSocket API: Real-time task updates, agent status changes
Authentication: JWT tokens, refresh rotation
Rate Limiting: Per-user/IP throttling
Compression: gzip for bandwidth optimization

Tech: Axum (async), Tokio (runtime), Tower middleware

Service Layer

Agent Orchestration:

Agent registry with capability-based discovery
Task assignment via SwarmCoordinator with load balancing
Learning profiles for task-type expertise
Health checking with automatic agent removal
NATS JetStream integration for async coordination
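
A minimal sketch of capability-based discovery with removal of unhealthy agents follows. The `AgentRegistry` type, its method names, and the capability strings are hypothetical and chosen for illustration only; the real registry may store far more per agent.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical in-memory registry: agent id -> set of capability names.
#[derive(Debug, Default)]
pub struct AgentRegistry {
    agents: BTreeMap<String, BTreeSet<String>>,
}

impl AgentRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Registers (or re-registers) an agent with its capabilities.
    pub fn register(&mut self, agent_id: &str, capabilities: &[&str]) {
        let caps: BTreeSet<String> = capabilities.iter().map(|c| c.to_string()).collect();
        self.agents.insert(agent_id.to_string(), caps);
    }

    /// Removes an agent, for example after a failed health check.
    /// Returns true if the agent was registered.
    pub fn remove(&mut self, agent_id: &str) -> bool {
        self.agents.remove(agent_id).is_some()
    }

    /// Returns the ids of all agents advertising `capability`, sorted by id.
    pub fn find_by_capability(&self, capability: &str) -> Vec<&str> {
        self.agents
            .iter()
            .filter(|(_, caps)| caps.contains(capability))
            .map(|(id, _)| id.as_str())
            .collect()
    }
}
```

A coordinator could call `find_by_capability` to build the candidate list for a task and `remove` when a health check fails.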

LLM Router (Multi-Provider):

Claude (Opus, Sonnet, Haiku)
OpenAI (GPT-4, GPT-4o)
Google Gemini (2.0 Pro, Flash)
Ollama (Local open-source models)

Provider Selection Strategy:

Rules-based routing by task complexity/type
Learning-based selection by agent expertise
Budget-aware routing with automatic fallback
Cost efficiency ranking (quality/cost ratio)
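
The cost efficiency ranking can be illustrated with a small sketch. The `ProviderStats` struct, its fields, and the treatment of zero-cost providers as maximally efficient are assumptions made for this example, not the router's confirmed behavior.

```rust
/// Hypothetical provider descriptor; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct ProviderStats {
    pub name: String,
    /// Estimated quality for the task type, in [0.0, 1.0].
    pub quality: f64,
    /// Estimated cost per request, in cents.
    pub cost_cents: f64,
}

/// Quality/cost ratio. Assumption: a provider with no cost
/// (for example a local model) is treated as infinitely efficient.
pub fn efficiency(p: &ProviderStats) -> f64 {
    if p.cost_cents <= 0.0 {
        f64::INFINITY
    } else {
        p.quality / p.cost_cents
    }
}

/// Returns the providers sorted from most to least cost-efficient.
pub fn rank_by_efficiency(mut providers: Vec<ProviderStats>) -> Vec<ProviderStats> {
    providers.sort_by(|a, b| efficiency(b).total_cmp(&efficiency(a)));
    providers
}
```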

MCP Gateway:

Plugin protocol for external tools
Code analysis, RAG, GitHub, Jira integrations
Tool calling and resource management

Data Layer

SurrealDB:

Multi-tenant scopes for workspace isolation
Nested tables for relational data
Full-text search for task/doc indexing
Versioning for audit trails

NATS JetStream:

Reliable message queue for agent jobs
Consumer groups for load balancing
At-least-once delivery guarantee

RustyVault:

API key storage (OpenAI, Anthropic, Google)
Encryption at rest
Audit logging

🔄 Data Flow: Task Execution

1. User creates task in Kanban → API POST /tasks
2. Backend validates and persists to SurrealDB
3. Task published to NATS subject: tasks.{type}.{priority}
4. SwarmCoordinator subscribes, selects best agent:
   - Learning profile lookup (task-type expertise)
   - Load-adjusted base score: success_rate / (1 + load)
   - Scoring: 0.3*base_score + 0.5*expertise + 0.2*confidence
5. Agent receives job, calls LLMRouter.select_provider():
   - Check budget status (monthly/weekly limits)
   - If budget exceeded: fall back to the cheapest provider (Ollama/Gemini)
   - If near threshold: prefer cost-efficient provider
   - Otherwise: rule-based routing
6. LLM generates response
7. Agent processes result, stores execution in KG
8. Result persisted to SurrealDB
9. Learning profiles updated (background sync, 30s interval)
10. Budget tracker updated
11. WebSocket pushes update to frontend
12. Kanban board updates in real-time
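
Step 3 publishes to a subject of the form tasks.{type}.{priority}. The sketch below builds such a subject and checks it against a NATS-style wildcard pattern, where `*` matches exactly one token and `>` matches one or more trailing tokens. The `TaskType` and `Priority` variants and their string forms are hypothetical; the source only shows the placeholders.

```rust
/// Hypothetical task categories for the `{type}` token.
#[derive(Debug, Clone, Copy)]
pub enum TaskType {
    Develop,
    Review,
    Test,
}

/// Hypothetical priority levels for the `{priority}` token.
#[derive(Debug, Clone, Copy)]
pub enum Priority {
    Low,
    Normal,
    High,
}

/// Builds the subject `tasks.{type}.{priority}`.
pub fn task_subject(task_type: TaskType, priority: Priority) -> String {
    let t = match task_type {
        TaskType::Develop => "develop",
        TaskType::Review => "review",
        TaskType::Test => "test",
    };
    let p = match priority {
        Priority::Low => "low",
        Priority::Normal => "normal",
        Priority::High => "high",
    };
    format!("tasks.{t}.{p}")
}

/// Token-wise wildcard match: `*` matches one token,
/// `>` matches one or more remaining tokens.
pub fn subject_matches(pattern: &str, subject: &str) -> bool {
    let mut pat = pattern.split('.');
    let mut sub = subject.split('.');
    loop {
        match (pat.next(), sub.next()) {
            (Some(">"), Some(_)) => return true,
            (Some("*"), Some(_)) => continue,
            (Some(p), Some(s)) if p == s => continue,
            (None, None) => return true,
            _ => return false,
        }
    }
}
```

Under these assumptions, a coordinator interested only in high-priority work could subscribe to `tasks.*.high`, while one handling everything would use `tasks.>`.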

🔐 Security & Multi-Tenancy

Tenant Isolation:

SurrealDB scopes: workspace:123, team:456
Row-level filtering in all queries
Queries are scoped so that one tenant cannot read another tenant's data

Authentication:

JWT tokens (HS256)
Token TTL: 15 minutes
Refresh token rotation (7 days)
HTTPS/mTLS enforced
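
The token lifetimes above can be expressed as constants with a simple expiry check. This is a sketch only: it assumes the 7-day figure is the refresh token lifetime, that timestamps are Unix seconds, and the names `ACCESS_TOKEN_TTL`, `REFRESH_TOKEN_TTL`, and `is_expired` are hypothetical. It does not perform JWT signing or verification.

```rust
use std::time::Duration;

/// Access token lifetime: 15 minutes.
pub const ACCESS_TOKEN_TTL: Duration = Duration::from_secs(15 * 60);

/// Refresh token lifetime: 7 days (assumed meaning of the rotation window).
pub const REFRESH_TOKEN_TTL: Duration = Duration::from_secs(7 * 24 * 60 * 60);

/// Returns true once `ttl` has fully elapsed since `issued_at_secs`.
/// A clock that appears to run backwards is treated as zero elapsed time.
pub fn is_expired(issued_at_secs: u64, now_secs: u64, ttl: Duration) -> bool {
    now_secs.saturating_sub(issued_at_secs) >= ttl.as_secs()
}
```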

Authorization (Cedar Policy Engine):

Fine-grained RBAC per workspace
Roles: Owner, Admin, Member, Viewer
Resource-scoped permissions: create_task, edit_workflow, etc.

Audit Logging:

All significant actions logged: task creation, agent assignment, provider selection
Timestamp, actor, action, resource, result
Searchable in SurrealDB
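
The five audit fields map naturally onto a record type. The sketch below is illustrative: the `AuditEntry` and `Outcome` names, the Unix-seconds timestamp, and the flat `key=value` line format are assumptions, not the stored schema.

```rust
/// Hypothetical outcome of an audited action.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    Success,
    Failure,
}

/// One audit record: timestamp, actor, action, resource, result.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AuditEntry {
    pub timestamp_unix_secs: u64,
    pub actor: String,
    pub action: String,
    pub resource: String,
    pub result: Outcome,
}

impl AuditEntry {
    /// Renders the entry as a single `key=value` log line.
    pub fn to_line(&self) -> String {
        format!(
            "{} actor={} action={} resource={} result={:?}",
            self.timestamp_unix_secs, self.actor, self.action, self.resource, self.result
        )
    }
}
```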

🚀 Learning & Cost Optimization

Multi-Agent Learning (Phase 5.3)

Learning Profiles:

Per-agent, per-task-type expertise tracking
Success rate calculation with recency bias (executions from the last 7 days weighted 3×)
Confidence scoring to prevent overfitting
Learning curves for trend analysis
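
One way to read the recency bias is as a weighted average in which executions from the last 7 days count three times as much as older ones. The sketch below implements that reading; the `Execution` struct and the exact weighting scheme are assumptions, and the real calculation may differ.

```rust
/// Hypothetical record of one past execution for a given task type.
#[derive(Debug, Clone, Copy)]
pub struct Execution {
    /// Age of the execution in days at scoring time.
    pub age_days: f64,
    pub success: bool,
}

pub const RECENT_WINDOW_DAYS: f64 = 7.0;
pub const RECENT_WEIGHT: f64 = 3.0;

/// Weighted success rate: executions inside the 7-day window weigh 3.0,
/// older ones weigh 1.0. Returns None when there is no history.
pub fn weighted_success_rate(history: &[Execution]) -> Option<f64> {
    let mut weighted_successes = 0.0;
    let mut total_weight = 0.0;
    for e in history {
        let w = if e.age_days <= RECENT_WINDOW_DAYS {
            RECENT_WEIGHT
        } else {
            1.0
        };
        total_weight += w;
        if e.success {
            weighted_successes += w;
        }
    }
    if total_weight == 0.0 {
        None
    } else {
        Some(weighted_successes / total_weight)
    }
}
```

For example, one recent success and one month-old failure give 3 / (3 + 1) = 0.75 rather than the unweighted 0.5.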

Agent Scoring Formula:

final_score = 0.3*base_score + 0.5*expertise_score + 0.2*confidence
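
The formula translates directly into code. In this sketch, `base_score` is assumed to be the load-adjusted success rate, success_rate / (1 + load), from the task execution flow; the `Candidate` struct and `select_best` helper are hypothetical.

```rust
/// Hypothetical scoring inputs for one candidate agent.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub id: String,
    pub success_rate: f64,
    pub load: f64,
    pub expertise: f64,
    pub confidence: f64,
}

/// Assumed base score: success rate discounted by current load.
pub fn base_score(success_rate: f64, load: f64) -> f64 {
    success_rate / (1.0 + load)
}

/// final_score = 0.3*base_score + 0.5*expertise_score + 0.2*confidence
pub fn final_score(base: f64, expertise_score: f64, confidence: f64) -> f64 {
    0.3 * base + 0.5 * expertise_score + 0.2 * confidence
}

fn score(c: &Candidate) -> f64 {
    final_score(base_score(c.success_rate, c.load), c.expertise, c.confidence)
}

/// Returns the candidate with the highest final score, if any.
pub fn select_best(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates.iter().max_by(|a, b| score(a).total_cmp(&score(b)))
}
```

Because expertise carries half the weight, a busy expert can still outrank an idle agent with little history in that task type.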

Cost Optimization (Phase 5.4)

Budget Enforcement:

Per-role budget limits (monthly and weekly, in cents)
Three-tier policy:
1. Normal: Rule-based routing
2. Near-threshold (>80%): Prefer cheaper providers
3. Budget exceeded: Automatic fallback to cheapest provider
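
The three tiers can be sketched as a classification over spend and limit in cents. Two details here are assumptions: spend equal to the limit is treated as exceeded, and when both a weekly and a monthly limit apply, the stricter tier wins. The type and function names are hypothetical.

```rust
/// Budget tiers in increasing order of severity.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum BudgetTier {
    Normal,
    NearThreshold,
    Exceeded,
}

/// Classifies spend against a limit, both in cents.
/// NearThreshold means strictly more than 80% of the limit is spent.
pub fn budget_tier(spent_cents: u64, limit_cents: u64) -> BudgetTier {
    if spent_cents >= limit_cents {
        BudgetTier::Exceeded
    } else if (spent_cents as u128) * 100 > (limit_cents as u128) * 80 {
        BudgetTier::NearThreshold
    } else {
        BudgetTier::Normal
    }
}

/// Combines weekly and monthly budgets, each given as (spent, limit),
/// by taking the stricter of the two tiers.
pub fn combined_tier(weekly: (u64, u64), monthly: (u64, u64)) -> BudgetTier {
    budget_tier(weekly.0, weekly.1).max(budget_tier(monthly.0, monthly.1))
}
```

A router could then map `Normal` to rule-based routing, `NearThreshold` to preferring cheaper providers, and `Exceeded` to the cheapest-provider fallback.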

Provider Fallback Chain (cost-ordered):