
# VAPORA v1.0 Deployment Guide

Complete guide for deploying VAPORA v1.0 to Kubernetes (self-hosted).

- **Version**: 0.1.0
- **Status**: Production Ready
- **Last Updated**: 2025-11-10

---

## Table of Contents

1. [Overview](#overview)
2. [Prerequisites](#prerequisites)
3. [Architecture](#architecture)
4. [Deployment Methods](#deployment-methods)
5. [Building Docker Images](#building-docker-images)
6. [Kubernetes Deployment](#kubernetes-deployment)
7. [Provisioning Deployment](#provisioning-deployment)
8. [Configuration](#configuration)
9. [Monitoring & Health Checks](#monitoring--health-checks)
10. [Scaling](#scaling)
11. [Troubleshooting](#troubleshooting)
12. [Rollback](#rollback)
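13. [Security](#security)
14. [Backup & Restore](#backup--restore)
15. [Uninstall](#uninstall)
16. [Next Steps](#next-steps)
17. [Support](#support)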

---

## Overview

VAPORA v1.0 is a **cloud-native multi-agent software development platform** that runs on Kubernetes. It consists of:

- **6 Rust components**: Backend API, Frontend UI, Agents, MCP Server, LLM Router (embedded), Shared library
- **2 Infrastructure services**: SurrealDB (database), NATS JetStream (messaging)
- **Multi-LLM routing**: Claude, OpenAI, Gemini, and Ollama support
- **12 specialized agents**: Architect, Developer, Reviewer, Tester, Documenter, etc.

All services are containerized and deployed as Kubernetes workloads.

---

## Prerequisites

### Required Tools

- **Kubernetes 1.25+** (K3s, RKE2, or managed Kubernetes)
- **kubectl** (configured and connected to cluster)
- **Docker** or **Podman** (for building images)
- **Nushell** (for deployment scripts)

### Optional Tools

- **Provisioning CLI** (for advanced deployment)
- **Helm** (if using Helm charts)
- **cert-manager** (for automatic TLS certificates)
- **Prometheus/Grafana** (for monitoring)

### Cluster Requirements

- **Minimum**: 4 CPU, 8GB RAM, 50GB storage
- **Recommended**: 8 CPU, 16GB RAM, 100GB storage
- **Production**: 16+ CPU, 32GB+ RAM, 200GB+ storage

### Storage

- **Storage Class**: Required for SurrealDB PersistentVolumeClaim
- **Options**: local-path, nfs-client, rook-ceph, or cloud provider storage
- **Minimum**: 20Gi for database

### Ingress

- **nginx-ingress** controller installed
- **Domain name** pointing to cluster ingress IP
- **TLS certificate** (optional, recommended for production)
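
Before deploying, it can help to confirm that the required tools are on the `PATH`. The sketch below is a hypothetical helper (`require_tools` is not part of the VAPORA scripts) and assumes the Nushell binary is called `nu`, as in the script invocations later in this guide.

```bash
# Hypothetical preflight helper: report every required tool missing from PATH.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: require_tools kubectl nu && echo "tools OK"
# Cluster-side checks (these need a working kubeconfig):
#   kubectl get storageclass            # a class must exist for the SurrealDB PVC
#   kubectl get pods -n ingress-nginx   # ingress controller pods
```

Add `docker` or `podman` to the argument list depending on which one you build images with.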

---

## Architecture

```
┌─────────────────────────────────────────────────────┐
│                  Internet / Users                   │
└───────────────────────┬─────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────┐
│  Ingress (nginx)                                    │
│  - vapora.example.com                               │
│  - TLS termination                                  │
└────┬────────┬─────────┬─────────┬──────────────────┘
     │        │         │         │
     │        │         │         │
┌────▼────┐ ┌▼─────┐ ┌▼─────┐  ┌▼──────────┐
│Frontend │ │Backend│ │ MCP  │  │           │
│(Leptos) │ │(Axum) │ │Server│  │           │
│ 2 pods  │ │2 pods │ │1 pod │  │           │
└─────────┘ └───┬───┘ └──────┘  │           │
                │                 │           │
         ┌──────┴──────┬──────────┤           │
         │             │          │           │
    ┌────▼────┐   ┌───▼─────┐  ┌▼───────┐   │
    │SurrealDB│   │  NATS   │  │ Agents │   │
    │StatefulS│   │JetStream│  │ 3 pods │   │
    │  1 pod  │   │  1 pod  │  └────────┘   │
    └─────────┘   └─────────┘                │
         │                                   │
    ┌────▼────────────────────────────────┐  │
    │  Persistent Volume (20Gi)           │  │
    │  - SurrealDB data                   │  │
    └─────────────────────────────────────┘  │
                                              │
┌─────────────────────────────────────────────▼──┐
│  External LLM APIs                            │
│  - Anthropic Claude API                       │
│  - OpenAI API                                 │
│  - Google Gemini API                          │
│  - (Optional) Ollama local                    │
└───────────────────────────────────────────────┘
```

---

## Deployment Methods

VAPORA supports two deployment methods:

### Method 1: Vanilla Kubernetes (Recommended for Getting Started)

**Pros**:
- Simple, well-documented
- Standard K8s manifests
- Easy to understand and modify
- No additional tools required

**Cons**:
- Manual cluster management
- Manual service ordering
- No built-in rollback

**Use when**: Learning, testing, or simple deployments

### Method 2: Provisioning (Recommended for Production)

**Pros**:
- Automated cluster creation
- Declarative workflows
- Built-in rollback
- Service mesh integration
- Secret management

**Cons**:
- Requires Provisioning CLI
- More complex configuration
- Steeper learning curve

**Use when**: Production deployments, complex environments

---

## Building Docker Images

### Option 1: Using Nushell Script (Recommended)

```bash
# Build all images (local registry)
nu scripts/build-docker.nu

# Build and push to Docker Hub
nu scripts/build-docker.nu --registry docker.io --push

# Build with specific tag
nu scripts/build-docker.nu --tag v0.1.0

# Build without cache
nu scripts/build-docker.nu --no-cache
```

### Option 2: Manual Docker Build

```bash
# From project root

# Backend
docker build -f crates/vapora-backend/Dockerfile -t vapora/backend:latest .

# Frontend
docker build -f crates/vapora-frontend/Dockerfile -t vapora/frontend:latest .

# Agents
docker build -f crates/vapora-agents/Dockerfile -t vapora/agents:latest .

# MCP Server
docker build -f crates/vapora-mcp-server/Dockerfile -t vapora/mcp-server:latest .
```
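
The four manual commands follow one pattern, so they can be wrapped in a loop when you want a single tag across all images. This is a minimal sketch, not a replacement for `scripts/build-docker.nu`; the function name and the `DRY_RUN` variable are hypothetical.

```bash
# Sketch: build all four images with one tag, mirroring the manual commands above.
# Set DRY_RUN=1 to print the commands instead of running them.
build_vapora_images() {
  tag="${1:-latest}"
  for name in backend frontend agents mcp-server; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "docker build -f crates/vapora-${name}/Dockerfile -t vapora/${name}:${tag} ."
    else
      docker build -f "crates/vapora-${name}/Dockerfile" -t "vapora/${name}:${tag}" . || return 1
    fi
  done
}

# Example (run from the project root): DRY_RUN=1 build_vapora_images v0.1.0
```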

### Image Sizes (Approximate)

- **vapora/backend**: ~50MB (Alpine + Rust binary)
- **vapora/frontend**: ~30MB (nginx + WASM)
- **vapora/agents**: ~50MB (Alpine + Rust binary)
- **vapora/mcp-server**: ~45MB (Alpine + Rust binary)

---

## Kubernetes Deployment

### Step 1: Configure Secrets

Edit `kubernetes/03-secrets.yaml`:

```yaml
stringData:
  # YAML does not run shell commands. Generate each random value first with
  #   openssl rand -base64 32
  # and paste the output here.
  jwt-secret: "<output of openssl rand -base64 32>"

  # Add your LLM API keys
  anthropic-api-key: "sk-ant-xxxxx"
  openai-api-key: "sk-xxxxx"
  gemini-api-key: "xxxxx"  # Optional

  # Database credentials
  surrealdb-user: "root"
  surrealdb-pass: "<output of openssl rand -base64 32>"
```

**IMPORTANT**: Never commit real secrets to version control!
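
YAML manifests do not evaluate `$(...)` shell substitutions, so one option is to generate the random values in a shell and create the Secret directly, which also keeps real values out of the repository. The sketch below assumes the Secret is named `vapora-secrets` (check the actual name in `kubernetes/03-secrets.yaml`) and that the API keys are available as environment variables.

```bash
# Sketch: create the Secret from generated values instead of editing the YAML.
# ASSUMPTION: the Secret is named "vapora-secrets"; use the name from
# kubernetes/03-secrets.yaml. API keys are read from the environment.
create_vapora_secret() {
  ns="${1:-vapora}"
  kubectl create secret generic vapora-secrets -n "$ns" \
    --from-literal=jwt-secret="$(openssl rand -base64 32)" \
    --from-literal=surrealdb-user=root \
    --from-literal=surrealdb-pass="$(openssl rand -base64 32)" \
    --from-literal=anthropic-api-key="${ANTHROPIC_API_KEY:?set ANTHROPIC_API_KEY}" \
    --from-literal=openai-api-key="${OPENAI_API_KEY:?set OPENAI_API_KEY}"
}

# Example: create_vapora_secret vapora
```

If you create the Secret this way, deploy with `nu scripts/deploy-k8s.nu --skip-secrets` so the manifest does not overwrite it.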

### Step 2: Configure Ingress

Edit `kubernetes/08-ingress.yaml`:

```yaml
spec:
  rules:
  - host: vapora.yourdomain.com  # Change this!
```

### Step 3: Deploy Using Script (Recommended)

```bash
# Dry run to validate
nu scripts/deploy-k8s.nu --dry-run

# Deploy to default namespace (vapora)
nu scripts/deploy-k8s.nu

# Deploy to custom namespace
nu scripts/deploy-k8s.nu --namespace my-vapora

# Skip secrets (if already created)
nu scripts/deploy-k8s.nu --skip-secrets
```

### Step 4: Manual Deploy (Alternative)

```bash
# Apply manifests in order
kubectl apply -f kubernetes/00-namespace.yaml
kubectl apply -f kubernetes/01-surrealdb.yaml
kubectl apply -f kubernetes/02-nats.yaml
kubectl apply -f kubernetes/03-secrets.yaml
kubectl apply -f kubernetes/04-backend.yaml
kubectl apply -f kubernetes/05-frontend.yaml
kubectl apply -f kubernetes/06-agents.yaml
kubectl apply -f kubernetes/07-mcp-server.yaml
kubectl apply -f kubernetes/08-ingress.yaml

# Wait for rollout
kubectl rollout status deployment/vapora-backend -n vapora
kubectl rollout status deployment/vapora-frontend -n vapora
```
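
The rollout commands above cover only the backend and frontend. To wait for every workload, something like the following sketch may be useful. It assumes the SurrealDB StatefulSet is named `surrealdb`, inferred from the pod name `surrealdb-0`; the function name is hypothetical.

```bash
# Sketch: block until every workload in the namespace is ready.
# ASSUMPTION: the SurrealDB StatefulSet is named "surrealdb".
wait_for_vapora() {
  ns="${1:-vapora}"
  # Database first, since the other services depend on it
  kubectl rollout status statefulset/surrealdb -n "$ns" --timeout=300s || return 1
  # Then every Deployment in the namespace
  kubectl wait --for=condition=Available deployment --all -n "$ns" --timeout=300s
}

# Example: wait_for_vapora vapora
```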

### Step 5: Verify Deployment

```bash
# Check all pods are running
kubectl get pods -n vapora

# Expected output:
# NAME                                READY   STATUS    RESTARTS
# surrealdb-0                         1/1     Running   0
# nats-xxx                            1/1     Running   0
# vapora-backend-xxx                  1/1     Running   0
# vapora-backend-yyy                  1/1     Running   0
# vapora-frontend-xxx                 1/1     Running   0
# vapora-frontend-yyy                 1/1     Running   0
# vapora-agents-xxx                   1/1     Running   0
# vapora-agents-yyy                   1/1     Running   0
# vapora-agents-zzz                   1/1     Running   0
# vapora-mcp-server-xxx               1/1     Running   0

# Check services
kubectl get svc -n vapora

# Check ingress
kubectl get ingress -n vapora
```

### Step 6: Access VAPORA

```bash
# Get ingress IP/hostname
kubectl get ingress vapora -n vapora

# Configure DNS
# Point vapora.yourdomain.com to ingress IP

# Access UI (`open` is macOS-specific; use `xdg-open` on Linux)
open https://vapora.yourdomain.com
```
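
If DNS has not propagated yet, the ingress can still be tested by pinning the hostname to the ingress IP with `curl --resolve`. This is a sketch under two assumptions: the Ingress is named `vapora` (as in the command above) and the controller reports an IP rather than a hostname. Both function names are hypothetical.

```bash
# Sketch: request the site through the ingress before DNS is updated.
# ASSUMPTION: the Ingress is named "vapora" and exposes an IP address.
ingress_ip() {
  kubectl get ingress vapora -n "${1:-vapora}" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

smoke_test_ingress() {
  host="$1"
  ip="$(ingress_ip "${2:-vapora}")"
  [ -n "$ip" ] || { echo "ingress has no IP yet" >&2; return 1; }
  # -k skips certificate verification; drop it once a trusted certificate is in place.
  curl -ksS -o /dev/null -w '%{http_code}\n' \
    --resolve "${host}:443:${ip}" "https://${host}/"
}

# Example: smoke_test_ingress vapora.yourdomain.com
```

The function prints the HTTP status code returned for the site root.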

---

## Provisioning Deployment

### Step 1: Validate Configuration

```bash
# Validate Provisioning workspace
nu scripts/validate-provisioning.nu
```

### Step 2: Create Cluster

```bash
cd provisioning/vapora-wrksp

# Validate configuration
provisioning validate --all

# Create cluster
provisioning cluster create --config workspace.toml
```

### Step 3: Deploy Services

```bash
# Deploy infrastructure (database, messaging)
provisioning workflow run workflows/deploy-infra.yaml

# Deploy services (backend, frontend, agents)
provisioning workflow run workflows/deploy-services.yaml

# Or deploy full stack at once
provisioning workflow run workflows/deploy-full-stack.yaml
```

### Step 4: Health Check

```bash
provisioning workflow run workflows/health-check.yaml
```

See `provisioning-integration/README.md` for details.

---

## Configuration

### Environment Variables

#### Backend (`vapora-backend`)

```bash
RUST_LOG=info,vapora=debug
SURREALDB_URL=http://surrealdb:8000
SURREALDB_USER=root
SURREALDB_PASS=<secret>
NATS_URL=nats://nats:4222
JWT_SECRET=<secret>
BIND_ADDR=0.0.0.0:8080
```

#### Agents (`vapora-agents`)

```bash
RUST_LOG=info,vapora_agents=debug
NATS_URL=nats://nats:4222
BIND_ADDR=0.0.0.0:9000
ANTHROPIC_API_KEY=<secret>
OPENAI_API_KEY=<secret>
GEMINI_API_KEY=<secret>
VAPORA_AGENT_CONFIG=/etc/vapora/agents.toml  # Optional
```

#### MCP Server (`vapora-mcp-server`)

```bash
RUST_LOG=info,vapora_mcp_server=debug
# Port configured via --port flag
```

### ConfigMaps

Create custom configuration:

```bash
kubectl create configmap agent-config -n vapora \
  --from-file=agents.toml
```

Mount in deployment:

```yaml
volumeMounts:
- name: config
  mountPath: /etc/vapora
volumes:
- name: config
  configMap:
    name: agent-config
```
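
`kubectl create configmap` fails if the ConfigMap already exists, so later edits to `agents.toml` need a different path. One common pattern is sketched below; it assumes the agents read the file only at startup and therefore need a restart. The function name is hypothetical.

```bash
# Sketch: update the ConfigMap in place, then restart the agents to pick it up.
# ASSUMPTION: agents.toml is read only at startup, so a restart is required.
update_agent_config() {
  ns="${1:-vapora}"
  # Render the ConfigMap client-side and apply it (create or update)
  kubectl create configmap agent-config -n "$ns" --from-file=agents.toml \
    --dry-run=client -o yaml | kubectl apply -n "$ns" -f - || return 1
  kubectl rollout restart deployment/vapora-agents -n "$ns"
}

# Example (run from the directory containing agents.toml): update_agent_config vapora
```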

---

## Monitoring & Health Checks

### Health Endpoints

All services expose health check endpoints:

- **Backend**: `GET /health`
- **Frontend**: `GET /health.html`
- **Agents**: `GET /health`, `GET /ready`
- **MCP Server**: `GET /health`
- **SurrealDB**: `GET /health`
- **NATS**: `GET /healthz` (port 8222)

### Manual Health Checks

```bash
# Backend health
kubectl exec -n vapora deploy/vapora-backend -- \
  curl -s http://localhost:8080/health

# Database health
kubectl exec -n vapora deploy/vapora-backend -- \
  curl -s http://surrealdb:8000/health

# NATS health
kubectl exec -n vapora deploy/vapora-backend -- \
  curl -s http://nats:8222/healthz
```
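
The `kubectl exec` checks above rely on `curl` being present in the backend image, which minimal Alpine-based images may not include. A port-forward from your workstation avoids that dependency. The sketch below uses an arbitrary local port (18080) and a hypothetical function name.

```bash
# Sketch: check backend health through a temporary port-forward.
check_backend_health() {
  ns="${1:-vapora}"
  local_port="${2:-18080}"
  kubectl port-forward -n "$ns" deploy/vapora-backend "${local_port}:8080" >/dev/null 2>&1 &
  pf_pid=$!
  sleep 2   # give the tunnel a moment to come up
  rc=0
  curl -fsS "http://localhost:${local_port}/health" || rc=$?
  # Always tear the tunnel down, whatever curl returned
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$rc"
}

# Example: check_backend_health vapora && echo "backend healthy"
```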

### Kubernetes Probes

All deployments define two probes:
- **Liveness Probe**: Restarts a container that stops responding
- **Readiness Probe**: Keeps a pod out of Service endpoints until it reports ready

### Logs

```bash
# View backend logs
kubectl logs -n vapora -l app=vapora-backend -f

# View agent logs
kubectl logs -n vapora -l app=vapora-agents -f

# View all logs (following more than 5 pods by selector requires raising --max-log-requests)
kubectl logs -n vapora -l app --all-containers=true --max-log-requests=20 -f
```

### Metrics (Optional)

Deploy Prometheus + Grafana:

```bash
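# Add the chart repository (once)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Install kube-prometheus-stack (Prometheus Operator, Prometheus, Grafana)
helm install prometheus prometheus-community/kube-prometheus-stack \
  -n monitoring --create-namespace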

# Access Grafana
kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80
```

A `/metrics` endpoint on VAPORA services is planned as a future enhancement and is not available yet.

---

## Scaling

### Manual Scaling

```bash
# Scale backend
kubectl scale deployment vapora-backend -n vapora --replicas=4

# Scale frontend
kubectl scale deployment vapora-frontend -n vapora --replicas=3

# Scale agents (for higher workload)
kubectl scale deployment vapora-agents -n vapora --replicas=10
```

### Horizontal Pod Autoscaler (HPA)

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: vapora-backend-hpa
  namespace: vapora
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: vapora-backend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

Apply:

```bash
kubectl apply -f hpa.yaml
```
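
A CPU-based HPA only works when the cluster serves the resource metrics API (typically via metrics-server) and the target pods declare CPU requests. The sketch below checks the first condition and then creates a similar autoscaler for the agents imperatively; the replica bounds are illustrative values and the function name is hypothetical.

```bash
# Sketch: verify metrics are available, then autoscale the agents on CPU.
# The --min/--max values are illustrative; tune them for your workload.
autoscale_agents() {
  ns="${1:-vapora}"
  kubectl top pods -n "$ns" >/dev/null || { echo "metrics API unavailable" >&2; return 1; }
  kubectl autoscale deployment vapora-agents -n "$ns" --cpu-percent=70 --min=3 --max=10
}

# Example: autoscale_agents vapora
# Inspect current vs. target utilization afterwards:
#   kubectl get hpa -n vapora
```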

### Resource Limits

Adjust in deployment YAML:

```yaml
resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    cpu: 1000m
    memory: 1Gi
```
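
For a quick experiment, the same values can be applied without editing the manifest. This is a sketch with a hypothetical function name; the change triggers a rolling restart and is overwritten the next time the manifest is applied, so persist lasting changes in the YAML.

```bash
# Sketch: set the backend's requests and limits imperatively.
set_backend_resources() {
  kubectl set resources deployment vapora-backend -n "${1:-vapora}" \
    --requests=cpu=200m,memory=256Mi \
    --limits=cpu=1000m,memory=1Gi
}

# Example: set_backend_resources vapora
```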

---

## Troubleshooting

### Pods Not Starting

```bash
# Check pod status
kubectl get pods -n vapora

# Describe pod for events
kubectl describe pod -n vapora <pod-name>

# Check logs
kubectl logs -n vapora <pod-name>

# Check previous logs (if crashed)
kubectl logs -n vapora <pod-name> --previous
```
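
When several pods are unhealthy, a namespace-wide view is often faster than describing pods one by one. The sketch below (hypothetical function name) lists pods that are not in the `Running` phase and the most recent events.

```bash
# Sketch: namespace-wide triage.
triage_pods() {
  ns="${1:-vapora}"
  # Pods whose phase is not Running (Pending, Failed, ...). Crash-looping pods
  # usually still report phase Running, so check the RESTARTS column as well.
  kubectl get pods -n "$ns" --field-selector='status.phase!=Running'
  # Recent events: scheduling failures, image pull errors, probe failures
  kubectl get events -n "$ns" --sort-by=.lastTimestamp | tail -n 20
}

# Example: triage_pods vapora
```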

### Database Connection Issues

```bash
# Check SurrealDB is running
kubectl get pod -n vapora -l app=surrealdb

# Test connection from backend
kubectl exec -n vapora deploy/vapora-backend -- \
  curl -v http://surrealdb:8000/health

# Check SurrealDB logs
kubectl logs -n vapora surrealdb-0
```

### NATS Connection Issues

```bash
# Check NATS is running
kubectl get pod -n vapora -l app=nats

# Test connection
kubectl exec -n vapora deploy/vapora-backend -- \
  curl http://nats:8222/varz

# Check NATS logs
kubectl logs -n vapora -l app=nats
```

### Image Pull Errors

```bash
# Check image pull secrets
kubectl get secrets -n vapora

# Create Docker registry secret
kubectl create secret docker-registry regcred \
  -n vapora \
  --docker-server=<registry> \
  --docker-username=<username> \
  --docker-password=<password>

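# Add the pull secret to a deployment's pod template (spec.template.spec)
kubectl patch deployment vapora-backend -n vapora \
  -p '{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"regcred"}]}}}}'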
```

### Ingress Not Working

```bash
# Check ingress controller is installed
kubectl get pods -n ingress-nginx

# Check ingress resource
kubectl describe ingress vapora -n vapora

# Check ingress logs
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx
```

---

## Rollback

### Kubernetes Rollback

```bash
# View rollout history
kubectl rollout history deployment/vapora-backend -n vapora

# Rollback to previous version
kubectl rollout undo deployment/vapora-backend -n vapora

# Rollback to specific revision
kubectl rollout undo deployment/vapora-backend -n vapora --to-revision=2
```
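
A bad release usually affects more than the backend. The sketch below rolls every VAPORA deployment back by one revision and waits for each to settle. The deployment names are inferred from the pod names shown in the verification step, and the function name is hypothetical.

```bash
# Sketch: roll back all VAPORA deployments by one revision.
# ASSUMPTION: deployment names match the pod name prefixes shown earlier.
rollback_all() {
  ns="${1:-vapora}"
  for d in vapora-backend vapora-frontend vapora-agents vapora-mcp-server; do
    kubectl rollout undo "deployment/$d" -n "$ns" || return 1
    kubectl rollout status "deployment/$d" -n "$ns" --timeout=300s || return 1
  done
}

# Example: rollback_all vapora
```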

### Provisioning Rollback

```bash
cd provisioning/vapora-wrksp

# List versions
provisioning version list

# Rollback to previous version
provisioning rollback --to-version <version-id>
```

---

## Security

### Secrets Management

- **Kubernetes Secrets**: Only base64-encoded by default; encrypted at rest only if encryption is enabled on the cluster's API server
- **External Secrets Operator**: Sync from Vault, AWS Secrets Manager, etc.
- **RustyVault**: Integrated with Provisioning

### Network Policies

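Apply network policies to restrict pod-to-pod communication. The example below admits only frontend pods to the backend on port 8080. Because the ingress controller also routes to the backend (see [Architecture](#architecture)), allow its traffic as well before enforcing this in a live cluster: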

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: vapora-backend
  namespace: vapora
spec:
  podSelector:
    matchLabels:
      app: vapora-backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: vapora-frontend
    ports:
    - protocol: TCP
      port: 8080
```
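
Network policies are additive, so a second policy can admit the ingress controller without changing the one above. The sketch below assumes the controller runs in the `ingress-nginx` namespace and relies on the `kubernetes.io/metadata.name` label that Kubernetes sets on every namespace. The policy and function names are hypothetical.

```bash
# Sketch: allow the ingress controller's namespace to reach the backend.
# ASSUMPTION: the controller runs in the "ingress-nginx" namespace.
apply_backend_ingress_policy() {
  kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: vapora-backend-from-ingress
  namespace: ${1:-vapora}
spec:
  podSelector:
    matchLabels:
      app: vapora-backend
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx
    ports:
    - protocol: TCP
      port: 8080
EOF
}

# Example: apply_backend_ingress_policy vapora
```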

### TLS Certificates

Use cert-manager for automatic TLS:

```bash
# Install cert-manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.12.0/cert-manager.yaml

# Create ClusterIssuer
kubectl apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@yourdomain.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx
EOF
```

Update ingress:

```yaml
metadata:
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
  - hosts:
    - vapora.yourdomain.com
    secretName: vapora-tls
```

---

## Backup & Restore

### SurrealDB Backup

```bash
# Create backup
kubectl exec -n vapora surrealdb-0 -- \
  surreal export --conn http://localhost:8000 \
  --user root --pass <password> \
  --ns vapora --db main /backup.surql

# Copy backup locally
kubectl cp vapora/surrealdb-0:/backup.surql ./backup-$(date +%Y%m%d).surql
```
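
For scheduled backups, the two commands can be combined into one function that a cron job or CI task calls. This is a sketch under stated assumptions: the root password is supplied in `SURREALDB_PASS`, the container's `/` is writable, and the image provides `tar`, which `kubectl cp` needs. The function name is hypothetical.

```bash
# Sketch: export the database and copy a dated backup to the current directory.
# ASSUMPTIONS: SURREALDB_PASS holds the root password; the SurrealDB container
# has tar (required by kubectl cp) and a writable root directory.
backup_surrealdb() {
  ns="${1:-vapora}"
  out="backup-$(date +%Y%m%d).surql"
  kubectl exec -n "$ns" surrealdb-0 -- \
    surreal export --conn http://localhost:8000 \
    --user root --pass "${SURREALDB_PASS:?set SURREALDB_PASS}" \
    --ns vapora --db main /backup.surql || return 1
  kubectl cp "${ns}/surrealdb-0:/backup.surql" "./${out}" || return 1
  echo "$out"   # print the local file name for the caller
}

# Example: SURREALDB_PASS=... backup_surrealdb vapora
```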

### SurrealDB Restore

```bash
# Copy backup to pod (use the dated file created by the backup step)
kubectl cp ./backup-<date>.surql vapora/surrealdb-0:/restore.surql

# Restore
kubectl exec -n vapora surrealdb-0 -- \
  surreal import --conn http://localhost:8000 \
  --user root --pass <password> \
  --ns vapora --db main /restore.surql
```

### PVC Backup

```bash
# Snapshot PVC (if supported by storage class)
kubectl apply -f - <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: surrealdb-snapshot
  namespace: vapora
spec:
  source:
    persistentVolumeClaimName: data-surrealdb-0
EOF
```

---

## Uninstall

### Delete All Resources

```bash
# Delete namespace (deletes all resources in it, including the SurrealDB PVC)
kubectl delete namespace vapora

# Or delete manifests individually
kubectl delete -f kubernetes/
```

### Delete PVCs

```bash
# List PVCs
kubectl get pvc -n vapora

# Delete PVC (data will be lost!)
kubectl delete pvc data-surrealdb-0 -n vapora
```

---

## Next Steps

After successful deployment:

1. **Configure DNS**: Point domain to ingress IP
2. **Set up TLS**: Configure cert-manager for HTTPS
3. **Enable monitoring**: Deploy Prometheus/Grafana
4. **Configure backups**: Schedule SurrealDB backups
5. **Set up CI/CD**: Automate deployments
6. **Configure HPA**: Enable autoscaling
7. **Test disaster recovery**: Practice rollback procedures

---

## Support

- **Deployment Issues**: Check `kubernetes/README.md`
- **Provisioning Issues**: Check `provisioning-integration/README.md`
- **Scripts Help**: Run `nu scripts/<script-name>.nu --help`
- **Kubernetes Docs**: https://kubernetes.io/docs/
