Vapora/docs/adrs/0031-kubernetes-deployment-kagent.md
Jesús Pérez 0b78d97fd7
chore: update adrs
2026-02-17 13:18:12 +00:00


# ADR-0031: Kubernetes Deployment Strategy for kagent Integration
**Status**: Accepted

**Date**: 2026-02-07

**Deciders**: VAPORA Team

**Technical Story**: Kubernetes-native deployment of kagent that supports dev/prod environments and A2A protocol connectivity with VAPORA

---
## Decision
**Kustomize-based deployment** with a shared base and environment-specific overlays:
```text
kubernetes/kagent/
├── base/
│   ├── namespace.yaml
│   ├── rbac.yaml
│   ├── configmap.yaml
│   ├── statefulset.yaml
│   └── service.yaml
└── overlays/
    ├── dev/   # 1 replica, debug logging, relaxed resources
    └── prod/  # 5 replicas, required pod anti-affinity, HPA-ready
```
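The layout above does not list the `kustomization.yaml` files, but `kubectl apply -k` needs one in every directory it targets. The sketch below shows what the base and prod files might look like. The file contents are assumptions for illustration, not the repository's actual manifests; the `patches` field assumes a kubectl release that bundles a recent Kustomize.

```yaml
# Hypothetical: kubernetes/kagent/base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - rbac.yaml
  - configmap.yaml
  - statefulset.yaml
  - service.yaml
---
# Hypothetical: kubernetes/kagent/overlays/prod/kustomization.yaml
# (a separate file; shown in the same block only for brevity)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base                   # inherit every base manifest unchanged
patches:
  - path: statefulset-patch.yaml # prod-only overrides (replicas, anti-affinity)
```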
The workload runs as a **StatefulSet** (not a Deployment), with pod anti-affinity configured per environment: preferred in dev, required in prod.

---
## Rationale
**Why Kustomize over Helm?** Kustomize needs no external dependencies or Go templating and uses the standard `kubectl apply -k` workflow. The rendered YAML is plain and auditable. The current manifest complexity does not justify a templating layer.

**Why StatefulSet?** Stable pod identities (`kagent-0`, `kagent-1`) simplify debugging, and A2A clients can reference predictable endpoint names. A StatefulSet is also compatible with persistent volumes if they are needed later, and its ordered startup and shutdown match A2A readiness requirements.

**Why ConfigMap for A2A settings?** Configuration changes (discovery intervals, VAPORA URL) don't require image rebuilds, and they are tracked in Git. `kubectl rollout restart` then rolls the new config out pod by pod. The rollout is not atomic: pods running the old and new settings coexist until the restart completes.

**Why separate dev/prod overlays?** Resource requirements, replica counts, and anti-affinity policies differ between environments. Inheriting from a shared base prevents duplication. Additional environments (staging, canary) can be added as overlays without touching the base.

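
To make the ConfigMap rationale concrete, here is a minimal sketch of what `base/configmap.yaml` could contain. The resource name, the key names, and both values are hypothetical placeholders; only the namespace `kagent` and the two kinds of setting (VAPORA URL, discovery interval) come from this ADR.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kagent-a2a-config        # hypothetical name
  namespace: kagent
data:
  # Hypothetical keys and values; replace with the real A2A settings.
  VAPORA_URL: "http://vapora-backend.example.svc.cluster.local:8080"
  A2A_DISCOVERY_INTERVAL: "30s"
```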
---
## Alternatives Considered
**Helm Charts** (rejected): Go template complexity exceeds current requirements. Revisit if the manifest set grows substantially.

**Deployment + HPA** (rejected): StatefulSet provides the stable identities needed for A2A client configuration and ordered rollout. HPA can be layered over StatefulSet when scaling requirements emerge.

**Single all-in-one manifest** (rejected): Duplicates resource specs between environments, with no clear mechanism for environment differentiation.

---
## Trade-offs
**Pros:**
- Identical code path in dev and prod (overlays change parameters, not structure)
- Configuration in version control — full audit trail
- No tooling beyond `kubectl` required
- Required pod anti-affinity in production keeps replicas apart, reducing the risk of correlated failures

**Cons:**
- Manual scaling (no HPA initially — requires operator action for load spikes)
- Kustomize has limited expressiveness for complex conditional logic
- StatefulSet rolling updates are slower than Deployment rolling updates
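
The manual-scaling trade-off could later be removed by adding an HPA to the prod overlay. The following is a sketch under stated assumptions: the CPU target and `maxReplicas` are illustrative values, not project decisions, and the StatefulSet is assumed to be named `kagent` in namespace `kagent` as in the rollout commands. If an HPA is adopted, the fixed `replicas` value would normally be dropped from the prod patch so that `kubectl apply` does not reset the autoscaler's choice.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kagent
  namespace: kagent
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet            # HPA can target a StatefulSet directly
    name: kagent
  minReplicas: 5                 # matches the prod replica count in this ADR
  maxReplicas: 10                # illustrative upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # illustrative threshold
```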
---
## Implementation
**Apply commands:**
```bash
# Development
kubectl apply -k kubernetes/kagent/overlays/dev
# Production
kubectl apply -k kubernetes/kagent/overlays/prod
# Verify rollout
kubectl rollout status statefulset/kagent -n kagent
```
**Key manifest locations:**
- `kubernetes/kagent/base/statefulset.yaml` — StatefulSet template
- `kubernetes/kagent/base/configmap.yaml` — A2A discovery config (VAPORA URL, interval)
- `kubernetes/kagent/overlays/prod/statefulset-patch.yaml` — 5 replicas + required anti-affinity
- `kubernetes/kagent/overlays/dev/statefulset-patch.yaml` — 1 replica + preferred anti-affinity
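
As an illustration of the two patch files listed above, the sketch below shows how required and preferred anti-affinity differ in a strategic merge patch. The `app: kagent` label matches the selector used in the verification commands; the `kubernetes.io/hostname` topology key, the weight, and the presence of `namespace: kagent` on the base resource are assumptions.

```yaml
# Hypothetical: overlays/prod/statefulset-patch.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kagent                   # must match the base resource's name
  namespace: kagent              # and namespace, or the patch will not apply
spec:
  replicas: 5
  template:
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: a pod stays Pending rather than share a node.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: kagent
              topologyKey: kubernetes.io/hostname
---
# Hypothetical: overlays/dev/statefulset-patch.yaml (a separate file)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kagent
  namespace: kagent
spec:
  replicas: 1
  template:
    spec:
      affinity:
        podAntiAffinity:
          # Soft rule: the scheduler tries to spread pods but may co-locate them.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: kagent
                topologyKey: kubernetes.io/hostname
```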
---
## Verification
```bash
# Validate manifests without applying
kubectl kustomize kubernetes/kagent/overlays/dev | kubectl apply --dry-run=client -f -
kubectl kustomize kubernetes/kagent/overlays/prod | kubectl apply --dry-run=client -f -
# Verify running pods
kubectl get pods -n kagent -l app=kagent
kubectl get statefulset kagent -n kagent
```
---
## Consequences
- Adding a new environment requires only a new overlay directory — base is never modified
- Scaling kagent horizontally in production requires a manual `kubectl scale` or an HPA manifest in the prod overlay
- The A2A endpoint (`POST /`) must be exposed through a Kubernetes Service (ClusterIP or LoadBalancer) so that the VAPORA backend can reach it
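
The Service requirement in the last point can be sketched as a ClusterIP Service selecting the kagent pods. The port number and port name are hypothetical, since this ADR does not state which port the A2A endpoint listens on; `base/service.yaml` in the repository may differ.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kagent
  namespace: kagent
spec:
  type: ClusterIP                # in-cluster access for the VAPORA backend
  selector:
    app: kagent                  # same label used by the verification commands
  ports:
    - name: a2a                  # hypothetical port name
      protocol: TCP
      port: 8080                 # hypothetical Service port
      targetPort: 8080           # hypothetical container port serving POST /
```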
---
## References
- `kubernetes/kagent/` — Manifests
- [Kustomize Documentation](https://kustomize.io/)
- [Kubernetes StatefulSets](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/)

**Related ADRs:**
- [ADR-0030](./0030-a2a-protocol-implementation.md) — A2A protocol server that kagent communicates with
- [ADR-0032](./0032-a2a-error-handling-json-rpc.md) — Error handling in A2A communication
- [ADR-0009](./0009-istio-service-mesh.md) — Istio service mesh (mTLS for kagent ↔ VAPORA traffic)