Vapora/docs/adrs/0031-kubernetes-deployment-kagent.md

# ADR-0031: Kubernetes Deployment Strategy for kagent Integration

**Status**: Accepted
**Date**: 2026-02-07
**Deciders**: VAPORA Team
**Technical Story**: Kubernetes-native deployment of kagent that supports dev/prod environments and A2A protocol connectivity with VAPORA

---

## Decision

**Kustomize-based deployment** with a shared base and environment-specific overlays:

```text
kubernetes/kagent/
├── base/
│   ├── namespace.yaml
│   ├── rbac.yaml
│   ├── configmap.yaml
│   ├── statefulset.yaml
│   └── service.yaml
└── overlays/
    ├── dev/     # 1 replica, debug logging, relaxed resources
    └── prod/    # 5 replicas, required pod anti-affinity, HPA-ready
```

**StatefulSet** (not Deployment) with pod anti-affinity configured per environment.
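
The tree above lists only the resource manifests. Kustomize also expects a `kustomization.yaml` in each directory it builds. A minimal sketch of what the base one might contain, assuming it references exactly the five files shown (the actual file may differ):

```yaml
# kubernetes/kagent/base/kustomization.yaml
# Assumed contents: this file is not shown in the tree above.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Resource files taken from the directory listing in this ADR.
resources:
  - namespace.yaml
  - rbac.yaml
  - configmap.yaml
  - statefulset.yaml
  - service.yaml
```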

---

## Rationale

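**Why Kustomize over Helm?** Kustomize is built into `kubectl`, so it adds no external dependencies and no Go templating. Deployment uses the standard `kubectl apply -k` workflow. The rendered output is plain Kubernetes YAML that can be inspected with `kubectl kustomize` and audited directly. At the current scale, the manifest set is too small to justify a templating layer.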

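**Why StatefulSet?** Stable pod identities (`kagent-0`, `kagent-1`) simplify debugging. A2A clients can reference predictable per-pod endpoint names, provided the StatefulSet is backed by a headless Service for per-pod DNS. A StatefulSet is also compatible with per-pod persistent volumes if they are needed later. Ordered startup and shutdown match A2A readiness requirements.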
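
To illustrate where the predictable endpoint names come from, here is a minimal sketch of a StatefulSet paired with a headless Service. The names `kagent` (StatefulSet), `kagent` (namespace), and the `app: kagent` label come from the commands in this ADR. The Service name `kagent-headless`, port `8080`, and the image reference are hypothetical placeholders, not the project's actual values.

```yaml
# Headless Service: clusterIP None makes DNS publish one record per pod.
apiVersion: v1
kind: Service
metadata:
  name: kagent-headless          # hypothetical name
  namespace: kagent
spec:
  clusterIP: None
  selector:
    app: kagent
  ports:
    - name: a2a
      port: 8080                 # hypothetical port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kagent
  namespace: kagent
spec:
  serviceName: kagent-headless   # must name the headless Service above
  replicas: 1
  selector:
    matchLabels:
      app: kagent
  template:
    metadata:
      labels:
        app: kagent
    spec:
      containers:
        - name: kagent
          image: registry.example.invalid/kagent:placeholder  # hypothetical image
          ports:
            - name: a2a
              containerPort: 8080  # hypothetical port
```

Under these assumptions, the first pod would resolve as `kagent-0.kagent-headless.kagent.svc.cluster.local` (with the default cluster domain), and the name survives pod rescheduling.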

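**Why ConfigMap for A2A settings?** Configuration changes (discovery intervals, VAPORA URL) do not require image rebuilds, and they are tracked in Git. Because pods may not reload a changed ConfigMap on their own, `kubectl rollout restart` applies the new config through a rolling restart. Pods are replaced one at a time, so replicas briefly run mixed configuration rather than switching atomically.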
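
A sketch of how such a ConfigMap might look. The ConfigMap name, the key names, and both values are hypothetical and only illustrate the two settings named above (VAPORA URL and discovery interval); the real `configmap.yaml` may use different keys.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kagent-a2a-config        # hypothetical name
  namespace: kagent
data:
  # Hypothetical keys and values, for illustration only.
  VAPORA_URL: "http://vapora-backend.example.invalid:8001"
  A2A_DISCOVERY_INTERVAL_SECONDS: "30"
```

After editing and applying a file like this, `kubectl rollout restart statefulset/kagent -n kagent` would restart the pods so they start with the new values.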

**Why separate dev/prod overlays?** Resource requirements, replica counts, and anti-affinity policies differ between environments. Base inheritance prevents duplication. Additional environments (staging, canary) can be added as overlays without touching base.
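
As an illustration of base inheritance, an overlay's `kustomization.yaml` can be as small as the sketch below. It assumes the prod overlay contains a single patch file named `statefulset-patch.yaml`, as listed under Implementation. The `patches` field is the current Kustomize syntax; older `kubectl` versions may need `patchesStrategicMerge` instead.

```yaml
# kubernetes/kagent/overlays/prod/kustomization.yaml (assumed contents)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Pull in every resource defined by the shared base.
resources:
  - ../../base

# Apply environment-specific changes on top of the base StatefulSet.
patches:
  - path: statefulset-patch.yaml
```

A staging or canary overlay would follow the same shape with its own patch file.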

---

## Alternatives Considered

**Helm Charts** — rejected: Go template complexity exceeds current requirements. Revisit if the manifest set grows substantially.

**Deployment + HPA** — rejected: StatefulSet provides the stable identities needed for A2A client configuration and ordered rollout. HPA can be layered over StatefulSet when scaling requirements emerge.
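
If scaling requirements do emerge, an HPA can target the StatefulSet directly. A minimal sketch, assuming CPU-based scaling and a metrics source such as metrics-server in the cluster; `maxReplicas` and the utilization target are hypothetical values, while `minReplicas: 5` mirrors the prod replica count.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kagent
  namespace: kagent
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet            # HPA scales the StatefulSet, not a Deployment
    name: kagent
  minReplicas: 5                 # matches the prod overlay replica count
  maxReplicas: 10                # hypothetical upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # hypothetical threshold
```

Two caveats apply to this sketch. CPU utilization targets require CPU requests on the kagent container, and with required pod anti-affinity the effective replica ceiling is the number of schedulable nodes. Once an HPA owns the replica count, `spec.replicas` would typically be dropped from the overlay patch so that `kubectl apply -k` does not reset it.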

**Single all-in-one manifest** — rejected: duplicates resource specs between environments and offers no clear mechanism for environment differentiation.

---

## Trade-offs

**Pros:**

- Identical code path in dev and prod (overlays change parameters, not structure)
- Configuration in version control — full audit trail
- No tooling beyond `kubectl` required
- Required pod anti-affinity spreads production replicas across nodes, reducing the risk that a single node failure takes out several replicas at once

**Cons:**

- Manual scaling (no HPA initially — requires operator action for load spikes)
- Kustomize has limited expressiveness for complex conditional logic
- StatefulSet rolling updates are slower than Deployment rolling updates, because by default pods are replaced one at a time in reverse ordinal order

---

## Implementation

**Apply commands:**

```bash
# Development
kubectl apply -k kubernetes/kagent/overlays/dev

# Production
kubectl apply -k kubernetes/kagent/overlays/prod

# Verify rollout
kubectl rollout status statefulset/kagent -n kagent
```

**Key manifest locations:**

- `kubernetes/kagent/base/statefulset.yaml` — StatefulSet template
- `kubernetes/kagent/base/configmap.yaml` — A2A discovery config (VAPORA URL, interval)
- `kubernetes/kagent/overlays/prod/statefulset-patch.yaml` — 5 replicas + required anti-affinity
- `kubernetes/kagent/overlays/dev/statefulset-patch.yaml` — 1 replica + preferred anti-affinity
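
For orientation, the prod patch described above might look like the following sketch. The replica count and the use of required anti-affinity come from this ADR. The `app: kagent` label matches the selector used in the verification commands, and the `kubernetes.io/hostname` topology key (one replica per node) is an assumption.

```yaml
# kubernetes/kagent/overlays/prod/statefulset-patch.yaml (assumed contents)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kagent                   # must match the base StatefulSet
  namespace: kagent
spec:
  replicas: 5
  template:
    spec:
      affinity:
        podAntiAffinity:
          # "required": the scheduler refuses to co-locate matching pods.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: kagent
              topologyKey: kubernetes.io/hostname  # assumed: spread per node
```

With a required rule like this, five replicas need at least five schedulable nodes; otherwise the surplus pods stay `Pending`. The dev patch would use `preferredDuringSchedulingIgnoredDuringExecution` instead, which expresses the same spreading as a soft preference.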

---

## Verification

```bash
# Validate manifests without applying
kubectl kustomize kubernetes/kagent/overlays/dev | kubectl apply --dry-run=client -f -
kubectl kustomize kubernetes/kagent/overlays/prod | kubectl apply --dry-run=client -f -

# Verify running pods
kubectl get pods -n kagent -l app=kagent
kubectl get statefulset kagent -n kagent
```

---

## Consequences

- Adding a new environment requires only a new overlay directory; the base stays unchanged
- Scaling kagent horizontally in production requires either a manual `kubectl scale` (which the next `kubectl apply -k` resets to the declared replica count) or an HPA manifest in the prod overlay
- The A2A endpoint (`POST /`) must be exposed via a Kubernetes Service (ClusterIP or LoadBalancer) for the VAPORA backend to reach it
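
A minimal sketch of the ClusterIP variant, suitable when the VAPORA backend runs in the same cluster. The Service name and both port numbers are hypothetical; only the namespace and the `app: kagent` selector are taken from the commands in this ADR.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kagent                   # hypothetical name
  namespace: kagent
spec:
  type: ClusterIP
  selector:
    app: kagent
  ports:
    - name: a2a
      port: 80                   # hypothetical Service port
      targetPort: 8080           # hypothetical container port
```

Under these assumptions, the backend would send A2A requests to `http://kagent.kagent.svc.cluster.local/`, and the Service would balance them across ready pods.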

---

## References

- `kubernetes/kagent/` — Manifests
- [Kustomize Documentation](https://kustomize.io/)
- [Kubernetes StatefulSets](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/)

**Related ADRs:**

- [ADR-0030](./0030-a2a-protocol-implementation.md) — A2A protocol server that kagent communicates with
- [ADR-0032](./0032-a2a-error-handling-json-rpc.md) — Error handling in A2A communication
- [ADR-0009](./0009-istio-service-mesh.md) — Istio service mesh (mTLS for kagent ↔ VAPORA traffic)