Vapora/docs/adrs/0031-kubernetes-deployment-kagent.md

# ADR-0031: Kubernetes Deployment Strategy for kagent Integration

**Status**: Accepted
**Date**: 2026-02-07
**Deciders**: VAPORA Team
**Technical Story**: Kubernetes-native deployment of kagent that supports dev/prod environments and A2A protocol connectivity with VAPORA

---

## Decision

**Kustomize-based deployment** with a shared base and environment-specific overlays:

```text
kubernetes/kagent/
├── base/
│   ├── namespace.yaml
│   ├── rbac.yaml
│   ├── configmap.yaml
│   ├── statefulset.yaml
│   └── service.yaml
└── overlays/
    ├── dev/     # 1 replica, debug logging, relaxed resources
    └── prod/    # 5 replicas, required pod anti-affinity, HPA-ready
```

**StatefulSet** (not Deployment) with pod anti-affinity configured per environment.
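
Kustomize only treats a directory as a base or overlay if it contains a `kustomization.yaml`, which the tree above does not list. A minimal sketch of what the base file could look like, assuming it simply enumerates the five manifests shown:

```yaml
# kubernetes/kagent/base/kustomization.yaml
# Assumed file: not shown in the directory tree above.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Every manifest in base/ that overlays should inherit.
resources:
  - namespace.yaml
  - rbac.yaml
  - configmap.yaml
  - statefulset.yaml
  - service.yaml
```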

---

## Rationale

**Why Kustomize over Helm?** Kustomize is built into `kubectl`, so it adds no external dependencies and no Go templating. The workflow is the standard `kubectl apply -k`. The rendered YAML is plain and can be audited with `kubectl kustomize`. The current manifest set is too small to justify a templating layer.

**Why StatefulSet?** Stable pod identities (`kagent-0`, `kagent-1`) simplify debugging. A2A clients can reference predictable endpoint names, provided the StatefulSet is backed by a headless Service that gives each pod its own DNS record. Persistent volumes can be attached per pod if needed. Ordered startup and shutdown match A2A readiness requirements.
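
The predictable endpoint names depend on pairing the StatefulSet with a headless Service. The sketch below illustrates that pairing; the Service name `kagent-headless`, port `8080`, replica count, and image reference are placeholders, not values taken from the actual manifests.

```yaml
# Headless Service: clusterIP None gives each pod its own DNS record.
apiVersion: v1
kind: Service
metadata:
  name: kagent-headless          # hypothetical name
  namespace: kagent
spec:
  clusterIP: None
  selector:
    app: kagent
  ports:
    - name: a2a
      port: 8080                 # assumed port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kagent
  namespace: kagent
spec:
  serviceName: kagent-headless   # ties pod DNS names to the headless Service
  replicas: 2
  selector:
    matchLabels:
      app: kagent
  template:
    metadata:
      labels:
        app: kagent
    spec:
      containers:
        - name: kagent
          image: kagent:placeholder   # placeholder image reference
          ports:
            - name: a2a
              containerPort: 8080     # assumed port
```

With this layout and the default cluster domain, the first pod would resolve as `kagent-0.kagent-headless.kagent.svc.cluster.local`.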

**Why ConfigMap for A2A settings?** Configuration changes (discovery intervals, VAPORA URL) don't require image rebuilds. Changes are tracked in Git. `kubectl rollout restart` applies new config through a rolling restart, one pod at a time, so old and new settings briefly coexist during the rollout.
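
As an illustration of the kind of settings involved, a ConfigMap might look like the sketch below. The ConfigMap name, key names, URL, and interval are all hypothetical; the real keys live in `kubernetes/kagent/base/configmap.yaml`.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kagent-a2a-config        # hypothetical name
  namespace: kagent
data:
  # Placeholder values, shown only to illustrate the shape of the config.
  VAPORA_URL: "http://vapora-backend.example.svc.cluster.local:8001"
  A2A_DISCOVERY_INTERVAL: "30s"
```

If the pods read these values as environment variables, a running pod does not pick up edits until it restarts, which is why the restart step matters.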

**Why separate dev/prod overlays?** Resource requirements, replica counts, and anti-affinity policies differ between environments. Base inheritance prevents duplication. Additional environments (staging, canary) can be added as overlays without touching base.
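
A minimal sketch of how an overlay could inherit the base and apply its own patch, assuming the prod overlay contains a `kustomization.yaml` next to `statefulset-patch.yaml`:

```yaml
# kubernetes/kagent/overlays/prod/kustomization.yaml (assumed file)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Inherit everything from the shared base.
resources:
  - ../../base

# Apply environment-specific changes on top of the base StatefulSet.
patches:
  - path: statefulset-patch.yaml
    target:
      kind: StatefulSet
      name: kagent
```

A staging or canary overlay would follow the same shape with a different patch file.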

---

## Alternatives Considered

**Helm Charts** — rejected: Go template complexity exceeds current requirements. Revisit if the manifest set grows substantially.

**Deployment + HPA** — rejected: StatefulSet provides the stable identities needed for A2A client configuration and ordered rollout. HPA can be layered over StatefulSet when scaling requirements emerge.

**Single all-in-one manifest** — rejected: Duplicates resource specs between environments, no clear mechanism for environment differentiation.

---

## Trade-offs

**Pros:**

- Same base manifests in dev and prod (overlays change parameters, not structure)
- Configuration in version control — full audit trail
- No tooling beyond `kubectl` required
- Pod anti-affinity reduces the risk of correlated node-level failures in production

**Cons:**

- Manual scaling (no HPA initially — requires operator action for load spikes)
- Kustomize has limited expressiveness for complex conditional logic
- StatefulSet rolling updates are slower than Deployment rolling updates

---

## Implementation

**Apply commands:**

```bash
# Development
kubectl apply -k kubernetes/kagent/overlays/dev

# Production
kubectl apply -k kubernetes/kagent/overlays/prod

# Verify rollout
kubectl rollout status statefulset/kagent -n kagent
```

**Key manifest locations:**

- `kubernetes/kagent/base/statefulset.yaml` — StatefulSet template
- `kubernetes/kagent/base/configmap.yaml` — A2A discovery config (VAPORA URL, interval)
- `kubernetes/kagent/overlays/prod/statefulset-patch.yaml` — 5 replicas + required anti-affinity
- `kubernetes/kagent/overlays/dev/statefulset-patch.yaml` — 1 replica + preferred anti-affinity
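
For orientation, the prod patch might resemble the following sketch. The `app: kagent` label matches the selector used in the verification commands; the `kubernetes.io/hostname` topology key is an assumption about how "required anti-affinity" is scoped.

```yaml
# kubernetes/kagent/overlays/prod/statefulset-patch.yaml (illustrative sketch)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kagent
  namespace: kagent              # assumes the base manifest sets this namespace
spec:
  replicas: 5
  template:
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never co-locate two kagent pods on the same node.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: kagent
              topologyKey: kubernetes.io/hostname   # assumed topology scope
```

With a hard rule scoped to the hostname, five replicas need at least five schedulable nodes; otherwise the extra pods stay `Pending`. The dev patch would use `preferredDuringSchedulingIgnoredDuringExecution` instead, which the scheduler treats as a soft preference.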

---

## Verification

```bash
# Validate manifests without applying
kubectl kustomize kubernetes/kagent/overlays/dev | kubectl apply --dry-run=client -f -
kubectl kustomize kubernetes/kagent/overlays/prod | kubectl apply --dry-run=client -f -

# Verify running pods
kubectl get pods -n kagent -l app=kagent
kubectl get statefulset kagent -n kagent
```

---

## Consequences

- Adding a new environment requires only a new overlay directory; the base stays unchanged
- Scaling kagent horizontally in production requires a manual `kubectl scale` or an HPA manifest in the prod overlay
- The A2A endpoint (`POST /`) must be exposed via a Kubernetes Service (ClusterIP or LoadBalancer) for the VAPORA backend to reach it
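
If the HPA route is chosen later, a manifest along these lines could be added to the prod overlay. This is a sketch only: the replica bounds and CPU target are placeholders, and it assumes a metrics source such as metrics-server is installed and that the kagent container declares CPU requests.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kagent
  namespace: kagent
spec:
  # HPA can target a StatefulSet directly through its scale subresource.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: kagent
  minReplicas: 5                 # matches the current prod replica count
  maxReplicas: 10                # placeholder upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # placeholder threshold
```

Once an HPA owns the replica count, the fixed `replicas` value in the prod patch would likely need to be dropped so that each `kubectl apply -k` does not reset it.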

---

## References

- `kubernetes/kagent/` — Manifests
- [Kustomize Documentation](https://kustomize.io/)
- [Kubernetes StatefulSets](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/)

**Related ADRs:**

- [ADR-0030](./0030-a2a-protocol-implementation.md) — A2A protocol server that kagent communicates with
- [ADR-0032](./0032-a2a-error-handling-json-rpc.md) — Error handling in A2A communication
- [ADR-0009](./0009-istio-service-mesh.md) — Istio service mesh (mTLS for kagent ↔ VAPORA traffic)