Vapora/docs/architecture/adr/0002-kubernetes-deployment-strategy.md
Jesús Pérez b6a4d77421

# ADR 0002: Kubernetes Deployment Strategy for kagent Integration
**Status:** Accepted

**Date:** 2026-02-07

**Authors:** VAPORA Team
## Context
kagent integration required a Kubernetes-native deployment strategy that:
- Supports development and production environments
- Maintains A2A protocol connectivity with VAPORA
- Enables horizontal scaling
- Ensures high availability in production
- Minimizes operational complexity
- Facilitates updates and configuration changes
## Decision
We adopted a **Kustomize-based deployment strategy** with environment-specific overlays:
```
kubernetes/kagent/
├── base/ # Environment-agnostic base
│ ├── namespace.yaml
│ ├── rbac.yaml
│ ├── configmap.yaml
│ ├── statefulset.yaml
│ └── service.yaml
└── overlays/
    ├── dev/ # Development: 1 replica, debug logging
    └── prod/ # Production: 5 replicas, HA
```
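Kustomize only treats a directory as a base when it contains a `kustomization.yaml`, which the tree above does not list. A minimal sketch of what `base/kustomization.yaml` might contain, assuming the five manifests shown and a namespace named `kagent` (the file and its contents are assumptions, not taken from the repository):

```yaml
# base/kustomization.yaml (hypothetical; not listed in the tree above)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Assumed namespace name; set on every namespaced resource listed below.
namespace: kagent

resources:
  - namespace.yaml
  - rbac.yaml
  - configmap.yaml
  - statefulset.yaml
  - service.yaml
```

An overlay can then be rendered for review with `kubectl kustomize kubernetes/kagent/overlays/dev` and applied with `kubectl apply -k kubernetes/kagent/overlays/dev`.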
### Key Design Decisions
1. **StatefulSet over Deployment**
   - Provides stable pod identities
   - Supports ordered startup/shutdown
   - Compatible with persistent volumes
2. **Kustomize over Helm**
   - Built into kubectl (`kubectl apply -k`)
   - YAML-based, no templating language
   - Easier code review of actual manifests
   - Lower complexity for our use case
3. **Separate dev/prod Overlays**
   - Code reuse via base inheritance
   - Clear environment differentiation
   - Easy to add staging, testing, etc.
   - Single source of truth for base configuration
4. **ConfigMap-based A2A Integration**
   - Runtime configuration without rebuilding images
   - Environment-specific values (discovery interval, etc.)
   - Easy rollback by reverting the ConfigMap in Git and re-applying (`kubectl rollout undo` restores only the StatefulSet pod template, not ConfigMap contents)
5. **Pod Anti-Affinity**
   - Development: Preferred (best-effort distribution)
   - Production: Required (strict node separation)
   - Prevents single-node failure modes
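
The two anti-affinity modes in item 5 map onto two different fields of the pod spec. The fragments below are a sketch only: the `app: kagent` label and the `kubernetes.io/hostname` topology key are assumptions about how the pods are labeled and what "node separation" means here.

```yaml
# Fragments of spec.template.spec in the StatefulSet (not complete manifests).
# Production: hard rule. The scheduler will not place two pods carrying the
# assumed label app=kagent on the same node.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: kagent
        topologyKey: kubernetes.io/hostname
---
# Development: soft rule. The scheduler tries to spread pods across nodes
# but still schedules them if it cannot.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: kagent
          topologyKey: kubernetes.io/hostname
```

With a required rule keyed on the hostname, the production cluster needs at least as many schedulable nodes as replicas; otherwise the extra pods stay `Pending`.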
## Rationale
**Why Kustomize?**
- No external dependencies or DSLs to learn
- Built into kubectl (no new tools for operators)
- Transparent YAML (easier auditing)
- Suitable for our scale (not complex microservices)

**Why StatefulSet?**
- Pod names are predictable (kagent-0, kagent-1, etc.)
- Simplifies debugging and troubleshooting
- Compatible with persistent volumes in a future phase
- A2A clients can reference stable endpoints

**Why ConfigMap for A2A settings?**
- No image rebuild required for config changes
- Easy to adjust discovery intervals per environment
- Transparent configuration in Git
- Can be patched/updated at runtime

**Why separate dev/prod?**
- Resource requirements differ dramatically
- Logging levels should differ
- Scaling policies differ
- Both overlays go through the same code review
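
The ConfigMap and per-environment points above can be sketched as a base ConfigMap plus a development patch. Every name here is hypothetical: the ConfigMap name `kagent-a2a-config`, the keys `A2A_DISCOVERY_INTERVAL_SECONDS` and `LOG_LEVEL`, their values, and the patch file itself are illustrations rather than the project's actual configuration.

```yaml
# base/configmap.yaml (sketch; name, keys and values are hypothetical)
apiVersion: v1
kind: ConfigMap
metadata:
  name: kagent-a2a-config
  namespace: kagent
data:
  A2A_DISCOVERY_INTERVAL_SECONDS: "30"
  LOG_LEVEL: "info"
---
# overlays/dev/configmap-patch.yaml (hypothetical file)
# Strategic merge patch: only the keys listed here are overridden,
# everything else is inherited from the base ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kagent-a2a-config
  namespace: kagent
data:
  A2A_DISCOVERY_INTERVAL_SECONDS: "5"
  LOG_LEVEL: "debug"
```

If the StatefulSet injects these keys as environment variables (for example with `envFrom`), containers read them only at start, so an updated ConfigMap takes effect after `kubectl rollout restart statefulset/kagent -n kagent` (StatefulSet name assumed).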
## Consequences
**Positive:**
- Identical code paths in dev and prod (just different replicas/resources)
- Easy to add more environments (staging, testing, etc.)
- Standard kubectl workflows
- Clear separation of concerns
- Configuration in version control
- No external tools beyond kubectl

**Negative:**
- Manual scaling (no HorizontalPodAutoscaler initially)
- Kustomize has limitations for complex overlays
- No templating language flexibility
- Requires understanding of Kubernetes primitives
## Alternatives Considered
1. **Helm Charts**
   - Rejected: Go templates more complex than needed
   - Revisit if complexity demands it
2. **Deployment + Horizontal Pod Autoscaler**
   - Rejected: StatefulSet provides stability needed for debugging
   - Can layer HPA over StatefulSet if needed
3. **All-in-one manifest**
   - Rejected: Code duplication between dev/prod
   - No clear environment separation
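
Alternative 2 was rejected in favor of StatefulSet's stable pod identities. Those identities become stable network endpoints only when the StatefulSet is paired with a headless Service through `serviceName`. A minimal sketch follows, in which the resource names, the `app: kagent` label, port 8080, and the image reference are all placeholders rather than values from the repository:

```yaml
# Hypothetical headless Service: clusterIP None makes DNS return one
# record per pod instead of a single virtual IP.
apiVersion: v1
kind: Service
metadata:
  name: kagent
  namespace: kagent
spec:
  clusterIP: None
  selector:
    app: kagent
  ports:
    - name: a2a
      port: 8080                 # assumed port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kagent
  namespace: kagent
spec:
  serviceName: kagent            # must name the headless Service above
  replicas: 1
  selector:
    matchLabels:
      app: kagent
  template:
    metadata:
      labels:
        app: kagent
    spec:
      containers:
        - name: kagent
          image: example.invalid/kagent:latest   # placeholder image
          ports:
            - name: a2a
              containerPort: 8080
```

Under these assumptions, and with the default `cluster.local` cluster domain, the first pod resolves as `kagent-0.kagent.kagent.svc.cluster.local`.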
## Migration Path
1. **Current:** Kustomize with manual scaling
2. **Phase 2:** Add HorizontalPodAutoscaler overlay
3. **Phase 3:** Add Prometheus/Grafana monitoring
4. **Phase 4:** Integrate with Istio service mesh
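
Phase 2 could be sketched as an extra manifest in the production overlay. The file name, the StatefulSet name `kagent`, the replica ceiling, and the CPU target are assumptions chosen for illustration; only the floor of 5 replicas comes from the production overlay described above.

```yaml
# overlays/prod/hpa.yaml (hypothetical file for Phase 2)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kagent
  namespace: kagent
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: kagent                 # assumed StatefulSet name
  minReplicas: 5                 # matches the current production replica count
  maxReplicas: 10                # assumed ceiling
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # assumed target
```

CPU utilization targets need CPU requests on the containers and a metrics source such as metrics-server. Once an autoscaler owns the replica count, the `replicas` field is usually dropped from the StatefulSet patch so that each `kubectl apply -k` does not reset it.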
## File Structure Rationale
```
base/ # Applied to all environments
├── namespace.yaml # Single kagent namespace
├── rbac.yaml # Shared RBAC policies
├── configmap.yaml # Base A2A configuration
├── statefulset.yaml # Base deployment template
└── service.yaml # Shared services
overlays/dev/ # Development-specific
├── kustomization.yaml # Patch application order
└── statefulset-patch.yaml # 1 replica, lower resources
overlays/prod/ # Production-specific
├── kustomization.yaml # Patch application order
└── statefulset-patch.yaml # 5 replicas, higher resources
```
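A sketch of how the two production overlay files might fit together. The two files are shown in one block, separated by `---`, for brevity; on disk they are separate files. The StatefulSet and container name `kagent` and the resource figures are assumptions, while the replica count comes from the tree above.

```yaml
# overlays/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base                   # inherit every base manifest

patches:
  - path: statefulset-patch.yaml # applied on top of the base StatefulSet
---
# overlays/prod/statefulset-patch.yaml
# Strategic merge patch: matched to the base StatefulSet by kind, name and
# namespace; only the fields listed here are overridden.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kagent                   # assumed name
  namespace: kagent
spec:
  replicas: 5
  template:
    spec:
      containers:
        - name: kagent           # assumed container name
          resources:             # illustrative figures only
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 1Gi
```

The development overlay would follow the same shape with `replicas: 1` and smaller resource figures.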
## Related Decisions
- ADR-0001: A2A Protocol Implementation
- ADR-0003: Error Handling and Protocol Compliance
## References
- Kustomize Documentation: https://kustomize.io/
- Kubernetes StatefulSets: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
- kubectl: https://kubernetes.io/docs/reference/kubectl/