# ADR 0002: Kubernetes Deployment Strategy for kagent Integration

**Status:** Accepted

**Date:** 2026-02-07

**Authors:** VAPORA Team

## Context

kagent integration required a Kubernetes-native deployment strategy that:

- Supports development and production environments
- Maintains A2A protocol connectivity with VAPORA
- Enables horizontal scaling
- Ensures high availability in production
- Minimizes operational complexity
- Facilitates updates and configuration changes

## Decision

We adopted a **Kustomize-based deployment strategy** with environment-specific overlays:

```
kubernetes/kagent/
├── base/                    # Environment-agnostic base
│   ├── namespace.yaml
│   ├── rbac.yaml
│   ├── configmap.yaml
│   ├── statefulset.yaml
│   └── service.yaml
└── overlays/
    ├── dev/                 # Development: 1 replica, debug logging
    └── prod/                # Production: 5 replicas, HA
```
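
For orientation, the base directory needs a `kustomization.yaml` to tie these files together. The tree above does not list one, so the following is only a minimal sketch that assumes the file exists and references the five manifests shown:

```yaml
# base/kustomization.yaml (assumed file; not listed in the tree above)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Every manifest that the dev and prod overlays inherit
resources:
  - namespace.yaml
  - rbac.yaml
  - configmap.yaml
  - statefulset.yaml
  - service.yaml
```

Under this assumption, `kubectl kustomize kubernetes/kagent/overlays/dev` renders the merged manifests for review, and `kubectl apply -k kubernetes/kagent/overlays/dev` applies them.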

### Key Design Decisions

1. **StatefulSet over Deployment**
   - Provides stable pod identities
   - Supports ordered startup/shutdown
   - Compatible with persistent volumes

2. **Kustomize over Helm**
   - Native Kubernetes tooling (kubectl)
   - YAML-based, no templating language
   - Easier code review of actual manifests
   - Lower complexity for our use case

3. **Separate dev/prod Overlays**
   - Code reuse via base inheritance
   - Clear environment differentiation
   - Easy to add staging, testing, etc.
   - Single source of truth for base configuration

4. **ConfigMap-based A2A Integration**
   - Runtime configuration without rebuilding images
   - Environment-specific values (discovery interval, etc.)
   - Easy rollback via `kubectl rollout undo` when the change is part of a StatefulSet revision (for example, a hash-suffixed ConfigMap name); an in-place ConfigMap edit is rolled back by re-applying the previous version from Git

5. **Pod Anti-Affinity**
   - Development: Preferred (best-effort distribution)
   - Production: Required (strict node separation)
   - Prevents a single node failure from taking down every replica
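
The production anti-affinity rule can be sketched as a strategic-merge patch. The StatefulSet name `kagent` follows from the pod names used in this ADR (`kagent-0`, `kagent-1`); the `app: kagent` label is an assumption, not taken from the actual manifests:

```yaml
# overlays/prod/statefulset-patch.yaml (fragment; the label key/value is an assumption)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kagent
spec:
  replicas: 5
  template:
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: the scheduler never places two kagent pods on the same node
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: kagent
              topologyKey: kubernetes.io/hostname
```

With a required rule and 5 replicas, the cluster needs at least 5 schedulable nodes; otherwise the surplus pods stay `Pending`. The development overlay would use `preferredDuringSchedulingIgnoredDuringExecution` instead, which the scheduler treats as a preference rather than a constraint.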

## Rationale

**Why Kustomize?**
- No external dependencies or DSLs to learn
- kubectl integration (no new tools for operators)
- Transparent YAML (easier auditing)
- Suitable for our scale (not complex microservices)

**Why StatefulSet?**
- Pod names are predictable (kagent-0, kagent-1, etc.)
- Simplifies debugging and troubleshooting
- Compatible with persistent volumes for future phase
- A2A clients can reference stable endpoints
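
Stable per-pod endpoints depend on the StatefulSet being governed by a headless Service. Whether `base/service.yaml` is actually headless is not stated in this ADR, so the following is a sketch under that assumption; the port number and the `app: kagent` label are also hypothetical:

```yaml
# base/service.yaml (sketch; headless mode, port, and label are assumptions)
apiVersion: v1
kind: Service
metadata:
  name: kagent
  namespace: kagent
spec:
  clusterIP: None          # headless: DNS returns individual pod addresses
  selector:
    app: kagent
  ports:
    - name: a2a
      port: 8080
```

If the StatefulSet sets `spec.serviceName: kagent`, each pod gets a DNS name of the form `kagent-0.kagent.kagent.svc.cluster.local` (assuming the default cluster domain), which an A2A client can address directly.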

**Why ConfigMap for A2A settings?**
- No image rebuild required for config changes
- Easy to adjust discovery intervals per environment
- Transparent configuration in Git
- Can be patched/updated at runtime
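
To make the idea concrete, a base ConfigMap might look like the sketch below. The ConfigMap name and every key and value are hypothetical placeholders; the real A2A settings are not listed in this ADR:

```yaml
# base/configmap.yaml (sketch; name, keys, and values are all hypothetical)
apiVersion: v1
kind: ConfigMap
metadata:
  name: kagent-a2a-config
  namespace: kagent
data:
  A2A_VAPORA_URL: "http://vapora.example.internal:8080"
  A2A_DISCOVERY_INTERVAL_SECONDS: "30"
  LOG_LEVEL: "info"
```

One caveat on runtime updates: values injected as environment variables are read only at container start, so a change takes effect after the pods restart (for example, `kubectl rollout restart statefulset/kagent -n kagent`). Values mounted as files are refreshed by the kubelet after a short delay.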

**Why separate dev/prod?**
- Resource requirements differ dramatically
- Logging levels should differ
- Scaling policies differ
- Both treated equally in code review

## Consequences

**Positive:**
- Identical code paths in dev and prod (just different replicas/resources)
- Easy to add more environments (staging, testing, etc.)
- Standard kubectl workflows
- Clear separation of concerns
- Configuration in version control
- No external tools beyond kubectl

**Negative:**
- Manual scaling (no HorizontalPodAutoscaler initially)
- Kustomize has limitations for complex overlays
- No templating language flexibility
- Requires understanding of Kubernetes primitives

## Alternatives Considered

1. **Helm Charts**
   - Rejected: Go templates more complex than needed
   - Revisit if complexity demands it

2. **Deployment + Horizontal Pod Autoscaler**
   - Rejected: StatefulSet provides stability needed for debugging
   - Can layer HPA over StatefulSet if needed

3. **All-in-one manifest**
   - Rejected: Code duplication between dev/prod
   - No clear environment separation

## Migration Path

1. **Current:** Kustomize with manual scaling
2. **Phase 2:** Add HorizontalPodAutoscaler overlay
3. **Phase 3:** Add Prometheus/Grafana monitoring
4. **Phase 4:** Integrate with Istio service mesh
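
As a rough illustration of Phase 2, an autoscaler can target the StatefulSet directly through its scale subresource. The file name, replica bounds, and CPU threshold below are placeholders, not decided values:

```yaml
# overlays/prod/hpa.yaml (hypothetical file; bounds and threshold are placeholders)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kagent
  namespace: kagent
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: kagent
  minReplicas: 5
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

CPU-utilization scaling requires CPU requests on the containers and a metrics source such as metrics-server. Once an autoscaler owns the replica count, the `replicas` field is usually dropped from the overlay patch so that `kubectl apply` does not reset it.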

## File Structure Rationale

```
base/                          # Applied to all environments
├── namespace.yaml             # Single kagent namespace
├── rbac.yaml                  # Shared RBAC policies
├── configmap.yaml             # Base A2A configuration
├── statefulset.yaml           # Base deployment template
└── service.yaml               # Shared services

overlays/dev/                  # Development-specific
├── kustomization.yaml         # Patch application order
└── statefulset-patch.yaml     # 1 replica, lower resources

overlays/prod/                 # Production-specific
├── kustomization.yaml         # Patch application order
└── statefulset-patch.yaml     # 5 replicas, higher resources
```
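
As an illustration of how an overlay consumes the base, a production `kustomization.yaml` might look like the sketch below. The actual file is not reproduced in this ADR, and the `patches` field assumes a reasonably recent Kustomize (older releases use `patchesStrategicMerge`):

```yaml
# overlays/prod/kustomization.yaml (sketch; the real file may differ)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Pull in everything defined by the shared base
resources:
  - ../../base

# Patches are applied in the order listed
patches:
  - path: statefulset-patch.yaml
```

The development overlay would have the same shape and differ only in the contents of its `statefulset-patch.yaml`.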

## Related Decisions

- ADR-0001: A2A Protocol Implementation
- ADR-0003: Error Handling and Protocol Compliance

## References

- Kustomize Documentation: https://kustomize.io/
- Kubernetes StatefulSets: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
- kubectl: https://kubernetes.io/docs/reference/kubectl/