# ADR-009: Istio Service Mesh for Kubernetes

**Status**: Accepted | Implemented
**Date**: 2024-11-01
**Deciders**: Kubernetes Architecture Team
**Technical Story**: Adding zero-trust security and traffic management for microservices in K8s

---

## Decision

Use **Istio** as the service mesh for mTLS, traffic management, rate limiting, and observability in Kubernetes.

---

## Rationale

1. **mTLS Out-of-the-Box**: Automatic TLS between services without code changes
2. **Zero-Trust**: Mutual TLS enforced by default
3. **Traffic Management**: Circuit breakers, retries, and timeouts without application logic
4. **Observability**: Automatic tracing and metrics collection
5. **VAPORA Multiservice**: 4 deployments (backend, agents, LLM router, frontend) need inter-service security

---

## Alternatives Considered

### ❌ Plain Kubernetes Networking

- **Pros**: Simpler setup, fewer components
- **Cons**: No mTLS, no traffic policies, manual observability

### ❌ Linkerd (Minimal Service Mesh)

- **Pros**: Lighter weight than Istio
- **Cons**: Less feature-rich, smaller ecosystem

### ✅ Istio (CHOSEN)

- Industry standard, feature-rich, compatible with the VAPORA deployment

---

## Trade-offs

**Pros**:

- ✅ Automatic mTLS between services
- ✅ Declarative traffic policies (no code changes)
- ✅ Circuit breakers and retries built-in
- ✅ Integrated observability (tracing, metrics)
- ✅ Gradual rollout support (canary deployments)
- ✅ Rate limiting and authentication policies
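
To make the gradual-rollout point concrete, here is a minimal sketch of weighted traffic splitting for the backend. It assumes the backend pods carry a `version` label with values `v1` and `v2`; those labels, the subset names, and the 90/10 split are illustrative and are not taken from the VAPORA manifests. In practice the subsets and the weighted route would likely be merged into the existing `vapora-backend` DestinationRule and VirtualService rather than applied as separate resources.

```yaml
# Sketch only: labels, subset names, and weights are assumptions.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: vapora-backend
  namespace: vapora
spec:
  host: vapora-backend
  # Any existing trafficPolicy (connection pool, outlier detection) would stay here.
  subsets:
    - name: stable
      labels:
        version: v1   # assumed pod label
    - name: canary
      labels:
        version: v2   # assumed pod label
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: vapora-backend
  namespace: vapora
spec:
  hosts:
    - vapora-backend
  http:
    - route:
        - destination:
            host: vapora-backend
            subset: stable
          weight: 90   # most traffic stays on the current version
        - destination:
            host: vapora-backend
            subset: canary
          weight: 10   # small share goes to the new version
```

Shifting the rollout forward is then a matter of editing the two `weight` values, which must sum to 100.
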

**Cons**:

- ⚠️ Operational complexity (data plane + control plane)
- ⚠️ Memory overhead per pod (sidecar proxy)
- ⚠️ Debugging complexity (multiple proxy layers)
- ⚠️ Certificate management and rotation

---

## Implementation

**Installation**:

```bash
# Install Istio ("production" is not a built-in istioctl profile;
# "default" is the profile recommended for production use)
istioctl install --set profile=default -y

# Enable sidecar injection for namespace
kubectl label namespace vapora istio-injection=enabled

# Verify installation
kubectl get pods -n istio-system
```

**Service Mesh Configuration**:

```yaml
# kubernetes/platform/istio-config.yaml

# Virtual Service for traffic policies
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: vapora-backend
  namespace: vapora
spec:
  hosts:
    - vapora-backend
  http:
    - match:
        - uri:
            prefix: /api/health
      route:
        - destination:
            host: vapora-backend
            port:
              number: 8001
      timeout: 5s            # overall deadline, including retries
      retries:
        attempts: 3
        perTryTimeout: 2s

---
# Destination Rule for circuit breaker
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: vapora-backend
  namespace: vapora
spec:
  host: vapora-backend
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s

---
# Authorization Policy (deny all by default)
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: vapora-default-deny
  namespace: vapora
spec: {}  # Default deny-all

---
# Allow backend to agents
# No selector is set, so this rule applies to every workload in the namespace.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-backend-to-agents
  namespace: vapora
spec:
  rules:
    - from:
        - source:
            principals: ["cluster.local/ns/vapora/sa/vapora-backend"]
      to:
        - operation:
            ports: ["8002"]
```
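
The resources above set traffic and authorization policy but do not by themselves require mTLS. Istio's default peer authentication mode is `PERMISSIVE`, which still accepts plaintext, and the `principals` match in the allow rule relies on mTLS identities. Strict enforcement for the namespace is usually expressed with a `PeerAuthentication` resource. A minimal sketch, assuming it would sit alongside the other resources (the resource name `default` is a common convention, not taken from the VAPORA manifests):

```yaml
# Sketch only: require mTLS for all workloads in the vapora namespace.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default        # assumed name
  namespace: vapora
spec:
  mtls:
    mode: STRICT       # reject plaintext connections to sidecar-injected workloads
```

With `STRICT` in place, clients without a sidecar can no longer call services in `vapora`, so it is safer to apply this only after confirming that every pod has been injected.
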

**Key Files**:

- `/kubernetes/platform/istio-config.yaml` (Istio configuration)
- `/kubernetes/base/` (Deployment manifests with sidecar injection)
- `istioctl` commands for traffic management

---

## Verification

```bash
# Check sidecar injection
kubectl get pods -n vapora -o jsonpath='{.items[*].spec.containers[*].name}' | grep istio-proxy

# List virtual services
kubectl get virtualservices -n vapora

# Analyze the namespace configuration for problems (including mTLS-related ones)
istioctl analyze -n vapora

# Monitor traffic between services
kubectl logs -n vapora deployment/vapora-backend -c istio-proxy --tail 20

# Smoke-test a call from backend to agents through the sidecars.
# A single request does not trip the circuit breaker; that needs sustained failing load.
kubectl exec -it deployment/vapora-backend -n vapora -- \
  curl -v http://vapora-agents:8002/health -X GET \
  --max-time 10

# Verify authorization policies
kubectl get authorizationpolicies -n vapora

# Check metrics collection
kubectl port-forward -n istio-system svc/prometheus 9090:9090
# Open http://localhost:9090 and query: rate(istio_requests_total[1m])
```

**Expected Output**:

- All pods have the istio-proxy sidecar
- VirtualServices and DestinationRules are configured
- mTLS is enabled between services
- Circuit breaker protects against cascading failures
- Authorization policies enforce least-privilege access
- Metrics are collected for all inter-service traffic

---

## Consequences

### Operational

- Certificate rotation is automatic (Istio CA)
- Service-to-service debugging requires understanding the proxy layers
- Traffic policies are applied without redeploying code

### Performance

- Sidecar proxy adds ~5-10ms latency per call
- Memory per pod: +50MB for the proxy container
- Judged worth the cost for the security and observability gained

### Debugging

- Use `istioctl analyze` to diagnose issues
- Envoy proxy logs are available in the sidecar containers
- Distributed tracing via Jaeger/Zipkin integration
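
Whether the sidecars emit per-request access logs depends on the mesh configuration. One way to turn them on for a single namespace is Istio's Telemetry API. A minimal sketch, assuming the built-in `envoy` log provider is available in the installed Istio version; the resource name is hypothetical:

```yaml
# Sketch only: enable Envoy access logs for workloads in the vapora namespace.
# Newer Istio releases also serve this resource as telemetry.istio.io/v1.
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: vapora-access-logs   # hypothetical name
  namespace: vapora
spec:
  accessLogging:
    - providers:
        - name: envoy        # built-in provider that writes to the sidecar's stdout
```

Once applied, per-request entries should show up in the `istio-proxy` container logs of each pod in the namespace.
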

### Scaling

- Automatic load balancing via DestinationRule
- Circuit breaker prevents thundering herd
- Support for canary rollouts via traffic splitting

---

## References

- [Istio Documentation](https://istio.io/latest/docs/)
- [Istio Security](https://istio.io/latest/docs/concepts/security/)
- `/kubernetes/platform/istio-config.yaml` (configuration)
- [Prometheus Integration](https://istio.io/latest/docs/ops/integrations/prometheus/)

---

**Related ADRs**: ADR-001 (Workspace), ADR-010 (Cedar Authorization)