ADR-009: Istio Service Mesh for Kubernetes
Status: Accepted | Implemented
Date: 2024-11-01
Deciders: Kubernetes Architecture Team
Technical Story: Adding zero-trust security and traffic management for microservices in K8s
Decision
Use Istio as the service mesh for mTLS, traffic management, rate limiting, and observability in Kubernetes.
Rationale
- mTLS Out-of-the-Box: Automatic TLS between services without code changes
- Zero-Trust: Mutual TLS enforced by default
- Traffic Management: Circuit breakers, retries, and timeouts without application logic
- Observability: Automatic tracing and metrics collection
- VAPORA Multiservice: 4 deployments (backend, agents, LLM router, frontend) need inter-service security
Alternatives Considered
❌ Plain Kubernetes Networking
- Pros: Simpler setup, fewer components
- Cons: No mTLS, no traffic policies, manual observability
❌ Linkerd (Minimal Service Mesh)
- Pros: Lighter weight than Istio
- Cons: Less feature-rich, smaller ecosystem
✅ Istio (CHOSEN)
- Industry standard, feature-rich, VAPORA deployment compatible
Trade-offs
Pros:
- ✅ Automatic mTLS between services
- ✅ Declarative traffic policies (no code changes)
- ✅ Circuit breakers and retries built-in
- ✅ Integrated observability (tracing, metrics)
- ✅ Gradual rollout support (canary deployments)
- ✅ Rate limiting and authentication policies
Cons:
- ⚠️ Operational complexity (data plane + control plane)
- ⚠️ Memory overhead per pod (sidecar proxy)
- ⚠️ Debugging complexity (multiple proxy layers)
- ⚠️ Certificate issuance and rotation management
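Rate limiting is listed among the benefits, but Istio has no dedicated rate-limit resource; local (per-proxy) limits are typically configured by enabling Envoy's `local_ratelimit` HTTP filter through an `EnvoyFilter`. The following is a minimal sketch of that approach, not the project's actual configuration: the resource name, the `app: vapora-backend` workload label, and the token-bucket values are assumptions chosen for illustration.

```yaml
# Sketch (assumptions): name, workload label, and limits are illustrative.
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: vapora-backend-local-ratelimit   # hypothetical name
  namespace: vapora
spec:
  workloadSelector:
    labels:
      app: vapora-backend                # hypothetical pod label
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: SIDECAR_INBOUND           # limit traffic arriving at the workload
      listener:
        filterChain:
          filter:
            name: "envoy.filters.network.http_connection_manager"
    patch:
      operation: INSERT_BEFORE
      value:
        name: envoy.filters.http.local_ratelimit
        typed_config:
          "@type": type.googleapis.com/udpa.type.v1.TypedStruct
          type_url: type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
          value:
            stat_prefix: http_local_rate_limiter
            token_bucket:                # 100 requests per 60s, per proxy
              max_tokens: 100
              tokens_per_fill: 100
              fill_interval: 60s
            filter_enabled:              # evaluate the limit for 100% of requests
              runtime_key: local_rate_limit_enabled
              default_value:
                numerator: 100
                denominator: HUNDRED
            filter_enforced:             # reject 100% of over-limit requests
              runtime_key: local_rate_limit_enforced
              default_value:
                numerator: 100
                denominator: HUNDRED
```

Because the bucket is kept per sidecar, the effective limit scales with the number of replicas; a mesh-wide quota would need a global rate-limit service instead.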
Implementation
Installation:
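# Install Istio ("default" is the built-in profile recommended for production; Istio ships no built-in "production" profile)
istioctl install --set profile=default -y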
# Enable sidecar injection for namespace
kubectl label namespace vapora istio-injection=enabled
# Verify installation
kubectl get pods -n istio-system
Service Mesh Configuration:
# kubernetes/platform/istio-config.yaml
# Virtual Service for traffic policies
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: vapora-backend
  namespace: vapora
spec:
  hosts:
  - vapora-backend
  http:
  - match:
    - uri:
        prefix: /api/health
    route:
    - destination:
        host: vapora-backend
        port:
          number: 8001
    timeout: 5s
    retries:
      attempts: 3
      perTryTimeout: 2s
---
# Destination Rule for circuit breaker
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: vapora-backend
  namespace: vapora
spec:
  host: vapora-backend
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
---
# Authorization Policy (deny all by default)
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: vapora-default-deny
  namespace: vapora
spec: {}  # Empty spec: default deny-all
---
# Allow backend to agents
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-backend-to-agents
  namespace: vapora
spec:
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/vapora/sa/vapora-backend"]
    to:
    - operation:
        ports: ["8002"]
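The resources above cover routing, circuit breaking, and authorization, but none of them sets the mTLS mode. Istio sidecars accept both plaintext and mutual-TLS traffic (PERMISSIVE mode) unless a `PeerAuthentication` policy requires otherwise, so enforcing mTLS for the namespace would need something along these lines. This is a sketch under that assumption and is not taken from the original configuration file; the resource name is hypothetical.

```yaml
# Sketch (assumption): not part of the original istio-config.yaml.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: vapora-strict-mtls   # hypothetical name
  namespace: vapora          # no selector, so it applies to every workload in the namespace
spec:
  mtls:
    mode: STRICT             # reject plaintext connections to these workloads
```

Strict mode also matters for the `allow-backend-to-agents` policy: its `principals` match relies on the peer identity carried in the mTLS certificate, so plaintext callers can never satisfy it.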
Key Files:
- /kubernetes/platform/istio-config.yaml (Istio configuration)
- /kubernetes/base/ (Deployment manifests with sidecar injection)
- istioctl commands for traffic management
Verification
# Check sidecar injection
kubectl get pods -n vapora -o jsonpath='{.items[*].spec.containers[*].name}' | grep istio-proxy
# List virtual services
kubectl get virtualservices -n vapora
# Analyze mesh configuration for problems (validates config, including mTLS-related misconfigurations)
istioctl analyze -n vapora
# Monitor traffic between services
kubectl logs -n vapora deployment/vapora-backend -c istio-proxy --tail 20
# Test a backend-to-agents call through the sidecars (should fail gracefully within 10s if agents is unavailable)
kubectl exec -it deployment/vapora-backend -n vapora -- \
  curl -v http://vapora-agents:8002/health -X GET \
  --max-time 10
# Verify authorization policies
kubectl get authorizationpolicies -n vapora
# Check metrics collection
kubectl port-forward -n istio-system svc/prometheus 9090:9090
# Open http://localhost:9090 and query: rate(istio_requests_total[1m])
Expected Output:
- All pods have istio-proxy sidecar
- VirtualServices and DestinationRules configured
- mTLS enabled between services
- Circuit breaker protects against cascading failures
- Authorization policies enforce least-privilege access
- Metrics collected for all inter-service traffic
Consequences
Operational
- Certificate rotation automatic (Istio CA)
- Service-to-service debugging requires understanding proxy layers
- Traffic policies applied without code redeployment
Performance
- Sidecar proxy adds ~5-10ms latency per call
- Memory per pod: +50MB for proxy container
- Worth the security/observability trade-off
Debugging
- Use istioctl analyze to diagnose issues
- Envoy proxy logs in sidecar containers
- Distributed tracing via Jaeger/Zipkin integration
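Trace sampling for the namespace can be tuned declaratively through Istio's Telemetry API. A minimal sketch, assuming a tracing provider named `jaeger` has already been registered under `meshConfig.extensionProviders`; the provider name, the resource name, and the 10% sampling rate are all assumptions, not values from this ADR.

```yaml
# Sketch (assumptions): provider name, resource name, and sampling rate are illustrative.
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: vapora-tracing       # hypothetical name
  namespace: vapora
spec:
  tracing:
  - providers:
    - name: jaeger           # must match an extensionProvider defined in meshConfig
    randomSamplingPercentage: 10.0   # sample roughly 10% of requests
```

The sidecars only generate and forward spans; services still have to propagate the incoming trace headers on their outbound calls for spans to join into one trace.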
Scaling
- Automatic load balancing via DestinationRule
- Circuit breaker prevents thundering herd
- Support for canary rollouts via traffic splitting
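Traffic splitting for a canary rollout can be expressed with subsets and weighted routes. The sketch below assumes two backend versions distinguished by a `version` pod label; the subset names `stable` and `canary`, that label, and the 90/10 split are hypothetical and do not come from the existing manifests.

```yaml
# Sketch only: subset names and the "version" pod label are assumptions.
# Merge "subsets" into the existing vapora-backend DestinationRule; applying
# this resource as-is would replace that rule's trafficPolicy.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: vapora-backend
  namespace: vapora
spec:
  host: vapora-backend
  subsets:
  - name: stable
    labels:
      version: stable        # hypothetical pod label
  - name: canary
    labels:
      version: canary        # hypothetical pod label
---
# Add this route to the existing vapora-backend VirtualService (after the
# /api/health match) rather than creating a second VirtualService for the host.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: vapora-backend
  namespace: vapora
spec:
  hosts:
  - vapora-backend
  http:
  - route:
    - destination:
        host: vapora-backend
        subset: stable
        port:
          number: 8001
      weight: 90             # 90% of requests stay on the stable version
    - destination:
        host: vapora-backend
        subset: canary
        port:
          number: 8001
      weight: 10             # 10% of requests go to the canary
```

Shifting the rollout forward is then a matter of editing the two weights, which must sum to 100, without redeploying either version.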
References
- Istio Documentation
- Istio Security
- /kubernetes/platform/istio-config.yaml (configuration)
- Prometheus Integration
Related ADRs: ADR-001 (Workspace), ADR-010 (Cedar Authorization)