Kubernetes Templates
Nickel-based Kubernetes manifest templates for provisioning platform services.
Overview
This directory contains Kubernetes deployment manifests written in the Nickel configuration language. The templates are parameterized to support four deployment modes:
- solo: Single developer, 1 replica per service, minimal resources
- multiuser: Team collaboration, 1-2 replicas per service, PostgreSQL + SurrealDB
- cicd: CI/CD pipelines, 1 replica, stateless and ephemeral
- enterprise: Production HA, 2-3 replicas per service, full monitoring stack
Templates
Service Deployments
orchestrator-deployment.yaml.ncl
Orchestrator workflow engine deployment with:
- 3 replicas (enterprise default; overridden per mode)
- Service account for RBAC
- Health checks (liveness + readiness probes)
- Resource requests/limits (500m CPU, 512Mi RAM minimum)
- Volume mounts for data and logs
- Pod anti-affinity for distributed deployment
- Init containers for dependency checking
Mode-specific overrides:
- Solo: 1 replica, filesystem storage
- MultiUser: 1 replica, SurrealDB backend
- CI/CD: 1 replica, ephemeral storage
- Enterprise: 3 replicas, SurrealDB cluster
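The per-mode overrides above follow Nickel's merge semantics: base fields annotated with `default` priority can be replaced by a mode-specific record. A minimal sketch of the pattern (the `base` and `mode_overrides` names are illustrative, not the templates' actual structure):

```nickel
# Illustrative sketch of mode-based overrides; not the actual template code.
# Base values carry `default` priority so a mode record can override them.
let base = {
  replicas | default = 3,
  storage_backend | default = "surrealdb_cluster",
  resources = {
    requests = { cpu = "500m", memory = "512Mi" }
  }
} in
let mode_overrides = {
  solo = { replicas = 1, storage_backend = "filesystem" },
  multiuser = { replicas = 1, storage_backend = "surrealdb" },
  cicd = { replicas = 1, storage_backend = "filesystem" },
  enterprise = {}
} in
let mode = "solo" in
base & (std.record.get mode mode_overrides)
```

Evaluating this with `nickel eval` for `mode = "solo"` yields 1 replica and filesystem storage while keeping the shared `resources` block.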
orchestrator-service.yaml.ncl
Internal ClusterIP service for orchestrator with:
- Session affinity (3-hour timeout)
- Port 9090 (HTTP API)
- Port 9091 (Metrics)
- Internal access only (ClusterIP)
Mode-specific overrides:
- Enterprise: LoadBalancer for external access
control-center-deployment.yaml.ncl
Control Center policy and RBAC management with:
- 2 replicas (enterprise mode)
- Database integration (PostgreSQL or RocksDB)
- RBAC and JWT configuration
- MFA support
- Health checks and resource limits
- Security context (non-root user)
Environment variables:
- Database type and URL
- RBAC enablement
- JWT issuer, audience, secret
- MFA requirement
- Log level
control-center-service.yaml.ncl
Internal ClusterIP service for Control Center with:
- Port 8080 (HTTP API + UI)
- Port 8081 (Metrics)
- Session affinity
mcp-server-deployment.yaml.ncl
Model Context Protocol server for AI/LLM integration with:
- Lightweight deployment (100m CPU, 128Mi RAM minimum)
- Orchestrator integration
- Control Center integration
- MCP capabilities (tools, resources, prompts)
- Tool concurrency limits
- Resource size limits
Mode-specific overrides:
- Solo: 1 replica
- Enterprise: 2 replicas for HA
mcp-server-service.yaml.ncl
Internal ClusterIP service for MCP server with:
- Port 8888 (HTTP API)
- Port 8889 (Metrics)
Networking
platform-ingress.yaml.ncl
Nginx ingress for external HTTP/HTTPS routing with:
- TLS termination with Let's Encrypt (cert-manager)
- CORS configuration
- Security headers (HSTS, X-Frame-Options, etc.)
- Rate limiting (1000 RPS, 100 connections)
- Path-based routing to services
Routes:
- api.example.com/orchestrator → orchestrator:9090
- control-center.example.com/ → control-center:8080
- mcp.example.com/ → mcp-server:8888
- orchestrator.example.com/api → orchestrator:9090
- orchestrator.example.com/policy → control-center:8080
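Each route corresponds to one host rule in the rendered ingress. A sketch of the first rule as a Nickel record, mirroring the Kubernetes `networking.k8s.io/v1` Ingress schema (`pathType: Prefix` is an assumption):

```nickel
# Sketch of a single ingress rule: api.example.com/orchestrator → orchestrator:9090.
{
  host = "api.example.com",
  http = {
    paths = [
      {
        path = "/orchestrator",
        pathType = "Prefix",  # assumed; "Exact" is the alternative
        backend = {
          service = { name = "orchestrator", port = { number = 9090 } }
        }
      }
    ]
  }
}
```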
Namespace and Cluster Configuration
namespace.yaml.ncl
Kubernetes Namespace for provisioning platform with:
- Pod security policies (baseline enforcement)
- Labels for organization and monitoring
- Annotations for description
resource-quota.yaml.ncl
ResourceQuota for resource consumption limits:
- CPU: 8 requests / 16 limits (total)
- Memory: 16GB requests / 32GB limits (total)
- Storage: 200GB (persistent volumes)
- Pod limit: 20 pods maximum
- Services: 10 maximum
- ConfigMaps/Secrets: 50 each
- Deployments/StatefulSets/Jobs: Limited per type
Mode-specific overrides:
- Solo: 4 CPU / 8GB memory, 10 pods
- MultiUser: 8 CPU / 16GB memory, 20 pods
- CI/CD: 16 CPU / 32GB memory, 50 pods (ephemeral)
- Enterprise: Unlimited (managed externally)
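The per-mode quota numbers can be expressed as a lookup table that feeds `spec.hard`. A hedged sketch (the `quotas` record name is illustrative; enterprise is omitted because its quota is managed externally):

```nickel
# Illustrative per-mode quota table; evaluates to a complete ResourceQuota manifest.
let quotas = {
  solo = { "requests.cpu" = "4", "requests.memory" = "8Gi", pods = "10" },
  multiuser = { "requests.cpu" = "8", "requests.memory" = "16Gi", pods = "20" },
  cicd = { "requests.cpu" = "16", "requests.memory" = "32Gi", pods = "50" }
} in
let mode = "multiuser" in
{
  apiVersion = "v1",
  kind = "ResourceQuota",
  metadata = { name = "provisioning-quota", namespace = "provisioning" },
  spec = { hard = std.record.get mode quotas }
}
```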
network-policy.yaml.ncl
NetworkPolicy for network isolation and security:
- Ingress: Allow traffic from Nginx, inter-pod, Prometheus, DNS
- Egress: Allow DNS queries, inter-pod, external HTTPS
- Default: Deny all except explicitly allowed
Ports managed:
- 9090: Orchestrator API
- 8080: Control Center API/UI
- 8888: MCP Server
- 5432: PostgreSQL
- 8000: SurrealDB
- 53: DNS (TCP/UDP)
- 443/80: External HTTPS/HTTP
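As an example of the allow rules, a sketch of a policy admitting Prometheus scrapes on the metrics ports (the `monitoring` namespace label is an assumption about the cluster's monitoring setup):

```nickel
# Sketch: allow Prometheus (monitoring namespace) to reach the metrics ports.
{
  apiVersion = "networking.k8s.io/v1",
  kind = "NetworkPolicy",
  metadata = { name = "allow-prometheus-scrape", namespace = "provisioning" },
  spec = {
    podSelector = {},  # all pods in the namespace
    policyTypes = ["Ingress"],
    ingress = [
      {
        from = [{ namespaceSelector = { matchLabels = { name = "monitoring" } } }],
        ports = [
          { protocol = "TCP", port = 9091 },  # orchestrator metrics
          { protocol = "TCP", port = 8081 },  # control-center metrics
          { protocol = "TCP", port = 8889 }   # mcp-server metrics
        ]
      }
    ]
  }
}
```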
rbac.yaml.ncl
Role-Based Access Control (RBAC) setup with:
- ServiceAccounts: orchestrator, control-center, mcp-server
- Roles: Minimal permissions per service
- RoleBindings: Connect ServiceAccounts to Roles
Permissions:
- Orchestrator: Read ConfigMaps, Secrets, Pods, Services
- Control Center: Read/Write Secrets, ConfigMaps, Deployments
- MCP Server: Read ConfigMaps, Secrets, Pods, Services
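The orchestrator's read-only permissions translate to a Role like the following sketch (the verbs are assumed to be the usual read-only set `get`/`list`/`watch`):

```nickel
# Sketch of the read-only Role bound to the orchestrator ServiceAccount.
{
  apiVersion = "rbac.authorization.k8s.io/v1",
  kind = "Role",
  metadata = { name = "orchestrator-role", namespace = "provisioning" },
  rules = [
    {
      apiGroups = [""],  # core API group
      resources = ["configmaps", "secrets", "pods", "services"],
      verbs = ["get", "list", "watch"]  # assumed read-only verb set
    }
  ]
}
```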
Usage
Rendering Templates
Each template is a Nickel file that evaluates to JSON, which is then converted to YAML:
# Render a single template
nickel eval --format json orchestrator-deployment.yaml.ncl | yq -P > orchestrator-deployment.yaml
# Render all templates
for template in *.ncl; do
  nickel eval --format json "$template" | yq -P > "${template%.ncl}"  # foo.yaml.ncl → foo.yaml
done
Deploying to Kubernetes
# Create namespace
kubectl create namespace provisioning
# Create ConfigMaps for configuration
kubectl create configmap orchestrator-config \
--from-literal=storage_backend=surrealdb \
--from-literal=max_concurrent_tasks=50 \
--from-literal=batch_parallel_limit=20 \
--from-literal=log_level=info \
-n provisioning
# Create secrets for sensitive data
kubectl create secret generic control-center-secrets \
--from-literal=database_url="postgresql://user:pass@postgres/provisioning" \
--from-literal=jwt_secret="your-jwt-secret-here" \
-n provisioning
# Apply manifests
kubectl apply -f orchestrator-deployment.yaml -n provisioning
kubectl apply -f orchestrator-service.yaml -n provisioning
kubectl apply -f control-center-deployment.yaml -n provisioning
kubectl apply -f control-center-service.yaml -n provisioning
kubectl apply -f mcp-server-deployment.yaml -n provisioning
kubectl apply -f mcp-server-service.yaml -n provisioning
kubectl apply -f platform-ingress.yaml -n provisioning
Verifying Deployment
# Check deployments
kubectl get deployments -n provisioning
# Check services
kubectl get svc -n provisioning
# Check ingress
kubectl get ingress -n provisioning
# View logs
kubectl logs -n provisioning -l app=orchestrator -f
kubectl logs -n provisioning -l app=control-center -f
kubectl logs -n provisioning -l app=mcp-server -f
# Describe resource
kubectl describe deployment orchestrator -n provisioning
kubectl describe service orchestrator -n provisioning
ConfigMaps and Secrets
Required ConfigMaps
orchestrator-config
apiVersion: v1
kind: ConfigMap
metadata:
  name: orchestrator-config
  namespace: provisioning
data:
  storage_backend: "surrealdb" # or "filesystem"
  max_concurrent_tasks: "50" # Must match constraint.orchestrator.queue.concurrent_tasks.max
  batch_parallel_limit: "20" # Must match constraint.orchestrator.batch.parallel_limit.max
  log_level: "info"
control-center-config
apiVersion: v1
kind: ConfigMap
metadata:
  name: control-center-config
  namespace: provisioning
data:
  database_type: "postgres" # or "rocksdb"
  rbac_enabled: "true"
  jwt_issuer: "provisioning.local"
  jwt_audience: "orchestrator"
  mfa_required: "true" # Enterprise only
  log_level: "info"
mcp-server-config
apiVersion: v1
kind: ConfigMap
metadata:
  name: mcp-server-config
  namespace: provisioning
data:
  protocol: "stdio" # or "http"
  orchestrator_url: "http://orchestrator:9090"
  control_center_url: "http://control-center:8080"
  enable_tools: "true"
  enable_resources: "true"
  enable_prompts: "true"
  max_concurrent_tools: "10"
  max_resource_size: "1073741824" # 1 GiB in bytes
  log_level: "info"
Required Secrets
control-center-secrets
apiVersion: v1
kind: Secret
metadata:
  name: control-center-secrets
  namespace: provisioning
type: Opaque
stringData:
  database_url: "postgresql://user:password@postgres:5432/provisioning"
  jwt_secret: "your-secure-random-string-here"
Persistence
All deployments use PersistentVolumeClaims for data storage:
# Create PersistentVolumeClaims
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: orchestrator-data
  namespace: provisioning
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: orchestrator-logs
  namespace: provisioning
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: control-center-logs
  namespace: provisioning
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mcp-server-logs
  namespace: provisioning
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
EOF
Customization by Mode
Solo Mode Overrides
replicas: 1
resources:
  requests:
    cpu: "100m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
storageBackend: "filesystem"
MultiUser Mode Overrides
replicas: 1
resources:
  requests:
    cpu: "250m"
    memory: "512Mi"
  limits:
    cpu: "1"
    memory: "1Gi"
storageBackend: "surrealdb_server"
database: "postgres"
rbac_enabled: "true"
CI/CD Mode Overrides
replicas: 1
restartPolicy: "Never"
ttlSecondsAfterFinished: 86400 # Keep for 24 hours
storageBackend: "filesystem"
ephemeral: true
Enterprise Mode Overrides
replicas: 3
resources:
  requests:
    cpu: "1"
    memory: "1Gi"
  limits:
    cpu: "4"
    memory: "4Gi"
storageBackend: "surrealdb_cluster"
database: "postgres_ha"
rbac_enabled: "true"
mfa_required: "true"
monitoring: "enabled"
Monitoring and Observability
Prometheus Integration
All services expose metrics on ports:
- Orchestrator: 9091
- Control Center: 8081
- MCP Server: 8889
ServiceMonitor for Prometheus:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: provisioning-platform
  namespace: provisioning
spec:
  selector:
    matchLabels:
      component: provisioning-platform
  endpoints:
    - port: metrics
      interval: 30s
Health Checks
- Liveness Probe: GET /health determines whether the pod is alive
- Readiness Probe: GET /ready determines whether the pod can serve traffic
Both use HTTP GET with sensible timeouts and failure thresholds.
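The probe pair might look like this in the rendered container spec (port 9090 is the orchestrator API port; the timing values are assumptions, not the templates' actual settings):

```nickel
# Sketch of the probe blocks for the orchestrator container; timings are illustrative.
{
  livenessProbe = {
    httpGet = { path = "/health", port = 9090 },
    initialDelaySeconds = 10,
    periodSeconds = 15,
    failureThreshold = 3
  },
  readinessProbe = {
    httpGet = { path = "/ready", port = 9090 },
    initialDelaySeconds = 5,
    periodSeconds = 10,
    failureThreshold = 3
  }
}
```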
Troubleshooting
Pod fails to start
# Check events
kubectl describe pod -n provisioning -l app=orchestrator
# Check logs
kubectl logs -n provisioning -l app=orchestrator --previous
# Check resource availability
kubectl top nodes
kubectl top pods -n provisioning
Service not reachable
# Check service DNS
kubectl exec -it <pod> -n provisioning -- nslookup orchestrator
# Check ingress routing
kubectl describe ingress platform-ingress -n provisioning
# Test connectivity from pod
kubectl run -it --rm test --image=busybox --restart=Never -n provisioning -- wget -qO- http://orchestrator:9090/health
TLS certificate issues
# Check certificate status
kubectl describe certificate platform-tls-cert -n provisioning
# Check cert-manager logs
kubectl logs -n cert-manager deployment/cert-manager -f