
Kubernetes Templates

Nickel-based Kubernetes manifest templates for provisioning platform services.

Overview

This directory contains Kubernetes deployment manifests written in the Nickel configuration language. The templates are parameterized to support all four deployment modes:

  • solo: Single developer, 1 replica per service, minimal resources
  • multiuser: Team collaboration, 1-2 replicas per service, PostgreSQL + SurrealDB
  • cicd: CI/CD pipelines, 1 replica, stateless and ephemeral
  • enterprise: Production HA, 2-3 replicas per service, full monitoring stack
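As a rough illustration of how such per-mode parameterization works in Nickel (the field names here are hypothetical, not the templates' actual schema), defaults can be declared with the `default` priority and then overridden by merging in a mode-specific record:

```nickel
# Hypothetical sketch: base record with overridable defaults.
let base = {
  replicas | default = 1,
  storage_backend | default = "filesystem",
} in
let enterprise = {
  replicas = 3,
  storage_backend = "surrealdb_cluster",
} in
# Merging replaces the defaults; evaluates to
# { replicas = 3, storage_backend = "surrealdb_cluster" }
base & enterprise
```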

Templates

Service Deployments

orchestrator-deployment.yaml.ncl

Orchestrator workflow engine deployment with:

  • 3 replicas by default (enterprise mode; overridden per mode)
  • Service account for RBAC
  • Health checks (liveness + readiness probes)
  • Resource requests/limits (500m CPU, 512Mi RAM minimum)
  • Volume mounts for data and logs
  • Pod anti-affinity for distributed deployment
  • Init containers for dependency checking

Mode-specific overrides:

  • Solo: 1 replica, filesystem storage
  • MultiUser: 1 replica, SurrealDB backend
  • CI/CD: 1 replica, ephemeral storage
  • Enterprise: 3 replicas, SurrealDB cluster

orchestrator-service.yaml.ncl

Internal ClusterIP service for orchestrator with:

  • Session affinity (3-hour timeout)
  • Port 9090 (HTTP API)
  • Port 9091 (Metrics)
  • Internal access only (ClusterIP)

Mode-specific overrides:

  • Enterprise: LoadBalancer for external access
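The rendered service described above would take roughly this shape (a sketch; values mirror the bullets, not necessarily the template's exact output):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: orchestrator
  namespace: provisioning
spec:
  type: ClusterIP            # LoadBalancer in enterprise mode
  selector:
    app: orchestrator
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800  # 3 hours
  ports:
    - name: http
      port: 9090
      targetPort: 9090
    - name: metrics
      port: 9091
      targetPort: 9091
```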

control-center-deployment.yaml.ncl

Control Center policy and RBAC management with:

  • 2 replicas (enterprise mode)
  • Database integration (PostgreSQL or RocksDB)
  • RBAC and JWT configuration
  • MFA support
  • Health checks and resource limits
  • Security context (non-root user)

Environment variables:

  • Database type and URL
  • RBAC enablement
  • JWT issuer, audience, secret
  • MFA requirement
  • Log level

control-center-service.yaml.ncl

Internal ClusterIP service for Control Center with:

  • Port 8080 (HTTP API + UI)
  • Port 8081 (Metrics)
  • Session affinity

mcp-server-deployment.yaml.ncl

Model Context Protocol server for AI/LLM integration with:

  • Lightweight deployment (100m CPU, 128Mi RAM minimum)
  • Orchestrator integration
  • Control Center integration
  • MCP capabilities (tools, resources, prompts)
  • Tool concurrency limits
  • Resource size limits

Mode-specific overrides:

  • Solo: 1 replica
  • Enterprise: 2 replicas for HA

mcp-server-service.yaml.ncl

Internal ClusterIP service for MCP server with:

  • Port 8888 (HTTP API)
  • Port 8889 (Metrics)

Networking

platform-ingress.yaml.ncl

Nginx ingress for external HTTP/HTTPS routing with:

  • TLS termination with Let's Encrypt (cert-manager)
  • CORS configuration
  • Security headers (HSTS, X-Frame-Options, etc.)
  • Rate limiting (1000 RPS, 100 connections)
  • Path-based routing to services

Routes:

  • api.example.com/orchestrator → orchestrator:9090
  • control-center.example.com/ → control-center:8080
  • mcp.example.com/ → mcp-server:8888
  • orchestrator.example.com/api → orchestrator:9090
  • orchestrator.example.com/policy → control-center:8080
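One host rule from the route table above would be expressed roughly as follows (an illustrative fragment using the standard Ingress API shape):

```yaml
rules:
  - host: api.example.com
    http:
      paths:
        - path: /orchestrator
          pathType: Prefix
          backend:
            service:
              name: orchestrator
              port:
                number: 9090
```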

Namespace and Cluster Configuration

namespace.yaml.ncl

Kubernetes Namespace for provisioning platform with:

  • Pod security policies (baseline enforcement)
  • Labels for organization and monitoring
  • Annotations for description

resource-quota.yaml.ncl

ResourceQuota for resource consumption limits:

  • CPU: 8 requests / 16 limits (total)
  • Memory: 16GB requests / 32GB limits (total)
  • Storage: 200GB (persistent volumes)
  • Pod limit: 20 pods maximum
  • Services: 10 maximum
  • ConfigMaps/Secrets: 50 each
  • Deployments/StatefulSets/Jobs: Limited per type

Mode-specific overrides:

  • Solo: 4 CPU / 8GB memory, 10 pods
  • MultiUser: 8 CPU / 16GB memory, 20 pods
  • CI/CD: 16 CPU / 32GB memory, 50 pods (ephemeral)
  • Enterprise: Unlimited (managed externally)
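The base quota limits above correspond roughly to a spec like this (illustrative; key names follow the standard ResourceQuota API, and the resource name is assumed):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: provisioning-quota
  namespace: provisioning
spec:
  hard:
    requests.cpu: "8"
    limits.cpu: "16"
    requests.memory: 16Gi
    limits.memory: 32Gi
    requests.storage: 200Gi
    pods: "20"
    services: "10"
    configmaps: "50"
    secrets: "50"
```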

network-policy.yaml.ncl

NetworkPolicy for network isolation and security:

  • Ingress: Allow traffic from Nginx, inter-pod, Prometheus, DNS
  • Egress: Allow DNS queries, inter-pod, external HTTPS
  • Default: Deny all except explicitly allowed

Ports managed:

  • 9090: Orchestrator API
  • 8080: Control Center API/UI
  • 8888: MCP Server
  • 5432: PostgreSQL
  • 8000: SurrealDB
  • 53: DNS (TCP/UDP)
  • 443/80: External HTTPS/HTTP
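The default-deny baseline described above is typically expressed as an empty pod selector covering both traffic directions (a sketch; the policy name is assumed):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: provisioning
spec:
  podSelector: {}      # matches every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
```

Allow rules for Nginx, Prometheus, DNS, and inter-pod traffic are then layered on top as separate policies.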

rbac.yaml.ncl

Role-Based Access Control (RBAC) setup with:

  • ServiceAccounts: orchestrator, control-center, mcp-server
  • Roles: Minimal permissions per service
  • RoleBindings: Connect ServiceAccounts to Roles

Permissions:

  • Orchestrator: Read ConfigMaps, Secrets, Pods, Services
  • Control Center: Read/Write Secrets, ConfigMaps, Deployments
  • MCP Server: Read ConfigMaps, Secrets, Pods, Services
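As a sketch, the orchestrator's read-only permissions could be granted like this (resource names assumed; only the ServiceAccount name `orchestrator` is taken from the list above):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: orchestrator-role
  namespace: provisioning
rules:
  - apiGroups: [""]
    resources: ["configmaps", "secrets", "pods", "services"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: orchestrator-rolebinding
  namespace: provisioning
subjects:
  - kind: ServiceAccount
    name: orchestrator
    namespace: provisioning
roleRef:
  kind: Role
  name: orchestrator-role
  apiGroup: rbac.authorization.k8s.io
```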

Usage

Rendering Templates

Each template is a Nickel file that evaluates to JSON, which is then converted to YAML:

# Render a single template
nickel eval --format json orchestrator-deployment.yaml.ncl | yq -P > orchestrator-deployment.yaml

# Render all templates
for template in *.ncl; do
  nickel eval --format json "$template" | yq -P > "${template%.ncl}"
done
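Because the templates carry a double extension (`.yaml.ncl`), stripping only the `.ncl` suffix already yields the `.yaml` output name; the parameter expansion can be checked without a cluster:

```shell
# The templates are named *.yaml.ncl; removing the .ncl suffix
# produces the final .yaml filename.
template="orchestrator-deployment.yaml.ncl"
output="${template%.ncl}"
echo "$output"   # orchestrator-deployment.yaml
```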

Deploying to Kubernetes

# Create namespace
kubectl create namespace provisioning

# Create ConfigMaps for configuration
kubectl create configmap orchestrator-config \
  --from-literal=storage_backend=surrealdb \
  --from-literal=max_concurrent_tasks=50 \
  --from-literal=batch_parallel_limit=20 \
  --from-literal=log_level=info \
  -n provisioning

# Create secrets for sensitive data
kubectl create secret generic control-center-secrets \
  --from-literal=database_url="postgresql://user:pass@postgres/provisioning" \
  --from-literal=jwt_secret="your-jwt-secret-here" \
  -n provisioning

# Apply manifests
kubectl apply -f orchestrator-deployment.yaml -n provisioning
kubectl apply -f orchestrator-service.yaml -n provisioning
kubectl apply -f control-center-deployment.yaml -n provisioning
kubectl apply -f control-center-service.yaml -n provisioning
kubectl apply -f mcp-server-deployment.yaml -n provisioning
kubectl apply -f mcp-server-service.yaml -n provisioning
kubectl apply -f platform-ingress.yaml -n provisioning

Verifying Deployment

# Check deployments
kubectl get deployments -n provisioning

# Check services
kubectl get svc -n provisioning

# Check ingress
kubectl get ingress -n provisioning

# View logs
kubectl logs -n provisioning -l app=orchestrator -f
kubectl logs -n provisioning -l app=control-center -f
kubectl logs -n provisioning -l app=mcp-server -f

# Describe resource
kubectl describe deployment orchestrator -n provisioning
kubectl describe service orchestrator -n provisioning

ConfigMaps and Secrets

Required ConfigMaps

orchestrator-config

apiVersion: v1
kind: ConfigMap
metadata:
  name: orchestrator-config
  namespace: provisioning
data:
  storage_backend: "surrealdb"  # or "filesystem"
  max_concurrent_tasks: "50"    # Must match constraint.orchestrator.queue.concurrent_tasks.max
  batch_parallel_limit: "20"    # Must match constraint.orchestrator.batch.parallel_limit.max
  log_level: "info"

control-center-config

apiVersion: v1
kind: ConfigMap
metadata:
  name: control-center-config
  namespace: provisioning
data:
  database_type: "postgres"     # or "rocksdb"
  rbac_enabled: "true"
  jwt_issuer: "provisioning.local"
  jwt_audience: "orchestrator"
  mfa_required: "true"           # Enterprise only
  log_level: "info"

mcp-server-config

apiVersion: v1
kind: ConfigMap
metadata:
  name: mcp-server-config
  namespace: provisioning
data:
  protocol: "stdio"              # or "http"
  orchestrator_url: "http://orchestrator:9090"
  control_center_url: "http://control-center:8080"
  enable_tools: "true"
  enable_resources: "true"
  enable_prompts: "true"
  max_concurrent_tools: "10"
  max_resource_size: "1073741824"  # 1 GiB in bytes
  log_level: "info"

Required Secrets

control-center-secrets

apiVersion: v1
kind: Secret
metadata:
  name: control-center-secrets
  namespace: provisioning
type: Opaque
stringData:
  database_url: "postgresql://user:password@postgres:5432/provisioning"
  jwt_secret: "your-secure-random-string-here"
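Note that stringData accepts plain text; the API server base64-encodes it on storage. Base64 is reversible encoding, not encryption, which can be verified locally:

```shell
# base64 round trip: anyone with read access to the Secret can decode it,
# so protect Secrets with RBAC and encryption at rest.
secret="your-secure-random-string-here"
encoded="$(printf '%s' "$secret" | base64)"
decoded="$(printf '%s' "$encoded" | base64 -d)"
echo "$decoded"   # your-secure-random-string-here
```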

Persistence

All deployments use PersistentVolumeClaims for data storage:

# Create PersistentVolumes and PersistentVolumeClaims
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: orchestrator-data
  namespace: provisioning
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: orchestrator-logs
  namespace: provisioning
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: control-center-logs
  namespace: provisioning
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mcp-server-logs
  namespace: provisioning
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
EOF

Customization by Mode

Solo Mode Overrides

replicas: 1
resources:
  requests:
    cpu: "100m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
storageBackend: "filesystem"

MultiUser Mode Overrides

replicas: 1
resources:
  requests:
    cpu: "250m"
    memory: "512Mi"
  limits:
    cpu: "1"
    memory: "1Gi"
storageBackend: "surrealdb_server"
database: "postgres"
rbac_enabled: "true"

CI/CD Mode Overrides

replicas: 1
restartPolicy: "Never"
ttlSecondsAfterFinished: 86400  # Keep for 24 hours
storageBackend: "filesystem"
ephemeral: true

Enterprise Mode Overrides

replicas: 3
resources:
  requests:
    cpu: "1"
    memory: "1Gi"
  limits:
    cpu: "4"
    memory: "4Gi"
storageBackend: "surrealdb_cluster"
database: "postgres_ha"
rbac_enabled: "true"
mfa_required: "true"
monitoring: "enabled"

Monitoring and Observability

Prometheus Integration

All services expose metrics on ports:

  • Orchestrator: 9091
  • Control Center: 8081
  • MCP Server: 8889

ServiceMonitor for Prometheus:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: provisioning-platform
  namespace: provisioning
spec:
  selector:
    matchLabels:
      component: provisioning-platform
  endpoints:
  - port: metrics
    interval: 30s

Health Checks

  • Liveness Probe: GET /health - determines if pod is alive
  • Readiness Probe: GET /ready - determines if pod can serve traffic

Both use HTTP GET with sensible timeouts and failure thresholds.
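A probe pair of the shape described above might look like this in a rendered deployment (the timing values are illustrative, not the templates' actual settings; port 9090 assumes the orchestrator):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 9090
  initialDelaySeconds: 10
  periodSeconds: 15
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 9090
  periodSeconds: 10
  failureThreshold: 3
```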

Troubleshooting

Pod fails to start

# Check events
kubectl describe pod -n provisioning -l app=orchestrator

# Check logs
kubectl logs -n provisioning -l app=orchestrator --previous

# Check resource availability
kubectl top nodes
kubectl top pods -n provisioning

Service not reachable

# Check service DNS
kubectl exec -it <pod> -n provisioning -- nslookup orchestrator

# Check ingress routing
kubectl describe ingress platform-ingress -n provisioning

# Test connectivity from pod
kubectl run -it --rm test --restart=Never --image=busybox -n provisioning -- wget -qO- http://orchestrator:9090/health

TLS certificate issues

# Check certificate status
kubectl describe certificate platform-tls-cert -n provisioning

# Check cert-manager logs
kubectl logs -n cert-manager deployment/cert-manager -f
