values/*.ncl and expands to valid Docker Compose YAML.

Key Pattern: Templates use Nickel composition to build service definitions dynamically based on configuration, allowing parameterized infrastructure-as-code.

## Templates

### 1. platform-stack.solo.yml.ncl

Purpose: Single-developer local development stack

Services:

- orchestrator - Workflow engine
- control-center - Policy and RBAC management
- mcp-server - MCP protocol server

Configuration:

- Network: Bridge network named provisioning
- Volumes: 5 named volumes for persistence
  - orchestrator-data - Orchestrator workflows
  - control-center-data - Control Center policies
  - mcp-server-data - MCP Server cache
  - logs - Shared log volume
  - cache - Shared cache volume
- Ports:
  - 9090 - Orchestrator API
  - 8080 - Control Center UI
  - 8888 - MCP Server
- Health Checks: 30-second intervals for all services
- Logging: JSON format, 10 MB max file size, 3 backups
- Restart Policy: unless-stopped (survives host reboot)

Usage:

```bash
# Generate from Nickel template
nickel export --format json platform-stack.solo.yml.ncl | yq -P > docker-compose.solo.yml

# Start services
docker-compose -f docker-compose.solo.yml up -d

# View logs
docker-compose -f docker-compose.solo.yml logs -f

# Stop services
docker-compose -f docker-compose.solo.yml down
```

Environment Variables (recommended in a .env file):

```bash
ORCHESTRATOR_LOG_LEVEL=debug
CONTROL_CENTER_LOG_LEVEL=info
MCP_SERVER_LOG_LEVEL=info
```

---
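The Nickel-composition pattern described under Key Pattern can be sketched roughly as follows; the `base_service` helper and its field names are illustrative assumptions, not the templates' actual schema:

```nickel
# Hypothetical sketch: shared defaults merged into each service with `&`.
let base_service = {
  restart = "unless-stopped",
  logging = {
    driver = "json-file",
    options = { "max-size" = "10m", "max-file" = "3" },
  },
} in
{
  services = {
    orchestrator = base_service & {
      image = "orchestrator:latest",
      ports = ["9090:9090"],
    },
  },
}
```

Because `&` is a recursive record merge, each service only needs to state what differs from the shared defaults.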
### 2. platform-stack.multiuser.yml.ncl

Purpose: Team collaboration with persistent database storage

Services (6 total):

- postgres - Primary database (PostgreSQL 15)
- orchestrator - Workflow engine
- control-center - Policy and RBAC management
- mcp-server - MCP protocol server
- surrealdb - Workflow storage (SurrealDB server)
- gitea - Git repository hosting (optional, for version control)

Configuration:

- Network: Custom bridge network named provisioning-network
- Volumes:
  - postgres-data - PostgreSQL database files
  - orchestrator-data - Orchestrator workflows
  - control-center-data - Control Center policies
  - surrealdb-data - SurrealDB files
  - gitea-data - Gitea repositories and configuration
  - logs - Shared logs
- Ports:
  - 9090 - Orchestrator API
  - 8080 - Control Center UI
  - 8888 - MCP Server
  - 5432 - PostgreSQL (internal only)
  - 8000 - SurrealDB (internal only)
  - 3000 - Gitea web UI (optional)
  - 22 - Gitea SSH (optional)
- Service Dependencies: Explicit depends_on with health checks
  - Control Center waits for PostgreSQL
  - SurrealDB starts before Orchestrator
- Health Checks: Service-specific health checks
- Restart Policy: always (automatic recovery on failure)
- Logging: JSON format with rotation

Usage:

```bash
# Generate from Nickel template
nickel export --format json platform-stack.multiuser.yml.ncl | yq -P > docker-compose.multiuser.yml

# Create environment file
cat > .env.multiuser << 'EOF'
DB_PASSWORD=secure-postgres-password
SURREALDB_PASSWORD=secure-surrealdb-password
JWT_SECRET=secure-jwt-secret-256-bits
EOF

# Start services
docker-compose -f docker-compose.multiuser.yml --env-file .env.multiuser up -d

# Wait for all services to become healthy
docker-compose -f docker-compose.multiuser.yml ps

# Create database and initialize schema (one-time)
docker-compose exec postgres psql -U postgres -c "CREATE DATABASE provisioning;"
```
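The explicit depends_on wiring listed above could be expressed in the template along these lines. This is an illustrative Nickel fragment: the field names follow Docker Compose conventions (`depends_on` with `condition: service_healthy`, `healthcheck`) rather than the template's confirmed schema, and the pg_isready probe is a common PostgreSQL health check, not one quoted from the source:

```nickel
# Sketch: Control Center waits until PostgreSQL reports healthy.
control_center = {
  depends_on = {
    postgres = { condition = "service_healthy" },
  },
},
postgres = {
  healthcheck = {
    test = ["CMD-SHELL", "pg_isready -U postgres"],
    interval = "30s",
    timeout = "10s",
    retries = 5,
  },
},
```

With `condition = "service_healthy"`, Compose delays starting the dependent container until the dependency's health check passes, not merely until it starts.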
Database Initialization:

```bash
# Connect to PostgreSQL for schema creation
docker-compose exec postgres psql -U provisioning -d provisioning

# Connect to SurrealDB for schema setup
docker-compose exec surrealdb surreal sql --auth root:password

# Connect to Gitea web UI
# http://localhost:3000 (admin:admin by default)
```

Environment Variables (in .env.multiuser):

```bash
# Database Credentials (CRITICAL - change before production)
DB_PASSWORD=your-strong-password
SURREALDB_PASSWORD=your-strong-password

# Security
JWT_SECRET=your-256-bit-random-string

# Logging
ORCHESTRATOR_LOG_LEVEL=info
CONTROL_CENTER_LOG_LEVEL=info
MCP_SERVER_LOG_LEVEL=info

# Optional: Gitea Configuration
GITEA_DOMAIN=localhost:3000
GITEA_ROOT_URL=http://localhost:3000/
```

---

### 3. platform-stack.cicd.yml.ncl

Purpose: Ephemeral CI/CD pipeline stack with minimal persistence

Services (2 total):

- orchestrator - API-only mode (no UI, streamlined for programmatic use)
- api-gateway - Optional: request routing and authentication

Configuration:

- Network: Bridge network
- Volumes:
  - orchestrator-tmpfs - Temporary storage (tmpfs - in-memory, no persistence)
- Ports:
  - 9090 - Orchestrator API (read-only orchestrator state)
  - 8000 - API Gateway (optional)
- Health Checks: Fast checks (10-second intervals)
- Restart Policy: no (containers do not auto-restart)
- Logging: Minimal (only warnings and errors)
- Cleanup: All artifacts deleted when containers stop

Characteristics:

- Ephemeral: No persistent storage (uses tmpfs)
- Fast Startup: Minimal services, quick boot time
- API-First: No UI; command-line/API integration only
- Stateless: Clean slate each run
- Low Resource: Minimal memory/CPU footprint

Usage:

```bash
# Generate from Nickel template
nickel export --format json platform-stack.cicd.yml.ncl | yq -P > docker-compose.cicd.yml

# Start ephemeral stack
docker-compose -f docker-compose.cicd.yml up

# Run CI/CD commands (in parallel terminal)
curl -X POST http://localhost:9090/api/workflows \
  -H "Content-Type: application/json" \
  -d @workflow.json
```
```bash
# Stop and cleanup (all data lost)
docker-compose -f docker-compose.cicd.yml down

# Or with volume cleanup
docker-compose -f docker-compose.cicd.yml down -v
```

CI/CD Integration Example:

```yaml
# GitHub Actions workflow steps
- name: Start Provisioning Stack
  run: docker-compose -f docker-compose.cicd.yml up -d

- name: Run Tests
  run: |
    ./tests/integration.sh
    curl -X GET http://localhost:9090/health

- name: Cleanup
  if: always()
  run: docker-compose -f docker-compose.cicd.yml down -v
```

Environment Variables (minimal):

```bash
# Logging (optional)
ORCHESTRATOR_LOG_LEVEL=warn
```

---

### 4. platform-stack.enterprise.yml.ncl

Purpose: Production-grade high-availability deployment

Services (10+ total):

- postgres - PostgreSQL 15 (primary database)
- orchestrator (3 replicas) - Load-balanced workflow engine
- control-center (2 replicas) - Load-balanced policy management
- mcp-server (1-2 replicas) - MCP server for AI integration
- surrealdb-1, surrealdb-2, surrealdb-3 - SurrealDB cluster (3 nodes)
- nginx - Load balancer and reverse proxy
- prometheus - Metrics collection
- grafana - Visualization and dashboards
- loki - Log aggregation

Configuration:

- Network: Custom bridge network named provisioning-enterprise
- Volumes:
  - postgres-data - PostgreSQL HA storage
  - surrealdb-node-1, surrealdb-node-2, surrealdb-node-3 - Cluster storage
  - prometheus-data - Metrics storage
  - grafana-data - Grafana configuration
  - loki-data - Log storage
  - logs - Shared log aggregation
- Ports:
  - 80 - HTTP (Nginx reverse proxy)
  - 443 - HTTPS (TLS - requires certificates)
  - 9090 - Orchestrator API (internal)
  - 8080 - Control Center UI (internal)
  - 8888 - MCP Server (internal)
  - 5432 - PostgreSQL (internal only)
  - 8000 - SurrealDB cluster (internal)
  - 9091 - Prometheus metrics (internal)
  - 3000 - Grafana dashboards (external)
- Service Dependencies:
  - Control Center waits for PostgreSQL
  - Orchestrator waits for SurrealDB cluster
  - MCP Server waits for Orchestrator and Control Center
  - Prometheus waits for all services
- Health Checks: 30-second intervals with 10-second timeout
- Restart Policy: always (high availability)
- Load Balancing: Nginx upstream blocks for orchestrator and control-center
- Logging: JSON format, 500 MB files, 30 rotated files kept

Architecture:

```
                 ┌─────────────────────┐
                 │   External Client   │
                 │  (HTTPS, Port 443)  │
                 └──────────┬──────────┘
                            │
                 ┌──────────▼──────────┐
                 │  Nginx Load         │
                 │  Balancer           │
                 │  (TLS, CORS,        │
                 │   Rate Limiting)    │
                 └───┬──────┬──────┬───┘
                     │      │      │
      ┌──────────────▼─┐ ┌──▼───────┐ ┌─▼──────────┐
      │  Orchestrator  │ │ Control  │ │ MCP Server │
      │   (3 copies)   │ │ Center   │ │(1-2 copies)│
      │                │ │(2 copies)│ │            │
      └───────┬────────┘ └────┬─────┘ └─────┬──────┘
              │               │             │
      ┌───────▼───────────────▼─────────────▼──────┐
      │  SurrealDB Cluster   │   PostgreSQL HA     │
      │      (3 nodes)       │  (Primary/Replica)  │
      └──────────────────────┴─────────────────────┘

Observability Stack:
┌────────────┬───────────┬───────────┐
│ Prometheus │  Grafana  │   Loki    │
│ (Metrics)  │(Dashboard)│  (Logs)   │
└────────────┴───────────┴───────────┘
```

Usage:

```bash
# Generate from Nickel template
nickel export --format json platform-stack.enterprise.yml.ncl | yq -P > docker-compose.enterprise.yml

# Create environment file with secrets
cat > .env.enterprise << 'EOF'
# Database
DB_PASSWORD=generate-strong-password
SURREALDB_PASSWORD=generate-strong-password

# Security
JWT_SECRET=generate-256-bit-random-string
ADMIN_PASSWORD=generate-strong-admin-password

# TLS Certificates
TLS_CERT_PATH=/path/to/cert.pem
TLS_KEY_PATH=/path/to/key.pem

# Logging and Monitoring
PROMETHEUS_RETENTION=30d
GRAFANA_ADMIN_PASSWORD=generate-strong-password
LOKI_RETENTION_DAYS=30
EOF

# Start entire stack
docker-compose -f docker-compose.enterprise.yml --env-file .env.enterprise up -d

# Verify all services are healthy
docker-compose -f docker-compose.enterprise.yml ps
```
```bash
# Check load balancer status
curl -H "Host: orchestrator.example.com" http://localhost/health

# Access monitoring
# Grafana:     http://localhost:3000 (admin/password)
# Prometheus:  http://localhost:9091 (internal)
# Loki:        http://localhost:3100 (internal)
```

Production Checklist:

- [ ] Generate strong database passwords (32+ characters)
- [ ] Generate strong JWT secret (256-bit random string)
- [ ] Provision valid TLS certificates (not self-signed)
- [ ] Configure Nginx upstream health checks
- [ ] Set up log retention policies (30+ days)
- [ ] Enable Prometheus scraping with 15-second intervals
- [ ] Configure Grafana dashboards and alerts
- [ ] Test SurrealDB cluster failover
- [ ] Document backup procedures
- [ ] Enable PostgreSQL replication and backups
- [ ] Configure external log aggregation (ELK stack, Splunk, etc.)

Environment Variables (in .env.enterprise):

```bash
# Database Credentials (CRITICAL)
DB_PASSWORD=your-strong-password-32-chars-min
SURREALDB_PASSWORD=your-strong-password-32-chars-min

# Security
JWT_SECRET=your-256-bit-random-base64-encoded-string
ADMIN_PASSWORD=your-strong-admin-password

# TLS/HTTPS
TLS_CERT_PATH=/etc/provisioning/certs/server.crt
TLS_KEY_PATH=/etc/provisioning/certs/server.key

# Logging and Monitoring
PROMETHEUS_RETENTION=30d
PROMETHEUS_SCRAPE_INTERVAL=15s
GRAFANA_ADMIN_USER=admin
GRAFANA_ADMIN_PASSWORD=your-strong-grafana-password
LOKI_RETENTION_DAYS=30

# Optional: External Integrations
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/xxxxxxx
PAGERDUTY_INTEGRATION_KEY=your-pagerduty-key
```

---
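The replica counts above could plausibly be generated inside the template rather than written out by hand. The following is only a sketch: it assumes the template uses Nickel's standard library (`std.array.generate`, `std.array.fold_left`, `std.string.from_number`) and these service field names, none of which are confirmed by the source:

```nickel
let orchestrator_replicas = 3 in
# Build one service record per replica: orchestrator-0, orchestrator-1, ...
let make_replica = fun i =>
  { "orchestrator-%{std.string.from_number i}" = {
      image = "orchestrator:latest",
      restart = "always",
    },
  }
in
std.array.generate make_replica orchestrator_replicas
|> std.array.fold_left (fun acc svc => acc & svc) {}
```

Under this scheme, scaling up or down only touches the single `let orchestrator_replicas` binding.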
## Workflow: From Nickel to Docker Compose

### 1. Configuration Source (values/*.ncl)

```nickel
# values/orchestrator.enterprise.ncl
{
  orchestrator = {
    server = {
      host = "0.0.0.0",
      port = 9090,
      workers = 8,
    },
    storage = {
      backend = 'surrealdb_cluster,
      surrealdb_url = "surrealdb://surrealdb-1:8000",
    },
    queue = {
      max_concurrent_tasks = 100,
      retry_attempts = 5,
      task_timeout = 7200000,
    },
    monitoring = {
      enabled = true,
      metrics_interval = 10,
    },
  },
}
```

### 2. Template Generation (Nickel → JSON)

```bash
# Export the Nickel config as JSON
nickel export --format json platform-stack.enterprise.yml.ncl
```

### 3. YAML Conversion (JSON → YAML)

```bash
# Convert the JSON output to YAML
nickel export --format json platform-stack.enterprise.yml.ncl | yq -P > docker-compose.enterprise.yml
```

### 4. Deployment (YAML → Running Containers)

```bash
# Start all services defined in the YAML
docker-compose -f docker-compose.enterprise.yml up -d
```

---

## Common Customizations

### Change Service Replicas

Edit the template to adjust replica counts:

```nickel
# In platform-stack.enterprise.yml.ncl
let orchestrator_replicas = 5 in   # instead of 3
let control_center_replicas = 3 in # instead of 2
services.orchestrator_replicas
```

### Add Custom Service

Add to the template's services record:

```nickel
# In platform-stack.enterprise.yml.ncl
services = base_services & {
  custom_service = {
    image = "custom:latest",
    ports = ["9999:9999"],
    volumes = ["custom-data:/data"],
    restart = "always",
    healthcheck = {
      test = ["CMD", "curl", "-f", "http://localhost:9999/health"],
      interval = "30s",
      timeout = "10s",
      retries = 3,
    },
  },
}
```

### Modify Resource Limits

In each service definition:

```nickel
orchestrator = {
  deploy = {
    resources = {
      limits = {
        cpus = "2.0",
        memory = "2G",
      },
      reservations = {
        cpus = "1.0",
        memory = "1G",
      },
    },
  },
}
```

---

## Validation and Testing

### Syntax Validation

```bash
# Validate the YAML before deploying
docker-compose -f docker-compose.enterprise.yml config --quiet
```
```bash
# Check service definitions
docker-compose -f docker-compose.enterprise.yml ps
```

### Health Checks

```bash
# Monitor health of all services
watch docker-compose ps

# Check a specific service's health
docker-compose exec orchestrator curl -s http://localhost:9090/health
```

### Log Inspection

```bash
# View logs from all services
docker-compose logs -f

# View logs from a specific service
docker-compose logs -f orchestrator

# Follow a specific container
docker logs -f $(docker ps | grep orchestrator | awk '{print $1}')
```

---

## Troubleshooting

### Port Already in Use

Error: bind: address already in use

Fix: Change the port in the template or stop the conflicting process:

```bash
# Find the process using the port
lsof -i :9090

# Kill the process
kill -9 <PID>
```

Or change the port mapping in the docker-compose file:

```yaml
ports:
  - "9999:9090"  # expose on 9999 instead
```

### Service Fails to Start

Check the logs:

```bash
docker-compose logs orchestrator
```

Common causes:

- Port conflict - check whether another service uses the port
- Missing volume - create the volume before starting
- Network connectivity - verify the docker network exists
- Database not ready - wait for the db service to become healthy
- Configuration error - validate the YAML syntax

### Persistent Volume Issues

Clean volumes (WARNING: deletes data):

```bash
docker-compose down -v
docker volume prune -f
```

---

## See Also

- Kubernetes Templates: ../kubernetes/ - For production K8s deployments
- Configuration System: ../../ - Full configuration documentation
- Examples: ../../examples/ - Example deployment scenarios
- Scripts: ../../scripts/ - Automation scripts

---

Version: 1.0
Last Updated: 2025-01-05
Status: Production Ready