# Service Management Guide

**Version**: 1.0.0
**Last Updated**: 2025-10-06

## Table of Contents

1. [Overview](#overview)
2. [Service Architecture](#service-architecture)
3. [Service Registry](#service-registry)
4. [Platform Commands](#platform-commands)
5. [Service Commands](#service-commands)
6. [Deployment Modes](#deployment-modes)
7. [Health Monitoring](#health-monitoring)
8. [Dependency Management](#dependency-management)
9. [Pre-flight Checks](#pre-flight-checks)
10. [Troubleshooting](#troubleshooting)
11. [Advanced Usage](#advanced-usage)
12. [Related Documentation](#related-documentation)
13. [Quick Reference](#quick-reference)

---

## Overview

The Service Management System manages the full lifecycle (start, stop, restart, health monitoring) of all platform services: orchestrator, control-center, CoreDNS, Gitea, OCI registry, MCP server, and API gateway.

### Key Features

- **Unified Service Management**: Single interface for all services
- **Automatic Dependency Resolution**: Starts services in the correct order
- **Health Monitoring**: Continuous health checks with automatic recovery
- **Multiple Deployment Modes**: Binary, Docker, Docker Compose, Kubernetes, Remote
- **Pre-flight Checks**: Validates prerequisites before operations
- **Service Registry**: Centralized service configuration

### Supported Services

| Service | Type | Category | Description |
| --------- | ------ | ---------- | ------------- |
| orchestrator | Platform | Orchestration | Rust-based workflow coordinator |
| control-center | Platform | UI | Web-based management interface |
| coredns | Infrastructure | DNS | Local DNS resolution |
| gitea | Infrastructure | Git | Self-hosted Git service |
| oci-registry | Infrastructure | Registry | OCI-compliant container registry |
| mcp-server | Platform | API | Model Context Protocol server |
| api-gateway | Platform | API | Unified REST API gateway |

---

## Service Architecture

### System Architecture

```
┌─────────────────────────────────────────┐
│         Service Management CLI          │
│      (platform/services commands)       │
└─────────────────┬───────────────────────┘
                  │
       ┌──────────┴──────────┐
       │                     │
       ▼                     ▼
┌──────────────┐     ┌───────────────┐
│   Manager    │     │   Lifecycle   │
│   (Core)     │     │   (Start/Stop)│
└──────┬───────┘     └───────┬───────┘
       │                     │
       ▼                     ▼
┌──────────────┐     ┌───────────────┐
│   Health     │     │  Dependencies │
│   (Checks)   │     │  (Resolution) │
└──────────────┘     └───────────────┘
       │                     │
       └────────┬────────────┘
                │
                ▼
        ┌────────────────┐
        │   Pre-flight   │
        │  (Validation)  │
        └────────────────┘
```

### Component Responsibilities

**Manager** (`manager.nu`)

- Service registry loading
- Service status tracking
- State persistence

**Lifecycle** (`lifecycle.nu`)

- Service start/stop operations
- Deployment mode handling
- Process management

**Health** (`health.nu`)

- Health check execution
- HTTP/TCP/Command/File checks
- Continuous monitoring

**Dependencies** (`dependencies.nu`)

- Dependency graph analysis
- Topological sorting
- Startup order calculation

**Pre-flight** (`preflight.nu`)

- Prerequisite validation
- Conflict detection
- Auto-start orchestration

---

## Service Registry

### Configuration File

**Location**: `provisioning/config/services.toml`

### Service Definition Structure

```
[services.]
name = ""
type = "platform" | "infrastructure" | "utility"
category = "orchestration" | "auth" | "dns" | "git" | "registry" | "api" | "ui"
description = "Service description"
required_for = ["operation1", "operation2"]
dependencies = ["dependency1", "dependency2"]
conflicts = ["conflicting-service"]

[services..deployment]
mode = "binary" | "docker" | "docker-compose" | "kubernetes" | "remote"

# Mode-specific configuration
[services..deployment.binary]
binary_path = "/path/to/binary"
args = ["--arg1", "value1"]
working_dir = "/working/directory"
env = { KEY = "value" }

[services..health_check]
type = "http" | "tcp" | "command" | "file" | "none"
interval = 10
retries = 3
timeout = 5

[services..health_check.http]
endpoint = "http://localhost:9090/health"
expected_status = 200
method = "GET"

[services..startup]
auto_start = true
start_timeout = 30
start_order = 10
restart_on_failure = true
max_restarts = 3
```

### Example: Orchestrator Service

```
[services.orchestrator]
name = "orchestrator"
type = "platform"
category = "orchestration"
description = "Rust-based orchestrator for workflow coordination"
required_for = ["server", "taskserv", "cluster", "workflow", "batch"]

[services.orchestrator.deployment]
mode = "binary"

[services.orchestrator.deployment.binary]
binary_path = "${HOME}/.provisioning/bin/provisioning-orchestrator"
args = ["--port", "8080", "--data-dir", "${HOME}/.provisioning/orchestrator/data"]

[services.orchestrator.health_check]
type = "http"

[services.orchestrator.health_check.http]
endpoint = "http://localhost:9090/health"
expected_status = 200

[services.orchestrator.startup]
auto_start = true
start_timeout = 30
start_order = 10
```

---

## Platform Commands

Platform commands manage all services as a cohesive system.

### Start Platform

Start all auto-start services or specific services:

```
# Start all auto-start services
provisioning platform start

# Start specific services (with dependencies)
provisioning platform start orchestrator control-center

# Force restart if already running
provisioning platform start --force orchestrator
```

**Behavior**:

1. Resolves dependencies
2. Calculates startup order (topological sort)
3. Starts services in the correct order
4. Waits for health checks
5. Reports success/failure

### Stop Platform

Stop all running services or specific services:

```
# Stop all running services
provisioning platform stop

# Stop specific services
provisioning platform stop orchestrator control-center

# Force stop (kill -9)
provisioning platform stop --force orchestrator
```

**Behavior**:

1. Checks for dependent services
2. Stops in reverse dependency order
3. Updates service state
4. Cleans up PID files

### Restart Platform

Restart running services:

```
# Restart all running services
provisioning platform restart

# Restart specific services
provisioning platform restart orchestrator
```

### Platform Status

Show status of all services:

```
provisioning platform status
```

**Output**:

```
Platform Services Status

Running: 3/7

=== ORCHESTRATION ===
  🟢 orchestrator - running (uptime: 3600s) ✅

=== UI ===
  🟢 control-center - running (uptime: 3550s) ✅

=== DNS ===
  ⚪ coredns - stopped ❓

=== GIT ===
  ⚪ gitea - stopped ❓

=== REGISTRY ===
  ⚪ oci-registry - stopped ❓

=== API ===
  🟢 mcp-server - running (uptime: 3540s) ✅
  ⚪ api-gateway - stopped ❓
```

### Platform Health

Check health of all running services:

```
provisioning platform health
```

**Output**:

```
Platform Health Check

✅ orchestrator: Healthy - HTTP health check passed
✅ control-center: Healthy - HTTP status 200 matches expected
⚪ coredns: Not running
✅ mcp-server: Healthy - HTTP health check passed

Summary: 3 healthy, 0 unhealthy, 4 not running
```

### Platform Logs

View service logs:

```
# View last 50 lines
provisioning platform logs orchestrator

# View last 100 lines
provisioning platform logs orchestrator --lines 100

# Follow logs in real-time
provisioning platform logs orchestrator --follow
```

---

## Service Commands

Individual service management commands.

### List Services

```
# List all services
provisioning services list

# List only running services
provisioning services list --running

# Filter by category
provisioning services list --category orchestration
```

**Output**:

```
name            type            category       status   deployment_mode  auto_start
orchestrator    platform        orchestration  running  binary           true
control-center  platform        ui             stopped  binary           false
coredns         infrastructure  dns            stopped  docker           false
```

### Service Status

Get detailed status of a service:

```
provisioning services status orchestrator
```

**Output**:

```
Service: orchestrator
Type: platform
Category: orchestration
Status: running
Deployment: binary
Health: healthy
Auto-start: true
PID: 12345
Uptime: 3600s
Dependencies: []
```

### Start Service

```
# Start service (with pre-flight checks)
provisioning services start orchestrator

# Force start (skip checks)
provisioning services start orchestrator --force
```

**Pre-flight Checks**:

1. Validate prerequisites (binary exists, Docker running, etc.)
2. Check for conflicts
3. Verify dependencies are running
4. Auto-start dependencies if needed

### Stop Service

```
# Stop service (with dependency check)
provisioning services stop orchestrator

# Force stop (ignore dependents)
provisioning services stop orchestrator --force
```

### Restart Service

```
provisioning services restart orchestrator
```

### Service Health

Check service health:

```
provisioning services health orchestrator
```

**Output**:

```
Service: orchestrator
Status: healthy
Healthy: true
Message: HTTP health check passed
Check type: http
Check duration: 15 ms
```

### Service Logs

```
# View logs
provisioning services logs orchestrator

# Follow logs
provisioning services logs orchestrator --follow

# Custom line count
provisioning services logs orchestrator --lines 200
```

### Check Required Services

Check which services are required for an operation:

```
provisioning services check server
```

**Output**:

```
Operation: server
Required services: orchestrator
All running: true
```

### Service Dependencies

View dependency graph:

```
# View all dependencies
provisioning services dependencies

# View specific service dependencies
provisioning services dependencies control-center
```

### Validate Services

Validate all service configurations:

```
provisioning services validate
```

**Output**:

```
Total services: 7
Valid: 6
Invalid: 1

Invalid services:
  ❌ coredns:
    - Docker is not installed or not running
```

### Readiness Report

Get platform readiness report:

```
provisioning services readiness
```

**Output**:

```
Platform Readiness Report

Total services: 7
Running: 3
Ready to start: 6

Services:
  🟢 orchestrator - platform - orchestration
  🟢 control-center - platform - ui
  🔴 coredns - infrastructure - dns
     Issues: 1
  🟡 gitea - infrastructure - git
```

### Monitor Service

Continuous health monitoring:

```
# Monitor with default interval (30s)
provisioning services monitor orchestrator

# Custom interval
provisioning services monitor orchestrator --interval 10
```

---

## Deployment Modes

### Binary Deployment

Run services as native binaries.

**Configuration**:

```
[services.orchestrator.deployment]
mode = "binary"

[services.orchestrator.deployment.binary]
binary_path = "${HOME}/.provisioning/bin/provisioning-orchestrator"
args = ["--port", "8080"]
working_dir = "${HOME}/.provisioning/orchestrator"
env = { RUST_LOG = "info" }
```

**Process Management**:

- PID tracking in `~/.provisioning/services/pids/`
- Log output to `~/.provisioning/services/logs/`
- State tracking in `~/.provisioning/services/state/`

### Docker Deployment

Run services as Docker containers.

**Configuration**:

```
[services.coredns.deployment]
mode = "docker"

[services.coredns.deployment.docker]
image = "coredns/coredns:1.11.1"
container_name = "provisioning-coredns"
ports = ["5353:53/udp"]
volumes = ["${HOME}/.provisioning/coredns/Corefile:/Corefile:ro"]
restart_policy = "unless-stopped"
```

**Prerequisites**:

- Docker daemon running
- Docker CLI installed

### Docker Compose Deployment

Run services via Docker Compose.

**Configuration**:

```
[services.platform.deployment]
mode = "docker-compose"

[services.platform.deployment.docker_compose]
compose_file = "${HOME}/.provisioning/platform/docker-compose.yaml"
service_name = "orchestrator"
project_name = "provisioning"
```

**File**: `provisioning/platform/docker-compose.yaml`

### Kubernetes Deployment

Run services on Kubernetes.

**Configuration**:

```
[services.orchestrator.deployment]
mode = "kubernetes"

[services.orchestrator.deployment.kubernetes]
namespace = "provisioning"
deployment_name = "orchestrator"
manifests_path = "${HOME}/.provisioning/k8s/orchestrator/"
```

**Prerequisites**:

- kubectl installed and configured
- Kubernetes cluster accessible

### Remote Deployment

Connect to remotely running services.

**Configuration**:

```
[services.orchestrator.deployment]
mode = "remote"

[services.orchestrator.deployment.remote]
endpoint = "https://orchestrator.example.com"
tls_enabled = true
auth_token_path = "${HOME}/.provisioning/tokens/orchestrator.token"
```

---

## Health Monitoring

### Health Check Types

#### HTTP Health Check

```
[services.orchestrator.health_check]
type = "http"

[services.orchestrator.health_check.http]
endpoint = "http://localhost:9090/health"
expected_status = 200
method = "GET"
```

#### TCP Health Check

```
[services.coredns.health_check]
type = "tcp"

[services.coredns.health_check.tcp]
host = "localhost"
port = 5353
```

#### Command Health Check

```
[services.custom.health_check]
type = "command"

[services.custom.health_check.command]
command = "systemctl is-active myservice"
expected_exit_code = 0
```

#### File Health Check

```
[services.custom.health_check]
type = "file"

[services.custom.health_check.file]
path = "/var/run/myservice.pid"
must_exist = true
```

### Health Check Configuration

- `interval`: Seconds between checks (default: 10)
- `retries`: Max retry attempts (default: 3)
- `timeout`: Check timeout in seconds (default: 5)

### Continuous Monitoring

```
provisioning services monitor orchestrator --interval 30
```

**Output**:

```
Starting health monitoring for orchestrator (interval: 30s)
Press Ctrl+C to stop
2025-10-06 14:30:00 ✅ orchestrator: HTTP health check passed
2025-10-06 14:30:30 ✅ orchestrator: HTTP health check passed
2025-10-06 14:31:00 ✅ orchestrator: HTTP health check passed
```

---

## Dependency Management

### Dependency Graph

Services can depend on other services:

```
[services.control-center]
dependencies = ["orchestrator"]

[services.api-gateway]
dependencies = ["orchestrator", "control-center", "mcp-server"]
```

### Startup Order

Services start in topological order:

```
orchestrator (order: 10)
  └─> control-center (order: 20)
      └─> api-gateway (order: 45)
```

### Dependency Resolution

Dependencies are resolved automatically when starting services:

```
# Starting control-center automatically starts orchestrator first
provisioning services start control-center
```

**Output**:

```
Starting dependency: orchestrator
✅ Started orchestrator with PID 12345
Waiting for orchestrator to become healthy...
✅ Service orchestrator is healthy
Starting service: control-center
✅ Started control-center with PID 12346
✅ Service control-center is healthy
```

### Conflicts

Services can conflict with each other:

```
[services.coredns]
conflicts = ["dnsmasq", "systemd-resolved"]
```

Starting a service while one of its declared conflicts is running fails:

```
provisioning services start coredns
```

**Output**:

```
❌ Pre-flight check failed: conflicts
Conflicting services running: dnsmasq
```

### Reverse Dependencies

Check which services depend on a service:

```
provisioning services dependencies orchestrator
```

**Output**:

```
## orchestrator
- Type: platform
- Category: orchestration
- Required by:
  - control-center
  - mcp-server
  - api-gateway
```

### Safe Stop

The system prevents stopping services with running dependents:

```
provisioning services stop orchestrator
```

**Output**:

```
❌ Cannot stop orchestrator:
  Dependent services running: control-center, mcp-server, api-gateway
  Use --force to stop anyway
```

---

## Pre-flight Checks

### Purpose

Pre-flight checks verify that a service can start successfully before the start is attempted.

### Check Types

1. **Prerequisites**: Binary exists, Docker running, etc.
2. **Conflicts**: No conflicting services running
3. **Dependencies**: All dependencies available

### Automatic Checks

Pre-flight checks run automatically when starting services:

```
provisioning services start orchestrator
```

**Check Process**:

```
Running pre-flight checks for orchestrator...
✅ Binary found: /Users/user/.provisioning/bin/provisioning-orchestrator
✅ No conflicts detected
✅ All dependencies available
Starting service: orchestrator
```

### Manual Validation

Validate all services:

```
provisioning services validate
```

Inspect a specific service:

```
provisioning services status orchestrator
```

### Auto-Start

Services with `auto_start = true` can be started automatically when needed:

```
# Orchestrator auto-starts if needed for server operations
provisioning server create
```

**Output**:

```
Starting required services...
✅ Orchestrator started
Creating server...
```

---

## Troubleshooting

### Service Won't Start

**Check prerequisites**:

```
provisioning services validate
provisioning services status 
```

**Common issues**:

- Binary not found: Check `binary_path` in config
- Docker not running: Start Docker daemon
- Port already in use: Check for conflicting processes
- Dependencies not running: Start dependencies first

### Service Health Check Failing

**View health status**:

```
provisioning services health 
```

**Check logs**:

```
provisioning services logs --follow
```

**Common issues**:

- Service not fully initialized: Wait longer or increase `start_timeout`
- Wrong health check endpoint: Verify endpoint in config
- Network issues: Check firewall, port bindings

### Dependency Issues

**View dependency tree**:

```
provisioning services dependencies 
```

**Check dependency status**:

```
provisioning services status 
```

**Start with dependencies**:

```
provisioning platform start 
```

### Circular Dependencies

**Validate dependency graph**:

```
# This is done automatically but you can check manually
nu -c "use lib_provisioning/services/mod.nu *; validate-dependency-graph"
```

### PID File Stale

If a service is reported as running but is not:

```
# Manual cleanup
rm ~/.provisioning/services/pids/.pid

# Force restart
provisioning services restart 
```

### Port Conflicts

**Find process using port**:

```
lsof -i :9090
```

**Kill conflicting process**:

```
kill 
```

### Docker Issues

**Check Docker status**:

```
docker ps
docker info
```

**View container logs**:

```
docker logs provisioning-
```

**Restart Docker daemon**:

```
# macOS
killall Docker && open /Applications/Docker.app

# Linux
systemctl restart docker
```

### Service Logs

**View recent logs**:

```
tail -f ~/.provisioning/services/logs/.log
```

**Search logs**:

```
grep "ERROR" ~/.provisioning/services/logs/.log
```

---

## Advanced Usage

### Custom Service Registration

Add custom services by editing `provisioning/config/services.toml`.

### Integration with Workflows

Services automatically start when required by workflows:

```
# Orchestrator starts automatically if not running
provisioning workflow submit my-workflow
```

### CI/CD Integration

```
# GitLab CI
before_script:
  - provisioning platform start orchestrator
  - provisioning services health orchestrator

test:
  script:
    - provisioning test quick kubernetes
```

### Monitoring Integration

Services can integrate with monitoring systems via health endpoints.

---

## Related Documentation

- Orchestrator README
- [Test Environment Guide](test-environment-guide.md)
- [Workflow Management](workflow-management.md)

---

## Quick Reference

**Version**: 1.0.0

### Platform Commands (Manage All Services)

```
# Start all auto-start services
provisioning platform start

# Start specific services with dependencies
provisioning platform start control-center mcp-server

# Stop all running services
provisioning platform stop

# Stop specific services
provisioning platform stop orchestrator

# Restart services
provisioning platform restart

# Show platform status
provisioning platform status

# Check platform health
provisioning platform health

# View service logs
provisioning platform logs orchestrator --follow
```

---

### Service Commands (Individual Services)

```
# List all services
provisioning services list

# List only running services
provisioning services list --running

# Filter by category
provisioning services list --category orchestration

# Service status
provisioning services status orchestrator

# Start service (with pre-flight checks)
provisioning services start orchestrator

# Force start (skip checks)
provisioning services start orchestrator --force

# Stop service
provisioning services stop orchestrator

# Force stop (ignore dependents)
provisioning services stop orchestrator --force

# Restart service
provisioning services restart orchestrator

# Check health
provisioning services health orchestrator

# View logs
provisioning services logs orchestrator --follow --lines 100

# Monitor health continuously
provisioning services monitor orchestrator --interval 30
```

---

### Dependency & Validation

```
# View dependency graph
provisioning services dependencies

# View specific service dependencies
provisioning services dependencies control-center

# Validate all services
provisioning services validate

# Check readiness
provisioning services readiness

# Check required services for operation
provisioning services check server
```

---

### Registered Services

| Service | Port | Type | Auto-Start | Dependencies |
| --------- | ------ | ------ | ------------ | -------------- |
| orchestrator | 8080 | Platform | Yes | - |
| control-center | 8081 | Platform | No | orchestrator |
| coredns | 5353 | Infrastructure | No | - |
| gitea | 3000, 222 | Infrastructure | No | - |
| oci-registry | 5000 | Infrastructure | No | - |
| mcp-server | 8082 | Platform | No | orchestrator |
| api-gateway | 8083 | Platform | No | orchestrator, control-center, mcp-server |

---

### Docker Compose

```
# Start all services
cd provisioning/platform
docker-compose up -d

# Start specific services
docker-compose up -d orchestrator control-center

# Check status
docker-compose ps

# View logs
docker-compose logs -f orchestrator

# Stop all services
docker-compose down

# Stop and remove volumes
docker-compose down -v
```

---

### Service State Directories

```
~/.provisioning/services/
├── pids/    # Process ID files
├── state/   # Service state (JSON)
└── logs/    # Service logs
```

---

### Health Check Endpoints

| Service | Endpoint | Type |
| --------- | ---------- | ------ |
| orchestrator | | HTTP |
| control-center | | HTTP |
| coredns | localhost:5353 | TCP |
| gitea | | HTTP |
| oci-registry | | HTTP |
| mcp-server | | HTTP |
| api-gateway | | HTTP |

---

### Common Workflows

#### Start Platform for Development

```
# Start core services
provisioning platform start orchestrator

# Check status
provisioning platform status

# Check health
provisioning platform health
```

#### Start Full Platform Stack

```
# Use Docker Compose
cd provisioning/platform
docker-compose up -d

# Verify
docker-compose ps
provisioning platform health
```

#### Debug Service Issues

```
# Check service status
provisioning services status 

# View logs
provisioning services logs --follow

# Check health
provisioning services health 

# Validate prerequisites
provisioning services validate

# Restart service
provisioning services restart 
```

#### Safe Service Shutdown

```
# Check dependents
nu -c "use lib_provisioning/services/mod.nu *; can-stop-service orchestrator"

# Stop with dependency check
provisioning services stop orchestrator

# Force stop if needed
provisioning services stop orchestrator --force
```

---

### Troubleshooting

#### Service Won't Start

```
# 1. Check prerequisites
provisioning services validate

# 2. View detailed status
provisioning services status 

# 3. Check logs
provisioning services logs 

# 4. Verify binary/image exists
ls ~/.provisioning/bin/
docker images | grep 
```

#### Health Check Failing

```
# Check endpoint manually
curl http://localhost:9090/health

# View health details
provisioning services health 

# Monitor continuously
provisioning services monitor --interval 10
```

#### PID File Stale

```
# Remove stale PID file
rm ~/.provisioning/services/pids/.pid

# Restart service
provisioning services restart 
```

#### Port Already in Use

```
# Find process using port
lsof -i :9090

# Kill process
kill 

# Restart service
provisioning services start 
```

---

### Integration with Operations

#### Server Operations

```
# Orchestrator auto-starts if needed
provisioning server create

# Manual check
provisioning services check server
```

#### Workflow Operations

```
# Orchestrator auto-starts
provisioning workflow submit my-workflow

# Check status
provisioning services status orchestrator
```

#### Test Operations

```
# Orchestrator required for test environments
provisioning test quick kubernetes

# Pre-flight check
provisioning services check test-env
```

---

### Advanced Usage

#### Custom Service Startup Order

Services start based on:

1. Dependency order (topological sort)
2. `start_order` field (lower = earlier)

#### Auto-Start Configuration

Edit `provisioning/config/services.toml`:

```
[services..startup]
auto_start = true     # Enable auto-start
start_timeout = 30    # Timeout in seconds
start_order = 10      # Startup priority
```

#### Health Check Configuration

```
[services..health_check]
type = "http"    # http, tcp, command, file
interval = 10    # Seconds between checks
retries = 3      # Max retry attempts
timeout = 5      # Check timeout

[services..health_check.http]
endpoint = "http://localhost:9090/health"
expected_status = 200
```

---

### Key Files

- **Service Registry**: `provisioning/config/services.toml`
- **KCL Schema**: `provisioning/kcl/services.k`
- **Docker Compose**: `provisioning/platform/docker-compose.yaml`
- **User Guide**: `docs/user/SERVICE_MANAGEMENT_GUIDE.md`

---

### Getting Help

```
# View documentation
less docs/user/SERVICE_MANAGEMENT_GUIDE.md

# Run verification
nu provisioning/core/nulib/tests/verify_services.nu

# Check readiness
provisioning services readiness
```

---

**Quick Tip**: Use the `--help` flag with any command for detailed usage information.

---

**Maintained By**: Platform Team
**Support**: [GitHub Issues](https://github.com/your-org/provisioning/issues)
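
---

### Appendix: Startup Order Sketch

The startup rule stated in this guide (dependency order first, then the `start_order` field, lower first) can be illustrated with a small topological sort. This is a minimal sketch in Python, not the code in `dependencies.nu`: the function name `startup_order` and the `SERVICES` dictionary are hypothetical, the dependencies are copied from the Registered Services table, and the `start_order` of 30 for `mcp-server` is an assumed value (the guide only shows 10, 20, and 45).

```python
import heapq

# Hypothetical in-memory view of the registry. Dependencies follow the
# "Registered Services" table; start_order for mcp-server is an assumption.
SERVICES = {
    "orchestrator": {"dependencies": [], "start_order": 10},
    "control-center": {"dependencies": ["orchestrator"], "start_order": 20},
    "mcp-server": {"dependencies": ["orchestrator"], "start_order": 30},
    "api-gateway": {
        "dependencies": ["orchestrator", "control-center", "mcp-server"],
        "start_order": 45,
    },
}


def startup_order(services, targets):
    """Return the targets plus their transitive dependencies in start order."""
    # Collect every service that must run for the requested targets.
    needed = set()
    stack = list(targets)
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        if name not in services:
            raise KeyError(f"unknown service: {name}")
        needed.add(name)
        stack.extend(services[name]["dependencies"])

    # Kahn's algorithm: count unmet dependencies and record dependents.
    pending = {n: len(set(services[n]["dependencies"])) for n in needed}
    dependents = {n: [] for n in needed}
    for n in needed:
        for dep in set(services[n]["dependencies"]):
            dependents[dep].append(n)

    # Among services whose dependencies are satisfied, lower start_order wins.
    ready = [(services[n]["start_order"], n) for n in needed if pending[n] == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, name = heapq.heappop(ready)
        order.append(name)
        for child in dependents[name]:
            pending[child] -= 1
            if pending[child] == 0:
                heapq.heappush(ready, (services[child]["start_order"], child))

    # Anything left unscheduled is part of a dependency cycle.
    if len(order) != len(needed):
        stuck = sorted(needed - set(order))
        raise ValueError(f"circular dependency among: {', '.join(stuck)}")
    return order


if __name__ == "__main__":
    print(startup_order(SERVICES, ["api-gateway"]))
```

Reversing the returned list gives a stop order in which dependents are stopped before the services they depend on, which matches the "reverse dependency order" described under Stop Platform.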