# Service Orchestration Guide

## Overview

The service orchestration module manages platform services with dependency-based startup, health checking, and automatic service coordination.

## Architecture

```text
┌──────────────────────┐
│     Orchestrator     │
│        (Rust)        │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ Service Orchestrator │
│                      │
│ - Dependency graph   │
│ - Startup order      │
│ - Health checking    │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│   Service Manager    │
│   (Nushell calls)    │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│  Platform Services   │
│ (CoreDNS, OCI, etc)  │
└──────────────────────┘
```

## Features

### 1. Dependency Resolution

Automatically resolve service startup order based on dependencies:

```rust
let order = service_orchestrator.resolve_startup_order(&[
    "service-c".to_string()
]).await?;

// Returns: ["service-a", "service-b", "service-c"]
```

### 2. Automatic Dependency Startup

When enabled, dependencies are started automatically:

```rust
// Start service with dependencies
service_orchestrator.start_service("web-app").await?;

// Automatically starts: database -> cache -> web-app
```

### 3. Health Checking

Monitor service health with HTTP or process checks:

```rust
let health = service_orchestrator.check_service_health("web-app").await?;

if health.healthy {
    println!("Service is healthy: {}", health.message);
}
```

### 4. Service Status

Get current status of any registered service:

```rust
let status = service_orchestrator.get_service_status("web-app").await?;

match status {
    ServiceStatus::Running => println!("Service is running"),
    ServiceStatus::Stopped => println!("Service is stopped"),
    ServiceStatus::Failed => println!("Service has failed"),
    ServiceStatus::Unknown => println!("Service status unknown"),
}
```

## Service Definition

### Service Structure

```rust
pub struct Service {
    pub name: String,
    pub description: String,
    pub dependencies: Vec<String>,
    pub start_command: String,
    pub stop_command: String,
    pub health_check_endpoint: Option<String>,
}
```

### Example Service Definition

```rust
let coredns_service = Service {
    name: "coredns".to_string(),
    description: "CoreDNS DNS server".to_string(),
    dependencies: vec![], // No dependencies
    start_command: "systemctl start coredns".to_string(),
    stop_command: "systemctl stop coredns".to_string(),
    health_check_endpoint: Some("http://localhost:53/health".to_string()),
};
```

### Service with Dependencies

```rust
let oci_registry = Service {
    name: "oci-registry".to_string(),
    description: "OCI distribution registry".to_string(),
    dependencies: vec!["coredns".to_string()], // Depends on DNS
    start_command: "systemctl start oci-registry".to_string(),
    stop_command: "systemctl stop oci-registry".to_string(),
    health_check_endpoint: Some("http://localhost:5000/v2/".to_string()),
};
```

## Configuration

Service orchestration settings in `config.defaults.toml`:

```toml
[orchestrator.services]
manager_enabled = true
auto_start_dependencies = true
```

### Configuration Options

- `manager_enabled`: Enable service orchestration (default: true)
- `auto_start_dependencies`: Auto-start dependencies when starting a service (default: true)

## API Endpoints

### List Services

```http
GET /api/v1/services/list
```

Response:

```json
{
  "success": true,
  "data": [
    {
      "name": "coredns",
      "description": "CoreDNS DNS server",
      "dependencies": [],
      "start_command": "systemctl start coredns",
      "stop_command": "systemctl stop coredns",
      "health_check_endpoint": "http://localhost:53/health"
    }
  ]
}
```

### Get Services Status

```http
GET /api/v1/services/status
```

Response:

```json
{
  "success": true,
  "data": [
    {
      "name": "coredns",
      "status": "Running"
    },
    {
      "name": "oci-registry",
      "status": "Running"
    }
  ]
}
```

## Usage Examples

### Register Services

```rust
use provisioning_orchestrator::services::{ServiceOrchestrator, Service};

let orchestrator = ServiceOrchestrator::new(
    "/usr/local/bin/nu".to_string(),
    "/usr/local/bin/provisioning".to_string(),
    true, // auto_start_dependencies
);

// Register CoreDNS
let coredns = Service {
    name: "coredns".to_string(),
    description: "CoreDNS DNS server".to_string(),
    dependencies: vec![],
    start_command: "systemctl start coredns".to_string(),
    stop_command: "systemctl stop coredns".to_string(),
    health_check_endpoint: Some("http://localhost:53/health".to_string()),
};

orchestrator.register_service(coredns).await;

// Register OCI Registry (depends on CoreDNS)
let oci = Service {
    name: "oci-registry".to_string(),
    description: "OCI distribution registry".to_string(),
    dependencies: vec!["coredns".to_string()],
    start_command: "systemctl start oci-registry".to_string(),
    stop_command: "systemctl stop oci-registry".to_string(),
    health_check_endpoint: Some("http://localhost:5000/v2/".to_string()),
};

orchestrator.register_service(oci).await;
```

### Start Service with Dependencies

```rust
// This will automatically start coredns first, then oci-registry
orchestrator.start_service("oci-registry").await?;
```

### Resolve Startup Order

```rust
let services = vec![
    "web-app".to_string(),
    "api-server".to_string(),
];

let order = orchestrator.resolve_startup_order(&services).await?;

println!("Startup order:");
for (i, service) in order.iter().enumerate() {
    println!("{}. {}", i + 1, service);
}
```

### Start All Services

```rust
let started = orchestrator.start_all_services().await?;

println!("Started {} services:", started.len());
for service in started {
    println!(" ✓ {}", service);
}
```

### Check Service Health

```rust
let health = orchestrator.check_service_health("coredns").await?;

if health.healthy {
    println!("✓ {} is healthy", "coredns");
    println!(" Message: {}", health.message);
    println!(" Last check: {}", health.last_check);
} else {
    println!("✗ {} is unhealthy", "coredns");
    println!(" Message: {}", health.message);
}
```

## Dependency Graph Examples

### Simple Chain

```text
A -> B -> C
```

Startup order: A, B, C

```rust
let a = Service { name: "a".to_string(), dependencies: vec![], /* ... */ };
let b = Service { name: "b".to_string(), dependencies: vec!["a".to_string()], /* ... */ };
let c = Service { name: "c".to_string(), dependencies: vec!["b".to_string()], /* ... */ };
```

### Diamond Dependency

```text
    A
   / \
  B   C
   \ /
    D
```

Startup order: A, B, C, D (B and C can start in parallel)

```rust
let a = Service { name: "a".to_string(), dependencies: vec![], /* ... */ };
let b = Service { name: "b".to_string(), dependencies: vec!["a".to_string()], /* ... */ };
let c = Service { name: "c".to_string(), dependencies: vec!["a".to_string()], /* ... */ };
let d = Service { name: "d".to_string(), dependencies: vec!["b".to_string(), "c".to_string()], /* ... */ };
```

### Complex Dependency

```text
    A
    |
    B
   / \
  C   D
  |   |
  E   F
   \ /
    G
```

Startup order: A, B, C, D, E, F, G

## Integration with Platform Services

### CoreDNS Service

```rust
let coredns = Service {
    name: "coredns".to_string(),
    description: "CoreDNS DNS server for automatic DNS registration".to_string(),
    dependencies: vec![],
    start_command: "systemctl start coredns".to_string(),
    stop_command: "systemctl stop coredns".to_string(),
    health_check_endpoint: Some("http://localhost:53/health".to_string()),
};
```

### OCI Registry Service

```rust
let oci_registry = Service {
    name: "oci-registry".to_string(),
    description: "OCI distribution registry for artifacts".to_string(),
    dependencies: vec!["coredns".to_string()],
    start_command: "systemctl start oci-registry".to_string(),
    stop_command: "systemctl stop oci-registry".to_string(),
    health_check_endpoint: Some("http://localhost:5000/v2/".to_string()),
};
```

### Orchestrator Service

```rust
let orchestrator = Service {
    name: "orchestrator".to_string(),
    description: "Main orchestrator service".to_string(),
    dependencies: vec!["coredns".to_string(), "oci-registry".to_string()],
    start_command: "./scripts/start-orchestrator.nu --background".to_string(),
    stop_command: "./scripts/start-orchestrator.nu --stop".to_string(),
    health_check_endpoint: Some("http://localhost:9090/health".to_string()),
};
```

## Error Handling

The service orchestrator handles errors gracefully:

- Missing dependencies: Reports missing services
- Circular dependencies: Detects and reports cycles
- Start failures: Continues with other services
- Health check failures: Marks service as unhealthy

### Circular Dependency Detection

```rust
// This would create a cycle: A -> B -> C -> A
let a = Service { name: "a".to_string(), dependencies: vec!["c".to_string()], /* ... */ };
let b = Service { name: "b".to_string(), dependencies: vec!["a".to_string()], /* ... */ };
let c = Service { name: "c".to_string(), dependencies: vec!["b".to_string()], /* ... */ };

// Error: Circular dependency detected
let result = orchestrator.resolve_startup_order(&["a".to_string()]).await;
assert!(result.is_err());
```

## Testing

Run service orchestration tests:

```bash
cd provisioning/platform/orchestrator
cargo test test_service_orchestration
```

## Troubleshooting

### Service fails to start

1. Check service is registered
2. Verify dependencies are running
3. Review service start command
4. Check service logs
5. Verify permissions

### Dependency resolution fails

1. Check for circular dependencies
2. Verify all services are registered
3. Review dependency declarations

### Health check fails

1. Verify health endpoint is correct
2. Check service is actually running
3. Review network connectivity
4. Check health check timeout

## Best Practices

1. Minimize dependencies: Only declare necessary dependencies
2. Health endpoints: Implement health checks for all services
3. Graceful shutdown: Implement proper stop commands
4. Idempotent starts: Ensure services can be restarted safely
5. Error logging: Log all service operations

## Security Considerations

1. Command injection: Validate service commands
2. Access control: Restrict service management
3. Audit logging: Log all service operations
4. Least privilege: Run services with minimal permissions

## Performance

### Startup Optimization

- Parallel starts: Services without dependencies start in parallel
- Dependency caching: Cache dependency resolution
- Health check batching: Batch health checks for efficiency

### Monitoring

Track service metrics:

- Start time: Time to start each service
- Health check latency: Health check response time
- Failure rate: Percentage of failed starts
- Uptime: Service availability percentage

## Future Enhancements

- [ ] Service restart policies
- [ ] Graceful shutdown ordering
- [ ] Service watchdog
- [ ] Auto-restart on failure
- [ ] Service templates
- [ ] Container-based services
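
## Appendix: Dependency Resolution Sketch

The guide describes dependency-ordered startup, missing-service reporting, and circular dependency detection, but does not show how the order is computed. The following is a minimal sketch of one common approach, a depth-first topological sort. It is not taken from the `ServiceOrchestrator` source: the `ResolveError` enum, the free-standing synchronous `resolve_startup_order` function, and the plain dependency map it takes are all hypothetical names chosen for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Errors that resolution can report (hypothetical type, not the crate's own).
#[derive(Debug, PartialEq)]
pub enum ResolveError {
    /// A requested or declared service was never registered.
    MissingService(String),
    /// A dependency cycle was found; the named service closes the cycle.
    CircularDependency(String),
}

/// Resolve a startup order for `targets` with a depth-first topological sort.
/// `deps` maps each registered service name to the names it depends on.
pub fn resolve_startup_order(
    deps: &HashMap<String, Vec<String>>,
    targets: &[String],
) -> Result<Vec<String>, ResolveError> {
    let mut order = Vec::new();
    let mut done = HashSet::new(); // services already placed in `order`
    let mut visiting = HashSet::new(); // services on the current DFS path
    for target in targets {
        visit(target, deps, &mut done, &mut visiting, &mut order)?;
    }
    Ok(order)
}

fn visit(
    name: &str,
    deps: &HashMap<String, Vec<String>>,
    done: &mut HashSet<String>,
    visiting: &mut HashSet<String>,
    order: &mut Vec<String>,
) -> Result<(), ResolveError> {
    if done.contains(name) {
        return Ok(());
    }
    // Re-entering a service that is still on the DFS path means a cycle.
    if !visiting.insert(name.to_string()) {
        return Err(ResolveError::CircularDependency(name.to_string()));
    }
    let children = deps
        .get(name)
        .ok_or_else(|| ResolveError::MissingService(name.to_string()))?;
    for dep in children {
        visit(dep, deps, done, visiting, order)?;
    }
    visiting.remove(name);
    done.insert(name.to_string());
    // All dependencies were pushed first, so the service itself goes last.
    order.push(name.to_string());
    Ok(())
}
```

With the diamond graph from the examples (`d` depends on `b` and `c`, both of which depend on `a`), resolving `d` in this sketch yields `a, b, c, d`. For the three-service cycle shown under Circular Dependency Detection, resolving `a` returns `CircularDependency("a")`. Because dependencies are walked in declaration order, the result is deterministic, but the sketch does not group services that could start in parallel.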