Jesús Pérez f2be2414e4 core: init repo and codebase

2025-10-07 10:59:52 +01:00

Service Orchestration Guide

Overview

The service orchestration module manages platform services with dependency-based startup, health checking, and automatic service coordination.

Architecture

┌──────────────────────┐
│    Orchestrator      │
│      (Rust)          │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ Service Orchestrator │
│                      │
│  - Dependency graph  │
│  - Startup order     │
│  - Health checking   │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│  Service Manager     │
│  (Nushell calls)     │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│  Platform Services   │
│  (CoreDNS, OCI, etc) │
└──────────────────────┘

Features

1. Dependency Resolution

Automatically resolve service startup order based on dependencies:

let order = service_orchestrator.resolve_startup_order(&[
    "service-c".to_string()
]).await?;

// Returns ["service-a", "service-b", "service-c"] when service-c depends on
// service-b, which in turn depends on service-a
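One way such a resolution could work is a depth-first topological sort. The sketch below is illustrative only and is not the module's actual implementation: `startup_order`, `visit`, and `ResolveError` are hypothetical names, and the dependency graph is assumed to be a plain map from service name to dependency names.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ResolveError {
    /// A requested service or one of its dependencies is not registered.
    UnknownService(String),
    /// The named service was reached again while still being resolved.
    Cycle(String),
}

/// Returns `requested` plus all transitive dependencies, with every
/// dependency placed before the services that need it.
/// `deps` maps a service name to the names it depends on.
pub fn startup_order(
    deps: &HashMap<String, Vec<String>>,
    requested: &[String],
) -> Result<Vec<String>, ResolveError> {
    let mut order = Vec::new();
    let mut in_progress = HashSet::new();
    let mut done = HashSet::new();
    for name in requested {
        visit(name, deps, &mut in_progress, &mut done, &mut order)?;
    }
    Ok(order)
}

fn visit(
    name: &str,
    deps: &HashMap<String, Vec<String>>,
    in_progress: &mut HashSet<String>,
    done: &mut HashSet<String>,
    order: &mut Vec<String>,
) -> Result<(), ResolveError> {
    if done.contains(name) {
        return Ok(()); // already placed in the order
    }
    let children = deps
        .get(name)
        .ok_or_else(|| ResolveError::UnknownService(name.to_string()))?;
    // A name that is still on the DFS stack means we looped back to it.
    if !in_progress.insert(name.to_string()) {
        return Err(ResolveError::Cycle(name.to_string()));
    }
    for dep in children {
        visit(dep, deps, in_progress, done, order)?;
    }
    in_progress.remove(name);
    done.insert(name.to_string());
    order.push(name.to_string());
    Ok(())
}
```

Because a service is appended only after all of its dependencies, the result can be started front to back. The same traversal also reports unregistered names and cycles.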

2. Automatic Dependency Startup

When auto_start_dependencies is enabled, starting a service first starts its dependencies:

// Start service with dependencies
service_orchestrator.start_service("web-app").await?;

// Automatically starts: database -> cache -> web-app
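The behaviour described above can be pictured as a recursive start that visits dependencies first. This is a minimal in-memory sketch, not the real `ServiceOrchestrator`: the `Starter` type and its fields are hypothetical, no commands are executed, and the graph is assumed to be free of cycles. Skipping services that are already running and returning an error when auto-start is disabled are both assumptions of the sketch.

```rust
use std::collections::{HashMap, HashSet};

/// Minimal in-memory stand-in for the orchestrator (hypothetical type).
pub struct Starter {
    /// Service name -> names of the services it depends on.
    pub deps: HashMap<String, Vec<String>>,
    /// Services currently considered running.
    pub running: HashSet<String>,
    /// Mirrors the `auto_start_dependencies` setting.
    pub auto_start_dependencies: bool,
    /// Records the order in which services were started.
    pub started: Vec<String>,
}

impl Starter {
    /// Starts `name`, starting its dependencies first when allowed.
    /// Assumes the dependency graph has already been checked for cycles.
    pub fn start_service(&mut self, name: &str) -> Result<(), String> {
        if self.running.contains(name) {
            return Ok(()); // already running: nothing to do
        }
        let deps = self
            .deps
            .get(name)
            .cloned()
            .ok_or_else(|| format!("service not registered: {}", name))?;
        for dep in &deps {
            if self.running.contains(dep) {
                continue;
            }
            if self.auto_start_dependencies {
                self.start_service(dep)?;
            } else {
                // Assumed behaviour when auto-start is disabled.
                return Err(format!("dependency not running: {}", dep));
            }
        }
        // A real implementation would run the service's start command here.
        self.running.insert(name.to_string());
        self.started.push(name.to_string());
        Ok(())
    }
}
```

Calling `start_service` again for a running service is a no-op in this sketch, which keeps repeated starts safe.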

3. Health Checking

Monitor service health with HTTP or process checks:

let health = service_orchestrator.check_service_health("web-app").await?;

if health.healthy {
    println!("Service is healthy: {}", health.message);
}
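For illustration, an HTTP health probe can be written with only the standard library. The sketch below is an assumption about how such a check might look, not the module's code: `HealthStatus` here carries only the two fields used above, `check_http_health` and its helpers are hypothetical names, and the probe handles plain HTTP without redirects or TLS.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Hypothetical result type with the two fields used in the example above.
#[derive(Debug)]
pub struct HealthStatus {
    pub healthy: bool,
    pub message: String,
}

/// True when an HTTP status line such as "HTTP/1.1 200 OK" carries a 2xx code.
pub fn is_success_status(status_line: &str) -> bool {
    status_line
        .split_whitespace()
        .nth(1)
        .and_then(|code| code.parse::<u16>().ok())
        .map_or(false, |code| (200..300).contains(&code))
}

/// Sends a plain HTTP GET and returns the first line of the response.
fn fetch_status_line(host: &str, port: u16, path: &str, timeout: Duration) -> io::Result<String> {
    let addr = (host, port)
        .to_socket_addrs()?
        .next()
        .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "host did not resolve"))?;
    let mut stream = TcpStream::connect_timeout(&addr, timeout)?;
    stream.set_read_timeout(Some(timeout))?;
    stream.set_write_timeout(Some(timeout))?;
    write!(
        stream,
        "GET {} HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n",
        path, host
    )?;
    let mut line = String::new();
    BufReader::new(stream).read_line(&mut line)?;
    Ok(line.trim_end().to_string())
}

/// Reports a service as healthy only if the endpoint answers with a 2xx status.
pub fn check_http_health(host: &str, port: u16, path: &str, timeout: Duration) -> HealthStatus {
    match fetch_status_line(host, port, path, timeout) {
        Ok(line) => HealthStatus {
            healthy: is_success_status(&line),
            message: line,
        },
        // Connection errors and timeouts count as unhealthy.
        Err(err) => HealthStatus {
            healthy: false,
            message: err.to_string(),
        },
    }
}
```

A call such as `check_http_health("localhost", 5000, "/v2/", Duration::from_secs(2))` would probe the registry endpoint used later in this guide. The explicit timeout keeps a hung service from blocking the caller.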

4. Service Status

Get current status of any registered service:

let status = service_orchestrator.get_service_status("web-app").await?;

match status {
    ServiceStatus::Running => println!("Service is running"),
    ServiceStatus::Stopped => println!("Service is stopped"),
    ServiceStatus::Failed => println!("Service has failed"),
    ServiceStatus::Unknown => println!("Service status unknown"),
}

Service Definition

Service Structure

pub struct Service {
    pub name: String,
    pub description: String,
    pub dependencies: Vec<String>,
    pub start_command: String,
    pub stop_command: String,
    pub health_check_endpoint: Option<String>,
}

Example Service Definition

let coredns_service = Service {
    name: "coredns".to_string(),
    description: "CoreDNS DNS server".to_string(),
    dependencies: vec![],  // No dependencies
    start_command: "systemctl start coredns".to_string(),
    stop_command: "systemctl stop coredns".to_string(),
    health_check_endpoint: Some("http://localhost:53/health".to_string()),
};

Service with Dependencies

let oci_registry = Service {
    name: "oci-registry".to_string(),
    description: "OCI distribution registry".to_string(),
    dependencies: vec!["coredns".to_string()],  // Depends on DNS
    start_command: "systemctl start oci-registry".to_string(),
    stop_command: "systemctl stop oci-registry".to_string(),
    health_check_endpoint: Some("http://localhost:5000/v2/".to_string()),
};

Configuration

Service orchestration settings live in config.defaults.toml:

[orchestrator.services]
manager_enabled = true
auto_start_dependencies = true

Configuration Options

manager_enabled: Enable service orchestration (default: true)
auto_start_dependencies: Auto-start dependencies when starting a service (default: true)
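On the Rust side, these two options and their documented defaults could be mirrored by a small settings struct. The `ServicesConfig` type and its `set` helper are hypothetical and only illustrate the defaults listed above; how the real module loads the TOML file is not shown here.

```rust
/// Hypothetical mirror of the `[orchestrator.services]` table.
#[derive(Debug, Clone, PartialEq)]
pub struct ServicesConfig {
    pub manager_enabled: bool,
    pub auto_start_dependencies: bool,
}

impl Default for ServicesConfig {
    /// Both options default to `true`, as documented above.
    fn default() -> Self {
        Self {
            manager_enabled: true,
            auto_start_dependencies: true,
        }
    }
}

impl ServicesConfig {
    /// Applies one boolean override on top of the current settings.
    pub fn set(&mut self, key: &str, value: bool) -> Result<(), String> {
        match key {
            "manager_enabled" => self.manager_enabled = value,
            "auto_start_dependencies" => self.auto_start_dependencies = value,
            other => return Err(format!("unknown option: {}", other)),
        }
        Ok(())
    }
}
```

Rejecting unknown keys makes a misspelled option fail loudly instead of being silently ignored.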

API Endpoints

List Services

GET /api/v1/services/list

Response:

{
  "success": true,
  "data": [
    {
      "name": "coredns",
      "description": "CoreDNS DNS server",
      "dependencies": [],
      "start_command": "systemctl start coredns",
      "stop_command": "systemctl stop coredns",
      "health_check_endpoint": "http://localhost:53/health"
    }
  ]
}

Get Services Status

GET /api/v1/services/status

Response:

{
  "success": true,
  "data": [
    {
      "name": "coredns",
      "status": "Running"
    },
    {
      "name": "oci-registry",
      "status": "Running"
    }
  ]
}

Usage Examples

Register Services

use provisioning_orchestrator::services::{ServiceOrchestrator, Service};

let orchestrator = ServiceOrchestrator::new(
    "/usr/local/bin/nu".to_string(),
    "/usr/local/bin/provisioning".to_string(),
    true,  // auto_start_dependencies
);

// Register CoreDNS
let coredns = Service {
    name: "coredns".to_string(),
    description: "CoreDNS DNS server".to_string(),
    dependencies: vec![],
    start_command: "systemctl start coredns".to_string(),
    stop_command: "systemctl stop coredns".to_string(),
    health_check_endpoint: Some("http://localhost:53/health".to_string()),
};

orchestrator.register_service(coredns).await;

// Register OCI Registry (depends on CoreDNS)
let oci = Service {
    name: "oci-registry".to_string(),
    description: "OCI distribution registry".to_string(),
    dependencies: vec!["coredns".to_string()],
    start_command: "systemctl start oci-registry".to_string(),
    stop_command: "systemctl stop oci-registry".to_string(),
    health_check_endpoint: Some("http://localhost:5000/v2/".to_string()),
};

orchestrator.register_service(oci).await;

Start Service with Dependencies

// This will automatically start coredns first, then oci-registry
orchestrator.start_service("oci-registry").await?;

Resolve Startup Order

let services = vec![
    "web-app".to_string(),
    "api-server".to_string(),
];

let order = orchestrator.resolve_startup_order(&services).await?;

println!("Startup order:");
for (i, service) in order.iter().enumerate() {
    println!("{}. {}", i + 1, service);
}

Start All Services

let started = orchestrator.start_all_services().await?;

println!("Started {} services:", started.len());
for service in started {
    println!("  ✓ {}", service);
}

Check Service Health

let health = orchestrator.check_service_health("coredns").await?;

if health.healthy {
    println!("✓ {} is healthy", "coredns");
    println!("  Message: {}", health.message);
    println!("  Last check: {}", health.last_check);
} else {
    println!("✗ {} is unhealthy", "coredns");
    println!("  Message: {}", health.message);
}

Dependency Graph Examples

Simple Chain

A -> B -> C

Startup order: A, B, C

let a = Service { name: "a".to_string(), dependencies: vec![], /* ... */ };
let b = Service { name: "b".to_string(), dependencies: vec!["a".to_string()], /* ... */ };
let c = Service { name: "c".to_string(), dependencies: vec!["b".to_string()], /* ... */ };

Diamond Dependency

    A
   / \
  B   C
   \ /
    D

Startup order: A, B, C, D (B and C can start in parallel)

let a = Service { name: "a".to_string(), dependencies: vec![], /* ... */ };
let b = Service { name: "b".to_string(), dependencies: vec!["a".to_string()], /* ... */ };
let c = Service { name: "c".to_string(), dependencies: vec!["a".to_string()], /* ... */ };
let d = Service { name: "d".to_string(), dependencies: vec!["b".to_string(), "c".to_string()], /* ... */ };
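To make the "B and C can start in parallel" remark concrete, services can be grouped into waves, where each wave depends only on earlier ones. The function below is a sketch under that assumption; `startup_waves` is a hypothetical helper and not part of the orchestrator's API.

```rust
use std::collections::{HashMap, HashSet};

/// Groups services into startup waves. Every service in a wave depends only
/// on services from earlier waves, so members of one wave can start together.
/// `deps` maps a service name to the names it depends on.
pub fn startup_waves(deps: &HashMap<String, Vec<String>>) -> Result<Vec<Vec<String>>, String> {
    let mut placed: HashSet<String> = HashSet::new();
    let mut waves = Vec::new();
    while placed.len() < deps.len() {
        // Pick every unplaced service whose dependencies are all placed.
        let mut wave: Vec<String> = deps
            .iter()
            .filter(|(name, needs)| {
                !placed.contains(name.as_str())
                    && needs.iter().all(|dep| placed.contains(dep.as_str()))
            })
            .map(|(name, _)| name.clone())
            .collect();
        if wave.is_empty() {
            // Nothing can make progress: a cycle or an unregistered dependency.
            return Err("circular or unregistered dependency".to_string());
        }
        wave.sort(); // deterministic output for the example
        for name in &wave {
            placed.insert(name.clone());
        }
        waves.push(wave);
    }
    Ok(waves)
}
```

For the diamond above this yields three waves: A alone, then B and C together, then D.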

Complex Dependency

    A
    |
    B
   / \
  C   D
  |   |
  E   F
   \ /
    G

Startup order: A, B, C, D, E, F, G
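The listed order is one of several valid ones for this graph. A small checker makes that explicit by verifying that every dependency comes before the service that needs it. `is_valid_startup_order` is a hypothetical helper written for this guide, and the edges in the usage note assume that each node in the diagram depends on the node(s) directly above it.

```rust
use std::collections::HashMap;

/// Returns true when every service in `deps` appears in `order` after all of
/// the services it depends on. Each entry is `(service, its dependencies)`.
pub fn is_valid_startup_order(order: &[&str], deps: &[(&str, Vec<&str>)]) -> bool {
    // Position of each service in the proposed order.
    let position: HashMap<&str, usize> = order
        .iter()
        .enumerate()
        .map(|(index, name)| (*name, index))
        .collect();
    deps.iter().all(|(service, requires)| match position.get(service) {
        None => false, // service missing from the order
        Some(&at) => requires
            .iter()
            .all(|dep| position.get(dep).map_or(false, |&p| p < at)),
    })
}
```

With the edges B on A, C and D on B, E on C, F on D, and G on both E and F, the order A, B, C, D, E, F, G passes the check, and so does A, B, D, F, C, E, G. An order that places C before B does not.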

Integration with Platform Services

CoreDNS Service

let coredns = Service {
    name: "coredns".to_string(),
    description: "CoreDNS DNS server for automatic DNS registration".to_string(),
    dependencies: vec![],
    start_command: "systemctl start coredns".to_string(),
    stop_command: "systemctl stop coredns".to_string(),
    health_check_endpoint: Some("http://localhost:53/health".to_string()),
};

OCI Registry Service

let oci_registry = Service {
    name: "oci-registry".to_string(),
    description: "OCI distribution registry for artifacts".to_string(),
    dependencies: vec!["coredns".to_string()],
    start_command: "systemctl start oci-registry".to_string(),
    stop_command: "systemctl stop oci-registry".to_string(),
    health_check_endpoint: Some("http://localhost:5000/v2/".to_string()),
};

Orchestrator Service

let orchestrator_service = Service {
    name: "orchestrator".to_string(),
    description: "Main orchestrator service".to_string(),
    dependencies: vec!["coredns".to_string(), "oci-registry".to_string()],
    start_command: "./scripts/start-orchestrator.nu --background".to_string(),
    stop_command: "./scripts/start-orchestrator.nu --stop".to_string(),
    health_check_endpoint: Some("http://localhost:8080/health".to_string()),
};

Error Handling

The service orchestrator handles errors gracefully:

Missing dependencies: Reports missing services
Circular dependencies: Detects and reports cycles
Start failures: Continues with other services
Health check failures: Marks service as unhealthy
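The "continues with other services" behaviour can be sketched as a loop that records failures instead of returning on the first one. `StartReport` and `start_in_order` are hypothetical names, and the `start` closure stands in for running a service's start command. Whether the real orchestrator also skips dependents of a failed service is not stated in this guide, so the sketch does not do so.

```rust
/// Outcome of one pass over a startup order (hypothetical type).
#[derive(Debug, Default, PartialEq)]
pub struct StartReport {
    /// Services that started successfully, in start order.
    pub started: Vec<String>,
    /// Services that failed, each paired with the reported reason.
    pub failed: Vec<(String, String)>,
}

/// Tries to start every service in `order`, continuing past failures.
pub fn start_in_order<F>(order: &[&str], mut start: F) -> StartReport
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut report = StartReport::default();
    for &name in order {
        match start(name) {
            Ok(()) => report.started.push(name.to_string()),
            // Record the failure and move on to the next service.
            Err(reason) => report.failed.push((name.to_string(), reason)),
        }
    }
    report
}
```

Returning a report rather than a single `Result` lets the caller decide whether a partial start is acceptable.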

Circular Dependency Detection

// This would create a cycle: A -> B -> C -> A
let a = Service { name: "a".to_string(), dependencies: vec!["c".to_string()], /* ... */ };
let b = Service { name: "b".to_string(), dependencies: vec!["a".to_string()], /* ... */ };
let c = Service { name: "c".to_string(), dependencies: vec!["b".to_string()], /* ... */ };

// Error: Circular dependency detected
let result = orchestrator.resolve_startup_order(&["a".to_string()]).await;
assert!(result.is_err());

Testing

Run service orchestration tests:

cd provisioning/platform/orchestrator
cargo test test_service_orchestration

Troubleshooting

Service fails to start

Check that the service is registered
Verify that its dependencies are running
Review the service's start command
Check the service logs
Verify that the start command has the permissions it needs

Dependency resolution fails

Check for circular dependencies
Verify all services are registered
Review dependency declarations

Health check fails

Verify that the health endpoint URL is correct
Check that the service is actually running
Review network connectivity to the endpoint
Check the health check timeout

Best Practices

Minimize dependencies: Only declare necessary dependencies
Health endpoints: Implement health checks for all services
Graceful shutdown: Implement proper stop commands
Idempotent starts: Ensure services can be restarted safely
Error logging: Log failed service operations with enough context to diagnose them

Security Considerations

Command injection: Validate service commands
Access control: Restrict service management
Audit logging: Log all service operations
Least privilege: Run services with minimal permissions
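As an illustration of the command injection point, a start or stop command can be checked before it is ever executed. The sketch below is one possible approach and not the module's actual validation: `ParsedCommand`, `parse_service_command`, and the allow-list parameter are hypothetical. It splits on whitespace only, so quoted arguments are not supported.

```rust
use std::process::Command;

/// A command split into program and arguments (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct ParsedCommand {
    pub program: String,
    pub args: Vec<String>,
}

/// Characters that would only matter if the command went through a shell.
const SHELL_METACHARACTERS: &[char] = &[';', '|', '&', '$', '`', '<', '>', '(', ')', '\n'];

/// Rejects commands containing shell metacharacters or an unlisted program.
pub fn parse_service_command(
    command_line: &str,
    allowed_programs: &[&str],
) -> Result<ParsedCommand, String> {
    if command_line.chars().any(|c| SHELL_METACHARACTERS.contains(&c)) {
        return Err("command contains shell metacharacters".to_string());
    }
    let mut parts = command_line.split_whitespace();
    let program = parts.next().ok_or_else(|| "command is empty".to_string())?;
    if !allowed_programs.contains(&program) {
        return Err(format!("program not allowed: {}", program));
    }
    Ok(ParsedCommand {
        program: program.to_string(),
        args: parts.map(str::to_string).collect(),
    })
}

impl ParsedCommand {
    /// Builds a `Command` that runs the program directly, without a shell.
    pub fn to_command(&self) -> Command {
        let mut command = Command::new(&self.program);
        command.args(&self.args);
        command
    }
}
```

Running the program directly through `std::process::Command`, instead of passing the whole string to a shell, means arguments are never reinterpreted as shell syntax.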

Performance

Startup Optimization

Parallel starts: Services without dependencies start in parallel
Dependency caching: Cache dependency resolution
Health check batching: Batch health checks for efficiency

Monitoring

Track service metrics:

Start time: Time to start each service
Health check latency: Health check response time
Failure rate: Percentage of failed starts
Uptime: Service availability percentage
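Two of these metrics, start time and failure rate, can be tracked with a small accumulator. This is a sketch only: `StartMetrics` is a hypothetical type, and the averages are taken over all recorded start attempts, which is an assumption of the example.

```rust
use std::time::Duration;

/// Accumulates start timings and failures for one service (hypothetical type).
#[derive(Debug, Default)]
pub struct StartMetrics {
    pub attempts: u32,
    pub failures: u32,
    pub total_start_time: Duration,
}

impl StartMetrics {
    /// Records one start attempt and how long it took.
    pub fn record_start(&mut self, elapsed: Duration, succeeded: bool) {
        self.attempts += 1;
        if !succeeded {
            self.failures += 1;
        }
        self.total_start_time += elapsed;
    }

    /// Percentage of start attempts that failed, or 0.0 with no attempts.
    pub fn failure_rate_percent(&self) -> f64 {
        if self.attempts == 0 {
            0.0
        } else {
            f64::from(self.failures) * 100.0 / f64::from(self.attempts)
        }
    }

    /// Mean duration across all attempts, or `None` with no attempts.
    pub fn average_start_time(&self) -> Option<Duration> {
        if self.attempts == 0 {
            None
        } else {
            Some(self.total_start_time / self.attempts)
        }
    }
}
```

In practice the `elapsed` value would come from wrapping the start call with `std::time::Instant::now()` and `elapsed()`.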

Future Enhancements

Service restart policies
Graceful shutdown ordering
Service watchdog
Auto-restart on failure
Service templates
Container-based services
