Platform Services

Platform-level services for the Provisioning infrastructure automation project. These services provide the high-performance execution layer, the management interfaces, and the supporting infrastructure for the whole provisioning system.

Overview

The Platform layer consists of services built primarily in Rust, at varying stages of maturity (see Status Legend), providing:

  • Workflow Execution - High-performance orchestration and task coordination
  • Management Interfaces - Web UI and REST APIs for infrastructure management
  • Security & Authorization - Enterprise-grade access control and permissions
  • Installation & Distribution - Multi-mode installer with TUI, CLI, and unattended modes
  • AI Integration - Model Context Protocol (MCP) server for intelligent assistance
  • Extension Management - OCI-based registry for distributing modules

Core Platform Services

1. Orchestrator (orchestrator/)

High-performance Rust/Nushell hybrid orchestrator for workflow execution.

Language: Rust + Nushell integration

Purpose: Workflow execution, task scheduling, state management

Key Features:

  • File-based persistence for reliability
  • Priority processing with retry logic
  • Checkpoint recovery and automatic rollback
  • REST API endpoints for external integration
  • Solves deep call stack limitations
  • Parallel task execution with dependency resolution

Status: Production Ready (v3.0.0)

Documentation: See Orchestrator Endpoints

Quick Start:

cd orchestrator
./scripts/start-orchestrator.nu --background

REST API:

  • GET http://localhost:8080/health - Health check
  • GET http://localhost:8080/tasks - List all tasks
  • POST http://localhost:8080/workflows/servers/create - Server workflow
  • POST http://localhost:8080/workflows/taskserv/create - Taskserv workflow
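As a rough illustration of calling the read-only endpoints above, the following Rust sketch sends a plain HTTP/1.1 GET over a TCP socket using only the standard library. The helper names (build_get, status_code, get_status) are hypothetical and not part of the orchestrator. The response body format is not described in this README, so only the status line is inspected.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Build a minimal HTTP/1.1 GET request (hypothetical helper).
fn build_get(host: &str, path: &str) -> String {
    format!("GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n")
}

/// Parse the numeric status code out of the first line of a raw HTTP response.
fn status_code(response: &str) -> Option<u16> {
    response.lines().next()?.split_whitespace().nth(1)?.parse().ok()
}

/// Send a GET to `addr` (for example "localhost:8080") and return the status code.
fn get_status(addr: &str, path: &str) -> std::io::Result<Option<u16>> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(build_get(addr, path).as_bytes())?;
    let mut raw = Vec::new();
    stream.read_to_end(&mut raw)?;
    Ok(status_code(&String::from_utf8_lossy(&raw)))
}
```

With the orchestrator running, get_status("localhost:8080", "/health") should report whatever status the health check answers with. A real client would more likely use an HTTP crate; the raw socket is used here only to keep the sketch dependency-free.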

2. Control Center (control-center/)

Backend control center service with authorization and permissions management.

Language: Rust

Purpose: Web-based infrastructure management with RBAC

Key Features:

  • Authorization and permissions control (enterprise security)
  • Role-Based Access Control (RBAC)
  • Audit logging and compliance tracking
  • System management APIs
  • Configuration management
  • Resource monitoring

Status: Active Development

Security Features:

  • Fine-grained permissions system
  • User authentication and session management
  • API key management
  • Activity audit logs

3. Control Center UI (control-center-ui/)

Frontend web interface for infrastructure management.

Language: Web (HTML/CSS/JavaScript)

Purpose: User-friendly dashboard and administration interface

Key Features:

  • Dashboard with real-time monitoring
  • Configuration management interface
  • System administration tools
  • Workflow visualization
  • Log viewing and search

Status: Active Development

Integration: Communicates with Control Center backend and Orchestrator APIs


4. Installer (installer/)

Multi-mode platform installation system with interactive TUI, headless CLI, and unattended modes.

Language: Rust (Ratatui TUI) + Nushell scripts

Purpose: Platform installation and configuration generation

Key Features:

  • Interactive TUI Mode: Beautiful terminal UI with 7 screens
  • Headless Mode: CLI automation for scripted installations
  • Unattended Mode: Zero-interaction CI/CD deployments
  • Deployment Modes: Solo (2 CPU/4GB), MultiUser (4 CPU/8GB), CICD (8 CPU/16GB), Enterprise (16 CPU/32GB)
  • MCP Integration: 7 AI-powered settings tools for intelligent configuration
  • Nushell Scripts: Complete deployment automation for Docker, Podman, Kubernetes, OrbStack
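The deployment modes above can be modelled as a small enum. This is only a sketch: it assumes the CPU and memory figures are minimum host requirements, which this README does not state explicitly, and modes_that_fit is a hypothetical helper rather than installer code.

```rust
/// Deployment modes listed above, with the CPU/memory figures from this README.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DeploymentMode {
    Solo,
    MultiUser,
    Cicd,
    Enterprise,
}

impl DeploymentMode {
    /// (CPU cores, memory in GB), read here as minimum requirements (an assumption).
    fn requirements(self) -> (u32, u32) {
        match self {
            DeploymentMode::Solo => (2, 4),
            DeploymentMode::MultiUser => (4, 8),
            DeploymentMode::Cicd => (8, 16),
            DeploymentMode::Enterprise => (16, 32),
        }
    }
}

/// Return every mode whose requirements fit the given host, smallest first.
fn modes_that_fit(cpus: u32, mem_gb: u32) -> Vec<DeploymentMode> {
    let all = [
        DeploymentMode::Solo,
        DeploymentMode::MultiUser,
        DeploymentMode::Cicd,
        DeploymentMode::Enterprise,
    ];
    all.iter()
        .copied()
        .filter(|mode| {
            let (need_cpu, need_mem) = mode.requirements();
            cpus >= need_cpu && mem_gb >= need_mem
        })
        .collect()
}
```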

Status: Production Ready (v3.5.0)

Quick Start:

# Interactive TUI
provisioning-installer

# Headless mode
provisioning-installer --headless --mode solo --yes

# Unattended CI/CD
provisioning-installer --unattended --config config.toml

Documentation: installer/docs/ - Complete guides and references


5. MCP Server (mcp-server/)

Model Context Protocol server for AI-powered assistance.

Language: Nushell

Purpose: AI integration for intelligent configuration and assistance

Key Features:

  • 7 AI-powered settings tools
  • Intelligent config completion
  • Natural language infrastructure queries
  • Configuration validation and suggestions
  • Context-aware help system

Status: Active Development

MCP Tools:

  • Settings generation
  • Configuration validation
  • Best practice recommendations
  • Infrastructure planning assistance
  • Error diagnosis and resolution

6. OCI Registry (infrastructure/oci-registry/)

OCI-compliant registry for extension distribution and versioning.

Purpose: Distributing and managing extensions

Key Features:

  • Task service packages
  • Provider packages
  • Cluster templates
  • Workflow definitions
  • Version management and updates
  • Dependency resolution

Status: 🔄 Planned

Benefits:

  • Centralized extension management
  • Version control and rollback
  • Dependency tracking
  • Community marketplace ready

7. ops-keeper (crates/ops-keeper/)

Policy-based operation gate — signs approved operations with Ed25519 keys before forwarding to the control plane.

Language: Rust

Purpose: Operation approval, policy enforcement, and keeper-signed JWT emission

Key Features:

  • Glob-based PolicyDef matching against op type, image patterns, and target patterns
  • Signer wraps an Ed25519 key pair; emits compact JWTs (OpsClaims) on approval
  • PendingOp tracking with NATS JetStream durable consumer (ops.pending.*)
  • AuditEvent emission to ops.audit.* stream on approval or rejection
  • Nickel-driven policy config (keeper_policy.ncl)
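To make the glob-based matching concrete, here is a minimal sketch of how a policy could be checked against an operation. The field names on PolicyDef and the Op type are assumptions for illustration; the real structures live in the crate and are not shown in this README. Only the `*` wildcard is handled, and the Ed25519 signing step is left out because it needs an external crate.

```rust
/// Match `text` against `pattern`, where `*` matches any run of characters.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    let mut star: Option<usize> = None; // position of the last `*` seen
    let mut mark = 0usize; // text position to retry from after that `*`
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some(pi);
            mark = ti;
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some(s) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = s + 1;
            mark += 1;
            ti = mark;
        } else {
            return false;
        }
    }
    // Any trailing `*` in the pattern may match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Hypothetical policy shape (field names are assumptions).
struct PolicyDef {
    op_type: String,
    image_pattern: String,
    target_pattern: String,
}

/// Hypothetical operation request.
struct Op<'a> {
    op_type: &'a str,
    image: &'a str,
    target: &'a str,
}

impl PolicyDef {
    /// An operation is allowed only if all three patterns match.
    fn allows(&self, op: &Op) -> bool {
        glob_match(&self.op_type, op.op_type)
            && glob_match(&self.image_pattern, op.image)
            && glob_match(&self.target_pattern, op.target)
    }
}
```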

Status: Active Development


8. ops-controller (crates/ops-controller/)

NATS JetStream consumer that processes keeper-approved operations, calls the orchestrator, and enforces idempotency via SurrealDB.

Language: Rust

Purpose: Durable control plane execution with at-least-once delivery guarantees

Key Features:

  • Pull consumer on ops.pending.* JetStream stream
  • Ed25519 JWT verification of keeper-signed claims before dispatch
  • Idempotency check via SurrealDB; reconciles stale pending ops on startup
  • Orchestrator HTTP dispatch with structured AckResult (Ack/Nak/Term)
  • Audit emission (ops.audit.*) on every terminal outcome
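The flow described above (verify, check idempotency, dispatch, acknowledge) might be sketched as follows. The mapping from outcomes to Ack/Nak/Term is an assumption, not taken from the crate: an in-memory HashSet stands in for SurrealDB, a boolean stands in for JWT verification, and the Dispatch enum is a hypothetical summary of the orchestrator HTTP call.

```rust
use std::collections::HashSet;

/// Outcome reported back to JetStream, named after the AckResult mentioned above.
#[derive(Debug, PartialEq)]
enum AckResult {
    Ack,  // done, do not redeliver
    Nak,  // transient failure, redeliver later
    Term, // permanent failure, never redeliver
}

/// Hypothetical result of the orchestrator HTTP call.
enum Dispatch {
    Ok,
    TransientError,
    Rejected,
}

/// Decide how to acknowledge one pending op.
/// `seen` stands in for the SurrealDB idempotency store.
fn process(
    op_id: &str,
    signature_valid: bool,
    seen: &mut HashSet<String>,
    dispatch: impl FnOnce() -> Dispatch,
) -> AckResult {
    if !signature_valid {
        return AckResult::Term; // unsigned or tampered claims are never retried
    }
    if seen.contains(op_id) {
        return AckResult::Ack; // redelivery of an op that already ran
    }
    match dispatch() {
        Dispatch::Ok => {
            seen.insert(op_id.to_string());
            AckResult::Ack
        }
        Dispatch::TransientError => AckResult::Nak,
        Dispatch::Rejected => AckResult::Term,
    }
}
```

Because delivery is at-least-once, the idempotency check is what keeps a redelivered message from reaching the orchestrator a second time.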

Status: Active Development

ADR: ADR-038 (ops control plane design)


9. audit-mirror (crates/audit-mirror/)

Sidecar that consumes ops.audit.* NATS events and mirrors each event as a signed git commit into a Radicle repository.

Language: Rust

Purpose: Immutable, content-addressed audit trail via Radicle git storage

Key Features:

  • NATS JetStream pull consumer on ops.audit.*
  • JTI deduplication — skips already-committed event IDs via git log scan
  • commit_writer creates signed commits with the audit payload as the blob
  • radicle_publish announces the repo to the Radicle network after each commit
  • Configurable via CLI flags (NATS URL, workspace, Radicle repo path, key path)
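The JTI deduplication step can be pictured roughly like this. The sketch assumes each audit commit message carries a line of the form `jti: <id>`; that format is hypothetical, since the real commit layout is not documented here, and the git log text is passed in as a plain string instead of being read through git2.

```rust
use std::collections::HashSet;

/// Collect JTIs from git log output, assuming a hypothetical `jti: <id>` line
/// in each commit message.
fn committed_jtis(log: &str) -> HashSet<String> {
    log.lines()
        .filter_map(|line| line.trim().strip_prefix("jti: "))
        .map(|id| id.trim().to_string())
        .collect()
}

/// Keep only events whose JTI has not been committed yet. Order is preserved
/// and duplicates inside the incoming batch are dropped as well.
fn new_events<'a>(events: &[&'a str], log: &str) -> Vec<&'a str> {
    let mut seen = committed_jtis(log);
    events
        .iter()
        .copied()
        .filter(|jti| seen.insert(jti.to_string()))
        .collect()
}
```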

Status: Active Development

ADR: ADR-038


10. API Gateway (infrastructure/api-gateway/)

Unified REST API gateway for external integration.

Language: Rust

Purpose: API routing, authentication, and rate limiting

Key Features:

  • Request routing to backend services
  • Authentication and authorization
  • Rate limiting and throttling
  • API versioning

Status: 🔄 Planned


11. Extension Registry (crates/extension-registry/)

Registry and catalog for browsing, discovering, and distributing extensions.

Language: Rust

Purpose: Extension discovery, metadata management, and OCI/Forgejo-backed distribution

Status: Active Development


12. contract-tests (crates/contract-tests/)

G3 contract test suite — verifies semantic equivalence across the CLI↔HTTP↔MCP tier stack.

Language: Rust (test crate)

Purpose: Prevent drift between registry, HTTP daemon, and MCP server response shapes

Key Features:

  • Tier A: direct registry invocation (reference baseline)
  • Tier B: axum HTTP server on 127.0.0.1:0 (ephemeral port)
  • Tier C: in-process MCP handle_request
  • Normaliser strips volatile fields (trace_id, timestamp) — asserts semantic, not byte-for-byte equality
  • JSON schema validation against listing_output_schema on every tier
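The normaliser idea can be shown with a simplified sketch. A flat string map stands in for a JSON response here (the real suite presumably works on JSON values), and the function names are hypothetical. Only the two volatile fields named above are stripped.

```rust
use std::collections::BTreeMap;

/// Flat key/value stand-in for a JSON response body.
type Response = BTreeMap<String, String>;

/// Fields that legitimately differ between tiers and runs.
const VOLATILE_FIELDS: [&str; 2] = ["trace_id", "timestamp"];

/// Remove volatile fields so that only semantic content remains.
fn normalise(mut response: Response) -> Response {
    for field in VOLATILE_FIELDS.iter() {
        response.remove(*field);
    }
    response
}

/// Two tier responses are equivalent if they agree after normalisation.
fn semantically_equal(a: &Response, b: &Response) -> bool {
    normalise(a.clone()) == normalise(b.clone())
}
```

A contract test would then compare the Tier A baseline against the Tier B and Tier C responses with semantically_equal instead of comparing raw bytes.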

Status: Active Development


13. ncl-sync (crates/ncl-sync/)

Nickel configuration sync daemon — compiles NCL to JSON proactively and maintains a shared cache for all Nu processes.

Language: Rust

Purpose: Eliminate nickel export latency (~25s per call) from CLI commands by pre-compiling NCL files and serving results from an in-memory-backed file cache.

Key Features:

  • File watcher (notify) on workspace NCL directories — re-exports on change automatically
  • Warm-up on prvng platform start, so the first command of the day already finds the cache hot
  • Shared cache at ~/.cache/provisioning/config-cache/ used by both this daemon and nu_plugin_nickel
  • Content-addressed keys: SHA256(file_content + sorted_import_paths + format) — identical to plugin key strategy, zero coordination overhead
  • Post-operation sync: Nu writes .sync-<pid>.json sidecar after mutations; daemon re-exports within 500 ms
  • Configurable via platform/config/ncl-sync.ncl (idle timeout, concurrency, poll interval)
  • No NATS, no SurrealDB, no platform service dependencies — intentional to avoid bootstrap circularity
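The content-addressed key can be illustrated with a short sketch. Note the substitution: the daemon is described as using SHA256, which is not in the Rust standard library, so this example uses std's DefaultHasher purely as a dependency-free stand-in. The function name and signature are hypothetical. The point is the shape of the key: import paths are sorted before hashing, so their order on the command line does not change the result.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derive a cache key from file content, import paths, and output format.
/// NOTE: DefaultHasher is a stand-in for SHA256 and yields a different,
/// much shorter digest than the real daemon or plugin would.
fn cache_key(content: &[u8], import_paths: &[&str], output_format: &str) -> String {
    // Sort a copy of the paths so that argument order does not matter.
    let mut sorted: Vec<&str> = import_paths.to_vec();
    sorted.sort_unstable();

    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    sorted.hash(&mut hasher);
    output_format.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}
```

Because the key depends only on these inputs, two processes that compute it independently reach the same cache entry without talking to each other.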

Status: Production Ready

Install:

cargo build --release --package ncl-sync
install -m 0755 target/release/provisioning-ncl-sync ~/.local/bin/provisioning-ncl-sync

Usage:

# Start daemon for a workspace
ncl-sync daemon --workspace ~/workspaces/libre-daoshi

# One-shot warm-up
ncl-sync warm ~/workspaces/libre-daoshi

# Evict a specific file from cache
ncl-sync invalidate settings.ncl

# Print cache key (parity testing)
ncl-sync key settings.ncl --import-path /ws --import-path /prov

# Cache statistics
ncl-sync stats

Lifecycle integration: Started automatically by prvng platform start, stopped by prvng platform stop. Status visible in prvng platform status.

Performance impact (with warm cache):

| Command | Before | After |
|---|---|---|
| prvng component list | ~37 s | ~1.5 s |
| prvng workflow list | ~35 s | ~1.5 s |
| prvng deploy | ~1530 s | ~35 s |

Configuration (platform/config/ncl-sync.ncl):

{
  ncl_sync = {
    idle_timeout_secs = 600,     # daemon auto-shutdown after N seconds idle
    sync_poll_interval_ms = 500, # how often to check for sync-request sidecars
    warm_concurrency = 4,        # max parallel nickel export during warm-up
    extra_import_paths = [],     # additional import paths beyond workspace + $PROVISIONING
  }
}

ADRs: ADR-022 (daemon design), ADR-023 (Nu wrapper strategy)


Supporting Services

CoreDNS (config/coredns/)

DNS service configuration for cluster environments.

Purpose: Service discovery and DNS resolution

Status: Configuration Ready


Monitoring (infrastructure/monitoring/)

Observability and monitoring infrastructure.

Purpose: Metrics, logging, and alerting

Components:

  • Prometheus configuration
  • Grafana dashboards
  • Alert rules

Status: Configuration Ready


Nginx (infrastructure/nginx/)

Reverse proxy and load balancer configurations.

Purpose: HTTP routing and SSL termination

Status: Configuration Ready


Docker Compose (infrastructure/docker/)

Docker Compose configurations for local development.

Purpose: Quick local platform deployment

Status: Ready for Development


Systemd (infrastructure/systemd/)

Systemd service units for platform services.

Purpose: Production deployment with systemd

Status: Ready for Production


Architecture

┌──────────────────────────────────────────────────────────────┐
│                   User Interfaces                            │
│  • CLI (provisioning command)                                │
│  • Web UI (Control Center UI)                                │
│  • API Clients                                               │
└──────────────────────────────────────────────────────────────┘
                             ↓
┌──────────────────────────────────────────────────────────────┐
│                   API Gateway                                │
│  • Request Routing                                           │
│  • Authentication & Authorization                            │
│  • Rate Limiting                                             │
└──────────────────────────────────────────────────────────────┘
                             ↓
┌──────────────────────────────────────────────────────────────┐
│              Platform Services Layer                         │
│                                                              │
│   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│   │ Orchestrator │  │ Control Ctr  │  │ MCP Server   │       │
│   │   (Rust)     │  │   (Rust)     │  │ (Nushell)    │       │
│   └──────────────┘  └──────────────┘  └──────────────┘       │
│                                                              │
│   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│   │  Installer   │  │ Extension    │  │  ops-keeper  │       │
│   │  (Rust/Nu)   │  │  Registry    │  │  (Rust)      │       │
│   └──────────────┘  └──────────────┘  └──────────────┘       │
│                                                              │
│   ┌──────────────┐  ┌──────────────┐                         │
│   │ops-controller│  │ audit-mirror │                         │
│   │   (Rust)     │  │   (Rust)     │                         │
│   └──────────────┘  └──────────────┘                         │
│                                                              │
│   ┌──────────────────────────────────────────────────────┐   │
│   │  ncl-sync daemon  (Rust)                             │   │
│   │  ~/.cache/provisioning/config-cache/  ←→  Nu procs  │   │
│   └──────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────┘
                             ↓
┌──────────────────────────────────────────────────────────────┐
│                Data & State Layer                            │
│  • NATS JetStream (ops.pending.*, ops.audit.*, TASKS)        │
│  • SurrealDB (State Management, Idempotency)                 │
│  • Radicle (Immutable Audit Log via git)                     │
│  • File-based Persistence (Checkpoints)                      │
└──────────────────────────────────────────────────────────────┘

Technology Stack

Primary Languages

| Language | Usage | Services |
|---|---|---|
| Rust | Platform services, performance layer | Orchestrator, Control Center, Installer, API Gateway |
| Nushell | Scripting, automation, MCP integration | MCP Server, Installer scripts |
| Web | Frontend interfaces | Control Center UI |

Key Dependencies

  • tokio - Async runtime for Rust services
  • axum - Web framework (control-center, orchestrator, provisioning-daemon)
  • async-nats - NATS JetStream client (ops-keeper, ops-controller, audit-mirror, control-center)
  • surrealdb - State management and idempotency store
  • serde - Serialization/deserialization
  • ratatui - Terminal UI framework (installer)
  • git2 - Radicle git integration (audit-mirror)
  • jsonwebtoken - Ed25519 JWT signing/verification (ops-keeper, ops-controller)

Deployment Modes

1. Development Mode

# Docker Compose for local development
docker-compose -f infrastructure/docker/dev.yml up

2. Production Mode (Systemd)

# Install systemd units
sudo cp infrastructure/systemd/*.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now provisioning-orchestrator
sudo systemctl enable --now provisioning-control-center

3. Kubernetes Deployment

# Deploy platform services to Kubernetes
kubectl apply -f k8s/

Security Features

Enterprise Security Stack

  1. Authorization & Permissions (Control Center)

    • Role-Based Access Control (RBAC)
    • Fine-grained permissions
    • Audit logging
  2. Authentication

    • API key management
    • Session management
    • Token-based auth (JWT)
  3. Secrets Management

    • Integration with SOPS/Age
    • Cosmian KMS support
    • Secure configuration storage
  4. Policy Enforcement

    • Cedar policy engine integration
    • Compliance checking
    • Anomaly detection

Getting Started

Prerequisites

  • Rust - Latest stable (for building platform services)
  • Nushell 0.107.1+ - For MCP server and scripts
  • Docker (optional) - For containerized deployment
  • Kubernetes (optional) - For K8s deployment

Building Platform Services

# Build the core Rust services
cd orchestrator && cargo build --release
cd ../control-center && cargo build --release
cd ../installer && cargo build --release

Running Services

# Start orchestrator (returns immediately with --background)
cd orchestrator
./scripts/start-orchestrator.nu --background

# Start control center (runs in the foreground)
cd ../control-center
cargo run --release

# Start MCP server
cd ../mcp-server
nu run.nu

Development

Project Structure

platform/
├── crates/
│   ├── orchestrator/       # Rust orchestrator service
│   ├── control-center/     # Rust control center backend
│   ├── control-center-ui/  # Web frontend
│   ├── mcp-server/         # Nushell MCP server
│   ├── ncl-sync/           # Nickel config sync daemon
│   └── ...
├── config/
│   ├── ncl-sync.ncl        # ncl-sync daemon configuration
│   └── external-services.ncl
├── infrastructure/
│   ├── api-gateway/        # Rust API gateway (planned)
│   ├── oci-registry/       # OCI registry (planned)
│   ├── docker/             # Docker Compose configs
│   ├── systemd/            # Systemd units
│   └── ...
└── docs/                   # Platform documentation

Adding New Services

  1. Create service directory in platform/
  2. Add README.md with service description
  3. Implement service following architecture patterns
  4. Add tests and documentation
  5. Update platform/README.md (this file)
  6. Add deployment configurations (docker-compose, k8s, systemd)

Integration with Provisioning

Platform services integrate with the other parts of the Provisioning system:

  • Core Engine (../core/) provides CLI and libraries
  • Extensions (../extensions/) provide providers, taskservs, clusters
  • Platform Services (this directory) provide execution and management
  • Configuration (../kcl/, ../config/) defines infrastructure

Documentation

Platform Documentation

API Documentation

  • REST API Reference: docs/api/ (when orchestrator is running)
  • MCP Tools Reference: mcp-server/docs/

Architecture Documentation


Contributing

When contributing to platform services:

  1. Follow Rust Best Practices - Idiomatic Rust, proper error handling
  2. Security First - Always consider security implications
  3. Performance Matters - Platform services are performance-critical
  4. Document APIs - All REST endpoints must be documented
  5. Add Tests - Unit tests and integration tests required
  6. Update Docs - Keep README and API docs current

Status Legend

  • Production Ready - Fully implemented and tested
  • Active Development - Working implementation, ongoing improvements
  • Configuration Ready - Configuration files ready for deployment
  • 🔄 Planned - Design phase, implementation pending
  • 🔄 In Development - Early implementation stage

Support

For platform service issues:

  • Check service-specific README in service directory
  • Review logs: journalctl -u provisioning-* (systemd)
  • API documentation: http://localhost:8080/docs (when running)
  • See Provisioning project for general support

Maintained By: Platform Team
Last Updated: 2026-05-12
Platform Version: 3.6.0