Provisioning Platform - Architecture Overview

Version: 3.5.0
Date: 2025-10-06
Status: Production
Maintainers: Architecture Team


Table of Contents

  1. Executive Summary
  2. System Architecture
  3. Component Architecture
  4. Mode Architecture
  5. Network Architecture
  6. Data Architecture
  7. Security Architecture
  8. Deployment Architecture
  9. Integration Architecture
  10. Performance and Scalability
  11. Evolution and Roadmap

Executive Summary

What is the Provisioning Platform?

The Provisioning Platform is a cloud-native infrastructure automation system built from three parts: declarative configuration in KCL, business logic scripted in Nushell, and a high-performance coordination layer written in Rust.

Key Characteristics

  • Hybrid Architecture: Rust for coordination, Nushell for business logic, KCL for configuration
  • Mode-Based: Adapts from solo development to enterprise production
  • OCI-Native: Extensions are packaged and distributed as industry-standard OCI artifacts
  • Provider-Agnostic: Supports multiple cloud providers (AWS, UpCloud) and local infrastructure
  • Extension-Driven: Core functionality enhanced through modular extensions

Architecture at a Glance

┌─────────────────────────────────────────────────────────────────────┐
│                        Provisioning Platform                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                       │
│   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐             │
│   │ User Layer   │  │ Extension    │  │ Service      │             │
│   │  (CLI/UI)    │  │ Registry     │  │ Registry     │             │
│   └──────┬───────┘  └──────┬───────┘  └──────┬───────┘             │
│          │                  │                  │                      │
│   ┌──────┴──────────────────┴──────────────────┴───────┐             │
│   │            Core Provisioning Engine                 │             │
│   │  (Config | Dependency Resolution | Workflows)       │             │
│   └──────┬──────────────────────────────────────┬───────┘             │
│          │                                       │                      │
│   ┌──────┴─────────┐                   ┌───────┴──────────┐           │
│   │  Orchestrator  │                   │   Business Logic │           │
│   │    (Rust)      │ ←─ Coordination → │    (Nushell)    │           │
│   └──────┬─────────┘                   └───────┬──────────┘           │
│          │                                       │                      │
│   ┌──────┴───────────────────────────────────────┴──────┐             │
│   │              Extension System                        │             │
│   │  (Providers | Task Services | Clusters)             │             │
│   └──────┬───────────────────────────────────────────────┘             │
│          │                                                              │
│   ┌──────┴───────────────────────────────────────────────────┐        │
│   │        Infrastructure (Cloud | Local | Kubernetes)        │        │
│   └───────────────────────────────────────────────────────────┘        │
│                                                                          │
└─────────────────────────────────────────────────────────────────────┘

Key Metrics

Metric              | Value       | Description
--------------------|-------------|-------------------------------------------
Codebase Size       | ~50,000 LOC | Nushell (60%), Rust (30%), KCL (10%)
Extensions          | 100+        | Providers, taskservs, clusters
Supported Providers | 3           | AWS, UpCloud, Local
Task Services       | 50+         | Kubernetes, databases, monitoring, etc.
Deployment Modes    | 5           | Binary, Docker, Docker Compose, K8s, Remote
Operational Modes   | 4           | Solo, Multi-user, CI/CD, Enterprise
API Endpoints       | 80+         | REST, WebSocket, GraphQL (planned)

System Architecture

High-Level Architecture

┌────────────────────────────────────────────────────────────────────────────┐
│                         PRESENTATION LAYER                                  │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────┐  ┌────────────┐     │
│  │  CLI (Nu)   │  │ Control      │  │  REST API    │  │  MCP       │     │
│  │             │  │ Center (Yew) │  │  Gateway     │  │  Server    │     │
│  └─────────────┘  └──────────────┘  └──────────────┘  └────────────┘     │
│                                                                              │
└──────────────────────────────────┬─────────────────────────────────────────┘
                                   │
┌──────────────────────────────────┴─────────────────────────────────────────┐
│                         CORE LAYER                                           │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌──────────────────────────────────────────────────────────────────┐      │
│  │               Configuration Management                            │      │
│  │   (KCL Schemas | TOML Config | Hierarchical Loading)            │      │
│  └──────────────────────────────────────────────────────────────────┘      │
│                                                                              │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐         │
│  │   Dependency     │  │   Module/Layer   │  │   Workspace      │         │
│  │   Resolution     │  │     System       │  │   Management     │         │
│  └──────────────────┘  └──────────────────┘  └──────────────────┘         │
│                                                                              │
│  ┌──────────────────────────────────────────────────────────────────┐      │
│  │                  Workflow Engine                                  │      │
│  │   (Batch Operations | Checkpoints | Rollback)                    │      │
│  └──────────────────────────────────────────────────────────────────┘      │
│                                                                              │
└──────────────────────────────────┬─────────────────────────────────────────┘
                                   │
┌──────────────────────────────────┴─────────────────────────────────────────┐
│                      ORCHESTRATION LAYER                                     │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌──────────────────────────────────────────────────────────────────┐      │
│  │                Orchestrator (Rust)                                │      │
│  │   • Task Queue (File-based persistence)                          │      │
│  │   • State Management (Checkpoints)                               │      │
│  │   • Health Monitoring                                             │      │
│  │   • REST API (HTTP/WS)                                           │      │
│  └──────────────────────────────────────────────────────────────────┘      │
│                                                                              │
│  ┌──────────────────────────────────────────────────────────────────┐      │
│  │           Business Logic (Nushell)                                │      │
│  │   • Provider operations (AWS, UpCloud, Local)                    │      │
│  │   • Server lifecycle (create, delete, configure)                 │      │
│  │   • Taskserv installation (50+ services)                         │      │
│  │   • Cluster deployment                                            │      │
│  └──────────────────────────────────────────────────────────────────┘      │
│                                                                              │
└──────────────────────────────────┬─────────────────────────────────────────┘
                                   │
┌──────────────────────────────────┴─────────────────────────────────────────┐
│                      EXTENSION LAYER                                         │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌────────────────┐  ┌──────────────────┐  ┌───────────────────┐          │
│  │   Providers    │  │   Task Services  │  │    Clusters       │          │
│  │   (3 types)    │  │   (50+ types)    │  │   (10+ types)     │          │
│  │                │  │                  │  │                   │          │
│  │  • AWS         │  │  • Kubernetes    │  │  • Buildkit       │          │
│  │  • UpCloud     │  │  • Containerd    │  │  • Web cluster    │          │
│  │  • Local       │  │  • Databases     │  │  • CI/CD          │          │
│  │                │  │  • Monitoring    │  │                   │          │
│  └────────────────┘  └──────────────────┘  └───────────────────┘          │
│                                                                              │
│  ┌──────────────────────────────────────────────────────────────────┐      │
│  │            Extension Distribution (OCI Registry)                  │      │
│  │   • Zot (local development)                                      │      │
│  │   • Harbor (multi-user/enterprise)                               │      │
│  └──────────────────────────────────────────────────────────────────┘      │
│                                                                              │
└──────────────────────────────────┬─────────────────────────────────────────┘
                                   │
┌──────────────────────────────────┴─────────────────────────────────────────┐
│                      INFRASTRUCTURE LAYER                                    │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌────────────────┐  ┌──────────────────┐  ┌───────────────────┐          │
│  │  Cloud (AWS)   │  │ Cloud (UpCloud)  │  │  Local (Docker)   │          │
│  │                │  │                  │  │                   │          │
│  │  • EC2         │  │  • Servers       │  │  • Containers     │          │
│  │  • EKS         │  │  • LoadBalancer  │  │  • Local K8s      │          │
│  │  • RDS         │  │  • Networking    │  │  • Processes      │          │
│  └────────────────┘  └──────────────────┘  └───────────────────┘          │
│                                                                              │
└────────────────────────────────────────────────────────────────────────────┘

Multi-Repository Architecture

The system is organized into three separate repositories:

provisioning-core

Core system functionality
├── CLI interface (Nushell entry point)
├── Core libraries (lib_provisioning)
├── Base KCL schemas
├── Configuration system
├── Workflow engine
└── Build/distribution tools

Distribution: oci://registry/provisioning-core:v3.5.0

provisioning-extensions

All provider, taskserv, and cluster extensions
├── providers/
│   ├── aws/
│   ├── upcloud/
│   └── local/
├── taskservs/
│   ├── kubernetes/
│   ├── containerd/
│   ├── postgres/
│   └── (50+ more)
└── clusters/
    ├── buildkit/
    ├── web/
    └── (10+ more)

Distribution: Each extension is published as a separate OCI artifact, for example:

  • oci://registry/provisioning-extensions/kubernetes:1.28.0
  • oci://registry/provisioning-extensions/aws:2.0.0
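The artifact references above follow an `oci://registry/repository:tag` shape. A minimal sketch of splitting such a reference into its parts, assuming that shape only; `OciRef` and `parse_oci_ref` are hypothetical names for illustration, and digest references (`@sha256:...`) are deliberately not handled:

```rust
/// Parts of an `oci://registry/repository:tag` reference (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct OciRef<'a> {
    registry: &'a str,
    repository: &'a str,
    tag: &'a str,
}

/// Split a reference into registry, repository and tag.
/// Returns `None` when the scheme, repository or tag is missing.
fn parse_oci_ref(reference: &str) -> Option<OciRef<'_>> {
    let rest = reference.strip_prefix("oci://")?;
    // Everything before the first '/' is the registry (it may carry a port).
    let (registry, path) = rest.split_once('/')?;
    // The tag is searched for in the path only, so a registry port such as
    // `localhost:5000` is never mistaken for it.
    let (repository, tag) = path.rsplit_once(':')?;
    if registry.is_empty() || repository.is_empty() || tag.is_empty() {
        return None;
    }
    Some(OciRef { registry, repository, tag })
}

fn main() {
    let reference = "oci://registry/provisioning-extensions/kubernetes:1.28.0";
    match parse_oci_ref(reference) {
        Some(parsed) => println!("{parsed:?}"),
        None => eprintln!("not a tagged OCI reference: {reference}"),
    }
}
```

Splitting the registry off first keeps the tag lookup simple when the registry is addressed with a port, as with a local registry on `localhost:5000`.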

provisioning-platform

Platform services
├── orchestrator/      (Rust)
├── control-center/    (Rust/Yew)
├── mcp-server/        (Rust)
└── api-gateway/       (Rust)

Distribution: Docker images in OCI registry

  • oci://registry/provisioning-platform/orchestrator:v1.2.0

Component Architecture

Core Components

1. CLI Interface (Nushell)

Location: provisioning/core/cli/provisioning

Purpose: Primary user interface for all provisioning operations

Architecture:

Main CLI (211 lines)
    ↓
Command Dispatcher (264 lines)
    ↓
Domain Handlers (7 modules)
    ├── infrastructure.nu (117 lines)
    ├── orchestration.nu (64 lines)
    ├── development.nu (72 lines)
    ├── workspace.nu (56 lines)
    ├── generation.nu (78 lines)
    ├── utilities.nu (157 lines)
    └── configuration.nu (316 lines)

Key Features:

  • 80+ command shortcuts
  • Bi-directional help system
  • Centralized flag handling
  • Domain-driven design

2. Configuration System (KCL + TOML)

Hierarchical Loading:

1. System defaults     (config.defaults.toml)
2. User config         (~/.provisioning/config.user.toml)
3. Workspace config    (workspace/config/provisioning.yaml)
4. Environment config  (workspace/config/{env}-defaults.toml)
5. Infrastructure config (workspace/infra/{name}/config.toml)
6. Runtime overrides   (CLI flags, ENV variables)
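The loading order above amounts to a last-writer-wins merge: each later layer overrides keys set by an earlier one. A minimal sketch of that rule, assuming configuration flattened to dotted keys; the `Layer` alias, `merge_layers`, and the sample keys are hypothetical, and the platform's actual loader (which reads TOML, YAML and KCL) is not shown in this overview:

```rust
use std::collections::BTreeMap;

/// One configuration layer, flattened to dotted keys (an assumption of this sketch).
type Layer = BTreeMap<String, String>;

/// Merge layers in loading order; later layers override earlier ones.
fn merge_layers(layers: &[Layer]) -> Layer {
    let mut merged = Layer::new();
    for layer in layers {
        for (key, value) in layer {
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}

/// Build a layer from string pairs (demo helper).
fn layer(pairs: &[(&str, &str)]) -> Layer {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

fn main() {
    // Hypothetical keys and values, listed from lowest to highest priority.
    let system_defaults = layer(&[("provider", "local"), ("log.level", "info")]);
    let user_config = layer(&[("log.level", "debug")]);
    let runtime_overrides = layer(&[("provider", "upcloud")]);

    let merged = merge_layers(&[system_defaults, user_config, runtime_overrides]);
    for (key, value) in &merged {
        println!("{key} = {value}");
    }
}
```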

Variable Interpolation:

  • {{paths.base}} - Path references
  • {{env.HOME}} - Environment variables
  • {{now.date}} - Dynamic values
  • {{git.branch}} - Git context
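As an illustration of the `{{...}}` syntax listed above, here is a small sketch of placeholder substitution. It assumes a flat lookup table of already-resolved values; the function name `interpolate`, the sample values, and the choice to leave unknown placeholders untouched are assumptions of the sketch, not documented platform behavior:

```rust
use std::collections::HashMap;

/// Replace every `{{key}}` placeholder with its value from `vars`.
/// Unknown placeholders are left untouched so the caller can report them.
fn interpolate(template: &str, vars: &HashMap<&str, String>) -> String {
    let mut output = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        output.push_str(&rest[..start]);
        let after_open = &rest[start + 2..];
        match after_open.find("}}") {
            Some(end) => {
                let key = after_open[..end].trim();
                match vars.get(key) {
                    Some(value) => output.push_str(value),
                    None => {
                        // Keep the placeholder exactly as written.
                        output.push_str("{{");
                        output.push_str(&after_open[..end]);
                        output.push_str("}}");
                    }
                }
                rest = &after_open[end + 2..];
            }
            None => {
                // Unterminated placeholder: copy the remainder verbatim.
                output.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    output.push_str(rest);
    output
}

fn main() {
    // Hypothetical values; the platform resolves these from config, env and Git.
    let mut vars = HashMap::new();
    vars.insert("paths.base", "/opt/provisioning".to_string());
    vars.insert("git.branch", "main".to_string());

    let rendered = interpolate("{{paths.base}}/infra/{{git.branch}}", &vars);
    println!("{rendered}");
}
```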

3. Orchestrator (Rust)

Location: provisioning/platform/orchestrator/

Architecture:

src/
├── main.rs              // Entry point
├── api/
│   ├── routes.rs        // HTTP routes
│   ├── workflows.rs     // Workflow endpoints
│   └── batch.rs         // Batch endpoints
├── workflow/
│   ├── engine.rs        // Workflow execution
│   ├── state.rs         // State management
│   └── checkpoint.rs    // Checkpoint/recovery
├── task_queue/
│   ├── queue.rs         // File-based queue
│   ├── priority.rs      // Priority scheduling
│   └── retry.rs         // Retry logic
├── health/
│   └── monitor.rs       // Health checks
├── nushell/
│   └── bridge.rs        // Nu execution bridge
└── test_environment/    // Test env management
    ├── container_manager.rs
    ├── test_orchestrator.rs
    └── topologies.rs

Key Features:

  • File-based task queue (reliable, simple)
  • Checkpoint-based recovery
  • Priority scheduling
  • REST API (HTTP/WebSocket)
  • Nushell script execution bridge
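Priority scheduling can be pictured with a small in-memory queue. This is a sketch only: the orchestrator's queue persists tasks to files, whereas the hypothetical `TaskQueue` below keeps them in a `BinaryHeap`, and the assumption that equal-priority tasks run oldest-first is ours rather than something the overview states:

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// In-memory stand-in for the task queue (the real one is file-backed).
#[derive(Default)]
struct TaskQueue {
    // Ordered by priority, then by reversed sequence number so that
    // older tasks win among equal priorities.
    heap: BinaryHeap<(u8, Reverse<u64>, String)>,
    next_seq: u64,
}

impl TaskQueue {
    /// Enqueue a task; a higher `priority` runs first.
    fn push(&mut self, priority: u8, task: &str) {
        self.heap
            .push((priority, Reverse(self.next_seq), task.to_string()));
        self.next_seq += 1;
    }

    /// Dequeue the highest-priority task, oldest first among equals.
    fn pop(&mut self) -> Option<String> {
        self.heap.pop().map(|(_, _, task)| task)
    }
}

fn main() {
    let mut queue = TaskQueue::default();
    // Hypothetical task descriptions.
    queue.push(1, "taskserv install containerd");
    queue.push(5, "server create web-01");
    queue.push(5, "server create web-02");

    while let Some(task) = queue.pop() {
        println!("{task}");
    }
}
```

The sequence number is what keeps scheduling fair: without it, two tasks with the same priority would be ordered by their text.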

4. Workflow Engine (Nushell)

Location: provisioning/core/nulib/workflows/

Workflow Types:

workflows/
├── server_create.nu     // Server provisioning
├── taskserv.nu          // Task service management
├── cluster.nu           // Cluster deployment
├── batch.nu             // Batch operations
└── management.nu        // Workflow monitoring

Batch Workflow Features:

  • Provider-agnostic (mix AWS, UpCloud, local)
  • Dependency resolution (hard/soft dependencies)
  • Parallel execution (configurable limits)
  • Rollback support
  • Real-time monitoring
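To make "parallel execution (configurable limits)" concrete, the sketch below runs a list of tasks with at most `limit` in flight. It is written in Rust with the standard library only, although the workflows themselves are Nushell; `run_limited` is a hypothetical helper, and processing the list in fixed waves is a simplification (a production scheduler would likely start the next task as soon as any worker finishes):

```rust
use std::thread;

/// Run `work` over `tasks` with at most `limit` running at once.
/// Results are returned in input order.
fn run_limited<T, R, F>(tasks: &[T], limit: usize, work: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    let work = &work; // share one reference with every worker thread
    let mut results = Vec::with_capacity(tasks.len());
    // Simplest possible scheduler: process the list in waves of `limit` tasks.
    for wave in tasks.chunks(limit.max(1)) {
        let wave_results: Vec<R> = thread::scope(|scope| {
            // Spawn the whole wave first, then wait for all of it.
            let handles: Vec<_> = wave
                .iter()
                .map(|task| scope.spawn(move || work(task)))
                .collect();
            handles
                .into_iter()
                .map(|handle| handle.join().expect("worker panicked"))
                .collect()
        });
        results.extend(wave_results);
    }
    results
}

fn main() {
    // Hypothetical server names.
    let servers = ["web-01", "web-02", "db-01", "cache-01", "lb-01"];
    let created = run_limited(&servers, 2, |name| format!("created {name}"));
    for line in created {
        println!("{line}");
    }
}
```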

5. Extension System

Extension Types:

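Type          | Count | Purpose                    | Example
--------------|-------|----------------------------|----------------------
Providers     | 3     | Cloud platform integration | AWS, UpCloud, Local
Task Services | 50+   | Infrastructure components  | Kubernetes, Postgres
Clusters      | 10+   | Complete configurations    | Buildkit, Web cluster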

Extension Structure:

extension-name/
├── kcl/
│   ├── kcl.mod              // KCL dependencies
│   ├── {name}.k             // Main schema
│   ├── version.k            // Version management
│   └── dependencies.k       // Dependencies
├── scripts/
│   ├── install.nu           // Installation logic
│   ├── check.nu             // Health check
│   └── uninstall.nu         // Cleanup
├── templates/               // Config templates
├── docs/                    // Documentation
├── tests/                   // Extension tests
└── manifest.yaml            // Extension metadata

OCI Distribution: Each extension is packaged as an OCI artifact containing:

  • KCL schemas
  • Nushell scripts
  • Templates
  • Documentation
  • Manifest

6. Module and Layer System

Module System:

# Discover available extensions
provisioning module discover taskservs

# Load into workspace
provisioning module load taskserv my-workspace kubernetes containerd

# List loaded modules
provisioning module list taskserv my-workspace

Layer System (Configuration Inheritance):

Layer 1: Core     (provisioning/extensions/{type}/{name})
    ↓
Layer 2: Workspace (workspace/extensions/{type}/{name})
    ↓
Layer 3: Infrastructure (workspace/infra/{infra}/extensions/{type}/{name})

Resolution Priority: Infrastructure → Workspace → Core
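The resolution priority can be read as "first match wins" over three candidate directories. A minimal sketch, assuming the path templates shown in the layer list; `candidate_paths` and `resolve_extension` are hypothetical names, and the existence check is injected as a closure so the logic can be exercised without touching disk:

```rust
use std::path::{Path, PathBuf};

/// Candidate directories for an extension, highest priority first:
/// Infrastructure, then Workspace, then Core.
fn candidate_paths(infra: &str, kind: &str, name: &str) -> Vec<PathBuf> {
    vec![
        Path::new("workspace/infra")
            .join(infra)
            .join("extensions")
            .join(kind)
            .join(name),
        Path::new("workspace/extensions").join(kind).join(name),
        Path::new("provisioning/extensions").join(kind).join(name),
    ]
}

/// Return the first candidate accepted by `exists`.
fn resolve_extension<F>(candidates: &[PathBuf], exists: F) -> Option<&PathBuf>
where
    F: Fn(&Path) -> bool,
{
    candidates.iter().find(|path| exists(path.as_path()))
}

fn main() {
    // "prod" is a hypothetical infrastructure name.
    let candidates = candidate_paths("prod", "taskservs", "kubernetes");

    // Real code would pass `|p| p.is_dir()`; here we pretend that only the
    // workspace layer carries an override.
    let found = resolve_extension(&candidates, |p| p.starts_with("workspace/extensions"));
    println!("{found:?}");
}
```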

7. Dependency Resolution

Algorithm: Topological sort with cycle detection

Features:

  • Hard dependencies (must exist)
  • Soft dependencies (optional enhancement)
  • Conflict detection
  • Circular dependency prevention
  • Version compatibility checking

Example:

import provisioning.dependencies as schema

_dependencies = schema.TaskservDependencies {
    name = "kubernetes"
    version = "1.28.0"
    requires = ["containerd", "etcd", "os"]
    optional = ["cilium", "helm"]
    conflicts = ["docker", "podman"]
}
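The algorithm named above, topological sort with cycle detection, can be sketched with Kahn's algorithm. The sketch covers hard dependencies (`requires`) only; soft dependencies, conflicts and version checks are left out. `install_order` is a hypothetical function, and the `containerd -> os` edge in the demo is invented for illustration:

```rust
use std::collections::{BTreeMap, VecDeque};

/// Order taskservs so every hard dependency comes before its dependents.
/// Returns `Err` with the names that remain blocked when a cycle exists.
fn install_order<'a>(
    requires: &BTreeMap<&'a str, Vec<&'a str>>,
) -> Result<Vec<&'a str>, Vec<&'a str>> {
    // Number of unmet dependencies per node, and the reverse edges.
    let mut pending: BTreeMap<&'a str, usize> = BTreeMap::new();
    let mut dependents: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for (&name, deps) in requires {
        pending.entry(name).or_insert(0);
        for &dep in deps {
            pending.entry(dep).or_insert(0);
            *pending.entry(name).or_insert(0) += 1;
            dependents.entry(dep).or_default().push(name);
        }
    }

    // Start with everything that has no unmet dependency.
    let mut ready: VecDeque<&'a str> = pending
        .iter()
        .filter(|(_, count)| **count == 0)
        .map(|(name, _)| *name)
        .collect();

    let mut order = Vec::with_capacity(pending.len());
    while let Some(name) = ready.pop_front() {
        order.push(name);
        if let Some(children) = dependents.get(name) {
            for &child in children {
                let count = pending.get_mut(child).expect("child was registered above");
                *count -= 1;
                if *count == 0 {
                    ready.push_back(child);
                }
            }
        }
    }

    if order.len() == pending.len() {
        Ok(order)
    } else {
        // Anything with unmet dependencies is part of, or blocked by, a cycle.
        Err(pending
            .into_iter()
            .filter(|&(_, count)| count > 0)
            .map(|(name, _)| name)
            .collect())
    }
}

fn main() {
    let mut requires = BTreeMap::new();
    requires.insert("kubernetes", vec!["containerd", "etcd", "os"]);
    requires.insert("containerd", vec!["os"]); // hypothetical edge for the demo

    match install_order(&requires) {
        Ok(order) => println!("install order: {order:?}"),
        Err(blocked) => eprintln!("dependency cycle involving: {blocked:?}"),
    }
}
```

Using ordered maps keeps the result deterministic: independent taskservs come out in alphabetical order.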

8. Service Management

Supported Services:

Service        | Type           | Category      | Purpose
---------------|----------------|---------------|--------------------------
orchestrator   | Platform       | Orchestration | Workflow coordination
control-center | Platform       | UI            | Web management interface
coredns        | Infrastructure | DNS           | Local DNS resolution
gitea          | Infrastructure | Git           | Self-hosted Git service
oci-registry   | Infrastructure | Registry      | OCI artifact storage
mcp-server     | Platform       | API           | Model Context Protocol
api-gateway    | Platform       | API           | Unified API access

Lifecycle Management:

# Start all auto-start services
provisioning platform start

# Start specific service (with dependencies)
provisioning platform start orchestrator

# Check health
provisioning platform health

# View logs
provisioning platform logs orchestrator --follow

9. Test Environment Service

Architecture:

User Command (CLI)
    ↓
Test Orchestrator (Rust)
    ↓
Container Manager (bollard)
    ↓
Docker API
    ↓
Isolated Test Containers

Test Types:

  • Single taskserv testing
  • Server simulation (multiple taskservs)
  • Multi-node cluster topologies

Topology Templates:

  • kubernetes_3node - 3-node HA cluster
  • kubernetes_single - All-in-one K8s
  • etcd_cluster - 3-node etcd
  • postgres_redis - Database stack

Mode Architecture

Mode-Based System Overview

The platform supports four operational modes that adapt the system from individual development to enterprise production.

Mode Comparison

┌───────────────────────────────────────────────────────────────────────┐
│                        MODE ARCHITECTURE                               │
├───────────────┬───────────────┬───────────────┬───────────────────────┤
│    SOLO       │  MULTI-USER   │    CI/CD      │    ENTERPRISE         │
├───────────────┼───────────────┼───────────────┼───────────────────────┤
│               │               │               │                        │
│  Single Dev   │  Team (5-20)  │  Pipelines    │  Production           │
│               │               │               │                        │
│  ┌─────────┐ │ ┌──────────┐  │ ┌──────────┐  │ ┌──────────────────┐  │
│  │ No Auth │ │ │Token(JWT)│  │ │Token(1h) │  │ │  mTLS (TLS 1.3) │  │
│  └─────────┘ │ └──────────┘  │ └──────────┘  │ └──────────────────┘  │
│               │               │               │                        │
│  ┌─────────┐ │ ┌──────────┐  │ ┌──────────┐  │ ┌──────────────────┐  │
│  │ Local   │ │ │ Remote   │  │ │ Remote   │  │ │ Kubernetes (HA) │  │
│  │ Binary  │ │ │ Docker   │  │ │ K8s      │  │ │ Multi-AZ        │  │
│  └─────────┘ │ └──────────┘  │ └──────────┘  │ └──────────────────┘  │
│               │               │               │                        │
│  ┌─────────┐ │ ┌──────────┐  │ ┌──────────┐  │ ┌──────────────────┐  │
│  │ Local   │ │ │ OCI (Zot)│  │ │OCI(Harbor│  │ │ OCI (Harbor HA) │  │
│  │ Files   │ │ │ or Harbor│  │ │ required)│  │ │ + Replication   │  │
│  └─────────┘ │ └──────────┘  │ └──────────┘  │ └──────────────────┘  │
│               │               │               │                        │
│  ┌─────────┐ │ ┌──────────┐  │ ┌──────────┐  │ ┌──────────────────┐  │
│  │ None    │ │ │ Gitea    │  │ │ Disabled │  │ │ etcd (mandatory) │  │
│  │         │ │ │(optional)│  │ │stateless │  │ │                  │  │
│  └─────────┘ │ └──────────┘  │ └──────────┘  │ └──────────────────┘  │
│               │               │               │                        │
│  Unlimited    │ 10 srv, 32   │ 5 srv, 16    │ 20 srv, 64 cores     │
│               │ cores, 128GB  │ cores, 64GB   │ 256GB per user       │
│               │               │               │                        │
└───────────────┴───────────────┴───────────────┴───────────────────────┘
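The last row of the comparison gives per-mode resource limits. A small sketch of how such limits could be checked, using the numbers from the diagram; the `Mode` and `Quota` types and the `allows` method are hypothetical, and the diagram does not say how the platform enforces these limits:

```rust
/// Operational modes from the comparison above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Solo,
    MultiUser,
    CiCd,
    Enterprise,
}

/// Resource ceiling as listed in the diagram (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Quota {
    servers: u32,
    cores: u32,
    memory_gb: u32,
}

impl Mode {
    /// Limits from the diagram; `None` means unlimited (solo mode).
    fn quota(self) -> Option<Quota> {
        match self {
            Mode::Solo => None,
            Mode::MultiUser => Some(Quota { servers: 10, cores: 32, memory_gb: 128 }),
            Mode::CiCd => Some(Quota { servers: 5, cores: 16, memory_gb: 64 }),
            Mode::Enterprise => Some(Quota { servers: 20, cores: 64, memory_gb: 256 }),
        }
    }

    /// True when `requested` fits inside this mode's quota.
    fn allows(self, requested: Quota) -> bool {
        match self.quota() {
            None => true,
            Some(limit) => {
                requested.servers <= limit.servers
                    && requested.cores <= limit.cores
                    && requested.memory_gb <= limit.memory_gb
            }
        }
    }
}

fn main() {
    let request = Quota { servers: 8, cores: 24, memory_gb: 96 };
    for mode in [Mode::Solo, Mode::MultiUser, Mode::CiCd, Mode::Enterprise] {
        println!("{mode:?}: allowed = {}", mode.allows(request));
    }
}
```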

Mode Configuration

Mode Templates: workspace/config/modes/{mode}.yaml

Active Mode: ~/.provisioning/config/active-mode.yaml

Switching Modes:

# Check current mode
provisioning mode current

# Switch to another mode
provisioning mode switch multi-user

# Validate mode requirements
provisioning mode validate enterprise

Mode-Specific Workflows

Solo Mode

# 1. Default mode, no setup needed
provisioning workspace init

# 2. Start local orchestrator
provisioning platform start orchestrator

# 3. Create infrastructure
provisioning server create

Multi-User Mode

# 1. Switch mode and authenticate
provisioning mode switch multi-user
provisioning auth login

# 2. Lock workspace
provisioning workspace lock my-infra

# 3. Pull extensions from OCI
provisioning extension pull upcloud kubernetes

# 4. Work...

# 5. Unlock workspace
provisioning workspace unlock my-infra

CI/CD Mode

# GitLab CI
deploy:
  stage: deploy
  script:
    - export PROVISIONING_MODE=cicd
    - echo "$TOKEN" > /var/run/secrets/provisioning/token
    - provisioning validate --all
    - provisioning test quick kubernetes
    - provisioning server create --check
    - provisioning server create
  after_script:
    - provisioning workspace cleanup

Enterprise Mode

# 1. Switch to enterprise, verify K8s
provisioning mode switch enterprise
kubectl get pods -n provisioning-system

# 2. Request workspace (approval required)
provisioning workspace request prod-deployment

# 3. After approval, lock with etcd
provisioning workspace lock prod-deployment --provider etcd

# 4. Pull verified extensions
provisioning extension pull upcloud --verify-signature

# 5. Deploy
provisioning infra create --check
provisioning infra create

# 6. Release
provisioning workspace unlock prod-deployment

Network Architecture

Service Communication

┌──────────────────────────────────────────────────────────────────────┐
│                         NETWORK LAYER                                 │
├──────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  ┌───────────────────────┐          ┌──────────────────────────┐     │
│  │   Ingress/Load        │          │    API Gateway           │     │
│  │   Balancer            │──────────│   (Optional)             │     │
│  └───────────────────────┘          └──────────────────────────┘     │
│              │                                    │                   │
│              │                                    │                   │
│  ┌───────────┴────────────────────────────────────┴──────────┐       │
│  │                 Service Mesh (Optional)                    │       │
│  │           (mTLS, Circuit Breaking, Retries)               │       │
│  └────┬──────────┬───────────┬────────────┬──────────────┬───┘       │
│       │          │           │            │              │            │
│  ┌────┴─────┐ ┌─┴────────┐ ┌┴─────────┐ ┌┴──────────┐ ┌┴───────┐   │
│  │ Orchestr │ │ Control  │ │ CoreDNS  │ │   Gitea   │ │  OCI   │   │
│  │   ator   │ │ Center   │ │          │ │           │ │Registry│   │
│  │          │ │          │ │          │ │           │ │        │   │
│  │ :9090    │ │ :3000    │ │ :5353    │ │ :3001     │ │ :5000  │   │
│  └──────────┘ └──────────┘ └──────────┘ └───────────┘ └────────┘   │
│                                                                        │
│  ┌────────────────────────────────────────────────────────────┐       │
│  │              DNS Resolution (CoreDNS)                       │       │
│  │  • *.prov.local  →  Internal services                      │       │
│  │  • *.infra.local →  Infrastructure nodes                   │       │
│  └────────────────────────────────────────────────────────────┘       │
│                                                                        │
└──────────────────────────────────────────────────────────────────────┘

Port Allocation

Service               | Port | Protocol | Purpose
----------------------|------|----------|----------------------
Orchestrator          | 8080 | HTTP/WS  | REST API, WebSocket
Control Center        | 3000 | HTTP     | Web UI
CoreDNS               | 5353 | UDP/TCP  | DNS resolution
Gitea                 | 3001 | HTTP     | Git operations
OCI Registry (Zot)    | 5000 | HTTP     | OCI artifacts
OCI Registry (Harbor) | 443  | HTTPS    | OCI artifacts (prod)
MCP Server            | 8081 | HTTP     | MCP protocol
API Gateway           | 8082 | HTTP     | Unified API

Network Security

Solo Mode:

  • Localhost-only bindings
  • No authentication
  • No encryption

Multi-User Mode:

  • Token-based authentication (JWT)
  • TLS for external access
  • Firewall rules

CI/CD Mode:

  • Token authentication (short-lived)
  • Full TLS encryption
  • Network isolation

Enterprise Mode:

  • mTLS for all connections
  • Network policies (Kubernetes)
  • Zero-trust networking
  • Audit logging

Data Architecture

Data Storage

┌────────────────────────────────────────────────────────────────┐
│                     DATA LAYER                                  │
├────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │            Configuration Data (Hierarchical)             │   │
│  │                                                           │   │
│  │  ~/.provisioning/                                        │   │
│  │  ├── config.user.toml       (User preferences)          │   │
│  │  └── config/                                             │   │
│  │      ├── active-mode.yaml   (Active mode)               │   │
│  │      └── user_config.yaml   (Workspaces, preferences)   │   │
│  │                                                           │   │
│  │  workspace/                                              │   │
│  │  ├── config/                                             │   │
│  │  │   ├── provisioning.yaml  (Workspace config)          │   │
│  │  │   └── modes/*.yaml       (Mode templates)            │   │
│  │  └── infra/{name}/                                       │   │
│  │      ├── settings.k         (Infrastructure KCL)        │   │
│  │      └── config.toml        (Infra-specific)            │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │            State Data (Runtime)                          │   │
│  │                                                           │   │
│  │  ~/.provisioning/orchestrator/data/                      │   │
│  │  ├── tasks/                  (Task queue)                │   │
│  │  ├── workflows/              (Workflow state)            │   │
│  │  └── checkpoints/            (Recovery points)           │   │
│  │                                                           │   │
│  │  ~/.provisioning/services/                               │   │
│  │  ├── pids/                   (Process IDs)               │   │
│  │  ├── logs/                   (Service logs)              │   │
│  │  └── state/                  (Service state)             │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │            Cache Data (Performance)                      │   │
│  │                                                           │   │
│  │  ~/.provisioning/cache/                                  │   │
│  │  ├── oci/                    (OCI artifacts)             │   │
│  │  ├── kcl/                    (Compiled KCL)              │   │
│  │  └── modules/                (Module cache)              │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │            Extension Data (OCI Artifacts)                │   │
│  │                                                           │   │
│  │  OCI Registry (localhost:5000 or harbor.company.com)    │   │
│  │  ├── provisioning-core:v3.5.0                           │   │
│  │  ├── provisioning-extensions/                           │   │
│  │  │   ├── kubernetes:1.28.0                              │   │
│  │  │   ├── aws:2.0.0                                      │   │
│  │  │   └── (100+ artifacts)                               │   │
│  │  └── provisioning-platform/                             │   │
│  │      ├── orchestrator:v1.2.0                            │   │
│  │      └── (4 service images)                             │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │            Secrets (Encrypted)                           │   │
│  │                                                           │   │
│  │  workspace/secrets/                                      │   │
│  │  ├── keys.yaml.enc           (SOPS-encrypted)           │   │
│  │  ├── ssh-keys/               (SSH keys)                 │   │
│  │  └── tokens/                 (API tokens)               │   │
│  │                                                           │   │
│  │  KMS Integration (Enterprise):                          │   │
│  │  • AWS KMS                                               │   │
│  │  • HashiCorp Vault                                       │   │
│  │  • Age encryption (local)                                │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
└────────────────────────────────────────────────────────────────┘

Data Flow

Configuration Loading:

1. Load system defaults (config.defaults.toml)
2. Merge user config (~/.provisioning/config.user.toml)
3. Load workspace config (workspace/config/provisioning.yaml)
4. Load environment config (workspace/config/{env}-defaults.toml)
5. Load infrastructure config (workspace/infra/{name}/config.toml)
6. Apply runtime overrides (ENV variables, CLI flags)

State Persistence:

Workflow execution
    ↓
Create checkpoint (JSON)
    ↓
Save to ~/.provisioning/orchestrator/data/checkpoints/
    ↓
On failure, load checkpoint and resume
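The checkpoint-and-resume loop above can be sketched as follows. Two simplifications are assumed: the checkpoint here is a plain-text count of completed steps rather than the JSON document the platform writes, and it lives in the system temp directory instead of `~/.provisioning/orchestrator/data/checkpoints/`. The function `run_with_checkpoint` and the step names are hypothetical:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Run `steps` in order, recording progress in `checkpoint` after each success.
/// On a later call, steps that already completed are skipped.
fn run_with_checkpoint<F>(steps: &[&str], checkpoint: &Path, mut run_step: F) -> io::Result<()>
where
    F: FnMut(&str) -> io::Result<()>,
{
    // A missing or unreadable checkpoint means "start from the beginning".
    let done: usize = fs::read_to_string(checkpoint)
        .ok()
        .and_then(|text| text.trim().parse::<usize>().ok())
        .unwrap_or(0);

    for (index, &step) in steps.iter().enumerate().skip(done) {
        run_step(step)?;
        // Persist the number of completed steps before moving on.
        fs::write(checkpoint, (index + 1).to_string())?;
    }

    // The workflow finished, so the checkpoint is no longer needed.
    if checkpoint.exists() {
        fs::remove_file(checkpoint)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let checkpoint = std::env::temp_dir().join("provisioning-checkpoint-demo.txt");
    let _ = fs::remove_file(&checkpoint); // start the demo from a clean state
    let steps = ["create-server", "install-taskservs", "deploy-cluster"];

    // First attempt fails on the second step.
    let first = run_with_checkpoint(&steps, &checkpoint, |step| {
        if step == "install-taskservs" {
            return Err(io::Error::new(io::ErrorKind::Other, "simulated failure"));
        }
        println!("ran {step}");
        Ok(())
    });
    println!("first attempt: {first:?}");

    // Second attempt resumes after the last completed step.
    run_with_checkpoint(&steps, &checkpoint, |step| {
        println!("ran {step}");
        Ok(())
    })
}
```

Writing the checkpoint only after a step succeeds means a crash mid-step re-runs that step on resume, so steps need to be safe to repeat.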

OCI Artifact Flow:

1. Package extension (oci-package.nu)
2. Push to OCI registry (provisioning oci push)
3. Extension stored as OCI artifact
4. Pull when needed (provisioning oci pull)
5. Cache locally (~/.provisioning/cache/oci/)

Security Architecture

Security Layers

┌─────────────────────────────────────────────────────────────────┐
│                     SECURITY ARCHITECTURE                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                   │
│  ┌────────────────────────────────────────────────────────┐     │
│  │  Layer 1: Authentication & Authorization               │     │
│  │                                                          │     │
│  │  Solo:       None (local development)                  │     │
│  │  Multi-user: JWT tokens (24h expiry)                   │     │
│  │  CI/CD:      CI-injected tokens (1h expiry)            │     │
│  │  Enterprise: mTLS (TLS 1.3, mutual auth)               │     │
│  └────────────────────────────────────────────────────────┘     │
│                                                                   │
│  ┌────────────────────────────────────────────────────────┐     │
│  │  Layer 2: Encryption                                    │     │
│  │                                                          │     │
│  │  In Transit:                                            │     │
│  │  • TLS 1.3 (multi-user, CI/CD, enterprise)             │     │
│  │  • mTLS (enterprise)                                    │     │
│  │                                                          │     │
│  │  At Rest:                                               │     │
│  │  • SOPS + Age (secrets encryption)                      │     │
│  │  • KMS integration (CI/CD, enterprise)                  │     │
│  │  • Encrypted filesystems (enterprise)                   │     │
│  └────────────────────────────────────────────────────────┘     │
│                                                                   │
│  ┌────────────────────────────────────────────────────────┐     │
│  │  Layer 3: Secret Management                             │     │
│  │                                                          │     │
│  │  • SOPS for file encryption                             │     │
│  │  • Age for key management                               │     │
│  │  • KMS integration (AWS KMS, Vault)                     │     │
│  │  • SSH key storage (KMS-backed)                         │     │
│  │  • API token management                                 │     │
│  └────────────────────────────────────────────────────────┘     │
│                                                                   │
│  ┌────────────────────────────────────────────────────────┐     │
│  │  Layer 4: Access Control                                │     │
│  │                                                          │     │
│  │  • RBAC (Role-Based Access Control)                     │     │
│  │  • Workspace isolation                                   │     │
│  │  • Workspace locking (Gitea, etcd)                      │     │
│  │  • Resource quotas (per-user limits)                    │     │
│  └────────────────────────────────────────────────────────┘     │
│                                                                   │
│  ┌────────────────────────────────────────────────────────┐     │
│  │  Layer 5: Network Security                              │     │
│  │                                                          │     │
│  │  • Network policies (Kubernetes)                        │     │
│  │  • Firewall rules                                       │     │
│  │  • Zero-trust networking (enterprise)                   │     │
│  │  • Service mesh (optional, mTLS)                        │     │
│  └────────────────────────────────────────────────────────┘     │
│                                                                   │
│  ┌────────────────────────────────────────────────────────┐     │
│  │  Layer 6: Audit & Compliance                            │     │
│  │                                                          │     │
│  │  • Audit logs (all operations)                          │     │
│  │  • Compliance policies (SOC2, ISO27001)                 │     │
│  │  • Image signing (cosign, notation)                     │     │
│  │  • Vulnerability scanning (Harbor)                      │     │
│  └────────────────────────────────────────────────────────┘     │
│                                                                   │
└─────────────────────────────────────────────────────────────────┘
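
Layer 1 can be summarized as a per-mode policy table. The sketch below encodes the methods and token lifetimes from the diagram; the mode keys, the `AuthPolicy` type, and the `token_expired` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class AuthPolicy:
    method: str
    token_ttl: Optional[timedelta]  # None when the mode uses no expiring token


AUTH_POLICIES = {
    "solo": AuthPolicy("none", None),
    "multi-user": AuthPolicy("jwt", timedelta(hours=24)),
    "cicd": AuthPolicy("ci-token", timedelta(hours=1)),
    "enterprise": AuthPolicy("mtls", None),
}


def token_expired(mode: str, issued_at: datetime, now: datetime) -> bool:
    """Report whether a token issued at `issued_at` has outlived the mode's TTL."""
    ttl = AUTH_POLICIES[mode].token_ttl
    if ttl is None:
        return False  # no token lifetime applies (no auth, or certificate-based auth)
    return now - issued_at >= ttl


issued = datetime(2025, 10, 6, 12, 0, tzinfo=timezone.utc)
two_hours_later = issued + timedelta(hours=2)
```

With these values a token that is two hours old is still valid in multi-user mode but expired in CI/CD mode.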

Secret Management

SOPS Integration:

# Edit encrypted file
provisioning sops workspace/secrets/keys.yaml.enc

# Encryption happens automatically on save
# Decryption happens automatically on load

KMS Integration (Enterprise):

# workspace/config/provisioning.yaml
secrets:
  provider: "kms"
  kms:
    type: "aws"  # or "vault"
    region: "us-east-1"
    key_id: "arn:aws:kms:..."

Image Signing and Verification

CI/CD Mode (Required):

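# Sign OCI artifact (cosign takes a plain image reference, without an oci:// scheme)
cosign sign registry/kubernetes:1.28.0

# Verify signature (add --key or keyless identity flags to match how the artifact was signed)
cosign verify registry/kubernetes:1.28.0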

Enterprise Mode (Mandatory):

# Pull with verification
provisioning extension pull kubernetes --verify-signature

# System blocks unsigned artifacts
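
The per-mode signing rules can be expressed as a small gate. This is a sketch only: the text states that signatures are required in CI/CD and enterprise modes, while treating them as optional in solo and multi-user modes is an assumption made here, as are the names `SIGNATURE_REQUIRED` and `may_load`.

```python
# Whether an artifact must carry a verified signature before it is loaded.
# The solo and multi-user entries are assumptions; the source only covers the other two.
SIGNATURE_REQUIRED = {
    "solo": False,
    "multi-user": False,
    "cicd": True,
    "enterprise": True,
}


def may_load(mode: str, signature_verified: bool) -> bool:
    """Allow the artifact if it is verified, or if the mode does not demand it."""
    return signature_verified or not SIGNATURE_REQUIRED[mode]
```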

Deployment Architecture

Deployment Modes

1. Binary Deployment (Solo, Multi-user)

User Machine
├── ~/.provisioning/bin/
│   ├── provisioning-orchestrator
│   ├── provisioning-control-center
│   └── ...
├── ~/.provisioning/orchestrator/data/
├── ~/.provisioning/services/
└── Process Management (PID files, logs)

Pros: Simple, fast startup, no Docker dependency

Cons: Platform-specific binaries, manual updates

2. Docker Deployment (Multi-user, CI/CD)

Docker Daemon
├── Container: provisioning-orchestrator
├── Container: provisioning-control-center
├── Container: provisioning-coredns
├── Container: provisioning-gitea
├── Container: provisioning-oci-registry
└── Volumes: ~/.provisioning/data/

Pros: Consistent environment, easy updates

Cons: Requires Docker, resource overhead

3. Docker Compose Deployment (Multi-user)

# provisioning/platform/docker-compose.yaml
services:
  orchestrator:
    image: provisioning-platform/orchestrator:v1.2.0
    ports:
      - "8080:8080"
    volumes:
      - orchestrator-data:/data

  control-center:
    image: provisioning-platform/control-center:v1.2.0
    ports:
      - "3000:3000"
    depends_on:
      - orchestrator

  coredns:
    image: coredns/coredns:1.11.1
    ports:
      - "5353:53/udp"

  gitea:
    image: gitea/gitea:1.20
    ports:
      - "3001:3000"

  oci-registry:
    image: ghcr.io/project-zot/zot:latest
    ports:
      - "5000:5000"

Pros: Easy multi-service orchestration, declarative

Cons: Local only, no HA

4. Kubernetes Deployment (CI/CD, Enterprise)

# Namespace: provisioning-system
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orchestrator
spec:
  replicas: 3  # HA
  selector:
    matchLabels:
      app: orchestrator
  template:
    metadata:
      labels:
        app: orchestrator
    spec:
      containers:
      - name: orchestrator
        image: harbor.company.com/provisioning-platform/orchestrator:v1.2.0
        ports:
        - containerPort: 8080
        env:
        - name: RUST_LOG
          value: "info"
        volumeMounts:
        - name: data
          mountPath: /data
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: orchestrator-data

Pros: HA, scalability, production-ready

Cons: Complex setup, Kubernetes required

5. Remote Deployment (All modes)

# Connect to remotely-running services
services:
  orchestrator:
    deployment:
      mode: "remote"
      remote:
        endpoint: "https://orchestrator.company.com"
        tls_enabled: true
        auth_token_path: "~/.provisioning/tokens/orchestrator.token"

Pros: No local resources, centralized

Cons: Network dependency, latency


Integration Architecture

Integration Patterns

1. Hybrid Language Integration (Rust ↔ Nushell)

Rust Orchestrator
    ↓ (HTTP API)
Nushell CLI
    ↓ (exec via bridge)
Nushell Business Logic
    ↓ (returns JSON)
Rust Orchestrator
    ↓ (updates state)
File-based Task Queue

Communication: HTTP API + stdin/stdout JSON
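
The stdin/stdout JSON half of this exchange can be sketched as follows. A small Python child process stands in for the Nushell script so the example runs without Nushell installed; the task fields and the `run_task` name are hypothetical.

```python
import json
import subprocess
import sys

# Stand-in for a Nushell script: reads one JSON task on stdin, writes one JSON result on stdout.
WORKER_SOURCE = """
import json, sys
task = json.load(sys.stdin)
json.dump({"task_id": task["id"], "status": "completed"}, sys.stdout)
"""


def run_task(task: dict, command: list[str]) -> dict:
    """Send the task as JSON on stdin and parse the JSON the child prints."""
    completed = subprocess.run(
        command,
        input=json.dumps(task),
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit raises CalledProcessError
    )
    return json.loads(completed.stdout)


result = run_task(
    {"id": "task-1", "action": "create_server"},
    [sys.executable, "-c", WORKER_SOURCE],
)
```

Keeping the contract to "JSON in, JSON out" means the caller does not need to know which language implements the business logic.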

2. Provider Abstraction

Unified Provider Interface
├── create_server(config) -> Server
├── delete_server(id) -> bool
├── list_servers() -> [Server]
└── get_server_status(id) -> Status

Provider Implementations:
├── AWS Provider (aws-sdk-rust, aws cli)
├── UpCloud Provider (upcloud API)
└── Local Provider (Docker, libvirt)
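
The unified interface maps naturally onto an abstract base class. Below is a minimal sketch with an in-memory implementation standing in for a real provider; the `Server` fields, the status strings, and the `InMemoryProvider` class are assumptions for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from itertools import count


@dataclass
class Server:
    id: str
    name: str
    status: str


class Provider(ABC):
    """The four operations listed in the unified provider interface."""

    @abstractmethod
    def create_server(self, config: dict) -> Server: ...

    @abstractmethod
    def delete_server(self, server_id: str) -> bool: ...

    @abstractmethod
    def list_servers(self) -> list[Server]: ...

    @abstractmethod
    def get_server_status(self, server_id: str) -> str: ...


class InMemoryProvider(Provider):
    """Hypothetical provider that keeps servers in a dict instead of calling a cloud API."""

    def __init__(self) -> None:
        self._servers: dict[str, Server] = {}
        self._ids = count(1)

    def create_server(self, config: dict) -> Server:
        server = Server(id=f"srv-{next(self._ids)}", name=config["name"], status="running")
        self._servers[server.id] = server
        return server

    def delete_server(self, server_id: str) -> bool:
        # True if a server was removed, False if the id was unknown.
        return self._servers.pop(server_id, None) is not None

    def list_servers(self) -> list[Server]:
        return list(self._servers.values())

    def get_server_status(self, server_id: str) -> str:
        return self._servers[server_id].status
```

Callers written against `Provider` work unchanged when a different implementation is substituted.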

3. OCI Registry Integration

Extension Development
    ↓
Package (oci-package.nu)
    ↓
Push (provisioning oci push)
    ↓
OCI Registry (Zot/Harbor)
    ↓
Pull (provisioning oci pull)
    ↓
Cache (~/.provisioning/cache/oci/)
    ↓
Load into Workspace

4. Gitea Integration (Multi-user, Enterprise)

Workspace Operations
    ↓
Check Lock Status (Gitea API)
    ↓
Acquire Lock (Create lock file in Git)
    ↓
Perform Changes
    ↓
Commit + Push
    ↓
Release Lock (Delete lock file)

Benefits:

  • Distributed locking
  • Change tracking via Git history
  • Collaboration features
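
The acquire, change, release sequence fits a context manager, which guarantees the release step even when the change fails. In this sketch an in-memory dict stands in for the lock file in the Git repository; `LockStore`, `WorkspaceLocked`, and `workspace_lock` are hypothetical names, and a real implementation would need the lock-file creation itself to be atomic.

```python
from contextlib import contextmanager
from typing import Iterator, Optional


class WorkspaceLocked(Exception):
    """Raised when another owner already holds the workspace lock."""


class LockStore:
    """In-memory stand-in for lock files stored in a Git repository."""

    def __init__(self) -> None:
        self._owners: dict[str, str] = {}

    def holder(self, workspace: str) -> Optional[str]:
        return self._owners.get(workspace)

    def acquire(self, workspace: str, owner: str) -> None:
        if workspace in self._owners:
            raise WorkspaceLocked(f"{workspace} is locked by {self._owners[workspace]}")
        self._owners[workspace] = owner

    def release(self, workspace: str, owner: str) -> None:
        # Only the current holder may remove the lock.
        if self._owners.get(workspace) == owner:
            del self._owners[workspace]


@contextmanager
def workspace_lock(store: LockStore, workspace: str, owner: str) -> Iterator[None]:
    """Hold the lock for the duration of the block and always release it."""
    store.acquire(workspace, owner)
    try:
        yield
    finally:
        store.release(workspace, owner)
```

Usage: `with workspace_lock(store, "production", "alice"): ...` wraps the "Perform Changes" and "Commit + Push" steps.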

5. CoreDNS Integration

Service Registration
    ↓
Update CoreDNS Corefile
    ↓
Reload CoreDNS
    ↓
DNS Resolution Available

Zones:
├── *.prov.local     (Internal services)
├── *.infra.local    (Infrastructure nodes)
└── *.test.local     (Test environments)
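
As an illustration of how the three wildcard zones partition names, the sketch below maps a hostname to its zone by suffix. The `zone_for` helper and the example hostnames are hypothetical; the zone names and descriptions come from the list above.

```python
from typing import Optional

ZONES = {
    "prov.local": "Internal services",
    "infra.local": "Infrastructure nodes",
    "test.local": "Test environments",
}


def zone_for(hostname: str) -> Optional[str]:
    """Return the zone whose wildcard (*.zone) covers the hostname, if any."""
    name = hostname.rstrip(".").lower()  # tolerate a trailing dot and mixed case
    for zone in ZONES:
        if name.endswith("." + zone):
            return zone
    return None
```

The bare zone name itself (for example `prov.local`) is not matched, because the wildcard only covers names below it.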

Performance and Scalability

Performance Characteristics

| Metric | Value | Notes |
|---|---|---|
| CLI Startup Time | < 100ms | Nushell cold start |
| CLI Response Time | < 50ms | Most commands |
| Workflow Submission | < 200ms | To orchestrator |
| Task Processing | 10-50/sec | Orchestrator throughput |
| Batch Operations | Up to 100 servers | Parallel execution |
| OCI Pull Time | 1-5s | Cached: <100ms |
| Configuration Load | < 500ms | Full hierarchy |
| Health Check Interval | 10s | Configurable |

Scalability Limits

Solo Mode:

  • No enforced resource quotas
  • Limited only by local machine capacity

Multi-User Mode:

  • 10 servers per user
  • 32 cores, 128GB RAM per user
  • 5-20 concurrent users

CI/CD Mode:

  • 5 servers per pipeline
  • 16 cores, 64GB RAM per pipeline
  • 100+ concurrent pipelines

Enterprise Mode:

  • 20 servers per user
  • 64 cores, 256GB RAM per user
  • 1000+ concurrent users
  • Horizontal scaling via Kubernetes
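
The per-mode limits above can be checked with a simple quota table. This sketch uses the figures from the lists; the mode keys, the `Quota` type, and the `within_quota` helper are hypothetical, and solo mode is modelled as having no quota.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quota:
    servers: int
    cores: int
    ram_gb: int


# Limits per user (multi-user, enterprise) or per pipeline (CI/CD).
QUOTAS: dict[str, Optional[Quota]] = {
    "solo": None,  # no enforced limits
    "multi-user": Quota(servers=10, cores=32, ram_gb=128),
    "cicd": Quota(servers=5, cores=16, ram_gb=64),
    "enterprise": Quota(servers=20, cores=64, ram_gb=256),
}


def within_quota(mode: str, servers: int, cores: int, ram_gb: int) -> bool:
    """Check a requested allocation against the mode's quota."""
    quota = QUOTAS[mode]
    if quota is None:
        return True
    return servers <= quota.servers and cores <= quota.cores and ram_gb <= quota.ram_gb
```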

Optimization Strategies

Caching:

  • OCI artifacts cached locally
  • KCL compilation cached
  • Module resolution cached

Parallel Execution:

  • Batch operations with configurable limits
  • Dependency-aware parallel starts
  • Workflow DAG execution
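
Dependency-aware parallel execution can be sketched with the standard library's `graphlib.TopologicalSorter`. The example groups tasks into batches whose dependencies are already satisfied and caps each batch at a configurable size; the task names and the `execution_batches` helper are hypothetical, and a real scheduler would likely start new tasks as soon as individual dependencies finish rather than waiting for a whole batch.

```python
from graphlib import TopologicalSorter


def execution_batches(dependencies: dict[str, set[str]], max_parallel: int) -> list[list[str]]:
    """Group tasks into ordered batches; tasks within a batch can run concurrently."""
    sorter = TopologicalSorter(dependencies)  # maps each task to the tasks it depends on
    sorter.prepare()  # raises CycleError if the graph is not a DAG
    batches: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for deterministic output
        for start in range(0, len(ready), max_parallel):
            batches.append(ready[start:start + max_parallel])
        sorter.done(*ready)  # unblock tasks that depended on this group
    return batches


# Hypothetical workflow: each key depends on the tasks in its set.
workflow = {
    "containerd": {"server"},
    "etcd": {"server"},
    "kubernetes": {"containerd", "etcd"},
    "cilium": {"kubernetes"},
}

batches = execution_batches(workflow, max_parallel=2)
```

Here `containerd` and `etcd` land in the same batch because both depend only on `server`.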

Incremental Operations:

  • Only update changed resources
  • Checkpoint-based recovery
  • Delta synchronization
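
Updating only changed resources starts with computing a delta between current and desired state. A minimal sketch, assuming both states are dictionaries keyed by resource name; the `plan_changes` helper and the resource shapes are hypothetical.

```python
def plan_changes(current: dict[str, dict], desired: dict[str, dict]) -> dict[str, list[str]]:
    """Classify resources into those to create, update, or delete."""
    return {
        "create": sorted(desired.keys() - current.keys()),
        "update": sorted(
            name for name in desired.keys() & current.keys()
            if desired[name] != current[name]
        ),
        "delete": sorted(current.keys() - desired.keys()),
    }


current_state = {"web-01": {"plan": "small"}, "web-02": {"plan": "small"}, "db-01": {"plan": "large"}}
desired_state = {"web-01": {"plan": "small"}, "web-02": {"plan": "medium"}, "cache-01": {"plan": "small"}}

plan = plan_changes(current_state, desired_state)
```

Resources that are identical in both states, such as `web-01` here, do not appear in the plan and are left untouched.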

Evolution and Roadmap

Version History

| Version | Date | Major Features |
|---|---|---|
| v3.5.0 | 2025-10-06 | Mode system, OCI distribution, comprehensive docs |
| v3.4.0 | 2025-10-06 | Test environment service |
| v3.3.0 | 2025-09-30 | Interactive guides |
| v3.2.0 | 2025-09-30 | Modular CLI refactoring |
| v3.1.0 | 2025-09-25 | Batch workflow system |
| v3.0.0 | 2025-09-25 | Hybrid orchestrator |
| v2.0.5 | 2025-10-02 | Workspace switching |
| v2.0.0 | 2025-09-23 | Configuration migration |

Roadmap (Future Versions)

v3.6.0 (Q1 2026):

  • GraphQL API
  • Advanced RBAC
  • Multi-tenancy
  • Observability enhancements (OpenTelemetry)

v4.0.0 (Q2 2026):

  • Multi-repository split complete
  • Extension marketplace
  • Advanced workflow features (conditional execution, loops)
  • Cost optimization engine

v4.1.0 (Q3 2026):

  • AI-assisted infrastructure generation
  • Policy-as-code (OPA integration)
  • Advanced compliance features

Long-term Vision:

  • Serverless workflow execution
  • Edge computing support
  • Multi-cloud failover
  • Self-healing infrastructure


Maintained By: Architecture Team
Review Cycle: Quarterly
Next Review: 2026-01-06