Provisioning Platform - Architecture Overview
Version: 3.5.0
Date: 2025-10-06
Status: Production
Maintainers: Architecture Team
Table of Contents
- Executive Summary
- System Architecture
- Component Architecture
- Mode Architecture
- Network Architecture
- Data Architecture
- Security Architecture
- Deployment Architecture
- Integration Architecture
- Performance and Scalability
- Evolution and Roadmap
Executive Summary
What is the Provisioning Platform?
The Provisioning Platform is a modern, cloud-native infrastructure automation system that combines the simplicity of declarative configuration (KCL) with the power of shell scripting (Nushell) and high-performance coordination (Rust).
Key Characteristics
- Hybrid Architecture: Rust for coordination, Nushell for business logic, KCL for configuration
- Mode-Based: Adapts from solo development to enterprise production
- OCI-Native: Extensions are distributed through industry-standard OCI registries
- Provider-Agnostic: Supports multiple cloud providers (AWS, UpCloud) and local infrastructure
- Extension-Driven: Core functionality enhanced through modular extensions
Architecture at a Glance
┌─────────────────────────────────────────────────────────────────────┐
│ Provisioning Platform │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ User Layer │ │ Extension │ │ Service │ │
│ │ (CLI/UI) │ │ Registry │ │ Registry │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ ┌──────┴──────────────────┴──────────────────┴───────┐ │
│ │ Core Provisioning Engine │ │
│ │ (Config | Dependency Resolution | Workflows) │ │
│ └──────┬──────────────────────────────────────┬───────┘ │
│ │ │ │
│ ┌──────┴─────────┐ ┌───────┴──────────┐ │
│ │ Orchestrator │ │ Business Logic │ │
│ │ (Rust) │ ←─ Coordination → │ (Nushell) │ │
│ └──────┬─────────┘ └───────┬──────────┘ │
│ │ │ │
│ ┌──────┴───────────────────────────────────────┴──────┐ │
│ │ Extension System │ │
│ │ (Providers | Task Services | Clusters) │ │
│ └──────┬───────────────────────────────────────────────┘ │
│ │ │
│ ┌──────┴───────────────────────────────────────────────────┐ │
│ │ Infrastructure (Cloud | Local | Kubernetes) │ │
│ └───────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
Key Metrics
| Metric | Value | Description |
|---|---|---|
| Codebase Size | ~50,000 LOC | Nushell (60%), Rust (30%), KCL (10%) |
| Extensions | 100+ | Providers, taskservs, clusters |
| Supported Providers | 3 | AWS, UpCloud, Local |
| Task Services | 50+ | Kubernetes, databases, monitoring, etc. |
| Deployment Modes | 5 | Binary, Docker, Docker Compose, K8s, Remote |
| Operational Modes | 4 | Solo, Multi-user, CI/CD, Enterprise |
| API Endpoints | 80+ | REST, WebSocket, GraphQL (planned) |
System Architecture
High-Level Architecture
┌────────────────────────────────────────────────────────────────────────────┐
│ PRESENTATION LAYER │
├────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌──────────────┐ ┌────────────┐ │
│ │ CLI (Nu) │ │ Control │ │ REST API │ │ MCP │ │
│ │ │ │ Center (Yew) │ │ Gateway │ │ Server │ │
│ └─────────────┘ └──────────────┘ └──────────────┘ └────────────┘ │
│ │
└──────────────────────────────────┬─────────────────────────────────────────┘
│
┌──────────────────────────────────┴─────────────────────────────────────────┐
│ CORE LAYER │
├────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Configuration Management │ │
│ │ (KCL Schemas | TOML Config | Hierarchical Loading) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Dependency │ │ Module/Layer │ │ Workspace │ │
│ │ Resolution │ │ System │ │ Management │ │
│ └──────────────────┘ └──────────────────┘ └──────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Workflow Engine │ │
│ │ (Batch Operations | Checkpoints | Rollback) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────┬─────────────────────────────────────────┘
│
┌──────────────────────────────────┴─────────────────────────────────────────┐
│ ORCHESTRATION LAYER │
├────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Orchestrator (Rust) │ │
│ │ • Task Queue (File-based persistence) │ │
│ │ • State Management (Checkpoints) │ │
│ │ • Health Monitoring │ │
│ │ • REST API (HTTP/WS) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Business Logic (Nushell) │ │
│ │ • Provider operations (AWS, UpCloud, Local) │ │
│ │ • Server lifecycle (create, delete, configure) │ │
│ │ • Taskserv installation (50+ services) │ │
│ │ • Cluster deployment │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────┬─────────────────────────────────────────┘
│
┌──────────────────────────────────┴─────────────────────────────────────────┐
│ EXTENSION LAYER │
├────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────┐ ┌──────────────────┐ ┌───────────────────┐ │
│ │ Providers │ │ Task Services │ │ Clusters │ │
│ │ (3 types) │ │ (50+ types) │ │ (10+ types) │ │
│ │ │ │ │ │ │ │
│ │ • AWS │ │ • Kubernetes │ │ • Buildkit │ │
│ │ • UpCloud │ │ • Containerd │ │ • Web cluster │ │
│ │ • Local │ │ • Databases │ │ • CI/CD │ │
│ │ │ │ • Monitoring │ │ │ │
│ └────────────────┘ └──────────────────┘ └───────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Extension Distribution (OCI Registry) │ │
│ │ • Zot (local development) │ │
│ │ • Harbor (multi-user/enterprise) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────┬─────────────────────────────────────────┘
│
┌──────────────────────────────────┴─────────────────────────────────────────┐
│ INFRASTRUCTURE LAYER │
├────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────┐ ┌──────────────────┐ ┌───────────────────┐ │
│ │ Cloud (AWS) │ │ Cloud (UpCloud) │ │ Local (Docker) │ │
│ │ │ │ │ │ │ │
│ │ • EC2 │ │ • Servers │ │ • Containers │ │
│ │ • EKS │ │ • LoadBalancer │ │ • Local K8s │ │
│ │ • RDS │ │ • Networking │ │ • Processes │ │
│ └────────────────┘ └──────────────────┘ └───────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────────────┘
Multi-Repository Architecture
The system is organized into three separate repositories:
provisioning-core
Core system functionality
├── CLI interface (Nushell entry point)
├── Core libraries (lib_provisioning)
├── Base KCL schemas
├── Configuration system
├── Workflow engine
└── Build/distribution tools
Distribution: oci://registry/provisioning-core:v3.5.0
provisioning-extensions
All provider, taskserv, cluster extensions
├── providers/
│ ├── aws/
│ ├── upcloud/
│ └── local/
├── taskservs/
│ ├── kubernetes/
│ ├── containerd/
│ ├── postgres/
│ └── (50+ more)
└── clusters/
    ├── buildkit/
    ├── web/
    └── (10+ more)
Distribution: Each extension as separate OCI artifact
oci://registry/provisioning-extensions/kubernetes:1.28.0
oci://registry/provisioning-extensions/aws:2.0.0
provisioning-platform
Platform services
├── orchestrator/ (Rust)
├── control-center/ (Rust/Yew)
├── mcp-server/ (Rust)
└── api-gateway/ (Rust)
Distribution: Docker images in OCI registry
oci://registry/provisioning-platform/orchestrator:v1.2.0
Component Architecture
Core Components
1. CLI Interface (Nushell)
Location: provisioning/core/cli/provisioning
Purpose: Primary user interface for all provisioning operations
Architecture:
Main CLI (211 lines)
↓
Command Dispatcher (264 lines)
↓
Domain Handlers (7 modules)
├── infrastructure.nu (117 lines)
├── orchestration.nu (64 lines)
├── development.nu (72 lines)
├── workspace.nu (56 lines)
├── generation.nu (78 lines)
├── utilities.nu (157 lines)
└── configuration.nu (316 lines)
Key Features:
- 80+ command shortcuts
- Bi-directional help system
- Centralized flag handling
- Domain-driven design
2. Configuration System (KCL + TOML)
Hierarchical Loading:
1. System defaults (config.defaults.toml)
2. User config (~/.provisioning/config.user.toml)
3. Workspace config (workspace/config/provisioning.yaml)
4. Environment config (workspace/config/{env}-defaults.toml)
5. Infrastructure config (workspace/infra/{name}/config.toml)
6. Runtime overrides (CLI flags, ENV variables)
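The loading order above implies that a later layer overrides an earlier one. A minimal sketch of that merge, assuming flat dotted keys and a simple last-writer-wins rule (the platform's actual merge logic lives in Nushell and may treat nested values differently; `merge_layers` and `layer` are hypothetical names used only for illustration):

```rust
use std::collections::BTreeMap;

/// One configuration layer: flat dotted keys mapped to string values.
/// (Assumption: real layers are nested TOML/YAML; this sketch flattens them.)
type Layer = BTreeMap<String, String>;

/// Merge layers in load order. A key set by a later layer replaces
/// the value set by any earlier layer.
fn merge_layers(layers: &[Layer]) -> Layer {
    let mut merged = Layer::new();
    for layer in layers {
        for (key, value) in layer {
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}

/// Build a layer from literal pairs (helper for the example).
fn layer(pairs: &[(&str, &str)]) -> Layer {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

Passing the layers as `[defaults, user, runtime]` means a runtime value for a key replaces both the user value and the system default, which matches the position of runtime overrides at the end of the list.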
Variable Interpolation:
- {{paths.base}} - Path references
- {{env.HOME}} - Environment variables
- {{now.date}} - Dynamic values
- {{git.branch}} - Git context
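As an illustration of how `{{name}}` placeholders could be expanded, here is a small sketch. It assumes the values have already been collected into a lookup table; the function name `interpolate` and the error behaviour for unknown names are assumptions, not the platform's documented behaviour.

```rust
use std::collections::HashMap;

/// Replace every `{{name}}` placeholder in `template` with its value from
/// `vars`. An unknown name or an unclosed `{{` is reported as an error
/// instead of being silently dropped.
fn interpolate(template: &str, vars: &HashMap<&str, &str>) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..start]);
        let after_open = &rest[start + 2..];
        let end = after_open
            .find("}}")
            .ok_or_else(|| "unclosed placeholder in template".to_string())?;
        // Allow optional spaces inside the braces.
        let name = after_open[..end].trim();
        let value = vars
            .get(name)
            .ok_or_else(|| format!("unknown variable: {}", name))?;
        out.push_str(value);
        rest = &after_open[end + 2..];
    }
    out.push_str(rest);
    Ok(out)
}
```

Returning an error for unknown names is a design choice made for the sketch: it surfaces typos such as `{{path.base}}` at load time rather than leaving a literal placeholder in a generated path.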
3. Orchestrator (Rust)
Location: provisioning/platform/orchestrator/
Architecture:
src/
├── main.rs // Entry point
├── api/
│ ├── routes.rs // HTTP routes
│ ├── workflows.rs // Workflow endpoints
│ └── batch.rs // Batch endpoints
├── workflow/
│ ├── engine.rs // Workflow execution
│ ├── state.rs // State management
│ └── checkpoint.rs // Checkpoint/recovery
├── task_queue/
│ ├── queue.rs // File-based queue
│ ├── priority.rs // Priority scheduling
│ └── retry.rs // Retry logic
├── health/
│ └── monitor.rs // Health checks
├── nushell/
│ └── bridge.rs // Nu execution bridge
└── test_environment/ // Test env management
    ├── container_manager.rs
    ├── test_orchestrator.rs
    └── topologies.rs
Key Features:
- File-based task queue (reliable, simple)
- Checkpoint-based recovery
- Priority scheduling
- REST API (HTTP/WebSocket)
- Nushell script execution bridge
4. Workflow Engine (Nushell)
Location: provisioning/core/nulib/workflows/
Workflow Types:
workflows/
├── server_create.nu // Server provisioning
├── taskserv.nu // Task service management
├── cluster.nu // Cluster deployment
├── batch.nu // Batch operations
└── management.nu // Workflow monitoring
Batch Workflow Features:
- Provider-agnostic (mix AWS, UpCloud, local)
- Dependency resolution (hard/soft dependencies)
- Parallel execution (configurable limits)
- Rollback support
- Real-time monitoring
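Rollback support can be pictured as undoing completed steps in reverse order once a later step fails. The sketch below simulates this with a log of actions; the `Step` type, its `succeeds` flag, and `run_with_rollback` are hypothetical and stand in for real provisioning actions, which the Nushell workflows implement.

```rust
/// A simulated workflow step. `succeeds` stands in for the outcome of the
/// real action so the control flow can be shown without side effects.
struct Step {
    name: &'static str,
    succeeds: bool,
}

/// Run steps in order. If one fails, undo the already completed steps in
/// reverse order and report which step failed. Every action is appended
/// to `log` so the sequence is visible to the caller.
fn run_with_rollback(steps: &[Step], log: &mut Vec<String>) -> Result<(), String> {
    let mut done: Vec<&Step> = Vec::new();
    for step in steps {
        if step.succeeds {
            log.push(format!("apply {}", step.name));
            done.push(step);
        } else {
            log.push(format!("fail {}", step.name));
            // Most recently applied step is undone first.
            for completed in done.iter().rev() {
                log.push(format!("undo {}", completed.name));
            }
            return Err(format!("step '{}' failed", step.name));
        }
    }
    Ok(())
}
```

Undoing in reverse order matters when later steps depend on earlier ones, for example removing a taskserv before deleting the server it was installed on.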
5. Extension System
Extension Types:
| Type | Count | Purpose | Example |
|---|---|---|---|
| Providers | 3 | Cloud platform integration | AWS, UpCloud, Local |
| Task Services | 50+ | Infrastructure components | Kubernetes, Postgres |
| Clusters | 10+ | Complete configurations | Buildkit, Web cluster |
Extension Structure:
extension-name/
├── kcl/
│ ├── kcl.mod // KCL dependencies
│ ├── {name}.k // Main schema
│ ├── version.k // Version management
│ └── dependencies.k // Dependencies
├── scripts/
│ ├── install.nu // Installation logic
│ ├── check.nu // Health check
│ └── uninstall.nu // Cleanup
├── templates/ // Config templates
├── docs/ // Documentation
├── tests/ // Extension tests
└── manifest.yaml // Extension metadata
OCI Distribution: Each extension is packaged as an OCI artifact containing:
- KCL schemas
- Nushell scripts
- Templates
- Documentation
- Manifest
6. Module and Layer System
Module System:
# Discover available extensions
provisioning module discover taskservs
# Load into workspace
provisioning module load taskserv my-workspace kubernetes containerd
# List loaded modules
provisioning module list taskserv my-workspace
Layer System (Configuration Inheritance):
Layer 1: Core (provisioning/extensions/{type}/{name})
↓
Layer 2: Workspace (workspace/extensions/{type}/{name})
↓
Layer 3: Infrastructure (workspace/infra/{infra}/extensions/{type}/{name})
Resolution Priority: Infrastructure → Workspace → Core
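The resolution priority can be read as "first layer that contains the extension wins". A minimal sketch under that reading, using the three path patterns listed above; `candidate_paths`, `resolve_extension`, and the injected `exists` check are hypothetical names chosen for the example:

```rust
/// Candidate locations for an extension, highest priority first:
/// infrastructure layer, then workspace layer, then core layer.
fn candidate_paths(infra: &str, kind: &str, name: &str) -> Vec<String> {
    vec![
        format!("workspace/infra/{}/extensions/{}/{}", infra, kind, name),
        format!("workspace/extensions/{}/{}", kind, name),
        format!("provisioning/extensions/{}/{}", kind, name),
    ]
}

/// Return the first candidate for which `exists` reports true.
/// `exists` is injected so the lookup can be exercised without touching disk;
/// a real caller could pass `|p| std::path::Path::new(p).exists()`.
fn resolve_extension<F>(infra: &str, kind: &str, name: &str, exists: F) -> Option<String>
where
    F: Fn(&str) -> bool,
{
    candidate_paths(infra, kind, name)
        .into_iter()
        .find(|path| exists(path.as_str()))
}
```

With this shape, a workspace-level copy of an extension shadows the core copy, and an infrastructure-level copy shadows both.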
7. Dependency Resolution
Algorithm: Topological sort with cycle detection
Features:
- Hard dependencies (must exist)
- Soft dependencies (optional enhancement)
- Conflict detection
- Circular dependency prevention
- Version compatibility checking
Example:
import provisioning.dependencies as schema
_dependencies = schema.TaskservDependencies {
    name = "kubernetes"
    version = "1.28.0"
    requires = ["containerd", "etcd", "os"]
    optional = ["cilium", "helm"]
    conflicts = ["docker", "podman"]
}
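To make "topological sort with cycle detection" concrete, here is a sketch using Kahn's algorithm over the hard (`requires`) dependencies only. It is an illustration, not the platform's resolver: soft dependencies, conflicts, and version checks are left out, and `install_order` is a hypothetical name.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Order taskservs so every hard dependency comes before its dependent.
/// `requires` maps a taskserv name to the names it depends on.
/// Returns Err with the names that could not be ordered when a cycle exists.
fn install_order(requires: &BTreeMap<&str, Vec<&str>>) -> Result<Vec<String>, Vec<String>> {
    // Collect every node, including ones that only appear as a dependency.
    let mut nodes: BTreeSet<&str> = BTreeSet::new();
    for (name, deps) in requires {
        nodes.insert(*name);
        nodes.extend(deps.iter().copied());
    }

    // Count unmet dependencies per node and record reverse edges.
    let mut pending: BTreeMap<&str, usize> = nodes.iter().map(|n| (*n, 0)).collect();
    let mut dependents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for (name, deps) in requires {
        for dep in deps {
            *pending.get_mut(name).unwrap() += 1;
            dependents.entry(*dep).or_default().push(*name);
        }
    }

    // Kahn's algorithm: repeatedly emit nodes with no unmet dependencies.
    let mut ready: VecDeque<&str> = pending
        .iter()
        .filter(|(_, count)| **count == 0)
        .map(|(name, _)| *name)
        .collect();
    let mut order: Vec<String> = Vec::new();
    while let Some(node) = ready.pop_front() {
        order.push(node.to_string());
        for dependent in dependents.get(node).into_iter().flatten() {
            let count = pending.get_mut(dependent).unwrap();
            *count -= 1;
            if *count == 0 {
                ready.push_back(*dependent);
            }
        }
    }

    if order.len() == nodes.len() {
        Ok(order)
    } else {
        // Anything not emitted is part of, or blocked by, a cycle.
        let stuck: Vec<String> = nodes
            .iter()
            .filter(|n| !order.iter().any(|o| o.as_str() == **n))
            .map(|n| n.to_string())
            .collect();
        Err(stuck)
    }
}
```

Cycle detection falls out of the algorithm: if some nodes never reach a pending count of zero, they are never emitted, and the length check at the end catches it.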
8. Service Management
Supported Services:
| Service | Type | Category | Purpose |
|---|---|---|---|
| orchestrator | Platform | Orchestration | Workflow coordination |
| control-center | Platform | UI | Web management interface |
| coredns | Infrastructure | DNS | Local DNS resolution |
| gitea | Infrastructure | Git | Self-hosted Git service |
| oci-registry | Infrastructure | Registry | OCI artifact storage |
| mcp-server | Platform | API | Model Context Protocol |
| api-gateway | Platform | API | Unified API access |
Lifecycle Management:
# Start all auto-start services
provisioning platform start
# Start specific service (with dependencies)
provisioning platform start orchestrator
# Check health
provisioning platform health
# View logs
provisioning platform logs orchestrator --follow
9. Test Environment Service
Architecture:
User Command (CLI)
↓
Test Orchestrator (Rust)
↓
Container Manager (bollard)
↓
Docker API
↓
Isolated Test Containers
Test Types:
- Single taskserv testing
- Server simulation (multiple taskservs)
- Multi-node cluster topologies
Topology Templates:
- kubernetes_3node - 3-node HA cluster
- kubernetes_single - All-in-one K8s
- etcd_cluster - 3-node etcd
- postgres_redis - Database stack
Mode Architecture
Mode-Based System Overview
The platform supports four operational modes that adapt the system from individual development to enterprise production.
Mode Comparison
┌───────────────────────────────────────────────────────────────────────┐
│ MODE ARCHITECTURE │
├───────────────┬───────────────┬───────────────┬───────────────────────┤
│ SOLO │ MULTI-USER │ CI/CD │ ENTERPRISE │
├───────────────┼───────────────┼───────────────┼───────────────────────┤
│ │ │ │ │
│ Single Dev │ Team (5-20) │ Pipelines │ Production │
│ │ │ │ │
│ ┌─────────┐ │ ┌──────────┐ │ ┌──────────┐ │ ┌──────────────────┐ │
│ │ No Auth │ │ │Token(JWT)│ │ │Token(1h) │ │ │ mTLS (TLS 1.3) │ │
│ └─────────┘ │ └──────────┘ │ └──────────┘ │ └──────────────────┘ │
│ │ │ │ │
│ ┌─────────┐ │ ┌──────────┐ │ ┌──────────┐ │ ┌──────────────────┐ │
│ │ Local │ │ │ Remote │ │ │ Remote │ │ │ Kubernetes (HA) │ │
│ │ Binary │ │ │ Docker │ │ │ K8s │ │ │ Multi-AZ │ │
│ └─────────┘ │ └──────────┘ │ └──────────┘ │ └──────────────────┘ │
│ │ │ │ │
│ ┌─────────┐ │ ┌──────────┐ │ ┌──────────┐ │ ┌──────────────────┐ │
│ │ Local │ │ │ OCI (Zot)│ │ │OCI(Harbor│ │ │ OCI (Harbor HA) │ │
│ │ Files │ │ │ or Harbor│ │ │ required)│ │ │ + Replication │ │
│ └─────────┘ │ └──────────┘ │ └──────────┘ │ └──────────────────┘ │
│ │ │ │ │
│ ┌─────────┐ │ ┌──────────┐ │ ┌──────────┐ │ ┌──────────────────┐ │
│ │ None │ │ │ Gitea │ │ │ Disabled │ │ │ etcd (mandatory) │ │
│ │ │ │ │(optional)│ │ │ (stateless) │ │ │ │
│ └─────────┘ │ └──────────┘ │ └──────────┘ │ └──────────────────┘ │
│ │ │ │ │
│ Unlimited │ 10 srv, 32 │ 5 srv, 16 │ 20 srv, 64 cores │
│ │ cores, 128GB │ cores, 64GB │ 256GB per user │
│ │ │ │ │
└───────────────┴───────────────┴───────────────┴───────────────────────┘
Mode Configuration
Mode Templates: workspace/config/modes/{mode}.yaml
Active Mode: ~/.provisioning/config/active-mode.yaml
Switching Modes:
# Check current mode
provisioning mode current
# Switch to another mode
provisioning mode switch multi-user
# Validate mode requirements
provisioning mode validate enterprise
Mode-Specific Workflows
Solo Mode
# 1. Default mode, no setup needed
provisioning workspace init
# 2. Start local orchestrator
provisioning platform start orchestrator
# 3. Create infrastructure
provisioning server create
Multi-User Mode
# 1. Switch mode and authenticate
provisioning mode switch multi-user
provisioning auth login
# 2. Lock workspace
provisioning workspace lock my-infra
# 3. Pull extensions from OCI
provisioning extension pull upcloud kubernetes
# 4. Work...
# 5. Unlock workspace
provisioning workspace unlock my-infra
CI/CD Mode
# GitLab CI
deploy:
  stage: deploy
  script:
    - export PROVISIONING_MODE=cicd
    - echo "$TOKEN" > /var/run/secrets/provisioning/token
    - provisioning validate --all
    - provisioning test quick kubernetes
    - provisioning server create --check
    - provisioning server create
  after_script:
    - provisioning workspace cleanup
Enterprise Mode
# 1. Switch to enterprise, verify K8s
provisioning mode switch enterprise
kubectl get pods -n provisioning-system
# 2. Request workspace (approval required)
provisioning workspace request prod-deployment
# 3. After approval, lock with etcd
provisioning workspace lock prod-deployment --provider etcd
# 4. Pull verified extensions
provisioning extension pull upcloud --verify-signature
# 5. Deploy
provisioning infra create --check
provisioning infra create
# 6. Release
provisioning workspace unlock prod-deployment
Network Architecture
Service Communication
┌──────────────────────────────────────────────────────────────────────┐
│ NETWORK LAYER │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────────────┐ ┌──────────────────────────┐ │
│ │ Ingress/Load │ │ API Gateway │ │
│ │ Balancer │──────────│ (Optional) │ │
│ └───────────────────────┘ └──────────────────────────┘ │
│ │ │ │
│ │ │ │
│ ┌───────────┴────────────────────────────────────┴──────────┐ │
│ │ Service Mesh (Optional) │ │
│ │ (mTLS, Circuit Breaking, Retries) │ │
│ └────┬──────────┬───────────┬────────────┬──────────────┬───┘ │
│ │ │ │ │ │ │
│ ┌────┴─────┐ ┌─┴────────┐ ┌┴─────────┐ ┌┴──────────┐ ┌┴───────┐ │
│ │ Orchestr │ │ Control │ │ CoreDNS │ │ Gitea │ │ OCI │ │
│ │ ator │ │ Center │ │ │ │ │ │Registry│ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │ :8080 │ │ :3000 │ │ :5353 │ │ :3001 │ │ :5000 │ │
│ └──────────┘ └──────────┘ └──────────┘ └───────────┘ └────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ DNS Resolution (CoreDNS) │ │
│ │ • *.prov.local → Internal services │ │
│ │ • *.infra.local → Infrastructure nodes │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────┘
Port Allocation
| Service | Port | Protocol | Purpose |
|---|---|---|---|
| Orchestrator | 8080 | HTTP/WS | REST API, WebSocket |
| Control Center | 3000 | HTTP | Web UI |
| CoreDNS | 5353 | UDP/TCP | DNS resolution |
| Gitea | 3001 | HTTP | Git operations |
| OCI Registry (Zot) | 5000 | HTTP | OCI artifacts |
| OCI Registry (Harbor) | 443 | HTTPS | OCI artifacts (prod) |
| MCP Server | 8081 | HTTP | MCP protocol |
| API Gateway | 8082 | HTTP | Unified API |
Network Security
Solo Mode:
- Localhost-only bindings
- No authentication
- No encryption
Multi-User Mode:
- Token-based authentication (JWT)
- TLS for external access
- Firewall rules
CI/CD Mode:
- Token authentication (short-lived)
- Full TLS encryption
- Network isolation
Enterprise Mode:
- mTLS for all connections
- Network policies (Kubernetes)
- Zero-trust networking
- Audit logging
Data Architecture
Data Storage
┌────────────────────────────────────────────────────────────────┐
│ DATA LAYER │
├────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Configuration Data (Hierarchical) │ │
│ │ │ │
│ │ ~/.provisioning/ │ │
│ │ ├── config.user.toml (User preferences) │ │
│ │ └── config/ │ │
│ │ ├── active-mode.yaml (Active mode) │ │
│ │ └── user_config.yaml (Workspaces, preferences) │ │
│ │ │ │
│ │ workspace/ │ │
│ │ ├── config/ │ │
│ │ │ ├── provisioning.yaml (Workspace config) │ │
│ │ │ └── modes/*.yaml (Mode templates) │ │
│ │ └── infra/{name}/ │ │
│ │ ├── settings.k (Infrastructure KCL) │ │
│ │ └── config.toml (Infra-specific) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ State Data (Runtime) │ │
│ │ │ │
│ │ ~/.provisioning/orchestrator/data/ │ │
│ │ ├── tasks/ (Task queue) │ │
│ │ ├── workflows/ (Workflow state) │ │
│ │ └── checkpoints/ (Recovery points) │ │
│ │ │ │
│ │ ~/.provisioning/services/ │ │
│ │ ├── pids/ (Process IDs) │ │
│ │ ├── logs/ (Service logs) │ │
│ │ └── state/ (Service state) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Cache Data (Performance) │ │
│ │ │ │
│ │ ~/.provisioning/cache/ │ │
│ │ ├── oci/ (OCI artifacts) │ │
│ │ ├── kcl/ (Compiled KCL) │ │
│ │ └── modules/ (Module cache) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Extension Data (OCI Artifacts) │ │
│ │ │ │
│ │ OCI Registry (localhost:5000 or harbor.company.com) │ │
│ │ ├── provisioning-core:v3.5.0 │ │
│ │ ├── provisioning-extensions/ │ │
│ │ │ ├── kubernetes:1.28.0 │ │
│ │ │ ├── aws:2.0.0 │ │
│ │ │ └── (100+ artifacts) │ │
│ │ └── provisioning-platform/ │ │
│ │ ├── orchestrator:v1.2.0 │ │
│ │ └── (4 service images) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Secrets (Encrypted) │ │
│ │ │ │
│ │ workspace/secrets/ │ │
│ │ ├── keys.yaml.enc (SOPS-encrypted) │ │
│ │ ├── ssh-keys/ (SSH keys) │ │
│ │ └── tokens/ (API tokens) │ │
│ │ │ │
│ │ KMS Integration (Enterprise): │ │
│ │ • AWS KMS │ │
│ │ • HashiCorp Vault │ │
│ │ • Age encryption (local) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────┘
Data Flow
Configuration Loading:
1. Load system defaults (config.defaults.toml)
2. Merge user config (~/.provisioning/config.user.toml)
3. Load workspace config (workspace/config/provisioning.yaml)
4. Load environment config (workspace/config/{env}-defaults.toml)
5. Load infrastructure config (workspace/infra/{name}/config.toml)
6. Apply runtime overrides (ENV variables, CLI flags)
State Persistence:
Workflow execution
↓
Create checkpoint (JSON)
↓
Save to ~/.provisioning/orchestrator/data/checkpoints/
↓
On failure, load checkpoint and resume
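The resume-from-checkpoint flow can be sketched as follows. This is a simplified illustration: the checkpoint here is a plain text file holding the index of the next step, whereas the platform stores JSON checkpoints; `load_checkpoint`, `run_workflow`, and the injected `run` callback are hypothetical names.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Read the index of the next step to run.
/// A missing checkpoint file means "start at step 0".
fn load_checkpoint(path: &Path) -> io::Result<usize> {
    match fs::read_to_string(path) {
        Ok(text) => text
            .trim()
            .parse::<usize>()
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(0),
        Err(e) => Err(e),
    }
}

/// Run `steps` starting at the last checkpoint, saving progress after each
/// successful step. `run` returns false to signal a failed step; the
/// checkpoint then still points at that step, so the next call retries it.
/// Returns Ok(true) when the whole workflow has finished.
fn run_workflow<F>(steps: &[&str], checkpoint: &Path, mut run: F) -> io::Result<bool>
where
    F: FnMut(&str) -> bool,
{
    let start = load_checkpoint(checkpoint)?;
    for (index, step) in steps.iter().enumerate().skip(start) {
        if !run(*step) {
            return Ok(false);
        }
        fs::write(checkpoint, (index + 1).to_string())?;
    }
    // Workflow finished: remove the checkpoint so a later run starts fresh.
    if checkpoint.exists() {
        fs::remove_file(checkpoint)?;
    }
    Ok(true)
}
```

Writing the checkpoint only after a step succeeds means a crash mid-step leads to that step being retried, so steps should be safe to run more than once.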
OCI Artifact Flow:
1. Package extension (oci-package.nu)
2. Push to OCI registry (provisioning oci push)
3. Extension stored as OCI artifact
4. Pull when needed (provisioning oci pull)
5. Cache locally (~/.provisioning/cache/oci/)
Security Architecture
Security Layers
┌─────────────────────────────────────────────────────────────────┐
│ SECURITY ARCHITECTURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Layer 1: Authentication & Authorization │ │
│ │ │ │
│ │ Solo: None (local development) │ │
│ │ Multi-user: JWT tokens (24h expiry) │ │
│ │ CI/CD: CI-injected tokens (1h expiry) │ │
│ │ Enterprise: mTLS (TLS 1.3, mutual auth) │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Layer 2: Encryption │ │
│ │ │ │
│ │ In Transit: │ │
│ │ • TLS 1.3 (multi-user, CI/CD, enterprise) │ │
│ │ • mTLS (enterprise) │ │
│ │ │ │
│ │ At Rest: │ │
│ │ • SOPS + Age (secrets encryption) │ │
│ │ • KMS integration (CI/CD, enterprise) │ │
│ │ • Encrypted filesystems (enterprise) │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Layer 3: Secret Management │ │
│ │ │ │
│ │ • SOPS for file encryption │ │
│ │ • Age for key management │ │
│ │ • KMS integration (AWS KMS, Vault) │ │
│ │ • SSH key storage (KMS-backed) │ │
│ │ • API token management │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Layer 4: Access Control │ │
│ │ │ │
│ │ • RBAC (Role-Based Access Control) │ │
│ │ • Workspace isolation │ │
│ │ • Workspace locking (Gitea, etcd) │ │
│ │ • Resource quotas (per-user limits) │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Layer 5: Network Security │ │
│ │ │ │
│ │ • Network policies (Kubernetes) │ │
│ │ • Firewall rules │ │
│ │ • Zero-trust networking (enterprise) │ │
│ │ • Service mesh (optional, mTLS) │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Layer 6: Audit & Compliance │ │
│ │ │ │
│ │ • Audit logs (all operations) │ │
│ │ • Compliance policies (SOC2, ISO27001) │ │
│ │ • Image signing (cosign, notation) │ │
│ │ • Vulnerability scanning (Harbor) │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Secret Management
SOPS Integration:
# Edit encrypted file
provisioning sops workspace/secrets/keys.yaml.enc
# Encryption happens automatically on save
# Decryption happens automatically on load
KMS Integration (Enterprise):
# workspace/config/provisioning.yaml
secrets:
  provider: "kms"
  kms:
    type: "aws" # or "vault"
    region: "us-east-1"
    key_id: "arn:aws:kms:..."
Image Signing and Verification
CI/CD Mode (Required):
# Sign OCI artifact
cosign sign oci://registry/kubernetes:1.28.0
# Verify signature
cosign verify oci://registry/kubernetes:1.28.0
Enterprise Mode (Mandatory):
# Pull with verification
provisioning extension pull kubernetes --verify-signature
# System blocks unsigned artifacts
Deployment Architecture
Deployment Modes
1. Binary Deployment (Solo, Multi-user)
User Machine
├── ~/.provisioning/bin/
│ ├── provisioning-orchestrator
│ ├── provisioning-control-center
│ └── ...
├── ~/.provisioning/orchestrator/data/
├── ~/.provisioning/services/
└── Process Management (PID files, logs)
Pros: Simple, fast startup, no Docker dependency
Cons: Platform-specific binaries, manual updates
2. Docker Deployment (Multi-user, CI/CD)
Docker Daemon
├── Container: provisioning-orchestrator
├── Container: provisioning-control-center
├── Container: provisioning-coredns
├── Container: provisioning-gitea
├── Container: provisioning-oci-registry
└── Volumes: ~/.provisioning/data/
Pros: Consistent environment, easy updates
Cons: Requires Docker, resource overhead
3. Docker Compose Deployment (Multi-user)
# provisioning/platform/docker-compose.yaml
services:
  orchestrator:
    image: provisioning-platform/orchestrator:v1.2.0
    ports:
      - "8080:9090"
    volumes:
      - orchestrator-data:/data
  control-center:
    image: provisioning-platform/control-center:v1.2.0
    ports:
      - "3000:3000"
    depends_on:
      - orchestrator
  coredns:
    image: coredns/coredns:1.11.1
    ports:
      - "5353:53/udp"
  gitea:
    image: gitea/gitea:1.20
    ports:
      - "3001:3000"
  oci-registry:
    image: ghcr.io/project-zot/zot:latest
    ports:
      - "5000:5000"
Pros: Easy multi-service orchestration, declarative
Cons: Local only, no HA
4. Kubernetes Deployment (CI/CD, Enterprise)
# Namespace: provisioning-system
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orchestrator
spec:
  replicas: 3 # HA
  selector:
    matchLabels:
      app: orchestrator
  template:
    metadata:
      labels:
        app: orchestrator
    spec:
      containers:
        - name: orchestrator
          image: harbor.company.com/provisioning-platform/orchestrator:v1.2.0
          ports:
            - containerPort: 8080
          env:
            - name: RUST_LOG
              value: "info"
          volumeMounts:
            - name: data
              mountPath: /data
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: orchestrator-data
Pros: HA, scalability, production-ready
Cons: Complex setup, Kubernetes required
5. Remote Deployment (All modes)
# Connect to remotely-running services
services:
  orchestrator:
    deployment:
      mode: "remote"
      remote:
        endpoint: "https://orchestrator.company.com"
        tls_enabled: true
        auth_token_path: "~/.provisioning/tokens/orchestrator.token"
Pros: No local resources, centralized
Cons: Network dependency, latency
Integration Architecture
Integration Patterns
1. Hybrid Language Integration (Rust ↔ Nushell)
Rust Orchestrator
↓ (HTTP API)
Nushell CLI
↓ (exec via bridge)
Nushell Business Logic
↓ (returns JSON)
Rust Orchestrator
↓ (updates state)
File-based Task Queue
Communication: HTTP API + stdin/stdout JSON
2. Provider Abstraction
Unified Provider Interface
├── create_server(config) -> Server
├── delete_server(id) -> bool
├── list_servers() -> [Server]
└── get_server_status(id) -> Status
Provider Implementations:
├── AWS Provider (aws-sdk-rust, aws cli)
├── UpCloud Provider (upcloud API)
└── Local Provider (Docker, libvirt)
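The unified interface maps naturally onto a trait. The sketch below is an assumption about shape only: the real providers are implemented in Nushell, `create_server` here takes a name instead of a full config, and `InMemoryProvider` is a hypothetical stand-in so the example runs without any cloud account.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
enum Status {
    Running,
    NotFound,
}

#[derive(Debug, Clone, PartialEq)]
struct Server {
    id: u32,
    name: String,
}

/// The four operations of the unified interface, expressed as a trait.
trait Provider {
    fn create_server(&mut self, name: &str) -> Server;
    fn delete_server(&mut self, id: u32) -> bool;
    fn list_servers(&self) -> Vec<Server>;
    fn get_server_status(&self, id: u32) -> Status;
}

/// In-memory stand-in for a real provider; it only tracks names by id.
#[derive(Default)]
struct InMemoryProvider {
    next_id: u32,
    servers: BTreeMap<u32, String>,
}

impl Provider for InMemoryProvider {
    fn create_server(&mut self, name: &str) -> Server {
        self.next_id += 1;
        self.servers.insert(self.next_id, name.to_string());
        Server { id: self.next_id, name: name.to_string() }
    }
    fn delete_server(&mut self, id: u32) -> bool {
        self.servers.remove(&id).is_some()
    }
    fn list_servers(&self) -> Vec<Server> {
        self.servers
            .iter()
            .map(|(id, name)| Server { id: *id, name: name.clone() })
            .collect()
    }
    fn get_server_status(&self, id: u32) -> Status {
        if self.servers.contains_key(&id) {
            Status::Running
        } else {
            Status::NotFound
        }
    }
}

/// Code written against the trait works with any provider implementation.
fn replace_server(provider: &mut dyn Provider, old_id: u32, name: &str) -> Server {
    provider.delete_server(old_id);
    provider.create_server(name)
}
```

`replace_server` never names a concrete provider, which is the property that lets batch workflows mix AWS, UpCloud, and local targets.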
3. OCI Registry Integration
Extension Development
↓
Package (oci-package.nu)
↓
Push (provisioning oci push)
↓
OCI Registry (Zot/Harbor)
↓
Pull (provisioning oci pull)
↓
Cache (~/.provisioning/cache/oci/)
↓
Load into Workspace
4. Gitea Integration (Multi-user, Enterprise)
Workspace Operations
↓
Check Lock Status (Gitea API)
↓
Acquire Lock (Create lock file in Git)
↓
Perform Changes
↓
Commit + Push
↓
Release Lock (Delete lock file)
Benefits:
- Distributed locking
- Change tracking via Git history
- Collaboration features
5. CoreDNS Integration
Service Registration
↓
Update CoreDNS Corefile
↓
Reload CoreDNS
↓
DNS Resolution Available
Zones:
├── *.prov.local (Internal services)
├── *.infra.local (Infrastructure nodes)
└── *.test.local (Test environments)
Performance and Scalability
Performance Characteristics
| Metric | Value | Notes |
|---|---|---|
| CLI Startup Time | < 100ms | Nushell cold start |
| CLI Response Time | < 50ms | Most commands |
| Workflow Submission | < 200ms | To orchestrator |
| Task Processing | 10-50/sec | Orchestrator throughput |
| Batch Operations | Up to 100 servers | Parallel execution |
| OCI Pull Time | 1-5s | Cached: <100ms |
| Configuration Load | < 500ms | Full hierarchy |
| Health Check Interval | 10s | Configurable |
Scalability Limits
Solo Mode:
- Unlimited local resources
- Limited by machine capacity
Multi-User Mode:
- 10 servers per user
- 32 cores, 128GB RAM per user
- 5-20 concurrent users
CI/CD Mode:
- 5 servers per pipeline
- 16 cores, 64GB RAM per pipeline
- 100+ concurrent pipelines
Enterprise Mode:
- 20 servers per user
- 64 cores, 256GB RAM per user
- 1000+ concurrent users
- Horizontal scaling via Kubernetes
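The per-mode limits above can be encoded as a small lookup plus a check. This is a sketch of how a request might be validated against them; the type and function names (`Mode`, `Quota`, `quota_for`, `check_request`) are hypothetical, and only the numbers come from the list above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Solo,
    MultiUser,
    CiCd,
    Enterprise,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Quota {
    servers: u32,
    cores: u32,
    ram_gb: u32,
}

/// Per-user (or per-pipeline) limits for each mode; Solo has no quota.
fn quota_for(mode: Mode) -> Option<Quota> {
    match mode {
        Mode::Solo => None,
        Mode::MultiUser => Some(Quota { servers: 10, cores: 32, ram_gb: 128 }),
        Mode::CiCd => Some(Quota { servers: 5, cores: 16, ram_gb: 64 }),
        Mode::Enterprise => Some(Quota { servers: 20, cores: 64, ram_gb: 256 }),
    }
}

/// Check a request against the mode's quota and name the first exceeded limit.
fn check_request(mode: Mode, request: Quota) -> Result<(), String> {
    let limit = match quota_for(mode) {
        Some(limit) => limit,
        None => return Ok(()), // Solo: bounded only by the machine
    };
    let checks = [
        ("servers", request.servers, limit.servers),
        ("cores", request.cores, limit.cores),
        ("ram_gb", request.ram_gb, limit.ram_gb),
    ];
    for (name, wanted, allowed) in checks.iter() {
        if wanted > allowed {
            return Err(format!("{} quota exceeded: {} > {}", name, wanted, allowed));
        }
    }
    Ok(())
}
```

A request exactly at the limit is accepted; only values strictly above it are rejected.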
Optimization Strategies
Caching:
- OCI artifacts cached locally
- KCL compilation cached
- Module resolution cached
Parallel Execution:
- Batch operations with configurable limits
- Dependency-aware parallel starts
- Workflow DAG execution
Incremental Operations:
- Only update changed resources
- Checkpoint-based recovery
- Delta synchronization
Evolution and Roadmap
Version History
| Version | Date | Major Features |
|---|---|---|
| v3.5.0 | 2025-10-06 | Mode system, OCI distribution, comprehensive docs |
| v3.4.0 | 2025-10-06 | Test environment service |
| v3.3.0 | 2025-09-30 | Interactive guides |
| v3.2.0 | 2025-09-30 | Modular CLI refactoring |
| v3.1.0 | 2025-09-25 | Batch workflow system |
| v3.0.0 | 2025-09-25 | Hybrid orchestrator |
| v2.0.5 | 2025-10-02 | Workspace switching |
| v2.0.0 | 2025-09-23 | Configuration migration |
Roadmap (Future Versions)
v3.6.0 (Q1 2026):
- GraphQL API
- Advanced RBAC
- Multi-tenancy
- Observability enhancements (OpenTelemetry)
v4.0.0 (Q2 2026):
- Multi-repository split complete
- Extension marketplace
- Advanced workflow features (conditional execution, loops)
- Cost optimization engine
v4.1.0 (Q3 2026):
- AI-assisted infrastructure generation
- Policy-as-code (OPA integration)
- Advanced compliance features
Long-term Vision:
- Serverless workflow execution
- Edge computing support
- Multi-cloud failover
- Self-healing infrastructure
Related Documentation
Architecture
- Multi-Repo Architecture - Repository organization
- Design Principles - Architectural philosophy
- Integration Patterns - Integration details
- Orchestrator Model - Hybrid orchestration
ADRs
- ADR-001 - Project structure
- ADR-002 - Distribution strategy
- ADR-003 - Workspace isolation
- ADR-004 - Hybrid architecture
- ADR-005 - Extension framework
- ADR-006 - CLI refactoring
User Guides
- Getting Started - First steps
- Mode System - Modes overview
- Service Management - Services
- OCI Registry - OCI operations
Maintained By: Architecture Team
Review Cycle: Quarterly
Next Review: 2026-01-06