2026-01-14 04:53:21 +00:00

# System Overview

2026-01-17 03:58:28 +00:00

Complete architecture of the Provisioning Infrastructure Automation Platform.

## Architecture Layers

Provisioning uses a 5-layer modular architecture:

```text
┌─────────────────────────────────────────────────────────────┐
│ User Interface Layer                                        │
│ • CLI (provisioning command) • Web Control Center (UI)      │
│ • REST API • MCP Server (AI) • Batch Scheduler              │
└──────────────────────────┬──────────────────────────────────┘
                           ↓
┌─────────────────────────────────────────────────────────────┐
│ Core Engine Layer (provisioning/core/)                      │
│ • 211-line CLI dispatcher (84% code reduction)              │
│ • 476+ configuration accessors (hierarchical)               │
│ • Provider abstraction (multi-cloud support)                │
│ • Workspace management system                               │
│ • Infrastructure validation (54+ Nushell libraries)         │
│ • Secrets management (SOPS + Age integration)               │
└──────────────────────────┬──────────────────────────────────┘
                           ↓
┌─────────────────────────────────────────────────────────────┐
│ Orchestration Layer (provisioning/platform/)                │
│ • Hybrid Orchestrator (Rust + Nushell)                      │
│ • Workflow execution with checkpoints                       │
│ • Dependency resolver & task scheduler                      │
│ • File-based persistence                                    │
│ • REST API endpoints (83+)                                  │
│ • State management (SurrealDB)                              │
└──────────────────────────┬──────────────────────────────────┘
                           ↓
┌─────────────────────────────────────────────────────────────┐
│ Extension Layer (provisioning/extensions/)                  │
│ • Cloud Providers (UpCloud, AWS, Hetzner, Local)            │
│ • Task Services (50+ services in 18 categories)             │
│ • Clusters (9 pre-built cluster templates)                  │
│ • Batch Workflows (automation templates)                    │
│ • Nushell Plugins (10-50x performance gains)                │
└──────────────────────────┬──────────────────────────────────┘
                           ↓
┌─────────────────────────────────────────────────────────────┐
│ Infrastructure Layer                                        │
│ • Cloud Resources (servers, networks, storage)              │
│ • Running Services (Kubernetes, databases, etc.)            │
│ • State Persistence (SurrealDB, file storage)               │
│ • Monitoring & Logging (Prometheus, Loki)                   │
└─────────────────────────────────────────────────────────────┘
```

## Core System Components

### 1. CLI Layer (`provisioning/core/cli/`)

**Entry Point**: `provisioning/core/cli/provisioning`

- **Bash wrapper** (210 lines) - Minimal bootstrap
- Routes commands to Nushell dispatcher
- Loads environment and validates workspace
- Handles error reporting

**Key Features**:

- Single entry point
- Pluggable architecture
- Support for 111+ commands
- 80+ shortcuts for productivity

### 2. Core Engine (`provisioning/core/nulib/`)

**Structure**: 54 Nushell libraries organized by function

**Main Components**:

#### **Configuration Management** (`lib_provisioning/config/`)

- **Hierarchical loading**: 5-layer precedence system
- **476+ accessors**: Type-safe configuration access
- **Variable interpolation**: Template expansion
- **TOML merging**: Environment-specific overrides
- **Validation**: Schema enforcement

#### **Provider Abstraction** (`lib_provisioning/providers/`)

- **Multi-cloud support**: UpCloud, AWS, Hetzner, Local
- **Unified interface**: Single API for all providers
- **Dynamic loading**: Load providers on-demand
- **Credential management**: Encrypted credential handling
- **State tracking**: Provider-specific state persistence

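
The unified interface and on-demand loading listed for the provider abstraction can be pictured as a small registry. The following is a minimal Python sketch for illustration only, not the platform's Nushell implementation: `Provider`, `register`, `load_provider`, and the in-memory `LocalProvider` are hypothetical names and behavior that do not come from the codebase.

```python
from typing import Callable, Dict, Protocol


class Provider(Protocol):
    """Hypothetical unified interface that every provider would implement."""

    name: str

    def create_server(self, hostname: str) -> dict: ...
    def delete_server(self, hostname: str) -> bool: ...


# Factories are registered by name; nothing is instantiated until requested.
_FACTORIES: Dict[str, Callable[[], Provider]] = {}
_LOADED: Dict[str, Provider] = {}


def register(name: str, factory: Callable[[], Provider]) -> None:
    _FACTORIES[name] = factory


def load_provider(name: str) -> Provider:
    """Instantiate a provider on first use and reuse it afterwards."""
    if name not in _LOADED:
        if name not in _FACTORIES:
            raise KeyError(f"unknown provider: {name}")
        _LOADED[name] = _FACTORIES[name]()
    return _LOADED[name]


class LocalProvider:
    """Illustrative in-memory provider standing in for a real cloud API."""

    name = "local"

    def __init__(self) -> None:
        self.servers: Dict[str, dict] = {}

    def create_server(self, hostname: str) -> dict:
        self.servers[hostname] = {"hostname": hostname, "state": "running"}
        return self.servers[hostname]

    def delete_server(self, hostname: str) -> bool:
        return self.servers.pop(hostname, None) is not None


register("local", LocalProvider)
```

Callers only ever see the `Provider` interface, so adding a provider in this sketch means registering one more factory rather than changing call sites.
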
#### **Workspace Management** (`lib_provisioning/workspace/`)

- **Workspace registry**: Track all workspaces
- **Switching**: Atomic workspace transitions
- **Isolation**: Independent state per workspace
- **Configuration loading**: Workspace-specific overrides
- **Extensions**: Inherit from platform extensions

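
One common way to make a workspace switch atomic is to write the new registry to a temporary file and rename it over the old one. The sketch below assumes a JSON registry file with `active` and `workspaces` keys; that layout and the function name `switch_workspace` are assumptions made for illustration, not the platform's actual format.

```python
import json
import os
import tempfile


def switch_workspace(registry_path: str, workspace: str) -> None:
    """Atomically record `workspace` as the active one in a JSON registry file."""
    with open(registry_path, encoding="utf-8") as fh:
        registry = json.load(fh)
    if workspace not in registry["workspaces"]:
        raise ValueError(f"workspace not registered: {workspace}")
    registry["active"] = workspace

    # Write to a temp file in the same directory, then rename over the original.
    # os.replace is atomic on POSIX when both paths share a filesystem, so
    # readers see either the old registry or the new one, never a partial file.
    directory = os.path.dirname(os.path.abspath(registry_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(registry, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, registry_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A failed switch leaves the previous registry untouched, which is the property "atomic transition" implies.
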
#### **Infrastructure Validation** (`lib_provisioning/infra_validator/`)

- **Schema validation**: Nickel contract checking
- **Constraint enforcement**: Business rule validation
- **Dependency analysis**: Infrastructure dependency graph
- **Type checking**: Static type validation
- **Error reporting**: Detailed error messages with suggestions

#### **Secrets Management** (`lib_provisioning/secrets/`)

- **SOPS integration**: Mozilla SOPS for encryption
- **Age encryption**: Modern file encryption based on X25519 key pairs
- **KMS backends**: Cosmian, AWS KMS, local
- **Credential injection**: Runtime variable substitution
- **Audit logging**: Track secret access

#### **Command Utilities** (`lib_provisioning/cmd/`)

- **SSH operations**: Remote command execution
- **Batch operations**: Parallel command execution
- **Error handling**: Structured error reporting
- **Logging**: Comprehensive operation logging
- **Retry logic**: Automatic retry with backoff

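
Retry with backoff is usually implemented as a loop whose delay doubles after each failure. This is a minimal Python sketch of that pattern; the function name, the default limits, and the use of random jitter are assumptions for illustration and are not taken from the command utilities themselves.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `max_attempts` is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # The cap doubles each attempt (base, 2*base, 4*base, ...).
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter spreads retries so clients do not synchronize.
            sleep(random.uniform(0, delay))
            attempt += 1
```

The `sleep` parameter exists so callers and tests can substitute a fake clock instead of waiting in real time.
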
### 3. Orchestration Engine (`provisioning/platform/`)

**Technology**: Rust + Nushell hybrid

**12 Microservices** (Rust crates):

| Service | Purpose | Key Features |
| --- | --- | --- |
| orchestrator | Workflow execution | Scheduler, file persistence, REST API |
| control-center | API gateway + auth | RBAC, Cedar policies, audit logging |
| control-center-ui | Web dashboard | Infrastructure view, config management |
| mcp-server | AI integration | Model Context Protocol, auto-completion |
| vault-service | Secrets storage | Encryption, KMS, credential injection |
| extension-registry | OCI registry | Extension distribution, versioning |
| ai-service | LLM features | Prompt optimization, context awareness |
| detector | Anomaly detection | Health monitoring, pattern recognition |
| rag | Knowledge retrieval | Document embedding, semantic search |
| provisioning-daemon | Background service | Event monitoring, task scheduling |
| platform-config | Config management | Schema validation, environment handling |
| service-clients | API clients | SDK for platform services, cloud APIs |

**Detailed Services**:

#### **Orchestrator** (`crates/orchestrator/`)

- **High-performance scheduler**: Rust core
- **File-based persistence**: Durable queue
- **Workflow execution**: Dependency-aware scheduling
- **Checkpoint recovery**: Resume from failures
- **Parallel execution**: Multi-task handling
- **State management**: Track job status
- **REST API**: 9 core endpoints
- **Port**: 9090 (health check endpoint)

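
Checkpoint recovery means a workflow that fails midway can be rerun and will skip the steps that already finished. The Python sketch below shows the idea under stated assumptions: the checkpoint is a JSON file with a `completed` list, and `run_workflow` is a hypothetical helper. The real orchestrator is a Rust service whose checkpoint format is not described here.

```python
import json
import os
from typing import Callable, Dict, List


def run_workflow(
    steps: List[str],
    actions: Dict[str, Callable[[], None]],
    checkpoint_path: str,
) -> List[str]:
    """Run `steps` in order, skipping any already recorded in the checkpoint.

    Returns the steps executed during this invocation.
    """
    done: List[str] = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as fh:
            done = json.load(fh)["completed"]

    executed: List[str] = []
    for step in steps:
        if step in done:
            continue  # finished in an earlier run
        actions[step]()  # an exception here leaves the checkpoint untouched
        done.append(step)
        executed.append(step)
        # Persist after every step so a crash loses at most the step in flight.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as fh:
            json.dump({"completed": done}, fh)
        os.replace(tmp_path, checkpoint_path)
    return executed
```

Calling `run_workflow` again with the same checkpoint path after a failure resumes at the first step that has not been recorded as completed.
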
#### **Control Center** (`crates/control-center/`)

- **Authorization engine**: Cedar policy enforcement
- **RBAC system**: Role-based access control
- **Audit logging**: Complete audit trail
- **API gateway**: REST API for all operations
- **System configuration**: Central configuration management
- **Health monitoring**: Real-time system status

#### **Control Center UI** (`crates/control-center-ui/`)

- **Web dashboard**: Real-time infrastructure view
- **Workflow visualization**: Batch job monitoring
- **Configuration management**: Web-based configuration
- **Resource explorer**: Browse infrastructure
- **Audit viewer**: Security audit trail

#### **MCP Server** (`crates/mcp-server/`)

- **AI integration**: Model Context Protocol support
- **Natural language**: Parse infrastructure requests
- **Auto-completion**: Intelligent configuration suggestions
- **7 settings tools**: Configuration management via LLM
- **Context-aware**: Understand workspace context

#### **Vault Service** (`crates/vault-service/`)

- **Secrets backend**: Encrypted credential storage
- **KMS integration**: Key Management System support
- **SOPS + Age**: SOPS encryption backend
- **Credential injection**: Secure credential delivery
- **Audit logging**: Secret access tracking

#### **Extension Registry** (`crates/extension-registry/`)

- **OCI distribution**: Container image distribution
- **Extension packaging**: Provider/taskserv distribution
- **Version management**: Semantic versioning
- **Registry API**: Content addressable storage

#### **AI Service** (`crates/ai-service/`)

- **LLM integration**: Large Language Model support
- **Prompt optimization**: Infrastructure request parsing
- **Context awareness**: Workspace context enrichment
- **Response generation**: Configuration suggestions

#### **Detector** (`crates/detector/`)

- **Anomaly detection**: System health monitoring
- **Pattern recognition**: Infrastructure issue identification
- **Alert generation**: Alerting system integration
- **Real-time monitoring**: Continuous surveillance

#### **Platform Config** (`crates/platform-config/`)

- **Configuration management**: Centralized config loading
- **Schema validation**: Configuration validation
- **Environment handling**: Multi-environment support
- **Default settings**: System-wide defaults

#### **Provisioning Daemon** (`crates/provisioning-daemon/`)

- **Background service**: Continuous operation
- **Event monitoring**: System event handling
- **Task scheduling**: Background job execution
- **State synchronization**: Infrastructure state sync

#### **RAG Service** (`crates/rag/`)

- **Retrieval Augmented Generation**: Knowledge base integration
- **Document embedding**: Semantic search
- **Context retrieval**: Intelligent response context
- **Knowledge synthesis**: Answer generation

#### **Service Clients** (`crates/service-clients/`)

- **API clients**: Client SDK for platform services
- **Cloud providers**: Multi-cloud provider SDKs
- **Request handling**: HTTP/RPC client utilities
- **Connection pooling**: Efficient resource management

### 4. Extensions (`provisioning/extensions/`)

**Modular infrastructure components**:

#### **Providers** (5 cloud providers)

- **UpCloud** - Primary European cloud
- **AWS** - Amazon Web Services
- **Hetzner** - Baremetal & cloud servers
- **Local** - Development environment
- **Demo** - Testing & mocking

Each provider includes:

- Nickel schemas for configuration
- API client implementation
- Server creation/deletion logic
- Network management
- State tracking

#### **Task Services** (50+ services in 18 categories)

| Category | Services | Purpose |
| --- | --- | --- |
| Container Runtime | containerd, crio, podman, crun, youki, runc | Container execution |
| Kubernetes | kubernetes, etcd, coredns, cilium, flannel, calico | Orchestration |
| Storage | rook-ceph, local-storage, mayastor, external-nfs | Data persistence |
| Databases | postgres, redis, mysql, mongodb | Data management |
| Networking | ip-aliases, proxy, resolv, kms | Network services |
| Security | webhook, kms, oras, radicle | Security services |
| Observability | prometheus, grafana, loki, jaeger | Monitoring & logging |
| Development | gitea, coder, desktop, buildkit | Developer tools |
| Hypervisor | kvm, qemu, libvirt | Virtualization |

#### **Clusters** (9 pre-built templates)

- **web** - Web service cluster (nginx + postgres)
- **oci-reg** - Container registry
- **git** - Git hosting (Gitea)
- **buildkit** - Build infrastructure
- **k8s-ha** - HA Kubernetes (3 control planes)
- **postgresql** - HA PostgreSQL cluster
- **cicd-argocd** - GitOps CI/CD
- **cicd-tekton** - Tekton pipelines

### 5. Infrastructure Layer

**What Provisioning Manages**:

- **Cloud Resources**: VMs, networks, storage
- **Services**: Kubernetes, databases, monitoring
- **Applications**: Web services, APIs, tools
- **State**: Configuration, data, logs
- **Monitoring**: Metrics, traces, logs

## Configuration System

**Hierarchical 5-Layer System**:

```text
Precedence (High → Low):

1. Runtime Arguments (CLI flags: --provider upcloud)
   ↓
2. Environment Variables (PROVISIONING_PROVIDER=aws)
   ↓
3. Workspace Config (~workspace/config/provisioning.yaml)
   ↓
4. Environment Defaults (workspace/config/prod-defaults.toml)
   ↓
5. System Defaults (~/.config/provisioning/ + platform defaults)
```

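
The precedence order above amounts to folding the layers from lowest to highest priority with a recursive merge. Here is a minimal Python sketch of that idea; `deep_merge`, `resolve_config`, and every key and value in `layers` are illustrative assumptions and do not reflect the platform's actual accessors or defaults.

```python
from typing import Any, Dict, List


def deep_merge(low: Dict[str, Any], high: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where `high` overrides `low`, merging nested dicts."""
    merged = dict(low)
    for key, value in high.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(layers_high_to_low: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Fold layers from lowest to highest precedence."""
    result: Dict[str, Any] = {}
    for layer in reversed(layers_high_to_low):
        result = deep_merge(result, layer)
    return result


# Sample layers, listed from highest to lowest precedence (made-up values).
layers = [
    {"provider": "upcloud"},                    # 1. runtime arguments
    {"provider": "aws"},                        # 2. environment variables
    {"server": {"plan": "2xCPU-4GB"}},          # 3. workspace config
    {"server": {"zone": "de-fra1"}},            # 4. environment defaults
    {"provider": "local",                       # 5. system defaults
     "server": {"plan": "small", "zone": "local"}},
]
config = resolve_config(layers)
```

With these sample layers the CLI flag wins for `provider`, while `server` combines the workspace's `plan` with the environment default's `zone`, because nested tables are merged key by key rather than replaced wholesale.
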
**Configuration Languages**:

| Format | Purpose | Validation | Editability |
| --- | --- | --- | --- |
| **Nickel** | Infrastructure source | ✅ Type-safe, contracts | Direct |
| **TOML** | Settings, defaults | Schema validation | Direct |
| **YAML** | User config, metadata | Schema validation | Direct |
| **JSON** | Exported configs | Schema validation | Generated |

**Key Features**:

- Lazy evaluation
- Recursive merging
- Variable interpolation
- Constraint checking
- Automatic validation

## State Management

**SurrealDB Graph Database**:

Stores complex infrastructure relationships:

```text
Nodes:
  - Servers (compute)
  - Networks (connectivity)
  - Storage (persistence)
  - Services (software)
  - Workflows (automation)

Edges:
  - Server → Network (connected)
  - Server → Storage (mounted)
  - Service → Server (running on)
  - Workflow → Dependency (depends on)
```

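
To make the node and edge model concrete, the following Python sketch stores the same kinds of relationships as in-memory triples and answers a two-hop question. It is only a stand-in for a graph query and does not use SurrealDB or its query language; the identifiers, relation names, and `services_on_network` are made-up sample data and hypothetical helpers.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source, relation, target)

# Sample edges mirroring the edge kinds listed above.
EDGES: List[Edge] = [
    ("server:web-01", "connected", "network:public"),
    ("server:db-01", "connected", "network:private"),
    ("server:web-01", "mounted", "storage:vol-1"),
    ("service:nginx", "running_on", "server:web-01"),
    ("service:postgres", "running_on", "server:db-01"),
]


def build_reverse_index(edges: List[Edge]) -> Dict[str, Set[Tuple[str, str]]]:
    """Map each target node to the (relation, source) pairs pointing at it."""
    index: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
    for source, relation, target in edges:
        index[target].add((relation, source))
    return index


def services_on_network(network: str, edges: List[Edge] = EDGES) -> List[str]:
    """Two-hop traversal: network <- connected - server <- running_on - service."""
    incoming = build_reverse_index(edges)
    servers = [src for rel, src in incoming[network] if rel == "connected"]
    services = {
        src
        for server in servers
        for rel, src in incoming[server]
        if rel == "running_on"
    }
    return sorted(services)
```

Questions such as "which services are affected if this network goes down" are the kind of relationship query a graph store is suited to.
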
**File-Based Persistence**:

For orchestrator queue and checkpoints:

```text
~/.provisioning/
├── state/          # Infrastructure state
├── checkpoints/    # Workflow checkpoints
├── queue/          # Orchestrator queue
└── logs/           # Operational logs
```

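
A file-based queue can be as simple as one file per task in a directory, with file names that sort in arrival order. The Python sketch below illustrates that approach; the `FileQueue` class, its file naming scheme, and the JSON task format are assumptions for illustration and are not the orchestrator's actual on-disk layout.

```python
import json
import os
from typing import Optional


class FileQueue:
    """Durable FIFO queue: one JSON file per task inside `directory`."""

    def __init__(self, directory: str) -> None:
        self.directory = directory
        os.makedirs(directory, exist_ok=True)
        # Continue numbering after any tasks left over from a previous run.
        existing = [int(f[:-5]) for f in os.listdir(directory) if f.endswith(".json")]
        self._next_id = max(existing, default=0) + 1

    def enqueue(self, task: dict) -> str:
        # Zero-padded ids make lexicographic order equal to arrival order.
        name = f"{self._next_id:012d}.json"
        self._next_id += 1
        tmp = os.path.join(self.directory, name + ".tmp")
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(task, fh)
        os.replace(tmp, os.path.join(self.directory, name))  # publish atomically
        return name

    def dequeue(self) -> Optional[dict]:
        pending = sorted(f for f in os.listdir(self.directory) if f.endswith(".json"))
        if not pending:
            return None
        path = os.path.join(self.directory, pending[0])
        with open(path, encoding="utf-8") as fh:
            task = json.load(fh)
        os.remove(path)  # a production queue would remove only after success
        return task
```

Because pending tasks live on disk rather than in memory, a restarted process picks up where the previous one stopped.
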
## Security Architecture

**4-Layer Security Model**:

| Layer | Components | Features |
| --- | --- | --- |
| **Authentication** | JWT, sessions, MFA | 2FA, TOTP, WebAuthn |
| **Authorization** | Cedar policies, RBAC | Fine-grained permissions |
| **Encryption** | AES-256-GCM, TLS | At-rest & in-transit |
| **Audit** | Logging, compliance | 7-year retention |

**Security Services**:

- JWT token validation
- Argon2id password hashing
- Multi-factor authentication
- Cedar policy enforcement
- Encrypted credential storage
- KMS integration (5 backends)
- Audit logging (5 export formats)
- Compliance checking (SOC2, GDPR, HIPAA)

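
Cedar evaluates requests with two rules: access is denied unless some policy permits it, and any matching `forbid` policy overrides every matching `permit`. The Python sketch below models only that decision logic combined with role matching. It is not Cedar syntax and does not call the Cedar library; the `Policy` fields, role names, actions, and resource paths are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Policy:
    effect: str    # "permit" or "forbid"
    role: str      # role the policy applies to
    action: str    # action name, or "*" for any action
    resource: str  # resource path prefix the policy covers


def is_authorized(
    policies: List[Policy], roles: List[str], action: str, resource: str
) -> bool:
    """Default deny; any matching forbid overrides every matching permit."""
    matching = [
        p
        for p in policies
        if p.role in roles
        and p.action in ("*", action)
        and resource.startswith(p.resource)
    ]
    if any(p.effect == "forbid" for p in matching):
        return False
    return any(p.effect == "permit" for p in matching)


# Sample policy set (illustrative only).
POLICIES = [
    Policy("permit", "developer", "*", "workspace/dev/"),
    Policy("permit", "operator", "server.delete", "workspace/"),
    Policy("forbid", "operator", "server.delete", "workspace/prod/"),
]
```

With these sample policies an operator may delete servers everywhere except under `workspace/prod/`, and a developer has no access outside `workspace/dev/` because nothing permits it.
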
## Performance Characteristics

**Modular CLI** (84% code reduction):

- Main CLI: 211 lines (vs. 1,329 before)
- Command discovery: O(1) dispatcher
- Lazy loading: Commands loaded on-demand
- Caching: Configuration cached after first load

**Orchestrator Performance**:

- Dependency resolution: O(n log n) topological sort
- Parallel execution: Configurable task limit
- Checkpoint recovery: Resume from failure point
- Memory efficient: File-based queue

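
Dependency resolution by topological sort can be sketched with Kahn's algorithm. The Python version below uses a heap to pick among ready tasks so the output order is deterministic, which adds a logarithmic factor to each pick. It illustrates the technique and is not the orchestrator's Rust code; `execution_order` and the task names in the usage note are hypothetical.

```python
import heapq
from typing import Dict, List, Set


def execution_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Order tasks so each runs after everything it depends on (Kahn's algorithm)."""
    tasks = set(dependencies)
    for deps in dependencies.values():
        tasks |= deps

    # remaining[t] = number of unfinished dependencies of task t
    remaining = {t: len(dependencies.get(t, ())) for t in tasks}
    dependents: Dict[str, List[str]] = {t: [] for t in tasks}
    for task, deps in dependencies.items():
        for dep in deps:
            dependents[dep].append(task)

    ready = [t for t, count in remaining.items() if count == 0]
    heapq.heapify(ready)  # name-ordered tie-break keeps the result deterministic
    order: List[str] = []
    while ready:
        task = heapq.heappop(ready)
        order.append(task)
        for nxt in dependents[task]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                heapq.heappush(ready, nxt)

    if len(order) != len(tasks):
        raise ValueError("dependency cycle detected")
    return order
```

For example, `execution_order({"kubernetes": {"containerd", "etcd"}, "cilium": {"kubernetes"}})` places `containerd` and `etcd` before `kubernetes`, and `kubernetes` before `cilium`. Tasks that become ready at the same time are the ones a scheduler could run in parallel.
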
**Provider Operations**:

- Batch creation: Parallel server provisioning
- Bulk operations: Multi-resource transactions
- State tracking: Efficient state queries
- Rollback: Atomic operation reversal

**Nushell Plugins** (10-50x speedup):

- Compiled Rust extensions
- Direct native code execution
- Zero-copy data passing
- Async I/O support

## Deployment Modes

**Three Operational Modes**:

| Mode | Interaction | Configuration | Rollback | Use Case |
| --- | --- | --- | --- | --- |
| **Interactive TUI** | Ratatui UI | Manual input | Automatic | Development |
| **Headless CLI** | Command-line | Script-driven | Manual | Automation |
| **Unattended CI/CD** | Non-interactive | Configuration file | Automatic | CI/CD pipelines |

## Technology Stack

| Component | Technology | Why |
| --- | --- | --- |
| **IaC Language** | Nickel | Type-safe, lazy evaluation, contracts |
| **Scripting** | Nushell 0.109+ | Structured data pipelines |
| **Performance** | Rust | Zero-cost abstractions, memory safety |
| **State** | SurrealDB | Graph database for relationships |
| **Encryption** | SOPS + Age | Industry-standard encryption |
| **Security** | Cedar + JWT | Policy enforcement + tokens |
| **Orchestration** | Custom | Specialized for infrastructure workflows |

## File Organization

```text
provisioning/
├── core/                          # CLI engine (Nushell)
│   ├── cli/provisioning           # Main entry point
│   ├── nulib/                     # 54 core libraries
│   ├── plugins/                   # Nushell plugins (Rust)
│   └── scripts/                   # Utility scripts
│
├── platform/                      # Microservices (Rust)
│   ├── crates/                    # 12 microservices
│   │   ├── orchestrator/          # Workflow scheduler
│   │   ├── control-center/        # API gateway + auth
│   │   ├── control-center-ui/     # Web dashboard
│   │   ├── mcp-server/            # AI integration
│   │   ├── vault-service/         # Secrets backend
│   │   ├── extension-registry/    # OCI registry
│   │   ├── ai-service/            # LLM features
│   │   ├── detector/              # Anomaly detection
│   │   ├── rag/                   # Knowledge retrieval
│   │   ├── provisioning-daemon/   # Background service
│   │   ├── platform-config/       # Config management
│   │   └── service-clients/       # API clients
│   └── Cargo.toml                 # Rust workspace
│
├── extensions/                    # Extensible components
│   ├── providers/                 # Cloud providers (5)
│   ├── taskservs/                 # Task services (50+)
│   ├── clusters/                  # Cluster templates (9)
│   └── workflows/                 # Automation templates
│
├── schemas/                       # Nickel schemas
│   ├── main.ncl                   # Entry point
│   ├── config/                    # Configuration schemas
│   ├── infrastructure/            # Infrastructure schemas
│   ├── operations/                # Operational schemas
│   └── [other schemas]            # Additional schemas
│
├── config/                        # System configuration
│   └── config.defaults.toml       # Default settings
│
├── bootstrap/                     # Installation
│   ├── install.sh                 # Bash bootstrap
│   └── install.nu                 # Nushell installer
│
├── docs/                          # Product documentation
│   └── src/                       # mdBook source
│
└── README.md                      # Project overview
```

## Component Interaction

**Typical Workflow**:

```text
User Input
    ↓
CLI Dispatcher (provisioning/core/cli/provisioning)
    ↓
Nushell Handler (provisioning/core/nulib/commands/)
    ↓
Configuration Loading (lib_provisioning/config/)
    ↓
Provider Selection (lib_provisioning/providers/)
    ↓
Validation (lib_provisioning/infra_validator/)
    ↓
Orchestrator Queue (provisioning/platform/crates/orchestrator/)
    ↓
Task Execution (provider + task service)
    ↓
State Update (SurrealDB / file storage)
    ↓
Audit Logging (security system)
    ↓
User Feedback
```

## Scalability

Provisioning scales across four deployment profiles:

- **Solo**: 2 CPU cores, 4 GB RAM (single instance)
- **MultiUser**: 4-8 CPU cores, 8 GB RAM (small team)
- **CICD**: 8+ CPU cores, 16 GB RAM (enterprise)
- **Enterprise**: Multi-node Kubernetes (unlimited)

**Bottlenecks & Solutions**:

| Component | Bottleneck | Solution |
| --- | --- | --- |
| **Orchestrator** | Task queue | Partition by workspace |
| **State** | SurrealDB | Horizontal scaling |
| **Providers** | API rate limits | Exponential backoff |
| **Storage** | Disk I/O | SSD + caching |

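
Partitioning the task queue by workspace requires a stable mapping from workspace name to partition. The Python sketch below shows one way to do that with a hash; the function name `partition_for` and the choice of SHA-256 are assumptions for illustration, since the orchestrator's actual partitioning scheme is not described here.

```python
import hashlib


def partition_for(workspace: str, partitions: int) -> int:
    """Map a workspace name to a stable queue partition index."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    # A fixed digest is used so the mapping is identical across processes and
    # restarts; Python's built-in hash() is salted per process for strings.
    digest = hashlib.sha256(workspace.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions
```

All tasks for one workspace land in the same partition, so ordering within a workspace is preserved while different workspaces can be processed independently.
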
## Integration Points

Provisioning integrates with:

- **Kubernetes API** - Cluster management
- **Cloud Provider APIs** - Resource provisioning
- **SOPS + Age** - Secrets encryption
- **Prometheus** - Metrics collection
- **Cedar** - Policy enforcement
- **SurrealDB** - State persistence
- **MCP** - AI integration
- **KMS** - Key management (Cosmian, AWS, local)

## Reliability Features

**Fault Tolerance**:

- Checkpoint recovery - Resume from failure
- Automatic rollback - Revert failed operations
- Retry logic - Exponential backoff
- Health checks - Continuous monitoring
- Backup & restore - Data protection

**High Availability**:

- Multi-node orchestrator
- Database replication
- Service redundancy
- Load balancing
- Failover automation

## Related Documentation

- [Design Principles](design-principles.md)
- [Component Architecture](component-architecture.md)
- [Integration Patterns](integration-patterns.md)