provisioning/docs/src/architecture/design-principles.md
2026-01-17 03:58:28 +00:00

Design Principles

Core principles guiding Provisioning architecture and development.

1. Workspace-First Design

Principle: Workspaces are the default organizational unit for ALL infrastructure work.

Why:

  • Explicit project isolation
  • Prevent accidental cross-project modifications
  • Independent credential management
  • Clear configuration boundaries
  • Enables team collaboration

Application:

  • Every workspace has independent state
  • Workspace switching is atomic
  • Configuration per workspace
  • Extensions inherited from platform

Code Example:

# Workspace-enforced workflow
provisioning workspace init my-project
provisioning workspace switch my-project

# This command requires active workspace
provisioning server create --name web-01

Impact: All commands validate active workspace before execution.
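
The guard described above can be sketched in Rust. This is a minimal illustration, not the platform's actual implementation: `WorkspaceRegistry`, `WorkspaceError`, and the in-memory state layout are hypothetical names introduced here. It shows a command failing until a workspace has been initialized and made active, and state being kept per workspace.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory registry (assumption: the real state layout is not
/// described in this document). Maps workspace name -> server names.
#[derive(Default)]
pub struct WorkspaceRegistry {
    workspaces: HashMap<String, Vec<String>>,
    active: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum WorkspaceError {
    NoActiveWorkspace,
    UnknownWorkspace(String),
}

impl WorkspaceRegistry {
    /// Create the workspace if it does not exist yet.
    pub fn init(&mut self, name: &str) {
        self.workspaces.entry(name.to_string()).or_default();
    }

    /// Make an existing workspace the active one.
    pub fn switch(&mut self, name: &str) -> Result<(), WorkspaceError> {
        if !self.workspaces.contains_key(name) {
            return Err(WorkspaceError::UnknownWorkspace(name.to_string()));
        }
        self.active = Some(name.to_string());
        Ok(())
    }

    /// Guard that every command calls before doing any work.
    fn require_active(&self) -> Result<&str, WorkspaceError> {
        self.active
            .as_deref()
            .ok_or(WorkspaceError::NoActiveWorkspace)
    }

    /// Record a server in the active workspace only.
    pub fn create_server(&mut self, server: &str) -> Result<(), WorkspaceError> {
        let workspace = self.require_active()?.to_string();
        self.workspaces
            .entry(workspace)
            .or_default()
            .push(server.to_string());
        Ok(())
    }

    /// Servers recorded for one workspace (empty if the workspace is unknown).
    pub fn servers(&self, workspace: &str) -> &[String] {
        self.workspaces
            .get(workspace)
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

Calling `create_server` on a fresh registry returns `NoActiveWorkspace`; after `init` and `switch`, the same call succeeds and the server is recorded under that workspace only.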


2. Type-Safety Mandatory

Principle: ALL configurations MUST be type-safe. Validation is NEVER optional.

Why:

  • Catch errors at configuration time
  • Prevent runtime failures
  • Enable IDE support (LSP)
  • Enforce consistency
  • Reduce deployment risk

Application:

  • Nickel is source of truth (NOT TOML)
  • Type contracts on ALL schemas
  • Gradual typing not allowed
  • Validation in ALL profiles (dev, prod, cicd)
  • Static analysis before deployment

Code Example:

# Type-safe infrastructure definition
{
  name : String = "server-01",
  plan : [| 'small, 'medium, 'large |] = 'medium,
  zone : String = "de-fra1",
  backup_enabled : Bool = false,
} | ServerContract

Impact: Type errors caught before infrastructure changes.
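
The same idea carries over to the Rust side of the platform. The sketch below is an assumption-laden illustration, not the project's code: the `Plan` enum, `Server` struct, and `parse_server` function are hypothetical names that mirror the fields of the Nickel record above. It shows an invalid plan being rejected when the configuration is parsed, before anything is deployed.

```rust
use std::str::FromStr;

/// Closed set of plan sizes, mirroring the Nickel enum in the example above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Plan {
    Small,
    Medium,
    Large,
}

impl FromStr for Plan {
    type Err = String;

    /// Accept only the three known plan names; anything else is an error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "small" => Ok(Plan::Small),
            "medium" => Ok(Plan::Medium),
            "large" => Ok(Plan::Large),
            other => Err(format!(
                "unknown plan '{}', expected small, medium, or large",
                other
            )),
        }
    }
}

/// Hypothetical typed server definition (field names taken from the example).
#[derive(Debug, PartialEq)]
pub struct Server {
    pub name: String,
    pub plan: Plan,
    pub zone: String,
    pub backup_enabled: bool,
}

/// Turn untyped input into a typed `Server`, failing at configuration time.
pub fn parse_server(
    name: &str,
    plan: &str,
    zone: &str,
    backup_enabled: bool,
) -> Result<Server, String> {
    Ok(Server {
        name: name.to_string(),
        plan: plan.parse()?, // rejects values outside the enum
        zone: zone.to_string(),
        backup_enabled,
    })
}
```

Once a `Server` value exists, code that consumes it never has to re-check the plan string; the type already guarantees it is one of the three variants.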


3. Configuration-Driven, Never Hardcoded

Principle: Configuration is the source of truth. Hardcoded values are forbidden.

Why:

  • Enable environment-specific behavior
  • Support multiple deployment modes
  • Allow runtime reconfiguration
  • Audit configuration changes
  • Team collaboration

Application:

  • 5-layer configuration hierarchy
  • 476+ configuration accessors
  • Variable interpolation
  • Environment-specific overrides
  • Schema validation

Code Example:

# Configuration drives behavior
provisioning server create --plan $(config.server.default_plan)

# Environment-specific configs
PROVISIONING_ENV=prod provisioning server create

Forbidden:

# ❌ WRONG - Hardcoded values
let server_plan = "medium"

# ✅ RIGHT - Configuration-driven
let server_plan = (config.server.plan)

Impact: Single codebase supports all environments.
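
A layered lookup of this kind can be sketched as follows. This document does not name the five layers, so the layer names used in the usage note (`defaults`, `environment`) are placeholders, and `LayeredConfig` is a hypothetical type. The sketch only demonstrates the precedence rule: later layers override earlier ones, and the lookup reports which layer supplied the value, which helps when auditing configuration.

```rust
use std::collections::HashMap;

type Layer = HashMap<String, String>;

/// Layers are stored from lowest to highest precedence.
/// (Assumption: the real hierarchy has five layers; any number works here.)
#[derive(Default)]
pub struct LayeredConfig {
    layers: Vec<(String, Layer)>,
}

impl LayeredConfig {
    /// Append a layer that overrides every layer added before it.
    pub fn push_layer(&mut self, name: &str, values: &[(&str, &str)]) {
        let layer: Layer = values
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        self.layers.push((name.to_string(), layer));
    }

    /// Resolve a key, returning `(value, layer_name)` from the
    /// highest-precedence layer that defines it.
    pub fn get(&self, key: &str) -> Option<(&str, &str)> {
        self.layers.iter().rev().find_map(|(name, layer)| {
            layer.get(key).map(|v| (v.as_str(), name.as_str()))
        })
    }
}
```

With a `defaults` layer setting `server.plan` to `small` and an `environment` layer setting it to `medium`, `get("server.plan")` resolves to `medium` from `environment`, while keys defined only in `defaults` still resolve from there.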


4. Multi-Cloud Abstraction

Principle: Provider-agnostic interfaces enable multi-cloud deployments.

Why:

  • Avoid vendor lock-in
  • Reuse infrastructure code
  • Support multiple cloud strategies
  • Easy provider switching

Application:

  • Unified provider interface
  • Abstract resource definitions
  • Provider-specific implementation
  • Automatic provider selection

Code Example:

# Provider-agnostic configuration
{
  servers = [
    {
      name = "web-01",
      plan = "medium",      # Abstract plan size
      provider = "upcloud", # Swappable provider
    }
  ]
}

Impact: Same Nickel schema deploys to UpCloud, AWS, or Hetzner.
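
One common way to express a unified provider interface in Rust is a trait with one implementation per provider. The sketch below is illustrative only: the `Provider` trait, `select_provider`, and `describe_create` are hypothetical names, and the size identifiers are example values, not the platform's actual plan mapping.

```rust
/// Abstract plan size shared by all providers.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Plan {
    Small,
    Medium,
    Large,
}

/// Hypothetical unified interface; each provider supplies its own mapping.
pub trait Provider {
    fn name(&self) -> &'static str;
    /// Translate the abstract plan into a provider-specific size identifier.
    fn instance_type(&self, plan: Plan) -> &'static str;
}

pub struct UpCloud;
pub struct Aws;

impl Provider for UpCloud {
    fn name(&self) -> &'static str {
        "upcloud"
    }
    fn instance_type(&self, plan: Plan) -> &'static str {
        // Example identifiers only (assumption).
        match plan {
            Plan::Small => "1xCPU-2GB",
            Plan::Medium => "2xCPU-4GB",
            Plan::Large => "4xCPU-8GB",
        }
    }
}

impl Provider for Aws {
    fn name(&self) -> &'static str {
        "aws"
    }
    fn instance_type(&self, plan: Plan) -> &'static str {
        // Example identifiers only (assumption).
        match plan {
            Plan::Small => "t3.small",
            Plan::Medium => "t3.medium",
            Plan::Large => "t3.large",
        }
    }
}

/// Pick an implementation from the `provider` field of the configuration.
pub fn select_provider(name: &str) -> Option<Box<dyn Provider>> {
    match name {
        "upcloud" => Some(Box::new(UpCloud)),
        "aws" => Some(Box::new(Aws)),
        _ => None,
    }
}

/// Provider-agnostic caller: it only sees the trait, never a concrete type.
pub fn describe_create(provider: &dyn Provider, server: &str, plan: Plan) -> String {
    format!(
        "{}: create {} as {}",
        provider.name(),
        server,
        provider.instance_type(plan)
    )
}
```

Switching the `provider` string from `upcloud` to `aws` changes the resolved size identifier without touching the calling code, which is the property the principle is after.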


5. Modular, Extensible Architecture

Principle: Components are loosely coupled, independently deployable.

Why:

  • Easy to add features
  • Support custom extensions
  • Avoid monolithic growth
  • Enable community contributions
  • Flexible deployment options

Application:

  • 54 core Nushell libraries
  • 111+ CLI commands in 7 domains
  • 50+ task services
  • 5 cloud providers
  • 9 cluster templates
  • Pluggable provider interface

Impact: Add features without modifying core system.


6. Hybrid Rust + Nushell

Principle: Rust for performance-critical components, Nushell for orchestration.

Why:

  • Rust: Type safety, zero-cost abstractions, performance
  • Nushell: Structured data, productivity, easy automation
  • Hybrid: Best of both worlds

Application:

  • Core CLI: Bash wrapper → Nushell dispatcher
  • Orchestrator: Rust scheduler + Nushell task execution
  • Libraries: Nushell for business logic
  • Performance: Rust plugins for 10-50x speedup

Impact: Fast, type-safe, productive infrastructure automation.


7. State Management via Graph Database

Principle: Infrastructure relationships tracked via SurrealDB graph.

Why:

  • Model complex infrastructure relationships
  • Query relationships efficiently
  • Track dependencies
  • Support rollback via state history
  • Audit trail

Application:

  • SurrealDB for relationship queries
  • File-based persistence for queue
  • Event-driven state updates
  • Checkpoint-based recovery

Example Relationships:

Server → Network (connected to)
Server → Storage (mounts)
Cluster → Service (runs)
Workflow → Dependency (depends on)

Impact: Complex infrastructure relationships handled gracefully.
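
To make the relationship model concrete, here is a small in-memory stand-in written in Rust. It does not use SurrealDB or reflect its query language; `InfraGraph` and the resource names in the usage note are hypothetical. It only illustrates the two kinds of query the principle relies on: direct lookups by relation type and transitive reachability for dependency tracking.

```rust
use std::collections::{BTreeSet, HashMap};

/// Directed, labelled edges: from -> [(relation, to)].
#[derive(Default)]
pub struct InfraGraph {
    edges: HashMap<String, Vec<(String, String)>>,
}

impl InfraGraph {
    /// Record `from -[relation]-> to`.
    pub fn relate(&mut self, from: &str, relation: &str, to: &str) {
        self.edges
            .entry(from.to_string())
            .or_default()
            .push((relation.to_string(), to.to_string()));
    }

    /// Direct neighbours of `from` reached through one relation type.
    pub fn related(&self, from: &str, relation: &str) -> Vec<&str> {
        self.edges
            .get(from)
            .into_iter()
            .flatten()
            .filter(|(rel, _)| rel.as_str() == relation)
            .map(|(_, to)| to.as_str())
            .collect()
    }

    /// Every resource reachable from `start` over any relation
    /// (iterative depth-first search; `start` itself is not included).
    pub fn reachable(&self, start: &str) -> BTreeSet<&str> {
        let mut seen = BTreeSet::new();
        let mut stack = vec![start];
        while let Some(node) = stack.pop() {
            for (_, to) in self.edges.get(node).into_iter().flatten() {
                // `insert` returns false for already-visited nodes,
                // so cycles terminate.
                if seen.insert(to.as_str()) {
                    stack.push(to.as_str());
                }
            }
        }
        seen
    }
}
```

For example, if a workflow `deploy-web` depends on server `web-01`, which is connected to `private-net` and mounts `data-volume`, then `reachable("deploy-web")` returns all three resources, which is the set a rollback or impact analysis would need to consider.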


8. Security-First Design

Principle: Security is built-in, not bolted-on.

Why:

  • Enterprise compliance
  • Data protection
  • Access control
  • Audit trails
  • Threat detection

Application:

  • 4-layer security model (auth, authz, encryption, audit)
  • JWT authentication
  • Cedar policy enforcement
  • AES-256-GCM encryption
  • 7-year audit retention
  • MFA support (TOTP, WebAuthn)

Impact: Enterprise-grade security by default.


9. Progressive Disclosure

Principle: Simple for common cases, powerful for advanced use cases.

Why:

  • Low barrier to entry
  • Professional productivity
  • Advanced features available
  • Avoid overwhelming users
  • Gradual learning curve

Application:

  • Simple: Interactive TUI installer
  • Productive: CLI with 80+ shortcuts
  • Powerful: Batch workflows, policies
  • Advanced: Custom extensions, hooks

Impact: All skill levels supported.


10. Fail-Fast, Recover Gracefully

Principle: Detect issues early, provide recovery mechanisms.

Why:

  • Prevent invalid deployments
  • Enable safe recovery
  • Minimize blast radius
  • Audit failures for learning

Application:

  • Validation before execution
  • Checkpoint-based recovery
  • Automatic rollback on failure
  • Detailed error messages
  • Retry with exponential backoff

Code Example:

# Validate before deployment
provisioning validate config --strict

# Dry-run to check impact
provisioning --check server create

# Safe rollback on failure
provisioning workflow rollback --to-checkpoint

Impact: Safe infrastructure changes with confidence.
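
Retry with exponential backoff, listed above, can be sketched in Rust as shown below. This is a generic illustration under stated assumptions, not the platform's retry code: `backoff_delay` and `retry` are hypothetical helpers, and the sleep function is passed in so the schedule can be checked without actually waiting.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`,
/// capped at `max`. Overflow falls back to the cap.
pub fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Run `op` until it succeeds or `max_attempts` attempts have been made.
/// `sleep` is called with the backoff delay between failed attempts;
/// pass `std::thread::sleep` in real use.
pub fn retry<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut sleep: impl FnMut(Duration),
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(base, max, attempt));
                attempt += 1;
            }
        }
    }
}
```

With a 100 ms base and a 1 s cap, the delays grow as 100 ms, 200 ms, 400 ms, 800 ms and then stay at 1 s. Injecting `sleep` keeps the function deterministic under test while leaving the caller free to use a real timer.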


11. Observable & Auditable

Principle: All operations traceable, all changes auditable.

Why:

  • Compliance & regulation
  • Troubleshooting
  • Security investigation
  • Team accountability
  • Historical analysis

Application:

  • Comprehensive audit logging
  • 5 export formats (JSON, YAML, CSV, syslog, CloudWatch)
  • Structured log entries
  • Operation tracing
  • Resource change tracking

Impact: Complete visibility into infrastructure changes.


12. No Shortcuts on Reliability

Principle: Reliability features are standard, not optional.

Why:

  • Production requirements
  • Minimize downtime
  • Data protection
  • Business continuity
  • Trust & confidence

Application:

  • Checkpoint recovery
  • Automatic rollback
  • Health monitoring
  • Backup & restore
  • Multi-node deployment
  • Service redundancy

Impact: Enterprise-grade reliability comes standard.


Architectural Decision Records (ADRs)

Key decisions and their rationale:

| ADR | Decision | Rationale |
| --- | --- | --- |
| ADR-011 | Nickel Migration | Type-safety over KCL flexibility |
| ADR-010 | Config Strategy | 5-layer hierarchy over flat config |
| ADR-009 | SurrealDB | Graph relationships over relational |
| ADR-008 | Modular CLI | 80+ shortcuts over verbose commands |
| ADR-007 | Workspace-First | Isolation over global state |
| ADR-006 | Hybrid Architecture | Rust + Nushell for best of both |


Design Trade-offs

| Decision | Gain | Cost |
| --- | --- | --- |
| Type-Safety | Fewer errors | Learning curve |
| Config Hierarchy | Flexibility | Complexity |
| Workspace Isolation | Safety | Duplication |
| Modular CLI | Discoverability | No single command |
| SurrealDB | Relationships | Resource overhead |
| Strict Validation | Safety | Slower iteration |