# Design Principles

Core principles guiding Provisioning architecture and development.

## 1. Workspace-First Design

**Principle**: Workspaces are the default organizational unit for ALL infrastructure work.

**Why**:

- Explicit project isolation
- Prevent accidental cross-project modifications
- Independent credential management
- Clear configuration boundaries
- Team collaboration enablement

**Application**:

- Every workspace has independent state
- Workspace switching is atomic
- Configuration per workspace
- Extensions inherited from platform

**Code Example**:

```bash
# Workspace-enforced workflow
provisioning workspace init my-project
provisioning workspace switch my-project

# This command requires an active workspace
provisioning server create --name web-01
```

**Impact**: All commands validate the active workspace before execution.

---

## 2. Type-Safety Mandatory

**Principle**: ALL configurations MUST be type-safe. Validation is NEVER optional.

**Why**:

- Catch errors at configuration time
- Prevent runtime failures
- Enable IDE support (LSP)
- Enforce consistency
- Reduce deployment risk

**Application**:

- **Nickel is source of truth** (NOT TOML)
- Type contracts on ALL schemas
- Gradual typing not allowed
- Validation in ALL profiles (dev, prod, cicd)
- Static analysis before deployment

**Code Example**:

```nickel
# Type-safe infrastructure definition
{
  name : String = "server-01",
  plan : [| 'small, 'medium, 'large |] = 'medium,
  zone : String = "de-fra1",
  backup_enabled : Bool = false,
} | ServerContract
```

**Impact**: Type errors are caught before infrastructure changes.

---

## 3. Configuration-Driven, Never Hardcoded

**Principle**: Configuration is the source of truth. Hardcoded values are forbidden.

**Why**:

- Enable environment-specific behavior
- Support multiple deployment modes
- Allow runtime reconfiguration
- Audit configuration changes
- Team collaboration

**Application**:

- 5-layer configuration hierarchy
- 476+ configuration accessors
- Variable interpolation
- Environment-specific overrides
- Schema validation

**Code Example**:

```bash
# Configuration drives behavior
provisioning server create --plan $(config.server.default_plan)

# Environment-specific configs
PROVISIONING_ENV=prod provisioning server create
```

**Forbidden**:

```nushell
# ❌ WRONG - Hardcoded values
let server_plan = "medium"

# ✅ RIGHT - Configuration-driven
let server_plan = (config.server.plan)
```

**Impact**: Single codebase supports all environments.

---

## 4. Multi-Cloud Abstraction

**Principle**: Provider-agnostic interfaces enable multi-cloud deployments.

**Why**:

- Avoid vendor lock-in
- Reuse infrastructure code
- Support multiple cloud strategies
- Easy provider switching

**Application**:

- Unified provider interface
- Abstract resource definitions
- Provider-specific implementation
- Automatic provider selection

**Code Example**:

```nickel
# Provider-agnostic configuration
{
  servers = [
    {
      name = "web-01",
      plan = "medium",       # Abstract plan size
      provider = "upcloud",  # Swappable provider
    }
  ]
}
```

**Impact**: Same Nickel schema deploys to UpCloud, AWS, or Hetzner.

---

## 5. Modular, Extensible Architecture

**Principle**: Components are loosely coupled and independently deployable.

**Why**:

- Easy to add features
- Support custom extensions
- Avoid monolithic growth
- Enable community contributions
- Flexible deployment options

**Application**:

- 54 core Nushell libraries
- 111+ CLI commands in 7 domains
- 50+ task services
- 5 cloud providers
- 9 cluster templates
- Pluggable provider interface

**Impact**: Add features without modifying core system.

---

## 6. Hybrid Rust + Nushell

**Principle**: Rust for performance-critical components, Nushell for orchestration.

**Why**:

- **Rust**: Type safety, zero-cost abstractions, performance
- **Nushell**: Structured data, productivity, easy automation
- **Hybrid**: Best of both worlds

**Application**:

- Core CLI: Bash wrapper → Nushell dispatcher
- Orchestrator: Rust scheduler + Nushell task execution
- Libraries: Nushell for business logic
- Performance: Rust plugins for 10-50x speedup

**Impact**: Fast, type-safe, productive infrastructure automation.

---

## 7. State Management via Graph Database

**Principle**: Infrastructure relationships tracked via SurrealDB graph.

**Why**:

- Model complex infrastructure relationships
- Query relationships efficiently
- Track dependencies
- Support rollback via state history
- Audit trail

**Application**:

- SurrealDB for relationship queries
- File-based persistence for queue
- Event-driven state updates
- Checkpoint-based recovery

**Example Relationships**:

```text
Server → Network (connected to)
Server → Storage (mounts)
Cluster → Service (runs)
Workflow → Dependency (depends on)
```

**Impact**: Complex infrastructure relationships handled gracefully.

---

## 8. Security-First Design

**Principle**: Security is built-in, not bolted-on.

**Why**:

- Enterprise compliance
- Data protection
- Access control
- Audit trails
- Threat detection

**Application**:

- 4-layer security model (auth, authz, encryption, audit)
- JWT authentication
- Cedar policy enforcement
- AES-256-GCM encryption
- 7-year audit retention
- MFA support (TOTP, WebAuthn)

**Impact**: Enterprise-grade security by default.

---

## 9. Progressive Disclosure

**Principle**: Simple for common cases, powerful for advanced use cases.

**Why**:

- Low barrier to entry
- Professional productivity
- Advanced features available
- Avoid overwhelming users
- Gradual learning curve

**Application**:

- **Simple**: Interactive TUI installer
- **Productive**: CLI with 80+ shortcuts
- **Powerful**: Batch workflows, policies
- **Advanced**: Custom extensions, hooks

**Impact**: All skill levels supported.

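
To illustrate why principle 7 tracks dependencies as graph edges, the sketch below derives a safe creation order from a handful of edges. It is a standalone illustration, not the platform's implementation: it uses the POSIX `tsort` utility instead of SurrealDB, and the resource names, the `creation_order` function, and the reading of each relationship as "must exist first" are all assumptions made for the example.

```bash
#!/usr/bin/env bash
# Hypothetical dependency edges, one per line: "<dependency> <dependent>".
# The left-hand resource must exist before the right-hand one is created.
creation_order() {
  tsort <<'EOF'
network server
storage server
server cluster
EOF
}

# Print one valid creation order (a topological sort of the edges above).
creation_order
```

Any order that `tsort` prints places `network` and `storage` before `server`, and `server` before `cluster`. Reversing that order gives a teardown sequence.
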

---

## 10. Fail-Fast, Recover Gracefully

**Principle**: Detect issues early, provide recovery mechanisms.

**Why**:

- Prevent invalid deployments
- Enable safe recovery
- Minimize blast radius
- Audit failures for learning

**Application**:

- Validation before execution
- Checkpoint-based recovery
- Automatic rollback on failure
- Detailed error messages
- Retry with exponential backoff

**Code Example**:

```bash
# Validate before deployment
provisioning validate config --strict

# Dry-run to check impact
provisioning --check server create

# Safe rollback on failure
provisioning workflow rollback --to-checkpoint
```

**Impact**: Safe infrastructure changes with confidence.

---

## 11. Observable & Auditable

**Principle**: All operations traceable, all changes auditable.

**Why**:

- Compliance & regulation
- Troubleshooting
- Security investigation
- Team accountability
- Historical analysis

**Application**:

- Comprehensive audit logging
- 5 export formats (JSON, YAML, CSV, syslog, CloudWatch)
- Structured log entries
- Operation tracing
- Resource change tracking

**Impact**: Complete visibility into infrastructure changes.

---

## 12. No Shortcuts on Reliability

**Principle**: Reliability features are standard, not optional.

**Why**:

- Production requirements
- Minimize downtime
- Data protection
- Business continuity
- Trust & confidence

**Application**:

- Checkpoint recovery
- Automatic rollback
- Health monitoring
- Backup & restore
- Multi-node deployment
- Service redundancy

**Impact**: Enterprise-grade reliability standard.

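
The "retry with exponential backoff" item under principle 10 can be sketched as a small shell wrapper. This is a minimal illustration under stated assumptions, not the platform's actual retry logic: the function names `backoff_delay` and `retry_with_backoff`, the `BACKOFF_BASE` variable, and the 60-second cap are all hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: seconds to wait after failed attempt N (1-based).
# The delay doubles on each attempt and is capped at max_delay.
backoff_delay() {
  local attempt="$1"
  local base="${2:-1}"
  local max_delay="${3:-60}"
  local delay="$base"
  local i=1
  while [ "$i" -lt "$attempt" ]; do
    delay=$((delay * 2))
    if [ "$delay" -ge "$max_delay" ]; then
      delay="$max_delay"
      break
    fi
    i=$((i + 1))
  done
  echo "$delay"
}

# Hypothetical wrapper: run a command up to max_attempts times,
# sleeping with exponential backoff between failed attempts.
retry_with_backoff() {
  local max_attempts="$1"
  shift
  local n=1
  while true; do
    "$@" && return 0
    if [ "$n" -ge "$max_attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    sleep "$(backoff_delay "$n" "${BACKOFF_BASE:-1}")"
    n=$((n + 1))
  done
}

# Example call (commented out so the sketch has no side effects):
# retry_with_backoff 5 provisioning server create --name web-01
```

With the default base of 1 second, the waits between attempts are 1, 2, 4, 8, ... seconds until the cap is reached. Capping the delay keeps a long outage from producing unbounded waits.
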

---

## Architectural Decision Records (ADRs)

Key decisions documenting rationale:

| ADR | Decision | Rationale |
| --- | --- | --- |
| ADR-011 | Nickel Migration | Type-safety over KCL flexibility |
| ADR-010 | Config Strategy | 5-layer hierarchy over flat config |
| ADR-009 | SurrealDB | Graph relationships over relational |
| ADR-008 | Modular CLI | 80+ shortcuts over verbose commands |
| ADR-007 | Workspace-First | Isolation over global state |
| ADR-006 | Hybrid Architecture | Rust + Nushell for best of both |

---

## Design Trade-offs

| Decision | Gain | Cost |
| --- | --- | --- |
| **Type-Safety** | Fewer errors | Learning curve |
| **Config Hierarchy** | Flexibility | Complexity |
| **Workspace Isolation** | Safety | Duplication |
| **Modular CLI** | Discoverability | No single command |
| **SurrealDB** | Relationships | Resource overhead |
| **Validation Strict** | Safety | Fast iteration friction |

---

## Related Documentation

- [System Overview](system-overview.md)
- [Component Architecture](component-architecture.md)
- [Integration Patterns](integration-patterns.md)