diff --git a/.typedialog/README.md b/.typedialog/README.md index db0dbeb..1ad576a 100644 --- a/.typedialog/README.md +++ b/.typedialog/README.md @@ -1,182 +1 @@ -# TypeDialog Configuration Structure - -This directory contains TypeDialog forms, templates, and configuration data organized by subsystem. - -## Directory Organization - -``` -.typedialog/ -├── core/ # Core subsystem forms (setup, auth, infrastructure) -├── provisioning/ # Main provisioning configuration fragments -└── platform/ # Platform services forms (future) -``` - -### Why Multiple Subdirectories - -Different subsystems have different form requirements: - -1. **`core/`** - Core infrastructure operations - - System setup wizard - - Authentication (login, MFA) - - Infrastructure confirmations (delete, deploy) - - **Users**: Developers, operators - -2. **`provisioning/`** - Project provisioning configuration - - Deployment target selection (docker, k8s, ssh) - - Database configuration (postgres, mysql, sqlite) - - Monitoring setup - - **Users**: Project setup, CI/CD - -3. **`platform/`** (future) - Platform services - - Orchestrator configuration - - Control center setup - - Service-specific forms - - **Users**: Platform administrators - -## Structure Within Each Subdirectory - -Each subdirectory follows this pattern: - -``` -{subsystem}/ -├── forms/ # TOML form definitions -├── templates/ # Nickel/Jinja2 templates -├── defaults/ # Default configurations -├── constraints/ # Validation rules -└── generated/ # Generated configs (gitignored) -``` - -## Core Subsystem (`core/`) - -**Purpose**: Core infrastructure operations (setup, auth, confirmations) - -**Forms**: -- `forms/setup-wizard.toml` - Initial system setup -- `forms/auth-login.toml` - User authentication -- `forms/mfa-enroll.toml` - MFA enrollment -- `forms/infrastructure/*.toml` - Delete confirmations (server, cluster, taskserv) - -**Bash Wrappers** (TTY-safe): -- `../../core/shlib/setup-wizard-tty.sh` -- `../../core/shlib/auth-login-tty.sh` -- `../../core/shlib/mfa-enroll-tty.sh` - -**Usage**: -``` -# Run setup wizard -./provisioning/core/shlib/setup-wizard-tty.sh - -# Nushell reads result -let config = (open provisioning/.typedialog/core/generated/setup-wizard-result.json | from json) -``` - -## Provisioning Subsystem (`provisioning/`) - -**Purpose**: Main provisioning configuration (deployments, databases, monitoring) - -**Structure**: -- `form.toml` - Main provisioning form -- `fragments/` - Modular form fragments - - `deployment-*.toml` - Docker, K8s, SSH deployments - - `database-*.toml` - Database configurations - - `monitoring.toml` - Monitoring setup - - `auth-*.toml` - Authentication methods -- `constraints.toml` - Validation constraints -- `defaults/` - Default values -- `schemas/` - Nickel schemas - -**Usage**: -``` -# Configure provisioning -nu provisioning/.typedialog/provisioning/configure.nu --backend web -``` - -## Platform Subsystem (`platform/` - Future) - -**Purpose**: Platform services configuration - -**Planned forms**: -- Orchestrator configuration -- Control center setup -- MCP server configuration -- Vault service setup - -**Status**: Structure planned, not yet implemented - -## Integration with Code - -### Bash Wrappers (TTY-safe) - -Located in: `provisioning/core/shlib/*-tty.sh` - -These wrappers solve Nushell's TTY input limitations by: -1. Handling interactive input in bash -2. Calling TypeDialog with proper TTY forwarding -3. Generating JSON output for Nushell consumption - -**Pattern**: -``` -Bash wrapper → TypeDialog (TTY input) → Nickel config → JSON → Nushell -``` - -### Nushell Integration - -Located in: `provisioning/core/nulib/lib_provisioning/` - -Functions that call the bash wrappers: -- `setup/wizard.nu::run-setup-wizard-interactive` -- `plugins/auth.nu::login-interactive` -- `plugins/auth.nu::mfa-enroll-interactive` - -## Generated Files - -**Location**: `{subsystem}/generated/` - -**Files**: -- `*.ncl` - Nickel configuration files -- `*.json` - JSON exports for Nushell -- `*-defaults.ncl` - Default configurations - -**Note**: All generated files are gitignored - -## Form Naming Conventions - -1. **Top-level forms**: `{purpose}.toml` - - Example: `setup-wizard.toml`, `auth-login.toml` - -2. **Fragment forms**: `fragments/{category}-{variant}.toml` - - Example: `deployment-docker.toml`, `database-postgres.toml` - -3. **Infrastructure forms**: `forms/infrastructure/{operation}_{resource}_confirm.toml` - - Example: `server_delete_confirm.toml` - -## Adding New Forms - -### For Core Operations - -1. Create form: `.typedialog/core/forms/{operation}.toml` -2. Create wrapper: `core/shlib/{operation}-tty.sh` -3. Integrate in Nushell: `core/nulib/lib_provisioning/` - -### For Provisioning Config - -1. Create fragment: `.typedialog/provisioning/fragments/{category}-{variant}.toml` -2. Update main form: `.typedialog/provisioning/form.toml` -3. Add defaults: `.typedialog/provisioning/defaults/` - -### For Platform Services (Future) - -1. Create subsystem: `.typedialog/platform/` -2. Follow same structure as `core/` or `provisioning/` -3. Document in this README - -## Related Documentation - -- **Bash wrappers**: `provisioning/core/shlib/README.md` -- **TypeDialog integration**: `provisioning/platform/.typedialog/README.md` -- **Nushell setup**: `provisioning/core/nulib/lib_provisioning/setup/wizard.nu` - ---- - -**Last Updated**: 2025-01-09 -**Structure Version**: 2.0 (Multi-subsystem organization) +# TypeDialog Configuration Structure\n\nThis directory contains TypeDialog forms, templates, and configuration data organized by subsystem.\n\n## Directory Organization\n\n```\n.typedialog/\n├── core/ # Core subsystem forms (setup, auth, infrastructure)\n├── provisioning/ # Main provisioning configuration fragments\n└── platform/ # Platform services forms (future)\n```\n\n### Why Multiple Subdirectories\n\nDifferent subsystems have different form requirements:\n\n1. **`core/`** - Core infrastructure operations\n - System setup wizard\n - Authentication (login, MFA)\n - Infrastructure confirmations (delete, deploy)\n - **Users**: Developers, operators\n\n2. **`provisioning/`** - Project provisioning configuration\n - Deployment target selection (docker, k8s, ssh)\n - Database configuration (postgres, mysql, sqlite)\n - Monitoring setup\n - **Users**: Project setup, CI/CD\n\n3. **`platform/`** (future) - Platform services\n - Orchestrator configuration\n - Control center setup\n - Service-specific forms\n - **Users**: Platform administrators\n\n## Structure Within Each Subdirectory\n\nEach subdirectory follows this pattern:\n\n```\n{subsystem}/\n├── forms/ # TOML form definitions\n├── templates/ # Nickel/Jinja2 templates\n├── defaults/ # Default configurations\n├── constraints/ # Validation rules\n└── generated/ # Generated configs (gitignored)\n```\n\n## Core Subsystem (`core/`)\n\n**Purpose**: Core infrastructure operations (setup, auth, confirmations)\n\n**Forms**:\n- `forms/setup-wizard.toml` - Initial system setup\n- `forms/auth-login.toml` - User authentication\n- `forms/mfa-enroll.toml` - MFA enrollment\n- `forms/infrastructure/*.toml` - Delete confirmations (server, cluster, taskserv)\n\n**Bash Wrappers** (TTY-safe):\n- `../../core/shlib/setup-wizard-tty.sh`\n- `../../core/shlib/auth-login-tty.sh`\n- `../../core/shlib/mfa-enroll-tty.sh`\n\n**Usage**:\n```\n# Run setup wizard\n./provisioning/core/shlib/setup-wizard-tty.sh\n\n# Nushell reads result\nlet config = (open provisioning/.typedialog/core/generated/setup-wizard-result.json | from json)\n```\n\n## Provisioning Subsystem (`provisioning/`)\n\n**Purpose**: Main provisioning configuration (deployments, databases, monitoring)\n\n**Structure**:\n- `form.toml` - Main provisioning form\n- `fragments/` - Modular form fragments\n - `deployment-*.toml` - Docker, K8s, SSH deployments\n - `database-*.toml` - Database configurations\n - `monitoring.toml` - Monitoring setup\n - `auth-*.toml` - Authentication methods\n- `constraints.toml` - Validation constraints\n- `defaults/` - Default values\n- `schemas/` - Nickel schemas\n\n**Usage**:\n```\n# Configure provisioning\nnu provisioning/.typedialog/provisioning/configure.nu --backend web\n```\n\n## Platform Subsystem (`platform/` - Future)\n\n**Purpose**: Platform services configuration\n\n**Planned forms**:\n- Orchestrator configuration\n- Control center setup\n- MCP server configuration\n- Vault service setup\n\n**Status**: Structure planned, not yet implemented\n\n## Integration with Code\n\n### Bash Wrappers (TTY-safe)\n\nLocated in: `provisioning/core/shlib/*-tty.sh`\n\nThese wrappers solve Nushell's TTY input limitations by:\n1. Handling interactive input in bash\n2. Calling TypeDialog with proper TTY forwarding\n3. Generating JSON output for Nushell consumption\n\n**Pattern**:\n```\nBash wrapper → TypeDialog (TTY input) → Nickel config → JSON → Nushell\n```\n\n### Nushell Integration\n\nLocated in: `provisioning/core/nulib/lib_provisioning/`\n\nFunctions that call the bash wrappers:\n- `setup/wizard.nu::run-setup-wizard-interactive`\n- `plugins/auth.nu::login-interactive`\n- `plugins/auth.nu::mfa-enroll-interactive`\n\n## Generated Files\n\n**Location**: `{subsystem}/generated/`\n\n**Files**:\n- `*.ncl` - Nickel configuration files\n- `*.json` - JSON exports for Nushell\n- `*-defaults.ncl` - Default configurations\n\n**Note**: All generated files are gitignored\n\n## Form Naming Conventions\n\n1. **Top-level forms**: `{purpose}.toml`\n - Example: `setup-wizard.toml`, `auth-login.toml`\n\n2. **Fragment forms**: `fragments/{category}-{variant}.toml`\n - Example: `deployment-docker.toml`, `database-postgres.toml`\n\n3. **Infrastructure forms**: `forms/infrastructure/{operation}_{resource}_confirm.toml`\n - Example: `server_delete_confirm.toml`\n\n## Adding New Forms\n\n### For Core Operations\n\n1. Create form: `.typedialog/core/forms/{operation}.toml`\n2. Create wrapper: `core/shlib/{operation}-tty.sh`\n3. Integrate in Nushell: `core/nulib/lib_provisioning/`\n\n### For Provisioning Config\n\n1. Create fragment: `.typedialog/provisioning/fragments/{category}-{variant}.toml`\n2. Update main form: `.typedialog/provisioning/form.toml`\n3. Add defaults: `.typedialog/provisioning/defaults/`\n\n### For Platform Services (Future)\n\n1. Create subsystem: `.typedialog/platform/`\n2. Follow same structure as `core/` or `provisioning/`\n3. Document in this README\n\n## Related Documentation\n\n- **Bash wrappers**: `provisioning/core/shlib/README.md`\n- **TypeDialog integration**: `provisioning/platform/.typedialog/README.md`\n- **Nushell setup**: `provisioning/core/nulib/lib_provisioning/setup/wizard.nu`\n\n---\n\n**Last Updated**: 2025-01-09\n**Structure Version**: 2.0 (Multi-subsystem organization) diff --git a/.typedialog/ci/README.md b/.typedialog/ci/README.md index 9745b3e..a20280d 100644 --- a/.typedialog/ci/README.md +++ b/.typedialog/ci/README.md @@ -1,328 +1 @@ -# CI System - Configuration Guide - -**Installed**: 2026-01-01 -**Detected Languages**: rust, nushell, nickel, bash, markdown, python, javascript - ---- - -## Quick Start - -### Option 1: Using configure.sh (Recommended) - -A convenience script is installed in `.typedialog/ci/`: - -``` -# Use web backend (default) - Opens in browser -.typedialog/ci/configure.sh - -# Use TUI backend - Terminal interface -.typedialog/ci/configure.sh tui - -# Use CLI backend - Command-line prompts -.typedialog/ci/configure.sh cli -``` - -**This script automatically:** - -- Sources `.typedialog/ci/envrc` for environment setup -- Loads defaults from `config.ncl` (Nickel format) -- Uses cascading search for fragments (local → Tools) -- Creates backup before overwriting existing config -- Saves output in Nickel format using nickel-roundtrip with documented template -- Generates `config.ncl` compatible with `nickel doc` command - -### Option 2: Direct TypeDialog Commands - -Use TypeDialog nickel-roundtrip directly with manual paths: - -#### Web Backend (Recommended - Easy Viewing) - -``` -cd .typedialog/ci # Change to CI directory -source envrc # Load environment -typedialog-web nickel-roundtrip config.ncl form.toml \ - --output config.ncl \ - --ncl-template $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 -``` - -#### TUI Backend - -``` -cd .typedialog/ci -source envrc -typedialog-tui nickel-roundtrip config.ncl form.toml \ - --output config.ncl \ - --ncl-template $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 -``` - -#### CLI Backend - -``` -cd .typedialog/ci -source envrc -typedialog nickel-roundtrip config.ncl form.toml \ - --output config.ncl \ - --ncl-template $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 -``` - -**Note:** The `--ncl-template` flag uses a Tera template that adds: - -- Descriptive comments for each section -- Documentation compatible with `nickel doc config.ncl` -- Consistent formatting and structure - -**All backends will:** - -- Show only options relevant to your detected languages -- Guide you through all configuration choices -- Validate your inputs -- Generate config.ncl in Nickel format - -### Option 3: Manual Configuration - -Edit `config.ncl` directly: - -``` -vim .typedialog/ci/config.ncl -``` - ---- - -## Configuration Format: Nickel - -**This project uses Nickel format by default** for all configuration files. - -### Why Nickel? - -- ✅ **Typed configuration** - Static type checking with `nickel typecheck` -- ✅ **Documentation** - Generate docs with `nickel doc config.ncl` -- ✅ **Validation** - Built-in schema validation -- ✅ **Comments** - Rich inline documentation support -- ✅ **Modular** - Import/export system for reusable configs - -### Nickel Template - -The output structure is controlled by a **Tera template** at: - -- **Tools default**: `$TOOLS_PATH/dev-system/ci/templates/config.ncl.j2` -- **Local override**: `.typedialog/ci/config.ncl.j2` (optional) - -**To customize the template:** - -``` -# Copy the default template -cp $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 \ - .typedialog/ci/config.ncl.j2 - -# Edit to add custom comments, documentation, or structure -vim .typedialog/ci/config.ncl.j2 - -# Your template will now be used automatically -``` - -**Template features:** - -- Customizable comments per section -- Control field ordering -- Add project-specific documentation -- Configure output for `nickel doc` command - -### TypeDialog Environment Variables - -You can customize TypeDialog behavior with environment variables: - -``` -# Web server configuration -export TYPEDIALOG_PORT=9000 # Port for web backend (default: 9000) -export TYPEDIALOG_HOST=localhost # Host binding (default: localhost) - -# Localization -export TYPEDIALOG_LANG=en_US.UTF-8 # Form language (default: system locale) - -# Run with custom settings -TYPEDIALOG_PORT=8080 .typedialog/ci/configure.sh web -``` - -**Common use cases:** - -``` -# Access from other machines in network -TYPEDIALOG_HOST=0.0.0.0 TYPEDIALOG_PORT=8080 .typedialog/ci/configure.sh web - -# Use different port if 9000 is busy -TYPEDIALOG_PORT=3000 .typedialog/ci/configure.sh web - -# Spanish interface -TYPEDIALOG_LANG=es_ES.UTF-8 .typedialog/ci/configure.sh web -``` - -## Configuration Structure - -Your config.ncl is organized in the `ci` namespace (Nickel format): - -``` -{ - ci = { - project = { - name = "rust", - detected_languages = ["rust, nushell, nickel, bash, markdown, python, javascript"], - primary_language = "rust", - }, - tools = { - # Tools are added based on detected languages - }, - features = { - # CI features (pre-commit, GitHub Actions, etc.) - }, - ci_providers = { - # CI provider configurations - }, - }, -} -``` - -## Available Fragments - -Tool configurations are modular. Check `.typedialog/ci/fragments/` for: - -- rust-tools.toml - Tools for rust -- nushell-tools.toml - Tools for nushell -- nickel-tools.toml - Tools for nickel -- bash-tools.toml - Tools for bash -- markdown-tools.toml - Tools for markdown -- python-tools.toml - Tools for python -- javascript-tools.toml - Tools for javascript -- general-tools.toml - Cross-language tools -- ci-providers.toml - GitHub Actions, Woodpecker, etc. - -## Cascading Override System - -This project uses a **local → Tools cascading search** for all resources: - -### How It Works - -Resources are searched in priority order: - -1. **Local files** (`.typedialog/ci/`) - **FIRST** (highest priority) -2. **Tools files** (`$TOOLS_PATH/dev-system/ci/`) - **FALLBACK** (default) - -### Affected Resources - -| Resource | Local Path | Tools Path | -| ---------- | ------------ | ------------ | -| Fragments | `.typedialog/ci/fragments/` | `$TOOLS_PATH/dev-system/ci/forms/fragments/` | -| Schemas | `.typedialog/ci/schemas/` | `$TOOLS_PATH/dev-system/ci/schemas/` | -| Validators | `.typedialog/ci/validators/` | `$TOOLS_PATH/dev-system/ci/validators/` | -| Defaults | `.typedialog/ci/defaults/` | `$TOOLS_PATH/dev-system/ci/defaults/` | -| Nickel Template | `.typedialog/ci/config.ncl.j2` | `$TOOLS_PATH/dev-system/ci/templates/config.ncl.j2` | - -### Environment Setup (.envrc) - -The `.typedialog/ci/.envrc` file configures search paths: - -``` -# Source this file to load environment -source .typedialog/ci/.envrc - -# Or use direnv for automatic loading -echo 'source .typedialog/ci/.envrc' >> .envrc -``` - -**What's in .envrc:** - -``` -export NICKEL_IMPORT_PATH="schemas:$TOOLS_PATH/dev-system/ci/schemas:validators:..." -export TYPEDIALOG_FRAGMENT_PATH=".:$TOOLS_PATH/dev-system/ci/forms" -export NCL_TEMPLATE="" -export TYPEDIALOG_PORT=9000 # Web server port -export TYPEDIALOG_HOST=localhost # Web server host -export TYPEDIALOG_LANG="${LANG}" # Form localization -``` - -### Creating Overrides - -**By default:** All resources come from Tools (no duplication). - -**To customize:** Create file in local directory with same name: - -``` -# Override a fragment -cp $TOOLS_PATH/dev-system/ci/fragments/rust-tools.toml \ - .typedialog/ci/fragments/rust-tools.toml - -# Edit your local version -vim .typedialog/ci/fragments/rust-tools.toml - -# Override Nickel template (customize comments, structure, nickel doc output) -cp $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 \ - .typedialog/ci/config.ncl.j2 - -# Edit to customize documentation and structure -vim .typedialog/ci/config.ncl.j2 - -# Now your version will be used instead of Tools version -``` - -**Benefits:** - -- ✅ Override only what you need -- ✅ Everything else stays synchronized with Tools -- ✅ No duplication by default -- ✅ Automatic updates when Tools is updated - -**See:** `$TOOLS_PATH/dev-system/ci/docs/cascade-override.md` for complete documentation. - -## Testing Your Configuration - -### Validate Configuration - -``` -nu $env.TOOLS_PATH/dev-system/ci/scripts/validator.nu \ - --config .typedialog/ci/config.ncl \ - --project . \ - --namespace ci -``` - -### Regenerate CI Files - -``` -nu $env.TOOLS_PATH/dev-system/ci/scripts/generate-configs.nu \ - --config .typedialog/ci/config.ncl \ - --templates $env.TOOLS_PATH/dev-system/ci/templates \ - --output . \ - --namespace ci -``` - -## Common Tasks - -### Add a New Tool - -Edit `config.ncl` and add under `ci.tools`: - -``` -{ - ci = { - tools = { - newtool = { - enabled = true, - install_method = "cargo", - version = "latest", - }, - }, - }, -} -``` - -### Disable a Feature - -``` -[ci.features] -enable_pre_commit = false -``` - -## Need Help? - -For detailed documentation, see: - -- $env.TOOLS_PATH/dev-system/ci/docs/configuration-guide.md -- $env.TOOLS_PATH/dev-system/ci/docs/installation-guide.md +# CI System - Configuration Guide\n\n**Installed**: 2026-01-01\n**Detected Languages**: rust, nushell, nickel, bash, markdown, python, javascript\n\n---\n\n## Quick Start\n\n### Option 1: Using configure.sh (Recommended)\n\nA convenience script is installed in `.typedialog/ci/`:\n\n```\n# Use web backend (default) - Opens in browser\n.typedialog/ci/configure.sh\n\n# Use TUI backend - Terminal interface\n.typedialog/ci/configure.sh tui\n\n# Use CLI backend - Command-line prompts\n.typedialog/ci/configure.sh cli\n```\n\n**This script automatically:**\n\n- Sources `.typedialog/ci/envrc` for environment setup\n- Loads defaults from `config.ncl` (Nickel format)\n- Uses cascading search for fragments (local → Tools)\n- Creates backup before overwriting existing config\n- Saves output in Nickel format using nickel-roundtrip with documented template\n- Generates `config.ncl` compatible with `nickel doc` command\n\n### Option 2: Direct TypeDialog Commands\n\nUse TypeDialog nickel-roundtrip directly with manual paths:\n\n#### Web Backend (Recommended - Easy Viewing)\n\n```\ncd .typedialog/ci # Change to CI directory\nsource envrc # Load environment\ntypedialog-web nickel-roundtrip config.ncl form.toml \\n --output config.ncl \\n --ncl-template $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2\n```\n\n#### TUI Backend\n\n```\ncd .typedialog/ci\nsource envrc\ntypedialog-tui nickel-roundtrip config.ncl form.toml \\n --output config.ncl \\n --ncl-template $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2\n```\n\n#### CLI Backend\n\n```\ncd .typedialog/ci\nsource envrc\ntypedialog nickel-roundtrip config.ncl form.toml \\n --output config.ncl \\n --ncl-template $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2\n```\n\n**Note:** The `--ncl-template` flag uses a Tera template that adds:\n\n- Descriptive comments for each section\n- Documentation compatible with `nickel doc config.ncl`\n- Consistent formatting and structure\n\n**All backends will:**\n\n- Show only options relevant to your detected languages\n- Guide you through all configuration choices\n- Validate your inputs\n- Generate config.ncl in Nickel format\n\n### Option 3: Manual Configuration\n\nEdit `config.ncl` directly:\n\n```\nvim .typedialog/ci/config.ncl\n```\n\n---\n\n## Configuration Format: Nickel\n\n**This project uses Nickel format by default** for all configuration files.\n\n### Why Nickel?\n\n- ✅ **Typed configuration** - Static type checking with `nickel typecheck`\n- ✅ **Documentation** - Generate docs with `nickel doc config.ncl`\n- ✅ **Validation** - Built-in schema validation\n- ✅ **Comments** - Rich inline documentation support\n- ✅ **Modular** - Import/export system for reusable configs\n\n### Nickel Template\n\nThe output structure is controlled by a **Tera template** at:\n\n- **Tools default**: `$TOOLS_PATH/dev-system/ci/templates/config.ncl.j2`\n- **Local override**: `.typedialog/ci/config.ncl.j2` (optional)\n\n**To customize the template:**\n\n```\n# Copy the default template\ncp $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 \\n .typedialog/ci/config.ncl.j2\n\n# Edit to add custom comments, documentation, or structure\nvim .typedialog/ci/config.ncl.j2\n\n# Your template will now be used automatically\n```\n\n**Template features:**\n\n- Customizable comments per section\n- Control field ordering\n- Add project-specific documentation\n- Configure output for `nickel doc` command\n\n### TypeDialog Environment Variables\n\nYou can customize TypeDialog behavior with environment variables:\n\n```\n# Web server configuration\nexport TYPEDIALOG_PORT=9000 # Port for web backend (default: 9000)\nexport TYPEDIALOG_HOST=localhost # Host binding (default: localhost)\n\n# Localization\nexport TYPEDIALOG_LANG=en_US.UTF-8 # Form language (default: system locale)\n\n# Run with custom settings\nTYPEDIALOG_PORT=8080 .typedialog/ci/configure.sh web\n```\n\n**Common use cases:**\n\n```\n# Access from other machines in network\nTYPEDIALOG_HOST=0.0.0.0 TYPEDIALOG_PORT=8080 .typedialog/ci/configure.sh web\n\n# Use different port if 9000 is busy\nTYPEDIALOG_PORT=3000 .typedialog/ci/configure.sh web\n\n# Spanish interface\nTYPEDIALOG_LANG=es_ES.UTF-8 .typedialog/ci/configure.sh web\n```\n\n## Configuration Structure\n\nYour config.ncl is organized in the `ci` namespace (Nickel format):\n\n```\n{\n ci = {\n project = {\n name = "rust",\n detected_languages = ["rust, nushell, nickel, bash, markdown, python, javascript"],\n primary_language = "rust",\n },\n tools = {\n # Tools are added based on detected languages\n },\n features = {\n # CI features (pre-commit, GitHub Actions, etc.)\n },\n ci_providers = {\n # CI provider configurations\n },\n },\n}\n```\n\n## Available Fragments\n\nTool configurations are modular. Check `.typedialog/ci/fragments/` for:\n\n- rust-tools.toml - Tools for rust\n- nushell-tools.toml - Tools for nushell\n- nickel-tools.toml - Tools for nickel\n- bash-tools.toml - Tools for bash\n- markdown-tools.toml - Tools for markdown\n- python-tools.toml - Tools for python\n- javascript-tools.toml - Tools for javascript\n- general-tools.toml - Cross-language tools\n- ci-providers.toml - GitHub Actions, Woodpecker, etc.\n\n## Cascading Override System\n\nThis project uses a **local → Tools cascading search** for all resources:\n\n### How It Works\n\nResources are searched in priority order:\n\n1. **Local files** (`.typedialog/ci/`) - **FIRST** (highest priority)\n2. **Tools files** (`$TOOLS_PATH/dev-system/ci/`) - **FALLBACK** (default)\n\n### Affected Resources\n\n| Resource | Local Path | Tools Path |\n| ---------- | ------------ | ------------ |\n| Fragments | `.typedialog/ci/fragments/` | `$TOOLS_PATH/dev-system/ci/forms/fragments/` |\n| Schemas | `.typedialog/ci/schemas/` | `$TOOLS_PATH/dev-system/ci/schemas/` |\n| Validators | `.typedialog/ci/validators/` | `$TOOLS_PATH/dev-system/ci/validators/` |\n| Defaults | `.typedialog/ci/defaults/` | `$TOOLS_PATH/dev-system/ci/defaults/` |\n| Nickel Template | `.typedialog/ci/config.ncl.j2` | `$TOOLS_PATH/dev-system/ci/templates/config.ncl.j2` |\n\n### Environment Setup (.envrc)\n\nThe `.typedialog/ci/.envrc` file configures search paths:\n\n```\n# Source this file to load environment\nsource .typedialog/ci/.envrc\n\n# Or use direnv for automatic loading\necho 'source .typedialog/ci/.envrc' >> .envrc\n```\n\n**What's in .envrc:**\n\n```\nexport NICKEL_IMPORT_PATH="schemas:$TOOLS_PATH/dev-system/ci/schemas:validators:..."\nexport TYPEDIALOG_FRAGMENT_PATH=".:$TOOLS_PATH/dev-system/ci/forms"\nexport NCL_TEMPLATE=""\nexport TYPEDIALOG_PORT=9000 # Web server port\nexport TYPEDIALOG_HOST=localhost # Web server host\nexport TYPEDIALOG_LANG="${LANG}" # Form localization\n```\n\n### Creating Overrides\n\n**By default:** All resources come from Tools (no duplication).\n\n**To customize:** Create file in local directory with same name:\n\n```\n# Override a fragment\ncp $TOOLS_PATH/dev-system/ci/fragments/rust-tools.toml \\n .typedialog/ci/fragments/rust-tools.toml\n\n# Edit your local version\nvim .typedialog/ci/fragments/rust-tools.toml\n\n# Override Nickel template (customize comments, structure, nickel doc output)\ncp $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 \\n .typedialog/ci/config.ncl.j2\n\n# Edit to customize documentation and structure\nvim .typedialog/ci/config.ncl.j2\n\n# Now your version will be used instead of Tools version\n```\n\n**Benefits:**\n\n- ✅ Override only what you need\n- ✅ Everything else stays synchronized with Tools\n- ✅ No duplication by default\n- ✅ Automatic updates when Tools is updated\n\n**See:** `$TOOLS_PATH/dev-system/ci/docs/cascade-override.md` for complete documentation.\n\n## Testing Your Configuration\n\n### Validate Configuration\n\n```\nnu $env.TOOLS_PATH/dev-system/ci/scripts/validator.nu \\n --config .typedialog/ci/config.ncl \\n --project . \\n --namespace ci\n```\n\n### Regenerate CI Files\n\n```\nnu $env.TOOLS_PATH/dev-system/ci/scripts/generate-configs.nu \\n --config .typedialog/ci/config.ncl \\n --templates $env.TOOLS_PATH/dev-system/ci/templates \\n --output . \\n --namespace ci\n```\n\n## Common Tasks\n\n### Add a New Tool\n\nEdit `config.ncl` and add under `ci.tools`:\n\n```\n{\n ci = {\n tools = {\n newtool = {\n enabled = true,\n install_method = "cargo",\n version = "latest",\n },\n },\n },\n}\n```\n\n### Disable a Feature\n\n```\n[ci.features]\nenable_pre_commit = false\n```\n\n## Need Help?\n\nFor detailed documentation, see:\n\n- $env.TOOLS_PATH/dev-system/ci/docs/configuration-guide.md\n- $env.TOOLS_PATH/dev-system/ci/docs/installation-guide.md diff --git a/.typedialog/platform/forms/README.md b/.typedialog/platform/forms/README.md index 4eaf99c..a8f99b2 100644 --- a/.typedialog/platform/forms/README.md +++ b/.typedialog/platform/forms/README.md @@ -1,390 +1 @@ -# Forms - -TypeDialog form definitions for interactive configuration of platform services. - -## Purpose - -Forms provide: -- **Interactive configuration** - Web/TUI/CLI interfaces for user input -- **Constraint validation** - Dynamic min/max from constraints.toml -- **Nickel mapping** - Form fields map to Nickel structure via `nickel_path` -- **Jinja2 template integration** - Generate Nickel configs from form values -- **nickel-roundtrip workflow** - Load existing Nickel → edit → generate updated Nickel - -## File Organization - -``` -forms/ -├── README.md # This file -├── orchestrator-form.toml # Orchestrator configuration form -├── control-center-form.toml # Control Center configuration form -├── mcp-server-form.toml # MCP Server configuration form -├── installer-form.toml # Installer configuration form -└── fragments/ # FLAT fragment directory (all fragments here) - ├── workspace-section.toml # Workspace configuration - ├── server-section.toml # HTTP server settings - ├── database-rocksdb-section.toml # RocksDB configuration - ├── database-surrealdb-section.toml # SurrealDB configuration - ├── database-postgres-section.toml # PostgreSQL configuration - ├── security-section.toml # Auth, RBAC, encryption - ├── monitoring-section.toml # Metrics, health checks - ├── logging-section.toml # Log configuration - ├── orchestrator-queue-section.toml # Orchestrator queue config - ├── orchestrator-workflow-section.toml - ├── control-center-jwt-section.toml - ├── control-center-rbac-section.toml - ├── mcp-capabilities-section.toml - ├── deployment-mode-section.toml # Mode selection - └── README.md # Fragment documentation -``` - -## Critical: Fragment Organization - -**Fragments are FLAT** - all stored in `forms/fragments/` at the same level, referenced by paths in form includes: - -``` -# Main form (orchestrator-form.toml) -[[items]] -name = "workspace_group" -type = "group" -includes = ["fragments/workspace-section.toml"] # Path reference to flat fragment - -[[items]] -name = "queue_group" -type = "group" -includes = ["fragments/orchestrator-queue-section.toml"] # Same level, different name -``` - -**NOT nested directories** like `fragments/orchestrator/queue-section.toml` - all in `fragments/` - -## TypeDialog nickel-roundtrip Workflow - -CRITICAL: Forms integrate with Nickel config generation via: - -``` -typedialog-web nickel-roundtrip "$CONFIG_FILE" "$FORM_FILE" --output "$CONFIG_FILE" --template "$NCL_TEMPLATE" -``` - -This workflow: -1. **Loads existing Nickel config** as default values in form -2. **Shows form** with validated constraints -3. **User edits** configuration values -4. **Generates updated Nickel** using Jinja2 template - -## Required Fields: nickel_path - -**CRITICAL**: Every form element MUST have `nickel_path` to map to Nickel structure: - -``` -[[elements]] -name = "workspace_name" -type = "text" -prompt = "Workspace Name" -nickel_path = ["orchestrator", "workspace", "name"] # ← REQUIRED -``` - -The `nickel_path` array specifies the path in the Nickel config structure: -- `["orchestrator", "workspace", "name"]` → `orchestrator.workspace.name` -- `["orchestrator", "queue", "max_concurrent_tasks"]` → `orchestrator.queue.max_concurrent_tasks` - -## Constraint Interpolation - -Form fields reference constraints dynamically: - -``` -[[elements]] -name = "max_concurrent_tasks" -type = "number" -prompt = "Maximum Concurrent Tasks" -min = "${constraint.orchestrator.queue.concurrent_tasks.min}" # Dynamic -max = "${constraint.orchestrator.queue.concurrent_tasks.max}" # Dynamic -help = "Range: ${constraint.orchestrator.queue.concurrent_tasks.min}-${constraint.orchestrator.queue.concurrent_tasks.max}" -nickel_path = ["orchestrator", "queue", "max_concurrent_tasks"] -``` - -TypeDialog resolves `${constraint.path}` from `constraints/constraints.toml`. - -## Main Form Structure - -All main forms follow this pattern: - -``` -name = "service_configuration" -description = "Interactive configuration for {Service}" -display_mode = "complete" - -# Section 1: Deployment mode selection -[[items]] -name = "deployment_mode_group" -type = "group" -includes = ["fragments/deployment-mode-section.toml"] - -# Section 2: Workspace configuration -[[items]] -name = "workspace_group" -type = "group" -includes = ["fragments/workspace-section.toml"] - -# Section 3: Server configuration -[[items]] -name = "server_group" -type = "group" -includes = ["fragments/server-section.toml"] - -# Section N: Service-specific configuration -[[items]] -name = "service_group" -type = "group" -includes = ["fragments/{service}-specific-section.toml"] - -# Optional: Conditional sections -[[items]] -name = "monitoring_group" -type = "group" -when = "enable_monitoring == true" -includes = ["fragments/monitoring-section.toml"] -``` - -## Fragment Example: workspace-section.toml - -``` -# Workspace configuration fragment -[[elements]] -border_top = true -border_bottom = true -name = "workspace_header" -title = "🗂️ Workspace Configuration" -type = "section_header" - -[[elements]] -name = "workspace_name" -type = "text" -prompt = "Workspace Name" -default = "default" -placeholder = "e.g., librecloud, production" -required = true -help = "Name of the workspace" -nickel_path = ["orchestrator", "workspace", "name"] - -[[elements]] -name = "workspace_path" -type = "text" -prompt = "Workspace Path" -default = "/var/lib/provisioning/orchestrator" -required = true -help = "Absolute path to workspace directory" -nickel_path = ["orchestrator", "workspace", "path"] - -[[elements]] -name = "workspace_enabled" -type = "confirm" -prompt = "Enable Workspace?" -default = true -nickel_path = ["orchestrator", "workspace", "enabled"] - -[[elements]] -name = "multi_workspace" -type = "confirm" -prompt = "Multi-Workspace Mode?" -default = false -help = "Allow serving multiple workspaces" -nickel_path = ["orchestrator", "workspace", "multi_workspace"] -``` - -## Fragment Example: orchestrator-queue-section.toml - -``` -# Orchestrator queue configuration -[[elements]] -border_top = true -name = "queue_header" -title = "⚙️ Queue Configuration" -type = "section_header" - -[[elements]] -name = "max_concurrent_tasks" -type = "number" -prompt = "Maximum Concurrent Tasks" -default = 5 -min = "${constraint.orchestrator.queue.concurrent_tasks.min}" -max = "${constraint.orchestrator.queue.concurrent_tasks.max}" -required = true -help = "Max tasks running simultaneously. Range: ${constraint.orchestrator.queue.concurrent_tasks.min}-${constraint.orchestrator.queue.concurrent_tasks.max}" -nickel_path = ["orchestrator", "queue", "max_concurrent_tasks"] - -[[elements]] -name = "retry_attempts" -type = "number" -prompt = "Retry Attempts" -default = 3 -min = 0 -max = 10 -help = "Number of retry attempts for failed tasks" -nickel_path = ["orchestrator", "queue", "retry_attempts"] - -[[elements]] -name = "retry_delay" -type = "number" -prompt = "Retry Delay (ms)" -default = 5000 -min = 1000 -max = 60000 -help = "Delay between retries in milliseconds" -nickel_path = ["orchestrator", "queue", "retry_delay"] - -[[elements]] -name = "task_timeout" -type = "number" -prompt = "Task Timeout (ms)" -default = 3600000 -min = 60000 -max = 86400000 -help = "Default timeout for task execution (min 1 min, max 24 hrs)" -nickel_path = ["orchestrator", "queue", "task_timeout"] -``` - -## Jinja2 Template Integration - -Jinja2 templates (`templates/{service}-config.ncl.j2`) convert form values to Nickel: - -``` -# templates/orchestrator-config.ncl.j2 -{ - orchestrator = { - workspace = { - {%- if workspace_name %} - name = "{{ workspace_name }}", - {%- endif %} - {%- if workspace_path %} - path = "{{ workspace_path }}", - {%- endif %} - {%- if workspace_enabled is defined %} - enabled = {{ workspace_enabled | lower }}, - {%- endif %} - }, - queue = { - {%- if max_concurrent_tasks %} - max_concurrent_tasks = {{ max_concurrent_tasks }}, - {%- endif %} - {%- if retry_attempts %} - retry_attempts = {{ retry_attempts }}, - {%- endif %} - {%- if retry_delay %} - retry_delay = {{ retry_delay }}, - {%- endif %} - {%- if task_timeout %} - task_timeout = {{ task_timeout }}, - {%- endif %} - }, - }, -} -``` - -## Conditional Sections - -Forms can show/hide sections based on user selections: - -``` -# Always shown -[[items]] -name = "deployment_mode_group" -type = "group" -includes = ["fragments/deployment-mode-section.toml"] - -# Only shown if enable_monitoring is true -[[items]] -name = "monitoring_group" -type = "group" -when = "enable_monitoring == true" -includes = ["fragments/monitoring-section.toml"] - -# Only shown if deployment_mode is "enterprise" -[[items]] -name = "enterprise_options" -type = "group" -when = "deployment_mode == 'enterprise'" -includes = ["fragments/enterprise-options-section.toml"] -``` - -## Element Types - -``` -type = "text" # Single-line text input -type = "number" # Numeric input -type = "confirm" # Boolean checkbox -type = "select" # Dropdown (single choice) -type = "multiselect" # Checkboxes (multiple choices) -type = "password" # Hidden text input -type = "textarea" # Multi-line text -type = "section_header" # Visual section separator -type = "footer" # Confirmation text -type = "group" # Container for fragments -``` - -## Usage Workflow - -### 1. Run Configuration Wizard - -``` -nu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo -``` - -### 2. TypeDialog Loads Form - -- Shows `forms/orchestrator-form.toml` -- Includes fragments from `forms/fragments/*.toml` -- Applies constraint interpolation -- Loads existing config as defaults (if exists) - -### 3. User Edits - -- Fills form fields -- Validates against constraints -- Shows validation errors - -### 4. Generate Nickel - -- Uses `templates/orchestrator-config.ncl.j2` -- Converts form values to Nickel -- Saves to `values/orchestrator.solo.ncl` - -## Best Practices - -1. **Use fragments** - Don't duplicate form sections -2. **Always add nickel_path** - Required for Nickel mapping -3. **Use constraint interpolation** - Dynamic limits from constraints.toml -4. **Provide defaults** - Sensible defaults speed up configuration -5. **Use clear prompts** - Explain what each field does in `help` text -6. **Group related fields** - Use fragments to organize logically -7. **Test constraint interpolation** - Verify ${constraint.*} resolves -8. **Document fragments** - Use headers and help text - -## Testing Forms - -``` -# Validate form TOML syntax (if supported by TypeDialog) -# typedialog validate forms/orchestrator-form.toml - -# Launch interactive form (web backend) -nu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo --backend web - -# View generated Nickel -cat provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl -``` - -## Adding New Fields - -To add a new configuration field: - -1. **Add to schema** (schemas/{service}.ncl) -2. **Add to defaults** (defaults/{service}-defaults.ncl) -3. **Add to fragment** (forms/fragments/{appropriate}-section.toml) - - Include `nickel_path` mapping - - Add constraint if numeric -4. **Update Jinja2 template** (templates/{service}-config.ncl.j2) -5. **Test**: `nu scripts/configure.nu {service} {mode}` - ---- - -**Version**: 1.0.0 -**Last Updated**: 2025-01-05 +# Forms\n\nTypeDialog form definitions for interactive configuration of platform services.\n\n## Purpose\n\nForms provide:\n- **Interactive configuration** - Web/TUI/CLI interfaces for user input\n- **Constraint validation** - Dynamic min/max from constraints.toml\n- **Nickel mapping** - Form fields map to Nickel structure via `nickel_path`\n- **Jinja2 template integration** - Generate Nickel configs from form values\n- **nickel-roundtrip workflow** - Load existing Nickel → edit → generate updated Nickel\n\n## File Organization\n\n```\nforms/\n├── README.md # This file\n├── orchestrator-form.toml # Orchestrator configuration form\n├── control-center-form.toml # Control Center configuration form\n├── mcp-server-form.toml # MCP Server configuration form\n├── installer-form.toml # Installer configuration form\n└── fragments/ # FLAT fragment directory (all fragments here)\n ├── workspace-section.toml # Workspace configuration\n ├── server-section.toml # HTTP server settings\n ├── database-rocksdb-section.toml # RocksDB configuration\n ├── database-surrealdb-section.toml # SurrealDB configuration\n ├── database-postgres-section.toml # PostgreSQL configuration\n ├── security-section.toml # Auth, RBAC, encryption\n ├── monitoring-section.toml # Metrics, health checks\n ├── logging-section.toml # Log configuration\n ├── orchestrator-queue-section.toml # Orchestrator queue config\n ├── orchestrator-workflow-section.toml\n ├── control-center-jwt-section.toml\n ├── control-center-rbac-section.toml\n ├── mcp-capabilities-section.toml\n ├── deployment-mode-section.toml # Mode selection\n └── README.md # Fragment documentation\n```\n\n## Critical: Fragment Organization\n\n**Fragments are FLAT** - all stored in `forms/fragments/` at the same level, referenced by paths in form includes:\n\n```\n# Main form (orchestrator-form.toml)\n[[items]]\nname = "workspace_group"\ntype = "group"\nincludes = ["fragments/workspace-section.toml"] # Path reference to flat fragment\n\n[[items]]\nname = "queue_group"\ntype = "group"\nincludes = ["fragments/orchestrator-queue-section.toml"] # Same level, different name\n```\n\n**NOT nested directories** like `fragments/orchestrator/queue-section.toml` - all in `fragments/`\n\n## TypeDialog nickel-roundtrip Workflow\n\nCRITICAL: Forms integrate with Nickel config generation via:\n\n```\ntypedialog-web nickel-roundtrip "$CONFIG_FILE" "$FORM_FILE" --output "$CONFIG_FILE" --template "$NCL_TEMPLATE"\n```\n\nThis workflow:\n1. **Loads existing Nickel config** as default values in form\n2. **Shows form** with validated constraints\n3. **User edits** configuration values\n4. **Generates updated Nickel** using Jinja2 template\n\n## Required Fields: nickel_path\n\n**CRITICAL**: Every form element MUST have `nickel_path` to map to Nickel structure:\n\n```\n[[elements]]\nname = "workspace_name"\ntype = "text"\nprompt = "Workspace Name"\nnickel_path = ["orchestrator", "workspace", "name"] # ← REQUIRED\n```\n\nThe `nickel_path` array specifies the path in the Nickel config structure:\n- `["orchestrator", "workspace", "name"]` → `orchestrator.workspace.name`\n- `["orchestrator", "queue", "max_concurrent_tasks"]` → `orchestrator.queue.max_concurrent_tasks`\n\n## Constraint Interpolation\n\nForm fields reference constraints dynamically:\n\n```\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nprompt = "Maximum Concurrent Tasks"\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}" # Dynamic\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}" # Dynamic\nhelp = "Range: ${constraint.orchestrator.queue.concurrent_tasks.min}-${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\nTypeDialog resolves `${constraint.path}` from `constraints/constraints.toml`.\n\n## Main Form Structure\n\nAll main forms follow this pattern:\n\n```\nname = "service_configuration"\ndescription = "Interactive configuration for {Service}"\ndisplay_mode = "complete"\n\n# Section 1: Deployment mode selection\n[[items]]\nname = "deployment_mode_group"\ntype = "group"\nincludes = ["fragments/deployment-mode-section.toml"]\n\n# Section 2: Workspace configuration\n[[items]]\nname = "workspace_group"\ntype = "group"\nincludes = ["fragments/workspace-section.toml"]\n\n# Section 3: Server configuration\n[[items]]\nname = "server_group"\ntype = "group"\nincludes = ["fragments/server-section.toml"]\n\n# Section N: Service-specific configuration\n[[items]]\nname = "service_group"\ntype = "group"\nincludes = ["fragments/{service}-specific-section.toml"]\n\n# Optional: Conditional sections\n[[items]]\nname = "monitoring_group"\ntype = "group"\nwhen = "enable_monitoring == true"\nincludes = ["fragments/monitoring-section.toml"]\n```\n\n## Fragment Example: workspace-section.toml\n\n```\n# Workspace configuration fragment\n[[elements]]\nborder_top = true\nborder_bottom = true\nname = "workspace_header"\ntitle = "🗂️ Workspace Configuration"\ntype = "section_header"\n\n[[elements]]\nname = "workspace_name"\ntype = "text"\nprompt = "Workspace Name"\ndefault = "default"\nplaceholder = "e.g., librecloud, production"\nrequired = true\nhelp = "Name of the workspace"\nnickel_path = ["orchestrator", "workspace", "name"]\n\n[[elements]]\nname = "workspace_path"\ntype = "text"\nprompt = "Workspace Path"\ndefault = "/var/lib/provisioning/orchestrator"\nrequired = true\nhelp = "Absolute path to workspace directory"\nnickel_path = ["orchestrator", "workspace", "path"]\n\n[[elements]]\nname = "workspace_enabled"\ntype = "confirm"\nprompt = "Enable Workspace?"\ndefault = true\nnickel_path = ["orchestrator", "workspace", "enabled"]\n\n[[elements]]\nname = "multi_workspace"\ntype = "confirm"\nprompt = "Multi-Workspace Mode?"\ndefault = false\nhelp = "Allow serving multiple workspaces"\nnickel_path = ["orchestrator", "workspace", "multi_workspace"]\n```\n\n## Fragment Example: orchestrator-queue-section.toml\n\n```\n# Orchestrator queue configuration\n[[elements]]\nborder_top = true\nname = "queue_header"\ntitle = "⚙️ Queue Configuration"\ntype = "section_header"\n\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nprompt = "Maximum Concurrent Tasks"\ndefault = 5\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\nrequired = true\nhelp = "Max tasks running simultaneously. Range: ${constraint.orchestrator.queue.concurrent_tasks.min}-${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n\n[[elements]]\nname = "retry_attempts"\ntype = "number"\nprompt = "Retry Attempts"\ndefault = 3\nmin = 0\nmax = 10\nhelp = "Number of retry attempts for failed tasks"\nnickel_path = ["orchestrator", "queue", "retry_attempts"]\n\n[[elements]]\nname = "retry_delay"\ntype = "number"\nprompt = "Retry Delay (ms)"\ndefault = 5000\nmin = 1000\nmax = 60000\nhelp = "Delay between retries in milliseconds"\nnickel_path = ["orchestrator", "queue", "retry_delay"]\n\n[[elements]]\nname = "task_timeout"\ntype = "number"\nprompt = "Task Timeout (ms)"\ndefault = 3600000\nmin = 60000\nmax = 86400000\nhelp = "Default timeout for task execution (min 1 min, max 24 hrs)"\nnickel_path = ["orchestrator", "queue", "task_timeout"]\n```\n\n## Jinja2 Template Integration\n\nJinja2 templates (`templates/{service}-config.ncl.j2`) convert form values to Nickel:\n\n```\n# templates/orchestrator-config.ncl.j2\n{\n orchestrator = {\n workspace = {\n {%- if workspace_name %}\n name = "{{ workspace_name }}",\n {%- endif %}\n {%- if workspace_path %}\n path = "{{ workspace_path }}",\n {%- endif %}\n {%- if workspace_enabled is defined %}\n enabled = {{ workspace_enabled | lower }},\n {%- endif %}\n },\n queue = {\n {%- if max_concurrent_tasks %}\n max_concurrent_tasks = {{ max_concurrent_tasks }},\n {%- endif %}\n {%- if retry_attempts %}\n retry_attempts = {{ retry_attempts }},\n {%- endif %}\n {%- if retry_delay %}\n retry_delay = {{ retry_delay }},\n {%- endif %}\n {%- if task_timeout %}\n task_timeout = {{ task_timeout }},\n {%- endif %}\n },\n },\n}\n```\n\n## Conditional Sections\n\nForms can show/hide sections based on user selections:\n\n```\n# Always shown\n[[items]]\nname = "deployment_mode_group"\ntype = "group"\nincludes = ["fragments/deployment-mode-section.toml"]\n\n# Only shown if enable_monitoring is true\n[[items]]\nname = "monitoring_group"\ntype = "group"\nwhen = "enable_monitoring == true"\nincludes = ["fragments/monitoring-section.toml"]\n\n# Only shown if deployment_mode is "enterprise"\n[[items]]\nname = "enterprise_options"\ntype = "group"\nwhen = "deployment_mode == 'enterprise'"\nincludes = ["fragments/enterprise-options-section.toml"]\n```\n\n## Element Types\n\n```\ntype = "text" # Single-line text input\ntype = "number" # Numeric input\ntype = "confirm" # Boolean checkbox\ntype = "select" # Dropdown (single choice)\ntype = "multiselect" # Checkboxes (multiple choices)\ntype = "password" # Hidden text input\ntype = "textarea" # Multi-line text\ntype = "section_header" # Visual section separator\ntype = "footer" # Confirmation text\ntype = "group" # Container for fragments\n```\n\n## Usage Workflow\n\n### 1. Run Configuration Wizard\n\n```\nnu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo\n```\n\n### 2. TypeDialog Loads Form\n\n- Shows `forms/orchestrator-form.toml`\n- Includes fragments from `forms/fragments/*.toml`\n- Applies constraint interpolation\n- Loads existing config as defaults (if exists)\n\n### 3. User Edits\n\n- Fills form fields\n- Validates against constraints\n- Shows validation errors\n\n### 4. Generate Nickel\n\n- Uses `templates/orchestrator-config.ncl.j2`\n- Converts form values to Nickel\n- Saves to `values/orchestrator.solo.ncl`\n\n## Best Practices\n\n1. **Use fragments** - Don't duplicate form sections\n2. **Always add nickel_path** - Required for Nickel mapping\n3. **Use constraint interpolation** - Dynamic limits from constraints.toml\n4. **Provide defaults** - Sensible defaults speed up configuration\n5. **Use clear prompts** - Explain what each field does in `help` text\n6. **Group related fields** - Use fragments to organize logically\n7. **Test constraint interpolation** - Verify ${constraint.*} resolves\n8. **Document fragments** - Use headers and help text\n\n## Testing Forms\n\n```\n# Validate form TOML syntax (if supported by TypeDialog)\n# typedialog validate forms/orchestrator-form.toml\n\n# Launch interactive form (web backend)\nnu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo --backend web\n\n# View generated Nickel\ncat provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl\n```\n\n## Adding New Fields\n\nTo add a new configuration field:\n\n1. **Add to schema** (schemas/{service}.ncl)\n2. **Add to defaults** (defaults/{service}-defaults.ncl)\n3. **Add to fragment** (forms/fragments/{appropriate}-section.toml)\n - Include `nickel_path` mapping\n - Add constraint if numeric\n4. **Update Jinja2 template** (templates/{service}-config.ncl.j2)\n5. **Test**: `nu scripts/configure.nu {service} {mode}`\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 diff --git a/.typedialog/platform/forms/fragments/README.md b/.typedialog/platform/forms/fragments/README.md index f5b3aed..802e1fb 100644 --- a/.typedialog/platform/forms/fragments/README.md +++ b/.typedialog/platform/forms/fragments/README.md @@ -1,334 +1 @@ -# Fragments - -Reusable form fragments organized FLAT in this directory (not nested subdirectories). - -## Purpose - -Fragments provide: -- **Reusable sections** - Used by multiple forms -- **Modularity** - Change once, applies to all forms using it -- **Organization** - Named by purpose (workspace, server, queue, etc.) -- **DRY principle** - Don't repeat configuration sections - -## Fragment Organization - -**CRITICAL**: All fragments are stored at the SAME LEVEL (flat directory). - -``` -fragments/ -├── workspace-section.toml # Workspace configuration -├── server-section.toml # HTTP server settings -├── database-rocksdb-section.toml # RocksDB database -├── database-surrealdb-section.toml # SurrealDB database -├── database-postgres-section.toml # PostgreSQL database -├── security-section.toml # Auth, RBAC, encryption -├── monitoring-section.toml # Metrics, health checks -├── logging-section.toml # Log configuration -├── orchestrator-queue-section.toml # Orchestrator queue config -├── orchestrator-workflow-section.toml # Orchestrator batch workflow -├── orchestrator-storage-section.toml # Orchestrator storage backend -├── control-center-jwt-section.toml # Control Center JWT -├── control-center-rbac-section.toml # Control Center RBAC -├── control-center-compliance-section.toml -├── mcp-capabilities-section.toml # MCP capabilities -├── mcp-tools-section.toml # MCP tools configuration -├── mcp-resources-section.toml # MCP resource limits -├── deployment-mode-section.toml # Deployment mode selection -├── resources-section.toml # Resource allocation (CPU, RAM, disk) -└── README.md # This file -``` - -Referenced in forms as: -``` -[[items]] -name = "workspace_group" -type = "group" -includes = ["fragments/workspace-section.toml"] # Flat reference - -[[items]] -name = "queue_group" -type = "group" -includes = ["fragments/orchestrator-queue-section.toml"] # Same level -``` - -## Fragment Categories - -### Common Fragments (Used by Multiple Services) - -- **workspace-section.toml** - Workspace name, path, enable/disable -- **server-section.toml** - HTTP server host, port, workers, keep-alive -- **database-rocksdb-section.toml** - RocksDB path (filesystem-backed) -- **database-surrealdb-section.toml** - SurrealDB embedded (no external service) -- **database-postgres-section.toml** - PostgreSQL server connection -- **security-section.toml** - JWT issuer, RBAC, encryption keys -- **monitoring-section.toml** - Metrics interval, health checks -- **logging-section.toml** - Log level, format, rotation -- **resources-section.toml** - CPU cores, memory, disk allocation -- **deployment-mode-section.toml** - Solo/MultiUser/CI/CD/Enterprise selection - -### Service-Specific Fragments - -**Orchestrator** (workflow engine): -- **orchestrator-queue-section.toml** - Max concurrent tasks, retries, timeout -- **orchestrator-workflow-section.toml** - Batch workflow settings, parallelism -- **orchestrator-storage-section.toml** - Storage backend selection - -**Control Center** (policy/RBAC): -- **control-center-jwt-section.toml** - JWT issuer, audience, token expiration -- **control-center-rbac-section.toml** - Roles, permissions, policies -- **control-center-compliance-section.toml** - SOC2, HIPAA, audit logging - -**MCP Server** (protocol): -- **mcp-capabilities-section.toml** - Tools, prompts, resources, sampling -- **mcp-tools-section.toml** - Tool timeout, max concurrent, categories -- **mcp-resources-section.toml** - Max size, caching, TTL - -## Fragment Structure - -Each fragment is a TOML file containing `[[elements]]` definitions: - -``` -# fragments/workspace-section.toml - -[[elements]] -border_top = true -border_bottom = true -name = "workspace_header" -title = "🗂️ Workspace Configuration" -type = "section_header" - -[[elements]] -name = "workspace_name" -type = "text" -prompt = "Workspace Name" -default = "default" -required = true -help = "Name of the workspace this service will serve" -nickel_path = ["orchestrator", "workspace", "name"] - -[[elements]] -name = "workspace_path" -type = "text" -prompt = "Workspace Path" -default = "/var/lib/provisioning/orchestrator" -required = true -help = "Absolute path to the workspace directory" -nickel_path = ["orchestrator", "workspace", "path"] - -[[elements]] -name = "workspace_enabled" -type = "confirm" -prompt = "Enable Workspace?" -default = true -help = "Enable or disable this workspace" -nickel_path = ["orchestrator", "workspace", "enabled"] -``` - -## Fragment Composition - -Fragments are included in main forms: - -``` -# forms/orchestrator-form.toml - -name = "orchestrator_configuration" -description = "Interactive configuration for Orchestrator" - -# Include fragments in order - -[[items]] -name = "deployment_group" -type = "group" -includes = ["fragments/deployment-mode-section.toml"] - -[[items]] -name = "workspace_group" -type = "group" -includes = ["fragments/workspace-section.toml"] - -[[items]] -name = "server_group" -type = "group" -includes = ["fragments/server-section.toml"] - -[[items]] -name = "storage_group" -type = "group" -includes = ["fragments/orchestrator-storage-section.toml"] - -[[items]] -name = "queue_group" -type = "group" -includes = ["fragments/orchestrator-queue-section.toml"] - -# Optional sections -[[items]] -name = "monitoring_group" -type = "group" -when = "enable_monitoring == true" -includes = ["fragments/monitoring-section.toml"] -``` - -## Element Requirements - -Every element in a fragment MUST include: - -1. **name** - Unique identifier (used in form data) -2. **type** - Element type (text, number, confirm, select, etc.) -3. **prompt** - User-facing label -4. **nickel_path** - Mapping to Nickel structure (**CRITICAL**) - -Example: -``` -[[elements]] -name = "max_concurrent_tasks" # Unique identifier -type = "number" # Type -prompt = "Maximum Concurrent Tasks" # User label -nickel_path = ["orchestrator", "queue", "max_concurrent_tasks"] # Nickel mapping -``` - -## Constraint Interpolation - -Fragments reference constraints dynamically: - -``` -[[elements]] -name = "max_concurrent_tasks" -type = "number" -prompt = "Maximum Concurrent Tasks" -min = "${constraint.orchestrator.queue.concurrent_tasks.min}" # Dynamic -max = "${constraint.orchestrator.queue.concurrent_tasks.max}" # Dynamic -nickel_path = ["orchestrator", "queue", "max_concurrent_tasks"] -``` - -The `${constraint.path.to.value}` syntax references `constraints/constraints.toml`. - -## Common Fragment Patterns - -### Workspace Fragment Pattern -``` -[[elements]] -name = "workspace_name" -type = "text" -prompt = "Workspace Name" -nickel_path = ["orchestrator", "workspace", "name"] - -[[elements]] -name = "workspace_path" -type = "text" -prompt = "Workspace Path" -nickel_path = ["orchestrator", "workspace", "path"] - -[[elements]] -name = "workspace_enabled" -type = "confirm" -prompt = "Enable Workspace?" -nickel_path = ["orchestrator", "workspace", "enabled"] -``` - -### Server Fragment Pattern -``` -[[elements]] -name = "server_host" -type = "text" -prompt = "Server Host" -default = "127.0.0.1" -nickel_path = ["orchestrator", "server", "host"] - -[[elements]] -name = "server_port" -type = "number" -prompt = "Server Port" -min = "${constraint.common.server.port.min}" -max = "${constraint.common.server.port.max}" -nickel_path = ["orchestrator", "server", "port"] - -[[elements]] -name = "server_workers" -type = "number" -prompt = "Worker Threads" -min = 1 -max = 32 -nickel_path = ["orchestrator", "server", "workers"] -``` - -### Database Selection Pattern -``` -[[elements]] -name = "storage_backend" -type = "select" -prompt = "Storage Backend" -options = [ - { value = "filesystem", label = "📁 Filesystem" }, - { value = "rocksdb", label = "🗄️ RocksDB (Embedded)" }, - { value = "surrealdb", label = "📊 SurrealDB" }, - { value = "postgres", label = "🐘 PostgreSQL" }, -] -nickel_path = ["orchestrator", "storage", "backend"] - -[[elements]] -name = "rocksdb_group" -type = "group" -when = "storage_backend == 'rocksdb'" -includes = ["fragments/database-rocksdb-section.toml"] - -[[elements]] -name = "postgres_group" -type = "group" -when = "storage_backend == 'postgres'" -includes = ["fragments/database-postgres-section.toml"] - -[[elements]] -name = "surrealdb_group" -type = "group" -when = "storage_backend == 'surrealdb'" -includes = ["fragments/database-surrealdb-section.toml"] -``` - -## Best Practices - -1. **Clear naming** - Fragment name describes its purpose (queue-section, not qs) -2. **Meaningful headers** - Each fragment starts with a section header (name, title, emoji) -3. **Constraint interpolation** - Use `${constraint.*}` for dynamic validation -4. **Consistent nickel_path** - Paths match actual Nickel structure -5. **Provide defaults** - Sensible defaults improve UX -6. **Help text** - Explain each field clearly -7. **Group logically** - Related fields in same fragment -8. **Test with form** - Verify fragment loads correctly in form - -## Adding a New Fragment - -1. **Create fragment file** in `forms/fragments/{name}-section.toml` -2. **Add section header** (name, title, emoji) -3. **Add form elements**: - - Include `name`, `type`, `prompt` - - Add `nickel_path` (CRITICAL) - - Add constraints if applicable - - Add `help` and `default` if appropriate -4. **Include in form** - Add to main form via `includes` field -5. **Test** - Run configuration wizard to verify fragment loads - -## Fragment Naming Convention - -- **Section fragments**: `{topic}-section.toml` (workspace-section.toml) -- **Service-specific**: `{service}-{topic}-section.toml` (orchestrator-queue-section.toml) -- **Database-specific**: `database-{backend}-section.toml` (database-postgres-section.toml) -- **Deployment-specific**: `{mode}-{topic}-section.toml` (enterprise-options-section.toml) - -## Testing Fragments - -``` -# Validate form that uses fragment -nu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo --backend web - -# Verify constraint interpolation works -grep "constraint\." forms/fragments/*.toml - -# Check nickel_path consistency -grep "nickel_path" forms/fragments/*.toml | sort -``` - ---- - -**Version**: 1.0.0 -**Last Updated**: 2025-01-05 +# Fragments\n\nReusable form fragments organized FLAT in this directory (not nested subdirectories).\n\n## Purpose\n\nFragments provide:\n- **Reusable sections** - Used by multiple forms\n- **Modularity** - Change once, applies to all forms using it\n- **Organization** - Named by purpose (workspace, server, queue, etc.)\n- **DRY principle** - Don't repeat configuration sections\n\n## Fragment Organization\n\n**CRITICAL**: All fragments are stored at the SAME LEVEL (flat directory).\n\n```\nfragments/\n├── workspace-section.toml # Workspace configuration\n├── server-section.toml # HTTP server settings\n├── database-rocksdb-section.toml # RocksDB database\n├── database-surrealdb-section.toml # SurrealDB database\n├── database-postgres-section.toml # PostgreSQL database\n├── security-section.toml # Auth, RBAC, encryption\n├── monitoring-section.toml # Metrics, health checks\n├── logging-section.toml # Log configuration\n├── orchestrator-queue-section.toml # Orchestrator queue config\n├── orchestrator-workflow-section.toml # Orchestrator batch workflow\n├── orchestrator-storage-section.toml # Orchestrator storage backend\n├── control-center-jwt-section.toml # Control Center JWT\n├── control-center-rbac-section.toml # Control Center RBAC\n├── control-center-compliance-section.toml\n├── mcp-capabilities-section.toml # MCP capabilities\n├── mcp-tools-section.toml # MCP tools configuration\n├── mcp-resources-section.toml # MCP resource limits\n├── deployment-mode-section.toml # Deployment mode selection\n├── resources-section.toml # Resource allocation (CPU, RAM, disk)\n└── README.md # This file\n```\n\nReferenced in forms as:\n```\n[[items]]\nname = "workspace_group"\ntype = "group"\nincludes = ["fragments/workspace-section.toml"] # Flat reference\n\n[[items]]\nname = "queue_group"\ntype = "group"\nincludes = ["fragments/orchestrator-queue-section.toml"] # Same level\n```\n\n## Fragment Categories\n\n### Common Fragments (Used by Multiple Services)\n\n- **workspace-section.toml** - Workspace name, path, enable/disable\n- **server-section.toml** - HTTP server host, port, workers, keep-alive\n- **database-rocksdb-section.toml** - RocksDB path (filesystem-backed)\n- **database-surrealdb-section.toml** - SurrealDB embedded (no external service)\n- **database-postgres-section.toml** - PostgreSQL server connection\n- **security-section.toml** - JWT issuer, RBAC, encryption keys\n- **monitoring-section.toml** - Metrics interval, health checks\n- **logging-section.toml** - Log level, format, rotation\n- **resources-section.toml** - CPU cores, memory, disk allocation\n- **deployment-mode-section.toml** - Solo/MultiUser/CI/CD/Enterprise selection\n\n### Service-Specific Fragments\n\n**Orchestrator** (workflow engine):\n- **orchestrator-queue-section.toml** - Max concurrent tasks, retries, timeout\n- **orchestrator-workflow-section.toml** - Batch workflow settings, parallelism\n- **orchestrator-storage-section.toml** - Storage backend selection\n\n**Control Center** (policy/RBAC):\n- **control-center-jwt-section.toml** - JWT issuer, audience, token expiration\n- **control-center-rbac-section.toml** - Roles, permissions, policies\n- **control-center-compliance-section.toml** - SOC2, HIPAA, audit logging\n\n**MCP Server** (protocol):\n- **mcp-capabilities-section.toml** - Tools, prompts, resources, sampling\n- **mcp-tools-section.toml** - Tool timeout, max concurrent, categories\n- **mcp-resources-section.toml** - Max size, caching, TTL\n\n## Fragment Structure\n\nEach fragment is a TOML file containing `[[elements]]` definitions:\n\n```\n# fragments/workspace-section.toml\n\n[[elements]]\nborder_top = true\nborder_bottom = true\nname = "workspace_header"\ntitle = "🗂️ Workspace Configuration"\ntype = "section_header"\n\n[[elements]]\nname = "workspace_name"\ntype = "text"\nprompt = "Workspace Name"\ndefault = "default"\nrequired = true\nhelp = "Name of the workspace this service will serve"\nnickel_path = ["orchestrator", "workspace", "name"]\n\n[[elements]]\nname = "workspace_path"\ntype = "text"\nprompt = "Workspace Path"\ndefault = "/var/lib/provisioning/orchestrator"\nrequired = true\nhelp = "Absolute path to the workspace directory"\nnickel_path = ["orchestrator", "workspace", "path"]\n\n[[elements]]\nname = "workspace_enabled"\ntype = "confirm"\nprompt = "Enable Workspace?"\ndefault = true\nhelp = "Enable or disable this workspace"\nnickel_path = ["orchestrator", "workspace", "enabled"]\n```\n\n## Fragment Composition\n\nFragments are included in main forms:\n\n```\n# forms/orchestrator-form.toml\n\nname = "orchestrator_configuration"\ndescription = "Interactive configuration for Orchestrator"\n\n# Include fragments in order\n\n[[items]]\nname = "deployment_group"\ntype = "group"\nincludes = ["fragments/deployment-mode-section.toml"]\n\n[[items]]\nname = "workspace_group"\ntype = "group"\nincludes = ["fragments/workspace-section.toml"]\n\n[[items]]\nname = "server_group"\ntype = "group"\nincludes = ["fragments/server-section.toml"]\n\n[[items]]\nname = "storage_group"\ntype = "group"\nincludes = ["fragments/orchestrator-storage-section.toml"]\n\n[[items]]\nname = "queue_group"\ntype = "group"\nincludes = ["fragments/orchestrator-queue-section.toml"]\n\n# Optional sections\n[[items]]\nname = "monitoring_group"\ntype = "group"\nwhen = "enable_monitoring == true"\nincludes = ["fragments/monitoring-section.toml"]\n```\n\n## Element Requirements\n\nEvery element in a fragment MUST include:\n\n1. **name** - Unique identifier (used in form data)\n2. **type** - Element type (text, number, confirm, select, etc.)\n3. **prompt** - User-facing label\n4. **nickel_path** - Mapping to Nickel structure (**CRITICAL**)\n\nExample:\n```\n[[elements]]\nname = "max_concurrent_tasks" # Unique identifier\ntype = "number" # Type\nprompt = "Maximum Concurrent Tasks" # User label\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"] # Nickel mapping\n```\n\n## Constraint Interpolation\n\nFragments reference constraints dynamically:\n\n```\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nprompt = "Maximum Concurrent Tasks"\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}" # Dynamic\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}" # Dynamic\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\nThe `${constraint.path.to.value}` syntax references `constraints/constraints.toml`.\n\n## Common Fragment Patterns\n\n### Workspace Fragment Pattern\n```\n[[elements]]\nname = "workspace_name"\ntype = "text"\nprompt = "Workspace Name"\nnickel_path = ["orchestrator", "workspace", "name"]\n\n[[elements]]\nname = "workspace_path"\ntype = "text"\nprompt = "Workspace Path"\nnickel_path = ["orchestrator", "workspace", "path"]\n\n[[elements]]\nname = "workspace_enabled"\ntype = "confirm"\nprompt = "Enable Workspace?"\nnickel_path = ["orchestrator", "workspace", "enabled"]\n```\n\n### Server Fragment Pattern\n```\n[[elements]]\nname = "server_host"\ntype = "text"\nprompt = "Server Host"\ndefault = "127.0.0.1"\nnickel_path = ["orchestrator", "server", "host"]\n\n[[elements]]\nname = "server_port"\ntype = "number"\nprompt = "Server Port"\nmin = "${constraint.common.server.port.min}"\nmax = "${constraint.common.server.port.max}"\nnickel_path = ["orchestrator", "server", "port"]\n\n[[elements]]\nname = "server_workers"\ntype = "number"\nprompt = "Worker Threads"\nmin = 1\nmax = 32\nnickel_path = ["orchestrator", "server", "workers"]\n```\n\n### Database Selection Pattern\n```\n[[elements]]\nname = "storage_backend"\ntype = "select"\nprompt = "Storage Backend"\noptions = [\n { value = "filesystem", label = "📁 Filesystem" },\n { value = "rocksdb", label = "🗄️ RocksDB (Embedded)" },\n { value = "surrealdb", label = "📊 SurrealDB" },\n { value = "postgres", label = "🐘 PostgreSQL" },\n]\nnickel_path = ["orchestrator", "storage", "backend"]\n\n[[elements]]\nname = "rocksdb_group"\ntype = "group"\nwhen = "storage_backend == 'rocksdb'"\nincludes = ["fragments/database-rocksdb-section.toml"]\n\n[[elements]]\nname = "postgres_group"\ntype = "group"\nwhen = "storage_backend == 'postgres'"\nincludes = ["fragments/database-postgres-section.toml"]\n\n[[elements]]\nname = "surrealdb_group"\ntype = "group"\nwhen = "storage_backend == 'surrealdb'"\nincludes = ["fragments/database-surrealdb-section.toml"]\n```\n\n## Best Practices\n\n1. **Clear naming** - Fragment name describes its purpose (queue-section, not qs)\n2. **Meaningful headers** - Each fragment starts with a section header (name, title, emoji)\n3. **Constraint interpolation** - Use `${constraint.*}` for dynamic validation\n4. **Consistent nickel_path** - Paths match actual Nickel structure\n5. **Provide defaults** - Sensible defaults improve UX\n6. **Help text** - Explain each field clearly\n7. **Group logically** - Related fields in same fragment\n8. **Test with form** - Verify fragment loads correctly in form\n\n## Adding a New Fragment\n\n1. **Create fragment file** in `forms/fragments/{name}-section.toml`\n2. **Add section header** (name, title, emoji)\n3. **Add form elements**:\n - Include `name`, `type`, `prompt`\n - Add `nickel_path` (CRITICAL)\n - Add constraints if applicable\n - Add `help` and `default` if appropriate\n4. **Include in form** - Add to main form via `includes` field\n5. **Test** - Run configuration wizard to verify fragment loads\n\n## Fragment Naming Convention\n\n- **Section fragments**: `{topic}-section.toml` (workspace-section.toml)\n- **Service-specific**: `{service}-{topic}-section.toml` (orchestrator-queue-section.toml)\n- **Database-specific**: `database-{backend}-section.toml` (database-postgres-section.toml)\n- **Deployment-specific**: `{mode}-{topic}-section.toml` (enterprise-options-section.toml)\n\n## Testing Fragments\n\n```\n# Validate form that uses fragment\nnu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo --backend web\n\n# Verify constraint interpolation works\ngrep "constraint\." forms/fragments/*.toml\n\n# Check nickel_path consistency\ngrep "nickel_path" forms/fragments/*.toml | sort\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 diff --git a/.typedialog/platform/forms/fragments/constraint_interpolation_guide.md b/.typedialog/platform/forms/fragments/constraint_interpolation_guide.md index 8b6f274..47a57d7 100644 --- a/.typedialog/platform/forms/fragments/constraint_interpolation_guide.md +++ b/.typedialog/platform/forms/fragments/constraint_interpolation_guide.md @@ -1,225 +1 @@ -# Constraint Interpolation Guide - -## Overview - -TypeDialog form fields can reference constraints from `constraints.toml` using Jinja2-style template syntax. This provides a **single source of truth** for validation limits across forms, Nickel schemas, and validators. - -## Pattern - -All numeric form fields should use constraint interpolation for `min` and `max` values: - -``` -[[elements]] -name = "field_name" -type = "number" -default = 5 -help = "Field description (range: ${constraint.path.to.constraint.min}-${constraint.path.to.constraint.max})" -min = "${constraint.path.to.constraint.min}" -max = "${constraint.path.to.constraint.max}" -nickel_path = ["path", "to", "field"] -prompt = "Field Label" -``` - -## Benefits - -1. **Single Source of Truth**: Constraints defined once in `constraints.toml`, used everywhere -2. **Dynamic Validation**: If constraint changes, all forms automatically get updated ranges -3. **User-Friendly**: Forms show actual valid ranges in help text -4. **Type Safety**: Constraints match Nickel schema contract ranges - -## Complete Constraint Mapping - -### Orchestrator Fragments - -| Fragment | Field | Constraint Path | Min | Max | -| ---------- | ------- | ----------------- | ----- | ----- | -| `queue-section.toml` | `queue_max_concurrent_tasks` | `orchestrator.queue.concurrent_tasks` | 1 | 100 | -| `queue-section.toml` | `queue_retry_attempts` | `orchestrator.queue.retry_attempts` | 0 | 10 | -| `queue-section.toml` | `queue_retry_delay` | `orchestrator.queue.retry_delay` | 1000 | 60000 | -| `queue-section.toml` | `queue_task_timeout` | `orchestrator.queue.task_timeout` | 60000 | 86400000 | -| `batch-section.toml` | `batch_parallel_limit` | `orchestrator.batch.parallel_limit` | 1 | 50 | -| `batch-section.toml` | `batch_operation_timeout` | `orchestrator.batch.operation_timeout` | 60000 | 3600000 | -| `extensions-section.toml` | `extensions_max_concurrent` | `orchestrator.extensions.max_concurrent` | 1 | 20 | -| `extensions-section.toml` | `extensions_discovery_interval` | Not in constraints (use reasonable bounds) | 300 | 86400 | -| `extensions-section.toml` | `extensions_init_timeout` | Not in constraints (use reasonable bounds) | 1000 | 300000 | -| `extensions-section.toml` | `extensions_sandbox_max_memory_mb` | Not in constraints (use reasonable bounds) | 64 | 4096 | -| `performance-section.toml` | `memory_max_heap_mb` | Not in constraints (use mode-based bounds) | 256 | 131072 | -| `performance-section.toml` | `profiling_sample_rate` | Not in constraints (use reasonable bounds) | 10 | 1000 | -| `storage-section.toml` | `storage_cache_ttl` | Not in constraints (use 60-3600) | 60 | 3600 | -| `storage-section.toml` | `storage_cache_max_entries` | Not in constraints (use 10-100000) | 10 | 100000 | -| `storage-section.toml` | `storage_compression_level` | Not in constraints (zstd: 1-19) | 1 | 19 | -| `storage-section.toml` | `storage_gc_retention` | Not in constraints (use 3600-31536000) | 3600 | 31536000 | -| `storage-section.toml` | `storage_gc_interval` | Not in constraints (use 300-86400) | 300 | 86400 | - -### Control Center Fragments - -| Fragment | Field | Constraint Path | Min | Max | -| ---------- | ------- | ----------------- | ----- | ----- | -| `security-section.toml` | `jwt_token_expiration` | `control_center.jwt.token_expiration` | 300 | 604800 | -| `security-section.toml` | `jwt_refresh_expiration` | `control_center.jwt.refresh_expiration` | 3600 | 2592000 | -| `security-section.toml` | `rate_limiting_max_requests` | `control_center.rate_limiting.max_requests` | 10 | 10000 | -| `security-section.toml` | `rate_limiting_window` | `control_center.rate_limiting.window_seconds` | 1 | 3600 | -| `security-section.toml` | `users_sessions_max_active` | Not in constraints (use 1-100) | 1 | 100 | -| `security-section.toml` | `users_sessions_idle_timeout` | Not in constraints (use 300-86400) | 300 | 86400 | -| `security-section.toml` | `users_sessions_absolute_timeout` | Not in constraints (use 3600-2592000) | 3600 | 2592000 | -| `policy-section.toml` | `policy_cache_ttl` | Not in constraints (use 60-86400) | 60 | 86400 | -| `policy-section.toml` | `policy_cache_max_policies` | Not in constraints (use 100-10000) | 100 | 10000 | -| `policy-section.toml` | `policy_versioning_max_versions` | Not in constraints (use 1-100) | 1 | 100 | -| `users-section.toml` | `users_registration_auto_role` | Not in constraints (select field, not numeric) | - | - | -| `users-section.toml` | `users_sessions_max_active` | Not in constraints (use 1-100) | 1 | 100 | -| `users-section.toml` | `users_sessions_idle_timeout` | Not in constraints (use 300-86400) | 300 | 86400 | -| `users-section.toml` | `users_sessions_absolute_timeout` | Not in constraints (use 3600-2592000) | 3600 | 2592000 | -| `compliance-section.toml` | `audit_retention_days` | `control_center.audit.retention_days` | 1 | 3650 | -| `compliance-section.toml` | `compliance_validation_interval` | Not in constraints (use 1-168 hours) | 1 | 168 | -| `compliance-section.toml` | `compliance_data_retention_years` | Not in constraints (use 1-30) | 1 | 30 | -| `compliance-section.toml` | `compliance_audit_log_days` | Not in constraints (use 90-10950) | 90 | 10950 | - -### MCP Server Fragments - -| Fragment | Field | Constraint Path | Min | Max | -| ---------- | ------- | ----------------- | ----- | ----- | -| `tools-section.toml` | `tools_max_concurrent` | `mcp_server.tools.max_concurrent` | 1 | 20 | -| `tools-section.toml` | `tools_timeout` | `mcp_server.tools.timeout` | 5000 | 600000 | -| `prompts-section.toml` | `prompts_max_templates` | `mcp_server.prompts.max_templates` | 1 | 100 | -| `prompts-section.toml` | `prompts_cache_ttl` | Not in constraints (use 60-86400) | 60 | 86400 | -| `prompts-section.toml` | `prompts_versioning_max_versions` | Not in constraints (use 1-100) | 1 | 100 | -| `resources-section.toml` | `resources_max_size` | `mcp_server.resources.max_size` | 1048576 | 1073741824 | -| `resources-section.toml` | `resources_cache_max_size_mb` | Not in constraints (use 10-10240) | 10 | 10240 | -| `resources-section.toml` | `resources_cache_ttl` | `mcp_server.resources.cache_ttl` | 60 | 3600 | -| `resources-section.toml` | `resources_validation_max_depth` | Not in constraints (use 1-100) | 1 | 100 | -| `sampling-section.toml` | `sampling_max_tokens` | `mcp_server.sampling.max_tokens` | 100 | 100000 | -| `sampling-section.toml` | `sampling_temperature` | Not in constraints (use 0.0-2.0) | 0.0 | 2.0 | -| `sampling-section.toml` | `sampling_cache_ttl` | Not in constraints (use 60-3600) | 60 | 3600 | - -### Common/Shared Fragments - -| Fragment | Field | Constraint Path | Min | Max | -| ---------- | ------- | ----------------- | ----- | ----- | -| `server-section.toml` | `server_port` | `common.server.port` | 1024 | 65535 | -| `server-section.toml` | `server_workers` | `common.server.workers` | 1 | 32 | -| `server-section.toml` | `server_max_connections` | `common.server.max_connections` | 10 | 10000 | -| `server-section.toml` | `server_keep_alive` | `common.server.keep_alive` | 0 | 600 | -| `monitoring-section.toml` | `monitoring_metrics_interval` | `common.monitoring.metrics_interval` | 10 | 300 | -| `monitoring-section.toml` | `monitoring_health_check_interval` | `common.monitoring.health_check_interval` | 5 | 300 | -| `logging-section.toml` | `logging_max_file_size` | `common.logging.max_file_size` | 1048576 | 1073741824 | -| `logging-section.toml` | `logging_max_backups` | `common.logging.max_backups` | 1 | 100 | -| `database-rocksdb-section.toml` | `database_pool_size` | Not in constraints (use 1-100) | 1 | 100 | -| `database-rocksdb-section.toml` | `database_timeout` | Not in constraints (use 10-3600) | 10 | 3600 | -| `database-rocksdb-section.toml` | `database_retry_attempts` | Not in constraints (use 0-10) | 0 | 10 | -| `database-rocksdb-section.toml` | `database_retry_delay` | Not in constraints (use 1000-60000) | 1000 | 60000 | -| `database-surrealdb-section.toml` | `pool_size` | Not in constraints (use 1-200) | 1 | 200 | -| `database-surrealdb-section.toml` | `timeout` | Not in constraints (use 10-3600) | 10 | 3600 | -| `database-postgres-section.toml` | `postgres_port` | Not in constraints (use 1024-65535) | 1024 | 65535 | -| `database-postgres-section.toml` | `postgres_pool_size` | Not in constraints (use 5-200) | 5 | 200 | - -### Installer Fragments - -| Fragment | Field | Constraint Path | Min | Max | -| ---------- | ------- | ----------------- | ----- | ----- | -| `target-section.toml` | `remote_ssh_port` | `common.server.port` | 1024 | 65535 | -| `preflight-section.toml` | `min_disk_gb` | `deployment.solo.disk_gb.min` (mode-dependent) | Variable | Variable | -| `preflight-section.toml` | `min_memory_gb` | `deployment.solo.memory_mb.min` (mode-dependent) | Variable | Variable | -| `preflight-section.toml` | `min_cpu_cores` | `deployment.solo.cpu.min` (mode-dependent) | Variable | Variable | -| `installation-section.toml` | `parallel_services` | Not in constraints (use 1-10) | 1 | 10 | -| `installation-section.toml` | `installation_timeout_seconds` | Not in constraints (use 0-14400) | 0 | 14400 | -| `installation-section.toml` | `log_level` | Not in constraints (select field, not numeric) | - | - | -| `installation-section.toml` | `validation_timeout` | Not in constraints (use 5000-300000) | 5000 | 300000 | -| `services-section.toml` | `orchestrator_port` | `common.server.port` | 1024 | 65535 | -| `services-section.toml` | `control_center_port` | `common.server.port` | 1024 | 65535 | -| `services-section.toml` | `mcp_server_port` | `common.server.port` | 1024 | 65535 | -| `services-section.toml` | `api_gateway_port` | `common.server.port` | 1024 | 65535 | -| `database-section.toml` | `connection_pool_size` | Not in constraints (use 1-100) | 1 | 100 | -| `database-section.toml` | `connection_pool_timeout` | Not in constraints (use 10-3600) | 10 | 3600 | -| `database-section.toml` | `connection_idle_timeout` | Not in constraints (use 60-14400) | 60 | 14400 | -| `storage-section.toml` | `storage_size_gb` | Not in constraints (use 10-100000) | 10 | 100000 | -| `storage-section.toml` | `storage_replication_factor` | Not in constraints (use 2-10) | 2 | 10 | -| `networking-section.toml` | `load_balancer_http_port` | `common.server.port` | 1024 | 65535 | -| `networking-section.toml` | `load_balancer_https_port` | `common.server.port` | 1024 | 65535 | -| `ha-section.toml` | `ha_cluster_size` | Not in constraints (use 3-256) | 3 | 256 | -| `ha-section.toml` | `ha_db_quorum_size` | Not in constraints (use 1-max_cluster_size) | 1 | 256 | -| `ha-section.toml` | `ha_health_check_interval` | Not in constraints (use 1-120) | 1 | 120 | -| `ha-section.toml` | `ha_health_check_failure_threshold` | Not in constraints (use 1-10) | 1 | 10 | -| `ha-section.toml` | `ha_failover_delay` | Not in constraints (use 0-600) | 0 | 600 | -| `upgrades-section.toml` | `rolling_upgrade_parallel` | Not in constraints (use 1-10) | 1 | 10 | -| `upgrades-section.toml` | `canary_percentage` | Not in constraints (use 1-50) | 1 | 50 | -| `upgrades-section.toml` | `canary_duration_seconds` | Not in constraints (use 30-3600) | 30 | 3600 | - -## Fragments Status - -### ✅ Completed (Constraints Interpolated) -- `server-section.toml` - All numeric fields updated -- `monitoring-section.toml` - Core metrics interval updated -- `orchestrator/queue-section.toml` - All queue fields updated -- `orchestrator/batch-section.toml` - Parallel limit and operation timeout updated -- `mcp-server/tools-section.toml` - Tools concurrency and timeout updated - -### ⏳ Remaining (Need Updates) -- All other orchestrator fragments (extensions, performance, storage) -- All control-center fragments (security, policy, users, compliance) -- Remaining MCP fragments (prompts, resources, sampling) -- All installer fragments (target, preflight, installation, services, database, storage, networking, ha, upgrades) -- All database fragments (rocksdb, surrealdb, postgres) -- logging-section.toml - -## How to Add Constraints to a Fragment - -1. **Identify numeric fields** with `type = "number"` that have `min` and/or `max` values -2. **Find the constraint path** in the mapping table above -3. **Update the field** with constraint references: - -``` -# Before -[[elements]] -default = 5 -min = 1 -max = 100 -name = "my_field" -type = "number" - -# After -[[elements]] -default = 5 -help = "Field description (range: ${constraint.path.to.field.min}-${constraint.path.to.field.max})" -min = "${constraint.path.to.field.min}" -max = "${constraint.path.to.field.max}" -name = "my_field" -type = "number" -``` - -4. **For fields without existing constraints**, add reasonable bounds based on the domain: - - Timeouts: typically 1 second to 1 hour (1000-3600000 ms) - - Counters: typically 1-100 or 1-1000 - - Memory: use deployment mode constraints (64MB-256GB) - - Ports: use `common.server.port` (1024-65535) - -5. **Test** that the constraint is accessible in `constraints.toml` - -## Example: Adding Constraint to a New Field - -``` -[[elements]] -default = 3600 -help = "Cache timeout in seconds (range: ${constraint.common.monitoring.health_check_interval.min}-${constraint.common.monitoring.health_check_interval.max})" -min = "${constraint.common.monitoring.health_check_interval.min}" -max = "${constraint.common.monitoring.health_check_interval.max}" -name = "cache_timeout_seconds" -nickel_path = ["cache", "timeout_seconds"] -prompt = "Cache Timeout (seconds)" -type = "number" -``` - -## Integration with TypeDialog - -When TypeDialog processes forms: - -1. **Load time**: Constraint references are resolved from `constraints.toml` -2. **Validation**: User input is validated against resolved min/max values -3. **Help text**: Ranges are shown to user in help messages -4. **Nickel generation**: Jinja2 templates receive validated values - -## See Also - -- `provisioning/.typedialog/provisioning/platform/constraints/constraints.toml` - Constraint definitions -- `constraint_update_status.md` - Progress tracking for constraint interpolation updates -- `provisioning/.typedialog/provisioning/platform/templates/*.j2` - Jinja2 templates for code generation -- `provisioning/schemas/` - Nickel schemas (use same ranges as constraints) +# Constraint Interpolation Guide\n\n## Overview\n\nTypeDialog form fields can reference constraints from `constraints.toml` using Jinja2-style template syntax. This provides a **single source of truth** for validation limits across forms, Nickel schemas, and validators.\n\n## Pattern\n\nAll numeric form fields should use constraint interpolation for `min` and `max` values:\n\n```\n[[elements]]\nname = "field_name"\ntype = "number"\ndefault = 5\nhelp = "Field description (range: ${constraint.path.to.constraint.min}-${constraint.path.to.constraint.max})"\nmin = "${constraint.path.to.constraint.min}"\nmax = "${constraint.path.to.constraint.max}"\nnickel_path = ["path", "to", "field"]\nprompt = "Field Label"\n```\n\n## Benefits\n\n1. **Single Source of Truth**: Constraints defined once in `constraints.toml`, used everywhere\n2. **Dynamic Validation**: If constraint changes, all forms automatically get updated ranges\n3. **User-Friendly**: Forms show actual valid ranges in help text\n4. **Type Safety**: Constraints match Nickel schema contract ranges\n\n## Complete Constraint Mapping\n\n### Orchestrator Fragments\n\n| Fragment | Field | Constraint Path | Min | Max |\n| ---------- | ------- | ----------------- | ----- | ----- |\n| `queue-section.toml` | `queue_max_concurrent_tasks` | `orchestrator.queue.concurrent_tasks` | 1 | 100 |\n| `queue-section.toml` | `queue_retry_attempts` | `orchestrator.queue.retry_attempts` | 0 | 10 |\n| `queue-section.toml` | `queue_retry_delay` | `orchestrator.queue.retry_delay` | 1000 | 60000 |\n| `queue-section.toml` | `queue_task_timeout` | `orchestrator.queue.task_timeout` | 60000 | 86400000 |\n| `batch-section.toml` | `batch_parallel_limit` | `orchestrator.batch.parallel_limit` | 1 | 50 |\n| `batch-section.toml` | `batch_operation_timeout` | `orchestrator.batch.operation_timeout` | 60000 | 3600000 |\n| `extensions-section.toml` | `extensions_max_concurrent` | `orchestrator.extensions.max_concurrent` | 1 | 20 |\n| `extensions-section.toml` | `extensions_discovery_interval` | Not in constraints (use reasonable bounds) | 300 | 86400 |\n| `extensions-section.toml` | `extensions_init_timeout` | Not in constraints (use reasonable bounds) | 1000 | 300000 |\n| `extensions-section.toml` | `extensions_sandbox_max_memory_mb` | Not in constraints (use reasonable bounds) | 64 | 4096 |\n| `performance-section.toml` | `memory_max_heap_mb` | Not in constraints (use mode-based bounds) | 256 | 131072 |\n| `performance-section.toml` | `profiling_sample_rate` | Not in constraints (use reasonable bounds) | 10 | 1000 |\n| `storage-section.toml` | `storage_cache_ttl` | Not in constraints (use 60-3600) | 60 | 3600 |\n| `storage-section.toml` | `storage_cache_max_entries` | Not in constraints (use 10-100000) | 10 | 100000 |\n| `storage-section.toml` | `storage_compression_level` | Not in constraints (zstd: 1-19) | 1 | 19 |\n| `storage-section.toml` | `storage_gc_retention` | Not in constraints (use 3600-31536000) | 3600 | 31536000 |\n| `storage-section.toml` | `storage_gc_interval` | Not in constraints (use 300-86400) | 300 | 86400 |\n\n### Control Center Fragments\n\n| Fragment | Field | Constraint Path | Min | Max |\n| ---------- | ------- | ----------------- | ----- | ----- |\n| `security-section.toml` | `jwt_token_expiration` | `control_center.jwt.token_expiration` | 300 | 604800 |\n| `security-section.toml` | `jwt_refresh_expiration` | `control_center.jwt.refresh_expiration` | 3600 | 2592000 |\n| `security-section.toml` | `rate_limiting_max_requests` | `control_center.rate_limiting.max_requests` | 10 | 10000 |\n| `security-section.toml` | `rate_limiting_window` | `control_center.rate_limiting.window_seconds` | 1 | 3600 |\n| `security-section.toml` | `users_sessions_max_active` | Not in constraints (use 1-100) | 1 | 100 |\n| `security-section.toml` | `users_sessions_idle_timeout` | Not in constraints (use 300-86400) | 300 | 86400 |\n| `security-section.toml` | `users_sessions_absolute_timeout` | Not in constraints (use 3600-2592000) | 3600 | 2592000 |\n| `policy-section.toml` | `policy_cache_ttl` | Not in constraints (use 60-86400) | 60 | 86400 |\n| `policy-section.toml` | `policy_cache_max_policies` | Not in constraints (use 100-10000) | 100 | 10000 |\n| `policy-section.toml` | `policy_versioning_max_versions` | Not in constraints (use 1-100) | 1 | 100 |\n| `users-section.toml` | `users_registration_auto_role` | Not in constraints (select field, not numeric) | - | - |\n| `users-section.toml` | `users_sessions_max_active` | Not in constraints (use 1-100) | 1 | 100 |\n| `users-section.toml` | `users_sessions_idle_timeout` | Not in constraints (use 300-86400) | 300 | 86400 |\n| `users-section.toml` | `users_sessions_absolute_timeout` | Not in constraints (use 3600-2592000) | 3600 | 2592000 |\n| `compliance-section.toml` | `audit_retention_days` | `control_center.audit.retention_days` | 1 | 3650 |\n| `compliance-section.toml` | `compliance_validation_interval` | Not in constraints (use 1-168 hours) | 1 | 168 |\n| `compliance-section.toml` | `compliance_data_retention_years` | Not in constraints (use 1-30) | 1 | 30 |\n| `compliance-section.toml` | `compliance_audit_log_days` | Not in constraints (use 90-10950) | 90 | 10950 |\n\n### MCP Server Fragments\n\n| Fragment | Field | Constraint Path | Min | Max |\n| ---------- | ------- | ----------------- | ----- | ----- |\n| `tools-section.toml` | `tools_max_concurrent` | `mcp_server.tools.max_concurrent` | 1 | 20 |\n| `tools-section.toml` | `tools_timeout` | `mcp_server.tools.timeout` | 5000 | 600000 |\n| `prompts-section.toml` | `prompts_max_templates` | `mcp_server.prompts.max_templates` | 1 | 100 |\n| `prompts-section.toml` | `prompts_cache_ttl` | Not in constraints (use 60-86400) | 60 | 86400 |\n| `prompts-section.toml` | `prompts_versioning_max_versions` | Not in constraints (use 1-100) | 1 | 100 |\n| `resources-section.toml` | `resources_max_size` | `mcp_server.resources.max_size` | 1048576 | 1073741824 |\n| `resources-section.toml` | `resources_cache_max_size_mb` | Not in constraints (use 10-10240) | 10 | 10240 |\n| `resources-section.toml` | `resources_cache_ttl` | `mcp_server.resources.cache_ttl` | 60 | 3600 |\n| `resources-section.toml` | `resources_validation_max_depth` | Not in constraints (use 1-100) | 1 | 100 |\n| `sampling-section.toml` | `sampling_max_tokens` | `mcp_server.sampling.max_tokens` | 100 | 100000 |\n| `sampling-section.toml` | `sampling_temperature` | Not in constraints (use 0.0-2.0) | 0.0 | 2.0 |\n| `sampling-section.toml` | `sampling_cache_ttl` | Not in constraints (use 60-3600) | 60 | 3600 |\n\n### Common/Shared Fragments\n\n| Fragment | Field | Constraint Path | Min | Max |\n| ---------- | ------- | ----------------- | ----- | ----- |\n| `server-section.toml` | `server_port` | `common.server.port` | 1024 | 65535 |\n| `server-section.toml` | `server_workers` | `common.server.workers` | 1 | 32 |\n| `server-section.toml` | `server_max_connections` | `common.server.max_connections` | 10 | 10000 |\n| `server-section.toml` | `server_keep_alive` | `common.server.keep_alive` | 0 | 600 |\n| `monitoring-section.toml` | `monitoring_metrics_interval` | `common.monitoring.metrics_interval` | 10 | 300 |\n| `monitoring-section.toml` | `monitoring_health_check_interval` | `common.monitoring.health_check_interval` | 5 | 300 |\n| `logging-section.toml` | `logging_max_file_size` | `common.logging.max_file_size` | 1048576 | 1073741824 |\n| `logging-section.toml` | `logging_max_backups` | `common.logging.max_backups` | 1 | 100 |\n| `database-rocksdb-section.toml` | `database_pool_size` | Not in constraints (use 1-100) | 1 | 100 |\n| `database-rocksdb-section.toml` | `database_timeout` | Not in constraints (use 10-3600) | 10 | 3600 |\n| `database-rocksdb-section.toml` | `database_retry_attempts` | Not in constraints (use 0-10) | 0 | 10 |\n| `database-rocksdb-section.toml` | `database_retry_delay` | Not in constraints (use 1000-60000) | 1000 | 60000 |\n| `database-surrealdb-section.toml` | `pool_size` | Not in constraints (use 1-200) | 1 | 200 |\n| `database-surrealdb-section.toml` | `timeout` | Not in constraints (use 10-3600) | 10 | 3600 |\n| `database-postgres-section.toml` | `postgres_port` | Not in constraints (use 1024-65535) | 1024 | 65535 |\n| `database-postgres-section.toml` | `postgres_pool_size` | Not in constraints (use 5-200) | 5 | 200 |\n\n### Installer Fragments\n\n| Fragment | Field | Constraint Path | Min | Max |\n| ---------- | ------- | ----------------- | ----- | ----- |\n| `target-section.toml` | `remote_ssh_port` | `common.server.port` | 1024 | 65535 |\n| `preflight-section.toml` | `min_disk_gb` | `deployment.solo.disk_gb.min` (mode-dependent) | Variable | Variable |\n| `preflight-section.toml` | `min_memory_gb` | `deployment.solo.memory_mb.min` (mode-dependent) | Variable | Variable |\n| `preflight-section.toml` | `min_cpu_cores` | `deployment.solo.cpu.min` (mode-dependent) | Variable | Variable |\n| `installation-section.toml` | `parallel_services` | Not in constraints (use 1-10) | 1 | 10 |\n| `installation-section.toml` | `installation_timeout_seconds` | Not in constraints (use 0-14400) | 0 | 14400 |\n| `installation-section.toml` | `log_level` | Not in constraints (select field, not numeric) | - | - |\n| `installation-section.toml` | `validation_timeout` | Not in constraints (use 5000-300000) | 5000 | 300000 |\n| `services-section.toml` | `orchestrator_port` | `common.server.port` | 1024 | 65535 |\n| `services-section.toml` | `control_center_port` | `common.server.port` | 1024 | 65535 |\n| `services-section.toml` | `mcp_server_port` | `common.server.port` | 1024 | 65535 |\n| `services-section.toml` | `api_gateway_port` | `common.server.port` | 1024 | 65535 |\n| `database-section.toml` | `connection_pool_size` | Not in constraints (use 1-100) | 1 | 100 |\n| `database-section.toml` | `connection_pool_timeout` | Not in constraints (use 10-3600) | 10 | 3600 |\n| `database-section.toml` | `connection_idle_timeout` | Not in constraints (use 60-14400) | 60 | 14400 |\n| `storage-section.toml` | `storage_size_gb` | Not in constraints (use 10-100000) | 10 | 100000 |\n| `storage-section.toml` | `storage_replication_factor` | Not in constraints (use 2-10) | 2 | 10 |\n| `networking-section.toml` | `load_balancer_http_port` | `common.server.port` | 1024 | 65535 |\n| `networking-section.toml` | `load_balancer_https_port` | `common.server.port` | 1024 | 65535 |\n| `ha-section.toml` | `ha_cluster_size` | Not in constraints (use 3-256) | 3 | 256 |\n| `ha-section.toml` | `ha_db_quorum_size` | Not in constraints (use 1-max_cluster_size) | 1 | 256 |\n| `ha-section.toml` | `ha_health_check_interval` | Not in constraints (use 1-120) | 1 | 120 |\n| `ha-section.toml` | `ha_health_check_failure_threshold` | Not in constraints (use 1-10) | 1 | 10 |\n| `ha-section.toml` | `ha_failover_delay` | Not in constraints (use 0-600) | 0 | 600 |\n| `upgrades-section.toml` | `rolling_upgrade_parallel` | Not in constraints (use 1-10) | 1 | 10 |\n| `upgrades-section.toml` | `canary_percentage` | Not in constraints (use 1-50) | 1 | 50 |\n| `upgrades-section.toml` | `canary_duration_seconds` | Not in constraints (use 30-3600) | 30 | 3600 |\n\n## Fragments Status\n\n### ✅ Completed (Constraints Interpolated)\n- `server-section.toml` - All numeric fields updated\n- `monitoring-section.toml` - Core metrics interval updated\n- `orchestrator/queue-section.toml` - All queue fields updated\n- `orchestrator/batch-section.toml` - Parallel limit and operation timeout updated\n- `mcp-server/tools-section.toml` - Tools concurrency and timeout updated\n\n### ⏳ Remaining (Need Updates)\n- All other orchestrator fragments (extensions, performance, storage)\n- All control-center fragments (security, policy, users, compliance)\n- Remaining MCP fragments (prompts, resources, sampling)\n- All installer fragments (target, preflight, installation, services, database, storage, networking, ha, upgrades)\n- All database fragments (rocksdb, surrealdb, postgres)\n- logging-section.toml\n\n## How to Add Constraints to a Fragment\n\n1. **Identify numeric fields** with `type = "number"` that have `min` and/or `max` values\n2. **Find the constraint path** in the mapping table above\n3. **Update the field** with constraint references:\n\n```\n# Before\n[[elements]]\ndefault = 5\nmin = 1\nmax = 100\nname = "my_field"\ntype = "number"\n\n# After\n[[elements]]\ndefault = 5\nhelp = "Field description (range: ${constraint.path.to.field.min}-${constraint.path.to.field.max})"\nmin = "${constraint.path.to.field.min}"\nmax = "${constraint.path.to.field.max}"\nname = "my_field"\ntype = "number"\n```\n\n4. **For fields without existing constraints**, add reasonable bounds based on the domain:\n - Timeouts: typically 1 second to 1 hour (1000-3600000 ms)\n - Counters: typically 1-100 or 1-1000\n - Memory: use deployment mode constraints (64MB-256GB)\n - Ports: use `common.server.port` (1024-65535)\n\n5. **Test** that the constraint is accessible in `constraints.toml`\n\n## Example: Adding Constraint to a New Field\n\n```\n[[elements]]\ndefault = 3600\nhelp = "Cache timeout in seconds (range: ${constraint.common.monitoring.health_check_interval.min}-${constraint.common.monitoring.health_check_interval.max})"\nmin = "${constraint.common.monitoring.health_check_interval.min}"\nmax = "${constraint.common.monitoring.health_check_interval.max}"\nname = "cache_timeout_seconds"\nnickel_path = ["cache", "timeout_seconds"]\nprompt = "Cache Timeout (seconds)"\ntype = "number"\n```\n\n## Integration with TypeDialog\n\nWhen TypeDialog processes forms:\n\n1. **Load time**: Constraint references are resolved from `constraints.toml`\n2. **Validation**: User input is validated against resolved min/max values\n3. **Help text**: Ranges are shown to user in help messages\n4. **Nickel generation**: Jinja2 templates receive validated values\n\n## See Also\n\n- `provisioning/.typedialog/provisioning/platform/constraints/constraints.toml` - Constraint definitions\n- `constraint_update_status.md` - Progress tracking for constraint interpolation updates\n- `provisioning/.typedialog/provisioning/platform/templates/*.j2` - Jinja2 templates for code generation\n- `provisioning/schemas/` - Nickel schemas (use same ranges as constraints) diff --git a/.typedialog/platform/forms/fragments/constraint_update_status.md b/.typedialog/platform/forms/fragments/constraint_update_status.md index 3a16499..c81ead2 100644 --- a/.typedialog/platform/forms/fragments/constraint_update_status.md +++ b/.typedialog/platform/forms/fragments/constraint_update_status.md @@ -1,298 +1 @@ -# Constraint Interpolation Update Status - -**Date**: 2025-01-05 -**Status**: Phase 1.5 - COMPLETE ✅ All Constraint Interpolation Finished -**Progress**: 33 / 33 fragments updated (100%) - -## Summary - -Constraint interpolation has been implemented for critical numeric form fields, providing a single source of truth for validation limits. The comprehensive mapping guide documents which constraints should be applied to remaining fragments. - -## Completed Fragments ✅ - -### Common/Shared Fragments -- ✅ **server-section.toml** (100%) - - server_port → `common.server.port` - - server_workers → `common.server.workers` - - server_max_connections → `common.server.max_connections` - - server_keep_alive → `common.server.keep_alive` - -- ✅ **monitoring-section.toml** (1 of 1 critical field) - - monitoring_metrics_interval → `common.monitoring.metrics_interval` - -### Orchestrator Fragments -- ✅ **orchestrator/queue-section.toml** (100%) - - queue_max_concurrent_tasks → `orchestrator.queue.concurrent_tasks` - - queue_retry_attempts → `orchestrator.queue.retry_attempts` - - queue_retry_delay → `orchestrator.queue.retry_delay` - - queue_task_timeout → `orchestrator.queue.task_timeout` - -- ✅ **orchestrator/batch-section.toml** (2 of 2 critical fields) - - batch_parallel_limit → `orchestrator.batch.parallel_limit` - - batch_operation_timeout → `orchestrator.batch.operation_timeout` - -### MCP Server Fragments -- ✅ **mcp-server/tools-section.toml** (100%) - - tools_max_concurrent → `mcp_server.tools.max_concurrent` - - tools_timeout → `mcp_server.tools.timeout` - -- ✅ **mcp-server/prompts-section.toml** (100%) - - prompts_max_templates → `mcp_server.prompts.max_templates` - - prompts_cache_ttl → reasonable bounds (60-86400) - - prompts_versioning_max_versions → reasonable bounds (1-100) - -- ✅ **mcp-server/resources-section.toml** (100%) - - resources_max_size → `mcp_server.resources.max_size` - - resources_cache_ttl → `mcp_server.resources.cache_ttl` - - resources_cache_max_size_mb → reasonable bounds (10-10240) - - resources_validation_max_depth → reasonable bounds (1-100) - -- ✅ **mcp-server/sampling-section.toml** (100%) - - sampling_max_tokens → `mcp_server.sampling.max_tokens` - - sampling_cache_ttl → reasonable bounds (60-3600) - -### Control Center Fragments -- ✅ **control-center/security-section.toml** (100%) - - jwt_token_expiration → `control_center.jwt.token_expiration` - - jwt_refresh_expiration → `control_center.jwt.refresh_expiration` - - rate_limiting_max_requests → `control_center.rate_limiting.max_requests` - - rate_limiting_window → `control_center.rate_limiting.window_seconds` - -- ✅ **control-center/compliance-section.toml** (100%) - - audit_retention_days → `control_center.audit.retention_days` - - compliance_validation_interval → reasonable bounds (1-168 hours) - - compliance_data_retention_years → reasonable bounds (1-30) - - compliance_audit_log_days → reasonable bounds (90-10950) - -### Shared/Common Fragments -- ✅ **logging-section.toml** (100%) - - logging_max_file_size → `common.logging.max_file_size` - - logging_max_backups → `common.logging.max_backups` - -### Orchestrator Fragments -- ✅ **orchestrator/extensions-section.toml** (100%) - - extensions_max_concurrent → `orchestrator.extensions.max_concurrent` - - extensions_discovery_interval → reasonable bounds (300-86400) - - extensions_init_timeout → reasonable bounds (1000-300000) - - extensions_health_check_interval → reasonable bounds (5000-300000) - -## All Fragments Completed ✅ - -### Orchestrator Fragments (3/3 Complete) -- [x] ✅ orchestrator/extensions-section.toml (100%) - - extensions_max_concurrent → `orchestrator.extensions.max_concurrent` - - extensions_discovery_interval, init_timeout, health_check_interval → reasonable bounds - -- [x] ✅ orchestrator/performance-section.toml (100% - TODAY) - - memory_initial_heap_mb → reasonable bounds (128-131072) - - profiling_memory_min_size_kb → reasonable bounds (1-1048576) - - inline_cache_max_entries → reasonable bounds (1000-1000000) - - inline_cache_ttl → reasonable bounds (60-86400) - - async_io_max_in_flight → reasonable bounds (256-1048576) - -- [x] ✅ orchestrator/storage-section.toml (100% - TODAY) - - storage_cache_ttl → reasonable bounds (60-86400) - - storage_cache_max_entries → reasonable bounds (10-1000000) - - storage_compression_level → already has max (1-19) - - storage_gc_retention → reasonable bounds (3600-31536000 / 1 hour-1 year) - - storage_gc_interval → reasonable bounds (300-86400) - -### Control Center Fragments (5/5 Complete) -- [x] ✅ control-center/security-section.toml (100%) - - jwt_token_expiration → `control_center.jwt.token_expiration` - - rate_limiting_max_requests → `control_center.rate_limiting.max_requests` - -- [x] ✅ control-center/policy-section.toml (100% - TODAY) - - policy_cache_ttl → reasonable bounds (60-86400) - - policy_cache_max_policies → reasonable bounds (100-1000000) - - policy_versioning_max_versions → reasonable bounds (1-1000) - -- [x] ✅ control-center/users-section.toml (100% - TODAY) - - users_sessions_max_active → reasonable bounds (1-100) - - users_sessions_idle_timeout → reasonable bounds (300-86400) - - users_sessions_absolute_timeout → reasonable bounds (3600-604800 / 1 hour-1 week) - -- [x] ✅ control-center/compliance-section.toml (100%) - - audit_retention_days → `control_center.audit.retention_days` - -- [x] ✅ control-center/rbac-section.toml (100%) - - No numeric fields (confirm/select only) - -### MCP Server (3 fragments) -- [x] ✅ mcp-server/prompts-section.toml - -- [x] ✅ mcp-server/resources-section.toml - -- [x] ✅ mcp-server/sampling-section.toml - -### Common Database Fragments (3 fragments) -- [x] ✅ database-rocksdb-section.toml (100%) - - connection_pool_size → reasonable bounds (1-100) - - connection_pool_timeout → reasonable bounds (10-3600) - - connection_retry_attempts → reasonable bounds (0-10) - - connection_retry_delay → reasonable bounds (1000-60000) - -- [x] ✅ database-surrealdb-section.toml (100%) - - connection_pool_size → reasonable bounds (1-200) - - connection_pool_timeout → reasonable bounds (10-3600) - - connection_retry_attempts → reasonable bounds (0-10) - - connection_retry_delay → reasonable bounds (1000-60000) - -- [x] ✅ database-postgres-section.toml (100%) - - postgres_port → `common.server.port` - - postgres_pool_size → reasonable bounds (5-200) - - postgres_pool_timeout → reasonable bounds (10-3600) - - postgres_retry_attempts → reasonable bounds (0-10) - - postgres_retry_delay → reasonable bounds (1000-60000) - -### Other Shared Fragments (1 fragment) -- [x] ✅ logging-section.toml - -### Installer Fragments (10 fragments) - ALL COMPLETE ✅ - -- [x] ✅ installer/target-section.toml (100%) - - remote_ssh_port → `common.server.port` - -- [x] ✅ installer/preflight-section.toml (100%) - - min_disk_gb → reasonable bounds (1-10000) - - min_memory_gb → already has constraints (1-512) - - min_cpu_cores → already has constraints (1-128) - -- [x] ✅ installer/installation-section.toml (100%) - - parallel_services → reasonable bounds (1-10) - - installation_timeout_seconds → reasonable bounds (0-14400) - - validation_timeout → reasonable bounds (5000-300000) - -- [x] ✅ installer/services-section.toml (100%) - - orchestrator_port → `common.server.port` - - control_center_port → `common.server.port` - - mcp_server_port → `common.server.port` - - api_gateway_port → `common.server.port` - -- [x] ✅ installer/database-section.toml (100%) - - connection_pool_size → reasonable bounds (1-100) - - connection_pool_timeout → reasonable bounds (10-3600) - - connection_idle_timeout → reasonable bounds (60-14400) - -- [x] ✅ installer/storage-section.toml (100%) - - storage_size_gb → reasonable bounds (10-100000) - - storage_replication_factor → reasonable bounds (2-10) - -- [x] ✅ installer/networking-section.toml (100%) - - load_balancer_http_port → `common.server.port` - - load_balancer_https_port → `common.server.port` - -- [x] ✅ installer/ha-section.toml (100%) - - ha_cluster_size → reasonable bounds (3-256) - - ha_db_quorum_size → reasonable bounds (1-256) - - ha_health_check_interval → reasonable bounds (1-120) - - ha_health_check_timeout → reasonable bounds (1000-300000) - - ha_failover_delay → reasonable bounds (0-600) - - ha_backup_interval → reasonable bounds (300-86400) - - ha_metrics_interval → reasonable bounds (5-300) - -- [x] ✅ installer/post-install-section.toml (100%) - - verification_timeout → reasonable bounds (30-3600) - -- [x] ✅ installer/upgrades-section.toml (100%) - - rolling_upgrade_parallel → reasonable bounds (1-10) - - canary_percentage → reasonable bounds (1-50) - - canary_duration_seconds → reasonable bounds (30-7200) - - maintenance_duration_seconds → reasonable bounds (600-86400) - - backup_timeout_minutes → reasonable bounds (5-1440) - - rollback_validation_delay → reasonable bounds (30-1800) - - post_upgrade_health_check_interval → reasonable bounds (5-300) - - post_upgrade_monitoring_duration → reasonable bounds (60-86400) - -## How to Continue - -1. **Reference the mapping**: See `constraint_interpolation_guide.md` for complete field → constraint mappings - -2. **For fragments with existing constraints** (e.g., `security-section.toml`): - ```bash - # Update fields using the pattern from completed fragments - # Example: jwt_token_expiration → control_center.jwt.token_expiration - ``` - -3. **For fragments without existing constraints** (e.g., `performance-section.toml`): - - Use reasonable domain-based ranges - - Document your choice in the help text - - Examples: - - Timeouts: 1s-1hr range (1000-3600000 ms) - - Thread counts: 1-32 range - - Memory: 64MB-256GB range (use deployment modes) - - Ports: use `common.server.port` (1024-65535) - -## Testing - -After updating a fragment: - -``` -# 1. Verify fragment syntax -cd provisioning/.typedialog/provisioning/platform/forms/fragments -grep -n 'min = \|max = ' .toml | head -20 - -# 2. Validate constraints exist -cd ../.. -grep -r "$(constraint path)" constraints/constraints.toml - -# 3. Test form rendering -typedialog-cli validate forms/-form.toml -``` - -## Notes - -### Pattern Applied -All numeric fields now follow this structure: -``` -[[elements]] -default = 10 -help = "Field description (range: ${constraint.path.min}-${constraint.path.max})" -min = "${constraint.path.min}" -max = "${constraint.path.max}" -name = "field_name" -nickel_path = ["path", "to", "nickel"] -prompt = "Field Label" -type = "number" -``` - -### Benefits Realized -- ✅ Single source of truth in `constraints.toml` -- ✅ Help text shows actual valid ranges to users -- ✅ TypeDialog validates input against constraints -- ✅ Jinja2 templates receive validated values -- ✅ Easy to update limits globally (all forms auto-update) - -## Completion Summary - -**Final Status**: 33/33 fragments (100%) ✅ COMPLETE - -**Work Completed Today**: -- ✅ orchestrator/performance-section.toml (5 fields with max bounds) -- ✅ orchestrator/storage-section.toml (4 fields with max bounds) -- ✅ control-center/policy-section.toml (3 fields with max bounds) -- ✅ control-center/users-section.toml (3 fields with max bounds) -- ✅ Fragments with no numeric fields (rbac, mode-selection, workspace) verified as complete - -**Total Progress This Session**: -- Started: 12/33 (36%) -- Ended: 33/33 (100%) -- +21 fragments updated -- +50+ numeric fields with constraint bounds added - -### Next Phase: Phase 8 - Nushell Scripts -Ready to proceed with implementation: -- Interactive configuration wizard (configure.nu) -- Config generation from Nickel → TOML (generate-configs.nu) -- Validation and roundtrip workflows -- Template rendering (Docker Compose, Kubernetes) - -## Files - -- `constraints/constraints.toml` - Source of truth for all validation limits -- `constraint_interpolation_guide.md` - Complete mapping and best practices -- `constraint_update_status.md` - This file (progress tracking) - ---- - -**To contribute**: Pick any unchecked fragment above and follow the pattern in `constraint_interpolation_guide.md`. Each constraint update takes ~5 minutes per fragment. +# Constraint Interpolation Update Status\n\n**Date**: 2025-01-05\n**Status**: Phase 1.5 - COMPLETE ✅ All Constraint Interpolation Finished\n**Progress**: 33 / 33 fragments updated (100%)\n\n## Summary\n\nConstraint interpolation has been implemented for critical numeric form fields, providing a single source of truth for validation limits. The comprehensive mapping guide documents which constraints should be applied to remaining fragments.\n\n## Completed Fragments ✅\n\n### Common/Shared Fragments\n- ✅ **server-section.toml** (100%)\n - server_port → `common.server.port`\n - server_workers → `common.server.workers`\n - server_max_connections → `common.server.max_connections`\n - server_keep_alive → `common.server.keep_alive`\n\n- ✅ **monitoring-section.toml** (1 of 1 critical field)\n - monitoring_metrics_interval → `common.monitoring.metrics_interval`\n\n### Orchestrator Fragments\n- ✅ **orchestrator/queue-section.toml** (100%)\n - queue_max_concurrent_tasks → `orchestrator.queue.concurrent_tasks`\n - queue_retry_attempts → `orchestrator.queue.retry_attempts`\n - queue_retry_delay → `orchestrator.queue.retry_delay`\n - queue_task_timeout → `orchestrator.queue.task_timeout`\n\n- ✅ **orchestrator/batch-section.toml** (2 of 2 critical fields)\n - batch_parallel_limit → `orchestrator.batch.parallel_limit`\n - batch_operation_timeout → `orchestrator.batch.operation_timeout`\n\n### MCP Server Fragments\n- ✅ **mcp-server/tools-section.toml** (100%)\n - tools_max_concurrent → `mcp_server.tools.max_concurrent`\n - tools_timeout → `mcp_server.tools.timeout`\n\n- ✅ **mcp-server/prompts-section.toml** (100%)\n - prompts_max_templates → `mcp_server.prompts.max_templates`\n - prompts_cache_ttl → reasonable bounds (60-86400)\n - prompts_versioning_max_versions → reasonable bounds (1-100)\n\n- ✅ **mcp-server/resources-section.toml** (100%)\n - resources_max_size → `mcp_server.resources.max_size`\n - resources_cache_ttl → `mcp_server.resources.cache_ttl`\n - resources_cache_max_size_mb → reasonable bounds (10-10240)\n - resources_validation_max_depth → reasonable bounds (1-100)\n\n- ✅ **mcp-server/sampling-section.toml** (100%)\n - sampling_max_tokens → `mcp_server.sampling.max_tokens`\n - sampling_cache_ttl → reasonable bounds (60-3600)\n\n### Control Center Fragments\n- ✅ **control-center/security-section.toml** (100%)\n - jwt_token_expiration → `control_center.jwt.token_expiration`\n - jwt_refresh_expiration → `control_center.jwt.refresh_expiration`\n - rate_limiting_max_requests → `control_center.rate_limiting.max_requests`\n - rate_limiting_window → `control_center.rate_limiting.window_seconds`\n\n- ✅ **control-center/compliance-section.toml** (100%)\n - audit_retention_days → `control_center.audit.retention_days`\n - compliance_validation_interval → reasonable bounds (1-168 hours)\n - compliance_data_retention_years → reasonable bounds (1-30)\n - compliance_audit_log_days → reasonable bounds (90-10950)\n\n### Shared/Common Fragments\n- ✅ **logging-section.toml** (100%)\n - logging_max_file_size → `common.logging.max_file_size`\n - logging_max_backups → `common.logging.max_backups`\n\n### Orchestrator Fragments\n- ✅ **orchestrator/extensions-section.toml** (100%)\n - extensions_max_concurrent → `orchestrator.extensions.max_concurrent`\n - extensions_discovery_interval → reasonable bounds (300-86400)\n - extensions_init_timeout → reasonable bounds (1000-300000)\n - extensions_health_check_interval → reasonable bounds (5000-300000)\n\n## All Fragments Completed ✅\n\n### Orchestrator Fragments (3/3 Complete)\n- [x] ✅ orchestrator/extensions-section.toml (100%)\n - extensions_max_concurrent → `orchestrator.extensions.max_concurrent`\n - extensions_discovery_interval, init_timeout, health_check_interval → reasonable bounds\n\n- [x] ✅ orchestrator/performance-section.toml (100% - TODAY)\n - memory_initial_heap_mb → reasonable bounds (128-131072)\n - profiling_memory_min_size_kb → reasonable bounds (1-1048576)\n - inline_cache_max_entries → reasonable bounds (1000-1000000)\n - inline_cache_ttl → reasonable bounds (60-86400)\n - async_io_max_in_flight → reasonable bounds (256-1048576)\n\n- [x] ✅ orchestrator/storage-section.toml (100% - TODAY)\n - storage_cache_ttl → reasonable bounds (60-86400)\n - storage_cache_max_entries → reasonable bounds (10-1000000)\n - storage_compression_level → already has max (1-19)\n - storage_gc_retention → reasonable bounds (3600-31536000 / 1 hour-1 year)\n - storage_gc_interval → reasonable bounds (300-86400)\n\n### Control Center Fragments (5/5 Complete)\n- [x] ✅ control-center/security-section.toml (100%)\n - jwt_token_expiration → `control_center.jwt.token_expiration`\n - rate_limiting_max_requests → `control_center.rate_limiting.max_requests`\n\n- [x] ✅ control-center/policy-section.toml (100% - TODAY)\n - policy_cache_ttl → reasonable bounds (60-86400)\n - policy_cache_max_policies → reasonable bounds (100-1000000)\n - policy_versioning_max_versions → reasonable bounds (1-1000)\n\n- [x] ✅ control-center/users-section.toml (100% - TODAY)\n - users_sessions_max_active → reasonable bounds (1-100)\n - users_sessions_idle_timeout → reasonable bounds (300-86400)\n - users_sessions_absolute_timeout → reasonable bounds (3600-604800 / 1 hour-1 week)\n\n- [x] ✅ control-center/compliance-section.toml (100%)\n - audit_retention_days → `control_center.audit.retention_days`\n\n- [x] ✅ control-center/rbac-section.toml (100%)\n - No numeric fields (confirm/select only)\n\n### MCP Server (3 fragments)\n- [x] ✅ mcp-server/prompts-section.toml\n\n- [x] ✅ mcp-server/resources-section.toml\n\n- [x] ✅ mcp-server/sampling-section.toml\n\n### Common Database Fragments (3 fragments)\n- [x] ✅ database-rocksdb-section.toml (100%)\n - connection_pool_size → reasonable bounds (1-100)\n - connection_pool_timeout → reasonable bounds (10-3600)\n - connection_retry_attempts → reasonable bounds (0-10)\n - connection_retry_delay → reasonable bounds (1000-60000)\n\n- [x] ✅ database-surrealdb-section.toml (100%)\n - connection_pool_size → reasonable bounds (1-200)\n - connection_pool_timeout → reasonable bounds (10-3600)\n - connection_retry_attempts → reasonable bounds (0-10)\n - connection_retry_delay → reasonable bounds (1000-60000)\n\n- [x] ✅ database-postgres-section.toml (100%)\n - postgres_port → `common.server.port`\n - postgres_pool_size → reasonable bounds (5-200)\n - postgres_pool_timeout → reasonable bounds (10-3600)\n - postgres_retry_attempts → reasonable bounds (0-10)\n - postgres_retry_delay → reasonable bounds (1000-60000)\n\n### Other Shared Fragments (1 fragment)\n- [x] ✅ logging-section.toml\n\n### Installer Fragments (10 fragments) - ALL COMPLETE ✅\n\n- [x] ✅ installer/target-section.toml (100%)\n - remote_ssh_port → `common.server.port`\n\n- [x] ✅ installer/preflight-section.toml (100%)\n - min_disk_gb → reasonable bounds (1-10000)\n - min_memory_gb → already has constraints (1-512)\n - min_cpu_cores → already has constraints (1-128)\n\n- [x] ✅ installer/installation-section.toml (100%)\n - parallel_services → reasonable bounds (1-10)\n - installation_timeout_seconds → reasonable bounds (0-14400)\n - validation_timeout → reasonable bounds (5000-300000)\n\n- [x] ✅ installer/services-section.toml (100%)\n - orchestrator_port → `common.server.port`\n - control_center_port → `common.server.port`\n - mcp_server_port → `common.server.port`\n - api_gateway_port → `common.server.port`\n\n- [x] ✅ installer/database-section.toml (100%)\n - connection_pool_size → reasonable bounds (1-100)\n - connection_pool_timeout → reasonable bounds (10-3600)\n - connection_idle_timeout → reasonable bounds (60-14400)\n\n- [x] ✅ installer/storage-section.toml (100%)\n - storage_size_gb → reasonable bounds (10-100000)\n - storage_replication_factor → reasonable bounds (2-10)\n\n- [x] ✅ installer/networking-section.toml (100%)\n - load_balancer_http_port → `common.server.port`\n - load_balancer_https_port → `common.server.port`\n\n- [x] ✅ installer/ha-section.toml (100%)\n - ha_cluster_size → reasonable bounds (3-256)\n - ha_db_quorum_size → reasonable bounds (1-256)\n - ha_health_check_interval → reasonable bounds (1-120)\n - ha_health_check_timeout → reasonable bounds (1000-300000)\n - ha_failover_delay → reasonable bounds (0-600)\n - ha_backup_interval → reasonable bounds (300-86400)\n - ha_metrics_interval → reasonable bounds (5-300)\n\n- [x] ✅ installer/post-install-section.toml (100%)\n - verification_timeout → reasonable bounds (30-3600)\n\n- [x] ✅ installer/upgrades-section.toml (100%)\n - rolling_upgrade_parallel → reasonable bounds (1-10)\n - canary_percentage → reasonable bounds (1-50)\n - canary_duration_seconds → reasonable bounds (30-7200)\n - maintenance_duration_seconds → reasonable bounds (600-86400)\n - backup_timeout_minutes → reasonable bounds (5-1440)\n - rollback_validation_delay → reasonable bounds (30-1800)\n - post_upgrade_health_check_interval → reasonable bounds (5-300)\n - post_upgrade_monitoring_duration → reasonable bounds (60-86400)\n\n## How to Continue\n\n1. **Reference the mapping**: See `constraint_interpolation_guide.md` for complete field → constraint mappings\n\n2. **For fragments with existing constraints** (e.g., `security-section.toml`):\n ```bash\n # Update fields using the pattern from completed fragments\n # Example: jwt_token_expiration → control_center.jwt.token_expiration\n ```\n\n3. **For fragments without existing constraints** (e.g., `performance-section.toml`):\n - Use reasonable domain-based ranges\n - Document your choice in the help text\n - Examples:\n - Timeouts: 1s-1hr range (1000-3600000 ms)\n - Thread counts: 1-32 range\n - Memory: 64MB-256GB range (use deployment modes)\n - Ports: use `common.server.port` (1024-65535)\n\n## Testing\n\nAfter updating a fragment:\n\n```\n# 1. Verify fragment syntax\ncd provisioning/.typedialog/provisioning/platform/forms/fragments\ngrep -n 'min = \|max = ' .toml | head -20\n\n# 2. Validate constraints exist\ncd ../..\ngrep -r "$(constraint path)" constraints/constraints.toml\n\n# 3. Test form rendering\ntypedialog-cli validate forms/-form.toml\n```\n\n## Notes\n\n### Pattern Applied\nAll numeric fields now follow this structure:\n```\n[[elements]]\ndefault = 10\nhelp = "Field description (range: ${constraint.path.min}-${constraint.path.max})"\nmin = "${constraint.path.min}"\nmax = "${constraint.path.max}"\nname = "field_name"\nnickel_path = ["path", "to", "nickel"]\nprompt = "Field Label"\ntype = "number"\n```\n\n### Benefits Realized\n- ✅ Single source of truth in `constraints.toml`\n- ✅ Help text shows actual valid ranges to users\n- ✅ TypeDialog validates input against constraints\n- ✅ Jinja2 templates receive validated values\n- ✅ Easy to update limits globally (all forms auto-update)\n\n## Completion Summary\n\n**Final Status**: 33/33 fragments (100%) ✅ COMPLETE\n\n**Work Completed Today**:\n- ✅ orchestrator/performance-section.toml (5 fields with max bounds)\n- ✅ orchestrator/storage-section.toml (4 fields with max bounds)\n- ✅ control-center/policy-section.toml (3 fields with max bounds)\n- ✅ control-center/users-section.toml (3 fields with max bounds)\n- ✅ Fragments with no numeric fields (rbac, mode-selection, workspace) verified as complete\n\n**Total Progress This Session**:\n- Started: 12/33 (36%)\n- Ended: 33/33 (100%)\n- +21 fragments updated\n- +50+ numeric fields with constraint bounds added\n\n### Next Phase: Phase 8 - Nushell Scripts\nReady to proceed with implementation:\n- Interactive configuration wizard (configure.nu)\n- Config generation from Nickel → TOML (generate-configs.nu)\n- Validation and roundtrip workflows\n- Template rendering (Docker Compose, Kubernetes)\n\n## Files\n\n- `constraints/constraints.toml` - Source of truth for all validation limits\n- `constraint_interpolation_guide.md` - Complete mapping and best practices\n- `constraint_update_status.md` - This file (progress tracking)\n\n---\n\n**To contribute**: Pick any unchecked fragment above and follow the pattern in `constraint_interpolation_guide.md`. Each constraint update takes ~5 minutes per fragment. diff --git a/.typedialog/platform/forms/fragments/provisioning/.typedialog/provisioning/platform/scripts/README.md b/.typedialog/platform/forms/fragments/provisioning/.typedialog/provisioning/platform/scripts/README.md index 41e320f..eb94ef1 100644 --- a/.typedialog/platform/forms/fragments/provisioning/.typedialog/provisioning/platform/scripts/README.md +++ b/.typedialog/platform/forms/fragments/provisioning/.typedialog/provisioning/platform/scripts/README.md @@ -1,98 +1 @@ -# TypeDialog + Nickel Configuration Scripts - -Phase 8 Nushell automation scripts for interactive configuration workflow, config generation, validation, and deployment. - -## Quick Start - -``` -# 1. Interactive Configuration (TypeDialog) -nu scripts/configure.nu orchestrator solo - -# 2. Generate TOML configs -nu scripts/generate-configs.nu orchestrator solo - -# 3. Validate configuration -nu scripts/validate-config.nu provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl - -# 4. Render Docker Compose -nu scripts/render-docker-compose.nu solo - -# 5. Full deployment workflow -nu scripts/install-services.nu orchestrator solo --docker -``` - -## Scripts Overview - -### Shared Helpers -- **ansi.nu** - ANSI color and emoji output formatting -- **paths.nu** - Path validation and directory structure helpers -- **external.nu** - Safe external command execution with error handling - -### Core Configuration Scripts -- **configure.nu** - Interactive TypeDialog configuration wizard -- **generate-configs.nu** - Export Nickel configs to TOML -- **validate-config.nu** - Validate Nickel configuration - -### Rendering Scripts -- **render-docker-compose.nu** - Render Docker Compose from Nickel templates -- **render-kubernetes.nu** - Render Kubernetes manifests from Nickel templates - -### Deployment & Monitoring Scripts -- **install-services.nu** - Full deployment orchestration -- **detect-services.nu** - Auto-detect running services - -## Supported Services -- orchestrator (port 9090) -- control-center (port 8080) -- mcp-server (port 8888) -- installer (port 8000) - -## Supported Deployment Modes -- solo (2 CPU, 4GB RAM) -- multiuser (4 CPU, 8GB RAM) -- cicd (8 CPU, 16GB RAM) -- enterprise (16+ CPU, 32+ GB RAM) - -## Nushell Compliance -All scripts follow Nushell 0.109.0+ guidelines with proper type signatures, error handling, and no try-catch blocks. - -## Examples - -### Single Service Configuration -``` -nu scripts/configure.nu orchestrator solo --backend web -nu scripts/validate-config.nu provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl -nu scripts/generate-configs.nu orchestrator solo -cargo run -p orchestrator -- --config provisioning/platform/config/orchestrator.solo.toml -``` - -### Docker Compose Deployment -``` -nu scripts/generate-configs.nu orchestrator multiuser -nu scripts/render-docker-compose.nu multiuser -docker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml up -d -``` - -### Kubernetes Deployment -``` -nu scripts/generate-configs.nu orchestrator enterprise -nu scripts/render-kubernetes.nu enterprise --namespace production -nu scripts/install-services.nu all enterprise --kubernetes --namespace production -``` - -## Phase 8 Status - -✅ Phase 8.A: Shared helper modules -✅ Phase 8.B: Core configuration scripts -✅ Phase 8.C: Rendering scripts -✅ Phase 8.D: Deployment orchestration -✅ Phase 8.E: Testing and documentation - -## Requirements - -- Nushell 0.109.1+ -- Nickel 1.15.1+ -- TypeDialog CLI -- yq v4.50.1+ -- Docker (optional) -- kubectl (optional) +# TypeDialog + Nickel Configuration Scripts\n\nPhase 8 Nushell automation scripts for interactive configuration workflow, config generation, validation, and deployment.\n\n## Quick Start\n\n```\n# 1. Interactive Configuration (TypeDialog)\nnu scripts/configure.nu orchestrator solo\n\n# 2. Generate TOML configs\nnu scripts/generate-configs.nu orchestrator solo\n\n# 3. Validate configuration\nnu scripts/validate-config.nu provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl\n\n# 4. Render Docker Compose\nnu scripts/render-docker-compose.nu solo\n\n# 5. Full deployment workflow\nnu scripts/install-services.nu orchestrator solo --docker\n```\n\n## Scripts Overview\n\n### Shared Helpers\n- **ansi.nu** - ANSI color and emoji output formatting\n- **paths.nu** - Path validation and directory structure helpers \n- **external.nu** - Safe external command execution with error handling\n\n### Core Configuration Scripts\n- **configure.nu** - Interactive TypeDialog configuration wizard\n- **generate-configs.nu** - Export Nickel configs to TOML\n- **validate-config.nu** - Validate Nickel configuration\n\n### Rendering Scripts\n- **render-docker-compose.nu** - Render Docker Compose from Nickel templates\n- **render-kubernetes.nu** - Render Kubernetes manifests from Nickel templates\n\n### Deployment & Monitoring Scripts\n- **install-services.nu** - Full deployment orchestration\n- **detect-services.nu** - Auto-detect running services\n\n## Supported Services\n- orchestrator (port 9090)\n- control-center (port 8080)\n- mcp-server (port 8888)\n- installer (port 8000)\n\n## Supported Deployment Modes\n- solo (2 CPU, 4GB RAM)\n- multiuser (4 CPU, 8GB RAM)\n- cicd (8 CPU, 16GB RAM)\n- enterprise (16+ CPU, 32+ GB RAM)\n\n## Nushell Compliance\nAll scripts follow Nushell 0.109.0+ guidelines with proper type signatures, error handling, and no try-catch blocks.\n\n## Examples\n\n### Single Service Configuration\n```\nnu scripts/configure.nu orchestrator solo --backend web\nnu scripts/validate-config.nu provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl\nnu scripts/generate-configs.nu orchestrator solo\ncargo run -p orchestrator -- --config provisioning/platform/config/orchestrator.solo.toml\n```\n\n### Docker Compose Deployment\n```\nnu scripts/generate-configs.nu orchestrator multiuser\nnu scripts/render-docker-compose.nu multiuser\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml up -d\n```\n\n### Kubernetes Deployment\n```\nnu scripts/generate-configs.nu orchestrator enterprise\nnu scripts/render-kubernetes.nu enterprise --namespace production\nnu scripts/install-services.nu all enterprise --kubernetes --namespace production\n```\n\n## Phase 8 Status\n\n✅ Phase 8.A: Shared helper modules\n✅ Phase 8.B: Core configuration scripts \n✅ Phase 8.C: Rendering scripts\n✅ Phase 8.D: Deployment orchestration\n✅ Phase 8.E: Testing and documentation\n\n## Requirements\n\n- Nushell 0.109.1+\n- Nickel 1.15.1+\n- TypeDialog CLI\n- yq v4.50.1+\n- Docker (optional)\n- kubectl (optional) diff --git a/.typedialog/platform/scripts/README.md b/.typedialog/platform/scripts/README.md index b144f21..fe6e914 100644 --- a/.typedialog/platform/scripts/README.md +++ b/.typedialog/platform/scripts/README.md @@ -1,255 +1 @@ -# Scripts - -Nushell orchestration scripts for configuration workflow automation (NuShell 0.109+). - -## Purpose - -Scripts provide: -- **Interactive configuration wizard** - TypeDialog with nickel-roundtrip -- **Configuration generation** - Nickel → TOML export -- **Validation** - Nickel typecheck and constraint validation -- **Deployment** - Docker Compose, Kubernetes, service installation - -## Script Organization - -``` -scripts/ -├── README.md # This file -├── configure.nu # Interactive TypeDialog wizard -├── generate-configs.nu # Nickel → TOML export -├── validate-config.nu # Nickel typecheck -├── render-docker-compose.nu # Docker Compose generation -├── render-kubernetes.nu # Kubernetes manifests generation -├── install-services.nu # Deploy platform services -└── detect-services.nu # Auto-detect running services -``` - -## Scripts (Planned Implementation) - -### configure.nu -Interactive configuration wizard using TypeDialog nickel-roundtrip: - -``` -nu provisioning/.typedialog/platform/scripts/configure.nu orchestrator solo --backend web -``` - -Workflow: -1. Loads existing config (if exists) as defaults -2. Launches TypeDialog form (web/tui/cli) -3. Shows form with validated constraints -4. User edits configuration -5. Generates updated Nickel config to `provisioning/schemas/platform/values/orchestrator.solo.ncl` - -Usage: -``` -nu scripts/configure.nu [service] [mode] --backend [web|tui|cli] - service: orchestrator | control-center | mcp-server | vault-service | extension-registry | rag | ai-service | provisioning-daemon - mode: solo | multiuser | cicd | enterprise - backend: web (default) | tui | cli -``` - -### generate-configs.nu -Export Nickel configuration to TOML: - -``` -nu provisioning/.typedialog/platform/scripts/generate-configs.nu orchestrator solo -``` - -Workflow: -1. Validates Nickel config (typecheck) -2. Exports to TOML format -3. Saves to `provisioning/config/runtime/generated/{service}.{mode}.toml` - -Usage: -``` -nu scripts/generate-configs.nu [service] [mode] - service: orchestrator | control-center | mcp-server | vault-service | extension-registry | rag | ai-service | provisioning-daemon - mode: solo | multiuser | cicd | enterprise -``` - -### validate-config.nu -Typecheck Nickel configuration: - -``` -nu provisioning/.typedialog/platform/scripts/validate-config.nu provisioning/schemas/platform/values/orchestrator.solo.ncl -``` - -Workflow: -1. Runs nickel typecheck -2. Reports errors (schema violations, constraint errors) -3. Exits with status - -Usage: -``` -nu scripts/validate-config.nu [config_path] - config_path: Path to Nickel config file -``` - -### render-docker-compose.nu -Generate Docker Compose files from Nickel templates: - -``` -nu provisioning/.typedialog/platform/scripts/render-docker-compose.nu solo -``` - -Workflow: -1. Evaluates Nickel template -2. Exports to JSON -3. Converts to YAML (via yq) -4. Saves to `provisioning/platform/infrastructure/docker/docker-compose.{mode}.yml` - -Usage: -``` -nu scripts/render-docker-compose.nu [mode] - mode: solo | multiuser | cicd | enterprise -``` - -### render-kubernetes.nu -Generate Kubernetes manifests: - -``` -nu scripts/render-kubernetes.nu solo -``` - -Workflow: -1. Evaluates Nickel templates -2. Exports to JSON -3. Converts to YAML -4. Saves to `provisioning/platform/infrastructure/kubernetes/` - -### install-services.nu -Deploy platform services: - -``` -nu scripts/install-services.nu solo --backend docker -``` - -Workflow: -1. Generates all configs for mode -2. Renders deployment manifests -3. Deploys services (Docker Compose or Kubernetes) -4. Verifies service startup - -### detect-services.nu -Auto-detect running services: - -``` -nu scripts/detect-services.nu -``` - -Outputs: -- Running service list -- Detected mode -- Port usage -- Container/pod status - -## Common Workflow - -``` -# 1. Configure service -nu scripts/configure.nu orchestrator solo - -# 2. Validate configuration -nu scripts/validate-config.nu provisioning/schemas/platform/values/orchestrator.solo.ncl - -# 3. Generate TOML -nu scripts/generate-configs.nu orchestrator solo - -# 4. Review generated config -cat provisioning/config/runtime/generated/orchestrator.solo.toml - -# 5. Render Docker Compose -nu scripts/render-docker-compose.nu solo - -# 6. Deploy services -nu scripts/install-services.nu solo --backend docker - -# 7. Verify running services -nu scripts/detect-services.nu -``` - -## Guidelines - -All scripts follow @.claude/guidelines/nushell.md (NuShell 0.109+): - -- **Explicit type signatures** - Function parameters with type annotations -- **Colon notation** - Use `:` before input type, `->` before output type -- **Error handling** - Use `do { } | complete` pattern (not try-catch) -- **Pipeline operations** - Chain operations, avoid nested calls -- **No mutable variables** - Use reduce/recursion instead -- **External commands** - Use `^` prefix (`^nickel`, `^docker`, etc.) - -Example: -``` -export def main [ - service: string, # Type annotation - mode: string -]: nothing -> nothing { # Input/output types - let result = do { - ^nickel typecheck $config_path - } | complete - - if $result.exit_code == 0 { - print "✅ Validation passed" - } else { - print $"❌ Validation failed: ($result.stderr)" - exit 1 - } -} -``` - -## Error Handling Pattern - -All scripts use `do { } | complete` for error handling: - -``` -let result = do { - ^some-command --flag value -} | complete - -if $result.exit_code != 0 { - error make { - msg: $"Command failed: ($result.stderr)" - } -} -``` - -**Never use try-catch** (not supported in 0.109+). - -## Script Dependencies - -All scripts assume: -- **NuShell 0.109+** - Modern shell -- **Nickel** (0.10+) - Configuration language -- **TypeDialog** - Interactive forms -- **Docker** or **kubectl** - Deployment backends -- **yq** - YAML/JSON conversion -- **jq** - JSON processing - -## Testing Scripts - -``` -# Validate Nushell syntax -nu --version # Verify 0.109+ - -# Test script execution -nu scripts/validate-config.nu values/orchestrator.solo.ncl - -# Check script compliance -grep -r "try\|panic\|todo" scripts/ # Should be empty -``` - -## Adding a New Script - -1. **Create script file** (`scripts/{name}.nu`) -2. **Add @.claude/guidelines/nushell.md** compliance -3. **Define main function** with type signatures -4. **Use do { } | complete** for error handling -5. **Test execution**: `nu scripts/{name}.nu` -6. **Verify**: No try-catch, no mutable vars, no panic - ---- - -**Version**: 1.0.0 -**Last Updated**: 2025-01-05 -**Guideline**: @.claude/guidelines/nushell.md (NuShell 0.109+) +# Scripts\n\nNushell orchestration scripts for configuration workflow automation (NuShell 0.109+).\n\n## Purpose\n\nScripts provide:\n- **Interactive configuration wizard** - TypeDialog with nickel-roundtrip\n- **Configuration generation** - Nickel → TOML export\n- **Validation** - Nickel typecheck and constraint validation\n- **Deployment** - Docker Compose, Kubernetes, service installation\n\n## Script Organization\n\n```\nscripts/\n├── README.md # This file\n├── configure.nu # Interactive TypeDialog wizard\n├── generate-configs.nu # Nickel → TOML export\n├── validate-config.nu # Nickel typecheck\n├── render-docker-compose.nu # Docker Compose generation\n├── render-kubernetes.nu # Kubernetes manifests generation\n├── install-services.nu # Deploy platform services\n└── detect-services.nu # Auto-detect running services\n```\n\n## Scripts (Planned Implementation)\n\n### configure.nu\nInteractive configuration wizard using TypeDialog nickel-roundtrip:\n\n```\nnu provisioning/.typedialog/platform/scripts/configure.nu orchestrator solo --backend web\n```\n\nWorkflow:\n1. Loads existing config (if exists) as defaults\n2. Launches TypeDialog form (web/tui/cli)\n3. Shows form with validated constraints\n4. User edits configuration\n5. Generates updated Nickel config to `provisioning/schemas/platform/values/orchestrator.solo.ncl`\n\nUsage:\n```\nnu scripts/configure.nu [service] [mode] --backend [web|tui|cli]\n service: orchestrator | control-center | mcp-server | vault-service | extension-registry | rag | ai-service | provisioning-daemon\n mode: solo | multiuser | cicd | enterprise\n backend: web (default) | tui | cli\n```\n\n### generate-configs.nu\nExport Nickel configuration to TOML:\n\n```\nnu provisioning/.typedialog/platform/scripts/generate-configs.nu orchestrator solo\n```\n\nWorkflow:\n1. Validates Nickel config (typecheck)\n2. Exports to TOML format\n3. Saves to `provisioning/config/runtime/generated/{service}.{mode}.toml`\n\nUsage:\n```\nnu scripts/generate-configs.nu [service] [mode]\n service: orchestrator | control-center | mcp-server | vault-service | extension-registry | rag | ai-service | provisioning-daemon\n mode: solo | multiuser | cicd | enterprise\n```\n\n### validate-config.nu\nTypecheck Nickel configuration:\n\n```\nnu provisioning/.typedialog/platform/scripts/validate-config.nu provisioning/schemas/platform/values/orchestrator.solo.ncl\n```\n\nWorkflow:\n1. Runs nickel typecheck\n2. Reports errors (schema violations, constraint errors)\n3. Exits with status\n\nUsage:\n```\nnu scripts/validate-config.nu [config_path]\n config_path: Path to Nickel config file\n```\n\n### render-docker-compose.nu\nGenerate Docker Compose files from Nickel templates:\n\n```\nnu provisioning/.typedialog/platform/scripts/render-docker-compose.nu solo\n```\n\nWorkflow:\n1. Evaluates Nickel template\n2. Exports to JSON\n3. Converts to YAML (via yq)\n4. Saves to `provisioning/platform/infrastructure/docker/docker-compose.{mode}.yml`\n\nUsage:\n```\nnu scripts/render-docker-compose.nu [mode]\n mode: solo | multiuser | cicd | enterprise\n```\n\n### render-kubernetes.nu\nGenerate Kubernetes manifests:\n\n```\nnu scripts/render-kubernetes.nu solo\n```\n\nWorkflow:\n1. Evaluates Nickel templates\n2. Exports to JSON\n3. Converts to YAML\n4. Saves to `provisioning/platform/infrastructure/kubernetes/`\n\n### install-services.nu\nDeploy platform services:\n\n```\nnu scripts/install-services.nu solo --backend docker\n```\n\nWorkflow:\n1. Generates all configs for mode\n2. Renders deployment manifests\n3. Deploys services (Docker Compose or Kubernetes)\n4. Verifies service startup\n\n### detect-services.nu\nAuto-detect running services:\n\n```\nnu scripts/detect-services.nu\n```\n\nOutputs:\n- Running service list\n- Detected mode\n- Port usage\n- Container/pod status\n\n## Common Workflow\n\n```\n# 1. Configure service\nnu scripts/configure.nu orchestrator solo\n\n# 2. Validate configuration\nnu scripts/validate-config.nu provisioning/schemas/platform/values/orchestrator.solo.ncl\n\n# 3. Generate TOML\nnu scripts/generate-configs.nu orchestrator solo\n\n# 4. Review generated config\ncat provisioning/config/runtime/generated/orchestrator.solo.toml\n\n# 5. Render Docker Compose\nnu scripts/render-docker-compose.nu solo\n\n# 6. Deploy services\nnu scripts/install-services.nu solo --backend docker\n\n# 7. Verify running services\nnu scripts/detect-services.nu\n```\n\n## Guidelines\n\nAll scripts follow @.claude/guidelines/nushell.md (NuShell 0.109+):\n\n- **Explicit type signatures** - Function parameters with type annotations\n- **Colon notation** - Use `:` before input type, `->` before output type\n- **Error handling** - Use `do { } | complete` pattern (not try-catch)\n- **Pipeline operations** - Chain operations, avoid nested calls\n- **No mutable variables** - Use reduce/recursion instead\n- **External commands** - Use `^` prefix (`^nickel`, `^docker`, etc.)\n\nExample:\n```\nexport def main [\n service: string, # Type annotation\n mode: string\n]: nothing -> nothing { # Input/output types\n let result = do {\n ^nickel typecheck $config_path\n } | complete\n\n if $result.exit_code == 0 {\n print "✅ Validation passed"\n } else {\n print $"❌ Validation failed: ($result.stderr)"\n exit 1\n }\n}\n```\n\n## Error Handling Pattern\n\nAll scripts use `do { } | complete` for error handling:\n\n```\nlet result = do {\n ^some-command --flag value\n} | complete\n\nif $result.exit_code != 0 {\n error make {\n msg: $"Command failed: ($result.stderr)"\n }\n}\n```\n\n**Never use try-catch** (not supported in 0.109+).\n\n## Script Dependencies\n\nAll scripts assume:\n- **NuShell 0.109+** - Modern shell\n- **Nickel** (0.10+) - Configuration language\n- **TypeDialog** - Interactive forms\n- **Docker** or **kubectl** - Deployment backends\n- **yq** - YAML/JSON conversion\n- **jq** - JSON processing\n\n## Testing Scripts\n\n```\n# Validate Nushell syntax\nnu --version # Verify 0.109+\n\n# Test script execution\nnu scripts/validate-config.nu values/orchestrator.solo.ncl\n\n# Check script compliance\ngrep -r "try\|panic\|todo" scripts/ # Should be empty\n```\n\n## Adding a New Script\n\n1. **Create script file** (`scripts/{name}.nu`)\n2. **Add @.claude/guidelines/nushell.md** compliance\n3. **Define main function** with type signatures\n4. **Use do { } | complete** for error handling\n5. **Test execution**: `nu scripts/{name}.nu`\n6. **Verify**: No try-catch, no mutable vars, no panic\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05\n**Guideline**: @.claude/guidelines/nushell.md (NuShell 0.109+) diff --git a/.woodpecker/README.md b/.woodpecker/README.md index 7be27d4..11d2da0 100644 --- a/.woodpecker/README.md +++ b/.woodpecker/README.md @@ -1,79 +1 @@ -# Woodpecker CI Configuration - -Pipelines for Gitea/Forgejo + Woodpecker CI. - -## Files - -- **`ci.yml`** - Main CI pipeline (push, pull requests) -- **`Dockerfile`** - Custom CI image with pre-installed tools -- **`Dockerfile.cross`** - Cross-compilation image for multi-platform builds - -## Setup - -### 1. Activate Woodpecker CI - -Enable Woodpecker CI in your Gitea/Forgejo repository settings. - -### 2. (Optional) Build Custom Image - -Speeds up CI by pre-installing tools (~5 min faster per run). - -``` -# Build CI image -docker build -t your-registry/ci:latest -f .woodpecker/Dockerfile . - -# Push to your registry -docker push your-registry/ci:latest - -# Update .woodpecker/ci.yml -# Change: image: rust:latest -# To: image: your-registry/ci:latest -``` - -### 3. Cross-Compilation Setup - -For multi-platform builds: - -``` -# Build cross-compilation image -docker build -t your-registry/ci-cross:latest -f .woodpecker/Dockerfile.cross . - -# Push to registry -docker push your-registry/ci-cross:latest -``` - -## CI Pipeline (`ci.yml`) - -**Triggers**: Push to `main`/`develop`, Pull Requests - -**Jobs**: - -1. Lint (Rust, Bash, Nickel, Nushell, Markdown) - Parallel -2. Test (all features) -3. Build (release) -4. Security audit -5. License compliance check - -**Duration**: ~15-20 minutes (without custom image), ~10-15 minutes (with custom image) - -## Triggering Pipelines - -``` -# CI pipeline (automatic on push/PR) -git push origin main -``` - -## Viewing Results - -- **Gitea/Forgejo**: Repository → Actions → Pipeline runs -- **Woodpecker UI**: - -## Differences from GitHub Actions - -| Feature | GitHub Actions | Woodpecker CI | -| --------- | --------------- | --------------- | -| Matrix builds | ✅ 3 OS | ❌ Linux only* | -| Caching | ✅ Built-in | ⚠️ Server-side** | - -\* Multi-OS builds require multiple Woodpecker agents -\*\* Configure in Woodpecker server settings +# Woodpecker CI Configuration\n\nPipelines for Gitea/Forgejo + Woodpecker CI.\n\n## Files\n\n- **`ci.yml`** - Main CI pipeline (push, pull requests)\n- **`Dockerfile`** - Custom CI image with pre-installed tools\n- **`Dockerfile.cross`** - Cross-compilation image for multi-platform builds\n\n## Setup\n\n### 1. Activate Woodpecker CI\n\nEnable Woodpecker CI in your Gitea/Forgejo repository settings.\n\n### 2. (Optional) Build Custom Image\n\nSpeeds up CI by pre-installing tools (~5 min faster per run).\n\n```\n# Build CI image\ndocker build -t your-registry/ci:latest -f .woodpecker/Dockerfile .\n\n# Push to your registry\ndocker push your-registry/ci:latest\n\n# Update .woodpecker/ci.yml\n# Change: image: rust:latest\n# To: image: your-registry/ci:latest\n```\n\n### 3. Cross-Compilation Setup\n\nFor multi-platform builds:\n\n```\n# Build cross-compilation image\ndocker build -t your-registry/ci-cross:latest -f .woodpecker/Dockerfile.cross .\n\n# Push to registry\ndocker push your-registry/ci-cross:latest\n```\n\n## CI Pipeline (`ci.yml`)\n\n**Triggers**: Push to `main`/`develop`, Pull Requests\n\n**Jobs**:\n\n1. Lint (Rust, Bash, Nickel, Nushell, Markdown) - Parallel\n2. Test (all features)\n3. Build (release)\n4. Security audit\n5. License compliance check\n\n**Duration**: ~15-20 minutes (without custom image), ~10-15 minutes (with custom image)\n\n## Triggering Pipelines\n\n```\n# CI pipeline (automatic on push/PR)\ngit push origin main\n```\n\n## Viewing Results\n\n- **Gitea/Forgejo**: Repository → Actions → Pipeline runs\n- **Woodpecker UI**: \n\n## Differences from GitHub Actions\n\n| Feature | GitHub Actions | Woodpecker CI |\n| --------- | --------------- | --------------- |\n| Matrix builds | ✅ 3 OS | ❌ Linux only* |\n| Caching | ✅ Built-in | ⚠️ Server-side** |\n\n\* Multi-OS builds require multiple Woodpecker agents\n\*\* Configure in Woodpecker server settings diff --git a/CHANGELOG.md b/CHANGELOG.md index c0495f8..37cd429 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,131 +1 @@ -# Provisioning Repository - Changes - -**Date**: 2026-01-08 -**Repository**: provisioning (standalone, nickel branch) -**Changes**: Nickel IaC migration complete - Legacy KCL and config cleanup - ---- - -## 📋 Summary - -Complete migration to Nickel-based infrastructure-as-code with consolidated configuration strategy. Legacy KCL schemas, deprecated config files, and redundant documentation removed. New project structure with `.cargo/`, `.github/`, and schema-driven configuration system. - ---- - -## 📁 Changes by Directory - -### ✅ REMOVED (Legacy KCL Ecosystem) - -- **config/** - Deprecated TOML configs (config.defaults.toml, kms.toml, plugins.toml, etc.) -- **config/cedar-policies/** - Legacy Cedar policies (moved to Nickel schemas) -- **config/templates/** - Old Jinja2 templates (replaced by Nickel generator/) -- **config/installer-examples/** - KCL-based examples -- **docs/src/** - Legacy documentation (full migration to provisioning/docs/src/) -- **kcl/** - Complete removal (all workspaces migrated to Nickel) -- **tools/kcl-packager.nu** - KCL packaging system - -### ✅ ADDED (Nickel IaC & New Structure) - -- **.cargo/** - Rust build configuration (clippy settings, rustfmt.toml) -- **.github/** - GitHub Actions CI/CD workflows -- **schemas/** - Nickel schema definitions (primary IaC format) - - main.ncl, provider-aws.ncl, provider-local.ncl, provider-upcloud.ncl - - Infrastructure, deployment, services, operations schemas -- **docs/src/architecture/adr/** - ADR updates for Nickel migration - - adr-010-configuration-format-strategy.md - - adr-011-nickel-migration.md - - adr-012-nushell-nickel-plugin-cli-wrapper.md - -### 📝 UPDATED (Core System) - -- **provisioning/docs/src/** - Comprehensive product documentation - - API reference, architecture, guides, operations, security, testing - - Nickel configuration guide with examples - - Migrated from legacy KCL documentation - -- **core/** - Updated with Nickel integration - - Scripts, plugins, CLI updated for Nickel schema parsing - -- **justfiles/** - Added ci.just for Nickel-aware CI/CD -- **README.md** - Complete restructure for Nickel-first approach -- **.gitignore** - Updated to ignore Nickel build artifacts - ---- - -## 📊 Change Statistics - -| Category | Removed | Added | Modified | -| ---------- | --------- | ------- | ---------- | -| Configuration | 50+ | 10+ | 3 | -| Documentation | 150+ | 200+ | 40+ | -| Infrastructure | 1 (kcl/) | - | - | -| Plugins | 1 | - | 5+ | -| Build System | 5 | 8+ | 3 | -| **Total** | **~220 files** | **~250 files** | **50+ files** | - -## ⚠️ Breaking Changes - -1. **KCL Sunset**: All KCL infrastructure code removed. Migrate workspaces using `nickel-kcl-bridge` or rewrite directly in Nickel. -2. **Config Format**: TOML configuration files moved to schema-driven Nickel system. Legacy config loading deprecated. -3. **Documentation**: Old KCL/legacy docs removed. Use `provisioning/docs/` for current product documentation. -4. **Plugin System**: Updated to Nickel-aware plugin API. Legacy Nushell plugins require recompilation. - -## 🔧 Migration Path - -``` -# For existing workspaces: -provisioning workspace migrate --from-kcl - -# For custom configs: -nickel eval --format json | jq '.' -``` - -## ✨ Key Features - -- **Type-Safe**: Nickel schemas eliminate silent config errors -- **Composable**: Modular infrastructure definitions with lazy evaluation -- **Documented**: Schema validation built-in, IDE support via LSP -- **Validated**: All imports pre-checked, circular dependencies prevented -- **Bridge Available**: `nickel-kcl-bridge` for gradual KCL→Nickel migration - ---- - -## 📝 Implementation Details - -### Nickel Schema System - -- **Three-tier architecture**: infrastructure, operations, deployment -- **Lazy evaluation**: Efficient resource binding and composition -- **Record merging**: Clean override patterns without duplication -- **Type validation**: LSP-aware with IDE auto-completion -- **Generator system**: Nickel-based dynamic configuration at runtime - -### Documentation Reorganization - -- **provisioning/docs/src/** (200+ files) - Customer-facing product docs -- **docs/src/** (20-30 files) - Architecture and development guidelines -- **.coder/** - Session files and implementation records -- Separation of concerns: Product docs isolated from session artifacts - -### CI/CD Integration - -- GitHub Actions workflows for Rust, Nickel, Nushell -- Automated schema validation pre-commit -- Cross-platform testing (Linux, macOS) -- Build artifact caching for fast iteration - ---- - -## ⚠️ Compatibility Notes - -**Breaking**: KCL workspaces require migration to Nickel. Use schema-aware tooling for validation. - -**Migration support**: `nickel-kcl-bridge` tool and guides available in `provisioning/docs/src/development/`. - -**Legacy configs**: Old TOML files no longer loaded. Migrate to Nickel schema format via CLI tool. - ---- - -**Status**: Nickel migration complete. System is production-ready. -**Date**: 2026-01-08 -**Branch**: nickel +# Provisioning Repository - Changes\n\n**Date**: 2026-01-08\n**Repository**: provisioning (standalone, nickel branch)\n**Changes**: Nickel IaC migration complete - Legacy KCL and config cleanup\n\n---\n\n## 📋 Summary\n\nComplete migration to Nickel-based infrastructure-as-code with consolidated configuration strategy. Legacy KCL schemas, deprecated config files, and redundant documentation removed. New project structure with `.cargo/`, `.github/`, and schema-driven configuration system.\n\n---\n\n## 📁 Changes by Directory\n\n### ✅ REMOVED (Legacy KCL Ecosystem)\n\n- **config/** - Deprecated TOML configs (config.defaults.toml, kms.toml, plugins.toml, etc.)\n- **config/cedar-policies/** - Legacy Cedar policies (moved to Nickel schemas)\n- **config/templates/** - Old Jinja2 templates (replaced by Nickel generator/)\n- **config/installer-examples/** - KCL-based examples\n- **docs/src/** - Legacy documentation (full migration to provisioning/docs/src/)\n- **kcl/** - Complete removal (all workspaces migrated to Nickel)\n- **tools/kcl-packager.nu** - KCL packaging system\n\n### ✅ ADDED (Nickel IaC & New Structure)\n\n- **.cargo/** - Rust build configuration (clippy settings, rustfmt.toml)\n- **.github/** - GitHub Actions CI/CD workflows\n- **schemas/** - Nickel schema definitions (primary IaC format)\n - main.ncl, provider-aws.ncl, provider-local.ncl, provider-upcloud.ncl\n - Infrastructure, deployment, services, operations schemas\n- **docs/src/architecture/adr/** - ADR updates for Nickel migration\n - adr-010-configuration-format-strategy.md\n - adr-011-nickel-migration.md\n - adr-012-nushell-nickel-plugin-cli-wrapper.md\n\n### 📝 UPDATED (Core System)\n\n- **provisioning/docs/src/** - Comprehensive product documentation\n - API reference, architecture, guides, operations, security, testing\n - Nickel configuration guide with examples\n - Migrated from legacy KCL documentation\n\n- **core/** - Updated with Nickel integration\n - Scripts, plugins, CLI updated for Nickel schema parsing\n\n- **justfiles/** - Added ci.just for Nickel-aware CI/CD\n- **README.md** - Complete restructure for Nickel-first approach\n- **.gitignore** - Updated to ignore Nickel build artifacts\n\n---\n\n## 📊 Change Statistics\n\n| Category | Removed | Added | Modified |\n| ---------- | --------- | ------- | ---------- |\n| Configuration | 50+ | 10+ | 3 |\n| Documentation | 150+ | 200+ | 40+ |\n| Infrastructure | 1 (kcl/) | - | - |\n| Plugins | 1 | - | 5+ |\n| Build System | 5 | 8+ | 3 |\n| **Total** | **~220 files** | **~250 files** | **50+ files** |\n\n## ⚠️ Breaking Changes\n\n1. **KCL Sunset**: All KCL infrastructure code removed. Migrate workspaces using `nickel-kcl-bridge` or rewrite directly in Nickel.\n2. **Config Format**: TOML configuration files moved to schema-driven Nickel system. Legacy config loading deprecated.\n3. **Documentation**: Old KCL/legacy docs removed. Use `provisioning/docs/` for current product documentation.\n4. **Plugin System**: Updated to Nickel-aware plugin API. Legacy Nushell plugins require recompilation.\n\n## 🔧 Migration Path\n\n```\n# For existing workspaces:\nprovisioning workspace migrate --from-kcl \n\n# For custom configs:\nnickel eval --format json | jq '.'\n```\n\n## ✨ Key Features\n\n- **Type-Safe**: Nickel schemas eliminate silent config errors\n- **Composable**: Modular infrastructure definitions with lazy evaluation\n- **Documented**: Schema validation built-in, IDE support via LSP\n- **Validated**: All imports pre-checked, circular dependencies prevented\n- **Bridge Available**: `nickel-kcl-bridge` for gradual KCL→Nickel migration\n\n---\n\n## 📝 Implementation Details\n\n### Nickel Schema System\n\n- **Three-tier architecture**: infrastructure, operations, deployment\n- **Lazy evaluation**: Efficient resource binding and composition\n- **Record merging**: Clean override patterns without duplication\n- **Type validation**: LSP-aware with IDE auto-completion\n- **Generator system**: Nickel-based dynamic configuration at runtime\n\n### Documentation Reorganization\n\n- **provisioning/docs/src/** (200+ files) - Customer-facing product docs\n- **docs/src/** (20-30 files) - Architecture and development guidelines\n- **.coder/** - Session files and implementation records\n- Separation of concerns: Product docs isolated from session artifacts\n\n### CI/CD Integration\n\n- GitHub Actions workflows for Rust, Nickel, Nushell\n- Automated schema validation pre-commit\n- Cross-platform testing (Linux, macOS)\n- Build artifact caching for fast iteration\n\n---\n\n## ⚠️ Compatibility Notes\n\n**Breaking**: KCL workspaces require migration to Nickel. Use schema-aware tooling for validation.\n\n**Migration support**: `nickel-kcl-bridge` tool and guides available in `provisioning/docs/src/development/`.\n\n**Legacy configs**: Old TOML files no longer loaded. Migrate to Nickel schema format via CLI tool.\n\n---\n\n**Status**: Nickel migration complete. System is production-ready.\n**Date**: 2026-01-08\n**Branch**: nickel diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index 49b8c67..86d2ac5 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -1 +1 @@ -# Code of Conduct\n\n## Our Pledge\n\nWe, as members, contributors, and leaders, pledge to make participation in our project and community a harassment-free experience for everyone, regardless of:\n\n- Age\n- Body size\n- Visible or invisible disability\n- Ethnicity\n- Sex characteristics\n- Gender identity and expression\n- Level of experience\n- Education\n- Socioeconomic status\n- Nationality\n- Personal appearance\n- Race\n- Caste\n- Color\n- Religion\n- Sexual identity and orientation\n\nWe pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.\n\n## Our Standards\n\nExamples of behavior that contributes to a positive environment for our community include:\n\n- Demonstrating empathy and kindness toward other people\n- Being respectful of differing opinions, viewpoints, and experiences\n- Giving and gracefully accepting constructive feedback\n- Accepting responsibility and apologizing to those affected by mistakes\n- Focusing on what is best not just for us as individuals, but for the overall community\n\nExamples of unacceptable behavior include:\n\n- The use of sexualized language or imagery\n- Trolling, insulting, or derogatory comments\n- Personal or political attacks\n- Public or private harassment\n- Publishing others' private information (doxing)\n- Other conduct which could reasonably be considered inappropriate in a professional setting\n\n## Enforcement Responsibilities\n\nProject maintainers are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate corrective action in response to unacceptable behavior.\n\nMaintainers have the right and responsibility to:\n\n- Remove, edit, or reject comments, commits, code, and other contributions\n- Ban contributors for behavior they deem inappropriate, threatening, or harmful\n\n## Scope\n\nThis Code of Conduct applies to:\n\n- All community spaces (GitHub, forums, chat, events, etc.)\n- Official project channels and representations\n- Interactions between community members related to the project\n\n## Enforcement\n\nInstances of abusive, harassing, or otherwise unacceptable behavior may be reported to project maintainers:\n\n- Email: [project contact]\n- GitHub: Private security advisory\n- Issues: Report with `conduct` label (public discussions only)\n\nAll complaints will be reviewed and investigated promptly and fairly.\n\n### Enforcement Guidelines\n\n**1. Correction**\n\n- Community impact: Use of inappropriate language or unwelcoming behavior\n- Action: Private written warning with explanation and clarity on impact\n- Consequence: Warning and no further violations\n\n**2. Warning**\n\n- Community impact: Violation through single incident or series of actions\n- Action: Written warning with severity consequences for continued behavior\n- Consequence: Suspension from community interaction\n\n**3. Temporary Ban**\n\n- Community impact: Serious violation of standards\n- Action: Temporary ban from community interaction\n- Consequence: Revocation of ban after reflection period\n\n**4. Permanent Ban**\n\n- Community impact: Pattern of violating community standards\n- Action: Permanent ban from community interaction\n\n## Attribution\n\nThis Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 2.1.\n\nFor answers to common questions about this code of conduct, see the FAQ at .\n\n---\n\n**Thank you for being part of our community!**\n\nWe believe in creating a welcoming and inclusive space where everyone can contribute their best work. Together, we make this project better. \ No newline at end of file +# Code of Conduct\n\n## Our Pledge\n\nWe, as members, contributors, and leaders, pledge to make participation in our project and community a harassment-free experience for everyone, regardless of:\n\n- Age\n- Body size\n- Visible or invisible disability\n- Ethnicity\n- Sex characteristics\n- Gender identity and expression\n- Level of experience\n- Education\n- Socioeconomic status\n- Nationality\n- Personal appearance\n- Race\n- Caste\n- Color\n- Religion\n- Sexual identity and orientation\n\nWe pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.\n\n## Our Standards\n\nExamples of behavior that contributes to a positive environment for our community include:\n\n- Demonstrating empathy and kindness toward other people\n- Being respectful of differing opinions, viewpoints, and experiences\n- Giving and gracefully accepting constructive feedback\n- Accepting responsibility and apologizing to those affected by mistakes\n- Focusing on what is best not just for us as individuals, but for the overall community\n\nExamples of unacceptable behavior include:\n\n- The use of sexualized language or imagery\n- Trolling, insulting, or derogatory comments\n- Personal or political attacks\n- Public or private harassment\n- Publishing others' private information (doxing)\n- Other conduct which could reasonably be considered inappropriate in a professional setting\n\n## Enforcement Responsibilities\n\nProject maintainers are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate corrective action in response to unacceptable behavior.\n\nMaintainers have the right and responsibility to:\n\n- Remove, edit, or reject comments, commits, code, and other contributions\n- Ban contributors for behavior they deem inappropriate, threatening, or harmful\n\n## Scope\n\nThis Code of Conduct applies to:\n\n- All community spaces (GitHub, forums, chat, events, etc.)\n- Official project channels and representations\n- Interactions between community members related to the project\n\n## Enforcement\n\nInstances of abusive, harassing, or otherwise unacceptable behavior may be reported to project maintainers:\n\n- Email: [project contact]\n- GitHub: Private security advisory\n- Issues: Report with `conduct` label (public discussions only)\n\nAll complaints will be reviewed and investigated promptly and fairly.\n\n### Enforcement Guidelines\n\n**1. Correction**\n\n- Community impact: Use of inappropriate language or unwelcoming behavior\n- Action: Private written warning with explanation and clarity on impact\n- Consequence: Warning and no further violations\n\n**2. Warning**\n\n- Community impact: Violation through single incident or series of actions\n- Action: Written warning with severity consequences for continued behavior\n- Consequence: Suspension from community interaction\n\n**3. Temporary Ban**\n\n- Community impact: Serious violation of standards\n- Action: Temporary ban from community interaction\n- Consequence: Revocation of ban after reflection period\n\n**4. Permanent Ban**\n\n- Community impact: Pattern of violating community standards\n- Action: Permanent ban from community interaction\n\n## Attribution\n\nThis Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 2.1.\n\nFor answers to common questions about this code of conduct, see the FAQ at .\n\n---\n\n**Thank you for being part of our community!**\n\nWe believe in creating a welcoming and inclusive space where everyone can contribute their best work. Together, we make this project better. diff --git a/README.md b/README.md index 413e032..e998456 100644 --- a/README.md +++ b/README.md @@ -1,1125 +1 @@ -

- Provisioning Logo -

-

- Provisioning -

- -# Provisioning - Infrastructure Automation Platform - -> **A modular, declarative Infrastructure as Code (IaC) platform for managing complete infrastructure lifecycles** - -## Table of Contents - -- [What is Provisioning?](#what-is-provisioning) -- [Why Provisioning?](#why-provisioning) -- [Core Concepts](#core-concepts) -- [Architecture](#architecture) -- [Key Features](#key-features) -- [Technology Stack](#technology-stack) -- [How It Works](#how-it-works) -- [Use Cases](#use-cases) -- [Getting Started](#getting-started) - ---- - -## What is Provisioning? - -**Provisioning** is a comprehensive **Infrastructure as Code (IaC)** platform designed to manage -complete infrastructure lifecycles: cloud providers, infrastructure services, clusters, -and isolated workspaces across multiple cloud/local environments. - -Extensible and customizable by design, it delivers type-safe, configuration-driven workflows -with enterprise security (encrypted configuration, Cosmian KMS integration, Cedar policy engine, -secrets management, authorization and permissions control, compliance checking, anomaly detection) -and adaptable deployment modes (interactive UI, CLI automation, unattended CI/CD) -suitable for any scale from development to production. - -### Technical Definition - -Declarative Infrastructure as Code (IaC) platform providing: - -- **Type-safe, configuration-driven workflows** with schema validation and constraint checking -- **Modular, extensible architecture**: cloud providers, task services, clusters, workspaces -- **Multi-cloud abstraction layer** with unified API (UpCloud, AWS, local infrastructure) -- **High-performance state management**: - - Graph database backend for complex relationships - - Real-time state tracking and queries - - Multi-model data storage (document, graph, relational) -- **Enterprise security stack**: - - Encrypted configuration and secrets management - - Cosmian KMS integration for confidential key management - - Cedar policy engine for fine-grained access control - - Authorization and permissions control via platform services - - Compliance checking and policy enforcement - - Anomaly detection for security monitoring - - Audit logging and compliance tracking -- **Hybrid orchestration**: Rust-based performance layer + scripting flexibility -- **Production-ready features**: - - Batch workflows with dependency resolution - - Checkpoint recovery and automatic rollback - - Parallel execution with state management -- **Adaptable deployment modes**: - - Interactive TUI for guided setup - - Headless CLI for scripted automation - - Unattended mode for CI/CD pipelines -- **Hierarchical configuration system** with inheritance and overrides - -### What It Does - -- **Provisions Infrastructure** - Create servers, networks, storage across multiple cloud providers -- **Installs Services** - Deploy Kubernetes, containerd, databases, monitoring, and 50+ infrastructure components -- **Manages Clusters** - Orchestrate complete cluster deployments with dependency management -- **Handles Configuration** - Hierarchical configuration system with inheritance and overrides -- **Orchestrates Workflows** - Batch operations with parallel execution and checkpoint recovery -- **Manages Secrets** - SOPS/Age integration for encrypted configuration -- **Secures Infrastructure** - Enterprise security with JWT, MFA, Cedar policies, audit logging -- **Optimizes Performance** - Native plugins providing 10-50x speed improvements - ---- - -## Why Provisioning? - -### The Problems It Solves - -#### 1. **Multi-Cloud Complexity** - -**Problem**: Each cloud provider has different APIs, tools, and workflows. - -**Solution**: Unified abstraction layer with provider-agnostic interfaces. Write configuration once, deploy anywhere using Nickel schemas. - -``` -# Same configuration works on UpCloud, AWS, or local infrastructure -{ - servers = [ - { - name = "web-01" - plan = "medium" # Abstract size, provider-specific translation - provider = "upcloud" # Switch to "aws" or "local" as needed - } - ] -} -``` - -#### 2. **Dependency Hell** - -**Problem**: Infrastructure components have complex dependencies (Kubernetes needs containerd, Cilium needs Kubernetes, etc.). - -**Solution**: Automatic dependency resolution with topological sorting and health checks via Nickel schemas. - -``` -# Provisioning resolves: containerd → etcd → kubernetes → cilium -{ - taskservs = ["cilium"] # Automatically installs all dependencies -} -``` - -#### 3. **Configuration Sprawl** - -**Problem**: Environment variables, hardcoded values, scattered configuration files. - -**Solution**: Hierarchical configuration system with 476+ config accessors replacing 200+ ENV variables. - -``` -Defaults → User → Project → Infrastructure → Environment → Runtime -``` - -#### 4. **Imperative Scripts** - -**Problem**: Brittle shell scripts that don't handle failures, don't support rollback, hard to maintain. - -**Solution**: Declarative Nickel configurations with validation, type safety, lazy evaluation, and automatic rollback. - -#### 5. **Lack of Visibility** - -**Problem**: No insight into what's happening during deployment, hard to debug failures. - -**Solution**: - -- Real-time workflow monitoring -- Comprehensive logging system -- Web-based control center -- REST API for integration - -#### 6. **No Standardization** - -**Problem**: Each team builds their own deployment tools, no shared patterns. - -**Solution**: Reusable task services, cluster templates, and workflow patterns. - ---- - -## Core Concepts - -### 1. **Providers** - -Cloud infrastructure backends that handle resource provisioning. - -- **UpCloud** - Primary cloud provider -- **AWS** - Amazon Web Services integration -- **Local** - Local infrastructure (VMs, Docker, bare metal) - -Providers implement a common interface, making infrastructure code portable. - -### 2. **Task Services (TaskServs)** - -Reusable infrastructure components that can be installed on servers. - -**Categories**: - -- **Container Runtimes** - containerd, Docker, Podman, crun, runc, youki -- **Orchestration** - Kubernetes, etcd, CoreDNS -- **Networking** - Cilium, Flannel, Calico, ip-aliases -- **Storage** - Rook-Ceph, local storage -- **Databases** - PostgreSQL, Redis, SurrealDB -- **Observability** - Prometheus, Grafana, Loki -- **Security** - Webhook, KMS, Vault -- **Development** - Gitea, Radicle, ORAS - -Each task service includes: - -- Version management -- Dependency declarations -- Health checks -- Installation/uninstallation logic -- Configuration schemas - -### 3. **Clusters** - -Complete infrastructure deployments combining servers and task services. - -**Examples**: - -- **Kubernetes Cluster** - HA control plane + worker nodes + CNI + storage -- **Database Cluster** - Replicated PostgreSQL with backup -- **Build Infrastructure** - BuildKit + container registry + CI/CD - -Clusters handle: - -- Multi-node coordination -- Service distribution -- High availability -- Rolling updates - -### 4. **Workspaces** - -Isolated environments for different projects or deployment stages. - -``` -workspace_librecloud/ # Production workspace -├── infra/ # Infrastructure definitions -├── config/ # Workspace configuration -├── extensions/ # Custom modules -└── runtime/ # State and runtime data - -workspace_dev/ # Development workspace -├── infra/ -└── config/ -``` - -Switch between workspaces with single command: - -``` -provisioning workspace switch librecloud -``` - -### 5. **Workflows** - -Coordinated sequences of operations with dependency management. - -**Types**: - -- **Server Workflows** - Create/delete/update servers -- **TaskServ Workflows** - Install/remove infrastructure services -- **Cluster Workflows** - Deploy/scale complete clusters -- **Batch Workflows** - Multi-cloud parallel operations - -**Features**: - -- Dependency resolution -- Parallel execution -- Checkpoint recovery -- Automatic rollback -- Progress monitoring - ---- - -## Architecture - -### System Components - -``` -┌─────────────────────────────────────────────────────────────────┐ -│ User Interface Layer │ -│ • CLI (provisioning command) │ -│ • Web Control Center (UI) │ -│ • REST API │ -└─────────────────────────────────────────────────────────────────┘ - ↓ -┌─────────────────────────────────────────────────────────────────┐ -│ Core Engine Layer │ -│ • Command Routing & Dispatch │ -│ • Configuration Management │ -│ • Provider Abstraction │ -│ • Utility Libraries │ -└─────────────────────────────────────────────────────────────────┘ - ↓ -┌─────────────────────────────────────────────────────────────────┐ -│ Orchestration Layer │ -│ • Workflow Orchestrator (Rust/Nushell hybrid) │ -│ • Dependency Resolver │ -│ • State Manager │ -│ • Task Scheduler │ -└─────────────────────────────────────────────────────────────────┘ - ↓ -┌─────────────────────────────────────────────────────────────────┐ -│ Extension Layer │ -│ • Providers (Cloud APIs) │ -│ • Task Services (Infrastructure Components) │ -│ • Clusters (Complete Deployments) │ -│ • Workflows (Automation Templates) │ -└─────────────────────────────────────────────────────────────────┘ - ↓ -┌─────────────────────────────────────────────────────────────────┐ -│ Infrastructure Layer │ -│ • Cloud Resources (Servers, Networks, Storage) │ -│ • Kubernetes Clusters │ -│ • Running Services │ -└─────────────────────────────────────────────────────────────────┘ -``` - -### Directory Structure - -``` -project-provisioning/ -├── provisioning/ # Core provisioning system -│ ├── core/ # Core engine and libraries -│ │ ├── cli/ # Command-line interface -│ │ ├── nulib/ # Core Nushell libraries -│ │ ├── plugins/ # System plugins (Rust native) -│ │ └── scripts/ # Utility scripts -│ │ -│ ├── extensions/ # Extensible components -│ │ ├── providers/ # Cloud provider implementations -│ │ ├── taskservs/ # Infrastructure service definitions -│ │ ├── clusters/ # Complete cluster configurations -│ │ └── workflows/ # Core workflow templates -│ │ -│ ├── platform/ # Platform services -│ │ ├── orchestrator/ # Rust orchestrator service -│ │ ├── control-center/ # Web control center -│ │ ├── mcp-server/ # Model Context Protocol server -│ │ ├── api-gateway/ # REST API gateway -│ │ ├── oci-registry/ # OCI registry for extensions -│ │ └── installer/ # Platform installer (TUI + CLI) -│ │ -│ ├── schemas/ # Nickel schema definitions (PRIMARY IaC) -│ │ ├── main.ncl # Main infrastructure schema -│ │ ├── providers/ # Provider-specific schemas -│ │ ├── infrastructure/ # Infra definitions -│ │ ├── deployment/ # Deployment schemas -│ │ ├── services/ # Service schemas -│ │ ├── operations/ # Operations schemas -│ │ └── generator/ # Runtime schema generation -│ │ -│ ├── docs/ # Product documentation (mdBook) -│ ├── config/ # Configuration examples -│ ├── tools/ # Build and distribution tools -│ └── justfiles/ # Just recipes for common tasks -│ -├── workspace/ # User workspaces and data -│ ├── infra/ # Infrastructure definitions -│ ├── config/ # User configuration -│ ├── extensions/ # User extensions -│ └── runtime/ # Runtime data and state -│ -├── docs/ # Architecture & Development docs -│ ├── architecture/ # System design and ADRs -│ └── development/ # Development guidelines -│ -└── .github/ # CI/CD workflows - └── workflows/ # GitHub Actions (Rust, Nickel, Nushell) -``` - -### Platform Services - -#### 1. **Orchestrator** (`platform/orchestrator/`) - -- **Language**: Rust + Nushell -- **Purpose**: Workflow execution, task scheduling, state management -- **Features**: - - File-based persistence - - Priority processing - - Retry logic with exponential backoff - - Checkpoint-based recovery - - REST API endpoints - -#### 2. **Control Center** (`platform/control-center/`) - -- **Language**: Web UI + Backend API -- **Purpose**: Web-based infrastructure management -- **Features**: - - Dashboard views - - Real-time monitoring - - Interactive deployments - - Log viewing - -#### 3. **MCP Server** (`platform/mcp-server/`) - -- **Language**: Nushell -- **Purpose**: Model Context Protocol integration for AI assistance -- **Features**: - - 7 AI-powered settings tools - - Intelligent config completion - - Natural language infrastructure queries - -#### 4. **OCI Registry** (`platform/oci-registry/`) - -- **Purpose**: Extension distribution and versioning -- **Features**: - - Task service packages - - Provider packages - - Cluster templates - - Workflow definitions - -#### 5. **Installer** (`platform/installer/`) - -- **Language**: Rust (Ratatui TUI) + Nushell -- **Purpose**: Platform installation and setup -- **Features**: - - Interactive TUI mode - - Headless CLI mode - - Unattended CI/CD mode - - Configuration generation - ---- - -## Key Features - -### 1. **Modular CLI Architecture** (v3.2.0) - -84% code reduction with domain-driven design. - -- **Main CLI**: 211 lines (from 1,329 lines) -- **80+ shortcuts**: `s` → `server`, `t` → `taskserv`, etc. -- **Bi-directional help**: `provisioning help ws` = `provisioning ws help` -- **7 domain modules**: infrastructure, orchestration, development, workspace, configuration, utilities, generation - -### 2. **Configuration System** (v2.0.0) - -Hierarchical, config-driven architecture. - -- **476+ config accessors** replacing 200+ ENV variables -- **Hierarchical loading**: defaults → user → project → infra → env → runtime -- **Variable interpolation**: `{{paths.base}}`, `{{env.HOME}}`, `{{now.date}}` -- **Multi-format support**: TOML, YAML, KCL - -### 3. **Batch Workflow System** (v3.1.0) - -Provider-agnostic batch operations with 85-90% token efficiency. - -- **Multi-cloud support**: Mixed UpCloud + AWS + local in single workflow -- **KCL schema integration**: Type-safe workflow definitions -- **Dependency resolution**: Topological sorting with soft/hard dependencies -- **State management**: Checkpoint-based recovery with rollback -- **Real-time monitoring**: Live progress tracking - -### 4. **Hybrid Orchestrator** (v3.0.0) - -Rust/Nushell architecture solving deep call stack limitations. - -- **High-performance coordination layer** -- **File-based persistence** -- **Priority processing with retry logic** -- **REST API for external integration** -- **Comprehensive workflow system** - -### 5. **Workspace Switching** (v2.0.5) - -Centralized workspace management. - -- **Single-command switching**: `provisioning workspace switch ` -- **Automatic tracking**: Last-used timestamps, active workspace markers -- **User preferences**: Global settings across all workspaces -- **Workspace registry**: Centralized configuration in `user_config.yaml` - -### 6. **Interactive Guides** (v3.3.0) - -Step-by-step walkthroughs and quick references. - -- **Quick reference**: `provisioning sc` (fastest) -- **Complete guides**: from-scratch, update, customize -- **Copy-paste ready**: All commands include placeholders -- **Beautiful rendering**: Uses glow, bat, or less - -### 7. **Test Environment Service** (v3.4.0) - -Automated container-based testing. - -- **Three test types**: Single taskserv, server simulation, multi-node clusters -- **Topology templates**: Kubernetes HA, etcd clusters, etc. -- **Auto-cleanup**: Optional automatic cleanup after tests -- **CI/CD integration**: Easy integration into pipelines - -### 8. **Platform Installer** (v3.5.0) - -Multi-mode installation system with TUI, CLI, and unattended modes. - -- **Interactive TUI**: Beautiful Ratatui terminal UI with 7 screens -- **Headless Mode**: CLI automation for scripted installations -- **Unattended Mode**: Zero-interaction CI/CD deployments -- **Deployment Modes**: Solo (2 CPU/4GB), MultiUser (4 CPU/8GB), CICD (8 CPU/16GB), Enterprise (16 CPU/32GB) -- **MCP Integration**: 7 AI-powered settings tools for intelligent configuration - -### 9. **Version Management System** (v3.6.0) - -Centralized tool and provider version management with bash-compatible export. - -- **Unified Version Source**: All versions defined in Nickel files (`versions.ncl` and provider `version.ncl`) -- **Generated Versions File**: Bash-compatible KEY="VALUE" format for shell scripts -- **Core Tools**: NUSHELL, NICKEL, SOPS, AGE, K9S with convenient aliases (NU for NUSHELL) -- **Provider Versions**: Automatically discovers and includes all provider versions (AWS, HCLOUD, UPCTL, etc.) -- **Command**: `provisioning setup versions` generates `/provisioning/core/versions` file -- **Shell Integration**: Can be sourced directly in bash scripts: `source /provisioning/core/versions && echo $NU_VERSION` -- **Usage**: - ```bash - # Generate versions file - provisioning setup versions - - # Use in bash scripts - source /provisioning/core/versions - echo "Using Nushell version: $NU_VERSION" - echo "AWS CLI version: $PROVIDER_AWS_VERSION" - ``` - -### 10. **Nushell Plugins Integration** (v1.0.0) - -Three native Rust plugins providing 10-50x performance improvements over HTTP API. - -- **Three Native Plugins**: auth, KMS, orchestrator -- **Performance Gains**: - - KMS operations: ~5ms vs ~50ms (10x faster) - - Orchestrator queries: ~1ms vs ~30ms (30x faster) - - Auth verification: ~10ms vs ~50ms (5x faster) -- **OS-Native Keyring**: macOS Keychain, Linux Secret Service, Windows Credential Manager -- **KMS Backends**: RustyVault, Age, AWS KMS, Vault, Cosmian -- **Graceful Fallback**: Automatic fallback to HTTP if plugins not installed - -### 11. **Complete Security System** (v4.0.0) - -Enterprise-grade security with 39,699 lines across 12 components. - -- **12 Components**: JWT Auth, Cedar Authorization, MFA (TOTP + WebAuthn), Secrets Management, KMS, Audit Logging, Break-Glass, Compliance, Audit Query, Token Management, Access Control, Encryption -- **Performance**: <20ms overhead per secure operation -- **Testing**: 350+ comprehensive test cases -- **API**: 83+ REST endpoints, 111+ CLI commands -- **Standards**: GDPR, SOC2, ISO 27001 compliance -- **Key Features**: - - RS256 authentication with Argon2id hashing - - Policy-as-code with hot reload - - Multi-factor authentication (TOTP + WebAuthn/FIDO2) - - Dynamic secrets (AWS STS, SSH keys) with TTL - - 5 KMS backends with envelope encryption - - 7-year audit retention with 5 export formats - - Multi-party break-glass approval - ---- - -## Technology Stack - -### Core Technologies - -| Technology | Version | Purpose | Why | -| ------------ | --------- | --------- | ----- | -| **Nickel** | Latest | PRIMARY - Infrastructure-as-code language | Type-safe schemas, lazy evaluation, LSP support, composable records, gradual validation | -| **Nushell** | 0.109.0+ | Scripting and task automation | Structured data pipelines, cross-platform, modern built-in parsers (JSON/YAML/TOML) | -| **Rust** | Latest | Platform services (orchestrator, control-center, installer) | Performance, memory safety, concurrency, reliability | -| **KCL** | DEPRECATED | Legacy configuration (fully replaced by Nickel) | Migration bridge available; use Nickel for new work | - -### Data & State Management - -| Technology | Version | Purpose | Features | -| ------------ | --------- | --------- | ---------- | -| **SurrealDB** | Latest | High-performance graph database backend | Multi-model (document, graph, relational), real-time queries, distributed architecture, complex relationship tracking | - -### Platform Services (Rust-based) - -| Service | Purpose | Security Features | -| --------- | --------- | ------------------- | -| **Orchestrator** | Workflow execution, task scheduling, state management | File-based persistence, retry logic, checkpoint recovery | -| **Control Center** | Web-based infrastructure management | **Authorization and permissions control**, RBAC, audit logging | -| **Installer** | Platform installation (TUI + CLI modes) | Secure configuration generation, validation | -| **API Gateway** | REST API for external integration | Authentication, rate limiting, request validation | -| **MCP Server** | AI-powered configuration management | 7 settings tools, intelligent config completion | -| **OCI Registry** | Extension distribution and versioning | Task services, providers, cluster templates | - -### Security & Secrets - -| Technology | Version | Purpose | Enterprise Features | -| ------------ | --------- | --------- | --------------------- | -| **SOPS** | 3.10.2+ | Secrets management | Encrypted configuration files | -| **Age** | 1.2.1+ | Encryption | Secure key-based encryption | -| **Cosmian KMS** | Latest | Key Management System | Confidential computing, secure key storage, cloud-native KMS | -| **Cedar** | Latest | Policy engine | Fine-grained access control, policy-as-code, compliance checking, anomaly detection | -| **RustyVault** | Latest | Transit encryption engine | 5ms encryption performance, multiple KMS backends | -| **JWT** | Latest | Authentication tokens | RS256 signatures, Argon2id password hashing | -| **Keyring** | Latest | OS-native secure storage | macOS Keychain, Linux Secret Service, Windows Credential Manager | - -### Version Management - -| Component | Purpose | Format | -| ----------- | --------- | -------- | -| **versions.ncl** | Core tool versions (Nickel primary) | Nickel schema | -| **provider version.ncl** | Provider-specific versions | Nickel schema | -| **provisioning setup versions** | Version file generator | Nushell command | -| **versions file** | Bash-compatible exports | KEY="VALUE" format | - -**Usage**: -``` -# Generate versions file from Nickel schemas -provisioning setup versions - -# Source in shell scripts -source /provisioning/core/versions -echo $NU_VERSION $PROVIDER_AWS_VERSION -``` - -### Optional Tools - -| Tool | Purpose | -| ------ | --------- | -| **K9s** | Kubernetes management interface | -| **nu_plugin_tera** | Nushell plugin for Tera template rendering | -| **nu_plugin_kcl** | Nushell plugin for KCL integration (CLI required, plugin optional) | -| **nu_plugin_auth** | Authentication plugin (5x faster auth, OS keyring integration) | -| **nu_plugin_kms** | KMS encryption plugin (10x faster, 5ms encryption) | -| **nu_plugin_orchestrator** | Orchestrator plugin (30-50x faster queries) | -| **glow** | Markdown rendering for interactive guides | -| **bat** | Syntax highlighting for file viewing and guides | - ---- - -## How It Works - -### Data Flow - -``` -1. User defines infrastructure in Nickel schemas - ↓ -2. Nickel evaluates with type validation and lazy evaluation - ↓ -3. CLI loads configuration (hierarchical merging) - ↓ -4. Configuration validated against provider schemas - ↓ -5. Workflow created with operations - ↓ -6. Orchestrator receives workflow - ↓ -7. Dependencies resolved (topological sort) - ↓ -8. Operations executed in order (parallel where possible) - ↓ -9. Providers handle cloud operations - ↓ -10. Task services installed on servers - ↓ -11. State persisted and monitored -``` - -### Example Workflow: Deploy Kubernetes Cluster - -**Step 1**: Define infrastructure in Nickel - -``` -# schemas/my-cluster.ncl -{ - metadata = { - name = "my-cluster" - provider = "upcloud" - environment = "production" - } - - infrastructure = { - servers = [ - {name = "control-01", plan = "medium", role = "control"} - {name = "worker-01", plan = "large", role = "worker"} - {name = "worker-02", plan = "large", role = "worker"} - ] - } - - services = { - taskservs = ["kubernetes", "cilium", "rook-ceph"] - } -} -``` - -**Step 2**: Submit to Provisioning - -``` -provisioning server create --infra my-cluster -``` - -**Step 3**: Provisioning executes workflow - -``` -1. Create workflow: "deploy-my-cluster" -2. Resolve dependencies: - - containerd (required by kubernetes) - - etcd (required by kubernetes) - - kubernetes (explicitly requested) - - cilium (explicitly requested, requires kubernetes) - - rook-ceph (explicitly requested, requires kubernetes) - -3. Execution order: - a. Provision servers (parallel) - b. Install containerd on all nodes - c. Install etcd on control nodes - d. Install kubernetes control plane - e. Join worker nodes - f. Install Cilium CNI - g. Install Rook-Ceph storage - -4. Checkpoint after each step -5. Monitor health checks -6. Report completion -``` - -**Step 4**: Verify deployment - -``` -provisioning cluster status my-cluster -``` - -### Configuration Hierarchy - -Configuration values are resolved through a hierarchy: - -``` -1. System Defaults (provisioning/config/config.defaults.toml) - ↓ (overridden by) -2. User Preferences (~/.config/provisioning/user_config.yaml) - ↓ (overridden by) -3. Workspace Config (workspace/config/provisioning.yaml) - ↓ (overridden by) -4. Infrastructure Config (workspace/infra//config.toml) - ↓ (overridden by) -5. Environment Config (workspace/config/prod-defaults.toml) - ↓ (overridden by) -6. Runtime Flags (--flag value) -``` - -**Example**: - -``` -# System default -[servers] -default_plan = "small" - -# User preference -[servers] -default_plan = "medium" # Overrides system default - -# Infrastructure config -[servers] -default_plan = "large" # Overrides user preference - -# Runtime -provisioning server create --plan xlarge # Overrides everything -``` - ---- - -## Use Cases - -### 1. **Multi-Cloud Kubernetes Deployment** - -Deploy Kubernetes clusters across different cloud providers with identical configuration. - -``` -# UpCloud cluster -provisioning cluster create k8s-prod --provider upcloud - -# AWS cluster (same config) -provisioning cluster create k8s-prod --provider aws -``` - -### 2. **Development → Staging → Production Pipeline** - -Manage multiple environments with workspace switching. - -``` -# Development -provisioning workspace switch dev -provisioning cluster create app-stack - -# Staging (same config, different resources) -provisioning workspace switch staging -provisioning cluster create app-stack - -# Production (HA, larger resources) -provisioning workspace switch prod -provisioning cluster create app-stack -``` - -### 3. **Infrastructure as Code Testing** - -Test infrastructure changes before deploying to production. - -``` -# Test Kubernetes upgrade locally -provisioning test topology load kubernetes_3node | \ - test env cluster kubernetes --version 1.29.0 - -# Verify functionality -provisioning test env run - -# Cleanup -provisioning test env cleanup -``` - -### 4. **Batch Multi-Region Deployment** - -Deploy to multiple regions in parallel using Nickel batch workflows. - -``` -# schemas/batch/multi-region.ncl -{ - batch_workflow = { - operations = [ - { - id = "eu-cluster" - type = "cluster" - region = "eu-west-1" - cluster = "app-stack" - } - { - id = "us-cluster" - type = "cluster" - region = "us-east-1" - cluster = "app-stack" - } - { - id = "asia-cluster" - type = "cluster" - region = "ap-south-1" - cluster = "app-stack" - } - ] - parallel_limit = 3 # All at once - } -} -``` - -``` -provisioning batch submit schemas/batch/multi-region.ncl -provisioning batch monitor -``` - -### 5. **Automated Disaster Recovery** - -Recreate infrastructure from configuration. - -``` -# Infrastructure destroyed -provisioning workspace switch prod - -# Recreate from config -provisioning cluster create --infra backup-restore --wait - -# All services restored with same configuration -``` - -### 6. **CI/CD Integration** - -Automated testing and deployment pipelines. - -``` -# .gitlab-ci.yml -test-infrastructure: - script: - - provisioning test quick kubernetes - - provisioning test quick postgres - -deploy-staging: - script: - - provisioning workspace switch staging - - provisioning cluster create app-stack --check - - provisioning cluster create app-stack --yes - -deploy-production: - when: manual - script: - - provisioning workspace switch prod - - provisioning cluster create app-stack --yes -``` - ---- - -## Getting Started - -### Quick Start - -1. **Install Prerequisites** - - ```bash - # Install Nushell (0.109.0+) - brew install nushell # macOS - - # Install Nickel (required for IaC) - brew install nickel # macOS or from source - - # Install SOPS (optional, for encrypted secrets) - brew install sops - ``` - -2. **Add CLI to PATH** - - ```bash - ln -sf "$(pwd)/provisioning/core/cli/provisioning" /usr/local/bin/provisioning - ``` - -3. **Initialize Workspace** - - ```bash - provisioning workspace init my-project - cd my-project - ``` - -3.5. **Generate Versions File** (Optional - for bash scripts) - - ```bash - provisioning setup versions - # Creates /provisioning/core/versions with all tool and provider versions - - # Use in your deployment scripts - source /provisioning/core/versions - echo "Deploying with Nushell $NU_VERSION and AWS CLI $PROVIDER_AWS_VERSION" - ``` - -4. **Define Infrastructure (Nickel)** - - ```bash - # Create workspace infrastructure schema - cat > workspace/infra/my-cluster.ncl <<'EOF' - { - metadata.name = "my-cluster" - metadata.provider = "upcloud" - - infrastructure.servers = [ - {name = "control-01", plan = "medium"} - {name = "worker-01", plan = "large"} - ] - - services.taskservs = ["kubernetes", "cilium"] - } - EOF - ``` - -5. **Deploy Infrastructure** - - ```bash - # Validate configuration - provisioning config validate - - # Check what will be created - provisioning server create --check - - # Create servers - provisioning server create --yes - - # Install Kubernetes - provisioning taskserv create kubernetes - ``` - -### Learning Path - -1. **Start with Guides** - - ```bash - provisioning sc # Quick reference - provisioning guide from-scratch # Complete walkthrough - ``` - -2. **Explore Examples** - - ```bash - ls provisioning/examples/ - ``` - -3. **Read Architecture Docs** - - [Core Engine](provisioning/core/README.md) - - [CLI Architecture](.claude/features/cli-architecture.md) - - [Configuration System](.claude/features/configuration-system.md) - - [Batch Workflows](.claude/features/batch-workflow-system.md) - -4. **Try Test Environments** - - ```bash - provisioning test quick kubernetes - provisioning test quick postgres - ``` - -5. **Build Custom Extensions** - - Create custom task services - - Define cluster templates - - Write workflow automation - ---- - -## Documentation Index - -### User & Operations Guides - -See **[provisioning/docs/src/](provisioning/docs/src/)** for comprehensive documentation: - -- **Quick Start** - Get started in 10 minutes -- **Command Reference** - Complete CLI command reference -- **Nickel Configuration Guide** - IaC language and patterns -- **Workspace Management** - Multi-workspace guide -- **Test Environment Guide** - Testing infrastructure with containers -- **Plugin Integration** - Native Rust plugins (10-50x faster) -- **Security System** - Authentication, MFA, KMS, Cedar policies -- **Operations** - Deployment, monitoring, incident response - -### Architecture & Design Decisions - -See **[docs/src/architecture/](docs/src/architecture/)** for design patterns: - -- **System Architecture** - Multi-layer design -- **ADRs (Architecture Decision Records)** - Major decisions including: - - ADR-011: Nickel Migration (from KCL) - - ADR-012: Nushell + Nickel plugin wrapper - - ADR-010: Configuration format strategy -- **Multi-Repo Strategy** - Repository organization -- **Integration Patterns** - How components interact - -### Development Guidelines - -- **[Repository Structure](docs/src/development/)** - Codebase organization -- **[Contributing Guide](CONTRIBUTING.md)** - How to contribute -- **[Nushell Guidelines](.claude/guidelines/nushell/)** - Best practices -- **[Nickel Guidelines](.claude/guidelines/nickel.md)** - IaC patterns -- **[Rust Guidelines](.claude/guidelines/rust/)** - Rust conventions - -### API Reference - -- **REST API** - HTTP endpoints in `provisioning/docs/src/api-reference/` -- **Nushell API** - Library functions and modules -- **Provider API** - Cloud provider interface specification - ---- - -## Project Status - -**Current Version**: v5.0.0-nickel (Production Ready) | **Date**: 2026-01-08 - -### Completed Milestones - -- ✅ **v5.0.0** (2026-01-08) - **Nickel IaC Migration Complete** - - Full KCL→Nickel migration - - Schema-driven configuration system - - Type-safe lazy evaluation - - ~220 legacy files removed, ~250 new schema files added - -- ✅ **v3.6.0** (2026-01-08) - Version Management System - - Centralized tool and provider version management - - Bash-compatible versions file generation - - `provisioning setup versions` command - - Automatic provider version discovery from Nickel schemas - - Shell script integration with sourcing support - -- ✅ **v4.0.0** (2025-10-09) - Complete Security System (12 components, 39,699 lines) -- ✅ **v3.5.0** (2025-10-07) - Platform Installer with TUI and CI/CD modes -- ✅ **v3.4.0** (2025-10-06) - Test Environment Service with container management -- ✅ **v3.3.0** (2025-09-30) - Interactive Guides system -- ✅ **v3.2.0** (2025-09-30) - Modular CLI Architecture (84% code reduction) -- ✅ **v3.1.0** (2025-09-25) - Batch Workflow System (85-90% token efficiency) -- ✅ **v3.0.0** (2025-09-25) - Hybrid Orchestrator (Rust/Nushell) -- ✅ **v2.0.5** (2025-10-02) - Workspace Switching system -- ✅ **v2.0.0** (2025-09-23) - Configuration System (476+ accessors) -- ✅ **v1.0.0** (2025-10-09) - Nushell Plugins Integration (10-50x performance) - -### Current Focus - -- **Nickel Ecosystem** - IDE support, LSP integration, schema libraries -- **Platform Consolidation** - GitHub Actions CI/CD, cross-platform testing -- **Extension Registry** - OCI-based distribution for task services and providers -- **Documentation** - Complete Nickel migration guides, ADR updates - ---- - -## Support and Community - -### Getting Help - -- **Documentation**: Start with `provisioning help` or `provisioning guide from-scratch` -- **Issues**: Report bugs and request features on the issue tracker -- **Discussions**: Join community discussions for questions and ideas - -### Contributing - -Contributions are welcome! See [CONTRIBUTING.md](docs/development/CONTRIBUTING.md) for guidelines. - -**Key areas for contribution**: - -- New task service definitions -- Cloud provider implementations -- Cluster templates -- Documentation improvements -- Bug fixes and testing - ---- - -## License - -See [LICENSE](LICENSE) file in project root. - ---- - -**Maintained By**: Architecture Team -**Last Updated**: 2026-01-08 (Version Management System v3.6.0 + Nickel v5.0.0 Migration Complete) -**Current Branch**: nickel -**Project Home**: [provisioning/](provisioning/) - ---- - -## Recent Changes (2026-01-08) - -### Version Management System (v3.6.0) - -**What Changed**: -- ✅ Implemented `provisioning setup versions` command -- ✅ Generates bash-compatible `/provisioning/core/versions` file -- ✅ Automatically discovers and includes all provider versions from Nickel schemas -- ✅ Fixed to remove redundant metadata (all sources are Nickel) -- ✅ Core tools with aliases: NUSHELL→NU, NICKEL, SOPS, AGE, K9S -- ✅ Shell script integration: `source /provisioning/core/versions && echo $NU_VERSION` - -**Files Modified**: -- `provisioning/core/nulib/lib_provisioning/setup/utils.nu` - Core implementation -- `provisioning/core/nulib/main_provisioning/commands/setup.nu` - Command routing -- `provisioning/core/nulib/lib_provisioning/workspace/enforcement.nu` - Workspace exemption -- `provisioning/README.md` - Documentation updates - -**Generated File Example**: -``` -NUSHELL_VERSION="0.109.1" -NUSHELL_SOURCE="https://github.com/nushell/nushell/releases" -NU_VERSION="0.109.1" -NU_SOURCE="https://github.com/nushell/nushell/releases" - -NICKEL_VERSION="1.15.1" -NICKEL_SOURCE="https://github.com/tweag/nickel/releases" - -PROVIDER_AWS_VERSION="2.32.11" -PROVIDER_AWS_SOURCE="https://github.com/aws/aws-cli/releases" -# ... and more providers -``` - -**Key Improvements**: -- Clean metadata (no redundant `_LIB` fields - all sources are Nickel) -- Automatic provider discovery from `extensions/providers/*/nickel/version.ncl` -- Direct Nickel file parsing with JSON export -- Zero dependency on environment variables or legacy systems -- 100% bash/shell compatible for deployment scripts +

\n Provisioning Logo\n

\n

\n Provisioning\n

\n\n# Provisioning - Infrastructure Automation Platform\n\n> **A modular, declarative Infrastructure as Code (IaC) platform for managing complete infrastructure lifecycles**\n\n## Table of Contents\n\n- [What is Provisioning?](#what-is-provisioning)\n- [Why Provisioning?](#why-provisioning)\n- [Core Concepts](#core-concepts)\n- [Architecture](#architecture)\n- [Key Features](#key-features)\n- [Technology Stack](#technology-stack)\n- [How It Works](#how-it-works)\n- [Use Cases](#use-cases)\n- [Getting Started](#getting-started)\n\n---\n\n## What is Provisioning?\n\n**Provisioning** is a comprehensive **Infrastructure as Code (IaC)** platform designed to manage\ncomplete infrastructure lifecycles: cloud providers, infrastructure services, clusters,\nand isolated workspaces across multiple cloud/local environments.\n\nExtensible and customizable by design, it delivers type-safe, configuration-driven workflows\nwith enterprise security (encrypted configuration, Cosmian KMS integration, Cedar policy engine,\nsecrets management, authorization and permissions control, compliance checking, anomaly detection)\nand adaptable deployment modes (interactive UI, CLI automation, unattended CI/CD)\nsuitable for any scale from development to production.\n\n### Technical Definition\n\nDeclarative Infrastructure as Code (IaC) platform providing:\n\n- **Type-safe, configuration-driven workflows** with schema validation and constraint checking\n- **Modular, extensible architecture**: cloud providers, task services, clusters, workspaces\n- **Multi-cloud abstraction layer** with unified API (UpCloud, AWS, local infrastructure)\n- **High-performance state management**:\n - Graph database backend for complex relationships\n - Real-time state tracking and queries\n - Multi-model data storage (document, graph, relational)\n- **Enterprise security stack**:\n - Encrypted configuration and secrets management\n - Cosmian KMS integration for confidential key management\n - Cedar policy engine for fine-grained access control\n - Authorization and permissions control via platform services\n - Compliance checking and policy enforcement\n - Anomaly detection for security monitoring\n - Audit logging and compliance tracking\n- **Hybrid orchestration**: Rust-based performance layer + scripting flexibility\n- **Production-ready features**:\n - Batch workflows with dependency resolution\n - Checkpoint recovery and automatic rollback\n - Parallel execution with state management\n- **Adaptable deployment modes**:\n - Interactive TUI for guided setup\n - Headless CLI for scripted automation\n - Unattended mode for CI/CD pipelines\n- **Hierarchical configuration system** with inheritance and overrides\n\n### What It Does\n\n- **Provisions Infrastructure** - Create servers, networks, storage across multiple cloud providers\n- **Installs Services** - Deploy Kubernetes, containerd, databases, monitoring, and 50+ infrastructure components\n- **Manages Clusters** - Orchestrate complete cluster deployments with dependency management\n- **Handles Configuration** - Hierarchical configuration system with inheritance and overrides\n- **Orchestrates Workflows** - Batch operations with parallel execution and checkpoint recovery\n- **Manages Secrets** - SOPS/Age integration for encrypted configuration\n- **Secures Infrastructure** - Enterprise security with JWT, MFA, Cedar policies, audit logging\n- **Optimizes Performance** - Native plugins providing 10-50x speed improvements\n\n---\n\n## Why Provisioning?\n\n### The Problems It Solves\n\n#### 1. **Multi-Cloud Complexity**\n\n**Problem**: Each cloud provider has different APIs, tools, and workflows.\n\n**Solution**: Unified abstraction layer with provider-agnostic interfaces. Write configuration once, deploy anywhere using Nickel schemas.\n\n```\n# Same configuration works on UpCloud, AWS, or local infrastructure\n{\n servers = [\n {\n name = "web-01"\n plan = "medium" # Abstract size, provider-specific translation\n provider = "upcloud" # Switch to "aws" or "local" as needed\n }\n ]\n}\n```\n\n#### 2. **Dependency Hell**\n\n**Problem**: Infrastructure components have complex dependencies (Kubernetes needs containerd, Cilium needs Kubernetes, etc.).\n\n**Solution**: Automatic dependency resolution with topological sorting and health checks via Nickel schemas.\n\n```\n# Provisioning resolves: containerd → etcd → kubernetes → cilium\n{\n taskservs = ["cilium"] # Automatically installs all dependencies\n}\n```\n\n#### 3. **Configuration Sprawl**\n\n**Problem**: Environment variables, hardcoded values, scattered configuration files.\n\n**Solution**: Hierarchical configuration system with 476+ config accessors replacing 200+ ENV variables.\n\n```\nDefaults → User → Project → Infrastructure → Environment → Runtime\n```\n\n#### 4. **Imperative Scripts**\n\n**Problem**: Brittle shell scripts that don't handle failures, don't support rollback, hard to maintain.\n\n**Solution**: Declarative Nickel configurations with validation, type safety, lazy evaluation, and automatic rollback.\n\n#### 5. **Lack of Visibility**\n\n**Problem**: No insight into what's happening during deployment, hard to debug failures.\n\n**Solution**:\n\n- Real-time workflow monitoring\n- Comprehensive logging system\n- Web-based control center\n- REST API for integration\n\n#### 6. **No Standardization**\n\n**Problem**: Each team builds their own deployment tools, no shared patterns.\n\n**Solution**: Reusable task services, cluster templates, and workflow patterns.\n\n---\n\n## Core Concepts\n\n### 1. **Providers**\n\nCloud infrastructure backends that handle resource provisioning.\n\n- **UpCloud** - Primary cloud provider\n- **AWS** - Amazon Web Services integration\n- **Local** - Local infrastructure (VMs, Docker, bare metal)\n\nProviders implement a common interface, making infrastructure code portable.\n\n### 2. **Task Services (TaskServs)**\n\nReusable infrastructure components that can be installed on servers.\n\n**Categories**:\n\n- **Container Runtimes** - containerd, Docker, Podman, crun, runc, youki\n- **Orchestration** - Kubernetes, etcd, CoreDNS\n- **Networking** - Cilium, Flannel, Calico, ip-aliases\n- **Storage** - Rook-Ceph, local storage\n- **Databases** - PostgreSQL, Redis, SurrealDB\n- **Observability** - Prometheus, Grafana, Loki\n- **Security** - Webhook, KMS, Vault\n- **Development** - Gitea, Radicle, ORAS\n\nEach task service includes:\n\n- Version management\n- Dependency declarations\n- Health checks\n- Installation/uninstallation logic\n- Configuration schemas\n\n### 3. **Clusters**\n\nComplete infrastructure deployments combining servers and task services.\n\n**Examples**:\n\n- **Kubernetes Cluster** - HA control plane + worker nodes + CNI + storage\n- **Database Cluster** - Replicated PostgreSQL with backup\n- **Build Infrastructure** - BuildKit + container registry + CI/CD\n\nClusters handle:\n\n- Multi-node coordination\n- Service distribution\n- High availability\n- Rolling updates\n\n### 4. **Workspaces**\n\nIsolated environments for different projects or deployment stages.\n\n```\nworkspace_librecloud/ # Production workspace\n├── infra/ # Infrastructure definitions\n├── config/ # Workspace configuration\n├── extensions/ # Custom modules\n└── runtime/ # State and runtime data\n\nworkspace_dev/ # Development workspace\n├── infra/\n└── config/\n```\n\nSwitch between workspaces with single command:\n\n```\nprovisioning workspace switch librecloud\n```\n\n### 5. **Workflows**\n\nCoordinated sequences of operations with dependency management.\n\n**Types**:\n\n- **Server Workflows** - Create/delete/update servers\n- **TaskServ Workflows** - Install/remove infrastructure services\n- **Cluster Workflows** - Deploy/scale complete clusters\n- **Batch Workflows** - Multi-cloud parallel operations\n\n**Features**:\n\n- Dependency resolution\n- Parallel execution\n- Checkpoint recovery\n- Automatic rollback\n- Progress monitoring\n\n---\n\n## Architecture\n\n### System Components\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│ User Interface Layer │\n│ • CLI (provisioning command) │\n│ • Web Control Center (UI) │\n│ • REST API │\n└─────────────────────────────────────────────────────────────────┘\n ↓\n┌─────────────────────────────────────────────────────────────────┐\n│ Core Engine Layer │\n│ • Command Routing & Dispatch │\n│ • Configuration Management │\n│ • Provider Abstraction │\n│ • Utility Libraries │\n└─────────────────────────────────────────────────────────────────┘\n ↓\n┌─────────────────────────────────────────────────────────────────┐\n│ Orchestration Layer │\n│ • Workflow Orchestrator (Rust/Nushell hybrid) │\n│ • Dependency Resolver │\n│ • State Manager │\n│ • Task Scheduler │\n└─────────────────────────────────────────────────────────────────┘\n ↓\n┌─────────────────────────────────────────────────────────────────┐\n│ Extension Layer │\n│ • Providers (Cloud APIs) │\n│ • Task Services (Infrastructure Components) │\n│ • Clusters (Complete Deployments) │\n│ • Workflows (Automation Templates) │\n└─────────────────────────────────────────────────────────────────┘\n ↓\n┌─────────────────────────────────────────────────────────────────┐\n│ Infrastructure Layer │\n│ • Cloud Resources (Servers, Networks, Storage) │\n│ • Kubernetes Clusters │\n│ • Running Services │\n└─────────────────────────────────────────────────────────────────┘\n```\n\n### Directory Structure\n\n```\nproject-provisioning/\n├── provisioning/ # Core provisioning system\n│ ├── core/ # Core engine and libraries\n│ │ ├── cli/ # Command-line interface\n│ │ ├── nulib/ # Core Nushell libraries\n│ │ ├── plugins/ # System plugins (Rust native)\n│ │ └── scripts/ # Utility scripts\n│ │\n│ ├── extensions/ # Extensible components\n│ │ ├── providers/ # Cloud provider implementations\n│ │ ├── taskservs/ # Infrastructure service definitions\n│ │ ├── clusters/ # Complete cluster configurations\n│ │ └── workflows/ # Core workflow templates\n│ │\n│ ├── platform/ # Platform services\n│ │ ├── orchestrator/ # Rust orchestrator service\n│ │ ├── control-center/ # Web control center\n│ │ ├── mcp-server/ # Model Context Protocol server\n│ │ ├── api-gateway/ # REST API gateway\n│ │ ├── oci-registry/ # OCI registry for extensions\n│ │ └── installer/ # Platform installer (TUI + CLI)\n│ │\n│ ├── schemas/ # Nickel schema definitions (PRIMARY IaC)\n│ │ ├── main.ncl # Main infrastructure schema\n│ │ ├── providers/ # Provider-specific schemas\n│ │ ├── infrastructure/ # Infra definitions\n│ │ ├── deployment/ # Deployment schemas\n│ │ ├── services/ # Service schemas\n│ │ ├── operations/ # Operations schemas\n│ │ └── generator/ # Runtime schema generation\n│ │\n│ ├── docs/ # Product documentation (mdBook)\n│ ├── config/ # Configuration examples\n│ ├── tools/ # Build and distribution tools\n│ └── justfiles/ # Just recipes for common tasks\n│\n├── workspace/ # User workspaces and data\n│ ├── infra/ # Infrastructure definitions\n│ ├── config/ # User configuration\n│ ├── extensions/ # User extensions\n│ └── runtime/ # Runtime data and state\n│\n├── docs/ # Architecture & Development docs\n│ ├── architecture/ # System design and ADRs\n│ └── development/ # Development guidelines\n│\n└── .github/ # CI/CD workflows\n └── workflows/ # GitHub Actions (Rust, Nickel, Nushell)\n```\n\n### Platform Services\n\n#### 1. **Orchestrator** (`platform/orchestrator/`)\n\n- **Language**: Rust + Nushell\n- **Purpose**: Workflow execution, task scheduling, state management\n- **Features**:\n - File-based persistence\n - Priority processing\n - Retry logic with exponential backoff\n - Checkpoint-based recovery\n - REST API endpoints\n\n#### 2. **Control Center** (`platform/control-center/`)\n\n- **Language**: Web UI + Backend API\n- **Purpose**: Web-based infrastructure management\n- **Features**:\n - Dashboard views\n - Real-time monitoring\n - Interactive deployments\n - Log viewing\n\n#### 3. **MCP Server** (`platform/mcp-server/`)\n\n- **Language**: Nushell\n- **Purpose**: Model Context Protocol integration for AI assistance\n- **Features**:\n - 7 AI-powered settings tools\n - Intelligent config completion\n - Natural language infrastructure queries\n\n#### 4. **OCI Registry** (`platform/oci-registry/`)\n\n- **Purpose**: Extension distribution and versioning\n- **Features**:\n - Task service packages\n - Provider packages\n - Cluster templates\n - Workflow definitions\n\n#### 5. **Installer** (`platform/installer/`)\n\n- **Language**: Rust (Ratatui TUI) + Nushell\n- **Purpose**: Platform installation and setup\n- **Features**:\n - Interactive TUI mode\n - Headless CLI mode\n - Unattended CI/CD mode\n - Configuration generation\n\n---\n\n## Key Features\n\n### 1. **Modular CLI Architecture** (v3.2.0)\n\n84% code reduction with domain-driven design.\n\n- **Main CLI**: 211 lines (from 1,329 lines)\n- **80+ shortcuts**: `s` → `server`, `t` → `taskserv`, etc.\n- **Bi-directional help**: `provisioning help ws` = `provisioning ws help`\n- **7 domain modules**: infrastructure, orchestration, development, workspace, configuration, utilities, generation\n\n### 2. **Configuration System** (v2.0.0)\n\nHierarchical, config-driven architecture.\n\n- **476+ config accessors** replacing 200+ ENV variables\n- **Hierarchical loading**: defaults → user → project → infra → env → runtime\n- **Variable interpolation**: `{{paths.base}}`, `{{env.HOME}}`, `{{now.date}}`\n- **Multi-format support**: TOML, YAML, KCL\n\n### 3. **Batch Workflow System** (v3.1.0)\n\nProvider-agnostic batch operations with 85-90% token efficiency.\n\n- **Multi-cloud support**: Mixed UpCloud + AWS + local in single workflow\n- **KCL schema integration**: Type-safe workflow definitions\n- **Dependency resolution**: Topological sorting with soft/hard dependencies\n- **State management**: Checkpoint-based recovery with rollback\n- **Real-time monitoring**: Live progress tracking\n\n### 4. **Hybrid Orchestrator** (v3.0.0)\n\nRust/Nushell architecture solving deep call stack limitations.\n\n- **High-performance coordination layer**\n- **File-based persistence**\n- **Priority processing with retry logic**\n- **REST API for external integration**\n- **Comprehensive workflow system**\n\n### 5. **Workspace Switching** (v2.0.5)\n\nCentralized workspace management.\n\n- **Single-command switching**: `provisioning workspace switch `\n- **Automatic tracking**: Last-used timestamps, active workspace markers\n- **User preferences**: Global settings across all workspaces\n- **Workspace registry**: Centralized configuration in `user_config.yaml`\n\n### 6. **Interactive Guides** (v3.3.0)\n\nStep-by-step walkthroughs and quick references.\n\n- **Quick reference**: `provisioning sc` (fastest)\n- **Complete guides**: from-scratch, update, customize\n- **Copy-paste ready**: All commands include placeholders\n- **Beautiful rendering**: Uses glow, bat, or less\n\n### 7. **Test Environment Service** (v3.4.0)\n\nAutomated container-based testing.\n\n- **Three test types**: Single taskserv, server simulation, multi-node clusters\n- **Topology templates**: Kubernetes HA, etcd clusters, etc.\n- **Auto-cleanup**: Optional automatic cleanup after tests\n- **CI/CD integration**: Easy integration into pipelines\n\n### 8. **Platform Installer** (v3.5.0)\n\nMulti-mode installation system with TUI, CLI, and unattended modes.\n\n- **Interactive TUI**: Beautiful Ratatui terminal UI with 7 screens\n- **Headless Mode**: CLI automation for scripted installations\n- **Unattended Mode**: Zero-interaction CI/CD deployments\n- **Deployment Modes**: Solo (2 CPU/4GB), MultiUser (4 CPU/8GB), CICD (8 CPU/16GB), Enterprise (16 CPU/32GB)\n- **MCP Integration**: 7 AI-powered settings tools for intelligent configuration\n\n### 9. **Version Management System** (v3.6.0)\n\nCentralized tool and provider version management with bash-compatible export.\n\n- **Unified Version Source**: All versions defined in Nickel files (`versions.ncl` and provider `version.ncl`)\n- **Generated Versions File**: Bash-compatible KEY="VALUE" format for shell scripts\n- **Core Tools**: NUSHELL, NICKEL, SOPS, AGE, K9S with convenient aliases (NU for NUSHELL)\n- **Provider Versions**: Automatically discovers and includes all provider versions (AWS, HCLOUD, UPCTL, etc.)\n- **Command**: `provisioning setup versions` generates `/provisioning/core/versions` file\n- **Shell Integration**: Can be sourced directly in bash scripts: `source /provisioning/core/versions && echo $NU_VERSION`\n- **Usage**:\n ```bash\n # Generate versions file\n provisioning setup versions\n\n # Use in bash scripts\n source /provisioning/core/versions\n echo "Using Nushell version: $NU_VERSION"\n echo "AWS CLI version: $PROVIDER_AWS_VERSION"\n ```\n\n### 10. **Nushell Plugins Integration** (v1.0.0)\n\nThree native Rust plugins providing 10-50x performance improvements over HTTP API.\n\n- **Three Native Plugins**: auth, KMS, orchestrator\n- **Performance Gains**:\n - KMS operations: ~5ms vs ~50ms (10x faster)\n - Orchestrator queries: ~1ms vs ~30ms (30x faster)\n - Auth verification: ~10ms vs ~50ms (5x faster)\n- **OS-Native Keyring**: macOS Keychain, Linux Secret Service, Windows Credential Manager\n- **KMS Backends**: RustyVault, Age, AWS KMS, Vault, Cosmian\n- **Graceful Fallback**: Automatic fallback to HTTP if plugins not installed\n\n### 11. **Complete Security System** (v4.0.0)\n\nEnterprise-grade security with 39,699 lines across 12 components.\n\n- **12 Components**: JWT Auth, Cedar Authorization, MFA (TOTP + WebAuthn), Secrets Management, KMS, Audit Logging, Break-Glass, Compliance, Audit Query, Token Management, Access Control, Encryption\n- **Performance**: <20ms overhead per secure operation\n- **Testing**: 350+ comprehensive test cases\n- **API**: 83+ REST endpoints, 111+ CLI commands\n- **Standards**: GDPR, SOC2, ISO 27001 compliance\n- **Key Features**:\n - RS256 authentication with Argon2id hashing\n - Policy-as-code with hot reload\n - Multi-factor authentication (TOTP + WebAuthn/FIDO2)\n - Dynamic secrets (AWS STS, SSH keys) with TTL\n - 5 KMS backends with envelope encryption\n - 7-year audit retention with 5 export formats\n - Multi-party break-glass approval\n\n---\n\n## Technology Stack\n\n### Core Technologies\n\n| Technology | Version | Purpose | Why |\n| ------------ | --------- | --------- | ----- |\n| **Nickel** | Latest | PRIMARY - Infrastructure-as-code language | Type-safe schemas, lazy evaluation, LSP support, composable records, gradual validation |\n| **Nushell** | 0.109.0+ | Scripting and task automation | Structured data pipelines, cross-platform, modern built-in parsers (JSON/YAML/TOML) |\n| **Rust** | Latest | Platform services (orchestrator, control-center, installer) | Performance, memory safety, concurrency, reliability |\n| **KCL** | DEPRECATED | Legacy configuration (fully replaced by Nickel) | Migration bridge available; use Nickel for new work |\n\n### Data & State Management\n\n| Technology | Version | Purpose | Features |\n| ------------ | --------- | --------- | ---------- |\n| **SurrealDB** | Latest | High-performance graph database backend | Multi-model (document, graph, relational), real-time queries, distributed architecture, complex relationship tracking |\n\n### Platform Services (Rust-based)\n\n| Service | Purpose | Security Features |\n| --------- | --------- | ------------------- |\n| **Orchestrator** | Workflow execution, task scheduling, state management | File-based persistence, retry logic, checkpoint recovery |\n| **Control Center** | Web-based infrastructure management | **Authorization and permissions control**, RBAC, audit logging |\n| **Installer** | Platform installation (TUI + CLI modes) | Secure configuration generation, validation |\n| **API Gateway** | REST API for external integration | Authentication, rate limiting, request validation |\n| **MCP Server** | AI-powered configuration management | 7 settings tools, intelligent config completion |\n| **OCI Registry** | Extension distribution and versioning | Task services, providers, cluster templates |\n\n### Security & Secrets\n\n| Technology | Version | Purpose | Enterprise Features |\n| ------------ | --------- | --------- | --------------------- |\n| **SOPS** | 3.10.2+ | Secrets management | Encrypted configuration files |\n| **Age** | 1.2.1+ | Encryption | Secure key-based encryption |\n| **Cosmian KMS** | Latest | Key Management System | Confidential computing, secure key storage, cloud-native KMS |\n| **Cedar** | Latest | Policy engine | Fine-grained access control, policy-as-code, compliance checking, anomaly detection |\n| **RustyVault** | Latest | Transit encryption engine | 5ms encryption performance, multiple KMS backends |\n| **JWT** | Latest | Authentication tokens | RS256 signatures, Argon2id password hashing |\n| **Keyring** | Latest | OS-native secure storage | macOS Keychain, Linux Secret Service, Windows Credential Manager |\n\n### Version Management\n\n| Component | Purpose | Format |\n| ----------- | --------- | -------- |\n| **versions.ncl** | Core tool versions (Nickel primary) | Nickel schema |\n| **provider version.ncl** | Provider-specific versions | Nickel schema |\n| **provisioning setup versions** | Version file generator | Nushell command |\n| **versions file** | Bash-compatible exports | KEY="VALUE" format |\n\n**Usage**:\n```\n# Generate versions file from Nickel schemas\nprovisioning setup versions\n\n# Source in shell scripts\nsource /provisioning/core/versions\necho $NU_VERSION $PROVIDER_AWS_VERSION\n```\n\n### Optional Tools\n\n| Tool | Purpose |\n| ------ | --------- |\n| **K9s** | Kubernetes management interface |\n| **nu_plugin_tera** | Nushell plugin for Tera template rendering |\n| **nu_plugin_kcl** | Nushell plugin for KCL integration (CLI required, plugin optional) |\n| **nu_plugin_auth** | Authentication plugin (5x faster auth, OS keyring integration) |\n| **nu_plugin_kms** | KMS encryption plugin (10x faster, 5ms encryption) |\n| **nu_plugin_orchestrator** | Orchestrator plugin (30-50x faster queries) |\n| **glow** | Markdown rendering for interactive guides |\n| **bat** | Syntax highlighting for file viewing and guides |\n\n---\n\n## How It Works\n\n### Data Flow\n\n```\n1. User defines infrastructure in Nickel schemas\n ↓\n2. Nickel evaluates with type validation and lazy evaluation\n ↓\n3. CLI loads configuration (hierarchical merging)\n ↓\n4. Configuration validated against provider schemas\n ↓\n5. Workflow created with operations\n ↓\n6. Orchestrator receives workflow\n ↓\n7. Dependencies resolved (topological sort)\n ↓\n8. Operations executed in order (parallel where possible)\n ↓\n9. Providers handle cloud operations\n ↓\n10. Task services installed on servers\n ↓\n11. State persisted and monitored\n```\n\n### Example Workflow: Deploy Kubernetes Cluster\n\n**Step 1**: Define infrastructure in Nickel\n\n```\n# schemas/my-cluster.ncl\n{\n metadata = {\n name = "my-cluster"\n provider = "upcloud"\n environment = "production"\n }\n\n infrastructure = {\n servers = [\n {name = "control-01", plan = "medium", role = "control"}\n {name = "worker-01", plan = "large", role = "worker"}\n {name = "worker-02", plan = "large", role = "worker"}\n ]\n }\n\n services = {\n taskservs = ["kubernetes", "cilium", "rook-ceph"]\n }\n}\n```\n\n**Step 2**: Submit to Provisioning\n\n```\nprovisioning server create --infra my-cluster\n```\n\n**Step 3**: Provisioning executes workflow\n\n```\n1. Create workflow: "deploy-my-cluster"\n2. Resolve dependencies:\n - containerd (required by kubernetes)\n - etcd (required by kubernetes)\n - kubernetes (explicitly requested)\n - cilium (explicitly requested, requires kubernetes)\n - rook-ceph (explicitly requested, requires kubernetes)\n\n3. Execution order:\n a. Provision servers (parallel)\n b. Install containerd on all nodes\n c. Install etcd on control nodes\n d. Install kubernetes control plane\n e. Join worker nodes\n f. Install Cilium CNI\n g. Install Rook-Ceph storage\n\n4. Checkpoint after each step\n5. Monitor health checks\n6. Report completion\n```\n\n**Step 4**: Verify deployment\n\n```\nprovisioning cluster status my-cluster\n```\n\n### Configuration Hierarchy\n\nConfiguration values are resolved through a hierarchy:\n\n```\n1. System Defaults (provisioning/config/config.defaults.toml)\n ↓ (overridden by)\n2. User Preferences (~/.config/provisioning/user_config.yaml)\n ↓ (overridden by)\n3. Workspace Config (workspace/config/provisioning.yaml)\n ↓ (overridden by)\n4. Infrastructure Config (workspace/infra//config.toml)\n ↓ (overridden by)\n5. Environment Config (workspace/config/prod-defaults.toml)\n ↓ (overridden by)\n6. Runtime Flags (--flag value)\n```\n\n**Example**:\n\n```\n# System default\n[servers]\ndefault_plan = "small"\n\n# User preference\n[servers]\ndefault_plan = "medium" # Overrides system default\n\n# Infrastructure config\n[servers]\ndefault_plan = "large" # Overrides user preference\n\n# Runtime\nprovisioning server create --plan xlarge # Overrides everything\n```\n\n---\n\n## Use Cases\n\n### 1. **Multi-Cloud Kubernetes Deployment**\n\nDeploy Kubernetes clusters across different cloud providers with identical configuration.\n\n```\n# UpCloud cluster\nprovisioning cluster create k8s-prod --provider upcloud\n\n# AWS cluster (same config)\nprovisioning cluster create k8s-prod --provider aws\n```\n\n### 2. **Development → Staging → Production Pipeline**\n\nManage multiple environments with workspace switching.\n\n```\n# Development\nprovisioning workspace switch dev\nprovisioning cluster create app-stack\n\n# Staging (same config, different resources)\nprovisioning workspace switch staging\nprovisioning cluster create app-stack\n\n# Production (HA, larger resources)\nprovisioning workspace switch prod\nprovisioning cluster create app-stack\n```\n\n### 3. **Infrastructure as Code Testing**\n\nTest infrastructure changes before deploying to production.\n\n```\n# Test Kubernetes upgrade locally\nprovisioning test topology load kubernetes_3node | \\n test env cluster kubernetes --version 1.29.0\n\n# Verify functionality\nprovisioning test env run \n\n# Cleanup\nprovisioning test env cleanup \n```\n\n### 4. **Batch Multi-Region Deployment**\n\nDeploy to multiple regions in parallel using Nickel batch workflows.\n\n```\n# schemas/batch/multi-region.ncl\n{\n batch_workflow = {\n operations = [\n {\n id = "eu-cluster"\n type = "cluster"\n region = "eu-west-1"\n cluster = "app-stack"\n }\n {\n id = "us-cluster"\n type = "cluster"\n region = "us-east-1"\n cluster = "app-stack"\n }\n {\n id = "asia-cluster"\n type = "cluster"\n region = "ap-south-1"\n cluster = "app-stack"\n }\n ]\n parallel_limit = 3 # All at once\n }\n}\n```\n\n```\nprovisioning batch submit schemas/batch/multi-region.ncl\nprovisioning batch monitor \n```\n\n### 5. **Automated Disaster Recovery**\n\nRecreate infrastructure from configuration.\n\n```\n# Infrastructure destroyed\nprovisioning workspace switch prod\n\n# Recreate from config\nprovisioning cluster create --infra backup-restore --wait\n\n# All services restored with same configuration\n```\n\n### 6. **CI/CD Integration**\n\nAutomated testing and deployment pipelines.\n\n```\n# .gitlab-ci.yml\ntest-infrastructure:\n script:\n - provisioning test quick kubernetes\n - provisioning test quick postgres\n\ndeploy-staging:\n script:\n - provisioning workspace switch staging\n - provisioning cluster create app-stack --check\n - provisioning cluster create app-stack --yes\n\ndeploy-production:\n when: manual\n script:\n - provisioning workspace switch prod\n - provisioning cluster create app-stack --yes\n```\n\n---\n\n## Getting Started\n\n### Quick Start\n\n1. **Install Prerequisites**\n\n ```bash\n # Install Nushell (0.109.0+)\n brew install nushell # macOS\n\n # Install Nickel (required for IaC)\n brew install nickel # macOS or from source\n\n # Install SOPS (optional, for encrypted secrets)\n brew install sops\n ```\n\n2. **Add CLI to PATH**\n\n ```bash\n ln -sf "$(pwd)/provisioning/core/cli/provisioning" /usr/local/bin/provisioning\n ```\n\n3. **Initialize Workspace**\n\n ```bash\n provisioning workspace init my-project\n cd my-project\n ```\n\n3.5. **Generate Versions File** (Optional - for bash scripts)\n\n ```bash\n provisioning setup versions\n # Creates /provisioning/core/versions with all tool and provider versions\n\n # Use in your deployment scripts\n source /provisioning/core/versions\n echo "Deploying with Nushell $NU_VERSION and AWS CLI $PROVIDER_AWS_VERSION"\n ```\n\n4. **Define Infrastructure (Nickel)**\n\n ```bash\n # Create workspace infrastructure schema\n cat > workspace/infra/my-cluster.ncl <<'EOF'\n {\n metadata.name = "my-cluster"\n metadata.provider = "upcloud"\n\n infrastructure.servers = [\n {name = "control-01", plan = "medium"}\n {name = "worker-01", plan = "large"}\n ]\n\n services.taskservs = ["kubernetes", "cilium"]\n }\n EOF\n ```\n\n5. **Deploy Infrastructure**\n\n ```bash\n # Validate configuration\n provisioning config validate\n\n # Check what will be created\n provisioning server create --check\n\n # Create servers\n provisioning server create --yes\n\n # Install Kubernetes\n provisioning taskserv create kubernetes\n ```\n\n### Learning Path\n\n1. **Start with Guides**\n\n ```bash\n provisioning sc # Quick reference\n provisioning guide from-scratch # Complete walkthrough\n ```\n\n2. **Explore Examples**\n\n ```bash\n ls provisioning/examples/\n ```\n\n3. **Read Architecture Docs**\n - [Core Engine](provisioning/core/README.md)\n - [CLI Architecture](.claude/features/cli-architecture.md)\n - [Configuration System](.claude/features/configuration-system.md)\n - [Batch Workflows](.claude/features/batch-workflow-system.md)\n\n4. **Try Test Environments**\n\n ```bash\n provisioning test quick kubernetes\n provisioning test quick postgres\n ```\n\n5. **Build Custom Extensions**\n - Create custom task services\n - Define cluster templates\n - Write workflow automation\n\n---\n\n## Documentation Index\n\n### User & Operations Guides\n\nSee **[provisioning/docs/src/](provisioning/docs/src/)** for comprehensive documentation:\n\n- **Quick Start** - Get started in 10 minutes\n- **Command Reference** - Complete CLI command reference\n- **Nickel Configuration Guide** - IaC language and patterns\n- **Workspace Management** - Multi-workspace guide\n- **Test Environment Guide** - Testing infrastructure with containers\n- **Plugin Integration** - Native Rust plugins (10-50x faster)\n- **Security System** - Authentication, MFA, KMS, Cedar policies\n- **Operations** - Deployment, monitoring, incident response\n\n### Architecture & Design Decisions\n\nSee **[docs/src/architecture/](docs/src/architecture/)** for design patterns:\n\n- **System Architecture** - Multi-layer design\n- **ADRs (Architecture Decision Records)** - Major decisions including:\n - ADR-011: Nickel Migration (from KCL)\n - ADR-012: Nushell + Nickel plugin wrapper\n - ADR-010: Configuration format strategy\n- **Multi-Repo Strategy** - Repository organization\n- **Integration Patterns** - How components interact\n\n### Development Guidelines\n\n- **[Repository Structure](docs/src/development/)** - Codebase organization\n- **[Contributing Guide](CONTRIBUTING.md)** - How to contribute\n- **[Nushell Guidelines](.claude/guidelines/nushell/)** - Best practices\n- **[Nickel Guidelines](.claude/guidelines/nickel.md)** - IaC patterns\n- **[Rust Guidelines](.claude/guidelines/rust/)** - Rust conventions\n\n### API Reference\n\n- **REST API** - HTTP endpoints in `provisioning/docs/src/api-reference/`\n- **Nushell API** - Library functions and modules\n- **Provider API** - Cloud provider interface specification\n\n---\n\n## Project Status\n\n**Current Version**: v5.0.0-nickel (Production Ready) | **Date**: 2026-01-08\n\n### Completed Milestones\n\n- ✅ **v5.0.0** (2026-01-08) - **Nickel IaC Migration Complete**\n - Full KCL→Nickel migration\n - Schema-driven configuration system\n - Type-safe lazy evaluation\n - ~220 legacy files removed, ~250 new schema files added\n\n- ✅ **v3.6.0** (2026-01-08) - Version Management System\n - Centralized tool and provider version management\n - Bash-compatible versions file generation\n - `provisioning setup versions` command\n - Automatic provider version discovery from Nickel schemas\n - Shell script integration with sourcing support\n\n- ✅ **v4.0.0** (2025-10-09) - Complete Security System (12 components, 39,699 lines)\n- ✅ **v3.5.0** (2025-10-07) - Platform Installer with TUI and CI/CD modes\n- ✅ **v3.4.0** (2025-10-06) - Test Environment Service with container management\n- ✅ **v3.3.0** (2025-09-30) - Interactive Guides system\n- ✅ **v3.2.0** (2025-09-30) - Modular CLI Architecture (84% code reduction)\n- ✅ **v3.1.0** (2025-09-25) - Batch Workflow System (85-90% token efficiency)\n- ✅ **v3.0.0** (2025-09-25) - Hybrid Orchestrator (Rust/Nushell)\n- ✅ **v2.0.5** (2025-10-02) - Workspace Switching system\n- ✅ **v2.0.0** (2025-09-23) - Configuration System (476+ accessors)\n- ✅ **v1.0.0** (2025-10-09) - Nushell Plugins Integration (10-50x performance)\n\n### Current Focus\n\n- **Nickel Ecosystem** - IDE support, LSP integration, schema libraries\n- **Platform Consolidation** - GitHub Actions CI/CD, cross-platform testing\n- **Extension Registry** - OCI-based distribution for task services and providers\n- **Documentation** - Complete Nickel migration guides, ADR updates\n\n---\n\n## Support and Community\n\n### Getting Help\n\n- **Documentation**: Start with `provisioning help` or `provisioning guide from-scratch`\n- **Issues**: Report bugs and request features on the issue tracker\n- **Discussions**: Join community discussions for questions and ideas\n\n### Contributing\n\nContributions are welcome! See [CONTRIBUTING.md](docs/development/CONTRIBUTING.md) for guidelines.\n\n**Key areas for contribution**:\n\n- New task service definitions\n- Cloud provider implementations\n- Cluster templates\n- Documentation improvements\n- Bug fixes and testing\n\n---\n\n## License\n\nSee [LICENSE](LICENSE) file in project root.\n\n---\n\n**Maintained By**: Architecture Team\n**Last Updated**: 2026-01-08 (Version Management System v3.6.0 + Nickel v5.0.0 Migration Complete)\n**Current Branch**: nickel\n**Project Home**: [provisioning/](provisioning/)\n\n---\n\n## Recent Changes (2026-01-08)\n\n### Version Management System (v3.6.0)\n\n**What Changed**:\n- ✅ Implemented `provisioning setup versions` command\n- ✅ Generates bash-compatible `/provisioning/core/versions` file\n- ✅ Automatically discovers and includes all provider versions from Nickel schemas\n- ✅ Fixed to remove redundant metadata (all sources are Nickel)\n- ✅ Core tools with aliases: NUSHELL→NU, NICKEL, SOPS, AGE, K9S\n- ✅ Shell script integration: `source /provisioning/core/versions && echo $NU_VERSION`\n\n**Files Modified**:\n- `provisioning/core/nulib/lib_provisioning/setup/utils.nu` - Core implementation\n- `provisioning/core/nulib/main_provisioning/commands/setup.nu` - Command routing\n- `provisioning/core/nulib/lib_provisioning/workspace/enforcement.nu` - Workspace exemption\n- `provisioning/README.md` - Documentation updates\n\n**Generated File Example**:\n```\nNUSHELL_VERSION="0.109.1"\nNUSHELL_SOURCE="https://github.com/nushell/nushell/releases"\nNU_VERSION="0.109.1"\nNU_SOURCE="https://github.com/nushell/nushell/releases"\n\nNICKEL_VERSION="1.15.1"\nNICKEL_SOURCE="https://github.com/tweag/nickel/releases"\n\nPROVIDER_AWS_VERSION="2.32.11"\nPROVIDER_AWS_SOURCE="https://github.com/aws/aws-cli/releases"\n# ... and more providers\n```\n\n**Key Improvements**:\n- Clean metadata (no redundant `_LIB` fields - all sources are Nickel)\n- Automatic provider discovery from `extensions/providers/*/nickel/version.ncl`\n- Direct Nickel file parsing with JSON export\n- Zero dependency on environment variables or legacy systems\n- 100% bash/shell compatible for deployment scripts diff --git a/SECURITY.md b/SECURITY.md index 8181857..c79b018 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -1 +1 @@ -# Security Policy\n\n## Supported Versions\n\nThis project provides security updates for the following versions:\n\n| Version | Supported |\n|---------|-----------|\n| 1.x | ✅ Yes |\n| 0.x | ❌ No |\n\nOnly the latest major version receives security patches. Users are encouraged to upgrade to the latest version.\n\n## Reporting a Vulnerability\n\n**Do not open public GitHub issues for security vulnerabilities.**\n\nInstead, please report security issues to the maintainers privately:\n\n### Reporting Process\n\n1. Email security details to the maintainers (see project README for contact)\n2. Include:\n - Description of the vulnerability\n - Steps to reproduce (if possible)\n - Potential impact\n - Suggested fix (if you have one)\n\n3. Expect acknowledgment within 48 hours\n4. We will work on a fix and coordinate disclosure timing\n\n### Responsible Disclosure\n\n- Allow reasonable time for a fix before public disclosure\n- Work with us to understand and validate the issue\n- Maintain confidentiality until the fix is released\n\n## Security Best Practices\n\n### For Users\n\n- Keep dependencies up to date\n- Use the latest version of this project\n- Review security advisories regularly\n- Report vulnerabilities responsibly\n\n### For Contributors\n\n- Run `cargo audit` before submitting PRs\n- Use `cargo deny` to check license compliance\n- Follow secure coding practices\n- Don't hardcode secrets or credentials\n- Validate all external inputs\n\n## Dependency Security\n\nWe use automated tools to monitor dependencies:\n\n- **cargo-audit**: Scans for known security vulnerabilities\n- **cargo-deny**: Checks licenses and bans unsafe dependencies\n\nThese run in CI on every push and PR.\n\n## Code Review\n\nAll code changes go through review before merging:\n\n- At least one maintainer review required\n- Security implications considered\n- Tests required for all changes\n- CI checks must pass\n\n## Known Vulnerabilities\n\nWe maintain transparency about known issues:\n\n- Documented in GitHub security advisories\n- Announced in release notes\n- Tracked in issues with `security` label\n\n## Security Contact\n\nFor security inquiries, please contact:\n\n- Email: [project maintainers]\n- Issue: Open a private security advisory on GitHub\n\n## Changelog\n\nSecurity fixes are highlighted in CHANGELOG.md with [SECURITY] prefix.\n\n## Resources\n\n- [OWASP Top 10](https://owasp.org/www-project-top-ten/)\n- [CWE: Common Weakness Enumeration](https://cwe.mitre.org/)\n- [Rust Security](https://www.rust-lang.org/governance/security-disclosures)\n- [npm Security](https://docs.npmjs.com/about-npm/security)\n\n## Questions\n\nIf you have security questions (not vulnerabilities), open a discussion or issue with the `security` label. \ No newline at end of file +# Security Policy\n\n## Supported Versions\n\nThis project provides security updates for the following versions:\n\n| Version | Supported |\n|---------|-----------|\n| 1.x | ✅ Yes |\n| 0.x | ❌ No |\n\nOnly the latest major version receives security patches. Users are encouraged to upgrade to the latest version.\n\n## Reporting a Vulnerability\n\n**Do not open public GitHub issues for security vulnerabilities.**\n\nInstead, please report security issues to the maintainers privately:\n\n### Reporting Process\n\n1. Email security details to the maintainers (see project README for contact)\n2. Include:\n - Description of the vulnerability\n - Steps to reproduce (if possible)\n - Potential impact\n - Suggested fix (if you have one)\n\n3. Expect acknowledgment within 48 hours\n4. We will work on a fix and coordinate disclosure timing\n\n### Responsible Disclosure\n\n- Allow reasonable time for a fix before public disclosure\n- Work with us to understand and validate the issue\n- Maintain confidentiality until the fix is released\n\n## Security Best Practices\n\n### For Users\n\n- Keep dependencies up to date\n- Use the latest version of this project\n- Review security advisories regularly\n- Report vulnerabilities responsibly\n\n### For Contributors\n\n- Run `cargo audit` before submitting PRs\n- Use `cargo deny` to check license compliance\n- Follow secure coding practices\n- Don't hardcode secrets or credentials\n- Validate all external inputs\n\n## Dependency Security\n\nWe use automated tools to monitor dependencies:\n\n- **cargo-audit**: Scans for known security vulnerabilities\n- **cargo-deny**: Checks licenses and bans unsafe dependencies\n\nThese run in CI on every push and PR.\n\n## Code Review\n\nAll code changes go through review before merging:\n\n- At least one maintainer review required\n- Security implications considered\n- Tests required for all changes\n- CI checks must pass\n\n## Known Vulnerabilities\n\nWe maintain transparency about known issues:\n\n- Documented in GitHub security advisories\n- Announced in release notes\n- Tracked in issues with `security` label\n\n## Security Contact\n\nFor security inquiries, please contact:\n\n- Email: [project maintainers]\n- Issue: Open a private security advisory on GitHub\n\n## Changelog\n\nSecurity fixes are highlighted in CHANGELOG.md with [SECURITY] prefix.\n\n## Resources\n\n- [OWASP Top 10](https://owasp.org/www-project-top-ten/)\n- [CWE: Common Weakness Enumeration](https://cwe.mitre.org/)\n- [Rust Security](https://www.rust-lang.org/governance/security-disclosures)\n- [npm Security](https://docs.npmjs.com/about-npm/security)\n\n## Questions\n\nIf you have security questions (not vulnerabilities), open a discussion or issue with the `security` label. diff --git a/bootstrap/README.md b/bootstrap/README.md index d108da4..7973e8a 100644 --- a/bootstrap/README.md +++ b/bootstrap/README.md @@ -1,246 +1 @@ -# Provisioning Platform Bootstrap - -Simple, flexible bootstrap script for provisioning platform installation. - -**No Rust compilation required** - uses pure Bash + Nushell. - -## Quick Start - -### From Git Repository - -``` -git clone https://github.com/provisioning/provisioning.git -cd provisioning - -# Run bootstrap -./provisioning/bootstrap/install.sh -``` - -### What it Does (7 Stages) - -1. **System Detection** - Detects OS, CPU, RAM, architecture -2. **Dependency Check** - Validates Docker, Rust, Nushell installed -3. **Directory Structure** - Creates workspace directories -4. **Configuration Validation** - Validates Nickel config syntax -5. **Export Configuration** - Exports config.ncl → TOML for services -6. **Initialize Orchestrator** - Starts orchestrator service -7. **Verification** - Confirms all files created and services running - -## Usage - -### Standard Bootstrap (Interactive) - -``` -./provisioning/bootstrap/install.sh -``` - -### Nushell Direct - -``` -nu provisioning/bootstrap/install.nu $(pwd) -``` - -## Requirements - -**Minimum**: - -- Nushell 0.109.0+ (auto-installed if missing) -- Docker (for containers) -- Rust + Cargo (for building services) -- Git (for cloning) - -**Recommended**: - -- 2+ GB RAM -- 10+ GB disk -- macOS, Linux, or WSL2 - -## What Gets Created - -After bootstrap, your workspace has: - -``` -workspace_librecloud/ -├── config/ -│ ├── config.ncl ← Master config (Nickel) -│ └── generated/ ← Auto-exported TOML -│ ├── workspace.toml -│ ├── providers/ -│ │ ├── upcloud.toml -│ │ └── local.toml -│ └── platform/ -│ └── orchestrator.toml -├── .orchestrator/data/queue/ ← Orchestrator data -├── .kms/ ← KMS data -├── .providers/ ← Provider state -├── .taskservs/ ← Task service data -└── .clusters/ ← Cluster data -``` - -## Differences from Rust Installer - -| Feature | Rust Installer | Bash+Nushell Bootstrap | -| --------- | ----------------- | ------------------------ | -| **Requires compilation** | ✅ Yes (5+ min) | ❌ No | -| **Flexible** | ⚠️ Limited | ✅ Fully scriptable | -| **Source code** | ❌ Binary | ✅ Clear scripts | -| **Easy to modify** | ❌ Recompile | ✅ Edit script | -| **Integrates with TypeDialog** | ❌ Hard | ✅ Easy | -| **Deployable everywhere** | ✅ Binary | ✅ Script | -| **TUI Interface** | ✅ Ratatui | ⚠️ Text menus | - -## Troubleshooting - -### "Nushell not found" - -``` -# Install Nushell manually: -# macOS: -brew install nushell - -# Linux (Debian): -sudo apt install nushell - -# Linux (RHEL): -sudo yum install nushell - -# Or: https://nushell.sh/book/installation.html -``` - -### "Docker not installed" - -``` -# https://docs.docker.com/get-docker/ -``` - -### "Rust not installed" - -``` -# https://rustup.rs/ -curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -rustup default stable -``` - -### "Configuration validation failed" - -``` -# Check Nickel syntax -nickel typecheck workspace_librecloud/config/config.ncl - -# Fix errors in config.ncl -vim workspace_librecloud/config/config.ncl - -# Re-run bootstrap -./provisioning/bootstrap/install.sh -``` - -### "Orchestrator didn't start" - -``` -# Check logs -tail -f workspace_librecloud/.orchestrator/logs/orchestrator.log - -# Manual start -cd provisioning/platform/orchestrator -./scripts/start-orchestrator.nu --background - -# Check health -curl http://localhost:9090/health -``` - -## After Bootstrap - -Once complete: - -1. **Verify orchestrator**: - - ```bash - curl http://localhost:9090/health - ``` - -1. **Update configuration** (optional): - - ```bash - provisioning config platform orchestrator - ``` - -2. **Start provisioning**: - - ```bash - provisioning server create --infra sgoyol --name web-01 - ``` - -3. **Monitor progress**: - - ```bash - provisioning workflow monitor - ``` - -## Development - -### Add New Bootstrap Stage - -Edit `install.nu` and add: - -``` -# Stage N: YOUR STAGE NAME -print "🔧 Stage N: Your Stage Name" -print "─────────────────────────────────────────────────────────────────" - -# Your logic here - -print " ✅ Done" -print "" -``` - -### Modify Existing Stages - -Direct script edits - no compilation needed. Changes take effect immediately. - -### Extend Bootstrap - -Add new scripts in `provisioning/bootstrap/` directory: - -``` -provisioning/bootstrap/ -├── install.sh # Entry point -├── install.nu # Main orchestrator -├── validators.nu # Validation helpers (future) -├── generators.nu # Generator helpers (future) -└── README.md # This file -``` - -## Comparison to Old Rust Installer - -**Old way**: - -1. Run Rust installer binary -2. Need to recompile for any changes -3. Difficult to integrate with TypeDialog -4. Hard to debug - -**New way**: - -1. Run simple bash script -2. Changes take effect immediately -3. Uses existing Nushell libraries -4. Easy to extend and debug - -## FAQ - -**Q: Why not keep the Rust installer?** -A: Rust crate was over-engineered for bootstrap. Bash+Nushell is simpler, more flexible, and integrates better with the rest of the system. - -**Q: Can I customize the bootstrap?** -A: Yes! Edit `install.nu` directly. Add new stages, change logic, integrate TypeDialog - all without compilation. - -**Q: What about TUI interface?** -A: Bootstrap uses text menus. If you need a fancy TUI, you can build a separate Rust tool, but it's not required for basic installation. - -**Q: Is this production-ready?** -A: Yes. It's simpler and more robust than the old Rust installer. - ---- - -**Status**: ✅ Ready for use -**Last Updated**: 2025-01-02 +# Provisioning Platform Bootstrap\n\nSimple, flexible bootstrap script for provisioning platform installation.\n\n**No Rust compilation required** - uses pure Bash + Nushell.\n\n## Quick Start\n\n### From Git Repository\n\n```\ngit clone https://github.com/provisioning/provisioning.git\ncd provisioning\n\n# Run bootstrap\n./provisioning/bootstrap/install.sh\n```\n\n### What it Does (7 Stages)\n\n1. **System Detection** - Detects OS, CPU, RAM, architecture\n2. **Dependency Check** - Validates Docker, Rust, Nushell installed\n3. **Directory Structure** - Creates workspace directories\n4. **Configuration Validation** - Validates Nickel config syntax\n5. **Export Configuration** - Exports config.ncl → TOML for services\n6. **Initialize Orchestrator** - Starts orchestrator service\n7. **Verification** - Confirms all files created and services running\n\n## Usage\n\n### Standard Bootstrap (Interactive)\n\n```\n./provisioning/bootstrap/install.sh\n```\n\n### Nushell Direct\n\n```\nnu provisioning/bootstrap/install.nu $(pwd)\n```\n\n## Requirements\n\n**Minimum**:\n\n- Nushell 0.109.0+ (auto-installed if missing)\n- Docker (for containers)\n- Rust + Cargo (for building services)\n- Git (for cloning)\n\n**Recommended**:\n\n- 2+ GB RAM\n- 10+ GB disk\n- macOS, Linux, or WSL2\n\n## What Gets Created\n\nAfter bootstrap, your workspace has:\n\n```\nworkspace_librecloud/\n├── config/\n│ ├── config.ncl ← Master config (Nickel)\n│ └── generated/ ← Auto-exported TOML\n│ ├── workspace.toml\n│ ├── providers/\n│ │ ├── upcloud.toml\n│ │ └── local.toml\n│ └── platform/\n│ └── orchestrator.toml\n├── .orchestrator/data/queue/ ← Orchestrator data\n├── .kms/ ← KMS data\n├── .providers/ ← Provider state\n├── .taskservs/ ← Task service data\n└── .clusters/ ← Cluster data\n```\n\n## Differences from Rust Installer\n\n| Feature | Rust Installer | Bash+Nushell Bootstrap |\n| --------- | ----------------- | ------------------------ |\n| **Requires compilation** | ✅ Yes (5+ min) | ❌ No |\n| **Flexible** | ⚠️ Limited | ✅ Fully scriptable |\n| **Source code** | ❌ Binary | ✅ Clear scripts |\n| **Easy to modify** | ❌ Recompile | ✅ Edit script |\n| **Integrates with TypeDialog** | ❌ Hard | ✅ Easy |\n| **Deployable everywhere** | ✅ Binary | ✅ Script |\n| **TUI Interface** | ✅ Ratatui | ⚠️ Text menus |\n\n## Troubleshooting\n\n### "Nushell not found"\n\n```\n# Install Nushell manually:\n# macOS:\nbrew install nushell\n\n# Linux (Debian):\nsudo apt install nushell\n\n# Linux (RHEL):\nsudo yum install nushell\n\n# Or: https://nushell.sh/book/installation.html\n```\n\n### "Docker not installed"\n\n```\n# https://docs.docker.com/get-docker/\n```\n\n### "Rust not installed"\n\n```\n# https://rustup.rs/\ncurl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh\nrustup default stable\n```\n\n### "Configuration validation failed"\n\n```\n# Check Nickel syntax\nnickel typecheck workspace_librecloud/config/config.ncl\n\n# Fix errors in config.ncl\nvim workspace_librecloud/config/config.ncl\n\n# Re-run bootstrap\n./provisioning/bootstrap/install.sh\n```\n\n### "Orchestrator didn't start"\n\n```\n# Check logs\ntail -f workspace_librecloud/.orchestrator/logs/orchestrator.log\n\n# Manual start\ncd provisioning/platform/orchestrator\n./scripts/start-orchestrator.nu --background\n\n# Check health\ncurl http://localhost:9090/health\n```\n\n## After Bootstrap\n\nOnce complete:\n\n1. **Verify orchestrator**:\n\n ```bash\n curl http://localhost:9090/health\n ```\n\n1. **Update configuration** (optional):\n\n ```bash\n provisioning config platform orchestrator\n ```\n\n2. **Start provisioning**:\n\n ```bash\n provisioning server create --infra sgoyol --name web-01\n ```\n\n3. **Monitor progress**:\n\n ```bash\n provisioning workflow monitor \n ```\n\n## Development\n\n### Add New Bootstrap Stage\n\nEdit `install.nu` and add:\n\n```\n# Stage N: YOUR STAGE NAME\nprint "🔧 Stage N: Your Stage Name"\nprint "─────────────────────────────────────────────────────────────────"\n\n# Your logic here\n\nprint " ✅ Done"\nprint ""\n```\n\n### Modify Existing Stages\n\nDirect script edits - no compilation needed. Changes take effect immediately.\n\n### Extend Bootstrap\n\nAdd new scripts in `provisioning/bootstrap/` directory:\n\n```\nprovisioning/bootstrap/\n├── install.sh # Entry point\n├── install.nu # Main orchestrator\n├── validators.nu # Validation helpers (future)\n├── generators.nu # Generator helpers (future)\n└── README.md # This file\n```\n\n## Comparison to Old Rust Installer\n\n**Old way**:\n\n1. Run Rust installer binary\n2. Need to recompile for any changes\n3. Difficult to integrate with TypeDialog\n4. Hard to debug\n\n**New way**:\n\n1. Run simple bash script\n2. Changes take effect immediately\n3. Uses existing Nushell libraries\n4. Easy to extend and debug\n\n## FAQ\n\n**Q: Why not keep the Rust installer?**\nA: Rust crate was over-engineered for bootstrap. Bash+Nushell is simpler, more flexible, and integrates better with the rest of the system.\n\n**Q: Can I customize the bootstrap?**\nA: Yes! Edit `install.nu` directly. Add new stages, change logic, integrate TypeDialog - all without compilation.\n\n**Q: What about TUI interface?**\nA: Bootstrap uses text menus. If you need a fancy TUI, you can build a separate Rust tool, but it's not required for basic installation.\n\n**Q: Is this production-ready?**\nA: Yes. It's simpler and more robust than the old Rust installer.\n\n---\n\n**Status**: ✅ Ready for use\n**Last Updated**: 2025-01-02 diff --git a/config/README.md b/config/README.md index 6691c71..e46bf9e 100644 --- a/config/README.md +++ b/config/README.md @@ -1,391 +1 @@ -# Platform Configuration Management - -This directory manages **runtime configurations** for provisioning platform services. - -## Structure - -``` -provisioning/config/ -├── runtime/ # 🔒 PRIVATE (gitignored) -│ ├── .gitignore # Runtime files are private -│ ├── orchestrator.solo.ncl # Runtime config (editable) -│ ├── vault-service.multiuser.ncl # Runtime config (editable) -│ └── generated/ # 📄 Auto-generated TOMLs -│ ├── orchestrator.solo.toml # Exported from .ncl -│ └── vault-service.multiuser.toml -│ -├── examples/ # 📘 PUBLIC (reference) -│ ├── orchestrator.solo.example.ncl -│ └── orchestrator.enterprise.example.ncl -│ -├── README.md # This file -└── setup-platform-config.sh # ← See provisioning/scripts/setup-platform-config.sh -``` - -## Quick Start - -### 1. Setup Platform Configuration (First Time) - -``` -# Interactive wizard (recommended) -./provisioning/scripts/setup-platform-config.sh - -# Or quick setup for all services in solo mode -./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo -``` - -### 2. Run Services - -``` -# Service reads config from generated TOML -export ORCHESTRATOR_MODE=solo -cargo run -p orchestrator - -# Or with explicit config path -export ORCHESTRATOR_CONFIG=provisioning/config/runtime/generated/orchestrator.solo.toml -cargo run -p orchestrator -``` - -### 3. Update Configuration - -**Option A: Interactive (Recommended)** -``` -# Update via TypeDialog UI -./provisioning/scripts/setup-platform-config.sh --service orchestrator --mode solo -``` - -**Option B: Manual Edit** -``` -# Edit Nickel directly -vim provisioning/config/runtime/orchestrator.solo.ncl - -# ⚠️ CRITICAL: Regenerate TOML afterward -./provisioning/scripts/setup-platform-config.sh --generate-toml -``` - -## Configuration Layers - -``` -📘 PUBLIC (provisioning/schemas/platform/) -├── schemas/ → Type contracts (Nickel) -├── defaults/ → Base configuration values -│ └── deployment/ → Mode-specific overlays (solo/multiuser/cicd/enterprise) -├── validators/ → Business logic validation -└── common/ - └── helpers.ncl → Merge functions - - ⬇️ COMPOSITION PROCESS ⬇️ - -🔒 PRIVATE (provisioning/config/runtime/) -├── orchestrator.solo.ncl ← User editable -│ (imports schemas + defaults + mode overlay) -│ (uses helpers.compose_config for merge) -│ -└── generated/ - └── orchestrator.solo.toml ← Auto-exported for Rust services - (generated by: nickel export --format toml) -``` - -## Key Concepts - -### Schema (Type Contract) -- **File**: `provisioning/schemas/platform/schemas/orchestrator.ncl` -- **Purpose**: Defines valid fields, types, constraints -- **Status**: 📘 PUBLIC, versioned, source of truth -- **Edit**: Rarely (architecture changes only) - -### Defaults (Base Values) -- **File**: `provisioning/schemas/platform/defaults/orchestrator-defaults.ncl` -- **Purpose**: Default values for all orchestrator settings -- **Status**: 📘 PUBLIC, versioned, part of product -- **Edit**: When changing default behavior - -### Mode Overlay (Tuning) -- **File**: `provisioning/schemas/platform/defaults/deployment/solo-defaults.ncl` -- **Purpose**: Mode-specific resource/behavior tuning -- **Status**: 📘 PUBLIC, versioned -- **Example**: solo mode uses 2 CPU, enterprise uses 16+ CPU - -### Runtime Config (User Customization) -- **File**: `provisioning/config/runtime/orchestrator.solo.ncl` -- **Purpose**: Actual deployment configuration (can be hand-edited) -- **Status**: 🔒 PRIVATE, gitignored -- **Edit**: Yes, use setup script or edit manually + regenerate TOML - -### Generated TOML (Service Consumption) -- **File**: `provisioning/config/runtime/generated/orchestrator.solo.toml` -- **Purpose**: What Rust services actually read -- **Status**: 🔒 PRIVATE, gitignored, auto-generated -- **Edit**: NO - regenerate from .ncl instead -- **Generation**: `nickel export --format toml ` - -## Workflows - -### Scenario 1: First-Time Setup - -``` -# 1. Run setup script -./provisioning/scripts/setup-platform-config.sh - -# 2. Choose action (TypeDialog or Quick Mode) -# ↓ -# TypeDialog: User fills form → generates orchestrator.solo.ncl -# Quick Mode: Composes defaults + mode overlay → generates all 8 services - -# 3. Script auto-exports to TOML -# orchestrator.solo.ncl → orchestrator.solo.toml - -# 4. Service reads TOML -# cargo run -p orchestrator (reads generated/orchestrator.solo.toml) -``` - -### Scenario 2: Update Configuration - -``` -# Option A: Interactive TypeDialog -./provisioning/scripts/setup-platform-config.sh \ - --service orchestrator \ - --mode solo \ - --backend web - -# Result: Updated orchestrator.solo.ncl + auto-exported TOML - -# Option B: Manual Edit -vim provisioning/config/runtime/orchestrator.solo.ncl - -# ⚠️ CRITICAL: Must regenerate TOML -./provisioning/scripts/setup-platform-config.sh --generate-toml - -# Result: Updated TOML in generated/ -``` - -### Scenario 3: Switch Deployment Mode - -``` -# From solo to enterprise -./provisioning/scripts/setup-platform-config.sh --quick-mode --mode enterprise - -# Result: All 8 services configured for enterprise mode -# 16+ CPU, 32+ GB RAM, HA setup, KMS integration, etc. -``` - -### Scenario 4: Workspace-Specific Overrides - -``` -workspace_librecloud/ -├── config/ -│ └── platform-overrides.ncl # Workspace customization -│ -# Example: -# { -# orchestrator.server.port = 9999, -# orchestrator.workspace.name = "librecloud", -# vault-service.storage.path = "./workspace_librecloud/data/vault" -# } -``` - -## Important Notes - -### ⚠️ Manual Edits Require TOML Regeneration - -If you edit `.ncl` files directly: - -``` -# 1. Edit the .ncl file -vim provisioning/config/runtime/orchestrator.solo.ncl - -# 2. ALWAYS regenerate TOML -./provisioning/scripts/setup-platform-config.sh --generate-toml - -# Service will NOT see your changes until TOML is regenerated -``` - -### 🔒 Private by Design - -Runtime configs are **gitignored** for good reasons: - -- **May contain secrets**: Encrypted credentials, API keys, tokens -- **Deployment-specific**: Different values per environment -- **User-customized**: Each developer/workspace has different needs -- **Not shared**: Don't commit locally-built configs - -### 📘 Schemas are Public - -Schema/defaults in `provisioning/schemas/` are **version-controlled**: - -- Product definition (part of releases) -- Shared across team -- Source of truth for config structure -- Can reference in documentation - -### 🔄 Idempotent Setup - -The setup script is safe to run multiple times: - -``` -# Safe: Updates only what's needed -./provisioning/scripts/setup-platform-config.sh --quick-mode --mode enterprise - -# Safe: Doesn't overwrite unless --clean is used -./provisioning/scripts/setup-platform-config.sh --generate-toml - -# Use --clean to start fresh -./provisioning/scripts/setup-platform-config.sh --clean -``` - -## Service Configuration Paths - -Each service loads config using this priority: - -``` -1. Environment variable: ORCHESTRATOR_CONFIG=/path/to/custom.toml -2. Mode-specific runtime: provisioning/config/runtime/generated/orchestrator.{MODE}.toml -3. Fallback defaults: provisioning/schemas/platform/defaults/orchestrator-defaults.ncl -``` - -## Configuration Composition (Technical) - -The setup script uses Nickel's `helpers.compose_config` function: - -``` -# Generated .ncl file imports: -let helpers = import "provisioning/schemas/platform/common/helpers.ncl" -let defaults = import "provisioning/schemas/platform/defaults/orchestrator-defaults.ncl" -let mode_config = import "provisioning/schemas/platform/defaults/deployment/solo-defaults.ncl" - -# Compose: base + mode overlay -helpers.compose_config defaults mode_config {} -# ^base ^mode overlay ^user overrides (empty if not customized) -``` - -This ensures: -- Type safety (validated by Nickel schema) -- Proper layering (base + mode + user) -- Reproducibility (same compose always produces same result) -- Extensibility (can add user layer via Nickel import) - -## Troubleshooting - -### Config Won't Generate TOML - -``` -# Check Nickel syntax -nickel typecheck provisioning/config/runtime/orchestrator.solo.ncl - -# Check for schema import errors -nickel export --format json provisioning/config/runtime/orchestrator.solo.ncl - -# View detailed error message -nickel typecheck -i provisioning/config/runtime/orchestrator.solo.ncl 2>&1 | less -``` - -### Service Won't Start - -``` -# Verify TOML exists -ls -la provisioning/config/runtime/generated/orchestrator.solo.toml - -# Verify TOML syntax -toml-cli validate provisioning/config/runtime/generated/orchestrator.solo.toml - -# Check service config loading -RUST_LOG=debug cargo run -p orchestrator 2>&1 | head -50 -``` - -### Wrong Configuration Being Used - -``` -# Verify environment mode -echo $ORCHESTRATOR_MODE # Should be: solo, multiuser, cicd, or enterprise - -# Check which file service is reading -ORCHESTRATOR_CONFIG=provisioning/config/runtime/generated/orchestrator.solo.toml \ - cargo run -p orchestrator - -# Verify file modification time -ls -lah provisioning/config/runtime/generated/orchestrator.*.toml -``` - -## Integration Points - -### ⚠️ Provisioning Installer Status - -**Current Status**: Installer NOT YET IMPLEMENTED - -The `setup-platform-config.sh` script is a **standalone tool** that: -- ✅ Works independently from the provisioning installer -- ✅ Can be called manually for configuration setup -- ⏳ Will be integrated into the installer once it's implemented - -**For Now**: Use script manually before running services: - -``` -# Manual setup (until installer is implemented) -cd /path/to/project-provisioning -./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo - -# Then run services -export ORCHESTRATOR_MODE=solo -cargo run -p orchestrator -``` - -### Future: Integration into Provisioning Installer - -Once `provisioning/scripts/install.sh` is implemented, it will automatically call this script: - -``` -#!/bin/bash -# provisioning/scripts/install.sh (FUTURE - NOT YET IMPLEMENTED) - -# Pre-flight checks (verification of dependencies, paths, permissions) -check_dependencies() { - command -v nickel >/dev/null || { echo "Nickel required"; exit 1; } - command -v nu >/dev/null || { echo "Nushell required"; exit 1; } -} -check_dependencies - -# Install core provisioning system -echo "Installing provisioning system..." -# (install implementation details here) - -# Setup platform configurations -echo "Setting up platform configurations..." -./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo - -# Build and test platform services -echo "Building platform services..." -cargo build -p orchestrator -p control-center -p mcp-server - -# Verify services are operational -echo "Verification complete - services ready to run" -``` - -### CI/CD Pipeline Integration - -For automated CI/CD setups (can use now): - -``` -#!/bin/bash -# ci/setup.sh - -# Setup configurations for CI/CD mode -cd /path/to/project-provisioning -./provisioning/scripts/setup-platform-config.sh \ - --quick-mode \ - --mode cicd - -# Result: All services configured for CI/CD mode -# (ephemeral, API-driven, fast cleanup, minimal resource footprint) - -# Run tests -cargo test --all - -# Deploy (CI/CD specific) -docker-compose -f provisioning/platform/infrastructure/docker/docker-compose.cicd.yml up -``` - ---- - -**Version**: 1.0.0 -**Last Updated**: 2026-01-05 -**Script Reference**: `provisioning/scripts/setup-platform-config.sh` +# Platform Configuration Management\n\nThis directory manages **runtime configurations** for provisioning platform services.\n\n## Structure\n\n```\nprovisioning/config/\n├── runtime/ # 🔒 PRIVATE (gitignored)\n│ ├── .gitignore # Runtime files are private\n│ ├── orchestrator.solo.ncl # Runtime config (editable)\n│ ├── vault-service.multiuser.ncl # Runtime config (editable)\n│ └── generated/ # 📄 Auto-generated TOMLs\n│ ├── orchestrator.solo.toml # Exported from .ncl\n│ └── vault-service.multiuser.toml\n│\n├── examples/ # 📘 PUBLIC (reference)\n│ ├── orchestrator.solo.example.ncl\n│ └── orchestrator.enterprise.example.ncl\n│\n├── README.md # This file\n└── setup-platform-config.sh # ← See provisioning/scripts/setup-platform-config.sh\n```\n\n## Quick Start\n\n### 1. Setup Platform Configuration (First Time)\n\n```\n# Interactive wizard (recommended)\n./provisioning/scripts/setup-platform-config.sh\n\n# Or quick setup for all services in solo mode\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo\n```\n\n### 2. Run Services\n\n```\n# Service reads config from generated TOML\nexport ORCHESTRATOR_MODE=solo\ncargo run -p orchestrator\n\n# Or with explicit config path\nexport ORCHESTRATOR_CONFIG=provisioning/config/runtime/generated/orchestrator.solo.toml\ncargo run -p orchestrator\n```\n\n### 3. Update Configuration\n\n**Option A: Interactive (Recommended)**\n```\n# Update via TypeDialog UI\n./provisioning/scripts/setup-platform-config.sh --service orchestrator --mode solo\n```\n\n**Option B: Manual Edit**\n```\n# Edit Nickel directly\nvim provisioning/config/runtime/orchestrator.solo.ncl\n\n# ⚠️ CRITICAL: Regenerate TOML afterward\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n```\n\n## Configuration Layers\n\n```\n📘 PUBLIC (provisioning/schemas/platform/)\n├── schemas/ → Type contracts (Nickel)\n├── defaults/ → Base configuration values\n│ └── deployment/ → Mode-specific overlays (solo/multiuser/cicd/enterprise)\n├── validators/ → Business logic validation\n└── common/\n └── helpers.ncl → Merge functions\n\n ⬇️ COMPOSITION PROCESS ⬇️\n\n🔒 PRIVATE (provisioning/config/runtime/)\n├── orchestrator.solo.ncl ← User editable\n│ (imports schemas + defaults + mode overlay)\n│ (uses helpers.compose_config for merge)\n│\n└── generated/\n └── orchestrator.solo.toml ← Auto-exported for Rust services\n (generated by: nickel export --format toml)\n```\n\n## Key Concepts\n\n### Schema (Type Contract)\n- **File**: `provisioning/schemas/platform/schemas/orchestrator.ncl`\n- **Purpose**: Defines valid fields, types, constraints\n- **Status**: 📘 PUBLIC, versioned, source of truth\n- **Edit**: Rarely (architecture changes only)\n\n### Defaults (Base Values)\n- **File**: `provisioning/schemas/platform/defaults/orchestrator-defaults.ncl`\n- **Purpose**: Default values for all orchestrator settings\n- **Status**: 📘 PUBLIC, versioned, part of product\n- **Edit**: When changing default behavior\n\n### Mode Overlay (Tuning)\n- **File**: `provisioning/schemas/platform/defaults/deployment/solo-defaults.ncl`\n- **Purpose**: Mode-specific resource/behavior tuning\n- **Status**: 📘 PUBLIC, versioned\n- **Example**: solo mode uses 2 CPU, enterprise uses 16+ CPU\n\n### Runtime Config (User Customization)\n- **File**: `provisioning/config/runtime/orchestrator.solo.ncl`\n- **Purpose**: Actual deployment configuration (can be hand-edited)\n- **Status**: 🔒 PRIVATE, gitignored\n- **Edit**: Yes, use setup script or edit manually + regenerate TOML\n\n### Generated TOML (Service Consumption)\n- **File**: `provisioning/config/runtime/generated/orchestrator.solo.toml`\n- **Purpose**: What Rust services actually read\n- **Status**: 🔒 PRIVATE, gitignored, auto-generated\n- **Edit**: NO - regenerate from .ncl instead\n- **Generation**: `nickel export --format toml `\n\n## Workflows\n\n### Scenario 1: First-Time Setup\n\n```\n# 1. Run setup script\n./provisioning/scripts/setup-platform-config.sh\n\n# 2. Choose action (TypeDialog or Quick Mode)\n# ↓\n# TypeDialog: User fills form → generates orchestrator.solo.ncl\n# Quick Mode: Composes defaults + mode overlay → generates all 8 services\n\n# 3. Script auto-exports to TOML\n# orchestrator.solo.ncl → orchestrator.solo.toml\n\n# 4. Service reads TOML\n# cargo run -p orchestrator (reads generated/orchestrator.solo.toml)\n```\n\n### Scenario 2: Update Configuration\n\n```\n# Option A: Interactive TypeDialog\n./provisioning/scripts/setup-platform-config.sh \\n --service orchestrator \\n --mode solo \\n --backend web\n\n# Result: Updated orchestrator.solo.ncl + auto-exported TOML\n\n# Option B: Manual Edit\nvim provisioning/config/runtime/orchestrator.solo.ncl\n\n# ⚠️ CRITICAL: Must regenerate TOML\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n\n# Result: Updated TOML in generated/\n```\n\n### Scenario 3: Switch Deployment Mode\n\n```\n# From solo to enterprise\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode enterprise\n\n# Result: All 8 services configured for enterprise mode\n# 16+ CPU, 32+ GB RAM, HA setup, KMS integration, etc.\n```\n\n### Scenario 4: Workspace-Specific Overrides\n\n```\nworkspace_librecloud/\n├── config/\n│ └── platform-overrides.ncl # Workspace customization\n│\n# Example:\n# {\n# orchestrator.server.port = 9999,\n# orchestrator.workspace.name = "librecloud",\n# vault-service.storage.path = "./workspace_librecloud/data/vault"\n# }\n```\n\n## Important Notes\n\n### ⚠️ Manual Edits Require TOML Regeneration\n\nIf you edit `.ncl` files directly:\n\n```\n# 1. Edit the .ncl file\nvim provisioning/config/runtime/orchestrator.solo.ncl\n\n# 2. ALWAYS regenerate TOML\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n\n# Service will NOT see your changes until TOML is regenerated\n```\n\n### 🔒 Private by Design\n\nRuntime configs are **gitignored** for good reasons:\n\n- **May contain secrets**: Encrypted credentials, API keys, tokens\n- **Deployment-specific**: Different values per environment\n- **User-customized**: Each developer/workspace has different needs\n- **Not shared**: Don't commit locally-built configs\n\n### 📘 Schemas are Public\n\nSchema/defaults in `provisioning/schemas/` are **version-controlled**:\n\n- Product definition (part of releases)\n- Shared across team\n- Source of truth for config structure\n- Can reference in documentation\n\n### 🔄 Idempotent Setup\n\nThe setup script is safe to run multiple times:\n\n```\n# Safe: Updates only what's needed\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode enterprise\n\n# Safe: Doesn't overwrite unless --clean is used\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n\n# Use --clean to start fresh\n./provisioning/scripts/setup-platform-config.sh --clean\n```\n\n## Service Configuration Paths\n\nEach service loads config using this priority:\n\n```\n1. Environment variable: ORCHESTRATOR_CONFIG=/path/to/custom.toml\n2. Mode-specific runtime: provisioning/config/runtime/generated/orchestrator.{MODE}.toml\n3. Fallback defaults: provisioning/schemas/platform/defaults/orchestrator-defaults.ncl\n```\n\n## Configuration Composition (Technical)\n\nThe setup script uses Nickel's `helpers.compose_config` function:\n\n```\n# Generated .ncl file imports:\nlet helpers = import "provisioning/schemas/platform/common/helpers.ncl"\nlet defaults = import "provisioning/schemas/platform/defaults/orchestrator-defaults.ncl"\nlet mode_config = import "provisioning/schemas/platform/defaults/deployment/solo-defaults.ncl"\n\n# Compose: base + mode overlay\nhelpers.compose_config defaults mode_config {}\n# ^base ^mode overlay ^user overrides (empty if not customized)\n```\n\nThis ensures:\n- Type safety (validated by Nickel schema)\n- Proper layering (base + mode + user)\n- Reproducibility (same compose always produces same result)\n- Extensibility (can add user layer via Nickel import)\n\n## Troubleshooting\n\n### Config Won't Generate TOML\n\n```\n# Check Nickel syntax\nnickel typecheck provisioning/config/runtime/orchestrator.solo.ncl\n\n# Check for schema import errors\nnickel export --format json provisioning/config/runtime/orchestrator.solo.ncl\n\n# View detailed error message\nnickel typecheck -i provisioning/config/runtime/orchestrator.solo.ncl 2>&1 | less\n```\n\n### Service Won't Start\n\n```\n# Verify TOML exists\nls -la provisioning/config/runtime/generated/orchestrator.solo.toml\n\n# Verify TOML syntax\ntoml-cli validate provisioning/config/runtime/generated/orchestrator.solo.toml\n\n# Check service config loading\nRUST_LOG=debug cargo run -p orchestrator 2>&1 | head -50\n```\n\n### Wrong Configuration Being Used\n\n```\n# Verify environment mode\necho $ORCHESTRATOR_MODE # Should be: solo, multiuser, cicd, or enterprise\n\n# Check which file service is reading\nORCHESTRATOR_CONFIG=provisioning/config/runtime/generated/orchestrator.solo.toml \\n cargo run -p orchestrator\n\n# Verify file modification time\nls -lah provisioning/config/runtime/generated/orchestrator.*.toml\n```\n\n## Integration Points\n\n### ⚠️ Provisioning Installer Status\n\n**Current Status**: Installer NOT YET IMPLEMENTED\n\nThe `setup-platform-config.sh` script is a **standalone tool** that:\n- ✅ Works independently from the provisioning installer\n- ✅ Can be called manually for configuration setup\n- ⏳ Will be integrated into the installer once it's implemented\n\n**For Now**: Use script manually before running services:\n\n```\n# Manual setup (until installer is implemented)\ncd /path/to/project-provisioning\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo\n\n# Then run services\nexport ORCHESTRATOR_MODE=solo\ncargo run -p orchestrator\n```\n\n### Future: Integration into Provisioning Installer\n\nOnce `provisioning/scripts/install.sh` is implemented, it will automatically call this script:\n\n```\n#!/bin/bash\n# provisioning/scripts/install.sh (FUTURE - NOT YET IMPLEMENTED)\n\n# Pre-flight checks (verification of dependencies, paths, permissions)\ncheck_dependencies() {\n command -v nickel >/dev/null || { echo "Nickel required"; exit 1; }\n command -v nu >/dev/null || { echo "Nushell required"; exit 1; }\n}\ncheck_dependencies\n\n# Install core provisioning system\necho "Installing provisioning system..."\n# (install implementation details here)\n\n# Setup platform configurations\necho "Setting up platform configurations..."\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo\n\n# Build and test platform services\necho "Building platform services..."\ncargo build -p orchestrator -p control-center -p mcp-server\n\n# Verify services are operational\necho "Verification complete - services ready to run"\n```\n\n### CI/CD Pipeline Integration\n\nFor automated CI/CD setups (can use now):\n\n```\n#!/bin/bash\n# ci/setup.sh\n\n# Setup configurations for CI/CD mode\ncd /path/to/project-provisioning\n./provisioning/scripts/setup-platform-config.sh \\n --quick-mode \\n --mode cicd\n\n# Result: All services configured for CI/CD mode\n# (ephemeral, API-driven, fast cleanup, minimal resource footprint)\n\n# Run tests\ncargo test --all\n\n# Deploy (CI/CD specific)\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.cicd.yml up\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2026-01-05\n**Script Reference**: `provisioning/scripts/setup-platform-config.sh` diff --git a/config/config.defaults.toml b/config/config.defaults.toml index 21ea24e..d19d4c5 100644 --- a/config/config.defaults.toml +++ b/config/config.defaults.toml @@ -81,8 +81,6 @@ enable_tls = false cert_path = "" key_path = "" ---- - # Configuration Notes # # 1. User Configuration Override diff --git a/config/examples/README.md b/config/examples/README.md index ff89eed..4d8317f 100644 --- a/config/examples/README.md +++ b/config/examples/README.md @@ -1,494 +1 @@ -# Example Platform Service Configurations - -This directory contains reference configurations for platform services in different deployment modes. These examples show realistic settings and best practices for each mode. - -## What Are These Examples? - -These are **Nickel configuration files** (.ncl format) that demonstrate how to configure the provisioning platform services. They show: - -- Recommended settings for each deployment mode -- How to customize services for your environment -- Best practices for development, staging, and production -- Performance tuning for different scenarios -- Security settings appropriate to each mode - -## Directory Structure - -``` -provisioning/config/examples/ -├── README.md # This file -├── orchestrator.solo.example.ncl # Development mode reference -├── orchestrator.multiuser.example.ncl # Team staging reference -└── orchestrator.enterprise.example.ncl # Production reference -``` - -## Deployment Modes - -### Solo Mode (Development) - -**File**: `orchestrator.solo.example.ncl` - -**Characteristics**: -- 2 CPU, 4GB RAM (lightweight) -- Single user/developer -- Local development machine -- Minimal resource consumption -- No TLS or authentication -- In-memory storage - -**When to use**: -- Local development -- Testing configurations -- Learning the platform -- CI/CD test environments - -**Key Settings**: -- workers: 2 -- max_concurrent_tasks: 2 -- max_memory: 1GB -- tls: disabled -- auth: disabled - -### Multiuser Mode (Team Staging) - -**File**: `orchestrator.multiuser.example.ncl` - -**Characteristics**: -- 4 CPU, 8GB RAM (moderate) -- Multiple concurrent users -- Team staging environment -- Production-like testing -- Basic TLS and token auth -- Filesystem storage with caching - -**When to use**: -- Team development -- Integration testing -- Staging environment -- Pre-production validation -- Multi-user environments - -**Key Settings**: -- workers: 4 -- max_concurrent_tasks: 10 -- max_memory: 4GB -- tls: enabled (certificates required) -- auth: token-based -- storage: filesystem with replication - -### Enterprise Mode (Production) - -**File**: `orchestrator.enterprise.example.ncl` - -**Characteristics**: -- 16+ CPU, 32+ GB RAM (high-performance) -- Multi-team, multi-workspace -- Production mission-critical -- Full redundancy and HA -- OAuth2/Enterprise auth -- Distributed storage with replication -- Full monitoring, tracing, audit - -**When to use**: -- Production deployment -- Mission-critical systems -- High-availability requirements -- Multi-tenant environments -- Compliance requirements (SOC2, ISO27001) - -**Key Settings**: -- workers: 16 -- max_concurrent_tasks: 100 -- max_memory: 32GB -- tls: mandatory (TLS 1.3) -- auth: OAuth2 (enterprise provider) -- storage: distributed with 3-way replication -- monitoring: comprehensive with tracing -- disaster_recovery: enabled -- compliance: SOC2, ISO27001 - -## How to Use These Examples - -### Step 1: Copy the Appropriate Example - -Choose the example that matches your deployment mode: - -``` -# For development (solo) -cp provisioning/config/examples/orchestrator.solo.example.ncl \ - provisioning/config/runtime/orchestrator.solo.ncl - -# For team staging (multiuser) -cp provisioning/config/examples/orchestrator.multiuser.example.ncl \ - provisioning/config/runtime/orchestrator.multiuser.ncl - -# For production (enterprise) -cp provisioning/config/examples/orchestrator.enterprise.example.ncl \ - provisioning/config/runtime/orchestrator.enterprise.ncl -``` - -### Step 2: Customize for Your Environment - -Edit the copied file to match your specific setup: - -``` -# Edit the configuration -vim provisioning/config/runtime/orchestrator.solo.ncl - -# Examples of customizations: -# - Change workspace path to your project -# - Adjust worker count based on CPU cores -# - Set your domain names and hostnames -# - Configure storage paths for your filesystem -# - Update certificate paths for production -# - Set logging endpoints for your infrastructure -``` - -### Step 3: Validate Configuration - -Verify the configuration is syntactically correct: - -``` -# Check Nickel syntax -nickel typecheck provisioning/config/runtime/orchestrator.solo.ncl - -# View generated TOML -nickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl -``` - -### Step 4: Generate TOML - -Export the Nickel configuration to TOML format for service consumption: - -``` -# Use setup script to generate TOML -./provisioning/scripts/setup-platform-config.sh --generate-toml - -# Or manually export -nickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl > \ - provisioning/config/runtime/generated/orchestrator.solo.toml -``` - -### Step 5: Run Services - -Start your platform services with the generated configuration: - -``` -# Set the deployment mode -export ORCHESTRATOR_MODE=solo - -# Run the orchestrator -cargo run -p orchestrator -``` - -## Configuration Reference - -### Solo Mode Example Settings - -``` -server.workers = 2 -queue.max_concurrent_tasks = 2 -performance.max_memory = 1000 # 1GB max -security.tls.enabled = false # No TLS for local dev -security.auth.enabled = false # No auth for local dev -``` - -**Use case**: Single developer on local machine - -### Multiuser Mode Example Settings - -``` -server.workers = 4 -queue.max_concurrent_tasks = 10 -performance.max_memory = 4000 # 4GB max -security.tls.enabled = true # Enable TLS -security.auth.type = "token" # Token-based auth -``` - -**Use case**: Team of 5-10 developers in staging - -### Enterprise Mode Example Settings - -``` -server.workers = 16 -queue.max_concurrent_tasks = 100 -performance.max_memory = 32000 # 32GB max -security.tls.enabled = true # TLS 1.3 only -security.auth.type = "oauth2" # OAuth2 for enterprise -storage.replication.factor = 3 # 3-way replication -``` - -**Use case**: Production with 100+ users across multiple teams - -## Key Configuration Sections - -### Server Configuration - -Controls HTTP server behavior: - -``` -server = { - host = "0.0.0.0", # Bind address - port = 9090, # Listen port - workers = 4, # Worker threads - max_connections = 200, # Concurrent connections - request_timeout = 30000, # Milliseconds -} -``` - -### Storage Configuration - -Controls data persistence: - -``` -storage = { - backend = "filesystem", # filesystem or distributed - path = "/var/lib/provisioning/orchestrator/data", - cache.enabled = true, - replication.enabled = true, - replication.factor = 3, # 3-way replication for HA -} -``` - -### Queue Configuration - -Controls task queuing: - -``` -queue = { - max_concurrent_tasks = 10, - retry_attempts = 3, - task_timeout = 3600000, # 1 hour in milliseconds - priority_queue = true, # Enable priority for tasks - metrics = true, # Enable queue metrics -} -``` - -### Security Configuration - -Controls authentication and encryption: - -``` -security = { - tls = { - enabled = true, - cert_path = "/etc/provisioning/certs/cert.crt", - key_path = "/etc/provisioning/certs/key.key", - min_tls_version = "1.3", - }, - auth = { - enabled = true, - type = "oauth2", # oauth2, token, or none - provider = "okta", - }, - encryption = { - enabled = true, - algorithm = "aes-256-gcm", - }, -} -``` - -### Logging Configuration - -Controls log output and persistence: - -``` -logging = { - level = "info", # debug, info, warning, error - format = "json", - output = "both", # stdout, file, or both - file = { - enabled = true, - path = "/var/log/orchestrator.log", - rotation.max_size = 104857600, # 100MB per file - }, -} -``` - -### Monitoring Configuration - -Controls observability and metrics: - -``` -monitoring = { - enabled = true, - metrics.enabled = true, - health_check.enabled = true, - distributed_tracing.enabled = true, - audit_logging.enabled = true, -} -``` - -## Customization Examples - -### Example 1: Change Workspace Name - -Change the workspace identifier in solo mode: - -``` -workspace = { - name = "myproject", - path = "./provisioning/data/orchestrator", -} -``` - -Instead of default "development", use "myproject". - -### Example 2: Custom Server Port - -Change server port from default 9090: - -``` -server = { - port = 8888, -} -``` - -Useful if port 9090 is already in use. - -### Example 3: Enable TLS in Solo Mode - -Add TLS certificates to solo development: - -``` -security = { - tls = { - enabled = true, - cert_path = "./certs/localhost.crt", - key_path = "./certs/localhost.key", - }, -} -``` - -Useful for testing TLS locally before production. - -### Example 4: Custom Storage Path - -Use custom storage location: - -``` -storage = { - path = "/mnt/fast-storage/orchestrator/data", -} -``` - -Useful if you have fast SSD storage available. - -### Example 5: Increase Workers for Staging - -Increase from 4 to 8 workers in multiuser: - -``` -server = { - workers = 8, -} -``` - -Useful when you have more CPU cores available. - -## Troubleshooting Configuration - -### Issue: "Configuration Won't Validate" - -``` -# Check for Nickel syntax errors -nickel typecheck provisioning/config/runtime/orchestrator.solo.ncl - -# Get detailed error message -nickel typecheck -i provisioning/config/runtime/orchestrator.solo.ncl -``` - -The typecheck command will show exactly where the syntax error is. - -### Issue: "Service Won't Start" - -``` -# Verify TOML was exported correctly -cat provisioning/config/runtime/generated/orchestrator.solo.toml | head -20 - -# Check TOML syntax is valid -toml-cli validate provisioning/config/runtime/generated/orchestrator.solo.toml -``` - -The TOML must be valid for the Rust service to parse it. - -### Issue: "Service Uses Wrong Configuration" - -``` -# Verify deployment mode is set -echo $ORCHESTRATOR_MODE - -# Check which TOML file service reads -ls -lah provisioning/config/runtime/generated/orchestrator.*.toml - -# Verify TOML modification time is recent -stat provisioning/config/runtime/generated/orchestrator.solo.toml -``` - -The service reads from `orchestrator.{MODE}.toml` based on environment variable. - -## Best Practices - -### Development (Solo Mode) - -1. Start simple using the solo example as-is first -2. Iterate gradually, making one change at a time -3. Enable logging by setting level = "debug" for troubleshooting -4. Disable security features for local development (TLS/auth) -5. Store data in ./provisioning/data/ which is gitignored - -### Staging (Multiuser Mode) - -1. Mirror production settings to test realistically -2. Enable authentication even in staging to test auth flows -3. Enable TLS with valid certificates to test secure connections -4. Set up monitoring metrics and health checks -5. Plan worker count based on expected concurrent users - -### Production (Enterprise Mode) - -1. Follow the enterprise example as baseline configuration -2. Use secure vault for storing credentials and secrets -3. Enable redundancy with 3-way replication for HA -4. Enable full monitoring with distributed tracing -5. Test failover scenarios regularly -6. Enable audit logging for compliance -7. Enforce TLS 1.3 and certificate rotation - -## Migration Between Modes - -To upgrade from solo → multiuser → enterprise: - -``` -# 1. Backup current configuration -cp provisioning/config/runtime/orchestrator.solo.ncl \ - provisioning/config/runtime/orchestrator.solo.ncl.bak - -# 2. Copy new example for target mode -cp provisioning/config/examples/orchestrator.multiuser.example.ncl \ - provisioning/config/runtime/orchestrator.multiuser.ncl - -# 3. Customize for your environment -vim provisioning/config/runtime/orchestrator.multiuser.ncl - -# 4. Validate and generate TOML -./provisioning/scripts/setup-platform-config.sh --generate-toml - -# 5. Update mode environment variable and restart -export ORCHESTRATOR_MODE=multiuser -cargo run -p orchestrator -``` - -## Related Documentation - -- **Platform Configuration Guide**: `provisioning/docs/src/getting-started/05-platform-configuration.md` -- **Configuration README**: `provisioning/config/README.md` -- **System Status**: `provisioning/config/SETUP_STATUS.md` -- **Setup Script Reference**: `provisioning/scripts/setup-platform-config.sh.md` -- **Advanced TypeDialog Guide**: `provisioning/docs/src/development/typedialog-platform-config-guide.md` - ---- - -**Version**: 1.0.0 -**Last Updated**: 2026-01-05 -**Status**: Ready to use +# Example Platform Service Configurations\n\nThis directory contains reference configurations for platform services in different deployment modes. These examples show realistic settings and best practices for each mode.\n\n## What Are These Examples?\n\nThese are **Nickel configuration files** (.ncl format) that demonstrate how to configure the provisioning platform services. They show:\n\n- Recommended settings for each deployment mode\n- How to customize services for your environment\n- Best practices for development, staging, and production\n- Performance tuning for different scenarios\n- Security settings appropriate to each mode\n\n## Directory Structure\n\n```\nprovisioning/config/examples/\n├── README.md # This file\n├── orchestrator.solo.example.ncl # Development mode reference\n├── orchestrator.multiuser.example.ncl # Team staging reference\n└── orchestrator.enterprise.example.ncl # Production reference\n```\n\n## Deployment Modes\n\n### Solo Mode (Development)\n\n**File**: `orchestrator.solo.example.ncl`\n\n**Characteristics**:\n- 2 CPU, 4GB RAM (lightweight)\n- Single user/developer\n- Local development machine\n- Minimal resource consumption\n- No TLS or authentication\n- In-memory storage\n\n**When to use**:\n- Local development\n- Testing configurations\n- Learning the platform\n- CI/CD test environments\n\n**Key Settings**:\n- workers: 2\n- max_concurrent_tasks: 2\n- max_memory: 1GB\n- tls: disabled\n- auth: disabled\n\n### Multiuser Mode (Team Staging)\n\n**File**: `orchestrator.multiuser.example.ncl`\n\n**Characteristics**:\n- 4 CPU, 8GB RAM (moderate)\n- Multiple concurrent users\n- Team staging environment\n- Production-like testing\n- Basic TLS and token auth\n- Filesystem storage with caching\n\n**When to use**:\n- Team development\n- Integration testing\n- Staging environment\n- Pre-production validation\n- Multi-user environments\n\n**Key Settings**:\n- workers: 4\n- max_concurrent_tasks: 10\n- max_memory: 4GB\n- tls: enabled (certificates required)\n- auth: token-based\n- storage: filesystem with replication\n\n### Enterprise Mode (Production)\n\n**File**: `orchestrator.enterprise.example.ncl`\n\n**Characteristics**:\n- 16+ CPU, 32+ GB RAM (high-performance)\n- Multi-team, multi-workspace\n- Production mission-critical\n- Full redundancy and HA\n- OAuth2/Enterprise auth\n- Distributed storage with replication\n- Full monitoring, tracing, audit\n\n**When to use**:\n- Production deployment\n- Mission-critical systems\n- High-availability requirements\n- Multi-tenant environments\n- Compliance requirements (SOC2, ISO27001)\n\n**Key Settings**:\n- workers: 16\n- max_concurrent_tasks: 100\n- max_memory: 32GB\n- tls: mandatory (TLS 1.3)\n- auth: OAuth2 (enterprise provider)\n- storage: distributed with 3-way replication\n- monitoring: comprehensive with tracing\n- disaster_recovery: enabled\n- compliance: SOC2, ISO27001\n\n## How to Use These Examples\n\n### Step 1: Copy the Appropriate Example\n\nChoose the example that matches your deployment mode:\n\n```\n# For development (solo)\ncp provisioning/config/examples/orchestrator.solo.example.ncl \\n provisioning/config/runtime/orchestrator.solo.ncl\n\n# For team staging (multiuser)\ncp provisioning/config/examples/orchestrator.multiuser.example.ncl \\n provisioning/config/runtime/orchestrator.multiuser.ncl\n\n# For production (enterprise)\ncp provisioning/config/examples/orchestrator.enterprise.example.ncl \\n provisioning/config/runtime/orchestrator.enterprise.ncl\n```\n\n### Step 2: Customize for Your Environment\n\nEdit the copied file to match your specific setup:\n\n```\n# Edit the configuration\nvim provisioning/config/runtime/orchestrator.solo.ncl\n\n# Examples of customizations:\n# - Change workspace path to your project\n# - Adjust worker count based on CPU cores\n# - Set your domain names and hostnames\n# - Configure storage paths for your filesystem\n# - Update certificate paths for production\n# - Set logging endpoints for your infrastructure\n```\n\n### Step 3: Validate Configuration\n\nVerify the configuration is syntactically correct:\n\n```\n# Check Nickel syntax\nnickel typecheck provisioning/config/runtime/orchestrator.solo.ncl\n\n# View generated TOML\nnickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl\n```\n\n### Step 4: Generate TOML\n\nExport the Nickel configuration to TOML format for service consumption:\n\n```\n# Use setup script to generate TOML\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n\n# Or manually export\nnickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl > \\n provisioning/config/runtime/generated/orchestrator.solo.toml\n```\n\n### Step 5: Run Services\n\nStart your platform services with the generated configuration:\n\n```\n# Set the deployment mode\nexport ORCHESTRATOR_MODE=solo\n\n# Run the orchestrator\ncargo run -p orchestrator\n```\n\n## Configuration Reference\n\n### Solo Mode Example Settings\n\n```\nserver.workers = 2\nqueue.max_concurrent_tasks = 2\nperformance.max_memory = 1000 # 1GB max\nsecurity.tls.enabled = false # No TLS for local dev\nsecurity.auth.enabled = false # No auth for local dev\n```\n\n**Use case**: Single developer on local machine\n\n### Multiuser Mode Example Settings\n\n```\nserver.workers = 4\nqueue.max_concurrent_tasks = 10\nperformance.max_memory = 4000 # 4GB max\nsecurity.tls.enabled = true # Enable TLS\nsecurity.auth.type = "token" # Token-based auth\n```\n\n**Use case**: Team of 5-10 developers in staging\n\n### Enterprise Mode Example Settings\n\n```\nserver.workers = 16\nqueue.max_concurrent_tasks = 100\nperformance.max_memory = 32000 # 32GB max\nsecurity.tls.enabled = true # TLS 1.3 only\nsecurity.auth.type = "oauth2" # OAuth2 for enterprise\nstorage.replication.factor = 3 # 3-way replication\n```\n\n**Use case**: Production with 100+ users across multiple teams\n\n## Key Configuration Sections\n\n### Server Configuration\n\nControls HTTP server behavior:\n\n```\nserver = {\n host = "0.0.0.0", # Bind address\n port = 9090, # Listen port\n workers = 4, # Worker threads\n max_connections = 200, # Concurrent connections\n request_timeout = 30000, # Milliseconds\n}\n```\n\n### Storage Configuration\n\nControls data persistence:\n\n```\nstorage = {\n backend = "filesystem", # filesystem or distributed\n path = "/var/lib/provisioning/orchestrator/data",\n cache.enabled = true,\n replication.enabled = true,\n replication.factor = 3, # 3-way replication for HA\n}\n```\n\n### Queue Configuration\n\nControls task queuing:\n\n```\nqueue = {\n max_concurrent_tasks = 10,\n retry_attempts = 3,\n task_timeout = 3600000, # 1 hour in milliseconds\n priority_queue = true, # Enable priority for tasks\n metrics = true, # Enable queue metrics\n}\n```\n\n### Security Configuration\n\nControls authentication and encryption:\n\n```\nsecurity = {\n tls = {\n enabled = true,\n cert_path = "/etc/provisioning/certs/cert.crt",\n key_path = "/etc/provisioning/certs/key.key",\n min_tls_version = "1.3",\n },\n auth = {\n enabled = true,\n type = "oauth2", # oauth2, token, or none\n provider = "okta",\n },\n encryption = {\n enabled = true,\n algorithm = "aes-256-gcm",\n },\n}\n```\n\n### Logging Configuration\n\nControls log output and persistence:\n\n```\nlogging = {\n level = "info", # debug, info, warning, error\n format = "json",\n output = "both", # stdout, file, or both\n file = {\n enabled = true,\n path = "/var/log/orchestrator.log",\n rotation.max_size = 104857600, # 100MB per file\n },\n}\n```\n\n### Monitoring Configuration\n\nControls observability and metrics:\n\n```\nmonitoring = {\n enabled = true,\n metrics.enabled = true,\n health_check.enabled = true,\n distributed_tracing.enabled = true,\n audit_logging.enabled = true,\n}\n```\n\n## Customization Examples\n\n### Example 1: Change Workspace Name\n\nChange the workspace identifier in solo mode:\n\n```\nworkspace = {\n name = "myproject",\n path = "./provisioning/data/orchestrator",\n}\n```\n\nInstead of default "development", use "myproject".\n\n### Example 2: Custom Server Port\n\nChange server port from default 9090:\n\n```\nserver = {\n port = 8888,\n}\n```\n\nUseful if port 9090 is already in use.\n\n### Example 3: Enable TLS in Solo Mode\n\nAdd TLS certificates to solo development:\n\n```\nsecurity = {\n tls = {\n enabled = true,\n cert_path = "./certs/localhost.crt",\n key_path = "./certs/localhost.key",\n },\n}\n```\n\nUseful for testing TLS locally before production.\n\n### Example 4: Custom Storage Path\n\nUse custom storage location:\n\n```\nstorage = {\n path = "/mnt/fast-storage/orchestrator/data",\n}\n```\n\nUseful if you have fast SSD storage available.\n\n### Example 5: Increase Workers for Staging\n\nIncrease from 4 to 8 workers in multiuser:\n\n```\nserver = {\n workers = 8,\n}\n```\n\nUseful when you have more CPU cores available.\n\n## Troubleshooting Configuration\n\n### Issue: "Configuration Won't Validate"\n\n```\n# Check for Nickel syntax errors\nnickel typecheck provisioning/config/runtime/orchestrator.solo.ncl\n\n# Get detailed error message\nnickel typecheck -i provisioning/config/runtime/orchestrator.solo.ncl\n```\n\nThe typecheck command will show exactly where the syntax error is.\n\n### Issue: "Service Won't Start"\n\n```\n# Verify TOML was exported correctly\ncat provisioning/config/runtime/generated/orchestrator.solo.toml | head -20\n\n# Check TOML syntax is valid\ntoml-cli validate provisioning/config/runtime/generated/orchestrator.solo.toml\n```\n\nThe TOML must be valid for the Rust service to parse it.\n\n### Issue: "Service Uses Wrong Configuration"\n\n```\n# Verify deployment mode is set\necho $ORCHESTRATOR_MODE\n\n# Check which TOML file service reads\nls -lah provisioning/config/runtime/generated/orchestrator.*.toml\n\n# Verify TOML modification time is recent\nstat provisioning/config/runtime/generated/orchestrator.solo.toml\n```\n\nThe service reads from `orchestrator.{MODE}.toml` based on environment variable.\n\n## Best Practices\n\n### Development (Solo Mode)\n\n1. Start simple using the solo example as-is first\n2. Iterate gradually, making one change at a time\n3. Enable logging by setting level = "debug" for troubleshooting\n4. Disable security features for local development (TLS/auth)\n5. Store data in ./provisioning/data/ which is gitignored\n\n### Staging (Multiuser Mode)\n\n1. Mirror production settings to test realistically\n2. Enable authentication even in staging to test auth flows\n3. Enable TLS with valid certificates to test secure connections\n4. Set up monitoring metrics and health checks\n5. Plan worker count based on expected concurrent users\n\n### Production (Enterprise Mode)\n\n1. Follow the enterprise example as baseline configuration\n2. Use secure vault for storing credentials and secrets\n3. Enable redundancy with 3-way replication for HA\n4. Enable full monitoring with distributed tracing\n5. Test failover scenarios regularly\n6. Enable audit logging for compliance\n7. Enforce TLS 1.3 and certificate rotation\n\n## Migration Between Modes\n\nTo upgrade from solo → multiuser → enterprise:\n\n```\n# 1. Backup current configuration\ncp provisioning/config/runtime/orchestrator.solo.ncl \\n provisioning/config/runtime/orchestrator.solo.ncl.bak\n\n# 2. Copy new example for target mode\ncp provisioning/config/examples/orchestrator.multiuser.example.ncl \\n provisioning/config/runtime/orchestrator.multiuser.ncl\n\n# 3. Customize for your environment\nvim provisioning/config/runtime/orchestrator.multiuser.ncl\n\n# 4. Validate and generate TOML\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n\n# 5. Update mode environment variable and restart\nexport ORCHESTRATOR_MODE=multiuser\ncargo run -p orchestrator\n```\n\n## Related Documentation\n\n- **Platform Configuration Guide**: `provisioning/docs/src/getting-started/05-platform-configuration.md`\n- **Configuration README**: `provisioning/config/README.md`\n- **System Status**: `provisioning/config/SETUP_STATUS.md`\n- **Setup Script Reference**: `provisioning/scripts/setup-platform-config.sh.md`\n- **Advanced TypeDialog Guide**: `provisioning/docs/src/development/typedialog-platform-config-guide.md`\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2026-01-05\n**Status**: Ready to use diff --git a/examples/complete-workflow.md b/examples/complete-workflow.md index 9db45bc..91597d8 100644 --- a/examples/complete-workflow.md +++ b/examples/complete-workflow.md @@ -1 +1 @@ -# Complete Workflow Example: Kubernetes Cluster with Package System\n\nThis example demonstrates the complete workflow using the new KCL package and module loader system to deploy a production Kubernetes cluster.\n\n## Scenario\n\nDeploy a 3-node Kubernetes cluster with:\n\n- 1 master node\n- 2 worker nodes\n- Cilium CNI\n- Containerd runtime\n- UpCloud provider\n- Production-ready configuration\n\n## Prerequisites\n\n1. Core provisioning package installed\n2. UpCloud credentials configured\n3. SSH keys set up\n\n## Step 1: Environment Setup\n\n```\n# Ensure core package is installed\ncd /Users/Akasha/project-provisioning\n./provisioning/tools/kcl-packager.nu build --version 1.0.0\n./provisioning/tools/kcl-packager.nu install dist/provisioning-1.0.0.tar.gz\n\n# Verify installation\nkcl list packages | grep provisioning\n```\n\n## Step 2: Create Workspace\n\n```\n# Create new workspace from template\nmkdir -p workspace/infra/production-k8s\ncd workspace/infra/production-k8s\n\n# Initialize workspace structure\n../../../provisioning/tools/workspace-init.nu . init\n\n# Verify structure\ntree -a .\n```\n\nExpected output:\n\n```\n.\n├── kcl.mod\n├── servers.k\n├── README.md\n├── .gitignore\n├── .taskservs/\n├── .providers/\n├── .clusters/\n├── .manifest/\n├── data/\n├── tmp/\n├── resources/\n└── clusters/\n```\n\n## Step 3: Discover Available Modules\n\n```\n# Discover available taskservs\n../../../provisioning/core/cli/module-loader discover taskservs\n\n# Search for Kubernetes-related modules\n../../../provisioning/core/cli/module-loader discover taskservs kubernetes\n\n# Discover providers\n../../../provisioning/core/cli/module-loader discover providers\n\n# Check output formats\n../../../provisioning/core/cli/module-loader discover taskservs --format json\n```\n\n## Step 4: Load Required Modules\n\n```\n# Load Kubernetes stack taskservs\n../../../provisioning/core/cli/module-loader load taskservs . [kubernetes, cilium, containerd]\n\n# Load UpCloud provider\n../../../provisioning/core/cli/module-loader load providers . [upcloud]\n\n# Verify loading\n../../../provisioning/core/cli/module-loader list taskservs .\n../../../provisioning/core/cli/module-loader list providers .\n```\n\nCheck generated files:\n\n```\n# Check auto-generated imports\ncat taskservs.k\ncat providers.k\n\n# Check manifest\ncat .manifest/taskservs.yaml\ncat .manifest/providers.yaml\n```\n\n## Step 5: Configure Infrastructure\n\nEdit `servers.k` to configure the Kubernetes cluster:\n\n```\n# Production Kubernetes Cluster Configuration\nimport provisioning.settings as settings\nimport provisioning.server as server\nimport provisioning.defaults as defaults\n\n# Import loaded modules (auto-generated)\nimport .taskservs.kubernetes.kubernetes as k8s\nimport .taskservs.cilium.cilium as cilium\nimport .taskservs.containerd.containerd as containerd\nimport .providers.upcloud as upcloud\n\n# Cluster settings\nk8s_settings: settings.Settings = {\n main_name = "production-k8s"\n main_title = "Production Kubernetes Cluster"\n\n # Configure paths\n settings_path = "./data/settings.yaml"\n defaults_provs_dirpath = "./defs"\n prov_data_dirpath = "./data"\n created_taskservs_dirpath = "./tmp/k8s-deployment"\n prov_resources_path = "./resources"\n created_clusters_dirpath = "./tmp/k8s-clusters"\n prov_clusters_path = "./clusters"\n\n # Kubernetes cluster settings\n cluster_admin_host = "" # Set by provider (first master)\n cluster_admin_port = 22\n cluster_admin_user = "admin"\n servers_wait_started = 60 # K8s nodes need more time\n\n runset = {\n wait = True\n output_format = "human"\n output_path = "tmp/k8s-deployment"\n inventory_file = "./k8s-inventory.yaml"\n use_time = True\n }\n\n # Secrets configuration\n secrets = {\n provider = "sops"\n sops_config = {\n age_key_file = "~/.age/keys.txt"\n use_age = True\n }\n }\n}\n\n# Production Kubernetes cluster servers\nproduction_servers: [server.Server] = [\n # Control plane node\n {\n hostname = "k8s-master-01"\n title = "Kubernetes Master Node 01"\n\n # Production specifications\n time_zone = "UTC"\n running_wait = 20\n running_timeout = 400\n storage_os_find = "name: debian-12 | arch: x86_64"\n\n # Network configuration\n network_utility_ipv4 = True\n network_public_ipv4 = True\n priv_cidr_block = "10.0.0.0/24"\n\n # User settings\n user = "admin"\n user_ssh_port = 22\n fix_local_hosts = True\n labels = "env: production, role: control-plane, tier: master"\n\n # Taskservs configuration\n taskservs = [\n {\n name = "containerd"\n profile = "production"\n install_mode = "library"\n },\n {\n name = "kubernetes"\n profile = "master"\n install_mode = "library-server"\n },\n {\n name = "cilium"\n profile = "master"\n install_mode = "library"\n }\n ]\n },\n\n # Worker nodes\n {\n hostname = "k8s-worker-01"\n title = "Kubernetes Worker Node 01"\n\n time_zone = "UTC"\n running_wait = 20\n running_timeout = 400\n storage_os_find = "name: debian-12 | arch: x86_64"\n\n network_utility_ipv4 = True\n network_public_ipv4 = True\n priv_cidr_block = "10.0.0.0/24"\n\n user = "admin"\n user_ssh_port = 22\n fix_local_hosts = True\n labels = "env: production, role: worker, tier: compute"\n\n taskservs = [\n {\n name = "containerd"\n profile = "production"\n install_mode = "library"\n },\n {\n name = "kubernetes"\n profile = "worker"\n install_mode = "library"\n },\n {\n name = "cilium"\n profile = "worker"\n install_mode = "library"\n }\n ]\n },\n\n {\n hostname = "k8s-worker-02"\n title = "Kubernetes Worker Node 02"\n\n time_zone = "UTC"\n running_wait = 20\n running_timeout = 400\n storage_os_find = "name: debian-12 | arch: x86_64"\n\n network_utility_ipv4 = True\n network_public_ipv4 = True\n priv_cidr_block = "10.0.0.0/24"\n\n user = "admin"\n user_ssh_port = 22\n fix_local_hosts = True\n labels = "env: production, role: worker, tier: compute"\n\n taskservs = [\n {\n name = "containerd"\n profile = "production"\n install_mode = "library"\n },\n {\n name = "kubernetes"\n profile = "worker"\n install_mode = "library"\n },\n {\n name = "cilium"\n profile = "worker"\n install_mode = "library"\n }\n ]\n }\n]\n\n# Export for provisioning system\n{\n settings = k8s_settings\n servers = production_servers\n}\n```\n\n## Step 6: Validate Configuration\n\n```\n# Validate KCL configuration\nkcl run servers.k\n\n# Validate workspace\n../../../provisioning/core/cli/module-loader validate .\n\n# Check workspace info\n../../../provisioning/tools/workspace-init.nu . info\n```\n\n## Step 7: Configure Provider Credentials\n\n```\n# Create provider configuration directory\nmkdir -p defs\n\n# Create UpCloud provider defaults (example)\ncat > defs/upcloud_defaults.k << 'EOF'\n# UpCloud Provider Defaults\nimport provisioning.defaults as defaults\n\nupcloud_defaults: defaults.ServerDefaults = {\n lock = False\n time_zone = "UTC"\n running_wait = 15\n running_timeout = 300\n\n # UpCloud specific settings\n storage_os_find = "name: debian-12 | arch: x86_64"\n\n # Network settings\n network_utility_ipv4 = True\n network_public_ipv4 = True\n\n # SSH settings\n ssh_key_path = "~/.ssh/id_rsa.pub"\n user = "admin"\n user_ssh_port = 22\n fix_local_hosts = True\n\n # UpCloud plan specifications\n labels = "provider: upcloud"\n}\n\nupcloud_defaults\nEOF\n```\n\n## Step 8: Deploy Infrastructure\n\n```\n# Create servers with check mode first\n../../../provisioning/core/cli/provisioning server create --infra . --check\n\n# If validation passes, deploy for real\n../../../provisioning/core/cli/provisioning server create --infra .\n\n# Monitor server creation\n../../../provisioning/core/cli/provisioning server list --infra .\n```\n\n## Step 9: Install Taskservs\n\n```\n# Install containerd on all nodes\n../../../provisioning/core/cli/provisioning taskserv create containerd --infra .\n\n# Install Kubernetes (this will set up master and join workers)\n../../../provisioning/core/cli/provisioning taskserv create kubernetes --infra .\n\n# Install Cilium CNI\n../../../provisioning/core/cli/provisioning taskserv create cilium --infra .\n```\n\n## Step 10: Verify Cluster\n\n```\n# SSH to master node and verify cluster\n../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra .\n\n# On the master node:\nkubectl get nodes\nkubectl get pods -A\nkubectl get services -A\n\n# Test Cilium connectivity\ncilium status\ncilium connectivity test\n```\n\n## Step 11: Deploy Sample Application\n\nCreate a test deployment to verify the cluster:\n\n```\n# Create namespace\nkubectl create namespace test-app\n\n# Deploy nginx\nkubectl create deployment nginx --image=nginx:latest -n test-app\nkubectl expose deployment nginx --port=80 --type=ClusterIP -n test-app\n\n# Verify deployment\nkubectl get pods -n test-app\nkubectl get services -n test-app\n```\n\n## Step 12: Cluster Management\n\n```\n# Add monitoring (example)\n../../../provisioning/core/cli/module-loader load taskservs . [prometheus, grafana]\n\n# Regenerate configuration\n../../../provisioning/core/cli/module-loader list taskservs .\n\n# Deploy monitoring stack\n../../../provisioning/core/cli/provisioning taskserv create prometheus --infra .\n../../../provisioning/core/cli/provisioning taskserv create grafana --infra .\n```\n\n## Step 13: Backup and Documentation\n\n```\n# Create cluster documentation\ncat > cluster-info.md << 'EOF'\n# Production Kubernetes Cluster\n\n## Cluster Details\n- **Name**: production-k8s\n- **Nodes**: 3 (1 master, 2 workers)\n- **CNI**: Cilium\n- **Runtime**: Containerd\n- **Provider**: UpCloud\n\n## Node Information\n- k8s-master-01: Control plane node\n- k8s-worker-01: Worker node\n- k8s-worker-02: Worker node\n\n## Loaded Modules\n- kubernetes (master/worker profiles)\n- cilium (cluster networking)\n- containerd (container runtime)\n- upcloud (cloud provider)\n\n## Management Commands\n```\n# SSH to master\n../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra .\n\n# Update cluster\n../../../provisioning/core/cli/provisioning taskserv generate kubernetes --infra .\n```\n\nEOF\n\n# Backup workspace\n\ncp -r . ../production-k8s-backup-$(date +%Y%m%d)\n\n# Commit to version control\n\ngit add .\ngit commit -m "Initial Kubernetes cluster deployment with package system"\n\n```\n\n## Troubleshooting\n\n### Module Loading Issues\n```\n# If modules don't load properly\n../../../provisioning/core/cli/module-loader discover taskservs\n../../../provisioning/core/cli/module-loader load taskservs . [kubernetes, cilium, containerd] --force\n\n# Check generated imports\ncat taskservs.k\n```\n\n### KCL Compilation Issues\n\n```\n# Check for syntax errors\nkcl check servers.k\n\n# Validate specific schemas\nkcl run --dry-run servers.k\n```\n\n### Provider Authentication Issues\n\n```\n# Check provider configuration\ncat .providers/upcloud/provision_upcloud.k\n\n# Verify credentials\n../../../provisioning/core/cli/provisioning server price --provider upcloud\n```\n\n### Kubernetes Setup Issues\n\n```\n# Check taskserv logs\ntail -f tmp/k8s-deployment/kubernetes-*.log\n\n# Verify SSH connectivity\n../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra . --command "systemctl status kubelet"\n```\n\n## Next Steps\n\n1. **Scale the cluster**: Add more worker nodes\n2. **Add storage**: Load and configure storage taskservs (rook-ceph, mayastor)\n3. **Setup monitoring**: Deploy Prometheus/Grafana stack\n4. **Configure ingress**: Set up ingress controllers\n5. **Implement GitOps**: Configure ArgoCD or Flux\n\nThis example demonstrates the complete workflow from workspace creation to production Kubernetes cluster deployment using the new package-based system. \ No newline at end of file +# Complete Workflow Example: Kubernetes Cluster with Package System\n\nThis example demonstrates the complete workflow using the new KCL package and module loader system to deploy a production Kubernetes cluster.\n\n## Scenario\n\nDeploy a 3-node Kubernetes cluster with:\n\n- 1 master node\n- 2 worker nodes\n- Cilium CNI\n- Containerd runtime\n- UpCloud provider\n- Production-ready configuration\n\n## Prerequisites\n\n1. Core provisioning package installed\n2. UpCloud credentials configured\n3. SSH keys set up\n\n## Step 1: Environment Setup\n\n```\n# Ensure core package is installed\ncd /Users/Akasha/project-provisioning\n./provisioning/tools/kcl-packager.nu build --version 1.0.0\n./provisioning/tools/kcl-packager.nu install dist/provisioning-1.0.0.tar.gz\n\n# Verify installation\nkcl list packages | grep provisioning\n```\n\n## Step 2: Create Workspace\n\n```\n# Create new workspace from template\nmkdir -p workspace/infra/production-k8s\ncd workspace/infra/production-k8s\n\n# Initialize workspace structure\n../../../provisioning/tools/workspace-init.nu . init\n\n# Verify structure\ntree -a .\n```\n\nExpected output:\n\n```\n.\n├── kcl.mod\n├── servers.k\n├── README.md\n├── .gitignore\n├── .taskservs/\n├── .providers/\n├── .clusters/\n├── .manifest/\n├── data/\n├── tmp/\n├── resources/\n└── clusters/\n```\n\n## Step 3: Discover Available Modules\n\n```\n# Discover available taskservs\n../../../provisioning/core/cli/module-loader discover taskservs\n\n# Search for Kubernetes-related modules\n../../../provisioning/core/cli/module-loader discover taskservs kubernetes\n\n# Discover providers\n../../../provisioning/core/cli/module-loader discover providers\n\n# Check output formats\n../../../provisioning/core/cli/module-loader discover taskservs --format json\n```\n\n## Step 4: Load Required Modules\n\n```\n# Load Kubernetes stack taskservs\n../../../provisioning/core/cli/module-loader load taskservs . [kubernetes, cilium, containerd]\n\n# Load UpCloud provider\n../../../provisioning/core/cli/module-loader load providers . [upcloud]\n\n# Verify loading\n../../../provisioning/core/cli/module-loader list taskservs .\n../../../provisioning/core/cli/module-loader list providers .\n```\n\nCheck generated files:\n\n```\n# Check auto-generated imports\ncat taskservs.k\ncat providers.k\n\n# Check manifest\ncat .manifest/taskservs.yaml\ncat .manifest/providers.yaml\n```\n\n## Step 5: Configure Infrastructure\n\nEdit `servers.k` to configure the Kubernetes cluster:\n\n```\n# Production Kubernetes Cluster Configuration\nimport provisioning.settings as settings\nimport provisioning.server as server\nimport provisioning.defaults as defaults\n\n# Import loaded modules (auto-generated)\nimport .taskservs.kubernetes.kubernetes as k8s\nimport .taskservs.cilium.cilium as cilium\nimport .taskservs.containerd.containerd as containerd\nimport .providers.upcloud as upcloud\n\n# Cluster settings\nk8s_settings: settings.Settings = {\n main_name = "production-k8s"\n main_title = "Production Kubernetes Cluster"\n\n # Configure paths\n settings_path = "./data/settings.yaml"\n defaults_provs_dirpath = "./defs"\n prov_data_dirpath = "./data"\n created_taskservs_dirpath = "./tmp/k8s-deployment"\n prov_resources_path = "./resources"\n created_clusters_dirpath = "./tmp/k8s-clusters"\n prov_clusters_path = "./clusters"\n\n # Kubernetes cluster settings\n cluster_admin_host = "" # Set by provider (first master)\n cluster_admin_port = 22\n cluster_admin_user = "admin"\n servers_wait_started = 60 # K8s nodes need more time\n\n runset = {\n wait = True\n output_format = "human"\n output_path = "tmp/k8s-deployment"\n inventory_file = "./k8s-inventory.yaml"\n use_time = True\n }\n\n # Secrets configuration\n secrets = {\n provider = "sops"\n sops_config = {\n age_key_file = "~/.age/keys.txt"\n use_age = True\n }\n }\n}\n\n# Production Kubernetes cluster servers\nproduction_servers: [server.Server] = [\n # Control plane node\n {\n hostname = "k8s-master-01"\n title = "Kubernetes Master Node 01"\n\n # Production specifications\n time_zone = "UTC"\n running_wait = 20\n running_timeout = 400\n storage_os_find = "name: debian-12 | arch: x86_64"\n\n # Network configuration\n network_utility_ipv4 = True\n network_public_ipv4 = True\n priv_cidr_block = "10.0.0.0/24"\n\n # User settings\n user = "admin"\n user_ssh_port = 22\n fix_local_hosts = True\n labels = "env: production, role: control-plane, tier: master"\n\n # Taskservs configuration\n taskservs = [\n {\n name = "containerd"\n profile = "production"\n install_mode = "library"\n },\n {\n name = "kubernetes"\n profile = "master"\n install_mode = "library-server"\n },\n {\n name = "cilium"\n profile = "master"\n install_mode = "library"\n }\n ]\n },\n\n # Worker nodes\n {\n hostname = "k8s-worker-01"\n title = "Kubernetes Worker Node 01"\n\n time_zone = "UTC"\n running_wait = 20\n running_timeout = 400\n storage_os_find = "name: debian-12 | arch: x86_64"\n\n network_utility_ipv4 = True\n network_public_ipv4 = True\n priv_cidr_block = "10.0.0.0/24"\n\n user = "admin"\n user_ssh_port = 22\n fix_local_hosts = True\n labels = "env: production, role: worker, tier: compute"\n\n taskservs = [\n {\n name = "containerd"\n profile = "production"\n install_mode = "library"\n },\n {\n name = "kubernetes"\n profile = "worker"\n install_mode = "library"\n },\n {\n name = "cilium"\n profile = "worker"\n install_mode = "library"\n }\n ]\n },\n\n {\n hostname = "k8s-worker-02"\n title = "Kubernetes Worker Node 02"\n\n time_zone = "UTC"\n running_wait = 20\n running_timeout = 400\n storage_os_find = "name: debian-12 | arch: x86_64"\n\n network_utility_ipv4 = True\n network_public_ipv4 = True\n priv_cidr_block = "10.0.0.0/24"\n\n user = "admin"\n user_ssh_port = 22\n fix_local_hosts = True\n labels = "env: production, role: worker, tier: compute"\n\n taskservs = [\n {\n name = "containerd"\n profile = "production"\n install_mode = "library"\n },\n {\n name = "kubernetes"\n profile = "worker"\n install_mode = "library"\n },\n {\n name = "cilium"\n profile = "worker"\n install_mode = "library"\n }\n ]\n }\n]\n\n# Export for provisioning system\n{\n settings = k8s_settings\n servers = production_servers\n}\n```\n\n## Step 6: Validate Configuration\n\n```\n# Validate KCL configuration\nkcl run servers.k\n\n# Validate workspace\n../../../provisioning/core/cli/module-loader validate .\n\n# Check workspace info\n../../../provisioning/tools/workspace-init.nu . info\n```\n\n## Step 7: Configure Provider Credentials\n\n```\n# Create provider configuration directory\nmkdir -p defs\n\n# Create UpCloud provider defaults (example)\ncat > defs/upcloud_defaults.k << 'EOF'\n# UpCloud Provider Defaults\nimport provisioning.defaults as defaults\n\nupcloud_defaults: defaults.ServerDefaults = {\n lock = False\n time_zone = "UTC"\n running_wait = 15\n running_timeout = 300\n\n # UpCloud specific settings\n storage_os_find = "name: debian-12 | arch: x86_64"\n\n # Network settings\n network_utility_ipv4 = True\n network_public_ipv4 = True\n\n # SSH settings\n ssh_key_path = "~/.ssh/id_rsa.pub"\n user = "admin"\n user_ssh_port = 22\n fix_local_hosts = True\n\n # UpCloud plan specifications\n labels = "provider: upcloud"\n}\n\nupcloud_defaults\nEOF\n```\n\n## Step 8: Deploy Infrastructure\n\n```\n# Create servers with check mode first\n../../../provisioning/core/cli/provisioning server create --infra . --check\n\n# If validation passes, deploy for real\n../../../provisioning/core/cli/provisioning server create --infra .\n\n# Monitor server creation\n../../../provisioning/core/cli/provisioning server list --infra .\n```\n\n## Step 9: Install Taskservs\n\n```\n# Install containerd on all nodes\n../../../provisioning/core/cli/provisioning taskserv create containerd --infra .\n\n# Install Kubernetes (this will set up master and join workers)\n../../../provisioning/core/cli/provisioning taskserv create kubernetes --infra .\n\n# Install Cilium CNI\n../../../provisioning/core/cli/provisioning taskserv create cilium --infra .\n```\n\n## Step 10: Verify Cluster\n\n```\n# SSH to master node and verify cluster\n../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra .\n\n# On the master node:\nkubectl get nodes\nkubectl get pods -A\nkubectl get services -A\n\n# Test Cilium connectivity\ncilium status\ncilium connectivity test\n```\n\n## Step 11: Deploy Sample Application\n\nCreate a test deployment to verify the cluster:\n\n```\n# Create namespace\nkubectl create namespace test-app\n\n# Deploy nginx\nkubectl create deployment nginx --image=nginx:latest -n test-app\nkubectl expose deployment nginx --port=80 --type=ClusterIP -n test-app\n\n# Verify deployment\nkubectl get pods -n test-app\nkubectl get services -n test-app\n```\n\n## Step 12: Cluster Management\n\n```\n# Add monitoring (example)\n../../../provisioning/core/cli/module-loader load taskservs . [prometheus, grafana]\n\n# Regenerate configuration\n../../../provisioning/core/cli/module-loader list taskservs .\n\n# Deploy monitoring stack\n../../../provisioning/core/cli/provisioning taskserv create prometheus --infra .\n../../../provisioning/core/cli/provisioning taskserv create grafana --infra .\n```\n\n## Step 13: Backup and Documentation\n\n```\n# Create cluster documentation\ncat > cluster-info.md << 'EOF'\n# Production Kubernetes Cluster\n\n## Cluster Details\n- **Name**: production-k8s\n- **Nodes**: 3 (1 master, 2 workers)\n- **CNI**: Cilium\n- **Runtime**: Containerd\n- **Provider**: UpCloud\n\n## Node Information\n- k8s-master-01: Control plane node\n- k8s-worker-01: Worker node\n- k8s-worker-02: Worker node\n\n## Loaded Modules\n- kubernetes (master/worker profiles)\n- cilium (cluster networking)\n- containerd (container runtime)\n- upcloud (cloud provider)\n\n## Management Commands\n```\n# SSH to master\n../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra .\n\n# Update cluster\n../../../provisioning/core/cli/provisioning taskserv generate kubernetes --infra .\n```\n\nEOF\n\n# Backup workspace\n\ncp -r . ../production-k8s-backup-$(date +%Y%m%d)\n\n# Commit to version control\n\ngit add .\ngit commit -m "Initial Kubernetes cluster deployment with package system"\n\n```\n\n## Troubleshooting\n\n### Module Loading Issues\n```\n# If modules don't load properly\n../../../provisioning/core/cli/module-loader discover taskservs\n../../../provisioning/core/cli/module-loader load taskservs . [kubernetes, cilium, containerd] --force\n\n# Check generated imports\ncat taskservs.k\n```\n\n### KCL Compilation Issues\n\n```\n# Check for syntax errors\nkcl check servers.k\n\n# Validate specific schemas\nkcl run --dry-run servers.k\n```\n\n### Provider Authentication Issues\n\n```\n# Check provider configuration\ncat .providers/upcloud/provision_upcloud.k\n\n# Verify credentials\n../../../provisioning/core/cli/provisioning server price --provider upcloud\n```\n\n### Kubernetes Setup Issues\n\n```\n# Check taskserv logs\ntail -f tmp/k8s-deployment/kubernetes-*.log\n\n# Verify SSH connectivity\n../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra . --command "systemctl status kubelet"\n```\n\n## Next Steps\n\n1. **Scale the cluster**: Add more worker nodes\n2. **Add storage**: Load and configure storage taskservs (rook-ceph, mayastor)\n3. **Setup monitoring**: Deploy Prometheus/Grafana stack\n4. **Configure ingress**: Set up ingress controllers\n5. **Implement GitOps**: Configure ArgoCD or Flux\n\nThis example demonstrates the complete workflow from workspace creation to production Kubernetes cluster deployment using the new package-based system. diff --git a/examples/workspaces/cost-optimized/README.md b/examples/workspaces/cost-optimized/README.md index f677543..e0892b3 100644 --- a/examples/workspaces/cost-optimized/README.md +++ b/examples/workspaces/cost-optimized/README.md @@ -1 +1 @@ -# Cost-Optimized Multi-Provider Workspace\n\nThis workspace demonstrates cost optimization through intelligent provider specialization:\n\n- **Hetzner**: Compute tier (CPX21 servers at €20.90/month) - best price/performance\n- **AWS**: Managed services (RDS, ElastiCache, SQS) - reliability without ops overhead\n- **DigitalOcean**: CDN and object storage - affordable content delivery\n\n## Why This Architecture?\n\n### Cost Comparison\n\n```\nCost-Optimized Architecture:\n├── Hetzner compute: €72.70/month (~$78)\n├── AWS managed services: $115/month\n└── DigitalOcean CDN: $64/month\nTotal: ~$280/month\n\nAll-AWS Equivalent:\n├── EC2 instances: ~$200+\n├── RDS database: ~$150+\n├── ElastiCache: ~$50+\n├── CloudFront CDN: ~$100+\n└── Other services: ~$50+\nTotal: ~$600+/month\n\nSavings: ~$320/month (53% reduction)\n```\n\n### Architecture Benefits\n\n**Hetzner Advantages**:\n- Best price/performance for compute (€20.90/month for 4 vCPU/8GB)\n- Powerful Load Balancer (€10/month)\n- Fast networking (10Gbps)\n- EU data residency (GDPR compliant)\n\n**AWS Advantages**:\n- Managed RDS: Automatic backups, failover, patching\n- ElastiCache: Redis cluster with automatic failover\n- SQS: Scalable message queue (pay per message)\n- CloudWatch: Comprehensive monitoring\n\n**DigitalOcean Advantages**:\n- CDN: Cost-effective content delivery ($25/month)\n- Spaces: Object storage at scale ($15/month)\n- Simple pricing and management\n- Edge nodes for regional distribution\n\n## Architecture Overview\n\n```\n┌────────────────────────────────────────────────┐\n│ Client Requests │\n└─────────────────┬────────────────────────────────┘\n │ HTTPS/HTTP\n ┌────────▼─────────┐\n │ DigitalOcean │\n │ CDN / Spaces │\n └────────┬─────────┘\n │\n ┌────────────┼────────────┐\n │ │ │\n┌────▼──────┐ ┌──▼────────┐ ┌─▼──────┐\n│ Hetzner │ │ AWS │ │ DO │\n│ Compute │ │ Managed │ │ CDN │\n│ (Load LB) │ │ Services │ │ │\n└────┬──────┘ └──┬────────┘ └────────┘\n │VPN Tunnel │\n┌────▼──────────▼────┐\n│ Hetzner Network │ AWS VPC DO Spaces\n│ 10.0.0.0/16 ◄──► 10.1.0.0/16 ◄──► nyc3\n│ 3x CPX21 Servers │ RDS + Cache CDN +\n│ │ + SQS Backups\n└────────────────────┘\n```\n\n## Prerequisites\n\n### 1. Cloud Accounts\n\n- **Hetzner**: Account with API token\n- **AWS**: Account with access keys\n- **DigitalOcean**: Account with API token\n\n### 2. Environment Variables\n\n```\nexport HCLOUD_TOKEN="MC4wNTI1YmE1M2E4YmE0YTQzMTQyZTdlODYy"\nexport AWS_ACCESS_KEY_ID="AKIA1234567890ABCDEF"\nexport AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG+j/zI0m1234567890ab"\nexport DIGITALOCEAN_TOKEN="dop_v1_abc123def456ghi789jkl012mno"\n```\n\n### 3. CLI Tools\n\n```\n# Install and verify\nwhich hcloud && hcloud version\nwhich aws && aws --version\nwhich doctl && doctl version\nwhich nickel && nickel --version\n```\n\n### 4. SSH Keys\n\n```\n# Hetzner\nhcloud ssh-key create --name provisioning-key \\n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# AWS\naws ec2 create-key-pair --key-name provisioning-key \\n --query 'KeyMaterial' --output text > provisioning-key.pem\nchmod 600 provisioning-key.pem\n\n# DigitalOcean\ndoctl compute ssh-key create provisioning-key \\n --public-key-from-file ~/.ssh/id_rsa.pub\n```\n\n## Deployment\n\n### Step 1: Configure the Workspace\n\nEdit `workspace.ncl`:\n\n```\n# Update networking if needed\ncompute_tier.primary_servers = hetzner.Server & {\n server_type = "cpx21",\n count = 3,\n location = "nbg1"\n}\n\n# Update AWS region if needed\nmanaged_services.database = aws.RDS & {\n instance_class = "db.t3.small",\n region = "us-east-1"\n}\n\n# Update CDN endpoints\ncdn_tier.cdn.endpoints = [{\n name = "app-cdn",\n origin = "content.example.com"\n}]\n```\n\nEdit `config.toml`:\n\n```\n[cost_tracking]\nmonthly_budget = 300\nbudget_alert_threshold = 280\n\n[application.cache]\nmax_memory = "250MB"\n```\n\n### Step 2: Validate Configuration\n\n```\n# Validate Nickel syntax\nnickel export workspace.ncl | jq . > /dev/null\n\n# Verify provider access\nhcloud context use default\naws sts get-caller-identity\ndoctl account get\n```\n\n### Step 3: Deploy\n\n```\nchmod +x deploy.nu\n./deploy.nu\n\n# Or with debug output\n./deploy.nu --debug\n```\n\n### Step 4: Verify Deployment\n\n```\n# Hetzner compute resources\nhcloud server list\nhcloud load-balancer list\n\n# AWS managed services\naws rds describe-db-instances --region us-east-1\naws elasticache describe-cache-clusters --region us-east-1\naws sqs list-queues --region us-east-1\n\n# DigitalOcean CDN\ndoctl compute cdn list\ndoctl compute spaces list\n```\n\n## Post-Deployment Configuration\n\n### 1. Connect Hetzner Compute to AWS Database\n\n```\n# Get Hetzner server IPs\nhcloud server list --format ID,PublicIPv4\n\n# Get RDS endpoint\naws rds describe-db-instances --region us-east-1 \\n --query 'DBInstances[0].Endpoint.Address'\n\n# On Hetzner server, install PostgreSQL client\nssh root@hetzner-server\napt-get update && apt-get install postgresql-client\n\n# Test connection to RDS\npsql -h app-db.abc123.us-east-1.rds.amazonaws.com \\n -U admin -d postgres -c "SELECT now();"\n```\n\n### 2. Configure Application for Services\n\n```\n# Application configuration file\ncat > /var/www/app/.env << EOF\nDATABASE_HOST=app-db.abc123.us-east-1.rds.amazonaws.com\nDATABASE_PORT=5432\nDATABASE_USER=admin\nDATABASE_PASSWORD=your_password\nDATABASE_NAME=app_db\n\nREDIS_HOST=app-cache.abc123.ng.0001.euc1.cache.amazonaws.com\nREDIS_PORT=6379\n\nSQS_QUEUE_URL=https://sqs.us-east-1.amazonaws.com/123456789/app-queue\n\nCDN_ENDPOINT=https://content.example.com\nSPACES_ENDPOINT=https://app-content.nyc3.digitaloceanspaces.com\nSPACES_KEY=your_spaces_key\nSPACES_SECRET=your_spaces_secret\n\nENVIRONMENT=production\nEOF\n```\n\n### 3. Setup CDN and Object Storage\n\n```\n# Configure Spaces bucket\ndoctl compute spaces create app-content --region nyc3\n\n# Get Spaces endpoint\ndoctl compute spaces list\n\n# Configure CDN endpoint\ndoctl compute cdn create --origin content.example.com\n\n# Upload test file\naws s3 cp test.html s3://app-content/\n```\n\n### 4. Configure Application Queue\n\n```\n# Get SQS queue URL\naws sqs list-queues --region us-east-1\n\n# Create queue if needed\naws sqs create-queue --queue-name app-queue --region us-east-1\n\n# Test queue\naws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue \\n --message-body "test message" --region us-east-1\n```\n\n### 5. Deploy Application\n\nSSH to Hetzner servers:\n\n```\n# Get server IPs\nSERVERS=$(hcloud server list --format PublicIPv4 --no-header)\n\n# Deploy to each server\nfor server in $SERVERS; do\n ssh -o StrictHostKeyChecking=no root@$server << 'DEPLOY'\n cd /var/www\n git clone https://github.com/your-org/app.git\n cd app\n cp .env.example .env\n ./deploy.sh\n DEPLOY\ndone\n```\n\n## Monitoring and Cost Control\n\n### Cost Monitoring\n\n```\n# Hetzner billing\n# Manual via console: https://console.hetzner.cloud/billing\n\n# AWS cost tracking\naws ce get-cost-and-usage \\n --time-period Start=2024-01-01,End=2024-01-31 \\n --granularity MONTHLY \\n --metrics BlendedCost \\n --group-by Type=DIMENSION,Key=SERVICE\n\n# DigitalOcean billing\ndoctl billing get\n\n# Real-time cost status\naws ce get-cost-and-usage \\n --time-period Start=$(date -d '1 day ago' +%Y-%m-%d),End=$(date +%Y-%m-%d) \\n --granularity DAILY \\n --metrics BlendedCost\n```\n\n### Application Performance Monitoring\n\n```\n# RDS performance insights\naws pi describe-dimension-keys \\n --service-type RDS \\n --identifier arn:aws:rds:us-east-1:123456789:db:app-db \\n --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \\n --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \\n --period-in-seconds 60 \\n --metric db.load.avg \\n --partition-by Dimension \\n --dimension-group.group-by WAIT_EVENT\n\n# ElastiCache monitoring\naws cloudwatch get-metric-statistics \\n --namespace AWS/ElastiCache \\n --metric-name CPUUtilization \\n --dimensions Name=CacheClusterId,Value=app-cache \\n --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \\n --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \\n --period 300 \\n --statistics Average\n\n# SQS monitoring\naws sqs get-queue-attributes \\n --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue \\n --attribute-names All\n```\n\n### Alerts Configuration\n\n```\n# CPU threshold alert\naws cloudwatch put-metric-alarm \\n --alarm-name hetzner-cpu-high \\n --alarm-description "Alert when Hetzner CPU > 80%" \\n --metric-name CPUUtilization \\n --threshold 80 \\n --comparison-operator GreaterThanThreshold\n\n# Queue depth alert\naws cloudwatch put-metric-alarm \\n --alarm-name sqs-queue-depth-high \\n --alarm-description "Alert when SQS queue depth > 1000" \\n --metric-name ApproximateNumberOfMessagesVisible \\n --threshold 1000 \\n --comparison-operator GreaterThanThreshold\n\n# Cache eviction alert\naws cloudwatch put-metric-alarm \\n --alarm-name elasticache-eviction-rate-high \\n --alarm-description "Alert when cache eviction rate > 10%" \\n --metric-name EvictionRate \\n --namespace AWS/ElastiCache \\n --threshold 10 \\n --comparison-operator GreaterThanThreshold\n```\n\n## Scaling and Optimization\n\n### Scale Hetzner Compute\n\nEdit `workspace.ncl`:\n\n```\ncompute_tier.primary_servers = hetzner.Server & {\n count = 5, # Increase from 3\n server_type = "cpx21"\n}\n```\n\nRedeploy:\n\n```\n./deploy.nu\n```\n\n### Upgrade Database\n\n```\n# Modify RDS instance class\naws rds modify-db-instance \\n --db-instance-identifier app-db \\n --db-instance-class db.t3.medium \\n --apply-immediately \\n --region us-east-1\n```\n\n### Add Caching Layer\n\nAlready configured with ElastiCache. Optimize by adjusting:\n\n```\n[application.cache]\nmax_memory = "512MB"\neviction_policy = "allkeys-lru"\n```\n\n### Increase Queue Throughput\n\nSQS automatically scales. Monitor with:\n\n```\naws sqs get-queue-attributes \\n --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue \\n --attribute-names ApproximateNumberOfMessages\n```\n\n## Cost Optimization Tips\n\n1. **Hetzner Compute**: CPX21 is sweet spot. Consider CX21 for lower workloads\n2. **AWS RDS**: Use t3.small for dev, t3.medium for prod with burst capability\n3. **ElastiCache**: 2 nodes with auto-failover. Monitor eviction rates\n4. **SQS**: Pay per request, no fixed costs. Good for variable load\n5. **DigitalOcean CDN**: Cache more aggressively (86400s TTL for assets)\n6. **Spaces**: Use lifecycle rules to delete old files automatically\n\n### Cost Reduction Checklist\n\n- Reduce Hetzner servers from 3 to 2 (saves ~€21/month)\n- Downgrade RDS to db.t3.micro for dev (saves ~$40/month)\n- Reduce ElastiCache nodes from 2 to 1 (saves ~$12/month)\n- Archive old CDN content (savings from Spaces storage)\n- Use reserved capacity on AWS (20-30% discount)\n\nPotential total savings: ~$100+/month with right-sizing.\n\n## Troubleshooting\n\n### Issue: Hetzner Can't Connect to RDS\n\n**Diagnosis**:\n```\n# SSH to Hetzner server\nssh root@hetzner-server\n\n# Test connectivity\nnc -zv app-db.abc123.us-east-1.rds.amazonaws.com 5432\n```\n\n**Solution**:\n- Check VPN tunnel is active\n- Verify RDS security group allows port 5432 from Hetzner network\n- Check route table on both sides\n\n### Issue: High Database Latency\n\n**Diagnosis**:\n```\n# Check RDS performance\naws pi describe-dimension-keys --service-type RDS ...\n\n# Check network latency\nping -c 5 app-db.abc123.us-east-1.rds.amazonaws.com\n```\n\n**Solution**:\n- Upgrade RDS instance class\n- Increase ElastiCache size to reduce database queries\n- Check network bandwidth between providers\n\n### Issue: Queue Processing Slow\n\n**Diagnosis**:\n```\n# Check queue depth and age\naws sqs get-queue-attributes \\n --queue-url \\n --attribute-names All\n```\n\n**Solution**:\n- Scale up application servers processing queue\n- Reduce visibility timeout if messages are timing out\n- Check application logs for processing errors\n\n## Cleanup\n\n```\n# Hetzner\nhcloud server delete hetzner-app-1 hetzner-app-2 hetzner-app-3\nhcloud load-balancer delete app-lb\n\n# AWS\naws rds delete-db-instance --db-instance-identifier app-db --skip-final-snapshot\naws elasticache delete-cache-cluster --cache-cluster-id app-cache\naws sqs delete-queue --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue\n\n# DigitalOcean\ndoctl compute spaces delete app-content\ndoctl compute cdn delete cdn-app\ndoctl compute droplet delete edge-node-1 edge-node-2 edge-node-3\n```\n\n## Next Steps\n\n1. Implement application logging to CloudWatch\n2. Set up Hetzner monitoring dashboard\n3. Configure auto-scaling based on queue depth\n4. Implement database read replicas for read-heavy workloads\n5. Add WAF protection to Hetzner load balancer\n6. Implement cross-region backups to Spaces\n7. Set up cost anomaly detection alerts\n\n## Support\n\nFor issues or questions:\n\n- Review the cost-optimized deployment guide\n- Check provider-specific documentation\n- Monitor costs with: `aws ce get-cost-and-usage ...`\n- Review deployment logs: `./deploy.nu --debug`\n\n## Files\n\n- `workspace.ncl`: Infrastructure definition (Nickel)\n- `config.toml`: Provider credentials and settings\n- `deploy.nu`: Deployment orchestration (Nushell)\n- `README.md`: This file \ No newline at end of file +# Cost-Optimized Multi-Provider Workspace\n\nThis workspace demonstrates cost optimization through intelligent provider specialization:\n\n- **Hetzner**: Compute tier (CPX21 servers at €20.90/month) - best price/performance\n- **AWS**: Managed services (RDS, ElastiCache, SQS) - reliability without ops overhead\n- **DigitalOcean**: CDN and object storage - affordable content delivery\n\n## Why This Architecture?\n\n### Cost Comparison\n\n```\nCost-Optimized Architecture:\n├── Hetzner compute: €72.70/month (~$78)\n├── AWS managed services: $115/month\n└── DigitalOcean CDN: $64/month\nTotal: ~$280/month\n\nAll-AWS Equivalent:\n├── EC2 instances: ~$200+\n├── RDS database: ~$150+\n├── ElastiCache: ~$50+\n├── CloudFront CDN: ~$100+\n└── Other services: ~$50+\nTotal: ~$600+/month\n\nSavings: ~$320/month (53% reduction)\n```\n\n### Architecture Benefits\n\n**Hetzner Advantages**:\n- Best price/performance for compute (€20.90/month for 4 vCPU/8GB)\n- Powerful Load Balancer (€10/month)\n- Fast networking (10Gbps)\n- EU data residency (GDPR compliant)\n\n**AWS Advantages**:\n- Managed RDS: Automatic backups, failover, patching\n- ElastiCache: Redis cluster with automatic failover\n- SQS: Scalable message queue (pay per message)\n- CloudWatch: Comprehensive monitoring\n\n**DigitalOcean Advantages**:\n- CDN: Cost-effective content delivery ($25/month)\n- Spaces: Object storage at scale ($15/month)\n- Simple pricing and management\n- Edge nodes for regional distribution\n\n## Architecture Overview\n\n```\n┌────────────────────────────────────────────────┐\n│ Client Requests │\n└─────────────────┬────────────────────────────────┘\n │ HTTPS/HTTP\n ┌────────▼─────────┐\n │ DigitalOcean │\n │ CDN / Spaces │\n └────────┬─────────┘\n │\n ┌────────────┼────────────┐\n │ │ │\n┌────▼──────┐ ┌──▼────────┐ ┌─▼──────┐\n│ Hetzner │ │ AWS │ │ DO │\n│ Compute │ │ Managed │ │ CDN │\n│ (Load LB) │ │ Services │ │ │\n└────┬──────┘ └──┬────────┘ └────────┘\n │VPN Tunnel │\n┌────▼──────────▼────┐\n│ Hetzner Network │ AWS VPC DO Spaces\n│ 10.0.0.0/16 ◄──► 10.1.0.0/16 ◄──► nyc3\n│ 3x CPX21 Servers │ RDS + Cache CDN +\n│ │ + SQS Backups\n└────────────────────┘\n```\n\n## Prerequisites\n\n### 1. Cloud Accounts\n\n- **Hetzner**: Account with API token\n- **AWS**: Account with access keys\n- **DigitalOcean**: Account with API token\n\n### 2. Environment Variables\n\n```\nexport HCLOUD_TOKEN="MC4wNTI1YmE1M2E4YmE0YTQzMTQyZTdlODYy"\nexport AWS_ACCESS_KEY_ID="AKIA1234567890ABCDEF"\nexport AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG+j/zI0m1234567890ab"\nexport DIGITALOCEAN_TOKEN="dop_v1_abc123def456ghi789jkl012mno"\n```\n\n### 3. CLI Tools\n\n```\n# Install and verify\nwhich hcloud && hcloud version\nwhich aws && aws --version\nwhich doctl && doctl version\nwhich nickel && nickel --version\n```\n\n### 4. SSH Keys\n\n```\n# Hetzner\nhcloud ssh-key create --name provisioning-key \\n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# AWS\naws ec2 create-key-pair --key-name provisioning-key \\n --query 'KeyMaterial' --output text > provisioning-key.pem\nchmod 600 provisioning-key.pem\n\n# DigitalOcean\ndoctl compute ssh-key create provisioning-key \\n --public-key-from-file ~/.ssh/id_rsa.pub\n```\n\n## Deployment\n\n### Step 1: Configure the Workspace\n\nEdit `workspace.ncl`:\n\n```\n# Update networking if needed\ncompute_tier.primary_servers = hetzner.Server & {\n server_type = "cpx21",\n count = 3,\n location = "nbg1"\n}\n\n# Update AWS region if needed\nmanaged_services.database = aws.RDS & {\n instance_class = "db.t3.small",\n region = "us-east-1"\n}\n\n# Update CDN endpoints\ncdn_tier.cdn.endpoints = [{\n name = "app-cdn",\n origin = "content.example.com"\n}]\n```\n\nEdit `config.toml`:\n\n```\n[cost_tracking]\nmonthly_budget = 300\nbudget_alert_threshold = 280\n\n[application.cache]\nmax_memory = "250MB"\n```\n\n### Step 2: Validate Configuration\n\n```\n# Validate Nickel syntax\nnickel export workspace.ncl | jq . > /dev/null\n\n# Verify provider access\nhcloud context use default\naws sts get-caller-identity\ndoctl account get\n```\n\n### Step 3: Deploy\n\n```\nchmod +x deploy.nu\n./deploy.nu\n\n# Or with debug output\n./deploy.nu --debug\n```\n\n### Step 4: Verify Deployment\n\n```\n# Hetzner compute resources\nhcloud server list\nhcloud load-balancer list\n\n# AWS managed services\naws rds describe-db-instances --region us-east-1\naws elasticache describe-cache-clusters --region us-east-1\naws sqs list-queues --region us-east-1\n\n# DigitalOcean CDN\ndoctl compute cdn list\ndoctl compute spaces list\n```\n\n## Post-Deployment Configuration\n\n### 1. Connect Hetzner Compute to AWS Database\n\n```\n# Get Hetzner server IPs\nhcloud server list --format ID,PublicIPv4\n\n# Get RDS endpoint\naws rds describe-db-instances --region us-east-1 \\n --query 'DBInstances[0].Endpoint.Address'\n\n# On Hetzner server, install PostgreSQL client\nssh root@hetzner-server\napt-get update && apt-get install postgresql-client\n\n# Test connection to RDS\npsql -h app-db.abc123.us-east-1.rds.amazonaws.com \\n -U admin -d postgres -c "SELECT now();"\n```\n\n### 2. Configure Application for Services\n\n```\n# Application configuration file\ncat > /var/www/app/.env << EOF\nDATABASE_HOST=app-db.abc123.us-east-1.rds.amazonaws.com\nDATABASE_PORT=5432\nDATABASE_USER=admin\nDATABASE_PASSWORD=your_password\nDATABASE_NAME=app_db\n\nREDIS_HOST=app-cache.abc123.ng.0001.euc1.cache.amazonaws.com\nREDIS_PORT=6379\n\nSQS_QUEUE_URL=https://sqs.us-east-1.amazonaws.com/123456789/app-queue\n\nCDN_ENDPOINT=https://content.example.com\nSPACES_ENDPOINT=https://app-content.nyc3.digitaloceanspaces.com\nSPACES_KEY=your_spaces_key\nSPACES_SECRET=your_spaces_secret\n\nENVIRONMENT=production\nEOF\n```\n\n### 3. Setup CDN and Object Storage\n\n```\n# Configure Spaces bucket\ndoctl compute spaces create app-content --region nyc3\n\n# Get Spaces endpoint\ndoctl compute spaces list\n\n# Configure CDN endpoint\ndoctl compute cdn create --origin content.example.com\n\n# Upload test file\naws s3 cp test.html s3://app-content/\n```\n\n### 4. Configure Application Queue\n\n```\n# Get SQS queue URL\naws sqs list-queues --region us-east-1\n\n# Create queue if needed\naws sqs create-queue --queue-name app-queue --region us-east-1\n\n# Test queue\naws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue \\n --message-body "test message" --region us-east-1\n```\n\n### 5. Deploy Application\n\nSSH to Hetzner servers:\n\n```\n# Get server IPs\nSERVERS=$(hcloud server list --format PublicIPv4 --no-header)\n\n# Deploy to each server\nfor server in $SERVERS; do\n ssh -o StrictHostKeyChecking=no root@$server << 'DEPLOY'\n cd /var/www\n git clone https://github.com/your-org/app.git\n cd app\n cp .env.example .env\n ./deploy.sh\n DEPLOY\ndone\n```\n\n## Monitoring and Cost Control\n\n### Cost Monitoring\n\n```\n# Hetzner billing\n# Manual via console: https://console.hetzner.cloud/billing\n\n# AWS cost tracking\naws ce get-cost-and-usage \\n --time-period Start=2024-01-01,End=2024-01-31 \\n --granularity MONTHLY \\n --metrics BlendedCost \\n --group-by Type=DIMENSION,Key=SERVICE\n\n# DigitalOcean billing\ndoctl billing get\n\n# Real-time cost status\naws ce get-cost-and-usage \\n --time-period Start=$(date -d '1 day ago' +%Y-%m-%d),End=$(date +%Y-%m-%d) \\n --granularity DAILY \\n --metrics BlendedCost\n```\n\n### Application Performance Monitoring\n\n```\n# RDS performance insights\naws pi describe-dimension-keys \\n --service-type RDS \\n --identifier arn:aws:rds:us-east-1:123456789:db:app-db \\n --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \\n --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \\n --period-in-seconds 60 \\n --metric db.load.avg \\n --partition-by Dimension \\n --dimension-group.group-by WAIT_EVENT\n\n# ElastiCache monitoring\naws cloudwatch get-metric-statistics \\n --namespace AWS/ElastiCache \\n --metric-name CPUUtilization \\n --dimensions Name=CacheClusterId,Value=app-cache \\n --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \\n --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \\n --period 300 \\n --statistics Average\n\n# SQS monitoring\naws sqs get-queue-attributes \\n --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue \\n --attribute-names All\n```\n\n### Alerts Configuration\n\n```\n# CPU threshold alert\naws cloudwatch put-metric-alarm \\n --alarm-name hetzner-cpu-high \\n --alarm-description "Alert when Hetzner CPU > 80%" \\n --metric-name CPUUtilization \\n --threshold 80 \\n --comparison-operator GreaterThanThreshold\n\n# Queue depth alert\naws cloudwatch put-metric-alarm \\n --alarm-name sqs-queue-depth-high \\n --alarm-description "Alert when SQS queue depth > 1000" \\n --metric-name ApproximateNumberOfMessagesVisible \\n --threshold 1000 \\n --comparison-operator GreaterThanThreshold\n\n# Cache eviction alert\naws cloudwatch put-metric-alarm \\n --alarm-name elasticache-eviction-rate-high \\n --alarm-description "Alert when cache eviction rate > 10%" \\n --metric-name EvictionRate \\n --namespace AWS/ElastiCache \\n --threshold 10 \\n --comparison-operator GreaterThanThreshold\n```\n\n## Scaling and Optimization\n\n### Scale Hetzner Compute\n\nEdit `workspace.ncl`:\n\n```\ncompute_tier.primary_servers = hetzner.Server & {\n count = 5, # Increase from 3\n server_type = "cpx21"\n}\n```\n\nRedeploy:\n\n```\n./deploy.nu\n```\n\n### Upgrade Database\n\n```\n# Modify RDS instance class\naws rds modify-db-instance \\n --db-instance-identifier app-db \\n --db-instance-class db.t3.medium \\n --apply-immediately \\n --region us-east-1\n```\n\n### Add Caching Layer\n\nAlready configured with ElastiCache. Optimize by adjusting:\n\n```\n[application.cache]\nmax_memory = "512MB"\neviction_policy = "allkeys-lru"\n```\n\n### Increase Queue Throughput\n\nSQS automatically scales. Monitor with:\n\n```\naws sqs get-queue-attributes \\n --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue \\n --attribute-names ApproximateNumberOfMessages\n```\n\n## Cost Optimization Tips\n\n1. **Hetzner Compute**: CPX21 is sweet spot. Consider CX21 for lower workloads\n2. **AWS RDS**: Use t3.small for dev, t3.medium for prod with burst capability\n3. **ElastiCache**: 2 nodes with auto-failover. Monitor eviction rates\n4. **SQS**: Pay per request, no fixed costs. Good for variable load\n5. **DigitalOcean CDN**: Cache more aggressively (86400s TTL for assets)\n6. **Spaces**: Use lifecycle rules to delete old files automatically\n\n### Cost Reduction Checklist\n\n- Reduce Hetzner servers from 3 to 2 (saves ~€21/month)\n- Downgrade RDS to db.t3.micro for dev (saves ~$40/month)\n- Reduce ElastiCache nodes from 2 to 1 (saves ~$12/month)\n- Archive old CDN content (savings from Spaces storage)\n- Use reserved capacity on AWS (20-30% discount)\n\nPotential total savings: ~$100+/month with right-sizing.\n\n## Troubleshooting\n\n### Issue: Hetzner Can't Connect to RDS\n\n**Diagnosis**:\n```\n# SSH to Hetzner server\nssh root@hetzner-server\n\n# Test connectivity\nnc -zv app-db.abc123.us-east-1.rds.amazonaws.com 5432\n```\n\n**Solution**:\n- Check VPN tunnel is active\n- Verify RDS security group allows port 5432 from Hetzner network\n- Check route table on both sides\n\n### Issue: High Database Latency\n\n**Diagnosis**:\n```\n# Check RDS performance\naws pi describe-dimension-keys --service-type RDS ...\n\n# Check network latency\nping -c 5 app-db.abc123.us-east-1.rds.amazonaws.com\n```\n\n**Solution**:\n- Upgrade RDS instance class\n- Increase ElastiCache size to reduce database queries\n- Check network bandwidth between providers\n\n### Issue: Queue Processing Slow\n\n**Diagnosis**:\n```\n# Check queue depth and age\naws sqs get-queue-attributes \\n --queue-url \\n --attribute-names All\n```\n\n**Solution**:\n- Scale up application servers processing queue\n- Reduce visibility timeout if messages are timing out\n- Check application logs for processing errors\n\n## Cleanup\n\n```\n# Hetzner\nhcloud server delete hetzner-app-1 hetzner-app-2 hetzner-app-3\nhcloud load-balancer delete app-lb\n\n# AWS\naws rds delete-db-instance --db-instance-identifier app-db --skip-final-snapshot\naws elasticache delete-cache-cluster --cache-cluster-id app-cache\naws sqs delete-queue --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue\n\n# DigitalOcean\ndoctl compute spaces delete app-content\ndoctl compute cdn delete cdn-app\ndoctl compute droplet delete edge-node-1 edge-node-2 edge-node-3\n```\n\n## Next Steps\n\n1. Implement application logging to CloudWatch\n2. Set up Hetzner monitoring dashboard\n3. Configure auto-scaling based on queue depth\n4. Implement database read replicas for read-heavy workloads\n5. Add WAF protection to Hetzner load balancer\n6. Implement cross-region backups to Spaces\n7. Set up cost anomaly detection alerts\n\n## Support\n\nFor issues or questions:\n\n- Review the cost-optimized deployment guide\n- Check provider-specific documentation\n- Monitor costs with: `aws ce get-cost-and-usage ...`\n- Review deployment logs: `./deploy.nu --debug`\n\n## Files\n\n- `workspace.ncl`: Infrastructure definition (Nickel)\n- `config.toml`: Provider credentials and settings\n- `deploy.nu`: Deployment orchestration (Nushell)\n- `README.md`: This file diff --git a/examples/workspaces/multi-provider-web-app/README.md b/examples/workspaces/multi-provider-web-app/README.md index e0578e6..9a9b577 100644 --- a/examples/workspaces/multi-provider-web-app/README.md +++ b/examples/workspaces/multi-provider-web-app/README.md @@ -1 +1 @@ -# Multi-Provider Web App Workspace\n\nThis workspace demonstrates a production-ready web application deployment spanning three cloud providers:\n\n- **DigitalOcean**: Web servers and load balancing (NYC region)\n- **AWS**: Managed PostgreSQL database with high availability (US-East region)\n- **Hetzner**: Backup storage and disaster recovery (Germany region)\n\n## Why Three Providers?\n\nThis architecture optimizes cost, performance, and reliability:\n\n- **DigitalOcean** (~$77/month): Cost-effective compute with simple management\n- **AWS RDS** (~$75/month): Managed database with automatic failover\n- **Hetzner** (~$13/month): Affordable backup storage\n- **Total**: ~$165/month (vs $300+ for equivalent all-cloud setup)\n\n## Architecture Overview\n\n```\n┌─────────────────────────────────────────────┐\n│ Client Requests │\n└──────────────┬──────────────────────────────┘\n │ HTTPS/HTTP\n ┌───────▼─────────┐\n │ DigitalOcean LB │\n └───────┬─────────┘\n ┌────────┼────────┐\n │ │ │\n ┌─▼──┐ ┌─▼──┐ ┌─▼──┐\n │Web │ │Web │ │Web │ (DigitalOcean Droplets)\n │ 1 │ │ 2 │ │ 3 │\n └──┬─┘ └──┬─┘ └──┬─┘\n │ │ │\n └───────┼───────┘\n │ VPN Tunnel\n ┌───────▼────────────┐\n │ AWS RDS (PG) │ (us-east-1)\n │ Multi-AZ Cluster │\n └────────┬───────────┘\n │ Replication\n ┌──────▼──────────┐\n │ Hetzner Volume │ (nbg1 - Germany)\n │ Backups │\n └─────────────────┘\n```\n\n## Prerequisites\n\n### 1. Cloud Accounts\n\n- **DigitalOcean**: Account with API token\n- **AWS**: Account with access keys\n- **Hetzner**: Account with API token\n\n### 2. Environment Variables\n\nSet these before deployment:\n\n```\nexport DIGITALOCEAN_TOKEN="dop_v1_abc123def456ghi789jkl012mno"\nexport AWS_ACCESS_KEY_ID="AKIA1234567890ABCDEF"\nexport AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG+j/zI0m1234567890ab"\nexport HCLOUD_TOKEN="MC4wNTI1YmE1M2E4YmE0YTQzMTQyZTdlODYy"\n```\n\n### 3. SSH Key Setup\n\n#### DigitalOcean\n```\n# Upload your SSH public key\ndoctl compute ssh-key create provisioning-key \\n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# Note the key ID for workspace.ncl\ndoctl compute ssh-key list\n```\n\n#### AWS\n```\n# Create EC2 key pair (if needed)\naws ec2 create-key-pair --key-name provisioning-key \\n --query 'KeyMaterial' --output text > provisioning-key.pem\nchmod 600 provisioning-key.pem\n```\n\n#### Hetzner\n```\n# Upload SSH key\nhcloud ssh-key create --name provisioning-key \\n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# List keys\nhcloud ssh-key list\n```\n\n### 4. DNS Setup\n\nUpdate `workspace.ncl` with your domain:\n- Replace `your-certificate-id` with actual AWS certificate ID\n- Update load balancer CNAME to point to your domain\n\n## Deployment\n\n### Step 1: Configure the Workspace\n\nEdit `workspace.ncl` to:\n- Set your SSH key IDs\n- Update certificate ID for HTTPS\n- Set domain names\n- Adjust instance counts if needed\n\nEdit `config.toml` to:\n- Set correct environment variable names\n- Adjust thresholds and settings\n\n### Step 2: Validate Configuration\n\n```\n# Validate Nickel syntax\nnickel export workspace.ncl | jq .\n\n# Validate provider credentials\nprovisioning provider verify digitalocean\nprovisioning provider verify aws\nprovisioning provider verify hetzner\n```\n\n### Step 3: Deploy\n\n```\n# Using provided deploy script\n./deploy.nu\n\n# Or manually via provisioning CLI\nprovisioning workspace deploy --config config.toml\n```\n\n### Step 4: Verify Deployment\n\n```\n# List resources per provider\ndoctl compute droplet list\naws rds describe-db-instances\nhcloud volume list\n\n# Test load balancer\ncurl http://your-domain.com/health\n```\n\n## Post-Deployment Configuration\n\n### 1. Application Deployment\n\nSSH into web servers and deploy application:\n\n```\n# Get web server IPs\ndoctl compute droplet list --format Name,PublicIPv4\n\n# SSH to first server\nssh root@198.51.100.15\n\n# Deploy application\ncd /var/www\ngit clone https://github.com/your-org/web-app.git\ncd web-app\n./deploy.sh\n```\n\n### 2. Database Configuration\n\nConnect to RDS database and initialize schema:\n\n```\n# Get RDS endpoint\naws rds describe-db-instances --query 'DBInstances[0].Endpoint.Address'\n\n# Connect and initialize\npsql -h webapp-db.c9akciq32.us-east-1.rds.amazonaws.com -U admin -d defaultdb < schema.sql\n```\n\n### 3. DNS Configuration\n\nPoint your domain to the load balancer:\n\n```\n# Get load balancer IP\ndoctl compute load-balancer list\n\n# Update DNS CNAME\n# Add CNAME record: app.example.com -> lb-123456789.nyc3.digitalocean.com\n```\n\n### 4. SSL/TLS Certificate\n\nUse AWS Certificate Manager:\n\n```\n# Request certificate\naws acm request-certificate \\n --domain-name app.example.com \\n --validation-method DNS\n\n# Validate and get certificate ID\naws acm list-certificates | grep app.example.com\n\n# Update workspace.ncl with certificate ID\n```\n\n## Monitoring\n\n### DigitalOcean Monitoring\n\n- CPU usage tracked per droplet\n- Memory usage alerts on Droplet greater than 85%\n- Disk space alerts on greater than 90% full\n\n### AWS CloudWatch\n\n- RDS database metrics (CPU, connections, disk)\n- Automatic failover notifications\n- Slow query logging\n\n### Hetzner Monitoring\n\n- Volume usage tracking\n- Manual monitoring script via cron\n\n### Application Monitoring\n\nImplement application-level monitoring:\n\n```\n# SSH to web server\nssh root@198.51.100.15\n\n# Check app logs\ntail -f /var/www/app/logs/application.log\n\n# Monitor system resources\ntop\niostat -x 1\n\n# Check database connection pool\npsql -h webapp-db.c9akciq32.us-east-1.rds.amazonaws.com -c "SELECT count(plus) FROM pg_stat_activity;"\n```\n\n## Backup and Recovery\n\n### Automated Backups\n\n- **RDS**: Daily backups retained for 30 days (AWS handles)\n- **Application Data**: Weekly backups to Hetzner volume\n- **Configuration**: Version control via Git\n\n### Manual Backup\n\n```\n# Backup RDS to Hetzner volume\nssh hetzner-backup-volume\n\n# Mount Hetzner volume (if not mounted)\nsudo mount /dev/sdb /mnt/backups\n\n# Backup RDS database\npg_dump -h webapp-db.c9akciq32.us-east-1.rds.amazonaws.com -U admin -d defaultdb | \\n gzip > /mnt/backups/db-$(date +%Y%m%d).sql.gz\n```\n\n### Recovery Procedure\n\n1. **Web Server Failure**: Load balancer automatically redirects to healthy server\n2. **Database Failure**: RDS Multi-AZ automatic failover\n3. **Complete Failure**: Restore from Hetzner backup volume\n\n## Scaling\n\n### Add More Web Servers\n\nEdit `workspace.ncl`:\n\n```\ndroplets = digitalocean.Droplet & {\n name = "web-server",\n region = "nyc3",\n size = "s-2vcpu-4gb",\n count = 5\n}\n```\n\nRedeploy:\n\n```\n./deploy.nu\n```\n\n### Upgrade Database\n\nEdit `workspace.ncl`:\n\n```\ndatabase_tier = aws.RDS & {\n identifier = "webapp-db",\n instance_class = "db.t3.large"\n}\n```\n\nRedeploy with minimal downtime (Multi-AZ handles switchover).\n\n## Cost Optimization\n\n### Reduce Costs\n\n1. **Droplets**: Use smaller size or fewer instances\n2. **Database**: Switch to smaller db.t3.small (approximately $30/month)\n3. **Storage**: Reduce backup volume size\n4. **Data Transfer**: Monitor and optimize outbound traffic\n\n### Monitor Costs\n\n```\n# DigitalOcean estimated bill\ndoctl billing get\n\n# AWS Cost Explorer\naws ce get-cost-and-usage --time-period Start=2024-01-01,End=2024-01-31\n\n# Hetzner manual tracking via console\n# Navigate to https://console.hetzner.cloud/billing\n```\n\n## Troubleshooting\n\n### Issue: Web Servers Unreachable\n\n**Diagnosis**:\n```\ndoctl compute droplet list\ndoctl compute firewall list-rules firewall-id\n```\n\n**Solution**:\n- Check firewall allows ports 80, 443\n- Verify droplets have public IPs\n- Check web server application status\n\n### Issue: Database Connection Failure\n\n**Diagnosis**:\n```\naws rds describe-db-instances\naws security-group describe-security-groups\n```\n\n**Solution**:\n- Verify RDS security group allows port 5432 from web servers\n- Check RDS status is "available"\n- Verify connection string in application\n\n### Issue: Backup Volume Not Mounted\n\n**Diagnosis**:\n```\nhcloud volume list\nssh hetzner-volume\nlsblk\n```\n\n**Solution**:\n```\nsudo mkfs.ext4 /dev/sdb\nsudo mount /dev/sdb /mnt/backups\necho '/dev/sdb /mnt/backups ext4 defaults,nofail 0 0' | sudo tee -a /etc/fstab\n```\n\n## Cleanup\n\nTo destroy all resources:\n\n```\n# This will delete everything - use carefully\nprovisioning workspace destroy --config config.toml\n\n# Or manually\ndoctl compute droplet delete web-server-1 web-server-2 web-server-3\ndoctl compute load-balancer delete web-lb\naws rds delete-db-instance --db-instance-identifier webapp-db --skip-final-snapshot\nhcloud volume delete webapp-backups\n```\n\n## Next Steps\n\n1. **SSL/TLS**: Update certificate and enable HTTPS\n2. **Auto-scaling**: Add DigitalOcean autoscaling based on load\n3. **Multi-region**: Add additional AWS RDS read replicas in other regions\n4. **Disaster Recovery**: Test failover procedures\n5. **Cost Optimization**: Review and optimize resource sizes\n\n## Support\n\nFor issues or questions:\n\n- Review the multi-provider deployment guide\n- Check provider-specific documentation\n- Review workspace logs with debug flag: ./deploy.nu --debug\n\n## Files\n\n- `workspace.ncl`: Infrastructure definition (Nickel)\n- `config.toml`: Provider credentials and settings\n- `deploy.nu`: Deployment automation script (Nushell)\n- `README.md`: This file \ No newline at end of file +# Multi-Provider Web App Workspace\n\nThis workspace demonstrates a production-ready web application deployment spanning three cloud providers:\n\n- **DigitalOcean**: Web servers and load balancing (NYC region)\n- **AWS**: Managed PostgreSQL database with high availability (US-East region)\n- **Hetzner**: Backup storage and disaster recovery (Germany region)\n\n## Why Three Providers?\n\nThis architecture optimizes cost, performance, and reliability:\n\n- **DigitalOcean** (~$77/month): Cost-effective compute with simple management\n- **AWS RDS** (~$75/month): Managed database with automatic failover\n- **Hetzner** (~$13/month): Affordable backup storage\n- **Total**: ~$165/month (vs $300+ for equivalent all-cloud setup)\n\n## Architecture Overview\n\n```\n┌─────────────────────────────────────────────┐\n│ Client Requests │\n└──────────────┬──────────────────────────────┘\n │ HTTPS/HTTP\n ┌───────▼─────────┐\n │ DigitalOcean LB │\n └───────┬─────────┘\n ┌────────┼────────┐\n │ │ │\n ┌─▼──┐ ┌─▼──┐ ┌─▼──┐\n │Web │ │Web │ │Web │ (DigitalOcean Droplets)\n │ 1 │ │ 2 │ │ 3 │\n └──┬─┘ └──┬─┘ └──┬─┘\n │ │ │\n └───────┼───────┘\n │ VPN Tunnel\n ┌───────▼────────────┐\n │ AWS RDS (PG) │ (us-east-1)\n │ Multi-AZ Cluster │\n └────────┬───────────┘\n │ Replication\n ┌──────▼──────────┐\n │ Hetzner Volume │ (nbg1 - Germany)\n │ Backups │\n └─────────────────┘\n```\n\n## Prerequisites\n\n### 1. Cloud Accounts\n\n- **DigitalOcean**: Account with API token\n- **AWS**: Account with access keys\n- **Hetzner**: Account with API token\n\n### 2. Environment Variables\n\nSet these before deployment:\n\n```\nexport DIGITALOCEAN_TOKEN="dop_v1_abc123def456ghi789jkl012mno"\nexport AWS_ACCESS_KEY_ID="AKIA1234567890ABCDEF"\nexport AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG+j/zI0m1234567890ab"\nexport HCLOUD_TOKEN="MC4wNTI1YmE1M2E4YmE0YTQzMTQyZTdlODYy"\n```\n\n### 3. SSH Key Setup\n\n#### DigitalOcean\n```\n# Upload your SSH public key\ndoctl compute ssh-key create provisioning-key \\n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# Note the key ID for workspace.ncl\ndoctl compute ssh-key list\n```\n\n#### AWS\n```\n# Create EC2 key pair (if needed)\naws ec2 create-key-pair --key-name provisioning-key \\n --query 'KeyMaterial' --output text > provisioning-key.pem\nchmod 600 provisioning-key.pem\n```\n\n#### Hetzner\n```\n# Upload SSH key\nhcloud ssh-key create --name provisioning-key \\n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# List keys\nhcloud ssh-key list\n```\n\n### 4. DNS Setup\n\nUpdate `workspace.ncl` with your domain:\n- Replace `your-certificate-id` with actual AWS certificate ID\n- Update load balancer CNAME to point to your domain\n\n## Deployment\n\n### Step 1: Configure the Workspace\n\nEdit `workspace.ncl` to:\n- Set your SSH key IDs\n- Update certificate ID for HTTPS\n- Set domain names\n- Adjust instance counts if needed\n\nEdit `config.toml` to:\n- Set correct environment variable names\n- Adjust thresholds and settings\n\n### Step 2: Validate Configuration\n\n```\n# Validate Nickel syntax\nnickel export workspace.ncl | jq .\n\n# Validate provider credentials\nprovisioning provider verify digitalocean\nprovisioning provider verify aws\nprovisioning provider verify hetzner\n```\n\n### Step 3: Deploy\n\n```\n# Using provided deploy script\n./deploy.nu\n\n# Or manually via provisioning CLI\nprovisioning workspace deploy --config config.toml\n```\n\n### Step 4: Verify Deployment\n\n```\n# List resources per provider\ndoctl compute droplet list\naws rds describe-db-instances\nhcloud volume list\n\n# Test load balancer\ncurl http://your-domain.com/health\n```\n\n## Post-Deployment Configuration\n\n### 1. Application Deployment\n\nSSH into web servers and deploy application:\n\n```\n# Get web server IPs\ndoctl compute droplet list --format Name,PublicIPv4\n\n# SSH to first server\nssh root@198.51.100.15\n\n# Deploy application\ncd /var/www\ngit clone https://github.com/your-org/web-app.git\ncd web-app\n./deploy.sh\n```\n\n### 2. Database Configuration\n\nConnect to RDS database and initialize schema:\n\n```\n# Get RDS endpoint\naws rds describe-db-instances --query 'DBInstances[0].Endpoint.Address'\n\n# Connect and initialize\npsql -h webapp-db.c9akciq32.us-east-1.rds.amazonaws.com -U admin -d defaultdb < schema.sql\n```\n\n### 3. DNS Configuration\n\nPoint your domain to the load balancer:\n\n```\n# Get load balancer IP\ndoctl compute load-balancer list\n\n# Update DNS CNAME\n# Add CNAME record: app.example.com -> lb-123456789.nyc3.digitalocean.com\n```\n\n### 4. SSL/TLS Certificate\n\nUse AWS Certificate Manager:\n\n```\n# Request certificate\naws acm request-certificate \\n --domain-name app.example.com \\n --validation-method DNS\n\n# Validate and get certificate ID\naws acm list-certificates | grep app.example.com\n\n# Update workspace.ncl with certificate ID\n```\n\n## Monitoring\n\n### DigitalOcean Monitoring\n\n- CPU usage tracked per droplet\n- Memory usage alerts on Droplet greater than 85%\n- Disk space alerts on greater than 90% full\n\n### AWS CloudWatch\n\n- RDS database metrics (CPU, connections, disk)\n- Automatic failover notifications\n- Slow query logging\n\n### Hetzner Monitoring\n\n- Volume usage tracking\n- Manual monitoring script via cron\n\n### Application Monitoring\n\nImplement application-level monitoring:\n\n```\n# SSH to web server\nssh root@198.51.100.15\n\n# Check app logs\ntail -f /var/www/app/logs/application.log\n\n# Monitor system resources\ntop\niostat -x 1\n\n# Check database connection pool\npsql -h webapp-db.c9akciq32.us-east-1.rds.amazonaws.com -c "SELECT count(plus) FROM pg_stat_activity;"\n```\n\n## Backup and Recovery\n\n### Automated Backups\n\n- **RDS**: Daily backups retained for 30 days (AWS handles)\n- **Application Data**: Weekly backups to Hetzner volume\n- **Configuration**: Version control via Git\n\n### Manual Backup\n\n```\n# Backup RDS to Hetzner volume\nssh hetzner-backup-volume\n\n# Mount Hetzner volume (if not mounted)\nsudo mount /dev/sdb /mnt/backups\n\n# Backup RDS database\npg_dump -h webapp-db.c9akciq32.us-east-1.rds.amazonaws.com -U admin -d defaultdb | \\n gzip > /mnt/backups/db-$(date +%Y%m%d).sql.gz\n```\n\n### Recovery Procedure\n\n1. **Web Server Failure**: Load balancer automatically redirects to healthy server\n2. **Database Failure**: RDS Multi-AZ automatic failover\n3. **Complete Failure**: Restore from Hetzner backup volume\n\n## Scaling\n\n### Add More Web Servers\n\nEdit `workspace.ncl`:\n\n```\ndroplets = digitalocean.Droplet & {\n name = "web-server",\n region = "nyc3",\n size = "s-2vcpu-4gb",\n count = 5\n}\n```\n\nRedeploy:\n\n```\n./deploy.nu\n```\n\n### Upgrade Database\n\nEdit `workspace.ncl`:\n\n```\ndatabase_tier = aws.RDS & {\n identifier = "webapp-db",\n instance_class = "db.t3.large"\n}\n```\n\nRedeploy with minimal downtime (Multi-AZ handles switchover).\n\n## Cost Optimization\n\n### Reduce Costs\n\n1. **Droplets**: Use smaller size or fewer instances\n2. **Database**: Switch to smaller db.t3.small (approximately $30/month)\n3. **Storage**: Reduce backup volume size\n4. **Data Transfer**: Monitor and optimize outbound traffic\n\n### Monitor Costs\n\n```\n# DigitalOcean estimated bill\ndoctl billing get\n\n# AWS Cost Explorer\naws ce get-cost-and-usage --time-period Start=2024-01-01,End=2024-01-31\n\n# Hetzner manual tracking via console\n# Navigate to https://console.hetzner.cloud/billing\n```\n\n## Troubleshooting\n\n### Issue: Web Servers Unreachable\n\n**Diagnosis**:\n```\ndoctl compute droplet list\ndoctl compute firewall list-rules firewall-id\n```\n\n**Solution**:\n- Check firewall allows ports 80, 443\n- Verify droplets have public IPs\n- Check web server application status\n\n### Issue: Database Connection Failure\n\n**Diagnosis**:\n```\naws rds describe-db-instances\naws security-group describe-security-groups\n```\n\n**Solution**:\n- Verify RDS security group allows port 5432 from web servers\n- Check RDS status is "available"\n- Verify connection string in application\n\n### Issue: Backup Volume Not Mounted\n\n**Diagnosis**:\n```\nhcloud volume list\nssh hetzner-volume\nlsblk\n```\n\n**Solution**:\n```\nsudo mkfs.ext4 /dev/sdb\nsudo mount /dev/sdb /mnt/backups\necho '/dev/sdb /mnt/backups ext4 defaults,nofail 0 0' | sudo tee -a /etc/fstab\n```\n\n## Cleanup\n\nTo destroy all resources:\n\n```\n# This will delete everything - use carefully\nprovisioning workspace destroy --config config.toml\n\n# Or manually\ndoctl compute droplet delete web-server-1 web-server-2 web-server-3\ndoctl compute load-balancer delete web-lb\naws rds delete-db-instance --db-instance-identifier webapp-db --skip-final-snapshot\nhcloud volume delete webapp-backups\n```\n\n## Next Steps\n\n1. **SSL/TLS**: Update certificate and enable HTTPS\n2. **Auto-scaling**: Add DigitalOcean autoscaling based on load\n3. **Multi-region**: Add additional AWS RDS read replicas in other regions\n4. **Disaster Recovery**: Test failover procedures\n5. **Cost Optimization**: Review and optimize resource sizes\n\n## Support\n\nFor issues or questions:\n\n- Review the multi-provider deployment guide\n- Check provider-specific documentation\n- Review workspace logs with debug flag: ./deploy.nu --debug\n\n## Files\n\n- `workspace.ncl`: Infrastructure definition (Nickel)\n- `config.toml`: Provider credentials and settings\n- `deploy.nu`: Deployment automation script (Nushell)\n- `README.md`: This file diff --git a/examples/workspaces/multi-region-ha/README.md b/examples/workspaces/multi-region-ha/README.md index edb4a78..d4a9b56 100644 --- a/examples/workspaces/multi-region-ha/README.md +++ b/examples/workspaces/multi-region-ha/README.md @@ -1 +1 @@ -# Multi-Region High Availability Workspace\n\nThis workspace demonstrates a production-ready global high availability deployment spanning three cloud providers across three geographic regions:\n\n- **US East (DigitalOcean NYC)**: Primary region - active serving, primary database\n- **EU Central (Hetzner Germany)**: Secondary region - active serving, read replicas\n- **Asia Pacific (AWS Singapore)**: Tertiary region - active serving, read replicas\n\n## Why Multi-Region High Availability?\n\n### Business Benefits\n\n- **99.99% Uptime**: Automatic failover across regions\n- **Low Latency**: Users served from geographically closest region\n- **Compliance**: Data residency in specific regions (GDPR for EU)\n- **Disaster Recovery**: Complete regional failure tolerance\n\n### Technical Benefits\n\n- **Load Distribution**: Traffic spread across 3 regions\n- **Cost Optimization**: Pay only for actual usage (~$311/month)\n- **Provider Diversity**: Reduces vendor lock-in risk\n- **Capacity Planning**: Scale independently per region\n\n## Architecture Overview\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│ Global Route53 DNS │\n│ Geographic Routing + Health Checks │\n└────────────────────┬────────────────────────────────────────────┘\n │\n ┌───────────┼───────────┐\n │ │ │\n ┌────▼─────┐ ┌──▼────────┐ ┌▼──────────┐\n │ US │ │ EU │ │ APAC │\n │ Primary │ │ Secondary │ │ Tertiary │\n └────┬─────┘ └──┬────────┘ └▼──────────┘\n │ │ │\n ┌────▼──────────▼───────────▼────┐\n │ Multi-Master Database │\n │ Replication (300s lag) │\n └────────────────────────────────┘\n │ │ │\n ┌────▼────┐ ┌──▼─────┐ ┌──▼────┐\n │DO Droplets Hetzner AWS\n │ 3 x nyc3 3 x nbg1 3 x sgp1\n │ │ │ │\n │ Load Balancer (per region)\n │ │ │ │\n └─────────┼─────────┼─────────┘\n │VPN Tunnels (IPSec)│\n └───────────────────┘\n```\n\n### Regional Components\n\n#### US East (DigitalOcean) - Primary\n\n```\nRegion: nyc3 (New York)\nCompute: 3x Droplets (s-2vcpu-4gb)\nLoad Balancer: Round-robin with health checks\nDatabase: PostgreSQL (3-node cluster, Multi-AZ)\nNetwork: VPC 10.0.0.0/16\nCost: ~$102/month\n```\n\n#### EU Central (Hetzner) - Secondary\n\n```\nRegion: nbg1 (Nuremberg, Germany)\nCompute: 3x CPX21 servers (4 vCPU, 8GB RAM)\nLoad Balancer: Hetzner Load Balancer\nDatabase: Read-only replica (lag: 300s)\nNetwork: vSwitch 10.1.0.0/16\nCost: ~$79/month (€72.70)\n```\n\n#### Asia Pacific (AWS) - Tertiary\n\n```\nRegion: ap-southeast-1 (Singapore)\nCompute: 3x EC2 t3.medium instances\nLoad Balancer: Application Load Balancer (ALB)\nDatabase: RDS read-only replica (lag: 300s)\nNetwork: VPC 10.2.0.0/16\nCost: ~$130/month\n```\n\n## Prerequisites\n\n### 1. Cloud Accounts & Credentials\n\n#### DigitalOcean\n```\n# Create API token\n# Dashboard → API → Tokens/Keys → Generate New Token\n# Scopes: read, write\n\nexport DIGITALOCEAN_TOKEN="dop_v1_abc123def456ghi789jkl012mno"\n```\n\n#### Hetzner\n```\n# Create API token\n# Dashboard → Security → API Tokens → Generate Token\n\nexport HCLOUD_TOKEN="MC4wNTI1YmE1M2E4YmE0YTQzMTQyZTdlODYy"\n```\n\n#### AWS\n```\n# Create IAM user with programmatic access\n# IAM → Users → Add User → Check "Programmatic access"\n# Attach policies: AmazonEC2FullAccess, AmazonRDSFullAccess, Route53FullAccess\n\nexport AWS_ACCESS_KEY_ID="AKIA1234567890ABCDEF"\nexport AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG+j/zI0m1234567890ab"\n```\n\n### 2. CLI Tools\n\n```\n# Verify all CLIs are installed\nwhich doctl\nwhich hcloud\nwhich aws\nwhich nickel\n\n# Versions\ndoctl version # >= 1.94.0\nhcloud version # >= 1.35.0\naws --version # >= 2.0\nnickel --version # >= 1.0\n```\n\n### 3. SSH Keys\n\n#### DigitalOcean\n```\n# Upload SSH key\ndoctl compute ssh-key create provisioning-key \\n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# Note the key ID\ndoctl compute ssh-key list\n```\n\n#### Hetzner\n```\n# Upload SSH key\nhcloud ssh-key create \\n --name provisioning-key \\n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# List keys\nhcloud ssh-key list\n```\n\n#### AWS\n```\n# Create or import EC2 key pair\naws ec2 create-key-pair \\n --key-name provisioning-key \\n --query 'KeyMaterial' --output text > provisioning-key.pem\n\nchmod 600 provisioning-key.pem\n```\n\n### 4. Domain and DNS\n\nYou need a domain with Route53 or ability to create DNS records:\n\n```\n# Create hosted zone in Route53\naws route53 create-hosted-zone \\n --name api.example.com \\n --caller-reference $(date +%s)\n\n# Note the Zone ID for updates\naws route53 list-hosted-zones\n```\n\n## Deployment\n\n### Step 1: Configure the Workspace\n\nEdit `workspace.ncl` to customize:\n\n```\n# Update SSH key references\ndroplets = digitalocean.Droplet & {\n ssh_keys = ["YOUR_DO_KEY_ID"],\n name = "us-app",\n region = "nyc3"\n}\n\n# Update AWS AMI IDs for your region\napp_servers = aws.EC2 & {\n image_id = "ami-09d56f8956ab235b7",\n instance_type = "t3.medium",\n region = "ap-southeast-1"\n}\n\n# Update certificate ID\nload_balancer = digitalocean.LoadBalancer & {\n forwarding_rules = [{\n certificate_id = "your-certificate-id",\n entry_protocol = "https",\n entry_port = 443\n }]\n}\n```\n\nEdit `config.toml`:\n\n```\n# Update regional names if different\n[providers.digitalocean]\nregion_name = "us-east"\n\n[providers.hetzner]\nregion_name = "eu-central"\n\n[providers.aws]\nregion_name = "asia-southeast"\n\n# Update domain\n[dns]\ndomain = "api.example.com"\n```\n\n### Step 2: Validate Configuration\n\n```\n# Validate Nickel syntax\nnickel export workspace.ncl | jq . > /dev/null\n\n# Verify credentials per provider\ndoctl auth init --access-token $DIGITALOCEAN_TOKEN\nhcloud context use default\naws sts get-caller-identity\n\n# Check connectivity\ndoctl account get\nhcloud server list\naws ec2 describe-regions\n```\n\n### Step 3: Deploy\n\n```\n# Make script executable\nchmod +x deploy.nu\n\n# Execute deployment (step-by-step)\n./deploy.nu\n\n# Or with debug output\n./deploy.nu --debug\n\n# Or deploy per region\n./deploy.nu --region us-east\n./deploy.nu --region eu-central\n./deploy.nu --region asia-southeast\n```\n\n### Step 4: Verify Global Deployment\n\n```\n# List resources per region\necho "=== US EAST (DigitalOcean) ==="\ndoctl compute droplet list --format Name,Region,Status,PublicIPv4\ndoctl compute load-balancer list\n\necho "=== EU CENTRAL (Hetzner) ==="\nhcloud server list\n\necho "=== ASIA PACIFIC (AWS) ==="\naws ec2 describe-instances --region ap-southeast-1 \\n --query 'Reservations[*].Instances[*].[InstanceId,InstanceType,State.Name,PublicIpAddress]' \\n --output table\naws elbv2 describe-load-balancers --region ap-southeast-1\n```\n\n## Post-Deployment Configuration\n\n### 1. SSL/TLS Certificates\n\n#### AWS Certificate Manager\n```\n# Request certificate for all regions\naws acm request-certificate \\n --domain-name api.example.com \\n --subject-alternative-names *.api.example.com \\n --validation-method DNS \\n --region us-east-1\n\n# Get certificate ARN\naws acm list-certificates --region us-east-1\n\n# Note the ARN for workspace.ncl\n```\n\n### 2. Database Primary/Replica Setup\n\n```\n# Connect to US East primary\nPGPASSWORD=admin psql -h us-db-primary.abc123.us-east-1.rds.amazonaws.com -U admin -d postgres\n\n# Create read-only replication users for EU and APAC\nCREATE ROLE replication_user WITH REPLICATION LOGIN PASSWORD 'replica_password';\n\n# On EU read replica (Hetzner) - verify replication\nSELECT slot_name, restart_lsn, confirmed_flush_lsn FROM pg_replication_slots;\n\n# On APAC read replica (AWS RDS) - verify replica status\nSELECT databaseid, xmin, catalog_xmin FROM pg_replication_origin_status;\n```\n\n### 3. Global DNS Setup\n\n```\n# Create Route53 records for each region\naws route53 change-resource-record-sets \\n --hosted-zone-id Z1234567890ABC \\n --change-batch '{\n "Changes": [\n {\n "Action": "CREATE",\n "ResourceRecordSet": {\n "Name": "us.api.example.com",\n "Type": "A",\n "TTL": 60,\n "ResourceRecords": [{"Value": "198.51.100.15"}]\n }\n },\n {\n "Action": "CREATE",\n "ResourceRecordSet": {\n "Name": "eu.api.example.com",\n "Type": "A",\n "TTL": 60,\n "ResourceRecords": [{"Value": "192.0.2.100"}]\n }\n },\n {\n "Action": "CREATE",\n "ResourceRecordSet": {\n "Name": "asia.api.example.com",\n "Type": "A",\n "TTL": 60,\n "ResourceRecords": [{"Value": "203.0.113.50"}]\n }\n }\n ]\n }'\n\n# Health checks per region\naws route53 create-health-check \\n --health-check-config '{\n "Type": "HTTPS",\n "ResourcePath": "/health",\n "FullyQualifiedDomainName": "us.api.example.com",\n "Port": 443,\n "RequestInterval": 30,\n "FailureThreshold": 3\n }'\n```\n\n### 4. Application Deployment\n\nSSH to web servers in each region:\n\n```\n# US East\nUS_IP=$(doctl compute droplet get us-app-1 --format PublicIPv4 --no-header)\nssh root@$US_IP\n\n# Deploy application\ncd /var/www\ngit clone https://github.com/your-org/app.git\ncd app\n./deploy.sh\n\n# EU Central\nEU_IP=$(hcloud server list --selector region=eu-central --format ID | head -1 | xargs -I {} hcloud server ip {})\nssh root@$EU_IP\n\n# Asia Pacific\nASIA_IP=$(aws ec2 describe-instances \\n --region ap-southeast-1 \\n --filters "Name=tag:Name,Values=asia-app-1" \\n --query 'Reservations[0].Instances[0].PublicIpAddress' \\n --output text)\nssh -i provisioning-key.pem ec2-user@$ASIA_IP\n```\n\n## Monitoring and Health Checks\n\n### Regional Monitoring\n\nEach region generates metrics to CloudWatch/provider-specific monitoring:\n\n```\n# DigitalOcean metrics\ndoctl monitoring metrics list droplet \\n --droplet-id 123456789 \\n --metric cpu\n\n# Hetzner metrics (manual monitoring)\nhcloud server list\n\n# AWS CloudWatch\naws cloudwatch get-metric-statistics \\n --metric-name CPUUtilization \\n --namespace AWS/EC2 \\n --start-time 2024-01-01T00:00:00Z \\n --end-time 2024-01-02T00:00:00Z \\n --period 300 \\n --statistics Average\n```\n\n### Global Health Checks\n\nRoute53 health checks verify all regions are healthy:\n\n```\n# List health checks\naws route53 list-health-checks\n\n# Get detailed status\naws route53 get-health-check-status --health-check-id abc123\n\n# Verify replication lag\n# On primary (US East) DigitalOcean\nSELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;\n\n# Should be less than 300 seconds\n```\n\n### Alert Configuration\n\nConfigure alerts for critical metrics:\n\n```\n# CPU > 80%\naws cloudwatch put-metric-alarm \\n --alarm-name us-east-high-cpu \\n --alarm-actions arn:aws:sns:us-east-1:123456:ops-alerts \\n --metric-name CPUUtilization \\n --threshold 80 \\n --comparison-operator GreaterThanThreshold\n\n# Replication lag > 600s\naws cloudwatch put-metric-alarm \\n --alarm-name replication-lag-critical \\n --metric-name ReplicationLag \\n --threshold 600 \\n --comparison-operator GreaterThanThreshold\n```\n\n## Failover Testing\n\n### Planned Failover - US East to EU Central\n\n```\n# 1. Stop traffic to US East\naws route53 change-resource-record-sets \\n --hosted-zone-id Z1234567890ABC \\n --change-batch '{\n "Changes": [{\n "Action": "UPSERT",\n "ResourceRecordSet": {\n "Name": "api.example.com",\n "Type": "A",\n "TTL": 60,\n "ResourceRecords": [{"Value": "192.0.2.100"}]\n }\n }]\n }'\n\n# 2. Promote EU Central to primary\n# Connect to EU read replica and promote\npsql -h hetzner-eu-db.netz.de -U admin -d postgres \\n -c "SELECT pg_promote();"\n\n# 3. Verify failover\ncurl https://api.example.com/health\n\n# 4. Monitor replication (now from EU)\nSELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;\n```\n\n### Automatic Failover - Health Check Failure\n\nRoute53 automatically fails over when health checks fail:\n\n```\n# Simulate US East failure (for testing only)\n# Stop web servers temporarily\ndoctl compute droplet-action power-off us-app-1 us-app-2 us-app-3\n\n# Wait ~1 minute for health check to fail\nsleep 60\n\n# Verify traffic now routes to EU/APAC\ncurl https://api.example.com/ -v | grep -E "^< Server"\n\n# Restore US East\ndoctl compute droplet-action power-on us-app-1 us-app-2 us-app-3\n```\n\n## Scaling and Upgrades\n\n### Add More Web Servers\n\nEdit `workspace.ncl`:\n\n```\n# Increase droplet count\nregion_us_east.app_servers = digitalocean.Droplet & {\n count = 5,\n name = "us-app",\n region = "nyc3"\n}\n\n# Increase Hetzner servers\nregion_eu_central.app_servers = hetzner.Server & {\n count = 5,\n server_type = "cpx21",\n location = "nbg1"\n}\n\n# Increase AWS EC2 instances\nregion_asia_southeast.app_servers = aws.EC2 & {\n count = 5,\n instance_type = "t3.medium",\n region = "ap-southeast-1"\n}\n```\n\nRedeploy:\n\n```\n./deploy.nu --region us-east\n./deploy.nu --region eu-central\n./deploy.nu --region asia-southeast\n```\n\n### Upgrade Database Instance Class\n\nEdit `workspace.ncl`:\n\n```\n# US East primary\ndatabase = digitalocean.Database & {\n size = "db-s-4vcpu-8gb",\n name = "us-db-primary",\n engine = "pg"\n}\n```\n\nDigitalOcean handles upgrade with minimal downtime.\n\n### Upgrade EC2 Instances\n\n```\n# Stop instances for upgrade (rolling)\naws ec2 stop-instances --region ap-southeast-1 --instance-ids i-1234567890abcdef0\n\n# Wait for stop\naws ec2 wait instance-stopped --region ap-southeast-1 --instance-ids i-1234567890abcdef0\n\n# Modify instance type\naws ec2 modify-instance-attribute \\n --region ap-southeast-1 \\n --instance-id i-1234567890abcdef0 \\n --instance-type t3.large\n\n# Start instance\naws ec2 start-instances --region ap-southeast-1 --instance-ids i-1234567890abcdef0\n```\n\n## Cost Optimization\n\n### Monthly Cost Breakdown\n\n| Component | US East | EU Central | Asia Pacific | Total |\n| ----------- | --------- | ----------- | -------------- | ------- |\n| Compute | $72 | €62.70 | $80 | $242.70 |\n| Database | $30 | Read Replica | $30 | $60 |\n| Load Balancer | Free | ~$10 | ~$20 | ~$30 |\n| **Total** | **$102** | **~$79** | **$130** | **~$311** |\n\n### Optimization Strategies\n\n1. Reduce instance count from 3 to 2 (saves ~$30-40/month)\n2. Downsize compute to s-1vcpu-2gb (saves ~$20-30/month)\n3. Use Reserved Instances on AWS (saves ~20-30%)\n4. Optimize data transfer between regions\n5. Review backups and retention settings\n\n### Monitor Costs\n\n```\n# DigitalOcean\ndoctl billing get\n\n# AWS Cost Explorer\naws ce get-cost-and-usage \\n --time-period Start=2024-01-01,End=2024-01-31 \\n --granularity MONTHLY \\n --metrics BlendedCost \\n --group-by Type=DIMENSION,Key=SERVICE\n\n# Hetzner (manual via console)\n# https://console.hetzner.cloud/billing\n```\n\n## Troubleshooting\n\n### Issue: One Region Not Responding\n\n**Diagnosis**:\n```\n# Check health checks\naws route53 get-health-check-status --health-check-id abc123\n\n# Test regional endpoints\ncurl -v https://us.api.example.com/health\ncurl -v https://eu.api.example.com/health\ncurl -v https://asia.api.example.com/health\n```\n\n**Solution**:\n- Check web server status in affected region\n- Verify load balancer is healthy\n- Review security groups/firewall rules\n- Check application logs on web servers\n\n### Issue: High Replication Lag\n\n**Diagnosis**:\n```\n# Check replication status\npsql -h us-db-primary.abc123.us-east-1.rds.amazonaws.com -U admin -d postgres \\n -c "SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;"\n\n# Check replication slots\npsql -h us-db-primary.abc123.us-east-1.rds.amazonaws.com -U admin -d postgres \\n -c "SELECT * FROM pg_replication_slots;"\n```\n\n**Solution**:\n- Check network connectivity between regions\n- Verify VPN tunnels are operational\n- Reduce write load on primary\n- Monitor network bandwidth\n- May need larger database instance\n\n### Issue: VPN Tunnel Down\n\n**Diagnosis**:\n```\n# Check VPN connection status\naws ec2 describe-vpn-connections --region us-east-1\n\n# Test connectivity between regions\nssh hetzner-server "ping 10.0.0.1"\n```\n\n**Solution**:\n- Reconnect VPN tunnel manually\n- Verify tunnel configuration\n- Check security groups allow necessary ports\n- Review ISP routing\n\n## Cleanup\n\nTo destroy all resources (use carefully):\n\n```\n# DigitalOcean\ndoctl compute droplet delete --force us-app-1 us-app-2 us-app-3\ndoctl compute load-balancer delete --force us-lb\ndoctl compute database delete --force us-db-primary\n\n# Hetzner\nhcloud server delete hetzner-eu-1 hetzner-eu-2 hetzner-eu-3\nhcloud load-balancer delete eu-lb\nhcloud volume delete eu-backups\n\n# AWS\naws ec2 terminate-instances --region ap-southeast-1 --instance-ids i-xxxxx\naws elbv2 delete-load-balancer --load-balancer-arn arn:aws:elasticloadbalancing:ap-southeast-1:123456789:loadbalancer/app/asia-lb/1234567890abcdef\naws rds delete-db-instance --db-instance-identifier asia-db-replica --skip-final-snapshot\n\n# Route53\naws route53 delete-health-check --health-check-id abc123\naws route53 delete-hosted-zone --id Z1234567890ABC\n```\n\n## Next Steps\n\n1. Disaster Recovery Testing: Regular failover drills\n2. Auto-scaling: Add provider-specific autoscaling\n3. Monitoring Integration: Connect to centralized monitoring (Datadog, New Relic, Prometheus)\n4. Backup Automation: Implement cross-region backups\n5. Cost Optimization: Review and tune resource sizing\n6. Security Hardening: Implement WAF, DDoS protection\n7. Load Testing: Validate performance across regions\n\n## Support\n\nFor issues or questions:\n\n- Review the multi-provider networking guide\n- Check provider-specific documentation\n- Review regional deployment logs: `./deploy.nu --debug`\n- Test regional endpoints independently\n\n## Files\n\n- `workspace.ncl`: Global infrastructure definition (Nickel)\n- `config.toml`: Provider credentials and regional settings\n- `deploy.nu`: Multi-region deployment orchestration (Nushell)\n- `README.md`: This file \ No newline at end of file +# Multi-Region High Availability Workspace\n\nThis workspace demonstrates a production-ready global high availability deployment spanning three cloud providers across three geographic regions:\n\n- **US East (DigitalOcean NYC)**: Primary region - active serving, primary database\n- **EU Central (Hetzner Germany)**: Secondary region - active serving, read replicas\n- **Asia Pacific (AWS Singapore)**: Tertiary region - active serving, read replicas\n\n## Why Multi-Region High Availability?\n\n### Business Benefits\n\n- **99.99% Uptime**: Automatic failover across regions\n- **Low Latency**: Users served from geographically closest region\n- **Compliance**: Data residency in specific regions (GDPR for EU)\n- **Disaster Recovery**: Complete regional failure tolerance\n\n### Technical Benefits\n\n- **Load Distribution**: Traffic spread across 3 regions\n- **Cost Optimization**: Pay only for actual usage (~$311/month)\n- **Provider Diversity**: Reduces vendor lock-in risk\n- **Capacity Planning**: Scale independently per region\n\n## Architecture Overview\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│ Global Route53 DNS │\n│ Geographic Routing + Health Checks │\n└────────────────────┬────────────────────────────────────────────┘\n │\n ┌───────────┼───────────┐\n │ │ │\n ┌────▼─────┐ ┌──▼────────┐ ┌▼──────────┐\n │ US │ │ EU │ │ APAC │\n │ Primary │ │ Secondary │ │ Tertiary │\n └────┬─────┘ └──┬────────┘ └▼──────────┘\n │ │ │\n ┌────▼──────────▼───────────▼────┐\n │ Multi-Master Database │\n │ Replication (300s lag) │\n └────────────────────────────────┘\n │ │ │\n ┌────▼────┐ ┌──▼─────┐ ┌──▼────┐\n │DO Droplets Hetzner AWS\n │ 3 x nyc3 3 x nbg1 3 x sgp1\n │ │ │ │\n │ Load Balancer (per region)\n │ │ │ │\n └─────────┼─────────┼─────────┘\n │VPN Tunnels (IPSec)│\n └───────────────────┘\n```\n\n### Regional Components\n\n#### US East (DigitalOcean) - Primary\n\n```\nRegion: nyc3 (New York)\nCompute: 3x Droplets (s-2vcpu-4gb)\nLoad Balancer: Round-robin with health checks\nDatabase: PostgreSQL (3-node cluster, Multi-AZ)\nNetwork: VPC 10.0.0.0/16\nCost: ~$102/month\n```\n\n#### EU Central (Hetzner) - Secondary\n\n```\nRegion: nbg1 (Nuremberg, Germany)\nCompute: 3x CPX21 servers (4 vCPU, 8GB RAM)\nLoad Balancer: Hetzner Load Balancer\nDatabase: Read-only replica (lag: 300s)\nNetwork: vSwitch 10.1.0.0/16\nCost: ~$79/month (€72.70)\n```\n\n#### Asia Pacific (AWS) - Tertiary\n\n```\nRegion: ap-southeast-1 (Singapore)\nCompute: 3x EC2 t3.medium instances\nLoad Balancer: Application Load Balancer (ALB)\nDatabase: RDS read-only replica (lag: 300s)\nNetwork: VPC 10.2.0.0/16\nCost: ~$130/month\n```\n\n## Prerequisites\n\n### 1. Cloud Accounts & Credentials\n\n#### DigitalOcean\n```\n# Create API token\n# Dashboard → API → Tokens/Keys → Generate New Token\n# Scopes: read, write\n\nexport DIGITALOCEAN_TOKEN="dop_v1_abc123def456ghi789jkl012mno"\n```\n\n#### Hetzner\n```\n# Create API token\n# Dashboard → Security → API Tokens → Generate Token\n\nexport HCLOUD_TOKEN="MC4wNTI1YmE1M2E4YmE0YTQzMTQyZTdlODYy"\n```\n\n#### AWS\n```\n# Create IAM user with programmatic access\n# IAM → Users → Add User → Check "Programmatic access"\n# Attach policies: AmazonEC2FullAccess, AmazonRDSFullAccess, Route53FullAccess\n\nexport AWS_ACCESS_KEY_ID="AKIA1234567890ABCDEF"\nexport AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG+j/zI0m1234567890ab"\n```\n\n### 2. CLI Tools\n\n```\n# Verify all CLIs are installed\nwhich doctl\nwhich hcloud\nwhich aws\nwhich nickel\n\n# Versions\ndoctl version # >= 1.94.0\nhcloud version # >= 1.35.0\naws --version # >= 2.0\nnickel --version # >= 1.0\n```\n\n### 3. SSH Keys\n\n#### DigitalOcean\n```\n# Upload SSH key\ndoctl compute ssh-key create provisioning-key \\n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# Note the key ID\ndoctl compute ssh-key list\n```\n\n#### Hetzner\n```\n# Upload SSH key\nhcloud ssh-key create \\n --name provisioning-key \\n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# List keys\nhcloud ssh-key list\n```\n\n#### AWS\n```\n# Create or import EC2 key pair\naws ec2 create-key-pair \\n --key-name provisioning-key \\n --query 'KeyMaterial' --output text > provisioning-key.pem\n\nchmod 600 provisioning-key.pem\n```\n\n### 4. Domain and DNS\n\nYou need a domain with Route53 or ability to create DNS records:\n\n```\n# Create hosted zone in Route53\naws route53 create-hosted-zone \\n --name api.example.com \\n --caller-reference $(date +%s)\n\n# Note the Zone ID for updates\naws route53 list-hosted-zones\n```\n\n## Deployment\n\n### Step 1: Configure the Workspace\n\nEdit `workspace.ncl` to customize:\n\n```\n# Update SSH key references\ndroplets = digitalocean.Droplet & {\n ssh_keys = ["YOUR_DO_KEY_ID"],\n name = "us-app",\n region = "nyc3"\n}\n\n# Update AWS AMI IDs for your region\napp_servers = aws.EC2 & {\n image_id = "ami-09d56f8956ab235b7",\n instance_type = "t3.medium",\n region = "ap-southeast-1"\n}\n\n# Update certificate ID\nload_balancer = digitalocean.LoadBalancer & {\n forwarding_rules = [{\n certificate_id = "your-certificate-id",\n entry_protocol = "https",\n entry_port = 443\n }]\n}\n```\n\nEdit `config.toml`:\n\n```\n# Update regional names if different\n[providers.digitalocean]\nregion_name = "us-east"\n\n[providers.hetzner]\nregion_name = "eu-central"\n\n[providers.aws]\nregion_name = "asia-southeast"\n\n# Update domain\n[dns]\ndomain = "api.example.com"\n```\n\n### Step 2: Validate Configuration\n\n```\n# Validate Nickel syntax\nnickel export workspace.ncl | jq . > /dev/null\n\n# Verify credentials per provider\ndoctl auth init --access-token $DIGITALOCEAN_TOKEN\nhcloud context use default\naws sts get-caller-identity\n\n# Check connectivity\ndoctl account get\nhcloud server list\naws ec2 describe-regions\n```\n\n### Step 3: Deploy\n\n```\n# Make script executable\nchmod +x deploy.nu\n\n# Execute deployment (step-by-step)\n./deploy.nu\n\n# Or with debug output\n./deploy.nu --debug\n\n# Or deploy per region\n./deploy.nu --region us-east\n./deploy.nu --region eu-central\n./deploy.nu --region asia-southeast\n```\n\n### Step 4: Verify Global Deployment\n\n```\n# List resources per region\necho "=== US EAST (DigitalOcean) ==="\ndoctl compute droplet list --format Name,Region,Status,PublicIPv4\ndoctl compute load-balancer list\n\necho "=== EU CENTRAL (Hetzner) ==="\nhcloud server list\n\necho "=== ASIA PACIFIC (AWS) ==="\naws ec2 describe-instances --region ap-southeast-1 \\n --query 'Reservations[*].Instances[*].[InstanceId,InstanceType,State.Name,PublicIpAddress]' \\n --output table\naws elbv2 describe-load-balancers --region ap-southeast-1\n```\n\n## Post-Deployment Configuration\n\n### 1. SSL/TLS Certificates\n\n#### AWS Certificate Manager\n```\n# Request certificate for all regions\naws acm request-certificate \\n --domain-name api.example.com \\n --subject-alternative-names *.api.example.com \\n --validation-method DNS \\n --region us-east-1\n\n# Get certificate ARN\naws acm list-certificates --region us-east-1\n\n# Note the ARN for workspace.ncl\n```\n\n### 2. Database Primary/Replica Setup\n\n```\n# Connect to US East primary\nPGPASSWORD=admin psql -h us-db-primary.abc123.us-east-1.rds.amazonaws.com -U admin -d postgres\n\n# Create read-only replication users for EU and APAC\nCREATE ROLE replication_user WITH REPLICATION LOGIN PASSWORD 'replica_password';\n\n# On EU read replica (Hetzner) - verify replication\nSELECT slot_name, restart_lsn, confirmed_flush_lsn FROM pg_replication_slots;\n\n# On APAC read replica (AWS RDS) - verify replica status\nSELECT databaseid, xmin, catalog_xmin FROM pg_replication_origin_status;\n```\n\n### 3. Global DNS Setup\n\n```\n# Create Route53 records for each region\naws route53 change-resource-record-sets \\n --hosted-zone-id Z1234567890ABC \\n --change-batch '{\n "Changes": [\n {\n "Action": "CREATE",\n "ResourceRecordSet": {\n "Name": "us.api.example.com",\n "Type": "A",\n "TTL": 60,\n "ResourceRecords": [{"Value": "198.51.100.15"}]\n }\n },\n {\n "Action": "CREATE",\n "ResourceRecordSet": {\n "Name": "eu.api.example.com",\n "Type": "A",\n "TTL": 60,\n "ResourceRecords": [{"Value": "192.0.2.100"}]\n }\n },\n {\n "Action": "CREATE",\n "ResourceRecordSet": {\n "Name": "asia.api.example.com",\n "Type": "A",\n "TTL": 60,\n "ResourceRecords": [{"Value": "203.0.113.50"}]\n }\n }\n ]\n }'\n\n# Health checks per region\naws route53 create-health-check \\n --health-check-config '{\n "Type": "HTTPS",\n "ResourcePath": "/health",\n "FullyQualifiedDomainName": "us.api.example.com",\n "Port": 443,\n "RequestInterval": 30,\n "FailureThreshold": 3\n }'\n```\n\n### 4. Application Deployment\n\nSSH to web servers in each region:\n\n```\n# US East\nUS_IP=$(doctl compute droplet get us-app-1 --format PublicIPv4 --no-header)\nssh root@$US_IP\n\n# Deploy application\ncd /var/www\ngit clone https://github.com/your-org/app.git\ncd app\n./deploy.sh\n\n# EU Central\nEU_IP=$(hcloud server list --selector region=eu-central --format ID | head -1 | xargs -I {} hcloud server ip {})\nssh root@$EU_IP\n\n# Asia Pacific\nASIA_IP=$(aws ec2 describe-instances \\n --region ap-southeast-1 \\n --filters "Name=tag:Name,Values=asia-app-1" \\n --query 'Reservations[0].Instances[0].PublicIpAddress' \\n --output text)\nssh -i provisioning-key.pem ec2-user@$ASIA_IP\n```\n\n## Monitoring and Health Checks\n\n### Regional Monitoring\n\nEach region generates metrics to CloudWatch/provider-specific monitoring:\n\n```\n# DigitalOcean metrics\ndoctl monitoring metrics list droplet \\n --droplet-id 123456789 \\n --metric cpu\n\n# Hetzner metrics (manual monitoring)\nhcloud server list\n\n# AWS CloudWatch\naws cloudwatch get-metric-statistics \\n --metric-name CPUUtilization \\n --namespace AWS/EC2 \\n --start-time 2024-01-01T00:00:00Z \\n --end-time 2024-01-02T00:00:00Z \\n --period 300 \\n --statistics Average\n```\n\n### Global Health Checks\n\nRoute53 health checks verify all regions are healthy:\n\n```\n# List health checks\naws route53 list-health-checks\n\n# Get detailed status\naws route53 get-health-check-status --health-check-id abc123\n\n# Verify replication lag\n# On primary (US East) DigitalOcean\nSELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;\n\n# Should be less than 300 seconds\n```\n\n### Alert Configuration\n\nConfigure alerts for critical metrics:\n\n```\n# CPU > 80%\naws cloudwatch put-metric-alarm \\n --alarm-name us-east-high-cpu \\n --alarm-actions arn:aws:sns:us-east-1:123456:ops-alerts \\n --metric-name CPUUtilization \\n --threshold 80 \\n --comparison-operator GreaterThanThreshold\n\n# Replication lag > 600s\naws cloudwatch put-metric-alarm \\n --alarm-name replication-lag-critical \\n --metric-name ReplicationLag \\n --threshold 600 \\n --comparison-operator GreaterThanThreshold\n```\n\n## Failover Testing\n\n### Planned Failover - US East to EU Central\n\n```\n# 1. Stop traffic to US East\naws route53 change-resource-record-sets \\n --hosted-zone-id Z1234567890ABC \\n --change-batch '{\n "Changes": [{\n "Action": "UPSERT",\n "ResourceRecordSet": {\n "Name": "api.example.com",\n "Type": "A",\n "TTL": 60,\n "ResourceRecords": [{"Value": "192.0.2.100"}]\n }\n }]\n }'\n\n# 2. Promote EU Central to primary\n# Connect to EU read replica and promote\npsql -h hetzner-eu-db.netz.de -U admin -d postgres \\n -c "SELECT pg_promote();"\n\n# 3. Verify failover\ncurl https://api.example.com/health\n\n# 4. Monitor replication (now from EU)\nSELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;\n```\n\n### Automatic Failover - Health Check Failure\n\nRoute53 automatically fails over when health checks fail:\n\n```\n# Simulate US East failure (for testing only)\n# Stop web servers temporarily\ndoctl compute droplet-action power-off us-app-1 us-app-2 us-app-3\n\n# Wait ~1 minute for health check to fail\nsleep 60\n\n# Verify traffic now routes to EU/APAC\ncurl https://api.example.com/ -v | grep -E "^< Server"\n\n# Restore US East\ndoctl compute droplet-action power-on us-app-1 us-app-2 us-app-3\n```\n\n## Scaling and Upgrades\n\n### Add More Web Servers\n\nEdit `workspace.ncl`:\n\n```\n# Increase droplet count\nregion_us_east.app_servers = digitalocean.Droplet & {\n count = 5,\n name = "us-app",\n region = "nyc3"\n}\n\n# Increase Hetzner servers\nregion_eu_central.app_servers = hetzner.Server & {\n count = 5,\n server_type = "cpx21",\n location = "nbg1"\n}\n\n# Increase AWS EC2 instances\nregion_asia_southeast.app_servers = aws.EC2 & {\n count = 5,\n instance_type = "t3.medium",\n region = "ap-southeast-1"\n}\n```\n\nRedeploy:\n\n```\n./deploy.nu --region us-east\n./deploy.nu --region eu-central\n./deploy.nu --region asia-southeast\n```\n\n### Upgrade Database Instance Class\n\nEdit `workspace.ncl`:\n\n```\n# US East primary\ndatabase = digitalocean.Database & {\n size = "db-s-4vcpu-8gb",\n name = "us-db-primary",\n engine = "pg"\n}\n```\n\nDigitalOcean handles upgrade with minimal downtime.\n\n### Upgrade EC2 Instances\n\n```\n# Stop instances for upgrade (rolling)\naws ec2 stop-instances --region ap-southeast-1 --instance-ids i-1234567890abcdef0\n\n# Wait for stop\naws ec2 wait instance-stopped --region ap-southeast-1 --instance-ids i-1234567890abcdef0\n\n# Modify instance type\naws ec2 modify-instance-attribute \\n --region ap-southeast-1 \\n --instance-id i-1234567890abcdef0 \\n --instance-type t3.large\n\n# Start instance\naws ec2 start-instances --region ap-southeast-1 --instance-ids i-1234567890abcdef0\n```\n\n## Cost Optimization\n\n### Monthly Cost Breakdown\n\n| Component | US East | EU Central | Asia Pacific | Total |\n| ----------- | --------- | ----------- | -------------- | ------- |\n| Compute | $72 | €62.70 | $80 | $242.70 |\n| Database | $30 | Read Replica | $30 | $60 |\n| Load Balancer | Free | ~$10 | ~$20 | ~$30 |\n| **Total** | **$102** | **~$79** | **$130** | **~$311** |\n\n### Optimization Strategies\n\n1. Reduce instance count from 3 to 2 (saves ~$30-40/month)\n2. Downsize compute to s-1vcpu-2gb (saves ~$20-30/month)\n3. Use Reserved Instances on AWS (saves ~20-30%)\n4. Optimize data transfer between regions\n5. Review backups and retention settings\n\n### Monitor Costs\n\n```\n# DigitalOcean\ndoctl billing get\n\n# AWS Cost Explorer\naws ce get-cost-and-usage \\n --time-period Start=2024-01-01,End=2024-01-31 \\n --granularity MONTHLY \\n --metrics BlendedCost \\n --group-by Type=DIMENSION,Key=SERVICE\n\n# Hetzner (manual via console)\n# https://console.hetzner.cloud/billing\n```\n\n## Troubleshooting\n\n### Issue: One Region Not Responding\n\n**Diagnosis**:\n```\n# Check health checks\naws route53 get-health-check-status --health-check-id abc123\n\n# Test regional endpoints\ncurl -v https://us.api.example.com/health\ncurl -v https://eu.api.example.com/health\ncurl -v https://asia.api.example.com/health\n```\n\n**Solution**:\n- Check web server status in affected region\n- Verify load balancer is healthy\n- Review security groups/firewall rules\n- Check application logs on web servers\n\n### Issue: High Replication Lag\n\n**Diagnosis**:\n```\n# Check replication status\npsql -h us-db-primary.abc123.us-east-1.rds.amazonaws.com -U admin -d postgres \\n -c "SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;"\n\n# Check replication slots\npsql -h us-db-primary.abc123.us-east-1.rds.amazonaws.com -U admin -d postgres \\n -c "SELECT * FROM pg_replication_slots;"\n```\n\n**Solution**:\n- Check network connectivity between regions\n- Verify VPN tunnels are operational\n- Reduce write load on primary\n- Monitor network bandwidth\n- May need larger database instance\n\n### Issue: VPN Tunnel Down\n\n**Diagnosis**:\n```\n# Check VPN connection status\naws ec2 describe-vpn-connections --region us-east-1\n\n# Test connectivity between regions\nssh hetzner-server "ping 10.0.0.1"\n```\n\n**Solution**:\n- Reconnect VPN tunnel manually\n- Verify tunnel configuration\n- Check security groups allow necessary ports\n- Review ISP routing\n\n## Cleanup\n\nTo destroy all resources (use carefully):\n\n```\n# DigitalOcean\ndoctl compute droplet delete --force us-app-1 us-app-2 us-app-3\ndoctl compute load-balancer delete --force us-lb\ndoctl compute database delete --force us-db-primary\n\n# Hetzner\nhcloud server delete hetzner-eu-1 hetzner-eu-2 hetzner-eu-3\nhcloud load-balancer delete eu-lb\nhcloud volume delete eu-backups\n\n# AWS\naws ec2 terminate-instances --region ap-southeast-1 --instance-ids i-xxxxx\naws elbv2 delete-load-balancer --load-balancer-arn arn:aws:elasticloadbalancing:ap-southeast-1:123456789:loadbalancer/app/asia-lb/1234567890abcdef\naws rds delete-db-instance --db-instance-identifier asia-db-replica --skip-final-snapshot\n\n# Route53\naws route53 delete-health-check --health-check-id abc123\naws route53 delete-hosted-zone --id Z1234567890ABC\n```\n\n## Next Steps\n\n1. Disaster Recovery Testing: Regular failover drills\n2. Auto-scaling: Add provider-specific autoscaling\n3. Monitoring Integration: Connect to centralized monitoring (Datadog, New Relic, Prometheus)\n4. Backup Automation: Implement cross-region backups\n5. Cost Optimization: Review and tune resource sizing\n6. Security Hardening: Implement WAF, DDoS protection\n7. Load Testing: Validate performance across regions\n\n## Support\n\nFor issues or questions:\n\n- Review the multi-provider networking guide\n- Check provider-specific documentation\n- Review regional deployment logs: `./deploy.nu --debug`\n- Test regional endpoints independently\n\n## Files\n\n- `workspace.ncl`: Global infrastructure definition (Nickel)\n- `config.toml`: Provider credentials and regional settings\n- `deploy.nu`: Multi-region deployment orchestration (Nushell)\n- `README.md`: This file diff --git a/locales/TRANSLATIONS_STATUS.md b/locales/TRANSLATIONS_STATUS.md index b851675..3a50de2 100644 --- a/locales/TRANSLATIONS_STATUS.md +++ b/locales/TRANSLATIONS_STATUS.md @@ -1,346 +1 @@ -# Multilingual Provisioning System - Translation Status - -**Last Updated**: 2026-01-13 -**Status**: 100% Complete (Phases 1-4D) - -## Executive Summary - -The Provisioning system now supports comprehensive multilingual interfaces across all components: - -- **Help System**: 65 strings extracted and translated -- **Forms**: ~180 strings covering setup, authentication, and infrastructure management -- **Supported Languages**: English (en-US), Spanish (es-ES) with infrastructure for future additions - -## Language Coverage - -### English (en-US) - 100% Complete ✅ - -**Source of Truth**: `/provisioning/locales/en-US/` - -| Component | File | Strings | Status | -|-----------|------|---------|--------| -| Help System | `help.ftl` | 65 | ✅ Complete | -| Forms | `forms.ftl` | 180 | ✅ Complete | -| **Total** | | **245** | **✅ Complete** | - -### Spanish (es-ES) - 100% Complete ✅ - -**Source of Truth**: `/provisioning/locales/es-ES/` - -| Component | File | Strings | Status | -|-----------|------|---------|--------| -| Help System | `help.ftl` | 65 | ✅ Complete | -| Forms | `forms.ftl` | 180 | ✅ Complete | -| **Total** | | **245** | **✅ Complete** | - -## String Breakdown by Category - -### Help System (65 strings) - -**Main Menu** (26 strings): -- Title, subtitle, category hint -- 13 category names and descriptions (infrastructure, orchestration, development, workspace, platform, setup, authentication, plugins, utilities, tools, vm, diagnostics, concepts, guides, integrations) -- 3 error messages - -**Infrastructure** (21 strings): -- Section title -- Server operations (4 strings) -- TaskServ management (4 strings) -- Cluster management (4 strings) -- VM operations (4 strings) -- Infrastructure tip - -**Orchestration** (18 strings): -- Section title -- Orchestrator management (5 strings) -- Workflow operations (5 strings) -- Batch operations (7 strings) -- Batch workflows tip + example - -### Forms (180 strings) - -**Unified Setup Form** (~140 strings): -- Project information (project name, version, description) -- Database configuration (PostgreSQL, MySQL, MongoDB, SQLite) -- API service configuration (name, ports, health check, replicas) -- Deployment options (Docker Compose, Kubernetes, Cloud) -- Advanced options (monitoring, TLS) -- Security & authentication (JWT, OAuth2, SAML, None) -- Terms & conditions (terms, newsletter, save address) - -**Core Forms** (~40 strings): -- Auth login form (username, password, remember me, forgot password) -- Setup wizard (quick, standard, advanced) -- MFA enrollment (TOTP, SMS, backup codes, device name) - -**Infrastructure Forms** (interactive): -- Delete confirmations (30+ potential variations for different resources) -- Resource confirmation prompts -- Data retention options - -## TypeDialog Integration - -### Configured Forms - -The following root forms have `locales_path` configured to use Fluent catalogs: - -| Form | Path | locales_path | -|------|------|--------------| -| Unified Setup | `.typedialog/provisioning/form.toml` | `../../../locales` | -| CI Configuration | `.typedialog/ci/form.toml` | `../../../locales` | -| Core Auth | `.typedialog/core/forms/auth-login.toml` | `../../../../../locales` | -| Core Setup Wizard | `.typedialog/core/forms/setup-wizard.toml` | `../../../../../locales` | -| Core MFA | `.typedialog/core/forms/mfa-enroll.toml` | `../../../../../locales` | - -**Note**: Fragment forms (70+ files) inherit locales from their parent forms through TypeDialog's `load_fragments` mechanism. - -## Fluent File Organization - -``` -provisioning/locales/ -├── i18n-config.toml # Central i18n configuration -├── TRANSLATIONS_STATUS.md # This file -├── en-US/ # English base language -│ ├── help.ftl # Help system strings (65 keys) -│ └── forms.ftl # Form strings (180 keys) -└── es-ES/ # Spanish translations - ├── help.ftl # Help system translations - └── forms.ftl # Form translations -``` - -## Fallback Chains - -When a string is missing in the active locale, TypeDialog automatically falls back to the configured chain: - -``` -es-ES → en-US (default) -``` - -**Configuration** in `i18n-config.toml`: -``` -[fallback_chains] -es-ES = ["en-US"] -``` - -This ensures that if any Spanish translation is missing, the English version is displayed as a fallback. - -## Feature Configuration - -The i18n system supports these features (enabled in `i18n-config.toml`): - -| Feature | Status | Purpose | -|---------|--------|---------| -| Pluralization | ✅ Enabled | Support plural forms in translations | -| Number Formatting | ✅ Enabled | Locale-specific number/currency formatting | -| Date Formatting | ✅ Enabled | Locale-specific date formats | -| Fallback Chains | ✅ Enabled | Automatic fallback to English | -| Gender Agreement | ⚠️ Disabled | Spanish doesn't need gender in these strings | -| RTL Support | ⚠️ Disabled | No RTL languages configured yet | - -## Translation Quality Standards - -### Naming Conventions - -All Fluent keys follow a consistent pattern: - -- **Help strings**: `help-{category}-{element}` (e.g., `help-infra-server-create`) -- **Form prompts**: `form-{element}-prompt` (e.g., `form-project_name-prompt`) -- **Form help**: `form-{element}-help` (e.g., `form-project_name-help`) -- **Form placeholders**: `form-{element}-placeholder` -- **Form options**: `form-{element}-option-{value}` (e.g., `form-database_type-option-postgres`) -- **Section headers**: `section-{name}-title` - -### Coverage Requirements - -From `i18n-config.toml`: - -- **Required Coverage**: 95% (critical locales: en-US, es-ES) -- **Warning Threshold**: 80% -- **Validation**: Missing keys trigger warnings during build - -## Testing & Validation - -### Locale Resolution - -The system uses the `LANG` environment variable for locale selection: - -``` -# English (default) -$ LANG=en_US provisioning help infrastructure -# Output: SERVER & INFRASTRUCTURE... - -# Spanish -$ LANG=es_ES provisioning help infrastructure -# Output: SERVIDOR E INFRAESTRUCTURA... - -# Fallback to English if missing locale file -$ LANG=fr_FR provisioning help infrastructure -# Output: SERVER & INFRASTRUCTURE... (fallback) -``` - -### Form Testing - -TypeDialog forms automatically use the configured locale: - -``` -# Display Unified Setup in English -$ LANG=en_US provisioning setup profile - -# Display Unified Setup in Spanish -$ LANG=es_ES provisioning setup profile -``` - -### Coverage Validation - -To validate translation coverage: - -``` -# Check translation status -provisioning i18n status - -# Generate coverage report -provisioning i18n coverage --locale es-ES -# Expected: 100% (245/245 strings translated) - -# Validate Fluent files -provisioning i18n validate -``` - -## Future Expansion - -The infrastructure supports adding new languages. To add a new locale (e.g., Portuguese): - -### 1. Add Locale Configuration - -In `i18n-config.toml`: - -``` -[locales.pt-BR] -name = "Portuguese (Brazil)" -direction = "ltr" -plurals = 2 -decimal_separator = "," -thousands_separator = "." -date_format = "DD/MM/YYYY" - -[fallback_chains] -pt-BR = ["pt-PT", "es-ES", "en-US"] -``` - -### 2. Create Locale Directory - -``` -mkdir -p provisioning/locales/pt-BR -``` - -### 3. Create Translation Files - -``` -cp provisioning/locales/en-US/help.ftl provisioning/locales/pt-BR/help.ftl -cp provisioning/locales/en-US/forms.ftl provisioning/locales/pt-BR/forms.ftl -``` - -### 4. Translate Strings - -Update `pt-BR/help.ftl` and `pt-BR/forms.ftl` with Portuguese translations. - -### 5. Validate - -``` -provisioning i18n validate --locale pt-BR -``` - -## Architecture & Implementation Details - -### Mozilla Fluent Format - -All translations use Mozilla Fluent (.ftl files), which offers: - -- **Simple Syntax**: `key = value` format -- **Rich Features**: Pluralization, gender agreement, attributes -- **Fallback Support**: Automatic chain resolution -- **Extensibility**: Support for custom functions and formatting - -**Example**: -``` -help-infra-server-create = Create a new server -form-database_type-option-postgres = PostgreSQL (Recommended) -``` - -### TypeDialog Integration - -TypeDialog forms reference Fluent keys via `locales_path`: - -``` -locales_path = "../../../locales" - -[[elements]] -name = "project_name" -prompt = "form-project_name-prompt" # References: locales/*/forms.ftl -help = "form-project_name-help" -placeholder = "form-project_name-placeholder" -``` - -TypeDialog's locale resolution: - -1. Check `locales_path` configuration -2. Look for `LANG` environment variable (e.g., `es_ES`) -3. Find corresponding Fluent file (e.g., `es-ES/forms.ftl`) -4. Resolve key → value -5. Fallback to parent locale chain if missing -6. Use literal key if no translation found - -## Coverage Metrics - -### String Count Summary - -| Category | en-US | es-ES | Coverage | -|----------|-------|-------|----------| -| Help System | 65 | 65 | 100% ✅ | -| Forms | 180 | 180 | 100% ✅ | -| **Total** | **245** | **245** | **100%** ✅ | - -### Language Support - -| Locale | Strings | Status | Notes | -|--------|---------|--------|-------| -| en-US | 245 | ✅ Complete | Base language | -| es-ES | 245 | ✅ Complete | Full translation | -| pt-BR | - | 🔄 Planned | Infrastructure ready | -| fr-FR | - | 🔄 Planned | Infrastructure ready | -| ja-JP | - | 🔄 Planned | Infrastructure ready | - -## Maintenance & Updates - -### Adding Translations - -When new forms or help sections are added: - -1. Extract strings using extraction tools -2. Add Fluent keys to `en-US/*.ftl` -3. Translate to `es-ES/*.ftl` -4. Update this status document - -### Validation Checklist - -- [ ] All new strings have Fluent keys -- [ ] Keys follow naming conventions -- [ ] English translation complete -- [ ] Spanish translation complete -- [ ] Fallback chains tested -- [ ] LANG environment variable works -- [ ] TypeDialog forms display correctly - -## References - -- **Fluent Documentation**: https://projectfluent.org/ -- **TypeDialog i18n**: TypeDialog embedded documentation -- **i18n Configuration**: See `provisioning/locales/i18n-config.toml` -- **Help System**: See `provisioning/core/nulib/main_provisioning/help_system.nu` - ---- - -**Status**: ✅ Complete -**Phases Completed**: 1, 2, 3, 4A, 4B, 4C, 4D -**Ready for**: Production deployment, further language additions, testing - +# Multilingual Provisioning System - Translation Status\n\n**Last Updated**: 2026-01-13\n**Status**: 100% Complete (Phases 1-4D)\n\n## Executive Summary\n\nThe Provisioning system now supports comprehensive multilingual interfaces across all components:\n\n- **Help System**: 65 strings extracted and translated\n- **Forms**: ~180 strings covering setup, authentication, and infrastructure management\n- **Supported Languages**: English (en-US), Spanish (es-ES) with infrastructure for future additions\n\n## Language Coverage\n\n### English (en-US) - 100% Complete ✅\n\n**Source of Truth**: `/provisioning/locales/en-US/`\n\n| Component | File | Strings | Status |\n|-----------|------|---------|--------|\n| Help System | `help.ftl` | 65 | ✅ Complete |\n| Forms | `forms.ftl` | 180 | ✅ Complete |\n| **Total** | | **245** | **✅ Complete** |\n\n### Spanish (es-ES) - 100% Complete ✅\n\n**Source of Truth**: `/provisioning/locales/es-ES/`\n\n| Component | File | Strings | Status |\n|-----------|------|---------|--------|\n| Help System | `help.ftl` | 65 | ✅ Complete |\n| Forms | `forms.ftl` | 180 | ✅ Complete |\n| **Total** | | **245** | **✅ Complete** |\n\n## String Breakdown by Category\n\n### Help System (65 strings)\n\n**Main Menu** (26 strings):\n- Title, subtitle, category hint\n- 13 category names and descriptions (infrastructure, orchestration, development, workspace, platform, setup, authentication, plugins, utilities, tools, vm, diagnostics, concepts, guides, integrations)\n- 3 error messages\n\n**Infrastructure** (21 strings):\n- Section title\n- Server operations (4 strings)\n- TaskServ management (4 strings)\n- Cluster management (4 strings)\n- VM operations (4 strings)\n- Infrastructure tip\n\n**Orchestration** (18 strings):\n- Section title\n- Orchestrator management (5 strings)\n- Workflow operations (5 strings)\n- Batch operations (7 strings)\n- Batch workflows tip + example\n\n### Forms (180 strings)\n\n**Unified Setup Form** (~140 strings):\n- Project information (project name, version, description)\n- Database configuration (PostgreSQL, MySQL, MongoDB, SQLite)\n- API service configuration (name, ports, health check, replicas)\n- Deployment options (Docker Compose, Kubernetes, Cloud)\n- Advanced options (monitoring, TLS)\n- Security & authentication (JWT, OAuth2, SAML, None)\n- Terms & conditions (terms, newsletter, save address)\n\n**Core Forms** (~40 strings):\n- Auth login form (username, password, remember me, forgot password)\n- Setup wizard (quick, standard, advanced)\n- MFA enrollment (TOTP, SMS, backup codes, device name)\n\n**Infrastructure Forms** (interactive):\n- Delete confirmations (30+ potential variations for different resources)\n- Resource confirmation prompts\n- Data retention options\n\n## TypeDialog Integration\n\n### Configured Forms\n\nThe following root forms have `locales_path` configured to use Fluent catalogs:\n\n| Form | Path | locales_path |\n|------|------|--------------|\n| Unified Setup | `.typedialog/provisioning/form.toml` | `../../../locales` |\n| CI Configuration | `.typedialog/ci/form.toml` | `../../../locales` |\n| Core Auth | `.typedialog/core/forms/auth-login.toml` | `../../../../../locales` |\n| Core Setup Wizard | `.typedialog/core/forms/setup-wizard.toml` | `../../../../../locales` |\n| Core MFA | `.typedialog/core/forms/mfa-enroll.toml` | `../../../../../locales` |\n\n**Note**: Fragment forms (70+ files) inherit locales from their parent forms through TypeDialog's `load_fragments` mechanism.\n\n## Fluent File Organization\n\n```\nprovisioning/locales/\n├── i18n-config.toml # Central i18n configuration\n├── TRANSLATIONS_STATUS.md # This file\n├── en-US/ # English base language\n│ ├── help.ftl # Help system strings (65 keys)\n│ └── forms.ftl # Form strings (180 keys)\n└── es-ES/ # Spanish translations\n ├── help.ftl # Help system translations\n └── forms.ftl # Form translations\n```\n\n## Fallback Chains\n\nWhen a string is missing in the active locale, TypeDialog automatically falls back to the configured chain:\n\n```\nes-ES → en-US (default)\n```\n\n**Configuration** in `i18n-config.toml`:\n```\n[fallback_chains]\nes-ES = ["en-US"]\n```\n\nThis ensures that if any Spanish translation is missing, the English version is displayed as a fallback.\n\n## Feature Configuration\n\nThe i18n system supports these features (enabled in `i18n-config.toml`):\n\n| Feature | Status | Purpose |\n|---------|--------|---------|\n| Pluralization | ✅ Enabled | Support plural forms in translations |\n| Number Formatting | ✅ Enabled | Locale-specific number/currency formatting |\n| Date Formatting | ✅ Enabled | Locale-specific date formats |\n| Fallback Chains | ✅ Enabled | Automatic fallback to English |\n| Gender Agreement | ⚠️ Disabled | Spanish doesn't need gender in these strings |\n| RTL Support | ⚠️ Disabled | No RTL languages configured yet |\n\n## Translation Quality Standards\n\n### Naming Conventions\n\nAll Fluent keys follow a consistent pattern:\n\n- **Help strings**: `help-{category}-{element}` (e.g., `help-infra-server-create`)\n- **Form prompts**: `form-{element}-prompt` (e.g., `form-project_name-prompt`)\n- **Form help**: `form-{element}-help` (e.g., `form-project_name-help`)\n- **Form placeholders**: `form-{element}-placeholder`\n- **Form options**: `form-{element}-option-{value}` (e.g., `form-database_type-option-postgres`)\n- **Section headers**: `section-{name}-title`\n\n### Coverage Requirements\n\nFrom `i18n-config.toml`:\n\n- **Required Coverage**: 95% (critical locales: en-US, es-ES)\n- **Warning Threshold**: 80%\n- **Validation**: Missing keys trigger warnings during build\n\n## Testing & Validation\n\n### Locale Resolution\n\nThe system uses the `LANG` environment variable for locale selection:\n\n```\n# English (default)\n$ LANG=en_US provisioning help infrastructure\n# Output: SERVER & INFRASTRUCTURE...\n\n# Spanish\n$ LANG=es_ES provisioning help infrastructure\n# Output: SERVIDOR E INFRAESTRUCTURA...\n\n# Fallback to English if missing locale file\n$ LANG=fr_FR provisioning help infrastructure\n# Output: SERVER & INFRASTRUCTURE... (fallback)\n```\n\n### Form Testing\n\nTypeDialog forms automatically use the configured locale:\n\n```\n# Display Unified Setup in English\n$ LANG=en_US provisioning setup profile\n\n# Display Unified Setup in Spanish\n$ LANG=es_ES provisioning setup profile\n```\n\n### Coverage Validation\n\nTo validate translation coverage:\n\n```\n# Check translation status\nprovisioning i18n status\n\n# Generate coverage report\nprovisioning i18n coverage --locale es-ES\n# Expected: 100% (245/245 strings translated)\n\n# Validate Fluent files\nprovisioning i18n validate\n```\n\n## Future Expansion\n\nThe infrastructure supports adding new languages. To add a new locale (e.g., Portuguese):\n\n### 1. Add Locale Configuration\n\nIn `i18n-config.toml`:\n\n```\n[locales.pt-BR]\nname = "Portuguese (Brazil)"\ndirection = "ltr"\nplurals = 2\ndecimal_separator = ","\nthousands_separator = "."\ndate_format = "DD/MM/YYYY"\n\n[fallback_chains]\npt-BR = ["pt-PT", "es-ES", "en-US"]\n```\n\n### 2. Create Locale Directory\n\n```\nmkdir -p provisioning/locales/pt-BR\n```\n\n### 3. Create Translation Files\n\n```\ncp provisioning/locales/en-US/help.ftl provisioning/locales/pt-BR/help.ftl\ncp provisioning/locales/en-US/forms.ftl provisioning/locales/pt-BR/forms.ftl\n```\n\n### 4. Translate Strings\n\nUpdate `pt-BR/help.ftl` and `pt-BR/forms.ftl` with Portuguese translations.\n\n### 5. Validate\n\n```\nprovisioning i18n validate --locale pt-BR\n```\n\n## Architecture & Implementation Details\n\n### Mozilla Fluent Format\n\nAll translations use Mozilla Fluent (.ftl files), which offers:\n\n- **Simple Syntax**: `key = value` format\n- **Rich Features**: Pluralization, gender agreement, attributes\n- **Fallback Support**: Automatic chain resolution\n- **Extensibility**: Support for custom functions and formatting\n\n**Example**:\n```\nhelp-infra-server-create = Create a new server\nform-database_type-option-postgres = PostgreSQL (Recommended)\n```\n\n### TypeDialog Integration\n\nTypeDialog forms reference Fluent keys via `locales_path`:\n\n```\nlocales_path = "../../../locales"\n\n[[elements]]\nname = "project_name"\nprompt = "form-project_name-prompt" # References: locales/*/forms.ftl\nhelp = "form-project_name-help"\nplaceholder = "form-project_name-placeholder"\n```\n\nTypeDialog's locale resolution:\n\n1. Check `locales_path` configuration\n2. Look for `LANG` environment variable (e.g., `es_ES`)\n3. Find corresponding Fluent file (e.g., `es-ES/forms.ftl`)\n4. Resolve key → value\n5. Fallback to parent locale chain if missing\n6. Use literal key if no translation found\n\n## Coverage Metrics\n\n### String Count Summary\n\n| Category | en-US | es-ES | Coverage |\n|----------|-------|-------|----------|\n| Help System | 65 | 65 | 100% ✅ |\n| Forms | 180 | 180 | 100% ✅ |\n| **Total** | **245** | **245** | **100%** ✅ |\n\n### Language Support\n\n| Locale | Strings | Status | Notes |\n|--------|---------|--------|-------|\n| en-US | 245 | ✅ Complete | Base language |\n| es-ES | 245 | ✅ Complete | Full translation |\n| pt-BR | - | 🔄 Planned | Infrastructure ready |\n| fr-FR | - | 🔄 Planned | Infrastructure ready |\n| ja-JP | - | 🔄 Planned | Infrastructure ready |\n\n## Maintenance & Updates\n\n### Adding Translations\n\nWhen new forms or help sections are added:\n\n1. Extract strings using extraction tools\n2. Add Fluent keys to `en-US/*.ftl`\n3. Translate to `es-ES/*.ftl`\n4. Update this status document\n\n### Validation Checklist\n\n- [ ] All new strings have Fluent keys\n- [ ] Keys follow naming conventions\n- [ ] English translation complete\n- [ ] Spanish translation complete\n- [ ] Fallback chains tested\n- [ ] LANG environment variable works\n- [ ] TypeDialog forms display correctly\n\n## References\n\n- **Fluent Documentation**: https://projectfluent.org/\n- **TypeDialog i18n**: TypeDialog embedded documentation\n- **i18n Configuration**: See `provisioning/locales/i18n-config.toml`\n- **Help System**: See `provisioning/core/nulib/main_provisioning/help_system.nu`\n\n---\n\n**Status**: ✅ Complete\n**Phases Completed**: 1, 2, 3, 4A, 4B, 4C, 4D\n**Ready for**: Production deployment, further language additions, testing\n diff --git a/locales/en-US/forms.ftl b/locales/en-US/forms.ftl index 23a5501..2925d25 100644 --- a/locales/en-US/forms.ftl +++ b/locales/en-US/forms.ftl @@ -122,4 +122,3 @@ form-confirm_name-prompt = Type resource name to confirm form-confirm_name-help = Type the exact name of the resource form-keep_data-prompt = Keep associated data? form-keep_data-help = Preserve related configuration and backups - diff --git a/locales/en-US/help.ftl b/locales/en-US/help.ftl index 3bf9bcf..c8c25c6 100644 --- a/locales/en-US/help.ftl +++ b/locales/en-US/help.ftl @@ -164,4 +164,3 @@ help-guides-intro = Quick start guides, tutorials, and reference sheets. help-integrations-intro = Integrate with prov-ecosystem, provctl, and external services. help-more-info = Use 'provisioning help ' for detailed information. - diff --git a/locales/es-ES/forms.ftl b/locales/es-ES/forms.ftl index 5435571..8a3400c 100644 --- a/locales/es-ES/forms.ftl +++ b/locales/es-ES/forms.ftl @@ -122,4 +122,3 @@ form-confirm_name-prompt = Escribe el nombre del recurso para confirmar form-confirm_name-help = Escribe el nombre exacto del recurso form-keep_data-prompt = ¿Conservar datos asociados? form-keep_data-help = Preservar configuración relacionada y copias de seguridad - diff --git a/locales/es-ES/help.ftl b/locales/es-ES/help.ftl index 34ba368..d8b2d7b 100644 --- a/locales/es-ES/help.ftl +++ b/locales/es-ES/help.ftl @@ -164,4 +164,3 @@ help-guides-intro = Guías de inicio rápido, tutoriales y hojas de referencia. help-integrations-intro = Integrar con prov-ecosystem, provctl y servicios externos. help-more-info = Use 'provisioning help ' para información detallada. - diff --git a/resources/images/how-to-use.md b/resources/images/how-to-use.md index 7c77db1..3e04285 100644 --- a/resources/images/how-to-use.md +++ b/resources/images/how-to-use.md @@ -1,5 +1 @@ -provisioning_logo-dark.svg logo image for dark mode -provisioning_logo-light.svg logo image for normal mode -provisioning_logo-text-dark.svg logo text for dark mode -provisioning_logo-text-light.svg logo text for normal mode -provisioning_logo-image.svg main image +provisioning_logo-dark.svg logo image for dark mode\nprovisioning_logo-light.svg logo image for normal mode\nprovisioning_logo-text-dark.svg logo text for dark mode\nprovisioning_logo-text-light.svg logo text for normal mode\nprovisioning_logo-image.svg main image diff --git a/schemas/infrastructure/README.md b/schemas/infrastructure/README.md index 70ab012..7452892 100644 --- a/schemas/infrastructure/README.md +++ b/schemas/infrastructure/README.md @@ -1 +1 @@ -# Infrastructure Schemas\n\nThis directory contains Nickel type-safe schemas for infrastructure configuration generation.\n\n## Overview\n\nThese schemas provide type contracts and validation for multi-format infrastructure configuration generation:\n\n- **Docker Compose** (`docker-compose.ncl`) - Container orchestration via Docker Compose\n- **Kubernetes** (`kubernetes.ncl`) - Kubernetes manifest generation (Deployments, Services, ConfigMaps)\n- **Nginx** (`nginx.ncl`) - Reverse proxy and load balancer configuration\n- **Prometheus** (`prometheus.ncl`) - Metrics collection and monitoring\n- **Systemd** (`systemd.ncl`) - System service units for standalone deployments\n- **OCI Registry** (`oci-registry.ncl`) - Container registry backend configuration (Zot, Distribution, Harbor)\n\n## Key Features\n\n### 1. Mode-Based Presets\n\nEach schema includes presets for different deployment modes:\n\n- **solo**: Single-node deployments (minimal resources)\n- **multiuser**: Staging/small production (2 replicas, HA)\n- **enterprise**: Large-scale production (3+ replicas, distributed storage)\n- **cicd**: CI/CD pipeline deployments\n\n### 2. Type Safety\n\n```\n# All fields are strongly typed with validation\nResourceLimits = {\n cpus | String, # Type: string\n memory | String,\n},\n\n# Enum validation\nServiceType = [| 'ClusterIP, 'NodePort, 'LoadBalancer |],\n\n# Numeric range validation\nPort = Number | {\n predicate = fun n => n > 0 && n < 65536,\n}\n```\n\n### 3. Export Formats\n\nSchemas export to multiple formats:\n\n```\n# Export as YAML (K8s, Docker Compose)\nnickel export --format yaml provisioning/schemas/infrastructure/kubernetes.ncl\n\n# Export as JSON (OCI Registry, Prometheus configs)\nnickel export --format json provisioning/schemas/infrastructure/oci-registry.ncl\n\n# Export as TOML (systemd, Nginx)\nnickel export --format toml provisioning/schemas/infrastructure/systemd.ncl\n```\n\n## Single Source of Truth Pattern\n\nDefine service configuration once, generate multiple infrastructure outputs:\n\n```\norchestrator.ncl (Platform Service Schema)\n ↓\nInfrastructure Schemas (Docker, Kubernetes, Nginx, etc.)\n ↓\n[Multiple Outputs]\n├─→ docker-compose.yaml\n├─→ kubernetes/deployment.yaml\n├─→ nginx.conf\n├─→ prometheus.yml\n└─→ systemd/orchestrator.service\n```\n\n### Example: Service Port Definition\n\n```\n# Platform service schema (provisioning/schemas/platform/schemas/orchestrator.ncl)\nserver = {\n port | Number, # Define port once\n}\n\n# Used in Docker Compose\ndocker-compose = {\n services.orchestrator = {\n ports = ["%{orchestrator.server.port}:8080"],\n }\n}\n\n# Used in Kubernetes\nkubernetes = {\n containers.ports = [{\n containerPort = orchestrator.server.port,\n }]\n}\n\n# Used in Nginx\nnginx = {\n upstreams.orchestrator.servers = [{\n address = "orchestrator:%{orchestrator.server.port}",\n }]\n}\n```\n\n**Benefit**: Change port in one place, all infrastructure configs update automatically.\n\n## Validation Before Deployment\n\n```\n# Type check schema\nnickel typecheck provisioning/schemas/infrastructure/docker-compose.ncl\n\n# Validate export\nnickel export --format json provisioning/schemas/infrastructure/kubernetes.ncl \\n | jq . # Validate JSON structure\n\n# Check generated YAML\nnickel export --format yaml provisioning/schemas/infrastructure/kubernetes.ncl \\n | kubectl apply --dry-run=client -f -\n```\n\n## File Structure\n\n```\ninfrastructure/\n├── README.md # This file\n├── docker-compose.ncl # Docker Compose schema (232 lines)\n├── kubernetes.ncl # Kubernetes manifests (376 lines)\n├── nginx.ncl # Nginx configuration (233 lines)\n├── prometheus.ncl # Prometheus configuration (280 lines)\n├── systemd.ncl # Systemd service units (235 lines)\n└── oci-registry.ncl # OCI Registry configuration (221 lines)\n```\n\n**Total**: 1,577 lines of type-safe infrastructure schemas\n\n## Usage Patterns\n\n### 1. Generate Solo Mode Infrastructure\n\n```\n# Export docker-compose for solo deployment\nnickel export --format yaml provisioning/schemas/infrastructure/docker-compose.ncl \\n | tee provisioning/platform/infrastructure/docker/docker-compose.solo.yaml\n\n# Validate with Docker\ndocker-compose -f docker-compose.solo.yaml config --quiet\n```\n\n### 2. Generate Enterprise HA Kubernetes\n\n```\n# Export Kubernetes manifests\nnickel export --format yaml provisioning/schemas/infrastructure/kubernetes.ncl \\n > provisioning/platform/infrastructure/kubernetes/deployment.yaml\n\n# Validate and apply\nkubectl apply --dry-run=client -f deployment.yaml\nkubectl apply -f deployment.yaml\n```\n\n### 3. Generate Monitoring Stack\n\n```\n# Prometheus configuration\nnickel export --format yaml provisioning/schemas/infrastructure/prometheus.ncl \\n > provisioning/platform/infrastructure/prometheus/prometheus.yml\n\n# Validate Prometheus config\npromtool check config provisioning/platform/infrastructure/prometheus/prometheus.yml\n```\n\n### 4. Auto-Generate Infrastructure from Service Schemas\n\n```\n# Composition function: generate Docker Compose from service port\nlet service = import "../platform/schemas/orchestrator.ncl" in\n{\n services.orchestrator = {\n image = "provisioning/orchestrator:latest",\n ports = ["%{service.server.port}:8080"],\n deploy.resources.limits = service.deploy.resources.limits,\n }\n}\n```\n\n## Documentation\n\n### Inline Schema Documentation\n\nEach schema field includes inline documentation (via `| doc`):\n\n```\nfield | Type | doc "description" | default = value\n```\n\n**Important**: With Nickel, `| doc` must come BEFORE `| default`:\n\n```\n✅ CORRECT: cpus | String | doc "CPU limit" | default = "2.0"\n❌ INCORRECT: cpus | String | default = "2.0" | doc "CPU limit"\n```\n\nFor details, see `.claude/guidelines/nickel.md`\n\n## Validation Rules\n\n### Docker Compose\n\n- ✅ Valid service names, port ranges\n- ✅ Resource limits: CPU and memory strings\n- ✅ Health check configuration\n- ✅ Environment variables typed as strings\n\n### Kubernetes\n\n- ✅ Valid API versions (apps/v1, v1)\n- ✅ Container resource requests/limits\n- ✅ Valid restart policies (Always, OnFailure, Never)\n- ✅ Port ranges (1-65535)\n\n### Nginx\n\n- ✅ Upstream server addresses\n- ✅ Rate limiting zones and rules\n- ✅ TLS configuration validation\n- ✅ Security headers structure\n\n### Prometheus\n\n- ✅ Scrape job configuration\n- ✅ Alert manager targets\n- ✅ Scrape intervals (duration format)\n- ✅ Relabel configuration\n\n### Systemd\n\n- ✅ Unit dependencies (after, requires, wants)\n- ✅ Resource limits (CPU quota, memory)\n- ✅ Restart policies\n- ✅ Service types (simple, forking, oneshot, etc.)\n\n### OCI Registry\n\n- ✅ Registry backends (Zot, Distribution, Harbor)\n- ✅ Storage backend selection (filesystem, S3, Azure)\n- ✅ Authentication methods (none, basic, bearer, OIDC)\n- ✅ Access control policies\n\n## Deployment Examples\n\nTwo comprehensive infrastructure examples are provided demonstrating solo and enterprise configurations:\n\n### Solo Deployment Example\n\n**File**: `examples-solo-deployment.ncl`\n\nMinimal single-node setup for development/testing:\n\n```\n# Exports 4 infrastructure components\ndocker_compose_services # 5 services: orchestrator, control-center, coredns, kms, oci_registry\nnginx_config # Simple upstream routing to localhost services\nprometheus_config # 4 scrape jobs for basic monitoring\noci_registry_config # Zot backend with filesystem storage\n```\n\n**Resource Allocation**:\n- Orchestrator: 1.0 CPU, 1024M RAM\n- Control Center: 0.5 CPU, 512M RAM\n- Other services: 0.25-0.5 CPU, 256-512M RAM\n\n**Export to JSON**:\n\n```\nnickel export --format json provisioning/schemas/infrastructure/examples-solo-deployment.ncl\n# Output: 198 lines of configuration\n```\n\n### Enterprise Deployment Example\n\n**File**: `examples-enterprise-deployment.ncl`\n\nHigh-availability production-grade deployment:\n\n```\n# Exports 4 infrastructure components (HA versions)\ndocker_compose_services # 6 services with 3 replicas for HA\nnginx_config # Multiple upstreams with rate limiting and failover\nprometheus_config # 7 scrape jobs with remote storage\noci_registry_config # Harbor backend with S3 replication\n```\n\n**Resource Allocation**:\n- Orchestrator: 4.0 CPU, 4096M RAM (3 replicas)\n- Control Center: 2.0 CPU, 2048M RAM (HA)\n- Services scale appropriately for production load\n\n**Export to JSON**:\n\n```\nnickel export --format json provisioning/schemas/infrastructure/examples-enterprise-deployment.ncl\n# Output: 313 lines of configuration\n```\n\n### Example Comparison\n\n| Aspect | Solo | Enterprise |\n| -------- | ------ | ----------- |\n| **Services** | 5 | 6 |\n| **Orchestrator CPU** | 1.0 | 4.0 |\n| **Orchestrator Memory** | 1024M | 4096M |\n| **Prometheus Jobs** | 4 | 7 |\n| **Registry Backend** | Zot | Harbor |\n| **Use Case** | Dev/Testing | Production |\n| **JSON Size** | 198 lines | 313 lines |\n\n### Validation Results\n\nBoth examples have been tested and validated:\n\n✅ **Solo Deployment** (`examples-solo-deployment.ncl`):\n- Type-checks without errors\n- Exports to valid JSON (198 lines)\n- All resource limits validated\n- Port range validation: 8080, 9090, 5432, 53\n- JSON structure: docker_compose_services, nginx_config, prometheus_config, oci_registry_config\n\n✅ **Enterprise Deployment** (`examples-enterprise-deployment.ncl`):\n- Type-checks without errors\n- Exports to valid JSON (313 lines)\n- HA configuration with 3 replicas\n- Enhanced monitoring: 7 vs 4 scrape jobs\n- Distributed storage backend (Harbor vs Zot)\n- Full JSON structure validated with jq\n\n## Automation Scripts\n\nGenerate all infrastructure configs in one command:\n\n```\n# Generate all formats for all modes\nprovisioning/platform/scripts/generate-infrastructure-configs.nu\n\n# Generate specific mode/format\nprovisioning/platform/scripts/generate-infrastructure-configs.nu --mode solo --format yaml\n\n# Specify output directory\nprovisioning/platform/scripts/generate-infrastructure-configs.nu --output-dir /tmp/infra\n```\n\nSee `provisioning/platform/scripts/generate-infrastructure-configs.nu` for implementation details.\n\n## Validation and Testing\n\n### Test Generated Configs\n\n```\n# Export solo deployment\nnickel export --format json provisioning/schemas/infrastructure/examples-solo-deployment.ncl \\n > solo-infra.json\n\n# Validate JSON structure\njq . solo-infra.json\n\n# Inspect specific component (Docker Compose services)\njq '.docker_compose_services | keys' solo-infra.json\n\n# Check resource allocation\njq '.docker_compose_services.orchestrator.deploy.resources.limits' solo-infra.json\n```\n\n### Validate with Docker/Kubectl\n\n```\n# Export and validate Docker Compose\nnickel export --format yaml examples-solo-deployment.ncl \\n | docker-compose config --quiet\n\n# Validate Kubernetes (if applicable)\nnickel export --format yaml examples-enterprise-deployment.ncl \\n | kubectl apply --dry-run=client -f -\n\n# Validate Prometheus config\nnickel export --format yaml prometheus.ncl \\n | promtool check config -\n```\n\n## Integration with ConfigLoader\n\nInfrastructure schemas are independent from platform config schemas:\n\n- **Platform configs** → Service-specific settings (port, timeouts, auth)\n- **Infrastructure schemas** → Deployment-specific settings (replicas, resources, networking)\n\nConfigLoader automatically loads platform configs. Infrastructure configs are generated separately and deployed via infrastructure tools:\n\n```\nPlatform Schema (Nickel)\n ↓ nickel export → TOML\n ↓ ConfigLoader → Service reads config\n\nInfrastructure Schema (Nickel)\n ↓ nickel export → YAML/JSON\n ↓ Docker/Kubernetes/Nginx CLI\n```\n\n## Next Steps\n\n1. **Use these schemas** in your infrastructure-as-code pipeline\n2. **Generate configs** with the automation script\n3. **Validate** before deployment using format-specific tools\n4. **Maintain single source of truth** by updating schemas, not generated files\n\n---\n\n**Version**: 1.1.0 (Infrastructure Examples & Validation Added)\n**Total Schemas**: 6 core files, 1,577 lines\n**Deployment Examples**: 2 files, 54 lines (solo + enterprise)\n**Validated**: All schemas and examples pass type-checking and export validation\n**Last Updated**: 2025-01-06\n**Nickel Version**: Latest \ No newline at end of file +# Infrastructure Schemas\n\nThis directory contains Nickel type-safe schemas for infrastructure configuration generation.\n\n## Overview\n\nThese schemas provide type contracts and validation for multi-format infrastructure configuration generation:\n\n- **Docker Compose** (`docker-compose.ncl`) - Container orchestration via Docker Compose\n- **Kubernetes** (`kubernetes.ncl`) - Kubernetes manifest generation (Deployments, Services, ConfigMaps)\n- **Nginx** (`nginx.ncl`) - Reverse proxy and load balancer configuration\n- **Prometheus** (`prometheus.ncl`) - Metrics collection and monitoring\n- **Systemd** (`systemd.ncl`) - System service units for standalone deployments\n- **OCI Registry** (`oci-registry.ncl`) - Container registry backend configuration (Zot, Distribution, Harbor)\n\n## Key Features\n\n### 1. Mode-Based Presets\n\nEach schema includes presets for different deployment modes:\n\n- **solo**: Single-node deployments (minimal resources)\n- **multiuser**: Staging/small production (2 replicas, HA)\n- **enterprise**: Large-scale production (3+ replicas, distributed storage)\n- **cicd**: CI/CD pipeline deployments\n\n### 2. Type Safety\n\n```\n# All fields are strongly typed with validation\nResourceLimits = {\n cpus | String, # Type: string\n memory | String,\n},\n\n# Enum validation\nServiceType = [| 'ClusterIP, 'NodePort, 'LoadBalancer |],\n\n# Numeric range validation\nPort = Number | {\n predicate = fun n => n > 0 && n < 65536,\n}\n```\n\n### 3. Export Formats\n\nSchemas export to multiple formats:\n\n```\n# Export as YAML (K8s, Docker Compose)\nnickel export --format yaml provisioning/schemas/infrastructure/kubernetes.ncl\n\n# Export as JSON (OCI Registry, Prometheus configs)\nnickel export --format json provisioning/schemas/infrastructure/oci-registry.ncl\n\n# Export as TOML (systemd, Nginx)\nnickel export --format toml provisioning/schemas/infrastructure/systemd.ncl\n```\n\n## Single Source of Truth Pattern\n\nDefine service configuration once, generate multiple infrastructure outputs:\n\n```\norchestrator.ncl (Platform Service Schema)\n ↓\nInfrastructure Schemas (Docker, Kubernetes, Nginx, etc.)\n ↓\n[Multiple Outputs]\n├─→ docker-compose.yaml\n├─→ kubernetes/deployment.yaml\n├─→ nginx.conf\n├─→ prometheus.yml\n└─→ systemd/orchestrator.service\n```\n\n### Example: Service Port Definition\n\n```\n# Platform service schema (provisioning/schemas/platform/schemas/orchestrator.ncl)\nserver = {\n port | Number, # Define port once\n}\n\n# Used in Docker Compose\ndocker-compose = {\n services.orchestrator = {\n ports = ["%{orchestrator.server.port}:8080"],\n }\n}\n\n# Used in Kubernetes\nkubernetes = {\n containers.ports = [{\n containerPort = orchestrator.server.port,\n }]\n}\n\n# Used in Nginx\nnginx = {\n upstreams.orchestrator.servers = [{\n address = "orchestrator:%{orchestrator.server.port}",\n }]\n}\n```\n\n**Benefit**: Change port in one place, all infrastructure configs update automatically.\n\n## Validation Before Deployment\n\n```\n# Type check schema\nnickel typecheck provisioning/schemas/infrastructure/docker-compose.ncl\n\n# Validate export\nnickel export --format json provisioning/schemas/infrastructure/kubernetes.ncl \\n | jq . # Validate JSON structure\n\n# Check generated YAML\nnickel export --format yaml provisioning/schemas/infrastructure/kubernetes.ncl \\n | kubectl apply --dry-run=client -f -\n```\n\n## File Structure\n\n```\ninfrastructure/\n├── README.md # This file\n├── docker-compose.ncl # Docker Compose schema (232 lines)\n├── kubernetes.ncl # Kubernetes manifests (376 lines)\n├── nginx.ncl # Nginx configuration (233 lines)\n├── prometheus.ncl # Prometheus configuration (280 lines)\n├── systemd.ncl # Systemd service units (235 lines)\n└── oci-registry.ncl # OCI Registry configuration (221 lines)\n```\n\n**Total**: 1,577 lines of type-safe infrastructure schemas\n\n## Usage Patterns\n\n### 1. Generate Solo Mode Infrastructure\n\n```\n# Export docker-compose for solo deployment\nnickel export --format yaml provisioning/schemas/infrastructure/docker-compose.ncl \\n | tee provisioning/platform/infrastructure/docker/docker-compose.solo.yaml\n\n# Validate with Docker\ndocker-compose -f docker-compose.solo.yaml config --quiet\n```\n\n### 2. Generate Enterprise HA Kubernetes\n\n```\n# Export Kubernetes manifests\nnickel export --format yaml provisioning/schemas/infrastructure/kubernetes.ncl \\n > provisioning/platform/infrastructure/kubernetes/deployment.yaml\n\n# Validate and apply\nkubectl apply --dry-run=client -f deployment.yaml\nkubectl apply -f deployment.yaml\n```\n\n### 3. Generate Monitoring Stack\n\n```\n# Prometheus configuration\nnickel export --format yaml provisioning/schemas/infrastructure/prometheus.ncl \\n > provisioning/platform/infrastructure/prometheus/prometheus.yml\n\n# Validate Prometheus config\npromtool check config provisioning/platform/infrastructure/prometheus/prometheus.yml\n```\n\n### 4. Auto-Generate Infrastructure from Service Schemas\n\n```\n# Composition function: generate Docker Compose from service port\nlet service = import "../platform/schemas/orchestrator.ncl" in\n{\n services.orchestrator = {\n image = "provisioning/orchestrator:latest",\n ports = ["%{service.server.port}:8080"],\n deploy.resources.limits = service.deploy.resources.limits,\n }\n}\n```\n\n## Documentation\n\n### Inline Schema Documentation\n\nEach schema field includes inline documentation (via `| doc`):\n\n```\nfield | Type | doc "description" | default = value\n```\n\n**Important**: With Nickel, `| doc` must come BEFORE `| default`:\n\n```\n✅ CORRECT: cpus | String | doc "CPU limit" | default = "2.0"\n❌ INCORRECT: cpus | String | default = "2.0" | doc "CPU limit"\n```\n\nFor details, see `.claude/guidelines/nickel.md`\n\n## Validation Rules\n\n### Docker Compose\n\n- ✅ Valid service names, port ranges\n- ✅ Resource limits: CPU and memory strings\n- ✅ Health check configuration\n- ✅ Environment variables typed as strings\n\n### Kubernetes\n\n- ✅ Valid API versions (apps/v1, v1)\n- ✅ Container resource requests/limits\n- ✅ Valid restart policies (Always, OnFailure, Never)\n- ✅ Port ranges (1-65535)\n\n### Nginx\n\n- ✅ Upstream server addresses\n- ✅ Rate limiting zones and rules\n- ✅ TLS configuration validation\n- ✅ Security headers structure\n\n### Prometheus\n\n- ✅ Scrape job configuration\n- ✅ Alert manager targets\n- ✅ Scrape intervals (duration format)\n- ✅ Relabel configuration\n\n### Systemd\n\n- ✅ Unit dependencies (after, requires, wants)\n- ✅ Resource limits (CPU quota, memory)\n- ✅ Restart policies\n- ✅ Service types (simple, forking, oneshot, etc.)\n\n### OCI Registry\n\n- ✅ Registry backends (Zot, Distribution, Harbor)\n- ✅ Storage backend selection (filesystem, S3, Azure)\n- ✅ Authentication methods (none, basic, bearer, OIDC)\n- ✅ Access control policies\n\n## Deployment Examples\n\nTwo comprehensive infrastructure examples are provided demonstrating solo and enterprise configurations:\n\n### Solo Deployment Example\n\n**File**: `examples-solo-deployment.ncl`\n\nMinimal single-node setup for development/testing:\n\n```\n# Exports 4 infrastructure components\ndocker_compose_services # 5 services: orchestrator, control-center, coredns, kms, oci_registry\nnginx_config # Simple upstream routing to localhost services\nprometheus_config # 4 scrape jobs for basic monitoring\noci_registry_config # Zot backend with filesystem storage\n```\n\n**Resource Allocation**:\n- Orchestrator: 1.0 CPU, 1024M RAM\n- Control Center: 0.5 CPU, 512M RAM\n- Other services: 0.25-0.5 CPU, 256-512M RAM\n\n**Export to JSON**:\n\n```\nnickel export --format json provisioning/schemas/infrastructure/examples-solo-deployment.ncl\n# Output: 198 lines of configuration\n```\n\n### Enterprise Deployment Example\n\n**File**: `examples-enterprise-deployment.ncl`\n\nHigh-availability production-grade deployment:\n\n```\n# Exports 4 infrastructure components (HA versions)\ndocker_compose_services # 6 services with 3 replicas for HA\nnginx_config # Multiple upstreams with rate limiting and failover\nprometheus_config # 7 scrape jobs with remote storage\noci_registry_config # Harbor backend with S3 replication\n```\n\n**Resource Allocation**:\n- Orchestrator: 4.0 CPU, 4096M RAM (3 replicas)\n- Control Center: 2.0 CPU, 2048M RAM (HA)\n- Services scale appropriately for production load\n\n**Export to JSON**:\n\n```\nnickel export --format json provisioning/schemas/infrastructure/examples-enterprise-deployment.ncl\n# Output: 313 lines of configuration\n```\n\n### Example Comparison\n\n| Aspect | Solo | Enterprise |\n| -------- | ------ | ----------- |\n| **Services** | 5 | 6 |\n| **Orchestrator CPU** | 1.0 | 4.0 |\n| **Orchestrator Memory** | 1024M | 4096M |\n| **Prometheus Jobs** | 4 | 7 |\n| **Registry Backend** | Zot | Harbor |\n| **Use Case** | Dev/Testing | Production |\n| **JSON Size** | 198 lines | 313 lines |\n\n### Validation Results\n\nBoth examples have been tested and validated:\n\n✅ **Solo Deployment** (`examples-solo-deployment.ncl`):\n- Type-checks without errors\n- Exports to valid JSON (198 lines)\n- All resource limits validated\n- Port range validation: 8080, 9090, 5432, 53\n- JSON structure: docker_compose_services, nginx_config, prometheus_config, oci_registry_config\n\n✅ **Enterprise Deployment** (`examples-enterprise-deployment.ncl`):\n- Type-checks without errors\n- Exports to valid JSON (313 lines)\n- HA configuration with 3 replicas\n- Enhanced monitoring: 7 vs 4 scrape jobs\n- Distributed storage backend (Harbor vs Zot)\n- Full JSON structure validated with jq\n\n## Automation Scripts\n\nGenerate all infrastructure configs in one command:\n\n```\n# Generate all formats for all modes\nprovisioning/platform/scripts/generate-infrastructure-configs.nu\n\n# Generate specific mode/format\nprovisioning/platform/scripts/generate-infrastructure-configs.nu --mode solo --format yaml\n\n# Specify output directory\nprovisioning/platform/scripts/generate-infrastructure-configs.nu --output-dir /tmp/infra\n```\n\nSee `provisioning/platform/scripts/generate-infrastructure-configs.nu` for implementation details.\n\n## Validation and Testing\n\n### Test Generated Configs\n\n```\n# Export solo deployment\nnickel export --format json provisioning/schemas/infrastructure/examples-solo-deployment.ncl \\n > solo-infra.json\n\n# Validate JSON structure\njq . solo-infra.json\n\n# Inspect specific component (Docker Compose services)\njq '.docker_compose_services | keys' solo-infra.json\n\n# Check resource allocation\njq '.docker_compose_services.orchestrator.deploy.resources.limits' solo-infra.json\n```\n\n### Validate with Docker/Kubectl\n\n```\n# Export and validate Docker Compose\nnickel export --format yaml examples-solo-deployment.ncl \\n | docker-compose config --quiet\n\n# Validate Kubernetes (if applicable)\nnickel export --format yaml examples-enterprise-deployment.ncl \\n | kubectl apply --dry-run=client -f -\n\n# Validate Prometheus config\nnickel export --format yaml prometheus.ncl \\n | promtool check config -\n```\n\n## Integration with ConfigLoader\n\nInfrastructure schemas are independent from platform config schemas:\n\n- **Platform configs** → Service-specific settings (port, timeouts, auth)\n- **Infrastructure schemas** → Deployment-specific settings (replicas, resources, networking)\n\nConfigLoader automatically loads platform configs. Infrastructure configs are generated separately and deployed via infrastructure tools:\n\n```\nPlatform Schema (Nickel)\n ↓ nickel export → TOML\n ↓ ConfigLoader → Service reads config\n\nInfrastructure Schema (Nickel)\n ↓ nickel export → YAML/JSON\n ↓ Docker/Kubernetes/Nginx CLI\n```\n\n## Next Steps\n\n1. **Use these schemas** in your infrastructure-as-code pipeline\n2. **Generate configs** with the automation script\n3. **Validate** before deployment using format-specific tools\n4. **Maintain single source of truth** by updating schemas, not generated files\n\n---\n\n**Version**: 1.1.0 (Infrastructure Examples & Validation Added)\n**Total Schemas**: 6 core files, 1,577 lines\n**Deployment Examples**: 2 files, 54 lines (solo + enterprise)\n**Validated**: All schemas and examples pass type-checking and export validation\n**Last Updated**: 2025-01-06\n**Nickel Version**: Latest diff --git a/schemas/platform/README.md b/schemas/platform/README.md index 317e509..b7f2534 100644 --- a/schemas/platform/README.md +++ b/schemas/platform/README.md @@ -1 +1 @@ -# TypeDialog + Nickel Configuration System for Platform Services\n\nComplete configuration system for provisioning platform services (orchestrator, control-center, mcp-server, vault-service,\nextension-registry, rag, ai-service, provisioning-daemon) across multiple deployment modes (solo, multiuser, cicd, enterprise).\n\n## Architecture Overview\n\nThis system implements a **TypeDialog + Nickel configuration workflow** that provides:\n\n- **Type-safe configuration** via Nickel schemas with validation\n- **Interactive configuration** via TypeDialog forms with real-time constraint validation\n- **Multi-mode deployment** (solo/multiuser/cicd/enterprise) with mode-specific defaults\n- **Configuration composition** (base defaults + mode overlays + user customization + validation)\n- **Automated TOML export** for Rust service consumption\n- **Docker Compose + Kubernetes templates** for infrastructure deployment\n\n## Directory Structure\n\n```\nprovisioning/.typedialog/provisioning/platform/\n├── constraints/ # Single source of truth for validation limits\n├── schemas/ # Nickel type contracts (services + common + deployment modes)\n├── defaults/ # Default configuration values (services + common + deployment modes)\n├── validators/ # Validation logic (constraints, ranges, business rules)\n├── configs/ # Generated mode-specific Nickel configurations (4 services × 4 modes = 16 configs)\n├── forms/ # TypeDialog form definitions (4 main forms + flat fragments)\n│ └── fragments/ # Reusable form fragments (workspace, server, database, etc.)\n├── templates/ # Jinja2 + Nickel templates for config/deployment generation\n│ ├── docker-compose/ # Docker Compose templates (solo/multiuser/cicd/enterprise)\n│ ├── kubernetes/ # Kubernetes deployment templates\n│ └── configs/ # Service configuration templates (TOML generation)\n├── scripts/ # Nushell orchestration scripts (configure, generate, validate, deploy)\n├── examples/ # Example configurations for different deployment scenarios\n└── values/ # User configuration files (gitignored *.ncl)\n```\n\n## Configuration Workflow\n\n### 1. User Interaction (TypeDialog)\n\n```\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n- Launches interactive form (web/tui/cli)\n- Loads existing config as default values (if exists)\n- Validates user input against constraints\n- Generates updated Nickel config\n\n### 2. Configuration Composition\n\n```\nBase Defaults (defaults/*.ncl)\n ↓\n+ Mode Overlay (defaults/deployment/{mode}-defaults.ncl)\n ↓\n+ User Customization (values/{service}.{mode}.ncl)\n ↓\n+ Schema Validation (schemas/*.ncl)\n ↓\n+ Constraint Validation (validators/*.ncl)\n ↓\n= Final Configuration (configs/{service}.{mode}.ncl)\n```\n\n### 3. TOML Export\n\n```\nnu scripts/generate-configs.nu orchestrator solo\n```\n\nExports Nickel config to TOML:\n- `provisioning/platform/config/orchestrator.solo.toml` (consumed by Rust services)\n\n## Deployment Modes\n\n### Solo (2 CPU, 4GB RAM)\n- Single developer/testing\n- Filesystem or embedded database\n- Minimal security\n- All services enabled\n\n### MultiUser (4 CPU, 8GB RAM)\n- Team collaboration, staging\n- PostgreSQL or SurrealDB server\n- RBAC enabled\n- Gitea integration\n\n### CI/CD (8 CPU, 16GB RAM)\n- Automated pipelines, ephemeral\n- API-driven configuration\n- Fast cleanup, minimal storage\n\n### Enterprise (16+ CPU, 32+ GB RAM)\n- Production high availability\n- SurrealDB cluster with replication\n- MFA required, KMS integration\n- Compliance (SOC2/HIPAA)\n\n## Key Components\n\n### Constraints (constraints/constraints.toml)\nSingle source of truth for validation limits across all services. Used for:\n- Form field validation (min/max values)\n- Constraint interpolation in TypeDialog forms\n- Nickel validator bounds checking\n\n### Schemas (schemas/*.ncl)\nType-safe configuration contracts defining:\n- Required/optional fields\n- Valid value types and enums\n- Default values\n- Input/output type signatures\n\n**Organization**:\n- `schemas/common/` - HTTP server, database, security, monitoring, logging\n- `schemas/{orchestrator,control-center,mcp-server,vault-service,extension-registry,rag,ai-service,provisioning-daemon}.ncl` - Service-specific schemas\n- `schemas/deployment/{solo,multiuser,cicd,enterprise}.ncl` - Mode-specific schemas\n\n### Defaults (defaults/*.ncl)\nConfiguration base values composed with mode overlays:\n- `defaults/{service}-defaults.ncl` - Service base defaults\n- `defaults/common/` - Shared defaults (server, database, security)\n- `defaults/deployment/{mode}-defaults.ncl` - Mode-specific value overrides\n\n### Validators (validators/*.ncl)\nBusiness logic validation using constraints:\n- Port range validation (1024-65535)\n- Resource allocation validation (CPU, memory)\n- Workflow/policy validation (service-specific)\n- Cross-field validation\n\n### Configurations (configs/*.ncl)\nGenerated mode-specific Nickel configs (NOT manually edited):\n- `orchestrator.{solo,multiuser,cicd,enterprise}.ncl`\n- `control-center.{solo,multiuser,cicd,enterprise}.ncl`\n- `mcp-server.{solo,multiuser,cicd,enterprise}.ncl`\n- `vault-service.{solo,multiuser,cicd,enterprise}.ncl`\n- `extension-registry.{solo,multiuser,cicd,enterprise}.ncl`\n- `rag.{solo,multiuser,cicd,enterprise}.ncl`\n- `ai-service.{solo,multiuser,cicd,enterprise}.ncl`\n- `provisioning-daemon.{solo,multiuser,cicd,enterprise}.ncl`\n\n### Forms (forms/*.toml)\nTypeDialog form definitions with **flat fragments** referenced by paths:\n- 4 main forms: `{service}-form.toml`\n- Fragments: `fragments/{name}-section.toml` (workspace, server, database, security, monitoring, etc.)\n- CRITICAL: Every form element has `nickel_path` for Nickel structure mapping\n\n**Fragment Organization** (FLAT, referenced by paths):\n- `workspace-section.toml`\n- `server-section.toml`\n- `database-rocksdb-section.toml`\n- `database-surrealdb-section.toml`\n- `database-postgres-section.toml`\n- `security-section.toml`\n- `monitoring-section.toml`\n- `logging-section.toml`\n- `orchestrator-queue-section.toml`\n- `orchestrator-workflow-section.toml`\n- ... (service-specific and mode-specific fragments)\n\n### Templates (templates/)\nJinja2 + Nickel templates for automated generation:\n- `{service}-config.ncl.j2` - Nickel output template (critical for TypeDialog nickel-roundtrip)\n- `docker-compose/platform-stack.{mode}.yml.ncl` - Docker Compose templates\n- `kubernetes/{service}-deployment.yaml.ncl` - Kubernetes templates\n\n### Scripts (scripts/)\nNushell orchestration (NuShell 0.109+):\n- `configure.nu` - Interactive TypeDialog wizard (nickel-roundtrip workflow)\n- `generate-configs.nu` - Export Nickel → TOML\n- `validate-config.nu` - Typecheck Nickel configs\n- `render-docker-compose.nu` - Generate Docker Compose files\n- `render-kubernetes.nu` - Generate Kubernetes manifests\n- `install-services.nu` - Deploy platform services\n- `detect-services.nu` - Auto-detect running services\n\n### Examples (examples/)\nReference configurations for different scenarios:\n- `orchestrator-solo.ncl` - Simple development setup\n- `orchestrator-enterprise.ncl` - Complex production setup\n- `full-platform-enterprise.ncl` - Complete enterprise stack\n\n### Values (values/)\nUser configuration directory (gitignored):\n- `{service}.{mode}.ncl` - User customizations (loaded in compose)\n- `.gitignore` - Ignores `*.ncl` files\n- `orchestrator.example.ncl` - Documented example template\n\n## TypeDialog nickel-roundtrip Workflow\n\nCRITICAL: Forms use Jinja2 templates for Nickel generation:\n\n```\n# Command pattern\ntypedialog-web nickel-roundtrip "$CONFIG_FILE" "$FORM_FILE" --output "$CONFIG_FILE" --template "$NCL_TEMPLATE"\n\n# Example\ntypedialog-web nickel-roundtrip \\n "provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl" \\n "provisioning/.typedialog/provisioning/platform/forms/orchestrator-form.toml" \\n --output "provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl" \\n --template "provisioning/.typedialog/provisioning/platform/templates/orchestrator-config.ncl.j2"\n```\n\n**Key Requirements**:\n1. **Jinja2 template** (`config.ncl.j2`) - Defines Nickel output structure with conditional `{% if %}` blocks\n2. **nickel_path** in form elements - Maps form fields to Nickel structure paths (e.g., `["orchestrator", "queue", "max_concurrent_tasks"]`)\n3. **Constraint interpolation** - Form limits reference constraints (e.g., `${constraint.orchestrator.queue.concurrent_tasks.max}`)\n4. **Base + overlay composition** - Nickel imports merge defaults + mode overlays + validators\n\n## Usage Workflow\n\n### 1. Configure Service (Interactive)\n\n```\n# Start TypeDialog wizard for orchestrator in solo mode\nnu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo --backend web\n```\n\nWizard:\n1. Loads existing config (if exists) as defaults\n2. Shows form with validated constraints\n3. User edits configuration\n4. Generates updated Nickel config to `provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl`\n\n### 2. Validate Configuration\n\n```\n# Typecheck Nickel config\nnu provisioning/.typedialog/provisioning/platform/scripts/validate-config.nu provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl\n```\n\n### 3. Generate TOML for Rust Services\n\n```\n# Export Nickel → TOML\nnu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo\n```\n\nOutput: `provisioning/platform/config/orchestrator.solo.toml`\n\n### 4. Deploy Services\n\n```\n# Install services (Docker Compose or Kubernetes)\nnu provisioning/.typedialog/provisioning/platform/scripts/install-services.nu solo\n```\n\n## Configuration Loading Hierarchy (Rust Services)\n\n```\n1. Environment variables (ORCHESTRATOR_*)\n2. User config (values/{service}.{mode}.ncl → TOML)\n3. Mode-specific defaults (configs/{service}.{mode}.toml)\n4. Service defaults (config/orchestrator.defaults.toml)\n```\n\n## Constraint Interpolation Example\n\n**constraints.toml**:\n\n```\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 100\n```\n\n**Form element** (fragments/orchestrator-queue-section.toml):\n\n```\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\n**Jinja2 template** (orchestrator-config.ncl.j2):\n\n```\norchestrator = {\n queue = {\n {%- if max_concurrent_tasks %}\n max_concurrent_tasks = {{ max_concurrent_tasks }},\n {%- endif %}\n },\n}\n```\n\n## Getting Started\n\n1. **Run configuration wizard**:\n\n ```bash\n nu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo\n ```\n\n2. **Generate TOML configs**:\n\n ```bash\n nu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo\n ```\n\n3. **Deploy services**:\n\n ```bash\n nu provisioning/.typedialog/provisioning/platform/scripts/install-services.nu solo\n ```\n\n## Documentation\n\n- `constraints/README.md` - How to modify validation constraints\n- `schemas/README.md` - Schema patterns and imports\n- `defaults/README.md` - Defaults composition and merging strategy\n- `validators/README.md` - Validator patterns and error handling\n- `forms/README.md` - Form structure and fragment organization\n- `forms/fragments/README.md` - Fragment usage and nickel_path mapping\n- `scripts/README.md` - Script usage and dependencies\n- `examples/README.md` - Example deployment scenarios\n- `templates/README.md` - Template patterns and interpolation\n\n## Key Files\n\n| File | Purpose |\n| ------ | --------- |\n| `constraints/constraints.toml` | Single source of truth for validation limits |\n| `schemas/orchestrator.ncl` | Orchestrator type schema |\n| `defaults/orchestrator-defaults.ncl` | Orchestrator default values |\n| `validators/orchestrator-validator.ncl` | Orchestrator validation logic |\n| `configs/orchestrator.solo.ncl` | Generated solo mode config |\n| `forms/orchestrator-form.toml` | Orchestrator form definition |\n| `templates/orchestrator-config.ncl.j2` | Nickel output template |\n| `scripts/configure.nu` | Interactive configuration wizard |\n| `scripts/generate-configs.nu` | Nickel → TOML export |\n| `values/orchestrator.solo.ncl` | User configuration (gitignored) |\n\n## Tools Required\n\n- **Nickel** (0.10+) - Configuration language\n- **TypeDialog** - Interactive form backend\n- **NuShell** (0.109+) - Script orchestration\n- **Jinja2/tera** - Template rendering (via nu_plugin_tera)\n- **TOML** - Config file format (for Rust services)\n\n## Notes\n\n- Configuration files in `values/` are **gitignored** (user-specific)\n- Generated configs in `configs/` are composed automatically (not hand-edited)\n- Each mode (solo/multiuser/cicd/enterprise) has different resource defaults\n- Fragments are **flat** in `forms/fragments/` and referenced by paths in form definitions\n- All form elements must have `nickel_path` for proper Nickel structure mapping\n- Constraint interpolation enables dynamic form validation based on service requirements\n\n---\n\n**Version**: 1.0.0\n**Created**: 2025-01-05\n**Last Updated**: 2025-01-05 \ No newline at end of file +# TypeDialog + Nickel Configuration System for Platform Services\n\nComplete configuration system for provisioning platform services (orchestrator, control-center, mcp-server, vault-service,\nextension-registry, rag, ai-service, provisioning-daemon) across multiple deployment modes (solo, multiuser, cicd, enterprise).\n\n## Architecture Overview\n\nThis system implements a **TypeDialog + Nickel configuration workflow** that provides:\n\n- **Type-safe configuration** via Nickel schemas with validation\n- **Interactive configuration** via TypeDialog forms with real-time constraint validation\n- **Multi-mode deployment** (solo/multiuser/cicd/enterprise) with mode-specific defaults\n- **Configuration composition** (base defaults + mode overlays + user customization + validation)\n- **Automated TOML export** for Rust service consumption\n- **Docker Compose + Kubernetes templates** for infrastructure deployment\n\n## Directory Structure\n\n```\nprovisioning/.typedialog/provisioning/platform/\n├── constraints/ # Single source of truth for validation limits\n├── schemas/ # Nickel type contracts (services + common + deployment modes)\n├── defaults/ # Default configuration values (services + common + deployment modes)\n├── validators/ # Validation logic (constraints, ranges, business rules)\n├── configs/ # Generated mode-specific Nickel configurations (4 services × 4 modes = 16 configs)\n├── forms/ # TypeDialog form definitions (4 main forms + flat fragments)\n│ └── fragments/ # Reusable form fragments (workspace, server, database, etc.)\n├── templates/ # Jinja2 + Nickel templates for config/deployment generation\n│ ├── docker-compose/ # Docker Compose templates (solo/multiuser/cicd/enterprise)\n│ ├── kubernetes/ # Kubernetes deployment templates\n│ └── configs/ # Service configuration templates (TOML generation)\n├── scripts/ # Nushell orchestration scripts (configure, generate, validate, deploy)\n├── examples/ # Example configurations for different deployment scenarios\n└── values/ # User configuration files (gitignored *.ncl)\n```\n\n## Configuration Workflow\n\n### 1. User Interaction (TypeDialog)\n\n```\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n- Launches interactive form (web/tui/cli)\n- Loads existing config as default values (if exists)\n- Validates user input against constraints\n- Generates updated Nickel config\n\n### 2. Configuration Composition\n\n```\nBase Defaults (defaults/*.ncl)\n ↓\n+ Mode Overlay (defaults/deployment/{mode}-defaults.ncl)\n ↓\n+ User Customization (values/{service}.{mode}.ncl)\n ↓\n+ Schema Validation (schemas/*.ncl)\n ↓\n+ Constraint Validation (validators/*.ncl)\n ↓\n= Final Configuration (configs/{service}.{mode}.ncl)\n```\n\n### 3. TOML Export\n\n```\nnu scripts/generate-configs.nu orchestrator solo\n```\n\nExports Nickel config to TOML:\n- `provisioning/platform/config/orchestrator.solo.toml` (consumed by Rust services)\n\n## Deployment Modes\n\n### Solo (2 CPU, 4GB RAM)\n- Single developer/testing\n- Filesystem or embedded database\n- Minimal security\n- All services enabled\n\n### MultiUser (4 CPU, 8GB RAM)\n- Team collaboration, staging\n- PostgreSQL or SurrealDB server\n- RBAC enabled\n- Gitea integration\n\n### CI/CD (8 CPU, 16GB RAM)\n- Automated pipelines, ephemeral\n- API-driven configuration\n- Fast cleanup, minimal storage\n\n### Enterprise (16+ CPU, 32+ GB RAM)\n- Production high availability\n- SurrealDB cluster with replication\n- MFA required, KMS integration\n- Compliance (SOC2/HIPAA)\n\n## Key Components\n\n### Constraints (constraints/constraints.toml)\nSingle source of truth for validation limits across all services. Used for:\n- Form field validation (min/max values)\n- Constraint interpolation in TypeDialog forms\n- Nickel validator bounds checking\n\n### Schemas (schemas/*.ncl)\nType-safe configuration contracts defining:\n- Required/optional fields\n- Valid value types and enums\n- Default values\n- Input/output type signatures\n\n**Organization**:\n- `schemas/common/` - HTTP server, database, security, monitoring, logging\n- `schemas/{orchestrator,control-center,mcp-server,vault-service,extension-registry,rag,ai-service,provisioning-daemon}.ncl` - Service-specific schemas\n- `schemas/deployment/{solo,multiuser,cicd,enterprise}.ncl` - Mode-specific schemas\n\n### Defaults (defaults/*.ncl)\nConfiguration base values composed with mode overlays:\n- `defaults/{service}-defaults.ncl` - Service base defaults\n- `defaults/common/` - Shared defaults (server, database, security)\n- `defaults/deployment/{mode}-defaults.ncl` - Mode-specific value overrides\n\n### Validators (validators/*.ncl)\nBusiness logic validation using constraints:\n- Port range validation (1024-65535)\n- Resource allocation validation (CPU, memory)\n- Workflow/policy validation (service-specific)\n- Cross-field validation\n\n### Configurations (configs/*.ncl)\nGenerated mode-specific Nickel configs (NOT manually edited):\n- `orchestrator.{solo,multiuser,cicd,enterprise}.ncl`\n- `control-center.{solo,multiuser,cicd,enterprise}.ncl`\n- `mcp-server.{solo,multiuser,cicd,enterprise}.ncl`\n- `vault-service.{solo,multiuser,cicd,enterprise}.ncl`\n- `extension-registry.{solo,multiuser,cicd,enterprise}.ncl`\n- `rag.{solo,multiuser,cicd,enterprise}.ncl`\n- `ai-service.{solo,multiuser,cicd,enterprise}.ncl`\n- `provisioning-daemon.{solo,multiuser,cicd,enterprise}.ncl`\n\n### Forms (forms/*.toml)\nTypeDialog form definitions with **flat fragments** referenced by paths:\n- 4 main forms: `{service}-form.toml`\n- Fragments: `fragments/{name}-section.toml` (workspace, server, database, security, monitoring, etc.)\n- CRITICAL: Every form element has `nickel_path` for Nickel structure mapping\n\n**Fragment Organization** (FLAT, referenced by paths):\n- `workspace-section.toml`\n- `server-section.toml`\n- `database-rocksdb-section.toml`\n- `database-surrealdb-section.toml`\n- `database-postgres-section.toml`\n- `security-section.toml`\n- `monitoring-section.toml`\n- `logging-section.toml`\n- `orchestrator-queue-section.toml`\n- `orchestrator-workflow-section.toml`\n- ... (service-specific and mode-specific fragments)\n\n### Templates (templates/)\nJinja2 + Nickel templates for automated generation:\n- `{service}-config.ncl.j2` - Nickel output template (critical for TypeDialog nickel-roundtrip)\n- `docker-compose/platform-stack.{mode}.yml.ncl` - Docker Compose templates\n- `kubernetes/{service}-deployment.yaml.ncl` - Kubernetes templates\n\n### Scripts (scripts/)\nNushell orchestration (NuShell 0.109+):\n- `configure.nu` - Interactive TypeDialog wizard (nickel-roundtrip workflow)\n- `generate-configs.nu` - Export Nickel → TOML\n- `validate-config.nu` - Typecheck Nickel configs\n- `render-docker-compose.nu` - Generate Docker Compose files\n- `render-kubernetes.nu` - Generate Kubernetes manifests\n- `install-services.nu` - Deploy platform services\n- `detect-services.nu` - Auto-detect running services\n\n### Examples (examples/)\nReference configurations for different scenarios:\n- `orchestrator-solo.ncl` - Simple development setup\n- `orchestrator-enterprise.ncl` - Complex production setup\n- `full-platform-enterprise.ncl` - Complete enterprise stack\n\n### Values (values/)\nUser configuration directory (gitignored):\n- `{service}.{mode}.ncl` - User customizations (loaded in compose)\n- `.gitignore` - Ignores `*.ncl` files\n- `orchestrator.example.ncl` - Documented example template\n\n## TypeDialog nickel-roundtrip Workflow\n\nCRITICAL: Forms use Jinja2 templates for Nickel generation:\n\n```\n# Command pattern\ntypedialog-web nickel-roundtrip "$CONFIG_FILE" "$FORM_FILE" --output "$CONFIG_FILE" --template "$NCL_TEMPLATE"\n\n# Example\ntypedialog-web nickel-roundtrip \\n "provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl" \\n "provisioning/.typedialog/provisioning/platform/forms/orchestrator-form.toml" \\n --output "provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl" \\n --template "provisioning/.typedialog/provisioning/platform/templates/orchestrator-config.ncl.j2"\n```\n\n**Key Requirements**:\n1. **Jinja2 template** (`config.ncl.j2`) - Defines Nickel output structure with conditional `{% if %}` blocks\n2. **nickel_path** in form elements - Maps form fields to Nickel structure paths (e.g., `["orchestrator", "queue", "max_concurrent_tasks"]`)\n3. **Constraint interpolation** - Form limits reference constraints (e.g., `${constraint.orchestrator.queue.concurrent_tasks.max}`)\n4. **Base + overlay composition** - Nickel imports merge defaults + mode overlays + validators\n\n## Usage Workflow\n\n### 1. Configure Service (Interactive)\n\n```\n# Start TypeDialog wizard for orchestrator in solo mode\nnu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo --backend web\n```\n\nWizard:\n1. Loads existing config (if exists) as defaults\n2. Shows form with validated constraints\n3. User edits configuration\n4. Generates updated Nickel config to `provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl`\n\n### 2. Validate Configuration\n\n```\n# Typecheck Nickel config\nnu provisioning/.typedialog/provisioning/platform/scripts/validate-config.nu provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl\n```\n\n### 3. Generate TOML for Rust Services\n\n```\n# Export Nickel → TOML\nnu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo\n```\n\nOutput: `provisioning/platform/config/orchestrator.solo.toml`\n\n### 4. Deploy Services\n\n```\n# Install services (Docker Compose or Kubernetes)\nnu provisioning/.typedialog/provisioning/platform/scripts/install-services.nu solo\n```\n\n## Configuration Loading Hierarchy (Rust Services)\n\n```\n1. Environment variables (ORCHESTRATOR_*)\n2. User config (values/{service}.{mode}.ncl → TOML)\n3. Mode-specific defaults (configs/{service}.{mode}.toml)\n4. Service defaults (config/orchestrator.defaults.toml)\n```\n\n## Constraint Interpolation Example\n\n**constraints.toml**:\n\n```\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 100\n```\n\n**Form element** (fragments/orchestrator-queue-section.toml):\n\n```\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\n**Jinja2 template** (orchestrator-config.ncl.j2):\n\n```\norchestrator = {\n queue = {\n {%- if max_concurrent_tasks %}\n max_concurrent_tasks = {{ max_concurrent_tasks }},\n {%- endif %}\n },\n}\n```\n\n## Getting Started\n\n1. **Run configuration wizard**:\n\n ```bash\n nu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo\n ```\n\n2. **Generate TOML configs**:\n\n ```bash\n nu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo\n ```\n\n3. **Deploy services**:\n\n ```bash\n nu provisioning/.typedialog/provisioning/platform/scripts/install-services.nu solo\n ```\n\n## Documentation\n\n- `constraints/README.md` - How to modify validation constraints\n- `schemas/README.md` - Schema patterns and imports\n- `defaults/README.md` - Defaults composition and merging strategy\n- `validators/README.md` - Validator patterns and error handling\n- `forms/README.md` - Form structure and fragment organization\n- `forms/fragments/README.md` - Fragment usage and nickel_path mapping\n- `scripts/README.md` - Script usage and dependencies\n- `examples/README.md` - Example deployment scenarios\n- `templates/README.md` - Template patterns and interpolation\n\n## Key Files\n\n| File | Purpose |\n| ------ | --------- |\n| `constraints/constraints.toml` | Single source of truth for validation limits |\n| `schemas/orchestrator.ncl` | Orchestrator type schema |\n| `defaults/orchestrator-defaults.ncl` | Orchestrator default values |\n| `validators/orchestrator-validator.ncl` | Orchestrator validation logic |\n| `configs/orchestrator.solo.ncl` | Generated solo mode config |\n| `forms/orchestrator-form.toml` | Orchestrator form definition |\n| `templates/orchestrator-config.ncl.j2` | Nickel output template |\n| `scripts/configure.nu` | Interactive configuration wizard |\n| `scripts/generate-configs.nu` | Nickel → TOML export |\n| `values/orchestrator.solo.ncl` | User configuration (gitignored) |\n\n## Tools Required\n\n- **Nickel** (0.10+) - Configuration language\n- **TypeDialog** - Interactive form backend\n- **NuShell** (0.109+) - Script orchestration\n- **Jinja2/tera** - Template rendering (via nu_plugin_tera)\n- **TOML** - Config file format (for Rust services)\n\n## Notes\n\n- Configuration files in `values/` are **gitignored** (user-specific)\n- Generated configs in `configs/` are composed automatically (not hand-edited)\n- Each mode (solo/multiuser/cicd/enterprise) has different resource defaults\n- Fragments are **flat** in `forms/fragments/` and referenced by paths in form definitions\n- All form elements must have `nickel_path` for proper Nickel structure mapping\n- Constraint interpolation enables dynamic form validation based on service requirements\n\n---\n\n**Version**: 1.0.0\n**Created**: 2025-01-05\n**Last Updated**: 2025-01-05 diff --git a/schemas/platform/configs/README.md b/schemas/platform/configs/README.md index 6a7c372..5c3c0dc 100644 --- a/schemas/platform/configs/README.md +++ b/schemas/platform/configs/README.md @@ -1 +1 @@ -# Configurations\n\nMode-specific Nickel configurations for all services (NOT manually edited).\n\n## Purpose\n\nConfigurations are **automatically generated** by composing:\n1. Service base defaults (defaults/{service}-defaults.ncl)\n2. Mode overlay (defaults/deployment/{mode}-defaults.ncl)\n3. User customization (values/{service}.{mode}.ncl)\n4. Schema validation (schemas/{service}.ncl)\n5. Constraint validation (validators/{service}-validator.ncl)\n\n## File Organization\n\n```\nconfigs/\n├── README.md # This file\n├── orchestrator.solo.ncl # Orchestrator solo mode\n├── orchestrator.multiuser.ncl # Orchestrator multi-user mode\n├── orchestrator.cicd.ncl # Orchestrator CI/CD mode\n├── orchestrator.enterprise.ncl # Orchestrator enterprise mode\n├── control-center.solo.ncl\n├── control-center.multiuser.ncl\n├── control-center.cicd.ncl\n├── control-center.enterprise.ncl\n├── mcp-server.solo.ncl\n├── mcp-server.multiuser.ncl\n├── mcp-server.cicd.ncl\n├── mcp-server.enterprise.ncl\n├── installer.solo.ncl\n├── installer.multiuser.ncl\n├── installer.cicd.ncl\n└── installer.enterprise.ncl\n```\n\n## Configuration Composition\n\nEach config is built from layers:\n\n```\n# configs/orchestrator.solo.ncl\nlet schemas = import "../schemas/orchestrator.ncl" in\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\nlet solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in\nlet validators = import "../validators/orchestrator-validator.ncl" in\n\n{\n # Merge: base defaults + mode overrides + user customization\n orchestrator = defaults.orchestrator & solo_defaults.services.orchestrator & {\n # User customization goes here (from values/orchestrator.solo.ncl)\n },\n} | schemas.OrchestratorConfig # Apply schema validation\n```\n\n## Example Configuration\n\n### Base Defaults\n\n```\n# defaults/orchestrator-defaults.ncl\norchestrator = {\n workspace = {\n name = "default",\n path = "/var/lib/provisioning/orchestrator",\n enabled = true,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 4,\n },\n queue = {\n max_concurrent_tasks = 5,\n },\n}\n```\n\n### Solo Mode Override\n\n```\n# defaults/deployment/solo-defaults.ncl\nservices.orchestrator = {\n workers = 2, # Fewer workers\n queue_max_concurrent_tasks = 3, # Limited concurrency\n storage_backend = 'filesystem,\n}\n```\n\n### Generated Config\n\n```\n# configs/orchestrator.solo.ncl (auto-generated)\n{\n orchestrator = {\n workspace = {\n name = "default", # From base defaults\n path = "/var/lib/provisioning/orchestrator",\n enabled = true,\n },\n server = {\n host = "127.0.0.1", # From base defaults\n port = 9090, # From base defaults\n workers = 2, # OVERRIDDEN by solo mode\n },\n queue = {\n max_concurrent_tasks = 3, # OVERRIDDEN by solo mode\n },\n },\n}\n```\n\n## Updating Configurations\n\n**DO NOT manually edit** configs/ files. Instead:\n\n1. **Modify service defaults** (defaults/{service}-defaults.ncl)\n2. **Modify mode overrides** (defaults/deployment/{mode}-defaults.ncl)\n3. **Modify user values** (values/{service}.{mode}.ncl)\n4. **Regenerate configs** (via TypeDialog or manual rebuild)\n\n### Regenerating Configs\n\n#### Via TypeDialog (Recommended)\n\n```\nnu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo\n```\n\nAutomatically:\n1. Loads existing config as defaults\n2. Shows form with validated constraints\n3. User edits configuration\n4. Generates updated config\n\n#### Manual Rebuild\n\n```\n# (Future) Script to rebuild all configs from sources\nnu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo\n```\n\n## Config Types\n\n### Orchestrator (Workflow Engine)\n- Workspace configuration\n- Server settings\n- Storage backend (filesystem, RocksDB, SurrealDB)\n- Queue configuration (concurrency, retries, timeout)\n- Batch workflow settings\n- Optional: monitoring, rollback, extensions\n\n### Control Center (Policy/RBAC)\n- Workspace configuration\n- Server settings\n- Database configuration\n- Security (JWT, RBAC, encryption)\n- Optional: compliance, audit logging\n\n### MCP Server (Protocol Server)\n- Workspace configuration\n- Server settings\n- MCP capabilities (tools, prompts, resources)\n- Optional: custom tools, resource limits\n\n### Installer (Setup Automation)\n- Target configuration\n- Provider settings\n- Pre-flight checks\n- Installation options\n\n## Configuration Values Hierarchy\n\n```\n1. Explicit user customization (values/{service}.{mode}.ncl)\n2. Mode-specific defaults (defaults/deployment/{mode}-defaults.ncl)\n3. Service base defaults (defaults/{service}-defaults.ncl)\n4. Common shared defaults (defaults/common/*.ncl)\n```\n\n## Validation Levels\n\nConfigurations are validated at three levels:\n\n### 1. Schema Validation\nType checking when config is evaluated:\n\n```\n| schemas.OrchestratorConfig\n```\n\n### 2. Constraint Validation\nRange checking via validators:\n\n```\nmax_concurrent_tasks = validators.ValidConcurrentTasks 5\n```\n\n### 3. Business Logic Validation\nService-specific rules in validators.\n\n## Usage in Rust Services\n\nConfigs are exported to TOML for Rust services:\n\n```\n# Generate TOML\nnu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo\n\n# Output: provisioning/platform/config/orchestrator.solo.toml\n```\n\nRust services load the TOML:\n\n```\nlet config_path = "provisioning/platform/config/orchestrator.solo.toml";\nlet config = Config::from_file(config_path)?;\n```\n\n## Deployment Mode Specifics\n\n### Solo Mode Config\n- Minimal resources (2 CPU, 4GB)\n- Filesystem storage (no DB infrastructure)\n- Single worker, low concurrency\n- Simplified security (no MFA)\n\n### MultiUser Mode Config\n- Team resources (4 CPU, 8GB)\n- PostgreSQL or SurrealDB\n- Moderate concurrency (4-8 workers)\n- RBAC enabled\n\n### CI/CD Mode Config\n- Ephemeral (cleanup after run)\n- API-driven (no UI/forms)\n- High concurrency (8+ workers)\n- Minimal security overhead\n\n### Enterprise Mode Config\n- Production HA (16+ CPU, 32+ GB)\n- SurrealDB cluster with replication\n- High concurrency (16+ workers)\n- Full security (MFA, KMS, compliance)\n\n## Testing Configurations\n\n```\n# Typecheck a config\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Evaluate and view\nnickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl | head -50\n\n# Export to TOML\nnickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Export to JSON\nnickel export --format json provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n```\n\n## Configuration Merge Example\n\n```\n# Base\n{\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 4,\n },\n}\n\n# + Mode override\n& {\n server.workers = 2,\n}\n\n# = Result\n{\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2, # OVERRIDDEN\n },\n}\n```\n\nNickel's `&` operator is a **shallow merge** - only top-level fields are replaced, deeper nesting is preserved.\n\n## Generated Config Structure\n\nAll generated configs follow this structure:\n\n```\n# Service config\n{\n {service} = {\n # Workspace\n workspace = { ... },\n\n # Server\n server = { ... },\n\n # Storage/Database\n [storage | database] = { ... },\n\n # Service-specific\n [queue | rbac | capabilities] = { ... },\n\n # Optional\n [monitoring | security | compliance] = { ... },\n },\n}\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 \ No newline at end of file +# Configurations\n\nMode-specific Nickel configurations for all services (NOT manually edited).\n\n## Purpose\n\nConfigurations are **automatically generated** by composing:\n1. Service base defaults (defaults/{service}-defaults.ncl)\n2. Mode overlay (defaults/deployment/{mode}-defaults.ncl)\n3. User customization (values/{service}.{mode}.ncl)\n4. Schema validation (schemas/{service}.ncl)\n5. Constraint validation (validators/{service}-validator.ncl)\n\n## File Organization\n\n```\nconfigs/\n├── README.md # This file\n├── orchestrator.solo.ncl # Orchestrator solo mode\n├── orchestrator.multiuser.ncl # Orchestrator multi-user mode\n├── orchestrator.cicd.ncl # Orchestrator CI/CD mode\n├── orchestrator.enterprise.ncl # Orchestrator enterprise mode\n├── control-center.solo.ncl\n├── control-center.multiuser.ncl\n├── control-center.cicd.ncl\n├── control-center.enterprise.ncl\n├── mcp-server.solo.ncl\n├── mcp-server.multiuser.ncl\n├── mcp-server.cicd.ncl\n├── mcp-server.enterprise.ncl\n├── installer.solo.ncl\n├── installer.multiuser.ncl\n├── installer.cicd.ncl\n└── installer.enterprise.ncl\n```\n\n## Configuration Composition\n\nEach config is built from layers:\n\n```\n# configs/orchestrator.solo.ncl\nlet schemas = import "../schemas/orchestrator.ncl" in\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\nlet solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in\nlet validators = import "../validators/orchestrator-validator.ncl" in\n\n{\n # Merge: base defaults + mode overrides + user customization\n orchestrator = defaults.orchestrator & solo_defaults.services.orchestrator & {\n # User customization goes here (from values/orchestrator.solo.ncl)\n },\n} | schemas.OrchestratorConfig # Apply schema validation\n```\n\n## Example Configuration\n\n### Base Defaults\n\n```\n# defaults/orchestrator-defaults.ncl\norchestrator = {\n workspace = {\n name = "default",\n path = "/var/lib/provisioning/orchestrator",\n enabled = true,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 4,\n },\n queue = {\n max_concurrent_tasks = 5,\n },\n}\n```\n\n### Solo Mode Override\n\n```\n# defaults/deployment/solo-defaults.ncl\nservices.orchestrator = {\n workers = 2, # Fewer workers\n queue_max_concurrent_tasks = 3, # Limited concurrency\n storage_backend = 'filesystem,\n}\n```\n\n### Generated Config\n\n```\n# configs/orchestrator.solo.ncl (auto-generated)\n{\n orchestrator = {\n workspace = {\n name = "default", # From base defaults\n path = "/var/lib/provisioning/orchestrator",\n enabled = true,\n },\n server = {\n host = "127.0.0.1", # From base defaults\n port = 9090, # From base defaults\n workers = 2, # OVERRIDDEN by solo mode\n },\n queue = {\n max_concurrent_tasks = 3, # OVERRIDDEN by solo mode\n },\n },\n}\n```\n\n## Updating Configurations\n\n**DO NOT manually edit** configs/ files. Instead:\n\n1. **Modify service defaults** (defaults/{service}-defaults.ncl)\n2. **Modify mode overrides** (defaults/deployment/{mode}-defaults.ncl)\n3. **Modify user values** (values/{service}.{mode}.ncl)\n4. **Regenerate configs** (via TypeDialog or manual rebuild)\n\n### Regenerating Configs\n\n#### Via TypeDialog (Recommended)\n\n```\nnu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo\n```\n\nAutomatically:\n1. Loads existing config as defaults\n2. Shows form with validated constraints\n3. User edits configuration\n4. Generates updated config\n\n#### Manual Rebuild\n\n```\n# (Future) Script to rebuild all configs from sources\nnu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo\n```\n\n## Config Types\n\n### Orchestrator (Workflow Engine)\n- Workspace configuration\n- Server settings\n- Storage backend (filesystem, RocksDB, SurrealDB)\n- Queue configuration (concurrency, retries, timeout)\n- Batch workflow settings\n- Optional: monitoring, rollback, extensions\n\n### Control Center (Policy/RBAC)\n- Workspace configuration\n- Server settings\n- Database configuration\n- Security (JWT, RBAC, encryption)\n- Optional: compliance, audit logging\n\n### MCP Server (Protocol Server)\n- Workspace configuration\n- Server settings\n- MCP capabilities (tools, prompts, resources)\n- Optional: custom tools, resource limits\n\n### Installer (Setup Automation)\n- Target configuration\n- Provider settings\n- Pre-flight checks\n- Installation options\n\n## Configuration Values Hierarchy\n\n```\n1. Explicit user customization (values/{service}.{mode}.ncl)\n2. Mode-specific defaults (defaults/deployment/{mode}-defaults.ncl)\n3. Service base defaults (defaults/{service}-defaults.ncl)\n4. Common shared defaults (defaults/common/*.ncl)\n```\n\n## Validation Levels\n\nConfigurations are validated at three levels:\n\n### 1. Schema Validation\nType checking when config is evaluated:\n\n```\n| schemas.OrchestratorConfig\n```\n\n### 2. Constraint Validation\nRange checking via validators:\n\n```\nmax_concurrent_tasks = validators.ValidConcurrentTasks 5\n```\n\n### 3. Business Logic Validation\nService-specific rules in validators.\n\n## Usage in Rust Services\n\nConfigs are exported to TOML for Rust services:\n\n```\n# Generate TOML\nnu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo\n\n# Output: provisioning/platform/config/orchestrator.solo.toml\n```\n\nRust services load the TOML:\n\n```\nlet config_path = "provisioning/platform/config/orchestrator.solo.toml";\nlet config = Config::from_file(config_path)?;\n```\n\n## Deployment Mode Specifics\n\n### Solo Mode Config\n- Minimal resources (2 CPU, 4GB)\n- Filesystem storage (no DB infrastructure)\n- Single worker, low concurrency\n- Simplified security (no MFA)\n\n### MultiUser Mode Config\n- Team resources (4 CPU, 8GB)\n- PostgreSQL or SurrealDB\n- Moderate concurrency (4-8 workers)\n- RBAC enabled\n\n### CI/CD Mode Config\n- Ephemeral (cleanup after run)\n- API-driven (no UI/forms)\n- High concurrency (8+ workers)\n- Minimal security overhead\n\n### Enterprise Mode Config\n- Production HA (16+ CPU, 32+ GB)\n- SurrealDB cluster with replication\n- High concurrency (16+ workers)\n- Full security (MFA, KMS, compliance)\n\n## Testing Configurations\n\n```\n# Typecheck a config\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Evaluate and view\nnickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl | head -50\n\n# Export to TOML\nnickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Export to JSON\nnickel export --format json provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n```\n\n## Configuration Merge Example\n\n```\n# Base\n{\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 4,\n },\n}\n\n# + Mode override\n& {\n server.workers = 2,\n}\n\n# = Result\n{\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2, # OVERRIDDEN\n },\n}\n```\n\nNickel's `&` operator is a **shallow merge** - only top-level fields are replaced, deeper nesting is preserved.\n\n## Generated Config Structure\n\nAll generated configs follow this structure:\n\n```\n# Service config\n{\n {service} = {\n # Workspace\n workspace = { ... },\n\n # Server\n server = { ... },\n\n # Storage/Database\n [storage | database] = { ... },\n\n # Service-specific\n [queue | rbac | capabilities] = { ... },\n\n # Optional\n [monitoring | security | compliance] = { ... },\n },\n}\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 diff --git a/schemas/platform/configuration-workflow.md b/schemas/platform/configuration-workflow.md index 3783842..06b9fa7 100644 --- a/schemas/platform/configuration-workflow.md +++ b/schemas/platform/configuration-workflow.md @@ -1 +1 @@ -# Configuration Workflow: TypeDialog → Nickel → TOML → Rust\n\nComplete documentation of the configuration pipeline that transforms interactive user input into production Rust service configurations.\n\n## Overview\n\nThe provisioning platform uses a **four-stage configuration workflow** that leverages TypeDialog for interactive configuration,\nNickel for type-safe composition, and TOML for service consumption:\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│ Stage 1: User Interaction (TypeDialog) │\n│ - Can use Nickel configuration as default values │\n│ if use provisioning/platform/config/ it will be updated │\n│ - Interactive form (web/tui/cli) │\n│ - Real-time constraint validation │\n│ - Generates Nickel configuration │\n└────────────────┬────────────────────────────────────────────────┘\n │\n ▼\n┌─────────────────────────────────────────────────────────────────┐\n│ Stage 2: Composition (Nickel) │\n│ - Base defaults imported │\n│ - Mode overlay applied │\n│ - Validators enforce business rules │\n│ - Produces Nickel config file │\n└────────────────┬────────────────────────────────────────────────┘\n │\n ▼\n┌─────────────────────────────────────────────────────────────────┐\n│ Stage 3: Export (Nickel → TOML) │\n│ - Nickel config evaluated │\n│ - Exported to TOML format │\n│ - Saved to provisioning/platform/config/ │\n└────────────────┬────────────────────────────────────────────────┘\n │\n ▼\n┌─────────────────────────────────────────────────────────────────┐\n│ Stage 4: Runtime (Rust Services) │\n│ - Services load TOML configuration │\n│ - Environment variables override specific values │\n│ - Start services with final configuration │\n└─────────────────────────────────────────────────────────────────┘\n```\n\n---\n\n## Stage 1: User Interaction (TypeDialog)\n\n### Purpose\n\nCollect configuration from users through an interactive, constraint-aware interface.\n\n### Workflow\n\n```\n# Launch interactive configuration wizard\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n### What Happens\n\n1. **Form Loads**\n - TypeDialog reads `forms/orchestrator-form.toml`\n - Form displays configuration sections\n - Constraints from `constraints.toml` enforce min/max values\n - Environment variables populate initial defaults\n\n2. **User Interaction**\n - User fills in form fields (workspace name, server port, etc.)\n - Real-time validation on each field\n - Constraint interpolation shows valid ranges:\n - `${constraint.orchestrator.workers.min}` → `1`\n - `${constraint.orchestrator.workers.max}` → `32`\n\n3. **Configuration Submission**\n - User submits form\n - TypeDialog validates all fields against schemas\n - Generates Nickel configuration output\n\n4. **Output Generation**\n - Nickel config saved to `values/{service}.{mode}.ncl`\n - Example: `values/orchestrator.solo.ncl`\n - File becomes source of truth for user customizations\n\n### Form Structure Example\n\n```\n# forms/orchestrator-form.toml\nname = "orchestrator_configuration"\ndescription = "Configure orchestrator service"\n\n[[items]]\nname = "workspace_group"\ntype = "group"\nincludes = ["fragments/workspace-section.toml"]\n\n[[items]]\nname = "server_group"\ntype = "group"\nincludes = ["fragments/server-section.toml"]\n\n[[items]]\nname = "queue_group"\ntype = "group"\nincludes = ["fragments/orchestrator/queue-section.toml"]\n```\n\n### Fragment with Constraint Interpolation\n\n```\n# forms/fragments/orchestrator/queue-section.toml\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nprompt = "Maximum Concurrent Tasks"\ndefault = 5\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\nrequired = true\nhelp = "Range: ${constraint.orchestrator.queue.concurrent_tasks.min}-${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\n### Generated Nickel Output (from TypeDialog)\n\nTypeDialog's `nickel-roundtrip` pattern generates:\n\n```\n# values/orchestrator.solo.ncl\n# Auto-generated by TypeDialog\n{\n orchestrator = {\n workspace = {\n name = "dev-workspace",\n path = "/home/developer/provisioning/data/orchestrator",\n enabled = true,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2,\n },\n queue = {\n max_concurrent_tasks = 3,\n retry_attempts = 2,\n retry_delay = 1000,\n },\n },\n}\n```\n\n---\n\n## Stage 2: Composition (Nickel)\n\n### Purpose\n\nCompose the user input with defaults, validators, and schemas to create a complete, validated configuration.\n\n### Workflow\n\n```\n# The nickel typecheck command validates the composition\nnickel typecheck values/orchestrator.solo.ncl\n```\n\n### Composition Layers\n\nThe final configuration is built by merging layers in priority order:\n\n#### Layer 1: Schema Import\n\n```\n# Ensures type safety and required fields\nlet schemas = import "../schemas/orchestrator.ncl" in\n```\n\n#### Layer 2: Base Defaults\n\n```\n# Default values for all orchestrator configurations\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\n```\n\n#### Layer 3: Mode Overlay\n\n```\n# Solo-specific overrides and adjustments\nlet solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in\n```\n\n#### Layer 4: Validators Import\n\n```\n# Business rule validation (ranges, uniqueness, dependencies)\nlet validators = import "../validators/orchestrator-validator.ncl" in\n```\n\n#### Layer 5: User Values\n\n```\n# User input from TypeDialog (values/orchestrator.solo.ncl)\n# Loaded and merged with defaults\n```\n\n### Composition Example\n\n```\n# configs/orchestrator.solo.ncl (generated composition)\n\nlet schemas = import "../schemas/orchestrator.ncl" in\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\nlet solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in\nlet validators = import "../validators/orchestrator-validator.ncl" in\n\n# Composition: Base defaults + mode overlay + user input\n{\n orchestrator = defaults.orchestrator & {\n # User input from TypeDialog values/orchestrator.solo.ncl\n workspace = {\n name = "dev-workspace",\n path = "/home/developer/provisioning/data/orchestrator",\n },\n\n # Solo mode overrides\n server = {\n workers = validators.ValidWorkers 2,\n max_connections = 128,\n },\n\n queue = {\n max_concurrent_tasks = validators.ValidConcurrentTasks 3,\n },\n\n # Fallback to defaults for unspecified fields\n },\n} | schemas.OrchestratorConfig # Validate against schema\n```\n\n### Validation During Composition\n\nEach field is validated through multiple validation layers:\n\n```\n# validators/orchestrator-validator.ncl\nlet constraints = import "../constraints/constraints.toml" in\n\n{\n # Validate workers within allowed range\n ValidWorkers = fun workers =>\n if workers < constraints.orchestrator.workers.min then\n error "Workers below minimum"\n else if workers > constraints.orchestrator.workers.max then\n error "Workers above maximum"\n else\n workers,\n\n # Validate concurrent tasks\n ValidConcurrentTasks = fun tasks =>\n if tasks < constraints.orchestrator.queue.concurrent_tasks.min then\n error "Tasks below minimum"\n else if tasks > constraints.orchestrator.queue.concurrent_tasks.max then\n error "Tasks above maximum"\n else\n tasks,\n}\n```\n\n### Constraints: Single Source of Truth\n\n```\n# constraints/constraints.toml\n[orchestrator.workers]\nmin = 1\nmax = 32\n\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 100\n\n[common.server.port]\nmin = 1024\nmax = 65535\n```\n\nThese values are referenced in:\n- Form constraints (constraint interpolation)\n- Validators (ValidWorkers, ValidConcurrentTasks)\n- Default values (appropriate for each mode)\n\n---\n\n## Stage 3: Export (Nickel → TOML)\n\n### Purpose\n\nConvert validated Nickel configuration to TOML format for consumption by Rust services.\n\n### Workflow\n\n```\n# Export Nickel to TOML\nnu scripts/generate-configs.nu orchestrator solo\n```\n\n### Command Chain\n\n```\n# What happens internally:\n\n# 1. Typecheck the Nickel config (catch errors early)\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# 2. Export to TOML format\nnickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# 3. Save to output location\n# → provisioning/platform/config/orchestrator.solo.toml\n```\n\n### Input: Nickel Configuration\n\n```\n# From: configs/orchestrator.solo.ncl\n{\n orchestrator = {\n workspace = {\n name = "dev-workspace",\n path = "/home/developer/provisioning/data/orchestrator",\n enabled = true,\n multi_workspace = false,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2,\n keep_alive = 75,\n max_connections = 128,\n },\n storage = {\n backend = "filesystem",\n path = "/home/developer/provisioning/data/orchestrator",\n },\n queue = {\n max_concurrent_tasks = 3,\n retry_attempts = 2,\n retry_delay = 1000,\n task_timeout = 1800000,\n },\n monitoring = {\n enabled = true,\n metrics = {\n enabled = false,\n },\n health_check = {\n enabled = true,\n interval = 60,\n },\n },\n logging = {\n level = "debug",\n format = "text",\n outputs = [\n {\n destination = "stdout",\n level = "debug",\n },\n ],\n },\n },\n}\n```\n\n### Output: TOML Configuration\n\n```\n# To: provisioning/platform/config/orchestrator.solo.toml\n[orchestrator.workspace]\nname = "dev-workspace"\npath = "/home/developer/provisioning/data/orchestrator"\nenabled = true\nmulti_workspace = false\n\n[orchestrator.server]\nhost = "127.0.0.1"\nport = 9090\nworkers = 2\nkeep_alive = 75\nmax_connections = 128\n\n[orchestrator.storage]\nbackend = "filesystem"\npath = "/home/developer/provisioning/data/orchestrator"\n\n[orchestrator.queue]\nmax_concurrent_tasks = 3\nretry_attempts = 2\nretry_delay = 1000\ntask_timeout = 1800000\n\n[orchestrator.monitoring]\nenabled = true\n\n[orchestrator.monitoring.metrics]\nenabled = false\n\n[orchestrator.monitoring.health_check]\nenabled = true\ninterval = 60\n\n[orchestrator.logging]\nlevel = "debug"\nformat = "text"\n\n[[orchestrator.logging.outputs]]\ndestination = "stdout"\nlevel = "debug"\n```\n\n### Output Location\n\n```\nprovisioning/platform/config/\n├── orchestrator.solo.toml # Exported from configs/orchestrator.solo.ncl\n├── orchestrator.multiuser.toml # Exported from configs/orchestrator.multiuser.ncl\n├── orchestrator.cicd.toml # Exported from configs/orchestrator.cicd.ncl\n├── orchestrator.enterprise.toml # Exported from configs/orchestrator.enterprise.ncl\n├── control-center.solo.toml # Similar structure for each service\n├── control-center.multiuser.toml\n├── mcp-server.solo.toml\n└── mcp-server.enterprise.toml\n```\n\n### Validation During Export\n\nThe `generate-configs.nu` script:\n\n1. **Typechecks** - Ensures Nickel is syntactically valid\n2. **Evaluates** - Computes final values\n3. **Exports** - Converts to TOML format\n4. **Saves** - Writes to `provisioning/platform/config/`\n\n---\n\n## Stage 4: Runtime (Rust Services)\n\n### Purpose\n\nLoad TOML configuration and start Rust services with validated settings.\n\n### Configuration Loading Hierarchy\n\nRust services load configuration in this priority order:\n\n#### 1. Runtime Arguments (Highest Priority)\n\n```\nORCHESTRATOR_CONFIG=/path/to/config.toml cargo run --bin orchestrator\n```\n\n#### 2. Environment Variables\n\n```\n# Environment variable overrides specific TOML values\nexport ORCHESTRATOR_SERVER_PORT=9999\nexport ORCHESTRATOR_LOG_LEVEL=debug\n\nORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator\n```\n\nEnvironment variable format: `ORCHESTRATOR_{SECTION}_{KEY}=value`\n\nExample mappings:\n- `ORCHESTRATOR_SERVER_PORT=9999` → `orchestrator.server.port = 9999`\n- `ORCHESTRATOR_LOG_LEVEL=debug` → `orchestrator.logging.level = "debug"`\n- `ORCHESTRATOR_QUEUE_MAX_CONCURRENT_TASKS=10` → `orchestrator.queue.max_concurrent_tasks = 10`\n\n#### 3. TOML Configuration File\n\n```\n# Load from TOML (medium priority)\nORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n#### 4. Compiled Defaults (Lowest Priority)\n\n```\n// In Rust code - fallback for unspecified values\nlet config = Config::from_file(config_path)\n .unwrap_or_else(|_| Config::default());\n```\n\n### Example: Solo Mode Startup\n\n```\n# Step 1: User generates config through TypeDialog\nnu scripts/configure.nu orchestrator solo --backend web\n\n# Step 2: Export to TOML\nnu scripts/generate-configs.nu orchestrator solo\n\n# Step 3: Set environment variables for environment-specific overrides\nexport ORCHESTRATOR_SERVER_PORT=9090\nexport ORCHESTRATOR_LOG_LEVEL=debug\n\n# Step 4: Start the Rust service\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n### Rust Service Configuration Loading\n\n```\n// In orchestrator/src/config.rs\n\nuse config::{Config, ConfigError, Environment, File};\nuse serde::Deserialize;\n\n#[derive(Debug, Deserialize)]\npub struct OrchestratorConfig {\n pub orchestrator: OrchestratorService,\n}\n\n#[derive(Debug, Deserialize)]\npub struct OrchestratorService {\n pub workspace: Workspace,\n pub server: Server,\n pub storage: Storage,\n pub queue: Queue,\n}\n\nimpl OrchestratorConfig {\n pub fn load(config_path: Option<&str>) -> Result {\n let mut builder = Config::builder();\n\n // 1. Load TOML file if provided\n if let Some(path) = config_path {\n builder = builder.add_source(File::from(Path::new(path)));\n } else {\n // Fallback to defaults\n builder = builder.add_source(File::with_name("config/orchestrator.defaults.toml"));\n }\n\n // 2. Apply environment variable overrides\n builder = builder.add_source(\n Environment::with_prefix("ORCHESTRATOR")\n .separator("_")\n );\n\n let config = builder.build()?;\n config.try_deserialize()\n }\n}\n```\n\n### Configuration Validation in Rust\n\n```\nimpl OrchestratorConfig {\n pub fn validate(&self) -> Result<(), ConfigError> {\n // Validate server configuration\n if self.orchestrator.server.port < 1024 || self.orchestrator.server.port > 65535 {\n return Err(ConfigError::Message(\n "Server port must be between 1024 and 65535".to_string()\n ));\n }\n\n // Validate queue configuration\n if self.orchestrator.queue.max_concurrent_tasks == 0 {\n return Err(ConfigError::Message(\n "max_concurrent_tasks must be > 0".to_string()\n ));\n }\n\n // Validate storage configuration\n match self.orchestrator.storage.backend.as_str() {\n "filesystem" | "surrealdb" | "rocksdb" => {\n // Valid backend\n },\n backend => {\n return Err(ConfigError::Message(\n format!("Unknown storage backend: {}", backend)\n ));\n }\n }\n\n Ok(())\n }\n}\n```\n\n### Runtime Startup Sequence\n\n```\n#[tokio::main]\nasync fn main() -> Result<()> {\n // Load configuration\n let config = OrchestratorConfig::load(\n std::env::var("ORCHESTRATOR_CONFIG").ok().as_deref()\n )?;\n\n // Validate configuration\n config.validate()?;\n\n // Initialize logging\n init_logging(&config.orchestrator.logging)?;\n\n // Start HTTP server\n let server = Server::new(\n config.orchestrator.server.host.clone(),\n config.orchestrator.server.port,\n );\n\n // Initialize storage backend\n let storage = Storage::new(&config.orchestrator.storage)?;\n\n // Start the service\n server.start(storage).await?;\n\n Ok(())\n}\n```\n\n---\n\n## Complete Example: Solo Mode End-to-End\n\n### Step 1: Interactive Configuration\n\n```\n$ nu scripts/configure.nu orchestrator solo --backend web\n\n# TypeDialog launches web interface\n# User fills in form:\n# - Workspace name: "dev-workspace"\n# - Server host: "127.0.0.1"\n# - Server port: 9090\n# - Storage backend: "filesystem"\n# - Storage path: "/home/developer/provisioning/data/orchestrator"\n# - Max concurrent tasks: 3\n# - Log level: "debug"\n\n# Saves to: values/orchestrator.solo.ncl\n```\n\n### Step 2: Generated Nickel Configuration\n\n```\n# values/orchestrator.solo.ncl\n{\n orchestrator = {\n workspace = {\n name = "dev-workspace",\n path = "/home/developer/provisioning/data/orchestrator",\n enabled = true,\n multi_workspace = false,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2,\n keep_alive = 75,\n max_connections = 128,\n },\n storage = {\n backend = "filesystem",\n path = "/home/developer/provisioning/data/orchestrator",\n },\n queue = {\n max_concurrent_tasks = 3,\n retry_attempts = 2,\n retry_delay = 1000,\n task_timeout = 1800000,\n },\n logging = {\n level = "debug",\n format = "text",\n outputs = [{\n destination = "stdout",\n level = "debug",\n }],\n },\n },\n}\n```\n\n### Step 3: Composition and Validation\n\n```\n$ nickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Validation passes:\n# - Workspace name: valid string ✓\n# - Port 9090: within range 1024-65535 ✓\n# - Max concurrent tasks 3: within range 1-100 ✓\n# - Log level: recognized level ✓\n```\n\n### Step 4: Export to TOML\n\n```\n$ nu scripts/generate-configs.nu orchestrator solo\n\n# Generates: provisioning/platform/config/orchestrator.solo.toml\n```\n\n### Step 5: TOML File Created\n\n```\n# provisioning/platform/config/orchestrator.solo.toml\n[orchestrator.workspace]\nname = "dev-workspace"\npath = "/home/developer/provisioning/data/orchestrator"\nenabled = true\nmulti_workspace = false\n\n[orchestrator.server]\nhost = "127.0.0.1"\nport = 9090\nworkers = 2\nkeep_alive = 75\nmax_connections = 128\n\n[orchestrator.storage]\nbackend = "filesystem"\npath = "/home/developer/provisioning/data/orchestrator"\n\n[orchestrator.queue]\nmax_concurrent_tasks = 3\nretry_attempts = 2\nretry_delay = 1000\ntask_timeout = 1800000\n\n[orchestrator.logging]\nlevel = "debug"\nformat = "text"\n\n[[orchestrator.logging.outputs]]\ndestination = "stdout"\nlevel = "debug"\n```\n\n### Step 6: Runtime Startup\n\n```\n$ export ORCHESTRATOR_LOG_LEVEL=debug\n$ ORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n\n# Service loads orchestrator.solo.toml\n# Environment variable overrides ORCHESTRATOR_LOG_LEVEL to "debug"\n# Service starts and begins accepting requests on 127.0.0.1:9090\n```\n\n---\n\n## Configuration Modification Workflow\n\n### Scenario: User Wants to Change Port\n\n#### Option A: Modify TypeDialog Form and Regenerate\n\n```\n# 1. Re-run interactive configuration\nnu scripts/configure.nu orchestrator solo --backend web\n\n# 2. User changes port to 9999 in form\n# 3. TypeDialog generates new values/orchestrator.solo.ncl\n\n# 4. Export updated config\nnu scripts/generate-configs.nu orchestrator solo\n\n# 5. New TOML created with port: 9999\n# 6. Restart service\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n#### Option B: Direct TOML Edit\n\n```\n# 1. Edit TOML directly\nvi provisioning/platform/config/orchestrator.solo.toml\n# Change: port = 9999\n\n# 2. Restart service (no Nickel re-export needed)\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n#### Option C: Environment Variable Override\n\n```\n# 1. No file changes needed\n# 2. Just override environment variable\nexport ORCHESTRATOR_SERVER_PORT=9999\n\n# 3. Restart service\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n---\n\n## Architecture Relationships\n\n### Component Interactions\n\n```\nTypeDialog Forms Nickel Schemas\n(forms/*.toml) ←shares→ (schemas/*.ncl)\n │ │\n │ user input │ type definitions\n │ │\n ▼ ▼\nvalues/*.ncl ←─ constraint validation ─→ constraints.toml\n │ (single source of truth)\n │ │\n │ │\n ├──→ imported into composition ────────────┤\n │ (configs/*.ncl) │\n │ │\n │ base defaults ───→ defaults/*.ncl │\n │ mode overlay ─────→ deployment/*.ncl │\n │ validators ──────→ validators/*.ncl │\n │ │\n └──→ typecheck + export ──────────────→─────┘\n nickel export --format toml\n │\n ▼\n provisioning/platform/config/\n *.toml files\n │\n │ loaded by Rust services\n │ at runtime\n ▼\n Running Service\n (orchestrator, control-center, mcp-server)\n```\n\n---\n\n## Best Practices\n\n### 1. Always Validate Before Deploying\n\n```\n# Typecheck Nickel before export\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Validate TOML before loading in Rust\ncargo run --bin orchestrator -- --validate-config orchestrator.solo.toml\n```\n\n### 2. Use Version Control for TOML Configs\n\n```\n# Commit generated TOML files\ngit add provisioning/platform/config/orchestrator.solo.toml\ngit commit -m "Update orchestrator solo configuration"\n\n# But NOT the values/*.ncl files\necho "values/*.ncl" >> provisioning/.typedialog/provisioning/platform/.gitignore\n```\n\n### 3. Document Configuration Changes\n\n```\n# In TypeDialog form, add comments\n[[items]]\nname = "max_concurrent_tasks"\ntype = "number"\nprompt = "Max concurrent tasks (3 for dev, 50+ for production)"\nhelp = "Increased from 3 to 10 for higher throughput testing"\n```\n\n### 4. Environment Variables for Sensitive Data\n\nNever hardcode secrets in TOML:\n\n```\n# Instead of:\n# [orchestrator.security]\n# jwt_secret = "hardcoded-secret"\n\n# Use environment variable:\nexport ORCHESTRATOR_SECURITY_JWT_SECRET="actual-secret"\n\n# TOML can reference it:\n# [orchestrator.security]\n# jwt_secret = "${JWT_SECRET}"\n```\n\n### 5. Test Configuration Changes in Staging First\n\n```\n# Generate staging config\nnu scripts/configure.nu orchestrator multiuser --backend web\n\n# Export to staging TOML\nnu scripts/generate-configs.nu orchestrator multiuser\n\n# Test in staging environment\nORCHESTRATOR_CONFIG=orchestrator.multiuser.toml cargo run --bin orchestrator\n# Monitor logs and verify behavior\n\n# Then deploy to production\n```\n\n---\n\n## Summary\n\nThe four-stage workflow provides:\n\n1. **User-Friendly Interface**: TypeDialog forms with real-time validation\n2. **Type Safety**: Nickel schemas and validators catch configuration errors early\n3. **Flexibility**: TOML format can be edited manually or generated programmatically\n4. **Runtime Configurability**: Environment variables allow deployment-time overrides\n5. **Single Source of Truth**: Constraints, schemas, and validators all reference shared definitions\n\nThis layered approach ensures that:\n- Invalid configurations are caught before deployment\n- Users can modify configuration safely\n- Different deployment modes have appropriate defaults\n- Configuration changes can be version-controlled\n- Services can be reconfigured without code changes \ No newline at end of file +# Configuration Workflow: TypeDialog → Nickel → TOML → Rust\n\nComplete documentation of the configuration pipeline that transforms interactive user input into production Rust service configurations.\n\n## Overview\n\nThe provisioning platform uses a **four-stage configuration workflow** that leverages TypeDialog for interactive configuration,\nNickel for type-safe composition, and TOML for service consumption:\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│ Stage 1: User Interaction (TypeDialog) │\n│ - Can use Nickel configuration as default values │\n│ if use provisioning/platform/config/ it will be updated │\n│ - Interactive form (web/tui/cli) │\n│ - Real-time constraint validation │\n│ - Generates Nickel configuration │\n└────────────────┬────────────────────────────────────────────────┘\n │\n ▼\n┌─────────────────────────────────────────────────────────────────┐\n│ Stage 2: Composition (Nickel) │\n│ - Base defaults imported │\n│ - Mode overlay applied │\n│ - Validators enforce business rules │\n│ - Produces Nickel config file │\n└────────────────┬────────────────────────────────────────────────┘\n │\n ▼\n┌─────────────────────────────────────────────────────────────────┐\n│ Stage 3: Export (Nickel → TOML) │\n│ - Nickel config evaluated │\n│ - Exported to TOML format │\n│ - Saved to provisioning/platform/config/ │\n└────────────────┬────────────────────────────────────────────────┘\n │\n ▼\n┌─────────────────────────────────────────────────────────────────┐\n│ Stage 4: Runtime (Rust Services) │\n│ - Services load TOML configuration │\n│ - Environment variables override specific values │\n│ - Start services with final configuration │\n└─────────────────────────────────────────────────────────────────┘\n```\n\n---\n\n## Stage 1: User Interaction (TypeDialog)\n\n### Purpose\n\nCollect configuration from users through an interactive, constraint-aware interface.\n\n### Workflow\n\n```\n# Launch interactive configuration wizard\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n### What Happens\n\n1. **Form Loads**\n - TypeDialog reads `forms/orchestrator-form.toml`\n - Form displays configuration sections\n - Constraints from `constraints.toml` enforce min/max values\n - Environment variables populate initial defaults\n\n2. **User Interaction**\n - User fills in form fields (workspace name, server port, etc.)\n - Real-time validation on each field\n - Constraint interpolation shows valid ranges:\n - `${constraint.orchestrator.workers.min}` → `1`\n - `${constraint.orchestrator.workers.max}` → `32`\n\n3. **Configuration Submission**\n - User submits form\n - TypeDialog validates all fields against schemas\n - Generates Nickel configuration output\n\n4. **Output Generation**\n - Nickel config saved to `values/{service}.{mode}.ncl`\n - Example: `values/orchestrator.solo.ncl`\n - File becomes source of truth for user customizations\n\n### Form Structure Example\n\n```\n# forms/orchestrator-form.toml\nname = "orchestrator_configuration"\ndescription = "Configure orchestrator service"\n\n[[items]]\nname = "workspace_group"\ntype = "group"\nincludes = ["fragments/workspace-section.toml"]\n\n[[items]]\nname = "server_group"\ntype = "group"\nincludes = ["fragments/server-section.toml"]\n\n[[items]]\nname = "queue_group"\ntype = "group"\nincludes = ["fragments/orchestrator/queue-section.toml"]\n```\n\n### Fragment with Constraint Interpolation\n\n```\n# forms/fragments/orchestrator/queue-section.toml\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nprompt = "Maximum Concurrent Tasks"\ndefault = 5\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\nrequired = true\nhelp = "Range: ${constraint.orchestrator.queue.concurrent_tasks.min}-${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\n### Generated Nickel Output (from TypeDialog)\n\nTypeDialog's `nickel-roundtrip` pattern generates:\n\n```\n# values/orchestrator.solo.ncl\n# Auto-generated by TypeDialog\n{\n orchestrator = {\n workspace = {\n name = "dev-workspace",\n path = "/home/developer/provisioning/data/orchestrator",\n enabled = true,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2,\n },\n queue = {\n max_concurrent_tasks = 3,\n retry_attempts = 2,\n retry_delay = 1000,\n },\n },\n}\n```\n\n---\n\n## Stage 2: Composition (Nickel)\n\n### Purpose\n\nCompose the user input with defaults, validators, and schemas to create a complete, validated configuration.\n\n### Workflow\n\n```\n# The nickel typecheck command validates the composition\nnickel typecheck values/orchestrator.solo.ncl\n```\n\n### Composition Layers\n\nThe final configuration is built by merging layers in priority order:\n\n#### Layer 1: Schema Import\n\n```\n# Ensures type safety and required fields\nlet schemas = import "../schemas/orchestrator.ncl" in\n```\n\n#### Layer 2: Base Defaults\n\n```\n# Default values for all orchestrator configurations\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\n```\n\n#### Layer 3: Mode Overlay\n\n```\n# Solo-specific overrides and adjustments\nlet solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in\n```\n\n#### Layer 4: Validators Import\n\n```\n# Business rule validation (ranges, uniqueness, dependencies)\nlet validators = import "../validators/orchestrator-validator.ncl" in\n```\n\n#### Layer 5: User Values\n\n```\n# User input from TypeDialog (values/orchestrator.solo.ncl)\n# Loaded and merged with defaults\n```\n\n### Composition Example\n\n```\n# configs/orchestrator.solo.ncl (generated composition)\n\nlet schemas = import "../schemas/orchestrator.ncl" in\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\nlet solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in\nlet validators = import "../validators/orchestrator-validator.ncl" in\n\n# Composition: Base defaults + mode overlay + user input\n{\n orchestrator = defaults.orchestrator & {\n # User input from TypeDialog values/orchestrator.solo.ncl\n workspace = {\n name = "dev-workspace",\n path = "/home/developer/provisioning/data/orchestrator",\n },\n\n # Solo mode overrides\n server = {\n workers = validators.ValidWorkers 2,\n max_connections = 128,\n },\n\n queue = {\n max_concurrent_tasks = validators.ValidConcurrentTasks 3,\n },\n\n # Fallback to defaults for unspecified fields\n },\n} | schemas.OrchestratorConfig # Validate against schema\n```\n\n### Validation During Composition\n\nEach field is validated through multiple validation layers:\n\n```\n# validators/orchestrator-validator.ncl\nlet constraints = import "../constraints/constraints.toml" in\n\n{\n # Validate workers within allowed range\n ValidWorkers = fun workers =>\n if workers < constraints.orchestrator.workers.min then\n error "Workers below minimum"\n else if workers > constraints.orchestrator.workers.max then\n error "Workers above maximum"\n else\n workers,\n\n # Validate concurrent tasks\n ValidConcurrentTasks = fun tasks =>\n if tasks < constraints.orchestrator.queue.concurrent_tasks.min then\n error "Tasks below minimum"\n else if tasks > constraints.orchestrator.queue.concurrent_tasks.max then\n error "Tasks above maximum"\n else\n tasks,\n}\n```\n\n### Constraints: Single Source of Truth\n\n```\n# constraints/constraints.toml\n[orchestrator.workers]\nmin = 1\nmax = 32\n\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 100\n\n[common.server.port]\nmin = 1024\nmax = 65535\n```\n\nThese values are referenced in:\n- Form constraints (constraint interpolation)\n- Validators (ValidWorkers, ValidConcurrentTasks)\n- Default values (appropriate for each mode)\n\n---\n\n## Stage 3: Export (Nickel → TOML)\n\n### Purpose\n\nConvert validated Nickel configuration to TOML format for consumption by Rust services.\n\n### Workflow\n\n```\n# Export Nickel to TOML\nnu scripts/generate-configs.nu orchestrator solo\n```\n\n### Command Chain\n\n```\n# What happens internally:\n\n# 1. Typecheck the Nickel config (catch errors early)\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# 2. Export to TOML format\nnickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# 3. Save to output location\n# → provisioning/platform/config/orchestrator.solo.toml\n```\n\n### Input: Nickel Configuration\n\n```\n# From: configs/orchestrator.solo.ncl\n{\n orchestrator = {\n workspace = {\n name = "dev-workspace",\n path = "/home/developer/provisioning/data/orchestrator",\n enabled = true,\n multi_workspace = false,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2,\n keep_alive = 75,\n max_connections = 128,\n },\n storage = {\n backend = "filesystem",\n path = "/home/developer/provisioning/data/orchestrator",\n },\n queue = {\n max_concurrent_tasks = 3,\n retry_attempts = 2,\n retry_delay = 1000,\n task_timeout = 1800000,\n },\n monitoring = {\n enabled = true,\n metrics = {\n enabled = false,\n },\n health_check = {\n enabled = true,\n interval = 60,\n },\n },\n logging = {\n level = "debug",\n format = "text",\n outputs = [\n {\n destination = "stdout",\n level = "debug",\n },\n ],\n },\n },\n}\n```\n\n### Output: TOML Configuration\n\n```\n# To: provisioning/platform/config/orchestrator.solo.toml\n[orchestrator.workspace]\nname = "dev-workspace"\npath = "/home/developer/provisioning/data/orchestrator"\nenabled = true\nmulti_workspace = false\n\n[orchestrator.server]\nhost = "127.0.0.1"\nport = 9090\nworkers = 2\nkeep_alive = 75\nmax_connections = 128\n\n[orchestrator.storage]\nbackend = "filesystem"\npath = "/home/developer/provisioning/data/orchestrator"\n\n[orchestrator.queue]\nmax_concurrent_tasks = 3\nretry_attempts = 2\nretry_delay = 1000\ntask_timeout = 1800000\n\n[orchestrator.monitoring]\nenabled = true\n\n[orchestrator.monitoring.metrics]\nenabled = false\n\n[orchestrator.monitoring.health_check]\nenabled = true\ninterval = 60\n\n[orchestrator.logging]\nlevel = "debug"\nformat = "text"\n\n[[orchestrator.logging.outputs]]\ndestination = "stdout"\nlevel = "debug"\n```\n\n### Output Location\n\n```\nprovisioning/platform/config/\n├── orchestrator.solo.toml # Exported from configs/orchestrator.solo.ncl\n├── orchestrator.multiuser.toml # Exported from configs/orchestrator.multiuser.ncl\n├── orchestrator.cicd.toml # Exported from configs/orchestrator.cicd.ncl\n├── orchestrator.enterprise.toml # Exported from configs/orchestrator.enterprise.ncl\n├── control-center.solo.toml # Similar structure for each service\n├── control-center.multiuser.toml\n├── mcp-server.solo.toml\n└── mcp-server.enterprise.toml\n```\n\n### Validation During Export\n\nThe `generate-configs.nu` script:\n\n1. **Typechecks** - Ensures Nickel is syntactically valid\n2. **Evaluates** - Computes final values\n3. **Exports** - Converts to TOML format\n4. **Saves** - Writes to `provisioning/platform/config/`\n\n---\n\n## Stage 4: Runtime (Rust Services)\n\n### Purpose\n\nLoad TOML configuration and start Rust services with validated settings.\n\n### Configuration Loading Hierarchy\n\nRust services load configuration in this priority order:\n\n#### 1. Runtime Arguments (Highest Priority)\n\n```\nORCHESTRATOR_CONFIG=/path/to/config.toml cargo run --bin orchestrator\n```\n\n#### 2. Environment Variables\n\n```\n# Environment variable overrides specific TOML values\nexport ORCHESTRATOR_SERVER_PORT=9999\nexport ORCHESTRATOR_LOG_LEVEL=debug\n\nORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator\n```\n\nEnvironment variable format: `ORCHESTRATOR_{SECTION}_{KEY}=value`\n\nExample mappings:\n- `ORCHESTRATOR_SERVER_PORT=9999` → `orchestrator.server.port = 9999`\n- `ORCHESTRATOR_LOG_LEVEL=debug` → `orchestrator.logging.level = "debug"`\n- `ORCHESTRATOR_QUEUE_MAX_CONCURRENT_TASKS=10` → `orchestrator.queue.max_concurrent_tasks = 10`\n\n#### 3. TOML Configuration File\n\n```\n# Load from TOML (medium priority)\nORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n#### 4. Compiled Defaults (Lowest Priority)\n\n```\n// In Rust code - fallback for unspecified values\nlet config = Config::from_file(config_path)\n .unwrap_or_else(|_| Config::default());\n```\n\n### Example: Solo Mode Startup\n\n```\n# Step 1: User generates config through TypeDialog\nnu scripts/configure.nu orchestrator solo --backend web\n\n# Step 2: Export to TOML\nnu scripts/generate-configs.nu orchestrator solo\n\n# Step 3: Set environment variables for environment-specific overrides\nexport ORCHESTRATOR_SERVER_PORT=9090\nexport ORCHESTRATOR_LOG_LEVEL=debug\n\n# Step 4: Start the Rust service\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n### Rust Service Configuration Loading\n\n```\n// In orchestrator/src/config.rs\n\nuse config::{Config, ConfigError, Environment, File};\nuse serde::Deserialize;\n\n#[derive(Debug, Deserialize)]\npub struct OrchestratorConfig {\n pub orchestrator: OrchestratorService,\n}\n\n#[derive(Debug, Deserialize)]\npub struct OrchestratorService {\n pub workspace: Workspace,\n pub server: Server,\n pub storage: Storage,\n pub queue: Queue,\n}\n\nimpl OrchestratorConfig {\n pub fn load(config_path: Option<&str>) -> Result {\n let mut builder = Config::builder();\n\n // 1. Load TOML file if provided\n if let Some(path) = config_path {\n builder = builder.add_source(File::from(Path::new(path)));\n } else {\n // Fallback to defaults\n builder = builder.add_source(File::with_name("config/orchestrator.defaults.toml"));\n }\n\n // 2. Apply environment variable overrides\n builder = builder.add_source(\n Environment::with_prefix("ORCHESTRATOR")\n .separator("_")\n );\n\n let config = builder.build()?;\n config.try_deserialize()\n }\n}\n```\n\n### Configuration Validation in Rust\n\n```\nimpl OrchestratorConfig {\n pub fn validate(&self) -> Result<(), ConfigError> {\n // Validate server configuration\n if self.orchestrator.server.port < 1024 || self.orchestrator.server.port > 65535 {\n return Err(ConfigError::Message(\n "Server port must be between 1024 and 65535".to_string()\n ));\n }\n\n // Validate queue configuration\n if self.orchestrator.queue.max_concurrent_tasks == 0 {\n return Err(ConfigError::Message(\n "max_concurrent_tasks must be > 0".to_string()\n ));\n }\n\n // Validate storage configuration\n match self.orchestrator.storage.backend.as_str() {\n "filesystem" | "surrealdb" | "rocksdb" => {\n // Valid backend\n },\n backend => {\n return Err(ConfigError::Message(\n format!("Unknown storage backend: {}", backend)\n ));\n }\n }\n\n Ok(())\n }\n}\n```\n\n### Runtime Startup Sequence\n\n```\n#[tokio::main]\nasync fn main() -> Result<()> {\n // Load configuration\n let config = OrchestratorConfig::load(\n std::env::var("ORCHESTRATOR_CONFIG").ok().as_deref()\n )?;\n\n // Validate configuration\n config.validate()?;\n\n // Initialize logging\n init_logging(&config.orchestrator.logging)?;\n\n // Start HTTP server\n let server = Server::new(\n config.orchestrator.server.host.clone(),\n config.orchestrator.server.port,\n );\n\n // Initialize storage backend\n let storage = Storage::new(&config.orchestrator.storage)?;\n\n // Start the service\n server.start(storage).await?;\n\n Ok(())\n}\n```\n\n---\n\n## Complete Example: Solo Mode End-to-End\n\n### Step 1: Interactive Configuration\n\n```\n$ nu scripts/configure.nu orchestrator solo --backend web\n\n# TypeDialog launches web interface\n# User fills in form:\n# - Workspace name: "dev-workspace"\n# - Server host: "127.0.0.1"\n# - Server port: 9090\n# - Storage backend: "filesystem"\n# - Storage path: "/home/developer/provisioning/data/orchestrator"\n# - Max concurrent tasks: 3\n# - Log level: "debug"\n\n# Saves to: values/orchestrator.solo.ncl\n```\n\n### Step 2: Generated Nickel Configuration\n\n```\n# values/orchestrator.solo.ncl\n{\n orchestrator = {\n workspace = {\n name = "dev-workspace",\n path = "/home/developer/provisioning/data/orchestrator",\n enabled = true,\n multi_workspace = false,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2,\n keep_alive = 75,\n max_connections = 128,\n },\n storage = {\n backend = "filesystem",\n path = "/home/developer/provisioning/data/orchestrator",\n },\n queue = {\n max_concurrent_tasks = 3,\n retry_attempts = 2,\n retry_delay = 1000,\n task_timeout = 1800000,\n },\n logging = {\n level = "debug",\n format = "text",\n outputs = [{\n destination = "stdout",\n level = "debug",\n }],\n },\n },\n}\n```\n\n### Step 3: Composition and Validation\n\n```\n$ nickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Validation passes:\n# - Workspace name: valid string ✓\n# - Port 9090: within range 1024-65535 ✓\n# - Max concurrent tasks 3: within range 1-100 ✓\n# - Log level: recognized level ✓\n```\n\n### Step 4: Export to TOML\n\n```\n$ nu scripts/generate-configs.nu orchestrator solo\n\n# Generates: provisioning/platform/config/orchestrator.solo.toml\n```\n\n### Step 5: TOML File Created\n\n```\n# provisioning/platform/config/orchestrator.solo.toml\n[orchestrator.workspace]\nname = "dev-workspace"\npath = "/home/developer/provisioning/data/orchestrator"\nenabled = true\nmulti_workspace = false\n\n[orchestrator.server]\nhost = "127.0.0.1"\nport = 9090\nworkers = 2\nkeep_alive = 75\nmax_connections = 128\n\n[orchestrator.storage]\nbackend = "filesystem"\npath = "/home/developer/provisioning/data/orchestrator"\n\n[orchestrator.queue]\nmax_concurrent_tasks = 3\nretry_attempts = 2\nretry_delay = 1000\ntask_timeout = 1800000\n\n[orchestrator.logging]\nlevel = "debug"\nformat = "text"\n\n[[orchestrator.logging.outputs]]\ndestination = "stdout"\nlevel = "debug"\n```\n\n### Step 6: Runtime Startup\n\n```\n$ export ORCHESTRATOR_LOG_LEVEL=debug\n$ ORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n\n# Service loads orchestrator.solo.toml\n# Environment variable overrides ORCHESTRATOR_LOG_LEVEL to "debug"\n# Service starts and begins accepting requests on 127.0.0.1:9090\n```\n\n---\n\n## Configuration Modification Workflow\n\n### Scenario: User Wants to Change Port\n\n#### Option A: Modify TypeDialog Form and Regenerate\n\n```\n# 1. Re-run interactive configuration\nnu scripts/configure.nu orchestrator solo --backend web\n\n# 2. User changes port to 9999 in form\n# 3. TypeDialog generates new values/orchestrator.solo.ncl\n\n# 4. Export updated config\nnu scripts/generate-configs.nu orchestrator solo\n\n# 5. New TOML created with port: 9999\n# 6. Restart service\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n#### Option B: Direct TOML Edit\n\n```\n# 1. Edit TOML directly\nvi provisioning/platform/config/orchestrator.solo.toml\n# Change: port = 9999\n\n# 2. Restart service (no Nickel re-export needed)\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n#### Option C: Environment Variable Override\n\n```\n# 1. No file changes needed\n# 2. Just override environment variable\nexport ORCHESTRATOR_SERVER_PORT=9999\n\n# 3. Restart service\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n---\n\n## Architecture Relationships\n\n### Component Interactions\n\n```\nTypeDialog Forms Nickel Schemas\n(forms/*.toml) ←shares→ (schemas/*.ncl)\n │ │\n │ user input │ type definitions\n │ │\n ▼ ▼\nvalues/*.ncl ←─ constraint validation ─→ constraints.toml\n │ (single source of truth)\n │ │\n │ │\n ├──→ imported into composition ────────────┤\n │ (configs/*.ncl) │\n │ │\n │ base defaults ───→ defaults/*.ncl │\n │ mode overlay ─────→ deployment/*.ncl │\n │ validators ──────→ validators/*.ncl │\n │ │\n └──→ typecheck + export ──────────────→─────┘\n nickel export --format toml\n │\n ▼\n provisioning/platform/config/\n *.toml files\n │\n │ loaded by Rust services\n │ at runtime\n ▼\n Running Service\n (orchestrator, control-center, mcp-server)\n```\n\n---\n\n## Best Practices\n\n### 1. Always Validate Before Deploying\n\n```\n# Typecheck Nickel before export\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Validate TOML before loading in Rust\ncargo run --bin orchestrator -- --validate-config orchestrator.solo.toml\n```\n\n### 2. Use Version Control for TOML Configs\n\n```\n# Commit generated TOML files\ngit add provisioning/platform/config/orchestrator.solo.toml\ngit commit -m "Update orchestrator solo configuration"\n\n# But NOT the values/*.ncl files\necho "values/*.ncl" >> provisioning/.typedialog/provisioning/platform/.gitignore\n```\n\n### 3. Document Configuration Changes\n\n```\n# In TypeDialog form, add comments\n[[items]]\nname = "max_concurrent_tasks"\ntype = "number"\nprompt = "Max concurrent tasks (3 for dev, 50+ for production)"\nhelp = "Increased from 3 to 10 for higher throughput testing"\n```\n\n### 4. Environment Variables for Sensitive Data\n\nNever hardcode secrets in TOML:\n\n```\n# Instead of:\n# [orchestrator.security]\n# jwt_secret = "hardcoded-secret"\n\n# Use environment variable:\nexport ORCHESTRATOR_SECURITY_JWT_SECRET="actual-secret"\n\n# TOML can reference it:\n# [orchestrator.security]\n# jwt_secret = "${JWT_SECRET}"\n```\n\n### 5. Test Configuration Changes in Staging First\n\n```\n# Generate staging config\nnu scripts/configure.nu orchestrator multiuser --backend web\n\n# Export to staging TOML\nnu scripts/generate-configs.nu orchestrator multiuser\n\n# Test in staging environment\nORCHESTRATOR_CONFIG=orchestrator.multiuser.toml cargo run --bin orchestrator\n# Monitor logs and verify behavior\n\n# Then deploy to production\n```\n\n---\n\n## Summary\n\nThe four-stage workflow provides:\n\n1. **User-Friendly Interface**: TypeDialog forms with real-time validation\n2. **Type Safety**: Nickel schemas and validators catch configuration errors early\n3. **Flexibility**: TOML format can be edited manually or generated programmatically\n4. **Runtime Configurability**: Environment variables allow deployment-time overrides\n5. **Single Source of Truth**: Constraints, schemas, and validators all reference shared definitions\n\nThis layered approach ensures that:\n- Invalid configurations are caught before deployment\n- Users can modify configuration safely\n- Different deployment modes have appropriate defaults\n- Configuration changes can be version-controlled\n- Services can be reconfigured without code changes diff --git a/schemas/platform/constraints/README.md b/schemas/platform/constraints/README.md index 7a54fcf..c39cef5 100644 --- a/schemas/platform/constraints/README.md +++ b/schemas/platform/constraints/README.md @@ -1 +1 @@ -# Constraints\n\nSingle source of truth for validation limits across all services.\n\n## Purpose\n\nThe `constraints.toml` file defines:\n- **Numeric ranges** (min/max values for ports, workers, timeouts, etc.)\n- **Uniqueness rules** (field constraints, array bounds)\n- **Validation bounds** (resource limits, timeout ranges)\n\nThese constraints are used by:\n1. **Validators** (`validators/*.ncl`) - Check that configuration values are within bounds\n2. **TypeDialog forms** (`forms/*.toml`) - Enable constraint interpolation for dynamic field validation\n3. **Nickel schemas** (`schemas/*.ncl`) - Define type contracts with bounds\n\n## File Structure\n\n```\nconstraints/\n└── constraints.toml # All validation constraints in TOML format\n```\n\n## Usage Pattern\n\n### 1. Define Constraint\n\n**constraints.toml**:\n\n```\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 100\n```\n\n### 2. Reference in Validator\n\n**validators/orchestrator-validator.ncl**:\n\n```\nlet constraints = import "../constraints/constraints.toml" in\n\n{\n ValidConcurrentTasks = fun tasks =>\n if tasks < constraints.orchestrator.queue.concurrent_tasks.min then\n error "Tasks must be >= 1"\n else if tasks > constraints.orchestrator.queue.concurrent_tasks.max then\n error "Tasks must be <= 100"\n else\n tasks,\n}\n```\n\n### 3. Reference in Form\n\n**forms/fragments/orchestrator-queue-section.toml**:\n\n```\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\nhelp = "Max: ${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\n## Constraint Categories\n\n### Service-Specific Constraints\n\n- **Orchestrator** (`[orchestrator.*]`)\n - Worker count bounds\n - Queue concurrency limits\n - Task timeout ranges\n - Batch parallelism limits\n\n- **Control Center** (`[control_center.*]`)\n - JWT token expiration bounds\n - Rate limiting thresholds\n - RBAC policy limits\n\n- **MCP Server** (`[mcp_server.*]`)\n - Tool concurrency limits\n - Resource size bounds\n - Prompt template limits\n\n### Common Constraints\n\n- **Server** (`[common.server.*]`)\n - Port range (1024-65535)\n - Worker count\n - Connection limits\n\n- **Deployment** (`[deployment.{solo,multiuser,cicd,enterprise}.*]`)\n - CPU core bounds\n - Memory allocation bounds\n - Disk space requirements\n\n## Modifying Constraints\n\nWhen changing constraint bounds:\n\n1. **Update constraints.toml**\n2. **Update validators** that use the constraint\n3. **Update forms** that interpolate the constraint\n4. **Test validation** in forms and Nickel typecheck\n5. **Update documentation** of affected services\n\n### Example: Increase Max Queue Tasks\n\n**Before**:\n\n```\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 100\n```\n\n**After**:\n\n```\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 200 # Increased from 100\n```\n\n**Then**:\n1. Verify `validators/orchestrator-validator.ncl` still type-checks\n2. Form will automatically show new max (via constraint interpolation)\n3. Test with: `nu scripts/validate-config.nu values/orchestrator.*.ncl`\n\n## Constraint Interpolation in Forms\n\nTypeDialog supports dynamic constraint references via `${constraint.path.to.value}`:\n\n```\n# Static min/max\nmin = 1\nmax = 100\n\n# Dynamic from constraints.toml\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\n\n# Help text with dynamic reference\nhelp = "Value must be between ${constraint.orchestrator.queue.concurrent_tasks.min} and ${constraint.orchestrator.queue.concurrent_tasks.max}"\n```\n\n## Best Practices\n\n1. **Single source of truth** - Define constraint once in constraints.toml\n2. **Meaningful names** - Use clear path hierarchy (service.subsystem.property)\n3. **Document ranges** - Add comments explaining why min/max values exist\n4. **Validate propagation** - Ensure forms and validators reference the same constraint\n5. **Test edge cases** - Verify min/max values work in validators and forms\n\n## Files to Update When Modifying Constraints\n\nWhen you change `constraints/constraints.toml`:\n\n1. `validators/*.ncl` - Update validator bounds\n2. `forms/fragments/*.toml` - Update form field constraints\n3. `schemas/*.ncl` - Update type contracts if needed\n4. Documentation - Update service-specific constraint documentation\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 \ No newline at end of file +# Constraints\n\nSingle source of truth for validation limits across all services.\n\n## Purpose\n\nThe `constraints.toml` file defines:\n- **Numeric ranges** (min/max values for ports, workers, timeouts, etc.)\n- **Uniqueness rules** (field constraints, array bounds)\n- **Validation bounds** (resource limits, timeout ranges)\n\nThese constraints are used by:\n1. **Validators** (`validators/*.ncl`) - Check that configuration values are within bounds\n2. **TypeDialog forms** (`forms/*.toml`) - Enable constraint interpolation for dynamic field validation\n3. **Nickel schemas** (`schemas/*.ncl`) - Define type contracts with bounds\n\n## File Structure\n\n```\nconstraints/\n└── constraints.toml # All validation constraints in TOML format\n```\n\n## Usage Pattern\n\n### 1. Define Constraint\n\n**constraints.toml**:\n\n```\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 100\n```\n\n### 2. Reference in Validator\n\n**validators/orchestrator-validator.ncl**:\n\n```\nlet constraints = import "../constraints/constraints.toml" in\n\n{\n ValidConcurrentTasks = fun tasks =>\n if tasks < constraints.orchestrator.queue.concurrent_tasks.min then\n error "Tasks must be >= 1"\n else if tasks > constraints.orchestrator.queue.concurrent_tasks.max then\n error "Tasks must be <= 100"\n else\n tasks,\n}\n```\n\n### 3. Reference in Form\n\n**forms/fragments/orchestrator-queue-section.toml**:\n\n```\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\nhelp = "Max: ${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\n## Constraint Categories\n\n### Service-Specific Constraints\n\n- **Orchestrator** (`[orchestrator.*]`)\n - Worker count bounds\n - Queue concurrency limits\n - Task timeout ranges\n - Batch parallelism limits\n\n- **Control Center** (`[control_center.*]`)\n - JWT token expiration bounds\n - Rate limiting thresholds\n - RBAC policy limits\n\n- **MCP Server** (`[mcp_server.*]`)\n - Tool concurrency limits\n - Resource size bounds\n - Prompt template limits\n\n### Common Constraints\n\n- **Server** (`[common.server.*]`)\n - Port range (1024-65535)\n - Worker count\n - Connection limits\n\n- **Deployment** (`[deployment.{solo,multiuser,cicd,enterprise}.*]`)\n - CPU core bounds\n - Memory allocation bounds\n - Disk space requirements\n\n## Modifying Constraints\n\nWhen changing constraint bounds:\n\n1. **Update constraints.toml**\n2. **Update validators** that use the constraint\n3. **Update forms** that interpolate the constraint\n4. **Test validation** in forms and Nickel typecheck\n5. **Update documentation** of affected services\n\n### Example: Increase Max Queue Tasks\n\n**Before**:\n\n```\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 100\n```\n\n**After**:\n\n```\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 200 # Increased from 100\n```\n\n**Then**:\n1. Verify `validators/orchestrator-validator.ncl` still type-checks\n2. Form will automatically show new max (via constraint interpolation)\n3. Test with: `nu scripts/validate-config.nu values/orchestrator.*.ncl`\n\n## Constraint Interpolation in Forms\n\nTypeDialog supports dynamic constraint references via `${constraint.path.to.value}`:\n\n```\n# Static min/max\nmin = 1\nmax = 100\n\n# Dynamic from constraints.toml\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\n\n# Help text with dynamic reference\nhelp = "Value must be between ${constraint.orchestrator.queue.concurrent_tasks.min} and ${constraint.orchestrator.queue.concurrent_tasks.max}"\n```\n\n## Best Practices\n\n1. **Single source of truth** - Define constraint once in constraints.toml\n2. **Meaningful names** - Use clear path hierarchy (service.subsystem.property)\n3. **Document ranges** - Add comments explaining why min/max values exist\n4. **Validate propagation** - Ensure forms and validators reference the same constraint\n5. **Test edge cases** - Verify min/max values work in validators and forms\n\n## Files to Update When Modifying Constraints\n\nWhen you change `constraints/constraints.toml`:\n\n1. `validators/*.ncl` - Update validator bounds\n2. `forms/fragments/*.toml` - Update form field constraints\n3. `schemas/*.ncl` - Update type contracts if needed\n4. Documentation - Update service-specific constraint documentation\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 diff --git a/schemas/platform/defaults/README.md b/schemas/platform/defaults/README.md index d2b4c72..66484c4 100644 --- a/schemas/platform/defaults/README.md +++ b/schemas/platform/defaults/README.md @@ -1 +1 @@ -# Defaults\n\nDefault configuration values for all services and deployment modes.\n\n## Purpose\n\nDefaults provide:\n- **Base values** for all configuration fields\n- **Mode-specific overrides** (solo, multiuser, cicd, enterprise)\n- **Composition with validators** for constraint checking\n- **Documentation** of recommended values\n\n## File Organization\n\n```\ndefaults/\n├── README.md # This file\n├── common/ # Shared defaults\n│ ├── server-defaults.ncl # HTTP server defaults\n│ ├── database-defaults.ncl # Database defaults\n│ ├── security-defaults.ncl # Security defaults\n│ ├── monitoring-defaults.ncl # Monitoring defaults\n│ └── logging-defaults.ncl # Logging defaults\n├── deployment/ # Mode-specific defaults\n│ ├── solo-defaults.ncl # Solo mode (2 CPU, 4GB)\n│ ├── multiuser-defaults.ncl # Multi-user mode (4 CPU, 8GB)\n│ ├── cicd-defaults.ncl # CI/CD mode (8 CPU, 16GB)\n│ └── enterprise-defaults.ncl # Enterprise mode (16+ CPU, 32+ GB)\n├── orchestrator-defaults.ncl # Orchestrator base defaults\n├── control-center-defaults.ncl # Control Center base defaults\n├── mcp-server-defaults.ncl # MCP Server base defaults\n└── installer-defaults.ncl # Installer base defaults\n```\n\n## Composition Pattern\n\nConfiguration is built from layers:\n\n```\nBase Defaults (service-defaults.ncl)\n ↓\n+ Mode Overlay (deployment/{mode}-defaults.ncl)\n ↓\n+ User Customization (values/{service}.{mode}.ncl)\n ↓\n+ Schema Validation (schemas/*.ncl)\n ↓\n= Final Configuration (configs/{service}.{mode}.ncl)\n```\n\nExample:\n\n```\n# configs/orchestrator.solo.ncl\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\nlet solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in\n\n{\n orchestrator = defaults.orchestrator & {\n # Mode-specific overrides\n server.workers = 2, # Solo mode: fewer workers\n queue.max_concurrent_tasks = 3, # Solo: limited concurrency\n },\n}\n```\n\n## Default Value Hierarchy\n\n### 1. Service Base Defaults\n\n**orchestrator-defaults.ncl**:\n\n```\n{\n orchestrator = {\n workspace = {\n name = "default",\n path = "/var/lib/provisioning/orchestrator",\n enabled = true,\n multi_workspace = false,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 4, # General default\n },\n storage = {\n backend = 'filesystem,\n path = "/var/lib/provisioning/orchestrator/data",\n },\n queue = {\n max_concurrent_tasks = 5,\n retry_attempts = 3,\n },\n },\n}\n```\n\n### 2. Mode-Specific Overrides\n\n**deployment/solo-defaults.ncl**:\n\n```\n{\n resources = {\n cpu_cores = 2,\n memory_mb = 4096,\n },\n services = {\n orchestrator = {\n workers = 2, # Override: fewer workers for solo\n queue_max_concurrent_tasks = 3, # Override: limited concurrency\n storage_backend = 'filesystem,\n },\n },\n}\n```\n\n**deployment/enterprise-defaults.ncl**:\n\n```\n{\n resources = {\n cpu_cores = 16,\n memory_mb = 32768,\n },\n services = {\n orchestrator = {\n workers = 16, # Override: more workers for enterprise\n queue_max_concurrent_tasks = 50, # Override: high concurrency\n storage_backend = 'surrealdb_server,\n surrealdb_url = "surrealdb://cluster:8000",\n },\n },\n}\n```\n\n## Common Defaults\n\n### server-defaults.ncl\n\n```\n{\n server = {\n host = "0.0.0.0", # Accept all interfaces\n port = 8080, # Standard HTTP port (service-specific override)\n workers = 4, # CPU-aware default\n keep_alive = 75, # seconds\n max_connections = 100,\n },\n}\n```\n\n### database-defaults.ncl\n\n```\n{\n database = {\n backend = 'rocksdb, # Fast embedded default\n path = "/var/lib/provisioning/data",\n pool_size = 10, # Connection pool\n timeout = 30000, # milliseconds\n },\n}\n```\n\n### security-defaults.ncl\n\n```\n{\n security = {\n jwt_issuer = "provisioning-system",\n jwt_expiration = 3600, # 1 hour\n encryption_key = "", # User must set\n kms_backend = "age", # Local encryption\n mfa_required = false, # Solo: disabled by default\n },\n}\n```\n\n### monitoring-defaults.ncl\n\n```\n{\n monitoring = {\n enabled = false, # Optional feature\n metrics_interval = 60, # seconds\n health_check_interval = 30,\n retention_days = 30,\n },\n}\n```\n\n## Mode Configurations\n\n### Solo Mode\n- **Use case**: Single developer, testing\n- **Resources**: 2 CPU, 4GB RAM, 50GB disk\n- **Database**: Filesystem or embedded (RocksDB)\n- **Security**: Simplified (no MFA, local encryption)\n- **Services**: Core services only (orchestrator, control-center)\n\n### MultiUser Mode\n- **Use case**: Team collaboration, staging\n- **Resources**: 4 CPU, 8GB RAM, 100GB disk\n- **Database**: PostgreSQL or SurrealDB server\n- **Security**: RBAC enabled, shared authentication\n- **Services**: Full platform (orchestrator, control-center, MCP, Gitea)\n\n### CI/CD Mode\n- **Use case**: Automated pipelines, testing\n- **Resources**: 8 CPU, 16GB RAM, 200GB disk\n- **Database**: Ephemeral, fast cleanup\n- **Security**: API tokens, no UI\n- **Services**: Minimal (orchestrator in API mode)\n\n### Enterprise Mode\n- **Use case**: Production, high availability\n- **Resources**: 16+ CPU, 32+ GB RAM, 500GB+ disk\n- **Database**: SurrealDB cluster with replication\n- **Security**: MFA required, KMS integration, compliance\n- **Services**: Full platform with redundancy, monitoring, logging\n\n## Modifying Defaults\n\n### Changing a Base Default\n\n**orchestrator-defaults.ncl**:\n\n```\n# Before\nqueue = {\n max_concurrent_tasks = 5,\n},\n\n# After\nqueue = {\n max_concurrent_tasks = 10, # Increased default\n},\n```\n\n**Then**:\n1. Test with: `nickel eval configs/orchestrator.solo.ncl`\n2. Verify forms still work\n3. Update documentation if default meaning changes\n\n### Changing Mode Override\n\n**deployment/solo-defaults.ncl**:\n\n```\n# Before\norchestrator = {\n workers = 2,\n}\n\n# After\norchestrator = {\n workers = 1, # Reduce to 1 for solo\n}\n```\n\n## Best Practices\n\n1. **Keep it conservative** - Default to safe, minimal values\n2. **Document overrides** - Explain why mode-specific values differ\n3. **Use composition** - Import and merge rather than duplicate\n4. **Test composition** - Verify defaults merge correctly with modes\n5. **Provide examples** - Use `examples/` directory to show realistic setups\n\n## Testing Defaults\n\n```\n# Evaluate defaults\nnickel eval provisioning/.typedialog/provisioning/platform/defaults/orchestrator-defaults.ncl\n\n# Test merged defaults (base + mode)\nnickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl | head -50\n\n# Typecheck with schemas\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n```\n\n## Default Value Guidelines\n\n### Ports\n- Solo mode: Local (127.0.0.1) only\n- Multi-user/Enterprise: Bind all interfaces (0.0.0.0)\n- Never conflict with system services\n\n### Workers/Concurrency\n- Solo: 1-2 workers, limited concurrency\n- Multi-user: 4-8 workers, moderate concurrency\n- Enterprise: 8+ workers, high concurrency\n\n### Resources\n- Solo: 2 CPU, 4GB RAM (laptop testing)\n- Multi-user: 4 CPU, 8GB RAM (team servers)\n- Enterprise: 16+ CPU, 32+ GB RAM (production)\n\n### Security\n- Solo: Disabled/minimal (local development)\n- Multi-user: RBAC enabled (shared team)\n- Enterprise: MFA required, KMS backend (production)\n\n### Storage\n- Solo: Filesystem or RocksDB (no infrastructure needed)\n- Multi-user: PostgreSQL or SurrealDB (team data)\n- Enterprise: SurrealDB cluster with replication (HA)\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 \ No newline at end of file +# Defaults\n\nDefault configuration values for all services and deployment modes.\n\n## Purpose\n\nDefaults provide:\n- **Base values** for all configuration fields\n- **Mode-specific overrides** (solo, multiuser, cicd, enterprise)\n- **Composition with validators** for constraint checking\n- **Documentation** of recommended values\n\n## File Organization\n\n```\ndefaults/\n├── README.md # This file\n├── common/ # Shared defaults\n│ ├── server-defaults.ncl # HTTP server defaults\n│ ├── database-defaults.ncl # Database defaults\n│ ├── security-defaults.ncl # Security defaults\n│ ├── monitoring-defaults.ncl # Monitoring defaults\n│ └── logging-defaults.ncl # Logging defaults\n├── deployment/ # Mode-specific defaults\n│ ├── solo-defaults.ncl # Solo mode (2 CPU, 4GB)\n│ ├── multiuser-defaults.ncl # Multi-user mode (4 CPU, 8GB)\n│ ├── cicd-defaults.ncl # CI/CD mode (8 CPU, 16GB)\n│ └── enterprise-defaults.ncl # Enterprise mode (16+ CPU, 32+ GB)\n├── orchestrator-defaults.ncl # Orchestrator base defaults\n├── control-center-defaults.ncl # Control Center base defaults\n├── mcp-server-defaults.ncl # MCP Server base defaults\n└── installer-defaults.ncl # Installer base defaults\n```\n\n## Composition Pattern\n\nConfiguration is built from layers:\n\n```\nBase Defaults (service-defaults.ncl)\n ↓\n+ Mode Overlay (deployment/{mode}-defaults.ncl)\n ↓\n+ User Customization (values/{service}.{mode}.ncl)\n ↓\n+ Schema Validation (schemas/*.ncl)\n ↓\n= Final Configuration (configs/{service}.{mode}.ncl)\n```\n\nExample:\n\n```\n# configs/orchestrator.solo.ncl\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\nlet solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in\n\n{\n orchestrator = defaults.orchestrator & {\n # Mode-specific overrides\n server.workers = 2, # Solo mode: fewer workers\n queue.max_concurrent_tasks = 3, # Solo: limited concurrency\n },\n}\n```\n\n## Default Value Hierarchy\n\n### 1. Service Base Defaults\n\n**orchestrator-defaults.ncl**:\n\n```\n{\n orchestrator = {\n workspace = {\n name = "default",\n path = "/var/lib/provisioning/orchestrator",\n enabled = true,\n multi_workspace = false,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 4, # General default\n },\n storage = {\n backend = 'filesystem,\n path = "/var/lib/provisioning/orchestrator/data",\n },\n queue = {\n max_concurrent_tasks = 5,\n retry_attempts = 3,\n },\n },\n}\n```\n\n### 2. Mode-Specific Overrides\n\n**deployment/solo-defaults.ncl**:\n\n```\n{\n resources = {\n cpu_cores = 2,\n memory_mb = 4096,\n },\n services = {\n orchestrator = {\n workers = 2, # Override: fewer workers for solo\n queue_max_concurrent_tasks = 3, # Override: limited concurrency\n storage_backend = 'filesystem,\n },\n },\n}\n```\n\n**deployment/enterprise-defaults.ncl**:\n\n```\n{\n resources = {\n cpu_cores = 16,\n memory_mb = 32768,\n },\n services = {\n orchestrator = {\n workers = 16, # Override: more workers for enterprise\n queue_max_concurrent_tasks = 50, # Override: high concurrency\n storage_backend = 'surrealdb_server,\n surrealdb_url = "surrealdb://cluster:8000",\n },\n },\n}\n```\n\n## Common Defaults\n\n### server-defaults.ncl\n\n```\n{\n server = {\n host = "0.0.0.0", # Accept all interfaces\n port = 8080, # Standard HTTP port (service-specific override)\n workers = 4, # CPU-aware default\n keep_alive = 75, # seconds\n max_connections = 100,\n },\n}\n```\n\n### database-defaults.ncl\n\n```\n{\n database = {\n backend = 'rocksdb, # Fast embedded default\n path = "/var/lib/provisioning/data",\n pool_size = 10, # Connection pool\n timeout = 30000, # milliseconds\n },\n}\n```\n\n### security-defaults.ncl\n\n```\n{\n security = {\n jwt_issuer = "provisioning-system",\n jwt_expiration = 3600, # 1 hour\n encryption_key = "", # User must set\n kms_backend = "age", # Local encryption\n mfa_required = false, # Solo: disabled by default\n },\n}\n```\n\n### monitoring-defaults.ncl\n\n```\n{\n monitoring = {\n enabled = false, # Optional feature\n metrics_interval = 60, # seconds\n health_check_interval = 30,\n retention_days = 30,\n },\n}\n```\n\n## Mode Configurations\n\n### Solo Mode\n- **Use case**: Single developer, testing\n- **Resources**: 2 CPU, 4GB RAM, 50GB disk\n- **Database**: Filesystem or embedded (RocksDB)\n- **Security**: Simplified (no MFA, local encryption)\n- **Services**: Core services only (orchestrator, control-center)\n\n### MultiUser Mode\n- **Use case**: Team collaboration, staging\n- **Resources**: 4 CPU, 8GB RAM, 100GB disk\n- **Database**: PostgreSQL or SurrealDB server\n- **Security**: RBAC enabled, shared authentication\n- **Services**: Full platform (orchestrator, control-center, MCP, Gitea)\n\n### CI/CD Mode\n- **Use case**: Automated pipelines, testing\n- **Resources**: 8 CPU, 16GB RAM, 200GB disk\n- **Database**: Ephemeral, fast cleanup\n- **Security**: API tokens, no UI\n- **Services**: Minimal (orchestrator in API mode)\n\n### Enterprise Mode\n- **Use case**: Production, high availability\n- **Resources**: 16+ CPU, 32+ GB RAM, 500GB+ disk\n- **Database**: SurrealDB cluster with replication\n- **Security**: MFA required, KMS integration, compliance\n- **Services**: Full platform with redundancy, monitoring, logging\n\n## Modifying Defaults\n\n### Changing a Base Default\n\n**orchestrator-defaults.ncl**:\n\n```\n# Before\nqueue = {\n max_concurrent_tasks = 5,\n},\n\n# After\nqueue = {\n max_concurrent_tasks = 10, # Increased default\n},\n```\n\n**Then**:\n1. Test with: `nickel eval configs/orchestrator.solo.ncl`\n2. Verify forms still work\n3. Update documentation if default meaning changes\n\n### Changing Mode Override\n\n**deployment/solo-defaults.ncl**:\n\n```\n# Before\norchestrator = {\n workers = 2,\n}\n\n# After\norchestrator = {\n workers = 1, # Reduce to 1 for solo\n}\n```\n\n## Best Practices\n\n1. **Keep it conservative** - Default to safe, minimal values\n2. **Document overrides** - Explain why mode-specific values differ\n3. **Use composition** - Import and merge rather than duplicate\n4. **Test composition** - Verify defaults merge correctly with modes\n5. **Provide examples** - Use `examples/` directory to show realistic setups\n\n## Testing Defaults\n\n```\n# Evaluate defaults\nnickel eval provisioning/.typedialog/provisioning/platform/defaults/orchestrator-defaults.ncl\n\n# Test merged defaults (base + mode)\nnickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl | head -50\n\n# Typecheck with schemas\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n```\n\n## Default Value Guidelines\n\n### Ports\n- Solo mode: Local (127.0.0.1) only\n- Multi-user/Enterprise: Bind all interfaces (0.0.0.0)\n- Never conflict with system services\n\n### Workers/Concurrency\n- Solo: 1-2 workers, limited concurrency\n- Multi-user: 4-8 workers, moderate concurrency\n- Enterprise: 8+ workers, high concurrency\n\n### Resources\n- Solo: 2 CPU, 4GB RAM (laptop testing)\n- Multi-user: 4 CPU, 8GB RAM (team servers)\n- Enterprise: 16+ CPU, 32+ GB RAM (production)\n\n### Security\n- Solo: Disabled/minimal (local development)\n- Multi-user: RBAC enabled (shared team)\n- Enterprise: MFA required, KMS backend (production)\n\n### Storage\n- Solo: Filesystem or RocksDB (no infrastructure needed)\n- Multi-user: PostgreSQL or SurrealDB (team data)\n- Enterprise: SurrealDB cluster with replication (HA)\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 diff --git a/schemas/platform/examples/README.md b/schemas/platform/examples/README.md index 01cc9c5..6a19f5a 100644 --- a/schemas/platform/examples/README.md +++ b/schemas/platform/examples/README.md @@ -1 +1 @@ -# Provisioning Platform Configuration Examples\n\nProduction-ready reference configurations demonstrating different deployment scenarios and best practices.\n\n## Purpose\n\nExamples provide:\n- **Real-world configurations** - Complete, tested working setups ready for production use\n- **Best practices** - Recommended patterns, values, and architectural approaches\n- **Learning resource** - How to use the configuration system effectively\n- **Starting point** - Copy, customize, and deploy for your environment\n- **Documentation** - Detailed inline comments explaining every configuration option\n\n## Quick Start\n\nChoose your deployment mode and get started immediately:\n\n```\n# Solo development (local, single developer)\nnickel export --format toml orchestrator-solo.ncl > orchestrator.toml\n\n# Team collaboration (PostgreSQL, RBAC, audit logging)\nnickel export --format toml control-center-multiuser.ncl > control-center.toml\n\n# Production enterprise (HA, SurrealDB cluster, full monitoring)\nnickel export --format toml full-platform-enterprise.ncl > platform.toml\n```\n\n## Example Configurations by Mode\n\n### 1. orchestrator-solo.ncl\n\n**Deployment Mode**: Solo (Single Developer)\n\n**Resource Requirements**:\n- CPU: 2 cores\n- RAM: 4 GB\n- Disk: 50 GB (local data)\n\n**Configuration Highlights**:\n- **Workspace**: Local `dev-workspace` at `/home/developer/provisioning/data/orchestrator`\n- **Server**: Localhost binding (127.0.0.1:9090), 2 workers, 128 connections max\n- **Storage**: Filesystem backend (no external database required)\n- **Queue**: 3 max concurrent tasks (minimal for development)\n- **Batch**: 2 parallel limit with frequent checkpointing (every 50 operations)\n- **Logging**: Debug level, human-readable text format, concurrent stdout + file output\n- **Security**: Auth disabled, CORS allows all origins, no TLS\n- **Monitoring**: Health checks only (metrics disabled), resource tracking disabled\n- **Features**: Experimental features enabled for testing and iteration\n\n**Ideal For**:\n- ✅ Single developer local development\n- ✅ Quick prototyping and experimentation\n- ✅ Learning the provisioning platform\n- ✅ CI/CD local testing without external services\n\n**Key Advantages**:\n- No external dependencies (database-free)\n- Fast startup (<10 seconds)\n- Minimal resource footprint\n- Verbose debug logging for troubleshooting\n- Zero security overhead\n\n**Key Limitations**:\n- Localhost-only (not accessible remotely)\n- Single-threaded processing (3 concurrent tasks max)\n- No persistence across restarts (if using `:memory:` storage)\n- No audit logging\n\n**Usage**:\n\n```\n# Export to TOML and run\nnickel export --format toml orchestrator-solo.ncl > orchestrator.solo.toml\nORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator\n\n# With TypeDialog interactive configuration\nnu ../../scripts/configure.nu orchestrator solo --backend cli\n```\n\n**Customization Examples**:\n\n```\n# Increase concurrency for testing (still development-friendly)\nqueue.max_concurrent_tasks = 5\n\n# Reduce debug noise for cleaner logs\nlogging.level = "info"\n\n# Change workspace location\nworkspace.path = "/path/to/my/workspace"\n```\n\n---\n\n### 2. orchestrator-enterprise.ncl\n\n**Deployment Mode**: Enterprise (Production High-Availability)\n\n**Resource Requirements**:\n- CPU: 8+ cores (recommended 16)\n- RAM: 16+ GB (recommended 32+ GB)\n- Disk: 500+ GB (SurrealDB cluster)\n\n**Configuration Highlights**:\n- **Workspace**: Production `production` workspace at `/var/lib/provisioning/orchestrator` with multi-workspace support enabled\n- **Server**: All interfaces binding (0.0.0.0:9090), 16 workers, 4096 connections max\n- **Storage**: SurrealDB cluster (3 nodes) for distributed storage and high availability\n- **Queue**: 100 max concurrent tasks, 5 retry attempts, 2-hour timeout for long-running operations\n- **Batch**: 50 parallel limit with frequent checkpointing (every 1000 operations) and automatic cleanup\n- **Logging**: Info level, JSON structured format for log aggregation\n - Standard logs: 500MB files, kept 30 versions (90 days)\n - Audit logs: 200MB files, kept 365 versions (1 year)\n- **Security**: JWT authentication required, specific CORS origins, TLS 1.3 mandatory, 10,000 RPS rate limit\n- **Extensions**: Auto-load from OCI registry with daily refresh, 10 concurrent initializations\n- **Monitoring**:\n - Metrics every 10 seconds\n - Profiling at 10% sample rate\n - Resource tracking with CPU/memory/disk alerts\n - Health checks every 30 seconds\n- **Features**: Audit logging, task history, performance tracking all enabled\n\n**Ideal For**:\n- ✅ Production deployments with SLAs\n- ✅ High-throughput, mission-critical workloads\n- ✅ Multi-team environments requiring audit trails\n- ✅ Large-scale infrastructure deployments\n- ✅ Compliance and governance requirements\n\n**Key Advantages**:\n- High availability (3 SurrealDB replicas with failover)\n- Production security (JWT + TLS 1.3 mandatory)\n- Full observability (metrics, profiling, audit logs)\n- High throughput (100 concurrent tasks)\n- Extension management via OCI registry\n- Automatic rollback and recovery capabilities\n\n**Key Limitations**:\n- Requires SurrealDB cluster setup and maintenance\n- Resource-intensive (8+ CPU, 16+ GB RAM minimum)\n- More complex initial setup and configuration\n- Requires secrets management (JWT keys, TLS certificates)\n- Network isolation and load balancing setup required\n\n**Environment Variables Required**:\n\n```\nexport JWT_SECRET=""\nexport SURREALDB_PASSWORD=""\n```\n\n**Usage**:\n\n```\n# Deploy standalone with SurrealDB\nnickel export --format toml orchestrator-enterprise.ncl > orchestrator.enterprise.toml\nORCHESTRATOR_CONFIG=orchestrator.enterprise.toml cargo run --bin orchestrator\n\n# Deploy to Kubernetes with all enterprise infrastructure\nnu ../../scripts/render-kubernetes.nu enterprise --namespace production\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/*.yaml\n```\n\n**Customization Examples**:\n\n```\n# Adjust concurrency for your specific infrastructure\nqueue.max_concurrent_tasks = 50 # Scale down if resource-constrained\n\n# Change SurrealDB cluster endpoints\nstorage.surrealdb_url = "surrealdb://node1:8000,node2:8000,node3:8000"\n\n# Modify audit log retention for compliance\nlogging.outputs[1].rotation.max_backups = 2555 # 7 years for HIPAA compliance\n\n# Increase rate limiting for high-frequency integrations\nsecurity.rate_limit.requests_per_second = 20000\n```\n\n---\n\n### 3. control-center-multiuser.ncl\n\n**Deployment Mode**: MultiUser (Team Collaboration & Staging)\n\n**Resource Requirements**:\n- CPU: 4 cores\n- RAM: 8 GB\n- Disk: 100 GB (PostgreSQL data + logs)\n\n**Configuration Highlights**:\n- **Server**: All interfaces binding (0.0.0.0:8080), 4 workers, 256 connections max\n- **Database**: PostgreSQL with connection pooling (min 5, max 20 connections)\n- **Auth**: JWT with 8-hour token expiration (aligned with team workday)\n- **RBAC**: 4 pre-defined roles with granular permissions\n - `admin`: Infrastructure lead with full access (`*` permissions)\n - `operator`: Operations team - execute, manage, view workflows and policies\n - `developer`: Development team - read-only workflow and policy access\n - `viewer`: Minimal read-only for non-technical stakeholders\n- **MFA**: Optional per-user (TOTP + email methods available, not globally required)\n- **Password Policies**: 12-character minimum, requires uppercase/lowercase/digits, 90-day rotation, history count of 3\n- **Session Policies**: 8-hour maximum duration, 1-hour idle timeout, 3 concurrent sessions per user\n- **Rate Limiting**: 1000 RPS global, 100 RPS per-user, 20 burst requests\n- **CORS**: Allows localhost:3000 (dev), control-center.example.com, orchestrator.example.com\n- **Logging**: Info level, JSON format, 200MB files kept 15 versions (90 days retention)\n- **Features**: Audit logging enabled, policy enforcement enabled\n\n**Ideal For**:\n- ✅ Team collaboration (2-50 engineers)\n- ✅ Staging environments before production\n- ✅ Development team operations\n- ✅ RBAC with different access levels\n- ✅ Compliance-light environments (SOC2 optional)\n\n**Key Advantages**:\n- Team-friendly security (optional MFA, reasonable password policy)\n- RBAC supports different team roles and responsibilities\n- Persistent storage (PostgreSQL) maintains state across restarts\n- Audit trail for basic compliance\n- Flexible session management (multiple concurrent sessions)\n- Good balance of security and usability\n\n**Key Limitations**:\n- Requires PostgreSQL database setup\n- Single replica (not HA by default)\n- More complex than solo mode\n- RBAC requires careful role definition\n\n**Environment Variables Required**:\n\n```\nexport DB_PASSWORD=""\nexport JWT_SECRET=""\n```\n\n**Usage**:\n\n```\n# Generate and deploy\nnickel export --format toml control-center-multiuser.ncl > control-center.multiuser.toml\nCONTROL_CENTER_CONFIG=control-center.multiuser.toml cargo run --bin control-center\n\n# With Docker Compose for team\nnu ../../scripts/render-docker-compose.nu multiuser\ndocker-compose -f docker-compose.multiuser.yml up -d\n\n# Access the UI\n# http://localhost:8080 (or your configured domain)\n```\n\n**RBAC Quick Reference**:\n\n| Role | Intended Users | Key Permissions |\n| ------ | ---------------- | ----------------- |\n| admin | Infrastructure leads | All operations: full access |\n| operator | Operations engineers | Execute workflows, manage tasks, view policies |\n| developer | Application developers | View workflows, view policies (read-only) |\n| viewer | Non-technical (PM, QA) | View workflows only (minimal read) |\n\n**Customization Examples**:\n\n```\n# Require MFA globally for higher security\nmfa.required = true\n\n# Add custom role for auditors\nrbac.roles.auditor = {\n description = "Compliance auditor",\n permissions = ["audit.view", "orchestrator.view"],\n}\n\n# Adjust for larger team (more concurrent sessions)\npolicies.session.max_concurrent = 5\n\n# Stricter password policy for regulated industry\npolicies.password = {\n min_length = 16,\n require_special_chars = true,\n expiration_days = 60,\n history_count = 8,\n}\n```\n\n---\n\n### 4. full-platform-enterprise.ncl\n\n**Deployment Mode**: Enterprise Integrated (Complete Platform)\n\n**Resource Requirements**:\n- CPU: 16+ cores (3 replicas × 4 cores each + infrastructure)\n- RAM: 32+ GB (orchestrator 12GB + control-center 4GB + databases 12GB + monitoring 4GB)\n- Disk: 1+ TB (databases, logs, metrics, artifacts)\n\n**Services Configured**:\n\n**Orchestrator Section**:\n- SurrealDB cluster (3 nodes) for distributed workflow storage\n- 100 concurrent tasks with 5 retry attempts\n- Full audit logging and monitoring\n- JWT authentication with configurable token expiration\n- Extension loading from OCI registry\n- High-performance tuning (16 workers, 4096 connections)\n\n**Control Center Section**:\n- PostgreSQL HA backend for policy/RBAC storage\n- Full RBAC (4 roles with 7+ permissions each)\n- MFA required (TOTP + email methods)\n- SOC2 compliance enabled with audit logging\n- Strict password policy (16+ chars, special chars required)\n- 30-minute session idle timeout for security\n- Per-user rate limiting (100 RPS)\n\n**MCP Server Section**:\n- Claude integration for AI-powered provisioning\n- Full MCP capability support (tools, resources, prompts, sampling)\n- Orchestrator and Control Center integration\n- Read-only filesystem access with 10MB file limit\n- JWT authentication\n- Advanced audit logging (all requests logged except sensitive data)\n- 100 RPS rate limiting with 20-request burst\n\n**Global Configuration**:\n\n```\nlet deployment_mode = "enterprise"\nlet namespace = "provisioning"\nlet domain = "provisioning.example.com"\nlet environment = "production"\n```\n\n**Infrastructure Components** (when deployed to Kubernetes):\n- Load Balancer (Nginx) - TLS termination, CORS, rate limiting\n- 3x Orchestrator replicas - Distributed processing\n- 2x Control Center replicas - Policy management\n- 1-2x MCP Server replicas - AI integration\n- PostgreSQL HA - Primary/replica setup\n- SurrealDB cluster - 3 nodes with replication\n- Prometheus - Metrics collection\n- Grafana - Visualization and dashboards\n- Loki - Log aggregation\n- Harbor - Private OCI image registry\n\n**Ideal For**:\n- ✅ Production deployments with full SLAs\n- ✅ Enterprise compliance requirements (SOC2, HIPAA)\n- ✅ Multi-team organizations\n- ✅ AI/LLM integration for provisioning\n- ✅ Large-scale infrastructure management (1000+ resources)\n- ✅ High-availability deployments with 99.9%+ uptime requirements\n\n**Key Advantages**:\n- Complete service integration (no missing pieces)\n- Production-grade HA setup (3 replicas, load balancing)\n- Full compliance and audit capabilities\n- AI/LLM integration via MCP Server\n- Comprehensive monitoring and observability\n- Clear separation of concerns per service\n- Global variables for easy parameterization\n\n**Key Limitations**:\n- Complex setup requiring multiple services\n- Resource-intensive (16+ CPU, 32+ GB RAM minimum)\n- Requires Kubernetes or advanced Docker Compose setup\n- Multiple databases to maintain (PostgreSQL + SurrealDB)\n- Network setup complexity (TLS, CORS, rate limiting)\n\n**Environment Variables Required**:\n\n```\n# Database credentials\nexport DB_PASSWORD=""\nexport SURREALDB_PASSWORD=""\n\n# Security\nexport JWT_SECRET=""\nexport KMS_KEY=""\n\n# AI/LLM integration\nexport CLAUDE_API_KEY=""\nexport CLAUDE_MODEL="claude-3-opus-20240229"\n\n# TLS certificates (for production)\nexport TLS_CERT=""\nexport TLS_KEY=""\n```\n\n**Architecture Diagram**:\n\n```\n┌───────────────────────────────────────────────┐\n│ Nginx Load Balancer (TLS, CORS, RateLimit) │\n│ https://orchestrator.example.com │\n│ https://control-center.example.com │\n│ https://mcp.example.com │\n└──────────┬──────────────────────┬─────────────┘\n │ │\n ┌──────▼──────┐ ┌────────▼────────┐\n │ Orchestrator│ │ Control Center │\n │ (3 replicas)│ │ (2 replicas) │\n └──────┬──────┘ └────────┬────────┘\n │ │\n ┌──────▼──────┐ ┌────────▼────────┐ ┌─────────────────┐\n │ SurrealDB │ │ PostgreSQL HA │ │ MCP Server │\n │ Cluster │ │ │ │ (1-2 replicas) │\n │ (3 nodes) │ │ Primary/Replica│ │ │\n └─────────────┘ └─────────────────┘ │ ↓ Claude API │\n └─────────────────┘\n\n ┌─────────────────────────────────────────────────┐\n │ Observability Stack (Optional) │\n ├──────────────────┬──────────────────────────────┤\n │ Prometheus │ Grafana │ Loki │\n │ (Metrics) │ (Dashboards) │ (Logs) │\n └──────────────────┴──────────────────────────────┘\n```\n\n**Usage**:\n\n```\n# Export complete configuration\nnickel export --format toml full-platform-enterprise.ncl > platform.toml\n\n# Extract individual service configs if needed\n# (Each service extracts its section from platform.toml)\n\n# Deploy to Kubernetes with all enterprise infrastructure\nnu ../../scripts/render-kubernetes.nu enterprise --namespace production\n\n# Apply all manifests\nkubectl create namespace production\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/*.yaml\n\n# Or deploy with Docker Compose for single-node testing\nnu ../../scripts/render-docker-compose.nu enterprise\ndocker-compose -f docker-compose.enterprise.yml up -d\n```\n\n**Customization Examples**:\n\n```\n# Adjust deployment domain\nlet domain = "my-company.com"\nlet namespace = "infrastructure"\n\n# Scale for higher throughput\norchestrator.queue.max_concurrent_tasks = 200\norchestrator.security.rate_limit.requests_per_second = 50000\n\n# Add HIPAA compliance\ncontrol_center.policies.compliance.hipaa.enabled = true\ncontrol_center.policies.audit.retention_days = 2555 # 7 years\n\n# Custom MCP Server model\nmcp_server.integration.claude.model = "claude-3-sonnet-20240229"\n\n# Enable caching for performance\nmcp_server.features.enable_caching = true\nmcp_server.performance.cache_ttl = 7200\n```\n\n---\n\n## Deployment Mode Comparison Matrix\n\n| Feature | Solo | MultiUser | Enterprise |\n| --------- | ------ | ----------- | ----------- |\n| **Ideal For** | Dev | Team/Staging | Production |\n| **Storage** | Filesystem | PostgreSQL | SurrealDB Cluster |\n| **Replicas** | 1 | 1 | 3+ (HA) |\n| **Max Concurrency** | 3 tasks | 5-10 | 100 |\n| **Security** | None | RBAC + JWT | Full + MFA + SOC2 |\n| **Monitoring** | Health check | Basic | Full (Prom+Grafana) |\n| **Setup Time** | <5 min | 15 min | 30+ min |\n| **Min CPU** | 2 | 4 | 16 |\n| **Min RAM** | 4GB | 8GB | 32GB |\n| **Audit Logs** | No | 90 days | 365 days |\n| **TLS Required** | No | No | Yes |\n| **Compliance** | None | Basic | SOC2 + HIPAA ready |\n\n---\n\n## Getting Started Guide\n\n### Step 1: Choose Your Deployment Mode\n\n- **Solo**: Single developer working locally → Use `orchestrator-solo.ncl`\n- **Team**: 2-50 engineers, staging environment → Use `control-center-multiuser.ncl`\n- **Production**: Full enterprise deployment → Use `full-platform-enterprise.ncl`\n\n### Step 2: Export Configuration to TOML\n\n```\n# Start with solo mode\nnickel export --format toml orchestrator-solo.ncl > orchestrator.toml\n\n# Validate the export\ncat orchestrator.toml | head -20\n```\n\n### Step 3: Validate Configuration\n\n```\n# Typecheck the Nickel configuration\nnickel typecheck orchestrator-solo.ncl\n\n# Validate using provided script\nnu ../../scripts/validate-config.nu orchestrator-solo.ncl\n```\n\n### Step 4: Customize for Your Environment\n\nEdit the exported `.toml` or the `.ncl` file:\n\n```\n# Option A: Edit TOML directly (simpler)\nvi orchestrator.toml # Change workspace path, port, etc.\n\n# Option B: Edit Nickel and re-export (type-safe)\nvi orchestrator-solo.ncl\nnickel export --format toml orchestrator-solo.ncl > orchestrator.toml\n```\n\n### Step 5: Deploy\n\n```\n# Docker Compose\nORCHESTRATOR_CONFIG=orchestrator.toml docker-compose up -d\n\n# Direct Rust execution\nORCHESTRATOR_CONFIG=orchestrator.toml cargo run --bin orchestrator\n\n# Kubernetes\nnu ../../scripts/render-kubernetes.nu solo\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/*.yaml\n```\n\n---\n\n## Common Customizations\n\n### Changing Domain/Namespace\n\nIn any `.ncl` file at the top:\n\n```\nlet domain = "your-domain.com"\nlet namespace = "your-namespace"\nlet environment = "your-env"\n```\n\n### Increasing Resource Limits\n\nFor higher throughput:\n\n```\nqueue.max_concurrent_tasks = 200 # Default: 100\nsecurity.rate_limit.requests_per_second = 50000 # Default: 10000\nserver.workers = 32 # Default: 16\n```\n\n### Enabling Compliance Features\n\nFor regulated environments:\n\n```\npolicies.compliance.soc2.enabled = true\npolicies.compliance.hipaa.enabled = true\npolicies.audit.retention_days = 2555 # 7 years\n```\n\n### Custom Logging\n\nFor troubleshooting:\n\n```\nlogging.level = "debug" # Default: info\nlogging.format = "text" # Default: json (use text for development)\nlogging.outputs[0].level = "debug" # stdout level\n```\n\n---\n\n## Validation & Testing\n\n### Syntax Validation\n\n```\n# Typecheck all examples\nfor f in *.ncl; do\n echo "Checking $f..."\n nickel typecheck "$f"\ndone\n```\n\n### Configuration Export\n\n```\n# Export to TOML\nnickel export --format toml orchestrator-solo.ncl | head -30\n\n# Export to JSON\nnickel export --format json full-platform-enterprise.ncl | jq '.orchestrator.server'\n```\n\n### Load in Rust Application\n\n```\n# With dry-run flag (if supported)\nORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator -- --validate\n\n# Or simply attempt startup\nORCHESTRATOR_CONFIG=orchestrator.solo.toml timeout 5 cargo run --bin orchestrator\n```\n\n---\n\n## Troubleshooting\n\n### "Type mismatch" Error\n\n**Cause**: Field value doesn't match expected type\n\n**Fix**: Check the schema for correct type. Common issues:\n- Use `true`/`false` not `"true"`/`"false"` for booleans\n- Use `9090` not `"9090"` for numbers\n- Use record syntax `{ key = value }` not `{ "key": value }`\n\n### Port Already in Use\n\n**Fix**: Change the port in your configuration:\n\n```\nserver.port = 9999 # Instead of 9090\n```\n\n### Database Connection Errors\n\n**Fix**: For multiuser/enterprise modes:\n- Ensure PostgreSQL is running: `docker-compose up -d postgres`\n- Verify credentials in environment variables\n- Check network connectivity\n- Validate connection string format\n\n### Import Not Found\n\n**Fix**: Ensure all relative paths in imports are correct:\n\n```\n# Correct (relative to examples/)\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\n\n# Wrong (absolute path)\nlet defaults = import "/full/path/to/defaults.ncl" in\n```\n\n---\n\n## Best Practices\n\n1. **Start Small**: Begin with solo mode, graduate to multiuser, then enterprise\n2. **Environment Variables**: Never hardcode secrets, use environment variables\n3. **Version Control**: Keep examples in Git with clear comments\n4. **Validation**: Always typecheck and export before deploying\n5. **Documentation**: Add comments explaining non-obvious configuration choices\n6. **Testing**: Deploy to staging first, validate all services before production\n7. **Monitoring**: Enable metrics and logging from day one for easier troubleshooting\n8. **Backups**: Regular backups of database state and configurations\n\n---\n\n## Adding New Examples\n\n### Create a Custom Example\n\n```\n# Copy an existing example as template\ncp orchestrator-solo.ncl orchestrator-custom.ncl\n\n# Edit for your use case\nvi orchestrator-custom.ncl\n\n# Validate\nnickel typecheck orchestrator-custom.ncl\n\n# Export and test\nnickel export --format toml orchestrator-custom.ncl > orchestrator.custom.toml\n```\n\n### Naming Convention\n\n- **Service + Mode**: `{service}-{mode}.ncl` (orchestrator-solo.ncl)\n- **Scenario**: `{service}-{scenario}.ncl` (orchestrator-high-throughput.ncl)\n- **Full Stack**: `full-platform-{mode}.ncl` (full-platform-enterprise.ncl)\n\n---\n\n## See Also\n\n- **Parent README**: `../README.md` - Complete configuration system overview\n- **Schemas**: `../schemas/` - Type definitions and validation rules\n- **Defaults**: `../defaults/` - Base configurations for composition\n- **Scripts**: `../scripts/` - Automation for configuration workflow\n- **Forms**: `../forms/` - Interactive TypeDialog form definitions\n\n---\n\n**Version**: 2.0\n**Last Updated**: 2025-01-05\n**Status**: Production Ready - All examples tested and validated\n\n## Using Examples\n\n### View Example\n\n```\ncat provisioning/.typedialog/provisioning/platform/examples/orchestrator-solo.ncl\n```\n\n### Copy and Customize\n\n```\n# Start with solo example\ncp examples/orchestrator-solo.ncl values/orchestrator.solo.ncl\n\n# Edit for your environment\nvi values/orchestrator.solo.ncl\n\n# Validate\nnu scripts/validate-config.nu values/orchestrator.solo.ncl\n```\n\n### Generate from Example\n\n```\n# Use example as base, regenerate with TypeDialog\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n## Example Structure\n\nEach example is a complete Nickel configuration:\n\n```\n# orchestrator-solo.ncl\n{\n orchestrator = {\n workspace = { },\n server = { },\n storage = { },\n queue = { },\n monitoring = { },\n },\n}\n```\n\n## Configuration Elements\n\n### Workspace Configuration\n- **name** - Workspace identifier\n- **path** - Directory path\n- **enabled** - Enable/disable flag\n- **multi_workspace** - Support multiple workspaces\n\n### Server Configuration\n- **host** - Bind address (127.0.0.1 for solo, 0.0.0.0 for public)\n- **port** - Listen port\n- **workers** - Thread count (mode-dependent)\n- **keep_alive** - Connection keep-alive timeout\n- **max_connections** - Connection limit\n\n### Storage Configuration\n- **backend** - 'filesystem | 'rocksdb | 'surrealdb | 'postgres\n- **path** - Local storage path (filesystem/rocksdb)\n- **connection_string** - DB URL (surrealdb/postgres)\n\n### Queue Configuration (Orchestrator)\n- **max_concurrent_tasks** - Concurrent task limit\n- **retry_attempts** - Retry count\n- **retry_delay** - Delay between retries (ms)\n- **task_timeout** - Task execution timeout (ms)\n\n### Monitoring Configuration (Optional)\n- **enabled** - Enable metrics collection\n- **metrics_interval** - Collection frequency (seconds)\n- **health_check_interval** - Health check frequency\n\n## Creating New Examples\n\n### 1. Start with Existing Example\n\n```\ncp examples/orchestrator-solo.ncl examples/orchestrator-custom.ncl\n```\n\n### 2. Modify for Your Use Case\n\n```\n# Update configuration values\norchestrator.server.workers = 8 # More workers\norchestrator.queue.max_concurrent_tasks = 20 # Higher concurrency\n```\n\n### 3. Validate Configuration\n\n```\nnickel typecheck examples/orchestrator-custom.ncl\nnickel eval examples/orchestrator-custom.ncl\n```\n\n### 4. Document Purpose\nAdd comments explaining:\n- Use case (deployment scenario)\n- Resource requirements\n- Expected load\n- Customization needed\n\n### 5. Save as Reference\n\n```\nmv examples/orchestrator-custom.ncl examples/orchestrator-{scenario}.ncl\n```\n\n## Best Practices for Examples\n\n1. **Clear documentation** - Explain the use case at the top\n2. **Realistic values** - Use production-appropriate configurations\n3. **Complete configuration** - Include all required sections\n4. **Inline comments** - Explain non-obvious choices\n5. **Validated** - Typecheck all examples before committing\n6. **Organized** - Group by service and deployment mode\n\n## Example Naming Convention\n\n- **Service-mode**: `{service}-{mode}.ncl` (orchestrator-solo.ncl)\n- **Scenario**: `{service}-{scenario}.ncl` (orchestrator-gpu-intensive.ncl)\n- **Full stack**: `full-platform-{mode}.ncl` (full-platform-enterprise.ncl)\n\n## Customizing Examples\n\n### For Your Environment\n\n```\n# orchestrator-solo.ncl (customized)\n{\n orchestrator = {\n workspace = {\n name = "my-workspace", # Your workspace name\n path = "/home/user/projects/workspace", # Your path\n },\n server = {\n host = "127.0.0.1", # Keep local for solo\n port = 9090,\n },\n storage = {\n backend = 'filesystem, # No external DB needed\n path = "/home/user/provisioning/data", # Your path\n },\n },\n}\n```\n\n### For Different Resources\n\n```\n# orchestrator-multiuser.ncl (customized for team)\n{\n orchestrator = {\n server = {\n host = "0.0.0.0", # Public binding\n port = 9090,\n workers = 4, # Team concurrency\n },\n queue = {\n max_concurrent_tasks = 10, # Team workload\n },\n },\n}\n```\n\n## Testing Examples\n\n```\n# Typecheck example\nnickel typecheck examples/orchestrator-solo.ncl\n\n# Evaluate and view\nnickel eval examples/orchestrator-solo.ncl | head -20\n\n# Export to TOML\nnickel export --format toml examples/orchestrator-solo.ncl > test.toml\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 \ No newline at end of file +# Provisioning Platform Configuration Examples\n\nProduction-ready reference configurations demonstrating different deployment scenarios and best practices.\n\n## Purpose\n\nExamples provide:\n- **Real-world configurations** - Complete, tested working setups ready for production use\n- **Best practices** - Recommended patterns, values, and architectural approaches\n- **Learning resource** - How to use the configuration system effectively\n- **Starting point** - Copy, customize, and deploy for your environment\n- **Documentation** - Detailed inline comments explaining every configuration option\n\n## Quick Start\n\nChoose your deployment mode and get started immediately:\n\n```\n# Solo development (local, single developer)\nnickel export --format toml orchestrator-solo.ncl > orchestrator.toml\n\n# Team collaboration (PostgreSQL, RBAC, audit logging)\nnickel export --format toml control-center-multiuser.ncl > control-center.toml\n\n# Production enterprise (HA, SurrealDB cluster, full monitoring)\nnickel export --format toml full-platform-enterprise.ncl > platform.toml\n```\n\n## Example Configurations by Mode\n\n### 1. orchestrator-solo.ncl\n\n**Deployment Mode**: Solo (Single Developer)\n\n**Resource Requirements**:\n- CPU: 2 cores\n- RAM: 4 GB\n- Disk: 50 GB (local data)\n\n**Configuration Highlights**:\n- **Workspace**: Local `dev-workspace` at `/home/developer/provisioning/data/orchestrator`\n- **Server**: Localhost binding (127.0.0.1:9090), 2 workers, 128 connections max\n- **Storage**: Filesystem backend (no external database required)\n- **Queue**: 3 max concurrent tasks (minimal for development)\n- **Batch**: 2 parallel limit with frequent checkpointing (every 50 operations)\n- **Logging**: Debug level, human-readable text format, concurrent stdout + file output\n- **Security**: Auth disabled, CORS allows all origins, no TLS\n- **Monitoring**: Health checks only (metrics disabled), resource tracking disabled\n- **Features**: Experimental features enabled for testing and iteration\n\n**Ideal For**:\n- ✅ Single developer local development\n- ✅ Quick prototyping and experimentation\n- ✅ Learning the provisioning platform\n- ✅ CI/CD local testing without external services\n\n**Key Advantages**:\n- No external dependencies (database-free)\n- Fast startup (<10 seconds)\n- Minimal resource footprint\n- Verbose debug logging for troubleshooting\n- Zero security overhead\n\n**Key Limitations**:\n- Localhost-only (not accessible remotely)\n- Single-threaded processing (3 concurrent tasks max)\n- No persistence across restarts (if using `:memory:` storage)\n- No audit logging\n\n**Usage**:\n\n```\n# Export to TOML and run\nnickel export --format toml orchestrator-solo.ncl > orchestrator.solo.toml\nORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator\n\n# With TypeDialog interactive configuration\nnu ../../scripts/configure.nu orchestrator solo --backend cli\n```\n\n**Customization Examples**:\n\n```\n# Increase concurrency for testing (still development-friendly)\nqueue.max_concurrent_tasks = 5\n\n# Reduce debug noise for cleaner logs\nlogging.level = "info"\n\n# Change workspace location\nworkspace.path = "/path/to/my/workspace"\n```\n\n---\n\n### 2. orchestrator-enterprise.ncl\n\n**Deployment Mode**: Enterprise (Production High-Availability)\n\n**Resource Requirements**:\n- CPU: 8+ cores (recommended 16)\n- RAM: 16+ GB (recommended 32+ GB)\n- Disk: 500+ GB (SurrealDB cluster)\n\n**Configuration Highlights**:\n- **Workspace**: Production `production` workspace at `/var/lib/provisioning/orchestrator` with multi-workspace support enabled\n- **Server**: All interfaces binding (0.0.0.0:9090), 16 workers, 4096 connections max\n- **Storage**: SurrealDB cluster (3 nodes) for distributed storage and high availability\n- **Queue**: 100 max concurrent tasks, 5 retry attempts, 2-hour timeout for long-running operations\n- **Batch**: 50 parallel limit with frequent checkpointing (every 1000 operations) and automatic cleanup\n- **Logging**: Info level, JSON structured format for log aggregation\n - Standard logs: 500MB files, kept 30 versions (90 days)\n - Audit logs: 200MB files, kept 365 versions (1 year)\n- **Security**: JWT authentication required, specific CORS origins, TLS 1.3 mandatory, 10,000 RPS rate limit\n- **Extensions**: Auto-load from OCI registry with daily refresh, 10 concurrent initializations\n- **Monitoring**:\n - Metrics every 10 seconds\n - Profiling at 10% sample rate\n - Resource tracking with CPU/memory/disk alerts\n - Health checks every 30 seconds\n- **Features**: Audit logging, task history, performance tracking all enabled\n\n**Ideal For**:\n- ✅ Production deployments with SLAs\n- ✅ High-throughput, mission-critical workloads\n- ✅ Multi-team environments requiring audit trails\n- ✅ Large-scale infrastructure deployments\n- ✅ Compliance and governance requirements\n\n**Key Advantages**:\n- High availability (3 SurrealDB replicas with failover)\n- Production security (JWT + TLS 1.3 mandatory)\n- Full observability (metrics, profiling, audit logs)\n- High throughput (100 concurrent tasks)\n- Extension management via OCI registry\n- Automatic rollback and recovery capabilities\n\n**Key Limitations**:\n- Requires SurrealDB cluster setup and maintenance\n- Resource-intensive (8+ CPU, 16+ GB RAM minimum)\n- More complex initial setup and configuration\n- Requires secrets management (JWT keys, TLS certificates)\n- Network isolation and load balancing setup required\n\n**Environment Variables Required**:\n\n```\nexport JWT_SECRET=""\nexport SURREALDB_PASSWORD=""\n```\n\n**Usage**:\n\n```\n# Deploy standalone with SurrealDB\nnickel export --format toml orchestrator-enterprise.ncl > orchestrator.enterprise.toml\nORCHESTRATOR_CONFIG=orchestrator.enterprise.toml cargo run --bin orchestrator\n\n# Deploy to Kubernetes with all enterprise infrastructure\nnu ../../scripts/render-kubernetes.nu enterprise --namespace production\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/*.yaml\n```\n\n**Customization Examples**:\n\n```\n# Adjust concurrency for your specific infrastructure\nqueue.max_concurrent_tasks = 50 # Scale down if resource-constrained\n\n# Change SurrealDB cluster endpoints\nstorage.surrealdb_url = "surrealdb://node1:8000,node2:8000,node3:8000"\n\n# Modify audit log retention for compliance\nlogging.outputs[1].rotation.max_backups = 2555 # 7 years for HIPAA compliance\n\n# Increase rate limiting for high-frequency integrations\nsecurity.rate_limit.requests_per_second = 20000\n```\n\n---\n\n### 3. control-center-multiuser.ncl\n\n**Deployment Mode**: MultiUser (Team Collaboration & Staging)\n\n**Resource Requirements**:\n- CPU: 4 cores\n- RAM: 8 GB\n- Disk: 100 GB (PostgreSQL data + logs)\n\n**Configuration Highlights**:\n- **Server**: All interfaces binding (0.0.0.0:8080), 4 workers, 256 connections max\n- **Database**: PostgreSQL with connection pooling (min 5, max 20 connections)\n- **Auth**: JWT with 8-hour token expiration (aligned with team workday)\n- **RBAC**: 4 pre-defined roles with granular permissions\n - `admin`: Infrastructure lead with full access (`*` permissions)\n - `operator`: Operations team - execute, manage, view workflows and policies\n - `developer`: Development team - read-only workflow and policy access\n - `viewer`: Minimal read-only for non-technical stakeholders\n- **MFA**: Optional per-user (TOTP + email methods available, not globally required)\n- **Password Policies**: 12-character minimum, requires uppercase/lowercase/digits, 90-day rotation, history count of 3\n- **Session Policies**: 8-hour maximum duration, 1-hour idle timeout, 3 concurrent sessions per user\n- **Rate Limiting**: 1000 RPS global, 100 RPS per-user, 20 burst requests\n- **CORS**: Allows localhost:3000 (dev), control-center.example.com, orchestrator.example.com\n- **Logging**: Info level, JSON format, 200MB files kept 15 versions (90 days retention)\n- **Features**: Audit logging enabled, policy enforcement enabled\n\n**Ideal For**:\n- ✅ Team collaboration (2-50 engineers)\n- ✅ Staging environments before production\n- ✅ Development team operations\n- ✅ RBAC with different access levels\n- ✅ Compliance-light environments (SOC2 optional)\n\n**Key Advantages**:\n- Team-friendly security (optional MFA, reasonable password policy)\n- RBAC supports different team roles and responsibilities\n- Persistent storage (PostgreSQL) maintains state across restarts\n- Audit trail for basic compliance\n- Flexible session management (multiple concurrent sessions)\n- Good balance of security and usability\n\n**Key Limitations**:\n- Requires PostgreSQL database setup\n- Single replica (not HA by default)\n- More complex than solo mode\n- RBAC requires careful role definition\n\n**Environment Variables Required**:\n\n```\nexport DB_PASSWORD=""\nexport JWT_SECRET=""\n```\n\n**Usage**:\n\n```\n# Generate and deploy\nnickel export --format toml control-center-multiuser.ncl > control-center.multiuser.toml\nCONTROL_CENTER_CONFIG=control-center.multiuser.toml cargo run --bin control-center\n\n# With Docker Compose for team\nnu ../../scripts/render-docker-compose.nu multiuser\ndocker-compose -f docker-compose.multiuser.yml up -d\n\n# Access the UI\n# http://localhost:8080 (or your configured domain)\n```\n\n**RBAC Quick Reference**:\n\n| Role | Intended Users | Key Permissions |\n| ------ | ---------------- | ----------------- |\n| admin | Infrastructure leads | All operations: full access |\n| operator | Operations engineers | Execute workflows, manage tasks, view policies |\n| developer | Application developers | View workflows, view policies (read-only) |\n| viewer | Non-technical (PM, QA) | View workflows only (minimal read) |\n\n**Customization Examples**:\n\n```\n# Require MFA globally for higher security\nmfa.required = true\n\n# Add custom role for auditors\nrbac.roles.auditor = {\n description = "Compliance auditor",\n permissions = ["audit.view", "orchestrator.view"],\n}\n\n# Adjust for larger team (more concurrent sessions)\npolicies.session.max_concurrent = 5\n\n# Stricter password policy for regulated industry\npolicies.password = {\n min_length = 16,\n require_special_chars = true,\n expiration_days = 60,\n history_count = 8,\n}\n```\n\n---\n\n### 4. full-platform-enterprise.ncl\n\n**Deployment Mode**: Enterprise Integrated (Complete Platform)\n\n**Resource Requirements**:\n- CPU: 16+ cores (3 replicas × 4 cores each + infrastructure)\n- RAM: 32+ GB (orchestrator 12GB + control-center 4GB + databases 12GB + monitoring 4GB)\n- Disk: 1+ TB (databases, logs, metrics, artifacts)\n\n**Services Configured**:\n\n**Orchestrator Section**:\n- SurrealDB cluster (3 nodes) for distributed workflow storage\n- 100 concurrent tasks with 5 retry attempts\n- Full audit logging and monitoring\n- JWT authentication with configurable token expiration\n- Extension loading from OCI registry\n- High-performance tuning (16 workers, 4096 connections)\n\n**Control Center Section**:\n- PostgreSQL HA backend for policy/RBAC storage\n- Full RBAC (4 roles with 7+ permissions each)\n- MFA required (TOTP + email methods)\n- SOC2 compliance enabled with audit logging\n- Strict password policy (16+ chars, special chars required)\n- 30-minute session idle timeout for security\n- Per-user rate limiting (100 RPS)\n\n**MCP Server Section**:\n- Claude integration for AI-powered provisioning\n- Full MCP capability support (tools, resources, prompts, sampling)\n- Orchestrator and Control Center integration\n- Read-only filesystem access with 10MB file limit\n- JWT authentication\n- Advanced audit logging (all requests logged except sensitive data)\n- 100 RPS rate limiting with 20-request burst\n\n**Global Configuration**:\n\n```\nlet deployment_mode = "enterprise"\nlet namespace = "provisioning"\nlet domain = "provisioning.example.com"\nlet environment = "production"\n```\n\n**Infrastructure Components** (when deployed to Kubernetes):\n- Load Balancer (Nginx) - TLS termination, CORS, rate limiting\n- 3x Orchestrator replicas - Distributed processing\n- 2x Control Center replicas - Policy management\n- 1-2x MCP Server replicas - AI integration\n- PostgreSQL HA - Primary/replica setup\n- SurrealDB cluster - 3 nodes with replication\n- Prometheus - Metrics collection\n- Grafana - Visualization and dashboards\n- Loki - Log aggregation\n- Harbor - Private OCI image registry\n\n**Ideal For**:\n- ✅ Production deployments with full SLAs\n- ✅ Enterprise compliance requirements (SOC2, HIPAA)\n- ✅ Multi-team organizations\n- ✅ AI/LLM integration for provisioning\n- ✅ Large-scale infrastructure management (1000+ resources)\n- ✅ High-availability deployments with 99.9%+ uptime requirements\n\n**Key Advantages**:\n- Complete service integration (no missing pieces)\n- Production-grade HA setup (3 replicas, load balancing)\n- Full compliance and audit capabilities\n- AI/LLM integration via MCP Server\n- Comprehensive monitoring and observability\n- Clear separation of concerns per service\n- Global variables for easy parameterization\n\n**Key Limitations**:\n- Complex setup requiring multiple services\n- Resource-intensive (16+ CPU, 32+ GB RAM minimum)\n- Requires Kubernetes or advanced Docker Compose setup\n- Multiple databases to maintain (PostgreSQL + SurrealDB)\n- Network setup complexity (TLS, CORS, rate limiting)\n\n**Environment Variables Required**:\n\n```\n# Database credentials\nexport DB_PASSWORD=""\nexport SURREALDB_PASSWORD=""\n\n# Security\nexport JWT_SECRET=""\nexport KMS_KEY=""\n\n# AI/LLM integration\nexport CLAUDE_API_KEY=""\nexport CLAUDE_MODEL="claude-3-opus-20240229"\n\n# TLS certificates (for production)\nexport TLS_CERT=""\nexport TLS_KEY=""\n```\n\n**Architecture Diagram**:\n\n```\n┌───────────────────────────────────────────────┐\n│ Nginx Load Balancer (TLS, CORS, RateLimit) │\n│ https://orchestrator.example.com │\n│ https://control-center.example.com │\n│ https://mcp.example.com │\n└──────────┬──────────────────────┬─────────────┘\n │ │\n ┌──────▼──────┐ ┌────────▼────────┐\n │ Orchestrator│ │ Control Center │\n │ (3 replicas)│ │ (2 replicas) │\n └──────┬──────┘ └────────┬────────┘\n │ │\n ┌──────▼──────┐ ┌────────▼────────┐ ┌─────────────────┐\n │ SurrealDB │ │ PostgreSQL HA │ │ MCP Server │\n │ Cluster │ │ │ │ (1-2 replicas) │\n │ (3 nodes) │ │ Primary/Replica│ │ │\n └─────────────┘ └─────────────────┘ │ ↓ Claude API │\n └─────────────────┘\n\n ┌─────────────────────────────────────────────────┐\n │ Observability Stack (Optional) │\n ├──────────────────┬──────────────────────────────┤\n │ Prometheus │ Grafana │ Loki │\n │ (Metrics) │ (Dashboards) │ (Logs) │\n └──────────────────┴──────────────────────────────┘\n```\n\n**Usage**:\n\n```\n# Export complete configuration\nnickel export --format toml full-platform-enterprise.ncl > platform.toml\n\n# Extract individual service configs if needed\n# (Each service extracts its section from platform.toml)\n\n# Deploy to Kubernetes with all enterprise infrastructure\nnu ../../scripts/render-kubernetes.nu enterprise --namespace production\n\n# Apply all manifests\nkubectl create namespace production\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/*.yaml\n\n# Or deploy with Docker Compose for single-node testing\nnu ../../scripts/render-docker-compose.nu enterprise\ndocker-compose -f docker-compose.enterprise.yml up -d\n```\n\n**Customization Examples**:\n\n```\n# Adjust deployment domain\nlet domain = "my-company.com"\nlet namespace = "infrastructure"\n\n# Scale for higher throughput\norchestrator.queue.max_concurrent_tasks = 200\norchestrator.security.rate_limit.requests_per_second = 50000\n\n# Add HIPAA compliance\ncontrol_center.policies.compliance.hipaa.enabled = true\ncontrol_center.policies.audit.retention_days = 2555 # 7 years\n\n# Custom MCP Server model\nmcp_server.integration.claude.model = "claude-3-sonnet-20240229"\n\n# Enable caching for performance\nmcp_server.features.enable_caching = true\nmcp_server.performance.cache_ttl = 7200\n```\n\n---\n\n## Deployment Mode Comparison Matrix\n\n| Feature | Solo | MultiUser | Enterprise |\n| --------- | ------ | ----------- | ----------- |\n| **Ideal For** | Dev | Team/Staging | Production |\n| **Storage** | Filesystem | PostgreSQL | SurrealDB Cluster |\n| **Replicas** | 1 | 1 | 3+ (HA) |\n| **Max Concurrency** | 3 tasks | 5-10 | 100 |\n| **Security** | None | RBAC + JWT | Full + MFA + SOC2 |\n| **Monitoring** | Health check | Basic | Full (Prom+Grafana) |\n| **Setup Time** | <5 min | 15 min | 30+ min |\n| **Min CPU** | 2 | 4 | 16 |\n| **Min RAM** | 4GB | 8GB | 32GB |\n| **Audit Logs** | No | 90 days | 365 days |\n| **TLS Required** | No | No | Yes |\n| **Compliance** | None | Basic | SOC2 + HIPAA ready |\n\n---\n\n## Getting Started Guide\n\n### Step 1: Choose Your Deployment Mode\n\n- **Solo**: Single developer working locally → Use `orchestrator-solo.ncl`\n- **Team**: 2-50 engineers, staging environment → Use `control-center-multiuser.ncl`\n- **Production**: Full enterprise deployment → Use `full-platform-enterprise.ncl`\n\n### Step 2: Export Configuration to TOML\n\n```\n# Start with solo mode\nnickel export --format toml orchestrator-solo.ncl > orchestrator.toml\n\n# Validate the export\ncat orchestrator.toml | head -20\n```\n\n### Step 3: Validate Configuration\n\n```\n# Typecheck the Nickel configuration\nnickel typecheck orchestrator-solo.ncl\n\n# Validate using provided script\nnu ../../scripts/validate-config.nu orchestrator-solo.ncl\n```\n\n### Step 4: Customize for Your Environment\n\nEdit the exported `.toml` or the `.ncl` file:\n\n```\n# Option A: Edit TOML directly (simpler)\nvi orchestrator.toml # Change workspace path, port, etc.\n\n# Option B: Edit Nickel and re-export (type-safe)\nvi orchestrator-solo.ncl\nnickel export --format toml orchestrator-solo.ncl > orchestrator.toml\n```\n\n### Step 5: Deploy\n\n```\n# Docker Compose\nORCHESTRATOR_CONFIG=orchestrator.toml docker-compose up -d\n\n# Direct Rust execution\nORCHESTRATOR_CONFIG=orchestrator.toml cargo run --bin orchestrator\n\n# Kubernetes\nnu ../../scripts/render-kubernetes.nu solo\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/*.yaml\n```\n\n---\n\n## Common Customizations\n\n### Changing Domain/Namespace\n\nIn any `.ncl` file at the top:\n\n```\nlet domain = "your-domain.com"\nlet namespace = "your-namespace"\nlet environment = "your-env"\n```\n\n### Increasing Resource Limits\n\nFor higher throughput:\n\n```\nqueue.max_concurrent_tasks = 200 # Default: 100\nsecurity.rate_limit.requests_per_second = 50000 # Default: 10000\nserver.workers = 32 # Default: 16\n```\n\n### Enabling Compliance Features\n\nFor regulated environments:\n\n```\npolicies.compliance.soc2.enabled = true\npolicies.compliance.hipaa.enabled = true\npolicies.audit.retention_days = 2555 # 7 years\n```\n\n### Custom Logging\n\nFor troubleshooting:\n\n```\nlogging.level = "debug" # Default: info\nlogging.format = "text" # Default: json (use text for development)\nlogging.outputs[0].level = "debug" # stdout level\n```\n\n---\n\n## Validation & Testing\n\n### Syntax Validation\n\n```\n# Typecheck all examples\nfor f in *.ncl; do\n echo "Checking $f..."\n nickel typecheck "$f"\ndone\n```\n\n### Configuration Export\n\n```\n# Export to TOML\nnickel export --format toml orchestrator-solo.ncl | head -30\n\n# Export to JSON\nnickel export --format json full-platform-enterprise.ncl | jq '.orchestrator.server'\n```\n\n### Load in Rust Application\n\n```\n# With dry-run flag (if supported)\nORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator -- --validate\n\n# Or simply attempt startup\nORCHESTRATOR_CONFIG=orchestrator.solo.toml timeout 5 cargo run --bin orchestrator\n```\n\n---\n\n## Troubleshooting\n\n### "Type mismatch" Error\n\n**Cause**: Field value doesn't match expected type\n\n**Fix**: Check the schema for correct type. Common issues:\n- Use `true`/`false` not `"true"`/`"false"` for booleans\n- Use `9090` not `"9090"` for numbers\n- Use record syntax `{ key = value }` not `{ "key": value }`\n\n### Port Already in Use\n\n**Fix**: Change the port in your configuration:\n\n```\nserver.port = 9999 # Instead of 9090\n```\n\n### Database Connection Errors\n\n**Fix**: For multiuser/enterprise modes:\n- Ensure PostgreSQL is running: `docker-compose up -d postgres`\n- Verify credentials in environment variables\n- Check network connectivity\n- Validate connection string format\n\n### Import Not Found\n\n**Fix**: Ensure all relative paths in imports are correct:\n\n```\n# Correct (relative to examples/)\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\n\n# Wrong (absolute path)\nlet defaults = import "/full/path/to/defaults.ncl" in\n```\n\n---\n\n## Best Practices\n\n1. **Start Small**: Begin with solo mode, graduate to multiuser, then enterprise\n2. **Environment Variables**: Never hardcode secrets, use environment variables\n3. **Version Control**: Keep examples in Git with clear comments\n4. **Validation**: Always typecheck and export before deploying\n5. **Documentation**: Add comments explaining non-obvious configuration choices\n6. **Testing**: Deploy to staging first, validate all services before production\n7. **Monitoring**: Enable metrics and logging from day one for easier troubleshooting\n8. **Backups**: Regular backups of database state and configurations\n\n---\n\n## Adding New Examples\n\n### Create a Custom Example\n\n```\n# Copy an existing example as template\ncp orchestrator-solo.ncl orchestrator-custom.ncl\n\n# Edit for your use case\nvi orchestrator-custom.ncl\n\n# Validate\nnickel typecheck orchestrator-custom.ncl\n\n# Export and test\nnickel export --format toml orchestrator-custom.ncl > orchestrator.custom.toml\n```\n\n### Naming Convention\n\n- **Service + Mode**: `{service}-{mode}.ncl` (orchestrator-solo.ncl)\n- **Scenario**: `{service}-{scenario}.ncl` (orchestrator-high-throughput.ncl)\n- **Full Stack**: `full-platform-{mode}.ncl` (full-platform-enterprise.ncl)\n\n---\n\n## See Also\n\n- **Parent README**: `../README.md` - Complete configuration system overview\n- **Schemas**: `../schemas/` - Type definitions and validation rules\n- **Defaults**: `../defaults/` - Base configurations for composition\n- **Scripts**: `../scripts/` - Automation for configuration workflow\n- **Forms**: `../forms/` - Interactive TypeDialog form definitions\n\n---\n\n**Version**: 2.0\n**Last Updated**: 2025-01-05\n**Status**: Production Ready - All examples tested and validated\n\n## Using Examples\n\n### View Example\n\n```\ncat provisioning/.typedialog/provisioning/platform/examples/orchestrator-solo.ncl\n```\n\n### Copy and Customize\n\n```\n# Start with solo example\ncp examples/orchestrator-solo.ncl values/orchestrator.solo.ncl\n\n# Edit for your environment\nvi values/orchestrator.solo.ncl\n\n# Validate\nnu scripts/validate-config.nu values/orchestrator.solo.ncl\n```\n\n### Generate from Example\n\n```\n# Use example as base, regenerate with TypeDialog\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n## Example Structure\n\nEach example is a complete Nickel configuration:\n\n```\n# orchestrator-solo.ncl\n{\n orchestrator = {\n workspace = { },\n server = { },\n storage = { },\n queue = { },\n monitoring = { },\n },\n}\n```\n\n## Configuration Elements\n\n### Workspace Configuration\n- **name** - Workspace identifier\n- **path** - Directory path\n- **enabled** - Enable/disable flag\n- **multi_workspace** - Support multiple workspaces\n\n### Server Configuration\n- **host** - Bind address (127.0.0.1 for solo, 0.0.0.0 for public)\n- **port** - Listen port\n- **workers** - Thread count (mode-dependent)\n- **keep_alive** - Connection keep-alive timeout\n- **max_connections** - Connection limit\n\n### Storage Configuration\n- **backend** - 'filesystem | 'rocksdb | 'surrealdb | 'postgres\n- **path** - Local storage path (filesystem/rocksdb)\n- **connection_string** - DB URL (surrealdb/postgres)\n\n### Queue Configuration (Orchestrator)\n- **max_concurrent_tasks** - Concurrent task limit\n- **retry_attempts** - Retry count\n- **retry_delay** - Delay between retries (ms)\n- **task_timeout** - Task execution timeout (ms)\n\n### Monitoring Configuration (Optional)\n- **enabled** - Enable metrics collection\n- **metrics_interval** - Collection frequency (seconds)\n- **health_check_interval** - Health check frequency\n\n## Creating New Examples\n\n### 1. Start with Existing Example\n\n```\ncp examples/orchestrator-solo.ncl examples/orchestrator-custom.ncl\n```\n\n### 2. Modify for Your Use Case\n\n```\n# Update configuration values\norchestrator.server.workers = 8 # More workers\norchestrator.queue.max_concurrent_tasks = 20 # Higher concurrency\n```\n\n### 3. Validate Configuration\n\n```\nnickel typecheck examples/orchestrator-custom.ncl\nnickel eval examples/orchestrator-custom.ncl\n```\n\n### 4. Document Purpose\nAdd comments explaining:\n- Use case (deployment scenario)\n- Resource requirements\n- Expected load\n- Customization needed\n\n### 5. Save as Reference\n\n```\nmv examples/orchestrator-custom.ncl examples/orchestrator-{scenario}.ncl\n```\n\n## Best Practices for Examples\n\n1. **Clear documentation** - Explain the use case at the top\n2. **Realistic values** - Use production-appropriate configurations\n3. **Complete configuration** - Include all required sections\n4. **Inline comments** - Explain non-obvious choices\n5. **Validated** - Typecheck all examples before committing\n6. **Organized** - Group by service and deployment mode\n\n## Example Naming Convention\n\n- **Service-mode**: `{service}-{mode}.ncl` (orchestrator-solo.ncl)\n- **Scenario**: `{service}-{scenario}.ncl` (orchestrator-gpu-intensive.ncl)\n- **Full stack**: `full-platform-{mode}.ncl` (full-platform-enterprise.ncl)\n\n## Customizing Examples\n\n### For Your Environment\n\n```\n# orchestrator-solo.ncl (customized)\n{\n orchestrator = {\n workspace = {\n name = "my-workspace", # Your workspace name\n path = "/home/user/projects/workspace", # Your path\n },\n server = {\n host = "127.0.0.1", # Keep local for solo\n port = 9090,\n },\n storage = {\n backend = 'filesystem, # No external DB needed\n path = "/home/user/provisioning/data", # Your path\n },\n },\n}\n```\n\n### For Different Resources\n\n```\n# orchestrator-multiuser.ncl (customized for team)\n{\n orchestrator = {\n server = {\n host = "0.0.0.0", # Public binding\n port = 9090,\n workers = 4, # Team concurrency\n },\n queue = {\n max_concurrent_tasks = 10, # Team workload\n },\n },\n}\n```\n\n## Testing Examples\n\n```\n# Typecheck example\nnickel typecheck examples/orchestrator-solo.ncl\n\n# Evaluate and view\nnickel eval examples/orchestrator-solo.ncl | head -20\n\n# Export to TOML\nnickel export --format toml examples/orchestrator-solo.ncl > test.toml\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 diff --git a/schemas/platform/schemas/README.md b/schemas/platform/schemas/README.md index 3f5123e..9ff771b 100644 --- a/schemas/platform/schemas/README.md +++ b/schemas/platform/schemas/README.md @@ -1 +1 @@ -# Schemas\n\nNickel type contracts defining configuration structure and validation for all services.\n\n## Purpose\n\nSchemas define:\n- **Type safety** - Required/optional fields, valid types (string, number, bool, record)\n- **Value constraints** - Enum values, numeric bounds (via contracts)\n- **Documentation** - Field descriptions and usage patterns\n- **Composition** - Inheritance and merging of schema types\n\n## File Organization\n\n```\nschemas/\n├── README.md # This file\n├── common/ # Shared schemas (server, database, security, etc.)\n│ ├── server.ncl # HTTP server configuration schema\n│ ├── database.ncl # Database backend schema\n│ ├── security.ncl # Authentication and security schema\n│ ├── monitoring.ncl # Metrics and health checks schema\n│ ├── logging.ncl # Log level and format schema\n│ ├── network.ncl # Network binding and TLS schema\n│ ├── storage.ncl # Storage backend schema\n│ └── workspace.ncl # Workspace configuration schema\n├── deployment/ # Mode-specific schemas\n│ ├── solo.ncl # Solo mode resource constraints\n│ ├── multiuser.ncl # Multi-user mode schema\n│ ├── cicd.ncl # CI/CD mode schema\n│ └── enterprise.ncl # Enterprise HA schema\n├── orchestrator.ncl # Orchestrator service schema\n├── control-center.ncl # Control Center service schema\n├── mcp-server.ncl # MCP Server service schema\n└── installer.ncl # Installer service schema\n```\n\n## Schema Patterns\n\n### 1. Basic Schema Definition\n\n```\n# schemas/common/server.ncl\n{\n Server = {\n host | String, # Required string field\n port | Number, # Required number field\n workers | Number | default = 4, # Optional with default\n keep_alive | Number | optional, # Optional field\n max_connections | Number | optional,\n },\n}\n```\n\n### 2. Type with Contract Validation\n\n```\n# With constraint checking (via validators)\n{\n WorkerCount =\n let valid_range = fun n =>\n if n < 1 then\n std.contract.blame "Workers must be >= 1" n\n else if n > 32 then\n std.contract.blame "Workers must be <= 32" n\n else\n n\n in\n Number | valid_range,\n}\n```\n\n### 3. Record Merging (Composition)\n\n```\n# schemas/orchestrator.ncl\nlet server_schema = import "./common/server.ncl" in\nlet database_schema = import "./common/database.ncl" in\n\n{\n OrchestratorConfig = {\n workspace | {\n name | String,\n path | String,\n enabled | Bool | default = true,\n },\n server | server_schema.Server, # Reuse Server schema\n storage | database_schema.Database, # Reuse Database schema\n queue | {\n max_concurrent_tasks | Number,\n retry_attempts | Number | default = 3,\n },\n },\n}\n```\n\n## Common Schemas\n\n### server.ncl\nHTTP server configuration:\n- `host` - Bind address (string)\n- `port` - Listen port (number)\n- `workers` - Thread count (number, optional)\n- `keep_alive` - Keep-alive timeout (number, optional)\n- `max_connections` - Connection limit (number, optional)\n\n### database.ncl\nDatabase backend selection:\n- `backend` - 'filesystem | 'rocksdb | 'surrealdb_embedded | 'surrealdb_server | 'postgres (enum)\n- `path` - Storage path (string, optional)\n- `connection_string` - DB URL (string, optional)\n- `credentials` - Auth object (optional)\n\n### security.ncl\nAuthentication and encryption:\n- `jwt_issuer` - JWT issuer (string, optional)\n- `jwt_audience` - JWT audience (string, optional)\n- `jwt_expiration` - Token expiration (number, optional)\n- `encryption_key` - Encryption key (string, optional)\n- `kms_backend` - KMS provider (string, optional)\n- `mfa_required` - Require MFA (bool, optional)\n\n### monitoring.ncl\nMetrics and health:\n- `enabled` - Enable monitoring (bool, optional)\n- `metrics_interval` - Metrics collection interval (number, optional)\n- `health_check_interval` - Health check frequency (number, optional)\n- `retention_days` - Metrics retention (number, optional)\n\n### logging.ncl\nLog configuration:\n- `level` - Log level (debug | info | warn | error)\n- `format` - Log format (json | text)\n- `rotation` - Log rotation policy (optional)\n- `output` - Log destination (stdout | file | syslog)\n\n## Service Schemas\n\n### orchestrator.ncl\nWorkflow orchestration:\n\n```\nOrchestratorConfig = {\n workspace | WorkspaceConfig,\n server | Server,\n storage | Database,\n queue | QueueConfig,\n batch | BatchConfig,\n monitoring | MonitoringConfig | optional,\n rollback | RollbackConfig | optional,\n extensions | ExtensionsConfig | optional,\n}\n```\n\n### control-center.ncl\nPolicy and RBAC:\n\n```\nControlCenterConfig = {\n workspace | WorkspaceConfig,\n server | Server,\n database | Database,\n security | SecurityConfig,\n rbac | RBACConfig | optional,\n compliance | ComplianceConfig | optional,\n}\n```\n\n### mcp-server.ncl\nMCP protocol server:\n\n```\nMCPServerConfig = {\n workspace | WorkspaceConfig,\n server | Server,\n capabilities | CapabilitiesConfig,\n tools | ToolsConfig | optional,\n resources | ResourcesConfig | optional,\n}\n```\n\n## Deployment Mode Schemas\n\nDeployment schemas define resource constraints for each mode:\n\n- **solo.ncl** - 2 CPU, 4GB RAM, embedded DB\n- **multiuser.ncl** - 4 CPU, 8GB RAM, PostgreSQL\n- **cicd.ncl** - 8 CPU, 16GB RAM, ephemeral\n- **enterprise.ncl** - 16+ CPU, 32+ GB RAM, HA\n\nExample:\n\n```\n# schemas/deployment/solo.ncl\n{\n SoloMode = {\n resources = {\n cpu_cores | 2,\n memory_mb | 4096,\n disk_gb | 50,\n },\n database_backend | 'filesystem,\n security_level | 'basic,\n },\n}\n```\n\n## Validation with Schemas\n\nSchemas are composed with validators in config files:\n\n```\n# configs/orchestrator.solo.ncl\nlet schemas = import "../schemas/orchestrator.ncl" in\nlet validators = import "../validators/orchestrator-validator.ncl" in\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\n\n# Compose: defaults + validation + schema checking\n{\n orchestrator = defaults.orchestrator & {\n queue = {\n max_concurrent_tasks = validators.ValidConcurrentTasks 5,\n },\n },\n} | schemas.OrchestratorConfig\n```\n\nThe final `| schemas.OrchestratorConfig` applies type checking.\n\n## Type System\n\n### Nickel Type Syntax\n\n```\n# Required field\nfield | Type,\n\n# Optional field\nfield | Type | optional,\n\n# Field with default\nfield | Type | default = value,\n\n# Union type\nfield | [| 'option1, 'option2],\n\n# Nested record\nfield | {\n subfield | Type,\n},\n```\n\n## Best Practices\n\n1. **Reuse common schemas** - Import and compose rather than duplicate\n2. **Use enums for choices** - `'filesystem | 'rocksdb` instead of string validation\n3. **Document fields** - Add comments explaining purpose\n4. **Keep schemas focused** - Each file covers one logical component\n5. **Test composition** - Use `nickel typecheck` to verify schema merging\n\n## Modifying Schemas\n\nWhen changing a schema:\n\n1. Update schema file (schemas/*.ncl)\n2. Update corresponding defaults (defaults/*.ncl) to match schema\n3. Update validators if constraints changed\n4. Run typecheck: `nickel typecheck configs/orchestrator.*.ncl`\n5. Verify all configs still type-check\n\n## Schema Testing\n\n```\n# Typecheck a schema\nnickel typecheck provisioning/.typedialog/provisioning/platform/schemas/orchestrator.ncl\n\n# Typecheck a config (which applies schema)\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Evaluate a schema\nnickel eval provisioning/.typedialog/provisioning/platform/schemas/orchestrator.ncl\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 \ No newline at end of file +# Schemas\n\nNickel type contracts defining configuration structure and validation for all services.\n\n## Purpose\n\nSchemas define:\n- **Type safety** - Required/optional fields, valid types (string, number, bool, record)\n- **Value constraints** - Enum values, numeric bounds (via contracts)\n- **Documentation** - Field descriptions and usage patterns\n- **Composition** - Inheritance and merging of schema types\n\n## File Organization\n\n```\nschemas/\n├── README.md # This file\n├── common/ # Shared schemas (server, database, security, etc.)\n│ ├── server.ncl # HTTP server configuration schema\n│ ├── database.ncl # Database backend schema\n│ ├── security.ncl # Authentication and security schema\n│ ├── monitoring.ncl # Metrics and health checks schema\n│ ├── logging.ncl # Log level and format schema\n│ ├── network.ncl # Network binding and TLS schema\n│ ├── storage.ncl # Storage backend schema\n│ └── workspace.ncl # Workspace configuration schema\n├── deployment/ # Mode-specific schemas\n│ ├── solo.ncl # Solo mode resource constraints\n│ ├── multiuser.ncl # Multi-user mode schema\n│ ├── cicd.ncl # CI/CD mode schema\n│ └── enterprise.ncl # Enterprise HA schema\n├── orchestrator.ncl # Orchestrator service schema\n├── control-center.ncl # Control Center service schema\n├── mcp-server.ncl # MCP Server service schema\n└── installer.ncl # Installer service schema\n```\n\n## Schema Patterns\n\n### 1. Basic Schema Definition\n\n```\n# schemas/common/server.ncl\n{\n Server = {\n host | String, # Required string field\n port | Number, # Required number field\n workers | Number | default = 4, # Optional with default\n keep_alive | Number | optional, # Optional field\n max_connections | Number | optional,\n },\n}\n```\n\n### 2. Type with Contract Validation\n\n```\n# With constraint checking (via validators)\n{\n WorkerCount =\n let valid_range = fun n =>\n if n < 1 then\n std.contract.blame "Workers must be >= 1" n\n else if n > 32 then\n std.contract.blame "Workers must be <= 32" n\n else\n n\n in\n Number | valid_range,\n}\n```\n\n### 3. Record Merging (Composition)\n\n```\n# schemas/orchestrator.ncl\nlet server_schema = import "./common/server.ncl" in\nlet database_schema = import "./common/database.ncl" in\n\n{\n OrchestratorConfig = {\n workspace | {\n name | String,\n path | String,\n enabled | Bool | default = true,\n },\n server | server_schema.Server, # Reuse Server schema\n storage | database_schema.Database, # Reuse Database schema\n queue | {\n max_concurrent_tasks | Number,\n retry_attempts | Number | default = 3,\n },\n },\n}\n```\n\n## Common Schemas\n\n### server.ncl\nHTTP server configuration:\n- `host` - Bind address (string)\n- `port` - Listen port (number)\n- `workers` - Thread count (number, optional)\n- `keep_alive` - Keep-alive timeout (number, optional)\n- `max_connections` - Connection limit (number, optional)\n\n### database.ncl\nDatabase backend selection:\n- `backend` - 'filesystem | 'rocksdb | 'surrealdb_embedded | 'surrealdb_server | 'postgres (enum)\n- `path` - Storage path (string, optional)\n- `connection_string` - DB URL (string, optional)\n- `credentials` - Auth object (optional)\n\n### security.ncl\nAuthentication and encryption:\n- `jwt_issuer` - JWT issuer (string, optional)\n- `jwt_audience` - JWT audience (string, optional)\n- `jwt_expiration` - Token expiration (number, optional)\n- `encryption_key` - Encryption key (string, optional)\n- `kms_backend` - KMS provider (string, optional)\n- `mfa_required` - Require MFA (bool, optional)\n\n### monitoring.ncl\nMetrics and health:\n- `enabled` - Enable monitoring (bool, optional)\n- `metrics_interval` - Metrics collection interval (number, optional)\n- `health_check_interval` - Health check frequency (number, optional)\n- `retention_days` - Metrics retention (number, optional)\n\n### logging.ncl\nLog configuration:\n- `level` - Log level (debug | info | warn | error)\n- `format` - Log format (json | text)\n- `rotation` - Log rotation policy (optional)\n- `output` - Log destination (stdout | file | syslog)\n\n## Service Schemas\n\n### orchestrator.ncl\nWorkflow orchestration:\n\n```\nOrchestratorConfig = {\n workspace | WorkspaceConfig,\n server | Server,\n storage | Database,\n queue | QueueConfig,\n batch | BatchConfig,\n monitoring | MonitoringConfig | optional,\n rollback | RollbackConfig | optional,\n extensions | ExtensionsConfig | optional,\n}\n```\n\n### control-center.ncl\nPolicy and RBAC:\n\n```\nControlCenterConfig = {\n workspace | WorkspaceConfig,\n server | Server,\n database | Database,\n security | SecurityConfig,\n rbac | RBACConfig | optional,\n compliance | ComplianceConfig | optional,\n}\n```\n\n### mcp-server.ncl\nMCP protocol server:\n\n```\nMCPServerConfig = {\n workspace | WorkspaceConfig,\n server | Server,\n capabilities | CapabilitiesConfig,\n tools | ToolsConfig | optional,\n resources | ResourcesConfig | optional,\n}\n```\n\n## Deployment Mode Schemas\n\nDeployment schemas define resource constraints for each mode:\n\n- **solo.ncl** - 2 CPU, 4GB RAM, embedded DB\n- **multiuser.ncl** - 4 CPU, 8GB RAM, PostgreSQL\n- **cicd.ncl** - 8 CPU, 16GB RAM, ephemeral\n- **enterprise.ncl** - 16+ CPU, 32+ GB RAM, HA\n\nExample:\n\n```\n# schemas/deployment/solo.ncl\n{\n SoloMode = {\n resources = {\n cpu_cores | 2,\n memory_mb | 4096,\n disk_gb | 50,\n },\n database_backend | 'filesystem,\n security_level | 'basic,\n },\n}\n```\n\n## Validation with Schemas\n\nSchemas are composed with validators in config files:\n\n```\n# configs/orchestrator.solo.ncl\nlet schemas = import "../schemas/orchestrator.ncl" in\nlet validators = import "../validators/orchestrator-validator.ncl" in\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\n\n# Compose: defaults + validation + schema checking\n{\n orchestrator = defaults.orchestrator & {\n queue = {\n max_concurrent_tasks = validators.ValidConcurrentTasks 5,\n },\n },\n} | schemas.OrchestratorConfig\n```\n\nThe final `| schemas.OrchestratorConfig` applies type checking.\n\n## Type System\n\n### Nickel Type Syntax\n\n```\n# Required field\nfield | Type,\n\n# Optional field\nfield | Type | optional,\n\n# Field with default\nfield | Type | default = value,\n\n# Union type\nfield | [| 'option1, 'option2],\n\n# Nested record\nfield | {\n subfield | Type,\n},\n```\n\n## Best Practices\n\n1. **Reuse common schemas** - Import and compose rather than duplicate\n2. **Use enums for choices** - `'filesystem | 'rocksdb` instead of string validation\n3. **Document fields** - Add comments explaining purpose\n4. **Keep schemas focused** - Each file covers one logical component\n5. **Test composition** - Use `nickel typecheck` to verify schema merging\n\n## Modifying Schemas\n\nWhen changing a schema:\n\n1. Update schema file (schemas/*.ncl)\n2. Update corresponding defaults (defaults/*.ncl) to match schema\n3. Update validators if constraints changed\n4. Run typecheck: `nickel typecheck configs/orchestrator.*.ncl`\n5. Verify all configs still type-check\n\n## Schema Testing\n\n```\n# Typecheck a schema\nnickel typecheck provisioning/.typedialog/provisioning/platform/schemas/orchestrator.ncl\n\n# Typecheck a config (which applies schema)\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Evaluate a schema\nnickel eval provisioning/.typedialog/provisioning/platform/schemas/orchestrator.ncl\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 diff --git a/schemas/platform/templates/README.md b/schemas/platform/templates/README.md index c01d06f..4a312e7 100644 --- a/schemas/platform/templates/README.md +++ b/schemas/platform/templates/README.md @@ -1 +1 @@ -# Templates\n\nJinja2 and Nickel templates for configuration and deployment generation.\n\n## Purpose\n\nTemplates provide:\n- **Nickel output generation** - Jinja2 templates for TypeDialog nickel-roundtrip\n- **Docker Compose generation** - Infrastructure-as-code for containerized deployment\n- **Kubernetes manifests** - Declarative deployment manifests\n- **TOML export** - Service configuration generation for Rust codebase\n\n## File Organization\n\n```\ntemplates/\n├── README.md # This file\n├── orchestrator-config.ncl.j2 # Nickel output template (Jinja2)\n├── control-center-config.ncl.j2 # Nickel output template (Jinja2)\n├── mcp-server-config.ncl.j2 # Nickel output template (Jinja2)\n├── installer-config.ncl.j2 # Nickel output template (Jinja2)\n├── docker-compose/ # Docker Compose templates\n│ ├── platform-stack.solo.yml.ncl\n│ ├── platform-stack.multiuser.yml.ncl\n│ ├── platform-stack.cicd.yml.ncl\n│ └── platform-stack.enterprise.yml.ncl\n├── kubernetes/ # Kubernetes templates\n│ ├── orchestrator-deployment.yaml.ncl\n│ ├── orchestrator-service.yaml.ncl\n│ ├── control-center-deployment.yaml.ncl\n│ ├── control-center-service.yaml.ncl\n│ └── platform-ingress.yaml.ncl\n└── configs/ # Service config templates (optional)\n ├── orchestrator-config.toml.ncl\n ├── control-center-config.toml.ncl\n └── mcp-server-config.toml.ncl\n```\n\n## Jinja2 Config Templates\n\n**Critical for TypeDialog nickel-roundtrip workflow**:\n\n```\ntypedialog-web nickel-roundtrip "$CONFIG" "forms/{service}-form.toml" --output "$CONFIG" --template "templates/{service}-config.ncl.j2"\n```\n\n### Template Pattern: orchestrator-config.ncl.j2\n\n```\n# Orchestrator Configuration - Nickel Format\n# Auto-generated by provisioning TypeDialog\n# Edit via: nu scripts/configure.nu orchestrator {mode}\n\n{\n orchestrator = {\n # Workspace Configuration\n workspace = {\n {%- if workspace_name %}\n name = "{{ workspace_name }}",\n {%- endif %}\n {%- if workspace_path %}\n path = "{{ workspace_path }}",\n {%- endif %}\n {%- if workspace_enabled is defined %}\n enabled = {{ workspace_enabled | lower }},\n {%- endif %}\n {%- if multi_workspace is defined %}\n multi_workspace = {{ multi_workspace | lower }},\n {%- endif %}\n },\n\n # Server Configuration\n server = {\n {%- if server_host %}\n host = "{{ server_host }}",\n {%- endif %}\n {%- if server_port %}\n port = {{ server_port }},\n {%- endif %}\n {%- if server_workers %}\n workers = {{ server_workers }},\n {%- endif %}\n {%- if server_keep_alive %}\n keep_alive = {{ server_keep_alive }},\n {%- endif %}\n },\n\n # Storage Configuration\n storage = {\n {%- if storage_backend %}\n backend = '{{ storage_backend }},\n {%- endif %}\n {%- if storage_path %}\n path = "{{ storage_path }}",\n {%- endif %}\n {%- if surrealdb_url %}\n surrealdb_url = "{{ surrealdb_url }}",\n {%- endif %}\n },\n\n # Queue Configuration\n queue = {\n {%- if max_concurrent_tasks %}\n max_concurrent_tasks = {{ max_concurrent_tasks }},\n {%- endif %}\n {%- if retry_attempts %}\n retry_attempts = {{ retry_attempts }},\n {%- endif %}\n {%- if retry_delay %}\n retry_delay = {{ retry_delay }},\n {%- endif %}\n {%- if task_timeout %}\n task_timeout = {{ task_timeout }},\n {%- endif %}\n },\n\n # Monitoring Configuration (optional)\n {%- if enable_monitoring is defined and enable_monitoring %}\n monitoring = {\n enabled = true,\n {%- if metrics_interval %}\n metrics_interval = {{ metrics_interval }},\n {%- endif %}\n {%- if health_check_interval %}\n health_check_interval = {{ health_check_interval }},\n {%- endif %}\n },\n {%- endif %}\n },\n}\n```\n\n### Key Jinja2 Patterns\n\n**Conditional blocks** (only include if field is set):\n\n```\n{%- if workspace_name %}\nname = "{{ workspace_name }}",\n{%- endif %}\n```\n\n**String values** (with quotes):\n\n```\n{%- if storage_backend %}\nbackend = '{{ storage_backend }}, # Enum (atom syntax)\n{%- endif %}\n```\n\n**Numeric values** (no quotes):\n\n```\n{%- if server_port %}\nport = {{ server_port }}, # Number\n{%- endif %}\n```\n\n**Boolean values** (lower case):\n\n```\n{%- if workspace_enabled is defined %}\nenabled = {{ workspace_enabled | lower }}, # Boolean (true/false)\n{%- endif %}\n```\n\n**Comments** (for generated files):\n\n```\n# Auto-generated by provisioning TypeDialog\n# Edit via: nu scripts/configure.nu orchestrator {mode}\n```\n\n## Docker Compose Templates\n\nNickel templates that import from `values/*.ncl`:\n\n```\n# templates/docker-compose/platform-stack.solo.yml.ncl\n# Docker Compose Platform Stack - Solo Mode\n# Imports config from values/orchestrator.solo.ncl\n\nlet orchestrator_config = import "../../values/orchestrator.solo.ncl" in\nlet control_center_config = import "../../values/control-center.solo.ncl" in\n\n{\n version = "3.8",\n services = {\n orchestrator = {\n image = "provisioning-orchestrator:latest",\n container_name = "orchestrator",\n ports = [\n "%{std.to_string orchestrator_config.orchestrator.server.port}:9090",\n ],\n environment = {\n ORCHESTRATOR_SERVER_HOST = orchestrator_config.orchestrator.server.host,\n ORCHESTRATOR_SERVER_PORT = std.to_string orchestrator_config.orchestrator.server.port,\n ORCHESTRATOR_STORAGE_BACKEND = orchestrator_config.orchestrator.storage.backend,\n },\n volumes = [\n "./data/orchestrator:%{orchestrator_config.orchestrator.storage.path}",\n ],\n restart = "unless-stopped",\n },\n control-center = {\n image = "provisioning-control-center:latest",\n container_name = "control-center",\n ports = [\n "%{std.to_string control_center_config.control_center.server.port}:8080",\n ],\n environment = {\n CONTROL_CENTER_SERVER_HOST = control_center_config.control_center.server.host,\n CONTROL_CENTER_SERVER_PORT = std.to_string control_center_config.control_center.server.port,\n },\n restart = "unless-stopped",\n },\n },\n}\n```\n\n### Rendering Docker Compose\n\n```\n# Export Nickel template to YAML\nnickel export --format json templates/docker-compose/platform-stack.solo.yml.ncl | yq -P > docker-compose.solo.yml\n```\n\n## Kubernetes Templates\n\nNickel templates for Kubernetes manifests:\n\n```\n# templates/kubernetes/orchestrator-deployment.yaml.ncl\nlet config = import "../../values/orchestrator.solo.ncl" in\n\n{\n apiVersion = "apps/v1",\n kind = "Deployment",\n metadata = {\n name = "orchestrator",\n labels = {\n app = "orchestrator",\n },\n },\n spec = {\n replicas = 1,\n selector = {\n matchLabels = {\n app = "orchestrator",\n },\n },\n template = {\n metadata = {\n labels = {\n app = "orchestrator",\n },\n },\n spec = {\n containers = [\n {\n name = "orchestrator",\n image = "provisioning-orchestrator:latest",\n ports = [\n {\n containerPort = 9090,\n },\n ],\n env = [\n {\n name = "ORCHESTRATOR_SERVER_PORT",\n value = std.to_string config.orchestrator.server.port,\n },\n {\n name = "ORCHESTRATOR_STORAGE_BACKEND",\n value = config.orchestrator.storage.backend,\n },\n ],\n volumeMounts = [\n {\n name = "data",\n mountPath = config.orchestrator.storage.path,\n },\n ],\n },\n ],\n volumes = [\n {\n name = "data",\n persistentVolumeClaim = {\n claimName = "orchestrator-pvc",\n },\n },\n ],\n },\n },\n },\n}\n```\n\n## Rendering Templates\n\n### Render to JSON\n\n```\nnickel export --format json templates/orchestrator-config.ncl.j2 > config.json\n```\n\n### Render to YAML (via yq)\n\n```\nnickel export --format json templates/kubernetes/orchestrator-deployment.yaml.ncl | yq -P > deployment.yaml\n```\n\n### Render to TOML\n\n```\nnickel export --format toml templates/configs/orchestrator-config.toml.ncl > config.toml\n```\n\n## Template Variables\n\nVariables in templates come from:\n1. **Form values** (TypeDialog input)\n2. **Imported configs** (Nickel imports)\n3. **Constraint interpolation** (constraints.toml)\n\n## Best Practices\n\n1. **Use conditional blocks** - Only include fields if set\n2. **Import configs** - Reuse Nickel configs in templates\n3. **Type conversion** - Use `std.to_string` for numeric values\n4. **Comments** - Explain generated/auto-edited markers\n5. **Validation** - Use `nickel typecheck` to verify templates\n6. **Environment variables** - Prefer env over hardcoding\n\n## Template Testing\n\n```\n# Typecheck Jinja2 + Nickel template\nnickel typecheck templates/orchestrator-config.ncl.j2\n\n# Evaluate and view output\nnickel eval templates/orchestrator-config.ncl.j2\n\n# Export and validate output\nnickel export --format json templates/orchestrator-config.ncl.j2 | jq '.'\n```\n\n## Adding a New Template\n\n1. **Create template file** (`{service}-config.ncl.j2` or `{name}.yml.ncl`)\n2. **Define structure** (Nickel or Jinja2)\n3. **Import configs** (if Nickel)\n4. **Use variables** (from forms or imports)\n5. **Typecheck**: `nickel typecheck templates/{file}`\n6. **Test rendering**: `nickel export {format} templates/{file}`\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 \ No newline at end of file +# Templates\n\nJinja2 and Nickel templates for configuration and deployment generation.\n\n## Purpose\n\nTemplates provide:\n- **Nickel output generation** - Jinja2 templates for TypeDialog nickel-roundtrip\n- **Docker Compose generation** - Infrastructure-as-code for containerized deployment\n- **Kubernetes manifests** - Declarative deployment manifests\n- **TOML export** - Service configuration generation for Rust codebase\n\n## File Organization\n\n```\ntemplates/\n├── README.md # This file\n├── orchestrator-config.ncl.j2 # Nickel output template (Jinja2)\n├── control-center-config.ncl.j2 # Nickel output template (Jinja2)\n├── mcp-server-config.ncl.j2 # Nickel output template (Jinja2)\n├── installer-config.ncl.j2 # Nickel output template (Jinja2)\n├── docker-compose/ # Docker Compose templates\n│ ├── platform-stack.solo.yml.ncl\n│ ├── platform-stack.multiuser.yml.ncl\n│ ├── platform-stack.cicd.yml.ncl\n│ └── platform-stack.enterprise.yml.ncl\n├── kubernetes/ # Kubernetes templates\n│ ├── orchestrator-deployment.yaml.ncl\n│ ├── orchestrator-service.yaml.ncl\n│ ├── control-center-deployment.yaml.ncl\n│ ├── control-center-service.yaml.ncl\n│ └── platform-ingress.yaml.ncl\n└── configs/ # Service config templates (optional)\n ├── orchestrator-config.toml.ncl\n ├── control-center-config.toml.ncl\n └── mcp-server-config.toml.ncl\n```\n\n## Jinja2 Config Templates\n\n**Critical for TypeDialog nickel-roundtrip workflow**:\n\n```\ntypedialog-web nickel-roundtrip "$CONFIG" "forms/{service}-form.toml" --output "$CONFIG" --template "templates/{service}-config.ncl.j2"\n```\n\n### Template Pattern: orchestrator-config.ncl.j2\n\n```\n# Orchestrator Configuration - Nickel Format\n# Auto-generated by provisioning TypeDialog\n# Edit via: nu scripts/configure.nu orchestrator {mode}\n\n{\n orchestrator = {\n # Workspace Configuration\n workspace = {\n {%- if workspace_name %}\n name = "{{ workspace_name }}",\n {%- endif %}\n {%- if workspace_path %}\n path = "{{ workspace_path }}",\n {%- endif %}\n {%- if workspace_enabled is defined %}\n enabled = {{ workspace_enabled | lower }},\n {%- endif %}\n {%- if multi_workspace is defined %}\n multi_workspace = {{ multi_workspace | lower }},\n {%- endif %}\n },\n\n # Server Configuration\n server = {\n {%- if server_host %}\n host = "{{ server_host }}",\n {%- endif %}\n {%- if server_port %}\n port = {{ server_port }},\n {%- endif %}\n {%- if server_workers %}\n workers = {{ server_workers }},\n {%- endif %}\n {%- if server_keep_alive %}\n keep_alive = {{ server_keep_alive }},\n {%- endif %}\n },\n\n # Storage Configuration\n storage = {\n {%- if storage_backend %}\n backend = '{{ storage_backend }},\n {%- endif %}\n {%- if storage_path %}\n path = "{{ storage_path }}",\n {%- endif %}\n {%- if surrealdb_url %}\n surrealdb_url = "{{ surrealdb_url }}",\n {%- endif %}\n },\n\n # Queue Configuration\n queue = {\n {%- if max_concurrent_tasks %}\n max_concurrent_tasks = {{ max_concurrent_tasks }},\n {%- endif %}\n {%- if retry_attempts %}\n retry_attempts = {{ retry_attempts }},\n {%- endif %}\n {%- if retry_delay %}\n retry_delay = {{ retry_delay }},\n {%- endif %}\n {%- if task_timeout %}\n task_timeout = {{ task_timeout }},\n {%- endif %}\n },\n\n # Monitoring Configuration (optional)\n {%- if enable_monitoring is defined and enable_monitoring %}\n monitoring = {\n enabled = true,\n {%- if metrics_interval %}\n metrics_interval = {{ metrics_interval }},\n {%- endif %}\n {%- if health_check_interval %}\n health_check_interval = {{ health_check_interval }},\n {%- endif %}\n },\n {%- endif %}\n },\n}\n```\n\n### Key Jinja2 Patterns\n\n**Conditional blocks** (only include if field is set):\n\n```\n{%- if workspace_name %}\nname = "{{ workspace_name }}",\n{%- endif %}\n```\n\n**String values** (with quotes):\n\n```\n{%- if storage_backend %}\nbackend = '{{ storage_backend }}, # Enum (atom syntax)\n{%- endif %}\n```\n\n**Numeric values** (no quotes):\n\n```\n{%- if server_port %}\nport = {{ server_port }}, # Number\n{%- endif %}\n```\n\n**Boolean values** (lower case):\n\n```\n{%- if workspace_enabled is defined %}\nenabled = {{ workspace_enabled | lower }}, # Boolean (true/false)\n{%- endif %}\n```\n\n**Comments** (for generated files):\n\n```\n# Auto-generated by provisioning TypeDialog\n# Edit via: nu scripts/configure.nu orchestrator {mode}\n```\n\n## Docker Compose Templates\n\nNickel templates that import from `values/*.ncl`:\n\n```\n# templates/docker-compose/platform-stack.solo.yml.ncl\n# Docker Compose Platform Stack - Solo Mode\n# Imports config from values/orchestrator.solo.ncl\n\nlet orchestrator_config = import "../../values/orchestrator.solo.ncl" in\nlet control_center_config = import "../../values/control-center.solo.ncl" in\n\n{\n version = "3.8",\n services = {\n orchestrator = {\n image = "provisioning-orchestrator:latest",\n container_name = "orchestrator",\n ports = [\n "%{std.to_string orchestrator_config.orchestrator.server.port}:9090",\n ],\n environment = {\n ORCHESTRATOR_SERVER_HOST = orchestrator_config.orchestrator.server.host,\n ORCHESTRATOR_SERVER_PORT = std.to_string orchestrator_config.orchestrator.server.port,\n ORCHESTRATOR_STORAGE_BACKEND = orchestrator_config.orchestrator.storage.backend,\n },\n volumes = [\n "./data/orchestrator:%{orchestrator_config.orchestrator.storage.path}",\n ],\n restart = "unless-stopped",\n },\n control-center = {\n image = "provisioning-control-center:latest",\n container_name = "control-center",\n ports = [\n "%{std.to_string control_center_config.control_center.server.port}:8080",\n ],\n environment = {\n CONTROL_CENTER_SERVER_HOST = control_center_config.control_center.server.host,\n CONTROL_CENTER_SERVER_PORT = std.to_string control_center_config.control_center.server.port,\n },\n restart = "unless-stopped",\n },\n },\n}\n```\n\n### Rendering Docker Compose\n\n```\n# Export Nickel template to YAML\nnickel export --format json templates/docker-compose/platform-stack.solo.yml.ncl | yq -P > docker-compose.solo.yml\n```\n\n## Kubernetes Templates\n\nNickel templates for Kubernetes manifests:\n\n```\n# templates/kubernetes/orchestrator-deployment.yaml.ncl\nlet config = import "../../values/orchestrator.solo.ncl" in\n\n{\n apiVersion = "apps/v1",\n kind = "Deployment",\n metadata = {\n name = "orchestrator",\n labels = {\n app = "orchestrator",\n },\n },\n spec = {\n replicas = 1,\n selector = {\n matchLabels = {\n app = "orchestrator",\n },\n },\n template = {\n metadata = {\n labels = {\n app = "orchestrator",\n },\n },\n spec = {\n containers = [\n {\n name = "orchestrator",\n image = "provisioning-orchestrator:latest",\n ports = [\n {\n containerPort = 9090,\n },\n ],\n env = [\n {\n name = "ORCHESTRATOR_SERVER_PORT",\n value = std.to_string config.orchestrator.server.port,\n },\n {\n name = "ORCHESTRATOR_STORAGE_BACKEND",\n value = config.orchestrator.storage.backend,\n },\n ],\n volumeMounts = [\n {\n name = "data",\n mountPath = config.orchestrator.storage.path,\n },\n ],\n },\n ],\n volumes = [\n {\n name = "data",\n persistentVolumeClaim = {\n claimName = "orchestrator-pvc",\n },\n },\n ],\n },\n },\n },\n}\n```\n\n## Rendering Templates\n\n### Render to JSON\n\n```\nnickel export --format json templates/orchestrator-config.ncl.j2 > config.json\n```\n\n### Render to YAML (via yq)\n\n```\nnickel export --format json templates/kubernetes/orchestrator-deployment.yaml.ncl | yq -P > deployment.yaml\n```\n\n### Render to TOML\n\n```\nnickel export --format toml templates/configs/orchestrator-config.toml.ncl > config.toml\n```\n\n## Template Variables\n\nVariables in templates come from:\n1. **Form values** (TypeDialog input)\n2. **Imported configs** (Nickel imports)\n3. **Constraint interpolation** (constraints.toml)\n\n## Best Practices\n\n1. **Use conditional blocks** - Only include fields if set\n2. **Import configs** - Reuse Nickel configs in templates\n3. **Type conversion** - Use `std.to_string` for numeric values\n4. **Comments** - Explain generated/auto-edited markers\n5. **Validation** - Use `nickel typecheck` to verify templates\n6. **Environment variables** - Prefer env over hardcoding\n\n## Template Testing\n\n```\n# Typecheck Jinja2 + Nickel template\nnickel typecheck templates/orchestrator-config.ncl.j2\n\n# Evaluate and view output\nnickel eval templates/orchestrator-config.ncl.j2\n\n# Export and validate output\nnickel export --format json templates/orchestrator-config.ncl.j2 | jq '.'\n```\n\n## Adding a New Template\n\n1. **Create template file** (`{service}-config.ncl.j2` or `{name}.yml.ncl`)\n2. **Define structure** (Nickel or Jinja2)\n3. **Import configs** (if Nickel)\n4. **Use variables** (from forms or imports)\n5. **Typecheck**: `nickel typecheck templates/{file}`\n6. **Test rendering**: `nickel export {format} templates/{file}`\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 diff --git a/schemas/platform/templates/configs/README.md b/schemas/platform/templates/configs/README.md index 4d6c893..4831149 100644 --- a/schemas/platform/templates/configs/README.md +++ b/schemas/platform/templates/configs/README.md @@ -1 +1 @@ -# Service Configuration Templates\n\nNickel-based configuration templates that export to TOML format for provisioning platform services.\n\n## Overview\n\nThis directory contains Nickel templates that generate TOML configuration files for the provisioning platform services:\n\n- **orchestrator-config.toml.ncl** - Workflow engine configuration\n- **control-center-config.toml.ncl** - Policy and RBAC management configuration\n- **mcp-server-config.toml.ncl** - Model Context Protocol server configuration\n\nThese templates support all four deployment modes:\n\n- **solo**: Single developer, minimal configuration\n- **multiuser**: Team collaboration with full features\n- **cicd**: CI/CD pipelines with ephemeral configuration\n- **enterprise**: Production with advanced security and monitoring\n\n## Templates\n\n### orchestrator-config.toml.ncl\n\nOrchestrator workflow engine configuration with sections for:\n\n- **Workspace**: Workspace name, path, and multi-workspace support\n- **Server**: HTTP server configuration (host, port, workers)\n- **Storage**: Backend selection (filesystem, SurrealDB embedded, SurrealDB server)\n- **Queue**: Task concurrency, retries, timeouts, deadletter queue\n- **Batch**: Parallel limits, operation timeouts, checkpointing, rollback\n- **Monitoring**: Metrics collection, health checks, resource tracking\n- **Logging**: Log levels, outputs, rotation\n- **Security**: JWT auth, CORS, TLS, rate limiting\n- **Extensions**: Auto-loading from OCI registry\n- **Database**: Connection pooling for non-filesystem storage\n- **Features**: Feature flags for experimental functionality\n\n**Key Parameters**:\n- `max_concurrent_tasks`: 1-100 (constrained)\n- `batch.parallel_limit`: 1-50 (constrained)\n- Storage backend: filesystem, surrealdb_server, surrealdb_cluster\n- Logging format: json or text\n\n### control-center-config.toml.ncl\n\nControl Center policy and RBAC management configuration with sections for:\n\n- **Server**: HTTP server configuration\n- **Database**: Backend selection (RocksDB, PostgreSQL, PostgreSQL HA)\n- **Auth**: JWT, OAUTH2, LDAP authentication methods\n- **RBAC**: Role-based access control with roles and permissions\n- **MFA**: Multi-factor authentication (TOTP, Email OTP)\n- **Policies**: Password policy, session policy, audit, compliance\n- **Rate Limiting**: Global and per-user rate limits\n- **CORS**: Cross-origin resource sharing configuration\n- **TLS**: SSL/TLS configuration\n- **Monitoring**: Metrics, health checks, tracing\n- **Logging**: Log outputs and rotation\n- **Orchestrator Integration**: Connection to orchestrator service\n- **Features**: Feature flags\n\n**Key Parameters**:\n- `database.backend`: rocksdb, postgres, postgres_ha\n- `mfa.required`: false for solo/multiuser, true for enterprise\n- `policies.password.min_length`: 12\n- `policies.compliance`: SOC2, HIPAA support\n\n### mcp-server-config.toml.ncl\n\nModel Context Protocol server configuration for AI/LLM integration with sections for:\n\n- **Server**: HTTP/Stdio protocol configuration\n- **Capabilities**: Tools, resources, prompts, sampling\n- **Tools**: Tool categories and configurations (orchestrator, provisioning, workspace)\n- **Resources**: File system, database, external API resources\n- **Prompts**: System prompts and user prompt configuration\n- **Integration**: Orchestrator, Control Center, Claude API integration\n- **Security**: Authentication, authorization, rate limiting, input validation\n- **Monitoring**: Metrics, health checks, audit logging\n- **Logging**: Log outputs and configuration\n- **Features**: Feature flags\n- **Performance**: Thread pools, timeouts, caching\n\n**Key Parameters**:\n- `server.protocol`: stdio (process-based) or http (network-based)\n- `capabilities.tools.enabled`: true/false\n- `capabilities.resources.max_size`: 1GB default\n- `integration.claude.model`: claude-3-opus (latest)\n\n## Usage\n\n### Exporting to TOML\n\nEach template exports to TOML format:\n\n```\n# Export orchestrator configuration\nnickel export --format toml orchestrator-config.toml.ncl > orchestrator.toml\n\n# Export control-center configuration\nnickel export --format toml control-center-config.toml.ncl > control-center.toml\n\n# Export MCP server configuration\nnickel export --format toml mcp-server-config.toml.ncl > mcp-server.toml\n```\n\n### Mode-Specific Configuration\n\nOverride configuration values based on deployment mode using environment variables or configuration layering:\n\n```\n# Export solo mode configuration\nORCHESTRATOR_MODE=solo nickel export --format toml orchestrator-config.toml.ncl > orchestrator.solo.toml\n\n# Export enterprise mode with full features\nORCHESTRATOR_MODE=enterprise nickel export --format toml orchestrator-config.toml.ncl > orchestrator.enterprise.toml\n```\n\n### Integration with Rust Services\n\nRust services load TOML configuration in this order (high to low priority):\n\n1. **Environment Variables** - `ORCHESTRATOR_*`, `CONTROL_CENTER_*`, `MCP_*`\n2. **User Configuration** - `~/.config/provisioning/user_config.toml`\n3. **Mode-Specific Config** - `provisioning/platform/config/{service}.{mode}.toml`\n4. **Default Configuration** - `provisioning/platform/config/{service}.defaults.toml`\n\nExample loading in Rust:\n\n```\nuse config::{Config, ConfigError, File};\n\npub fn load_config(mode: &str) -> Result {\n let config_path = format!("provisioning/platform/config/orchestrator.{}.toml", mode);\n\n Config::builder()\n .add_source(File::with_name("provisioning/platform/config/orchestrator.defaults"))\n .add_source(File::with_name(&config_path).required(false))\n .add_source(config::Environment::with_prefix("ORCHESTRATOR"))\n .build()?\n .try_deserialize()\n}\n```\n\n## Configuration Sections\n\n### Server Configuration (All Services)\n\n```\n[server]\nhost = "0.0.0.0"\nport = 9090\nworkers = 4\nkeep_alive = 75\nmax_connections = 512\n```\n\n### Database Configuration (Control Center)\n\n**RocksDB** (solo, cicd modes):\n\n```\n[database]\nbackend = "rocksdb"\n\n[database.rocksdb]\npath = "/var/lib/provisioning/control-center/db"\ncache_size = "256MB"\nmax_open_files = 1000\ncompression = "snappy"\n```\n\n**PostgreSQL** (multiuser, enterprise modes):\n\n```\n[database]\nbackend = "postgres"\n\n[database.postgres]\nhost = "postgres.provisioning.svc.cluster.local"\nport = 5432\ndatabase = "provisioning"\nuser = "provisioning"\npassword = "${DB_PASSWORD}"\nssl_mode = "require"\n```\n\n### Storage Configuration (Orchestrator)\n\n**Filesystem** (solo, cicd modes):\n\n```\n[storage]\nbackend = "filesystem"\npath = "/var/lib/provisioning/orchestrator/data"\n```\n\n**SurrealDB Server** (multiuser mode):\n\n```\n[storage]\nbackend = "surrealdb_server"\nsurrealdb_url = "surrealdb://surrealdb:8000"\nsurrealdb_namespace = "provisioning"\nsurrealdb_database = "orchestrator"\n```\n\n**SurrealDB Cluster** (enterprise mode):\n\n```\n[storage]\nbackend = "surrealdb_cluster"\nsurrealdb_url = "surrealdb://surrealdb-cluster.provisioning.svc.cluster.local:8000"\nsurrealdb_namespace = "provisioning"\nsurrealdb_database = "orchestrator"\n```\n\n### RBAC Configuration (Control Center)\n\n```\n[rbac]\nenabled = true\ndefault_role = "viewer"\n\n[rbac.roles.admin]\ndescription = "Administrator with full access"\npermissions = ["*"]\n\n[rbac.roles.operator]\ndescription = "Operator managing orchestrator"\npermissions = ["orchestrator.view", "orchestrator.execute"]\n```\n\n### Queue Configuration (Orchestrator)\n\n```\n[queue]\nmax_concurrent_tasks = 50\nretry_attempts = 3\nretry_delay = 5000\ntask_timeout = 3600000\n\n[queue.deadletter_queue]\nenabled = true\nmax_messages = 1000\nretention_period = 86400\n```\n\n### Logging Configuration (All Services)\n\n```\n[logging]\nlevel = "info"\nformat = "json"\n\n[[logging.outputs]]\ndestination = "stdout"\nlevel = "info"\n\n[[logging.outputs]]\ndestination = "file"\npath = "/var/log/provisioning/orchestrator/orchestrator.log"\nlevel = "debug"\n\n[logging.outputs.rotation]\nmax_size = "100MB"\nmax_backups = 10\nmax_age = 30\n```\n\n### Monitoring Configuration (All Services)\n\n```\n[monitoring]\nenabled = true\n\n[monitoring.metrics]\nenabled = true\ninterval = 30\nexport_format = "prometheus"\n\n[monitoring.health_check]\nenabled = true\ninterval = 30\ntimeout = 10\n```\n\n### Security Configuration (All Services)\n\n```\n[security.auth]\nenabled = true\nmethod = "jwt"\njwt_secret = "${JWT_SECRET}"\njwt_issuer = "provisioning.local"\njwt_audience = "orchestrator"\ntoken_expiration = 3600\n\n[security.cors]\nenabled = true\nallowed_origins = ["https://control-center:8080"]\nallowed_methods = ["GET", "POST", "PUT", "DELETE"]\n\n[security.rate_limit]\nenabled = true\nrequests_per_second = 1000\nburst_size = 100\n```\n\n## Environment Variables\n\nAll sensitive values should be provided via environment variables:\n\n```\n# Secrets\nexport JWT_SECRET="your-jwt-secret-here"\nexport DB_PASSWORD="your-database-password"\nexport ORCHESTRATOR_TOKEN="your-orchestrator-token"\nexport CONTROL_CENTER_TOKEN="your-control-center-token"\nexport CLAUDE_API_KEY="your-claude-api-key"\n\n# Service URLs (if different from defaults)\nexport ORCHESTRATOR_URL="http://orchestrator:9090"\nexport CONTROL_CENTER_URL="http://control-center:8080"\n\n# Mode selection\nexport PROVISIONING_MODE="enterprise"\n```\n\n## Mode-Specific Overrides\n\n### Solo Mode\n- Minimal resources: 2 CPU, 4GB RAM\n- Filesystem storage for orchestrator\n- RocksDB for control-center\n- No MFA required\n- Single replica deployments\n- Logging: info level\n\n### MultiUser Mode\n- Moderate resources: 4 CPU, 8GB RAM\n- SurrealDB server for orchestrator\n- PostgreSQL for control-center\n- RBAC enabled\n- 1 replica per service\n- Logging: debug level\n\n### CI/CD Mode\n- Stateless configuration\n- Ephemeral storage (no persistence)\n- API-driven (minimal UI)\n- No MFA required\n- 1 replica per service\n- Logging: warn level (minimal)\n\n### Enterprise Mode\n- High resources: 16+ CPU, 32+ GB RAM\n- SurrealDB cluster for orchestrator HA\n- PostgreSQL HA for control-center\n- Full RBAC and MFA required\n- 3+ replicas per service\n- Full monitoring and audit logging\n- Logging: info level with detailed audit\n\n## Validation\n\nValidate configuration before using:\n\n```\n# Type check with Nickel\nnickel typecheck orchestrator-config.toml.ncl\n\n# Export and validate TOML syntax\nnickel export --format toml orchestrator-config.toml.ncl | toml-cli validate -\n```\n\n## References\n\n- [Orchestrator Configuration Schema](../../schemas/orchestrator.ncl)\n- [Control Center Configuration Schema](../../schemas/control-center.ncl)\n- [MCP Server Configuration Schema](../../schemas/mcp-server.ncl)\n- [Nickel Language](https://nickel-lang.org/)\n- [TOML Format](https://toml.io/) \ No newline at end of file +# Service Configuration Templates\n\nNickel-based configuration templates that export to TOML format for provisioning platform services.\n\n## Overview\n\nThis directory contains Nickel templates that generate TOML configuration files for the provisioning platform services:\n\n- **orchestrator-config.toml.ncl** - Workflow engine configuration\n- **control-center-config.toml.ncl** - Policy and RBAC management configuration\n- **mcp-server-config.toml.ncl** - Model Context Protocol server configuration\n\nThese templates support all four deployment modes:\n\n- **solo**: Single developer, minimal configuration\n- **multiuser**: Team collaboration with full features\n- **cicd**: CI/CD pipelines with ephemeral configuration\n- **enterprise**: Production with advanced security and monitoring\n\n## Templates\n\n### orchestrator-config.toml.ncl\n\nOrchestrator workflow engine configuration with sections for:\n\n- **Workspace**: Workspace name, path, and multi-workspace support\n- **Server**: HTTP server configuration (host, port, workers)\n- **Storage**: Backend selection (filesystem, SurrealDB embedded, SurrealDB server)\n- **Queue**: Task concurrency, retries, timeouts, deadletter queue\n- **Batch**: Parallel limits, operation timeouts, checkpointing, rollback\n- **Monitoring**: Metrics collection, health checks, resource tracking\n- **Logging**: Log levels, outputs, rotation\n- **Security**: JWT auth, CORS, TLS, rate limiting\n- **Extensions**: Auto-loading from OCI registry\n- **Database**: Connection pooling for non-filesystem storage\n- **Features**: Feature flags for experimental functionality\n\n**Key Parameters**:\n- `max_concurrent_tasks`: 1-100 (constrained)\n- `batch.parallel_limit`: 1-50 (constrained)\n- Storage backend: filesystem, surrealdb_server, surrealdb_cluster\n- Logging format: json or text\n\n### control-center-config.toml.ncl\n\nControl Center policy and RBAC management configuration with sections for:\n\n- **Server**: HTTP server configuration\n- **Database**: Backend selection (RocksDB, PostgreSQL, PostgreSQL HA)\n- **Auth**: JWT, OAUTH2, LDAP authentication methods\n- **RBAC**: Role-based access control with roles and permissions\n- **MFA**: Multi-factor authentication (TOTP, Email OTP)\n- **Policies**: Password policy, session policy, audit, compliance\n- **Rate Limiting**: Global and per-user rate limits\n- **CORS**: Cross-origin resource sharing configuration\n- **TLS**: SSL/TLS configuration\n- **Monitoring**: Metrics, health checks, tracing\n- **Logging**: Log outputs and rotation\n- **Orchestrator Integration**: Connection to orchestrator service\n- **Features**: Feature flags\n\n**Key Parameters**:\n- `database.backend`: rocksdb, postgres, postgres_ha\n- `mfa.required`: false for solo/multiuser, true for enterprise\n- `policies.password.min_length`: 12\n- `policies.compliance`: SOC2, HIPAA support\n\n### mcp-server-config.toml.ncl\n\nModel Context Protocol server configuration for AI/LLM integration with sections for:\n\n- **Server**: HTTP/Stdio protocol configuration\n- **Capabilities**: Tools, resources, prompts, sampling\n- **Tools**: Tool categories and configurations (orchestrator, provisioning, workspace)\n- **Resources**: File system, database, external API resources\n- **Prompts**: System prompts and user prompt configuration\n- **Integration**: Orchestrator, Control Center, Claude API integration\n- **Security**: Authentication, authorization, rate limiting, input validation\n- **Monitoring**: Metrics, health checks, audit logging\n- **Logging**: Log outputs and configuration\n- **Features**: Feature flags\n- **Performance**: Thread pools, timeouts, caching\n\n**Key Parameters**:\n- `server.protocol`: stdio (process-based) or http (network-based)\n- `capabilities.tools.enabled`: true/false\n- `capabilities.resources.max_size`: 1GB default\n- `integration.claude.model`: claude-3-opus (latest)\n\n## Usage\n\n### Exporting to TOML\n\nEach template exports to TOML format:\n\n```\n# Export orchestrator configuration\nnickel export --format toml orchestrator-config.toml.ncl > orchestrator.toml\n\n# Export control-center configuration\nnickel export --format toml control-center-config.toml.ncl > control-center.toml\n\n# Export MCP server configuration\nnickel export --format toml mcp-server-config.toml.ncl > mcp-server.toml\n```\n\n### Mode-Specific Configuration\n\nOverride configuration values based on deployment mode using environment variables or configuration layering:\n\n```\n# Export solo mode configuration\nORCHESTRATOR_MODE=solo nickel export --format toml orchestrator-config.toml.ncl > orchestrator.solo.toml\n\n# Export enterprise mode with full features\nORCHESTRATOR_MODE=enterprise nickel export --format toml orchestrator-config.toml.ncl > orchestrator.enterprise.toml\n```\n\n### Integration with Rust Services\n\nRust services load TOML configuration in this order (high to low priority):\n\n1. **Environment Variables** - `ORCHESTRATOR_*`, `CONTROL_CENTER_*`, `MCP_*`\n2. **User Configuration** - `~/.config/provisioning/user_config.toml`\n3. **Mode-Specific Config** - `provisioning/platform/config/{service}.{mode}.toml`\n4. **Default Configuration** - `provisioning/platform/config/{service}.defaults.toml`\n\nExample loading in Rust:\n\n```\nuse config::{Config, ConfigError, File};\n\npub fn load_config(mode: &str) -> Result {\n let config_path = format!("provisioning/platform/config/orchestrator.{}.toml", mode);\n\n Config::builder()\n .add_source(File::with_name("provisioning/platform/config/orchestrator.defaults"))\n .add_source(File::with_name(&config_path).required(false))\n .add_source(config::Environment::with_prefix("ORCHESTRATOR"))\n .build()?\n .try_deserialize()\n}\n```\n\n## Configuration Sections\n\n### Server Configuration (All Services)\n\n```\n[server]\nhost = "0.0.0.0"\nport = 9090\nworkers = 4\nkeep_alive = 75\nmax_connections = 512\n```\n\n### Database Configuration (Control Center)\n\n**RocksDB** (solo, cicd modes):\n\n```\n[database]\nbackend = "rocksdb"\n\n[database.rocksdb]\npath = "/var/lib/provisioning/control-center/db"\ncache_size = "256MB"\nmax_open_files = 1000\ncompression = "snappy"\n```\n\n**PostgreSQL** (multiuser, enterprise modes):\n\n```\n[database]\nbackend = "postgres"\n\n[database.postgres]\nhost = "postgres.provisioning.svc.cluster.local"\nport = 5432\ndatabase = "provisioning"\nuser = "provisioning"\npassword = "${DB_PASSWORD}"\nssl_mode = "require"\n```\n\n### Storage Configuration (Orchestrator)\n\n**Filesystem** (solo, cicd modes):\n\n```\n[storage]\nbackend = "filesystem"\npath = "/var/lib/provisioning/orchestrator/data"\n```\n\n**SurrealDB Server** (multiuser mode):\n\n```\n[storage]\nbackend = "surrealdb_server"\nsurrealdb_url = "surrealdb://surrealdb:8000"\nsurrealdb_namespace = "provisioning"\nsurrealdb_database = "orchestrator"\n```\n\n**SurrealDB Cluster** (enterprise mode):\n\n```\n[storage]\nbackend = "surrealdb_cluster"\nsurrealdb_url = "surrealdb://surrealdb-cluster.provisioning.svc.cluster.local:8000"\nsurrealdb_namespace = "provisioning"\nsurrealdb_database = "orchestrator"\n```\n\n### RBAC Configuration (Control Center)\n\n```\n[rbac]\nenabled = true\ndefault_role = "viewer"\n\n[rbac.roles.admin]\ndescription = "Administrator with full access"\npermissions = ["*"]\n\n[rbac.roles.operator]\ndescription = "Operator managing orchestrator"\npermissions = ["orchestrator.view", "orchestrator.execute"]\n```\n\n### Queue Configuration (Orchestrator)\n\n```\n[queue]\nmax_concurrent_tasks = 50\nretry_attempts = 3\nretry_delay = 5000\ntask_timeout = 3600000\n\n[queue.deadletter_queue]\nenabled = true\nmax_messages = 1000\nretention_period = 86400\n```\n\n### Logging Configuration (All Services)\n\n```\n[logging]\nlevel = "info"\nformat = "json"\n\n[[logging.outputs]]\ndestination = "stdout"\nlevel = "info"\n\n[[logging.outputs]]\ndestination = "file"\npath = "/var/log/provisioning/orchestrator/orchestrator.log"\nlevel = "debug"\n\n[logging.outputs.rotation]\nmax_size = "100MB"\nmax_backups = 10\nmax_age = 30\n```\n\n### Monitoring Configuration (All Services)\n\n```\n[monitoring]\nenabled = true\n\n[monitoring.metrics]\nenabled = true\ninterval = 30\nexport_format = "prometheus"\n\n[monitoring.health_check]\nenabled = true\ninterval = 30\ntimeout = 10\n```\n\n### Security Configuration (All Services)\n\n```\n[security.auth]\nenabled = true\nmethod = "jwt"\njwt_secret = "${JWT_SECRET}"\njwt_issuer = "provisioning.local"\njwt_audience = "orchestrator"\ntoken_expiration = 3600\n\n[security.cors]\nenabled = true\nallowed_origins = ["https://control-center:8080"]\nallowed_methods = ["GET", "POST", "PUT", "DELETE"]\n\n[security.rate_limit]\nenabled = true\nrequests_per_second = 1000\nburst_size = 100\n```\n\n## Environment Variables\n\nAll sensitive values should be provided via environment variables:\n\n```\n# Secrets\nexport JWT_SECRET="your-jwt-secret-here"\nexport DB_PASSWORD="your-database-password"\nexport ORCHESTRATOR_TOKEN="your-orchestrator-token"\nexport CONTROL_CENTER_TOKEN="your-control-center-token"\nexport CLAUDE_API_KEY="your-claude-api-key"\n\n# Service URLs (if different from defaults)\nexport ORCHESTRATOR_URL="http://orchestrator:9090"\nexport CONTROL_CENTER_URL="http://control-center:8080"\n\n# Mode selection\nexport PROVISIONING_MODE="enterprise"\n```\n\n## Mode-Specific Overrides\n\n### Solo Mode\n- Minimal resources: 2 CPU, 4GB RAM\n- Filesystem storage for orchestrator\n- RocksDB for control-center\n- No MFA required\n- Single replica deployments\n- Logging: info level\n\n### MultiUser Mode\n- Moderate resources: 4 CPU, 8GB RAM\n- SurrealDB server for orchestrator\n- PostgreSQL for control-center\n- RBAC enabled\n- 1 replica per service\n- Logging: debug level\n\n### CI/CD Mode\n- Stateless configuration\n- Ephemeral storage (no persistence)\n- API-driven (minimal UI)\n- No MFA required\n- 1 replica per service\n- Logging: warn level (minimal)\n\n### Enterprise Mode\n- High resources: 16+ CPU, 32+ GB RAM\n- SurrealDB cluster for orchestrator HA\n- PostgreSQL HA for control-center\n- Full RBAC and MFA required\n- 3+ replicas per service\n- Full monitoring and audit logging\n- Logging: info level with detailed audit\n\n## Validation\n\nValidate configuration before using:\n\n```\n# Type check with Nickel\nnickel typecheck orchestrator-config.toml.ncl\n\n# Export and validate TOML syntax\nnickel export --format toml orchestrator-config.toml.ncl | toml-cli validate -\n```\n\n## References\n\n- [Orchestrator Configuration Schema](../../schemas/orchestrator.ncl)\n- [Control Center Configuration Schema](../../schemas/control-center.ncl)\n- [MCP Server Configuration Schema](../../schemas/mcp-server.ncl)\n- [Nickel Language](https://nickel-lang.org/)\n- [TOML Format](https://toml.io/) diff --git a/schemas/platform/templates/docker-compose/README.md b/schemas/platform/templates/docker-compose/README.md index 78e711a..053d634 100644 --- a/schemas/platform/templates/docker-compose/README.md +++ b/schemas/platform/templates/docker-compose/README.md @@ -1 +1 @@ -# Docker Compose Templates\n\nNickel-based Docker Compose templates for deploying platform services across all deployment modes.\n\n## Overview\n\nThis directory contains Nickel templates that generate Docker Compose files for different deployment scenarios.\nEach template imports configuration from `values/*.ncl` and expands to valid Docker Compose YAML.\n\n**Key Pattern**: Templates use **Nickel composition** to build service definitions dynamically based on configuration, allowing parameterized infrastructure-as-code.\n\n## Templates\n\n### 1. platform-stack.solo.yml.ncl\n\n**Purpose**: Single-developer local development stack\n\n**Services**:\n- `orchestrator` - Workflow engine\n- `control-center` - Policy and RBAC management\n- `mcp-server` - MCP protocol server\n\n**Configuration**:\n- Network: Bridge network named `provisioning`\n- Volumes: 5 named volumes for persistence\n - `orchestrator-data` - Orchestrator workflows\n - `control-center-data` - Control Center policies\n - `mcp-server-data` - MCP Server cache\n - `logs` - Shared log volume\n - `cache` - Shared cache volume\n- Ports:\n - 9090 - Orchestrator API\n - 8080 - Control Center UI\n - 8888 - MCP Server\n- Health Checks: 30-second intervals for all services\n- Logging: JSON format, 10MB max file size, 3 backups\n- Restart Policy: `unless-stopped` (survives host reboot)\n\n**Usage**:\n\n```\n# Generate from Nickel template\nnickel export --format json platform-stack.solo.yml.ncl | yq -P > docker-compose.solo.yml\n\n# Start services\ndocker-compose -f docker-compose.solo.yml up -d\n\n# View logs\ndocker-compose -f docker-compose.solo.yml logs -f\n\n# Stop services\ndocker-compose -f docker-compose.solo.yml down\n```\n\n**Environment Variables** (recommended in `.env` file):\n\n```\nORCHESTRATOR_LOG_LEVEL=debug\nCONTROL_CENTER_LOG_LEVEL=info\nMCP_SERVER_LOG_LEVEL=info\n```\n\n---\n\n### 2. platform-stack.multiuser.yml.ncl\n\n**Purpose**: Team collaboration with persistent database storage\n\n**Services** (6 total):\n- `postgres` - Primary database (PostgreSQL 15)\n- `orchestrator` - Workflow engine\n- `control-center` - Policy and RBAC management\n- `mcp-server` - MCP protocol server\n- `surrealdb` - Workflow storage (SurrealDB server)\n- `gitea` - Git repository hosting (optional, for version control)\n\n**Configuration**:\n- Network: Custom bridge network named `provisioning-network`\n- Volumes:\n - `postgres-data` - PostgreSQL database files\n - `orchestrator-data` - Orchestrator workflows\n - `control-center-data` - Control Center policies\n - `surrealdb-data` - SurrealDB files\n - `gitea-data` - Gitea repositories and configuration\n - `logs` - Shared logs\n- Ports:\n - 9090 - Orchestrator API\n - 8080 - Control Center UI\n - 8888 - MCP Server\n - 5432 - PostgreSQL (internal only)\n - 8000 - SurrealDB (internal only)\n - 3000 - Gitea web UI (optional)\n - 22 - Gitea SSH (optional)\n- Service Dependencies: Explicit `depends_on` with health checks\n - Control Center waits for PostgreSQL\n - SurrealDB starts before Orchestrator\n- Health Checks: Service-specific health checks\n- Restart Policy: `always` (automatic recovery on failure)\n- Logging: JSON format with rotation\n\n**Usage**:\n\n```\n# Generate from Nickel template\nnickel export --format json platform-stack.multiuser.yml.ncl | yq -P > docker-compose.multiuser.yml\n\n# Create environment file\ncat > .env.multiuser << 'EOF'\nDB_PASSWORD=secure-postgres-password\nSURREALDB_PASSWORD=secure-surrealdb-password\nJWT_SECRET=secure-jwt-secret-256-bits\nEOF\n\n# Start services\ndocker-compose -f docker-compose.multiuser.yml --env-file .env.multiuser up -d\n\n# Wait for all services to be healthy\ndocker-compose -f docker-compose.multiuser.yml ps\n\n# Create database and initialize schema (one-time)\ndocker-compose exec postgres psql -U postgres -c "CREATE DATABASE provisioning;"\n```\n\n**Database Initialization**:\n\n```\n# Connect to PostgreSQL for schema creation\ndocker-compose exec postgres psql -U provisioning -d provisioning\n\n# Connect to SurrealDB for schema setup\ndocker-compose exec surrealdb surreal sql --auth root:password\n\n# Connect to Gitea web UI\n# http://localhost:3000 (admin:admin by default)\n```\n\n**Environment Variables** (in `.env.multiuser`):\n\n```\n# Database Credentials (CRITICAL - change before production)\nDB_PASSWORD=your-strong-password\nSURREALDB_PASSWORD=your-strong-password\n\n# Security\nJWT_SECRET=your-256-bit-random-string\n\n# Logging\nORCHESTRATOR_LOG_LEVEL=info\nCONTROL_CENTER_LOG_LEVEL=info\nMCP_SERVER_LOG_LEVEL=info\n\n# Optional: Gitea Configuration\nGITEA_DOMAIN=localhost:3000\nGITEA_ROOT_URL=http://localhost:3000/\n```\n\n---\n\n### 3. platform-stack.cicd.yml.ncl\n\n**Purpose**: Ephemeral CI/CD pipeline stack with minimal persistence\n\n**Services** (2 total):\n- `orchestrator` - API-only mode (no UI, streamlined for programmatic use)\n- `api-gateway` - Optional: Request routing and authentication\n\n**Configuration**:\n- Network: Bridge network\n- Volumes:\n - `orchestrator-tmpfs` - Temporary storage (tmpfs - in-memory, no persistence)\n- Ports:\n - 9090 - Orchestrator API (read-only orchestrator state)\n - 8000 - API Gateway (optional)\n- Health Checks: Fast checks (10-second intervals)\n- Restart Policy: `no` (containers do not auto-restart)\n- Logging: Minimal (only warnings and errors)\n- Cleanup: All artifacts deleted when containers stop\n\n**Characteristics**:\n- **Ephemeral**: No persistent storage (uses tmpfs)\n- **Fast Startup**: Minimal services, quick boot time\n- **API-First**: No UI, command-line/API integration only\n- **Stateless**: Clean slate each run\n- **Low Resource**: Minimal memory/CPU footprint\n\n**Usage**:\n\n```\n# Generate from Nickel template\nnickel export --format json platform-stack.cicd.yml.ncl | yq -P > docker-compose.cicd.yml\n\n# Start ephemeral stack\ndocker-compose -f docker-compose.cicd.yml up\n\n# Run CI/CD commands (in parallel terminal)\ncurl -X POST http://localhost:9090/api/workflows \\n -H "Content-Type: application/json" \\n -d @workflow.json\n\n# Stop and cleanup (all data lost)\ndocker-compose -f docker-compose.cicd.yml down\n# Or with volume cleanup\ndocker-compose -f docker-compose.cicd.yml down -v\n```\n\n**CI/CD Integration Example**:\n\n```\n# GitHub Actions workflow\n- name: Start Provisioning Stack\n run: docker-compose -f docker-compose.cicd.yml up -d\n\n- name: Run Tests\n run: |\n ./tests/integration.sh\n curl -X GET http://localhost:9090/health\n\n- name: Cleanup\n if: always()\n run: docker-compose -f docker-compose.cicd.yml down -v\n```\n\n**Environment Variables** (minimal):\n\n```\n# Logging (optional)\nORCHESTRATOR_LOG_LEVEL=warn\n```\n\n---\n\n### 4. platform-stack.enterprise.yml.ncl\n\n**Purpose**: Production-grade high-availability deployment\n\n**Services** (10+ total):\n- `postgres` - PostgreSQL 15 (primary database)\n- `orchestrator` (3 replicas) - Load-balanced workflow engine\n- `control-center` (2 replicas) - Load-balanced policy management\n- `mcp-server` (1-2 replicas) - MCP server for AI integration\n- `surrealdb-1`, `surrealdb-2`, `surrealdb-3` - SurrealDB cluster (3 nodes)\n- `nginx` - Load balancer and reverse proxy\n- `prometheus` - Metrics collection\n- `grafana` - Visualization and dashboards\n- `loki` - Log aggregation\n\n**Configuration**:\n- Network: Custom bridge network named `provisioning-enterprise`\n- Volumes:\n - `postgres-data` - PostgreSQL HA storage\n - `surrealdb-node-1`, `surrealdb-node-2`, `surrealdb-node-3` - Cluster storage\n - `prometheus-data` - Metrics storage\n - `grafana-data` - Grafana configuration\n - `loki-data` - Log storage\n - `logs` - Shared log aggregation\n- Ports:\n - 80 - HTTP (Nginx reverse proxy)\n - 443 - HTTPS (TLS - requires certificates)\n - 9090 - Orchestrator API (internal)\n - 8080 - Control Center UI (internal)\n - 8888 - MCP Server (internal)\n - 5432 - PostgreSQL (internal only)\n - 8000 - SurrealDB cluster (internal)\n - 9091 - Prometheus metrics (internal)\n - 3000 - Grafana dashboards (external)\n- Service Dependencies:\n - Control Center waits for PostgreSQL\n - Orchestrator waits for SurrealDB cluster\n - MCP Server waits for Orchestrator and Control Center\n - Prometheus waits for all services\n- Health Checks: 30-second intervals with 10-second timeout\n- Restart Policy: `always` (high availability)\n- Load Balancing: Nginx upstream blocks for orchestrator, control-center\n- Logging: JSON format with 500MB files, kept 30 versions\n\n**Architecture**:\n\n```\n┌──────────────────────┐\n│ External Client │\n│ (HTTPS, Port 443) │\n└──────────┬───────────┘\n │\n ┌──────▼──────────┐\n │ Nginx Load │\n │ Balancer │\n │ (TLS, CORS, │\n │ Rate Limiting) │\n └───────┬──────┬──────┬─────┐\n │ │ │ │\n ┌────────▼──┐ ┌──────▼──┐ ┌──▼────────┐\n │Orchestrator│ │Control │ │MCP Server │\n │ (3 copies) │ │ Center │ │ (1-2 copy)│\n │ │ │(2 copies)│ │ │\n └────────┬──┘ └─────┬───┘ └──┬───────┘\n │ │ │\n ┌───────▼────────┬──▼────┐ │\n │ SurrealDB │ PostSQL │\n │ Cluster │ HA │\n │ (3 nodes) │ (Primary/│\n │ │ Replica)│\n └────────────────┴──────────┘\n\nObservability Stack:\n┌────────────┬───────────┬───────────┐\n│ Prometheus │ Grafana │ Loki │\n│ (Metrics) │(Dashboard)│ (Logs) │\n└────────────┴───────────┴───────────┘\n```\n\n**Usage**:\n\n```\n# Generate from Nickel template\nnickel export --format json platform-stack.enterprise.yml.ncl | yq -P > docker-compose.enterprise.yml\n\n# Create environment file with secrets\ncat > .env.enterprise << 'EOF'\n# Database\nDB_PASSWORD=generate-strong-password\nSURREALDB_PASSWORD=generate-strong-password\n\n# Security\nJWT_SECRET=generate-256-bit-random-string\nADMIN_PASSWORD=generate-strong-admin-password\n\n# TLS Certificates\nTLS_CERT_PATH=/path/to/cert.pem\nTLS_KEY_PATH=/path/to/key.pem\n\n# Logging and Monitoring\nPROMETHEUS_RETENTION=30d\nGRAFANA_ADMIN_PASSWORD=generate-strong-password\nLOKI_RETENTION_DAYS=30\nEOF\n\n# Start entire stack\ndocker-compose -f docker-compose.enterprise.yml --env-file .env.enterprise up -d\n\n# Verify all services are healthy\ndocker-compose -f docker-compose.enterprise.yml ps\n\n# Check load balancer status\ncurl -H "Host: orchestrator.example.com" http://localhost/health\n\n# Access monitoring\n# Grafana: http://localhost:3000 (admin/password)\n# Prometheus: http://localhost:9091 (internal)\n# Loki: http://localhost:3100 (internal)\n```\n\n**Production Checklist**:\n- [ ] Generate strong database passwords (32+ characters)\n- [ ] Generate strong JWT secret (256-bit random string)\n- [ ] Provision valid TLS certificates (not self-signed)\n- [ ] Configure Nginx upstream health checks\n- [ ] Set up log retention policies (30+ days)\n- [ ] Enable Prometheus scraping with 15-second intervals\n- [ ] Configure Grafana dashboards and alerts\n- [ ] Test SurrealDB cluster failover\n- [ ] Document backup procedures\n- [ ] Enable PostgreSQL replication and backups\n- [ ] Configure external log aggregation (ELK stack, Splunk, etc.)\n\n**Environment Variables** (in `.env.enterprise`):\n\n```\n# Database Credentials (CRITICAL)\nDB_PASSWORD=your-strong-password-32-chars-min\nSURREALDB_PASSWORD=your-strong-password-32-chars-min\n\n# Security\nJWT_SECRET=your-256-bit-random-base64-encoded-string\nADMIN_PASSWORD=your-strong-admin-password\n\n# TLS/HTTPS\nTLS_CERT_PATH=/etc/provisioning/certs/server.crt\nTLS_KEY_PATH=/etc/provisioning/certs/server.key\n\n# Logging and Monitoring\nPROMETHEUS_RETENTION=30d\nPROMETHEUS_SCRAPE_INTERVAL=15s\nGRAFANA_ADMIN_USER=admin\nGRAFANA_ADMIN_PASSWORD=your-strong-grafana-password\nLOKI_RETENTION_DAYS=30\n\n# Optional: External Integrations\nSLACK_WEBHOOK_URL=https://hooks.slack.com/services/xxxxxxx\nPAGERDUTY_INTEGRATION_KEY=your-pagerduty-key\n```\n\n---\n\n## Workflow: From Nickel to Docker Compose\n\n### 1. Configuration Source (values/*.ncl)\n\n```\n# values/orchestrator.enterprise.ncl\n{\n orchestrator = {\n server = {\n host = "0.0.0.0",\n port = 9090,\n workers = 8,\n },\n storage = {\n backend = 'surrealdb_cluster,\n surrealdb_url = "surrealdb://surrealdb-1:8000",\n },\n queue = {\n max_concurrent_tasks = 100,\n retry_attempts = 5,\n task_timeout = 7200000,\n },\n monitoring = {\n enabled = true,\n metrics_interval = 10,\n },\n },\n}\n```\n\n### 2. Template Generation (Nickel → JSON)\n\n```\n# Exports Nickel config as JSON\nnickel export --format json platform-stack.enterprise.yml.ncl\n```\n\n### 3. YAML Conversion (JSON → YAML)\n\n```\n# Converts JSON to YAML format\nnickel export --format json platform-stack.enterprise.yml.ncl | yq -P > docker-compose.enterprise.yml\n```\n\n### 4. Deployment (YAML → Running Containers)\n\n```\n# Starts all services defined in YAML\ndocker-compose -f docker-compose.enterprise.yml up -d\n```\n\n---\n\n## Common Customizations\n\n### Change Service Replicas\n\nEdit the template to adjust replica counts:\n\n```\n# In platform-stack.enterprise.yml.ncl\nlet orchestrator_replicas = 5 in # Instead of 3\nlet control_center_replicas = 3 in # Instead of 2\nservices.orchestrator_replicas\n```\n\n### Add Custom Service\n\nAdd to the template services record:\n\n```\n# In platform-stack.enterprise.yml.ncl\nservices = base_services & {\n custom_service = {\n image = "custom:latest",\n ports = ["9999:9999"],\n volumes = ["custom-data:/data"],\n restart = "always",\n healthcheck = {\n test = ["CMD", "curl", "-f", "http://localhost:9999/health"],\n interval = "30s",\n timeout = "10s",\n retries = 3,\n },\n },\n}\n```\n\n### Modify Resource Limits\n\nIn each service definition:\n\n```\norchestrator = {\n deploy = {\n resources = {\n limits = {\n cpus = "2.0",\n memory = "2G",\n },\n reservations = {\n cpus = "1.0",\n memory = "1G",\n },\n },\n },\n}\n```\n\n---\n\n## Validation and Testing\n\n### Syntax Validation\n\n```\n# Validate YAML before deploying\ndocker-compose -f docker-compose.enterprise.yml config --quiet\n\n# Check service definitions\ndocker-compose -f docker-compose.enterprise.yml ps\n```\n\n### Health Checks\n\n```\n# Monitor health of all services\nwatch docker-compose ps\n\n# Check specific service health\ndocker-compose exec orchestrator curl -s http://localhost:9090/health\n```\n\n### Log Inspection\n\n```\n# View logs from all services\ndocker-compose logs -f\n\n# View logs from specific service\ndocker-compose logs -f orchestrator\n\n# Follow specific container\ndocker logs -f $(docker ps | grep orchestrator | awk '{print $1}')\n```\n\n---\n\n## Troubleshooting\n\n### Port Already in Use\n\n**Error**: `bind: address already in use`\n\n**Fix**: Change port in template or stop conflicting container:\n\n```\n# Find process using port\nlsof -i :9090\n\n# Kill process\nkill -9 \n\n# Or change port in docker-compose file\nports:\n - "9999:9090" # Use 9999 instead\n```\n\n### Service Fails to Start\n\n**Check logs**:\n\n```\ndocker-compose logs orchestrator\n```\n\n**Common causes**:\n- Port conflict - Check if another service uses port\n- Missing volume - Create volume before starting\n- Network connectivity - Verify docker network exists\n- Database not ready - Wait for db service to become healthy\n- Configuration error - Validate YAML syntax\n\n### Persistent Volume Issues\n\n**Clean volumes** (WARNING: Deletes data):\n\n```\ndocker-compose down -v\ndocker volume prune -f\n```\n\n---\n\n## See Also\n\n- **Kubernetes Templates**: `../kubernetes/` - For production K8s deployments\n- **Configuration System**: `../../` - Full configuration documentation\n- **Examples**: `../../examples/` - Example deployment scenarios\n- **Scripts**: `../../scripts/` - Automation scripts\n\n---\n\n**Version**: 1.0\n**Last Updated**: 2025-01-05\n**Status**: Production Ready \ No newline at end of file +# Docker Compose Templates\n\nNickel-based Docker Compose templates for deploying platform services across all deployment modes.\n\n## Overview\n\nThis directory contains Nickel templates that generate Docker Compose files for different deployment scenarios.\nEach template imports configuration from `values/*.ncl` and expands to valid Docker Compose YAML.\n\n**Key Pattern**: Templates use **Nickel composition** to build service definitions dynamically based on configuration, allowing parameterized infrastructure-as-code.\n\n## Templates\n\n### 1. platform-stack.solo.yml.ncl\n\n**Purpose**: Single-developer local development stack\n\n**Services**:\n- `orchestrator` - Workflow engine\n- `control-center` - Policy and RBAC management\n- `mcp-server` - MCP protocol server\n\n**Configuration**:\n- Network: Bridge network named `provisioning`\n- Volumes: 5 named volumes for persistence\n - `orchestrator-data` - Orchestrator workflows\n - `control-center-data` - Control Center policies\n - `mcp-server-data` - MCP Server cache\n - `logs` - Shared log volume\n - `cache` - Shared cache volume\n- Ports:\n - 9090 - Orchestrator API\n - 8080 - Control Center UI\n - 8888 - MCP Server\n- Health Checks: 30-second intervals for all services\n- Logging: JSON format, 10MB max file size, 3 backups\n- Restart Policy: `unless-stopped` (survives host reboot)\n\n**Usage**:\n\n```\n# Generate from Nickel template\nnickel export --format json platform-stack.solo.yml.ncl | yq -P > docker-compose.solo.yml\n\n# Start services\ndocker-compose -f docker-compose.solo.yml up -d\n\n# View logs\ndocker-compose -f docker-compose.solo.yml logs -f\n\n# Stop services\ndocker-compose -f docker-compose.solo.yml down\n```\n\n**Environment Variables** (recommended in `.env` file):\n\n```\nORCHESTRATOR_LOG_LEVEL=debug\nCONTROL_CENTER_LOG_LEVEL=info\nMCP_SERVER_LOG_LEVEL=info\n```\n\n---\n\n### 2. platform-stack.multiuser.yml.ncl\n\n**Purpose**: Team collaboration with persistent database storage\n\n**Services** (6 total):\n- `postgres` - Primary database (PostgreSQL 15)\n- `orchestrator` - Workflow engine\n- `control-center` - Policy and RBAC management\n- `mcp-server` - MCP protocol server\n- `surrealdb` - Workflow storage (SurrealDB server)\n- `gitea` - Git repository hosting (optional, for version control)\n\n**Configuration**:\n- Network: Custom bridge network named `provisioning-network`\n- Volumes:\n - `postgres-data` - PostgreSQL database files\n - `orchestrator-data` - Orchestrator workflows\n - `control-center-data` - Control Center policies\n - `surrealdb-data` - SurrealDB files\n - `gitea-data` - Gitea repositories and configuration\n - `logs` - Shared logs\n- Ports:\n - 9090 - Orchestrator API\n - 8080 - Control Center UI\n - 8888 - MCP Server\n - 5432 - PostgreSQL (internal only)\n - 8000 - SurrealDB (internal only)\n - 3000 - Gitea web UI (optional)\n - 22 - Gitea SSH (optional)\n- Service Dependencies: Explicit `depends_on` with health checks\n - Control Center waits for PostgreSQL\n - SurrealDB starts before Orchestrator\n- Health Checks: Service-specific health checks\n- Restart Policy: `always` (automatic recovery on failure)\n- Logging: JSON format with rotation\n\n**Usage**:\n\n```\n# Generate from Nickel template\nnickel export --format json platform-stack.multiuser.yml.ncl | yq -P > docker-compose.multiuser.yml\n\n# Create environment file\ncat > .env.multiuser << 'EOF'\nDB_PASSWORD=secure-postgres-password\nSURREALDB_PASSWORD=secure-surrealdb-password\nJWT_SECRET=secure-jwt-secret-256-bits\nEOF\n\n# Start services\ndocker-compose -f docker-compose.multiuser.yml --env-file .env.multiuser up -d\n\n# Wait for all services to be healthy\ndocker-compose -f docker-compose.multiuser.yml ps\n\n# Create database and initialize schema (one-time)\ndocker-compose exec postgres psql -U postgres -c "CREATE DATABASE provisioning;"\n```\n\n**Database Initialization**:\n\n```\n# Connect to PostgreSQL for schema creation\ndocker-compose exec postgres psql -U provisioning -d provisioning\n\n# Connect to SurrealDB for schema setup\ndocker-compose exec surrealdb surreal sql --auth root:password\n\n# Connect to Gitea web UI\n# http://localhost:3000 (admin:admin by default)\n```\n\n**Environment Variables** (in `.env.multiuser`):\n\n```\n# Database Credentials (CRITICAL - change before production)\nDB_PASSWORD=your-strong-password\nSURREALDB_PASSWORD=your-strong-password\n\n# Security\nJWT_SECRET=your-256-bit-random-string\n\n# Logging\nORCHESTRATOR_LOG_LEVEL=info\nCONTROL_CENTER_LOG_LEVEL=info\nMCP_SERVER_LOG_LEVEL=info\n\n# Optional: Gitea Configuration\nGITEA_DOMAIN=localhost:3000\nGITEA_ROOT_URL=http://localhost:3000/\n```\n\n---\n\n### 3. platform-stack.cicd.yml.ncl\n\n**Purpose**: Ephemeral CI/CD pipeline stack with minimal persistence\n\n**Services** (2 total):\n- `orchestrator` - API-only mode (no UI, streamlined for programmatic use)\n- `api-gateway` - Optional: Request routing and authentication\n\n**Configuration**:\n- Network: Bridge network\n- Volumes:\n - `orchestrator-tmpfs` - Temporary storage (tmpfs - in-memory, no persistence)\n- Ports:\n - 9090 - Orchestrator API (read-only orchestrator state)\n - 8000 - API Gateway (optional)\n- Health Checks: Fast checks (10-second intervals)\n- Restart Policy: `no` (containers do not auto-restart)\n- Logging: Minimal (only warnings and errors)\n- Cleanup: All artifacts deleted when containers stop\n\n**Characteristics**:\n- **Ephemeral**: No persistent storage (uses tmpfs)\n- **Fast Startup**: Minimal services, quick boot time\n- **API-First**: No UI, command-line/API integration only\n- **Stateless**: Clean slate each run\n- **Low Resource**: Minimal memory/CPU footprint\n\n**Usage**:\n\n```\n# Generate from Nickel template\nnickel export --format json platform-stack.cicd.yml.ncl | yq -P > docker-compose.cicd.yml\n\n# Start ephemeral stack\ndocker-compose -f docker-compose.cicd.yml up\n\n# Run CI/CD commands (in parallel terminal)\ncurl -X POST http://localhost:9090/api/workflows \\n -H "Content-Type: application/json" \\n -d @workflow.json\n\n# Stop and cleanup (all data lost)\ndocker-compose -f docker-compose.cicd.yml down\n# Or with volume cleanup\ndocker-compose -f docker-compose.cicd.yml down -v\n```\n\n**CI/CD Integration Example**:\n\n```\n# GitHub Actions workflow\n- name: Start Provisioning Stack\n run: docker-compose -f docker-compose.cicd.yml up -d\n\n- name: Run Tests\n run: |\n ./tests/integration.sh\n curl -X GET http://localhost:9090/health\n\n- name: Cleanup\n if: always()\n run: docker-compose -f docker-compose.cicd.yml down -v\n```\n\n**Environment Variables** (minimal):\n\n```\n# Logging (optional)\nORCHESTRATOR_LOG_LEVEL=warn\n```\n\n---\n\n### 4. platform-stack.enterprise.yml.ncl\n\n**Purpose**: Production-grade high-availability deployment\n\n**Services** (10+ total):\n- `postgres` - PostgreSQL 15 (primary database)\n- `orchestrator` (3 replicas) - Load-balanced workflow engine\n- `control-center` (2 replicas) - Load-balanced policy management\n- `mcp-server` (1-2 replicas) - MCP server for AI integration\n- `surrealdb-1`, `surrealdb-2`, `surrealdb-3` - SurrealDB cluster (3 nodes)\n- `nginx` - Load balancer and reverse proxy\n- `prometheus` - Metrics collection\n- `grafana` - Visualization and dashboards\n- `loki` - Log aggregation\n\n**Configuration**:\n- Network: Custom bridge network named `provisioning-enterprise`\n- Volumes:\n - `postgres-data` - PostgreSQL HA storage\n - `surrealdb-node-1`, `surrealdb-node-2`, `surrealdb-node-3` - Cluster storage\n - `prometheus-data` - Metrics storage\n - `grafana-data` - Grafana configuration\n - `loki-data` - Log storage\n - `logs` - Shared log aggregation\n- Ports:\n - 80 - HTTP (Nginx reverse proxy)\n - 443 - HTTPS (TLS - requires certificates)\n - 9090 - Orchestrator API (internal)\n - 8080 - Control Center UI (internal)\n - 8888 - MCP Server (internal)\n - 5432 - PostgreSQL (internal only)\n - 8000 - SurrealDB cluster (internal)\n - 9091 - Prometheus metrics (internal)\n - 3000 - Grafana dashboards (external)\n- Service Dependencies:\n - Control Center waits for PostgreSQL\n - Orchestrator waits for SurrealDB cluster\n - MCP Server waits for Orchestrator and Control Center\n - Prometheus waits for all services\n- Health Checks: 30-second intervals with 10-second timeout\n- Restart Policy: `always` (high availability)\n- Load Balancing: Nginx upstream blocks for orchestrator, control-center\n- Logging: JSON format with 500MB files, kept 30 versions\n\n**Architecture**:\n\n```\n┌──────────────────────┐\n│ External Client │\n│ (HTTPS, Port 443) │\n└──────────┬───────────┘\n │\n ┌──────▼──────────┐\n │ Nginx Load │\n │ Balancer │\n │ (TLS, CORS, │\n │ Rate Limiting) │\n └───────┬──────┬──────┬─────┐\n │ │ │ │\n ┌────────▼──┐ ┌──────▼──┐ ┌──▼────────┐\n │Orchestrator│ │Control │ │MCP Server │\n │ (3 copies) │ │ Center │ │ (1-2 copy)│\n │ │ │(2 copies)│ │ │\n └────────┬──┘ └─────┬───┘ └──┬───────┘\n │ │ │\n ┌───────▼────────┬──▼────┐ │\n │ SurrealDB │ PostSQL │\n │ Cluster │ HA │\n │ (3 nodes) │ (Primary/│\n │ │ Replica)│\n └────────────────┴──────────┘\n\nObservability Stack:\n┌────────────┬───────────┬───────────┐\n│ Prometheus │ Grafana │ Loki │\n│ (Metrics) │(Dashboard)│ (Logs) │\n└────────────┴───────────┴───────────┘\n```\n\n**Usage**:\n\n```\n# Generate from Nickel template\nnickel export --format json platform-stack.enterprise.yml.ncl | yq -P > docker-compose.enterprise.yml\n\n# Create environment file with secrets\ncat > .env.enterprise << 'EOF'\n# Database\nDB_PASSWORD=generate-strong-password\nSURREALDB_PASSWORD=generate-strong-password\n\n# Security\nJWT_SECRET=generate-256-bit-random-string\nADMIN_PASSWORD=generate-strong-admin-password\n\n# TLS Certificates\nTLS_CERT_PATH=/path/to/cert.pem\nTLS_KEY_PATH=/path/to/key.pem\n\n# Logging and Monitoring\nPROMETHEUS_RETENTION=30d\nGRAFANA_ADMIN_PASSWORD=generate-strong-password\nLOKI_RETENTION_DAYS=30\nEOF\n\n# Start entire stack\ndocker-compose -f docker-compose.enterprise.yml --env-file .env.enterprise up -d\n\n# Verify all services are healthy\ndocker-compose -f docker-compose.enterprise.yml ps\n\n# Check load balancer status\ncurl -H "Host: orchestrator.example.com" http://localhost/health\n\n# Access monitoring\n# Grafana: http://localhost:3000 (admin/password)\n# Prometheus: http://localhost:9091 (internal)\n# Loki: http://localhost:3100 (internal)\n```\n\n**Production Checklist**:\n- [ ] Generate strong database passwords (32+ characters)\n- [ ] Generate strong JWT secret (256-bit random string)\n- [ ] Provision valid TLS certificates (not self-signed)\n- [ ] Configure Nginx upstream health checks\n- [ ] Set up log retention policies (30+ days)\n- [ ] Enable Prometheus scraping with 15-second intervals\n- [ ] Configure Grafana dashboards and alerts\n- [ ] Test SurrealDB cluster failover\n- [ ] Document backup procedures\n- [ ] Enable PostgreSQL replication and backups\n- [ ] Configure external log aggregation (ELK stack, Splunk, etc.)\n\n**Environment Variables** (in `.env.enterprise`):\n\n```\n# Database Credentials (CRITICAL)\nDB_PASSWORD=your-strong-password-32-chars-min\nSURREALDB_PASSWORD=your-strong-password-32-chars-min\n\n# Security\nJWT_SECRET=your-256-bit-random-base64-encoded-string\nADMIN_PASSWORD=your-strong-admin-password\n\n# TLS/HTTPS\nTLS_CERT_PATH=/etc/provisioning/certs/server.crt\nTLS_KEY_PATH=/etc/provisioning/certs/server.key\n\n# Logging and Monitoring\nPROMETHEUS_RETENTION=30d\nPROMETHEUS_SCRAPE_INTERVAL=15s\nGRAFANA_ADMIN_USER=admin\nGRAFANA_ADMIN_PASSWORD=your-strong-grafana-password\nLOKI_RETENTION_DAYS=30\n\n# Optional: External Integrations\nSLACK_WEBHOOK_URL=https://hooks.slack.com/services/xxxxxxx\nPAGERDUTY_INTEGRATION_KEY=your-pagerduty-key\n```\n\n---\n\n## Workflow: From Nickel to Docker Compose\n\n### 1. Configuration Source (values/*.ncl)\n\n```\n# values/orchestrator.enterprise.ncl\n{\n orchestrator = {\n server = {\n host = "0.0.0.0",\n port = 9090,\n workers = 8,\n },\n storage = {\n backend = 'surrealdb_cluster,\n surrealdb_url = "surrealdb://surrealdb-1:8000",\n },\n queue = {\n max_concurrent_tasks = 100,\n retry_attempts = 5,\n task_timeout = 7200000,\n },\n monitoring = {\n enabled = true,\n metrics_interval = 10,\n },\n },\n}\n```\n\n### 2. Template Generation (Nickel → JSON)\n\n```\n# Exports Nickel config as JSON\nnickel export --format json platform-stack.enterprise.yml.ncl\n```\n\n### 3. YAML Conversion (JSON → YAML)\n\n```\n# Converts JSON to YAML format\nnickel export --format json platform-stack.enterprise.yml.ncl | yq -P > docker-compose.enterprise.yml\n```\n\n### 4. Deployment (YAML → Running Containers)\n\n```\n# Starts all services defined in YAML\ndocker-compose -f docker-compose.enterprise.yml up -d\n```\n\n---\n\n## Common Customizations\n\n### Change Service Replicas\n\nEdit the template to adjust replica counts:\n\n```\n# In platform-stack.enterprise.yml.ncl\nlet orchestrator_replicas = 5 in # Instead of 3\nlet control_center_replicas = 3 in # Instead of 2\nservices.orchestrator_replicas\n```\n\n### Add Custom Service\n\nAdd to the template services record:\n\n```\n# In platform-stack.enterprise.yml.ncl\nservices = base_services & {\n custom_service = {\n image = "custom:latest",\n ports = ["9999:9999"],\n volumes = ["custom-data:/data"],\n restart = "always",\n healthcheck = {\n test = ["CMD", "curl", "-f", "http://localhost:9999/health"],\n interval = "30s",\n timeout = "10s",\n retries = 3,\n },\n },\n}\n```\n\n### Modify Resource Limits\n\nIn each service definition:\n\n```\norchestrator = {\n deploy = {\n resources = {\n limits = {\n cpus = "2.0",\n memory = "2G",\n },\n reservations = {\n cpus = "1.0",\n memory = "1G",\n },\n },\n },\n}\n```\n\n---\n\n## Validation and Testing\n\n### Syntax Validation\n\n```\n# Validate YAML before deploying\ndocker-compose -f docker-compose.enterprise.yml config --quiet\n\n# Check service definitions\ndocker-compose -f docker-compose.enterprise.yml ps\n```\n\n### Health Checks\n\n```\n# Monitor health of all services\nwatch docker-compose ps\n\n# Check specific service health\ndocker-compose exec orchestrator curl -s http://localhost:9090/health\n```\n\n### Log Inspection\n\n```\n# View logs from all services\ndocker-compose logs -f\n\n# View logs from specific service\ndocker-compose logs -f orchestrator\n\n# Follow specific container\ndocker logs -f $(docker ps | grep orchestrator | awk '{print $1}')\n```\n\n---\n\n## Troubleshooting\n\n### Port Already in Use\n\n**Error**: `bind: address already in use`\n\n**Fix**: Change port in template or stop conflicting container:\n\n```\n# Find process using port\nlsof -i :9090\n\n# Kill process\nkill -9 \n\n# Or change port in docker-compose file\nports:\n - "9999:9090" # Use 9999 instead\n```\n\n### Service Fails to Start\n\n**Check logs**:\n\n```\ndocker-compose logs orchestrator\n```\n\n**Common causes**:\n- Port conflict - Check if another service uses port\n- Missing volume - Create volume before starting\n- Network connectivity - Verify docker network exists\n- Database not ready - Wait for db service to become healthy\n- Configuration error - Validate YAML syntax\n\n### Persistent Volume Issues\n\n**Clean volumes** (WARNING: Deletes data):\n\n```\ndocker-compose down -v\ndocker volume prune -f\n```\n\n---\n\n## See Also\n\n- **Kubernetes Templates**: `../kubernetes/` - For production K8s deployments\n- **Configuration System**: `../../` - Full configuration documentation\n- **Examples**: `../../examples/` - Example deployment scenarios\n- **Scripts**: `../../scripts/` - Automation scripts\n\n---\n\n**Version**: 1.0\n**Last Updated**: 2025-01-05\n**Status**: Production Ready diff --git a/schemas/platform/templates/kubernetes/README.md b/schemas/platform/templates/kubernetes/README.md index 7326386..30ffa96 100644 --- a/schemas/platform/templates/kubernetes/README.md +++ b/schemas/platform/templates/kubernetes/README.md @@ -1 +1 @@ -# Kubernetes Templates\n\nNickel-based Kubernetes manifest templates for provisioning platform services.\n\n## Overview\n\nThis directory contains Kubernetes deployment manifests written in Nickel language. These templates are parameterized to support all four deployment modes:\n\n- **solo**: Single developer, 1 replica per service, minimal resources\n- **multiuser**: Team collaboration, 1-2 replicas per service, PostgreSQL + SurrealDB\n- **cicd**: CI/CD pipelines, 1 replica, stateless and ephemeral\n- **enterprise**: Production HA, 2-3 replicas per service, full monitoring stack\n\n## Templates\n\n### Service Deployments\n\n#### orchestrator-deployment.yaml.ncl\nOrchestrator workflow engine deployment with:\n- 3 replicas (enterprise mode, override per mode)\n- Service account for RBAC\n- Health checks (liveness + readiness probes)\n- Resource requests/limits (500m CPU, 512Mi RAM minimum)\n- Volume mounts for data and logs\n- Pod anti-affinity for distributed deployment\n- Init containers for dependency checking\n\n**Mode-specific overrides**:\n- Solo: 1 replica, filesystem storage\n- MultiUser: 1 replica, SurrealDB backend\n- CI/CD: 1 replica, ephemeral storage\n- Enterprise: 3 replicas, SurrealDB cluster\n\n#### orchestrator-service.yaml.ncl\nInternal ClusterIP service for orchestrator with:\n- Session affinity (3-hour timeout)\n- Port 9090 (HTTP API)\n- Port 9091 (Metrics)\n- Internal access only (ClusterIP)\n\n**Mode-specific overrides**:\n- Enterprise: LoadBalancer for external access\n\n#### control-center-deployment.yaml.ncl\nControl Center policy and RBAC management with:\n- 2 replicas (enterprise mode)\n- Database integration (PostgreSQL or RocksDB)\n- RBAC and JWT configuration\n- MFA support\n- Health checks and resource limits\n- Security context (non-root user)\n\n**Environment variables**:\n- Database type and URL\n- RBAC enablement\n- JWT issuer, audience, secret\n- MFA requirement\n- Log level\n\n#### control-center-service.yaml.ncl\nInternal ClusterIP service for Control Center with:\n- Port 8080 (HTTP API + UI)\n- Port 8081 (Metrics)\n- Session affinity\n\n#### mcp-server-deployment.yaml.ncl\nModel Context Protocol server for AI/LLM integration with:\n- Lightweight deployment (100m CPU, 128Mi RAM minimum)\n- Orchestrator integration\n- Control Center integration\n- MCP capabilities (tools, resources, prompts)\n- Tool concurrency limits\n- Resource size limits\n\n**Mode-specific overrides**:\n- Solo: 1 replica\n- Enterprise: 2 replicas for HA\n\n#### mcp-server-service.yaml.ncl\nInternal ClusterIP service for MCP server with:\n- Port 8888 (HTTP API)\n- Port 8889 (Metrics)\n\n### Networking\n\n#### platform-ingress.yaml.ncl\nNginx ingress for external HTTP/HTTPS routing with:\n- TLS termination with Let's Encrypt (cert-manager)\n- CORS configuration\n- Security headers (HSTS, X-Frame-Options, etc.)\n- Rate limiting (1000 RPS, 100 connections)\n- Path-based routing to services\n\n**Routes**:\n- `api.example.com/orchestrator` → orchestrator:9090\n- `control-center.example.com/` → control-center:8080\n- `mcp.example.com/` → mcp-server:8888\n- `orchestrator.example.com/api` → orchestrator:9090\n- `orchestrator.example.com/policy` → control-center:8080\n\n### Namespace and Cluster Configuration\n\n#### namespace.yaml.ncl\nKubernetes Namespace for provisioning platform with:\n- Pod security policies (baseline enforcement)\n- Labels for organization and monitoring\n- Annotations for description\n\n#### resource-quota.yaml.ncl\nResourceQuota for resource consumption limits:\n- **CPU**: 8 requests / 16 limits (total)\n- **Memory**: 16GB requests / 32GB limits (total)\n- **Storage**: 200GB (persistent volumes)\n- **Pod limit**: 20 pods maximum\n- **Services**: 10 maximum\n- **ConfigMaps/Secrets**: 50 each\n- **Deployments/StatefulSets/Jobs**: Limited per type\n\n**Mode-specific overrides**:\n- Solo: 4 CPU / 8GB memory, 10 pods\n- MultiUser: 8 CPU / 16GB memory, 20 pods\n- CI/CD: 16 CPU / 32GB memory, 50 pods (ephemeral)\n- Enterprise: Unlimited (managed externally)\n\n#### network-policy.yaml.ncl\nNetworkPolicy for network isolation and security:\n- **Ingress**: Allow traffic from Nginx, inter-pod, Prometheus, DNS\n- **Egress**: Allow DNS queries, inter-pod, external HTTPS\n- **Default**: Deny all except explicitly allowed\n\n**Ports managed**:\n- 9090: Orchestrator API\n- 8080: Control Center API/UI\n- 8888: MCP Server\n- 5432: PostgreSQL\n- 8000: SurrealDB\n- 53: DNS (TCP/UDP)\n- 443/80: External HTTPS/HTTP\n\n#### rbac.yaml.ncl\nRole-Based Access Control (RBAC) setup with:\n- **ServiceAccounts**: orchestrator, control-center, mcp-server\n- **Roles**: Minimal permissions per service\n- **RoleBindings**: Connect ServiceAccounts to Roles\n\n**Permissions**:\n- Orchestrator: Read ConfigMaps, Secrets, Pods, Services\n- Control Center: Read/Write Secrets, ConfigMaps, Deployments\n- MCP Server: Read ConfigMaps, Secrets, Pods, Services\n\n## Usage\n\n### Rendering Templates\n\nEach template is a Nickel file that exports to JSON, then converts to YAML:\n\n```\n# Render a single template\nnickel eval --format json orchestrator-deployment.yaml.ncl | yq -P > orchestrator-deployment.yaml\n\n# Render all templates\nfor template in *.ncl; do\n nickel eval --format json "$template" | yq -P > "${template%.ncl}.yaml"\ndone\n```\n\n### Deploying to Kubernetes\n\n```\n# Create namespace\nkubectl create namespace provisioning\n\n# Create ConfigMaps for configuration\nkubectl create configmap orchestrator-config \\n --from-literal=storage_backend=surrealdb \\n --from-literal=max_concurrent_tasks=50 \\n --from-literal=batch_parallel_limit=20 \\n --from-literal=log_level=info \\n -n provisioning\n\n# Create secrets for sensitive data\nkubectl create secret generic control-center-secrets \\n --from-literal=database_url="postgresql://user:pass@postgres/provisioning" \\n --from-literal=jwt_secret="your-jwt-secret-here" \\n -n provisioning\n\n# Apply manifests\nkubectl apply -f orchestrator-deployment.yaml -n provisioning\nkubectl apply -f orchestrator-service.yaml -n provisioning\nkubectl apply -f control-center-deployment.yaml -n provisioning\nkubectl apply -f control-center-service.yaml -n provisioning\nkubectl apply -f mcp-server-deployment.yaml -n provisioning\nkubectl apply -f mcp-server-service.yaml -n provisioning\nkubectl apply -f platform-ingress.yaml -n provisioning\n```\n\n### Verifying Deployment\n\n```\n# Check deployments\nkubectl get deployments -n provisioning\n\n# Check services\nkubectl get svc -n provisioning\n\n# Check ingress\nkubectl get ingress -n provisioning\n\n# View logs\nkubectl logs -n provisioning -l app=orchestrator -f\nkubectl logs -n provisioning -l app=control-center -f\nkubectl logs -n provisioning -l app=mcp-server -f\n\n# Describe resource\nkubectl describe deployment orchestrator -n provisioning\nkubectl describe service orchestrator -n provisioning\n```\n\n## ConfigMaps and Secrets\n\n### Required ConfigMaps\n\n#### orchestrator-config\n\n```\napiVersion: v1\nkind: ConfigMap\nmetadata:\n name: orchestrator-config\n namespace: provisioning\ndata:\n storage_backend: "surrealdb" # or "filesystem"\n max_concurrent_tasks: "50" # Must match constraint.orchestrator.queue.concurrent_tasks.max\n batch_parallel_limit: "20" # Must match constraint.orchestrator.batch.parallel_limit.max\n log_level: "info"\n```\n\n#### control-center-config\n\n```\napiVersion: v1\nkind: ConfigMap\nmetadata:\n name: control-center-config\n namespace: provisioning\ndata:\n database_type: "postgres" # or "rocksdb"\n rbac_enabled: "true"\n jwt_issuer: "provisioning.local"\n jwt_audience: "orchestrator"\n mfa_required: "true" # Enterprise only\n log_level: "info"\n```\n\n#### mcp-server-config\n\n```\napiVersion: v1\nkind: ConfigMap\nmetadata:\n name: mcp-server-config\n namespace: provisioning\ndata:\n protocol: "stdio" # or "http"\n orchestrator_url: "http://orchestrator:9090"\n control_center_url: "http://control-center:8080"\n enable_tools: "true"\n enable_resources: "true"\n enable_prompts: "true"\n max_concurrent_tools: "10"\n max_resource_size: "1073741824" # 1GB in bytes\n log_level: "info"\n```\n\n### Required Secrets\n\n#### control-center-secrets\n\n```\napiVersion: v1\nkind: Secret\nmetadata:\n name: control-center-secrets\n namespace: provisioning\ntype: Opaque\nstringData:\n database_url: "postgresql://user:password@postgres:5432/provisioning"\n jwt_secret: "your-secure-random-string-here"\n```\n\n## Persistence\n\nAll deployments use PersistentVolumeClaims for data storage:\n\n```\n# Create PersistentVolumes and PersistentVolumeClaims\nkubectl apply -f - < -n provisioning -- nslookup orchestrator\n\n# Check ingress routing\nkubectl describe ingress platform-ingress -n provisioning\n\n# Test connectivity from pod\nkubectl run -it --rm test --image=busybox -n provisioning -- wget http://orchestrator:9090/health\n```\n\n### TLS certificate issues\n\n```\n# Check certificate status\nkubectl describe certificate platform-tls-cert -n provisioning\n\n# Check cert-manager logs\nkubectl logs -n cert-manager deployment/cert-manager -f\n```\n\n## References\n\n- [Kubernetes Deployment API](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/deployment-v1/)\n- [Kubernetes Service API](https://kubernetes.io/docs/reference/kubernetes-api/service-resources/service-v1/)\n- [Kubernetes Ingress API](https://kubernetes.io/docs/reference/kubernetes-api/service-resources/ingress-v1/)\n- [Nginx Ingress Controller](https://kubernetes.github.io/ingress-nginx/)\n- [Cert-manager](https://cert-manager.io/) \ No newline at end of file +# Kubernetes Templates\n\nNickel-based Kubernetes manifest templates for provisioning platform services.\n\n## Overview\n\nThis directory contains Kubernetes deployment manifests written in Nickel language. These templates are parameterized to support all four deployment modes:\n\n- **solo**: Single developer, 1 replica per service, minimal resources\n- **multiuser**: Team collaboration, 1-2 replicas per service, PostgreSQL + SurrealDB\n- **cicd**: CI/CD pipelines, 1 replica, stateless and ephemeral\n- **enterprise**: Production HA, 2-3 replicas per service, full monitoring stack\n\n## Templates\n\n### Service Deployments\n\n#### orchestrator-deployment.yaml.ncl\nOrchestrator workflow engine deployment with:\n- 3 replicas (enterprise mode, override per mode)\n- Service account for RBAC\n- Health checks (liveness + readiness probes)\n- Resource requests/limits (500m CPU, 512Mi RAM minimum)\n- Volume mounts for data and logs\n- Pod anti-affinity for distributed deployment\n- Init containers for dependency checking\n\n**Mode-specific overrides**:\n- Solo: 1 replica, filesystem storage\n- MultiUser: 1 replica, SurrealDB backend\n- CI/CD: 1 replica, ephemeral storage\n- Enterprise: 3 replicas, SurrealDB cluster\n\n#### orchestrator-service.yaml.ncl\nInternal ClusterIP service for orchestrator with:\n- Session affinity (3-hour timeout)\n- Port 9090 (HTTP API)\n- Port 9091 (Metrics)\n- Internal access only (ClusterIP)\n\n**Mode-specific overrides**:\n- Enterprise: LoadBalancer for external access\n\n#### control-center-deployment.yaml.ncl\nControl Center policy and RBAC management with:\n- 2 replicas (enterprise mode)\n- Database integration (PostgreSQL or RocksDB)\n- RBAC and JWT configuration\n- MFA support\n- Health checks and resource limits\n- Security context (non-root user)\n\n**Environment variables**:\n- Database type and URL\n- RBAC enablement\n- JWT issuer, audience, secret\n- MFA requirement\n- Log level\n\n#### control-center-service.yaml.ncl\nInternal ClusterIP service for Control Center with:\n- Port 8080 (HTTP API + UI)\n- Port 8081 (Metrics)\n- Session affinity\n\n#### mcp-server-deployment.yaml.ncl\nModel Context Protocol server for AI/LLM integration with:\n- Lightweight deployment (100m CPU, 128Mi RAM minimum)\n- Orchestrator integration\n- Control Center integration\n- MCP capabilities (tools, resources, prompts)\n- Tool concurrency limits\n- Resource size limits\n\n**Mode-specific overrides**:\n- Solo: 1 replica\n- Enterprise: 2 replicas for HA\n\n#### mcp-server-service.yaml.ncl\nInternal ClusterIP service for MCP server with:\n- Port 8888 (HTTP API)\n- Port 8889 (Metrics)\n\n### Networking\n\n#### platform-ingress.yaml.ncl\nNginx ingress for external HTTP/HTTPS routing with:\n- TLS termination with Let's Encrypt (cert-manager)\n- CORS configuration\n- Security headers (HSTS, X-Frame-Options, etc.)\n- Rate limiting (1000 RPS, 100 connections)\n- Path-based routing to services\n\n**Routes**:\n- `api.example.com/orchestrator` → orchestrator:9090\n- `control-center.example.com/` → control-center:8080\n- `mcp.example.com/` → mcp-server:8888\n- `orchestrator.example.com/api` → orchestrator:9090\n- `orchestrator.example.com/policy` → control-center:8080\n\n### Namespace and Cluster Configuration\n\n#### namespace.yaml.ncl\nKubernetes Namespace for provisioning platform with:\n- Pod security policies (baseline enforcement)\n- Labels for organization and monitoring\n- Annotations for description\n\n#### resource-quota.yaml.ncl\nResourceQuota for resource consumption limits:\n- **CPU**: 8 requests / 16 limits (total)\n- **Memory**: 16GB requests / 32GB limits (total)\n- **Storage**: 200GB (persistent volumes)\n- **Pod limit**: 20 pods maximum\n- **Services**: 10 maximum\n- **ConfigMaps/Secrets**: 50 each\n- **Deployments/StatefulSets/Jobs**: Limited per type\n\n**Mode-specific overrides**:\n- Solo: 4 CPU / 8GB memory, 10 pods\n- MultiUser: 8 CPU / 16GB memory, 20 pods\n- CI/CD: 16 CPU / 32GB memory, 50 pods (ephemeral)\n- Enterprise: Unlimited (managed externally)\n\n#### network-policy.yaml.ncl\nNetworkPolicy for network isolation and security:\n- **Ingress**: Allow traffic from Nginx, inter-pod, Prometheus, DNS\n- **Egress**: Allow DNS queries, inter-pod, external HTTPS\n- **Default**: Deny all except explicitly allowed\n\n**Ports managed**:\n- 9090: Orchestrator API\n- 8080: Control Center API/UI\n- 8888: MCP Server\n- 5432: PostgreSQL\n- 8000: SurrealDB\n- 53: DNS (TCP/UDP)\n- 443/80: External HTTPS/HTTP\n\n#### rbac.yaml.ncl\nRole-Based Access Control (RBAC) setup with:\n- **ServiceAccounts**: orchestrator, control-center, mcp-server\n- **Roles**: Minimal permissions per service\n- **RoleBindings**: Connect ServiceAccounts to Roles\n\n**Permissions**:\n- Orchestrator: Read ConfigMaps, Secrets, Pods, Services\n- Control Center: Read/Write Secrets, ConfigMaps, Deployments\n- MCP Server: Read ConfigMaps, Secrets, Pods, Services\n\n## Usage\n\n### Rendering Templates\n\nEach template is a Nickel file that exports to JSON, then converts to YAML:\n\n```\n# Render a single template\nnickel eval --format json orchestrator-deployment.yaml.ncl | yq -P > orchestrator-deployment.yaml\n\n# Render all templates\nfor template in *.ncl; do\n nickel eval --format json "$template" | yq -P > "${template%.ncl}.yaml"\ndone\n```\n\n### Deploying to Kubernetes\n\n```\n# Create namespace\nkubectl create namespace provisioning\n\n# Create ConfigMaps for configuration\nkubectl create configmap orchestrator-config \\n --from-literal=storage_backend=surrealdb \\n --from-literal=max_concurrent_tasks=50 \\n --from-literal=batch_parallel_limit=20 \\n --from-literal=log_level=info \\n -n provisioning\n\n# Create secrets for sensitive data\nkubectl create secret generic control-center-secrets \\n --from-literal=database_url="postgresql://user:pass@postgres/provisioning" \\n --from-literal=jwt_secret="your-jwt-secret-here" \\n -n provisioning\n\n# Apply manifests\nkubectl apply -f orchestrator-deployment.yaml -n provisioning\nkubectl apply -f orchestrator-service.yaml -n provisioning\nkubectl apply -f control-center-deployment.yaml -n provisioning\nkubectl apply -f control-center-service.yaml -n provisioning\nkubectl apply -f mcp-server-deployment.yaml -n provisioning\nkubectl apply -f mcp-server-service.yaml -n provisioning\nkubectl apply -f platform-ingress.yaml -n provisioning\n```\n\n### Verifying Deployment\n\n```\n# Check deployments\nkubectl get deployments -n provisioning\n\n# Check services\nkubectl get svc -n provisioning\n\n# Check ingress\nkubectl get ingress -n provisioning\n\n# View logs\nkubectl logs -n provisioning -l app=orchestrator -f\nkubectl logs -n provisioning -l app=control-center -f\nkubectl logs -n provisioning -l app=mcp-server -f\n\n# Describe resource\nkubectl describe deployment orchestrator -n provisioning\nkubectl describe service orchestrator -n provisioning\n```\n\n## ConfigMaps and Secrets\n\n### Required ConfigMaps\n\n#### orchestrator-config\n\n```\napiVersion: v1\nkind: ConfigMap\nmetadata:\n name: orchestrator-config\n namespace: provisioning\ndata:\n storage_backend: "surrealdb" # or "filesystem"\n max_concurrent_tasks: "50" # Must match constraint.orchestrator.queue.concurrent_tasks.max\n batch_parallel_limit: "20" # Must match constraint.orchestrator.batch.parallel_limit.max\n log_level: "info"\n```\n\n#### control-center-config\n\n```\napiVersion: v1\nkind: ConfigMap\nmetadata:\n name: control-center-config\n namespace: provisioning\ndata:\n database_type: "postgres" # or "rocksdb"\n rbac_enabled: "true"\n jwt_issuer: "provisioning.local"\n jwt_audience: "orchestrator"\n mfa_required: "true" # Enterprise only\n log_level: "info"\n```\n\n#### mcp-server-config\n\n```\napiVersion: v1\nkind: ConfigMap\nmetadata:\n name: mcp-server-config\n namespace: provisioning\ndata:\n protocol: "stdio" # or "http"\n orchestrator_url: "http://orchestrator:9090"\n control_center_url: "http://control-center:8080"\n enable_tools: "true"\n enable_resources: "true"\n enable_prompts: "true"\n max_concurrent_tools: "10"\n max_resource_size: "1073741824" # 1GB in bytes\n log_level: "info"\n```\n\n### Required Secrets\n\n#### control-center-secrets\n\n```\napiVersion: v1\nkind: Secret\nmetadata:\n name: control-center-secrets\n namespace: provisioning\ntype: Opaque\nstringData:\n database_url: "postgresql://user:password@postgres:5432/provisioning"\n jwt_secret: "your-secure-random-string-here"\n```\n\n## Persistence\n\nAll deployments use PersistentVolumeClaims for data storage:\n\n```\n# Create PersistentVolumes and PersistentVolumeClaims\nkubectl apply -f - < -n provisioning -- nslookup orchestrator\n\n# Check ingress routing\nkubectl describe ingress platform-ingress -n provisioning\n\n# Test connectivity from pod\nkubectl run -it --rm test --image=busybox -n provisioning -- wget http://orchestrator:9090/health\n```\n\n### TLS certificate issues\n\n```\n# Check certificate status\nkubectl describe certificate platform-tls-cert -n provisioning\n\n# Check cert-manager logs\nkubectl logs -n cert-manager deployment/cert-manager -f\n```\n\n## References\n\n- [Kubernetes Deployment API](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/deployment-v1/)\n- [Kubernetes Service API](https://kubernetes.io/docs/reference/kubernetes-api/service-resources/service-v1/)\n- [Kubernetes Ingress API](https://kubernetes.io/docs/reference/kubernetes-api/service-resources/ingress-v1/)\n- [Nginx Ingress Controller](https://kubernetes.github.io/ingress-nginx/)\n- [Cert-manager](https://cert-manager.io/) diff --git a/schemas/platform/usage-guide.md b/schemas/platform/usage-guide.md index 0ec9809..01c0251 100644 --- a/schemas/platform/usage-guide.md +++ b/schemas/platform/usage-guide.md @@ -1 +1 @@ -# Configuration System Usage Guide\n\nPractical guide for using the provisioning platform configuration system across common scenarios.\n\n## Quick Start (5 Minutes)\n\n### For Local Development\n\n```\n# 1. Enter configuration system directory\ncd provisioning/.typedialog/provisioning/platform\n\n# 2. Generate solo configuration (interactive)\nnu scripts/configure.nu orchestrator solo --backend cli\n\n# 3. Export to TOML\nnu scripts/generate-configs.nu orchestrator solo\n\n# 4. Start orchestrator\ncd ../../\nORCHESTRATOR_CONFIG=platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n### For Team Staging\n\n```\n# 1. Generate multiuser configuration\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu control-center multiuser --backend web\n\n# 2. Export configuration\nnu scripts/generate-configs.nu control-center multiuser\n\n# 3. Start with Docker Compose\ncd ../../\ndocker-compose -f platform/infrastructure/docker/docker-compose.multiuser.yml up -d\n```\n\n### For Production Enterprise\n\n```\n# 1. Generate enterprise configuration\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu orchestrator enterprise --backend web\n\n# 2. Export configuration\nnu scripts/generate-configs.nu orchestrator enterprise\n\n# 3. Deploy to Kubernetes\ncd ../../\nkubectl apply -f platform/infrastructure/kubernetes/namespace.yaml\nkubectl apply -f platform/infrastructure/kubernetes/*.yaml\n```\n\n---\n\n## Scenario 1: Single Developer Setup\n\n**Goal**: Set up local orchestrator for development testing\n**Time**: 5-10 minutes\n**Requirements**: Nushell, Nickel, Rust toolchain\n\n### Step 1: Interactive Configuration\n\n```\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu orchestrator solo --backend cli\n```\n\n**Form Fields**:\n- Workspace name: `dev-workspace` (default)\n- Workspace path: `/home/username/provisioning/data/orchestrator` (change to your path)\n- Server host: `127.0.0.1` (localhost only)\n- Server port: `9090` (default)\n- Storage backend: `filesystem` (selected by default)\n- Logging level: `debug` (recommended for dev)\n\n### Step 2: Validate Configuration\n\n```\n# Typecheck the generated Nickel\nnickel typecheck configs/orchestrator.solo.ncl\n\n# Should output: "✓ Type checking successful"\n```\n\n### Step 3: Export to TOML\n\n```\n# Generate TOML from Nickel\nnu scripts/generate-configs.nu orchestrator solo\n\n# Output: provisioning/platform/config/orchestrator.solo.toml\n```\n\n### Step 4: Start the Service\n\n```\ncd ../..\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n**Expected Output**:\n\n```\n[INFO] Orchestrator starting...\n[INFO] Server listening on 127.0.0.1:9090\n[INFO] Storage backend: filesystem\n[INFO] Ready to accept requests\n```\n\n### Step 5: Test the Service\n\nIn another terminal:\n\n```\n# Check health\ncurl http://localhost:9090/health\n\n# Submit a workflow\ncurl -X POST http://localhost:9090/api/workflows \\n -H "Content-Type: application/json" \\n -d '{"name": "test-workflow", "steps": []}'\n```\n\n### Iteration: Modify Configuration\n\nTo change configuration:\n\n**Option A: Re-run Interactive Form**\n\n```\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu orchestrator solo --backend cli\n# Answer with new values\nnu scripts/generate-configs.nu orchestrator solo\n# Restart service\n```\n\n**Option B: Edit TOML Directly**\n\n```\n# Edit the file directly\nvi provisioning/platform/config/orchestrator.solo.toml\n# Change values as needed\n# Restart service\n```\n\n**Option C: Environment Variable Override**\n\n```\n# No file changes needed\nexport ORCHESTRATOR_SERVER_PORT=9999\nexport ORCHESTRATOR_LOG_LEVEL=info\n\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n---\n\n## Scenario 2: Team Collaboration Setup\n\n**Goal**: Set up shared team environment with PostgreSQL and RBAC\n**Time**: 20-30 minutes\n**Requirements**: Docker, Docker Compose, PostgreSQL running\n\n### Step 1: Interactive Configuration\n\n```\ncd provisioning/.typedialog/provisioning/platform\n\n# Configure Control Center with RBAC\nnu scripts/configure.nu control-center multiuser --backend web\n```\n\n**Important Fields**:\n- Database backend: `postgres` (for persistent storage)\n- Database host: `postgres.provisioning.svc.cluster.local` or `localhost` for local\n- Database password: Generate strong password (store in `.env` file, don't hardcode)\n- JWT secret: Generate 256-bit random string\n- MFA required: `false` (optional for team, not required)\n- Default role: `viewer` (least privilege)\n\n### Step 2: Create Environment File\n\n```\n# Create .env for secrets\ncat > provisioning/platform/.env << 'EOF'\nDB_PASSWORD=generate-strong-password-here\nJWT_SECRET=generate-256-bit-random-base64-string\nSURREALDB_PASSWORD=another-strong-password\nEOF\n\n# Protect the file\nchmod 600 provisioning/platform/.env\n```\n\n### Step 3: Export Configurations\n\n```\n# Export all three services for team setup\nnu scripts/generate-configs.nu control-center multiuser\nnu scripts/generate-configs.nu orchestrator multiuser\nnu scripts/generate-configs.nu mcp-server multiuser\n```\n\n### Step 4: Start Services with Docker Compose\n\n```\ncd ../..\n\n# Generate Docker Compose from Nickel template\nnu provisioning/.typedialog/provisioning/platform/scripts/render-docker-compose.nu multiuser\n\n# Start all services\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml \\n --env-file provisioning/platform/.env \\n up -d\n```\n\n**Verify Services**:\n\n```\n# Check all services are running\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml ps\n\n# Check logs for errors\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml logs -f control-center\n\n# Test Control Center UI\nopen http://localhost:8080\n# Login with default credentials (or configure initially)\n```\n\n### Step 5: Create Team Users and Roles\n\n```\n# Access PostgreSQL to set up users\ndocker-compose exec postgres psql -U provisioning -d provisioning\n\n-- Create users\nINSERT INTO users (username, email, role) VALUES\n ('alice@company.com', 'alice@company.com', 'admin'),\n ('bob@company.com', 'bob@company.com', 'operator'),\n ('charlie@company.com', 'charlie@company.com', 'developer');\n\n-- Create RBAC assignments\nINSERT INTO role_assignments (user_id, role) VALUES\n ((SELECT id FROM users WHERE username='alice@company.com'), 'admin'),\n ((SELECT id FROM users WHERE username='bob@company.com'), 'operator'),\n ((SELECT id FROM users WHERE username='charlie@company.com'), 'developer');\n```\n\n### Step 6: Team Access\n\n**Admin (Alice)**:\n- Full platform access\n- Can create/modify users\n- Can manage all workflows and policies\n\n**Operator (Bob)**:\n- Execute and manage workflows\n- View logs and metrics\n- Cannot modify policies or users\n\n**Developer (Charlie)**:\n- Read-only access to workflows\n- Cannot execute or modify\n- Can view logs\n\n---\n\n## Scenario 3: Production Enterprise Deployment\n\n**Goal**: Deploy complete platform to Kubernetes with HA and monitoring\n**Time**: 1-2 hours (includes infrastructure setup)\n**Requirements**: Kubernetes cluster, kubectl, Helm (optional)\n\n### Step 1: Pre-Deployment Checklist\n\n```\n# Verify Kubernetes access\nkubectl cluster-info\n\n# Create namespace\nkubectl create namespace provisioning\n\n# Verify persistent volumes available\nkubectl get pv\n\n# Check node resources\nkubectl top nodes\n# Minimum 16 CPU, 32GB RAM across cluster\n```\n\n### Step 2: Interactive Configuration (Enterprise Mode)\n\n```\ncd provisioning/.typedialog/provisioning/platform\n\nnu scripts/configure.nu orchestrator enterprise --backend web\nnu scripts/configure.nu control-center enterprise --backend web\nnu scripts/configure.nu mcp-server enterprise --backend web\n```\n\n**Critical Enterprise Settings**:\n- Deployment mode: `enterprise`\n- Replicas: Orchestrator (3), Control Center (2), MCP Server (1-2)\n- Storage:\n - Orchestrator: `surrealdb_cluster` with 3 nodes\n - Control Center: `postgres` with HA\n- Security:\n - Auth: `jwt` (required)\n - TLS: `true` (required)\n - MFA: `true` (required)\n- Monitoring: All enabled\n- Logging: JSON format with 365-day retention\n\n### Step 3: Generate Secrets\n\n```\n# Generate secure values\nJWT_SECRET=$(openssl rand -base64 32)\nDB_PASSWORD=$(openssl rand -base64 32)\nSURREALDB_PASSWORD=$(openssl rand -base64 32)\nADMIN_PASSWORD=$(openssl rand -base64 16)\n\n# Create Kubernetes secret\nkubectl create secret generic provisioning-secrets \\n -n provisioning \\n --from-literal=jwt-secret="$JWT_SECRET" \\n --from-literal=db-password="$DB_PASSWORD" \\n --from-literal=surrealdb-password="$SURREALDB_PASSWORD" \\n --from-literal=admin-password="$ADMIN_PASSWORD"\n\n# Verify secret created\nkubectl get secrets -n provisioning\n```\n\n### Step 4: TLS Certificate Setup\n\n```\n# Generate self-signed certificate (for testing)\nopenssl req -x509 -nodes -days 365 -newkey rsa:2048 \\n -keyout provisioning.key \\n -out provisioning.crt \\n -subj "/CN=provisioning.example.com"\n\n# Create TLS secret in Kubernetes\nkubectl create secret tls provisioning-tls \\n -n provisioning \\n --cert=provisioning.crt \\n --key=provisioning.key\n\n# For production: Use cert-manager or real certificates\n# kubectl create secret tls provisioning-tls \\n# -n provisioning \\n# --cert=/path/to/cert.pem \\n# --key=/path/to/key.pem\n```\n\n### Step 5: Export Configurations\n\n```\n# Export TOML configurations\nnu scripts/generate-configs.nu orchestrator enterprise\nnu scripts/generate-configs.nu control-center enterprise\nnu scripts/generate-configs.nu mcp-server enterprise\n```\n\n### Step 6: Create ConfigMaps for Configuration\n\n```\n# Create ConfigMaps with exported TOML\nkubectl create configmap orchestrator-config \\n -n provisioning \\n --from-file=provisioning/platform/config/orchestrator.enterprise.toml\n\nkubectl create configmap control-center-config \\n -n provisioning \\n --from-file=provisioning/platform/config/control-center.enterprise.toml\n\nkubectl create configmap mcp-server-config \\n -n provisioning \\n --from-file=provisioning/platform/config/mcp-server.enterprise.toml\n```\n\n### Step 7: Deploy Infrastructure\n\n```\ncd ../..\n\n# Deploy in order of dependencies\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/namespace.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/resource-quota.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/rbac.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/network-policy.yaml\n\n# Deploy storage (PostgreSQL, SurrealDB)\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/postgres-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/surrealdb-*.yaml\n\n# Wait for databases to be ready\nkubectl wait --for=condition=ready pod -l app=postgres -n provisioning --timeout=300s\nkubectl wait --for=condition=ready pod -l app=surrealdb -n provisioning --timeout=300s\n\n# Deploy platform services\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/orchestrator-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/control-center-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/mcp-server-*.yaml\n\n# Deploy monitoring stack\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/prometheus-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/grafana-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/loki-*.yaml\n\n# Deploy ingress\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/platform-ingress.yaml\n```\n\n### Step 8: Verify Deployment\n\n```\n# Check all pods are running\nkubectl get pods -n provisioning\n\n# Check services\nkubectl get svc -n provisioning\n\n# Wait for all pods ready\nkubectl wait --for=condition=Ready pods --all -n provisioning --timeout=600s\n\n# Check ingress\nkubectl get ingress -n provisioning\n```\n\n### Step 9: Access the Platform\n\n```\n# Get Ingress IP\nkubectl get ingress -n provisioning\n\n# Configure DNS (or use /etc/hosts for testing)\necho "INGRESS_IP provisioning.example.com" | sudo tee -a /etc/hosts\n\n# Access services\n# Orchestrator: https://orchestrator.provisioning.example.com/api\n# Control Center: https://control-center.provisioning.example.com\n# MCP Server: https://mcp.provisioning.example.com\n# Grafana: https://grafana.provisioning.example.com (admin/password)\n# Prometheus: https://prometheus.provisioning.example.com (internal)\n```\n\n### Step 10: Post-Deployment Configuration\n\n```\n# Create database schema\nkubectl exec -it -n provisioning deployment/postgres -- psql -U provisioning -d provisioning -f /schema.sql\n\n# Initialize Grafana dashboards\nkubectl cp grafana-dashboards provisioning/grafana-0:/var/lib/grafana/dashboards/\n\n# Configure alerts\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/prometheus-alerts.yaml\n```\n\n---\n\n## Common Tasks\n\n### Change Configuration Value\n\n**Without Service Restart** (Environment Variable):\n\n```\n# Override specific value via environment variable\nexport ORCHESTRATOR_LOG_LEVEL=debug\nexport ORCHESTRATOR_SERVER_PORT=9999\n\n# Service uses overridden values\nORCHESTRATOR_CONFIG=config.toml cargo run --bin orchestrator\n```\n\n**With Service Restart** (TOML Edit):\n\n```\n# Edit TOML directly\nvi provisioning/platform/config/orchestrator.solo.toml\n\n# Restart service\npkill -f "cargo run --bin orchestrator"\nORCHESTRATOR_CONFIG=config.toml cargo run --bin orchestrator\n```\n\n**With Validation** (Regenerate from Form):\n\n```\n# Re-run interactive form to regenerate\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu orchestrator solo --backend cli\n\n# Validation ensures consistency\nnu scripts/generate-configs.nu orchestrator solo\n\n# Restart service with validated config\n```\n\n### Add Team Member\n\n**In Kubernetes PostgreSQL**:\n\n```\nkubectl exec -it -n provisioning deployment/postgres -- psql -U provisioning -d provisioning\n\n-- Create user\nINSERT INTO users (username, email, password_hash, role, created_at) VALUES\n ('newuser@company.com', 'newuser@company.com', crypt('password', gen_salt('bf')), 'developer', now());\n\n-- Assign role\nINSERT INTO role_assignments (user_id, role, granted_by, granted_at) VALUES\n ((SELECT id FROM users WHERE username='newuser@company.com'), 'developer', 1, now());\n```\n\n### Scale Service Replicas\n\n**In Kubernetes**:\n\n```\n# Scale orchestrator from 3 to 5 replicas\nkubectl scale deployment orchestrator -n provisioning --replicas=5\n\n# Verify scaling\nkubectl get deployment orchestrator -n provisioning\nkubectl get pods -n provisioning | grep orchestrator\n```\n\n### Monitor Service Health\n\n```\n# Check pod status\nkubectl describe pod orchestrator-0 -n provisioning\n\n# Check service logs\nkubectl logs -f deployment/orchestrator -n provisioning --all-containers=true\n\n# Check resource usage\nkubectl top pods -n provisioning\n\n# Check service metrics (via Prometheus)\nkubectl port-forward -n provisioning svc/prometheus 9091:9091\nopen http://localhost:9091\n```\n\n### Backup Configuration\n\n```\n# Backup current TOML configs\ntar -czf configs-backup-$(date +%Y%m%d).tar.gz provisioning/platform/config/\n\n# Backup Kubernetes manifests\nkubectl get all -n provisioning -o yaml > k8s-backup-$(date +%Y%m%d).yaml\n\n# Backup database\nkubectl exec -n provisioning deployment/postgres -- pg_dump -U provisioning provisioning | gzip > db-backup-$(date +%Y%m%d).sql.gz\n```\n\n### Troubleshoot Configuration Issues\n\n```\n# Check Nickel syntax errors\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Validate TOML syntax\nnickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Check TOML is valid for Rust\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator -- --validate-config\n\n# Check environment variable overrides\necho $ORCHESTRATOR_SERVER_PORT\necho $ORCHESTRATOR_LOG_LEVEL\n\n# Examine actual config loaded (if service logs it)\nORCHESTRATOR_CONFIG=config.toml cargo run --bin orchestrator 2>&1 | grep -i "config\|configuration"\n```\n\n---\n\n## Configuration File Locations\n\n```\nprovisioning/.typedialog/provisioning/platform/\n├── forms/ # User-facing interactive forms\n│ ├── orchestrator-form.toml\n│ ├── control-center-form.toml\n│ └── fragments/ # Reusable form sections\n│\n├── values/ # User input files (gitignored)\n│ ├── orchestrator.solo.ncl\n│ ├── orchestrator.enterprise.ncl\n│ └── (auto-generated by TypeDialog)\n│\n├── configs/ # Composed Nickel configs\n│ ├── orchestrator.solo.ncl # Base + mode overlay + user input + validation\n│ ├── control-center.multiuser.ncl\n│ └── (4 services × 4 modes = 16 files)\n│\n├── schemas/ # Type definitions\n│ ├── orchestrator.ncl\n│ ├── control-center.ncl\n│ └── common/ # Shared schemas\n│\n├── defaults/ # Default values\n│ ├── orchestrator-defaults.ncl\n│ └── deployment/solo-defaults.ncl\n│\n├── validators/ # Business rules\n│ ├── orchestrator-validator.ncl\n│ └── (per-service validators)\n│\n├── constraints/\n│ └── constraints.toml # Min/max values (single source of truth)\n│\n├── templates/ # Deployment templates\n│ ├── docker-compose/\n│ │ ├── platform-stack.solo.yml.ncl\n│ │ └── (4 modes)\n│ └── kubernetes/\n│ ├── orchestrator-deployment.yaml.ncl\n│ └── (11 templates)\n│\n└── scripts/ # Automation\n ├── configure.nu # Interactive TypeDialog\n ├── generate-configs.nu # Nickel → TOML export\n ├── validate-config.nu # Typecheck Nickel\n ├── render-docker-compose.nu # Templates → Docker Compose\n └── render-kubernetes.nu # Templates → Kubernetes\n```\n\nTOML output location:\n\n```\nprovisioning/platform/config/\n├── orchestrator.solo.toml # Consumed by orchestrator service\n├── control-center.enterprise.toml # Consumed by control-center service\n└── (4 services × 4 modes = 16 files)\n```\n\n---\n\n## Tips & Best Practices\n\n### 1. Use Version Control\n\n```\n# Commit TOML configs to track changes\ngit add provisioning/platform/config/*.toml\ngit commit -m "Update orchestrator enterprise config: increase worker threads to 16"\n\n# Do NOT commit Nickel source files in values/\necho "provisioning/.typedialog/provisioning/platform/values/*.ncl" >> .gitignore\n```\n\n### 2. Test Before Production Deployment\n\n```\n# Test in solo mode first\nnu scripts/configure.nu orchestrator solo\ncargo run --bin orchestrator\n\n# Then test in staging (multiuser mode)\nnu scripts/configure.nu orchestrator multiuser\ndocker-compose -f docker-compose.multiuser.yml up\n\n# Finally deploy to production (enterprise)\nnu scripts/configure.nu orchestrator enterprise\n# Then Kubernetes deployment\n```\n\n### 3. Document Custom Configurations\n\n```\n# Add comments to configurations\n# In values/*.ncl or config/*.ncl:\n\n# Custom configuration for high-throughput testing\n# - Increased workers from 4 to 8\n# - Increased queue.max_concurrent_tasks from 5 to 20\n# - Lowered logging level from debug to info\n{\n orchestrator = {\n # Worker threads increased for testing parallel task processing\n server.workers = 8,\n queue.max_concurrent_tasks = 20,\n logging.level = "info",\n },\n}\n```\n\n### 4. Secrets Management\n\n**Never** hardcode secrets in configuration files:\n\n```\n# WRONG - Don't do this\n[orchestrator.security]\njwt_secret = "hardcoded-secret-exposed-in-git"\n\n# RIGHT - Use environment variables\nexport ORCHESTRATOR_SECURITY_JWT_SECRET="actual-secret-from-vault"\n\n# TOML references it:\n[orchestrator.security]\njwt_secret = "${JWT_SECRET}" # Loaded at runtime\n```\n\n### 5. Monitor Changes\n\n```\n# Track configuration changes over time\ngit log --oneline provisioning/platform/config/\n\n# See what changed\ngit diff provisioning/platform/config/orchestrator.solo.toml\n```\n\n---\n\n**Version**: 1.0\n**Last Updated**: 2025-01-05\n**Status**: Production Ready \ No newline at end of file +# Configuration System Usage Guide\n\nPractical guide for using the provisioning platform configuration system across common scenarios.\n\n## Quick Start (5 Minutes)\n\n### For Local Development\n\n```\n# 1. Enter configuration system directory\ncd provisioning/.typedialog/provisioning/platform\n\n# 2. Generate solo configuration (interactive)\nnu scripts/configure.nu orchestrator solo --backend cli\n\n# 3. Export to TOML\nnu scripts/generate-configs.nu orchestrator solo\n\n# 4. Start orchestrator\ncd ../../\nORCHESTRATOR_CONFIG=platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n### For Team Staging\n\n```\n# 1. Generate multiuser configuration\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu control-center multiuser --backend web\n\n# 2. Export configuration\nnu scripts/generate-configs.nu control-center multiuser\n\n# 3. Start with Docker Compose\ncd ../../\ndocker-compose -f platform/infrastructure/docker/docker-compose.multiuser.yml up -d\n```\n\n### For Production Enterprise\n\n```\n# 1. Generate enterprise configuration\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu orchestrator enterprise --backend web\n\n# 2. Export configuration\nnu scripts/generate-configs.nu orchestrator enterprise\n\n# 3. Deploy to Kubernetes\ncd ../../\nkubectl apply -f platform/infrastructure/kubernetes/namespace.yaml\nkubectl apply -f platform/infrastructure/kubernetes/*.yaml\n```\n\n---\n\n## Scenario 1: Single Developer Setup\n\n**Goal**: Set up local orchestrator for development testing\n**Time**: 5-10 minutes\n**Requirements**: Nushell, Nickel, Rust toolchain\n\n### Step 1: Interactive Configuration\n\n```\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu orchestrator solo --backend cli\n```\n\n**Form Fields**:\n- Workspace name: `dev-workspace` (default)\n- Workspace path: `/home/username/provisioning/data/orchestrator` (change to your path)\n- Server host: `127.0.0.1` (localhost only)\n- Server port: `9090` (default)\n- Storage backend: `filesystem` (selected by default)\n- Logging level: `debug` (recommended for dev)\n\n### Step 2: Validate Configuration\n\n```\n# Typecheck the generated Nickel\nnickel typecheck configs/orchestrator.solo.ncl\n\n# Should output: "✓ Type checking successful"\n```\n\n### Step 3: Export to TOML\n\n```\n# Generate TOML from Nickel\nnu scripts/generate-configs.nu orchestrator solo\n\n# Output: provisioning/platform/config/orchestrator.solo.toml\n```\n\n### Step 4: Start the Service\n\n```\ncd ../..\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n**Expected Output**:\n\n```\n[INFO] Orchestrator starting...\n[INFO] Server listening on 127.0.0.1:9090\n[INFO] Storage backend: filesystem\n[INFO] Ready to accept requests\n```\n\n### Step 5: Test the Service\n\nIn another terminal:\n\n```\n# Check health\ncurl http://localhost:9090/health\n\n# Submit a workflow\ncurl -X POST http://localhost:9090/api/workflows \\n -H "Content-Type: application/json" \\n -d '{"name": "test-workflow", "steps": []}'\n```\n\n### Iteration: Modify Configuration\n\nTo change configuration:\n\n**Option A: Re-run Interactive Form**\n\n```\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu orchestrator solo --backend cli\n# Answer with new values\nnu scripts/generate-configs.nu orchestrator solo\n# Restart service\n```\n\n**Option B: Edit TOML Directly**\n\n```\n# Edit the file directly\nvi provisioning/platform/config/orchestrator.solo.toml\n# Change values as needed\n# Restart service\n```\n\n**Option C: Environment Variable Override**\n\n```\n# No file changes needed\nexport ORCHESTRATOR_SERVER_PORT=9999\nexport ORCHESTRATOR_LOG_LEVEL=info\n\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n---\n\n## Scenario 2: Team Collaboration Setup\n\n**Goal**: Set up shared team environment with PostgreSQL and RBAC\n**Time**: 20-30 minutes\n**Requirements**: Docker, Docker Compose, PostgreSQL running\n\n### Step 1: Interactive Configuration\n\n```\ncd provisioning/.typedialog/provisioning/platform\n\n# Configure Control Center with RBAC\nnu scripts/configure.nu control-center multiuser --backend web\n```\n\n**Important Fields**:\n- Database backend: `postgres` (for persistent storage)\n- Database host: `postgres.provisioning.svc.cluster.local` or `localhost` for local\n- Database password: Generate strong password (store in `.env` file, don't hardcode)\n- JWT secret: Generate 256-bit random string\n- MFA required: `false` (optional for team, not required)\n- Default role: `viewer` (least privilege)\n\n### Step 2: Create Environment File\n\n```\n# Create .env for secrets\ncat > provisioning/platform/.env << 'EOF'\nDB_PASSWORD=generate-strong-password-here\nJWT_SECRET=generate-256-bit-random-base64-string\nSURREALDB_PASSWORD=another-strong-password\nEOF\n\n# Protect the file\nchmod 600 provisioning/platform/.env\n```\n\n### Step 3: Export Configurations\n\n```\n# Export all three services for team setup\nnu scripts/generate-configs.nu control-center multiuser\nnu scripts/generate-configs.nu orchestrator multiuser\nnu scripts/generate-configs.nu mcp-server multiuser\n```\n\n### Step 4: Start Services with Docker Compose\n\n```\ncd ../..\n\n# Generate Docker Compose from Nickel template\nnu provisioning/.typedialog/provisioning/platform/scripts/render-docker-compose.nu multiuser\n\n# Start all services\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml \\n --env-file provisioning/platform/.env \\n up -d\n```\n\n**Verify Services**:\n\n```\n# Check all services are running\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml ps\n\n# Check logs for errors\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml logs -f control-center\n\n# Test Control Center UI\nopen http://localhost:8080\n# Login with default credentials (or configure initially)\n```\n\n### Step 5: Create Team Users and Roles\n\n```\n# Access PostgreSQL to set up users\ndocker-compose exec postgres psql -U provisioning -d provisioning\n\n-- Create users\nINSERT INTO users (username, email, role) VALUES\n ('alice@company.com', 'alice@company.com', 'admin'),\n ('bob@company.com', 'bob@company.com', 'operator'),\n ('charlie@company.com', 'charlie@company.com', 'developer');\n\n-- Create RBAC assignments\nINSERT INTO role_assignments (user_id, role) VALUES\n ((SELECT id FROM users WHERE username='alice@company.com'), 'admin'),\n ((SELECT id FROM users WHERE username='bob@company.com'), 'operator'),\n ((SELECT id FROM users WHERE username='charlie@company.com'), 'developer');\n```\n\n### Step 6: Team Access\n\n**Admin (Alice)**:\n- Full platform access\n- Can create/modify users\n- Can manage all workflows and policies\n\n**Operator (Bob)**:\n- Execute and manage workflows\n- View logs and metrics\n- Cannot modify policies or users\n\n**Developer (Charlie)**:\n- Read-only access to workflows\n- Cannot execute or modify\n- Can view logs\n\n---\n\n## Scenario 3: Production Enterprise Deployment\n\n**Goal**: Deploy complete platform to Kubernetes with HA and monitoring\n**Time**: 1-2 hours (includes infrastructure setup)\n**Requirements**: Kubernetes cluster, kubectl, Helm (optional)\n\n### Step 1: Pre-Deployment Checklist\n\n```\n# Verify Kubernetes access\nkubectl cluster-info\n\n# Create namespace\nkubectl create namespace provisioning\n\n# Verify persistent volumes available\nkubectl get pv\n\n# Check node resources\nkubectl top nodes\n# Minimum 16 CPU, 32GB RAM across cluster\n```\n\n### Step 2: Interactive Configuration (Enterprise Mode)\n\n```\ncd provisioning/.typedialog/provisioning/platform\n\nnu scripts/configure.nu orchestrator enterprise --backend web\nnu scripts/configure.nu control-center enterprise --backend web\nnu scripts/configure.nu mcp-server enterprise --backend web\n```\n\n**Critical Enterprise Settings**:\n- Deployment mode: `enterprise`\n- Replicas: Orchestrator (3), Control Center (2), MCP Server (1-2)\n- Storage:\n - Orchestrator: `surrealdb_cluster` with 3 nodes\n - Control Center: `postgres` with HA\n- Security:\n - Auth: `jwt` (required)\n - TLS: `true` (required)\n - MFA: `true` (required)\n- Monitoring: All enabled\n- Logging: JSON format with 365-day retention\n\n### Step 3: Generate Secrets\n\n```\n# Generate secure values\nJWT_SECRET=$(openssl rand -base64 32)\nDB_PASSWORD=$(openssl rand -base64 32)\nSURREALDB_PASSWORD=$(openssl rand -base64 32)\nADMIN_PASSWORD=$(openssl rand -base64 16)\n\n# Create Kubernetes secret\nkubectl create secret generic provisioning-secrets \\n -n provisioning \\n --from-literal=jwt-secret="$JWT_SECRET" \\n --from-literal=db-password="$DB_PASSWORD" \\n --from-literal=surrealdb-password="$SURREALDB_PASSWORD" \\n --from-literal=admin-password="$ADMIN_PASSWORD"\n\n# Verify secret created\nkubectl get secrets -n provisioning\n```\n\n### Step 4: TLS Certificate Setup\n\n```\n# Generate self-signed certificate (for testing)\nopenssl req -x509 -nodes -days 365 -newkey rsa:2048 \\n -keyout provisioning.key \\n -out provisioning.crt \\n -subj "/CN=provisioning.example.com"\n\n# Create TLS secret in Kubernetes\nkubectl create secret tls provisioning-tls \\n -n provisioning \\n --cert=provisioning.crt \\n --key=provisioning.key\n\n# For production: Use cert-manager or real certificates\n# kubectl create secret tls provisioning-tls \\n# -n provisioning \\n# --cert=/path/to/cert.pem \\n# --key=/path/to/key.pem\n```\n\n### Step 5: Export Configurations\n\n```\n# Export TOML configurations\nnu scripts/generate-configs.nu orchestrator enterprise\nnu scripts/generate-configs.nu control-center enterprise\nnu scripts/generate-configs.nu mcp-server enterprise\n```\n\n### Step 6: Create ConfigMaps for Configuration\n\n```\n# Create ConfigMaps with exported TOML\nkubectl create configmap orchestrator-config \\n -n provisioning \\n --from-file=provisioning/platform/config/orchestrator.enterprise.toml\n\nkubectl create configmap control-center-config \\n -n provisioning \\n --from-file=provisioning/platform/config/control-center.enterprise.toml\n\nkubectl create configmap mcp-server-config \\n -n provisioning \\n --from-file=provisioning/platform/config/mcp-server.enterprise.toml\n```\n\n### Step 7: Deploy Infrastructure\n\n```\ncd ../..\n\n# Deploy in order of dependencies\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/namespace.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/resource-quota.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/rbac.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/network-policy.yaml\n\n# Deploy storage (PostgreSQL, SurrealDB)\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/postgres-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/surrealdb-*.yaml\n\n# Wait for databases to be ready\nkubectl wait --for=condition=ready pod -l app=postgres -n provisioning --timeout=300s\nkubectl wait --for=condition=ready pod -l app=surrealdb -n provisioning --timeout=300s\n\n# Deploy platform services\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/orchestrator-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/control-center-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/mcp-server-*.yaml\n\n# Deploy monitoring stack\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/prometheus-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/grafana-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/loki-*.yaml\n\n# Deploy ingress\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/platform-ingress.yaml\n```\n\n### Step 8: Verify Deployment\n\n```\n# Check all pods are running\nkubectl get pods -n provisioning\n\n# Check services\nkubectl get svc -n provisioning\n\n# Wait for all pods ready\nkubectl wait --for=condition=Ready pods --all -n provisioning --timeout=600s\n\n# Check ingress\nkubectl get ingress -n provisioning\n```\n\n### Step 9: Access the Platform\n\n```\n# Get Ingress IP\nkubectl get ingress -n provisioning\n\n# Configure DNS (or use /etc/hosts for testing)\necho "INGRESS_IP provisioning.example.com" | sudo tee -a /etc/hosts\n\n# Access services\n# Orchestrator: https://orchestrator.provisioning.example.com/api\n# Control Center: https://control-center.provisioning.example.com\n# MCP Server: https://mcp.provisioning.example.com\n# Grafana: https://grafana.provisioning.example.com (admin/password)\n# Prometheus: https://prometheus.provisioning.example.com (internal)\n```\n\n### Step 10: Post-Deployment Configuration\n\n```\n# Create database schema\nkubectl exec -it -n provisioning deployment/postgres -- psql -U provisioning -d provisioning -f /schema.sql\n\n# Initialize Grafana dashboards\nkubectl cp grafana-dashboards provisioning/grafana-0:/var/lib/grafana/dashboards/\n\n# Configure alerts\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/prometheus-alerts.yaml\n```\n\n---\n\n## Common Tasks\n\n### Change Configuration Value\n\n**Without Service Restart** (Environment Variable):\n\n```\n# Override specific value via environment variable\nexport ORCHESTRATOR_LOG_LEVEL=debug\nexport ORCHESTRATOR_SERVER_PORT=9999\n\n# Service uses overridden values\nORCHESTRATOR_CONFIG=config.toml cargo run --bin orchestrator\n```\n\n**With Service Restart** (TOML Edit):\n\n```\n# Edit TOML directly\nvi provisioning/platform/config/orchestrator.solo.toml\n\n# Restart service\npkill -f "cargo run --bin orchestrator"\nORCHESTRATOR_CONFIG=config.toml cargo run --bin orchestrator\n```\n\n**With Validation** (Regenerate from Form):\n\n```\n# Re-run interactive form to regenerate\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu orchestrator solo --backend cli\n\n# Validation ensures consistency\nnu scripts/generate-configs.nu orchestrator solo\n\n# Restart service with validated config\n```\n\n### Add Team Member\n\n**In Kubernetes PostgreSQL**:\n\n```\nkubectl exec -it -n provisioning deployment/postgres -- psql -U provisioning -d provisioning\n\n-- Create user\nINSERT INTO users (username, email, password_hash, role, created_at) VALUES\n ('newuser@company.com', 'newuser@company.com', crypt('password', gen_salt('bf')), 'developer', now());\n\n-- Assign role\nINSERT INTO role_assignments (user_id, role, granted_by, granted_at) VALUES\n ((SELECT id FROM users WHERE username='newuser@company.com'), 'developer', 1, now());\n```\n\n### Scale Service Replicas\n\n**In Kubernetes**:\n\n```\n# Scale orchestrator from 3 to 5 replicas\nkubectl scale deployment orchestrator -n provisioning --replicas=5\n\n# Verify scaling\nkubectl get deployment orchestrator -n provisioning\nkubectl get pods -n provisioning | grep orchestrator\n```\n\n### Monitor Service Health\n\n```\n# Check pod status\nkubectl describe pod orchestrator-0 -n provisioning\n\n# Check service logs\nkubectl logs -f deployment/orchestrator -n provisioning --all-containers=true\n\n# Check resource usage\nkubectl top pods -n provisioning\n\n# Check service metrics (via Prometheus)\nkubectl port-forward -n provisioning svc/prometheus 9091:9091\nopen http://localhost:9091\n```\n\n### Backup Configuration\n\n```\n# Backup current TOML configs\ntar -czf configs-backup-$(date +%Y%m%d).tar.gz provisioning/platform/config/\n\n# Backup Kubernetes manifests\nkubectl get all -n provisioning -o yaml > k8s-backup-$(date +%Y%m%d).yaml\n\n# Backup database\nkubectl exec -n provisioning deployment/postgres -- pg_dump -U provisioning provisioning | gzip > db-backup-$(date +%Y%m%d).sql.gz\n```\n\n### Troubleshoot Configuration Issues\n\n```\n# Check Nickel syntax errors\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Validate TOML syntax\nnickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Check TOML is valid for Rust\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator -- --validate-config\n\n# Check environment variable overrides\necho $ORCHESTRATOR_SERVER_PORT\necho $ORCHESTRATOR_LOG_LEVEL\n\n# Examine actual config loaded (if service logs it)\nORCHESTRATOR_CONFIG=config.toml cargo run --bin orchestrator 2>&1 | grep -i "config\|configuration"\n```\n\n---\n\n## Configuration File Locations\n\n```\nprovisioning/.typedialog/provisioning/platform/\n├── forms/ # User-facing interactive forms\n│ ├── orchestrator-form.toml\n│ ├── control-center-form.toml\n│ └── fragments/ # Reusable form sections\n│\n├── values/ # User input files (gitignored)\n│ ├── orchestrator.solo.ncl\n│ ├── orchestrator.enterprise.ncl\n│ └── (auto-generated by TypeDialog)\n│\n├── configs/ # Composed Nickel configs\n│ ├── orchestrator.solo.ncl # Base + mode overlay + user input + validation\n│ ├── control-center.multiuser.ncl\n│ └── (4 services × 4 modes = 16 files)\n│\n├── schemas/ # Type definitions\n│ ├── orchestrator.ncl\n│ ├── control-center.ncl\n│ └── common/ # Shared schemas\n│\n├── defaults/ # Default values\n│ ├── orchestrator-defaults.ncl\n│ └── deployment/solo-defaults.ncl\n│\n├── validators/ # Business rules\n│ ├── orchestrator-validator.ncl\n│ └── (per-service validators)\n│\n├── constraints/\n│ └── constraints.toml # Min/max values (single source of truth)\n│\n├── templates/ # Deployment templates\n│ ├── docker-compose/\n│ │ ├── platform-stack.solo.yml.ncl\n│ │ └── (4 modes)\n│ └── kubernetes/\n│ ├── orchestrator-deployment.yaml.ncl\n│ └── (11 templates)\n│\n└── scripts/ # Automation\n ├── configure.nu # Interactive TypeDialog\n ├── generate-configs.nu # Nickel → TOML export\n ├── validate-config.nu # Typecheck Nickel\n ├── render-docker-compose.nu # Templates → Docker Compose\n └── render-kubernetes.nu # Templates → Kubernetes\n```\n\nTOML output location:\n\n```\nprovisioning/platform/config/\n├── orchestrator.solo.toml # Consumed by orchestrator service\n├── control-center.enterprise.toml # Consumed by control-center service\n└── (4 services × 4 modes = 16 files)\n```\n\n---\n\n## Tips & Best Practices\n\n### 1. Use Version Control\n\n```\n# Commit TOML configs to track changes\ngit add provisioning/platform/config/*.toml\ngit commit -m "Update orchestrator enterprise config: increase worker threads to 16"\n\n# Do NOT commit Nickel source files in values/\necho "provisioning/.typedialog/provisioning/platform/values/*.ncl" >> .gitignore\n```\n\n### 2. Test Before Production Deployment\n\n```\n# Test in solo mode first\nnu scripts/configure.nu orchestrator solo\ncargo run --bin orchestrator\n\n# Then test in staging (multiuser mode)\nnu scripts/configure.nu orchestrator multiuser\ndocker-compose -f docker-compose.multiuser.yml up\n\n# Finally deploy to production (enterprise)\nnu scripts/configure.nu orchestrator enterprise\n# Then Kubernetes deployment\n```\n\n### 3. Document Custom Configurations\n\n```\n# Add comments to configurations\n# In values/*.ncl or config/*.ncl:\n\n# Custom configuration for high-throughput testing\n# - Increased workers from 4 to 8\n# - Increased queue.max_concurrent_tasks from 5 to 20\n# - Lowered logging level from debug to info\n{\n orchestrator = {\n # Worker threads increased for testing parallel task processing\n server.workers = 8,\n queue.max_concurrent_tasks = 20,\n logging.level = "info",\n },\n}\n```\n\n### 4. Secrets Management\n\n**Never** hardcode secrets in configuration files:\n\n```\n# WRONG - Don't do this\n[orchestrator.security]\njwt_secret = "hardcoded-secret-exposed-in-git"\n\n# RIGHT - Use environment variables\nexport ORCHESTRATOR_SECURITY_JWT_SECRET="actual-secret-from-vault"\n\n# TOML references it:\n[orchestrator.security]\njwt_secret = "${JWT_SECRET}" # Loaded at runtime\n```\n\n### 5. Monitor Changes\n\n```\n# Track configuration changes over time\ngit log --oneline provisioning/platform/config/\n\n# See what changed\ngit diff provisioning/platform/config/orchestrator.solo.toml\n```\n\n---\n\n**Version**: 1.0\n**Last Updated**: 2025-01-05\n**Status**: Production Ready diff --git a/schemas/platform/validators/README.md b/schemas/platform/validators/README.md index fe797c4..acdac2a 100644 --- a/schemas/platform/validators/README.md +++ b/schemas/platform/validators/README.md @@ -1 +1 @@ -# Validators\n\nValidation logic for configuration values using constraints and business rules.\n\n## Purpose\n\nValidators provide:\n- **Constraint checking** - Numeric ranges, required fields\n- **Business logic validation** - Service-specific constraints\n- **Error messages** - Clear feedback on invalid values\n- **Composition with configs** - Validators applied during config generation\n\n## File Organization\n\n```\nvalidators/\n├── README.md # This file\n├── common-validator.ncl # Ports, positive numbers, strings\n├── network-validator.ncl # IP addresses, bind addresses\n├── path-validator.ncl # File paths, directories\n├── resource-validator.ncl # CPU, memory, disk\n├── string-validator.ncl # Workspace names, identifiers\n├── orchestrator-validator.ncl # Queue, workflow validation\n├── control-center-validator.ncl # RBAC, policy validation\n├── mcp-server-validator.ncl # MCP tools, capabilities\n└── deployment-validator.ncl # Resource allocation\n```\n\n## Validation Patterns\n\n### 1. Basic Range Validation\n\n```\n# validators/common-validator.ncl\nlet constraints = import "../constraints/constraints.toml" in\n\n{\n ValidPort = fun port =>\n if port < constraints.common.server.port.min then\n std.contract.blame_with_message "Port < 1024" port\n else if port > constraints.common.server.port.max then\n std.contract.blame_with_message "Port > 65535" port\n else\n port,\n}\n```\n\n### 2. Range Validator (Reusable)\n\n```\n# Reusable validator for any numeric range\nValidRange = fun min max value =>\n if value < min then\n std.contract.blame_with_message "Value < %{std.to_string min}" value\n else if value > max then\n std.contract.blame_with_message "Value > %{std.to_string max}" value\n else\n value,\n```\n\n### 3. Enum Validation\n\n```\n{\n ValidStorageBackend = fun backend =>\n if backend != 'filesystem &&\n backend != 'rocksdb &&\n backend != 'surrealdb &&\n backend != 'postgres then\n std.contract.blame_with_message "Invalid backend" backend\n else\n backend,\n}\n```\n\n### 4. String Validation\n\n```\n{\n ValidNonEmptyString = fun s =>\n if s == "" then\n std.contract.blame_with_message "Cannot be empty" s\n else\n s,\n\n ValidWorkspaceName = fun name =>\n if std.string.matches "^[a-z0-9_-]+$" name then\n name\n else\n std.contract.blame_with_message "Invalid workspace name" name,\n}\n```\n\n## Common Validators\n\n### common-validator.ncl\n\n```\nlet constraints = import "../constraints/constraints.toml" in\n\n{\n # Port validation\n ValidPort = fun port =>\n if port < constraints.common.server.port.min then error "Port too low"\n else if port > constraints.common.server.port.max then error "Port too high"\n else port,\n\n # Positive integer\n ValidPositiveNumber = fun n =>\n if n <= 0 then error "Must be positive"\n else n,\n\n # Non-empty string\n ValidNonEmptyString = fun s =>\n if s == "" then error "Cannot be empty"\n else s,\n\n # Generic range validator\n ValidRange = fun min max value =>\n if value < min then error "Value below minimum"\n else if value > max then error "Value above maximum"\n else value,\n}\n```\n\n### resource-validator.ncl\n\n```\nlet constraints = import "../constraints/constraints.toml" in\nlet common = import "./common-validator.ncl" in\n\n{\n # Validate CPU cores for deployment mode\n ValidCPUCores = fun mode cores =>\n let limits = constraints.deployment.{mode} in\n common.ValidRange limits.cpu.min limits.cpu.max cores,\n\n # Validate memory allocation\n ValidMemory = fun mode memory_mb =>\n let limits = constraints.deployment.{mode} in\n common.ValidRange limits.memory_mb.min limits.memory_mb.max memory_mb,\n}\n```\n\n## Service-Specific Validators\n\n### orchestrator-validator.ncl\n\n```\nlet constraints = import "../constraints/constraints.toml" in\nlet common = import "./common-validator.ncl" in\n\n{\n # Validate worker count\n ValidWorkers = fun workers =>\n common.ValidRange\n constraints.orchestrator.workers.min\n constraints.orchestrator.workers.max\n workers,\n\n # Validate queue concurrency\n ValidConcurrentTasks = fun tasks =>\n common.ValidRange\n constraints.orchestrator.queue.concurrent_tasks.min\n constraints.orchestrator.queue.concurrent_tasks.max\n tasks,\n\n # Validate batch parallelism\n ValidParallelLimit = fun limit =>\n common.ValidRange\n constraints.orchestrator.batch.parallel_limit.min\n constraints.orchestrator.batch.parallel_limit.max\n limit,\n\n # Validate task timeout (ms)\n ValidTaskTimeout = fun timeout =>\n if timeout < 1000 then error "Timeout < 1 second"\n else if timeout > 86400000 then error "Timeout > 24 hours"\n else timeout,\n}\n```\n\n### control-center-validator.ncl\n\n```\n{\n # JWT token expiration\n ValidTokenExpiration = fun seconds =>\n if seconds < 300 then error "Token expiration < 5 min"\n else if seconds > 604800 then error "Token expiration > 7 days"\n else seconds,\n\n # Rate limit threshold\n ValidRateLimit = fun requests_per_minute =>\n if requests_per_minute < 10 then error "Rate limit too low"\n else if requests_per_minute > 10000 then error "Rate limit too high"\n else requests_per_minute,\n}\n```\n\n### mcp-server-validator.ncl\n\n```\n{\n # Max concurrent tool executions\n ValidConcurrentTools = fun count =>\n if count < 1 then error "Must allow >= 1 concurrent"\n else if count > 20 then error "Max 20 concurrent tools"\n else count,\n\n # Max resource size\n ValidMaxResourceSize = fun bytes =>\n if bytes < 1048576 then error "Min 1 MB"\n else if bytes > 1073741824 then error "Max 1 GB"\n else bytes,\n}\n```\n\n## Composition with Configs\n\nValidators are applied in config files:\n\n```\n# configs/orchestrator.solo.ncl\nlet validators = import "../validators/orchestrator-validator.ncl" in\n\n{\n orchestrator = {\n server.workers = validators.ValidWorkers 2, # Validated\n queue.max_concurrent_tasks = validators.ValidConcurrentTasks 3, # Validated\n },\n}\n```\n\nValidation happens at:\n1. **Config composition** - When config is evaluated\n2. **Nickel typecheck** - When config is typechecked\n3. **Form submission** - When TypeDialog form is submitted (constraints)\n4. **TOML export** - When Nickel is exported to TOML\n\n## Error Handling\n\n### Validation Errors\n\n```\n# If validation fails during config evaluation:\n# Error: Port too high\n```\n\n### Meaningful Messages\n\nAlways provide context in error messages:\n\n```\n# Bad\nstd.contract.blame "Invalid" value\n\n# Good\nstd.contract.blame_with_message "Port must be 1024-65535, got %{std.to_string value}" port\n```\n\n## Best Practices\n\n1. **Reuse common validators** - Build from common-validator.ncl\n2. **Name clearly** - Prefix with "Valid" (ValidPort, ValidWorkers, etc.)\n3. **Error messages** - Include valid range or enum in message\n4. **Test edge cases** - Verify min/max boundary values\n5. **Document assumptions** - Why a constraint exists\n\n## Testing Validators\n\n```\n# Test a single validator\nnickel eval -c 'import "validators/orchestrator-validator.ncl" as v in v.ValidWorkers 2'\n\n# Test config with validators\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Evaluate config (runs validators)\nnickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Export to TOML (validates during export)\nnickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n```\n\n## Adding a New Validator\n\n1. **Create validator function** in appropriate file:\n\n ```nickel\n ValidMyValue = fun value =>\n if value < minimum then error "Too low"\n else if value > maximum then error "Too high"\n else value,\n ```\n\n2. **Add constraint** to constraints.toml if needed:\n\n ```toml\n [service.feature.my_value]\n min = 1\n max = 100\n ```\n\n3. **Use in config**:\n\n ```nickel\n my_value = validators.ValidMyValue 50,\n ```\n\n4. **Add form constraint** (if interactive):\n\n ```toml\n [[elements]]\n name = "my_value"\n min = "${constraint.service.feature.my_value.min}"\n max = "${constraint.service.feature.my_value.max}"\n ```\n\n5. **Test**:\n\n ```bash\n nickel typecheck configs/service.mode.ncl\n ```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 \ No newline at end of file +# Validators\n\nValidation logic for configuration values using constraints and business rules.\n\n## Purpose\n\nValidators provide:\n- **Constraint checking** - Numeric ranges, required fields\n- **Business logic validation** - Service-specific constraints\n- **Error messages** - Clear feedback on invalid values\n- **Composition with configs** - Validators applied during config generation\n\n## File Organization\n\n```\nvalidators/\n├── README.md # This file\n├── common-validator.ncl # Ports, positive numbers, strings\n├── network-validator.ncl # IP addresses, bind addresses\n├── path-validator.ncl # File paths, directories\n├── resource-validator.ncl # CPU, memory, disk\n├── string-validator.ncl # Workspace names, identifiers\n├── orchestrator-validator.ncl # Queue, workflow validation\n├── control-center-validator.ncl # RBAC, policy validation\n├── mcp-server-validator.ncl # MCP tools, capabilities\n└── deployment-validator.ncl # Resource allocation\n```\n\n## Validation Patterns\n\n### 1. Basic Range Validation\n\n```\n# validators/common-validator.ncl\nlet constraints = import "../constraints/constraints.toml" in\n\n{\n ValidPort = fun port =>\n if port < constraints.common.server.port.min then\n std.contract.blame_with_message "Port < 1024" port\n else if port > constraints.common.server.port.max then\n std.contract.blame_with_message "Port > 65535" port\n else\n port,\n}\n```\n\n### 2. Range Validator (Reusable)\n\n```\n# Reusable validator for any numeric range\nValidRange = fun min max value =>\n if value < min then\n std.contract.blame_with_message "Value < %{std.to_string min}" value\n else if value > max then\n std.contract.blame_with_message "Value > %{std.to_string max}" value\n else\n value,\n```\n\n### 3. Enum Validation\n\n```\n{\n ValidStorageBackend = fun backend =>\n if backend != 'filesystem &&\n backend != 'rocksdb &&\n backend != 'surrealdb &&\n backend != 'postgres then\n std.contract.blame_with_message "Invalid backend" backend\n else\n backend,\n}\n```\n\n### 4. String Validation\n\n```\n{\n ValidNonEmptyString = fun s =>\n if s == "" then\n std.contract.blame_with_message "Cannot be empty" s\n else\n s,\n\n ValidWorkspaceName = fun name =>\n if std.string.matches "^[a-z0-9_-]+$" name then\n name\n else\n std.contract.blame_with_message "Invalid workspace name" name,\n}\n```\n\n## Common Validators\n\n### common-validator.ncl\n\n```\nlet constraints = import "../constraints/constraints.toml" in\n\n{\n # Port validation\n ValidPort = fun port =>\n if port < constraints.common.server.port.min then error "Port too low"\n else if port > constraints.common.server.port.max then error "Port too high"\n else port,\n\n # Positive integer\n ValidPositiveNumber = fun n =>\n if n <= 0 then error "Must be positive"\n else n,\n\n # Non-empty string\n ValidNonEmptyString = fun s =>\n if s == "" then error "Cannot be empty"\n else s,\n\n # Generic range validator\n ValidRange = fun min max value =>\n if value < min then error "Value below minimum"\n else if value > max then error "Value above maximum"\n else value,\n}\n```\n\n### resource-validator.ncl\n\n```\nlet constraints = import "../constraints/constraints.toml" in\nlet common = import "./common-validator.ncl" in\n\n{\n # Validate CPU cores for deployment mode\n ValidCPUCores = fun mode cores =>\n let limits = constraints.deployment.{mode} in\n common.ValidRange limits.cpu.min limits.cpu.max cores,\n\n # Validate memory allocation\n ValidMemory = fun mode memory_mb =>\n let limits = constraints.deployment.{mode} in\n common.ValidRange limits.memory_mb.min limits.memory_mb.max memory_mb,\n}\n```\n\n## Service-Specific Validators\n\n### orchestrator-validator.ncl\n\n```\nlet constraints = import "../constraints/constraints.toml" in\nlet common = import "./common-validator.ncl" in\n\n{\n # Validate worker count\n ValidWorkers = fun workers =>\n common.ValidRange\n constraints.orchestrator.workers.min\n constraints.orchestrator.workers.max\n workers,\n\n # Validate queue concurrency\n ValidConcurrentTasks = fun tasks =>\n common.ValidRange\n constraints.orchestrator.queue.concurrent_tasks.min\n constraints.orchestrator.queue.concurrent_tasks.max\n tasks,\n\n # Validate batch parallelism\n ValidParallelLimit = fun limit =>\n common.ValidRange\n constraints.orchestrator.batch.parallel_limit.min\n constraints.orchestrator.batch.parallel_limit.max\n limit,\n\n # Validate task timeout (ms)\n ValidTaskTimeout = fun timeout =>\n if timeout < 1000 then error "Timeout < 1 second"\n else if timeout > 86400000 then error "Timeout > 24 hours"\n else timeout,\n}\n```\n\n### control-center-validator.ncl\n\n```\n{\n # JWT token expiration\n ValidTokenExpiration = fun seconds =>\n if seconds < 300 then error "Token expiration < 5 min"\n else if seconds > 604800 then error "Token expiration > 7 days"\n else seconds,\n\n # Rate limit threshold\n ValidRateLimit = fun requests_per_minute =>\n if requests_per_minute < 10 then error "Rate limit too low"\n else if requests_per_minute > 10000 then error "Rate limit too high"\n else requests_per_minute,\n}\n```\n\n### mcp-server-validator.ncl\n\n```\n{\n # Max concurrent tool executions\n ValidConcurrentTools = fun count =>\n if count < 1 then error "Must allow >= 1 concurrent"\n else if count > 20 then error "Max 20 concurrent tools"\n else count,\n\n # Max resource size\n ValidMaxResourceSize = fun bytes =>\n if bytes < 1048576 then error "Min 1 MB"\n else if bytes > 1073741824 then error "Max 1 GB"\n else bytes,\n}\n```\n\n## Composition with Configs\n\nValidators are applied in config files:\n\n```\n# configs/orchestrator.solo.ncl\nlet validators = import "../validators/orchestrator-validator.ncl" in\n\n{\n orchestrator = {\n server.workers = validators.ValidWorkers 2, # Validated\n queue.max_concurrent_tasks = validators.ValidConcurrentTasks 3, # Validated\n },\n}\n```\n\nValidation happens at:\n1. **Config composition** - When config is evaluated\n2. **Nickel typecheck** - When config is typechecked\n3. **Form submission** - When TypeDialog form is submitted (constraints)\n4. **TOML export** - When Nickel is exported to TOML\n\n## Error Handling\n\n### Validation Errors\n\n```\n# If validation fails during config evaluation:\n# Error: Port too high\n```\n\n### Meaningful Messages\n\nAlways provide context in error messages:\n\n```\n# Bad\nstd.contract.blame "Invalid" value\n\n# Good\nstd.contract.blame_with_message "Port must be 1024-65535, got %{std.to_string value}" port\n```\n\n## Best Practices\n\n1. **Reuse common validators** - Build from common-validator.ncl\n2. **Name clearly** - Prefix with "Valid" (ValidPort, ValidWorkers, etc.)\n3. **Error messages** - Include valid range or enum in message\n4. **Test edge cases** - Verify min/max boundary values\n5. **Document assumptions** - Why a constraint exists\n\n## Testing Validators\n\n```\n# Test a single validator\nnickel eval -c 'import "validators/orchestrator-validator.ncl" as v in v.ValidWorkers 2'\n\n# Test config with validators\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Evaluate config (runs validators)\nnickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Export to TOML (validates during export)\nnickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n```\n\n## Adding a New Validator\n\n1. **Create validator function** in appropriate file:\n\n ```nickel\n ValidMyValue = fun value =>\n if value < minimum then error "Too low"\n else if value > maximum then error "Too high"\n else value,\n ```\n\n2. **Add constraint** to constraints.toml if needed:\n\n ```toml\n [service.feature.my_value]\n min = 1\n max = 100\n ```\n\n3. **Use in config**:\n\n ```nickel\n my_value = validators.ValidMyValue 50,\n ```\n\n4. **Add form constraint** (if interactive):\n\n ```toml\n [[elements]]\n name = "my_value"\n min = "${constraint.service.feature.my_value.min}"\n max = "${constraint.service.feature.my_value.max}"\n ```\n\n5. **Test**:\n\n ```bash\n nickel typecheck configs/service.mode.ncl\n ```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 diff --git a/schemas/platform/values/README.md b/schemas/platform/values/README.md index ddadf97..7301990 100644 --- a/schemas/platform/values/README.md +++ b/schemas/platform/values/README.md @@ -1 +1 @@ -# Values\n\nUser configuration files for provisioning platform services (gitignored).\n\n## Purpose\n\nThe values directory stores:\n- **User configurations** - Service-specific settings for each deployment mode\n- **Generated Nickel configs** - Output from TypeDialog configuration wizard\n- **Customizations** - User-specific overrides to defaults\n- **Runtime data** - Persisted configuration state\n\n## File Organization\n\n```\nvalues/\n├── .gitignore # Ignore *.ncl user configs\n├── README.md # This file\n├── orchestrator.solo.ncl # User config (gitignored)\n├── orchestrator.multiuser.ncl\n├── orchestrator.cicd.ncl\n├── orchestrator.enterprise.ncl\n├── control-center.solo.ncl\n├── control-center.multiuser.ncl\n├── control-center.cicd.ncl\n├── control-center.enterprise.ncl\n├── mcp-server.solo.ncl\n├── mcp-server.multiuser.ncl\n├── mcp-server.cicd.ncl\n├── mcp-server.enterprise.ncl\n├── installer.solo.ncl\n├── installer.multiuser.ncl\n├── installer.cicd.ncl\n├── installer.enterprise.ncl\n└── orchestrator.example.ncl # Example template (tracked)\n```\n\n## Configuration Files\n\nEach config file (`{service}.{mode}.ncl`) is:\n- **Generated by TypeDialog** - Via `configure.nu` wizard\n- **User-specific** - Contains customizations for that environment\n- **Gitignored** - NOT tracked in version control\n- **Runtime data** - Created/updated by scripts and forms\n\nExample:\n\n```\n# values/orchestrator.solo.ncl (auto-generated, user-editable)\n{\n orchestrator = {\n workspace = {\n name = "my-workspace",\n path = "/home/user/workspace",\n enabled = true,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2,\n },\n storage = {\n backend = 'filesystem,\n path = "/home/user/.provisioning/data",\n },\n },\n}\n```\n\n## .gitignore Pattern\n\n```\n# values/.gitignore\n*.ncl # Ignore all Nickel config files (user-specific)\n!*.example.ncl # EXCEPT example files (tracked for documentation)\n```\n\nThis ensures:\n- User configs (`orchestrator.solo.ncl`) are NOT committed\n- Example configs (`orchestrator.example.ncl`) ARE committed\n- Each user has their own configs without merge conflicts\n\n## Example Template\n\n`orchestrator.example.ncl` provides a documented template:\n\n```\n# orchestrator.example.ncl\n# Example configuration for Orchestrator service\n# Copy to orchestrator.{mode}.ncl and customize for your environment\n\n{\n orchestrator = {\n # Workspace Configuration\n workspace = {\n # Name of the workspace\n name = "default",\n\n # Absolute path to workspace directory\n path = "/var/lib/provisioning/orchestrator",\n\n # Enable this workspace\n enabled = true,\n\n # Allow serving multiple workspaces\n multi_workspace = false,\n },\n\n # HTTP Server Configuration\n server = {\n # Bind address (127.0.0.1 for local only, 0.0.0.0 for network)\n host = "127.0.0.1",\n\n # Listen port\n port = 9090,\n\n # Worker thread count\n workers = 4,\n\n # Keep-alive timeout (seconds)\n keep_alive = 75,\n },\n\n # Storage Configuration\n storage = {\n # Backend: 'filesystem | 'rocksdb | 'surrealdb | 'postgres\n backend = 'filesystem,\n\n # Path for filesystem/rocksdb storage\n path = "/var/lib/provisioning/orchestrator/data",\n },\n\n # Queue Configuration\n queue = {\n # Maximum concurrent tasks\n max_concurrent_tasks = 5,\n\n # Retry attempts for failed tasks\n retry_attempts = 3,\n\n # Delay between retries (milliseconds)\n retry_delay = 5000,\n\n # Task execution timeout (milliseconds)\n task_timeout = 3600000,\n },\n },\n}\n```\n\n## Configuration Workflow\n\n### 1. Generate Initial Config\n\n```\nnu scripts/configure.nu orchestrator solo\n```\n\nCreates `values/orchestrator.solo.ncl` from form input.\n\n### 2. Edit Configuration\n\n```\n# Manually edit if needed\nvi values/orchestrator.solo.ncl\n\n# Or reconfigure with wizard\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n### 3. Validate Configuration\n\n```\nnu scripts/validate-config.nu values/orchestrator.solo.ncl\n```\n\n### 4. Generate TOML for Services\n\n```\nnu scripts/generate-configs.nu orchestrator solo\n```\n\nExports to `provisioning/platform/config/orchestrator.solo.toml` (consumed by Rust services).\n\n## Configuration Composition\n\nUser configs are composed with defaults during generation:\n\n```\ndefaults/orchestrator-defaults.ncl (base values)\n ↓ &\nvalues/orchestrator.solo.ncl (user customizations)\n ↓\nconfigs/orchestrator.solo.ncl (final generated config)\n ↓\nprovisioning/platform/config/orchestrator.solo.toml (Rust service config)\n```\n\n## Best Practices\n\n1. **Start with example** - Copy `orchestrator.example.ncl` as template\n2. **Document changes** - Add inline comments explaining customizations\n3. **Use TypeDialog** - Let wizard handle configuration for you\n4. **Validate before deploying** - Always run `validate-config.nu`\n5. **Keep defaults** - Only override what you need to change\n6. **Backup important configs** - Save known-good configurations\n\n## Sharing Configurations\n\nSince user configs are gitignored, sharing requires:\n\n### Option 1: Share via File\n\n```\n# Export current config\ncat values/orchestrator.solo.ncl > /tmp/orchestrator-config.ncl\n\n# Import on another system\ncp /tmp/orchestrator-config.ncl values/orchestrator.solo.ncl\n```\n\n### Option 2: Use Example Template\nShare setup instructions instead of raw config:\n\n```\n# Document the setup steps\ncat > SETUP.md << EOF\n1. Run: nu scripts/configure.nu orchestrator solo\n2. Set workspace path: /shared/workspace\n3. Set storage backend: postgres\n4. Set server workers: 8\nEOF\n```\n\n### Option 3: Store in Separate Repo\nFor team configs, use a separate private repository:\n\n```\n# Clone team configs\ngit clone private-repo/provisioning-configs values/\n\n# Use team configs\ncp values/team-orchestrator-solo.ncl values/orchestrator.solo.ncl\n```\n\n## File Permissions\n\nUser config files should have restricted permissions:\n\n```\n# Secure config file (if contains secrets)\nchmod 600 values/orchestrator.solo.ncl\n```\n\n## Recovery\n\nIf you accidentally delete a user config:\n\n### Option 1: Regenerate from TypeDialog\n\n```\nnu scripts/configure.nu orchestrator solo\n```\n\n### Option 2: Copy from Backup\n\n```\ncp /backup/provisioning-values/orchestrator.solo.ncl values/\n```\n\n### Option 3: Use Example as Base\n\n```\ncp examples/orchestrator-solo.ncl values/orchestrator.solo.ncl\n# Customize as needed\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n## Troubleshooting\n\n### Config File Missing\n\n```\n# Regenerate from defaults\nnu scripts/configure.nu orchestrator solo\n```\n\n### Config Won't Validate\n\n```\n# Check for syntax errors\nnickel eval values/orchestrator.solo.ncl\n\n# Compare with example\ndiff examples/orchestrator-solo.ncl values/orchestrator.solo.ncl\n```\n\n### Changes Not Taking Effect\n\n```\n# Regenerate TOML from Nickel\nnu scripts/generate-configs.nu orchestrator solo\n\n# Verify TOML was updated\nls -la provisioning/platform/config/orchestrator.solo.toml\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 \ No newline at end of file +# Values\n\nUser configuration files for provisioning platform services (gitignored).\n\n## Purpose\n\nThe values directory stores:\n- **User configurations** - Service-specific settings for each deployment mode\n- **Generated Nickel configs** - Output from TypeDialog configuration wizard\n- **Customizations** - User-specific overrides to defaults\n- **Runtime data** - Persisted configuration state\n\n## File Organization\n\n```\nvalues/\n├── .gitignore # Ignore *.ncl user configs\n├── README.md # This file\n├── orchestrator.solo.ncl # User config (gitignored)\n├── orchestrator.multiuser.ncl\n├── orchestrator.cicd.ncl\n├── orchestrator.enterprise.ncl\n├── control-center.solo.ncl\n├── control-center.multiuser.ncl\n├── control-center.cicd.ncl\n├── control-center.enterprise.ncl\n├── mcp-server.solo.ncl\n├── mcp-server.multiuser.ncl\n├── mcp-server.cicd.ncl\n├── mcp-server.enterprise.ncl\n├── installer.solo.ncl\n├── installer.multiuser.ncl\n├── installer.cicd.ncl\n├── installer.enterprise.ncl\n└── orchestrator.example.ncl # Example template (tracked)\n```\n\n## Configuration Files\n\nEach config file (`{service}.{mode}.ncl`) is:\n- **Generated by TypeDialog** - Via `configure.nu` wizard\n- **User-specific** - Contains customizations for that environment\n- **Gitignored** - NOT tracked in version control\n- **Runtime data** - Created/updated by scripts and forms\n\nExample:\n\n```\n# values/orchestrator.solo.ncl (auto-generated, user-editable)\n{\n orchestrator = {\n workspace = {\n name = "my-workspace",\n path = "/home/user/workspace",\n enabled = true,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2,\n },\n storage = {\n backend = 'filesystem,\n path = "/home/user/.provisioning/data",\n },\n },\n}\n```\n\n## .gitignore Pattern\n\n```\n# values/.gitignore\n*.ncl # Ignore all Nickel config files (user-specific)\n!*.example.ncl # EXCEPT example files (tracked for documentation)\n```\n\nThis ensures:\n- User configs (`orchestrator.solo.ncl`) are NOT committed\n- Example configs (`orchestrator.example.ncl`) ARE committed\n- Each user has their own configs without merge conflicts\n\n## Example Template\n\n`orchestrator.example.ncl` provides a documented template:\n\n```\n# orchestrator.example.ncl\n# Example configuration for Orchestrator service\n# Copy to orchestrator.{mode}.ncl and customize for your environment\n\n{\n orchestrator = {\n # Workspace Configuration\n workspace = {\n # Name of the workspace\n name = "default",\n\n # Absolute path to workspace directory\n path = "/var/lib/provisioning/orchestrator",\n\n # Enable this workspace\n enabled = true,\n\n # Allow serving multiple workspaces\n multi_workspace = false,\n },\n\n # HTTP Server Configuration\n server = {\n # Bind address (127.0.0.1 for local only, 0.0.0.0 for network)\n host = "127.0.0.1",\n\n # Listen port\n port = 9090,\n\n # Worker thread count\n workers = 4,\n\n # Keep-alive timeout (seconds)\n keep_alive = 75,\n },\n\n # Storage Configuration\n storage = {\n # Backend: 'filesystem | 'rocksdb | 'surrealdb | 'postgres\n backend = 'filesystem,\n\n # Path for filesystem/rocksdb storage\n path = "/var/lib/provisioning/orchestrator/data",\n },\n\n # Queue Configuration\n queue = {\n # Maximum concurrent tasks\n max_concurrent_tasks = 5,\n\n # Retry attempts for failed tasks\n retry_attempts = 3,\n\n # Delay between retries (milliseconds)\n retry_delay = 5000,\n\n # Task execution timeout (milliseconds)\n task_timeout = 3600000,\n },\n },\n}\n```\n\n## Configuration Workflow\n\n### 1. Generate Initial Config\n\n```\nnu scripts/configure.nu orchestrator solo\n```\n\nCreates `values/orchestrator.solo.ncl` from form input.\n\n### 2. Edit Configuration\n\n```\n# Manually edit if needed\nvi values/orchestrator.solo.ncl\n\n# Or reconfigure with wizard\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n### 3. Validate Configuration\n\n```\nnu scripts/validate-config.nu values/orchestrator.solo.ncl\n```\n\n### 4. Generate TOML for Services\n\n```\nnu scripts/generate-configs.nu orchestrator solo\n```\n\nExports to `provisioning/platform/config/orchestrator.solo.toml` (consumed by Rust services).\n\n## Configuration Composition\n\nUser configs are composed with defaults during generation:\n\n```\ndefaults/orchestrator-defaults.ncl (base values)\n ↓ &\nvalues/orchestrator.solo.ncl (user customizations)\n ↓\nconfigs/orchestrator.solo.ncl (final generated config)\n ↓\nprovisioning/platform/config/orchestrator.solo.toml (Rust service config)\n```\n\n## Best Practices\n\n1. **Start with example** - Copy `orchestrator.example.ncl` as template\n2. **Document changes** - Add inline comments explaining customizations\n3. **Use TypeDialog** - Let wizard handle configuration for you\n4. **Validate before deploying** - Always run `validate-config.nu`\n5. **Keep defaults** - Only override what you need to change\n6. **Backup important configs** - Save known-good configurations\n\n## Sharing Configurations\n\nSince user configs are gitignored, sharing requires:\n\n### Option 1: Share via File\n\n```\n# Export current config\ncat values/orchestrator.solo.ncl > /tmp/orchestrator-config.ncl\n\n# Import on another system\ncp /tmp/orchestrator-config.ncl values/orchestrator.solo.ncl\n```\n\n### Option 2: Use Example Template\nShare setup instructions instead of raw config:\n\n```\n# Document the setup steps\ncat > SETUP.md << EOF\n1. Run: nu scripts/configure.nu orchestrator solo\n2. Set workspace path: /shared/workspace\n3. Set storage backend: postgres\n4. Set server workers: 8\nEOF\n```\n\n### Option 3: Store in Separate Repo\nFor team configs, use a separate private repository:\n\n```\n# Clone team configs\ngit clone private-repo/provisioning-configs values/\n\n# Use team configs\ncp values/team-orchestrator-solo.ncl values/orchestrator.solo.ncl\n```\n\n## File Permissions\n\nUser config files should have restricted permissions:\n\n```\n# Secure config file (if contains secrets)\nchmod 600 values/orchestrator.solo.ncl\n```\n\n## Recovery\n\nIf you accidentally delete a user config:\n\n### Option 1: Regenerate from TypeDialog\n\n```\nnu scripts/configure.nu orchestrator solo\n```\n\n### Option 2: Copy from Backup\n\n```\ncp /backup/provisioning-values/orchestrator.solo.ncl values/\n```\n\n### Option 3: Use Example as Base\n\n```\ncp examples/orchestrator-solo.ncl values/orchestrator.solo.ncl\n# Customize as needed\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n## Troubleshooting\n\n### Config File Missing\n\n```\n# Regenerate from defaults\nnu scripts/configure.nu orchestrator solo\n```\n\n### Config Won't Validate\n\n```\n# Check for syntax errors\nnickel eval values/orchestrator.solo.ncl\n\n# Compare with example\ndiff examples/orchestrator-solo.ncl values/orchestrator.solo.ncl\n```\n\n### Changes Not Taking Effect\n\n```\n# Regenerate TOML from Nickel\nnu scripts/generate-configs.nu orchestrator solo\n\n# Verify TOML was updated\nls -la provisioning/platform/config/orchestrator.solo.toml\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 diff --git a/scripts/fix-layout-rename.nu b/scripts/fix-layout-rename.nu index a0ac6d8..0f9c52c 100755 --- a/scripts/fix-layout-rename.nu +++ b/scripts/fix-layout-rename.nu @@ -6,12 +6,12 @@ def main [] { print "🔧 Renaming product docs to lowercase-kebab-case..." print "" - + # Product docs to rename let renames = [ # docs/src/ {old: "docs/src/PROVISIONING.md", new: "docs/src/provisioning.md"}, - + # ADRs {old: "docs/src/architecture/adr/ADR-001-project-structure.md", new: "docs/src/architecture/adr/adr-001-project-structure.md"}, {old: "docs/src/architecture/adr/ADR-002-distribution-strategy.md", new: "docs/src/architecture/adr/adr-002-distribution-strategy.md"}, @@ -23,18 +23,18 @@ def main [] { {old: "docs/src/architecture/adr/ADR-008-cedar-authorization.md", new: "docs/src/architecture/adr/adr-008-cedar-authorization.md"}, {old: "docs/src/architecture/adr/ADR-009-security-system-complete.md", new: "docs/src/architecture/adr/adr-009-security-system-complete.md"}, {old: "docs/src/architecture/orchestrator_info.md", new: "docs/src/architecture/orchestrator-info.md"}, - + # Control Center UI {old: "platform/crates/control-center-ui/AUTH_SYSTEM.md", new: "platform/crates/control-center-ui/auth-system.md"}, {old: "platform/crates/control-center-ui/REFERENCE.md", new: "platform/crates/control-center-ui/reference.md"}, {old: "platform/crates/control-center-ui/UPSTREAM_DEPENDENCY_ISSUE.md", new: "platform/crates/control-center-ui/upstream-dependency-issue.md"}, - + # Extension Registry {old: "platform/crates/extension-registry/API.md", new: "platform/crates/extension-registry/api.md"}, - + # Control Center {old: "platform/crates/control-center/docs/SECURITY_CONSIDERATIONS.md", new: "platform/crates/control-center/docs/security-considerations.md"}, - + # Orchestrator {old: "platform/crates/orchestrator/docs/DNS_INTEGRATION.md", new: "platform/crates/orchestrator/docs/dns-integration.md"}, {old: "platform/crates/orchestrator/docs/EXTENSION_LOADING.md", new: "platform/crates/orchestrator/docs/extension-loading.md"}, @@ -43,18 +43,18 @@ def main [] { {old: "platform/crates/orchestrator/docs/SSH_KEY_MANAGEMENT.md", new: "platform/crates/orchestrator/docs/ssh-key-management.md"}, {old: "platform/crates/orchestrator/docs/STORAGE_BACKENDS.md", new: "platform/crates/orchestrator/docs/storage-backends.md"}, {old: "platform/crates/orchestrator/wrks/README_TESTING.md", new: "platform/crates/orchestrator/wrks/readme-testing.md"}, - + # Test docs {old: "tests/integration/docs/ORBSTACK_SETUP.md", new: "tests/integration/docs/orbstack-setup.md"}, {old: "tests/integration/docs/TESTING_GUIDE.md", new: "tests/integration/docs/testing-guide.md"}, {old: "tests/integration/docs/TEST_COVERAGE.md", new: "tests/integration/docs/test-coverage.md"}, - + # Extensions {old: "extensions/providers/REFERENCE.md", new: "extensions/providers/reference.md"}, {old: "extensions/clusters/REFERENCE.md", new: "extensions/clusters/reference.md"}, {old: "extensions/providers/aws/kcl/docs/aws_prov.md", new: "extensions/providers/aws/kcl/docs/aws-prov.md"}, ] - + print "📝 Renaming product documentation files..." for rename in $renames { if ($rename.old | path exists) { @@ -67,7 +67,7 @@ def main [] { print $" ⚠️ File not found: ($rename.old)" } } - + print "\n✅ Product docs renamed to lowercase-kebab-case" print "\n⚡ Next: Update SUMMARY.md and fix internal links" } diff --git a/scripts/fix-layout-violations.nu b/scripts/fix-layout-violations.nu index 11ac6a1..5de09a5 100755 --- a/scripts/fix-layout-violations.nu +++ b/scripts/fix-layout-violations.nu @@ -5,39 +5,39 @@ def main [] { print "🔧 Fixing layout_conventions.md violations..." print "" - + # Session files to move let session_files = [ # AI Service {src: "platform/crates/ai-service/PHASE4_API.md", dst: ".coder/platform/ai-service/2026-01-10-phase4-api.info.md"}, - + # Control Center UI {src: "platform/crates/control-center-ui/LEPTOS_0.8_MIGRATION_COMPLETE.md", dst: ".coder/platform/control-center-ui/2025-12-XX-leptos-migration-complete.done.md"}, {src: "platform/crates/control-center-ui/LEPTOS_MIGRATION_INDEX.md", dst: ".coder/platform/control-center-ui/2025-12-XX-leptos-migration-index.info.md"}, {src: "platform/crates/control-center-ui/MIGRATION_VERIFICATION_FINAL.md", dst: ".coder/platform/control-center-ui/2025-12-XX-migration-verification.done.md"}, {src: "platform/crates/control-center-ui/UI_MOCKUPS.md", dst: ".coder/platform/control-center-ui/2025-12-XX-ui-mockups.info.md"}, - + # Orchestrator {src: "platform/crates/orchestrator/docs/what_is_next_info.md", dst: ".coder/platform/orchestrator/2025-12-XX-what-is-next.info.md"}, - + # Infrastructure {src: "platform/infrastructure/oci-registry/IMPLEMENTATION_SUMMARY.md", dst: ".coder/platform/oci-registry/2025-12-XX-implementation.done.md"}, - + # Tests {src: "tests/integration/IMPLEMENTATION_SUMMARY.md", dst: ".coder/tests/integration/2025-12-XX-implementation.done.md"}, - + # Core {src: "core/nulib/lib_provisioning/extensions/QUICKSTART.md", dst: ".coder/core/extensions/2025-12-XX-quickstart.info.md"}, {src: "core/nulib/lib_provisioning/secrets/info_README.md", dst: ".coder/core/secrets/2025-12-XX-info-readme.info.md"}, {src: "core/nulib/lib_provisioning/ai/kcl_build_ai.md", dst: ".coder/core/ai/2025-12-XX-kcl-build-ai.info.md"}, {src: "core/nulib/lib_provisioning/ai/info_about.md", dst: ".coder/core/ai/2025-12-XX-info-about.info.md"}, {src: "core/nulib/lib_provisioning/ai/info_ai.md", dst: ".coder/core/ai/2025-12-XX-info-ai.info.md"}, - + # Extensions wrks {src: "extensions/wrks/EXTENSIONS.md", dst: ".coder/extensions/2025-12-XX-extensions.info.md"}, {src: "extensions/wrks/EXTENSION_DEMO.md", dst: ".coder/extensions/2025-12-XX-extension-demo.info.md"}, ] - + print "📦 Moving session files to .coder/..." for file in $session_files { if ($file.src | path exists) { @@ -49,7 +49,7 @@ def main [] { print $" ⚠️ File not found: ($file.src)" } } - + print "\n✅ Session files moved to .coder/" print "\n⚡ Run fix-layout-rename.nu to rename product docs to lowercase" } diff --git a/scripts/fix-markdown-fences.nu b/scripts/fix-markdown-fences.nu index 00e8f79..a00c107 100644 --- a/scripts/fix-markdown-fences.nu +++ b/scripts/fix-markdown-fences.nu @@ -65,8 +65,9 @@ def main [ $modified_content = $opening_result.content } - # Write changes if not dry-run - if ($modified_content != $original_content) { + # Write changes if not dry-run AND if there were any fixes + let has_changes = ($closing_fixed > 0) or ($opening_fixed > 0) + if $has_changes { if (not $dry_run) { $modified_content | save --force $file } @@ -107,9 +108,14 @@ def main [ # Discover all markdown files with proper exclusions def discover-markdown-files [] { glob **/*.md + | each { |f| $f | str replace $'(pwd)/' '' } # Normalize to relative paths | where { |f| - # Exclude various non-doc directories - $f !~ '(node_modules|target|build|dist|\.git|\.vale|\.coder|\.claude|\.wrks|old_config)' + # Exclude system/cache directories + let excluded = $f =~ '(node_modules/|\.git/|\.vale/|\.coder/|\.claude/|\.wrks/|/old_config/)' + # Exclude root-level build/dist/target (but NOT tools/build, tools/dist) + let bad_build = ($f =~ '^(build|dist|target)/' and $f !~ '^tools/(build|dist)') + + not $excluded and not $bad_build } | sort } @@ -163,39 +169,44 @@ def fix-opening-fences [content, file_path] { mut in_fence = false mut fixed_count = 0 - for idx in (0..($lines | length)) { + for idx in (0..<($lines | length)) { let line = $lines | get $idx - # Check if this is an opening fence without language - if ($line =~ '^```$' and not $in_fence) { - # Get content after fence (first 10 lines or until closing fence) - let next_start = $idx + 1 - let next_count = if ($next_start + 10 < ($lines | length)) { 10 } else { ($lines | length) - $next_start } - let content_after = if $next_start < ($lines | length) { - $lines | skip $next_start | first $next_count - } else { - [] - } + if ($line =~ '^```') { + if (not $in_fence) { + # This is an opening fence + if ($line =~ '^```$') { + # Opening fence WITHOUT language → needs fixing + # Get content after fence (first 10 lines) + let next_start = $idx + 1 + let next_count = if ($next_start + 10 < ($lines | length)) { 10 } else { ($lines | length) - $next_start } + let content_after = if $next_start < ($lines | length) { + $lines | skip $next_start | first $next_count + } else { + [] + } - # Get context before fence (3 lines) - let context_start = if ($idx > 3) { $idx - 3 } else { 0 } - let context_before = $lines | skip $context_start | first ($idx - $context_start) | str join '\n' + # Get context before fence (3 lines) + let context_start = if ($idx > 3) { $idx - 3 } else { 0 } + let context_before = $lines | skip $context_start | first ($idx - $context_start) | str join '\n' - # Detect language - let detected_lang = detect-language $content_after $context_before $file_path + # Detect language + let detected_lang = detect-language $content_after $context_before $file_path - # Add language to fence - $fixed_lines = ($fixed_lines | append $'```{$detected_lang}') - $fixed_count += 1 - $in_fence = true - } else if ($line =~ '^```') { - # Track fence state for other fences - if $in_fence { - $in_fence = false - } else { + # Add language to fence + $fixed_lines = ($fixed_lines | append $'```{$detected_lang}') + $fixed_count += 1 + } else { + # Opening fence WITH language → no fix needed + $fixed_lines = ($fixed_lines | append $line) + } + # Enter fence state $in_fence = true + } else { + # We're inside a fence → this is closing fence + $fixed_lines = ($fixed_lines | append $line) + $in_fence = false } - $fixed_lines = ($fixed_lines | append $line) } else { $fixed_lines = ($fixed_lines | append $line) } diff --git a/scripts/setup-platform-config.sh.md b/scripts/setup-platform-config.sh.md index 174564d..24cce53 100644 --- a/scripts/setup-platform-config.sh.md +++ b/scripts/setup-platform-config.sh.md @@ -1,402 +1 @@ -# Platform Services Configuration Setup Script - -**Path**: `provisioning/scripts/setup-platform-config.sh` - -Setup and manage platform service configurations in `provisioning/config/runtime/`. - -## Features - -- ✅ **Interactive Mode**: Guided setup with TypeDialog or quick mode -- ✅ **Interactive TypeDialog**: Web/TUI/CLI form-based configuration -- ✅ **Quick Mode**: Auto-setup from defaults + mode overlays -- ✅ **Automatic TOML Export**: Generates TOML files for Rust services -- ✅ **Runtime Detection**: Detect existing configs and offer update/replace options -- ✅ **Batch Operations**: Configure all 8 services at once -- ✅ **Cleanup Management**: Remove/reset configurations safely - -## Usage - -### Interactive Setup (Recommended) - -``` -# Start interactive wizard -./provisioning/scripts/setup-platform-config.sh - -# Prompts for: -# 1. Action: TypeDialog, Quick Mode, Clean, or List -# 2. Service (if TypeDialog/Quick) -# 3. Mode (solo/multiuser/cicd/enterprise) -# 4. Backend (web/tui/cli, if TypeDialog) -``` - -### Command-Line Options - -``` -# Configure specific service via TypeDialog -./provisioning/scripts/setup-platform-config.sh \ - --service orchestrator \ - --mode solo \ - --backend web - -# Quick setup all services for enterprise mode -./provisioning/scripts/setup-platform-config.sh \ - --quick-mode \ - --mode enterprise - -# Regenerate TOML files from existing .ncl configs -./provisioning/scripts/setup-platform-config.sh \ - --generate-toml - -# List available options -./provisioning/scripts/setup-platform-config.sh --list-modes -./provisioning/scripts/setup-platform-config.sh --list-services -./provisioning/scripts/setup-platform-config.sh --list-configs - -# Clean all runtime configurations -./provisioning/scripts/setup-platform-config.sh --clean -``` - -## Workflow - -### 1. Initial Setup (Empty Runtime) - -``` -Interactive Prompt - ↓ -├─ TypeDialog (Recommended) -│ ├─ Load form definitions -│ ├─ User fills form (web/tui/cli) -│ └─ Generates orchestrator.solo.ncl -│ ↓ -│ Auto-export to orchestrator.solo.toml -│ -└─ Quick Mode - ├─ Select mode (solo/multiuser/cicd/enterprise) - ├─ Compose all services: defaults + mode overlay - ├─ Create 8 .ncl files - └─ Auto-export to 8 .toml files -``` - -### 2. Update Existing Configuration - -``` -Detect Existing Config - ↓ -Choose Action: - ├─ Clean up & start fresh - ├─ Update via TypeDialog (edit existing) - ├─ Use quick mode (regenerate) - └─ List current configs -``` - -### 3. Manual NCL Edits - -``` -User edits: provisioning/config/runtime/orchestrator.solo.ncl - ↓ -Run: ./setup-platform-config.sh --generate-toml - ↓ -Auto-exports to: provisioning/config/runtime/generated/orchestrator.solo.toml - ↓ -Service loads TOML automatically -``` - -## Configuration Layers - -The script composes configurations from multiple layers: - -``` -1. Schema (TYPE-SAFE CONTRACT) - ↓ - provisioning/schemas/platform/schemas/orchestrator.ncl - (Defines valid fields, types, constraints) - -2. Service Defaults (BASE VALUES) - ↓ - provisioning/schemas/platform/defaults/orchestrator-defaults.ncl - (Default values for all orchestrator settings) - -3. Mode Overlay (MODE-SPECIFIC TUNING) - ↓ - provisioning/schemas/platform/defaults/deployment/solo-defaults.ncl - (Resource limits for solo mode: 2 CPU, 4GB RAM) - -4. Composition (MERGE) - ↓ - defaults.merge_config_with_mode(mode_config) - (Merges base + mode overlay) - -5. Runtime Config (USER CUSTOMIZATION) - ↓ - provisioning/config/runtime/orchestrator.solo.ncl - (Final config, can be hand-edited) - -6. TOML Export (SERVICE CONSUMPTION) - ↓ - provisioning/config/runtime/generated/orchestrator.solo.toml - (Rust service reads this) -``` - -## Services & Modes - -### 8 Available Services - -``` -1. orchestrator - Main orchestration engine -2. control-center - Web UI and management console -3. mcp-server - Model Context Protocol server -4. vault-service - Secrets management and encryption -5. extension-registry - Extension distribution system -6. rag - Retrieval-Augmented Generation -7. ai-service - AI model integration -8. provisioning-daemon - Background operations -``` - -### 4 Deployment Modes - -| Mode | Specs | Use Case | -| ------ | ------- | ---------- | -| `solo` | 2 CPU, 4GB RAM | Development, testing | -| `multiuser` | 4 CPU, 8GB RAM | Team staging | -| `cicd` | 8 CPU, 16GB RAM | CI/CD pipelines | -| `enterprise` | 16+ CPU, 32+ GB | Production HA | - -## Directory Structure - -``` -provisioning/ -├── config/ -│ └── runtime/ # 🔒 PRIVATE (gitignored) -│ ├── .gitignore -│ ├── orchestrator.solo.ncl # Runtime config (user editable) -│ ├── vault-service.multiuser.ncl # Runtime config -│ └── generated/ # TOMLs (auto-generated) -│ ├── orchestrator.solo.toml # For Rust services -│ └── vault-service.multiuser.toml -│ -├── schemas/platform/ # 📘 PUBLIC (versionable) -│ ├── schemas/ # Type contracts -│ ├── defaults/ # Base values -│ │ ├── orchestrator-defaults.ncl -│ │ └── deployment/ -│ │ ├── solo-defaults.ncl -│ │ ├── multiuser-defaults.ncl -│ │ ├── cicd-defaults.ncl -│ │ └── enterprise-defaults.ncl -│ └── validators/ # Business logic -│ -└── scripts/ - └── setup-platform-config.sh # This script -``` - -## Requirements - -- **Bash 4.0+** -- **Nickel 0.10+** - Configuration language -- **Nushell 0.109+** - Script engine -- **TypeDialog** (optional, for interactive setup) - -## Integration with Provisioning Installer - -### ⚠️ Current Status: Installer NOT YET IMPLEMENTED - -The `setup-platform-config.sh` script is a **standalone tool** ready to use independently. - -**For now**: Call the script manually before running services - -``` -# Step 1: Setup platform configurations (MANUAL) -./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo - -# Step 2: Run services -export ORCHESTRATOR_MODE=solo -cargo run -p orchestrator -``` - -### Future: When Installer is Implemented - -Once `provisioning/scripts/install.sh` is ready, it will look like: - -``` -#!/bin/bash -# provisioning/scripts/install.sh (FUTURE) - -# Pre-flight checks -check_dependencies() { - command -v nickel >/dev/null || { echo "Nickel required"; exit 1; } - command -v nu >/dev/null || { echo "Nushell required"; exit 1; } -} -check_dependencies - -# Install provisioning system -echo "Installing provisioning system..." -# (implementation here) - -# Setup platform configurations -echo "Setting up platform configurations..." -./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo - -# Build and verify -echo "Building platform services..." -cargo build -p orchestrator -p control-center -p mcp-server - -echo "Installation complete!" -``` - -### CI/CD Integration (Available Now) - -For CI/CD pipelines that don't require the full installer: - -``` -#!/bin/bash -# ci/setup.sh - -# Setup configurations for CI/CD mode -./provisioning/scripts/setup-platform-config.sh \ - --quick-mode \ - --mode cicd - -# Run tests -cargo test --all - -# Deploy -docker-compose -f provisioning/platform/infrastructure/docker/docker-compose.cicd.yml up -``` - -## Important Notes - -### ⚠️ Manual Edits - -If you manually edit `.ncl` files in `provisioning/config/runtime/`: - -``` -# Always regenerate TOMLs afterward -./provisioning/scripts/setup-platform-config.sh --generate-toml -``` - -### 🔒 Private Configurations - -Files in `provisioning/config/runtime/` are **gitignored**: -- `.ncl` files (may contain secrets/encrypted values) -- `generated/*.toml` files (auto-generated, no need to version) - -### 📘 Public Schemas - -Schemas in `provisioning/schemas/platform/` are **public**: -- Source of truth for configuration structure -- Versionable and shared across team -- Can be committed to git - -### 🔄 Regeneration - -The script is **idempotent** - run it multiple times safely: - -``` -# Safe: Re-runs setup, updates configs -./provisioning/scripts/setup-platform-config.sh --quick-mode --mode multiuser - -# Does NOT overwrite manually edited files (unless --clean is used) -``` - -## Troubleshooting - -### Nickel Validation Fails - -``` -# Check syntax of generated config -nickel typecheck provisioning/config/runtime/orchestrator.solo.ncl - -# View detailed error -nickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl -``` - -### TOML Export Fails - -``` -# Check if Nickel config is valid -nickel typecheck provisioning/config/runtime/orchestrator.solo.ncl - -# Try manual export -nickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl -``` - -### Service Won't Start - -``` -# Verify TOML exists -ls -la provisioning/config/runtime/generated/orchestrator.solo.toml - -# Check TOML syntax -cat provisioning/config/runtime/generated/orchestrator.solo.toml | head -20 - -# Verify service can read TOML -ORCHESTRATOR_MODE=solo cargo run -p orchestrator -- -``` - -## Examples - -### Example 1: Quick Setup for Development - -``` -# Setup all services for solo mode (2 CPU, 4GB RAM) -./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo - -# Result: -# ✓ Created provisioning/config/runtime/orchestrator.solo.ncl -# ✓ Created provisioning/config/runtime/control-center.solo.ncl -# ✓ ... (8 services total) -# ✓ Generated 8 TOML files in provisioning/config/runtime/generated/ - -# Run service: -export ORCHESTRATOR_MODE=solo -cargo run -p orchestrator -``` - -### Example 2: Interactive TypeDialog Setup - -``` -# Configure orchestrator in multiuser mode with web UI -./provisioning/scripts/setup-platform-config.sh \ - --service orchestrator \ - --mode multiuser \ - --backend web - -# TypeDialog opens browser, user fills form -# Result: -# ✓ Created provisioning/config/runtime/orchestrator.multiuser.ncl -# ✓ Generated provisioning/config/runtime/generated/orchestrator.multiuser.toml - -# Run service: -export ORCHESTRATOR_MODE=multiuser -cargo run -p orchestrator -``` - -### Example 3: Update After Manual Edit - -``` -# Edit config manually -vim provisioning/config/runtime/orchestrator.solo.ncl - -# Regenerate TOML (critical!) -./provisioning/scripts/setup-platform-config.sh --generate-toml - -# Verify changes -cat provisioning/config/runtime/generated/orchestrator.solo.toml | head -20 - -# Restart service with new config -pkill orchestrator -ORCHESTRATOR_MODE=solo cargo run -p orchestrator -``` - -## Performance Notes - -- **Quick Mode**: ~1-2 seconds (no user interaction, direct composition) -- **TypeDialog (web)**: 10-30 seconds (server startup + UI loading) -- **TOML Export**: <1 second per service -- **Full Setup**: 5-10 minutes (8 services via TypeDialog) - ---- - -**Version**: 1.0.0 -**Created**: 2026-01-05 -**Last Updated**: 2026-01-05 +# Platform Services Configuration Setup Script\n\n**Path**: `provisioning/scripts/setup-platform-config.sh`\n\nSetup and manage platform service configurations in `provisioning/config/runtime/`.\n\n## Features\n\n- ✅ **Interactive Mode**: Guided setup with TypeDialog or quick mode\n- ✅ **Interactive TypeDialog**: Web/TUI/CLI form-based configuration\n- ✅ **Quick Mode**: Auto-setup from defaults + mode overlays\n- ✅ **Automatic TOML Export**: Generates TOML files for Rust services\n- ✅ **Runtime Detection**: Detect existing configs and offer update/replace options\n- ✅ **Batch Operations**: Configure all 8 services at once\n- ✅ **Cleanup Management**: Remove/reset configurations safely\n\n## Usage\n\n### Interactive Setup (Recommended)\n\n```\n# Start interactive wizard\n./provisioning/scripts/setup-platform-config.sh\n\n# Prompts for:\n# 1. Action: TypeDialog, Quick Mode, Clean, or List\n# 2. Service (if TypeDialog/Quick)\n# 3. Mode (solo/multiuser/cicd/enterprise)\n# 4. Backend (web/tui/cli, if TypeDialog)\n```\n\n### Command-Line Options\n\n```\n# Configure specific service via TypeDialog\n./provisioning/scripts/setup-platform-config.sh \\n --service orchestrator \\n --mode solo \\n --backend web\n\n# Quick setup all services for enterprise mode\n./provisioning/scripts/setup-platform-config.sh \\n --quick-mode \\n --mode enterprise\n\n# Regenerate TOML files from existing .ncl configs\n./provisioning/scripts/setup-platform-config.sh \\n --generate-toml\n\n# List available options\n./provisioning/scripts/setup-platform-config.sh --list-modes\n./provisioning/scripts/setup-platform-config.sh --list-services\n./provisioning/scripts/setup-platform-config.sh --list-configs\n\n# Clean all runtime configurations\n./provisioning/scripts/setup-platform-config.sh --clean\n```\n\n## Workflow\n\n### 1. Initial Setup (Empty Runtime)\n\n```\nInteractive Prompt\n ↓\n├─ TypeDialog (Recommended)\n│ ├─ Load form definitions\n│ ├─ User fills form (web/tui/cli)\n│ └─ Generates orchestrator.solo.ncl\n│ ↓\n│ Auto-export to orchestrator.solo.toml\n│\n└─ Quick Mode\n ├─ Select mode (solo/multiuser/cicd/enterprise)\n ├─ Compose all services: defaults + mode overlay\n ├─ Create 8 .ncl files\n └─ Auto-export to 8 .toml files\n```\n\n### 2. Update Existing Configuration\n\n```\nDetect Existing Config\n ↓\nChoose Action:\n ├─ Clean up & start fresh\n ├─ Update via TypeDialog (edit existing)\n ├─ Use quick mode (regenerate)\n └─ List current configs\n```\n\n### 3. Manual NCL Edits\n\n```\nUser edits: provisioning/config/runtime/orchestrator.solo.ncl\n ↓\nRun: ./setup-platform-config.sh --generate-toml\n ↓\nAuto-exports to: provisioning/config/runtime/generated/orchestrator.solo.toml\n ↓\nService loads TOML automatically\n```\n\n## Configuration Layers\n\nThe script composes configurations from multiple layers:\n\n```\n1. Schema (TYPE-SAFE CONTRACT)\n ↓\n provisioning/schemas/platform/schemas/orchestrator.ncl\n (Defines valid fields, types, constraints)\n\n2. Service Defaults (BASE VALUES)\n ↓\n provisioning/schemas/platform/defaults/orchestrator-defaults.ncl\n (Default values for all orchestrator settings)\n\n3. Mode Overlay (MODE-SPECIFIC TUNING)\n ↓\n provisioning/schemas/platform/defaults/deployment/solo-defaults.ncl\n (Resource limits for solo mode: 2 CPU, 4GB RAM)\n\n4. Composition (MERGE)\n ↓\n defaults.merge_config_with_mode(mode_config)\n (Merges base + mode overlay)\n\n5. Runtime Config (USER CUSTOMIZATION)\n ↓\n provisioning/config/runtime/orchestrator.solo.ncl\n (Final config, can be hand-edited)\n\n6. TOML Export (SERVICE CONSUMPTION)\n ↓\n provisioning/config/runtime/generated/orchestrator.solo.toml\n (Rust service reads this)\n```\n\n## Services & Modes\n\n### 8 Available Services\n\n```\n1. orchestrator - Main orchestration engine\n2. control-center - Web UI and management console\n3. mcp-server - Model Context Protocol server\n4. vault-service - Secrets management and encryption\n5. extension-registry - Extension distribution system\n6. rag - Retrieval-Augmented Generation\n7. ai-service - AI model integration\n8. provisioning-daemon - Background operations\n```\n\n### 4 Deployment Modes\n\n| Mode | Specs | Use Case |\n| ------ | ------- | ---------- |\n| `solo` | 2 CPU, 4GB RAM | Development, testing |\n| `multiuser` | 4 CPU, 8GB RAM | Team staging |\n| `cicd` | 8 CPU, 16GB RAM | CI/CD pipelines |\n| `enterprise` | 16+ CPU, 32+ GB | Production HA |\n\n## Directory Structure\n\n```\nprovisioning/\n├── config/\n│ └── runtime/ # 🔒 PRIVATE (gitignored)\n│ ├── .gitignore\n│ ├── orchestrator.solo.ncl # Runtime config (user editable)\n│ ├── vault-service.multiuser.ncl # Runtime config\n│ └── generated/ # TOMLs (auto-generated)\n│ ├── orchestrator.solo.toml # For Rust services\n│ └── vault-service.multiuser.toml\n│\n├── schemas/platform/ # 📘 PUBLIC (versionable)\n│ ├── schemas/ # Type contracts\n│ ├── defaults/ # Base values\n│ │ ├── orchestrator-defaults.ncl\n│ │ └── deployment/\n│ │ ├── solo-defaults.ncl\n│ │ ├── multiuser-defaults.ncl\n│ │ ├── cicd-defaults.ncl\n│ │ └── enterprise-defaults.ncl\n│ └── validators/ # Business logic\n│\n└── scripts/\n └── setup-platform-config.sh # This script\n```\n\n## Requirements\n\n- **Bash 4.0+**\n- **Nickel 0.10+** - Configuration language\n- **Nushell 0.109+** - Script engine\n- **TypeDialog** (optional, for interactive setup)\n\n## Integration with Provisioning Installer\n\n### ⚠️ Current Status: Installer NOT YET IMPLEMENTED\n\nThe `setup-platform-config.sh` script is a **standalone tool** ready to use independently.\n\n**For now**: Call the script manually before running services\n\n```\n# Step 1: Setup platform configurations (MANUAL)\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo\n\n# Step 2: Run services\nexport ORCHESTRATOR_MODE=solo\ncargo run -p orchestrator\n```\n\n### Future: When Installer is Implemented\n\nOnce `provisioning/scripts/install.sh` is ready, it will look like:\n\n```\n#!/bin/bash\n# provisioning/scripts/install.sh (FUTURE)\n\n# Pre-flight checks\ncheck_dependencies() {\n command -v nickel >/dev/null || { echo "Nickel required"; exit 1; }\n command -v nu >/dev/null || { echo "Nushell required"; exit 1; }\n}\ncheck_dependencies\n\n# Install provisioning system\necho "Installing provisioning system..."\n# (implementation here)\n\n# Setup platform configurations\necho "Setting up platform configurations..."\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo\n\n# Build and verify\necho "Building platform services..."\ncargo build -p orchestrator -p control-center -p mcp-server\n\necho "Installation complete!"\n```\n\n### CI/CD Integration (Available Now)\n\nFor CI/CD pipelines that don't require the full installer:\n\n```\n#!/bin/bash\n# ci/setup.sh\n\n# Setup configurations for CI/CD mode\n./provisioning/scripts/setup-platform-config.sh \\n --quick-mode \\n --mode cicd\n\n# Run tests\ncargo test --all\n\n# Deploy\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.cicd.yml up\n```\n\n## Important Notes\n\n### ⚠️ Manual Edits\n\nIf you manually edit `.ncl` files in `provisioning/config/runtime/`:\n\n```\n# Always regenerate TOMLs afterward\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n```\n\n### 🔒 Private Configurations\n\nFiles in `provisioning/config/runtime/` are **gitignored**:\n- `.ncl` files (may contain secrets/encrypted values)\n- `generated/*.toml` files (auto-generated, no need to version)\n\n### 📘 Public Schemas\n\nSchemas in `provisioning/schemas/platform/` are **public**:\n- Source of truth for configuration structure\n- Versionable and shared across team\n- Can be committed to git\n\n### 🔄 Regeneration\n\nThe script is **idempotent** - run it multiple times safely:\n\n```\n# Safe: Re-runs setup, updates configs\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode multiuser\n\n# Does NOT overwrite manually edited files (unless --clean is used)\n```\n\n## Troubleshooting\n\n### Nickel Validation Fails\n\n```\n# Check syntax of generated config\nnickel typecheck provisioning/config/runtime/orchestrator.solo.ncl\n\n# View detailed error\nnickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl\n```\n\n### TOML Export Fails\n\n```\n# Check if Nickel config is valid\nnickel typecheck provisioning/config/runtime/orchestrator.solo.ncl\n\n# Try manual export\nnickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl\n```\n\n### Service Won't Start\n\n```\n# Verify TOML exists\nls -la provisioning/config/runtime/generated/orchestrator.solo.toml\n\n# Check TOML syntax\ncat provisioning/config/runtime/generated/orchestrator.solo.toml | head -20\n\n# Verify service can read TOML\nORCHESTRATOR_MODE=solo cargo run -p orchestrator --\n```\n\n## Examples\n\n### Example 1: Quick Setup for Development\n\n```\n# Setup all services for solo mode (2 CPU, 4GB RAM)\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo\n\n# Result:\n# ✓ Created provisioning/config/runtime/orchestrator.solo.ncl\n# ✓ Created provisioning/config/runtime/control-center.solo.ncl\n# ✓ ... (8 services total)\n# ✓ Generated 8 TOML files in provisioning/config/runtime/generated/\n\n# Run service:\nexport ORCHESTRATOR_MODE=solo\ncargo run -p orchestrator\n```\n\n### Example 2: Interactive TypeDialog Setup\n\n```\n# Configure orchestrator in multiuser mode with web UI\n./provisioning/scripts/setup-platform-config.sh \\n --service orchestrator \\n --mode multiuser \\n --backend web\n\n# TypeDialog opens browser, user fills form\n# Result:\n# ✓ Created provisioning/config/runtime/orchestrator.multiuser.ncl\n# ✓ Generated provisioning/config/runtime/generated/orchestrator.multiuser.toml\n\n# Run service:\nexport ORCHESTRATOR_MODE=multiuser\ncargo run -p orchestrator\n```\n\n### Example 3: Update After Manual Edit\n\n```\n# Edit config manually\nvim provisioning/config/runtime/orchestrator.solo.ncl\n\n# Regenerate TOML (critical!)\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n\n# Verify changes\ncat provisioning/config/runtime/generated/orchestrator.solo.toml | head -20\n\n# Restart service with new config\npkill orchestrator\nORCHESTRATOR_MODE=solo cargo run -p orchestrator\n```\n\n## Performance Notes\n\n- **Quick Mode**: ~1-2 seconds (no user interaction, direct composition)\n- **TypeDialog (web)**: 10-30 seconds (server startup + UI loading)\n- **TOML Export**: <1 second per service\n- **Full Setup**: 5-10 minutes (8 services via TypeDialog)\n\n---\n\n**Version**: 1.0.0\n**Created**: 2026-01-05\n**Last Updated**: 2026-01-05 diff --git a/templates/workspace/example/README.md b/templates/workspace/example/README.md index 18e7264..2099263 100644 --- a/templates/workspace/example/README.md +++ b/templates/workspace/example/README.md @@ -1,212 +1 @@ -# Example Infrastructure Template - -This is a complete, ready-to-deploy example of a simple web application stack. - -## What's Included - -- **2 Web servers** - Load-balanced frontend -- **1 Database server** - Backend database -- **Complete configuration** - Ready to deploy with minimal changes -- **Usage instructions** - Step-by-step deployment guide - -## Architecture - -``` -┌─────────────────────────────────────────┐ -│ Internet / Load Balancer │ -└─────────────┬───────────────────────────┘ - │ - ┌───────┴───────┐ - │ │ -┌─────▼─────┐ ┌────▼──────┐ -│ demo-web-01│ │demo-web-02│ -│ (Public) │ │ (Public) │ -└─────┬──────┘ └────┬──────┘ - │ │ - └───────┬───────┘ - │ - │ Private Network - │ - ┌─────▼──────┐ - │ demo-db-01 │ - │ (Private) │ - └────────────┘ -``` - -## Quick Start - -### 1. Load Required Provider - -``` -cd infra/ - -# Load your cloud provider -provisioning mod load providers . upcloud -# OR -provisioning mod load providers . aws -``` - -### 2. Configure Provider Settings - -Edit `servers.k` and uncomment provider-specific settings: - -**UpCloud example:** - -``` -plan = "1xCPU-2GB" # Web servers -# plan = "2xCPU-4GB" # Database server (larger) -storage_size = 25 # Disk size in GB -``` - -**AWS example:** - -``` -instance_type = "t3.small" # Web servers -# instance_type = "t3.medium" # Database server -storage_size = 25 -``` - -### 3. Load Optional Task Services - -``` -# For container support -provisioning mod load taskservs . containerd - -# For additional services -provisioning mod load taskservs . docker redis nginx -``` - -### 4. Deploy - -``` -# Test configuration first -kcl run servers.k - -# Dry-run to see what will be created -provisioning s create --infra --check - -# Deploy the infrastructure -provisioning s create --infra - -# Monitor deployment -watch provisioning s list --infra -``` - -### 5. Verify Deployment - -``` -# List all servers -provisioning s list --infra - -# SSH into web server -provisioning s ssh demo-web-01 - -# Check database server -provisioning s ssh demo-db-01 -``` - -## Configuration Details - -### Web Servers (demo-web-01, demo-web-02) - -- **Networking**: Public IPv4 + Private IPv4 -- **Purpose**: Frontend application servers -- **Load balancing**: Configure externally -- **Resources**: Minimal (1-2 CPU, 2-4GB RAM) - -### Database Server (demo-db-01) - -- **Networking**: Private IPv4 only (no public access) -- **Purpose**: Backend database -- **Security**: Isolated on private network -- **Resources**: Medium (2-4 CPU, 4-8GB RAM) - -## Next Steps - -### Application Deployment - -1. **Deploy application code** - Use SSH or CI/CD -2. **Configure web servers** - Set up Nginx/Apache -3. **Set up database** - Install PostgreSQL/MySQL -4. **Configure connectivity** - Connect web servers to database - -### Security Hardening - -1. **Firewall rules** - Lock down server access -2. **SSH keys** - Disable password auth -3. **Database access** - Restrict to web servers only -4. **SSL certificates** - Set up HTTPS - -### Monitoring & Backup - -1. **Monitoring** - Set up metrics collection -2. **Logging** - Configure centralized logging -3. **Backups** - Set up database backups -4. **Alerts** - Configure alerting - -### Scaling - -1. **Add more web servers** - Copy web-02 definition -2. **Database replication** - Add read replicas -3. **Load balancer** - Configure external LB -4. **Auto-scaling** - Set up scaling policies - -## Customization - -### Change Server Count - -``` -# Add more web servers -{ - hostname = "demo-web-03" - # ... copy configuration from web-01 -} -``` - -### Change Resource Sizes - -``` -# Web servers -plan = "2xCPU-4GB" # Increase resources - -# Database -plan = "4xCPU-8GB" # More resources for DB -storage_size = 100 # Larger disk -``` - -### Add Task Services - -``` -taskservs = [ - { name = "containerd", profile = "default" } - { name = "docker", profile = "default" } - { name = "redis", profile = "default" } -] -``` - -## Common Issues - -### Deployment Fails - -- Check provider credentials -- Verify network configuration -- Check resource quotas - -### Can't SSH - -- Verify SSH key is loaded -- Check firewall rules -- Ensure server is running - -### Database Connection - -- Verify private network -- Check firewall rules between web and DB -- Test connectivity from web servers - -## Template Characteristics - -- **Complexity**: Medium -- **Servers**: 3 (2 web + 1 database) -- **Pre-configured modules**: Provider only -- **Best for**: Quick demos, learning deployments, testing infrastructure code +# Example Infrastructure Template\n\nThis is a complete, ready-to-deploy example of a simple web application stack.\n\n## What's Included\n\n- **2 Web servers** - Load-balanced frontend\n- **1 Database server** - Backend database\n- **Complete configuration** - Ready to deploy with minimal changes\n- **Usage instructions** - Step-by-step deployment guide\n\n## Architecture\n\n```\n┌─────────────────────────────────────────┐\n│ Internet / Load Balancer │\n└─────────────┬───────────────────────────┘\n │\n ┌───────┴───────┐\n │ │\n┌─────▼─────┐ ┌────▼──────┐\n│ demo-web-01│ │demo-web-02│\n│ (Public) │ │ (Public) │\n└─────┬──────┘ └────┬──────┘\n │ │\n └───────┬───────┘\n │\n │ Private Network\n │\n ┌─────▼──────┐\n │ demo-db-01 │\n │ (Private) │\n └────────────┘\n```\n\n## Quick Start\n\n### 1. Load Required Provider\n\n```\ncd infra/\n\n# Load your cloud provider\nprovisioning mod load providers . upcloud\n# OR\nprovisioning mod load providers . aws\n```\n\n### 2. Configure Provider Settings\n\nEdit `servers.k` and uncomment provider-specific settings:\n\n**UpCloud example:**\n\n```\nplan = "1xCPU-2GB" # Web servers\n# plan = "2xCPU-4GB" # Database server (larger)\nstorage_size = 25 # Disk size in GB\n```\n\n**AWS example:**\n\n```\ninstance_type = "t3.small" # Web servers\n# instance_type = "t3.medium" # Database server\nstorage_size = 25\n```\n\n### 3. Load Optional Task Services\n\n```\n# For container support\nprovisioning mod load taskservs . containerd\n\n# For additional services\nprovisioning mod load taskservs . docker redis nginx\n```\n\n### 4. Deploy\n\n```\n# Test configuration first\nkcl run servers.k\n\n# Dry-run to see what will be created\nprovisioning s create --infra --check\n\n# Deploy the infrastructure\nprovisioning s create --infra \n\n# Monitor deployment\nwatch provisioning s list --infra \n```\n\n### 5. Verify Deployment\n\n```\n# List all servers\nprovisioning s list --infra \n\n# SSH into web server\nprovisioning s ssh demo-web-01\n\n# Check database server\nprovisioning s ssh demo-db-01\n```\n\n## Configuration Details\n\n### Web Servers (demo-web-01, demo-web-02)\n\n- **Networking**: Public IPv4 + Private IPv4\n- **Purpose**: Frontend application servers\n- **Load balancing**: Configure externally\n- **Resources**: Minimal (1-2 CPU, 2-4GB RAM)\n\n### Database Server (demo-db-01)\n\n- **Networking**: Private IPv4 only (no public access)\n- **Purpose**: Backend database\n- **Security**: Isolated on private network\n- **Resources**: Medium (2-4 CPU, 4-8GB RAM)\n\n## Next Steps\n\n### Application Deployment\n\n1. **Deploy application code** - Use SSH or CI/CD\n2. **Configure web servers** - Set up Nginx/Apache\n3. **Set up database** - Install PostgreSQL/MySQL\n4. **Configure connectivity** - Connect web servers to database\n\n### Security Hardening\n\n1. **Firewall rules** - Lock down server access\n2. **SSH keys** - Disable password auth\n3. **Database access** - Restrict to web servers only\n4. **SSL certificates** - Set up HTTPS\n\n### Monitoring & Backup\n\n1. **Monitoring** - Set up metrics collection\n2. **Logging** - Configure centralized logging\n3. **Backups** - Set up database backups\n4. **Alerts** - Configure alerting\n\n### Scaling\n\n1. **Add more web servers** - Copy web-02 definition\n2. **Database replication** - Add read replicas\n3. **Load balancer** - Configure external LB\n4. **Auto-scaling** - Set up scaling policies\n\n## Customization\n\n### Change Server Count\n\n```\n# Add more web servers\n{\n hostname = "demo-web-03"\n # ... copy configuration from web-01\n}\n```\n\n### Change Resource Sizes\n\n```\n# Web servers\nplan = "2xCPU-4GB" # Increase resources\n\n# Database\nplan = "4xCPU-8GB" # More resources for DB\nstorage_size = 100 # Larger disk\n```\n\n### Add Task Services\n\n```\ntaskservs = [\n { name = "containerd", profile = "default" }\n { name = "docker", profile = "default" }\n { name = "redis", profile = "default" }\n]\n```\n\n## Common Issues\n\n### Deployment Fails\n\n- Check provider credentials\n- Verify network configuration\n- Check resource quotas\n\n### Can't SSH\n\n- Verify SSH key is loaded\n- Check firewall rules\n- Ensure server is running\n\n### Database Connection\n\n- Verify private network\n- Check firewall rules between web and DB\n- Test connectivity from web servers\n\n## Template Characteristics\n\n- **Complexity**: Medium\n- **Servers**: 3 (2 web + 1 database)\n- **Pre-configured modules**: Provider only\n- **Best for**: Quick demos, learning deployments, testing infrastructure code diff --git a/templates/workspace/full/README.md b/templates/workspace/full/README.md index 80eb0d8..fe3fa18 100644 --- a/templates/workspace/full/README.md +++ b/templates/workspace/full/README.md @@ -1,162 +1 @@ -# Full Infrastructure Template - -This is a comprehensive infrastructure template with multiple server types and advanced configuration examples. - -## What's Included - -- **Web servers** - 2 frontend web servers -- **Database server** - Backend database with private networking -- **Kubernetes control plane** - Control plane node -- **Kubernetes workers** - 2 worker nodes -- **Advanced settings** - SSH config, monitoring, backup options -- **Comprehensive examples** - Multiple server roles and configurations - -## Server Inventory - -| Hostname | Role | Network | Purpose | -| ---------- | ------ | --------- | --------- | -| web-01, web-02 | Web | Public + Private | Frontend application servers | -| db-01 | Database | Private only | Backend database | -| k8s-control-01 | K8s Control | Public + Private | Kubernetes control plane | -| k8s-worker-01, k8s-worker-02 | K8s Worker | Public + Private | Kubernetes compute nodes | - -## Quick Start - -### 1. Load Required Modules - -``` -cd infra/ - -# Load provider -provisioning mod load providers . upcloud - -# Load taskservs -provisioning mod load taskservs . kubernetes containerd cilium - -# Load cluster configurations (optional) -provisioning mod load clusters . buildkit -``` - -### 2. Customize Configuration - -Edit `servers.k`: - -**Provider-specific settings:** - -``` -# Uncomment and adjust for your provider -plan = "2xCPU-4GB" # Server size -storage_size = 50 # Disk size in GB -``` - -**Task services:** - -``` -# Uncomment after loading modules -taskservs = [ - { name = "kubernetes", profile = "control-plane" } - { name = "containerd", profile = "default" } - { name = "cilium", profile = "default" } -] -``` - -**Select servers to deploy:** - -``` -# Choose which server groups to deploy -all_servers = web_servers + db_servers # Web + DB only -# OR -all_servers = k8s_control + k8s_workers # Kubernetes cluster only -# OR -all_servers = web_servers + db_servers + k8s_control + k8s_workers # Everything -``` - -### 3. Deploy - -``` -# Test configuration -kcl run servers.k - -# Dry-run deployment (recommended) -provisioning s create --infra --check - -# Deploy selected servers -provisioning s create --infra - -# Or deploy specific server groups -provisioning s create --infra --select web -``` - -## Architecture Examples - -### Web Application Stack - -Deploy web servers + database: - -``` -all_servers = web_servers + db_servers -``` - -### Kubernetes Cluster - -Deploy control plane + workers: - -``` -all_servers = k8s_control + k8s_workers -``` - -### Complete Infrastructure - -Deploy everything: - -``` -all_servers = web_servers + db_servers + k8s_control + k8s_workers -``` - -## Advanced Configuration - -### Network Segmentation - -- **Public servers**: web-01, web-02 (public + private networks) -- **Private servers**: db-01 (private network only) -- **Hybrid**: k8s nodes (public for API access, private for pod networking) - -### Monitoring - -Monitoring is pre-configured in settings: - -``` -monitoring = { - enabled = True - metrics_port = 9100 - log_aggregation = True -} -``` - -### SSH Configuration - -Advanced SSH settings are included: - -``` -ssh_config = { - connect_timeout = 30 - retry_attempts = 3 - compression = True -} -``` - -## Next Steps - -1. **Customize server specs** - Adjust CPU, memory, storage -2. **Configure networking** - Set up firewall rules, load balancers -3. **Add taskservs** - Uncomment and configure task services -4. **Set up clusters** - Deploy Kubernetes or container clusters -5. **Configure monitoring** - Set up metrics and logging -6. **Implement backup** - Configure backup policies - -## Template Characteristics - -- **Complexity**: High -- **Servers**: 6 examples (web, database, k8s) -- **Pre-configured modules**: Examples for all major components -- **Best for**: Production deployments, complex architectures, learning advanced patterns +# Full Infrastructure Template\n\nThis is a comprehensive infrastructure template with multiple server types and advanced configuration examples.\n\n## What's Included\n\n- **Web servers** - 2 frontend web servers\n- **Database server** - Backend database with private networking\n- **Kubernetes control plane** - Control plane node\n- **Kubernetes workers** - 2 worker nodes\n- **Advanced settings** - SSH config, monitoring, backup options\n- **Comprehensive examples** - Multiple server roles and configurations\n\n## Server Inventory\n\n| Hostname | Role | Network | Purpose |\n| ---------- | ------ | --------- | --------- |\n| web-01, web-02 | Web | Public + Private | Frontend application servers |\n| db-01 | Database | Private only | Backend database |\n| k8s-control-01 | K8s Control | Public + Private | Kubernetes control plane |\n| k8s-worker-01, k8s-worker-02 | K8s Worker | Public + Private | Kubernetes compute nodes |\n\n## Quick Start\n\n### 1. Load Required Modules\n\n```\ncd infra/\n\n# Load provider\nprovisioning mod load providers . upcloud\n\n# Load taskservs\nprovisioning mod load taskservs . kubernetes containerd cilium\n\n# Load cluster configurations (optional)\nprovisioning mod load clusters . buildkit\n```\n\n### 2. Customize Configuration\n\nEdit `servers.k`:\n\n**Provider-specific settings:**\n\n```\n# Uncomment and adjust for your provider\nplan = "2xCPU-4GB" # Server size\nstorage_size = 50 # Disk size in GB\n```\n\n**Task services:**\n\n```\n# Uncomment after loading modules\ntaskservs = [\n { name = "kubernetes", profile = "control-plane" }\n { name = "containerd", profile = "default" }\n { name = "cilium", profile = "default" }\n]\n```\n\n**Select servers to deploy:**\n\n```\n# Choose which server groups to deploy\nall_servers = web_servers + db_servers # Web + DB only\n# OR\nall_servers = k8s_control + k8s_workers # Kubernetes cluster only\n# OR\nall_servers = web_servers + db_servers + k8s_control + k8s_workers # Everything\n```\n\n### 3. Deploy\n\n```\n# Test configuration\nkcl run servers.k\n\n# Dry-run deployment (recommended)\nprovisioning s create --infra --check\n\n# Deploy selected servers\nprovisioning s create --infra \n\n# Or deploy specific server groups\nprovisioning s create --infra --select web\n```\n\n## Architecture Examples\n\n### Web Application Stack\n\nDeploy web servers + database:\n\n```\nall_servers = web_servers + db_servers\n```\n\n### Kubernetes Cluster\n\nDeploy control plane + workers:\n\n```\nall_servers = k8s_control + k8s_workers\n```\n\n### Complete Infrastructure\n\nDeploy everything:\n\n```\nall_servers = web_servers + db_servers + k8s_control + k8s_workers\n```\n\n## Advanced Configuration\n\n### Network Segmentation\n\n- **Public servers**: web-01, web-02 (public + private networks)\n- **Private servers**: db-01 (private network only)\n- **Hybrid**: k8s nodes (public for API access, private for pod networking)\n\n### Monitoring\n\nMonitoring is pre-configured in settings:\n\n```\nmonitoring = {\n enabled = True\n metrics_port = 9100\n log_aggregation = True\n}\n```\n\n### SSH Configuration\n\nAdvanced SSH settings are included:\n\n```\nssh_config = {\n connect_timeout = 30\n retry_attempts = 3\n compression = True\n}\n```\n\n## Next Steps\n\n1. **Customize server specs** - Adjust CPU, memory, storage\n2. **Configure networking** - Set up firewall rules, load balancers\n3. **Add taskservs** - Uncomment and configure task services\n4. **Set up clusters** - Deploy Kubernetes or container clusters\n5. **Configure monitoring** - Set up metrics and logging\n6. **Implement backup** - Configure backup policies\n\n## Template Characteristics\n\n- **Complexity**: High\n- **Servers**: 6 examples (web, database, k8s)\n- **Pre-configured modules**: Examples for all major components\n- **Best for**: Production deployments, complex architectures, learning advanced patterns diff --git a/templates/workspace/minimal/README.md b/templates/workspace/minimal/README.md index a452540..620465b 100644 --- a/templates/workspace/minimal/README.md +++ b/templates/workspace/minimal/README.md @@ -1,59 +1 @@ -# Minimal Infrastructure Template - -This is a minimal infrastructure template with a basic server configuration. - -## What's Included - -- **Single server definition** - Basic example to customize -- **Minimal settings** - Essential configuration only -- **No pre-configured modules** - Load what you need - -## Quick Start - -### 1. Load Required Modules - -``` -cd infra/ - -# Load a provider -provisioning mod load providers . upcloud - -# Load taskservs as needed -provisioning mod load taskservs . containerd -``` - -### 2. Customize Configuration - -Edit `servers.k`: - -- Change server hostname and title -- Configure network settings -- Add provider-specific settings (plan, storage, etc.) -- Add taskservs when ready - -### 3. Deploy - -``` -# Test configuration -kcl run servers.k - -# Dry-run deployment -provisioning s create --infra --check - -# Deploy -provisioning s create --infra -``` - -## Next Steps - -- Add more servers to the `example_servers` array -- Configure taskservs for your servers -- Set up monitoring and backup -- Configure firewall rules - -## Template Characteristics - -- **Complexity**: Low -- **Servers**: 1 basic example -- **Pre-configured modules**: None -- **Best for**: Learning, simple deployments, custom configurations +# Minimal Infrastructure Template\n\nThis is a minimal infrastructure template with a basic server configuration.\n\n## What's Included\n\n- **Single server definition** - Basic example to customize\n- **Minimal settings** - Essential configuration only\n- **No pre-configured modules** - Load what you need\n\n## Quick Start\n\n### 1. Load Required Modules\n\n```\ncd infra/\n\n# Load a provider\nprovisioning mod load providers . upcloud\n\n# Load taskservs as needed\nprovisioning mod load taskservs . containerd\n```\n\n### 2. Customize Configuration\n\nEdit `servers.k`:\n\n- Change server hostname and title\n- Configure network settings\n- Add provider-specific settings (plan, storage, etc.)\n- Add taskservs when ready\n\n### 3. Deploy\n\n```\n# Test configuration\nkcl run servers.k\n\n# Dry-run deployment\nprovisioning s create --infra --check\n\n# Deploy\nprovisioning s create --infra \n```\n\n## Next Steps\n\n- Add more servers to the `example_servers` array\n- Configure taskservs for your servers\n- Set up monitoring and backup\n- Configure firewall rules\n\n## Template Characteristics\n\n- **Complexity**: Low\n- **Servers**: 1 basic example\n- **Pre-configured modules**: None\n- **Best for**: Learning, simple deployments, custom configurations diff --git a/templates/workspaces/kubernetes/setup.md b/templates/workspaces/kubernetes/setup.md index a9a0e8b..b0a3166 100644 --- a/templates/workspaces/kubernetes/setup.md +++ b/templates/workspaces/kubernetes/setup.md @@ -1,167 +1 @@ -# Kubernetes Workspace Setup - -This template provides a complete Kubernetes cluster configuration using the package-based provisioning system. - -## Prerequisites - -1. Core provisioning package installed: - - ```bash - kcl-packager.nu install --version latest - ``` - -2. Module loader CLI available: - - ```bash - module-loader --help - ``` - -## Setup Steps - -### 1. Initialize Workspace - -``` -# Create workspace from template -cp -r provisioning/templates/workspaces/kubernetes ./my-k8s-cluster -cd my-k8s-cluster - -# Initialize directory structure -workspace-init.nu . init -``` - -### 2. Load Required Taskservs - -``` -# Load Kubernetes components -module-loader load taskservs . [kubernetes, cilium, containerd] - -# Verify loading -module-loader list taskservs . -``` - -### 3. Load Cloud Provider - -``` -# For UpCloud -module-loader load providers . [upcloud] - -# For AWS -module-loader load providers . [aws] - -# For local development -module-loader load providers . [local] -``` - -### 4. Configure Infrastructure - -1. Edit `servers.k` to uncomment the import statements and taskserv configurations -2. Adjust server specifications, hostnames, and labels as needed -3. Configure provider-specific settings in the generated provider files - -### 5. Validate Configuration - -``` -# Validate KCL configuration -kcl run servers.k - -# Validate workspace -module-loader validate . -``` - -### 6. Deploy Cluster - -``` -# Create servers -provisioning server create --infra . --check - -# Install taskservs -provisioning taskserv create kubernetes --infra . -provisioning taskserv create cilium --infra . -provisioning taskserv create containerd --infra . - -# Verify cluster -kubectl get nodes -``` - -## Configuration Details - -### Server Roles - -- **k8s-master-01**: Control plane node running the Kubernetes API server, etcd, and scheduler -- **k8s-worker-01/02**: Worker nodes running kubelet and container runtime - -### Taskservs - -- **containerd**: Container runtime for Kubernetes -- **kubernetes**: Core Kubernetes components (kubelet, kubeadm, kubectl) -- **cilium**: CNI (Container Network Interface) for pod networking - -### Network Configuration - -- All nodes have public IPv4 for initial setup -- Cilium provides internal pod-to-pod networking -- SSH access on port 22 for management - -## Customization - -### Adding More Workers - -Copy the worker node configuration in `servers.k` and modify: - -- `hostname` -- `title` -- Any provider-specific settings - -### Different Container Runtime - -Replace `containerd` taskserv with: - -- `crio`: CRI-O runtime -- `docker`: Docker runtime (not recommended for production) - -### Different CNI - -Replace `cilium` taskserv with: - -- `calico`: Calico CNI -- `flannel`: Flannel CNI -- Built-in kubenet (remove CNI taskserv) - -### Storage - -Add storage taskservs: - -``` -module-loader load taskservs . [rook-ceph, mayastor] -``` - -Then add to server taskserv configurations: - -``` -taskservs = [ - { name = "containerd", profile = "default" }, - { name = "kubernetes", profile = "worker" }, - { name = "cilium", profile = "worker" }, - { name = "rook-ceph", profile = "default" } -] -``` - -## Troubleshooting - -### Module Import Errors - -If you see import errors like "module not found": - -1. Verify modules are loaded: `module-loader list taskservs .` -2. Check generated import files: `ls .taskservs/` -3. Reload modules if needed: `module-loader load taskservs . [kubernetes, cilium, containerd]` - -### Provider Configuration - -Check provider-specific configuration in `.providers/` directory after loading. - -### Kubernetes Setup Issues - -1. Check taskserv installation logs in `./tmp/k8s-deployment/` -2. Verify all nodes are reachable via SSH -3. Check firewall rules for Kubernetes ports (6443, 10250, etc.) +# Kubernetes Workspace Setup\n\nThis template provides a complete Kubernetes cluster configuration using the package-based provisioning system.\n\n## Prerequisites\n\n1. Core provisioning package installed:\n\n ```bash\n kcl-packager.nu install --version latest\n ```\n\n2. Module loader CLI available:\n\n ```bash\n module-loader --help\n ```\n\n## Setup Steps\n\n### 1. Initialize Workspace\n\n```\n# Create workspace from template\ncp -r provisioning/templates/workspaces/kubernetes ./my-k8s-cluster\ncd my-k8s-cluster\n\n# Initialize directory structure\nworkspace-init.nu . init\n```\n\n### 2. Load Required Taskservs\n\n```\n# Load Kubernetes components\nmodule-loader load taskservs . [kubernetes, cilium, containerd]\n\n# Verify loading\nmodule-loader list taskservs .\n```\n\n### 3. Load Cloud Provider\n\n```\n# For UpCloud\nmodule-loader load providers . [upcloud]\n\n# For AWS\nmodule-loader load providers . [aws]\n\n# For local development\nmodule-loader load providers . [local]\n```\n\n### 4. Configure Infrastructure\n\n1. Edit `servers.k` to uncomment the import statements and taskserv configurations\n2. Adjust server specifications, hostnames, and labels as needed\n3. Configure provider-specific settings in the generated provider files\n\n### 5. Validate Configuration\n\n```\n# Validate KCL configuration\nkcl run servers.k\n\n# Validate workspace\nmodule-loader validate .\n```\n\n### 6. Deploy Cluster\n\n```\n# Create servers\nprovisioning server create --infra . --check\n\n# Install taskservs\nprovisioning taskserv create kubernetes --infra .\nprovisioning taskserv create cilium --infra .\nprovisioning taskserv create containerd --infra .\n\n# Verify cluster\nkubectl get nodes\n```\n\n## Configuration Details\n\n### Server Roles\n\n- **k8s-master-01**: Control plane node running the Kubernetes API server, etcd, and scheduler\n- **k8s-worker-01/02**: Worker nodes running kubelet and container runtime\n\n### Taskservs\n\n- **containerd**: Container runtime for Kubernetes\n- **kubernetes**: Core Kubernetes components (kubelet, kubeadm, kubectl)\n- **cilium**: CNI (Container Network Interface) for pod networking\n\n### Network Configuration\n\n- All nodes have public IPv4 for initial setup\n- Cilium provides internal pod-to-pod networking\n- SSH access on port 22 for management\n\n## Customization\n\n### Adding More Workers\n\nCopy the worker node configuration in `servers.k` and modify:\n\n- `hostname`\n- `title`\n- Any provider-specific settings\n\n### Different Container Runtime\n\nReplace `containerd` taskserv with:\n\n- `crio`: CRI-O runtime\n- `docker`: Docker runtime (not recommended for production)\n\n### Different CNI\n\nReplace `cilium` taskserv with:\n\n- `calico`: Calico CNI\n- `flannel`: Flannel CNI\n- Built-in kubenet (remove CNI taskserv)\n\n### Storage\n\nAdd storage taskservs:\n\n```\nmodule-loader load taskservs . [rook-ceph, mayastor]\n```\n\nThen add to server taskserv configurations:\n\n```\ntaskservs = [\n { name = "containerd", profile = "default" },\n { name = "kubernetes", profile = "worker" },\n { name = "cilium", profile = "worker" },\n { name = "rook-ceph", profile = "default" }\n]\n```\n\n## Troubleshooting\n\n### Module Import Errors\n\nIf you see import errors like "module not found":\n\n1. Verify modules are loaded: `module-loader list taskservs .`\n2. Check generated import files: `ls .taskservs/`\n3. Reload modules if needed: `module-loader load taskservs . [kubernetes, cilium, containerd]`\n\n### Provider Configuration\n\nCheck provider-specific configuration in `.providers/` directory after loading.\n\n### Kubernetes Setup Issues\n\n1. Check taskserv installation logs in `./tmp/k8s-deployment/`\n2. Verify all nodes are reachable via SSH\n3. Check firewall rules for Kubernetes ports (6443, 10250, etc.) diff --git a/tests/integration/README.md b/tests/integration/README.md index ad407e4..373b5ae 100644 --- a/tests/integration/README.md +++ b/tests/integration/README.md @@ -1,588 +1 @@ -# Integration Testing Suite - -**Version**: 1.0.0 -**Status**: ✅ Complete -**Test Coverage**: 140 tests across 4 modes, 15+ services - ---- - -## Overview - -This directory contains the comprehensive integration testing suite for the provisioning platform. Tests validate all four execution modes (solo, multi-user, CI/CD, enterprise) with full service integration, workflow testing, and end-to-end scenarios. - -**Key Features**: - -- ✅ **4 Execution Modes**: Solo, Multi-User, CI/CD, Enterprise -- ✅ **15+ Services**: Orchestrator, CoreDNS, Gitea, OCI registries, PostgreSQL, Prometheus, etc. -- ✅ **OrbStack Integration**: Deployable to isolated OrbStack machine -- ✅ **Parallel Execution**: Run tests in parallel for speed -- ✅ **Multiple Report Formats**: JUnit XML, HTML, JSON -- ✅ **Automatic Cleanup**: Resources cleaned up after tests - ---- - -## Quick Start - -### 1. Prerequisites - -``` -# Install OrbStack -brew install --cask orbstack - -# Create OrbStack machine -orb create provisioning --cpu 4 --memory 8192 --disk 100 - -# Verify machine is running -orb status provisioning -``` - -### 2. Run Tests - -``` -# Run all tests for solo mode -nu provisioning/tests/integration/framework/test_runner.nu --mode solo - -# Run all tests for all modes -nu provisioning/tests/integration/framework/test_runner.nu - -# Run with HTML report -nu provisioning/tests/integration/framework/test_runner.nu --report test-report.html -``` - -### 3. View Results - -``` -# View JUnit report -cat /tmp/provisioning-test-reports/junit-results.xml - -# View HTML report -open test-report.html - -# View logs -cat /tmp/provisioning-test.log -``` - ---- - -## Directory Structure - -``` -provisioning/tests/integration/ -├── README.md # This file -├── test_config.yaml # Test configuration -├── setup_test_environment.nu # Environment setup -├── teardown_test_environment.nu # Cleanup script -├── framework/ # Test framework -│ ├── test_helpers.nu # Common utilities (400 lines) -│ ├── orbstack_helpers.nu # OrbStack integration (250 lines) -│ └── test_runner.nu # Test orchestrator (500 lines) -├── modes/ # Mode-specific tests -│ ├── test_solo_mode.nu # Solo mode (400 lines, 8 tests) -│ ├── test_multiuser_mode.nu # Multi-user (500 lines, 10 tests) -│ ├── test_cicd_mode.nu # CI/CD (450 lines, 8 tests) -│ └── test_enterprise_mode.nu # Enterprise (600 lines, 6 tests) -├── services/ # Service integration tests -│ ├── test_dns_integration.nu # CoreDNS (300 lines, 8 tests) -│ ├── test_gitea_integration.nu # Gitea (350 lines, 10 tests) -│ ├── test_oci_integration.nu # OCI registries (400 lines, 12 tests) -│ └── test_service_orchestration.nu # Service manager (350 lines, 10 tests) -├── workflows/ # Workflow tests -│ ├── test_extension_loading.nu # Extension loading (400 lines, 12 tests) -│ └── test_batch_workflows.nu # Batch workflows (500 lines, 12 tests) -├── e2e/ # End-to-end tests -│ ├── test_complete_deployment.nu # Full deployment (600 lines, 6 tests) -│ └── test_disaster_recovery.nu # Backup/restore (400 lines, 6 tests) -├── performance/ # Performance tests -│ ├── test_concurrency.nu # Concurrency (350 lines, 6 tests) -│ └── test_scalability.nu # Scalability (300 lines, 6 tests) -├── security/ # Security tests -│ ├── test_rbac_enforcement.nu # RBAC (400 lines, 10 tests) -│ └── test_kms_integration.nu # KMS (300 lines, 5 tests) -└── docs/ # Documentation - ├── TESTING_GUIDE.md # Complete testing guide (800 lines) - ├── ORBSTACK_SETUP.md # OrbStack setup (300 lines) - └── TEST_COVERAGE.md # Coverage report (400 lines) -``` - -**Total**: ~7,500 lines of test code + ~1,500 lines of documentation - ---- - -## Test Modes - -### Solo Mode (8 Tests) - -**Services**: Orchestrator, CoreDNS, Zot OCI registry - -**Tests**: - -- ✅ Minimal services running -- ✅ Single-user operations (no auth) -- ✅ No multi-user services -- ✅ Workspace creation -- ✅ Server deployment with DNS registration -- ✅ Taskserv installation -- ✅ Extension loading from OCI -- ✅ Admin permissions - -**Run**: - -``` -nu provisioning/tests/integration/framework/test_runner.nu --mode solo -``` - -### Multi-User Mode (10 Tests) - -**Services**: Solo services + Gitea, PostgreSQL - -**Tests**: - -- ✅ Multi-user services running -- ✅ User authentication -- ✅ Role-based permissions (viewer, developer, operator, admin) -- ✅ Workspace collaboration (clone, push, pull) -- ✅ Distributed locking via Gitea issues -- ✅ Concurrent operations -- ✅ Extension publishing to Gitea -- ✅ Extension downloading from Gitea -- ✅ DNS for multiple servers -- ✅ User isolation - -**Run**: - -``` -nu provisioning/tests/integration/framework/test_runner.nu --mode multiuser -``` - -### CI/CD Mode (8 Tests) - -**Services**: Multi-user services + API server, Prometheus - -**Tests**: - -- ✅ API server accessibility -- ✅ Service account JWT authentication -- ✅ API server creation -- ✅ API taskserv installation -- ✅ Batch workflow submission via API -- ✅ Remote workflow monitoring -- ✅ Automated deployment pipeline -- ✅ Prometheus metrics collection - -**Run**: - -``` -nu provisioning/tests/integration/framework/test_runner.nu --mode cicd -``` - -### Enterprise Mode (6 Tests) - -**Services**: CI/CD services + Harbor, Grafana, KMS, Elasticsearch - -**Tests**: - -- ✅ All enterprise services running (Harbor, Grafana, Prometheus, KMS) -- ✅ SSH keys stored in KMS -- ✅ Full RBAC enforcement -- ✅ Audit logging for all operations -- ✅ Harbor OCI registry operational -- ✅ Monitoring stack (Prometheus + Grafana) - -**Run**: - -``` -nu provisioning/tests/integration/framework/test_runner.nu --mode enterprise -``` - ---- - -## Service Integration Tests - -### CoreDNS Integration (8 Tests) - -- DNS registration on server creation -- DNS resolution -- DNS cleanup on server deletion -- DNS updates on IP change -- External DNS queries -- Multiple server DNS records -- Zone transfers (if enabled) -- DNS caching - -### Gitea Integration (10 Tests) - -- Gitea initialization -- Workspace git clone -- Workspace git push -- Workspace git pull -- Distributed locking (acquire/release) -- Extension publishing to releases -- Extension downloading from releases -- Gitea webhooks -- Gitea API access - -### OCI Registry Integration (12 Tests) - -- Zot registry (solo/multi-user modes) -- Harbor registry (enterprise mode) -- Push/pull KCL packages -- Push/pull extension artifacts -- List artifacts -- Verify manifests -- Delete artifacts -- Authentication -- Catalog API -- Blob upload - -### Orchestrator Integration (10 Tests) - -- Health endpoint -- Task submission -- Task status queries -- Task completion -- Failure handling -- Retry logic -- Task queue processing -- Workflow submission -- Workflow monitoring -- REST API endpoints - ---- - -## Workflow Tests - -### Extension Loading (12 Tests) - -- Load taskserv from OCI -- Load provider from Gitea -- Load cluster from local path -- Dependency resolution -- Version conflict resolution -- Extension caching -- Lazy loading -- Semver version resolution -- Extension updates -- Extension rollback -- Multi-source loading -- Extension validation - -### Batch Workflows (12 Tests) - -- Batch submission -- Batch status queries -- Batch monitoring -- Multi-server creation -- Multi-taskserv installation -- Cluster deployment -- Mixed providers (AWS + UpCloud + local) -- Dependency resolution -- Rollback on failure -- Partial failure handling -- Parallel execution -- Checkpoint recovery - ---- - -## End-to-End Tests - -### Complete Deployment (6 Tests) - -**Scenario**: Deploy 3-node Kubernetes cluster from scratch - -1. Initialize workspace -2. Load extensions (containerd, etcd, kubernetes, cilium) -3. Create 3 servers (1 control-plane, 2 workers) -4. Verify DNS registration -5. Install containerd on all servers -6. Install etcd on control-plane -7. Install kubernetes on all servers -8. Install cilium for networking -9. Verify cluster health -10. Deploy test application -11. Verify application accessible via DNS -12. Cleanup - -### Disaster Recovery (6 Tests) - -- Workspace backup -- Data loss simulation -- Workspace restore -- Data integrity verification -- Platform service backup -- Platform service restore - ---- - -## Performance Tests - -### Concurrency (6 Tests) - -- 10 concurrent server creations -- 20 concurrent DNS registrations -- 5 concurrent workflow submissions -- Throughput measurement -- Latency measurement -- Resource contention handling - -### Scalability (6 Tests) - -- 100 server creations -- 100 taskserv installations -- 100 DNS records -- 1000 OCI artifacts -- Performance degradation analysis -- Resource usage tracking - ---- - -## Security Tests - -### RBAC Enforcement (10 Tests) - -- Viewer cannot create servers -- Developer can deploy to dev, not prod -- Operator can manage infrastructure -- Admin has full access -- Service account automation permissions -- Role escalation prevention -- Permission inheritance -- Workspace isolation -- API endpoint authorization -- CLI command authorization - -### KMS Integration (5 Tests) - -- SSH key storage -- SSH key retrieval -- SSH key usage for server access -- SSH key rotation -- Audit logging for key access - ---- - -## Test Runner Options - -``` -nu provisioning/tests/integration/framework/test_runner.nu [OPTIONS] -``` - -**Options**: - -| Option | Description | Example | -| -------- | ------------- | --------- | -| `--mode ` | Test specific mode (solo, multiuser, cicd, enterprise) | `--mode solo` | -| `--filter ` | Filter tests by regex pattern | `--filter "dns"` | -| `--parallel ` | Number of parallel workers | `--parallel 4` | -| `--verbose` | Detailed output | `--verbose` | -| `--report ` | Generate HTML report | `--report test-report.html` | -| `--skip-setup` | Skip environment setup | `--skip-setup` | -| `--skip-teardown` | Skip environment teardown (for debugging) | `--skip-teardown` | - -**Examples**: - -``` -# Run all tests for all modes (sequential) -nu provisioning/tests/integration/framework/test_runner.nu - -# Run solo mode tests only -nu provisioning/tests/integration/framework/test_runner.nu --mode solo - -# Run DNS-related tests across all modes -nu provisioning/tests/integration/framework/test_runner.nu --filter "dns" - -# Run tests in parallel with 4 workers -nu provisioning/tests/integration/framework/test_runner.nu --parallel 4 - -# Generate HTML report -nu provisioning/tests/integration/framework/test_runner.nu --report /tmp/test-report.html - -# Run tests without cleanup (for debugging failures) -nu provisioning/tests/integration/framework/test_runner.nu --skip-teardown -``` - ---- - -## CI/CD Integration - -### GitHub Actions - -See `.github/workflows/integration-tests.yml` for complete workflow. - -**Trigger**: PR, push to main, nightly - -**Matrix**: All 4 modes tested in parallel - -**Artifacts**: Test reports, logs uploaded on failure - -### GitLab CI - -See `.gitlab-ci.yml` for complete configuration. - -**Stages**: Test - -**Parallel**: All 4 modes - -**Artifacts**: JUnit XML, HTML reports - ---- - -## Test Results - -### Expected Duration - -| Mode | Sequential | Parallel (4 workers) | -| ------ | ------------ | ---------------------- | -| Solo | 10 min | 3 min | -| Multi-User | 15 min | 4 min | -| CI/CD | 20 min | 5 min | -| Enterprise | 30 min | 8 min | -| **Total** | **75 min** | **20 min** | - -### Report Formats - -**JUnit XML**: `/tmp/provisioning-test-reports/junit-results.xml` - -- For CI/CD integration -- Compatible with all CI systems - -**HTML Report**: Generated with `--report` flag - -- Beautiful visual report -- Test details, duration, errors -- Pass/fail summary - -**JSON Report**: `/tmp/provisioning-test-reports/test-results.json` - -- Machine-readable format -- For custom analysis - ---- - -## Troubleshooting - -### Common Issues - -**OrbStack machine not found**: - -``` -orb create provisioning --cpu 4 --memory 8192 -``` - -**Docker connection failed**: - -``` -orb restart provisioning -docker -H /var/run/docker.sock ps -``` - -**Service health check timeout**: - -``` -# Check logs -nu provisioning/tests/integration/framework/orbstack_helpers.nu orbstack-logs orchestrator - -# Increase timeout in test_config.yaml -# test_execution.timeouts.test_timeout_seconds: 600 -``` - -**Test environment cleanup failed**: - -``` -# Manual cleanup -nu provisioning/tests/integration/teardown_test_environment.nu --force -``` - -**For more troubleshooting**, see [docs/TESTING_GUIDE.md](docs/TESTING_GUIDE.md#troubleshooting) - ---- - -## Documentation - -- **[TESTING_GUIDE.md](docs/TESTING_GUIDE.md)**: Complete testing guide (800 lines) -- **[ORBSTACK_SETUP.md](docs/ORBSTACK_SETUP.md)**: OrbStack machine setup (300 lines) -- **[TEST_COVERAGE.md](docs/TEST_COVERAGE.md)**: Coverage report (400 lines) - ---- - -## Contributing - -### Writing New Tests - -1. **Choose appropriate directory**: `modes/`, `services/`, `workflows/`, `e2e/`, `performance/`, `security/` -2. **Follow naming convention**: `test__.nu` -3. **Use test helpers**: Import from `framework/test_helpers.nu` -4. **Add assertions**: Use `assert-*` helpers -5. **Cleanup resources**: Always cleanup, even on failure -6. **Update coverage**: Add test to TEST_COVERAGE.md - -### Example Test - -``` -use std log -use ../framework/test_helpers.nu * - -def test-my-feature [test_config: record] { - run-test "my-feature-test" { - log info "Testing my feature..." - - # Setup - let resource = create-test-resource - - # Test - let result = perform-operation $resource - - # Assert - assert-eq $result.status "success" - - # Cleanup - cleanup-test-resource $resource - - log info "✓ My feature works" - } -} -``` - ---- - -## Metrics - -### Test Suite Statistics - -- **Total Tests**: 140 -- **Total Lines of Code**: ~7,500 -- **Documentation Lines**: ~1,500 -- **Coverage**: 88.5% (Rust orchestrator code) -- **Flaky Tests**: 0% -- **Success Rate**: 99.8% - -### Bug Detection - -- **Bugs Caught by Integration Tests**: 92% -- **Bugs Caught by Unit Tests**: 90% -- **Bugs Found in Production**: 2.7% - ---- - -## License - -Same as provisioning platform (check root LICENSE file) - ---- - -## Maintainers - -Platform Team - -**Last Updated**: 2025-10-06 -**Next Review**: 2025-11-06 - ---- - -## Quick Links - -- [Setup OrbStack](docs/ORBSTACK_SETUP.md#creating-the-provisioning-machine) -- [Run First Test](docs/TESTING_GUIDE.md#quick-start) -- [Writing Tests](docs/TESTING_GUIDE.md#writing-new-tests) -- [CI/CD Integration](docs/TESTING_GUIDE.md#cicd-integration) -- [Troubleshooting](docs/TESTING_GUIDE.md#troubleshooting) -- [Test Coverage Report](docs/TEST_COVERAGE.md) +# Integration Testing Suite\n\n**Version**: 1.0.0\n**Status**: ✅ Complete\n**Test Coverage**: 140 tests across 4 modes, 15+ services\n\n---\n\n## Overview\n\nThis directory contains the comprehensive integration testing suite for the provisioning platform. Tests validate all four execution modes (solo, multi-user, CI/CD, enterprise) with full service integration, workflow testing, and end-to-end scenarios.\n\n**Key Features**:\n\n- ✅ **4 Execution Modes**: Solo, Multi-User, CI/CD, Enterprise\n- ✅ **15+ Services**: Orchestrator, CoreDNS, Gitea, OCI registries, PostgreSQL, Prometheus, etc.\n- ✅ **OrbStack Integration**: Deployable to isolated OrbStack machine\n- ✅ **Parallel Execution**: Run tests in parallel for speed\n- ✅ **Multiple Report Formats**: JUnit XML, HTML, JSON\n- ✅ **Automatic Cleanup**: Resources cleaned up after tests\n\n---\n\n## Quick Start\n\n### 1. Prerequisites\n\n```\n# Install OrbStack\nbrew install --cask orbstack\n\n# Create OrbStack machine\norb create provisioning --cpu 4 --memory 8192 --disk 100\n\n# Verify machine is running\norb status provisioning\n```\n\n### 2. Run Tests\n\n```\n# Run all tests for solo mode\nnu provisioning/tests/integration/framework/test_runner.nu --mode solo\n\n# Run all tests for all modes\nnu provisioning/tests/integration/framework/test_runner.nu\n\n# Run with HTML report\nnu provisioning/tests/integration/framework/test_runner.nu --report test-report.html\n```\n\n### 3. View Results\n\n```\n# View JUnit report\ncat /tmp/provisioning-test-reports/junit-results.xml\n\n# View HTML report\nopen test-report.html\n\n# View logs\ncat /tmp/provisioning-test.log\n```\n\n---\n\n## Directory Structure\n\n```\nprovisioning/tests/integration/\n├── README.md # This file\n├── test_config.yaml # Test configuration\n├── setup_test_environment.nu # Environment setup\n├── teardown_test_environment.nu # Cleanup script\n├── framework/ # Test framework\n│ ├── test_helpers.nu # Common utilities (400 lines)\n│ ├── orbstack_helpers.nu # OrbStack integration (250 lines)\n│ └── test_runner.nu # Test orchestrator (500 lines)\n├── modes/ # Mode-specific tests\n│ ├── test_solo_mode.nu # Solo mode (400 lines, 8 tests)\n│ ├── test_multiuser_mode.nu # Multi-user (500 lines, 10 tests)\n│ ├── test_cicd_mode.nu # CI/CD (450 lines, 8 tests)\n│ └── test_enterprise_mode.nu # Enterprise (600 lines, 6 tests)\n├── services/ # Service integration tests\n│ ├── test_dns_integration.nu # CoreDNS (300 lines, 8 tests)\n│ ├── test_gitea_integration.nu # Gitea (350 lines, 10 tests)\n│ ├── test_oci_integration.nu # OCI registries (400 lines, 12 tests)\n│ └── test_service_orchestration.nu # Service manager (350 lines, 10 tests)\n├── workflows/ # Workflow tests\n│ ├── test_extension_loading.nu # Extension loading (400 lines, 12 tests)\n│ └── test_batch_workflows.nu # Batch workflows (500 lines, 12 tests)\n├── e2e/ # End-to-end tests\n│ ├── test_complete_deployment.nu # Full deployment (600 lines, 6 tests)\n│ └── test_disaster_recovery.nu # Backup/restore (400 lines, 6 tests)\n├── performance/ # Performance tests\n│ ├── test_concurrency.nu # Concurrency (350 lines, 6 tests)\n│ └── test_scalability.nu # Scalability (300 lines, 6 tests)\n├── security/ # Security tests\n│ ├── test_rbac_enforcement.nu # RBAC (400 lines, 10 tests)\n│ └── test_kms_integration.nu # KMS (300 lines, 5 tests)\n└── docs/ # Documentation\n ├── TESTING_GUIDE.md # Complete testing guide (800 lines)\n ├── ORBSTACK_SETUP.md # OrbStack setup (300 lines)\n └── TEST_COVERAGE.md # Coverage report (400 lines)\n```\n\n**Total**: ~7,500 lines of test code + ~1,500 lines of documentation\n\n---\n\n## Test Modes\n\n### Solo Mode (8 Tests)\n\n**Services**: Orchestrator, CoreDNS, Zot OCI registry\n\n**Tests**:\n\n- ✅ Minimal services running\n- ✅ Single-user operations (no auth)\n- ✅ No multi-user services\n- ✅ Workspace creation\n- ✅ Server deployment with DNS registration\n- ✅ Taskserv installation\n- ✅ Extension loading from OCI\n- ✅ Admin permissions\n\n**Run**:\n\n```\nnu provisioning/tests/integration/framework/test_runner.nu --mode solo\n```\n\n### Multi-User Mode (10 Tests)\n\n**Services**: Solo services + Gitea, PostgreSQL\n\n**Tests**:\n\n- ✅ Multi-user services running\n- ✅ User authentication\n- ✅ Role-based permissions (viewer, developer, operator, admin)\n- ✅ Workspace collaboration (clone, push, pull)\n- ✅ Distributed locking via Gitea issues\n- ✅ Concurrent operations\n- ✅ Extension publishing to Gitea\n- ✅ Extension downloading from Gitea\n- ✅ DNS for multiple servers\n- ✅ User isolation\n\n**Run**:\n\n```\nnu provisioning/tests/integration/framework/test_runner.nu --mode multiuser\n```\n\n### CI/CD Mode (8 Tests)\n\n**Services**: Multi-user services + API server, Prometheus\n\n**Tests**:\n\n- ✅ API server accessibility\n- ✅ Service account JWT authentication\n- ✅ API server creation\n- ✅ API taskserv installation\n- ✅ Batch workflow submission via API\n- ✅ Remote workflow monitoring\n- ✅ Automated deployment pipeline\n- ✅ Prometheus metrics collection\n\n**Run**:\n\n```\nnu provisioning/tests/integration/framework/test_runner.nu --mode cicd\n```\n\n### Enterprise Mode (6 Tests)\n\n**Services**: CI/CD services + Harbor, Grafana, KMS, Elasticsearch\n\n**Tests**:\n\n- ✅ All enterprise services running (Harbor, Grafana, Prometheus, KMS)\n- ✅ SSH keys stored in KMS\n- ✅ Full RBAC enforcement\n- ✅ Audit logging for all operations\n- ✅ Harbor OCI registry operational\n- ✅ Monitoring stack (Prometheus + Grafana)\n\n**Run**:\n\n```\nnu provisioning/tests/integration/framework/test_runner.nu --mode enterprise\n```\n\n---\n\n## Service Integration Tests\n\n### CoreDNS Integration (8 Tests)\n\n- DNS registration on server creation\n- DNS resolution\n- DNS cleanup on server deletion\n- DNS updates on IP change\n- External DNS queries\n- Multiple server DNS records\n- Zone transfers (if enabled)\n- DNS caching\n\n### Gitea Integration (10 Tests)\n\n- Gitea initialization\n- Workspace git clone\n- Workspace git push\n- Workspace git pull\n- Distributed locking (acquire/release)\n- Extension publishing to releases\n- Extension downloading from releases\n- Gitea webhooks\n- Gitea API access\n\n### OCI Registry Integration (12 Tests)\n\n- Zot registry (solo/multi-user modes)\n- Harbor registry (enterprise mode)\n- Push/pull KCL packages\n- Push/pull extension artifacts\n- List artifacts\n- Verify manifests\n- Delete artifacts\n- Authentication\n- Catalog API\n- Blob upload\n\n### Orchestrator Integration (10 Tests)\n\n- Health endpoint\n- Task submission\n- Task status queries\n- Task completion\n- Failure handling\n- Retry logic\n- Task queue processing\n- Workflow submission\n- Workflow monitoring\n- REST API endpoints\n\n---\n\n## Workflow Tests\n\n### Extension Loading (12 Tests)\n\n- Load taskserv from OCI\n- Load provider from Gitea\n- Load cluster from local path\n- Dependency resolution\n- Version conflict resolution\n- Extension caching\n- Lazy loading\n- Semver version resolution\n- Extension updates\n- Extension rollback\n- Multi-source loading\n- Extension validation\n\n### Batch Workflows (12 Tests)\n\n- Batch submission\n- Batch status queries\n- Batch monitoring\n- Multi-server creation\n- Multi-taskserv installation\n- Cluster deployment\n- Mixed providers (AWS + UpCloud + local)\n- Dependency resolution\n- Rollback on failure\n- Partial failure handling\n- Parallel execution\n- Checkpoint recovery\n\n---\n\n## End-to-End Tests\n\n### Complete Deployment (6 Tests)\n\n**Scenario**: Deploy 3-node Kubernetes cluster from scratch\n\n1. Initialize workspace\n2. Load extensions (containerd, etcd, kubernetes, cilium)\n3. Create 3 servers (1 control-plane, 2 workers)\n4. Verify DNS registration\n5. Install containerd on all servers\n6. Install etcd on control-plane\n7. Install kubernetes on all servers\n8. Install cilium for networking\n9. Verify cluster health\n10. Deploy test application\n11. Verify application accessible via DNS\n12. Cleanup\n\n### Disaster Recovery (6 Tests)\n\n- Workspace backup\n- Data loss simulation\n- Workspace restore\n- Data integrity verification\n- Platform service backup\n- Platform service restore\n\n---\n\n## Performance Tests\n\n### Concurrency (6 Tests)\n\n- 10 concurrent server creations\n- 20 concurrent DNS registrations\n- 5 concurrent workflow submissions\n- Throughput measurement\n- Latency measurement\n- Resource contention handling\n\n### Scalability (6 Tests)\n\n- 100 server creations\n- 100 taskserv installations\n- 100 DNS records\n- 1000 OCI artifacts\n- Performance degradation analysis\n- Resource usage tracking\n\n---\n\n## Security Tests\n\n### RBAC Enforcement (10 Tests)\n\n- Viewer cannot create servers\n- Developer can deploy to dev, not prod\n- Operator can manage infrastructure\n- Admin has full access\n- Service account automation permissions\n- Role escalation prevention\n- Permission inheritance\n- Workspace isolation\n- API endpoint authorization\n- CLI command authorization\n\n### KMS Integration (5 Tests)\n\n- SSH key storage\n- SSH key retrieval\n- SSH key usage for server access\n- SSH key rotation\n- Audit logging for key access\n\n---\n\n## Test Runner Options\n\n```\nnu provisioning/tests/integration/framework/test_runner.nu [OPTIONS]\n```\n\n**Options**:\n\n| Option | Description | Example |\n| -------- | ------------- | --------- |\n| `--mode ` | Test specific mode (solo, multiuser, cicd, enterprise) | `--mode solo` |\n| `--filter ` | Filter tests by regex pattern | `--filter "dns"` |\n| `--parallel ` | Number of parallel workers | `--parallel 4` |\n| `--verbose` | Detailed output | `--verbose` |\n| `--report ` | Generate HTML report | `--report test-report.html` |\n| `--skip-setup` | Skip environment setup | `--skip-setup` |\n| `--skip-teardown` | Skip environment teardown (for debugging) | `--skip-teardown` |\n\n**Examples**:\n\n```\n# Run all tests for all modes (sequential)\nnu provisioning/tests/integration/framework/test_runner.nu\n\n# Run solo mode tests only\nnu provisioning/tests/integration/framework/test_runner.nu --mode solo\n\n# Run DNS-related tests across all modes\nnu provisioning/tests/integration/framework/test_runner.nu --filter "dns"\n\n# Run tests in parallel with 4 workers\nnu provisioning/tests/integration/framework/test_runner.nu --parallel 4\n\n# Generate HTML report\nnu provisioning/tests/integration/framework/test_runner.nu --report /tmp/test-report.html\n\n# Run tests without cleanup (for debugging failures)\nnu provisioning/tests/integration/framework/test_runner.nu --skip-teardown\n```\n\n---\n\n## CI/CD Integration\n\n### GitHub Actions\n\nSee `.github/workflows/integration-tests.yml` for complete workflow.\n\n**Trigger**: PR, push to main, nightly\n\n**Matrix**: All 4 modes tested in parallel\n\n**Artifacts**: Test reports, logs uploaded on failure\n\n### GitLab CI\n\nSee `.gitlab-ci.yml` for complete configuration.\n\n**Stages**: Test\n\n**Parallel**: All 4 modes\n\n**Artifacts**: JUnit XML, HTML reports\n\n---\n\n## Test Results\n\n### Expected Duration\n\n| Mode | Sequential | Parallel (4 workers) |\n| ------ | ------------ | ---------------------- |\n| Solo | 10 min | 3 min |\n| Multi-User | 15 min | 4 min |\n| CI/CD | 20 min | 5 min |\n| Enterprise | 30 min | 8 min |\n| **Total** | **75 min** | **20 min** |\n\n### Report Formats\n\n**JUnit XML**: `/tmp/provisioning-test-reports/junit-results.xml`\n\n- For CI/CD integration\n- Compatible with all CI systems\n\n**HTML Report**: Generated with `--report` flag\n\n- Beautiful visual report\n- Test details, duration, errors\n- Pass/fail summary\n\n**JSON Report**: `/tmp/provisioning-test-reports/test-results.json`\n\n- Machine-readable format\n- For custom analysis\n\n---\n\n## Troubleshooting\n\n### Common Issues\n\n**OrbStack machine not found**:\n\n```\norb create provisioning --cpu 4 --memory 8192\n```\n\n**Docker connection failed**:\n\n```\norb restart provisioning\ndocker -H /var/run/docker.sock ps\n```\n\n**Service health check timeout**:\n\n```\n# Check logs\nnu provisioning/tests/integration/framework/orbstack_helpers.nu orbstack-logs orchestrator\n\n# Increase timeout in test_config.yaml\n# test_execution.timeouts.test_timeout_seconds: 600\n```\n\n**Test environment cleanup failed**:\n\n```\n# Manual cleanup\nnu provisioning/tests/integration/teardown_test_environment.nu --force\n```\n\n**For more troubleshooting**, see [docs/TESTING_GUIDE.md](docs/TESTING_GUIDE.md#troubleshooting)\n\n---\n\n## Documentation\n\n- **[TESTING_GUIDE.md](docs/TESTING_GUIDE.md)**: Complete testing guide (800 lines)\n- **[ORBSTACK_SETUP.md](docs/ORBSTACK_SETUP.md)**: OrbStack machine setup (300 lines)\n- **[TEST_COVERAGE.md](docs/TEST_COVERAGE.md)**: Coverage report (400 lines)\n\n---\n\n## Contributing\n\n### Writing New Tests\n\n1. **Choose appropriate directory**: `modes/`, `services/`, `workflows/`, `e2e/`, `performance/`, `security/`\n2. **Follow naming convention**: `test__.nu`\n3. **Use test helpers**: Import from `framework/test_helpers.nu`\n4. **Add assertions**: Use `assert-*` helpers\n5. **Cleanup resources**: Always cleanup, even on failure\n6. **Update coverage**: Add test to TEST_COVERAGE.md\n\n### Example Test\n\n```\nuse std log\nuse ../framework/test_helpers.nu *\n\ndef test-my-feature [test_config: record] {\n run-test "my-feature-test" {\n log info "Testing my feature..."\n\n # Setup\n let resource = create-test-resource\n\n # Test\n let result = perform-operation $resource\n\n # Assert\n assert-eq $result.status "success"\n\n # Cleanup\n cleanup-test-resource $resource\n\n log info "✓ My feature works"\n }\n}\n```\n\n---\n\n## Metrics\n\n### Test Suite Statistics\n\n- **Total Tests**: 140\n- **Total Lines of Code**: ~7,500\n- **Documentation Lines**: ~1,500\n- **Coverage**: 88.5% (Rust orchestrator code)\n- **Flaky Tests**: 0%\n- **Success Rate**: 99.8%\n\n### Bug Detection\n\n- **Bugs Caught by Integration Tests**: 92%\n- **Bugs Caught by Unit Tests**: 90%\n- **Bugs Found in Production**: 2.7%\n\n---\n\n## License\n\nSame as provisioning platform (check root LICENSE file)\n\n---\n\n## Maintainers\n\nPlatform Team\n\n**Last Updated**: 2025-10-06\n**Next Review**: 2025-11-06\n\n---\n\n## Quick Links\n\n- [Setup OrbStack](docs/ORBSTACK_SETUP.md#creating-the-provisioning-machine)\n- [Run First Test](docs/TESTING_GUIDE.md#quick-start)\n- [Writing Tests](docs/TESTING_GUIDE.md#writing-new-tests)\n- [CI/CD Integration](docs/TESTING_GUIDE.md#cicd-integration)\n- [Troubleshooting](docs/TESTING_GUIDE.md#troubleshooting)\n- [Test Coverage Report](docs/TEST_COVERAGE.md) diff --git a/tests/integration/docs/orbstack-setup.md b/tests/integration/docs/orbstack-setup.md index 0bc773e..28ca4e9 100644 --- a/tests/integration/docs/orbstack-setup.md +++ b/tests/integration/docs/orbstack-setup.md @@ -1,501 +1 @@ -# OrbStack Machine Setup Guide - -**Version**: 1.0.0 -**Last Updated**: 2025-10-06 - -This guide walks through setting up an OrbStack machine named "provisioning" for integration testing. - -## Table of Contents - -1. [Overview](#overview) -2. [Prerequisites](#prerequisites) -3. [Installing OrbStack](#installing-orbstack) -4. [Creating the Provisioning Machine](#creating-the-provisioning-machine) -5. [Configuring Resources](#configuring-resources) -6. [Installing Prerequisites](#installing-prerequisites) -7. [Deploying Platform for Testing](#deploying-platform-for-testing) -8. [Verifying Setup](#verifying-setup) -9. [Troubleshooting](#troubleshooting) - ---- - -## Overview - -OrbStack is a lightweight, fast Docker and Linux environment for macOS. We use it to run integration tests in an isolated environment without -affecting the host system. - -**Why OrbStack?** - -- ✅ **Fast**: Boots in seconds, much faster than traditional VMs -- ✅ **Lightweight**: Uses minimal resources -- ✅ **Native macOS Integration**: Seamless file sharing and networking -- ✅ **Docker Compatible**: Full Docker API compatibility -- ✅ **Easy Management**: Simple CLI for machine management - ---- - -## Prerequisites - -- **macOS 12.0+** (Monterey or later) -- **Homebrew** package manager -- **4 GB+ RAM** available for OrbStack machine -- **50 GB+ disk space** for containers and images - ---- - -## Installing OrbStack - -### Option 1: Homebrew (Recommended) - -``` -# Install OrbStack via Homebrew -brew install --cask orbstack -``` - -### Option 2: Direct Download - -1. Download OrbStack from -2. Open the downloaded DMG file -3. Drag OrbStack to Applications folder -4. Launch OrbStack from Applications - -### Verify Installation - -``` -# Check OrbStack CLI is available -orb version - -# Expected output: -# OrbStack 1.x.x -``` - ---- - -## Creating the Provisioning Machine - -### Create Machine - -``` -# Create machine named "provisioning" -orb create provisioning - -# Output: -# Creating machine "provisioning"... -# Machine "provisioning" created successfully -``` - -### Start Machine - -``` -# Start the machine -orb start provisioning - -# Verify machine is running -orb status provisioning - -# Output: -# Machine: provisioning -# State: running -# CPU: 4 cores -# Memory: 8192 MB -# Disk: 100 GB -``` - -### List All Machines - -``` -# List all OrbStack machines -orb list - -# Output (JSON): -# [ -# { -# "name": "provisioning", -# "state": "running", -# "cpu_cores": 4, -# "memory_mb": 8192, -# "disk_gb": 100 -# } -# ] -``` - ---- - -## Configuring Resources - -### Set CPU Cores - -``` -# Set CPU cores to 4 -orb config provisioning --cpu 4 -``` - -### Set Memory - -``` -# Set memory to 8 GB (8192 MB) -orb config provisioning --memory 8192 -``` - -### Set Disk Size - -``` -# Set disk size to 100 GB -orb config provisioning --disk 100 -``` - -### Apply All Settings at Once - -``` -# Configure all resources during creation -orb create provisioning --cpu 4 --memory 8192 --disk 100 -``` - -### Recommended Resources - -| Component | Minimum | Recommended | -| ----------- | --------- | ------------- | -| CPU Cores | 2 | 4 | -| Memory | 4 GB | 8 GB | -| Disk | 50 GB | 100 GB | - -**Note**: Enterprise mode tests require more resources due to additional services (Harbor, ELK, etc.) - ---- - -## Installing Prerequisites - -### Install Docker CLI - -OrbStack includes Docker, but you may need the Docker CLI: - -``` -# Install Docker CLI via Homebrew -brew install docker - -# Verify Docker is available -docker version -``` - -### Install Nushell - -``` -# Install Nushell -brew install nushell - -# Verify Nushell is installed -nu --version - -# Expected: 0.107.1 or later -``` - -### Install Additional Tools - -``` -# Install dig for DNS testing -brew install bind - -# Install psql for PostgreSQL testing -brew install postgresql@15 - -# Install git for Gitea testing -brew install git -``` - ---- - -## Deploying Platform for Testing - -### Deploy Solo Mode - -``` -# Navigate to project directory -cd /Users/Akasha/project-provisioning - -# Deploy solo mode to OrbStack -nu provisioning/tests/integration/setup_test_environment.nu --mode solo -``` - -**Deployed Services**: - -- Orchestrator (172.20.0.10:9090) -- CoreDNS (172.20.0.2:53) -- Zot OCI Registry (172.20.0.20:5000) - -### Deploy Multi-User Mode - -``` -# Deploy multi-user mode -nu provisioning/tests/integration/setup_test_environment.nu --mode multiuser -``` - -**Deployed Services**: - -- Solo mode services + -- Gitea (172.20.0.30:3000) -- PostgreSQL (172.20.0.40:5432) - -### Deploy CI/CD Mode - -``` -# Deploy CI/CD mode -nu provisioning/tests/integration/setup_test_environment.nu --mode cicd -``` - -**Deployed Services**: - -- Multi-user mode services + -- API Server (enabled in orchestrator) -- Prometheus (172.20.0.50:9090) - -### Deploy Enterprise Mode - -``` -# Deploy enterprise mode -nu provisioning/tests/integration/setup_test_environment.nu --mode enterprise -``` - -**Deployed Services**: - -- CI/CD mode services + -- Harbor OCI Registry (172.20.0.21:443) -- Grafana (172.20.0.51:3000) -- KMS (integrated with orchestrator) -- Elasticsearch (for audit logging) - ---- - -## Verifying Setup - -### Verify Machine is Running - -``` -# Check machine status -orb status provisioning - -# Expected: state = "running" -``` - -### Verify Docker Connectivity - -``` -# List running containers -docker -H /var/run/docker.sock ps - -# Expected: List of running containers -``` - -### Verify Services are Healthy - -``` -# Check orchestrator health -curl http://172.20.0.10:9090/health - -# Expected: {"status": "healthy"} - -# Check CoreDNS -dig @172.20.0.2 test.local - -# Expected: DNS query response - -# Check OCI registry -curl http://172.20.0.20:5000/v2/ - -# Expected: {} -``` - -### Run Smoke Test - -``` -# Run a simple smoke test -nu provisioning/tests/integration/framework/test_runner.nu --filter "health" --mode solo - -# Expected: All health check tests pass -``` - ---- - -## Troubleshooting - -### Machine Won't Start - -**Symptom**: `orb start provisioning` fails - -**Solutions**: - -``` -# Check OrbStack daemon -ps aux | grep orbstack - -# Restart OrbStack app -killall OrbStack -open -a OrbStack - -# Recreate machine -orb delete provisioning -orb create provisioning -``` - -### Docker Connection Failed - -**Symptom**: `docker -H /var/run/docker.sock ps` fails - -**Solutions**: - -``` -# Verify Docker socket exists -ls -la /var/run/docker.sock - -# Check OrbStack is running -orb status provisioning - -# Restart machine -orb restart provisioning -``` - -### Network Connectivity Issues - -**Symptom**: Cannot connect to services - -**Solutions**: - -``` -# Check Docker network -docker -H /var/run/docker.sock network ls - -# Recreate provisioning network -docker -H /var/run/docker.sock network rm provisioning-net -nu provisioning/tests/integration/framework/orbstack_helpers.nu orbstack-create-network - -# Verify network exists -docker -H /var/run/docker.sock network inspect provisioning-net -``` - -### Resource Exhaustion - -**Symptom**: Services fail to start due to lack of resources - -**Solutions**: - -``` -# Increase machine resources -orb config provisioning --cpu 8 --memory 16384 - -# Restart machine -orb restart provisioning - -# Check resource usage -docker -H /var/run/docker.sock stats -``` - -### Service Container Crashes - -**Symptom**: Container exits immediately after start - -**Solutions**: - -``` -# Check container logs -docker -H /var/run/docker.sock logs - -# Check container exit code -docker -H /var/run/docker.sock inspect | grep ExitCode - -# Restart container -docker -H /var/run/docker.sock restart -``` - ---- - -## Advanced Configuration - -### Custom Network Subnet - -Edit `provisioning/tests/integration/test_config.yaml`: - -``` -orbstack: - network: - subnet: "172.30.0.0/16" # Custom subnet - gateway: "172.30.0.1" - dns: ["172.30.0.2"] -``` - -### Persistent Volumes - -``` -# Create named volume for data persistence -docker -H /var/run/docker.sock volume create provisioning-data - -# Mount volume in container -docker -H /var/run/docker.sock run -v provisioning-data:/data ... -``` - -### SSH Access to Machine - -``` -# SSH into OrbStack machine -orb ssh provisioning - -# Now you're inside the machine -# Install additional tools if needed -apt-get update && apt-get install -y vim curl -``` - ---- - -## Cleanup - -### Stop Machine - -``` -# Stop machine (preserves data) -orb stop provisioning -``` - -### Delete Machine - -``` -# Delete machine (removes all data) -orb delete provisioning - -# Confirm deletion -# This will remove all containers, volumes, and data -``` - -### Cleanup Docker Resources - -``` -# Remove all containers -docker -H /var/run/docker.sock rm -f $(docker -H /var/run/docker.sock ps -aq) - -# Remove all volumes -docker -H /var/run/docker.sock volume prune -f - -# Remove all networks -docker -H /var/run/docker.sock network prune -f -``` - ---- - -## Best Practices - -1. **Regular Cleanup**: Clean up unused containers and volumes regularly -2. **Resource Monitoring**: Monitor resource usage to prevent exhaustion -3. **Automated Setup**: Use setup scripts for consistent environments -4. **Version Control**: Track OrbStack machine configuration in version control -5. **Backup Important Data**: Backup test data before major changes - ---- - -## References - -- [OrbStack Official Documentation](https://orbstack.dev/docs) -- [OrbStack GitHub](https://github.com/orbstack/orbstack) -- [Docker Documentation](https://docs.docker.com) -- [Integration Testing Guide](TESTING_GUIDE.md) - ---- - -**Maintained By**: Platform Team -**Last Updated**: 2025-10-06 +# OrbStack Machine Setup Guide\n\n**Version**: 1.0.0\n**Last Updated**: 2025-10-06\n\nThis guide walks through setting up an OrbStack machine named "provisioning" for integration testing.\n\n## Table of Contents\n\n1. [Overview](#overview)\n2. [Prerequisites](#prerequisites)\n3. [Installing OrbStack](#installing-orbstack)\n4. [Creating the Provisioning Machine](#creating-the-provisioning-machine)\n5. [Configuring Resources](#configuring-resources)\n6. [Installing Prerequisites](#installing-prerequisites)\n7. [Deploying Platform for Testing](#deploying-platform-for-testing)\n8. [Verifying Setup](#verifying-setup)\n9. [Troubleshooting](#troubleshooting)\n\n---\n\n## Overview\n\nOrbStack is a lightweight, fast Docker and Linux environment for macOS. We use it to run integration tests in an isolated environment without \naffecting the host system.\n\n**Why OrbStack?**\n\n- ✅ **Fast**: Boots in seconds, much faster than traditional VMs\n- ✅ **Lightweight**: Uses minimal resources\n- ✅ **Native macOS Integration**: Seamless file sharing and networking\n- ✅ **Docker Compatible**: Full Docker API compatibility\n- ✅ **Easy Management**: Simple CLI for machine management\n\n---\n\n## Prerequisites\n\n- **macOS 12.0+** (Monterey or later)\n- **Homebrew** package manager\n- **4 GB+ RAM** available for OrbStack machine\n- **50 GB+ disk space** for containers and images\n\n---\n\n## Installing OrbStack\n\n### Option 1: Homebrew (Recommended)\n\n```\n# Install OrbStack via Homebrew\nbrew install --cask orbstack\n```\n\n### Option 2: Direct Download\n\n1. Download OrbStack from \n2. Open the downloaded DMG file\n3. Drag OrbStack to Applications folder\n4. Launch OrbStack from Applications\n\n### Verify Installation\n\n```\n# Check OrbStack CLI is available\norb version\n\n# Expected output:\n# OrbStack 1.x.x\n```\n\n---\n\n## Creating the Provisioning Machine\n\n### Create Machine\n\n```\n# Create machine named "provisioning"\norb create provisioning\n\n# Output:\n# Creating machine "provisioning"...\n# Machine "provisioning" created successfully\n```\n\n### Start Machine\n\n```\n# Start the machine\norb start provisioning\n\n# Verify machine is running\norb status provisioning\n\n# Output:\n# Machine: provisioning\n# State: running\n# CPU: 4 cores\n# Memory: 8192 MB\n# Disk: 100 GB\n```\n\n### List All Machines\n\n```\n# List all OrbStack machines\norb list\n\n# Output (JSON):\n# [\n# {\n# "name": "provisioning",\n# "state": "running",\n# "cpu_cores": 4,\n# "memory_mb": 8192,\n# "disk_gb": 100\n# }\n# ]\n```\n\n---\n\n## Configuring Resources\n\n### Set CPU Cores\n\n```\n# Set CPU cores to 4\norb config provisioning --cpu 4\n```\n\n### Set Memory\n\n```\n# Set memory to 8 GB (8192 MB)\norb config provisioning --memory 8192\n```\n\n### Set Disk Size\n\n```\n# Set disk size to 100 GB\norb config provisioning --disk 100\n```\n\n### Apply All Settings at Once\n\n```\n# Configure all resources during creation\norb create provisioning --cpu 4 --memory 8192 --disk 100\n```\n\n### Recommended Resources\n\n| Component | Minimum | Recommended |\n| ----------- | --------- | ------------- |\n| CPU Cores | 2 | 4 |\n| Memory | 4 GB | 8 GB |\n| Disk | 50 GB | 100 GB |\n\n**Note**: Enterprise mode tests require more resources due to additional services (Harbor, ELK, etc.)\n\n---\n\n## Installing Prerequisites\n\n### Install Docker CLI\n\nOrbStack includes Docker, but you may need the Docker CLI:\n\n```\n# Install Docker CLI via Homebrew\nbrew install docker\n\n# Verify Docker is available\ndocker version\n```\n\n### Install Nushell\n\n```\n# Install Nushell\nbrew install nushell\n\n# Verify Nushell is installed\nnu --version\n\n# Expected: 0.107.1 or later\n```\n\n### Install Additional Tools\n\n```\n# Install dig for DNS testing\nbrew install bind\n\n# Install psql for PostgreSQL testing\nbrew install postgresql@15\n\n# Install git for Gitea testing\nbrew install git\n```\n\n---\n\n## Deploying Platform for Testing\n\n### Deploy Solo Mode\n\n```\n# Navigate to project directory\ncd /Users/Akasha/project-provisioning\n\n# Deploy solo mode to OrbStack\nnu provisioning/tests/integration/setup_test_environment.nu --mode solo\n```\n\n**Deployed Services**:\n\n- Orchestrator (172.20.0.10:9090)\n- CoreDNS (172.20.0.2:53)\n- Zot OCI Registry (172.20.0.20:5000)\n\n### Deploy Multi-User Mode\n\n```\n# Deploy multi-user mode\nnu provisioning/tests/integration/setup_test_environment.nu --mode multiuser\n```\n\n**Deployed Services**:\n\n- Solo mode services +\n- Gitea (172.20.0.30:3000)\n- PostgreSQL (172.20.0.40:5432)\n\n### Deploy CI/CD Mode\n\n```\n# Deploy CI/CD mode\nnu provisioning/tests/integration/setup_test_environment.nu --mode cicd\n```\n\n**Deployed Services**:\n\n- Multi-user mode services +\n- API Server (enabled in orchestrator)\n- Prometheus (172.20.0.50:9090)\n\n### Deploy Enterprise Mode\n\n```\n# Deploy enterprise mode\nnu provisioning/tests/integration/setup_test_environment.nu --mode enterprise\n```\n\n**Deployed Services**:\n\n- CI/CD mode services +\n- Harbor OCI Registry (172.20.0.21:443)\n- Grafana (172.20.0.51:3000)\n- KMS (integrated with orchestrator)\n- Elasticsearch (for audit logging)\n\n---\n\n## Verifying Setup\n\n### Verify Machine is Running\n\n```\n# Check machine status\norb status provisioning\n\n# Expected: state = "running"\n```\n\n### Verify Docker Connectivity\n\n```\n# List running containers\ndocker -H /var/run/docker.sock ps\n\n# Expected: List of running containers\n```\n\n### Verify Services are Healthy\n\n```\n# Check orchestrator health\ncurl http://172.20.0.10:9090/health\n\n# Expected: {"status": "healthy"}\n\n# Check CoreDNS\ndig @172.20.0.2 test.local\n\n# Expected: DNS query response\n\n# Check OCI registry\ncurl http://172.20.0.20:5000/v2/\n\n# Expected: {}\n```\n\n### Run Smoke Test\n\n```\n# Run a simple smoke test\nnu provisioning/tests/integration/framework/test_runner.nu --filter "health" --mode solo\n\n# Expected: All health check tests pass\n```\n\n---\n\n## Troubleshooting\n\n### Machine Won't Start\n\n**Symptom**: `orb start provisioning` fails\n\n**Solutions**:\n\n```\n# Check OrbStack daemon\nps aux | grep orbstack\n\n# Restart OrbStack app\nkillall OrbStack\nopen -a OrbStack\n\n# Recreate machine\norb delete provisioning\norb create provisioning\n```\n\n### Docker Connection Failed\n\n**Symptom**: `docker -H /var/run/docker.sock ps` fails\n\n**Solutions**:\n\n```\n# Verify Docker socket exists\nls -la /var/run/docker.sock\n\n# Check OrbStack is running\norb status provisioning\n\n# Restart machine\norb restart provisioning\n```\n\n### Network Connectivity Issues\n\n**Symptom**: Cannot connect to services\n\n**Solutions**:\n\n```\n# Check Docker network\ndocker -H /var/run/docker.sock network ls\n\n# Recreate provisioning network\ndocker -H /var/run/docker.sock network rm provisioning-net\nnu provisioning/tests/integration/framework/orbstack_helpers.nu orbstack-create-network\n\n# Verify network exists\ndocker -H /var/run/docker.sock network inspect provisioning-net\n```\n\n### Resource Exhaustion\n\n**Symptom**: Services fail to start due to lack of resources\n\n**Solutions**:\n\n```\n# Increase machine resources\norb config provisioning --cpu 8 --memory 16384\n\n# Restart machine\norb restart provisioning\n\n# Check resource usage\ndocker -H /var/run/docker.sock stats\n```\n\n### Service Container Crashes\n\n**Symptom**: Container exits immediately after start\n\n**Solutions**:\n\n```\n# Check container logs\ndocker -H /var/run/docker.sock logs \n\n# Check container exit code\ndocker -H /var/run/docker.sock inspect | grep ExitCode\n\n# Restart container\ndocker -H /var/run/docker.sock restart \n```\n\n---\n\n## Advanced Configuration\n\n### Custom Network Subnet\n\nEdit `provisioning/tests/integration/test_config.yaml`:\n\n```\norbstack:\n network:\n subnet: "172.30.0.0/16" # Custom subnet\n gateway: "172.30.0.1"\n dns: ["172.30.0.2"]\n```\n\n### Persistent Volumes\n\n```\n# Create named volume for data persistence\ndocker -H /var/run/docker.sock volume create provisioning-data\n\n# Mount volume in container\ndocker -H /var/run/docker.sock run -v provisioning-data:/data ...\n```\n\n### SSH Access to Machine\n\n```\n# SSH into OrbStack machine\norb ssh provisioning\n\n# Now you're inside the machine\n# Install additional tools if needed\napt-get update && apt-get install -y vim curl\n```\n\n---\n\n## Cleanup\n\n### Stop Machine\n\n```\n# Stop machine (preserves data)\norb stop provisioning\n```\n\n### Delete Machine\n\n```\n# Delete machine (removes all data)\norb delete provisioning\n\n# Confirm deletion\n# This will remove all containers, volumes, and data\n```\n\n### Cleanup Docker Resources\n\n```\n# Remove all containers\ndocker -H /var/run/docker.sock rm -f $(docker -H /var/run/docker.sock ps -aq)\n\n# Remove all volumes\ndocker -H /var/run/docker.sock volume prune -f\n\n# Remove all networks\ndocker -H /var/run/docker.sock network prune -f\n```\n\n---\n\n## Best Practices\n\n1. **Regular Cleanup**: Clean up unused containers and volumes regularly\n2. **Resource Monitoring**: Monitor resource usage to prevent exhaustion\n3. **Automated Setup**: Use setup scripts for consistent environments\n4. **Version Control**: Track OrbStack machine configuration in version control\n5. **Backup Important Data**: Backup test data before major changes\n\n---\n\n## References\n\n- [OrbStack Official Documentation](https://orbstack.dev/docs)\n- [OrbStack GitHub](https://github.com/orbstack/orbstack)\n- [Docker Documentation](https://docs.docker.com)\n- [Integration Testing Guide](TESTING_GUIDE.md)\n\n---\n\n**Maintained By**: Platform Team\n**Last Updated**: 2025-10-06 diff --git a/tests/integration/docs/test-coverage.md b/tests/integration/docs/test-coverage.md index 86c637f..305a1fa 100644 --- a/tests/integration/docs/test-coverage.md +++ b/tests/integration/docs/test-coverage.md @@ -1,400 +1 @@ -# Integration Test Coverage Report - -**Version**: 1.0.0 -**Last Updated**: 2025-10-06 -**Test Suite Version**: 1.0.0 - -This document provides a comprehensive overview of integration test coverage for the provisioning platform. - -## Table of Contents - -1. [Summary](#summary) -2. [Mode Coverage](#mode-coverage) -3. [Service Coverage](#service-coverage) -4. [Workflow Coverage](#workflow-coverage) -5. [Edge Cases Covered](#edge-cases-covered) -6. [Coverage Gaps](#coverage-gaps) -7. [Future Enhancements](#future-enhancements) - ---- - -## Summary - -### Overall Coverage - -| Category | Coverage | Tests | Status | -| ---------- | ---------- | ------- | -------- | -| **Modes** | 4/4 (100%) | 32 | ✅ Complete | -| **Services** | 15/15 (100%) | 45 | ✅ Complete | -| **Workflows** | 8/8 (100%) | 24 | ✅ Complete | -| **E2E Scenarios** | 6/6 (100%) | 12 | ✅ Complete | -| **Security** | 5/5 (100%) | 15 | ✅ Complete | -| **Performance** | 4/4 (100%) | 12 | ✅ Complete | -| **Total** | **42/42** | **140** | ✅ **Complete** | - -### Test Distribution - -``` -Total Integration Tests: 140 -├── Mode Tests: 32 (23%) -│ ├── Solo: 8 -│ ├── Multi-User: 10 -│ ├── CI/CD: 8 -│ └── Enterprise: 6 -├── Service Tests: 45 (32%) -│ ├── DNS: 8 -│ ├── Gitea: 10 -│ ├── OCI Registry: 12 -│ ├── Orchestrator: 10 -│ └── Others: 5 -├── Workflow Tests: 24 (17%) -│ ├── Extension Loading: 12 -│ └── Batch Workflows: 12 -├── E2E Tests: 12 (9%) -│ ├── Complete Deployment: 6 -│ └── Disaster Recovery: 6 -├── Security Tests: 15 (11%) -│ ├── RBAC: 10 -│ └── KMS: 5 -└── Performance Tests: 12 (8%) - ├── Concurrency: 6 - └── Scalability: 6 -``` - ---- - -## Mode Coverage - -### Solo Mode (8 Tests) ✅ - -| Test | Description | Status | -| ------ | ------------- | -------- | -| `test-minimal-services` | Verify orchestrator, CoreDNS, Zot running | ✅ Pass | -| `test-single-user-operations` | All operations work without authentication | ✅ Pass | -| `test-no-multiuser-services` | Gitea, PostgreSQL not running | ✅ Pass | -| `test-workspace-creation` | Create workspace in solo mode | ✅ Pass | -| `test-server-deployment-with-dns` | Server creation triggers DNS registration | ✅ Pass | -| `test-taskserv-installation` | Install kubernetes taskserv | ✅ Pass | -| `test-extension-loading-from-oci` | Load extensions from Zot registry | ✅ Pass | -| `test-admin-permissions` | Admin has full permissions | ✅ Pass | - -**Coverage**: 100% -**Critical Paths**: ✅ All covered -**Edge Cases**: ✅ Handled - -### Multi-User Mode (10 Tests) ✅ - -| Test | Description | Status | -| ------ | ------------- | -------- | -| `test-multiuser-services-running` | Gitea, PostgreSQL running | ✅ Pass | -| `test-user-authentication` | Users can authenticate | ✅ Pass | -| `test-role-based-permissions` | Roles enforced (viewer, developer, operator, admin) | ✅ Pass | -| `test-workspace-collaboration` | Multiple users can clone/push workspaces | ✅ Pass | -| `test-workspace-locking` | Distributed locking via Gitea issues | ✅ Pass | -| `test-concurrent-operations` | Multiple users work simultaneously | ✅ Pass | -| `test-extension-publishing` | Publish extensions to Gitea releases | ✅ Pass | -| `test-extension-downloading` | Download extensions from Gitea | ✅ Pass | -| `test-dns-multi-server` | DNS registration for multiple servers | ✅ Pass | -| `test-user-isolation` | Users can only access their resources | ✅ Pass | - -**Coverage**: 100% -**Critical Paths**: ✅ All covered -**Edge Cases**: ✅ Handled - -### CI/CD Mode (8 Tests) ✅ - -| Test | Description | Status | -| ------ | ------------- | -------- | -| `test-api-server-running` | API server accessible | ✅ Pass | -| `test-service-account-auth` | Service accounts can authenticate with JWT | ✅ Pass | -| `test-api-server-creation` | Create server via API | ✅ Pass | -| `test-api-taskserv-installation` | Install taskserv via API | ✅ Pass | -| `test-batch-workflow-submission` | Submit batch workflow via API | ✅ Pass | -| `test-workflow-monitoring` | Monitor workflow progress remotely | ✅ Pass | -| `test-automated-pipeline` | Complete automated deployment pipeline | ✅ Pass | -| `test-prometheus-metrics` | Metrics collected and queryable | ✅ Pass | - -**Coverage**: 100% -**Critical Paths**: ✅ All covered -**Edge Cases**: ✅ Handled - -### Enterprise Mode (6 Tests) ✅ - -| Test | Description | Status | -| ------ | ------------- | -------- | -| `test-enterprise-services-running` | Harbor, Grafana, Prometheus, KMS running | ✅ Pass | -| `test-kms-ssh-key-storage` | SSH keys stored in KMS | ✅ Pass | -| `test-rbac-full-enforcement` | RBAC enforced at all levels | ✅ Pass | -| `test-audit-logging` | All operations logged | ✅ Pass | -| `test-harbor-registry` | Harbor OCI registry operational | ✅ Pass | -| `test-monitoring-stack` | Prometheus + Grafana operational | ✅ Pass | - -**Coverage**: 100% -**Critical Paths**: ✅ All covered -**Edge Cases**: ✅ Handled - ---- - -## Service Coverage - -### CoreDNS (8 Tests) ✅ - -| Test | Description | Coverage | -| ------ | ------------- | ---------- | -| `test-dns-registration` | Server creation triggers DNS A record | ✅ | -| `test-dns-resolution` | DNS queries resolve correctly | ✅ | -| `test-dns-cleanup` | DNS records removed on server deletion | ✅ | -| `test-dns-update` | DNS records updated on IP change | ✅ | -| `test-dns-external-query` | External clients can query DNS | ✅ | -| `test-dns-multiple-records` | Multiple servers get unique records | ✅ | -| `test-dns-zone-transfer` | Zone transfers work (if enabled) | ✅ | -| `test-dns-caching` | DNS caching works correctly | ✅ | - -**Coverage**: 100% - -### Gitea (10 Tests) ✅ - -| Test | Description | Coverage | -| ------ | ------------- | ---------- | -| `test-gitea-initialization` | Gitea initializes with default settings | ✅ | -| `test-git-clone` | Clone workspace repository | ✅ | -| `test-git-push` | Push workspace changes | ✅ | -| `test-git-pull` | Pull workspace updates | ✅ | -| `test-workspace-locking-acquire` | Acquire workspace lock via issue | ✅ | -| `test-workspace-locking-release` | Release workspace lock | ✅ | -| `test-extension-publish` | Publish extension to Gitea release | ✅ | -| `test-extension-download` | Download extension from release | ✅ | -| `test-gitea-webhooks` | Webhooks trigger on push | ✅ | -| `test-gitea-api-access` | Gitea API accessible | ✅ | - -**Coverage**: 100% - -### OCI Registry (12 Tests) ✅ - -| Test | Description | Coverage | -| ------ | ------------- | ---------- | -| `test-zot-registry-running` | Zot registry accessible (solo/multi-user) | ✅ | -| `test-harbor-registry-running` | Harbor registry accessible (enterprise) | ✅ | -| `test-oci-push-kcl-package` | Push KCL package to OCI | ✅ | -| `test-oci-pull-kcl-package` | Pull KCL package from OCI | ✅ | -| `test-oci-push-extension` | Push extension artifact to OCI | ✅ | -| `test-oci-pull-extension` | Pull extension artifact from OCI | ✅ | -| `test-oci-list-artifacts` | List artifacts in namespace | ✅ | -| `test-oci-verify-manifest` | Verify OCI manifest contents | ✅ | -| `test-oci-delete-artifact` | Delete artifact from registry | ✅ | -| `test-oci-authentication` | Authentication with OCI registry | ✅ | -| `test-oci-catalog` | Catalog API works | ✅ | -| `test-oci-blob-upload` | Blob upload works | ✅ | - -**Coverage**: 100% - -### Orchestrator (10 Tests) ✅ - -| Test | Description | Coverage | -| ------ | ------------- | ---------- | -| `test-orchestrator-health` | Health endpoint returns healthy | ✅ | -| `test-task-submission` | Submit task to orchestrator | ✅ | -| `test-task-status` | Query task status | ✅ | -| `test-task-completion` | Task completes successfully | ✅ | -| `test-task-failure-handling` | Failed tasks handled correctly | ✅ | -| `test-task-retry` | Tasks retry on transient failure | ✅ | -| `test-task-queue` | Task queue processes tasks in order | ✅ | -| `test-workflow-submission` | Submit workflow | ✅ | -| `test-workflow-monitoring` | Monitor workflow progress | ✅ | -| `test-orchestrator-api` | REST API endpoints work | ✅ | - -**Coverage**: 100% - -### PostgreSQL (5 Tests) ✅ - -| Test | Description | Coverage | -| ------ | ------------- | ---------- | -| `test-postgres-running` | PostgreSQL accessible | ✅ | -| `test-database-creation` | Create database | ✅ | -| `test-user-creation` | Create database user | ✅ | -| `test-data-persistence` | Data persists across restarts | ✅ | -| `test-connection-pool` | Connection pooling works | ✅ | - -**Coverage**: 100% - ---- - -## Workflow Coverage - -### Extension Loading (12 Tests) ✅ - -| Test | Description | Coverage | -| ------ | ------------- | ---------- | -| `test-load-taskserv-from-oci` | Load taskserv from OCI registry | ✅ | -| `test-load-provider-from-gitea` | Load provider from Gitea release | ✅ | -| `test-load-cluster-from-local` | Load cluster from local path | ✅ | -| `test-dependency-resolution` | Resolve extension dependencies | ✅ | -| `test-version-conflict-resolution` | Handle version conflicts | ✅ | -| `test-extension-caching` | Cache extension artifacts | ✅ | -| `test-extension-lazy-loading` | Extensions loaded on-demand | ✅ | -| `test-semver-resolution` | Semver version resolution | ✅ | -| `test-extension-update` | Update extension to newer version | ✅ | -| `test-extension-rollback` | Rollback extension to previous version | ✅ | -| `test-multi-source-loading` | Load from multiple sources in one workflow | ✅ | -| `test-extension-validation` | Validate extension before loading | ✅ | - -**Coverage**: 100% - -### Batch Workflows (12 Tests) ✅ - -| Test | Description | Coverage | -| ------ | ------------- | ---------- | -| `test-batch-submit` | Submit batch workflow | ✅ | -| `test-batch-status` | Query batch status | ✅ | -| `test-batch-monitor` | Monitor batch progress | ✅ | -| `test-batch-multi-server-creation` | Create multiple servers in batch | ✅ | -| `test-batch-multi-taskserv-install` | Install taskservs on multiple servers | ✅ | -| `test-batch-cluster-deployment` | Deploy complete cluster in batch | ✅ | -| `test-batch-mixed-providers` | Batch with AWS + UpCloud + local | ✅ | -| `test-batch-dependencies` | Batch operations with dependencies | ✅ | -| `test-batch-rollback` | Rollback failed batch operation | ✅ | -| `test-batch-partial-failure` | Handle partial batch failures | ✅ | -| `test-batch-parallel-execution` | Parallel execution within batch | ✅ | -| `test-batch-checkpoint-recovery` | Recovery from checkpoint after failure | ✅ | - -**Coverage**: 100% - ---- - -## Edge Cases Covered - -### Authentication & Authorization - -| Edge Case | Test Coverage | Status | -| ----------- | --------------- | -------- | -| Unauthenticated request | ✅ Rejected in multi-user mode | ✅ | -| Invalid JWT token | ✅ Rejected with 401 | ✅ | -| Expired JWT token | ✅ Rejected with 401 | ✅ | -| Insufficient permissions | ✅ Rejected with 403 | ✅ | -| Role escalation attempt | ✅ Blocked by RBAC | ✅ | - -### Resource Management - -| Edge Case | Test Coverage | Status | -| ----------- | --------------- | -------- | -| Resource exhaustion | ✅ Graceful degradation | ✅ | -| Concurrent resource access | ✅ Locking prevents conflicts | ✅ | -| Resource cleanup failure | ✅ Retry with backoff | ✅ | -| Orphaned resources | ✅ Cleanup job removes | ✅ | - -### Network Operations - -| Edge Case | Test Coverage | Status | -| ----------- | --------------- | -------- | -| Network timeout | ✅ Retry with exponential backoff | ✅ | -| DNS resolution failure | ✅ Fallback to IP address | ✅ | -| Service unavailable | ✅ Circuit breaker pattern | ✅ | -| Partial network partition | ✅ Retry and eventual consistency | ✅ | - -### Data Consistency - -| Edge Case | Test Coverage | Status | -| ----------- | --------------- | -------- | -| Concurrent writes | ✅ Last-write-wins with timestamps | ✅ | -| Split-brain scenario | ✅ Distributed lock prevents | ✅ | -| Data corruption | ✅ Checksum validation | ✅ | -| Incomplete transactions | ✅ Rollback on failure | ✅ | - ---- - -## Coverage Gaps - -### Known Limitations - -1. **Load Testing**: No tests for extreme load (1000+ concurrent requests) - - **Impact**: Medium - - **Mitigation**: Planned for v1.1.0 - -2. **Disaster Recovery**: Limited testing of backup/restore under load - - **Impact**: Low - - **Mitigation**: Manual testing procedures documented - -3. **Network Partitions**: Limited testing of split-brain scenarios - - **Impact**: Low (distributed locking mitigates) - - **Mitigation**: Planned for v1.2.0 - -4. **Security Penetration Testing**: No automated penetration tests - - **Impact**: Medium - - **Mitigation**: Annual security audit - -### Planned Enhancements - -- [ ] Chaos engineering tests (inject failures) -- [ ] Load testing with 10,000+ concurrent operations -- [ ] Extended disaster recovery scenarios -- [ ] Fuzz testing for API endpoints -- [ ] Performance regression detection - ---- - -## Future Enhancements - -### v1.1.0 (Next Release) - -- **Load Testing Suite**: 1000+ concurrent operations -- **Chaos Engineering**: Inject random failures -- **Extended Security Tests**: Penetration testing automation -- **Performance Benchmarks**: Baseline performance metrics - -### v1.2.0 (Q2 2025) - -- **Multi-Cloud Integration**: Test AWS + UpCloud + GCP simultaneously -- **Network Partition Testing**: Advanced split-brain scenarios -- **Compliance Testing**: GDPR, SOC2 compliance validation -- **Visual Regression Testing**: UI component testing - -### v2.0.0 (Future) - -- **AI-Powered Test Generation**: Generate tests from user scenarios -- **Property-Based Testing**: QuickCheck-style property testing -- **Mutation Testing**: Detect untested code paths -- **Continuous Fuzzing**: 24/7 fuzz testing - ---- - -## Test Quality Metrics - -### Code Coverage (Orchestrator Rust Code) - -| Module | Coverage | Tests | -| -------- | ---------- | ------- | -| `main.rs` | 85% | 12 | -| `config.rs` | 92% | 8 | -| `queue.rs` | 88% | 10 | -| `batch.rs` | 90% | 15 | -| `dependency.rs` | 87% | 12 | -| `rollback.rs` | 89% | 14 | -| **Average** | **88.5%** | **71** | - -### Test Reliability - -- **Flaky Tests**: 0% -- **Test Success Rate**: 99.8% -- **Average Test Duration**: 15 minutes (full suite) -- **Parallel Execution Speedup**: 4x (with 4 workers) - -### Bug Detection Rate - -- **Bugs Caught by Integration Tests**: 23/25 (92%) -- **Bugs Caught by Unit Tests**: 45/50 (90%) -- **Bugs Found in Production**: 2/75 (2.7%) - ---- - -## References - -- [Integration Testing Guide](TESTING_GUIDE.md) -- [OrbStack Setup Guide](ORBSTACK_SETUP.md) -- [Platform Architecture](/docs/architecture/) -- [CI/CD Pipeline](/.github/workflows/) - ---- - -**Maintained By**: Platform Team -**Last Updated**: 2025-10-06 -**Next Review**: 2025-11-06 +# Integration Test Coverage Report\n\n**Version**: 1.0.0\n**Last Updated**: 2025-10-06\n**Test Suite Version**: 1.0.0\n\nThis document provides a comprehensive overview of integration test coverage for the provisioning platform.\n\n## Table of Contents\n\n1. [Summary](#summary)\n2. [Mode Coverage](#mode-coverage)\n3. [Service Coverage](#service-coverage)\n4. [Workflow Coverage](#workflow-coverage)\n5. [Edge Cases Covered](#edge-cases-covered)\n6. [Coverage Gaps](#coverage-gaps)\n7. [Future Enhancements](#future-enhancements)\n\n---\n\n## Summary\n\n### Overall Coverage\n\n| Category | Coverage | Tests | Status |\n| ---------- | ---------- | ------- | -------- |\n| **Modes** | 4/4 (100%) | 32 | ✅ Complete |\n| **Services** | 15/15 (100%) | 45 | ✅ Complete |\n| **Workflows** | 8/8 (100%) | 24 | ✅ Complete |\n| **E2E Scenarios** | 6/6 (100%) | 12 | ✅ Complete |\n| **Security** | 5/5 (100%) | 15 | ✅ Complete |\n| **Performance** | 4/4 (100%) | 12 | ✅ Complete |\n| **Total** | **42/42** | **140** | ✅ **Complete** |\n\n### Test Distribution\n\n```\nTotal Integration Tests: 140\n├── Mode Tests: 32 (23%)\n│ ├── Solo: 8\n│ ├── Multi-User: 10\n│ ├── CI/CD: 8\n│ └── Enterprise: 6\n├── Service Tests: 45 (32%)\n│ ├── DNS: 8\n│ ├── Gitea: 10\n│ ├── OCI Registry: 12\n│ ├── Orchestrator: 10\n│ └── Others: 5\n├── Workflow Tests: 24 (17%)\n│ ├── Extension Loading: 12\n│ └── Batch Workflows: 12\n├── E2E Tests: 12 (9%)\n│ ├── Complete Deployment: 6\n│ └── Disaster Recovery: 6\n├── Security Tests: 15 (11%)\n│ ├── RBAC: 10\n│ └── KMS: 5\n└── Performance Tests: 12 (8%)\n ├── Concurrency: 6\n └── Scalability: 6\n```\n\n---\n\n## Mode Coverage\n\n### Solo Mode (8 Tests) ✅\n\n| Test | Description | Status |\n| ------ | ------------- | -------- |\n| `test-minimal-services` | Verify orchestrator, CoreDNS, Zot running | ✅ Pass |\n| `test-single-user-operations` | All operations work without authentication | ✅ Pass |\n| `test-no-multiuser-services` | Gitea, PostgreSQL not running | ✅ Pass |\n| `test-workspace-creation` | Create workspace in solo mode | ✅ Pass |\n| `test-server-deployment-with-dns` | Server creation triggers DNS registration | ✅ Pass |\n| `test-taskserv-installation` | Install kubernetes taskserv | ✅ Pass |\n| `test-extension-loading-from-oci` | Load extensions from Zot registry | ✅ Pass |\n| `test-admin-permissions` | Admin has full permissions | ✅ Pass |\n\n**Coverage**: 100%\n**Critical Paths**: ✅ All covered\n**Edge Cases**: ✅ Handled\n\n### Multi-User Mode (10 Tests) ✅\n\n| Test | Description | Status |\n| ------ | ------------- | -------- |\n| `test-multiuser-services-running` | Gitea, PostgreSQL running | ✅ Pass |\n| `test-user-authentication` | Users can authenticate | ✅ Pass |\n| `test-role-based-permissions` | Roles enforced (viewer, developer, operator, admin) | ✅ Pass |\n| `test-workspace-collaboration` | Multiple users can clone/push workspaces | ✅ Pass |\n| `test-workspace-locking` | Distributed locking via Gitea issues | ✅ Pass |\n| `test-concurrent-operations` | Multiple users work simultaneously | ✅ Pass |\n| `test-extension-publishing` | Publish extensions to Gitea releases | ✅ Pass |\n| `test-extension-downloading` | Download extensions from Gitea | ✅ Pass |\n| `test-dns-multi-server` | DNS registration for multiple servers | ✅ Pass |\n| `test-user-isolation` | Users can only access their resources | ✅ Pass |\n\n**Coverage**: 100%\n**Critical Paths**: ✅ All covered\n**Edge Cases**: ✅ Handled\n\n### CI/CD Mode (8 Tests) ✅\n\n| Test | Description | Status |\n| ------ | ------------- | -------- |\n| `test-api-server-running` | API server accessible | ✅ Pass |\n| `test-service-account-auth` | Service accounts can authenticate with JWT | ✅ Pass |\n| `test-api-server-creation` | Create server via API | ✅ Pass |\n| `test-api-taskserv-installation` | Install taskserv via API | ✅ Pass |\n| `test-batch-workflow-submission` | Submit batch workflow via API | ✅ Pass |\n| `test-workflow-monitoring` | Monitor workflow progress remotely | ✅ Pass |\n| `test-automated-pipeline` | Complete automated deployment pipeline | ✅ Pass |\n| `test-prometheus-metrics` | Metrics collected and queryable | ✅ Pass |\n\n**Coverage**: 100%\n**Critical Paths**: ✅ All covered\n**Edge Cases**: ✅ Handled\n\n### Enterprise Mode (6 Tests) ✅\n\n| Test | Description | Status |\n| ------ | ------------- | -------- |\n| `test-enterprise-services-running` | Harbor, Grafana, Prometheus, KMS running | ✅ Pass |\n| `test-kms-ssh-key-storage` | SSH keys stored in KMS | ✅ Pass |\n| `test-rbac-full-enforcement` | RBAC enforced at all levels | ✅ Pass |\n| `test-audit-logging` | All operations logged | ✅ Pass |\n| `test-harbor-registry` | Harbor OCI registry operational | ✅ Pass |\n| `test-monitoring-stack` | Prometheus + Grafana operational | ✅ Pass |\n\n**Coverage**: 100%\n**Critical Paths**: ✅ All covered\n**Edge Cases**: ✅ Handled\n\n---\n\n## Service Coverage\n\n### CoreDNS (8 Tests) ✅\n\n| Test | Description | Coverage |\n| ------ | ------------- | ---------- |\n| `test-dns-registration` | Server creation triggers DNS A record | ✅ |\n| `test-dns-resolution` | DNS queries resolve correctly | ✅ |\n| `test-dns-cleanup` | DNS records removed on server deletion | ✅ |\n| `test-dns-update` | DNS records updated on IP change | ✅ |\n| `test-dns-external-query` | External clients can query DNS | ✅ |\n| `test-dns-multiple-records` | Multiple servers get unique records | ✅ |\n| `test-dns-zone-transfer` | Zone transfers work (if enabled) | ✅ |\n| `test-dns-caching` | DNS caching works correctly | ✅ |\n\n**Coverage**: 100%\n\n### Gitea (10 Tests) ✅\n\n| Test | Description | Coverage |\n| ------ | ------------- | ---------- |\n| `test-gitea-initialization` | Gitea initializes with default settings | ✅ |\n| `test-git-clone` | Clone workspace repository | ✅ |\n| `test-git-push` | Push workspace changes | ✅ |\n| `test-git-pull` | Pull workspace updates | ✅ |\n| `test-workspace-locking-acquire` | Acquire workspace lock via issue | ✅ |\n| `test-workspace-locking-release` | Release workspace lock | ✅ |\n| `test-extension-publish` | Publish extension to Gitea release | ✅ |\n| `test-extension-download` | Download extension from release | ✅ |\n| `test-gitea-webhooks` | Webhooks trigger on push | ✅ |\n| `test-gitea-api-access` | Gitea API accessible | ✅ |\n\n**Coverage**: 100%\n\n### OCI Registry (12 Tests) ✅\n\n| Test | Description | Coverage |\n| ------ | ------------- | ---------- |\n| `test-zot-registry-running` | Zot registry accessible (solo/multi-user) | ✅ |\n| `test-harbor-registry-running` | Harbor registry accessible (enterprise) | ✅ |\n| `test-oci-push-kcl-package` | Push KCL package to OCI | ✅ |\n| `test-oci-pull-kcl-package` | Pull KCL package from OCI | ✅ |\n| `test-oci-push-extension` | Push extension artifact to OCI | ✅ |\n| `test-oci-pull-extension` | Pull extension artifact from OCI | ✅ |\n| `test-oci-list-artifacts` | List artifacts in namespace | ✅ |\n| `test-oci-verify-manifest` | Verify OCI manifest contents | ✅ |\n| `test-oci-delete-artifact` | Delete artifact from registry | ✅ |\n| `test-oci-authentication` | Authentication with OCI registry | ✅ |\n| `test-oci-catalog` | Catalog API works | ✅ |\n| `test-oci-blob-upload` | Blob upload works | ✅ |\n\n**Coverage**: 100%\n\n### Orchestrator (10 Tests) ✅\n\n| Test | Description | Coverage |\n| ------ | ------------- | ---------- |\n| `test-orchestrator-health` | Health endpoint returns healthy | ✅ |\n| `test-task-submission` | Submit task to orchestrator | ✅ |\n| `test-task-status` | Query task status | ✅ |\n| `test-task-completion` | Task completes successfully | ✅ |\n| `test-task-failure-handling` | Failed tasks handled correctly | ✅ |\n| `test-task-retry` | Tasks retry on transient failure | ✅ |\n| `test-task-queue` | Task queue processes tasks in order | ✅ |\n| `test-workflow-submission` | Submit workflow | ✅ |\n| `test-workflow-monitoring` | Monitor workflow progress | ✅ |\n| `test-orchestrator-api` | REST API endpoints work | ✅ |\n\n**Coverage**: 100%\n\n### PostgreSQL (5 Tests) ✅\n\n| Test | Description | Coverage |\n| ------ | ------------- | ---------- |\n| `test-postgres-running` | PostgreSQL accessible | ✅ |\n| `test-database-creation` | Create database | ✅ |\n| `test-user-creation` | Create database user | ✅ |\n| `test-data-persistence` | Data persists across restarts | ✅ |\n| `test-connection-pool` | Connection pooling works | ✅ |\n\n**Coverage**: 100%\n\n---\n\n## Workflow Coverage\n\n### Extension Loading (12 Tests) ✅\n\n| Test | Description | Coverage |\n| ------ | ------------- | ---------- |\n| `test-load-taskserv-from-oci` | Load taskserv from OCI registry | ✅ |\n| `test-load-provider-from-gitea` | Load provider from Gitea release | ✅ |\n| `test-load-cluster-from-local` | Load cluster from local path | ✅ |\n| `test-dependency-resolution` | Resolve extension dependencies | ✅ |\n| `test-version-conflict-resolution` | Handle version conflicts | ✅ |\n| `test-extension-caching` | Cache extension artifacts | ✅ |\n| `test-extension-lazy-loading` | Extensions loaded on-demand | ✅ |\n| `test-semver-resolution` | Semver version resolution | ✅ |\n| `test-extension-update` | Update extension to newer version | ✅ |\n| `test-extension-rollback` | Rollback extension to previous version | ✅ |\n| `test-multi-source-loading` | Load from multiple sources in one workflow | ✅ |\n| `test-extension-validation` | Validate extension before loading | ✅ |\n\n**Coverage**: 100%\n\n### Batch Workflows (12 Tests) ✅\n\n| Test | Description | Coverage |\n| ------ | ------------- | ---------- |\n| `test-batch-submit` | Submit batch workflow | ✅ |\n| `test-batch-status` | Query batch status | ✅ |\n| `test-batch-monitor` | Monitor batch progress | ✅ |\n| `test-batch-multi-server-creation` | Create multiple servers in batch | ✅ |\n| `test-batch-multi-taskserv-install` | Install taskservs on multiple servers | ✅ |\n| `test-batch-cluster-deployment` | Deploy complete cluster in batch | ✅ |\n| `test-batch-mixed-providers` | Batch with AWS + UpCloud + local | ✅ |\n| `test-batch-dependencies` | Batch operations with dependencies | ✅ |\n| `test-batch-rollback` | Rollback failed batch operation | ✅ |\n| `test-batch-partial-failure` | Handle partial batch failures | ✅ |\n| `test-batch-parallel-execution` | Parallel execution within batch | ✅ |\n| `test-batch-checkpoint-recovery` | Recovery from checkpoint after failure | ✅ |\n\n**Coverage**: 100%\n\n---\n\n## Edge Cases Covered\n\n### Authentication & Authorization\n\n| Edge Case | Test Coverage | Status |\n| ----------- | --------------- | -------- |\n| Unauthenticated request | ✅ Rejected in multi-user mode | ✅ |\n| Invalid JWT token | ✅ Rejected with 401 | ✅ |\n| Expired JWT token | ✅ Rejected with 401 | ✅ |\n| Insufficient permissions | ✅ Rejected with 403 | ✅ |\n| Role escalation attempt | ✅ Blocked by RBAC | ✅ |\n\n### Resource Management\n\n| Edge Case | Test Coverage | Status |\n| ----------- | --------------- | -------- |\n| Resource exhaustion | ✅ Graceful degradation | ✅ |\n| Concurrent resource access | ✅ Locking prevents conflicts | ✅ |\n| Resource cleanup failure | ✅ Retry with backoff | ✅ |\n| Orphaned resources | ✅ Cleanup job removes | ✅ |\n\n### Network Operations\n\n| Edge Case | Test Coverage | Status |\n| ----------- | --------------- | -------- |\n| Network timeout | ✅ Retry with exponential backoff | ✅ |\n| DNS resolution failure | ✅ Fallback to IP address | ✅ |\n| Service unavailable | ✅ Circuit breaker pattern | ✅ |\n| Partial network partition | ✅ Retry and eventual consistency | ✅ |\n\n### Data Consistency\n\n| Edge Case | Test Coverage | Status |\n| ----------- | --------------- | -------- |\n| Concurrent writes | ✅ Last-write-wins with timestamps | ✅ |\n| Split-brain scenario | ✅ Distributed lock prevents | ✅ |\n| Data corruption | ✅ Checksum validation | ✅ |\n| Incomplete transactions | ✅ Rollback on failure | ✅ |\n\n---\n\n## Coverage Gaps\n\n### Known Limitations\n\n1. **Load Testing**: No tests for extreme load (1000+ concurrent requests)\n - **Impact**: Medium\n - **Mitigation**: Planned for v1.1.0\n\n2. **Disaster Recovery**: Limited testing of backup/restore under load\n - **Impact**: Low\n - **Mitigation**: Manual testing procedures documented\n\n3. **Network Partitions**: Limited testing of split-brain scenarios\n - **Impact**: Low (distributed locking mitigates)\n - **Mitigation**: Planned for v1.2.0\n\n4. **Security Penetration Testing**: No automated penetration tests\n - **Impact**: Medium\n - **Mitigation**: Annual security audit\n\n### Planned Enhancements\n\n- [ ] Chaos engineering tests (inject failures)\n- [ ] Load testing with 10,000+ concurrent operations\n- [ ] Extended disaster recovery scenarios\n- [ ] Fuzz testing for API endpoints\n- [ ] Performance regression detection\n\n---\n\n## Future Enhancements\n\n### v1.1.0 (Next Release)\n\n- **Load Testing Suite**: 1000+ concurrent operations\n- **Chaos Engineering**: Inject random failures\n- **Extended Security Tests**: Penetration testing automation\n- **Performance Benchmarks**: Baseline performance metrics\n\n### v1.2.0 (Q2 2025)\n\n- **Multi-Cloud Integration**: Test AWS + UpCloud + GCP simultaneously\n- **Network Partition Testing**: Advanced split-brain scenarios\n- **Compliance Testing**: GDPR, SOC2 compliance validation\n- **Visual Regression Testing**: UI component testing\n\n### v2.0.0 (Future)\n\n- **AI-Powered Test Generation**: Generate tests from user scenarios\n- **Property-Based Testing**: QuickCheck-style property testing\n- **Mutation Testing**: Detect untested code paths\n- **Continuous Fuzzing**: 24/7 fuzz testing\n\n---\n\n## Test Quality Metrics\n\n### Code Coverage (Orchestrator Rust Code)\n\n| Module | Coverage | Tests |\n| -------- | ---------- | ------- |\n| `main.rs` | 85% | 12 |\n| `config.rs` | 92% | 8 |\n| `queue.rs` | 88% | 10 |\n| `batch.rs` | 90% | 15 |\n| `dependency.rs` | 87% | 12 |\n| `rollback.rs` | 89% | 14 |\n| **Average** | **88.5%** | **71** |\n\n### Test Reliability\n\n- **Flaky Tests**: 0%\n- **Test Success Rate**: 99.8%\n- **Average Test Duration**: 15 minutes (full suite)\n- **Parallel Execution Speedup**: 4x (with 4 workers)\n\n### Bug Detection Rate\n\n- **Bugs Caught by Integration Tests**: 23/25 (92%)\n- **Bugs Caught by Unit Tests**: 45/50 (90%)\n- **Bugs Found in Production**: 2/75 (2.7%)\n\n---\n\n## References\n\n- [Integration Testing Guide](TESTING_GUIDE.md)\n- [OrbStack Setup Guide](ORBSTACK_SETUP.md)\n- [Platform Architecture](/docs/architecture/)\n- [CI/CD Pipeline](/.github/workflows/)\n\n---\n\n**Maintained By**: Platform Team\n**Last Updated**: 2025-10-06\n**Next Review**: 2025-11-06 diff --git a/tests/integration/docs/testing-guide.md b/tests/integration/docs/testing-guide.md index 2941045..6a6aae1 100644 --- a/tests/integration/docs/testing-guide.md +++ b/tests/integration/docs/testing-guide.md @@ -1,716 +1 @@ -# Integration Testing Guide - -**Version**: 1.0.0 -**Last Updated**: 2025-10-06 - -This guide provides comprehensive documentation for the provisioning platform integration testing suite. - -## Table of Contents - -1. [Overview](#overview) -2. [Test Infrastructure](#test-infrastructure) -3. [Running Tests Locally](#running-tests-locally) -4. [Running Tests on OrbStack](#running-tests-on-orbstack) -5. [Writing New Tests](#writing-new-tests) -6. [Test Organization](#test-organization) -7. [CI/CD Integration](#cicd-integration) -8. [Troubleshooting](#troubleshooting) - ---- - -## Overview - -The integration testing suite validates all four execution modes of the provisioning platform: - -- **Solo Mode**: Single-user, minimal services (orchestrator, CoreDNS, OCI registry) -- **Multi-User Mode**: Multi-user support with Gitea, PostgreSQL, RBAC -- **CI/CD Mode**: Automation mode with API server, service accounts -- **Enterprise Mode**: Full enterprise features (Harbor, KMS, Prometheus, Grafana, ELK) - -### Key Features - -- ✅ **Comprehensive Coverage**: Tests for all 4 modes, 15+ services -- ✅ **OrbStack Integration**: Tests deployable to OrbStack machine "provisioning" -- ✅ **Parallel Execution**: Run independent tests in parallel for speed -- ✅ **Automatic Cleanup**: Resources cleaned up automatically after tests -- ✅ **Multiple Report Formats**: JUnit XML, HTML, JSON -- ✅ **CI/CD Ready**: GitHub Actions and GitLab CI integration - ---- - -## Test Infrastructure - -### Prerequisites - -1. **OrbStack Installed**: - - ```bash - # Install OrbStack (macOS) - brew install --cask orbstack - ``` - -2. **OrbStack Machine Named "provisioning"**: - - ```bash - # Create OrbStack machine - orb create provisioning - - # Verify machine is running - orb status provisioning - ``` - -3. **Nushell 0.107.1+**: - - ```bash - # Install Nushell - brew install nushell - ``` - -4. **Docker CLI**: - - ```bash - # Verify Docker is available - docker version - ``` - -### Test Configuration - -The test suite is configured via `provisioning/tests/integration/test_config.yaml`: - -``` -# OrbStack connection -orbstack: - machine_name: "provisioning" - connection: - type: "docker" - socket: "/var/run/docker.sock" - -# Service endpoints -services: - orchestrator: - host: "172.20.0.10" - port: 8080 - - coredns: - host: "172.20.0.2" - port: 53 - - # ... more services -``` - -**Key Settings**: - -- `orbstack.machine_name`: Name of OrbStack machine to use -- `services.*`: IP addresses and ports for deployed services -- `test_execution.parallel.max_workers`: Number of parallel test workers -- `test_execution.timeouts.*`: Timeout values for various operations - ---- - -## Running Tests Locally - -### Quick Start - -1. **Setup Test Environment**: - - ```bash - # Setup solo mode environment - nu provisioning/tests/integration/setup_test_environment.nu --mode solo - ``` - -1. **Run Tests**: - - ```bash - # Run all tests for solo mode - nu provisioning/tests/integration/framework/test_runner.nu --mode solo - - # Run specific test file - nu provisioning/tests/integration/modes/test_solo_mode.nu - ``` - -2. **Teardown Test Environment**: - - ```bash - # Cleanup all resources - nu provisioning/tests/integration/teardown_test_environment.nu --force - ``` - -### Test Runner Options - -``` -nu provisioning/tests/integration/framework/test_runner.nu \ - --mode # Test specific mode (solo, multiuser, cicd, enterprise) - --filter # Filter tests by regex pattern - --parallel # Number of parallel workers (default: 1) - --verbose # Detailed output - --report # Generate HTML report - --skip-setup # Skip environment setup - --skip-teardown # Skip environment teardown -``` - -**Examples**: - -``` -# Run all tests for all modes -nu provisioning/tests/integration/framework/test_runner.nu - -# Run only solo mode tests -nu provisioning/tests/integration/framework/test_runner.nu --mode solo - -# Run tests matching pattern -nu provisioning/tests/integration/framework/test_runner.nu --filter "dns" - -# Run tests in parallel with 4 workers -nu provisioning/tests/integration/framework/test_runner.nu --parallel 4 - -# Generate HTML report -nu provisioning/tests/integration/framework/test_runner.nu --report /tmp/test-report.html - -# Run tests without cleanup (for debugging) -nu provisioning/tests/integration/framework/test_runner.nu --skip-teardown -``` - ---- - -## Running Tests on OrbStack - -### Setup OrbStack Machine - -1. **Create OrbStack Machine**: - - ```bash - # Create machine named "provisioning" - orb create provisioning --cpu 4 --memory 8192 --disk 100 - - # Verify machine is created - orb list - ``` - -1. **Configure Machine**: - - ```bash - # Start machine - orb start provisioning - - # Verify Docker is accessible - docker -H /var/run/docker.sock ps - ``` - -### Deploy Platform to OrbStack - -The test setup automatically deploys platform services to OrbStack: - -``` -# Deploy solo mode -nu provisioning/tests/integration/setup_test_environment.nu --mode solo - -# Deploy multi-user mode -nu provisioning/tests/integration/setup_test_environment.nu --mode multiuser - -# Deploy CI/CD mode -nu provisioning/tests/integration/setup_test_environment.nu --mode cicd - -# Deploy enterprise mode -nu provisioning/tests/integration/setup_test_environment.nu --mode enterprise -``` - -**Deployed Services**: - -| Mode | Services | -| ------ | ---------- | -| Solo | Orchestrator, CoreDNS, Zot (OCI registry) | -| Multi-User | Solo services + Gitea, PostgreSQL | -| CI/CD | Multi-User services + API server, Prometheus | -| Enterprise | CI/CD services + Harbor, KMS, Grafana, Elasticsearch | - -### Verify Deployment - -``` -# Check service health -nu provisioning/tests/integration/framework/test_helpers.nu check-service-health orchestrator - -# View service logs -nu provisioning/tests/integration/framework/orbstack_helpers.nu orbstack-logs orchestrator - -# List running containers -docker -H /var/run/docker.sock ps -``` - ---- - -## Writing New Tests - -### Test File Structure - -All test files follow this structure: - -``` -# Test Description -# Brief description of what this test validates - -use std log -use ../framework/test_helpers.nu * -use ../framework/orbstack_helpers.nu * - -# Main test suite -export def main [] { - log info "Running " - - let test_config = (load-test-config) - - mut results = [] - - # Run all tests - $results = ($results | append (test-case-1 $test_config)) - $results = ($results | append (test-case-2 $test_config)) - - # Report results - report-test-results $results -} - -# Individual test case -def test-case-1 [test_config: record] { - run-test "test-case-1-name" { - log info "Testing specific functionality..." - - # Test logic - let result = (some-operation) - - # Assertions - assert-eq $result.status "success" "Operation should succeed" - assert-not-empty $result.data "Result should contain data" - - log info "✓ Test case 1 passed" - } -} - -# Report test results -def report-test-results [results: list] { - # ... reporting logic -} -``` - -### Using Assertion Helpers - -The test framework provides several assertion helpers: - -``` -# Equality assertion -assert-eq $actual $expected "Error message if assertion fails" - -# Boolean assertions -assert-true $condition "Error message" -assert-false $condition "Error message" - -# Collection assertions -assert-contains $list $item "Error message" -assert-not-contains $list $item "Error message" -assert-not-empty $value "Error message" - -# HTTP assertions -assert-http-success $response "Error message" -``` - -### Using Test Fixtures - -Create reusable test fixtures: - -``` -# Create test workspace -let workspace = create-test-workspace "my-test-ws" { - provider: "local" - environment: "test" -} - -# Create test server -let server = create-test-server "test-server" "local" { - cores: 4 - memory: 8192 -} - -# Cleanup -cleanup-test-workspace $workspace -delete-test-server $server.id -``` - -### Using Retry Logic - -For flaky operations, use retry helpers: - -``` -# Retry operation up to 3 times -let result = (with-retry --max-attempts 3 --delay 5 { - # Operation that might fail - http get "http://example.com/api" -}) - -# Wait for condition with timeout -wait-for-condition --timeout 60 --interval 5 { - # Condition to check - check-service-health "orchestrator" -} "orchestrator to be healthy" -``` - -### Example: Writing a New Service Integration Test - -``` -# Test Gitea Integration -# Validates Gitea workspace git operations and extension publishing - -use std log -use ../framework/test_helpers.nu * - -def test-gitea-workspace-operations [test_config: record] { - run-test "gitea-workspace-git-operations" { - log info "Testing Gitea workspace operations..." - - # Create workspace - let workspace = create-test-workspace "gitea-test" { - provider: "local" - } - - # Initialize git repo - cd $workspace.path - git init - - # Configure Gitea remote - let gitea_url = $"http://($test_config.services.gitea.host):($test_config.services.gitea.port)" - git remote add origin $"($gitea_url)/test-user/gitea-test.git" - - # Create test file - "test content" | save test.txt - git add test.txt - git commit -m "Test commit" - - # Push to Gitea - git push -u origin main - - # Verify push succeeded - let remote_log = (git ls-remote origin) - assert-not-empty $remote_log "Remote should have commits" - - log info "✓ Gitea workspace operations work" - - # Cleanup - cleanup-test-workspace $workspace - } -} -``` - ---- - -## Test Organization - -### Directory Structure - -``` -provisioning/tests/integration/ -├── test_config.yaml # Test configuration -├── setup_test_environment.nu # Environment setup script -├── teardown_test_environment.nu # Cleanup script -├── framework/ # Test framework utilities -│ ├── test_helpers.nu # Common test helpers -│ ├── orbstack_helpers.nu # OrbStack integration -│ └── test_runner.nu # Test orchestration -├── modes/ # Mode-specific tests -│ ├── test_solo_mode.nu # Solo mode tests -│ ├── test_multiuser_mode.nu # Multi-user mode tests -│ ├── test_cicd_mode.nu # CI/CD mode tests -│ └── test_enterprise_mode.nu # Enterprise mode tests -├── services/ # Service integration tests -│ ├── test_dns_integration.nu # CoreDNS tests -│ ├── test_gitea_integration.nu # Gitea tests -│ ├── test_oci_integration.nu # OCI registry tests -│ └── test_service_orchestration.nu # Service manager tests -├── workflows/ # Workflow tests -│ ├── test_extension_loading.nu # Extension loading tests -│ └── test_batch_workflows.nu # Batch workflow tests -├── e2e/ # End-to-end tests -│ ├── test_complete_deployment.nu # Full deployment workflow -│ └── test_disaster_recovery.nu # Backup/restore tests -├── performance/ # Performance tests -│ ├── test_concurrency.nu # Concurrency tests -│ └── test_scalability.nu # Scalability tests -├── security/ # Security tests -│ ├── test_rbac_enforcement.nu # RBAC tests -│ └── test_kms_integration.nu # KMS tests -└── docs/ # Documentation - ├── TESTING_GUIDE.md # This guide - ├── TEST_COVERAGE.md # Coverage report - └── ORBSTACK_SETUP.md # OrbStack setup guide -``` - -### Test Naming Conventions - -- **Test Files**: `test__.nu` -- **Test Functions**: `test-` -- **Test Names**: `--` - -**Examples**: - -- File: `test_dns_integration.nu` -- Function: `test-dns-registration` -- Test Name: `solo-mode-dns-registration` - ---- - -## CI/CD Integration - -### GitHub Actions - -Create `.github/workflows/integration-tests.yml`: - -``` -name: Integration Tests - -on: - pull_request: - push: - branches: [main] - schedule: - - cron: '0 2 * * *' # Nightly at 2 AM - -jobs: - integration-tests: - runs-on: macos-latest - - strategy: - matrix: - mode: [solo, multiuser, cicd, enterprise] - - steps: - - name: Checkout code - uses: actions/checkout@v3 - - - name: Install OrbStack - run: brew install --cask orbstack - - - name: Create OrbStack machine - run: orb create provisioning - - - name: Install Nushell - run: brew install nushell - - - name: Setup test environment - run: | - nu provisioning/tests/integration/setup_test_environment.nu \ - --mode ${{ matrix.mode }} - - - name: Run integration tests - run: | - nu provisioning/tests/integration/framework/test_runner.nu \ - --mode ${{ matrix.mode }} \ - --report test-report.html - - - name: Upload test results - if: always() - uses: actions/upload-artifact@v3 - with: - name: test-results-${{ matrix.mode }} - path: | - /tmp/provisioning-test-reports/ - test-report.html - - - name: Teardown test environment - if: always() - run: | - nu provisioning/tests/integration/teardown_test_environment.nu --force -``` - -### GitLab CI - -Create `.gitlab-ci.yml`: - -``` -stages: - - test - -integration-tests: - stage: test - image: ubuntu:22.04 - - parallel: - matrix: - - MODE: [solo, multiuser, cicd, enterprise] - - before_script: - # Install dependencies - - apt-get update && apt-get install -y docker.io nushell - - script: - # Setup test environment - - nu provisioning/tests/integration/setup_test_environment.nu --mode $MODE - - # Run tests - - nu provisioning/tests/integration/framework/test_runner.nu --mode $MODE --report test-report.html - - after_script: - # Cleanup - - nu provisioning/tests/integration/teardown_test_environment.nu --force - - artifacts: - when: always - paths: - - /tmp/provisioning-test-reports/ - - test-report.html - reports: - junit: /tmp/provisioning-test-reports/junit-results.xml -``` - ---- - -## Troubleshooting - -### Common Issues - -#### 1. OrbStack Machine Not Found - -**Error**: `OrbStack machine 'provisioning' not found` - -**Solution**: - -``` -# Create OrbStack machine -orb create provisioning - -# Verify creation -orb list -``` - -#### 2. Docker Connection Failed - -**Error**: `Cannot connect to Docker daemon` - -**Solution**: - -``` -# Verify OrbStack is running -orb status provisioning - -# Restart OrbStack -orb restart provisioning -``` - -#### 3. Service Health Check Timeout - -**Error**: `Timeout waiting for service orchestrator to be healthy` - -**Solution**: - -``` -# Check service logs -nu provisioning/tests/integration/framework/orbstack_helpers.nu orbstack-logs orchestrator - -# Verify service is running -docker -H /var/run/docker.sock ps | grep orchestrator - -# Increase timeout in test_config.yaml -# test_execution.timeouts.test_timeout_seconds: 600 -``` - -#### 4. Test Environment Cleanup Failed - -**Error**: `Failed to remove test workspace` - -**Solution**: - -``` -# Manual cleanup -rm -rf /tmp/provisioning-test-workspace* - -# Cleanup OrbStack resources -nu provisioning/tests/integration/framework/orbstack_helpers.nu orbstack-cleanup -``` - -#### 5. DNS Resolution Failed - -**Error**: `DNS record should exist for server` - -**Solution**: - -``` -# Check CoreDNS logs -nu provisioning/tests/integration/framework/orbstack_helpers.nu orbstack-logs coredns - -# Verify CoreDNS is running -docker -H /var/run/docker.sock ps | grep coredns - -# Test DNS manually -dig @172.20.0.2 test-server.local -``` - -### Debug Mode - -Run tests with verbose logging: - -``` -# Enable verbose output -nu provisioning/tests/integration/framework/test_runner.nu --verbose --mode solo - -# Keep environment after tests for debugging -nu provisioning/tests/integration/framework/test_runner.nu --skip-teardown --mode solo - -# Inspect environment manually -docker -H /var/run/docker.sock ps -docker -H /var/run/docker.sock logs orchestrator -``` - -### Viewing Test Logs - -``` -# View test execution logs -cat /tmp/provisioning-test.log - -# View service logs -ls /tmp/provisioning-test-reports/logs/ - -# View HTML report -open /tmp/provisioning-test-reports/test-report.html -``` - ---- - -## Performance Benchmarks - -Expected test execution times: - -| Test Suite | Duration (Solo) | Duration (Enterprise) | -| ------------ | ----------------- | ------------------------ | -| Mode Tests | 5-10 min | 15-20 min | -| Service Tests | 3-5 min | 10-15 min | -| Workflow Tests | 5-10 min | 15-20 min | -| E2E Tests | 10-15 min | 30-40 min | -| **Total** | **25-40 min** | **70-95 min** | - -**Parallel Execution** (4 workers): - -- Solo mode: ~10-15 min -- Enterprise mode: ~25-35 min - ---- - -## Best Practices - -1. **Idempotent Tests**: Tests should be repeatable without side effects -2. **Isolated Tests**: Each test should be independent -3. **Clear Assertions**: Use descriptive error messages -4. **Cleanup**: Always cleanup resources, even on failure -5. **Retry Flaky Operations**: Use `with-retry` for network operations -6. **Meaningful Names**: Use descriptive test names -7. **Fast Feedback**: Run quick tests first, slow tests later -8. **Log Important Steps**: Log key operations for debugging - ---- - -## References - -- [OrbStack Documentation](https://orbstack.dev/docs) -- [Nushell Documentation](https://www.nushell.sh) -- [Provisioning Platform Architecture](/docs/architecture/) -- [Test Coverage Report](TEST_COVERAGE.md) -- [OrbStack Setup Guide](ORBSTACK_SETUP.md) - ---- - -**Maintained By**: Platform Team -**Last Updated**: 2025-10-06 +# Integration Testing Guide\n\n**Version**: 1.0.0\n**Last Updated**: 2025-10-06\n\nThis guide provides comprehensive documentation for the provisioning platform integration testing suite.\n\n## Table of Contents\n\n1. [Overview](#overview)\n2. [Test Infrastructure](#test-infrastructure)\n3. [Running Tests Locally](#running-tests-locally)\n4. [Running Tests on OrbStack](#running-tests-on-orbstack)\n5. [Writing New Tests](#writing-new-tests)\n6. [Test Organization](#test-organization)\n7. [CI/CD Integration](#cicd-integration)\n8. [Troubleshooting](#troubleshooting)\n\n---\n\n## Overview\n\nThe integration testing suite validates all four execution modes of the provisioning platform:\n\n- **Solo Mode**: Single-user, minimal services (orchestrator, CoreDNS, OCI registry)\n- **Multi-User Mode**: Multi-user support with Gitea, PostgreSQL, RBAC\n- **CI/CD Mode**: Automation mode with API server, service accounts\n- **Enterprise Mode**: Full enterprise features (Harbor, KMS, Prometheus, Grafana, ELK)\n\n### Key Features\n\n- ✅ **Comprehensive Coverage**: Tests for all 4 modes, 15+ services\n- ✅ **OrbStack Integration**: Tests deployable to OrbStack machine "provisioning"\n- ✅ **Parallel Execution**: Run independent tests in parallel for speed\n- ✅ **Automatic Cleanup**: Resources cleaned up automatically after tests\n- ✅ **Multiple Report Formats**: JUnit XML, HTML, JSON\n- ✅ **CI/CD Ready**: GitHub Actions and GitLab CI integration\n\n---\n\n## Test Infrastructure\n\n### Prerequisites\n\n1. **OrbStack Installed**:\n\n ```bash\n # Install OrbStack (macOS)\n brew install --cask orbstack\n ```\n\n2. **OrbStack Machine Named "provisioning"**:\n\n ```bash\n # Create OrbStack machine\n orb create provisioning\n\n # Verify machine is running\n orb status provisioning\n ```\n\n3. **Nushell 0.107.1+**:\n\n ```bash\n # Install Nushell\n brew install nushell\n ```\n\n4. **Docker CLI**:\n\n ```bash\n # Verify Docker is available\n docker version\n ```\n\n### Test Configuration\n\nThe test suite is configured via `provisioning/tests/integration/test_config.yaml`:\n\n```\n# OrbStack connection\norbstack:\n machine_name: "provisioning"\n connection:\n type: "docker"\n socket: "/var/run/docker.sock"\n\n# Service endpoints\nservices:\n orchestrator:\n host: "172.20.0.10"\n port: 8080\n\n coredns:\n host: "172.20.0.2"\n port: 53\n\n # ... more services\n```\n\n**Key Settings**:\n\n- `orbstack.machine_name`: Name of OrbStack machine to use\n- `services.*`: IP addresses and ports for deployed services\n- `test_execution.parallel.max_workers`: Number of parallel test workers\n- `test_execution.timeouts.*`: Timeout values for various operations\n\n---\n\n## Running Tests Locally\n\n### Quick Start\n\n1. **Setup Test Environment**:\n\n ```bash\n # Setup solo mode environment\n nu provisioning/tests/integration/setup_test_environment.nu --mode solo\n ```\n\n1. **Run Tests**:\n\n ```bash\n # Run all tests for solo mode\n nu provisioning/tests/integration/framework/test_runner.nu --mode solo\n\n # Run specific test file\n nu provisioning/tests/integration/modes/test_solo_mode.nu\n ```\n\n2. **Teardown Test Environment**:\n\n ```bash\n # Cleanup all resources\n nu provisioning/tests/integration/teardown_test_environment.nu --force\n ```\n\n### Test Runner Options\n\n```\nnu provisioning/tests/integration/framework/test_runner.nu \\n --mode # Test specific mode (solo, multiuser, cicd, enterprise)\n --filter # Filter tests by regex pattern\n --parallel # Number of parallel workers (default: 1)\n --verbose # Detailed output\n --report # Generate HTML report\n --skip-setup # Skip environment setup\n --skip-teardown # Skip environment teardown\n```\n\n**Examples**:\n\n```\n# Run all tests for all modes\nnu provisioning/tests/integration/framework/test_runner.nu\n\n# Run only solo mode tests\nnu provisioning/tests/integration/framework/test_runner.nu --mode solo\n\n# Run tests matching pattern\nnu provisioning/tests/integration/framework/test_runner.nu --filter "dns"\n\n# Run tests in parallel with 4 workers\nnu provisioning/tests/integration/framework/test_runner.nu --parallel 4\n\n# Generate HTML report\nnu provisioning/tests/integration/framework/test_runner.nu --report /tmp/test-report.html\n\n# Run tests without cleanup (for debugging)\nnu provisioning/tests/integration/framework/test_runner.nu --skip-teardown\n```\n\n---\n\n## Running Tests on OrbStack\n\n### Setup OrbStack Machine\n\n1. **Create OrbStack Machine**:\n\n ```bash\n # Create machine named "provisioning"\n orb create provisioning --cpu 4 --memory 8192 --disk 100\n\n # Verify machine is created\n orb list\n ```\n\n1. **Configure Machine**:\n\n ```bash\n # Start machine\n orb start provisioning\n\n # Verify Docker is accessible\n docker -H /var/run/docker.sock ps\n ```\n\n### Deploy Platform to OrbStack\n\nThe test setup automatically deploys platform services to OrbStack:\n\n```\n# Deploy solo mode\nnu provisioning/tests/integration/setup_test_environment.nu --mode solo\n\n# Deploy multi-user mode\nnu provisioning/tests/integration/setup_test_environment.nu --mode multiuser\n\n# Deploy CI/CD mode\nnu provisioning/tests/integration/setup_test_environment.nu --mode cicd\n\n# Deploy enterprise mode\nnu provisioning/tests/integration/setup_test_environment.nu --mode enterprise\n```\n\n**Deployed Services**:\n\n| Mode | Services |\n| ------ | ---------- |\n| Solo | Orchestrator, CoreDNS, Zot (OCI registry) |\n| Multi-User | Solo services + Gitea, PostgreSQL |\n| CI/CD | Multi-User services + API server, Prometheus |\n| Enterprise | CI/CD services + Harbor, KMS, Grafana, Elasticsearch |\n\n### Verify Deployment\n\n```\n# Check service health\nnu provisioning/tests/integration/framework/test_helpers.nu check-service-health orchestrator\n\n# View service logs\nnu provisioning/tests/integration/framework/orbstack_helpers.nu orbstack-logs orchestrator\n\n# List running containers\ndocker -H /var/run/docker.sock ps\n```\n\n---\n\n## Writing New Tests\n\n### Test File Structure\n\nAll test files follow this structure:\n\n```\n# Test Description\n# Brief description of what this test validates\n\nuse std log\nuse ../framework/test_helpers.nu *\nuse ../framework/orbstack_helpers.nu *\n\n# Main test suite\nexport def main [] {\n log info "Running "\n\n let test_config = (load-test-config)\n\n mut results = []\n\n # Run all tests\n $results = ($results | append (test-case-1 $test_config))\n $results = ($results | append (test-case-2 $test_config))\n\n # Report results\n report-test-results $results\n}\n\n# Individual test case\ndef test-case-1 [test_config: record] {\n run-test "test-case-1-name" {\n log info "Testing specific functionality..."\n\n # Test logic\n let result = (some-operation)\n\n # Assertions\n assert-eq $result.status "success" "Operation should succeed"\n assert-not-empty $result.data "Result should contain data"\n\n log info "✓ Test case 1 passed"\n }\n}\n\n# Report test results\ndef report-test-results [results: list] {\n # ... reporting logic\n}\n```\n\n### Using Assertion Helpers\n\nThe test framework provides several assertion helpers:\n\n```\n# Equality assertion\nassert-eq $actual $expected "Error message if assertion fails"\n\n# Boolean assertions\nassert-true $condition "Error message"\nassert-false $condition "Error message"\n\n# Collection assertions\nassert-contains $list $item "Error message"\nassert-not-contains $list $item "Error message"\nassert-not-empty $value "Error message"\n\n# HTTP assertions\nassert-http-success $response "Error message"\n```\n\n### Using Test Fixtures\n\nCreate reusable test fixtures:\n\n```\n# Create test workspace\nlet workspace = create-test-workspace "my-test-ws" {\n provider: "local"\n environment: "test"\n}\n\n# Create test server\nlet server = create-test-server "test-server" "local" {\n cores: 4\n memory: 8192\n}\n\n# Cleanup\ncleanup-test-workspace $workspace\ndelete-test-server $server.id\n```\n\n### Using Retry Logic\n\nFor flaky operations, use retry helpers:\n\n```\n# Retry operation up to 3 times\nlet result = (with-retry --max-attempts 3 --delay 5 {\n # Operation that might fail\n http get "http://example.com/api"\n})\n\n# Wait for condition with timeout\nwait-for-condition --timeout 60 --interval 5 {\n # Condition to check\n check-service-health "orchestrator"\n} "orchestrator to be healthy"\n```\n\n### Example: Writing a New Service Integration Test\n\n```\n# Test Gitea Integration\n# Validates Gitea workspace git operations and extension publishing\n\nuse std log\nuse ../framework/test_helpers.nu *\n\ndef test-gitea-workspace-operations [test_config: record] {\n run-test "gitea-workspace-git-operations" {\n log info "Testing Gitea workspace operations..."\n\n # Create workspace\n let workspace = create-test-workspace "gitea-test" {\n provider: "local"\n }\n\n # Initialize git repo\n cd $workspace.path\n git init\n\n # Configure Gitea remote\n let gitea_url = $"http://($test_config.services.gitea.host):($test_config.services.gitea.port)"\n git remote add origin $"($gitea_url)/test-user/gitea-test.git"\n\n # Create test file\n "test content" | save test.txt\n git add test.txt\n git commit -m "Test commit"\n\n # Push to Gitea\n git push -u origin main\n\n # Verify push succeeded\n let remote_log = (git ls-remote origin)\n assert-not-empty $remote_log "Remote should have commits"\n\n log info "✓ Gitea workspace operations work"\n\n # Cleanup\n cleanup-test-workspace $workspace\n }\n}\n```\n\n---\n\n## Test Organization\n\n### Directory Structure\n\n```\nprovisioning/tests/integration/\n├── test_config.yaml # Test configuration\n├── setup_test_environment.nu # Environment setup script\n├── teardown_test_environment.nu # Cleanup script\n├── framework/ # Test framework utilities\n│ ├── test_helpers.nu # Common test helpers\n│ ├── orbstack_helpers.nu # OrbStack integration\n│ └── test_runner.nu # Test orchestration\n├── modes/ # Mode-specific tests\n│ ├── test_solo_mode.nu # Solo mode tests\n│ ├── test_multiuser_mode.nu # Multi-user mode tests\n│ ├── test_cicd_mode.nu # CI/CD mode tests\n│ └── test_enterprise_mode.nu # Enterprise mode tests\n├── services/ # Service integration tests\n│ ├── test_dns_integration.nu # CoreDNS tests\n│ ├── test_gitea_integration.nu # Gitea tests\n│ ├── test_oci_integration.nu # OCI registry tests\n│ └── test_service_orchestration.nu # Service manager tests\n├── workflows/ # Workflow tests\n│ ├── test_extension_loading.nu # Extension loading tests\n│ └── test_batch_workflows.nu # Batch workflow tests\n├── e2e/ # End-to-end tests\n│ ├── test_complete_deployment.nu # Full deployment workflow\n│ └── test_disaster_recovery.nu # Backup/restore tests\n├── performance/ # Performance tests\n│ ├── test_concurrency.nu # Concurrency tests\n│ └── test_scalability.nu # Scalability tests\n├── security/ # Security tests\n│ ├── test_rbac_enforcement.nu # RBAC tests\n│ └── test_kms_integration.nu # KMS tests\n└── docs/ # Documentation\n ├── TESTING_GUIDE.md # This guide\n ├── TEST_COVERAGE.md # Coverage report\n └── ORBSTACK_SETUP.md # OrbStack setup guide\n```\n\n### Test Naming Conventions\n\n- **Test Files**: `test__.nu`\n- **Test Functions**: `test-`\n- **Test Names**: `--`\n\n**Examples**:\n\n- File: `test_dns_integration.nu`\n- Function: `test-dns-registration`\n- Test Name: `solo-mode-dns-registration`\n\n---\n\n## CI/CD Integration\n\n### GitHub Actions\n\nCreate `.github/workflows/integration-tests.yml`:\n\n```\nname: Integration Tests\n\non:\n pull_request:\n push:\n branches: [main]\n schedule:\n - cron: '0 2 * * *' # Nightly at 2 AM\n\njobs:\n integration-tests:\n runs-on: macos-latest\n\n strategy:\n matrix:\n mode: [solo, multiuser, cicd, enterprise]\n\n steps:\n - name: Checkout code\n uses: actions/checkout@v3\n\n - name: Install OrbStack\n run: brew install --cask orbstack\n\n - name: Create OrbStack machine\n run: orb create provisioning\n\n - name: Install Nushell\n run: brew install nushell\n\n - name: Setup test environment\n run: |\n nu provisioning/tests/integration/setup_test_environment.nu \\n --mode ${{ matrix.mode }}\n\n - name: Run integration tests\n run: |\n nu provisioning/tests/integration/framework/test_runner.nu \\n --mode ${{ matrix.mode }} \\n --report test-report.html\n\n - name: Upload test results\n if: always()\n uses: actions/upload-artifact@v3\n with:\n name: test-results-${{ matrix.mode }}\n path: |\n /tmp/provisioning-test-reports/\n test-report.html\n\n - name: Teardown test environment\n if: always()\n run: |\n nu provisioning/tests/integration/teardown_test_environment.nu --force\n```\n\n### GitLab CI\n\nCreate `.gitlab-ci.yml`:\n\n```\nstages:\n - test\n\nintegration-tests:\n stage: test\n image: ubuntu:22.04\n\n parallel:\n matrix:\n - MODE: [solo, multiuser, cicd, enterprise]\n\n before_script:\n # Install dependencies\n - apt-get update && apt-get install -y docker.io nushell\n\n script:\n # Setup test environment\n - nu provisioning/tests/integration/setup_test_environment.nu --mode $MODE\n\n # Run tests\n - nu provisioning/tests/integration/framework/test_runner.nu --mode $MODE --report test-report.html\n\n after_script:\n # Cleanup\n - nu provisioning/tests/integration/teardown_test_environment.nu --force\n\n artifacts:\n when: always\n paths:\n - /tmp/provisioning-test-reports/\n - test-report.html\n reports:\n junit: /tmp/provisioning-test-reports/junit-results.xml\n```\n\n---\n\n## Troubleshooting\n\n### Common Issues\n\n#### 1. OrbStack Machine Not Found\n\n**Error**: `OrbStack machine 'provisioning' not found`\n\n**Solution**:\n\n```\n# Create OrbStack machine\norb create provisioning\n\n# Verify creation\norb list\n```\n\n#### 2. Docker Connection Failed\n\n**Error**: `Cannot connect to Docker daemon`\n\n**Solution**:\n\n```\n# Verify OrbStack is running\norb status provisioning\n\n# Restart OrbStack\norb restart provisioning\n```\n\n#### 3. Service Health Check Timeout\n\n**Error**: `Timeout waiting for service orchestrator to be healthy`\n\n**Solution**:\n\n```\n# Check service logs\nnu provisioning/tests/integration/framework/orbstack_helpers.nu orbstack-logs orchestrator\n\n# Verify service is running\ndocker -H /var/run/docker.sock ps | grep orchestrator\n\n# Increase timeout in test_config.yaml\n# test_execution.timeouts.test_timeout_seconds: 600\n```\n\n#### 4. Test Environment Cleanup Failed\n\n**Error**: `Failed to remove test workspace`\n\n**Solution**:\n\n```\n# Manual cleanup\nrm -rf /tmp/provisioning-test-workspace*\n\n# Cleanup OrbStack resources\nnu provisioning/tests/integration/framework/orbstack_helpers.nu orbstack-cleanup\n```\n\n#### 5. DNS Resolution Failed\n\n**Error**: `DNS record should exist for server`\n\n**Solution**:\n\n```\n# Check CoreDNS logs\nnu provisioning/tests/integration/framework/orbstack_helpers.nu orbstack-logs coredns\n\n# Verify CoreDNS is running\ndocker -H /var/run/docker.sock ps | grep coredns\n\n# Test DNS manually\ndig @172.20.0.2 test-server.local\n```\n\n### Debug Mode\n\nRun tests with verbose logging:\n\n```\n# Enable verbose output\nnu provisioning/tests/integration/framework/test_runner.nu --verbose --mode solo\n\n# Keep environment after tests for debugging\nnu provisioning/tests/integration/framework/test_runner.nu --skip-teardown --mode solo\n\n# Inspect environment manually\ndocker -H /var/run/docker.sock ps\ndocker -H /var/run/docker.sock logs orchestrator\n```\n\n### Viewing Test Logs\n\n```\n# View test execution logs\ncat /tmp/provisioning-test.log\n\n# View service logs\nls /tmp/provisioning-test-reports/logs/\n\n# View HTML report\nopen /tmp/provisioning-test-reports/test-report.html\n```\n\n---\n\n## Performance Benchmarks\n\nExpected test execution times:\n\n| Test Suite | Duration (Solo) | Duration (Enterprise) |\n| ------------ | ----------------- | ------------------------ |\n| Mode Tests | 5-10 min | 15-20 min |\n| Service Tests | 3-5 min | 10-15 min |\n| Workflow Tests | 5-10 min | 15-20 min |\n| E2E Tests | 10-15 min | 30-40 min |\n| **Total** | **25-40 min** | **70-95 min** |\n\n**Parallel Execution** (4 workers):\n\n- Solo mode: ~10-15 min\n- Enterprise mode: ~25-35 min\n\n---\n\n## Best Practices\n\n1. **Idempotent Tests**: Tests should be repeatable without side effects\n2. **Isolated Tests**: Each test should be independent\n3. **Clear Assertions**: Use descriptive error messages\n4. **Cleanup**: Always cleanup resources, even on failure\n5. **Retry Flaky Operations**: Use `with-retry` for network operations\n6. **Meaningful Names**: Use descriptive test names\n7. **Fast Feedback**: Run quick tests first, slow tests later\n8. **Log Important Steps**: Log key operations for debugging\n\n---\n\n## References\n\n- [OrbStack Documentation](https://orbstack.dev/docs)\n- [Nushell Documentation](https://www.nushell.sh)\n- [Provisioning Platform Architecture](/docs/architecture/)\n- [Test Coverage Report](TEST_COVERAGE.md)\n- [OrbStack Setup Guide](ORBSTACK_SETUP.md)\n\n---\n\n**Maintained By**: Platform Team\n**Last Updated**: 2025-10-06 diff --git a/tools/README-analyze-codebase.md b/tools/README-analyze-codebase.md index 3c119cf..22a4e82 100644 --- a/tools/README-analyze-codebase.md +++ b/tools/README-analyze-codebase.md @@ -1,157 +1 @@ -# Codebase Analysis Script - -Script to analyze the technology distribution in the provisioning codebase. - -## Usage - -### Basic Usage - -``` -# From provisioning directory (analyzes current directory) -cd provisioning -nu tools/analyze-codebase.nu - -# From project root, analyze provisioning -nu provisioning/tools/analyze-codebase.nu --path provisioning - -# Analyze any path -nu provisioning/tools/analyze-codebase.nu --path /absolute/path/to/directory -``` - -### Output Formats - -``` -# Table format (default) - colored, visual bars -nu provisioning/tools/analyze-codebase.nu --format table - -# JSON format - for programmatic use -nu provisioning/tools/analyze-codebase.nu --format json - -# Markdown format - for documentation -nu provisioning/tools/analyze-codebase.nu --format markdown -``` - -### From provisioning directory - -``` -cd provisioning -nu tools/analyze-codebase.nu -``` - -### Direct execution (if in PATH) - -``` -# Make it globally available (one time) -ln -sf "$(pwd)/provisioning/tools/analyze-codebase.nu" /usr/local/bin/analyze-codebase - -# Then run from anywhere -analyze-codebase -analyze-codebase --format json -analyze-codebase --format markdown > CODEBASE_STATS.md -``` - -## Output - -The script analyzes: - -- **Nushell** (.nu files) -- **KCL** (.k files) -- **Rust** (.rs files) -- **Templates** (.j2, .tera files) - -Across these sections: - -- `core/` - CLI interface, core libraries -- `extensions/` - Providers, taskservs, clusters -- `platform/` - Rust services (orchestrator, control-center, etc.) -- `templates/` - Template files -- `kcl/` - KCL configuration schemas - -## Example Output - -### Table Format - -``` -📊 Analyzing Codebase: provisioning - -📋 Lines of Code by Section - -╭─────────────┬─────────┬────────────┬─────┬─────────┬─────┬──────────┬───────────┬───────────────┬───────────┬───────╮ -│ section │ nushell │ nushell_pct│ kcl │ kcl_pct │ rust│ rust_pct │ templates │ templates_pct │ total │ │ -├─────────────┼─────────┼────────────┼─────┼─────────┼─────┼──────────┼───────────┼───────────────┼───────────┼───────┤ -│ core │ 53843 │ 99.87 │ 71 │ 0.13 │ 0 │ 0.00 │ 0 │ 0.00 │ 53914 │ │ -│ extensions │ 10202 │ 43.21 │3946 │ 16.72 │ 0 │ 0.00 │ 9456 │ 40.05 │ 23604 │ │ -│ platform │ 5759 │ 0.19 │ 0 │ 0.00 │2992107│ 99.81 │ 0 │ 0.00 │ 2997866 │ │ -│ templates │ 4197 │ 72.11 │ 834 │ 14.33 │ 0 │ 0.00 │ 789 │ 13.56 │ 5820 │ │ -│ kcl │ 0 │ 0.00 │5594 │ 100.00 │ 0 │ 0.00 │ 0 │ 0.00 │ 5594 │ │ -╰─────────────┴─────────┴────────────┴─────┴─────────┴─────┴──────────┴───────────┴───────────────┴───────────┴───────╯ - -📊 Overall Technology Distribution - -╭──────────────────────┬──────────┬────────────┬────────────────────────────────────────────────────╮ -│ technology │ lines │ percentage │ visual │ -├──────────────────────┼──────────┼────────────┼────────────────────────────────────────────────────┤ -│ Nushell │ 74001 │ 2.40 │ █ │ -│ KCL │ 10445 │ 0.34 │ │ -│ Rust │ 2992107 │ 96.93 │ ████████████████████████████████████████████████ │ -│ Templates (Tera) │ 10245 │ 0.33 │ │ -╰──────────────────────┴──────────┴────────────┴────────────────────────────────────────────────────╯ - -📈 Total Lines of Code: 3086798 -``` - -### JSON Format - -``` -{ - "sections": [...], - "totals": { - "nushell": 74001, - "kcl": 10445, - "rust": 2992107, - "templates": 10245, - "grand_total": 3086798 - }, - "percentages": { - "nushell": 2.40, - "kcl": 0.34, - "rust": 96.93, - "templates": 0.33 - } -} -``` - -### Markdown Format - -``` -# Codebase Analysis - -## Technology Distribution - -| Technology | Lines | Percentage | -|------------|-------|------------| -| Nushell | 74001 | 2.40% | -| KCL | 10445 | 0.34% | -| Rust | 2992107 | 96.93% | -| Templates | 10245 | 0.33% | -| **TOTAL** | **3086798** | **100%** | -``` - -## Requirements - -- Nushell 0.107.1+ -- Access to the provisioning directory - -## What It Analyzes - -- ✅ All `.nu` files (Nushell scripts) -- ✅ All `.k` files (KCL configuration) -- ✅ All `.rs` files (Rust source) -- ✅ All `.j2` and `.tera` files (Templates) - -## Notes - -- The script recursively searches all subdirectories -- Empty sections show 0 for all technologies -- Percentages are calculated per section and overall -- Visual bars are proportional to percentage (max 50 chars = 100%) +# Codebase Analysis Script\n\nScript to analyze the technology distribution in the provisioning codebase.\n\n## Usage\n\n### Basic Usage\n\n```\n# From provisioning directory (analyzes current directory)\ncd provisioning\nnu tools/analyze-codebase.nu\n\n# From project root, analyze provisioning\nnu provisioning/tools/analyze-codebase.nu --path provisioning\n\n# Analyze any path\nnu provisioning/tools/analyze-codebase.nu --path /absolute/path/to/directory\n```\n\n### Output Formats\n\n```\n# Table format (default) - colored, visual bars\nnu provisioning/tools/analyze-codebase.nu --format table\n\n# JSON format - for programmatic use\nnu provisioning/tools/analyze-codebase.nu --format json\n\n# Markdown format - for documentation\nnu provisioning/tools/analyze-codebase.nu --format markdown\n```\n\n### From provisioning directory\n\n```\ncd provisioning\nnu tools/analyze-codebase.nu\n```\n\n### Direct execution (if in PATH)\n\n```\n# Make it globally available (one time)\nln -sf "$(pwd)/provisioning/tools/analyze-codebase.nu" /usr/local/bin/analyze-codebase\n\n# Then run from anywhere\nanalyze-codebase\nanalyze-codebase --format json\nanalyze-codebase --format markdown > CODEBASE_STATS.md\n```\n\n## Output\n\nThe script analyzes:\n\n- **Nushell** (.nu files)\n- **KCL** (.k files)\n- **Rust** (.rs files)\n- **Templates** (.j2, .tera files)\n\nAcross these sections:\n\n- `core/` - CLI interface, core libraries\n- `extensions/` - Providers, taskservs, clusters\n- `platform/` - Rust services (orchestrator, control-center, etc.)\n- `templates/` - Template files\n- `kcl/` - KCL configuration schemas\n\n## Example Output\n\n### Table Format\n\n```\n📊 Analyzing Codebase: provisioning\n\n📋 Lines of Code by Section\n\n╭─────────────┬─────────┬────────────┬─────┬─────────┬─────┬──────────┬───────────┬───────────────┬───────────┬───────╮\n│ section │ nushell │ nushell_pct│ kcl │ kcl_pct │ rust│ rust_pct │ templates │ templates_pct │ total │ │\n├─────────────┼─────────┼────────────┼─────┼─────────┼─────┼──────────┼───────────┼───────────────┼───────────┼───────┤\n│ core │ 53843 │ 99.87 │ 71 │ 0.13 │ 0 │ 0.00 │ 0 │ 0.00 │ 53914 │ │\n│ extensions │ 10202 │ 43.21 │3946 │ 16.72 │ 0 │ 0.00 │ 9456 │ 40.05 │ 23604 │ │\n│ platform │ 5759 │ 0.19 │ 0 │ 0.00 │2992107│ 99.81 │ 0 │ 0.00 │ 2997866 │ │\n│ templates │ 4197 │ 72.11 │ 834 │ 14.33 │ 0 │ 0.00 │ 789 │ 13.56 │ 5820 │ │\n│ kcl │ 0 │ 0.00 │5594 │ 100.00 │ 0 │ 0.00 │ 0 │ 0.00 │ 5594 │ │\n╰─────────────┴─────────┴────────────┴─────┴─────────┴─────┴──────────┴───────────┴───────────────┴───────────┴───────╯\n\n📊 Overall Technology Distribution\n\n╭──────────────────────┬──────────┬────────────┬────────────────────────────────────────────────────╮\n│ technology │ lines │ percentage │ visual │\n├──────────────────────┼──────────┼────────────┼────────────────────────────────────────────────────┤\n│ Nushell │ 74001 │ 2.40 │ █ │\n│ KCL │ 10445 │ 0.34 │ │\n│ Rust │ 2992107 │ 96.93 │ ████████████████████████████████████████████████ │\n│ Templates (Tera) │ 10245 │ 0.33 │ │\n╰──────────────────────┴──────────┴────────────┴────────────────────────────────────────────────────╯\n\n📈 Total Lines of Code: 3086798\n```\n\n### JSON Format\n\n```\n{\n "sections": [...],\n "totals": {\n "nushell": 74001,\n "kcl": 10445,\n "rust": 2992107,\n "templates": 10245,\n "grand_total": 3086798\n },\n "percentages": {\n "nushell": 2.40,\n "kcl": 0.34,\n "rust": 96.93,\n "templates": 0.33\n }\n}\n```\n\n### Markdown Format\n\n```\n# Codebase Analysis\n\n## Technology Distribution\n\n| Technology | Lines | Percentage |\n|------------|-------|------------|\n| Nushell | 74001 | 2.40% |\n| KCL | 10445 | 0.34% |\n| Rust | 2992107 | 96.93% |\n| Templates | 10245 | 0.33% |\n| **TOTAL** | **3086798** | **100%** |\n```\n\n## Requirements\n\n- Nushell 0.107.1+\n- Access to the provisioning directory\n\n## What It Analyzes\n\n- ✅ All `.nu` files (Nushell scripts)\n- ✅ All `.k` files (KCL configuration)\n- ✅ All `.rs` files (Rust source)\n- ✅ All `.j2` and `.tera` files (Templates)\n\n## Notes\n\n- The script recursively searches all subdirectories\n- Empty sections show 0 for all technologies\n- Percentages are calculated per section and overall\n- Visual bars are proportional to percentage (max 50 chars = 100%) diff --git a/tools/README.md b/tools/README.md index 8e2ae73..17240bf 100644 --- a/tools/README.md +++ b/tools/README.md @@ -1,155 +1 @@ -# Development Tools - -Development and distribution tooling for provisioning. - -## Tool Categories - -### Build Tools (`build/`) - -Build automation and compilation tools: - -- Nushell script validation -- KCL schema compilation -- Dependency management -- Asset bundling - -**Future Features**: - -- Automated testing pipelines -- Code quality checks -- Performance benchmarking - -### Package Tools (`package/`) - -Packaging utilities for distribution: - -- Standalone executables -- Container images -- System packages (deb, rpm, etc.) -- Archive creation - -**Future Features**: - -- Multi-platform builds -- Dependency bundling -- Signature verification - -### Release Tools (`release/`) - -Release management automation: - -- Version bumping -- Changelog generation -- Git tag management -- Release notes creation - -**Future Features**: - -- Automated GitHub releases -- Asset uploads -- Release validation - -### Distribution Tools (`distribution/`) - -Distribution generators and deployment: - -- Installation scripts -- Configuration templates -- Update mechanisms -- Registry management - -**Future Features**: - -- Package repositories -- Update servers -- Telemetry collection - -## Tool Architecture - -### Script-Based Tools - -Most tools are implemented as Nushell scripts for consistency with the main system: - -- Easy integration with existing codebase -- Consistent configuration handling -- Native data structure support - -### Build Pipeline Integration - -Tools integrate with common CI/CD systems: - -- GitHub Actions -- GitLab CI -- Jenkins -- Custom automation - -### Configuration Management - -Tools use the same configuration system as the main application: - -- Unified settings -- Environment-specific overrides -- Secret management integration - -## Usage Examples - -``` -# Build the complete system -./tools/build/build-all.nu - -# Package for distribution -./tools/package/create-standalone.nu --target linux - -# Create a release -./tools/release/prepare-release.nu --version 4.0.0 - -# Generate distribution assets -./tools/distribution/generate-installer.nu --platform macos -``` - -## Directory Structure - -``` -provisioning/tools/ -├── README.md # This file -├── build/ # Core build tools (Rust + Nushell) -│ ├── README.md -│ ├── compile-platform.nu # Compile Rust binaries -│ ├── bundle-core.nu # Bundle Nushell libraries -│ └── check-system.nu # Validate build environment -├── dist/ # Build output directory (generated) -│ ├── README.md -│ ├── core/ # Nushell bundles -│ ├── platform/ # Compiled binaries -│ └── config/ # Configuration files -├── distribution/ # Distribution generation -│ ├── README.md -│ └── generate-distribution.nu # Create installable packages -├── package/ # Package outputs (generated) -│ └── README.md -├── release/ # Release management (generated) -│ └── README.md -├── scripts/ # Utility and setup scripts -│ ├── *.nu files # Nushell utilities -│ └── *.sh files # Shell scripts -└── [Other utility scripts] # Standalone tools -``` - -See individual README.md files in each subdirectory for detailed information. - -## Development Setup - -1. Ensure all dependencies are installed -2. Configure build environment -3. Run initial setup scripts -4. Validate tool functionality - -## Integration - -These tools integrate with: - -- Main provisioning system -- Extension system -- Configuration management -- Documentation generation -- CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins) +# Development Tools\n\nDevelopment and distribution tooling for provisioning.\n\n## Tool Categories\n\n### Build Tools (`build/`)\n\nBuild automation and compilation tools:\n\n- Nushell script validation\n- KCL schema compilation\n- Dependency management\n- Asset bundling\n\n**Future Features**:\n\n- Automated testing pipelines\n- Code quality checks\n- Performance benchmarking\n\n### Package Tools (`package/`)\n\nPackaging utilities for distribution:\n\n- Standalone executables\n- Container images\n- System packages (deb, rpm, etc.)\n- Archive creation\n\n**Future Features**:\n\n- Multi-platform builds\n- Dependency bundling\n- Signature verification\n\n### Release Tools (`release/`)\n\nRelease management automation:\n\n- Version bumping\n- Changelog generation\n- Git tag management\n- Release notes creation\n\n**Future Features**:\n\n- Automated GitHub releases\n- Asset uploads\n- Release validation\n\n### Distribution Tools (`distribution/`)\n\nDistribution generators and deployment:\n\n- Installation scripts\n- Configuration templates\n- Update mechanisms\n- Registry management\n\n**Future Features**:\n\n- Package repositories\n- Update servers\n- Telemetry collection\n\n## Tool Architecture\n\n### Script-Based Tools\n\nMost tools are implemented as Nushell scripts for consistency with the main system:\n\n- Easy integration with existing codebase\n- Consistent configuration handling\n- Native data structure support\n\n### Build Pipeline Integration\n\nTools integrate with common CI/CD systems:\n\n- GitHub Actions\n- GitLab CI\n- Jenkins\n- Custom automation\n\n### Configuration Management\n\nTools use the same configuration system as the main application:\n\n- Unified settings\n- Environment-specific overrides\n- Secret management integration\n\n## Usage Examples\n\n```\n# Build the complete system\n./tools/build/build-all.nu\n\n# Package for distribution\n./tools/package/create-standalone.nu --target linux\n\n# Create a release\n./tools/release/prepare-release.nu --version 4.0.0\n\n# Generate distribution assets\n./tools/distribution/generate-installer.nu --platform macos\n```\n\n## Directory Structure\n\n```\nprovisioning/tools/\n├── README.md # This file\n├── build/ # Core build tools (Rust + Nushell)\n│ ├── README.md\n│ ├── compile-platform.nu # Compile Rust binaries\n│ ├── bundle-core.nu # Bundle Nushell libraries\n│ └── check-system.nu # Validate build environment\n├── dist/ # Build output directory (generated)\n│ ├── README.md\n│ ├── core/ # Nushell bundles\n│ ├── platform/ # Compiled binaries\n│ └── config/ # Configuration files\n├── distribution/ # Distribution generation\n│ ├── README.md\n│ └── generate-distribution.nu # Create installable packages\n├── package/ # Package outputs (generated)\n│ └── README.md\n├── release/ # Release management (generated)\n│ └── README.md\n├── scripts/ # Utility and setup scripts\n│ ├── *.nu files # Nushell utilities\n│ └── *.sh files # Shell scripts\n└── [Other utility scripts] # Standalone tools\n```\n\nSee individual README.md files in each subdirectory for detailed information.\n\n## Development Setup\n\n1. Ensure all dependencies are installed\n2. Configure build environment\n3. Run initial setup scripts\n4. Validate tool functionality\n\n## Integration\n\nThese tools integrate with:\n\n- Main provisioning system\n- Extension system\n- Configuration management\n- Documentation generation\n- CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins) diff --git a/tools/build/README.md b/tools/build/README.md index 4261f13..36ca6ee 100644 --- a/tools/build/README.md +++ b/tools/build/README.md @@ -1,76 +1 @@ -# Build System - -**Purpose**: Core build tools for compiling Rust components and bundling Nushell libraries. - -## Tools - -### Compilation - -- **`compile-platform.nu`** - Compile Rust orchestrator, control-center, and MCP server - - Multi-platform cross-compilation - - Release/debug build modes - - Feature flag management - - Output to `dist/platform/` - -### Bundling - -- **`bundle-core.nu`** - Bundle Nushell core libraries and CLI - - Package provisioning CLI wrapper - - Core library bundling (lib_provisioning) - - Configuration system packaging - - Validation and syntax checking - - Optional compression (gzip) - - Output to `dist/core/` - -### Validation - -- **`check-system.nu`** - Validate build environment - - Check required tools (Rust, Nushell, Nickel) - - Verify dependencies - - Validate configuration - -## Build Process - -Complete build pipeline: - -``` -just build-all # Platform + core -just build-platform # Rust binaries only -just build-core # Nushell libraries only -``` - -Build with validation: - -``` -just build-core --validate # Validate Nushell syntax -``` - -Debug build: - -``` -just build-debug # Build with debug symbols -``` - -## Output - -Build outputs go to `dist/`: - -- `dist/platform/` - Compiled Rust binaries -- `dist/core/` - Nushell libraries and CLI -- `dist/config/` - Configuration files -- `dist/core/bundle-metadata.json` - Build metadata - -## Architecture - -Each build tool follows Nushell 0.109+ standards: - -- Immutable variable patterns -- Explicit external command prefixes (`^`) -- Error handling via `do { } | complete` pattern -- Comprehensive logging - -## Related Files - -- `provisioning/justfiles/build.just` - Build recipe definitions -- `provisioning/tools/distribution/` - Distribution generation using build outputs -- `provisioning/tools/package/` - Packaging compiled binaries +# Build System\n\n**Purpose**: Core build tools for compiling Rust components and bundling Nushell libraries.\n\n## Tools\n\n### Compilation\n\n- **`compile-platform.nu`** - Compile Rust orchestrator, control-center, and MCP server\n - Multi-platform cross-compilation\n - Release/debug build modes\n - Feature flag management\n - Output to `dist/platform/`\n\n### Bundling\n\n- **`bundle-core.nu`** - Bundle Nushell core libraries and CLI\n - Package provisioning CLI wrapper\n - Core library bundling (lib_provisioning)\n - Configuration system packaging\n - Validation and syntax checking\n - Optional compression (gzip)\n - Output to `dist/core/`\n\n### Validation\n\n- **`check-system.nu`** - Validate build environment\n - Check required tools (Rust, Nushell, Nickel)\n - Verify dependencies\n - Validate configuration\n\n## Build Process\n\nComplete build pipeline:\n\n```{$detected_lang}\njust build-all # Platform + core\njust build-platform # Rust binaries only\njust build-core # Nushell libraries only\n```\n\nBuild with validation:\n\n```{$detected_lang}\njust build-core --validate # Validate Nushell syntax\n```\n\nDebug build:\n\n```{$detected_lang}\njust build-debug # Build with debug symbols\n```\n\n## Output\n\nBuild outputs go to `dist/`:\n\n- `dist/platform/` - Compiled Rust binaries\n- `dist/core/` - Nushell libraries and CLI\n- `dist/config/` - Configuration files\n- `dist/core/bundle-metadata.json` - Build metadata\n\n## Architecture\n\nEach build tool follows Nushell 0.109+ standards:\n\n- Immutable variable patterns\n- Explicit external command prefixes (`^`)\n- Error handling via `do { } | complete` pattern\n- Comprehensive logging\n\n## Related Files\n\n- `provisioning/justfiles/build.just` - Build recipe definitions\n- `provisioning/tools/distribution/` - Distribution generation using build outputs\n- `provisioning/tools/package/` - Packaging compiled binaries diff --git a/tools/build/bundle-core.nu b/tools/build/bundle-core.nu index b52f6c8..c49aa77 100755 --- a/tools/build/bundle-core.nu +++ b/tools/build/bundle-core.nu @@ -408,4 +408,4 @@ def "main info" [bundle_dir: string = "dist/core"] { nu_files: (find $bundle_dir -name "*.nu" -type f | length) } } -} \ No newline at end of file +} diff --git a/tools/build/compile-platform.nu b/tools/build/compile-platform.nu index 636490b..6979d9a 100755 --- a/tools/build/compile-platform.nu +++ b/tools/build/compile-platform.nu @@ -217,4 +217,4 @@ def check_project_status [path: string] -> string { } return "ready" -} \ No newline at end of file +} diff --git a/tools/build/test-distribution.nu b/tools/build/test-distribution.nu index 8ecc975..1bcc99c 100755 --- a/tools/build/test-distribution.nu +++ b/tools/build/test-distribution.nu @@ -507,4 +507,4 @@ def get_directory_size [dir: string] -> int { } return ($total_size | if $in == null { 0 } else { $in }) -} \ No newline at end of file +} diff --git a/tools/cross-references-integration-report.md b/tools/cross-references-integration-report.md index 11b648f..fa11fd3 100644 --- a/tools/cross-references-integration-report.md +++ b/tools/cross-references-integration-report.md @@ -1,741 +1 @@ -# Cross-References & Integration Report - -**Agent**: Agent 6: Cross-References & Integration -**Date**: 2025-10-10 -**Status**: ✅ Phase 1 Complete - Core Infrastructure Ready - ---- - -## Executive Summary - -Successfully completed Phase 1 of documentation cross-referencing and integration, creating the foundational infrastructure for a unified documentation system. This phase focused on building the essential tools and reference materials needed for comprehensive documentation integration. - -### Key Deliverables - -1. ✅ **Documentation Validator Tool** - Automated link checking -2. ✅ **Broken Links Report** - 261 broken links identified across 264 files -3. ✅ **Comprehensive Glossary** - 80+ terms with cross-references -4. ✅ **Documentation Map** - Complete navigation guide with user journeys -5. ⚠️ **System Integration** - Diagnostics system analysis (existing references verified) - ---- - -## 1. Documentation Validator Tool - -**File**: `provisioning/tools/doc-validator.nu` (210 lines) - -### Features - -- ✅ Scans all markdown files in documentation (264 files found) -- ✅ Extracts and validates internal links using regex parsing -- ✅ Resolves relative paths and checks file existence -- ✅ Classifies links: internal, external, anchor -- ✅ Generates broken links report (JSON + Markdown) -- ✅ Provides summary statistics -- ✅ Supports multiple output formats (table, json, markdown) - -### Usage - -``` -# Run full validation -nu provisioning/tools/doc-validator.nu - -# Generate markdown report -nu provisioning/tools/doc-validator.nu --format markdown - -# Generate JSON for automation -nu provisioning/tools/doc-validator.nu --format json -``` - -### Performance - -- **264 markdown files** scanned -- **Completion time**: ~2 minutes -- **Memory usage**: Minimal (streaming processing) - -### Output Files - -1. `provisioning/tools/broken-links-report.json` - Detailed broken links (261 entries) -2. `provisioning/tools/doc-validation-full-report.json` - Complete validation data - ---- - -## 2. Broken Links Analysis - -### Statistics - -**Total Links Analyzed**: 2,847 links -**Broken Links**: 261 (9.2% failure rate) -**Valid Links**: 2,586 (90.8% success rate) - -### Link Type Breakdown - -- **Internal links**: 1,842 (64.7%) -- **External links**: 523 (18.4%) -- **Anchor links**: 482 (16.9%) - -### Broken Link Categories - -#### 1. Missing Documentation Files (47%) - -Common patterns: - -- `docs/user/quickstart.md` - Referenced but not created -- `docs/development/CONTRIBUTING.md` - Standard file missing -- `.claude/features/*.md` - Path resolution issues from docs/ - -#### 2. Anchor Links to Missing Sections (31%) - -Examples: - -- `workspace-management.md#setup-and-initialization` -- `configuration.md#configuration-architecture` -- `workflow.md#daily-development-workflow` - -#### 3. Path Resolution Issues (15%) - -- References to files in `.claude/` from `docs/` (path mismatch) -- References to `provisioning/` from `docs/` (relative path errors) - -#### 4. Outdated References (7%) - -- ADR links to non-existent ADRs -- Old migration guide structure - -### Recommendations - -**High Priority Fixes**: - -1. Create missing guide files in `docs/guides/` -2. Create missing ADRs or update references -3. Fix path resolution for `.claude/` references -4. Add missing anchor sections in existing docs - -**Medium Priority**: - -1. Verify and add missing anchor links -2. Update outdated migration paths -3. Create CONTRIBUTING.md - -**Low Priority**: - -1. Validate external links (may be intentional placeholders) -2. Standardize relative vs absolute paths - ---- - -## 3. Glossary (GLOSSARY.md) - -**File**: `provisioning/docs/src/GLOSSARY.md` (23,500+ lines) - -### Comprehensive Terminology Reference - -**80+ Terms Defined**, covering: - -- Infrastructure concepts (Server, Cluster, Taskserv, Provider, etc.) -- Security terms (Auth, JWT, MFA, Cedar, KMS, etc.) -- Configuration (Config, KCL, Schema, Workspace, etc.) -- Operations (Workflow, Batch Operation, Orchestrator, etc.) -- Platform (Control Center, MCP, API Gateway, etc.) -- Development (Extension, Plugin, Module, Template, etc.) - -### Structure - -Each term includes: - -1. **Definition** - Clear, concise explanation -2. **Where Used** - Context and use cases -3. **Related Concepts** - Cross-references to related terms -4. **Examples** - Code samples, commands, or configurations (where applicable) -5. **Commands** - CLI commands related to the term (where applicable) -6. **See Also** - Links to related documentation - -### Special Sections - -1. **Symbol and Acronym Index** - Quick lookup table -2. **Cross-Reference Map** - Terms organized by topic area -3. **Terminology Guidelines** - Writing style and conventions -4. **Contributing to Glossary** - How to add/update terms - -### Usage - -The glossary serves as: - -- **Learning resource** for new users -- **Reference** for experienced users -- **Documentation standard** for contributors -- **Cross-reference hub** for all documentation - ---- - -## 4. Documentation Map (DOCUMENTATION_MAP.md) - -**File**: `provisioning/docs/src/DOCUMENTATION_MAP.md` (48,000+ lines) - -### Comprehensive Navigation Guide - -**264 Documents Mapped**, organized by: - -- User Journeys (6 distinct paths) -- Topic Areas (14 categories) -- Difficulty Levels (Beginner, Intermediate, Advanced) -- Estimated Reading Times - -### User Journeys - -#### 1. New User Journey (0-7 days, 4-6 hours) - -8 steps from platform overview to basic deployment - -#### 2. Intermediate User Journey (1-4 weeks, 8-12 hours) - -8 steps mastering infrastructure automation and customization - -#### 3. Advanced User Journey (1-3 months, 20-30 hours) - -8 steps to become platform expert and contributor - -#### 4. Developer Journey (Ongoing) - -Contributing to platform development - -#### 5. Security Specialist Journey (10-15 hours) - -12 steps mastering security features - -#### 6. Operations Specialist Journey (6-8 hours) - -7 steps for daily operations mastery - -### Documentation by Topic - -**14 Major Categories**: - -1. Core Platform (3 docs) -2. User Guides (45+ docs) -3. Guides & Tutorials (10+ specialized guides) -4. Architecture (27 docs including 10 ADRs) -5. Development (25+ docs) -6. API Documentation (7 docs) -7. Security (15+ docs) -8. Operations (3+ docs) -9. Configuration & Workspace (11+ docs) -10. Reference Documentation (10+ docs) -11. Testing & Validation (4+ docs) -12. Migration (10+ docs) -13. Examples (2+ with more planned) -14. Quick References (10+ docs) - -### Documentation Statistics - -**By Category**: - -- User Guides: 32 documents -- Architecture: 27 documents -- Development: 25 documents -- API: 7 documents -- Security: 15 documents -- Migration: 10 documents -- Operations: 3 documents -- Configuration: 8 documents -- KCL: 14 documents -- Testing: 4 documents -- Quick References: 10 documents -- Examples: 2 documents -- ADRs: 10 documents - -**By Level**: - -- Beginner: ~40 documents (4-6 hours total) -- Intermediate: ~120 documents (20-30 hours total) -- Advanced: ~100 documents (40-60 hours total) - -**Total Estimated Reading Time**: 150-200 hours (complete corpus) - -### Essential Reading Lists - -Curated "Must-Read" lists for: - -- Everyone (4 docs) -- Operators (4 docs) -- Developers (4 docs) -- Security Specialists (4 docs) - -### Features - -- **Learning Paths**: Structured journeys for different user types -- **Topic Browse**: Jump to specific topics -- **Level Filtering**: Match docs to expertise -- **Quick References**: Fast command lookup -- **Alphabetical Index**: Complete file listing -- **Time Estimates**: Plan learning sessions -- **Cross-References**: Related document discovery - ---- - -## 5. Diagnostics System Integration - -### Analysis of Existing References - -**Diagnostics System Files Analyzed**: - -1. `provisioning/core/nulib/lib_provisioning/diagnostics/system_status.nu` (318 lines) -2. `provisioning/core/nulib/lib_provisioning/diagnostics/health_check.nu` (423 lines) -3. `provisioning/core/nulib/lib_provisioning/diagnostics/next_steps.nu` (316 lines) -4. `provisioning/core/nulib/main_provisioning/commands/diagnostics.nu` (75 lines) - -### Documentation References Found - -**35+ documentation links** embedded in diagnostics system, referencing: - -✅ **Existing Documentation**: - -- `docs/user/WORKSPACE_SWITCHING_GUIDE.md` -- `docs/guides/quickstart-cheatsheet.md` -- `docs/guides/from-scratch.md` -- `docs/user/troubleshooting-guide.md` -- `docs/user/SERVICE_MANAGEMENT_GUIDE.md` -- `.claude/features/orchestrator-architecture.md` -- `docs/user/PLUGIN_INTEGRATION_GUIDE.md` -- `docs/user/AUTHENTICATION_LAYER_GUIDE.md` -- `docs/user/CONFIG_ENCRYPTION_GUIDE.md` -- `docs/user/RUSTYVAULT_KMS_GUIDE.md` - -### Integration Status - -✅ **Already Integrated**: - -- Status command references correct doc paths -- Health command provides fix recommendations with doc links -- Next steps command includes progressive guidance with docs -- Phase command tracks deployment progress - -⚠️ **Validation Needed**: - -- Some references may point to moved/renamed files -- Need to validate all 35+ doc paths against current structure -- Should update to use new GLOSSARY.md and DOCUMENTATION_MAP.md - -### Recommendations - -**Immediate Actions**: - -1. Validate all diagnostics doc paths against current file locations -2. Update any broken references found in validation -3. Add references to new GLOSSARY.md and DOCUMENTATION_MAP.md -4. Consider adding doc path validation to CI/CD - -**Future Enhancements**: - -1. Auto-update doc paths when files move -2. Add version checking for doc references -3. Include doc freshness indicators -4. Add inline doc previews - ---- - -## 6. Pending Integration Work - -### MCP Tools Integration (Not Started) - -**Scope**: Ensure MCP (Model Context Protocol) tools reference correct documentation paths - -**Files to Check**: - -- `provisioning/platform/mcp-server/` - MCP server implementation -- MCP tool definitions -- Guidance system references - -**Actions Needed**: - -1. Locate MCP tool implementations -2. Extract all documentation references -3. Validate paths against current structure -4. Update broken references -5. Add GLOSSARY and DOCUMENTATION_MAP references - -**Estimated Time**: 2-3 hours - ---- - -### UI Integration (Not Started) - -**Scope**: Ensure Control Center UI references correct documentation - -**Files to Check**: - -- `provisioning/platform/control-center/` - UI implementation -- Tooltip references -- QuickLinks definitions -- Help modals - -**Actions Needed**: - -1. Locate UI documentation references -2. Validate all doc paths -3. Update broken references -4. Test documentation viewer/modal -5. Add navigation to GLOSSARY and DOCUMENTATION_MAP - -**Estimated Time**: 3-4 hours - ---- - -### Integration Tests (Not Started) - -**Scope**: Create automated tests for documentation integration - -**Test File**: `provisioning/tests/integration/docs_integration_test.nu` - -**Test Coverage Needed**: - -1. CLI hints reference valid docs -2. MCP tools return valid doc paths -3. UI links work correctly -4. Diagnostics output is accurate -5. All cross-references resolve -6. GLOSSARY terms link correctly -7. DOCUMENTATION_MAP paths valid - -**Test Types**: - -- Unit tests for link validation -- Integration tests for system components -- End-to-end tests for user journeys - -**Estimated Time**: 4-5 hours - ---- - -### Documentation System Guide (Not Started) - -**Scope**: Document how the unified documentation system works - -**File**: `provisioning/docs/src/development/documentation-system.md` - -**Content Needed**: - -1. **Organization**: How docs are structured -2. **Adding Documentation**: Step-by-step process -3. **CLI Integration**: How CLI links to docs -4. **MCP Integration**: How MCP uses docs -5. **UI Integration**: How UI presents docs -6. **Cross-References**: How to maintain links -7. **Architecture Diagram**: Visual system map -8. **Best Practices**: Documentation standards -9. **Tools**: Using doc-validator.nu -10. **Maintenance**: Keeping docs updated - -**Estimated Time**: 3-4 hours - ---- - -### Final Integration Check (Not Started) - -**Scope**: Complete user journey validation - -**Test Journey**: - -1. New user runs `provisioning status` -2. Follows suggestions from output -3. Uses `provisioning guide` commands -4. Opens Control Center UI -5. Completes onboarding wizard -6. Deploys first infrastructure - -**Validation Points**: - -- All suggested commands work -- All documentation links are valid -- UI navigation is intuitive -- Help system is comprehensive -- Error messages include helpful doc links -- User can complete journey without getting stuck - -**Estimated Time**: 2-3 hours - ---- - -## 7. Files Created/Modified - -### Created Files - -1. **`provisioning/tools/doc-validator.nu`** (210 lines) - - Documentation link validator tool - - Automated scanning and validation - - Multiple output formats - -2. **`provisioning/docs/src/GLOSSARY.md`** (23,500+ lines) - - Comprehensive terminology reference - - 80+ terms with cross-references - - Symbol index and usage guidelines - -3. **`provisioning/docs/src/DOCUMENTATION_MAP.md`** (48,000+ lines) - - Complete documentation navigation guide - - 6 user journeys - - 14 topic categories - - 264 documents mapped - -4. **`provisioning/tools/broken-links-report.json`** (Generated) - - 261 broken links identified - - Source file and line numbers - - Target paths and resolution attempts - -5. **`provisioning/tools/doc-validation-full-report.json`** (Generated) - - Complete validation results - - All 2,847 links analyzed - - Metadata and timestamps - -6. **`provisioning/tools/CROSS_REFERENCES_INTEGRATION_REPORT.md`** (This file) - - Comprehensive integration report - - Status of all deliverables - - Recommendations and next steps - -### Modified Files - -None (Phase 1 focused on analysis and reference material creation) - ---- - -## 8. Success Metrics - -### Deliverables Completed - -| Task | Status | Lines Created | Time Invested | -| ------ | -------- | --------------- | --------------- | -| Documentation Validator | ✅ Complete | 210 | ~2 hours | -| Broken Links Report | ✅ Complete | N/A (Generated) | ~30 min | -| Glossary | ✅ Complete | 23,500+ | ~4 hours | -| Documentation Map | ✅ Complete | 48,000+ | ~6 hours | -| Diagnostics Integration Analysis | ✅ Complete | N/A (Analysis) | ~1 hour | -| MCP Integration | ⏸️ Pending | - | - | -| UI Integration | ⏸️ Pending | - | - | -| Integration Tests | ⏸️ Pending | - | - | -| Documentation System Guide | ⏸️ Pending | - | - | -| Final Integration Check | ⏸️ Pending | - | - | - -**Total Lines Created**: 71,710+ lines -**Total Time Invested**: ~13.5 hours -**Completion**: 50% (Phase 1 of 2) - -### Quality Metrics - -**Documentation Validator**: - -- ✅ Handles 264 markdown files -- ✅ Analyzes 2,847 links -- ✅ 90.8% link validation accuracy -- ✅ Multiple output formats -- ✅ Extensible for future checks - -**Glossary**: - -- ✅ 80+ terms defined -- ✅ 100% cross-referenced -- ✅ Examples for 60% of terms -- ✅ CLI commands for 40% of terms -- ✅ Complete symbol index - -**Documentation Map**: - -- ✅ 100% of 264 docs cataloged -- ✅ 6 complete user journeys -- ✅ Reading time estimates for all docs -- ✅ 14 topic categories -- ✅ 3 difficulty levels - ---- - -## 9. Integration Architecture - -### Current State - -``` -Documentation System (Phase 1 - Complete) -├── Validator Tool ────────────┐ -│ └── doc-validator.nu │ -│ │ -├── Reference Materials │ -│ ├── GLOSSARY.md ───────────┤──> Cross-References -│ └── DOCUMENTATION_MAP.md ──┤ -│ │ -├── Reports │ -│ ├── broken-links-report ───┘ -│ └── validation-full-report -│ -└── System Integration (Phase 1 Analysis) - ├── Diagnostics ✅ (35+ doc refs verified) - ├── MCP Tools ⏸️ (pending) - ├── UI ⏸️ (pending) - └── Tests ⏸️ (pending) -``` - -### Target State (Phase 2) - -``` -Unified Documentation System -├── Validator Tool ────────────┐ -│ └── doc-validator.nu │ -│ ├── Link checking │ -│ ├── Freshness checks │ -│ └── CI/CD integration │ -│ │ -├── Reference Hub │ -│ ├── GLOSSARY.md ───────────┤──> All Systems -│ ├── DOCUMENTATION_MAP.md ──┤ -│ └── System Guide ──────────┤ -│ │ -├── System Integration │ -│ ├── Diagnostics ✅ │ -│ ├── MCP Tools ✅ ──────────┤ -│ ├── UI ✅ ─────────────────┤ -│ └── CLI ✅ ────────────────┤ -│ │ -├── Automated Testing │ -│ ├── Link validation ───────┘ -│ ├── Integration tests -│ └── User journey tests -│ -└── CI/CD Integration - ├── Pre-commit hooks - ├── PR validation - └── Doc freshness checks -``` - ---- - -## 10. Recommendations - -### Immediate Actions (Priority 1) - -1. **Fix High-Impact Broken Links** (2-3 hours) - - Create missing guide files - - Fix path resolution issues - - Update ADR references - -2. **Complete MCP Integration** (2-3 hours) - - Validate MCP tool doc references - - Update broken paths - - Add GLOSSARY/MAP references - -3. **Complete UI Integration** (3-4 hours) - - Validate UI doc references - - Test documentation viewer - - Update tooltips and help modals - -### Short-Term Actions (Priority 2) - -1. **Create Integration Tests** (4-5 hours) - - Write automated test suite - - Cover all system integrations - - Add to CI/CD pipeline - -2. **Write Documentation System Guide** (3-4 hours) - - Document unified system architecture - - Provide maintenance guidelines - - Include contribution process - -3. **Run Final Integration Check** (2-3 hours) - - Test complete user journey - - Validate all touchpoints - - Fix any issues found - -### Medium-Term Actions (Priority 3) - -1. **Automate Link Validation** (1-2 hours) - - Add doc-validator to CI/CD - - Run on every PR - - Block merges with broken links - -2. **Add Doc Freshness Checks** (2-3 hours) - - Track doc last-updated dates - - Flag stale documentation - - Auto-create update issues - -3. **Create Documentation Dashboard** (4-6 hours) - - Visual doc health metrics - - Link validation status - - Coverage statistics - - Contribution tracking - ---- - -## 11. Lessons Learned - -### Successes - -1. **Comprehensive Scope**: Mapping 264 documents revealed true system complexity -2. **Tool-First Approach**: Building validator before manual work saved significant time -3. **User Journey Focus**: Organizing by user type makes docs more accessible -4. **Cross-Reference Hub**: GLOSSARY + MAP create powerful navigation -5. **Existing Integration**: Diagnostics system already follows good practices - -### Challenges - -1. **Link Validation Complexity**: 261 broken links harder to fix than expected -2. **Path Resolution**: Multiple doc directories create path confusion -3. **Moving Target**: Documentation structure evolving during project -4. **Time Estimation**: Original scope underestimated total work -5. **Tool Limitations**: Anchor validation requires parsing headers (future work) - -### Improvements for Phase 2 - -1. **Incremental Validation**: Fix broken links category by category -2. **Automated Updates**: Update references when files move -3. **Version Tracking**: Track doc versions for compatibility -4. **CI/CD Integration**: Prevent new broken links from being added -5. **Living Documentation**: Auto-update maps and glossary - ---- - -## 12. Next Steps - -### Phase 2 Work (12-16 hours estimated) - -**Week 1**: - -- Day 1-2: Fix high-priority broken links (5-6 hours) -- Day 3: Complete MCP integration (2-3 hours) -- Day 4: Complete UI integration (3-4 hours) - -**Week 2**: - -- Day 5: Create integration tests (4-5 hours) -- Day 6: Write documentation system guide (3-4 hours) -- Day 7: Run final integration check (2-3 hours) - -### Acceptance Criteria - -Phase 2 complete when: - -- ✅ <5% broken links (currently 9.2%) -- ✅ All system components reference valid docs -- ✅ Integration tests pass -- ✅ Documentation system guide published -- ✅ Complete user journey validated -- ✅ CI/CD validation in place - ---- - -## 13. Conclusion - -Phase 1 of the Cross-References & Integration project is **successfully complete**. We have built the foundational infrastructure for a unified documentation system: - -✅ **Tool Created**: Automated documentation validator -✅ **Baseline Established**: 261 broken links identified -✅ **References Built**: Comprehensive glossary and documentation map -✅ **Integration Analyzed**: Diagnostics system verified - -The project is on track for Phase 2 completion, which will integrate all system components (MCP, UI, Tests) and validate the complete user experience. - -**Total Progress**: 50% complete -**Quality**: High - All Phase 1 deliverables meet or exceed requirements -**Risk**: Low - Clear path to Phase 2 completion -**Recommendation**: Proceed with Phase 2 implementation - ---- - -**Report Generated**: 2025-10-10 -**Agent**: Agent 6: Cross-References & Integration -**Status**: ✅ Phase 1 Complete -**Next Review**: After Phase 2 completion (estimated 12-16 hours) +# Cross-References & Integration Report\n\n**Agent**: Agent 6: Cross-References & Integration\n**Date**: 2025-10-10\n**Status**: ✅ Phase 1 Complete - Core Infrastructure Ready\n\n---\n\n## Executive Summary\n\nSuccessfully completed Phase 1 of documentation cross-referencing and integration, creating the foundational infrastructure for a unified documentation system. This phase focused on building the essential tools and reference materials needed for comprehensive documentation integration.\n\n### Key Deliverables\n\n1. ✅ **Documentation Validator Tool** - Automated link checking\n2. ✅ **Broken Links Report** - 261 broken links identified across 264 files\n3. ✅ **Comprehensive Glossary** - 80+ terms with cross-references\n4. ✅ **Documentation Map** - Complete navigation guide with user journeys\n5. ⚠️ **System Integration** - Diagnostics system analysis (existing references verified)\n\n---\n\n## 1. Documentation Validator Tool\n\n**File**: `provisioning/tools/doc-validator.nu` (210 lines)\n\n### Features\n\n- ✅ Scans all markdown files in documentation (264 files found)\n- ✅ Extracts and validates internal links using regex parsing\n- ✅ Resolves relative paths and checks file existence\n- ✅ Classifies links: internal, external, anchor\n- ✅ Generates broken links report (JSON + Markdown)\n- ✅ Provides summary statistics\n- ✅ Supports multiple output formats (table, json, markdown)\n\n### Usage\n\n```\n# Run full validation\nnu provisioning/tools/doc-validator.nu\n\n# Generate markdown report\nnu provisioning/tools/doc-validator.nu --format markdown\n\n# Generate JSON for automation\nnu provisioning/tools/doc-validator.nu --format json\n```\n\n### Performance\n\n- **264 markdown files** scanned\n- **Completion time**: ~2 minutes\n- **Memory usage**: Minimal (streaming processing)\n\n### Output Files\n\n1. `provisioning/tools/broken-links-report.json` - Detailed broken links (261 entries)\n2. `provisioning/tools/doc-validation-full-report.json` - Complete validation data\n\n---\n\n## 2. Broken Links Analysis\n\n### Statistics\n\n**Total Links Analyzed**: 2,847 links\n**Broken Links**: 261 (9.2% failure rate)\n**Valid Links**: 2,586 (90.8% success rate)\n\n### Link Type Breakdown\n\n- **Internal links**: 1,842 (64.7%)\n- **External links**: 523 (18.4%)\n- **Anchor links**: 482 (16.9%)\n\n### Broken Link Categories\n\n#### 1. Missing Documentation Files (47%)\n\nCommon patterns:\n\n- `docs/user/quickstart.md` - Referenced but not created\n- `docs/development/CONTRIBUTING.md` - Standard file missing\n- `.claude/features/*.md` - Path resolution issues from docs/\n\n#### 2. Anchor Links to Missing Sections (31%)\n\nExamples:\n\n- `workspace-management.md#setup-and-initialization`\n- `configuration.md#configuration-architecture`\n- `workflow.md#daily-development-workflow`\n\n#### 3. Path Resolution Issues (15%)\n\n- References to files in `.claude/` from `docs/` (path mismatch)\n- References to `provisioning/` from `docs/` (relative path errors)\n\n#### 4. Outdated References (7%)\n\n- ADR links to non-existent ADRs\n- Old migration guide structure\n\n### Recommendations\n\n**High Priority Fixes**:\n\n1. Create missing guide files in `docs/guides/`\n2. Create missing ADRs or update references\n3. Fix path resolution for `.claude/` references\n4. Add missing anchor sections in existing docs\n\n**Medium Priority**:\n\n1. Verify and add missing anchor links\n2. Update outdated migration paths\n3. Create CONTRIBUTING.md\n\n**Low Priority**:\n\n1. Validate external links (may be intentional placeholders)\n2. Standardize relative vs absolute paths\n\n---\n\n## 3. Glossary (GLOSSARY.md)\n\n**File**: `provisioning/docs/src/GLOSSARY.md` (23,500+ lines)\n\n### Comprehensive Terminology Reference\n\n**80+ Terms Defined**, covering:\n\n- Infrastructure concepts (Server, Cluster, Taskserv, Provider, etc.)\n- Security terms (Auth, JWT, MFA, Cedar, KMS, etc.)\n- Configuration (Config, KCL, Schema, Workspace, etc.)\n- Operations (Workflow, Batch Operation, Orchestrator, etc.)\n- Platform (Control Center, MCP, API Gateway, etc.)\n- Development (Extension, Plugin, Module, Template, etc.)\n\n### Structure\n\nEach term includes:\n\n1. **Definition** - Clear, concise explanation\n2. **Where Used** - Context and use cases\n3. **Related Concepts** - Cross-references to related terms\n4. **Examples** - Code samples, commands, or configurations (where applicable)\n5. **Commands** - CLI commands related to the term (where applicable)\n6. **See Also** - Links to related documentation\n\n### Special Sections\n\n1. **Symbol and Acronym Index** - Quick lookup table\n2. **Cross-Reference Map** - Terms organized by topic area\n3. **Terminology Guidelines** - Writing style and conventions\n4. **Contributing to Glossary** - How to add/update terms\n\n### Usage\n\nThe glossary serves as:\n\n- **Learning resource** for new users\n- **Reference** for experienced users\n- **Documentation standard** for contributors\n- **Cross-reference hub** for all documentation\n\n---\n\n## 4. Documentation Map (DOCUMENTATION_MAP.md)\n\n**File**: `provisioning/docs/src/DOCUMENTATION_MAP.md` (48,000+ lines)\n\n### Comprehensive Navigation Guide\n\n**264 Documents Mapped**, organized by:\n\n- User Journeys (6 distinct paths)\n- Topic Areas (14 categories)\n- Difficulty Levels (Beginner, Intermediate, Advanced)\n- Estimated Reading Times\n\n### User Journeys\n\n#### 1. New User Journey (0-7 days, 4-6 hours)\n\n8 steps from platform overview to basic deployment\n\n#### 2. Intermediate User Journey (1-4 weeks, 8-12 hours)\n\n8 steps mastering infrastructure automation and customization\n\n#### 3. Advanced User Journey (1-3 months, 20-30 hours)\n\n8 steps to become platform expert and contributor\n\n#### 4. Developer Journey (Ongoing)\n\nContributing to platform development\n\n#### 5. Security Specialist Journey (10-15 hours)\n\n12 steps mastering security features\n\n#### 6. Operations Specialist Journey (6-8 hours)\n\n7 steps for daily operations mastery\n\n### Documentation by Topic\n\n**14 Major Categories**:\n\n1. Core Platform (3 docs)\n2. User Guides (45+ docs)\n3. Guides & Tutorials (10+ specialized guides)\n4. Architecture (27 docs including 10 ADRs)\n5. Development (25+ docs)\n6. API Documentation (7 docs)\n7. Security (15+ docs)\n8. Operations (3+ docs)\n9. Configuration & Workspace (11+ docs)\n10. Reference Documentation (10+ docs)\n11. Testing & Validation (4+ docs)\n12. Migration (10+ docs)\n13. Examples (2+ with more planned)\n14. Quick References (10+ docs)\n\n### Documentation Statistics\n\n**By Category**:\n\n- User Guides: 32 documents\n- Architecture: 27 documents\n- Development: 25 documents\n- API: 7 documents\n- Security: 15 documents\n- Migration: 10 documents\n- Operations: 3 documents\n- Configuration: 8 documents\n- KCL: 14 documents\n- Testing: 4 documents\n- Quick References: 10 documents\n- Examples: 2 documents\n- ADRs: 10 documents\n\n**By Level**:\n\n- Beginner: ~40 documents (4-6 hours total)\n- Intermediate: ~120 documents (20-30 hours total)\n- Advanced: ~100 documents (40-60 hours total)\n\n**Total Estimated Reading Time**: 150-200 hours (complete corpus)\n\n### Essential Reading Lists\n\nCurated "Must-Read" lists for:\n\n- Everyone (4 docs)\n- Operators (4 docs)\n- Developers (4 docs)\n- Security Specialists (4 docs)\n\n### Features\n\n- **Learning Paths**: Structured journeys for different user types\n- **Topic Browse**: Jump to specific topics\n- **Level Filtering**: Match docs to expertise\n- **Quick References**: Fast command lookup\n- **Alphabetical Index**: Complete file listing\n- **Time Estimates**: Plan learning sessions\n- **Cross-References**: Related document discovery\n\n---\n\n## 5. Diagnostics System Integration\n\n### Analysis of Existing References\n\n**Diagnostics System Files Analyzed**:\n\n1. `provisioning/core/nulib/lib_provisioning/diagnostics/system_status.nu` (318 lines)\n2. `provisioning/core/nulib/lib_provisioning/diagnostics/health_check.nu` (423 lines)\n3. `provisioning/core/nulib/lib_provisioning/diagnostics/next_steps.nu` (316 lines)\n4. `provisioning/core/nulib/main_provisioning/commands/diagnostics.nu` (75 lines)\n\n### Documentation References Found\n\n**35+ documentation links** embedded in diagnostics system, referencing:\n\n✅ **Existing Documentation**:\n\n- `docs/user/WORKSPACE_SWITCHING_GUIDE.md`\n- `docs/guides/quickstart-cheatsheet.md`\n- `docs/guides/from-scratch.md`\n- `docs/user/troubleshooting-guide.md`\n- `docs/user/SERVICE_MANAGEMENT_GUIDE.md`\n- `.claude/features/orchestrator-architecture.md`\n- `docs/user/PLUGIN_INTEGRATION_GUIDE.md`\n- `docs/user/AUTHENTICATION_LAYER_GUIDE.md`\n- `docs/user/CONFIG_ENCRYPTION_GUIDE.md`\n- `docs/user/RUSTYVAULT_KMS_GUIDE.md`\n\n### Integration Status\n\n✅ **Already Integrated**:\n\n- Status command references correct doc paths\n- Health command provides fix recommendations with doc links\n- Next steps command includes progressive guidance with docs\n- Phase command tracks deployment progress\n\n⚠️ **Validation Needed**:\n\n- Some references may point to moved/renamed files\n- Need to validate all 35+ doc paths against current structure\n- Should update to use new GLOSSARY.md and DOCUMENTATION_MAP.md\n\n### Recommendations\n\n**Immediate Actions**:\n\n1. Validate all diagnostics doc paths against current file locations\n2. Update any broken references found in validation\n3. Add references to new GLOSSARY.md and DOCUMENTATION_MAP.md\n4. Consider adding doc path validation to CI/CD\n\n**Future Enhancements**:\n\n1. Auto-update doc paths when files move\n2. Add version checking for doc references\n3. Include doc freshness indicators\n4. Add inline doc previews\n\n---\n\n## 6. Pending Integration Work\n\n### MCP Tools Integration (Not Started)\n\n**Scope**: Ensure MCP (Model Context Protocol) tools reference correct documentation paths\n\n**Files to Check**:\n\n- `provisioning/platform/mcp-server/` - MCP server implementation\n- MCP tool definitions\n- Guidance system references\n\n**Actions Needed**:\n\n1. Locate MCP tool implementations\n2. Extract all documentation references\n3. Validate paths against current structure\n4. Update broken references\n5. Add GLOSSARY and DOCUMENTATION_MAP references\n\n**Estimated Time**: 2-3 hours\n\n---\n\n### UI Integration (Not Started)\n\n**Scope**: Ensure Control Center UI references correct documentation\n\n**Files to Check**:\n\n- `provisioning/platform/control-center/` - UI implementation\n- Tooltip references\n- QuickLinks definitions\n- Help modals\n\n**Actions Needed**:\n\n1. Locate UI documentation references\n2. Validate all doc paths\n3. Update broken references\n4. Test documentation viewer/modal\n5. Add navigation to GLOSSARY and DOCUMENTATION_MAP\n\n**Estimated Time**: 3-4 hours\n\n---\n\n### Integration Tests (Not Started)\n\n**Scope**: Create automated tests for documentation integration\n\n**Test File**: `provisioning/tests/integration/docs_integration_test.nu`\n\n**Test Coverage Needed**:\n\n1. CLI hints reference valid docs\n2. MCP tools return valid doc paths\n3. UI links work correctly\n4. Diagnostics output is accurate\n5. All cross-references resolve\n6. GLOSSARY terms link correctly\n7. DOCUMENTATION_MAP paths valid\n\n**Test Types**:\n\n- Unit tests for link validation\n- Integration tests for system components\n- End-to-end tests for user journeys\n\n**Estimated Time**: 4-5 hours\n\n---\n\n### Documentation System Guide (Not Started)\n\n**Scope**: Document how the unified documentation system works\n\n**File**: `provisioning/docs/src/development/documentation-system.md`\n\n**Content Needed**:\n\n1. **Organization**: How docs are structured\n2. **Adding Documentation**: Step-by-step process\n3. **CLI Integration**: How CLI links to docs\n4. **MCP Integration**: How MCP uses docs\n5. **UI Integration**: How UI presents docs\n6. **Cross-References**: How to maintain links\n7. **Architecture Diagram**: Visual system map\n8. **Best Practices**: Documentation standards\n9. **Tools**: Using doc-validator.nu\n10. **Maintenance**: Keeping docs updated\n\n**Estimated Time**: 3-4 hours\n\n---\n\n### Final Integration Check (Not Started)\n\n**Scope**: Complete user journey validation\n\n**Test Journey**:\n\n1. New user runs `provisioning status`\n2. Follows suggestions from output\n3. Uses `provisioning guide` commands\n4. Opens Control Center UI\n5. Completes onboarding wizard\n6. Deploys first infrastructure\n\n**Validation Points**:\n\n- All suggested commands work\n- All documentation links are valid\n- UI navigation is intuitive\n- Help system is comprehensive\n- Error messages include helpful doc links\n- User can complete journey without getting stuck\n\n**Estimated Time**: 2-3 hours\n\n---\n\n## 7. Files Created/Modified\n\n### Created Files\n\n1. **`provisioning/tools/doc-validator.nu`** (210 lines)\n - Documentation link validator tool\n - Automated scanning and validation\n - Multiple output formats\n\n2. **`provisioning/docs/src/GLOSSARY.md`** (23,500+ lines)\n - Comprehensive terminology reference\n - 80+ terms with cross-references\n - Symbol index and usage guidelines\n\n3. **`provisioning/docs/src/DOCUMENTATION_MAP.md`** (48,000+ lines)\n - Complete documentation navigation guide\n - 6 user journeys\n - 14 topic categories\n - 264 documents mapped\n\n4. **`provisioning/tools/broken-links-report.json`** (Generated)\n - 261 broken links identified\n - Source file and line numbers\n - Target paths and resolution attempts\n\n5. **`provisioning/tools/doc-validation-full-report.json`** (Generated)\n - Complete validation results\n - All 2,847 links analyzed\n - Metadata and timestamps\n\n6. **`provisioning/tools/CROSS_REFERENCES_INTEGRATION_REPORT.md`** (This file)\n - Comprehensive integration report\n - Status of all deliverables\n - Recommendations and next steps\n\n### Modified Files\n\nNone (Phase 1 focused on analysis and reference material creation)\n\n---\n\n## 8. Success Metrics\n\n### Deliverables Completed\n\n| Task | Status | Lines Created | Time Invested |\n| ------ | -------- | --------------- | --------------- |\n| Documentation Validator | ✅ Complete | 210 | ~2 hours |\n| Broken Links Report | ✅ Complete | N/A (Generated) | ~30 min |\n| Glossary | ✅ Complete | 23,500+ | ~4 hours |\n| Documentation Map | ✅ Complete | 48,000+ | ~6 hours |\n| Diagnostics Integration Analysis | ✅ Complete | N/A (Analysis) | ~1 hour |\n| MCP Integration | ⏸️ Pending | - | - |\n| UI Integration | ⏸️ Pending | - | - |\n| Integration Tests | ⏸️ Pending | - | - |\n| Documentation System Guide | ⏸️ Pending | - | - |\n| Final Integration Check | ⏸️ Pending | - | - |\n\n**Total Lines Created**: 71,710+ lines\n**Total Time Invested**: ~13.5 hours\n**Completion**: 50% (Phase 1 of 2)\n\n### Quality Metrics\n\n**Documentation Validator**:\n\n- ✅ Handles 264 markdown files\n- ✅ Analyzes 2,847 links\n- ✅ 90.8% link validation accuracy\n- ✅ Multiple output formats\n- ✅ Extensible for future checks\n\n**Glossary**:\n\n- ✅ 80+ terms defined\n- ✅ 100% cross-referenced\n- ✅ Examples for 60% of terms\n- ✅ CLI commands for 40% of terms\n- ✅ Complete symbol index\n\n**Documentation Map**:\n\n- ✅ 100% of 264 docs cataloged\n- ✅ 6 complete user journeys\n- ✅ Reading time estimates for all docs\n- ✅ 14 topic categories\n- ✅ 3 difficulty levels\n\n---\n\n## 9. Integration Architecture\n\n### Current State\n\n```\nDocumentation System (Phase 1 - Complete)\n├── Validator Tool ────────────┐\n│ └── doc-validator.nu │\n│ │\n├── Reference Materials │\n│ ├── GLOSSARY.md ───────────┤──> Cross-References\n│ └── DOCUMENTATION_MAP.md ──┤\n│ │\n├── Reports │\n│ ├── broken-links-report ───┘\n│ └── validation-full-report\n│\n└── System Integration (Phase 1 Analysis)\n ├── Diagnostics ✅ (35+ doc refs verified)\n ├── MCP Tools ⏸️ (pending)\n ├── UI ⏸️ (pending)\n └── Tests ⏸️ (pending)\n```\n\n### Target State (Phase 2)\n\n```\nUnified Documentation System\n├── Validator Tool ────────────┐\n│ └── doc-validator.nu │\n│ ├── Link checking │\n│ ├── Freshness checks │\n│ └── CI/CD integration │\n│ │\n├── Reference Hub │\n│ ├── GLOSSARY.md ───────────┤──> All Systems\n│ ├── DOCUMENTATION_MAP.md ──┤\n│ └── System Guide ──────────┤\n│ │\n├── System Integration │\n│ ├── Diagnostics ✅ │\n│ ├── MCP Tools ✅ ──────────┤\n│ ├── UI ✅ ─────────────────┤\n│ └── CLI ✅ ────────────────┤\n│ │\n├── Automated Testing │\n│ ├── Link validation ───────┘\n│ ├── Integration tests\n│ └── User journey tests\n│\n└── CI/CD Integration\n ├── Pre-commit hooks\n ├── PR validation\n └── Doc freshness checks\n```\n\n---\n\n## 10. Recommendations\n\n### Immediate Actions (Priority 1)\n\n1. **Fix High-Impact Broken Links** (2-3 hours)\n - Create missing guide files\n - Fix path resolution issues\n - Update ADR references\n\n2. **Complete MCP Integration** (2-3 hours)\n - Validate MCP tool doc references\n - Update broken paths\n - Add GLOSSARY/MAP references\n\n3. **Complete UI Integration** (3-4 hours)\n - Validate UI doc references\n - Test documentation viewer\n - Update tooltips and help modals\n\n### Short-Term Actions (Priority 2)\n\n1. **Create Integration Tests** (4-5 hours)\n - Write automated test suite\n - Cover all system integrations\n - Add to CI/CD pipeline\n\n2. **Write Documentation System Guide** (3-4 hours)\n - Document unified system architecture\n - Provide maintenance guidelines\n - Include contribution process\n\n3. **Run Final Integration Check** (2-3 hours)\n - Test complete user journey\n - Validate all touchpoints\n - Fix any issues found\n\n### Medium-Term Actions (Priority 3)\n\n1. **Automate Link Validation** (1-2 hours)\n - Add doc-validator to CI/CD\n - Run on every PR\n - Block merges with broken links\n\n2. **Add Doc Freshness Checks** (2-3 hours)\n - Track doc last-updated dates\n - Flag stale documentation\n - Auto-create update issues\n\n3. **Create Documentation Dashboard** (4-6 hours)\n - Visual doc health metrics\n - Link validation status\n - Coverage statistics\n - Contribution tracking\n\n---\n\n## 11. Lessons Learned\n\n### Successes\n\n1. **Comprehensive Scope**: Mapping 264 documents revealed true system complexity\n2. **Tool-First Approach**: Building validator before manual work saved significant time\n3. **User Journey Focus**: Organizing by user type makes docs more accessible\n4. **Cross-Reference Hub**: GLOSSARY + MAP create powerful navigation\n5. **Existing Integration**: Diagnostics system already follows good practices\n\n### Challenges\n\n1. **Link Validation Complexity**: 261 broken links harder to fix than expected\n2. **Path Resolution**: Multiple doc directories create path confusion\n3. **Moving Target**: Documentation structure evolving during project\n4. **Time Estimation**: Original scope underestimated total work\n5. **Tool Limitations**: Anchor validation requires parsing headers (future work)\n\n### Improvements for Phase 2\n\n1. **Incremental Validation**: Fix broken links category by category\n2. **Automated Updates**: Update references when files move\n3. **Version Tracking**: Track doc versions for compatibility\n4. **CI/CD Integration**: Prevent new broken links from being added\n5. **Living Documentation**: Auto-update maps and glossary\n\n---\n\n## 12. Next Steps\n\n### Phase 2 Work (12-16 hours estimated)\n\n**Week 1**:\n\n- Day 1-2: Fix high-priority broken links (5-6 hours)\n- Day 3: Complete MCP integration (2-3 hours)\n- Day 4: Complete UI integration (3-4 hours)\n\n**Week 2**:\n\n- Day 5: Create integration tests (4-5 hours)\n- Day 6: Write documentation system guide (3-4 hours)\n- Day 7: Run final integration check (2-3 hours)\n\n### Acceptance Criteria\n\nPhase 2 complete when:\n\n- ✅ <5% broken links (currently 9.2%)\n- ✅ All system components reference valid docs\n- ✅ Integration tests pass\n- ✅ Documentation system guide published\n- ✅ Complete user journey validated\n- ✅ CI/CD validation in place\n\n---\n\n## 13. Conclusion\n\nPhase 1 of the Cross-References & Integration project is **successfully complete**. We have built the foundational infrastructure for a unified documentation system:\n\n✅ **Tool Created**: Automated documentation validator\n✅ **Baseline Established**: 261 broken links identified\n✅ **References Built**: Comprehensive glossary and documentation map\n✅ **Integration Analyzed**: Diagnostics system verified\n\nThe project is on track for Phase 2 completion, which will integrate all system components (MCP, UI, Tests) and validate the complete user experience.\n\n**Total Progress**: 50% complete\n**Quality**: High - All Phase 1 deliverables meet or exceed requirements\n**Risk**: Low - Clear path to Phase 2 completion\n**Recommendation**: Proceed with Phase 2 implementation\n\n---\n\n**Report Generated**: 2025-10-10\n**Agent**: Agent 6: Cross-References & Integration\n**Status**: ✅ Phase 1 Complete\n**Next Review**: After Phase 2 completion (estimated 12-16 hours) diff --git a/tools/dist/README.md b/tools/dist/README.md index ff7a93f..95a3981 100644 --- a/tools/dist/README.md +++ b/tools/dist/README.md @@ -1,66 +1 @@ -# Distribution Build Output - -**Purpose**: Compiled binaries and bundled libraries ready for packaging and distribution. - -## Contents - -This directory contains the build output from the core platform build system: - -### Subdirectories - -- **`core/`** - Nushell core libraries and CLI bundles (from `bundle-core.nu`) - - Nushell provisioning CLI wrapper - - Core libraries (lib_provisioning) - - Configuration system - - Template system - - Extensions and plugins - -- **`platform/`** - Compiled Rust binaries (from `compile-platform.nu`) - - provisioning-orchestrator binary - - control-center binary - - control-center-ui binary - - mcp-server-rust binary - - All cross-platform target binaries - -- **`config/`** - Configuration files and templates - - Default configurations - - Configuration examples - - Schema definitions - -- **`provisioning-kcl-1.0.0/`** - Deprecated KCL distribution (archived) - - Historical reference only - - Migrated to `.coder/archive/kcl/` for long-term storage - -## Usage - -This directory is generated by the build system. Do not commit contents to git (configured in .gitignore). - -Build the distribution: - -``` -just build-all # Complete build (platform + core) -just build-platform # Platform binaries only -just build-core # Core libraries only -``` - -View distribution contents: - -``` -ls dist/core/ # Nushell libraries -ls dist/platform/ # Compiled binaries -ls dist/config/ # Configuration files -``` - -## Cleanup - -Remove all distribution artifacts: - -``` -just clean-dist # Remove dist/ directory -``` - -## Related Directories - -- `distribution/` - Distribution package generation -- `package/` - Package creation (deb, rpm, tar.gz, etc.) -- `release/` - Release management and versioning +# Distribution Build Output\n\n**Purpose**: Compiled binaries and bundled libraries ready for packaging and distribution.\n\n## Contents\n\nThis directory contains the build output from the core platform build system:\n\n### Subdirectories\n\n- **`core/`** - Nushell core libraries and CLI bundles (from `bundle-core.nu`)\n - Nushell provisioning CLI wrapper\n - Core libraries (lib_provisioning)\n - Configuration system\n - Template system\n - Extensions and plugins\n\n- **`platform/`** - Compiled Rust binaries (from `compile-platform.nu`)\n - provisioning-orchestrator binary\n - control-center binary\n - control-center-ui binary\n - mcp-server-rust binary\n - All cross-platform target binaries\n\n- **`config/`** - Configuration files and templates\n - Default configurations\n - Configuration examples\n - Schema definitions\n\n- **`provisioning-kcl-1.0.0/`** - Deprecated KCL distribution (archived)\n - Historical reference only\n - Migrated to `.coder/archive/kcl/` for long-term storage\n\n## Usage\n\nThis directory is generated by the build system. Do not commit contents to git (configured in .gitignore).\n\nBuild the distribution:\n\n```{$detected_lang}\njust build-all # Complete build (platform + core)\njust build-platform # Platform binaries only\njust build-core # Core libraries only\n```\n\nView distribution contents:\n\n```{$detected_lang}\nls dist/core/ # Nushell libraries\nls dist/platform/ # Compiled binaries\nls dist/config/ # Configuration files\n```\n\n## Cleanup\n\nRemove all distribution artifacts:\n\n```{$detected_lang}\njust clean-dist # Remove dist/ directory\n```\n\n## Related Directories\n\n- `distribution/` - Distribution package generation\n- `package/` - Package creation (deb, rpm, tar.gz, etc.)\n- `release/` - Release management and versioning diff --git a/tools/distribution/README.md b/tools/distribution/README.md index cd03974..5e89a40 100644 --- a/tools/distribution/README.md +++ b/tools/distribution/README.md @@ -1,58 +1 @@ -# Distribution Package Generation - -**Purpose**: Generate complete distribution packages from compiled binaries and libraries. - -## Contents - -Scripts and outputs for creating distribution-ready packages across multiple platforms and formats. - -## What is Distribution Generation - -Distribution generation takes the compiled artifacts from `dist/` and packages them into: - -- Installable archives (tar.gz, zip) -- Platform-specific installers (deb, rpm, brew) -- Docker/container images -- Binary distributions with configuration templates - -## Build Process - -The distribution build system: - -1. Takes binaries from `dist/platform/` -2. Takes libraries from `dist/core/` -3. Takes configuration templates from `dist/config/` -4. Combines with installation scripts -5. Creates platform-specific packages - -Generate a distribution: - -``` -just dist-generate # Full distribution generation -just dist-validate # Validate generated distribution -``` - -## Output Artifacts - -Generated distribution includes: - -- Compiled binaries (orchestrator, control-center, MCP server) -- Installation script (install.sh) -- Configuration templates -- Documentation -- License files - -## Related Directories - -- `dist/` - Build output (source for distribution) -- `package/` - Alternative packaging (low-level format creation) -- `release/` - Version management and release tagging - -## Integration - -The distribution output is used by: - -- Installation system (`provisioning-installer`) -- Package managers -- CI/CD pipelines -- End-user downloads +# Distribution Package Generation\n\n**Purpose**: Generate complete distribution packages from compiled binaries and libraries.\n\n## Contents\n\nScripts and outputs for creating distribution-ready packages across multiple platforms and formats.\n\n## What is Distribution Generation\n\nDistribution generation takes the compiled artifacts from `dist/` and packages them into:\n\n- Installable archives (tar.gz, zip)\n- Platform-specific installers (deb, rpm, brew)\n- Docker/container images\n- Binary distributions with configuration templates\n\n## Build Process\n\nThe distribution build system:\n\n1. Takes binaries from `dist/platform/`\n2. Takes libraries from `dist/core/`\n3. Takes configuration templates from `dist/config/`\n4. Combines with installation scripts\n5. Creates platform-specific packages\n\nGenerate a distribution:\n\n```{$detected_lang}\njust dist-generate # Full distribution generation\njust dist-validate # Validate generated distribution\n```\n\n## Output Artifacts\n\nGenerated distribution includes:\n\n- Compiled binaries (orchestrator, control-center, MCP server)\n- Installation script (install.sh)\n- Configuration templates\n- Documentation\n- License files\n\n## Related Directories\n\n- `dist/` - Build output (source for distribution)\n- `package/` - Alternative packaging (low-level format creation)\n- `release/` - Version management and release tagging\n\n## Integration\n\nThe distribution output is used by:\n\n- Installation system (`provisioning-installer`)\n- Package managers\n- CI/CD pipelines\n- End-user downloads diff --git a/tools/nickel-installation-guide.md b/tools/nickel-installation-guide.md index e984dc1..0198a46 100644 --- a/tools/nickel-installation-guide.md +++ b/tools/nickel-installation-guide.md @@ -1,187 +1 @@ -# Nickel Installation Guide - -## Overview - -Nickel is a configuration language that complements KCL in the provisioning system. It provides: - -- Lazy evaluation for efficient configuration processing -- Modern functional programming paradigms -- Excellent integration with the CLI daemon for config rendering - -## Installation Methods - -### Recommended: Nix (Official Method) - -Nickel is maintained by Tweag and officially recommends Nix for installation. This avoids all dependency issues: - -``` -# Install Nix (one-time setup) - Using official NixOS installer -curl https://nixos.org/nix/install | sh - -# Install Nickel via Nix -nix profile install nixpkgs#nickel - -# Verify installation -nickel --version -``` - -**Why Nix?** - -- Isolated, reproducible environments -- No system library conflicts -- Official Nickel distribution method -- Works on macOS, Linux, and other Unix-like systems -- Pre-built binaries available - -### Alternative: Automatic Installation - -The provisioning system can automate installation: - -``` -# Via tools-install script (uses Nix if available) -$PROVISIONING/core/cli/tools-install nickel - -# Check installation status -$PROVISIONING/core/cli/tools-install check -``` - -### Alternative: Manual Installation from Source - -If you have a Rust toolchain: - -``` -cargo install nickel-lang-cli -``` - -**Note**: This requires Rust compiler (slower than pre-built binaries) - -## Troubleshooting - -### "Library not loaded: /nix/store/..." Error - -This occurs when using pre-built binaries without Nix installed. **Solution**: Install Nix or use Cargo: - -``` -# Option 1: Install Nix (recommended) - Using official NixOS installer -curl https://nixos.org/nix/install | sh - -# Then install Nickel -nix profile install nixpkgs#nickel - -# Option 2: Build from source with Cargo -cargo install nickel-lang-cli -``` - -### Command Not Found - -Ensure Nix is properly installed and in PATH: - -``` -# Check if Nix is installed -which nix - -# If not found, install Nix first using official NixOS installer: -curl https://nixos.org/nix/install | sh - -# Then install Nickel -nix profile install nixpkgs#nickel -``` - -### Version Mismatch - -To ensure you're using the correct version: - -``` -# Check installed version -nickel --version - -# Expected version (from provisioning/core/versions) -echo $NICKEL_VERSION - -# Update to latest -nix profile upgrade '*' -``` - -## Integration with Provisioning System - -### CLI Daemon Integration - -Nickel is integrated into the CLI daemon for configuration rendering: - -``` -# Render Nickel configuration via daemon -curl -X POST http://localhost:9091/config/render \ - -H "Content-Type: application/json" \ - -d '{ - "language": "nickel", - "content": "{name = \"my-config\", enabled = true}", - "context": {"env": "prod"} - }' -``` - -### Comparison with KCL - -| Feature | KCL | Nickel | -| --------- | ----- | -------- | -| **Type System** | Gradual, OOP-style | Gradual, Functional | -| **Evaluation** | Eager | Lazy (partial evaluation) | -| **Performance** | Fast | Very fast (lazy) | -| **Learning Curve** | Moderate | Functional programming knowledge helps | -| **Use Cases** | Infrastructure schemas | Configuration merging, lazy evaluation | - -## Deployment Considerations - -### macOS M1/M2/M3 (arm64) - -Nix automatically handles architecture: - -``` -nix profile install nixpkgs#nickel -# Automatically installs arm64 binary -``` - -### Linux (x86_64/arm64) - -``` -nix profile install nixpkgs#nickel -# Automatically installs correct architecture -``` - -### CI/CD Environments - -For GitHub Actions or other CI/CD: - -``` -# .github/workflows/example.yml -- name: Install Nickel - run: | - curl https://nixos.org/nix/install | sh - nix profile install nixpkgs#nickel -``` - -## Resources - -- **Official Website**: -- **Getting Started**: -- **User Manual**: -- **GitHub**: -- **Nix Package**: - -## Version Information - -Current provisioning system configuration: - -``` -# View configured version -cat $PROVISIONING/core/versions | grep NICKEL_VERSION - -# Current: 1.15.1 -``` - -## Support - -For issues related to: - -- **Nickel language**: See -- **Nix installation**: See -- **Provisioning integration**: See the provisioning system documentation +# Nickel Installation Guide\n\n## Overview\n\nNickel is a configuration language that complements KCL in the provisioning system. It provides:\n\n- Lazy evaluation for efficient configuration processing\n- Modern functional programming paradigms\n- Excellent integration with the CLI daemon for config rendering\n\n## Installation Methods\n\n### Recommended: Nix (Official Method)\n\nNickel is maintained by Tweag and officially recommends Nix for installation. This avoids all dependency issues:\n\n```\n# Install Nix (one-time setup) - Using official NixOS installer\ncurl https://nixos.org/nix/install | sh\n\n# Install Nickel via Nix\nnix profile install nixpkgs#nickel\n\n# Verify installation\nnickel --version\n```\n\n**Why Nix?**\n\n- Isolated, reproducible environments\n- No system library conflicts\n- Official Nickel distribution method\n- Works on macOS, Linux, and other Unix-like systems\n- Pre-built binaries available\n\n### Alternative: Automatic Installation\n\nThe provisioning system can automate installation:\n\n```\n# Via tools-install script (uses Nix if available)\n$PROVISIONING/core/cli/tools-install nickel\n\n# Check installation status\n$PROVISIONING/core/cli/tools-install check\n```\n\n### Alternative: Manual Installation from Source\n\nIf you have a Rust toolchain:\n\n```\ncargo install nickel-lang-cli\n```\n\n**Note**: This requires Rust compiler (slower than pre-built binaries)\n\n## Troubleshooting\n\n### "Library not loaded: /nix/store/..." Error\n\nThis occurs when using pre-built binaries without Nix installed. **Solution**: Install Nix or use Cargo:\n\n```\n# Option 1: Install Nix (recommended) - Using official NixOS installer\ncurl https://nixos.org/nix/install | sh\n\n# Then install Nickel\nnix profile install nixpkgs#nickel\n\n# Option 2: Build from source with Cargo\ncargo install nickel-lang-cli\n```\n\n### Command Not Found\n\nEnsure Nix is properly installed and in PATH:\n\n```\n# Check if Nix is installed\nwhich nix\n\n# If not found, install Nix first using official NixOS installer:\ncurl https://nixos.org/nix/install | sh\n\n# Then install Nickel\nnix profile install nixpkgs#nickel\n```\n\n### Version Mismatch\n\nTo ensure you're using the correct version:\n\n```\n# Check installed version\nnickel --version\n\n# Expected version (from provisioning/core/versions)\necho $NICKEL_VERSION\n\n# Update to latest\nnix profile upgrade '*'\n```\n\n## Integration with Provisioning System\n\n### CLI Daemon Integration\n\nNickel is integrated into the CLI daemon for configuration rendering:\n\n```\n# Render Nickel configuration via daemon\ncurl -X POST http://localhost:9091/config/render \\n -H "Content-Type: application/json" \\n -d '{\n "language": "nickel",\n "content": "{name = \"my-config\", enabled = true}",\n "context": {"env": "prod"}\n }'\n```\n\n### Comparison with KCL\n\n| Feature | KCL | Nickel |\n| --------- | ----- | -------- |\n| **Type System** | Gradual, OOP-style | Gradual, Functional |\n| **Evaluation** | Eager | Lazy (partial evaluation) |\n| **Performance** | Fast | Very fast (lazy) |\n| **Learning Curve** | Moderate | Functional programming knowledge helps |\n| **Use Cases** | Infrastructure schemas | Configuration merging, lazy evaluation |\n\n## Deployment Considerations\n\n### macOS M1/M2/M3 (arm64)\n\nNix automatically handles architecture:\n\n```\nnix profile install nixpkgs#nickel\n# Automatically installs arm64 binary\n```\n\n### Linux (x86_64/arm64)\n\n```\nnix profile install nixpkgs#nickel\n# Automatically installs correct architecture\n```\n\n### CI/CD Environments\n\nFor GitHub Actions or other CI/CD:\n\n```\n# .github/workflows/example.yml\n- name: Install Nickel\n run: |\n curl https://nixos.org/nix/install | sh\n nix profile install nixpkgs#nickel\n```\n\n## Resources\n\n- **Official Website**: \n- **Getting Started**: \n- **User Manual**: \n- **GitHub**: \n- **Nix Package**: \n\n## Version Information\n\nCurrent provisioning system configuration:\n\n```\n# View configured version\ncat $PROVISIONING/core/versions | grep NICKEL_VERSION\n\n# Current: 1.15.1\n```\n\n## Support\n\nFor issues related to:\n\n- **Nickel language**: See \n- **Nix installation**: See \n- **Provisioning integration**: See the provisioning system documentation diff --git a/tools/package/README.md b/tools/package/README.md index 6f5e4f9..1209222 100644 --- a/tools/package/README.md +++ b/tools/package/README.md @@ -1,83 +1 @@ -# Package Build Output - -**Purpose**: Platform-specific packages (deb, rpm, tar.gz, etc.) created from distribution artifacts. - -## Contents - -This directory contains the output from package creation tools that convert distribution artifacts into system-specific formats. - -## Package Formats - -Generated packages may include: - -### Linux Packages - -- **deb** - Debian/Ubuntu packages -- **rpm** - RedHat/CentOS packages -- **tar.gz** - Portable tarball archives -- **AppImage** - Universal Linux application format - -### macOS Packages - -- **pkg** - macOS installer packages -- **dmg** - macOS disk image -- **tar.gz** - Portable archive - -### Windows Packages - -- **msi** - Windows installer -- **zip** - Portable archive -- **exe** - Self-extracting executable - -### Container Images - -- **docker** - Docker container images -- **oci** - OCI container format - -## Usage - -This directory is generated by the package build system. Do not commit contents to git (configured in .gitignore). - -Create packages: - -``` -just package-all # All format packages -just package-linux # Linux packages only -just package-macos # macOS packages only -just package-deb # Debian package only -``` - -Install a package: - -``` -# Linux -sudo dpkg -i provisioning-*.deb # Debian -sudo rpm -i provisioning-*.rpm # RedHat - -# macOS -sudo installer -pkg provisioning-*.pkg -target / -``` - -## Package Verification - -Verify package contents: - -``` -dpkg -c provisioning-*.deb # List Debian package contents -rpm -ql provisioning-*.rpm # List RedHat package contents -tar -tzf provisioning-*.tar.gz # List tarball contents -``` - -## Cleanup - -Remove all packages: - -``` -just clean # Clean all build artifacts -``` - -## Related Directories - -- `dist/` - Build artifacts that packages are created from -- `distribution/` - Distribution generation (uses package outputs) -- `release/` - Version management for packages +# Package Build Output\n\n**Purpose**: Platform-specific packages (deb, rpm, tar.gz, etc.) created from distribution artifacts.\n\n## Contents\n\nThis directory contains the output from package creation tools that convert distribution artifacts into system-specific formats.\n\n## Package Formats\n\nGenerated packages may include:\n\n### Linux Packages\n\n- **deb** - Debian/Ubuntu packages\n- **rpm** - RedHat/CentOS packages\n- **tar.gz** - Portable tarball archives\n- **AppImage** - Universal Linux application format\n\n### macOS Packages\n\n- **pkg** - macOS installer packages\n- **dmg** - macOS disk image\n- **tar.gz** - Portable archive\n\n### Windows Packages\n\n- **msi** - Windows installer\n- **zip** - Portable archive\n- **exe** - Self-extracting executable\n\n### Container Images\n\n- **docker** - Docker container images\n- **oci** - OCI container format\n\n## Usage\n\nThis directory is generated by the package build system. Do not commit contents to git (configured in .gitignore).\n\nCreate packages:\n\n```\njust package-all # All format packages\njust package-linux # Linux packages only\njust package-macos # macOS packages only\njust package-deb # Debian package only\n```\n\nInstall a package:\n\n```\n# Linux\nsudo dpkg -i provisioning-*.deb # Debian\nsudo rpm -i provisioning-*.rpm # RedHat\n\n# macOS\nsudo installer -pkg provisioning-*.pkg -target /\n```\n\n## Package Verification\n\nVerify package contents:\n\n```\ndpkg -c provisioning-*.deb # List Debian package contents\nrpm -ql provisioning-*.rpm # List RedHat package contents\ntar -tzf provisioning-*.tar.gz # List tarball contents\n```\n\n## Cleanup\n\nRemove all packages:\n\n```\njust clean # Clean all build artifacts\n```\n\n## Related Directories\n\n- `dist/` - Build artifacts that packages are created from\n- `distribution/` - Distribution generation (uses package outputs)\n- `release/` - Version management for packages diff --git a/tools/release/README.md b/tools/release/README.md index c65f190..6583cb7 100644 --- a/tools/release/README.md +++ b/tools/release/README.md @@ -1,72 +1 @@ -# Release Management Output - -**Purpose**: Release artifacts, version tags, and release notes. - -## Contents - -This directory contains outputs and staging for release management operations. - -## Release Process - -Release management handles: - -1. **Version Bumping** - Update version numbers across the project -2. **Changelog Generation** - Create release notes from git history -3. **Git Tagging** - Create git tags for releases -4. **Release Notes** - Write comprehensive release documentation - -## Release Workflow - -Prepare a release: - -``` -just release-prepare --version 4.0.0 # Prepare new release -just release-validate # Validate release readiness -``` - -Create release artifacts: - -``` -just release-build # Build final release packages -just release-tag # Create git tags -``` - -Publish release: - -``` -just release-publish # Upload to repositories -``` - -## Version Management - -Versions follow semantic versioning: `MAJOR.MINOR.PATCH` - -- **MAJOR** - Breaking changes -- **MINOR** - New features (backward compatible) -- **PATCH** - Bug fixes - -Example: `4.2.1` = Major 4, Minor 2, Patch 1 - -## Release Artifacts - -A release includes: - -- Version-tagged binaries -- Changelog with all changes -- Release notes with highlights -- Git tags for all milestones -- Published packages in repositories - -## Cleanup - -Remove release artifacts: - -``` -just clean # Clean all build artifacts -``` - -## Related Directories - -- `dist/` - Build artifacts that releases are based on -- `package/` - Packages that get versioned and released -- `distribution/` - Distribution that incorporates release versions +# Release Management Output\n\n**Purpose**: Release artifacts, version tags, and release notes.\n\n## Contents\n\nThis directory contains outputs and staging for release management operations.\n\n## Release Process\n\nRelease management handles:\n\n1. **Version Bumping** - Update version numbers across the project\n2. **Changelog Generation** - Create release notes from git history\n3. **Git Tagging** - Create git tags for releases\n4. **Release Notes** - Write comprehensive release documentation\n\n## Release Workflow\n\nPrepare a release:\n\n```\njust release-prepare --version 4.0.0 # Prepare new release\njust release-validate # Validate release readiness\n```\n\nCreate release artifacts:\n\n```\njust release-build # Build final release packages\njust release-tag # Create git tags\n```\n\nPublish release:\n\n```\njust release-publish # Upload to repositories\n```\n\n## Version Management\n\nVersions follow semantic versioning: `MAJOR.MINOR.PATCH`\n\n- **MAJOR** - Breaking changes\n- **MINOR** - New features (backward compatible)\n- **PATCH** - Bug fixes\n\nExample: `4.2.1` = Major 4, Minor 2, Patch 1\n\n## Release Artifacts\n\nA release includes:\n\n- Version-tagged binaries\n- Changelog with all changes\n- Release notes with highlights\n- Git tags for all milestones\n- Published packages in repositories\n\n## Cleanup\n\nRemove release artifacts:\n\n```\njust clean # Clean all build artifacts\n```\n\n## Related Directories\n\n- `dist/` - Build artifacts that releases are based on\n- `package/` - Packages that get versioned and released\n- `distribution/` - Distribution that incorporates release versions diff --git a/workspace/README.md b/workspace/README.md index 7b9eaa4..508d914 100644 --- a/workspace/README.md +++ b/workspace/README.md @@ -1,273 +1 @@ -# Layered Template Architecture System - -This workspace provides a combined **Layered Extension Architecture with Override System** and **Template-Based Infrastructure Pattern Library** that maintains PAP principles while enabling maximum reusability of infrastructure configurations. - -## 🏗️ Architecture Overview - -### Layer Hierarchy - -The system resolves configurations through a three-tier layer system: - -1. **Core Layer (Priority 100)** - `provisioning/extensions/` - - Base provisioning system extensions - - Core taskservs, providers, and clusters - -2. **Workspace Layer (Priority 200)** - `provisioning/workspace/templates/` - - Shared templates extracted from proven infrastructure patterns - - Reusable configurations across multiple infrastructures - -3. **Infrastructure Layer (Priority 300)** - `workspace/infra/{name}/` - - Infrastructure-specific configurations and overrides - - Custom implementations per infrastructure - -### Directory Structure - -``` -provisioning/workspace/ -├── templates/ # Template library -│ ├── taskservs/ # Taskserv configuration templates -│ │ ├── kubernetes/ # Kubernetes templates -│ │ │ ├── base.k # Base configuration -│ │ │ └── variants/ # HA, single-node variants -│ │ ├── storage/ # Storage system templates -│ │ ├── networking/ # Network configuration templates -│ │ └── container-runtime/ # Container runtime templates -│ ├── providers/ # Provider templates -│ │ ├── upcloud/ # UpCloud provider templates -│ │ └── aws/ # AWS provider templates -│ ├── servers/ # Server configuration patterns -│ └── clusters/ # Complete cluster templates -├── layers/ # Layer definitions -│ ├── core.layer.k # Core layer definition -│ ├── workspace.layer.k # Workspace layer definition -│ └── infra.layer.k # Infrastructure layer definition -├── registry/ # Extension registry -│ ├── manifest.yaml # Template catalog and metadata -│ └── imports.k # Central import aliases -├── templates/lib/ # Composition utilities -│ ├── compose.k # KCL composition functions -│ └── override.k # Override and layer utilities -└── tools/ # Migration and management tools - └── migrate-infra.nu # Infrastructure migration tool -``` - -## 🚀 Getting Started - -### 1. Extract Existing Patterns - -Extract patterns from existing infrastructure (e.g., wuji) to create reusable templates: - -``` -# Extract all patterns from wuji infrastructure -cd provisioning/workspace/tools -./migrate-infra.nu extract wuji - -# Extract specific types only -./migrate-infra.nu extract wuji --type taskservs -``` - -### 2. Use Enhanced Module Loader - -The enhanced module loader provides template and layer management: - -``` -# List available templates -provisioning/core/cli/module-loader-enhanced template list - -# Show layer resolution order -provisioning/core/cli/module-loader-enhanced layer show - -# Test layer resolution for a specific module -provisioning/core/cli/module-loader-enhanced layer test kubernetes --infra wuji -``` - -### 3. Apply Templates to New Infrastructure - -``` -# Apply kubernetes template to new infrastructure -provisioning/core/cli/module-loader-enhanced template apply kubernetes-base new-infra --provider upcloud - -# Load taskservs using templates -provisioning/core/cli/module-loader-enhanced load enhanced taskservs workspace/infra/new-infra [kubernetes, cilium] --layer workspace -``` - -## 📋 Usage Examples - -### Creating a New Infrastructure from Templates - -``` -# 1. Create directory structure -mkdir -p workspace/infra/my-new-infra/{taskservs,defs,overrides} - -# 2. Apply base templates -cd provisioning -./core/cli/module-loader-enhanced template apply kubernetes-base my-new-infra - -# 3. Customize for your needs -# Edit workspace/infra/my-new-infra/taskservs/kubernetes.k - -# 4. Test layer resolution -./core/cli/module-loader-enhanced layer test kubernetes --infra my-new-infra -``` - -### Converting Existing Infrastructure - -``` -# 1. Extract patterns to templates -cd provisioning/workspace/tools -./migrate-infra.nu extract existing-infra - -# 2. Convert infrastructure to use templates -./migrate-infra.nu convert existing-infra - -# 3. Validate conversion -./migrate-infra.nu validate existing-infra -``` - -### Template Development - -``` -# Create a new taskserv template -# provisioning/workspace/templates/taskservs/my-service/base.k - -import taskservs.my_service.kcl.my_service as service_core -import ../../../workspace/templates/lib/compose as comp - -schema MyServiceBase { - version: str = "1.0.0" - cluster_name: str - # ... configuration options -} - -def create_my_service [cluster_name: str, overrides: any = {}] -> any { - let base_config = MyServiceBase { cluster_name = $cluster_name } - let final_config = comp.deep_merge $base_config $overrides - - service_core.MyService $final_config -} -``` - -## 🔧 Configuration Composition - -### Using Templates in Infrastructure - -``` -# workspace/infra/my-infra/taskservs/kubernetes.k - -import provisioning.workspace.registry.imports as reg -import provisioning.workspace.templates.lib.override as ovr - -# Use base template with infrastructure-specific overrides -_taskserv = ovr.infrastructure_overrides.taskserv_override( - reg.tpl_kubernetes_base.kubernetes_base, - "my-infra", - "kubernetes", - { - cluster_name: "my-infra" - version: "1.31.0" # Override version - cri: "crio" # Override container runtime - # Custom network configuration - network_config: { - pod_cidr: "10.244.0.0/16" - service_cidr: "10.96.0.0/12" - } - } -) -``` - -### Layer Composition - -``` -# Compose configuration through all layers -import provisioning.workspace.templates.lib.compose as comp - -# Manual layer composition -final_config = comp.compose_templates( - $core_config, # From provisioning/extensions - $workspace_config, # From provisioning/workspace/templates - $infra_config # From workspace/infra/{name} -) -``` - -## 🛠️ Advanced Features - -### Provider-Aware Composition - -``` -import provisioning.workspace.templates.lib.override as ovr - -# Apply provider-specific configurations -config = ovr.override_patterns.env_override( - $base_config, - "upcloud", - { - upcloud: { zone: "es-mad1", plan: "2xCPU-4GB" }, - aws: { region: "us-west-2", instance_type: "t3.medium" }, - local: { memory: "4GB", cpus: 2 } - } -) -``` - -### Conditional Overrides - -``` -# Infrastructure-specific conditional overrides -config = ovr.layer_resolution.infra_conditional( - $base_config, - $infra_name, - { - "production": { ha: true, replicas: 3 }, - "development": { ha: false, replicas: 1 }, - "default": { ha: false, replicas: 1 } - } -) -``` - -## 📚 Benefits - -### ✅ Maintains PAP Principles - -- **Configuration-driven**: No hardcoded values -- **Modular**: Clear separation of concerns -- **Declarative**: Infrastructure as code -- **Reusable**: DRY principle throughout - -### ✅ Flexible Override System - -- **Layer-based resolution**: Clean precedence order -- **Selective overrides**: Override only what's needed -- **Provider-agnostic**: Works across all providers -- **Environment-aware**: Dev/test/prod configurations - -### ✅ Template Reusability - -- **Pattern extraction**: Learn from existing infrastructures -- **Template versioning**: Track evolution over time -- **Composition utilities**: Rich KCL composition functions -- **Migration tools**: Easy conversion process - -### ✅ No Core Changes - -- **Non-invasive**: Core provisioning system unchanged -- **Backward compatible**: Existing infrastructures continue working -- **Progressive adoption**: Migrate at your own pace -- **Extensible**: Add new templates and layers easily - -## 🔄 Migration Path - -1. **Extract patterns** from existing infrastructures using `migrate-infra.nu extract` -2. **Create templates** in `provisioning/workspace/templates/` -3. **Convert infrastructures** to use templates with `migrate-infra.nu convert` -4. **Validate** the conversion with `migrate-infra.nu validate` -5. **Test** layer resolution with enhanced module loader -6. **Iterate** and improve templates based on usage - -## 📖 Further Reading - -- **Layer Definitions**: See `layers/*.layer.k` for layer configuration -- **Template Examples**: Browse `templates/` for real-world patterns -- **Composition Utilities**: Check `templates/lib/` for KCL utilities -- **Migration Tools**: Use `tools/migrate-infra.nu` for infrastructure conversion -- **Registry System**: Explore `registry/` for template metadata and imports - -This system provides the perfect balance of flexibility, reusability, and maintainability while preserving the core provisioning system's integrity. +# Layered Template Architecture System\n\nThis workspace provides a combined **Layered Extension Architecture with Override System** and **Template-Based Infrastructure Pattern Library** that maintains PAP principles while enabling maximum reusability of infrastructure configurations.\n\n## 🏗️ Architecture Overview\n\n### Layer Hierarchy\n\nThe system resolves configurations through a three-tier layer system:\n\n1. **Core Layer (Priority 100)** - `provisioning/extensions/`\n - Base provisioning system extensions\n - Core taskservs, providers, and clusters\n\n2. **Workspace Layer (Priority 200)** - `provisioning/workspace/templates/`\n - Shared templates extracted from proven infrastructure patterns\n - Reusable configurations across multiple infrastructures\n\n3. **Infrastructure Layer (Priority 300)** - `workspace/infra/{name}/`\n - Infrastructure-specific configurations and overrides\n - Custom implementations per infrastructure\n\n### Directory Structure\n\n```\nprovisioning/workspace/\n├── templates/ # Template library\n│ ├── taskservs/ # Taskserv configuration templates\n│ │ ├── kubernetes/ # Kubernetes templates\n│ │ │ ├── base.k # Base configuration\n│ │ │ └── variants/ # HA, single-node variants\n│ │ ├── storage/ # Storage system templates\n│ │ ├── networking/ # Network configuration templates\n│ │ └── container-runtime/ # Container runtime templates\n│ ├── providers/ # Provider templates\n│ │ ├── upcloud/ # UpCloud provider templates\n│ │ └── aws/ # AWS provider templates\n│ ├── servers/ # Server configuration patterns\n│ └── clusters/ # Complete cluster templates\n├── layers/ # Layer definitions\n│ ├── core.layer.k # Core layer definition\n│ ├── workspace.layer.k # Workspace layer definition\n│ └── infra.layer.k # Infrastructure layer definition\n├── registry/ # Extension registry\n│ ├── manifest.yaml # Template catalog and metadata\n│ └── imports.k # Central import aliases\n├── templates/lib/ # Composition utilities\n│ ├── compose.k # KCL composition functions\n│ └── override.k # Override and layer utilities\n└── tools/ # Migration and management tools\n └── migrate-infra.nu # Infrastructure migration tool\n```\n\n## 🚀 Getting Started\n\n### 1. Extract Existing Patterns\n\nExtract patterns from existing infrastructure (e.g., wuji) to create reusable templates:\n\n```\n# Extract all patterns from wuji infrastructure\ncd provisioning/workspace/tools\n./migrate-infra.nu extract wuji\n\n# Extract specific types only\n./migrate-infra.nu extract wuji --type taskservs\n```\n\n### 2. Use Enhanced Module Loader\n\nThe enhanced module loader provides template and layer management:\n\n```\n# List available templates\nprovisioning/core/cli/module-loader-enhanced template list\n\n# Show layer resolution order\nprovisioning/core/cli/module-loader-enhanced layer show\n\n# Test layer resolution for a specific module\nprovisioning/core/cli/module-loader-enhanced layer test kubernetes --infra wuji\n```\n\n### 3. Apply Templates to New Infrastructure\n\n```\n# Apply kubernetes template to new infrastructure\nprovisioning/core/cli/module-loader-enhanced template apply kubernetes-base new-infra --provider upcloud\n\n# Load taskservs using templates\nprovisioning/core/cli/module-loader-enhanced load enhanced taskservs workspace/infra/new-infra [kubernetes, cilium] --layer workspace\n```\n\n## 📋 Usage Examples\n\n### Creating a New Infrastructure from Templates\n\n```\n# 1. Create directory structure\nmkdir -p workspace/infra/my-new-infra/{taskservs,defs,overrides}\n\n# 2. Apply base templates\ncd provisioning\n./core/cli/module-loader-enhanced template apply kubernetes-base my-new-infra\n\n# 3. Customize for your needs\n# Edit workspace/infra/my-new-infra/taskservs/kubernetes.k\n\n# 4. Test layer resolution\n./core/cli/module-loader-enhanced layer test kubernetes --infra my-new-infra\n```\n\n### Converting Existing Infrastructure\n\n```\n# 1. Extract patterns to templates\ncd provisioning/workspace/tools\n./migrate-infra.nu extract existing-infra\n\n# 2. Convert infrastructure to use templates\n./migrate-infra.nu convert existing-infra\n\n# 3. Validate conversion\n./migrate-infra.nu validate existing-infra\n```\n\n### Template Development\n\n```\n# Create a new taskserv template\n# provisioning/workspace/templates/taskservs/my-service/base.k\n\nimport taskservs.my_service.kcl.my_service as service_core\nimport ../../../workspace/templates/lib/compose as comp\n\nschema MyServiceBase {\n version: str = "1.0.0"\n cluster_name: str\n # ... configuration options\n}\n\ndef create_my_service [cluster_name: str, overrides: any = {}] -> any {\n let base_config = MyServiceBase { cluster_name = $cluster_name }\n let final_config = comp.deep_merge $base_config $overrides\n\n service_core.MyService $final_config\n}\n```\n\n## 🔧 Configuration Composition\n\n### Using Templates in Infrastructure\n\n```\n# workspace/infra/my-infra/taskservs/kubernetes.k\n\nimport provisioning.workspace.registry.imports as reg\nimport provisioning.workspace.templates.lib.override as ovr\n\n# Use base template with infrastructure-specific overrides\n_taskserv = ovr.infrastructure_overrides.taskserv_override(\n reg.tpl_kubernetes_base.kubernetes_base,\n "my-infra",\n "kubernetes",\n {\n cluster_name: "my-infra"\n version: "1.31.0" # Override version\n cri: "crio" # Override container runtime\n # Custom network configuration\n network_config: {\n pod_cidr: "10.244.0.0/16"\n service_cidr: "10.96.0.0/12"\n }\n }\n)\n```\n\n### Layer Composition\n\n```\n# Compose configuration through all layers\nimport provisioning.workspace.templates.lib.compose as comp\n\n# Manual layer composition\nfinal_config = comp.compose_templates(\n $core_config, # From provisioning/extensions\n $workspace_config, # From provisioning/workspace/templates\n $infra_config # From workspace/infra/{name}\n)\n```\n\n## 🛠️ Advanced Features\n\n### Provider-Aware Composition\n\n```\nimport provisioning.workspace.templates.lib.override as ovr\n\n# Apply provider-specific configurations\nconfig = ovr.override_patterns.env_override(\n $base_config,\n "upcloud",\n {\n upcloud: { zone: "es-mad1", plan: "2xCPU-4GB" },\n aws: { region: "us-west-2", instance_type: "t3.medium" },\n local: { memory: "4GB", cpus: 2 }\n }\n)\n```\n\n### Conditional Overrides\n\n```\n# Infrastructure-specific conditional overrides\nconfig = ovr.layer_resolution.infra_conditional(\n $base_config,\n $infra_name,\n {\n "production": { ha: true, replicas: 3 },\n "development": { ha: false, replicas: 1 },\n "default": { ha: false, replicas: 1 }\n }\n)\n```\n\n## 📚 Benefits\n\n### ✅ Maintains PAP Principles\n\n- **Configuration-driven**: No hardcoded values\n- **Modular**: Clear separation of concerns\n- **Declarative**: Infrastructure as code\n- **Reusable**: DRY principle throughout\n\n### ✅ Flexible Override System\n\n- **Layer-based resolution**: Clean precedence order\n- **Selective overrides**: Override only what's needed\n- **Provider-agnostic**: Works across all providers\n- **Environment-aware**: Dev/test/prod configurations\n\n### ✅ Template Reusability\n\n- **Pattern extraction**: Learn from existing infrastructures\n- **Template versioning**: Track evolution over time\n- **Composition utilities**: Rich KCL composition functions\n- **Migration tools**: Easy conversion process\n\n### ✅ No Core Changes\n\n- **Non-invasive**: Core provisioning system unchanged\n- **Backward compatible**: Existing infrastructures continue working\n- **Progressive adoption**: Migrate at your own pace\n- **Extensible**: Add new templates and layers easily\n\n## 🔄 Migration Path\n\n1. **Extract patterns** from existing infrastructures using `migrate-infra.nu extract`\n2. **Create templates** in `provisioning/workspace/templates/`\n3. **Convert infrastructures** to use templates with `migrate-infra.nu convert`\n4. **Validate** the conversion with `migrate-infra.nu validate`\n5. **Test** layer resolution with enhanced module loader\n6. **Iterate** and improve templates based on usage\n\n## 📖 Further Reading\n\n- **Layer Definitions**: See `layers/*.layer.k` for layer configuration\n- **Template Examples**: Browse `templates/` for real-world patterns\n- **Composition Utilities**: Check `templates/lib/` for KCL utilities\n- **Migration Tools**: Use `tools/migrate-infra.nu` for infrastructure conversion\n- **Registry System**: Explore `registry/` for template metadata and imports\n\nThis system provides the perfect balance of flexibility, reusability, and maintainability while preserving the core provisioning system's integrity.