diff --git a/.typedialog/README.md b/.typedialog/README.md index 1ad576a..d49e296 100644 --- a/.typedialog/README.md +++ b/.typedialog/README.md @@ -1 +1,182 @@ -# TypeDialog Configuration Structure\n\nThis directory contains TypeDialog forms, templates, and configuration data organized by subsystem.\n\n## Directory Organization\n\n```\n.typedialog/\n├── core/ # Core subsystem forms (setup, auth, infrastructure)\n├── provisioning/ # Main provisioning configuration fragments\n└── platform/ # Platform services forms (future)\n```\n\n### Why Multiple Subdirectories\n\nDifferent subsystems have different form requirements:\n\n1. **`core/`** - Core infrastructure operations\n - System setup wizard\n - Authentication (login, MFA)\n - Infrastructure confirmations (delete, deploy)\n - **Users**: Developers, operators\n\n2. **`provisioning/`** - Project provisioning configuration\n - Deployment target selection (docker, k8s, ssh)\n - Database configuration (postgres, mysql, sqlite)\n - Monitoring setup\n - **Users**: Project setup, CI/CD\n\n3. **`platform/`** (future) - Platform services\n - Orchestrator configuration\n - Control center setup\n - Service-specific forms\n - **Users**: Platform administrators\n\n## Structure Within Each Subdirectory\n\nEach subdirectory follows this pattern:\n\n```\n{subsystem}/\n├── forms/ # TOML form definitions\n├── templates/ # Nickel/Jinja2 templates\n├── defaults/ # Default configurations\n├── constraints/ # Validation rules\n└── generated/ # Generated configs (gitignored)\n```\n\n## Core Subsystem (`core/`)\n\n**Purpose**: Core infrastructure operations (setup, auth, confirmations)\n\n**Forms**:\n- `forms/setup-wizard.toml` - Initial system setup\n- `forms/auth-login.toml` - User authentication\n- `forms/mfa-enroll.toml` - MFA enrollment\n- `forms/infrastructure/*.toml` - Delete confirmations (server, cluster, taskserv)\n\n**Bash Wrappers** (TTY-safe):\n- `../../core/shlib/setup-wizard-tty.sh`\n- `../../core/shlib/auth-login-tty.sh`\n- `../../core/shlib/mfa-enroll-tty.sh`\n\n**Usage**:\n```\n# Run setup wizard\n./provisioning/core/shlib/setup-wizard-tty.sh\n\n# Nushell reads result\nlet config = (open provisioning/.typedialog/core/generated/setup-wizard-result.json | from json)\n```\n\n## Provisioning Subsystem (`provisioning/`)\n\n**Purpose**: Main provisioning configuration (deployments, databases, monitoring)\n\n**Structure**:\n- `form.toml` - Main provisioning form\n- `fragments/` - Modular form fragments\n - `deployment-*.toml` - Docker, K8s, SSH deployments\n - `database-*.toml` - Database configurations\n - `monitoring.toml` - Monitoring setup\n - `auth-*.toml` - Authentication methods\n- `constraints.toml` - Validation constraints\n- `defaults/` - Default values\n- `schemas/` - Nickel schemas\n\n**Usage**:\n```\n# Configure provisioning\nnu provisioning/.typedialog/provisioning/configure.nu --backend web\n```\n\n## Platform Subsystem (`platform/` - Future)\n\n**Purpose**: Platform services configuration\n\n**Planned forms**:\n- Orchestrator configuration\n- Control center setup\n- MCP server configuration\n- Vault service setup\n\n**Status**: Structure planned, not yet implemented\n\n## Integration with Code\n\n### Bash Wrappers (TTY-safe)\n\nLocated in: `provisioning/core/shlib/*-tty.sh`\n\nThese wrappers solve Nushell's TTY input limitations by:\n1. Handling interactive input in bash\n2. Calling TypeDialog with proper TTY forwarding\n3. 
Generating JSON output for Nushell consumption\n\n**Pattern**:\n```\nBash wrapper → TypeDialog (TTY input) → Nickel config → JSON → Nushell\n```\n\n### Nushell Integration\n\nLocated in: `provisioning/core/nulib/lib_provisioning/`\n\nFunctions that call the bash wrappers:\n- `setup/wizard.nu::run-setup-wizard-interactive`\n- `plugins/auth.nu::login-interactive`\n- `plugins/auth.nu::mfa-enroll-interactive`\n\n## Generated Files\n\n**Location**: `{subsystem}/generated/`\n\n**Files**:\n- `*.ncl` - Nickel configuration files\n- `*.json` - JSON exports for Nushell\n- `*-defaults.ncl` - Default configurations\n\n**Note**: All generated files are gitignored\n\n## Form Naming Conventions\n\n1. **Top-level forms**: `{purpose}.toml`\n - Example: `setup-wizard.toml`, `auth-login.toml`\n\n2. **Fragment forms**: `fragments/{category}-{variant}.toml`\n - Example: `deployment-docker.toml`, `database-postgres.toml`\n\n3. **Infrastructure forms**: `forms/infrastructure/{operation}_{resource}_confirm.toml`\n - Example: `server_delete_confirm.toml`\n\n## Adding New Forms\n\n### For Core Operations\n\n1. Create form: `.typedialog/core/forms/{operation}.toml`\n2. Create wrapper: `core/shlib/{operation}-tty.sh`\n3. Integrate in Nushell: `core/nulib/lib_provisioning/`\n\n### For Provisioning Config\n\n1. Create fragment: `.typedialog/provisioning/fragments/{category}-{variant}.toml`\n2. Update main form: `.typedialog/provisioning/form.toml`\n3. Add defaults: `.typedialog/provisioning/defaults/`\n\n### For Platform Services (Future)\n\n1. Create subsystem: `.typedialog/platform/`\n2. Follow same structure as `core/` or `provisioning/`\n3. Document in this README\n\n## Related Documentation\n\n- **Bash wrappers**: `provisioning/core/shlib/README.md`\n- **TypeDialog integration**: `provisioning/platform/.typedialog/README.md`\n- **Nushell setup**: `provisioning/core/nulib/lib_provisioning/setup/wizard.nu`\n\n---\n\n**Last Updated**: 2025-01-09\n**Structure Version**: 2.0 (Multi-subsystem organization) +# TypeDialog Configuration Structure + +This directory contains TypeDialog forms, templates, and configuration data organized by subsystem. + +## Directory Organization + +``` +.typedialog/ +├── core/ # Core subsystem forms (setup, auth, infrastructure) +├── provisioning/ # Main provisioning configuration fragments +└── platform/ # Platform services forms (future) +``` + +### Why Multiple Subdirectories + +Different subsystems have different form requirements: + +1. **`core/`** - Core infrastructure operations + - System setup wizard + - Authentication (login, MFA) + - Infrastructure confirmations (delete, deploy) + - **Users**: Developers, operators + +2. **`provisioning/`** - Project provisioning configuration + - Deployment target selection (docker, k8s, ssh) + - Database configuration (postgres, mysql, sqlite) + - Monitoring setup + - **Users**: Project setup, CI/CD + +3. 
**`platform/`** (future) - Platform services + - Orchestrator configuration + - Control center setup + - Service-specific forms + - **Users**: Platform administrators + +## Structure Within Each Subdirectory + +Each subdirectory follows this pattern: + +``` +{subsystem}/ +├── forms/ # TOML form definitions +├── templates/ # Nickel/Jinja2 templates +├── defaults/ # Default configurations +├── constraints/ # Validation rules +└── generated/ # Generated configs (gitignored) +``` + +## Core Subsystem (`core/`) + +**Purpose**: Core infrastructure operations (setup, auth, confirmations) + +**Forms**: +- `forms/setup-wizard.toml` - Initial system setup +- `forms/auth-login.toml` - User authentication +- `forms/mfa-enroll.toml` - MFA enrollment +- `forms/infrastructure/*.toml` - Delete confirmations (server, cluster, taskserv) + +**Bash Wrappers** (TTY-safe): +- `../../core/shlib/setup-wizard-tty.sh` +- `../../core/shlib/auth-login-tty.sh` +- `../../core/shlib/mfa-enroll-tty.sh` + +**Usage**: +```nushell +# Run setup wizard +./provisioning/core/shlib/setup-wizard-tty.sh + +# Nushell reads the result (`open` parses JSON automatically) +let config = (open provisioning/.typedialog/core/generated/setup-wizard-result.json) +``` + +## Provisioning Subsystem (`provisioning/`) + +**Purpose**: Main provisioning configuration (deployments, databases, monitoring) + +**Structure**: +- `form.toml` - Main provisioning form +- `fragments/` - Modular form fragments + - `deployment-*.toml` - Docker, K8s, SSH deployments + - `database-*.toml` - Database configurations + - `monitoring.toml` - Monitoring setup + - `auth-*.toml` - Authentication methods +- `constraints.toml` - Validation constraints +- `defaults/` - Default values +- `schemas/` - Nickel schemas + +**Usage**: +```bash +# Configure provisioning +nu provisioning/.typedialog/provisioning/configure.nu --backend web +``` + +## Platform Subsystem (`platform/` - Future) + +**Purpose**: Platform services configuration + +**Planned forms**: +- Orchestrator configuration +- Control center setup +- MCP server configuration +- Vault service setup + +**Status**: Structure planned, not yet implemented + +## Integration with Code + +### Bash Wrappers (TTY-safe) + +Located in: `provisioning/core/shlib/*-tty.sh` + +These wrappers solve Nushell's TTY input limitations by: +1. Handling interactive input in bash +2. Calling TypeDialog with proper TTY forwarding +3. Generating JSON output for Nushell consumption + +**Pattern**: +``` +Bash wrapper → TypeDialog (TTY input) → Nickel config → JSON → Nushell +``` + +### Nushell Integration + +Located in: `provisioning/core/nulib/lib_provisioning/` + +Functions that call the bash wrappers: +- `setup/wizard.nu::run-setup-wizard-interactive` +- `plugins/auth.nu::login-interactive` +- `plugins/auth.nu::mfa-enroll-interactive` + +## Generated Files + +**Location**: `{subsystem}/generated/` + +**Files**: +- `*.ncl` - Nickel configuration files +- `*.json` - JSON exports for Nushell +- `*-defaults.ncl` - Default configurations + +**Note**: All generated files are gitignored + +## Form Naming Conventions + +1. **Top-level forms**: `{purpose}.toml` + - Example: `setup-wizard.toml`, `auth-login.toml` + +2. **Fragment forms**: `fragments/{category}-{variant}.toml` + - Example: `deployment-docker.toml`, `database-postgres.toml` + +3. **Infrastructure forms**: `forms/infrastructure/{operation}_{resource}_confirm.toml` + - Example: `server_delete_confirm.toml` + +## Adding New Forms + +### For Core Operations + +1. 
Create form: `.typedialog/core/forms/{operation}.toml` +2. Create wrapper: `core/shlib/{operation}-tty.sh` +3. Integrate in Nushell: `core/nulib/lib_provisioning/` + +### For Provisioning Config + +1. Create fragment: `.typedialog/provisioning/fragments/{category}-{variant}.toml` +2. Update main form: `.typedialog/provisioning/form.toml` +3. Add defaults: `.typedialog/provisioning/defaults/` + +### For Platform Services (Future) + +1. Create subsystem: `.typedialog/platform/` +2. Follow same structure as `core/` or `provisioning/` +3. Document in this README + +## Related Documentation + +- **Bash wrappers**: `provisioning/core/shlib/README.md` +- **TypeDialog integration**: `provisioning/platform/.typedialog/README.md` +- **Nushell setup**: `provisioning/core/nulib/lib_provisioning/setup/wizard.nu` + +--- + +**Last Updated**: 2025-01-09 +**Structure Version**: 2.0 (Multi-subsystem organization) \ No newline at end of file diff --git a/.typedialog/ci/README.md b/.typedialog/ci/README.md index bbf9568..da6d95d 100644 --- a/.typedialog/ci/README.md +++ b/.typedialog/ci/README.md @@ -1 +1,328 @@ -# CI System - Configuration Guide\n\n**Installed**: 2026-01-01\n**Detected Languages**: rust, nushell, nickel, bash, markdown, python, javascript\n\n---\n\n## Quick Start\n\n### Option 1: Using configure.sh (Recommended)\n\nA convenience script is installed in `.typedialog/ci/`:\n\n```\n# Use web backend (default) - Opens in browser\n.typedialog/ci/configure.sh\n\n# Use TUI backend - Terminal interface\n.typedialog/ci/configure.sh tui\n\n# Use CLI backend - Command-line prompts\n.typedialog/ci/configure.sh cli\n```\n\n**This script automatically:**\n\n- Sources `.typedialog/ci/envrc` for environment setup\n- Loads defaults from `config.ncl` (Nickel format)\n- Uses cascading search for fragments (local → Tools)\n- Creates backup before overwriting existing config\n- Saves output in Nickel format using nickel-roundtrip with documented template\n- Generates `config.ncl` compatible with `nickel doc` command\n\n### Option 2: Direct TypeDialog Commands\n\nUse TypeDialog nickel-roundtrip directly with manual paths:\n\n#### Web Backend (Recommended - Easy Viewing)\n\n```\ncd .typedialog/ci # Change to CI directory\nsource envrc # Load environment\ntypedialog-web nickel-roundtrip config.ncl form.toml \n --output config.ncl \n --ncl-template $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2\n```\n\n#### TUI Backend\n\n```\ncd .typedialog/ci\nsource envrc\ntypedialog-tui nickel-roundtrip config.ncl form.toml \n --output config.ncl \n --ncl-template $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2\n```\n\n#### CLI Backend\n\n```\ncd .typedialog/ci\nsource envrc\ntypedialog nickel-roundtrip config.ncl form.toml \n --output config.ncl \n --ncl-template $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2\n```\n\n**Note:** The `--ncl-template` flag uses a Tera template that adds:\n\n- Descriptive comments for each section\n- Documentation compatible with `nickel doc config.ncl`\n- Consistent formatting and structure\n\n**All backends will:**\n\n- Show only options relevant to your detected languages\n- Guide you through all configuration choices\n- Validate your inputs\n- Generate config.ncl in Nickel format\n\n### Option 3: Manual Configuration\n\nEdit `config.ncl` directly:\n\n```\nvim .typedialog/ci/config.ncl\n```\n\n---\n\n## Configuration Format: Nickel\n\n**This project uses Nickel format by default** for all configuration files.\n\n### Why Nickel?\n\n- ✅ **Typed configuration** - Static type checking with 
`nickel typecheck`\n- ✅ **Documentation** - Generate docs with `nickel doc config.ncl`\n- ✅ **Validation** - Built-in schema validation\n- ✅ **Comments** - Rich inline documentation support\n- ✅ **Modular** - Import/export system for reusable configs\n\n### Nickel Template\n\nThe output structure is controlled by a **Tera template** at:\n\n- **Tools default**: `$TOOLS_PATH/dev-system/ci/templates/config.ncl.j2`\n- **Local override**: `.typedialog/ci/config.ncl.j2` (optional)\n\n**To customize the template:**\n\n```\n# Copy the default template\ncp $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 \n .typedialog/ci/config.ncl.j2\n\n# Edit to add custom comments, documentation, or structure\nvim .typedialog/ci/config.ncl.j2\n\n# Your template will now be used automatically\n```\n\n**Template features:**\n\n- Customizable comments per section\n- Control field ordering\n- Add project-specific documentation\n- Configure output for `nickel doc` command\n\n### TypeDialog Environment Variables\n\nYou can customize TypeDialog behavior with environment variables:\n\n```\n# Web server configuration\nexport TYPEDIALOG_PORT=9000 # Port for web backend (default: 9000)\nexport TYPEDIALOG_HOST=localhost # Host binding (default: localhost)\n\n# Localization\nexport TYPEDIALOG_LANG=en_US.UTF-8 # Form language (default: system locale)\n\n# Run with custom settings\nTYPEDIALOG_PORT=8080 .typedialog/ci/configure.sh web\n```\n\n**Common use cases:**\n\n```\n# Access from other machines in network\nTYPEDIALOG_HOST=0.0.0.0 TYPEDIALOG_PORT=8080 .typedialog/ci/configure.sh web\n\n# Use different port if 9000 is busy\nTYPEDIALOG_PORT=3000 .typedialog/ci/configure.sh web\n\n# Spanish interface\nTYPEDIALOG_LANG=es_ES.UTF-8 .typedialog/ci/configure.sh web\n```\n\n## Configuration Structure\n\nYour config.ncl is organized in the `ci` namespace (Nickel format):\n\n```\n{\n ci = {\n project = {\n name = "rust",\n detected_languages = ["rust, nushell, nickel, bash, markdown, python, javascript"],\n primary_language = "rust",\n },\n tools = {\n # Tools are added based on detected languages\n },\n features = {\n # CI features (pre-commit, GitHub Actions, etc.)\n },\n ci_providers = {\n # CI provider configurations\n },\n },\n}\n```\n\n## Available Fragments\n\nTool configurations are modular. Check `.typedialog/ci/fragments/` for:\n\n- rust-tools.toml - Tools for rust\n- nushell-tools.toml - Tools for nushell\n- nickel-tools.toml - Tools for nickel\n- bash-tools.toml - Tools for bash\n- markdown-tools.toml - Tools for markdown\n- python-tools.toml - Tools for python\n- javascript-tools.toml - Tools for javascript\n- general-tools.toml - Cross-language tools\n- ci-providers.toml - GitHub Actions, Woodpecker, etc.\n\n## Cascading Override System\n\nThis project uses a **local → Tools cascading search** for all resources:\n\n### How It Works\n\nResources are searched in priority order:\n\n1. **Local files** (`.typedialog/ci/`) - **FIRST** (highest priority)\n2. 
**Tools files** (`$TOOLS_PATH/dev-system/ci/`) - **FALLBACK** (default)\n\n### Affected Resources\n\n| Resource | Local Path | Tools Path |\n| ---------- | ------------ | ------------ |\n| Fragments | `.typedialog/ci/fragments/` | `$TOOLS_PATH/dev-system/ci/forms/fragments/` |\n| Schemas | `.typedialog/ci/schemas/` | `$TOOLS_PATH/dev-system/ci/schemas/` |\n| Validators | `.typedialog/ci/validators/` | `$TOOLS_PATH/dev-system/ci/validators/` |\n| Defaults | `.typedialog/ci/defaults/` | `$TOOLS_PATH/dev-system/ci/defaults/` |\n| Nickel Template | `.typedialog/ci/config.ncl.j2` | `$TOOLS_PATH/dev-system/ci/templates/config.ncl.j2` |\n\n### Environment Setup (.envrc)\n\nThe `.typedialog/ci/.envrc` file configures search paths:\n\n```\n# Source this file to load environment\nsource .typedialog/ci/.envrc\n\n# Or use direnv for automatic loading\necho 'source .typedialog/ci/.envrc' >> .envrc\n```\n\n**What's in .envrc:**\n\n```\nexport NICKEL_IMPORT_PATH="schemas:$TOOLS_PATH/dev-system/ci/schemas:validators:..."\nexport TYPEDIALOG_FRAGMENT_PATH=".:$TOOLS_PATH/dev-system/ci/forms"\nexport NCL_TEMPLATE=""\nexport TYPEDIALOG_PORT=9000 # Web server port\nexport TYPEDIALOG_HOST=localhost # Web server host\nexport TYPEDIALOG_LANG="${LANG}" # Form localization\n```\n\n### Creating Overrides\n\n**By default:** All resources come from Tools (no duplication).\n\n**To customize:** Create file in local directory with same name:\n\n```\n# Override a fragment\ncp $TOOLS_PATH/dev-system/ci/fragments/rust-tools.toml \n .typedialog/ci/fragments/rust-tools.toml\n\n# Edit your local version\nvim .typedialog/ci/fragments/rust-tools.toml\n\n# Override Nickel template (customize comments, structure, nickel doc output)\ncp $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 \n .typedialog/ci/config.ncl.j2\n\n# Edit to customize documentation and structure\nvim .typedialog/ci/config.ncl.j2\n\n# Now your version will be used instead of Tools version\n```\n\n**Benefits:**\n\n- ✅ Override only what you need\n- ✅ Everything else stays synchronized with Tools\n- ✅ No duplication by default\n- ✅ Automatic updates when Tools is updated\n\n**See:** `$TOOLS_PATH/dev-system/ci/docs/cascade-override.md` for complete documentation.\n\n## Testing Your Configuration\n\n### Validate Configuration\n\n```\nnu $env.TOOLS_PATH/dev-system/ci/scripts/validator.nu \n --config .typedialog/ci/config.ncl \n --project . \n --namespace ci\n```\n\n### Regenerate CI Files\n\n```\nnu $env.TOOLS_PATH/dev-system/ci/scripts/generate-configs.nu \n --config .typedialog/ci/config.ncl \n --templates $env.TOOLS_PATH/dev-system/ci/templates \n --output . 
\n --namespace ci\n```\n\n## Common Tasks\n\n### Add a New Tool\n\nEdit `config.ncl` and add under `ci.tools`:\n\n```\n{\n ci = {\n tools = {\n newtool = {\n enabled = true,\n install_method = "cargo",\n version = "latest",\n },\n },\n },\n}\n```\n\n### Disable a Feature\n\n```\n[ci.features]\nenable_pre_commit = false\n```\n\n## Need Help?\n\nFor detailed documentation, see:\n\n- $env.TOOLS_PATH/dev-system/ci/docs/configuration-guide.md\n- $env.TOOLS_PATH/dev-system/ci/docs/installation-guide.md \ No newline at end of file +# CI System - Configuration Guide + +**Installed**: 2026-01-01 +**Detected Languages**: rust, nushell, nickel, bash, markdown, python, javascript + +--- + +## Quick Start + +### Option 1: Using configure.sh (Recommended) + +A convenience script is installed in `.typedialog/ci/`: + +```bash +# Use web backend (default) - Opens in browser +.typedialog/ci/configure.sh + +# Use TUI backend - Terminal interface +.typedialog/ci/configure.sh tui + +# Use CLI backend - Command-line prompts +.typedialog/ci/configure.sh cli +``` + +**This script automatically:** + +- Sources `.typedialog/ci/.envrc` for environment setup +- Loads defaults from `config.ncl` (Nickel format) +- Uses cascading search for fragments (local → Tools) +- Creates backup before overwriting existing config +- Saves output in Nickel format using nickel-roundtrip with documented template +- Generates `config.ncl` compatible with `nickel doc` command + +### Option 2: Direct TypeDialog Commands + +Use TypeDialog nickel-roundtrip directly with manual paths: + +#### Web Backend (Recommended - Easy Viewing) + +```bash +cd .typedialog/ci # Change to CI directory +source .envrc # Load environment +typedialog-web nickel-roundtrip config.ncl form.toml \ + --output config.ncl \ + --ncl-template $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 +``` + +#### TUI Backend + +```bash +cd .typedialog/ci +source .envrc +typedialog-tui nickel-roundtrip config.ncl form.toml \ + --output config.ncl \ + --ncl-template $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 +``` + +#### CLI Backend + +```bash +cd .typedialog/ci +source .envrc +typedialog nickel-roundtrip config.ncl form.toml \ + --output config.ncl \ + --ncl-template $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 +``` + +**Note:** The `--ncl-template` flag uses a Tera template that adds: + +- Descriptive comments for each section +- Documentation compatible with `nickel doc config.ncl` +- Consistent formatting and structure + +**All backends will:** + +- Show only options relevant to your detected languages +- Guide you through all configuration choices +- Validate your inputs +- Generate config.ncl in Nickel format + +### Option 3: Manual Configuration + +Edit `config.ncl` directly: + +```bash +vim .typedialog/ci/config.ncl +``` + +--- + +## Configuration Format: Nickel + +**This project uses Nickel format by default** for all configuration files. + +### Why Nickel? + +- ✅ **Typed configuration** - Static type checking with `nickel typecheck` +- ✅ **Documentation** - Generate docs with `nickel doc config.ncl` +- ✅ **Validation** - Built-in schema validation +- ✅ **Comments** - Rich inline documentation support +- ✅ **Modular** - Import/export system for reusable configs
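+ +The two commands below exercise the checks named above; a minimal sketch, assuming the generated config lives at `.typedialog/ci/config.ncl`: + +```bash +# Static type check of the generated config +nickel typecheck .typedialog/ci/config.ncl + +# Generate documentation from the config's contract metadata +nickel doc .typedialog/ci/config.ncl +```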
+ +### Nickel Template + +The output structure is controlled by a **Tera template** at: + +- **Tools default**: `$TOOLS_PATH/dev-system/ci/templates/config.ncl.j2` +- **Local override**: `.typedialog/ci/config.ncl.j2` (optional) + +**To customize the template:** + +```bash +# Copy the default template +cp $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 \ + .typedialog/ci/config.ncl.j2 + +# Edit to add custom comments, documentation, or structure +vim .typedialog/ci/config.ncl.j2 + +# Your template will now be used automatically +``` + +**Template features:** + +- Customizable comments per section +- Control field ordering +- Add project-specific documentation +- Configure output for `nickel doc` command + +### TypeDialog Environment Variables + +You can customize TypeDialog behavior with environment variables: + +```bash +# Web server configuration +export TYPEDIALOG_PORT=9000 # Port for web backend (default: 9000) +export TYPEDIALOG_HOST=localhost # Host binding (default: localhost) + +# Localization +export TYPEDIALOG_LANG=en_US.UTF-8 # Form language (default: system locale) + +# Run with custom settings +TYPEDIALOG_PORT=8080 .typedialog/ci/configure.sh web +``` + +**Common use cases:** + +```bash +# Access from other machines in network +TYPEDIALOG_HOST=0.0.0.0 TYPEDIALOG_PORT=8080 .typedialog/ci/configure.sh web + +# Use different port if 9000 is busy +TYPEDIALOG_PORT=3000 .typedialog/ci/configure.sh web + +# Spanish interface +TYPEDIALOG_LANG=es_ES.UTF-8 .typedialog/ci/configure.sh web +``` + +## Configuration Structure + +Your config.ncl is organized in the `ci` namespace (Nickel format): + +```nickel +{ + ci = { + project = { + name = "rust", + detected_languages = ["rust", "nushell", "nickel", "bash", "markdown", "python", "javascript"], + primary_language = "rust", + }, + tools = { + # Tools are added based on detected languages + }, + features = { + # CI features (pre-commit, GitHub Actions, etc.) + }, + ci_providers = { + # CI provider configurations + }, + }, +} +``` + +## Available Fragments + +Tool configurations are modular. Check `.typedialog/ci/fragments/` for: + +- rust-tools.toml - Tools for rust +- nushell-tools.toml - Tools for nushell +- nickel-tools.toml - Tools for nickel +- bash-tools.toml - Tools for bash +- markdown-tools.toml - Tools for markdown +- python-tools.toml - Tools for python +- javascript-tools.toml - Tools for javascript +- general-tools.toml - Cross-language tools +- ci-providers.toml - GitHub Actions, Woodpecker, etc. + +## Cascading Override System + +This project uses a **local → Tools cascading search** for all resources: + +### How It Works + +Resources are searched in priority order: + +1. **Local files** (`.typedialog/ci/`) - **FIRST** (highest priority) +2. **Tools files** (`$TOOLS_PATH/dev-system/ci/`) - **FALLBACK** (default)
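+ +A minimal sketch of that lookup order (illustrative only - not the actual TypeDialog implementation; the helper name and the `schemas/ci.ncl` file are hypothetical): + +```bash +# Return the first match: local override first, then Tools fallback +resolve_resource() { + local rel="$1" + if [ -f ".typedialog/ci/$rel" ]; then + echo ".typedialog/ci/$rel" # local override wins + else + echo "$TOOLS_PATH/dev-system/ci/$rel" # Tools default + fi +} + +resolve_resource "schemas/ci.ncl" +```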
+ +### Affected Resources + +| Resource | Local Path | Tools Path | +| ---------- | ------------ | ------------ | +| Fragments | `.typedialog/ci/fragments/` | `$TOOLS_PATH/dev-system/ci/forms/fragments/` | +| Schemas | `.typedialog/ci/schemas/` | `$TOOLS_PATH/dev-system/ci/schemas/` | +| Validators | `.typedialog/ci/validators/` | `$TOOLS_PATH/dev-system/ci/validators/` | +| Defaults | `.typedialog/ci/defaults/` | `$TOOLS_PATH/dev-system/ci/defaults/` | +| Nickel Template | `.typedialog/ci/config.ncl.j2` | `$TOOLS_PATH/dev-system/ci/templates/config.ncl.j2` | + +### Environment Setup (.envrc) + +The `.typedialog/ci/.envrc` file configures search paths: + +```bash +# Source this file to load environment +source .typedialog/ci/.envrc + +# Or use direnv for automatic loading +echo 'source .typedialog/ci/.envrc' >> .envrc +``` + +**What's in .envrc:** + +```bash +export NICKEL_IMPORT_PATH="schemas:$TOOLS_PATH/dev-system/ci/schemas:validators:..." +export TYPEDIALOG_FRAGMENT_PATH=".:$TOOLS_PATH/dev-system/ci/forms" +export NCL_TEMPLATE="" +export TYPEDIALOG_PORT=9000 # Web server port +export TYPEDIALOG_HOST=localhost # Web server host +export TYPEDIALOG_LANG="${LANG}" # Form localization +``` + +### Creating Overrides + +**By default:** All resources come from Tools (no duplication). + +**To customize:** Create a file in the local directory with the same name: + +```bash +# Override a fragment +cp $TOOLS_PATH/dev-system/ci/fragments/rust-tools.toml \ + .typedialog/ci/fragments/rust-tools.toml + +# Edit your local version +vim .typedialog/ci/fragments/rust-tools.toml + +# Override Nickel template (customize comments, structure, nickel doc output) +cp $TOOLS_PATH/dev-system/ci/templates/config.ncl.j2 \ + .typedialog/ci/config.ncl.j2 + +# Edit to customize documentation and structure +vim .typedialog/ci/config.ncl.j2 + +# Now your version will be used instead of Tools version +``` + +**Benefits:** + +- ✅ Override only what you need +- ✅ Everything else stays synchronized with Tools +- ✅ No duplication by default +- ✅ Automatic updates when Tools is updated + +**See:** `$TOOLS_PATH/dev-system/ci/docs/cascade-override.md` for complete documentation. + +## Testing Your Configuration + +### Validate Configuration + +```nushell +nu $env.TOOLS_PATH/dev-system/ci/scripts/validator.nu --config .typedialog/ci/config.ncl --project . --namespace ci +``` + +### Regenerate CI Files + +```nushell +nu $env.TOOLS_PATH/dev-system/ci/scripts/generate-configs.nu --config .typedialog/ci/config.ncl --templates $env.TOOLS_PATH/dev-system/ci/templates --output . --namespace ci +``` + +## Common Tasks + +### Add a New Tool + +Edit `config.ncl` and add under `ci.tools`: + +```nickel +{ + ci = { + tools = { + newtool = { + enabled = true, + install_method = "cargo", + version = "latest", + }, + }, + }, +} +``` + +### Disable a Feature + +Set the flag under `ci.features` in `config.ncl`: + +```nickel +{ + ci = { + features = { + enable_pre_commit = false, + }, + }, +} +``` + +## Need Help? 
+ +For detailed documentation, see: + +- $env.TOOLS_PATH/dev-system/ci/docs/configuration-guide.md +- $env.TOOLS_PATH/dev-system/ci/docs/installation-guide.md \ No newline at end of file diff --git a/.typedialog/platform/forms/README.md b/.typedialog/platform/forms/README.md index a8f99b2..6bce32f 100644 --- a/.typedialog/platform/forms/README.md +++ b/.typedialog/platform/forms/README.md @@ -1 +1,390 @@ -# Forms\n\nTypeDialog form definitions for interactive configuration of platform services.\n\n## Purpose\n\nForms provide:\n- **Interactive configuration** - Web/TUI/CLI interfaces for user input\n- **Constraint validation** - Dynamic min/max from constraints.toml\n- **Nickel mapping** - Form fields map to Nickel structure via `nickel_path`\n- **Jinja2 template integration** - Generate Nickel configs from form values\n- **nickel-roundtrip workflow** - Load existing Nickel → edit → generate updated Nickel\n\n## File Organization\n\n```\nforms/\n├── README.md # This file\n├── orchestrator-form.toml # Orchestrator configuration form\n├── control-center-form.toml # Control Center configuration form\n├── mcp-server-form.toml # MCP Server configuration form\n├── installer-form.toml # Installer configuration form\n└── fragments/ # FLAT fragment directory (all fragments here)\n ├── workspace-section.toml # Workspace configuration\n ├── server-section.toml # HTTP server settings\n ├── database-rocksdb-section.toml # RocksDB configuration\n ├── database-surrealdb-section.toml # SurrealDB configuration\n ├── database-postgres-section.toml # PostgreSQL configuration\n ├── security-section.toml # Auth, RBAC, encryption\n ├── monitoring-section.toml # Metrics, health checks\n ├── logging-section.toml # Log configuration\n ├── orchestrator-queue-section.toml # Orchestrator queue config\n ├── orchestrator-workflow-section.toml\n ├── control-center-jwt-section.toml\n ├── control-center-rbac-section.toml\n ├── mcp-capabilities-section.toml\n ├── deployment-mode-section.toml # Mode selection\n └── README.md # Fragment documentation\n```\n\n## Critical: Fragment Organization\n\n**Fragments are FLAT** - all stored in `forms/fragments/` at the same level, referenced by paths in form includes:\n\n```\n# Main form (orchestrator-form.toml)\n[[items]]\nname = "workspace_group"\ntype = "group"\nincludes = ["fragments/workspace-section.toml"] # Path reference to flat fragment\n\n[[items]]\nname = "queue_group"\ntype = "group"\nincludes = ["fragments/orchestrator-queue-section.toml"] # Same level, different name\n```\n\n**NOT nested directories** like `fragments/orchestrator/queue-section.toml` - all in `fragments/`\n\n## TypeDialog nickel-roundtrip Workflow\n\nCRITICAL: Forms integrate with Nickel config generation via:\n\n```\ntypedialog-web nickel-roundtrip "$CONFIG_FILE" "$FORM_FILE" --output "$CONFIG_FILE" --template "$NCL_TEMPLATE"\n```\n\nThis workflow:\n1. **Loads existing Nickel config** as default values in form\n2. **Shows form** with validated constraints\n3. **User edits** configuration values\n4. 
**Generates updated Nickel** using Jinja2 template\n\n## Required Fields: nickel_path\n\n**CRITICAL**: Every form element MUST have `nickel_path` to map to Nickel structure:\n\n```\n[[elements]]\nname = "workspace_name"\ntype = "text"\nprompt = "Workspace Name"\nnickel_path = ["orchestrator", "workspace", "name"] # ← REQUIRED\n```\n\nThe `nickel_path` array specifies the path in the Nickel config structure:\n- `["orchestrator", "workspace", "name"]` → `orchestrator.workspace.name`\n- `["orchestrator", "queue", "max_concurrent_tasks"]` → `orchestrator.queue.max_concurrent_tasks`\n\n## Constraint Interpolation\n\nForm fields reference constraints dynamically:\n\n```\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nprompt = "Maximum Concurrent Tasks"\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}" # Dynamic\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}" # Dynamic\nhelp = "Range: ${constraint.orchestrator.queue.concurrent_tasks.min}-${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\nTypeDialog resolves `${constraint.path}` from `constraints/constraints.toml`.\n\n## Main Form Structure\n\nAll main forms follow this pattern:\n\n```\nname = "service_configuration"\ndescription = "Interactive configuration for {Service}"\ndisplay_mode = "complete"\n\n# Section 1: Deployment mode selection\n[[items]]\nname = "deployment_mode_group"\ntype = "group"\nincludes = ["fragments/deployment-mode-section.toml"]\n\n# Section 2: Workspace configuration\n[[items]]\nname = "workspace_group"\ntype = "group"\nincludes = ["fragments/workspace-section.toml"]\n\n# Section 3: Server configuration\n[[items]]\nname = "server_group"\ntype = "group"\nincludes = ["fragments/server-section.toml"]\n\n# Section N: Service-specific configuration\n[[items]]\nname = "service_group"\ntype = "group"\nincludes = ["fragments/{service}-specific-section.toml"]\n\n# Optional: Conditional sections\n[[items]]\nname = "monitoring_group"\ntype = "group"\nwhen = "enable_monitoring == true"\nincludes = ["fragments/monitoring-section.toml"]\n```\n\n## Fragment Example: workspace-section.toml\n\n```\n# Workspace configuration fragment\n[[elements]]\nborder_top = true\nborder_bottom = true\nname = "workspace_header"\ntitle = "🗂️ Workspace Configuration"\ntype = "section_header"\n\n[[elements]]\nname = "workspace_name"\ntype = "text"\nprompt = "Workspace Name"\ndefault = "default"\nplaceholder = "e.g., librecloud, production"\nrequired = true\nhelp = "Name of the workspace"\nnickel_path = ["orchestrator", "workspace", "name"]\n\n[[elements]]\nname = "workspace_path"\ntype = "text"\nprompt = "Workspace Path"\ndefault = "/var/lib/provisioning/orchestrator"\nrequired = true\nhelp = "Absolute path to workspace directory"\nnickel_path = ["orchestrator", "workspace", "path"]\n\n[[elements]]\nname = "workspace_enabled"\ntype = "confirm"\nprompt = "Enable Workspace?"\ndefault = true\nnickel_path = ["orchestrator", "workspace", "enabled"]\n\n[[elements]]\nname = "multi_workspace"\ntype = "confirm"\nprompt = "Multi-Workspace Mode?"\ndefault = false\nhelp = "Allow serving multiple workspaces"\nnickel_path = ["orchestrator", "workspace", "multi_workspace"]\n```\n\n## Fragment Example: orchestrator-queue-section.toml\n\n```\n# Orchestrator queue configuration\n[[elements]]\nborder_top = true\nname = "queue_header"\ntitle = "⚙️ Queue Configuration"\ntype = "section_header"\n\n[[elements]]\nname = "max_concurrent_tasks"\ntype = 
"number"\nprompt = "Maximum Concurrent Tasks"\ndefault = 5\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\nrequired = true\nhelp = "Max tasks running simultaneously. Range: ${constraint.orchestrator.queue.concurrent_tasks.min}-${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n\n[[elements]]\nname = "retry_attempts"\ntype = "number"\nprompt = "Retry Attempts"\ndefault = 3\nmin = 0\nmax = 10\nhelp = "Number of retry attempts for failed tasks"\nnickel_path = ["orchestrator", "queue", "retry_attempts"]\n\n[[elements]]\nname = "retry_delay"\ntype = "number"\nprompt = "Retry Delay (ms)"\ndefault = 5000\nmin = 1000\nmax = 60000\nhelp = "Delay between retries in milliseconds"\nnickel_path = ["orchestrator", "queue", "retry_delay"]\n\n[[elements]]\nname = "task_timeout"\ntype = "number"\nprompt = "Task Timeout (ms)"\ndefault = 3600000\nmin = 60000\nmax = 86400000\nhelp = "Default timeout for task execution (min 1 min, max 24 hrs)"\nnickel_path = ["orchestrator", "queue", "task_timeout"]\n```\n\n## Jinja2 Template Integration\n\nJinja2 templates (`templates/{service}-config.ncl.j2`) convert form values to Nickel:\n\n```\n# templates/orchestrator-config.ncl.j2\n{\n orchestrator = {\n workspace = {\n {%- if workspace_name %}\n name = "{{ workspace_name }}",\n {%- endif %}\n {%- if workspace_path %}\n path = "{{ workspace_path }}",\n {%- endif %}\n {%- if workspace_enabled is defined %}\n enabled = {{ workspace_enabled | lower }},\n {%- endif %}\n },\n queue = {\n {%- if max_concurrent_tasks %}\n max_concurrent_tasks = {{ max_concurrent_tasks }},\n {%- endif %}\n {%- if retry_attempts %}\n retry_attempts = {{ retry_attempts }},\n {%- endif %}\n {%- if retry_delay %}\n retry_delay = {{ retry_delay }},\n {%- endif %}\n {%- if task_timeout %}\n task_timeout = {{ task_timeout }},\n {%- endif %}\n },\n },\n}\n```\n\n## Conditional Sections\n\nForms can show/hide sections based on user selections:\n\n```\n# Always shown\n[[items]]\nname = "deployment_mode_group"\ntype = "group"\nincludes = ["fragments/deployment-mode-section.toml"]\n\n# Only shown if enable_monitoring is true\n[[items]]\nname = "monitoring_group"\ntype = "group"\nwhen = "enable_monitoring == true"\nincludes = ["fragments/monitoring-section.toml"]\n\n# Only shown if deployment_mode is "enterprise"\n[[items]]\nname = "enterprise_options"\ntype = "group"\nwhen = "deployment_mode == 'enterprise'"\nincludes = ["fragments/enterprise-options-section.toml"]\n```\n\n## Element Types\n\n```\ntype = "text" # Single-line text input\ntype = "number" # Numeric input\ntype = "confirm" # Boolean checkbox\ntype = "select" # Dropdown (single choice)\ntype = "multiselect" # Checkboxes (multiple choices)\ntype = "password" # Hidden text input\ntype = "textarea" # Multi-line text\ntype = "section_header" # Visual section separator\ntype = "footer" # Confirmation text\ntype = "group" # Container for fragments\n```\n\n## Usage Workflow\n\n### 1. Run Configuration Wizard\n\n```\nnu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo\n```\n\n### 2. TypeDialog Loads Form\n\n- Shows `forms/orchestrator-form.toml`\n- Includes fragments from `forms/fragments/*.toml`\n- Applies constraint interpolation\n- Loads existing config as defaults (if exists)\n\n### 3. User Edits\n\n- Fills form fields\n- Validates against constraints\n- Shows validation errors\n\n### 4. 
Generate Nickel\n\n- Uses `templates/orchestrator-config.ncl.j2`\n- Converts form values to Nickel\n- Saves to `values/orchestrator.solo.ncl`\n\n## Best Practices\n\n1. **Use fragments** - Don't duplicate form sections\n2. **Always add nickel_path** - Required for Nickel mapping\n3. **Use constraint interpolation** - Dynamic limits from constraints.toml\n4. **Provide defaults** - Sensible defaults speed up configuration\n5. **Use clear prompts** - Explain what each field does in `help` text\n6. **Group related fields** - Use fragments to organize logically\n7. **Test constraint interpolation** - Verify ${constraint.*} resolves\n8. **Document fragments** - Use headers and help text\n\n## Testing Forms\n\n```\n# Validate form TOML syntax (if supported by TypeDialog)\n# typedialog validate forms/orchestrator-form.toml\n\n# Launch interactive form (web backend)\nnu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo --backend web\n\n# View generated Nickel\ncat provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl\n```\n\n## Adding New Fields\n\nTo add a new configuration field:\n\n1. **Add to schema** (schemas/{service}.ncl)\n2. **Add to defaults** (defaults/{service}-defaults.ncl)\n3. **Add to fragment** (forms/fragments/{appropriate}-section.toml)\n - Include `nickel_path` mapping\n - Add constraint if numeric\n4. **Update Jinja2 template** (templates/{service}-config.ncl.j2)\n5. **Test**: `nu scripts/configure.nu {service} {mode}`\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 +# Forms + +TypeDialog form definitions for interactive configuration of platform services. + +## Purpose + +Forms provide: +- **Interactive configuration** - Web/TUI/CLI interfaces for user input +- **Constraint validation** - Dynamic min/max from constraints.toml +- **Nickel mapping** - Form fields map to Nickel structure via `nickel_path` +- **Jinja2 template integration** - Generate Nickel configs from form values +- **nickel-roundtrip workflow** - Load existing Nickel → edit → generate updated Nickel + +## File Organization + +``` +forms/ +├── README.md # This file +├── orchestrator-form.toml # Orchestrator configuration form +├── control-center-form.toml # Control Center configuration form +├── mcp-server-form.toml # MCP Server configuration form +├── installer-form.toml # Installer configuration form +└── fragments/ # FLAT fragment directory (all fragments here) + ├── workspace-section.toml # Workspace configuration + ├── server-section.toml # HTTP server settings + ├── database-rocksdb-section.toml # RocksDB configuration + ├── database-surrealdb-section.toml # SurrealDB configuration + ├── database-postgres-section.toml # PostgreSQL configuration + ├── security-section.toml # Auth, RBAC, encryption + ├── monitoring-section.toml # Metrics, health checks + ├── logging-section.toml # Log configuration + ├── orchestrator-queue-section.toml # Orchestrator queue config + ├── orchestrator-workflow-section.toml + ├── control-center-jwt-section.toml + ├── control-center-rbac-section.toml + ├── mcp-capabilities-section.toml + ├── deployment-mode-section.toml # Mode selection + └── README.md # Fragment documentation +``` + +## Critical: Fragment Organization + +**Fragments are FLAT** - all stored in `forms/fragments/` at the same level, referenced by paths in form includes: + +```toml +# Main form (orchestrator-form.toml) +[[items]] +name = "workspace_group" +type = "group" +includes = ["fragments/workspace-section.toml"] # Path reference to flat fragment + +[[items]] +name = "queue_group" +type = "group" +includes = ["fragments/orchestrator-queue-section.toml"] # Same level, different name +``` + +**NOT nested directories** like `fragments/orchestrator/queue-section.toml` - all in `fragments/` + +## TypeDialog nickel-roundtrip Workflow + +CRITICAL: Forms integrate with Nickel config generation via: + +```bash +typedialog-web nickel-roundtrip "$CONFIG_FILE" "$FORM_FILE" --output "$CONFIG_FILE" --template "$NCL_TEMPLATE" +``` + +This workflow: +1. **Loads existing Nickel config** as default values in form +2. **Shows form** with validated constraints +3. **User edits** configuration values +4. **Generates updated Nickel** using Jinja2 template
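+ +A hedged end-to-end example of that round trip (the file names follow the conventions used in this README; the template path is an assumption): + +```bash +CONFIG_FILE=values/orchestrator.solo.ncl +FORM_FILE=forms/orchestrator-form.toml + +# Load the existing config into the form, edit, and regenerate Nickel in place +typedialog-web nickel-roundtrip "$CONFIG_FILE" "$FORM_FILE" --output "$CONFIG_FILE" --template templates/orchestrator-config.ncl.j2 + +# Sanity-check the regenerated config +nickel typecheck "$CONFIG_FILE" +```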
+ +## Required Fields: nickel_path + +**CRITICAL**: Every form element MUST have `nickel_path` to map to Nickel structure: + +```toml +[[elements]] +name = "workspace_name" +type = "text" +prompt = "Workspace Name" +nickel_path = ["orchestrator", "workspace", "name"] # ← REQUIRED +``` + +The `nickel_path` array specifies the path in the Nickel config structure: +- `["orchestrator", "workspace", "name"]` → `orchestrator.workspace.name` +- `["orchestrator", "queue", "max_concurrent_tasks"]` → `orchestrator.queue.max_concurrent_tasks` + +## Constraint Interpolation + +Form fields reference constraints dynamically: + +```toml +[[elements]] +name = "max_concurrent_tasks" +type = "number" +prompt = "Maximum Concurrent Tasks" +min = "${constraint.orchestrator.queue.concurrent_tasks.min}" # Dynamic +max = "${constraint.orchestrator.queue.concurrent_tasks.max}" # Dynamic +help = "Range: ${constraint.orchestrator.queue.concurrent_tasks.min}-${constraint.orchestrator.queue.concurrent_tasks.max}" +nickel_path = ["orchestrator", "queue", "max_concurrent_tasks"] +``` + +TypeDialog resolves `${constraint.path}` from `constraints/constraints.toml`. 
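+ +Where those values come from - a plausible `constraints/constraints.toml` entry for the interpolation paths above (key names inferred from those paths; the actual limits may differ): + +```toml +[orchestrator.queue.concurrent_tasks] +min = 1 +max = 100 +```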
+ +## Main Form Structure + +All main forms follow this pattern: + +```toml +name = "service_configuration" +description = "Interactive configuration for {Service}" +display_mode = "complete" + +# Section 1: Deployment mode selection +[[items]] +name = "deployment_mode_group" +type = "group" +includes = ["fragments/deployment-mode-section.toml"] + +# Section 2: Workspace configuration +[[items]] +name = "workspace_group" +type = "group" +includes = ["fragments/workspace-section.toml"] + +# Section 3: Server configuration +[[items]] +name = "server_group" +type = "group" +includes = ["fragments/server-section.toml"] + +# Section N: Service-specific configuration +[[items]] +name = "service_group" +type = "group" +includes = ["fragments/{service}-specific-section.toml"] + +# Optional: Conditional sections +[[items]] +name = "monitoring_group" +type = "group" +when = "enable_monitoring == true" +includes = ["fragments/monitoring-section.toml"] +``` + +## Fragment Example: workspace-section.toml + +```toml +# Workspace configuration fragment +[[elements]] +border_top = true +border_bottom = true +name = "workspace_header" +title = "🗂️ Workspace Configuration" +type = "section_header" + +[[elements]] +name = "workspace_name" +type = "text" +prompt = "Workspace Name" +default = "default" +placeholder = "e.g., librecloud, production" +required = true +help = "Name of the workspace" +nickel_path = ["orchestrator", "workspace", "name"] + +[[elements]] +name = "workspace_path" +type = "text" +prompt = "Workspace Path" +default = "/var/lib/provisioning/orchestrator" +required = true +help = "Absolute path to workspace directory" +nickel_path = ["orchestrator", "workspace", "path"] + +[[elements]] +name = "workspace_enabled" +type = "confirm" +prompt = "Enable Workspace?" +default = true +nickel_path = ["orchestrator", "workspace", "enabled"] + +[[elements]] +name = "multi_workspace" +type = "confirm" +prompt = "Multi-Workspace Mode?" +default = false +help = "Allow serving multiple workspaces" +nickel_path = ["orchestrator", "workspace", "multi_workspace"] +``` + +## Fragment Example: orchestrator-queue-section.toml + +```toml +# Orchestrator queue configuration +[[elements]] +border_top = true +name = "queue_header" +title = "⚙️ Queue Configuration" +type = "section_header" + +[[elements]] +name = "max_concurrent_tasks" +type = "number" +prompt = "Maximum Concurrent Tasks" +default = 5 +min = "${constraint.orchestrator.queue.concurrent_tasks.min}" +max = "${constraint.orchestrator.queue.concurrent_tasks.max}" +required = true +help = "Max tasks running simultaneously. 
Range: ${constraint.orchestrator.queue.concurrent_tasks.min}-${constraint.orchestrator.queue.concurrent_tasks.max}" +nickel_path = ["orchestrator", "queue", "max_concurrent_tasks"] + +[[elements]] +name = "retry_attempts" +type = "number" +prompt = "Retry Attempts" +default = 3 +min = 0 +max = 10 +help = "Number of retry attempts for failed tasks" +nickel_path = ["orchestrator", "queue", "retry_attempts"] + +[[elements]] +name = "retry_delay" +type = "number" +prompt = "Retry Delay (ms)" +default = 5000 +min = 1000 +max = 60000 +help = "Delay between retries in milliseconds" +nickel_path = ["orchestrator", "queue", "retry_delay"] + +[[elements]] +name = "task_timeout" +type = "number" +prompt = "Task Timeout (ms)" +default = 3600000 +min = 60000 +max = 86400000 +help = "Default timeout for task execution (min 1 min, max 24 hrs)" +nickel_path = ["orchestrator", "queue", "task_timeout"] +``` + +## Jinja2 Template Integration + +Jinja2 templates (`templates/{service}-config.ncl.j2`) convert form values to Nickel: + +```jinja +# templates/orchestrator-config.ncl.j2 +{ + orchestrator = { + workspace = { + {%- if workspace_name %} + name = "{{ workspace_name }}", + {%- endif %} + {%- if workspace_path %} + path = "{{ workspace_path }}", + {%- endif %} + {%- if workspace_enabled is defined %} + enabled = {{ workspace_enabled | lower }}, + {%- endif %} + }, + queue = { + {%- if max_concurrent_tasks %} + max_concurrent_tasks = {{ max_concurrent_tasks }}, + {%- endif %} + {%- if retry_attempts %} + retry_attempts = {{ retry_attempts }}, + {%- endif %} + {%- if retry_delay %} + retry_delay = {{ retry_delay }}, + {%- endif %} + {%- if task_timeout %} + task_timeout = {{ task_timeout }}, + {%- endif %} + }, + }, +} +``` + +## Conditional Sections + +Forms can show/hide sections based on user selections: + +```toml +# Always shown +[[items]] +name = "deployment_mode_group" +type = "group" +includes = ["fragments/deployment-mode-section.toml"] + +# Only shown if enable_monitoring is true +[[items]] +name = "monitoring_group" +type = "group" +when = "enable_monitoring == true" +includes = ["fragments/monitoring-section.toml"] + +# Only shown if deployment_mode is "enterprise" +[[items]] +name = "enterprise_options" +type = "group" +when = "deployment_mode == 'enterprise'" +includes = ["fragments/enterprise-options-section.toml"] +``` + +## Element Types + +```toml +type = "text" # Single-line text input +type = "number" # Numeric input +type = "confirm" # Boolean checkbox +type = "select" # Dropdown (single choice) +type = "multiselect" # Checkboxes (multiple choices) +type = "password" # Hidden text input +type = "textarea" # Multi-line text +type = "section_header" # Visual section separator +type = "footer" # Confirmation text +type = "group" # Container for fragments +``` + +## Usage Workflow + +### 1. Run Configuration Wizard + +```bash +nu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo +``` + +### 2. TypeDialog Loads Form + +- Shows `forms/orchestrator-form.toml` +- Includes fragments from `forms/fragments/*.toml` +- Applies constraint interpolation +- Loads existing config as defaults (if exists) + +### 3. User Edits + +- Fills form fields +- Validates against constraints +- Shows validation errors + +### 4. Generate Nickel + +- Uses `templates/orchestrator-config.ncl.j2` +- Converts form values to Nickel +- Saves to `values/orchestrator.solo.ncl`
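+ +A hypothetical excerpt of the resulting `values/orchestrator.solo.ncl`, using the defaults from the fragments shown earlier: + +```nickel +{ + orchestrator = { + workspace = { + name = "default", + path = "/var/lib/provisioning/orchestrator", + enabled = true, + }, + queue = { + max_concurrent_tasks = 5, + retry_attempts = 3, + retry_delay = 5000, + task_timeout = 3600000, + }, + }, +} +```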
+ +## Best Practices + +1. **Use fragments** - Don't duplicate form sections +2. **Always add nickel_path** - Required for Nickel mapping +3. **Use constraint interpolation** - Dynamic limits from constraints.toml +4. **Provide defaults** - Sensible defaults speed up configuration +5. **Use clear prompts** - Explain what each field does in `help` text +6. **Group related fields** - Use fragments to organize logically +7. **Test constraint interpolation** - Verify ${constraint.*} resolves +8. **Document fragments** - Use headers and help text + +## Testing Forms + +```bash +# Validate form TOML syntax (if supported by TypeDialog) +# typedialog validate forms/orchestrator-form.toml + +# Launch interactive form (web backend) +nu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo --backend web + +# View generated Nickel +cat provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl +``` + +## Adding New Fields + +To add a new configuration field: + +1. **Add to schema** (schemas/{service}.ncl) +2. **Add to defaults** (defaults/{service}-defaults.ncl) +3. **Add to fragment** (forms/fragments/{appropriate}-section.toml) + - Include `nickel_path` mapping + - Add constraint if numeric +4. **Update Jinja2 template** (templates/{service}-config.ncl.j2) +5. **Test**: `nu scripts/configure.nu {service} {mode}` + +--- + +**Version**: 1.0.0 +**Last Updated**: 2025-01-05 \ No newline at end of file diff --git a/.typedialog/platform/forms/fragments/README.md b/.typedialog/platform/forms/fragments/README.md index 802e1fb..83191b3 100644 --- a/.typedialog/platform/forms/fragments/README.md +++ b/.typedialog/platform/forms/fragments/README.md @@ -1 +1,334 @@ -# Fragments\n\nReusable form fragments organized FLAT in this directory (not nested subdirectories).\n\n## Purpose\n\nFragments provide:\n- **Reusable sections** - Used by multiple forms\n- **Modularity** - Change once, applies to all forms using it\n- **Organization** - Named by purpose (workspace, server, queue, etc.)\n- **DRY principle** - Don't repeat configuration sections\n\n## Fragment Organization\n\n**CRITICAL**: All fragments are stored at the SAME LEVEL (flat directory).\n\n```\nfragments/\n├── workspace-section.toml # Workspace configuration\n├── server-section.toml # HTTP server settings\n├── database-rocksdb-section.toml # RocksDB database\n├── database-surrealdb-section.toml # SurrealDB database\n├── database-postgres-section.toml # PostgreSQL database\n├── security-section.toml # Auth, RBAC, encryption\n├── monitoring-section.toml # Metrics, health checks\n├── logging-section.toml # Log configuration\n├── orchestrator-queue-section.toml # Orchestrator queue config\n├── orchestrator-workflow-section.toml # Orchestrator batch workflow\n├── orchestrator-storage-section.toml # Orchestrator storage backend\n├── control-center-jwt-section.toml # Control Center JWT\n├── control-center-rbac-section.toml # Control Center RBAC\n├── control-center-compliance-section.toml\n├── mcp-capabilities-section.toml # MCP capabilities\n├── mcp-tools-section.toml # MCP tools configuration\n├── mcp-resources-section.toml # MCP resource limits\n├── deployment-mode-section.toml # Deployment mode selection\n├── resources-section.toml # Resource allocation (CPU, RAM, disk)\n└── README.md # This file\n```\n\nReferenced in forms as:\n```\n[[items]]\nname = "workspace_group"\ntype = "group"\nincludes = ["fragments/workspace-section.toml"] # Flat reference\n\n[[items]]\nname = "queue_group"\ntype = "group"\nincludes = ["fragments/orchestrator-queue-section.toml"] # Same level\n```\n\n## Fragment 
Categories\n\n### Common Fragments (Used by Multiple Services)\n\n- **workspace-section.toml** - Workspace name, path, enable/disable\n- **server-section.toml** - HTTP server host, port, workers, keep-alive\n- **database-rocksdb-section.toml** - RocksDB path (filesystem-backed)\n- **database-surrealdb-section.toml** - SurrealDB embedded (no external service)\n- **database-postgres-section.toml** - PostgreSQL server connection\n- **security-section.toml** - JWT issuer, RBAC, encryption keys\n- **monitoring-section.toml** - Metrics interval, health checks\n- **logging-section.toml** - Log level, format, rotation\n- **resources-section.toml** - CPU cores, memory, disk allocation\n- **deployment-mode-section.toml** - Solo/MultiUser/CI/CD/Enterprise selection\n\n### Service-Specific Fragments\n\n**Orchestrator** (workflow engine):\n- **orchestrator-queue-section.toml** - Max concurrent tasks, retries, timeout\n- **orchestrator-workflow-section.toml** - Batch workflow settings, parallelism\n- **orchestrator-storage-section.toml** - Storage backend selection\n\n**Control Center** (policy/RBAC):\n- **control-center-jwt-section.toml** - JWT issuer, audience, token expiration\n- **control-center-rbac-section.toml** - Roles, permissions, policies\n- **control-center-compliance-section.toml** - SOC2, HIPAA, audit logging\n\n**MCP Server** (protocol):\n- **mcp-capabilities-section.toml** - Tools, prompts, resources, sampling\n- **mcp-tools-section.toml** - Tool timeout, max concurrent, categories\n- **mcp-resources-section.toml** - Max size, caching, TTL\n\n## Fragment Structure\n\nEach fragment is a TOML file containing `[[elements]]` definitions:\n\n```\n# fragments/workspace-section.toml\n\n[[elements]]\nborder_top = true\nborder_bottom = true\nname = "workspace_header"\ntitle = "🗂️ Workspace Configuration"\ntype = "section_header"\n\n[[elements]]\nname = "workspace_name"\ntype = "text"\nprompt = "Workspace Name"\ndefault = "default"\nrequired = true\nhelp = "Name of the workspace this service will serve"\nnickel_path = ["orchestrator", "workspace", "name"]\n\n[[elements]]\nname = "workspace_path"\ntype = "text"\nprompt = "Workspace Path"\ndefault = "/var/lib/provisioning/orchestrator"\nrequired = true\nhelp = "Absolute path to the workspace directory"\nnickel_path = ["orchestrator", "workspace", "path"]\n\n[[elements]]\nname = "workspace_enabled"\ntype = "confirm"\nprompt = "Enable Workspace?"\ndefault = true\nhelp = "Enable or disable this workspace"\nnickel_path = ["orchestrator", "workspace", "enabled"]\n```\n\n## Fragment Composition\n\nFragments are included in main forms:\n\n```\n# forms/orchestrator-form.toml\n\nname = "orchestrator_configuration"\ndescription = "Interactive configuration for Orchestrator"\n\n# Include fragments in order\n\n[[items]]\nname = "deployment_group"\ntype = "group"\nincludes = ["fragments/deployment-mode-section.toml"]\n\n[[items]]\nname = "workspace_group"\ntype = "group"\nincludes = ["fragments/workspace-section.toml"]\n\n[[items]]\nname = "server_group"\ntype = "group"\nincludes = ["fragments/server-section.toml"]\n\n[[items]]\nname = "storage_group"\ntype = "group"\nincludes = ["fragments/orchestrator-storage-section.toml"]\n\n[[items]]\nname = "queue_group"\ntype = "group"\nincludes = ["fragments/orchestrator-queue-section.toml"]\n\n# Optional sections\n[[items]]\nname = "monitoring_group"\ntype = "group"\nwhen = "enable_monitoring == true"\nincludes = ["fragments/monitoring-section.toml"]\n```\n\n## Element Requirements\n\nEvery element in a fragment MUST 
include:\n\n1. **name** - Unique identifier (used in form data)\n2. **type** - Element type (text, number, confirm, select, etc.)\n3. **prompt** - User-facing label\n4. **nickel_path** - Mapping to Nickel structure (**CRITICAL**)\n\nExample:\n```\n[[elements]]\nname = "max_concurrent_tasks" # Unique identifier\ntype = "number" # Type\nprompt = "Maximum Concurrent Tasks" # User label\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"] # Nickel mapping\n```\n\n## Constraint Interpolation\n\nFragments reference constraints dynamically:\n\n```\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nprompt = "Maximum Concurrent Tasks"\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}" # Dynamic\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}" # Dynamic\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\nThe `${constraint.path.to.value}` syntax references `constraints/constraints.toml`.\n\n## Common Fragment Patterns\n\n### Workspace Fragment Pattern\n```\n[[elements]]\nname = "workspace_name"\ntype = "text"\nprompt = "Workspace Name"\nnickel_path = ["orchestrator", "workspace", "name"]\n\n[[elements]]\nname = "workspace_path"\ntype = "text"\nprompt = "Workspace Path"\nnickel_path = ["orchestrator", "workspace", "path"]\n\n[[elements]]\nname = "workspace_enabled"\ntype = "confirm"\nprompt = "Enable Workspace?"\nnickel_path = ["orchestrator", "workspace", "enabled"]\n```\n\n### Server Fragment Pattern\n```\n[[elements]]\nname = "server_host"\ntype = "text"\nprompt = "Server Host"\ndefault = "127.0.0.1"\nnickel_path = ["orchestrator", "server", "host"]\n\n[[elements]]\nname = "server_port"\ntype = "number"\nprompt = "Server Port"\nmin = "${constraint.common.server.port.min}"\nmax = "${constraint.common.server.port.max}"\nnickel_path = ["orchestrator", "server", "port"]\n\n[[elements]]\nname = "server_workers"\ntype = "number"\nprompt = "Worker Threads"\nmin = 1\nmax = 32\nnickel_path = ["orchestrator", "server", "workers"]\n```\n\n### Database Selection Pattern\n```\n[[elements]]\nname = "storage_backend"\ntype = "select"\nprompt = "Storage Backend"\noptions = [\n { value = "filesystem", label = "📁 Filesystem" },\n { value = "rocksdb", label = "🗄️ RocksDB (Embedded)" },\n { value = "surrealdb", label = "📊 SurrealDB" },\n { value = "postgres", label = "🐘 PostgreSQL" },\n]\nnickel_path = ["orchestrator", "storage", "backend"]\n\n[[elements]]\nname = "rocksdb_group"\ntype = "group"\nwhen = "storage_backend == 'rocksdb'"\nincludes = ["fragments/database-rocksdb-section.toml"]\n\n[[elements]]\nname = "postgres_group"\ntype = "group"\nwhen = "storage_backend == 'postgres'"\nincludes = ["fragments/database-postgres-section.toml"]\n\n[[elements]]\nname = "surrealdb_group"\ntype = "group"\nwhen = "storage_backend == 'surrealdb'"\nincludes = ["fragments/database-surrealdb-section.toml"]\n```\n\n## Best Practices\n\n1. **Clear naming** - Fragment name describes its purpose (queue-section, not qs)\n2. **Meaningful headers** - Each fragment starts with a section header (name, title, emoji)\n3. **Constraint interpolation** - Use `${constraint.*}` for dynamic validation\n4. **Consistent nickel_path** - Paths match actual Nickel structure\n5. **Provide defaults** - Sensible defaults improve UX\n6. **Help text** - Explain each field clearly\n7. **Group logically** - Related fields in same fragment\n8. **Test with form** - Verify fragment loads correctly in form\n\n## Adding a New Fragment\n\n1. 
**Create fragment file** in `forms/fragments/{name}-section.toml`\n2. **Add section header** (name, title, emoji)\n3. **Add form elements**:\n - Include `name`, `type`, `prompt`\n - Add `nickel_path` (CRITICAL)\n - Add constraints if applicable\n - Add `help` and `default` if appropriate\n4. **Include in form** - Add to main form via `includes` field\n5. **Test** - Run configuration wizard to verify fragment loads\n\n## Fragment Naming Convention\n\n- **Section fragments**: `{topic}-section.toml` (workspace-section.toml)\n- **Service-specific**: `{service}-{topic}-section.toml` (orchestrator-queue-section.toml)\n- **Database-specific**: `database-{backend}-section.toml` (database-postgres-section.toml)\n- **Deployment-specific**: `{mode}-{topic}-section.toml` (enterprise-options-section.toml)\n\n## Testing Fragments\n\n```\n# Validate form that uses fragment\nnu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo --backend web\n\n# Verify constraint interpolation works\ngrep "constraint\." forms/fragments/*.toml\n\n# Check nickel_path consistency\ngrep "nickel_path" forms/fragments/*.toml | sort\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05
+# Fragments
+
+Reusable form fragments organized FLAT in this directory (not nested subdirectories).
+
+## Purpose
+
+Fragments provide:
+- **Reusable sections** - Used by multiple forms
+- **Modularity** - Change once, applies to all forms using it
+- **Organization** - Named by purpose (workspace, server, queue, etc.)
+- **DRY principle** - Don't repeat configuration sections
+
+## Fragment Organization
+
+**CRITICAL**: All fragments are stored at the SAME LEVEL (flat directory).
+
+```
+fragments/
+├── workspace-section.toml              # Workspace configuration
+├── server-section.toml                 # HTTP server settings
+├── database-rocksdb-section.toml       # RocksDB database
+├── database-surrealdb-section.toml     # SurrealDB database
+├── database-postgres-section.toml      # PostgreSQL database
+├── security-section.toml               # Auth, RBAC, encryption
+├── monitoring-section.toml             # Metrics, health checks
+├── logging-section.toml                # Log configuration
+├── orchestrator-queue-section.toml     # Orchestrator queue config
+├── orchestrator-workflow-section.toml  # Orchestrator batch workflow
+├── orchestrator-storage-section.toml   # Orchestrator storage backend
+├── control-center-jwt-section.toml     # Control Center JWT
+├── control-center-rbac-section.toml    # Control Center RBAC
+├── control-center-compliance-section.toml
+├── mcp-capabilities-section.toml       # MCP capabilities
+├── mcp-tools-section.toml              # MCP tools configuration
+├── mcp-resources-section.toml          # MCP resource limits
+├── deployment-mode-section.toml        # Deployment mode selection
+├── resources-section.toml              # Resource allocation (CPU, RAM, disk)
+└── README.md                           # This file
+```
+
+Referenced in forms as:
+```toml
+[[items]]
+name = "workspace_group"
+type = "group"
+includes = ["fragments/workspace-section.toml"]  # Flat reference
+
+[[items]]
+name = "queue_group"
+type = "group"
+includes = ["fragments/orchestrator-queue-section.toml"]  # Same level
+```
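+
+Because includes are resolved against this flat layout, a nested path in an `includes` entry will not resolve. A quick sanity check (a sketch, assuming GNU grep, run from the directory that contains `forms/`):
+
+```bash
+# List every fragment referenced by the forms and flag any reference
+# that does not resolve to a file in the flat fragments/ directory.
+grep -ho 'fragments/[A-Za-z0-9_-]*\.toml' forms/*.toml | sort -u | while read -r frag; do
+  [ -f "forms/$frag" ] || echo "unresolved include: $frag"
+done
+```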
+
+## Fragment Categories
+
+### Common Fragments (Used by Multiple Services)
+
+- **workspace-section.toml** - Workspace name, path, enable/disable
+- **server-section.toml** - HTTP server host, port, workers, keep-alive
+- **database-rocksdb-section.toml** - RocksDB path (filesystem-backed)
+- **database-surrealdb-section.toml** - SurrealDB embedded (no external service)
+- **database-postgres-section.toml** - PostgreSQL server connection
+- **security-section.toml** - JWT issuer, RBAC, encryption keys
+- **monitoring-section.toml** - Metrics interval, health checks
+- **logging-section.toml** - Log level, format, rotation
+- **resources-section.toml** - CPU cores, memory, disk allocation
+- **deployment-mode-section.toml** - Solo/MultiUser/CI/CD/Enterprise selection
+
+### Service-Specific Fragments
+
+**Orchestrator** (workflow engine):
+- **orchestrator-queue-section.toml** - Max concurrent tasks, retries, timeout
+- **orchestrator-workflow-section.toml** - Batch workflow settings, parallelism
+- **orchestrator-storage-section.toml** - Storage backend selection
+
+**Control Center** (policy/RBAC):
+- **control-center-jwt-section.toml** - JWT issuer, audience, token expiration
+- **control-center-rbac-section.toml** - Roles, permissions, policies
+- **control-center-compliance-section.toml** - SOC2, HIPAA, audit logging
+
+**MCP Server** (protocol):
+- **mcp-capabilities-section.toml** - Tools, prompts, resources, sampling
+- **mcp-tools-section.toml** - Tool timeout, max concurrent, categories
+- **mcp-resources-section.toml** - Max size, caching, TTL
+
+## Fragment Structure
+
+Each fragment is a TOML file containing `[[elements]]` definitions:
+
+```toml
+# fragments/workspace-section.toml
+
+[[elements]]
+border_top = true
+border_bottom = true
+name = "workspace_header"
+title = "🗂️ Workspace Configuration"
+type = "section_header"
+
+[[elements]]
+name = "workspace_name"
+type = "text"
+prompt = "Workspace Name"
+default = "default"
+required = true
+help = "Name of the workspace this service will serve"
+nickel_path = ["orchestrator", "workspace", "name"]
+
+[[elements]]
+name = "workspace_path"
+type = "text"
+prompt = "Workspace Path"
+default = "/var/lib/provisioning/orchestrator"
+required = true
+help = "Absolute path to the workspace directory"
+nickel_path = ["orchestrator", "workspace", "path"]
+
+[[elements]]
+name = "workspace_enabled"
+type = "confirm"
+prompt = "Enable Workspace?"
+default = true
+help = "Enable or disable this workspace"
+nickel_path = ["orchestrator", "workspace", "enabled"]
+```
+
+## Fragment Composition
+
+Fragments are included in main forms:
+
+```toml
+# forms/orchestrator-form.toml
+
+name = "orchestrator_configuration"
+description = "Interactive configuration for Orchestrator"
+
+# Include fragments in order
+
+[[items]]
+name = "deployment_group"
+type = "group"
+includes = ["fragments/deployment-mode-section.toml"]
+
+[[items]]
+name = "workspace_group"
+type = "group"
+includes = ["fragments/workspace-section.toml"]
+
+[[items]]
+name = "server_group"
+type = "group"
+includes = ["fragments/server-section.toml"]
+
+[[items]]
+name = "storage_group"
+type = "group"
+includes = ["fragments/orchestrator-storage-section.toml"]
+
+[[items]]
+name = "queue_group"
+type = "group"
+includes = ["fragments/orchestrator-queue-section.toml"]
+
+# Optional sections
+[[items]]
+name = "monitoring_group"
+type = "group"
+when = "enable_monitoring == true"
+includes = ["fragments/monitoring-section.toml"]
+```
+
+## Element Requirements
+
+Every element in a fragment MUST include:
+
+1. **name** - Unique identifier (used in form data)
+2. **type** - Element type (text, number, confirm, select, etc.)
+3. **prompt** - User-facing label
+4. **nickel_path** - Mapping to Nickel structure (**CRITICAL**)
+
+Example:
+```toml
+[[elements]]
+name = "max_concurrent_tasks"                                    # Unique identifier
+type = "number"                                                  # Type
+prompt = "Maximum Concurrent Tasks"                              # User label
+nickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]  # Nickel mapping
+```
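+
+The `nickel_path` above is what turns flat form answers into the nested Nickel record. As a minimal illustration (not the exact generator output), an answer of `10` for this element would land at:
+
+```nickel
+# Illustrative only: the nested record implied by
+# nickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]
+{
+  orchestrator = {
+    queue = {
+      max_concurrent_tasks = 10
+    }
+  }
+}
+```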
+
+## Constraint Interpolation
+
+Fragments reference constraints dynamically:
+
+```toml
+[[elements]]
+name = "max_concurrent_tasks"
+type = "number"
+prompt = "Maximum Concurrent Tasks"
+min = "${constraint.orchestrator.queue.concurrent_tasks.min}"  # Dynamic
+max = "${constraint.orchestrator.queue.concurrent_tasks.max}"  # Dynamic
+nickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]
+```
+
+The `${constraint.path.to.value}` syntax references `constraints/constraints.toml`.
+
+## Common Fragment Patterns
+
+### Workspace Fragment Pattern
+```toml
+[[elements]]
+name = "workspace_name"
+type = "text"
+prompt = "Workspace Name"
+nickel_path = ["orchestrator", "workspace", "name"]
+
+[[elements]]
+name = "workspace_path"
+type = "text"
+prompt = "Workspace Path"
+nickel_path = ["orchestrator", "workspace", "path"]
+
+[[elements]]
+name = "workspace_enabled"
+type = "confirm"
+prompt = "Enable Workspace?"
+nickel_path = ["orchestrator", "workspace", "enabled"]
+```
+
+### Server Fragment Pattern
+```toml
+[[elements]]
+name = "server_host"
+type = "text"
+prompt = "Server Host"
+default = "127.0.0.1"
+nickel_path = ["orchestrator", "server", "host"]
+
+[[elements]]
+name = "server_port"
+type = "number"
+prompt = "Server Port"
+min = "${constraint.common.server.port.min}"
+max = "${constraint.common.server.port.max}"
+nickel_path = ["orchestrator", "server", "port"]
+
+[[elements]]
+name = "server_workers"
+type = "number"
+prompt = "Worker Threads"
+min = 1
+max = 32
+nickel_path = ["orchestrator", "server", "workers"]
+```
+
+### Database Selection Pattern
+```toml
+[[elements]]
+name = "storage_backend"
+type = "select"
+prompt = "Storage Backend"
+options = [
+  { value = "filesystem", label = "📁 Filesystem" },
+  { value = "rocksdb", label = "🗄️ RocksDB (Embedded)" },
+  { value = "surrealdb", label = "📊 SurrealDB" },
+  { value = "postgres", label = "🐘 PostgreSQL" },
+]
+nickel_path = ["orchestrator", "storage", "backend"]
+
+[[elements]]
+name = "rocksdb_group"
+type = "group"
+when = "storage_backend == 'rocksdb'"
+includes = ["fragments/database-rocksdb-section.toml"]
+
+[[elements]]
+name = "postgres_group"
+type = "group"
+when = "storage_backend == 'postgres'"
+includes = ["fragments/database-postgres-section.toml"]
+
+[[elements]]
+name = "surrealdb_group"
+type = "group"
+when = "storage_backend == 'surrealdb'"
+includes = ["fragments/database-surrealdb-section.toml"]
+```
+
+## Best Practices
+
+1. **Clear naming** - Fragment name describes its purpose (queue-section, not qs)
+2. **Meaningful headers** - Each fragment starts with a section header (name, title, emoji)
+3. **Constraint interpolation** - Use `${constraint.*}` for dynamic validation
+4. **Consistent nickel_path** - Paths match actual Nickel structure
+5. **Provide defaults** - Sensible defaults improve UX
+6. **Help text** - Explain each field clearly
+7. **Group logically** - Related fields in same fragment
+8. **Test with form** - Verify fragment loads correctly in form
+
+## Adding a New Fragment
+
+1. **Create fragment file** in `forms/fragments/{name}-section.toml`
+2. **Add section header** (name, title, emoji)
+3. **Add form elements**:
+   - Include `name`, `type`, `prompt`
+   - Add `nickel_path` (CRITICAL)
+   - Add constraints if applicable
+   - Add `help` and `default` if appropriate
+4. **Include in form** - Add to main form via `includes` field
+5. **Test** - Run configuration wizard to verify fragment loads
+
+## Fragment Naming Convention
+
+- **Section fragments**: `{topic}-section.toml` (workspace-section.toml)
+- **Service-specific**: `{service}-{topic}-section.toml` (orchestrator-queue-section.toml)
+- **Database-specific**: `database-{backend}-section.toml` (database-postgres-section.toml)
+- **Deployment-specific**: `{mode}-{topic}-section.toml` (enterprise-options-section.toml)
+
+## Testing Fragments
+
+```bash
+# Validate form that uses fragment
+nu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo --backend web
+
+# Verify constraint interpolation works
+grep "constraint\." forms/fragments/*.toml
+
+# Check nickel_path consistency
+grep "nickel_path" forms/fragments/*.toml | sort
+```
+
+---
+
+**Version**: 1.0.0
+**Last Updated**: 2025-01-05
\ No newline at end of file
diff --git a/.typedialog/platform/forms/fragments/constraint_interpolation_guide.md b/.typedialog/platform/forms/fragments/constraint_interpolation_guide.md
index 47a57d7..f846762 100644
--- a/.typedialog/platform/forms/fragments/constraint_interpolation_guide.md
+++ b/.typedialog/platform/forms/fragments/constraint_interpolation_guide.md
@@ -1 +1,225 @@
-# Constraint Interpolation Guide\n\n## Overview\n\nTypeDialog form fields can reference constraints from `constraints.toml` using Jinja2-style template syntax. This provides a **single source of truth** for validation limits across forms, Nickel schemas, and validators.\n\n## Pattern\n\nAll numeric form fields should use constraint interpolation for `min` and `max` values:\n\n```\n[[elements]]\nname = "field_name"\ntype = "number"\ndefault = 5\nhelp = "Field description (range: ${constraint.path.to.constraint.min}-${constraint.path.to.constraint.max})"\nmin = "${constraint.path.to.constraint.min}"\nmax = "${constraint.path.to.constraint.max}"\nnickel_path = ["path", "to", "field"]\nprompt = "Field Label"\n```\n\n## Benefits\n\n1. **Single Source of Truth**: Constraints defined once in `constraints.toml`, used everywhere\n2. **Dynamic Validation**: If constraint changes, all forms automatically get updated ranges\n3. **User-Friendly**: Forms show actual valid ranges in help text\n4. 
**Type Safety**: Constraints match Nickel schema contract ranges\n\n## Complete Constraint Mapping\n\n### Orchestrator Fragments\n\n| Fragment | Field | Constraint Path | Min | Max |\n| ---------- | ------- | ----------------- | ----- | ----- |\n| `queue-section.toml` | `queue_max_concurrent_tasks` | `orchestrator.queue.concurrent_tasks` | 1 | 100 |\n| `queue-section.toml` | `queue_retry_attempts` | `orchestrator.queue.retry_attempts` | 0 | 10 |\n| `queue-section.toml` | `queue_retry_delay` | `orchestrator.queue.retry_delay` | 1000 | 60000 |\n| `queue-section.toml` | `queue_task_timeout` | `orchestrator.queue.task_timeout` | 60000 | 86400000 |\n| `batch-section.toml` | `batch_parallel_limit` | `orchestrator.batch.parallel_limit` | 1 | 50 |\n| `batch-section.toml` | `batch_operation_timeout` | `orchestrator.batch.operation_timeout` | 60000 | 3600000 |\n| `extensions-section.toml` | `extensions_max_concurrent` | `orchestrator.extensions.max_concurrent` | 1 | 20 |\n| `extensions-section.toml` | `extensions_discovery_interval` | Not in constraints (use reasonable bounds) | 300 | 86400 |\n| `extensions-section.toml` | `extensions_init_timeout` | Not in constraints (use reasonable bounds) | 1000 | 300000 |\n| `extensions-section.toml` | `extensions_sandbox_max_memory_mb` | Not in constraints (use reasonable bounds) | 64 | 4096 |\n| `performance-section.toml` | `memory_max_heap_mb` | Not in constraints (use mode-based bounds) | 256 | 131072 |\n| `performance-section.toml` | `profiling_sample_rate` | Not in constraints (use reasonable bounds) | 10 | 1000 |\n| `storage-section.toml` | `storage_cache_ttl` | Not in constraints (use 60-3600) | 60 | 3600 |\n| `storage-section.toml` | `storage_cache_max_entries` | Not in constraints (use 10-100000) | 10 | 100000 |\n| `storage-section.toml` | `storage_compression_level` | Not in constraints (zstd: 1-19) | 1 | 19 |\n| `storage-section.toml` | `storage_gc_retention` | Not in constraints (use 3600-31536000) | 3600 | 31536000 |\n| `storage-section.toml` | `storage_gc_interval` | Not in constraints (use 300-86400) | 300 | 86400 |\n\n### Control Center Fragments\n\n| Fragment | Field | Constraint Path | Min | Max |\n| ---------- | ------- | ----------------- | ----- | ----- |\n| `security-section.toml` | `jwt_token_expiration` | `control_center.jwt.token_expiration` | 300 | 604800 |\n| `security-section.toml` | `jwt_refresh_expiration` | `control_center.jwt.refresh_expiration` | 3600 | 2592000 |\n| `security-section.toml` | `rate_limiting_max_requests` | `control_center.rate_limiting.max_requests` | 10 | 10000 |\n| `security-section.toml` | `rate_limiting_window` | `control_center.rate_limiting.window_seconds` | 1 | 3600 |\n| `security-section.toml` | `users_sessions_max_active` | Not in constraints (use 1-100) | 1 | 100 |\n| `security-section.toml` | `users_sessions_idle_timeout` | Not in constraints (use 300-86400) | 300 | 86400 |\n| `security-section.toml` | `users_sessions_absolute_timeout` | Not in constraints (use 3600-2592000) | 3600 | 2592000 |\n| `policy-section.toml` | `policy_cache_ttl` | Not in constraints (use 60-86400) | 60 | 86400 |\n| `policy-section.toml` | `policy_cache_max_policies` | Not in constraints (use 100-10000) | 100 | 10000 |\n| `policy-section.toml` | `policy_versioning_max_versions` | Not in constraints (use 1-100) | 1 | 100 |\n| `users-section.toml` | `users_registration_auto_role` | Not in constraints (select field, not numeric) | - | - |\n| `users-section.toml` | `users_sessions_max_active` | Not in constraints (use 1-100) | 1 
| 100 |\n| `users-section.toml` | `users_sessions_idle_timeout` | Not in constraints (use 300-86400) | 300 | 86400 |\n| `users-section.toml` | `users_sessions_absolute_timeout` | Not in constraints (use 3600-2592000) | 3600 | 2592000 |\n| `compliance-section.toml` | `audit_retention_days` | `control_center.audit.retention_days` | 1 | 3650 |\n| `compliance-section.toml` | `compliance_validation_interval` | Not in constraints (use 1-168 hours) | 1 | 168 |\n| `compliance-section.toml` | `compliance_data_retention_years` | Not in constraints (use 1-30) | 1 | 30 |\n| `compliance-section.toml` | `compliance_audit_log_days` | Not in constraints (use 90-10950) | 90 | 10950 |\n\n### MCP Server Fragments\n\n| Fragment | Field | Constraint Path | Min | Max |\n| ---------- | ------- | ----------------- | ----- | ----- |\n| `tools-section.toml` | `tools_max_concurrent` | `mcp_server.tools.max_concurrent` | 1 | 20 |\n| `tools-section.toml` | `tools_timeout` | `mcp_server.tools.timeout` | 5000 | 600000 |\n| `prompts-section.toml` | `prompts_max_templates` | `mcp_server.prompts.max_templates` | 1 | 100 |\n| `prompts-section.toml` | `prompts_cache_ttl` | Not in constraints (use 60-86400) | 60 | 86400 |\n| `prompts-section.toml` | `prompts_versioning_max_versions` | Not in constraints (use 1-100) | 1 | 100 |\n| `resources-section.toml` | `resources_max_size` | `mcp_server.resources.max_size` | 1048576 | 1073741824 |\n| `resources-section.toml` | `resources_cache_max_size_mb` | Not in constraints (use 10-10240) | 10 | 10240 |\n| `resources-section.toml` | `resources_cache_ttl` | `mcp_server.resources.cache_ttl` | 60 | 3600 |\n| `resources-section.toml` | `resources_validation_max_depth` | Not in constraints (use 1-100) | 1 | 100 |\n| `sampling-section.toml` | `sampling_max_tokens` | `mcp_server.sampling.max_tokens` | 100 | 100000 |\n| `sampling-section.toml` | `sampling_temperature` | Not in constraints (use 0.0-2.0) | 0.0 | 2.0 |\n| `sampling-section.toml` | `sampling_cache_ttl` | Not in constraints (use 60-3600) | 60 | 3600 |\n\n### Common/Shared Fragments\n\n| Fragment | Field | Constraint Path | Min | Max |\n| ---------- | ------- | ----------------- | ----- | ----- |\n| `server-section.toml` | `server_port` | `common.server.port` | 1024 | 65535 |\n| `server-section.toml` | `server_workers` | `common.server.workers` | 1 | 32 |\n| `server-section.toml` | `server_max_connections` | `common.server.max_connections` | 10 | 10000 |\n| `server-section.toml` | `server_keep_alive` | `common.server.keep_alive` | 0 | 600 |\n| `monitoring-section.toml` | `monitoring_metrics_interval` | `common.monitoring.metrics_interval` | 10 | 300 |\n| `monitoring-section.toml` | `monitoring_health_check_interval` | `common.monitoring.health_check_interval` | 5 | 300 |\n| `logging-section.toml` | `logging_max_file_size` | `common.logging.max_file_size` | 1048576 | 1073741824 |\n| `logging-section.toml` | `logging_max_backups` | `common.logging.max_backups` | 1 | 100 |\n| `database-rocksdb-section.toml` | `database_pool_size` | Not in constraints (use 1-100) | 1 | 100 |\n| `database-rocksdb-section.toml` | `database_timeout` | Not in constraints (use 10-3600) | 10 | 3600 |\n| `database-rocksdb-section.toml` | `database_retry_attempts` | Not in constraints (use 0-10) | 0 | 10 |\n| `database-rocksdb-section.toml` | `database_retry_delay` | Not in constraints (use 1000-60000) | 1000 | 60000 |\n| `database-surrealdb-section.toml` | `pool_size` | Not in constraints (use 1-200) | 1 | 200 |\n| `database-surrealdb-section.toml` | `timeout` 
| Not in constraints (use 10-3600) | 10 | 3600 |\n| `database-postgres-section.toml` | `postgres_port` | Not in constraints (use 1024-65535) | 1024 | 65535 |\n| `database-postgres-section.toml` | `postgres_pool_size` | Not in constraints (use 5-200) | 5 | 200 |\n\n### Installer Fragments\n\n| Fragment | Field | Constraint Path | Min | Max |\n| ---------- | ------- | ----------------- | ----- | ----- |\n| `target-section.toml` | `remote_ssh_port` | `common.server.port` | 1024 | 65535 |\n| `preflight-section.toml` | `min_disk_gb` | `deployment.solo.disk_gb.min` (mode-dependent) | Variable | Variable |\n| `preflight-section.toml` | `min_memory_gb` | `deployment.solo.memory_mb.min` (mode-dependent) | Variable | Variable |\n| `preflight-section.toml` | `min_cpu_cores` | `deployment.solo.cpu.min` (mode-dependent) | Variable | Variable |\n| `installation-section.toml` | `parallel_services` | Not in constraints (use 1-10) | 1 | 10 |\n| `installation-section.toml` | `installation_timeout_seconds` | Not in constraints (use 0-14400) | 0 | 14400 |\n| `installation-section.toml` | `log_level` | Not in constraints (select field, not numeric) | - | - |\n| `installation-section.toml` | `validation_timeout` | Not in constraints (use 5000-300000) | 5000 | 300000 |\n| `services-section.toml` | `orchestrator_port` | `common.server.port` | 1024 | 65535 |\n| `services-section.toml` | `control_center_port` | `common.server.port` | 1024 | 65535 |\n| `services-section.toml` | `mcp_server_port` | `common.server.port` | 1024 | 65535 |\n| `services-section.toml` | `api_gateway_port` | `common.server.port` | 1024 | 65535 |\n| `database-section.toml` | `connection_pool_size` | Not in constraints (use 1-100) | 1 | 100 |\n| `database-section.toml` | `connection_pool_timeout` | Not in constraints (use 10-3600) | 10 | 3600 |\n| `database-section.toml` | `connection_idle_timeout` | Not in constraints (use 60-14400) | 60 | 14400 |\n| `storage-section.toml` | `storage_size_gb` | Not in constraints (use 10-100000) | 10 | 100000 |\n| `storage-section.toml` | `storage_replication_factor` | Not in constraints (use 2-10) | 2 | 10 |\n| `networking-section.toml` | `load_balancer_http_port` | `common.server.port` | 1024 | 65535 |\n| `networking-section.toml` | `load_balancer_https_port` | `common.server.port` | 1024 | 65535 |\n| `ha-section.toml` | `ha_cluster_size` | Not in constraints (use 3-256) | 3 | 256 |\n| `ha-section.toml` | `ha_db_quorum_size` | Not in constraints (use 1-max_cluster_size) | 1 | 256 |\n| `ha-section.toml` | `ha_health_check_interval` | Not in constraints (use 1-120) | 1 | 120 |\n| `ha-section.toml` | `ha_health_check_failure_threshold` | Not in constraints (use 1-10) | 1 | 10 |\n| `ha-section.toml` | `ha_failover_delay` | Not in constraints (use 0-600) | 0 | 600 |\n| `upgrades-section.toml` | `rolling_upgrade_parallel` | Not in constraints (use 1-10) | 1 | 10 |\n| `upgrades-section.toml` | `canary_percentage` | Not in constraints (use 1-50) | 1 | 50 |\n| `upgrades-section.toml` | `canary_duration_seconds` | Not in constraints (use 30-3600) | 30 | 3600 |\n\n## Fragments Status\n\n### ✅ Completed (Constraints Interpolated)\n- `server-section.toml` - All numeric fields updated\n- `monitoring-section.toml` - Core metrics interval updated\n- `orchestrator/queue-section.toml` - All queue fields updated\n- `orchestrator/batch-section.toml` - Parallel limit and operation timeout updated\n- `mcp-server/tools-section.toml` - Tools concurrency and timeout updated\n\n### ⏳ Remaining (Need Updates)\n- All other 
orchestrator fragments (extensions, performance, storage)\n- All control-center fragments (security, policy, users, compliance)\n- Remaining MCP fragments (prompts, resources, sampling)\n- All installer fragments (target, preflight, installation, services, database, storage, networking, ha, upgrades)\n- All database fragments (rocksdb, surrealdb, postgres)\n- logging-section.toml\n\n## How to Add Constraints to a Fragment\n\n1. **Identify numeric fields** with `type = "number"` that have `min` and/or `max` values\n2. **Find the constraint path** in the mapping table above\n3. **Update the field** with constraint references:\n\n```\n# Before\n[[elements]]\ndefault = 5\nmin = 1\nmax = 100\nname = "my_field"\ntype = "number"\n\n# After\n[[elements]]\ndefault = 5\nhelp = "Field description (range: ${constraint.path.to.field.min}-${constraint.path.to.field.max})"\nmin = "${constraint.path.to.field.min}"\nmax = "${constraint.path.to.field.max}"\nname = "my_field"\ntype = "number"\n```\n\n4. **For fields without existing constraints**, add reasonable bounds based on the domain:\n - Timeouts: typically 1 second to 1 hour (1000-3600000 ms)\n - Counters: typically 1-100 or 1-1000\n - Memory: use deployment mode constraints (64MB-256GB)\n - Ports: use `common.server.port` (1024-65535)\n\n5. **Test** that the constraint is accessible in `constraints.toml`\n\n## Example: Adding Constraint to a New Field\n\n```\n[[elements]]\ndefault = 3600\nhelp = "Cache timeout in seconds (range: ${constraint.common.monitoring.health_check_interval.min}-${constraint.common.monitoring.health_check_interval.max})"\nmin = "${constraint.common.monitoring.health_check_interval.min}"\nmax = "${constraint.common.monitoring.health_check_interval.max}"\nname = "cache_timeout_seconds"\nnickel_path = ["cache", "timeout_seconds"]\nprompt = "Cache Timeout (seconds)"\ntype = "number"\n```\n\n## Integration with TypeDialog\n\nWhen TypeDialog processes forms:\n\n1. **Load time**: Constraint references are resolved from `constraints.toml`\n2. **Validation**: User input is validated against resolved min/max values\n3. **Help text**: Ranges are shown to user in help messages\n4. **Nickel generation**: Jinja2 templates receive validated values\n\n## See Also\n\n- `provisioning/.typedialog/provisioning/platform/constraints/constraints.toml` - Constraint definitions\n- `constraint_update_status.md` - Progress tracking for constraint interpolation updates\n- `provisioning/.typedialog/provisioning/platform/templates/*.j2` - Jinja2 templates for code generation\n- `provisioning/schemas/` - Nickel schemas (use same ranges as constraints) +# Constraint Interpolation Guide + +## Overview + +TypeDialog form fields can reference constraints from `constraints.toml` using Jinja2-style template syntax. This provides a **single source of truth** for validation limits across forms, Nickel schemas, and validators. + +## Pattern + +All numeric form fields should use constraint interpolation for `min` and `max` values: + +```toml +[[elements]] +name = "field_name" +type = "number" +default = 5 +help = "Field description (range: ${constraint.path.to.constraint.min}-${constraint.path.to.constraint.max})" +min = "${constraint.path.to.constraint.min}" +max = "${constraint.path.to.constraint.max}" +nickel_path = ["path", "to", "field"] +prompt = "Field Label" +``` + +## Benefits + +1. **Single Source of Truth**: Constraints defined once in `constraints.toml`, used everywhere +2. 
**Dynamic Validation**: If constraint changes, all forms automatically get updated ranges +3. **User-Friendly**: Forms show actual valid ranges in help text +4. **Type Safety**: Constraints match Nickel schema contract ranges + +## Complete Constraint Mapping + +### Orchestrator Fragments + +| Fragment | Field | Constraint Path | Min | Max | +| ---------- | ------- | ----------------- | ----- | ----- | +| `queue-section.toml` | `queue_max_concurrent_tasks` | `orchestrator.queue.concurrent_tasks` | 1 | 100 | +| `queue-section.toml` | `queue_retry_attempts` | `orchestrator.queue.retry_attempts` | 0 | 10 | +| `queue-section.toml` | `queue_retry_delay` | `orchestrator.queue.retry_delay` | 1000 | 60000 | +| `queue-section.toml` | `queue_task_timeout` | `orchestrator.queue.task_timeout` | 60000 | 86400000 | +| `batch-section.toml` | `batch_parallel_limit` | `orchestrator.batch.parallel_limit` | 1 | 50 | +| `batch-section.toml` | `batch_operation_timeout` | `orchestrator.batch.operation_timeout` | 60000 | 3600000 | +| `extensions-section.toml` | `extensions_max_concurrent` | `orchestrator.extensions.max_concurrent` | 1 | 20 | +| `extensions-section.toml` | `extensions_discovery_interval` | Not in constraints (use reasonable bounds) | 300 | 86400 | +| `extensions-section.toml` | `extensions_init_timeout` | Not in constraints (use reasonable bounds) | 1000 | 300000 | +| `extensions-section.toml` | `extensions_sandbox_max_memory_mb` | Not in constraints (use reasonable bounds) | 64 | 4096 | +| `performance-section.toml` | `memory_max_heap_mb` | Not in constraints (use mode-based bounds) | 256 | 131072 | +| `performance-section.toml` | `profiling_sample_rate` | Not in constraints (use reasonable bounds) | 10 | 1000 | +| `storage-section.toml` | `storage_cache_ttl` | Not in constraints (use 60-3600) | 60 | 3600 | +| `storage-section.toml` | `storage_cache_max_entries` | Not in constraints (use 10-100000) | 10 | 100000 | +| `storage-section.toml` | `storage_compression_level` | Not in constraints (zstd: 1-19) | 1 | 19 | +| `storage-section.toml` | `storage_gc_retention` | Not in constraints (use 3600-31536000) | 3600 | 31536000 | +| `storage-section.toml` | `storage_gc_interval` | Not in constraints (use 300-86400) | 300 | 86400 | + +### Control Center Fragments + +| Fragment | Field | Constraint Path | Min | Max | +| ---------- | ------- | ----------------- | ----- | ----- | +| `security-section.toml` | `jwt_token_expiration` | `control_center.jwt.token_expiration` | 300 | 604800 | +| `security-section.toml` | `jwt_refresh_expiration` | `control_center.jwt.refresh_expiration` | 3600 | 2592000 | +| `security-section.toml` | `rate_limiting_max_requests` | `control_center.rate_limiting.max_requests` | 10 | 10000 | +| `security-section.toml` | `rate_limiting_window` | `control_center.rate_limiting.window_seconds` | 1 | 3600 | +| `security-section.toml` | `users_sessions_max_active` | Not in constraints (use 1-100) | 1 | 100 | +| `security-section.toml` | `users_sessions_idle_timeout` | Not in constraints (use 300-86400) | 300 | 86400 | +| `security-section.toml` | `users_sessions_absolute_timeout` | Not in constraints (use 3600-2592000) | 3600 | 2592000 | +| `policy-section.toml` | `policy_cache_ttl` | Not in constraints (use 60-86400) | 60 | 86400 | +| `policy-section.toml` | `policy_cache_max_policies` | Not in constraints (use 100-10000) | 100 | 10000 | +| `policy-section.toml` | `policy_versioning_max_versions` | Not in constraints (use 1-100) | 1 | 100 | +| `users-section.toml` | 
`users_registration_auto_role` | Not in constraints (select field, not numeric) | - | - | +| `users-section.toml` | `users_sessions_max_active` | Not in constraints (use 1-100) | 1 | 100 | +| `users-section.toml` | `users_sessions_idle_timeout` | Not in constraints (use 300-86400) | 300 | 86400 | +| `users-section.toml` | `users_sessions_absolute_timeout` | Not in constraints (use 3600-2592000) | 3600 | 2592000 | +| `compliance-section.toml` | `audit_retention_days` | `control_center.audit.retention_days` | 1 | 3650 | +| `compliance-section.toml` | `compliance_validation_interval` | Not in constraints (use 1-168 hours) | 1 | 168 | +| `compliance-section.toml` | `compliance_data_retention_years` | Not in constraints (use 1-30) | 1 | 30 | +| `compliance-section.toml` | `compliance_audit_log_days` | Not in constraints (use 90-10950) | 90 | 10950 | + +### MCP Server Fragments + +| Fragment | Field | Constraint Path | Min | Max | +| ---------- | ------- | ----------------- | ----- | ----- | +| `tools-section.toml` | `tools_max_concurrent` | `mcp_server.tools.max_concurrent` | 1 | 20 | +| `tools-section.toml` | `tools_timeout` | `mcp_server.tools.timeout` | 5000 | 600000 | +| `prompts-section.toml` | `prompts_max_templates` | `mcp_server.prompts.max_templates` | 1 | 100 | +| `prompts-section.toml` | `prompts_cache_ttl` | Not in constraints (use 60-86400) | 60 | 86400 | +| `prompts-section.toml` | `prompts_versioning_max_versions` | Not in constraints (use 1-100) | 1 | 100 | +| `resources-section.toml` | `resources_max_size` | `mcp_server.resources.max_size` | 1048576 | 1073741824 | +| `resources-section.toml` | `resources_cache_max_size_mb` | Not in constraints (use 10-10240) | 10 | 10240 | +| `resources-section.toml` | `resources_cache_ttl` | `mcp_server.resources.cache_ttl` | 60 | 3600 | +| `resources-section.toml` | `resources_validation_max_depth` | Not in constraints (use 1-100) | 1 | 100 | +| `sampling-section.toml` | `sampling_max_tokens` | `mcp_server.sampling.max_tokens` | 100 | 100000 | +| `sampling-section.toml` | `sampling_temperature` | Not in constraints (use 0.0-2.0) | 0.0 | 2.0 | +| `sampling-section.toml` | `sampling_cache_ttl` | Not in constraints (use 60-3600) | 60 | 3600 | + +### Common/Shared Fragments + +| Fragment | Field | Constraint Path | Min | Max | +| ---------- | ------- | ----------------- | ----- | ----- | +| `server-section.toml` | `server_port` | `common.server.port` | 1024 | 65535 | +| `server-section.toml` | `server_workers` | `common.server.workers` | 1 | 32 | +| `server-section.toml` | `server_max_connections` | `common.server.max_connections` | 10 | 10000 | +| `server-section.toml` | `server_keep_alive` | `common.server.keep_alive` | 0 | 600 | +| `monitoring-section.toml` | `monitoring_metrics_interval` | `common.monitoring.metrics_interval` | 10 | 300 | +| `monitoring-section.toml` | `monitoring_health_check_interval` | `common.monitoring.health_check_interval` | 5 | 300 | +| `logging-section.toml` | `logging_max_file_size` | `common.logging.max_file_size` | 1048576 | 1073741824 | +| `logging-section.toml` | `logging_max_backups` | `common.logging.max_backups` | 1 | 100 | +| `database-rocksdb-section.toml` | `database_pool_size` | Not in constraints (use 1-100) | 1 | 100 | +| `database-rocksdb-section.toml` | `database_timeout` | Not in constraints (use 10-3600) | 10 | 3600 | +| `database-rocksdb-section.toml` | `database_retry_attempts` | Not in constraints (use 0-10) | 0 | 10 | +| `database-rocksdb-section.toml` | `database_retry_delay` | Not in 
constraints (use 1000-60000) | 1000 | 60000 | +| `database-surrealdb-section.toml` | `pool_size` | Not in constraints (use 1-200) | 1 | 200 | +| `database-surrealdb-section.toml` | `timeout` | Not in constraints (use 10-3600) | 10 | 3600 | +| `database-postgres-section.toml` | `postgres_port` | Not in constraints (use 1024-65535) | 1024 | 65535 | +| `database-postgres-section.toml` | `postgres_pool_size` | Not in constraints (use 5-200) | 5 | 200 | + +### Installer Fragments + +| Fragment | Field | Constraint Path | Min | Max | +| ---------- | ------- | ----------------- | ----- | ----- | +| `target-section.toml` | `remote_ssh_port` | `common.server.port` | 1024 | 65535 | +| `preflight-section.toml` | `min_disk_gb` | `deployment.solo.disk_gb.min` (mode-dependent) | Variable | Variable | +| `preflight-section.toml` | `min_memory_gb` | `deployment.solo.memory_mb.min` (mode-dependent) | Variable | Variable | +| `preflight-section.toml` | `min_cpu_cores` | `deployment.solo.cpu.min` (mode-dependent) | Variable | Variable | +| `installation-section.toml` | `parallel_services` | Not in constraints (use 1-10) | 1 | 10 | +| `installation-section.toml` | `installation_timeout_seconds` | Not in constraints (use 0-14400) | 0 | 14400 | +| `installation-section.toml` | `log_level` | Not in constraints (select field, not numeric) | - | - | +| `installation-section.toml` | `validation_timeout` | Not in constraints (use 5000-300000) | 5000 | 300000 | +| `services-section.toml` | `orchestrator_port` | `common.server.port` | 1024 | 65535 | +| `services-section.toml` | `control_center_port` | `common.server.port` | 1024 | 65535 | +| `services-section.toml` | `mcp_server_port` | `common.server.port` | 1024 | 65535 | +| `services-section.toml` | `api_gateway_port` | `common.server.port` | 1024 | 65535 | +| `database-section.toml` | `connection_pool_size` | Not in constraints (use 1-100) | 1 | 100 | +| `database-section.toml` | `connection_pool_timeout` | Not in constraints (use 10-3600) | 10 | 3600 | +| `database-section.toml` | `connection_idle_timeout` | Not in constraints (use 60-14400) | 60 | 14400 | +| `storage-section.toml` | `storage_size_gb` | Not in constraints (use 10-100000) | 10 | 100000 | +| `storage-section.toml` | `storage_replication_factor` | Not in constraints (use 2-10) | 2 | 10 | +| `networking-section.toml` | `load_balancer_http_port` | `common.server.port` | 1024 | 65535 | +| `networking-section.toml` | `load_balancer_https_port` | `common.server.port` | 1024 | 65535 | +| `ha-section.toml` | `ha_cluster_size` | Not in constraints (use 3-256) | 3 | 256 | +| `ha-section.toml` | `ha_db_quorum_size` | Not in constraints (use 1-max_cluster_size) | 1 | 256 | +| `ha-section.toml` | `ha_health_check_interval` | Not in constraints (use 1-120) | 1 | 120 | +| `ha-section.toml` | `ha_health_check_failure_threshold` | Not in constraints (use 1-10) | 1 | 10 | +| `ha-section.toml` | `ha_failover_delay` | Not in constraints (use 0-600) | 0 | 600 | +| `upgrades-section.toml` | `rolling_upgrade_parallel` | Not in constraints (use 1-10) | 1 | 10 | +| `upgrades-section.toml` | `canary_percentage` | Not in constraints (use 1-50) | 1 | 50 | +| `upgrades-section.toml` | `canary_duration_seconds` | Not in constraints (use 30-3600) | 30 | 3600 | + +## Fragments Status + +### ✅ Completed (Constraints Interpolated) +- `server-section.toml` - All numeric fields updated +- `monitoring-section.toml` - Core metrics interval updated +- `orchestrator/queue-section.toml` - All queue fields updated +- 
`orchestrator/batch-section.toml` - Parallel limit and operation timeout updated +- `mcp-server/tools-section.toml` - Tools concurrency and timeout updated + +### ⏳ Remaining (Need Updates) +- All other orchestrator fragments (extensions, performance, storage) +- All control-center fragments (security, policy, users, compliance) +- Remaining MCP fragments (prompts, resources, sampling) +- All installer fragments (target, preflight, installation, services, database, storage, networking, ha, upgrades) +- All database fragments (rocksdb, surrealdb, postgres) +- logging-section.toml + +## How to Add Constraints to a Fragment + +1. **Identify numeric fields** with `type = "number"` that have `min` and/or `max` values +2. **Find the constraint path** in the mapping table above +3. **Update the field** with constraint references: + +```toml +# Before +[[elements]] +default = 5 +min = 1 +max = 100 +name = "my_field" +type = "number" + +# After +[[elements]] +default = 5 +help = "Field description (range: ${constraint.path.to.field.min}-${constraint.path.to.field.max})" +min = "${constraint.path.to.field.min}" +max = "${constraint.path.to.field.max}" +name = "my_field" +type = "number" +``` + +4. **For fields without existing constraints**, add reasonable bounds based on the domain: + - Timeouts: typically 1 second to 1 hour (1000-3600000 ms) + - Counters: typically 1-100 or 1-1000 + - Memory: use deployment mode constraints (64MB-256GB) + - Ports: use `common.server.port` (1024-65535) + +5. **Test** that the constraint is accessible in `constraints.toml` + +## Example: Adding Constraint to a New Field + +```toml +[[elements]] +default = 3600 +help = "Cache timeout in seconds (range: ${constraint.common.monitoring.health_check_interval.min}-${constraint.common.monitoring.health_check_interval.max})" +min = "${constraint.common.monitoring.health_check_interval.min}" +max = "${constraint.common.monitoring.health_check_interval.max}" +name = "cache_timeout_seconds" +nickel_path = ["cache", "timeout_seconds"] +prompt = "Cache Timeout (seconds)" +type = "number" +``` + +## Integration with TypeDialog + +When TypeDialog processes forms: + +1. **Load time**: Constraint references are resolved from `constraints.toml` +2. **Validation**: User input is validated against resolved min/max values +3. **Help text**: Ranges are shown to user in help messages +4. **Nickel generation**: Jinja2 templates receive validated values + +## See Also + +- `provisioning/.typedialog/provisioning/platform/constraints/constraints.toml` - Constraint definitions +- `constraint_update_status.md` - Progress tracking for constraint interpolation updates +- `provisioning/.typedialog/provisioning/platform/templates/*.j2` - Jinja2 templates for code generation +- `provisioning/schemas/` - Nickel schemas (use same ranges as constraints) \ No newline at end of file diff --git a/.typedialog/platform/forms/fragments/constraint_update_status.md b/.typedialog/platform/forms/fragments/constraint_update_status.md index c81ead2..eb0aab2 100644 --- a/.typedialog/platform/forms/fragments/constraint_update_status.md +++ b/.typedialog/platform/forms/fragments/constraint_update_status.md @@ -1 +1,298 @@ -# Constraint Interpolation Update Status\n\n**Date**: 2025-01-05\n**Status**: Phase 1.5 - COMPLETE ✅ All Constraint Interpolation Finished\n**Progress**: 33 / 33 fragments updated (100%)\n\n## Summary\n\nConstraint interpolation has been implemented for critical numeric form fields, providing a single source of truth for validation limits. 
The comprehensive mapping guide documents which constraints should be applied to remaining fragments.\n\n## Completed Fragments ✅\n\n### Common/Shared Fragments\n- ✅ **server-section.toml** (100%)\n - server_port → `common.server.port`\n - server_workers → `common.server.workers`\n - server_max_connections → `common.server.max_connections`\n - server_keep_alive → `common.server.keep_alive`\n\n- ✅ **monitoring-section.toml** (1 of 1 critical field)\n - monitoring_metrics_interval → `common.monitoring.metrics_interval`\n\n### Orchestrator Fragments\n- ✅ **orchestrator/queue-section.toml** (100%)\n - queue_max_concurrent_tasks → `orchestrator.queue.concurrent_tasks`\n - queue_retry_attempts → `orchestrator.queue.retry_attempts`\n - queue_retry_delay → `orchestrator.queue.retry_delay`\n - queue_task_timeout → `orchestrator.queue.task_timeout`\n\n- ✅ **orchestrator/batch-section.toml** (2 of 2 critical fields)\n - batch_parallel_limit → `orchestrator.batch.parallel_limit`\n - batch_operation_timeout → `orchestrator.batch.operation_timeout`\n\n### MCP Server Fragments\n- ✅ **mcp-server/tools-section.toml** (100%)\n - tools_max_concurrent → `mcp_server.tools.max_concurrent`\n - tools_timeout → `mcp_server.tools.timeout`\n\n- ✅ **mcp-server/prompts-section.toml** (100%)\n - prompts_max_templates → `mcp_server.prompts.max_templates`\n - prompts_cache_ttl → reasonable bounds (60-86400)\n - prompts_versioning_max_versions → reasonable bounds (1-100)\n\n- ✅ **mcp-server/resources-section.toml** (100%)\n - resources_max_size → `mcp_server.resources.max_size`\n - resources_cache_ttl → `mcp_server.resources.cache_ttl`\n - resources_cache_max_size_mb → reasonable bounds (10-10240)\n - resources_validation_max_depth → reasonable bounds (1-100)\n\n- ✅ **mcp-server/sampling-section.toml** (100%)\n - sampling_max_tokens → `mcp_server.sampling.max_tokens`\n - sampling_cache_ttl → reasonable bounds (60-3600)\n\n### Control Center Fragments\n- ✅ **control-center/security-section.toml** (100%)\n - jwt_token_expiration → `control_center.jwt.token_expiration`\n - jwt_refresh_expiration → `control_center.jwt.refresh_expiration`\n - rate_limiting_max_requests → `control_center.rate_limiting.max_requests`\n - rate_limiting_window → `control_center.rate_limiting.window_seconds`\n\n- ✅ **control-center/compliance-section.toml** (100%)\n - audit_retention_days → `control_center.audit.retention_days`\n - compliance_validation_interval → reasonable bounds (1-168 hours)\n - compliance_data_retention_years → reasonable bounds (1-30)\n - compliance_audit_log_days → reasonable bounds (90-10950)\n\n### Shared/Common Fragments\n- ✅ **logging-section.toml** (100%)\n - logging_max_file_size → `common.logging.max_file_size`\n - logging_max_backups → `common.logging.max_backups`\n\n### Orchestrator Fragments\n- ✅ **orchestrator/extensions-section.toml** (100%)\n - extensions_max_concurrent → `orchestrator.extensions.max_concurrent`\n - extensions_discovery_interval → reasonable bounds (300-86400)\n - extensions_init_timeout → reasonable bounds (1000-300000)\n - extensions_health_check_interval → reasonable bounds (5000-300000)\n\n## All Fragments Completed ✅\n\n### Orchestrator Fragments (3/3 Complete)\n- [x] ✅ orchestrator/extensions-section.toml (100%)\n - extensions_max_concurrent → `orchestrator.extensions.max_concurrent`\n - extensions_discovery_interval, init_timeout, health_check_interval → reasonable bounds\n\n- [x] ✅ orchestrator/performance-section.toml (100% - TODAY)\n - memory_initial_heap_mb → reasonable bounds 
(128-131072)\n - profiling_memory_min_size_kb → reasonable bounds (1-1048576)\n - inline_cache_max_entries → reasonable bounds (1000-1000000)\n - inline_cache_ttl → reasonable bounds (60-86400)\n - async_io_max_in_flight → reasonable bounds (256-1048576)\n\n- [x] ✅ orchestrator/storage-section.toml (100% - TODAY)\n - storage_cache_ttl → reasonable bounds (60-86400)\n - storage_cache_max_entries → reasonable bounds (10-1000000)\n - storage_compression_level → already has max (1-19)\n - storage_gc_retention → reasonable bounds (3600-31536000 / 1 hour-1 year)\n - storage_gc_interval → reasonable bounds (300-86400)\n\n### Control Center Fragments (5/5 Complete)\n- [x] ✅ control-center/security-section.toml (100%)\n - jwt_token_expiration → `control_center.jwt.token_expiration`\n - rate_limiting_max_requests → `control_center.rate_limiting.max_requests`\n\n- [x] ✅ control-center/policy-section.toml (100% - TODAY)\n - policy_cache_ttl → reasonable bounds (60-86400)\n - policy_cache_max_policies → reasonable bounds (100-1000000)\n - policy_versioning_max_versions → reasonable bounds (1-1000)\n\n- [x] ✅ control-center/users-section.toml (100% - TODAY)\n - users_sessions_max_active → reasonable bounds (1-100)\n - users_sessions_idle_timeout → reasonable bounds (300-86400)\n - users_sessions_absolute_timeout → reasonable bounds (3600-604800 / 1 hour-1 week)\n\n- [x] ✅ control-center/compliance-section.toml (100%)\n - audit_retention_days → `control_center.audit.retention_days`\n\n- [x] ✅ control-center/rbac-section.toml (100%)\n - No numeric fields (confirm/select only)\n\n### MCP Server (3 fragments)\n- [x] ✅ mcp-server/prompts-section.toml\n\n- [x] ✅ mcp-server/resources-section.toml\n\n- [x] ✅ mcp-server/sampling-section.toml\n\n### Common Database Fragments (3 fragments)\n- [x] ✅ database-rocksdb-section.toml (100%)\n - connection_pool_size → reasonable bounds (1-100)\n - connection_pool_timeout → reasonable bounds (10-3600)\n - connection_retry_attempts → reasonable bounds (0-10)\n - connection_retry_delay → reasonable bounds (1000-60000)\n\n- [x] ✅ database-surrealdb-section.toml (100%)\n - connection_pool_size → reasonable bounds (1-200)\n - connection_pool_timeout → reasonable bounds (10-3600)\n - connection_retry_attempts → reasonable bounds (0-10)\n - connection_retry_delay → reasonable bounds (1000-60000)\n\n- [x] ✅ database-postgres-section.toml (100%)\n - postgres_port → `common.server.port`\n - postgres_pool_size → reasonable bounds (5-200)\n - postgres_pool_timeout → reasonable bounds (10-3600)\n - postgres_retry_attempts → reasonable bounds (0-10)\n - postgres_retry_delay → reasonable bounds (1000-60000)\n\n### Other Shared Fragments (1 fragment)\n- [x] ✅ logging-section.toml\n\n### Installer Fragments (10 fragments) - ALL COMPLETE ✅\n\n- [x] ✅ installer/target-section.toml (100%)\n - remote_ssh_port → `common.server.port`\n\n- [x] ✅ installer/preflight-section.toml (100%)\n - min_disk_gb → reasonable bounds (1-10000)\n - min_memory_gb → already has constraints (1-512)\n - min_cpu_cores → already has constraints (1-128)\n\n- [x] ✅ installer/installation-section.toml (100%)\n - parallel_services → reasonable bounds (1-10)\n - installation_timeout_seconds → reasonable bounds (0-14400)\n - validation_timeout → reasonable bounds (5000-300000)\n\n- [x] ✅ installer/services-section.toml (100%)\n - orchestrator_port → `common.server.port`\n - control_center_port → `common.server.port`\n - mcp_server_port → `common.server.port`\n - api_gateway_port → `common.server.port`\n\n- [x] ✅ 
installer/database-section.toml (100%)\n - connection_pool_size → reasonable bounds (1-100)\n - connection_pool_timeout → reasonable bounds (10-3600)\n - connection_idle_timeout → reasonable bounds (60-14400)\n\n- [x] ✅ installer/storage-section.toml (100%)\n - storage_size_gb → reasonable bounds (10-100000)\n - storage_replication_factor → reasonable bounds (2-10)\n\n- [x] ✅ installer/networking-section.toml (100%)\n - load_balancer_http_port → `common.server.port`\n - load_balancer_https_port → `common.server.port`\n\n- [x] ✅ installer/ha-section.toml (100%)\n - ha_cluster_size → reasonable bounds (3-256)\n - ha_db_quorum_size → reasonable bounds (1-256)\n - ha_health_check_interval → reasonable bounds (1-120)\n - ha_health_check_timeout → reasonable bounds (1000-300000)\n - ha_failover_delay → reasonable bounds (0-600)\n - ha_backup_interval → reasonable bounds (300-86400)\n - ha_metrics_interval → reasonable bounds (5-300)\n\n- [x] ✅ installer/post-install-section.toml (100%)\n - verification_timeout → reasonable bounds (30-3600)\n\n- [x] ✅ installer/upgrades-section.toml (100%)\n - rolling_upgrade_parallel → reasonable bounds (1-10)\n - canary_percentage → reasonable bounds (1-50)\n - canary_duration_seconds → reasonable bounds (30-7200)\n - maintenance_duration_seconds → reasonable bounds (600-86400)\n - backup_timeout_minutes → reasonable bounds (5-1440)\n - rollback_validation_delay → reasonable bounds (30-1800)\n - post_upgrade_health_check_interval → reasonable bounds (5-300)\n - post_upgrade_monitoring_duration → reasonable bounds (60-86400)\n\n## How to Continue\n\n1. **Reference the mapping**: See `constraint_interpolation_guide.md` for complete field → constraint mappings\n\n2. **For fragments with existing constraints** (e.g., `security-section.toml`):\n ```bash\n # Update fields using the pattern from completed fragments\n # Example: jwt_token_expiration → control_center.jwt.token_expiration\n ```\n\n3. **For fragments without existing constraints** (e.g., `performance-section.toml`):\n - Use reasonable domain-based ranges\n - Document your choice in the help text\n - Examples:\n - Timeouts: 1s-1hr range (1000-3600000 ms)\n - Thread counts: 1-32 range\n - Memory: 64MB-256GB range (use deployment modes)\n - Ports: use `common.server.port` (1024-65535)\n\n## Testing\n\nAfter updating a fragment:\n\n```\n# 1. Verify fragment syntax\ncd provisioning/.typedialog/provisioning/platform/forms/fragments\ngrep -n 'min = \|max = ' .toml | head -20\n\n# 2. Validate constraints exist\ncd ../..\ngrep -r "$(constraint path)" constraints/constraints.toml\n\n# 3. 
Test form rendering\ntypedialog-cli validate forms/-form.toml\n```\n\n## Notes\n\n### Pattern Applied\nAll numeric fields now follow this structure:\n```\n[[elements]]\ndefault = 10\nhelp = "Field description (range: ${constraint.path.min}-${constraint.path.max})"\nmin = "${constraint.path.min}"\nmax = "${constraint.path.max}"\nname = "field_name"\nnickel_path = ["path", "to", "nickel"]\nprompt = "Field Label"\ntype = "number"\n```\n\n### Benefits Realized\n- ✅ Single source of truth in `constraints.toml`\n- ✅ Help text shows actual valid ranges to users\n- ✅ TypeDialog validates input against constraints\n- ✅ Jinja2 templates receive validated values\n- ✅ Easy to update limits globally (all forms auto-update)\n\n## Completion Summary\n\n**Final Status**: 33/33 fragments (100%) ✅ COMPLETE\n\n**Work Completed Today**:\n- ✅ orchestrator/performance-section.toml (5 fields with max bounds)\n- ✅ orchestrator/storage-section.toml (4 fields with max bounds)\n- ✅ control-center/policy-section.toml (3 fields with max bounds)\n- ✅ control-center/users-section.toml (3 fields with max bounds)\n- ✅ Fragments with no numeric fields (rbac, mode-selection, workspace) verified as complete\n\n**Total Progress This Session**:\n- Started: 12/33 (36%)\n- Ended: 33/33 (100%)\n- +21 fragments updated\n- +50+ numeric fields with constraint bounds added\n\n### Next Phase: Phase 8 - Nushell Scripts\nReady to proceed with implementation:\n- Interactive configuration wizard (configure.nu)\n- Config generation from Nickel → TOML (generate-configs.nu)\n- Validation and roundtrip workflows\n- Template rendering (Docker Compose, Kubernetes)\n\n## Files\n\n- `constraints/constraints.toml` - Source of truth for all validation limits\n- `constraint_interpolation_guide.md` - Complete mapping and best practices\n- `constraint_update_status.md` - This file (progress tracking)\n\n---\n\n**To contribute**: Pick any unchecked fragment above and follow the pattern in `constraint_interpolation_guide.md`. Each constraint update takes ~5 minutes per fragment. +# Constraint Interpolation Update Status + +**Date**: 2025-01-05 +**Status**: Phase 1.5 - COMPLETE ✅ All Constraint Interpolation Finished +**Progress**: 33 / 33 fragments updated (100%) + +## Summary + +Constraint interpolation has been implemented for critical numeric form fields, providing a single source of truth for validation limits. The comprehensive mapping guide documents which constraints should be applied to remaining fragments. 
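+
+For reference, the `${constraint.*}` references assume `constraints.toml` tables shaped roughly as below. The paths and ranges here are taken from the mapping in `constraint_interpolation_guide.md`; the exact file layout may differ:
+
+```toml
+# A reference like ${constraint.orchestrator.queue.concurrent_tasks.min}
+# resolves to the `min` key of the matching table.
+[orchestrator.queue.concurrent_tasks]
+min = 1
+max = 100
+
+[common.server.port]
+min = 1024
+max = 65535
+```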
+ +## Completed Fragments ✅ + +### Common/Shared Fragments +- ✅ **server-section.toml** (100%) + - server_port → `common.server.port` + - server_workers → `common.server.workers` + - server_max_connections → `common.server.max_connections` + - server_keep_alive → `common.server.keep_alive` + +- ✅ **monitoring-section.toml** (1 of 1 critical field) + - monitoring_metrics_interval → `common.monitoring.metrics_interval` + +### Orchestrator Fragments +- ✅ **orchestrator/queue-section.toml** (100%) + - queue_max_concurrent_tasks → `orchestrator.queue.concurrent_tasks` + - queue_retry_attempts → `orchestrator.queue.retry_attempts` + - queue_retry_delay → `orchestrator.queue.retry_delay` + - queue_task_timeout → `orchestrator.queue.task_timeout` + +- ✅ **orchestrator/batch-section.toml** (2 of 2 critical fields) + - batch_parallel_limit → `orchestrator.batch.parallel_limit` + - batch_operation_timeout → `orchestrator.batch.operation_timeout` + +### MCP Server Fragments +- ✅ **mcp-server/tools-section.toml** (100%) + - tools_max_concurrent → `mcp_server.tools.max_concurrent` + - tools_timeout → `mcp_server.tools.timeout` + +- ✅ **mcp-server/prompts-section.toml** (100%) + - prompts_max_templates → `mcp_server.prompts.max_templates` + - prompts_cache_ttl → reasonable bounds (60-86400) + - prompts_versioning_max_versions → reasonable bounds (1-100) + +- ✅ **mcp-server/resources-section.toml** (100%) + - resources_max_size → `mcp_server.resources.max_size` + - resources_cache_ttl → `mcp_server.resources.cache_ttl` + - resources_cache_max_size_mb → reasonable bounds (10-10240) + - resources_validation_max_depth → reasonable bounds (1-100) + +- ✅ **mcp-server/sampling-section.toml** (100%) + - sampling_max_tokens → `mcp_server.sampling.max_tokens` + - sampling_cache_ttl → reasonable bounds (60-3600) + +### Control Center Fragments +- ✅ **control-center/security-section.toml** (100%) + - jwt_token_expiration → `control_center.jwt.token_expiration` + - jwt_refresh_expiration → `control_center.jwt.refresh_expiration` + - rate_limiting_max_requests → `control_center.rate_limiting.max_requests` + - rate_limiting_window → `control_center.rate_limiting.window_seconds` + +- ✅ **control-center/compliance-section.toml** (100%) + - audit_retention_days → `control_center.audit.retention_days` + - compliance_validation_interval → reasonable bounds (1-168 hours) + - compliance_data_retention_years → reasonable bounds (1-30) + - compliance_audit_log_days → reasonable bounds (90-10950) + +### Shared/Common Fragments +- ✅ **logging-section.toml** (100%) + - logging_max_file_size → `common.logging.max_file_size` + - logging_max_backups → `common.logging.max_backups` + +### Orchestrator Fragments +- ✅ **orchestrator/extensions-section.toml** (100%) + - extensions_max_concurrent → `orchestrator.extensions.max_concurrent` + - extensions_discovery_interval → reasonable bounds (300-86400) + - extensions_init_timeout → reasonable bounds (1000-300000) + - extensions_health_check_interval → reasonable bounds (5000-300000) + +## All Fragments Completed ✅ + +### Orchestrator Fragments (3/3 Complete) +- [x] ✅ orchestrator/extensions-section.toml (100%) + - extensions_max_concurrent → `orchestrator.extensions.max_concurrent` + - extensions_discovery_interval, init_timeout, health_check_interval → reasonable bounds + +- [x] ✅ orchestrator/performance-section.toml (100% - TODAY) + - memory_initial_heap_mb → reasonable bounds (128-131072) + - profiling_memory_min_size_kb → reasonable bounds (1-1048576) + - inline_cache_max_entries → 
reasonable bounds (1000-1000000) + - inline_cache_ttl → reasonable bounds (60-86400) + - async_io_max_in_flight → reasonable bounds (256-1048576) + +- [x] ✅ orchestrator/storage-section.toml (100% - TODAY) + - storage_cache_ttl → reasonable bounds (60-86400) + - storage_cache_max_entries → reasonable bounds (10-1000000) + - storage_compression_level → already has max (1-19) + - storage_gc_retention → reasonable bounds (3600-31536000 / 1 hour-1 year) + - storage_gc_interval → reasonable bounds (300-86400) + +### Control Center Fragments (5/5 Complete) +- [x] ✅ control-center/security-section.toml (100%) + - jwt_token_expiration → `control_center.jwt.token_expiration` + - rate_limiting_max_requests → `control_center.rate_limiting.max_requests` + +- [x] ✅ control-center/policy-section.toml (100% - TODAY) + - policy_cache_ttl → reasonable bounds (60-86400) + - policy_cache_max_policies → reasonable bounds (100-1000000) + - policy_versioning_max_versions → reasonable bounds (1-1000) + +- [x] ✅ control-center/users-section.toml (100% - TODAY) + - users_sessions_max_active → reasonable bounds (1-100) + - users_sessions_idle_timeout → reasonable bounds (300-86400) + - users_sessions_absolute_timeout → reasonable bounds (3600-604800 / 1 hour-1 week) + +- [x] ✅ control-center/compliance-section.toml (100%) + - audit_retention_days → `control_center.audit.retention_days` + +- [x] ✅ control-center/rbac-section.toml (100%) + - No numeric fields (confirm/select only) + +### MCP Server (3 fragments) +- [x] ✅ mcp-server/prompts-section.toml + +- [x] ✅ mcp-server/resources-section.toml + +- [x] ✅ mcp-server/sampling-section.toml + +### Common Database Fragments (3 fragments) +- [x] ✅ database-rocksdb-section.toml (100%) + - connection_pool_size → reasonable bounds (1-100) + - connection_pool_timeout → reasonable bounds (10-3600) + - connection_retry_attempts → reasonable bounds (0-10) + - connection_retry_delay → reasonable bounds (1000-60000) + +- [x] ✅ database-surrealdb-section.toml (100%) + - connection_pool_size → reasonable bounds (1-200) + - connection_pool_timeout → reasonable bounds (10-3600) + - connection_retry_attempts → reasonable bounds (0-10) + - connection_retry_delay → reasonable bounds (1000-60000) + +- [x] ✅ database-postgres-section.toml (100%) + - postgres_port → `common.server.port` + - postgres_pool_size → reasonable bounds (5-200) + - postgres_pool_timeout → reasonable bounds (10-3600) + - postgres_retry_attempts → reasonable bounds (0-10) + - postgres_retry_delay → reasonable bounds (1000-60000) + +### Other Shared Fragments (1 fragment) +- [x] ✅ logging-section.toml + +### Installer Fragments (10 fragments) - ALL COMPLETE ✅ + +- [x] ✅ installer/target-section.toml (100%) + - remote_ssh_port → `common.server.port` + +- [x] ✅ installer/preflight-section.toml (100%) + - min_disk_gb → reasonable bounds (1-10000) + - min_memory_gb → already has constraints (1-512) + - min_cpu_cores → already has constraints (1-128) + +- [x] ✅ installer/installation-section.toml (100%) + - parallel_services → reasonable bounds (1-10) + - installation_timeout_seconds → reasonable bounds (0-14400) + - validation_timeout → reasonable bounds (5000-300000) + +- [x] ✅ installer/services-section.toml (100%) + - orchestrator_port → `common.server.port` + - control_center_port → `common.server.port` + - mcp_server_port → `common.server.port` + - api_gateway_port → `common.server.port` + +- [x] ✅ installer/database-section.toml (100%) + - connection_pool_size → reasonable bounds (1-100) + - 
connection_pool_timeout → reasonable bounds (10-3600)
+  - connection_idle_timeout → reasonable bounds (60-14400)
+
+- [x] ✅ installer/storage-section.toml (100%)
+  - storage_size_gb → reasonable bounds (10-100000)
+  - storage_replication_factor → reasonable bounds (2-10)
+
+- [x] ✅ installer/networking-section.toml (100%)
+  - load_balancer_http_port → `common.server.port`
+  - load_balancer_https_port → `common.server.port`
+
+- [x] ✅ installer/ha-section.toml (100%)
+  - ha_cluster_size → reasonable bounds (3-256)
+  - ha_db_quorum_size → reasonable bounds (1-256)
+  - ha_health_check_interval → reasonable bounds (1-120)
+  - ha_health_check_timeout → reasonable bounds (1000-300000)
+  - ha_failover_delay → reasonable bounds (0-600)
+  - ha_backup_interval → reasonable bounds (300-86400)
+  - ha_metrics_interval → reasonable bounds (5-300)
+
+- [x] ✅ installer/post-install-section.toml (100%)
+  - verification_timeout → reasonable bounds (30-3600)
+
+- [x] ✅ installer/upgrades-section.toml (100%)
+  - rolling_upgrade_parallel → reasonable bounds (1-10)
+  - canary_percentage → reasonable bounds (1-50)
+  - canary_duration_seconds → reasonable bounds (30-7200)
+  - maintenance_duration_seconds → reasonable bounds (600-86400)
+  - backup_timeout_minutes → reasonable bounds (5-1440)
+  - rollback_validation_delay → reasonable bounds (30-1800)
+  - post_upgrade_health_check_interval → reasonable bounds (5-300)
+  - post_upgrade_monitoring_duration → reasonable bounds (60-86400)
+
+## How to Continue
+
+1. **Reference the mapping**: See `constraint_interpolation_guide.md` for complete field → constraint mappings
+
+2. **For fragments with existing constraints** (e.g., `security-section.toml`):
+   ```bash
+   # Update fields using the pattern from completed fragments
+   # Example: jwt_token_expiration → control_center.jwt.token_expiration
+   ```
+
+3. **For fragments without existing constraints** (e.g., `performance-section.toml`):
+   - Use reasonable domain-based ranges
+   - Document your choice in the help text
+   - Examples:
+     - Timeouts: 1s-1hr range (1000-3600000 ms)
+     - Thread counts: 1-32 range
+     - Memory: 64MB-256GB range (use deployment modes)
+     - Ports: use `common.server.port` (1024-65535)
+
+## Testing
+
+After updating a fragment:
+
+```bash
+# 1. Verify fragment syntax
+cd provisioning/.typedialog/provisioning/platform/forms/fragments
+grep -n 'min = \|max = ' *.toml | head -20
+
+# 2. Validate constraints exist (substitute the constraint path you used)
+cd ../..
+grep -r "common.server.port" constraints/constraints.toml
+
+# 3. Test form rendering
+typedialog-cli validate forms/*-form.toml
+```
+
+## Notes
+
+### Pattern Applied
+All numeric fields now follow this structure:
+```toml
+[[elements]]
+default = 10
+help = "Field description (range: ${constraint.path.min}-${constraint.path.max})"
+min = "${constraint.path.min}"
+max = "${constraint.path.max}"
+name = "field_name"
+nickel_path = ["path", "to", "nickel"]
+prompt = "Field Label"
+type = "number"
+```
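+
+For orientation, the constraints side of this pattern might look like the sketch below. The table layout and the retry values are assumptions for illustration; only the `common.server.port` range (1024-65535) is taken from this document, and `constraints/constraints.toml` remains the source of truth.
+
+```toml
+# Hypothetical excerpt from constraints/constraints.toml (layout assumed).
+# Under this assumption, a form field declaring min = "${common.server.port.min}"
+# would resolve against the entry below.
+[common.server.port]
+min = 1024
+max = 65535
+
+[orchestrator.queue.retry_attempts]
+min = 0    # illustrative value
+max = 10   # illustrative value
+```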
+
+### Benefits Realized
+- ✅ Single source of truth in `constraints.toml`
+- ✅ Help text shows actual valid ranges to users
+- ✅ TypeDialog validates input against constraints
+- ✅ Jinja2 templates receive validated values
+- ✅ Easy to update limits globally (all forms auto-update)
+
+## Completion Summary
+
+**Final Status**: 33/33 fragments (100%) ✅ COMPLETE
+
+**Work Completed Today**:
+- ✅ orchestrator/performance-section.toml (5 fields with max bounds)
+- ✅ orchestrator/storage-section.toml (4 fields with max bounds)
+- ✅ control-center/policy-section.toml (3 fields with max bounds)
+- ✅ control-center/users-section.toml (3 fields with max bounds)
+- ✅ Fragments with no numeric fields (rbac, mode-selection, workspace) verified as complete
+
+**Total Progress This Session**:
+- Started: 12/33 (36%)
+- Ended: 33/33 (100%)
+- +21 fragments updated
+- +50+ numeric fields with constraint bounds added
+
+### Next Phase: Phase 8 - Nushell Scripts
+Ready to proceed with implementation:
+- Interactive configuration wizard (configure.nu)
+- Config generation from Nickel → TOML (generate-configs.nu)
+- Validation and roundtrip workflows
+- Template rendering (Docker Compose, Kubernetes)
+
+## Files
+
+- `constraints/constraints.toml` - Source of truth for all validation limits
+- `constraint_interpolation_guide.md` - Complete mapping and best practices
+- `constraint_update_status.md` - This file (progress tracking)
+
+---
+
+**To contribute**: All fragments above are complete; to add a new fragment, follow the pattern in `constraint_interpolation_guide.md`. Each constraint update takes ~5 minutes per fragment.
\ No newline at end of file
diff --git a/.typedialog/platform/forms/fragments/provisioning/.typedialog/provisioning/platform/scripts/README.md b/.typedialog/platform/forms/fragments/provisioning/.typedialog/provisioning/platform/scripts/README.md
index eb94ef1..28ee5bc 100644
--- a/.typedialog/platform/forms/fragments/provisioning/.typedialog/provisioning/platform/scripts/README.md
+++ b/.typedialog/platform/forms/fragments/provisioning/.typedialog/provisioning/platform/scripts/README.md
@@ -1 +1,98 @@
-# TypeDialog + Nickel Configuration Scripts\n\nPhase 8 Nushell automation scripts for interactive configuration workflow, config generation, validation, and deployment.\n\n## Quick Start\n\n```\n# 1. Interactive Configuration (TypeDialog)\nnu scripts/configure.nu orchestrator solo\n\n# 2. Generate TOML configs\nnu scripts/generate-configs.nu orchestrator solo\n\n# 3. Validate configuration\nnu scripts/validate-config.nu provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl\n\n# 4. Render Docker Compose\nnu scripts/render-docker-compose.nu solo\n\n# 5. Full deployment workflow\nnu scripts/install-services.nu orchestrator solo --docker\n```\n\n## Scripts Overview\n\n### Shared Helpers\n- **ansi.nu** - ANSI color and emoji output formatting\n- **paths.nu** - Path validation and directory structure helpers \n- **external.nu** - Safe external command execution with error handling\n\n### Core Configuration Scripts\n- **configure.nu** - Interactive TypeDialog configuration wizard\n- **generate-configs.nu** - Export Nickel configs to TOML\n- **validate-config.nu** - Validate Nickel configuration\n\n### Rendering Scripts\n- **render-docker-compose.nu** - Render Docker Compose from Nickel templates\n- **render-kubernetes.nu** - Render Kubernetes manifests from Nickel templates\n\n### Deployment & Monitoring Scripts\n- **install-services.nu** - Full deployment orchestration\n- **detect-services.nu** - Auto-detect running services\n\n## Supported Services\n- orchestrator (port 9090)\n- control-center (port 8080)\n- mcp-server (port 8888)\n- installer (port 8000)\n\n## Supported Deployment Modes\n- solo (2 CPU, 4GB RAM)\n- multiuser (4 CPU, 8GB RAM)\n- cicd (8 CPU, 16GB RAM)\n- enterprise (16+ CPU, 32+ GB RAM)\n\n## Nushell Compliance\nAll scripts follow Nushell 0.109.0+ guidelines with proper type signatures, error handling, and no try-catch blocks.\n\n## Examples\n\n### Single Service Configuration\n```\nnu scripts/configure.nu orchestrator solo --backend web\nnu scripts/validate-config.nu provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl\nnu scripts/generate-configs.nu orchestrator solo\ncargo run -p orchestrator -- --config provisioning/platform/config/orchestrator.solo.toml\n```\n\n### Docker Compose Deployment\n```\nnu scripts/generate-configs.nu orchestrator multiuser\nnu scripts/render-docker-compose.nu multiuser\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml up -d\n```\n\n### Kubernetes Deployment\n```\nnu scripts/generate-configs.nu orchestrator enterprise\nnu scripts/render-kubernetes.nu enterprise --namespace production\nnu scripts/install-services.nu all enterprise --kubernetes --namespace production\n```\n\n## Phase 8 Status\n\n✅ Phase 8.A: Shared helper modules\n✅ Phase 8.B: Core configuration scripts \n✅ Phase 8.C: Rendering scripts\n✅ Phase 8.D: Deployment orchestration\n✅ Phase 8.E: Testing and documentation\n\n## Requirements\n\n- Nushell 0.109.1+\n- Nickel 1.15.1+\n- TypeDialog CLI\n- yq v4.50.1+\n- Docker (optional)\n- kubectl (optional)
+# TypeDialog + Nickel Configuration Scripts
+
+Phase 8 Nushell automation scripts for interactive configuration workflow, config generation, validation, and deployment.
+
+## Quick Start
+
+```bash
+# 1. Interactive Configuration (TypeDialog)
+nu scripts/configure.nu orchestrator solo
+
+# 2. Generate TOML configs
+nu scripts/generate-configs.nu orchestrator solo
+
+# 3. Validate configuration
+nu scripts/validate-config.nu provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl
+
+# 4. Render Docker Compose
+nu scripts/render-docker-compose.nu solo
+
+# 5. 
Full deployment workflow
+nu scripts/install-services.nu orchestrator solo --docker
+```
+
+## Scripts Overview
+
+### Shared Helpers
+- **ansi.nu** - ANSI color and emoji output formatting
+- **paths.nu** - Path validation and directory structure helpers
+- **external.nu** - Safe external command execution with error handling
+
+### Core Configuration Scripts
+- **configure.nu** - Interactive TypeDialog configuration wizard
+- **generate-configs.nu** - Export Nickel configs to TOML
+- **validate-config.nu** - Validate Nickel configuration
+
+### Rendering Scripts
+- **render-docker-compose.nu** - Render Docker Compose from Nickel templates
+- **render-kubernetes.nu** - Render Kubernetes manifests from Nickel templates
+
+### Deployment & Monitoring Scripts
+- **install-services.nu** - Full deployment orchestration
+- **detect-services.nu** - Auto-detect running services
+
+## Supported Services
+- orchestrator (port 9090)
+- control-center (port 8080)
+- mcp-server (port 8888)
+- installer (port 8000)
+
+## Supported Deployment Modes
+- solo (2 CPU, 4GB RAM)
+- multiuser (4 CPU, 8GB RAM)
+- cicd (8 CPU, 16GB RAM)
+- enterprise (16+ CPU, 32+ GB RAM)
+
+## Nushell Compliance
+All scripts follow Nushell 0.109.0+ guidelines with proper type signatures, error handling, and no try-catch blocks.
+
+## Examples
+
+### Single Service Configuration
+```bash
+nu scripts/configure.nu orchestrator solo --backend web
+nu scripts/validate-config.nu provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl
+nu scripts/generate-configs.nu orchestrator solo
+cargo run -p orchestrator -- --config provisioning/platform/config/orchestrator.solo.toml
+```
+
+### Docker Compose Deployment
+```bash
+nu scripts/generate-configs.nu orchestrator multiuser
+nu scripts/render-docker-compose.nu multiuser
+docker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml up -d
+```
+
+### Kubernetes Deployment
+```bash
+nu scripts/generate-configs.nu orchestrator enterprise
+nu scripts/render-kubernetes.nu enterprise --namespace production
+nu scripts/install-services.nu all enterprise --kubernetes --namespace production
+```
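+
+Because `detect-services.nu` reports port usage for the services listed above, the idea can be illustrated with a small probe. This is a sketch only, not the script's actual implementation: the helper name `probe-services` is hypothetical, the port numbers come from the Supported Services list, and `nc` is assumed to be installed.
+
+```nushell
+# Sketch: check which documented service ports answer locally.
+def probe-services []: nothing -> table {
+    [
+        { service: "orchestrator",   port: 9090 }
+        { service: "control-center", port: 8080 }
+        { service: "mcp-server",     port: 8888 }
+        { service: "installer",      port: 8000 }
+    ] | each {|row|
+        # do { } | complete captures the exit code instead of failing the pipeline
+        let check = do { ^nc -z 127.0.0.1 $row.port } | complete
+        $row | insert listening ($check.exit_code == 0)
+    }
+}
+```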
+
+## Phase 8 Status
+
+✅ Phase 8.A: Shared helper modules
+✅ Phase 8.B: Core configuration scripts
+✅ Phase 8.C: Rendering scripts
+✅ Phase 8.D: Deployment orchestration
+✅ Phase 8.E: Testing and documentation
+
+## Requirements
+
+- Nushell 0.109.1+
+- Nickel 1.15.1+
+- TypeDialog CLI
+- yq v4.50.1+
+- Docker (optional)
+- kubectl (optional)
\ No newline at end of file
diff --git a/.typedialog/platform/scripts/README.md b/.typedialog/platform/scripts/README.md
index fe6e914..80a1a19 100644
--- a/.typedialog/platform/scripts/README.md
+++ b/.typedialog/platform/scripts/README.md
@@ -1 +1,255 @@
-# Scripts\n\nNushell orchestration scripts for configuration workflow automation (NuShell 0.109+).\n\n## Purpose\n\nScripts provide:\n- **Interactive configuration wizard** - TypeDialog with nickel-roundtrip\n- **Configuration generation** - Nickel → TOML export\n- **Validation** - Nickel typecheck and constraint validation\n- **Deployment** - Docker Compose, Kubernetes, service installation\n\n## Script Organization\n\n```\nscripts/\n├── README.md # This file\n├── configure.nu # Interactive TypeDialog wizard\n├── generate-configs.nu # Nickel → TOML export\n├── validate-config.nu # Nickel typecheck\n├── render-docker-compose.nu # Docker Compose generation\n├── render-kubernetes.nu # Kubernetes manifests generation\n├── install-services.nu # Deploy platform services\n└── detect-services.nu # Auto-detect running services\n```\n\n## Scripts (Planned Implementation)\n\n### configure.nu\nInteractive configuration wizard using TypeDialog nickel-roundtrip:\n\n```\nnu provisioning/.typedialog/platform/scripts/configure.nu orchestrator solo --backend web\n```\n\nWorkflow:\n1. Loads existing config (if exists) as defaults\n2. Launches TypeDialog form (web/tui/cli)\n3. Shows form with validated constraints\n4. User edits configuration\n5. Generates updated Nickel config to `provisioning/schemas/platform/values/orchestrator.solo.ncl`\n\nUsage:\n```\nnu scripts/configure.nu [service] [mode] --backend [web|tui|cli]\n service: orchestrator | control-center | mcp-server | vault-service | extension-registry | rag | ai-service | provisioning-daemon\n mode: solo | multiuser | cicd | enterprise\n backend: web (default) | tui | cli\n```\n\n### generate-configs.nu\nExport Nickel configuration to TOML:\n\n```\nnu provisioning/.typedialog/platform/scripts/generate-configs.nu orchestrator solo\n```\n\nWorkflow:\n1. Validates Nickel config (typecheck)\n2. Exports to TOML format\n3. Saves to `provisioning/config/runtime/generated/{service}.{mode}.toml`\n\nUsage:\n```\nnu scripts/generate-configs.nu [service] [mode]\n service: orchestrator | control-center | mcp-server | vault-service | extension-registry | rag | ai-service | provisioning-daemon\n mode: solo | multiuser | cicd | enterprise\n```\n\n### validate-config.nu\nTypecheck Nickel configuration:\n\n```\nnu provisioning/.typedialog/platform/scripts/validate-config.nu provisioning/schemas/platform/values/orchestrator.solo.ncl\n```\n\nWorkflow:\n1. Runs nickel typecheck\n2. Reports errors (schema violations, constraint errors)\n3. Exits with status\n\nUsage:\n```\nnu scripts/validate-config.nu [config_path]\n config_path: Path to Nickel config file\n```\n\n### render-docker-compose.nu\nGenerate Docker Compose files from Nickel templates:\n\n```\nnu provisioning/.typedialog/platform/scripts/render-docker-compose.nu solo\n```\n\nWorkflow:\n1. Evaluates Nickel template\n2. Exports to JSON\n3. Converts to YAML (via yq)\n4. Saves to `provisioning/platform/infrastructure/docker/docker-compose.{mode}.yml`\n\nUsage:\n```\nnu scripts/render-docker-compose.nu [mode]\n mode: solo | multiuser | cicd | enterprise\n```\n\n### render-kubernetes.nu\nGenerate Kubernetes manifests:\n\n```\nnu scripts/render-kubernetes.nu solo\n```\n\nWorkflow:\n1. Evaluates Nickel templates\n2. Exports to JSON\n3. Converts to YAML\n4. Saves to `provisioning/platform/infrastructure/kubernetes/`\n\n### install-services.nu\nDeploy platform services:\n\n```\nnu scripts/install-services.nu solo --backend docker\n```\n\nWorkflow:\n1. Generates all configs for mode\n2. Renders deployment manifests\n3. Deploys services (Docker Compose or Kubernetes)\n4. Verifies service startup\n\n### detect-services.nu\nAuto-detect running services:\n\n```\nnu scripts/detect-services.nu\n```\n\nOutputs:\n- Running service list\n- Detected mode\n- Port usage\n- Container/pod status\n\n## Common Workflow\n\n```\n# 1. Configure service\nnu scripts/configure.nu orchestrator solo\n\n# 2. Validate configuration\nnu scripts/validate-config.nu provisioning/schemas/platform/values/orchestrator.solo.ncl\n\n# 3. Generate TOML\nnu scripts/generate-configs.nu orchestrator solo\n\n# 4. Review generated config\ncat provisioning/config/runtime/generated/orchestrator.solo.toml\n\n# 5. Render Docker Compose\nnu scripts/render-docker-compose.nu solo\n\n# 6. 
Deploy services\nnu scripts/install-services.nu solo --backend docker\n\n# 7. Verify running services\nnu scripts/detect-services.nu\n```\n\n## Guidelines\n\nAll scripts follow @.claude/guidelines/nushell.md (NuShell 0.109+):\n\n- **Explicit type signatures** - Function parameters with type annotations\n- **Colon notation** - Use `:` before input type, `->` before output type\n- **Error handling** - Use `do { } | complete` pattern (not try-catch)\n- **Pipeline operations** - Chain operations, avoid nested calls\n- **No mutable variables** - Use reduce/recursion instead\n- **External commands** - Use `^` prefix (`^nickel`, `^docker`, etc.)\n\nExample:\n```\nexport def main [\n service: string, # Type annotation\n mode: string\n]: nothing -> nothing { # Input/output types\n let result = do {\n ^nickel typecheck $config_path\n } | complete\n\n if $result.exit_code == 0 {\n print "✅ Validation passed"\n } else {\n print $"❌ Validation failed: ($result.stderr)"\n exit 1\n }\n}\n```\n\n## Error Handling Pattern\n\nAll scripts use `do { } | complete` for error handling:\n\n```\nlet result = do {\n ^some-command --flag value\n} | complete\n\nif $result.exit_code != 0 {\n error make {\n msg: $"Command failed: ($result.stderr)"\n }\n}\n```\n\n**Never use try-catch** (not supported in 0.109+).\n\n## Script Dependencies\n\nAll scripts assume:\n- **NuShell 0.109+** - Modern shell\n- **Nickel** (0.10+) - Configuration language\n- **TypeDialog** - Interactive forms\n- **Docker** or **kubectl** - Deployment backends\n- **yq** - YAML/JSON conversion\n- **jq** - JSON processing\n\n## Testing Scripts\n\n```\n# Validate Nushell syntax\nnu --version # Verify 0.109+\n\n# Test script execution\nnu scripts/validate-config.nu values/orchestrator.solo.ncl\n\n# Check script compliance\ngrep -r "try\|panic\|todo" scripts/ # Should be empty\n```\n\n## Adding a New Script\n\n1. **Create script file** (`scripts/{name}.nu`)\n2. **Add @.claude/guidelines/nushell.md** compliance\n3. **Define main function** with type signatures\n4. **Use do { } | complete** for error handling\n5. **Test execution**: `nu scripts/{name}.nu`\n6. **Verify**: No try-catch, no mutable vars, no panic\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05\n**Guideline**: @.claude/guidelines/nushell.md (NuShell 0.109+)
+# Scripts
+
+Nushell orchestration scripts for configuration workflow automation (NuShell 0.109+).
+
+## Purpose
+
+Scripts provide:
+- **Interactive configuration wizard** - TypeDialog with nickel-roundtrip
+- **Configuration generation** - Nickel → TOML export
+- **Validation** - Nickel typecheck and constraint validation
+- **Deployment** - Docker Compose, Kubernetes, service installation
+
+## Script Organization
+
+```
+scripts/
+├── README.md                  # This file
+├── configure.nu               # Interactive TypeDialog wizard
+├── generate-configs.nu        # Nickel → TOML export
+├── validate-config.nu         # Nickel typecheck
+├── render-docker-compose.nu   # Docker Compose generation
+├── render-kubernetes.nu       # Kubernetes manifests generation
+├── install-services.nu        # Deploy platform services
+└── detect-services.nu         # Auto-detect running services
+```
+
+## Scripts (Planned Implementation)
+
+### configure.nu
+Interactive configuration wizard using TypeDialog nickel-roundtrip:
+
+```nushell
+nu provisioning/.typedialog/platform/scripts/configure.nu orchestrator solo --backend web
+```
+
+Workflow:
+1. Loads existing config (if exists) as defaults
+2. Launches TypeDialog form (web/tui/cli)
+3. Shows form with validated constraints
+4. User edits configuration
+5. Generates updated Nickel config to `provisioning/schemas/platform/values/orchestrator.solo.ncl`
+
+Usage:
+```
+nu scripts/configure.nu [service] [mode] --backend [web|tui|cli]
+  service: orchestrator | control-center | mcp-server | vault-service | extension-registry | rag | ai-service | provisioning-daemon
+  mode: solo | multiuser | cicd | enterprise
+  backend: web (default) | tui | cli
+```
+
+### generate-configs.nu
+Export Nickel configuration to TOML:
+
+```nushell
+nu provisioning/.typedialog/platform/scripts/generate-configs.nu orchestrator solo
+```
+
+Workflow:
+1. Validates Nickel config (typecheck)
+2. Exports to TOML format
+3. Saves to `provisioning/config/runtime/generated/{service}.{mode}.toml`
+
+Usage:
+```
+nu scripts/generate-configs.nu [service] [mode]
+  service: orchestrator | control-center | mcp-server | vault-service | extension-registry | rag | ai-service | provisioning-daemon
+  mode: solo | multiuser | cicd | enterprise
+```
+
+### validate-config.nu
+Typecheck Nickel configuration:
+
+```nushell
+nu provisioning/.typedialog/platform/scripts/validate-config.nu provisioning/schemas/platform/values/orchestrator.solo.ncl
+```
+
+Workflow:
+1. Runs nickel typecheck
+2. Reports errors (schema violations, constraint errors)
+3. Exits with status
+
+Usage:
+```
+nu scripts/validate-config.nu [config_path]
+  config_path: Path to Nickel config file
+```
+
+### render-docker-compose.nu
+Generate Docker Compose files from Nickel templates:
+
+```nushell
+nu provisioning/.typedialog/platform/scripts/render-docker-compose.nu solo
+```
+
+Workflow:
+1. Evaluates Nickel template
+2. Exports to JSON
+3. Converts to YAML (via yq)
+4. Saves to `provisioning/platform/infrastructure/docker/docker-compose.{mode}.yml`
+
+Usage:
+```
+nu scripts/render-docker-compose.nu [mode]
+  mode: solo | multiuser | cicd | enterprise
+```
+
+### render-kubernetes.nu
+Generate Kubernetes manifests:
+
+```nushell
+nu scripts/render-kubernetes.nu solo
+```
+
+Workflow:
+1. Evaluates Nickel templates
+2. Exports to JSON
+3. Converts to YAML
+4. Saves to `provisioning/platform/infrastructure/kubernetes/`
+
+### install-services.nu
+Deploy platform services:
+
+```nushell
+nu scripts/install-services.nu solo --backend docker
+```
+
+Workflow:
+1. Generates all configs for mode
+2. Renders deployment manifests
+3. Deploys services (Docker Compose or Kubernetes)
+4. Verifies service startup
+
+### detect-services.nu
+Auto-detect running services:
+
+```nushell
+nu scripts/detect-services.nu
+```
+
+Outputs:
+- Running service list
+- Detected mode
+- Port usage
+- Container/pod status
+
+## Common Workflow
+
+```bash
+# 1. Configure service
+nu scripts/configure.nu orchestrator solo
+
+# 2. Validate configuration
+nu scripts/validate-config.nu provisioning/schemas/platform/values/orchestrator.solo.ncl
+
+# 3. Generate TOML
+nu scripts/generate-configs.nu orchestrator solo
+
+# 4. Review generated config
+cat provisioning/config/runtime/generated/orchestrator.solo.toml
+
+# 5. Render Docker Compose
+nu scripts/render-docker-compose.nu solo
+
+# 6. Deploy services
+nu scripts/install-services.nu solo --backend docker
+
+# 7. 
Verify running services
+nu scripts/detect-services.nu
+```
+
+## Guidelines
+
+All scripts follow @.claude/guidelines/nushell.md (NuShell 0.109+):
+
+- **Explicit type signatures** - Function parameters with type annotations
+- **Colon notation** - Use `:` before input type, `->` before output type
+- **Error handling** - Use `do { } | complete` pattern (not try-catch)
+- **Pipeline operations** - Chain operations, avoid nested calls
+- **No mutable variables** - Use reduce/recursion instead
+- **External commands** - Use `^` prefix (`^nickel`, `^docker`, etc.)
+
+Example:
+```nushell
+export def main [
+    service: string,  # Type annotation
+    mode: string
+]: nothing -> nothing {  # Input/output types
+    # Build the config path from the arguments
+    let config_path = $"provisioning/schemas/platform/values/($service).($mode).ncl"
+
+    let result = do {
+        ^nickel typecheck $config_path
+    } | complete
+
+    if $result.exit_code == 0 {
+        print "✅ Validation passed"
+    } else {
+        print $"❌ Validation failed: ($result.stderr)"
+        exit 1
+    }
+}
+```
+
+## Error Handling Pattern
+
+All scripts use `do { } | complete` for error handling:
+
+```nushell
+let result = do {
+    ^some-command --flag value
+} | complete
+
+if $result.exit_code != 0 {
+    error make {
+        msg: $"Command failed: ($result.stderr)"
+    }
+}
+```
+
+**Never use try-catch** (disallowed by the NuShell 0.109+ guidelines).
+
+## Script Dependencies
+
+All scripts assume:
+- **NuShell 0.109+** - Modern shell
+- **Nickel** (0.10+) - Configuration language
+- **TypeDialog** - Interactive forms
+- **Docker** or **kubectl** - Deployment backends
+- **yq** - YAML/JSON conversion
+- **jq** - JSON processing
+
+## Testing Scripts
+
+```bash
+# Validate Nushell syntax
+nu --version  # Verify 0.109+
+
+# Test script execution
+nu scripts/validate-config.nu values/orchestrator.solo.ncl
+
+# Check script compliance
+grep -r "try\|panic\|todo" scripts/  # Should be empty
+```
+
+## Adding a New Script
+
+1. **Create script file** (`scripts/{name}.nu`)
+2. **Follow @.claude/guidelines/nushell.md** for compliance
+3. **Define main function** with type signatures
+4. **Use do { } | complete** for error handling (see the sketch below)
+5. **Test execution**: `nu scripts/{name}.nu`
+6. **Verify**: No try-catch, no mutable vars, no panic
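+
+Putting steps 1-4 together, the core of a generate-configs-style script might look like the sketch below. The paths follow those documented above; the `^nickel export` invocation and the exact structure are assumptions for illustration, not the actual generate-configs.nu implementation.
+
+```nushell
+# Sketch: validate a Nickel config, then export it to TOML.
+export def main [
+    service: string,   # e.g. orchestrator
+    mode: string       # solo | multiuser | cicd | enterprise
+]: nothing -> nothing {
+    let src = $"provisioning/schemas/platform/values/($service).($mode).ncl"
+    let dest = $"provisioning/config/runtime/generated/($service).($mode).toml"
+
+    # Typecheck first, using do { } | complete instead of try-catch
+    let check = do { ^nickel typecheck $src } | complete
+    if $check.exit_code != 0 {
+        error make { msg: $"Typecheck failed: ($check.stderr)" }
+    }
+
+    # Export to TOML and write the generated config
+    let export = do { ^nickel export --format toml $src } | complete
+    if $export.exit_code != 0 {
+        error make { msg: $"Export failed: ($export.stderr)" }
+    }
+    $export.stdout | save --force $dest
+    print $"✅ Wrote ($dest)"
+}
+```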
+
+---
+
+**Version**: 1.0.0
+**Last Updated**: 2025-01-05
+**Guideline**: @.claude/guidelines/nushell.md (NuShell 0.109+)
\ No newline at end of file
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 37cd429..26345d8 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1 +1,131 @@
-# Provisioning Repository - Changes\n\n**Date**: 2026-01-08\n**Repository**: provisioning (standalone, nickel branch)\n**Changes**: Nickel IaC migration complete - Legacy KCL and config cleanup\n\n---\n\n## 📋 Summary\n\nComplete migration to Nickel-based infrastructure-as-code with consolidated configuration strategy. Legacy KCL schemas, deprecated config files, and redundant documentation removed. New project structure with `.cargo/`, `.github/`, and schema-driven configuration system.\n\n---\n\n## 📁 Changes by Directory\n\n### ✅ REMOVED (Legacy KCL Ecosystem)\n\n- **config/** - Deprecated TOML configs (config.defaults.toml, kms.toml, plugins.toml, etc.)\n- **config/cedar-policies/** - Legacy Cedar policies (moved to Nickel schemas)\n- **config/templates/** - Old Jinja2 templates (replaced by Nickel generator/)\n- **config/installer-examples/** - KCL-based examples\n- **docs/src/** - Legacy documentation (full migration to provisioning/docs/src/)\n- **kcl/** - Complete removal (all workspaces migrated to Nickel)\n- **tools/kcl-packager.nu** - KCL packaging system\n\n### ✅ ADDED (Nickel IaC & New Structure)\n\n- **.cargo/** - Rust build configuration (clippy settings, rustfmt.toml)\n- **.github/** - GitHub Actions CI/CD workflows\n- **schemas/** - Nickel schema definitions (primary IaC format)\n - main.ncl, provider-aws.ncl, provider-local.ncl, provider-upcloud.ncl\n - Infrastructure, deployment, services, operations schemas\n- **docs/src/architecture/adr/** - ADR updates for Nickel migration\n - adr-010-configuration-format-strategy.md\n - adr-011-nickel-migration.md\n - adr-012-nushell-nickel-plugin-cli-wrapper.md\n\n### 📝 UPDATED (Core System)\n\n- **provisioning/docs/src/** - Comprehensive product documentation\n - API reference, architecture, guides, operations, security, testing\n - Nickel configuration guide with examples\n - Migrated from legacy KCL documentation\n\n- **core/** - Updated with Nickel integration\n - Scripts, plugins, CLI updated for Nickel schema parsing\n\n- **justfiles/** - Added ci.just for Nickel-aware CI/CD\n- **README.md** - Complete restructure for Nickel-first approach\n- **.gitignore** - Updated to ignore Nickel build artifacts\n\n---\n\n## 📊 Change Statistics\n\n| Category | Removed | Added | Modified |\n| ---------- | --------- | ------- | ---------- |\n| Configuration | 50+ | 10+ | 3 |\n| Documentation | 150+ | 200+ | 40+ |\n| Infrastructure | 1 (kcl/) | - | - |\n| Plugins | 1 | - | 5+ |\n| Build System | 5 | 8+ | 3 |\n| **Total** | **~220 files** | **~250 files** | **50+ files** |\n\n## ⚠️ Breaking Changes\n\n1. **KCL Sunset**: All KCL infrastructure code removed. Migrate workspaces using `nickel-kcl-bridge` or rewrite directly in Nickel.\n2. **Config Format**: TOML configuration files moved to schema-driven Nickel system. Legacy config loading deprecated.\n3. **Documentation**: Old KCL/legacy docs removed. Use `provisioning/docs/` for current product documentation.\n4. **Plugin System**: Updated to Nickel-aware plugin API. 
Legacy Nushell plugins require recompilation.\n\n## 🔧 Migration Path\n\n```\n# For existing workspaces:\nprovisioning workspace migrate --from-kcl \n\n# For custom configs:\nnickel eval --format json | jq '.'\n```\n\n## ✨ Key Features\n\n- **Type-Safe**: Nickel schemas eliminate silent config errors\n- **Composable**: Modular infrastructure definitions with lazy evaluation\n- **Documented**: Schema validation built-in, IDE support via LSP\n- **Validated**: All imports pre-checked, circular dependencies prevented\n- **Bridge Available**: `nickel-kcl-bridge` for gradual KCL→Nickel migration\n\n---\n\n## 📝 Implementation Details\n\n### Nickel Schema System\n\n- **Three-tier architecture**: infrastructure, operations, deployment\n- **Lazy evaluation**: Efficient resource binding and composition\n- **Record merging**: Clean override patterns without duplication\n- **Type validation**: LSP-aware with IDE auto-completion\n- **Generator system**: Nickel-based dynamic configuration at runtime\n\n### Documentation Reorganization\n\n- **provisioning/docs/src/** (200+ files) - Customer-facing product docs\n- **docs/src/** (20-30 files) - Architecture and development guidelines\n- **.coder/** - Session files and implementation records\n- Separation of concerns: Product docs isolated from session artifacts\n\n### CI/CD Integration\n\n- GitHub Actions workflows for Rust, Nickel, Nushell\n- Automated schema validation pre-commit\n- Cross-platform testing (Linux, macOS)\n- Build artifact caching for fast iteration\n\n---\n\n## ⚠️ Compatibility Notes\n\n**Breaking**: KCL workspaces require migration to Nickel. Use schema-aware tooling for validation.\n\n**Migration support**: `nickel-kcl-bridge` tool and guides available in `provisioning/docs/src/development/`.\n\n**Legacy configs**: Old TOML files no longer loaded. Migrate to Nickel schema format via CLI tool.\n\n---\n\n**Status**: Nickel migration complete. System is production-ready.\n**Date**: 2026-01-08\n**Branch**: nickel +# Provisioning Repository - Changes + +**Date**: 2026-01-08 +**Repository**: provisioning (standalone, nickel branch) +**Changes**: Nickel IaC migration complete - Legacy KCL and config cleanup + +--- + +## 📋 Summary + +Complete migration to Nickel-based infrastructure-as-code with consolidated configuration strategy. Legacy KCL schemas, deprecated config files, and redundant documentation removed. New project structure with `.cargo/`, `.github/`, and schema-driven configuration system. + +--- + +## 📁 Changes by Directory + +### ✅ REMOVED (Legacy KCL Ecosystem) + +- **config/** - Deprecated TOML configs (config.defaults.toml, kms.toml, plugins.toml, etc.) 
+- **config/cedar-policies/** - Legacy Cedar policies (moved to Nickel schemas) +- **config/templates/** - Old Jinja2 templates (replaced by Nickel generator/) +- **config/installer-examples/** - KCL-based examples +- **docs/src/** - Legacy documentation (full migration to provisioning/docs/src/) +- **kcl/** - Complete removal (all workspaces migrated to Nickel) +- **tools/kcl-packager.nu** - KCL packaging system + +### ✅ ADDED (Nickel IaC & New Structure) + +- **.cargo/** - Rust build configuration (clippy settings, rustfmt.toml) +- **.github/** - GitHub Actions CI/CD workflows +- **schemas/** - Nickel schema definitions (primary IaC format) + - main.ncl, provider-aws.ncl, provider-local.ncl, provider-upcloud.ncl + - Infrastructure, deployment, services, operations schemas +- **docs/src/architecture/adr/** - ADR updates for Nickel migration + - adr-010-configuration-format-strategy.md + - adr-011-nickel-migration.md + - adr-012-nushell-nickel-plugin-cli-wrapper.md + +### 📝 UPDATED (Core System) + +- **provisioning/docs/src/** - Comprehensive product documentation + - API reference, architecture, guides, operations, security, testing + - Nickel configuration guide with examples + - Migrated from legacy KCL documentation + +- **core/** - Updated with Nickel integration + - Scripts, plugins, CLI updated for Nickel schema parsing + +- **justfiles/** - Added ci.just for Nickel-aware CI/CD +- **README.md** - Complete restructure for Nickel-first approach +- **.gitignore** - Updated to ignore Nickel build artifacts + +--- + +## 📊 Change Statistics + +| Category | Removed | Added | Modified | +| ---------- | --------- | ------- | ---------- | +| Configuration | 50+ | 10+ | 3 | +| Documentation | 150+ | 200+ | 40+ | +| Infrastructure | 1 (kcl/) | - | - | +| Plugins | 1 | - | 5+ | +| Build System | 5 | 8+ | 3 | +| **Total** | **~220 files** | **~250 files** | **50+ files** | + +## ⚠️ Breaking Changes + +1. **KCL Sunset**: All KCL infrastructure code removed. Migrate workspaces using `nickel-kcl-bridge` or rewrite directly in Nickel. +2. **Config Format**: TOML configuration files moved to schema-driven Nickel system. Legacy config loading deprecated. +3. **Documentation**: Old KCL/legacy docs removed. Use `provisioning/docs/` for current product documentation. +4. **Plugin System**: Updated to Nickel-aware plugin API. Legacy Nushell plugins require recompilation. + +## 🔧 Migration Path + +```bash +# For existing workspaces: +provisioning workspace migrate --from-kcl + +# For custom configs: +nickel eval --format json | jq '.' 
+``` + +## ✨ Key Features + +- **Type-Safe**: Nickel schemas eliminate silent config errors +- **Composable**: Modular infrastructure definitions with lazy evaluation +- **Documented**: Schema validation built-in, IDE support via LSP +- **Validated**: All imports pre-checked, circular dependencies prevented +- **Bridge Available**: `nickel-kcl-bridge` for gradual KCL→Nickel migration + +--- + +## 📝 Implementation Details + +### Nickel Schema System + +- **Three-tier architecture**: infrastructure, operations, deployment +- **Lazy evaluation**: Efficient resource binding and composition +- **Record merging**: Clean override patterns without duplication +- **Type validation**: LSP-aware with IDE auto-completion +- **Generator system**: Nickel-based dynamic configuration at runtime + +### Documentation Reorganization + +- **provisioning/docs/src/** (200+ files) - Customer-facing product docs +- **docs/src/** (20-30 files) - Architecture and development guidelines +- **.coder/** - Session files and implementation records +- Separation of concerns: Product docs isolated from session artifacts + +### CI/CD Integration + +- GitHub Actions workflows for Rust, Nickel, Nushell +- Automated schema validation pre-commit +- Cross-platform testing (Linux, macOS) +- Build artifact caching for fast iteration + +--- + +## ⚠️ Compatibility Notes + +**Breaking**: KCL workspaces require migration to Nickel. Use schema-aware tooling for validation. + +**Migration support**: `nickel-kcl-bridge` tool and guides available in `provisioning/docs/src/development/`. + +**Legacy configs**: Old TOML files no longer loaded. Migrate to Nickel schema format via CLI tool. + +--- + +**Status**: Nickel migration complete. System is production-ready. +**Date**: 2026-01-08 +**Branch**: nickel \ No newline at end of file diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index 86d2ac5..084ffa9 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -1 +1,107 @@ -# Code of Conduct\n\n## Our Pledge\n\nWe, as members, contributors, and leaders, pledge to make participation in our project and community a harassment-free experience for everyone, regardless of:\n\n- Age\n- Body size\n- Visible or invisible disability\n- Ethnicity\n- Sex characteristics\n- Gender identity and expression\n- Level of experience\n- Education\n- Socioeconomic status\n- Nationality\n- Personal appearance\n- Race\n- Caste\n- Color\n- Religion\n- Sexual identity and orientation\n\nWe pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.\n\n## Our Standards\n\nExamples of behavior that contributes to a positive environment for our community include:\n\n- Demonstrating empathy and kindness toward other people\n- Being respectful of differing opinions, viewpoints, and experiences\n- Giving and gracefully accepting constructive feedback\n- Accepting responsibility and apologizing to those affected by mistakes\n- Focusing on what is best not just for us as individuals, but for the overall community\n\nExamples of unacceptable behavior include:\n\n- The use of sexualized language or imagery\n- Trolling, insulting, or derogatory comments\n- Personal or political attacks\n- Public or private harassment\n- Publishing others' private information (doxing)\n- Other conduct which could reasonably be considered inappropriate in a professional setting\n\n## Enforcement Responsibilities\n\nProject maintainers are responsible for clarifying and enforcing our standards of acceptable behavior and will take 
appropriate corrective action in response to unacceptable behavior.\n\nMaintainers have the right and responsibility to:\n\n- Remove, edit, or reject comments, commits, code, and other contributions\n- Ban contributors for behavior they deem inappropriate, threatening, or harmful\n\n## Scope\n\nThis Code of Conduct applies to:\n\n- All community spaces (GitHub, forums, chat, events, etc.)\n- Official project channels and representations\n- Interactions between community members related to the project\n\n## Enforcement\n\nInstances of abusive, harassing, or otherwise unacceptable behavior may be reported to project maintainers:\n\n- Email: [project contact]\n- GitHub: Private security advisory\n- Issues: Report with `conduct` label (public discussions only)\n\nAll complaints will be reviewed and investigated promptly and fairly.\n\n### Enforcement Guidelines\n\n**1. Correction**\n\n- Community impact: Use of inappropriate language or unwelcoming behavior\n- Action: Private written warning with explanation and clarity on impact\n- Consequence: Warning and no further violations\n\n**2. Warning**\n\n- Community impact: Violation through single incident or series of actions\n- Action: Written warning with severity consequences for continued behavior\n- Consequence: Suspension from community interaction\n\n**3. Temporary Ban**\n\n- Community impact: Serious violation of standards\n- Action: Temporary ban from community interaction\n- Consequence: Revocation of ban after reflection period\n\n**4. Permanent Ban**\n\n- Community impact: Pattern of violating community standards\n- Action: Permanent ban from community interaction\n\n## Attribution\n\nThis Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 2.1.\n\nFor answers to common questions about this code of conduct, see the FAQ at .\n\n---\n\n**Thank you for being part of our community!**\n\nWe believe in creating a welcoming and inclusive space where everyone can contribute their best work. Together, we make this project better. +# Code of Conduct + +## Our Pledge + +We, as members, contributors, and leaders, pledge to make participation in our project and community a harassment-free experience for everyone, regardless of: + +- Age +- Body size +- Visible or invisible disability +- Ethnicity +- Sex characteristics +- Gender identity and expression +- Level of experience +- Education +- Socioeconomic status +- Nationality +- Personal appearance +- Race +- Caste +- Color +- Religion +- Sexual identity and orientation + +We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community. 
+
+## Our Standards
+
+Examples of behavior that contributes to a positive environment for our community include:
+
+- Demonstrating empathy and kindness toward other people
+- Being respectful of differing opinions, viewpoints, and experiences
+- Giving and gracefully accepting constructive feedback
+- Accepting responsibility and apologizing to those affected by mistakes
+- Focusing on what is best not just for us as individuals, but for the overall community
+
+Examples of unacceptable behavior include:
+
+- The use of sexualized language or imagery
+- Trolling, insulting, or derogatory comments
+- Personal or political attacks
+- Public or private harassment
+- Publishing others' private information (doxing)
+- Other conduct which could reasonably be considered inappropriate in a professional setting
+
+## Enforcement Responsibilities
+
+Project maintainers are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate corrective action in response to unacceptable behavior.
+
+Maintainers have the right and responsibility to:
+
+- Remove, edit, or reject comments, commits, code, and other contributions
+- Ban contributors for behavior they deem inappropriate, threatening, or harmful
+
+## Scope
+
+This Code of Conduct applies to:
+
+- All community spaces (GitHub, forums, chat, events, etc.)
+- Official project channels and representations
+- Interactions between community members related to the project
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to project maintainers:
+
+- Email: [project contact]
+- GitHub: Private security advisory
+- Issues: Report with `conduct` label (public discussions only)
+
+All complaints will be reviewed and investigated promptly and fairly.
+
+### Enforcement Guidelines
+
+**1. Correction**
+
+- Community impact: Use of inappropriate language or unwelcoming behavior
+- Action: Private written warning with explanation and clarity on impact
+- Consequence: A warning, with the expectation of no further violations
+
+**2. Warning**
+
+- Community impact: Violation through single incident or series of actions
+- Action: Written warning spelling out the consequences of continued behavior
+- Consequence: Suspension from community interaction if the behavior continues
+
+**3. Temporary Ban**
+
+- Community impact: Serious violation of standards
+- Action: Temporary ban from community interaction
+- Consequence: Ban lifted after a reflection period
+
+**4. Permanent Ban**
+
+- Community impact: Pattern of violating community standards
+- Action: Permanent ban from community interaction
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 2.1.
+
+For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq.
+
+---
+
+**Thank you for being part of our community!**
+
+We believe in creating a welcoming and inclusive space where everyone can contribute their best work. Together, we make this project better.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index ebfde00..fe52829 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1 +1,130 @@
-# Contributing to provisioning\n\nThank you for your interest in contributing! This document provides guidelines and instructions for contributing to this project.\n\n## Code of Conduct\n\nThis project adheres to a Code of Conduct. By participating, you are expected to uphold this code. 
Please see [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) for details.\n\n## Getting Started\n\n### Prerequisites\n\n- Rust 1.70+ (if project uses Rust)\n- NuShell (if project uses Nushell scripts)\n- Git\n\n### Development Setup\n\n1. Fork the repository\n2. Clone your fork: `git clone https://repo.jesusperez.pro/jesus/provisioning`\n3. Add upstream: `git remote add upstream https://repo.jesusperez.pro/jesus/provisioning`\n4. Create a branch: `git checkout -b feature/your-feature`\n\n## Development Workflow\n\n### Before You Code\n\n- Check existing issues and pull requests to avoid duplication\n- Create an issue to discuss major changes before implementing\n- Assign yourself to let others know you're working on it\n\n### Code Standards\n\n#### Rust\n\n- Run `cargo fmt --all` before committing\n- All code must pass `cargo clippy -- -D warnings`\n- Write tests for new functionality\n- Maintain 100% documentation coverage for public APIs\n\n#### Nushell\n\n- Validate scripts with `nu --ide-check 100 script.nu`\n- Follow consistent naming conventions\n- Use type hints where applicable\n\n#### Nickel\n\n- Type check schemas with `nickel typecheck`\n- Document schema fields with comments\n- Test schema validation\n\n### Commit Guidelines\n\n- Write clear, descriptive commit messages\n- Reference issues with `Fixes #123` or `Related to #123`\n- Keep commits focused on a single concern\n- Use imperative mood: "Add feature" not "Added feature"\n\n### Testing\n\nAll changes must include tests:\n\n```\n# Run all tests\ncargo test --workspace\n\n# Run with coverage\ncargo llvm-cov --all-features --lcov\n\n# Run locally before pushing\njust ci-full\n```\n\n### Pull Request Process\n\n1. Update documentation for any changed functionality\n2. Add tests for new code\n3. Ensure all CI checks pass\n4. Request review from maintainers\n5. Be responsive to feedback and iterate quickly\n\n## Review Process\n\n- Maintainers will review your PR within 3-5 business days\n- Feedback is constructive and meant to improve the code\n- All discussions should be respectful and professional\n- Once approved, maintainers will merge the PR\n\n## Reporting Bugs\n\nFound a bug? Please file an issue with:\n\n- **Title**: Clear, descriptive title\n- **Description**: What happened and what you expected\n- **Steps to reproduce**: Minimal reproducible example\n- **Environment**: OS, Rust version, etc.\n- **Screenshots**: If applicable\n\n## Suggesting Enhancements\n\nHave an idea? Please file an issue with:\n\n- **Title**: Clear feature title\n- **Description**: What, why, and how\n- **Use cases**: Real-world scenarios where this would help\n- **Alternative approaches**: If you've considered any\n\n## Documentation\n\n- Keep README.md up to date\n- Document public APIs with rustdoc comments\n- Add examples for non-obvious functionality\n- Update CHANGELOG.md with your changes\n\n## Release Process\n\nMaintainers handle releases following semantic versioning:\n\n- MAJOR: Breaking changes\n- MINOR: New features (backward compatible)\n- PATCH: Bug fixes\n\n## Questions?\n\n- Check existing documentation and issues\n- Ask in discussions or open an issue\n- Join our community channels\n\nThank you for contributing! +# Contributing to provisioning + +Thank you for your interest in contributing! This document provides guidelines and instructions for contributing to this project. + +## Code of Conduct + +This project adheres to a Code of Conduct. By participating, you are expected to uphold this code. 
Please see [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) for details.
+
+## Getting Started
+
+### Prerequisites
+
+- Rust 1.70+ (if project uses Rust)
+- NuShell (if project uses Nushell scripts)
+- Git
+
+### Development Setup
+
+1. Fork the repository
+2. Clone your fork: `git clone https://repo.jesusperez.pro/<your-username>/provisioning`
+3. Add upstream: `git remote add upstream https://repo.jesusperez.pro/jesus/provisioning`
+4. Create a branch: `git checkout -b feature/your-feature`
+
+## Development Workflow
+
+### Before You Code
+
+- Check existing issues and pull requests to avoid duplication
+- Create an issue to discuss major changes before implementing
+- Assign yourself to let others know you're working on it
+
+### Code Standards
+
+#### Rust
+
+- Run `cargo fmt --all` before committing
+- All code must pass `cargo clippy -- -D warnings`
+- Write tests for new functionality
+- Maintain 100% documentation coverage for public APIs
+
+#### Nushell
+
+- Validate scripts with `nu --ide-check 100 script.nu`
+- Follow consistent naming conventions
+- Use type hints where applicable
+
+#### Nickel
+
+- Type check schemas with `nickel typecheck`
+- Document schema fields with comments
+- Test schema validation
+
+### Commit Guidelines
+
+- Write clear, descriptive commit messages
+- Reference issues with `Fixes #123` or `Related to #123`
+- Keep commits focused on a single concern
+- Use imperative mood: "Add feature" not "Added feature"
+
+### Testing
+
+All changes must include tests:
+
+```bash
+# Run all tests
+cargo test --workspace
+
+# Run with coverage
+cargo llvm-cov --all-features --lcov
+
+# Run locally before pushing
+just ci-full
+```
+
+### Pull Request Process
+
+1. Update documentation for any changed functionality
+2. Add tests for new code
+3. Ensure all CI checks pass
+4. Request review from maintainers
+5. Be responsive to feedback and iterate quickly
+
+## Review Process
+
+- Maintainers will review your PR within 3-5 business days
+- Feedback is constructive and meant to improve the code
+- All discussions should be respectful and professional
+- Once approved, maintainers will merge the PR
+
+## Reporting Bugs
+
+Found a bug? Please file an issue with:
+
+- **Title**: Clear, descriptive title
+- **Description**: What happened and what you expected
+- **Steps to reproduce**: Minimal reproducible example
+- **Environment**: OS, Rust version, etc.
+- **Screenshots**: If applicable
+
+## Suggesting Enhancements
+
+Have an idea? Please file an issue with:
+
+- **Title**: Clear feature title
+- **Description**: What, why, and how
+- **Use cases**: Real-world scenarios where this would help
+- **Alternative approaches**: If you've considered any
+
+## Documentation
+
+- Keep README.md up to date
+- Document public APIs with rustdoc comments
+- Add examples for non-obvious functionality
+- Update CHANGELOG.md with your changes
+
+## Release Process
+
+Maintainers handle releases following semantic versioning:
+
+- MAJOR: Breaking changes
+- MINOR: New features (backward compatible)
+- PATCH: Bug fixes
+
+## Questions?
+
+- Check existing documentation and issues
+- Ask in discussions or open an issue
+- Join our community channels
+
+Thank you for contributing!
\ No newline at end of file
diff --git a/README.md b/README.md
index 1dc8194..e6f6aa5 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,1125 @@
-
[header residue: centered "Provisioning Logo" image and "Provisioning" title]
\n\n# Provisioning - Infrastructure Automation Platform\n\n> **A modular, declarative Infrastructure as Code (IaC) platform for managing complete infrastructure lifecycles**\n\n## Table of Contents\n\n- [What is Provisioning?](#what-is-provisioning)\n- [Why Provisioning?](#why-provisioning)\n- [Core Concepts](#core-concepts)\n- [Architecture](#architecture)\n- [Key Features](#key-features)\n- [Technology Stack](#technology-stack)\n- [How It Works](#how-it-works)\n- [Use Cases](#use-cases)\n- [Getting Started](#getting-started)\n\n---\n\n## What is Provisioning?\n\n**Provisioning** is a comprehensive **Infrastructure as Code (IaC)** platform designed to manage\ncomplete infrastructure lifecycles: cloud providers, infrastructure services, clusters,\nand isolated workspaces across multiple cloud/local environments.\n\nExtensible and customizable by design, it delivers type-safe, configuration-driven workflows\nwith enterprise security (encrypted configuration, Cosmian KMS integration, Cedar policy engine,\nsecrets management, authorization and permissions control, compliance checking, anomaly detection)\nand adaptable deployment modes (interactive UI, CLI automation, unattended CI/CD)\nsuitable for any scale from development to production.\n\n### Technical Definition\n\nDeclarative Infrastructure as Code (IaC) platform providing:\n\n- **Type-safe, configuration-driven workflows** with schema validation and constraint checking\n- **Modular, extensible architecture**: cloud providers, task services, clusters, workspaces\n- **Multi-cloud abstraction layer** with unified API (UpCloud, AWS, local infrastructure)\n- **High-performance state management**:\n - Graph database backend for complex relationships\n - Real-time state tracking and queries\n - Multi-model data storage (document, graph, relational)\n- **Enterprise security stack**:\n - Encrypted configuration and secrets management\n - Cosmian KMS integration for confidential key management\n - Cedar policy engine for fine-grained access control\n - Authorization and permissions control via platform services\n - Compliance checking and policy enforcement\n - Anomaly detection for security monitoring\n - Audit logging and compliance tracking\n- **Hybrid orchestration**: Rust-based performance layer + scripting flexibility\n- **Production-ready features**:\n - Batch workflows with dependency resolution\n - Checkpoint recovery and automatic rollback\n - Parallel execution with state management\n- **Adaptable deployment modes**:\n - Interactive TUI for guided setup\n - Headless CLI for scripted automation\n - Unattended mode for CI/CD pipelines\n- **Hierarchical configuration system** with inheritance and overrides\n\n### What It Does\n\n- **Provisions Infrastructure** - Create servers, networks, storage across multiple cloud providers\n- **Installs Services** - Deploy Kubernetes, containerd, databases, monitoring, and 50+ infrastructure components\n- **Manages Clusters** - Orchestrate complete cluster deployments with dependency management\n- **Handles Configuration** - Hierarchical configuration system with inheritance and overrides\n- **Orchestrates Workflows** - Batch operations with parallel execution and checkpoint recovery\n- **Manages Secrets** - SOPS/Age integration for encrypted configuration\n- **Secures Infrastructure** - Enterprise security with JWT, MFA, Cedar policies, audit logging\n- **Optimizes Performance** - Native plugins providing 10-50x speed improvements\n\n---\n\n## Why Provisioning?\n\n### The Problems It Solves\n\n#### 
1. **Multi-Cloud Complexity**\n\n**Problem**: Each cloud provider has different APIs, tools, and workflows.\n\n**Solution**: Unified abstraction layer with provider-agnostic interfaces. Write configuration once, deploy anywhere using Nickel schemas.\n\n```\n# Same configuration works on UpCloud, AWS, or local infrastructure\n{\n servers = [\n {\n name = "web-01"\n plan = "medium" # Abstract size, provider-specific translation\n provider = "upcloud" # Switch to "aws" or "local" as needed\n }\n ]\n}\n```\n\n#### 2. **Dependency Hell**\n\n**Problem**: Infrastructure components have complex dependencies (Kubernetes needs containerd, Cilium needs Kubernetes, etc.).\n\n**Solution**: Automatic dependency resolution with topological sorting and health checks via Nickel schemas.\n\n```\n# Provisioning resolves: containerd → etcd → kubernetes → cilium\n{\n taskservs = ["cilium"] # Automatically installs all dependencies\n}\n```\n\n#### 3. **Configuration Sprawl**\n\n**Problem**: Environment variables, hardcoded values, scattered configuration files.\n\n**Solution**: Hierarchical configuration system with 476+ config accessors replacing 200+ ENV variables.\n\n```\nDefaults → User → Project → Infrastructure → Environment → Runtime\n```\n\n#### 4. **Imperative Scripts**\n\n**Problem**: Brittle shell scripts that don't handle failures, don't support rollback, hard to maintain.\n\n**Solution**: Declarative Nickel configurations with validation, type safety, lazy evaluation, and automatic rollback.\n\n#### 5. **Lack of Visibility**\n\n**Problem**: No insight into what's happening during deployment, hard to debug failures.\n\n**Solution**:\n\n- Real-time workflow monitoring\n- Comprehensive logging system\n- Web-based control center\n- REST API for integration\n\n#### 6. **No Standardization**\n\n**Problem**: Each team builds their own deployment tools, no shared patterns.\n\n**Solution**: Reusable task services, cluster templates, and workflow patterns.\n\n---\n\n## Core Concepts\n\n### 1. **Providers**\n\nCloud infrastructure backends that handle resource provisioning.\n\n- **UpCloud** - Primary cloud provider\n- **AWS** - Amazon Web Services integration\n- **Local** - Local infrastructure (VMs, Docker, bare metal)\n\nProviders implement a common interface, making infrastructure code portable.\n\n### 2. **Task Services (TaskServs)**\n\nReusable infrastructure components that can be installed on servers.\n\n**Categories**:\n\n- **Container Runtimes** - containerd, Docker, Podman, crun, runc, youki\n- **Orchestration** - Kubernetes, etcd, CoreDNS\n- **Networking** - Cilium, Flannel, Calico, ip-aliases\n- **Storage** - Rook-Ceph, local storage\n- **Databases** - PostgreSQL, Redis, SurrealDB\n- **Observability** - Prometheus, Grafana, Loki\n- **Security** - Webhook, KMS, Vault\n- **Development** - Gitea, Radicle, ORAS\n\nEach task service includes:\n\n- Version management\n- Dependency declarations\n- Health checks\n- Installation/uninstallation logic\n- Configuration schemas\n\n### 3. **Clusters**\n\nComplete infrastructure deployments combining servers and task services.\n\n**Examples**:\n\n- **Kubernetes Cluster** - HA control plane + worker nodes + CNI + storage\n- **Database Cluster** - Replicated PostgreSQL with backup\n- **Build Infrastructure** - BuildKit + container registry + CI/CD\n\nClusters handle:\n\n- Multi-node coordination\n- Service distribution\n- High availability\n- Rolling updates\n\n### 4. 
**Workspaces**\n\nIsolated environments for different projects or deployment stages.\n\n```\nworkspace_librecloud/ # Production workspace\n├── infra/ # Infrastructure definitions\n├── config/ # Workspace configuration\n├── extensions/ # Custom modules\n└── runtime/ # State and runtime data\n\nworkspace_dev/ # Development workspace\n├── infra/\n└── config/\n```\n\nSwitch between workspaces with single command:\n\n```\nprovisioning workspace switch librecloud\n```\n\n### 5. **Workflows**\n\nCoordinated sequences of operations with dependency management.\n\n**Types**:\n\n- **Server Workflows** - Create/delete/update servers\n- **TaskServ Workflows** - Install/remove infrastructure services\n- **Cluster Workflows** - Deploy/scale complete clusters\n- **Batch Workflows** - Multi-cloud parallel operations\n\n**Features**:\n\n- Dependency resolution\n- Parallel execution\n- Checkpoint recovery\n- Automatic rollback\n- Progress monitoring\n\n---\n\n## Architecture\n\n### System Components\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│ User Interface Layer │\n│ • CLI (provisioning command) │\n│ • Web Control Center (UI) │\n│ • REST API │\n└─────────────────────────────────────────────────────────────────┘\n ↓\n┌─────────────────────────────────────────────────────────────────┐\n│ Core Engine Layer │\n│ • Command Routing & Dispatch │\n│ • Configuration Management │\n│ • Provider Abstraction │\n│ • Utility Libraries │\n└─────────────────────────────────────────────────────────────────┘\n ↓\n┌─────────────────────────────────────────────────────────────────┐\n│ Orchestration Layer │\n│ • Workflow Orchestrator (Rust/Nushell hybrid) │\n│ • Dependency Resolver │\n│ • State Manager │\n│ • Task Scheduler │\n└─────────────────────────────────────────────────────────────────┘\n ↓\n┌─────────────────────────────────────────────────────────────────┐\n│ Extension Layer │\n│ • Providers (Cloud APIs) │\n│ • Task Services (Infrastructure Components) │\n│ • Clusters (Complete Deployments) │\n│ • Workflows (Automation Templates) │\n└─────────────────────────────────────────────────────────────────┘\n ↓\n┌─────────────────────────────────────────────────────────────────┐\n│ Infrastructure Layer │\n│ • Cloud Resources (Servers, Networks, Storage) │\n│ • Kubernetes Clusters │\n│ • Running Services │\n└─────────────────────────────────────────────────────────────────┘\n```\n\n### Directory Structure\n\n```\nproject-provisioning/\n├── provisioning/ # Core provisioning system\n│ ├── core/ # Core engine and libraries\n│ │ ├── cli/ # Command-line interface\n│ │ ├── nulib/ # Core Nushell libraries\n│ │ ├── plugins/ # System plugins (Rust native)\n│ │ └── scripts/ # Utility scripts\n│ │\n│ ├── extensions/ # Extensible components\n│ │ ├── providers/ # Cloud provider implementations\n│ │ ├── taskservs/ # Infrastructure service definitions\n│ │ ├── clusters/ # Complete cluster configurations\n│ │ └── workflows/ # Core workflow templates\n│ │\n│ ├── platform/ # Platform services\n│ │ ├── orchestrator/ # Rust orchestrator service\n│ │ ├── control-center/ # Web control center\n│ │ ├── mcp-server/ # Model Context Protocol server\n│ │ ├── api-gateway/ # REST API gateway\n│ │ ├── oci-registry/ # OCI registry for extensions\n│ │ └── installer/ # Platform installer (TUI + CLI)\n│ │\n│ ├── schemas/ # Nickel schema definitions (PRIMARY IaC)\n│ │ ├── main.ncl # Main infrastructure schema\n│ │ ├── providers/ # Provider-specific schemas\n│ │ ├── infrastructure/ # Infra definitions\n│ │ ├── deployment/ # 
Deployment schemas\n│ │ ├── services/ # Service schemas\n│ │ ├── operations/ # Operations schemas\n│ │ └── generator/ # Runtime schema generation\n│ │\n│ ├── docs/ # Product documentation (mdBook)\n│ ├── config/ # Configuration examples\n│ ├── tools/ # Build and distribution tools\n│ └── justfiles/ # Just recipes for common tasks\n│\n├── workspace/ # User workspaces and data\n│ ├── infra/ # Infrastructure definitions\n│ ├── config/ # User configuration\n│ ├── extensions/ # User extensions\n│ └── runtime/ # Runtime data and state\n│\n├── docs/ # Architecture & Development docs\n│ ├── architecture/ # System design and ADRs\n│ └── development/ # Development guidelines\n│\n└── .github/ # CI/CD workflows\n └── workflows/ # GitHub Actions (Rust, Nickel, Nushell)\n```\n\n### Platform Services\n\n#### 1. **Orchestrator** (`platform/orchestrator/`)\n\n- **Language**: Rust + Nushell\n- **Purpose**: Workflow execution, task scheduling, state management\n- **Features**:\n - File-based persistence\n - Priority processing\n - Retry logic with exponential backoff\n - Checkpoint-based recovery\n - REST API endpoints\n\n#### 2. **Control Center** (`platform/control-center/`)\n\n- **Language**: Web UI + Backend API\n- **Purpose**: Web-based infrastructure management\n- **Features**:\n - Dashboard views\n - Real-time monitoring\n - Interactive deployments\n - Log viewing\n\n#### 3. **MCP Server** (`platform/mcp-server/`)\n\n- **Language**: Nushell\n- **Purpose**: Model Context Protocol integration for AI assistance\n- **Features**:\n - 7 AI-powered settings tools\n - Intelligent config completion\n - Natural language infrastructure queries\n\n#### 4. **OCI Registry** (`platform/oci-registry/`)\n\n- **Purpose**: Extension distribution and versioning\n- **Features**:\n - Task service packages\n - Provider packages\n - Cluster templates\n - Workflow definitions\n\n#### 5. **Installer** (`platform/installer/`)\n\n- **Language**: Rust (Ratatui TUI) + Nushell\n- **Purpose**: Platform installation and setup\n- **Features**:\n - Interactive TUI mode\n - Headless CLI mode\n - Unattended CI/CD mode\n - Configuration generation\n\n---\n\n## Key Features\n\n### 1. **Modular CLI Architecture** (v3.2.0)\n\n84% code reduction with domain-driven design.\n\n- **Main CLI**: 211 lines (from 1,329 lines)\n- **80+ shortcuts**: `s` → `server`, `t` → `taskserv`, etc.\n- **Bi-directional help**: `provisioning help ws` = `provisioning ws help`\n- **7 domain modules**: infrastructure, orchestration, development, workspace, configuration, utilities, generation\n\n### 2. **Configuration System** (v2.0.0)\n\nHierarchical, config-driven architecture.\n\n- **476+ config accessors** replacing 200+ ENV variables\n- **Hierarchical loading**: defaults → user → project → infra → env → runtime\n- **Variable interpolation**: `{{paths.base}}`, `{{env.HOME}}`, `{{now.date}}`\n- **Multi-format support**: TOML, YAML, KCL\n\n### 3. **Batch Workflow System** (v3.1.0)\n\nProvider-agnostic batch operations with 85-90% token efficiency.\n\n- **Multi-cloud support**: Mixed UpCloud + AWS + local in single workflow\n- **KCL schema integration**: Type-safe workflow definitions\n- **Dependency resolution**: Topological sorting with soft/hard dependencies\n- **State management**: Checkpoint-based recovery with rollback\n- **Real-time monitoring**: Live progress tracking\n\n### 4. 
**Hybrid Orchestrator** (v3.0.0)\n\nRust/Nushell architecture solving deep call stack limitations.\n\n- **High-performance coordination layer**\n- **File-based persistence**\n- **Priority processing with retry logic**\n- **REST API for external integration**\n- **Comprehensive workflow system**\n\n### 5. **Workspace Switching** (v2.0.5)\n\nCentralized workspace management.\n\n- **Single-command switching**: `provisioning workspace switch `\n- **Automatic tracking**: Last-used timestamps, active workspace markers\n- **User preferences**: Global settings across all workspaces\n- **Workspace registry**: Centralized configuration in `user_config.yaml`\n\n### 6. **Interactive Guides** (v3.3.0)\n\nStep-by-step walkthroughs and quick references.\n\n- **Quick reference**: `provisioning sc` (fastest)\n- **Complete guides**: from-scratch, update, customize\n- **Copy-paste ready**: All commands include placeholders\n- **Beautiful rendering**: Uses glow, bat, or less\n\n### 7. **Test Environment Service** (v3.4.0)\n\nAutomated container-based testing.\n\n- **Three test types**: Single taskserv, server simulation, multi-node clusters\n- **Topology templates**: Kubernetes HA, etcd clusters, etc.\n- **Auto-cleanup**: Optional automatic cleanup after tests\n- **CI/CD integration**: Easy integration into pipelines\n\n### 8. **Platform Installer** (v3.5.0)\n\nMulti-mode installation system with TUI, CLI, and unattended modes.\n\n- **Interactive TUI**: Beautiful Ratatui terminal UI with 7 screens\n- **Headless Mode**: CLI automation for scripted installations\n- **Unattended Mode**: Zero-interaction CI/CD deployments\n- **Deployment Modes**: Solo (2 CPU/4GB), MultiUser (4 CPU/8GB), CICD (8 CPU/16GB), Enterprise (16 CPU/32GB)\n- **MCP Integration**: 7 AI-powered settings tools for intelligent configuration\n\n### 9. **Version Management System** (v3.6.0)\n\nCentralized tool and provider version management with bash-compatible export.\n\n- **Unified Version Source**: All versions defined in Nickel files (`versions.ncl` and provider `version.ncl`)\n- **Generated Versions File**: Bash-compatible KEY="VALUE" format for shell scripts\n- **Core Tools**: NUSHELL, NICKEL, SOPS, AGE, K9S with convenient aliases (NU for NUSHELL)\n- **Provider Versions**: Automatically discovers and includes all provider versions (AWS, HCLOUD, UPCTL, etc.)\n- **Command**: `provisioning setup versions` generates `/provisioning/core/versions` file\n- **Shell Integration**: Can be sourced directly in bash scripts: `source /provisioning/core/versions && echo $NU_VERSION`\n- **Usage**:\n ```bash\n # Generate versions file\n provisioning setup versions\n\n # Use in bash scripts\n source /provisioning/core/versions\n echo "Using Nushell version: $NU_VERSION"\n echo "AWS CLI version: $PROVIDER_AWS_VERSION"\n ```\n\n### 10. **Nushell Plugins Integration** (v1.0.0)\n\nThree native Rust plugins providing 10-50x performance improvements over HTTP API.\n\n- **Three Native Plugins**: auth, KMS, orchestrator\n- **Performance Gains**:\n - KMS operations: ~5ms vs ~50ms (10x faster)\n - Orchestrator queries: ~1ms vs ~30ms (30x faster)\n - Auth verification: ~10ms vs ~50ms (5x faster)\n- **OS-Native Keyring**: macOS Keychain, Linux Secret Service, Windows Credential Manager\n- **KMS Backends**: RustyVault, Age, AWS KMS, Vault, Cosmian\n- **Graceful Fallback**: Automatic fallback to HTTP if plugins not installed\n\n### 11. 
**Complete Security System** (v4.0.0)\n\nEnterprise-grade security with 39,699 lines across 12 components.\n\n- **12 Components**: JWT Auth, Cedar Authorization, MFA (TOTP + WebAuthn), Secrets Management, KMS, Audit Logging, Break-Glass, Compliance, Audit Query, Token Management, Access Control, Encryption\n- **Performance**: <20ms overhead per secure operation\n- **Testing**: 350+ comprehensive test cases\n- **API**: 83+ REST endpoints, 111+ CLI commands\n- **Standards**: GDPR, SOC2, ISO 27001 compliance\n- **Key Features**:\n - RS256 authentication with Argon2id hashing\n - Policy-as-code with hot reload\n - Multi-factor authentication (TOTP + WebAuthn/FIDO2)\n - Dynamic secrets (AWS STS, SSH keys) with TTL\n - 5 KMS backends with envelope encryption\n - 7-year audit retention with 5 export formats\n - Multi-party break-glass approval\n\n---\n\n## Technology Stack\n\n### Core Technologies\n\n| Technology | Version | Purpose | Why |\n| ------------ | --------- | --------- | ----- |\n| **Nickel** | Latest | PRIMARY - Infrastructure-as-code language | Type-safe schemas, lazy evaluation, LSP support, composable records, gradual validation |\n| **Nushell** | 0.109.0+ | Scripting and task automation | Structured data pipelines, cross-platform, modern built-in parsers (JSON/YAML/TOML) |\n| **Rust** | Latest | Platform services (orchestrator, control-center, installer) | Performance, memory safety, concurrency, reliability |\n| **KCL** | DEPRECATED | Legacy configuration (fully replaced by Nickel) | Migration bridge available; use Nickel for new work |\n\n### Data & State Management\n\n| Technology | Version | Purpose | Features |\n| ------------ | --------- | --------- | ---------- |\n| **SurrealDB** | Latest | High-performance graph database backend | Multi-model (document, graph, relational), real-time queries, distributed architecture, complex relationship tracking |\n\n### Platform Services (Rust-based)\n\n| Service | Purpose | Security Features |\n| --------- | --------- | ------------------- |\n| **Orchestrator** | Workflow execution, task scheduling, state management | File-based persistence, retry logic, checkpoint recovery |\n| **Control Center** | Web-based infrastructure management | **Authorization and permissions control**, RBAC, audit logging |\n| **Installer** | Platform installation (TUI + CLI modes) | Secure configuration generation, validation |\n| **API Gateway** | REST API for external integration | Authentication, rate limiting, request validation |\n| **MCP Server** | AI-powered configuration management | 7 settings tools, intelligent config completion |\n| **OCI Registry** | Extension distribution and versioning | Task services, providers, cluster templates |\n\n### Security & Secrets\n\n| Technology | Version | Purpose | Enterprise Features |\n| ------------ | --------- | --------- | --------------------- |\n| **SOPS** | 3.10.2+ | Secrets management | Encrypted configuration files |\n| **Age** | 1.2.1+ | Encryption | Secure key-based encryption |\n| **Cosmian KMS** | Latest | Key Management System | Confidential computing, secure key storage, cloud-native KMS |\n| **Cedar** | Latest | Policy engine | Fine-grained access control, policy-as-code, compliance checking, anomaly detection |\n| **RustyVault** | Latest | Transit encryption engine | 5ms encryption performance, multiple KMS backends |\n| **JWT** | Latest | Authentication tokens | RS256 signatures, Argon2id password hashing |\n| **Keyring** | Latest | OS-native secure storage | macOS Keychain, Linux Secret Service, 
Windows Credential Manager |\n\n### Version Management\n\n| Component | Purpose | Format |\n| ----------- | --------- | -------- |\n| **versions.ncl** | Core tool versions (Nickel primary) | Nickel schema |\n| **provider version.ncl** | Provider-specific versions | Nickel schema |\n| **provisioning setup versions** | Version file generator | Nushell command |\n| **versions file** | Bash-compatible exports | KEY="VALUE" format |\n\n**Usage**:\n```\n# Generate versions file from Nickel schemas\nprovisioning setup versions\n\n# Source in shell scripts\nsource /provisioning/core/versions\necho $NU_VERSION $PROVIDER_AWS_VERSION\n```\n\n### Optional Tools\n\n| Tool | Purpose |\n| ------ | --------- |\n| **K9s** | Kubernetes management interface |\n| **nu_plugin_tera** | Nushell plugin for Tera template rendering |\n| **nu_plugin_kcl** | Nushell plugin for KCL integration (CLI required, plugin optional) |\n| **nu_plugin_auth** | Authentication plugin (5x faster auth, OS keyring integration) |\n| **nu_plugin_kms** | KMS encryption plugin (10x faster, 5ms encryption) |\n| **nu_plugin_orchestrator** | Orchestrator plugin (30-50x faster queries) |\n| **glow** | Markdown rendering for interactive guides |\n| **bat** | Syntax highlighting for file viewing and guides |\n\n---\n\n## How It Works\n\n### Data Flow\n\n```\n1. User defines infrastructure in Nickel schemas\n ↓\n2. Nickel evaluates with type validation and lazy evaluation\n ↓\n3. CLI loads configuration (hierarchical merging)\n ↓\n4. Configuration validated against provider schemas\n ↓\n5. Workflow created with operations\n ↓\n6. Orchestrator receives workflow\n ↓\n7. Dependencies resolved (topological sort)\n ↓\n8. Operations executed in order (parallel where possible)\n ↓\n9. Providers handle cloud operations\n ↓\n10. Task services installed on servers\n ↓\n11. State persisted and monitored\n```\n\n### Example Workflow: Deploy Kubernetes Cluster\n\n**Step 1**: Define infrastructure in Nickel\n\n```\n# schemas/my-cluster.ncl\n{\n metadata = {\n name = "my-cluster"\n provider = "upcloud"\n environment = "production"\n }\n\n infrastructure = {\n servers = [\n {name = "control-01", plan = "medium", role = "control"}\n {name = "worker-01", plan = "large", role = "worker"}\n {name = "worker-02", plan = "large", role = "worker"}\n ]\n }\n\n services = {\n taskservs = ["kubernetes", "cilium", "rook-ceph"]\n }\n}\n```\n\n**Step 2**: Submit to Provisioning\n\n```\nprovisioning server create --infra my-cluster\n```\n\n**Step 3**: Provisioning executes workflow\n\n```\n1. Create workflow: "deploy-my-cluster"\n2. Resolve dependencies:\n - containerd (required by kubernetes)\n - etcd (required by kubernetes)\n - kubernetes (explicitly requested)\n - cilium (explicitly requested, requires kubernetes)\n - rook-ceph (explicitly requested, requires kubernetes)\n\n3. Execution order:\n a. Provision servers (parallel)\n b. Install containerd on all nodes\n c. Install etcd on control nodes\n d. Install kubernetes control plane\n e. Join worker nodes\n f. Install Cilium CNI\n g. Install Rook-Ceph storage\n\n4. Checkpoint after each step\n5. Monitor health checks\n6. Report completion\n```\n\n**Step 4**: Verify deployment\n\n```\nprovisioning cluster status my-cluster\n```\n\n### Configuration Hierarchy\n\nConfiguration values are resolved through a hierarchy:\n\n```\n1. System Defaults (provisioning/config/config.defaults.toml)\n ↓ (overridden by)\n2. User Preferences (~/.config/provisioning/user_config.yaml)\n ↓ (overridden by)\n3. 
Workspace Config (workspace/config/provisioning.yaml)\n ↓ (overridden by)\n4. Infrastructure Config (workspace/infra//config.toml)\n ↓ (overridden by)\n5. Environment Config (workspace/config/prod-defaults.toml)\n ↓ (overridden by)\n6. Runtime Flags (--flag value)\n```\n\n**Example**:\n\n```\n# System default\n[servers]\ndefault_plan = "small"\n\n# User preference\n[servers]\ndefault_plan = "medium" # Overrides system default\n\n# Infrastructure config\n[servers]\ndefault_plan = "large" # Overrides user preference\n\n# Runtime\nprovisioning server create --plan xlarge # Overrides everything\n```\n\n---\n\n## Use Cases\n\n### 1. **Multi-Cloud Kubernetes Deployment**\n\nDeploy Kubernetes clusters across different cloud providers with identical configuration.\n\n```\n# UpCloud cluster\nprovisioning cluster create k8s-prod --provider upcloud\n\n# AWS cluster (same config)\nprovisioning cluster create k8s-prod --provider aws\n```\n\n### 2. **Development → Staging → Production Pipeline**\n\nManage multiple environments with workspace switching.\n\n```\n# Development\nprovisioning workspace switch dev\nprovisioning cluster create app-stack\n\n# Staging (same config, different resources)\nprovisioning workspace switch staging\nprovisioning cluster create app-stack\n\n# Production (HA, larger resources)\nprovisioning workspace switch prod\nprovisioning cluster create app-stack\n```\n\n### 3. **Infrastructure as Code Testing**\n\nTest infrastructure changes before deploying to production.\n\n```\n# Test Kubernetes upgrade locally\nprovisioning test topology load kubernetes_3node | \n test env cluster kubernetes --version 1.29.0\n\n# Verify functionality\nprovisioning test env run \n\n# Cleanup\nprovisioning test env cleanup \n```\n\n### 4. **Batch Multi-Region Deployment**\n\nDeploy to multiple regions in parallel using Nickel batch workflows.\n\n```\n# schemas/batch/multi-region.ncl\n{\n batch_workflow = {\n operations = [\n {\n id = "eu-cluster"\n type = "cluster"\n region = "eu-west-1"\n cluster = "app-stack"\n }\n {\n id = "us-cluster"\n type = "cluster"\n region = "us-east-1"\n cluster = "app-stack"\n }\n {\n id = "asia-cluster"\n type = "cluster"\n region = "ap-south-1"\n cluster = "app-stack"\n }\n ]\n parallel_limit = 3 # All at once\n }\n}\n```\n\n```\nprovisioning batch submit schemas/batch/multi-region.ncl\nprovisioning batch monitor \n```\n\n### 5. **Automated Disaster Recovery**\n\nRecreate infrastructure from configuration.\n\n```\n# Infrastructure destroyed\nprovisioning workspace switch prod\n\n# Recreate from config\nprovisioning cluster create --infra backup-restore --wait\n\n# All services restored with same configuration\n```\n\n### 6. **CI/CD Integration**\n\nAutomated testing and deployment pipelines.\n\n```\n# .gitlab-ci.yml\ntest-infrastructure:\n script:\n - provisioning test quick kubernetes\n - provisioning test quick postgres\n\ndeploy-staging:\n script:\n - provisioning workspace switch staging\n - provisioning cluster create app-stack --check\n - provisioning cluster create app-stack --yes\n\ndeploy-production:\n when: manual\n script:\n - provisioning workspace switch prod\n - provisioning cluster create app-stack --yes\n```\n\n---\n\n## Getting Started\n\n### Quick Start\n\n1. **Install Prerequisites**\n\n ```bash\n # Install Nushell (0.109.0+)\n brew install nushell # macOS\n\n # Install Nickel (required for IaC)\n brew install nickel # macOS or from source\n\n # Install SOPS (optional, for encrypted secrets)\n brew install sops\n ```\n\n2. 
**Add CLI to PATH**\n\n ```bash\n ln -sf "$(pwd)/provisioning/core/cli/provisioning" /usr/local/bin/provisioning\n ```\n\n3. **Initialize Workspace**\n\n ```bash\n provisioning workspace init my-project\n cd my-project\n ```\n\n3.5. **Generate Versions File** (Optional - for bash scripts)\n\n ```bash\n provisioning setup versions\n # Creates /provisioning/core/versions with all tool and provider versions\n\n # Use in your deployment scripts\n source /provisioning/core/versions\n echo "Deploying with Nushell $NU_VERSION and AWS CLI $PROVIDER_AWS_VERSION"\n ```\n\n4. **Define Infrastructure (Nickel)**\n\n ```bash\n # Create workspace infrastructure schema\n cat > workspace/infra/my-cluster.ncl <<'EOF'\n {\n metadata.name = "my-cluster"\n metadata.provider = "upcloud"\n\n infrastructure.servers = [\n {name = "control-01", plan = "medium"}\n {name = "worker-01", plan = "large"}\n ]\n\n services.taskservs = ["kubernetes", "cilium"]\n }\n EOF\n ```\n\n5. **Deploy Infrastructure**\n\n ```bash\n # Validate configuration\n provisioning config validate\n\n # Check what will be created\n provisioning server create --check\n\n # Create servers\n provisioning server create --yes\n\n # Install Kubernetes\n provisioning taskserv create kubernetes\n ```\n\n### Learning Path\n\n1. **Start with Guides**\n\n ```bash\n provisioning sc # Quick reference\n provisioning guide from-scratch # Complete walkthrough\n ```\n\n2. **Explore Examples**\n\n ```bash\n ls provisioning/examples/\n ```\n\n3. **Read Architecture Docs**\n - [Core Engine](provisioning/core/README.md)\n - [CLI Architecture](.claude/features/cli-architecture.md)\n - [Configuration System](.claude/features/configuration-system.md)\n - [Batch Workflows](.claude/features/batch-workflow-system.md)\n\n4. **Try Test Environments**\n\n ```bash\n provisioning test quick kubernetes\n provisioning test quick postgres\n ```\n\n5. 
**Build Custom Extensions**\n - Create custom task services\n - Define cluster templates\n - Write workflow automation\n\n---\n\n## Documentation Index\n\n### User & Operations Guides\n\nSee **[provisioning/docs/src/](provisioning/docs/src/)** for comprehensive documentation:\n\n- **Quick Start** - Get started in 10 minutes\n- **Command Reference** - Complete CLI command reference\n- **Nickel Configuration Guide** - IaC language and patterns\n- **Workspace Management** - Multi-workspace guide\n- **Test Environment Guide** - Testing infrastructure with containers\n- **Plugin Integration** - Native Rust plugins (10-50x faster)\n- **Security System** - Authentication, MFA, KMS, Cedar policies\n- **Operations** - Deployment, monitoring, incident response\n\n### Architecture & Design Decisions\n\nSee **[docs/src/architecture/](docs/src/architecture/)** for design patterns:\n\n- **System Architecture** - Multi-layer design\n- **ADRs (Architecture Decision Records)** - Major decisions including:\n - ADR-011: Nickel Migration (from KCL)\n - ADR-012: Nushell + Nickel plugin wrapper\n - ADR-010: Configuration format strategy\n- **Multi-Repo Strategy** - Repository organization\n- **Integration Patterns** - How components interact\n\n### Development Guidelines\n\n- **[Repository Structure](docs/src/development/)** - Codebase organization\n- **[Contributing Guide](CONTRIBUTING.md)** - How to contribute\n- **[Nushell Guidelines](.claude/guidelines/nushell/)** - Best practices\n- **[Nickel Guidelines](.claude/guidelines/nickel.md)** - IaC patterns\n- **[Rust Guidelines](.claude/guidelines/rust/)** - Rust conventions\n\n### API Reference\n\n- **REST API** - HTTP endpoints in `provisioning/docs/src/api-reference/`\n- **Nushell API** - Library functions and modules\n- **Provider API** - Cloud provider interface specification\n\n---\n\n## Project Status\n\n**Current Version**: v5.0.0-nickel (Production Ready) | **Date**: 2026-01-08\n\n### Completed Milestones\n\n- ✅ **v5.0.0** (2026-01-08) - **Nickel IaC Migration Complete**\n - Full KCL→Nickel migration\n - Schema-driven configuration system\n - Type-safe lazy evaluation\n - ~220 legacy files removed, ~250 new schema files added\n\n- ✅ **v3.6.0** (2026-01-08) - Version Management System\n - Centralized tool and provider version management\n - Bash-compatible versions file generation\n - `provisioning setup versions` command\n - Automatic provider version discovery from Nickel schemas\n - Shell script integration with sourcing support\n\n- ✅ **v4.0.0** (2025-10-09) - Complete Security System (12 components, 39,699 lines)\n- ✅ **v3.5.0** (2025-10-07) - Platform Installer with TUI and CI/CD modes\n- ✅ **v3.4.0** (2025-10-06) - Test Environment Service with container management\n- ✅ **v3.3.0** (2025-09-30) - Interactive Guides system\n- ✅ **v3.2.0** (2025-09-30) - Modular CLI Architecture (84% code reduction)\n- ✅ **v3.1.0** (2025-09-25) - Batch Workflow System (85-90% token efficiency)\n- ✅ **v3.0.0** (2025-09-25) - Hybrid Orchestrator (Rust/Nushell)\n- ✅ **v2.0.5** (2025-10-02) - Workspace Switching system\n- ✅ **v2.0.0** (2025-09-23) - Configuration System (476+ accessors)\n- ✅ **v1.0.0** (2025-10-09) - Nushell Plugins Integration (10-50x performance)\n\n### Current Focus\n\n- **Nickel Ecosystem** - IDE support, LSP integration, schema libraries\n- **Platform Consolidation** - GitHub Actions CI/CD, cross-platform testing\n- **Extension Registry** - OCI-based distribution for task services and providers\n- **Documentation** - Complete Nickel migration 
guides, ADR updates\n\n---\n\n## Support and Community\n\n### Getting Help\n\n- **Documentation**: Start with `provisioning help` or `provisioning guide from-scratch`\n- **Issues**: Report bugs and request features on the issue tracker\n- **Discussions**: Join community discussions for questions and ideas\n\n### Contributing\n\nContributions are welcome! See [CONTRIBUTING.md](docs/development/CONTRIBUTING.md) for guidelines.\n\n**Key areas for contribution**:\n\n- New task service definitions\n- Cloud provider implementations\n- Cluster templates\n- Documentation improvements\n- Bug fixes and testing\n\n---\n\n## License\n\nSee [LICENSE](LICENSE) file in project root.\n\n---\n\n**Maintained By**: Architecture Team\n**Last Updated**: 2026-01-08 (Version Management System v3.6.0 + Nickel v5.0.0 Migration Complete)\n**Current Branch**: nickel\n**Project Home**: [provisioning/](provisioning/)\n\n---\n\n## Recent Changes (2026-01-08)\n\n### Version Management System (v3.6.0)\n\n**What Changed**:\n- ✅ Implemented `provisioning setup versions` command\n- ✅ Generates bash-compatible `/provisioning/core/versions` file\n- ✅ Automatically discovers and includes all provider versions from Nickel schemas\n- ✅ Fixed to remove redundant metadata (all sources are Nickel)\n- ✅ Core tools with aliases: NUSHELL→NU, NICKEL, SOPS, AGE, K9S\n- ✅ Shell script integration: `source /provisioning/core/versions && echo $NU_VERSION`\n\n**Files Modified**:\n- `provisioning/core/nulib/lib_provisioning/setup/utils.nu` - Core implementation\n- `provisioning/core/nulib/main_provisioning/commands/setup.nu` - Command routing\n- `provisioning/core/nulib/lib_provisioning/workspace/enforcement.nu` - Workspace exemption\n- `provisioning/README.md` - Documentation updates\n\n**Generated File Example**:\n```\nNUSHELL_VERSION="0.109.1"\nNUSHELL_SOURCE="https://github.com/nushell/nushell/releases"\nNU_VERSION="0.109.1"\nNU_SOURCE="https://github.com/nushell/nushell/releases"\n\nNICKEL_VERSION="1.15.1"\nNICKEL_SOURCE="https://github.com/tweag/nickel/releases"\n\nPROVIDER_AWS_VERSION="2.32.11"\nPROVIDER_AWS_SOURCE="https://github.com/aws/aws-cli/releases"\n# ... and more providers\n```\n\n**Key Improvements**:\n- Clean metadata (no redundant `_LIB` fields - all sources are Nickel)\n- Automatic provider discovery from `extensions/providers/*/nickel/version.ncl`\n- Direct Nickel file parsing with JSON export\n- Zero dependency on environment variables or legacy systems\n- 100% bash/shell compatible for deployment scripts \ No newline at end of file +

+*(Provisioning logo)*

+ +# Provisioning - Infrastructure Automation Platform + +> **A modular, declarative Infrastructure as Code (IaC) platform for managing complete infrastructure lifecycles** + +## Table of Contents + +- [What is Provisioning?](#what-is-provisioning) +- [Why Provisioning?](#why-provisioning) +- [Core Concepts](#core-concepts) +- [Architecture](#architecture) +- [Key Features](#key-features) +- [Technology Stack](#technology-stack) +- [How It Works](#how-it-works) +- [Use Cases](#use-cases) +- [Getting Started](#getting-started) + +--- + +## What is Provisioning? + +**Provisioning** is a comprehensive **Infrastructure as Code (IaC)** platform designed to manage +complete infrastructure lifecycles: cloud providers, infrastructure services, clusters, +and isolated workspaces across multiple cloud/local environments. + +Extensible and customizable by design, it delivers type-safe, configuration-driven workflows +with enterprise security (encrypted configuration, Cosmian KMS integration, Cedar policy engine, +secrets management, authorization and permissions control, compliance checking, anomaly detection) +and adaptable deployment modes (interactive UI, CLI automation, unattended CI/CD) +suitable for any scale from development to production. + +### Technical Definition + +Declarative Infrastructure as Code (IaC) platform providing: + +- **Type-safe, configuration-driven workflows** with schema validation and constraint checking +- **Modular, extensible architecture**: cloud providers, task services, clusters, workspaces +- **Multi-cloud abstraction layer** with unified API (UpCloud, AWS, local infrastructure) +- **High-performance state management**: + - Graph database backend for complex relationships + - Real-time state tracking and queries + - Multi-model data storage (document, graph, relational) +- **Enterprise security stack**: + - Encrypted configuration and secrets management + - Cosmian KMS integration for confidential key management + - Cedar policy engine for fine-grained access control + - Authorization and permissions control via platform services + - Compliance checking and policy enforcement + - Anomaly detection for security monitoring + - Audit logging and compliance tracking +- **Hybrid orchestration**: Rust-based performance layer + scripting flexibility +- **Production-ready features**: + - Batch workflows with dependency resolution + - Checkpoint recovery and automatic rollback + - Parallel execution with state management +- **Adaptable deployment modes**: + - Interactive TUI for guided setup + - Headless CLI for scripted automation + - Unattended mode for CI/CD pipelines +- **Hierarchical configuration system** with inheritance and overrides + +### What It Does + +- **Provisions Infrastructure** - Create servers, networks, storage across multiple cloud providers +- **Installs Services** - Deploy Kubernetes, containerd, databases, monitoring, and 50+ infrastructure components +- **Manages Clusters** - Orchestrate complete cluster deployments with dependency management +- **Handles Configuration** - Hierarchical configuration system with inheritance and overrides +- **Orchestrates Workflows** - Batch operations with parallel execution and checkpoint recovery +- **Manages Secrets** - SOPS/Age integration for encrypted configuration +- **Secures Infrastructure** - Enterprise security with JWT, MFA, Cedar policies, audit logging +- **Optimizes Performance** - Native plugins providing 10-50x speed improvements + +--- + +## Why Provisioning? 
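+
+The quickest way to see the idea: infrastructure is declared as data, type-checked by Nickel, and consumed as structured data by Nushell. A minimal sketch of that loop (the `nickel export` command and the Nushell pipeline are real; the file path and field names mirror the example later in this README and are illustrative):
+
+```nushell
+# Nickel evaluates and type-checks the declaration; Nushell consumes the
+# exported JSON as structured data. Paths and fields are illustrative.
+let config = (nickel export workspace/infra/my-cluster.ncl | from json)
+$config.infrastructure.servers | where plan == "large" | get name
+```
+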
+
+### The Problems It Solves
+
+#### 1. **Multi-Cloud Complexity**
+
+**Problem**: Each cloud provider has different APIs, tools, and workflows.
+
+**Solution**: Unified abstraction layer with provider-agnostic interfaces. Write configuration once, deploy anywhere using Nickel schemas.
+
+```nickel
+# Same configuration works on UpCloud, AWS, or local infrastructure
+{
+  servers = [
+    {
+      name = "web-01",
+      plan = "medium",      # Abstract size, provider-specific translation
+      provider = "upcloud", # Switch to "aws" or "local" as needed
+    }
+  ]
+}
+```
+
+#### 2. **Dependency Hell**
+
+**Problem**: Infrastructure components have complex dependencies (Kubernetes needs containerd, Cilium needs Kubernetes, etc.).
+
+**Solution**: Automatic dependency resolution with topological sorting and health checks via Nickel schemas.
+
+```nickel
+# Provisioning resolves: containerd → etcd → kubernetes → cilium
+{
+  taskservs = ["cilium"] # Automatically installs all dependencies
+}
+```
+
+#### 3. **Configuration Sprawl**
+
+**Problem**: Environment variables, hardcoded values, scattered configuration files.
+
+**Solution**: Hierarchical configuration system with 476+ config accessors replacing 200+ ENV variables.
+
+```
+Defaults → User → Project → Infrastructure → Environment → Runtime
+```
+
+#### 4. **Imperative Scripts**
+
+**Problem**: Brittle shell scripts that don't handle failures, don't support rollback, and are hard to maintain.
+
+**Solution**: Declarative Nickel configurations with validation, type safety, lazy evaluation, and automatic rollback.
+
+#### 5. **Lack of Visibility**
+
+**Problem**: No insight into what's happening during deployment, hard to debug failures.
+
+**Solution**:
+
+- Real-time workflow monitoring
+- Comprehensive logging system
+- Web-based control center
+- REST API for integration
+
+#### 6. **No Standardization**
+
+**Problem**: Each team builds their own deployment tools, no shared patterns.
+
+**Solution**: Reusable task services, cluster templates, and workflow patterns.
+
+---
+
+## Core Concepts
+
+### 1. **Providers**
+
+Cloud infrastructure backends that handle resource provisioning.
+
+- **UpCloud** - Primary cloud provider
+- **AWS** - Amazon Web Services integration
+- **Local** - Local infrastructure (VMs, Docker, bare metal)
+
+Providers implement a common interface, making infrastructure code portable.
+
+### 2. **Task Services (TaskServs)**
+
+Reusable infrastructure components that can be installed on servers.
+
+**Categories**:
+
+- **Container Runtimes** - containerd, Docker, Podman, crun, runc, youki
+- **Orchestration** - Kubernetes, etcd, CoreDNS
+- **Networking** - Cilium, Flannel, Calico, ip-aliases
+- **Storage** - Rook-Ceph, local storage
+- **Databases** - PostgreSQL, Redis, SurrealDB
+- **Observability** - Prometheus, Grafana, Loki
+- **Security** - Webhook, KMS, Vault
+- **Development** - Gitea, Radicle, ORAS
+
+Each task service includes:
+
+- Version management
+- Dependency declarations
+- Health checks
+- Installation/uninstallation logic
+- Configuration schemas
+
+### 3. **Clusters**
+
+Complete infrastructure deployments combining servers and task services.
+
+**Examples**:
+
+- **Kubernetes Cluster** - HA control plane + worker nodes + CNI + storage
+- **Database Cluster** - Replicated PostgreSQL with backup
+- **Build Infrastructure** - BuildKit + container registry + CI/CD
+
+Clusters handle:
+
+- Multi-node coordination
+- Service distribution
+- High availability
+- Rolling updates
+
+### 4.
**Workspaces** + +Isolated environments for different projects or deployment stages. + +```bash +workspace_librecloud/ # Production workspace +├── infra/ # Infrastructure definitions +├── config/ # Workspace configuration +├── extensions/ # Custom modules +└── runtime/ # State and runtime data + +workspace_dev/ # Development workspace +├── infra/ +└── config/ +``` + +Switch between workspaces with single command: + +```bash +provisioning workspace switch librecloud +``` + +### 5. **Workflows** + +Coordinated sequences of operations with dependency management. + +**Types**: + +- **Server Workflows** - Create/delete/update servers +- **TaskServ Workflows** - Install/remove infrastructure services +- **Cluster Workflows** - Deploy/scale complete clusters +- **Batch Workflows** - Multi-cloud parallel operations + +**Features**: + +- Dependency resolution +- Parallel execution +- Checkpoint recovery +- Automatic rollback +- Progress monitoring + +--- + +## Architecture + +### System Components + +```bash +┌─────────────────────────────────────────────────────────────────┐ +│ User Interface Layer │ +│ • CLI (provisioning command) │ +│ • Web Control Center (UI) │ +│ • REST API │ +└─────────────────────────────────────────────────────────────────┘ + ↓ +┌─────────────────────────────────────────────────────────────────┐ +│ Core Engine Layer │ +│ • Command Routing & Dispatch │ +│ • Configuration Management │ +│ • Provider Abstraction │ +│ • Utility Libraries │ +└─────────────────────────────────────────────────────────────────┘ + ↓ +┌─────────────────────────────────────────────────────────────────┐ +│ Orchestration Layer │ +│ • Workflow Orchestrator (Rust/Nushell hybrid) │ +│ • Dependency Resolver │ +│ • State Manager │ +│ • Task Scheduler │ +└─────────────────────────────────────────────────────────────────┘ + ↓ +┌─────────────────────────────────────────────────────────────────┐ +│ Extension Layer │ +│ • Providers (Cloud APIs) │ +│ • Task Services (Infrastructure Components) │ +│ • Clusters (Complete Deployments) │ +│ • Workflows (Automation Templates) │ +└─────────────────────────────────────────────────────────────────┘ + ↓ +┌─────────────────────────────────────────────────────────────────┐ +│ Infrastructure Layer │ +│ • Cloud Resources (Servers, Networks, Storage) │ +│ • Kubernetes Clusters │ +│ • Running Services │ +└─────────────────────────────────────────────────────────────────┘ +``` + +### Directory Structure + +```bash +project-provisioning/ +├── provisioning/ # Core provisioning system +│ ├── core/ # Core engine and libraries +│ │ ├── cli/ # Command-line interface +│ │ ├── nulib/ # Core Nushell libraries +│ │ ├── plugins/ # System plugins (Rust native) +│ │ └── scripts/ # Utility scripts +│ │ +│ ├── extensions/ # Extensible components +│ │ ├── providers/ # Cloud provider implementations +│ │ ├── taskservs/ # Infrastructure service definitions +│ │ ├── clusters/ # Complete cluster configurations +│ │ └── workflows/ # Core workflow templates +│ │ +│ ├── platform/ # Platform services +│ │ ├── orchestrator/ # Rust orchestrator service +│ │ ├── control-center/ # Web control center +│ │ ├── mcp-server/ # Model Context Protocol server +│ │ ├── api-gateway/ # REST API gateway +│ │ ├── oci-registry/ # OCI registry for extensions +│ │ └── installer/ # Platform installer (TUI + CLI) +│ │ +│ ├── schemas/ # Nickel schema definitions (PRIMARY IaC) +│ │ ├── main.ncl # Main infrastructure schema +│ │ ├── providers/ # Provider-specific schemas +│ │ ├── infrastructure/ # Infra definitions +│ │ ├── 
deployment/ # Deployment schemas +│ │ ├── services/ # Service schemas +│ │ ├── operations/ # Operations schemas +│ │ └── generator/ # Runtime schema generation +│ │ +│ ├── docs/ # Product documentation (mdBook) +│ ├── config/ # Configuration examples +│ ├── tools/ # Build and distribution tools +│ └── justfiles/ # Just recipes for common tasks +│ +├── workspace/ # User workspaces and data +│ ├── infra/ # Infrastructure definitions +│ ├── config/ # User configuration +│ ├── extensions/ # User extensions +│ └── runtime/ # Runtime data and state +│ +├── docs/ # Architecture & Development docs +│ ├── architecture/ # System design and ADRs +│ └── development/ # Development guidelines +│ +└── .github/ # CI/CD workflows + └── workflows/ # GitHub Actions (Rust, Nickel, Nushell) +``` + +### Platform Services + +#### 1. **Orchestrator** (`platform/orchestrator/`) + +- **Language**: Rust + Nushell +- **Purpose**: Workflow execution, task scheduling, state management +- **Features**: + - File-based persistence + - Priority processing + - Retry logic with exponential backoff + - Checkpoint-based recovery + - REST API endpoints + +#### 2. **Control Center** (`platform/control-center/`) + +- **Language**: Web UI + Backend API +- **Purpose**: Web-based infrastructure management +- **Features**: + - Dashboard views + - Real-time monitoring + - Interactive deployments + - Log viewing + +#### 3. **MCP Server** (`platform/mcp-server/`) + +- **Language**: Nushell +- **Purpose**: Model Context Protocol integration for AI assistance +- **Features**: + - 7 AI-powered settings tools + - Intelligent config completion + - Natural language infrastructure queries + +#### 4. **OCI Registry** (`platform/oci-registry/`) + +- **Purpose**: Extension distribution and versioning +- **Features**: + - Task service packages + - Provider packages + - Cluster templates + - Workflow definitions + +#### 5. **Installer** (`platform/installer/`) + +- **Language**: Rust (Ratatui TUI) + Nushell +- **Purpose**: Platform installation and setup +- **Features**: + - Interactive TUI mode + - Headless CLI mode + - Unattended CI/CD mode + - Configuration generation + +--- + +## Key Features + +### 1. **Modular CLI Architecture** (v3.2.0) + +84% code reduction with domain-driven design. + +- **Main CLI**: 211 lines (from 1,329 lines) +- **80+ shortcuts**: `s` → `server`, `t` → `taskserv`, etc. +- **Bi-directional help**: `provisioning help ws` = `provisioning ws help` +- **7 domain modules**: infrastructure, orchestration, development, workspace, configuration, utilities, generation + +### 2. **Configuration System** (v2.0.0) + +Hierarchical, config-driven architecture. + +- **476+ config accessors** replacing 200+ ENV variables +- **Hierarchical loading**: defaults → user → project → infra → env → runtime +- **Variable interpolation**: `{{paths.base}}`, `{{env.HOME}}`, `{{now.date}}` +- **Multi-format support**: TOML, YAML, KCL + +### 3. **Batch Workflow System** (v3.1.0) + +Provider-agnostic batch operations with 85-90% token efficiency. + +- **Multi-cloud support**: Mixed UpCloud + AWS + local in single workflow +- **KCL schema integration**: Type-safe workflow definitions +- **Dependency resolution**: Topological sorting with soft/hard dependencies +- **State management**: Checkpoint-based recovery with rollback +- **Real-time monitoring**: Live progress tracking + +### 4. **Hybrid Orchestrator** (v3.0.0) + +Rust/Nushell architecture solving deep call stack limitations. 
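+
+In practice the split looks like this: the Rust service owns queues, state, and retries behind a small REST API, while Nushell (or any HTTP client) submits work and polls results. A hypothetical sketch, not the documented API (the endpoint path, port, and payload shape are assumptions):
+
+```nushell
+# Hypothetical client call: hand a workflow definition (a JSON/YAML file)
+# to the Rust orchestrator over HTTP; coordination happens server-side.
+def submit-workflow [file: path] {
+    let body = (open $file | to json)
+    http post --content-type application/json http://localhost:8080/workflows $body
+}
+```
+
+Highlights: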
+ +- **High-performance coordination layer** +- **File-based persistence** +- **Priority processing with retry logic** +- **REST API for external integration** +- **Comprehensive workflow system** + +### 5. **Workspace Switching** (v2.0.5) + +Centralized workspace management. + +- **Single-command switching**: `provisioning workspace switch ` +- **Automatic tracking**: Last-used timestamps, active workspace markers +- **User preferences**: Global settings across all workspaces +- **Workspace registry**: Centralized configuration in `user_config.yaml` + +### 6. **Interactive Guides** (v3.3.0) + +Step-by-step walkthroughs and quick references. + +- **Quick reference**: `provisioning sc` (fastest) +- **Complete guides**: from-scratch, update, customize +- **Copy-paste ready**: All commands include placeholders +- **Beautiful rendering**: Uses glow, bat, or less + +### 7. **Test Environment Service** (v3.4.0) + +Automated container-based testing. + +- **Three test types**: Single taskserv, server simulation, multi-node clusters +- **Topology templates**: Kubernetes HA, etcd clusters, etc. +- **Auto-cleanup**: Optional automatic cleanup after tests +- **CI/CD integration**: Easy integration into pipelines + +### 8. **Platform Installer** (v3.5.0) + +Multi-mode installation system with TUI, CLI, and unattended modes. + +- **Interactive TUI**: Beautiful Ratatui terminal UI with 7 screens +- **Headless Mode**: CLI automation for scripted installations +- **Unattended Mode**: Zero-interaction CI/CD deployments +- **Deployment Modes**: Solo (2 CPU/4GB), MultiUser (4 CPU/8GB), CICD (8 CPU/16GB), Enterprise (16 CPU/32GB) +- **MCP Integration**: 7 AI-powered settings tools for intelligent configuration + +### 9. **Version Management System** (v3.6.0) + +Centralized tool and provider version management with bash-compatible export. + +- **Unified Version Source**: All versions defined in Nickel files (`versions.ncl` and provider `version.ncl`) +- **Generated Versions File**: Bash-compatible KEY="VALUE" format for shell scripts +- **Core Tools**: NUSHELL, NICKEL, SOPS, AGE, K9S with convenient aliases (NU for NUSHELL) +- **Provider Versions**: Automatically discovers and includes all provider versions (AWS, HCLOUD, UPCTL, etc.) +- **Command**: `provisioning setup versions` generates `/provisioning/core/versions` file +- **Shell Integration**: Can be sourced directly in bash scripts: `source /provisioning/core/versions && echo $NU_VERSION` +- **Usage**: + ```bash + # Generate versions file + provisioning setup versions + + # Use in bash scripts + source /provisioning/core/versions + echo "Using Nushell version: $NU_VERSION" + echo "AWS CLI version: $PROVIDER_AWS_VERSION" + ``` + +### 10. **Nushell Plugins Integration** (v1.0.0) + +Three native Rust plugins providing 10-50x performance improvements over HTTP API. + +- **Three Native Plugins**: auth, KMS, orchestrator +- **Performance Gains**: + - KMS operations: ~5ms vs ~50ms (10x faster) + - Orchestrator queries: ~1ms vs ~30ms (30x faster) + - Auth verification: ~10ms vs ~50ms (5x faster) +- **OS-Native Keyring**: macOS Keychain, Linux Secret Service, Windows Credential Manager +- **KMS Backends**: RustyVault, Age, AWS KMS, Vault, Cosmian +- **Graceful Fallback**: Automatic fallback to HTTP if plugins not installed + +### 11. **Complete Security System** (v4.0.0) + +Enterprise-grade security with 39,699 lines across 12 components. 
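+
+For flavor, driving it from a script might look like the sketch below: authenticate once, then call protected endpoints with the issued JWT. Purely hypothetical (endpoints, port, and field names are assumptions, not the shipped API):
+
+```nushell
+# Hypothetical: obtain a JWT, then call a protected endpoint with it.
+let login = {user: "ops", password: ($env.PROV_PASSWORD? | default "")}
+let resp = (http post --content-type application/json http://localhost:8080/auth/login ($login | to json))
+http get --headers [Authorization $"Bearer ($resp.access_token)"] http://localhost:8080/api/servers
+```
+
+The headline numbers: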
+ +- **12 Components**: JWT Auth, Cedar Authorization, MFA (TOTP + WebAuthn), Secrets Management, KMS, Audit Logging, Break-Glass, Compliance, Audit Query, Token Management, Access Control, Encryption +- **Performance**: <20ms overhead per secure operation +- **Testing**: 350+ comprehensive test cases +- **API**: 83+ REST endpoints, 111+ CLI commands +- **Standards**: GDPR, SOC2, ISO 27001 compliance +- **Key Features**: + - RS256 authentication with Argon2id hashing + - Policy-as-code with hot reload + - Multi-factor authentication (TOTP + WebAuthn/FIDO2) + - Dynamic secrets (AWS STS, SSH keys) with TTL + - 5 KMS backends with envelope encryption + - 7-year audit retention with 5 export formats + - Multi-party break-glass approval + +--- + +## Technology Stack + +### Core Technologies + +| Technology | Version | Purpose | Why | +| ------------ | --------- | --------- | ----- | +| **Nickel** | Latest | PRIMARY - Infrastructure-as-code language | Type-safe schemas, lazy evaluation, LSP support, composable records, gradual validation | +| **Nushell** | 0.109.0+ | Scripting and task automation | Structured data pipelines, cross-platform, modern built-in parsers (JSON/YAML/TOML) | +| **Rust** | Latest | Platform services (orchestrator, control-center, installer) | Performance, memory safety, concurrency, reliability | +| **KCL** | DEPRECATED | Legacy configuration (fully replaced by Nickel) | Migration bridge available; use Nickel for new work | + +### Data & State Management + +| Technology | Version | Purpose | Features | +| ------------ | --------- | --------- | ---------- | +| **SurrealDB** | Latest | High-performance graph database backend | Multi-model (document, graph, relational), real-time queries, distributed architecture, complex relationship tracking | + +### Platform Services (Rust-based) + +| Service | Purpose | Security Features | +| --------- | --------- | ------------------- | +| **Orchestrator** | Workflow execution, task scheduling, state management | File-based persistence, retry logic, checkpoint recovery | +| **Control Center** | Web-based infrastructure management | **Authorization and permissions control**, RBAC, audit logging | +| **Installer** | Platform installation (TUI + CLI modes) | Secure configuration generation, validation | +| **API Gateway** | REST API for external integration | Authentication, rate limiting, request validation | +| **MCP Server** | AI-powered configuration management | 7 settings tools, intelligent config completion | +| **OCI Registry** | Extension distribution and versioning | Task services, providers, cluster templates | + +### Security & Secrets + +| Technology | Version | Purpose | Enterprise Features | +| ------------ | --------- | --------- | --------------------- | +| **SOPS** | 3.10.2+ | Secrets management | Encrypted configuration files | +| **Age** | 1.2.1+ | Encryption | Secure key-based encryption | +| **Cosmian KMS** | Latest | Key Management System | Confidential computing, secure key storage, cloud-native KMS | +| **Cedar** | Latest | Policy engine | Fine-grained access control, policy-as-code, compliance checking, anomaly detection | +| **RustyVault** | Latest | Transit encryption engine | 5ms encryption performance, multiple KMS backends | +| **JWT** | Latest | Authentication tokens | RS256 signatures, Argon2id password hashing | +| **Keyring** | Latest | OS-native secure storage | macOS Keychain, Linux Secret Service, Windows Credential Manager | + +### Version Management + +| Component | Purpose | Format | +| ----------- 
| --------- | -------- | +| **versions.ncl** | Core tool versions (Nickel primary) | Nickel schema | +| **provider version.ncl** | Provider-specific versions | Nickel schema | +| **provisioning setup versions** | Version file generator | Nushell command | +| **versions file** | Bash-compatible exports | KEY="VALUE" format | + +**Usage**: +```bash +# Generate versions file from Nickel schemas +provisioning setup versions + +# Source in shell scripts +source /provisioning/core/versions +echo $NU_VERSION $PROVIDER_AWS_VERSION +``` + +### Optional Tools + +| Tool | Purpose | +| ------ | --------- | +| **K9s** | Kubernetes management interface | +| **nu_plugin_tera** | Nushell plugin for Tera template rendering | +| **nu_plugin_kcl** | Nushell plugin for KCL integration (CLI required, plugin optional) | +| **nu_plugin_auth** | Authentication plugin (5x faster auth, OS keyring integration) | +| **nu_plugin_kms** | KMS encryption plugin (10x faster, 5ms encryption) | +| **nu_plugin_orchestrator** | Orchestrator plugin (30-50x faster queries) | +| **glow** | Markdown rendering for interactive guides | +| **bat** | Syntax highlighting for file viewing and guides | + +--- + +## How It Works + +### Data Flow + +```bash +1. User defines infrastructure in Nickel schemas + ↓ +2. Nickel evaluates with type validation and lazy evaluation + ↓ +3. CLI loads configuration (hierarchical merging) + ↓ +4. Configuration validated against provider schemas + ↓ +5. Workflow created with operations + ↓ +6. Orchestrator receives workflow + ↓ +7. Dependencies resolved (topological sort) + ↓ +8. Operations executed in order (parallel where possible) + ↓ +9. Providers handle cloud operations + ↓ +10. Task services installed on servers + ↓ +11. State persisted and monitored +``` + +### Example Workflow: Deploy Kubernetes Cluster + +**Step 1**: Define infrastructure in Nickel + +```nickel +# schemas/my-cluster.ncl +{ + metadata = { + name = "my-cluster" + provider = "upcloud" + environment = "production" + } + + infrastructure = { + servers = [ + {name = "control-01", plan = "medium", role = "control"} + {name = "worker-01", plan = "large", role = "worker"} + {name = "worker-02", plan = "large", role = "worker"} + ] + } + + services = { + taskservs = ["kubernetes", "cilium", "rook-ceph"] + } +} +``` + +**Step 2**: Submit to Provisioning + +```bash +provisioning server create --infra my-cluster +``` + +**Step 3**: Provisioning executes workflow + +```bash +1. Create workflow: "deploy-my-cluster" +2. Resolve dependencies: + - containerd (required by kubernetes) + - etcd (required by kubernetes) + - kubernetes (explicitly requested) + - cilium (explicitly requested, requires kubernetes) + - rook-ceph (explicitly requested, requires kubernetes) + +3. Execution order: + a. Provision servers (parallel) + b. Install containerd on all nodes + c. Install etcd on control nodes + d. Install kubernetes control plane + e. Join worker nodes + f. Install Cilium CNI + g. Install Rook-Ceph storage + +4. Checkpoint after each step +5. Monitor health checks +6. Report completion +``` + +**Step 4**: Verify deployment + +```bash +provisioning cluster status my-cluster +``` + +### Configuration Hierarchy + +Configuration values are resolved through a hierarchy: + +```toml +1. System Defaults (provisioning/config/config.defaults.toml) + ↓ (overridden by) +2. User Preferences (~/.config/provisioning/user_config.yaml) + ↓ (overridden by) +3. Workspace Config (workspace/config/provisioning.yaml) + ↓ (overridden by) +4. 
Infrastructure Config (workspace/infra/<infra-name>/config.toml)
+   ↓ (overridden by)
+5. Environment Config (workspace/config/prod-defaults.toml)
+   ↓ (overridden by)
+6. Runtime Flags (--flag value)
+```
+
+**Example**:
+
+```toml
+# System default
+[servers]
+default_plan = "small"
+
+# User preference
+[servers]
+default_plan = "medium" # Overrides system default
+
+# Infrastructure config
+[servers]
+default_plan = "large" # Overrides user preference
+```
+
+```bash
+# Runtime: overrides everything
+provisioning server create --plan xlarge
+```
+
+---
+
+## Use Cases
+
+### 1. **Multi-Cloud Kubernetes Deployment**
+
+Deploy Kubernetes clusters across different cloud providers with identical configuration.
+
+```bash
+# UpCloud cluster
+provisioning cluster create k8s-prod --provider upcloud
+
+# AWS cluster (same config)
+provisioning cluster create k8s-prod --provider aws
+```
+
+### 2. **Development → Staging → Production Pipeline**
+
+Manage multiple environments with workspace switching.
+
+```bash
+# Development
+provisioning workspace switch dev
+provisioning cluster create app-stack
+
+# Staging (same config, different resources)
+provisioning workspace switch staging
+provisioning cluster create app-stack
+
+# Production (HA, larger resources)
+provisioning workspace switch prod
+provisioning cluster create app-stack
+```
+
+### 3. **Infrastructure as Code Testing**
+
+Test infrastructure changes before deploying to production.
+
+```bash
+# Test Kubernetes upgrade locally
+provisioning test topology load kubernetes_3node |
+    test env cluster kubernetes --version 1.29.0
+
+# Verify functionality
+provisioning test env run <env-id>
+
+# Cleanup
+provisioning test env cleanup <env-id>
+```
+
+### 4. **Batch Multi-Region Deployment**
+
+Deploy to multiple regions in parallel using Nickel batch workflows.
+
+```nickel
+# schemas/batch/multi-region.ncl
+{
+  batch_workflow = {
+    operations = [
+      {
+        id = "eu-cluster",
+        type = "cluster",
+        region = "eu-west-1",
+        cluster = "app-stack",
+      },
+      {
+        id = "us-cluster",
+        type = "cluster",
+        region = "us-east-1",
+        cluster = "app-stack",
+      },
+      {
+        id = "asia-cluster",
+        type = "cluster",
+        region = "ap-south-1",
+        cluster = "app-stack",
+      }
+    ],
+    parallel_limit = 3, # All at once
+  }
+}
+```
+
+```bash
+provisioning batch submit schemas/batch/multi-region.ncl
+provisioning batch monitor <workflow-id>
+```
+
+### 5. **Automated Disaster Recovery**
+
+Recreate infrastructure from configuration.
+
+```bash
+# Infrastructure destroyed
+provisioning workspace switch prod
+
+# Recreate from config
+provisioning cluster create --infra backup-restore --wait
+
+# All services restored with same configuration
+```
+
+### 6. **CI/CD Integration**
+
+Automated testing and deployment pipelines.
+
+```yaml
+# .gitlab-ci.yml
+test-infrastructure:
+  script:
+    - provisioning test quick kubernetes
+    - provisioning test quick postgres
+
+deploy-staging:
+  script:
+    - provisioning workspace switch staging
+    - provisioning cluster create app-stack --check
+    - provisioning cluster create app-stack --yes
+
+deploy-production:
+  when: manual
+  script:
+    - provisioning workspace switch prod
+    - provisioning cluster create app-stack --yes
+```
+
+---
+
+## Getting Started
+
+### Quick Start
+
+1. **Install Prerequisites**
+
+   ```bash
+   # Install Nushell (0.109.0+)
+   brew install nushell # macOS
+
+   # Install Nickel (required for IaC)
+   brew install nickel # macOS or from source
+
+   # Install SOPS (optional, for encrypted secrets)
+   brew install sops
+   ```
+
+2.
**Add CLI to PATH** + + ```bash + ln -sf "$(pwd)/provisioning/core/cli/provisioning" /usr/local/bin/provisioning + ``` + +3. **Initialize Workspace** + + ```bash + provisioning workspace init my-project + cd my-project + ``` + +3.5. **Generate Versions File** (Optional - for bash scripts) + + ```bash + provisioning setup versions + # Creates /provisioning/core/versions with all tool and provider versions + + # Use in your deployment scripts + source /provisioning/core/versions + echo "Deploying with Nushell $NU_VERSION and AWS CLI $PROVIDER_AWS_VERSION" + ``` + +4. **Define Infrastructure (Nickel)** + + ```bash + # Create workspace infrastructure schema + cat > workspace/infra/my-cluster.ncl <<'EOF' + { + metadata.name = "my-cluster" + metadata.provider = "upcloud" + + infrastructure.servers = [ + {name = "control-01", plan = "medium"} + {name = "worker-01", plan = "large"} + ] + + services.taskservs = ["kubernetes", "cilium"] + } + EOF + ``` + +5. **Deploy Infrastructure** + + ```bash + # Validate configuration + provisioning config validate + + # Check what will be created + provisioning server create --check + + # Create servers + provisioning server create --yes + + # Install Kubernetes + provisioning taskserv create kubernetes + ``` + +### Learning Path + +1. **Start with Guides** + + ```bash + provisioning sc # Quick reference + provisioning guide from-scratch # Complete walkthrough + ``` + +2. **Explore Examples** + + ```bash + ls provisioning/examples/ + ``` + +3. **Read Architecture Docs** + - [Core Engine](provisioning/core/README.md) + - [CLI Architecture](.claude/features/cli-architecture.md) + - [Configuration System](.claude/features/configuration-system.md) + - [Batch Workflows](.claude/features/batch-workflow-system.md) + +4. **Try Test Environments** + + ```bash + provisioning test quick kubernetes + provisioning test quick postgres + ``` + +5. 
**Build Custom Extensions** + - Create custom task services + - Define cluster templates + - Write workflow automation + +--- + +## Documentation Index + +### User & Operations Guides + +See **[provisioning/docs/src/](provisioning/docs/src/)** for comprehensive documentation: + +- **Quick Start** - Get started in 10 minutes +- **Command Reference** - Complete CLI command reference +- **Nickel Configuration Guide** - IaC language and patterns +- **Workspace Management** - Multi-workspace guide +- **Test Environment Guide** - Testing infrastructure with containers +- **Plugin Integration** - Native Rust plugins (10-50x faster) +- **Security System** - Authentication, MFA, KMS, Cedar policies +- **Operations** - Deployment, monitoring, incident response + +### Architecture & Design Decisions + +See **[docs/src/architecture/](docs/src/architecture/)** for design patterns: + +- **System Architecture** - Multi-layer design +- **ADRs (Architecture Decision Records)** - Major decisions including: + - ADR-011: Nickel Migration (from KCL) + - ADR-012: Nushell + Nickel plugin wrapper + - ADR-010: Configuration format strategy +- **Multi-Repo Strategy** - Repository organization +- **Integration Patterns** - How components interact + +### Development Guidelines + +- **[Repository Structure](docs/src/development/)** - Codebase organization +- **[Contributing Guide](CONTRIBUTING.md)** - How to contribute +- **[Nushell Guidelines](.claude/guidelines/nushell/)** - Best practices +- **[Nickel Guidelines](.claude/guidelines/nickel.md)** - IaC patterns +- **[Rust Guidelines](.claude/guidelines/rust/)** - Rust conventions + +### API Reference + +- **REST API** - HTTP endpoints in `provisioning/docs/src/api-reference/` +- **Nushell API** - Library functions and modules +- **Provider API** - Cloud provider interface specification + +--- + +## Project Status + +**Current Version**: v5.0.0-nickel (Production Ready) | **Date**: 2026-01-08 + +### Completed Milestones + +- ✅ **v5.0.0** (2026-01-08) - **Nickel IaC Migration Complete** + - Full KCL→Nickel migration + - Schema-driven configuration system + - Type-safe lazy evaluation + - ~220 legacy files removed, ~250 new schema files added + +- ✅ **v3.6.0** (2026-01-08) - Version Management System + - Centralized tool and provider version management + - Bash-compatible versions file generation + - `provisioning setup versions` command + - Automatic provider version discovery from Nickel schemas + - Shell script integration with sourcing support + +- ✅ **v4.0.0** (2025-10-09) - Complete Security System (12 components, 39,699 lines) +- ✅ **v3.5.0** (2025-10-07) - Platform Installer with TUI and CI/CD modes +- ✅ **v3.4.0** (2025-10-06) - Test Environment Service with container management +- ✅ **v3.3.0** (2025-09-30) - Interactive Guides system +- ✅ **v3.2.0** (2025-09-30) - Modular CLI Architecture (84% code reduction) +- ✅ **v3.1.0** (2025-09-25) - Batch Workflow System (85-90% token efficiency) +- ✅ **v3.0.0** (2025-09-25) - Hybrid Orchestrator (Rust/Nushell) +- ✅ **v2.0.5** (2025-10-02) - Workspace Switching system +- ✅ **v2.0.0** (2025-09-23) - Configuration System (476+ accessors) +- ✅ **v1.0.0** (2025-10-09) - Nushell Plugins Integration (10-50x performance) + +### Current Focus + +- **Nickel Ecosystem** - IDE support, LSP integration, schema libraries +- **Platform Consolidation** - GitHub Actions CI/CD, cross-platform testing +- **Extension Registry** - OCI-based distribution for task services and providers +- **Documentation** - Complete Nickel migration 
guides, ADR updates + +--- + +## Support and Community + +### Getting Help + +- **Documentation**: Start with `provisioning help` or `provisioning guide from-scratch` +- **Issues**: Report bugs and request features on the issue tracker +- **Discussions**: Join community discussions for questions and ideas + +### Contributing + +Contributions are welcome! See [CONTRIBUTING.md](docs/development/CONTRIBUTING.md) for guidelines. + +**Key areas for contribution**: + +- New task service definitions +- Cloud provider implementations +- Cluster templates +- Documentation improvements +- Bug fixes and testing + +--- + +## License + +See [LICENSE](LICENSE) file in project root. + +--- + +**Maintained By**: Architecture Team +**Last Updated**: 2026-01-08 (Version Management System v3.6.0 + Nickel v5.0.0 Migration Complete) +**Current Branch**: nickel +**Project Home**: [provisioning/](provisioning/) + +--- + +## Recent Changes (2026-01-08) + +### Version Management System (v3.6.0) + +**What Changed**: +- ✅ Implemented `provisioning setup versions` command +- ✅ Generates bash-compatible `/provisioning/core/versions` file +- ✅ Automatically discovers and includes all provider versions from Nickel schemas +- ✅ Fixed to remove redundant metadata (all sources are Nickel) +- ✅ Core tools with aliases: NUSHELL→NU, NICKEL, SOPS, AGE, K9S +- ✅ Shell script integration: `source /provisioning/core/versions && echo $NU_VERSION` + +**Files Modified**: +- `provisioning/core/nulib/lib_provisioning/setup/utils.nu` - Core implementation +- `provisioning/core/nulib/main_provisioning/commands/setup.nu` - Command routing +- `provisioning/core/nulib/lib_provisioning/workspace/enforcement.nu` - Workspace exemption +- `provisioning/README.md` - Documentation updates + +**Generated File Example**: +```bash +NUSHELL_VERSION="0.109.1" +NUSHELL_SOURCE="https://github.com/nushell/nushell/releases" +NU_VERSION="0.109.1" +NU_SOURCE="https://github.com/nushell/nushell/releases" + +NICKEL_VERSION="1.15.1" +NICKEL_SOURCE="https://github.com/tweag/nickel/releases" + +PROVIDER_AWS_VERSION="2.32.11" +PROVIDER_AWS_SOURCE="https://github.com/aws/aws-cli/releases" +# ... and more providers +``` + +**Key Improvements**: +- Clean metadata (no redundant `_LIB` fields - all sources are Nickel) +- Automatic provider discovery from `extensions/providers/*/nickel/version.ncl` +- Direct Nickel file parsing with JSON export +- Zero dependency on environment variables or legacy systems +- 100% bash/shell compatible for deployment scripts \ No newline at end of file diff --git a/SECURITY.md b/SECURITY.md index c79b018..02b830f 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -1 +1,101 @@ -# Security Policy\n\n## Supported Versions\n\nThis project provides security updates for the following versions:\n\n| Version | Supported |\n|---------|-----------|\n| 1.x | ✅ Yes |\n| 0.x | ❌ No |\n\nOnly the latest major version receives security patches. Users are encouraged to upgrade to the latest version.\n\n## Reporting a Vulnerability\n\n**Do not open public GitHub issues for security vulnerabilities.**\n\nInstead, please report security issues to the maintainers privately:\n\n### Reporting Process\n\n1. Email security details to the maintainers (see project README for contact)\n2. Include:\n - Description of the vulnerability\n - Steps to reproduce (if possible)\n - Potential impact\n - Suggested fix (if you have one)\n\n3. Expect acknowledgment within 48 hours\n4. 
We will work on a fix and coordinate disclosure timing\n\n### Responsible Disclosure\n\n- Allow reasonable time for a fix before public disclosure\n- Work with us to understand and validate the issue\n- Maintain confidentiality until the fix is released\n\n## Security Best Practices\n\n### For Users\n\n- Keep dependencies up to date\n- Use the latest version of this project\n- Review security advisories regularly\n- Report vulnerabilities responsibly\n\n### For Contributors\n\n- Run `cargo audit` before submitting PRs\n- Use `cargo deny` to check license compliance\n- Follow secure coding practices\n- Don't hardcode secrets or credentials\n- Validate all external inputs\n\n## Dependency Security\n\nWe use automated tools to monitor dependencies:\n\n- **cargo-audit**: Scans for known security vulnerabilities\n- **cargo-deny**: Checks licenses and bans unsafe dependencies\n\nThese run in CI on every push and PR.\n\n## Code Review\n\nAll code changes go through review before merging:\n\n- At least one maintainer review required\n- Security implications considered\n- Tests required for all changes\n- CI checks must pass\n\n## Known Vulnerabilities\n\nWe maintain transparency about known issues:\n\n- Documented in GitHub security advisories\n- Announced in release notes\n- Tracked in issues with `security` label\n\n## Security Contact\n\nFor security inquiries, please contact:\n\n- Email: [project maintainers]\n- Issue: Open a private security advisory on GitHub\n\n## Changelog\n\nSecurity fixes are highlighted in CHANGELOG.md with [SECURITY] prefix.\n\n## Resources\n\n- [OWASP Top 10](https://owasp.org/www-project-top-ten/)\n- [CWE: Common Weakness Enumeration](https://cwe.mitre.org/)\n- [Rust Security](https://www.rust-lang.org/governance/security-disclosures)\n- [npm Security](https://docs.npmjs.com/about-npm/security)\n\n## Questions\n\nIf you have security questions (not vulnerabilities), open a discussion or issue with the `security` label. +# Security Policy + +## Supported Versions + +This project provides security updates for the following versions: + +| Version | Supported | +|---------|-----------| +| 1.x | ✅ Yes | +| 0.x | ❌ No | + +Only the latest major version receives security patches. Users are encouraged to upgrade to the latest version. + +## Reporting a Vulnerability + +**Do not open public GitHub issues for security vulnerabilities.** + +Instead, please report security issues to the maintainers privately: + +### Reporting Process + +1. Email security details to the maintainers (see project README for contact) +2. Include: + - Description of the vulnerability + - Steps to reproduce (if possible) + - Potential impact + - Suggested fix (if you have one) + +3. Expect acknowledgment within 48 hours +4. 
We will work on a fix and coordinate disclosure timing + +### Responsible Disclosure + +- Allow reasonable time for a fix before public disclosure +- Work with us to understand and validate the issue +- Maintain confidentiality until the fix is released + +## Security Best Practices + +### For Users + +- Keep dependencies up to date +- Use the latest version of this project +- Review security advisories regularly +- Report vulnerabilities responsibly + +### For Contributors + +- Run `cargo audit` before submitting PRs +- Use `cargo deny` to check license compliance +- Follow secure coding practices +- Don't hardcode secrets or credentials +- Validate all external inputs + +## Dependency Security + +We use automated tools to monitor dependencies: + +- **cargo-audit**: Scans for known security vulnerabilities +- **cargo-deny**: Checks licenses and bans unsafe dependencies + +These run in CI on every push and PR. + +## Code Review + +All code changes go through review before merging: + +- At least one maintainer review required +- Security implications considered +- Tests required for all changes +- CI checks must pass + +## Known Vulnerabilities + +We maintain transparency about known issues: + +- Documented in GitHub security advisories +- Announced in release notes +- Tracked in issues with `security` label + +## Security Contact + +For security inquiries, please contact: + +- Email: [project maintainers] +- Issue: Open a private security advisory on GitHub + +## Changelog + +Security fixes are highlighted in CHANGELOG.md with [SECURITY] prefix. + +## Resources + +- [OWASP Top 10](https://owasp.org/www-project-top-ten/) +- [CWE: Common Weakness Enumeration](https://cwe.mitre.org/) +- [Rust Security](https://www.rust-lang.org/governance/security-disclosures) +- [npm Security](https://docs.npmjs.com/about-npm/security) + +## Questions + +If you have security questions (not vulnerabilities), open a discussion or issue with the `security` label. diff --git a/bootstrap/README.md b/bootstrap/README.md index 7973e8a..a66afae 100644 --- a/bootstrap/README.md +++ b/bootstrap/README.md @@ -1 +1,246 @@ -# Provisioning Platform Bootstrap\n\nSimple, flexible bootstrap script for provisioning platform installation.\n\n**No Rust compilation required** - uses pure Bash + Nushell.\n\n## Quick Start\n\n### From Git Repository\n\n```\ngit clone https://github.com/provisioning/provisioning.git\ncd provisioning\n\n# Run bootstrap\n./provisioning/bootstrap/install.sh\n```\n\n### What it Does (7 Stages)\n\n1. **System Detection** - Detects OS, CPU, RAM, architecture\n2. **Dependency Check** - Validates Docker, Rust, Nushell installed\n3. **Directory Structure** - Creates workspace directories\n4. **Configuration Validation** - Validates Nickel config syntax\n5. **Export Configuration** - Exports config.ncl → TOML for services\n6. **Initialize Orchestrator** - Starts orchestrator service\n7. 
**Verification** - Confirms all files created and services running\n\n## Usage\n\n### Standard Bootstrap (Interactive)\n\n```\n./provisioning/bootstrap/install.sh\n```\n\n### Nushell Direct\n\n```\nnu provisioning/bootstrap/install.nu $(pwd)\n```\n\n## Requirements\n\n**Minimum**:\n\n- Nushell 0.109.0+ (auto-installed if missing)\n- Docker (for containers)\n- Rust + Cargo (for building services)\n- Git (for cloning)\n\n**Recommended**:\n\n- 2+ GB RAM\n- 10+ GB disk\n- macOS, Linux, or WSL2\n\n## What Gets Created\n\nAfter bootstrap, your workspace has:\n\n```\nworkspace_librecloud/\n├── config/\n│ ├── config.ncl ← Master config (Nickel)\n│ └── generated/ ← Auto-exported TOML\n│ ├── workspace.toml\n│ ├── providers/\n│ │ ├── upcloud.toml\n│ │ └── local.toml\n│ └── platform/\n│ └── orchestrator.toml\n├── .orchestrator/data/queue/ ← Orchestrator data\n├── .kms/ ← KMS data\n├── .providers/ ← Provider state\n├── .taskservs/ ← Task service data\n└── .clusters/ ← Cluster data\n```\n\n## Differences from Rust Installer\n\n| Feature | Rust Installer | Bash+Nushell Bootstrap |\n| --------- | ----------------- | ------------------------ |\n| **Requires compilation** | ✅ Yes (5+ min) | ❌ No |\n| **Flexible** | ⚠️ Limited | ✅ Fully scriptable |\n| **Source code** | ❌ Binary | ✅ Clear scripts |\n| **Easy to modify** | ❌ Recompile | ✅ Edit script |\n| **Integrates with TypeDialog** | ❌ Hard | ✅ Easy |\n| **Deployable everywhere** | ✅ Binary | ✅ Script |\n| **TUI Interface** | ✅ Ratatui | ⚠️ Text menus |\n\n## Troubleshooting\n\n### "Nushell not found"\n\n```\n# Install Nushell manually:\n# macOS:\nbrew install nushell\n\n# Linux (Debian):\nsudo apt install nushell\n\n# Linux (RHEL):\nsudo yum install nushell\n\n# Or: https://nushell.sh/book/installation.html\n```\n\n### "Docker not installed"\n\n```\n# https://docs.docker.com/get-docker/\n```\n\n### "Rust not installed"\n\n```\n# https://rustup.rs/\ncurl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh\nrustup default stable\n```\n\n### "Configuration validation failed"\n\n```\n# Check Nickel syntax\nnickel typecheck workspace_librecloud/config/config.ncl\n\n# Fix errors in config.ncl\nvim workspace_librecloud/config/config.ncl\n\n# Re-run bootstrap\n./provisioning/bootstrap/install.sh\n```\n\n### "Orchestrator didn't start"\n\n```\n# Check logs\ntail -f workspace_librecloud/.orchestrator/logs/orchestrator.log\n\n# Manual start\ncd provisioning/platform/orchestrator\n./scripts/start-orchestrator.nu --background\n\n# Check health\ncurl http://localhost:9090/health\n```\n\n## After Bootstrap\n\nOnce complete:\n\n1. **Verify orchestrator**:\n\n ```bash\n curl http://localhost:9090/health\n ```\n\n1. **Update configuration** (optional):\n\n ```bash\n provisioning config platform orchestrator\n ```\n\n2. **Start provisioning**:\n\n ```bash\n provisioning server create --infra sgoyol --name web-01\n ```\n\n3. **Monitor progress**:\n\n ```bash\n provisioning workflow monitor \n ```\n\n## Development\n\n### Add New Bootstrap Stage\n\nEdit `install.nu` and add:\n\n```\n# Stage N: YOUR STAGE NAME\nprint "🔧 Stage N: Your Stage Name"\nprint "─────────────────────────────────────────────────────────────────"\n\n# Your logic here\n\nprint " ✅ Done"\nprint ""\n```\n\n### Modify Existing Stages\n\nDirect script edits - no compilation needed. 
Changes take effect immediately.\n\n### Extend Bootstrap\n\nAdd new scripts in `provisioning/bootstrap/` directory:\n\n```\nprovisioning/bootstrap/\n├── install.sh # Entry point\n├── install.nu # Main orchestrator\n├── validators.nu # Validation helpers (future)\n├── generators.nu # Generator helpers (future)\n└── README.md # This file\n```\n\n## Comparison to Old Rust Installer\n\n**Old way**:\n\n1. Run Rust installer binary\n2. Need to recompile for any changes\n3. Difficult to integrate with TypeDialog\n4. Hard to debug\n\n**New way**:\n\n1. Run simple bash script\n2. Changes take effect immediately\n3. Uses existing Nushell libraries\n4. Easy to extend and debug\n\n## FAQ\n\n**Q: Why not keep the Rust installer?**\nA: Rust crate was over-engineered for bootstrap. Bash+Nushell is simpler, more flexible, and integrates better with the rest of the system.\n\n**Q: Can I customize the bootstrap?**\nA: Yes! Edit `install.nu` directly. Add new stages, change logic, integrate TypeDialog - all without compilation.\n\n**Q: What about TUI interface?**\nA: Bootstrap uses text menus. If you need a fancy TUI, you can build a separate Rust tool, but it's not required for basic installation.\n\n**Q: Is this production-ready?**\nA: Yes. It's simpler and more robust than the old Rust installer.\n\n---\n\n**Status**: ✅ Ready for use\n**Last Updated**: 2025-01-02 +# Provisioning Platform Bootstrap + +Simple, flexible bootstrap script for provisioning platform installation. + +**No Rust compilation required** - uses pure Bash + Nushell. + +## Quick Start + +### From Git Repository + +```bash +git clone https://github.com/provisioning/provisioning.git +cd provisioning + +# Run bootstrap +./provisioning/bootstrap/install.sh +``` + +### What it Does (7 Stages) + +1. **System Detection** - Detects OS, CPU, RAM, architecture +2. **Dependency Check** - Validates Docker, Rust, Nushell installed +3. **Directory Structure** - Creates workspace directories +4. **Configuration Validation** - Validates Nickel config syntax +5. **Export Configuration** - Exports config.ncl → TOML for services +6. **Initialize Orchestrator** - Starts orchestrator service +7. 
**Verification** - Confirms all files created and services running
+
+## Usage
+
+### Standard Bootstrap (Interactive)
+
+```bash
+./provisioning/bootstrap/install.sh
+```
+
+### Nushell Direct
+
+```bash
+nu provisioning/bootstrap/install.nu $(pwd)
+```
+
+## Requirements
+
+**Minimum**:
+
+- Nushell 0.109.0+ (auto-installed if missing)
+- Docker (for containers)
+- Rust + Cargo (for building services)
+- Git (for cloning)
+
+**Recommended**:
+
+- 2+ GB RAM
+- 10+ GB disk
+- macOS, Linux, or WSL2
+
+## What Gets Created
+
+After bootstrap, your workspace has:
+
+```text
+workspace_librecloud/
+├── config/
+│   ├── config.ncl                ← Master config (Nickel)
+│   └── generated/                ← Auto-exported TOML
+│       ├── workspace.toml
+│       ├── providers/
+│       │   ├── upcloud.toml
+│       │   └── local.toml
+│       └── platform/
+│           └── orchestrator.toml
+├── .orchestrator/data/queue/     ← Orchestrator data
+├── .kms/                         ← KMS data
+├── .providers/                   ← Provider state
+├── .taskservs/                   ← Task service data
+└── .clusters/                    ← Cluster data
+```
+
+## Differences from Rust Installer
+
+| Feature | Rust Installer | Bash+Nushell Bootstrap |
+| --------- | ----------------- | ------------------------ |
+| **Requires compilation** | ✅ Yes (5+ min) | ❌ No |
+| **Flexible** | ⚠️ Limited | ✅ Fully scriptable |
+| **Source code** | ❌ Binary | ✅ Clear scripts |
+| **Easy to modify** | ❌ Recompile | ✅ Edit script |
+| **Integrates with TypeDialog** | ❌ Hard | ✅ Easy |
+| **Deployable everywhere** | ✅ Binary | ✅ Script |
+| **TUI Interface** | ✅ Ratatui | ⚠️ Text menus |
+
+## Troubleshooting
+
+### "Nushell not found"
+
+```bash
+# Install Nushell manually:
+# macOS:
+brew install nushell
+
+# Linux (Debian):
+sudo apt install nushell
+
+# Linux (RHEL):
+sudo yum install nushell
+
+# Or: https://nushell.sh/book/installation.html
+```
+
+### "Docker not installed"
+
+```bash
+# https://docs.docker.com/get-docker/
+```
+
+### "Rust not installed"
+
+```bash
+# https://rustup.rs/
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
+rustup default stable
+```
+
+### "Configuration validation failed"
+
+```bash
+# Check Nickel syntax
+nickel typecheck workspace_librecloud/config/config.ncl
+
+# Fix errors in config.ncl
+vim workspace_librecloud/config/config.ncl
+
+# Re-run bootstrap
+./provisioning/bootstrap/install.sh
+```
+
+### "Orchestrator didn't start"
+
+```bash
+# Check logs
+tail -f workspace_librecloud/.orchestrator/logs/orchestrator.log
+
+# Manual start
+cd provisioning/platform/orchestrator
+./scripts/start-orchestrator.nu --background
+
+# Check health
+curl http://localhost:9090/health
+```
+
+## After Bootstrap
+
+Once complete:
+
+1. **Verify orchestrator**:
+
+   ```bash
+   curl http://localhost:9090/health
+   ```
+
+2. **Update configuration** (optional):
+
+   ```bash
+   provisioning config platform orchestrator
+   ```
+
+3. **Start provisioning**:
+
+   ```bash
+   provisioning server create --infra sgoyol --name web-01
+   ```
+
+4. **Monitor progress**:
+
+   ```bash
+   provisioning workflow monitor
+   ```
+
+## Development
+
+### Add New Bootstrap Stage
+
+Edit `install.nu` and add:
+
+```nushell
+# Stage N: YOUR STAGE NAME
+print "🔧 Stage N: Your Stage Name"
+print "─────────────────────────────────────────────────────────────────"
+
+# Your logic here
+
+print " ✅ Done"
+print ""
+```
+
+### Modify Existing Stages
+
+Direct script edits - no compilation needed. Changes take effect immediately.
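+
+For a quick sanity check after changing a stage, you can verify the bootstrap output directly. A minimal sketch (directory names taken from the "What Gets Created" layout above; adjust the workspace name for your setup):
+
+```bash
+# Verify the directories bootstrap is expected to create
+WORKSPACE="workspace_librecloud"
+for dir in config .orchestrator/data/queue .kms .providers .taskservs .clusters; do
+  if [ -d "$WORKSPACE/$dir" ]; then
+    echo "ok:      $WORKSPACE/$dir"
+  else
+    echo "missing: $WORKSPACE/$dir" >&2
+  fi
+done
+```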
+ +### Extend Bootstrap + +Add new scripts in `provisioning/bootstrap/` directory: + +```bash +provisioning/bootstrap/ +├── install.sh # Entry point +├── install.nu # Main orchestrator +├── validators.nu # Validation helpers (future) +├── generators.nu # Generator helpers (future) +└── README.md # This file +``` + +## Comparison to Old Rust Installer + +**Old way**: + +1. Run Rust installer binary +2. Need to recompile for any changes +3. Difficult to integrate with TypeDialog +4. Hard to debug + +**New way**: + +1. Run simple bash script +2. Changes take effect immediately +3. Uses existing Nushell libraries +4. Easy to extend and debug + +## FAQ + +**Q: Why not keep the Rust installer?** +A: Rust crate was over-engineered for bootstrap. Bash+Nushell is simpler, more flexible, and integrates better with the rest of the system. + +**Q: Can I customize the bootstrap?** +A: Yes! Edit `install.nu` directly. Add new stages, change logic, integrate TypeDialog - all without compilation. + +**Q: What about TUI interface?** +A: Bootstrap uses text menus. If you need a fancy TUI, you can build a separate Rust tool, but it's not required for basic installation. + +**Q: Is this production-ready?** +A: Yes. It's simpler and more robust than the old Rust installer. + +--- + +**Status**: ✅ Ready for use +**Last Updated**: 2025-01-02 \ No newline at end of file diff --git a/config/README.md b/config/README.md index e0bb93d..81ee8b1 100644 --- a/config/README.md +++ b/config/README.md @@ -1 +1,391 @@ -# Platform Configuration Management\n\nThis directory manages **runtime configurations** for provisioning platform services.\n\n## Structure\n\n```\nprovisioning/config/\n├── runtime/ # 🔒 PRIVATE (gitignored)\n│ ├── .gitignore # Runtime files are private\n│ ├── orchestrator.solo.ncl # Runtime config (editable)\n│ ├── vault-service.multiuser.ncl # Runtime config (editable)\n│ └── generated/ # 📄 Auto-generated TOMLs\n│ ├── orchestrator.solo.toml # Exported from .ncl\n│ └── vault-service.multiuser.toml\n│\n├── examples/ # 📘 PUBLIC (reference)\n│ ├── orchestrator.solo.example.ncl\n│ └── orchestrator.enterprise.example.ncl\n│\n├── README.md # This file\n└── setup-platform-config.sh # ← See provisioning/scripts/setup-platform-config.sh\n```\n\n## Quick Start\n\n### 1. Setup Platform Configuration (First Time)\n\n```\n# Interactive wizard (recommended)\n./provisioning/scripts/setup-platform-config.sh\n\n# Or quick setup for all services in solo mode\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo\n```\n\n### 2. Run Services\n\n```\n# Service reads config from generated TOML\nexport ORCHESTRATOR_MODE=solo\ncargo run -p orchestrator\n\n# Or with explicit config path\nexport ORCHESTRATOR_CONFIG=provisioning/config/runtime/generated/orchestrator.solo.toml\ncargo run -p orchestrator\n```\n\n### 3. 
Update Configuration\n\n**Option A: Interactive (Recommended)**\n```\n# Update via TypeDialog UI\n./provisioning/scripts/setup-platform-config.sh --service orchestrator --mode solo\n```\n\n**Option B: Manual Edit**\n```\n# Edit Nickel directly\nvim provisioning/config/runtime/orchestrator.solo.ncl\n\n# ⚠️ CRITICAL: Regenerate TOML afterward\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n```\n\n## Configuration Layers\n\n```\n📘 PUBLIC (provisioning/schemas/platform/)\n├── schemas/ → Type contracts (Nickel)\n├── defaults/ → Base configuration values\n│ └── deployment/ → Mode-specific overlays (solo/multiuser/cicd/enterprise)\n├── validators/ → Business logic validation\n└── common/\n └── helpers.ncl → Merge functions\n\n ⬇️ COMPOSITION PROCESS ⬇️\n\n🔒 PRIVATE (provisioning/config/runtime/)\n├── orchestrator.solo.ncl ← User editable\n│ (imports schemas + defaults + mode overlay)\n│ (uses helpers.compose_config for merge)\n│\n└── generated/\n └── orchestrator.solo.toml ← Auto-exported for Rust services\n (generated by: nickel export --format toml)\n```\n\n## Key Concepts\n\n### Schema (Type Contract)\n- **File**: `provisioning/schemas/platform/schemas/orchestrator.ncl`\n- **Purpose**: Defines valid fields, types, constraints\n- **Status**: 📘 PUBLIC, versioned, source of truth\n- **Edit**: Rarely (architecture changes only)\n\n### Defaults (Base Values)\n- **File**: `provisioning/schemas/platform/defaults/orchestrator-defaults.ncl`\n- **Purpose**: Default values for all orchestrator settings\n- **Status**: 📘 PUBLIC, versioned, part of product\n- **Edit**: When changing default behavior\n\n### Mode Overlay (Tuning)\n- **File**: `provisioning/schemas/platform/defaults/deployment/solo-defaults.ncl`\n- **Purpose**: Mode-specific resource/behavior tuning\n- **Status**: 📘 PUBLIC, versioned\n- **Example**: solo mode uses 2 CPU, enterprise uses 16+ CPU\n\n### Runtime Config (User Customization)\n- **File**: `provisioning/config/runtime/orchestrator.solo.ncl`\n- **Purpose**: Actual deployment configuration (can be hand-edited)\n- **Status**: 🔒 PRIVATE, gitignored\n- **Edit**: Yes, use setup script or edit manually + regenerate TOML\n\n### Generated TOML (Service Consumption)\n- **File**: `provisioning/config/runtime/generated/orchestrator.solo.toml`\n- **Purpose**: What Rust services actually read\n- **Status**: 🔒 PRIVATE, gitignored, auto-generated\n- **Edit**: NO - regenerate from .ncl instead\n- **Generation**: `nickel export --format toml `\n\n## Workflows\n\n### Scenario 1: First-Time Setup\n\n```\n# 1. Run setup script\n./provisioning/scripts/setup-platform-config.sh\n\n# 2. Choose action (TypeDialog or Quick Mode)\n# ↓\n# TypeDialog: User fills form → generates orchestrator.solo.ncl\n# Quick Mode: Composes defaults + mode overlay → generates all 8 services\n\n# 3. Script auto-exports to TOML\n# orchestrator.solo.ncl → orchestrator.solo.toml\n\n# 4. 
Service reads TOML\n# cargo run -p orchestrator (reads generated/orchestrator.solo.toml)\n```\n\n### Scenario 2: Update Configuration\n\n```\n# Option A: Interactive TypeDialog\n./provisioning/scripts/setup-platform-config.sh \n --service orchestrator \n --mode solo \n --backend web\n\n# Result: Updated orchestrator.solo.ncl + auto-exported TOML\n\n# Option B: Manual Edit\nvim provisioning/config/runtime/orchestrator.solo.ncl\n\n# ⚠️ CRITICAL: Must regenerate TOML\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n\n# Result: Updated TOML in generated/\n```\n\n### Scenario 3: Switch Deployment Mode\n\n```\n# From solo to enterprise\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode enterprise\n\n# Result: All 8 services configured for enterprise mode\n# 16+ CPU, 32+ GB RAM, HA setup, KMS integration, etc.\n```\n\n### Scenario 4: Workspace-Specific Overrides\n\n```\nworkspace_librecloud/\n├── config/\n│ └── platform-overrides.ncl # Workspace customization\n│\n# Example:\n# {\n# orchestrator.server.port = 9999,\n# orchestrator.workspace.name = "librecloud",\n# vault-service.storage.path = "./workspace_librecloud/data/vault"\n# }\n```\n\n## Important Notes\n\n### ⚠️ Manual Edits Require TOML Regeneration\n\nIf you edit `.ncl` files directly:\n\n```\n# 1. Edit the .ncl file\nvim provisioning/config/runtime/orchestrator.solo.ncl\n\n# 2. ALWAYS regenerate TOML\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n\n# Service will NOT see your changes until TOML is regenerated\n```\n\n### 🔒 Private by Design\n\nRuntime configs are **gitignored** for good reasons:\n\n- **May contain secrets**: Encrypted credentials, API keys, tokens\n- **Deployment-specific**: Different values per environment\n- **User-customized**: Each developer/workspace has different needs\n- **Not shared**: Don't commit locally-built configs\n\n### 📘 Schemas are Public\n\nSchema/defaults in `provisioning/schemas/` are **version-controlled**:\n\n- Product definition (part of releases)\n- Shared across team\n- Source of truth for config structure\n- Can reference in documentation\n\n### 🔄 Idempotent Setup\n\nThe setup script is safe to run multiple times:\n\n```\n# Safe: Updates only what's needed\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode enterprise\n\n# Safe: Doesn't overwrite unless --clean is used\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n\n# Use --clean to start fresh\n./provisioning/scripts/setup-platform-config.sh --clean\n```\n\n## Service Configuration Paths\n\nEach service loads config using this priority:\n\n```\n1. Environment variable: ORCHESTRATOR_CONFIG=/path/to/custom.toml\n2. Mode-specific runtime: provisioning/config/runtime/generated/orchestrator.{MODE}.toml\n3. 
Fallback defaults: provisioning/schemas/platform/defaults/orchestrator-defaults.ncl\n```\n\n## Configuration Composition (Technical)\n\nThe setup script uses Nickel's `helpers.compose_config` function:\n\n```\n# Generated .ncl file imports:\nlet helpers = import "provisioning/schemas/platform/common/helpers.ncl"\nlet defaults = import "provisioning/schemas/platform/defaults/orchestrator-defaults.ncl"\nlet mode_config = import "provisioning/schemas/platform/defaults/deployment/solo-defaults.ncl"\n\n# Compose: base + mode overlay\nhelpers.compose_config defaults mode_config {}\n# ^base ^mode overlay ^user overrides (empty if not customized)\n```\n\nThis ensures:\n- Type safety (validated by Nickel schema)\n- Proper layering (base + mode + user)\n- Reproducibility (same compose always produces same result)\n- Extensibility (can add user layer via Nickel import)\n\n## Troubleshooting\n\n### Config Won't Generate TOML\n\n```\n# Check Nickel syntax\nnickel typecheck provisioning/config/runtime/orchestrator.solo.ncl\n\n# Check for schema import errors\nnickel export --format json provisioning/config/runtime/orchestrator.solo.ncl\n\n# View detailed error message\nnickel typecheck -i provisioning/config/runtime/orchestrator.solo.ncl 2>&1 | less\n```\n\n### Service Won't Start\n\n```\n# Verify TOML exists\nls -la provisioning/config/runtime/generated/orchestrator.solo.toml\n\n# Verify TOML syntax\ntoml-cli validate provisioning/config/runtime/generated/orchestrator.solo.toml\n\n# Check service config loading\nRUST_LOG=debug cargo run -p orchestrator 2>&1 | head -50\n```\n\n### Wrong Configuration Being Used\n\n```\n# Verify environment mode\necho $ORCHESTRATOR_MODE # Should be: solo, multiuser, cicd, or enterprise\n\n# Check which file service is reading\nORCHESTRATOR_CONFIG=provisioning/config/runtime/generated/orchestrator.solo.toml \n cargo run -p orchestrator\n\n# Verify file modification time\nls -lah provisioning/config/runtime/generated/orchestrator.*.toml\n```\n\n## Integration Points\n\n### ⚠️ Provisioning Installer Status\n\n**Current Status**: Installer NOT YET IMPLEMENTED\n\nThe `setup-platform-config.sh` script is a **standalone tool** that:\n- ✅ Works independently from the provisioning installer\n- ✅ Can be called manually for configuration setup\n- ⏳ Will be integrated into the installer once it's implemented\n\n**For Now**: Use script manually before running services:\n\n```\n# Manual setup (until installer is implemented)\ncd /path/to/project-provisioning\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo\n\n# Then run services\nexport ORCHESTRATOR_MODE=solo\ncargo run -p orchestrator\n```\n\n### Future: Integration into Provisioning Installer\n\nOnce `provisioning/scripts/install.sh` is implemented, it will automatically call this script:\n\n```\n#!/bin/bash\n# provisioning/scripts/install.sh (FUTURE - NOT YET IMPLEMENTED)\n\n# Pre-flight checks (verification of dependencies, paths, permissions)\ncheck_dependencies() {\n command -v nickel >/dev/null || { echo "Nickel required"; exit 1; }\n command -v nu >/dev/null || { echo "Nushell required"; exit 1; }\n}\ncheck_dependencies\n\n# Install core provisioning system\necho "Installing provisioning system..."\n# (install implementation details here)\n\n# Setup platform configurations\necho "Setting up platform configurations..."\n./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo\n\n# Build and test platform services\necho "Building platform services..."\ncargo build -p orchestrator -p 
control-center -p mcp-server\n\n# Verify services are operational\necho "Verification complete - services ready to run"\n```\n\n### CI/CD Pipeline Integration\n\nFor automated CI/CD setups (can use now):\n\n```\n#!/bin/bash\n# ci/setup.sh\n\n# Setup configurations for CI/CD mode\ncd /path/to/project-provisioning\n./provisioning/scripts/setup-platform-config.sh \n --quick-mode \n --mode cicd\n\n# Result: All services configured for CI/CD mode\n# (ephemeral, API-driven, fast cleanup, minimal resource footprint)\n\n# Run tests\ncargo test --all\n\n# Deploy (CI/CD specific)\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.cicd.yml up\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2026-01-05\n**Script Reference**: `provisioning/scripts/setup-platform-config.sh` \ No newline at end of file +# Platform Configuration Management + +This directory manages **runtime configurations** for provisioning platform services. + +## Structure + +```bash +provisioning/config/ +├── runtime/ # 🔒 PRIVATE (gitignored) +│ ├── .gitignore # Runtime files are private +│ ├── orchestrator.solo.ncl # Runtime config (editable) +│ ├── vault-service.multiuser.ncl # Runtime config (editable) +│ └── generated/ # 📄 Auto-generated TOMLs +│ ├── orchestrator.solo.toml # Exported from .ncl +│ └── vault-service.multiuser.toml +│ +├── examples/ # 📘 PUBLIC (reference) +│ ├── orchestrator.solo.example.ncl +│ └── orchestrator.enterprise.example.ncl +│ +├── README.md # This file +└── setup-platform-config.sh # ← See provisioning/scripts/setup-platform-config.sh +``` + +## Quick Start + +### 1. Setup Platform Configuration (First Time) + +```toml +# Interactive wizard (recommended) +./provisioning/scripts/setup-platform-config.sh + +# Or quick setup for all services in solo mode +./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo +``` + +### 2. Run Services + +```bash +# Service reads config from generated TOML +export ORCHESTRATOR_MODE=solo +cargo run -p orchestrator + +# Or with explicit config path +export ORCHESTRATOR_CONFIG=provisioning/config/runtime/generated/orchestrator.solo.toml +cargo run -p orchestrator +``` + +### 3. 
Update Configuration
+
+**Option A: Interactive (Recommended)**
+
+```bash
+# Update via TypeDialog UI
+./provisioning/scripts/setup-platform-config.sh --service orchestrator --mode solo
+```
+
+**Option B: Manual Edit**
+
+```bash
+# Edit Nickel directly
+vim provisioning/config/runtime/orchestrator.solo.ncl
+
+# ⚠️ CRITICAL: Regenerate TOML afterward
+./provisioning/scripts/setup-platform-config.sh --generate-toml
+```
+
+## Configuration Layers
+
+```text
+📘 PUBLIC (provisioning/schemas/platform/)
+├── schemas/          → Type contracts (Nickel)
+├── defaults/         → Base configuration values
+│   └── deployment/   → Mode-specific overlays (solo/multiuser/cicd/enterprise)
+├── validators/       → Business logic validation
+└── common/
+    └── helpers.ncl   → Merge functions
+
+        ⬇️ COMPOSITION PROCESS ⬇️
+
+🔒 PRIVATE (provisioning/config/runtime/)
+├── orchestrator.solo.ncl        ← User editable
+│   (imports schemas + defaults + mode overlay)
+│   (uses helpers.compose_config for merge)
+│
+└── generated/
+    └── orchestrator.solo.toml   ← Auto-exported for Rust services
+        (generated by: nickel export --format toml)
+```
+
+## Key Concepts
+
+### Schema (Type Contract)
+- **File**: `provisioning/schemas/platform/schemas/orchestrator.ncl`
+- **Purpose**: Defines valid fields, types, constraints
+- **Status**: 📘 PUBLIC, versioned, source of truth
+- **Edit**: Rarely (architecture changes only)
+
+### Defaults (Base Values)
+- **File**: `provisioning/schemas/platform/defaults/orchestrator-defaults.ncl`
+- **Purpose**: Default values for all orchestrator settings
+- **Status**: 📘 PUBLIC, versioned, part of product
+- **Edit**: When changing default behavior
+
+### Mode Overlay (Tuning)
+- **File**: `provisioning/schemas/platform/defaults/deployment/solo-defaults.ncl`
+- **Purpose**: Mode-specific resource/behavior tuning
+- **Status**: 📘 PUBLIC, versioned
+- **Example**: solo mode uses 2 CPU, enterprise uses 16+ CPU
+
+### Runtime Config (User Customization)
+- **File**: `provisioning/config/runtime/orchestrator.solo.ncl`
+- **Purpose**: Actual deployment configuration (can be hand-edited)
+- **Status**: 🔒 PRIVATE, gitignored
+- **Edit**: Yes, use setup script or edit manually + regenerate TOML
+
+### Generated TOML (Service Consumption)
+- **File**: `provisioning/config/runtime/generated/orchestrator.solo.toml`
+- **Purpose**: What Rust services actually read
+- **Status**: 🔒 PRIVATE, gitignored, auto-generated
+- **Edit**: NO - regenerate from .ncl instead
+- **Generation**: `nickel export --format toml `
+
+## Workflows
+
+### Scenario 1: First-Time Setup
+
+```bash
+# 1. Run setup script
+./provisioning/scripts/setup-platform-config.sh
+
+# 2. Choose action (TypeDialog or Quick Mode)
+#    ↓
+#    TypeDialog: User fills form → generates orchestrator.solo.ncl
+#    Quick Mode: Composes defaults + mode overlay → generates all 8 services
+
+# 3. Script auto-exports to TOML
+#    orchestrator.solo.ncl → orchestrator.solo.toml
+
+# 4. Service reads TOML
+#    cargo run -p orchestrator (reads generated/orchestrator.solo.toml)
+```
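+
+The auto-export in step 3 can also be reproduced by hand, which is useful when debugging the setup script. A minimal sketch using the same paths as above (assumes the `generated/` directory already exists):
+
+```bash
+# Validate the runtime config, then export the TOML the service actually reads
+nickel typecheck provisioning/config/runtime/orchestrator.solo.ncl
+nickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl \
+  > provisioning/config/runtime/generated/orchestrator.solo.toml
+```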
+
+### Scenario 2: Update Configuration
+
+```bash
+# Option A: Interactive TypeDialog
+./provisioning/scripts/setup-platform-config.sh \
+  --service orchestrator \
+  --mode solo \
+  --backend web
+
+# Result: Updated orchestrator.solo.ncl + auto-exported TOML
+
+# Option B: Manual Edit
+vim provisioning/config/runtime/orchestrator.solo.ncl
+
+# ⚠️ CRITICAL: Must regenerate TOML
+./provisioning/scripts/setup-platform-config.sh --generate-toml
+
+# Result: Updated TOML in generated/
+```
+
+### Scenario 3: Switch Deployment Mode
+
+```bash
+# From solo to enterprise
+./provisioning/scripts/setup-platform-config.sh --quick-mode --mode enterprise
+
+# Result: All 8 services configured for enterprise mode
+# 16+ CPU, 32+ GB RAM, HA setup, KMS integration, etc.
+```
+
+### Scenario 4: Workspace-Specific Overrides
+
+```text
+workspace_librecloud/
+├── config/
+│   └── platform-overrides.ncl   # Workspace customization
+│
+# Example:
+# {
+#   orchestrator.server.port = 9999,
+#   orchestrator.workspace.name = "librecloud",
+#   vault-service.storage.path = "./workspace_librecloud/data/vault"
+# }
+```
+
+## Important Notes
+
+### ⚠️ Manual Edits Require TOML Regeneration
+
+If you edit `.ncl` files directly:
+
+```bash
+# 1. Edit the .ncl file
+vim provisioning/config/runtime/orchestrator.solo.ncl
+
+# 2. ALWAYS regenerate TOML
+./provisioning/scripts/setup-platform-config.sh --generate-toml
+
+# Service will NOT see your changes until TOML is regenerated
+```
+
+### 🔒 Private by Design
+
+Runtime configs are **gitignored** for good reasons:
+
+- **May contain secrets**: Encrypted credentials, API keys, tokens
+- **Deployment-specific**: Different values per environment
+- **User-customized**: Each developer/workspace has different needs
+- **Not shared**: Don't commit locally-built configs
+
+### 📘 Schemas are Public
+
+Schema/defaults in `provisioning/schemas/` are **version-controlled**:
+
+- Product definition (part of releases)
+- Shared across team
+- Source of truth for config structure
+- Can reference in documentation
+
+### 🔄 Idempotent Setup
+
+The setup script is safe to run multiple times:
+
+```bash
+# Safe: Updates only what's needed
+./provisioning/scripts/setup-platform-config.sh --quick-mode --mode enterprise
+
+# Safe: Doesn't overwrite unless --clean is used
+./provisioning/scripts/setup-platform-config.sh --generate-toml
+
+# Use --clean to start fresh
+./provisioning/scripts/setup-platform-config.sh --clean
+```
+
+## Service Configuration Paths
+
+Each service loads config using this priority:
+
+```text
+1. Environment variable: ORCHESTRATOR_CONFIG=/path/to/custom.toml
+2. Mode-specific runtime: provisioning/config/runtime/generated/orchestrator.{MODE}.toml
+3. Fallback defaults: provisioning/schemas/platform/defaults/orchestrator-defaults.ncl
+```
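+
+The same resolution order can be sketched as a few lines of shell. This is illustrative only - it mirrors the priority list above, not the service's actual loading code:
+
+```bash
+# Illustrative config resolution (not the service internals)
+MODE="${ORCHESTRATOR_MODE:-solo}"
+CONFIG="${ORCHESTRATOR_CONFIG:-provisioning/config/runtime/generated/orchestrator.${MODE}.toml}"
+if [ ! -f "$CONFIG" ]; then
+  echo "no generated TOML for mode '$MODE'; the service would fall back to packaged defaults" >&2
+fi
+echo "using config: $CONFIG"
+```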
+
+## Configuration Composition (Technical)
+
+The setup script uses Nickel's `helpers.compose_config` function:
+
+```nickel
+# Generated .ncl file imports:
+let helpers = import "provisioning/schemas/platform/common/helpers.ncl"
+let defaults = import "provisioning/schemas/platform/defaults/orchestrator-defaults.ncl"
+let mode_config = import "provisioning/schemas/platform/defaults/deployment/solo-defaults.ncl"
+
+# Compose: base + mode overlay
+helpers.compose_config defaults mode_config {}
+#                      ^base    ^mode overlay ^user overrides (empty if not customized)
+```
+
+This ensures:
+- Type safety (validated by Nickel schema)
+- Proper layering (base + mode + user)
+- Reproducibility (same compose always produces same result)
+- Extensibility (can add user layer via Nickel import)
+
+## Troubleshooting
+
+### Config Won't Generate TOML
+
+```bash
+# Check Nickel syntax
+nickel typecheck provisioning/config/runtime/orchestrator.solo.ncl
+
+# Check for schema import errors
+nickel export --format json provisioning/config/runtime/orchestrator.solo.ncl
+
+# View detailed error message
+nickel typecheck -i provisioning/config/runtime/orchestrator.solo.ncl 2>&1 | less
+```
+
+### Service Won't Start
+
+```bash
+# Verify TOML exists
+ls -la provisioning/config/runtime/generated/orchestrator.solo.toml
+
+# Verify TOML syntax
+toml-cli validate provisioning/config/runtime/generated/orchestrator.solo.toml
+
+# Check service config loading
+RUST_LOG=debug cargo run -p orchestrator 2>&1 | head -50
+```
+
+### Wrong Configuration Being Used
+
+```bash
+# Verify environment mode
+echo $ORCHESTRATOR_MODE   # Should be: solo, multiuser, cicd, or enterprise
+
+# Check which file service is reading
+ORCHESTRATOR_CONFIG=provisioning/config/runtime/generated/orchestrator.solo.toml \
+  cargo run -p orchestrator
+
+# Verify file modification time
+ls -lah provisioning/config/runtime/generated/orchestrator.*.toml
+```
+
+## Integration Points
+
+### ⚠️ Provisioning Installer Status
+
+**Current Status**: Installer NOT YET IMPLEMENTED
+
+The `setup-platform-config.sh` script is a **standalone tool** that:
+- ✅ Works independently from the provisioning installer
+- ✅ Can be called manually for configuration setup
+- ⏳ Will be integrated into the installer once it's implemented
+
+**For Now**: Use script manually before running services:
+
+```bash
+# Manual setup (until installer is implemented)
+cd /path/to/project-provisioning
+./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo
+
+# Then run services
+export ORCHESTRATOR_MODE=solo
+cargo run -p orchestrator
+```
+
+### Future: Integration into Provisioning Installer
+
+Once `provisioning/scripts/install.sh` is implemented, it will automatically call this script:
+
+```bash
+#!/bin/bash
+# provisioning/scripts/install.sh (FUTURE - NOT YET IMPLEMENTED)
+
+# Pre-flight checks (verification of dependencies, paths, permissions)
+check_dependencies() {
+  command -v nickel >/dev/null || { echo "Nickel required"; exit 1; }
+  command -v nu >/dev/null || { echo "Nushell required"; exit 1; }
+}
+check_dependencies
+
+# Install core provisioning system
+echo "Installing provisioning system..."
+# (install implementation details here)
+
+# Setup platform configurations
+echo "Setting up platform configurations..."
+./provisioning/scripts/setup-platform-config.sh --quick-mode --mode solo
+
+# Build and test platform services
+echo "Building platform services..."
+cargo build -p orchestrator -p control-center -p mcp-server + +# Verify services are operational +echo "Verification complete - services ready to run" +``` + +### CI/CD Pipeline Integration + +For automated CI/CD setups (can use now): + +```bash +#!/bin/bash +# ci/setup.sh + +# Setup configurations for CI/CD mode +cd /path/to/project-provisioning +./provisioning/scripts/setup-platform-config.sh + --quick-mode + --mode cicd + +# Result: All services configured for CI/CD mode +# (ephemeral, API-driven, fast cleanup, minimal resource footprint) + +# Run tests +cargo test --all + +# Deploy (CI/CD specific) +docker-compose -f provisioning/platform/infrastructure/docker/docker-compose.cicd.yml up +``` + +--- + +**Version**: 1.0.0 +**Last Updated**: 2026-01-05 +**Script Reference**: `provisioning/scripts/setup-platform-config.sh` \ No newline at end of file diff --git a/config/examples/README.md b/config/examples/README.md index d666b48..f1dfeae 100644 --- a/config/examples/README.md +++ b/config/examples/README.md @@ -1 +1,494 @@ -# Example Platform Service Configurations\n\nThis directory contains reference configurations for platform services in different deployment modes. These examples show realistic settings and best practices for each mode.\n\n## What Are These Examples?\n\nThese are **Nickel configuration files** (.ncl format) that demonstrate how to configure the provisioning platform services. They show:\n\n- Recommended settings for each deployment mode\n- How to customize services for your environment\n- Best practices for development, staging, and production\n- Performance tuning for different scenarios\n- Security settings appropriate to each mode\n\n## Directory Structure\n\n```\nprovisioning/config/examples/\n├── README.md # This file\n├── orchestrator.solo.example.ncl # Development mode reference\n├── orchestrator.multiuser.example.ncl # Team staging reference\n└── orchestrator.enterprise.example.ncl # Production reference\n```\n\n## Deployment Modes\n\n### Solo Mode (Development)\n\n**File**: `orchestrator.solo.example.ncl`\n\n**Characteristics**:\n- 2 CPU, 4GB RAM (lightweight)\n- Single user/developer\n- Local development machine\n- Minimal resource consumption\n- No TLS or authentication\n- In-memory storage\n\n**When to use**:\n- Local development\n- Testing configurations\n- Learning the platform\n- CI/CD test environments\n\n**Key Settings**:\n- workers: 2\n- max_concurrent_tasks: 2\n- max_memory: 1GB\n- tls: disabled\n- auth: disabled\n\n### Multiuser Mode (Team Staging)\n\n**File**: `orchestrator.multiuser.example.ncl`\n\n**Characteristics**:\n- 4 CPU, 8GB RAM (moderate)\n- Multiple concurrent users\n- Team staging environment\n- Production-like testing\n- Basic TLS and token auth\n- Filesystem storage with caching\n\n**When to use**:\n- Team development\n- Integration testing\n- Staging environment\n- Pre-production validation\n- Multi-user environments\n\n**Key Settings**:\n- workers: 4\n- max_concurrent_tasks: 10\n- max_memory: 4GB\n- tls: enabled (certificates required)\n- auth: token-based\n- storage: filesystem with replication\n\n### Enterprise Mode (Production)\n\n**File**: `orchestrator.enterprise.example.ncl`\n\n**Characteristics**:\n- 16+ CPU, 32+ GB RAM (high-performance)\n- Multi-team, multi-workspace\n- Production mission-critical\n- Full redundancy and HA\n- OAuth2/Enterprise auth\n- Distributed storage with replication\n- Full monitoring, tracing, audit\n\n**When to use**:\n- Production deployment\n- Mission-critical systems\n- High-availability 
requirements\n- Multi-tenant environments\n- Compliance requirements (SOC2, ISO27001)\n\n**Key Settings**:\n- workers: 16\n- max_concurrent_tasks: 100\n- max_memory: 32GB\n- tls: mandatory (TLS 1.3)\n- auth: OAuth2 (enterprise provider)\n- storage: distributed with 3-way replication\n- monitoring: comprehensive with tracing\n- disaster_recovery: enabled\n- compliance: SOC2, ISO27001\n\n## How to Use These Examples\n\n### Step 1: Copy the Appropriate Example\n\nChoose the example that matches your deployment mode:\n\n```\n# For development (solo)\ncp provisioning/config/examples/orchestrator.solo.example.ncl \n provisioning/config/runtime/orchestrator.solo.ncl\n\n# For team staging (multiuser)\ncp provisioning/config/examples/orchestrator.multiuser.example.ncl \n provisioning/config/runtime/orchestrator.multiuser.ncl\n\n# For production (enterprise)\ncp provisioning/config/examples/orchestrator.enterprise.example.ncl \n provisioning/config/runtime/orchestrator.enterprise.ncl\n```\n\n### Step 2: Customize for Your Environment\n\nEdit the copied file to match your specific setup:\n\n```\n# Edit the configuration\nvim provisioning/config/runtime/orchestrator.solo.ncl\n\n# Examples of customizations:\n# - Change workspace path to your project\n# - Adjust worker count based on CPU cores\n# - Set your domain names and hostnames\n# - Configure storage paths for your filesystem\n# - Update certificate paths for production\n# - Set logging endpoints for your infrastructure\n```\n\n### Step 3: Validate Configuration\n\nVerify the configuration is syntactically correct:\n\n```\n# Check Nickel syntax\nnickel typecheck provisioning/config/runtime/orchestrator.solo.ncl\n\n# View generated TOML\nnickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl\n```\n\n### Step 4: Generate TOML\n\nExport the Nickel configuration to TOML format for service consumption:\n\n```\n# Use setup script to generate TOML\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n\n# Or manually export\nnickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl > \n provisioning/config/runtime/generated/orchestrator.solo.toml\n```\n\n### Step 5: Run Services\n\nStart your platform services with the generated configuration:\n\n```\n# Set the deployment mode\nexport ORCHESTRATOR_MODE=solo\n\n# Run the orchestrator\ncargo run -p orchestrator\n```\n\n## Configuration Reference\n\n### Solo Mode Example Settings\n\n```\nserver.workers = 2\nqueue.max_concurrent_tasks = 2\nperformance.max_memory = 1000 # 1GB max\nsecurity.tls.enabled = false # No TLS for local dev\nsecurity.auth.enabled = false # No auth for local dev\n```\n\n**Use case**: Single developer on local machine\n\n### Multiuser Mode Example Settings\n\n```\nserver.workers = 4\nqueue.max_concurrent_tasks = 10\nperformance.max_memory = 4000 # 4GB max\nsecurity.tls.enabled = true # Enable TLS\nsecurity.auth.type = "token" # Token-based auth\n```\n\n**Use case**: Team of 5-10 developers in staging\n\n### Enterprise Mode Example Settings\n\n```\nserver.workers = 16\nqueue.max_concurrent_tasks = 100\nperformance.max_memory = 32000 # 32GB max\nsecurity.tls.enabled = true # TLS 1.3 only\nsecurity.auth.type = "oauth2" # OAuth2 for enterprise\nstorage.replication.factor = 3 # 3-way replication\n```\n\n**Use case**: Production with 100+ users across multiple teams\n\n## Key Configuration Sections\n\n### Server Configuration\n\nControls HTTP server behavior:\n\n```\nserver = {\n host = "0.0.0.0", # Bind address\n port = 9090, # Listen 
port\n workers = 4, # Worker threads\n max_connections = 200, # Concurrent connections\n request_timeout = 30000, # Milliseconds\n}\n```\n\n### Storage Configuration\n\nControls data persistence:\n\n```\nstorage = {\n backend = "filesystem", # filesystem or distributed\n path = "/var/lib/provisioning/orchestrator/data",\n cache.enabled = true,\n replication.enabled = true,\n replication.factor = 3, # 3-way replication for HA\n}\n```\n\n### Queue Configuration\n\nControls task queuing:\n\n```\nqueue = {\n max_concurrent_tasks = 10,\n retry_attempts = 3,\n task_timeout = 3600000, # 1 hour in milliseconds\n priority_queue = true, # Enable priority for tasks\n metrics = true, # Enable queue metrics\n}\n```\n\n### Security Configuration\n\nControls authentication and encryption:\n\n```\nsecurity = {\n tls = {\n enabled = true,\n cert_path = "/etc/provisioning/certs/cert.crt",\n key_path = "/etc/provisioning/certs/key.key",\n min_tls_version = "1.3",\n },\n auth = {\n enabled = true,\n type = "oauth2", # oauth2, token, or none\n provider = "okta",\n },\n encryption = {\n enabled = true,\n algorithm = "aes-256-gcm",\n },\n}\n```\n\n### Logging Configuration\n\nControls log output and persistence:\n\n```\nlogging = {\n level = "info", # debug, info, warning, error\n format = "json",\n output = "both", # stdout, file, or both\n file = {\n enabled = true,\n path = "/var/log/orchestrator.log",\n rotation.max_size = 104857600, # 100MB per file\n },\n}\n```\n\n### Monitoring Configuration\n\nControls observability and metrics:\n\n```\nmonitoring = {\n enabled = true,\n metrics.enabled = true,\n health_check.enabled = true,\n distributed_tracing.enabled = true,\n audit_logging.enabled = true,\n}\n```\n\n## Customization Examples\n\n### Example 1: Change Workspace Name\n\nChange the workspace identifier in solo mode:\n\n```\nworkspace = {\n name = "myproject",\n path = "./provisioning/data/orchestrator",\n}\n```\n\nInstead of default "development", use "myproject".\n\n### Example 2: Custom Server Port\n\nChange server port from default 9090:\n\n```\nserver = {\n port = 8888,\n}\n```\n\nUseful if port 9090 is already in use.\n\n### Example 3: Enable TLS in Solo Mode\n\nAdd TLS certificates to solo development:\n\n```\nsecurity = {\n tls = {\n enabled = true,\n cert_path = "./certs/localhost.crt",\n key_path = "./certs/localhost.key",\n },\n}\n```\n\nUseful for testing TLS locally before production.\n\n### Example 4: Custom Storage Path\n\nUse custom storage location:\n\n```\nstorage = {\n path = "/mnt/fast-storage/orchestrator/data",\n}\n```\n\nUseful if you have fast SSD storage available.\n\n### Example 5: Increase Workers for Staging\n\nIncrease from 4 to 8 workers in multiuser:\n\n```\nserver = {\n workers = 8,\n}\n```\n\nUseful when you have more CPU cores available.\n\n## Troubleshooting Configuration\n\n### Issue: "Configuration Won't Validate"\n\n```\n# Check for Nickel syntax errors\nnickel typecheck provisioning/config/runtime/orchestrator.solo.ncl\n\n# Get detailed error message\nnickel typecheck -i provisioning/config/runtime/orchestrator.solo.ncl\n```\n\nThe typecheck command will show exactly where the syntax error is.\n\n### Issue: "Service Won't Start"\n\n```\n# Verify TOML was exported correctly\ncat provisioning/config/runtime/generated/orchestrator.solo.toml | head -20\n\n# Check TOML syntax is valid\ntoml-cli validate provisioning/config/runtime/generated/orchestrator.solo.toml\n```\n\nThe TOML must be valid for the Rust service to parse it.\n\n### Issue: "Service Uses Wrong 
Configuration"\n\n```\n# Verify deployment mode is set\necho $ORCHESTRATOR_MODE\n\n# Check which TOML file service reads\nls -lah provisioning/config/runtime/generated/orchestrator.*.toml\n\n# Verify TOML modification time is recent\nstat provisioning/config/runtime/generated/orchestrator.solo.toml\n```\n\nThe service reads from `orchestrator.{MODE}.toml` based on environment variable.\n\n## Best Practices\n\n### Development (Solo Mode)\n\n1. Start simple using the solo example as-is first\n2. Iterate gradually, making one change at a time\n3. Enable logging by setting level = "debug" for troubleshooting\n4. Disable security features for local development (TLS/auth)\n5. Store data in ./provisioning/data/ which is gitignored\n\n### Staging (Multiuser Mode)\n\n1. Mirror production settings to test realistically\n2. Enable authentication even in staging to test auth flows\n3. Enable TLS with valid certificates to test secure connections\n4. Set up monitoring metrics and health checks\n5. Plan worker count based on expected concurrent users\n\n### Production (Enterprise Mode)\n\n1. Follow the enterprise example as baseline configuration\n2. Use secure vault for storing credentials and secrets\n3. Enable redundancy with 3-way replication for HA\n4. Enable full monitoring with distributed tracing\n5. Test failover scenarios regularly\n6. Enable audit logging for compliance\n7. Enforce TLS 1.3 and certificate rotation\n\n## Migration Between Modes\n\nTo upgrade from solo → multiuser → enterprise:\n\n```\n# 1. Backup current configuration\ncp provisioning/config/runtime/orchestrator.solo.ncl \n provisioning/config/runtime/orchestrator.solo.ncl.bak\n\n# 2. Copy new example for target mode\ncp provisioning/config/examples/orchestrator.multiuser.example.ncl \n provisioning/config/runtime/orchestrator.multiuser.ncl\n\n# 3. Customize for your environment\nvim provisioning/config/runtime/orchestrator.multiuser.ncl\n\n# 4. Validate and generate TOML\n./provisioning/scripts/setup-platform-config.sh --generate-toml\n\n# 5. Update mode environment variable and restart\nexport ORCHESTRATOR_MODE=multiuser\ncargo run -p orchestrator\n```\n\n## Related Documentation\n\n- **Platform Configuration Guide**: `provisioning/docs/src/getting-started/05-platform-configuration.md`\n- **Configuration README**: `provisioning/config/README.md`\n- **System Status**: `provisioning/config/SETUP_STATUS.md`\n- **Setup Script Reference**: `provisioning/scripts/setup-platform-config.sh.md`\n- **Advanced TypeDialog Guide**: `provisioning/docs/src/development/typedialog-platform-config-guide.md`\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2026-01-05\n**Status**: Ready to use \ No newline at end of file +# Example Platform Service Configurations + +This directory contains reference configurations for platform services in different deployment modes. These examples show realistic settings and best practices for each mode. + +## What Are These Examples? + +These are **Nickel configuration files** (.ncl format) that demonstrate how to configure the provisioning platform services. 
They show:
+
+- Recommended settings for each deployment mode
+- How to customize services for your environment
+- Best practices for development, staging, and production
+- Performance tuning for different scenarios
+- Security settings appropriate to each mode
+
+## Directory Structure
+
+```text
+provisioning/config/examples/
+├── README.md                            # This file
+├── orchestrator.solo.example.ncl        # Development mode reference
+├── orchestrator.multiuser.example.ncl   # Team staging reference
+└── orchestrator.enterprise.example.ncl  # Production reference
+```
+
+## Deployment Modes
+
+### Solo Mode (Development)
+
+**File**: `orchestrator.solo.example.ncl`
+
+**Characteristics**:
+- 2 CPU, 4GB RAM (lightweight)
+- Single user/developer
+- Local development machine
+- Minimal resource consumption
+- No TLS or authentication
+- In-memory storage
+
+**When to use**:
+- Local development
+- Testing configurations
+- Learning the platform
+- CI/CD test environments
+
+**Key Settings**:
+- workers: 2
+- max_concurrent_tasks: 2
+- max_memory: 1GB
+- tls: disabled
+- auth: disabled
+
+### Multiuser Mode (Team Staging)
+
+**File**: `orchestrator.multiuser.example.ncl`
+
+**Characteristics**:
+- 4 CPU, 8GB RAM (moderate)
+- Multiple concurrent users
+- Team staging environment
+- Production-like testing
+- Basic TLS and token auth
+- Filesystem storage with caching
+
+**When to use**:
+- Team development
+- Integration testing
+- Staging environment
+- Pre-production validation
+- Multi-user environments
+
+**Key Settings**:
+- workers: 4
+- max_concurrent_tasks: 10
+- max_memory: 4GB
+- tls: enabled (certificates required)
+- auth: token-based
+- storage: filesystem with replication
+
+### Enterprise Mode (Production)
+
+**File**: `orchestrator.enterprise.example.ncl`
+
+**Characteristics**:
+- 16+ CPU, 32+ GB RAM (high-performance)
+- Multi-team, multi-workspace
+- Production mission-critical
+- Full redundancy and HA
+- OAuth2/Enterprise auth
+- Distributed storage with replication
+- Full monitoring, tracing, audit
+
+**When to use**:
+- Production deployment
+- Mission-critical systems
+- High-availability requirements
+- Multi-tenant environments
+- Compliance requirements (SOC2, ISO27001)
+
+**Key Settings**:
+- workers: 16
+- max_concurrent_tasks: 100
+- max_memory: 32GB
+- tls: mandatory (TLS 1.3)
+- auth: OAuth2 (enterprise provider)
+- storage: distributed with 3-way replication
+- monitoring: comprehensive with tracing
+- disaster_recovery: enabled
+- compliance: SOC2, ISO27001
+
+## How to Use These Examples
+
+### Step 1: Copy the Appropriate Example
+
+Choose the example that matches your deployment mode:
+
+```bash
+# For development (solo)
+cp provisioning/config/examples/orchestrator.solo.example.ncl \
+  provisioning/config/runtime/orchestrator.solo.ncl
+
+# For team staging (multiuser)
+cp provisioning/config/examples/orchestrator.multiuser.example.ncl \
+  provisioning/config/runtime/orchestrator.multiuser.ncl
+
+# For production (enterprise)
+cp provisioning/config/examples/orchestrator.enterprise.example.ncl \
+  provisioning/config/runtime/orchestrator.enterprise.ncl
+```
+
+### Step 2: Customize for Your Environment
+
+Edit the copied file to match your specific setup:
+
+```bash
+# Edit the configuration
+vim provisioning/config/runtime/orchestrator.solo.ncl
+
+# Examples of customizations:
+# - Change workspace path to your project
+# - Adjust worker count based on CPU cores
+# - Set your domain names and hostnames
+# - Configure storage paths for your filesystem
+# - Update certificate paths for production
+# - Set logging endpoints for your infrastructure
+```
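+
+Before validating, it can help to review exactly what you changed relative to the shipped example. A small sketch (solo mode shown; substitute the mode you copied):
+
+```bash
+# Show your customizations as a unified diff against the reference example
+diff -u provisioning/config/examples/orchestrator.solo.example.ncl \
+  provisioning/config/runtime/orchestrator.solo.ncl
+```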
+### Step 3: Validate Configuration
+
+Verify the configuration is syntactically correct:
+
+```bash
+# Check Nickel syntax
+nickel typecheck provisioning/config/runtime/orchestrator.solo.ncl
+
+# View generated TOML
+nickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl
+```
+
+### Step 4: Generate TOML
+
+Export the Nickel configuration to TOML format for service consumption:
+
+```bash
+# Use setup script to generate TOML
+./provisioning/scripts/setup-platform-config.sh --generate-toml
+
+# Or manually export
+nickel export --format toml provisioning/config/runtime/orchestrator.solo.ncl > \
+  provisioning/config/runtime/generated/orchestrator.solo.toml
+```
+
+### Step 5: Run Services
+
+Start your platform services with the generated configuration:
+
+```bash
+# Set the deployment mode
+export ORCHESTRATOR_MODE=solo
+
+# Run the orchestrator
+cargo run -p orchestrator
+```
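+
+Steps 3-5 can be chained into a single command. A minimal sketch, assuming the script and paths used above (the wrapper itself is hypothetical, not shipped with the repo):
+
+```bash
+#!/usr/bin/env bash
+# Hypothetical run-mode helper: validate, regenerate TOML, then start
+# the orchestrator for the requested deployment mode.
+set -euo pipefail
+
+MODE="${1:?usage: $0 solo|multiuser|enterprise}"
+NCL="provisioning/config/runtime/orchestrator.${MODE}.ncl"
+
+nickel typecheck "$NCL"                                            # Step 3
+./provisioning/scripts/setup-platform-config.sh --generate-toml    # Step 4
+
+export ORCHESTRATOR_MODE="$MODE"                                   # Step 5
+exec cargo run -p orchestrator
+```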
path = "/var/log/orchestrator.log", + rotation.max_size = 104857600, # 100MB per file + }, +} +``` + +### Monitoring Configuration + +Controls observability and metrics: + +```bash +monitoring = { + enabled = true, + metrics.enabled = true, + health_check.enabled = true, + distributed_tracing.enabled = true, + audit_logging.enabled = true, +} +``` + +## Customization Examples + +### Example 1: Change Workspace Name + +Change the workspace identifier in solo mode: + +```bash +workspace = { + name = "myproject", + path = "./provisioning/data/orchestrator", +} +``` + +Instead of default "development", use "myproject". + +### Example 2: Custom Server Port + +Change server port from default 9090: + +```bash +server = { + port = 8888, +} +``` + +Useful if port 9090 is already in use. + +### Example 3: Enable TLS in Solo Mode + +Add TLS certificates to solo development: + +```bash +security = { + tls = { + enabled = true, + cert_path = "./certs/localhost.crt", + key_path = "./certs/localhost.key", + }, +} +``` + +Useful for testing TLS locally before production. + +### Example 4: Custom Storage Path + +Use custom storage location: + +```bash +storage = { + path = "/mnt/fast-storage/orchestrator/data", +} +``` + +Useful if you have fast SSD storage available. + +### Example 5: Increase Workers for Staging + +Increase from 4 to 8 workers in multiuser: + +```bash +server = { + workers = 8, +} +``` + +Useful when you have more CPU cores available. + +## Troubleshooting Configuration + +### Issue: "Configuration Won't Validate" + +```toml +# Check for Nickel syntax errors +nickel typecheck provisioning/config/runtime/orchestrator.solo.ncl + +# Get detailed error message +nickel typecheck -i provisioning/config/runtime/orchestrator.solo.ncl +``` + +The typecheck command will show exactly where the syntax error is. + +### Issue: "Service Won't Start" + +```bash +# Verify TOML was exported correctly +cat provisioning/config/runtime/generated/orchestrator.solo.toml | head -20 + +# Check TOML syntax is valid +toml-cli validate provisioning/config/runtime/generated/orchestrator.solo.toml +``` + +The TOML must be valid for the Rust service to parse it. + +### Issue: "Service Uses Wrong Configuration" + +```toml +# Verify deployment mode is set +echo $ORCHESTRATOR_MODE + +# Check which TOML file service reads +ls -lah provisioning/config/runtime/generated/orchestrator.*.toml + +# Verify TOML modification time is recent +stat provisioning/config/runtime/generated/orchestrator.solo.toml +``` + +The service reads from `orchestrator.{MODE}.toml` based on environment variable. + +## Best Practices + +### Development (Solo Mode) + +1. Start simple using the solo example as-is first +2. Iterate gradually, making one change at a time +3. Enable logging by setting level = "debug" for troubleshooting +4. Disable security features for local development (TLS/auth) +5. Store data in ./provisioning/data/ which is gitignored + +### Staging (Multiuser Mode) + +1. Mirror production settings to test realistically +2. Enable authentication even in staging to test auth flows +3. Enable TLS with valid certificates to test secure connections +4. Set up monitoring metrics and health checks +5. Plan worker count based on expected concurrent users + +### Production (Enterprise Mode) + +1. Follow the enterprise example as baseline configuration +2. Use secure vault for storing credentials and secrets +3. Enable redundancy with 3-way replication for HA +4. Enable full monitoring with distributed tracing +5. 
Test failover scenarios regularly +6. Enable audit logging for compliance +7. Enforce TLS 1.3 and certificate rotation + +## Migration Between Modes + +To upgrade from solo → multiuser → enterprise: + +```bash +# 1. Backup current configuration +cp provisioning/config/runtime/orchestrator.solo.ncl + provisioning/config/runtime/orchestrator.solo.ncl.bak + +# 2. Copy new example for target mode +cp provisioning/config/examples/orchestrator.multiuser.example.ncl + provisioning/config/runtime/orchestrator.multiuser.ncl + +# 3. Customize for your environment +vim provisioning/config/runtime/orchestrator.multiuser.ncl + +# 4. Validate and generate TOML +./provisioning/scripts/setup-platform-config.sh --generate-toml + +# 5. Update mode environment variable and restart +export ORCHESTRATOR_MODE=multiuser +cargo run -p orchestrator +``` + +## Related Documentation + +- **Platform Configuration Guide**: `provisioning/docs/src/getting-started/05-platform-configuration.md` +- **Configuration README**: `provisioning/config/README.md` +- **System Status**: `provisioning/config/SETUP_STATUS.md` +- **Setup Script Reference**: `provisioning/scripts/setup-platform-config.sh.md` +- **Advanced TypeDialog Guide**: `provisioning/docs/src/development/typedialog-platform-config-guide.md` + +--- + +**Version**: 1.0.0 +**Last Updated**: 2026-01-05 +**Status**: Ready to use \ No newline at end of file diff --git a/examples/complete-workflow.md b/examples/complete-workflow.md index 91597d8..01b375d 100644 --- a/examples/complete-workflow.md +++ b/examples/complete-workflow.md @@ -1 +1,510 @@ -# Complete Workflow Example: Kubernetes Cluster with Package System\n\nThis example demonstrates the complete workflow using the new KCL package and module loader system to deploy a production Kubernetes cluster.\n\n## Scenario\n\nDeploy a 3-node Kubernetes cluster with:\n\n- 1 master node\n- 2 worker nodes\n- Cilium CNI\n- Containerd runtime\n- UpCloud provider\n- Production-ready configuration\n\n## Prerequisites\n\n1. Core provisioning package installed\n2. UpCloud credentials configured\n3. SSH keys set up\n\n## Step 1: Environment Setup\n\n```\n# Ensure core package is installed\ncd /Users/Akasha/project-provisioning\n./provisioning/tools/kcl-packager.nu build --version 1.0.0\n./provisioning/tools/kcl-packager.nu install dist/provisioning-1.0.0.tar.gz\n\n# Verify installation\nkcl list packages | grep provisioning\n```\n\n## Step 2: Create Workspace\n\n```\n# Create new workspace from template\nmkdir -p workspace/infra/production-k8s\ncd workspace/infra/production-k8s\n\n# Initialize workspace structure\n../../../provisioning/tools/workspace-init.nu . init\n\n# Verify structure\ntree -a .\n```\n\nExpected output:\n\n```\n.\n├── kcl.mod\n├── servers.k\n├── README.md\n├── .gitignore\n├── .taskservs/\n├── .providers/\n├── .clusters/\n├── .manifest/\n├── data/\n├── tmp/\n├── resources/\n└── clusters/\n```\n\n## Step 3: Discover Available Modules\n\n```\n# Discover available taskservs\n../../../provisioning/core/cli/module-loader discover taskservs\n\n# Search for Kubernetes-related modules\n../../../provisioning/core/cli/module-loader discover taskservs kubernetes\n\n# Discover providers\n../../../provisioning/core/cli/module-loader discover providers\n\n# Check output formats\n../../../provisioning/core/cli/module-loader discover taskservs --format json\n```\n\n## Step 4: Load Required Modules\n\n```\n# Load Kubernetes stack taskservs\n../../../provisioning/core/cli/module-loader load taskservs . 
[kubernetes, cilium, containerd]\n\n# Load UpCloud provider\n../../../provisioning/core/cli/module-loader load providers . [upcloud]\n\n# Verify loading\n../../../provisioning/core/cli/module-loader list taskservs .\n../../../provisioning/core/cli/module-loader list providers .\n```\n\nCheck generated files:\n\n```\n# Check auto-generated imports\ncat taskservs.k\ncat providers.k\n\n# Check manifest\ncat .manifest/taskservs.yaml\ncat .manifest/providers.yaml\n```\n\n## Step 5: Configure Infrastructure\n\nEdit `servers.k` to configure the Kubernetes cluster:\n\n```\n# Production Kubernetes Cluster Configuration\nimport provisioning.settings as settings\nimport provisioning.server as server\nimport provisioning.defaults as defaults\n\n# Import loaded modules (auto-generated)\nimport .taskservs.kubernetes.kubernetes as k8s\nimport .taskservs.cilium.cilium as cilium\nimport .taskservs.containerd.containerd as containerd\nimport .providers.upcloud as upcloud\n\n# Cluster settings\nk8s_settings: settings.Settings = {\n main_name = "production-k8s"\n main_title = "Production Kubernetes Cluster"\n\n # Configure paths\n settings_path = "./data/settings.yaml"\n defaults_provs_dirpath = "./defs"\n prov_data_dirpath = "./data"\n created_taskservs_dirpath = "./tmp/k8s-deployment"\n prov_resources_path = "./resources"\n created_clusters_dirpath = "./tmp/k8s-clusters"\n prov_clusters_path = "./clusters"\n\n # Kubernetes cluster settings\n cluster_admin_host = "" # Set by provider (first master)\n cluster_admin_port = 22\n cluster_admin_user = "admin"\n servers_wait_started = 60 # K8s nodes need more time\n\n runset = {\n wait = True\n output_format = "human"\n output_path = "tmp/k8s-deployment"\n inventory_file = "./k8s-inventory.yaml"\n use_time = True\n }\n\n # Secrets configuration\n secrets = {\n provider = "sops"\n sops_config = {\n age_key_file = "~/.age/keys.txt"\n use_age = True\n }\n }\n}\n\n# Production Kubernetes cluster servers\nproduction_servers: [server.Server] = [\n # Control plane node\n {\n hostname = "k8s-master-01"\n title = "Kubernetes Master Node 01"\n\n # Production specifications\n time_zone = "UTC"\n running_wait = 20\n running_timeout = 400\n storage_os_find = "name: debian-12 | arch: x86_64"\n\n # Network configuration\n network_utility_ipv4 = True\n network_public_ipv4 = True\n priv_cidr_block = "10.0.0.0/24"\n\n # User settings\n user = "admin"\n user_ssh_port = 22\n fix_local_hosts = True\n labels = "env: production, role: control-plane, tier: master"\n\n # Taskservs configuration\n taskservs = [\n {\n name = "containerd"\n profile = "production"\n install_mode = "library"\n },\n {\n name = "kubernetes"\n profile = "master"\n install_mode = "library-server"\n },\n {\n name = "cilium"\n profile = "master"\n install_mode = "library"\n }\n ]\n },\n\n # Worker nodes\n {\n hostname = "k8s-worker-01"\n title = "Kubernetes Worker Node 01"\n\n time_zone = "UTC"\n running_wait = 20\n running_timeout = 400\n storage_os_find = "name: debian-12 | arch: x86_64"\n\n network_utility_ipv4 = True\n network_public_ipv4 = True\n priv_cidr_block = "10.0.0.0/24"\n\n user = "admin"\n user_ssh_port = 22\n fix_local_hosts = True\n labels = "env: production, role: worker, tier: compute"\n\n taskservs = [\n {\n name = "containerd"\n profile = "production"\n install_mode = "library"\n },\n {\n name = "kubernetes"\n profile = "worker"\n install_mode = "library"\n },\n {\n name = "cilium"\n profile = "worker"\n install_mode = "library"\n }\n ]\n },\n\n {\n hostname = "k8s-worker-02"\n title = 
"Kubernetes Worker Node 02"\n\n time_zone = "UTC"\n running_wait = 20\n running_timeout = 400\n storage_os_find = "name: debian-12 | arch: x86_64"\n\n network_utility_ipv4 = True\n network_public_ipv4 = True\n priv_cidr_block = "10.0.0.0/24"\n\n user = "admin"\n user_ssh_port = 22\n fix_local_hosts = True\n labels = "env: production, role: worker, tier: compute"\n\n taskservs = [\n {\n name = "containerd"\n profile = "production"\n install_mode = "library"\n },\n {\n name = "kubernetes"\n profile = "worker"\n install_mode = "library"\n },\n {\n name = "cilium"\n profile = "worker"\n install_mode = "library"\n }\n ]\n }\n]\n\n# Export for provisioning system\n{\n settings = k8s_settings\n servers = production_servers\n}\n```\n\n## Step 6: Validate Configuration\n\n```\n# Validate KCL configuration\nkcl run servers.k\n\n# Validate workspace\n../../../provisioning/core/cli/module-loader validate .\n\n# Check workspace info\n../../../provisioning/tools/workspace-init.nu . info\n```\n\n## Step 7: Configure Provider Credentials\n\n```\n# Create provider configuration directory\nmkdir -p defs\n\n# Create UpCloud provider defaults (example)\ncat > defs/upcloud_defaults.k << 'EOF'\n# UpCloud Provider Defaults\nimport provisioning.defaults as defaults\n\nupcloud_defaults: defaults.ServerDefaults = {\n lock = False\n time_zone = "UTC"\n running_wait = 15\n running_timeout = 300\n\n # UpCloud specific settings\n storage_os_find = "name: debian-12 | arch: x86_64"\n\n # Network settings\n network_utility_ipv4 = True\n network_public_ipv4 = True\n\n # SSH settings\n ssh_key_path = "~/.ssh/id_rsa.pub"\n user = "admin"\n user_ssh_port = 22\n fix_local_hosts = True\n\n # UpCloud plan specifications\n labels = "provider: upcloud"\n}\n\nupcloud_defaults\nEOF\n```\n\n## Step 8: Deploy Infrastructure\n\n```\n# Create servers with check mode first\n../../../provisioning/core/cli/provisioning server create --infra . --check\n\n# If validation passes, deploy for real\n../../../provisioning/core/cli/provisioning server create --infra .\n\n# Monitor server creation\n../../../provisioning/core/cli/provisioning server list --infra .\n```\n\n## Step 9: Install Taskservs\n\n```\n# Install containerd on all nodes\n../../../provisioning/core/cli/provisioning taskserv create containerd --infra .\n\n# Install Kubernetes (this will set up master and join workers)\n../../../provisioning/core/cli/provisioning taskserv create kubernetes --infra .\n\n# Install Cilium CNI\n../../../provisioning/core/cli/provisioning taskserv create cilium --infra .\n```\n\n## Step 10: Verify Cluster\n\n```\n# SSH to master node and verify cluster\n../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra .\n\n# On the master node:\nkubectl get nodes\nkubectl get pods -A\nkubectl get services -A\n\n# Test Cilium connectivity\ncilium status\ncilium connectivity test\n```\n\n## Step 11: Deploy Sample Application\n\nCreate a test deployment to verify the cluster:\n\n```\n# Create namespace\nkubectl create namespace test-app\n\n# Deploy nginx\nkubectl create deployment nginx --image=nginx:latest -n test-app\nkubectl expose deployment nginx --port=80 --type=ClusterIP -n test-app\n\n# Verify deployment\nkubectl get pods -n test-app\nkubectl get services -n test-app\n```\n\n## Step 12: Cluster Management\n\n```\n# Add monitoring (example)\n../../../provisioning/core/cli/module-loader load taskservs . 
[prometheus, grafana]\n\n# Regenerate configuration\n../../../provisioning/core/cli/module-loader list taskservs .\n\n# Deploy monitoring stack\n../../../provisioning/core/cli/provisioning taskserv create prometheus --infra .\n../../../provisioning/core/cli/provisioning taskserv create grafana --infra .\n```\n\n## Step 13: Backup and Documentation\n\n```\n# Create cluster documentation\ncat > cluster-info.md << 'EOF'\n# Production Kubernetes Cluster\n\n## Cluster Details\n- **Name**: production-k8s\n- **Nodes**: 3 (1 master, 2 workers)\n- **CNI**: Cilium\n- **Runtime**: Containerd\n- **Provider**: UpCloud\n\n## Node Information\n- k8s-master-01: Control plane node\n- k8s-worker-01: Worker node\n- k8s-worker-02: Worker node\n\n## Loaded Modules\n- kubernetes (master/worker profiles)\n- cilium (cluster networking)\n- containerd (container runtime)\n- upcloud (cloud provider)\n\n## Management Commands\n```\n# SSH to master\n../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra .\n\n# Update cluster\n../../../provisioning/core/cli/provisioning taskserv generate kubernetes --infra .\n```\n\nEOF\n\n# Backup workspace\n\ncp -r . ../production-k8s-backup-$(date +%Y%m%d)\n\n# Commit to version control\n\ngit add .\ngit commit -m "Initial Kubernetes cluster deployment with package system"\n\n```\n\n## Troubleshooting\n\n### Module Loading Issues\n```\n# If modules don't load properly\n../../../provisioning/core/cli/module-loader discover taskservs\n../../../provisioning/core/cli/module-loader load taskservs . [kubernetes, cilium, containerd] --force\n\n# Check generated imports\ncat taskservs.k\n```\n\n### KCL Compilation Issues\n\n```\n# Check for syntax errors\nkcl check servers.k\n\n# Validate specific schemas\nkcl run --dry-run servers.k\n```\n\n### Provider Authentication Issues\n\n```\n# Check provider configuration\ncat .providers/upcloud/provision_upcloud.k\n\n# Verify credentials\n../../../provisioning/core/cli/provisioning server price --provider upcloud\n```\n\n### Kubernetes Setup Issues\n\n```\n# Check taskserv logs\ntail -f tmp/k8s-deployment/kubernetes-*.log\n\n# Verify SSH connectivity\n../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra . --command "systemctl status kubelet"\n```\n\n## Next Steps\n\n1. **Scale the cluster**: Add more worker nodes\n2. **Add storage**: Load and configure storage taskservs (rook-ceph, mayastor)\n3. **Setup monitoring**: Deploy Prometheus/Grafana stack\n4. **Configure ingress**: Set up ingress controllers\n5. **Implement GitOps**: Configure ArgoCD or Flux\n\nThis example demonstrates the complete workflow from workspace creation to production Kubernetes cluster deployment using the new package-based system. +# Complete Workflow Example: Kubernetes Cluster with Package System + +This example demonstrates the complete workflow using the new KCL package and module loader system to deploy a production Kubernetes cluster. + +## Scenario + +Deploy a 3-node Kubernetes cluster with: + +- 1 master node +- 2 worker nodes +- Cilium CNI +- Containerd runtime +- UpCloud provider +- Production-ready configuration + +## Prerequisites + +1. Core provisioning package installed +2. UpCloud credentials configured +3. 
SSH keys set up + +## Step 1: Environment Setup + +```bash +# Ensure core package is installed +cd /Users/Akasha/project-provisioning +./provisioning/tools/kcl-packager.nu build --version 1.0.0 +./provisioning/tools/kcl-packager.nu install dist/provisioning-1.0.0.tar.gz + +# Verify installation +kcl list packages | grep provisioning +``` + +## Step 2: Create Workspace + +```bash +# Create new workspace from template +mkdir -p workspace/infra/production-k8s +cd workspace/infra/production-k8s + +# Initialize workspace structure +../../../provisioning/tools/workspace-init.nu . init + +# Verify structure +tree -a . +``` + +Expected output: + +```bash +. +├── kcl.mod +├── servers.k +├── README.md +├── .gitignore +├── .taskservs/ +├── .providers/ +├── .clusters/ +├── .manifest/ +├── data/ +├── tmp/ +├── resources/ +└── clusters/ +``` + +## Step 3: Discover Available Modules + +```bash +# Discover available taskservs +../../../provisioning/core/cli/module-loader discover taskservs + +# Search for Kubernetes-related modules +../../../provisioning/core/cli/module-loader discover taskservs kubernetes + +# Discover providers +../../../provisioning/core/cli/module-loader discover providers + +# Check output formats +../../../provisioning/core/cli/module-loader discover taskservs --format json +``` + +## Step 4: Load Required Modules + +```bash +# Load Kubernetes stack taskservs +../../../provisioning/core/cli/module-loader load taskservs . [kubernetes, cilium, containerd] + +# Load UpCloud provider +../../../provisioning/core/cli/module-loader load providers . [upcloud] + +# Verify loading +../../../provisioning/core/cli/module-loader list taskservs . +../../../provisioning/core/cli/module-loader list providers . +``` + +Check generated files: + +```bash +# Check auto-generated imports +cat taskservs.k +cat providers.k + +# Check manifest +cat .manifest/taskservs.yaml +cat .manifest/providers.yaml +``` + +## Step 5: Configure Infrastructure + +Edit `servers.k` to configure the Kubernetes cluster: + +```yaml +# Production Kubernetes Cluster Configuration +import provisioning.settings as settings +import provisioning.server as server +import provisioning.defaults as defaults + +# Import loaded modules (auto-generated) +import .taskservs.kubernetes.kubernetes as k8s +import .taskservs.cilium.cilium as cilium +import .taskservs.containerd.containerd as containerd +import .providers.upcloud as upcloud + +# Cluster settings +k8s_settings: settings.Settings = { + main_name = "production-k8s" + main_title = "Production Kubernetes Cluster" + + # Configure paths + settings_path = "./data/settings.yaml" + defaults_provs_dirpath = "./defs" + prov_data_dirpath = "./data" + created_taskservs_dirpath = "./tmp/k8s-deployment" + prov_resources_path = "./resources" + created_clusters_dirpath = "./tmp/k8s-clusters" + prov_clusters_path = "./clusters" + + # Kubernetes cluster settings + cluster_admin_host = "" # Set by provider (first master) + cluster_admin_port = 22 + cluster_admin_user = "admin" + servers_wait_started = 60 # K8s nodes need more time + + runset = { + wait = True + output_format = "human" + output_path = "tmp/k8s-deployment" + inventory_file = "./k8s-inventory.yaml" + use_time = True + } + + # Secrets configuration + secrets = { + provider = "sops" + sops_config = { + age_key_file = "~/.age/keys.txt" + use_age = True + } + } +} + +# Production Kubernetes cluster servers +production_servers: [server.Server] = [ + # Control plane node + { + hostname = "k8s-master-01" + title = "Kubernetes Master Node 01" 
+ + # Production specifications + time_zone = "UTC" + running_wait = 20 + running_timeout = 400 + storage_os_find = "name: debian-12 | arch: x86_64" + + # Network configuration + network_utility_ipv4 = True + network_public_ipv4 = True + priv_cidr_block = "10.0.0.0/24" + + # User settings + user = "admin" + user_ssh_port = 22 + fix_local_hosts = True + labels = "env: production, role: control-plane, tier: master" + + # Taskservs configuration + taskservs = [ + { + name = "containerd" + profile = "production" + install_mode = "library" + }, + { + name = "kubernetes" + profile = "master" + install_mode = "library-server" + }, + { + name = "cilium" + profile = "master" + install_mode = "library" + } + ] + }, + + # Worker nodes + { + hostname = "k8s-worker-01" + title = "Kubernetes Worker Node 01" + + time_zone = "UTC" + running_wait = 20 + running_timeout = 400 + storage_os_find = "name: debian-12 | arch: x86_64" + + network_utility_ipv4 = True + network_public_ipv4 = True + priv_cidr_block = "10.0.0.0/24" + + user = "admin" + user_ssh_port = 22 + fix_local_hosts = True + labels = "env: production, role: worker, tier: compute" + + taskservs = [ + { + name = "containerd" + profile = "production" + install_mode = "library" + }, + { + name = "kubernetes" + profile = "worker" + install_mode = "library" + }, + { + name = "cilium" + profile = "worker" + install_mode = "library" + } + ] + }, + + { + hostname = "k8s-worker-02" + title = "Kubernetes Worker Node 02" + + time_zone = "UTC" + running_wait = 20 + running_timeout = 400 + storage_os_find = "name: debian-12 | arch: x86_64" + + network_utility_ipv4 = True + network_public_ipv4 = True + priv_cidr_block = "10.0.0.0/24" + + user = "admin" + user_ssh_port = 22 + fix_local_hosts = True + labels = "env: production, role: worker, tier: compute" + + taskservs = [ + { + name = "containerd" + profile = "production" + install_mode = "library" + }, + { + name = "kubernetes" + profile = "worker" + install_mode = "library" + }, + { + name = "cilium" + profile = "worker" + install_mode = "library" + } + ] + } +] + +# Export for provisioning system +{ + settings = k8s_settings + servers = production_servers +} +``` + +## Step 6: Validate Configuration + +```toml +# Validate KCL configuration +kcl run servers.k + +# Validate workspace +../../../provisioning/core/cli/module-loader validate . + +# Check workspace info +../../../provisioning/tools/workspace-init.nu . info +``` + +## Step 7: Configure Provider Credentials + +```toml +# Create provider configuration directory +mkdir -p defs + +# Create UpCloud provider defaults (example) +cat > defs/upcloud_defaults.k << 'EOF' +# UpCloud Provider Defaults +import provisioning.defaults as defaults + +upcloud_defaults: defaults.ServerDefaults = { + lock = False + time_zone = "UTC" + running_wait = 15 + running_timeout = 300 + + # UpCloud specific settings + storage_os_find = "name: debian-12 | arch: x86_64" + + # Network settings + network_utility_ipv4 = True + network_public_ipv4 = True + + # SSH settings + ssh_key_path = "~/.ssh/id_rsa.pub" + user = "admin" + user_ssh_port = 22 + fix_local_hosts = True + + # UpCloud plan specifications + labels = "provider: upcloud" +} + +upcloud_defaults +EOF +``` + +## Step 8: Deploy Infrastructure + +```bash +# Create servers with check mode first +../../../provisioning/core/cli/provisioning server create --infra . --check + +# If validation passes, deploy for real +../../../provisioning/core/cli/provisioning server create --infra . 
+
+# Monitor server creation
+../../../provisioning/core/cli/provisioning server list --infra .
+```
+
+## Step 9: Install Taskservs
+
+```bash
+# Install containerd on all nodes
+../../../provisioning/core/cli/provisioning taskserv create containerd --infra .
+
+# Install Kubernetes (this will set up master and join workers)
+../../../provisioning/core/cli/provisioning taskserv create kubernetes --infra .
+
+# Install Cilium CNI
+../../../provisioning/core/cli/provisioning taskserv create cilium --infra .
+```
+
+## Step 10: Verify Cluster
+
+```bash
+# SSH to master node and verify cluster
+../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra .
+
+# On the master node:
+kubectl get nodes
+kubectl get pods -A
+kubectl get services -A
+
+# Test Cilium connectivity
+cilium status
+cilium connectivity test
+```
+
+## Step 11: Deploy Sample Application
+
+Create a test deployment to verify the cluster:
+
+```bash
+# Create namespace
+kubectl create namespace test-app
+
+# Deploy nginx
+kubectl create deployment nginx --image=nginx:latest -n test-app
+kubectl expose deployment nginx --port=80 --type=ClusterIP -n test-app
+
+# Verify deployment
+kubectl get pods -n test-app
+kubectl get services -n test-app
+```
+
+## Step 12: Cluster Management
+
+```bash
+# Add monitoring (example)
+../../../provisioning/core/cli/module-loader load taskservs . [prometheus, grafana]
+
+# Regenerate configuration
+../../../provisioning/core/cli/module-loader list taskservs .
+
+# Deploy monitoring stack
+../../../provisioning/core/cli/provisioning taskserv create prometheus --infra .
+../../../provisioning/core/cli/provisioning taskserv create grafana --infra .
+```
+
+## Step 13: Backup and Documentation
+
+````bash
+# Create cluster documentation
+cat > cluster-info.md << 'EOF'
+# Production Kubernetes Cluster
+
+## Cluster Details
+- **Name**: production-k8s
+- **Nodes**: 3 (1 master, 2 workers)
+- **CNI**: Cilium
+- **Runtime**: Containerd
+- **Provider**: UpCloud
+
+## Node Information
+- k8s-master-01: Control plane node
+- k8s-worker-01: Worker node
+- k8s-worker-02: Worker node
+
+## Loaded Modules
+- kubernetes (master/worker profiles)
+- cilium (cluster networking)
+- containerd (container runtime)
+- upcloud (cloud provider)
+
+## Management Commands
+```
+# SSH to master
+../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra .
+
+# Update cluster
+../../../provisioning/core/cli/provisioning taskserv generate kubernetes --infra .
+```
+EOF
+
+# Backup workspace
+cp -r . ../production-k8s-backup-$(date +%Y%m%d)
+
+# Commit to version control
+git add .
+git commit -m "Initial Kubernetes cluster deployment with package system"
+````
+
+## Troubleshooting
+
+### Module Loading Issues
+
+```bash
+# If modules don't load properly
+../../../provisioning/core/cli/module-loader discover taskservs
+../../../provisioning/core/cli/module-loader load taskservs .
[kubernetes, cilium, containerd] --force + +# Check generated imports +cat taskservs.k +``` + +### KCL Compilation Issues + +```bash +# Check for syntax errors +kcl check servers.k + +# Validate specific schemas +kcl run --dry-run servers.k +``` + +### Provider Authentication Issues + +```bash +# Check provider configuration +cat .providers/upcloud/provision_upcloud.k + +# Verify credentials +../../../provisioning/core/cli/provisioning server price --provider upcloud +``` + +### Kubernetes Setup Issues + +```yaml +# Check taskserv logs +tail -f tmp/k8s-deployment/kubernetes-*.log + +# Verify SSH connectivity +../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra . --command "systemctl status kubelet" +``` + +## Next Steps + +1. **Scale the cluster**: Add more worker nodes +2. **Add storage**: Load and configure storage taskservs (rook-ceph, mayastor) +3. **Setup monitoring**: Deploy Prometheus/Grafana stack +4. **Configure ingress**: Set up ingress controllers +5. **Implement GitOps**: Configure ArgoCD or Flux + +This example demonstrates the complete workflow from workspace creation to production Kubernetes cluster deployment using the new package-based system. \ No newline at end of file diff --git a/examples/workspaces/cost-optimized/README.md b/examples/workspaces/cost-optimized/README.md index 6b116c0..41426b5 100644 --- a/examples/workspaces/cost-optimized/README.md +++ b/examples/workspaces/cost-optimized/README.md @@ -1 +1,540 @@ -# Cost-Optimized Multi-Provider Workspace\n\nThis workspace demonstrates cost optimization through intelligent provider specialization:\n\n- **Hetzner**: Compute tier (CPX21 servers at €20.90/month) - best price/performance\n- **AWS**: Managed services (RDS, ElastiCache, SQS) - reliability without ops overhead\n- **DigitalOcean**: CDN and object storage - affordable content delivery\n\n## Why This Architecture?\n\n### Cost Comparison\n\n```\nCost-Optimized Architecture:\n├── Hetzner compute: €72.70/month (~$78)\n├── AWS managed services: $115/month\n└── DigitalOcean CDN: $64/month\nTotal: ~$280/month\n\nAll-AWS Equivalent:\n├── EC2 instances: ~$200+\n├── RDS database: ~$150+\n├── ElastiCache: ~$50+\n├── CloudFront CDN: ~$100+\n└── Other services: ~$50+\nTotal: ~$600+/month\n\nSavings: ~$320/month (53% reduction)\n```\n\n### Architecture Benefits\n\n**Hetzner Advantages**:\n- Best price/performance for compute (€20.90/month for 4 vCPU/8GB)\n- Powerful Load Balancer (€10/month)\n- Fast networking (10Gbps)\n- EU data residency (GDPR compliant)\n\n**AWS Advantages**:\n- Managed RDS: Automatic backups, failover, patching\n- ElastiCache: Redis cluster with automatic failover\n- SQS: Scalable message queue (pay per message)\n- CloudWatch: Comprehensive monitoring\n\n**DigitalOcean Advantages**:\n- CDN: Cost-effective content delivery ($25/month)\n- Spaces: Object storage at scale ($15/month)\n- Simple pricing and management\n- Edge nodes for regional distribution\n\n## Architecture Overview\n\n```\n┌────────────────────────────────────────────────┐\n│ Client Requests │\n└─────────────────┬────────────────────────────────┘\n │ HTTPS/HTTP\n ┌────────▼─────────┐\n │ DigitalOcean │\n │ CDN / Spaces │\n └────────┬─────────┘\n │\n ┌────────────┼────────────┐\n │ │ │\n┌────▼──────┐ ┌──▼────────┐ ┌─▼──────┐\n│ Hetzner │ │ AWS │ │ DO │\n│ Compute │ │ Managed │ │ CDN │\n│ (Load LB) │ │ Services │ │ │\n└────┬──────┘ └──┬────────┘ └────────┘\n │VPN Tunnel │\n┌────▼──────────▼────┐\n│ Hetzner Network │ AWS VPC DO Spaces\n│ 10.0.0.0/16 ◄──► 10.1.0.0/16 
◄──► nyc3\n│ 3x CPX21 Servers │ RDS + Cache CDN +\n│ │ + SQS Backups\n└────────────────────┘\n```\n\n## Prerequisites\n\n### 1. Cloud Accounts\n\n- **Hetzner**: Account with API token\n- **AWS**: Account with access keys\n- **DigitalOcean**: Account with API token\n\n### 2. Environment Variables\n\n```\nexport HCLOUD_TOKEN="MC4wNTI1YmE1M2E4YmE0YTQzMTQyZTdlODYy"\nexport AWS_ACCESS_KEY_ID="AKIA1234567890ABCDEF"\nexport AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG+j/zI0m1234567890ab"\nexport DIGITALOCEAN_TOKEN="dop_v1_abc123def456ghi789jkl012mno"\n```\n\n### 3. CLI Tools\n\n```\n# Install and verify\nwhich hcloud && hcloud version\nwhich aws && aws --version\nwhich doctl && doctl version\nwhich nickel && nickel --version\n```\n\n### 4. SSH Keys\n\n```\n# Hetzner\nhcloud ssh-key create --name provisioning-key \n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# AWS\naws ec2 create-key-pair --key-name provisioning-key \n --query 'KeyMaterial' --output text > provisioning-key.pem\nchmod 600 provisioning-key.pem\n\n# DigitalOcean\ndoctl compute ssh-key create provisioning-key \n --public-key-from-file ~/.ssh/id_rsa.pub\n```\n\n## Deployment\n\n### Step 1: Configure the Workspace\n\nEdit `workspace.ncl`:\n\n```\n# Update networking if needed\ncompute_tier.primary_servers = hetzner.Server & {\n server_type = "cpx21",\n count = 3,\n location = "nbg1"\n}\n\n# Update AWS region if needed\nmanaged_services.database = aws.RDS & {\n instance_class = "db.t3.small",\n region = "us-east-1"\n}\n\n# Update CDN endpoints\ncdn_tier.cdn.endpoints = [{\n name = "app-cdn",\n origin = "content.example.com"\n}]\n```\n\nEdit `config.toml`:\n\n```\n[cost_tracking]\nmonthly_budget = 300\nbudget_alert_threshold = 280\n\n[application.cache]\nmax_memory = "250MB"\n```\n\n### Step 2: Validate Configuration\n\n```\n# Validate Nickel syntax\nnickel export workspace.ncl | jq . > /dev/null\n\n# Verify provider access\nhcloud context use default\naws sts get-caller-identity\ndoctl account get\n```\n\n### Step 3: Deploy\n\n```\nchmod +x deploy.nu\n./deploy.nu\n\n# Or with debug output\n./deploy.nu --debug\n```\n\n### Step 4: Verify Deployment\n\n```\n# Hetzner compute resources\nhcloud server list\nhcloud load-balancer list\n\n# AWS managed services\naws rds describe-db-instances --region us-east-1\naws elasticache describe-cache-clusters --region us-east-1\naws sqs list-queues --region us-east-1\n\n# DigitalOcean CDN\ndoctl compute cdn list\ndoctl compute spaces list\n```\n\n## Post-Deployment Configuration\n\n### 1. Connect Hetzner Compute to AWS Database\n\n```\n# Get Hetzner server IPs\nhcloud server list --format ID,PublicIPv4\n\n# Get RDS endpoint\naws rds describe-db-instances --region us-east-1 \n --query 'DBInstances[0].Endpoint.Address'\n\n# On Hetzner server, install PostgreSQL client\nssh root@hetzner-server\napt-get update && apt-get install postgresql-client\n\n# Test connection to RDS\npsql -h app-db.abc123.us-east-1.rds.amazonaws.com \n -U admin -d postgres -c "SELECT now();"\n```\n\n### 2. 
Configure Application for Services\n\n```\n# Application configuration file\ncat > /var/www/app/.env << EOF\nDATABASE_HOST=app-db.abc123.us-east-1.rds.amazonaws.com\nDATABASE_PORT=5432\nDATABASE_USER=admin\nDATABASE_PASSWORD=your_password\nDATABASE_NAME=app_db\n\nREDIS_HOST=app-cache.abc123.ng.0001.euc1.cache.amazonaws.com\nREDIS_PORT=6379\n\nSQS_QUEUE_URL=https://sqs.us-east-1.amazonaws.com/123456789/app-queue\n\nCDN_ENDPOINT=https://content.example.com\nSPACES_ENDPOINT=https://app-content.nyc3.digitaloceanspaces.com\nSPACES_KEY=your_spaces_key\nSPACES_SECRET=your_spaces_secret\n\nENVIRONMENT=production\nEOF\n```\n\n### 3. Setup CDN and Object Storage\n\n```\n# Configure Spaces bucket\ndoctl compute spaces create app-content --region nyc3\n\n# Get Spaces endpoint\ndoctl compute spaces list\n\n# Configure CDN endpoint\ndoctl compute cdn create --origin content.example.com\n\n# Upload test file\naws s3 cp test.html s3://app-content/\n```\n\n### 4. Configure Application Queue\n\n```\n# Get SQS queue URL\naws sqs list-queues --region us-east-1\n\n# Create queue if needed\naws sqs create-queue --queue-name app-queue --region us-east-1\n\n# Test queue\naws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue \n --message-body "test message" --region us-east-1\n```\n\n### 5. Deploy Application\n\nSSH to Hetzner servers:\n\n```\n# Get server IPs\nSERVERS=$(hcloud server list --format PublicIPv4 --no-header)\n\n# Deploy to each server\nfor server in $SERVERS; do\n ssh -o StrictHostKeyChecking=no root@$server << 'DEPLOY'\n cd /var/www\n git clone https://github.com/your-org/app.git\n cd app\n cp .env.example .env\n ./deploy.sh\n DEPLOY\ndone\n```\n\n## Monitoring and Cost Control\n\n### Cost Monitoring\n\n```\n# Hetzner billing\n# Manual via console: https://console.hetzner.cloud/billing\n\n# AWS cost tracking\naws ce get-cost-and-usage \n --time-period Start=2024-01-01,End=2024-01-31 \n --granularity MONTHLY \n --metrics BlendedCost \n --group-by Type=DIMENSION,Key=SERVICE\n\n# DigitalOcean billing\ndoctl billing get\n\n# Real-time cost status\naws ce get-cost-and-usage \n --time-period Start=$(date -d '1 day ago' +%Y-%m-%d),End=$(date +%Y-%m-%d) \n --granularity DAILY \n --metrics BlendedCost\n```\n\n### Application Performance Monitoring\n\n```\n# RDS performance insights\naws pi describe-dimension-keys \n --service-type RDS \n --identifier arn:aws:rds:us-east-1:123456789:db:app-db \n --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \n --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \n --period-in-seconds 60 \n --metric db.load.avg \n --partition-by Dimension \n --dimension-group.group-by WAIT_EVENT\n\n# ElastiCache monitoring\naws cloudwatch get-metric-statistics \n --namespace AWS/ElastiCache \n --metric-name CPUUtilization \n --dimensions Name=CacheClusterId,Value=app-cache \n --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \n --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \n --period 300 \n --statistics Average\n\n# SQS monitoring\naws sqs get-queue-attributes \n --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue \n --attribute-names All\n```\n\n### Alerts Configuration\n\n```\n# CPU threshold alert\naws cloudwatch put-metric-alarm \n --alarm-name hetzner-cpu-high \n --alarm-description "Alert when Hetzner CPU > 80%" \n --metric-name CPUUtilization \n --threshold 80 \n --comparison-operator GreaterThanThreshold\n\n# Queue depth alert\naws cloudwatch put-metric-alarm \n --alarm-name sqs-queue-depth-high \n --alarm-description "Alert 
when SQS queue depth > 1000" \n --metric-name ApproximateNumberOfMessagesVisible \n --threshold 1000 \n --comparison-operator GreaterThanThreshold\n\n# Cache eviction alert\naws cloudwatch put-metric-alarm \n --alarm-name elasticache-eviction-rate-high \n --alarm-description "Alert when cache eviction rate > 10%" \n --metric-name EvictionRate \n --namespace AWS/ElastiCache \n --threshold 10 \n --comparison-operator GreaterThanThreshold\n```\n\n## Scaling and Optimization\n\n### Scale Hetzner Compute\n\nEdit `workspace.ncl`:\n\n```\ncompute_tier.primary_servers = hetzner.Server & {\n count = 5, # Increase from 3\n server_type = "cpx21"\n}\n```\n\nRedeploy:\n\n```\n./deploy.nu\n```\n\n### Upgrade Database\n\n```\n# Modify RDS instance class\naws rds modify-db-instance \n --db-instance-identifier app-db \n --db-instance-class db.t3.medium \n --apply-immediately \n --region us-east-1\n```\n\n### Add Caching Layer\n\nAlready configured with ElastiCache. Optimize by adjusting:\n\n```\n[application.cache]\nmax_memory = "512MB"\neviction_policy = "allkeys-lru"\n```\n\n### Increase Queue Throughput\n\nSQS automatically scales. Monitor with:\n\n```\naws sqs get-queue-attributes \n --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue \n --attribute-names ApproximateNumberOfMessages\n```\n\n## Cost Optimization Tips\n\n1. **Hetzner Compute**: CPX21 is sweet spot. Consider CX21 for lower workloads\n2. **AWS RDS**: Use t3.small for dev, t3.medium for prod with burst capability\n3. **ElastiCache**: 2 nodes with auto-failover. Monitor eviction rates\n4. **SQS**: Pay per request, no fixed costs. Good for variable load\n5. **DigitalOcean CDN**: Cache more aggressively (86400s TTL for assets)\n6. **Spaces**: Use lifecycle rules to delete old files automatically\n\n### Cost Reduction Checklist\n\n- Reduce Hetzner servers from 3 to 2 (saves ~€21/month)\n- Downgrade RDS to db.t3.micro for dev (saves ~$40/month)\n- Reduce ElastiCache nodes from 2 to 1 (saves ~$12/month)\n- Archive old CDN content (savings from Spaces storage)\n- Use reserved capacity on AWS (20-30% discount)\n\nPotential total savings: ~$100+/month with right-sizing.\n\n## Troubleshooting\n\n### Issue: Hetzner Can't Connect to RDS\n\n**Diagnosis**:\n```\n# SSH to Hetzner server\nssh root@hetzner-server\n\n# Test connectivity\nnc -zv app-db.abc123.us-east-1.rds.amazonaws.com 5432\n```\n\n**Solution**:\n- Check VPN tunnel is active\n- Verify RDS security group allows port 5432 from Hetzner network\n- Check route table on both sides\n\n### Issue: High Database Latency\n\n**Diagnosis**:\n```\n# Check RDS performance\naws pi describe-dimension-keys --service-type RDS ...\n\n# Check network latency\nping -c 5 app-db.abc123.us-east-1.rds.amazonaws.com\n```\n\n**Solution**:\n- Upgrade RDS instance class\n- Increase ElastiCache size to reduce database queries\n- Check network bandwidth between providers\n\n### Issue: Queue Processing Slow\n\n**Diagnosis**:\n```\n# Check queue depth and age\naws sqs get-queue-attributes \n --queue-url \n --attribute-names All\n```\n\n**Solution**:\n- Scale up application servers processing queue\n- Reduce visibility timeout if messages are timing out\n- Check application logs for processing errors\n\n## Cleanup\n\n```\n# Hetzner\nhcloud server delete hetzner-app-1 hetzner-app-2 hetzner-app-3\nhcloud load-balancer delete app-lb\n\n# AWS\naws rds delete-db-instance --db-instance-identifier app-db --skip-final-snapshot\naws elasticache delete-cache-cluster --cache-cluster-id app-cache\naws sqs delete-queue 
--queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue\n\n# DigitalOcean\ndoctl compute spaces delete app-content\ndoctl compute cdn delete cdn-app\ndoctl compute droplet delete edge-node-1 edge-node-2 edge-node-3\n```\n\n## Next Steps\n\n1. Implement application logging to CloudWatch\n2. Set up Hetzner monitoring dashboard\n3. Configure auto-scaling based on queue depth\n4. Implement database read replicas for read-heavy workloads\n5. Add WAF protection to Hetzner load balancer\n6. Implement cross-region backups to Spaces\n7. Set up cost anomaly detection alerts\n\n## Support\n\nFor issues or questions:\n\n- Review the cost-optimized deployment guide\n- Check provider-specific documentation\n- Monitor costs with: `aws ce get-cost-and-usage ...`\n- Review deployment logs: `./deploy.nu --debug`\n\n## Files\n\n- `workspace.ncl`: Infrastructure definition (Nickel)\n- `config.toml`: Provider credentials and settings\n- `deploy.nu`: Deployment orchestration (Nushell)\n- `README.md`: This file \ No newline at end of file +# Cost-Optimized Multi-Provider Workspace + +This workspace demonstrates cost optimization through intelligent provider specialization: + +- **Hetzner**: Compute tier (CPX21 servers at €20.90/month) - best price/performance +- **AWS**: Managed services (RDS, ElastiCache, SQS) - reliability without ops overhead +- **DigitalOcean**: CDN and object storage - affordable content delivery + +## Why This Architecture? + +### Cost Comparison + +```bash +Cost-Optimized Architecture: +├── Hetzner compute: €72.70/month (~$78) +├── AWS managed services: $115/month +└── DigitalOcean CDN: $64/month +Total: ~$280/month + +All-AWS Equivalent: +├── EC2 instances: ~$200+ +├── RDS database: ~$150+ +├── ElastiCache: ~$50+ +├── CloudFront CDN: ~$100+ +└── Other services: ~$50+ +Total: ~$600+/month + +Savings: ~$320/month (53% reduction) +``` + +### Architecture Benefits + +**Hetzner Advantages**: +- Best price/performance for compute (€20.90/month for 4 vCPU/8GB) +- Powerful Load Balancer (€10/month) +- Fast networking (10Gbps) +- EU data residency (GDPR compliant) + +**AWS Advantages**: +- Managed RDS: Automatic backups, failover, patching +- ElastiCache: Redis cluster with automatic failover +- SQS: Scalable message queue (pay per message) +- CloudWatch: Comprehensive monitoring + +**DigitalOcean Advantages**: +- CDN: Cost-effective content delivery ($25/month) +- Spaces: Object storage at scale ($15/month) +- Simple pricing and management +- Edge nodes for regional distribution + +## Architecture Overview + +```bash +┌────────────────────────────────────────────────┐ +│ Client Requests │ +└─────────────────┬────────────────────────────────┘ + │ HTTPS/HTTP + ┌────────▼─────────┐ + │ DigitalOcean │ + │ CDN / Spaces │ + └────────┬─────────┘ + │ + ┌────────────┼────────────┐ + │ │ │ +┌────▼──────┐ ┌──▼────────┐ ┌─▼──────┐ +│ Hetzner │ │ AWS │ │ DO │ +│ Compute │ │ Managed │ │ CDN │ +│ (Load LB) │ │ Services │ │ │ +└────┬──────┘ └──┬────────┘ └────────┘ + │VPN Tunnel │ +┌────▼──────────▼────┐ +│ Hetzner Network │ AWS VPC DO Spaces +│ 10.0.0.0/16 ◄──► 10.1.0.0/16 ◄──► nyc3 +│ 3x CPX21 Servers │ RDS + Cache CDN + +│ │ + SQS Backups +└────────────────────┘ +``` + +## Prerequisites + +### 1. Cloud Accounts + +- **Hetzner**: Account with API token +- **AWS**: Account with access keys +- **DigitalOcean**: Account with API token + +### 2. 
Environment Variables
+
+```bash
+export HCLOUD_TOKEN="MC4wNTI1YmE1M2E4YmE0YTQzMTQyZTdlODYy"
+export AWS_ACCESS_KEY_ID="AKIA1234567890ABCDEF"
+export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG+j/zI0m1234567890ab"
+export DIGITALOCEAN_TOKEN="dop_v1_abc123def456ghi789jkl012mno"
+```
+
+### 3. CLI Tools
+
+```bash
+# Install and verify
+which hcloud && hcloud version
+which aws && aws --version
+which doctl && doctl version
+which nickel && nickel --version
+```
+
+### 4. SSH Keys
+
+```bash
+# Hetzner
+hcloud ssh-key create --name provisioning-key \
+  --public-key-from-file ~/.ssh/id_rsa.pub
+
+# AWS
+aws ec2 create-key-pair --key-name provisioning-key \
+  --query 'KeyMaterial' --output text > provisioning-key.pem
+chmod 600 provisioning-key.pem
+
+# DigitalOcean
+doctl compute ssh-key create provisioning-key \
+  --public-key-from-file ~/.ssh/id_rsa.pub
+```
+
+## Deployment
+
+### Step 1: Configure the Workspace
+
+Edit `workspace.ncl`:
+
+```nickel
+# Update networking if needed
+compute_tier.primary_servers = hetzner.Server & {
+  server_type = "cpx21",
+  count = 3,
+  location = "nbg1"
+}
+
+# Update AWS region if needed
+managed_services.database = aws.RDS & {
+  instance_class = "db.t3.small",
+  region = "us-east-1"
+}
+
+# Update CDN endpoints
+cdn_tier.cdn.endpoints = [{
+  name = "app-cdn",
+  origin = "content.example.com"
+}]
+```
+
+Edit `config.toml`:
+
+```toml
+[cost_tracking]
+monthly_budget = 300
+budget_alert_threshold = 280
+
+[application.cache]
+max_memory = "250MB"
+```
+
+### Step 2: Validate Configuration
+
+```bash
+# Validate Nickel syntax
+nickel export workspace.ncl | jq . > /dev/null
+
+# Verify provider access
+hcloud context use default
+aws sts get-caller-identity
+doctl account get
+```
+
+### Step 3: Deploy
+
+```bash
+chmod +x deploy.nu
+./deploy.nu
+
+# Or with debug output
+./deploy.nu --debug
+```
+
+### Step 4: Verify Deployment
+
+```bash
+# Hetzner compute resources
+hcloud server list
+hcloud load-balancer list
+
+# AWS managed services
+aws rds describe-db-instances --region us-east-1
+aws elasticache describe-cache-clusters --region us-east-1
+aws sqs list-queues --region us-east-1
+
+# DigitalOcean CDN
+doctl compute cdn list
+doctl compute spaces list
+```
+
+## Post-Deployment Configuration
+
+### 1. Connect Hetzner Compute to AWS Database
+
+```bash
+# Get Hetzner server IPs
+hcloud server list --format ID,PublicIPv4
+
+# Get RDS endpoint
+aws rds describe-db-instances --region us-east-1 \
+  --query 'DBInstances[0].Endpoint.Address'
+
+# On Hetzner server, install PostgreSQL client
+ssh root@hetzner-server
+apt-get update && apt-get install postgresql-client
+
+# Test connection to RDS
+psql -h app-db.abc123.us-east-1.rds.amazonaws.com \
+  -U admin -d postgres -c "SELECT now();"
+```
+
+### 2. Configure Application for Services
+
+```bash
+# Application configuration file
+cat > /var/www/app/.env << EOF
+DATABASE_HOST=app-db.abc123.us-east-1.rds.amazonaws.com
+DATABASE_PORT=5432
+DATABASE_USER=admin
+DATABASE_PASSWORD=your_password
+DATABASE_NAME=app_db
+
+REDIS_HOST=app-cache.abc123.ng.0001.euc1.cache.amazonaws.com
+REDIS_PORT=6379
+
+SQS_QUEUE_URL=https://sqs.us-east-1.amazonaws.com/123456789/app-queue
+
+CDN_ENDPOINT=https://content.example.com
+SPACES_ENDPOINT=https://app-content.nyc3.digitaloceanspaces.com
+SPACES_KEY=your_spaces_key
+SPACES_SECRET=your_spaces_secret
+
+ENVIRONMENT=production
+EOF
+```
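+
+Before moving on, it is worth confirming that the endpoints written into `.env` are actually reachable from the compute tier. A minimal smoke-test sketch, using the example hostnames from this README (substitute your real endpoints):
+
+```bash
+#!/usr/bin/env bash
+# Hypothetical smoke test, run from a Hetzner server: check that each
+# service referenced in .env answers. Hostnames are this README's
+# placeholder values.
+set -u
+
+DB_HOST="app-db.abc123.us-east-1.rds.amazonaws.com"
+REDIS_HOST="app-cache.abc123.ng.0001.euc1.cache.amazonaws.com"
+QUEUE_URL="https://sqs.us-east-1.amazonaws.com/123456789/app-queue"
+
+nc -zv -w 5 "$DB_HOST" 5432 && echo "postgres: ok"
+nc -zv -w 5 "$REDIS_HOST" 6379 && echo "redis: ok"
+aws sqs get-queue-attributes --queue-url "$QUEUE_URL" \
+  --attribute-names ApproximateNumberOfMessages && echo "sqs: ok"
+```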
+### 3. Setup CDN and Object Storage
+
+```bash
+# Configure Spaces bucket
+doctl compute spaces create app-content --region nyc3
+
+# Get Spaces endpoint
+doctl compute spaces list
+
+# Configure CDN endpoint
+doctl compute cdn create --origin content.example.com
+
+# Upload test file
+aws s3 cp test.html s3://app-content/
+```
+
+### 4. Configure Application Queue
+
+```bash
+# Get SQS queue URL
+aws sqs list-queues --region us-east-1
+
+# Create queue if needed
+aws sqs create-queue --queue-name app-queue --region us-east-1
+
+# Test queue
+aws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue \
+  --message-body "test message" --region us-east-1
+```
+
+### 5. Deploy Application
+
+SSH to Hetzner servers:
+
+```bash
+# Get server IPs
+SERVERS=$(hcloud server list --format PublicIPv4 --no-header)
+
+# Deploy to each server (heredoc terminator must start the line)
+for server in $SERVERS; do
+  ssh -o StrictHostKeyChecking=no root@$server << 'DEPLOY'
+cd /var/www
+git clone https://github.com/your-org/app.git
+cd app
+cp .env.example .env
+./deploy.sh
+DEPLOY
+done
+```
+
+## Monitoring and Cost Control
+
+### Cost Monitoring
+
+```bash
+# Hetzner billing
+# Manual via console: https://console.hetzner.cloud/billing
+
+# AWS cost tracking
+aws ce get-cost-and-usage \
+  --time-period Start=2024-01-01,End=2024-01-31 \
+  --granularity MONTHLY \
+  --metrics BlendedCost \
+  --group-by Type=DIMENSION,Key=SERVICE
+
+# DigitalOcean billing
+doctl billing get
+
+# Real-time cost status
+aws ce get-cost-and-usage \
+  --time-period Start=$(date -d '1 day ago' +%Y-%m-%d),End=$(date +%Y-%m-%d) \
+  --granularity DAILY \
+  --metrics BlendedCost
+```
+
+### Application Performance Monitoring
+
+```bash
+# RDS performance insights
+aws pi describe-dimension-keys \
+  --service-type RDS \
+  --identifier arn:aws:rds:us-east-1:123456789:db:app-db \
+  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
+  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
+  --period-in-seconds 60 \
+  --metric db.load.avg \
+  --partition-by Dimension \
+  --dimension-group.group-by WAIT_EVENT
+
+# ElastiCache monitoring
+aws cloudwatch get-metric-statistics \
+  --namespace AWS/ElastiCache \
+  --metric-name CPUUtilization \
+  --dimensions Name=CacheClusterId,Value=app-cache \
+  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
+  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
+  --period 300 \
+  --statistics Average
+
+# SQS monitoring
+aws sqs get-queue-attributes \
+  --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue \
+  --attribute-names All
+```
+
+### Alerts Configuration
+
+```bash
+# CPU threshold alert
+aws cloudwatch put-metric-alarm \
+  --alarm-name hetzner-cpu-high \
+  --alarm-description "Alert when Hetzner CPU > 80%" \
+  --metric-name CPUUtilization \
+  --threshold 80 \
+  --comparison-operator GreaterThanThreshold
+
+# Queue depth alert
+aws cloudwatch put-metric-alarm \
+  --alarm-name sqs-queue-depth-high \
+  --alarm-description "Alert when SQS queue depth > 1000" \
+  --metric-name ApproximateNumberOfMessagesVisible \
+  --threshold 1000 \
+  --comparison-operator GreaterThanThreshold
+
+# Cache eviction alert
+aws cloudwatch put-metric-alarm \
+  --alarm-name elasticache-eviction-rate-high \
+  --alarm-description "Alert when cache eviction rate > 10%" \
+  --metric-name EvictionRate \
+  --namespace AWS/ElastiCache \
+  --threshold 10 \
+  --comparison-operator GreaterThanThreshold
+```
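+
+The `monthly_budget` value from `config.toml` can also be checked against actual spend. A minimal sketch for the AWS share of the bill, built on the `aws ce get-cost-and-usage` call shown above (the `--query` path follows standard Cost Explorer output; Hetzner and DigitalOcean are billed separately and must be checked on their own consoles):
+
+```bash
+#!/usr/bin/env bash
+# Hypothetical budget check: compare month-to-date AWS spend against
+# the monthly_budget from config.toml (300 in this workspace).
+BUDGET=300
+START=$(date +%Y-%m-01)
+END=$(date +%Y-%m-%d)
+
+SPEND=$(aws ce get-cost-and-usage \
+  --time-period Start=$START,End=$END \
+  --granularity MONTHLY \
+  --metrics BlendedCost \
+  --query 'ResultsByTime[0].Total.BlendedCost.Amount' \
+  --output text)
+
+echo "AWS month-to-date: \$${SPEND} (budget: \$${BUDGET})"
+# Exit status of awk is 0 only when spend exceeds budget.
+awk -v s="$SPEND" -v b="$BUDGET" 'BEGIN { exit !(s > b) }' \
+  && echo "WARNING: over budget" >&2
+```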
+
+## Scaling and Optimization
+
+### Scale Hetzner Compute
+
+Edit `workspace.ncl`:
+
+```nickel
+compute_tier.primary_servers = hetzner.Server & {
+  count = 5,   # Increase from 3
+  server_type = "cpx21"
+}
+```
+
+Redeploy:
+
+```bash
+./deploy.nu
+```
+
+### Upgrade Database
+
+```bash
+# Modify RDS instance class
+aws rds modify-db-instance \
+  --db-instance-identifier app-db \
+  --db-instance-class db.t3.medium \
+  --apply-immediately \
+  --region us-east-1
+```
+
+### Add Caching Layer
+
+Already configured with ElastiCache. Optimize by adjusting:
+
+```toml
+[application.cache]
+max_memory = "512MB"
+eviction_policy = "allkeys-lru"
+```
+
+### Increase Queue Throughput
+
+SQS automatically scales. Monitor with:
+
+```bash
+aws sqs get-queue-attributes \
+  --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue \
+  --attribute-names ApproximateNumberOfMessages
+```
+
+## Cost Optimization Tips
+
+1. **Hetzner Compute**: CPX21 is the sweet spot. Consider CX21 for lighter workloads
+2. **AWS RDS**: Use t3.small for dev, t3.medium for prod with burst capability
+3. **ElastiCache**: 2 nodes with auto-failover. Monitor eviction rates
+4. **SQS**: Pay per request, no fixed costs. Good for variable load
+5. **DigitalOcean CDN**: Cache more aggressively (86400s TTL for assets)
+6. **Spaces**: Use lifecycle rules to delete old files automatically
+
+### Cost Reduction Checklist
+
+- Reduce Hetzner servers from 3 to 2 (saves ~€21/month)
+- Downgrade RDS to db.t3.micro for dev (saves ~$40/month)
+- Reduce ElastiCache nodes from 2 to 1 (saves ~$12/month)
+- Archive old CDN content (savings from Spaces storage)
+- Use reserved capacity on AWS (20-30% discount)
+
+Potential total savings: ~$100+/month with right-sizing.
+
+## Troubleshooting
+
+### Issue: Hetzner Can't Connect to RDS
+
+**Diagnosis**:
+
+```bash
+# SSH to Hetzner server
+ssh root@hetzner-server
+
+# Test connectivity
+nc -zv app-db.abc123.us-east-1.rds.amazonaws.com 5432
+```
+
+**Solution**:
+- Check that the VPN tunnel is active
+- Verify the RDS security group allows port 5432 from the Hetzner network
+- Check route tables on both sides
+
+### Issue: High Database Latency
+
+**Diagnosis**:
+
+```bash
+# Check RDS performance
+aws pi describe-dimension-keys --service-type RDS ...
+
+# Check network latency
+ping -c 5 app-db.abc123.us-east-1.rds.amazonaws.com
+```
+
+**Solution**:
+- Upgrade the RDS instance class
+- Increase ElastiCache size to reduce database queries
+- Check network bandwidth between providers
+
+### Issue: Queue Processing Slow
+
+**Diagnosis**:
+
+```bash
+# Check queue depth and age
+aws sqs get-queue-attributes \
+  --queue-url \
+  --attribute-names All
+```
+
+**Solution**:
+- Scale up the application servers processing the queue
+- Reduce the visibility timeout if messages are timing out
+- Check application logs for processing errors
+
+## Cleanup
+
+```bash
+# Hetzner
+hcloud server delete hetzner-app-1 hetzner-app-2 hetzner-app-3
+hcloud load-balancer delete app-lb
+
+# AWS
+aws rds delete-db-instance --db-instance-identifier app-db --skip-final-snapshot
+aws elasticache delete-cache-cluster --cache-cluster-id app-cache
+aws sqs delete-queue --queue-url https://sqs.us-east-1.amazonaws.com/123456789/app-queue
+
+# DigitalOcean
+doctl compute spaces delete app-content
+doctl compute cdn delete cdn-app
+doctl compute droplet delete edge-node-1 edge-node-2 edge-node-3
+```
+
+## Next Steps
+
+1. Implement application logging to CloudWatch
+2. Set up a Hetzner monitoring dashboard
+3. Configure auto-scaling based on queue depth
+4. Implement database read replicas for read-heavy workloads
+5. Add WAF protection to the Hetzner load balancer
+6. Implement cross-region backups to Spaces
+7.
Set up cost anomaly detection alerts + +## Support + +For issues or questions: + +- Review the cost-optimized deployment guide +- Check provider-specific documentation +- Monitor costs with: `aws ce get-cost-and-usage ...` +- Review deployment logs: `./deploy.nu --debug` + +## Files + +- `workspace.ncl`: Infrastructure definition (Nickel) +- `config.toml`: Provider credentials and settings +- `deploy.nu`: Deployment orchestration (Nushell) +- `README.md`: This file \ No newline at end of file diff --git a/examples/workspaces/multi-provider-web-app/README.md b/examples/workspaces/multi-provider-web-app/README.md index 980691b..13b4272 100644 --- a/examples/workspaces/multi-provider-web-app/README.md +++ b/examples/workspaces/multi-provider-web-app/README.md @@ -1 +1,413 @@ -# Multi-Provider Web App Workspace\n\nThis workspace demonstrates a production-ready web application deployment spanning three cloud providers:\n\n- **DigitalOcean**: Web servers and load balancing (NYC region)\n- **AWS**: Managed PostgreSQL database with high availability (US-East region)\n- **Hetzner**: Backup storage and disaster recovery (Germany region)\n\n## Why Three Providers?\n\nThis architecture optimizes cost, performance, and reliability:\n\n- **DigitalOcean** (~$77/month): Cost-effective compute with simple management\n- **AWS RDS** (~$75/month): Managed database with automatic failover\n- **Hetzner** (~$13/month): Affordable backup storage\n- **Total**: ~$165/month (vs $300+ for equivalent all-cloud setup)\n\n## Architecture Overview\n\n```\n┌─────────────────────────────────────────────┐\n│ Client Requests │\n└──────────────┬──────────────────────────────┘\n │ HTTPS/HTTP\n ┌───────▼─────────┐\n │ DigitalOcean LB │\n └───────┬─────────┘\n ┌────────┼────────┐\n │ │ │\n ┌─▼──┐ ┌─▼──┐ ┌─▼──┐\n │Web │ │Web │ │Web │ (DigitalOcean Droplets)\n │ 1 │ │ 2 │ │ 3 │\n └──┬─┘ └──┬─┘ └──┬─┘\n │ │ │\n └───────┼───────┘\n │ VPN Tunnel\n ┌───────▼────────────┐\n │ AWS RDS (PG) │ (us-east-1)\n │ Multi-AZ Cluster │\n └────────┬───────────┘\n │ Replication\n ┌──────▼──────────┐\n │ Hetzner Volume │ (nbg1 - Germany)\n │ Backups │\n └─────────────────┘\n```\n\n## Prerequisites\n\n### 1. Cloud Accounts\n\n- **DigitalOcean**: Account with API token\n- **AWS**: Account with access keys\n- **Hetzner**: Account with API token\n\n### 2. Environment Variables\n\nSet these before deployment:\n\n```\nexport DIGITALOCEAN_TOKEN="dop_v1_abc123def456ghi789jkl012mno"\nexport AWS_ACCESS_KEY_ID="AKIA1234567890ABCDEF"\nexport AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG+j/zI0m1234567890ab"\nexport HCLOUD_TOKEN="MC4wNTI1YmE1M2E4YmE0YTQzMTQyZTdlODYy"\n```\n\n### 3. SSH Key Setup\n\n#### DigitalOcean\n```\n# Upload your SSH public key\ndoctl compute ssh-key create provisioning-key \n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# Note the key ID for workspace.ncl\ndoctl compute ssh-key list\n```\n\n#### AWS\n```\n# Create EC2 key pair (if needed)\naws ec2 create-key-pair --key-name provisioning-key \n --query 'KeyMaterial' --output text > provisioning-key.pem\nchmod 600 provisioning-key.pem\n```\n\n#### Hetzner\n```\n# Upload SSH key\nhcloud ssh-key create --name provisioning-key \n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# List keys\nhcloud ssh-key list\n```\n\n### 4. 
DNS Setup\n\nUpdate `workspace.ncl` with your domain:\n- Replace `your-certificate-id` with actual AWS certificate ID\n- Update load balancer CNAME to point to your domain\n\n## Deployment\n\n### Step 1: Configure the Workspace\n\nEdit `workspace.ncl` to:\n- Set your SSH key IDs\n- Update certificate ID for HTTPS\n- Set domain names\n- Adjust instance counts if needed\n\nEdit `config.toml` to:\n- Set correct environment variable names\n- Adjust thresholds and settings\n\n### Step 2: Validate Configuration\n\n```\n# Validate Nickel syntax\nnickel export workspace.ncl | jq .\n\n# Validate provider credentials\nprovisioning provider verify digitalocean\nprovisioning provider verify aws\nprovisioning provider verify hetzner\n```\n\n### Step 3: Deploy\n\n```\n# Using provided deploy script\n./deploy.nu\n\n# Or manually via provisioning CLI\nprovisioning workspace deploy --config config.toml\n```\n\n### Step 4: Verify Deployment\n\n```\n# List resources per provider\ndoctl compute droplet list\naws rds describe-db-instances\nhcloud volume list\n\n# Test load balancer\ncurl http://your-domain.com/health\n```\n\n## Post-Deployment Configuration\n\n### 1. Application Deployment\n\nSSH into web servers and deploy application:\n\n```\n# Get web server IPs\ndoctl compute droplet list --format Name,PublicIPv4\n\n# SSH to first server\nssh root@198.51.100.15\n\n# Deploy application\ncd /var/www\ngit clone https://github.com/your-org/web-app.git\ncd web-app\n./deploy.sh\n```\n\n### 2. Database Configuration\n\nConnect to RDS database and initialize schema:\n\n```\n# Get RDS endpoint\naws rds describe-db-instances --query 'DBInstances[0].Endpoint.Address'\n\n# Connect and initialize\npsql -h webapp-db.c9akciq32.us-east-1.rds.amazonaws.com -U admin -d defaultdb < schema.sql\n```\n\n### 3. DNS Configuration\n\nPoint your domain to the load balancer:\n\n```\n# Get load balancer IP\ndoctl compute load-balancer list\n\n# Update DNS CNAME\n# Add CNAME record: app.example.com -> lb-123456789.nyc3.digitalocean.com\n```\n\n### 4. 
SSL/TLS Certificate\n\nUse AWS Certificate Manager:\n\n```\n# Request certificate\naws acm request-certificate \n --domain-name app.example.com \n --validation-method DNS\n\n# Validate and get certificate ID\naws acm list-certificates | grep app.example.com\n\n# Update workspace.ncl with certificate ID\n```\n\n## Monitoring\n\n### DigitalOcean Monitoring\n\n- CPU usage tracked per droplet\n- Memory usage alerts on Droplet greater than 85%\n- Disk space alerts on greater than 90% full\n\n### AWS CloudWatch\n\n- RDS database metrics (CPU, connections, disk)\n- Automatic failover notifications\n- Slow query logging\n\n### Hetzner Monitoring\n\n- Volume usage tracking\n- Manual monitoring script via cron\n\n### Application Monitoring\n\nImplement application-level monitoring:\n\n```\n# SSH to web server\nssh root@198.51.100.15\n\n# Check app logs\ntail -f /var/www/app/logs/application.log\n\n# Monitor system resources\ntop\niostat -x 1\n\n# Check database connection pool\npsql -h webapp-db.c9akciq32.us-east-1.rds.amazonaws.com -c "SELECT count(plus) FROM pg_stat_activity;"\n```\n\n## Backup and Recovery\n\n### Automated Backups\n\n- **RDS**: Daily backups retained for 30 days (AWS handles)\n- **Application Data**: Weekly backups to Hetzner volume\n- **Configuration**: Version control via Git\n\n### Manual Backup\n\n```\n# Backup RDS to Hetzner volume\nssh hetzner-backup-volume\n\n# Mount Hetzner volume (if not mounted)\nsudo mount /dev/sdb /mnt/backups\n\n# Backup RDS database\npg_dump -h webapp-db.c9akciq32.us-east-1.rds.amazonaws.com -U admin -d defaultdb | \n gzip > /mnt/backups/db-$(date +%Y%m%d).sql.gz\n```\n\n### Recovery Procedure\n\n1. **Web Server Failure**: Load balancer automatically redirects to healthy server\n2. **Database Failure**: RDS Multi-AZ automatic failover\n3. **Complete Failure**: Restore from Hetzner backup volume\n\n## Scaling\n\n### Add More Web Servers\n\nEdit `workspace.ncl`:\n\n```\ndroplets = digitalocean.Droplet & {\n name = "web-server",\n region = "nyc3",\n size = "s-2vcpu-4gb",\n count = 5\n}\n```\n\nRedeploy:\n\n```\n./deploy.nu\n```\n\n### Upgrade Database\n\nEdit `workspace.ncl`:\n\n```\ndatabase_tier = aws.RDS & {\n identifier = "webapp-db",\n instance_class = "db.t3.large"\n}\n```\n\nRedeploy with minimal downtime (Multi-AZ handles switchover).\n\n## Cost Optimization\n\n### Reduce Costs\n\n1. **Droplets**: Use smaller size or fewer instances\n2. **Database**: Switch to smaller db.t3.small (approximately $30/month)\n3. **Storage**: Reduce backup volume size\n4. 
**Data Transfer**: Monitor and optimize outbound traffic\n\n### Monitor Costs\n\n```\n# DigitalOcean estimated bill\ndoctl billing get\n\n# AWS Cost Explorer\naws ce get-cost-and-usage --time-period Start=2024-01-01,End=2024-01-31\n\n# Hetzner manual tracking via console\n# Navigate to https://console.hetzner.cloud/billing\n```\n\n## Troubleshooting\n\n### Issue: Web Servers Unreachable\n\n**Diagnosis**:\n```\ndoctl compute droplet list\ndoctl compute firewall list-rules firewall-id\n```\n\n**Solution**:\n- Check firewall allows ports 80, 443\n- Verify droplets have public IPs\n- Check web server application status\n\n### Issue: Database Connection Failure\n\n**Diagnosis**:\n```\naws rds describe-db-instances\naws security-group describe-security-groups\n```\n\n**Solution**:\n- Verify RDS security group allows port 5432 from web servers\n- Check RDS status is "available"\n- Verify connection string in application\n\n### Issue: Backup Volume Not Mounted\n\n**Diagnosis**:\n```\nhcloud volume list\nssh hetzner-volume\nlsblk\n```\n\n**Solution**:\n```\nsudo mkfs.ext4 /dev/sdb\nsudo mount /dev/sdb /mnt/backups\necho '/dev/sdb /mnt/backups ext4 defaults,nofail 0 0' | sudo tee -a /etc/fstab\n```\n\n## Cleanup\n\nTo destroy all resources:\n\n```\n# This will delete everything - use carefully\nprovisioning workspace destroy --config config.toml\n\n# Or manually\ndoctl compute droplet delete web-server-1 web-server-2 web-server-3\ndoctl compute load-balancer delete web-lb\naws rds delete-db-instance --db-instance-identifier webapp-db --skip-final-snapshot\nhcloud volume delete webapp-backups\n```\n\n## Next Steps\n\n1. **SSL/TLS**: Update certificate and enable HTTPS\n2. **Auto-scaling**: Add DigitalOcean autoscaling based on load\n3. **Multi-region**: Add additional AWS RDS read replicas in other regions\n4. **Disaster Recovery**: Test failover procedures\n5. **Cost Optimization**: Review and optimize resource sizes\n\n## Support\n\nFor issues or questions:\n\n- Review the multi-provider deployment guide\n- Check provider-specific documentation\n- Review workspace logs with debug flag: ./deploy.nu --debug\n\n## Files\n\n- `workspace.ncl`: Infrastructure definition (Nickel)\n- `config.toml`: Provider credentials and settings\n- `deploy.nu`: Deployment automation script (Nushell)\n- `README.md`: This file \ No newline at end of file +# Multi-Provider Web App Workspace + +This workspace demonstrates a production-ready web application deployment spanning three cloud providers: + +- **DigitalOcean**: Web servers and load balancing (NYC region) +- **AWS**: Managed PostgreSQL database with high availability (US-East region) +- **Hetzner**: Backup storage and disaster recovery (Germany region) + +## Why Three Providers? 
+
+This architecture optimizes cost, performance, and reliability:
+
+- **DigitalOcean** (~$77/month): Cost-effective compute with simple management
+- **AWS RDS** (~$75/month): Managed database with automatic failover
+- **Hetzner** (~$13/month): Affordable backup storage
+- **Total**: ~$165/month (vs $300+ for an equivalent single-provider setup)
+
+## Architecture Overview
+
+```text
+┌─────────────────────────────────────────────┐
+│              Client Requests                │
+└──────────────┬──────────────────────────────┘
+               │ HTTPS/HTTP
+       ┌───────▼─────────┐
+       │ DigitalOcean LB │
+       └───────┬─────────┘
+      ┌────────┼────────┐
+      │        │        │
+   ┌─▼──┐   ┌─▼──┐   ┌─▼──┐
+   │Web │   │Web │   │Web │  (DigitalOcean Droplets)
+   │ 1  │   │ 2  │   │ 3  │
+   └──┬─┘   └──┬─┘   └──┬─┘
+      │        │        │
+      └───────┼────────┘
+              │ VPN Tunnel
+      ┌───────▼────────────┐
+      │  AWS RDS (PG)      │  (us-east-1)
+      │  Multi-AZ Cluster  │
+      └────────┬───────────┘
+               │ Replication
+        ┌──────▼──────────┐
+        │ Hetzner Volume  │  (nbg1 - Germany)
+        │ Backups         │
+        └─────────────────┘
+```
+
+## Prerequisites
+
+### 1. Cloud Accounts
+
+- **DigitalOcean**: Account with API token
+- **AWS**: Account with access keys
+- **Hetzner**: Account with API token
+
+### 2. Environment Variables
+
+Set these before deployment:
+
+```bash
+export DIGITALOCEAN_TOKEN="dop_v1_abc123def456ghi789jkl012mno"
+export AWS_ACCESS_KEY_ID="AKIA1234567890ABCDEF"
+export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG+j/zI0m1234567890ab"
+export HCLOUD_TOKEN="MC4wNTI1YmE1M2E4YmE0YTQzMTQyZTdlODYy"
+```
+
+### 3. SSH Key Setup
+
+#### DigitalOcean
+```bash
+# Upload your SSH public key
+doctl compute ssh-key create provisioning-key \
+  --public-key-file ~/.ssh/id_rsa.pub
+
+# Note the key ID for workspace.ncl
+doctl compute ssh-key list
+```
+
+#### AWS
+```bash
+# Create EC2 key pair (if needed)
+aws ec2 create-key-pair --key-name provisioning-key \
+  --query 'KeyMaterial' --output text > provisioning-key.pem
+chmod 600 provisioning-key.pem
+```
+
+#### Hetzner
+```bash
+# Upload SSH key
+hcloud ssh-key create --name provisioning-key \
+  --public-key-from-file ~/.ssh/id_rsa.pub
+
+# List keys
+hcloud ssh-key list
+```
+
+### 4. DNS Setup
+
+Update `workspace.ncl` with your domain:
+- Replace `your-certificate-id` with the actual AWS certificate ID
+- Update the load balancer CNAME to point to your domain
+
+## Deployment
+
+### Step 1: Configure the Workspace
+
+Edit `workspace.ncl` to:
+- Set your SSH key IDs
+- Update the certificate ID for HTTPS
+- Set domain names
+- Adjust instance counts if needed
+
+Edit `config.toml` to:
+- Set correct environment variable names
+- Adjust thresholds and settings
+
+### Step 2: Validate Configuration
+
+```bash
+# Validate Nickel syntax
+nickel export workspace.ncl | jq .
+
+# Validate provider credentials
+provisioning provider verify digitalocean
+provisioning provider verify aws
+provisioning provider verify hetzner
+```
+
+### Step 3: Deploy
+
+```bash
+# Using the provided deploy script
+./deploy.nu
+
+# Or manually via the provisioning CLI
+provisioning workspace deploy --config config.toml
+```
+
+### Step 4: Verify Deployment
+
+```bash
+# List resources per provider
+doctl compute droplet list
+aws rds describe-db-instances
+hcloud volume list
+
+# Test load balancer
+curl http://your-domain.com/health
+```
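+
+If the health endpoint does not answer through the load balancer yet, each droplet can be probed directly. A minimal sketch, assuming the web servers listen on port 80:
+
+```bash
+# Probe each droplet's health endpoint, bypassing the load balancer
+for ip in $(doctl compute droplet list --format PublicIPv4 --no-header); do
+  curl -s -o /dev/null -w "%{http_code} $ip\n" "http://$ip/health"
+done
+```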
+
+## Post-Deployment Configuration
+
+### 1. Application Deployment
+
+SSH into the web servers and deploy the application:
+
+```bash
+# Get web server IPs
+doctl compute droplet list --format Name,PublicIPv4
+
+# SSH to first server
+ssh root@198.51.100.15
+
+# Deploy application
+cd /var/www
+git clone https://github.com/your-org/web-app.git
+cd web-app
+./deploy.sh
+```
+
+### 2. Database Configuration
+
+Connect to the RDS database and initialize the schema:
+
+```bash
+# Get RDS endpoint
+aws rds describe-db-instances --query 'DBInstances[0].Endpoint.Address'
+
+# Connect and initialize
+psql -h webapp-db.c9akciq32.us-east-1.rds.amazonaws.com -U admin -d defaultdb < schema.sql
+```
+
+### 3. DNS Configuration
+
+Point your domain to the load balancer:
+
+```bash
+# Get load balancer IP
+doctl compute load-balancer list
+
+# Update DNS CNAME
+# Add CNAME record: app.example.com -> lb-123456789.nyc3.digitalocean.com
+```
+
+### 4. SSL/TLS Certificate
+
+Use AWS Certificate Manager:
+
+```bash
+# Request certificate
+aws acm request-certificate \
+  --domain-name app.example.com \
+  --validation-method DNS
+
+# Validate and get certificate ID
+aws acm list-certificates | grep app.example.com
+
+# Update workspace.ncl with the certificate ID
+```
+
+## Monitoring
+
+### DigitalOcean Monitoring
+
+- CPU usage tracked per droplet
+- Memory usage alerts when a droplet exceeds 85%
+- Disk space alerts when a disk is over 90% full
+
+### AWS CloudWatch
+
+- RDS database metrics (CPU, connections, disk)
+- Automatic failover notifications
+- Slow query logging
+
+### Hetzner Monitoring
+
+- Volume usage tracking
+- Manual monitoring script via cron
+
+### Application Monitoring
+
+Implement application-level monitoring:
+
+```bash
+# SSH to web server
+ssh root@198.51.100.15
+
+# Check app logs
+tail -f /var/www/app/logs/application.log
+
+# Monitor system resources
+top
+iostat -x 1
+
+# Check database connection pool
+psql -h webapp-db.c9akciq32.us-east-1.rds.amazonaws.com -c "SELECT count(*) FROM pg_stat_activity;"
+```
+
+## Backup and Recovery
+
+### Automated Backups
+
+- **RDS**: Daily backups retained for 30 days (AWS handles)
+- **Application Data**: Weekly backups to the Hetzner volume
+- **Configuration**: Version control via Git
+
+### Manual Backup
+
+```bash
+# SSH to the Hetzner backup server
+ssh hetzner-backup-volume
+
+# Mount Hetzner volume (if not mounted)
+sudo mount /dev/sdb /mnt/backups
+
+# Backup RDS database
+pg_dump -h webapp-db.c9akciq32.us-east-1.rds.amazonaws.com -U admin -d defaultdb |
+  gzip > /mnt/backups/db-$(date +%Y%m%d).sql.gz
+```
+
+### Recovery Procedure
+
+1. **Web Server Failure**: Load balancer automatically redirects to a healthy server
+2. **Database Failure**: RDS Multi-AZ automatic failover
+3. **Complete Failure**: Restore from the Hetzner backup volume
+
+## Scaling
+
+### Add More Web Servers
+
+Edit `workspace.ncl`:
+
+```nickel
+droplets = digitalocean.Droplet & {
+  name = "web-server",
+  region = "nyc3",
+  size = "s-2vcpu-4gb",
+  count = 5
+}
+```
+
+Redeploy:
+
+```bash
+./deploy.nu
+```
+
+### Upgrade Database
+
+Edit `workspace.ncl`:
+
+```nickel
+database_tier = aws.RDS & {
+  identifier = "webapp-db",
+  instance_class = "db.t3.large"
+}
+```
+
+Redeploy with minimal downtime (Multi-AZ handles the switchover).
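+
+The switchover can be watched from the CLI until the new class is active. A small sketch using the `webapp-db` identifier from above:
+
+```bash
+# Poll instance status and class until the change completes
+aws rds describe-db-instances \
+  --db-instance-identifier webapp-db \
+  --query 'DBInstances[0].[DBInstanceStatus,DBInstanceClass]' \
+  --output text
+```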
+
+## Cost Optimization
+
+### Reduce Costs
+
+1. **Droplets**: Use a smaller size or fewer instances
+2. **Database**: Switch to the smaller db.t3.small (approximately $30/month)
+3. **Storage**: Reduce the backup volume size
+4. **Data Transfer**: Monitor and optimize outbound traffic
+
+### Monitor Costs
+
+```bash
+# DigitalOcean estimated bill
+doctl balance get
+
+# AWS Cost Explorer
+aws ce get-cost-and-usage --time-period Start=2024-01-01,End=2024-01-31
+
+# Hetzner manual tracking via console
+# Navigate to https://console.hetzner.cloud/billing
+```
+
+## Troubleshooting
+
+### Issue: Web Servers Unreachable
+
+**Diagnosis**:
+```bash
+doctl compute droplet list
+doctl compute firewall list-rules <firewall-id>
+```
+
+**Solution**:
+- Check the firewall allows ports 80 and 443
+- Verify the droplets have public IPs
+- Check the web server application status
+
+### Issue: Database Connection Failure
+
+**Diagnosis**:
+```bash
+aws rds describe-db-instances
+aws ec2 describe-security-groups
+```
+
+**Solution**:
+- Verify the RDS security group allows port 5432 from the web servers
+- Check the RDS status is "available"
+- Verify the connection string in the application
+
+### Issue: Backup Volume Not Mounted
+
+**Diagnosis**:
+```bash
+hcloud volume list
+ssh hetzner-volume
+lsblk
+```
+
+**Solution**:
+```bash
+# Format only if the volume is new (mkfs erases all data)
+sudo mkfs.ext4 /dev/sdb
+sudo mount /dev/sdb /mnt/backups
+echo '/dev/sdb /mnt/backups ext4 defaults,nofail 0 0' | sudo tee -a /etc/fstab
+```
+
+## Cleanup
+
+To destroy all resources:
+
+```bash
+# This will delete everything - use carefully
+provisioning workspace destroy --config config.toml
+
+# Or manually
+doctl compute droplet delete web-server-1 web-server-2 web-server-3
+doctl compute load-balancer delete web-lb
+aws rds delete-db-instance --db-instance-identifier webapp-db --skip-final-snapshot
+hcloud volume delete webapp-backups
+```
+
+## Next Steps
+
+1. **SSL/TLS**: Update the certificate and enable HTTPS
+2. **Auto-scaling**: Add DigitalOcean autoscaling based on load
+3. **Multi-region**: Add AWS RDS read replicas in other regions
+4. **Disaster Recovery**: Test failover procedures
+5. 
**Cost Optimization**: Review and optimize resource sizes + +## Support + +For issues or questions: + +- Review the multi-provider deployment guide +- Check provider-specific documentation +- Review workspace logs with debug flag: ./deploy.nu --debug + +## Files + +- `workspace.ncl`: Infrastructure definition (Nickel) +- `config.toml`: Provider credentials and settings +- `deploy.nu`: Deployment automation script (Nushell) +- `README.md`: This file \ No newline at end of file diff --git a/examples/workspaces/multi-region-ha/README.md b/examples/workspaces/multi-region-ha/README.md index 85d956e..3a75867 100644 --- a/examples/workspaces/multi-region-ha/README.md +++ b/examples/workspaces/multi-region-ha/README.md @@ -1 +1,729 @@ -# Multi-Region High Availability Workspace\n\nThis workspace demonstrates a production-ready global high availability deployment spanning three cloud providers across three geographic regions:\n\n- **US East (DigitalOcean NYC)**: Primary region - active serving, primary database\n- **EU Central (Hetzner Germany)**: Secondary region - active serving, read replicas\n- **Asia Pacific (AWS Singapore)**: Tertiary region - active serving, read replicas\n\n## Why Multi-Region High Availability?\n\n### Business Benefits\n\n- **99.99% Uptime**: Automatic failover across regions\n- **Low Latency**: Users served from geographically closest region\n- **Compliance**: Data residency in specific regions (GDPR for EU)\n- **Disaster Recovery**: Complete regional failure tolerance\n\n### Technical Benefits\n\n- **Load Distribution**: Traffic spread across 3 regions\n- **Cost Optimization**: Pay only for actual usage (~$311/month)\n- **Provider Diversity**: Reduces vendor lock-in risk\n- **Capacity Planning**: Scale independently per region\n\n## Architecture Overview\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│ Global Route53 DNS │\n│ Geographic Routing + Health Checks │\n└────────────────────┬────────────────────────────────────────────┘\n │\n ┌───────────┼───────────┐\n │ │ │\n ┌────▼─────┐ ┌──▼────────┐ ┌▼──────────┐\n │ US │ │ EU │ │ APAC │\n │ Primary │ │ Secondary │ │ Tertiary │\n └────┬─────┘ └──┬────────┘ └▼──────────┘\n │ │ │\n ┌────▼──────────▼───────────▼────┐\n │ Multi-Master Database │\n │ Replication (300s lag) │\n └────────────────────────────────┘\n │ │ │\n ┌────▼────┐ ┌──▼─────┐ ┌──▼────┐\n │DO Droplets Hetzner AWS\n │ 3 x nyc3 3 x nbg1 3 x sgp1\n │ │ │ │\n │ Load Balancer (per region)\n │ │ │ │\n └─────────┼─────────┼─────────┘\n │VPN Tunnels (IPSec)│\n └───────────────────┘\n```\n\n### Regional Components\n\n#### US East (DigitalOcean) - Primary\n\n```\nRegion: nyc3 (New York)\nCompute: 3x Droplets (s-2vcpu-4gb)\nLoad Balancer: Round-robin with health checks\nDatabase: PostgreSQL (3-node cluster, Multi-AZ)\nNetwork: VPC 10.0.0.0/16\nCost: ~$102/month\n```\n\n#### EU Central (Hetzner) - Secondary\n\n```\nRegion: nbg1 (Nuremberg, Germany)\nCompute: 3x CPX21 servers (4 vCPU, 8GB RAM)\nLoad Balancer: Hetzner Load Balancer\nDatabase: Read-only replica (lag: 300s)\nNetwork: vSwitch 10.1.0.0/16\nCost: ~$79/month (€72.70)\n```\n\n#### Asia Pacific (AWS) - Tertiary\n\n```\nRegion: ap-southeast-1 (Singapore)\nCompute: 3x EC2 t3.medium instances\nLoad Balancer: Application Load Balancer (ALB)\nDatabase: RDS read-only replica (lag: 300s)\nNetwork: VPC 10.2.0.0/16\nCost: ~$130/month\n```\n\n## Prerequisites\n\n### 1. 
Cloud Accounts & Credentials\n\n#### DigitalOcean\n```\n# Create API token\n# Dashboard → API → Tokens/Keys → Generate New Token\n# Scopes: read, write\n\nexport DIGITALOCEAN_TOKEN="dop_v1_abc123def456ghi789jkl012mno"\n```\n\n#### Hetzner\n```\n# Create API token\n# Dashboard → Security → API Tokens → Generate Token\n\nexport HCLOUD_TOKEN="MC4wNTI1YmE1M2E4YmE0YTQzMTQyZTdlODYy"\n```\n\n#### AWS\n```\n# Create IAM user with programmatic access\n# IAM → Users → Add User → Check "Programmatic access"\n# Attach policies: AmazonEC2FullAccess, AmazonRDSFullAccess, Route53FullAccess\n\nexport AWS_ACCESS_KEY_ID="AKIA1234567890ABCDEF"\nexport AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG+j/zI0m1234567890ab"\n```\n\n### 2. CLI Tools\n\n```\n# Verify all CLIs are installed\nwhich doctl\nwhich hcloud\nwhich aws\nwhich nickel\n\n# Versions\ndoctl version # >= 1.94.0\nhcloud version # >= 1.35.0\naws --version # >= 2.0\nnickel --version # >= 1.0\n```\n\n### 3. SSH Keys\n\n#### DigitalOcean\n```\n# Upload SSH key\ndoctl compute ssh-key create provisioning-key \n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# Note the key ID\ndoctl compute ssh-key list\n```\n\n#### Hetzner\n```\n# Upload SSH key\nhcloud ssh-key create \n --name provisioning-key \n --public-key-from-file ~/.ssh/id_rsa.pub\n\n# List keys\nhcloud ssh-key list\n```\n\n#### AWS\n```\n# Create or import EC2 key pair\naws ec2 create-key-pair \n --key-name provisioning-key \n --query 'KeyMaterial' --output text > provisioning-key.pem\n\nchmod 600 provisioning-key.pem\n```\n\n### 4. Domain and DNS\n\nYou need a domain with Route53 or ability to create DNS records:\n\n```\n# Create hosted zone in Route53\naws route53 create-hosted-zone \n --name api.example.com \n --caller-reference $(date +%s)\n\n# Note the Zone ID for updates\naws route53 list-hosted-zones\n```\n\n## Deployment\n\n### Step 1: Configure the Workspace\n\nEdit `workspace.ncl` to customize:\n\n```\n# Update SSH key references\ndroplets = digitalocean.Droplet & {\n ssh_keys = ["YOUR_DO_KEY_ID"],\n name = "us-app",\n region = "nyc3"\n}\n\n# Update AWS AMI IDs for your region\napp_servers = aws.EC2 & {\n image_id = "ami-09d56f8956ab235b7",\n instance_type = "t3.medium",\n region = "ap-southeast-1"\n}\n\n# Update certificate ID\nload_balancer = digitalocean.LoadBalancer & {\n forwarding_rules = [{\n certificate_id = "your-certificate-id",\n entry_protocol = "https",\n entry_port = 443\n }]\n}\n```\n\nEdit `config.toml`:\n\n```\n# Update regional names if different\n[providers.digitalocean]\nregion_name = "us-east"\n\n[providers.hetzner]\nregion_name = "eu-central"\n\n[providers.aws]\nregion_name = "asia-southeast"\n\n# Update domain\n[dns]\ndomain = "api.example.com"\n```\n\n### Step 2: Validate Configuration\n\n```\n# Validate Nickel syntax\nnickel export workspace.ncl | jq . 
> /dev/null\n\n# Verify credentials per provider\ndoctl auth init --access-token $DIGITALOCEAN_TOKEN\nhcloud context use default\naws sts get-caller-identity\n\n# Check connectivity\ndoctl account get\nhcloud server list\naws ec2 describe-regions\n```\n\n### Step 3: Deploy\n\n```\n# Make script executable\nchmod +x deploy.nu\n\n# Execute deployment (step-by-step)\n./deploy.nu\n\n# Or with debug output\n./deploy.nu --debug\n\n# Or deploy per region\n./deploy.nu --region us-east\n./deploy.nu --region eu-central\n./deploy.nu --region asia-southeast\n```\n\n### Step 4: Verify Global Deployment\n\n```\n# List resources per region\necho "=== US EAST (DigitalOcean) ==="\ndoctl compute droplet list --format Name,Region,Status,PublicIPv4\ndoctl compute load-balancer list\n\necho "=== EU CENTRAL (Hetzner) ==="\nhcloud server list\n\necho "=== ASIA PACIFIC (AWS) ==="\naws ec2 describe-instances --region ap-southeast-1 \n --query 'Reservations[*].Instances[*].[InstanceId,InstanceType,State.Name,PublicIpAddress]' \n --output table\naws elbv2 describe-load-balancers --region ap-southeast-1\n```\n\n## Post-Deployment Configuration\n\n### 1. SSL/TLS Certificates\n\n#### AWS Certificate Manager\n```\n# Request certificate for all regions\naws acm request-certificate \n --domain-name api.example.com \n --subject-alternative-names *.api.example.com \n --validation-method DNS \n --region us-east-1\n\n# Get certificate ARN\naws acm list-certificates --region us-east-1\n\n# Note the ARN for workspace.ncl\n```\n\n### 2. Database Primary/Replica Setup\n\n```\n# Connect to US East primary\nPGPASSWORD=admin psql -h us-db-primary.abc123.us-east-1.rds.amazonaws.com -U admin -d postgres\n\n# Create read-only replication users for EU and APAC\nCREATE ROLE replication_user WITH REPLICATION LOGIN PASSWORD 'replica_password';\n\n# On EU read replica (Hetzner) - verify replication\nSELECT slot_name, restart_lsn, confirmed_flush_lsn FROM pg_replication_slots;\n\n# On APAC read replica (AWS RDS) - verify replica status\nSELECT databaseid, xmin, catalog_xmin FROM pg_replication_origin_status;\n```\n\n### 3. Global DNS Setup\n\n```\n# Create Route53 records for each region\naws route53 change-resource-record-sets \n --hosted-zone-id Z1234567890ABC \n --change-batch '{\n "Changes": [\n {\n "Action": "CREATE",\n "ResourceRecordSet": {\n "Name": "us.api.example.com",\n "Type": "A",\n "TTL": 60,\n "ResourceRecords": [{"Value": "198.51.100.15"}]\n }\n },\n {\n "Action": "CREATE",\n "ResourceRecordSet": {\n "Name": "eu.api.example.com",\n "Type": "A",\n "TTL": 60,\n "ResourceRecords": [{"Value": "192.0.2.100"}]\n }\n },\n {\n "Action": "CREATE",\n "ResourceRecordSet": {\n "Name": "asia.api.example.com",\n "Type": "A",\n "TTL": 60,\n "ResourceRecords": [{"Value": "203.0.113.50"}]\n }\n }\n ]\n }'\n\n# Health checks per region\naws route53 create-health-check \n --health-check-config '{\n "Type": "HTTPS",\n "ResourcePath": "/health",\n "FullyQualifiedDomainName": "us.api.example.com",\n "Port": 443,\n "RequestInterval": 30,\n "FailureThreshold": 3\n }'\n```\n\n### 4. 
Application Deployment\n\nSSH to web servers in each region:\n\n```\n# US East\nUS_IP=$(doctl compute droplet get us-app-1 --format PublicIPv4 --no-header)\nssh root@$US_IP\n\n# Deploy application\ncd /var/www\ngit clone https://github.com/your-org/app.git\ncd app\n./deploy.sh\n\n# EU Central\nEU_IP=$(hcloud server list --selector region=eu-central --format ID | head -1 | xargs -I {} hcloud server ip {})\nssh root@$EU_IP\n\n# Asia Pacific\nASIA_IP=$(aws ec2 describe-instances \n --region ap-southeast-1 \n --filters "Name=tag:Name,Values=asia-app-1" \n --query 'Reservations[0].Instances[0].PublicIpAddress' \n --output text)\nssh -i provisioning-key.pem ec2-user@$ASIA_IP\n```\n\n## Monitoring and Health Checks\n\n### Regional Monitoring\n\nEach region generates metrics to CloudWatch/provider-specific monitoring:\n\n```\n# DigitalOcean metrics\ndoctl monitoring metrics list droplet \n --droplet-id 123456789 \n --metric cpu\n\n# Hetzner metrics (manual monitoring)\nhcloud server list\n\n# AWS CloudWatch\naws cloudwatch get-metric-statistics \n --metric-name CPUUtilization \n --namespace AWS/EC2 \n --start-time 2024-01-01T00:00:00Z \n --end-time 2024-01-02T00:00:00Z \n --period 300 \n --statistics Average\n```\n\n### Global Health Checks\n\nRoute53 health checks verify all regions are healthy:\n\n```\n# List health checks\naws route53 list-health-checks\n\n# Get detailed status\naws route53 get-health-check-status --health-check-id abc123\n\n# Verify replication lag\n# On primary (US East) DigitalOcean\nSELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;\n\n# Should be less than 300 seconds\n```\n\n### Alert Configuration\n\nConfigure alerts for critical metrics:\n\n```\n# CPU > 80%\naws cloudwatch put-metric-alarm \n --alarm-name us-east-high-cpu \n --alarm-actions arn:aws:sns:us-east-1:123456:ops-alerts \n --metric-name CPUUtilization \n --threshold 80 \n --comparison-operator GreaterThanThreshold\n\n# Replication lag > 600s\naws cloudwatch put-metric-alarm \n --alarm-name replication-lag-critical \n --metric-name ReplicationLag \n --threshold 600 \n --comparison-operator GreaterThanThreshold\n```\n\n## Failover Testing\n\n### Planned Failover - US East to EU Central\n\n```\n# 1. Stop traffic to US East\naws route53 change-resource-record-sets \n --hosted-zone-id Z1234567890ABC \n --change-batch '{\n "Changes": [{\n "Action": "UPSERT",\n "ResourceRecordSet": {\n "Name": "api.example.com",\n "Type": "A",\n "TTL": 60,\n "ResourceRecords": [{"Value": "192.0.2.100"}]\n }\n }]\n }'\n\n# 2. Promote EU Central to primary\n# Connect to EU read replica and promote\npsql -h hetzner-eu-db.netz.de -U admin -d postgres \n -c "SELECT pg_promote();"\n\n# 3. Verify failover\ncurl https://api.example.com/health\n\n# 4. 
Monitor replication (now from EU)\nSELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;\n```\n\n### Automatic Failover - Health Check Failure\n\nRoute53 automatically fails over when health checks fail:\n\n```\n# Simulate US East failure (for testing only)\n# Stop web servers temporarily\ndoctl compute droplet-action power-off us-app-1 us-app-2 us-app-3\n\n# Wait ~1 minute for health check to fail\nsleep 60\n\n# Verify traffic now routes to EU/APAC\ncurl https://api.example.com/ -v | grep -E "^< Server"\n\n# Restore US East\ndoctl compute droplet-action power-on us-app-1 us-app-2 us-app-3\n```\n\n## Scaling and Upgrades\n\n### Add More Web Servers\n\nEdit `workspace.ncl`:\n\n```\n# Increase droplet count\nregion_us_east.app_servers = digitalocean.Droplet & {\n count = 5,\n name = "us-app",\n region = "nyc3"\n}\n\n# Increase Hetzner servers\nregion_eu_central.app_servers = hetzner.Server & {\n count = 5,\n server_type = "cpx21",\n location = "nbg1"\n}\n\n# Increase AWS EC2 instances\nregion_asia_southeast.app_servers = aws.EC2 & {\n count = 5,\n instance_type = "t3.medium",\n region = "ap-southeast-1"\n}\n```\n\nRedeploy:\n\n```\n./deploy.nu --region us-east\n./deploy.nu --region eu-central\n./deploy.nu --region asia-southeast\n```\n\n### Upgrade Database Instance Class\n\nEdit `workspace.ncl`:\n\n```\n# US East primary\ndatabase = digitalocean.Database & {\n size = "db-s-4vcpu-8gb",\n name = "us-db-primary",\n engine = "pg"\n}\n```\n\nDigitalOcean handles upgrade with minimal downtime.\n\n### Upgrade EC2 Instances\n\n```\n# Stop instances for upgrade (rolling)\naws ec2 stop-instances --region ap-southeast-1 --instance-ids i-1234567890abcdef0\n\n# Wait for stop\naws ec2 wait instance-stopped --region ap-southeast-1 --instance-ids i-1234567890abcdef0\n\n# Modify instance type\naws ec2 modify-instance-attribute \n --region ap-southeast-1 \n --instance-id i-1234567890abcdef0 \n --instance-type t3.large\n\n# Start instance\naws ec2 start-instances --region ap-southeast-1 --instance-ids i-1234567890abcdef0\n```\n\n## Cost Optimization\n\n### Monthly Cost Breakdown\n\n| Component | US East | EU Central | Asia Pacific | Total |\n| ----------- | --------- | ----------- | -------------- | ------- |\n| Compute | $72 | €62.70 | $80 | $242.70 |\n| Database | $30 | Read Replica | $30 | $60 |\n| Load Balancer | Free | ~$10 | ~$20 | ~$30 |\n| **Total** | **$102** | **~$79** | **$130** | **~$311** |\n\n### Optimization Strategies\n\n1. Reduce instance count from 3 to 2 (saves ~$30-40/month)\n2. Downsize compute to s-1vcpu-2gb (saves ~$20-30/month)\n3. Use Reserved Instances on AWS (saves ~20-30%)\n4. Optimize data transfer between regions\n5. 
Review backups and retention settings\n\n### Monitor Costs\n\n```\n# DigitalOcean\ndoctl billing get\n\n# AWS Cost Explorer\naws ce get-cost-and-usage \n --time-period Start=2024-01-01,End=2024-01-31 \n --granularity MONTHLY \n --metrics BlendedCost \n --group-by Type=DIMENSION,Key=SERVICE\n\n# Hetzner (manual via console)\n# https://console.hetzner.cloud/billing\n```\n\n## Troubleshooting\n\n### Issue: One Region Not Responding\n\n**Diagnosis**:\n```\n# Check health checks\naws route53 get-health-check-status --health-check-id abc123\n\n# Test regional endpoints\ncurl -v https://us.api.example.com/health\ncurl -v https://eu.api.example.com/health\ncurl -v https://asia.api.example.com/health\n```\n\n**Solution**:\n- Check web server status in affected region\n- Verify load balancer is healthy\n- Review security groups/firewall rules\n- Check application logs on web servers\n\n### Issue: High Replication Lag\n\n**Diagnosis**:\n```\n# Check replication status\npsql -h us-db-primary.abc123.us-east-1.rds.amazonaws.com -U admin -d postgres \n -c "SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;"\n\n# Check replication slots\npsql -h us-db-primary.abc123.us-east-1.rds.amazonaws.com -U admin -d postgres \n -c "SELECT * FROM pg_replication_slots;"\n```\n\n**Solution**:\n- Check network connectivity between regions\n- Verify VPN tunnels are operational\n- Reduce write load on primary\n- Monitor network bandwidth\n- May need larger database instance\n\n### Issue: VPN Tunnel Down\n\n**Diagnosis**:\n```\n# Check VPN connection status\naws ec2 describe-vpn-connections --region us-east-1\n\n# Test connectivity between regions\nssh hetzner-server "ping 10.0.0.1"\n```\n\n**Solution**:\n- Reconnect VPN tunnel manually\n- Verify tunnel configuration\n- Check security groups allow necessary ports\n- Review ISP routing\n\n## Cleanup\n\nTo destroy all resources (use carefully):\n\n```\n# DigitalOcean\ndoctl compute droplet delete --force us-app-1 us-app-2 us-app-3\ndoctl compute load-balancer delete --force us-lb\ndoctl compute database delete --force us-db-primary\n\n# Hetzner\nhcloud server delete hetzner-eu-1 hetzner-eu-2 hetzner-eu-3\nhcloud load-balancer delete eu-lb\nhcloud volume delete eu-backups\n\n# AWS\naws ec2 terminate-instances --region ap-southeast-1 --instance-ids i-xxxxx\naws elbv2 delete-load-balancer --load-balancer-arn arn:aws:elasticloadbalancing:ap-southeast-1:123456789:loadbalancer/app/asia-lb/1234567890abcdef\naws rds delete-db-instance --db-instance-identifier asia-db-replica --skip-final-snapshot\n\n# Route53\naws route53 delete-health-check --health-check-id abc123\naws route53 delete-hosted-zone --id Z1234567890ABC\n```\n\n## Next Steps\n\n1. Disaster Recovery Testing: Regular failover drills\n2. Auto-scaling: Add provider-specific autoscaling\n3. Monitoring Integration: Connect to centralized monitoring (Datadog, New Relic, Prometheus)\n4. Backup Automation: Implement cross-region backups\n5. Cost Optimization: Review and tune resource sizing\n6. Security Hardening: Implement WAF, DDoS protection\n7. 
Load Testing: Validate performance across regions\n\n## Support\n\nFor issues or questions:\n\n- Review the multi-provider networking guide\n- Check provider-specific documentation\n- Review regional deployment logs: `./deploy.nu --debug`\n- Test regional endpoints independently\n\n## Files\n\n- `workspace.ncl`: Global infrastructure definition (Nickel)\n- `config.toml`: Provider credentials and regional settings\n- `deploy.nu`: Multi-region deployment orchestration (Nushell)\n- `README.md`: This file 
\ No newline at end of file
+# Multi-Region High Availability Workspace
+
+This workspace demonstrates a production-ready global high availability deployment spanning three cloud providers across three geographic regions:
+
+- **US East (DigitalOcean NYC)**: Primary region - active serving, primary database
+- **EU Central (Hetzner Germany)**: Secondary region - active serving, read replicas
+- **Asia Pacific (AWS Singapore)**: Tertiary region - active serving, read replicas
+
+## Why Multi-Region High Availability?
+
+### Business Benefits
+
+- **99.99% Uptime**: Automatic failover across regions
+- **Low Latency**: Users served from the geographically closest region
+- **Compliance**: Data residency in specific regions (GDPR for the EU)
+- **Disaster Recovery**: Tolerates complete regional failure
+
+### Technical Benefits
+
+- **Load Distribution**: Traffic spread across 3 regions
+- **Cost Optimization**: Pay only for actual usage (~$311/month)
+- **Provider Diversity**: Reduces vendor lock-in risk
+- **Capacity Planning**: Scale independently per region
+
+## Architecture Overview
+
+```text
+┌─────────────────────────────────────────────────────────────────┐
+│                     Global Route53 DNS                          │
+│              Geographic Routing + Health Checks                 │
+└────────────────────┬────────────────────────────────────────────┘
+                     │
+         ┌───────────┼───────────┐
+         │           │           │
+    ┌────▼─────┐ ┌──▼────────┐ ┌▼──────────┐
+    │   US     │ │    EU     │ │   APAC    │
+    │ Primary  │ │ Secondary │ │ Tertiary  │
+    └────┬─────┘ └──┬────────┘ └▼──────────┘
+         │          │           │
+    ┌────▼──────────▼───────────▼────┐
+    │     Multi-Master Database      │
+    │     Replication (300s lag)     │
+    └────────────────────────────────┘
+         │          │           │
+    ┌────▼────┐ ┌──▼─────┐ ┌──▼────┐
+    │DO Droplets  Hetzner     AWS
+    │ 3 x nyc3    3 x nbg1    3 x sgp1
+    │    │          │           │
+    │    Load Balancer (per region)
+    │    │          │           │
+    └─────────┼─────────┼─────────┘
+         │VPN Tunnels (IPSec)│
+         └───────────────────┘
+```
+
+### Regional Components
+
+#### US East (DigitalOcean) - Primary
+
+```text
+Region: nyc3 (New York)
+Compute: 3x Droplets (s-2vcpu-4gb)
+Load Balancer: Round-robin with health checks
+Database: PostgreSQL (3-node cluster, Multi-AZ)
+Network: VPC 10.0.0.0/16
+Cost: ~$102/month
+```
+
+#### EU Central (Hetzner) - Secondary
+
+```text
+Region: nbg1 (Nuremberg, Germany)
+Compute: 3x CPX21 servers (4 vCPU, 8GB RAM)
+Load Balancer: Hetzner Load Balancer
+Database: Read-only replica (lag: 300s)
+Network: vSwitch 10.1.0.0/16
+Cost: ~$79/month (€72.70)
+```
+
+#### Asia Pacific (AWS) - Tertiary
+
+```text
+Region: ap-southeast-1 (Singapore)
+Compute: 3x EC2 t3.medium instances
+Load Balancer: Application Load Balancer (ALB)
+Database: RDS read-only replica (lag: 300s)
+Network: VPC 10.2.0.0/16
+Cost: ~$130/month
+```
+
+## Prerequisites
+
+### 1. Cloud Accounts & Credentials
+
+#### DigitalOcean
+```bash
+# Create API token
+# Dashboard → API → Tokens/Keys → Generate New Token
+# Scopes: read, write
+
+export DIGITALOCEAN_TOKEN="dop_v1_abc123def456ghi789jkl012mno"
+```
+
+#### Hetzner
+```bash
+# Create API token
+# Dashboard → Security → API Tokens → Generate Token
+
+export HCLOUD_TOKEN="MC4wNTI1YmE1M2E4YmE0YTQzMTQyZTdlODYy"
+```
+
+#### AWS
+```bash
+# Create IAM user with programmatic access
+# IAM → Users → Add User → Check "Programmatic access"
+# Attach policies: AmazonEC2FullAccess, AmazonRDSFullAccess, Route53FullAccess
+
+export AWS_ACCESS_KEY_ID="AKIA1234567890ABCDEF"
+export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG+j/zI0m1234567890ab"
+```
+
+### 2. CLI Tools
+
+```bash
+# Verify all CLIs are installed
+which doctl
+which hcloud
+which aws
+which nickel
+
+# Versions
+doctl version    # >= 1.94.0
+hcloud version   # >= 1.35.0
+aws --version    # >= 2.0
+nickel --version # >= 1.0
+```
+
+### 3. SSH Keys
+
+#### DigitalOcean
+```bash
+# Upload SSH key
+doctl compute ssh-key create provisioning-key \
+  --public-key-file ~/.ssh/id_rsa.pub
+
+# Note the key ID
+doctl compute ssh-key list
+```
+
+#### Hetzner
+```bash
+# Upload SSH key
+hcloud ssh-key create \
+  --name provisioning-key \
+  --public-key-from-file ~/.ssh/id_rsa.pub
+
+# List keys
+hcloud ssh-key list
+```
+
+#### AWS
+```bash
+# Create or import EC2 key pair
+aws ec2 create-key-pair \
+  --key-name provisioning-key \
+  --query 'KeyMaterial' --output text > provisioning-key.pem
+
+chmod 600 provisioning-key.pem
+```
+
+### 4. Domain and DNS
+
+You need a domain in Route53, or the ability to create DNS records:
+
+```bash
+# Create hosted zone in Route53
+aws route53 create-hosted-zone \
+  --name api.example.com \
+  --caller-reference $(date +%s)
+
+# Note the Zone ID for updates
+aws route53 list-hosted-zones
+```
+
+## Deployment
+
+### Step 1: Configure the Workspace
+
+Edit `workspace.ncl` to customize:
+
+```nickel
+# Update SSH key references
+droplets = digitalocean.Droplet & {
+  ssh_keys = ["YOUR_DO_KEY_ID"],
+  name = "us-app",
+  region = "nyc3"
+}
+
+# Update AWS AMI IDs for your region
+app_servers = aws.EC2 & {
+  image_id = "ami-09d56f8956ab235b7",
+  instance_type = "t3.medium",
+  region = "ap-southeast-1"
+}
+
+# Update certificate ID
+load_balancer = digitalocean.LoadBalancer & {
+  forwarding_rules = [{
+    certificate_id = "your-certificate-id",
+    entry_protocol = "https",
+    entry_port = 443
+  }]
+}
+```
+
+Edit `config.toml`:
+
+```toml
+# Update regional names if different
+[providers.digitalocean]
+region_name = "us-east"
+
+[providers.hetzner]
+region_name = "eu-central"
+
+[providers.aws]
+region_name = "asia-southeast"
+
+# Update domain
+[dns]
+domain = "api.example.com"
+```
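+
+Before validating, it is worth confirming that all provider credentials are actually exported. A quick sketch using the variable names from the Prerequisites section:
+
+```bash
+# Fail fast if any provider credential is missing
+for var in DIGITALOCEAN_TOKEN HCLOUD_TOKEN AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
+  [ -n "${!var}" ] || echo "missing: $var"
+done
+```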
+
+### Step 2: Validate Configuration
+
+```bash
+# Validate Nickel syntax
+nickel export workspace.ncl | jq . > /dev/null
+
+# Verify credentials per provider
+doctl auth init --access-token $DIGITALOCEAN_TOKEN
+hcloud context use default
+aws sts get-caller-identity
+
+# Check connectivity
+doctl account get
+hcloud server list
+aws ec2 describe-regions
+```
+
+### Step 3: Deploy
+
+```bash
+# Make script executable
+chmod +x deploy.nu
+
+# Execute deployment (step-by-step)
+./deploy.nu
+
+# Or with debug output
+./deploy.nu --debug
+
+# Or deploy per region
+./deploy.nu --region us-east
+./deploy.nu --region eu-central
+./deploy.nu --region asia-southeast
+```
+
+### Step 4: Verify Global Deployment
+
+```bash
+# List resources per region
+echo "=== US EAST (DigitalOcean) ==="
+doctl compute droplet list --format Name,Region,Status,PublicIPv4
+doctl compute load-balancer list
+
+echo "=== EU CENTRAL (Hetzner) ==="
+hcloud server list
+
+echo "=== ASIA PACIFIC (AWS) ==="
+aws ec2 describe-instances --region ap-southeast-1 \
+  --query 'Reservations[*].Instances[*].[InstanceId,InstanceType,State.Name,PublicIpAddress]' \
+  --output table
+aws elbv2 describe-load-balancers --region ap-southeast-1
+```
+
+## Post-Deployment Configuration
+
+### 1. SSL/TLS Certificates
+
+#### AWS Certificate Manager
+```bash
+# Request certificate for all regions
+aws acm request-certificate \
+  --domain-name api.example.com \
+  --subject-alternative-names "*.api.example.com" \
+  --validation-method DNS \
+  --region us-east-1
+
+# Get certificate ARN
+aws acm list-certificates --region us-east-1
+
+# Note the ARN for workspace.ncl
+```
+
+### 2. Database Primary/Replica Setup
+
+```bash
+# Connect to US East primary
+PGPASSWORD=admin psql -h us-db-primary.abc123.us-east-1.rds.amazonaws.com -U admin -d postgres
+
+# Inside psql: create a replication user for the EU and APAC replicas
+CREATE ROLE replication_user WITH REPLICATION LOGIN PASSWORD 'replica_password';
+
+# On the primary, inside psql: verify the replication slots
+SELECT slot_name, restart_lsn, confirmed_flush_lsn FROM pg_replication_slots;
+
+# On a read replica (Hetzner or AWS RDS), inside psql: verify replica status
+SELECT * FROM pg_stat_wal_receiver;
+```
+
+### 3. Global DNS Setup
+
+```bash
+# Create Route53 records for each region
+aws route53 change-resource-record-sets \
+  --hosted-zone-id Z1234567890ABC \
+  --change-batch '{
+    "Changes": [
+      {
+        "Action": "CREATE",
+        "ResourceRecordSet": {
+          "Name": "us.api.example.com",
+          "Type": "A",
+          "TTL": 60,
+          "ResourceRecords": [{"Value": "198.51.100.15"}]
+        }
+      },
+      {
+        "Action": "CREATE",
+        "ResourceRecordSet": {
+          "Name": "eu.api.example.com",
+          "Type": "A",
+          "TTL": 60,
+          "ResourceRecords": [{"Value": "192.0.2.100"}]
+        }
+      },
+      {
+        "Action": "CREATE",
+        "ResourceRecordSet": {
+          "Name": "asia.api.example.com",
+          "Type": "A",
+          "TTL": 60,
+          "ResourceRecords": [{"Value": "203.0.113.50"}]
+        }
+      }
+    ]
+  }'
+
+# Health checks per region
+aws route53 create-health-check \
+  --caller-reference $(date +%s) \
+  --health-check-config '{
+    "Type": "HTTPS",
+    "ResourcePath": "/health",
+    "FullyQualifiedDomainName": "us.api.example.com",
+    "Port": 443,
+    "RequestInterval": 30,
+    "FailureThreshold": 3
+  }'
+```
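+
+The three regional records can then be tied together under a single name with Route53 geolocation routing, so clients resolve to the nearest healthy region. A sketch for the EU record (the US and APAC records, plus a default, follow the same pattern; values are illustrative):
+
+```bash
+# Geolocation routing: EU clients resolve to the EU endpoint
+aws route53 change-resource-record-sets \
+  --hosted-zone-id Z1234567890ABC \
+  --change-batch '{
+    "Changes": [{
+      "Action": "CREATE",
+      "ResourceRecordSet": {
+        "Name": "api.example.com",
+        "Type": "A",
+        "SetIdentifier": "eu-central",
+        "GeoLocation": {"ContinentCode": "EU"},
+        "TTL": 60,
+        "ResourceRecords": [{"Value": "192.0.2.100"}]
+      }
+    }]
+  }'
+```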
+
+### 4. Application Deployment
+
+SSH to the web servers in each region:
+
+```bash
+# US East
+US_IP=$(doctl compute droplet get us-app-1 --format PublicIPv4 --no-header)
+ssh root@$US_IP
+
+# Deploy application
+cd /var/www
+git clone https://github.com/your-org/app.git
+cd app
+./deploy.sh
+
+# EU Central
+EU_IP=$(hcloud server list --selector region=eu-central --format ID | head -1 | xargs -I {} hcloud server ip {})
+ssh root@$EU_IP
+
+# Asia Pacific
+ASIA_IP=$(aws ec2 describe-instances \
+  --region ap-southeast-1 \
+  --filters "Name=tag:Name,Values=asia-app-1" \
+  --query 'Reservations[0].Instances[0].PublicIpAddress' \
+  --output text)
+ssh -i provisioning-key.pem ec2-user@$ASIA_IP
+```
+
+## Monitoring and Health Checks
+
+### Regional Monitoring
+
+Each region reports metrics to CloudWatch or provider-specific monitoring:
+
+```bash
+# DigitalOcean metrics
+doctl monitoring metrics list droplet \
+  --droplet-id 123456789 \
+  --metric cpu
+
+# Hetzner metrics (manual monitoring)
+hcloud server list
+
+# AWS CloudWatch
+aws cloudwatch get-metric-statistics \
+  --metric-name CPUUtilization \
+  --namespace AWS/EC2 \
+  --start-time 2024-01-01T00:00:00Z \
+  --end-time 2024-01-02T00:00:00Z \
+  --period 300 \
+  --statistics Average
+```
+
+### Global Health Checks
+
+Route53 health checks verify all regions are healthy:
+
+```bash
+# List health checks
+aws route53 list-health-checks
+
+# Get detailed status
+aws route53 get-health-check-status --health-check-id abc123
+
+# Verify replication lag on a read replica (inside psql)
+SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;
+
+# Should be less than 300 seconds
+```
+
+### Alert Configuration
+
+Configure alerts for critical metrics:
+
+```bash
+# CPU > 80%
+aws cloudwatch put-metric-alarm \
+  --alarm-name us-east-high-cpu \
+  --alarm-actions arn:aws:sns:us-east-1:123456:ops-alerts \
+  --namespace AWS/EC2 \
+  --metric-name CPUUtilization \
+  --statistic Average \
+  --period 300 \
+  --evaluation-periods 2 \
+  --threshold 80 \
+  --comparison-operator GreaterThanThreshold
+
+# Replication lag > 600s (custom metric; see the publishing sketch below)
+aws cloudwatch put-metric-alarm \
+  --alarm-name replication-lag-critical \
+  --namespace Custom/Database \
+  --metric-name ReplicationLag \
+  --statistic Average \
+  --period 300 \
+  --evaluation-periods 2 \
+  --threshold 600 \
+  --comparison-operator GreaterThanThreshold
+```
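+
+`ReplicationLag` is not a built-in CloudWatch metric, so something has to publish it. A minimal sketch, assuming it runs on a timer with access to a replica (the `Custom/Database` namespace matches the alarm above; the EU replica host is the one used elsewhere in this guide):
+
+```bash
+# Measure lag (seconds) on a replica and push it to CloudWatch
+LAG=$(psql -h hetzner-eu-db.netz.de -U admin -d postgres -t -A \
+  -c "SELECT COALESCE(EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp()), 0);")
+aws cloudwatch put-metric-data \
+  --namespace Custom/Database \
+  --metric-name ReplicationLag \
+  --value "$LAG"
+```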
+
+## Failover Testing
+
+### Planned Failover - US East to EU Central
+
+```bash
+# 1. Point traffic away from US East
+aws route53 change-resource-record-sets \
+  --hosted-zone-id Z1234567890ABC \
+  --change-batch '{
+    "Changes": [{
+      "Action": "UPSERT",
+      "ResourceRecordSet": {
+        "Name": "api.example.com",
+        "Type": "A",
+        "TTL": 60,
+        "ResourceRecords": [{"Value": "192.0.2.100"}]
+      }
+    }]
+  }'
+
+# 2. Promote EU Central to primary
+# Connect to the EU read replica and promote it
+psql -h hetzner-eu-db.netz.de -U admin -d postgres \
+  -c "SELECT pg_promote();"
+
+# 3. Verify failover
+curl https://api.example.com/health
+
+# 4. Monitor replication (now from the EU), inside psql:
+SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;
+```
+
+### Automatic Failover - Health Check Failure
+
+Route53 automatically fails over when health checks fail:
+
+```bash
+# Simulate US East failure (for testing only)
+# Stop web servers temporarily
+doctl compute droplet-action power-off us-app-1 us-app-2 us-app-3
+
+# Wait ~1 minute for the health check to fail
+sleep 60
+
+# Verify traffic now routes to EU/APAC (-v output goes to stderr)
+curl -sv https://api.example.com/ 2>&1 | grep -i "^< server"
+
+# Restore US East
+doctl compute droplet-action power-on us-app-1 us-app-2 us-app-3
+```
+
+## Scaling and Upgrades
+
+### Add More Web Servers
+
+Edit `workspace.ncl`:
+
+```nickel
+# Increase droplet count
+region_us_east.app_servers = digitalocean.Droplet & {
+  count = 5,
+  name = "us-app",
+  region = "nyc3"
+}
+
+# Increase Hetzner servers
+region_eu_central.app_servers = hetzner.Server & {
+  count = 5,
+  server_type = "cpx21",
+  location = "nbg1"
+}
+
+# Increase AWS EC2 instances
+region_asia_southeast.app_servers = aws.EC2 & {
+  count = 5,
+  instance_type = "t3.medium",
+  region = "ap-southeast-1"
+}
+```
+
+Redeploy:
+
+```bash
+./deploy.nu --region us-east
+./deploy.nu --region eu-central
+./deploy.nu --region asia-southeast
+```
+
+### Upgrade Database Instance Class
+
+Edit `workspace.ncl`:
+
+```nickel
+# US East primary
+database = digitalocean.Database & {
+  size = "db-s-4vcpu-8gb",
+  name = "us-db-primary",
+  engine = "pg"
+}
+```
+
+DigitalOcean handles the upgrade with minimal downtime.
+
+### Upgrade EC2 Instances
+
+```bash
+# Stop instances for upgrade (rolling)
+aws ec2 stop-instances --region ap-southeast-1 --instance-ids i-1234567890abcdef0
+
+# Wait for stop
+aws ec2 wait instance-stopped --region ap-southeast-1 --instance-ids i-1234567890abcdef0
+
+# Modify instance type
+aws ec2 modify-instance-attribute \
+  --region ap-southeast-1 \
+  --instance-id i-1234567890abcdef0 \
+  --instance-type Value=t3.large
+
+# Start instance
+aws ec2 start-instances --region ap-southeast-1 --instance-ids i-1234567890abcdef0
+```
+
+## Cost Optimization
+
+### Monthly Cost Breakdown
+
+| Component     | US East  | EU Central   | Asia Pacific | Total     |
+| ------------- | -------- | ------------ | ------------ | --------- |
+| Compute       | $72      | €62.70       | $80          | ~$220     |
+| Database      | $30      | Read Replica | $30          | $60       |
+| Load Balancer | Free     | ~$10         | ~$20         | ~$30      |
+| **Total**     | **$102** | **~$79**     | **$130**     | **~$311** |
+
+### Optimization Strategies
+
+1. Reduce instance count from 3 to 2 (saves ~$30-40/month)
+2. Downsize compute to s-1vcpu-2gb (saves ~$20-30/month)
+3. Use Reserved Instances on AWS (saves ~20-30%)
+4. Optimize data transfer between regions
+5. Review backup and retention settings
+
+### Monitor Costs
+
+```bash
+# DigitalOcean
+doctl balance get
+
+# AWS Cost Explorer
+aws ce get-cost-and-usage \
+  --time-period Start=2024-01-01,End=2024-01-31 \
+  --granularity MONTHLY \
+  --metrics BlendedCost \
+  --group-by Type=DIMENSION,Key=SERVICE
+
+# Hetzner (manual via console)
+# https://console.hetzner.cloud/billing
+```
+
+## Troubleshooting
+
+### Issue: One Region Not Responding
+
+**Diagnosis**:
+```bash
+# Check health checks
+aws route53 get-health-check-status --health-check-id abc123
+
+# Test regional endpoints
+curl -v https://us.api.example.com/health
+curl -v https://eu.api.example.com/health
+curl -v https://asia.api.example.com/health
+```
+
+**Solution**:
+- Check web server status in the affected region
+- Verify the load balancer is healthy
+- Review security groups/firewall rules
+- Check application logs on the web servers
+
+### Issue: High Replication Lag
+
+**Diagnosis**:
+```bash
+# Check replication lag (on a read replica)
+psql -h hetzner-eu-db.netz.de -U admin -d postgres \
+  -c "SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;"
+
+# Check replication slots (on the primary)
+psql -h us-db-primary.abc123.us-east-1.rds.amazonaws.com -U admin -d postgres \
+  -c "SELECT * FROM pg_replication_slots;"
+```
+
+**Solution**:
+- Check network connectivity between regions
+- Verify the VPN tunnels are operational
+- Reduce write load on the primary
+- Monitor network bandwidth
+- Consider a larger database instance
+
+### Issue: VPN Tunnel Down
+
+**Diagnosis**:
+```bash
+# Check VPN connection status
+aws ec2 describe-vpn-connections --region us-east-1
+
+# Test connectivity between regions
+ssh hetzner-server "ping -c 3 10.0.0.1"
+```
+
+**Solution**:
+- Reconnect the VPN tunnel manually
+- Verify the tunnel configuration
+- Check security groups allow the necessary ports
+- Review ISP routing
+
+## Cleanup
+
+To destroy all resources (use carefully):
+
+```bash
+# DigitalOcean
+doctl compute droplet delete --force us-app-1 us-app-2 us-app-3
+doctl compute load-balancer delete --force us-lb
+doctl databases delete --force us-db-primary
+
+# Hetzner
+hcloud server delete hetzner-eu-1 hetzner-eu-2 hetzner-eu-3
+hcloud load-balancer delete eu-lb
+hcloud volume delete eu-backups
+
+# AWS
+aws ec2 terminate-instances --region ap-southeast-1 --instance-ids i-xxxxx
+aws elbv2 delete-load-balancer --load-balancer-arn arn:aws:elasticloadbalancing:ap-southeast-1:123456789:loadbalancer/app/asia-lb/1234567890abcdef
+aws rds delete-db-instance --db-instance-identifier asia-db-replica --skip-final-snapshot
+
+# Route53
+aws route53 delete-health-check --health-check-id abc123
+aws route53 delete-hosted-zone --id Z1234567890ABC
+```
+
+## Next Steps
+
+1. Disaster Recovery Testing: Regular failover drills
+2. Auto-scaling: Add provider-specific autoscaling
+3. Monitoring Integration: Connect to centralized monitoring (Datadog, New Relic, Prometheus)
+4. Backup Automation: Implement cross-region backups
+5. Cost Optimization: Review and tune resource sizing
+6. Security Hardening: Implement WAF, DDoS protection
+7. 
Load Testing: Validate performance across regions + +## Support + +For issues or questions: + +- Review the multi-provider networking guide +- Check provider-specific documentation +- Review regional deployment logs: `./deploy.nu --debug` +- Test regional endpoints independently + +## Files + +- `workspace.ncl`: Global infrastructure definition (Nickel) +- `config.toml`: Provider credentials and regional settings +- `deploy.nu`: Multi-region deployment orchestration (Nushell) +- `README.md`: This file \ No newline at end of file diff --git a/schemas/infrastructure/README.md b/schemas/infrastructure/README.md index f9183cc..49ef03e 100644 --- a/schemas/infrastructure/README.md +++ b/schemas/infrastructure/README.md @@ -1 +1,424 @@ -# Infrastructure Schemas\n\nThis directory contains Nickel type-safe schemas for infrastructure configuration generation.\n\n## Overview\n\nThese schemas provide type contracts and validation for multi-format infrastructure configuration generation:\n\n- **Docker Compose** (`docker-compose.ncl`) - Container orchestration via Docker Compose\n- **Kubernetes** (`kubernetes.ncl`) - Kubernetes manifest generation (Deployments, Services, ConfigMaps)\n- **Nginx** (`nginx.ncl`) - Reverse proxy and load balancer configuration\n- **Prometheus** (`prometheus.ncl`) - Metrics collection and monitoring\n- **Systemd** (`systemd.ncl`) - System service units for standalone deployments\n- **OCI Registry** (`oci-registry.ncl`) - Container registry backend configuration (Zot, Distribution, Harbor)\n\n## Key Features\n\n### 1. Mode-Based Presets\n\nEach schema includes presets for different deployment modes:\n\n- **solo**: Single-node deployments (minimal resources)\n- **multiuser**: Staging/small production (2 replicas, HA)\n- **enterprise**: Large-scale production (3+ replicas, distributed storage)\n- **cicd**: CI/CD pipeline deployments\n\n### 2. Type Safety\n\n```\n# All fields are strongly typed with validation\nResourceLimits = {\n cpus | String, # Type: string\n memory | String,\n},\n\n# Enum validation\nServiceType = [| 'ClusterIP, 'NodePort, 'LoadBalancer |],\n\n# Numeric range validation\nPort = Number | {\n predicate = fun n => n > 0 && n < 65536,\n}\n```\n\n### 3. 
Export Formats\n\nSchemas export to multiple formats:\n\n```\n# Export as YAML (K8s, Docker Compose)\nnickel export --format yaml provisioning/schemas/infrastructure/kubernetes.ncl\n\n# Export as JSON (OCI Registry, Prometheus configs)\nnickel export --format json provisioning/schemas/infrastructure/oci-registry.ncl\n\n# Export as TOML (systemd, Nginx)\nnickel export --format toml provisioning/schemas/infrastructure/systemd.ncl\n```\n\n## Single Source of Truth Pattern\n\nDefine service configuration once, generate multiple infrastructure outputs:\n\n```\norchestrator.ncl (Platform Service Schema)\n ↓\nInfrastructure Schemas (Docker, Kubernetes, Nginx, etc.)\n ↓\n[Multiple Outputs]\n├─→ docker-compose.yaml\n├─→ kubernetes/deployment.yaml\n├─→ nginx.conf\n├─→ prometheus.yml\n└─→ systemd/orchestrator.service\n```\n\n### Example: Service Port Definition\n\n```\n# Platform service schema (provisioning/schemas/platform/schemas/orchestrator.ncl)\nserver = {\n port | Number, # Define port once\n}\n\n# Used in Docker Compose\ndocker-compose = {\n services.orchestrator = {\n ports = ["%{orchestrator.server.port}:8080"],\n }\n}\n\n# Used in Kubernetes\nkubernetes = {\n containers.ports = [{\n containerPort = orchestrator.server.port,\n }]\n}\n\n# Used in Nginx\nnginx = {\n upstreams.orchestrator.servers = [{\n address = "orchestrator:%{orchestrator.server.port}",\n }]\n}\n```\n\n**Benefit**: Change port in one place, all infrastructure configs update automatically.\n\n## Validation Before Deployment\n\n```\n# Type check schema\nnickel typecheck provisioning/schemas/infrastructure/docker-compose.ncl\n\n# Validate export\nnickel export --format json provisioning/schemas/infrastructure/kubernetes.ncl \n | jq . # Validate JSON structure\n\n# Check generated YAML\nnickel export --format yaml provisioning/schemas/infrastructure/kubernetes.ncl \n | kubectl apply --dry-run=client -f -\n```\n\n## File Structure\n\n```\ninfrastructure/\n├── README.md # This file\n├── docker-compose.ncl # Docker Compose schema (232 lines)\n├── kubernetes.ncl # Kubernetes manifests (376 lines)\n├── nginx.ncl # Nginx configuration (233 lines)\n├── prometheus.ncl # Prometheus configuration (280 lines)\n├── systemd.ncl # Systemd service units (235 lines)\n└── oci-registry.ncl # OCI Registry configuration (221 lines)\n```\n\n**Total**: 1,577 lines of type-safe infrastructure schemas\n\n## Usage Patterns\n\n### 1. Generate Solo Mode Infrastructure\n\n```\n# Export docker-compose for solo deployment\nnickel export --format yaml provisioning/schemas/infrastructure/docker-compose.ncl \n | tee provisioning/platform/infrastructure/docker/docker-compose.solo.yaml\n\n# Validate with Docker\ndocker-compose -f docker-compose.solo.yaml config --quiet\n```\n\n### 2. Generate Enterprise HA Kubernetes\n\n```\n# Export Kubernetes manifests\nnickel export --format yaml provisioning/schemas/infrastructure/kubernetes.ncl \n > provisioning/platform/infrastructure/kubernetes/deployment.yaml\n\n# Validate and apply\nkubectl apply --dry-run=client -f deployment.yaml\nkubectl apply -f deployment.yaml\n```\n\n### 3. Generate Monitoring Stack\n\n```\n# Prometheus configuration\nnickel export --format yaml provisioning/schemas/infrastructure/prometheus.ncl \n > provisioning/platform/infrastructure/prometheus/prometheus.yml\n\n# Validate Prometheus config\npromtool check config provisioning/platform/infrastructure/prometheus/prometheus.yml\n```\n\n### 4. 
Auto-Generate Infrastructure from Service Schemas\n\n```\n# Composition function: generate Docker Compose from service port\nlet service = import "../platform/schemas/orchestrator.ncl" in\n{\n services.orchestrator = {\n image = "provisioning/orchestrator:latest",\n ports = ["%{service.server.port}:8080"],\n deploy.resources.limits = service.deploy.resources.limits,\n }\n}\n```\n\n## Documentation\n\n### Inline Schema Documentation\n\nEach schema field includes inline documentation (via `| doc`):\n\n```\nfield | Type | doc "description" | default = value\n```\n\n**Important**: With Nickel, `| doc` must come BEFORE `| default`:\n\n```\n✅ CORRECT: cpus | String | doc "CPU limit" | default = "2.0"\n❌ INCORRECT: cpus | String | default = "2.0" | doc "CPU limit"\n```\n\nFor details, see `.claude/guidelines/nickel.md`\n\n## Validation Rules\n\n### Docker Compose\n\n- ✅ Valid service names, port ranges\n- ✅ Resource limits: CPU and memory strings\n- ✅ Health check configuration\n- ✅ Environment variables typed as strings\n\n### Kubernetes\n\n- ✅ Valid API versions (apps/v1, v1)\n- ✅ Container resource requests/limits\n- ✅ Valid restart policies (Always, OnFailure, Never)\n- ✅ Port ranges (1-65535)\n\n### Nginx\n\n- ✅ Upstream server addresses\n- ✅ Rate limiting zones and rules\n- ✅ TLS configuration validation\n- ✅ Security headers structure\n\n### Prometheus\n\n- ✅ Scrape job configuration\n- ✅ Alert manager targets\n- ✅ Scrape intervals (duration format)\n- ✅ Relabel configuration\n\n### Systemd\n\n- ✅ Unit dependencies (after, requires, wants)\n- ✅ Resource limits (CPU quota, memory)\n- ✅ Restart policies\n- ✅ Service types (simple, forking, oneshot, etc.)\n\n### OCI Registry\n\n- ✅ Registry backends (Zot, Distribution, Harbor)\n- ✅ Storage backend selection (filesystem, S3, Azure)\n- ✅ Authentication methods (none, basic, bearer, OIDC)\n- ✅ Access control policies\n\n## Deployment Examples\n\nTwo comprehensive infrastructure examples are provided demonstrating solo and enterprise configurations:\n\n### Solo Deployment Example\n\n**File**: `examples-solo-deployment.ncl`\n\nMinimal single-node setup for development/testing:\n\n```\n# Exports 4 infrastructure components\ndocker_compose_services # 5 services: orchestrator, control-center, coredns, kms, oci_registry\nnginx_config # Simple upstream routing to localhost services\nprometheus_config # 4 scrape jobs for basic monitoring\noci_registry_config # Zot backend with filesystem storage\n```\n\n**Resource Allocation**:\n- Orchestrator: 1.0 CPU, 1024M RAM\n- Control Center: 0.5 CPU, 512M RAM\n- Other services: 0.25-0.5 CPU, 256-512M RAM\n\n**Export to JSON**:\n\n```\nnickel export --format json provisioning/schemas/infrastructure/examples-solo-deployment.ncl\n# Output: 198 lines of configuration\n```\n\n### Enterprise Deployment Example\n\n**File**: `examples-enterprise-deployment.ncl`\n\nHigh-availability production-grade deployment:\n\n```\n# Exports 4 infrastructure components (HA versions)\ndocker_compose_services # 6 services with 3 replicas for HA\nnginx_config # Multiple upstreams with rate limiting and failover\nprometheus_config # 7 scrape jobs with remote storage\noci_registry_config # Harbor backend with S3 replication\n```\n\n**Resource Allocation**:\n- Orchestrator: 4.0 CPU, 4096M RAM (3 replicas)\n- Control Center: 2.0 CPU, 2048M RAM (HA)\n- Services scale appropriately for production load\n\n**Export to JSON**:\n\n```\nnickel export --format json provisioning/schemas/infrastructure/examples-enterprise-deployment.ncl\n# Output: 313 
lines of configuration\n```\n\n### Example Comparison\n\n| Aspect | Solo | Enterprise |\n| -------- | ------ | ----------- |\n| **Services** | 5 | 6 |\n| **Orchestrator CPU** | 1.0 | 4.0 |\n| **Orchestrator Memory** | 1024M | 4096M |\n| **Prometheus Jobs** | 4 | 7 |\n| **Registry Backend** | Zot | Harbor |\n| **Use Case** | Dev/Testing | Production |\n| **JSON Size** | 198 lines | 313 lines |\n\n### Validation Results\n\nBoth examples have been tested and validated:\n\n✅ **Solo Deployment** (`examples-solo-deployment.ncl`):\n- Type-checks without errors\n- Exports to valid JSON (198 lines)\n- All resource limits validated\n- Port range validation: 8080, 9090, 5432, 53\n- JSON structure: docker_compose_services, nginx_config, prometheus_config, oci_registry_config\n\n✅ **Enterprise Deployment** (`examples-enterprise-deployment.ncl`):\n- Type-checks without errors\n- Exports to valid JSON (313 lines)\n- HA configuration with 3 replicas\n- Enhanced monitoring: 7 vs 4 scrape jobs\n- Distributed storage backend (Harbor vs Zot)\n- Full JSON structure validated with jq\n\n## Automation Scripts\n\nGenerate all infrastructure configs in one command:\n\n```\n# Generate all formats for all modes\nprovisioning/platform/scripts/generate-infrastructure-configs.nu\n\n# Generate specific mode/format\nprovisioning/platform/scripts/generate-infrastructure-configs.nu --mode solo --format yaml\n\n# Specify output directory\nprovisioning/platform/scripts/generate-infrastructure-configs.nu --output-dir /tmp/infra\n```\n\nSee `provisioning/platform/scripts/generate-infrastructure-configs.nu` for implementation details.\n\n## Validation and Testing\n\n### Test Generated Configs\n\n```\n# Export solo deployment\nnickel export --format json provisioning/schemas/infrastructure/examples-solo-deployment.ncl \n > solo-infra.json\n\n# Validate JSON structure\njq . solo-infra.json\n\n# Inspect specific component (Docker Compose services)\njq '.docker_compose_services | keys' solo-infra.json\n\n# Check resource allocation\njq '.docker_compose_services.orchestrator.deploy.resources.limits' solo-infra.json\n```\n\n### Validate with Docker/Kubectl\n\n```\n# Export and validate Docker Compose\nnickel export --format yaml examples-solo-deployment.ncl \n | docker-compose config --quiet\n\n# Validate Kubernetes (if applicable)\nnickel export --format yaml examples-enterprise-deployment.ncl \n | kubectl apply --dry-run=client -f -\n\n# Validate Prometheus config\nnickel export --format yaml prometheus.ncl \n | promtool check config -\n```\n\n## Integration with ConfigLoader\n\nInfrastructure schemas are independent from platform config schemas:\n\n- **Platform configs** → Service-specific settings (port, timeouts, auth)\n- **Infrastructure schemas** → Deployment-specific settings (replicas, resources, networking)\n\nConfigLoader automatically loads platform configs. Infrastructure configs are generated separately and deployed via infrastructure tools:\n\n```\nPlatform Schema (Nickel)\n ↓ nickel export → TOML\n ↓ ConfigLoader → Service reads config\n\nInfrastructure Schema (Nickel)\n ↓ nickel export → YAML/JSON\n ↓ Docker/Kubernetes/Nginx CLI\n```\n\n## Next Steps\n\n1. **Use these schemas** in your infrastructure-as-code pipeline\n2. **Generate configs** with the automation script\n3. **Validate** before deployment using format-specific tools\n4. 
**Maintain single source of truth** by updating schemas, not generated files\n\n---\n\n**Version**: 1.1.0 (Infrastructure Examples & Validation Added)\n**Total Schemas**: 6 core files, 1,577 lines\n**Deployment Examples**: 2 files, 54 lines (solo + enterprise)\n**Validated**: All schemas and examples pass type-checking and export validation\n**Last Updated**: 2025-01-06\n**Nickel Version**: Latest \ No newline at end of file
+# Infrastructure Schemas
+
+This directory contains Nickel type-safe schemas for infrastructure configuration generation.
+
+## Overview
+
+These schemas provide type contracts and validation for multi-format infrastructure configuration generation:
+
+- **Docker Compose** (`docker-compose.ncl`) - Container orchestration via Docker Compose
+- **Kubernetes** (`kubernetes.ncl`) - Kubernetes manifest generation (Deployments, Services, ConfigMaps)
+- **Nginx** (`nginx.ncl`) - Reverse proxy and load balancer configuration
+- **Prometheus** (`prometheus.ncl`) - Metrics collection and monitoring
+- **Systemd** (`systemd.ncl`) - System service units for standalone deployments
+- **OCI Registry** (`oci-registry.ncl`) - Container registry backend configuration (Zot, Distribution, Harbor)
+
+## Key Features
+
+### 1. Mode-Based Presets
+
+Each schema includes presets for different deployment modes:
+
+- **solo**: Single-node deployments (minimal resources)
+- **multiuser**: Staging/small production (2 replicas, HA)
+- **enterprise**: Large-scale production (3+ replicas, distributed storage)
+- **cicd**: CI/CD pipeline deployments
+
+### 2. Type Safety
+
+```nickel
+# All fields are strongly typed with validation
+ResourceLimits = {
+  cpus | String,   # Type: string
+  memory | String,
+},
+
+# Enum validation
+ServiceType = [| 'ClusterIP, 'NodePort, 'LoadBalancer |],
+
+# Numeric range validation
+Port = Number | {
+  predicate = fun n => n > 0 && n < 65536,
+}
+```
+
+### 3. Export Formats
+
+Schemas export to multiple formats:
+
+```bash
+# Export as YAML (K8s, Docker Compose)
+nickel export --format yaml provisioning/schemas/infrastructure/kubernetes.ncl
+
+# Export as JSON (OCI Registry, Prometheus configs)
+nickel export --format json provisioning/schemas/infrastructure/oci-registry.ncl
+
+# Export as TOML (systemd, Nginx)
+nickel export --format toml provisioning/schemas/infrastructure/systemd.ncl
+```
+
+## Single Source of Truth Pattern
+
+Define service configuration once, generate multiple infrastructure outputs:
+
+```text
+orchestrator.ncl (Platform Service Schema)
+    ↓
+Infrastructure Schemas (Docker, Kubernetes, Nginx, etc.)
+    ↓
+[Multiple Outputs]
+├─→ docker-compose.yaml
+├─→ kubernetes/deployment.yaml
+├─→ nginx.conf
+├─→ prometheus.yml
+└─→ systemd/orchestrator.service
+```
+
+### Example: Service Port Definition
+
+```nickel
+# Platform service schema (provisioning/schemas/platform/schemas/orchestrator.ncl)
+server = {
+  port | Number,   # Define port once
+}
+
+# Used in Docker Compose
+docker-compose = {
+  services.orchestrator = {
+    ports = ["%{orchestrator.server.port}:8080"],
+  }
+}
+
+# Used in Kubernetes
+kubernetes = {
+  containers.ports = [{
+    containerPort = orchestrator.server.port,
+  }]
+}
+
+# Used in Nginx
+nginx = {
+  upstreams.orchestrator.servers = [{
+    address = "orchestrator:%{orchestrator.server.port}",
+  }]
+}
+```
+
+**Benefit**: Change the port in one place and all infrastructure configs update automatically. 
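+
+Because every output format derives from the same schemas, a change like the port edit above can be propagated by re-exporting each schema in one pass. The loop below is a minimal sketch, assuming the schema paths from this README; the per-schema format mapping and the `out/` directory are illustrative choices, not project conventions:
+
+```bash
+#!/usr/bin/env bash
+# Re-export every infrastructure schema after a shared definition changes.
+set -euo pipefail
+
+SCHEMA_DIR="provisioning/schemas/infrastructure"
+OUT_DIR="${1:-out}"   # hypothetical output directory
+mkdir -p "$OUT_DIR"
+
+# Format per schema (assumed mapping, mirroring the Export Formats section)
+declare -A formats=(
+  [docker-compose]=yaml
+  [kubernetes]=yaml
+  [nginx]=toml
+  [prometheus]=yaml
+  [systemd]=toml
+  [oci-registry]=json
+)
+
+for schema in "${!formats[@]}"; do
+  fmt="${formats[$schema]}"
+  nickel export --format "$fmt" "$SCHEMA_DIR/$schema.ncl" \
+    > "$OUT_DIR/$schema.$fmt"
+  echo "regenerated $OUT_DIR/$schema.$fmt"
+done
+```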
+
+## Validation Before Deployment
+
+```bash
+# Type check schema
+nickel typecheck provisioning/schemas/infrastructure/docker-compose.ncl
+
+# Validate export
+nickel export --format json provisioning/schemas/infrastructure/kubernetes.ncl \
+  | jq .   # Validate JSON structure
+
+# Check generated YAML
+nickel export --format yaml provisioning/schemas/infrastructure/kubernetes.ncl \
+  | kubectl apply --dry-run=client -f -
+```
+
+## File Structure
+
+```text
+infrastructure/
+├── README.md            # This file
+├── docker-compose.ncl   # Docker Compose schema (232 lines)
+├── kubernetes.ncl       # Kubernetes manifests (376 lines)
+├── nginx.ncl            # Nginx configuration (233 lines)
+├── prometheus.ncl       # Prometheus configuration (280 lines)
+├── systemd.ncl          # Systemd service units (235 lines)
+└── oci-registry.ncl     # OCI Registry configuration (221 lines)
+```
+
+**Total**: 1,577 lines of type-safe infrastructure schemas
+
+## Usage Patterns
+
+### 1. Generate Solo Mode Infrastructure
+
+```bash
+# Export docker-compose for solo deployment
+nickel export --format yaml provisioning/schemas/infrastructure/docker-compose.ncl \
+  | tee provisioning/platform/infrastructure/docker/docker-compose.solo.yaml
+
+# Validate with Docker
+docker-compose -f docker-compose.solo.yaml config --quiet
+```
+
+### 2. Generate Enterprise HA Kubernetes
+
+```bash
+# Export Kubernetes manifests
+nickel export --format yaml provisioning/schemas/infrastructure/kubernetes.ncl \
+  > provisioning/platform/infrastructure/kubernetes/deployment.yaml
+
+# Validate and apply
+kubectl apply --dry-run=client -f deployment.yaml
+kubectl apply -f deployment.yaml
+```
+
+### 3. Generate Monitoring Stack
+
+```bash
+# Prometheus configuration
+nickel export --format yaml provisioning/schemas/infrastructure/prometheus.ncl \
+  > provisioning/platform/infrastructure/prometheus/prometheus.yml
+
+# Validate Prometheus config
+promtool check config provisioning/platform/infrastructure/prometheus/prometheus.yml
+```
+
+### 4. Auto-Generate Infrastructure from Service Schemas
+
+```nickel
+# Composition function: generate Docker Compose from service port
+let service = import "../platform/schemas/orchestrator.ncl" in
+{
+  services.orchestrator = {
+    image = "provisioning/orchestrator:latest",
+    ports = ["%{service.server.port}:8080"],
+    deploy.resources.limits = service.deploy.resources.limits,
+  }
+}
+```
+
+## Documentation
+
+### Inline Schema Documentation
+
+Each schema field includes inline documentation (via `| doc`):
+
+```nickel
+field | Type | doc "description" | default = value
+```
+
+**Important**: With Nickel, `| doc` must come BEFORE `| default`:
+
+```nickel
+✅ CORRECT:   cpus | String | doc "CPU limit" | default = "2.0"
+❌ INCORRECT: cpus | String | default = "2.0" | doc "CPU limit"
+```
+
+For details, see `.claude/guidelines/nickel.md`
+
+## Validation Rules
+
+### Docker Compose
+
+- ✅ Valid service names, port ranges
+- ✅ Resource limits: CPU and memory strings
+- ✅ Health check configuration
+- ✅ Environment variables typed as strings
+
+### Kubernetes
+
+- ✅ Valid API versions (apps/v1, v1)
+- ✅ Container resource requests/limits
+- ✅ Valid restart policies (Always, OnFailure, Never)
+- ✅ Port ranges (1-65535)
+
+### Nginx
+
+- ✅ Upstream server addresses
+- ✅ Rate limiting zones and rules
+- ✅ TLS configuration validation
+- ✅ Security headers structure
+
+### Prometheus
+
+- ✅ Scrape job configuration
+- ✅ Alert manager targets
+- ✅ Scrape intervals (duration format)
+- ✅ Relabel configuration
+
+### Systemd
+
+- ✅ Unit dependencies (after, requires, wants)
+- ✅ Resource limits (CPU quota, memory)
+- ✅ Restart policies
+- ✅ Service types (simple, forking, oneshot, etc.)
+
+### OCI Registry
+
+- ✅ Registry backends (Zot, Distribution, Harbor)
+- ✅ Storage backend selection (filesystem, S3, Azure)
+- ✅ Authentication methods (none, basic, bearer, OIDC)
+- ✅ Access control policies
+
+## Deployment Examples
+
+Two comprehensive infrastructure examples are provided demonstrating solo and enterprise configurations:
+
+### Solo Deployment Example
+
+**File**: `examples-solo-deployment.ncl`
+
+Minimal single-node setup for development/testing:
+
+```text
+# Exports 4 infrastructure components
+docker_compose_services   # 5 services: orchestrator, control-center, coredns, kms, oci_registry
+nginx_config              # Simple upstream routing to localhost services
+prometheus_config         # 4 scrape jobs for basic monitoring
+oci_registry_config       # Zot backend with filesystem storage
+```
+
+**Resource Allocation**:
+- Orchestrator: 1.0 CPU, 1024M RAM
+- Control Center: 0.5 CPU, 512M RAM
+- Other services: 0.25-0.5 CPU, 256-512M RAM
+
+**Export to JSON**:
+
+```bash
+nickel export --format json provisioning/schemas/infrastructure/examples-solo-deployment.ncl
+# Output: 198 lines of configuration
+```
+
+### Enterprise Deployment Example
+
+**File**: `examples-enterprise-deployment.ncl`
+
+High-availability production-grade deployment:
+
+```text
+# Exports 4 infrastructure components (HA versions)
+docker_compose_services   # 6 services with 3 replicas for HA
+nginx_config              # Multiple upstreams with rate limiting and failover
+prometheus_config         # 7 scrape jobs with remote storage
+oci_registry_config       # Harbor backend with S3 replication
+```
+
+**Resource Allocation**:
+- Orchestrator: 4.0 CPU, 4096M RAM (3 replicas)
+- Control Center: 2.0 CPU, 2048M RAM (HA)
+- Services scale appropriately for production load
+
+**Export to JSON**:
+
+```bash
+nickel export --format json provisioning/schemas/infrastructure/examples-enterprise-deployment.ncl
+# Output: 313 lines of configuration
+```
+
+### Example Comparison
+
+| Aspect | Solo | Enterprise |
+| -------- | ------ | ----------- |
+| **Services** | 5 | 6 |
+| **Orchestrator CPU** | 1.0 | 4.0 |
+| **Orchestrator Memory** | 1024M | 4096M |
+| **Prometheus Jobs** | 4 | 7 |
+| **Registry Backend** | Zot | Harbor |
+| **Use Case** | Dev/Testing | Production |
+| **JSON Size** | 198 lines | 313 lines |
+
+### Validation Results
+
+Both examples have been tested and validated:
+
+✅ **Solo Deployment** (`examples-solo-deployment.ncl`):
+- Type-checks without errors
+- Exports to valid JSON (198 lines)
+- All resource limits validated
+- Port range validation: 8080, 9090, 5432, 53
+- JSON structure: docker_compose_services, nginx_config, prometheus_config, oci_registry_config
+
+✅ **Enterprise Deployment** (`examples-enterprise-deployment.ncl`):
+- Type-checks without errors
+- Exports to valid JSON (313 lines)
+- HA configuration with 3 replicas
+- Enhanced monitoring: 7 vs 4 scrape jobs
+- Distributed storage backend (Harbor vs Zot)
+- Full JSON structure validated with jq
+
+## Automation Scripts
+
+Generate all infrastructure configs in one command:
+
+```bash
+# Generate all formats for all modes
+provisioning/platform/scripts/generate-infrastructure-configs.nu
+
+# Generate specific mode/format
+provisioning/platform/scripts/generate-infrastructure-configs.nu --mode solo --format yaml
+
+# Specify output directory
+provisioning/platform/scripts/generate-infrastructure-configs.nu --output-dir /tmp/infra
+```
+
+See `provisioning/platform/scripts/generate-infrastructure-configs.nu` for implementation details.
+
+## Validation and Testing
+
+### Test Generated Configs
+
+```bash
+# Export solo deployment
+nickel export --format json provisioning/schemas/infrastructure/examples-solo-deployment.ncl \
+  > solo-infra.json
+
+# Validate JSON structure
+jq . solo-infra.json
+
+# Inspect specific component (Docker Compose services)
+jq '.docker_compose_services | keys' solo-infra.json
+
+# Check resource allocation
+jq '.docker_compose_services.orchestrator.deploy.resources.limits' solo-infra.json
+```
+
+### Validate with Docker/Kubectl
+
+```bash
+# Export and validate Docker Compose (read the piped YAML via -f -)
+nickel export --format yaml examples-solo-deployment.ncl \
+  | docker-compose -f - config --quiet
+
+# Validate Kubernetes (if applicable)
+nickel export --format yaml examples-enterprise-deployment.ncl \
+  | kubectl apply --dry-run=client -f -
+
+# Validate Prometheus config
+nickel export --format yaml prometheus.ncl \
+  | promtool check config -
+```
+
+## Integration with ConfigLoader
+
+Infrastructure schemas are independent from platform config schemas:
+
+- **Platform configs** → Service-specific settings (port, timeouts, auth)
+- **Infrastructure schemas** → Deployment-specific settings (replicas, resources, networking)
+
+ConfigLoader automatically loads platform configs. Infrastructure configs are generated separately and deployed via infrastructure tools:
+
+```text
+Platform Schema (Nickel)
+  ↓ nickel export → TOML
+  ↓ ConfigLoader → Service reads config
+
+Infrastructure Schema (Nickel)
+  ↓ nickel export → YAML/JSON
+  ↓ Docker/Kubernetes/Nginx CLI
+```
+
+## Next Steps
+
+1. **Use these schemas** in your infrastructure-as-code pipeline
+2. **Generate configs** with the automation script
+3. **Validate** before deployment using format-specific tools
+4. 
**Maintain single source of truth** by updating schemas, not generated files + +--- + +**Version**: 1.1.0 (Infrastructure Examples & Validation Added) +**Total Schemas**: 6 core files, 1,577 lines +**Deployment Examples**: 2 files, 54 lines (solo + enterprise) +**Validated**: All schemas and examples pass type-checking and export validation +**Last Updated**: 2025-01-06 +**Nickel Version**: Latest \ No newline at end of file diff --git a/schemas/platform/README.md b/schemas/platform/README.md index bcf7631..765a0b7 100644 --- a/schemas/platform/README.md +++ b/schemas/platform/README.md @@ -1 +1,361 @@ -# TypeDialog + Nickel Configuration System for Platform Services\n\nComplete configuration system for provisioning platform services (orchestrator, control-center, mcp-server, vault-service,\nextension-registry, rag, ai-service, provisioning-daemon) across multiple deployment modes (solo, multiuser, cicd, enterprise).\n\n## Architecture Overview\n\nThis system implements a **TypeDialog + Nickel configuration workflow** that provides:\n\n- **Type-safe configuration** via Nickel schemas with validation\n- **Interactive configuration** via TypeDialog forms with real-time constraint validation\n- **Multi-mode deployment** (solo/multiuser/cicd/enterprise) with mode-specific defaults\n- **Configuration composition** (base defaults + mode overlays + user customization + validation)\n- **Automated TOML export** for Rust service consumption\n- **Docker Compose + Kubernetes templates** for infrastructure deployment\n\n## Directory Structure\n\n```\nprovisioning/.typedialog/provisioning/platform/\n├── constraints/ # Single source of truth for validation limits\n├── schemas/ # Nickel type contracts (services + common + deployment modes)\n├── defaults/ # Default configuration values (services + common + deployment modes)\n├── validators/ # Validation logic (constraints, ranges, business rules)\n├── configs/ # Generated mode-specific Nickel configurations (4 services × 4 modes = 16 configs)\n├── forms/ # TypeDialog form definitions (4 main forms + flat fragments)\n│ └── fragments/ # Reusable form fragments (workspace, server, database, etc.)\n├── templates/ # Jinja2 + Nickel templates for config/deployment generation\n│ ├── docker-compose/ # Docker Compose templates (solo/multiuser/cicd/enterprise)\n│ ├── kubernetes/ # Kubernetes deployment templates\n│ └── configs/ # Service configuration templates (TOML generation)\n├── scripts/ # Nushell orchestration scripts (configure, generate, validate, deploy)\n├── examples/ # Example configurations for different deployment scenarios\n└── values/ # User configuration files (gitignored *.ncl)\n```\n\n## Configuration Workflow\n\n### 1. User Interaction (TypeDialog)\n\n```\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n- Launches interactive form (web/tui/cli)\n- Loads existing config as default values (if exists)\n- Validates user input against constraints\n- Generates updated Nickel config\n\n### 2. Configuration Composition\n\n```\nBase Defaults (defaults/*.ncl)\n ↓\n+ Mode Overlay (defaults/deployment/{mode}-defaults.ncl)\n ↓\n+ User Customization (values/{service}.{mode}.ncl)\n ↓\n+ Schema Validation (schemas/*.ncl)\n ↓\n+ Constraint Validation (validators/*.ncl)\n ↓\n= Final Configuration (configs/{service}.{mode}.ncl)\n```\n\n### 3. 
TOML Export\n\n```\nnu scripts/generate-configs.nu orchestrator solo\n```\n\nExports Nickel config to TOML:\n- `provisioning/platform/config/orchestrator.solo.toml` (consumed by Rust services)\n\n## Deployment Modes\n\n### Solo (2 CPU, 4GB RAM)\n- Single developer/testing\n- Filesystem or embedded database\n- Minimal security\n- All services enabled\n\n### MultiUser (4 CPU, 8GB RAM)\n- Team collaboration, staging\n- PostgreSQL or SurrealDB server\n- RBAC enabled\n- Gitea integration\n\n### CI/CD (8 CPU, 16GB RAM)\n- Automated pipelines, ephemeral\n- API-driven configuration\n- Fast cleanup, minimal storage\n\n### Enterprise (16+ CPU, 32+ GB RAM)\n- Production high availability\n- SurrealDB cluster with replication\n- MFA required, KMS integration\n- Compliance (SOC2/HIPAA)\n\n## Key Components\n\n### Constraints (constraints/constraints.toml)\nSingle source of truth for validation limits across all services. Used for:\n- Form field validation (min/max values)\n- Constraint interpolation in TypeDialog forms\n- Nickel validator bounds checking\n\n### Schemas (schemas/*.ncl)\nType-safe configuration contracts defining:\n- Required/optional fields\n- Valid value types and enums\n- Default values\n- Input/output type signatures\n\n**Organization**:\n- `schemas/common/` - HTTP server, database, security, monitoring, logging\n- `schemas/{orchestrator,control-center,mcp-server,vault-service,extension-registry,rag,ai-service,provisioning-daemon}.ncl` - Service-specific schemas\n- `schemas/deployment/{solo,multiuser,cicd,enterprise}.ncl` - Mode-specific schemas\n\n### Defaults (defaults/*.ncl)\nConfiguration base values composed with mode overlays:\n- `defaults/{service}-defaults.ncl` - Service base defaults\n- `defaults/common/` - Shared defaults (server, database, security)\n- `defaults/deployment/{mode}-defaults.ncl` - Mode-specific value overrides\n\n### Validators (validators/*.ncl)\nBusiness logic validation using constraints:\n- Port range validation (1024-65535)\n- Resource allocation validation (CPU, memory)\n- Workflow/policy validation (service-specific)\n- Cross-field validation\n\n### Configurations (configs/*.ncl)\nGenerated mode-specific Nickel configs (NOT manually edited):\n- `orchestrator.{solo,multiuser,cicd,enterprise}.ncl`\n- `control-center.{solo,multiuser,cicd,enterprise}.ncl`\n- `mcp-server.{solo,multiuser,cicd,enterprise}.ncl`\n- `vault-service.{solo,multiuser,cicd,enterprise}.ncl`\n- `extension-registry.{solo,multiuser,cicd,enterprise}.ncl`\n- `rag.{solo,multiuser,cicd,enterprise}.ncl`\n- `ai-service.{solo,multiuser,cicd,enterprise}.ncl`\n- `provisioning-daemon.{solo,multiuser,cicd,enterprise}.ncl`\n\n### Forms (forms/*.toml)\nTypeDialog form definitions with **flat fragments** referenced by paths:\n- 4 main forms: `{service}-form.toml`\n- Fragments: `fragments/{name}-section.toml` (workspace, server, database, security, monitoring, etc.)\n- CRITICAL: Every form element has `nickel_path` for Nickel structure mapping\n\n**Fragment Organization** (FLAT, referenced by paths):\n- `workspace-section.toml`\n- `server-section.toml`\n- `database-rocksdb-section.toml`\n- `database-surrealdb-section.toml`\n- `database-postgres-section.toml`\n- `security-section.toml`\n- `monitoring-section.toml`\n- `logging-section.toml`\n- `orchestrator-queue-section.toml`\n- `orchestrator-workflow-section.toml`\n- ... 
(service-specific and mode-specific fragments)\n\n### Templates (templates/)\nJinja2 + Nickel templates for automated generation:\n- `{service}-config.ncl.j2` - Nickel output template (critical for TypeDialog nickel-roundtrip)\n- `docker-compose/platform-stack.{mode}.yml.ncl` - Docker Compose templates\n- `kubernetes/{service}-deployment.yaml.ncl` - Kubernetes templates\n\n### Scripts (scripts/)\nNushell orchestration (NuShell 0.109+):\n- `configure.nu` - Interactive TypeDialog wizard (nickel-roundtrip workflow)\n- `generate-configs.nu` - Export Nickel → TOML\n- `validate-config.nu` - Typecheck Nickel configs\n- `render-docker-compose.nu` - Generate Docker Compose files\n- `render-kubernetes.nu` - Generate Kubernetes manifests\n- `install-services.nu` - Deploy platform services\n- `detect-services.nu` - Auto-detect running services\n\n### Examples (examples/)\nReference configurations for different scenarios:\n- `orchestrator-solo.ncl` - Simple development setup\n- `orchestrator-enterprise.ncl` - Complex production setup\n- `full-platform-enterprise.ncl` - Complete enterprise stack\n\n### Values (values/)\nUser configuration directory (gitignored):\n- `{service}.{mode}.ncl` - User customizations (loaded in compose)\n- `.gitignore` - Ignores `*.ncl` files\n- `orchestrator.example.ncl` - Documented example template\n\n## TypeDialog nickel-roundtrip Workflow\n\nCRITICAL: Forms use Jinja2 templates for Nickel generation:\n\n```\n# Command pattern\ntypedialog-web nickel-roundtrip "$CONFIG_FILE" "$FORM_FILE" --output "$CONFIG_FILE" --template "$NCL_TEMPLATE"\n\n# Example\ntypedialog-web nickel-roundtrip \n "provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl" \n "provisioning/.typedialog/provisioning/platform/forms/orchestrator-form.toml" \n --output "provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl" \n --template "provisioning/.typedialog/provisioning/platform/templates/orchestrator-config.ncl.j2"\n```\n\n**Key Requirements**:\n1. **Jinja2 template** (`config.ncl.j2`) - Defines Nickel output structure with conditional `{% if %}` blocks\n2. **nickel_path** in form elements - Maps form fields to Nickel structure paths (e.g., `["orchestrator", "queue", "max_concurrent_tasks"]`)\n3. **Constraint interpolation** - Form limits reference constraints (e.g., `${constraint.orchestrator.queue.concurrent_tasks.max}`)\n4. **Base + overlay composition** - Nickel imports merge defaults + mode overlays + validators\n\n## Usage Workflow\n\n### 1. Configure Service (Interactive)\n\n```\n# Start TypeDialog wizard for orchestrator in solo mode\nnu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo --backend web\n```\n\nWizard:\n1. Loads existing config (if exists) as defaults\n2. Shows form with validated constraints\n3. User edits configuration\n4. Generates updated Nickel config to `provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl`\n\n### 2. Validate Configuration\n\n```\n# Typecheck Nickel config\nnu provisioning/.typedialog/provisioning/platform/scripts/validate-config.nu provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl\n```\n\n### 3. Generate TOML for Rust Services\n\n```\n# Export Nickel → TOML\nnu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo\n```\n\nOutput: `provisioning/platform/config/orchestrator.solo.toml`\n\n### 4. 
Deploy Services\n\n```\n# Install services (Docker Compose or Kubernetes)\nnu provisioning/.typedialog/provisioning/platform/scripts/install-services.nu solo\n```\n\n## Configuration Loading Hierarchy (Rust Services)\n\n```\n1. Environment variables (ORCHESTRATOR_*)\n2. User config (values/{service}.{mode}.ncl → TOML)\n3. Mode-specific defaults (configs/{service}.{mode}.toml)\n4. Service defaults (config/orchestrator.defaults.toml)\n```\n\n## Constraint Interpolation Example\n\n**constraints.toml**:\n\n```\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 100\n```\n\n**Form element** (fragments/orchestrator-queue-section.toml):\n\n```\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\n**Jinja2 template** (orchestrator-config.ncl.j2):\n\n```\norchestrator = {\n queue = {\n {%- if max_concurrent_tasks %}\n max_concurrent_tasks = {{ max_concurrent_tasks }},\n {%- endif %}\n },\n}\n```\n\n## Getting Started\n\n1. **Run configuration wizard**:\n\n ```bash\n nu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo\n ```\n\n2. **Generate TOML configs**:\n\n ```bash\n nu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo\n ```\n\n3. **Deploy services**:\n\n ```bash\n nu provisioning/.typedialog/provisioning/platform/scripts/install-services.nu solo\n ```\n\n## Documentation\n\n- `constraints/README.md` - How to modify validation constraints\n- `schemas/README.md` - Schema patterns and imports\n- `defaults/README.md` - Defaults composition and merging strategy\n- `validators/README.md` - Validator patterns and error handling\n- `forms/README.md` - Form structure and fragment organization\n- `forms/fragments/README.md` - Fragment usage and nickel_path mapping\n- `scripts/README.md` - Script usage and dependencies\n- `examples/README.md` - Example deployment scenarios\n- `templates/README.md` - Template patterns and interpolation\n\n## Key Files\n\n| File | Purpose |\n| ------ | --------- |\n| `constraints/constraints.toml` | Single source of truth for validation limits |\n| `schemas/orchestrator.ncl` | Orchestrator type schema |\n| `defaults/orchestrator-defaults.ncl` | Orchestrator default values |\n| `validators/orchestrator-validator.ncl` | Orchestrator validation logic |\n| `configs/orchestrator.solo.ncl` | Generated solo mode config |\n| `forms/orchestrator-form.toml` | Orchestrator form definition |\n| `templates/orchestrator-config.ncl.j2` | Nickel output template |\n| `scripts/configure.nu` | Interactive configuration wizard |\n| `scripts/generate-configs.nu` | Nickel → TOML export |\n| `values/orchestrator.solo.ncl` | User configuration (gitignored) |\n\n## Tools Required\n\n- **Nickel** (0.10+) - Configuration language\n- **TypeDialog** - Interactive form backend\n- **NuShell** (0.109+) - Script orchestration\n- **Jinja2/tera** - Template rendering (via nu_plugin_tera)\n- **TOML** - Config file format (for Rust services)\n\n## Notes\n\n- Configuration files in `values/` are **gitignored** (user-specific)\n- Generated configs in `configs/` are composed automatically (not hand-edited)\n- Each mode (solo/multiuser/cicd/enterprise) has different resource defaults\n- Fragments are **flat** in `forms/fragments/` and referenced by paths in form definitions\n- All form elements must have `nickel_path` for proper 
Nickel structure mapping\n- Constraint interpolation enables dynamic form validation based on service requirements\n\n---\n\n**Version**: 1.0.0\n**Created**: 2025-01-05\n**Last Updated**: 2025-01-05 \ No newline at end of file
+# TypeDialog + Nickel Configuration System for Platform Services
+
+Complete configuration system for provisioning platform services (orchestrator, control-center, mcp-server, vault-service,
+extension-registry, rag, ai-service, provisioning-daemon) across multiple deployment modes (solo, multiuser, cicd, enterprise).
+
+## Architecture Overview
+
+This system implements a **TypeDialog + Nickel configuration workflow** that provides:
+
+- **Type-safe configuration** via Nickel schemas with validation
+- **Interactive configuration** via TypeDialog forms with real-time constraint validation
+- **Multi-mode deployment** (solo/multiuser/cicd/enterprise) with mode-specific defaults
+- **Configuration composition** (base defaults + mode overlays + user customization + validation)
+- **Automated TOML export** for Rust service consumption
+- **Docker Compose + Kubernetes templates** for infrastructure deployment
+
+## Directory Structure
+
+```text
+provisioning/.typedialog/provisioning/platform/
+├── constraints/          # Single source of truth for validation limits
+├── schemas/              # Nickel type contracts (services + common + deployment modes)
+├── defaults/             # Default configuration values (services + common + deployment modes)
+├── validators/           # Validation logic (constraints, ranges, business rules)
+├── configs/              # Generated mode-specific Nickel configurations (one per service × mode)
+├── forms/                # TypeDialog form definitions (main forms + flat fragments)
+│   └── fragments/        # Reusable form fragments (workspace, server, database, etc.)
+├── templates/            # Jinja2 + Nickel templates for config/deployment generation
+│   ├── docker-compose/   # Docker Compose templates (solo/multiuser/cicd/enterprise)
+│   ├── kubernetes/       # Kubernetes deployment templates
+│   └── configs/          # Service configuration templates (TOML generation)
+├── scripts/              # Nushell orchestration scripts (configure, generate, validate, deploy)
+├── examples/             # Example configurations for different deployment scenarios
+└── values/               # User configuration files (gitignored *.ncl)
+```
+
+## Configuration Workflow
+
+### 1. User Interaction (TypeDialog)
+
+```nushell
+nu scripts/configure.nu orchestrator solo --backend web
+```
+
+- Launches interactive form (web/tui/cli)
+- Loads existing config as default values (if exists)
+- Validates user input against constraints
+- Generates updated Nickel config
+
+### 2. Configuration Composition
+
+```text
+Base Defaults (defaults/*.ncl)
+  ↓
++ Mode Overlay (defaults/deployment/{mode}-defaults.ncl)
+  ↓
++ User Customization (values/{service}.{mode}.ncl)
+  ↓
++ Schema Validation (schemas/*.ncl)
+  ↓
++ Constraint Validation (validators/*.ncl)
+  ↓
+= Final Configuration (configs/{service}.{mode}.ncl)
+```
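+
+To inspect what this composition actually evaluates to at any point, the composed config can be queried directly with Nickel. A minimal sketch, assuming the `configs/` path used throughout this README:
+
+```bash
+# Evaluate the fully composed solo-mode orchestrator config
+nickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+
+# Export as JSON and drill into a single section with jq
+nickel export --format json \
+  provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl \
+  | jq '.orchestrator.server'
+```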
+
+### 3. TOML Export
+
+```nushell
+nu scripts/generate-configs.nu orchestrator solo
+```
+
+Exports Nickel config to TOML:
+- `provisioning/platform/config/orchestrator.solo.toml` (consumed by Rust services)
+
+## Deployment Modes
+
+### Solo (2 CPU, 4GB RAM)
+- Single developer/testing
+- Filesystem or embedded database
+- Minimal security
+- All services enabled
+
+### MultiUser (4 CPU, 8GB RAM)
+- Team collaboration, staging
+- PostgreSQL or SurrealDB server
+- RBAC enabled
+- Gitea integration
+
+### CI/CD (8 CPU, 16GB RAM)
+- Automated pipelines, ephemeral
+- API-driven configuration
+- Fast cleanup, minimal storage
+
+### Enterprise (16+ CPU, 32+ GB RAM)
+- Production high availability
+- SurrealDB cluster with replication
+- MFA required, KMS integration
+- Compliance (SOC2/HIPAA)
+
+## Key Components
+
+### Constraints (constraints/constraints.toml)
+Single source of truth for validation limits across all services. Used for:
+- Form field validation (min/max values)
+- Constraint interpolation in TypeDialog forms
+- Nickel validator bounds checking
+
+### Schemas (schemas/*.ncl)
+Type-safe configuration contracts defining:
+- Required/optional fields
+- Valid value types and enums
+- Default values
+- Input/output type signatures
+
+**Organization**:
+- `schemas/common/` - HTTP server, database, security, monitoring, logging
+- `schemas/{orchestrator,control-center,mcp-server,vault-service,extension-registry,rag,ai-service,provisioning-daemon}.ncl` - Service-specific schemas
+- `schemas/deployment/{solo,multiuser,cicd,enterprise}.ncl` - Mode-specific schemas
+
+### Defaults (defaults/*.ncl)
+Configuration base values composed with mode overlays:
+- `defaults/{service}-defaults.ncl` - Service base defaults
+- `defaults/common/` - Shared defaults (server, database, security)
+- `defaults/deployment/{mode}-defaults.ncl` - Mode-specific value overrides
+
+### Validators (validators/*.ncl)
+Business logic validation using constraints:
+- Port range validation (1024-65535)
+- Resource allocation validation (CPU, memory)
+- Workflow/policy validation (service-specific)
+- Cross-field validation
+
+### Configurations (configs/*.ncl)
+Generated mode-specific Nickel configs (NOT manually edited):
+- `orchestrator.{solo,multiuser,cicd,enterprise}.ncl`
+- `control-center.{solo,multiuser,cicd,enterprise}.ncl`
+- `mcp-server.{solo,multiuser,cicd,enterprise}.ncl`
+- `vault-service.{solo,multiuser,cicd,enterprise}.ncl`
+- `extension-registry.{solo,multiuser,cicd,enterprise}.ncl`
+- `rag.{solo,multiuser,cicd,enterprise}.ncl`
+- `ai-service.{solo,multiuser,cicd,enterprise}.ncl`
+- `provisioning-daemon.{solo,multiuser,cicd,enterprise}.ncl`
+
+### Forms (forms/*.toml)
+TypeDialog form definitions with **flat fragments** referenced by paths:
+- Main forms: `{service}-form.toml` (one per service)
+- Fragments: `fragments/{name}-section.toml` (workspace, server, database, security, monitoring, etc.)
+- CRITICAL: Every form element has `nickel_path` for Nickel structure mapping
+
+**Fragment Organization** (FLAT, referenced by paths):
+- `workspace-section.toml`
+- `server-section.toml`
+- `database-rocksdb-section.toml`
+- `database-surrealdb-section.toml`
+- `database-postgres-section.toml`
+- `security-section.toml`
+- `monitoring-section.toml`
+- `logging-section.toml`
+- `orchestrator-queue-section.toml`
+- `orchestrator-workflow-section.toml`
+- ... (service-specific and mode-specific fragments)
+
+### Templates (templates/)
+Jinja2 + Nickel templates for automated generation:
+- `{service}-config.ncl.j2` - Nickel output template (critical for TypeDialog nickel-roundtrip)
+- `docker-compose/platform-stack.{mode}.yml.ncl` - Docker Compose templates
+- `kubernetes/{service}-deployment.yaml.ncl` - Kubernetes templates
+
+### Scripts (scripts/)
+Nushell orchestration (Nushell 0.109+):
+- `configure.nu` - Interactive TypeDialog wizard (nickel-roundtrip workflow)
+- `generate-configs.nu` - Export Nickel → TOML
+- `validate-config.nu` - Typecheck Nickel configs
+- `render-docker-compose.nu` - Generate Docker Compose files
+- `render-kubernetes.nu` - Generate Kubernetes manifests
+- `install-services.nu` - Deploy platform services
+- `detect-services.nu` - Auto-detect running services
+
+### Examples (examples/)
+Reference configurations for different scenarios:
+- `orchestrator-solo.ncl` - Simple development setup
+- `orchestrator-enterprise.ncl` - Complex production setup
+- `full-platform-enterprise.ncl` - Complete enterprise stack
+
+### Values (values/)
+User configuration directory (gitignored):
+- `{service}.{mode}.ncl` - User customizations (loaded in compose)
+- `.gitignore` - Ignores `*.ncl` files
+- `orchestrator.example.ncl` - Documented example template
+
+## TypeDialog nickel-roundtrip Workflow
+
+CRITICAL: Forms use Jinja2 templates for Nickel generation:
+
+```bash
+# Command pattern
+typedialog-web nickel-roundtrip "$CONFIG_FILE" "$FORM_FILE" --output "$CONFIG_FILE" --template "$NCL_TEMPLATE"
+
+# Example
+typedialog-web nickel-roundtrip \
+  "provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl" \
+  "provisioning/.typedialog/provisioning/platform/forms/orchestrator-form.toml" \
+  --output "provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl" \
+  --template "provisioning/.typedialog/provisioning/platform/templates/orchestrator-config.ncl.j2"
+```
+
+**Key Requirements**:
+1. **Jinja2 template** (`config.ncl.j2`) - Defines Nickel output structure with conditional `{% if %}` blocks
+2. **nickel_path** in form elements - Maps form fields to Nickel structure paths (e.g., `["orchestrator", "queue", "max_concurrent_tasks"]`)
+3. **Constraint interpolation** - Form limits reference constraints (e.g., `${constraint.orchestrator.queue.concurrent_tasks.max}`)
+4. **Base + overlay composition** - Nickel imports merge defaults + mode overlays + validators
+
+## Usage Workflow
+
+### 1. Configure Service (Interactive)
+
+```bash
+# Start TypeDialog wizard for orchestrator in solo mode
+nu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo --backend web
+```
+
+Wizard:
+1. Loads existing config (if exists) as defaults
+2. Shows form with validated constraints
+3. User edits configuration
+4. Generates updated Nickel config to `provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl`
+
+### 2. Validate Configuration
+
+```bash
+# Typecheck Nickel config
+nu provisioning/.typedialog/provisioning/platform/scripts/validate-config.nu provisioning/.typedialog/provisioning/platform/values/orchestrator.solo.ncl
+```
+
+### 3. Generate TOML for Rust Services
+
+```bash
+# Export Nickel → TOML
+nu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo
+```
+
+Output: `provisioning/platform/config/orchestrator.solo.toml`
+
+### 4. Deploy Services
+
+```bash
+# Install services (Docker Compose or Kubernetes)
+nu provisioning/.typedialog/provisioning/platform/scripts/install-services.nu solo
+```
+
+## Configuration Loading Hierarchy (Rust Services)
+
+```text
+1. Environment variables (ORCHESTRATOR_*)
+2. User config (values/{service}.{mode}.ncl → TOML)
+3. Mode-specific defaults (configs/{service}.{mode}.toml)
+4. Service defaults (config/orchestrator.defaults.toml)
+```
+
+## Constraint Interpolation Example
+
+**constraints.toml**:
+
+```toml
+[orchestrator.queue.concurrent_tasks]
+min = 1
+max = 100
+```
+
+**Form element** (fragments/orchestrator-queue-section.toml):
+
+```toml
+[[elements]]
+name = "max_concurrent_tasks"
+type = "number"
+min = "${constraint.orchestrator.queue.concurrent_tasks.min}"
+max = "${constraint.orchestrator.queue.concurrent_tasks.max}"
+nickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]
+```
+
+**Jinja2 template** (orchestrator-config.ncl.j2):
+
+```jinja2
+orchestrator = {
+  queue = {
+    {%- if max_concurrent_tasks %}
+    max_concurrent_tasks = {{ max_concurrent_tasks }},
+    {%- endif %}
+  },
+}
+```
+
+## Getting Started
+
+1. **Run configuration wizard**:
+
+   ```bash
+   nu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo
+   ```
+
+2. **Generate TOML configs**:
+
+   ```bash
+   nu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo
+   ```
+
+3. **Deploy services**:
+
+   ```bash
+   nu provisioning/.typedialog/provisioning/platform/scripts/install-services.nu solo
+   ```
+
+## Documentation
+
+- `constraints/README.md` - How to modify validation constraints
+- `schemas/README.md` - Schema patterns and imports
+- `defaults/README.md` - Defaults composition and merging strategy
+- `validators/README.md` - Validator patterns and error handling
+- `forms/README.md` - Form structure and fragment organization
+- `forms/fragments/README.md` - Fragment usage and nickel_path mapping
+- `scripts/README.md` - Script usage and dependencies
+- `examples/README.md` - Example deployment scenarios
+- `templates/README.md` - Template patterns and interpolation
+
+## Key Files
+
+| File | Purpose |
+| ------ | --------- |
+| `constraints/constraints.toml` | Single source of truth for validation limits |
+| `schemas/orchestrator.ncl` | Orchestrator type schema |
+| `defaults/orchestrator-defaults.ncl` | Orchestrator default values |
+| `validators/orchestrator-validator.ncl` | Orchestrator validation logic |
+| `configs/orchestrator.solo.ncl` | Generated solo mode config |
+| `forms/orchestrator-form.toml` | Orchestrator form definition |
+| `templates/orchestrator-config.ncl.j2` | Nickel output template |
+| `scripts/configure.nu` | Interactive configuration wizard |
+| `scripts/generate-configs.nu` | Nickel → TOML export |
+| `values/orchestrator.solo.ncl` | User configuration (gitignored) |
+
+## Tools Required
+
+- **Nickel** (0.10+) - Configuration language
+- **TypeDialog** - Interactive form backend
+- **Nushell** (0.109+) - Script orchestration
+- **Jinja2/Tera** - Template rendering (via nu_plugin_tera)
+- **TOML** - Config file format (for Rust services)
+
+## Notes
+
+- Configuration files in `values/` are **gitignored** (user-specific)
+- Generated configs in `configs/` are composed automatically (not hand-edited)
+- Each mode (solo/multiuser/cicd/enterprise) has different resource defaults
+- Fragments are **flat** in `forms/fragments/` and referenced by paths in form definitions
+- All form elements must have 
`nickel_path` for proper Nickel structure mapping +- Constraint interpolation enables dynamic form validation based on service requirements + +--- + +**Version**: 1.0.0 +**Created**: 2025-01-05 +**Last Updated**: 2025-01-05 \ No newline at end of file diff --git a/schemas/platform/configs/README.md b/schemas/platform/configs/README.md index 5c3c0dc..e16d3c1 100644 --- a/schemas/platform/configs/README.md +++ b/schemas/platform/configs/README.md @@ -1 +1,320 @@ -# Configurations\n\nMode-specific Nickel configurations for all services (NOT manually edited).\n\n## Purpose\n\nConfigurations are **automatically generated** by composing:\n1. Service base defaults (defaults/{service}-defaults.ncl)\n2. Mode overlay (defaults/deployment/{mode}-defaults.ncl)\n3. User customization (values/{service}.{mode}.ncl)\n4. Schema validation (schemas/{service}.ncl)\n5. Constraint validation (validators/{service}-validator.ncl)\n\n## File Organization\n\n```\nconfigs/\n├── README.md # This file\n├── orchestrator.solo.ncl # Orchestrator solo mode\n├── orchestrator.multiuser.ncl # Orchestrator multi-user mode\n├── orchestrator.cicd.ncl # Orchestrator CI/CD mode\n├── orchestrator.enterprise.ncl # Orchestrator enterprise mode\n├── control-center.solo.ncl\n├── control-center.multiuser.ncl\n├── control-center.cicd.ncl\n├── control-center.enterprise.ncl\n├── mcp-server.solo.ncl\n├── mcp-server.multiuser.ncl\n├── mcp-server.cicd.ncl\n├── mcp-server.enterprise.ncl\n├── installer.solo.ncl\n├── installer.multiuser.ncl\n├── installer.cicd.ncl\n└── installer.enterprise.ncl\n```\n\n## Configuration Composition\n\nEach config is built from layers:\n\n```\n# configs/orchestrator.solo.ncl\nlet schemas = import "../schemas/orchestrator.ncl" in\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\nlet solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in\nlet validators = import "../validators/orchestrator-validator.ncl" in\n\n{\n # Merge: base defaults + mode overrides + user customization\n orchestrator = defaults.orchestrator & solo_defaults.services.orchestrator & {\n # User customization goes here (from values/orchestrator.solo.ncl)\n },\n} | schemas.OrchestratorConfig # Apply schema validation\n```\n\n## Example Configuration\n\n### Base Defaults\n\n```\n# defaults/orchestrator-defaults.ncl\norchestrator = {\n workspace = {\n name = "default",\n path = "/var/lib/provisioning/orchestrator",\n enabled = true,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 4,\n },\n queue = {\n max_concurrent_tasks = 5,\n },\n}\n```\n\n### Solo Mode Override\n\n```\n# defaults/deployment/solo-defaults.ncl\nservices.orchestrator = {\n workers = 2, # Fewer workers\n queue_max_concurrent_tasks = 3, # Limited concurrency\n storage_backend = 'filesystem,\n}\n```\n\n### Generated Config\n\n```\n# configs/orchestrator.solo.ncl (auto-generated)\n{\n orchestrator = {\n workspace = {\n name = "default", # From base defaults\n path = "/var/lib/provisioning/orchestrator",\n enabled = true,\n },\n server = {\n host = "127.0.0.1", # From base defaults\n port = 9090, # From base defaults\n workers = 2, # OVERRIDDEN by solo mode\n },\n queue = {\n max_concurrent_tasks = 3, # OVERRIDDEN by solo mode\n },\n },\n}\n```\n\n## Updating Configurations\n\n**DO NOT manually edit** configs/ files. Instead:\n\n1. **Modify service defaults** (defaults/{service}-defaults.ncl)\n2. **Modify mode overrides** (defaults/deployment/{mode}-defaults.ncl)\n3. **Modify user values** (values/{service}.{mode}.ncl)\n4. 
**Regenerate configs** (via TypeDialog or manual rebuild)\n\n### Regenerating Configs\n\n#### Via TypeDialog (Recommended)\n\n```\nnu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo\n```\n\nAutomatically:\n1. Loads existing config as defaults\n2. Shows form with validated constraints\n3. User edits configuration\n4. Generates updated config\n\n#### Manual Rebuild\n\n```\n# (Future) Script to rebuild all configs from sources\nnu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo\n```\n\n## Config Types\n\n### Orchestrator (Workflow Engine)\n- Workspace configuration\n- Server settings\n- Storage backend (filesystem, RocksDB, SurrealDB)\n- Queue configuration (concurrency, retries, timeout)\n- Batch workflow settings\n- Optional: monitoring, rollback, extensions\n\n### Control Center (Policy/RBAC)\n- Workspace configuration\n- Server settings\n- Database configuration\n- Security (JWT, RBAC, encryption)\n- Optional: compliance, audit logging\n\n### MCP Server (Protocol Server)\n- Workspace configuration\n- Server settings\n- MCP capabilities (tools, prompts, resources)\n- Optional: custom tools, resource limits\n\n### Installer (Setup Automation)\n- Target configuration\n- Provider settings\n- Pre-flight checks\n- Installation options\n\n## Configuration Values Hierarchy\n\n```\n1. Explicit user customization (values/{service}.{mode}.ncl)\n2. Mode-specific defaults (defaults/deployment/{mode}-defaults.ncl)\n3. Service base defaults (defaults/{service}-defaults.ncl)\n4. Common shared defaults (defaults/common/*.ncl)\n```\n\n## Validation Levels\n\nConfigurations are validated at three levels:\n\n### 1. Schema Validation\nType checking when config is evaluated:\n\n```\n| schemas.OrchestratorConfig\n```\n\n### 2. Constraint Validation\nRange checking via validators:\n\n```\nmax_concurrent_tasks = validators.ValidConcurrentTasks 5\n```\n\n### 3. 
Business Logic Validation\nService-specific rules in validators.\n\n## Usage in Rust Services\n\nConfigs are exported to TOML for Rust services:\n\n```\n# Generate TOML\nnu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo\n\n# Output: provisioning/platform/config/orchestrator.solo.toml\n```\n\nRust services load the TOML:\n\n```\nlet config_path = "provisioning/platform/config/orchestrator.solo.toml";\nlet config = Config::from_file(config_path)?;\n```\n\n## Deployment Mode Specifics\n\n### Solo Mode Config\n- Minimal resources (2 CPU, 4GB)\n- Filesystem storage (no DB infrastructure)\n- Single worker, low concurrency\n- Simplified security (no MFA)\n\n### MultiUser Mode Config\n- Team resources (4 CPU, 8GB)\n- PostgreSQL or SurrealDB\n- Moderate concurrency (4-8 workers)\n- RBAC enabled\n\n### CI/CD Mode Config\n- Ephemeral (cleanup after run)\n- API-driven (no UI/forms)\n- High concurrency (8+ workers)\n- Minimal security overhead\n\n### Enterprise Mode Config\n- Production HA (16+ CPU, 32+ GB)\n- SurrealDB cluster with replication\n- High concurrency (16+ workers)\n- Full security (MFA, KMS, compliance)\n\n## Testing Configurations\n\n```\n# Typecheck a config\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Evaluate and view\nnickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl | head -50\n\n# Export to TOML\nnickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Export to JSON\nnickel export --format json provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n```\n\n## Configuration Merge Example\n\n```\n# Base\n{\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 4,\n },\n}\n\n# + Mode override\n& {\n server.workers = 2,\n}\n\n# = Result\n{\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2, # OVERRIDDEN\n },\n}\n```\n\nNickel's `&` operator is a **shallow merge** - only top-level fields are replaced, deeper nesting is preserved.\n\n## Generated Config Structure\n\nAll generated configs follow this structure:\n\n```\n# Service config\n{\n {service} = {\n # Workspace\n workspace = { ... },\n\n # Server\n server = { ... },\n\n # Storage/Database\n [storage | database] = { ... },\n\n # Service-specific\n [queue | rbac | capabilities] = { ... },\n\n # Optional\n [monitoring | security | compliance] = { ... },\n },\n}\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 +# Configurations + +Mode-specific Nickel configurations for all services (NOT manually edited). + +## Purpose + +Configurations are **automatically generated** by composing: +1. Service base defaults (defaults/{service}-defaults.ncl) +2. Mode overlay (defaults/deployment/{mode}-defaults.ncl) +3. User customization (values/{service}.{mode}.ncl) +4. Schema validation (schemas/{service}.ncl) +5. 
Constraint validation (validators/{service}-validator.ncl)
+
+## File Organization
+
+```text
+configs/
+├── README.md                      # This file
+├── orchestrator.solo.ncl          # Orchestrator solo mode
+├── orchestrator.multiuser.ncl     # Orchestrator multi-user mode
+├── orchestrator.cicd.ncl          # Orchestrator CI/CD mode
+├── orchestrator.enterprise.ncl    # Orchestrator enterprise mode
+├── control-center.solo.ncl
+├── control-center.multiuser.ncl
+├── control-center.cicd.ncl
+├── control-center.enterprise.ncl
+├── mcp-server.solo.ncl
+├── mcp-server.multiuser.ncl
+├── mcp-server.cicd.ncl
+├── mcp-server.enterprise.ncl
+├── installer.solo.ncl
+├── installer.multiuser.ncl
+├── installer.cicd.ncl
+└── installer.enterprise.ncl
+```
+
+## Configuration Composition
+
+Each config is built from layers:
+
+```nickel
+# configs/orchestrator.solo.ncl
+let schemas = import "../schemas/orchestrator.ncl" in
+let defaults = import "../defaults/orchestrator-defaults.ncl" in
+let solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in
+let validators = import "../validators/orchestrator-validator.ncl" in
+
+{
+  # Merge: base defaults + mode overrides + user customization
+  orchestrator = defaults.orchestrator & solo_defaults.services.orchestrator & {
+    # User customization goes here (from values/orchestrator.solo.ncl)
+  },
+} | schemas.OrchestratorConfig  # Apply schema validation
+```
+
+## Example Configuration
+
+### Base Defaults
+
+```nickel
+# defaults/orchestrator-defaults.ncl
+orchestrator = {
+  workspace = {
+    name = "default",
+    path = "/var/lib/provisioning/orchestrator",
+    enabled = true,
+  },
+  server = {
+    host = "127.0.0.1",
+    port = 9090,
+    workers = 4,
+  },
+  queue = {
+    max_concurrent_tasks = 5,
+  },
+}
+```
+
+### Solo Mode Override
+
+```nickel
+# defaults/deployment/solo-defaults.ncl
+services.orchestrator = {
+  workers = 2,                     # Fewer workers
+  queue_max_concurrent_tasks = 3,  # Limited concurrency
+  storage_backend = 'filesystem,
+}
+```
+
+### Generated Config
+
+```nickel
+# configs/orchestrator.solo.ncl (auto-generated)
+{
+  orchestrator = {
+    workspace = {
+      name = "default",                # From base defaults
+      path = "/var/lib/provisioning/orchestrator",
+      enabled = true,
+    },
+    server = {
+      host = "127.0.0.1",              # From base defaults
+      port = 9090,                     # From base defaults
+      workers = 2,                     # OVERRIDDEN by solo mode
+    },
+    queue = {
+      max_concurrent_tasks = 3,        # OVERRIDDEN by solo mode
+    },
+  },
+}
+```
+
+## Updating Configurations
+
+**DO NOT manually edit** configs/ files. Instead:
+
+1. **Modify service defaults** (defaults/{service}-defaults.ncl)
+2. **Modify mode overrides** (defaults/deployment/{mode}-defaults.ncl)
+3. **Modify user values** (values/{service}.{mode}.ncl)
+4. **Regenerate configs** (via TypeDialog or manual rebuild)
+
+### Regenerating Configs
+
+#### Via TypeDialog (Recommended)
+
+```nushell
+nu provisioning/.typedialog/provisioning/platform/scripts/configure.nu orchestrator solo
+```
+
+Automatically:
+1. Loads existing config as defaults
+2. Shows form with validated constraints
+3. User edits configuration
+4. Generates updated config
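+
+For reference, the user-customization layer that regeneration writes is itself a small Nickel record. A plausible shape, using the field names from the examples above (illustrative only; the real file is produced by the form):
+
+```nickel
+# values/orchestrator.solo.ncl (illustrative sketch; regenerated by TypeDialog)
+{
+  orchestrator = {
+    server.workers = 2,
+    queue.max_concurrent_tasks = 3,
+  },
+}
+```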
+
+#### Manual Rebuild
+
+```bash
+# (Future) Script to rebuild all configs from sources
+nu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo
+```
+
+## Config Types
+
+### Orchestrator (Workflow Engine)
+- Workspace configuration
+- Server settings
+- Storage backend (filesystem, RocksDB, SurrealDB)
+- Queue configuration (concurrency, retries, timeout)
+- Batch workflow settings
+- Optional: monitoring, rollback, extensions
+
+### Control Center (Policy/RBAC)
+- Workspace configuration
+- Server settings
+- Database configuration
+- Security (JWT, RBAC, encryption)
+- Optional: compliance, audit logging
+
+### MCP Server (Protocol Server)
+- Workspace configuration
+- Server settings
+- MCP capabilities (tools, prompts, resources)
+- Optional: custom tools, resource limits
+
+### Installer (Setup Automation)
+- Target configuration
+- Provider settings
+- Pre-flight checks
+- Installation options
+
+## Configuration Values Hierarchy
+
+```text
+1. Explicit user customization (values/{service}.{mode}.ncl)
+2. Mode-specific defaults (defaults/deployment/{mode}-defaults.ncl)
+3. Service base defaults (defaults/{service}-defaults.ncl)
+4. Common shared defaults (defaults/common/*.ncl)
+```
+
+## Validation Levels
+
+Configurations are validated at three levels:
+
+### 1. Schema Validation
+Type checking when config is evaluated:
+
+```nickel
+| schemas.OrchestratorConfig
+```
+
+### 2. Constraint Validation
+Range checking via validators:
+
+```nickel
+max_concurrent_tasks = validators.ValidConcurrentTasks 5
+```
+
+### 3. Business Logic Validation
+Service-specific rules in validators.
+
+## Usage in Rust Services
+
+Configs are exported to TOML for Rust services:
+
+```bash
+# Generate TOML
+nu provisioning/.typedialog/provisioning/platform/scripts/generate-configs.nu orchestrator solo
+
+# Output: provisioning/platform/config/orchestrator.solo.toml
+```
+
+Rust services load the TOML:
+
+```rust
+let config_path = "provisioning/platform/config/orchestrator.solo.toml";
+let config = Config::from_file(config_path)?;
+```
+
+## Deployment Mode Specifics
+
+### Solo Mode Config
+- Minimal resources (2 CPU, 4GB)
+- Filesystem storage (no DB infrastructure)
+- Single worker, low concurrency
+- Simplified security (no MFA)
+
+### MultiUser Mode Config
+- Team resources (4 CPU, 8GB)
+- PostgreSQL or SurrealDB
+- Moderate concurrency (4-8 workers)
+- RBAC enabled
+
+### CI/CD Mode Config
+- Ephemeral (cleanup after run)
+- API-driven (no UI/forms)
+- High concurrency (8+ workers)
+- Minimal security overhead
+
+### Enterprise Mode Config
+- Production HA (16+ CPU, 32+ GB)
+- SurrealDB cluster with replication
+- High concurrency (16+ workers)
+- Full security (MFA, KMS, compliance)
+
+## Testing Configurations
+
+```bash
+# Typecheck a config
+nickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+
+# Evaluate and view
+nickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl | head -50
+
+# Export to TOML
+nickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+
+# Export to JSON
+nickel export --format json provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+```
+
+## Configuration Merge Example
+
+```nickel
+# Base
+{
+  server = {
+    host = "127.0.0.1",
+    port = 9090,
+    workers = 4,
+  },
+}
+
+# + Mode override
+& {
+  server.workers = 2,
+}
+
+# = Result
+{
+  server = {
+    host = "127.0.0.1",
+    port = 9090,
+    workers = 2,  # OVERRIDDEN
+  },
+}
+```
+
+Nickel's `&` operator is a **recursive (deep) merge**: records are merged field by field at every level, so an override only needs to name the leaves it changes, and sibling fields such as `host` and `port` are preserved.
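+
+The merge semantics are easy to check in isolation. A minimal sketch, assuming the base layer marks its values with `default` priority (without that annotation, merging two definite values for the same field is an error in Nickel):
+
+```nickel
+# merge-demo.ncl (run with: nickel eval merge-demo.ncl)
+let base = {
+  server = {
+    host | default = "127.0.0.1",
+    port | default = 9090,
+    workers | default = 4,
+  },
+} in
+let override = { server.workers = 2 } in
+base & override
+# => { server = { host = "127.0.0.1", port = 9090, workers = 2 } }
+```
+
+Because the merge is recursive, `override` only names the leaves it changes.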
"127.0.0.1", + port = 9090, + workers = 2, # OVERRIDDEN + }, +} +``` + +Nickel's `&` operator is a **shallow merge** - only top-level fields are replaced, deeper nesting is preserved. + +## Generated Config Structure + +All generated configs follow this structure: + +```toml +# Service config +{ + {service} = { + # Workspace + workspace = { ... }, + + # Server + server = { ... }, + + # Storage/Database + [storage | database] = { ... }, + + # Service-specific + [queue | rbac | capabilities] = { ... }, + + # Optional + [monitoring | security | compliance] = { ... }, + }, +} +``` + +--- + +**Version**: 1.0.0 +**Last Updated**: 2025-01-05 \ No newline at end of file diff --git a/schemas/platform/configuration-workflow.md b/schemas/platform/configuration-workflow.md index 06b9fa7..7fc6be6 100644 --- a/schemas/platform/configuration-workflow.md +++ b/schemas/platform/configuration-workflow.md @@ -1 +1,923 @@ -# Configuration Workflow: TypeDialog → Nickel → TOML → Rust\n\nComplete documentation of the configuration pipeline that transforms interactive user input into production Rust service configurations.\n\n## Overview\n\nThe provisioning platform uses a **four-stage configuration workflow** that leverages TypeDialog for interactive configuration,\nNickel for type-safe composition, and TOML for service consumption:\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│ Stage 1: User Interaction (TypeDialog) │\n│ - Can use Nickel configuration as default values │\n│ if use provisioning/platform/config/ it will be updated │\n│ - Interactive form (web/tui/cli) │\n│ - Real-time constraint validation │\n│ - Generates Nickel configuration │\n└────────────────┬────────────────────────────────────────────────┘\n │\n ▼\n┌─────────────────────────────────────────────────────────────────┐\n│ Stage 2: Composition (Nickel) │\n│ - Base defaults imported │\n│ - Mode overlay applied │\n│ - Validators enforce business rules │\n│ - Produces Nickel config file │\n└────────────────┬────────────────────────────────────────────────┘\n │\n ▼\n┌─────────────────────────────────────────────────────────────────┐\n│ Stage 3: Export (Nickel → TOML) │\n│ - Nickel config evaluated │\n│ - Exported to TOML format │\n│ - Saved to provisioning/platform/config/ │\n└────────────────┬────────────────────────────────────────────────┘\n │\n ▼\n┌─────────────────────────────────────────────────────────────────┐\n│ Stage 4: Runtime (Rust Services) │\n│ - Services load TOML configuration │\n│ - Environment variables override specific values │\n│ - Start services with final configuration │\n└─────────────────────────────────────────────────────────────────┘\n```\n\n---\n\n## Stage 1: User Interaction (TypeDialog)\n\n### Purpose\n\nCollect configuration from users through an interactive, constraint-aware interface.\n\n### Workflow\n\n```\n# Launch interactive configuration wizard\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n### What Happens\n\n1. **Form Loads**\n - TypeDialog reads `forms/orchestrator-form.toml`\n - Form displays configuration sections\n - Constraints from `constraints.toml` enforce min/max values\n - Environment variables populate initial defaults\n\n2. **User Interaction**\n - User fills in form fields (workspace name, server port, etc.)\n - Real-time validation on each field\n - Constraint interpolation shows valid ranges:\n - `${constraint.orchestrator.workers.min}` → `1`\n - `${constraint.orchestrator.workers.max}` → `32`\n\n3. 
**Configuration Submission**\n - User submits form\n - TypeDialog validates all fields against schemas\n - Generates Nickel configuration output\n\n4. **Output Generation**\n - Nickel config saved to `values/{service}.{mode}.ncl`\n - Example: `values/orchestrator.solo.ncl`\n - File becomes source of truth for user customizations\n\n### Form Structure Example\n\n```\n# forms/orchestrator-form.toml\nname = "orchestrator_configuration"\ndescription = "Configure orchestrator service"\n\n[[items]]\nname = "workspace_group"\ntype = "group"\nincludes = ["fragments/workspace-section.toml"]\n\n[[items]]\nname = "server_group"\ntype = "group"\nincludes = ["fragments/server-section.toml"]\n\n[[items]]\nname = "queue_group"\ntype = "group"\nincludes = ["fragments/orchestrator/queue-section.toml"]\n```\n\n### Fragment with Constraint Interpolation\n\n```\n# forms/fragments/orchestrator/queue-section.toml\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nprompt = "Maximum Concurrent Tasks"\ndefault = 5\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\nrequired = true\nhelp = "Range: ${constraint.orchestrator.queue.concurrent_tasks.min}-${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\n### Generated Nickel Output (from TypeDialog)\n\nTypeDialog's `nickel-roundtrip` pattern generates:\n\n```\n# values/orchestrator.solo.ncl\n# Auto-generated by TypeDialog\n{\n orchestrator = {\n workspace = {\n name = "dev-workspace",\n path = "/home/developer/provisioning/data/orchestrator",\n enabled = true,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2,\n },\n queue = {\n max_concurrent_tasks = 3,\n retry_attempts = 2,\n retry_delay = 1000,\n },\n },\n}\n```\n\n---\n\n## Stage 2: Composition (Nickel)\n\n### Purpose\n\nCompose the user input with defaults, validators, and schemas to create a complete, validated configuration.\n\n### Workflow\n\n```\n# The nickel typecheck command validates the composition\nnickel typecheck values/orchestrator.solo.ncl\n```\n\n### Composition Layers\n\nThe final configuration is built by merging layers in priority order:\n\n#### Layer 1: Schema Import\n\n```\n# Ensures type safety and required fields\nlet schemas = import "../schemas/orchestrator.ncl" in\n```\n\n#### Layer 2: Base Defaults\n\n```\n# Default values for all orchestrator configurations\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\n```\n\n#### Layer 3: Mode Overlay\n\n```\n# Solo-specific overrides and adjustments\nlet solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in\n```\n\n#### Layer 4: Validators Import\n\n```\n# Business rule validation (ranges, uniqueness, dependencies)\nlet validators = import "../validators/orchestrator-validator.ncl" in\n```\n\n#### Layer 5: User Values\n\n```\n# User input from TypeDialog (values/orchestrator.solo.ncl)\n# Loaded and merged with defaults\n```\n\n### Composition Example\n\n```\n# configs/orchestrator.solo.ncl (generated composition)\n\nlet schemas = import "../schemas/orchestrator.ncl" in\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\nlet solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in\nlet validators = import "../validators/orchestrator-validator.ncl" in\n\n# Composition: Base defaults + mode overlay + user input\n{\n orchestrator = defaults.orchestrator & {\n # User input from TypeDialog 
values/orchestrator.solo.ncl\n workspace = {\n name = "dev-workspace",\n path = "/home/developer/provisioning/data/orchestrator",\n },\n\n # Solo mode overrides\n server = {\n workers = validators.ValidWorkers 2,\n max_connections = 128,\n },\n\n queue = {\n max_concurrent_tasks = validators.ValidConcurrentTasks 3,\n },\n\n # Fallback to defaults for unspecified fields\n },\n} | schemas.OrchestratorConfig # Validate against schema\n```\n\n### Validation During Composition\n\nEach field is validated through multiple validation layers:\n\n```\n# validators/orchestrator-validator.ncl\nlet constraints = import "../constraints/constraints.toml" in\n\n{\n # Validate workers within allowed range\n ValidWorkers = fun workers =>\n if workers < constraints.orchestrator.workers.min then\n error "Workers below minimum"\n else if workers > constraints.orchestrator.workers.max then\n error "Workers above maximum"\n else\n workers,\n\n # Validate concurrent tasks\n ValidConcurrentTasks = fun tasks =>\n if tasks < constraints.orchestrator.queue.concurrent_tasks.min then\n error "Tasks below minimum"\n else if tasks > constraints.orchestrator.queue.concurrent_tasks.max then\n error "Tasks above maximum"\n else\n tasks,\n}\n```\n\n### Constraints: Single Source of Truth\n\n```\n# constraints/constraints.toml\n[orchestrator.workers]\nmin = 1\nmax = 32\n\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 100\n\n[common.server.port]\nmin = 1024\nmax = 65535\n```\n\nThese values are referenced in:\n- Form constraints (constraint interpolation)\n- Validators (ValidWorkers, ValidConcurrentTasks)\n- Default values (appropriate for each mode)\n\n---\n\n## Stage 3: Export (Nickel → TOML)\n\n### Purpose\n\nConvert validated Nickel configuration to TOML format for consumption by Rust services.\n\n### Workflow\n\n```\n# Export Nickel to TOML\nnu scripts/generate-configs.nu orchestrator solo\n```\n\n### Command Chain\n\n```\n# What happens internally:\n\n# 1. Typecheck the Nickel config (catch errors early)\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# 2. Export to TOML format\nnickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# 3. 
Save to output location\n# → provisioning/platform/config/orchestrator.solo.toml\n```\n\n### Input: Nickel Configuration\n\n```\n# From: configs/orchestrator.solo.ncl\n{\n orchestrator = {\n workspace = {\n name = "dev-workspace",\n path = "/home/developer/provisioning/data/orchestrator",\n enabled = true,\n multi_workspace = false,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2,\n keep_alive = 75,\n max_connections = 128,\n },\n storage = {\n backend = "filesystem",\n path = "/home/developer/provisioning/data/orchestrator",\n },\n queue = {\n max_concurrent_tasks = 3,\n retry_attempts = 2,\n retry_delay = 1000,\n task_timeout = 1800000,\n },\n monitoring = {\n enabled = true,\n metrics = {\n enabled = false,\n },\n health_check = {\n enabled = true,\n interval = 60,\n },\n },\n logging = {\n level = "debug",\n format = "text",\n outputs = [\n {\n destination = "stdout",\n level = "debug",\n },\n ],\n },\n },\n}\n```\n\n### Output: TOML Configuration\n\n```\n# To: provisioning/platform/config/orchestrator.solo.toml\n[orchestrator.workspace]\nname = "dev-workspace"\npath = "/home/developer/provisioning/data/orchestrator"\nenabled = true\nmulti_workspace = false\n\n[orchestrator.server]\nhost = "127.0.0.1"\nport = 9090\nworkers = 2\nkeep_alive = 75\nmax_connections = 128\n\n[orchestrator.storage]\nbackend = "filesystem"\npath = "/home/developer/provisioning/data/orchestrator"\n\n[orchestrator.queue]\nmax_concurrent_tasks = 3\nretry_attempts = 2\nretry_delay = 1000\ntask_timeout = 1800000\n\n[orchestrator.monitoring]\nenabled = true\n\n[orchestrator.monitoring.metrics]\nenabled = false\n\n[orchestrator.monitoring.health_check]\nenabled = true\ninterval = 60\n\n[orchestrator.logging]\nlevel = "debug"\nformat = "text"\n\n[[orchestrator.logging.outputs]]\ndestination = "stdout"\nlevel = "debug"\n```\n\n### Output Location\n\n```\nprovisioning/platform/config/\n├── orchestrator.solo.toml # Exported from configs/orchestrator.solo.ncl\n├── orchestrator.multiuser.toml # Exported from configs/orchestrator.multiuser.ncl\n├── orchestrator.cicd.toml # Exported from configs/orchestrator.cicd.ncl\n├── orchestrator.enterprise.toml # Exported from configs/orchestrator.enterprise.ncl\n├── control-center.solo.toml # Similar structure for each service\n├── control-center.multiuser.toml\n├── mcp-server.solo.toml\n└── mcp-server.enterprise.toml\n```\n\n### Validation During Export\n\nThe `generate-configs.nu` script:\n\n1. **Typechecks** - Ensures Nickel is syntactically valid\n2. **Evaluates** - Computes final values\n3. **Exports** - Converts to TOML format\n4. **Saves** - Writes to `provisioning/platform/config/`\n\n---\n\n## Stage 4: Runtime (Rust Services)\n\n### Purpose\n\nLoad TOML configuration and start Rust services with validated settings.\n\n### Configuration Loading Hierarchy\n\nRust services load configuration in this priority order:\n\n#### 1. Runtime Arguments (Highest Priority)\n\n```\nORCHESTRATOR_CONFIG=/path/to/config.toml cargo run --bin orchestrator\n```\n\n#### 2. 
Environment Variables\n\n```\n# Environment variable overrides specific TOML values\nexport ORCHESTRATOR_SERVER_PORT=9999\nexport ORCHESTRATOR_LOG_LEVEL=debug\n\nORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator\n```\n\nEnvironment variable format: `ORCHESTRATOR_{SECTION}_{KEY}=value`\n\nExample mappings:\n- `ORCHESTRATOR_SERVER_PORT=9999` → `orchestrator.server.port = 9999`\n- `ORCHESTRATOR_LOG_LEVEL=debug` → `orchestrator.logging.level = "debug"`\n- `ORCHESTRATOR_QUEUE_MAX_CONCURRENT_TASKS=10` → `orchestrator.queue.max_concurrent_tasks = 10`\n\n#### 3. TOML Configuration File\n\n```\n# Load from TOML (medium priority)\nORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n#### 4. Compiled Defaults (Lowest Priority)\n\n```\n// In Rust code - fallback for unspecified values\nlet config = Config::from_file(config_path)\n .unwrap_or_else(|_| Config::default());\n```\n\n### Example: Solo Mode Startup\n\n```\n# Step 1: User generates config through TypeDialog\nnu scripts/configure.nu orchestrator solo --backend web\n\n# Step 2: Export to TOML\nnu scripts/generate-configs.nu orchestrator solo\n\n# Step 3: Set environment variables for environment-specific overrides\nexport ORCHESTRATOR_SERVER_PORT=9090\nexport ORCHESTRATOR_LOG_LEVEL=debug\n\n# Step 4: Start the Rust service\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n### Rust Service Configuration Loading\n\n```\n// In orchestrator/src/config.rs\n\nuse config::{Config, ConfigError, Environment, File};\nuse serde::Deserialize;\n\n#[derive(Debug, Deserialize)]\npub struct OrchestratorConfig {\n pub orchestrator: OrchestratorService,\n}\n\n#[derive(Debug, Deserialize)]\npub struct OrchestratorService {\n pub workspace: Workspace,\n pub server: Server,\n pub storage: Storage,\n pub queue: Queue,\n}\n\nimpl OrchestratorConfig {\n pub fn load(config_path: Option<&str>) -> Result {\n let mut builder = Config::builder();\n\n // 1. Load TOML file if provided\n if let Some(path) = config_path {\n builder = builder.add_source(File::from(Path::new(path)));\n } else {\n // Fallback to defaults\n builder = builder.add_source(File::with_name("config/orchestrator.defaults.toml"));\n }\n\n // 2. 
Apply environment variable overrides\n builder = builder.add_source(\n Environment::with_prefix("ORCHESTRATOR")\n .separator("_")\n );\n\n let config = builder.build()?;\n config.try_deserialize()\n }\n}\n```\n\n### Configuration Validation in Rust\n\n```\nimpl OrchestratorConfig {\n pub fn validate(&self) -> Result<(), ConfigError> {\n // Validate server configuration\n if self.orchestrator.server.port < 1024 || self.orchestrator.server.port > 65535 {\n return Err(ConfigError::Message(\n "Server port must be between 1024 and 65535".to_string()\n ));\n }\n\n // Validate queue configuration\n if self.orchestrator.queue.max_concurrent_tasks == 0 {\n return Err(ConfigError::Message(\n "max_concurrent_tasks must be > 0".to_string()\n ));\n }\n\n // Validate storage configuration\n match self.orchestrator.storage.backend.as_str() {\n "filesystem" | "surrealdb" | "rocksdb" => {\n // Valid backend\n },\n backend => {\n return Err(ConfigError::Message(\n format!("Unknown storage backend: {}", backend)\n ));\n }\n }\n\n Ok(())\n }\n}\n```\n\n### Runtime Startup Sequence\n\n```\n#[tokio::main]\nasync fn main() -> Result<()> {\n // Load configuration\n let config = OrchestratorConfig::load(\n std::env::var("ORCHESTRATOR_CONFIG").ok().as_deref()\n )?;\n\n // Validate configuration\n config.validate()?;\n\n // Initialize logging\n init_logging(&config.orchestrator.logging)?;\n\n // Start HTTP server\n let server = Server::new(\n config.orchestrator.server.host.clone(),\n config.orchestrator.server.port,\n );\n\n // Initialize storage backend\n let storage = Storage::new(&config.orchestrator.storage)?;\n\n // Start the service\n server.start(storage).await?;\n\n Ok(())\n}\n```\n\n---\n\n## Complete Example: Solo Mode End-to-End\n\n### Step 1: Interactive Configuration\n\n```\n$ nu scripts/configure.nu orchestrator solo --backend web\n\n# TypeDialog launches web interface\n# User fills in form:\n# - Workspace name: "dev-workspace"\n# - Server host: "127.0.0.1"\n# - Server port: 9090\n# - Storage backend: "filesystem"\n# - Storage path: "/home/developer/provisioning/data/orchestrator"\n# - Max concurrent tasks: 3\n# - Log level: "debug"\n\n# Saves to: values/orchestrator.solo.ncl\n```\n\n### Step 2: Generated Nickel Configuration\n\n```\n# values/orchestrator.solo.ncl\n{\n orchestrator = {\n workspace = {\n name = "dev-workspace",\n path = "/home/developer/provisioning/data/orchestrator",\n enabled = true,\n multi_workspace = false,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2,\n keep_alive = 75,\n max_connections = 128,\n },\n storage = {\n backend = "filesystem",\n path = "/home/developer/provisioning/data/orchestrator",\n },\n queue = {\n max_concurrent_tasks = 3,\n retry_attempts = 2,\n retry_delay = 1000,\n task_timeout = 1800000,\n },\n logging = {\n level = "debug",\n format = "text",\n outputs = [{\n destination = "stdout",\n level = "debug",\n }],\n },\n },\n}\n```\n\n### Step 3: Composition and Validation\n\n```\n$ nickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Validation passes:\n# - Workspace name: valid string ✓\n# - Port 9090: within range 1024-65535 ✓\n# - Max concurrent tasks 3: within range 1-100 ✓\n# - Log level: recognized level ✓\n```\n\n### Step 4: Export to TOML\n\n```\n$ nu scripts/generate-configs.nu orchestrator solo\n\n# Generates: provisioning/platform/config/orchestrator.solo.toml\n```\n\n### Step 5: TOML File Created\n\n```\n# 
provisioning/platform/config/orchestrator.solo.toml\n[orchestrator.workspace]\nname = "dev-workspace"\npath = "/home/developer/provisioning/data/orchestrator"\nenabled = true\nmulti_workspace = false\n\n[orchestrator.server]\nhost = "127.0.0.1"\nport = 9090\nworkers = 2\nkeep_alive = 75\nmax_connections = 128\n\n[orchestrator.storage]\nbackend = "filesystem"\npath = "/home/developer/provisioning/data/orchestrator"\n\n[orchestrator.queue]\nmax_concurrent_tasks = 3\nretry_attempts = 2\nretry_delay = 1000\ntask_timeout = 1800000\n\n[orchestrator.logging]\nlevel = "debug"\nformat = "text"\n\n[[orchestrator.logging.outputs]]\ndestination = "stdout"\nlevel = "debug"\n```\n\n### Step 6: Runtime Startup\n\n```\n$ export ORCHESTRATOR_LOG_LEVEL=debug\n$ ORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n\n# Service loads orchestrator.solo.toml\n# Environment variable overrides ORCHESTRATOR_LOG_LEVEL to "debug"\n# Service starts and begins accepting requests on 127.0.0.1:9090\n```\n\n---\n\n## Configuration Modification Workflow\n\n### Scenario: User Wants to Change Port\n\n#### Option A: Modify TypeDialog Form and Regenerate\n\n```\n# 1. Re-run interactive configuration\nnu scripts/configure.nu orchestrator solo --backend web\n\n# 2. User changes port to 9999 in form\n# 3. TypeDialog generates new values/orchestrator.solo.ncl\n\n# 4. Export updated config\nnu scripts/generate-configs.nu orchestrator solo\n\n# 5. New TOML created with port: 9999\n# 6. Restart service\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n#### Option B: Direct TOML Edit\n\n```\n# 1. Edit TOML directly\nvi provisioning/platform/config/orchestrator.solo.toml\n# Change: port = 9999\n\n# 2. Restart service (no Nickel re-export needed)\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n#### Option C: Environment Variable Override\n\n```\n# 1. No file changes needed\n# 2. Just override environment variable\nexport ORCHESTRATOR_SERVER_PORT=9999\n\n# 3. Restart service\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n---\n\n## Architecture Relationships\n\n### Component Interactions\n\n```\nTypeDialog Forms Nickel Schemas\n(forms/*.toml) ←shares→ (schemas/*.ncl)\n │ │\n │ user input │ type definitions\n │ │\n ▼ ▼\nvalues/*.ncl ←─ constraint validation ─→ constraints.toml\n │ (single source of truth)\n │ │\n │ │\n ├──→ imported into composition ────────────┤\n │ (configs/*.ncl) │\n │ │\n │ base defaults ───→ defaults/*.ncl │\n │ mode overlay ─────→ deployment/*.ncl │\n │ validators ──────→ validators/*.ncl │\n │ │\n └──→ typecheck + export ──────────────→─────┘\n nickel export --format toml\n │\n ▼\n provisioning/platform/config/\n *.toml files\n │\n │ loaded by Rust services\n │ at runtime\n ▼\n Running Service\n (orchestrator, control-center, mcp-server)\n```\n\n---\n\n## Best Practices\n\n### 1. Always Validate Before Deploying\n\n```\n# Typecheck Nickel before export\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Validate TOML before loading in Rust\ncargo run --bin orchestrator -- --validate-config orchestrator.solo.toml\n```\n\n### 2. 
Use Version Control for TOML Configs\n\n```\n# Commit generated TOML files\ngit add provisioning/platform/config/orchestrator.solo.toml\ngit commit -m "Update orchestrator solo configuration"\n\n# But NOT the values/*.ncl files\necho "values/*.ncl" >> provisioning/.typedialog/provisioning/platform/.gitignore\n```\n\n### 3. Document Configuration Changes\n\n```\n# In TypeDialog form, add comments\n[[items]]\nname = "max_concurrent_tasks"\ntype = "number"\nprompt = "Max concurrent tasks (3 for dev, 50+ for production)"\nhelp = "Increased from 3 to 10 for higher throughput testing"\n```\n\n### 4. Environment Variables for Sensitive Data\n\nNever hardcode secrets in TOML:\n\n```\n# Instead of:\n# [orchestrator.security]\n# jwt_secret = "hardcoded-secret"\n\n# Use environment variable:\nexport ORCHESTRATOR_SECURITY_JWT_SECRET="actual-secret"\n\n# TOML can reference it:\n# [orchestrator.security]\n# jwt_secret = "${JWT_SECRET}"\n```\n\n### 5. Test Configuration Changes in Staging First\n\n```\n# Generate staging config\nnu scripts/configure.nu orchestrator multiuser --backend web\n\n# Export to staging TOML\nnu scripts/generate-configs.nu orchestrator multiuser\n\n# Test in staging environment\nORCHESTRATOR_CONFIG=orchestrator.multiuser.toml cargo run --bin orchestrator\n# Monitor logs and verify behavior\n\n# Then deploy to production\n```\n\n---\n\n## Summary\n\nThe four-stage workflow provides:\n\n1. **User-Friendly Interface**: TypeDialog forms with real-time validation\n2. **Type Safety**: Nickel schemas and validators catch configuration errors early\n3. **Flexibility**: TOML format can be edited manually or generated programmatically\n4. **Runtime Configurability**: Environment variables allow deployment-time overrides\n5. **Single Source of Truth**: Constraints, schemas, and validators all reference shared definitions\n\nThis layered approach ensures that:\n- Invalid configurations are caught before deployment\n- Users can modify configuration safely\n- Different deployment modes have appropriate defaults\n- Configuration changes can be version-controlled\n- Services can be reconfigured without code changes +# Configuration Workflow: TypeDialog → Nickel → TOML → Rust + +Complete documentation of the configuration pipeline that transforms interactive user input into production Rust service configurations. 
+
+## Overview
+
+The provisioning platform uses a **four-stage configuration workflow** that leverages TypeDialog for interactive configuration,
+Nickel for type-safe composition, and TOML for service consumption:
+
+```text
+┌─────────────────────────────────────────────────────────────────┐
+│ Stage 1: User Interaction (TypeDialog)                          │
+│  - Can use Nickel configuration as default values               │
+│    (if provisioning/platform/config/ is used, it is updated)    │
+│  - Interactive form (web/tui/cli)                               │
+│  - Real-time constraint validation                              │
+│  - Generates Nickel configuration                               │
+└────────────────┬────────────────────────────────────────────────┘
+                 │
+                 ▼
+┌─────────────────────────────────────────────────────────────────┐
+│ Stage 2: Composition (Nickel)                                   │
+│  - Base defaults imported                                       │
+│  - Mode overlay applied                                         │
+│  - Validators enforce business rules                            │
+│  - Produces Nickel config file                                  │
+└────────────────┬────────────────────────────────────────────────┘
+                 │
+                 ▼
+┌─────────────────────────────────────────────────────────────────┐
+│ Stage 3: Export (Nickel → TOML)                                 │
+│  - Nickel config evaluated                                      │
+│  - Exported to TOML format                                      │
+│  - Saved to provisioning/platform/config/                       │
+└────────────────┬────────────────────────────────────────────────┘
+                 │
+                 ▼
+┌─────────────────────────────────────────────────────────────────┐
+│ Stage 4: Runtime (Rust Services)                                │
+│  - Services load TOML configuration                             │
+│  - Environment variables override specific values               │
+│  - Start services with final configuration                      │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Stage 1: User Interaction (TypeDialog)
+
+### Purpose
+
+Collect configuration from users through an interactive, constraint-aware interface.
+
+### Workflow
+
+```bash
+# Launch interactive configuration wizard
+nu scripts/configure.nu orchestrator solo --backend web
+```
+
+### What Happens
+
+1. **Form Loads**
+   - TypeDialog reads `forms/orchestrator-form.toml`
+   - Form displays configuration sections
+   - Constraints from `constraints.toml` enforce min/max values
+   - Environment variables populate initial defaults
+
+2. **User Interaction**
+   - User fills in form fields (workspace name, server port, etc.)
+   - Real-time validation on each field
+   - Constraint interpolation shows valid ranges:
+     - `${constraint.orchestrator.workers.min}` → `1`
+     - `${constraint.orchestrator.workers.max}` → `32`
+
+3. **Configuration Submission**
+   - User submits form
+   - TypeDialog validates all fields against schemas
+   - Generates Nickel configuration output
+
+4. **Output Generation**
+   - Nickel config saved to `values/{service}.{mode}.ncl`
+   - Example: `values/orchestrator.solo.ncl`
+   - File becomes source of truth for user customizations
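+
+The wizard front-end is selectable. Assuming the `--backend` flag accepts the other values listed in the overview (web/tui/cli), the same form can run entirely in a terminal:
+
+```bash
+# Same form, terminal UI instead of the web front-end
+nu scripts/configure.nu orchestrator solo --backend tui
+```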
+
+### Form Structure Example
+
+```toml
+# forms/orchestrator-form.toml
+name = "orchestrator_configuration"
+description = "Configure orchestrator service"
+
+[[items]]
+name = "workspace_group"
+type = "group"
+includes = ["fragments/workspace-section.toml"]
+
+[[items]]
+name = "server_group"
+type = "group"
+includes = ["fragments/server-section.toml"]
+
+[[items]]
+name = "queue_group"
+type = "group"
+includes = ["fragments/orchestrator/queue-section.toml"]
+```
+
+### Fragment with Constraint Interpolation
+
+```toml
+# forms/fragments/orchestrator/queue-section.toml
+[[elements]]
+name = "max_concurrent_tasks"
+type = "number"
+prompt = "Maximum Concurrent Tasks"
+default = 5
+min = "${constraint.orchestrator.queue.concurrent_tasks.min}"
+max = "${constraint.orchestrator.queue.concurrent_tasks.max}"
+required = true
+help = "Range: ${constraint.orchestrator.queue.concurrent_tasks.min}-${constraint.orchestrator.queue.concurrent_tasks.max}"
+nickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]
+```
+
+### Generated Nickel Output (from TypeDialog)
+
+TypeDialog's `nickel-roundtrip` pattern generates:
+
+```nickel
+# values/orchestrator.solo.ncl
+# Auto-generated by TypeDialog
+{
+  orchestrator = {
+    workspace = {
+      name = "dev-workspace",
+      path = "/home/developer/provisioning/data/orchestrator",
+      enabled = true,
+    },
+    server = {
+      host = "127.0.0.1",
+      port = 9090,
+      workers = 2,
+    },
+    queue = {
+      max_concurrent_tasks = 3,
+      retry_attempts = 2,
+      retry_delay = 1000,
+    },
+  },
+}
+```
+
+---
+
+## Stage 2: Composition (Nickel)
+
+### Purpose
+
+Compose the user input with defaults, validators, and schemas to create a complete, validated configuration. 
+
+### Workflow
+
+```bash
+# The nickel typecheck command validates the composition
+nickel typecheck values/orchestrator.solo.ncl
+```
+
+### Composition Layers
+
+The final configuration is built by merging layers in priority order:
+
+#### Layer 1: Schema Import
+
+```nickel
+# Ensures type safety and required fields
+let schemas = import "../schemas/orchestrator.ncl" in
+```
+
+#### Layer 2: Base Defaults
+
+```nickel
+# Default values for all orchestrator configurations
+let defaults = import "../defaults/orchestrator-defaults.ncl" in
+```
+
+#### Layer 3: Mode Overlay
+
+```nickel
+# Solo-specific overrides and adjustments
+let solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in
+```
+
+#### Layer 4: Validators Import
+
+```nickel
+# Business rule validation (ranges, uniqueness, dependencies)
+let validators = import "../validators/orchestrator-validator.ncl" in
+```
+
+#### Layer 5: User Values
+
+```nickel
+# User input from TypeDialog (values/orchestrator.solo.ncl)
+# Loaded and merged with defaults
+```
+
+### Composition Example
+
+```nickel
+# configs/orchestrator.solo.ncl (generated composition)
+
+let schemas = import "../schemas/orchestrator.ncl" in
+let defaults = import "../defaults/orchestrator-defaults.ncl" in
+let solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in
+let validators = import "../validators/orchestrator-validator.ncl" in
+
+# Composition: Base defaults + mode overlay + user input
+{
+  orchestrator = defaults.orchestrator & {
+    # User input from TypeDialog values/orchestrator.solo.ncl
+    workspace = {
+      name = "dev-workspace",
+      path = "/home/developer/provisioning/data/orchestrator",
+    },
+
+    # Solo mode overrides
+    server = {
+      workers = validators.ValidWorkers 2,
+      max_connections = 128,
+    },
+
+    queue = {
+      max_concurrent_tasks = validators.ValidConcurrentTasks 3,
+    },
+
+    # Fallback to defaults for unspecified fields
+  },
+} | schemas.OrchestratorConfig  # Validate against schema
+```
+
+### Validation During Composition
+
+Each field is validated through multiple validation layers:
+
+```nickel
+# validators/orchestrator-validator.ncl
+let constraints = import "../constraints/constraints.toml" in
+
+{
+  # Validate workers within allowed range
+  ValidWorkers = fun workers =>
+    if workers < constraints.orchestrator.workers.min then
+      error "Workers below minimum"
+    else if workers > constraints.orchestrator.workers.max then
+      error "Workers above maximum"
+    else
+      workers,
+
+  # Validate concurrent tasks
+  ValidConcurrentTasks = fun tasks =>
+    if tasks < constraints.orchestrator.queue.concurrent_tasks.min then
+      error "Tasks below minimum"
+    else if tasks > constraints.orchestrator.queue.concurrent_tasks.max then
+      error "Tasks above maximum"
+    else
+      tasks,
+}
+```
+
+### Constraints: Single Source of Truth
+
+```toml
+# constraints/constraints.toml
+[orchestrator.workers]
+min = 1
+max = 32
+
+[orchestrator.queue.concurrent_tasks]
+min = 1
+max = 100
+
+[common.server.port]
+min = 1024
+max = 65535
+```
+
+These values are referenced in:
+- Form constraints (constraint interpolation)
+- Validators (ValidWorkers, ValidConcurrentTasks)
+- Default values (appropriate for each mode)
+
+---
+
+## Stage 3: Export (Nickel → TOML)
+
+### Purpose
+
+Convert validated Nickel configuration to TOML format for consumption by Rust services.
+
+### Workflow
+
+```bash
+# Export Nickel to TOML
+nu scripts/generate-configs.nu orchestrator solo
+```
+
+### Command Chain
+
+```bash
+# What happens internally:
+
+# 1. Typecheck the Nickel config (catch errors early)
+nickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+
+# 2. Export to TOML format
+nickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+
+# 3. Save to output location
+# → provisioning/platform/config/orchestrator.solo.toml
+```
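+
+The script itself can be small. A hypothetical Nushell sketch of `generate-configs.nu` implementing the three steps above (file locations assumed from this document):
+
+```nushell
+# scripts/generate-configs.nu (hypothetical sketch)
+def main [service: string, mode: string] {
+  let ncl = $"provisioning/.typedialog/provisioning/platform/configs/($service).($mode).ncl"
+  let out = $"provisioning/platform/config/($service).($mode).toml"
+
+  # 1. Typecheck (fail fast on contract errors)
+  nickel typecheck $ncl
+
+  # 2. Export to TOML
+  let toml = (nickel export --format toml $ncl)
+
+  # 3. Save to the runtime config location
+  $toml | save --force $out
+  print $"Generated ($out)"
+}
+```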
+
+### Input: Nickel Configuration
+
+```nickel
+# From: configs/orchestrator.solo.ncl
+{
+  orchestrator = {
+    workspace = {
+      name = "dev-workspace",
+      path = "/home/developer/provisioning/data/orchestrator",
+      enabled = true,
+      multi_workspace = false,
+    },
+    server = {
+      host = "127.0.0.1",
+      port = 9090,
+      workers = 2,
+      keep_alive = 75,
+      max_connections = 128,
+    },
+    storage = {
+      backend = "filesystem",
+      path = "/home/developer/provisioning/data/orchestrator",
+    },
+    queue = {
+      max_concurrent_tasks = 3,
+      retry_attempts = 2,
+      retry_delay = 1000,
+      task_timeout = 1800000,
+    },
+    monitoring = {
+      enabled = true,
+      metrics = {
+        enabled = false,
+      },
+      health_check = {
+        enabled = true,
+        interval = 60,
+      },
+    },
+    logging = {
+      level = "debug",
+      format = "text",
+      outputs = [
+        {
+          destination = "stdout",
+          level = "debug",
+        },
+      ],
+    },
+  },
+}
+```
+
+### Output: TOML Configuration
+
+```toml
+# To: provisioning/platform/config/orchestrator.solo.toml
+[orchestrator.workspace]
+name = "dev-workspace"
+path = "/home/developer/provisioning/data/orchestrator"
+enabled = true
+multi_workspace = false
+
+[orchestrator.server]
+host = "127.0.0.1"
+port = 9090
+workers = 2
+keep_alive = 75
+max_connections = 128
+
+[orchestrator.storage]
+backend = "filesystem"
+path = "/home/developer/provisioning/data/orchestrator"
+
+[orchestrator.queue]
+max_concurrent_tasks = 3
+retry_attempts = 2
+retry_delay = 1000
+task_timeout = 1800000
+
+[orchestrator.monitoring]
+enabled = true
+
+[orchestrator.monitoring.metrics]
+enabled = false
+
+[orchestrator.monitoring.health_check]
+enabled = true
+interval = 60
+
+[orchestrator.logging]
+level = "debug"
+format = "text"
+
+[[orchestrator.logging.outputs]]
+destination = "stdout"
+level = "debug"
+```
+
+### Output Location
+
+```text
+provisioning/platform/config/
+├── orchestrator.solo.toml           # Exported from configs/orchestrator.solo.ncl
+├── orchestrator.multiuser.toml      # Exported from configs/orchestrator.multiuser.ncl
+├── orchestrator.cicd.toml           # Exported from configs/orchestrator.cicd.ncl
+├── orchestrator.enterprise.toml     # Exported from configs/orchestrator.enterprise.ncl
+├── control-center.solo.toml         # Similar structure for each service
+├── control-center.multiuser.toml
+├── mcp-server.solo.toml
+└── mcp-server.enterprise.toml
+```
+
+### Validation During Export
+
+The `generate-configs.nu` script:
+
+1. **Typechecks** - Ensures Nickel is syntactically valid
+2. **Evaluates** - Computes final values
+3. **Exports** - Converts to TOML format
+4. **Saves** - Writes to `provisioning/platform/config/`
+
+---
+
+## Stage 4: Runtime (Rust Services)
+
+### Purpose
+
+Load TOML configuration and start Rust services with validated settings.
+
+### Configuration Loading Hierarchy
+
+Rust services load configuration in this priority order:
+
+#### 1. Runtime Arguments (Highest Priority)
+
+```bash
+ORCHESTRATOR_CONFIG=/path/to/config.toml cargo run --bin orchestrator
+```
+
+#### 2. Environment Variables
+
+```bash
+# Environment variable overrides specific TOML values
+export ORCHESTRATOR_SERVER_PORT=9999
+export ORCHESTRATOR_LOG_LEVEL=debug
+
+ORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator
+```
+
+Environment variable format: `ORCHESTRATOR_{SECTION}_{KEY}=value`
+
+Example mappings:
+- `ORCHESTRATOR_SERVER_PORT=9999` → `orchestrator.server.port = 9999`
+- `ORCHESTRATOR_LOG_LEVEL=debug` → `orchestrator.logging.level = "debug"`
+- `ORCHESTRATOR_QUEUE_MAX_CONCURRENT_TASKS=10` → `orchestrator.queue.max_concurrent_tasks = 10`
+
+#### 3. TOML Configuration File
+
+```bash
+# Load from TOML (medium priority)
+ORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator
+```
+
+#### 4. Compiled Defaults (Lowest Priority)
+
+```rust
+// In Rust code - fallback for unspecified values
+let config = Config::from_file(config_path)
+    .unwrap_or_else(|_| Config::default());
+```
+
+### Example: Solo Mode Startup
+
+```bash
+# Step 1: User generates config through TypeDialog
+nu scripts/configure.nu orchestrator solo --backend web
+
+# Step 2: Export to TOML
+nu scripts/generate-configs.nu orchestrator solo
+
+# Step 3: Set environment variables for environment-specific overrides
+export ORCHESTRATOR_SERVER_PORT=9090
+export ORCHESTRATOR_LOG_LEVEL=debug
+
+# Step 4: Start the Rust service
+ORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator
+```
+
+### Rust Service Configuration Loading
+
+```rust
+// In orchestrator/src/config.rs
+
+use std::path::Path;
+
+use config::{Config, ConfigError, Environment, File};
+use serde::Deserialize;
+
+#[derive(Debug, Deserialize)]
+pub struct OrchestratorConfig {
+    pub orchestrator: OrchestratorService,
+}
+
+#[derive(Debug, Deserialize)]
+pub struct OrchestratorService {
+    pub workspace: Workspace,
+    pub server: Server,
+    pub storage: Storage,
+    pub queue: Queue,
+}
+
+impl OrchestratorConfig {
+    pub fn load(config_path: Option<&str>) -> Result<Self, ConfigError> {
+        let mut builder = Config::builder();
+
+        // 1. Load TOML file if provided
+        if let Some(path) = config_path {
+            builder = builder.add_source(File::from(Path::new(path)));
+        } else {
+            // Fallback to defaults
+            builder = builder.add_source(File::with_name("config/orchestrator.defaults.toml"));
+        }
+
+        // 2. Apply environment variable overrides
+        builder = builder.add_source(
+            Environment::with_prefix("ORCHESTRATOR")
+                .separator("_")
+        );
+
+        let config = builder.build()?;
+        config.try_deserialize()
+    }
+}
+```
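+
+Before the full startup sequence below, the loader can be exercised on its own. A short usage sketch (`validate` is defined in the next section; the function name is an assumption):
+
+```rust
+fn load_from_env() -> Result<OrchestratorConfig, ConfigError> {
+    // Highest-priority source: explicit path from the environment
+    let path = std::env::var("ORCHESTRATOR_CONFIG").ok();
+    let config = OrchestratorConfig::load(path.as_deref())?;
+    config.validate()?;
+    Ok(config)
+}
+```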
+
+### Configuration Validation in Rust
+
+```rust
+impl OrchestratorConfig {
+    pub fn validate(&self) -> Result<(), ConfigError> {
+        // Validate server configuration
+        if self.orchestrator.server.port < 1024 || self.orchestrator.server.port > 65535 {
+            return Err(ConfigError::Message(
+                "Server port must be between 1024 and 65535".to_string()
+            ));
+        }
+
+        // Validate queue configuration
+        if self.orchestrator.queue.max_concurrent_tasks == 0 {
+            return Err(ConfigError::Message(
+                "max_concurrent_tasks must be > 0".to_string()
+            ));
+        }
+
+        // Validate storage configuration
+        match self.orchestrator.storage.backend.as_str() {
+            "filesystem" | "surrealdb" | "rocksdb" => {
+                // Valid backend
+            },
+            backend => {
+                return Err(ConfigError::Message(
+                    format!("Unknown storage backend: {}", backend)
+                ));
+            }
+        }
+
+        Ok(())
+    }
+}
+```
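+
+These rules are straightforward to pin down in a unit test. A sketch, assuming `Default` is implemented for the config structs (not shown above):
+
+```rust
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn rejects_privileged_port() {
+        // Assumes a Default impl for OrchestratorConfig (hypothetical)
+        let mut config = OrchestratorConfig::default();
+        config.orchestrator.server.port = 80; // below the 1024 minimum
+        assert!(config.validate().is_err());
+    }
+}
+```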
+
+### Runtime Startup Sequence
+
+```rust
+#[tokio::main]
+async fn main() -> Result<()> {
+    // Load configuration
+    let config = OrchestratorConfig::load(
+        std::env::var("ORCHESTRATOR_CONFIG").ok().as_deref()
+    )?;
+
+    // Validate configuration
+    config.validate()?;
+
+    // Initialize logging
+    init_logging(&config.orchestrator.logging)?;
+
+    // Start HTTP server
+    let server = Server::new(
+        config.orchestrator.server.host.clone(),
+        config.orchestrator.server.port,
+    );
+
+    // Initialize storage backend
+    let storage = Storage::new(&config.orchestrator.storage)?;
+
+    // Start the service
+    server.start(storage).await?;
+
+    Ok(())
+}
+```
+
+---
+
+## Complete Example: Solo Mode End-to-End
+
+### Step 1: Interactive Configuration
+
+```bash
+$ nu scripts/configure.nu orchestrator solo --backend web
+
+# TypeDialog launches web interface
+# User fills in form:
+#   - Workspace name: "dev-workspace"
+#   - Server host: "127.0.0.1"
+#   - Server port: 9090
+#   - Storage backend: "filesystem"
+#   - Storage path: "/home/developer/provisioning/data/orchestrator"
+#   - Max concurrent tasks: 3
+#   - Log level: "debug"
+
+# Saves to: values/orchestrator.solo.ncl
+```
+
+### Step 2: Generated Nickel Configuration
+
+```nickel
+# values/orchestrator.solo.ncl
+{
+  orchestrator = {
+    workspace = {
+      name = "dev-workspace",
+      path = "/home/developer/provisioning/data/orchestrator",
+      enabled = true,
+      multi_workspace = false,
+    },
+    server = {
+      host = "127.0.0.1",
+      port = 9090,
+      workers = 2,
+      keep_alive = 75,
+      max_connections = 128,
+    },
+    storage = {
+      backend = "filesystem",
+      path = "/home/developer/provisioning/data/orchestrator",
+    },
+    queue = {
+      max_concurrent_tasks = 3,
+      retry_attempts = 2,
+      retry_delay = 1000,
+      task_timeout = 1800000,
+    },
+    logging = {
+      level = "debug",
+      format = "text",
+      outputs = [{
+        destination = "stdout",
+        level = "debug",
+      }],
+    },
+  },
+}
+```
+
+### Step 3: Composition and Validation
+
+```bash
+$ nickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+
+# Validation passes:
+# - Workspace name: valid string ✓
+# - Port 9090: within range 1024-65535 ✓
+# - Max concurrent tasks 3: within range 1-100 ✓
+# - Log level: recognized level ✓
+```
+
+### Step 4: Export to TOML
+
+```bash
+$ nu scripts/generate-configs.nu orchestrator solo
+
+# Generates: provisioning/platform/config/orchestrator.solo.toml
+```
+
+### Step 5: TOML File Created
+
+```toml
+# provisioning/platform/config/orchestrator.solo.toml
+[orchestrator.workspace]
+name = "dev-workspace"
+path = "/home/developer/provisioning/data/orchestrator"
+enabled = true
+multi_workspace = false
+
+[orchestrator.server]
+host = "127.0.0.1"
+port = 9090
+workers = 2
+keep_alive = 75
+max_connections = 128
+
+[orchestrator.storage]
+backend = "filesystem"
+path = "/home/developer/provisioning/data/orchestrator"
+
+[orchestrator.queue]
+max_concurrent_tasks = 3
+retry_attempts = 2
+retry_delay = 1000
+task_timeout = 1800000
+
+[orchestrator.logging]
+level = "debug"
+format = "text"
+
+[[orchestrator.logging.outputs]]
+destination = "stdout"
+level = "debug"
+```
+
+### Step 6: Runtime Startup
+
+```bash
+$ export ORCHESTRATOR_LOG_LEVEL=debug
+$ ORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator
+
+# Service loads orchestrator.solo.toml
+# Environment variable overrides ORCHESTRATOR_LOG_LEVEL to "debug"
+# Service starts and begins accepting requests on 127.0.0.1:9090
+```
+
+---
+
+## Configuration Modification Workflow
+
+### Scenario: User Wants to Change Port
+
+#### Option A: Modify TypeDialog Form and Regenerate
+
+```bash
+# 1. Re-run interactive configuration
+nu scripts/configure.nu orchestrator solo --backend web
+
+# 2. User changes port to 9999 in form
+# 3. TypeDialog generates new values/orchestrator.solo.ncl
+
+# 4. Export updated config
+nu scripts/generate-configs.nu orchestrator solo
+
+# 5. New TOML created with port: 9999
+# 6. Restart service
+ORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator
+```
+
+#### Option B: Direct TOML Edit
+
+```bash
+# 1. Edit TOML directly
+vi provisioning/platform/config/orchestrator.solo.toml
+# Change: port = 9999
+
+# 2. Restart service (no Nickel re-export needed)
+ORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator
+```
+
+#### Option C: Environment Variable Override
+
+```bash
+# 1. No file changes needed
+# 2. Just override environment variable
+export ORCHESTRATOR_SERVER_PORT=9999
+
+# 3. Restart service
+ORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator
+```
+
+---
+
+## Architecture Relationships
+
+### Component Interactions
+
+```text
+TypeDialog Forms              Nickel Schemas
+(forms/*.toml)    ←shares→    (schemas/*.ncl)
+   │                             │
+   │ user input                  │ type definitions
+   │                             │
+   ▼                             ▼
+values/*.ncl  ←─ constraint validation ─→  constraints.toml
+   │              (single source of truth)
+   │                                          │
+   ├──→ imported into composition ────────────┤
+   │    (configs/*.ncl)                       │
+   │                                          │
+   │    base defaults ───→ defaults/*.ncl     │
+   │    mode overlay ─────→ deployment/*.ncl  │
+   │    validators ──────→ validators/*.ncl   │
+   │                                          │
+   └──→ typecheck + export ──────────────→────┘
+        nickel export --format toml
+                    │
+                    ▼
+        provisioning/platform/config/
+        *.toml files
+                    │
+                    │ loaded by Rust services
+                    │ at runtime
+                    ▼
+        Running Service
+        (orchestrator, control-center, mcp-server)
+```
+
+---
+
+## Best Practices
+
+### 1. Always Validate Before Deploying
+
+```bash
+# Typecheck Nickel before export
+nickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+
+# Validate TOML before loading in Rust
+cargo run --bin orchestrator -- --validate-config orchestrator.solo.toml
+```
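+
+To make these checks routine, both commands can run from a pre-commit hook. A minimal sketch (the hook path and the hard-coded service/mode pair are assumptions):
+
+```bash
+#!/usr/bin/env bash
+# .git/hooks/pre-commit (hypothetical): block commits that break the config
+set -euo pipefail
+
+nickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+cargo run --bin orchestrator -- --validate-config orchestrator.solo.toml
+```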
+### 2. Use Version Control for TOML Configs
+
+```bash
+# Commit generated TOML files
+git add provisioning/platform/config/orchestrator.solo.toml
+git commit -m "Update orchestrator solo configuration"
+
+# But NOT the values/*.ncl files
+echo "values/*.ncl" >> provisioning/.typedialog/provisioning/platform/.gitignore
+```
+
+### 3. Document Configuration Changes
+
+```toml
+# In TypeDialog form, add comments
+[[items]]
+name = "max_concurrent_tasks"
+type = "number"
+prompt = "Max concurrent tasks (3 for dev, 50+ for production)"
+help = "Increased from 3 to 10 for higher throughput testing"
+```
+
+### 4. Environment Variables for Sensitive Data
+
+Never hardcode secrets in TOML:
+
+```bash
+# Instead of:
+# [orchestrator.security]
+# jwt_secret = "hardcoded-secret"
+
+# Use environment variable:
+export ORCHESTRATOR_SECURITY_JWT_SECRET="actual-secret"
+
+# The TOML can then reference it, provided the service expands
+# placeholders at load time:
+# [orchestrator.security]
+# jwt_secret = "${JWT_SECRET}"
+```
+
+### 5. Test Configuration Changes in Staging First
+
+```bash
+# Generate staging config
+nu scripts/configure.nu orchestrator multiuser --backend web
+
+# Export to staging TOML
+nu scripts/generate-configs.nu orchestrator multiuser
+
+# Test in staging environment
+ORCHESTRATOR_CONFIG=orchestrator.multiuser.toml cargo run --bin orchestrator
+# Monitor logs and verify behavior
+
+# Then deploy to production
+```
+
+---
+
+## Summary
+
+The four-stage workflow provides:
+
+1. **User-Friendly Interface**: TypeDialog forms with real-time validation
+2. **Type Safety**: Nickel schemas and validators catch configuration errors early
+3. **Flexibility**: TOML format can be edited manually or generated programmatically
+4. **Runtime Configurability**: Environment variables allow deployment-time overrides
+5. **Single Source of Truth**: Constraints, schemas, and validators all reference shared definitions
+
+This layered approach ensures that:
+- Invalid configurations are caught before deployment
+- Users can modify configuration safely
+- Different deployment modes have appropriate defaults
+- Configuration changes can be version-controlled
+- Services can be reconfigured without code changes
\ No newline at end of file
diff --git a/schemas/platform/constraints/README.md b/schemas/platform/constraints/README.md
index c39cef5..b3871e9 100644
--- a/schemas/platform/constraints/README.md
+++ b/schemas/platform/constraints/README.md
@@ -1 +1,170 @@
-# Constraints\n\nSingle source of truth for validation limits across all services.\n\n## Purpose\n\nThe `constraints.toml` file defines:\n- **Numeric ranges** (min/max values for ports, workers, timeouts, etc.)\n- **Uniqueness rules** (field constraints, array bounds)\n- **Validation bounds** (resource limits, timeout ranges)\n\nThese constraints are used by:\n1. **Validators** (`validators/*.ncl`) - Check that configuration values are within bounds\n2. **TypeDialog forms** (`forms/*.toml`) - Enable constraint interpolation for dynamic field validation\n3. **Nickel schemas** (`schemas/*.ncl`) - Define type contracts with bounds\n\n## File Structure\n\n```\nconstraints/\n└── constraints.toml # All validation constraints in TOML format\n```\n\n## Usage Pattern\n\n### 1. Define Constraint\n\n**constraints.toml**:\n\n```\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 100\n```\n\n### 2. 
Reference in Validator\n\n**validators/orchestrator-validator.ncl**:\n\n```\nlet constraints = import "../constraints/constraints.toml" in\n\n{\n ValidConcurrentTasks = fun tasks =>\n if tasks < constraints.orchestrator.queue.concurrent_tasks.min then\n error "Tasks must be >= 1"\n else if tasks > constraints.orchestrator.queue.concurrent_tasks.max then\n error "Tasks must be <= 100"\n else\n tasks,\n}\n```\n\n### 3. Reference in Form\n\n**forms/fragments/orchestrator-queue-section.toml**:\n\n```\n[[elements]]\nname = "max_concurrent_tasks"\ntype = "number"\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\nhelp = "Max: ${constraint.orchestrator.queue.concurrent_tasks.max}"\nnickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]\n```\n\n## Constraint Categories\n\n### Service-Specific Constraints\n\n- **Orchestrator** (`[orchestrator.*]`)\n - Worker count bounds\n - Queue concurrency limits\n - Task timeout ranges\n - Batch parallelism limits\n\n- **Control Center** (`[control_center.*]`)\n - JWT token expiration bounds\n - Rate limiting thresholds\n - RBAC policy limits\n\n- **MCP Server** (`[mcp_server.*]`)\n - Tool concurrency limits\n - Resource size bounds\n - Prompt template limits\n\n### Common Constraints\n\n- **Server** (`[common.server.*]`)\n - Port range (1024-65535)\n - Worker count\n - Connection limits\n\n- **Deployment** (`[deployment.{solo,multiuser,cicd,enterprise}.*]`)\n - CPU core bounds\n - Memory allocation bounds\n - Disk space requirements\n\n## Modifying Constraints\n\nWhen changing constraint bounds:\n\n1. **Update constraints.toml**\n2. **Update validators** that use the constraint\n3. **Update forms** that interpolate the constraint\n4. **Test validation** in forms and Nickel typecheck\n5. **Update documentation** of affected services\n\n### Example: Increase Max Queue Tasks\n\n**Before**:\n\n```\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 100\n```\n\n**After**:\n\n```\n[orchestrator.queue.concurrent_tasks]\nmin = 1\nmax = 200 # Increased from 100\n```\n\n**Then**:\n1. Verify `validators/orchestrator-validator.ncl` still type-checks\n2. Form will automatically show new max (via constraint interpolation)\n3. Test with: `nu scripts/validate-config.nu values/orchestrator.*.ncl`\n\n## Constraint Interpolation in Forms\n\nTypeDialog supports dynamic constraint references via `${constraint.path.to.value}`:\n\n```\n# Static min/max\nmin = 1\nmax = 100\n\n# Dynamic from constraints.toml\nmin = "${constraint.orchestrator.queue.concurrent_tasks.min}"\nmax = "${constraint.orchestrator.queue.concurrent_tasks.max}"\n\n# Help text with dynamic reference\nhelp = "Value must be between ${constraint.orchestrator.queue.concurrent_tasks.min} and ${constraint.orchestrator.queue.concurrent_tasks.max}"\n```\n\n## Best Practices\n\n1. **Single source of truth** - Define constraint once in constraints.toml\n2. **Meaningful names** - Use clear path hierarchy (service.subsystem.property)\n3. **Document ranges** - Add comments explaining why min/max values exist\n4. **Validate propagation** - Ensure forms and validators reference the same constraint\n5. **Test edge cases** - Verify min/max values work in validators and forms\n\n## Files to Update When Modifying Constraints\n\nWhen you change `constraints/constraints.toml`:\n\n1. `validators/*.ncl` - Update validator bounds\n2. `forms/fragments/*.toml` - Update form field constraints\n3. `schemas/*.ncl` - Update type contracts if needed\n4. 
Documentation - Update service-specific constraint documentation\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05
+# Constraints
+
+Single source of truth for validation limits across all services.
+
+## Purpose
+
+The `constraints.toml` file defines:
+- **Numeric ranges** (min/max values for ports, workers, timeouts, etc.)
+- **Uniqueness rules** (field constraints, array bounds)
+- **Validation bounds** (resource limits, timeout ranges)
+
+These constraints are used by:
+1. **Validators** (`validators/*.ncl`) - Check that configuration values are within bounds
+2. **TypeDialog forms** (`forms/*.toml`) - Enable constraint interpolation for dynamic field validation
+3. **Nickel schemas** (`schemas/*.ncl`) - Define type contracts with bounds
+
+## File Structure
+
+```
+constraints/
+└── constraints.toml    # All validation constraints in TOML format
+```
+
+## Usage Pattern
+
+### 1. Define Constraint
+
+**constraints.toml**:
+
+```toml
+[orchestrator.queue.concurrent_tasks]
+min = 1
+max = 100
+```
+
+### 2. Reference in Validator
+
+**validators/orchestrator-validator.ncl**:
+
+```nickel
+let constraints = import "../constraints/constraints.toml" in
+
+{
+  ValidConcurrentTasks = fun tasks =>
+    if tasks < constraints.orchestrator.queue.concurrent_tasks.min then
+      error "Tasks below the configured minimum"
+    else if tasks > constraints.orchestrator.queue.concurrent_tasks.max then
+      error "Tasks above the configured maximum"
+    else
+      tasks,
+}
+```
+
+### 3. Reference in Form
+
+**forms/fragments/orchestrator-queue-section.toml**:
+
+```toml
+[[elements]]
+name = "max_concurrent_tasks"
+type = "number"
+min = "${constraint.orchestrator.queue.concurrent_tasks.min}"
+max = "${constraint.orchestrator.queue.concurrent_tasks.max}"
+help = "Max: ${constraint.orchestrator.queue.concurrent_tasks.max}"
+nickel_path = ["orchestrator", "queue", "max_concurrent_tasks"]
+```
+
+## Constraint Categories
+
+### Service-Specific Constraints
+
+- **Orchestrator** (`[orchestrator.*]`)
+  - Worker count bounds
+  - Queue concurrency limits
+  - Task timeout ranges
+  - Batch parallelism limits
+
+- **Control Center** (`[control_center.*]`)
+  - JWT token expiration bounds
+  - Rate limiting thresholds
+  - RBAC policy limits
+
+- **MCP Server** (`[mcp_server.*]`)
+  - Tool concurrency limits
+  - Resource size bounds
+  - Prompt template limits
+
+### Common Constraints
+
+- **Server** (`[common.server.*]`)
+  - Port range (1024-65535)
+  - Worker count
+  - Connection limits
+
+- **Deployment** (`[deployment.{solo,multiuser,cicd,enterprise}.*]`)
+  - CPU core bounds
+  - Memory allocation bounds
+  - Disk space requirements
+
+## Modifying Constraints
+
+When changing constraint bounds:
+
+1. **Update constraints.toml**
+2. **Update validators** that use the constraint
+3. **Update forms** that interpolate the constraint
+4. **Test validation** in forms and Nickel typecheck
+5. **Update documentation** of affected services
+
+### Example: Increase Max Queue Tasks
+
+**Before**:
+
+```toml
+[orchestrator.queue.concurrent_tasks]
+min = 1
+max = 100
+```
+
+**After**:
+
+```toml
+[orchestrator.queue.concurrent_tasks]
+min = 1
+max = 200  # Increased from 100
+```
+
+**Then**:
+1. Verify `validators/orchestrator-validator.ncl` still type-checks
+2. Form will automatically show new max (via constraint interpolation)
+3. Test with: `nu scripts/validate-config.nu values/orchestrator.*.ncl`
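+
+A quick way to exercise this checklist from the shell (a sketch; it only combines the typecheck and validation commands already shown in this README):
+
+```bash
+# Typecheck the validator that imports the changed constraints.toml
+nickel typecheck validators/orchestrator-validator.ncl
+
+# Re-validate existing values files against the new bound
+nu scripts/validate-config.nu values/orchestrator.solo.ncl
+```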
+
+## Constraint Interpolation in Forms
+
+TypeDialog supports dynamic constraint references via `${constraint.path.to.value}`:
+
+```toml
+# Static min/max
+min = 1
+max = 100
+
+# Dynamic from constraints.toml
+min = "${constraint.orchestrator.queue.concurrent_tasks.min}"
+max = "${constraint.orchestrator.queue.concurrent_tasks.max}"
+
+# Help text with dynamic reference
+help = "Value must be between ${constraint.orchestrator.queue.concurrent_tasks.min} and ${constraint.orchestrator.queue.concurrent_tasks.max}"
+```
+
+## Best Practices
+
+1. **Single source of truth** - Define each constraint once in constraints.toml
+2. **Meaningful names** - Use a clear path hierarchy (service.subsystem.property)
+3. **Document ranges** - Add comments explaining why min/max values exist
+4. **Validate propagation** - Ensure forms and validators reference the same constraint
+5. **Test edge cases** - Verify min/max values work in validators and forms
+
+## Files to Update When Modifying Constraints
+
+When you change `constraints/constraints.toml`:
+
+1. `validators/*.ncl` - Update validator bounds
+2. `forms/fragments/*.toml` - Update form field constraints
+3. `schemas/*.ncl` - Update type contracts if needed
+4. Documentation - Update service-specific constraint documentation
+
+---
+
+**Version**: 1.0.0
+**Last Updated**: 2025-01-05
\ No newline at end of file
diff --git a/schemas/platform/defaults/README.md b/schemas/platform/defaults/README.md
index 66484c4..22fb251 100644
--- a/schemas/platform/defaults/README.md
+++ b/schemas/platform/defaults/README.md
@@ -1 +1,314 @@
-# Defaults\n\nDefault configuration values for all services and deployment modes.\n\n## Purpose\n\nDefaults provide:\n- **Base values** for all configuration fields\n- **Mode-specific overrides** (solo, multiuser, cicd, enterprise)\n- **Composition with validators** for constraint checking\n- **Documentation** of recommended values\n\n## File Organization\n\n```\ndefaults/\n├── README.md # This file\n├── common/ # Shared defaults\n│ ├── server-defaults.ncl # HTTP server defaults\n│ ├── database-defaults.ncl # Database defaults\n│ ├── security-defaults.ncl # Security defaults\n│ ├── monitoring-defaults.ncl # Monitoring defaults\n│ └── logging-defaults.ncl # Logging defaults\n├── deployment/ # Mode-specific defaults\n│ ├── solo-defaults.ncl # Solo mode (2 CPU, 4GB)\n│ ├── multiuser-defaults.ncl # Multi-user mode (4 CPU, 8GB)\n│ ├── cicd-defaults.ncl # CI/CD mode (8 CPU, 16GB)\n│ └── enterprise-defaults.ncl # Enterprise mode (16+ CPU, 32+ GB)\n├── orchestrator-defaults.ncl # Orchestrator base defaults\n├── control-center-defaults.ncl # Control Center base defaults\n├── mcp-server-defaults.ncl # MCP Server base defaults\n└── installer-defaults.ncl # Installer base defaults\n```\n\n## Composition Pattern\n\nConfiguration is built from layers:\n\n```\nBase Defaults (service-defaults.ncl)\n ↓\n+ Mode Overlay (deployment/{mode}-defaults.ncl)\n ↓\n+ User Customization (values/{service}.{mode}.ncl)\n ↓\n+ Schema Validation (schemas/*.ncl)\n ↓\n= Final Configuration (configs/{service}.{mode}.ncl)\n```\n\nExample:\n\n```\n# configs/orchestrator.solo.ncl\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\nlet solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in\n\n{\n orchestrator = defaults.orchestrator & {\n # Mode-specific overrides\n server.workers = 2, # Solo mode: fewer workers\n queue.max_concurrent_tasks = 3, # Solo: limited concurrency\n
},\n}\n```\n\n## Default Value Hierarchy\n\n### 1. Service Base Defaults\n\n**orchestrator-defaults.ncl**:\n\n```\n{\n orchestrator = {\n workspace = {\n name = "default",\n path = "/var/lib/provisioning/orchestrator",\n enabled = true,\n multi_workspace = false,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 4, # General default\n },\n storage = {\n backend = 'filesystem,\n path = "/var/lib/provisioning/orchestrator/data",\n },\n queue = {\n max_concurrent_tasks = 5,\n retry_attempts = 3,\n },\n },\n}\n```\n\n### 2. Mode-Specific Overrides\n\n**deployment/solo-defaults.ncl**:\n\n```\n{\n resources = {\n cpu_cores = 2,\n memory_mb = 4096,\n },\n services = {\n orchestrator = {\n workers = 2, # Override: fewer workers for solo\n queue_max_concurrent_tasks = 3, # Override: limited concurrency\n storage_backend = 'filesystem,\n },\n },\n}\n```\n\n**deployment/enterprise-defaults.ncl**:\n\n```\n{\n resources = {\n cpu_cores = 16,\n memory_mb = 32768,\n },\n services = {\n orchestrator = {\n workers = 16, # Override: more workers for enterprise\n queue_max_concurrent_tasks = 50, # Override: high concurrency\n storage_backend = 'surrealdb_server,\n surrealdb_url = "surrealdb://cluster:8000",\n },\n },\n}\n```\n\n## Common Defaults\n\n### server-defaults.ncl\n\n```\n{\n server = {\n host = "0.0.0.0", # Accept all interfaces\n port = 8080, # Standard HTTP port (service-specific override)\n workers = 4, # CPU-aware default\n keep_alive = 75, # seconds\n max_connections = 100,\n },\n}\n```\n\n### database-defaults.ncl\n\n```\n{\n database = {\n backend = 'rocksdb, # Fast embedded default\n path = "/var/lib/provisioning/data",\n pool_size = 10, # Connection pool\n timeout = 30000, # milliseconds\n },\n}\n```\n\n### security-defaults.ncl\n\n```\n{\n security = {\n jwt_issuer = "provisioning-system",\n jwt_expiration = 3600, # 1 hour\n encryption_key = "", # User must set\n kms_backend = "age", # Local encryption\n mfa_required = false, # Solo: disabled by default\n },\n}\n```\n\n### monitoring-defaults.ncl\n\n```\n{\n monitoring = {\n enabled = false, # Optional feature\n metrics_interval = 60, # seconds\n health_check_interval = 30,\n retention_days = 30,\n },\n}\n```\n\n## Mode Configurations\n\n### Solo Mode\n- **Use case**: Single developer, testing\n- **Resources**: 2 CPU, 4GB RAM, 50GB disk\n- **Database**: Filesystem or embedded (RocksDB)\n- **Security**: Simplified (no MFA, local encryption)\n- **Services**: Core services only (orchestrator, control-center)\n\n### MultiUser Mode\n- **Use case**: Team collaboration, staging\n- **Resources**: 4 CPU, 8GB RAM, 100GB disk\n- **Database**: PostgreSQL or SurrealDB server\n- **Security**: RBAC enabled, shared authentication\n- **Services**: Full platform (orchestrator, control-center, MCP, Gitea)\n\n### CI/CD Mode\n- **Use case**: Automated pipelines, testing\n- **Resources**: 8 CPU, 16GB RAM, 200GB disk\n- **Database**: Ephemeral, fast cleanup\n- **Security**: API tokens, no UI\n- **Services**: Minimal (orchestrator in API mode)\n\n### Enterprise Mode\n- **Use case**: Production, high availability\n- **Resources**: 16+ CPU, 32+ GB RAM, 500GB+ disk\n- **Database**: SurrealDB cluster with replication\n- **Security**: MFA required, KMS integration, compliance\n- **Services**: Full platform with redundancy, monitoring, logging\n\n## Modifying Defaults\n\n### Changing a Base Default\n\n**orchestrator-defaults.ncl**:\n\n```\n# Before\nqueue = {\n max_concurrent_tasks = 5,\n},\n\n# After\nqueue = {\n max_concurrent_tasks = 10, # 
Increased default\n},\n```\n\n**Then**:\n1. Test with: `nickel eval configs/orchestrator.solo.ncl`\n2. Verify forms still work\n3. Update documentation if default meaning changes\n\n### Changing Mode Override\n\n**deployment/solo-defaults.ncl**:\n\n```\n# Before\norchestrator = {\n workers = 2,\n}\n\n# After\norchestrator = {\n workers = 1, # Reduce to 1 for solo\n}\n```\n\n## Best Practices\n\n1. **Keep it conservative** - Default to safe, minimal values\n2. **Document overrides** - Explain why mode-specific values differ\n3. **Use composition** - Import and merge rather than duplicate\n4. **Test composition** - Verify defaults merge correctly with modes\n5. **Provide examples** - Use `examples/` directory to show realistic setups\n\n## Testing Defaults\n\n```\n# Evaluate defaults\nnickel eval provisioning/.typedialog/provisioning/platform/defaults/orchestrator-defaults.ncl\n\n# Test merged defaults (base + mode)\nnickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl | head -50\n\n# Typecheck with schemas\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n```\n\n## Default Value Guidelines\n\n### Ports\n- Solo mode: Local (127.0.0.1) only\n- Multi-user/Enterprise: Bind all interfaces (0.0.0.0)\n- Never conflict with system services\n\n### Workers/Concurrency\n- Solo: 1-2 workers, limited concurrency\n- Multi-user: 4-8 workers, moderate concurrency\n- Enterprise: 8+ workers, high concurrency\n\n### Resources\n- Solo: 2 CPU, 4GB RAM (laptop testing)\n- Multi-user: 4 CPU, 8GB RAM (team servers)\n- Enterprise: 16+ CPU, 32+ GB RAM (production)\n\n### Security\n- Solo: Disabled/minimal (local development)\n- Multi-user: RBAC enabled (shared team)\n- Enterprise: MFA required, KMS backend (production)\n\n### Storage\n- Solo: Filesystem or RocksDB (no infrastructure needed)\n- Multi-user: PostgreSQL or SurrealDB (team data)\n- Enterprise: SurrealDB cluster with replication (HA)\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 +# Defaults + +Default configuration values for all services and deployment modes. 
+
+## Purpose
+
+Defaults provide:
+- **Base values** for all configuration fields
+- **Mode-specific overrides** (solo, multiuser, cicd, enterprise)
+- **Composition with validators** for constraint checking
+- **Documentation** of recommended values
+
+## File Organization
+
+```
+defaults/
+├── README.md                     # This file
+├── common/                       # Shared defaults
+│   ├── server-defaults.ncl       # HTTP server defaults
+│   ├── database-defaults.ncl     # Database defaults
+│   ├── security-defaults.ncl     # Security defaults
+│   ├── monitoring-defaults.ncl   # Monitoring defaults
+│   └── logging-defaults.ncl      # Logging defaults
+├── deployment/                   # Mode-specific defaults
+│   ├── solo-defaults.ncl         # Solo mode (2 CPU, 4GB)
+│   ├── multiuser-defaults.ncl    # Multi-user mode (4 CPU, 8GB)
+│   ├── cicd-defaults.ncl         # CI/CD mode (8 CPU, 16GB)
+│   └── enterprise-defaults.ncl   # Enterprise mode (16+ CPU, 32+ GB)
+├── orchestrator-defaults.ncl     # Orchestrator base defaults
+├── control-center-defaults.ncl   # Control Center base defaults
+├── mcp-server-defaults.ncl       # MCP Server base defaults
+└── installer-defaults.ncl        # Installer base defaults
+```
+
+## Composition Pattern
+
+Configuration is built from layers:
+
+```
+Base Defaults (service-defaults.ncl)
+        ↓
++ Mode Overlay (deployment/{mode}-defaults.ncl)
+        ↓
++ User Customization (values/{service}.{mode}.ncl)
+        ↓
++ Schema Validation (schemas/*.ncl)
+        ↓
+= Final Configuration (configs/{service}.{mode}.ncl)
+```
+
+Example:
+
+```nickel
+# configs/orchestrator.solo.ncl
+let defaults = import "../defaults/orchestrator-defaults.ncl" in
+let solo_defaults = import "../defaults/deployment/solo-defaults.ncl" in
+
+{
+  orchestrator = defaults.orchestrator & {
+    # Mode-specific overrides
+    server.workers = 2,               # Solo mode: fewer workers
+    queue.max_concurrent_tasks = 3,   # Solo: limited concurrency
+  },
+}
+```
+
+## Default Value Hierarchy
+
+### 1. Service Base Defaults
+
+**orchestrator-defaults.ncl**:
+
+```nickel
+{
+  orchestrator = {
+    workspace = {
+      name = "default",
+      path = "/var/lib/provisioning/orchestrator",
+      enabled = true,
+      multi_workspace = false,
+    },
+    server = {
+      host = "127.0.0.1",
+      port = 9090,
+      workers = 4,   # General default
+    },
+    storage = {
+      backend = 'filesystem,
+      path = "/var/lib/provisioning/orchestrator/data",
+    },
+    queue = {
+      max_concurrent_tasks = 5,
+      retry_attempts = 3,
+    },
+  },
+}
+```
+
+### 2. Mode-Specific Overrides
+
+**deployment/solo-defaults.ncl**:
+
+```nickel
+{
+  resources = {
+    cpu_cores = 2,
+    memory_mb = 4096,
+  },
+  services = {
+    orchestrator = {
+      workers = 2,                      # Override: fewer workers for solo
+      queue_max_concurrent_tasks = 3,   # Override: limited concurrency
+      storage_backend = 'filesystem,
+    },
+  },
+}
+```
+
+**deployment/enterprise-defaults.ncl**:
+
+```nickel
+{
+  resources = {
+    cpu_cores = 16,
+    memory_mb = 32768,
+  },
+  services = {
+    orchestrator = {
+      workers = 16,                      # Override: more workers for enterprise
+      queue_max_concurrent_tasks = 50,   # Override: high concurrency
+      storage_backend = 'surrealdb_server,
+      surrealdb_url = "surrealdb://cluster:8000",
+    },
+  },
+}
+```
+
+## Common Defaults
+
+### server-defaults.ncl
+
+```nickel
+{
+  server = {
+    host = "0.0.0.0",        # Accept all interfaces
+    port = 8080,             # Standard HTTP port (service-specific override)
+    workers = 4,             # CPU-aware default
+    keep_alive = 75,         # seconds
+    max_connections = 100,
+  },
+}
+```
+
+### database-defaults.ncl
+
+```nickel
+{
+  database = {
+    backend = 'rocksdb,      # Fast embedded default
+    path = "/var/lib/provisioning/data",
+    pool_size = 10,          # Connection pool
+    timeout = 30000,         # milliseconds
+  },
+}
+```
+
+### security-defaults.ncl
+
+```nickel
+{
+  security = {
+    jwt_issuer = "provisioning-system",
+    jwt_expiration = 3600,   # 1 hour
+    encryption_key = "",     # User must set
+    kms_backend = "age",     # Local encryption
+    mfa_required = false,    # Solo: disabled by default
+  },
+}
+```
+
+### monitoring-defaults.ncl
+
+```nickel
+{
+  monitoring = {
+    enabled = false,              # Optional feature
+    metrics_interval = 60,        # seconds
+    health_check_interval = 30,
+    retention_days = 30,
+  },
+}
+```
+
+## Mode Configurations
+
+### Solo Mode
+- **Use case**: Single developer, testing
+- **Resources**: 2 CPU, 4GB RAM, 50GB disk
+- **Database**: Filesystem or embedded (RocksDB)
+- **Security**: Simplified (no MFA, local encryption)
+- **Services**: Core services only (orchestrator, control-center)
+
+### MultiUser Mode
+- **Use case**: Team collaboration, staging
+- **Resources**: 4 CPU, 8GB RAM, 100GB disk
+- **Database**: PostgreSQL or SurrealDB server
+- **Security**: RBAC enabled, shared authentication
+- **Services**: Full platform (orchestrator, control-center, MCP, Gitea)
+
+### CI/CD Mode
+- **Use case**: Automated pipelines, testing
+- **Resources**: 8 CPU, 16GB RAM, 200GB disk
+- **Database**: Ephemeral, fast cleanup
+- **Security**: API tokens, no UI
+- **Services**: Minimal (orchestrator in API mode)
+
+### Enterprise Mode
+- **Use case**: Production, high availability
+- **Resources**: 16+ CPU, 32+ GB RAM, 500GB+ disk
+- **Database**: SurrealDB cluster with replication
+- **Security**: MFA required, KMS integration, compliance
+- **Services**: Full platform with redundancy, monitoring, logging
+
+## Modifying Defaults
+
+### Changing a Base Default
+
+**orchestrator-defaults.ncl**:
+
+```nickel
+# Before
+queue = {
+  max_concurrent_tasks = 5,
+},
+
+# After
+queue = {
+  max_concurrent_tasks = 10,   # Increased default
+},
+```
+
+**Then**:
+1. Test with: `nickel eval configs/orchestrator.solo.ncl`
+2. Verify forms still work
+3. Update documentation if default meaning changes
+
+### Changing Mode Override
+
+**deployment/solo-defaults.ncl**:
+
+```nickel
+# Before
+orchestrator = {
+  workers = 2,
+}
+
+# After
+orchestrator = {
+  workers = 1,   # Reduce to 1 for solo
+}
+```
+
+## Best Practices
+
+1. **Keep it conservative** - Default to safe, minimal values
+2.
**Document overrides** - Explain why mode-specific values differ +3. **Use composition** - Import and merge rather than duplicate +4. **Test composition** - Verify defaults merge correctly with modes +5. **Provide examples** - Use `examples/` directory to show realistic setups + +## Testing Defaults + +```bash +# Evaluate defaults +nickel eval provisioning/.typedialog/provisioning/platform/defaults/orchestrator-defaults.ncl + +# Test merged defaults (base + mode) +nickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl | head -50 + +# Typecheck with schemas +nickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl +``` + +## Default Value Guidelines + +### Ports +- Solo mode: Local (127.0.0.1) only +- Multi-user/Enterprise: Bind all interfaces (0.0.0.0) +- Never conflict with system services + +### Workers/Concurrency +- Solo: 1-2 workers, limited concurrency +- Multi-user: 4-8 workers, moderate concurrency +- Enterprise: 8+ workers, high concurrency + +### Resources +- Solo: 2 CPU, 4GB RAM (laptop testing) +- Multi-user: 4 CPU, 8GB RAM (team servers) +- Enterprise: 16+ CPU, 32+ GB RAM (production) + +### Security +- Solo: Disabled/minimal (local development) +- Multi-user: RBAC enabled (shared team) +- Enterprise: MFA required, KMS backend (production) + +### Storage +- Solo: Filesystem or RocksDB (no infrastructure needed) +- Multi-user: PostgreSQL or SurrealDB (team data) +- Enterprise: SurrealDB cluster with replication (HA) + +--- + +**Version**: 1.0.0 +**Last Updated**: 2025-01-05 \ No newline at end of file diff --git a/schemas/platform/examples/README.md b/schemas/platform/examples/README.md index 6a19f5a..77490cd 100644 --- a/schemas/platform/examples/README.md +++ b/schemas/platform/examples/README.md @@ -1 +1,897 @@ -# Provisioning Platform Configuration Examples\n\nProduction-ready reference configurations demonstrating different deployment scenarios and best practices.\n\n## Purpose\n\nExamples provide:\n- **Real-world configurations** - Complete, tested working setups ready for production use\n- **Best practices** - Recommended patterns, values, and architectural approaches\n- **Learning resource** - How to use the configuration system effectively\n- **Starting point** - Copy, customize, and deploy for your environment\n- **Documentation** - Detailed inline comments explaining every configuration option\n\n## Quick Start\n\nChoose your deployment mode and get started immediately:\n\n```\n# Solo development (local, single developer)\nnickel export --format toml orchestrator-solo.ncl > orchestrator.toml\n\n# Team collaboration (PostgreSQL, RBAC, audit logging)\nnickel export --format toml control-center-multiuser.ncl > control-center.toml\n\n# Production enterprise (HA, SurrealDB cluster, full monitoring)\nnickel export --format toml full-platform-enterprise.ncl > platform.toml\n```\n\n## Example Configurations by Mode\n\n### 1. 
orchestrator-solo.ncl\n\n**Deployment Mode**: Solo (Single Developer)\n\n**Resource Requirements**:\n- CPU: 2 cores\n- RAM: 4 GB\n- Disk: 50 GB (local data)\n\n**Configuration Highlights**:\n- **Workspace**: Local `dev-workspace` at `/home/developer/provisioning/data/orchestrator`\n- **Server**: Localhost binding (127.0.0.1:9090), 2 workers, 128 connections max\n- **Storage**: Filesystem backend (no external database required)\n- **Queue**: 3 max concurrent tasks (minimal for development)\n- **Batch**: 2 parallel limit with frequent checkpointing (every 50 operations)\n- **Logging**: Debug level, human-readable text format, concurrent stdout + file output\n- **Security**: Auth disabled, CORS allows all origins, no TLS\n- **Monitoring**: Health checks only (metrics disabled), resource tracking disabled\n- **Features**: Experimental features enabled for testing and iteration\n\n**Ideal For**:\n- ✅ Single developer local development\n- ✅ Quick prototyping and experimentation\n- ✅ Learning the provisioning platform\n- ✅ CI/CD local testing without external services\n\n**Key Advantages**:\n- No external dependencies (database-free)\n- Fast startup (<10 seconds)\n- Minimal resource footprint\n- Verbose debug logging for troubleshooting\n- Zero security overhead\n\n**Key Limitations**:\n- Localhost-only (not accessible remotely)\n- Single-threaded processing (3 concurrent tasks max)\n- No persistence across restarts (if using `:memory:` storage)\n- No audit logging\n\n**Usage**:\n\n```\n# Export to TOML and run\nnickel export --format toml orchestrator-solo.ncl > orchestrator.solo.toml\nORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator\n\n# With TypeDialog interactive configuration\nnu ../../scripts/configure.nu orchestrator solo --backend cli\n```\n\n**Customization Examples**:\n\n```\n# Increase concurrency for testing (still development-friendly)\nqueue.max_concurrent_tasks = 5\n\n# Reduce debug noise for cleaner logs\nlogging.level = "info"\n\n# Change workspace location\nworkspace.path = "/path/to/my/workspace"\n```\n\n---\n\n### 2. 
orchestrator-enterprise.ncl\n\n**Deployment Mode**: Enterprise (Production High-Availability)\n\n**Resource Requirements**:\n- CPU: 8+ cores (recommended 16)\n- RAM: 16+ GB (recommended 32+ GB)\n- Disk: 500+ GB (SurrealDB cluster)\n\n**Configuration Highlights**:\n- **Workspace**: Production `production` workspace at `/var/lib/provisioning/orchestrator` with multi-workspace support enabled\n- **Server**: All interfaces binding (0.0.0.0:9090), 16 workers, 4096 connections max\n- **Storage**: SurrealDB cluster (3 nodes) for distributed storage and high availability\n- **Queue**: 100 max concurrent tasks, 5 retry attempts, 2-hour timeout for long-running operations\n- **Batch**: 50 parallel limit with frequent checkpointing (every 1000 operations) and automatic cleanup\n- **Logging**: Info level, JSON structured format for log aggregation\n - Standard logs: 500MB files, kept 30 versions (90 days)\n - Audit logs: 200MB files, kept 365 versions (1 year)\n- **Security**: JWT authentication required, specific CORS origins, TLS 1.3 mandatory, 10,000 RPS rate limit\n- **Extensions**: Auto-load from OCI registry with daily refresh, 10 concurrent initializations\n- **Monitoring**:\n - Metrics every 10 seconds\n - Profiling at 10% sample rate\n - Resource tracking with CPU/memory/disk alerts\n - Health checks every 30 seconds\n- **Features**: Audit logging, task history, performance tracking all enabled\n\n**Ideal For**:\n- ✅ Production deployments with SLAs\n- ✅ High-throughput, mission-critical workloads\n- ✅ Multi-team environments requiring audit trails\n- ✅ Large-scale infrastructure deployments\n- ✅ Compliance and governance requirements\n\n**Key Advantages**:\n- High availability (3 SurrealDB replicas with failover)\n- Production security (JWT + TLS 1.3 mandatory)\n- Full observability (metrics, profiling, audit logs)\n- High throughput (100 concurrent tasks)\n- Extension management via OCI registry\n- Automatic rollback and recovery capabilities\n\n**Key Limitations**:\n- Requires SurrealDB cluster setup and maintenance\n- Resource-intensive (8+ CPU, 16+ GB RAM minimum)\n- More complex initial setup and configuration\n- Requires secrets management (JWT keys, TLS certificates)\n- Network isolation and load balancing setup required\n\n**Environment Variables Required**:\n\n```\nexport JWT_SECRET=""\nexport SURREALDB_PASSWORD=""\n```\n\n**Usage**:\n\n```\n# Deploy standalone with SurrealDB\nnickel export --format toml orchestrator-enterprise.ncl > orchestrator.enterprise.toml\nORCHESTRATOR_CONFIG=orchestrator.enterprise.toml cargo run --bin orchestrator\n\n# Deploy to Kubernetes with all enterprise infrastructure\nnu ../../scripts/render-kubernetes.nu enterprise --namespace production\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/*.yaml\n```\n\n**Customization Examples**:\n\n```\n# Adjust concurrency for your specific infrastructure\nqueue.max_concurrent_tasks = 50 # Scale down if resource-constrained\n\n# Change SurrealDB cluster endpoints\nstorage.surrealdb_url = "surrealdb://node1:8000,node2:8000,node3:8000"\n\n# Modify audit log retention for compliance\nlogging.outputs[1].rotation.max_backups = 2555 # 7 years for HIPAA compliance\n\n# Increase rate limiting for high-frequency integrations\nsecurity.rate_limit.requests_per_second = 20000\n```\n\n---\n\n### 3. 
control-center-multiuser.ncl\n\n**Deployment Mode**: MultiUser (Team Collaboration & Staging)\n\n**Resource Requirements**:\n- CPU: 4 cores\n- RAM: 8 GB\n- Disk: 100 GB (PostgreSQL data + logs)\n\n**Configuration Highlights**:\n- **Server**: All interfaces binding (0.0.0.0:8080), 4 workers, 256 connections max\n- **Database**: PostgreSQL with connection pooling (min 5, max 20 connections)\n- **Auth**: JWT with 8-hour token expiration (aligned with team workday)\n- **RBAC**: 4 pre-defined roles with granular permissions\n - `admin`: Infrastructure lead with full access (`*` permissions)\n - `operator`: Operations team - execute, manage, view workflows and policies\n - `developer`: Development team - read-only workflow and policy access\n - `viewer`: Minimal read-only for non-technical stakeholders\n- **MFA**: Optional per-user (TOTP + email methods available, not globally required)\n- **Password Policies**: 12-character minimum, requires uppercase/lowercase/digits, 90-day rotation, history count of 3\n- **Session Policies**: 8-hour maximum duration, 1-hour idle timeout, 3 concurrent sessions per user\n- **Rate Limiting**: 1000 RPS global, 100 RPS per-user, 20 burst requests\n- **CORS**: Allows localhost:3000 (dev), control-center.example.com, orchestrator.example.com\n- **Logging**: Info level, JSON format, 200MB files kept 15 versions (90 days retention)\n- **Features**: Audit logging enabled, policy enforcement enabled\n\n**Ideal For**:\n- ✅ Team collaboration (2-50 engineers)\n- ✅ Staging environments before production\n- ✅ Development team operations\n- ✅ RBAC with different access levels\n- ✅ Compliance-light environments (SOC2 optional)\n\n**Key Advantages**:\n- Team-friendly security (optional MFA, reasonable password policy)\n- RBAC supports different team roles and responsibilities\n- Persistent storage (PostgreSQL) maintains state across restarts\n- Audit trail for basic compliance\n- Flexible session management (multiple concurrent sessions)\n- Good balance of security and usability\n\n**Key Limitations**:\n- Requires PostgreSQL database setup\n- Single replica (not HA by default)\n- More complex than solo mode\n- RBAC requires careful role definition\n\n**Environment Variables Required**:\n\n```\nexport DB_PASSWORD=""\nexport JWT_SECRET=""\n```\n\n**Usage**:\n\n```\n# Generate and deploy\nnickel export --format toml control-center-multiuser.ncl > control-center.multiuser.toml\nCONTROL_CENTER_CONFIG=control-center.multiuser.toml cargo run --bin control-center\n\n# With Docker Compose for team\nnu ../../scripts/render-docker-compose.nu multiuser\ndocker-compose -f docker-compose.multiuser.yml up -d\n\n# Access the UI\n# http://localhost:8080 (or your configured domain)\n```\n\n**RBAC Quick Reference**:\n\n| Role | Intended Users | Key Permissions |\n| ------ | ---------------- | ----------------- |\n| admin | Infrastructure leads | All operations: full access |\n| operator | Operations engineers | Execute workflows, manage tasks, view policies |\n| developer | Application developers | View workflows, view policies (read-only) |\n| viewer | Non-technical (PM, QA) | View workflows only (minimal read) |\n\n**Customization Examples**:\n\n```\n# Require MFA globally for higher security\nmfa.required = true\n\n# Add custom role for auditors\nrbac.roles.auditor = {\n description = "Compliance auditor",\n permissions = ["audit.view", "orchestrator.view"],\n}\n\n# Adjust for larger team (more concurrent sessions)\npolicies.session.max_concurrent = 5\n\n# Stricter password policy for 
regulated industry\npolicies.password = {\n min_length = 16,\n require_special_chars = true,\n expiration_days = 60,\n history_count = 8,\n}\n```\n\n---\n\n### 4. full-platform-enterprise.ncl\n\n**Deployment Mode**: Enterprise Integrated (Complete Platform)\n\n**Resource Requirements**:\n- CPU: 16+ cores (3 replicas × 4 cores each + infrastructure)\n- RAM: 32+ GB (orchestrator 12GB + control-center 4GB + databases 12GB + monitoring 4GB)\n- Disk: 1+ TB (databases, logs, metrics, artifacts)\n\n**Services Configured**:\n\n**Orchestrator Section**:\n- SurrealDB cluster (3 nodes) for distributed workflow storage\n- 100 concurrent tasks with 5 retry attempts\n- Full audit logging and monitoring\n- JWT authentication with configurable token expiration\n- Extension loading from OCI registry\n- High-performance tuning (16 workers, 4096 connections)\n\n**Control Center Section**:\n- PostgreSQL HA backend for policy/RBAC storage\n- Full RBAC (4 roles with 7+ permissions each)\n- MFA required (TOTP + email methods)\n- SOC2 compliance enabled with audit logging\n- Strict password policy (16+ chars, special chars required)\n- 30-minute session idle timeout for security\n- Per-user rate limiting (100 RPS)\n\n**MCP Server Section**:\n- Claude integration for AI-powered provisioning\n- Full MCP capability support (tools, resources, prompts, sampling)\n- Orchestrator and Control Center integration\n- Read-only filesystem access with 10MB file limit\n- JWT authentication\n- Advanced audit logging (all requests logged except sensitive data)\n- 100 RPS rate limiting with 20-request burst\n\n**Global Configuration**:\n\n```\nlet deployment_mode = "enterprise"\nlet namespace = "provisioning"\nlet domain = "provisioning.example.com"\nlet environment = "production"\n```\n\n**Infrastructure Components** (when deployed to Kubernetes):\n- Load Balancer (Nginx) - TLS termination, CORS, rate limiting\n- 3x Orchestrator replicas - Distributed processing\n- 2x Control Center replicas - Policy management\n- 1-2x MCP Server replicas - AI integration\n- PostgreSQL HA - Primary/replica setup\n- SurrealDB cluster - 3 nodes with replication\n- Prometheus - Metrics collection\n- Grafana - Visualization and dashboards\n- Loki - Log aggregation\n- Harbor - Private OCI image registry\n\n**Ideal For**:\n- ✅ Production deployments with full SLAs\n- ✅ Enterprise compliance requirements (SOC2, HIPAA)\n- ✅ Multi-team organizations\n- ✅ AI/LLM integration for provisioning\n- ✅ Large-scale infrastructure management (1000+ resources)\n- ✅ High-availability deployments with 99.9%+ uptime requirements\n\n**Key Advantages**:\n- Complete service integration (no missing pieces)\n- Production-grade HA setup (3 replicas, load balancing)\n- Full compliance and audit capabilities\n- AI/LLM integration via MCP Server\n- Comprehensive monitoring and observability\n- Clear separation of concerns per service\n- Global variables for easy parameterization\n\n**Key Limitations**:\n- Complex setup requiring multiple services\n- Resource-intensive (16+ CPU, 32+ GB RAM minimum)\n- Requires Kubernetes or advanced Docker Compose setup\n- Multiple databases to maintain (PostgreSQL + SurrealDB)\n- Network setup complexity (TLS, CORS, rate limiting)\n\n**Environment Variables Required**:\n\n```\n# Database credentials\nexport DB_PASSWORD=""\nexport SURREALDB_PASSWORD=""\n\n# Security\nexport JWT_SECRET=""\nexport KMS_KEY=""\n\n# AI/LLM integration\nexport CLAUDE_API_KEY=""\nexport CLAUDE_MODEL="claude-3-opus-20240229"\n\n# TLS certificates (for 
production)\nexport TLS_CERT=""\nexport TLS_KEY=""\n```\n\n**Architecture Diagram**:\n\n```\n┌───────────────────────────────────────────────┐\n│ Nginx Load Balancer (TLS, CORS, RateLimit) │\n│ https://orchestrator.example.com │\n│ https://control-center.example.com │\n│ https://mcp.example.com │\n└──────────┬──────────────────────┬─────────────┘\n │ │\n ┌──────▼──────┐ ┌────────▼────────┐\n │ Orchestrator│ │ Control Center │\n │ (3 replicas)│ │ (2 replicas) │\n └──────┬──────┘ └────────┬────────┘\n │ │\n ┌──────▼──────┐ ┌────────▼────────┐ ┌─────────────────┐\n │ SurrealDB │ │ PostgreSQL HA │ │ MCP Server │\n │ Cluster │ │ │ │ (1-2 replicas) │\n │ (3 nodes) │ │ Primary/Replica│ │ │\n └─────────────┘ └─────────────────┘ │ ↓ Claude API │\n └─────────────────┘\n\n ┌─────────────────────────────────────────────────┐\n │ Observability Stack (Optional) │\n ├──────────────────┬──────────────────────────────┤\n │ Prometheus │ Grafana │ Loki │\n │ (Metrics) │ (Dashboards) │ (Logs) │\n └──────────────────┴──────────────────────────────┘\n```\n\n**Usage**:\n\n```\n# Export complete configuration\nnickel export --format toml full-platform-enterprise.ncl > platform.toml\n\n# Extract individual service configs if needed\n# (Each service extracts its section from platform.toml)\n\n# Deploy to Kubernetes with all enterprise infrastructure\nnu ../../scripts/render-kubernetes.nu enterprise --namespace production\n\n# Apply all manifests\nkubectl create namespace production\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/*.yaml\n\n# Or deploy with Docker Compose for single-node testing\nnu ../../scripts/render-docker-compose.nu enterprise\ndocker-compose -f docker-compose.enterprise.yml up -d\n```\n\n**Customization Examples**:\n\n```\n# Adjust deployment domain\nlet domain = "my-company.com"\nlet namespace = "infrastructure"\n\n# Scale for higher throughput\norchestrator.queue.max_concurrent_tasks = 200\norchestrator.security.rate_limit.requests_per_second = 50000\n\n# Add HIPAA compliance\ncontrol_center.policies.compliance.hipaa.enabled = true\ncontrol_center.policies.audit.retention_days = 2555 # 7 years\n\n# Custom MCP Server model\nmcp_server.integration.claude.model = "claude-3-sonnet-20240229"\n\n# Enable caching for performance\nmcp_server.features.enable_caching = true\nmcp_server.performance.cache_ttl = 7200\n```\n\n---\n\n## Deployment Mode Comparison Matrix\n\n| Feature | Solo | MultiUser | Enterprise |\n| --------- | ------ | ----------- | ----------- |\n| **Ideal For** | Dev | Team/Staging | Production |\n| **Storage** | Filesystem | PostgreSQL | SurrealDB Cluster |\n| **Replicas** | 1 | 1 | 3+ (HA) |\n| **Max Concurrency** | 3 tasks | 5-10 | 100 |\n| **Security** | None | RBAC + JWT | Full + MFA + SOC2 |\n| **Monitoring** | Health check | Basic | Full (Prom+Grafana) |\n| **Setup Time** | <5 min | 15 min | 30+ min |\n| **Min CPU** | 2 | 4 | 16 |\n| **Min RAM** | 4GB | 8GB | 32GB |\n| **Audit Logs** | No | 90 days | 365 days |\n| **TLS Required** | No | No | Yes |\n| **Compliance** | None | Basic | SOC2 + HIPAA ready |\n\n---\n\n## Getting Started Guide\n\n### Step 1: Choose Your Deployment Mode\n\n- **Solo**: Single developer working locally → Use `orchestrator-solo.ncl`\n- **Team**: 2-50 engineers, staging environment → Use `control-center-multiuser.ncl`\n- **Production**: Full enterprise deployment → Use `full-platform-enterprise.ncl`\n\n### Step 2: Export Configuration to TOML\n\n```\n# Start with solo mode\nnickel export --format toml orchestrator-solo.ncl > 
orchestrator.toml\n\n# Validate the export\ncat orchestrator.toml | head -20\n```\n\n### Step 3: Validate Configuration\n\n```\n# Typecheck the Nickel configuration\nnickel typecheck orchestrator-solo.ncl\n\n# Validate using provided script\nnu ../../scripts/validate-config.nu orchestrator-solo.ncl\n```\n\n### Step 4: Customize for Your Environment\n\nEdit the exported `.toml` or the `.ncl` file:\n\n```\n# Option A: Edit TOML directly (simpler)\nvi orchestrator.toml # Change workspace path, port, etc.\n\n# Option B: Edit Nickel and re-export (type-safe)\nvi orchestrator-solo.ncl\nnickel export --format toml orchestrator-solo.ncl > orchestrator.toml\n```\n\n### Step 5: Deploy\n\n```\n# Docker Compose\nORCHESTRATOR_CONFIG=orchestrator.toml docker-compose up -d\n\n# Direct Rust execution\nORCHESTRATOR_CONFIG=orchestrator.toml cargo run --bin orchestrator\n\n# Kubernetes\nnu ../../scripts/render-kubernetes.nu solo\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/*.yaml\n```\n\n---\n\n## Common Customizations\n\n### Changing Domain/Namespace\n\nIn any `.ncl` file at the top:\n\n```\nlet domain = "your-domain.com"\nlet namespace = "your-namespace"\nlet environment = "your-env"\n```\n\n### Increasing Resource Limits\n\nFor higher throughput:\n\n```\nqueue.max_concurrent_tasks = 200 # Default: 100\nsecurity.rate_limit.requests_per_second = 50000 # Default: 10000\nserver.workers = 32 # Default: 16\n```\n\n### Enabling Compliance Features\n\nFor regulated environments:\n\n```\npolicies.compliance.soc2.enabled = true\npolicies.compliance.hipaa.enabled = true\npolicies.audit.retention_days = 2555 # 7 years\n```\n\n### Custom Logging\n\nFor troubleshooting:\n\n```\nlogging.level = "debug" # Default: info\nlogging.format = "text" # Default: json (use text for development)\nlogging.outputs[0].level = "debug" # stdout level\n```\n\n---\n\n## Validation & Testing\n\n### Syntax Validation\n\n```\n# Typecheck all examples\nfor f in *.ncl; do\n echo "Checking $f..."\n nickel typecheck "$f"\ndone\n```\n\n### Configuration Export\n\n```\n# Export to TOML\nnickel export --format toml orchestrator-solo.ncl | head -30\n\n# Export to JSON\nnickel export --format json full-platform-enterprise.ncl | jq '.orchestrator.server'\n```\n\n### Load in Rust Application\n\n```\n# With dry-run flag (if supported)\nORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator -- --validate\n\n# Or simply attempt startup\nORCHESTRATOR_CONFIG=orchestrator.solo.toml timeout 5 cargo run --bin orchestrator\n```\n\n---\n\n## Troubleshooting\n\n### "Type mismatch" Error\n\n**Cause**: Field value doesn't match expected type\n\n**Fix**: Check the schema for correct type. 
Common issues:\n- Use `true`/`false` not `"true"`/`"false"` for booleans\n- Use `9090` not `"9090"` for numbers\n- Use record syntax `{ key = value }` not `{ "key": value }`\n\n### Port Already in Use\n\n**Fix**: Change the port in your configuration:\n\n```\nserver.port = 9999 # Instead of 9090\n```\n\n### Database Connection Errors\n\n**Fix**: For multiuser/enterprise modes:\n- Ensure PostgreSQL is running: `docker-compose up -d postgres`\n- Verify credentials in environment variables\n- Check network connectivity\n- Validate connection string format\n\n### Import Not Found\n\n**Fix**: Ensure all relative paths in imports are correct:\n\n```\n# Correct (relative to examples/)\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\n\n# Wrong (absolute path)\nlet defaults = import "/full/path/to/defaults.ncl" in\n```\n\n---\n\n## Best Practices\n\n1. **Start Small**: Begin with solo mode, graduate to multiuser, then enterprise\n2. **Environment Variables**: Never hardcode secrets, use environment variables\n3. **Version Control**: Keep examples in Git with clear comments\n4. **Validation**: Always typecheck and export before deploying\n5. **Documentation**: Add comments explaining non-obvious configuration choices\n6. **Testing**: Deploy to staging first, validate all services before production\n7. **Monitoring**: Enable metrics and logging from day one for easier troubleshooting\n8. **Backups**: Regular backups of database state and configurations\n\n---\n\n## Adding New Examples\n\n### Create a Custom Example\n\n```\n# Copy an existing example as template\ncp orchestrator-solo.ncl orchestrator-custom.ncl\n\n# Edit for your use case\nvi orchestrator-custom.ncl\n\n# Validate\nnickel typecheck orchestrator-custom.ncl\n\n# Export and test\nnickel export --format toml orchestrator-custom.ncl > orchestrator.custom.toml\n```\n\n### Naming Convention\n\n- **Service + Mode**: `{service}-{mode}.ncl` (orchestrator-solo.ncl)\n- **Scenario**: `{service}-{scenario}.ncl` (orchestrator-high-throughput.ncl)\n- **Full Stack**: `full-platform-{mode}.ncl` (full-platform-enterprise.ncl)\n\n---\n\n## See Also\n\n- **Parent README**: `../README.md` - Complete configuration system overview\n- **Schemas**: `../schemas/` - Type definitions and validation rules\n- **Defaults**: `../defaults/` - Base configurations for composition\n- **Scripts**: `../scripts/` - Automation for configuration workflow\n- **Forms**: `../forms/` - Interactive TypeDialog form definitions\n\n---\n\n**Version**: 2.0\n**Last Updated**: 2025-01-05\n**Status**: Production Ready - All examples tested and validated\n\n## Using Examples\n\n### View Example\n\n```\ncat provisioning/.typedialog/provisioning/platform/examples/orchestrator-solo.ncl\n```\n\n### Copy and Customize\n\n```\n# Start with solo example\ncp examples/orchestrator-solo.ncl values/orchestrator.solo.ncl\n\n# Edit for your environment\nvi values/orchestrator.solo.ncl\n\n# Validate\nnu scripts/validate-config.nu values/orchestrator.solo.ncl\n```\n\n### Generate from Example\n\n```\n# Use example as base, regenerate with TypeDialog\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n## Example Structure\n\nEach example is a complete Nickel configuration:\n\n```\n# orchestrator-solo.ncl\n{\n orchestrator = {\n workspace = { },\n server = { },\n storage = { },\n queue = { },\n monitoring = { },\n },\n}\n```\n\n## Configuration Elements\n\n### Workspace Configuration\n- **name** - Workspace identifier\n- **path** - Directory path\n- **enabled** - 
Enable/disable flag\n- **multi_workspace** - Support multiple workspaces\n\n### Server Configuration\n- **host** - Bind address (127.0.0.1 for solo, 0.0.0.0 for public)\n- **port** - Listen port\n- **workers** - Thread count (mode-dependent)\n- **keep_alive** - Connection keep-alive timeout\n- **max_connections** - Connection limit\n\n### Storage Configuration\n- **backend** - 'filesystem | 'rocksdb | 'surrealdb | 'postgres\n- **path** - Local storage path (filesystem/rocksdb)\n- **connection_string** - DB URL (surrealdb/postgres)\n\n### Queue Configuration (Orchestrator)\n- **max_concurrent_tasks** - Concurrent task limit\n- **retry_attempts** - Retry count\n- **retry_delay** - Delay between retries (ms)\n- **task_timeout** - Task execution timeout (ms)\n\n### Monitoring Configuration (Optional)\n- **enabled** - Enable metrics collection\n- **metrics_interval** - Collection frequency (seconds)\n- **health_check_interval** - Health check frequency\n\n## Creating New Examples\n\n### 1. Start with Existing Example\n\n```\ncp examples/orchestrator-solo.ncl examples/orchestrator-custom.ncl\n```\n\n### 2. Modify for Your Use Case\n\n```\n# Update configuration values\norchestrator.server.workers = 8 # More workers\norchestrator.queue.max_concurrent_tasks = 20 # Higher concurrency\n```\n\n### 3. Validate Configuration\n\n```\nnickel typecheck examples/orchestrator-custom.ncl\nnickel eval examples/orchestrator-custom.ncl\n```\n\n### 4. Document Purpose\nAdd comments explaining:\n- Use case (deployment scenario)\n- Resource requirements\n- Expected load\n- Customization needed\n\n### 5. Save as Reference\n\n```\nmv examples/orchestrator-custom.ncl examples/orchestrator-{scenario}.ncl\n```\n\n## Best Practices for Examples\n\n1. **Clear documentation** - Explain the use case at the top\n2. **Realistic values** - Use production-appropriate configurations\n3. **Complete configuration** - Include all required sections\n4. **Inline comments** - Explain non-obvious choices\n5. **Validated** - Typecheck all examples before committing\n6. 
**Organized** - Group by service and deployment mode\n\n## Example Naming Convention\n\n- **Service-mode**: `{service}-{mode}.ncl` (orchestrator-solo.ncl)\n- **Scenario**: `{service}-{scenario}.ncl` (orchestrator-gpu-intensive.ncl)\n- **Full stack**: `full-platform-{mode}.ncl` (full-platform-enterprise.ncl)\n\n## Customizing Examples\n\n### For Your Environment\n\n```\n# orchestrator-solo.ncl (customized)\n{\n orchestrator = {\n workspace = {\n name = "my-workspace", # Your workspace name\n path = "/home/user/projects/workspace", # Your path\n },\n server = {\n host = "127.0.0.1", # Keep local for solo\n port = 9090,\n },\n storage = {\n backend = 'filesystem, # No external DB needed\n path = "/home/user/provisioning/data", # Your path\n },\n },\n}\n```\n\n### For Different Resources\n\n```\n# orchestrator-multiuser.ncl (customized for team)\n{\n orchestrator = {\n server = {\n host = "0.0.0.0", # Public binding\n port = 9090,\n workers = 4, # Team concurrency\n },\n queue = {\n max_concurrent_tasks = 10, # Team workload\n },\n },\n}\n```\n\n## Testing Examples\n\n```\n# Typecheck example\nnickel typecheck examples/orchestrator-solo.ncl\n\n# Evaluate and view\nnickel eval examples/orchestrator-solo.ncl | head -20\n\n# Export to TOML\nnickel export --format toml examples/orchestrator-solo.ncl > test.toml\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 +# Provisioning Platform Configuration Examples + +Production-ready reference configurations demonstrating different deployment scenarios and best practices. + +## Purpose + +Examples provide: +- **Real-world configurations** - Complete, tested working setups ready for production use +- **Best practices** - Recommended patterns, values, and architectural approaches +- **Learning resource** - How to use the configuration system effectively +- **Starting point** - Copy, customize, and deploy for your environment +- **Documentation** - Detailed inline comments explaining every configuration option + +## Quick Start + +Choose your deployment mode and get started immediately: + +```bash +# Solo development (local, single developer) +nickel export --format toml orchestrator-solo.ncl > orchestrator.toml + +# Team collaboration (PostgreSQL, RBAC, audit logging) +nickel export --format toml control-center-multiuser.ncl > control-center.toml + +# Production enterprise (HA, SurrealDB cluster, full monitoring) +nickel export --format toml full-platform-enterprise.ncl > platform.toml +``` + +## Example Configurations by Mode + +### 1. 
orchestrator-solo.ncl
+
+**Deployment Mode**: Solo (Single Developer)
+
+**Resource Requirements**:
+- CPU: 2 cores
+- RAM: 4 GB
+- Disk: 50 GB (local data)
+
+**Configuration Highlights**:
+- **Workspace**: Local `dev-workspace` at `/home/developer/provisioning/data/orchestrator`
+- **Server**: Localhost binding (127.0.0.1:9090), 2 workers, 128 connections max
+- **Storage**: Filesystem backend (no external database required)
+- **Queue**: 3 max concurrent tasks (minimal for development)
+- **Batch**: 2 parallel limit with frequent checkpointing (every 50 operations)
+- **Logging**: Debug level, human-readable text format, concurrent stdout + file output
+- **Security**: Auth disabled, CORS allows all origins, no TLS
+- **Monitoring**: Health checks only (metrics disabled), resource tracking disabled
+- **Features**: Experimental features enabled for testing and iteration
+
+**Ideal For**:
+- ✅ Single developer local development
+- ✅ Quick prototyping and experimentation
+- ✅ Learning the provisioning platform
+- ✅ CI/CD local testing without external services
+
+**Key Advantages**:
+- No external dependencies (database-free)
+- Fast startup (<10 seconds)
+- Minimal resource footprint
+- Verbose debug logging for troubleshooting
+- Zero security overhead
+
+**Key Limitations**:
+- Localhost-only (not accessible remotely)
+- Single-threaded processing (3 concurrent tasks max)
+- No persistence across restarts (if using `:memory:` storage)
+- No audit logging
+
+**Usage**:
+
+```bash
+# Export to TOML and run
+nickel export --format toml orchestrator-solo.ncl > orchestrator.solo.toml
+ORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator
+
+# With TypeDialog interactive configuration
+nu ../../scripts/configure.nu orchestrator solo --backend cli
+```
+
+**Customization Examples**:
+
+```nickel
+# Increase concurrency for testing (still development-friendly)
+queue.max_concurrent_tasks = 5
+
+# Reduce debug noise for cleaner logs
+logging.level = "info"
+
+# Change workspace location
+workspace.path = "/path/to/my/workspace"
+```
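+
+A quick smoke test after startup (a sketch; the `/health` path is an assumption based on the health-check feature above, not a documented endpoint — substitute your actual route):
+
+```bash
+# Solo mode binds to 127.0.0.1:9090 per the highlights above
+curl -s http://127.0.0.1:9090/health   # hypothetical health endpoint
+```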
+
+---
+
+### 2. orchestrator-enterprise.ncl
+
+**Deployment Mode**: Enterprise (Production High-Availability)
+
+**Resource Requirements**:
+- CPU: 8+ cores (recommended 16)
+- RAM: 16+ GB (recommended 32+ GB)
+- Disk: 500+ GB (SurrealDB cluster)
+
+**Configuration Highlights**:
+- **Workspace**: Production `production` workspace at `/var/lib/provisioning/orchestrator` with multi-workspace support enabled
+- **Server**: All interfaces binding (0.0.0.0:9090), 16 workers, 4096 connections max
+- **Storage**: SurrealDB cluster (3 nodes) for distributed storage and high availability
+- **Queue**: 100 max concurrent tasks, 5 retry attempts, 2-hour timeout for long-running operations
+- **Batch**: 50 parallel limit with frequent checkpointing (every 1000 operations) and automatic cleanup
+- **Logging**: Info level, JSON structured format for log aggregation
+  - Standard logs: 500MB files, kept 30 versions (90 days)
+  - Audit logs: 200MB files, kept 365 versions (1 year)
+- **Security**: JWT authentication required, specific CORS origins, TLS 1.3 mandatory, 10,000 RPS rate limit
+- **Extensions**: Auto-load from OCI registry with daily refresh, 10 concurrent initializations
+- **Monitoring**:
+  - Metrics every 10 seconds
+  - Profiling at 10% sample rate
+  - Resource tracking with CPU/memory/disk alerts
+  - Health checks every 30 seconds
+- **Features**: Audit logging, task history, performance tracking all enabled
+
+**Ideal For**:
+- ✅ Production deployments with SLAs
+- ✅ High-throughput, mission-critical workloads
+- ✅ Multi-team environments requiring audit trails
+- ✅ Large-scale infrastructure deployments
+- ✅ Compliance and governance requirements
+
+**Key Advantages**:
+- High availability (3 SurrealDB replicas with failover)
+- Production security (JWT + TLS 1.3 mandatory)
+- Full observability (metrics, profiling, audit logs)
+- High throughput (100 concurrent tasks)
+- Extension management via OCI registry
+- Automatic rollback and recovery capabilities
+
+**Key Limitations**:
+- Requires SurrealDB cluster setup and maintenance
+- Resource-intensive (8+ CPU, 16+ GB RAM minimum)
+- More complex initial setup and configuration
+- Requires secrets management (JWT keys, TLS certificates)
+- Network isolation and load balancing setup required
+
+**Environment Variables Required**:
+
+```bash
+export JWT_SECRET=""
+export SURREALDB_PASSWORD=""
+```
+
+**Usage**:
+
+```bash
+# Deploy standalone with SurrealDB
+nickel export --format toml orchestrator-enterprise.ncl > orchestrator.enterprise.toml
+ORCHESTRATOR_CONFIG=orchestrator.enterprise.toml cargo run --bin orchestrator
+
+# Deploy to Kubernetes with all enterprise infrastructure
+nu ../../scripts/render-kubernetes.nu enterprise --namespace production
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/*.yaml
+```
+
+**Customization Examples**:
+
+```nickel
+# Adjust concurrency for your specific infrastructure
+queue.max_concurrent_tasks = 50   # Scale down if resource-constrained
+
+# Change SurrealDB cluster endpoints
+storage.surrealdb_url = "surrealdb://node1:8000,node2:8000,node3:8000"
+
+# Modify audit log retention for compliance
+logging.outputs[1].rotation.max_backups = 2555   # 7 years for HIPAA compliance
+
+# Increase rate limiting for high-frequency integrations
+security.rate_limit.requests_per_second = 20000
+```
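+
+In production the required secrets above should come from a secrets manager; for a first deployment they can be generated locally (a sketch using plain OpenSSL, not a project-specific tool):
+
+```bash
+# Generate random values for the required variables above
+export JWT_SECRET="$(openssl rand -base64 48)"
+export SURREALDB_PASSWORD="$(openssl rand -base64 32)"
+```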
+
+---
+
+### 3. control-center-multiuser.ncl
+
+**Deployment Mode**: MultiUser (Team Collaboration & Staging)
+
+**Resource Requirements**:
+- CPU: 4 cores
+- RAM: 8 GB
+- Disk: 100 GB (PostgreSQL data + logs)
+
+**Configuration Highlights**:
+- **Server**: All interfaces binding (0.0.0.0:8080), 4 workers, 256 connections max
+- **Database**: PostgreSQL with connection pooling (min 5, max 20 connections)
+- **Auth**: JWT with 8-hour token expiration (aligned with team workday)
+- **RBAC**: 4 pre-defined roles with granular permissions
+  - `admin`: Infrastructure lead with full access (`*` permissions)
+  - `operator`: Operations team - execute, manage, view workflows and policies
+  - `developer`: Development team - read-only workflow and policy access
+  - `viewer`: Minimal read-only for non-technical stakeholders
+- **MFA**: Optional per-user (TOTP + email methods available, not globally required)
+- **Password Policies**: 12-character minimum, requires uppercase/lowercase/digits, 90-day rotation, history count of 3
+- **Session Policies**: 8-hour maximum duration, 1-hour idle timeout, 3 concurrent sessions per user
+- **Rate Limiting**: 1000 RPS global, 100 RPS per-user, 20 burst requests
+- **CORS**: Allows localhost:3000 (dev), control-center.example.com, orchestrator.example.com
+- **Logging**: Info level, JSON format, 200MB files kept 15 versions (90 days retention)
+- **Features**: Audit logging enabled, policy enforcement enabled
+
+**Ideal For**:
+- ✅ Team collaboration (2-50 engineers)
+- ✅ Staging environments before production
+- ✅ Development team operations
+- ✅ RBAC with different access levels
+- ✅ Compliance-light environments (SOC2 optional)
+
+**Key Advantages**:
+- Team-friendly security (optional MFA, reasonable password policy)
+- RBAC supports different team roles and responsibilities
+- Persistent storage (PostgreSQL) maintains state across restarts
+- Audit trail for basic compliance
+- Flexible session management (multiple concurrent sessions)
+- Good balance of security and usability
+
+**Key Limitations**:
+- Requires PostgreSQL database setup
+- Single replica (not HA by default)
+- More complex than solo mode
+- RBAC requires careful role definition
+
+**Environment Variables Required**:
+
+```bash
+export DB_PASSWORD=""
+export JWT_SECRET=""
+```
+
+**Usage**:
+
+```bash
+# Generate and deploy
+nickel export --format toml control-center-multiuser.ncl > control-center.multiuser.toml
+CONTROL_CENTER_CONFIG=control-center.multiuser.toml cargo run --bin control-center
+
+# With Docker Compose for team
+nu ../../scripts/render-docker-compose.nu multiuser
+docker-compose -f docker-compose.multiuser.yml up -d
+
+# Access the UI
+# http://localhost:8080 (or your configured domain)
+```
+
+**RBAC Quick Reference**:
+
+| Role | Intended Users | Key Permissions |
+| ------ | ---------------- | ----------------- |
+| admin | Infrastructure leads | All operations: full access |
+| operator | Operations engineers | Execute workflows, manage tasks, view policies |
+| developer | Application developers | View workflows, view policies (read-only) |
+| viewer | Non-technical (PM, QA) | View workflows only (minimal read) |
+
+**Customization Examples**:
+
+```nickel
+# Require MFA globally for higher security
+mfa.required = true
+
+# Add custom role for auditors
+rbac.roles.auditor = {
+  description = "Compliance auditor",
+  permissions = ["audit.view", "orchestrator.view"],
+}
+
+# Adjust for larger team (more concurrent sessions)
+policies.session.max_concurrent = 5
+
+# Stricter password policy for regulated industry
+policies.password = {
+  min_length = 16,
+  require_special_chars = true,
+  expiration_days = 60,
+  history_count = 8,
+}
+```
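+
+Before first start, the PostgreSQL backend needs a role and database; a minimal sketch (the `control_center` names are hypothetical — match them to your actual connection settings):
+
+```bash
+# Create a dedicated role and database for Control Center (names are assumptions)
+psql -U postgres -c "CREATE ROLE control_center LOGIN PASSWORD '${DB_PASSWORD}';"
+psql -U postgres -c "CREATE DATABASE control_center OWNER control_center;"
+```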
+
+---
+
+### 4. full-platform-enterprise.ncl
+
+**Deployment Mode**: Enterprise Integrated (Complete Platform)
+
+**Resource Requirements**:
+- CPU: 16+ cores (3 replicas × 4 cores each + infrastructure)
+- RAM: 32+ GB (orchestrator 12GB + control-center 4GB + databases 12GB + monitoring 4GB)
+- Disk: 1+ TB (databases, logs, metrics, artifacts)
+
+**Services Configured**:
+
+**Orchestrator Section**:
+- SurrealDB cluster (3 nodes) for distributed workflow storage
+- 100 concurrent tasks with 5 retry attempts
+- Full audit logging and monitoring
+- JWT authentication with configurable token expiration
+- Extension loading from OCI registry
+- High-performance tuning (16 workers, 4096 connections)
+
+**Control Center Section**:
+- PostgreSQL HA backend for policy/RBAC storage
+- Full RBAC (4 roles with 7+ permissions each)
+- MFA required (TOTP + email methods)
+- SOC2 compliance enabled with audit logging
+- Strict password policy (16+ chars, special chars required)
+- 30-minute session idle timeout for security
+- Per-user rate limiting (100 RPS)
+
+**MCP Server Section**:
+- Claude integration for AI-powered provisioning
+- Full MCP capability support (tools, resources, prompts, sampling)
+- Orchestrator and Control Center integration
+- Read-only filesystem access with 10MB file limit
+- JWT authentication
+- Advanced audit logging (all requests logged except sensitive data)
+- 100 RPS rate limiting with 20-request burst
+
+**Global Configuration**:
+
+```nickel
+let deployment_mode = "enterprise"
+let namespace = "provisioning"
+let domain = "provisioning.example.com"
+let environment = "production"
+```
+
+**Infrastructure Components** (when deployed to Kubernetes):
+- Load Balancer (Nginx) - TLS termination, CORS, rate limiting
+- 3x Orchestrator replicas - Distributed processing
+- 2x Control Center replicas - Policy management
+- 1-2x MCP Server replicas - AI integration
+- PostgreSQL HA - Primary/replica setup
+- SurrealDB cluster - 3 nodes with replication
+- Prometheus - Metrics collection
+- Grafana - Visualization and dashboards
+- Loki - Log aggregation
+- Harbor - Private OCI image registry
+
+**Ideal For**:
+- ✅ Production deployments with full SLAs
+- ✅ Enterprise compliance requirements (SOC2, HIPAA)
+- ✅ Multi-team organizations
+- ✅ AI/LLM integration for provisioning
+- ✅ Large-scale infrastructure management (1000+ resources)
+- ✅ High-availability deployments with 99.9%+ uptime requirements
+
+**Key Advantages**:
+- Complete service integration (no missing pieces)
+- Production-grade HA setup (3 replicas, load balancing)
+- Full compliance and audit capabilities
+- AI/LLM integration via MCP Server
+- Comprehensive monitoring and observability
+- Clear separation of concerns per service
+- Global variables for easy parameterization
+
+**Key Limitations**:
+- Complex setup requiring multiple services
+- Resource-intensive (16+ CPU, 32+ GB RAM minimum)
+- Requires Kubernetes or advanced Docker Compose setup
+- Multiple databases to maintain (PostgreSQL + SurrealDB)
+- Network setup complexity (TLS, CORS, rate limiting)
+
+**Environment Variables Required**:
+
+```bash
+# Database credentials
+export DB_PASSWORD=""
+export SURREALDB_PASSWORD=""
+
+# Security
+export JWT_SECRET=""
+export KMS_KEY=""
+
+# AI/LLM integration
+export CLAUDE_API_KEY=""
+export CLAUDE_MODEL="claude-3-opus-20240229"
+
+# TLS certificates (for production)
+export TLS_CERT=""
+export TLS_KEY=""
+```
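+
+For a non-production trial, throwaway secret values can be generated with standard tooling (a sketch; real deployments should pull these from your secret manager or KMS):
+
+```bash
+# Generate disposable secrets for local testing only (illustrative)
+export DB_PASSWORD="$(openssl rand -base64 24)"
+export SURREALDB_PASSWORD="$(openssl rand -base64 24)"
+export JWT_SECRET="$(openssl rand -base64 48)"
+```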
+
+**Architecture Diagram**:
+
+```text
+┌───────────────────────────────────────────────┐
+│  Nginx Load Balancer (TLS, CORS, RateLimit)   │
+│  https://orchestrator.example.com             │
+│  https://control-center.example.com           │
+│  https://mcp.example.com                      │
+└──────────┬──────────────────────┬─────────────┘
+           │                      │
+    ┌──────▼──────┐      ┌────────▼────────┐
+    │ Orchestrator│      │ Control Center  │
+    │ (3 replicas)│      │  (2 replicas)   │
+    └──────┬──────┘      └────────┬────────┘
+           │                      │
+    ┌──────▼──────┐      ┌────────▼────────┐      ┌─────────────────┐
+    │  SurrealDB  │      │  PostgreSQL HA  │      │   MCP Server    │
+    │   Cluster   │      │                 │      │  (1-2 replicas) │
+    │  (3 nodes)  │      │ Primary/Replica │      │                 │
+    └─────────────┘      └─────────────────┘      │  ↓ Claude API   │
+                                                  └─────────────────┘
+
+    ┌─────────────────────────────────────────────────┐
+    │         Observability Stack (Optional)          │
+    ├──────────────────┬──────────────────────────────┤
+    │   Prometheus     │   Grafana     │    Loki      │
+    │   (Metrics)      │  (Dashboards) │   (Logs)     │
+    └──────────────────┴──────────────────────────────┘
+```
+
+**Usage**:
+
+```bash
+# Export complete configuration
+nickel export --format toml full-platform-enterprise.ncl > platform.toml
+
+# Extract individual service configs if needed
+# (Each service extracts its section from platform.toml)
+
+# Deploy to Kubernetes with all enterprise infrastructure
+nu ../../scripts/render-kubernetes.nu enterprise --namespace production
+
+# Apply all manifests
+kubectl create namespace production
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/*.yaml
+
+# Or deploy with Docker Compose for single-node testing
+nu ../../scripts/render-docker-compose.nu enterprise
+docker-compose -f docker-compose.enterprise.yml up -d
+```
+
+**Customization Examples**:
+
+```nickel
+# Adjust deployment domain
+let domain = "my-company.com"
+let namespace = "infrastructure"
+
+# Scale for higher throughput
+orchestrator.queue.max_concurrent_tasks = 200
+orchestrator.security.rate_limit.requests_per_second = 50000
+
+# Add HIPAA compliance
+control_center.policies.compliance.hipaa.enabled = true
+control_center.policies.audit.retention_days = 2555  # 7 years
+
+# Custom MCP Server model
+mcp_server.integration.claude.model = "claude-3-sonnet-20240229"
+
+# Enable caching for performance
+mcp_server.features.enable_caching = true
+mcp_server.performance.cache_ttl = 7200
+```
+
+---
+
+## Deployment Mode Comparison Matrix
+
+| Feature | Solo | MultiUser | Enterprise |
+| --------- | ------ | ----------- | ----------- |
+| **Ideal For** | Dev | Team/Staging | Production |
+| **Storage** | Filesystem | PostgreSQL | SurrealDB Cluster |
+| **Replicas** | 1 | 1 | 3+ (HA) |
+| **Max Concurrency** | 3 tasks | 5-10 | 100 |
+| **Security** | None | RBAC + JWT | Full + MFA + SOC2 |
+| **Monitoring** | Health check | Basic | Full (Prom+Grafana) |
+| **Setup Time** | <5 min | 15 min | 30+ min |
+| **Min CPU** | 2 | 4 | 16 |
+| **Min RAM** | 4GB | 8GB | 32GB |
+| **Audit Logs** | No | 90 days | 365 days |
+| **TLS Required** | No | No | Yes |
+| **Compliance** | None | Basic | SOC2 + HIPAA ready |
+
+---
+
+## Getting Started Guide
+
+### Step 1: Choose Your Deployment Mode
+
+- **Solo**: Single developer working locally → Use `orchestrator-solo.ncl`
+- **Team**: 2-50 engineers, staging environment → Use `control-center-multiuser.ncl`
+- **Production**: Full enterprise deployment → Use `full-platform-enterprise.ncl`
+
+### Step 2: Export Configuration to TOML
+
+```bash
+# Start with solo mode
+nickel export --format toml orchestrator-solo.ncl > orchestrator.toml
+
+# Validate the export
+cat orchestrator.toml | head -20
+```
+
+### Step 3: Validate Configuration
+
+```bash
+# Typecheck the Nickel configuration
+nickel typecheck orchestrator-solo.ncl
+
+# Validate using provided script
+nu ../../scripts/validate-config.nu orchestrator-solo.ncl
+```
+
+### Step 4: Customize for Your Environment
+
+Edit the exported `.toml` or the `.ncl` file:
+
+```bash
+# Option A: Edit TOML directly (simpler)
+vi orchestrator.toml  # Change workspace path, port, etc.
+
+# Option B: Edit Nickel and re-export (type-safe)
+vi orchestrator-solo.ncl
+nickel export --format toml orchestrator-solo.ncl > orchestrator.toml
+```
+
+### Step 5: Deploy
+
+```bash
+# Docker Compose
+ORCHESTRATOR_CONFIG=orchestrator.toml docker-compose up -d
+
+# Direct Rust execution
+ORCHESTRATOR_CONFIG=orchestrator.toml cargo run --bin orchestrator
+
+# Kubernetes
+nu ../../scripts/render-kubernetes.nu solo
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/*.yaml
+```
+
+---
+
+## Common Customizations
+
+### Changing Domain/Namespace
+
+In any `.ncl` file at the top:
+
+```nickel
+let domain = "your-domain.com"
+let namespace = "your-namespace"
+let environment = "your-env"
+```
+
+### Increasing Resource Limits
+
+For higher throughput:
+
+```nickel
+queue.max_concurrent_tasks = 200  # Default: 100
+security.rate_limit.requests_per_second = 50000  # Default: 10000
+server.workers = 32  # Default: 16
+```
+
+### Enabling Compliance Features
+
+For regulated environments:
+
+```nickel
+policies.compliance.soc2.enabled = true
+policies.compliance.hipaa.enabled = true
+policies.audit.retention_days = 2555  # 7 years
+```
+
+### Custom Logging
+
+For troubleshooting:
+
+```nickel
+logging.level = "debug"  # Default: info
+logging.format = "text"  # Default: json (use text for development)
+logging.outputs[0].level = "debug"  # stdout level
+```
+
+---
+
+## Validation & Testing
+
+### Syntax Validation
+
+```bash
+# Typecheck all examples
+for f in *.ncl; do
+  echo "Checking $f..."
+  nickel typecheck "$f"
+done
+```
+
+### Configuration Export
+
+```bash
+# Export to TOML
+nickel export --format toml orchestrator-solo.ncl | head -30
+
+# Export to JSON
+nickel export --format json full-platform-enterprise.ncl | jq '.orchestrator.server'
+```
+
+### Load in Rust Application
+
+```bash
+# With dry-run flag (if supported)
+ORCHESTRATOR_CONFIG=orchestrator.solo.toml cargo run --bin orchestrator -- --validate
+
+# Or simply attempt startup
+ORCHESTRATOR_CONFIG=orchestrator.solo.toml timeout 5 cargo run --bin orchestrator
+```
+
+---
+
+## Troubleshooting
+
+### "Type mismatch" Error
+
+**Cause**: Field value doesn't match the expected type
+
+**Fix**: Check the schema for the correct type. Common issues:
+- Use `true`/`false` not `"true"`/`"false"` for booleans
+- Use `9090` not `"9090"` for numbers
+- Use record syntax `{ key = value }` not `{ "key": value }`
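+
+A minimal illustration of the correct literal forms (a sketch with arbitrary field names):
+
+```nickel
+# Correct Nickel literals for common configuration fields
+{
+  server = {
+    port = 9090,       # number, not "9090"
+  },
+  workspace = {
+    enabled = true,    # boolean, not "true"
+  },
+}
+```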
+
+### Port Already in Use
+
+**Fix**: Change the port in your configuration:
+
+```toml
+server.port = 9999  # Instead of 9090
+```
+
+### Database Connection Errors
+
+**Fix**: For multiuser/enterprise modes:
+- Ensure PostgreSQL is running: `docker-compose up -d postgres`
+- Verify credentials in environment variables
+- Check network connectivity
+- Validate connection string format
+
+### Import Not Found
+
+**Fix**: Ensure all relative paths in imports are correct:
+
+```nickel
+# Correct (relative to examples/)
+let defaults = import "../defaults/orchestrator-defaults.ncl" in
+
+# Wrong (absolute path)
+let defaults = import "/full/path/to/defaults.ncl" in
+```
+
+---
+
+## Best Practices
+
+1. **Start Small**: Begin with solo mode, graduate to multiuser, then enterprise
+2. **Environment Variables**: Never hardcode secrets; use environment variables
+3. **Version Control**: Keep examples in Git with clear comments
+4. **Validation**: Always typecheck and export before deploying
+5. **Documentation**: Add comments explaining non-obvious configuration choices
+6. **Testing**: Deploy to staging first and validate all services before production
+7. **Monitoring**: Enable metrics and logging from day one for easier troubleshooting
+8. **Backups**: Take regular backups of database state and configurations
+
+---
+
+## Adding New Examples
+
+### Create a Custom Example
+
+```bash
+# Copy an existing example as template
+cp orchestrator-solo.ncl orchestrator-custom.ncl
+
+# Edit for your use case
+vi orchestrator-custom.ncl
+
+# Validate
+nickel typecheck orchestrator-custom.ncl
+
+# Export and test
+nickel export --format toml orchestrator-custom.ncl > orchestrator.custom.toml
+```
+
+### Naming Convention
+
+- **Service + Mode**: `{service}-{mode}.ncl` (orchestrator-solo.ncl)
+- **Scenario**: `{service}-{scenario}.ncl` (orchestrator-high-throughput.ncl)
+- **Full Stack**: `full-platform-{mode}.ncl` (full-platform-enterprise.ncl)
+
+---
+
+## See Also
+
+- **Parent README**: `../README.md` - Complete configuration system overview
+- **Schemas**: `../schemas/` - Type definitions and validation rules
+- **Defaults**: `../defaults/` - Base configurations for composition
+- **Scripts**: `../scripts/` - Automation for configuration workflow
+- **Forms**: `../forms/` - Interactive TypeDialog form definitions
+
+---
+
+**Version**: 2.0
+**Last Updated**: 2025-01-05
+**Status**: Production Ready - All examples tested and validated
+
+## Using Examples
+
+### View Example
+
+```bash
+cat provisioning/.typedialog/provisioning/platform/examples/orchestrator-solo.ncl
+```
+
+### Copy and Customize
+
+```bash
+# Start with solo example
+cp examples/orchestrator-solo.ncl values/orchestrator.solo.ncl
+
+# Edit for your environment
+vi values/orchestrator.solo.ncl
+
+# Validate
+nu scripts/validate-config.nu values/orchestrator.solo.ncl
+```
+
+### Generate from Example
+
+```bash
+# Use example as base, regenerate with TypeDialog
+nu scripts/configure.nu orchestrator solo --backend web
+```
+
+## Example Structure
+
+Each example is a complete Nickel configuration:
+
+```nickel
+# orchestrator-solo.ncl
+{
+  orchestrator = {
+    workspace = { },
+    server = { },
+    storage = { },
+    queue = { },
+    monitoring = { },
+  },
+}
+```
+
+## Configuration Elements
+
+### Workspace Configuration
+- **name** - Workspace identifier
+- **path** - Directory path
+- **enabled** - Enable/disable flag
+- **multi_workspace** - Support multiple workspaces
+
+### Server Configuration
+- **host** - Bind address (127.0.0.1 for solo, 0.0.0.0 for public)
+- **port** - Listen port
+- **workers** - Thread count (mode-dependent)
+- **keep_alive** - Connection keep-alive timeout
+- **max_connections** - Connection limit
+
+### Storage Configuration
+- **backend** - 'filesystem | 'rocksdb | 'surrealdb | 'postgres
+- **path** - Local storage path (filesystem/rocksdb)
+- **connection_string** - DB URL (surrealdb/postgres)
+
+### Queue Configuration (Orchestrator)
+- **max_concurrent_tasks** - Concurrent task limit
+- **retry_attempts** - Retry count
+- **retry_delay** - Delay between retries (ms)
+- **task_timeout** - Task execution timeout (ms)
+
+### Monitoring Configuration (Optional)
+- **enabled** - Enable metrics collection
+- **metrics_interval** - Collection frequency (seconds)
+- **health_check_interval** - Health check frequency
+
+## Creating New Examples
+
+### 1. Start with Existing Example
+
+```bash
+cp examples/orchestrator-solo.ncl examples/orchestrator-custom.ncl
+```
+
+### 2. Modify for Your Use Case
+
+```nickel
+# Update configuration values
+orchestrator.server.workers = 8  # More workers
+orchestrator.queue.max_concurrent_tasks = 20  # Higher concurrency
+```
+
+### 3. Validate Configuration
+
+```bash
+nickel typecheck examples/orchestrator-custom.ncl
+nickel eval examples/orchestrator-custom.ncl
+```
+
+### 4. Document Purpose
+
+Add comments explaining:
+- Use case (deployment scenario)
+- Resource requirements
+- Expected load
+- Customization needed
+
+### 5. Save as Reference
+
+```bash
+mv examples/orchestrator-custom.ncl examples/orchestrator-{scenario}.ncl
+```
+
+## Best Practices for Examples
+
+1. **Clear documentation** - Explain the use case at the top
+2. **Realistic values** - Use production-appropriate configurations
+3. **Complete configuration** - Include all required sections
+4. **Inline comments** - Explain non-obvious choices
+5. **Validated** - Typecheck all examples before committing
+6. **Organized** - Group by service and deployment mode
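+
+Putting these practices together, a new example might open like this (a sketch; the scenario name and numbers are purely illustrative):
+
+```nickel
+# orchestrator-high-throughput.ncl
+# Use case: batch-heavy CI workloads
+# Resources: expects ~8 CPU / 16 GB RAM
+# Expected load: ~50 concurrent tasks sustained
+{
+  orchestrator = {
+    server = {
+      workers = 8,                 # sized for the expected load
+    },
+    queue = {
+      max_concurrent_tasks = 50,   # within the documented 1-100 bound
+    },
+  },
+}
+```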
+
+## Example Naming Convention
+
+- **Service-mode**: `{service}-{mode}.ncl` (orchestrator-solo.ncl)
+- **Scenario**: `{service}-{scenario}.ncl` (orchestrator-gpu-intensive.ncl)
+- **Full stack**: `full-platform-{mode}.ncl` (full-platform-enterprise.ncl)
+
+## Customizing Examples
+
+### For Your Environment
+
+```nickel
+# orchestrator-solo.ncl (customized)
+{
+  orchestrator = {
+    workspace = {
+      name = "my-workspace",  # Your workspace name
+      path = "/home/user/projects/workspace",  # Your path
+    },
+    server = {
+      host = "127.0.0.1",  # Keep local for solo
+      port = 9090,
+    },
+    storage = {
+      backend = 'filesystem,  # No external DB needed
+      path = "/home/user/provisioning/data",  # Your path
+    },
+  },
+}
+```
+
+### For Different Resources
+
+```nickel
+# orchestrator-multiuser.ncl (customized for team)
+{
+  orchestrator = {
+    server = {
+      host = "0.0.0.0",  # Public binding
+      port = 9090,
+      workers = 4,  # Team concurrency
+    },
+    queue = {
+      max_concurrent_tasks = 10,  # Team workload
+    },
+  },
+}
+```
+
+## Testing Examples
+
+```bash
+# Typecheck example
+nickel typecheck examples/orchestrator-solo.ncl
+
+# Evaluate and view
+nickel eval examples/orchestrator-solo.ncl | head -20
+
+# Export to TOML
+nickel export --format toml examples/orchestrator-solo.ncl > test.toml
+```
+
+---
+
+**Version**: 1.0.0
+**Last Updated**: 2025-01-05
\ No newline at end of file
diff --git a/schemas/platform/schemas/README.md b/schemas/platform/schemas/README.md
index 9ff771b..40e19a4 100644
--- a/schemas/platform/schemas/README.md
+++ b/schemas/platform/schemas/README.md
@@ -1 +1,287 @@
-# Schemas\n\nNickel type contracts defining configuration structure and validation for all services.\n\n## Purpose\n\nSchemas define:\n- **Type safety** - Required/optional fields, valid types (string, number, bool, record)\n- **Value constraints** - Enum values, numeric bounds (via contracts)\n- **Documentation** - Field descriptions and usage patterns\n- **Composition** - Inheritance and merging of schema types\n\n## File Organization\n\n```\nschemas/\n├── README.md # This file\n├── common/ # Shared schemas (server, database, security, etc.)\n│ ├── server.ncl # HTTP server configuration schema\n│ ├── database.ncl # Database backend schema\n│ ├── security.ncl # Authentication and security schema\n│ ├── monitoring.ncl # Metrics and health checks schema\n│ ├── logging.ncl # Log level and format schema\n│ ├── network.ncl # Network binding and TLS schema\n│ ├── storage.ncl # Storage backend schema\n│ └── workspace.ncl # Workspace configuration schema\n├── deployment/ # Mode-specific schemas\n│ ├── solo.ncl # Solo mode resource constraints\n│ ├── multiuser.ncl # Multi-user mode schema\n│ ├── cicd.ncl # CI/CD mode schema\n│ └── enterprise.ncl # Enterprise HA schema\n├── orchestrator.ncl # Orchestrator service schema\n├── control-center.ncl # Control Center service schema\n├── mcp-server.ncl # MCP Server service schema\n└── installer.ncl # Installer service schema\n```\n\n## Schema Patterns\n\n### 1. Basic Schema Definition\n\n```\n# schemas/common/server.ncl\n{\n Server = {\n host | String, # Required string field\n port | Number, # Required number field\n workers | Number | default = 4, # Optional with default\n keep_alive | Number | optional, # Optional field\n max_connections | Number | optional,\n },\n}\n```\n\n### 2. 
Type with Contract Validation\n\n```\n# With constraint checking (via validators)\n{\n WorkerCount =\n let valid_range = fun n =>\n if n < 1 then\n std.contract.blame "Workers must be >= 1" n\n else if n > 32 then\n std.contract.blame "Workers must be <= 32" n\n else\n n\n in\n Number | valid_range,\n}\n```\n\n### 3. Record Merging (Composition)\n\n```\n# schemas/orchestrator.ncl\nlet server_schema = import "./common/server.ncl" in\nlet database_schema = import "./common/database.ncl" in\n\n{\n OrchestratorConfig = {\n workspace | {\n name | String,\n path | String,\n enabled | Bool | default = true,\n },\n server | server_schema.Server, # Reuse Server schema\n storage | database_schema.Database, # Reuse Database schema\n queue | {\n max_concurrent_tasks | Number,\n retry_attempts | Number | default = 3,\n },\n },\n}\n```\n\n## Common Schemas\n\n### server.ncl\nHTTP server configuration:\n- `host` - Bind address (string)\n- `port` - Listen port (number)\n- `workers` - Thread count (number, optional)\n- `keep_alive` - Keep-alive timeout (number, optional)\n- `max_connections` - Connection limit (number, optional)\n\n### database.ncl\nDatabase backend selection:\n- `backend` - 'filesystem | 'rocksdb | 'surrealdb_embedded | 'surrealdb_server | 'postgres (enum)\n- `path` - Storage path (string, optional)\n- `connection_string` - DB URL (string, optional)\n- `credentials` - Auth object (optional)\n\n### security.ncl\nAuthentication and encryption:\n- `jwt_issuer` - JWT issuer (string, optional)\n- `jwt_audience` - JWT audience (string, optional)\n- `jwt_expiration` - Token expiration (number, optional)\n- `encryption_key` - Encryption key (string, optional)\n- `kms_backend` - KMS provider (string, optional)\n- `mfa_required` - Require MFA (bool, optional)\n\n### monitoring.ncl\nMetrics and health:\n- `enabled` - Enable monitoring (bool, optional)\n- `metrics_interval` - Metrics collection interval (number, optional)\n- `health_check_interval` - Health check frequency (number, optional)\n- `retention_days` - Metrics retention (number, optional)\n\n### logging.ncl\nLog configuration:\n- `level` - Log level (debug | info | warn | error)\n- `format` - Log format (json | text)\n- `rotation` - Log rotation policy (optional)\n- `output` - Log destination (stdout | file | syslog)\n\n## Service Schemas\n\n### orchestrator.ncl\nWorkflow orchestration:\n\n```\nOrchestratorConfig = {\n workspace | WorkspaceConfig,\n server | Server,\n storage | Database,\n queue | QueueConfig,\n batch | BatchConfig,\n monitoring | MonitoringConfig | optional,\n rollback | RollbackConfig | optional,\n extensions | ExtensionsConfig | optional,\n}\n```\n\n### control-center.ncl\nPolicy and RBAC:\n\n```\nControlCenterConfig = {\n workspace | WorkspaceConfig,\n server | Server,\n database | Database,\n security | SecurityConfig,\n rbac | RBACConfig | optional,\n compliance | ComplianceConfig | optional,\n}\n```\n\n### mcp-server.ncl\nMCP protocol server:\n\n```\nMCPServerConfig = {\n workspace | WorkspaceConfig,\n server | Server,\n capabilities | CapabilitiesConfig,\n tools | ToolsConfig | optional,\n resources | ResourcesConfig | optional,\n}\n```\n\n## Deployment Mode Schemas\n\nDeployment schemas define resource constraints for each mode:\n\n- **solo.ncl** - 2 CPU, 4GB RAM, embedded DB\n- **multiuser.ncl** - 4 CPU, 8GB RAM, PostgreSQL\n- **cicd.ncl** - 8 CPU, 16GB RAM, ephemeral\n- **enterprise.ncl** - 16+ CPU, 32+ GB RAM, HA\n\nExample:\n\n```\n# schemas/deployment/solo.ncl\n{\n SoloMode = {\n resources = {\n cpu_cores | 
2,\n memory_mb | 4096,\n disk_gb | 50,\n },\n database_backend | 'filesystem,\n security_level | 'basic,\n },\n}\n```\n\n## Validation with Schemas\n\nSchemas are composed with validators in config files:\n\n```\n# configs/orchestrator.solo.ncl\nlet schemas = import "../schemas/orchestrator.ncl" in\nlet validators = import "../validators/orchestrator-validator.ncl" in\nlet defaults = import "../defaults/orchestrator-defaults.ncl" in\n\n# Compose: defaults + validation + schema checking\n{\n orchestrator = defaults.orchestrator & {\n queue = {\n max_concurrent_tasks = validators.ValidConcurrentTasks 5,\n },\n },\n} | schemas.OrchestratorConfig\n```\n\nThe final `| schemas.OrchestratorConfig` applies type checking.\n\n## Type System\n\n### Nickel Type Syntax\n\n```\n# Required field\nfield | Type,\n\n# Optional field\nfield | Type | optional,\n\n# Field with default\nfield | Type | default = value,\n\n# Union type\nfield | [| 'option1, 'option2],\n\n# Nested record\nfield | {\n subfield | Type,\n},\n```\n\n## Best Practices\n\n1. **Reuse common schemas** - Import and compose rather than duplicate\n2. **Use enums for choices** - `'filesystem | 'rocksdb` instead of string validation\n3. **Document fields** - Add comments explaining purpose\n4. **Keep schemas focused** - Each file covers one logical component\n5. **Test composition** - Use `nickel typecheck` to verify schema merging\n\n## Modifying Schemas\n\nWhen changing a schema:\n\n1. Update schema file (schemas/*.ncl)\n2. Update corresponding defaults (defaults/*.ncl) to match schema\n3. Update validators if constraints changed\n4. Run typecheck: `nickel typecheck configs/orchestrator.*.ncl`\n5. Verify all configs still type-check\n\n## Schema Testing\n\n```\n# Typecheck a schema\nnickel typecheck provisioning/.typedialog/provisioning/platform/schemas/orchestrator.ncl\n\n# Typecheck a config (which applies schema)\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Evaluate a schema\nnickel eval provisioning/.typedialog/provisioning/platform/schemas/orchestrator.ncl\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 +# Schemas + +Nickel type contracts defining configuration structure and validation for all services. + +## Purpose + +Schemas define: +- **Type safety** - Required/optional fields, valid types (string, number, bool, record) +- **Value constraints** - Enum values, numeric bounds (via contracts) +- **Documentation** - Field descriptions and usage patterns +- **Composition** - Inheritance and merging of schema types + +## File Organization + +```bash +schemas/ +├── README.md # This file +├── common/ # Shared schemas (server, database, security, etc.) 
+│   ├── server.ncl          # HTTP server configuration schema
+│   ├── database.ncl        # Database backend schema
+│   ├── security.ncl        # Authentication and security schema
+│   ├── monitoring.ncl      # Metrics and health checks schema
+│   ├── logging.ncl         # Log level and format schema
+│   ├── network.ncl         # Network binding and TLS schema
+│   ├── storage.ncl         # Storage backend schema
+│   └── workspace.ncl       # Workspace configuration schema
+├── deployment/             # Mode-specific schemas
+│   ├── solo.ncl            # Solo mode resource constraints
+│   ├── multiuser.ncl       # Multi-user mode schema
+│   ├── cicd.ncl            # CI/CD mode schema
+│   └── enterprise.ncl      # Enterprise HA schema
+├── orchestrator.ncl        # Orchestrator service schema
+├── control-center.ncl      # Control Center service schema
+├── mcp-server.ncl          # MCP Server service schema
+└── installer.ncl           # Installer service schema
+```
+
+## Schema Patterns
+
+### 1. Basic Schema Definition
+
+```nickel
+# schemas/common/server.ncl
+{
+  Server = {
+    host | String,                     # Required string field
+    port | Number,                     # Required number field
+    workers | Number | default = 4,    # Optional with default
+    keep_alive | Number | optional,    # Optional field
+    max_connections | Number | optional,
+  },
+}
+```
+
+### 2. Type with Contract Validation
+
+```nickel
+# With constraint checking (via validators)
+{
+  WorkerCount =
+    let valid_range = fun n =>
+      if n < 1 then
+        std.contract.blame "Workers must be >= 1" n
+      else if n > 32 then
+        std.contract.blame "Workers must be <= 32" n
+      else
+        n
+    in
+    Number | valid_range,
+}
+```
+
+### 3. Record Merging (Composition)
+
+```nickel
+# schemas/orchestrator.ncl
+let server_schema = import "./common/server.ncl" in
+let database_schema = import "./common/database.ncl" in
+
+{
+  OrchestratorConfig = {
+    workspace | {
+      name | String,
+      path | String,
+      enabled | Bool | default = true,
+    },
+    server | server_schema.Server,        # Reuse Server schema
+    storage | database_schema.Database,   # Reuse Database schema
+    queue | {
+      max_concurrent_tasks | Number,
+      retry_attempts | Number | default = 3,
+    },
+  },
+}
+```
+
+## Common Schemas
+
+### server.ncl
+HTTP server configuration:
+- `host` - Bind address (string)
+- `port` - Listen port (number)
+- `workers` - Thread count (number, optional)
+- `keep_alive` - Keep-alive timeout (number, optional)
+- `max_connections` - Connection limit (number, optional)
+
+### database.ncl
+Database backend selection:
+- `backend` - 'filesystem | 'rocksdb | 'surrealdb_embedded | 'surrealdb_server | 'postgres (enum)
+- `path` - Storage path (string, optional)
+- `connection_string` - DB URL (string, optional)
+- `credentials` - Auth object (optional)
+
+### security.ncl
+Authentication and encryption:
+- `jwt_issuer` - JWT issuer (string, optional)
+- `jwt_audience` - JWT audience (string, optional)
+- `jwt_expiration` - Token expiration (number, optional)
+- `encryption_key` - Encryption key (string, optional)
+- `kms_backend` - KMS provider (string, optional)
+- `mfa_required` - Require MFA (bool, optional)
+
+### monitoring.ncl
+Metrics and health:
+- `enabled` - Enable monitoring (bool, optional)
+- `metrics_interval` - Metrics collection interval (number, optional)
+- `health_check_interval` - Health check frequency (number, optional)
+- `retention_days` - Metrics retention (number, optional)
+
+### logging.ncl
+Log configuration:
+- `level` - Log level (debug | info | warn | error)
+- `format` - Log format (json | text)
+- `rotation` - Log rotation policy (optional)
+- `output` - Log destination (stdout | file | syslog)
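+
+As a sketch, the logging fields above could be written as a Nickel contract like this (field and enum spellings assumed from the list; the real `schemas/common/logging.ncl` may differ):
+
+```nickel
+# Hypothetical shape of schemas/common/logging.ncl
+{
+  Logging = {
+    level | [| 'debug, 'info, 'warn, 'error |],
+    format | [| 'json, 'text |],
+    rotation | { max_size | String, max_backups | Number, } | optional,
+    output | [| 'stdout, 'file, 'syslog |] | default = 'stdout,
+  },
+}
+```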
+
+## Service Schemas
+
+### orchestrator.ncl
+Workflow orchestration:
+
+```nickel
+OrchestratorConfig = {
+  workspace | WorkspaceConfig,
+  server | Server,
+  storage | Database,
+  queue | QueueConfig,
+  batch | BatchConfig,
+  monitoring | MonitoringConfig | optional,
+  rollback | RollbackConfig | optional,
+  extensions | ExtensionsConfig | optional,
+}
+```
+
+### control-center.ncl
+Policy and RBAC:
+
+```nickel
+ControlCenterConfig = {
+  workspace | WorkspaceConfig,
+  server | Server,
+  database | Database,
+  security | SecurityConfig,
+  rbac | RBACConfig | optional,
+  compliance | ComplianceConfig | optional,
+}
+```
+
+### mcp-server.ncl
+MCP protocol server:
+
+```nickel
+MCPServerConfig = {
+  workspace | WorkspaceConfig,
+  server | Server,
+  capabilities | CapabilitiesConfig,
+  tools | ToolsConfig | optional,
+  resources | ResourcesConfig | optional,
+}
+```
+
+## Deployment Mode Schemas
+
+Deployment schemas define resource constraints for each mode:
+
+- **solo.ncl** - 2 CPU, 4GB RAM, embedded DB
+- **multiuser.ncl** - 4 CPU, 8GB RAM, PostgreSQL
+- **cicd.ncl** - 8 CPU, 16GB RAM, ephemeral
+- **enterprise.ncl** - 16+ CPU, 32+ GB RAM, HA
+
+Example:
+
+```nickel
+# schemas/deployment/solo.ncl
+{
+  SoloMode = {
+    resources = {
+      cpu_cores | 2,
+      memory_mb | 4096,
+      disk_gb | 50,
+    },
+    database_backend | 'filesystem,
+    security_level | 'basic,
+  },
+}
+```
+
+## Validation with Schemas
+
+Schemas are composed with validators in config files:
+
+```nickel
+# configs/orchestrator.solo.ncl
+let schemas = import "../schemas/orchestrator.ncl" in
+let validators = import "../validators/orchestrator-validator.ncl" in
+let defaults = import "../defaults/orchestrator-defaults.ncl" in
+
+# Compose: defaults + validation + schema checking
+{
+  orchestrator = defaults.orchestrator & {
+    queue = {
+      max_concurrent_tasks = validators.ValidConcurrentTasks 5,
+    },
+  },
+} | schemas.OrchestratorConfig
+```
+
+The final `| schemas.OrchestratorConfig` applies type checking.
+
+## Type System
+
+### Nickel Type Syntax
+
+```nickel
+# Required field
+field | Type,
+
+# Optional field
+field | Type | optional,
+
+# Field with default
+field | Type | default = value,
+
+# Union type
+field | [| 'option1, 'option2 |],
+
+# Nested record
+field | {
+  subfield | Type,
+},
+```
+
+## Best Practices
+
+1. **Reuse common schemas** - Import and compose rather than duplicate
+2. **Use enums for choices** - `'filesystem | 'rocksdb` instead of string validation
+3. **Document fields** - Add comments explaining purpose
+4. **Keep schemas focused** - Each file covers one logical component
+5. **Test composition** - Use `nickel typecheck` to verify schema merging
+
+## Modifying Schemas
+
+When changing a schema:
+
+1. Update schema file (schemas/*.ncl)
+2. Update corresponding defaults (defaults/*.ncl) to match schema
+3. Update validators if constraints changed
+4. Run typecheck: `nickel typecheck configs/orchestrator.*.ncl`
+5. 
Verify all configs still type-check + +## Schema Testing + +```bash +# Typecheck a schema +nickel typecheck provisioning/.typedialog/provisioning/platform/schemas/orchestrator.ncl + +# Typecheck a config (which applies schema) +nickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl + +# Evaluate a schema +nickel eval provisioning/.typedialog/provisioning/platform/schemas/orchestrator.ncl +``` + +--- + +**Version**: 1.0.0 +**Last Updated**: 2025-01-05 \ No newline at end of file diff --git a/schemas/platform/templates/README.md b/schemas/platform/templates/README.md index 4a312e7..1967ae8 100644 --- a/schemas/platform/templates/README.md +++ b/schemas/platform/templates/README.md @@ -1 +1,361 @@ -# Templates\n\nJinja2 and Nickel templates for configuration and deployment generation.\n\n## Purpose\n\nTemplates provide:\n- **Nickel output generation** - Jinja2 templates for TypeDialog nickel-roundtrip\n- **Docker Compose generation** - Infrastructure-as-code for containerized deployment\n- **Kubernetes manifests** - Declarative deployment manifests\n- **TOML export** - Service configuration generation for Rust codebase\n\n## File Organization\n\n```\ntemplates/\n├── README.md # This file\n├── orchestrator-config.ncl.j2 # Nickel output template (Jinja2)\n├── control-center-config.ncl.j2 # Nickel output template (Jinja2)\n├── mcp-server-config.ncl.j2 # Nickel output template (Jinja2)\n├── installer-config.ncl.j2 # Nickel output template (Jinja2)\n├── docker-compose/ # Docker Compose templates\n│ ├── platform-stack.solo.yml.ncl\n│ ├── platform-stack.multiuser.yml.ncl\n│ ├── platform-stack.cicd.yml.ncl\n│ └── platform-stack.enterprise.yml.ncl\n├── kubernetes/ # Kubernetes templates\n│ ├── orchestrator-deployment.yaml.ncl\n│ ├── orchestrator-service.yaml.ncl\n│ ├── control-center-deployment.yaml.ncl\n│ ├── control-center-service.yaml.ncl\n│ └── platform-ingress.yaml.ncl\n└── configs/ # Service config templates (optional)\n ├── orchestrator-config.toml.ncl\n ├── control-center-config.toml.ncl\n └── mcp-server-config.toml.ncl\n```\n\n## Jinja2 Config Templates\n\n**Critical for TypeDialog nickel-roundtrip workflow**:\n\n```\ntypedialog-web nickel-roundtrip "$CONFIG" "forms/{service}-form.toml" --output "$CONFIG" --template "templates/{service}-config.ncl.j2"\n```\n\n### Template Pattern: orchestrator-config.ncl.j2\n\n```\n# Orchestrator Configuration - Nickel Format\n# Auto-generated by provisioning TypeDialog\n# Edit via: nu scripts/configure.nu orchestrator {mode}\n\n{\n orchestrator = {\n # Workspace Configuration\n workspace = {\n {%- if workspace_name %}\n name = "{{ workspace_name }}",\n {%- endif %}\n {%- if workspace_path %}\n path = "{{ workspace_path }}",\n {%- endif %}\n {%- if workspace_enabled is defined %}\n enabled = {{ workspace_enabled | lower }},\n {%- endif %}\n {%- if multi_workspace is defined %}\n multi_workspace = {{ multi_workspace | lower }},\n {%- endif %}\n },\n\n # Server Configuration\n server = {\n {%- if server_host %}\n host = "{{ server_host }}",\n {%- endif %}\n {%- if server_port %}\n port = {{ server_port }},\n {%- endif %}\n {%- if server_workers %}\n workers = {{ server_workers }},\n {%- endif %}\n {%- if server_keep_alive %}\n keep_alive = {{ server_keep_alive }},\n {%- endif %}\n },\n\n # Storage Configuration\n storage = {\n {%- if storage_backend %}\n backend = '{{ storage_backend }},\n {%- endif %}\n {%- if storage_path %}\n path = "{{ storage_path }}",\n {%- endif %}\n {%- if surrealdb_url %}\n surrealdb_url = "{{ 
surrealdb_url }}",\n {%- endif %}\n },\n\n # Queue Configuration\n queue = {\n {%- if max_concurrent_tasks %}\n max_concurrent_tasks = {{ max_concurrent_tasks }},\n {%- endif %}\n {%- if retry_attempts %}\n retry_attempts = {{ retry_attempts }},\n {%- endif %}\n {%- if retry_delay %}\n retry_delay = {{ retry_delay }},\n {%- endif %}\n {%- if task_timeout %}\n task_timeout = {{ task_timeout }},\n {%- endif %}\n },\n\n # Monitoring Configuration (optional)\n {%- if enable_monitoring is defined and enable_monitoring %}\n monitoring = {\n enabled = true,\n {%- if metrics_interval %}\n metrics_interval = {{ metrics_interval }},\n {%- endif %}\n {%- if health_check_interval %}\n health_check_interval = {{ health_check_interval }},\n {%- endif %}\n },\n {%- endif %}\n },\n}\n```\n\n### Key Jinja2 Patterns\n\n**Conditional blocks** (only include if field is set):\n\n```\n{%- if workspace_name %}\nname = "{{ workspace_name }}",\n{%- endif %}\n```\n\n**String values** (with quotes):\n\n```\n{%- if storage_backend %}\nbackend = '{{ storage_backend }}, # Enum (atom syntax)\n{%- endif %}\n```\n\n**Numeric values** (no quotes):\n\n```\n{%- if server_port %}\nport = {{ server_port }}, # Number\n{%- endif %}\n```\n\n**Boolean values** (lower case):\n\n```\n{%- if workspace_enabled is defined %}\nenabled = {{ workspace_enabled | lower }}, # Boolean (true/false)\n{%- endif %}\n```\n\n**Comments** (for generated files):\n\n```\n# Auto-generated by provisioning TypeDialog\n# Edit via: nu scripts/configure.nu orchestrator {mode}\n```\n\n## Docker Compose Templates\n\nNickel templates that import from `values/*.ncl`:\n\n```\n# templates/docker-compose/platform-stack.solo.yml.ncl\n# Docker Compose Platform Stack - Solo Mode\n# Imports config from values/orchestrator.solo.ncl\n\nlet orchestrator_config = import "../../values/orchestrator.solo.ncl" in\nlet control_center_config = import "../../values/control-center.solo.ncl" in\n\n{\n version = "3.8",\n services = {\n orchestrator = {\n image = "provisioning-orchestrator:latest",\n container_name = "orchestrator",\n ports = [\n "%{std.to_string orchestrator_config.orchestrator.server.port}:9090",\n ],\n environment = {\n ORCHESTRATOR_SERVER_HOST = orchestrator_config.orchestrator.server.host,\n ORCHESTRATOR_SERVER_PORT = std.to_string orchestrator_config.orchestrator.server.port,\n ORCHESTRATOR_STORAGE_BACKEND = orchestrator_config.orchestrator.storage.backend,\n },\n volumes = [\n "./data/orchestrator:%{orchestrator_config.orchestrator.storage.path}",\n ],\n restart = "unless-stopped",\n },\n control-center = {\n image = "provisioning-control-center:latest",\n container_name = "control-center",\n ports = [\n "%{std.to_string control_center_config.control_center.server.port}:8080",\n ],\n environment = {\n CONTROL_CENTER_SERVER_HOST = control_center_config.control_center.server.host,\n CONTROL_CENTER_SERVER_PORT = std.to_string control_center_config.control_center.server.port,\n },\n restart = "unless-stopped",\n },\n },\n}\n```\n\n### Rendering Docker Compose\n\n```\n# Export Nickel template to YAML\nnickel export --format json templates/docker-compose/platform-stack.solo.yml.ncl | yq -P > docker-compose.solo.yml\n```\n\n## Kubernetes Templates\n\nNickel templates for Kubernetes manifests:\n\n```\n# templates/kubernetes/orchestrator-deployment.yaml.ncl\nlet config = import "../../values/orchestrator.solo.ncl" in\n\n{\n apiVersion = "apps/v1",\n kind = "Deployment",\n metadata = {\n name = "orchestrator",\n labels = {\n app = "orchestrator",\n },\n },\n spec = {\n 
replicas = 1,\n selector = {\n matchLabels = {\n app = "orchestrator",\n },\n },\n template = {\n metadata = {\n labels = {\n app = "orchestrator",\n },\n },\n spec = {\n containers = [\n {\n name = "orchestrator",\n image = "provisioning-orchestrator:latest",\n ports = [\n {\n containerPort = 9090,\n },\n ],\n env = [\n {\n name = "ORCHESTRATOR_SERVER_PORT",\n value = std.to_string config.orchestrator.server.port,\n },\n {\n name = "ORCHESTRATOR_STORAGE_BACKEND",\n value = config.orchestrator.storage.backend,\n },\n ],\n volumeMounts = [\n {\n name = "data",\n mountPath = config.orchestrator.storage.path,\n },\n ],\n },\n ],\n volumes = [\n {\n name = "data",\n persistentVolumeClaim = {\n claimName = "orchestrator-pvc",\n },\n },\n ],\n },\n },\n },\n}\n```\n\n## Rendering Templates\n\n### Render to JSON\n\n```\nnickel export --format json templates/orchestrator-config.ncl.j2 > config.json\n```\n\n### Render to YAML (via yq)\n\n```\nnickel export --format json templates/kubernetes/orchestrator-deployment.yaml.ncl | yq -P > deployment.yaml\n```\n\n### Render to TOML\n\n```\nnickel export --format toml templates/configs/orchestrator-config.toml.ncl > config.toml\n```\n\n## Template Variables\n\nVariables in templates come from:\n1. **Form values** (TypeDialog input)\n2. **Imported configs** (Nickel imports)\n3. **Constraint interpolation** (constraints.toml)\n\n## Best Practices\n\n1. **Use conditional blocks** - Only include fields if set\n2. **Import configs** - Reuse Nickel configs in templates\n3. **Type conversion** - Use `std.to_string` for numeric values\n4. **Comments** - Explain generated/auto-edited markers\n5. **Validation** - Use `nickel typecheck` to verify templates\n6. **Environment variables** - Prefer env over hardcoding\n\n## Template Testing\n\n```\n# Typecheck Jinja2 + Nickel template\nnickel typecheck templates/orchestrator-config.ncl.j2\n\n# Evaluate and view output\nnickel eval templates/orchestrator-config.ncl.j2\n\n# Export and validate output\nnickel export --format json templates/orchestrator-config.ncl.j2 | jq '.'\n```\n\n## Adding a New Template\n\n1. **Create template file** (`{service}-config.ncl.j2` or `{name}.yml.ncl`)\n2. **Define structure** (Nickel or Jinja2)\n3. **Import configs** (if Nickel)\n4. **Use variables** (from forms or imports)\n5. **Typecheck**: `nickel typecheck templates/{file}`\n6. **Test rendering**: `nickel export {format} templates/{file}`\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 +# Templates + +Jinja2 and Nickel templates for configuration and deployment generation. 
+
+## Purpose
+
+Templates provide:
+- **Nickel output generation** - Jinja2 templates for TypeDialog nickel-roundtrip
+- **Docker Compose generation** - Infrastructure-as-code for containerized deployment
+- **Kubernetes manifests** - Declarative deployment manifests
+- **TOML export** - Service configuration generation for Rust codebase
+
+## File Organization
+
+```bash
+templates/
+├── README.md                       # This file
+├── orchestrator-config.ncl.j2     # Nickel output template (Jinja2)
+├── control-center-config.ncl.j2   # Nickel output template (Jinja2)
+├── mcp-server-config.ncl.j2       # Nickel output template (Jinja2)
+├── installer-config.ncl.j2        # Nickel output template (Jinja2)
+├── docker-compose/                 # Docker Compose templates
+│   ├── platform-stack.solo.yml.ncl
+│   ├── platform-stack.multiuser.yml.ncl
+│   ├── platform-stack.cicd.yml.ncl
+│   └── platform-stack.enterprise.yml.ncl
+├── kubernetes/                     # Kubernetes templates
+│   ├── orchestrator-deployment.yaml.ncl
+│   ├── orchestrator-service.yaml.ncl
+│   ├── control-center-deployment.yaml.ncl
+│   ├── control-center-service.yaml.ncl
+│   └── platform-ingress.yaml.ncl
+└── configs/                        # Service config templates (optional)
+    ├── orchestrator-config.toml.ncl
+    ├── control-center-config.toml.ncl
+    └── mcp-server-config.toml.ncl
+```
+
+## Jinja2 Config Templates
+
+**Critical for TypeDialog nickel-roundtrip workflow**:
+
+```bash
+typedialog-web nickel-roundtrip "$CONFIG" "forms/{service}-form.toml" --output "$CONFIG" --template "templates/{service}-config.ncl.j2"
+```
+
+### Template Pattern: orchestrator-config.ncl.j2
+
+```jinja
+# Orchestrator Configuration - Nickel Format
+# Auto-generated by provisioning TypeDialog
+# Edit via: nu scripts/configure.nu orchestrator {mode}
+
+{
+  orchestrator = {
+    # Workspace Configuration
+    workspace = {
+      {%- if workspace_name %}
+      name = "{{ workspace_name }}",
+      {%- endif %}
+      {%- if workspace_path %}
+      path = "{{ workspace_path }}",
+      {%- endif %}
+      {%- if workspace_enabled is defined %}
+      enabled = {{ workspace_enabled | lower }},
+      {%- endif %}
+      {%- if multi_workspace is defined %}
+      multi_workspace = {{ multi_workspace | lower }},
+      {%- endif %}
+    },
+
+    # Server Configuration
+    server = {
+      {%- if server_host %}
+      host = "{{ server_host }}",
+      {%- endif %}
+      {%- if server_port %}
+      port = {{ server_port }},
+      {%- endif %}
+      {%- if server_workers %}
+      workers = {{ server_workers }},
+      {%- endif %}
+      {%- if server_keep_alive %}
+      keep_alive = {{ server_keep_alive }},
+      {%- endif %}
+    },
+
+    # Storage Configuration
+    storage = {
+      {%- if storage_backend %}
+      backend = '{{ storage_backend }},
+      {%- endif %}
+      {%- if storage_path %}
+      path = "{{ storage_path }}",
+      {%- endif %}
+      {%- if surrealdb_url %}
+      surrealdb_url = "{{ surrealdb_url }}",
+      {%- endif %}
+    },
+
+    # Queue Configuration
+    queue = {
+      {%- if max_concurrent_tasks %}
+      max_concurrent_tasks = {{ max_concurrent_tasks }},
+      {%- endif %}
+      {%- if retry_attempts %}
+      retry_attempts = {{ retry_attempts }},
+      {%- endif %}
+      {%- if retry_delay %}
+      retry_delay = {{ retry_delay }},
+      {%- endif %}
+      {%- if task_timeout %}
+      task_timeout = {{ task_timeout }},
+      {%- endif %}
+    },
+
+    # Monitoring Configuration (optional)
+    {%- if enable_monitoring is defined and enable_monitoring %}
+    monitoring = {
+      enabled = true,
+      {%- if metrics_interval %}
+      metrics_interval = {{ metrics_interval }},
+      {%- endif %}
+      {%- if health_check_interval %}
+      health_check_interval = {{ health_check_interval }},
+      {%- endif %}
+    },
+    {%- endif %}
+  },
+}
+```
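+
+For orientation, rendering this template with hypothetical form values (`workspace_name = "demo"`, `server_port = 9090`, `storage_backend = "filesystem"`, everything else unset) would produce a Nickel fragment roughly like:
+
+```nickel
+# Illustrative output for the sample values above
+{
+  orchestrator = {
+    workspace = {
+      name = "demo",
+    },
+    server = {
+      port = 9090,
+    },
+    storage = {
+      backend = 'filesystem,
+    },
+    queue = {},
+  },
+}
+```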
+
+### Key Jinja2 Patterns
+
+**Conditional blocks** (only include if field is set):
+
+```jinja
+{%- if workspace_name %}
+name = "{{ workspace_name }}",
+{%- endif %}
+```
+
+**String values** (with quotes):
+
+```jinja
+{%- if storage_backend %}
+backend = '{{ storage_backend }},   # Enum (atom syntax)
+{%- endif %}
+```
+
+**Numeric values** (no quotes):
+
+```jinja
+{%- if server_port %}
+port = {{ server_port }},   # Number
+{%- endif %}
+```
+
+**Boolean values** (lowercase):
+
+```jinja
+{%- if workspace_enabled is defined %}
+enabled = {{ workspace_enabled | lower }},   # Boolean (true/false)
+{%- endif %}
+```
+
+**Comments** (for generated files):
+
+```nickel
+# Auto-generated by provisioning TypeDialog
+# Edit via: nu scripts/configure.nu orchestrator {mode}
+```
+
+## Docker Compose Templates
+
+Nickel templates that import from `values/*.ncl`:
+
+```nickel
+# templates/docker-compose/platform-stack.solo.yml.ncl
+# Docker Compose Platform Stack - Solo Mode
+# Imports config from values/orchestrator.solo.ncl
+
+let orchestrator_config = import "../../values/orchestrator.solo.ncl" in
+let control_center_config = import "../../values/control-center.solo.ncl" in
+
+{
+  version = "3.8",
+  services = {
+    orchestrator = {
+      image = "provisioning-orchestrator:latest",
+      container_name = "orchestrator",
+      ports = [
+        "%{std.to_string orchestrator_config.orchestrator.server.port}:9090",
+      ],
+      environment = {
+        ORCHESTRATOR_SERVER_HOST = orchestrator_config.orchestrator.server.host,
+        ORCHESTRATOR_SERVER_PORT = std.to_string orchestrator_config.orchestrator.server.port,
+        ORCHESTRATOR_STORAGE_BACKEND = orchestrator_config.orchestrator.storage.backend,
+      },
+      volumes = [
+        "./data/orchestrator:%{orchestrator_config.orchestrator.storage.path}",
+      ],
+      restart = "unless-stopped",
+    },
+    control-center = {
+      image = "provisioning-control-center:latest",
+      container_name = "control-center",
+      ports = [
+        "%{std.to_string control_center_config.control_center.server.port}:8080",
+      ],
+      environment = {
+        CONTROL_CENTER_SERVER_HOST = control_center_config.control_center.server.host,
+        CONTROL_CENTER_SERVER_PORT = std.to_string control_center_config.control_center.server.port,
+      },
+      restart = "unless-stopped",
+    },
+  },
+}
+```
+
+### Rendering Docker Compose
+
+```bash
+# Export Nickel template to YAML
+nickel export --format json templates/docker-compose/platform-stack.solo.yml.ncl | yq -P > docker-compose.solo.yml
+```
+
+## Kubernetes Templates
+
+Nickel templates for Kubernetes manifests:
+
+```nickel
+# templates/kubernetes/orchestrator-deployment.yaml.ncl
+let config = import "../../values/orchestrator.solo.ncl" in
+
+{
+  apiVersion = "apps/v1",
+  kind = "Deployment",
+  metadata = {
+    name = "orchestrator",
+    labels = {
+      app = "orchestrator",
+    },
+  },
+  spec = {
+    replicas = 1,
+    selector = {
+      matchLabels = {
+        app = "orchestrator",
+      },
+    },
+    template = {
+      metadata = {
+        labels = {
+          app = "orchestrator",
+        },
+      },
+      spec = {
+        containers = [
+          {
+            name = "orchestrator",
+            image = "provisioning-orchestrator:latest",
+            ports = [
+              {
+                containerPort = 9090,
+              },
+            ],
+            env = [
+              {
+                name = "ORCHESTRATOR_SERVER_PORT",
+                value = std.to_string config.orchestrator.server.port,
+              },
+              {
+                name = "ORCHESTRATOR_STORAGE_BACKEND",
+                value = config.orchestrator.storage.backend,
+              },
+            ],
+            volumeMounts = [
+              {
+                name = "data",
+                mountPath = config.orchestrator.storage.path,
+              },
+            ],
+          },
+        ],
+        volumes = [
+          {
+            name = "data",
+            persistentVolumeClaim = {
+              claimName = "orchestrator-pvc",
+            },
+          },
+        ],
+      },
+    },
+  },
+}
+```
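+
+Before applying a rendered manifest, it can be sanity-checked without touching the cluster (a sketch; assumes `kubectl` is installed and reuses the rendering command from the next section):
+
+```bash
+# Render, then validate client-side without applying
+nickel export --format json templates/kubernetes/orchestrator-deployment.yaml.ncl | yq -P > deployment.yaml
+kubectl apply --dry-run=client -f deployment.yaml
+```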
+
+## Rendering Templates
+
+### Render to JSON
+
+```bash
+nickel export --format json templates/orchestrator-config.ncl.j2 > config.json
+```
+
+### Render to YAML (via yq)
+
+```bash
+nickel export --format json templates/kubernetes/orchestrator-deployment.yaml.ncl | yq -P > deployment.yaml
+```
+
+### Render to TOML
+
+```bash
+nickel export --format toml templates/configs/orchestrator-config.toml.ncl > config.toml
+```
+
+## Template Variables
+
+Variables in templates come from:
+1. **Form values** (TypeDialog input)
+2. **Imported configs** (Nickel imports)
+3. **Constraint interpolation** (constraints.toml)
+
+## Best Practices
+
+1. **Use conditional blocks** - Only include fields if set
+2. **Import configs** - Reuse Nickel configs in templates
+3. **Type conversion** - Use `std.to_string` for numeric values
+4. **Comments** - Explain generated/auto-edited markers
+5. **Validation** - Use `nickel typecheck` to verify templates
+6. **Environment variables** - Prefer env over hardcoding
+
+## Template Testing
+
+```bash
+# Typecheck Jinja2 + Nickel template
+nickel typecheck templates/orchestrator-config.ncl.j2
+
+# Evaluate and view output
+nickel eval templates/orchestrator-config.ncl.j2
+
+# Export and validate output
+nickel export --format json templates/orchestrator-config.ncl.j2 | jq '.'
+```
+
+## Adding a New Template
+
+1. **Create template file** (`{service}-config.ncl.j2` or `{name}.yml.ncl`)
+2. **Define structure** (Nickel or Jinja2)
+3. **Import configs** (if Nickel)
+4. **Use variables** (from forms or imports)
+5. **Typecheck**: `nickel typecheck templates/{file}`
+6. **Test rendering**: `nickel export {format} templates/{file}`
+
+---
+
+**Version**: 1.0.0
+**Last Updated**: 2025-01-05
\ No newline at end of file
diff --git a/schemas/platform/templates/configs/README.md b/schemas/platform/templates/configs/README.md
index 4831149..0a77f94 100644
--- a/schemas/platform/templates/configs/README.md
+++ b/schemas/platform/templates/configs/README.md
@@ -1 +1,383 @@
-# Service Configuration Templates\n\nNickel-based configuration templates that export to TOML format for provisioning platform services.\n\n## Overview\n\nThis directory contains Nickel templates that generate TOML configuration files for the provisioning platform services:\n\n- **orchestrator-config.toml.ncl** - Workflow engine configuration\n- **control-center-config.toml.ncl** - Policy and RBAC management configuration\n- **mcp-server-config.toml.ncl** - Model Context Protocol server configuration\n\nThese templates support all four deployment modes:\n\n- **solo**: Single developer, minimal configuration\n- **multiuser**: Team collaboration with full features\n- **cicd**: CI/CD pipelines with ephemeral configuration\n- **enterprise**: Production with advanced security and monitoring\n\n## Templates\n\n### orchestrator-config.toml.ncl\n\nOrchestrator workflow engine configuration with sections for:\n\n- **Workspace**: Workspace name, path, and multi-workspace support\n- **Server**: HTTP server configuration (host, port, workers)\n- **Storage**: Backend selection (filesystem, SurrealDB embedded, SurrealDB server)\n- **Queue**: Task concurrency, retries, timeouts, deadletter queue\n- **Batch**: Parallel limits, operation timeouts, checkpointing, rollback\n- **Monitoring**: Metrics collection, health checks, resource tracking\n- **Logging**: Log levels, outputs, rotation\n- **Security**: JWT auth, CORS, TLS, rate limiting\n- **Extensions**: Auto-loading from OCI registry\n- **Database**: Connection pooling for 
non-filesystem storage\n- **Features**: Feature flags for experimental functionality\n\n**Key Parameters**:\n- `max_concurrent_tasks`: 1-100 (constrained)\n- `batch.parallel_limit`: 1-50 (constrained)\n- Storage backend: filesystem, surrealdb_server, surrealdb_cluster\n- Logging format: json or text\n\n### control-center-config.toml.ncl\n\nControl Center policy and RBAC management configuration with sections for:\n\n- **Server**: HTTP server configuration\n- **Database**: Backend selection (RocksDB, PostgreSQL, PostgreSQL HA)\n- **Auth**: JWT, OAUTH2, LDAP authentication methods\n- **RBAC**: Role-based access control with roles and permissions\n- **MFA**: Multi-factor authentication (TOTP, Email OTP)\n- **Policies**: Password policy, session policy, audit, compliance\n- **Rate Limiting**: Global and per-user rate limits\n- **CORS**: Cross-origin resource sharing configuration\n- **TLS**: SSL/TLS configuration\n- **Monitoring**: Metrics, health checks, tracing\n- **Logging**: Log outputs and rotation\n- **Orchestrator Integration**: Connection to orchestrator service\n- **Features**: Feature flags\n\n**Key Parameters**:\n- `database.backend`: rocksdb, postgres, postgres_ha\n- `mfa.required`: false for solo/multiuser, true for enterprise\n- `policies.password.min_length`: 12\n- `policies.compliance`: SOC2, HIPAA support\n\n### mcp-server-config.toml.ncl\n\nModel Context Protocol server configuration for AI/LLM integration with sections for:\n\n- **Server**: HTTP/Stdio protocol configuration\n- **Capabilities**: Tools, resources, prompts, sampling\n- **Tools**: Tool categories and configurations (orchestrator, provisioning, workspace)\n- **Resources**: File system, database, external API resources\n- **Prompts**: System prompts and user prompt configuration\n- **Integration**: Orchestrator, Control Center, Claude API integration\n- **Security**: Authentication, authorization, rate limiting, input validation\n- **Monitoring**: Metrics, health checks, audit logging\n- **Logging**: Log outputs and configuration\n- **Features**: Feature flags\n- **Performance**: Thread pools, timeouts, caching\n\n**Key Parameters**:\n- `server.protocol`: stdio (process-based) or http (network-based)\n- `capabilities.tools.enabled`: true/false\n- `capabilities.resources.max_size`: 1GB default\n- `integration.claude.model`: claude-3-opus (latest)\n\n## Usage\n\n### Exporting to TOML\n\nEach template exports to TOML format:\n\n```\n# Export orchestrator configuration\nnickel export --format toml orchestrator-config.toml.ncl > orchestrator.toml\n\n# Export control-center configuration\nnickel export --format toml control-center-config.toml.ncl > control-center.toml\n\n# Export MCP server configuration\nnickel export --format toml mcp-server-config.toml.ncl > mcp-server.toml\n```\n\n### Mode-Specific Configuration\n\nOverride configuration values based on deployment mode using environment variables or configuration layering:\n\n```\n# Export solo mode configuration\nORCHESTRATOR_MODE=solo nickel export --format toml orchestrator-config.toml.ncl > orchestrator.solo.toml\n\n# Export enterprise mode with full features\nORCHESTRATOR_MODE=enterprise nickel export --format toml orchestrator-config.toml.ncl > orchestrator.enterprise.toml\n```\n\n### Integration with Rust Services\n\nRust services load TOML configuration in this order (high to low priority):\n\n1. **Environment Variables** - `ORCHESTRATOR_*`, `CONTROL_CENTER_*`, `MCP_*`\n2. **User Configuration** - `~/.config/provisioning/user_config.toml`\n3. 
**Mode-Specific Config** - `provisioning/platform/config/{service}.{mode}.toml`\n4. **Default Configuration** - `provisioning/platform/config/{service}.defaults.toml`\n\nExample loading in Rust:\n\n```\nuse config::{Config, ConfigError, File};\n\npub fn load_config(mode: &str) -> Result {\n let config_path = format!("provisioning/platform/config/orchestrator.{}.toml", mode);\n\n Config::builder()\n .add_source(File::with_name("provisioning/platform/config/orchestrator.defaults"))\n .add_source(File::with_name(&config_path).required(false))\n .add_source(config::Environment::with_prefix("ORCHESTRATOR"))\n .build()?\n .try_deserialize()\n}\n```\n\n## Configuration Sections\n\n### Server Configuration (All Services)\n\n```\n[server]\nhost = "0.0.0.0"\nport = 9090\nworkers = 4\nkeep_alive = 75\nmax_connections = 512\n```\n\n### Database Configuration (Control Center)\n\n**RocksDB** (solo, cicd modes):\n\n```\n[database]\nbackend = "rocksdb"\n\n[database.rocksdb]\npath = "/var/lib/provisioning/control-center/db"\ncache_size = "256MB"\nmax_open_files = 1000\ncompression = "snappy"\n```\n\n**PostgreSQL** (multiuser, enterprise modes):\n\n```\n[database]\nbackend = "postgres"\n\n[database.postgres]\nhost = "postgres.provisioning.svc.cluster.local"\nport = 5432\ndatabase = "provisioning"\nuser = "provisioning"\npassword = "${DB_PASSWORD}"\nssl_mode = "require"\n```\n\n### Storage Configuration (Orchestrator)\n\n**Filesystem** (solo, cicd modes):\n\n```\n[storage]\nbackend = "filesystem"\npath = "/var/lib/provisioning/orchestrator/data"\n```\n\n**SurrealDB Server** (multiuser mode):\n\n```\n[storage]\nbackend = "surrealdb_server"\nsurrealdb_url = "surrealdb://surrealdb:8000"\nsurrealdb_namespace = "provisioning"\nsurrealdb_database = "orchestrator"\n```\n\n**SurrealDB Cluster** (enterprise mode):\n\n```\n[storage]\nbackend = "surrealdb_cluster"\nsurrealdb_url = "surrealdb://surrealdb-cluster.provisioning.svc.cluster.local:8000"\nsurrealdb_namespace = "provisioning"\nsurrealdb_database = "orchestrator"\n```\n\n### RBAC Configuration (Control Center)\n\n```\n[rbac]\nenabled = true\ndefault_role = "viewer"\n\n[rbac.roles.admin]\ndescription = "Administrator with full access"\npermissions = ["*"]\n\n[rbac.roles.operator]\ndescription = "Operator managing orchestrator"\npermissions = ["orchestrator.view", "orchestrator.execute"]\n```\n\n### Queue Configuration (Orchestrator)\n\n```\n[queue]\nmax_concurrent_tasks = 50\nretry_attempts = 3\nretry_delay = 5000\ntask_timeout = 3600000\n\n[queue.deadletter_queue]\nenabled = true\nmax_messages = 1000\nretention_period = 86400\n```\n\n### Logging Configuration (All Services)\n\n```\n[logging]\nlevel = "info"\nformat = "json"\n\n[[logging.outputs]]\ndestination = "stdout"\nlevel = "info"\n\n[[logging.outputs]]\ndestination = "file"\npath = "/var/log/provisioning/orchestrator/orchestrator.log"\nlevel = "debug"\n\n[logging.outputs.rotation]\nmax_size = "100MB"\nmax_backups = 10\nmax_age = 30\n```\n\n### Monitoring Configuration (All Services)\n\n```\n[monitoring]\nenabled = true\n\n[monitoring.metrics]\nenabled = true\ninterval = 30\nexport_format = "prometheus"\n\n[monitoring.health_check]\nenabled = true\ninterval = 30\ntimeout = 10\n```\n\n### Security Configuration (All Services)\n\n```\n[security.auth]\nenabled = true\nmethod = "jwt"\njwt_secret = "${JWT_SECRET}"\njwt_issuer = "provisioning.local"\njwt_audience = "orchestrator"\ntoken_expiration = 3600\n\n[security.cors]\nenabled = true\nallowed_origins = ["https://control-center:8080"]\nallowed_methods = 
["GET", "POST", "PUT", "DELETE"]\n\n[security.rate_limit]\nenabled = true\nrequests_per_second = 1000\nburst_size = 100\n```\n\n## Environment Variables\n\nAll sensitive values should be provided via environment variables:\n\n```\n# Secrets\nexport JWT_SECRET="your-jwt-secret-here"\nexport DB_PASSWORD="your-database-password"\nexport ORCHESTRATOR_TOKEN="your-orchestrator-token"\nexport CONTROL_CENTER_TOKEN="your-control-center-token"\nexport CLAUDE_API_KEY="your-claude-api-key"\n\n# Service URLs (if different from defaults)\nexport ORCHESTRATOR_URL="http://orchestrator:9090"\nexport CONTROL_CENTER_URL="http://control-center:8080"\n\n# Mode selection\nexport PROVISIONING_MODE="enterprise"\n```\n\n## Mode-Specific Overrides\n\n### Solo Mode\n- Minimal resources: 2 CPU, 4GB RAM\n- Filesystem storage for orchestrator\n- RocksDB for control-center\n- No MFA required\n- Single replica deployments\n- Logging: info level\n\n### MultiUser Mode\n- Moderate resources: 4 CPU, 8GB RAM\n- SurrealDB server for orchestrator\n- PostgreSQL for control-center\n- RBAC enabled\n- 1 replica per service\n- Logging: debug level\n\n### CI/CD Mode\n- Stateless configuration\n- Ephemeral storage (no persistence)\n- API-driven (minimal UI)\n- No MFA required\n- 1 replica per service\n- Logging: warn level (minimal)\n\n### Enterprise Mode\n- High resources: 16+ CPU, 32+ GB RAM\n- SurrealDB cluster for orchestrator HA\n- PostgreSQL HA for control-center\n- Full RBAC and MFA required\n- 3+ replicas per service\n- Full monitoring and audit logging\n- Logging: info level with detailed audit\n\n## Validation\n\nValidate configuration before using:\n\n```\n# Type check with Nickel\nnickel typecheck orchestrator-config.toml.ncl\n\n# Export and validate TOML syntax\nnickel export --format toml orchestrator-config.toml.ncl | toml-cli validate -\n```\n\n## References\n\n- [Orchestrator Configuration Schema](../../schemas/orchestrator.ncl)\n- [Control Center Configuration Schema](../../schemas/control-center.ncl)\n- [MCP Server Configuration Schema](../../schemas/mcp-server.ncl)\n- [Nickel Language](https://nickel-lang.org/)\n- [TOML Format](https://toml.io/) +# Service Configuration Templates + +Nickel-based configuration templates that export to TOML format for provisioning platform services. 
+ +## Overview + +This directory contains Nickel templates that generate TOML configuration files for the provisioning platform services: + +- **orchestrator-config.toml.ncl** - Workflow engine configuration +- **control-center-config.toml.ncl** - Policy and RBAC management configuration +- **mcp-server-config.toml.ncl** - Model Context Protocol server configuration + +These templates support all four deployment modes: + +- **solo**: Single developer, minimal configuration +- **multiuser**: Team collaboration with full features +- **cicd**: CI/CD pipelines with ephemeral configuration +- **enterprise**: Production with advanced security and monitoring + +## Templates + +### orchestrator-config.toml.ncl + +Orchestrator workflow engine configuration with sections for: + +- **Workspace**: Workspace name, path, and multi-workspace support +- **Server**: HTTP server configuration (host, port, workers) +- **Storage**: Backend selection (filesystem, SurrealDB embedded, SurrealDB server) +- **Queue**: Task concurrency, retries, timeouts, deadletter queue +- **Batch**: Parallel limits, operation timeouts, checkpointing, rollback +- **Monitoring**: Metrics collection, health checks, resource tracking +- **Logging**: Log levels, outputs, rotation +- **Security**: JWT auth, CORS, TLS, rate limiting +- **Extensions**: Auto-loading from OCI registry +- **Database**: Connection pooling for non-filesystem storage +- **Features**: Feature flags for experimental functionality + +**Key Parameters**: +- `max_concurrent_tasks`: 1-100 (constrained) +- `batch.parallel_limit`: 1-50 (constrained) +- Storage backend: filesystem, surrealdb_server, surrealdb_cluster +- Logging format: json or text + +### control-center-config.toml.ncl + +Control Center policy and RBAC management configuration with sections for: + +- **Server**: HTTP server configuration +- **Database**: Backend selection (RocksDB, PostgreSQL, PostgreSQL HA) +- **Auth**: JWT, OAUTH2, LDAP authentication methods +- **RBAC**: Role-based access control with roles and permissions +- **MFA**: Multi-factor authentication (TOTP, Email OTP) +- **Policies**: Password policy, session policy, audit, compliance +- **Rate Limiting**: Global and per-user rate limits +- **CORS**: Cross-origin resource sharing configuration +- **TLS**: SSL/TLS configuration +- **Monitoring**: Metrics, health checks, tracing +- **Logging**: Log outputs and rotation +- **Orchestrator Integration**: Connection to orchestrator service +- **Features**: Feature flags + +**Key Parameters**: +- `database.backend`: rocksdb, postgres, postgres_ha +- `mfa.required`: false for solo/multiuser, true for enterprise +- `policies.password.min_length`: 12 +- `policies.compliance`: SOC2, HIPAA support + +### mcp-server-config.toml.ncl + +Model Context Protocol server configuration for AI/LLM integration with sections for: + +- **Server**: HTTP/Stdio protocol configuration +- **Capabilities**: Tools, resources, prompts, sampling +- **Tools**: Tool categories and configurations (orchestrator, provisioning, workspace) +- **Resources**: File system, database, external API resources +- **Prompts**: System prompts and user prompt configuration +- **Integration**: Orchestrator, Control Center, Claude API integration +- **Security**: Authentication, authorization, rate limiting, input validation +- **Monitoring**: Metrics, health checks, audit logging +- **Logging**: Log outputs and configuration +- **Features**: Feature flags +- **Performance**: Thread pools, timeouts, caching + +**Key Parameters**: +- 
`server.protocol`: stdio (process-based) or http (network-based) +- `capabilities.tools.enabled`: true/false +- `capabilities.resources.max_size`: 1GB default +- `integration.claude.model`: claude-3-opus (latest) + +## Usage + +### Exporting to TOML + +Each template exports to TOML format: + +```bash +# Export orchestrator configuration +nickel export --format toml orchestrator-config.toml.ncl > orchestrator.toml + +# Export control-center configuration +nickel export --format toml control-center-config.toml.ncl > control-center.toml + +# Export MCP server configuration +nickel export --format toml mcp-server-config.toml.ncl > mcp-server.toml +``` + +### Mode-Specific Configuration + +Override configuration values based on deployment mode using environment variables or configuration layering: + +```bash +# Export solo mode configuration +ORCHESTRATOR_MODE=solo nickel export --format toml orchestrator-config.toml.ncl > orchestrator.solo.toml + +# Export enterprise mode with full features +ORCHESTRATOR_MODE=enterprise nickel export --format toml orchestrator-config.toml.ncl > orchestrator.enterprise.toml +``` + +### Integration with Rust Services + +Rust services load TOML configuration in this order (high to low priority): + +1. **Environment Variables** - `ORCHESTRATOR_*`, `CONTROL_CENTER_*`, `MCP_*` +2. **User Configuration** - `~/.config/provisioning/user_config.toml` +3. **Mode-Specific Config** - `provisioning/platform/config/{service}.{mode}.toml` +4. **Default Configuration** - `provisioning/platform/config/{service}.defaults.toml` + +Example loading in Rust: + +```rust +use config::{Config, ConfigError, File}; + +// Deserializes into any caller-supplied struct implementing Deserialize. +pub fn load_config<T: serde::de::DeserializeOwned>(mode: &str) -> Result<T, ConfigError> { + let config_path = format!("provisioning/platform/config/orchestrator.{}.toml", mode); + + Config::builder() + .add_source(File::with_name("provisioning/platform/config/orchestrator.defaults")) + .add_source(File::with_name(&config_path).required(false)) + .add_source(config::Environment::with_prefix("ORCHESTRATOR")) + .build()? 
+ .try_deserialize() +} +``` + +## Configuration Sections + +### Server Configuration (All Services) + +```toml +[server] +host = "0.0.0.0" +port = 9090 +workers = 4 +keep_alive = 75 +max_connections = 512 +``` + +### Database Configuration (Control Center) + +**RocksDB** (solo, cicd modes): + +```toml +[database] +backend = "rocksdb" + +[database.rocksdb] +path = "/var/lib/provisioning/control-center/db" +cache_size = "256MB" +max_open_files = 1000 +compression = "snappy" +``` + +**PostgreSQL** (multiuser, enterprise modes): + +```toml +[database] +backend = "postgres" + +[database.postgres] +host = "postgres.provisioning.svc.cluster.local" +port = 5432 +database = "provisioning" +user = "provisioning" +password = "${DB_PASSWORD}" +ssl_mode = "require" +``` + +### Storage Configuration (Orchestrator) + +**Filesystem** (solo, cicd modes): + +```toml +[storage] +backend = "filesystem" +path = "/var/lib/provisioning/orchestrator/data" +``` + +**SurrealDB Server** (multiuser mode): + +```toml +[storage] +backend = "surrealdb_server" +surrealdb_url = "surrealdb://surrealdb:8000" +surrealdb_namespace = "provisioning" +surrealdb_database = "orchestrator" +``` + +**SurrealDB Cluster** (enterprise mode): + +```toml +[storage] +backend = "surrealdb_cluster" +surrealdb_url = "surrealdb://surrealdb-cluster.provisioning.svc.cluster.local:8000" +surrealdb_namespace = "provisioning" +surrealdb_database = "orchestrator" +``` + +### RBAC Configuration (Control Center) + +```toml +[rbac] +enabled = true +default_role = "viewer" + +[rbac.roles.admin] +description = "Administrator with full access" +permissions = ["*"] + +[rbac.roles.operator] +description = "Operator managing orchestrator" +permissions = ["orchestrator.view", "orchestrator.execute"] +``` + +### Queue Configuration (Orchestrator) + +```toml +[queue] +max_concurrent_tasks = 50 +retry_attempts = 3 +retry_delay = 5000 +task_timeout = 3600000 + +[queue.deadletter_queue] +enabled = true +max_messages = 1000 +retention_period = 86400 +``` + +### Logging Configuration (All Services) + +```toml +[logging] +level = "info" +format = "json" + +[[logging.outputs]] +destination = "stdout" +level = "info" + +[[logging.outputs]] +destination = "file" +path = "/var/log/provisioning/orchestrator/orchestrator.log" +level = "debug" + +[logging.outputs.rotation] +max_size = "100MB" +max_backups = 10 +max_age = 30 +``` + +### Monitoring Configuration (All Services) + +```toml +[monitoring] +enabled = true + +[monitoring.metrics] +enabled = true +interval = 30 +export_format = "prometheus" + +[monitoring.health_check] +enabled = true +interval = 30 +timeout = 10 +``` + +### Security Configuration (All Services) + +```toml +[security.auth] +enabled = true +method = "jwt" +jwt_secret = "${JWT_SECRET}" +jwt_issuer = "provisioning.local" +jwt_audience = "orchestrator" +token_expiration = 3600 + +[security.cors] +enabled = true +allowed_origins = ["https://control-center:8080"] +allowed_methods = ["GET", "POST", "PUT", "DELETE"] + +[security.rate_limit] +enabled = true +requests_per_second = 1000 +burst_size = 100 +``` + +## Environment Variables + +All sensitive values should be provided via environment variables: + +```bash +# Secrets +export JWT_SECRET="your-jwt-secret-here" +export DB_PASSWORD="your-database-password" +export ORCHESTRATOR_TOKEN="your-orchestrator-token" +export CONTROL_CENTER_TOKEN="your-control-center-token" +export CLAUDE_API_KEY="your-claude-api-key" + +# Service URLs (if different from defaults) +export 
ORCHESTRATOR_URL="http://orchestrator:9090" +export CONTROL_CENTER_URL="http://control-center:8080" + +# Mode selection +export PROVISIONING_MODE="enterprise" +``` + +## Mode-Specific Overrides + +### Solo Mode +- Minimal resources: 2 CPU, 4GB RAM +- Filesystem storage for orchestrator +- RocksDB for control-center +- No MFA required +- Single replica deployments +- Logging: info level + +### MultiUser Mode +- Moderate resources: 4 CPU, 8GB RAM +- SurrealDB server for orchestrator +- PostgreSQL for control-center +- RBAC enabled +- 1 replica per service +- Logging: debug level + +### CI/CD Mode +- Stateless configuration +- Ephemeral storage (no persistence) +- API-driven (minimal UI) +- No MFA required +- 1 replica per service +- Logging: warn level (minimal) + +### Enterprise Mode +- High resources: 16+ CPU, 32+ GB RAM +- SurrealDB cluster for orchestrator HA +- PostgreSQL HA for control-center +- Full RBAC and MFA required +- 3+ replicas per service +- Full monitoring and audit logging +- Logging: info level with detailed audit + +## Validation + +Validate configuration before using: + +```toml +# Type check with Nickel +nickel typecheck orchestrator-config.toml.ncl + +# Export and validate TOML syntax +nickel export --format toml orchestrator-config.toml.ncl | toml-cli validate - +``` + +## References + +- [Orchestrator Configuration Schema](../../schemas/orchestrator.ncl) +- [Control Center Configuration Schema](../../schemas/control-center.ncl) +- [MCP Server Configuration Schema](../../schemas/mcp-server.ncl) +- [Nickel Language](https://nickel-lang.org/) +- [TOML Format](https://toml.io/) \ No newline at end of file diff --git a/schemas/platform/templates/docker-compose/README.md b/schemas/platform/templates/docker-compose/README.md index c5e9cc3..107f0aa 100644 --- a/schemas/platform/templates/docker-compose/README.md +++ b/schemas/platform/templates/docker-compose/README.md @@ -1 +1,599 @@ -# Docker Compose Templates\n\nNickel-based Docker Compose templates for deploying platform services across all deployment modes.\n\n## Overview\n\nThis directory contains Nickel templates that generate Docker Compose files for different deployment scenarios.\nEach template imports configuration from `values/*.ncl` and expands to valid Docker Compose YAML.\n\n**Key Pattern**: Templates use **Nickel composition** to build service definitions dynamically based on configuration, allowing parameterized infrastructure-as-code.\n\n## Templates\n\n### 1. 
platform-stack.solo.yml.ncl\n\n**Purpose**: Single-developer local development stack\n\n**Services**:\n- `orchestrator` - Workflow engine\n- `control-center` - Policy and RBAC management\n- `mcp-server` - MCP protocol server\n\n**Configuration**:\n- Network: Bridge network named `provisioning`\n- Volumes: 5 named volumes for persistence\n - `orchestrator-data` - Orchestrator workflows\n - `control-center-data` - Control Center policies\n - `mcp-server-data` - MCP Server cache\n - `logs` - Shared log volume\n - `cache` - Shared cache volume\n- Ports:\n - 9090 - Orchestrator API\n - 8080 - Control Center UI\n - 8888 - MCP Server\n- Health Checks: 30-second intervals for all services\n- Logging: JSON format, 10MB max file size, 3 backups\n- Restart Policy: `unless-stopped` (survives host reboot)\n\n**Usage**:\n\n```\n# Generate from Nickel template\nnickel export --format json platform-stack.solo.yml.ncl | yq -P > docker-compose.solo.yml\n\n# Start services\ndocker-compose -f docker-compose.solo.yml up -d\n\n# View logs\ndocker-compose -f docker-compose.solo.yml logs -f\n\n# Stop services\ndocker-compose -f docker-compose.solo.yml down\n```\n\n**Environment Variables** (recommended in `.env` file):\n\n```\nORCHESTRATOR_LOG_LEVEL=debug\nCONTROL_CENTER_LOG_LEVEL=info\nMCP_SERVER_LOG_LEVEL=info\n```\n\n---\n\n### 2. platform-stack.multiuser.yml.ncl\n\n**Purpose**: Team collaboration with persistent database storage\n\n**Services** (6 total):\n- `postgres` - Primary database (PostgreSQL 15)\n- `orchestrator` - Workflow engine\n- `control-center` - Policy and RBAC management\n- `mcp-server` - MCP protocol server\n- `surrealdb` - Workflow storage (SurrealDB server)\n- `gitea` - Git repository hosting (optional, for version control)\n\n**Configuration**:\n- Network: Custom bridge network named `provisioning-network`\n- Volumes:\n - `postgres-data` - PostgreSQL database files\n - `orchestrator-data` - Orchestrator workflows\n - `control-center-data` - Control Center policies\n - `surrealdb-data` - SurrealDB files\n - `gitea-data` - Gitea repositories and configuration\n - `logs` - Shared logs\n- Ports:\n - 9090 - Orchestrator API\n - 8080 - Control Center UI\n - 8888 - MCP Server\n - 5432 - PostgreSQL (internal only)\n - 8000 - SurrealDB (internal only)\n - 3000 - Gitea web UI (optional)\n - 22 - Gitea SSH (optional)\n- Service Dependencies: Explicit `depends_on` with health checks\n - Control Center waits for PostgreSQL\n - SurrealDB starts before Orchestrator\n- Health Checks: Service-specific health checks\n- Restart Policy: `always` (automatic recovery on failure)\n- Logging: JSON format with rotation\n\n**Usage**:\n\n```\n# Generate from Nickel template\nnickel export --format json platform-stack.multiuser.yml.ncl | yq -P > docker-compose.multiuser.yml\n\n# Create environment file\ncat > .env.multiuser << 'EOF'\nDB_PASSWORD=secure-postgres-password\nSURREALDB_PASSWORD=secure-surrealdb-password\nJWT_SECRET=secure-jwt-secret-256-bits\nEOF\n\n# Start services\ndocker-compose -f docker-compose.multiuser.yml --env-file .env.multiuser up -d\n\n# Wait for all services to be healthy\ndocker-compose -f docker-compose.multiuser.yml ps\n\n# Create database and initialize schema (one-time)\ndocker-compose exec postgres psql -U postgres -c "CREATE DATABASE provisioning;"\n```\n\n**Database Initialization**:\n\n```\n# Connect to PostgreSQL for schema creation\ndocker-compose exec postgres psql -U provisioning -d provisioning\n\n# Connect to SurrealDB for schema setup\ndocker-compose exec surrealdb surreal sql 
--auth root:password\n\n# Connect to Gitea web UI\n# http://localhost:3000 (admin:admin by default)\n```\n\n**Environment Variables** (in `.env.multiuser`):\n\n```\n# Database Credentials (CRITICAL - change before production)\nDB_PASSWORD=your-strong-password\nSURREALDB_PASSWORD=your-strong-password\n\n# Security\nJWT_SECRET=your-256-bit-random-string\n\n# Logging\nORCHESTRATOR_LOG_LEVEL=info\nCONTROL_CENTER_LOG_LEVEL=info\nMCP_SERVER_LOG_LEVEL=info\n\n# Optional: Gitea Configuration\nGITEA_DOMAIN=localhost:3000\nGITEA_ROOT_URL=http://localhost:3000/\n```\n\n---\n\n### 3. platform-stack.cicd.yml.ncl\n\n**Purpose**: Ephemeral CI/CD pipeline stack with minimal persistence\n\n**Services** (2 total):\n- `orchestrator` - API-only mode (no UI, streamlined for programmatic use)\n- `api-gateway` - Optional: Request routing and authentication\n\n**Configuration**:\n- Network: Bridge network\n- Volumes:\n - `orchestrator-tmpfs` - Temporary storage (tmpfs - in-memory, no persistence)\n- Ports:\n - 9090 - Orchestrator API (read-only orchestrator state)\n - 8000 - API Gateway (optional)\n- Health Checks: Fast checks (10-second intervals)\n- Restart Policy: `no` (containers do not auto-restart)\n- Logging: Minimal (only warnings and errors)\n- Cleanup: All artifacts deleted when containers stop\n\n**Characteristics**:\n- **Ephemeral**: No persistent storage (uses tmpfs)\n- **Fast Startup**: Minimal services, quick boot time\n- **API-First**: No UI, command-line/API integration only\n- **Stateless**: Clean slate each run\n- **Low Resource**: Minimal memory/CPU footprint\n\n**Usage**:\n\n```\n# Generate from Nickel template\nnickel export --format json platform-stack.cicd.yml.ncl | yq -P > docker-compose.cicd.yml\n\n# Start ephemeral stack\ndocker-compose -f docker-compose.cicd.yml up\n\n# Run CI/CD commands (in parallel terminal)\ncurl -X POST http://localhost:9090/api/workflows \n -H "Content-Type: application/json" \n -d @workflow.json\n\n# Stop and cleanup (all data lost)\ndocker-compose -f docker-compose.cicd.yml down\n# Or with volume cleanup\ndocker-compose -f docker-compose.cicd.yml down -v\n```\n\n**CI/CD Integration Example**:\n\n```\n# GitHub Actions workflow\n- name: Start Provisioning Stack\n run: docker-compose -f docker-compose.cicd.yml up -d\n\n- name: Run Tests\n run: |\n ./tests/integration.sh\n curl -X GET http://localhost:9090/health\n\n- name: Cleanup\n if: always()\n run: docker-compose -f docker-compose.cicd.yml down -v\n```\n\n**Environment Variables** (minimal):\n\n```\n# Logging (optional)\nORCHESTRATOR_LOG_LEVEL=warn\n```\n\n---\n\n### 4. 
platform-stack.enterprise.yml.ncl\n\n**Purpose**: Production-grade high-availability deployment\n\n**Services** (10+ total):\n- `postgres` - PostgreSQL 15 (primary database)\n- `orchestrator` (3 replicas) - Load-balanced workflow engine\n- `control-center` (2 replicas) - Load-balanced policy management\n- `mcp-server` (1-2 replicas) - MCP server for AI integration\n- `surrealdb-1`, `surrealdb-2`, `surrealdb-3` - SurrealDB cluster (3 nodes)\n- `nginx` - Load balancer and reverse proxy\n- `prometheus` - Metrics collection\n- `grafana` - Visualization and dashboards\n- `loki` - Log aggregation\n\n**Configuration**:\n- Network: Custom bridge network named `provisioning-enterprise`\n- Volumes:\n - `postgres-data` - PostgreSQL HA storage\n - `surrealdb-node-1`, `surrealdb-node-2`, `surrealdb-node-3` - Cluster storage\n - `prometheus-data` - Metrics storage\n - `grafana-data` - Grafana configuration\n - `loki-data` - Log storage\n - `logs` - Shared log aggregation\n- Ports:\n - 80 - HTTP (Nginx reverse proxy)\n - 443 - HTTPS (TLS - requires certificates)\n - 9090 - Orchestrator API (internal)\n - 8080 - Control Center UI (internal)\n - 8888 - MCP Server (internal)\n - 5432 - PostgreSQL (internal only)\n - 8000 - SurrealDB cluster (internal)\n - 9091 - Prometheus metrics (internal)\n - 3000 - Grafana dashboards (external)\n- Service Dependencies:\n - Control Center waits for PostgreSQL\n - Orchestrator waits for SurrealDB cluster\n - MCP Server waits for Orchestrator and Control Center\n - Prometheus waits for all services\n- Health Checks: 30-second intervals with 10-second timeout\n- Restart Policy: `always` (high availability)\n- Load Balancing: Nginx upstream blocks for orchestrator, control-center\n- Logging: JSON format with 500MB files, kept 30 versions\n\n**Architecture**:\n\n```\n┌──────────────────────┐\n│ External Client │\n│ (HTTPS, Port 443) │\n└──────────┬───────────┘\n │\n ┌──────▼──────────┐\n │ Nginx Load │\n │ Balancer │\n │ (TLS, CORS, │\n │ Rate Limiting) │\n └───────┬──────┬──────┬─────┐\n │ │ │ │\n ┌────────▼──┐ ┌──────▼──┐ ┌──▼────────┐\n │Orchestrator│ │Control │ │MCP Server │\n │ (3 copies) │ │ Center │ │ (1-2 copy)│\n │ │ │(2 copies)│ │ │\n └────────┬──┘ └─────┬───┘ └──┬───────┘\n │ │ │\n ┌───────▼────────┬──▼────┐ │\n │ SurrealDB │ PostSQL │\n │ Cluster │ HA │\n │ (3 nodes) │ (Primary/│\n │ │ Replica)│\n └────────────────┴──────────┘\n\nObservability Stack:\n┌────────────┬───────────┬───────────┐\n│ Prometheus │ Grafana │ Loki │\n│ (Metrics) │(Dashboard)│ (Logs) │\n└────────────┴───────────┴───────────┘\n```\n\n**Usage**:\n\n```\n# Generate from Nickel template\nnickel export --format json platform-stack.enterprise.yml.ncl | yq -P > docker-compose.enterprise.yml\n\n# Create environment file with secrets\ncat > .env.enterprise << 'EOF'\n# Database\nDB_PASSWORD=generate-strong-password\nSURREALDB_PASSWORD=generate-strong-password\n\n# Security\nJWT_SECRET=generate-256-bit-random-string\nADMIN_PASSWORD=generate-strong-admin-password\n\n# TLS Certificates\nTLS_CERT_PATH=/path/to/cert.pem\nTLS_KEY_PATH=/path/to/key.pem\n\n# Logging and Monitoring\nPROMETHEUS_RETENTION=30d\nGRAFANA_ADMIN_PASSWORD=generate-strong-password\nLOKI_RETENTION_DAYS=30\nEOF\n\n# Start entire stack\ndocker-compose -f docker-compose.enterprise.yml --env-file .env.enterprise up -d\n\n# Verify all services are healthy\ndocker-compose -f docker-compose.enterprise.yml ps\n\n# Check load balancer status\ncurl -H "Host: orchestrator.example.com" http://localhost/health\n\n# Access monitoring\n# Grafana: 
http://localhost:3000 (admin/password)\n# Prometheus: http://localhost:9091 (internal)\n# Loki: http://localhost:3100 (internal)\n```\n\n**Production Checklist**:\n- [ ] Generate strong database passwords (32+ characters)\n- [ ] Generate strong JWT secret (256-bit random string)\n- [ ] Provision valid TLS certificates (not self-signed)\n- [ ] Configure Nginx upstream health checks\n- [ ] Set up log retention policies (30+ days)\n- [ ] Enable Prometheus scraping with 15-second intervals\n- [ ] Configure Grafana dashboards and alerts\n- [ ] Test SurrealDB cluster failover\n- [ ] Document backup procedures\n- [ ] Enable PostgreSQL replication and backups\n- [ ] Configure external log aggregation (ELK stack, Splunk, etc.)\n\n**Environment Variables** (in `.env.enterprise`):\n\n```\n# Database Credentials (CRITICAL)\nDB_PASSWORD=your-strong-password-32-chars-min\nSURREALDB_PASSWORD=your-strong-password-32-chars-min\n\n# Security\nJWT_SECRET=your-256-bit-random-base64-encoded-string\nADMIN_PASSWORD=your-strong-admin-password\n\n# TLS/HTTPS\nTLS_CERT_PATH=/etc/provisioning/certs/server.crt\nTLS_KEY_PATH=/etc/provisioning/certs/server.key\n\n# Logging and Monitoring\nPROMETHEUS_RETENTION=30d\nPROMETHEUS_SCRAPE_INTERVAL=15s\nGRAFANA_ADMIN_USER=admin\nGRAFANA_ADMIN_PASSWORD=your-strong-grafana-password\nLOKI_RETENTION_DAYS=30\n\n# Optional: External Integrations\nSLACK_WEBHOOK_URL=https://hooks.slack.com/services/xxxxxxx\nPAGERDUTY_INTEGRATION_KEY=your-pagerduty-key\n```\n\n---\n\n## Workflow: From Nickel to Docker Compose\n\n### 1. Configuration Source (values/*.ncl)\n\n```\n# values/orchestrator.enterprise.ncl\n{\n orchestrator = {\n server = {\n host = "0.0.0.0",\n port = 9090,\n workers = 8,\n },\n storage = {\n backend = 'surrealdb_cluster,\n surrealdb_url = "surrealdb://surrealdb-1:8000",\n },\n queue = {\n max_concurrent_tasks = 100,\n retry_attempts = 5,\n task_timeout = 7200000,\n },\n monitoring = {\n enabled = true,\n metrics_interval = 10,\n },\n },\n}\n```\n\n### 2. Template Generation (Nickel → JSON)\n\n```\n# Exports Nickel config as JSON\nnickel export --format json platform-stack.enterprise.yml.ncl\n```\n\n### 3. YAML Conversion (JSON → YAML)\n\n```\n# Converts JSON to YAML format\nnickel export --format json platform-stack.enterprise.yml.ncl | yq -P > docker-compose.enterprise.yml\n```\n\n### 4. 
Deployment (YAML → Running Containers)\n\n```\n# Starts all services defined in YAML\ndocker-compose -f docker-compose.enterprise.yml up -d\n```\n\n---\n\n## Common Customizations\n\n### Change Service Replicas\n\nEdit the template to adjust replica counts:\n\n```\n# In platform-stack.enterprise.yml.ncl\nlet orchestrator_replicas = 5 in # Instead of 3\nlet control_center_replicas = 3 in # Instead of 2\nservices.orchestrator_replicas\n```\n\n### Add Custom Service\n\nAdd to the template services record:\n\n```\n# In platform-stack.enterprise.yml.ncl\nservices = base_services & {\n custom_service = {\n image = "custom:latest",\n ports = ["9999:9999"],\n volumes = ["custom-data:/data"],\n restart = "always",\n healthcheck = {\n test = ["CMD", "curl", "-f", "http://localhost:9999/health"],\n interval = "30s",\n timeout = "10s",\n retries = 3,\n },\n },\n}\n```\n\n### Modify Resource Limits\n\nIn each service definition:\n\n```\norchestrator = {\n deploy = {\n resources = {\n limits = {\n cpus = "2.0",\n memory = "2G",\n },\n reservations = {\n cpus = "1.0",\n memory = "1G",\n },\n },\n },\n}\n```\n\n---\n\n## Validation and Testing\n\n### Syntax Validation\n\n```\n# Validate YAML before deploying\ndocker-compose -f docker-compose.enterprise.yml config --quiet\n\n# Check service definitions\ndocker-compose -f docker-compose.enterprise.yml ps\n```\n\n### Health Checks\n\n```\n# Monitor health of all services\nwatch docker-compose ps\n\n# Check specific service health\ndocker-compose exec orchestrator curl -s http://localhost:9090/health\n```\n\n### Log Inspection\n\n```\n# View logs from all services\ndocker-compose logs -f\n\n# View logs from specific service\ndocker-compose logs -f orchestrator\n\n# Follow specific container\ndocker logs -f $(docker ps | grep orchestrator | awk '{print $1}')\n```\n\n---\n\n## Troubleshooting\n\n### Port Already in Use\n\n**Error**: `bind: address already in use`\n\n**Fix**: Change port in template or stop conflicting container:\n\n```\n# Find process using port\nlsof -i :9090\n\n# Kill process\nkill -9 \n\n# Or change port in docker-compose file\nports:\n - "9999:9090" # Use 9999 instead\n```\n\n### Service Fails to Start\n\n**Check logs**:\n\n```\ndocker-compose logs orchestrator\n```\n\n**Common causes**:\n- Port conflict - Check if another service uses port\n- Missing volume - Create volume before starting\n- Network connectivity - Verify docker network exists\n- Database not ready - Wait for db service to become healthy\n- Configuration error - Validate YAML syntax\n\n### Persistent Volume Issues\n\n**Clean volumes** (WARNING: Deletes data):\n\n```\ndocker-compose down -v\ndocker volume prune -f\n```\n\n---\n\n## See Also\n\n- **Kubernetes Templates**: `../kubernetes/` - For production K8s deployments\n- **Configuration System**: `../../` - Full configuration documentation\n- **Examples**: `../../examples/` - Example deployment scenarios\n- **Scripts**: `../../scripts/` - Automation scripts\n\n---\n\n**Version**: 1.0\n**Last Updated**: 2025-01-05\n**Status**: Production Ready \ No newline at end of file +# Docker Compose Templates + +Nickel-based Docker Compose templates for deploying platform services across all deployment modes. + +## Overview + +This directory contains Nickel templates that generate Docker Compose files for different deployment scenarios. +Each template imports configuration from `values/*.ncl` and expands to valid Docker Compose YAML. 
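+ +For instance, a template might import a values file and splice its fields into a services record (a sketch; the `values` field paths and image name are assumptions, not the real template code): + +```nickel +# Hypothetical import-and-expand step inside a platform-stack template. +let values = import "values/orchestrator.solo.ncl" in +{ + services = { + orchestrator = { + image = "provisioning/orchestrator:latest", + ports = ["%{std.string.from_number values.orchestrator.server.port}:9090"], + restart = "unless-stopped", + }, + }, +} +```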
+ +**Key Pattern**: Templates use **Nickel composition** to build service definitions dynamically based on configuration, allowing parameterized infrastructure-as-code. + +## Templates + +### 1. platform-stack.solo.yml.ncl + +**Purpose**: Single-developer local development stack + +**Services**: +- `orchestrator` - Workflow engine +- `control-center` - Policy and RBAC management +- `mcp-server` - MCP protocol server + +**Configuration**: +- Network: Bridge network named `provisioning` +- Volumes: 5 named volumes for persistence + - `orchestrator-data` - Orchestrator workflows + - `control-center-data` - Control Center policies + - `mcp-server-data` - MCP Server cache + - `logs` - Shared log volume + - `cache` - Shared cache volume +- Ports: + - 9090 - Orchestrator API + - 8080 - Control Center UI + - 8888 - MCP Server +- Health Checks: 30-second intervals for all services +- Logging: JSON format, 10MB max file size, 3 backups +- Restart Policy: `unless-stopped` (survives host reboot) + +**Usage**: + +```bash +# Generate from Nickel template +nickel export --format json platform-stack.solo.yml.ncl | yq -P > docker-compose.solo.yml + +# Start services +docker-compose -f docker-compose.solo.yml up -d + +# View logs +docker-compose -f docker-compose.solo.yml logs -f + +# Stop services +docker-compose -f docker-compose.solo.yml down +``` + +**Environment Variables** (recommended in `.env` file): + +```bash +ORCHESTRATOR_LOG_LEVEL=debug +CONTROL_CENTER_LOG_LEVEL=info +MCP_SERVER_LOG_LEVEL=info +``` + +--- + +### 2. platform-stack.multiuser.yml.ncl + +**Purpose**: Team collaboration with persistent database storage + +**Services** (6 total): +- `postgres` - Primary database (PostgreSQL 15) +- `orchestrator` - Workflow engine +- `control-center` - Policy and RBAC management +- `mcp-server` - MCP protocol server +- `surrealdb` - Workflow storage (SurrealDB server) +- `gitea` - Git repository hosting (optional, for version control) + +**Configuration**: +- Network: Custom bridge network named `provisioning-network` +- Volumes: + - `postgres-data` - PostgreSQL database files + - `orchestrator-data` - Orchestrator workflows + - `control-center-data` - Control Center policies + - `surrealdb-data` - SurrealDB files + - `gitea-data` - Gitea repositories and configuration + - `logs` - Shared logs +- Ports: + - 9090 - Orchestrator API + - 8080 - Control Center UI + - 8888 - MCP Server + - 5432 - PostgreSQL (internal only) + - 8000 - SurrealDB (internal only) + - 3000 - Gitea web UI (optional) + - 22 - Gitea SSH (optional) +- Service Dependencies: Explicit `depends_on` with health checks + - Control Center waits for PostgreSQL + - SurrealDB starts before Orchestrator +- Health Checks: Service-specific health checks +- Restart Policy: `always` (automatic recovery on failure) +- Logging: JSON format with rotation + +**Usage**: + +```bash +# Generate from Nickel template +nickel export --format json platform-stack.multiuser.yml.ncl | yq -P > docker-compose.multiuser.yml + +# Create environment file +cat > .env.multiuser << 'EOF' +DB_PASSWORD=secure-postgres-password +SURREALDB_PASSWORD=secure-surrealdb-password +JWT_SECRET=secure-jwt-secret-256-bits +EOF + +# Start services +docker-compose -f docker-compose.multiuser.yml --env-file .env.multiuser up -d + +# Wait for all services to be healthy +docker-compose -f docker-compose.multiuser.yml ps + +# Create database and initialize schema (one-time) +docker-compose exec postgres psql -U postgres -c "CREATE DATABASE provisioning;" +``` + +**Database 
Initialization**: + +```bash +# Connect to PostgreSQL for schema creation +docker-compose exec postgres psql -U provisioning -d provisioning + +# Connect to SurrealDB for schema setup +docker-compose exec surrealdb surreal sql --auth root:password + +# Connect to Gitea web UI +# http://localhost:3000 (admin:admin by default) +``` + +**Environment Variables** (in `.env.multiuser`): + +```bash +# Database Credentials (CRITICAL - change before production) +DB_PASSWORD=your-strong-password +SURREALDB_PASSWORD=your-strong-password + +# Security +JWT_SECRET=your-256-bit-random-string + +# Logging +ORCHESTRATOR_LOG_LEVEL=info +CONTROL_CENTER_LOG_LEVEL=info +MCP_SERVER_LOG_LEVEL=info + +# Optional: Gitea Configuration +GITEA_DOMAIN=localhost:3000 +GITEA_ROOT_URL=http://localhost:3000/ +``` + +--- + +### 3. platform-stack.cicd.yml.ncl + +**Purpose**: Ephemeral CI/CD pipeline stack with minimal persistence + +**Services** (2 total): +- `orchestrator` - API-only mode (no UI, streamlined for programmatic use) +- `api-gateway` - Optional: Request routing and authentication + +**Configuration**: +- Network: Bridge network +- Volumes: + - `orchestrator-tmpfs` - Temporary storage (tmpfs - in-memory, no persistence) +- Ports: + - 9090 - Orchestrator API (read-only orchestrator state) + - 8000 - API Gateway (optional) +- Health Checks: Fast checks (10-second intervals) +- Restart Policy: `no` (containers do not auto-restart) +- Logging: Minimal (only warnings and errors) +- Cleanup: All artifacts deleted when containers stop + +**Characteristics**: +- **Ephemeral**: No persistent storage (uses tmpfs) +- **Fast Startup**: Minimal services, quick boot time +- **API-First**: No UI, command-line/API integration only +- **Stateless**: Clean slate each run +- **Low Resource**: Minimal memory/CPU footprint + +**Usage**: + +```bash +# Generate from Nickel template +nickel export --format json platform-stack.cicd.yml.ncl | yq -P > docker-compose.cicd.yml + +# Start ephemeral stack +docker-compose -f docker-compose.cicd.yml up + +# Run CI/CD commands (in parallel terminal) +curl -X POST http://localhost:9090/api/workflows \ + -H "Content-Type: application/json" \ + -d @workflow.json + +# Stop and cleanup (all data lost) +docker-compose -f docker-compose.cicd.yml down +# Or with volume cleanup +docker-compose -f docker-compose.cicd.yml down -v +``` + +**CI/CD Integration Example**: + +```yaml +# GitHub Actions workflow +- name: Start Provisioning Stack + run: docker-compose -f docker-compose.cicd.yml up -d + +- name: Run Tests + run: | + ./tests/integration.sh + curl -X GET http://localhost:9090/health + +- name: Cleanup + if: always() + run: docker-compose -f docker-compose.cicd.yml down -v +``` + +**Environment Variables** (minimal): + +```bash +# Logging (optional) +ORCHESTRATOR_LOG_LEVEL=warn +``` + +--- + +### 4. 
platform-stack.enterprise.yml.ncl + +**Purpose**: Production-grade high-availability deployment + +**Services** (10+ total): +- `postgres` - PostgreSQL 15 (primary database) +- `orchestrator` (3 replicas) - Load-balanced workflow engine +- `control-center` (2 replicas) - Load-balanced policy management +- `mcp-server` (1-2 replicas) - MCP server for AI integration +- `surrealdb-1`, `surrealdb-2`, `surrealdb-3` - SurrealDB cluster (3 nodes) +- `nginx` - Load balancer and reverse proxy +- `prometheus` - Metrics collection +- `grafana` - Visualization and dashboards +- `loki` - Log aggregation + +**Configuration**: +- Network: Custom bridge network named `provisioning-enterprise` +- Volumes: + - `postgres-data` - PostgreSQL HA storage + - `surrealdb-node-1`, `surrealdb-node-2`, `surrealdb-node-3` - Cluster storage + - `prometheus-data` - Metrics storage + - `grafana-data` - Grafana configuration + - `loki-data` - Log storage + - `logs` - Shared log aggregation +- Ports: + - 80 - HTTP (Nginx reverse proxy) + - 443 - HTTPS (TLS - requires certificates) + - 9090 - Orchestrator API (internal) + - 8080 - Control Center UI (internal) + - 8888 - MCP Server (internal) + - 5432 - PostgreSQL (internal only) + - 8000 - SurrealDB cluster (internal) + - 9091 - Prometheus metrics (internal) + - 3000 - Grafana dashboards (external) +- Service Dependencies: + - Control Center waits for PostgreSQL + - Orchestrator waits for SurrealDB cluster + - MCP Server waits for Orchestrator and Control Center + - Prometheus waits for all services +- Health Checks: 30-second intervals with 10-second timeout +- Restart Policy: `always` (high availability) +- Load Balancing: Nginx upstream blocks for orchestrator, control-center +- Logging: JSON format, 500MB files, 30 rotations kept + +**Architecture**: + +```text +┌──────────────────────┐ +│ External Client │ +│ (HTTPS, Port 443) │ +└──────────┬───────────┘ + │ + ┌──────▼──────────┐ + │ Nginx Load │ + │ Balancer │ + │ (TLS, CORS, │ + │ Rate Limiting) │ + └───────┬──────┬──────┬─────┐ + │ │ │ │ + ┌────────▼──┐ ┌──────▼──┐ ┌──▼────────┐ + │Orchestrator│ │Control │ │MCP Server │ + │ (3 copies) │ │ Center │ │ (1-2 copy)│ + │ │ │(2 copies)│ │ │ + └────────┬──┘ └─────┬───┘ └──┬───────┘ + │ │ │ + ┌───────▼────────┬──▼────┐ │ + │ SurrealDB │ Postgres │ + │ Cluster │ HA │ + │ (3 nodes) │ (Primary/│ + │ │ Replica)│ + └────────────────┴──────────┘ + +Observability Stack: +┌────────────┬───────────┬───────────┐ +│ Prometheus │ Grafana │ Loki │ +│ (Metrics) │(Dashboard)│ (Logs) │ +└────────────┴───────────┴───────────┘ +``` + +**Usage**: + +```bash +# Generate from Nickel template +nickel export --format json platform-stack.enterprise.yml.ncl | yq -P > docker-compose.enterprise.yml + +# Create environment file with secrets +cat > .env.enterprise << 'EOF' +# Database +DB_PASSWORD=generate-strong-password +SURREALDB_PASSWORD=generate-strong-password + +# Security +JWT_SECRET=generate-256-bit-random-string +ADMIN_PASSWORD=generate-strong-admin-password + +# TLS Certificates +TLS_CERT_PATH=/path/to/cert.pem +TLS_KEY_PATH=/path/to/key.pem + +# Logging and Monitoring +PROMETHEUS_RETENTION=30d +GRAFANA_ADMIN_PASSWORD=generate-strong-password +LOKI_RETENTION_DAYS=30 +EOF + +# Start entire stack +docker-compose -f docker-compose.enterprise.yml --env-file .env.enterprise up -d + +# Verify all services are healthy +docker-compose -f docker-compose.enterprise.yml ps + +# Check load balancer status +curl -H "Host: orchestrator.example.com" http://localhost/health + +# Access monitoring +# 
Grafana: http://localhost:3000 (admin/password) +# Prometheus: http://localhost:9091 (internal) +# Loki: http://localhost:3100 (internal) +``` + +**Production Checklist**: +- [ ] Generate strong database passwords (32+ characters) +- [ ] Generate strong JWT secret (256-bit random string) +- [ ] Provision valid TLS certificates (not self-signed) +- [ ] Configure Nginx upstream health checks +- [ ] Set up log retention policies (30+ days) +- [ ] Enable Prometheus scraping with 15-second intervals +- [ ] Configure Grafana dashboards and alerts +- [ ] Test SurrealDB cluster failover +- [ ] Document backup procedures +- [ ] Enable PostgreSQL replication and backups +- [ ] Configure external log aggregation (ELK stack, Splunk, etc.) + +**Environment Variables** (in `.env.enterprise`): + +```bash +# Database Credentials (CRITICAL) +DB_PASSWORD=your-strong-password-32-chars-min +SURREALDB_PASSWORD=your-strong-password-32-chars-min + +# Security +JWT_SECRET=your-256-bit-random-base64-encoded-string +ADMIN_PASSWORD=your-strong-admin-password + +# TLS/HTTPS +TLS_CERT_PATH=/etc/provisioning/certs/server.crt +TLS_KEY_PATH=/etc/provisioning/certs/server.key + +# Logging and Monitoring +PROMETHEUS_RETENTION=30d +PROMETHEUS_SCRAPE_INTERVAL=15s +GRAFANA_ADMIN_USER=admin +GRAFANA_ADMIN_PASSWORD=your-strong-grafana-password +LOKI_RETENTION_DAYS=30 + +# Optional: External Integrations +SLACK_WEBHOOK_URL=https://hooks.slack.com/services/xxxxxxx +PAGERDUTY_INTEGRATION_KEY=your-pagerduty-key +``` + +--- + +## Workflow: From Nickel to Docker Compose + +### 1. Configuration Source (values/*.ncl) + +```nickel +# values/orchestrator.enterprise.ncl +{ + orchestrator = { + server = { + host = "0.0.0.0", + port = 9090, + workers = 8, + }, + storage = { + backend = 'surrealdb_cluster, + surrealdb_url = "surrealdb://surrealdb-1:8000", + }, + queue = { + max_concurrent_tasks = 100, + retry_attempts = 5, + task_timeout = 7200000, + }, + monitoring = { + enabled = true, + metrics_interval = 10, + }, + }, +} +``` + +### 2. Template Generation (Nickel → JSON) + +```bash +# Exports Nickel config as JSON +nickel export --format json platform-stack.enterprise.yml.ncl +``` + +### 3. YAML Conversion (JSON → YAML) + +```bash +# Converts JSON to YAML format +nickel export --format json platform-stack.enterprise.yml.ncl | yq -P > docker-compose.enterprise.yml +``` + +### 4. 
Deployment (YAML → Running Containers) + +```bash +# Starts all services defined in YAML +docker-compose -f docker-compose.enterprise.yml up -d +``` + +--- + +## Common Customizations + +### Change Service Replicas + +Edit the template to adjust replica counts: + +```nickel +# In platform-stack.enterprise.yml.ncl +let orchestrator_replicas = 5 in # Instead of 3 +let control_center_replicas = 3 in # Instead of 2 +services.orchestrator_replicas +``` + +### Add Custom Service + +Add to the template services record: + +```nickel +# In platform-stack.enterprise.yml.ncl +services = base_services & { + custom_service = { + image = "custom:latest", + ports = ["9999:9999"], + volumes = ["custom-data:/data"], + restart = "always", + healthcheck = { + test = ["CMD", "curl", "-f", "http://localhost:9999/health"], + interval = "30s", + timeout = "10s", + retries = 3, + }, + }, +} +``` + +### Modify Resource Limits + +In each service definition: + +```nickel +orchestrator = { + deploy = { + resources = { + limits = { + cpus = "2.0", + memory = "2G", + }, + reservations = { + cpus = "1.0", + memory = "1G", + }, + }, + }, +} +``` + +--- + +## Validation and Testing + +### Syntax Validation + +```bash +# Validate YAML before deploying +docker-compose -f docker-compose.enterprise.yml config --quiet + +# Check service definitions +docker-compose -f docker-compose.enterprise.yml ps +``` + +### Health Checks + +```bash +# Monitor health of all services +watch docker-compose ps + +# Check specific service health +docker-compose exec orchestrator curl -s http://localhost:9090/health +``` + +### Log Inspection + +```bash +# View logs from all services +docker-compose logs -f + +# View logs from specific service +docker-compose logs -f orchestrator + +# Follow specific container +docker logs -f $(docker ps | grep orchestrator | awk '{print $1}') +``` + +--- + +## Troubleshooting + +### Port Already in Use + +**Error**: `bind: address already in use` + +**Fix**: Change port in template or stop conflicting container: + +```bash +# Find process using port +lsof -i :9090 + +# Kill process +kill -9 <PID> + +# Or change port in docker-compose file +ports: + - "9999:9090" # Use 9999 instead +``` + +### Service Fails to Start + +**Check logs**: + +```bash +docker-compose logs orchestrator +``` + +**Common causes**: +- Port conflict - Check if another service uses port +- Missing volume - Create volume before starting +- Network connectivity - Verify docker network exists +- Database not ready - Wait for db service to become healthy +- Configuration error - Validate YAML syntax + +### Persistent Volume Issues + +**Clean volumes** (WARNING: Deletes data): + +```bash +docker-compose down -v +docker volume prune -f +``` + +--- + +## See Also + +- **Kubernetes Templates**: `../kubernetes/` - For production K8s deployments +- **Configuration System**: `../../` - Full configuration documentation +- **Examples**: `../../examples/` - Example deployment scenarios +- **Scripts**: `../../scripts/` - Automation scripts + +--- + +**Version**: 1.0 +**Last Updated**: 2025-01-05 +**Status**: Production Ready \ No newline at end of file diff --git a/schemas/platform/templates/kubernetes/README.md b/schemas/platform/templates/kubernetes/README.md index ac9baa4..1d665cf 100644 --- a/schemas/platform/templates/kubernetes/README.md +++ b/schemas/platform/templates/kubernetes/README.md @@ -1 +1,486 @@ -# Kubernetes Templates\n\nNickel-based Kubernetes manifest templates for provisioning platform services.\n\n## Overview\n\nThis directory contains 
Kubernetes deployment manifests written in Nickel language. These templates are parameterized to support all four deployment modes:\n\n- **solo**: Single developer, 1 replica per service, minimal resources\n- **multiuser**: Team collaboration, 1-2 replicas per service, PostgreSQL + SurrealDB\n- **cicd**: CI/CD pipelines, 1 replica, stateless and ephemeral\n- **enterprise**: Production HA, 2-3 replicas per service, full monitoring stack\n\n## Templates\n\n### Service Deployments\n\n#### orchestrator-deployment.yaml.ncl\nOrchestrator workflow engine deployment with:\n- 3 replicas (enterprise mode, override per mode)\n- Service account for RBAC\n- Health checks (liveness + readiness probes)\n- Resource requests/limits (500m CPU, 512Mi RAM minimum)\n- Volume mounts for data and logs\n- Pod anti-affinity for distributed deployment\n- Init containers for dependency checking\n\n**Mode-specific overrides**:\n- Solo: 1 replica, filesystem storage\n- MultiUser: 1 replica, SurrealDB backend\n- CI/CD: 1 replica, ephemeral storage\n- Enterprise: 3 replicas, SurrealDB cluster\n\n#### orchestrator-service.yaml.ncl\nInternal ClusterIP service for orchestrator with:\n- Session affinity (3-hour timeout)\n- Port 9090 (HTTP API)\n- Port 9091 (Metrics)\n- Internal access only (ClusterIP)\n\n**Mode-specific overrides**:\n- Enterprise: LoadBalancer for external access\n\n#### control-center-deployment.yaml.ncl\nControl Center policy and RBAC management with:\n- 2 replicas (enterprise mode)\n- Database integration (PostgreSQL or RocksDB)\n- RBAC and JWT configuration\n- MFA support\n- Health checks and resource limits\n- Security context (non-root user)\n\n**Environment variables**:\n- Database type and URL\n- RBAC enablement\n- JWT issuer, audience, secret\n- MFA requirement\n- Log level\n\n#### control-center-service.yaml.ncl\nInternal ClusterIP service for Control Center with:\n- Port 8080 (HTTP API + UI)\n- Port 8081 (Metrics)\n- Session affinity\n\n#### mcp-server-deployment.yaml.ncl\nModel Context Protocol server for AI/LLM integration with:\n- Lightweight deployment (100m CPU, 128Mi RAM minimum)\n- Orchestrator integration\n- Control Center integration\n- MCP capabilities (tools, resources, prompts)\n- Tool concurrency limits\n- Resource size limits\n\n**Mode-specific overrides**:\n- Solo: 1 replica\n- Enterprise: 2 replicas for HA\n\n#### mcp-server-service.yaml.ncl\nInternal ClusterIP service for MCP server with:\n- Port 8888 (HTTP API)\n- Port 8889 (Metrics)\n\n### Networking\n\n#### platform-ingress.yaml.ncl\nNginx ingress for external HTTP/HTTPS routing with:\n- TLS termination with Let's Encrypt (cert-manager)\n- CORS configuration\n- Security headers (HSTS, X-Frame-Options, etc.)\n- Rate limiting (1000 RPS, 100 connections)\n- Path-based routing to services\n\n**Routes**:\n- `api.example.com/orchestrator` → orchestrator:9090\n- `control-center.example.com/` → control-center:8080\n- `mcp.example.com/` → mcp-server:8888\n- `orchestrator.example.com/api` → orchestrator:9090\n- `orchestrator.example.com/policy` → control-center:8080\n\n### Namespace and Cluster Configuration\n\n#### namespace.yaml.ncl\nKubernetes Namespace for provisioning platform with:\n- Pod security policies (baseline enforcement)\n- Labels for organization and monitoring\n- Annotations for description\n\n#### resource-quota.yaml.ncl\nResourceQuota for resource consumption limits:\n- **CPU**: 8 requests / 16 limits (total)\n- **Memory**: 16GB requests / 32GB limits (total)\n- **Storage**: 200GB (persistent volumes)\n- **Pod limit**: 
20 pods maximum\n- **Services**: 10 maximum\n- **ConfigMaps/Secrets**: 50 each\n- **Deployments/StatefulSets/Jobs**: Limited per type\n\n**Mode-specific overrides**:\n- Solo: 4 CPU / 8GB memory, 10 pods\n- MultiUser: 8 CPU / 16GB memory, 20 pods\n- CI/CD: 16 CPU / 32GB memory, 50 pods (ephemeral)\n- Enterprise: Unlimited (managed externally)\n\n#### network-policy.yaml.ncl\nNetworkPolicy for network isolation and security:\n- **Ingress**: Allow traffic from Nginx, inter-pod, Prometheus, DNS\n- **Egress**: Allow DNS queries, inter-pod, external HTTPS\n- **Default**: Deny all except explicitly allowed\n\n**Ports managed**:\n- 9090: Orchestrator API\n- 8080: Control Center API/UI\n- 8888: MCP Server\n- 5432: PostgreSQL\n- 8000: SurrealDB\n- 53: DNS (TCP/UDP)\n- 443/80: External HTTPS/HTTP\n\n#### rbac.yaml.ncl\nRole-Based Access Control (RBAC) setup with:\n- **ServiceAccounts**: orchestrator, control-center, mcp-server\n- **Roles**: Minimal permissions per service\n- **RoleBindings**: Connect ServiceAccounts to Roles\n\n**Permissions**:\n- Orchestrator: Read ConfigMaps, Secrets, Pods, Services\n- Control Center: Read/Write Secrets, ConfigMaps, Deployments\n- MCP Server: Read ConfigMaps, Secrets, Pods, Services\n\n## Usage\n\n### Rendering Templates\n\nEach template is a Nickel file that exports to JSON, then converts to YAML:\n\n```\n# Render a single template\nnickel eval --format json orchestrator-deployment.yaml.ncl | yq -P > orchestrator-deployment.yaml\n\n# Render all templates\nfor template in *.ncl; do\n nickel eval --format json "$template" | yq -P > "${template%.ncl}.yaml"\ndone\n```\n\n### Deploying to Kubernetes\n\n```\n# Create namespace\nkubectl create namespace provisioning\n\n# Create ConfigMaps for configuration\nkubectl create configmap orchestrator-config \n --from-literal=storage_backend=surrealdb \n --from-literal=max_concurrent_tasks=50 \n --from-literal=batch_parallel_limit=20 \n --from-literal=log_level=info \n -n provisioning\n\n# Create secrets for sensitive data\nkubectl create secret generic control-center-secrets \n --from-literal=database_url="postgresql://user:pass@postgres/provisioning" \n --from-literal=jwt_secret="your-jwt-secret-here" \n -n provisioning\n\n# Apply manifests\nkubectl apply -f orchestrator-deployment.yaml -n provisioning\nkubectl apply -f orchestrator-service.yaml -n provisioning\nkubectl apply -f control-center-deployment.yaml -n provisioning\nkubectl apply -f control-center-service.yaml -n provisioning\nkubectl apply -f mcp-server-deployment.yaml -n provisioning\nkubectl apply -f mcp-server-service.yaml -n provisioning\nkubectl apply -f platform-ingress.yaml -n provisioning\n```\n\n### Verifying Deployment\n\n```\n# Check deployments\nkubectl get deployments -n provisioning\n\n# Check services\nkubectl get svc -n provisioning\n\n# Check ingress\nkubectl get ingress -n provisioning\n\n# View logs\nkubectl logs -n provisioning -l app=orchestrator -f\nkubectl logs -n provisioning -l app=control-center -f\nkubectl logs -n provisioning -l app=mcp-server -f\n\n# Describe resource\nkubectl describe deployment orchestrator -n provisioning\nkubectl describe service orchestrator -n provisioning\n```\n\n## ConfigMaps and Secrets\n\n### Required ConfigMaps\n\n#### orchestrator-config\n\n```\napiVersion: v1\nkind: ConfigMap\nmetadata:\n name: orchestrator-config\n namespace: provisioning\ndata:\n storage_backend: "surrealdb" # or "filesystem"\n max_concurrent_tasks: "50" # Must match constraint.orchestrator.queue.concurrent_tasks.max\n 
batch_parallel_limit: "20" # Must match constraint.orchestrator.batch.parallel_limit.max\n log_level: "info"\n```\n\n#### control-center-config\n\n```\napiVersion: v1\nkind: ConfigMap\nmetadata:\n name: control-center-config\n namespace: provisioning\ndata:\n database_type: "postgres" # or "rocksdb"\n rbac_enabled: "true"\n jwt_issuer: "provisioning.local"\n jwt_audience: "orchestrator"\n mfa_required: "true" # Enterprise only\n log_level: "info"\n```\n\n#### mcp-server-config\n\n```\napiVersion: v1\nkind: ConfigMap\nmetadata:\n name: mcp-server-config\n namespace: provisioning\ndata:\n protocol: "stdio" # or "http"\n orchestrator_url: "http://orchestrator:9090"\n control_center_url: "http://control-center:8080"\n enable_tools: "true"\n enable_resources: "true"\n enable_prompts: "true"\n max_concurrent_tools: "10"\n max_resource_size: "1073741824" # 1GB in bytes\n log_level: "info"\n```\n\n### Required Secrets\n\n#### control-center-secrets\n\n```\napiVersion: v1\nkind: Secret\nmetadata:\n name: control-center-secrets\n namespace: provisioning\ntype: Opaque\nstringData:\n database_url: "postgresql://user:password@postgres:5432/provisioning"\n jwt_secret: "your-secure-random-string-here"\n```\n\n## Persistence\n\nAll deployments use PersistentVolumeClaims for data storage:\n\n```\n# Create PersistentVolumes and PersistentVolumeClaims\nkubectl apply -f - < -n provisioning -- nslookup orchestrator\n\n# Check ingress routing\nkubectl describe ingress platform-ingress -n provisioning\n\n# Test connectivity from pod\nkubectl run -it --rm test --image=busybox -n provisioning -- wget http://orchestrator:9090/health\n```\n\n### TLS certificate issues\n\n```\n# Check certificate status\nkubectl describe certificate platform-tls-cert -n provisioning\n\n# Check cert-manager logs\nkubectl logs -n cert-manager deployment/cert-manager -f\n```\n\n## References\n\n- [Kubernetes Deployment API](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/deployment-v1/)\n- [Kubernetes Service API](https://kubernetes.io/docs/reference/kubernetes-api/service-resources/service-v1/)\n- [Kubernetes Ingress API](https://kubernetes.io/docs/reference/kubernetes-api/service-resources/ingress-v1/)\n- [Nginx Ingress Controller](https://kubernetes.github.io/ingress-nginx/)\n- [Cert-manager](https://cert-manager.io/) \ No newline at end of file +# Kubernetes Templates + +Nickel-based Kubernetes manifest templates for provisioning platform services. + +## Overview + +This directory contains Kubernetes deployment manifests written in Nickel language. 
These templates are parameterized to support all four deployment modes: + +- **solo**: Single developer, 1 replica per service, minimal resources +- **multiuser**: Team collaboration, 1-2 replicas per service, PostgreSQL + SurrealDB +- **cicd**: CI/CD pipelines, 1 replica, stateless and ephemeral +- **enterprise**: Production HA, 2-3 replicas per service, full monitoring stack + +## Templates + +### Service Deployments + +#### orchestrator-deployment.yaml.ncl +Orchestrator workflow engine deployment with: +- 3 replicas (enterprise mode, override per mode) +- Service account for RBAC +- Health checks (liveness + readiness probes) +- Resource requests/limits (500m CPU, 512Mi RAM minimum) +- Volume mounts for data and logs +- Pod anti-affinity for distributed deployment +- Init containers for dependency checking + +**Mode-specific overrides**: +- Solo: 1 replica, filesystem storage +- MultiUser: 1 replica, SurrealDB backend +- CI/CD: 1 replica, ephemeral storage +- Enterprise: 3 replicas, SurrealDB cluster + +#### orchestrator-service.yaml.ncl +Internal ClusterIP service for orchestrator with: +- Session affinity (3-hour timeout) +- Port 9090 (HTTP API) +- Port 9091 (Metrics) +- Internal access only (ClusterIP) + +**Mode-specific overrides**: +- Enterprise: LoadBalancer for external access + +#### control-center-deployment.yaml.ncl +Control Center policy and RBAC management with: +- 2 replicas (enterprise mode) +- Database integration (PostgreSQL or RocksDB) +- RBAC and JWT configuration +- MFA support +- Health checks and resource limits +- Security context (non-root user) + +**Environment variables**: +- Database type and URL +- RBAC enablement +- JWT issuer, audience, secret +- MFA requirement +- Log level + +#### control-center-service.yaml.ncl +Internal ClusterIP service for Control Center with: +- Port 8080 (HTTP API + UI) +- Port 8081 (Metrics) +- Session affinity + +#### mcp-server-deployment.yaml.ncl +Model Context Protocol server for AI/LLM integration with: +- Lightweight deployment (100m CPU, 128Mi RAM minimum) +- Orchestrator integration +- Control Center integration +- MCP capabilities (tools, resources, prompts) +- Tool concurrency limits +- Resource size limits + +**Mode-specific overrides**: +- Solo: 1 replica +- Enterprise: 2 replicas for HA + +#### mcp-server-service.yaml.ncl +Internal ClusterIP service for MCP server with: +- Port 8888 (HTTP API) +- Port 8889 (Metrics) + +### Networking + +#### platform-ingress.yaml.ncl +Nginx ingress for external HTTP/HTTPS routing with: +- TLS termination with Let's Encrypt (cert-manager) +- CORS configuration +- Security headers (HSTS, X-Frame-Options, etc.) 
+**Routes**:
+- `api.example.com/orchestrator` → orchestrator:9090
+- `control-center.example.com/` → control-center:8080
+- `mcp.example.com/` → mcp-server:8888
+- `orchestrator.example.com/api` → orchestrator:9090
+- `orchestrator.example.com/policy` → control-center:8080
+
+### Namespace and Cluster Configuration
+
+#### namespace.yaml.ncl
+Kubernetes Namespace for the provisioning platform with:
+- Pod security policies (baseline enforcement)
+- Labels for organization and monitoring
+- Annotations for description
+
+#### resource-quota.yaml.ncl
+ResourceQuota for resource consumption limits:
+- **CPU**: 8 requests / 16 limits (total)
+- **Memory**: 16GB requests / 32GB limits (total)
+- **Storage**: 200GB (persistent volumes)
+- **Pod limit**: 20 pods maximum
+- **Services**: 10 maximum
+- **ConfigMaps/Secrets**: 50 each
+- **Deployments/StatefulSets/Jobs**: Limited per type
+
+**Mode-specific overrides**:
+- Solo: 4 CPU / 8GB memory, 10 pods
+- MultiUser: 8 CPU / 16GB memory, 20 pods
+- CI/CD: 16 CPU / 32GB memory, 50 pods (ephemeral)
+- Enterprise: Unlimited (managed externally)
+
+#### network-policy.yaml.ncl
+NetworkPolicy for network isolation and security:
+- **Ingress**: Allow traffic from Nginx, inter-pod, Prometheus, DNS
+- **Egress**: Allow DNS queries, inter-pod, external HTTPS
+- **Default**: Deny all except explicitly allowed
+
+**Ports managed**:
+- 9090: Orchestrator API
+- 8080: Control Center API/UI
+- 8888: MCP Server
+- 5432: PostgreSQL
+- 8000: SurrealDB
+- 53: DNS (TCP/UDP)
+- 443/80: External HTTPS/HTTP
+
+#### rbac.yaml.ncl
+Role-Based Access Control (RBAC) setup with:
+- **ServiceAccounts**: orchestrator, control-center, mcp-server
+- **Roles**: Minimal permissions per service
+- **RoleBindings**: Connect ServiceAccounts to Roles
+
+**Permissions**:
+- Orchestrator: Read ConfigMaps, Secrets, Pods, Services
+- Control Center: Read/Write Secrets, ConfigMaps, Deployments
+- MCP Server: Read ConfigMaps, Secrets, Pods, Services
+
+## Usage
+
+### Rendering Templates
+
+Each template is a Nickel file that is exported to JSON and then converted to YAML:
+
+```bash
+# Render a single template
+nickel export --format json orchestrator-deployment.yaml.ncl | yq -P > orchestrator-deployment.yaml
+
+# Render all templates (stripping .ncl leaves the .yaml name)
+for template in *.ncl; do
+  nickel export --format json "$template" | yq -P > "${template%.ncl}"
+done
+```
+
+### Deploying to Kubernetes
+
+```bash
+# Create namespace
+kubectl create namespace provisioning
+
+# Create ConfigMaps for configuration
+kubectl create configmap orchestrator-config \
+  --from-literal=storage_backend=surrealdb \
+  --from-literal=max_concurrent_tasks=50 \
+  --from-literal=batch_parallel_limit=20 \
+  --from-literal=log_level=info \
+  -n provisioning
+
+# Create secrets for sensitive data
+kubectl create secret generic control-center-secrets \
+  --from-literal=database_url="postgresql://user:pass@postgres/provisioning" \
+  --from-literal=jwt_secret="your-jwt-secret-here" \
+  -n provisioning
+
+# Apply manifests
+kubectl apply -f orchestrator-deployment.yaml -n provisioning
+kubectl apply -f orchestrator-service.yaml -n provisioning
+kubectl apply -f control-center-deployment.yaml -n provisioning
+kubectl apply -f control-center-service.yaml -n provisioning
+kubectl apply -f mcp-server-deployment.yaml -n provisioning
+kubectl apply -f mcp-server-service.yaml -n provisioning
+kubectl apply -f platform-ingress.yaml -n provisioning
+```
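+
+Rendering and applying can also be combined in one pass. A minimal sketch, assuming `yq` v4 and `kubectl` are on `PATH` and the `provisioning` namespace already exists:
+
+```bash
+# Render every Nickel template and apply the result immediately
+for template in *.ncl; do
+  out="${template%.ncl}"   # orchestrator-deployment.yaml.ncl -> orchestrator-deployment.yaml
+  nickel export --format json "$template" | yq -P > "$out"
+  kubectl apply -f "$out" -n provisioning
+done
+```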
+### Verifying Deployment
+
+```bash
+# Check deployments
+kubectl get deployments -n provisioning
+
+# Check services
+kubectl get svc -n provisioning
+
+# Check ingress
+kubectl get ingress -n provisioning
+
+# View logs
+kubectl logs -n provisioning -l app=orchestrator -f
+kubectl logs -n provisioning -l app=control-center -f
+kubectl logs -n provisioning -l app=mcp-server -f
+
+# Describe a resource
+kubectl describe deployment orchestrator -n provisioning
+kubectl describe service orchestrator -n provisioning
+```
+
+## ConfigMaps and Secrets
+
+### Required ConfigMaps
+
+#### orchestrator-config
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: orchestrator-config
+  namespace: provisioning
+data:
+  storage_backend: "surrealdb"  # or "filesystem"
+  max_concurrent_tasks: "50"    # Must match constraint.orchestrator.queue.concurrent_tasks.max
+  batch_parallel_limit: "20"    # Must match constraint.orchestrator.batch.parallel_limit.max
+  log_level: "info"
+```
+
+#### control-center-config
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: control-center-config
+  namespace: provisioning
+data:
+  database_type: "postgres"  # or "rocksdb"
+  rbac_enabled: "true"
+  jwt_issuer: "provisioning.local"
+  jwt_audience: "orchestrator"
+  mfa_required: "true"  # Enterprise only
+  log_level: "info"
+```
+
+#### mcp-server-config
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: mcp-server-config
+  namespace: provisioning
+data:
+  protocol: "stdio"  # or "http"
+  orchestrator_url: "http://orchestrator:9090"
+  control_center_url: "http://control-center:8080"
+  enable_tools: "true"
+  enable_resources: "true"
+  enable_prompts: "true"
+  max_concurrent_tools: "10"
+  max_resource_size: "1073741824"  # 1GB in bytes
+  log_level: "info"
+```
+
+### Required Secrets
+
+#### control-center-secrets
+
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: control-center-secrets
+  namespace: provisioning
+type: Opaque
+stringData:
+  database_url: "postgresql://user:password@postgres:5432/provisioning"
+  jwt_secret: "your-secure-random-string-here"
+```
+
+## Persistence
+
+All deployments use PersistentVolumeClaims for data storage:
+
+```bash
+# Create PersistentVolumes and PersistentVolumeClaims
+kubectl apply -f - <<EOF
+# ... PersistentVolume and PersistentVolumeClaim manifests ...
+EOF
+```
+
+## Troubleshooting
+
+### Service connectivity issues
+
+```bash
+# Check service DNS resolution from inside a pod
+kubectl exec -it <pod-name> -n provisioning -- nslookup orchestrator
+
+# Check ingress routing
+kubectl describe ingress platform-ingress -n provisioning
+
+# Test connectivity from pod
+kubectl run -it --rm test --image=busybox -n provisioning -- wget http://orchestrator:9090/health
+```
+
+### TLS certificate issues
+
+```bash
+# Check certificate status
+kubectl describe certificate platform-tls-cert -n provisioning
+
+# Check cert-manager logs
+kubectl logs -n cert-manager deployment/cert-manager -f
+```
+
+## References
+
+- [Kubernetes Deployment API](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/deployment-v1/)
+- [Kubernetes Service API](https://kubernetes.io/docs/reference/kubernetes-api/service-resources/service-v1/)
+- [Kubernetes Ingress API](https://kubernetes.io/docs/reference/kubernetes-api/service-resources/ingress-v1/)
+- [Nginx Ingress Controller](https://kubernetes.github.io/ingress-nginx/)
+- [Cert-manager](https://cert-manager.io/)
\ No newline at end of file
diff --git a/schemas/platform/usage-guide.md b/schemas/platform/usage-guide.md
index b02a3de..d379dd2 100644
--- a/schemas/platform/usage-guide.md
+++ b/schemas/platform/usage-guide.md
@@ -1 +1,731 @@
-# Configuration System Usage Guide\n\nPractical guide for using the provisioning platform configuration system across common 
scenarios.\n\n## Quick Start (5 Minutes)\n\n### For Local Development\n\n```\n# 1. Enter configuration system directory\ncd provisioning/.typedialog/provisioning/platform\n\n# 2. Generate solo configuration (interactive)\nnu scripts/configure.nu orchestrator solo --backend cli\n\n# 3. Export to TOML\nnu scripts/generate-configs.nu orchestrator solo\n\n# 4. Start orchestrator\ncd ../../\nORCHESTRATOR_CONFIG=platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n### For Team Staging\n\n```\n# 1. Generate multiuser configuration\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu control-center multiuser --backend web\n\n# 2. Export configuration\nnu scripts/generate-configs.nu control-center multiuser\n\n# 3. Start with Docker Compose\ncd ../../\ndocker-compose -f platform/infrastructure/docker/docker-compose.multiuser.yml up -d\n```\n\n### For Production Enterprise\n\n```\n# 1. Generate enterprise configuration\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu orchestrator enterprise --backend web\n\n# 2. Export configuration\nnu scripts/generate-configs.nu orchestrator enterprise\n\n# 3. Deploy to Kubernetes\ncd ../../\nkubectl apply -f platform/infrastructure/kubernetes/namespace.yaml\nkubectl apply -f platform/infrastructure/kubernetes/*.yaml\n```\n\n---\n\n## Scenario 1: Single Developer Setup\n\n**Goal**: Set up local orchestrator for development testing\n**Time**: 5-10 minutes\n**Requirements**: Nushell, Nickel, Rust toolchain\n\n### Step 1: Interactive Configuration\n\n```\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu orchestrator solo --backend cli\n```\n\n**Form Fields**:\n- Workspace name: `dev-workspace` (default)\n- Workspace path: `/home/username/provisioning/data/orchestrator` (change to your path)\n- Server host: `127.0.0.1` (localhost only)\n- Server port: `9090` (default)\n- Storage backend: `filesystem` (selected by default)\n- Logging level: `debug` (recommended for dev)\n\n### Step 2: Validate Configuration\n\n```\n# Typecheck the generated Nickel\nnickel typecheck configs/orchestrator.solo.ncl\n\n# Should output: "✓ Type checking successful"\n```\n\n### Step 3: Export to TOML\n\n```\n# Generate TOML from Nickel\nnu scripts/generate-configs.nu orchestrator solo\n\n# Output: provisioning/platform/config/orchestrator.solo.toml\n```\n\n### Step 4: Start the Service\n\n```\ncd ../..\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n**Expected Output**:\n\n```\n[INFO] Orchestrator starting...\n[INFO] Server listening on 127.0.0.1:9090\n[INFO] Storage backend: filesystem\n[INFO] Ready to accept requests\n```\n\n### Step 5: Test the Service\n\nIn another terminal:\n\n```\n# Check health\ncurl http://localhost:9090/health\n\n# Submit a workflow\ncurl -X POST http://localhost:9090/api/workflows \n -H "Content-Type: application/json" \n -d '{"name": "test-workflow", "steps": []}'\n```\n\n### Iteration: Modify Configuration\n\nTo change configuration:\n\n**Option A: Re-run Interactive Form**\n\n```\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu orchestrator solo --backend cli\n# Answer with new values\nnu scripts/generate-configs.nu orchestrator solo\n# Restart service\n```\n\n**Option B: Edit TOML Directly**\n\n```\n# Edit the file directly\nvi provisioning/platform/config/orchestrator.solo.toml\n# Change values as needed\n# Restart service\n```\n\n**Option C: Environment Variable 
Override**\n\n```\n# No file changes needed\nexport ORCHESTRATOR_SERVER_PORT=9999\nexport ORCHESTRATOR_LOG_LEVEL=info\n\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator\n```\n\n---\n\n## Scenario 2: Team Collaboration Setup\n\n**Goal**: Set up shared team environment with PostgreSQL and RBAC\n**Time**: 20-30 minutes\n**Requirements**: Docker, Docker Compose, PostgreSQL running\n\n### Step 1: Interactive Configuration\n\n```\ncd provisioning/.typedialog/provisioning/platform\n\n# Configure Control Center with RBAC\nnu scripts/configure.nu control-center multiuser --backend web\n```\n\n**Important Fields**:\n- Database backend: `postgres` (for persistent storage)\n- Database host: `postgres.provisioning.svc.cluster.local` or `localhost` for local\n- Database password: Generate strong password (store in `.env` file, don't hardcode)\n- JWT secret: Generate 256-bit random string\n- MFA required: `false` (optional for team, not required)\n- Default role: `viewer` (least privilege)\n\n### Step 2: Create Environment File\n\n```\n# Create .env for secrets\ncat > provisioning/platform/.env << 'EOF'\nDB_PASSWORD=generate-strong-password-here\nJWT_SECRET=generate-256-bit-random-base64-string\nSURREALDB_PASSWORD=another-strong-password\nEOF\n\n# Protect the file\nchmod 600 provisioning/platform/.env\n```\n\n### Step 3: Export Configurations\n\n```\n# Export all three services for team setup\nnu scripts/generate-configs.nu control-center multiuser\nnu scripts/generate-configs.nu orchestrator multiuser\nnu scripts/generate-configs.nu mcp-server multiuser\n```\n\n### Step 4: Start Services with Docker Compose\n\n```\ncd ../..\n\n# Generate Docker Compose from Nickel template\nnu provisioning/.typedialog/provisioning/platform/scripts/render-docker-compose.nu multiuser\n\n# Start all services\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml \n --env-file provisioning/platform/.env \n up -d\n```\n\n**Verify Services**:\n\n```\n# Check all services are running\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml ps\n\n# Check logs for errors\ndocker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml logs -f control-center\n\n# Test Control Center UI\nopen http://localhost:8080\n# Login with default credentials (or configure initially)\n```\n\n### Step 5: Create Team Users and Roles\n\n```\n# Access PostgreSQL to set up users\ndocker-compose exec postgres psql -U provisioning -d provisioning\n\n-- Create users\nINSERT INTO users (username, email, role) VALUES\n ('alice@company.com', 'alice@company.com', 'admin'),\n ('bob@company.com', 'bob@company.com', 'operator'),\n ('charlie@company.com', 'charlie@company.com', 'developer');\n\n-- Create RBAC assignments\nINSERT INTO role_assignments (user_id, role) VALUES\n ((SELECT id FROM users WHERE username='alice@company.com'), 'admin'),\n ((SELECT id FROM users WHERE username='bob@company.com'), 'operator'),\n ((SELECT id FROM users WHERE username='charlie@company.com'), 'developer');\n```\n\n### Step 6: Team Access\n\n**Admin (Alice)**:\n- Full platform access\n- Can create/modify users\n- Can manage all workflows and policies\n\n**Operator (Bob)**:\n- Execute and manage workflows\n- View logs and metrics\n- Cannot modify policies or users\n\n**Developer (Charlie)**:\n- Read-only access to workflows\n- Cannot execute or modify\n- Can view logs\n\n---\n\n## Scenario 3: Production Enterprise 
Deployment\n\n**Goal**: Deploy complete platform to Kubernetes with HA and monitoring\n**Time**: 1-2 hours (includes infrastructure setup)\n**Requirements**: Kubernetes cluster, kubectl, Helm (optional)\n\n### Step 1: Pre-Deployment Checklist\n\n```\n# Verify Kubernetes access\nkubectl cluster-info\n\n# Create namespace\nkubectl create namespace provisioning\n\n# Verify persistent volumes available\nkubectl get pv\n\n# Check node resources\nkubectl top nodes\n# Minimum 16 CPU, 32GB RAM across cluster\n```\n\n### Step 2: Interactive Configuration (Enterprise Mode)\n\n```\ncd provisioning/.typedialog/provisioning/platform\n\nnu scripts/configure.nu orchestrator enterprise --backend web\nnu scripts/configure.nu control-center enterprise --backend web\nnu scripts/configure.nu mcp-server enterprise --backend web\n```\n\n**Critical Enterprise Settings**:\n- Deployment mode: `enterprise`\n- Replicas: Orchestrator (3), Control Center (2), MCP Server (1-2)\n- Storage:\n - Orchestrator: `surrealdb_cluster` with 3 nodes\n - Control Center: `postgres` with HA\n- Security:\n - Auth: `jwt` (required)\n - TLS: `true` (required)\n - MFA: `true` (required)\n- Monitoring: All enabled\n- Logging: JSON format with 365-day retention\n\n### Step 3: Generate Secrets\n\n```\n# Generate secure values\nJWT_SECRET=$(openssl rand -base64 32)\nDB_PASSWORD=$(openssl rand -base64 32)\nSURREALDB_PASSWORD=$(openssl rand -base64 32)\nADMIN_PASSWORD=$(openssl rand -base64 16)\n\n# Create Kubernetes secret\nkubectl create secret generic provisioning-secrets \n -n provisioning \n --from-literal=jwt-secret="$JWT_SECRET" \n --from-literal=db-password="$DB_PASSWORD" \n --from-literal=surrealdb-password="$SURREALDB_PASSWORD" \n --from-literal=admin-password="$ADMIN_PASSWORD"\n\n# Verify secret created\nkubectl get secrets -n provisioning\n```\n\n### Step 4: TLS Certificate Setup\n\n```\n# Generate self-signed certificate (for testing)\nopenssl req -x509 -nodes -days 365 -newkey rsa:2048 \n -keyout provisioning.key \n -out provisioning.crt \n -subj "/CN=provisioning.example.com"\n\n# Create TLS secret in Kubernetes\nkubectl create secret tls provisioning-tls \n -n provisioning \n --cert=provisioning.crt \n --key=provisioning.key\n\n# For production: Use cert-manager or real certificates\n# kubectl create secret tls provisioning-tls \n# -n provisioning \n# --cert=/path/to/cert.pem \n# --key=/path/to/key.pem\n```\n\n### Step 5: Export Configurations\n\n```\n# Export TOML configurations\nnu scripts/generate-configs.nu orchestrator enterprise\nnu scripts/generate-configs.nu control-center enterprise\nnu scripts/generate-configs.nu mcp-server enterprise\n```\n\n### Step 6: Create ConfigMaps for Configuration\n\n```\n# Create ConfigMaps with exported TOML\nkubectl create configmap orchestrator-config \n -n provisioning \n --from-file=provisioning/platform/config/orchestrator.enterprise.toml\n\nkubectl create configmap control-center-config \n -n provisioning \n --from-file=provisioning/platform/config/control-center.enterprise.toml\n\nkubectl create configmap mcp-server-config \n -n provisioning \n --from-file=provisioning/platform/config/mcp-server.enterprise.toml\n```\n\n### Step 7: Deploy Infrastructure\n\n```\ncd ../..\n\n# Deploy in order of dependencies\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/namespace.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/resource-quota.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/rbac.yaml\nkubectl apply -f 
provisioning/platform/infrastructure/kubernetes/network-policy.yaml\n\n# Deploy storage (PostgreSQL, SurrealDB)\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/postgres-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/surrealdb-*.yaml\n\n# Wait for databases to be ready\nkubectl wait --for=condition=ready pod -l app=postgres -n provisioning --timeout=300s\nkubectl wait --for=condition=ready pod -l app=surrealdb -n provisioning --timeout=300s\n\n# Deploy platform services\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/orchestrator-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/control-center-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/mcp-server-*.yaml\n\n# Deploy monitoring stack\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/prometheus-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/grafana-*.yaml\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/loki-*.yaml\n\n# Deploy ingress\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/platform-ingress.yaml\n```\n\n### Step 8: Verify Deployment\n\n```\n# Check all pods are running\nkubectl get pods -n provisioning\n\n# Check services\nkubectl get svc -n provisioning\n\n# Wait for all pods ready\nkubectl wait --for=condition=Ready pods --all -n provisioning --timeout=600s\n\n# Check ingress\nkubectl get ingress -n provisioning\n```\n\n### Step 9: Access the Platform\n\n```\n# Get Ingress IP\nkubectl get ingress -n provisioning\n\n# Configure DNS (or use /etc/hosts for testing)\necho "INGRESS_IP provisioning.example.com" | sudo tee -a /etc/hosts\n\n# Access services\n# Orchestrator: https://orchestrator.provisioning.example.com/api\n# Control Center: https://control-center.provisioning.example.com\n# MCP Server: https://mcp.provisioning.example.com\n# Grafana: https://grafana.provisioning.example.com (admin/password)\n# Prometheus: https://prometheus.provisioning.example.com (internal)\n```\n\n### Step 10: Post-Deployment Configuration\n\n```\n# Create database schema\nkubectl exec -it -n provisioning deployment/postgres -- psql -U provisioning -d provisioning -f /schema.sql\n\n# Initialize Grafana dashboards\nkubectl cp grafana-dashboards provisioning/grafana-0:/var/lib/grafana/dashboards/\n\n# Configure alerts\nkubectl apply -f provisioning/platform/infrastructure/kubernetes/prometheus-alerts.yaml\n```\n\n---\n\n## Common Tasks\n\n### Change Configuration Value\n\n**Without Service Restart** (Environment Variable):\n\n```\n# Override specific value via environment variable\nexport ORCHESTRATOR_LOG_LEVEL=debug\nexport ORCHESTRATOR_SERVER_PORT=9999\n\n# Service uses overridden values\nORCHESTRATOR_CONFIG=config.toml cargo run --bin orchestrator\n```\n\n**With Service Restart** (TOML Edit):\n\n```\n# Edit TOML directly\nvi provisioning/platform/config/orchestrator.solo.toml\n\n# Restart service\npkill -f "cargo run --bin orchestrator"\nORCHESTRATOR_CONFIG=config.toml cargo run --bin orchestrator\n```\n\n**With Validation** (Regenerate from Form):\n\n```\n# Re-run interactive form to regenerate\ncd provisioning/.typedialog/provisioning/platform\nnu scripts/configure.nu orchestrator solo --backend cli\n\n# Validation ensures consistency\nnu scripts/generate-configs.nu orchestrator solo\n\n# Restart service with validated config\n```\n\n### Add Team Member\n\n**In Kubernetes PostgreSQL**:\n\n```\nkubectl exec -it -n provisioning deployment/postgres -- psql -U 
provisioning -d provisioning\n\n-- Create user\nINSERT INTO users (username, email, password_hash, role, created_at) VALUES\n ('newuser@company.com', 'newuser@company.com', crypt('password', gen_salt('bf')), 'developer', now());\n\n-- Assign role\nINSERT INTO role_assignments (user_id, role, granted_by, granted_at) VALUES\n ((SELECT id FROM users WHERE username='newuser@company.com'), 'developer', 1, now());\n```\n\n### Scale Service Replicas\n\n**In Kubernetes**:\n\n```\n# Scale orchestrator from 3 to 5 replicas\nkubectl scale deployment orchestrator -n provisioning --replicas=5\n\n# Verify scaling\nkubectl get deployment orchestrator -n provisioning\nkubectl get pods -n provisioning | grep orchestrator\n```\n\n### Monitor Service Health\n\n```\n# Check pod status\nkubectl describe pod orchestrator-0 -n provisioning\n\n# Check service logs\nkubectl logs -f deployment/orchestrator -n provisioning --all-containers=true\n\n# Check resource usage\nkubectl top pods -n provisioning\n\n# Check service metrics (via Prometheus)\nkubectl port-forward -n provisioning svc/prometheus 9091:9091\nopen http://localhost:9091\n```\n\n### Backup Configuration\n\n```\n# Backup current TOML configs\ntar -czf configs-backup-$(date +%Y%m%d).tar.gz provisioning/platform/config/\n\n# Backup Kubernetes manifests\nkubectl get all -n provisioning -o yaml > k8s-backup-$(date +%Y%m%d).yaml\n\n# Backup database\nkubectl exec -n provisioning deployment/postgres -- pg_dump -U provisioning provisioning | gzip > db-backup-$(date +%Y%m%d).sql.gz\n```\n\n### Troubleshoot Configuration Issues\n\n```\n# Check Nickel syntax errors\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Validate TOML syntax\nnickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Check TOML is valid for Rust\nORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator -- --validate-config\n\n# Check environment variable overrides\necho $ORCHESTRATOR_SERVER_PORT\necho $ORCHESTRATOR_LOG_LEVEL\n\n# Examine actual config loaded (if service logs it)\nORCHESTRATOR_CONFIG=config.toml cargo run --bin orchestrator 2>&1 | grep -i "config\|configuration"\n```\n\n---\n\n## Configuration File Locations\n\n```\nprovisioning/.typedialog/provisioning/platform/\n├── forms/ # User-facing interactive forms\n│ ├── orchestrator-form.toml\n│ ├── control-center-form.toml\n│ └── fragments/ # Reusable form sections\n│\n├── values/ # User input files (gitignored)\n│ ├── orchestrator.solo.ncl\n│ ├── orchestrator.enterprise.ncl\n│ └── (auto-generated by TypeDialog)\n│\n├── configs/ # Composed Nickel configs\n│ ├── orchestrator.solo.ncl # Base + mode overlay + user input + validation\n│ ├── control-center.multiuser.ncl\n│ └── (4 services × 4 modes = 16 files)\n│\n├── schemas/ # Type definitions\n│ ├── orchestrator.ncl\n│ ├── control-center.ncl\n│ └── common/ # Shared schemas\n│\n├── defaults/ # Default values\n│ ├── orchestrator-defaults.ncl\n│ └── deployment/solo-defaults.ncl\n│\n├── validators/ # Business rules\n│ ├── orchestrator-validator.ncl\n│ └── (per-service validators)\n│\n├── constraints/\n│ └── constraints.toml # Min/max values (single source of truth)\n│\n├── templates/ # Deployment templates\n│ ├── docker-compose/\n│ │ ├── platform-stack.solo.yml.ncl\n│ │ └── (4 modes)\n│ └── kubernetes/\n│ ├── orchestrator-deployment.yaml.ncl\n│ └── (11 templates)\n│\n└── scripts/ # Automation\n ├── configure.nu # Interactive TypeDialog\n 
├── generate-configs.nu # Nickel → TOML export\n ├── validate-config.nu # Typecheck Nickel\n ├── render-docker-compose.nu # Templates → Docker Compose\n └── render-kubernetes.nu # Templates → Kubernetes\n```\n\nTOML output location:\n\n```\nprovisioning/platform/config/\n├── orchestrator.solo.toml # Consumed by orchestrator service\n├── control-center.enterprise.toml # Consumed by control-center service\n└── (4 services × 4 modes = 16 files)\n```\n\n---\n\n## Tips & Best Practices\n\n### 1. Use Version Control\n\n```\n# Commit TOML configs to track changes\ngit add provisioning/platform/config/*.toml\ngit commit -m "Update orchestrator enterprise config: increase worker threads to 16"\n\n# Do NOT commit Nickel source files in values/\necho "provisioning/.typedialog/provisioning/platform/values/*.ncl" >> .gitignore\n```\n\n### 2. Test Before Production Deployment\n\n```\n# Test in solo mode first\nnu scripts/configure.nu orchestrator solo\ncargo run --bin orchestrator\n\n# Then test in staging (multiuser mode)\nnu scripts/configure.nu orchestrator multiuser\ndocker-compose -f docker-compose.multiuser.yml up\n\n# Finally deploy to production (enterprise)\nnu scripts/configure.nu orchestrator enterprise\n# Then Kubernetes deployment\n```\n\n### 3. Document Custom Configurations\n\n```\n# Add comments to configurations\n# In values/*.ncl or config/*.ncl:\n\n# Custom configuration for high-throughput testing\n# - Increased workers from 4 to 8\n# - Increased queue.max_concurrent_tasks from 5 to 20\n# - Lowered logging level from debug to info\n{\n orchestrator = {\n # Worker threads increased for testing parallel task processing\n server.workers = 8,\n queue.max_concurrent_tasks = 20,\n logging.level = "info",\n },\n}\n```\n\n### 4. Secrets Management\n\n**Never** hardcode secrets in configuration files:\n\n```\n# WRONG - Don't do this\n[orchestrator.security]\njwt_secret = "hardcoded-secret-exposed-in-git"\n\n# RIGHT - Use environment variables\nexport ORCHESTRATOR_SECURITY_JWT_SECRET="actual-secret-from-vault"\n\n# TOML references it:\n[orchestrator.security]\njwt_secret = "${JWT_SECRET}" # Loaded at runtime\n```\n\n### 5. Monitor Changes\n\n```\n# Track configuration changes over time\ngit log --oneline provisioning/platform/config/\n\n# See what changed\ngit diff provisioning/platform/config/orchestrator.solo.toml\n```\n\n---\n\n**Version**: 1.0\n**Last Updated**: 2025-01-05\n**Status**: Production Ready \ No newline at end of file +# Configuration System Usage Guide + +Practical guide for using the provisioning platform configuration system across common scenarios. + +## Quick Start (5 Minutes) + +### For Local Development + +```bash +# 1. Enter configuration system directory +cd provisioning/.typedialog/provisioning/platform + +# 2. Generate solo configuration (interactive) +nu scripts/configure.nu orchestrator solo --backend cli + +# 3. Export to TOML +nu scripts/generate-configs.nu orchestrator solo + +# 4. Start orchestrator +cd ../../ +ORCHESTRATOR_CONFIG=platform/config/orchestrator.solo.toml cargo run --bin orchestrator +``` + +### For Team Staging + +```bash +# 1. Generate multiuser configuration +cd provisioning/.typedialog/provisioning/platform +nu scripts/configure.nu control-center multiuser --backend web + +# 2. Export configuration +nu scripts/generate-configs.nu control-center multiuser + +# 3. Start with Docker Compose +cd ../../ +docker-compose -f platform/infrastructure/docker/docker-compose.multiuser.yml up -d +``` + +### For Production Enterprise + +```bash +# 1. 
Generate enterprise configuration
+cd provisioning/.typedialog/provisioning/platform
+nu scripts/configure.nu orchestrator enterprise --backend web
+
+# 2. Export configuration
+nu scripts/generate-configs.nu orchestrator enterprise
+
+# 3. Deploy to Kubernetes
+cd ../../
+kubectl apply -f platform/infrastructure/kubernetes/namespace.yaml
+kubectl apply -f platform/infrastructure/kubernetes/*.yaml
+```
+
+---
+
+## Scenario 1: Single Developer Setup
+
+**Goal**: Set up local orchestrator for development testing
+**Time**: 5-10 minutes
+**Requirements**: Nushell, Nickel, Rust toolchain
+
+### Step 1: Interactive Configuration
+
+```bash
+cd provisioning/.typedialog/provisioning/platform
+nu scripts/configure.nu orchestrator solo --backend cli
+```
+
+**Form Fields**:
+- Workspace name: `dev-workspace` (default)
+- Workspace path: `/home/username/provisioning/data/orchestrator` (change to your path)
+- Server host: `127.0.0.1` (localhost only)
+- Server port: `9090` (default)
+- Storage backend: `filesystem` (selected by default)
+- Logging level: `debug` (recommended for dev)
+
+### Step 2: Validate Configuration
+
+```bash
+# Typecheck the generated Nickel
+nickel typecheck configs/orchestrator.solo.ncl
+
+# Should output: "✓ Type checking successful"
+```
+
+### Step 3: Export to TOML
+
+```bash
+# Generate TOML from Nickel
+nu scripts/generate-configs.nu orchestrator solo
+
+# Output: provisioning/platform/config/orchestrator.solo.toml
+```
+
+### Step 4: Start the Service
+
+```bash
+cd ../..
+ORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator
+```
+
+**Expected Output**:
+
+```
+[INFO] Orchestrator starting...
+[INFO] Server listening on 127.0.0.1:9090
+[INFO] Storage backend: filesystem
+[INFO] Ready to accept requests
+```
+
+### Step 5: Test the Service
+
+In another terminal:
+
+```bash
+# Check health
+curl http://localhost:9090/health
+
+# Submit a workflow
+curl -X POST http://localhost:9090/api/workflows \
+  -H "Content-Type: application/json" \
+  -d '{"name": "test-workflow", "steps": []}'
+```
+
+### Iteration: Modify Configuration
+
+To change configuration:
+
+**Option A: Re-run Interactive Form**
+
+```bash
+cd provisioning/.typedialog/provisioning/platform
+nu scripts/configure.nu orchestrator solo --backend cli
+# Answer with new values
+nu scripts/generate-configs.nu orchestrator solo
+# Restart service
+```
+
+**Option B: Edit TOML Directly**
+
+```bash
+# Edit the file directly
+vi provisioning/platform/config/orchestrator.solo.toml
+# Change values as needed
+# Restart service
+```
+
+**Option C: Environment Variable Override**
+
+```bash
+# No file changes needed
+export ORCHESTRATOR_SERVER_PORT=9999
+export ORCHESTRATOR_LOG_LEVEL=info
+
+ORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator
+```
+
+---
+
+## Scenario 2: Team Collaboration Setup
+
+**Goal**: Set up shared team environment with PostgreSQL and RBAC
+**Time**: 20-30 minutes
+**Requirements**: Docker, Docker Compose, PostgreSQL running
+
+### Step 1: Interactive Configuration
+
+```bash
+cd provisioning/.typedialog/provisioning/platform
+
+# Configure Control Center with RBAC
+nu scripts/configure.nu control-center multiuser --backend web
+```
+
+**Important Fields**:
+- Database backend: `postgres` (for persistent storage)
+- Database host: `postgres.provisioning.svc.cluster.local`, or `localhost` for a local setup
+- Database password: Generate a strong password (store it in a `.env` file; don't hardcode it)
+- JWT secret: Generate a 256-bit random string
+- MFA required: `false` (optional for teams)
+- Default role: `viewer` (least privilege)
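+
+The password and secret fields can be filled with freshly generated values. A quick sketch using `openssl`, the same tool the enterprise scenario uses later:
+
+```bash
+# Generate candidate values for the fields above
+openssl rand -base64 32   # JWT secret (256 bits)
+openssl rand -base64 24   # database password
+```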
+
+### Step 2: Create Environment File
+
+```bash
+# Create .env for secrets
+cat > provisioning/platform/.env << 'EOF'
+DB_PASSWORD=generate-strong-password-here
+JWT_SECRET=generate-256-bit-random-base64-string
+SURREALDB_PASSWORD=another-strong-password
+EOF
+
+# Protect the file
+chmod 600 provisioning/platform/.env
+```
+
+### Step 3: Export Configurations
+
+```bash
+# Export all three services for team setup
+nu scripts/generate-configs.nu control-center multiuser
+nu scripts/generate-configs.nu orchestrator multiuser
+nu scripts/generate-configs.nu mcp-server multiuser
+```
+
+### Step 4: Start Services with Docker Compose
+
+```bash
+cd ../..
+
+# Generate Docker Compose from Nickel template
+nu provisioning/.typedialog/provisioning/platform/scripts/render-docker-compose.nu multiuser
+
+# Start all services
+docker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml \
+  --env-file provisioning/platform/.env \
+  up -d
+```
+
+**Verify Services**:
+
+```bash
+# Check all services are running
+docker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml ps
+
+# Check logs for errors
+docker-compose -f provisioning/platform/infrastructure/docker/docker-compose.multiuser.yml logs -f control-center
+
+# Test Control Center UI
+open http://localhost:8080
+# Log in with the default credentials (or configure initial credentials first)
+```
+
+### Step 5: Create Team Users and Roles
+
+```bash
+# Access PostgreSQL to set up users
+docker-compose exec postgres psql -U provisioning -d provisioning
+```
+
+```sql
+-- Create users
+INSERT INTO users (username, email, role) VALUES
+  ('alice@company.com', 'alice@company.com', 'admin'),
+  ('bob@company.com', 'bob@company.com', 'operator'),
+  ('charlie@company.com', 'charlie@company.com', 'developer');
+
+-- Create RBAC assignments
+INSERT INTO role_assignments (user_id, role) VALUES
+  ((SELECT id FROM users WHERE username='alice@company.com'), 'admin'),
+  ((SELECT id FROM users WHERE username='bob@company.com'), 'operator'),
+  ((SELECT id FROM users WHERE username='charlie@company.com'), 'developer');
+```
+
+### Step 6: Team Access
+
+**Admin (Alice)**:
+- Full platform access
+- Can create/modify users
+- Can manage all workflows and policies
+
+**Operator (Bob)**:
+- Execute and manage workflows
+- View logs and metrics
+- Cannot modify policies or users
+
+**Developer (Charlie)**:
+- Read-only access to workflows
+- Cannot execute or modify
+- Can view logs
+
+---
+
+## Scenario 3: Production Enterprise Deployment
+
+**Goal**: Deploy complete platform to Kubernetes with HA and monitoring
+**Time**: 1-2 hours (includes infrastructure setup)
+**Requirements**: Kubernetes cluster, kubectl, Helm (optional)
+
+### Step 1: Pre-Deployment Checklist
+
+```bash
+# Verify Kubernetes access
+kubectl cluster-info
+
+# Create namespace
+kubectl create namespace provisioning
+
+# Verify persistent volumes available
+kubectl get pv
+
+# Check node resources
+kubectl top nodes
+# Minimum 16 CPU, 32GB RAM across cluster
+```
+
+### Step 2: Interactive Configuration (Enterprise Mode)
+
+```bash
+cd provisioning/.typedialog/provisioning/platform
+
+nu scripts/configure.nu orchestrator enterprise --backend web
+nu scripts/configure.nu control-center enterprise --backend web
+nu scripts/configure.nu mcp-server enterprise --backend web
+```
+
+**Critical Enterprise Settings**:
+- Deployment mode: `enterprise`
+- Replicas: Orchestrator (3), Control Center (2), MCP Server (1-2)
+- Storage:
+  - Orchestrator: `surrealdb_cluster` with 3 nodes
+  - Control Center: `postgres` with HA
+- Security:
+  - Auth: `jwt` (required)
+  - TLS: `true` (required)
+  - MFA: `true` (required)
+- Monitoring: All enabled
+- Logging: JSON format with 365-day retention
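+
+Before generating secrets, the enterprise overlay can be spot-checked by rendering a manifest locally. A hedged sketch; it assumes `yq` v4 and that the deployment template evaluates standalone from this directory:
+
+```bash
+# The enterprise deployment should render with 3 orchestrator replicas
+nickel export --format json templates/kubernetes/orchestrator-deployment.yaml.ncl | yq '.spec.replicas'
+```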
+
+### Step 3: Generate Secrets
+
+```bash
+# Generate secure values
+JWT_SECRET=$(openssl rand -base64 32)
+DB_PASSWORD=$(openssl rand -base64 32)
+SURREALDB_PASSWORD=$(openssl rand -base64 32)
+ADMIN_PASSWORD=$(openssl rand -base64 16)
+
+# Create Kubernetes secret
+kubectl create secret generic provisioning-secrets \
+  -n provisioning \
+  --from-literal=jwt-secret="$JWT_SECRET" \
+  --from-literal=db-password="$DB_PASSWORD" \
+  --from-literal=surrealdb-password="$SURREALDB_PASSWORD" \
+  --from-literal=admin-password="$ADMIN_PASSWORD"
+
+# Verify secret created
+kubectl get secrets -n provisioning
+```
+
+### Step 4: TLS Certificate Setup
+
+```bash
+# Generate self-signed certificate (for testing)
+openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
+  -keyout provisioning.key \
+  -out provisioning.crt \
+  -subj "/CN=provisioning.example.com"
+
+# Create TLS secret in Kubernetes
+kubectl create secret tls provisioning-tls \
+  -n provisioning \
+  --cert=provisioning.crt \
+  --key=provisioning.key
+
+# For production: use cert-manager or real certificates
+# kubectl create secret tls provisioning-tls \
+#   -n provisioning \
+#   --cert=/path/to/cert.pem \
+#   --key=/path/to/key.pem
+```
+
+### Step 5: Export Configurations
+
+```bash
+# Export TOML configurations
+nu scripts/generate-configs.nu orchestrator enterprise
+nu scripts/generate-configs.nu control-center enterprise
+nu scripts/generate-configs.nu mcp-server enterprise
+```
+
+### Step 6: Create ConfigMaps for Configuration
+
+```bash
+# Create ConfigMaps with exported TOML
+kubectl create configmap orchestrator-config \
+  -n provisioning \
+  --from-file=provisioning/platform/config/orchestrator.enterprise.toml
+
+kubectl create configmap control-center-config \
+  -n provisioning \
+  --from-file=provisioning/platform/config/control-center.enterprise.toml
+
+kubectl create configmap mcp-server-config \
+  -n provisioning \
+  --from-file=provisioning/platform/config/mcp-server.enterprise.toml
+```
+
+### Step 7: Deploy Infrastructure
+
+```bash
+cd ../..
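+
+# Optional sanity check before applying anything: a client-side dry run
+# parses and schema-checks every manifest without touching the cluster
+kubectl apply --dry-run=client -f provisioning/platform/infrastructure/kubernetes/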
+
+# Deploy in order of dependencies
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/namespace.yaml
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/resource-quota.yaml
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/rbac.yaml
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/network-policy.yaml
+
+# Deploy storage (PostgreSQL, SurrealDB)
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/postgres-*.yaml
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/surrealdb-*.yaml
+
+# Wait for databases to be ready
+kubectl wait --for=condition=ready pod -l app=postgres -n provisioning --timeout=300s
+kubectl wait --for=condition=ready pod -l app=surrealdb -n provisioning --timeout=300s
+
+# Deploy platform services
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/orchestrator-*.yaml
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/control-center-*.yaml
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/mcp-server-*.yaml
+
+# Deploy monitoring stack
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/prometheus-*.yaml
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/grafana-*.yaml
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/loki-*.yaml
+
+# Deploy ingress
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/platform-ingress.yaml
+```
+
+### Step 8: Verify Deployment
+
+```bash
+# Check all pods are running
+kubectl get pods -n provisioning
+
+# Check services
+kubectl get svc -n provisioning
+
+# Wait for all pods ready
+kubectl wait --for=condition=Ready pods --all -n provisioning --timeout=600s
+
+# Check ingress
+kubectl get ingress -n provisioning
+```
+
+### Step 9: Access the Platform
+
+```bash
+# Get Ingress IP
+kubectl get ingress -n provisioning
+
+# Configure DNS (or use /etc/hosts for testing; replace INGRESS_IP with the address above)
+echo "INGRESS_IP provisioning.example.com" | sudo tee -a /etc/hosts
+
+# Access services
+# Orchestrator: https://orchestrator.provisioning.example.com/api
+# Control Center: https://control-center.provisioning.example.com
+# MCP Server: https://mcp.provisioning.example.com
+# Grafana: https://grafana.provisioning.example.com (admin/password)
+# Prometheus: https://prometheus.provisioning.example.com (internal)
+```
+
+### Step 10: Post-Deployment Configuration
+
+```bash
+# Create database schema
+kubectl exec -it -n provisioning deployment/postgres -- psql -U provisioning -d provisioning -f /schema.sql
+
+# Initialize Grafana dashboards
+kubectl cp grafana-dashboards provisioning/grafana-0:/var/lib/grafana/dashboards/
+
+# Configure alerts
+kubectl apply -f provisioning/platform/infrastructure/kubernetes/prometheus-alerts.yaml
+```
+
+---
+
+## Common Tasks
+
+### Change Configuration Value
+
+**Without Editing Files** (Environment Variable):
+
+```bash
+# Override specific values via environment variables
+export ORCHESTRATOR_LOG_LEVEL=debug
+export ORCHESTRATOR_SERVER_PORT=9999
+
+# Service uses overridden values
+ORCHESTRATOR_CONFIG=config.toml cargo run --bin orchestrator
+```
+
+**With Service Restart** (TOML Edit):
+
+```bash
+# Edit TOML directly
+vi provisioning/platform/config/orchestrator.solo.toml
+
+# Restart service
+pkill -f "cargo run --bin orchestrator"
+ORCHESTRATOR_CONFIG=config.toml cargo run --bin orchestrator
+```
+
+**With Validation** (Regenerate from Form):
+
+```bash
+# Re-run interactive form to regenerate
+cd provisioning/.typedialog/provisioning/platform
+nu scripts/configure.nu orchestrator solo --backend cli
+
+# Validation ensures consistency
+nu scripts/generate-configs.nu orchestrator solo
+
+# Restart service with validated config
+```
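+
+When it is unclear which values a running service picked up, first list the overrides present in the current shell (variable names follow this guide's `ORCHESTRATOR_*` scheme):
+
+```bash
+# List every orchestrator override active in this shell
+env | grep '^ORCHESTRATOR_'
+```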
+
+### Add Team Member
+
+**In Kubernetes PostgreSQL**:
+
+```bash
+kubectl exec -it -n provisioning deployment/postgres -- psql -U provisioning -d provisioning
+```
+
+```sql
+-- Create user
+INSERT INTO users (username, email, password_hash, role, created_at) VALUES
+  ('newuser@company.com', 'newuser@company.com', crypt('password', gen_salt('bf')), 'developer', now());
+
+-- Assign role
+INSERT INTO role_assignments (user_id, role, granted_by, granted_at) VALUES
+  ((SELECT id FROM users WHERE username='newuser@company.com'), 'developer', 1, now());
+```
+
+### Scale Service Replicas
+
+**In Kubernetes**:
+
+```bash
+# Scale orchestrator from 3 to 5 replicas
+kubectl scale deployment orchestrator -n provisioning --replicas=5
+
+# Verify scaling
+kubectl get deployment orchestrator -n provisioning
+kubectl get pods -n provisioning | grep orchestrator
+```
+
+### Monitor Service Health
+
+```bash
+# Check pod status
+kubectl describe pod orchestrator-0 -n provisioning
+
+# Check service logs
+kubectl logs -f deployment/orchestrator -n provisioning --all-containers=true
+
+# Check resource usage
+kubectl top pods -n provisioning
+
+# Check service metrics (via Prometheus)
+kubectl port-forward -n provisioning svc/prometheus 9091:9091
+open http://localhost:9091
+```
+
+### Backup Configuration
+
+```bash
+# Backup current TOML configs
+tar -czf configs-backup-$(date +%Y%m%d).tar.gz provisioning/platform/config/
+
+# Backup Kubernetes manifests
+kubectl get all -n provisioning -o yaml > k8s-backup-$(date +%Y%m%d).yaml
+
+# Backup database
+kubectl exec -n provisioning deployment/postgres -- pg_dump -U provisioning provisioning | gzip > db-backup-$(date +%Y%m%d).sql.gz
+```
+
+### Troubleshoot Configuration Issues
+
+```bash
+# Check Nickel syntax errors
+nickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+
+# Validate TOML syntax
+nickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+
+# Check TOML is valid for Rust
+ORCHESTRATOR_CONFIG=provisioning/platform/config/orchestrator.solo.toml cargo run --bin orchestrator -- --validate-config
+
+# Check environment variable overrides
+echo $ORCHESTRATOR_SERVER_PORT
+echo $ORCHESTRATOR_LOG_LEVEL
+
+# Examine actual config loaded (if service logs it)
+ORCHESTRATOR_CONFIG=config.toml cargo run --bin orchestrator 2>&1 | grep -i "config\|configuration"
+```
+
+---
+
+## Configuration File Locations
+
+```
+provisioning/.typedialog/provisioning/platform/
+├── forms/                        # User-facing interactive forms
+│   ├── orchestrator-form.toml
+│   ├── control-center-form.toml
+│   └── fragments/                # Reusable form sections
+│
+├── values/                       # User input files (gitignored)
+│   ├── orchestrator.solo.ncl
+│   ├── orchestrator.enterprise.ncl
+│   └── (auto-generated by TypeDialog)
+│
+├── configs/                      # Composed Nickel configs
+│   ├── orchestrator.solo.ncl     # Base + mode overlay + user input + validation
+│   ├── control-center.multiuser.ncl
+│   └── (4 services × 4 modes = 16 files)
+│
+├── schemas/                      # Type definitions
+│   ├── orchestrator.ncl
+│   ├── control-center.ncl
+│   └── common/                   # Shared schemas
+│
+├── defaults/                     # Default values
+│   ├── orchestrator-defaults.ncl
+│   └── deployment/solo-defaults.ncl
+│
+├── validators/                   # Business rules
+│   ├── orchestrator-validator.ncl
+│   └── (per-service validators)
+│
+├── constraints/
+│   └── constraints.toml          # Min/max values (single source of truth)
+│
+├── templates/                    # Deployment templates
+│   ├── docker-compose/
+│   │   ├── platform-stack.solo.yml.ncl
+│   │   └── (4 modes)
+│   └── kubernetes/
+│       ├── orchestrator-deployment.yaml.ncl
+│       └── (11 templates)
+│
+└── scripts/                      # Automation
+    ├── configure.nu              # Interactive TypeDialog
+    ├── generate-configs.nu       # Nickel → TOML export
+    ├── validate-config.nu        # Typecheck Nickel
+    ├── render-docker-compose.nu  # Templates → Docker Compose
+    └── render-kubernetes.nu      # Templates → Kubernetes
+```
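+
+The composed configs above are exported into the TOML location shown next. To refresh every exported file in one pass, the per-service export can be looped; a sketch covering the three services named in this guide (the guide counts four services, but only these three are named here):
+
+```bash
+# Re-export configs for each named service and deployment mode
+for svc in orchestrator control-center mcp-server; do
+  for mode in solo multiuser cicd enterprise; do
+    nu scripts/generate-configs.nu "$svc" "$mode"
+  done
+done
+```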
+
+TOML output location:
+
+```
+provisioning/platform/config/
+├── orchestrator.solo.toml            # Consumed by orchestrator service
+├── control-center.enterprise.toml    # Consumed by control-center service
+└── (4 services × 4 modes = 16 files)
+```
+
+---
+
+## Tips & Best Practices
+
+### 1. Use Version Control
+
+```bash
+# Commit TOML configs to track changes
+git add provisioning/platform/config/*.toml
+git commit -m "Update orchestrator enterprise config: increase worker threads to 16"
+
+# Do NOT commit Nickel source files in values/
+echo "provisioning/.typedialog/provisioning/platform/values/*.ncl" >> .gitignore
+```
+
+### 2. Test Before Production Deployment
+
+```bash
+# Test in solo mode first
+nu scripts/configure.nu orchestrator solo
+cargo run --bin orchestrator
+
+# Then test in staging (multiuser mode)
+nu scripts/configure.nu orchestrator multiuser
+docker-compose -f docker-compose.multiuser.yml up
+
+# Finally deploy to production (enterprise)
+nu scripts/configure.nu orchestrator enterprise
+# Then Kubernetes deployment
+```
+
+### 3. Document Custom Configurations
+
+Add comments to configurations in `values/*.ncl` or `configs/*.ncl`:
+
+```nickel
+# Custom configuration for high-throughput testing
+# - Increased workers from 4 to 8
+# - Increased queue.max_concurrent_tasks from 5 to 20
+# - Lowered logging level from debug to info
+{
+  orchestrator = {
+    # Worker threads increased for testing parallel task processing
+    server.workers = 8,
+    queue.max_concurrent_tasks = 20,
+    logging.level = "info",
+  },
+}
+```
+
+### 4. Secrets Management
+
+**Never** hardcode secrets in configuration files:
+
+```toml
+# WRONG - Don't do this
+[orchestrator.security]
+jwt_secret = "hardcoded-secret-exposed-in-git"
+```
+
+```bash
+# RIGHT - Use environment variables
+export ORCHESTRATOR_SECURITY_JWT_SECRET="actual-secret-from-vault"
+```
+
+```toml
+# The TOML then references it (loaded at runtime):
+[orchestrator.security]
+jwt_secret = "${JWT_SECRET}"
+```
+
+### 5. 
Monitor Changes + +```bash +# Track configuration changes over time +git log --oneline provisioning/platform/config/ + +# See what changed +git diff provisioning/platform/config/orchestrator.solo.toml +``` + +--- + +**Version**: 1.0 +**Last Updated**: 2025-01-05 +**Status**: Production Ready \ No newline at end of file diff --git a/schemas/platform/validators/README.md b/schemas/platform/validators/README.md index acdac2a..752d34d 100644 --- a/schemas/platform/validators/README.md +++ b/schemas/platform/validators/README.md @@ -1 +1,329 @@ -# Validators\n\nValidation logic for configuration values using constraints and business rules.\n\n## Purpose\n\nValidators provide:\n- **Constraint checking** - Numeric ranges, required fields\n- **Business logic validation** - Service-specific constraints\n- **Error messages** - Clear feedback on invalid values\n- **Composition with configs** - Validators applied during config generation\n\n## File Organization\n\n```\nvalidators/\n├── README.md # This file\n├── common-validator.ncl # Ports, positive numbers, strings\n├── network-validator.ncl # IP addresses, bind addresses\n├── path-validator.ncl # File paths, directories\n├── resource-validator.ncl # CPU, memory, disk\n├── string-validator.ncl # Workspace names, identifiers\n├── orchestrator-validator.ncl # Queue, workflow validation\n├── control-center-validator.ncl # RBAC, policy validation\n├── mcp-server-validator.ncl # MCP tools, capabilities\n└── deployment-validator.ncl # Resource allocation\n```\n\n## Validation Patterns\n\n### 1. Basic Range Validation\n\n```\n# validators/common-validator.ncl\nlet constraints = import "../constraints/constraints.toml" in\n\n{\n ValidPort = fun port =>\n if port < constraints.common.server.port.min then\n std.contract.blame_with_message "Port < 1024" port\n else if port > constraints.common.server.port.max then\n std.contract.blame_with_message "Port > 65535" port\n else\n port,\n}\n```\n\n### 2. Range Validator (Reusable)\n\n```\n# Reusable validator for any numeric range\nValidRange = fun min max value =>\n if value < min then\n std.contract.blame_with_message "Value < %{std.to_string min}" value\n else if value > max then\n std.contract.blame_with_message "Value > %{std.to_string max}" value\n else\n value,\n```\n\n### 3. Enum Validation\n\n```\n{\n ValidStorageBackend = fun backend =>\n if backend != 'filesystem &&\n backend != 'rocksdb &&\n backend != 'surrealdb &&\n backend != 'postgres then\n std.contract.blame_with_message "Invalid backend" backend\n else\n backend,\n}\n```\n\n### 4. 
String Validation\n\n```\n{\n ValidNonEmptyString = fun s =>\n if s == "" then\n std.contract.blame_with_message "Cannot be empty" s\n else\n s,\n\n ValidWorkspaceName = fun name =>\n if std.string.matches "^[a-z0-9_-]+$" name then\n name\n else\n std.contract.blame_with_message "Invalid workspace name" name,\n}\n```\n\n## Common Validators\n\n### common-validator.ncl\n\n```\nlet constraints = import "../constraints/constraints.toml" in\n\n{\n # Port validation\n ValidPort = fun port =>\n if port < constraints.common.server.port.min then error "Port too low"\n else if port > constraints.common.server.port.max then error "Port too high"\n else port,\n\n # Positive integer\n ValidPositiveNumber = fun n =>\n if n <= 0 then error "Must be positive"\n else n,\n\n # Non-empty string\n ValidNonEmptyString = fun s =>\n if s == "" then error "Cannot be empty"\n else s,\n\n # Generic range validator\n ValidRange = fun min max value =>\n if value < min then error "Value below minimum"\n else if value > max then error "Value above maximum"\n else value,\n}\n```\n\n### resource-validator.ncl\n\n```\nlet constraints = import "../constraints/constraints.toml" in\nlet common = import "./common-validator.ncl" in\n\n{\n # Validate CPU cores for deployment mode\n ValidCPUCores = fun mode cores =>\n let limits = constraints.deployment.{mode} in\n common.ValidRange limits.cpu.min limits.cpu.max cores,\n\n # Validate memory allocation\n ValidMemory = fun mode memory_mb =>\n let limits = constraints.deployment.{mode} in\n common.ValidRange limits.memory_mb.min limits.memory_mb.max memory_mb,\n}\n```\n\n## Service-Specific Validators\n\n### orchestrator-validator.ncl\n\n```\nlet constraints = import "../constraints/constraints.toml" in\nlet common = import "./common-validator.ncl" in\n\n{\n # Validate worker count\n ValidWorkers = fun workers =>\n common.ValidRange\n constraints.orchestrator.workers.min\n constraints.orchestrator.workers.max\n workers,\n\n # Validate queue concurrency\n ValidConcurrentTasks = fun tasks =>\n common.ValidRange\n constraints.orchestrator.queue.concurrent_tasks.min\n constraints.orchestrator.queue.concurrent_tasks.max\n tasks,\n\n # Validate batch parallelism\n ValidParallelLimit = fun limit =>\n common.ValidRange\n constraints.orchestrator.batch.parallel_limit.min\n constraints.orchestrator.batch.parallel_limit.max\n limit,\n\n # Validate task timeout (ms)\n ValidTaskTimeout = fun timeout =>\n if timeout < 1000 then error "Timeout < 1 second"\n else if timeout > 86400000 then error "Timeout > 24 hours"\n else timeout,\n}\n```\n\n### control-center-validator.ncl\n\n```\n{\n # JWT token expiration\n ValidTokenExpiration = fun seconds =>\n if seconds < 300 then error "Token expiration < 5 min"\n else if seconds > 604800 then error "Token expiration > 7 days"\n else seconds,\n\n # Rate limit threshold\n ValidRateLimit = fun requests_per_minute =>\n if requests_per_minute < 10 then error "Rate limit too low"\n else if requests_per_minute > 10000 then error "Rate limit too high"\n else requests_per_minute,\n}\n```\n\n### mcp-server-validator.ncl\n\n```\n{\n # Max concurrent tool executions\n ValidConcurrentTools = fun count =>\n if count < 1 then error "Must allow >= 1 concurrent"\n else if count > 20 then error "Max 20 concurrent tools"\n else count,\n\n # Max resource size\n ValidMaxResourceSize = fun bytes =>\n if bytes < 1048576 then error "Min 1 MB"\n else if bytes > 1073741824 then error "Max 1 GB"\n else bytes,\n}\n```\n\n## Composition with Configs\n\nValidators are applied in 
config files:\n\n```\n# configs/orchestrator.solo.ncl\nlet validators = import "../validators/orchestrator-validator.ncl" in\n\n{\n orchestrator = {\n server.workers = validators.ValidWorkers 2, # Validated\n queue.max_concurrent_tasks = validators.ValidConcurrentTasks 3, # Validated\n },\n}\n```\n\nValidation happens at:\n1. **Config composition** - When config is evaluated\n2. **Nickel typecheck** - When config is typechecked\n3. **Form submission** - When TypeDialog form is submitted (constraints)\n4. **TOML export** - When Nickel is exported to TOML\n\n## Error Handling\n\n### Validation Errors\n\n```\n# If validation fails during config evaluation:\n# Error: Port too high\n```\n\n### Meaningful Messages\n\nAlways provide context in error messages:\n\n```\n# Bad\nstd.contract.blame "Invalid" value\n\n# Good\nstd.contract.blame_with_message "Port must be 1024-65535, got %{std.to_string value}" port\n```\n\n## Best Practices\n\n1. **Reuse common validators** - Build from common-validator.ncl\n2. **Name clearly** - Prefix with "Valid" (ValidPort, ValidWorkers, etc.)\n3. **Error messages** - Include valid range or enum in message\n4. **Test edge cases** - Verify min/max boundary values\n5. **Document assumptions** - Why a constraint exists\n\n## Testing Validators\n\n```\n# Test a single validator\nnickel eval -c 'import "validators/orchestrator-validator.ncl" as v in v.ValidWorkers 2'\n\n# Test config with validators\nnickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Evaluate config (runs validators)\nnickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n\n# Export to TOML (validates during export)\nnickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl\n```\n\n## Adding a New Validator\n\n1. **Create validator function** in appropriate file:\n\n ```nickel\n ValidMyValue = fun value =>\n if value < minimum then error "Too low"\n else if value > maximum then error "Too high"\n else value,\n ```\n\n2. **Add constraint** to constraints.toml if needed:\n\n ```toml\n [service.feature.my_value]\n min = 1\n max = 100\n ```\n\n3. **Use in config**:\n\n ```nickel\n my_value = validators.ValidMyValue 50,\n ```\n\n4. **Add form constraint** (if interactive):\n\n ```toml\n [[elements]]\n name = "my_value"\n min = "${constraint.service.feature.my_value.min}"\n max = "${constraint.service.feature.my_value.max}"\n ```\n\n5. **Test**:\n\n ```bash\n nickel typecheck configs/service.mode.ncl\n ```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05 +# Validators + +Validation logic for configuration values using constraints and business rules. 
+
+## Purpose
+
+Validators provide:
+- **Constraint checking** - Numeric ranges, required fields
+- **Business logic validation** - Service-specific constraints
+- **Error messages** - Clear feedback on invalid values
+- **Composition with configs** - Validators applied during config generation
+
+## File Organization
+
+```
+validators/
+├── README.md                      # This file
+├── common-validator.ncl           # Ports, positive numbers, strings
+├── network-validator.ncl          # IP addresses, bind addresses
+├── path-validator.ncl             # File paths, directories
+├── resource-validator.ncl         # CPU, memory, disk
+├── string-validator.ncl           # Workspace names, identifiers
+├── orchestrator-validator.ncl     # Queue, workflow validation
+├── control-center-validator.ncl   # RBAC, policy validation
+├── mcp-server-validator.ncl       # MCP tools, capabilities
+└── deployment-validator.ncl       # Resource allocation
+```
+
+## Validation Patterns
+
+### 1. Basic Range Validation
+
+```nickel
+# validators/common-validator.ncl
+let constraints = import "../constraints/constraints.toml" in
+
+{
+  ValidPort = fun port =>
+    if port < constraints.common.server.port.min then
+      std.contract.blame_with_message "Port < 1024" port
+    else if port > constraints.common.server.port.max then
+      std.contract.blame_with_message "Port > 65535" port
+    else
+      port,
+}
+```
+
+### 2. Range Validator (Reusable)
+
+```nickel
+# Reusable validator for any numeric range
+ValidRange = fun min max value =>
+  if value < min then
+    std.contract.blame_with_message "Value < %{std.to_string min}" value
+  else if value > max then
+    std.contract.blame_with_message "Value > %{std.to_string max}" value
+  else
+    value,
+```
+
+### 3. Enum Validation
+
+```nickel
+{
+  ValidStorageBackend = fun backend =>
+    if backend != 'filesystem &&
+       backend != 'rocksdb &&
+       backend != 'surrealdb &&
+       backend != 'postgres then
+      std.contract.blame_with_message "Invalid backend" backend
+    else
+      backend,
+}
+```
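+
+Patterns like these can be exercised from the shell before they are wired into a config. A sketch reusing the `nickel eval -c` form from the Testing Validators section below, against the `ValidRange` definition that common-validator.ncl exports (see Common Validators):
+
+```bash
+# 5 is in range and echoes back; 0 or 11 should abort with the range message
+nickel eval -c 'let v = import "validators/common-validator.ncl" in v.ValidRange 1 10 5'
+```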
String Validation
+
+```nickel
+{
+  ValidNonEmptyString = fun s =>
+    if s == "" then
+      std.contract.blame_with_message "Cannot be empty" s
+    else
+      s,
+
+  ValidWorkspaceName = fun name =>
+    if std.string.is_match "^[a-z0-9_-]+$" name then
+      name
+    else
+      std.contract.blame_with_message "Invalid workspace name" name,
+}
+```
+
+## Common Validators
+
+### common-validator.ncl
+
+```nickel
+let constraints = import "../constraints/constraints.toml" in
+
+{
+  # Port validation
+  ValidPort = fun port =>
+    if port < constraints.common.server.port.min then std.fail_with "Port too low"
+    else if port > constraints.common.server.port.max then std.fail_with "Port too high"
+    else port,
+
+  # Positive integer
+  ValidPositiveNumber = fun n =>
+    if n <= 0 then std.fail_with "Must be positive"
+    else n,
+
+  # Non-empty string
+  ValidNonEmptyString = fun s =>
+    if s == "" then std.fail_with "Cannot be empty"
+    else s,
+
+  # Generic range validator
+  ValidRange = fun min max value =>
+    if value < min then std.fail_with "Value below minimum"
+    else if value > max then std.fail_with "Value above maximum"
+    else value,
+}
+```
+
+### resource-validator.ncl
+
+```nickel
+let constraints = import "../constraints/constraints.toml" in
+let common = import "./common-validator.ncl" in
+
+{
+  # Validate CPU cores for deployment mode (dynamic field access on the mode name)
+  ValidCPUCores = fun mode cores =>
+    let limits = constraints.deployment."%{mode}" in
+    common.ValidRange limits.cpu.min limits.cpu.max cores,
+
+  # Validate memory allocation
+  ValidMemory = fun mode memory_mb =>
+    let limits = constraints.deployment."%{mode}" in
+    common.ValidRange limits.memory_mb.min limits.memory_mb.max memory_mb,
+}
+```
+
+## Service-Specific Validators
+
+### orchestrator-validator.ncl
+
+```nickel
+let constraints = import "../constraints/constraints.toml" in
+let common = import "./common-validator.ncl" in
+
+{
+  # Validate worker count
+  ValidWorkers = fun workers =>
+    common.ValidRange
+      constraints.orchestrator.workers.min
+      constraints.orchestrator.workers.max
+      workers,
+
+  # Validate queue concurrency
+  ValidConcurrentTasks = fun tasks =>
+    common.ValidRange
+      constraints.orchestrator.queue.concurrent_tasks.min
+      constraints.orchestrator.queue.concurrent_tasks.max
+      tasks,
+
+  # Validate batch parallelism
+  ValidParallelLimit = fun limit =>
+    common.ValidRange
+      constraints.orchestrator.batch.parallel_limit.min
+      constraints.orchestrator.batch.parallel_limit.max
+      limit,
+
+  # Validate task timeout (ms)
+  ValidTaskTimeout = fun timeout =>
+    if timeout < 1000 then std.fail_with "Timeout < 1 second"
+    else if timeout > 86400000 then std.fail_with "Timeout > 24 hours"
+    else timeout,
+}
+```
+
+### control-center-validator.ncl
+
+```nickel
+{
+  # JWT token expiration
+  ValidTokenExpiration = fun seconds =>
+    if seconds < 300 then std.fail_with "Token expiration < 5 min"
+    else if seconds > 604800 then std.fail_with "Token expiration > 7 days"
+    else seconds,
+
+  # Rate limit threshold
+  ValidRateLimit = fun requests_per_minute =>
+    if requests_per_minute < 10 then std.fail_with "Rate limit too low"
+    else if requests_per_minute > 10000 then std.fail_with "Rate limit too high"
+    else requests_per_minute,
+}
+```
+
+### mcp-server-validator.ncl
+
+```nickel
+{
+  # Max concurrent tool executions
+  ValidConcurrentTools = fun count =>
+    if count < 1 then std.fail_with "Must allow >= 1 concurrent"
+    else if count > 20 then std.fail_with "Max 20 concurrent tools"
+    else count,
+
+  # Max resource size
+  ValidMaxResourceSize = fun bytes =>
+    if bytes < 1048576 then std.fail_with "Min 1 MB"
+    else if bytes > 1073741824 then std.fail_with "Max 1 GB"
+    else bytes,
+}
+```
+
+## Composition 
with Configs
+
+Validators are applied in config files:
+
+```nickel
+# configs/orchestrator.solo.ncl
+let validators = import "../validators/orchestrator-validator.ncl" in
+
+{
+  orchestrator = {
+    server.workers = validators.ValidWorkers 2,                      # Validated
+    queue.max_concurrent_tasks = validators.ValidConcurrentTasks 3,  # Validated
+  },
+}
+```
+
+Validation happens at:
+1. **Config composition** - When config is evaluated
+2. **Nickel typecheck** - When config is typechecked
+3. **Form submission** - When TypeDialog form is submitted (constraints)
+4. **TOML export** - When Nickel is exported to TOML
+
+## Error Handling
+
+### Validation Errors
+
+```
+# If validation fails during config evaluation:
+# Error: Port too high
+```
+
+### Meaningful Messages
+
+Always provide context in error messages:
+
+```nickel
+# Bad
+std.contract.blame "Invalid" value
+
+# Good
+std.contract.blame_with_message "Port must be 1024-65535, got %{std.to_string value}" port
+```
+
+## Best Practices
+
+1. **Reuse common validators** - Build from common-validator.ncl
+2. **Name clearly** - Prefix with "Valid" (ValidPort, ValidWorkers, etc.)
+3. **Error messages** - Include valid range or enum in message
+4. **Test edge cases** - Verify min/max boundary values
+5. **Document assumptions** - Why a constraint exists
+
+## Testing Validators
+
+```bash
+# Test a single validator
+nickel eval -c 'let v = import "validators/orchestrator-validator.ncl" in v.ValidWorkers 2'
+
+# Test config with validators
+nickel typecheck provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+
+# Evaluate config (runs validators)
+nickel eval provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+
+# Export to TOML (validates during export)
+nickel export --format toml provisioning/.typedialog/provisioning/platform/configs/orchestrator.solo.ncl
+```
+
+## Adding a New Validator
+
+1. **Create validator function** in appropriate file:
+
+   ```nickel
+   ValidMyValue = fun value =>
+     if value < minimum then std.fail_with "Too low"
+     else if value > maximum then std.fail_with "Too high"
+     else value,
+   ```
+
+2. **Add constraint** to constraints.toml if needed:
+
+   ```toml
+   [service.feature.my_value]
+   min = 1
+   max = 100
+   ```
+
+3. **Use in config**:
+
+   ```nickel
+   my_value = validators.ValidMyValue 50,
+   ```
+
+4. **Add form constraint** (if interactive):
+
+   ```toml
+   [[elements]]
+   name = "my_value"
+   min = "${constraint.service.feature.my_value.min}"
+   max = "${constraint.service.feature.my_value.max}"
+   ```
+
+5. 
**Test**: + + ```bash + nickel typecheck configs/service.mode.ncl + ``` + +--- + +**Version**: 1.0.0 +**Last Updated**: 2025-01-05 \ No newline at end of file diff --git a/schemas/platform/values/README.md b/schemas/platform/values/README.md index 7301990..82e09a5 100644 --- a/schemas/platform/values/README.md +++ b/schemas/platform/values/README.md @@ -1 +1,311 @@ -# Values\n\nUser configuration files for provisioning platform services (gitignored).\n\n## Purpose\n\nThe values directory stores:\n- **User configurations** - Service-specific settings for each deployment mode\n- **Generated Nickel configs** - Output from TypeDialog configuration wizard\n- **Customizations** - User-specific overrides to defaults\n- **Runtime data** - Persisted configuration state\n\n## File Organization\n\n```\nvalues/\n├── .gitignore # Ignore *.ncl user configs\n├── README.md # This file\n├── orchestrator.solo.ncl # User config (gitignored)\n├── orchestrator.multiuser.ncl\n├── orchestrator.cicd.ncl\n├── orchestrator.enterprise.ncl\n├── control-center.solo.ncl\n├── control-center.multiuser.ncl\n├── control-center.cicd.ncl\n├── control-center.enterprise.ncl\n├── mcp-server.solo.ncl\n├── mcp-server.multiuser.ncl\n├── mcp-server.cicd.ncl\n├── mcp-server.enterprise.ncl\n├── installer.solo.ncl\n├── installer.multiuser.ncl\n├── installer.cicd.ncl\n├── installer.enterprise.ncl\n└── orchestrator.example.ncl # Example template (tracked)\n```\n\n## Configuration Files\n\nEach config file (`{service}.{mode}.ncl`) is:\n- **Generated by TypeDialog** - Via `configure.nu` wizard\n- **User-specific** - Contains customizations for that environment\n- **Gitignored** - NOT tracked in version control\n- **Runtime data** - Created/updated by scripts and forms\n\nExample:\n\n```\n# values/orchestrator.solo.ncl (auto-generated, user-editable)\n{\n orchestrator = {\n workspace = {\n name = "my-workspace",\n path = "/home/user/workspace",\n enabled = true,\n },\n server = {\n host = "127.0.0.1",\n port = 9090,\n workers = 2,\n },\n storage = {\n backend = 'filesystem,\n path = "/home/user/.provisioning/data",\n },\n },\n}\n```\n\n## .gitignore Pattern\n\n```\n# values/.gitignore\n*.ncl # Ignore all Nickel config files (user-specific)\n!*.example.ncl # EXCEPT example files (tracked for documentation)\n```\n\nThis ensures:\n- User configs (`orchestrator.solo.ncl`) are NOT committed\n- Example configs (`orchestrator.example.ncl`) ARE committed\n- Each user has their own configs without merge conflicts\n\n## Example Template\n\n`orchestrator.example.ncl` provides a documented template:\n\n```\n# orchestrator.example.ncl\n# Example configuration for Orchestrator service\n# Copy to orchestrator.{mode}.ncl and customize for your environment\n\n{\n orchestrator = {\n # Workspace Configuration\n workspace = {\n # Name of the workspace\n name = "default",\n\n # Absolute path to workspace directory\n path = "/var/lib/provisioning/orchestrator",\n\n # Enable this workspace\n enabled = true,\n\n # Allow serving multiple workspaces\n multi_workspace = false,\n },\n\n # HTTP Server Configuration\n server = {\n # Bind address (127.0.0.1 for local only, 0.0.0.0 for network)\n host = "127.0.0.1",\n\n # Listen port\n port = 9090,\n\n # Worker thread count\n workers = 4,\n\n # Keep-alive timeout (seconds)\n keep_alive = 75,\n },\n\n # Storage Configuration\n storage = {\n # Backend: 'filesystem | 'rocksdb | 'surrealdb | 'postgres\n backend = 'filesystem,\n\n # Path for filesystem/rocksdb storage\n path = "/var/lib/provisioning/orchestrator/data",\n 
},\n\n # Queue Configuration\n queue = {\n # Maximum concurrent tasks\n max_concurrent_tasks = 5,\n\n # Retry attempts for failed tasks\n retry_attempts = 3,\n\n # Delay between retries (milliseconds)\n retry_delay = 5000,\n\n # Task execution timeout (milliseconds)\n task_timeout = 3600000,\n },\n },\n}\n```\n\n## Configuration Workflow\n\n### 1. Generate Initial Config\n\n```\nnu scripts/configure.nu orchestrator solo\n```\n\nCreates `values/orchestrator.solo.ncl` from form input.\n\n### 2. Edit Configuration\n\n```\n# Manually edit if needed\nvi values/orchestrator.solo.ncl\n\n# Or reconfigure with wizard\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n### 3. Validate Configuration\n\n```\nnu scripts/validate-config.nu values/orchestrator.solo.ncl\n```\n\n### 4. Generate TOML for Services\n\n```\nnu scripts/generate-configs.nu orchestrator solo\n```\n\nExports to `provisioning/platform/config/orchestrator.solo.toml` (consumed by Rust services).\n\n## Configuration Composition\n\nUser configs are composed with defaults during generation:\n\n```\ndefaults/orchestrator-defaults.ncl (base values)\n ↓ &\nvalues/orchestrator.solo.ncl (user customizations)\n ↓\nconfigs/orchestrator.solo.ncl (final generated config)\n ↓\nprovisioning/platform/config/orchestrator.solo.toml (Rust service config)\n```\n\n## Best Practices\n\n1. **Start with example** - Copy `orchestrator.example.ncl` as template\n2. **Document changes** - Add inline comments explaining customizations\n3. **Use TypeDialog** - Let wizard handle configuration for you\n4. **Validate before deploying** - Always run `validate-config.nu`\n5. **Keep defaults** - Only override what you need to change\n6. **Backup important configs** - Save known-good configurations\n\n## Sharing Configurations\n\nSince user configs are gitignored, sharing requires:\n\n### Option 1: Share via File\n\n```\n# Export current config\ncat values/orchestrator.solo.ncl > /tmp/orchestrator-config.ncl\n\n# Import on another system\ncp /tmp/orchestrator-config.ncl values/orchestrator.solo.ncl\n```\n\n### Option 2: Use Example Template\nShare setup instructions instead of raw config:\n\n```\n# Document the setup steps\ncat > SETUP.md << EOF\n1. Run: nu scripts/configure.nu orchestrator solo\n2. Set workspace path: /shared/workspace\n3. Set storage backend: postgres\n4. 
Set server workers: 8\nEOF\n```\n\n### Option 3: Store in Separate Repo\nFor team configs, use a separate private repository:\n\n```\n# Clone team configs\ngit clone private-repo/provisioning-configs values/\n\n# Use team configs\ncp values/team-orchestrator-solo.ncl values/orchestrator.solo.ncl\n```\n\n## File Permissions\n\nUser config files should have restricted permissions:\n\n```\n# Secure config file (if contains secrets)\nchmod 600 values/orchestrator.solo.ncl\n```\n\n## Recovery\n\nIf you accidentally delete a user config:\n\n### Option 1: Regenerate from TypeDialog\n\n```\nnu scripts/configure.nu orchestrator solo\n```\n\n### Option 2: Copy from Backup\n\n```\ncp /backup/provisioning-values/orchestrator.solo.ncl values/\n```\n\n### Option 3: Use Example as Base\n\n```\ncp examples/orchestrator-solo.ncl values/orchestrator.solo.ncl\n# Customize as needed\nnu scripts/configure.nu orchestrator solo --backend web\n```\n\n## Troubleshooting\n\n### Config File Missing\n\n```\n# Regenerate from defaults\nnu scripts/configure.nu orchestrator solo\n```\n\n### Config Won't Validate\n\n```\n# Check for syntax errors\nnickel eval values/orchestrator.solo.ncl\n\n# Compare with example\ndiff examples/orchestrator-solo.ncl values/orchestrator.solo.ncl\n```\n\n### Changes Not Taking Effect\n\n```\n# Regenerate TOML from Nickel\nnu scripts/generate-configs.nu orchestrator solo\n\n# Verify TOML was updated\nls -la provisioning/platform/config/orchestrator.solo.toml\n```\n\n---\n\n**Version**: 1.0.0\n**Last Updated**: 2025-01-05
+# Values
+
+User configuration files for provisioning platform services (gitignored).
+
+## Purpose
+
+The values directory stores:
+- **User configurations** - Service-specific settings for each deployment mode
+- **Generated Nickel configs** - Output from TypeDialog configuration wizard
+- **Customizations** - User-specific overrides to defaults
+- **Runtime data** - Persisted configuration state
+
+## File Organization
+
+```
+values/
+├── .gitignore                     # Ignore *.ncl user configs
+├── README.md                      # This file
+├── orchestrator.solo.ncl          # User config (gitignored)
+├── orchestrator.multiuser.ncl
+├── orchestrator.cicd.ncl
+├── orchestrator.enterprise.ncl
+├── control-center.solo.ncl
+├── control-center.multiuser.ncl
+├── control-center.cicd.ncl
+├── control-center.enterprise.ncl
+├── mcp-server.solo.ncl
+├── mcp-server.multiuser.ncl
+├── mcp-server.cicd.ncl
+├── mcp-server.enterprise.ncl
+├── installer.solo.ncl
+├── installer.multiuser.ncl
+├── installer.cicd.ncl
+├── installer.enterprise.ncl
+└── orchestrator.example.ncl       # Example template (tracked)
+```
+
+## Configuration Files
+
+Each config file (`{service}.{mode}.ncl`) is:
+- **Generated by TypeDialog** - Via `configure.nu` wizard
+- **User-specific** - Contains customizations for that environment
+- **Gitignored** - NOT tracked in version control
+- **Runtime data** - Created/updated by scripts and forms
+
+Example:
+
+```nickel
+# values/orchestrator.solo.ncl (auto-generated, user-editable)
+{
+  orchestrator = {
+    workspace = {
+      name = "my-workspace",
+      path = "/home/user/workspace",
+      enabled = true,
+    },
+    server = {
+      host = "127.0.0.1",
+      port = 9090,
+      workers = 2,
+    },
+    storage = {
+      backend = 'filesystem,
+      path = "/home/user/.provisioning/data",
+    },
+  },
+}
+```
+
+## .gitignore Pattern
+
+```
+# values/.gitignore
+# Ignore all Nickel config files (user-specific)...
+*.ncl
+# ...EXCEPT example files (tracked for documentation)
+!*.example.ncl
+```
+
+This ensures:
+- User configs (`orchestrator.solo.ncl`) are 
NOT committed
+- Example configs (`orchestrator.example.ncl`) ARE committed
+- Each user has their own configs without merge conflicts
+
+## Example Template
+
+`orchestrator.example.ncl` provides a documented template:
+
+```nickel
+# orchestrator.example.ncl
+# Example configuration for Orchestrator service
+# Copy to orchestrator.{mode}.ncl and customize for your environment
+
+{
+  orchestrator = {
+    # Workspace Configuration
+    workspace = {
+      # Name of the workspace
+      name = "default",
+
+      # Absolute path to workspace directory
+      path = "/var/lib/provisioning/orchestrator",
+
+      # Enable this workspace
+      enabled = true,
+
+      # Allow serving multiple workspaces
+      multi_workspace = false,
+    },
+
+    # HTTP Server Configuration
+    server = {
+      # Bind address (127.0.0.1 for local only, 0.0.0.0 for network)
+      host = "127.0.0.1",
+
+      # Listen port
+      port = 9090,
+
+      # Worker thread count
+      workers = 4,
+
+      # Keep-alive timeout (seconds)
+      keep_alive = 75,
+    },
+
+    # Storage Configuration
+    storage = {
+      # Backend: 'filesystem | 'rocksdb | 'surrealdb | 'postgres
+      backend = 'filesystem,
+
+      # Path for filesystem/rocksdb storage
+      path = "/var/lib/provisioning/orchestrator/data",
+    },
+
+    # Queue Configuration
+    queue = {
+      # Maximum concurrent tasks
+      max_concurrent_tasks = 5,
+
+      # Retry attempts for failed tasks
+      retry_attempts = 3,
+
+      # Delay between retries (milliseconds)
+      retry_delay = 5000,
+
+      # Task execution timeout (milliseconds)
+      task_timeout = 3600000,
+    },
+  },
+}
+```
+
+## Configuration Workflow
+
+### 1. Generate Initial Config
+
+```bash
+nu scripts/configure.nu orchestrator solo
+```
+
+Creates `values/orchestrator.solo.ncl` from form input.
+
+### 2. Edit Configuration
+
+```bash
+# Manually edit if needed
+vi values/orchestrator.solo.ncl
+
+# Or reconfigure with wizard
+nu scripts/configure.nu orchestrator solo --backend web
+```
+
+### 3. Validate Configuration
+
+```bash
+nu scripts/validate-config.nu values/orchestrator.solo.ncl
+```
+
+### 4. Generate TOML for Services
+
+```bash
+nu scripts/generate-configs.nu orchestrator solo
+```
+
+Exports to `provisioning/platform/config/orchestrator.solo.toml` (consumed by Rust services).
+
+## Configuration Composition
+
+User configs are composed with defaults during generation:
+
+```
+defaults/orchestrator-defaults.ncl                    (base values)
+        ↓ &
+values/orchestrator.solo.ncl                          (user customizations)
+        ↓
+configs/orchestrator.solo.ncl                         (final generated config)
+        ↓
+provisioning/platform/config/orchestrator.solo.toml   (Rust service config)
+```
+
+## Best Practices
+
+1. **Start with example** - Copy `orchestrator.example.ncl` as template
+2. **Document changes** - Add inline comments explaining customizations
+3. **Use TypeDialog** - Let wizard handle configuration for you
+4. **Validate before deploying** - Always run `validate-config.nu`
+5. **Keep defaults** - Only override what you need to change
+6. **Backup important configs** - Save known-good configurations
+
+## Sharing Configurations
+
+Since user configs are gitignored, sharing requires:
+
+### Option 1: Share via File
+
+```bash
+# Export current config
+cat values/orchestrator.solo.ncl > /tmp/orchestrator-config.ncl
+
+# Import on another system
+cp /tmp/orchestrator-config.ncl values/orchestrator.solo.ncl
+```
+
+### Option 2: Use Example Template
+
+Share setup instructions instead of raw config:
+
+```bash
+# Document the setup steps
+cat > SETUP.md << EOF
+1. Run: nu scripts/configure.nu orchestrator solo
+2. Set workspace path: /shared/workspace
+3. 
Set storage backend: postgres
+4. Set server workers: 8
+EOF
+```
+
+### Option 3: Store in Separate Repo
+
+For team configs, use a separate private repository:
+
+```bash
+# Clone team configs
+git clone private-repo/provisioning-configs values/
+
+# Use team configs
+cp values/team-orchestrator-solo.ncl values/orchestrator.solo.ncl
+```
+
+## File Permissions
+
+User config files should have restricted permissions:
+
+```bash
+# Secure config file (if contains secrets)
+chmod 600 values/orchestrator.solo.ncl
+```
+
+## Recovery
+
+If you accidentally delete a user config:
+
+### Option 1: Regenerate from TypeDialog
+
+```bash
+nu scripts/configure.nu orchestrator solo
+```
+
+### Option 2: Copy from Backup
+
+```bash
+cp /backup/provisioning-values/orchestrator.solo.ncl values/
+```
+
+### Option 3: Use Example as Base
+
+```bash
+cp examples/orchestrator-solo.ncl values/orchestrator.solo.ncl
+# Customize as needed
+nu scripts/configure.nu orchestrator solo --backend web
+```
+
+## Troubleshooting
+
+### Config File Missing
+
+```bash
+# Regenerate from defaults
+nu scripts/configure.nu orchestrator solo
+```
+
+### Config Won't Validate
+
+```bash
+# Check for syntax errors
+nickel eval values/orchestrator.solo.ncl
+
+# Compare with example
+diff examples/orchestrator-solo.ncl values/orchestrator.solo.ncl
+```
+
+### Changes Not Taking Effect
+
+```bash
+# Regenerate TOML from Nickel
+nu scripts/generate-configs.nu orchestrator solo
+
+# Verify TOML was updated
+ls -la provisioning/platform/config/orchestrator.solo.toml
+```
+
+---
+
+**Version**: 1.0.0
+**Last Updated**: 2025-01-05
\ No newline at end of file
diff --git a/templates/workspace/example/README.md b/templates/workspace/example/README.md
index 2099263..5369da6 100644
--- a/templates/workspace/example/README.md
+++ b/templates/workspace/example/README.md
@@ -1 +1,212 @@
-# Example Infrastructure Template\n\nThis is a complete, ready-to-deploy example of a simple web application stack.\n\n## What's Included\n\n- **2 Web servers** - Load-balanced frontend\n- **1 Database server** - Backend database\n- **Complete configuration** - Ready to deploy with minimal changes\n- **Usage instructions** - Step-by-step deployment guide\n\n## Architecture\n\n```\n┌─────────────────────────────────────────┐\n│ Internet / Load Balancer │\n└─────────────┬───────────────────────────┘\n │\n ┌───────┴───────┐\n │ │\n┌─────▼─────┐ ┌────▼──────┐\n│ demo-web-01│ │demo-web-02│\n│ (Public) │ │ (Public) │\n└─────┬──────┘ └────┬──────┘\n │ │\n └───────┬───────┘\n │\n │ Private Network\n │\n ┌─────▼──────┐\n │ demo-db-01 │\n │ (Private) │\n └────────────┘\n```\n\n## Quick Start\n\n### 1. Load Required Provider\n\n```\ncd infra/\n\n# Load your cloud provider\nprovisioning mod load providers . upcloud\n# OR\nprovisioning mod load providers . aws\n```\n\n### 2. Configure Provider Settings\n\nEdit `servers.k` and uncomment provider-specific settings:\n\n**UpCloud example:**\n\n```\nplan = "1xCPU-2GB" # Web servers\n# plan = "2xCPU-4GB" # Database server (larger)\nstorage_size = 25 # Disk size in GB\n```\n\n**AWS example:**\n\n```\ninstance_type = "t3.small" # Web servers\n# instance_type = "t3.medium" # Database server\nstorage_size = 25\n```\n\n### 3. Load Optional Task Services\n\n```\n# For container support\nprovisioning mod load taskservs . containerd\n\n# For additional services\nprovisioning mod load taskservs . docker redis nginx\n```\n\n### 4. 
Deploy\n\n```\n# Test configuration first\nkcl run servers.k\n\n# Dry-run to see what will be created\nprovisioning s create --infra --check\n\n# Deploy the infrastructure\nprovisioning s create --infra \n\n# Monitor deployment\nwatch provisioning s list --infra \n```\n\n### 5. Verify Deployment\n\n```\n# List all servers\nprovisioning s list --infra \n\n# SSH into web server\nprovisioning s ssh demo-web-01\n\n# Check database server\nprovisioning s ssh demo-db-01\n```\n\n## Configuration Details\n\n### Web Servers (demo-web-01, demo-web-02)\n\n- **Networking**: Public IPv4 + Private IPv4\n- **Purpose**: Frontend application servers\n- **Load balancing**: Configure externally\n- **Resources**: Minimal (1-2 CPU, 2-4GB RAM)\n\n### Database Server (demo-db-01)\n\n- **Networking**: Private IPv4 only (no public access)\n- **Purpose**: Backend database\n- **Security**: Isolated on private network\n- **Resources**: Medium (2-4 CPU, 4-8GB RAM)\n\n## Next Steps\n\n### Application Deployment\n\n1. **Deploy application code** - Use SSH or CI/CD\n2. **Configure web servers** - Set up Nginx/Apache\n3. **Set up database** - Install PostgreSQL/MySQL\n4. **Configure connectivity** - Connect web servers to database\n\n### Security Hardening\n\n1. **Firewall rules** - Lock down server access\n2. **SSH keys** - Disable password auth\n3. **Database access** - Restrict to web servers only\n4. **SSL certificates** - Set up HTTPS\n\n### Monitoring & Backup\n\n1. **Monitoring** - Set up metrics collection\n2. **Logging** - Configure centralized logging\n3. **Backups** - Set up database backups\n4. **Alerts** - Configure alerting\n\n### Scaling\n\n1. **Add more web servers** - Copy web-02 definition\n2. **Database replication** - Add read replicas\n3. **Load balancer** - Configure external LB\n4. **Auto-scaling** - Set up scaling policies\n\n## Customization\n\n### Change Server Count\n\n```\n# Add more web servers\n{\n hostname = "demo-web-03"\n # ... copy configuration from web-01\n}\n```\n\n### Change Resource Sizes\n\n```\n# Web servers\nplan = "2xCPU-4GB" # Increase resources\n\n# Database\nplan = "4xCPU-8GB" # More resources for DB\nstorage_size = 100 # Larger disk\n```\n\n### Add Task Services\n\n```\ntaskservs = [\n { name = "containerd", profile = "default" }\n { name = "docker", profile = "default" }\n { name = "redis", profile = "default" }\n]\n```\n\n## Common Issues\n\n### Deployment Fails\n\n- Check provider credentials\n- Verify network configuration\n- Check resource quotas\n\n### Can't SSH\n\n- Verify SSH key is loaded\n- Check firewall rules\n- Ensure server is running\n\n### Database Connection\n\n- Verify private network\n- Check firewall rules between web and DB\n- Test connectivity from web servers\n\n## Template Characteristics\n\n- **Complexity**: Medium\n- **Servers**: 3 (2 web + 1 database)\n- **Pre-configured modules**: Provider only\n- **Best for**: Quick demos, learning deployments, testing infrastructure code +# Example Infrastructure Template + +This is a complete, ready-to-deploy example of a simple web application stack. 
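+
+If you just want to see it run, the short version is two commands, assuming a provider account is already configured (full steps under Quick Start below):
+
+```bash
+# Sketch: load a provider, then dry-run the stack before creating anything
+provisioning mod load providers . upcloud
+provisioning s create --infra --check
+```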
+
+## What's Included
+
+- **2 Web servers** - Load-balanced frontend
+- **1 Database server** - Backend database
+- **Complete configuration** - Ready to deploy with minimal changes
+- **Usage instructions** - Step-by-step deployment guide
+
+## Architecture
+
+```
+┌─────────────────────────────────────────┐
+│     Internet / Load Balancer            │
+└─────────────┬───────────────────────────┘
+              │
+      ┌───────┴───────┐
+      │               │
+┌─────▼──────┐   ┌────▼──────┐
+│ demo-web-01│   │demo-web-02│
+│  (Public)  │   │ (Public)  │
+└─────┬──────┘   └────┬──────┘
+      │               │
+      └───────┬───────┘
+              │
+              │ Private Network
+              │
+       ┌─────▼──────┐
+       │ demo-db-01 │
+       │ (Private)  │
+       └────────────┘
+```
+
+## Quick Start
+
+### 1. Load Required Provider
+
+```bash
+cd infra/
+
+# Load your cloud provider
+provisioning mod load providers . upcloud
+# OR
+provisioning mod load providers . aws
+```
+
+### 2. Configure Provider Settings
+
+Edit `servers.k` and uncomment provider-specific settings:
+
+**UpCloud example:**
+
+```kcl
+plan = "1xCPU-2GB"       # Web servers
+# plan = "2xCPU-4GB"     # Database server (larger)
+storage_size = 25        # Disk size in GB
+```
+
+**AWS example:**
+
+```kcl
+instance_type = "t3.small"      # Web servers
+# instance_type = "t3.medium"   # Database server
+storage_size = 25
+```
+
+### 3. Load Optional Task Services
+
+```bash
+# For container support
+provisioning mod load taskservs . containerd
+
+# For additional services
+provisioning mod load taskservs . docker redis nginx
+```
+
+### 4. Deploy
+
+```bash
+# Test configuration first
+kcl run servers.k
+
+# Dry-run to see what will be created
+provisioning s create --infra --check
+
+# Deploy the infrastructure
+provisioning s create --infra 
+
+# Monitor deployment
+watch provisioning s list --infra 
+```
+
+### 5. Verify Deployment
+
+```bash
+# List all servers
+provisioning s list --infra 
+
+# SSH into web server
+provisioning s ssh demo-web-01
+
+# Check database server
+provisioning s ssh demo-db-01
+```
+
+## Configuration Details
+
+### Web Servers (demo-web-01, demo-web-02)
+
+- **Networking**: Public IPv4 + Private IPv4
+- **Purpose**: Frontend application servers
+- **Load balancing**: Configure externally
+- **Resources**: Minimal (1-2 CPU, 2-4GB RAM)
+
+### Database Server (demo-db-01)
+
+- **Networking**: Private IPv4 only (no public access)
+- **Purpose**: Backend database
+- **Security**: Isolated on private network
+- **Resources**: Medium (2-4 CPU, 4-8GB RAM)
+
+## Next Steps
+
+### Application Deployment
+
+1. **Deploy application code** - Use SSH or CI/CD
+2. **Configure web servers** - Set up Nginx/Apache
+3. **Set up database** - Install PostgreSQL/MySQL
+4. **Configure connectivity** - Connect web servers to database
+
+### Security Hardening
+
+1. **Firewall rules** - Lock down server access
+2. **SSH keys** - Disable password auth
+3. **Database access** - Restrict to web servers only
+4. **SSL certificates** - Set up HTTPS
+
+### Monitoring & Backup
+
+1. **Monitoring** - Set up metrics collection
+2. **Logging** - Configure centralized logging
+3. **Backups** - Set up database backups
+4. **Alerts** - Configure alerting
+
+### Scaling
+
+1. **Add more web servers** - Copy web-02 definition
+2. **Database replication** - Add read replicas
+3. **Load balancer** - Configure external LB
+4. **Auto-scaling** - Set up scaling policies
+
+## Customization
+
+### Change Server Count
+
+```kcl
+# Add more web servers
+{
+  hostname = "demo-web-03"
+  # ... 
copy configuration from web-01
+}
+```
+
+### Change Resource Sizes
+
+```kcl
+# Web servers
+plan = "2xCPU-4GB"      # Increase resources
+
+# Database
+plan = "4xCPU-8GB"      # More resources for DB
+storage_size = 100      # Larger disk
+```
+
+### Add Task Services
+
+```kcl
+taskservs = [
+  { name = "containerd", profile = "default" }
+  { name = "docker", profile = "default" }
+  { name = "redis", profile = "default" }
+]
+```
+
+## Common Issues
+
+### Deployment Fails
+
+- Check provider credentials
+- Verify network configuration
+- Check resource quotas
+
+### Can't SSH
+
+- Verify SSH key is loaded
+- Check firewall rules
+- Ensure server is running
+
+### Database Connection
+
+- Verify private network
+- Check firewall rules between web and DB
+- Test connectivity from web servers
+
+## Template Characteristics
+
+- **Complexity**: Medium
+- **Servers**: 3 (2 web + 1 database)
+- **Pre-configured modules**: Provider only
+- **Best for**: Quick demos, learning deployments, testing infrastructure code
\ No newline at end of file
diff --git a/templates/workspace/full/README.md b/templates/workspace/full/README.md
index fe3fa18..af15142 100644
--- a/templates/workspace/full/README.md
+++ b/templates/workspace/full/README.md
@@ -1 +1,162 @@
-# Full Infrastructure Template\n\nThis is a comprehensive infrastructure template with multiple server types and advanced configuration examples.\n\n## What's Included\n\n- **Web servers** - 2 frontend web servers\n- **Database server** - Backend database with private networking\n- **Kubernetes control plane** - Control plane node\n- **Kubernetes workers** - 2 worker nodes\n- **Advanced settings** - SSH config, monitoring, backup options\n- **Comprehensive examples** - Multiple server roles and configurations\n\n## Server Inventory\n\n| Hostname | Role | Network | Purpose |\n| ---------- | ------ | --------- | --------- |\n| web-01, web-02 | Web | Public + Private | Frontend application servers |\n| db-01 | Database | Private only | Backend database |\n| k8s-control-01 | K8s Control | Public + Private | Kubernetes control plane |\n| k8s-worker-01, k8s-worker-02 | K8s Worker | Public + Private | Kubernetes compute nodes |\n\n## Quick Start\n\n### 1. Load Required Modules\n\n```\ncd infra/\n\n# Load provider\nprovisioning mod load providers . upcloud\n\n# Load taskservs\nprovisioning mod load taskservs . kubernetes containerd cilium\n\n# Load cluster configurations (optional)\nprovisioning mod load clusters . buildkit\n```\n\n### 2. Customize Configuration\n\nEdit `servers.k`:\n\n**Provider-specific settings:**\n\n```\n# Uncomment and adjust for your provider\nplan = "2xCPU-4GB" # Server size\nstorage_size = 50 # Disk size in GB\n```\n\n**Task services:**\n\n```\n# Uncomment after loading modules\ntaskservs = [\n { name = "kubernetes", profile = "control-plane" }\n { name = "containerd", profile = "default" }\n { name = "cilium", profile = "default" }\n]\n```\n\n**Select servers to deploy:**\n\n```\n# Choose which server groups to deploy\nall_servers = web_servers + db_servers # Web + DB only\n# OR\nall_servers = k8s_control + k8s_workers # Kubernetes cluster only\n# OR\nall_servers = web_servers + db_servers + k8s_control + k8s_workers # Everything\n```\n\n### 3. 
Deploy\n\n```\n# Test configuration\nkcl run servers.k\n\n# Dry-run deployment (recommended)\nprovisioning s create --infra --check\n\n# Deploy selected servers\nprovisioning s create --infra \n\n# Or deploy specific server groups\nprovisioning s create --infra --select web\n```\n\n## Architecture Examples\n\n### Web Application Stack\n\nDeploy web servers + database:\n\n```\nall_servers = web_servers + db_servers\n```\n\n### Kubernetes Cluster\n\nDeploy control plane + workers:\n\n```\nall_servers = k8s_control + k8s_workers\n```\n\n### Complete Infrastructure\n\nDeploy everything:\n\n```\nall_servers = web_servers + db_servers + k8s_control + k8s_workers\n```\n\n## Advanced Configuration\n\n### Network Segmentation\n\n- **Public servers**: web-01, web-02 (public + private networks)\n- **Private servers**: db-01 (private network only)\n- **Hybrid**: k8s nodes (public for API access, private for pod networking)\n\n### Monitoring\n\nMonitoring is pre-configured in settings:\n\n```\nmonitoring = {\n enabled = True\n metrics_port = 9100\n log_aggregation = True\n}\n```\n\n### SSH Configuration\n\nAdvanced SSH settings are included:\n\n```\nssh_config = {\n connect_timeout = 30\n retry_attempts = 3\n compression = True\n}\n```\n\n## Next Steps\n\n1. **Customize server specs** - Adjust CPU, memory, storage\n2. **Configure networking** - Set up firewall rules, load balancers\n3. **Add taskservs** - Uncomment and configure task services\n4. **Set up clusters** - Deploy Kubernetes or container clusters\n5. **Configure monitoring** - Set up metrics and logging\n6. **Implement backup** - Configure backup policies\n\n## Template Characteristics\n\n- **Complexity**: High\n- **Servers**: 6 examples (web, database, k8s)\n- **Pre-configured modules**: Examples for all major components\n- **Best for**: Production deployments, complex architectures, learning advanced patterns +# Full Infrastructure Template + +This is a comprehensive infrastructure template with multiple server types and advanced configuration examples. + +## What's Included + +- **Web servers** - 2 frontend web servers +- **Database server** - Backend database with private networking +- **Kubernetes control plane** - Control plane node +- **Kubernetes workers** - 2 worker nodes +- **Advanced settings** - SSH config, monitoring, backup options +- **Comprehensive examples** - Multiple server roles and configurations + +## Server Inventory + +| Hostname | Role | Network | Purpose | +| ---------- | ------ | --------- | --------- | +| web-01, web-02 | Web | Public + Private | Frontend application servers | +| db-01 | Database | Private only | Backend database | +| k8s-control-01 | K8s Control | Public + Private | Kubernetes control plane | +| k8s-worker-01, k8s-worker-02 | K8s Worker | Public + Private | Kubernetes compute nodes | + +## Quick Start + +### 1. Load Required Modules + +```bash +cd infra/ + +# Load provider +provisioning mod load providers . upcloud + +# Load taskservs +provisioning mod load taskservs . kubernetes containerd cilium + +# Load cluster configurations (optional) +provisioning mod load clusters . buildkit +``` + +### 2. 
Customize Configuration
+
+Edit `servers.k`:
+
+**Provider-specific settings:**
+
+```kcl
+# Uncomment and adjust for your provider
+plan = "2xCPU-4GB"       # Server size
+storage_size = 50        # Disk size in GB
+```
+
+**Task services:**
+
+```kcl
+# Uncomment after loading modules
+taskservs = [
+  { name = "kubernetes", profile = "control-plane" }
+  { name = "containerd", profile = "default" }
+  { name = "cilium", profile = "default" }
+]
+```
+
+**Select servers to deploy:**
+
+```kcl
+# Choose which server groups to deploy
+all_servers = web_servers + db_servers                               # Web + DB only
+# OR
+all_servers = k8s_control + k8s_workers                              # Kubernetes cluster only
+# OR
+all_servers = web_servers + db_servers + k8s_control + k8s_workers   # Everything
+```
+
+### 3. Deploy
+
+```bash
+# Test configuration
+kcl run servers.k
+
+# Dry-run deployment (recommended)
+provisioning s create --infra --check
+
+# Deploy selected servers
+provisioning s create --infra 
+
+# Or deploy specific server groups
+provisioning s create --infra --select web
+```
+
+## Architecture Examples
+
+### Web Application Stack
+
+Deploy web servers + database:
+
+```kcl
+all_servers = web_servers + db_servers
+```
+
+### Kubernetes Cluster
+
+Deploy control plane + workers:
+
+```kcl
+all_servers = k8s_control + k8s_workers
+```
+
+### Complete Infrastructure
+
+Deploy everything:
+
+```kcl
+all_servers = web_servers + db_servers + k8s_control + k8s_workers
+```
+
+## Advanced Configuration
+
+### Network Segmentation
+
+- **Public servers**: web-01, web-02 (public + private networks)
+- **Private servers**: db-01 (private network only)
+- **Hybrid**: k8s nodes (public for API access, private for pod networking)
+
+### Monitoring
+
+Monitoring is pre-configured in settings:
+
+```kcl
+monitoring = {
+    enabled = True
+    metrics_port = 9100
+    log_aggregation = True
+}
+```
+
+### SSH Configuration
+
+Advanced SSH settings are included:
+
+```kcl
+ssh_config = {
+    connect_timeout = 30
+    retry_attempts = 3
+    compression = True
+}
+```
+
+## Next Steps
+
+1. **Customize server specs** - Adjust CPU, memory, storage
+2. **Configure networking** - Set up firewall rules, load balancers
+3. **Add taskservs** - Uncomment and configure task services
+4. **Set up clusters** - Deploy Kubernetes or container clusters
+5. **Configure monitoring** - Set up metrics and logging
+6. **Implement backup** - Configure backup policies
+
+## Template Characteristics
+
+- **Complexity**: High
+- **Servers**: 6 examples (web, database, k8s)
+- **Pre-configured modules**: Examples for all major components
+- **Best for**: Production deployments, complex architectures, learning advanced patterns
\ No newline at end of file
diff --git a/templates/workspace/minimal/README.md b/templates/workspace/minimal/README.md
index 620465b..cb10aa3 100644
--- a/templates/workspace/minimal/README.md
+++ b/templates/workspace/minimal/README.md
@@ -1 +1,59 @@
-# Minimal Infrastructure Template\n\nThis is a minimal infrastructure template with a basic server configuration.\n\n## What's Included\n\n- **Single server definition** - Basic example to customize\n- **Minimal settings** - Essential configuration only\n- **No pre-configured modules** - Load what you need\n\n## Quick Start\n\n### 1. Load Required Modules\n\n```\ncd infra/\n\n# Load a provider\nprovisioning mod load providers . upcloud\n\n# Load taskservs as needed\nprovisioning mod load taskservs . containerd\n```\n\n### 2. 
Customize Configuration\n\nEdit `servers.k`:\n\n- Change server hostname and title\n- Configure network settings\n- Add provider-specific settings (plan, storage, etc.)\n- Add taskservs when ready\n\n### 3. Deploy\n\n```\n# Test configuration\nkcl run servers.k\n\n# Dry-run deployment\nprovisioning s create --infra --check\n\n# Deploy\nprovisioning s create --infra \n```\n\n## Next Steps\n\n- Add more servers to the `example_servers` array\n- Configure taskservs for your servers\n- Set up monitoring and backup\n- Configure firewall rules\n\n## Template Characteristics\n\n- **Complexity**: Low\n- **Servers**: 1 basic example\n- **Pre-configured modules**: None\n- **Best for**: Learning, simple deployments, custom configurations +# Minimal Infrastructure Template + +This is a minimal infrastructure template with a basic server configuration. + +## What's Included + +- **Single server definition** - Basic example to customize +- **Minimal settings** - Essential configuration only +- **No pre-configured modules** - Load what you need + +## Quick Start + +### 1. Load Required Modules + +```bash +cd infra/ + +# Load a provider +provisioning mod load providers . upcloud + +# Load taskservs as needed +provisioning mod load taskservs . containerd +``` + +### 2. Customize Configuration + +Edit `servers.k`: + +- Change server hostname and title +- Configure network settings +- Add provider-specific settings (plan, storage, etc.) +- Add taskservs when ready + +### 3. Deploy + +```bash +# Test configuration +kcl run servers.k + +# Dry-run deployment +provisioning s create --infra --check + +# Deploy +provisioning s create --infra +``` + +## Next Steps + +- Add more servers to the `example_servers` array +- Configure taskservs for your servers +- Set up monitoring and backup +- Configure firewall rules + +## Template Characteristics + +- **Complexity**: Low +- **Servers**: 1 basic example +- **Pre-configured modules**: None +- **Best for**: Learning, simple deployments, custom configurations \ No newline at end of file diff --git a/templates/workspaces/kubernetes/setup.md b/templates/workspaces/kubernetes/setup.md index b0a3166..b422582 100644 --- a/templates/workspaces/kubernetes/setup.md +++ b/templates/workspaces/kubernetes/setup.md @@ -1 +1,167 @@ -# Kubernetes Workspace Setup\n\nThis template provides a complete Kubernetes cluster configuration using the package-based provisioning system.\n\n## Prerequisites\n\n1. Core provisioning package installed:\n\n ```bash\n kcl-packager.nu install --version latest\n ```\n\n2. Module loader CLI available:\n\n ```bash\n module-loader --help\n ```\n\n## Setup Steps\n\n### 1. Initialize Workspace\n\n```\n# Create workspace from template\ncp -r provisioning/templates/workspaces/kubernetes ./my-k8s-cluster\ncd my-k8s-cluster\n\n# Initialize directory structure\nworkspace-init.nu . init\n```\n\n### 2. Load Required Taskservs\n\n```\n# Load Kubernetes components\nmodule-loader load taskservs . [kubernetes, cilium, containerd]\n\n# Verify loading\nmodule-loader list taskservs .\n```\n\n### 3. Load Cloud Provider\n\n```\n# For UpCloud\nmodule-loader load providers . [upcloud]\n\n# For AWS\nmodule-loader load providers . [aws]\n\n# For local development\nmodule-loader load providers . [local]\n```\n\n### 4. Configure Infrastructure\n\n1. Edit `servers.k` to uncomment the import statements and taskserv configurations\n2. Adjust server specifications, hostnames, and labels as needed\n3. Configure provider-specific settings in the generated provider files\n\n### 5. 
Validate Configuration\n\n```\n# Validate KCL configuration\nkcl run servers.k\n\n# Validate workspace\nmodule-loader validate .\n```\n\n### 6. Deploy Cluster\n\n```\n# Create servers\nprovisioning server create --infra . --check\n\n# Install taskservs\nprovisioning taskserv create kubernetes --infra .\nprovisioning taskserv create cilium --infra .\nprovisioning taskserv create containerd --infra .\n\n# Verify cluster\nkubectl get nodes\n```\n\n## Configuration Details\n\n### Server Roles\n\n- **k8s-master-01**: Control plane node running the Kubernetes API server, etcd, and scheduler\n- **k8s-worker-01/02**: Worker nodes running kubelet and container runtime\n\n### Taskservs\n\n- **containerd**: Container runtime for Kubernetes\n- **kubernetes**: Core Kubernetes components (kubelet, kubeadm, kubectl)\n- **cilium**: CNI (Container Network Interface) for pod networking\n\n### Network Configuration\n\n- All nodes have public IPv4 for initial setup\n- Cilium provides internal pod-to-pod networking\n- SSH access on port 22 for management\n\n## Customization\n\n### Adding More Workers\n\nCopy the worker node configuration in `servers.k` and modify:\n\n- `hostname`\n- `title`\n- Any provider-specific settings\n\n### Different Container Runtime\n\nReplace `containerd` taskserv with:\n\n- `crio`: CRI-O runtime\n- `docker`: Docker runtime (not recommended for production)\n\n### Different CNI\n\nReplace `cilium` taskserv with:\n\n- `calico`: Calico CNI\n- `flannel`: Flannel CNI\n- Built-in kubenet (remove CNI taskserv)\n\n### Storage\n\nAdd storage taskservs:\n\n```\nmodule-loader load taskservs . [rook-ceph, mayastor]\n```\n\nThen add to server taskserv configurations:\n\n```\ntaskservs = [\n { name = "containerd", profile = "default" },\n { name = "kubernetes", profile = "worker" },\n { name = "cilium", profile = "worker" },\n { name = "rook-ceph", profile = "default" }\n]\n```\n\n## Troubleshooting\n\n### Module Import Errors\n\nIf you see import errors like "module not found":\n\n1. Verify modules are loaded: `module-loader list taskservs .`\n2. Check generated import files: `ls .taskservs/`\n3. Reload modules if needed: `module-loader load taskservs . [kubernetes, cilium, containerd]`\n\n### Provider Configuration\n\nCheck provider-specific configuration in `.providers/` directory after loading.\n\n### Kubernetes Setup Issues\n\n1. Check taskserv installation logs in `./tmp/k8s-deployment/`\n2. Verify all nodes are reachable via SSH\n3. Check firewall rules for Kubernetes ports (6443, 10250, etc.) +# Kubernetes Workspace Setup + +This template provides a complete Kubernetes cluster configuration using the package-based provisioning system. + +## Prerequisites + +1. Core provisioning package installed: + + ```bash + kcl-packager.nu install --version latest + ``` + +2. Module loader CLI available: + + ```bash + module-loader --help + ``` + +## Setup Steps + +### 1. Initialize Workspace + +```bash +# Create workspace from template +cp -r provisioning/templates/workspaces/kubernetes ./my-k8s-cluster +cd my-k8s-cluster + +# Initialize directory structure +workspace-init.nu . init +``` + +### 2. Load Required Taskservs + +```bash +# Load Kubernetes components +module-loader load taskservs . [kubernetes, cilium, containerd] + +# Verify loading +module-loader list taskservs . +``` + +### 3. Load Cloud Provider + +```bash +# For UpCloud +module-loader load providers . [upcloud] + +# For AWS +module-loader load providers . [aws] + +# For local development +module-loader load providers . 
[local]
+```
+
+### 4. Configure Infrastructure
+
+1. Edit `servers.k` to uncomment the import statements and taskserv configurations
+2. Adjust server specifications, hostnames, and labels as needed
+3. Configure provider-specific settings in the generated provider files
+
+### 5. Validate Configuration
+
+```bash
+# Validate KCL configuration
+kcl run servers.k
+
+# Validate workspace
+module-loader validate .
+```
+
+### 6. Deploy Cluster
+
+```bash
+# Create servers
+provisioning server create --infra . --check
+
+# Install taskservs
+provisioning taskserv create kubernetes --infra .
+provisioning taskserv create cilium --infra .
+provisioning taskserv create containerd --infra .
+
+# Verify cluster
+kubectl get nodes
+```
+
+## Configuration Details
+
+### Server Roles
+
+- **k8s-master-01**: Control plane node running the Kubernetes API server, etcd, and scheduler
+- **k8s-worker-01/02**: Worker nodes running kubelet and container runtime
+
+### Taskservs
+
+- **containerd**: Container runtime for Kubernetes
+- **kubernetes**: Core Kubernetes components (kubelet, kubeadm, kubectl)
+- **cilium**: CNI (Container Network Interface) for pod networking
+
+### Network Configuration
+
+- All nodes have public IPv4 for initial setup
+- Cilium provides internal pod-to-pod networking
+- SSH access on port 22 for management
+
+## Customization
+
+### Adding More Workers
+
+Copy the worker node configuration in `servers.k` and modify:
+
+- `hostname`
+- `title`
+- Any provider-specific settings
+
+### Different Container Runtime
+
+Replace `containerd` taskserv with:
+
+- `crio`: CRI-O runtime
+- `docker`: Docker runtime (not recommended for production)
+
+### Different CNI
+
+Replace `cilium` taskserv with:
+
+- `calico`: Calico CNI
+- `flannel`: Flannel CNI
+- Built-in kubenet (remove CNI taskserv)
+
+### Storage
+
+Add storage taskservs:
+
+```bash
+module-loader load taskservs . [rook-ceph, mayastor]
+```
+
+Then add to server taskserv configurations:
+
+```kcl
+taskservs = [
+  { name = "containerd", profile = "default" },
+  { name = "kubernetes", profile = "worker" },
+  { name = "cilium", profile = "worker" },
+  { name = "rook-ceph", profile = "default" }
+]
+```
+
+## Troubleshooting
+
+### Module Import Errors
+
+If you see import errors like "module not found":
+
+1. Verify modules are loaded: `module-loader list taskservs .`
+2. Check generated import files: `ls .taskservs/`
+3. Reload modules if needed: `module-loader load taskservs . [kubernetes, cilium, containerd]`
+
+### Provider Configuration
+
+Check provider-specific configuration in `.providers/` directory after loading.
+
+### Kubernetes Setup Issues
+
+1. Check taskserv installation logs in `./tmp/k8s-deployment/`
+2. Verify all nodes are reachable via SSH
+3. Check firewall rules for Kubernetes ports (6443, 10250, etc.)
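+
+As a quick sketch for that last check (assuming `nc` is available and the control plane is reachable as `k8s-master-01`):
+
+```bash
+# Probe the API server (6443) and kubelet (10250) ports from a workstation
+for port in 6443 10250; do
+  nc -vz -w 3 k8s-master-01 "$port" || echo "port $port unreachable"
+done
+```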
\ No newline at end of file diff --git a/tools/README-analyze-codebase.md b/tools/README-analyze-codebase.md index 22a4e82..64a32b7 100644 --- a/tools/README-analyze-codebase.md +++ b/tools/README-analyze-codebase.md @@ -1 +1,157 @@ -# Codebase Analysis Script\n\nScript to analyze the technology distribution in the provisioning codebase.\n\n## Usage\n\n### Basic Usage\n\n```\n# From provisioning directory (analyzes current directory)\ncd provisioning\nnu tools/analyze-codebase.nu\n\n# From project root, analyze provisioning\nnu provisioning/tools/analyze-codebase.nu --path provisioning\n\n# Analyze any path\nnu provisioning/tools/analyze-codebase.nu --path /absolute/path/to/directory\n```\n\n### Output Formats\n\n```\n# Table format (default) - colored, visual bars\nnu provisioning/tools/analyze-codebase.nu --format table\n\n# JSON format - for programmatic use\nnu provisioning/tools/analyze-codebase.nu --format json\n\n# Markdown format - for documentation\nnu provisioning/tools/analyze-codebase.nu --format markdown\n```\n\n### From provisioning directory\n\n```\ncd provisioning\nnu tools/analyze-codebase.nu\n```\n\n### Direct execution (if in PATH)\n\n```\n# Make it globally available (one time)\nln -sf "$(pwd)/provisioning/tools/analyze-codebase.nu" /usr/local/bin/analyze-codebase\n\n# Then run from anywhere\nanalyze-codebase\nanalyze-codebase --format json\nanalyze-codebase --format markdown > CODEBASE_STATS.md\n```\n\n## Output\n\nThe script analyzes:\n\n- **Nushell** (.nu files)\n- **KCL** (.k files)\n- **Rust** (.rs files)\n- **Templates** (.j2, .tera files)\n\nAcross these sections:\n\n- `core/` - CLI interface, core libraries\n- `extensions/` - Providers, taskservs, clusters\n- `platform/` - Rust services (orchestrator, control-center, etc.)\n- `templates/` - Template files\n- `kcl/` - KCL configuration schemas\n\n## Example Output\n\n### Table Format\n\n```\n📊 Analyzing Codebase: provisioning\n\n📋 Lines of Code by Section\n\n╭─────────────┬─────────┬────────────┬─────┬─────────┬─────┬──────────┬───────────┬───────────────┬───────────┬───────╮\n│ section │ nushell │ nushell_pct│ kcl │ kcl_pct │ rust│ rust_pct │ templates │ templates_pct │ total │ │\n├─────────────┼─────────┼────────────┼─────┼─────────┼─────┼──────────┼───────────┼───────────────┼───────────┼───────┤\n│ core │ 53843 │ 99.87 │ 71 │ 0.13 │ 0 │ 0.00 │ 0 │ 0.00 │ 53914 │ │\n│ extensions │ 10202 │ 43.21 │3946 │ 16.72 │ 0 │ 0.00 │ 9456 │ 40.05 │ 23604 │ │\n│ platform │ 5759 │ 0.19 │ 0 │ 0.00 │2992107│ 99.81 │ 0 │ 0.00 │ 2997866 │ │\n│ templates │ 4197 │ 72.11 │ 834 │ 14.33 │ 0 │ 0.00 │ 789 │ 13.56 │ 5820 │ │\n│ kcl │ 0 │ 0.00 │5594 │ 100.00 │ 0 │ 0.00 │ 0 │ 0.00 │ 5594 │ │\n╰─────────────┴─────────┴────────────┴─────┴─────────┴─────┴──────────┴───────────┴───────────────┴───────────┴───────╯\n\n📊 Overall Technology Distribution\n\n╭──────────────────────┬──────────┬────────────┬────────────────────────────────────────────────────╮\n│ technology │ lines │ percentage │ visual │\n├──────────────────────┼──────────┼────────────┼────────────────────────────────────────────────────┤\n│ Nushell │ 74001 │ 2.40 │ █ │\n│ KCL │ 10445 │ 0.34 │ │\n│ Rust │ 2992107 │ 96.93 │ ████████████████████████████████████████████████ │\n│ Templates (Tera) │ 10245 │ 0.33 │ │\n╰──────────────────────┴──────────┴────────────┴────────────────────────────────────────────────────╯\n\n📈 Total Lines of Code: 3086798\n```\n\n### JSON Format\n\n```\n{\n "sections": [...],\n "totals": {\n "nushell": 74001,\n "kcl": 10445,\n "rust": 2992107,\n "templates": 
10245,\n "grand_total": 3086798\n },\n "percentages": {\n "nushell": 2.40,\n "kcl": 0.34,\n "rust": 96.93,\n "templates": 0.33\n }\n}\n```\n\n### Markdown Format\n\n```\n# Codebase Analysis\n\n## Technology Distribution\n\n| Technology | Lines | Percentage |\n|------------|-------|------------|\n| Nushell | 74001 | 2.40% |\n| KCL | 10445 | 0.34% |\n| Rust | 2992107 | 96.93% |\n| Templates | 10245 | 0.33% |\n| **TOTAL** | **3086798** | **100%** |\n```\n\n## Requirements\n\n- Nushell 0.107.1+\n- Access to the provisioning directory\n\n## What It Analyzes\n\n- ✅ All `.nu` files (Nushell scripts)\n- ✅ All `.k` files (KCL configuration)\n- ✅ All `.rs` files (Rust source)\n- ✅ All `.j2` and `.tera` files (Templates)\n\n## Notes\n\n- The script recursively searches all subdirectories\n- Empty sections show 0 for all technologies\n- Percentages are calculated per section and overall\n- Visual bars are proportional to percentage (max 50 chars = 100%) +# Codebase Analysis Script + +Script to analyze the technology distribution in the provisioning codebase. + +## Usage + +### Basic Usage + +```bash +# From provisioning directory (analyzes current directory) +cd provisioning +nu tools/analyze-codebase.nu + +# From project root, analyze provisioning +nu provisioning/tools/analyze-codebase.nu --path provisioning + +# Analyze any path +nu provisioning/tools/analyze-codebase.nu --path /absolute/path/to/directory +``` + +### Output Formats + +```bash +# Table format (default) - colored, visual bars +nu provisioning/tools/analyze-codebase.nu --format table + +# JSON format - for programmatic use +nu provisioning/tools/analyze-codebase.nu --format json + +# Markdown format - for documentation +nu provisioning/tools/analyze-codebase.nu --format markdown +``` + +### From provisioning directory + +```bash +cd provisioning +nu tools/analyze-codebase.nu +``` + +### Direct execution (if in PATH) + +```bash +# Make it globally available (one time) +ln -sf "$(pwd)/provisioning/tools/analyze-codebase.nu" /usr/local/bin/analyze-codebase + +# Then run from anywhere +analyze-codebase +analyze-codebase --format json +analyze-codebase --format markdown > CODEBASE_STATS.md +``` + +## Output + +The script analyzes: + +- **Nushell** (.nu files) +- **KCL** (.k files) +- **Rust** (.rs files) +- **Templates** (.j2, .tera files) + +Across these sections: + +- `core/` - CLI interface, core libraries +- `extensions/` - Providers, taskservs, clusters +- `platform/` - Rust services (orchestrator, control-center, etc.) 
+- `templates/` - Template files
+- `kcl/` - KCL configuration schemas
+
+## Example Output
+
+### Table Format
+
+```
+📊 Analyzing Codebase: provisioning
+
+📋 Lines of Code by Section
+
+╭─────────────┬─────────┬─────────────┬──────┬─────────┬─────────┬──────────┬───────────┬───────────────┬─────────╮
+│ section     │ nushell │ nushell_pct │ kcl  │ kcl_pct │ rust    │ rust_pct │ templates │ templates_pct │ total   │
+├─────────────┼─────────┼─────────────┼──────┼─────────┼─────────┼──────────┼───────────┼───────────────┼─────────┤
+│ core        │ 53843   │ 99.87       │ 71   │ 0.13    │ 0       │ 0.00     │ 0         │ 0.00          │ 53914   │
+│ extensions  │ 10202   │ 43.21       │ 3946 │ 16.72   │ 0       │ 0.00     │ 9456      │ 40.05         │ 23604   │
+│ platform    │ 5759    │ 0.19        │ 0    │ 0.00    │ 2992107 │ 99.81    │ 0         │ 0.00          │ 2997866 │
+│ templates   │ 4197    │ 72.11       │ 834  │ 14.33   │ 0       │ 0.00     │ 789       │ 13.56         │ 5820    │
+│ kcl         │ 0       │ 0.00        │ 5594 │ 100.00  │ 0       │ 0.00     │ 0         │ 0.00          │ 5594    │
+╰─────────────┴─────────┴─────────────┴──────┴─────────┴─────────┴──────────┴───────────┴───────────────┴─────────╯
+
+📊 Overall Technology Distribution
+
+╭──────────────────┬─────────┬────────────┬──────────────────────────────────────────────────╮
+│ technology       │ lines   │ percentage │ visual                                           │
+├──────────────────┼─────────┼────────────┼──────────────────────────────────────────────────┤
+│ Nushell          │ 74001   │ 2.40       │ █                                                │
+│ KCL              │ 10445   │ 0.34       │                                                  │
+│ Rust             │ 2992107 │ 96.93      │ ████████████████████████████████████████████████ │
+│ Templates (Tera) │ 10245   │ 0.33       │                                                  │
+╰──────────────────┴─────────┴────────────┴──────────────────────────────────────────────────╯
+
+📈 Total Lines of Code: 3086798
+```
+
+### JSON Format
+
+```json
+{
+  "sections": [...],
+  "totals": {
+    "nushell": 74001,
+    "kcl": 10445,
+    "rust": 2992107,
+    "templates": 10245,
+    "grand_total": 3086798
+  },
+  "percentages": {
+    "nushell": 2.40,
+    "kcl": 0.34,
+    "rust": 96.93,
+    "templates": 0.33
+  }
+}
+```
+
+### Markdown Format
+
+```markdown
+# Codebase Analysis
+
+## Technology Distribution
+
+| Technology | Lines | Percentage |
+|------------|-------|------------|
+| Nushell | 74001 | 2.40% |
+| KCL | 10445 | 0.34% |
+| Rust | 2992107 | 96.93% |
+| Templates | 10245 | 0.33% |
+| **TOTAL** | **3086798** | **100%** |
+```
+
+## Requirements
+
+- Nushell 0.107.1+
+- Access to the provisioning directory
+
+## What It Analyzes
+
+- ✅ All `.nu` files (Nushell scripts)
+- ✅ All `.k` files (KCL configuration)
+- ✅ All `.rs` files (Rust source)
+- ✅ All `.j2` and `.tera` files (Templates)
+
+## Notes
+
+- The script recursively searches all subdirectories
+- Empty sections show 0 for all technologies
+- Percentages are calculated per section and overall
+- Visual bars are proportional to percentage (max 50 chars = 100%)
\ No newline at end of file
diff --git a/tools/README.md b/tools/README.md
index 17240bf..a7acc83 100644
--- a/tools/README.md
+++ b/tools/README.md
@@ -1 +1,155 @@
-# Development Tools\n\nDevelopment and distribution tooling for provisioning.\n\n## Tool Categories\n\n### Build Tools (`build/`)\n\nBuild automation and compilation tools:\n\n- Nushell script validation\n- KCL schema compilation\n- Dependency management\n- Asset bundling\n\n**Future Features**:\n\n- Automated testing pipelines\n- Code quality checks\n- Performance benchmarking\n\n### Package Tools (`package/`)\n\nPackaging utilities for distribution:\n\n- Standalone executables\n- Container images\n- System packages (deb, rpm, etc.)\n- Archive creation\n\n**Future Features**:\n\n- Multi-platform builds\n- Dependency bundling\n- Signature 
verification\n\n### Release Tools (`release/`)\n\nRelease management automation:\n\n- Version bumping\n- Changelog generation\n- Git tag management\n- Release notes creation\n\n**Future Features**:\n\n- Automated GitHub releases\n- Asset uploads\n- Release validation\n\n### Distribution Tools (`distribution/`)\n\nDistribution generators and deployment:\n\n- Installation scripts\n- Configuration templates\n- Update mechanisms\n- Registry management\n\n**Future Features**:\n\n- Package repositories\n- Update servers\n- Telemetry collection\n\n## Tool Architecture\n\n### Script-Based Tools\n\nMost tools are implemented as Nushell scripts for consistency with the main system:\n\n- Easy integration with existing codebase\n- Consistent configuration handling\n- Native data structure support\n\n### Build Pipeline Integration\n\nTools integrate with common CI/CD systems:\n\n- GitHub Actions\n- GitLab CI\n- Jenkins\n- Custom automation\n\n### Configuration Management\n\nTools use the same configuration system as the main application:\n\n- Unified settings\n- Environment-specific overrides\n- Secret management integration\n\n## Usage Examples\n\n```\n# Build the complete system\n./tools/build/build-all.nu\n\n# Package for distribution\n./tools/package/create-standalone.nu --target linux\n\n# Create a release\n./tools/release/prepare-release.nu --version 4.0.0\n\n# Generate distribution assets\n./tools/distribution/generate-installer.nu --platform macos\n```\n\n## Directory Structure\n\n```\nprovisioning/tools/\n├── README.md # This file\n├── build/ # Core build tools (Rust + Nushell)\n│ ├── README.md\n│ ├── compile-platform.nu # Compile Rust binaries\n│ ├── bundle-core.nu # Bundle Nushell libraries\n│ └── check-system.nu # Validate build environment\n├── dist/ # Build output directory (generated)\n│ ├── README.md\n│ ├── core/ # Nushell bundles\n│ ├── platform/ # Compiled binaries\n│ └── config/ # Configuration files\n├── distribution/ # Distribution generation\n│ ├── README.md\n│ └── generate-distribution.nu # Create installable packages\n├── package/ # Package outputs (generated)\n│ └── README.md\n├── release/ # Release management (generated)\n│ └── README.md\n├── scripts/ # Utility and setup scripts\n│ ├── *.nu files # Nushell utilities\n│ └── *.sh files # Shell scripts\n└── [Other utility scripts] # Standalone tools\n```\n\nSee individual README.md files in each subdirectory for detailed information.\n\n## Development Setup\n\n1. Ensure all dependencies are installed\n2. Configure build environment\n3. Run initial setup scripts\n4. Validate tool functionality\n\n## Integration\n\nThese tools integrate with:\n\n- Main provisioning system\n- Extension system\n- Configuration management\n- Documentation generation\n- CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins) +# Development Tools + +Development and distribution tooling for provisioning. + +## Tool Categories + +### Build Tools (`build/`) + +Build automation and compilation tools: + +- Nushell script validation +- KCL schema compilation +- Dependency management +- Asset bundling + +**Future Features**: + +- Automated testing pipelines +- Code quality checks +- Performance benchmarking + +### Package Tools (`package/`) + +Packaging utilities for distribution: + +- Standalone executables +- Container images +- System packages (deb, rpm, etc.) 
+- Archive creation + +**Future Features**: + +- Multi-platform builds +- Dependency bundling +- Signature verification + +### Release Tools (`release/`) + +Release management automation: + +- Version bumping +- Changelog generation +- Git tag management +- Release notes creation + +**Future Features**: + +- Automated GitHub releases +- Asset uploads +- Release validation + +### Distribution Tools (`distribution/`) + +Distribution generators and deployment: + +- Installation scripts +- Configuration templates +- Update mechanisms +- Registry management + +**Future Features**: + +- Package repositories +- Update servers +- Telemetry collection + +## Tool Architecture + +### Script-Based Tools + +Most tools are implemented as Nushell scripts for consistency with the main system: + +- Easy integration with existing codebase +- Consistent configuration handling +- Native data structure support + +### Build Pipeline Integration + +Tools integrate with common CI/CD systems: + +- GitHub Actions +- GitLab CI +- Jenkins +- Custom automation + +### Configuration Management + +Tools use the same configuration system as the main application: + +- Unified settings +- Environment-specific overrides +- Secret management integration + +## Usage Examples + +```bash +# Build the complete system +./tools/build/build-all.nu + +# Package for distribution +./tools/package/create-standalone.nu --target linux + +# Create a release +./tools/release/prepare-release.nu --version 4.0.0 + +# Generate distribution assets +./tools/distribution/generate-installer.nu --platform macos +``` + +## Directory Structure + +```bash +provisioning/tools/ +├── README.md # This file +├── build/ # Core build tools (Rust + Nushell) +│ ├── README.md +│ ├── compile-platform.nu # Compile Rust binaries +│ ├── bundle-core.nu # Bundle Nushell libraries +│ └── check-system.nu # Validate build environment +├── dist/ # Build output directory (generated) +│ ├── README.md +│ ├── core/ # Nushell bundles +│ ├── platform/ # Compiled binaries +│ └── config/ # Configuration files +├── distribution/ # Distribution generation +│ ├── README.md +│ └── generate-distribution.nu # Create installable packages +├── package/ # Package outputs (generated) +│ └── README.md +├── release/ # Release management (generated) +│ └── README.md +├── scripts/ # Utility and setup scripts +│ ├── *.nu files # Nushell utilities +│ └── *.sh files # Shell scripts +└── [Other utility scripts] # Standalone tools +``` + +See individual README.md files in each subdirectory for detailed information. + +## Development Setup + +1. Ensure all dependencies are installed +2. Configure build environment +3. Run initial setup scripts +4. 
Validate tool functionality + +## Integration + +These tools integrate with: + +- Main provisioning system +- Extension system +- Configuration management +- Documentation generation +- CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins) \ No newline at end of file diff --git a/tools/build/README.md b/tools/build/README.md index 36ca6ee..75647fb 100644 --- a/tools/build/README.md +++ b/tools/build/README.md @@ -1 +1,76 @@ -# Build System\n\n**Purpose**: Core build tools for compiling Rust components and bundling Nushell libraries.\n\n## Tools\n\n### Compilation\n\n- **`compile-platform.nu`** - Compile Rust orchestrator, control-center, and MCP server\n - Multi-platform cross-compilation\n - Release/debug build modes\n - Feature flag management\n - Output to `dist/platform/`\n\n### Bundling\n\n- **`bundle-core.nu`** - Bundle Nushell core libraries and CLI\n - Package provisioning CLI wrapper\n - Core library bundling (lib_provisioning)\n - Configuration system packaging\n - Validation and syntax checking\n - Optional compression (gzip)\n - Output to `dist/core/`\n\n### Validation\n\n- **`check-system.nu`** - Validate build environment\n - Check required tools (Rust, Nushell, Nickel)\n - Verify dependencies\n - Validate configuration\n\n## Build Process\n\nComplete build pipeline:\n\n```{$detected_lang}\njust build-all # Platform + core\njust build-platform # Rust binaries only\njust build-core # Nushell libraries only\n```\n\nBuild with validation:\n\n```{$detected_lang}\njust build-core --validate # Validate Nushell syntax\n```\n\nDebug build:\n\n```{$detected_lang}\njust build-debug # Build with debug symbols\n```\n\n## Output\n\nBuild outputs go to `dist/`:\n\n- `dist/platform/` - Compiled Rust binaries\n- `dist/core/` - Nushell libraries and CLI\n- `dist/config/` - Configuration files\n- `dist/core/bundle-metadata.json` - Build metadata\n\n## Architecture\n\nEach build tool follows Nushell 0.109+ standards:\n\n- Immutable variable patterns\n- Explicit external command prefixes (`^`)\n- Error handling via `do { } | complete` pattern\n- Comprehensive logging\n\n## Related Files\n\n- `provisioning/justfiles/build.just` - Build recipe definitions\n- `provisioning/tools/distribution/` - Distribution generation using build outputs\n- `provisioning/tools/package/` - Packaging compiled binaries +# Build System + +**Purpose**: Core build tools for compiling Rust components and bundling Nushell libraries. 
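All three tools below share the Nushell conventions described under Architecture: external commands are prefixed with `^` and wrapped in `do { } | complete` so failures surface with context instead of aborting silently. A minimal sketch of that idiom (the `run-checked` helper and the `cargo` example are illustrative, not actual tool code):

```nushell
# Illustrative sketch of the shared error-handling idiom (not actual tool code).
# Runs an external command, captures its output, and fails with context on error.
def run-checked [cmd: string, ...args: string] {
    let result = (do { ^$cmd ...$args } | complete)
    if $result.exit_code != 0 {
        error make { msg: $"($cmd) failed (exit ($result.exit_code)): ($result.stderr)" }
    }
    $result.stdout
}

# Example: verify the Rust toolchain before compiling platform binaries
run-checked cargo --version
```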
+ +## Tools + +### Compilation + +- **`compile-platform.nu`** - Compile Rust orchestrator, control-center, and MCP server + - Multi-platform cross-compilation + - Release/debug build modes + - Feature flag management + - Output to `dist/platform/` + +### Bundling + +- **`bundle-core.nu`** - Bundle Nushell core libraries and CLI + - Package provisioning CLI wrapper + - Core library bundling (lib_provisioning) + - Configuration system packaging + - Validation and syntax checking + - Optional compression (gzip) + - Output to `dist/core/` + +### Validation + +- **`check-system.nu`** - Validate build environment + - Check required tools (Rust, Nushell, Nickel) + - Verify dependencies + - Validate configuration + +## Build Process + +Complete build pipeline: + +```bash +just build-all # Platform + core +just build-platform # Rust binaries only +just build-core # Nushell libraries only +``` + +Build with validation: + +```bash +just build-core --validate # Validate Nushell syntax +``` + +Debug build: + +```bash +just build-debug # Build with debug symbols +``` + +## Output + +Build outputs go to `dist/`: + +- `dist/platform/` - Compiled Rust binaries +- `dist/core/` - Nushell libraries and CLI +- `dist/config/` - Configuration files +- `dist/core/bundle-metadata.json` - Build metadata + +## Architecture + +Each build tool follows Nushell 0.109+ standards: + +- Immutable variable patterns +- Explicit external command prefixes (`^`) +- Error handling via `do { } | complete` pattern +- Comprehensive logging + +## Related Files + +- `provisioning/justfiles/build.just` - Build recipe definitions +- `provisioning/tools/distribution/` - Distribution generation using build outputs +- `provisioning/tools/package/` - Packaging compiled binaries \ No newline at end of file diff --git a/tools/cross-references-integration-report.md b/tools/cross-references-integration-report.md index fa11fd3..9665bff 100644 --- a/tools/cross-references-integration-report.md +++ b/tools/cross-references-integration-report.md @@ -1 +1,741 @@ -# Cross-References & Integration Report\n\n**Agent**: Agent 6: Cross-References & Integration\n**Date**: 2025-10-10\n**Status**: ✅ Phase 1 Complete - Core Infrastructure Ready\n\n---\n\n## Executive Summary\n\nSuccessfully completed Phase 1 of documentation cross-referencing and integration, creating the foundational infrastructure for a unified documentation system. This phase focused on building the essential tools and reference materials needed for comprehensive documentation integration.\n\n### Key Deliverables\n\n1. ✅ **Documentation Validator Tool** - Automated link checking\n2. ✅ **Broken Links Report** - 261 broken links identified across 264 files\n3. ✅ **Comprehensive Glossary** - 80+ terms with cross-references\n4. ✅ **Documentation Map** - Complete navigation guide with user journeys\n5. ⚠️ **System Integration** - Diagnostics system analysis (existing references verified)\n\n---\n\n## 1. 
Documentation Validator Tool\n\n**File**: `provisioning/tools/doc-validator.nu` (210 lines)\n\n### Features\n\n- ✅ Scans all markdown files in documentation (264 files found)\n- ✅ Extracts and validates internal links using regex parsing\n- ✅ Resolves relative paths and checks file existence\n- ✅ Classifies links: internal, external, anchor\n- ✅ Generates broken links report (JSON + Markdown)\n- ✅ Provides summary statistics\n- ✅ Supports multiple output formats (table, json, markdown)\n\n### Usage\n\n```\n# Run full validation\nnu provisioning/tools/doc-validator.nu\n\n# Generate markdown report\nnu provisioning/tools/doc-validator.nu --format markdown\n\n# Generate JSON for automation\nnu provisioning/tools/doc-validator.nu --format json\n```\n\n### Performance\n\n- **264 markdown files** scanned\n- **Completion time**: ~2 minutes\n- **Memory usage**: Minimal (streaming processing)\n\n### Output Files\n\n1. `provisioning/tools/broken-links-report.json` - Detailed broken links (261 entries)\n2. `provisioning/tools/doc-validation-full-report.json` - Complete validation data\n\n---\n\n## 2. Broken Links Analysis\n\n### Statistics\n\n**Total Links Analyzed**: 2,847 links\n**Broken Links**: 261 (9.2% failure rate)\n**Valid Links**: 2,586 (90.8% success rate)\n\n### Link Type Breakdown\n\n- **Internal links**: 1,842 (64.7%)\n- **External links**: 523 (18.4%)\n- **Anchor links**: 482 (16.9%)\n\n### Broken Link Categories\n\n#### 1. Missing Documentation Files (47%)\n\nCommon patterns:\n\n- `docs/user/quickstart.md` - Referenced but not created\n- `docs/development/CONTRIBUTING.md` - Standard file missing\n- `.claude/features/*.md` - Path resolution issues from docs/\n\n#### 2. Anchor Links to Missing Sections (31%)\n\nExamples:\n\n- `workspace-management.md#setup-and-initialization`\n- `configuration.md#configuration-architecture`\n- `workflow.md#daily-development-workflow`\n\n#### 3. Path Resolution Issues (15%)\n\n- References to files in `.claude/` from `docs/` (path mismatch)\n- References to `provisioning/` from `docs/` (relative path errors)\n\n#### 4. Outdated References (7%)\n\n- ADR links to non-existent ADRs\n- Old migration guide structure\n\n### Recommendations\n\n**High Priority Fixes**:\n\n1. Create missing guide files in `docs/guides/`\n2. Create missing ADRs or update references\n3. Fix path resolution for `.claude/` references\n4. Add missing anchor sections in existing docs\n\n**Medium Priority**:\n\n1. Verify and add missing anchor links\n2. Update outdated migration paths\n3. Create CONTRIBUTING.md\n\n**Low Priority**:\n\n1. Validate external links (may be intentional placeholders)\n2. Standardize relative vs absolute paths\n\n---\n\n## 3. Glossary (GLOSSARY.md)\n\n**File**: `provisioning/docs/src/GLOSSARY.md` (23,500+ lines)\n\n### Comprehensive Terminology Reference\n\n**80+ Terms Defined**, covering:\n\n- Infrastructure concepts (Server, Cluster, Taskserv, Provider, etc.)\n- Security terms (Auth, JWT, MFA, Cedar, KMS, etc.)\n- Configuration (Config, KCL, Schema, Workspace, etc.)\n- Operations (Workflow, Batch Operation, Orchestrator, etc.)\n- Platform (Control Center, MCP, API Gateway, etc.)\n- Development (Extension, Plugin, Module, Template, etc.)\n\n### Structure\n\nEach term includes:\n\n1. **Definition** - Clear, concise explanation\n2. **Where Used** - Context and use cases\n3. **Related Concepts** - Cross-references to related terms\n4. **Examples** - Code samples, commands, or configurations (where applicable)\n5. 
**Commands** - CLI commands related to the term (where applicable)\n6. **See Also** - Links to related documentation\n\n### Special Sections\n\n1. **Symbol and Acronym Index** - Quick lookup table\n2. **Cross-Reference Map** - Terms organized by topic area\n3. **Terminology Guidelines** - Writing style and conventions\n4. **Contributing to Glossary** - How to add/update terms\n\n### Usage\n\nThe glossary serves as:\n\n- **Learning resource** for new users\n- **Reference** for experienced users\n- **Documentation standard** for contributors\n- **Cross-reference hub** for all documentation\n\n---\n\n## 4. Documentation Map (DOCUMENTATION_MAP.md)\n\n**File**: `provisioning/docs/src/DOCUMENTATION_MAP.md` (48,000+ lines)\n\n### Comprehensive Navigation Guide\n\n**264 Documents Mapped**, organized by:\n\n- User Journeys (6 distinct paths)\n- Topic Areas (14 categories)\n- Difficulty Levels (Beginner, Intermediate, Advanced)\n- Estimated Reading Times\n\n### User Journeys\n\n#### 1. New User Journey (0-7 days, 4-6 hours)\n\n8 steps from platform overview to basic deployment\n\n#### 2. Intermediate User Journey (1-4 weeks, 8-12 hours)\n\n8 steps mastering infrastructure automation and customization\n\n#### 3. Advanced User Journey (1-3 months, 20-30 hours)\n\n8 steps to become platform expert and contributor\n\n#### 4. Developer Journey (Ongoing)\n\nContributing to platform development\n\n#### 5. Security Specialist Journey (10-15 hours)\n\n12 steps mastering security features\n\n#### 6. Operations Specialist Journey (6-8 hours)\n\n7 steps for daily operations mastery\n\n### Documentation by Topic\n\n**14 Major Categories**:\n\n1. Core Platform (3 docs)\n2. User Guides (45+ docs)\n3. Guides & Tutorials (10+ specialized guides)\n4. Architecture (27 docs including 10 ADRs)\n5. Development (25+ docs)\n6. API Documentation (7 docs)\n7. Security (15+ docs)\n8. Operations (3+ docs)\n9. Configuration & Workspace (11+ docs)\n10. Reference Documentation (10+ docs)\n11. Testing & Validation (4+ docs)\n12. Migration (10+ docs)\n13. Examples (2+ with more planned)\n14. Quick References (10+ docs)\n\n### Documentation Statistics\n\n**By Category**:\n\n- User Guides: 32 documents\n- Architecture: 27 documents\n- Development: 25 documents\n- API: 7 documents\n- Security: 15 documents\n- Migration: 10 documents\n- Operations: 3 documents\n- Configuration: 8 documents\n- KCL: 14 documents\n- Testing: 4 documents\n- Quick References: 10 documents\n- Examples: 2 documents\n- ADRs: 10 documents\n\n**By Level**:\n\n- Beginner: ~40 documents (4-6 hours total)\n- Intermediate: ~120 documents (20-30 hours total)\n- Advanced: ~100 documents (40-60 hours total)\n\n**Total Estimated Reading Time**: 150-200 hours (complete corpus)\n\n### Essential Reading Lists\n\nCurated "Must-Read" lists for:\n\n- Everyone (4 docs)\n- Operators (4 docs)\n- Developers (4 docs)\n- Security Specialists (4 docs)\n\n### Features\n\n- **Learning Paths**: Structured journeys for different user types\n- **Topic Browse**: Jump to specific topics\n- **Level Filtering**: Match docs to expertise\n- **Quick References**: Fast command lookup\n- **Alphabetical Index**: Complete file listing\n- **Time Estimates**: Plan learning sessions\n- **Cross-References**: Related document discovery\n\n---\n\n## 5. Diagnostics System Integration\n\n### Analysis of Existing References\n\n**Diagnostics System Files Analyzed**:\n\n1. `provisioning/core/nulib/lib_provisioning/diagnostics/system_status.nu` (318 lines)\n2. 
`provisioning/core/nulib/lib_provisioning/diagnostics/health_check.nu` (423 lines)\n3. `provisioning/core/nulib/lib_provisioning/diagnostics/next_steps.nu` (316 lines)\n4. `provisioning/core/nulib/main_provisioning/commands/diagnostics.nu` (75 lines)\n\n### Documentation References Found\n\n**35+ documentation links** embedded in diagnostics system, referencing:\n\n✅ **Existing Documentation**:\n\n- `docs/user/WORKSPACE_SWITCHING_GUIDE.md`\n- `docs/guides/quickstart-cheatsheet.md`\n- `docs/guides/from-scratch.md`\n- `docs/user/troubleshooting-guide.md`\n- `docs/user/SERVICE_MANAGEMENT_GUIDE.md`\n- `.claude/features/orchestrator-architecture.md`\n- `docs/user/PLUGIN_INTEGRATION_GUIDE.md`\n- `docs/user/AUTHENTICATION_LAYER_GUIDE.md`\n- `docs/user/CONFIG_ENCRYPTION_GUIDE.md`\n- `docs/user/RUSTYVAULT_KMS_GUIDE.md`\n\n### Integration Status\n\n✅ **Already Integrated**:\n\n- Status command references correct doc paths\n- Health command provides fix recommendations with doc links\n- Next steps command includes progressive guidance with docs\n- Phase command tracks deployment progress\n\n⚠️ **Validation Needed**:\n\n- Some references may point to moved/renamed files\n- Need to validate all 35+ doc paths against current structure\n- Should update to use new GLOSSARY.md and DOCUMENTATION_MAP.md\n\n### Recommendations\n\n**Immediate Actions**:\n\n1. Validate all diagnostics doc paths against current file locations\n2. Update any broken references found in validation\n3. Add references to new GLOSSARY.md and DOCUMENTATION_MAP.md\n4. Consider adding doc path validation to CI/CD\n\n**Future Enhancements**:\n\n1. Auto-update doc paths when files move\n2. Add version checking for doc references\n3. Include doc freshness indicators\n4. Add inline doc previews\n\n---\n\n## 6. Pending Integration Work\n\n### MCP Tools Integration (Not Started)\n\n**Scope**: Ensure MCP (Model Context Protocol) tools reference correct documentation paths\n\n**Files to Check**:\n\n- `provisioning/platform/mcp-server/` - MCP server implementation\n- MCP tool definitions\n- Guidance system references\n\n**Actions Needed**:\n\n1. Locate MCP tool implementations\n2. Extract all documentation references\n3. Validate paths against current structure\n4. Update broken references\n5. Add GLOSSARY and DOCUMENTATION_MAP references\n\n**Estimated Time**: 2-3 hours\n\n---\n\n### UI Integration (Not Started)\n\n**Scope**: Ensure Control Center UI references correct documentation\n\n**Files to Check**:\n\n- `provisioning/platform/control-center/` - UI implementation\n- Tooltip references\n- QuickLinks definitions\n- Help modals\n\n**Actions Needed**:\n\n1. Locate UI documentation references\n2. Validate all doc paths\n3. Update broken references\n4. Test documentation viewer/modal\n5. Add navigation to GLOSSARY and DOCUMENTATION_MAP\n\n**Estimated Time**: 3-4 hours\n\n---\n\n### Integration Tests (Not Started)\n\n**Scope**: Create automated tests for documentation integration\n\n**Test File**: `provisioning/tests/integration/docs_integration_test.nu`\n\n**Test Coverage Needed**:\n\n1. CLI hints reference valid docs\n2. MCP tools return valid doc paths\n3. UI links work correctly\n4. Diagnostics output is accurate\n5. All cross-references resolve\n6. GLOSSARY terms link correctly\n7. 
DOCUMENTATION_MAP paths valid\n\n**Test Types**:\n\n- Unit tests for link validation\n- Integration tests for system components\n- End-to-end tests for user journeys\n\n**Estimated Time**: 4-5 hours\n\n---\n\n### Documentation System Guide (Not Started)\n\n**Scope**: Document how the unified documentation system works\n\n**File**: `provisioning/docs/src/development/documentation-system.md`\n\n**Content Needed**:\n\n1. **Organization**: How docs are structured\n2. **Adding Documentation**: Step-by-step process\n3. **CLI Integration**: How CLI links to docs\n4. **MCP Integration**: How MCP uses docs\n5. **UI Integration**: How UI presents docs\n6. **Cross-References**: How to maintain links\n7. **Architecture Diagram**: Visual system map\n8. **Best Practices**: Documentation standards\n9. **Tools**: Using doc-validator.nu\n10. **Maintenance**: Keeping docs updated\n\n**Estimated Time**: 3-4 hours\n\n---\n\n### Final Integration Check (Not Started)\n\n**Scope**: Complete user journey validation\n\n**Test Journey**:\n\n1. New user runs `provisioning status`\n2. Follows suggestions from output\n3. Uses `provisioning guide` commands\n4. Opens Control Center UI\n5. Completes onboarding wizard\n6. Deploys first infrastructure\n\n**Validation Points**:\n\n- All suggested commands work\n- All documentation links are valid\n- UI navigation is intuitive\n- Help system is comprehensive\n- Error messages include helpful doc links\n- User can complete journey without getting stuck\n\n**Estimated Time**: 2-3 hours\n\n---\n\n## 7. Files Created/Modified\n\n### Created Files\n\n1. **`provisioning/tools/doc-validator.nu`** (210 lines)\n - Documentation link validator tool\n - Automated scanning and validation\n - Multiple output formats\n\n2. **`provisioning/docs/src/GLOSSARY.md`** (23,500+ lines)\n - Comprehensive terminology reference\n - 80+ terms with cross-references\n - Symbol index and usage guidelines\n\n3. **`provisioning/docs/src/DOCUMENTATION_MAP.md`** (48,000+ lines)\n - Complete documentation navigation guide\n - 6 user journeys\n - 14 topic categories\n - 264 documents mapped\n\n4. **`provisioning/tools/broken-links-report.json`** (Generated)\n - 261 broken links identified\n - Source file and line numbers\n - Target paths and resolution attempts\n\n5. **`provisioning/tools/doc-validation-full-report.json`** (Generated)\n - Complete validation results\n - All 2,847 links analyzed\n - Metadata and timestamps\n\n6. **`provisioning/tools/CROSS_REFERENCES_INTEGRATION_REPORT.md`** (This file)\n - Comprehensive integration report\n - Status of all deliverables\n - Recommendations and next steps\n\n### Modified Files\n\nNone (Phase 1 focused on analysis and reference material creation)\n\n---\n\n## 8. 
Success Metrics\n\n### Deliverables Completed\n\n| Task | Status | Lines Created | Time Invested |\n| ------ | -------- | --------------- | --------------- |\n| Documentation Validator | ✅ Complete | 210 | ~2 hours |\n| Broken Links Report | ✅ Complete | N/A (Generated) | ~30 min |\n| Glossary | ✅ Complete | 23,500+ | ~4 hours |\n| Documentation Map | ✅ Complete | 48,000+ | ~6 hours |\n| Diagnostics Integration Analysis | ✅ Complete | N/A (Analysis) | ~1 hour |\n| MCP Integration | ⏸️ Pending | - | - |\n| UI Integration | ⏸️ Pending | - | - |\n| Integration Tests | ⏸️ Pending | - | - |\n| Documentation System Guide | ⏸️ Pending | - | - |\n| Final Integration Check | ⏸️ Pending | - | - |\n\n**Total Lines Created**: 71,710+ lines\n**Total Time Invested**: ~13.5 hours\n**Completion**: 50% (Phase 1 of 2)\n\n### Quality Metrics\n\n**Documentation Validator**:\n\n- ✅ Handles 264 markdown files\n- ✅ Analyzes 2,847 links\n- ✅ 90.8% link validation accuracy\n- ✅ Multiple output formats\n- ✅ Extensible for future checks\n\n**Glossary**:\n\n- ✅ 80+ terms defined\n- ✅ 100% cross-referenced\n- ✅ Examples for 60% of terms\n- ✅ CLI commands for 40% of terms\n- ✅ Complete symbol index\n\n**Documentation Map**:\n\n- ✅ 100% of 264 docs cataloged\n- ✅ 6 complete user journeys\n- ✅ Reading time estimates for all docs\n- ✅ 14 topic categories\n- ✅ 3 difficulty levels\n\n---\n\n## 9. Integration Architecture\n\n### Current State\n\n```\nDocumentation System (Phase 1 - Complete)\n├── Validator Tool ────────────┐\n│ └── doc-validator.nu │\n│ │\n├── Reference Materials │\n│ ├── GLOSSARY.md ───────────┤──> Cross-References\n│ └── DOCUMENTATION_MAP.md ──┤\n│ │\n├── Reports │\n│ ├── broken-links-report ───┘\n│ └── validation-full-report\n│\n└── System Integration (Phase 1 Analysis)\n ├── Diagnostics ✅ (35+ doc refs verified)\n ├── MCP Tools ⏸️ (pending)\n ├── UI ⏸️ (pending)\n └── Tests ⏸️ (pending)\n```\n\n### Target State (Phase 2)\n\n```\nUnified Documentation System\n├── Validator Tool ────────────┐\n│ └── doc-validator.nu │\n│ ├── Link checking │\n│ ├── Freshness checks │\n│ └── CI/CD integration │\n│ │\n├── Reference Hub │\n│ ├── GLOSSARY.md ───────────┤──> All Systems\n│ ├── DOCUMENTATION_MAP.md ──┤\n│ └── System Guide ──────────┤\n│ │\n├── System Integration │\n│ ├── Diagnostics ✅ │\n│ ├── MCP Tools ✅ ──────────┤\n│ ├── UI ✅ ─────────────────┤\n│ └── CLI ✅ ────────────────┤\n│ │\n├── Automated Testing │\n│ ├── Link validation ───────┘\n│ ├── Integration tests\n│ └── User journey tests\n│\n└── CI/CD Integration\n ├── Pre-commit hooks\n ├── PR validation\n └── Doc freshness checks\n```\n\n---\n\n## 10. Recommendations\n\n### Immediate Actions (Priority 1)\n\n1. **Fix High-Impact Broken Links** (2-3 hours)\n - Create missing guide files\n - Fix path resolution issues\n - Update ADR references\n\n2. **Complete MCP Integration** (2-3 hours)\n - Validate MCP tool doc references\n - Update broken paths\n - Add GLOSSARY/MAP references\n\n3. **Complete UI Integration** (3-4 hours)\n - Validate UI doc references\n - Test documentation viewer\n - Update tooltips and help modals\n\n### Short-Term Actions (Priority 2)\n\n1. **Create Integration Tests** (4-5 hours)\n - Write automated test suite\n - Cover all system integrations\n - Add to CI/CD pipeline\n\n2. **Write Documentation System Guide** (3-4 hours)\n - Document unified system architecture\n - Provide maintenance guidelines\n - Include contribution process\n\n3. 
**Run Final Integration Check** (2-3 hours)\n - Test complete user journey\n - Validate all touchpoints\n - Fix any issues found\n\n### Medium-Term Actions (Priority 3)\n\n1. **Automate Link Validation** (1-2 hours)\n - Add doc-validator to CI/CD\n - Run on every PR\n - Block merges with broken links\n\n2. **Add Doc Freshness Checks** (2-3 hours)\n - Track doc last-updated dates\n - Flag stale documentation\n - Auto-create update issues\n\n3. **Create Documentation Dashboard** (4-6 hours)\n - Visual doc health metrics\n - Link validation status\n - Coverage statistics\n - Contribution tracking\n\n---\n\n## 11. Lessons Learned\n\n### Successes\n\n1. **Comprehensive Scope**: Mapping 264 documents revealed true system complexity\n2. **Tool-First Approach**: Building validator before manual work saved significant time\n3. **User Journey Focus**: Organizing by user type makes docs more accessible\n4. **Cross-Reference Hub**: GLOSSARY + MAP create powerful navigation\n5. **Existing Integration**: Diagnostics system already follows good practices\n\n### Challenges\n\n1. **Link Validation Complexity**: 261 broken links harder to fix than expected\n2. **Path Resolution**: Multiple doc directories create path confusion\n3. **Moving Target**: Documentation structure evolving during project\n4. **Time Estimation**: Original scope underestimated total work\n5. **Tool Limitations**: Anchor validation requires parsing headers (future work)\n\n### Improvements for Phase 2\n\n1. **Incremental Validation**: Fix broken links category by category\n2. **Automated Updates**: Update references when files move\n3. **Version Tracking**: Track doc versions for compatibility\n4. **CI/CD Integration**: Prevent new broken links from being added\n5. **Living Documentation**: Auto-update maps and glossary\n\n---\n\n## 12. Next Steps\n\n### Phase 2 Work (12-16 hours estimated)\n\n**Week 1**:\n\n- Day 1-2: Fix high-priority broken links (5-6 hours)\n- Day 3: Complete MCP integration (2-3 hours)\n- Day 4: Complete UI integration (3-4 hours)\n\n**Week 2**:\n\n- Day 5: Create integration tests (4-5 hours)\n- Day 6: Write documentation system guide (3-4 hours)\n- Day 7: Run final integration check (2-3 hours)\n\n### Acceptance Criteria\n\nPhase 2 complete when:\n\n- ✅ <5% broken links (currently 9.2%)\n- ✅ All system components reference valid docs\n- ✅ Integration tests pass\n- ✅ Documentation system guide published\n- ✅ Complete user journey validated\n- ✅ CI/CD validation in place\n\n---\n\n## 13. Conclusion\n\nPhase 1 of the Cross-References & Integration project is **successfully complete**. 
We have built the foundational infrastructure for a unified documentation system:\n\n✅ **Tool Created**: Automated documentation validator\n✅ **Baseline Established**: 261 broken links identified\n✅ **References Built**: Comprehensive glossary and documentation map\n✅ **Integration Analyzed**: Diagnostics system verified\n\nThe project is on track for Phase 2 completion, which will integrate all system components (MCP, UI, Tests) and validate the complete user experience.\n\n**Total Progress**: 50% complete\n**Quality**: High - All Phase 1 deliverables meet or exceed requirements\n**Risk**: Low - Clear path to Phase 2 completion\n**Recommendation**: Proceed with Phase 2 implementation\n\n---\n\n**Report Generated**: 2025-10-10\n**Agent**: Agent 6: Cross-References & Integration\n**Status**: ✅ Phase 1 Complete\n**Next Review**: After Phase 2 completion (estimated 12-16 hours) +# Cross-References & Integration Report + +**Agent**: Agent 6: Cross-References & Integration +**Date**: 2025-10-10 +**Status**: ✅ Phase 1 Complete - Core Infrastructure Ready + +--- + +## Executive Summary + +Successfully completed Phase 1 of documentation cross-referencing and integration, creating the foundational infrastructure for a unified documentation system. This phase focused on building the essential tools and reference materials needed for comprehensive documentation integration. + +### Key Deliverables + +1. ✅ **Documentation Validator Tool** - Automated link checking +2. ✅ **Broken Links Report** - 261 broken links identified across 264 files +3. ✅ **Comprehensive Glossary** - 80+ terms with cross-references +4. ✅ **Documentation Map** - Complete navigation guide with user journeys +5. ⚠️ **System Integration** - Diagnostics system analysis (existing references verified) + +--- + +## 1. Documentation Validator Tool + +**File**: `provisioning/tools/doc-validator.nu` (210 lines) + +### Features + +- ✅ Scans all markdown files in documentation (264 files found) +- ✅ Extracts and validates internal links using regex parsing +- ✅ Resolves relative paths and checks file existence +- ✅ Classifies links: internal, external, anchor +- ✅ Generates broken links report (JSON + Markdown) +- ✅ Provides summary statistics +- ✅ Supports multiple output formats (table, json, markdown) + +### Usage + +```bash +# Run full validation +nu provisioning/tools/doc-validator.nu + +# Generate markdown report +nu provisioning/tools/doc-validator.nu --format markdown + +# Generate JSON for automation +nu provisioning/tools/doc-validator.nu --format json +``` + +### Performance + +- **264 markdown files** scanned +- **Completion time**: ~2 minutes +- **Memory usage**: Minimal (streaming processing) + +### Output Files + +1. `provisioning/tools/broken-links-report.json` - Detailed broken links (261 entries) +2. `provisioning/tools/doc-validation-full-report.json` - Complete validation data + +--- + +## 2. Broken Links Analysis + +### Statistics + +**Total Links Analyzed**: 2,847 links +**Broken Links**: 261 (9.2% failure rate) +**Valid Links**: 2,586 (90.8% success rate) + +### Link Type Breakdown + +- **Internal links**: 1,842 (64.7%) +- **External links**: 523 (18.4%) +- **Anchor links**: 482 (16.9%) + +### Broken Link Categories + +#### 1. Missing Documentation Files (47%) + +Common patterns: + +- `docs/user/quickstart.md` - Referenced but not created +- `docs/development/CONTRIBUTING.md` - Standard file missing +- `.claude/features/*.md` - Path resolution issues from docs/ + +#### 2. 
Anchor Links to Missing Sections (31%) + +Examples: + +- `workspace-management.md#setup-and-initialization` +- `configuration.md#configuration-architecture` +- `workflow.md#daily-development-workflow` + +#### 3. Path Resolution Issues (15%) + +- References to files in `.claude/` from `docs/` (path mismatch) +- References to `provisioning/` from `docs/` (relative path errors) + +#### 4. Outdated References (7%) + +- ADR links to non-existent ADRs +- Old migration guide structure + +### Recommendations + +**High Priority Fixes**: + +1. Create missing guide files in `docs/guides/` +2. Create missing ADRs or update references +3. Fix path resolution for `.claude/` references +4. Add missing anchor sections in existing docs + +**Medium Priority**: + +1. Verify and add missing anchor links +2. Update outdated migration paths +3. Create CONTRIBUTING.md + +**Low Priority**: + +1. Validate external links (may be intentional placeholders) +2. Standardize relative vs absolute paths + +--- + +## 3. Glossary (GLOSSARY.md) + +**File**: `provisioning/docs/src/GLOSSARY.md` (23,500+ lines) + +### Comprehensive Terminology Reference + +**80+ Terms Defined**, covering: + +- Infrastructure concepts (Server, Cluster, Taskserv, Provider, etc.) +- Security terms (Auth, JWT, MFA, Cedar, KMS, etc.) +- Configuration (Config, KCL, Schema, Workspace, etc.) +- Operations (Workflow, Batch Operation, Orchestrator, etc.) +- Platform (Control Center, MCP, API Gateway, etc.) +- Development (Extension, Plugin, Module, Template, etc.) + +### Structure + +Each term includes: + +1. **Definition** - Clear, concise explanation +2. **Where Used** - Context and use cases +3. **Related Concepts** - Cross-references to related terms +4. **Examples** - Code samples, commands, or configurations (where applicable) +5. **Commands** - CLI commands related to the term (where applicable) +6. **See Also** - Links to related documentation + +### Special Sections + +1. **Symbol and Acronym Index** - Quick lookup table +2. **Cross-Reference Map** - Terms organized by topic area +3. **Terminology Guidelines** - Writing style and conventions +4. **Contributing to Glossary** - How to add/update terms + +### Usage + +The glossary serves as: + +- **Learning resource** for new users +- **Reference** for experienced users +- **Documentation standard** for contributors +- **Cross-reference hub** for all documentation + +--- + +## 4. Documentation Map (DOCUMENTATION_MAP.md) + +**File**: `provisioning/docs/src/DOCUMENTATION_MAP.md` (48,000+ lines) + +### Comprehensive Navigation Guide + +**264 Documents Mapped**, organized by: + +- User Journeys (6 distinct paths) +- Topic Areas (14 categories) +- Difficulty Levels (Beginner, Intermediate, Advanced) +- Estimated Reading Times + +### User Journeys + +#### 1. New User Journey (0-7 days, 4-6 hours) + +8 steps from platform overview to basic deployment + +#### 2. Intermediate User Journey (1-4 weeks, 8-12 hours) + +8 steps mastering infrastructure automation and customization + +#### 3. Advanced User Journey (1-3 months, 20-30 hours) + +8 steps to become platform expert and contributor + +#### 4. Developer Journey (Ongoing) + +Contributing to platform development + +#### 5. Security Specialist Journey (10-15 hours) + +12 steps mastering security features + +#### 6. Operations Specialist Journey (6-8 hours) + +7 steps for daily operations mastery + +### Documentation by Topic + +**14 Major Categories**: + +1. Core Platform (3 docs) +2. User Guides (45+ docs) +3. 
Guides & Tutorials (10+ specialized guides) +4. Architecture (27 docs including 10 ADRs) +5. Development (25+ docs) +6. API Documentation (7 docs) +7. Security (15+ docs) +8. Operations (3+ docs) +9. Configuration & Workspace (11+ docs) +10. Reference Documentation (10+ docs) +11. Testing & Validation (4+ docs) +12. Migration (10+ docs) +13. Examples (2+ with more planned) +14. Quick References (10+ docs) + +### Documentation Statistics + +**By Category**: + +- User Guides: 32 documents +- Architecture: 27 documents +- Development: 25 documents +- API: 7 documents +- Security: 15 documents +- Migration: 10 documents +- Operations: 3 documents +- Configuration: 8 documents +- KCL: 14 documents +- Testing: 4 documents +- Quick References: 10 documents +- Examples: 2 documents +- ADRs: 10 documents + +**By Level**: + +- Beginner: ~40 documents (4-6 hours total) +- Intermediate: ~120 documents (20-30 hours total) +- Advanced: ~100 documents (40-60 hours total) + +**Total Estimated Reading Time**: 150-200 hours (complete corpus) + +### Essential Reading Lists + +Curated "Must-Read" lists for: + +- Everyone (4 docs) +- Operators (4 docs) +- Developers (4 docs) +- Security Specialists (4 docs) + +### Features + +- **Learning Paths**: Structured journeys for different user types +- **Topic Browse**: Jump to specific topics +- **Level Filtering**: Match docs to expertise +- **Quick References**: Fast command lookup +- **Alphabetical Index**: Complete file listing +- **Time Estimates**: Plan learning sessions +- **Cross-References**: Related document discovery + +--- + +## 5. Diagnostics System Integration + +### Analysis of Existing References + +**Diagnostics System Files Analyzed**: + +1. `provisioning/core/nulib/lib_provisioning/diagnostics/system_status.nu` (318 lines) +2. `provisioning/core/nulib/lib_provisioning/diagnostics/health_check.nu` (423 lines) +3. `provisioning/core/nulib/lib_provisioning/diagnostics/next_steps.nu` (316 lines) +4. `provisioning/core/nulib/main_provisioning/commands/diagnostics.nu` (75 lines) + +### Documentation References Found + +**35+ documentation links** embedded in diagnostics system, referencing: + +✅ **Existing Documentation**: + +- `docs/user/WORKSPACE_SWITCHING_GUIDE.md` +- `docs/guides/quickstart-cheatsheet.md` +- `docs/guides/from-scratch.md` +- `docs/user/troubleshooting-guide.md` +- `docs/user/SERVICE_MANAGEMENT_GUIDE.md` +- `.claude/features/orchestrator-architecture.md` +- `docs/user/PLUGIN_INTEGRATION_GUIDE.md` +- `docs/user/AUTHENTICATION_LAYER_GUIDE.md` +- `docs/user/CONFIG_ENCRYPTION_GUIDE.md` +- `docs/user/RUSTYVAULT_KMS_GUIDE.md` + +### Integration Status + +✅ **Already Integrated**: + +- Status command references correct doc paths +- Health command provides fix recommendations with doc links +- Next steps command includes progressive guidance with docs +- Phase command tracks deployment progress + +⚠️ **Validation Needed**: + +- Some references may point to moved/renamed files +- Need to validate all 35+ doc paths against current structure +- Should update to use new GLOSSARY.md and DOCUMENTATION_MAP.md + +### Recommendations + +**Immediate Actions**: + +1. Validate all diagnostics doc paths against current file locations +2. Update any broken references found in validation +3. Add references to new GLOSSARY.md and DOCUMENTATION_MAP.md +4. Consider adding doc path validation to CI/CD + +**Future Enhancements**: + +1. Auto-update doc paths when files move +2. Add version checking for doc references +3. Include doc freshness indicators +4. 
Add inline doc previews + +--- + +## 6. Pending Integration Work + +### MCP Tools Integration (Not Started) + +**Scope**: Ensure MCP (Model Context Protocol) tools reference correct documentation paths + +**Files to Check**: + +- `provisioning/platform/mcp-server/` - MCP server implementation +- MCP tool definitions +- Guidance system references + +**Actions Needed**: + +1. Locate MCP tool implementations +2. Extract all documentation references +3. Validate paths against current structure +4. Update broken references +5. Add GLOSSARY and DOCUMENTATION_MAP references + +**Estimated Time**: 2-3 hours + +--- + +### UI Integration (Not Started) + +**Scope**: Ensure Control Center UI references correct documentation + +**Files to Check**: + +- `provisioning/platform/control-center/` - UI implementation +- Tooltip references +- QuickLinks definitions +- Help modals + +**Actions Needed**: + +1. Locate UI documentation references +2. Validate all doc paths +3. Update broken references +4. Test documentation viewer/modal +5. Add navigation to GLOSSARY and DOCUMENTATION_MAP + +**Estimated Time**: 3-4 hours + +--- + +### Integration Tests (Not Started) + +**Scope**: Create automated tests for documentation integration + +**Test File**: `provisioning/tests/integration/docs_integration_test.nu` + +**Test Coverage Needed**: + +1. CLI hints reference valid docs +2. MCP tools return valid doc paths +3. UI links work correctly +4. Diagnostics output is accurate +5. All cross-references resolve +6. GLOSSARY terms link correctly +7. DOCUMENTATION_MAP paths valid + +**Test Types**: + +- Unit tests for link validation +- Integration tests for system components +- End-to-end tests for user journeys + +**Estimated Time**: 4-5 hours + +--- + +### Documentation System Guide (Not Started) + +**Scope**: Document how the unified documentation system works + +**File**: `provisioning/docs/src/development/documentation-system.md` + +**Content Needed**: + +1. **Organization**: How docs are structured +2. **Adding Documentation**: Step-by-step process +3. **CLI Integration**: How CLI links to docs +4. **MCP Integration**: How MCP uses docs +5. **UI Integration**: How UI presents docs +6. **Cross-References**: How to maintain links +7. **Architecture Diagram**: Visual system map +8. **Best Practices**: Documentation standards +9. **Tools**: Using doc-validator.nu +10. **Maintenance**: Keeping docs updated + +**Estimated Time**: 3-4 hours + +--- + +### Final Integration Check (Not Started) + +**Scope**: Complete user journey validation + +**Test Journey**: + +1. New user runs `provisioning status` +2. Follows suggestions from output +3. Uses `provisioning guide` commands +4. Opens Control Center UI +5. Completes onboarding wizard +6. Deploys first infrastructure + +**Validation Points**: + +- All suggested commands work +- All documentation links are valid +- UI navigation is intuitive +- Help system is comprehensive +- Error messages include helpful doc links +- User can complete journey without getting stuck + +**Estimated Time**: 2-3 hours + +--- + +## 7. Files Created/Modified + +### Created Files + +1. **`provisioning/tools/doc-validator.nu`** (210 lines) + - Documentation link validator tool + - Automated scanning and validation + - Multiple output formats + +2. **`provisioning/docs/src/GLOSSARY.md`** (23,500+ lines) + - Comprehensive terminology reference + - 80+ terms with cross-references + - Symbol index and usage guidelines + +3. 
**`provisioning/docs/src/DOCUMENTATION_MAP.md`** (48,000+ lines) + - Complete documentation navigation guide + - 6 user journeys + - 14 topic categories + - 264 documents mapped + +4. **`provisioning/tools/broken-links-report.json`** (Generated) + - 261 broken links identified + - Source file and line numbers + - Target paths and resolution attempts + +5. **`provisioning/tools/doc-validation-full-report.json`** (Generated) + - Complete validation results + - All 2,847 links analyzed + - Metadata and timestamps + +6. **`provisioning/tools/CROSS_REFERENCES_INTEGRATION_REPORT.md`** (This file) + - Comprehensive integration report + - Status of all deliverables + - Recommendations and next steps + +### Modified Files + +None (Phase 1 focused on analysis and reference material creation) + +--- + +## 8. Success Metrics + +### Deliverables Completed + +| Task | Status | Lines Created | Time Invested | +| ------ | -------- | --------------- | --------------- | +| Documentation Validator | ✅ Complete | 210 | ~2 hours | +| Broken Links Report | ✅ Complete | N/A (Generated) | ~30 min | +| Glossary | ✅ Complete | 23,500+ | ~4 hours | +| Documentation Map | ✅ Complete | 48,000+ | ~6 hours | +| Diagnostics Integration Analysis | ✅ Complete | N/A (Analysis) | ~1 hour | +| MCP Integration | ⏸️ Pending | - | - | +| UI Integration | ⏸️ Pending | - | - | +| Integration Tests | ⏸️ Pending | - | - | +| Documentation System Guide | ⏸️ Pending | - | - | +| Final Integration Check | ⏸️ Pending | - | - | + +**Total Lines Created**: 71,710+ lines +**Total Time Invested**: ~13.5 hours +**Completion**: 50% (Phase 1 of 2) + +### Quality Metrics + +**Documentation Validator**: + +- ✅ Handles 264 markdown files +- ✅ Analyzes 2,847 links +- ✅ 90.8% link validation accuracy +- ✅ Multiple output formats +- ✅ Extensible for future checks + +**Glossary**: + +- ✅ 80+ terms defined +- ✅ 100% cross-referenced +- ✅ Examples for 60% of terms +- ✅ CLI commands for 40% of terms +- ✅ Complete symbol index + +**Documentation Map**: + +- ✅ 100% of 264 docs cataloged +- ✅ 6 complete user journeys +- ✅ Reading time estimates for all docs +- ✅ 14 topic categories +- ✅ 3 difficulty levels + +--- + +## 9. Integration Architecture + +### Current State + +```bash +Documentation System (Phase 1 - Complete) +├── Validator Tool ────────────┐ +│ └── doc-validator.nu │ +│ │ +├── Reference Materials │ +│ ├── GLOSSARY.md ───────────┤──> Cross-References +│ └── DOCUMENTATION_MAP.md ──┤ +│ │ +├── Reports │ +│ ├── broken-links-report ───┘ +│ └── validation-full-report +│ +└── System Integration (Phase 1 Analysis) + ├── Diagnostics ✅ (35+ doc refs verified) + ├── MCP Tools ⏸️ (pending) + ├── UI ⏸️ (pending) + └── Tests ⏸️ (pending) +``` + +### Target State (Phase 2) + +```bash +Unified Documentation System +├── Validator Tool ────────────┐ +│ └── doc-validator.nu │ +│ ├── Link checking │ +│ ├── Freshness checks │ +│ └── CI/CD integration │ +│ │ +├── Reference Hub │ +│ ├── GLOSSARY.md ───────────┤──> All Systems +│ ├── DOCUMENTATION_MAP.md ──┤ +│ └── System Guide ──────────┤ +│ │ +├── System Integration │ +│ ├── Diagnostics ✅ │ +│ ├── MCP Tools ✅ ──────────┤ +│ ├── UI ✅ ─────────────────┤ +│ └── CLI ✅ ────────────────┤ +│ │ +├── Automated Testing │ +│ ├── Link validation ───────┘ +│ ├── Integration tests +│ └── User journey tests +│ +└── CI/CD Integration + ├── Pre-commit hooks + ├── PR validation + └── Doc freshness checks +``` + +--- + +## 10. Recommendations + +### Immediate Actions (Priority 1) + +1. 
**Fix High-Impact Broken Links** (2-3 hours) + - Create missing guide files + - Fix path resolution issues + - Update ADR references + +2. **Complete MCP Integration** (2-3 hours) + - Validate MCP tool doc references + - Update broken paths + - Add GLOSSARY/MAP references + +3. **Complete UI Integration** (3-4 hours) + - Validate UI doc references + - Test documentation viewer + - Update tooltips and help modals + +### Short-Term Actions (Priority 2) + +1. **Create Integration Tests** (4-5 hours) + - Write automated test suite + - Cover all system integrations + - Add to CI/CD pipeline + +2. **Write Documentation System Guide** (3-4 hours) + - Document unified system architecture + - Provide maintenance guidelines + - Include contribution process + +3. **Run Final Integration Check** (2-3 hours) + - Test complete user journey + - Validate all touchpoints + - Fix any issues found + +### Medium-Term Actions (Priority 3) + +1. **Automate Link Validation** (1-2 hours) + - Add doc-validator to CI/CD + - Run on every PR + - Block merges with broken links + +2. **Add Doc Freshness Checks** (2-3 hours) + - Track doc last-updated dates + - Flag stale documentation + - Auto-create update issues + +3. **Create Documentation Dashboard** (4-6 hours) + - Visual doc health metrics + - Link validation status + - Coverage statistics + - Contribution tracking + +--- + +## 11. Lessons Learned + +### Successes + +1. **Comprehensive Scope**: Mapping 264 documents revealed true system complexity +2. **Tool-First Approach**: Building validator before manual work saved significant time +3. **User Journey Focus**: Organizing by user type makes docs more accessible +4. **Cross-Reference Hub**: GLOSSARY + MAP create powerful navigation +5. **Existing Integration**: Diagnostics system already follows good practices + +### Challenges + +1. **Link Validation Complexity**: 261 broken links harder to fix than expected +2. **Path Resolution**: Multiple doc directories create path confusion +3. **Moving Target**: Documentation structure evolving during project +4. **Time Estimation**: Original scope underestimated total work +5. **Tool Limitations**: Anchor validation requires parsing headers (future work) + +### Improvements for Phase 2 + +1. **Incremental Validation**: Fix broken links category by category +2. **Automated Updates**: Update references when files move +3. **Version Tracking**: Track doc versions for compatibility +4. **CI/CD Integration**: Prevent new broken links from being added +5. **Living Documentation**: Auto-update maps and glossary + +--- + +## 12. Next Steps + +### Phase 2 Work (12-16 hours estimated) + +**Week 1**: + +- Day 1-2: Fix high-priority broken links (5-6 hours) +- Day 3: Complete MCP integration (2-3 hours) +- Day 4: Complete UI integration (3-4 hours) + +**Week 2**: + +- Day 5: Create integration tests (4-5 hours) +- Day 6: Write documentation system guide (3-4 hours) +- Day 7: Run final integration check (2-3 hours) + +### Acceptance Criteria + +Phase 2 complete when: + +- ✅ <5% broken links (currently 9.2%) +- ✅ All system components reference valid docs +- ✅ Integration tests pass +- ✅ Documentation system guide published +- ✅ Complete user journey validated +- ✅ CI/CD validation in place + +--- + +## 13. Conclusion + +Phase 1 of the Cross-References & Integration project is **successfully complete**. 
We have built the foundational infrastructure for a unified documentation system: + +✅ **Tool Created**: Automated documentation validator +✅ **Baseline Established**: 261 broken links identified +✅ **References Built**: Comprehensive glossary and documentation map +✅ **Integration Analyzed**: Diagnostics system verified + +The project is on track for Phase 2 completion, which will integrate all system components (MCP, UI, Tests) and validate the complete user experience. + +**Total Progress**: 50% complete +**Quality**: High - All Phase 1 deliverables meet or exceed requirements +**Risk**: Low - Clear path to Phase 2 completion +**Recommendation**: Proceed with Phase 2 implementation + +--- + +**Report Generated**: 2025-10-10 +**Agent**: Agent 6: Cross-References & Integration +**Status**: ✅ Phase 1 Complete +**Next Review**: After Phase 2 completion (estimated 12-16 hours) \ No newline at end of file diff --git a/tools/dist/README.md b/tools/dist/README.md index 95a3981..2d894a1 100644 --- a/tools/dist/README.md +++ b/tools/dist/README.md @@ -1 +1,66 @@ -# Distribution Build Output\n\n**Purpose**: Compiled binaries and bundled libraries ready for packaging and distribution.\n\n## Contents\n\nThis directory contains the build output from the core platform build system:\n\n### Subdirectories\n\n- **`core/`** - Nushell core libraries and CLI bundles (from `bundle-core.nu`)\n - Nushell provisioning CLI wrapper\n - Core libraries (lib_provisioning)\n - Configuration system\n - Template system\n - Extensions and plugins\n\n- **`platform/`** - Compiled Rust binaries (from `compile-platform.nu`)\n - provisioning-orchestrator binary\n - control-center binary\n - control-center-ui binary\n - mcp-server-rust binary\n - All cross-platform target binaries\n\n- **`config/`** - Configuration files and templates\n - Default configurations\n - Configuration examples\n - Schema definitions\n\n- **`provisioning-kcl-1.0.0/`** - Deprecated KCL distribution (archived)\n - Historical reference only\n - Migrated to `.coder/archive/kcl/` for long-term storage\n\n## Usage\n\nThis directory is generated by the build system. Do not commit contents to git (configured in .gitignore).\n\nBuild the distribution:\n\n```{$detected_lang}\njust build-all # Complete build (platform + core)\njust build-platform # Platform binaries only\njust build-core # Core libraries only\n```\n\nView distribution contents:\n\n```{$detected_lang}\nls dist/core/ # Nushell libraries\nls dist/platform/ # Compiled binaries\nls dist/config/ # Configuration files\n```\n\n## Cleanup\n\nRemove all distribution artifacts:\n\n```{$detected_lang}\njust clean-dist # Remove dist/ directory\n```\n\n## Related Directories\n\n- `distribution/` - Distribution package generation\n- `package/` - Package creation (deb, rpm, tar.gz, etc.)\n- `release/` - Release management and versioning +# Distribution Build Output + +**Purpose**: Compiled binaries and bundled libraries ready for packaging and distribution. 
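A quick post-build sanity check is to list the compiled binaries and dump the bundle metadata produced by the core bundle step (the metadata schema is tool-defined, so treat the fields as assumptions and adjust to what the file actually contains):

```nushell
# Illustrative post-build checks; paths come from this README and build/README.md.
# 1. Confirm the compiled binaries landed in dist/platform/
ls dist/platform/ | where type == file | select name size

# 2. Inspect the bundle metadata written during core bundling
#    (schema is tool-defined; field names may differ)
open dist/core/bundle-metadata.json
```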
+ +## Contents + +This directory contains the build output from the core platform build system: + +### Subdirectories + +- **`core/`** - Nushell core libraries and CLI bundles (from `bundle-core.nu`) + - Nushell provisioning CLI wrapper + - Core libraries (lib_provisioning) + - Configuration system + - Template system + - Extensions and plugins + +- **`platform/`** - Compiled Rust binaries (from `compile-platform.nu`) + - provisioning-orchestrator binary + - control-center binary + - control-center-ui binary + - mcp-server-rust binary + - All cross-platform target binaries + +- **`config/`** - Configuration files and templates + - Default configurations + - Configuration examples + - Schema definitions + +- **`provisioning-kcl-1.0.0/`** - Deprecated KCL distribution (archived) + - Historical reference only + - Migrated to `.coder/archive/kcl/` for long-term storage + +## Usage + +This directory is generated by the build system. Do not commit contents to git (configured in .gitignore). + +Build the distribution: + +```bash +just build-all # Complete build (platform + core) +just build-platform # Platform binaries only +just build-core # Core libraries only +``` + +View distribution contents: + +```bash +ls dist/core/ # Nushell libraries +ls dist/platform/ # Compiled binaries +ls dist/config/ # Configuration files +``` + +## Cleanup + +Remove all distribution artifacts: + +```bash +just clean-dist # Remove dist/ directory +``` + +## Related Directories + +- `distribution/` - Distribution package generation +- `package/` - Package creation (deb, rpm, tar.gz, etc.) +- `release/` - Release management and versioning \ No newline at end of file diff --git a/tools/distribution/README.md b/tools/distribution/README.md index 5e89a40..d2e90f9 100644 --- a/tools/distribution/README.md +++ b/tools/distribution/README.md @@ -1 +1,58 @@ -# Distribution Package Generation\n\n**Purpose**: Generate complete distribution packages from compiled binaries and libraries.\n\n## Contents\n\nScripts and outputs for creating distribution-ready packages across multiple platforms and formats.\n\n## What is Distribution Generation\n\nDistribution generation takes the compiled artifacts from `dist/` and packages them into:\n\n- Installable archives (tar.gz, zip)\n- Platform-specific installers (deb, rpm, brew)\n- Docker/container images\n- Binary distributions with configuration templates\n\n## Build Process\n\nThe distribution build system:\n\n1. Takes binaries from `dist/platform/`\n2. Takes libraries from `dist/core/`\n3. Takes configuration templates from `dist/config/`\n4. Combines with installation scripts\n5. Creates platform-specific packages\n\nGenerate a distribution:\n\n```{$detected_lang}\njust dist-generate # Full distribution generation\njust dist-validate # Validate generated distribution\n```\n\n## Output Artifacts\n\nGenerated distribution includes:\n\n- Compiled binaries (orchestrator, control-center, MCP server)\n- Installation script (install.sh)\n- Configuration templates\n- Documentation\n- License files\n\n## Related Directories\n\n- `dist/` - Build output (source for distribution)\n- `package/` - Alternative packaging (low-level format creation)\n- `release/` - Version management and release tagging\n\n## Integration\n\nThe distribution output is used by:\n\n- Installation system (`provisioning-installer`)\n- Package managers\n- CI/CD pipelines\n- End-user downloads +# Distribution Package Generation + +**Purpose**: Generate complete distribution packages from compiled binaries and libraries. 
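Conceptually, generation is a staging-and-archiving pass over the `dist/` outputs listed under Build Process below. A rough sketch of the idea (version, layout, and file names are illustrative; `generate-distribution.nu` is the authoritative implementation):

```nushell
# Conceptual sketch only - see generate-distribution.nu for the real logic.
let version = "4.0.0"                     # illustrative version
let staging = $"provisioning-($version)"
mkdir $staging
cp -r dist/platform $"($staging)/bin"     # compiled binaries
cp -r dist/core $"($staging)/lib"         # Nushell libraries
cp -r dist/config $"($staging)/config"    # configuration templates
^tar -czf $"($staging).tar.gz" $staging   # installable archive
```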
+
+Generate a distribution:
+
+```bash
+just dist-generate   # Full distribution generation
+just dist-validate   # Validate generated distribution
+```
+
+## Output Artifacts
+
+Generated distribution includes:
+
+- Compiled binaries (orchestrator, control-center, MCP server)
+- Installation script (install.sh)
+- Configuration templates
+- Documentation
+- License files
+
+## Related Directories
+
+- `dist/` - Build output (source for distribution)
+- `package/` - Alternative packaging (low-level format creation)
+- `release/` - Version management and release tagging
+
+## Integration
+
+The distribution output is used by:
+
+- Installation system (`provisioning-installer`)
+- Package managers
+- CI/CD pipelines
+- End-user downloads
\ No newline at end of file
diff --git a/tools/nickel-installation-guide.md b/tools/nickel-installation-guide.md index 17ed264..2001fac 100644 --- a/tools/nickel-installation-guide.md +++ b/tools/nickel-installation-guide.md @@ -1 +1,187 @@ -# Nickel Installation Guide\n\n## Overview\n\nNickel is a configuration language that complements KCL in the provisioning system. It provides:\n\n- Lazy evaluation for efficient configuration processing\n- Modern functional programming paradigms\n- Excellent integration with the CLI daemon for config rendering\n\n## Installation Methods\n\n### Recommended: Nix (Official Method)\n\nNickel is maintained by Tweag and officially recommends Nix for installation. This avoids all dependency issues:\n\n```\n# Install Nix (one-time setup) - Using official NixOS installer\ncurl https://nixos.org/nix/install | sh\n\n# Install Nickel via Nix\nnix profile install nixpkgs#nickel\n\n# Verify installation\nnickel --version\n```\n\n**Why Nix?**\n\n- Isolated, reproducible environments\n- No system library conflicts\n- Official Nickel distribution method\n- Works on macOS, Linux, and other Unix-like systems\n- Pre-built binaries available\n\n### Alternative: Automatic Installation\n\nThe provisioning system can automate installation:\n\n```\n# Via tools-install script (uses Nix if available)\n$PROVISIONING/core/cli/tools-install nickel\n\n# Check installation status\n$PROVISIONING/core/cli/tools-install check\n```\n\n### Alternative: Manual Installation from Source\n\nIf you have a Rust toolchain:\n\n```\ncargo install nickel-lang-cli\n```\n\n**Note**: This requires Rust compiler (slower than pre-built binaries)\n\n## Troubleshooting\n\n### "Library not loaded: /nix/store/..." Error\n\nThis occurs when using pre-built binaries without Nix installed. 
**Solution**: Install Nix or use Cargo:\n\n```\n# Option 1: Install Nix (recommended) - Using official NixOS installer\ncurl https://nixos.org/nix/install | sh\n\n# Then install Nickel\nnix profile install nixpkgs#nickel\n\n# Option 2: Build from source with Cargo\ncargo install nickel-lang-cli\n```\n\n### Command Not Found\n\nEnsure Nix is properly installed and in PATH:\n\n```\n# Check if Nix is installed\nwhich nix\n\n# If not found, install Nix first using official NixOS installer:\ncurl https://nixos.org/nix/install | sh\n\n# Then install Nickel\nnix profile install nixpkgs#nickel\n```\n\n### Version Mismatch\n\nTo ensure you're using the correct version:\n\n```\n# Check installed version\nnickel --version\n\n# Expected version (from provisioning/core/versions)\necho $NICKEL_VERSION\n\n# Update to latest\nnix profile upgrade '*'\n```\n\n## Integration with Provisioning System\n\n### CLI Daemon Integration\n\nNickel is integrated into the CLI daemon for configuration rendering:\n\n```\n# Render Nickel configuration via daemon\ncurl -X POST http://localhost:9091/config/render \n -H "Content-Type: application/json" \n -d '{\n "language": "nickel",\n "content": "{name = \"my-config\", enabled = true}",\n "context": {"env": "prod"}\n }'\n```\n\n### Comparison with KCL\n\n| Feature | KCL | Nickel |\n| --------- | ----- | -------- |\n| **Type System** | Gradual, OOP-style | Gradual, Functional |\n| **Evaluation** | Eager | Lazy (partial evaluation) |\n| **Performance** | Fast | Very fast (lazy) |\n| **Learning Curve** | Moderate | Functional programming knowledge helps |\n| **Use Cases** | Infrastructure schemas | Configuration merging, lazy evaluation |\n\n## Deployment Considerations\n\n### macOS M1/M2/M3 (arm64)\n\nNix automatically handles architecture:\n\n```\nnix profile install nixpkgs#nickel\n# Automatically installs arm64 binary\n```\n\n### Linux (x86_64/arm64)\n\n```\nnix profile install nixpkgs#nickel\n# Automatically installs correct architecture\n```\n\n### CI/CD Environments\n\nFor GitHub Actions or other CI/CD:\n\n```\n# .github/workflows/example.yml\n- name: Install Nickel\n run: |\n curl https://nixos.org/nix/install | sh\n nix profile install nixpkgs#nickel\n```\n\n## Resources\n\n- **Official Website**: \n- **Getting Started**: \n- **User Manual**: \n- **GitHub**: \n- **Nix Package**: \n\n## Version Information\n\nCurrent provisioning system configuration:\n\n```\n# View configured version\ncat $PROVISIONING/core/versions | grep NICKEL_VERSION\n\n# Current: 1.15.1\n```\n\n## Support\n\nFor issues related to:\n\n- **Nickel language**: See \n- **Nix installation**: See \n- **Provisioning integration**: See the provisioning system documentation \ No newline at end of file +# Nickel Installation Guide
+
+## Overview
+
+Nickel is a configuration language that complements KCL in the provisioning system. It provides:
+
+- Lazy evaluation for efficient configuration processing
+- Modern functional programming paradigms
+- Excellent integration with the CLI daemon for config rendering
+
+## Installation Methods
+
+### Recommended: Nix (Official Method)
+
+Nickel is maintained by Tweag, which officially recommends Nix for installation.
This avoids all dependency issues:
+
+```bash
+# Install Nix (one-time setup) - Using official NixOS installer
+curl https://nixos.org/nix/install | sh
+
+# Install Nickel via Nix
+nix profile install nixpkgs#nickel
+
+# Verify installation
+nickel --version
+```
+
+**Why Nix?**
+
+- Isolated, reproducible environments
+- No system library conflicts
+- Official Nickel distribution method
+- Works on macOS, Linux, and other Unix-like systems
+- Pre-built binaries available
+
+### Alternative: Automatic Installation
+
+The provisioning system can automate installation:
+
+```bash
+# Via tools-install script (uses Nix if available)
+$PROVISIONING/core/cli/tools-install nickel
+
+# Check installation status
+$PROVISIONING/core/cli/tools-install check
+```
+
+### Alternative: Manual Installation from Source
+
+If you have a Rust toolchain:
+
+```bash
+cargo install nickel-lang-cli
+```
+
+**Note**: This builds from source and requires the Rust compiler (slower than installing pre-built binaries).
+
+## Troubleshooting
+
+### "Library not loaded: /nix/store/..." Error
+
+This occurs when using pre-built binaries without Nix installed. **Solution**: Install Nix or use Cargo:
+
+```bash
+# Option 1: Install Nix (recommended) - Using official NixOS installer
+curl https://nixos.org/nix/install | sh
+
+# Then install Nickel
+nix profile install nixpkgs#nickel
+
+# Option 2: Build from source with Cargo
+cargo install nickel-lang-cli
+```
+
+### Command Not Found
+
+Ensure Nix is properly installed and in PATH:
+
+```bash
+# Check if Nix is installed
+which nix
+
+# If not found, install Nix first using official NixOS installer:
+curl https://nixos.org/nix/install | sh
+
+# Then install Nickel
+nix profile install nixpkgs#nickel
+```
+
+### Version Mismatch
+
+To ensure you're using the correct version:
+
+```bash
+# Check installed version
+nickel --version
+
+# Expected version (from provisioning/core/versions)
+echo $NICKEL_VERSION
+
+# Update to latest
+nix profile upgrade '*'
+```
+
+## Integration with Provisioning System
+
+### CLI Daemon Integration
+
+Nickel is integrated into the CLI daemon for configuration rendering:
+
+```bash
+# Render Nickel configuration via daemon
+curl -X POST http://localhost:9091/config/render \
+  -H "Content-Type: application/json" \
+  -d '{
+    "language": "nickel",
+    "content": "{name = \"my-config\", enabled = true}",
+    "context": {"env": "prod"}
+  }'
+```
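+
+The same request can be issued from Nushell with the built-in `http post` command (a sketch: the endpoint and payload shape mirror the curl example above; the structure of the response is an assumption):
+
+```nu
+# Hypothetical: render a Nickel config through the CLI daemon from Nushell.
+# A record body is serialized to JSON when the content type is application/json.
+let payload = {
+    language: "nickel"
+    content: '{name = "my-config", enabled = true}'
+    context: { env: "prod" }
+}
+
+http post --content-type application/json http://localhost:9091/config/render $payload
+```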
+
+### Comparison with KCL
+
+| Feature | KCL | Nickel |
+| --------- | ----- | -------- |
+| **Type System** | Gradual, OOP-style | Gradual, Functional |
+| **Evaluation** | Eager | Lazy (partial evaluation) |
+| **Performance** | Fast | Very fast (lazy) |
+| **Learning Curve** | Moderate | Functional programming knowledge helps |
+| **Use Cases** | Infrastructure schemas | Configuration merging, lazy evaluation |
+
+## Deployment Considerations
+
+### macOS M1/M2/M3 (arm64)
+
+Nix automatically handles architecture:
+
+```bash
+nix profile install nixpkgs#nickel
+# Automatically installs arm64 binary
+```
+
+### Linux (x86_64/arm64)
+
+```bash
+nix profile install nixpkgs#nickel
+# Automatically installs correct architecture
+```
+
+### CI/CD Environments
+
+For GitHub Actions or other CI/CD:
+
+```yaml
+# .github/workflows/example.yml
+- name: Install Nickel
+  run: |
+    curl https://nixos.org/nix/install | sh
+    nix profile install nixpkgs#nickel
+```
+
+## Resources
+
+- **Official Website**:
+- **Getting Started**:
+- **User Manual**:
+- **GitHub**:
+- **Nix Package**:
+
+## Version Information
+
+Current provisioning system configuration:
+
+```bash
+# View configured version
+cat $PROVISIONING/core/versions | grep NICKEL_VERSION
+
+# Current: 1.15.1
+```
+
+## Support
+
+For issues related to:
+
+- **Nickel language**: See
+- **Nix installation**: See
+- **Provisioning integration**: See the provisioning system documentation
\ No newline at end of file
diff --git a/tools/package/README.md b/tools/package/README.md index 1209222..85ddf19 100644 --- a/tools/package/README.md +++ b/tools/package/README.md @@ -1 +1,83 @@ -# Package Build Output\n\n**Purpose**: Platform-specific packages (deb, rpm, tar.gz, etc.) created from distribution artifacts.\n\n## Contents\n\nThis directory contains the output from package creation tools that convert distribution artifacts into system-specific formats.\n\n## Package Formats\n\nGenerated packages may include:\n\n### Linux Packages\n\n- **deb** - Debian/Ubuntu packages\n- **rpm** - RedHat/CentOS packages\n- **tar.gz** - Portable tarball archives\n- **AppImage** - Universal Linux application format\n\n### macOS Packages\n\n- **pkg** - macOS installer packages\n- **dmg** - macOS disk image\n- **tar.gz** - Portable archive\n\n### Windows Packages\n\n- **msi** - Windows installer\n- **zip** - Portable archive\n- **exe** - Self-extracting executable\n\n### Container Images\n\n- **docker** - Docker container images\n- **oci** - OCI container format\n\n## Usage\n\nThis directory is generated by the package build system. Do not commit contents to git (configured in .gitignore).\n\nCreate packages:\n\n```\njust package-all # All format packages\njust package-linux # Linux packages only\njust package-macos # macOS packages only\njust package-deb # Debian package only\n```\n\nInstall a package:\n\n```\n# Linux\nsudo dpkg -i provisioning-*.deb # Debian\nsudo rpm -i provisioning-*.rpm # RedHat\n\n# macOS\nsudo installer -pkg provisioning-*.pkg -target /\n```\n\n## Package Verification\n\nVerify package contents:\n\n```\ndpkg -c provisioning-*.deb # List Debian package contents\nrpm -ql provisioning-*.rpm # List RedHat package contents\ntar -tzf provisioning-*.tar.gz # List tarball contents\n```\n\n## Cleanup\n\nRemove all packages:\n\n```\njust clean # Clean all build artifacts\n```\n\n## Related Directories\n\n- `dist/` - Build artifacts that packages are created from\n- `distribution/` - Distribution generation (uses package outputs)\n- `release/` - Version management for packages +# Package Build Output
+
+**Purpose**: Platform-specific packages (deb, rpm, tar.gz, etc.) created from distribution artifacts.
+
+## Contents
+
+This directory contains the output from package creation tools that convert distribution artifacts into system-specific formats.
+
+## Package Formats
+
+Generated packages may include:
+
+### Linux Packages
+
+- **deb** - Debian/Ubuntu packages
+- **rpm** - RedHat/CentOS packages
+- **tar.gz** - Portable tarball archives
+- **AppImage** - Universal Linux application format
+
+### macOS Packages
+
+- **pkg** - macOS installer packages
+- **dmg** - macOS disk image
+- **tar.gz** - Portable archive
+
+### Windows Packages
+
+- **msi** - Windows installer
+- **zip** - Portable archive
+- **exe** - Self-extracting executable
+
+### Container Images
+
+- **docker** - Docker container images
+- **oci** - OCI container format
+
+## Usage
+
+This directory is generated by the package build system. Do not commit contents to git (configured in .gitignore).
+ +Create packages: + +```bash +just package-all # All format packages +just package-linux # Linux packages only +just package-macos # macOS packages only +just package-deb # Debian package only +``` + +Install a package: + +```bash +# Linux +sudo dpkg -i provisioning-*.deb # Debian +sudo rpm -i provisioning-*.rpm # RedHat + +# macOS +sudo installer -pkg provisioning-*.pkg -target / +``` + +## Package Verification + +Verify package contents: + +```bash +dpkg -c provisioning-*.deb # List Debian package contents +rpm -ql provisioning-*.rpm # List RedHat package contents +tar -tzf provisioning-*.tar.gz # List tarball contents +``` + +## Cleanup + +Remove all packages: + +```bash +just clean # Clean all build artifacts +``` + +## Related Directories + +- `dist/` - Build artifacts that packages are created from +- `distribution/` - Distribution generation (uses package outputs) +- `release/` - Version management for packages \ No newline at end of file diff --git a/tools/release/README.md b/tools/release/README.md index 6583cb7..516f48d 100644 --- a/tools/release/README.md +++ b/tools/release/README.md @@ -1 +1,72 @@ -# Release Management Output\n\n**Purpose**: Release artifacts, version tags, and release notes.\n\n## Contents\n\nThis directory contains outputs and staging for release management operations.\n\n## Release Process\n\nRelease management handles:\n\n1. **Version Bumping** - Update version numbers across the project\n2. **Changelog Generation** - Create release notes from git history\n3. **Git Tagging** - Create git tags for releases\n4. **Release Notes** - Write comprehensive release documentation\n\n## Release Workflow\n\nPrepare a release:\n\n```\njust release-prepare --version 4.0.0 # Prepare new release\njust release-validate # Validate release readiness\n```\n\nCreate release artifacts:\n\n```\njust release-build # Build final release packages\njust release-tag # Create git tags\n```\n\nPublish release:\n\n```\njust release-publish # Upload to repositories\n```\n\n## Version Management\n\nVersions follow semantic versioning: `MAJOR.MINOR.PATCH`\n\n- **MAJOR** - Breaking changes\n- **MINOR** - New features (backward compatible)\n- **PATCH** - Bug fixes\n\nExample: `4.2.1` = Major 4, Minor 2, Patch 1\n\n## Release Artifacts\n\nA release includes:\n\n- Version-tagged binaries\n- Changelog with all changes\n- Release notes with highlights\n- Git tags for all milestones\n- Published packages in repositories\n\n## Cleanup\n\nRemove release artifacts:\n\n```\njust clean # Clean all build artifacts\n```\n\n## Related Directories\n\n- `dist/` - Build artifacts that releases are based on\n- `package/` - Packages that get versioned and released\n- `distribution/` - Distribution that incorporates release versions +# Release Management Output + +**Purpose**: Release artifacts, version tags, and release notes. + +## Contents + +This directory contains outputs and staging for release management operations. + +## Release Process + +Release management handles: + +1. **Version Bumping** - Update version numbers across the project +2. **Changelog Generation** - Create release notes from git history +3. **Git Tagging** - Create git tags for releases +4. 
**Release Notes** - Write comprehensive release documentation + +## Release Workflow + +Prepare a release: + +```bash +just release-prepare --version 4.0.0 # Prepare new release +just release-validate # Validate release readiness +``` + +Create release artifacts: + +```bash +just release-build # Build final release packages +just release-tag # Create git tags +``` + +Publish release: + +```bash +just release-publish # Upload to repositories +``` + +## Version Management + +Versions follow semantic versioning: `MAJOR.MINOR.PATCH` + +- **MAJOR** - Breaking changes +- **MINOR** - New features (backward compatible) +- **PATCH** - Bug fixes + +Example: `4.2.1` = Major 4, Minor 2, Patch 1 + +## Release Artifacts + +A release includes: + +- Version-tagged binaries +- Changelog with all changes +- Release notes with highlights +- Git tags for all milestones +- Published packages in repositories + +## Cleanup + +Remove release artifacts: + +```bash +just clean # Clean all build artifacts +``` + +## Related Directories + +- `dist/` - Build artifacts that releases are based on +- `package/` - Packages that get versioned and released +- `distribution/` - Distribution that incorporates release versions \ No newline at end of file diff --git a/workspace/README.md b/workspace/README.md index 508d914..d055492 100644 --- a/workspace/README.md +++ b/workspace/README.md @@ -1 +1,273 @@ -# Layered Template Architecture System\n\nThis workspace provides a combined **Layered Extension Architecture with Override System** and **Template-Based Infrastructure Pattern Library** that maintains PAP principles while enabling maximum reusability of infrastructure configurations.\n\n## 🏗️ Architecture Overview\n\n### Layer Hierarchy\n\nThe system resolves configurations through a three-tier layer system:\n\n1. **Core Layer (Priority 100)** - `provisioning/extensions/`\n - Base provisioning system extensions\n - Core taskservs, providers, and clusters\n\n2. **Workspace Layer (Priority 200)** - `provisioning/workspace/templates/`\n - Shared templates extracted from proven infrastructure patterns\n - Reusable configurations across multiple infrastructures\n\n3. 
**Infrastructure Layer (Priority 300)** - `workspace/infra/{name}/`\n - Infrastructure-specific configurations and overrides\n - Custom implementations per infrastructure\n\n### Directory Structure\n\n```\nprovisioning/workspace/\n├── templates/ # Template library\n│ ├── taskservs/ # Taskserv configuration templates\n│ │ ├── kubernetes/ # Kubernetes templates\n│ │ │ ├── base.k # Base configuration\n│ │ │ └── variants/ # HA, single-node variants\n│ │ ├── storage/ # Storage system templates\n│ │ ├── networking/ # Network configuration templates\n│ │ └── container-runtime/ # Container runtime templates\n│ ├── providers/ # Provider templates\n│ │ ├── upcloud/ # UpCloud provider templates\n│ │ └── aws/ # AWS provider templates\n│ ├── servers/ # Server configuration patterns\n│ └── clusters/ # Complete cluster templates\n├── layers/ # Layer definitions\n│ ├── core.layer.k # Core layer definition\n│ ├── workspace.layer.k # Workspace layer definition\n│ └── infra.layer.k # Infrastructure layer definition\n├── registry/ # Extension registry\n│ ├── manifest.yaml # Template catalog and metadata\n│ └── imports.k # Central import aliases\n├── templates/lib/ # Composition utilities\n│ ├── compose.k # KCL composition functions\n│ └── override.k # Override and layer utilities\n└── tools/ # Migration and management tools\n └── migrate-infra.nu # Infrastructure migration tool\n```\n\n## 🚀 Getting Started\n\n### 1. Extract Existing Patterns\n\nExtract patterns from existing infrastructure (e.g., wuji) to create reusable templates:\n\n```\n# Extract all patterns from wuji infrastructure\ncd provisioning/workspace/tools\n./migrate-infra.nu extract wuji\n\n# Extract specific types only\n./migrate-infra.nu extract wuji --type taskservs\n```\n\n### 2. Use Enhanced Module Loader\n\nThe enhanced module loader provides template and layer management:\n\n```\n# List available templates\nprovisioning/core/cli/module-loader-enhanced template list\n\n# Show layer resolution order\nprovisioning/core/cli/module-loader-enhanced layer show\n\n# Test layer resolution for a specific module\nprovisioning/core/cli/module-loader-enhanced layer test kubernetes --infra wuji\n```\n\n### 3. Apply Templates to New Infrastructure\n\n```\n# Apply kubernetes template to new infrastructure\nprovisioning/core/cli/module-loader-enhanced template apply kubernetes-base new-infra --provider upcloud\n\n# Load taskservs using templates\nprovisioning/core/cli/module-loader-enhanced load enhanced taskservs workspace/infra/new-infra [kubernetes, cilium] --layer workspace\n```\n\n## 📋 Usage Examples\n\n### Creating a New Infrastructure from Templates\n\n```\n# 1. Create directory structure\nmkdir -p workspace/infra/my-new-infra/{taskservs,defs,overrides}\n\n# 2. Apply base templates\ncd provisioning\n./core/cli/module-loader-enhanced template apply kubernetes-base my-new-infra\n\n# 3. Customize for your needs\n# Edit workspace/infra/my-new-infra/taskservs/kubernetes.k\n\n# 4. Test layer resolution\n./core/cli/module-loader-enhanced layer test kubernetes --infra my-new-infra\n```\n\n### Converting Existing Infrastructure\n\n```\n# 1. Extract patterns to templates\ncd provisioning/workspace/tools\n./migrate-infra.nu extract existing-infra\n\n# 2. Convert infrastructure to use templates\n./migrate-infra.nu convert existing-infra\n\n# 3. 
Validate conversion\n./migrate-infra.nu validate existing-infra\n```\n\n### Template Development\n\n```\n# Create a new taskserv template\n# provisioning/workspace/templates/taskservs/my-service/base.k\n\nimport taskservs.my_service.kcl.my_service as service_core\nimport ../../../workspace/templates/lib/compose as comp\n\nschema MyServiceBase {\n version: str = "1.0.0"\n cluster_name: str\n # ... configuration options\n}\n\ndef create_my_service [cluster_name: str, overrides: any = {}] -> any {\n let base_config = MyServiceBase { cluster_name = $cluster_name }\n let final_config = comp.deep_merge $base_config $overrides\n\n service_core.MyService $final_config\n}\n```\n\n## 🔧 Configuration Composition\n\n### Using Templates in Infrastructure\n\n```\n# workspace/infra/my-infra/taskservs/kubernetes.k\n\nimport provisioning.workspace.registry.imports as reg\nimport provisioning.workspace.templates.lib.override as ovr\n\n# Use base template with infrastructure-specific overrides\n_taskserv = ovr.infrastructure_overrides.taskserv_override(\n reg.tpl_kubernetes_base.kubernetes_base,\n "my-infra",\n "kubernetes",\n {\n cluster_name: "my-infra"\n version: "1.31.0" # Override version\n cri: "crio" # Override container runtime\n # Custom network configuration\n network_config: {\n pod_cidr: "10.244.0.0/16"\n service_cidr: "10.96.0.0/12"\n }\n }\n)\n```\n\n### Layer Composition\n\n```\n# Compose configuration through all layers\nimport provisioning.workspace.templates.lib.compose as comp\n\n# Manual layer composition\nfinal_config = comp.compose_templates(\n $core_config, # From provisioning/extensions\n $workspace_config, # From provisioning/workspace/templates\n $infra_config # From workspace/infra/{name}\n)\n```\n\n## 🛠️ Advanced Features\n\n### Provider-Aware Composition\n\n```\nimport provisioning.workspace.templates.lib.override as ovr\n\n# Apply provider-specific configurations\nconfig = ovr.override_patterns.env_override(\n $base_config,\n "upcloud",\n {\n upcloud: { zone: "es-mad1", plan: "2xCPU-4GB" },\n aws: { region: "us-west-2", instance_type: "t3.medium" },\n local: { memory: "4GB", cpus: 2 }\n }\n)\n```\n\n### Conditional Overrides\n\n```\n# Infrastructure-specific conditional overrides\nconfig = ovr.layer_resolution.infra_conditional(\n $base_config,\n $infra_name,\n {\n "production": { ha: true, replicas: 3 },\n "development": { ha: false, replicas: 1 },\n "default": { ha: false, replicas: 1 }\n }\n)\n```\n\n## 📚 Benefits\n\n### ✅ Maintains PAP Principles\n\n- **Configuration-driven**: No hardcoded values\n- **Modular**: Clear separation of concerns\n- **Declarative**: Infrastructure as code\n- **Reusable**: DRY principle throughout\n\n### ✅ Flexible Override System\n\n- **Layer-based resolution**: Clean precedence order\n- **Selective overrides**: Override only what's needed\n- **Provider-agnostic**: Works across all providers\n- **Environment-aware**: Dev/test/prod configurations\n\n### ✅ Template Reusability\n\n- **Pattern extraction**: Learn from existing infrastructures\n- **Template versioning**: Track evolution over time\n- **Composition utilities**: Rich KCL composition functions\n- **Migration tools**: Easy conversion process\n\n### ✅ No Core Changes\n\n- **Non-invasive**: Core provisioning system unchanged\n- **Backward compatible**: Existing infrastructures continue working\n- **Progressive adoption**: Migrate at your own pace\n- **Extensible**: Add new templates and layers easily\n\n## 🔄 Migration Path\n\n1. 
**Extract patterns** from existing infrastructures using `migrate-infra.nu extract`\n2. **Create templates** in `provisioning/workspace/templates/`\n3. **Convert infrastructures** to use templates with `migrate-infra.nu convert`\n4. **Validate** the conversion with `migrate-infra.nu validate`\n5. **Test** layer resolution with enhanced module loader\n6. **Iterate** and improve templates based on usage\n\n## 📖 Further Reading\n\n- **Layer Definitions**: See `layers/*.layer.k` for layer configuration\n- **Template Examples**: Browse `templates/` for real-world patterns\n- **Composition Utilities**: Check `templates/lib/` for KCL utilities\n- **Migration Tools**: Use `tools/migrate-infra.nu` for infrastructure conversion\n- **Registry System**: Explore `registry/` for template metadata and imports\n\nThis system provides the perfect balance of flexibility, reusability, and maintainability while preserving the core provisioning system's integrity. +# Layered Template Architecture System
+
+This workspace provides a combined **Layered Extension Architecture with Override System** and **Template-Based Infrastructure Pattern Library** that maintains PAP principles while enabling maximum reusability of infrastructure configurations.
+
+## 🏗️ Architecture Overview
+
+### Layer Hierarchy
+
+The system resolves configurations through a three-tier layer system:
+
+1. **Core Layer (Priority 100)** - `provisioning/extensions/`
+   - Base provisioning system extensions
+   - Core taskservs, providers, and clusters
+
+2. **Workspace Layer (Priority 200)** - `provisioning/workspace/templates/`
+   - Shared templates extracted from proven infrastructure patterns
+   - Reusable configurations across multiple infrastructures
+
+3. **Infrastructure Layer (Priority 300)** - `workspace/infra/{name}/`
+   - Infrastructure-specific configurations and overrides
+   - Custom implementations per infrastructure
+
+### Directory Structure
+
+```
+provisioning/workspace/
+├── templates/                 # Template library
+│   ├── taskservs/             # Taskserv configuration templates
+│   │   ├── kubernetes/        # Kubernetes templates
+│   │   │   ├── base.k         # Base configuration
+│   │   │   └── variants/      # HA, single-node variants
+│   │   ├── storage/           # Storage system templates
+│   │   ├── networking/        # Network configuration templates
+│   │   └── container-runtime/ # Container runtime templates
+│   ├── providers/             # Provider templates
+│   │   ├── upcloud/           # UpCloud provider templates
+│   │   └── aws/               # AWS provider templates
+│   ├── servers/               # Server configuration patterns
+│   └── clusters/              # Complete cluster templates
+├── layers/                    # Layer definitions
+│   ├── core.layer.k           # Core layer definition
+│   ├── workspace.layer.k      # Workspace layer definition
+│   └── infra.layer.k          # Infrastructure layer definition
+├── registry/                  # Extension registry
+│   ├── manifest.yaml          # Template catalog and metadata
+│   └── imports.k              # Central import aliases
+├── templates/lib/             # Composition utilities
+│   ├── compose.k              # KCL composition functions
+│   └── override.k             # Override and layer utilities
+└── tools/                     # Migration and management tools
+    └── migrate-infra.nu       # Infrastructure migration tool
+```
+
+## 🚀 Getting Started
+
+### 1. Extract Existing Patterns
+
+Extract patterns from existing infrastructure (e.g., wuji) to create reusable templates:
+
+```bash
+# Extract all patterns from wuji infrastructure
+cd provisioning/workspace/tools
+./migrate-infra.nu extract wuji
+
+# Extract specific types only
+./migrate-infra.nu extract wuji --type taskservs
+```
+
+### 2.
Use Enhanced Module Loader
+
+The enhanced module loader provides template and layer management:
+
+```bash
+# List available templates
+provisioning/core/cli/module-loader-enhanced template list
+
+# Show layer resolution order
+provisioning/core/cli/module-loader-enhanced layer show
+
+# Test layer resolution for a specific module
+provisioning/core/cli/module-loader-enhanced layer test kubernetes --infra wuji
+```
+
+### 3. Apply Templates to New Infrastructure
+
+```bash
+# Apply kubernetes template to new infrastructure
+provisioning/core/cli/module-loader-enhanced template apply kubernetes-base new-infra --provider upcloud
+
+# Load taskservs using templates
+provisioning/core/cli/module-loader-enhanced load enhanced taskservs workspace/infra/new-infra [kubernetes, cilium] --layer workspace
+```
+
+## 📋 Usage Examples
+
+### Creating a New Infrastructure from Templates
+
+```bash
+# 1. Create directory structure
+mkdir -p workspace/infra/my-new-infra/{taskservs,defs,overrides}
+
+# 2. Apply base templates
+cd provisioning
+./core/cli/module-loader-enhanced template apply kubernetes-base my-new-infra
+
+# 3. Customize for your needs
+# Edit workspace/infra/my-new-infra/taskservs/kubernetes.k
+
+# 4. Test layer resolution
+./core/cli/module-loader-enhanced layer test kubernetes --infra my-new-infra
+```
+
+### Converting Existing Infrastructure
+
+```bash
+# 1. Extract patterns to templates
+cd provisioning/workspace/tools
+./migrate-infra.nu extract existing-infra
+
+# 2. Convert infrastructure to use templates
+./migrate-infra.nu convert existing-infra
+
+# 3. Validate conversion
+./migrate-infra.nu validate existing-infra
+```
+
+### Template Development
+
+```kcl
+# Create a new taskserv template
+# provisioning/workspace/templates/taskservs/my-service/base.k
+
+import taskservs.my_service.kcl.my_service as service_core
+import ../../../workspace/templates/lib/compose as comp
+
+schema MyServiceBase {
+    version: str = "1.0.0"
+    cluster_name: str
+    # ... configuration options
+}
+
+def create_my_service [cluster_name: str, overrides: any = {}] -> any {
+    let base_config = MyServiceBase { cluster_name = $cluster_name }
+    let final_config = comp.deep_merge $base_config $overrides
+
+    service_core.MyService $final_config
+}
+```
+
+## 🔧 Configuration Composition
+
+### Using Templates in Infrastructure
+
+```kcl
+# workspace/infra/my-infra/taskservs/kubernetes.k
+
+import provisioning.workspace.registry.imports as reg
+import provisioning.workspace.templates.lib.override as ovr
+
+# Use base template with infrastructure-specific overrides
+_taskserv = ovr.infrastructure_overrides.taskserv_override(
+    reg.tpl_kubernetes_base.kubernetes_base,
+    "my-infra",
+    "kubernetes",
+    {
+        cluster_name: "my-infra"
+        version: "1.31.0"      # Override version
+        cri: "crio"            # Override container runtime
+        # Custom network configuration
+        network_config: {
+            pod_cidr: "10.244.0.0/16"
+            service_cidr: "10.96.0.0/12"
+        }
+    }
+)
+```
+
+### Layer Composition
+
+```kcl
+# Compose configuration through all layers
+import provisioning.workspace.templates.lib.compose as comp
+
+# Manual layer composition
+final_config = comp.compose_templates(
+    $core_config,        # From provisioning/extensions
+    $workspace_config,   # From provisioning/workspace/templates
+    $infra_config        # From workspace/infra/{name}
+)
+```
+
+## 🛠️ Advanced Features
+
+### Provider-Aware Composition
+
+```kcl
+import provisioning.workspace.templates.lib.override as ovr
+
+# Apply provider-specific configurations
+config = ovr.override_patterns.env_override(
+    $base_config,
+    "upcloud",
+    {
+        upcloud: { zone: "es-mad1", plan: "2xCPU-4GB" },
+        aws: { region: "us-west-2", instance_type: "t3.medium" },
+        local: { memory: "4GB", cpus: 2 }
+    }
+)
+```
+
+### Conditional Overrides
+
+```kcl
+# Infrastructure-specific conditional overrides
+config = ovr.layer_resolution.infra_conditional(
+    $base_config,
+    $infra_name,
+    {
+        "production": { ha: true, replicas: 3 },
+        "development": { ha: false, replicas: 1 },
+        "default": { ha: false, replicas: 1 }
+    }
+)
+```
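+
+To make the layer semantics concrete, here is a minimal Nushell sketch of priority-based resolution (the `deep-merge` helper below is local to this example, not a provisioning API; the KCL utilities in `templates/lib/compose.k` are the real implementation):
+
+```nu
+# Local helper for this sketch only: recursively merge two records,
+# with keys from the higher-priority record winning.
+def deep-merge [base: record, override: record] {
+    mut out = $base
+    for key in ($override | columns) {
+        let new_val = ($override | get $key)
+        let old_val = if $key in ($base | columns) { $base | get $key } else { null }
+        if ($old_val | describe) =~ '^record' and ($new_val | describe) =~ '^record' {
+            $out = ($out | upsert $key (deep-merge $old_val $new_val))
+        } else {
+            $out = ($out | upsert $key $new_val)
+        }
+    }
+    $out
+}
+
+# Illustrative configs for the three layers (values are made up)
+let core      = { version: "1.30.0", ha: false, network: { pod_cidr: "10.0.0.0/16" } }
+let workspace = { ha: true }
+let infra     = { version: "1.31.0", network: { pod_cidr: "10.244.0.0/16" } }
+
+# Core (100) < Workspace (200) < Infrastructure (300): later layers win key by key
+deep-merge (deep-merge $core $workspace) $infra
+# => { version: "1.31.0", ha: true, network: { pod_cidr: "10.244.0.0/16" } }
+```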
+
+## 📚 Benefits
+
+### ✅ Maintains PAP Principles
+
+- **Configuration-driven**: No hardcoded values
+- **Modular**: Clear separation of concerns
+- **Declarative**: Infrastructure as code
+- **Reusable**: DRY principle throughout
+
+### ✅ Flexible Override System
+
+- **Layer-based resolution**: Clean precedence order
+- **Selective overrides**: Override only what's needed
+- **Provider-agnostic**: Works across all providers
+- **Environment-aware**: Dev/test/prod configurations
+
+### ✅ Template Reusability
+
+- **Pattern extraction**: Learn from existing infrastructures
+- **Template versioning**: Track evolution over time
+- **Composition utilities**: Rich KCL composition functions
+- **Migration tools**: Easy conversion process
+
+### ✅ No Core Changes
+
+- **Non-invasive**: Core provisioning system unchanged
+- **Backward compatible**: Existing infrastructures continue working
+- **Progressive adoption**: Migrate at your own pace
+- **Extensible**: Add new templates and layers easily
+
+## 🔄 Migration Path
+
+1. **Extract patterns** from existing infrastructures using `migrate-infra.nu extract`
+2. **Create templates** in `provisioning/workspace/templates/`
+3. **Convert infrastructures** to use templates with `migrate-infra.nu convert`
+4. **Validate** the conversion with `migrate-infra.nu validate`
+5. **Test** layer resolution with enhanced module loader
+6. **Iterate** and improve templates based on usage
+
+## 📖 Further Reading
+
+- **Layer Definitions**: See `layers/*.layer.k` for layer configuration
+- **Template Examples**: Browse `templates/` for real-world patterns
+- **Composition Utilities**: Check `templates/lib/` for KCL utilities
+- **Migration Tools**: Use `tools/migrate-infra.nu` for infrastructure conversion
+- **Registry System**: Explore `registry/` for template metadata and imports
+
+This system balances flexibility, reusability, and maintainability while preserving the core provisioning system's integrity.
\ No newline at end of file