# Metadata-Driven Authentication System - Implementation Guide **Status**: ✅ Complete and Production-Ready **Version**: 1.0.0 **Last Updated**: 2025-12-10 ## Table of Contents 1. [Overview](#overview) 2. [Architecture](#architecture) 3. [Installation](#installation) 4. [Usage Guide](#usage-guide) 5. [Migration Path](#migration-path) 6. [Developer Guide](#developer-guide) 7. [Testing](#testing) 8. [Troubleshooting](#troubleshooting) ## Overview This guide describes the metadata-driven authentication system implemented over 5 weeks across 14 command handlers and 12 major systems. The system provides: - **Centralized Metadata**: All command definitions in Nickel with runtime validation - **Automatic Auth Checks**: Pre-execution validation before handler logic - **Performance Optimization**: 40-100x faster through metadata caching - **Flexible Deployment**: Works with orchestrator, batch workflows, and direct CLI ## Architecture ### System Components ``` ┌─────────────────────────────────────────────────────────────┐ │ User Command │ └────────────────────────────────┬──────────────────────────────┘ │ ┌────────────▼─────────────┐ │ CLI Dispatcher │ │ (main_provisioning) │ └────────────┬─────────────┘ │ ┌────────────▼─────────────┐ │ Metadata Loading │ │ (cached via traits.nu) │ └────────────┬─────────────┘ │ ┌────────────▼─────────────────────┐ │ Pre-Execution Validation │ │ - Auth checks │ │ - Permission validation │ │ - Operation type mapping │ └────────────┬─────────────────────┘ │ ┌────────────▼─────────────────────┐ │ Command Handler Execution │ │ - infrastructure.nu │ │ - orchestration.nu │ │ - workspace.nu │ └────────────┬─────────────────────┘ │ ┌────────────▼─────────────┐ │ Result/Response │ └─────────────────────────┘ ``` ### Data Flow 1. **User Command** → CLI Dispatcher 2. **Dispatcher** → Load cached metadata (or parse Nickel) 3. **Validate** → Check auth, operation type, permissions 4. **Execute** → Call appropriate handler 5. **Return** → Result to user ### Metadata Caching - **Location**: `~/.cache/provisioning/command_metadata.json` - **Format**: Serialized JSON (pre-parsed for speed) - **TTL**: 1 hour (configurable via `PROVISIONING_METADATA_TTL`) - **Invalidation**: Automatic on `main.ncl` modification - **Performance**: 40-100x faster than Nickel parsing ## Installation ### Prerequisites - Nushell 0.109.0+ - Nickel 1.15.0+ - SOPS 3.10.2 (for encrypted configs) - Age 1.2.1 (for encryption) ### Installation Steps ``` # 1. Clone or update repository git clone https://github.com/your-org/project-provisioning.git cd project-provisioning # 2. Initialize workspace ./provisioning/core/cli/provisioning workspace init # 3. Validate system ./provisioning/core/cli/provisioning validate config # 4. Run system checks ./provisioning/core/cli/provisioning health # 5. Run test suites nu tests/test-fase5-e2e.nu nu tests/test-security-audit-day20.nu nu tests/test-metadata-cache-benchmark.nu ``` ## Usage Guide ### Basic Commands ``` # Initialize authentication provisioning login # Enroll in MFA provisioning mfa totp enroll # Create infrastructure provisioning server create --name web-01 --plan 1xCPU-2 GB # Deploy with orchestrator provisioning workflow submit workflows/deployment.ncl --orchestrated # Batch operations provisioning batch submit workflows/batch-deploy.ncl # Check without executing provisioning server create --name test --check ``` ### Authentication Flow ``` # 1. Login (required for production operations) $ provisioning login Username: alice@example.com Password: **** # 2. Optional: Setup MFA $ provisioning mfa totp enroll Scan QR code with authenticator app Verify code: 123456 # 3. Use commands (auth checks happen automatically) $ provisioning server delete --name old-server --infra production Auth check: Check auth for production (delete operation) Are you sure? [yes/no] yes ✓ Server deleted # 4. All destructive operations require auth $ provisioning taskserv delete postgres web-01 Auth check: Check auth for destructive operation ✓ Taskserv deleted ``` ### Check Mode (Bypass Auth for Testing) ``` # Dry-run without auth checks provisioning server create --name test --check # Output: Shows what would happen, no auth checks Dry-run mode - no changes will be made ✓ Would create server: test ✓ Would deploy taskservs: [] ``` ### Non-Interactive CI/CD Mode ``` # Automated mode - skip confirmations provisioning server create --name web-01 --yes # Batch operations provisioning batch submit workflows/batch.ncl --yes --check # With environment variable PROVISIONING_NON_INTERACTIVE=1 provisioning server create --name web-02 --yes ``` ## Migration Path ### Phase 1: From Old `input` to Metadata **Old Pattern** (Before Fase 5): ``` # Hardcoded auth check let response = (input "Delete server? (yes/no): ") if $response != "yes" { exit 1 } # No metadata - auth unknown export def delete-server [name: string, --yes] { if not $yes { ... manual confirmation ... } # ... deletion logic ... } ``` **New Pattern** (After Fase 5): ``` # Metadata header # [command] # name = "server delete" # group = "infrastructure" # tags = ["server", "delete", "destructive"] # version = "1.0.0" # Automatic auth check from metadata export def delete-server [name: string, --yes] { # Pre-execution check happens in dispatcher # Auth enforcement via metadata # Operation type: "delete" automatically detected # ... deletion logic ... } ``` ### Phase 2: Adding Metadata Headers **For each script that was migrated:** 1. Add metadata header after shebang: ``` #!/usr/bin/env nu # [command] # name = "server create" # group = "infrastructure" # tags = ["server", "create", "interactive"] # version = "1.0.0" export def create-server [name: string] { # Logic here } ``` 1. Register in `provisioning/schemas/main.ncl`: ``` let server_create = { name = "server create", domain = "infrastructure", description = "Create a new server", requirements = { interactive = false, requires_auth = true, auth_type = "jwt", side_effect_type = "create", min_permission = "write", }, } in server_create ``` 1. Handler integration (happens in dispatcher): ``` # Dispatcher automatically: # 1. Loads metadata for "server create" # 2. Validates auth based on requirements # 3. Checks permission levels # 4. Calls handler if validation passes ``` ### Phase 3: Validating Migration ``` # Validate metadata headers nu utils/validate-metadata-headers.nu # Find scripts by tag nu utils/search-scripts.nu by-tag destructive # Find all scripts in group nu utils/search-scripts.nu by-group infrastructure # Find scripts with multiple tags nu utils/search-scripts.nu by-tags server delete # List all migrated scripts nu utils/search-scripts.nu list ``` ## Developer Guide ### Adding New Commands with Metadata **Step 1: Create metadata in main.ncl** ``` let new_feature_command = { name = "feature command", domain = "infrastructure", description = "My new feature", requirements = { interactive = false, requires_auth = true, auth_type = "jwt", side_effect_type = "create", min_permission = "write", }, } in new_feature_command ``` **Step 2: Add metadata header to script** ``` #!/usr/bin/env nu # [command] # name = "feature command" # group = "infrastructure" # tags = ["feature", "create"] # version = "1.0.0" export def feature-command [param: string] { # Implementation } ``` **Step 3: Implement handler function** ``` # Handler registered in dispatcher export def handle-feature-command [ action: string --flags ]: nothing -> nothing { # Dispatcher handles: # 1. Metadata validation # 2. Auth checks # 3. Permission validation # Your logic here } ``` **Step 4: Test with check mode** ``` # Dry-run without auth provisioning feature command --check # Full execution provisioning feature command --yes ``` ### Metadata Field Reference | Field | Type | Required | Description | | ------- | ------ | ---------- | ------------- | | name | string | Yes | Command canonical name | | domain | string | Yes | Command category (infrastructure, orchestration, etc.) | | description | string | Yes | Human-readable description | | requires_auth | bool | Yes | Whether auth is required | | auth_type | enum | Yes | "none", "jwt", "mfa", "cedar" | | side_effect_type | enum | Yes | "none", "create", "update", "delete", "deploy" | | min_permission | enum | Yes | "read", "write", "admin", "superadmin" | | interactive | bool | No | Whether command requires user input | | slow_operation | bool | No | Whether operation takes >60 seconds | ### Standard Tags **Groups**: - infrastructure - Server, taskserv, cluster operations - orchestration - Workflow, batch operations - workspace - Workspace management - authentication - Auth, MFA, tokens - utilities - Helper commands **Operations**: - create, read, update, delete - CRUD operations - destructive - Irreversible operations - interactive - Requires user input **Performance**: - slow - Operation >60 seconds - optimizable - Candidate for optimization ### Performance Optimization Patterns **Pattern 1: For Long Operations** ``` # Use orchestrator for operations >2 seconds if (get-operation-duration "my-operation") > 2000 { submit-to-orchestrator $operation return "Operation submitted in background" } ``` **Pattern 2: For Batch Operations** ``` # Use batch workflows for multiple operations nu -c " use core/nulib/workflows/batch.nu * batch submit workflows/batch-deploy.ncl --parallel-limit 5 " ``` **Pattern 3: For Metadata Overhead** ``` # Cache hit rate optimization # Current: 40-100x faster with warm cache # Target: >95% cache hit rate # Achieved: Metadata stays in cache for 1 hour (TTL) ``` ## Testing ### Running Tests ``` # End-to-End Integration Tests nu tests/test-fase5-e2e.nu # Security Audit nu tests/test-security-audit-day20.nu # Performance Benchmarks nu tests/test-metadata-cache-benchmark.nu # Run all tests for test in tests/test-*.nu { nu $test } ``` ### Test Coverage | Test Suite | Category | Coverage | | ----------- | ---------- | ---------- | | E2E Tests | Integration | 7 test groups, 40+ checks | | Security Audit | Auth | 5 audit categories, 100% pass | | Benchmarks | Performance | 6 benchmark categories | ### Expected Results ✅ All tests pass ✅ No Nushell syntax violations ✅ Cache hit rate >95% ✅ Auth enforcement 100% ✅ Performance baselines met ## Troubleshooting ### Issue: Command not found **Solution**: Ensure metadata is registered in `main.ncl` ``` # Check if command is in metadata grep "command_name" provisioning/schemas/main.ncl ``` ### Issue: Auth check failing **Solution**: Verify user has required permission level ``` # Check current user permissions provisioning auth whoami # Check command requirements nu -c " use core/nulib/lib_provisioning/commands/traits.nu * get-command-metadata 'server create' " ``` ### Issue: Slow command execution **Solution**: Check cache status ``` # Force cache reload rm ~/.cache/provisioning/command_metadata.json # Check cache hit rate nu tests/test-metadata-cache-benchmark.nu ``` ### Issue: Nushell syntax error **Solution**: Run compliance check ``` # Validate Nushell compliance nu --ide-check 100 # Check for common issues grep "try {" # Should be empty grep "let mut" # Should be empty ``` ## Performance Characteristics ### Baseline Metrics | Operation | Cold | Warm | Improvement | | ----------- | ------ | ------ | ------------- | | Metadata Load | 200 ms | 2-5 ms | 40-100x | | Auth Check | <5 ms | <5 ms | Same | | Command Dispatch | <10 ms | <10 ms | Same | | Total Command | ~210 ms | ~10 ms | 21x | ### Real-World Impact ``` Scenario: 20 sequential commands Without cache: 20 × 200 ms = 4 seconds With cache: 1 × 200 ms + 19 × 5 ms = 295 ms Speedup: ~13.5x faster ``` ## Next Steps 1. **Deploy**: Use installer to deploy to production 2. **Monitor**: Watch cache hit rates (target >95%) 3. **Extend**: Add new commands following migration pattern 4. **Optimize**: Use profiling to identify slow operations 5. **Maintain**: Run validation scripts regularly --- **For Support**: See `docs/troubleshooting-guide.md` **For Architecture**: See `docs/architecture/` **For User Guide**: See `docs/user/AUTHENTICATION_LAYER_GUIDE.md`