# Metadata-Driven Authentication System - Implementation Guide

**Status**: ✅ Complete and Production-Ready
**Version**: 1.0.0
**Last Updated**: 2025-12-10

## Table of Contents

1. [Overview](#overview)
2. [Architecture](#architecture)
3. [Installation](#installation)
4. [Usage Guide](#usage-guide)
5. [Migration Path](#migration-path)
6. [Developer Guide](#developer-guide)
7. [Testing](#testing)
8. [Troubleshooting](#troubleshooting)
9. [Performance Characteristics](#performance-characteristics)
10. [Next Steps](#next-steps)

## Overview

This guide describes the metadata-driven authentication system implemented over 5 weeks across 14 command handlers and 12 major systems. The system provides:

- **Centralized Metadata**: All command definitions in Nickel with runtime validation
- **Automatic Auth Checks**: Pre-execution validation before handler logic
- **Performance Optimization**: Metadata loads 40-100x faster from a warm cache than from parsing Nickel
- **Flexible Deployment**: Works with orchestrator, batch workflows, and direct CLI

## Architecture

### System Components

```
┌─────────────────────────────────────────────────────────────┐
│                        User Command                         │
└────────────────────────────────┬────────────────────────────┘
                                 │
                    ┌────────────▼─────────────┐
                    │    CLI Dispatcher        │
                    │  (main_provisioning)     │
                    └────────────┬─────────────┘
                                 │
                    ┌────────────▼─────────────┐
                    │  Metadata Loading        │
                    │  (cached via traits.nu)  │
                    └────────────┬─────────────┘
                                 │
                    ┌────────────▼─────────────────────┐
                    │  Pre-Execution Validation        │
                    │  - Auth checks                   │
                    │  - Permission validation         │
                    │  - Operation type mapping        │
                    └────────────┬─────────────────────┘
                                 │
                    ┌────────────▼─────────────────────┐
                    │  Command Handler Execution       │
                    │  - infrastructure.nu             │
                    │  - orchestration.nu              │
                    │  - workspace.nu                  │
                    └────────────┬─────────────────────┘
                                 │
                    ┌────────────▼─────────────┐
                    │   Result/Response        │
                    └──────────────────────────┘
```

### Data Flow

1. **User Command** → CLI Dispatcher
2. **Dispatcher** → Load cached metadata (or parse Nickel on a cache miss)
3. **Validate** → Check auth, operation type, permissions
4. **Execute** → Call appropriate handler
5. **Return** → Result to user
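
The five steps above can be modeled in a few lines. The following Python is an illustrative sketch only, not the project's Nushell dispatcher: `dispatch`, `User`, `AuthError`, and `PERMISSION_RANK` are hypothetical names, and the ordering read < write < admin < superadmin is an assumption based on the order in which this guide lists the permission levels.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Assumed ordering of permission levels, lowest to highest (hypothetical).
PERMISSION_RANK = {"read": 0, "write": 1, "admin": 2, "superadmin": 3}


class AuthError(Exception):
    """Raised when pre-execution validation rejects a command."""


@dataclass
class User:
    name: str
    authenticated: bool
    permission: str


def dispatch(command: str, user: User,
             metadata: Dict[str, dict],
             handlers: Dict[str, Callable[[], str]]) -> str:
    # Step 2: look up the (already loaded) metadata for this command.
    meta = metadata.get(command)
    if meta is None:
        raise KeyError(f"no metadata registered for {command!r}")

    # Step 3: validate auth and permissions before any handler logic runs.
    req = meta["requirements"]
    if req["requires_auth"]:
        if not user.authenticated:
            raise AuthError(f"{command!r} requires login")
        if PERMISSION_RANK[user.permission] < PERMISSION_RANK[req["min_permission"]]:
            raise AuthError(f"{command!r} requires {req['min_permission']} permission")

    # Steps 4-5: call the handler and return its result to the caller.
    return handlers[command]()


metadata = {"server create": {"requirements": {"requires_auth": True,
                                               "min_permission": "write"}}}
handlers = {"server create": lambda: "server created"}
print(dispatch("server create", User("alice", True, "write"), metadata, handlers))
```

Keeping validation in the dispatcher means a handler never runs for a rejected request, which is the property the metadata-driven design relies on.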

### Metadata Caching

- **Location**: `~/.cache/provisioning/command_metadata.json`
- **Format**: Serialized JSON (pre-parsed for speed)
- **TTL**: 1 hour (configurable via `PROVISIONING_METADATA_TTL`)
- **Invalidation**: Automatic on `main.ncl` modification
- **Performance**: 40-100x faster than Nickel parsing
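
As a rough illustration of the TTL and invalidation rules listed above, the sketch below reuses a JSON cache only while it is younger than the TTL and not older than the source file. It is written in Python for readability and is not the code in `traits.nu`; `load_metadata`, `parse_source`, and `ttl_seconds` are hypothetical names, and because the units of `PROVISIONING_METADATA_TTL` are not stated here, the TTL is passed in explicitly as seconds.

```python
import json
import os
import time
from pathlib import Path
from typing import Callable, Optional


def load_metadata(cache_path: Path, source_path: Path,
                  parse_source: Callable[[Path], dict],
                  ttl_seconds: float = 3600,
                  now: Optional[float] = None) -> dict:
    """Return cached metadata when valid, otherwise re-parse and rewrite the cache."""
    now = time.time() if now is None else now

    if cache_path.exists():
        cache_mtime = cache_path.stat().st_mtime
        fresh = (now - cache_mtime) < ttl_seconds              # TTL check
        source_unchanged = source_path.stat().st_mtime <= cache_mtime  # invalidation check
        if fresh and source_unchanged:
            return json.loads(cache_path.read_text())

    # Cache miss: run the slow parse once, then store the result as JSON.
    metadata = parse_source(source_path)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(metadata))
    return metadata
```

Here `parse_source` stands in for the slow Nickel evaluation of `main.ncl`; comparing modification times is one simple way to get automatic invalidation without hashing the file.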

## Installation

### Prerequisites

- Nushell 0.109.0+
- Nickel 1.15.0+
- SOPS 3.10.2 (for encrypted configs)
- Age 1.2.1 (for encryption)

### Installation Steps

```
# 1. Clone or update repository
git clone https://github.com/your-org/project-provisioning.git
cd project-provisioning

# 2. Initialize workspace
./provisioning/core/cli/provisioning workspace init

# 3. Validate system
./provisioning/core/cli/provisioning validate config

# 4. Run system checks
./provisioning/core/cli/provisioning health

# 5. Run test suites
nu tests/test-fase5-e2e.nu
nu tests/test-security-audit-day20.nu
nu tests/test-metadata-cache-benchmark.nu
```

## Usage Guide

### Basic Commands

```
# Initialize authentication
provisioning login

# Enroll in MFA
provisioning mfa totp enroll

# Create infrastructure
provisioning server create --name web-01 --plan 1xCPU-2GB

# Deploy with orchestrator
provisioning workflow submit workflows/deployment.ncl --orchestrated

# Batch operations
provisioning batch submit workflows/batch-deploy.ncl

# Check without executing
provisioning server create --name test --check
```

### Authentication Flow

```
# 1. Login (required for production operations)
$ provisioning login
Username: alice@example.com
Password: ****

# 2. Optional: Setup MFA
$ provisioning mfa totp enroll
Scan QR code with authenticator app
Verify code: 123456

# 3. Use commands (auth checks happen automatically)
$ provisioning server delete --name old-server --infra production
Auth check: Check auth for production (delete operation)
Are you sure? [yes/no] yes
✓ Server deleted

# 4. All destructive operations require auth
$ provisioning taskserv delete postgres web-01
Auth check: Check auth for destructive operation
✓ Taskserv deleted
```

### Check Mode (Bypass Auth for Testing)

```
# Dry-run without auth checks
provisioning server create --name test --check

# Output: Shows what would happen, no auth checks
Dry-run mode - no changes will be made
✓ Would create server: test
✓ Would deploy taskservs: []
```

### Non-Interactive CI/CD Mode

```
# Automated mode - skip confirmations
provisioning server create --name web-01 --yes

# Batch operations
provisioning batch submit workflows/batch.ncl --yes --check

# With environment variable
PROVISIONING_NON_INTERACTIVE=1 provisioning server create --name web-02 --yes
```

## Migration Path

### Phase 1: From Old `input` to Metadata

**Old Pattern** (Before Fase 5):

```
# Hardcoded auth check: manual confirmation inside the handler
# No metadata - auth requirements unknown
export def delete-server [name: string, --yes] {
    if not $yes {
        let response = (input "Delete server? (yes/no): ")
        if $response != "yes" { exit 1 }
    }
    # ... deletion logic ...
}
```

**New Pattern** (After Fase 5):

```
# Metadata header
# [command]
# name = "server delete"
# group = "infrastructure"
# tags = ["server", "delete", "destructive"]
# version = "1.0.0"

# Automatic auth check from metadata
export def delete-server [name: string, --yes] {
    # Pre-execution check happens in dispatcher
    # Auth enforcement via metadata
    # Operation type: "delete" automatically detected
    # ... deletion logic ...
}
```

### Phase 2: Adding Metadata Headers

**For each script being migrated:**

1. Add metadata header after shebang:

```
#!/usr/bin/env nu
# [command]
# name = "server create"
# group = "infrastructure"
# tags = ["server", "create", "interactive"]
# version = "1.0.0"

export def create-server [name: string] {
    # Logic here
}
```

2. Register in `provisioning/schemas/main.ncl`:

```
let server_create = {
    name = "server create",
    domain = "infrastructure",
    description = "Create a new server",
    requirements = {
        interactive = false,
        requires_auth = true,
        auth_type = "jwt",
        side_effect_type = "create",
        min_permission = "write",
    },
} in
server_create
```

3. Handler integration (happens in dispatcher):

```
# Dispatcher automatically:
# 1. Loads metadata for "server create"
# 2. Validates auth based on requirements
# 3. Checks permission levels
# 4. Calls handler if validation passes
```
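
To make the header format from step 1 concrete, here is a minimal sketch of how such a comment header could be read back out of a script. It is an assumption-laden illustration in Python, not the project's `validate-metadata-headers.nu`: `parse_metadata_header` is a hypothetical name, the header is assumed to sit in the leading comment lines, and values are read as Python literals, which happens to work for the quoted strings and string lists shown above.

```python
import ast


def parse_metadata_header(script_text: str) -> dict:
    """Extract key/value pairs from a '# [command]' comment header."""
    fields = {}
    in_block = False
    for line in script_text.splitlines():
        if line.startswith("#!"):
            continue  # skip the shebang line
        if not line.startswith("#"):
            break  # the header ends at the first non-comment line
        body = line[1:].strip()
        if body == "[command]":
            in_block = True
        elif in_block and "=" in body:
            key, _, value = body.partition("=")
            # Values such as "server create" or ["a", "b"] parse as Python literals.
            fields[key.strip()] = ast.literal_eval(value.strip())
    return fields


script = '''#!/usr/bin/env nu
# [command]
# name = "server create"
# group = "infrastructure"
# tags = ["server", "create", "interactive"]
# version = "1.0.0"

export def create-server [name: string] {
}
'''
print(parse_metadata_header(script))
```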

### Phase 3: Validating Migration

```
# Validate metadata headers
nu utils/validate-metadata-headers.nu

# Find scripts by tag
nu utils/search-scripts.nu by-tag destructive

# Find all scripts in group
nu utils/search-scripts.nu by-group infrastructure

# Find scripts with multiple tags
nu utils/search-scripts.nu by-tags server delete

# List all migrated scripts
nu utils/search-scripts.nu list
```
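
The multi-tag search can be pictured as a subset test over each script's `tags` list. The sketch below is illustrative Python, not the logic of `search-scripts.nu`: `find_by_tags` and the script paths are hypothetical, and it assumes `by-tags` returns scripts that carry all of the requested tags.

```python
from typing import Dict, Iterable, List


def find_by_tags(headers: Dict[str, dict], wanted: Iterable[str]) -> List[str]:
    """Return script paths whose tags include every requested tag."""
    wanted_set = set(wanted)
    return sorted(
        path
        for path, header in headers.items()
        if wanted_set <= set(header.get("tags", []))
    )


# Hypothetical script paths mapped to their parsed metadata headers.
headers = {
    "servers/delete.nu": {"tags": ["server", "delete", "destructive"]},
    "servers/create.nu": {"tags": ["server", "create", "interactive"]},
    "taskservs/delete.nu": {"tags": ["taskserv", "delete", "destructive"]},
}
print(find_by_tags(headers, ["server", "delete"]))
```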

## Developer Guide

### Adding New Commands with Metadata

**Step 1: Create metadata in main.ncl**

```
let new_feature_command = {
    name = "feature command",
    domain = "infrastructure",
    description = "My new feature",
    requirements = {
        interactive = false,
        requires_auth = true,
        auth_type = "jwt",
        side_effect_type = "create",
        min_permission = "write",
    },
} in
new_feature_command
```

**Step 2: Add metadata header to script**

```
#!/usr/bin/env nu
# [command]
# name = "feature command"
# group = "infrastructure"
# tags = ["feature", "create"]
# version = "1.0.0"

export def feature-command [param: string] {
    # Implementation
}
```

**Step 3: Implement handler function**

```
# Handler registered in dispatcher
export def handle-feature-command [
    action: string
    --flags
]: nothing -> nothing {
    # Dispatcher handles:
    # 1. Metadata validation
    # 2. Auth checks
    # 3. Permission validation

    # Your logic here
}
```

**Step 4: Test with check mode**

```
# Dry-run without auth
provisioning feature command --check

# Full execution
provisioning feature command --yes
```

### Metadata Field Reference

| Field | Type | Required | Description |
| ------- | ------ | ---------- | ------------- |
| name | string | Yes | Command canonical name |
| domain | string | Yes | Command category (infrastructure, orchestration, etc.) |
| description | string | Yes | Human-readable description |
| requires_auth | bool | Yes | Whether auth is required |
| auth_type | enum | Yes | "none", "jwt", "mfa", "cedar" |
| side_effect_type | enum | Yes | "none", "create", "update", "delete", "deploy" |
| min_permission | enum | Yes | "read", "write", "admin", "superadmin" |
| interactive | bool | No | Whether command requires user input |
| slow_operation | bool | No | Whether operation takes >60 seconds |
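
The field table translates fairly directly into a validation routine. The Python below is a sketch under stated assumptions rather than the Nickel contract actually used for runtime validation: `validate_command_metadata` is a hypothetical name, and it assumes the fields from `requires_auth` onward are nested under `requirements`, as in the `main.ncl` examples in this guide.

```python
from typing import List

AUTH_TYPES = {"none", "jwt", "mfa", "cedar"}
SIDE_EFFECT_TYPES = {"none", "create", "update", "delete", "deploy"}
PERMISSIONS = {"read", "write", "admin", "superadmin"}


def validate_command_metadata(meta: dict) -> List[str]:
    """Return a list of problems; an empty list means the record is valid."""
    errors = []

    # Required top-level string fields.
    for field in ("name", "domain", "description"):
        value = meta.get(field)
        if not isinstance(value, str) or not value:
            errors.append(f"{field}: required non-empty string")

    req = meta.get("requirements", {})

    # Required boolean and enum fields.
    if not isinstance(req.get("requires_auth"), bool):
        errors.append("requires_auth: required bool")
    for field, allowed in (("auth_type", AUTH_TYPES),
                           ("side_effect_type", SIDE_EFFECT_TYPES),
                           ("min_permission", PERMISSIONS)):
        if req.get(field) not in allowed:
            errors.append(f"{field}: must be one of {sorted(allowed)}")

    # Optional boolean fields are checked only when present.
    for field in ("interactive", "slow_operation"):
        if field in req and not isinstance(req[field], bool):
            errors.append(f"{field}: must be bool when present")

    return errors


server_create = {
    "name": "server create",
    "domain": "infrastructure",
    "description": "Create a new server",
    "requirements": {
        "interactive": False,
        "requires_auth": True,
        "auth_type": "jwt",
        "side_effect_type": "create",
        "min_permission": "write",
    },
}
print(validate_command_metadata(server_create))
```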

### Standard Tags

**Groups**:

- infrastructure - Server, taskserv, cluster operations
- orchestration - Workflow, batch operations
- workspace - Workspace management
- authentication - Auth, MFA, tokens
- utilities - Helper commands

**Operations**:

- create, read, update, delete - CRUD operations
- destructive - Irreversible operations
- interactive - Requires user input

**Performance**:

- slow - Operation >60 seconds
- optimizable - Candidate for optimization

### Performance Optimization Patterns

**Pattern 1: For Long Operations**

```
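# Use orchestrator for operations expected to take >2 seconds
# (assumes get-operation-duration returns milliseconds)
let operation = "my-operation"
if (get-operation-duration $operation) > 2000 {
    submit-to-orchestrator $operation
    return "Operation submitted in background"
}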
```

**Pattern 2: For Batch Operations**

```
# Use batch workflows for multiple operations
nu -c "
use core/nulib/workflows/batch.nu *
batch submit workflows/batch-deploy.ncl --parallel-limit 5
"
```

**Pattern 3: For Metadata Overhead**

```
# Keep the metadata cache warm
# Warm-cache loads are 40-100x faster than parsing Nickel
# Target: >95% cache hit rate
# Metadata stays in the cache for 1 hour (default TTL)
```

## Testing

### Running Tests

```
# End-to-End Integration Tests
nu tests/test-fase5-e2e.nu

# Security Audit
nu tests/test-security-audit-day20.nu

# Performance Benchmarks
nu tests/test-metadata-cache-benchmark.nu

# Run all tests
for test in (glob "tests/test-*.nu") { nu $test }
```

### Test Coverage

| Test Suite | Category | Coverage |
| ----------- | ---------- | ---------- |
| E2E Tests | Integration | 7 test groups, 40+ checks |
| Security Audit | Auth | 5 audit categories, 100% pass |
| Benchmarks | Performance | 6 benchmark categories |

### Expected Results

- ✅ All tests pass
- ✅ No Nushell syntax violations
- ✅ Cache hit rate >95%
- ✅ Auth enforcement 100%
- ✅ Performance baselines met

## Troubleshooting

### Issue: Command not found

**Solution**: Ensure metadata is registered in `main.ncl`

```
# Check if command is in metadata
grep "command_name" provisioning/schemas/main.ncl
```

### Issue: Auth check failing

**Solution**: Verify user has required permission level

```
# Check current user permissions
provisioning auth whoami

# Check command requirements
nu -c "
use core/nulib/lib_provisioning/commands/traits.nu *
get-command-metadata 'server create'
"
```

### Issue: Slow command execution

**Solution**: Check cache status

```
# Force cache reload
rm ~/.cache/provisioning/command_metadata.json

# Check cache hit rate
nu tests/test-metadata-cache-benchmark.nu
```

### Issue: Nushell syntax error

**Solution**: Run compliance check

```
# Validate Nushell compliance
nu --ide-check 100 <file.nu>

# Check for common issues
grep "try {" <file.nu>  # Should be empty
grep "let mut" <file.nu>  # Should be empty
```

## Performance Characteristics

### Baseline Metrics

| Operation | Cold | Warm | Improvement |
| ----------- | ------ | ------ | ------------- |
| Metadata Load | 200 ms | 2-5 ms | 40-100x |
| Auth Check | <5 ms | <5 ms | Same |
| Command Dispatch | <10 ms | <10 ms | Same |
| Total Command | ~210 ms | ~10 ms | 21x |

### Real-World Impact

```
Scenario: 20 sequential commands (metadata load time only)
  Without cache: 20 × 200 ms = 4000 ms
  With cache:    1 × 200 ms + 19 × 5 ms = 295 ms
  Speedup:       ~13.6x faster
```
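
The scenario arithmetic can be checked with a short calculation. This Python sketch uses the 200 ms cold and 5 ms warm load times from the baseline table; `sequential_load_time_ms` is a hypothetical helper, and the model counts metadata load time only, ignoring auth checks and dispatch.

```python
def sequential_load_time_ms(commands: int, cold_ms: float, warm_ms: float,
                            cached: bool) -> float:
    """Total metadata load time for a run of sequential commands."""
    if commands == 0:
        return 0
    if not cached:
        return commands * cold_ms  # every command pays the cold parse
    # With a cache, only the first command is cold; the rest are warm hits.
    return cold_ms + (commands - 1) * warm_ms


without_cache = sequential_load_time_ms(20, cold_ms=200, warm_ms=5, cached=False)
with_cache = sequential_load_time_ms(20, cold_ms=200, warm_ms=5, cached=True)
speedup = without_cache / with_cache

print(f"without cache: {without_cache} ms")
print(f"with cache:    {with_cache} ms")
print(f"speedup:       {speedup:.1f}x")
```

The speedup grows with the number of commands in a session, since the single cold load is amortized over more warm hits.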

## Next Steps

1. **Deploy**: Use installer to deploy to production
2. **Monitor**: Watch cache hit rates (target >95%)
3. **Extend**: Add new commands following migration pattern
4. **Optimize**: Use profiling to identify slow operations
5. **Maintain**: Run validation scripts regularly

---

**For Support**: See `docs/troubleshooting-guide.md`
**For Architecture**: See `docs/architecture/`
**For User Guide**: See `docs/user/AUTHENTICATION_LAYER_GUIDE.md`