# ADR-008: Cedar Authorization Policy Engine Integration

**Status**: Accepted
**Date**: 2025-10-08
**Deciders**: Architecture Team
**Tags**: security, authorization, cedar, policy-engine

## Context and Problem Statement

The Provisioning platform requires fine-grained authorization controls to manage access to infrastructure resources across multiple environments
(development, staging, production). The authorization system must:

1. Support complex authorization rules (MFA, IP restrictions, time windows, approvals)
2. Be auditable and version-controlled
3. Allow hot reload of policies without a restart
4. Integrate with JWT tokens for identity
5. Scale to thousands of authorization decisions per second
6. Be maintainable by the security team without code changes

Traditional code-based authorization (if/else statements) is difficult to audit, maintain, and scale.

## Decision Drivers

- **Security**: Critical for production infrastructure access
- **Auditability**: Compliance requirements demand clear authorization policies
- **Flexibility**: Policies change more frequently than code
- **Performance**: Low-latency authorization decisions (<10 ms)
- **Maintainability**: The security team should be able to update policies without developer involvement
- **Type Safety**: Prevent policy errors before deployment

## Considered Options

### Option 1: Code-Based Authorization (Current State)

Implement authorization logic directly in Rust/Nushell code.

**Pros**:

- Full control and flexibility
- No external dependencies
- Simple to understand for small use cases

**Cons**:

- Hard to audit and maintain
- Requires code deployment for policy changes
- No type safety for policies
- Difficult to test all combinations
- Not declarative

### Option 2: OPA (Open Policy Agent)

Use OPA with Rego policy language.

**Pros**:

- Industry standard
- Rich ecosystem
- Rego is powerful

**Cons**:

- Rego is complex to learn
- Requires separate service deployment
- Performance overhead (HTTP calls)
- Policies not type-checked

### Option 3: Cedar Policy Engine (Chosen)

Use the Cedar policy language (open-sourced by AWS), integrated directly into the orchestrator.

**Pros**:

- Type-safe policy language
- Fast (in-process evaluation, no network overhead)
- Schema-based validation
- Declarative and auditable
- Hot-reload support
- Rust library (no external service)
- Deny-by-default security model

**Cons**:

- Recently introduced (2023)
- Smaller ecosystem than OPA
- Learning curve for policy authors

### Option 4: Casbin

Use Casbin authorization library.

**Pros**:

- Multiple policy models (ACL, RBAC, ABAC)
- Rust bindings available

**Cons**:

- Less declarative than Cedar
- Weaker type safety
- More imperative style

## Decision Outcome

**Chosen Option**: Option 3 - Cedar Policy Engine

### Rationale

1. **Type Safety**: Cedar's schema validation prevents policy errors before deployment
2. **Performance**: Native Rust library with no network overhead; authorization decisions take <1 ms, well within the <10 ms target
3. **Auditability**: Declarative policies in version control
4. **Hot Reload**: Update policies without orchestrator restart
5. **Production Use**: Cedar is the policy language behind Amazon Verified Permissions (AVP)
6. **Deny-by-Default**: A request is denied unless a policy explicitly permits it

### Implementation Details

#### Architecture

```
┌─────────────────────────────────────────────────────────┐
│                  Orchestrator                           │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  HTTP Request                                           │
│       ↓                                                 │
│  ┌──────────────────┐                                  │
│  │ JWT Validation   │ ← Token Validator                │
│  └────────┬─────────┘                                  │
│           ↓                                             │
│  ┌──────────────────┐                                  │
│  │ Cedar Engine     │ ← Policy Loader                  │
│  │                  │   (Hot Reload)                   │
│  │ • Check Policies │                                  │
│  │ • Evaluate Rules │                                  │
│  │ • Context Check  │                                  │
│  └────────┬─────────┘                                  │
│           ↓                                             │
│  Allow / Deny                                           │
│                                                         │
└─────────────────────────────────────────────────────────┘
```
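
The flow in the diagram can be sketched in plain Rust as follows. This is a minimal illustration that uses only the standard library, not the orchestrator's actual middleware: `Claims`, `validate_jwt`, and `handle_request` are hypothetical names, token validation is stubbed with a toy token format, and the Cedar call is replaced by a closure. Returning 401 for an invalid token is an assumption; the 403 on deny follows the middleware description in this ADR.

```rust
/// Identity extracted from a validated token (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Claims {
    pub subject: String,
    pub mfa_verified: bool,
}

/// Outcome of the policy check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny,
}

/// Stand-in for real JWT validation. It accepts toy tokens of the form
/// "user:<name>" or "user:<name>:mfa"; a real validator would check the
/// signature, issuer, audience, and expiry instead.
pub fn validate_jwt(token: &str) -> Option<Claims> {
    let mut parts = token.split(':');
    if parts.next()? != "user" {
        return None;
    }
    let subject = parts.next().filter(|s| !s.is_empty())?.to_string();
    let mfa_verified = match parts.next() {
        None => false,
        Some("mfa") => true,
        Some(_) => return None,
    };
    if parts.next().is_some() {
        return None;
    }
    Some(Claims { subject, mfa_verified })
}

/// Runs the two stages from the diagram and maps the outcome to an HTTP
/// status code: 401 for a rejected token (assumed), 403 for a policy deny,
/// 200 when the request may proceed.
pub fn handle_request<F>(token: &str, action: &str, resource: &str, authorize: F) -> u16
where
    F: Fn(&Claims, &str, &str) -> Decision,
{
    let claims = match validate_jwt(token) {
        Some(claims) => claims,
        None => return 401,
    };
    match authorize(&claims, action, resource) {
        Decision::Allow => 200,
        Decision::Deny => 403,
    }
}
```

Keeping the policy check behind a closure (or trait) keeps the two stages independent, so the authorization step can be tested without a real token validator.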

#### Policy Organization

```
provisioning/config/cedar-policies/
├── schema.cedar          # Entity and action definitions
├── production.cedar      # Production environment policies
├── development.cedar     # Development environment policies
├── admin.cedar           # Administrative policies
└── README.md             # Documentation
```

#### Rust Implementation

```
provisioning/platform/orchestrator/src/security/
├── cedar.rs             # Cedar engine integration (450 lines)
├── policy_loader.rs     # Policy loading with hot reload (320 lines)
├── authorization.rs     # Middleware integration (380 lines)
├── mod.rs               # Module exports
└── tests.rs             # Comprehensive tests (450 lines)
```

#### Key Components

1. **CedarEngine**: Core authorization engine
   - Load policies from strings
   - Load schema for validation
   - Authorize requests
   - Report policy statistics

2. **PolicyLoader**: File-based policy management
   - Load policies from directory
   - Hot reload on file changes (notify crate)
   - Validate policy syntax
   - Validate policies against the schema

3. **Authorization Middleware**: Axum integration
   - Extract JWT claims
   - Build authorization context (IP, MFA, time)
   - Check authorization
   - Return 403 Forbidden on deny

4. **Policy Files**: Declarative authorization rules
   - Production: MFA, approvals, IP restrictions, business hours
   - Development: Permissive for developers
   - Admin: Platform admin, SRE, audit team policies
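
One way to make hot reload safe is to validate the new policy text first and swap it in only on success. The sketch below illustrates that idea with the standard library only; it is not the code in `policy_loader.rs`. `PolicySet`, `PolicyStore`, and `parse_policies` are hypothetical names, the parser is a toy stand-in for Cedar parsing and schema validation, and file watching (done with the `notify` crate in the real loader) is left out.

```rust
use std::sync::{Arc, RwLock};

/// A parsed set of policies (hypothetical stand-in: one string per policy).
#[derive(Debug, Default)]
pub struct PolicySet {
    pub policies: Vec<String>,
}

/// Toy stand-in for Cedar parsing and schema validation: every statement
/// must start with `permit` or `forbid`. Real Cedar syntax is richer.
pub fn parse_policies(source: &str) -> Result<PolicySet, String> {
    let mut policies = Vec::new();
    for statement in source.split(';') {
        let statement = statement.trim();
        if statement.is_empty() {
            continue;
        }
        if !(statement.starts_with("permit") || statement.starts_with("forbid")) {
            return Err(format!("invalid policy statement: {}", statement));
        }
        policies.push(statement.to_string());
    }
    Ok(PolicySet { policies })
}

/// Shared handle: request handlers read the current set, the loader swaps it.
#[derive(Clone, Default)]
pub struct PolicyStore {
    current: Arc<RwLock<Arc<PolicySet>>>,
}

impl PolicyStore {
    /// Returns a cheap snapshot. A request that already holds a snapshot
    /// keeps using it even if a reload happens in the meantime.
    pub fn snapshot(&self) -> Arc<PolicySet> {
        // `unwrap` panics only if another thread panicked while holding the lock.
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Parses the new source first and swaps only on success, so a broken
    /// policy file never replaces a working policy set.
    pub fn reload(&self, source: &str) -> Result<usize, String> {
        let parsed = Arc::new(parse_policies(source)?);
        let count = parsed.policies.len();
        *self.current.write().unwrap() = parsed;
        Ok(count)
    }
}
```

With this shape, a file-change event would simply call `reload` with the new file contents and log the error if validation fails, leaving the previous policies active.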

#### Context Variables

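```rust
use std::collections::HashMap;

/// Request context supplied to Cedar alongside principal, action, and resource.
struct AuthorizationContext {
    mfa_verified: bool,                  // MFA verification status
    ip_address: String,                  // Client IP address
    time: String,                        // ISO 8601 timestamp
    approval_id: Option<String>,         // Approval ID (optional)
    reason: Option<String>,              // Reason for operation (optional)
    force: bool,                         // Force flag
    additional: HashMap<String, String>, // Additional context (value type assumed)
}
```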
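
As an illustration of how such a context might be built and handed to the engine, the sketch below repeats the struct so that it is self-contained, starts from the most restrictive values, and flattens the fields into attribute pairs. The constructor, the `with_*` helpers, `to_attributes`, and the `HashMap<String, String>` value type are assumptions made for the example, not the orchestrator's API. Leaving unset optional fields out of the output matches how a Cedar policy tests for presence with the `has` operator.

```rust
use std::collections::{BTreeMap, HashMap};

#[derive(Debug, Clone)]
pub struct AuthorizationContext {
    pub mfa_verified: bool,
    pub ip_address: String,
    pub time: String,
    pub approval_id: Option<String>,
    pub reason: Option<String>,
    pub force: bool,
    pub additional: HashMap<String, String>, // value type assumed
}

impl AuthorizationContext {
    /// Starts from the most restrictive values: no MFA, no approval, no force.
    pub fn new(ip_address: &str, time: &str) -> Self {
        Self {
            mfa_verified: false,
            ip_address: ip_address.to_string(),
            time: time.to_string(),
            approval_id: None,
            reason: None,
            force: false,
            additional: HashMap::new(),
        }
    }

    pub fn with_mfa(mut self, verified: bool) -> Self {
        self.mfa_verified = verified;
        self
    }

    pub fn with_approval(mut self, id: &str) -> Self {
        self.approval_id = Some(id.to_string());
        self
    }

    /// Flattens the context into attribute name/value pairs. Unset optional
    /// fields are omitted, and `additional` entries never override a
    /// built-in field such as `mfa_verified`.
    pub fn to_attributes(&self) -> BTreeMap<String, String> {
        let mut attrs = BTreeMap::new();
        attrs.insert("mfa_verified".to_string(), self.mfa_verified.to_string());
        attrs.insert("ip_address".to_string(), self.ip_address.clone());
        attrs.insert("time".to_string(), self.time.clone());
        attrs.insert("force".to_string(), self.force.to_string());
        if let Some(id) = &self.approval_id {
            attrs.insert("approval_id".to_string(), id.clone());
        }
        if let Some(reason) = &self.reason {
            attrs.insert("reason".to_string(), reason.clone());
        }
        for (key, value) in &self.additional {
            attrs.entry(key.clone()).or_insert_with(|| value.clone());
        }
        attrs
    }
}
```

Refusing to let `additional` shadow the built-in fields is a deliberate choice in this sketch: it prevents caller-supplied extras from spoofing security-relevant attributes.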

#### Example Policy

```
// Production deployments require MFA verification
@id("prod-deploy-mfa")
@description("All production deployments must have MFA verification")
permit (
  principal,
  action == Provisioning::Action::"deploy",
  resource in Provisioning::Environment::"production"
) when {
  context.mfa_verified == true
};
```
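
Cedar allows a request only if at least one `permit` policy is satisfied and no `forbid` policy is satisfied. The sketch below models that decision rule in plain Rust and renders the example policy as a predicate, to show how the policy behaves with and without MFA. It does not use the Cedar crate; `Request`, `Policy`, and `evaluate` are hypothetical names for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Effect {
    Permit,
    Forbid,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny,
}

/// Simplified request: the action, the target environment, and one context flag.
pub struct Request<'a> {
    pub action: &'a str,
    pub environment: &'a str,
    pub mfa_verified: bool,
}

/// A policy reduced to its effect and a predicate over the request.
pub struct Policy {
    pub id: &'static str,
    pub effect: Effect,
    pub applies: fn(&Request<'_>) -> bool,
}

fn prod_deploy_mfa_applies(r: &Request<'_>) -> bool {
    r.action == "deploy" && r.environment == "production" && r.mfa_verified
}

/// Rust rendering of the `prod-deploy-mfa` example policy.
pub fn prod_deploy_mfa() -> Policy {
    Policy {
        id: "prod-deploy-mfa",
        effect: Effect::Permit,
        applies: prod_deploy_mfa_applies,
    }
}

/// Allow only if some permit matches and no forbid matches; deny otherwise.
pub fn evaluate(policies: &[Policy], request: &Request<'_>) -> Decision {
    let matched = |effect: Effect| {
        policies
            .iter()
            .any(|p| p.effect == effect && (p.applies)(request))
    };
    if !matched(Effect::Forbid) && matched(Effect::Permit) {
        Decision::Allow
    } else {
        Decision::Deny
    }
}
```

Because the example is a `permit` policy, it enforces MFA only as long as no other `permit` covers production deployments; a `forbid` policy would enforce the requirement regardless of other permits.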

### Integration Points

1. **JWT Tokens**: Extract principal and context from validated JWT
2. **Audit System**: Log all authorization decisions
3. **Control Center**: UI for policy management and testing
4. **CLI**: Policy validation and testing commands

### Security Best Practices

1. **Deny by Default**: Cedar denies any request that no policy explicitly permits, and a matching `forbid` always overrides a `permit`
2. **Schema Validation**: Type-check policies before loading
3. **Version Control**: All policies in git for auditability
4. **Principle of Least Privilege**: Grant minimum necessary permissions
5. **Defense in Depth**: Combine with JWT validation and rate limiting
6. **Separation of Concerns**: Security team owns policies, developers own code

## Consequences

### Positive

1. ✅ **Auditable**: All policies in version control
2. ✅ **Type-Safe**: Schema validation prevents errors
3. ✅ **Fast**: <1 ms authorization decisions
4. ✅ **Maintainable**: Security team can update policies independently
5. ✅ **Hot Reload**: No downtime for policy updates
6. ✅ **Testable**: Comprehensive test suite for policies
7. ✅ **Declarative**: Clear intent, no hidden logic

### Negative

1. ❌ **Learning Curve**: Team must learn Cedar policy language
2. ❌ **New Technology**: Cedar is relatively new (2023)
3. ❌ **Ecosystem**: Smaller community than OPA
4. ❌ **Tooling**: Limited IDE support compared to Rego

### Neutral

1. 🔶 **Migration**: Existing authorization logic needs migration to Cedar
2. 🔶 **Policy Complexity**: Some complex rules may be harder to express in Cedar than in general-purpose code
3. 🔶 **Debugging**: Policy debugging requires understanding Cedar evaluation

## Compliance

### Security Standards

- **SOC 2**: Auditable access control policies
- **ISO 27001**: Access control management
- **GDPR**: Data access authorization and logging
- **NIST 800-53**: AC-3 Access Enforcement

### Audit Requirements

Each logged authorization decision records:

- Principal (user/team)
- Action performed
- Resource accessed
- Context (MFA, IP, time)
- Decision (allow/deny)
- Policies evaluated
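
The fields above could be captured in a record like the one sketched here. This is a hypothetical shape, not the audit system's actual schema: `AuditRecord`, its field names, and the `key=value` line format are assumptions for illustration. String values are written with Rust's `Debug` formatting so that they are quoted and control characters are escaped.

```rust
/// One audit entry per authorization decision (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct AuditRecord {
    pub principal: String,
    pub action: String,
    pub resource: String,
    pub mfa_verified: bool,
    pub ip_address: String,
    pub time: String,
    pub allowed: bool,
    pub policies_evaluated: Vec<String>,
}

impl AuditRecord {
    /// Renders the record as a single `key=value` line. Strings use `{:?}`
    /// so embedded quotes and newlines are escaped rather than written raw.
    pub fn to_log_line(&self) -> String {
        format!(
            "time={:?} principal={:?} action={:?} resource={:?} mfa={} ip={:?} decision={} policies={:?}",
            self.time,
            self.principal,
            self.action,
            self.resource,
            self.mfa_verified,
            self.ip_address,
            if self.allowed { "allow" } else { "deny" },
            self.policies_evaluated.join(","),
        )
    }
}
```

Escaping matters here because principal names and reasons come from request data; writing them raw would let a crafted value split one audit entry into several lines.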

## Migration Path

### Phase 1: Implementation (Completed)

- ✅ Cedar engine integration
- ✅ Policy loader with hot reload
- ✅ Authorization middleware
- ✅ Production, development, and admin policies
- ✅ Comprehensive tests

### Phase 2: Rollout (Next)

- 🔲 Enable Cedar authorization in orchestrator
- 🔲 Migrate existing authorization logic to Cedar policies
- 🔲 Add authorization checks to all API endpoints
- 🔲 Integrate with audit logging

### Phase 3: Enhancement (Future)

- 🔲 Control Center policy editor UI
- 🔲 Policy testing UI
- 🔲 Policy simulation and dry-run mode
- 🔲 Policy analytics and insights
- 🔲 Advanced context variables (location, device type)

## Alternatives Considered

### Alternative 1: Continue with Code-Based Authorization

Keep authorization logic in Rust/Nushell code.

**Rejected Because**:

- Not auditable
- Requires code changes for policy updates
- Difficult to test all combinations
- Hard to demonstrate compliance with security standards

### Alternative 2: Hybrid Approach

Use Cedar for high-level policies, code for fine-grained checks.

**Rejected Because**:

- Complexity of two authorization systems
- Unclear separation of concerns
- Harder to audit

## References

- **Cedar Documentation**: <https://docs.cedarpolicy.com/>
- **Cedar GitHub**: <https://github.com/cedar-policy/cedar>
- **AWS AVP**: <https://aws.amazon.com/verified-permissions/>
- **Policy Files**: `provisioning/config/cedar-policies/`
- **Implementation**: `provisioning/platform/orchestrator/src/security/`

## Related ADRs

- ADR-003: JWT Token-Based Authentication
- ADR-004: Audit Logging System
- ADR-005: KMS Key Management

## Notes

The Cedar policy language draws on decades of authorization research, earlier policy languages (XACML, AWS IAM), and production experience at AWS. It balances
expressiveness with safety.

---

**Approved By**: Architecture Team
**Implementation Date**: 2025-10-08
**Review Date**: 2026-01-08 (Quarterly)