provisioning/docs/src/architecture/adr/ADR-008-cedar-authorization.md
2026-01-14 04:53:21 +00:00

ADR-008: Cedar Authorization Policy Engine Integration

Status: Accepted
Date: 2025-10-08
Deciders: Architecture Team
Tags: security, authorization, cedar, policy-engine

Context and Problem Statement

The Provisioning platform requires fine-grained authorization controls to manage access to infrastructure resources across multiple environments (development, staging, production). The authorization system must:

  1. Support complex authorization rules (MFA, IP restrictions, time windows, approvals)
  2. Be auditable and version-controlled
  3. Allow hot-reload of policies without restart
  4. Integrate with JWT tokens for identity
  5. Scale to thousands of authorization decisions per second
  6. Be maintainable by the security team without code changes

Traditional code-based authorization (if/else statements) is difficult to audit, maintain, and scale.

Decision Drivers

  • Security: Critical for production infrastructure access
  • Auditability: Compliance requirements demand clear authorization policies
  • Flexibility: Policies change more frequently than code
  • Performance: Low-latency authorization decisions (<10 ms)
  • Maintainability: The security team should be able to update policies without developer involvement
  • Type Safety: Prevent policy errors before deployment

Considered Options

Option 1: Code-Based Authorization (Current State)

Implement authorization logic directly in Rust/Nushell code.

Pros:

  • Full control and flexibility
  • No external dependencies
  • Simple to understand for small use cases

Cons:

  • Hard to audit and maintain
  • Requires code deployment for policy changes
  • No type safety for policies
  • Difficult to test all combinations
  • Not declarative

Option 2: OPA (Open Policy Agent)

Use OPA with Rego policy language.

Pros:

  • Industry standard
  • Rich ecosystem
  • Rego is powerful

Cons:

  • Rego is complex to learn
  • Requires separate service deployment
  • Performance overhead (HTTP calls)
  • Policies not type-checked

Option 3: Cedar Policy Engine (Chosen)

Use the Cedar policy language, open-sourced by AWS, integrated directly into the orchestrator.

Pros:

  • Type-safe policy language
  • Fast (in-process evaluation, no network overhead)
  • Schema-based validation
  • Declarative and auditable
  • Hot-reload support
  • Rust library (no external service)
  • Deny-by-default security model

Cons:

  • Recently introduced (2023)
  • Smaller ecosystem than OPA
  • Learning curve for policy authors

Option 4: Casbin

Use Casbin authorization library.

Pros:

  • Multiple policy models (ACL, RBAC, ABAC)
  • Rust bindings available

Cons:

  • Less declarative than Cedar
  • Weaker type safety
  • More imperative style

Decision Outcome

Chosen Option: Option 3 - Cedar Policy Engine

Rationale

  1. Type Safety: Cedar's schema validation prevents policy errors before deployment
  2. Performance: Native Rust library, no network overhead, <1 ms authorization decisions
  3. Auditability: Declarative policies in version control
  4. Hot Reload: Update policies without orchestrator restart
  5. AWS Adoption: Used in production by AWS as the policy language of Amazon Verified Permissions (AVP)
  6. Deny-by-Default: Secure by design
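The deny-by-default point can be made concrete with a small sketch of how Cedar combines policy results: a request is allowed only if at least one `permit` policy matches and no `forbid` policy matches. The types below (`Effect`, `Decision`, `EvaluatedPolicy`, `decide`) are hypothetical, std-only stand-ins written for this illustration. They are not the `cedar-policy` crate's API, and the real engine also evaluates each policy's scope and conditions before this step.

```rust
/// Effect of a policy, mirroring Cedar's `permit` and `forbid` keywords.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Effect {
    Permit,
    Forbid,
}

/// Final authorization decision.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny,
}

/// Hypothetical record of one policy after it was checked against a request.
/// `matched` is true when the policy's scope and conditions were satisfied.
pub struct EvaluatedPolicy {
    pub id: &'static str,
    pub effect: Effect,
    pub matched: bool,
}

/// Combine per-policy results into one decision:
/// - any matching `forbid` wins and yields Deny,
/// - otherwise at least one matching `permit` yields Allow,
/// - otherwise (nothing matched) the default is Deny.
pub fn decide(policies: &[EvaluatedPolicy]) -> Decision {
    let mut permitted = false;
    for policy in policies.iter().filter(|p| p.matched) {
        match policy.effect {
            Effect::Forbid => return Decision::Deny,
            Effect::Permit => permitted = true,
        }
    }
    if permitted {
        Decision::Allow
    } else {
        Decision::Deny
    }
}
```

With an empty policy set `decide` returns `Decision::Deny`, which is the property the rationale relies on: forgetting to write a policy fails closed rather than open.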

Implementation Details

Architecture

┌─────────────────────────────────────────────────────────┐
│                  Orchestrator                           │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  HTTP Request                                           │
│       ↓                                                 │
│  ┌──────────────────┐                                   │
│  │ JWT Validation   │ ← Token Validator                 │
│  └────────┬─────────┘                                   │
│           ↓                                             │
│  ┌──────────────────┐                                   │
│  │ Cedar Engine     │ ← Policy Loader                   │
│  │                  │   (Hot Reload)                    │
│  │ • Check Policies │                                   │
│  │ • Evaluate Rules │                                   │
│  │ • Context Check  │                                   │
│  └────────┬─────────┘                                   │
│           ↓                                             │
│  Allow / Deny                                           │
│                                                         │
└─────────────────────────────────────────────────────────┘
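The request flow in the diagram can be sketched as a two-stage function: validate the token first, then ask the policy engine. This is a std-only illustration, not the orchestrator's middleware. `Request`, `Claims`, `Outcome`, `handle`, and `status_code` are hypothetical names, both stages are passed in as closures, and returning 401 for a missing or invalid token is an assumption (the ADR only specifies 403 on deny).

```rust
/// Result of running a request through both stages of the diagram.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Allowed,
    Unauthenticated, // token missing or rejected by the validator
    Forbidden,       // token valid, but the policy engine denied the request
}

/// Hypothetical subset of the claims carried by a validated JWT.
pub struct Claims {
    pub subject: String,
    pub mfa_verified: bool,
}

/// Hypothetical minimal view of an incoming HTTP request.
pub struct Request<'a> {
    pub token: Option<&'a str>,
    pub action: &'a str,
    pub resource: &'a str,
}

/// Stage 1: JWT validation. Stage 2: policy evaluation.
/// The policy engine is never consulted for unauthenticated requests.
pub fn handle<V, A>(req: &Request<'_>, validate_token: V, authorize: A) -> Outcome
where
    V: Fn(&str) -> Option<Claims>,
    A: Fn(&Claims, &str, &str) -> bool,
{
    let claims = match req.token.and_then(|t| validate_token(t)) {
        Some(claims) => claims,
        None => return Outcome::Unauthenticated,
    };
    if authorize(&claims, req.action, req.resource) {
        Outcome::Allowed
    } else {
        Outcome::Forbidden
    }
}

/// Map the outcome to an HTTP status code (401 is an assumption, see above).
pub fn status_code(outcome: &Outcome) -> u16 {
    match outcome {
        Outcome::Allowed => 200,
        Outcome::Unauthenticated => 401,
        Outcome::Forbidden => 403,
    }
}
```

Keeping the two stages separate means authentication failures and authorization denials stay distinguishable in responses and logs.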

Policy Organization

provisioning/config/cedar-policies/
├── schema.cedar          # Entity and action definitions
├── production.cedar      # Production environment policies
├── development.cedar     # Development environment policies
├── admin.cedar           # Administrative policies
└── README.md             # Documentation

Rust Implementation

provisioning/platform/orchestrator/src/security/
├── cedar.rs             # Cedar engine integration (450 lines)
├── policy_loader.rs     # Policy loading with hot reload (320 lines)
├── authorization.rs     # Middleware integration (380 lines)
├── mod.rs               # Module exports
└── tests.rs             # Comprehensive tests (450 lines)

Key Components

  1. CedarEngine: Core authorization engine

    • Load policies from strings
    • Load schema for validation
    • Authorize requests
    • Policy statistics
  2. PolicyLoader: File-based policy management

    • Load policies from directory
    • Hot reload on file changes (notify crate)
    • Validate policy syntax
    • Schema validation
  3. Authorization Middleware: Axum integration

    • Extract JWT claims
    • Build authorization context (IP, MFA, time)
    • Check authorization
    • Return 403 Forbidden on deny
  4. Policy Files: Declarative authorization rules

    • Production: MFA, approvals, IP restrictions, business hours
    • Development: Permissive for developers
    • Admin: Platform admin, SRE, audit team policies
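One way to get hot reload without ever serving a broken policy set is "validate, then swap": parse and validate the new files first, and replace the active set only if that succeeds. The sketch below shows the idea with the standard library only. `PolicySet`, `PolicyStore`, and the `validate` closure are hypothetical. The actual `PolicyLoader` is described as using the notify crate for file watching and Cedar's schema validation, neither of which is modeled here.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical parsed policy set; a real engine would hold Cedar's own types.
#[derive(Debug, PartialEq, Eq)]
pub struct PolicySet {
    pub version: u64,
    pub source: String,
}

/// Holds the active policy set and allows replacing it at runtime.
pub struct PolicyStore {
    current: RwLock<Arc<PolicySet>>,
}

impl PolicyStore {
    pub fn new(initial: PolicySet) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    /// Cheap snapshot for one authorization decision. A request that already
    /// holds a snapshot keeps using it even if a reload happens meanwhile.
    pub fn snapshot(&self) -> Arc<PolicySet> {
        let guard = self.current.read().expect("policy lock poisoned");
        Arc::clone(&*guard)
    }

    /// Validate the new source first; swap only on success, so a broken
    /// policy file never replaces a working policy set.
    pub fn reload<F>(&self, new_source: &str, validate: F) -> Result<u64, String>
    where
        F: Fn(&str) -> Result<(), String>,
    {
        validate(new_source)?;
        let mut guard = self.current.write().expect("policy lock poisoned");
        let next = PolicySet {
            version: guard.version + 1,
            source: new_source.to_string(),
        };
        let version = next.version;
        *guard = Arc::new(next);
        Ok(version)
    }
}
```

A file watcher callback would call `reload` with the new file contents and log the error if validation fails, leaving the previous version in place.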

Context Variables

use std::collections::HashMap;

struct AuthorizationContext {
    mfa_verified: bool,                  // MFA verification status
    ip_address: String,                  // Client IP address
    time: String,                        // ISO 8601 timestamp
    approval_id: Option<String>,         // Approval ID (optional)
    reason: Option<String>,              // Reason for operation
    force: bool,                         // Force flag
    additional: HashMap<String, String>, // Additional context (key/value types assumed)
}
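To show how these fields feed the production rules listed earlier (MFA, approvals, IP restrictions, business hours), here is a sketch that checks a context and reports which conditions fail. In the actual design these conditions live in Cedar `when` clauses, not in Rust. The function names, the `10.0.0.0/8` allowed range, and the 09:00 to 17:00 UTC window are all assumptions made for the example.

```rust
use std::net::Ipv4Addr;

/// Subset of the context fields used by this sketch.
pub struct AuthorizationContext {
    pub mfa_verified: bool,
    pub ip_address: String,
    pub time: String, // ISO 8601, e.g. "2025-10-08T14:30:00Z"
    pub approval_id: Option<String>,
}

/// Extract the hour from an ISO 8601 timestamp of the form
/// "YYYY-MM-DDTHH:MM:SSZ". Returns None if the string is too short
/// or the hour is not numeric.
fn utc_hour(timestamp: &str) -> Option<u32> {
    timestamp.get(11..13)?.parse().ok()
}

/// Assumed IP restriction: only addresses in 10.0.0.0/8 are accepted.
fn in_allowed_range(ip: &str) -> bool {
    match ip.parse::<Ipv4Addr>() {
        Ok(addr) => addr.octets()[0] == 10,
        Err(_) => false,
    }
}

/// Return the names of the production conditions that the context fails.
/// An empty vector means every condition is satisfied.
pub fn failed_production_checks(ctx: &AuthorizationContext) -> Vec<&'static str> {
    let mut failures = Vec::new();
    if !ctx.mfa_verified {
        failures.push("mfa");
    }
    if ctx.approval_id.is_none() {
        failures.push("approval");
    }
    if !in_allowed_range(&ctx.ip_address) {
        failures.push("ip");
    }
    // Assumed business hours: 09:00 (inclusive) to 17:00 (exclusive), UTC.
    match utc_hour(&ctx.time) {
        Some(hour) if (9..17).contains(&hour) => {}
        _ => failures.push("business_hours"),
    }
    failures
}
```

Returning the list of failed conditions, not just a boolean, makes the result directly usable in a deny message or an audit entry.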

Example Policy

// Production deployments require MFA verification
@id("prod-deploy-mfa")
@description("All production deployments must have MFA verification")
permit (
  principal,
  action == Provisioning::Action::"deploy",
  resource in Provisioning::Environment::"production"
) when {
  context.mfa_verified == true
};
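The `resource in ...` clause relies on Cedar's entity hierarchy: an entity is "in" another if it is that entity or can reach it by following parent links. The sketch below models that membership test and then restates the example policy as a plain predicate. `Hierarchy`, `prod_deploy_mfa_matches`, and the `Provisioning::Server` entity type are hypothetical and used only for illustration. The real evaluation is performed by the Cedar engine against entities supplied with the request.

```rust
use std::collections::{HashMap, HashSet};

pub const DEPLOY: &str = r#"Provisioning::Action::"deploy""#;
pub const PRODUCTION: &str = r#"Provisioning::Environment::"production""#;

/// Maps each entity UID to its direct parents (hypothetical data model).
#[derive(Default)]
pub struct Hierarchy {
    parents: HashMap<String, Vec<String>>,
}

impl Hierarchy {
    pub fn add_parent(&mut self, child: &str, parent: &str) {
        self.parents
            .entry(child.to_string())
            .or_default()
            .push(parent.to_string());
    }

    /// Membership as used by `in`: true if `entity` is `ancestor` itself
    /// or `ancestor` is reachable through one or more parent links.
    pub fn is_in(&self, entity: &str, ancestor: &str) -> bool {
        let mut stack: Vec<&str> = vec![entity];
        let mut seen: HashSet<&str> = HashSet::new();
        while let Some(current) = stack.pop() {
            if current == ancestor {
                return true;
            }
            if !seen.insert(current) {
                continue; // already visited, guards against cycles
            }
            if let Some(parents) = self.parents.get(current) {
                stack.extend(parents.iter().map(String::as_str));
            }
        }
        false
    }
}

/// Plain-Rust restatement of the "prod-deploy-mfa" policy: the action is
/// deploy, the resource is in the production environment, and MFA is verified.
pub fn prod_deploy_mfa_matches(
    hierarchy: &Hierarchy,
    action: &str,
    resource: &str,
    mfa_verified: bool,
) -> bool {
    action == DEPLOY && hierarchy.is_in(resource, PRODUCTION) && mfa_verified
}
```

If the predicate is false, this policy does not apply, and unless some other `permit` policy matches, the request is denied by default.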

Integration Points

  1. JWT Tokens: Extract principal and context from validated JWT
  2. Audit System: Log all authorization decisions
  3. Control Center: UI for policy management and testing
  4. CLI: Policy validation and testing commands

Security Best Practices

  1. Deny by Default: Cedar defaults to deny all actions
  2. Schema Validation: Type-check policies before loading
  3. Version Control: All policies in git for auditability
  4. Principle of Least Privilege: Grant minimum necessary permissions
  5. Defense in Depth: Combine with JWT validation and rate limiting
  6. Separation of Concerns: Security team owns policies, developers own code

Consequences

Positive

  1. Auditable: All policies in version control
  2. Type-Safe: Schema validation prevents errors
  3. Fast: <1 ms authorization decisions
  4. Maintainable: Security team can update policies independently
  5. Hot Reload: No downtime for policy updates
  6. Testable: Comprehensive test suite for policies
  7. Declarative: Clear intent, no hidden logic

Negative

  1. Learning Curve: Team must learn Cedar policy language
  2. New Technology: Cedar is relatively new (2023)
  3. Ecosystem: Smaller community than OPA
  4. Tooling: Limited IDE support compared to Rego

Neutral

  1. 🔶 Migration: Existing authorization logic needs migration to Cedar
  2. 🔶 Policy Complexity: Complex rules may be harder to express
  3. 🔶 Debugging: Policy debugging requires understanding Cedar evaluation

Compliance

Security Standards

  • SOC 2: Auditable access control policies
  • ISO 27001: Access control management
  • GDPR: Data access authorization and logging
  • NIST 800-53: AC-3 Access Enforcement

Audit Requirements

The audit record for every authorization decision includes:

  • Principal (user/team)
  • Action performed
  • Resource accessed
  • Context (MFA, IP, time)
  • Decision (allow/deny)
  • Policies evaluated
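The fields above map naturally onto a single record type. The following is a minimal sketch of such a record and a key=value log line. `AuditRecord`, `to_log_line`, and the line format are assumptions for illustration, not the format used by the audit system described in ADR-004.

```rust
/// Outcome recorded for the decision.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny,
}

/// Hypothetical audit record carrying the fields listed above.
pub struct AuditRecord {
    pub principal: String,
    pub action: String,
    pub resource: String,
    pub mfa_verified: bool,
    pub ip_address: String,
    pub time: String,
    pub decision: Decision,
    pub policies_evaluated: Vec<String>,
}

impl AuditRecord {
    /// Render the record as one key=value line (assumed format).
    /// Policy IDs are joined with commas.
    pub fn to_log_line(&self) -> String {
        let decision = match self.decision {
            Decision::Allow => "allow",
            Decision::Deny => "deny",
        };
        format!(
            "time={} principal={} action={} resource={} mfa={} ip={} decision={} policies={}",
            self.time,
            self.principal,
            self.action,
            self.resource,
            self.mfa_verified,
            self.ip_address,
            decision,
            self.policies_evaluated.join(","),
        )
    }
}
```

Recording the evaluated policy IDs alongside the decision is what lets an auditor trace a specific allow or deny back to the policy text in version control.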

Migration Path

Phase 1: Implementation (Completed)

  • Cedar engine integration
  • Policy loader with hot reload
  • Authorization middleware
  • Production, development, and admin policies
  • Comprehensive tests

Phase 2: Rollout (Next)

  • 🔲 Enable Cedar authorization in orchestrator
  • 🔲 Migrate existing authorization logic to Cedar policies
  • 🔲 Add authorization checks to all API endpoints
  • 🔲 Integrate with audit logging

Phase 3: Enhancement (Future)

  • 🔲 Control Center policy editor UI
  • 🔲 Policy testing UI
  • 🔲 Policy simulation and dry-run mode
  • 🔲 Policy analytics and insights
  • 🔲 Advanced context variables (location, device type)

Alternatives Considered

Alternative 1: Continue with Code-Based Authorization

Keep authorization logic in Rust/Nushell code.

Rejected Because:

  • Not auditable
  • Requires code changes for policy updates
  • Difficult to test all combinations
  • Not compliant with security standards

Alternative 2: Hybrid Approach

Use Cedar for high-level policies, code for fine-grained checks.

Rejected Because:

  • Complexity of two authorization systems
  • Unclear separation of concerns
  • Harder to audit

References

  • ADR-003: JWT Token-Based Authentication
  • ADR-004: Audit Logging System
  • ADR-005: KMS Key Management

Notes

The Cedar policy language draws on decades of authorization research and practice (XACML, AWS IAM) and on production experience at AWS. It balances expressiveness with safety.


Approved By: Architecture Team
Implementation Date: 2025-10-08
Review Date: 2026-01-08 (Quarterly)