Cedar Policies Production Guide

Version: 1.0.0
Date: 2025-10-08
Audience: Platform Administrators, Security Teams
Prerequisites: Understanding of Cedar policy language, Provisioning platform architecture


Table of Contents

  1. Introduction
  2. Cedar Policy Basics
  3. Production Policy Strategy
  4. Policy Templates
  5. Policy Development Workflow
  6. Testing Policies
  7. Deployment
  8. Monitoring & Auditing
  9. Troubleshooting
  10. Best Practices

Introduction

Cedar policies control who can do what in the Provisioning platform. This guide helps you create, test, and deploy production-ready Cedar policies that balance security with operational efficiency.

Why Cedar

  • Fine-grained: Control access at resource + action level
  • Context-aware: Decisions based on MFA, IP, time, approvals
  • Auditable: Every decision is logged with policy ID
  • Hot-reload: Update policies without restarting services
  • Type-safe: Schema validation catches unknown entity types, actions, and attributes before a policy is deployed

Cedar Policy Basics

Core Concepts

permit (
  principal,    // Who (user, team, role)
  action,       // What (create, delete, deploy)
  resource      // Where (server, cluster, environment)
) when {
  condition     // Context (MFA, IP, time)
};

Entities

| Type | Examples | Description |
| --- | --- | --- |
| User | User::"alice" | Individual users |
| Team | Team::"platform-admin" | User groups |
| Role | Role::"Admin" | Permission levels |
| Resource | Server::"web-01" | Infrastructure resources |
| Environment | Environment::"production" | Deployment targets |

Actions

| Category | Actions |
| --- | --- |
| Read | read, list |
| Write | create, update, delete |
| Deploy | deploy, rollback |
| Admin | ssh, execute, admin |

Production Policy Strategy

Security Levels

Level 1: Development (Permissive)

// Developers have full access to dev environment
permit (
  principal in Team::"developers",
  action,
  resource in Environment::"development"
);

Level 2: Staging (MFA Required)

// All operations require MFA
permit (
  principal in Team::"developers",
  action,
  resource in Environment::"staging"
) when {
  context.mfa_verified == true
};

Level 3: Production (MFA + Approval)

// Deployments require MFA + approval
permit (
  principal in Team::"platform-admin",
  action in [Action::"deploy", Action::"delete"],
  resource in Environment::"production"
) when {
  context.mfa_verified == true &&
  context has approval_id &&
  context.approval_id like "APPROVAL-*"
};

Level 4: Critical (Break-Glass Only)

// Only emergency access
permit (
  principal,
  action,
  resource in Resource::"production-database"
) when {
  context.emergency_access == true &&
  context.session_approved == true
};

Policy Templates

1. Role-Based Access Control (RBAC)

// Admin: Full access
permit (
  principal in Role::"Admin",
  action,
  resource
);

// Operator: Server management + read clusters
permit (
  principal in Role::"Operator",
  action in [
    Action::"create",
    Action::"update",
    Action::"delete"
  ],
  resource is Server
);

permit (
  principal in Role::"Operator",
  action in [Action::"read", Action::"list"],
  resource is Cluster
);

// Viewer: Read-only everywhere
permit (
  principal in Role::"Viewer",
  action in [Action::"read", Action::"list"],
  resource
);

// Auditor: Read audit logs only
permit (
  principal in Role::"Auditor",
  action in [Action::"read", Action::"list"],
  resource is AuditLog
);

2. Team-Based Policies

// Platform team: Infrastructure management
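permit (
  principal in Team::"platform",
  action in [
    Action::"create",
    Action::"update",
    Action::"delete",
    Action::"deploy"
  ],
  resource
) when {
  // `in` cannot take a list of entity types; test each type with `is`
  resource is Server || resource is Cluster || resource is Taskserv
};

// Security team: Access control + audit
permit (
  principal in Team::"security",
  action,
  resource
) when {
  resource is User || resource is Role || resource is AuditLog || resource is BreakGlass
};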

// DevOps team: Application deployments
permit (
  principal in Team::"devops",
  action == Action::"deploy",
  resource in Environment::"production"
) when {
  context.mfa_verified == true &&
  context.has_approval == true
};

3. Time-Based Restrictions

// Deployments only during business hours
permit (
  principal,
  action == Action::"deploy",
  resource in Environment::"production"
) when {
  context.time.hour >= 9 &&
  context.time.hour <= 17 &&
  ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"].contains(context.time.weekday)
};

// Maintenance window
permit (
  principal in Team::"platform",
  action,
  resource
) when {
  context.maintenance_window == true
};
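
To sanity-check the business-hours condition before writing test cases, the same comparison can be mirrored in a few lines of shell. This is a minimal sketch, not part of the platform: `is_business_hours` is a hypothetical helper, and it assumes `context.time.hour` is a 0-23 integer and `context.time.weekday` is an English day name, as in the policy above.

```shell
# Hypothetical helper mirroring the policy condition; not a platform command.
# is_business_hours HOUR WEEKDAY -> exit status 0 inside the window, 1 outside
is_business_hours() {
  case "$2" in
    Monday|Tuesday|Wednesday|Thursday|Friday) ;;   # weekday check
    *) return 1 ;;
  esac
  # Same bounds as the policy: hour >= 9 && hour <= 17
  [ "$1" -ge 9 ] && [ "$1" -le 17 ]
}

# Usage:
#   is_business_hours 10 Monday && echo "deploy window open"
#   is_business_hours 22 Monday || echo "deploy window closed"
```

Because the policy compares only the hour, `hour <= 17` admits requests until 17:59. Use `< 17` if the window must close at 5pm sharp.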

4. IP-Based Restrictions

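// Production access only from office network
// Note: a permit grants access to every matching request. To also block all
// other networks, express the rule as forbid ... unless { ... }.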
permit (
  principal,
  action,
  resource in Environment::"production"
) when {
  context.ip_address.isInRange(ip("10.0.0.0/8")) ||
  context.ip_address.isInRange(ip("192.168.1.0/24"))
};

// VPN access for remote work
permit (
  principal,
  action,
  resource in Environment::"production"
) when {
  context.vpn_connected == true &&
  context.mfa_verified == true
};
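
When choosing CIDR ranges for these conditions, it helps to check which client addresses a range actually covers. The sketch below reimplements an IPv4 range test in POSIX shell arithmetic. `ip_to_int` and `in_cidr` are hypothetical helpers for local experiments; they handle IPv4 only and are not how Cedar or the platform evaluates `isInRange`.

```shell
# Hypothetical helpers for local experiments (IPv4 only).

# ip_to_int A.B.C.D -> prints the address as a 32-bit integer
ip_to_int() (
  IFS=.
  set -- $1                      # split the dotted quad into $1..$4
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# in_cidr IP NETWORK/BITS -> exit status 0 when IP lies inside the range
in_cidr() (
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")     # part before the slash
  bits=${2#*/}                   # prefix length after the slash
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
)

# Usage:
#   in_cidr 10.0.1.5 10.0.0.0/8 && echo "office network"
```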

5. Resource-Specific Policies

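// Database servers: Extra protection
// Entity IDs are matched literally, so "database-*" is not a wildcard.
// This version matches on a `name` attribute (assumed to exist in the schema).
forbid (
  principal,
  action == Action::"delete",
  resource
) when {
  resource has name && resource.name like "database-*"
} unless {
  context.emergency_access == true
};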

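// Critical clusters: Require multiple approvals
// Matches on a `name` attribute (assumed to exist in the schema), because
// entity IDs do not support wildcards.
permit (
  principal,
  action in [Action::"update", Action::"delete"],
  resource
) when {
  resource has name && resource.name like "k8s-production-*" &&
  context.approval_count >= 2 &&
  context.mfa_verified == true
};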

6. Self-Service Policies

// Users can manage their own MFA devices
permit (
  principal,
  action in [Action::"create", Action::"delete"],
  resource is MfaDevice
) when {
  resource.owner == principal
};

// Users can view their own audit logs
permit (
  principal,
  action == Action::"read",
  resource is AuditLog
) when {
  resource.user_id == principal.id
};

Policy Development Workflow

Step 1: Define Requirements

Document:

  • Who needs access? (roles, teams, individuals)
  • To what resources? (servers, clusters, environments)
  • What actions? (read, write, deploy, delete)
  • Under what conditions? (MFA, IP, time, approvals)

Example Requirements Document:

# Requirement: Production Deployment

**Who**: DevOps team members
**What**: Deploy applications to production
**When**: Business hours (9am-5pm Mon-Fri)
**Conditions**:
- MFA verified
- Change request approved
- From office network or VPN

Step 2: Write Policy

@id("prod-deploy-devops")
@description("DevOps can deploy to production during business hours with approval")
permit (
  principal in Team::"devops",
  action == Action::"deploy",
  resource in Environment::"production"
) when {
  context.mfa_verified == true &&
  context has approval_id &&
  context.time.hour >= 9 &&
  context.time.hour <= 17 &&
  ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"].contains(context.time.weekday) &&
  (context.ip_address.isInRange(ip("10.0.0.0/8")) || context.vpn_connected == true)
};

Step 3: Validate Syntax

# Use Cedar CLI to validate
cedar validate \
  --policies provisioning/config/cedar-policies/production.cedar \
  --schema provisioning/config/cedar-policies/schema.cedar

# Expected output: ✓ Policy is valid
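
With policies split across several files, validation is easier to run as a loop over the whole directory. A minimal sketch, assuming the `cedar validate` invocation shown above and a `schema.cedar` file in the same directory. `validate_all` and the `CEDAR_BIN` override are hypothetical names introduced here.

```shell
# Hypothetical wrapper; CEDAR_BIN is an illustrative override, not a platform setting.
# validate_all POLICY_DIR -> exit status 0 only if every policy file validates
validate_all() (
  dir=$1
  cedar_bin=${CEDAR_BIN:-cedar}
  schema="$dir/schema.cedar"
  failed=0
  for policy in "$dir"/*.cedar; do
    [ -e "$policy" ] || continue              # directory without policy files
    [ "$policy" = "$schema" ] && continue     # the schema is not a policy file
    if ! "$cedar_bin" validate --policies "$policy" --schema "$schema" >/dev/null; then
      echo "validation failed: $policy" >&2
      failed=1
    fi
  done
  exit "$failed"                              # leaves the subshell, sets the function status
)

# Usage:
#   validate_all provisioning/config/cedar-policies || echo "fix policies before deploying"
```

The loop keeps going after the first failure so that one run reports every broken file.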

Step 4: Test in Development

# Deploy to development environment first
cp production.cedar provisioning/config/cedar-policies/development.cedar

# Restart orchestrator to load new policies
systemctl restart provisioning-orchestrator

# Test with real requests
provisioning server create test-server --check

Step 5: Review & Approve

Review Checklist:

  • Policy syntax valid
  • Policy ID unique
  • Description clear
  • Conditions appropriate for security level
  • Tested in development
  • Reviewed by security team
  • Documented in change log
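
The "Policy ID unique" item on the checklist can be checked mechanically. The sketch below assumes every policy carries an `@id("...")` annotation on its own line, as in the examples in this guide. `duplicate_policy_ids` is a hypothetical helper, not a platform command.

```shell
# Hypothetical helper for the review checklist.
# duplicate_policy_ids FILE... -> prints each @id value that appears more than once
duplicate_policy_ids() {
  grep -ho '@id("[^"]*")' "$@" 2>/dev/null \
    | sed 's/^@id("\(.*\)")$/\1/' \
    | sort | uniq -d
}

# Usage:
#   duplicate_policy_ids provisioning/config/cedar-policies/*.cedar
```

Empty output means no ID is reused across the files passed in.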

Step 6: Deploy to Production

# Backup current policies
cp provisioning/config/cedar-policies/production.cedar \
   provisioning/config/cedar-policies/production.cedar.backup.$(date +%Y%m%d)

# Deploy new policy
cp new-production.cedar provisioning/config/cedar-policies/production.cedar

# Hot reload (no restart needed)
provisioning cedar reload

# Verify loaded
provisioning cedar list

Testing Policies

Unit Testing

Create test cases for each policy:

# tests/cedar/prod-deploy-devops.yaml
policy_id: prod-deploy-devops

test_cases:
  - name: "DevOps can deploy with approval and MFA"
    principal: { type: "Team", id: "devops" }
    action: "deploy"
    resource: { type: "Environment", id: "production" }
    context:
      mfa_verified: true
      approval_id: "APPROVAL-123"
      time: { hour: 10, weekday: "Monday" }
      ip_address: "10.0.1.5"
    expected: Allow

  - name: "DevOps cannot deploy without MFA"
    principal: { type: "Team", id: "devops" }
    action: "deploy"
    resource: { type: "Environment", id: "production" }
    context:
      mfa_verified: false
      approval_id: "APPROVAL-123"
      time: { hour: 10, weekday: "Monday" }
    expected: Deny

  - name: "DevOps cannot deploy outside business hours"
    principal: { type: "Team", id: "devops" }
    action: "deploy"
    resource: { type: "Environment", id: "production" }
    context:
      mfa_verified: true
      approval_id: "APPROVAL-123"
      time: { hour: 22, weekday: "Monday" }
    expected: Deny

Run tests:

provisioning cedar test tests/cedar/

Integration Testing

Test with real API calls:

# Setup test user
export TEST_USER="alice"
export TEST_TOKEN=$(provisioning login --user "$TEST_USER" --output token)

# Test allowed action
curl -H "Authorization: Bearer $TEST_TOKEN" \
  -H "Content-Type: application/json" \
  -X POST -d '{"name": "test-server"}' \
  http://localhost:9090/api/v1/servers

# Expected: 200 OK

# Test denied action (without MFA)
curl -H "Authorization: Bearer $TEST_TOKEN" \
  http://localhost:9090/api/v1/servers/prod-server-01 \
  -X DELETE

# Expected: 403 Forbidden (MFA required)

Load Testing

Verify policy evaluation performance:

# Generate load
provisioning cedar bench \
  --policies production.cedar \
  --requests 10000 \
  --concurrency 100

# Expected: <10 ms per evaluation
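
The 10 ms target is easier to judge against a percentile than an average. The sketch below computes a nearest-rank percentile from raw latency samples. It assumes the samples are available as one millisecond value per line; the output format of `provisioning cedar bench` is not specified in this guide, so extracting them is left open. `percentile` is a hypothetical helper.

```shell
# Hypothetical helper; input format (one latency in ms per line) is an assumption.
# percentile P -> reads samples on stdin, prints the nearest-rank P-th percentile
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((p * NR + 99) / 100)   # ceil(p / 100 * NR) for integer p
      if (rank < 1) rank = 1
      print v[rank]
    }'
}

# Usage:
#   percentile 95 < latencies.txt
```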

Deployment

Development → Staging → Production

#!/bin/bash
# deploy-policies.sh
set -euo pipefail

# Environment name must match the policy file name (e.g. production.cedar)
ENVIRONMENT="${1:?usage: deploy-policies.sh <development|staging|production>}"
POLICY_DIR="provisioning/config/cedar-policies"
POLICY_FILE="$POLICY_DIR/$ENVIRONMENT.cedar"

# Validate policies
if ! cedar validate \
  --policies "$POLICY_FILE" \
  --schema "$POLICY_DIR/schema.cedar"; then
  echo "❌ Policy validation failed"
  exit 1
fi

# Snapshot the policy file being deployed so it can be restored later
BACKUP_DIR="$POLICY_DIR/backups/$ENVIRONMENT"
mkdir -p "$BACKUP_DIR"
cp "$POLICY_FILE" \
   "$BACKUP_DIR/$ENVIRONMENT.cedar.$(date +%Y%m%d-%H%M%S)"

# Deploy new policies
scp "$POLICY_FILE" \
    "$ENVIRONMENT-orchestrator:/etc/provisioning/cedar-policies/$ENVIRONMENT.cedar"

# Hot reload on remote
ssh "$ENVIRONMENT-orchestrator" "provisioning cedar reload"

echo "✅ Policies deployed to $ENVIRONMENT"

Rollback Procedure

# List backups
ls -ltr provisioning/config/cedar-policies/backups/production/

# Restore previous version
cp provisioning/config/cedar-policies/backups/production/production.cedar.20251008-143000 \
   provisioning/config/cedar-policies/production.cedar

# Reload
provisioning cedar reload

# Verify
provisioning cedar list
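
Picking the backup to restore by hand is error-prone during an incident. A minimal sketch that selects the newest one, assuming the backup directory contains only files named `<environment>.cedar.<YYYYmmdd-HHMMSS>` as produced by the deployment script. `latest_backup` is a hypothetical helper.

```shell
# Hypothetical helper; assumes backups are named <env>.cedar.<YYYYmmdd-HHMMSS>.
# latest_backup BACKUP_DIR ENVIRONMENT -> prints the path of the newest backup
latest_backup() (
  newest=""
  for f in "$1/$2".cedar.*; do
    [ -e "$f" ] || continue      # glob matched nothing
    newest=$f                    # globs expand in sorted order; fixed-width
  done                           # timestamps make the last match the newest
  [ -n "$newest" ] || exit 1
  echo "$newest"
)

# Usage:
#   cp "$(latest_backup provisioning/config/cedar-policies/backups/production production)" \
#      provisioning/config/cedar-policies/production.cedar
```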

Monitoring & Auditing

Monitor Authorization Decisions

# Query denied requests (last 24 hours)
provisioning audit query \
  --action authorization_denied \
  --from "24h" \
  --out table

# Expected output:
# ┌─────────┬────────┬──────────┬────────┬────────────────┐
# │ Time    │ User   │ Action   │ Resour │ Reason         │
# ├─────────┼────────┼──────────┼────────┼────────────────┤
# │ 10:15am │ bob    │ deploy   │ prod   │ MFA not verif  │
# │ 11:30am │ alice  │ delete   │ db-01  │ No approval    │
# └─────────┴────────┴──────────┴────────┴────────────────┘

Alert on Suspicious Activity

# alerts/cedar-policies.yaml
alerts:
  - name: "High Denial Rate"
    query: "authorization_denied"
    threshold: 10
    window: "5m"
    action: "notify:security-team"

  - name: "Policy Bypass Attempt"
    query: "action:deploy AND result:denied"
    user: "critical-users"
    action: "page:oncall"
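
The "High Denial Rate" rule amounts to counting denials inside a sliding window and comparing the count with a threshold. The sketch below shows that logic in isolation. It assumes denial events are available as one epoch-seconds timestamp per line and that the alert fires once the count reaches the threshold; how the alerting backend actually evaluates the YAML is not specified here. `count_recent` and `denial_alert` are hypothetical helpers.

```shell
# Hypothetical helpers; input format and >= semantics are assumptions.

# count_recent NOW WINDOW_SECONDS
#   stdin:  one epoch-seconds timestamp per denied request
#   stdout: number of denials with NOW - WINDOW < timestamp <= NOW
count_recent() {
  awk -v now="$1" -v win="$2" '$1 > now - win && $1 <= now { n++ } END { print n + 0 }'
}

# denial_alert NOW WINDOW_SECONDS THRESHOLD -> exit status 0 when the alert should fire
denial_alert() {
  [ "$(count_recent "$1" "$2")" -ge "$3" ]
}

# Usage (threshold 10 in a 5-minute window, as in the rule above):
#   denial_alert "$(date +%s)" 300 10 < denial-timestamps.txt && echo "notify security-team"
```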

Policy Usage Statistics

# Which policies are most used?
provisioning cedar stats --top 10

# Example output:
# Policy ID              | Uses  | Allows | Denies
# ---------------------- | ----- | ------ | ------
# prod-deploy-devops     | 1,234 | 1,100  | 134
# admin-full-access      |   892 |   892  | 0
# viewer-read-only       | 5,421 | 5,421  | 0

Troubleshooting

Policy Not Applying

Symptom: Policy changes not taking effect

Solutions:

  1. Verify hot reload:

    provisioning cedar reload
    provisioning cedar list  # Should show updated timestamp
    
  2. Check orchestrator logs:

    journalctl -u provisioning-orchestrator -f | grep cedar
    
  3. Restart orchestrator:

    systemctl restart provisioning-orchestrator
    

Unexpected Denials

Symptom: User denied access when policy should allow

Debug:

# Enable debug mode
export PROVISIONING_DEBUG=1

# View authorization decision
provisioning audit query \
  --user alice \
  --action deploy \
  --from "1h" \
  --out json | jq '.authorization'

# Shows which policy evaluated, context used, reason for denial

Policy Conflicts

Symptom: Multiple policies match, unclear which applies

Resolution:

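  • Cedar denies by default and forbid overrides permit: if any forbid policy matches, the request is denied no matter how many permit policies also match
  • Cedar has no policy priorities. Annotations such as @priority are metadata only and never change a decision, so treat them purely as documentation for reviewers
  • Make policies more specific to avoid conflicts

@priority("100")
permit (
  principal in Role::"Admin",
  action,
  resource
);

@priority("50")
forbid (
  principal,
  action == Action::"delete",
  resource is Database
);

// Admin can do anything EXCEPT delete databases:
// the forbid wins regardless of the @priority values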
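
Cedar's decision rule is small enough to model directly, which helps when reasoning about which of several matching policies decides a request. The sketch below takes the number of matching permit and forbid policies and prints the outcome. `decide` is a hypothetical illustration, not a platform command.

```shell
# Hypothetical illustration of Cedar's decision rule.
# decide PERMIT_MATCHES FORBID_MATCHES -> prints Allow or Deny
decide() {
  if [ "$2" -gt 0 ]; then
    echo Deny     # any matching forbid wins
  elif [ "$1" -gt 0 ]; then
    echo Allow    # at least one permit and no forbid
  else
    echo Deny     # nothing matched: default deny
  fi
}

# Usage: an Admin deleting a database matches one permit and one forbid
#   decide 1 1
```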

Best Practices

1. Start Restrictive, Loosen Gradually

// ❌ BAD: Too permissive initially
permit (principal, action, resource);

// ✅ GOOD: Explicit allow, expand as needed
permit (
  principal in Role::"Admin",
  action in [Action::"read", Action::"list"],
  resource
);

2. Use Annotations

@id("prod-deploy-mfa")
@description("Production deployments require MFA verification")
@owner("platform-team")
@reviewed("2025-10-08")
@expires("2026-10-08")
permit (
  principal in Team::"platform-admin",
  action == Action::"deploy",
  resource in Environment::"production"
) when {
  context.mfa_verified == true
};
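
Cedar does not act on annotations, so an `@expires` date only helps if something reads it. A minimal sketch of such a check, assuming the `@expires("YYYY-MM-DD")` convention shown above and file paths without spaces. `expired_policies` is a hypothetical helper.

```shell
# Hypothetical helper; relies on the @expires("YYYY-MM-DD") convention.
# expired_policies TODAY FILE... -> prints "file date" for each @expires date before TODAY
expired_policies() (
  today=$1
  shift
  grep -Ho '@expires("[0-9-]*")' "$@" 2>/dev/null \
    | sed 's/:@expires("\(.*\)")$/ \1/' \
    | awk -v today="$today" '$NF < today { print }'   # ISO dates compare correctly as strings
)

# Usage:
#   expired_policies "$(date +%Y-%m-%d)" provisioning/config/cedar-policies/*.cedar
```

Running this as part of the quarterly audit flags policies that are overdue for review.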

3. Principle of Least Privilege

Give users minimum permissions needed:

// ❌ BAD: Overly broad
permit (principal in Team::"developers", action, resource);

// ✅ GOOD: Specific permissions
permit (
  principal in Team::"developers",
  action in [Action::"read", Action::"create", Action::"update"],
  resource in Environment::"development"
);

4. Document Context Requirements

// Context required for this policy:
// - mfa_verified: boolean (from JWT claims)
// - approval_id: string (from request header)
// - ip_address: ipaddr (from connection)
permit (
  principal in Role::"Operator",
  action == Action::"deploy",
  resource in Environment::"production"
) when {
  context.mfa_verified == true &&
  context has approval_id &&
  context.ip_address.isInRange(ip("10.0.0.0/8"))
};

5. Separate Policies by Concern

File organization:

cedar-policies/
├── schema.cedar              # Entity/action definitions
├── rbac.cedar                # Role-based policies
├── teams.cedar               # Team-based policies
├── time-restrictions.cedar   # Time-based policies
├── ip-restrictions.cedar     # Network-based policies
├── production.cedar          # Production-specific
└── development.cedar         # Development-specific

6. Version Control

# Git commit each policy change
git add provisioning/config/cedar-policies/production.cedar
git commit -m "feat(cedar): Add MFA requirement for prod deployments

- Require MFA for all production deployments
- Applies to devops and platform-admin teams
- Effective 2025-10-08

Policy ID: prod-deploy-mfa
Reviewed by: security-team
Ticket: SEC-1234"

git push

7. Regular Policy Audits

Quarterly review:

  • Remove unused policies
  • Tighten overly permissive policies
  • Update for new resources/actions
  • Verify team memberships current
  • Test break-glass procedures

Quick Reference

Common Policy Patterns

// Allow all
permit (principal, action, resource);

// Deny all
forbid (principal, action, resource);

// Role-based
permit (principal in Role::"Admin", action, resource);

// Team-based
permit (principal in Team::"platform", action, resource);

// Resource-based
permit (principal, action, resource in Environment::"production");

// Action-based
permit (principal, action in [Action::"read", Action::"list"], resource);

// Condition-based
permit (principal, action, resource) when { context.mfa_verified == true };

// Complex
permit (
  principal in Team::"devops",
  action == Action::"deploy",
  resource in Environment::"production"
) when {
  context.mfa_verified == true &&
  context has approval_id &&
  context.time.hour >= 9 &&
  context.time.hour <= 17
};

Useful Commands

# Validate policies
provisioning cedar validate

# Reload policies (hot reload)
provisioning cedar reload

# List active policies
provisioning cedar list

# Test policies
provisioning cedar test tests/

# Query denials
provisioning audit query --action authorization_denied

# Policy statistics
provisioning cedar stats

Support

  • Documentation: docs/architecture/CEDAR_AUTHORIZATION_IMPLEMENTATION.md
  • Policy Examples: provisioning/config/cedar-policies/
  • Issues: Report to platform-team
  • Emergency: Use break-glass procedure

Version: 1.0.0
Maintained By: Platform Team
Last Updated: 2025-10-08