# Cedar Policies Production Guide

Version: 1.0.0
Date: 2025-10-08
Audience: Platform Administrators, Security Teams
Prerequisites: Understanding of Cedar policy language, Provisioning platform architecture

---

## Table of Contents

1. Introduction
2. Cedar Policy Basics
3. Production Policy Strategy
4. Policy Templates
5. Policy Development Workflow
6. Testing Policies
7. Deployment
8. Monitoring & Auditing
9. Troubleshooting
10. Best Practices

---

## Introduction

Cedar policies control who can do what in the Provisioning platform. This guide helps you create, test, and deploy production-ready Cedar policies
that balance security with operational efficiency.

### Why Cedar

- Fine-grained: Control access at the resource + action level
- Context-aware: Decisions based on MFA, IP, time, approvals
- Auditable: Every decision is logged with its policy ID
- Hot-reload: Update policies without restarting services
- Type-safe: Schema validation catches type errors before deployment

---

## Cedar Policy Basics

### Core Concepts

```cedar
permit (
  principal,  // Who (user, team, role)
  action,     // What (create, delete, deploy)
  resource    // Where (server, cluster, environment)
) when {
  condition   // Context (MFA, IP, time); placeholder for a boolean expression
};
```

### Entities

| Type | Examples | Description |
| ------ | ---------- | ------------- |
| User | `User::"alice"` | Individual users |
| Team | `Team::"platform-admin"` | User groups |
| Role | `Role::"Admin"` | Permission levels |
| Resource | `Server::"web-01"` | Infrastructure resources |
| Environment | `Environment::"production"` | Deployment targets |

### Actions

| Category | Actions |
| ---------- | --------- |
| Read | read, list |
| Write | create, update, delete |
| Deploy | deploy, rollback |
| Admin | ssh, execute, admin |

---

## Production Policy Strategy

### Security Levels

#### Level 1: Development (Permissive)

```cedar
// Developers have full access to dev environment
permit (
  principal in Team::"developers",
  action,
  resource in Environment::"development"
);
```

#### Level 2: Staging (MFA Required)

```cedar
// All operations require MFA
permit (
  principal in Team::"developers",
  action,
  resource in Environment::"staging"
) when {
  context.mfa_verified == true
};
```

#### Level 3: Production (MFA + Approval)

```cedar
// Deployments require MFA + approval
permit (
  principal in Team::"platform-admin",
  action in [Action::"deploy", Action::"delete"],
  resource in Environment::"production"
) when {
  context.mfa_verified == true &&
  context has approval_id &&
  context.approval_id like "APPROVAL-*"
};
```

#### Level 4: Critical (Break-Glass Only)

```cedar
// Only emergency access
permit (
  principal,
  action,
  resource in Resource::"production-database"
) when {
  context.emergency_access == true &&
  context.session_approved == true
};
```

---

## Policy Templates

### 1. Role-Based Access Control (RBAC)

```cedar
// Admin: Full access
permit (
  principal in Role::"Admin",
  action,
  resource
);

// Operator: Server management + read clusters
permit (
  principal in Role::"Operator",
  action in [
    Action::"create",
    Action::"update",
    Action::"delete"
  ],
  resource is Server
);

permit (
  principal in Role::"Operator",
  action in [Action::"read", Action::"list"],
  resource is Cluster
);

// Viewer: Read-only everywhere
permit (
  principal in Role::"Viewer",
  action in [Action::"read", Action::"list"],
  resource
);

// Auditor: Read audit logs only
permit (
  principal in Role::"Auditor",
  action in [Action::"read", Action::"list"],
  resource is AuditLog
);
```

### 2. Team-Based Policies

```cedar
// Platform team: Infrastructure management
// (`in` expects entities, not types, so the type check uses `is` in the condition)
permit (
  principal in Team::"platform",
  action in [
    Action::"create",
    Action::"update",
    Action::"delete",
    Action::"deploy"
  ],
  resource
) when {
  resource is Server || resource is Cluster || resource is Taskserv
};

// Security team: Access control + audit
permit (
  principal in Team::"security",
  action,
  resource
) when {
  resource is User || resource is Role ||
  resource is AuditLog || resource is BreakGlass
};

// DevOps team: Application deployments
permit (
  principal in Team::"devops",
  action == Action::"deploy",
  resource in Environment::"production"
) when {
  context.mfa_verified == true &&
  context.has_approval == true
};
```

### 3. Time-Based Restrictions

```cedar
// Deployments only during business hours (09:00 to 16:59, Monday to Friday)
permit (
  principal,
  action == Action::"deploy",
  resource in Environment::"production"
) when {
  context.time.hour >= 9 &&
  context.time.hour < 17 &&
  ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"].contains(context.time.weekday)
};

// Maintenance window
permit (
  principal in Team::"platform",
  action,
  resource
) when {
  context.maintenance_window == true
};
```

### 4. IP-Based Restrictions

```cedar
// Production access only from office network
// (isInRange takes an ipaddr value, so the CIDR strings are wrapped in ip())
permit (
  principal,
  action,
  resource in Environment::"production"
) when {
  context.ip_address.isInRange(ip("10.0.0.0/8")) ||
  context.ip_address.isInRange(ip("192.168.1.0/24"))
};

// VPN access for remote work
permit (
  principal,
  action,
  resource in Environment::"production"
) when {
  context.vpn_connected == true &&
  context.mfa_verified == true
};
```

### 5. Resource-Specific Policies

```cedar
// Database servers: Extra protection
// Note: Cedar does not expand wildcards in entity IDs. Resource::"database-*"
// is a literal ID, so it must name a parent group that contains the database resources.
forbid (
  principal,
  action == Action::"delete",
  resource in Resource::"database-*"
) unless {
  context.emergency_access == true
};

// Critical clusters: Require multiple approvals
// (Resource::"k8s-production-*" is likewise a literal group ID, not a glob)
permit (
  principal,
  action in [Action::"update", Action::"delete"],
  resource in Resource::"k8s-production-*"
) when {
  context.approval_count >= 2 &&
  context.mfa_verified == true
};
```

### 6. Self-Service Policies

```cedar
// Users can manage their own MFA devices
permit (
  principal,
  action in [Action::"create", Action::"delete"],
  resource is MfaDevice
) when {
  resource.owner == principal
};

// Users can view their own audit logs
permit (
  principal,
  action == Action::"read",
  resource is AuditLog
) when {
  resource.user_id == principal.id
};
```

---

## Policy Development Workflow

### Step 1: Define Requirements

Document:

- Who needs access? (roles, teams, individuals)
- To what resources? (servers, clusters, environments)
- What actions? (read, write, deploy, delete)
- Under what conditions? (MFA, IP, time, approvals)

Example Requirements Document:

```markdown
# Requirement: Production Deployment

**Who**: DevOps team members
**What**: Deploy applications to production
**When**: Business hours (9am-5pm Mon-Fri)
**Conditions**:
- MFA verified
- Change request approved
- From office network or VPN
```

### Step 2: Write Policy

```cedar
@id("prod-deploy-devops")
@description("DevOps can deploy to production during business hours with approval")
permit (
  principal in Team::"devops",
  action == Action::"deploy",
  resource in Environment::"production"
) when {
  context.mfa_verified == true &&
  context has approval_id &&
  context.time.hour >= 9 &&
  context.time.hour < 17 &&
  ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"].contains(context.time.weekday) &&
  (context.ip_address.isInRange(ip("10.0.0.0/8")) || context.vpn_connected == true)
};
```

### Step 3: Validate Syntax

```bash
# Use Cedar CLI to validate
cedar validate \
  --policies provisioning/config/cedar-policies/production.cedar \
  --schema provisioning/config/cedar-policies/schema.cedar

# Expected output: ✓ Policy is valid
```

### Step 4: Test in Development

```bash
# Deploy to development environment first
cp production.cedar provisioning/config/cedar-policies/development.cedar

# Restart orchestrator to load new policies
systemctl restart provisioning-orchestrator

# Test with real requests
provisioning server create test-server --check
```

### Step 5: Review & Approve

Review Checklist:

- [ ] Policy syntax valid
- [ ] Policy ID unique
- [ ] Description clear
- [ ] Conditions appropriate for security level
- [ ] Tested in development
- [ ] Reviewed by security team
- [ ] Documented in change log

### Step 6: Deploy to Production

```bash
# Backup current policies
cp provisioning/config/cedar-policies/production.cedar \
  "provisioning/config/cedar-policies/production.cedar.backup.$(date +%Y%m%d)"

# Deploy new policy
cp new-production.cedar provisioning/config/cedar-policies/production.cedar

# Hot reload (no restart needed)
provisioning cedar reload

# Verify loaded
provisioning cedar list
```

---

## Testing Policies

### Unit Testing

Create test cases for each policy:

```yaml
# tests/cedar/prod-deploy-devops.yaml
policy_id: prod-deploy-devops

test_cases:
  - name: "DevOps can deploy with approval and MFA"
    principal: { type: "Team", id: "devops" }
    action: "deploy"
    resource: { type: "Environment", id: "production" }
    context:
      mfa_verified: true
      approval_id: "APPROVAL-123"
      time: { hour: 10, weekday: "Monday" }
      ip_address: "10.0.1.5"
    expected: Allow

  - name: "DevOps cannot deploy without MFA"
    principal: { type: "Team", id: "devops" }
    action: "deploy"
    resource: { type: "Environment", id: "production" }
    context:
      mfa_verified: false
      approval_id: "APPROVAL-123"
      time: { hour: 10, weekday: "Monday" }
    expected: Deny

  - name: "DevOps cannot deploy outside business hours"
    principal: { type: "Team", id: "devops" }
    action: "deploy"
    resource: { type: "Environment", id: "production" }
    context:
      mfa_verified: true
      approval_id: "APPROVAL-123"
      time: { hour: 22, weekday: "Monday" }
    expected: Deny
```

Run tests:

```bash
provisioning cedar test tests/cedar/
```

### Integration Testing

Test with real API calls:

```bash
# Setup test user
export TEST_USER="alice"
export TEST_TOKEN=$(provisioning login --user "$TEST_USER" --output token)

# Test allowed action
curl -H "Authorization: Bearer $TEST_TOKEN" \
  http://localhost:9090/api/v1/servers \
  -X POST -d '{"name": "test-server"}'

# Expected: 200 OK

# Test denied action (without MFA)
curl -H "Authorization: Bearer $TEST_TOKEN" \
  http://localhost:9090/api/v1/servers/prod-server-01 \
  -X DELETE

# Expected: 403 Forbidden (MFA required)
```

### Load Testing

Verify policy evaluation performance:

```bash
# Generate load
provisioning cedar bench \
  --policies production.cedar \
  --requests 10000 \
  --concurrency 100

# Expected: <10 ms per evaluation
```

---

## Deployment

### Development → Staging → Production

```bash
#!/bin/bash
# deploy-policies.sh

ENVIRONMENT="$1"  # dev, staging, prod

# Validate policies
if ! cedar validate \
  --policies "provisioning/config/cedar-policies/$ENVIRONMENT.cedar" \
  --schema provisioning/config/cedar-policies/schema.cedar; then
  echo "❌ Policy validation failed"
  exit 1
fi

# Backup current policies
BACKUP_DIR="provisioning/config/cedar-policies/backups/$ENVIRONMENT"
mkdir -p "$BACKUP_DIR"
cp "provisioning/config/cedar-policies/$ENVIRONMENT.cedar" \
  "$BACKUP_DIR/$ENVIRONMENT.cedar.$(date +%Y%m%d-%H%M%S)"

# Deploy new policies
scp "provisioning/config/cedar-policies/$ENVIRONMENT.cedar" \
  "$ENVIRONMENT-orchestrator:/etc/provisioning/cedar-policies/production.cedar"

# Hot reload on remote
ssh "$ENVIRONMENT-orchestrator" "provisioning cedar reload"

echo "✅ Policies deployed to $ENVIRONMENT"
```

### Rollback Procedure

```bash
# List backups
ls -ltr provisioning/config/cedar-policies/backups/production/

# Restore previous version
cp provisioning/config/cedar-policies/backups/production/production.cedar.20251008-143000 \
  provisioning/config/cedar-policies/production.cedar

# Reload
provisioning cedar reload

# Verify
provisioning cedar list
```

---

## Monitoring & Auditing

### Monitor Authorization Decisions

```bash
# Query denied requests (last 24 hours)
provisioning audit query \
  --action authorization_denied \
  --from "24h" \
  --out table

# Expected output:
# ┌─────────┬────────┬──────────┬────────┬────────────────┐
# │ Time    │ User   │ Action   │ Resour │ Reason         │
# ├─────────┼────────┼──────────┼────────┼────────────────┤
# │ 10:15am │ bob    │ deploy   │ prod   │ MFA not verif  │
# │ 11:30am │ alice  │ delete   │ db-01  │ No approval    │
# └─────────┴────────┴──────────┴────────┴────────────────┘
```

### Alert on Suspicious Activity

```yaml
# alerts/cedar-policies.yaml
alerts:
  - name: "High Denial Rate"
    query: "authorization_denied"
    threshold: 10
    window: "5m"
    action: "notify:security-team"

  - name: "Policy Bypass Attempt"
    query: "action:deploy AND result:denied"
    user: "critical-users"
    action: "page:oncall"
```

### Policy Usage Statistics

```bash
# Which policies are most used?
provisioning cedar stats --top 10

# Example output:
# Policy ID              | Uses    | Allows   | Denies
# ---------------------- | ------- | -------- | -------
# prod-deploy-devops     | 1,234   | 1,100    | 134
# admin-full-access      | 892     | 892      | 0
# viewer-read-only       | 5,421   | 5,421    | 0
```

---

## Troubleshooting

### Policy Not Applying

Symptom: Policy changes not taking effect

Solutions:

1. Verify hot reload:

   ```bash
   provisioning cedar reload
   provisioning cedar list  # Should show updated timestamp
   ```

2. Check orchestrator logs:

   ```bash
   journalctl -u provisioning-orchestrator -f | grep cedar
   ```

3. Restart orchestrator:

   ```bash
   systemctl restart provisioning-orchestrator
   ```

### Unexpected Denials

Symptom: User denied access when policy should allow

Debug:

```bash
# Enable debug mode
export PROVISIONING_DEBUG=1

# View authorization decision
provisioning audit query \
  --user alice \
  --action deploy \
  --from "1h" \
  --out json | jq '.authorization'

# Shows which policy evaluated, context used, reason for denial
```

### Policy Conflicts

Symptom: Multiple policies match, unclear which applies

Resolution:

- Cedar uses deny-override: if any forbid policy matches, the request is denied, no matter how many permit policies also match
- Cedar has no priority mechanism. Annotations are metadata and do not affect evaluation, so a `@priority` annotation cannot change which policy wins
- Make policies more specific to avoid conflicts

```cedar
permit (
  principal in Role::"Admin",
  action,
  resource
);

forbid (
  principal,
  action == Action::"delete",
  resource is Database
);

// Admin can do anything EXCEPT delete databases, because forbid overrides permit
```

---

## Best Practices

### 1. Start Restrictive, Loosen Gradually

```cedar
// ❌ BAD: Too permissive initially
permit (principal, action, resource);

// ✅ GOOD: Explicit allow, expand as needed
permit (
  principal in Role::"Admin",
  action in [Action::"read", Action::"list"],
  resource
);
```

### 2. Use Annotations

```cedar
@id("prod-deploy-mfa")
@description("Production deployments require MFA verification")
@owner("platform-team")
@reviewed("2025-10-08")
@expires("2026-10-08")
permit (
  principal in Team::"platform-admin",
  action == Action::"deploy",
  resource in Environment::"production"
) when {
  context.mfa_verified == true
};
```

### 3. Principle of Least Privilege

Give users the minimum permissions they need:

```cedar
// ❌ BAD: Overly broad
permit (principal in Team::"developers", action, resource);

// ✅ GOOD: Specific permissions
permit (
  principal in Team::"developers",
  action in [Action::"read", Action::"create", Action::"update"],
  resource in Environment::"development"
);
```

### 4. Document Context Requirements

```cedar
// Context required for this policy:
// - mfa_verified: boolean (from JWT claims)
// - approval_id: string (from request header)
// - ip_address: IpAddr (from connection)
permit (
  principal in Role::"Operator",
  action == Action::"deploy",
  resource in Environment::"production"
) when {
  context.mfa_verified == true &&
  context has approval_id &&
  context.ip_address.isInRange(ip("10.0.0.0/8"))
};
```

### 5. Separate Policies by Concern

File organization:

```
cedar-policies/
├── schema.cedar              # Entity/action definitions
├── rbac.cedar                # Role-based policies
├── teams.cedar               # Team-based policies
├── time-restrictions.cedar   # Time-based policies
├── ip-restrictions.cedar     # Network-based policies
├── production.cedar          # Production-specific
└── development.cedar         # Development-specific
```

### 6. Version Control

```bash
# Git commit each policy change
git add provisioning/config/cedar-policies/production.cedar
git commit -m "feat(cedar): Add MFA requirement for prod deployments

- Require MFA for all production deployments
- Applies to devops and platform-admin teams
- Effective 2025-10-08

Policy ID: prod-deploy-mfa
Reviewed by: security-team
Ticket: SEC-1234"

git push
```

### 7. Regular Policy Audits

Quarterly review:

- [ ] Remove unused policies
- [ ] Tighten overly permissive policies
- [ ] Update for new resources/actions
- [ ] Verify team memberships current
- [ ] Test break-glass procedures

---

## Quick Reference

### Common Policy Patterns

```cedar
// Allow all
permit (principal, action, resource);

// Deny all
forbid (principal, action, resource);

// Role-based
permit (principal in Role::"Admin", action, resource);

// Team-based
permit (principal in Team::"platform", action, resource);

// Resource-based
permit (principal, action, resource in Environment::"production");

// Action-based
permit (principal, action in [Action::"read", Action::"list"], resource);

// Condition-based
permit (principal, action, resource) when { context.mfa_verified == true };

// Complex
permit (
  principal in Team::"devops",
  action == Action::"deploy",
  resource in Environment::"production"
) when {
  context.mfa_verified == true &&
  context has approval_id &&
  context.time.hour >= 9 &&
  context.time.hour < 17
};
```

### Useful Commands

```bash
# Validate policies
provisioning cedar validate

# Reload policies (hot reload)
provisioning cedar reload

# List active policies
provisioning cedar list

# Test policies
provisioning cedar test tests/

# Query denials
provisioning audit query --action authorization_denied

# Policy statistics
provisioning cedar stats
```

---

## Support

- Documentation: docs/architecture/CEDAR_AUTHORIZATION_IMPLEMENTATION.md
- Policy Examples: provisioning/config/cedar-policies/
- Issues: Report to platform-team
- Emergency: Use break-glass procedure

---

Version: 1.0.0
Maintained By: Platform Team
Last Updated: 2025-10-08
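
## Appendix: Checking Policy ID Uniqueness

The "Policy ID unique" item in the Step 5 review checklist can be checked mechanically before deployment. The following is a minimal sketch, not part of the `provisioning` CLI: the function name `find_duplicate_policy_ids` is hypothetical, and it assumes each policy declares its ID with an `@id("...")` annotation written on a single line, as in the examples in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a provisioning or Cedar CLI command).
# Prints every @id("...") value that appears more than once across the
# given Cedar policy files. Prints nothing when all IDs are unique.
find_duplicate_policy_ids() {
  # -o: print only the matched annotation, -h: omit file names, -E: extended regex
  grep -ohE '@id\("[^"]+"\)' "$@" 2>/dev/null \
    | sed -E 's/^@id\("([^"]+)"\)$/\1/' \
    | sort \
    | uniq -d
}
```

One way to use it is as a gate ahead of `provisioning cedar reload`: call `find_duplicate_policy_ids provisioning/config/cedar-policies/*.cedar` and abort the deployment if the output is non-empty. The check is purely textual, so an `@id` that appears inside a comment would also be counted.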