# Control Center Enhancements - Quick Start Guide
## What's New
The control-center has been enhanced with three major features:
1. **SSH Key Management** - Securely store and manage SSH keys with KMS encryption
2. **Mode-Based RBAC** - Four execution modes with role-based access control
3. **Platform Monitoring** - Real-time health monitoring for all platform services
## Quick Start
### 1. SSH Key Management
#### Store an SSH Key
```bash
# Using curl
curl -X POST http://localhost:8080/api/v1/kms/keys \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "production-server-key",
"private_key": "-----BEGIN RSA PRIVATE KEY-----\n...\n-----END RSA PRIVATE KEY-----",
"public_key": "ssh-rsa AAAA...",
"purpose": "ServerAccess",
"tags": ["production", "web-server"]
}'
# Response
{
"key_id": "550e8400-e29b-41d4-a716-446655440000",
"fingerprint": "SHA256:abc123...",
"created_at": "2025-10-06T10:00:00Z"
}
```
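The `private_key` field above carries a multi-line PEM block as a single JSON string with `\n` escapes. One way to build that body from key files in plain bash is sketched here. `json_escape` and `build_key_payload` are hypothetical helper names, not part of the control-center, and the escaping only covers backslashes, double quotes, and common whitespace controls.

```bash
# Escape a string so it can sit inside a JSON string literal.
# Assumption: input contains no control characters other than \n, \r, \t.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}      # backslashes first, so later escapes are not doubled
  s=${s//\"/\\\"}      # double quotes
  s=${s//$'\n'/\\n}    # newlines (PEM blocks are multi-line)
  s=${s//$'\r'/\\r}
  s=${s//$'\t'/\\t}
  printf '%s' "$s"
}

# usage: build_key_payload <name> <private_key_file> <public_key_file>
build_key_payload() {
  local name=$1 private_key public_key
  private_key=$(<"$2")   # $(<file) drops trailing newlines
  public_key=$(<"$3")
  printf '{"name":"%s","private_key":"%s","public_key":"%s","purpose":"ServerAccess"}\n' \
    "$(json_escape "$name")" \
    "$(json_escape "$private_key")" \
    "$(json_escape "$public_key")"
}

# Example (not executed here):
#   build_key_payload production-server-key ~/.ssh/id_rsa ~/.ssh/id_rsa.pub |
#     curl -X POST http://localhost:8080/api/v1/kms/keys \
#       -H "Authorization: Bearer $TOKEN" \
#       -H "Content-Type: application/json" \
#       --data-binary @-
```

Piping the body to `curl --data-binary @-` keeps the private key out of the process argument list.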
#### List SSH Keys
```bash
curl http://localhost:8080/api/v1/kms/keys \
-H "Authorization: Bearer $TOKEN"
# Response
[
{
"key_id": "550e8400-e29b-41d4-a716-446655440000",
"name": "production-server-key",
"fingerprint": "SHA256:abc123...",
"created_at": "2025-10-06T10:00:00Z",
"last_used": "2025-10-06T11:30:00Z",
"rotation_due": "2026-01-04T10:00:00Z",
"purpose": "ServerAccess"
}
]
```
#### Rotate an SSH Key
```bash
curl -X POST http://localhost:8080/api/v1/kms/keys/550e8400.../rotate \
-H "Authorization: Bearer $TOKEN"
# Response
{
"old_key_id": "550e8400-e29b-41d4-a716-446655440000",
"new_key_id": "661f9511-f3ac-52e5-b827-557766551111",
"grace_period_ends": "2025-10-13T10:00:00Z"
}
```
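The sample timestamps are consistent with a 90-day rotation interval and a 7-day grace period counted from `2025-10-06T10:00:00Z`. The arithmetic can be checked locally with a small sketch; `add_days` is a hypothetical helper, it assumes GNU `date` (the `-d` option), and whether the server computes these fields the same way is an assumption.

```bash
# Add N days to an ISO 8601 UTC timestamp. Requires GNU date.
add_days() {
  local ts=$1 days=$2 epoch
  epoch=$(date -u -d "$ts" +%s) || return 1
  date -u -d "@$((epoch + days * 86400))" +%Y-%m-%dT%H:%M:%SZ
}

created_at="2025-10-06T10:00:00Z"
echo "rotation_due:      $(add_days "$created_at" 90)"
echo "grace_period_ends: $(add_days "$created_at" 7)"
```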
### 2. Mode-Based RBAC
#### Execution Modes
| Mode | Use Case | RBAC | Audit |
|------|----------|------|-------|
| **Solo** | Single developer | ❌ All admin | ❌ Optional |
| **MultiUser** | Small teams | ✅ Role-based | ⚠️ Optional |
| **CICD** | Automation | ✅ Service accounts | ✅ Mandatory |
| **Enterprise** | Production | ✅ Full RBAC | ✅ Mandatory |
#### Switch Execution Mode
```bash
# Development: Solo mode
curl -X POST http://localhost:8080/api/v1/mode \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"mode": "solo"}'
# Production: Enterprise mode
curl -X POST http://localhost:8080/api/v1/mode \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"mode": "enterprise"}'
```
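A typo in the mode name is easier to catch before the request is sent. The wrapper sketched here validates the name locally first; `is_valid_mode` and `switch_mode` are hypothetical helpers, and the accepted identifiers are taken from the `mode` comment in `config.defaults.toml`.

```bash
# Return 0 if the argument is one of the four documented mode identifiers.
is_valid_mode() {
  case "$1" in
    solo|multi-user|cicd|enterprise) return 0 ;;
    *) return 1 ;;
  esac
}

# usage: switch_mode <mode>   (expects $TOKEN to hold an admin token)
switch_mode() {
  local mode=$1
  if ! is_valid_mode "$mode"; then
    echo "unknown mode: $mode (expected solo, multi-user, cicd, or enterprise)" >&2
    return 2
  fi
  # -f makes curl exit non-zero on HTTP errors; -sS hides progress but keeps errors
  curl -fsS -X POST http://localhost:8080/api/v1/mode \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"mode\": \"$mode\"}"
}
```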
#### Assign Roles
```bash
# Make user an operator
curl -X POST http://localhost:8080/api/v1/rbac/users/john/role \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"role": "operator"}'
# Roles available:
# - admin (full access)
# - operator (deploy & manage)
# - developer (read + dev deploy)
# - viewer (read-only)
# - service_account (automation)
# - auditor (audit logs)
```
#### Check Your Permissions
```bash
curl http://localhost:8080/api/v1/rbac/permissions \
-H "Authorization: Bearer $TOKEN"
# Response
[
{"resource": "server", "action": "read"},
{"resource": "server", "action": "create"},
{"resource": "taskserv", "action": "deploy"},
...
]
```
### 3. Platform Service Monitoring
#### View All Services
```bash
curl http://localhost:8080/api/v1/platform/services \
-H "Authorization: Bearer $TOKEN"
# Response
{
"orchestrator": {
"name": "Orchestrator",
"status": "Healthy",
"url": "http://localhost:8080",
"last_check": "2025-10-06T12:00:00Z",
"metrics": {
"requests_per_second": 45.2,
"response_time_ms": 12.5,
"custom": {
"active_tasks": "3"
}
}
},
"coredns": {
"name": "CoreDNS",
"status": "Healthy",
...
}
}
```
#### View Service Health History
```bash
curl "http://localhost:8080/api/v1/platform/services/orchestrator/history?since=1h" \
-H "Authorization: Bearer $TOKEN"
# Response
[
{
"timestamp": "2025-10-06T12:00:00Z",
"status": "Healthy",
"response_time_ms": 12
},
{
"timestamp": "2025-10-06T11:59:30Z",
"status": "Healthy",
"response_time_ms": 15
}
]
```
#### View Service Dependencies
```bash
curl http://localhost:8080/api/v1/platform/dependencies \
-H "Authorization: Bearer $TOKEN"
# Response
{
"orchestrator": [],
"gitea": ["database"],
"extension_registry": ["cache"],
"provisioning_api": ["orchestrator"]
}
```
## Configuration
### config.defaults.toml
```toml
# SSH Key Management
[kms.ssh_keys]
rotation_enabled = true
rotation_interval_days = 90 # Rotate every 90 days
grace_period_days = 7 # 7-day grace period
auto_rotate = false # Manual rotation only
# RBAC Configuration
[rbac]
enabled = true
mode = "solo" # solo, multi-user, cicd, enterprise
default_role = "viewer" # Default for new users
admin_users = ["admin"]
allow_mode_switch = true
session_timeout_minutes = 60
# Platform Monitoring
[platform]
orchestrator_url = "http://localhost:8080"
coredns_url = "http://localhost:9153"
gitea_url = "http://localhost:3000"
oci_registry_url = "http://localhost:5000"
extension_registry_url = "http://localhost:8081"
provisioning_api_url = "http://localhost:8082"
check_interval_seconds = 30 # Health check every 30s
timeout_seconds = 5 # 5s timeout per check
```
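Several keys in this file are worth checking from scripts, and a plain `grep` cannot tell which section a key belongs to. A minimal sketch of a section-aware lookup follows. `toml_get` is a hypothetical helper, not a full TOML parser: it handles flat `key = value` lines only and treats every `#` as the start of a comment, so it would misread a string value that contains `#`.

```bash
# usage: toml_get <section> <key> <file>
# Prints the value of <key> inside [<section>], with surrounding quotes removed.
toml_get() {
  awk -v section="[$1]" -v key="$2" '
    /^[ \t]*\[/ {                       # section header such as [rbac]
      gsub(/[ \t]/, "")
      in_section = ($0 == section)
      next
    }
    in_section {
      line = $0
      sub(/#.*/, "", line)              # drop trailing comment
      eq = index(line, "=")
      if (eq == 0) next
      k = substr(line, 1, eq - 1)
      v = substr(line, eq + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      if (k == key) { gsub(/^"|"$/, "", v); print v; exit }
    }
  ' "$3"
}

# Example: toml_get rbac allow_mode_switch config.defaults.toml
```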
## Use Cases
### Use Case 1: Developer Onboarding
```bash
# 1. Admin creates SSH key for new developer
curl -X POST http://localhost:8080/api/v1/kms/keys \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "john-dev-key",
"purpose": "ServerAccess",
"tags": ["developer", "john"]
}'
# 2. Admin assigns developer role
curl -X POST http://localhost:8080/api/v1/rbac/users/john/role \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"role": "developer"}'
# 3. John can now access dev/staging but not production
# His permissions are automatically enforced by RBAC middleware
```
### Use Case 2: CI/CD Pipeline
```bash
# 1. Switch to CICD mode
curl -X POST http://localhost:8080/api/v1/mode \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"mode": "cicd"}'
# 2. Create service account SSH key
curl -X POST http://localhost:8080/api/v1/kms/keys \
-H "Authorization: Bearer $SERVICE_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "gitlab-ci-deploy-key",
"purpose": "Automation",
"tags": ["cicd", "gitlab"]
}'
# 3. Service account can create/deploy but not delete
# All actions are logged for audit
```
### Use Case 3: Production Deployment
```bash
# 1. Switch to Enterprise mode (production)
curl -X POST http://localhost:8080/api/v1/mode \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"mode": "enterprise"}'
# 2. Assign operator role to ops team
curl -X POST http://localhost:8080/api/v1/rbac/users/ops-team/role \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"role": "operator"}'
# 3. Ops team can deploy, but all actions are audited
# Audit trail required for compliance (SOC2, PCI DSS)
```
### Use Case 4: Service Health Monitoring
```bash
# 1. Check all platform services
curl http://localhost:8080/api/v1/platform/services \
-H "Authorization: Bearer $TOKEN"
# 2. Get notified if any service is unhealthy
# (Integrate with alerting system)
# 3. View service dependency graph
curl http://localhost:8080/api/v1/platform/dependencies \
-H "Authorization: Bearer $TOKEN"
# 4. Identify which services are affected by outage
# (e.g., if database is down, Gitea will be degraded)
```
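Step 4 can be sketched as a reverse lookup over the dependency map. The data below is copied by hand from the sample `/api/v1/platform/dependencies` response rather than fetched live, and `affected_by` is a hypothetical helper; it follows dependencies transitively.

```bash
# Dependency map, one "service: dep dep ..." entry per line
# (hand-copied from the sample response; a live map may differ).
DEPENDENCIES='orchestrator:
gitea: database
extension_registry: cache
provisioning_api: orchestrator'

# usage: affected_by <failed-service-or-dependency>
# Prints each service that depends on it, directly or transitively.
affected_by() {
  printf '%s\n' "$DEPENDENCIES" | awk -v failed="$1" '
    {
      svc = $1; sub(/:$/, "", svc)
      order[NR] = svc
      for (i = 2; i <= NF; i++) deps[svc] = deps[svc] " " $i
    }
    END {
      down[failed] = 1
      do {                               # repeat until no new service is marked
        changed = 0
        for (n = 1; n <= NR; n++) {
          svc = order[n]
          if (svc in down) continue
          count = split(deps[svc], d, " ")
          for (j = 1; j <= count; j++)
            if (d[j] in down) { down[svc] = 1; changed = 1; print svc; break }
        }
      } while (changed)
    }'
}

affected_by database
```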
## Role Permission Matrix
| Action | Admin | Operator | Developer | Viewer | ServiceAccount | Auditor |
|--------|-------|----------|-----------|--------|----------------|---------|
| **Servers** |
| Read | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Create | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ |
| Deploy | ✅ | ✅ | ⚠️ Dev only | ❌ | ✅ | ❌ |
| Delete | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| **Taskservs** |
| Read | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Create | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ |
| Deploy | ✅ | ✅ | ⚠️ Dev only | ❌ | ✅ | ❌ |
| Delete | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| **Services** |
| Read | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Start/Stop | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| **Users & Roles** |
| Read | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Assign Role | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| **Audit Logs** |
| Read | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Audit | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
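
For scripts that want to fail fast before calling the API, the Servers rows of the matrix can be mirrored in a small lookup. This is an illustrative client-side sketch only: `server_permission` is a hypothetical helper, the role identifiers are the ones listed under Assign Roles, and the RBAC middleware on the server remains the authority.

```bash
# usage: server_permission <role> <action>
# Prints allow, dev-only, or deny according to the Servers rows of the matrix.
# Unknown roles and actions fall through to deny.
server_permission() {
  local role=$1 action=$2
  case "$action:$role" in
    read:admin|read:operator|read:developer|read:viewer|read:service_account|read:auditor)
      echo allow ;;
    create:admin|create:operator|create:service_account)
      echo allow ;;
    deploy:admin|deploy:operator|deploy:service_account)
      echo allow ;;
    deploy:developer)
      echo dev-only ;;
    delete:admin|delete:operator)
      echo allow ;;
    *)
      echo deny ;;
  esac
}

server_permission developer deploy
```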
## Security Best Practices
### 1. SSH Keys
- **Use rotation**: Enable automatic rotation for production keys
- **Tag keys**: Use tags to organize keys by environment and purpose
- **Audit access**: Regularly review SSH key audit logs
- **Delete unused**: Remove SSH keys that haven't been used in 90+ days
- ⚠️ **Never expose**: Never log or display private keys
### 2. RBAC
- **Least privilege**: Default to Viewer role for new users
- **Enterprise mode**: Use Enterprise mode for production
- **Regular audits**: Review role assignments quarterly
- **Session timeout**: Use shorter timeouts (30 min) for Enterprise
- ⚠️ **Avoid Solo mode**: Never use Solo mode in production
### 3. Platform Monitoring
- **Set alerts**: Configure alerts for unhealthy services
- **Monitor dependencies**: Track service dependency health
- **Review metrics**: Check service metrics daily
- **Internal only**: Never expose service URLs externally
- ⚠️ **Timeout protection**: Use reasonable timeouts (5s default)
## Troubleshooting
### SSH Key Issues
**Problem**: "Key not found"
```bash
# Check if key exists
curl -s http://localhost:8080/api/v1/kms/keys \
-H "Authorization: Bearer $TOKEN" | jq '.[] | select(.name=="my-key")'
```
**Problem**: "Permission denied to access key"
```bash
# Check your permissions
curl -s http://localhost:8080/api/v1/rbac/permissions \
-H "Authorization: Bearer $TOKEN" | grep ssh_key
```
**Problem**: "Key rotation failed"
```bash
# Check rotation policy
grep -F -A 5 "kms.ssh_keys" config.toml
```
### RBAC Issues
**Problem**: "Permission denied on API call"
```bash
# Check your role
curl http://localhost:8080/api/v1/rbac/permissions \
-H "Authorization: Bearer $TOKEN"
# Check current mode
curl http://localhost:8080/api/v1/mode \
-H "Authorization: Bearer $TOKEN"
```
**Problem**: "Cannot assign role"
```bash
# Only admins can assign roles
# Check if you have admin role
```
**Problem**: "Mode switch denied"
```bash
# Check if mode switching is allowed
grep allow_mode_switch config.toml
```
### Platform Monitoring Issues
**Problem**: "Service shows as unhealthy"
```bash
# Check service directly
curl http://localhost:8080/health # For orchestrator
# Check service logs
journalctl -u orchestrator -n 50
```
**Problem**: "Service health not updating"
```bash
# Check monitoring interval
grep check_interval_seconds config.toml
# Verify platform monitor is running (pgrep does not match itself, unlike ps | grep)
pgrep -fl control-center
```
**Problem**: "Cannot start/stop service"
```bash
# Check permissions (requires Operator or Admin)
curl -s http://localhost:8080/api/v1/rbac/permissions \
-H "Authorization: Bearer $TOKEN" | grep service
```
## Migration Guide
### From Existing SSH Key Storage
```bash
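# 1. Export existing SSH keys
ls ~/.ssh/*.pub > key_list.txt
# 2. Import to KMS
while IFS= read -r key_file; do
  name=$(basename "$key_file" .pub)
  # jq (1.6+ for --rawfile) escapes the multi-line key material into valid JSON;
  # the body is piped to curl so the private key never appears in a command line
  jq -n --arg name "$name" \
    --rawfile private_key "${key_file%.pub}" \
    --rawfile public_key "$key_file" \
    '{name: $name, private_key: $private_key, public_key: ($public_key | rtrimstr("\n")), purpose: "ServerAccess"}' |
  curl -X POST http://localhost:8080/api/v1/kms/keys \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    --data-binary @-
done < key_list.txt
# 3. Verify import
curl http://localhost:8080/api/v1/kms/keys \
  -H "Authorization: Bearer $TOKEN"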
```
### From No RBAC to Enterprise Mode
```bash
# 1. Start in Solo mode (current default)
# config.toml: mode = "solo"
# 2. Create admin users
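curl -X POST http://localhost:8080/api/v1/users \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"username": "admin", "role": "admin"}'
# 3. Assign roles to existing users
curl -X POST http://localhost:8080/api/v1/rbac/users/john/role \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"role": "developer"}'
curl -X POST http://localhost:8080/api/v1/rbac/users/ops/role \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"role": "operator"}'
# 4. Switch to Multi-User mode (test)
curl -X POST http://localhost:8080/api/v1/mode \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"mode": "multi-user"}'
# 5. Verify permissions work
# Test as different users
# 6. Switch to Enterprise mode (production)
curl -X POST http://localhost:8080/api/v1/mode \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"mode": "enterprise"}'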
# 7. Enable audit logging
# config.toml: [logging] audit_enabled = true
```
## API Reference
### SSH Keys
| Endpoint | Method | Auth | Description |
|----------|--------|------|-------------|
| `/api/v1/kms/keys` | POST | Admin/Operator | Store SSH key |
| `/api/v1/kms/keys` | GET | All | List SSH keys |
| `/api/v1/kms/keys/:id` | GET | All | Get SSH key details |
| `/api/v1/kms/keys/:id` | DELETE | Admin/Operator | Delete SSH key |
| `/api/v1/kms/keys/:id/rotate` | POST | Admin/Operator | Rotate SSH key |
| `/api/v1/kms/keys/:id/audit` | GET | Admin/Auditor | Get audit log |
### RBAC
| Endpoint | Method | Auth | Description |
|----------|--------|------|-------------|
| `/api/v1/rbac/roles` | GET | All | List available roles |
| `/api/v1/rbac/users/:id/role` | POST | Admin | Assign role |
| `/api/v1/rbac/permissions` | GET | All | Get user permissions |
| `/api/v1/mode` | GET | All | Get current mode |
| `/api/v1/mode` | POST | Admin | Switch mode |
### Platform
| Endpoint | Method | Auth | Description |
|----------|--------|------|-------------|
| `/api/v1/platform/services` | GET | All | All services status |
| `/api/v1/platform/services/:type` | GET | All | Specific service |
| `/api/v1/platform/services/:type/history` | GET | All | Health history |
| `/api/v1/platform/dependencies` | GET | All | Dependency graph |
| `/api/v1/platform/services/:type/start` | POST | Admin/Operator | Start service |
| `/api/v1/platform/services/:type/stop` | POST | Admin/Operator | Stop service |
## Additional Documentation
- **Complete Implementation Guide**: `CONTROL_CENTER_ENHANCEMENTS.md`
- **Security Architecture**: `SECURITY_CONSIDERATIONS.md`
- **Implementation Summary**: `IMPLEMENTATION_SUMMARY.md`
- **KMS Documentation**: `src/kms/README.md`
## Support
For issues or questions:
1. Check this guide first
2. Review `CONTROL_CENTER_ENHANCEMENTS.md` for detailed implementation
3. Review `SECURITY_CONSIDERATIONS.md` for security questions
4. Check test files for usage examples
---
**Last Updated**: 2025-10-06
**Version**: 1.0.0