# ADR-014: SecretumVault Integration for Secrets Management

## Status

**Accepted** - 2025-01-08

## Context

The provisioning system manages sensitive data across multiple infrastructure layers: cloud provider credentials, database passwords, API keys, SSH keys, encryption keys, and service tokens. The current security architecture (ADR-009) includes SOPS for encrypted config files and Age for key management, but lacks a centralized secrets management solution with dynamic secrets, access control, and audit logging.

### Current Secrets Management Challenges

**Existing Approach**:

1. **SOPS + Age**: Static secrets encrypted in config files
   - Good: Version-controlled, GitOps-friendly
   - Limited: Manual rotation, no audit trail, manual key distribution

2. **Nickel Configuration**: Declarative secrets references
   - Good: Type-safe configuration
   - Limited: Cannot generate dynamic secrets, no lifecycle management

3. **Manual Secret Injection**: Environment variables, CLI flags
   - Good: Simple for development
   - Limited: No security guarantees, prone to leakage

### Problems Without Centralized Secrets Management

**Security Issues**:

- ❌ No centralized audit trail (who accessed which secret when)
- ❌ No automatic secret rotation policies
- ❌ No fine-grained access control (Cedar policies not enforced on secrets)
- ❌ Secrets scattered across: SOPS files, env vars, config files, K8s secrets
- ❌ No detection of secret sprawl or leaked credentials

**Operational Issues**:

- ❌ Manual secret rotation (error-prone, often neglected)
- ❌ No secret versioning (cannot rollback to previous credentials)
- ❌ Difficult onboarding (manual key distribution)
- ❌ No dynamic secrets (credentials exist indefinitely)

**Compliance Issues**:

- ❌ Cannot prove compliance with secret access policies
- ❌ No audit logs for regulatory requirements
- ❌ Cannot enforce secret expiration policies
- ❌ Difficult to demonstrate least-privilege access

### Use Cases Requiring Centralized Secrets Management

1. **Dynamic Database Credentials**:
   - Generate short-lived DB credentials for applications
   - Automatic rotation based on policies
   - Revocation on application termination

2. **Cloud Provider API Keys**:
   - Centralized storage with access control
   - Audit trail of credential usage
   - Automatic rotation schedules

3. **Service-to-Service Authentication**:
   - Dynamic tokens for microservices
   - Short-lived certificates for mTLS
   - Automatic renewal before expiration

4. **SSH Key Management**:
   - Temporal SSH keys (ADR-009 SSH integration)
   - Centralized certificate authority
   - Audit trail of SSH access

5. **Encryption Key Management**:
   - Master encryption keys for data at rest
   - Key rotation and versioning
   - Integration with KMS systems

### Requirements for Secrets Management System

- ✅ **Dynamic Secrets**: Generate credentials on-demand with TTL
- ✅ **Access Control**: Integration with Cedar authorization policies
- ✅ **Audit Logging**: Complete trail of secret access and modifications
- ✅ **Secret Rotation**: Automatic and manual rotation policies
- ✅ **Versioning**: Track secret versions, enable rollback
- ✅ **High Availability**: Distributed, fault-tolerant architecture
- ✅ **Encryption at Rest**: AES-256-GCM for stored secrets
- ✅ **API-First**: RESTful API for integration
- ✅ **Plugin Ecosystem**: Extensible backends (AWS, Azure, databases)
- ✅ **Open Source**: Self-hosted, no vendor lock-in

## Decision

Integrate **SecretumVault** as the centralized secrets management system for the provisioning platform.

### Architecture Diagram

```
┌─────────────────────────────────────────────────────────────┐
│  Provisioning CLI / Orchestrator / Services                 │
│                                                             │
│  - Workspace initialization (credentials)                   │
│  - Infrastructure deployment (cloud API keys)               │
│  - Service configuration (database passwords)               │
│  - SSH temporal keys (certificate generation)               │
└────────────┬────────────────────────────────────────────────┘
             │
             ▼
┌─────────────────────────────────────────────────────────────┐
│  SecretumVault Client Library (Rust)                        │
│  (provisioning/core/libs/secretum-client/)                  │
│                                                             │
│  - Authentication (token, mTLS)                             │
│  - Secret CRUD operations                                   │
│  - Dynamic secret generation                                │
│  - Lease renewal and revocation                             │
│  - Policy enforcement                                       │
└────────────┬────────────────────────────────────────────────┘
             │ HTTPS + mTLS
             ▼
┌─────────────────────────────────────────────────────────────┐
│  SecretumVault Server                                       │
│  (Rust-based Vault implementation)                          │
│                                                             │
│  ┌───────────────────────────────────────────────────┐      │
│  │ API Layer (REST + gRPC)                           │      │
│  ├───────────────────────────────────────────────────┤      │
│  │ Authentication & Authorization                    │      │
│  │   - Token auth, mTLS, OIDC integration            │      │
│  │   - Cedar policy enforcement                      │      │
│  ├───────────────────────────────────────────────────┤      │
│  │ Secret Engines                                    │      │
│  │   - KV (key-value v2 with versioning)             │      │
│  │   - Database (dynamic credentials)                │      │
│  │   - SSH (certificate authority)                   │      │
│  │   - PKI (X.509 certificates)                      │      │
│  │   - Cloud Providers (AWS/Azure/OCI)               │      │
│  ├───────────────────────────────────────────────────┤      │
│  │ Storage Backend                                   │      │
│  │   - Encrypted storage (AES-256-GCM)               │      │
│  │   - PostgreSQL / Raft cluster                     │      │
│  ├───────────────────────────────────────────────────┤      │
│  │ Audit Backend                                     │      │
│  │   - Structured logging (JSON)                     │      │
│  │   - Syslog, file, database sinks                  │      │
│  └───────────────────────────────────────────────────┘      │
└─────────────────────────────────────────────────────────────┘
             │
             ▼
┌─────────────────────────────────────────────────────────────┐
│  Backends (Dynamic Secret Generation)                       │
│                                                             │
│  - PostgreSQL/MySQL (database credentials)                  │
│  - AWS IAM (temporary access keys)                          │
│  - Azure AD (service principals)                            │
│  - SSH CA (signed certificates)                             │
│  - PKI (X.509 certificates)                                 │
└─────────────────────────────────────────────────────────────┘
```

### Implementation Characteristics

**SecretumVault Provides**:

- ✅ Dynamic secret generation with configurable TTL
- ✅ Secret versioning and rollback capabilities
- ✅ Fine-grained access control (Cedar policies)
- ✅ Complete audit trail (all operations logged)
- ✅ Automatic secret rotation policies
- ✅ High availability (Raft consensus)
- ✅ Encryption at rest (AES-256-GCM)
- ✅ Plugin architecture for secret backends
- ✅ RESTful and gRPC APIs
- ✅ Rust implementation (performance, safety)

**Integration with Provisioning System**:

- ✅ Rust client library (native integration)
- ✅ Nushell commands via CLI wrapper
- ✅ Nickel configuration references secrets
- ✅ Cedar policies control secret access
- ✅ Orchestrator manages secret lifecycle
- ✅ SSH integration for temporal keys
- ✅ KMS integration for encryption keys

## Rationale

### Why SecretumVault Is Required

| Aspect | SOPS + Age (current) | HashiCorp Vault | SecretumVault (chosen) |
|--------|----------------------|-----------------|------------------------|
| **Dynamic Secrets** | ❌ Static only | ✅ Full support | ✅ Full support |
| **Rust Native** | ⚠️ External CLI | ❌ Go binary | ✅ Pure Rust |
| **Cedar Integration** | ❌ None | ❌ Custom policies | ✅ Native Cedar |
| **Audit Trail** | ❌ Git only | ✅ Comprehensive | ✅ Comprehensive |
| **Secret Rotation** | ❌ Manual | ✅ Automatic | ✅ Automatic |
| **Open Source** | ✅ Yes | ⚠️ Formerly MPL 2.0, now BSL 1.1 | ✅ Yes |
| **Self-Hosted** | ✅ Yes | ✅ Yes | ✅ Yes |
| **License** | ✅ Open source (MPL 2.0 / BSD-3-Clause) | ⚠️ BSL 1.1 (source-available) | ✅ Permissive |
| **Versioning** | ⚠️ Git commits | ✅ Built-in | ✅ Built-in |
| **High Availability** | ❌ Single file | ✅ Raft cluster | ✅ Raft cluster |
| **Performance** | ✅ Fast (local) | ⚠️ Network latency | ✅ Rust performance |

### Why Not Continue with SOPS Alone?

SOPS is excellent for **static secrets in git**, but inadequate for:

1. **Dynamic Credentials**: Cannot generate temporary DB passwords
2. **Audit Trail**: Git commits are insufficient for compliance
3. **Rotation Policies**: Manual rotation is error-prone
4. **Access Control**: No runtime policy enforcement
5. **Secret Lifecycle**: Cannot track usage or revoke access
6. **Multi-System Integration**: Limited to files, not API-accessible

**Complementary Approach**:

- SOPS: Configuration files with long-lived secrets (GitOps workflow)
- SecretumVault: Runtime dynamic secrets, short-lived credentials, audit trail

### Why SecretumVault Over HashiCorp Vault?

**HashiCorp Vault Limitations**:

1. **License Change**: Moved from MPL 2.0 to BSL 1.1 (Business Source License) in 2023; source-available, with restrictions on competing commercial offerings
2. **Not Rust Native**: Go binary, subprocess overhead
3. **Custom Policy Language**: HCL policies, not Cedar (provisioning standard)
4. **Complex Deployment**: Heavy operational burden
5. **Vendor Lock-In**: HashiCorp ecosystem dependency

**SecretumVault Advantages**:

1. **Rust Native**: Zero-cost integration, no subprocess spawning
2. **Cedar Policies**: Consistent with ADR-008 authorization model
3. **Lightweight**: Smaller binary, lower resource usage
4. **Open Source**: Permissive license, community-driven
5. **Provisioning-First**: Designed for IaC workflows

### Integration with Existing Security Architecture

**ADR-009 (Security System)**:

- SOPS: Static config encryption (unchanged)
- Age: Key management for SOPS (unchanged)
- SecretumVault: Dynamic secrets, runtime access control (new)

**ADR-008 (Cedar Authorization)**:

- Cedar policies control SecretumVault secret access
- Fine-grained permissions: `read:secret:database/prod/password`
- Audit trail records Cedar policy decisions

**SSH Temporal Keys**:

- SecretumVault SSH CA signs user certificates
- Short-lived certificates (1-24 hours)
- Audit trail of SSH access
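
To make the 1-24 hour certificate window concrete, here is a minimal sketch of how a caller might validate a requested TTL before asking the SSH CA to sign a key. The names `MIN_SSH_CERT_TTL`, `MAX_SSH_CERT_TTL`, `TtlError`, and `validate_ssh_cert_ttl` are hypothetical and not part of SecretumVault; only the 1-24 hour bounds come from this ADR.

```rust
use std::time::Duration;

/// Lower bound for SSH certificate lifetimes (1 hour, per this ADR).
pub const MIN_SSH_CERT_TTL: Duration = Duration::from_secs(60 * 60);
/// Upper bound for SSH certificate lifetimes (24 hours, per this ADR).
pub const MAX_SSH_CERT_TTL: Duration = Duration::from_secs(24 * 60 * 60);

/// Reasons a requested TTL is rejected (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
pub enum TtlError {
    TooShort,
    TooLong,
}

/// Returns the requested TTL unchanged if it lies within the allowed window.
pub fn validate_ssh_cert_ttl(requested: Duration) -> Result<Duration, TtlError> {
    if requested < MIN_SSH_CERT_TTL {
        Err(TtlError::TooShort)
    } else if requested > MAX_SSH_CERT_TTL {
        Err(TtlError::TooLong)
    } else {
        Ok(requested)
    }
}
```

Rejecting out-of-range requests, rather than silently clamping them, keeps the audit trail honest about what the caller actually asked for; either choice would be defensible.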

## Consequences

### Positive

- **Security Posture**: Centralized secrets with audit trail and rotation
- **Compliance**: Complete audit logs for regulatory requirements
- **Operational Excellence**: Automatic rotation, dynamic credentials
- **Developer Experience**: Simple API for secret access
- **Performance**: Rust implementation, zero-cost abstractions
- **Consistency**: Cedar policies across entire system (auth + secrets)
- **Observability**: Metrics, logs, traces for secret access
- **Disaster Recovery**: Secret versioning enables rollback

### Negative

- **Infrastructure Complexity**: Additional service to deploy and operate
- **High Availability Requirements**: Raft cluster needs 3+ nodes
- **Migration Effort**: Existing SOPS secrets need migration path
- **Learning Curve**: Operators must learn vault concepts
- **Dependency Risk**: Critical path service (secrets unavailable = system down)

### Mitigation Strategies

**High Availability**:

```bash
# Deploy SecretumVault cluster (3 nodes)
provisioning deploy secretum-vault --ha --replicas 3

# Automatic leader election via Raft
# Clients auto-reconnect to leader
```
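
The three-node minimum follows from Raft's majority rule: a cluster of `n` voting members needs `n / 2 + 1` of them (integer division) to elect a leader or commit a write, so it survives the loss of the remaining members. A small sketch of that arithmetic follows; the function names are illustrative only and not a SecretumVault API.

```rust
/// Votes required for a majority in a Raft cluster with `nodes` voting members.
pub fn quorum(nodes: usize) -> usize {
    nodes / 2 + 1
}

/// Node failures the cluster can absorb while still reaching a majority.
pub fn tolerated_failures(nodes: usize) -> usize {
    nodes.saturating_sub(quorum(nodes))
}
```

With `--replicas 3` the quorum is 2 and one node may fail. Four nodes still tolerate only one failure, which is why odd cluster sizes are the usual choice.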

**Migration from SOPS**:

```bash
# Phase 1: Import existing SOPS secrets into SecretumVault
provisioning secrets migrate --from-sops config/secrets.yaml

# Phase 2: Update Nickel configs to reference vault paths
# Phase 3: Deprecate SOPS for runtime secrets (keep for config files)
```

**Fallback Strategy**:

```rust
// Graceful degradation if vault unavailable
let secret = match vault_client.get_secret("database/password").await {
    Ok(s) => s,
    Err(VaultError::Unavailable) => {
        // Fallback to SOPS for read-only operations
        warn!("Vault unavailable, using SOPS fallback");
        sops_decrypt("config/secrets.yaml", "database.password")?
    },
    // Any other error (for example, permission denied) is propagated unchanged
    Err(e) => return Err(e),
};
```

**Operational Monitoring**:

```text
# Prometheus metrics
secretum_vault_request_duration_seconds
secretum_vault_secret_lease_expiry
secretum_vault_auth_failures_total
secretum_vault_raft_leader_changes

# Alerts: Vault unavailable, high auth failure rate, lease expiry
```

## Alternatives Considered

### Alternative 1: Continue with SOPS Only

- **Pros**: No new infrastructure, simple
- **Cons**: No dynamic secrets, no audit trail, manual rotation
- **Decision**: REJECTED - Insufficient for production security

### Alternative 2: HashiCorp Vault

- **Pros**: Mature, feature-rich, widely adopted
- **Cons**: BSL license, Go binary, HCL policies (not Cedar), complex deployment
- **Decision**: REJECTED - License and integration concerns

### Alternative 3: Cloud Provider Native (AWS Secrets Manager, Azure Key Vault)

- **Pros**: Fully managed, high availability
- **Cons**: Vendor lock-in, multi-cloud complexity, cost at scale
- **Decision**: REJECTED - Against open-source and multi-cloud principles

### Alternative 4: CyberArk, 1Password, etc.

- **Pros**: Enterprise features
- **Cons**: Proprietary, expensive, poor API integration
- **Decision**: REJECTED - Not suitable for IaC automation

### Alternative 5: Build Custom Secrets Manager

- **Pros**: Full control, tailored to needs
- **Cons**: High maintenance burden, security risk, reinventing wheel
- **Decision**: REJECTED - SecretumVault provides this already

## Implementation Details

### SecretumVault Deployment

```bash
# Deploy via provisioning system
provisioning deploy secretum-vault \
  --ha \
  --replicas 3 \
  --storage postgres \
  --tls-cert /path/to/cert.pem \
  --tls-key /path/to/key.pem

# Initialize and unseal
provisioning vault init
provisioning vault unseal --key-shares 5 --key-threshold 3
```

### Rust Client Library

```rust
// provisioning/core/libs/secretum-client/src/lib.rs

use std::time::Duration;

// Assumes the secretum_vault crate exports these types alongside Client and Auth.
use secretum_vault::{Auth, Certificate, Client, DbCredentials, Result, Secret, TlsConfig};

/// Thin wrapper around the SecretumVault client used by provisioning services.
pub struct VaultClient {
    client: Client,
}

impl VaultClient {
    /// Builds a client that authenticates with a token over mTLS.
    pub async fn new(addr: &str, token: &str) -> Result<Self> {
        let client = Client::new(addr)
            .auth(Auth::Token(token))
            .tls_config(TlsConfig::from_files("ca.pem", "cert.pem", "key.pem"))?
            .build()?;

        Ok(Self { client })
    }

    /// Reads a versioned secret from the KV v2 engine.
    pub async fn get_secret(&self, path: &str) -> Result<Secret> {
        self.client.kv2().get(path).await
    }

    /// Requests short-lived database credentials for the given role.
    pub async fn create_dynamic_db_credentials(&self, role: &str) -> Result<DbCredentials> {
        self.client.database().generate_credentials(role).await
    }

    /// Asks the SSH CA to sign a public key, valid for `ttl`.
    pub async fn sign_ssh_key(&self, public_key: &str, ttl: Duration) -> Result<Certificate> {
        self.client.ssh().sign_key(public_key, ttl).await
    }
}
```

### Nushell Integration

```nushell
# Nushell commands via Rust CLI wrapper
provisioning secrets get database/prod/password
provisioning secrets set api/keys/stripe --value "sk_live_xyz"
provisioning secrets rotate database/prod/password
provisioning secrets lease renew lease_id_12345
provisioning secrets list database/
```

### Nickel Configuration Integration

```nickel
# provisioning/schemas/database.ncl
{
  database = {
    host = "postgres.example.com",
    port = 5432,
    username = secrets.get "database/prod/username",
    password = secrets.get "database/prod/password",
  }
}

# Nickel function: secrets.get resolves to SecretumVault API call
```

### Cedar Policy for Secret Access

```cedar
// policy: developers can read dev secrets, not prod
permit(
  principal in Group::"developers",
  action == Action::"read",
  resource in Secret::"database/dev"
);

forbid(
  principal in Group::"developers",
  action == Action::"read",
  resource in Secret::"database/prod"
);

// policy: CI/CD can generate dynamic DB credentials
permit(
  principal == Service::"github-actions",
  action == Action::"generate",
  resource in Secret::"database/dynamic"
) when {
  context.ttl <= duration("1h")
};
```
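
These policies rely on Cedar's evaluation rule: a request is denied unless at least one `permit` matches, and any matching `forbid` overrides every `permit`. The sketch below models only that rule in plain Rust. It is not the Cedar engine: `Rule`, `decide`, and `example_rules` are hypothetical names, groups are matched by exact string, and `resource in Secret::"..."` is approximated by a path-prefix check, which is an assumption about how secret paths map onto the Cedar entity hierarchy.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Effect {
    Permit,
    Forbid,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny,
}

/// Simplified stand-in for one Cedar policy (hypothetical type).
pub struct Rule {
    pub effect: Effect,
    pub group: &'static str,
    pub action: &'static str,
    pub path_prefix: &'static str,
}

impl Rule {
    /// A rule applies when group and action match and the path is the prefix
    /// itself or lies beneath it ("database/dev" covers "database/dev/x",
    /// but not "database/devops/x").
    fn matches(&self, group: &str, action: &str, path: &str) -> bool {
        let under_prefix = path == self.path_prefix
            || path
                .strip_prefix(self.path_prefix)
                .map_or(false, |rest| rest.starts_with('/'));
        self.group == group && self.action == action && under_prefix
    }
}

/// Default deny; any matching forbid wins over any matching permit.
pub fn decide(rules: &[Rule], group: &str, action: &str, path: &str) -> Decision {
    let mut permitted = false;
    for rule in rules.iter().filter(|r| r.matches(group, action, path)) {
        match rule.effect {
            Effect::Forbid => return Decision::Deny,
            Effect::Permit => permitted = true,
        }
    }
    if permitted {
        Decision::Allow
    } else {
        Decision::Deny
    }
}

/// The two developer policies from the Cedar example, in simplified form.
pub fn example_rules() -> Vec<Rule> {
    vec![
        Rule { effect: Effect::Permit, group: "developers", action: "read", path_prefix: "database/dev" },
        Rule { effect: Effect::Forbid, group: "developers", action: "read", path_prefix: "database/prod" },
    ]
}
```

Because the default is deny, the explicit `forbid` on `database/prod` is not strictly required for developers today; it acts as a guard in case a broader `permit` is added later.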

### Dynamic Database Credentials

```rust
// Application requests temporary DB credentials through the VaultClient wrapper
let creds = vault_client
    .create_dynamic_db_credentials("postgres-readonly")
    .await?;

println!("Username: {}", creds.username);  // v-app-abcd1234
println!("Password: {}", creds.password);  // random-secure-password (illustration only; avoid logging secrets)
println!("TTL: {:?}", creds.lease_duration);  // 1h (Duration has no Display impl, so use {:?})

// Credentials automatically revoked after TTL
// No manual cleanup needed
```
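
Because dynamic credentials expire with their lease, a long-running client has to renew before the TTL runs out. One common approach is to renew once a fixed fraction of the lease has elapsed. The sketch below assumes a two-thirds threshold, which this ADR does not specify; the `Lease` type and its methods are hypothetical and do not describe the actual client library.

```rust
use std::time::{Duration, Instant};

/// Client-side view of a lease on a dynamic secret (hypothetical type).
pub struct Lease {
    pub issued_at: Instant,
    pub ttl: Duration,
}

impl Lease {
    /// Point in time at which renewal should be attempted:
    /// two thirds of the way through the lease (assumed threshold).
    pub fn renew_at(&self) -> Instant {
        self.issued_at + self.ttl / 3 * 2
    }

    /// True once the full TTL has elapsed; the credentials are no longer valid.
    pub fn is_expired(&self, now: Instant) -> bool {
        now >= self.issued_at + self.ttl
    }

    /// True inside the renewal window: past the threshold but not yet expired.
    pub fn should_renew(&self, now: Instant) -> bool {
        now >= self.renew_at() && !self.is_expired(now)
    }
}
```

For a one-hour lease this opens the renewal window at 40 minutes, leaving 20 minutes to retry if the vault is briefly unreachable. Passing `now` explicitly keeps the logic testable without sleeping.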

### Secret Rotation Automation

```toml
# secretum-vault config
[[rotation_policies]]
path = "database/prod/password"
schedule = "0 0 * * 0"  # Weekly on Sunday midnight
max_age = "30d"

[[rotation_policies]]
path = "api/keys/stripe"
schedule = "0 0 1 * *"  # Monthly on 1st
max_age = "90d"
```
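
As a rough illustration of how `max_age` might be interpreted, the sketch below parses values such as `"30d"` and checks whether a secret of a given age is overdue for rotation. The accepted suffixes (`m`, `h`, `d`) and the function names are assumptions; this ADR does not specify the SecretumVault duration grammar.

```rust
use std::time::Duration;

/// Parses strings like "5m", "12h", or "30d" into a Duration.
/// Returns None for an empty string, an unknown suffix, or a non-numeric count.
pub fn parse_max_age(s: &str) -> Option<Duration> {
    let s = s.trim();
    let unit = s.chars().last()?;
    let digits = &s[..s.len() - unit.len_utf8()];
    let count: u64 = digits.parse().ok()?;
    let secs_per_unit: u64 = match unit {
        'm' => 60,
        'h' => 3_600,
        'd' => 86_400,
        _ => return None,
    };
    count.checked_mul(secs_per_unit).map(Duration::from_secs)
}

/// Some(true) when `age` has reached the configured limit,
/// None when the limit cannot be parsed.
pub fn rotation_overdue(age: Duration, max_age: &str) -> Option<bool> {
    parse_max_age(max_age).map(|limit| age >= limit)
}
```

Under these assumptions, `max_age` works as a backstop: even if a scheduled run is missed, a secret older than the limit can be flagged for rotation.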

### Audit Log Format

```json
{
  "timestamp": "2025-01-08T12:34:56Z",
  "type": "request",
  "auth": {
    "client_token": "sha256:abc123...",
    "accessor": "hmac:def456...",
    "display_name": "service-orchestrator",
    "policies": ["default", "service-policy"]
  },
  "request": {
    "operation": "read",
    "path": "secret/data/database/prod/password",
    "remote_address": "10.0.1.5"
  },
  "response": {
    "status": 200
  },
  "cedar_policy": {
    "decision": "permit",
    "policy_id": "allow-orchestrator-read-secrets"
  }
}
```

## Testing Strategy

**Unit Tests**:

```rust
#[tokio::test]
async fn test_get_secret() {
    let vault = mock_vault_client();
    let secret = vault.get_secret("test/secret").await.unwrap();
    assert_eq!(secret.value, "expected-value");
}

#[tokio::test]
async fn test_dynamic_credentials_generation() {
    let vault = mock_vault_client();
    let creds = vault.create_dynamic_db_credentials("postgres-readonly").await.unwrap();
    assert!(creds.username.starts_with("v-"));
    assert_eq!(creds.lease_duration, Duration::from_secs(3600));
}
```

**Integration Tests**:

```bash
# Test vault deployment
provisioning deploy secretum-vault --test-mode
provisioning vault init
provisioning vault unseal

# Test secret operations
provisioning secrets set test/secret --value "test-value"
provisioning secrets get test/secret | grep -qx "test-value"

# Test dynamic credentials (generated usernames start with "v-")
provisioning secrets db-creds postgres-readonly | jq -r '.username' | grep -q '^v-'

# Test rotation
provisioning secrets rotate test/secret
```

**Security Tests**:

```rust
#[tokio::test]
async fn test_unauthorized_access_denied() {
    let vault = vault_client_with_limited_token();
    let result = vault.get_secret("database/prod/password").await;
    assert!(matches!(result, Err(VaultError::PermissionDenied)));
}
```

## Configuration Integration

**Provisioning Config**:

```toml
# provisioning/config/config.defaults.toml
[secrets]
provider = "secretum-vault"  # "secretum-vault" | "sops" | "env"
vault_addr = "https://vault.example.com:8200"
vault_namespace = "provisioning"
vault_mount = "secret"

[secrets.tls]
ca_cert = "/etc/provisioning/vault-ca.pem"
client_cert = "/etc/provisioning/vault-client.pem"
client_key = "/etc/provisioning/vault-client-key.pem"

[secrets.cache]
enabled = true
ttl = "5m"
max_size = "100MB"
```
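
The `[secrets.cache]` settings imply a client-side cache whose entries expire after `ttl`. A minimal sketch of that behaviour, assuming an in-memory map keyed by secret path, might look like the following. `SecretCache` is a hypothetical type, `max_size` enforcement is left out, and a production cache would also need to wipe secret material from memory on eviction.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// In-memory cache of secret values with a fixed time-to-live (hypothetical type).
pub struct SecretCache {
    ttl: Duration,
    // path -> (value, time the value was stored)
    entries: HashMap<String, (String, Instant)>,
}

impl SecretCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Stores or replaces the value for `path`, stamped with `now`.
    pub fn insert(&mut self, path: &str, value: &str, now: Instant) {
        self.entries.insert(path.to_owned(), (value.to_owned(), now));
    }

    /// Returns the cached value if present and younger than the TTL.
    /// Expired entries are removed so the caller falls through to the vault.
    pub fn get(&mut self, path: &str, now: Instant) -> Option<&str> {
        let expired = match self.entries.get(path) {
            Some((_, stored_at)) => now.saturating_duration_since(*stored_at) >= self.ttl,
            None => return None,
        };
        if expired {
            self.entries.remove(path);
            return None;
        }
        self.entries.get(path).map(|(value, _)| value.as_str())
    }
}
```

A short TTL such as `5m` bounds how long a rotated or revoked secret can still be served from the cache, which is the trade-off against fewer round trips to the vault.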

**Environment Variables**:

```bash
export VAULT_ADDR="https://vault.example.com:8200"
export VAULT_TOKEN="s.abc123def456..."
export VAULT_NAMESPACE="provisioning"
export VAULT_CACERT="/etc/provisioning/vault-ca.pem"
```

## Migration Path

**Phase 1: Deploy SecretumVault**

- Deploy vault cluster in HA mode
- Initialize and configure backends
- Set up Cedar policies

**Phase 2: Migrate Static Secrets**

- Import SOPS secrets into vault KV store
- Update Nickel configs to reference vault paths
- Verify secret access via new API

**Phase 3: Enable Dynamic Secrets**

- Configure database secret engine
- Configure SSH CA secret engine
- Update applications to use dynamic credentials

**Phase 4: Deprecate SOPS for Runtime**

- SOPS remains for GitOps config files
- Runtime secrets exclusively from vault
- Audit trail enforcement

**Phase 5: Automation**

- Automatic rotation policies
- Lease renewal automation
- Monitoring and alerting

## Documentation Requirements

**User Guides**:

- `docs/user/secrets-management.md` - Using SecretumVault
- `docs/user/dynamic-credentials.md` - Dynamic secret workflows
- `docs/user/secret-rotation.md` - Rotation policies and procedures

**Operations Documentation**:

- `docs/operations/vault-deployment.md` - Deploying and configuring vault
- `docs/operations/vault-backup-restore.md` - Backup and disaster recovery
- `docs/operations/vault-monitoring.md` - Metrics, logs, alerts

**Developer Documentation**:

- `docs/development/secrets-api.md` - Rust client library usage
- `docs/development/cedar-secret-policies.md` - Writing Cedar policies for secrets
- Secret engine development guide

**Security Documentation**:

- `docs/security/secrets-architecture.md` - Security architecture overview
- `docs/security/audit-logging.md` - Audit trail and compliance
- Threat model and risk assessment

## References

- [SecretumVault GitHub](https://github.com/secretum-vault/secretum) (hypothetical, replace with actual)
- [HashiCorp Vault Documentation](https://www.vaultproject.io/docs) (for comparison)
- ADR-008: Cedar Authorization (policy integration)
- ADR-009: Security System Complete (current security architecture)
- [Raft Consensus Algorithm](https://raft.github.io/)
- [Cedar Policy Language](https://www.cedarpolicy.com/)
- [SOPS](https://github.com/getsops/sops)
- [Age Encryption](https://age-encryption.org/)

---

- **Status**: Accepted
- **Last Updated**: 2025-01-08
- **Implementation**: Planned
- **Priority**: High (Security and compliance)
- **Estimated Complexity**: Complex