# SecretumVault Architecture

Complete system architecture, design decisions, and component interactions.

## Table of Contents

1. [System Overview](#system-overview)
2. [Core Components](#core-components)
3. [Request Flow](#request-flow)
4. [Configuration-Driven Design](#configuration-driven-design)
5. [Registry Pattern](#registry-pattern)
6. [Storage Layer](#storage-layer)
7. [Cryptography Layer](#cryptography-layer)
8. [Secrets Engines](#secrets-engines)
9. [Authorization & Policies](#authorization--policies)
10. [Deployment Architecture](#deployment-architecture)
11. [Data Flow Diagram](#data-flow-diagram)
12. [Performance Characteristics](#performance-characteristics)
13. [Security Architecture](#security-architecture)
14. [Extension Points](#extension-points)

---

## System Overview

SecretumVault is a **config-driven, async-first secrets management system** built on:

- **Rust + Tokio**: Type-safe async runtime
- **Axum**: High-performance HTTP framework
- **Trait-based polymorphism**: Pluggable backends
- **Registry pattern**: Type-safe factory dispatch
- **Cedar**: Attribute-based access control (ABAC)
- **Post-quantum cryptography**: Future-proof security

### Design Philosophy

```
┌─────────────────────────────────────────────────────┐
│ Config-Driven: WHAT to use                          │
│ (backend selection, engine mounting)                │
└────────────────┬────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────┐
│ Registry Pattern: HOW to create it                  │
│ (type-safe dispatch from config string)             │
└────────────────┬────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────┐
│ Trait Abstraction: INTERFACE definition             │
│ (StorageBackend, CryptoBackend, Engine)             │
└────────────────┬────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────┐
│ Concrete Implementations: ACTUAL code               │
│ (etcd, PostgreSQL, OpenSSL, AWS-LC)                 │
└─────────────────────────────────────────────────────┘
```

**Benefit**: Adding a backend leaves existing implementations untouched: implement the trait, register it in the matching registry, and select it in config.

---

## Core Components

### VaultCore

Central coordinator managing all vault operations.

```rust
use std::{collections::HashMap, sync::Arc};

pub struct VaultCore {
    // Storage for encrypted secrets and metadata
    pub storage: Arc<dyn StorageBackend>,
    // Cryptographic operations (encrypt/decrypt/sign/verify)
    pub crypto: Arc<dyn CryptoBackend>,
    // Authentication tokens and TTL management
    pub auth_manager: Arc<AuthManager>,
    // Cedar policy engine for fine-grained access control
    pub cedar_engine: Arc<CedarEngine>,
    // Mounted secret engines (KV, Transit, PKI, Database, etc.), keyed by mount path
    pub engines: HashMap<String, Box<dyn Engine>>,
    // Seal/unseal state and master key encryption
    pub seal_manager: Arc<SealManager>,
    // Metrics collection (Prometheus-compatible)
    pub metrics: Arc<Metrics>,
    // Configuration (static, loaded once at startup)
    pub config: VaultConfig,
}
```

**Initialization**:

```rust
impl VaultCore {
    pub async fn from_config(config: VaultConfig) -> Result<Self> {
        // 1. Validate the loaded configuration
        config.validate()?;

        // 2. Create storage backend from config
        let storage = StorageRegistry::create(&config.storage).await?;

        // 3. Create crypto backend from config
        let crypto = CryptoRegistry::create(&config.vault.crypto_backend, &config.crypto)?;

        // 4. Initialize seal/unseal manager
        let seal_manager = Arc::new(SealManager::new(crypto.clone()));

        // 5. Mount secret engines from config
        //    (the annotation lets Box<KVEngine> coerce to Box<dyn Engine>)
        let mut engines: HashMap<String, Box<dyn Engine>> = HashMap::new();
        if let Some(kv_cfg) = &config.engines.kv {
            engines.insert(
                kv_cfg.path.clone(),
                Box::new(KVEngine::new(kv_cfg, storage.clone())?),
            );
        }

        // 6. Create auth manager and Cedar engine
        let auth_manager = Arc::new(AuthManager::new(storage.clone()));
        let cedar_engine = Arc::new(CedarEngine::new(&config.auth)?);

        Ok(Self {
            storage,
            crypto,
            auth_manager,
            cedar_engine,
            engines,
            seal_manager,
            metrics: Arc::new(Metrics::new()),
            config,
        })
    }
}
```

### API Server

Axum-based HTTP server with a middleware stack.

```
HTTP Request
    ↓
[Axum Router]
    ↓
[Auth Middleware] - Validate X-Vault-Token
    ↓
[Cedar Middleware] - Evaluate policy (permit/forbid)
    ↓
[Request Handler] - Route to appropriate engine
    ↓
[Engine Implementation] - Process request
    ↓
[Storage/Crypto] - Persist/encrypt
    ↓
HTTP Response
    ↓
[Metrics] - Record operation
    ↓
[Audit Log] - Log to storage
```

**Routing**:

```rust
pub fn build_router(vault: Arc<VaultCore>) -> Router {
    let mut router = Router::new()
        // System endpoints
        .route("/v1/sys/init", post(sys::init))
        .route("/v1/sys/unseal", post(sys::unseal))
        .route("/v1/sys/health", get(sys::health))
        .route("/v1/sys/seal-status", get(sys::seal_status));

    // Mount dynamic routes from engines.
    // Mount paths are configured with a trailing slash (e.g. "secret/").
    for (path, engine) in &vault.engines {
        let mount = format!("/v1/{}", path.trim_end_matches('/'));
        router = router.nest(&mount, engine.routes());
    }

    // Axum runs the most recently added layer first, so the Cedar layer is
    // added before the auth layer: a request passes auth, then Cedar.
    router
        .layer(middleware::from_fn_with_state(vault.clone(), cedar_authz_middleware))
        .layer(middleware::from_fn_with_state(vault.clone(), auth_middleware))
        .with_state(vault)
}
```

---

## Request Flow

### Secret Read Request

```
1. Client:
   curl -H "X-Vault-Token: $TOKEN" \
     http://localhost:8200/v1/secret/data/myapp

2. Server receives request
   ↓
3. Auth Middleware:
   - Extract token from header
   - Lookup token in storage
   - Validate TTL (not expired)
   - Extract token metadata (principal, ttl, policies)
   ↓
4. Cedar Middleware:
   - Build context: principal={token_id}, action=read, resource=/secret/data/myapp
   - Evaluate policies: cedar_engine.evaluate(context)
   - Result: permit / forbid
   - If forbid: Return 403 Forbidden
   ↓
5. Route Handler:
   - Parse request path: /v1/secret/data/myapp
   - Find mounted engine: KVEngine at /secret/
   - Delegate to engine.handle_request()
   ↓
6. KV Engine:
   - Extract secret path: myapp
   - Call storage.get("secret:metadata:myapp")
   ↓
7. Storage Backend (etcd/postgres/etc):
   - Lookup encrypted secret blob
   - Return to engine
   ↓
8. KV Engine (decrypt):
   - Get master key from seal_manager
   - Call crypto.decrypt(blob, master_key)
   - Return plaintext metadata + versions
   ↓
9.
   Response:
   - Build JSON response
   - Record metrics.secrets_read.inc()
   - Log to audit: {principal, action, resource, result}
   - Return 200 OK with secret data
```

### Secret Write Request

```
Similar to read, but:

1. Auth → Cedar policy evaluation (write policy)
2. Engine handler parses request body (secret data)
3. Encryption:
   - Get master key from seal_manager
   - crypto.encrypt(plaintext, master_key) → ciphertext
4. Storage: store(ciphertext, metadata)
5. Return 201 Created or 204 No Content
6. Metrics/Audit: Record write operation
```

---

## Configuration-Driven Design

All runtime behavior is determined by `svault.toml`:

### Configuration Hierarchy

```
VaultConfig (root)
├── [vault] section
│   ├── crypto_backend = "openssl"
│   └── (global settings)
├── [server] section
│   ├── address = "0.0.0.0"
│   ├── port = 8200
│   └── (TLS settings)
├── [storage] section
│   ├── backend = "etcd"
│   └── [storage.etcd]
│       └── endpoints = ["http://localhost:2379"]
├── [crypto] section
│   └── (crypto-specific settings)
├── [seal] section
│   ├── seal_type = "shamir"
│   └── [seal.shamir]
│       ├── threshold = 2
│       └── shares = 3
├── [engines] section
│   ├── [engines.kv]
│   │   ├── path = "secret/"
│   │   └── versioned = true
│   ├── [engines.transit]
│   │   └── path = "transit/"
│   └── (other engines)
├── [logging] section
│   ├── level = "info"
│   └── format = "json"
├── [telemetry] section
│   ├── prometheus_port = 9090
│   └── enable_trace = false
└── [auth] section
    └── default_ttl = 24
```

### Configuration Validation

Validation runs at startup (fail-fast):

```rust
impl VaultConfig {
    pub fn validate(&self) -> Result<()> {
        // 1. Check backend availability
        if !CryptoRegistry::is_available(&self.vault.crypto_backend) {
            return Err(ConfigError::UnavailableBackend(
                self.vault.crypto_backend.clone(),
            ));
        }

        // 2. Check path collisions
        let mut paths = HashSet::new();
        for engine_cfg in self.engines.all_engines() {
            if !paths.insert(engine_cfg.path.clone()) {
                return Err(ConfigError::DuplicatePath(engine_cfg.path.clone()));
            }
        }

        // 3.
        // Validate seal threshold (fields live under [seal.shamir])
        if self.seal.shamir.threshold > self.seal.shamir.shares {
            return Err(ConfigError::InvalidSealConfig);
        }

        // 4. Check required fields (endpoints live under [storage.etcd])
        if self.storage.backend == "etcd" && self.storage.etcd.endpoints.is_empty() {
            return Err(ConfigError::MissingField("endpoints"));
        }

        Ok(())
    }
}
```

---

## Registry Pattern

Type-safe backend factory pattern.

### Storage Registry

```rust
pub struct StorageRegistry;

impl StorageRegistry {
    pub async fn create(config: &StorageConfig) -> Result<Arc<dyn StorageBackend>> {
        match config.backend.as_str() {
            "filesystem" => Ok(Arc::new(FilesystemBackend::new(config)?)),
            "etcd" => Ok(Arc::new(EtcdBackend::new(&config.etcd).await?)),
            "surrealdb" => Ok(Arc::new(SurrealDBBackend::new(&config.surrealdb).await?)),
            "postgresql" => Ok(Arc::new(PostgreSQLBackend::new(&config.postgresql).await?)),
            unknown => Err(ConfigError::UnknownBackend(unknown.to_string())),
        }
    }
}
```

### Crypto Registry

```rust
pub struct CryptoRegistry;

impl CryptoRegistry {
    pub fn create(backend: &str, config: &CryptoConfig) -> Result<Arc<dyn CryptoBackend>> {
        match backend {
            "openssl" => Ok(Arc::new(OpenSSLBackend::new()?)),
            "aws-lc" => {
                #[cfg(feature = "aws-lc")]
                return Ok(Arc::new(AwsLcBackend::new()?));

                #[cfg(not(feature = "aws-lc"))]
                return Err(ConfigError::FeatureNotEnabled("aws-lc"));
            }
            "rustcrypto" => {
                #[cfg(feature = "rustcrypto")]
                return Ok(Arc::new(RustCryptoBackend::new()?));

                #[cfg(not(feature = "rustcrypto"))]
                return Err(ConfigError::FeatureNotEnabled("rustcrypto"));
            }
            unknown => Err(ConfigError::UnknownBackend(unknown.to_string())),
        }
    }
}
```

### Engine Registry

```rust
pub struct EngineRegistry;

impl EngineRegistry {
    pub fn mount_engines(
        config: &EnginesConfig,
        vault: &Arc<VaultCore>,
    ) -> Result<HashMap<String, Box<dyn Engine>>> {
        let mut engines = HashMap::new();

        // Mount KV engine
        if let Some(kv_cfg) = &config.kv {
            engines.insert(
                kv_cfg.path.clone(),
                Box::new(KVEngine::new(kv_cfg, vault.storage.clone())?)
                    as Box<dyn Engine>,
            );
        }

        // Mount Transit engine
        if let Some(transit_cfg) = &config.transit {
            engines.insert(
                transit_cfg.path.clone(),
                Box::new(TransitEngine::new(transit_cfg, vault.crypto.clone())?)
                    as Box<dyn Engine>,
            );
        }

        // Mount PKI engine
        if let Some(pki_cfg) = &config.pki {
            engines.insert(
                pki_cfg.path.clone(),
                Box::new(PKIEngine::new(pki_cfg, vault.crypto.clone())?)
                    as Box<dyn Engine>,
            );
        }

        // Mount Database engine
        if let Some(db_cfg) = &config.database {
            engines.insert(
                db_cfg.path.clone(),
                Box::new(DatabaseEngine::new(db_cfg, vault.storage.clone())?)
                    as Box<dyn Engine>,
            );
        }

        Ok(engines)
    }
}
```

---

## Storage Layer

### StorageBackend Trait

```rust
use async_trait::async_trait;

// `#[async_trait]` keeps the trait usable as `Arc<dyn StorageBackend>`.
#[async_trait]
pub trait StorageBackend: Send + Sync {
    // Key-value operations
    async fn get(&self, key: &str) -> StorageResult<Option<Vec<u8>>>;
    async fn set(&self, key: &str, value: Vec<u8>) -> StorageResult<()>;
    async fn delete(&self, key: &str) -> StorageResult<()>;

    // Listing and querying
    async fn list(&self, prefix: &str) -> StorageResult<Vec<String>>;
    async fn exists(&self, key: &str) -> StorageResult<bool>;

    // Atomic operations
    async fn cas(&self, key: &str, old: Option<Vec<u8>>, new: Vec<u8>) -> StorageResult<bool>;

    // Transactions (the `StorageOp` and result element types are assumptions)
    async fn transaction(&self, ops: Vec<StorageOp>) -> StorageResult<Vec<Vec<u8>>>;
}
```

### Storage Key Organization

Keys are namespaced by purpose:

```
Direct secret storage:
  secret:metadata:myapp          → Metadata (path, versions, timestamps)
  secret:v1:myapp                → Version 1 (encrypted data)
  secret:v2:myapp                → Version 2 (encrypted data)

Token storage:
  auth:tokens:token_abc123       → Token metadata (TTL, policies)
  auth:leases:lease_id           → Active lease info

Engine-specific:
  pki:roots:root-ca              → PKI root certificate
  pki:roles:my-role              → PKI role configuration
  db:credentials:postgres-prod   → Generated credentials
  transit:keys:my-key            → Transit encryption key

Internal:
  vault:config:shamir            → Shamir threshold and shares
  vault:master:encrypted_key     → Encrypted master key
```

### Concurrent Access

Storage operations are atomic but don't use distributed locks:

```
Write Operation:
  1. Read current value (with version)
  2. Modify in-memory
  3.
     CAS (compare-and-swap) write:
     - If version matches → Write succeeds
     - If version mismatch → Retry from step 1

Read Operation:
  - Simple get() call
  - No locking, readers don't block writers
```

---

## Cryptography Layer

### CryptoBackend Trait

```rust
use async_trait::async_trait;

// `#[async_trait]` keeps the trait usable as `Arc<dyn CryptoBackend>`.
#[async_trait]
pub trait CryptoBackend: Send + Sync {
    // Symmetric encryption (AES-256-GCM, ChaCha20-Poly1305)
    async fn encrypt(&self, plaintext: &[u8], aad: &[u8]) -> CryptoResult<Ciphertext>;
    async fn decrypt(&self, ciphertext: &Ciphertext, aad: &[u8]) -> CryptoResult<Vec<u8>>;

    // Key generation (the `KeyPair` return type is an assumption)
    async fn generate_keypair(&self, algorithm: KeyAlgorithm) -> CryptoResult<KeyPair>;

    // Signing and verification (if supported)
    async fn sign(&self, data: &[u8], key_id: &str) -> CryptoResult<Signature>;
    async fn verify(&self, data: &[u8], signature: &Signature) -> CryptoResult<bool>;

    // Hash operations
    async fn hash(&self, data: &[u8], algorithm: HashAlgorithm) -> CryptoResult<Vec<u8>>;
}
```

### Master Key Encryption

All secrets are encrypted with the master key:

```
Master Key (from Shamir SSS)
    ↓
Encrypt with NIST SP 800-38D (GCM mode)
    ↓
Ciphertext + IV + Tag stored in encrypted_secret
```

### Post-Quantum Support

Feature-gated post-quantum algorithms:

```rust
#[cfg(feature = "pqc")]
pub enum KeyAlgorithm {
    // Classical
    Rsa2048,
    Rsa4096,
    EcdsaP256,
    EcdsaP384,
    EcdsaP521,

    // Post-quantum (ML-KEM for key exchange)
    MlKem768,

    // Post-quantum (ML-DSA for signatures)
    MlDsa65,
}

#[cfg(not(feature = "pqc"))]
pub enum KeyAlgorithm {
    // Classical only
    Rsa2048,
    Rsa4096,
    EcdsaP256,
    EcdsaP384,
    EcdsaP521,
}
```

---

## Secrets Engines

### Engine Trait

```rust
use async_trait::async_trait;

// `#[async_trait]` keeps the trait usable as `Box<dyn Engine>`.
#[async_trait]
pub trait Engine: Send + Sync {
    // Handle HTTP request for this engine
    // (the `EngineResponse` type name is an assumption)
    async fn handle_request(&self, req: EngineRequest) -> EngineResult<EngineResponse>;

    // Mount point (e.g., "secret/", "transit/")
    fn mount_path(&self) -> &str;

    // Engine type (for metrics and logging)
    fn engine_type(&self) -> &str;

    // Build Axum router for this engine's routes
    fn routes(&self) -> Router;
}
```

### Engine Request Flow

```
HTTP Request: POST /v1/secret/data/myapp
    ↓
Router matches /secret/
prefix
    ↓
KVEngine::routes() router handles /data/myapp
    ↓
KVEngine::handle_request() called
    ↓
KVEngine processes:
  - Parse request body
  - Validate against storage
  - Encrypt/decrypt as needed
  - Call storage backend
  - Return response
    ↓
HTTP Response
```

### KV Engine (Versioned)

```rust
pub struct KVEngine {
    storage: Arc<dyn StorageBackend>,
    config: KVEngineConfig,
    crypto: Arc<dyn CryptoBackend>,
}

impl KVEngine {
    // Handle read request
    pub async fn read(&self, path: &str) -> EngineResult<SecretMetadata> {
        // 1. Get secret metadata
        let metadata_key = format!("{}secret:metadata:{}", self.config.path, path);
        let encrypted = self.storage.get(&metadata_key).await?;

        // 2. Decrypt metadata
        let plaintext = self.crypto.decrypt(&encrypted, b"").await?;
        let metadata: SecretMetadata = serde_json::from_slice(&plaintext)?;

        Ok(metadata)
    }

    // Handle write request
    pub async fn write(&self, path: &str, data: Value) -> EngineResult<()> {
        // 1. Get or create metadata
        let metadata_key = format!("{}secret:metadata:{}", self.config.path, path);
        let mut metadata = self.read_metadata(&metadata_key).await?;

        // 2. Create new version
        let version = metadata.versions.len() + 1;
        let version_key = format!("{}secret:v{}:{}", self.config.path, version, path);

        // 3. Encrypt version data
        let plaintext = serde_json::to_vec(&data)?;
        let encrypted = self.crypto.encrypt(&plaintext, b"").await?;

        // 4. Store version and update metadata
        self.storage.set(&version_key, encrypted).await?;
        metadata.update(version, Utc::now());

        // 5.
        // Store metadata
        let metadata_bytes = serde_json::to_vec(&metadata)?;
        let encrypted_metadata = self.crypto.encrypt(&metadata_bytes, b"").await?;
        self.storage.set(&metadata_key, encrypted_metadata).await?;

        Ok(())
    }
}
```

---

## Authorization & Policies

### Cedar Integration

Cedar is AWS's open-source policy language:

```cedar
permit (
    principal == User::"alice",
    action == Action::"read",
    resource == Secret::"secret/myapp"
) when {
    context.ip_address.isInRange(ip("10.0.0.0/16"))
};
```

### Policy Evaluation Flow

```
HTTP Request
    ↓
Extract principal: X-Vault-Token
    ↓
Build Cedar context:
  principal = Token(token_id, policies=[...])
  action    = "read"
  resource  = "/secret/data/myapp"
  context   = {
    ip_address = "10.0.20.5",
    timestamp  = "2025-12-21T10:30:00Z"
  }
    ↓
Cedar engine evaluates: evaluate(context)
    ↓
Decision:
  - Allow (a permit policy matches and no forbid policy matches) → Proceed to engine
  - Deny (a forbid policy matches, or no policy applies) → Return 403 Forbidden
```

### Token Lifecycle

```
Create:
  1. Generate random token ID (32 bytes)
  2. Create metadata: {policies, ttl, created_at, renewable}
  3. Store encrypted in storage: auth:tokens:token_id
  4. Return token to client

Validate:
  1. Extract token from request header
  2. Lookup in storage
  3. Check TTL: if expired → invalid
  4. Extract policies and principal info

Renew:
  1. Validate token (not expired)
  2. Update TTL: expires_at = now + renewal_period
  3. Update in storage

Revoke:
  1. Delete from storage
  2.
     Invalidate any active leases
```

---

## Deployment Architecture

### Docker Compose (Local Development)

```
┌─────────────────────────────────────────────────────┐
│               Docker Compose Network                │
│                   (vault-network)                   │
├──────────────┬──────────────┬───────────┬───────────┤
│              │              │           │           │
▼              ▼              ▼           ▼           ▼
[vault:8200]   [etcd:2379]    [surrealdb:8000]  [postgres:5432]  [prometheus:9090]
(server)       (storage)      (alt-storage)     (alt-storage)    (monitoring)
```

### Kubernetes Cluster

```
┌────────────────────────────────────────────────────────┐
│                   Kubernetes Cluster                   │
│                                                        │
│  ┌──────────────────────────────────────────────────┐  │
│  │             secretumvault Namespace              │  │
│  │                                                  │  │
│  │ ┌───────────┐ ┌────────────┐ ┌───────────────┐   │  │
│  │ │vault:8200 │ │etcd:2379   │ │prometheus:9090│   │  │
│  │ │Deployment │ │StatefulSet │ │Deployment     │   │  │
│  │ │(1 replica)│ │(3 replicas)│ │(1 replica)    │   │  │
│  │ └───────────┘ └────────────┘ └───────────────┘   │  │
│  │       ↓             ↓                            │  │
│  │  [Service]     [Headless]                        │  │
│  │  vault:8200    etcd:2379                         │  │
│  │                (peer discovery)                  │  │
│  │                                                  │  │
│  │ [ConfigMap] vault-config (svault.toml)           │  │
│  │ [RBAC] ServiceAccount, ClusterRole               │  │
│  │ [PVC] Persistent storage for etcd                │  │
│  │                                                  │  │
│  └──────────────────────────────────────────────────┘  │
│                                                        │
└────────────────────────────────────────────────────────┘
```

### Helm Chart Structure

```
helm/secretumvault/
├── Chart.yaml              # Chart metadata
├── values.yaml             # Default values (90+ options)
├── templates/
│   ├── _helpers.tpl        # Template functions
│   ├── deployment.yaml     # Vault deployment
│   ├── service.yaml        # Services
│   ├── configmap.yaml      # Configuration
│   └── rbac.yaml           # Security
```

---

## Data Flow Diagram

### Secret Storage Flow

```
User Request: {"username": "admin", "password": "secret123"}
    ↓
Auth Middleware validates token
Cedar policy evaluates (permit/forbid)
    ↓
KV Engine write handler:
  1. Parse request body
  2. Generate metadata (created_at, version)
  3.
     Serialize to JSON
    ↓
Crypto Backend:
  plaintext  = b'{"username": "admin", ...}'
  master_key = seal_manager.unseal()
  ciphertext = aes_256_gcm.encrypt(plaintext, master_key)
  → stored blob = [nonce(12B) | ciphertext | tag(16B)]
    ↓
Storage Backend (etcd/postgres):
  storage.set(
    key   = "secret:v1:myapp",
    value = ciphertext
  )
    ↓
Metrics recorded: vault_secrets_stored.inc()
    ↓
Audit logged:
  {
    timestamp: "2025-12-21T10:30:00Z",
    principal: "user:alice",
    action: "write",
    resource: "/secret/data/myapp",
    result: "success"
  }
```

### Secret Retrieval Flow

```
User Request: GET /v1/secret/data/myapp
Header: X-Vault-Token: token_abc123
    ↓
Auth Middleware:
  1. Extract token from header
  2. storage.get("auth:tokens:token_abc123")
  3. Verify not expired
    ↓
Cedar Policy Engine:
  context = {
    principal: User(token_id, policies=[...]),
    action: "read",
    resource: "/secret/data/myapp",
    ip: "10.20.5.1"
  }
  → Evaluate policies
  → Decision: permit
    ↓
KV Engine read handler:
  1. Parse path: myapp
  2. storage.get("secret:v1:myapp")
  3. Returns encrypted ciphertext
    ↓
Crypto Backend decrypt:
  master_key = seal_manager.unseal()
  plaintext  = aes_256_gcm.decrypt(ciphertext, master_key)
  → {"username": "admin", "password": "secret123"}
    ↓
Response:
  {
    "request_id": "req_123",
    "data": {
      "data": {"username": "admin", "password": "secret123"},
      "metadata": {
        "created_time": "2025-12-21T10:20:00Z",
        "current_version": 1
      }
    }
  }
    ↓
Metrics & Audit:
  vault_secrets_read.inc()
  audit_log(success)
```

---

## Performance Characteristics

### Async/Await Foundation

All I/O operations use Tokio's non-blocking runtime:

- HTTP requests: Axum + Hyper (async)
- Database queries: sqlx (async driver)
- etcd operations: etcd_client (async)
- File operations: tokio::fs (async)

Result: because no request blocks an OS thread while waiting on I/O, a single machine can keep **thousands of concurrent requests** in flight.

### Caching Strategy

Limited in-memory caching for:

- Token metadata (refreshed on access)
- Policy evaluation (for frequently used policies)
- Crypto key material (loaded once, kept in memory)

### Lock Contention

Minimal contention design:

- Per-token locking only during TTL updates
- Storage backend handles internal consistency
- No distributed locks (CAS operations used instead)

---

## Security Architecture

### Secret Encryption

All secrets are encrypted at rest:

```
Plaintext → Master Key → AES-256-GCM → Ciphertext (with AAD)
```

The master key is protected with Shamir's Secret Sharing: it is split into shares, and a threshold number of shares is required to reconstruct it at unseal time.

### Audit Trail

Complete operation audit:

```
Every operation logged:
  - Principal (token ID)
  - Action (read/write/delete)
  - Resource (secret path)
  - Result (success/failure)
  - Timestamp
  - IP address
  - Error details
```

### Policy Enforcement

Cedar policies enforce:

- **Who** can access (principal matching)
- **What** they can do (action authorization)
- **Where** they access (resource paths)
- **When** they access (time windows)
- **How** they access (IP ranges, MFA)

---

## Extension Points

### Adding New Storage Backend

1. Implement `StorageBackend` trait
2. Add to `StorageRegistry::create()`
3. Add feature flag in Cargo.toml
4. Update configuration schema

Example: to add an S3 backend, implement the trait's get/set/delete/list methods, add an arm to the registry's match statement, add a feature flag, and update the config TOML schema.

### Adding New Secrets Engine

1. Implement `Engine` trait
2. Add to `EngineRegistry::mount_engines()`
3. Implement Axum routes
4. Add to configuration

Example: to add an SSH engine, create a new file, implement the `Engine` trait including `handle_request`, add the Axum router methods, and integrate the engine into the registry.

---

**Architecture validated**: Config-driven design enables flexible deployment while maintaining type safety and performance.
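
The extension steps listed under "Adding New Storage Backend", together with the compare-and-swap retry described under "Concurrent Access", can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not SecretumVault's implementation: the trait below is a synchronous, cut-down stand-in for the async `StorageBackend`, and `MemoryBackend`, `create_backend`, and `increment_counter` are hypothetical names introduced only for this example.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Simplified, synchronous stand-in for the async `StorageBackend` trait
/// (assumption: only three of its operations are modelled here).
pub trait StorageBackend: Send + Sync {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
    fn set(&self, key: &str, value: Vec<u8>);
    /// Compare-and-swap: store `new` only if the current value equals `old`.
    fn cas(&self, key: &str, old: Option<&[u8]>, new: Vec<u8>) -> bool;
}

/// Step 1: a hypothetical in-memory backend implementing the trait.
#[derive(Default)]
pub struct MemoryBackend {
    data: Mutex<HashMap<String, Vec<u8>>>,
}

impl StorageBackend for MemoryBackend {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.data.lock().unwrap().get(key).cloned()
    }

    fn set(&self, key: &str, value: Vec<u8>) {
        self.data.lock().unwrap().insert(key.to_string(), value);
    }

    fn cas(&self, key: &str, old: Option<&[u8]>, new: Vec<u8>) -> bool {
        // The lock is held across the compare and the write, so the pair is atomic.
        let mut data = self.data.lock().unwrap();
        if data.get(key).map(|v| v.as_slice()) == old {
            data.insert(key.to_string(), new);
            true
        } else {
            false
        }
    }
}

/// Step 2: registry-style dispatch from a config string (hypothetical function).
pub fn create_backend(name: &str) -> Result<Arc<dyn StorageBackend>, String> {
    match name {
        "memory" => Ok(Arc::new(MemoryBackend::default())),
        unknown => Err(format!("unknown storage backend: {unknown}")),
    }
}

/// Read-modify-CAS loop: retry from the read whenever another writer got in first.
pub fn increment_counter(storage: &dyn StorageBackend, key: &str) -> u64 {
    loop {
        let current = storage.get(key);
        // A missing or malformed value is treated as zero.
        let value = current
            .as_deref()
            .and_then(|bytes| <[u8; 8]>::try_from(bytes).ok())
            .map(u64::from_be_bytes)
            .unwrap_or(0);
        let next = value + 1;
        if storage.cas(key, current.as_deref(), next.to_be_bytes().to_vec()) {
            return next;
        }
    }
}
```

With these definitions, `create_backend("memory")` returns an `Arc<dyn StorageBackend>`, and calling `increment_counter(&*backend, "counter")` twice yields 1 and then 2. An unregistered name such as `"s3"` returns an error until a match arm for it is added, which is the only change callers of the registry ever see.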