ADR-007: KMS Service Simplification to Age and Cosmian Backends
Status: Accepted
Date: 2025-10-08
Deciders: Architecture Team
Related: ADR-006 (KMS Service Integration)
Context
The KMS service initially supported 4 backends: HashiCorp Vault, AWS KMS, Age, and Cosmian KMS. This created unnecessary complexity and unclear guidance about which backend to use for different environments.
Problems with 4-Backend Approach
- Complexity: Supporting 4 different backends increased maintenance burden
- Dependencies: AWS SDK added significant compile time (~30 s) and binary size
- Confusion: No clear guidance on which backend to use when
- Cloud Lock-in: AWS KMS dependency limited infrastructure flexibility
- Operational Overhead: Vault requires server setup even for simple dev environments
- Code Duplication: Similar logic implemented 4 different ways
Key Insights
- Most development work doesn't need server-based KMS
- Production deployments need enterprise-grade security features
- Age provides fast, offline encryption perfect for development
- Cosmian KMS offers confidential computing and zero-knowledge architecture
- Supporting Vault AND Cosmian is redundant (both are server-based KMS)
- AWS KMS locks us into AWS infrastructure
Decision
Simplify the KMS service to support only 2 backends:
- Age: For development and local testing
  - Fast, offline, no server required
  - Simple key generation with age-keygen
  - X25519 encryption (modern, secure)
  - Perfect for dev/test environments
- Cosmian KMS: For production deployments
  - Enterprise-grade key management
  - Confidential computing support (SGX/SEV)
  - Zero-knowledge architecture
  - Server-side key rotation
  - Audit logging and compliance
  - Multi-tenant support
Remove support for:
- ❌ HashiCorp Vault (redundant with Cosmian)
- ❌ AWS KMS (cloud lock-in, complexity)
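The decision can be illustrated with a small sketch of how a backend name from configuration might be parsed once only two backends remain. This is an assumption for illustration, not the code in src/types.rs: the names `KmsBackend` and `BackendError` and the accepted strings are hypothetical.

```rust
use std::fmt;
use std::str::FromStr;

/// The two backends kept by this ADR (type name is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KmsBackend {
    Age,
    Cosmian,
}

/// Why a configured backend name was rejected (hypothetical error type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BackendError {
    /// The backend existed before this ADR and has been removed.
    Removed(&'static str),
    /// The name was never a supported backend.
    Unknown(String),
}

impl fmt::Display for BackendError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            BackendError::Removed(name) => write!(
                f,
                "backend '{}' was removed; use 'age' (dev) or 'cosmian' (prod)",
                name
            ),
            BackendError::Unknown(name) => write!(f, "unknown backend '{}'", name),
        }
    }
}

impl std::error::Error for BackendError {}

impl FromStr for KmsBackend {
    type Err = BackendError;

    /// Case-insensitive parse; removed backends get a dedicated error
    /// so old configs fail with an actionable message.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "age" => Ok(KmsBackend::Age),
            "cosmian" => Ok(KmsBackend::Cosmian),
            "vault" => Err(BackendError::Removed("vault")),
            "aws" => Err(BackendError::Removed("aws")),
            other => Err(BackendError::Unknown(other.to_string())),
        }
    }
}
```

Separating `Removed` from `Unknown` lets an old Vault or AWS config produce a migration hint rather than a generic parse error.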
Consequences
Positive
- Simpler Code: 2 backends instead of 4 halves the number of backend implementations to maintain
- Faster Compilation: Removing AWS SDK saves ~30 seconds compile time
- Clear Guidance: Age = dev, Cosmian = prod (no confusion)
- Offline Development: Age works without network connectivity
- Better Security: Cosmian provides confidential computing (TEE)
- No Cloud Lock-in: Not dependent on AWS infrastructure
- Easier Testing: Age backend requires no setup
- Reduced Dependencies: Fewer external crates to maintain
Negative
- Migration Required: Existing Vault/AWS KMS users must migrate
- Learning Curve: Teams must learn Age and Cosmian
- Cosmian Dependency: Production depends on Cosmian availability
- Cost: Cosmian may have licensing costs (cloud or self-hosted)
Neutral
- Feature Parity: Cosmian provides all features Vault/AWS had
- API Compatibility: Encrypt/decrypt API remains largely the same
- Configuration Change: TOML config structure updated but similar
Implementation
Files Created
- src/age/client.rs (167 lines) - Age encryption client
- src/age/mod.rs (3 lines) - Age module exports
- src/cosmian/client.rs (294 lines) - Cosmian KMS client
- src/cosmian/mod.rs (3 lines) - Cosmian module exports
- docs/migration/KMS_SIMPLIFICATION.md (500+ lines) - Migration guide
Files Modified
- src/lib.rs - Updated exports (age, cosmian instead of aws, vault)
- src/types.rs - Updated error types and config enum
- src/service.rs - Simplified to 2 backends (180 lines, was 213)
- Cargo.toml - Removed AWS deps, added age = "0.10"
- README.md - Complete rewrite for new backends
- provisioning/config/kms.toml - Simplified configuration
Files Deleted
- src/aws/client.rs - AWS KMS client
- src/aws/envelope.rs - Envelope encryption helpers
- src/aws/mod.rs - AWS module
- src/vault/client.rs - Vault client
- src/vault/mod.rs - Vault module
Dependencies Changed
Removed:
- aws-sdk-kms = "1"
- aws-config = "1"
- aws-credential-types = "1"
- aes-gcm = "0.10" (was only for AWS envelope encryption)
Added:
- age = "0.10"
- tempfile = "3" (dev dependency for tests)
Kept:
- All Axum web framework deps
- reqwest (for Cosmian HTTP API)
- base64, serde, tokio, etc.
Migration Path
For Development
# 1. Install Age
brew install age # or apt install age
# 2. Generate keys (age-keygen does not create the parent directory)
mkdir -p ~/.config/provisioning/age
age-keygen -o ~/.config/provisioning/age/private_key.txt
age-keygen -y ~/.config/provisioning/age/private_key.txt > ~/.config/provisioning/age/public_key.txt
# 3. Update config to use Age backend
# 4. Re-encrypt development secrets
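As a sanity check after step 2, the generated identity file can be inspected before it is wired into the config. The sketch below relies on the file layout that age-keygen writes (a `# public key: age1...` comment followed by an `AGE-SECRET-KEY-1...` line). The function names are hypothetical, and the comment is only informational: `age-keygen -y` remains the authoritative way to derive the public key.

```rust
/// Extracts the public key recorded in the `# public key:` comment of an
/// identity file written by `age-keygen -o`. Returns None if the comment
/// is missing or does not look like an age recipient.
pub fn public_key_from_identity(contents: &str) -> Option<&str> {
    contents
        .lines()
        .filter_map(|line| line.trim().strip_prefix("# public key:"))
        .map(str::trim)
        .find(|key| key.starts_with("age1"))
}

/// Returns true if the file contains at least one secret key line.
pub fn has_secret_key(contents: &str) -> bool {
    contents
        .lines()
        .any(|line| line.trim().starts_with("AGE-SECRET-KEY-1"))
}
```

A setup script could read private_key.txt with `std::fs::read_to_string`, call both functions, and refuse to continue if either check fails.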
For Production
# 1. Set up Cosmian KMS (cloud or self-hosted)
# 2. Create master key in Cosmian
# 3. Migrate secrets from Vault/AWS to Cosmian
# 4. Update production config
# 5. Deploy new KMS service
See docs/migration/KMS_SIMPLIFICATION.md for detailed steps.
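Step 3 of the production path amounts to decrypting each secret with the old backend and encrypting it again with the new one. A minimal sketch of that loop follows, with both backends passed in as closures so that nothing is assumed about the real Vault, AWS, or Cosmian clients. The function name and the `String` error type are hypothetical.

```rust
use std::collections::BTreeMap;

/// Re-encrypts every secret: decrypt with the old backend, then encrypt
/// with the new one. Stops at the first failure so a partial migration
/// is never reported as a success.
pub fn migrate_secrets<D, E>(
    secrets: &BTreeMap<String, Vec<u8>>,
    decrypt_old: D,
    encrypt_new: E,
) -> Result<BTreeMap<String, Vec<u8>>, String>
where
    D: Fn(&[u8]) -> Result<Vec<u8>, String>,
    E: Fn(&[u8]) -> Result<Vec<u8>, String>,
{
    let mut migrated = BTreeMap::new();
    for (name, ciphertext) in secrets {
        // Include the secret name in errors so the failing entry is identifiable.
        let plaintext = decrypt_old(ciphertext)
            .map_err(|e| format!("{}: decrypt failed: {}", name, e))?;
        let new_ciphertext = encrypt_new(&plaintext)
            .map_err(|e| format!("{}: encrypt failed: {}", name, e))?;
        migrated.insert(name.clone(), new_ciphertext);
    }
    Ok(migrated)
}
```

Returning a new map instead of mutating the input keeps the original ciphertexts available until the whole batch has succeeded.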
Alternatives Considered
Alternative 1: Keep All 4 Backends
Pros:
- No migration required
- Maximum flexibility
Cons:
- Continued complexity
- Maintenance burden
- Unclear guidance
Rejected: Complexity outweighs benefits
Alternative 2: Only Cosmian (No Age)
Pros:
- Single backend
- Enterprise-grade everywhere
Cons:
- Requires Cosmian server for development
- Slower dev iteration
- Network dependency for local dev
Rejected: Development experience matters
Alternative 3: Only Age (No Production Backend)
Pros:
- Simplest solution
- No server required
Cons:
- Not suitable for production
- No audit logging
- No key rotation
- No multi-tenant support
Rejected: Production needs enterprise features
Alternative 4: Age + HashiCorp Vault
Pros:
- Vault is widely known
- No Cosmian dependency
Cons:
- Vault lacks confidential computing
- Vault server still required
- No zero-knowledge architecture
Rejected: Cosmian provides better security features
Metrics
Code Reduction
- Total Lines Removed: ~800 lines (AWS + Vault implementations)
- Total Lines Added: ~470 lines (Age + Cosmian implementations; the 500+ line migration guide is not counted)
- Net Reduction: ~330 lines
Dependency Reduction
- Crates Removed: 4 (aws-sdk-kms, aws-config, aws-credential-types, aes-gcm)
- Crates Added: 1 (age; tempfile is a dev-only dependency and is not counted)
- Net Reduction: 3 crates
Compilation Time
- Before: ~90 seconds (with AWS SDK)
- After: ~60 seconds (without AWS SDK)
- Improvement: ~33% less compile time (30 of 90 seconds saved)
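The figures in this section can be checked with a few lines of arithmetic. The helper names below are hypothetical. Note that a 33% reduction in build time corresponds to a 1.5x speed-up, which is a different number from "33%".

```rust
/// Percentage by which `after` is smaller than `before`.
pub fn percent_reduction(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Speed-up factor: how many times faster the new build is.
pub fn speedup(before: f64, after: f64) -> f64 {
    before / after
}

/// Net reduction for counts such as lines of code or crates.
/// Returns None if more was added than removed.
pub fn net_reduction(removed: u32, added: u32) -> Option<u32> {
    removed.checked_sub(added)
}
```

With the values above, `percent_reduction(90.0, 60.0)` is about 33.3, `net_reduction(800, 470)` is `Some(330)`, and `net_reduction(4, 1)` is `Some(3)`.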
Compliance
Security Considerations
- Age Security: X25519 (Curve25519) key agreement with ChaCha20-Poly1305 payload encryption, modern and secure
- Cosmian Security: Confidential computing, zero-knowledge, enterprise-grade
- No Regression: Security features maintained or improved
- Clear Separation: Dev (Age) never used for production secrets
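The "Clear Separation" rule could be enforced at startup rather than left to convention. The sketch below shows one way to do that. The `Environment` and `Backend` types and the `check_backend` function are assumptions for illustration and are not taken from the service code.

```rust
/// Deployment environment (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Environment {
    Dev,
    Test,
    Prod,
}

/// Configured KMS backend (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Age,
    Cosmian,
}

/// Rejects the one combination this ADR forbids: Age in production.
/// Cosmian is allowed everywhere, e.g. for staging-like dev setups.
pub fn check_backend(env: Environment, backend: Backend) -> Result<(), String> {
    match (env, backend) {
        (Environment::Prod, Backend::Age) => Err(
            "Age backend must not be used for production secrets; configure Cosmian".to_string(),
        ),
        _ => Ok(()),
    }
}
```

Calling such a check once while loading configuration would turn a misconfigured production deployment into a startup failure instead of a silent policy violation.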
Testing Requirements
- Unit Tests: Both backends have comprehensive test coverage
- Integration Tests: Age tests run without external deps
- Cosmian Tests: Require test server (marked as #[ignore])
- Migration Tests: Verify old configs fail gracefully
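For readers unfamiliar with the attribute, here is a sketch of how a server-dependent test can sit next to an offline one. The helper `cosmian_test_url` and the `COSMIAN_TEST_URL` environment variable are hypothetical names; the real test suite may be organized differently.

```rust
/// Hypothetical helper: accepts only http(s) URLs and strips a trailing slash.
pub fn cosmian_test_url(raw: Option<&str>) -> Option<String> {
    raw.map(str::trim)
        .filter(|url| url.starts_with("http://") || url.starts_with("https://"))
        .map(|url| url.trim_end_matches('/').to_string())
}

#[cfg(test)]
mod tests {
    use super::*;

    // Runs on every `cargo test`: no external dependencies.
    #[test]
    fn rejects_non_http_urls() {
        assert_eq!(cosmian_test_url(Some("ftp://kms")), None);
    }

    // Skipped by default; run with `cargo test -- --ignored`.
    #[test]
    #[ignore = "requires a running Cosmian test server"]
    fn reaches_cosmian_test_server() {
        let raw = std::env::var("COSMIAN_TEST_URL").ok();
        let url = cosmian_test_url(raw.as_deref()).expect("set COSMIAN_TEST_URL");
        // A real test would perform an encrypt/decrypt round trip against `url`.
        assert!(url.starts_with("http"));
    }
}
```

A plain `cargo test` runs only the offline test. A CI job with a test server available can add `-- --ignored` (or `-- --include-ignored`) to exercise the Cosmian path as well.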
References
- Age Encryption - Modern encryption tool
- Cosmian KMS - Enterprise KMS with confidential computing
- ADR-006 - Previous KMS integration
- Migration Guide - Detailed migration steps
Notes
- Age was designed by Filippo Valsorda (formerly of the Go security team at Google) and Ben Cartwright-Cox
- Cosmian provides FIPS 140-2 Level 3 compliance (when using certified hardware)
- This decision aligns with project goal of reducing cloud provider dependencies
- Migration timeline: 6 weeks for full adoption