2025-12-11 21:50:42 +00:00
# ADR-007: KMS Service Simplification to Age and Cosmian Backends

**Status**: Accepted

**Date**: 2025-10-08

**Deciders**: Architecture Team

**Related**: ADR-006 (KMS Service Integration)

## Context

The KMS service initially supported 4 backends: HashiCorp Vault, AWS KMS, Age, and Cosmian KMS. This created unnecessary complexity and unclear guidance about which backend to use for different environments.

### Problems with 4-Backend Approach

1. **Complexity**: Supporting 4 different backends increased maintenance burden
2. **Dependencies**: AWS SDK added significant compile time (~30 s) and binary size
3. **Confusion**: No clear guidance on which backend to use when
4. **Cloud Lock-in**: AWS KMS dependency limited infrastructure flexibility
5. **Operational Overhead**: Vault requires server setup even for simple dev environments
6. **Code Duplication**: Similar logic implemented 4 different ways

### Key Insights

- Most development work doesn't need server-based KMS
- Production deployments need enterprise-grade security features
- Age provides fast, offline encryption perfect for development
- Cosmian KMS offers confidential computing and zero-knowledge architecture
- Supporting Vault AND Cosmian is redundant (both are server-based KMS)
- AWS KMS locks us into AWS infrastructure

## Decision

Simplify the KMS service to support only 2 backends:

1. **Age**: For development and local testing
   - Fast, offline, no server required
   - Simple key generation with `age-keygen`
   - X25519 encryption (modern, secure)
   - Perfect for dev/test environments

2. **Cosmian KMS**: For production deployments
   - Enterprise-grade key management
   - Confidential computing support (SGX/SEV)
   - Zero-knowledge architecture
   - Server-side key rotation
   - Audit logging and compliance
   - Multi-tenant support

Remove support for:

- ❌ HashiCorp Vault (redundant with Cosmian)
- ❌ AWS KMS (cloud lock-in, complexity)

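
The two-backend decision could be reflected in how the service parses its backend selector. The following is a minimal sketch, not the actual contents of `src/types.rs`: the type names `BackendKind` and `BackendParseError`, and the selector strings `"age"`, `"cosmian"`, `"vault"`, and `"aws"`, are all assumptions made for illustration.

```rust
use std::str::FromStr;

/// Hypothetical backend selector with exactly two supported variants.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BackendKind {
    /// Local, offline encryption for development and testing.
    Age,
    /// Server-based KMS for production deployments.
    Cosmian,
}

/// Hypothetical error for selectors that are removed or never existed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BackendParseError {
    /// The backend was supported before this ADR and has been removed.
    Removed { name: String, replacement: &'static str },
    /// The selector never named a valid backend.
    Unknown(String),
}

impl FromStr for BackendKind {
    type Err = BackendParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Normalise whitespace and case before matching.
        let name = s.trim().to_ascii_lowercase();
        match name.as_str() {
            "age" => Ok(BackendKind::Age),
            "cosmian" => Ok(BackendKind::Cosmian),
            // Removed backends get a targeted error that names the replacement.
            "vault" | "aws" => Err(BackendParseError::Removed {
                name,
                replacement: "cosmian",
            }),
            _ => Err(BackendParseError::Unknown(name)),
        }
    }
}
```

Distinguishing "removed" from "unknown" lets the service point users of the old backends at the migration guide instead of reporting a generic configuration error.
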
## Consequences

### Positive

1. **Simpler Code**: 2 backends instead of 4 halves the number of backend implementations to maintain
2. **Faster Compilation**: Removing AWS SDK saves ~30 seconds compile time
3. **Clear Guidance**: Age = dev, Cosmian = prod (no confusion)
4. **Offline Development**: Age works without network connectivity
5. **Better Security**: Cosmian provides confidential computing (TEE)
6. **No Cloud Lock-in**: Not dependent on AWS infrastructure
7. **Easier Testing**: Age backend requires no server setup
8. **Reduced Dependencies**: Fewer external crates to maintain

### Negative

1. **Migration Required**: Existing Vault/AWS KMS users must migrate
2. **Learning Curve**: Teams must learn Age and Cosmian
3. **Cosmian Dependency**: Production depends on Cosmian availability
4. **Cost**: Cosmian may have licensing costs (cloud or self-hosted)

### Neutral

1. **Feature Parity**: Cosmian provides all features previously used from Vault and AWS KMS
2. **API Compatibility**: Encrypt/decrypt API remains largely the same
3. **Configuration Change**: TOML config structure updated but similar

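
One way the encrypt/decrypt API can stay largely the same is to put both backends behind a single interface. The sketch below assumes this; it is not the real `src/service.rs`. The names `KmsBackend`, `KmsService`, `KmsError`, and `ReversingBackend` are hypothetical, and the sketch is synchronous, whereas the real service is built on Axum and tokio and is likely asynchronous. `ReversingBackend` is a test double that only reverses bytes, so the example runs without the `age` crate or a Cosmian server.

```rust
/// Hypothetical error type; the real one may carry more detail.
#[derive(Debug, PartialEq, Eq)]
pub struct KmsError(pub String);

/// Hypothetical interface that both the Age and Cosmian clients would implement.
pub trait KmsBackend {
    fn name(&self) -> &'static str;
    fn encrypt(&self, plaintext: &[u8]) -> Result<Vec<u8>, KmsError>;
    fn decrypt(&self, ciphertext: &[u8]) -> Result<Vec<u8>, KmsError>;
}

/// Test double that reverses bytes. This is NOT encryption; it only lets the
/// sketch exercise the dispatch path without external dependencies.
pub struct ReversingBackend;

impl KmsBackend for ReversingBackend {
    fn name(&self) -> &'static str {
        "reversing-test-double"
    }

    fn encrypt(&self, plaintext: &[u8]) -> Result<Vec<u8>, KmsError> {
        Ok(plaintext.iter().rev().copied().collect())
    }

    fn decrypt(&self, ciphertext: &[u8]) -> Result<Vec<u8>, KmsError> {
        Ok(ciphertext.iter().rev().copied().collect())
    }
}

/// Callers hold one boxed backend and never branch on which one it is.
pub struct KmsService {
    backend: Box<dyn KmsBackend>,
}

impl KmsService {
    pub fn new(backend: Box<dyn KmsBackend>) -> Self {
        Self { backend }
    }

    pub fn backend_name(&self) -> &'static str {
        self.backend.name()
    }

    pub fn encrypt(&self, plaintext: &[u8]) -> Result<Vec<u8>, KmsError> {
        self.backend.encrypt(plaintext)
    }

    pub fn decrypt(&self, ciphertext: &[u8]) -> Result<Vec<u8>, KmsError> {
        self.backend.decrypt(ciphertext)
    }
}
```

With this shape, swapping Vault or AWS KMS for Cosmian changes which backend is constructed at startup, while code that calls `encrypt` and `decrypt` stays untouched.
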
## Implementation

### Files Created

1. `src/age/client.rs` (167 lines) - Age encryption client
2. `src/age/mod.rs` (3 lines) - Age module exports
3. `src/cosmian/client.rs` (294 lines) - Cosmian KMS client
4. `src/cosmian/mod.rs` (3 lines) - Cosmian module exports
5. `docs/migration/KMS_SIMPLIFICATION.md` (500+ lines) - Migration guide

### Files Modified

1. `src/lib.rs` - Updated exports (age, cosmian instead of aws, vault)
2. `src/types.rs` - Updated error types and config enum
3. `src/service.rs` - Simplified to 2 backends (180 lines, was 213)
4. `Cargo.toml` - Removed AWS deps, added `age = "0.10"`
5. `README.md` - Complete rewrite for new backends
6. `provisioning/config/kms.toml` - Simplified configuration

### Files Deleted

1. `src/aws/client.rs` - AWS KMS client
2. `src/aws/envelope.rs` - Envelope encryption helpers
3. `src/aws/mod.rs` - AWS module
4. `src/vault/client.rs` - Vault client
5. `src/vault/mod.rs` - Vault module

### Dependencies Changed

**Removed**:

- `aws-sdk-kms = "1"`
- `aws-config = "1"`
- `aws-credential-types = "1"`
- `aes-gcm = "0.10"` (was only for AWS envelope encryption)

**Added**:

- `age = "0.10"`
- `tempfile = "3"` (dev dependency for tests)

**Kept**:

- All Axum web framework deps
- `reqwest` (for Cosmian HTTP API)
- `base64`, `serde`, `tokio`, etc.

## Migration Path

### For Development

```bash
# 1. Install Age
brew install age # or apt install age

# 2. Generate keys (age-keygen does not create the target directory)
mkdir -p ~/.config/provisioning/age
age-keygen -o ~/.config/provisioning/age/private_key.txt
age-keygen -y ~/.config/provisioning/age/private_key.txt > ~/.config/provisioning/age/public_key.txt

# 3. Update config to use Age backend
# 4. Re-encrypt development secrets
```

### For Production

```bash
# 1. Set up Cosmian KMS (cloud or self-hosted)
# 2. Create master key in Cosmian
# 3. Migrate secrets from Vault/AWS to Cosmian
# 4. Update production config
# 5. Deploy new KMS service
```

See `docs/migration/KMS_SIMPLIFICATION.md` for detailed steps.

## Alternatives Considered

### Alternative 1: Keep All 4 Backends

**Pros**:

- No migration required
- Maximum flexibility

**Cons**:

- Continued complexity
- Maintenance burden
- Unclear guidance

**Rejected**: Complexity outweighs benefits

### Alternative 2: Only Cosmian (No Age)

**Pros**:

- Single backend
- Enterprise-grade everywhere

**Cons**:

- Requires Cosmian server for development
- Slower dev iteration
- Network dependency for local dev

**Rejected**: Development experience matters

### Alternative 3: Only Age (No Production Backend)

**Pros**:

- Simplest solution
- No server required

**Cons**:

- Not suitable for production
- No audit logging
- No key rotation
- No multi-tenant support

**Rejected**: Production needs enterprise features

### Alternative 4: Age + HashiCorp Vault

**Pros**:

- Vault is widely known
- No Cosmian dependency

**Cons**:

- Vault lacks confidential computing
- Vault server still required
- No zero-knowledge architecture

**Rejected**: Cosmian provides better security features

## Metrics

### Code Reduction

- **Total Lines Removed**: ~800 lines (AWS + Vault implementations)
- **Total Lines Added**: ~470 lines (Age + Cosmian implementations: 167 + 3 + 294 + 3 = 467; the 500+ line migration guide is not counted)
- **Net Reduction**: ~330 lines

### Dependency Reduction

- **Crates Removed**: 4 (aws-sdk-kms, aws-config, aws-credential-types, aes-gcm)
- **Crates Added**: 1 (age), plus tempfile as a dev-only dependency
- **Net Reduction**: 3 runtime crates

### Compilation Time

- **Before**: ~90 seconds (with AWS SDK)
- **After**: ~60 seconds (without AWS SDK)
- **Improvement**: ~33% less compile time (~30 seconds saved)

## Compliance

### Security Considerations

1. **Age Security**: X25519 (Curve25519) key agreement with ChaCha20-Poly1305 payload encryption, modern and secure
2. **Cosmian Security**: Confidential computing, zero-knowledge, enterprise-grade
3. **No Regression**: Security features maintained or improved
4. **Clear Separation**: Dev (Age) never used for production secrets

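
The "Clear Separation" rule can be enforced in code rather than by convention alone. The following is a minimal sketch under assumptions: the `Environment` and `Backend` enums and the `check_backend_allowed` function are hypothetical names, and this ADR does not state whether the service actually performs such a check at startup.

```rust
/// Hypothetical deployment environment; the real service may model this differently.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Environment {
    Development,
    Test,
    Production,
}

/// Hypothetical backend selector mirroring the two supported backends.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Age,
    Cosmian,
}

/// Rejects the Age backend in production; every other combination is allowed.
pub fn check_backend_allowed(env: Environment, backend: Backend) -> Result<(), String> {
    match (env, backend) {
        (Environment::Production, Backend::Age) => Err(
            "Age backend is restricted to development and test; use Cosmian in production"
                .to_string(),
        ),
        _ => Ok(()),
    }
}
```

Running such a check once at startup would make a misconfigured production deployment fail fast instead of silently encrypting production secrets with a development key.
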
### Testing Requirements

1. **Unit Tests**: Both backends have comprehensive test coverage
2. **Integration Tests**: Age tests run without external deps
3. **Cosmian Tests**: Require test server (marked as `#[ignore]`)
4. **Migration Tests**: Verify old configs fail gracefully

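
The test layout described above might look roughly like the sketch below. It is an illustration, not the project's test suite: `validate_backend` is a hypothetical stand-in for the real config loader, and the backend names it accepts are assumptions. The `#[ignore]` attribute and `cargo test -- --ignored` are standard Rust tooling.

```rust
/// Hypothetical config check standing in for the real loader: it accepts the
/// two supported backends and returns a descriptive error for anything else.
pub fn validate_backend(name: &str) -> Result<(), String> {
    match name {
        "age" | "cosmian" => Ok(()),
        "vault" | "aws" => Err(format!(
            "backend '{name}' was removed; see docs/migration/KMS_SIMPLIFICATION.md"
        )),
        other => Err(format!("unknown backend '{other}'")),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Migration test: runs everywhere, no external services involved.
    #[test]
    fn old_backends_fail_gracefully() {
        let err = validate_backend("vault").unwrap_err();
        assert!(err.contains("removed"));
    }

    // Needs a reachable Cosmian test server, so it is skipped by default.
    // Run it explicitly with: cargo test -- --ignored
    #[test]
    #[ignore = "requires a Cosmian test server"]
    fn cosmian_round_trip() {
        // Placeholder body: a real test would encrypt and decrypt via the server.
        assert!(validate_backend("cosmian").is_ok());
    }
}
```

Keeping server-dependent tests behind `#[ignore]` means a plain `cargo test` passes on a laptop with no network, while CI jobs that have a Cosmian test server can opt in.
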
## References

- [Age Encryption](https://github.com/FiloSottile/age) - Modern encryption tool
- [Cosmian KMS](https://cosmian.com/kms/) - Enterprise KMS with confidential computing
- [ADR-006](adr-006-provisioning-cli-refactoring.md) - Previous KMS integration
- [Migration Guide](../migration/KMS_SIMPLIFICATION.md) - Detailed migration steps

## Notes

- Age was designed by Filippo Valsorda (formerly of the Go security team at Google) and Ben Cartwright-Cox
- Cosmian provides FIPS 140-2 Level 3 compliance (when using certified hardware)
- This decision aligns with the project goal of reducing cloud provider dependencies
- Migration timeline: 6 weeks for full adoption