kogral/docs/architecture/adrs/001-nickel-vs-toml.md

314 lines
8.1 KiB
Markdown
Raw Permalink Normal View History

2026-01-23 16:11:07 +00:00
# ADR-001: Nickel vs TOML for Configuration
**Status**: Accepted
**Date**: 2026-01-17
**Deciders**: Architecture Team
**Context**: Configuration Strategy for Knowledge Base System
---
## Context
The KOGRAL requires a flexible, type-safe configuration format that supports:
1. **Complex nested structures** (graph settings, storage configs, embedding providers)
2. **Type validation** (prevent runtime errors from config mistakes)
3. **Composition and inheritance** (shared configs, environment-specific overrides)
4. **Documentation** (self-documenting schemas)
5. **Validation before runtime** (catch errors early)
We evaluated two primary options:
### Option 1: TOML (Traditional Config Format)
**Pros**:
- Widely adopted in Rust ecosystem (`Cargo.toml`)
- Simple, human-readable syntax
- Native `serde` support
- IDE support (syntax highlighting, completion)
**Cons**:
- No type system (validation only at runtime)
- Limited composition (no imports, no functions)
- No schema validation (errors discovered during execution)
- Verbose for complex nested structures
- No documentation in config files
**Example TOML**:
```toml
[graph]
name = "my-project"
version = "1.0.0"
[storage]
primary = "filesystem" # String, not validated as enum
[storage.secondary]
enabled = true
type = "surrealdb" # Typo would fail at runtime
url = "ws://localhost:8000"
[embeddings]
enabled = true
provider = "openai" # No validation of valid providers
model = "text-embedding-3-small"
```
**Problems**:
- Typos in enum values (`"surrealdb"` vs `"surealdb"`) fail at runtime
- No validation that `provider = "openai"` requires `api_key_env`
- No documentation of valid options
- No way to compose configs (e.g., base config + environment override)
### Option 2: Nickel (Functional Configuration Language)
**Pros**:
- **Type system** with contracts (validate before runtime)
- **Composition** via imports and merging
- **Documentation** in schemas (self-documenting)
- **Validation** at export time (catch errors early)
- **Functions** for conditional logic
- **Default values** in schema definitions
**Cons**:
- Less familiar to Rust developers
- Requires separate `nickel` CLI tool
- Smaller ecosystem
- Steeper learning curve
**Example Nickel**:
```nickel
# schemas/kogral-config.ncl
{
KbConfig = {
graph | GraphConfig,
storage | StorageConfig,
embeddings | EmbeddingConfig,
},
StorageConfig = {
primary | [| 'filesystem, 'memory |], # Enum validated at export
secondary | {
enabled | Bool,
type | [| 'surrealdb, 'sqlite |], # Typos caught immediately
url | String,
} | optional,
},
EmbeddingConfig = {
enabled | Bool,
provider | [| 'openai, 'claude, 'fastembed |], # Valid providers enforced
model | String,
api_key_env | String | doc "Environment variable for API key",
},
}
```
**Benefits**:
- Typos in enum values caught at `nickel export` time
- Schema enforces required fields based on provider
- Documentation embedded in schema
- Config can be composed: `import "base.ncl" & { /* overrides */ }`
---
## Decision
**We will use Nickel for configuration.**
**Implementation**:
1. Define schemas in `schemas/*.ncl` with type contracts
2. Users write configs in `.kogral/config.ncl`
3. Export to JSON via CLI: `nickel export --format json config.ncl`
4. Load JSON in Rust via `serde_json` into typed structs
**Pattern** (double validation):
```text
Nickel Config (.ncl)
↓ [nickel export]
JSON (validated by Nickel contracts)
↓ [serde_json::from_str]
Rust Struct (validated by serde)
Runtime (guaranteed valid config)
```
**Bridge Code** (`kogral-core/src/config/nickel.rs`):
```rust
pub fn load_config<P: AsRef<Path>>(path: P) -> Result<KbConfig> {
// Export Nickel to JSON
let json = export_nickel_to_json(path)?;
// Deserialize to Rust struct
let config: KbConfig = serde_json::from_str(&json)?;
Ok(config)
}
fn export_nickel_to_json<P: AsRef<Path>>(path: P) -> Result<String> {
let output = Command::new("nickel")
.arg("export")
.arg("--format").arg("json")
.arg(path.as_ref())
.output()?;
Ok(String::from_utf8(output.stdout)?)
}
```
---
## Consequences
### Positive
**Type Safety**: Config errors caught before runtime
- Invalid enum values fail at export: `'filesystm` → error
- Missing required fields detected: no `graph.name` → error
- Type mismatches prevented: `enabled = "yes"` → error (expects Bool)
**Self-Documenting**: Schemas serve as documentation
- `| doc "Environment variable for API key"` describes fields
- Enum options visible in schema: `[| 'openai, 'claude, 'fastembed |]`
- Default values explicit: `| default = 'filesystem`
**Composition**: Config reuse and overrides
```nickel
# base.ncl
{ graph = { version = "1.0.0" } }
# project.ncl
import "base.ncl" & { graph = { name = "my-project" } }
```
**Validation Before Deployment**: Catch errors in CI
```bash
# CI pipeline
nickel typecheck config.ncl
nickel export --format json config.ncl > /dev/null
```
**Conditional Logic**: Environment-specific configs
```nickel
let is_prod = std.string.is_match "prod" (std.env.get "ENV") in
{
embeddings = {
provider = if is_prod then 'openai else 'fastembed,
},
}
```
### Negative
**Learning Curve**: Team must learn Nickel syntax
- **Mitigation**: Provide comprehensive examples in `config/` directory
- **Mitigation**: Document common patterns in `docs/config/`
**Tool Dependency**: Requires `nickel` CLI installed
- **Mitigation**: Document installation in setup guide
- **Mitigation**: Check `nickel` availability in `kogral init` command
**IDE Support**: Limited compared to TOML
- **Mitigation**: Use LSP (nickel-lang-lsp) for VSCode/Neovim
- **Mitigation**: Syntax highlighting available for major editors
**Ecosystem Size**: Smaller than TOML
- **Mitigation**: Nickel actively developed by Tweag
- **Mitigation**: Stable language specification (v1.0+)
### Neutral
**Two-Stage Loading**: Nickel → JSON → Rust
- Not a performance concern (config loaded once at startup)
- Adds resilience (double validation)
- Allows runtime config inspection (read JSON directly)
---
## Alternatives Considered
### JSON Schema
**Rejected**: Not ergonomic for humans to write
- No comments
- Verbose syntax (`{"key": "value"}` vs `key = value`)
- JSON Schema separate from config (duplication)
### YAML
**Rejected**: No type system, ambiguous parsing
- Boolean confusion: `yes`/`no`/`on`/`off`/`true`/`false`
- Indentation-sensitive (error-prone)
- No validation without external tools
### Dhall
**Rejected**: More complex than needed
- Turing-incomplete by design (limits use cases)
- Smaller ecosystem than Nickel
- Steeper learning curve
### KCL (KusionStack Configuration Language)
**Rejected**: Kubernetes-focused, less general-purpose
- Designed for K8s manifests
- Less mature than Nickel for general config
---
## Implementation Timeline
1. ✅ Define base schemas (`schemas/kogral-config.ncl`)
2. ✅ Implement Nickel loader (`kogral-core/src/config/nickel.rs`)
3. ✅ Create example configs (`config/defaults.ncl`, `config/production.ncl`)
4. ✅ Document Nickel usage (`docs/config/nickel-schemas.md`)
5. ⏳ Add LSP recommendations to setup guide
6. ⏳ Create Nickel → TOML migration tool (for existing users)
---
## Monitoring
**Success Criteria**:
- Config errors caught at export time (not runtime)
- Users can compose configs for different environments
- Team comfortable with Nickel syntax within 2 weeks
**Metrics**:
- Number of config validation errors caught before runtime
- Time to diagnose config issues (should decrease)
- User feedback on config complexity
---
## References
- [Nickel Language](https://nickel-lang.org/)
- [Nickel User Manual](https://nickel-lang.org/user-manual/introduction)
- [platform-config pattern](../../crates/kogral-core/src/config/README.md) (reference implementation)
- [TOML Specification](https://toml.io/)
---
## Revision History
| Date | Author | Change |
| ---------- | ------------------ | ---------------- |
| 2026-01-17 | Architecture Team | Initial decision |
---
**Next ADR**: [ADR-002: FastEmbed via AI Providers](002-fastembed-ai-providers.md)