# ontoref/adrs/adr-019-per-file-recipient-routing-tenant-isolation.ncl
let d = import "defaults.ncl" in
d.make_adr {
  id = "adr-019",
  title = "Per-File Recipient Routing for Tenant Isolation in lieu of Multi-Vault",
  status = 'Accepted,
  date = "2026-05-03",
  context = "ADR-017 established per-project credential vaults as OCI artifacts in ZOT, encrypted with sops + age multi-recipient. The model held one recipient set per vault: every actor who could decrypt access.sops.yaml could decrypt every credential file inside the vault. Real projects (libre-wuji, the canonical multi-tenant example) need stronger separation: a single project hosts multiple clients with distinct services, AI agents that operate with restricted access, and developer/admin roles that occasionally overlap. Three concrete failure modes appear: (1) clientB's credentials are visible to anyone who can open the vault, even when working only on clientA, because recipient lists are coarse and indiscriminate; (2) AI agents with read-only intent receive credentials whose blast radius exceeds their declared scope; (3) limiting blast radius requires per-file recipient sets, which the original single-set model cannot express. A naive answer is multi-vault (one vault_id per tenant or per environment), but it cascades through every layer (schema, helpers, recipes, dispatcher, migration) and creates architectural debt that the typical case, which has no real isolation requirement (separate master keys, separate restic repos), does not justify.",
  decision = "Tenant isolation within a single project is expressed via sops creation_rules, declared in project.ncl::sops as recipient_groups (named lists of age public keys) and recipient_rules (path_regex to group-union mappings). The bootstrap recipe generates <vault_dir>/.sops.yaml from these declarations and sops natively encrypts each file with the union of declared groups. One vault_id per project remains the unit. Multi-vault is explicitly NOT implemented and remains out of scope until a project requires HARD isolation (separate master keys, separate restic repos, compliance-grade separation surviving accidental cross-decryption); such a case requires a future ADR. When recipient_rules are declared, project.ncl is the single source of truth: secrets-add-key and secrets-remove-key error out and direct the operator to edit project.ncl plus run secrets-rekey, which regenerates .sops.yaml and re-encrypts every *.sops.yaml file. Three adoption templates (single-team, multi-tenant, agent-first) ship in install/resources/templates/sops/ as copy-paste starting points; a project may adopt any pattern or none.",
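  # Illustrative sketch only, not taken from the ontoref schema: one possible
  # shape for the two declarations in project.ncl. The group names, the key
  # placeholders, and the rule field names `path_regex` and `groups` are
  # assumptions made for this example.
  #
  #   sops = {
  #     recipient_groups = {
  #       admins  = ["age1<admin-public-key>"],
  #       clientA = ["age1<clientA-lead-public-key>"],
  #       clientB = ["age1<clientB-lead-public-key>"],
  #     },
  #     recipient_rules = [
  #       { path_regex = "clientA-.*[.]sops[.]yaml$", groups = ["admins", "clientA"] },
  #       { path_regex = "clientB-.*[.]sops[.]yaml$", groups = ["admins", "clientB"] },
  #     ],
  #   }
  #
  # Under these assumptions a clientA-* file would be encrypted to the union of
  # admins and clientA, so clientB's lead could not decrypt it.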
  rationale = [
    {
      claim = "Use sops creation_rules natively — do not invent a parallel routing layer",
      detail = "sops already provides per-file recipient routing via creation_rules in .sops.yaml. Using it directly inherits all of sops' tooling (updatekeys, --decrypt with .sops.yaml discovery, --filename-override) and avoids a competing convention that would double the surface and break composability with sops itself.",
    },
    {
      claim = "Preserve the single-vault structural invariants of ADR-017",
      detail = "Single vault_id, single OCI artifact, single restic repo, single lock, single dispatcher subcommand surface. The schema delta is two optional fields (recipient_groups + recipient_rules). Most code paths require zero modification — additivity over existing helpers is the migration story for projects already on ADR-017.",
    },
    {
      claim = "project.ncl is the single source of truth for recipient sets",
      detail = "Direct sops mutations via secrets-add-key and secrets-remove-key are forbidden in declarative mode (recipes error explicitly). The reason: any direct mutation diverges from project.ncl, and the next secrets-rekey would silently revert. Forcing edits through git via project.ncl makes recipient changes auditable and reproducible across machines.",
    },
    {
      claim = "Honest trade-off: per-file routing protects against accidental cross-decryption, not against encrypted-byte visibility",
      detail = "ClientA's lead with their .kage cannot decrypt clientB-*.sops.yaml files — sops rejects decryption when the recipient set excludes the actor. But all encrypted files are layers in the same OCI artifact; clientA SEES that clientB-* files exist. For HARD isolation (separate master keys, separate restic repos, compliance-grade separation surviving accidental cross-decryption), multi-vault remains the future option — a separate ADR captures that work when a real case requires it.",
    },
    {
      claim = "Three adoption templates anchor common patterns without forcing them",
      detail = "single-team, multi-tenant, and agent-first templates ship in install/resources/templates/sops/ as copy-paste starting points. The schema fields are optional with sensible defaults — a project may adopt any pattern, mix them, or skip templates entirely while still being a valid consumer of ADR-017 + ADR-019.",
    },
  ],
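  # For orientation, a minimal .sops.yaml of the kind the bootstrap recipe is
  # described as generating. `creation_rules`, `path_regex`, and `age` are
  # standard sops configuration keys; the patterns and key placeholders are
  # hypothetical, and the file actually generated may differ.
  #
  #   creation_rules:
  #     - path_regex: clientA-.*[.]sops[.]yaml$
  #       age: age1<admin-public-key>,age1<clientA-lead-public-key>
  #     - path_regex: clientB-.*[.]sops[.]yaml$
  #       age: age1<admin-public-key>,age1<clientB-lead-public-key>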
  consequences = {
    positive = [
      "Tenant isolation in a single vault, single master key, single OCI artifact — adoption cost minimal.",
      "Compatible with all existing ADR-017 enforcement (assert-actor-authorized, assert-target-in-scope, vault lock, impact analysis).",
      "Migration from legacy single-set mode is additive: existing projects keep working, new fields opt them into per-file routing.",
      "Defense in depth: actor scope (ops + namespaces) + recipient routing — both must permit an operation.",
    ],
    negative = [
      "Operators must understand sops creation_rules ordering (first match wins) — surfacing rule conflicts requires care.",
      "secrets-add-key and secrets-remove-key behavior diverges between legacy and declarative modes, introducing a mode-aware UX.",
      "Cross-tenant visibility of encrypted file paths in the OCI manifest — operationally clientB knows clientA exists in the vault.",
    ],
  },
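  # Ordering pitfall sketch (hypothetical patterns and key placeholders): sops
  # applies the first creation rule whose path_regex matches, so a broad rule
  # listed first shadows a narrower rule below it.
  #
  #   creation_rules:
  #     - path_regex: .*[.]sops[.]yaml$          # matches every vault file
  #       age: age1<admin-public-key>
  #     - path_regex: clientA-.*[.]sops[.]yaml$  # never reached
  #       age: age1<admin-public-key>,age1<clientA-lead-public-key>
  #
  # Listing the narrower clientA rule first would let both rules take effect.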
  alternatives_considered = [
    {
      option = "Multi-vault: project.ncl::sops.vault_id (single string) becomes sops.vaults (record of named SopsConfigs)",
      why_rejected = "Cascades through every layer (schema, helpers, recipes, dispatcher), forcing a 12-component refactor and a migration for every existing project to express what two optional schema fields accomplish via sops creation_rules. The HARD isolation it provides (separate master keys, separate restic repos) is rarely required; per-file routing covers the common case while leaving the door open for a future multi-vault ADR if a project genuinely needs filesystem-level separation.",
    },
    {
      option = "Single recipient set + role-based decryption gating in helper code",
      why_rejected = "Would require a custom layer over sops and reinvent recipient routing. sops already does it natively via creation_rules. Inventing a parallel mechanism doubles the surface and breaks composition with sops tooling (sops --decrypt, updatekeys).",
    },
    {
      option = "Externalize tenant credentials to an external secret manager (e.g. HashiCorp Vault)",
      why_rejected = "Adds an external runtime dependency that contradicts the ADR-017 invariant of self-contained, distribution-via-OCI credentials. Reasonable for projects that already operate such a system, but inappropriate as the default ontoref pattern.",
    },
  ],
  constraints = [
    {
      id = "rules-imply-groups-defined",
      claim = "Every group referenced in recipient_rules must be declared in recipient_groups",
      scope = "all projects with sops.recipient_rules non-empty",
      severity = 'Hard,
      check = {
        tag = 'NuCmd,
        cmd = "ore secrets audit --check recipient-routing-coherent",
      },
      rationale = "An undeclared group resolves to an empty recipient list, producing files encrypted to nobody. Catch at audit time before push.",
    },
    {
      id = "no-empty-group-on-active-rule",
      claim = "A rule whose group union resolves to zero recipients is rejected at bootstrap and rekey",
      scope = "all projects with sops.recipient_rules non-empty",
      severity = 'Hard,
      check = {
        tag = 'NuCmd,
        cmd = "ore secrets audit --check recipient-routing-coherent",
      },
      rationale = "Encrypting to zero recipients silently produces an unrecoverable file. Reject at the source rather than waiting for sops to fail at use time.",
    },
    {
      id = "declarative-mode-locks-direct-mutations",
      claim = "secrets-add-key and secrets-remove-key recipes must error when project declares recipient_rules; canonical workflow is edit project.ncl + secrets-rekey",
      scope = "all projects with sops.recipient_rules non-empty",
      severity = 'Hard,
      check = {
        tag = 'Grep,
        paths = ["justfiles/secrets.just"],
        pattern = "HAS_RULES.*declarative",
        must_be_empty = false,
      },
      rationale = "Direct sops mutations would diverge from project.ncl, and the next rekey would silently revert them. Forcing the rekey path keeps git as the single source of truth.",
    },
    {
      id = "every-vault-file-matches-a-rule",
      claim = "Every *.sops.yaml under <vault_dir>/ must match at least one declared rule when recipient_rules is non-empty",
      scope = "all projects with sops.recipient_rules non-empty",
      severity = 'Hard,
      check = {
        tag = 'NuCmd,
        cmd = "ore secrets audit --check recipient-routing-coverage",
      },
      rationale = "sops fails encryption with 'no matching creation rules found' for unmatched paths. Catch the mismatch at audit time, surface which path needs a rule (or which rule needs broadening).",
    },
    {
      id = "multi-vault-not-implemented",
      claim = "Projects must not declare a multi-vault structure (e.g. sops.vaults map). Multi-vault adoption requires a new ADR superseding or extending this one",
      scope = "all projects with .ontoref/project.ncl",
      severity = 'Hard,
      check = {
        tag = 'Grep,
        pattern = "sops.vaults *=",
        paths = [".ontoref/project.ncl"],
        must_be_empty = true,
      },
      rationale = "Premature multi-vault implementation without a justifying use case generates schema and helper debt across 12+ components. The constraint is structural: any project that hits a real HARD-isolation requirement must capture it in a new ADR before adopting multi-vault.",
    },
    {
      id = "templates-discoverable",
      claim = "Three adoption templates (single-team, multi-tenant, agent-first) live under install/resources/templates/sops/ and are referenced by qa.ncl::credential-vault-templates",
      scope = "ontoref protocol layer",
      severity = 'Soft,
      check = {
        tag = 'FileExists,
        paths = [
          "install/resources/templates/sops/single-team/project.ncl.snippet",
          "install/resources/templates/sops/multi-tenant/project.ncl.snippet",
          "install/resources/templates/sops/agent-first/project.ncl.snippet",
        ],
      },
      rationale = "Without templates, adoption requires reading the schema, sops docs, and the FAQ — too high a friction. Templates are not mandatory but their absence is a soft signal of protocol decay.",
    },
  ],
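  # A sketch of what the rules-imply-groups-defined constraint amounts to,
  # written as a plain Nickel function. This is not the `ore secrets audit`
  # implementation, and the `groups` field name on each rule is an assumption.
  #
  #   let groups_declared = fun sops =>
  #     sops.recipient_rules
  #     |> std.array.all (fun rule =>
  #       rule.groups
  #       |> std.array.all (fun g =>
  #         std.record.has_field g sops.recipient_groups))
  #   in groups_declared
  #
  # Applied to a sops record, the function would return false as soon as one
  # rule names a group that is missing from recipient_groups.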
  related_adrs = [
    "adr-017-registry-credential-vault-model",
    "adr-015-mcp-tool-inventory-auto-derive",
  ],
  ontology_check = {
    decision_string = "tenant isolation within a single credential vault is expressed via sops creation_rules driven by project.ncl::sops.recipient_groups + recipient_rules; multi-vault is explicitly out of scope; project.ncl is the single source of truth for recipient sets and direct sops mutations are forbidden in declarative mode",
    invariants_at_risk = ["protocol-not-runtime"],
    verdict = 'Safe,
  },
}