let s = import "schemas/qa.ncl" in { entries = [ { id = "credential-vault-best-practice", question = "What is the canonical approach to manage registry credentials with ontoref's credential vault?", answer = m%" ADR-017 implements a layered credential model. Apply each layer in order; do not skip layers — they enforce different invariants. DATA FLOW developer machine ZOT registry ───────────────── ───────────────── ~/.age/keys/.key.txt (Layer 0) │ sops --decrypt ▼ access.sops.yaml (Layer 1) { zot_user, zot_pass, vault_key, cosign_pass } │ │ oras pull ──────────────────► src-vault/:latest ◄────────────────── (cosigned) ▼ ~/.config/ontoref/vaults//src-vault/ scopes/.ncl ─┐ registry/.sops.yaml (Layer 2) │ assert-actor-authorized │ sops --decrypt │ assert-target-in-scope ▼ │ DOCKER_CONFIG=$tmpdir ─┘ │ │ oras push/pull ─────────────► domains//: │ modes//: ▼ rm -rf $tmpdir LAYER 0 — Master key (per developer/machine) - One age private key (.kage) per actor; declared globally in ~/.config/ontoref/config.ncl::vault.master_key_path. Per-project override in /.ontoref/project.ncl::sops.master_key_path when the project requires a different key (e.g. yubikey-backed for production-only access). - Generate once with: age-keygen -o ~/.age/keys/.key.txt - Permissions 0400. Never commit; never put inside any vault directory. LAYER 1 — Vault access credential (per-project) - access.sops.yaml encrypted multi-recipient with all actors who may open the vault. Contains: zot_username, zot_password, vault_key (restic/kopia repo password), cosign_password. - Lives at ~/.config/ontoref/vaults//access.sops.yaml. - Generated once by 'ore secrets bootstrap'; updated via 'ore secrets open'. LAYER 2 — Operation credentials (per-purpose) - Files under ~/.config/ontoref/vaults//src-vault/registry/. - Referenced from .ontology/manifest.ncl::registry_provides.registries[]: credential_sops (RO — pull/list) credential_sops_rw (RW — push) - Paths are RELATIVE to src-vault/, not to project root. 
- Decrypted into an isolated DOCKER_CONFIG tmpdir per oras invocation. PER-FILE RECIPIENT ROUTING (multi-tenant, optional) - Single vault, multiple recipient sets via sops creation_rules. - Declare in project.ncl::sops: recipient_groups = { admin = [...], clientA = [...], agents = [...] } recipient_rules = [ { path = \"registry/clientA-.*\\.sops\\.yaml$\", groups = [\"admin\", \"clientA\"] }, { path = \"registry/agent-.*\\.sops\\.yaml$\", groups = [\"admin\", \"agents\"] }, ] - Bootstrap generates /.sops.yaml; sops encrypts each file with the union of declared groups. Use this instead of multi-vault for tenant/agent isolation in a single project. AUTHORIZATION GATING (always enforced) - project.ncl::sops.actor_key_bindings maps ONTOREF_ACTOR → role. - /src-vault/scopes/.ncl declares { access, bound_actor, namespaces, ops }. Two-level enforcement: assert-actor-authorized — checks scope.ops + scope.bound_actor assert-target-in-scope — checks the OCI ref against scope.namespaces - Both fire BEFORE any oras call. No cache hit bypasses these checks. HARD RULES (ADR-017 invariants — non-negotiable) - Daemon never touches credentials. Resolution lives in the CLI process that holds the actor's .kage. The ontoref daemon only reads declarative metadata. - Every oras call runs with DOCKER_CONFIG=$tmpdir, freshly built from sops and torn down after the call. No fallback to ~/.docker/config.json. - cosign signing is mandatory for src-vault pushes. Default tlog=false (private model); vault.cosign.signing_config_path declares a Rekor-less signing config when tlog disabled. - Multi-recipient sops mandatory. ≥ 2 recipients per encrypted file. - access.sops.yaml carries cosign_password so push runs non-interactively. - Vault lock (OCI artifact at src-vault/:lock) coordinates concurrent edits with TTL 60min. force-unlock is admin-only and auditable. 
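The isolated-DOCKER_CONFIG invariant above can be sketched in shell. This is a minimal illustration, not the secrets.nu implementation: the registry host and the "zot-user"/"zot-pass" values are hypothetical stand-ins for what would come out of sops --decrypt.

```shell
# Sketch: build a throwaway DOCKER_CONFIG for one oras call, tear it down after.
# Host and credentials are hypothetical placeholders, not real vault values.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT          # teardown fires even on error
auth=$(printf '%s:%s' "zot-user" "zot-pass" | base64)
printf '{"auths":{"zot.internal.example":{"auth":"%s"}}}\n' "$auth" \
  > "$tmpdir/config.json"
# The real helper would now run: DOCKER_CONFIG="$tmpdir" oras push ...
echo "isolated config at $tmpdir/config.json"
```

The trap mirrors the hard rule: the tmpdir never outlives the invocation, and there is no fallback to ~/.docker/config.json.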
DAY-TO-DAY COMMANDS ore secrets status vault state, master key resolution ore secrets describe full inventory: groups, rules, scopes, ops ore secrets sync pull latest src-vault from ZOT ore secrets open acquire lock + edit access.sops.yaml ore secrets close impact report + push + release lock ore secrets rekey regenerate .sops.yaml + sops updatekeys ore secrets force-unlock release abandoned lock (admin) ESCAPE HATCHES ONTOREF_SECRETS_YES=1 skip impact-confirm in secrets close (no ambient docker config fallback — by design, see ADR-017 invariant I4) REFERENCES reflection/modules/secrets.nu (header) — function contract reflection/migrations/0016 — adoption steps install/resources/templates/sops/ — copy-paste templates per tenancy model adrs/adr-017 — invariants and rationale reflection/qa.ncl::ontoref-three-layer-model — vault credentials gate LAYER 2 publication (oras push of a project's domains//* and modes//*). The credential resolution chain ALSO sits at the LAYER 2 ↔ LAYER 3 seam: cred files referenced from manifest.ncl::registry_provides (Layer 2 declaration) are decrypted by caller-side workflows (Layer 3 execution). "%, actor = "human", created_at = "2026-05-03", tags = ["credentials", "adr-017", "sops", "cosign", "vault", "best-practice", "layer-2"], related = ["adr-017", "adr-015"], verified = true, }, { id = "credential-vault-templates", question = "How do I bootstrap the credential vault for a new project? Are there templates I can copy?", answer = m%" Three adoption paths, in order of complexity. Pick the one matching your project: PATH A — SINGLE-TEAM (legacy, simplest) Use when: one team, no tenant isolation, no agent restrictions beyond ops gating. Template: install/resources/templates/sops/single-team/project.ncl Steps: 1. Copy the template snippet into your .ontoref/project.ncl 2. Set master_key_path (per-project) or rely on the global config 3. 
Add registry_provides to .ontology/manifest.ncl with credential_sops/_rw pointing to registry/ro.sops.yaml and registry/rw.sops.yaml 4. export SOPS_AGE_RECIPIENTS=\"age1...\" (comma-separated, ≥ 2 keys) 5. ore secrets bootstrap (creates default ro/rw files seeded with interactive zot credentials) 6. ore secrets push PATH B — MULTI-TENANT (recommended for libre-wuji-class projects) Use when: multiple clients/agents/teams must NOT see each other's credentials. Template: install/resources/templates/sops/multi-tenant/project.ncl Adds recipient_groups + recipient_rules. Each tenant has its own group of age public keys; rules route file paths to group unions. Steps: 1. Copy the template — adjust group keys and rule patterns to your tenancy 2. Add registry_provides entries for each tenant (e.g. credential_sops = \"registry/clientA-ro.sops.yaml\") 3. ore secrets bootstrap (skips default ro/rw files, generates .sops.yaml) 4. ore secrets open (populate registry/.sops.yaml entries) 5. ore secrets close PATH C — AGENT-FIRST (ontoref/MCP integration) Use when: AI agents read credentials with strict restrictions. Template: install/resources/templates/sops/agent-first/project.ncl Same shape as Path B but with predefined groups for admin / developer / agent and a default scope file that gives 'agent' role RO ops on a single agent-readonly.sops.yaml. UNIVERSAL CHECKLIST Pre-bootstrap: - master .kage generated (age-keygen) and at master_key_path - cosign keypair at vault.cosign.{key_path,pub_path} - signing-config-no-rekor.json (when tlog=false) - ZOT registry reachable; ACL allows src-vault// namespace Post-bootstrap: - ore secrets describe shows expected recipients and per-file routing - ore secrets audit all 3 checks pass NO TEMPLATE = LEGACY DEFAULTS If you skip the templates entirely, ore secrets bootstrap with SOPS_AGE_RECIPIENTS env-var works as a minimal viable path. The result is Path A. 
You can migrate to B or C later by adding recipient_groups + recipient_rules and running ore secrets rekey. "%, actor = "human", created_at = "2026-05-03", tags = ["credentials", "templates", "onboarding", "adoption"], related = ["adr-017"], verified = true, }, { id = "credential-vault-troubleshooting", question = "What do the named errors from secrets.nu mean and how to recover?", answer = m%" The credential helper raises 15 named errors. Each maps to a recovery action: [invalid-op] op must be 'pull' or 'push'. Caller bug. [project-ncl-missing] Run from a project with .ontoref/project.ncl, or set ONTOREF_PROJECT_ROOT to that path. [manifest-ncl-missing] Apply migration 0016 — add registry_provides to .ontology/manifest.ncl. [registry-provides-missing] Same as above — manifest needs registry_provides block. [registry-id-unknown] Pass --registry-id matching a declared entry, or declare registries.default in the manifest. [credential-sops-missing] The chosen RegistryEntry has no credential_sops/_rw for the requested op. Add the field. [sops-file-not-found] Vault not synced. Run: ore secrets sync . [kage-not-resolvable] Master key absent. Set vault.master_key_path globally or sops.master_key_path per project. [sops-decrypt-failed] Your .kage is not a recipient of the file, or it is corrupt. Verify with: ore secrets describe. [actor-bindings-missing] project.ncl::sops.actor_key_bindings is empty. Map at least the actors used in this project. [actor-not-bound] ONTOREF_ACTOR has no entry in actor_key_bindings. Set the env var or add the mapping. [actor-not-in-bound-actor] scope.bound_actor list excludes this actor. Either change actor or relax the scope. [scope-not-loaded] Scope file missing — vault not synced or never created. Run: ore secrets sync . [op-not-in-scope] The role's scope.ops does not allow this operation. Use a higher-privilege role or extend scope. [target-not-in-scope] The OCI ref does not match any scope.namespaces glob. 
Operate on a permitted target or extend scope. Errors are raised before any registry call — no operation is half-completed. "%, actor = "human", created_at = "2026-05-03", tags = ["credentials", "errors", "troubleshooting", "secrets"], related = ["adr-017"], verified = true, }, { id = "integration-what-and-why", question = "What is integration in ontoref and why use it?", answer = m%" WHAT Integration is the federated distribution of two kinds of artifacts via an OCI registry (typically zot): DOMAIN ARTIFACTS application/vnd.ontoref.domain.v1 A domain is a Nickel contract (contract.ncl) describing the typed shape of a piece of structured data — e.g. 'registry-access', 'secret-delivery', 'compute-provisioning'. Pushed at domains//:. MODE ARTIFACTS application/vnd.ontoref.mode.v1 A mode is an operational orchestration (provisioning.ncl + domains.lock.ncl) that consumes one or more domain contracts to perform a workflow — e.g. 'lian-build/provisioning'. Pushed at modes//:. Both are cosign-signed and immutable per version. WHY - DECOUPLE producers from consumers. The team that defines the contract for 'registry-access' does not need to coordinate with every workspace that consumes it. Versioning (semver) handles compatibility. - REUSE across projects. A mode author writes one mode artifact; multiple workspaces subscribe with different cabling files binding their own values. - VERIFIABLE TRUST. cosign signatures + multi-recipient sops (ADR-017) establish who published an artifact and who can read its credentials. - DAG-FORMALIZED. Modes declare domains_used + steps as a typed graph; consumers can statically verify their cabling resolves all bindings. 
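The "statically verify cabling resolves all bindings" claim above can be sketched as a plain set-difference check. File names and domain ids here are hypothetical; the real check runs over domains.lock.ncl and the cabling file.

```shell
# Illustrative DAG check: every domain a mode pins must have a binding
# in the consumer's cabling. Inputs below are fabricated examples.
printf 'registry-access\nsecret-delivery\n' > /tmp/domains_used.txt
printf 'registry-access\nsecret-delivery\ncache-management\n' > /tmp/bindings.txt
# lines in domains_used with no exact match in bindings
missing=$(grep -Fxv -f /tmp/bindings.txt /tmp/domains_used.txt || true)
if [ -z "$missing" ]; then
  echo "all domain bindings resolve"
else
  echo "unresolved bindings: $missing"
fi
```

Extra bindings are harmless (cache-management above); only a pinned domain with no binding fails validation.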
DATA FLOW producer project (libre-wuji) registry (zot) consumer project ───────────────────────────── ──────────────── ──────────────── catalog/domains/registry-access/ contract.ncl ─push──► domains/libre-wuji/ example.json registry-access:0.1.0 (cosigned) catalog/modes/lian-build/ provisioning.ncl ─push──► modes/lian-build/ domains.lock.ncl provisioning:0.1.0 (cosigned) │ │ oras pull ▼ infra//integrations/ lian-build.ncl (cabling — binds mode params to workspace values) consumer commands: prvng integration subscribe lian-build --mode-file ... --workspace-dir . prvng integration validate lian-build --workspace-dir . prvng integration invoke lian-build --binary lian-build REFERENCES - ADR-042 (provisioning workspace) — federation model - reflection/migrations/0015 — registry topology adoption - install/resources/templates/integration/ — copy-paste templates - reflection/qa.ncl::ontoref-three-layer-model — domains and modes are LAYER 2 of a project's ontoref instance (the integration surface). This entry focuses on the federation mechanism; the layering entry frames where the artifacts sit relative to a project's self-management ontoref and to caller-side wiring. "%, actor = "human", created_at = "2026-05-03", tags = ["integration", "oci", "domains", "modes", "federation", "layer-2"], related = ["adr-015", "adr-017"], verified = true, }, { id = "integration-how-to-implement", question = "How do I implement integration in my project — as producer, consumer, or both?", answer = m%" Two roles, often the same project plays both. Pick the side and follow the predefined paths. PRODUCER SIDE — publish a domain or mode artifact Path P1 — DOMAIN PUBLISHER (you author a contract.ncl others should consume) Template: install/resources/templates/integration/domain-producer/ Steps: 1. 
Create catalog/domains// with: contract.ncl — typed shape of the domain (Nickel contract) example.json — sample value matching the contract manifest.ncl — DomainArtifact descriptor (id, version, layers) 2. Declare uses_registry in manifest.ncl::DomainArtifact pointing to the RegistryEntry that hosts pushes (ADR-017 G2 impact analysis). 3. ore secrets bootstrap (one-time per project) 4. prvng integration domain publish catalog/domains/ (cosign-signs at push time) Validation: prvng integration domain verify / → checks media types, layers, contract typecheck, cosign signature Path P2 — MODE PUBLISHER (you author a mode that orchestrates other modes) Template: install/resources/templates/integration/mode-producer/ Steps: 1. Create catalog/modes// with: provisioning.ncl — the IntegrationMode declaration (id, participant, direction, trigger, domains_used, steps) domains.lock.ncl — pinned domain versions consumed 2. prvng integration mode publish catalog/modes/ CONSUMER SIDE — bind a published mode to your workspace Path C1 — MODE SUBSCRIBER (you adopt someone's published mode) Template: install/resources/templates/integration/mode-consumer/ Steps: 1. prvng integration subscribe --mode-file --workspace-dir . → pulls all domains_used dependencies, verifies signatures, scaffolds infra//integrations/.ncl (the cabling) 2. Edit the cabling file to bind mode parameters to workspace values (e.g. dns zones, tenant ids, registry endpoints from your manifest). 3. prvng integration validate --workspace-dir . → typechecks the cabling and confirms every binding resolves. 4. prvng integration invoke --binary → assembles context envelope + pipes to the mode binary stdin. 
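Step 4's "assembles context envelope + pipes to the mode binary stdin" can be sketched like this. The envelope fields and the use of `cat` as a stand-in binary are hypothetical; the real envelope shape follows the mode's domains_used.

```shell
# Sketch of invoke: build a context envelope, pipe it to the binary's stdin.
# Fields are illustrative; `cat` stands in for the real mode binary.
envelope='{"mode_id":"lian-build/provisioning","bindings":{"registry-access":{"endpoint":"https://zot.internal.example"}}}'
printf '%s\n' "$envelope" | cat    # real call: | lian-build
```

The mode binary reads the envelope once from stdin and never touches the registry credentials itself (those were resolved caller-side).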
PREDEFINED INTEGRATION MODES (canonical examples in libre-wuji) modes/cloudatasave/provisioning data-save workflow with backup-policy-binding + result-reporting domains modes/lian-build/provisioning CI build pipeline using compute-provisioning + cache-management + secret-delivery domains These are reference implementations — clone the structure, adapt domains_used, re-publish under your participant. CABLING FILE STRUCTURE infra//integrations/.ncl let mode = import \"modes//:\" in { mode_id = mode.id, bindings = { # Match each domain in mode.domains_used. Resolve via: # - workspace state (manifest, capabilities) # - secret-delivery (pulls from credential vault) # - registry-access (zot endpoint + namespace policy) \"\" = { ... fields per the domain's contract.ncl ... }, }, } WITHOUT TEMPLATES — minimal viable Producer: manually create contract.ncl + example.json + manifest.ncl in any dir; cosign keypair; prvng integration domain publish Consumer: manually pull the OCI artifact; write cabling.ncl from scratch matching the mode's domains_used schema; invoke "%, actor = "human", created_at = "2026-05-03", tags = ["integration", "templates", "producer", "consumer", "subscribe"], related = ["adr-015", "adr-017"], verified = true, }, { id = "integration-troubleshooting", question = "Common errors when working with integration artifacts — what causes them and how to fix?", answer = m%" ON PUSH oras push: 403 Forbidden Cause: ZOT ACL does not declare the target namespace (e.g. domains// when registry config only allows domains/**). Fix: Add the namespace to zot configmap creation_rules and redeploy. cosign sign: 401 Unauthorized Cause: cosign needs DOCKER_CONFIG to fetch the manifest before signing. Fix: Ensure the calling code wraps cosign with the same isolated DOCKER_CONFIG used for the oras push. cosign sign: --tlog-upload=false is not supported with --signing-config Cause: cosign 2+ deprecated --tlog-upload. Use a signing-config without rekorTlogUrls/Config instead. 
Fix: Generate one with: curl https://raw.githubusercontent.com/sigstore/root-signing/refs/heads/main/targets/signing_config.v0.2.json \\ | jq 'del(.rekorTlogUrls) | del(.rekorTlogConfig)' > signing-config-no-rekor.json Set vault.cosign.signing_config_path in ~/.config/ontoref/config.ncl. ON PULL / SUBSCRIBE Vault artifact signature FAILED Cause: The configured cosign public key does not match the signing key. Fix: Confirm vault.cosign.pub_path in ~/.config/ontoref/config.ncl points to the public half of the keypair used for the push. Verify the tlog policy is symmetric (both sign and verify expect tlog=false). domain-pull: scope-not-loaded Cause: Vault not synced — scopes/.ncl absent from local src-vault. Fix: ore secrets sync oras pull: not found Cause: The requested ref does not match the published ref format. Old flat domains/ are not reachable via the new domains// path. Fix: Either re-publish the artifact under the participant-scoped path, or pass --registry overriding to a registry that has the legacy ref. ON INVOKE integration validate: binding does not resolve Cause: The cabling file has a binding whose value cannot be derived from workspace state at validate time. Fix: Inspect prvng integration describe to see the expected shape; ensure your cabling provides the matching field with the required type (use prvng i validate --strict for hard fail). integration invoke: binary not found Cause: Mode declares Invocation.binary.source = 'oci_blob but the blob reference is unreachable, or 'cargo_install but the cargo crate is absent in the local registry index. Fix: Pre-fetch the binary: docker pull or cargo install . Or pass --binary to override resolution. 
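After generating signing-config-no-rekor.json, a quick sanity check confirms no Rekor fields survived the jq deletion. The inline JSON below is a tiny fabricated stand-in, not the real sigstore file.

```shell
# Sanity check (illustrative): a Rekor-less signing config must not mention
# rekorTlogUrls/rekorTlogConfig. The sample file here is a fabricated stub.
cat > /tmp/signing-config-no-rekor.json <<'EOF'
{"mediaType": "application/vnd.dev.sigstore.signingconfig.v0.2+json"}
EOF
if grep -q 'rekorTlog' /tmp/signing-config-no-rekor.json; then
  echo "rekor entries still present — re-run the jq filter"
else
  echo "signing config is rekor-free"
fi
```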
REFERENCES - prvng integration --help full command surface - reflection/qa.ncl::integration-* this FAQ entry tree - reflection/migrations/0015 participant-scoped namespace migration "%, actor = "human", created_at = "2026-05-03", tags = ["integration", "errors", "troubleshooting", "cosign", "oras"], related = ["adr-015", "adr-017"], verified = true, }, { id = "nats-what-and-why", question = "Why does ontoref use NATS, and what role does it play for projects that adopt the protocol?", answer = m%" WHAT ontoref uses NATS JetStream as its async event substrate. The daemon publishes lifecycle events; the CLI receives notifications; projects publish domain- specific events (build started/completed, sync done, integration invoked). All events flow through a single JetStream stream with a typed subject hierarchy. TOPOLOGY (default — see nats/streams.json) Stream: ECOSYSTEM subjects: ecosystem.> retention: Limits (max_age = 30 days) storage: File (durable across restarts) Consumers (pull, explicit ack): daemon-ontoref filters: ecosystem.daemon.> ecosystem.actor.> ecosystem.ontoref.> cli-notifications filters: ecosystem.ontoref.> ecosystem.actor.> SUBJECT HIERARCHY ecosystem.daemon. daemon lifecycle (started, reload, ...) ecosystem.ontoref.. protocol events (sync.done, mode.run, ...) ecosystem.actor.. per-actor session/audit ecosystem... project-specific events (e.g. ecosystem.lian-build.build.completed) WHY ontoref uses NATS - DECOUPLE the daemon from the CLI. The daemon publishes; CLIs subscribe. No HTTP polling, no shared filesystem state, no blocking RPCs. Either side can restart without dropping the other. - DURABILITY across restarts. JetStream's File storage means a CLI launched after a daemon event still receives it via consumer replay. ack_policy Explicit lets each consumer track its own position. - MULTI-PROJECT VISIBILITY. The shared ECOSYSTEM stream and ecosystem.> subject root let any project publish or subscribe without negotiating a dedicated stream. 
Tenancy lives in the subject hierarchy, not in stream-per-project sprawl. - GRACEFUL DEGRADATION. NATS is a runtime-toggle service (ADR-014): nats_events.enabled = false in .ontoref/config.ncl shuts publishing off cleanly. Consumers see no events; the daemon and CLI continue working. Same is true if NATS is unreachable at startup — connection failure logs a warning and the publisher returns Ok(None). WHY a project adopting ontoref should publish to it - AUDIT TRAIL by subscribing once and recording. ecosystem..> captures every lifecycle event a project emits without bespoke logging pipelines. - CROSS-PROJECT COORDINATION. A workspace's CI pipeline can wait for ecosystem.lian-build.build.completed before triggering a deploy step, rather than polling an HTTP API or watching a filesystem. - FAN-OUT FOR FREE. Multiple consumers (oncall dashboard, audit log, notification UI, downstream pipeline) subscribe to the same subject without the producer knowing. - PROTOCOL ALIGNMENT. ADR-014 defines NATS as one of three runtime services (NATS, SurrealDB, src-vault). Projects that adopt ontoref get the same enable/disable mechanics for free; turn it on later without code changes. CONFIGURATION SHAPE Global (every ontoref-onboarded host): ~/.config/ontoref/config.ncl::nats_events .url = "nats://..." .enabled = true | false .nkey_seed = "..." (optional) ~/.config/ontoref/streams.json (stream + consumer topology) Project-local override (a project that wants its own stream): /.ontoref/config.ncl::nats_events.streams_config = "nats/streams.json" /nats/streams.json (overrides the global topology) Most projects accept the global topology; the project-local override is for the rare case of a project needing isolated streams (e.g. high-volume internal events that shouldn't share retention with ECOSYSTEM). 
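Because tenancy lives in the subject hierarchy, a publisher-side guard on subject shape is cheap. A minimal sketch (slug/scope/event values are illustrative; the real slug comes from project.ncl):

```shell
# Hypothetical guard for the ecosystem.<slug>.<scope>.<event> discipline.
slug="lian-build"; scope="build"; event="completed"
subject="ecosystem.${slug}.${scope}.${event}"
case "$subject" in
  ecosystem.?*.?*.?*) echo "ok: $subject" ;;
  *) echo "rejected: $subject" >&2; exit 1 ;;
esac
```

Rejecting malformed subjects at publish time keeps the shared ECOSYSTEM stream navigable for every consumer filtering on ecosystem.>.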
REFERENCES - nats/streams.json default ECOSYSTEM topology - crates/ontoref-daemon/src/nats.rs daemon-side NatsPublisher - adr-002 daemon as notification barrier - adr-014 runtime service toggles (NATS as one) - reflection/qa.ncl::nats-* this FAQ entry tree - reflection/qa.ncl::ontoref-three-layer-model — the subjects a project publishes are a LAYER 2 surface (other projects subscribe to coordinate). The broker and ECOSYSTEM stream are protocol-level infrastructure, not project Layer 2, so the layering crisply separates WHAT a project emits (Layer 2) from WHERE the events flow (protocol). "%, actor = "human", created_at = "2026-05-03", tags = ["nats", "jetstream", "events", "ecosystem", "architecture", "layer-2"], related = ["adr-002", "adr-014"], verified = true, }, { id = "nats-how-to-setup", question = "How do I set up NATS for ontoref, and how does a project plug into the event system?", answer = m%" Three deployment shapes — pick the lowest one your threat model accepts. All three satisfy the daemon's connection requirements. OPTION A — Local nats-server, no auth (lowest barrier, dev only) Best for: laptop development, ad-hoc testing of a project's event publishing. Install: macOS: brew install nats-server Linux: curl -sf https://binaries.nats.dev/nats-io/nats-server/v2/install.sh | sh Run (foreground, ^C to stop): nats-server -DV -js The -js flag enables JetStream so the ECOSYSTEM stream can be created. Wire into ontoref: Edit ~/.config/ontoref/config.ncl::nats_events enabled = true, url = "nats://127.0.0.1:4222", # nkey_seed unset — anonymous client Restart ontoref-daemon (it picks up the new config on next bootstrap). Bootstrap the topology (one-time per server): nats --server nats://127.0.0.1:4222 stream add --config nats/streams.json nats --server nats://127.0.0.1:4222 consumer add ECOSYSTEM \\ --config '' (Or let the daemon's TopologyConfig apply it on first connect — see crates/ontoref-daemon/src/nats.rs::connect.) 
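A post-bootstrap check for Option A can be sketched as below. It is guarded so it degrades cleanly when the nats CLI or the broker is absent; the URL is the Option A default, and ECOSYSTEM should appear in the listing when the topology was applied.

```shell
# Guarded post-bootstrap check (assumes Option A's default local URL).
if command -v nats >/dev/null 2>&1; then
  nats --server nats://127.0.0.1:4222 stream ls \
    || echo "no broker at 127.0.0.1:4222"
else
  echo "nats CLI not installed; skipping stream check"
fi
```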
OPTION B — Local nats-server with NKey auth (matches deployed shape) Best for: validating the NKey code path before deploying; production-shape testing without k8s overhead. Generate an NKey: go install github.com/nats-io/nkeys/nk@latest nk -gen user > /tmp/ontoref.user.nk nk -inkey /tmp/ontoref.user.nk -pubout > /tmp/ontoref.user.pub Configure /tmp/nats-server.conf: port: 4222 jetstream: enabled authorization { users = [{ nkey: "" permissions { publish { allow: ["ecosystem.>"] } subscribe { allow: ["ecosystem.>", "_INBOX.>"] } } }] } Run: nats-server -c /tmp/nats-server.conf Wire into ontoref: Edit ~/.config/ontoref/config.ncl::nats_events enabled = true, url = "nats://127.0.0.1:4222", nkey_seed = "", Note: platform-nats hardcodes require_signed_messages=false. The broker authenticates the client by NKey identity but does not require per-message signatures. This matches the deployed pattern. OPTION C — Production via provisioning's nats component Best for: shared workspaces, persistent JetStream state, multi-actor. 
The reusable component lives at: /catalog/components/nats/ metadata.ncl (name=nats, version=2.10, mode=cluster, JetStream) cluster/manifest_plan.ncl nickel/{main,defaults,contracts}.ncl Defaults (defaults.ncl): port=4222, monitor_port=8222, mode='cluster, image=nats:2.10-alpine jetstream.max_mem=256MB, jetstream.max_file=1GB storage 1Gi persistent Deploy: prvng workspace install nats # add to workspace component DAG prvng workspace apply # execute prvng workspace status # confirm pod Running Wire into ontoref (in-cluster vs port-forward): In-cluster: url = "nats://nats..svc.cluster.local:4222" nkey_seed = $(kubectl get secret -n ontoref-nkey -o jsonpath='{.data.seed}' | base64 -d) Out-of-cluster (port-forward for ad-hoc): kubectl port-forward -n svc/nats 4222:4222 & url = "nats://127.0.0.1:4222" nkey_seed = (same as above) PROJECT-SIDE: enabling event publishing in a project In /.ontoref/config.ncl: nats_events = { enabled = true, url = "nats://127.0.0.1:4222", # or workspace URL nkey_seed = std.env.get "NATS_NKEY_SEED", # or null for anonymous emit = ["ecosystem...>"], # subjects this project publishes subscribe = ["ecosystem.ontoref.>"], # subjects this project consumes } Subject discipline: prefix every emitted subject with ecosystem... The slug matches .ontoref/project.ncl::slug. Scope is project-internal (e.g. lian-build uses 'build' for build lifecycle: ecosystem.lian-build.build.started). Project-local stream override (rare): Add nats/streams.json at the project root with the same shape as ontoref's global, then set nats_events.streams_config = "nats/streams.json" in the project's config.ncl. The daemon applies this topology on next connect instead of inheriting the global ECOSYSTEM stream. VERIFY end-to-end Subscribe in one terminal: nats sub --server $NATS_URL 'ecosystem.>' Trigger an event from another terminal: ontoref --actor developer sync . # daemon emits ecosystem.ontoref.sync.* OR run a project that publishes (e.g. 
lian-build integrate) The subscriber prints the JSON envelope within milliseconds. If it does not, see reflection/qa.ncl::nats-troubleshooting. REFERENCES - nats/streams.json default topology — copy as starting point - ~/.config/ontoref/config.ncl where nats_events is configured - /catalog/components/nats Option C source - crates/ontoref-daemon/src/nats.rs daemon connect + topology apply - reflection/qa.ncl::nats-what-and-why rationale and architecture "%, actor = "human", created_at = "2026-05-03", tags = ["nats", "setup", "configuration", "nkey", "onboarding"], related = ["adr-002", "adr-014", "adr-017"], verified = true, }, { id = "ontoref-three-layer-model", question = "When I see a project's ontoref instance, what am I actually looking at? What are the three layers, and how do they not mix?", answer = m%" A project's ontoref instance has THREE distinct layers, each with its own audience, lifecycle, and validation rule. They share the project's repo but do not share content. This entry exists so adopters discover the layering from protocol documentation, not by re-deriving it across three projects. LAYER 1 — Self-management ontoref (about the project itself) paths .ontology/ reflection/ adrs/ audience this project's developers and maintainers purpose describe the project to itself — axioms, FSM dimensions, binding decisions (ADRs), open questions (backlog), accepted knowledge (qa) required YES, on every ontoref-onboarded project. `ontoref setup` creates it. 
Concrete files: .ontology/core.ncl axioms, tensions, practices, edges .ontology/state.ncl FSM dimensions for project maturity .ontology/gate.ncl membranes gating state transitions .ontology/manifest.ncl project metadata, layers, capabilities, registry_provides (binds Layer 1 to Layer 2) reflection/qa.ncl accepted knowledge as typed Q&A entries reflection/backlog.ncl open items routed to graduation targets reflection/modes/ the project's own integration-mode declarations adrs/adr-NNN-*.ncl binding decisions with typed constraint checks LAYER 2 — Specialized domain/mode ontoref (the integration surface) paths schemas/ catalog/{domains,modes}/ manifest.ncl::registry_provides audience OTHER projects that want to integrate this project purpose the contract surface other projects bind to — typed domains, orchestration modes, registry-namespace claim. Lives in this project so the schemas and the binary stay in lock-step. required OPTIONAL. A pure consumer of the protocol (no published artifacts) skips Layer 2. A federated peer (publishes domains//* or modes//*) has all of it. Concrete files (when present): schemas/.ncl typed domain contracts catalog/domains//manifest.ncl OCI DomainArtifact catalog/domains//contract.ncl re-export of schema catalog/domains//example.json canonical instance catalog/modes//provisioning.ncl ModeArtifact + steps DAG catalog/modes//domains.lock.ncl pinned domain digests manifest.ncl::registry_provides.{participant,registries} namespace claim + cred refs LAYER 3 — Caller-side implementations (NOT in this project) paths /extensions// /catalog/components/<...>/ (when consuming) /infra//integrations/ audience operators and CI of caller projects purpose wiring this project into a specific workspace or pipeline — cabling values to mode steps, declaring which workspace components consume which artifacts present PER CALLER, NEVER in this project's repo This layer is OUTPUT of the integration relationship, not input to it. 
The project does not ship Layer 3 content for itself; callers ship it for their own use. Cross-references from this project to Layer 3 are explicit pointers ("see /extensions/..."), never copy-paste. WHY THE LAYERING MATTERS — three concrete failure modes when mixed Layer 3 in Layer 1 A project's reflection/qa.ncl carries entries titled "How do I plug into workspace X?" Other callers reading the FAQ get nothing useful; they're looking at one workspace's wiring instead of the project itself. Layer 2 in Layer 1 Schemas live under reflection/ or .ontology/ as "project notes". They drift from the binary because they're not on the contract path. Consumers pull a contract artifact from the registry and find it out of sync. Layer 1 in Layer 2 A catalog/domains//contract.ncl carries architectural rationale instead of a typed shape. Consumers pull and get text where they expected a Nickel contract; integrations break. CROSS-LAYER REFERENCES — explicit, never implicit Within a project: tag qa entries that touch a Layer 3 boundary with "layer-3-boundary" so the boundary is visible. Touch implies "points out to a caller", not "documents the caller's wiring". Between projects: cross-link via id (e.g. lian-build's reflection/backlog.ncl::bl-002 mirrors ontoref's reflection/backlog.ncl::bl-007). Never duplicate content; refer. To the protocol: the protocol's own qa entries (this file) describe the layering generically; project-level qa entries reference them ("see /reflection/qa.ncl::ontoref-three-layer-model") rather than restate. DETECTION — quick checks to see what a project has Layer 1 present: test -f .ontology/core.ncl && test -f reflection/qa.ncl && \ test -d adrs/ && echo "L1 present" Layer 2 present: test -d catalog/ && rg -q 'registry_provides' manifest.ncl && \ echo "L2 present" Layer 3 absence (correctness check): rg -l 'extensions/.*/cabling|infra/.*/integrations/' \ --type-not ncl --type-not md . 
      | head

    Should return empty — Layer 3 belongs to callers, not this project.

STATUS — protocol-level codification

  The model is observed in practice (ontoref + lian-build + provisioning)
  but not yet codified as a protocol ADR with enforceable constraints.
  See reflection/backlog.ncl::bl-009 for the open codification question,
  including four constraint candidates (Layer 1 mandatory, Layer 2
  biconditional, Layer 3 isolation, cross-layer tag convention).

  Open question flagged in bl-009: how does this layering interact with
  ADR-018's level hierarchy (Base / Domain / Instance)? Likely orthogonal
  axes (3-layer × 3-level matrix), but unresolved until the ADR drafts.

REFERENCES
  - reflection/backlog.ncl::bl-009  codification work item
  - adr-018-level-hierarchy-mode-resolution-strategy  open interaction
  - lian-build/reflection/qa.ncl::lian-build-what-and-why  worked example
    of all three layers as observed
  - lian-build/manifest.ncl  Layer 2 example (registry_provides)
  - lian-build/catalog/  Layer 2 example (domains + modes)
"%,
      actor = "human",
      created_at = "2026-05-03",
      tags = ["ontoref", "layers", "architecture", "adoption", "scope", "boundaries"],
      related = ["adr-018-level-hierarchy-mode-resolution-strategy"],
      verified = true,
    },
    {
      id = "nats-troubleshooting",
      question = "My ontoref/project NATS publishing isn't working — how do I diagnose it?",
      answer = m%"
Symptoms and fixes. Apply in order; each step rules out a category.

(1) Daemon logs "NATS connect failed" or "events disabled"
  Cause: nats_events.enabled = false, or url unreachable, or NKey
  rejected.
  Diagnose:
    - rg -A5 'nats_events' ~/.config/ontoref/config.ncl
    - Check enabled = true.
    - curl -sf http://:8222/healthz   (broker monitor port)
      Should return {"status":"ok"}; if not, the broker is unreachable.
  Fix:
    - Wrong URL: edit url, restart daemon.
    - Broker down: start nats-server (Option A/B) or check the workspace
      pod.
    - NKey mismatch: regenerate the user pub key, update
      nats-server.conf.
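The step (1) config check can be sketched end-to-end. A minimal sketch, assuming the global config follows the nats_events shape above; the sample file is generated locally so the sketch is self-contained, and grep stands in for rg:

```shell
# Write a hypothetical sample config (in real use, point CONFIG at
# ~/.config/ontoref/config.ncl instead of generating one).
CONFIG="$(mktemp -d)/config.ncl"
cat > "$CONFIG" <<'EOF'
{
  nats_events = {
    enabled = true,
    url = "nats://127.0.0.1:4222",
  },
}
EOF
# Locate the nats_events block and test the enabled flag.
if grep -A5 'nats_events' "$CONFIG" | grep -q 'enabled = true'; then
  echo "nats_events: enabled"
else
  echo "nats_events: disabled or missing"   # symptom (1) territory
fi
# prints: nats_events: enabled
```

If this reports enabled but the daemon still logs "events disabled", the daemon is reading a different (project-local) config, which is symptom (6) below.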
(2) Subscriber sees no events
  Cause: subject prefix mismatch, or publisher silently dropping
  (warn-only degradation hides failures).
  Diagnose:
    - In another terminal, subscribe to the wildcard root:
        nats sub --server $NATS_URL 'ecosystem.>'
      If THIS receives events, your filter was wrong.
    - Inspect the publisher's stderr — platform-nats logs the resolved
      subject on each publish at INFO level. Compare to your subscribe
      filter.
    - JetStream consumers ack once: a consumer that already ack'd a
      message won't redeliver it. Check consumer info:
        nats consumer info ECOSYSTEM
  Fix:
    - Subject mismatch: align the subscribe pattern to what's published.
    - Consumer stuck: nats consumer rm + re-add, or use a fresh consumer
      name to start from latest.
    - Warn-only drops: set RUST_LOG=warn and re-run; look for
      "NATS publish failed".

(3) "no responders available for request" or stream missing
  Cause: ECOSYSTEM stream not created on the broker.
  Diagnose:
    - nats stream ls --server $NATS_URL
      Should list ECOSYSTEM. If empty, the topology was never applied.
  Fix:
    - Re-run the topology bootstrap (see nats-how-to-setup OPTION A
      "Bootstrap the topology"), or restart ontoref-daemon — its
      connect() applies nats/streams.json on first call.

(4) NKey decode error / "invalid seed"
  Cause: seed format wrong (must be the SU...-prefixed value, not the
  public UD... or a JWT).
  Diagnose:
    - echo $NATS_NKEY_SEED | head -c 2   # should print 'SU' (user seed)
                                         # or 'SA' (account),
                                         # 'SO' (operator)
  Fix:
    - Regenerate: nk -gen user > /tmp/ontoref.user.nk; use the WHOLE
      file contents. Do not concatenate the .pub file.

(5) "permissions violation for publish"
  Cause: the broker's authorization block restricts publish subjects;
  the project is publishing outside its allowed namespace.
  Diagnose:
    - Inspect nats-server.conf (Option B) or the workspace's NATS auth
      config (Option C — kubectl get configmap -n nats-server-conf).
    - Check the user's permissions.publish.allow list against your
      subject.
  Fix:
    - Widen the allow list to include ecosystem..> — or better, use a
      per-project user with scoped permissions.

(6) Project's events appear in the daemon log but never on NATS
  Cause: the project is using a different NATS_URL than the daemon, or
  its nats_events.enabled is false.
  Diagnose:
    - Compare the project's resolved NATS_URL (its stderr at startup) to
      the daemon's. They must point at the same broker if they share the
      ECOSYSTEM stream.
  Fix:
    - Is the project-local override intentional? Inspect
      /.ontoref/config.ncl::nats_events. If unintentional, remove the
      override; the daemon's global config applies.

(7) Events received but with stale timestamps / out of order
  Cause: JetStream's File storage replays unacked messages on consumer
  reconnect. This is a feature, not a bug — explicit-ack consumers are
  expected to handle redelivery.
  Fix:
    - Code your subscribers to be idempotent; use the message's
      correlation_id / event_id (when present) to deduplicate.
    - If strict ordering matters: set MaxAckPending = 1 on the consumer.

REFERENCES
  - reflection/qa.ncl::nats-what-and-why  architecture
  - reflection/qa.ncl::nats-how-to-setup  setup paths
  - crates/ontoref-daemon/src/nats.rs  daemon connect logic
  - /crates/platform-nats/  NatsConnectionConfig shape
  - https://docs.nats.io/running-a-nats-service/troubleshooting
    upstream docs
"%,
      actor = "human",
      created_at = "2026-05-03",
      tags = ["nats", "errors", "troubleshooting", "jetstream", "nkey"],
      related = ["adr-002", "adr-014"],
      verified = true,
    },
    {
      id = "ontoref-dao-discipline",
      question = "What is ondaod and when does an architectural analysis or ADR draft have to apply it?",
      answer = m%"
ALIAS
  ondaod — shorthand for this discipline in conversation, ADR drafts,
  CLAUDE.md rules, and reflection mode declarations.

WHAT
  Discipline applied to architectural analysis in any ontoref-onboarded
  project: read the named tensions before recommending; describe the
  synthesis state rather than picking a pole; name engaged tensions
  explicitly.
  The named tensions in .ontology/core.ncl are continuous Spirals, not
  binaries — recommendations that collapse them by choosing one side
  silently bias toward whichever pole the analyst's reasoning happened
  to land on.

WHEN (triggered)
  - Architectural analysis — any reasoning that produces a
    recommendation about structure, naming, layout, contracts, or
    constraints.
  - ADR drafting — every ADR must apply ondaod alongside the
    four-criterion ADR test (alternative-rejected, lasting-constraint,
    multi-component-reversal, not-duplicating-existing).
  - Work touching .ontology/, adrs/, reflection/, catalog/,
    manifest.ncl.

WHEN (NOT triggered)
  - Routine code work — bug fixes, feature implementation, refactors
    that don't change architecture.
  - Operational tasks — CI runs, commits, tests.
  - Pure data extraction — querying, reading.

HOW (procedure)
  1. READ .ontology/core.ncl; locate `level = 'Tension` nodes.
  2. IDENTIFY which named tensions the question engages. The empty set
     is allowed but must be declared:
       tensions_engaged: []   # no Spiral tensions present
  3. CHARACTERIZE the synthesis state of each engaged tension:
     - Where on the flow? (claim-only / populating / realized /
       consumed / dormant)
     - What direction is the project moving? (toward Yang / toward Yin
       / static)
  4. RECOMMEND the move that maintains continuous flow, not the move
     that collapses the tension. Half-states are partially-realized
     syntheses, not violations.
  5. CITE engaged tensions explicitly in the output (prose paragraph or
     structured field).

WHY
  Default reasoning — human or agent — is Yang: sequential,
  decide-and-commit. This loses the continuous flow that the named
  tensions exist to surface. Recommendations that don't name engaged
  tensions silently bias toward whichever pole reasoning happened to
  land on. ondaod surfaces the bias structurally so analysts and
  reviewers can correct for it before the bias compounds across
  decisions.
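Step 1 of the procedure can be sketched with ordinary shell tooling. The core.ncl content and node ids below are hypothetical illustrations; a real tool would evaluate the Nickel rather than pattern-match it:

```shell
# Sketch of HOW step 1: list the named tensions in .ontology/core.ncl.
# The file content and ids are hypothetical; real core.ncl nodes carry
# more fields than id and level.
CORE="$(mktemp -d)/core.ncl"
cat > "$CORE" <<'EOF'
{
  nodes = [
    { id = "claim-vs-reality", level = 'Tension },
    { id = "speed-vs-rigor", level = 'Tension },
    { id = "read-tensions-first", level = 'Practice },
  ],
}
EOF
# Keep only Tension-level nodes and print their ids.
grep "level = 'Tension" "$CORE" | sed -E 's/.*id = "([^"]+)".*/\1/'
# prints:
#   claim-vs-reality
#   speed-vs-rigor
```

The printed ids are the candidate set for step 2; an empty result is the `tensions_engaged: []` case and still has to be declared.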
FORBIDDEN PATTERNS (most common ondaod violations)
  - Pole-collapse recommendations: "pick option A" / "go with B" / "the
    right answer is X" — without naming what got collapsed.
  - Reality-collapses-intent: "drop the declared claim because the
    catalog is empty" — that erases Yin intent rather than describing
    partial realization. The half-state is the project's current
    location on a continuous flow, not a contract violation.
  - Hard-biconditional on Spiral questions: ADR constraints with
    severity = 'Hard and `A ⟺ B` checks on questions that core.ncl
    already names as 'Spiral tensions. Use 'Soft constraints that
    report direction of motion.
  - Yang-bias by sequence: when serial reasoning produces "best option
    per question", that IS the bias. Counter: characterize all
    questions' synthesis states first, recommend last (or not at all).

ADR INTEGRATION (criterion 5)
  The four-criterion `adr?` test extends to FIVE when ondaod applies:
    1. alternative consciously rejected?
    2. lasting constraints future contributors must follow?
    3. reversing requires coordinated effort across multiple components?
    4. not already captured as a constraint in an existing ADR?
    5. ondaod — engaged named tensions identified and synthesis state
       described, OR explicitly tensions_engaged: [] with rationale
       "no Spiral tensions present in this decision"?
  All five must hold. Failing 5 means the question is tractable (1-4
  hold) but the analysis hasn't characterized the flow — request a
  synthesis-state description before drafting the ADR.

  A 'Spiral-poled ADR (when the schema permits the field — see bl-009
  graduation) cannot use severity = 'Hard biconditional constraints;
  Spiral decisions get 'Soft constraints reporting direction of motion.

ACCESS PATHS (how agents and humans reach this entry)
  From any ontoref-onboarded project, via the canonical `ontoref` CLI:
    ontoref qa show ontoref-dao-discipline
  Auto-emits JSON when invoked by an agent (agent-identified context);
  humans see the formatted output.
  Force one or the other explicitly with --fmt json | -f json (or
  --fmt md). Available on every `ontoref` subcommand that returns
  structured data.

  Direct file read (last resort, no CLI required):
    $ONTOREF_ROOT/reflection/qa.ncl::ontoref-dao-discipline

  This entry is canonical and not duplicated into consumer projects.
  The discipline applies to consumer projects via reference; the
  content lives here once. Each consumer project carries its OWN dao
  discipline entry as an extension (e.g.
  lian-build/reflection/qa.ncl::lian-build-dao-discipline) that names
  the project's specific tensions and forbidden patterns; both the
  protocol baseline (this entry) and the project extension are read
  together when applying ondaod.

REFERENCES
  - .ontology/core.ncl::read-tensions-first  Practice node (structural
    anchor)
  - reflection/qa.ncl::ontoref-three-layer-model  worked example of
    Yang-bias and Spiral re-frame
  - reflection/backlog.ncl::bl-009  three-layer model graduation
    (depends on ondaod-disciplined analysis)
  - global ~/.claude/CLAUDE.md::adr?  extended four-→-five criterion
    definition references this entry
"%,
      actor = "human",
      created_at = "2026-05-03",
      tags = ["ontoref", "dao", "discipline", "ondaod", "tensions", "spiral", "adr-process", "meta"],
      related = ["adr-018-level-hierarchy-mode-resolution-strategy"],
      verified = true,
    },
    {
      id = "credential-vault-disaster-recovery",
      question = "What if I lose my .kage, my vault_key, the access.sops.yaml file, or the entire local vault directory?",
      answer = m%"
The credential vault has multiple recovery paths depending on what
survives. None requires re-bootstrapping unless the catastrophic case
(all local + all keys lost) hits. Verified empirically 2026-05-03 — the
"access.sops.yaml lost" recovery via oras pull works as documented.

LOSS MATRIX
  Lost                             Recovery
  ────                             ────────
  access.sops.yaml                 oras pull src-vault/:latest from ZOT.
  (corrupted/deleted)              cp the file from the artifact into
                                   ~/.config/ontoref/vaults//.
  Local restic repo (/repo dir)    oras pull restores the src-vault/
                                   subtree (scopes, registry, logs).
  vault_key alone                  Decrypt access.sops.yaml with .kage,
                                   extract the vault_key field.
  Your .kage alone                 Peer recipient decrypts; share
                                   vault_key out-of-band. Generate a new
                                   .kage; add via 'ore secrets add-key'
                                   + 'rekey'.
  Both local AND your .kage        Use a peer recipient's .kage to pull
  (other recipients alive)         and decrypt. Generate a new .kage;
                                   add the new pubkey via add-key.
  Both local AND ALL recipients'   Catastrophic. The encrypted artifact
  .kage files                      in ZOT is unrecoverable by design.
                                   Re-bootstrap; reissue all registry
                                   credentials from the zot admin
                                   surface.

CONCRETE RECOVERY (access.sops.yaml lost — verified)
  See justfiles/secrets.just::secrets-recover (or call ore directly):
    ore secrets recover --from-registry   # pulls and restores
                                          # access.sops.yaml using the
                                          # current zot credentials from
                                          # your project.ncl

  Manually, the equivalent oras invocation is documented in
  justfiles/_secrets_lib.sh::vault_zot_config_open + an oras pull.
  Steps:
    1. Build a DOCKER_CONFIG tmpdir with admin zot credentials
    2. oras pull /src-vault/:latest --output
    3. cp /access.sops.yaml ~/.config/ontoref/vaults//
    4. ore secrets status — should report 'access.sops: present'

BACKUP STRATEGY
  master .kage        Hardware key (Yubikey via age-plugin-yubikey),
                      encrypted disk, or password manager with file
                      attachment. Multi-recipient sops makes the
                      recipient list itself the resilience layer.
  vault_key           Encrypted inside access.sops.yaml — recoverable
                      via .kage. Backup is automatic.
  cosign signing key  Separate (vault.cosign.key_path). Treat as a
                      standalone private key — back up independently.

INVARIANTS
  - Multi-recipient sops mandatory (≥ 2 per file) — no single point of
    failure.
  - access.sops.yaml is in the OCI artifact — pulling restores it
    intact.
  - cosign verification on pull detects substituted artifacts.
  - Daemon never holds credentials — daemon recovery is independent.

WHAT NOT TO DO
  - Do NOT skip cosign verification on pull.
  - Do NOT rotate vault_key proactively — it is the local restic repo
    password, not a public-service credential.
  - Do NOT re-bootstrap to skip recovery — a fresh bootstrap loses
    audit.jsonl history and the participant's src-vault history in ZOT.
"%,
      actor = "human",
      created_at = "2026-05-03",
      tags = ["credentials", "recovery", "disaster", "operations", "backup"],
      related = ["adr-017", "adr-019"],
      verified = true,
    },
  ],
}
| s.QaStore