let d = import "adr-defaults.ncl" in d.make_adr { id = "adr-027", title = "prvng-cli: Unix-Socket Registry Daemon Eliminating Nu Startup Cost Per Invocation", status = 'Accepted, date = "2026-04-18", context = "Every `prvng` invocation runs `_validate_command` in the bash wrapper, which determines whether a command exists in the registry and whether it requires the orchestrator daemon. The baseline implementation had two paths: (1) `nickel export` to rebuild the JSON cache (~2-5s, only on cache miss), and (2) pure bash grep over the JSON cache to extract `found` and `requires_daemon`. The bash grep path has a correctness flaw: `grep -o '\"[a-zA-Z0-9_\\-\\+\\.]*\"'` extracts every quoted string in the JSON file, not just command names, so a description substring matching the query command produces a false positive. Additionally, as the registry grows, a grep+sed window extraction over a multi-kilobyte JSON file for each invocation adds unneeded I/O. ADR-022 (ncl-sync) already established the pattern of using a lightweight Rust daemon for operations that benefit from persistent in-memory state. The registry is a read-heavy, write-rare structure — exactly the right profile for an in-memory cache behind a Unix socket.", decision = "Implement `prvng-cli` as a standalone Rust binary in `platform/crates/prvng-cli/`. The binary: (1) reads `~/.cache/provisioning/commands-registry.json` at startup into a `HashMap` indexed by both canonical name and all aliases; (2) listens on a Unix domain socket at `~/.local/share/provisioning/cli.sock`; (3) serves JSON-framed lookup requests with newline termination; (4) watches the JSON cache file via the `notify` crate and hot-reloads the index on any `Modify` or `Create` event without restarting; (5) shuts down automatically after 60s of idle — the bash wrapper restarts it on demand. The bash wrapper gains three functions: `_ensure_cli_daemon` (starts the binary if the socket is absent, waits up to 300ms for the socket to appear), `_cli_query` (sends `{\"op\":\"lookup\",\"command\":\"\"}` via `nc -U -w 1`, parses `found` and `requires_daemon` from the response), and a three-tier `_validate_command` (socket → bash grep JSON cache → Nu script fallback). The binary is a workspace member of `platform/Cargo.toml` with no dependency on Nushell, SurrealDB, NATS, or any platform service.", rationale = [ { claim = "The in-memory HashMap indexed by both canonical name and aliases eliminates the bash grep false-positive problem", detail = "The HashMap is built at load time by `Registry::into_index()`: for each CommandEntry, the canonical name is inserted as a key, then each alias is inserted as an additional key pointing to the same entry. A lookup on `\"h\"` returns the `help` entry without scanning the description or any other field. The bash grep approach extracted every quoted string from the JSON, meaning a description containing `\"h\"` (e.g., as part of another word) would have matched. The HashMap provides O(1) exact lookup with no false positives.", }, { claim = "Unix socket with newline-framed JSON is simpler and lower-latency than HTTP for same-host IPC", detail = "HTTP adds header parsing, keep-alive negotiation, and TCP stack overhead. For a query that returns ~200 bytes, the round-trip overhead of HTTP is larger than the payload. Unix domain sockets bypass the network stack entirely and are available on all Unix targets. `nc -U -w 1` is universally available on macOS and Linux without additional tooling. The newline frame is parseable by any shell with `grep -o '\"field\":value'` — no JSON parser required in the caller.", }, { claim = "notify-based hot reload eliminates the need to restart the daemon after `nickel export` updates the cache", detail = "The workflow for a registry change is: edit `commands-registry.ncl` → `nickel export` writes the JSON cache → the file watcher detects the write event → the daemon reloads the HashMap in-place. No socket downtime, no bash logic to detect staleness, no version negotiation. The watcher monitors the parent directory with `RecursiveMode::NonRecursive` to catch atomic writes (where editors write to a temp file then rename into place, which does not trigger a `Modify` on the original path but does trigger `Create` on the canonical path).", }, { claim = "Idle shutdown at 60s keeps resource usage zero between `prvng` invocations", detail = "The daemon is not a long-running service — it is an on-demand cache server. On a developer workstation, `prvng` may not be invoked for hours. A daemon that runs continuously would hold an open file descriptor on the socket and consume memory permanently. The 60s idle timeout means the daemon self-terminates after a session of commands, and `_ensure_cli_daemon` restarts it on the next invocation. The restart cost is ~100ms (binary start + HashMap load for 40 commands); this is amortized across all commands in a session.", }, { claim = "Three-tier fallback in `_validate_command` preserves correctness when the daemon is unavailable", detail = "The socket path can fail in three ways: the binary is not installed, the daemon is starting (race condition during `_ensure_cli_daemon`), or `nc` returns no output. The fallback chain is: socket (fast, correct) → bash grep on JSON cache (fast, has false-positive risk but handles 99% of cases) → Nu script (slow, always correct). The Nu fallback is the pre-ADR-027 behavior; it is retained as the last resort to ensure `_validate_command` never hard-fails due to daemon absence.", }, ], consequences = { positive = [ "`_validate_command` completes in <5ms when the daemon is running vs ~50-200ms for bash grep+sed", "Registry lookup correctness: HashMap indexed by exact name/alias, no substring false positives", "Hot reload: `nickel export` → daemon reloads automatically, no restart needed", "Zero resource usage between sessions: idle shutdown at 60s", "No additional system dependencies: `nc` (netcat) is present on all macOS/Linux targets", ], negative = [ "300ms cold-start latency on first invocation after idle shutdown — amortized across the session but visible on the very first `prvng` call", "`nc -U -w 1` behavior differs between GNU netcat (`-q 1`) and BSD netcat (`-w 1`) — the bash wrapper must use `-w 1` for macOS compatibility", "The binary must be installed to `~/.local/share/provisioning/bin/prvng-cli` before `_ensure_cli_daemon` can start it; the installer script must include this step", ], }, alternatives_considered = [ { option = "Keep pure bash grep over JSON cache as the only validation path", why_rejected = "False-positive risk: grep extracts every quoted token from the JSON, not just command names. For 40 commands with aliases and descriptions, the extracted token list contains ~300 strings. A description containing a common word that matches an input typo would suppress the 'unknown command' error. Correctness requires exact-name matching against the command/alias fields only.", }, { option = "Reuse the existing `provisioning-daemon` (platform/crates/daemon/) for registry queries", why_rejected = "The provisioning-daemon is a full platform service: SurrealDB, NATS, auth middleware, provider APIs. It requires the orchestrator infrastructure to be running and is not designed for sub-millisecond local queries. Starting it solely for registry lookup is architectural misuse. ADR-022's ncl-sync daemon established the correct pattern: a separate binary scoped to one responsibility.", }, { option = "HTTP server on localhost instead of Unix socket", why_rejected = "HTTP requires a port allocation, adds TCP stack overhead, and exposes the registry to other processes on the host. Unix sockets are file-permission-controlled, zero-overhead, and already the established IPC pattern for this codebase (the orchestrator uses WebSocket-over-Unix-socket for SurrealDB embedded mode).", }, { option = "Shared memory or mmap for the registry index", why_rejected = "Requires either a file-format contract for the serialized HashMap or a memory-mapped file with a custom reader. `nc`-over-Unix-socket is implementable in bash with one line; mmap requires a dedicated reader binary or Nushell plugin. The complexity gain is negative: the index is 40 entries and fits in a single cache line of JSON.", }, ], constraints = [ { id = "prvng-cli-no-platform-deps", claim = "platform/crates/prvng-cli/Cargo.toml must not depend on nushell, surrealdb, async-nats, platform-config, service-clients, or any crate that transitively requires them", scope = "platform/crates/prvng-cli/", severity = 'Hard, check = { tag = 'Manual, description = "cargo tree -p prvng-cli | grep -E 'nushell|surrealdb|async-nats' — must be empty" }, rationale = "prvng-cli is a lightweight daemon with a single responsibility: serve registry lookups. Platform service dependencies would pull in rustls version conflicts (nushell pins rustls=0.23.28; surrealdb requires ^0.23.36) and increase binary size by 10-50x. Keeping it dependency-minimal ensures it builds fast and stays buildable independently of the platform workspace conflicts.", }, { id = "socket-path-via-xdg", claim = "The socket path must be derived from XDG_DATA_HOME (defaulting to ~/.local/share), never hardcoded", scope = "platform/crates/prvng-cli/src/main.rs, provisioning/core/cli/provisioning", severity = 'Hard, check = { tag = 'Grep, pattern = "\\.local/share/provisioning/cli\\.sock", paths = ["platform/crates/prvng-cli/src/main.rs"], must_be_empty = true }, rationale = "Hardcoded paths break in NixOS, container environments, and CI runners where HOME may not exist or XDG_DATA_HOME points elsewhere. The PRVNG_CLI_SOCKET environment variable allows per-invocation override for testing.", }, { id = "bsd-nc-compatibility", claim = "All `nc` invocations in provisioning/core/cli/provisioning must use `-w 1` for timeout, never `-q 1`", scope = "provisioning/core/cli/provisioning", severity = 'Hard, check = { tag = 'Grep, pattern = "nc.*-q", paths = ["provisioning/core/cli/provisioning"], must_be_empty = true }, rationale = "macOS ships BSD netcat which does not implement `-q` (GNU netcat timeout flag). BSD netcat uses `-w` for connection timeout. Using `-q 1` causes nc to exit with an error on macOS, making `_cli_query` always fail and fall through to the bash grep path, silently degrading to the pre-ADR-027 behavior.", }, ], ontology_check = { decision_string = "Rust Unix-socket daemon serving in-memory HashMap registry lookups with file-watcher hot-reload and 60s idle shutdown; bash wrapper gains three-tier _validate_command (socket → grep → Nu)", invariants_at_risk = ["config-driven-always"], verdict = 'Safe, }, related_adrs = ["adr-022-ncl-sync-daemon", "adr-025-unified-lazy-loading", "adr-028-daemon-target-registry-field"], }