let d = import "adr-defaults.ncl" in d.make_adr { id = "adr-031", title = "Unified Component CLI: prvng component with Polymorphic Mode Dispatch", status = 'Accepted, date = "2026-04-19", context = "Before this decision the CLI exposed two separate command hierarchies for infrastructure lifecycle: `prvng taskserv ` for taskserv-mode components and `prvng component list|show|info` for read-only introspection. Write operations (install, delete, update, reinstall, restart, backup, restore) were routed through `taskserv.nu` regardless of the component's actual deploy mode (taskserv/cluster/container). This created three problems: (1) a cluster-mode component like postgresql required `prvng taskserv create postgresql` even though it runs as a Kubernetes deployment — the semantics were wrong and confused operators; (2) the script resolution used only the Tier-1 `install-.sh` convention — Tier-2 per-op scripts (`delete-.sh`, `backup-.sh`) were never invoked; (3) the orchestrator endpoint `/workflows/taskserv/create` accepted a `TaskservWorkflow` body that embedded no mode information, so the worker had no way to route script execution by mode. A prerequisite condition also existed: no precondition gate validated that capability providers (e.g. k0s for cluster-mode, NFS for democratic_csi) were healthy before enqueuing a write.", decision = "Replace `prvng taskserv` with a unified `prvng component ` command that dispatches polymorphically by deploy mode. The implementation has five parts: (1) A new orchestrator endpoint `/api/v1/workflows/component/{op}` accepts `ComponentWorkflow` (workspace + infra + component + server + namespace + ssh_user + ssh_key_path + settings + check_mode + provisioning). The `{op}` path segment is validated against a fixed allowlist (install, delete, update, reinstall, restart, backup, restore, check-updates for write; status, health, list, show for read). (2) A precondition gate (`src/preconditions.rs`) runs before task enqueue for write ops: it fast-fails on `.provisioning-state.ncl` terminal states (failed/error), then runs live SSH probes via the system `ssh` binary with a 15-second global timeout. The gate is skipped for read-only ops and for taskserv-mode components (root provider, no dependencies). (3) Script resolution is upgraded to two tiers: Tier 2 (`-.sh`) is preferred; Tier 1 (`install-.sh` + `CMD_TASK=` env) is the fallback. `cluster_deploy.nu` calls `get-component-script-path` from `extensions/discovery.nu` and merges `CMD_TASK` only for Tier-1 scripts. (4) The NuShell CLI adds eight exported lifecycle commands to `cli/components.nu` (install/delete/update/reinstall/restart/backup/restore/check-updates) and two new functions to `platform/clients/orchestrator.nu` (`orch-submit-component`, `orch-wait-task`). (5) A feature flag `orchestrator.features.enable_component_endpoint` (default: true) allows a one-line rollback to 404 without binary redeployment.", rationale = [ { claim = "Single endpoint with path-param op is cleaner than one endpoint per operation", detail = "The alternative was `/api/v1/workflows/component/install`, `/api/v1/workflows/component/delete`, etc. — eight separate routes with identical handler bodies. A single parameterised route `/api/v1/workflows/component/{op}` validates the op at the top of the handler and reuses one `WorkflowTask` construction path. The operation name is forwarded as the Nu script's `CMD_TASK` env var for Tier-1 scripts, so the routing is structural, not branching.", }, { claim = "SSH via tokio::process::Command is safer than the russh crate for the precondition gate", detail = "The codebase has a `russh` dependency but `pool/executor.rs` is a stub. Adding a live russh integration solely for the precondition gate would require implementing the full connection pool — a significant scope addition. The system `ssh` binary with `BatchMode=yes StrictHostKeyChecking=no ConnectTimeout=N` is a one-file implementation with no state, no connection pool management, and no key-format assumptions. The probe is fire-and-forget (invoked once per precondition check, not per request), so there is no meaningful performance loss from process spawn overhead.", }, { claim = "The feature flag is a deployment-level rollback, not an A/B switch", detail = "The flag `enable_component_endpoint` returns HTTP 404 when false. This is intentional: callers that haven't migrated receive a 404 and fall back to the legacy `/workflows/taskserv/create` route, which remains in place. The flag is not intended for permanent dual-track operation — it exists to give a one-flag rollback path during the migration window, after which the legacy route and taskserv.nu will be deleted.", }, { claim = "Deleting prvng taskserv is a breaking change accepted as deliberate", detail = "All three call sites for the old taskserv API (NuShell CLI, control-center-ui, MCP server) are migrated atomically in this decision. The commands-registry.ncl entry is removed, the `t` single-char alias is removed, and the bash wrapper dispatch case is removed. `taskserv.nu` is retained temporarily to avoid breaking any in-flight sessions. Its permanent deletion is a follow-on commit after confirming no un-migrated callers exist.", }, ], consequences = { positive = [ "prvng component install postgresql correctly names the operation for both cluster and taskserv modes", "Backup and restore operations are now surfaced via prvng component backup/restore — previously inaccessible from the CLI", "Precondition gate prevents cascading failures when a capability provider is unhealthy before a write operation", "Tier-2 per-op scripts are now invoked when present — operators can specialise delete/backup logic without patching the generic install script", ], negative = [ "Breaking change: prvng taskserv is removed. Any external scripts or documentation referencing the old command require update", "The precondition gate adds 0–15 seconds to write operations when providers are unhealthy or unreachable — healthy-path overhead is ~2s for the state-file fast-fail", ], neutral = [ "taskserv.nu is not deleted in this commit — a follow-on cleanup commit removes it once in-flight migration is confirmed", "The legacy /workflows/taskserv/create endpoint is preserved indefinitely until the feature flag is toggled and the route removed in a future cleanup", ], }, }