2026-05-04 18:23:52 +01:00

lian-build

炼 — alchemical refinement. Standalone build substrate for ephemeral BuildKit sessions.

lian-build is a single Rust binary that orchestrates ephemeral remote BuildKit runs against a pluggable orchestrator.

Callers (provisioning, vapora, workspace CI) supply intent as BuildDirectives in NCL; lian-build controls compute provisioning, OCI cache flow, and multi-actor session namespacing.

Compute (hcloud / proxmox / docker-local) and registry (zot / harbor / ghcr) are plug-in slots.

Crate    lian-build (binary), 0.1.0
Status   Beta · pre-1.0, schema and CLI surface still in flux
Edition  2021
ADRs     adr-001 lift-out · adr-002 CLI subcommand discipline · adr-003 Nickel via subprocess

What it is

  • An orchestrator client, not a buildkitd. It spawns runners through an external HTTP orchestrator (default http://localhost:9011), rsyncs the build context, then drives buildctl over SSH.
  • A directives consumer. Schemas live in schemas/*.ncl; the Rust types in src/directives.rs mirror them and round-trip via serde_json.
  • An event publisher. started / completed / failed lifecycle events go to NATS at <prefix>.<workspace>.build.<event> (best-effort — NATS failures never fail a build).
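
The subject layout and the best-effort rule can be sketched as follows. This is a minimal illustration, not the code in src/nats_events.rs: the BuildEvent enum, build_subject, and publish_best_effort are hypothetical names, and the publisher is a plain closure standing in for the real NATS client.

```rust
/// The three lifecycle events named above (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuildEvent {
    Started,
    Completed,
    Failed,
}

impl BuildEvent {
    pub fn as_str(self) -> &'static str {
        match self {
            BuildEvent::Started => "started",
            BuildEvent::Completed => "completed",
            BuildEvent::Failed => "failed",
        }
    }
}

/// Builds `<prefix>.<workspace>.build.<event>`.
pub fn build_subject(prefix: &str, workspace: &str, event: BuildEvent) -> String {
    format!("{prefix}.{workspace}.build.{}", event.as_str())
}

/// Best-effort publish: a failing publisher is logged to stderr and swallowed,
/// so the caller's build result is never affected.
pub fn publish_best_effort<F>(subject: &str, publish: F)
where
    F: FnOnce(&str) -> Result<(), String>,
{
    if let Err(err) = publish(subject) {
        eprintln!("warn: NATS publish to {subject} failed: {err}");
    }
}
```

Returning nothing from publish_best_effort is deliberate in this sketch: the caller has no error to propagate, which is one way to make "NATS failures never fail a build" hold by construction.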

What it is not

  • Not a library. There is no lib.rs. Public surface is the CLI and the NCL schemas.
  • Not coupled to provisioning. ADR-001 forbids importing provisioning, platform-config, or any crate prefixed with stratum- (grep-checked).
  • Not a Nickel runtime. NCL is parsed by shelling out to the nickel CLI (ADR-003). nickel must be on $PATH.
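
The subprocess approach might look roughly like the sketch below. It assumes the invocation is `nickel export --format json <file>`; the function names are hypothetical, and the real src/directives.rs may pass other arguments (for example import paths).

```rust
use std::path::Path;
use std::process::Command;

/// Prepares `nickel export --format json <file>` without running it.
/// The exact argument list is an assumption for illustration.
pub fn nickel_export_command(file: &Path) -> Command {
    let mut cmd = Command::new("nickel");
    cmd.arg("export").arg("--format").arg("json").arg(file);
    cmd
}

/// Runs the command and returns the exported JSON text from stdout.
/// A spawn failure usually means `nickel` is not on $PATH.
pub fn export_json(file: &Path) -> Result<String, String> {
    let output = nickel_export_command(file)
        .output()
        .map_err(|e| format!("failed to spawn nickel (is it on $PATH?): {e}"))?;
    if !output.status.success() {
        // nickel reports evaluation and contract errors on stderr.
        return Err(String::from_utf8_lossy(&output.stderr).into_owned());
    }
    String::from_utf8(output.stdout).map_err(|e| e.to_string())
}
```

The returned JSON string would then be handed to a deserializer; that step is left out here to keep the sketch free of external crates.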

CLI

Two subcommands, no default, no flat-arg fallback (ADR-002). All logs go to stderr; stdout is reserved for one structured envelope per invocation.

lian-build build

Run a build on an ephemeral runner, configured through flat args or a directives file:

lian-build build \
  --workspace <name> \
  --context <dir> \
  --image <fully-qualified-ref> \
  --ssh-key <path> \
  [--directives <file.ncl>] \
  [--dockerfile Dockerfile] \
  [--cache-from <ref>] [--cache-to <ref>] \
  [--language rust|go|java|...] \
  [--platforms linux/amd64,linux/arm64] \
  [--runner-image <id>] \
  [--orchestrator-url <url>] \
  [--nats-url <url>] [--nats-nkey-seed <seed>] [--nats-subject-prefix <p>]

When --directives <file.ncl> is supplied it takes precedence over flat-arg fields except --ssh-key, which is still required separately (directives don't yet carry an inline SSH reference).

Environment fallbacks: BUILDKIT_WORKSPACE, BUILDKIT_SSH_KEY, BUILDKIT_RUNNER_IMAGE, ORCHESTRATOR_URL, NATS_URL, NATS_NKEY_SEED, NATS_SUBJECT_PREFIX.
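
A fallback of this kind can be sketched as "flag first, then environment". The helper name flag_or_env is hypothetical, and the lookup is passed in as a closure so the example stays testable without touching the process environment.

```rust
/// Returns the CLI flag value when present, otherwise whatever `lookup`
/// yields for `env_key`. In a real binary `lookup` would typically be
/// `|k| std::env::var(k).ok()`.
pub fn flag_or_env<F>(flag: Option<&str>, env_key: &str, lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    flag.map(str::to_owned).or_else(|| lookup(env_key))
}
```

For example, `flag_or_env(None, "BUILDKIT_WORKSPACE", |k| std::env::var(k).ok())` would pick up the workspace from the environment when --workspace is omitted.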

lian-build integrate

Federated probe. Reads a SecretDeliveryContext JSON envelope from stdin, emits a ResultEnvelope JSON line on stdout, optionally publishes a completion event to NATS.

echo '<context-json>' | lian-build integrate \
  [--nats-url <url>] [--nats-nkey-seed <seed>]

Omit --nats-url to skip event emission with a warning (the envelope still goes to stdout).


Three-tier runner sizing

sizing::resolve walks three sources, first match wins:

  1. Explicit: .build-spec.ncl in the build context, validated against schemas/build_spec.ncl (bounded_cpu_ ≤ 256, bounded_time_budget_ ≤ 1440 min).
  2. P95 historical: OrchestratorClient::get_p95(workspace) returns the measured CPU / memory P95 from prior runs; each value is multiplied by 1.2 and floored at a minimum of 2 CPU / 4 GB.
  3. Language defaults: RunnerSize::language_default(lang):

     Language               CPU  Memory  Time budget
     rust                   4    8 GB    60 min
     go                     2    4 GB    30 min
     java / kotlin / scala  4    8 GB    45 min
     (default)              2    4 GB    30 min
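
The first-match-wins walk can be sketched as below. The field names, the P95 struct, and the resolve signature are assumptions for illustration; the sketch also rounds the scaled P95 up to whole units and borrows the time budget from the language default in tier 2, neither of which is confirmed by the text above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RunnerSize {
    pub cpu: u32,
    pub memory_gb: u32,
    pub time_budget_min: u32,
}

impl RunnerSize {
    /// Tier 3: the language-default table.
    pub fn language_default(lang: &str) -> RunnerSize {
        match lang {
            "rust" => RunnerSize { cpu: 4, memory_gb: 8, time_budget_min: 60 },
            "java" | "kotlin" | "scala" => {
                RunnerSize { cpu: 4, memory_gb: 8, time_budget_min: 45 }
            }
            // "go" and every unlisted language share the same row.
            _ => RunnerSize { cpu: 2, memory_gb: 4, time_budget_min: 30 },
        }
    }
}

/// Measured P95 usage from prior runs (hypothetical shape).
pub struct P95 {
    pub cpu: f64,
    pub memory_gb: f64,
}

/// Walks explicit spec, then P95 history, then language defaults.
pub fn resolve(explicit: Option<RunnerSize>, p95: Option<P95>, lang: &str) -> RunnerSize {
    // Tier 1: an explicit .build-spec.ncl wins outright.
    if let Some(size) = explicit {
        return size;
    }
    // Tier 2: P95 scaled by 1.2, rounded up (assumption), floored at 2 CPU / 4 GB.
    if let Some(m) = p95 {
        return RunnerSize {
            cpu: ((m.cpu * 1.2).ceil() as u32).max(2),
            memory_gb: ((m.memory_gb * 1.2).ceil() as u32).max(4),
            // Assumption: tier 2 has no measured time budget, so reuse the default.
            time_budget_min: RunnerSize::language_default(lang).time_budget_min,
        };
    }
    // Tier 3: fall back to the table.
    RunnerSize::language_default(lang)
}
```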

On exit code 137 or stderr containing OOM / Killed, the build retries once at the next tier (cx22 → cx32 → cx42 → cx52). The retry cap is hard-coded (retry::MAX_OOM_RETRIES = 1).
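
A minimal sketch of that retry rule follows. The constant names come from the module layout; their types, the helper functions, and the case-sensitive substring match are assumptions, and the build itself is modelled as a closure returning an exit code and stderr text.

```rust
pub const OOM_EXIT_CODE: i32 = 137;
pub const MAX_OOM_RETRIES: u32 = 1;
pub const SIZE_TIERS: [&str; 4] = ["cx22", "cx32", "cx42", "cx52"];

/// True when the run looks like it was killed for lack of memory.
pub fn is_oom(exit_code: i32, stderr: &str) -> bool {
    exit_code == OOM_EXIT_CODE || stderr.contains("OOM") || stderr.contains("Killed")
}

/// Next larger tier, or None at the top of the ladder or for an unknown tier.
pub fn next_tier(current: &str) -> Option<&'static str> {
    let idx = SIZE_TIERS.iter().position(|t| *t == current)?;
    SIZE_TIERS.get(idx + 1).copied()
}

/// Runs `run` on `start_tier`, retrying on OOM at the next tier at most
/// MAX_OOM_RETRIES times. Returns the final exit code and the tier used.
pub fn run_with_oom_retry<F>(start_tier: &'static str, mut run: F) -> (i32, &'static str)
where
    F: FnMut(&str) -> (i32, String),
{
    let mut tier = start_tier;
    let mut retries = 0;
    loop {
        let (code, stderr) = run(tier);
        if is_oom(code, &stderr) && retries < MAX_OOM_RETRIES {
            if let Some(bigger) = next_tier(tier) {
                tier = bigger;
                retries += 1;
                continue;
            }
        }
        return (code, tier);
    }
}
```

With the cap at 1, a build that keeps hitting OOM runs exactly twice: once at the resolved tier and once at the next one up.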


Cache namespacing

Two layers, defined in schemas/cache_policy.ncl and enforced in src/cache.rs:

  • ci/<workspace>/* — canonical, written by CI, read-only to sessions.
  • dev/<actor-id>-<workspace>/* — ephemeral per-session actor.

Sessions read from both layers; CI never imports from dev/*. These invariants are guarded by tests in src/cache.rs.
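
The two layers and their read/write rules can be sketched as follows. BuildMode is named in the schema, but the variants shown here, and the functions write_namespace and read_namespaces, are hypothetical; the real src/cache.rs produces buildctl cache flags rather than bare namespace strings.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BuildMode {
    Ci,
    Session { actor_id: String },
}

/// Namespace a build writes its cache to.
pub fn write_namespace(mode: &BuildMode, workspace: &str) -> String {
    match mode {
        BuildMode::Ci => format!("ci/{workspace}"),
        BuildMode::Session { actor_id } => format!("dev/{actor_id}-{workspace}"),
    }
}

/// Namespaces a build may import cache from.
/// CI reads only ci/*; sessions read ci/* plus their own dev/* layer.
pub fn read_namespaces(mode: &BuildMode, workspace: &str) -> Vec<String> {
    match mode {
        BuildMode::Ci => vec![format!("ci/{workspace}")],
        BuildMode::Session { .. } => vec![
            format!("ci/{workspace}"),
            write_namespace(mode, workspace),
        ],
    }
}
```

Because a session's write namespace always starts with dev/, it can never overwrite the canonical ci/ layer, which is what "read-only to sessions" amounts to in this sketch.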


NCL contract surface

Schema                         Defines
schemas/build_directives.ncl   BuildDirectives, BuildArtifact, ComputeProviderRef, RegistryProviderRef, RunnerOverride, NatsEventConfig
schemas/build_spec.ncl         BuildSpec (per-repo .build-spec.ncl)
schemas/cache_policy.ncl       CachePolicy, BuildMode, SessionActor, SessionCacheDisposition
schemas/build_result.ncl       BuildResult envelope shape
schemas/vault_refs.ncl         VaultCredRef, VaultKeyRef
defaults/build_directives.ncl  make_* constructors, ci_cache_policy / session_cache_policy helpers

Validate with the nickel CLI:

nickel export \
  --import-path /Users/Akasha/Development/lian-build \
  --import-path /Users/Akasha/Development/lian-build/schemas \
  --import-path /Users/Akasha/Development/ontoref \
  --import-path /Users/Akasha/Development/ontoref/ontology \
  schemas/build_directives.ncl

Hard constraints (ADR-001)

These two rules are grep-checked and define the lift-out boundary:

  1. no-provisioning-lib-import: Cargo.toml and src/ must not match platform-config|provisioning|stratum-. The platform-nats path-dep from stratumiops is explicitly allowed; the constraint targets the provisioning workspace and crates prefixed with stratum-.
  2. build-directives-ncl-vocabulary: src/ must not match provisioning_workspace|vapora_|woodpecker_. Caller-specific logic stays in caller-supplied directives, not in core.
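
What the grep check does can be approximated in a few lines. This sketch uses plain substring matching in place of grep alternation; the constant and function names are hypothetical, and the project's actual check may be a shell or just recipe rather than Rust.

```rust
/// Alternatives from rule 1 (checked against Cargo.toml and src/).
pub const FORBIDDEN_IMPORTS: [&str; 3] = ["platform-config", "provisioning", "stratum-"];
/// Alternatives from rule 2 (checked against src/).
pub const FORBIDDEN_VOCAB: [&str; 3] = ["provisioning_workspace", "vapora_", "woodpecker_"];

/// Returns (1-based line number, line) for every line containing any pattern.
pub fn violations<'a>(text: &'a str, patterns: &[&str]) -> Vec<(usize, &'a str)> {
    text.lines()
        .enumerate()
        .filter(|(_, line)| patterns.iter().any(|p| line.contains(*p)))
        .map(|(i, line)| (i + 1, line))
        .collect()
}
```

Note that a path containing stratumiops does not trip the stratum- pattern, because the pattern requires the trailing hyphen; that is consistent with the platform-nats path-dep being allowed.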

Build / test / run

cargo build                          # debug
cargo build --release                # release at target/release/lian-build
cargo clippy -- -D warnings          # mandatory before commit
cargo fmt
cargo test                           # full suite
cargo test <name>                    # single test by substring
cargo test -- --nocapture            # show tracing in tests

just recipes live in justfiles/{build,test,ci}.just. Run just (or just help) to list them.

Stratumiops peer dependency

platform-nats is consumed as a local path dependency from /Users/Akasha/Development/stratumiops/crates/platform-nats (declared directly in Cargo.toml). That path must exist for cargo build to succeed — there is no feature flag to disable it.


Module layout

src/
  main.rs                # CLI parsing, top-level orchestration, OOM-retry control flow
  buildctl_runner.rs     # rsync_context + run_buildctl over SSH; OOM_EXIT_CODE = 137
  cache.rs               # BuildMode, build_cache_flags, ci/* vs dev/<actor>/* invariants
  directives.rs          # BuildDirectives ↔ JSON via `nickel export` subprocess
  integration/           # federated probe handler (stdin → ResultEnvelope on stdout)
    context.rs · event.rs · handler.rs · result.rs · mod.rs
  nats_events.rs         # BuildEventPublisher over platform_nats::EventStream
  orchestrator_client.rs # HTTP client: spawn_runner, destroy_runner, get_p95, record_metrics
  retry.rs               # MAX_OOM_RETRIES = 1; SIZE_TIERS walk (cx22→cx32→cx42→cx52)
  sizing.rs              # three-tier resolution

Operational surface:

adrs/                    # accepted ADRs (NCL)
.ontology/               # core, state, gate, manifest — design intent
reflection/              # modes, backlog, qa
schemas/                 # caller-facing NCL contracts
defaults/                # constructor / helper NCL
catalog/{domains,modes}/ # federated peer publishing layout
examples/sample.ncl      # example BuildDirectives instance
tests/fixtures/          # integration test fixtures
.coder/                  # session interaction files (not product docs)
.claude/                 # operational config (symlinks into shared dev-system)

Further reading

  • adrs/adr-001-lian-build-as-standalone.ncl — why this project exists, alternatives rejected, the two grep-checked invariants.
  • adrs/adr-002-cli-subcommand-discipline.ncl — subcommand-only surface, stderr/stdout discipline.
  • adrs/adr-003-nickel-via-subprocess.ncl — why nickel is on $PATH, not in Cargo.toml.
  • .ontology/core.ncl — axioms (ephemeral-builds, provider-pluggability, cache-content-addressed, caller-supplies-directives), tensions, practices.
  • .ontology/state.ncl — five maturity dimensions and their current transitions (provider-pluggability, session-multi-actor, active-active-registry, caller-integration, peer-publishing).
  • CHANGES.md — record of accepted decisions and visible surface changes.