---
theme: default
title: "炼 lian-build"
titleTemplate: '%s — Ephemeral BuildKit Substrate'
layout: cover
keywords: Rust,BuildKit,CI,sccache,cargo-chef,lian-build,lamina
download: true
exportFilename: lian-build-presentation
monaco: true
remoteAssets: true
selectable: true
colorSchema: dark
lineNumbers: true
themeConfig:
  primary: '#ce422b'
  logoHeader: '/ferris.svg'
fonts:
  mono: 'Victor Mono'
background: /jude-infantini-mI-QcAP95Ok-unsplash.jpg
class: 'justify-center flex flex-cols photo-bg'
---
<h1 class="absolute top-15 left-3/10 font-bold mt-3 text-5xl">lian-build</h1>
<h2 class="absolute top-30 left-1/10 font-medium my-11 text-2xl opacity-80">
Ephemeral BuildKit — from <code>docker build</code> to a substrate
</h2>
<div class="absolute top-57 left-2/10 text-sm opacity-60 font-mono">
BuildKit · cargo-chef · sccache · NATS · Nickel · lamina
</div>
<div class="absolute top-65 left-3/10"><img src="/lian-h.svg" width="420"></div>
<img class="absolute bottom-10 right-10 w-32" src="/ferris.svg">
<style scoped>
h1, h2, div { z-index: 10; }
code { background: rgba(206,66,43,0.2); padding: 0.1em 0.3em; border-radius: 3px; }
</style>
---
# The Problem
**Rust CI: 8 minutes cold. Every. Single. Build.**
<div>
### What happens today without a substrate
<div class="absolute right-2 top-4 w-110 box-highlight mt-2">
Every developer and CI pipeline reinvents the same wheel — and pays full price each time.
</div>
```
push → CI triggers
  └─ docker build .
       ├─ FROM rust:latest              # 1.8 GB pull
       ├─ COPY Cargo.toml Cargo.lock    # layer invalidated
       ├─ RUN cargo build --deps        # 4-6 min compiling serde, tokio…
       ├─ COPY src/                     # always changes
       └─ RUN cargo build               # 2-4 min compiling your code
```
</div>
<div class="grid grid-cols-2 gap-6 mt-4">
<div>
### The compounding failures
- **Cache bust cascade** — `Cargo.lock`<br> change invalidates every downstream layer
- **No cross-run reuse** — parallel PRs duplicate identical dep compilation
</div>
<div>
- **Registry pull cost** — base image re-pulled<br> if not pinned
- **OOM silent failure** — exit 137, no retry, <br> build marked failed
</div>
</div>
---
# Why Not `docker build`?
<div class="grid grid-cols-2 gap-6 mt-15">
<div>
### `docker build` limitations
```bash
# No runner control
docker build . # uses daemon defaults
# no VM sizing, no OOM retry
# No external cache injection
# --cache-from only reads local/registry layers
# Can't mount S3-backed sccache bucket
# No SSH forwarding into RUN steps
# (without BuildKit secret/SSH mounts)
# No structured events
# build started/finished = exit code only
```
</div>
<div>
<div class="box-highlight mt-11 text-sm">
<code>docker build</code> is a convenience wrapper. When you need <em>control</em>, you need BuildKit directly.
</div>
</div>
</div>
---
# Why Not `docker build`?
<div>
### What we actually need
| Need | `docker` | BuildKit |
|------|---------------|----------|
| Cache mounts | ✗ | `--mount=type=cache` |
| SSH into build | partial | `--mount=type=ssh` |
| Secret injection | ✗ | `--mount=type=secret` |
| Remote daemon | cumbersome | `buildctl --addr` |
| Structured output | exit code | `--progress=rawjson` |
| Parallel stages | limited | LLB graph native |
</div>
---
# Why BuildKit
**BuildKit is not a build tool. It's a graph execution engine.**
```
Your Dockerfile ──► LLB (Low-Level Build) graph ──► parallel DAG execution
                      ├─ content-addressed cache (every node keyed by its inputs)
                      ├─ prunable: unchanged nodes cost nothing
                      ├─ remote execution: buildkitd can run on any host buildctl can reach
                      └─ mount primitives: cache / secret / ssh / bind
```
<div class="grid grid-cols-3 gap-4 mt-4 text-sm">
<div class="border border-rust-orange/30 rounded p-3">
### Cache mounts
```dockerfile
RUN --mount=type=cache,\
target=/usr/local/cargo/registry \
cargo build
```
Registry downloads survive across builds *inside* the daemon.
</div>
<div class="border border-rust-orange/30 rounded p-3">
### Secret mounts
```dockerfile
RUN --mount=type=secret,\
id=sccache_creds \
SCCACHE_S3_USE_SSL=true \
cargo build
```
Credentials never written to layer.
</div>
<div class="border border-rust-orange/30 rounded p-3">
### Remote daemon
```bash
buildctl \
--addr ssh://runner:1234 \
build \
--frontend dockerfile.v0 \
--local context=. \
--output type=image,name=…
```
Daemon on ephemeral VM, client local.
</div>
</div>
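---
# Why BuildKit: Content-Addressed Keys

A toy sketch of "every node keyed by its inputs", not BuildKit's real implementation (BuildKit uses digests of LLB operations and file content). The function names and the two-branch graph are assumptions made for illustration. The point it shows: a source change alters the final node's key while the dependency node's key stays the same.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Key of a graph node: its operation plus the keys of its inputs.
fn node_key(op: &str, input_keys: &[u64]) -> u64 {
    let mut h = DefaultHasher::new();
    op.hash(&mut h);
    input_keys.hash(&mut h);
    h.finish()
}

/// Keys for a two-branch build graph: returns (deps node, final node).
fn build_keys(lockfile: &str, src: &str) -> (u64, u64) {
    let lock = node_key(lockfile, &[]);
    let deps = node_key("cargo chef cook", &[lock]);
    let source = node_key(src, &[]);
    let fin = node_key("cargo build", &[deps, source]);
    (deps, fin)
}
```

Calling `build_keys` twice with the same lockfile and different sources yields the same first key and a different second key, which is why the dependency node can be skipped.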
---
# cargo-chef — Dependency Layer Surgery
**Problem:** any `src/` change busts the dependency compilation layer.
**cargo-chef solution:** separate the dependency graph compilation from your code.
```dockerfile
# Stage 1 (planner): extract the dependency recipe, no compilation
FROM rust:1.82 AS planner
WORKDIR /app
RUN cargo install cargo-chef
COPY . .
RUN cargo chef prepare --recipe-path recipe.json   # only Cargo.toml/Cargo.lock matter
# Stage 2 (cooker): compile all deps from the recipe (expensive, cached)
FROM rust:1.82 AS cooker
WORKDIR /app
RUN cargo install cargo-chef
COPY --from=planner /app/recipe.json recipe.json
RUN cargo chef cook --release --recipe-path recipe.json   # deps compiled, layer stable
# Stage 3 (final): only your code compiles (fast, always runs)
FROM rust:1.82 AS final
WORKDIR /app
COPY --from=cooker /app/target target
COPY --from=cooker $CARGO_HOME $CARGO_HOME
COPY . .
RUN cargo build --release
```
---
# cargo-chef — Dependency Layer Surgery
<div class="box-highlight mt-3 text-sm">
The <em>planner</em> stage is cheap — it only reads metadata.
The <em>cooker</em> layer is stable as long as <code>Cargo.lock</code> doesn't change.
Source changes never touch the cooker.
</div>
---
# sccache — Compiler-Level Cache
**cargo-chef** caches at the *crate dependency graph* level.<br>
**sccache** caches at the *individual compilation unit* level.
<div class="grid grid-cols-2 gap-6 mt-4">
```
cargo build
  └─ rustc src/sizing.rs → .rlib
       └─ sccache wraps rustc:
            hash(source + flags + toolchain) → S3 lookup
            HIT  → download cached .rlib (seconds)
            MISS → compile + upload (minutes)
```
<div class="-mt12">
<h3 class="ml-15"> Orthogonal cache layers </h3>
| Layer | Tool | Granularity | Backend |
|-------|------|-------------|---------|
| Toolchain image | lamina | Docker layer | Registry |
| Dep compilation | cargo-chef | Cargo crate graph | Docker layer |
| Artifact cache | sccache | Individual `.rlib` | S3 / GCS / Redis |
</div>
</div>
<div class="grid grid-cols-2 gap-6 -mt-45">
<div>
### Secret mount pattern (lamina canonical)
```dockerfile
RUN --mount=type=secret,id=sccache_creds,\
target=/run/secrets/sccache_creds \
. /run/secrets/sccache_creds && \
RUSTC_WRAPPER=sccache \
SCCACHE_BUCKET=$SCCACHE_BUCKET \
cargo chef cook --release \
--recipe-path recipe.json
```
</div>
</div>
Credentials injected at build time, not baked into layer.
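---
# sccache: Lookup Flow Sketch

The hash, lookup, HIT/MISS flow can be sketched with an in-memory map standing in for the S3 bucket. This is an illustration under assumptions, not sccache's code: `CompileCache`, `Outcome`, and the placeholder "artifact" are hypothetical, and the real tool hashes preprocessed inputs, compiler binaries, and environment as well.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

#[derive(Debug, PartialEq)]
enum Outcome {
    Hit,
    Miss,
}

/// In-memory stand-in for the remote bucket.
struct CompileCache {
    store: HashMap<u64, Vec<u8>>,
}

impl CompileCache {
    fn new() -> Self {
        Self { store: HashMap::new() }
    }

    /// Cache key over the inputs that can change the compiler output.
    fn key(source: &str, flags: &[&str], toolchain: &str) -> u64 {
        let mut h = DefaultHasher::new();
        source.hash(&mut h);
        flags.hash(&mut h);
        toolchain.hash(&mut h);
        h.finish()
    }

    fn compile(&mut self, source: &str, flags: &[&str], toolchain: &str) -> Outcome {
        let key = Self::key(source, flags, toolchain);
        if self.store.contains_key(&key) {
            return Outcome::Hit; // a real cache would download the artifact here
        }
        let artifact = source.as_bytes().to_vec(); // placeholder for rustc output
        self.store.insert(key, artifact);
        Outcome::Miss
    }
}
```

Compiling the same source with the same flags and toolchain twice gives a MISS and then a HIT; changing any one input produces a new key and another MISS.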
---
# cargo-chef + sccache + BuildKit Together
**Each tool solves a different cache miss problem.**
```
Cold build (first run)
──────────────────────────────────────────────────────────────────────────────
planner ──► recipe.json (always cheap: metadata only)
cooker ──► compile 200 deps from scratch (6 min — sccache MISS: upload)
final ──► compile your code (2 min — sccache MISS: upload)
Warm build — Cargo.lock unchanged, your code changed
──────────────────────────────────────────────────────────────────────────────
planner ──► recipe.json (cheap)
cooker ──► BuildKit layer HIT (identical recipe) (0 sec — skip entirely)
final ──► compile your code (sccache MISS if changed: 2 min)
Hot build — src/ minor change
──────────────────────────────────────────────────────────────────────────────
planner ──► recipe.json (cheap)
cooker ──► BuildKit layer HIT (0 sec)
final ──► rustc on the changed crate only; unchanged crates → sccache HIT (seconds)
```
<div class="box-highlight mt-3 text-sm">
Three cache layers, three different scopes. When one misses, the others still win.
BuildKit orders the stages (planner, cooker, final); cargo-chef and sccache operate independently inside them.
</div>
---
# lian-build — Architecture
<div class="absolute right-30 top-8"><img src="/lian-h.svg" width="140"></div>
**Single binary. Orchestrates remote BuildKit runs. Emits lifecycle events.**
```
lian-build CLI
  ├─ 1. resolve runner size       sizing::resolve(.build-spec.ncl → P95 → lang default)
  ├─ 2. publish build.started     NATS: <prefix>.<workspace>.build.started
  ├─ 3. spawn runner              POST /api/v1/vm-pool → lease_id
  │       └─ hcloud cax11-cax41 (ARM) | proxmox | docker-local
  ├─ 4. rsync build context       rsync -e ssh context/ runner:/workspace/
  ├─ 5. run buildctl over SSH     buildctl --addr ssh://runner build …
  │       └─ OOM exit 137? retry once on next size tier (ADR-039)
  ├─ 6. record_metrics            POST /api/v1/metrics (cpu_p95, mem_p95)
  ├─ 7. destroy runner            DELETE /api/v1/vm-pool/{lease_id} [always]
  └─ 8. publish build.completed/failed
```
<div class="text-xs opacity-60 mt-2">Compute and registry are plug-in slots. Callers supply <code>BuildDirectives</code> in Nickel — no caller identity in core code.</div>
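---
# lian-build: Lifecycle Sketch

A minimal sketch of the ordering guarantee in steps 2, 7, and 8: the runner is destroyed on both the success and failure paths, and the final event reflects the build result. `Step`, `subject`, and `run_build` are hypothetical names; a `Vec` stands in for NATS and the VM pool API, which are not called here.

```rust
/// Side effects recorded in order, standing in for NATS and the VM pool API.
#[derive(Debug, PartialEq)]
enum Step {
    Published(String),
    Spawned,
    Destroyed,
}

/// Subject layout from the slide: <prefix>.<workspace>.build.<event>
fn subject(prefix: &str, workspace: &str, event: &str) -> String {
    format!("{prefix}.{workspace}.build.{event}")
}

fn run_build<F>(prefix: &str, workspace: &str, build: F) -> Vec<Step>
where
    F: FnOnce() -> Result<(), String>,
{
    let mut log = vec![
        Step::Published(subject(prefix, workspace, "started")),
        Step::Spawned,
    ];
    let result = build();
    // Teardown runs on both paths, before the final event is published.
    log.push(Step::Destroyed);
    let event = if result.is_ok() { "completed" } else { "failed" };
    log.push(Step::Published(subject(prefix, workspace, event)));
    log
}
```

This sketch does not cover a panic inside `build`; a production version would more likely tie teardown to a `Drop` guard so the lease is released in that case too.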
---
# Three-Tier Sizing Resolution
**First match wins. Explicit beats historical beats defaults.**
<div class="grid grid-cols-3 gap-4 mt-4 text-sm">
<div class="border border-rust-orange/40 rounded p-3">
### Tier 1 — Explicit
```nix
# .build-spec.ncl
# in build context
{
runner_type = "cax31",
# authoritative
# or raw resources:
cpu = 8,
memory_gb = 16,
time_budget_min = 90,
}
```
Validated against `schemas/build_spec.ncl`.<br> Repo-level contract.
</div>
<div class="border border-rust-orange/30 rounded p-3">
### Tier 2 — P95 Historical
```
GET /api/v1/p95?workspace=…
→ {
  cpu_p95: 3.2, mem_p95: 6.1
}
effective = {
  cpu:    ceil(3.2 × 1.2) = 4,
  mem_gb: ceil(6.1 × 1.2) = 8,
}
floor: at least 2 cpu, 4 GB
```
Measured from prior runs. <br>Advisory — operator must approve before production use.
</div>
<div class="border border-rust-orange/20 rounded p-3">
### Tier 3 — Lang Default
```rust
// (cpu, memory_gb, time_budget_min)
match language {
    "rust" => (4, 8, 60),
    "go"   => (2, 4, 30),
    "java" => (4, 8, 45),
    _      => (2, 4, 30),
}
```
Conservative floor. Rust is more expensive than Go — that's structural.
</div>
</div>
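---
# Sizing Resolution: a Sketch

The three tiers can be sketched as one fall-through function. This is an illustration under assumptions, not the actual `sizing::resolve`: the `Size` type, the function signatures, and the omission of the time budget and of the operator-approval step for P95 values are all simplifications.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
struct Size {
    cpu: u32,
    mem_gb: u32,
}

const HEADROOM: f64 = 1.2;
const FLOOR: Size = Size { cpu: 2, mem_gb: 4 };

/// Tier 3: language defaults from the slide (time budget omitted).
fn lang_default(language: &str) -> Size {
    match language {
        "rust" | "java" => Size { cpu: 4, mem_gb: 8 },
        _ => Size { cpu: 2, mem_gb: 4 },
    }
}

/// Tier 2: P95 measurements scaled by 1.2, rounded up, never below the floor.
fn from_p95(cpu_p95: f64, mem_p95: f64) -> Size {
    let scale = |v: f64| (v * HEADROOM).ceil() as u32;
    Size {
        cpu: scale(cpu_p95).max(FLOOR.cpu),
        mem_gb: scale(mem_p95).max(FLOOR.mem_gb),
    }
}

/// First match wins: explicit, then historical, then language default.
fn resolve(explicit: Option<Size>, p95: Option<(f64, f64)>, language: &str) -> Size {
    explicit
        .or_else(|| p95.map(|(cpu, mem)| from_p95(cpu, mem)))
        .unwrap_or_else(|| lang_default(language))
}
```

With the slide's numbers, `resolve(None, Some((3.2, 6.1)), "rust")` gives 4 CPUs and 8 GB, and an explicit size always overrides both lower tiers.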
---
# OOM Retry — Bounded Escalation
**Exit 137 or stderr "OOM"/"Killed" → walk one tier up. Once.**
<div class="grid grid-cols-2 gap-6 mt-4">
<div>
```rust
// MAX_OOM_RETRIES = 1:
// ADR-039 constraint oom-retry-bounded
pub const MAX_OOM_RETRIES: u8 = 1;
// (runner_type, cpu, memory_gb)
const SIZE_TIERS: &[(&str, u32, u32)] = &[
    ("cax11", 2, 4),
    ("cax21", 4, 8),   // ← most Rust builds here
    ("cax31", 8, 16),  // ← OOM retry target
    ("cax41", 16, 32),
];
```
</div>
<div>
### Why bounded at 1
- Second OOM means **misconfiguration**, <br>not transient pressure
- Unbounded retry loops spend money <br>on dead ends
- Forces developer to set explicit `runner_type`<br> in `.build-spec.ncl`
- ADR-039 constraint —<br> changing this requires a new ADR
</div>
<div class="-mt35">
### Retry flow
```
build on cax21 → OOM (exit 137)
  └─ retries_used(0) < MAX_OOM_RETRIES(1)
       └─ next_size_tier(cax21) → cax31
            └─ rebuild on cax31
                 ├─ success → record_metrics, destroy
                 └─ OOM again → FAIL (retries exhausted)
```
</div>
</div>
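---
# OOM Retry: a Sketch

The retry flow can be sketched as a bounded loop. This is a hedged illustration, not the project's code: `is_oom`, `build_with_retry`, and the closure-based runner are assumptions, and the tier list is reduced to names only. The `build` closure returns `(exit_code, stderr)`.

```rust
const MAX_OOM_RETRIES: u8 = 1;
const SIZE_TIERS: &[&str] = &["cax11", "cax21", "cax31", "cax41"];

/// OOM signal from the slide: exit 137 or "OOM"/"Killed" in stderr.
fn is_oom(exit_code: i32, stderr: &str) -> bool {
    exit_code == 137 || stderr.contains("OOM") || stderr.contains("Killed")
}

fn next_size_tier(current: &str) -> Option<&'static str> {
    let idx = SIZE_TIERS.iter().position(|t| *t == current)?;
    SIZE_TIERS.get(idx + 1).copied()
}

/// Runs `build` on `start`, walking one tier up per OOM, at most MAX_OOM_RETRIES times.
/// Returns the tier that succeeded, or a description of the failure.
fn build_with_retry<F>(start: &str, mut build: F) -> Result<String, String>
where
    F: FnMut(&str) -> (i32, String),
{
    let mut tier = start.to_string();
    let mut retries_used = 0u8;
    loop {
        let (code, stderr) = build(&tier);
        if code == 0 {
            return Ok(tier);
        }
        if !is_oom(code, &stderr) {
            return Err(format!("build failed on {tier} (exit {code})"));
        }
        if retries_used >= MAX_OOM_RETRIES {
            return Err(format!("OOM on {tier}, retries exhausted"));
        }
        let next = next_size_tier(&tier)
            .ok_or_else(|| format!("OOM on {tier}, no larger tier"))?;
        tier = next.to_string();
        retries_used += 1;
    }
}
```

A build that is OOM-killed on `cax21` and succeeds on `cax31` returns `Ok("cax31")`; one that is killed on both stops after the single retry.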
---
<h1 class="-mt8"> Cache Namespace Model</h1>
**Isolation between CI and session actors — the core tension resolved.**
```
Registry
├── ci/<workspace>/*                    canonical: written by CI, read-only to sessions
│   ├── ci/lian-build/deps:sha256-…     (cargo-chef cooker layer)
│   └── ci/lian-build/base:sha256-…     (toolchain layer from lamina)
└── dev/<actor-id>-<workspace>/*        ephemeral: per session actor
    └── dev/jpl-lian-build/…            (your WIP session cache)
```
<div class="grid grid-cols-2 gap-6 mt-0">
<div>
<h3 class="-mb4">Resolution rules</h3>
| <small>Actor</small> | <small>Reads</small> | <small>Writes</small> |
|------------|-------|--------|
| <small>`'ci`</small> | <small>`ci/*` only</small> | <small>`ci/*`</small> |
| <small>`'human`</small> | <small>`ci/*` + own `dev/*`</small> | <small>own `dev/*`</small> |
| <small>`'agent`</small> | <small>`ci/*` + own `dev/*`</small> | <small>own `dev/*`</small> |
| <small>`'ci_aux`</small> | <small>`ci/*` only</small> | <small>`ci/*` (restricted)</small> |
</div>
<div class="mt-1">
### Nickel schema
```nix
# schemas/cache_policy.ncl
let SessionCacheDisposition = [|
'export, # write back to registry on success
'discard, # ephemeral, discard after run
'rollback, # revert to last good state on fail
|]
```
Sessions declare intent. lian-build enforces it.<br>
CI never imports from `dev/*`.
</div>
</div>
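---
# Cache Namespaces: Write Rules Sketch

The write column of the resolution table can be sketched as a prefix check. This is an assumption-laden illustration: `Actor`, `dev_namespace`, and `can_write` are hypothetical names, and the unspecified "restricted" rule for `'ci_aux` is treated here as a plain `ci/<workspace>` write.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Actor {
    Ci,
    Human,
    Agent,
    CiAux,
}

/// Per-session namespace: dev/<actor-id>-<workspace>
fn dev_namespace(actor_id: &str, workspace: &str) -> String {
    format!("dev/{actor_id}-{workspace}")
}

/// True when `path` lies inside the namespace this actor may write to.
fn can_write(actor: Actor, actor_id: &str, workspace: &str, path: &str) -> bool {
    let in_ns = |ns: &str| path.starts_with(&format!("{ns}/"));
    match actor {
        // CI actors write the canonical namespace only.
        Actor::Ci | Actor::CiAux => in_ns(&format!("ci/{workspace}")),
        // Session actors write only their own dev namespace.
        Actor::Human | Actor::Agent => in_ns(&dev_namespace(actor_id, workspace)),
    }
}
```

Under these rules a human actor `jpl` can write `dev/jpl-lian-build/…` but not `ci/lian-build/…`, and another actor cannot write into `jpl`'s namespace.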
---
# BuildDirectives — Caller-Supplied Vocabulary
**Callers (provisioning, vapora, CI) supply directives in Nickel. Core has no caller identity.**
```nix
# schemas/build_directives.ncl — the contract surface
let BuildDirectives = {
workspace | String,
artifact | BuildArtifact,
compute_provider | ComputeProviderRef, # 'hcloud | 'proxmox | 'docker_local
registry_provider | RegistryProviderRef, # 'zot | 'harbor | 'ghcr | 'dockerhub
cache_policy | CachePolicy,
runner_override | RunnerOverride | optional,
nats_events | NatsEventConfig | optional,
}
```
<div class="grid grid-cols-2 gap-6 mt-4 text-sm">
<div>
### CI invocation
```nix
# ci/directives.ncl
let D = import "defaults/build_directives.ncl" in
D.make_ci_build {
  workspace = "lian-build",
  artifact = {
    image = "registry/lian-build:%{sha}" },
  cache_policy = D.ci_cache_policy,
}
```
</div>
<div>
### Session invocation
```nix
# dev/session.ncl
let D = import "defaults/build_directives.ncl" in
D.make_session_build {
workspace = "lian-build",
actor_id = "jpl",
disposition = 'discard,
}
```
</div>
</div>
<div class="text-xs opacity-60 mt-3">Hard constraint (ADR-001): <code>src/</code> must not match <code>provisioning_workspace | vapora_ | woodpecker_</code> — caller logic stays in directives.</div>
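---
# ADR-001: Constraint Check Sketch

The hard constraint can be sketched as a substring scan over source text. How the project actually enforces it is not stated here, so treat this as one possible shape: `caller_identity_violations` is a hypothetical helper, and only the three patterns come from the slide.

```rust
const FORBIDDEN: &[&str] = &["provisioning_workspace", "vapora_", "woodpecker_"];

/// Returns (1-based line number, pattern) for every forbidden substring found.
fn caller_identity_violations(source: &str) -> Vec<(usize, &'static str)> {
    let mut out = Vec::new();
    for (i, line) in source.lines().enumerate() {
        for pat in FORBIDDEN {
            if line.contains(*pat) {
                out.push((i + 1, *pat));
            }
        }
    }
    out
}
```

A CI step could run this over every file under `src/` and fail the build when the returned list is non-empty.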
---
layout: cover
background: ./jude-infantini-mI-QcAP95Ok-unsplash.jpg
class: 'text-center photo-bg'
---
# lamina
## The pre-baked layer library
<br>
### *The catalog that feeds lian-build*
---
<h1 class="p-b-5"> lamina — What It Is </h1>
**Docker base images (toolchain layers) <br> + pre-cooked cargo dep caches (dependency layers).**
No binary.
<br>
No `src/`. No `Cargo.toml`. Dockerfiles + Nickel schemas + Nushell scripts.
```
lamina/
├── rust/         ─── Rust toolchain layer (rustup + sccache + cargo-chef)
├── leptos/       ─── Leptos WASM layer (rust + wasm-pack + trunk)
├── ontoref/      ─── Nickel + ore tools layer
├── nushell/      ─── Nu shell layer
├── lian-build/   ─── Build directives per layer, ctx-test.nu script
│   ├── Dockerfile.rust          # planner/cooker/final for rust layer
│   ├── build_directives.ncl     # per-layer lian-build config
│   └── ctx-test.nu              # local test runner (docker-local mode)
└── schemas/      ─── workflow.ncl, layer catalog contracts
```
---
# lamina — What It Is
<div class="grid grid-cols-2 gap-6 mt-4 text-sm">
<div>
### Layer types
| Type | What it provides | Cache scope |
|------|-----------------|-------------|
| Toolchain | rustup, cargo, sccache binary | Docker registry |
| Dep layer | compiled `.rlib` for your deps | Docker layer + S3 |
| Utility | additional tools (nu, nickel) | Docker layer |
</div>
<div>
### Catalogue invariant
Every layer in `catalog/` has a `workflow.ncl` that declares:
- `tools_provided` — binaries that must exist post-build
- `build_base` — which layer it depends on (DAG)
- `artifact_paths` — what gets promoted to registry
`catalog-validate.nu --check-dag` enforces the DAG.
</div>
</div>
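---
# lamina: DAG Check Sketch

What a `build_base` DAG check has to establish can be sketched in a few lines: every base must exist, and following bases must never loop. This is not the logic of `catalog-validate.nu` (a Nushell script whose internals are not shown); `check_dag` and the map layout are assumptions.

```rust
use std::collections::{HashMap, HashSet};

/// `layers` maps a layer name to its `build_base` (None for a root layer).
fn check_dag(layers: &HashMap<&str, Option<&str>>) -> Result<(), String> {
    for start in layers.keys() {
        let mut seen = HashSet::new();
        let mut current = *start;
        loop {
            let base = match layers.get(current) {
                None => return Err(format!("unknown layer: {current}")),
                Some(None) => break, // reached a root layer
                Some(Some(b)) => *b,
            };
            // Visiting the same layer twice on one walk means a cycle.
            if !seen.insert(current) {
                return Err(format!("cycle through {current}"));
            }
            current = base;
        }
    }
    Ok(())
}
```

For example, a catalog where `leptos` builds on `rust` and `rust` is a root passes, while two layers naming each other as base are rejected.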
---
# lamina + lian-build — End-to-End
**lamina provides the layers. lian-build provides the compute.**
```
lamina lian-build
────────────────────────────────── ──────────────────────────────────────────────
rust/Dockerfile BuildDirectives (Nickel)
planner ──────► --cache-from registry/lamina/rust-deps:sha
cooker (cargo-chef) --cache-to registry/lamina/rust-deps:sha
final --image registry/lamina/rust:latest
schema: Compute:
workflow.ncl spawn cax21 runner (hcloud ARM)
build_directives.ncl ──────────► rsync context/ → runner
buildctl --addr ssh://runner:1234
ctx-test.nu OOM? → cax31, retry once
--layer rust destroy runner always
--mode docker-local ◄────── NATS: lian-build.lamina.build.completed
(local dev without VM)
```
<div class="box-highlight mt-1 text-sm">
lamina layers become the <code>--cache-from</code> inputs for downstream project builds.<br>
A project's cargo deps compile <em>on top of</em> the lamina rust dep layer <br>and get an sccache HIT for everything lamina already compiled.
</div>
---
# The Full Picture
**From `docker build .` to a controlled substrate.**
<div class="grid grid-cols-2 gap-8 mt-4 text-sm">
<div>
### Before
```
dev push
  └─ CI: docker build .
       ├─ pull rust:latest (1.8 GB)
       ├─ cargo build --deps (6 min, always)
       └─ cargo build src (2 min)

8-10 min every build.
No retry on OOM.
No observability.
No cross-build reuse.
No actor isolation.
```
</div>
<div>
### After
```
dev push
  └─ lian-build dispatch
       ├─ sizing: .build-spec.ncl → cax21
       ├─ NATS: build.started
       ├─ spawn cax21 (hcloud ARM, 30 sec)
       ├─ rsync context (15 sec)
       ├─ buildctl:
       │    ├─ FROM lamina/rust:latest
       │    │    (registry HIT, 0 sec)
       │    ├─ cooker: cargo-chef layer
       │    │    (registry HIT, 0 sec)
       │    └─ final: your code
       │         (sccache: seconds)
       ├─ destroy runner
       └─ NATS: build.completed

~2 min warm. OOM retry automatic.
Structured events. Multi-actor isolation.
```
</div>
</div>
---
layout: cover
background: ./images/cleo-heck-1-l3ds6xcVI-unsplash.jpg
class: 'text-center photo-bg'
---
<h1 class="font-bold text-5xl absolute top-4 left-4.3/10"><img src="/lian-v.svg" width="130"></h1>
<h2 class="mt-40 text-2xl opacity-80">Alchemical refinement</h2>
<div class="mt-8 text-sm opacity-60 font-mono">
lian-build · lamina · BuildKit · cargo-chef · sccache
</div>
<div class="mt-6 text-base">
Each build: <em class="!text-orange-500">ephemeral compute, content-addressed cache, structured events.</em>
<br>
Callers supply intent. <code>lian-build</code> supplies execution.
</div>
<img class="absolute bottom-8 right-8 w-24 opacity-80" src="/ferris-celebration.svg">
<style scoped>
h1, h2, div, p { z-index: 10; }
em { color: #ce422b; font-style: italic; }
</style>