ontoref-derive: #[onto_mcp_tool] attribute macro registers MCP tool unit-structs in
the catalog at link time via inventory::submit!; annotated item is emitted unchanged,
ToolBase/AsyncTool impls stay on the struct. All 34 tools migrated from manual wiring
(net +5: ontoref_list_projects, ontoref_search, ontoref_describe,
ontoref_list_ontology_extensions, ontoref_get_ontology_extension).
validate modes (ADR-018): reads level_hierarchy from workflow.ncl and checks every
.ncl mode for level declared, strategy declared, delegate chain coherent, compose
extends valid. mode resolve <id> shows which hierarchy level handles a mode and why.
--self-test generates synthetic fixtures in a temp dir for CI smoke-testing.
validate run-cargo: two-step Cargo.toml resolution — workspace layout first
(crates/<check.crate>/Cargo.toml), single-crate fallback by package name or repo
basename. Lets the same ADR constraint shape apply to workspace and single-crate repos.
ontology/schemas/manifest.ncl: registry_topology_type contract — multi-registry
coordination, push targets, participant scopes, per-namespace capability.
reflection/requirements/base.ncl: oras ≥1.2.0, cosign ≥2.0.0, sops ≥3.9.0, age
≥1.1.0, restic declared as Hard/Soft requirements with version_min, check_cmd, and
install_hint (ADR-017 toolchain surface).
ADR-019: per-file recipient routing for tenant isolation without multi-vault. Schema
additions: sops.recipient_groups + sops.recipient_rules in ontoref-project.ncl.
secrets-bootstrap generates .sops.yaml from project.ncl in declarative mode. Three
new secrets-audit checks: recipient-routing-coherent, recipient-routing-coverage,
no-multi-vault. Adoption templates: single-team/, multi-tenant/, agent-first/.
Integration templates: domain-producer/, mode-producer/, mode-consumer/.
UI: project_picker surfaces registry badge (⟳ participant) and vault badge
(⛁ vault_id · N, green=declarative / amber=legacy) per project card. Expanded panel
adds collapsible Registry section with namespace, endpoint, and push/pull capability.
manage.html gains Runtime Services card — MCP and GraphQL toggleable without restart
via HTMX POST /ui/manage/services/{service}/toggle.
describe.nu: capabilities JSON includes registry_topology and vault_state per project.
sync.nu: drift check extended to detect //! absence on newly registered crates.
qa.ncl: six entries — credential-vault-best-practice (layered data-flow diagram),
credential-vault-templates (paths A/B/C), credential-vault-troubleshooting (15 named
errors), integration-what-and-why (ADR-042 OCI federation), integration-how-to-implement,
integration-troubleshooting.
on+re: core.ncl + manifest.ncl updated to reflect OCI, MCP, and mode-hierarchy nodes.
Deleted stale presentation assets (2026-02 slides + voice notes).
---
# Post metadata
id: "dags-everywhere-none-know-what-they-are"
title: "DAGs Are Everywhere. None of Them Know What They Are."
slug: "dags-everywhere-none-know-what-they-are"
subtitle: "Every build system, CI pipeline, and runbook uses DAGs for execution. Ontoref uses them for knowledge."
excerpt: "CI/CD pipelines, compilers, runbooks, data orchestrators — they all use directed acyclic graphs. Every single one of them uses DAGs as execution models: this before that, topological ordering, dependency resolution. None of them use DAGs to represent what the system is, why it exists, or what trade-offs define it. That's the gap ontoref fills."

# Publication info
author: "Jesús Pérez"
date: "2026-05-10"
published: false
featured: false

# Categorization
category: "ontoref"
tags: ["ontoref", "dag", "ontology", "knowledge-graphs", "software-architecture"]

# Display
read_time: "6 min read"
sort_order: 2
css_class: "category-ontoref"
category_description: "Ontoref — protocol and tooling for structured self-knowledge in software projects"
category_published: true
---

# DAGs Are Everywhere. None of Them Know What They Are.

Every build system uses DAGs. Every CI/CD pipeline uses DAGs. Compilers, data orchestrators, runbooks, package managers, Kubernetes operators — all DAGs. Directed acyclic graphs are so ubiquitous in software infrastructure that the question "why DAGs?" barely registers as a question anymore.

But there's a second question nobody asks: *what do those DAGs represent?*

The answer is always the same: **execution order**. This before that. Topological sorting. Dependency resolution. The graph describes *how* something computes, not *what* it is.

```
CI/CD: test → build → deploy
Compiler: parse → typecheck → codegen → link
Runbook: check_health → drain → restart → verify
```

These graphs are inert with respect to the system they operate on. A CI pipeline doesn't know what the project is, why it was built this way, or what trade-offs define its architecture. It knows that tests must pass before deployment. That's all it knows.

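To make "execution order" concrete, here is a minimal sketch of the CI/CD chain above as a dependency map, ordered with Python's standard-library `graphlib`. The `pipeline` mapping is an illustrative assumption, not how any particular CI system stores its graph.

```python
from graphlib import TopologicalSorter

# Hypothetical encoding: each step maps to the set of steps
# that must finish before it can run.
pipeline = {
    "test": set(),
    "build": {"test"},
    "deploy": {"build"},
}

# static_order() yields the steps in an order that respects every dependency.
order = list(TopologicalSorter(pipeline).static_order())
print(order)  # ['test', 'build', 'deploy']
```

The only question this graph can answer is "what runs next?". Nothing in it records why `deploy` exists or what it deploys.
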
## The Semantic Gap

This inertness is not a flaw — it's appropriate. Build systems should be fast and mechanical. Runbooks should be executable without requiring philosophical knowledge about the system. Execution graphs and knowledge graphs are different tools for different purposes.

The problem is that the software industry has reached for DAGs for *everything except knowledge representation*. The knowledge of what a system is — its principles, its architectural decisions, its active tensions, its capability surface — lives in documents. Wikis. READMEs. Tickets. Oral tradition.

These are all unstructured, untyped, unqueryable, and guaranteed to drift from the actual system within months. They describe what the project was, not what it is.

## Ontoref's Dual DAG

Ontoref uses DAGs in two fundamentally different modes, and the distinction matters:

**Ontological DAG — semantically typed edges:**

```
Practice "ncl-schemas" --implements--> Principle "type-safety"
Practice "ncl-schemas" --enforces--> Constraint "zero-runtime-deps"
Practice "ncl-schemas" --enables--> Capability "agent-queryable-state"
Practice "ncl-schemas" --tension--> Practice "adoption-friction"
```

These edges are not "depends on." They are typed relationships: `implements`, `enforces`, `enables`, `tension`. The graph is the knowledge — formal commitments about what the project is and how its concepts relate. The equivalent would be a compiler that knows why it exists and what architectural trade-offs it made.

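The difference a typed edge makes can be sketched in a few lines. The `Edge` class and `related` helper below are hypothetical names chosen for illustration; ontoref itself declares its ontology in Nickel, not Python, so treat this as a model of the idea rather than its implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str  # implements | enforces | enables | tension
    target: str


# The four edges from the listing above, as data.
EDGES = [
    Edge("ncl-schemas", "implements", "type-safety"),
    Edge("ncl-schemas", "enforces", "zero-runtime-deps"),
    Edge("ncl-schemas", "enables", "agent-queryable-state"),
    Edge("ncl-schemas", "tension", "adoption-friction"),
]


def related(edges, source, relation):
    """Return the targets reachable from `source` via one relation type."""
    return [e.target for e in edges if e.source == source and e.relation == relation]


tensions = related(EDGES, "ncl-schemas", "tension")
print(tensions)  # ['adoption-friction']
```

Because the relation is part of the edge, "what is this practice in tension with?" becomes a query instead of a search through prose.
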
**Reflection DAG — executable contracts with actor restrictions:**

```
step "validate-state" (actor: any) --> step "run-mode" (actor: developer)
step "run-mode" (actor: developer) --> step "report" (actor: agent|developer)
```

This is not just execution ordering. The DAG is validated as a contract before it runs, records its own progress through state transitions, and encodes who can execute each step. An agent trying to run a developer-restricted step receives a typed rejection, not a runtime error.

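One way to read "typed rejection, not a runtime error" is that the refusal comes back as a value the caller can inspect, rather than as an exception. The sketch below assumes that reading; `Step`, `Completed`, `Rejected`, and `run_step` are hypothetical names, not ontoref's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    actors: frozenset  # actor kinds allowed to run the step; "any" admits everyone


@dataclass(frozen=True)
class Completed:
    step: str
    actor: str


@dataclass(frozen=True)
class Rejected:
    step: str
    actor: str
    reason: str


def run_step(step, actor):
    """Return Completed or Rejected as a value instead of raising."""
    if "any" in step.actors or actor in step.actors:
        return Completed(step.name, actor)
    return Rejected(step.name, actor, f"requires one of {sorted(step.actors)}")


run_mode = Step("run-mode", frozenset({"developer"}))
report = Step("report", frozenset({"agent", "developer"}))

denied = run_step(run_mode, "agent")   # agent attempts a developer-only step
allowed = run_step(report, "agent")    # agent is listed for this step
print(denied)
print(allowed)
```

The caller can branch on the result type and, for example, hand the step to a developer instead of failing the whole run.
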
## The Missing Connection

What makes this non-trivial is that the two DAGs communicate:

- The ontological DAG declares what the project IS — active constraints, practices in tension, current state dimensions
- The reflection DAG OPERATES on that state — detecting drift, executing modes, recording transitions
- Migrations propagate changes in the ontological DAG to downstream consumer projects

In every other system that uses DAGs, the graph is evaluated and discarded. In ontoref, evaluation modifies the state that the next execution reads. The graph has memory.

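A toy model of that feedback loop, under loud assumptions: the state layout, the `run_drift_check` function, and the `runtime-deps` dimension are all invented for illustration and do not describe ontoref's actual storage or checks. The point is only that each run returns the state the next run starts from.

```python
def run_drift_check(state, observed):
    """Compare declared values with observed ones and record the transition."""
    drifted = sorted(k for k, v in state["declared"].items() if observed.get(k) != v)
    new_status = "drifted" if drifted else "in-sync"
    transition = {"from": state["status"], "to": new_status, "drifted": drifted}
    # Return a new state; the history grows by one entry per run.
    return {**state, "status": new_status, "history": state["history"] + [transition]}


# Hypothetical declared constraint: the project has zero runtime dependencies.
state = {"declared": {"runtime-deps": 0}, "status": "unknown", "history": []}

state = run_drift_check(state, {"runtime-deps": 2})  # first run observes drift
state = run_drift_check(state, {"runtime-deps": 0})  # second run sees it resolved

print([(t["from"], t["to"]) for t in state["history"]])
# [('unknown', 'drifted'), ('drifted', 'in-sync')]
```

The second run's transition starts from `drifted` only because the first run wrote that status back; a discarded execution graph would have nothing to start from.
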
## Why This Didn't Exist Before

Three things had to align:

**Agents as consumers.** Before agentic AI, there was no automated consumer that needed structured knowledge about a project. Humans could read the wiki. Agents cannot — they need typed, queryable, machine-readable project knowledge to work accurately rather than hallucinate context. Ontoref's knowledge DAG is what agents consume via MCP.

**Configuration languages with contracts.** Expressing an ontology without an external triplestore required a configuration language with types and contracts. Nickel (NCL) provides exactly that: a typed, lazy configuration language where schema violations are caught when the configuration is evaluated, not later when the consuming program runs. RDF/OWL would have required dedicated infrastructure and specialist expertise.

**Repository scale, not enterprise scale.** Enterprise knowledge graphs are multi-year initiatives. Ontoref operates at the scale of a single repository — incremental adoption, no enforcement, no dedicated team. The smallest unit that can adopt it is a project of one.

## The Consequence

When a project has a knowledge DAG that its agents can consume, the accuracy arithmetic changes entirely. The numbers from KGC 2026 research:

- Ontology-grounded retrieval: **3.4× more accurate** than vector RAG on enterprise queries
- Ontology-validated queries: accuracy jumps from **16% to 72%** on SQL question-answering

The gap is closed not by a bigger model but by structured knowledge the model can reason against. A DAG where the edges mean something.

---

*Ontoref is open source. The protocol specification, Nushell automation, and Rust crates are at [github.com/jesusperezlorenzo/ontoref](https://github.com/jesusperezlorenzo/ontoref).*