---
# Post metadata
id: "dags-everywhere-none-know-what-they-are"
title: "DAGs Are Everywhere. None of Them Know What They Are."
slug: "dags-everywhere-none-know-what-they-are"
subtitle: "Every build system, CI pipeline, and runbook uses DAGs for execution. Ontoref uses them for knowledge."
excerpt: "CI/CD pipelines, compilers, runbooks, data orchestrators — they all use directed acyclic graphs. Every single one of them uses DAGs as execution models: this before that, topological ordering, dependency resolution. None of them use DAGs to represent what the system is, why it exists, or what trade-offs define it. That's the gap ontoref fills."

# Publication info
author: "Jesús Pérez"
date: "2026-05-10"
published: false
featured: false

# Categorization
category: "ontoref"
tags: ["ontoref", "dag", "ontology", "knowledge-graphs", "software-architecture"]

# Display
read_time: "6 min read"
sort_order: 2
css_class: "category-ontoref"
category_description: "Ontoref — protocol and tooling for structured self-knowledge in software projects"
category_published: true
---

# DAGs Are Everywhere. None of Them Know What They Are.

Every build system uses DAGs. Every CI/CD pipeline uses DAGs. Compilers, data orchestrators, runbooks, package managers, Kubernetes operators — all DAGs. Directed acyclic graphs are so ubiquitous in software infrastructure that the question "why DAGs?" barely registers as a question anymore.

But there's a second question nobody asks: *what do those DAGs represent?*

The answer is always the same: **execution order**. This before that. Topological sorting. Dependency resolution. The graph describes *how* something computes, not *what* it is.

```
CI/CD:      test → build → deploy
Compiler:   parse → typecheck → codegen → link
Runbook:    check_health → drain → restart → verify
```

These graphs are inert with respect to the system they operate on. A CI pipeline doesn't know what the project is, why it was built this way, or what trade-offs define its architecture. It knows that tests must pass before deployment. That's all it knows.
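A minimal sketch of everything an execution DAG can answer, using Python's standard-library `graphlib`. The step names come from the runbook line above; the dictionary layout is an illustration, not the format of any particular tool.

```python
from graphlib import TopologicalSorter

# Each key maps a step to the set of steps that must finish before it.
# Step names are taken from the runbook example; the layout is illustrative.
runbook = {
    "check_health": set(),
    "drain": {"check_health"},
    "restart": {"drain"},
    "verify": {"restart"},
}

# static_order() yields the steps in an order that respects every dependency.
order = list(TopologicalSorter(runbook).static_order())
print(order)
```

The ordering is the whole of what this graph knows. Nothing in `runbook` says what is being restarted, or why the drain step exists.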

## The Semantic Gap

This inertness is not a flaw — it's appropriate. Build systems should be fast and mechanical. Runbooks should be executable without requiring philosophical knowledge about the system. Execution graphs and knowledge graphs are different tools for different purposes.

The problem is that the software industry has reached for DAGs for *everything except knowledge representation*. The knowledge of what a system is (its principles, its architectural decisions, its active tensions, its capability surface) lives in documents. Wikis. READMEs. Tickets. Oral tradition.

These are all unstructured, untyped, and unqueryable, and they tend to drift from the actual system within months. They describe what the project was, not what it is.

## Ontoref's Dual DAG

Ontoref uses DAGs in two fundamentally different modes, and the distinction matters:

**Ontological DAG — semantically typed edges:**

```
Practice "ncl-schemas"  --implements-->  Principle "type-safety"
Practice "ncl-schemas"  --enforces-->    Constraint "zero-runtime-deps"
Practice "ncl-schemas"  --enables-->     Capability "agent-queryable-state"
Practice "ncl-schemas"  --tension-->     Practice "adoption-friction"
```

These edges are not "depends on." They are typed relationships: `implements`, `enforces`, `enables`, `tension`. The graph is the knowledge — formal commitments about what the project is and how its concepts relate. The equivalent would be a compiler that knows why it exists and what architectural trade-offs it made.
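To make the difference from a plain dependency graph concrete, here is a hypothetical in-memory representation of the four edges above. It is only a sketch: ontoref declares its ontology in Nickel, and the `Edge` type, the `kind:name` identifier format, and the `related` helper are assumptions made for illustration.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: str    # node id, written here as "kind:name" (an assumed format)
    relation: str  # the edge type carries the meaning
    target: str


# The four edges from the example above, transcribed by hand.
EDGES = [
    Edge("practice:ncl-schemas", "implements", "principle:type-safety"),
    Edge("practice:ncl-schemas", "enforces", "constraint:zero-runtime-deps"),
    Edge("practice:ncl-schemas", "enables", "capability:agent-queryable-state"),
    Edge("practice:ncl-schemas", "tension", "practice:adoption-friction"),
]


def related(edges, source, relation):
    """Return the targets reachable from `source` through edges of one type."""
    return [e.target for e in edges if e.source == source and e.relation == relation]


print(related(EDGES, "practice:ncl-schemas", "tension"))
```

Because the relation is part of the data, a question such as "what is this practice in tension with?" becomes a query rather than a search through prose.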

**Reflection DAG — executable contracts with actor restrictions:**

```
step "validate-state"  (actor: any)        -->  step "run-mode"  (actor: developer)
step "run-mode"        (actor: developer)  -->  step "report"    (actor: agent|developer)
```

This is not just execution ordering. The DAG is validated as a contract before it runs, records its own progress through state transitions, and encodes who can execute each step. An agent trying to run a developer-restricted step receives a typed rejection, not a runtime error.
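The three properties in that paragraph (validation before running, recorded progress, actor restrictions) can be sketched roughly as follows. This is not ontoref's implementation: `Step`, `Rejection`, `validate`, `attempt`, and the reason strings are hypothetical names, and only the step and actor names come from the example above.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Step:
    name: str
    actors: frozenset              # who may run the step; "any" admits everyone
    after: frozenset = frozenset()  # steps that must already be done


@dataclass(frozen=True)
class Rejection:
    """A value describing why a step was refused, returned instead of raising."""
    step: str
    actor: str
    reason: str


STEPS = {
    "validate-state": Step("validate-state", frozenset({"any"})),
    "run-mode": Step("run-mode", frozenset({"developer"}), frozenset({"validate-state"})),
    "report": Step("report", frozenset({"agent", "developer"}), frozenset({"run-mode"})),
}


def validate(steps):
    """Check the contract before anything runs: known prerequisites, no cycles."""
    for step in steps.values():
        missing = step.after.difference(steps)
        if missing:
            raise ValueError(f"{step.name} refers to unknown steps: {sorted(missing)}")
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    return list(TopologicalSorter({n: s.after for n, s in steps.items()}).static_order())


def attempt(steps, done, name, actor):
    """Try to run one step as `actor`; record it in `done` or return a Rejection."""
    step = steps[name]
    if "any" not in step.actors and actor not in step.actors:
        return Rejection(name, actor, "actor-not-allowed")
    if not step.after <= done:
        return Rejection(name, actor, "prerequisite-not-done")
    done.add(name)  # progress is recorded as a state transition
    return None


order = validate(STEPS)
done = set()
first = attempt(STEPS, done, "validate-state", "agent")
second = attempt(STEPS, done, "run-mode", "agent")
print(order, first, second)
```

In this sketch the agent completes `validate-state`, then gets back a `Rejection` value for `run-mode`, which a caller can inspect and act on without catching an exception.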

## The Missing Connection

What makes this non-trivial is that the two DAGs communicate:

- The ontological DAG declares what the project IS — active constraints, practices in tension, current state dimensions
- The reflection DAG OPERATES on that state — detecting drift, executing modes, recording transitions
- Migrations propagate changes in the ontological DAG to downstream consumer projects

In most systems that use DAGs for execution, the graph is evaluated and then discarded. In ontoref, evaluation modifies the state that the next execution reads. The graph has memory.
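One way to picture "the graph has memory" is a run that reads persisted state, applies a transition, and writes the result back for the next run. The sketch below assumes a JSON file and invented state names (`declared`, `validated`, `reported`); none of this reflects how ontoref actually stores its state.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical state dimension with its allowed forward transitions.
TRANSITIONS = {"declared": "validated", "validated": "reported"}


def run_once(state_file: Path) -> dict:
    """Read the recorded state, apply one transition, and persist the result."""
    if state_file.exists():
        state = json.loads(state_file.read_text())
    else:
        state = {"current": "declared", "history": []}

    nxt = TRANSITIONS.get(state["current"])
    if nxt is not None:
        # Record the transition so later runs can see how the state got here.
        state["history"].append([state["current"], nxt])
        state["current"] = nxt

    state_file.write_text(json.dumps(state))
    return state


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "state.json"
    first = run_once(path)   # starts from the default state
    second = run_once(path)  # starts from what the first run wrote

print(first["current"], second["current"], second["history"])
```

The second call behaves differently from the first only because of what the first one recorded, which is the property the paragraph above describes.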

## Why This Didn't Exist Before

Three things had to align:

**Agents as consumers.** Before agentic AI, there was no automated consumer that needed structured knowledge about a project. Humans could read the wiki. Agents cannot — they need typed, queryable, machine-readable project knowledge to work accurately rather than hallucinate context. Ontoref's knowledge DAG is what agents consume via MCP.

**Configuration languages with contracts.** Expressing an ontology without an external triplestore required a configuration language with types and contracts. Nickel (NCL) provides exactly that: a lazily evaluated configuration language with gradual typing and contracts, where schema violations are caught when the configuration is evaluated rather than later, when a consumer is already running on bad data. RDF/OWL would have required dedicated infrastructure and specialist expertise.

**Repository scale, not enterprise scale.** Enterprise knowledge graphs are multi-year initiatives. Ontoref operates at the scale of a single repository — incremental adoption, no enforcement, no dedicated team. The smallest unit that can adopt it is a project of one.

## The Consequence

When a project has a knowledge DAG that its agents can consume, the accuracy of their answers changes substantially. The numbers cited from KGC 2026 research:

- Ontology-grounded retrieval: **3.4× more accurate** than vector RAG on enterprise queries
- Ontology-validated queries: accuracy jumps from **16% to 72%** on SQL question-answering

The gap is closed not by a bigger model but by structured knowledge the model can reason against. A DAG where the edges mean something.

---

*Ontoref is open source. The protocol specification, Nushell automation, and Rust crates are at [github.com/jesusperezlorenzo/ontoref](https://github.com/jesusperezlorenzo/ontoref).*