---
# Post metadata
id: "dags-everywhere-none-know-what-they-are"
title: "DAGs Are Everywhere. None of Them Know What They Are."
slug: "dags-everywhere-none-know-what-they-are"
subtitle: "Every build system, CI pipeline, and runbook uses DAGs for execution. Ontoref uses them for knowledge."
excerpt: "CI/CD pipelines, compilers, runbooks, data orchestrators: they all use directed acyclic graphs. Every single one of them uses DAGs as execution models: this before that, topological ordering, dependency resolution. None of them use DAGs to represent what the system is, why it exists, or what trade-offs define it. That's the gap ontoref fills."

# Publication info
author: "Jesús Pérez"
date: "2026-05-10"
published: false
featured: false

# Categorization
category: "ontoref"
tags: ["ontoref", "dag", "ontology", "knowledge-graphs", "software-architecture"]

# Display
read_time: "6 min read"
sort_order: 2
css_class: "category-ontoref"
category_description: "Ontoref: protocol and tooling for structured self-knowledge in software projects"
category_published: true
---

# DAGs Are Everywhere. None of Them Know What They Are.

Every build system uses DAGs. Every CI/CD pipeline uses DAGs. Compilers, data orchestrators, runbooks, package managers, Kubernetes operators: all DAGs. Directed acyclic graphs are so ubiquitous in software infrastructure that the question "why DAGs?" barely registers as a question anymore.

But there's a second question nobody asks: *what do those DAGs represent?*

The answer is always the same: **execution order**. This before that. Topological sorting. Dependency resolution. The graph describes *how* something computes, not *what* it is.

```
CI/CD: test → build → deploy
Compiler: parse → typecheck → codegen → link
Runbook: check_health → drain → restart → verify
```

These graphs are inert with respect to the system they operate on. A CI pipeline doesn't know what the project is, why it was built this way, or what trade-offs define its architecture. It knows that tests must pass before deployment. That's all it knows.

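To see how little such a graph encodes, here is a minimal Python sketch of the CI/CD line above, using the standard library's `graphlib` (Python 3.9+). The `pipeline` mapping and the `execution_order` helper are illustrative names, not part of any real CI system.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps that must finish first.
pipeline = {
    "test": set(),
    "build": {"test"},
    "deploy": {"build"},
}


def execution_order(graph):
    """Return one valid topological ordering of the steps."""
    return list(TopologicalSorter(graph).static_order())


print(execution_order(pipeline))  # ['test', 'build', 'deploy']

# The only structural fact the graph can reject is a cycle.
try:
    execution_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("cycle detected")
```
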
## The Semantic Gap

This inertness is not a flaw; it's appropriate. Build systems should be fast and mechanical. Runbooks should be executable without requiring philosophical knowledge about the system. Execution graphs and knowledge graphs are different tools for different purposes.

The problem is that the software industry has reached for DAGs for *everything except knowledge representation*. The knowledge of what a system is (its principles, its architectural decisions, its active tensions, its capability surface) lives in documents. Wikis. READMEs. Tickets. Oral tradition.

These are all unstructured, untyped, unqueryable, and prone to drift from the actual system within months. They end up describing what the project was, not what it is.

## Ontoref's Dual DAG

Ontoref uses DAGs in two fundamentally different modes, and the distinction matters:

**Ontological DAG (semantically typed edges):**

```
Practice "ncl-schemas" --implements--> Principle "type-safety"
Practice "ncl-schemas" --enforces--> Constraint "zero-runtime-deps"
Practice "ncl-schemas" --enables--> Capability "agent-queryable-state"
Practice "ncl-schemas" --tension--> Practice "adoption-friction"
```

These edges are not "depends on." They are typed relationships: `implements`, `enforces`, `enables`, `tension`. The graph is the knowledge: formal commitments about what the project is and how its concepts relate. The equivalent would be a compiler that knows why it exists and what architectural trade-offs it made.

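One way to picture typed edges is as plain data with a closed set of relation names. The following is a rough Python sketch, not ontoref's actual representation (which, per this post, is expressed in Nickel). `Node`, `Edge`, and `related` are hypothetical names; only the node and relation labels come from the notation above.

```python
from dataclasses import dataclass

# Relation names copied from the notation above; everything else is illustrative.
EDGE_KINDS = {"implements", "enforces", "enables", "tension"}


@dataclass(frozen=True)
class Node:
    kind: str  # e.g. "Practice", "Principle"
    name: str


@dataclass(frozen=True)
class Edge:
    source: Node
    relation: str
    target: Node

    def __post_init__(self):
        # Reject edges whose relation is not one of the declared types.
        if self.relation not in EDGE_KINDS:
            raise ValueError(f"unknown relation: {self.relation!r}")


ncl = Node("Practice", "ncl-schemas")
edges = [
    Edge(ncl, "implements", Node("Principle", "type-safety")),
    Edge(ncl, "enforces", Node("Constraint", "zero-runtime-deps")),
    Edge(ncl, "enables", Node("Capability", "agent-queryable-state")),
    Edge(ncl, "tension", Node("Practice", "adoption-friction")),
]


def related(edges, source_name, relation):
    """Names of nodes reached from `source_name` via edges of type `relation`."""
    return [
        e.target.name
        for e in edges
        if e.source.name == source_name and e.relation == relation
    ]


print(related(edges, "ncl-schemas", "tension"))  # ['adoption-friction']
```
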
**Reflection DAG (executable contracts with actor restrictions):**

```
step "validate-state" (actor: any) --> step "run-mode" (actor: developer)
step "run-mode" (actor: developer) --> step "report" (actor: agent|developer)
```

This is not just execution ordering. The DAG is validated as a contract before it runs, records its own progress through state transitions, and encodes who can execute each step. An agent trying to run a developer-restricted step receives a typed rejection, not an opaque runtime error.

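The idea of a typed rejection can be sketched in a few lines of Python. This is an illustration under loud assumptions, not ontoref's implementation: `Step`, `Rejection`, `validate`, `attempt`, and the reason strings are invented for the example, and the contract check here only verifies that every dependency names a known step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    actors: frozenset  # actors allowed to run the step; "any" acts as a wildcard
    after: tuple = ()  # names of steps that must complete first


@dataclass(frozen=True)
class Rejection:
    step: str
    actor: str
    reason: str


# Mirrors the two edges in the notation above.
STEPS = {
    "validate-state": Step("validate-state", frozenset({"any"})),
    "run-mode": Step("run-mode", frozenset({"developer"}), ("validate-state",)),
    "report": Step("report", frozenset({"agent", "developer"}), ("run-mode",)),
}


def validate(steps):
    """Contract check before anything runs: dependencies must name known steps."""
    for step in steps.values():
        for dep in step.after:
            if dep not in steps:
                raise ValueError(f"{step.name} depends on unknown step {dep}")


def attempt(steps, done, name, actor):
    """Try one step; return the updated set of completed steps or a Rejection."""
    step = steps[name]
    if "any" not in step.actors and actor not in step.actors:
        return Rejection(name, actor, "actor-not-permitted")
    if any(dep not in done for dep in step.after):
        return Rejection(name, actor, "dependencies-incomplete")
    return done | {name}


validate(STEPS)
done = attempt(STEPS, frozenset(), "validate-state", "agent")
print(attempt(STEPS, done, "run-mode", "agent"))
# Rejection(step='run-mode', actor='agent', reason='actor-not-permitted')
```
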
## The Missing Connection

What makes this non-trivial is that the two DAGs communicate:

- The ontological DAG declares what the project IS: active constraints, practices in tension, current state dimensions
- The reflection DAG OPERATES on that state: detecting drift, executing modes, recording transitions
- Migrations propagate changes in the ontological DAG to downstream consumer projects

In most other systems that use DAGs, the graph is evaluated and discarded. In ontoref, evaluation modifies the state that the next execution reads. The graph has memory.

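A toy Python sketch of that feedback loop, assuming state is nothing more than a history of transitions: each pass compares declared values against observed ones and appends a record that the next pass reads. The dimension names and the `detect_drift` and `reflect` helpers are hypothetical, not ontoref's API.

```python
def detect_drift(declared, observed):
    """Return the declared dimensions whose observed value differs."""
    return {k: observed.get(k) for k, v in declared.items() if observed.get(k) != v}


def reflect(state, declared, observed):
    """One reflection pass; the returned state is what the next pass reads."""
    drift = detect_drift(declared, observed)
    transition = {"run": len(state["history"]) + 1, "drift": sorted(drift)}
    return {"history": state["history"] + [transition]}


# Hypothetical declared dimensions for illustration only.
declared = {"runtime-deps": 0, "schema-language": "ncl"}

state = {"history": []}
state = reflect(state, declared, {"runtime-deps": 2, "schema-language": "ncl"})
state = reflect(state, declared, {"runtime-deps": 0, "schema-language": "ncl"})
print(state["history"])
# [{'run': 1, 'drift': ['runtime-deps']}, {'run': 2, 'drift': []}]
```
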
## Why This Didn't Exist Before

Three things had to align:

**Agents as consumers.** Before agentic AI, there was no automated consumer that needed structured knowledge about a project. Humans could read the wiki. Agents cannot do so reliably: they need typed, queryable, machine-readable project knowledge to work accurately rather than hallucinate context. Ontoref's knowledge DAG is what agents consume via MCP.

**Configuration languages with contracts.** Expressing an ontology without an external triplestore required a configuration language with types and contracts. Nickel (NCL) provides exactly that: a typed, lazy configuration language where schema violations are caught when the configuration is evaluated, rather than at runtime in the tools that consume it. RDF/OWL would have required dedicated infrastructure and specialist expertise.

**Repository scale, not enterprise scale.** Enterprise knowledge graphs are multi-year initiatives. Ontoref operates at the scale of a single repository: incremental adoption, no enforcement, no dedicated team. The smallest unit that can adopt it is a project of one.

## The Consequence

When a project has a knowledge DAG that its agents can consume, the accuracy arithmetic changes entirely. The numbers from KGC 2026 research:

- Ontology-grounded retrieval: **3.4× more accurate** than vector RAG on enterprise queries
- Ontology-validated queries: accuracy jumps from **16% to 72%** on SQL question-answering

The gap is closed not by a bigger model but by structured knowledge the model can reason against. A DAG where the edges mean something.

---

*Ontoref is open source. The protocol specification, Nushell automation, and Rust crates are at [github.com/jesusperezlorenzo/ontoref](https://github.com/jesusperezlorenzo/ontoref).*