ontoref/assets/presentation/fix_slides.md


theme: default
title: Why I Needed Rust
titleTemplate: %s - Rustikon 2026
layout: cover
background: ./images/charles-assuncao-1BbOtIqx21I-unsplash.jpg
class: photo-bg

Why I Needed Rust

Finally, Infrastructure Automation I Can Sleep On

Jesús Pérez Lorenzo · Rustikon 2026

🛡 ●●●○○ 😴 ●●●○○ 🔥 ●○○○○

38 Years. One Problem.

1987 → 2025

| Era | Tool | Lesson |
|---|---|---|
| 1990s | Perl | Power without safety is a disaster |
| 2000s | Python | Pragmatism without guarantees is fragile |
| 2010s | Bash · Chef · Ansible · Terraform | More tools don't solve paradigm problems |
| 2020s | Go · ??? | |

Each time, I thought I had the answer. Each time, reality proved me wrong.


layout: section

The Evolution

How we got here


background: https://images.unsplash.com/photo-1cTFuvI14J4?auto=format&fit=crop&w=1920&q=80 class: photo-bg

Stage 1 — Local (late 80s / early 90s)

Dumb terminals. Single machine. One state.

  • Local development, long deployment cycles, low urgency
  • One state — easy to observe, easy to control
  • IaC: procedural scripts, logic hidden inside the application

The Perl Era: we could do anything. We could also break anything.

Beautiful, terrifying metaprogramming. No safety net. Silent failures at 3 AM.

Lesson: power without safety is a disaster.

🛡 ●●●●○   😴 ●●●●○   🔥 ●○○○○

background: https://images.unsplash.com/photo-M5tzZtFCOfs?auto=format&fit=crop&w=1920&q=80 class: photo-bg

Stage 2 — Networks / Internet

Systems getting farther away. More people. More coordination.

  • Remote access, distributed teams, security becomes relevant
  • Cost of downtime rises — processes become critical
  • Harmonizing: package installs, config, updates across multiple machines in parallel
  • IaC: reproducible automation, first declarative attempts

The Python Era: rapid development, great community. But nothing stopped you from being wrong.

Type hints came late, and they are optional. Runtime errors >> compile-time errors.

Lesson: pragmatism without guarantees is fragile.

🛡 ●●●○○   😴 ●●●○○   🔥 ●●○○○

background: https://images.unsplash.com/photo-CuZ8VdwRpyk?auto=format&fit=crop&w=1920&q=80 class: photo-bg

Stage 3 — Containers / Cloud / CI-CD

Everything. Everywhere. All at once.

  • Monolith → distributed, 24×7×365, high availability
  • Cloud, hybrid, multi-cloud, on-prem — simultaneously
  • Rollback and rollforward: database transactions, but for infrastructure
  • Scale horizontally AND vertically, then scale back down
  • Continuous CI/CD: new features, new deploys, permanent churn

The Cloud/IaC Era: Ansible, Terraform, Chef, Puppet.

What changed? The syntax.

What didn't? The fundamental problems.

Still fighting type safety. Still discovering errors in production.

Lesson: more tools don't solve paradigm problems.

🛡 ●●○○○   😴 ●○○○○   🔥 ●●●●○

layout: center background: https://images.unsplash.com/photo-lLLZkSmxe7A?auto=format&fit=crop&w=1920&q=80 class: photo-bg

I could automate infrastructure.

But I couldn't make it reliable.

I couldn't prevent mistakes.

I couldn't sleep.

🛡 ●○○○○    😴 ○○○○○    🔥 ●●●●●

layout: section

Why IaC Fails

The restaurant problem


layout: two-cols background: "./images/blackieshoot-fuR0Iwu5dkk-unsplash.jpg" class: 'restaurant'

The Restaurant

Every restaurant has at least three actors.

| Restaurant | Infrastructure |
|---|---|
| Guest declares what they want | Declarative config (YAML, HCL) |
| Waiter validates and transmits | Orchestrator (K8s, Ansible) |
| Kitchen executes and delivers | Runtime / provisioning |
| Dish arrives, or doesn't | Deployment succeeds, or not |

::right::

What makes it work — or not:

The guest declares. Doesn't implement.

The waiter must know what's possible —
before going to the kitchen.

"I want X" → waiter goes to kitchen
→ "we don't have X, why is it on the menu?"
→ back to the table.

Equivalent: I configured a host with port 8443
→ that port isn't allowed
→ reconfigure from scratch.


layout: two-cols

The Truth That Mutates

State is not static.
It can change at every step of the chain.

| Step | "Truth" for this actor |
|---|---|
| Guest speaks | What they want |
| Waiter's notepad | What was written down |
| Kitchen markings | What's done / not done |
| Payment ticket | What was actually served |

::right::

The context problem:
The waiter knows the regular customer:
"always no salt."

The kitchen doesn't. If the waiter changes
— that context disappears.

Configuration drift is the same thing: Implicit state. Not explicit. Not propagated. Lost silently.

The cost of failure depends on where it happens:

  • Fail at the table (impossible order):
    cheap — caught before kitchen
  • Fail in kitchen (ingredient missing):
    medium — renegotiate with guest
  • Fail at delivery (wrong dish arrives):
    expensive — experience destroyed

Fail early = fail cheap. Fail in production = nightmare.


"We Don't Have Mushrooms"

When an actor in the chain can't fulfill part of the order.

"Can I substitute vegetables?"

That renegotiation must be explicit. Traced. Re-authorized.
Not silent. Not assumed.

Configuration drift is silent renegotiation:
The system changes. Nobody notified. State diverges without trace.

Rust's answer — Option<T>:

// The waiter cannot silently skip a missing ingredient
let mushrooms: Option<Ingredient> = order.mushrooms;
match mushrooms {
    Some(m) => add_to_dish(m),
    None    => renegotiate_with_guest(&guest)?, // explicit. always.
}
// drift = treating None as Some. Rust makes that impossible.

The compiler is the waiter who cannot pretend an ingredient exists.
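
The snippet above can be made runnable as a small sketch. The names `Ingredient`, `Order`, `Outcome`, and `prepare` are hypothetical and exist only for illustration; they are not taken from the project's code.

```rust
/// Hypothetical ingredient type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Ingredient(pub String);

/// Hypothetical order: the mushrooms may or may not be available.
pub struct Order {
    pub mushrooms: Option<Ingredient>,
}

/// Every result of preparing the dish is an explicit value.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Served(Vec<Ingredient>),
    Renegotiate { missing: &'static str },
}

pub fn prepare(order: Order) -> Outcome {
    match order.mushrooms {
        Some(m) => Outcome::Served(vec![m]),
        // Removing this arm is a compile error: the match must be exhaustive.
        None => Outcome::Renegotiate { missing: "mushrooms" },
    }
}
```

The caller receives `Outcome::Renegotiate` as data, so the substitution has to be handled somewhere visible instead of being assumed.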


The Config Evolution

How we got from code to YAML hell

  1. Hardcoded — everything inside the binary. Full control. Zero flexibility.

  2. External config (JSON) — works between machines. Unreadable for humans at scale.

  3. YAML / TOML — more readable. Fragile syntax. Implicit types. Silent errors.

  4. YAML + Serde — Serde validates the structure:

    • Does the field exist? Is it the right type?
    • Do we accept "elephant" as a pet? If the type is String... yes.
    • Serde validates shape. Not meaning.
  5. Helm / Jinja templates — YAML generated from variables (in YAML).

    • Does it validate the content of the generated YAML? No. Not at all.
    • Like using an LLM with a markdown reference: the format is there, but is the content correct?
      Nobody guarantees that.

layout: center

Continuous CI/CD.

No semantic validation.

Continuous hope.

(crossing our fingers in production)

🛡 ●○○○○    😴 ○○○○○    🔥 ●●●●●

Three Questions Without Answers

Question 1 — Why do we wait for things to break?

  • "Works on my machine" — in production, I don't know
  • Fail late = maximum cost. We want: fail fast, fail cheap

Question 2 — Do we actually know what we want?

  • Is the declaration sufficient and consistent with what's possible?
  • What are the boundaries? Static or dynamic? What is the source of truth — and when does it mutate?

Question 3 — Can we guarantee determinism?

  • CI/CD without semantic validation = continuous hope
  • We want certainty, not randomness
  • "Works on my machine" cannot be the production standard

We're not inventing anything new. Everything already exists. The question is whether we're managing it correctly.


layout: center

The tools weren't the problem.

The languages weren't the problem.

The paradigm was the problem.


layout: center

Systems we don't know how to control.

We hope they work.

When they don't — we fix them.

Continuous nightmare.

(alarm state as the new normal)

🛡 ●○○○○    😴 ○○○○○    🔥 ●●●●●

layout: section

Rust

The answer to all three questions


The Bridge: From Serde to Types

Serde loads structurally valid config. But "elephant" as pet: String compiles.

Rust's answer: don't use String. Use a type.

// Before: String — anything goes
pet: String  // "elephant" compiles. "unicorn" compiles. 🤷

// After: closed domain — impossible values don't exist
enum Pet { Dog, Cat, Rabbit }  // Pet::Elephant doesn't compile; "elephant" in config fails to load

This is the shift. Not the config format. The model of what it can contain.

Serde validates shape. Types validate meaning. The compiler validates before the binary exists.
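
A minimal sketch of the closed domain at the config boundary, using only the standard library. The string spellings (`"dog"`, `"cat"`, `"rabbit"`) and the error type are assumptions for illustration. Serde's derived `Deserialize` for an enum rejects unknown variants in a similar way.

```rust
use std::str::FromStr;

/// Closed domain: these are the only pets that exist.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Pet {
    Dog,
    Cat,
    Rabbit,
}

impl FromStr for Pet {
    type Err = String;

    /// Text from a config file becomes a `Pet` or an error. There is no third option.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "dog" => Ok(Pet::Dog),
            "cat" => Ok(Pet::Cat),
            "rabbit" => Ok(Pet::Rabbit),
            other => Err(format!("unknown pet: {other}")),
        }
    }
}
```

Inside the program, code that receives a `Pet` never has to ask whether the value is valid. The question is answered once, at the edge.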


What Rust Gives Us

Answer to Question 1: fail early, fail cheap

// Immutability by default — invariants are invariants
let config = load_config()?;  // cannot change silently
// Option<T> — no nulls, no assumptions
let mushrooms: Option<Ingredient> = order.mushrooms;
match mushrooms {
    Some(m) => add_to_dish(m),
    None    => notify_kitchen_to_skip(),  // explicit. always.
}
// Enums as closed domains
enum CloudProvider { Hetzner, UpCloud, AWS, GCP, Azure, OnPrem }
enum Port { Valid(u16) }  // not any integer — a valid port
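
One way to make "a valid port" concrete is a newtype with a validating constructor. This is a sketch under assumptions: the allow-list `ALLOWED` and the `PortError` variants are hypothetical, not the project's real policy.

```rust
use std::convert::TryFrom;

/// A port that has passed validation. The inner field is private, so code
/// outside this module can only obtain a `Port` through `try_from`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Port(u16);

#[derive(Debug, PartialEq, Eq)]
pub enum PortError {
    Reserved(u16),
    NotAllowed(u16),
}

/// Hypothetical allow-list, for illustration only.
const ALLOWED: [u16; 3] = [80, 443, 6443];

impl TryFrom<u16> for Port {
    type Error = PortError;

    fn try_from(n: u16) -> Result<Self, Self::Error> {
        if n == 0 {
            return Err(PortError::Reserved(n));
        }
        if !ALLOWED.contains(&n) {
            return Err(PortError::NotAllowed(n));
        }
        Ok(Port(n))
    }
}

impl Port {
    pub fn get(self) -> u16 {
        self.0
    }
}
```

With this shape, the port 8443 case from the restaurant slide fails at the table, when the value is constructed, and not in the kitchen.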

Answer to Question 2: explicit contracts

// Traits define what every actor in the chain must fulfill
use async_trait::async_trait; // from the async-trait crate
#[async_trait]
pub trait TaskStorage: Send + Sync {
    async fn create_task(&self, task: WorkflowTask) -> StorageResult<WorkflowTask>;
    async fn update_task(&self, id: &str, status: TaskStatus) -> StorageResult<()>;
    // Add a new provider: implement this trait or it doesn't compile
}
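
A synchronous, standard-library-only sketch of the same idea, so it runs without the async-trait crate. The trait is simplified (tasks are identified by a string, and a `status` method is added), and `InMemoryStorage`, `StorageError`, and `run_task` are hypothetical names for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TaskStatus { Pending, Running, Completed, Failed, Cancelled }

#[derive(Debug, PartialEq)]
pub enum StorageError { NotFound(String) }

pub type StorageResult<T> = Result<T, StorageError>;

/// The contract: every storage backend must provide all of these methods.
pub trait TaskStorage {
    fn create_task(&mut self, id: &str) -> StorageResult<()>;
    fn update_task(&mut self, id: &str, status: TaskStatus) -> StorageResult<()>;
    fn status(&self, id: &str) -> StorageResult<TaskStatus>;
}

/// Hypothetical in-memory backend.
#[derive(Default)]
pub struct InMemoryStorage { tasks: HashMap<String, TaskStatus> }

impl TaskStorage for InMemoryStorage {
    fn create_task(&mut self, id: &str) -> StorageResult<()> {
        self.tasks.insert(id.to_string(), TaskStatus::Pending);
        Ok(())
    }

    fn update_task(&mut self, id: &str, status: TaskStatus) -> StorageResult<()> {
        match self.tasks.get_mut(id) {
            Some(slot) => { *slot = status; Ok(()) }
            None => Err(StorageError::NotFound(id.to_string())),
        }
    }

    fn status(&self, id: &str) -> StorageResult<TaskStatus> {
        self.tasks.get(id).copied().ok_or_else(|| StorageError::NotFound(id.to_string()))
    }
}

/// Written against the trait, so it works with any backend that implements it.
pub fn run_task(storage: &mut dyn TaskStorage, id: &str) -> StorageResult<TaskStatus> {
    storage.create_task(id)?;
    storage.update_task(id, TaskStatus::Running)?;
    storage.update_task(id, TaskStatus::Completed)?;
    storage.status(id)
}
```

If a backend leaves out one of the three methods, its `impl TaskStorage` block does not compile.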

layout: two-cols

The Compiler as Pre-Validator

Answer to Question 3: guaranteed determinism

// Closed domain — you can't forget a case
enum RollbackStrategy {
    ConfigDriven,
    Conservative, // preserve unless marked for deletion
    Aggressive,  // revert all changes
    Custom { operations: Vec<String> },
}
// The compiler enforces exhaustive handling
match strategy {
    RollbackStrategy::ConfigDriven  => ...,
    RollbackStrategy::Conservative  => ...,
    RollbackStrategy::Aggressive    => ...,
    RollbackStrategy::Custom { .. } => ...,
    // miss one → compile error
}
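
The elided arms can be filled in as a runnable sketch. The `describe` function and its return strings are hypothetical. The point is the missing wildcard arm.

```rust
pub enum RollbackStrategy {
    ConfigDriven,
    Conservative,
    Aggressive,
    Custom { operations: Vec<String> },
}

/// Hypothetical helper that turns a strategy into a human-readable plan.
pub fn describe(strategy: &RollbackStrategy) -> String {
    // No `_ =>` arm on purpose: adding a new variant to the enum makes
    // this match fail to compile until the new case is handled here.
    match strategy {
        RollbackStrategy::ConfigDriven => "follow the rollback policy in config".to_string(),
        RollbackStrategy::Conservative => "preserve unless marked for deletion".to_string(),
        RollbackStrategy::Aggressive => "revert all changes".to_string(),
        RollbackStrategy::Custom { operations } => {
            format!("run {} custom operations", operations.len())
        }
    }
}
```

A wildcard arm would compile too, and it would silently swallow future variants. Leaving it out is the design choice that keeps the guarantee.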

::right::

The compiler validates:

  • Before building the binary
  • Not after hours of execution
  • Not when a function nobody touched in months finally gets called
  • Predictable behavior:
    memory, resources, workflows

The compiler is the waiter who validates the order before it reaches the kitchen. Before the guest waits. Before the ingredient is missing.

🛡 ●●●●○   😴 ●●●●○   🔥 ●●○○○

The Human Impact

When the system is trustworthy:

✓ Sleep comes back

✓ Confidence returns

✓ The team trusts the automation

✓ Stress decreases

✓ You can actually rest

What you can't measure: fear.

What you can measure: MTTR.

Before: > 30 minutes. Now: < 5 minutes.


🛡 ●●●●●  😴 ●●●●●  🔥 ●○○○○

layout: center background: https://images.unsplash.com/photo-e1dnFk7_570?auto=format&fit=crop&w=1920&q=80 class: photo-bg

Continuous CI/CD.

Types. Compiler. Explicit state.

Continuous certainty.

(to keep sleeping well)

🛡 ●●●●●    😴 ●●●●●    🔥 ●○○○○

layout: section

In Production

This is not theory


layout: two-cols

Nickel as Typed Source of Truth

YAML rejected. TOML rejected.
Reason: no type safety.

# Infrastructure schema, validated when the config is evaluated
{
  compute | {
    region | String,
    count  | Number | std.contract.from_predicate (fun n => n > 0),
    scaling | {
      min | Number | std.contract.from_predicate (fun n => n > 0),
      # schematic: the max >= min relationship is checked by a contract
      max | Number | std.contract.from_predicate (fun n => n >= min),
    },
  },
}

::right::

Result (ADR-003):
zero configuration type errors in production.

Config hierarchy:
defaults → workspace → profile → environment → runtime


Each layer merges.
Type system catches conflicts.
At config time — not deployment time.
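
In the talk the merge happens in Nickel. The Rust sketch below only illustrates the layering idea: later layers override earlier ones, and the merged result is validated once, before anything is deployed. `Layer`, `Resolved`, `ConfigError`, and `resolve` are hypothetical names.

```rust
/// One config layer. Every field is optional: a layer may say nothing about it.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Layer {
    pub region: Option<String>,
    pub count: Option<u32>,
}

/// The fully resolved config: nothing optional is left.
#[derive(Debug, PartialEq)]
pub struct Resolved {
    pub region: String,
    pub count: u32,
}

#[derive(Debug, PartialEq)]
pub enum ConfigError {
    Missing(&'static str),
    InvalidCount(u32),
}

impl Layer {
    /// Fields set in `over` win; otherwise the value from `self` is kept.
    pub fn merge(self, over: Layer) -> Layer {
        Layer {
            region: over.region.or(self.region),
            count: over.count.or(self.count),
        }
    }
}

/// Fold layers in order: defaults → workspace → profile → environment → runtime.
pub fn resolve(layers: Vec<Layer>) -> Result<Resolved, ConfigError> {
    let merged = layers.into_iter().fold(Layer::default(), Layer::merge);
    let region = merged.region.ok_or(ConfigError::Missing("region"))?;
    let count = merged.count.ok_or(ConfigError::Missing("count"))?;
    if count == 0 {
        return Err(ConfigError::InvalidCount(count));
    }
    Ok(Resolved { region, count })
}
```

A runtime layer that sets `count` to 0 is rejected by `resolve`, at config time, no matter which lower layer was valid.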

Serde validates shape.

Nickel validates meaning.

The compiler validates before deployment.


Traits as Provider Contracts

The kitchen can change. AWS ≠ UpCloud ≠ bare metal. Same menu.

// Every provider implements the same contract
enum DependencyType { Hard, Soft, Optional }
enum TaskStatus     { Pending, Running, Completed, Failed, Cancelled }
// Dependency resolution — the orchestrator knows the order
// Installing Kubernetes:
//   containerd (Hard) → etcd (Hard) → kubernetes
//   → cilium (requires kubernetes) → rook-ceph (requires cilium)

Explicit state — no drift:

pub struct WorkflowExecutionState {
    pub task_states: HashMap<String, TaskExecutionState>,
    pub checkpoints: Vec<WorkflowCheckpoint>,  // what happened and when
    pub provider_states: HashMap<String, ProviderState>,
}
  • Checkpoint every 5 minutes
  • No implicit state. No "the waiter remembers the customer doesn't want salt."
  • It's in the order. Always. Explicit.
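
A minimal sketch of interval-based checkpointing, assuming a 5-minute interval is passed in by the caller. `CheckpointLog` and its simplified `Checkpoint` are hypothetical and much smaller than the `WorkflowExecutionState` shown above.

```rust
use std::time::Duration;

/// Simplified checkpoint: when it was taken and what had completed by then.
#[derive(Debug, Clone, PartialEq)]
pub struct Checkpoint {
    pub taken_at: Duration, // time since workflow start
    pub completed_tasks: Vec<String>,
}

pub struct CheckpointLog {
    interval: Duration,
    checkpoints: Vec<Checkpoint>,
}

impl CheckpointLog {
    pub fn new(interval: Duration) -> Self {
        CheckpointLog { interval, checkpoints: Vec::new() }
    }

    /// Records a checkpoint if none exists yet, or if at least `interval`
    /// has passed since the last one. Returns whether one was recorded.
    pub fn maybe_record(&mut self, now: Duration, completed: &[String]) -> bool {
        let due = match self.checkpoints.last() {
            Some(last) => now.saturating_sub(last.taken_at) >= self.interval,
            None => true,
        };
        if due {
            self.checkpoints.push(Checkpoint {
                taken_at: now,
                completed_tasks: completed.to_vec(),
            });
        }
        due
    }

    /// The most recent recovery point, if any.
    pub fn latest(&self) -> Option<&Checkpoint> {
        self.checkpoints.last()
    }
}
```

The state needed for recovery lives in the log as data. Nothing depends on what an operator remembers.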

Dependency Graph — Fail Fast, Fail Cheap

fail_fast: bool is not just another config option. It's a principle encoded as a typed field.

pub struct WorkflowConfig {
    pub max_parallel_tasks:          usize,
    pub task_timeout_seconds:        u64,
    pub fail_fast:                   bool,  // halt on first failure
    pub checkpoint_interval_seconds: u64,   // recovery point granularity
}

Typed DAG — dependency resolution enforced at workflow compile time:

containerd (Hard) → etcd (Hard) → kubernetes
                                → cilium    (requires: kubernetes)
                                → rook-ceph (requires: kubernetes + cilium)
  • DependencyType::Hard — failure stops the chain. Always.
  • DependencyType::Soft — continues, explicitly degraded.
  • DependencyType::Optional — missing is expected and fine.
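
The resolution rules above can be sketched as a depth-first topological sort. This is an illustration under assumptions, not the project's resolver: `Graph`, `GraphError`, `OnFailure`, and both functions are hypothetical, and the graph is checked when the workflow is validated at startup.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DependencyType { Hard, Soft, Optional }

#[derive(Debug, PartialEq)]
pub enum OnFailure { Halt, ContinueDegraded, Ignore }

#[derive(Debug, PartialEq)]
pub enum GraphError { UnknownTask(String), Cycle(String) }

/// Task name → list of (dependency name, dependency kind).
pub type Graph = HashMap<&'static str, Vec<(&'static str, DependencyType)>>;

/// What a failed dependency means for the task that needs it.
pub fn on_dependency_failure(kind: DependencyType) -> OnFailure {
    match kind {
        DependencyType::Hard => OnFailure::Halt,
        DependencyType::Soft => OnFailure::ContinueDegraded,
        DependencyType::Optional => OnFailure::Ignore,
    }
}

fn visit(
    name: &str,
    graph: &Graph,
    done: &mut HashSet<String>,
    in_progress: &mut HashSet<String>,
    order: &mut Vec<String>,
) -> Result<(), GraphError> {
    if done.contains(name) {
        return Ok(());
    }
    // Seeing a task again while it is still being resolved means a cycle.
    if !in_progress.insert(name.to_string()) {
        return Err(GraphError::Cycle(name.to_string()));
    }
    let deps = graph.get(name).ok_or_else(|| GraphError::UnknownTask(name.to_string()))?;
    for (dep, kind) in deps {
        // Optional dependencies may be absent; Hard and Soft ones must exist.
        if *kind == DependencyType::Optional && !graph.contains_key(dep) {
            continue;
        }
        visit(dep, graph, done, in_progress, order)?;
    }
    in_progress.remove(name);
    done.insert(name.to_string());
    order.push(name.to_string());
    Ok(())
}

/// Returns an install order in which every task comes after its dependencies.
pub fn install_order(graph: &Graph) -> Result<Vec<String>, GraphError> {
    let mut names: Vec<&str> = graph.keys().copied().collect();
    names.sort(); // deterministic starting order
    let (mut done, mut in_progress, mut order) = (HashSet::new(), HashSet::new(), Vec::new());
    for name in names {
        visit(name, graph, &mut done, &mut in_progress, &mut order)?;
    }
    Ok(order)
}
```

A missing Hard dependency or a cycle is reported before any task runs, which is the cheap place to fail.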

Not a runbook. Not a comment. A type the compiler enforces.

🛡 ●●●●●   😴 ●●●●●   🔥 ●○○○○

layout: two-cols

Real Applications

::left::

Kubernetes

The orchestrator provisions cluster components as a typed workflow:

containerd
  → etcd
  → kubernetes control plane
    → CoreDNS
    → Cilium (CNI)
      → Rook-Ceph (storage)

Each dependency is a DependencyType. The compiler catches installing Cilium without Kubernetes. Not the on-call engineer at 2 AM.

::right::

Blockchain Validators

Validators require brutal uptime. A validator that fails loses funds.

  • Post-quantum cryptography: CRYSTALS-Kyber + Falcon + AES-256-GCM hybrid. Validator keys protected against quantum computers.
  • SLOs with real error budgets: 99.99% = 52.6 min downtime/year. Prometheus blocks deploys when burn rate exceeds budget.
  • Deterministic config: validator parameters are types. A bond_amount that isn't a valid u128 doesn't compile.
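
The error-budget arithmetic can be checked in a few lines. The year length (365.25 days) is an assumption that reproduces the 52.6-minute figure. The `deploy_allowed` gate is a hypothetical illustration; how the real system wires Prometheus into deploys is not shown in the talk.

```rust
/// Minutes in a year, assuming 365.25 days (525,960 minutes).
const MINUTES_PER_YEAR: f64 = 365.25 * 24.0 * 60.0;

/// Allowed downtime per year for an availability SLO such as 0.9999.
pub fn error_budget_minutes(slo: f64) -> f64 {
    (1.0 - slo) * MINUTES_PER_YEAR
}

/// Burn rate = observed error ratio / allowed error ratio.
/// 1.0 means the budget is being spent exactly as fast as the SLO allows.
pub fn burn_rate(slo: f64, observed_error_ratio: f64) -> f64 {
    observed_error_ratio / (1.0 - slo)
}

/// Hypothetical deploy gate: block when the budget burns faster than allowed.
pub fn deploy_allowed(slo: f64, observed_error_ratio: f64, max_burn_rate: f64) -> bool {
    burn_rate(slo, observed_error_ratio) <= max_burn_rate
}
```

For 99.99%, `error_budget_minutes(0.9999)` is about 52.6, matching the slide.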

Disaster Recovery

Rollback as a type, not a procedure

// Checkpoint = complete system snapshot
pub struct Checkpoint {
    pub workflow_state:  Option<WorkflowExecutionState>,
    pub resources:       Vec<ResourceSnapshot>,
    pub provider_states: HashMap<String, ProviderState>,
}
// Rollback strategy = typed choice, not a runbook
enum RollbackStrategy {
    ConfigDriven,
    Conservative,   // preserve unless marked for deletion
    Aggressive,     // revert all changes
    Custom { operations: Vec<String> },
}
// You cannot do rollback without choosing a strategy.
// The compiler doesn't let you ignore the case.

Multi-backend backup: restic, borg, tar, rsync — all as enum variants.
Production backup and DR restore use the same type, the same schema.

"Works in prod but not in DR" can't happen if the state is the same type.


Self-Healing — Typed Remediation

When something breaks at 3 AM — the system responds, not you.

enum RemediationAction {
    ScaleService    { service: String, replicas: u32 },
    FailoverService { service: String, region: Region },
    RestartService  { service: String },
    ClearCache      { service: String, scope: CacheScope },
}
// Typed playbooks. Not shell scripts. Not hope.
// Fails 3 times → escalates to human. Never loops indefinitely.

What happens at 3 AM:

  • Alert fires → RemediationEngine matches condition → runs RestartService
  • Works: silent. Nobody woken up.
  • Fails 3×: page sent — with full state, checkpoint, and execution history.

You wake up to information. Not to chaos.
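
The "fails 3 times, then escalates" rule can be sketched as a bounded retry loop. Assumptions: the action enum is trimmed to two variants, the executor is passed in as a closure, and `Outcome` and `remediate` are hypothetical names.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum RemediationAction {
    RestartService { service: String },
    ScaleService { service: String, replicas: u32 },
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Fixed without waking anyone up.
    Resolved { attempts: u32 },
    /// Handed to a human, together with what was tried.
    Escalated { attempts: u32, history: Vec<String> },
}

pub const MAX_ATTEMPTS: u32 = 3;

/// Runs `run` at most MAX_ATTEMPTS times. The loop is bounded by
/// construction, so it can never retry indefinitely.
pub fn remediate<F>(action: &RemediationAction, mut run: F) -> Outcome
where
    F: FnMut(&RemediationAction) -> Result<(), String>,
{
    let mut history = Vec::new();
    for attempt in 1..=MAX_ATTEMPTS {
        match run(action) {
            Ok(()) => return Outcome::Resolved { attempts: attempt },
            Err(e) => history.push(format!("attempt {attempt}: {e}")),
        }
    }
    Outcome::Escalated { attempts: MAX_ATTEMPTS, history }
}
```

The `history` in `Escalated` is what the page carries: the on-call engineer starts from a record of what was already tried.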

🛡 ●●●●●   😴 ●●●●●   🔥 ●○○○○

layout: center

Without types. Without compiler. Without explicit state.

MTTR > 30 minutes.

────────────────────────

Rust. Types. Explicit state. Automated response.

MTTR < 5 minutes.

(at 3 AM. without you.)

🛡 ●●●●●    😴 ●●●●●    🔥 ●○○○○

layout: section

Why This Matters

For everyone in this room


layout: two-cols

For You

::left::

If you've been frustrated like me

  • Rust solves the problems you already know — not hypothetical ones
  • This isn't hype. I've seen technologies come and go for decades.
  • Give it a real chance.
  • Your sleep will thank you.

Start here:

  • Model your infrastructure as types
  • Replace stringly-typed config with enums
  • Let the compiler be your pre-validator

::right::

If you're earlier in your career

  • Don't waste decades on fragile infrastructure
  • Start with type safety from day one
  • Build for reliability — not just for speed
  • You'll thank yourself later

The shortest path:

  • Learn the type system deeply
  • Understand ownership as state management
  • Traits as contracts between systems

layout: center

At my age, I have perspective.

I've seen technologies come and go.

Rust isn't hype.

It solves real problems I've had for decades.

Having more years isn't a liability.

It's an advantage.


layout: cover

Why I Needed Rust

Because I Wanted to Sleep

🛡 ●●●●● 😴 ●●●●● 🔥 ●○○○○

Thank you very much

Questions?

provisioning.systems · jesusperez.pro