| theme | title | titleTemplate | layout | background | class |
|---|---|---|---|---|---|
| default | Why I Needed Rust | %s - Rustikon 2026 | cover | ./images/charles-assuncao-1BbOtIqx21I-unsplash.jpg | photo-bg |
Why I Needed Rust
Finally, Infrastructure Automation I Can Sleep On
Jesús Pérez Lorenzo · Rustikon 2026
38 Years. One Problem.
1987 → 2025
| Era | Tool | Lesson |
|---|---|---|
| 1990s | Perl | Power without safety is a disaster |
| 2000s | Python | Pragmatism without guarantees is fragile |
| 2010s | Bash · Chef · Ansible · Terraform | More tools don't solve paradigm problems |
| 2020s | Go · ??? | |
Each time, I thought I had the answer. Each time, reality proved me wrong.
layout: section
The Evolution
How we got here
background: https://images.unsplash.com/photo-1cTFuvI14J4?auto=format&fit=crop&w=1920&q=80
class: photo-bg
Stage 1 — Local (late 80s / early 90s)
Dumb terminals. Single machine. One state.
- Local development, long deployment cycles, low urgency
- One state — easy to observe, easy to control
- IaC: procedural scripts, logic hidden inside the application
The Perl Era: we could do anything. We could also break anything.
Beautiful, terrifying metaprogramming. No safety net. Silent failures at 3 AM.
Lesson: power without safety is a disaster.
background: https://images.unsplash.com/photo-M5tzZtFCOfs?auto=format&fit=crop&w=1920&q=80
class: photo-bg
Stage 2 — Networks / Internet
Systems getting farther away. More people. More coordination.
- Remote access, distributed teams, security becomes relevant
- Cost of downtime rises — processes become critical
- Harmonizing: package installs, config, updates across multiple machines in parallel
- IaC: reproducible automation, first declarative attempts
The Python Era: rapid development, great community. But nothing stopped you from being wrong.
Type hints came late — and optional. Runtime errors >> compile-time errors.
Lesson: pragmatism without guarantees is fragile.
background: https://images.unsplash.com/photo-CuZ8VdwRpyk?auto=format&fit=crop&w=1920&q=80
class: photo-bg
Stage 3 — Containers / Cloud / CI-CD
Everything. Everywhere. All at once.
- Monolith → distributed, 24×7×365, high availability
- Cloud, hybrid, multi-cloud, on-prem — simultaneously
- Rollback and rollforward: database transactions, but for infrastructure
- Scale horizontally AND vertically, then scale back down
- CI/CD continuous: new features, new deploys, permanent churn
The Cloud/IaC Era: Ansible, Terraform, Chef, Puppet.
What changed? The syntax.
What didn't? The fundamental problems.
Still fighting type safety. Still discovering errors in production.
Lesson: more tools don't solve paradigm problems.
layout: center
background: https://images.unsplash.com/photo-lLLZkSmxe7A?auto=format&fit=crop&w=1920&q=80
class: photo-bg
I could automate infrastructure.
But I couldn't make it reliable.
I couldn't prevent mistakes.
I couldn't sleep.
layout: section
Why IaC Fails
The restaurant problem
layout: two-cols
background: "./images/blackieshoot-fuR0Iwu5dkk-unsplash.jpg"
class: 'restaurant'
The Restaurant
Every restaurant has at least three actors.
| Restaurant | Infrastructure |
|---|---|
| Guest declares what they want | Declarative config (YAML, HCL) |
| Waiter validates and transmits | Orchestrator (K8s, Ansible) |
| Kitchen executes and delivers | Runtime / provisioning |
| Dish arrives — or doesn't | Deployment succeeds — or not |
::right::
What makes it work — or not:
The guest declares. Doesn't implement.
The waiter must know what's possible —
before going to the kitchen.
"I want X" → waiter goes to kitchen
→ "we don't have X, why is it on the menu?"
→ back to the table.
Equivalent: I configured a host with port 8443
→ that port isn't allowed
→ reconfigure from zero.
layout: two-cols
The Truth That Mutates
State is not static.
It can change at every step of the chain.
| Step | "Truth" for this actor |
|---|---|
| Guest speaks | What they want |
| Waiter's notepad | What was written down |
| Kitchen markings | What's done / not done |
| Payment ticket | What was actually served |
::right::
The context problem:
The waiter knows the regular customer:
"always no salt."
The kitchen doesn't. If the waiter changes
— that context disappears.
Configuration drift is the same thing: Implicit state. Not explicit. Not propagated. Lost silently.
The cost of failure depends on where it happens:
- Fail at the table (impossible order): cheap — caught before kitchen
- Fail in kitchen (ingredient missing): medium — renegotiate with guest
- Fail at delivery (wrong dish arrives): expensive — experience destroyed
Fail early = fail cheap. Fail in production = nightmare.
"We Don't Have Mushrooms"
When an actor in the chain can't fulfill part of the order.
"Can I substitute vegetables?"
That renegotiation must be explicit. Traced. Re-authorized.
Not silent. Not assumed.
Configuration drift is silent renegotiation:
The system changes. Nobody notified. State diverges without trace.
Rust's answer — Option<T>:
// The waiter cannot silently skip a missing ingredient
let mushrooms: Option<Ingredient> = order.mushrooms;
match mushrooms {
Some(m) => add_to_dish(m),
None => renegotiate_with_guest(&guest)?, // explicit. always.
}
// drift = treating None as Some. Rust makes that impossible.
The compiler is the waiter who cannot pretend an ingredient exists.
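The snippet above leaves out its surrounding types. A minimal self-contained sketch of the same idea follows; `Ingredient`, `Order`, `Outcome`, and `prepare` are hypothetical names chosen for illustration, not taken from a real crate.

```rust
// Hypothetical types for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Ingredient(pub String);

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Every requested ingredient was available.
    Served(Vec<Ingredient>),
    /// Something was missing: the guest must re-authorize before cooking.
    Renegotiate { missing: String },
}

pub struct Order {
    pub base: Ingredient,
    /// `None` means the kitchen has no mushrooms today.
    pub mushrooms: Option<Ingredient>,
}

pub fn prepare(order: Order) -> Outcome {
    match order.mushrooms {
        Some(m) => Outcome::Served(vec![order.base, m]),
        // The only way to reach the inner Ingredient is through `Some`,
        // so the missing case has to be written down here.
        None => Outcome::Renegotiate {
            missing: "mushrooms".to_string(),
        },
    }
}
```

Calling `prepare` with `mushrooms: None` returns `Outcome::Renegotiate`, so the caller sees the substitution request as a value instead of discovering it at delivery.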
The Config Evolution
How we got from code to YAML hell
- Hardcoded — everything inside the binary. Full control. Zero flexibility.
- External config (JSON) — works between machines. Unreadable for humans at scale.
- YAML / TOML — more readable. Fragile syntax. Implicit types. Silent errors.
- YAML + Serde — Serde validates the structure:
  - Does the field exist? Is it the right type?
  - Do we accept "elephant" as a pet? If the type is String... yes.
  - Serde validates shape. Not meaning.
- Helm / Jinja templates — YAML generated from variables (in YAML).
  - Does it validate the content of the generated YAML? No. Not at all.
  - Like using an LLM with a markdown reference: the format is there, but is the content correct? Nobody guarantees that.
layout: center
Continuous CI/CD.
No semantic validation.
Continuous hope.
(crossing our fingers in production)
Three Questions Without Answers
Question 1 — Why do we wait for things to break?
- "Works on my machine" — in production, I don't know
- Fail late = maximum cost. We want: fail fast, fail cheap
Question 2 — Do we actually know what we want?
- Is the declaration sufficient and consistent with what's possible?
- What are the boundaries? Static or dynamic? What is the source of truth — and when does it mutate?
Question 3 — Can we guarantee determinism?
- CI/CD without semantic validation = continuous hope
- We want certainty, not randomness
- "Works on my machine" cannot be the production standard
We're not inventing anything new. Everything already exists. The question is whether we're managing it correctly.
layout: center
The tools weren't the problem.
The languages weren't the problem.
The paradigm was the problem.
layout: center
Systems we don't know how to control.
We hope they work.
When they don't — we fix them.
Continuous nightmare.
(alarm state as the new normal)
layout: section
Rust
The answer to all three questions
The Bridge: From Serde to Types
Serde loads structurally valid config. But "elephant" as pet: String compiles.
Rust's answer: don't use String. Use a type.
// Before: String — anything goes
pet: String // "elephant" compiles. "unicorn" compiles. 🤷
// After: closed domain — impossible values don't exist
enum Pet { Dog, Cat, Rabbit } // "elephant" doesn't compile
This is the shift. Not the config format. The model of what it can contain.
Serde validates shape. Types validate meaning. The compiler validates before the binary exists.
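One nuance is worth sketching: `Pet::Elephant` written in source code fails to compile, while the string "elephant" arriving from a config file is rejected at the parse boundary, before any business logic runs. A minimal std-only sketch, assuming lowercase spellings in the config; `UnknownPet` and `food_for` are hypothetical names.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pet {
    Dog,
    Cat,
    Rabbit,
}

/// Hypothetical error type: carries the rejected input for the report.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownPet(pub String);

impl FromStr for Pet {
    type Err = UnknownPet;

    // The only place where free-form text can become a Pet.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "dog" => Ok(Pet::Dog),
            "cat" => Ok(Pet::Cat),
            "rabbit" => Ok(Pet::Rabbit),
            other => Err(UnknownPet(other.to_string())),
        }
    }
}

/// Downstream code takes a Pet, so it never sees an "elephant".
pub fn food_for(pet: Pet) -> &'static str {
    match pet {
        Pet::Dog => "kibble",
        Pet::Cat => "tuna",
        Pet::Rabbit => "hay",
    }
}
```

Here `"cat".parse::<Pet>()` succeeds and `"elephant".parse::<Pet>()` returns an `UnknownPet` error. A derived Serde `Deserialize` on the same enum would reject unknown variants in a similar way, though that is outside this std-only sketch.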
What Rust Gives Us
Answer to Question 1: fail early, fail cheap
// Immutability by default — invariants are invariants
let config = load_config()?; // cannot change silently
// Option<T> — no nulls, no assumptions
let mushrooms: Option<Ingredient> = order.mushrooms;
match mushrooms {
Some(m) => add_to_dish(m),
None => notify_kitchen_to_skip(), // explicit. always.
}
// Enums as closed domains
enum CloudProvider { Hetzner, UpCloud, AWS, GCP, Azure, OnPrem }
enum Port { Valid(u16) } // not any integer — a valid port
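A wrapper around `u16` only means "a valid port" if construction is checked. One way to sketch that is a newtype with a fallible constructor. The allowlist below is a hypothetical policy, and `PortError` is an illustrative name; neither comes from the talk.

```rust
use std::convert::TryFrom;

/// Private field: code outside this module can only build a Port
/// through `try_from`, so every Port has passed the checks below.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Port(u16);

#[derive(Debug, PartialEq, Eq)]
pub enum PortError {
    Reserved(u16),
    NotAllowed(u16),
}

// Hypothetical policy for this sketch.
const ALLOWED: [u16; 3] = [80, 443, 8080];

impl TryFrom<u16> for Port {
    type Error = PortError;

    fn try_from(raw: u16) -> Result<Self, Self::Error> {
        if raw == 0 {
            return Err(PortError::Reserved(raw));
        }
        if !ALLOWED.contains(&raw) {
            return Err(PortError::NotAllowed(raw));
        }
        Ok(Port(raw))
    }
}

impl Port {
    pub fn get(self) -> u16 {
        self.0
    }
}
```

With this policy, `Port::try_from(8443u16)` returns `Err(PortError::NotAllowed(8443))`: the restaurant's "that port isn't allowed" answer arrives at the table, not in the kitchen.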
Answer to Question 2: explicit contracts
// Traits define what every actor in the chain must fulfill
#[async_trait]
pub trait TaskStorage: Send + Sync {
async fn create_task(&self, task: WorkflowTask) -> StorageResult<WorkflowTask>;
async fn update_task(&self, id: &str, status: TaskStatus) -> StorageResult<()>;
// Add a new provider: implement this trait or it doesn't compile
}
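To show what "implement this trait or it doesn't compile" looks like, here is a simplified in-memory backend. It is a sketch under assumptions: the trait is made synchronous so it needs only the standard library (the original uses `async_trait`), and the fields of `WorkflowTask`, the `StorageError` variants, and `InMemoryStorage` are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskStatus { Pending, Running, Completed, Failed, Cancelled }

// Hypothetical shape; the real WorkflowTask is not shown in the talk.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WorkflowTask {
    pub id: String,
    pub status: TaskStatus,
}

#[derive(Debug, PartialEq, Eq)]
pub enum StorageError {
    AlreadyExists(String),
    NotFound(String),
}

pub type StorageResult<T> = Result<T, StorageError>;

// Synchronous simplification of the async trait above.
pub trait TaskStorage: Send + Sync {
    fn create_task(&self, task: WorkflowTask) -> StorageResult<WorkflowTask>;
    fn update_task(&self, id: &str, status: TaskStatus) -> StorageResult<()>;
}

#[derive(Default)]
pub struct InMemoryStorage {
    tasks: Mutex<HashMap<String, WorkflowTask>>,
}

impl TaskStorage for InMemoryStorage {
    fn create_task(&self, task: WorkflowTask) -> StorageResult<WorkflowTask> {
        let mut tasks = self.tasks.lock().unwrap();
        if tasks.contains_key(&task.id) {
            return Err(StorageError::AlreadyExists(task.id));
        }
        tasks.insert(task.id.clone(), task.clone());
        Ok(task)
    }

    fn update_task(&self, id: &str, status: TaskStatus) -> StorageResult<()> {
        let mut tasks = self.tasks.lock().unwrap();
        match tasks.get_mut(id) {
            Some(task) => {
                task.status = status;
                Ok(())
            }
            None => Err(StorageError::NotFound(id.to_string())),
        }
    }
}
```

Removing either method from the `impl` block turns into a compile error, which is the contract the slide describes.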
layout: two-cols
The Compiler as Pre-Validator
Answer to Question 3: guaranteed determinism
// Closed domain — you can't forget a case
enum RollbackStrategy {
ConfigDriven,
Conservative, // preserve unless marked for deletion
Aggressive, // revert all changes
Custom { operations: Vec<String> },
}
// The compiler enforces exhaustive handling
match strategy {
RollbackStrategy::ConfigDriven => ...,
RollbackStrategy::Conservative => ...,
RollbackStrategy::Aggressive => ...,
RollbackStrategy::Custom { .. } => ...,
// miss one → compile error
}
::right::
The compiler validates:
- Before building the binary
- Not after hours of execution
- Not when a function nobody touched in months finally gets called
- Predictable behavior: memory, resources, workflows
The compiler is the waiter who validates the order before it reaches the kitchen. Before the guest waits. Before the ingredient is missing.
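A compilable version of the exhaustive match, as a sketch: the `plan` function and the step descriptions it returns are hypothetical placeholders standing in for the elided arms.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RollbackStrategy {
    ConfigDriven,
    Conservative,
    Aggressive,
    Custom { operations: Vec<String> },
}

/// Hypothetical planner: maps a strategy to the steps it would run.
pub fn plan(strategy: &RollbackStrategy) -> Vec<String> {
    match strategy {
        RollbackStrategy::ConfigDriven => {
            vec!["apply rollback rules from config".to_string()]
        }
        RollbackStrategy::Conservative => {
            vec!["delete only resources marked for deletion".to_string()]
        }
        RollbackStrategy::Aggressive => vec!["revert all changes".to_string()],
        RollbackStrategy::Custom { operations } => operations.clone(),
        // No `_ =>` arm on purpose: adding a variant to the enum makes
        // this match fail to compile until the new case is handled.
    }
}
```

The guarantee depends on not writing a wildcard arm. A `_ => ...` catch-all would compile again after a new variant is added and quietly bring back the forgotten case.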
The Human Impact
When the system is trustworthy:
✓ Sleep comes back
✓ Confidence returns
✓ The team trusts the automation
✓ Stress decreases
✓ You can actually rest
What you can't measure: fear.
What you can measure: MTTR.
Before: > 30 minutes. Now: < 5 minutes.
layout: center
background: https://images.unsplash.com/photo-e1dnFk7_570?auto=format&fit=crop&w=1920&q=80
class: photo-bg
Continuous CI/CD.
Types. Compiler. Explicit state.
Continuous certainty.
(to keep sleeping well)
layout: section
In Production
This is not theory
layout: two-cols
Nickel as Typed Source of Truth
YAML rejected. TOML rejected.
Reason: no type safety.
# Infrastructure schema — validated at config compile time
{
compute | {
region | String,
count | Number & (fun n => n > 0),
scaling | {
min | Number & (fun n => n > 0),
max | Number & (fun n => n >= min),
# -- compiler verifies this relationship
}
}
}
::right::
Result (ADR-003):
zero configuration type errors in production.
Config hierarchy:
defaults → workspace → profile → environment → runtime
Each layer merges.
Type system catches conflicts.
At config time — not deployment time.
Serde validates shape.
Nickel validates meaning.
The compiler validates before deployment.
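Nickel performs the layer merge natively. As a rough Rust analogy of the same hierarchy, the sketch below assumes that later layers override earlier ones field by field and that the conflict to catch is the `min`/`max` relationship from the schema. `Layer`, `Resolved`, `ConfigError`, and `resolve` are hypothetical names.

```rust
/// One configuration layer; `None` means "this layer does not set it".
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct Layer {
    pub region: Option<String>,
    pub min: Option<u32>,
    pub max: Option<u32>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Resolved {
    pub region: String,
    pub min: u32,
    pub max: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ConfigError {
    Missing(&'static str),
    ZeroMin,
    MinAboveMax { min: u32, max: u32 },
}

impl Layer {
    /// Assumed semantics: the upper layer wins wherever it sets a field.
    pub fn overlay(self, upper: Layer) -> Layer {
        Layer {
            region: upper.region.or(self.region),
            min: upper.min.or(self.min),
            max: upper.max.or(self.max),
        }
    }
}

/// Merge layers in order (defaults first, runtime last), then validate
/// the result once, before anything is deployed.
pub fn resolve(layers: Vec<Layer>) -> Result<Resolved, ConfigError> {
    let merged = layers.into_iter().fold(Layer::default(), Layer::overlay);
    let region = merged.region.ok_or(ConfigError::Missing("region"))?;
    let min = merged.min.ok_or(ConfigError::Missing("min"))?;
    let max = merged.max.ok_or(ConfigError::Missing("max"))?;
    if min == 0 {
        return Err(ConfigError::ZeroMin);
    }
    if max < min {
        return Err(ConfigError::MinAboveMax { min, max });
    }
    Ok(Resolved { region, min, max })
}
```

An environment layer that raises `min` above the default `max` yields `ConfigError::MinAboveMax` at resolve time, which corresponds to "at config time, not deployment time".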
Traits as Provider Contracts
The kitchen can change. AWS ≠ UpCloud ≠ bare metal. Same menu.
// Every provider implements the same contract
enum DependencyType { Hard, Soft, Optional }
enum TaskStatus { Pending, Running, Completed, Failed, Cancelled }
// Dependency resolution — the orchestrator knows the order
// Installing Kubernetes:
// containerd (Hard) → etcd (Hard) → kubernetes
// → cilium (requires kubernetes) → rook-ceph (requires cilium)
Explicit state — no drift:
pub struct WorkflowExecutionState {
pub task_states: HashMap<String, TaskExecutionState>,
pub checkpoints: Vec<WorkflowCheckpoint>, // what happened and when
pub provider_states: HashMap<String, ProviderState>,
}
- Checkpoint every 5 minutes
- No implicit state. No "the waiter remembers the customer doesn't want salt."
- It's in the order. Always. Explicit.
Dependency Graph — Fail Fast, Fail Cheap
fail_fast: bool is not a config option. It's a principle encoded as a type.
pub struct WorkflowConfig {
pub max_parallel_tasks: usize,
pub task_timeout_seconds: u64,
pub fail_fast: bool, // halt on first failure
pub checkpoint_interval_seconds: u64, // recovery point granularity
}
Typed DAG — dependency resolution enforced at workflow compile time:
containerd (Hard) → etcd (Hard) → kubernetes
→ cilium (requires: kubernetes)
→ rook-ceph (requires: kubernetes + cilium)
- DependencyType::Hard — failure stops the chain. Always.
- DependencyType::Soft — continues, explicitly degraded.
- DependencyType::Optional — missing is expected and fine.
Not a runbook. Not a comment. A type the compiler enforces.
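A sketch of how such a typed dependency graph could be resolved. It makes several assumptions: the arrows are read as "must be installed before", a missing Hard dependency aborts planning, a missing Soft one marks the task as degraded, and a missing Optional one is ignored. `Graph`, `Plan`, `PlanError`, and `plan` are hypothetical names. In this sketch the check runs when the plan is built, before any task executes.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DependencyType { Hard, Soft, Optional }

#[derive(Debug, PartialEq, Eq)]
pub enum PlanError {
    MissingHard { task: String, needs: String },
    Cycle,
}

/// Task name -> list of (dependency name, dependency kind).
pub type Graph = BTreeMap<String, Vec<(String, DependencyType)>>;

#[derive(Debug, PartialEq, Eq)]
pub struct Plan {
    pub order: Vec<String>,
    pub degraded: Vec<String>,
}

pub fn plan(graph: &Graph) -> Result<Plan, PlanError> {
    let mut degraded = Vec::new();
    let mut pending: BTreeMap<&str, BTreeSet<&str>> = BTreeMap::new();

    // Classify every dependency that is not declared in the graph.
    for (task, deps) in graph {
        let mut present = BTreeSet::new();
        for (dep, kind) in deps {
            if graph.contains_key(dep) {
                present.insert(dep.as_str());
                continue;
            }
            match kind {
                DependencyType::Hard => {
                    return Err(PlanError::MissingHard {
                        task: task.clone(),
                        needs: dep.clone(),
                    })
                }
                DependencyType::Soft => degraded.push(task.clone()),
                DependencyType::Optional => {}
            }
        }
        pending.insert(task.as_str(), present);
    }

    // Kahn's algorithm: repeatedly emit tasks with no unmet dependencies.
    let mut order = Vec::new();
    while !pending.is_empty() {
        let ready: Vec<&str> = pending
            .iter()
            .filter(|(_, deps)| deps.is_empty())
            .map(|(task, _)| *task)
            .collect();
        if ready.is_empty() {
            return Err(PlanError::Cycle);
        }
        for task in &ready {
            pending.remove(task);
            order.push(task.to_string());
        }
        for deps in pending.values_mut() {
            for task in &ready {
                deps.remove(task);
            }
        }
    }
    Ok(Plan { order, degraded })
}
```

Declaring `cilium` with a Hard dependency on `kubernetes` in a graph that has no `kubernetes` entry returns `PlanError::MissingHard` before anything is installed.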
layout: two-cols
Real Applications
::left::
Kubernetes
The orchestrator provisions cluster components as a typed workflow:
containerd
→ etcd
→ kubernetes control plane
→ CoreDNS
→ Cilium (CNI)
→ Rook-Ceph (storage)
Each dependency is a DependencyType.
The compiler catches installing Cilium without Kubernetes.
Not the on-call engineer at 2 AM.
::right::
Blockchain Validators
Validators require brutal uptime. A validator that fails loses funds.
- Post-quantum cryptography: CRYSTALS-Kyber + Falcon + AES-256-GCM hybrid. Validator keys protected against quantum computers.
- SLOs with real error budgets: 99.99% = 52.6 min downtime/year. Prometheus blocks deploys when burn rate exceeds budget.
- Deterministic config: validator parameters are types. A bond_amount that isn't a valid u128 doesn't compile.
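The 52.6-minute figure can be checked with a few lines of arithmetic. This sketch assumes a 365.25-day year (which is what gives 52.6) and defines burn rate as the fraction of budget consumed divided by the fraction of the window elapsed. The deploy gate threshold of 1.0 is an assumption, not a value from the talk.

```rust
/// Minutes of allowed downtime per year for an availability target.
/// Assumes a 365.25-day year, i.e. 525,960 minutes.
pub fn downtime_budget_minutes(availability_percent: f64) -> f64 {
    const MINUTES_PER_YEAR: f64 = 365.25 * 24.0 * 60.0;
    MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)
}

/// Burn rate: share of the budget already spent, divided by the share
/// of the window that has elapsed. 1.0 means "exactly on budget".
pub fn burn_rate(downtime_minutes: f64, elapsed_fraction: f64, budget_minutes: f64) -> f64 {
    (downtime_minutes / budget_minutes) / elapsed_fraction
}

/// Hypothetical gate: block deploys while the budget burns too fast.
pub fn deploy_allowed(burn: f64) -> bool {
    burn <= 1.0
}
```

For 99.99% the budget is 525,960 × 0.0001 ≈ 52.6 minutes. Spending half of it in the first quarter of the year gives a burn rate of 2.0, so the gate in this sketch would block the deploy.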
Self-Healing — Typed Remediation
When something breaks at 3 AM — the system responds, not you.
enum RemediationAction {
ScaleService { service: String, replicas: u32 },
FailoverService { service: String, region: Region },
RestartService { service: String },
ClearCache { service: String, scope: CacheScope },
}
// Typed playbooks. Not shell scripts. Not hope.
// Fails 3 times → escalates to human. Never loops indefinitely.
What happens at 3 AM:
- Alert fires → RemediationEngine matches condition → runs RestartService
- Works: silent. Nobody woken up.
- Fails 3×: page sent — with full state, checkpoint, and execution history.
You wake up to information. Not to chaos.
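The "fails 3 times, then escalates" rule can be sketched as a bounded retry loop. `remediate`, `Outcome`, and `MAX_ATTEMPTS` are hypothetical names, the action enum is trimmed to two variants, and the executor is passed in as a closure so the sketch stays self-contained.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RemediationAction {
    RestartService { service: String },
    ScaleService { service: String, replicas: u32 },
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// The action worked; nobody gets paged.
    Healed { attempts: u32 },
    /// Every attempt failed; the page carries the full history.
    Escalated { attempts: u32, history: Vec<String> },
}

pub const MAX_ATTEMPTS: u32 = 3;

/// Runs `execute` at most MAX_ATTEMPTS times. The loop is bounded by
/// construction, so it can never retry indefinitely.
pub fn remediate<F>(action: &RemediationAction, mut execute: F) -> Outcome
where
    F: FnMut(&RemediationAction) -> Result<(), String>,
{
    let mut history = Vec::new();
    for attempt in 1..=MAX_ATTEMPTS {
        match execute(action) {
            Ok(()) => return Outcome::Healed { attempts: attempt },
            Err(reason) => history.push(format!("attempt {}: {}", attempt, reason)),
        }
    }
    Outcome::Escalated {
        attempts: MAX_ATTEMPTS,
        history,
    }
}
```

The `Escalated` variant cannot be built without a history, which is one way to encode "you wake up to information" in the type itself.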
layout: center
Without types. Without compiler. Without explicit state.
MTTR > 30 minutes.
Rust. Types. Explicit state. Automated response.
MTTR < 5 minutes.
(at 3 AM. without you.)
Disaster Recovery
Rollback as a type, not a procedure
// Checkpoint = complete system snapshot
pub struct Checkpoint {
pub workflow_state: Option<WorkflowExecutionState>,
pub resources: Vec<ResourceSnapshot>,
pub provider_states: HashMap<String, ProviderState>,
}
// Rollback strategy = typed choice, not a runbook
enum RollbackStrategy {
ConfigDriven,
Conservative, // preserve unless marked for deletion
Aggressive, // revert all changes
Custom { operations: Vec<String> },
}
// You cannot do rollback without choosing a strategy.
// The compiler doesn't let you ignore the case.
Multi-backend backup: restic, borg, tar, rsync — all as enum variants.
Production backup and DR restore use the same type, the same schema.
"Works in prod but not in DR" can't happen if the state is the same type.
layout: section
Why This Matters
For everyone in this room
layout: two-cols
For You
::left::
If you've been frustrated like me
- Rust solves the problems you already know — not hypothetical ones
- This isn't hype. I've seen technologies come and go for decades.
- Give it a real chance.
- Your sleep will thank you.
Start here:
- Model your infrastructure as types
- Replace stringly-typed config with enums
- Let the compiler be your pre-validator
::right::
If you're earlier in your career
- Don't waste decades on fragile infrastructure
- Start with type safety from day one
- Build for reliability — not just for speed
- You'll thank yourself later
The shortest path:
- Learn the type system deeply
- Understand ownership as state management
- Traits as contracts between systems
layout: center
At my age, I have perspective.
I've seen technologies come and go.
Rust isn't hype.
It solves real problems I've had for decades.
More years aren't a liability.
They're an advantage.
layout: cover
Why I Needed Rust
Because I Wanted to Sleep
Thank you very much
Questions?
provisioning.systems · jesusperez.pro