kogral/docs/storage/surrealdb.md
Jesús Pérez 1329eb509f
feat(core): add SurrealDB v3 engine abstraction, NATS event publishing, and storage factory
Key changes: new events.rs (NATS EventingStorage decorator), storage/factory.rs (backend selection), orchestration.rs, SurrealDB v3
  engine upgrade, expanded Nickel schemas, and two new ADRs (006, 007).
2026-02-22 21:51:53 +00:00


# SurrealDB Storage
KOGRAL uses SurrealDB 3.0 as its scalable backend, enabled via the `surrealdb-backend` Cargo feature.
The integration is built on `surrealdb::engine::any::connect(url)`, which selects the engine at
runtime from a URL scheme — no recompilation required when switching between embedded, in-memory,
or remote deployments.
## Dual Hot/Cold Layout
`SurrealDbStorage` maintains two independent database connections:
| Connection | Default engine | URL | Purpose |
|---|---|---|---|
| `graph_db` | SurrealKV (B-tree) | `surrealkv://.kogral/db/graph` | Nodes, edges, graph metadata |
| `hot_db` | RocksDB (LSM) | `rocksdb://.kogral/db/hot` | Embeddings, session logs, append data |
SurrealKV's B-tree layout favours point lookups and range scans (node/graph queries). RocksDB's
LSM tree favours sequential writes (embedding vectors, event logs). Keeping the two workloads in
separate stores means the write amplification of the append-heavy hot data does not degrade graph queries.
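As a rough illustration of the split, the routing decision can be pictured as a lookup from table name to connection. This is a hypothetical sketch: the `Store` enum and `store_for` function are not part of KOGRAL, and only the `nodes`, `embeddings`, and `sessions` table names are taken from this page.

```rust
/// Which of the two connections a table lives on (illustrative names only).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Store {
    /// `graph_db`: nodes, edges, graph metadata.
    Graph,
    /// `hot_db`: embeddings, session logs, append data.
    Hot,
}

/// Map a table name to the connection that owns it.
/// Returns `None` for tables this sketch does not know about.
pub fn store_for(table: &str) -> Option<Store> {
    match table {
        "nodes" => Some(Store::Graph),
        "embeddings" | "sessions" => Some(Store::Hot),
        _ => None,
    }
}
```

Under these assumptions, `store_for("embeddings")` yields `Some(Store::Hot)`, while an unrecognised table name yields `None` rather than silently defaulting to one store.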
## Supported Engines
All four engines are compiled in when the `surrealdb-backend` feature is active:
| Nickel `engine` | URL scheme | Cargo feature | Use case |
|---|---|---|---|
| `mem` | `mem://` | `kv-mem` | Tests, ephemeral dev sessions |
| `surreal_kv` | `surrealkv://path` | `kv-surrealkv` | Embedded production (default graph) |
| `rocks_db` | `rocksdb://path` | `kv-rocksdb` | Embedded production (default hot) |
| `ws` | `ws://host:port` | `protocol-ws` | Remote team / shared deployments |
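The mapping in the table can be sketched as a small function that turns a Nickel `engine` name plus its `path` or `url` field into a connection URL. The helper name `engine_url` and its error type are assumptions made for illustration; the actual logic in `storage/factory.rs` may look different.

```rust
/// Build a connection URL from a Nickel `engine` name and its `path`/`url` field.
/// Hypothetical helper, not KOGRAL's actual factory code.
pub fn engine_url(engine: &str, target: Option<&str>) -> Result<String, String> {
    match (engine, target) {
        // The in-memory engine takes no path.
        ("mem", _) => Ok("mem://".to_string()),
        ("surreal_kv", Some(path)) => Ok(format!("surrealkv://{path}")),
        ("rocks_db", Some(path)) => Ok(format!("rocksdb://{path}")),
        // Remote configs already carry a full URL, so it is passed through.
        ("ws", Some(url)) => Ok(url.to_string()),
        ("surreal_kv" | "rocks_db" | "ws", None) => {
            Err(format!("engine `{engine}` needs a path or url"))
        }
        (other, _) => Err(format!("unknown engine `{other}`")),
    }
}
```

The resulting string is the kind of value that would be handed to `surrealdb::engine::any::connect(url)`, which is why switching engines only requires a configuration change.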
## Configuration
### Embedded (default production)
```nickel
storage = {
  primary = 'filesystem,
  secondary = {
    enabled = true,
    type = 'surrealdb,
    surrealdb = {
      graph = { engine = "surreal_kv", path = ".kogral/db/graph" },
      hot = { engine = "rocks_db", path = ".kogral/db/hot" },
      namespace = "kogral",
    },
  },
}
```
### In-Memory (tests, CI)
```nickel
storage = {
  primary = 'memory,
  secondary = {
    enabled = true,
    type = 'surrealdb,
    surrealdb = {
      graph = { engine = "mem" },
      hot = { engine = "mem" },
      namespace = "test",
    },
  },
}
```
### Remote WebSocket (team/shared deployment)
```nickel
storage = {
  primary = 'filesystem,
  secondary = {
    enabled = true,
    type = 'surrealdb,
    surrealdb = {
      graph = { engine = "ws", url = "ws://kb.company.com:8000" },
      hot = { engine = "ws", url = "ws://kb.company.com:8000" },
      namespace = "engineering",
    },
  },
}
```
## Building with SurrealDB Support
```bash
# Debug build
cargo build -p kogral-core --features surrealdb-backend
# All features (SurrealDB + NATS + orchestration)
cargo build -p kogral-core --all-features
# Justfile shortcut
just build::core-db
```
## CRUD Pattern
All CRUD operations route through `serde_json::Value` as the intermediary type (SurrealDB 3.0
removed `IntoSurrealValue`/`SurrealValue`). Node records live on the `nodes` table under the key
`{graph_name}__{node_id}`, so the full record ID is the tuple `("nodes", "{graph_name}__{node_id}")`:
```rust
// upsert
let row = serde_json::to_value(node)?;
let _: Option<serde_json::Value> = graph_db
    .upsert(("nodes", format!("{graph_name}__{}", node.id)))
    .content(row)
    .await?;

// select
let raw: Option<serde_json::Value> = graph_db
    .select(("nodes", format!("{graph_name}__{node_id}")))
    .await?;

// delete
let _: Option<serde_json::Value> = graph_db
    .delete(("nodes", format!("{graph_name}__{node_id}")))
    .await?;

// list by graph (query API)
let nodes: Vec<Node> = graph_db
    .query("SELECT * FROM nodes WHERE project = $g")
    .bind(("g", graph_name.to_string()))
    .await?
    .take(0)?;
```
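The composite key used in the snippets above could be factored into a pair of helpers, sketched below. The function names are hypothetical, and the split assumes graph names never contain `__` themselves, which this page does not state.

```rust
/// Build the record key for a node: `{graph_name}__{node_id}`.
/// Hypothetical helper; the snippets above inline the same `format!` call.
pub fn node_key(graph_name: &str, node_id: &str) -> String {
    format!("{graph_name}__{node_id}")
}

/// Split a record key back into `(graph_name, node_id)` at the first `__`.
/// Assumes graph names never contain `__`; returns `None` if no separator is found.
pub fn split_node_key(key: &str) -> Option<(&str, &str)> {
    key.split_once("__")
}
```

Because the split happens at the first separator, a node ID may contain `__` but a graph name may not. If that assumption does not hold, the key cannot be parsed back unambiguously.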
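`.bind()` parameters must be owned values such as `String`. A `&str` borrowed from a local or a
function argument does not satisfy the `'static` bound in SurrealDB 3.0's bind API, which is why
the query above calls `graph_name.to_string()`.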
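The effect of a `'static` bound can be reproduced without SurrealDB. In the sketch below, `bind_like` is a stand-in with the same kind of bound, not the real `bind` signature, and `bind_graph` is a hypothetical caller.

```rust
use std::fmt::Debug;

/// Stand-in for a bind-style API: it only accepts values that own their data
/// (or borrow for `'static`). This is NOT the SurrealDB signature.
pub fn bind_like<T: Debug + 'static>(name: &str, value: T) -> String {
    format!("${name} = {value:?}")
}

/// Bind a graph name that arrives as a borrowed slice.
pub fn bind_graph(graph_name: &str) -> String {
    // `bind_like("g", graph_name)` would be rejected by the compiler here:
    // `graph_name` borrows from the caller and is therefore not `'static`.
    // Converting to an owned `String` satisfies the bound.
    bind_like("g", graph_name.to_string())
}
```

A string literal is a `&'static str` and would pass the bound as well; the restriction only bites for slices borrowed from shorter-lived data.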
## Hot Data Methods
`SurrealDbStorage` exposes direct methods on `hot_db` that are outside the `Storage` trait:
```rust
// Store embedding vector for a node
pub async fn save_embedding(&self, node_id: &str, vector: &[f32]) -> Result<()>
// Append a session event to the log
pub async fn log_session(&self, entry: &serde_json::Value) -> Result<()>
```
These operate on the `embeddings` and `sessions` tables in `hot_db`.
## NATS Event Integration
When the `nats-events` feature is enabled and `config.nats` is present, the storage factory
wraps `SurrealDbStorage` (or any other backend) with `EventingStorage`. Every mutation emits
a NATS JetStream event:
```text
kogral.<graph>.node.saved → NodeSaved { graph, node_id, node_type }
kogral.<graph>.node.deleted → NodeDeleted { graph, node_id }
kogral.<graph>.graph.saved → GraphSaved { name, node_count }
```
See [ADR-007: NATS Event Publishing](../architecture/adrs/007-nats-event-publishing.md) for design rationale.
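To make the subject scheme concrete, the sketch below derives each subject string from an event value. The variant and field names follow the listing above, but the field types and the `subject` method are assumptions; the real `KogralEvent` definition may differ.

```rust
/// Illustrative event type; field types are assumed, not taken from the codebase.
pub enum KogralEvent {
    NodeSaved { graph: String, node_id: String, node_type: String },
    NodeDeleted { graph: String, node_id: String },
    GraphSaved { name: String, node_count: usize },
}

impl KogralEvent {
    /// Build the subject the event would be published on.
    /// Hypothetical method following the `kogral.<graph>.<entity>.<action>` pattern.
    pub fn subject(&self) -> String {
        match self {
            KogralEvent::NodeSaved { graph, .. } => format!("kogral.{graph}.node.saved"),
            KogralEvent::NodeDeleted { graph, .. } => format!("kogral.{graph}.node.deleted"),
            KogralEvent::GraphSaved { name, .. } => format!("kogral.{name}.graph.saved"),
        }
    }
}
```

Since NATS treats `.` as the token separator in subjects, a graph name containing a dot would add extra tokens to the subject. Whether KOGRAL restricts graph names accordingly is not covered here.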
## Feature Matrix
| Feature | Includes |
|---|---|
| `filesystem` (default) | `FilesystemStorage` only |
| `surrealdb-backend` | `SurrealDbStorage` + all four engines |
| `nats-events` | `EventingStorage`, `KogralEvent`, NATS JetStream client |
| `orchestration` | `nats-events` + `stratum-orchestrator` bridge |
| `full` | All of the above |
## Related
- [ADR-003: Hybrid Storage Strategy](../architecture/adrs/003-hybrid-storage.md)
- [ADR-006: SurrealDB 3.0 Engine Abstraction](../architecture/adrs/006-surrealdb-v3-engine-abstraction.md)
- [storage/factory.rs](../../crates/kogral-core/src/storage/factory.rs)
- [storage/surrealdb.rs](../../crates/kogral-core/src/storage/surrealdb.rs)