# Storage Architecture

The Knowledge Base uses a **hybrid storage strategy** combining filesystem, SurrealDB, and in-memory backends to balance git-friendliness, scalability, and performance.

## Overview

![Storage Architecture](../diagrams/storage-architecture.svg)

The storage layer is abstracted through a common `Storage` trait, allowing KOGRAL to use different backends based on project needs:

- **Filesystem**: Git-tracked markdown files for local project knowledge
- **SurrealDB**: Scalable graph database for shared organizational knowledge
- **In-Memory**: Fast ephemeral storage for testing and caching

## Storage Trait

All storage backends implement a common async trait:

```rust
#[async_trait]
pub trait Storage: Send + Sync {
    // Nodes
    async fn save_node(&self, node: &Node) -> Result<()>;
    async fn load_node(&self, id: &str) -> Result<Node>;
    async fn delete_node(&self, id: &str) -> Result<()>;
    async fn list_nodes(&self, node_type: Option<NodeType>) -> Result<Vec<String>>;
    async fn search(&self, query: &str) -> Result<Vec<Node>>;

    // Edges
    async fn save_edge(&self, edge: &Edge) -> Result<()>;
    async fn load_edges(&self, node_id: &str) -> Result<Vec<Edge>>;
    async fn delete_edge(&self, from: &str, to: &str, relation: EdgeType) -> Result<()>;

    // Whole graphs
    async fn save_graph(&self, graph: &Graph) -> Result<()>;
    async fn load_graph(&self, name: &str) -> Result<Graph>;
}
```

This abstraction allows:

- ✅ Swapping backends without code changes (config-driven)
- ✅ Testing with in-memory storage
- ✅ Hybrid setups (filesystem + SurrealDB)
- ✅ Custom backends (implement trait + add to config)

## Filesystem Storage

### Purpose

**Git-friendly, human-readable knowledge storage**

Ideal for:

- Project-specific knowledge
- Version control with git
- Code review of knowledge changes
- Offline work
- Logseq compatibility

### File Layout

```text
.kogral/
├── config.toml              # Graph metadata
├── notes/
│   ├── 2026-01-17-topic.md
│   ├── async-patterns.md
│   └── error-handling.md
├── decisions/
│   ├── 0001-use-rust.md     # ADR format
│   ├── 0002-surrealdb.md
│   └── 0003-nickel-config.md
├── guidelines/
│   ├── rust-errors.md
│   ├── testing-standards.md
│   └── api-design.md
├── patterns/
│   ├── repository-pattern.md
│   ├── builder-pattern.md
│   └── async-error-handling.md
└── journal/
    ├── 2026-01-17.md        # Daily notes
    └── 2026-01-18.md
```

### Document Format

Each document is **markdown with YAML frontmatter**:

```markdown
---
id: note-rust-async-traits
type: note
title: Async Trait Patterns in Rust
created: 2026-01-17T10:30:00Z
modified: 2026-01-17T15:45:00Z
tags: [rust, async, traits]
status: active
relates_to:
  - pattern-async-error-handling
  - guideline-rust-async
depends_on:
  - note-rust-basics
project: knowledge-base
---

# Async Trait Patterns in Rust

Using async traits with the `async-trait` crate...

## Pattern 1: Boxed Futures

When working with async traits, use `async-trait` to avoid lifetime issues:

\`\`\`rust
#[async_trait]
trait DataSource {
    async fn fetch(&self, id: &str) -> Result;
}
\`\`\`

## Related Concepts

See [[pattern-async-error-handling]] for error handling in async contexts.
Depends on understanding [[note-rust-basics]] first.
```

### Features

**Wikilinks**: `[[other-note]]` automatically creates relationships

**Code References**: `@src/main.rs:42` links to code locations

**Git Integration**:

- Diffs show knowledge changes
- Branches for experimental knowledge
- PRs for knowledge review
- History tracking
- Blame shows knowledge authors

**Logseq Compatibility**: The format is compatible with Logseq for graph visualization

### Implementation

```rust
use std::{fs, path::PathBuf};

pub struct FilesystemStorage {
    root: PathBuf,
}

impl FilesystemStorage {
    pub fn new(root: impl Into<PathBuf>) -> Result<Self> {
        let root = root.into();
        fs::create_dir_all(&root)?;
        Ok(Self { root })
    }

    /// Maps a node type and ID to `<root>/<subdir>/<id>.md`.
    fn node_path(&self, node_type: NodeType, id: &str) -> PathBuf {
        let subdir = match node_type {
            NodeType::Note => "notes",
            NodeType::Decision => "decisions",
            NodeType::Guideline => "guidelines",
            NodeType::Pattern => "patterns",
            NodeType::Journal => "journal",
            NodeType::Execution => "executions",
        };
        self.root.join(subdir).join(format!("{}.md", id))
    }
}

#[async_trait]
impl Storage for FilesystemStorage {
    async fn save_node(&self, node: &Node) -> Result<()> {
        let path = self.node_path(node.node_type, &node.id);
        // `node_path` always yields `<root>/<subdir>/<file>`, so a parent exists.
        fs::create_dir_all(path.parent().unwrap())?;
        let markdown = format_node_as_markdown(node)?;
        fs::write(path, markdown)?;
        Ok(())
    }

    async fn load_node(&self, id: &str) -> Result<Node> {
        // The ID alone does not reveal the type, so try each type directory.
        for node_type in NodeType::iter() {
            let path = self.node_path(node_type, id);
            if path.exists() {
                let content = fs::read_to_string(path)?;
                return parse_markdown_node(&content);
            }
        }
        Err(KbError::NodeNotFound(id.to_string()))
    }

    // Remaining trait methods omitted.
}
```

### File Watching

Filesystem storage can watch for changes and auto-sync:

```rust
use std::sync::mpsc::channel;

use notify::{Config, Event, RecommendedWatcher, RecursiveMode, Watcher};

impl FilesystemStorage {
    pub async fn watch(&self, on_change: impl Fn(Event) + Send + 'static) -> Result<()> {
        let (tx, rx) = channel();
        let mut watcher = RecommendedWatcher::new(tx, Config::default())?;
        watcher.watch(&self.root, RecursiveMode::Recursive)?;

        // The channel carries `notify::Result<Event>`; watcher errors are skipped here.
        // Note: `recv` blocks the current thread until the next event arrives.
        while let Ok(result) = rx.recv() {
            if let Ok(event) = result {
                on_change(event);
            }
        }
        Ok(())
    }
}
```

## SurrealDB Storage

### Purpose

**Scalable graph database for shared organizational knowledge**

Ideal for:

- Organization-wide guidelines
- Shared patterns library
- Advanced graph queries
- Semantic search with embeddings
- Multi-project knowledge sharing

### Schema

SurrealDB schema for nodes and edges:

```sql
DEFINE TABLE node SCHEMAFULL;
DEFINE FIELD id ON node TYPE string;
DEFINE FIELD node_type ON node TYPE string
    ASSERT $value INSIDE ["note", "decision", "guideline", "pattern", "journal", "execution"];
DEFINE FIELD title ON node TYPE string;
DEFINE FIELD content ON node TYPE string;
DEFINE FIELD tags ON node TYPE array;
DEFINE FIELD status ON node TYPE string;
DEFINE FIELD created ON node TYPE datetime;
DEFINE FIELD modified ON node TYPE datetime;
DEFINE FIELD embedding ON node TYPE array; -- For semantic search

DEFINE INDEX unique_node_id ON node COLUMNS id UNIQUE;
DEFINE INDEX node_type_idx ON node COLUMNS node_type;
DEFINE INDEX node_tags_idx ON node COLUMNS tags;

-- Relationship table
DEFINE TABLE edge SCHEMAFULL;
DEFINE FIELD from ON edge TYPE record(node);
DEFINE FIELD to ON edge TYPE record(node);
DEFINE FIELD relation ON edge TYPE string
    ASSERT $value INSIDE ["relates_to", "depends_on", "implements", "extends", "supersedes", "explains"];
DEFINE FIELD strength ON edge TYPE number;
DEFINE FIELD created ON edge TYPE datetime;
```

### Graph Queries

SurrealDB's native graph support enables traversal queries directly in SQL:

**Find direct dependencies** (one hop; chain additional `->depends_on->node` steps to follow longer paths):

```sql
SELECT * FROM node:note-id->depends_on->node;
```

**Find related guidelines** (2 hops):

```sql
SELECT * FROM node:guideline-rust-errors<-relates_to<-node<-relates_to<-node;
```

**Semantic search** (cosine similarity):

```sql
SELECT *, vector::similarity::cosine(embedding, $query_embedding) AS score
FROM node
WHERE score > 0.6
ORDER BY score DESC
LIMIT 10;
```

### Implementation

```rust
use surrealdb::engine::remote::ws::{Client, Ws};
use surrealdb::opt::auth::Root;
use surrealdb::Surreal;

pub struct SurrealDbStorage {
    db: Surreal<Client>,
    namespace: String,
    database: String,
}

impl SurrealDbStorage {
    pub async fn new(config: &KbConfig) -> Result<Self> {
        let db = Surreal::new::<Ws>(&config.storage.secondary.url).await?;

        db.signin(Root {
            username: &config.storage.secondary.username,
            password: &config.storage.secondary.password,
        })
        .await?;

        db.use_ns(&config.storage.secondary.namespace)
            .use_db(&config.storage.secondary.database)
            .await?;

        Ok(Self {
            db,
            namespace: config.storage.secondary.namespace.clone(),
            database: config.storage.secondary.database.clone(),
        })
    }
}

#[async_trait]
impl Storage for SurrealDbStorage {
    async fn save_node(&self, node: &Node) -> Result<()> {
        let _: Option<Node> = self.db
            .create(("node", &node.id))
            .content(node)
            .await?;
        Ok(())
    }

    async fn load_node(&self, id: &str) -> Result<Node> {
        self.db
            .select(("node", id))
            .await?
            .ok_or_else(|| KbError::NodeNotFound(id.to_string()))
    }

    async fn search(&self, query: &str) -> Result<Vec<Node>> {
        let sql = "SELECT * FROM node WHERE title ~ $query OR content ~ $query";
        let mut result = self.db.query(sql).bind(("query", query)).await?;
        let nodes: Vec<Node> = result.take(0)?;
        Ok(nodes)
    }

    // Remaining trait methods omitted.
}
```

### Multi-Tenancy

SurrealDB supports namespaces and databases for isolation:

```text
Namespace: "kogral"
├── Database: "shared" (organization-wide)
│   └── Nodes: guidelines, patterns, policies
└── Database: "project-foo" (project-specific)
    └── Nodes: project decisions, local notes
```

Configuration:

```nickel
storage = {
  secondary = {
    enabled = true,
    namespace = "kogral",
    database = "shared",  # or "project-foo" for project-specific
  },
}
```

## In-Memory Storage

### Purpose

**Fast ephemeral storage for testing and caching**

Ideal for:

- Unit tests (isolated, deterministic)
- Integration tests (fast setup/teardown)
- Session caching (temporary graphs)
- Development mode (rapid iteration)

### Implementation

```rust
use dashmap::DashMap;

pub struct MemoryStorage {
    nodes: DashMap<String, Node>,
    edges: DashMap<(String, String, EdgeType), Edge>,
}

impl MemoryStorage {
    pub fn new() -> Self {
        Self {
            nodes: DashMap::new(),
            edges: DashMap::new(),
        }
    }
}

#[async_trait]
impl Storage for MemoryStorage {
    async fn save_node(&self, node: &Node) -> Result<()> {
        self.nodes.insert(node.id.clone(), node.clone());
        Ok(())
    }

    async fn load_node(&self, id: &str) -> Result<Node> {
        self.nodes
            .get(id)
            .map(|entry| entry.value().clone())
            .ok_or_else(|| KbError::NodeNotFound(id.to_string()))
    }

    async fn list_nodes(&self, node_type: Option<NodeType>) -> Result<Vec<String>> {
        // `None` lists every node; `Some(t)` keeps only nodes of that type.
        Ok(self.nodes
            .iter()
            .filter(|entry| node_type.map_or(true, |t| entry.value().node_type == t))
            .map(|entry| entry.key().clone())
            .collect())
    }

    async fn search(&self, query: &str) -> Result<Vec<Node>> {
        // Case-sensitive substring match over title, content, and tags.
        Ok(self.nodes
            .iter()
            .filter(|entry| {
                let node = entry.value();
                node.title.contains(query)
                    || node.content.contains(query)
                    || node.tags.iter().any(|tag| tag.contains(query))
            })
            .map(|entry| entry.value().clone())
            .collect())
    }

    // Remaining trait methods omitted.
}
```

### Concurrency

`DashMap` shards its entries across internally locked segments, so multiple tasks can read and write concurrently without an external mutex:

```rust
// Multiple tasks can read/write simultaneously
let storage = Arc::new(MemoryStorage::new());

tokio::spawn({
    let storage = storage.clone();
    async move {
        storage.save_node(&node1).await.unwrap();
    }
});

tokio::spawn({
    let storage = storage.clone();
    async move {
        storage.save_node(&node2).await.unwrap();
    }
});
```

## Hybrid Storage Strategy

### Architecture

Combine filesystem and SurrealDB to get the strengths of both:

```text
Project Graph (local)          Shared Graph (central)
        ↓                              ↓
Filesystem Storage             SurrealDB Storage
  .kogral/notes/                 namespace: "kogral"
  .kogral/decisions/             database: "shared"
  .kogral/guidelines/            nodes: shared guidelines
        ↓                              ↓
        └─────── Sync Mechanism ───────┘
                 (bidirectional)
```

**Local advantages**:

- Git-tracked (version control)
- Offline work
- Fast local queries
- Human-readable diffs

**Central advantages**:

- Shared across projects
- Advanced graph queries
- Semantic search at scale
- Organization-wide knowledge

### Sync Mechanism

Bidirectional synchronization keeps the two backends consistent:

```rust
use std::{sync::Arc, time::Duration};

use tokio::time::sleep;

pub struct SyncManager {
    filesystem: Arc<FilesystemStorage>,
    surrealdb: Arc<SurrealDbStorage>,
    config: SyncConfig,
}

impl SyncManager {
    pub async fn sync_to_central(&self) -> Result<()> {
        let nodes = self.filesystem.list_nodes(None).await?;
        for node_id in nodes {
            let node = self.filesystem.load_node(&node_id).await?;
            self.surrealdb.save_node(&node).await?;
        }
        Ok(())
    }

    pub async fn sync_from_central(&self) -> Result<()> {
        let nodes = self.surrealdb.list_nodes(None).await?;
        for node_id in nodes {
            let node = self.surrealdb.load_node(&node_id).await?;
            self.filesystem.save_node(&node).await?;
        }
        Ok(())
    }

    // Takes `Arc<Self>` so the spawned tasks can own a handle to the manager.
    pub async fn watch_and_sync(self: Arc<Self>) -> Result<()> {
        let manager = Arc::clone(&self);
        self.filesystem.watch(move |event| {
            if !event.paths.is_empty() {
                let manager = Arc::clone(&manager);
                // Wait `debounce_ms`, then sync. Each event schedules its own
                // sync; coalescing bursts would need extra bookkeeping.
                tokio::spawn(async move {
                    sleep(Duration::from_millis(manager.config.debounce_ms)).await;
                    manager.sync_to_central().await
                });
            }
        }).await
    }
}
```

### Conflict Resolution

When the two backends hold different versions of the same node:

**Strategy 1: Last-Write-Wins** (based on `modified` timestamp)

```rust
if filesystem_node.modified > surrealdb_node.modified {
    surrealdb.save_node(&filesystem_node).await?;
} else {
    filesystem.save_node(&surrealdb_node).await?;
}
```

**Strategy 2: User Prompt** (safe, explicit)

```rust
if filesystem_node != surrealdb_node {
    match prompt_user(&filesystem_node, &surrealdb_node)? {
        Choice::KeepLocal => surrealdb.save_node(&filesystem_node).await?,
        Choice::KeepCentral => filesystem.save_node(&surrealdb_node).await?,
        Choice::Merge => { /* merge and save */ }
    }
}
```

**Strategy 3: Both** (create branches)

```rust
// Save both as separate nodes with relationship
filesystem_node.id = format!("{}-local", original_id);
surrealdb_node.id = format!("{}-central", original_id);

storage.save_node(&filesystem_node).await?;
storage.save_node(&surrealdb_node).await?;

storage.save_edge(&Edge {
    from: filesystem_node.id,
    to: surrealdb_node.id,
    relation: EdgeType::RelatesTo,
    strength: 1.0,
}).await?;
```

## Configuration

The storage backend is selected via config:

```nickel
{
  storage = {
    # Primary storage (always used)
    primary = 'filesystem,  # or 'memory, 'surrealdb

    # Secondary storage (optional, for hybrid setup)
    secondary = {
      enabled = true,  # Enable SurrealDB
      type = 'surrealdb,
      url = "ws://localhost:8000",
      namespace = "kogral",
      database = "shared",
      username = "root",  # Or from env var
      password = "root",
    },
  },

  sync = {
    auto_index = true,   # Auto-sync to secondary
    debounce_ms = 500,   # Wait before syncing
    watch_paths = ["notes", "decisions", "guidelines"],
  },
}
```

## Performance Considerations

### Filesystem Storage

- **Read**: O(1) if the ID is known, O(n) for scanning
- **Write**: Fast (single file write)
- **Search**: O(n) text scan, slow for large graphs
- **Optimization**: Rely on the file system cache and load nodes lazily

### SurrealDB Storage

- **Read**: O(log n) with indexes
- **Write**: Fast with async commits
- **Search**: Index-backed when a full-text index is defined, avoiding the O(n) scan
- **Graph traversal**: Optimized with native graph support
- **Optimization**: Index on tags, node_type, embeddings

### In-Memory Storage

- **Read**: O(1) on average with `DashMap`
- **Write**: O(1) on average, with per-shard locking
- **Search**: O(n) iteration
- **Memory**: Entire graph in RAM

## Security

- **Filesystem**: Unix permissions, path sanitization
- **SurrealDB**: Authentication, namespaces, TLS
- **In-Memory**: Process isolation, no persistence

## See Also

- **Configuration**: [Storage Configuration](../config/storage.md)
- **Sync Guide**: [Filesystem ↔ SurrealDB Sync](sync.md)
- **ADR**: [Hybrid Storage Decision](../architecture/adrs/003-hybrid-storage.md)
- **API Reference**: [Storage Trait Documentation](../api/storage-trait.md)
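
The wikilink feature of the filesystem backend (`[[other-note]]` creating relationships) can be illustrated with a small extractor. This is a minimal sketch rather than KOGRAL's actual parser: the function name `extract_wikilinks` is hypothetical, and it assumes link targets never contain nested brackets.

```rust
/// Collects the targets of `[[wikilink]]` references in a markdown body,
/// in order of first appearance and without duplicates.
/// Hypothetical helper for illustration; not part of the documented API.
pub fn extract_wikilinks(markdown: &str) -> Vec<String> {
    let mut targets: Vec<String> = Vec::new();
    let mut rest = markdown;

    while let Some(open) = rest.find("[[") {
        let after_open = &rest[open + 2..];
        match after_open.find("]]") {
            Some(close) => {
                let target = after_open[..close].trim();
                // Skip empty links and targets that were already collected.
                if !target.is_empty() && !targets.iter().any(|t| t == target) {
                    targets.push(target.to_string());
                }
                rest = &after_open[close + 2..];
            }
            // An unterminated "[[" means no further complete links exist.
            None => break,
        }
    }
    targets
}
```

Each extracted target could then be turned into an edge (for example `relates_to`) from the containing note to the referenced node; how the real implementation chooses the relation type is not specified here.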
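
The last-write-wins strategy from the Conflict Resolution section comes down to comparing `modified` timestamps. The sketch below isolates that decision as a pure function so it can be unit-tested without either backend. `NodeVersion`, `SyncDirection`, and `resolve_last_write_wins` are hypothetical names, and timestamps are modeled as seconds since the Unix epoch instead of the project's real datetime type.

```rust
use std::cmp::Ordering;

/// Minimal stand-in for a stored node: only the fields the decision needs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NodeVersion {
    pub id: String,
    /// Last modification time, in seconds since the Unix epoch (assumption).
    pub modified: u64,
}

/// Which copy should overwrite the other.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncDirection {
    LocalToCentral,
    CentralToLocal,
    InSync,
}

/// Picks the newer copy by `modified` timestamp.
/// Equal timestamps are reported as `InSync`, so no write is issued.
pub fn resolve_last_write_wins(local: &NodeVersion, central: &NodeVersion) -> SyncDirection {
    match local.modified.cmp(&central.modified) {
        Ordering::Greater => SyncDirection::LocalToCentral,
        Ordering::Less => SyncDirection::CentralToLocal,
        Ordering::Equal => SyncDirection::InSync,
    }
}
```

One design difference from the Strategy 1 snippet: there, a tie falls into the `else` branch and the central copy is written to the filesystem, whereas this sketch skips the write on a tie. Either choice is defensible; skipping avoids a redundant file write that would otherwise retrigger the file watcher.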
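
The Security section lists path sanitization for the filesystem backend without showing what it involves. Because node IDs become file names under the storage root, an ID such as `../../etc/passwd` must never reach a path join. A minimal sketch of one possible check follows; the allowed character set, the 128-byte limit, and the names `is_safe_node_id` and `checked_node_path` are assumptions for illustration, not KOGRAL's documented rules.

```rust
use std::path::{Path, PathBuf};

/// Returns true when `id` is safe to use as a file stem under the storage root.
/// Assumed rule set: non-empty, at most 128 bytes, and only ASCII letters,
/// digits, `-`, or `_` (so no separators, dots, or parent references).
pub fn is_safe_node_id(id: &str) -> bool {
    !id.is_empty()
        && id.len() <= 128
        && id
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
}

/// Builds `<root>/<subdir>/<id>.md`, or returns `None` for an unsafe ID.
pub fn checked_node_path(root: &Path, subdir: &str, id: &str) -> Option<PathBuf> {
    if is_safe_node_id(id) {
        Some(root.join(subdir).join(format!("{id}.md")))
    } else {
        None
    }
}
```

An allow-list like this is stricter than stripping `..` segments, but it is easier to reason about, and every ID shown in the file layout above (for example `0001-use-rust` or `2026-01-17`) passes it.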