kogral/docs/architecture/overview.md
2026-01-23 16:11:07 +00:00


# System Architecture
Comprehensive overview of the KOGRAL architecture.
## High-Level Architecture
![Architecture Overview](../diagrams/architecture-overview.svg)
KOGRAL consists of three main layers:
1. **User Interfaces**: kogral-cli (terminal), kogral-mcp (AI integration), NuShell scripts (automation)
2. **Core Library (kogral-core)**: Rust library with graph engine, storage abstraction, embeddings, query engine
3. **Storage Backends**: Filesystem (git-friendly), SurrealDB (scalable), In-Memory (cache/testing)
## Component Details
### kogral-cli (Command-Line Interface)
**Purpose**: Primary user interface for local knowledge management.
**Commands** (13 total):
- `init`: Initialize `.kogral/` directory
- `add`: Create nodes (note, decision, guideline, pattern, journal)
- `search`: Text and semantic search
- `link`: Create relationships between nodes
- `list`: List all nodes
- `show`: Display node details
- `delete`: Remove nodes
- `graph`: Visualize knowledge graph
- `sync`: Sync filesystem ↔ SurrealDB
- `serve`: Start MCP server
- `import`: Import from Logseq
- `export`: Export to Logseq/JSON
- `config`: Manage configuration
**Technology**: Rust + clap (derive API)
**Features**:
- Colored terminal output
- Interactive prompts
- Dry-run modes
- Validation before operations
### kogral-mcp (MCP Server)
**Purpose**: AI integration via Model Context Protocol.
**Protocol**: JSON-RPC 2.0 over stdio
**Components**:
1. **Tools** (7):
   - `kogral/search`: Query knowledge base
   - `kogral/add_note`: Create notes
   - `kogral/add_decision`: Create ADRs
   - `kogral/link`: Create relationships
   - `kogral/get_guidelines`: Retrieve guidelines with inheritance
   - `kb/list_graphs`: List available graphs
   - `kogral/export`: Export to formats
2. **Resources** (6 URIs):
   - `kogral://project/notes`
   - `kogral://project/decisions`
   - `kogral://project/guidelines`
   - `kogral://project/patterns`
   - `kogral://shared/guidelines`
   - `kogral://shared/patterns`
3. **Prompts** (2):
   - `kogral/summarize_project`: Generate project summary
   - `kogral/find_related`: Find related nodes
**Integration**: Claude Code via `~/.config/claude/config.json`
### NuShell Scripts
**Purpose**: Automation and maintenance tasks.
**Scripts** (6):
- `kogral-sync.nu`: Filesystem ↔ SurrealDB sync
- `kogral-backup.nu`: Archive knowledge base
- `kogral-reindex.nu`: Rebuild embeddings
- `kogral-import-logseq.nu`: Import from Logseq
- `kogral-export-logseq.nu`: Export to Logseq
- `kogral-stats.nu`: Graph statistics
**Features**:
- Colored output
- Dry-run modes
- Progress indicators
- Error handling
## Core Library (kogral-core)
### Models
**Graph**:
```rust
pub struct Graph {
    pub name: String,
    pub version: String,
    pub nodes: HashMap<String, Node>, // ID → Node
    pub edges: Vec<Edge>,
    pub metadata: HashMap<String, Value>,
}
```
**Node**:
```rust
pub struct Node {
    pub id: String,
    pub node_type: NodeType,
    pub title: String,
    pub content: String,
    pub tags: Vec<String>,
    pub status: NodeStatus,
    pub created: DateTime<Utc>,
    pub modified: DateTime<Utc>,
    // ... relationships, metadata
}
```
**Edge**:
```rust
pub struct Edge {
    pub from: String,
    pub to: String,
    pub relation: EdgeType,
    pub strength: f32,
    pub created: DateTime<Utc>,
}
```
### Storage Trait
```rust
#[async_trait]
pub trait Storage: Send + Sync {
    /// Save a complete graph to storage
    async fn save_graph(&mut self, graph: &Graph) -> Result<()>;

    /// Load a graph from storage
    async fn load_graph(&self, name: &str) -> Result<Graph>;

    /// Save a single node to storage
    async fn save_node(&mut self, node: &Node) -> Result<()>;

    /// Load a node by ID
    async fn load_node(&self, graph_name: &str, node_id: &str) -> Result<Node>;

    /// Delete a node
    async fn delete_node(&mut self, graph_name: &str, node_id: &str) -> Result<()>;

    /// List all graphs
    async fn list_graphs(&self) -> Result<Vec<String>>;

    /// List nodes in a graph, optionally filtered by type
    async fn list_nodes(&self, graph_name: &str, node_type: Option<&str>) -> Result<Vec<Node>>;
}
```
**Implementations**:
1. `FilesystemStorage`: Git-friendly markdown files
2. `MemoryStorage`: In-memory with DashMap
3. `SurrealDbStorage`: Scalable graph database
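
To make the storage abstraction concrete, here is a minimal sketch of an in-memory backend. It is not the project's `MemoryStorage`: the trait is a synchronous, trimmed-down stand-in (`NodeStore`, a hypothetical name), `save_node` takes an explicit graph name, the `Node` type is reduced to three fields, and a plain `HashMap` replaces DashMap so the example needs only the standard library.

```rust
use std::collections::HashMap;

// Simplified stand-in for the real `Node` (assumption for this sketch).
#[derive(Clone, Debug, PartialEq)]
pub struct Node {
    pub id: String,
    pub node_type: String,
    pub title: String,
}

// Hypothetical error type; the real code uses `KbError`.
#[derive(Debug, PartialEq)]
pub enum StorageError {
    NodeNotFound(String),
}

/// Synchronous, trimmed-down variant of the `Storage` trait.
pub trait NodeStore {
    fn save_node(&mut self, graph_name: &str, node: &Node) -> Result<(), StorageError>;
    fn load_node(&self, graph_name: &str, node_id: &str) -> Result<Node, StorageError>;
    fn delete_node(&mut self, graph_name: &str, node_id: &str) -> Result<(), StorageError>;
    fn list_nodes(&self, graph_name: &str, node_type: Option<&str>) -> Vec<Node>;
}

/// Graph name → (node ID → node), all held in RAM.
#[derive(Default)]
pub struct MemoryStore {
    graphs: HashMap<String, HashMap<String, Node>>,
}

impl NodeStore for MemoryStore {
    fn save_node(&mut self, graph_name: &str, node: &Node) -> Result<(), StorageError> {
        self.graphs
            .entry(graph_name.to_string())
            .or_default()
            .insert(node.id.clone(), node.clone());
        Ok(())
    }

    fn load_node(&self, graph_name: &str, node_id: &str) -> Result<Node, StorageError> {
        self.graphs
            .get(graph_name)
            .and_then(|graph| graph.get(node_id))
            .cloned()
            .ok_or_else(|| StorageError::NodeNotFound(node_id.to_string()))
    }

    fn delete_node(&mut self, graph_name: &str, node_id: &str) -> Result<(), StorageError> {
        self.graphs
            .get_mut(graph_name)
            .and_then(|graph| graph.remove(node_id))
            .map(|_| ())
            .ok_or_else(|| StorageError::NodeNotFound(node_id.to_string()))
    }

    fn list_nodes(&self, graph_name: &str, node_type: Option<&str>) -> Vec<Node> {
        let mut nodes = self
            .graphs
            .get(graph_name)
            .map(|graph| graph.values().cloned().collect::<Vec<Node>>())
            .unwrap_or_default();
        if let Some(wanted) = node_type {
            nodes.retain(|node| node.node_type == wanted);
        }
        // HashMap iteration order is unspecified, so sort for stable output.
        nodes.sort_by(|a, b| a.id.cmp(&b.id));
        nodes
    }
}
```

Because callers depend only on the trait, the same code path can run against this store in tests and against a persistent backend in production.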
### Embedding Provider Trait
```rust
#[async_trait]
pub trait EmbeddingProvider: Send + Sync {
    async fn embed(&self, texts: Vec<String>) -> Result<Vec<Vec<f32>>>;
    fn dimensions(&self) -> usize;
    fn model_name(&self) -> &str;
}
```
**Implementations**:
1. `FastEmbedProvider`: Local fastembed
2. `RigEmbeddingProvider`: OpenAI, Claude, Ollama (via rig-core)
### Parser
**Input**: Markdown file with YAML frontmatter
**Output**: `Node` struct
**Features**:
- YAML frontmatter extraction
- Markdown body parsing
- Wikilink detection (`[[linked-note]]`)
- Code reference parsing (`@file.rs:42`)
**Example**:
```markdown
---
id: note-123
type: note
title: My Note
tags: [rust, async]
---
# My Note
Content with [[other-note]] and @src/main.rs:10
```
```rust
Node {
    id: "note-123",
    node_type: NodeType::Note,
    title: "My Note",
    content: "Content with [[other-note]] and @src/main.rs:10",
    tags: vec!["rust", "async"],
    // ... parsed wikilinks, code refs
}
```
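
The parsing steps listed above can be sketched with the standard library alone. This is an illustration, not the project's parser: the function names are hypothetical, the frontmatter is returned as raw text rather than decoded YAML, and code references are assumed to be whitespace-delimited `@path:line` tokens.

```rust
/// Split a document into (frontmatter, body) when it opens with a `---` fence.
pub fn split_frontmatter(doc: &str) -> Option<(&str, &str)> {
    let rest = doc.strip_prefix("---\n")?;
    let end = rest.find("\n---\n")?;
    // Skip the 5 bytes of the closing "\n---\n" fence.
    Some((&rest[..end], &rest[end + 5..]))
}

/// Collect the targets of `[[wikilinks]]` in order of appearance.
pub fn extract_wikilinks(body: &str) -> Vec<String> {
    let mut links = Vec::new();
    let mut rest = body;
    while let Some(start) = rest.find("[[") {
        let after = &rest[start + 2..];
        match after.find("]]") {
            Some(end) => {
                links.push(after[..end].trim().to_string());
                rest = &after[end + 2..];
            }
            // Unterminated link: stop scanning.
            None => break,
        }
    }
    links
}

/// Collect `@path:line` code references as (path, line) pairs.
pub fn extract_code_refs(body: &str) -> Vec<(String, u32)> {
    body.split_whitespace()
        .filter_map(|word| {
            let reference = word.strip_prefix('@')?;
            let (path, line) = reference.rsplit_once(':')?;
            // Drop trailing punctuation such as "." or "," after the line number.
            let line = line.trim_end_matches(|c: char| !c.is_ascii_digit());
            Some((path.to_string(), line.parse::<u32>().ok()?))
        })
        .collect()
}
```

Applied to the example document, the body yields the wikilink `other-note` and the code reference `("src/main.rs", 10)`.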
## Configuration System
### Nickel Schema
```nickel
# schemas/kogral-config.ncl
{
  KbConfig = {
    graph | GraphConfig,
    storage | StorageConfig,
    embeddings | EmbeddingConfig,
    templates | TemplateConfig,
    query | QueryConfig,
    mcp | McpConfig,
    sync | SyncConfig,
  },
}
```
### Loading Process
```text
User writes: .kogral/config.ncl
    ↓ [nickel export --format json]
JSON intermediate
    ↓ [serde_json::from_str]
KbConfig struct (Rust)
    ↓
Runtime behavior
```
**Double Validation**:
1. Nickel contracts: Type-safe, enum validation
2. Serde deserialization: Rust type checking
**Benefits**:
- Errors caught at export time
- Runtime guaranteed valid config
- Self-documenting schemas
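
The first step of the loading process can be sketched as a subprocess call. The function names below are hypothetical, and passing the config path as the final argument to `nickel export` is an assumption about how the CLI is invoked; the second step (`serde_json::from_str` into `KbConfig`) is omitted because it needs the `serde_json` crate.

```rust
use std::path::Path;
use std::process::Command;

/// Build the `nickel export --format json <path>` invocation without running it.
pub fn nickel_export_command(config_path: &Path) -> Command {
    let mut cmd = Command::new("nickel");
    cmd.arg("export").arg("--format").arg("json").arg(config_path);
    cmd
}

/// Run the export and return the JSON text.
/// Contract violations are reported by Nickel on stderr, so that text
/// becomes the error value here.
pub fn export_config_json(config_path: &Path) -> Result<String, String> {
    let output = nickel_export_command(config_path)
        .output()
        .map_err(|e| format!("failed to start nickel: {e}"))?;
    if !output.status.success() {
        return Err(String::from_utf8_lossy(&output.stderr).into_owned());
    }
    String::from_utf8(output.stdout).map_err(|e| format!("non-UTF-8 output: {e}"))
}
```

Keeping command construction separate from execution lets the argument list be checked in tests without Nickel installed.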
## Storage Architecture
### Hybrid Strategy
**Local Graph** (per project):
- Storage: Filesystem (`.kogral/` directory)
- Format: Markdown + YAML frontmatter
- Version control: Git
- Scope: Project-specific knowledge
**Shared Graph** (organization):
- Storage: SurrealDB (or synced filesystem)
- Format: Same markdown (for compatibility)
- Version control: Optional
- Scope: Organization-wide guidelines
**Sync**:
```text
Filesystem (.kogral/)
↕ [bidirectional sync]
SurrealDB (central)
```
### File Layout
```text
.kogral/
├── config.toml              # Graph metadata
├── notes/
│   ├── async-patterns.md    # Individual note
│   └── error-handling.md
├── decisions/
│   ├── 0001-use-rust.md     # ADR format
│   └── 0002-surrealdb.md
├── guidelines/
│   ├── rust-errors.md       # Project guideline
│   └── testing.md
├── patterns/
│   └── repository.md
└── journal/
    ├── 2026-01-17.md        # Daily journal
    └── 2026-01-18.md
## Query Engine
### Text Search
```rust
// Case-sensitive substring match over title, content, and tags
let results: Vec<&Node> = graph.nodes.values()
    .filter(|node| {
        node.title.contains(&query) ||
        node.content.contains(&query) ||
        node.tags.iter().any(|tag| tag.contains(&query))
    })
    .collect();
```
### Semantic Search
```rust
let query_embedding = embeddings.embed(vec![query]).await?;
let mut scored: Vec<_> = graph.nodes.values()
    .filter_map(|node| {
        // Skip nodes that have no cached embedding
        let node_embedding = node.embedding.as_ref()?;
        let similarity = cosine_similarity(&query_embedding[0], node_embedding);
        (similarity >= threshold).then_some((node, similarity))
    })
    .collect();
// Highest similarity first; total_cmp avoids a panic if a score is NaN
scored.sort_by(|a, b| b.1.total_cmp(&a.1));
```
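
The snippet above relies on a `cosine_similarity` helper that is not shown. A minimal sketch follows, together with a hypothetical `rank` function that applies the same threshold-then-sort logic to plain `(id, embedding)` pairs; returning `0.0` for mismatched or zero-length vectors is an assumption of this sketch.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 when it is undefined.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Score every candidate, keep those at or above `threshold`, best first.
pub fn rank(query: &[f32], candidates: &[(String, Vec<f32>)], threshold: f32) -> Vec<(String, f32)> {
    let mut scored: Vec<(String, f32)> = candidates
        .iter()
        .filter_map(|(id, embedding)| {
            let similarity = cosine_similarity(query, embedding);
            (similarity >= threshold).then(|| (id.clone(), similarity))
        })
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

Every candidate is scored before the threshold is applied, so the cost stays linear in the number of embedded nodes.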
### Cross-Graph Query
```rust
// Query both project and shared graphs
let project_results = project_graph.search(&query).await?;
let shared_results = shared_graph.search(&query).await?;
// Merge with deduplication
let combined = merge_results(project_results, shared_results);
```
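
`merge_results` is not defined in this overview. One plausible sketch is below; the `SearchHit` type is hypothetical, and the policy that the project graph wins when both graphs return the same ID is an assumption rather than documented behavior.

```rust
use std::collections::HashSet;

// Hypothetical result type for this sketch.
#[derive(Clone, Debug, PartialEq)]
pub struct SearchHit {
    pub id: String,
    pub score: f32,
}

/// Merge two result lists, dropping duplicates by node ID.
/// Project hits are visited first, so they win over shared hits with the same ID.
pub fn merge_results(project: Vec<SearchHit>, shared: Vec<SearchHit>) -> Vec<SearchHit> {
    let mut seen = HashSet::new();
    let mut merged: Vec<SearchHit> = project
        .into_iter()
        .chain(shared)
        // `insert` returns false for an ID that was already seen.
        .filter(|hit| seen.insert(hit.id.clone()))
        .collect();
    // Best score first across both graphs.
    merged.sort_by(|a, b| b.score.total_cmp(&a.score));
    merged
}
```

Letting the project graph take precedence matches the idea that project-specific knowledge overrides organization-wide defaults, but a different policy (for example, keeping the higher score) would be a small change to the filter.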
## MCP Protocol Flow
```text
Claude Code             kogral-mcp               kogral-core
    │                        │                        │
    ├─ JSON-RPC request ───→ │                        │
    │  kogral/search         │                        │
    │  {"query": "rust"}     │                        │
    │                        ├─ search() ───────────→ │
    │                        │                        │
    │                        │                  Query engine
    │                        │                 Text + semantic
    │                        │                        │
    │                        │ ←──── results ─────────┤
    │                        │                        │
    │ ←─ JSON-RPC response ──┤                        │
    │  {"results": [...]}    │                        │
```
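
The diagram abbreviates the request. On the wire, an MCP client normally wraps a tool invocation in a `tools/call` JSON-RPC message; the sketch below builds such a message for `kogral/search`. The function names are hypothetical, the argument key `query` is taken from the diagram, and JSON is assembled by hand only to keep the example free of external crates.

```rust
/// Escape characters that must not appear raw inside a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a JSON-RPC 2.0 `tools/call` request for the `kogral/search` tool.
pub fn search_request(id: u64, query: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{id},"method":"tools/call","params":{{"name":"kogral/search","arguments":{{"query":"{query}"}}}}}}"#,
        id = id,
        query = json_escape(query),
    )
}
```

Over the stdio transport, each message of this form is written to the server's standard input as a single line, and the response carrying the matching `id` is read back from its standard output.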
## Template System
**Engine**: Tera (Jinja2-like)
**Templates**:
1. **Document Templates** (6):
   - `note.md.tera`
   - `decision.md.tera`
   - `guideline.md.tera`
   - `pattern.md.tera`
   - `journal.md.tera`
   - `execution.md.tera`
2. **Export Templates** (4):
   - `logseq-page.md.tera`
   - `logseq-journal.md.tera`
   - `summary.md.tera`
   - `graph.json.tera`
**Usage**:
```rust
let tera = Tera::new("templates/**/*.tera")?;
let rendered = tera.render("note.md.tera", &context)?;
```
## Error Handling
**Strategy**: `thiserror` for structured errors
```rust
#[derive(Error, Debug)]
pub enum KbError {
    #[error("Storage error: {0}")]
    Storage(String),

    #[error("Node not found: {0}")]
    NodeNotFound(String),

    #[error("Configuration error: {0}")]
    Config(String),

    #[error("Parse error: {0}")]
    Parse(String),

    #[error("Embedding error: {0}")]
    Embedding(String),
}
```
**Propagation**: `?` operator throughout
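
To show what `?` propagation looks like in practice, here is a standard-library-only sketch. It writes out by hand the `Display` impl that `thiserror` would generate for two of the variants, and the `From<std::io::Error>` conversion and the `read_node_file` function are assumptions added for illustration.

```rust
use std::fmt;
use std::fs;
use std::path::Path;

#[derive(Debug)]
pub enum KbError {
    Storage(String),
    NodeNotFound(String),
}

// Hand-written equivalent of the `#[error("...")]` attributes.
impl fmt::Display for KbError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            KbError::Storage(msg) => write!(f, "Storage error: {msg}"),
            KbError::NodeNotFound(id) => write!(f, "Node not found: {id}"),
        }
    }
}

impl std::error::Error for KbError {}

// Assumed conversion: lets `?` turn I/O failures into `KbError::Storage`.
impl From<std::io::Error> for KbError {
    fn from(err: std::io::Error) -> Self {
        KbError::Storage(err.to_string())
    }
}

/// Read a node's markdown file; any I/O error propagates via `?`.
pub fn read_node_file(path: &Path) -> Result<String, KbError> {
    let text = fs::read_to_string(path)?;
    Ok(text)
}
```

The `?` in `read_node_file` calls `From::from` on the error, so callers only ever see `KbError`.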
## Testing Strategy
**Unit Tests**: Per module (models, parser, storage)
**Integration Tests**: Full workflow (add → save → load → query)
**Test Coverage**:
- kogral-core: 48 tests
- kogral-mcp: 5 tests
- Total: 56 tests
**Test Data**: Fixtures in `tests/fixtures/`
## Performance Considerations
**Node Lookup**: O(1) via HashMap
**Semantic Search**: O(n) linear scan over all embedded nodes; the threshold filter discards low-similarity candidates before sorting but does not end the scan early
**Storage**:
- Filesystem: Lazy loading (load on demand)
- Memory: Full graph in RAM
- SurrealDB: Query optimization (indexes)
**Embeddings**:
- Cache embeddings in node metadata
- Batch processing (configurable batch size)
- Async generation (non-blocking)
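
Batch processing can be sketched as follows. The function name is hypothetical and the provider is modeled as a synchronous closure, whereas the real `EmbeddingProvider::embed` is async.

```rust
/// Embed `texts` in batches of at most `batch_size`,
/// calling `embed_batch` once per batch and preserving input order.
pub fn embed_in_batches<F>(
    texts: &[String],
    batch_size: usize,
    mut embed_batch: F,
) -> Vec<Vec<f32>>
where
    F: FnMut(&[String]) -> Vec<Vec<f32>>,
{
    let mut embeddings = Vec::with_capacity(texts.len());
    // `chunks` panics on a size of 0, so clamp to at least 1.
    for batch in texts.chunks(batch_size.max(1)) {
        embeddings.extend(embed_batch(batch));
    }
    embeddings
}
```

With five texts and a batch size of two, the provider is called three times (2 + 2 + 1), which bounds the size of each request sent to a local model or remote API.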
## Security
**No unsafe code**: `#![forbid(unsafe_code)]`
**Input validation**:
- Nickel contracts validate config
- serde validates JSON
- Custom validation for user input
**File operations**:
- Path sanitization (no `../` traversal)
- Permissions checking
- Atomic writes (temp file + rename)
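
The path sanitization and atomic write points can be sketched with the standard library. Both function names are hypothetical, and the `.tmp` suffix for the temporary file is an assumption; the rename is only atomic when the temporary file and the target are on the same filesystem.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Component, Path, PathBuf};

/// Join `relative` onto `root`, rejecting absolute paths and any `..` component.
pub fn safe_join(root: &Path, relative: &str) -> Option<PathBuf> {
    let rel = Path::new(relative);
    let only_plain = rel
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir));
    if rel.as_os_str().is_empty() || !only_plain {
        return None;
    }
    Some(root.join(rel))
}

/// Write to a temporary sibling file, then rename it over the target,
/// so readers never observe a partially written file.
pub fn atomic_write(target: &Path, contents: &[u8]) -> io::Result<()> {
    let mut tmp_name = target
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "target has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = target.with_file_name(tmp_name);

    let mut file = fs::File::create(&tmp_path)?;
    file.write_all(contents)?;
    // Flush data to disk before the rename makes it visible.
    file.sync_all()?;
    fs::rename(&tmp_path, target)
}
```

Checking path components rather than searching the string for `../` also catches absolute paths, which would otherwise replace the root entirely when joined.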
## Scalability
**Small Projects** (< 1000 nodes):
- Filesystem storage
- In-memory search
- Local embeddings (fastembed)
**Medium Projects** (1000-10,000 nodes):
- Filesystem + SurrealDB sync
- Semantic search with caching
- Cloud embeddings (OpenAI/Claude)
**Large Organizations** (> 10,000 nodes):
- SurrealDB primary
- Distributed embeddings
- Multi-graph federation
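
The tiers above can be expressed as a small lookup. The type and function names are hypothetical, and treating exactly 10,000 nodes as "medium" is an assumption, since the ranges in the text leave that boundary open.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Tier {
    Small,
    Medium,
    Large,
}

/// Map a node count to the scalability tier described above.
pub fn tier_for(node_count: usize) -> Tier {
    match node_count {
        0..=999 => Tier::Small,
        1000..=10_000 => Tier::Medium,
        _ => Tier::Large,
    }
}

/// Storage setup suggested for each tier.
pub fn recommended_storage(tier: Tier) -> &'static str {
    match tier {
        Tier::Small => "filesystem",
        Tier::Medium => "filesystem + SurrealDB sync",
        Tier::Large => "SurrealDB primary",
    }
}
```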
## Next Steps
- **Graph Model Details**: [Graph Model](graph-model.md)
- **Storage Deep Dive**: [Storage Architecture](storage-architecture.md)
- **ADRs**: [Architectural Decisions](adrs/001-nickel-vs-toml.md)
- **Implementation**: [Development Guide](../contributing/development.md)