kogral/docs/architecture/adrs/004-logseq-blocks-support.md
2026-01-23 16:11:07 +00:00

# ADR-004: Logseq Blocks Support
## Status
**Proposed** (Design phase)
## Context
Logseq uses **content blocks**, not full documents, as its fundamental unit of information. KOGRAL currently treats `Node.content` as a flat markdown string, so block-level features are lost on import/export:
**Lost features**:
- Block properties (`#card`, `TODO`, custom properties)
- Block hierarchy (outliner nesting)
- Block references (`((block-uuid))`)
- Block-level queries (find all flashcards, TODOs)
**User requirement**: Round-trip Logseq import/export with full fidelity:
```text
Logseq → KOGRAL Import → KOGRAL Storage → KOGRAL Export → Logseq
(blocks preserved at every step)
```
## Decision
Implement **Hybrid Block Support** (structured + markdown):
### 1. Add Block Data Structure
```rust
pub struct Block {
    pub id: String,                  // UUID
    pub content: String,             // Block text
    pub properties: BlockProperties, // Tags, status, custom
    pub children: Vec<Block>,        // Nested blocks
    // ... timestamps ...
}

pub struct BlockProperties {
    pub tags: Vec<String>,               // #card, #important
    pub status: Option<TaskStatus>,      // TODO, DONE, etc.
    pub custom: HashMap<String, String>, // property:: value
    pub block_refs: Vec<String>,         // ((uuid))
    pub page_refs: Vec<String>,          // [[page]]
}
```
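How the fields of `BlockProperties` might be filled from the text of a single block can be sketched as follows. This is a minimal sketch under stated assumptions, not KOGRAL's parser: `ExtractedProps`, `extract_props`, and `between` are hypothetical names, the status is kept as a plain string instead of `TaskStatus`, and only the simple forms `#tag`, `((uuid))`, `[[page]]`, and `key:: value` are recognized.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for `BlockProperties`.
#[derive(Debug, Default, PartialEq)]
pub struct ExtractedProps {
    pub tags: Vec<String>,
    pub status: Option<String>,
    pub custom: HashMap<String, String>,
    pub block_refs: Vec<String>,
    pub page_refs: Vec<String>,
}

/// Collects every substring enclosed by `open` ... `close` in `text`.
fn between(text: &str, open: &str, close: &str) -> Vec<String> {
    let mut found = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find(open) {
        let after = &rest[start + open.len()..];
        match after.find(close) {
            Some(end) => {
                found.push(after[..end].to_string());
                rest = &after[end + close.len()..];
            }
            None => break,
        }
    }
    found
}

/// Extracts properties from the raw text of one block: its first line
/// plus any `key:: value` continuation lines.
pub fn extract_props(text: &str) -> ExtractedProps {
    let mut props = ExtractedProps::default();
    for (i, line) in text.lines().enumerate() {
        let line = line.trim();
        if i > 0 {
            // Continuation lines of the form `key:: value` are custom properties.
            if let Some((key, value)) = line.split_once(":: ") {
                props.custom.insert(key.trim().to_string(), value.trim().to_string());
                continue;
            }
        } else {
            // A leading task keyword on the first line becomes the status.
            for kw in &["TODO", "DOING", "DONE"] {
                if line == *kw || line.starts_with(format!("{} ", kw).as_str()) {
                    props.status = Some(kw.to_string());
                }
            }
        }
        // Whitespace-separated words starting with `#` are tags.
        for word in line.split_whitespace() {
            if let Some(tag) = word.strip_prefix('#') {
                if !tag.is_empty() {
                    props.tags.push(tag.to_string());
                }
            }
        }
        props.block_refs.extend(between(line, "((", "))"));
        props.page_refs.extend(between(line, "[[", "]]"));
    }
    props
}
```

Plain string scanning keeps the sketch dependency-free. A production parser would likely need to handle tags adjacent to punctuation, code spans, and escaped brackets, which this version ignores.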
### 2. Extend Node Model
```rust
pub struct Node {
    // ... existing fields ...
    pub content: String,            // Source of truth (markdown)
    pub blocks: Option<Vec<Block>>, // Cached structure (optional)
}
```
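One way the optional cache could behave is sketched below: `blocks` is filled on first access and dropped whenever `content` changes, so the markdown stays the source of truth. The sketch makes several assumptions: `Block` is reduced to a single field, `parse_blocks` is a placeholder that only splits lines, and `Node::new` and `Node::set_content` are hypothetical helpers that do not appear in this ADR.

```rust
/// Simplified stand-in for the `Block` type in this ADR.
#[derive(Debug, Clone, PartialEq)]
pub struct Block {
    pub content: String,
}

pub struct Node {
    pub content: String,            // Source of truth (markdown)
    pub blocks: Option<Vec<Block>>, // Cached structure, filled on demand
}

/// Placeholder parser: one block per non-empty line, `- ` marker stripped.
/// A real parser would also handle nesting and properties.
fn parse_blocks(markdown: &str) -> Vec<Block> {
    markdown
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(|line| Block {
            content: line.strip_prefix("- ").unwrap_or(line).to_string(),
        })
        .collect()
}

impl Node {
    pub fn new(content: &str) -> Node {
        Node { content: content.to_string(), blocks: None }
    }

    /// Parses `content` on first access and reuses the cached result afterwards.
    pub fn get_blocks(&mut self) -> &[Block] {
        if self.blocks.is_none() {
            self.blocks = Some(parse_blocks(&self.content));
        }
        self.blocks.as_deref().unwrap_or(&[])
    }

    /// Replaces the markdown and drops the cache so the two cannot diverge.
    pub fn set_content(&mut self, content: &str) {
        self.content = content.to_string();
        self.blocks = None;
    }
}
```

Routing every write through something like `set_content` is one possible answer to the sync problem. Because `content` is a public field here, callers could still bypass the invalidation, so a real design might make the field private.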
### 3. Bidirectional Parser
- **Parse**: Markdown → `Vec<Block>` (lazy, on-demand)
- **Serialize**: `Vec<Block>` → Markdown (for export)
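The two directions can be sketched as a pair of free functions that round-trip a small outline. This is an illustrative sketch, not the planned `BlockParser`: it assumes two spaces of indentation per nesting level (tabs are not handled), treats any non-bullet line as a continuation of the current block, and uses a reduced `Block` with only `content` and `children`. The names `parse`, `serialize`, `close_until`, and `write_blocks` are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Block {
    pub content: String,
    pub children: Vec<Block>,
}

/// Pops finished blocks (depth >= `depth`) and attaches each to its parent,
/// or to `roots` when no parent is left on the stack.
fn close_until(stack: &mut Vec<(usize, Block)>, roots: &mut Vec<Block>, depth: usize) {
    while stack.last().map_or(false, |(d, _)| *d >= depth) {
        let (_, done) = stack.pop().unwrap();
        match stack.last_mut() {
            Some((_, parent)) => parent.children.push(done),
            None => roots.push(done),
        }
    }
}

/// Markdown outline -> block tree.
pub fn parse(markdown: &str) -> Vec<Block> {
    let mut roots: Vec<Block> = Vec::new();
    // Stack of currently open blocks, each with its nesting depth.
    let mut stack: Vec<(usize, Block)> = Vec::new();

    for line in markdown.lines() {
        let line = line.trim_end();
        let trimmed = line.trim_start();
        if trimmed.is_empty() {
            continue;
        }
        let indent = line.len() - trimmed.len();
        if let Some(text) = trimmed.strip_prefix("- ") {
            // A bullet opens a new block at depth = indent / 2.
            let depth = indent / 2;
            close_until(&mut stack, &mut roots, depth);
            stack.push((depth, Block { content: text.to_string(), children: Vec::new() }));
        } else if let Some((_, block)) = stack.last_mut() {
            // Non-bullet lines (e.g. `priority:: high`) extend the current block.
            block.content.push('\n');
            block.content.push_str(trimmed);
        }
    }
    close_until(&mut stack, &mut roots, 0);
    roots
}

fn write_blocks(blocks: &[Block], depth: usize, out: &mut String) {
    let pad = "  ".repeat(depth);
    for block in blocks {
        for (i, line) in block.content.lines().enumerate() {
            // First line gets the bullet, continuation lines align under it.
            let marker = if i == 0 { "- " } else { "  " };
            out.push_str(&pad);
            out.push_str(marker);
            out.push_str(line);
            out.push('\n');
        }
        write_blocks(&block.children, depth + 1, out);
    }
}

/// Block tree -> markdown outline.
pub fn serialize(blocks: &[Block]) -> String {
    let mut out = String::new();
    write_blocks(blocks, 0, &mut out);
    out
}
```

For input that already uses this normalized layout, `serialize(&parse(src))` returns `src` unchanged, which is the property a round-trip test would check. Input with other indentation widths or blank lines would be normalized rather than preserved byte for byte.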
### 4. Storage Strategy
**Filesystem** (git-friendly, Logseq-compatible):
```markdown
- Block 1 #card
  - Nested answer
- TODO Block 2
  priority:: high
```
**SurrealDB** (queryable):
```sql
DEFINE TABLE block;
DEFINE FIELD node_id ON block TYPE record(node);
DEFINE FIELD block_id ON block TYPE string;
DEFINE FIELD properties ON block TYPE object;
DEFINE INDEX block_tags ON block COLUMNS properties.tags;
```
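Before nested blocks can be written to a flat `block` table, the tree has to be turned into rows. The sketch below shows one possible flattening, assuming a reduced `Block` type. `BlockRow`, `flatten`, and `push_rows` are hypothetical names, and the `parent_id` and `position` fields are assumptions added so the hierarchy can be rebuilt. They are not part of the schema above.

```rust
/// Simplified stand-in for the `Block` type in this ADR.
#[derive(Debug)]
pub struct Block {
    pub id: String,
    pub content: String,
    pub children: Vec<Block>,
}

/// One flat row per block. `parent_id` and `position` are hypothetical
/// extra fields that record where the block sat in the outline.
#[derive(Debug, PartialEq)]
pub struct BlockRow {
    pub node_id: String,
    pub block_id: String,
    pub parent_id: Option<String>,
    pub position: usize,
    pub content: String,
}

/// Depth-first walk: emits a row for each block, then for its children.
fn push_rows(node_id: &str, parent: Option<&str>, blocks: &[Block], rows: &mut Vec<BlockRow>) {
    for (position, block) in blocks.iter().enumerate() {
        rows.push(BlockRow {
            node_id: node_id.to_string(),
            block_id: block.id.clone(),
            parent_id: parent.map(str::to_string),
            position,
            content: block.content.clone(),
        });
        push_rows(node_id, Some(block.id.as_str()), &block.children, rows);
    }
}

/// Flattens the block tree of one node into rows in document order.
pub fn flatten(node_id: &str, blocks: &[Block]) -> Vec<BlockRow> {
    let mut rows = Vec::new();
    push_rows(node_id, None, blocks, &mut rows);
    rows
}
```

Storing the parent and sibling position is only one option. A path column or a nested array on the node record would also preserve the outline, with different query trade-offs.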
### 5. Query Extensions
```rust
// Find all flashcards
graph.find_blocks_by_tag("card")
// Find all TODOs
graph.find_all_todos()
// Find blocks with custom property
node.find_blocks_by_property("priority", "high")
```
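For the filesystem backend, queries like these could be answered by walking the parsed block tree in memory. The following is a rough sketch of that traversal, not the planned `Graph` API: it uses free functions over a slice of blocks, flattens `properties.tags` and `properties.status` into the `Block` itself for brevity, and represents the status as a string. The helper `collect` is a hypothetical name.

```rust
/// Simplified stand-in for `Block`, with properties inlined.
#[derive(Debug)]
pub struct Block {
    pub id: String,
    pub content: String,
    pub tags: Vec<String>,
    pub status: Option<String>,
    pub children: Vec<Block>,
}

/// Depth-first walk that collects every block for which `pred` holds,
/// including nested blocks, in document order.
fn collect<'a, F>(blocks: &'a [Block], pred: &F, out: &mut Vec<&'a Block>)
where
    F: Fn(&Block) -> bool,
{
    for block in blocks {
        if pred(block) {
            out.push(block);
        }
        collect(&block.children, pred, out);
    }
}

/// Finds all blocks carrying `tag` (given without the leading `#`).
pub fn find_blocks_by_tag<'a>(blocks: &'a [Block], tag: &str) -> Vec<&'a Block> {
    let mut out = Vec::new();
    collect(blocks, &|b: &Block| b.tags.iter().any(|t| t == tag), &mut out);
    out
}

/// Finds all blocks whose status is `TODO`.
pub fn find_all_todos(blocks: &[Block]) -> Vec<&Block> {
    let mut out = Vec::new();
    collect(blocks, &|b: &Block| b.status.as_deref() == Some("TODO"), &mut out);
    out
}
```

Each call visits every block once, so the cost grows linearly with the number of blocks. An indexed database query is likely the better fit for graph-wide searches.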
## Consequences
### Positive
✅ **Full Logseq Compatibility** - Import/export preserves all block features
✅ **Queryable Blocks** - Find #card, TODO, custom properties across KOGRAL
✅ **Backward Compatible** - Existing nodes without blocks still work
✅ **Type-Safe** - Structured data instead of regex parsing everywhere
✅ **Extensible** - Custom block properties supported
✅ **Hierarchy Preserved** - Nested blocks maintain parent-child relationships
### Negative
⚠️ **Added Complexity** - New data structures, parser, sync logic
⚠️ **Dual Representation** - Must keep `content` and `blocks` in sync
⚠️ **Storage Overhead** - SurrealDB stores both markdown and structure
⚠️ **Migration Required** - Existing data needs parsing to populate blocks
### Neutral
⚙️ **Lazy Parsing** - Blocks parsed on-demand (not stored by default)
⚙️ **Opt-In** - Config flag `blocks.enabled` to activate features
⚙️ **Gradual Adoption** - Can implement in phases
## Implementation Phases
**Phase 1: Foundation** (No behavior change)
- Add `Block` struct to `models/block.rs`
- Add optional `blocks` field to `Node`
- Add config: `blocks.enabled = false` (default off)
**Phase 2: Parser**
- Implement `BlockParser::parse()` (markdown → blocks)
- Implement `BlockParser::serialize()` (blocks → markdown)
- Add `Node::get_blocks()` method (lazy parsing)
**Phase 3: Logseq Integration**
- Update `LogseqImporter` to parse blocks
- Update `LogseqExporter` to serialize blocks
- Test round-trip (Logseq → KOGRAL → Logseq)
**Phase 4: Query API**
- Add `Graph::find_blocks_by_tag()`
- Add `Graph::find_all_todos()`
- Add `Node::find_blocks_by_property()`
**Phase 5: MCP/CLI Integration**
- Add `kb/find_blocks` MCP tool
- Add `kogral find-cards` CLI command
- Add `kogral find-todos` CLI command
**Phase 6: SurrealDB Backend**
- Create `block` table schema
- Index on tags, status, properties
- Store blocks alongside nodes
## Alternatives Considered
### Alternative 1: Blocks as First-Class Nodes
Convert each Logseq block to a separate KOGRAL Node.
**Rejected**: Too granular, explosion of nodes, loses document context.
### Alternative 2: Parser-Only (No Storage)
Keep `content: String`, parse blocks on every access.
**Rejected**: Can't query blocks in database, parse overhead, can't index.
### Alternative 3: Metadata Field
Store blocks in `metadata: HashMap<String, Value>`.
**Rejected**: Not type-safe, harder to query, no schema validation.
## References
- [Logseq Block Format](https://docs.logseq.com/#/page/blocks)
- [Full Design Document](../logseq-blocks-design.md)
- [Implementation Tracking](https://github.com/.../issues/XXX)
## Notes
**Backward Compatibility Strategy**:
- `content` remains source of truth
- `blocks` is optional enhancement
- Old code works unchanged
- New features opt-in via config
**Migration Path**:
- Existing users: blocks disabled by default
- New users: blocks enabled, parsed on import
- Manual: `kogral reindex --parse-blocks` to populate
---
**Decision Date**: 2026-01-17
**Approvers**: TBD
**Review Date**: After Phase 2 implementation