
# ADR-004: Logseq Blocks Support

## Status

**Proposed** (Design phase)

## Context

Logseq uses **content blocks**, not full documents, as its fundamental unit of information. KOGRAL currently treats `Node.content` as a flat Markdown string, so block-level features are lost on import and export:

**Lost features**:
- Block properties (`#card`, `TODO`, custom properties)
- Block hierarchy (outliner nesting)
- Block references (`((block-uuid))`)
- Block-level queries (find all flashcards, TODOs)

**User requirement**: Round-trip Logseq import/export with full fidelity:

```text
Logseq → KOGRAL Import → KOGRAL Storage → KOGRAL Export → Logseq
  (blocks preserved at every step)
```

## Decision

Implement **Hybrid Block Support** (structured + markdown):

### 1. Add Block Data Structure

```rust
use std::collections::HashMap;

pub struct Block {
    pub id: String,                   // UUID
    pub content: String,              // Block text
    pub properties: BlockProperties,  // Tags, status, custom
    pub children: Vec<Block>,         // Nested blocks
    // ... timestamps ...
}

pub struct BlockProperties {
    pub tags: Vec<String>,            // #card, #important
    pub status: Option<TaskStatus>,   // TODO, DONE, etc.
    pub custom: HashMap<String, String>, // property:: value
    pub block_refs: Vec<String>,      // ((uuid))
    pub page_refs: Vec<String>,       // [[page]]
}
```
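
As a rough illustration of how the `tags`, `block_refs`, and `page_refs` fields could be filled from a block's text, here is a minimal sketch. The `InlineRefs` type and the `extract_inline_refs` and `spans_between` helpers are hypothetical names, not part of the design above, and the tag rule is deliberately simplified (a whitespace-delimited word starting with `#`; trailing punctuation and `#[[multi word]]` tags are not handled).

```rust
/// Inline references found in one block's text (hypothetical helper type).
#[derive(Debug, Default, PartialEq)]
pub struct InlineRefs {
    pub tags: Vec<String>,       // #card, #important
    pub block_refs: Vec<String>, // ((uuid))
    pub page_refs: Vec<String>,  // [[page]]
}

/// Collects every span that sits between an `open` and a `close` delimiter.
fn spans_between(text: &str, open: &str, close: &str) -> Vec<String> {
    let mut found = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find(open) {
        let after_open = &rest[start + open.len()..];
        match after_open.find(close) {
            Some(end) => {
                found.push(after_open[..end].to_string());
                rest = &after_open[end + close.len()..];
            }
            None => break, // unterminated delimiter: stop scanning
        }
    }
    found
}

/// Extracts tags, block references, and page references from block text.
pub fn extract_inline_refs(content: &str) -> InlineRefs {
    // Simplification: a tag is any whitespace-separated word starting with '#'.
    let tags = content
        .split_whitespace()
        .filter_map(|word| word.strip_prefix('#'))
        .filter(|tag| !tag.is_empty())
        .map(str::to_string)
        .collect();
    InlineRefs {
        tags,
        block_refs: spans_between(content, "((", "))"),
        page_refs: spans_between(content, "[[", "]]"),
    }
}
```

Keeping extraction in one function means the importer and any later re-parse would populate `BlockProperties` the same way.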

### 2. Extend Node Model

```rust
pub struct Node {
    // ... existing fields ...
    pub content: String,              // Source of truth (markdown)
    pub blocks: Option<Vec<Block>>,   // Cached structure (optional)
}
```
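
One way the "source of truth plus cache" relationship could work is sketched below, assuming that the cache is filled on first access and dropped whenever `content` changes. The trimmed-down `Block` and `Node`, the `set_content` method, and the placeholder `parse_blocks` function are assumptions for illustration only. `get_blocks` is named after the method planned in Phase 2, but its signature here is a guess.

```rust
/// Trimmed-down block for this sketch (the real struct has more fields).
#[derive(Debug, Clone, PartialEq)]
pub struct Block {
    pub content: String,
}

/// Trimmed-down node: markdown is authoritative, `blocks` is a cache.
pub struct Node {
    pub content: String,
    pub blocks: Option<Vec<Block>>,
}

/// Placeholder parser (assumption): one block per top-level "- " line.
fn parse_blocks(markdown: &str) -> Vec<Block> {
    markdown
        .lines()
        .filter_map(|line| line.strip_prefix("- "))
        .map(|text| Block { content: text.to_string() })
        .collect()
}

impl Node {
    /// Parses `content` on first access and reuses the cached result afterwards.
    pub fn get_blocks(&mut self) -> &[Block] {
        if self.blocks.is_none() {
            self.blocks = Some(parse_blocks(&self.content));
        }
        self.blocks.as_deref().unwrap_or(&[])
    }

    /// Replaces the markdown and invalidates the cache so the two cannot drift apart.
    pub fn set_content(&mut self, markdown: String) {
        self.content = markdown;
        self.blocks = None;
    }
}
```

Routing every write through a single setter is one option for containing the dual-representation risk; direct writes to the public `content` field would bypass the invalidation.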

### 3. Bidirectional Parser

- **Parse**: Markdown → `Vec<Block>` (lazy, on-demand)
- **Serialize**: `Vec<Block>` → Markdown (for export)
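
The two directions can be sketched as a pair of functions over an indentation-based outline. This is a minimal illustration, not the planned `BlockParser`: it assumes two-space indentation, a `- ` bullet marker, and a trimmed-down `Block` with only `content` and `children`. Lines without a bullet are treated as continuation lines of the current block. Tabs, fenced code inside blocks, and empty bullets are not handled.

```rust
/// Trimmed-down block for this sketch: text plus nested children.
#[derive(Debug, Clone, PartialEq)]
pub struct Block {
    pub content: String,
    pub children: Vec<Block>,
}

const INDENT: usize = 2; // assumed spaces per nesting level

/// Markdown outline -> block tree.
pub fn parse(markdown: &str) -> Vec<Block> {
    let mut roots: Vec<Block> = Vec::new();
    // Stack of blocks still open, with their depth; the last entry is the deepest.
    let mut open: Vec<(usize, Block)> = Vec::new();

    for line in markdown.lines() {
        let trimmed = line.trim_start();
        if let Some(text) = trimmed.strip_prefix("- ") {
            let depth = (line.len() - trimmed.len()) / INDENT;
            // Close every open block at this depth or deeper.
            while open.last().map_or(false, |(d, _)| *d >= depth) {
                close_top(&mut open, &mut roots);
            }
            open.push((depth, Block { content: text.to_string(), children: Vec::new() }));
        } else if let Some((_, block)) = open.last_mut() {
            // Continuation line (for example `priority:: high`).
            if !trimmed.is_empty() {
                block.content.push('\n');
                block.content.push_str(trimmed);
            }
        }
    }
    while !open.is_empty() {
        close_top(&mut open, &mut roots);
    }
    roots
}

/// Pops the deepest open block and attaches it to its parent, or to the roots.
fn close_top(open: &mut Vec<(usize, Block)>, roots: &mut Vec<Block>) {
    if let Some((_, done)) = open.pop() {
        match open.last_mut() {
            Some((_, parent)) => parent.children.push(done),
            None => roots.push(done),
        }
    }
}

/// Block tree -> Markdown outline.
pub fn serialize(blocks: &[Block]) -> String {
    let mut out = String::new();
    write_blocks(blocks, 0, &mut out);
    out
}

fn write_blocks(blocks: &[Block], depth: usize, out: &mut String) {
    let pad = " ".repeat(depth * INDENT);
    for block in blocks {
        for (i, line) in block.content.lines().enumerate() {
            // First line gets the bullet; continuation lines align under the text.
            out.push_str(&pad);
            out.push_str(if i == 0 { "- " } else { "  " });
            out.push_str(line);
            out.push('\n');
        }
        write_blocks(&block.children, depth + 1, out);
    }
}
```

For input that already uses two-space indentation and ends with a newline, `serialize(&parse(input))` reproduces the input, which is the property the round-trip requirement depends on.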

### 4. Storage Strategy

**Filesystem** (git-friendly, Logseq-compatible):

```markdown
- Block 1 #card
  - Nested answer
- TODO Block 2
  priority:: high
```
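
To show how the task marker and the `priority:: high` line in this file format might map onto `status` and `custom`, here is a small sketch. `ParsedBody` and `parse_block_body` are hypothetical names, and the `TaskStatus` variants are an assumption, since the ADR only mentions "TODO, DONE, etc." The sketch treats any continuation line containing `:: ` as a property and does not validate keys.

```rust
use std::collections::HashMap;

/// Task markers recognized by this sketch; the real `TaskStatus` may have more variants.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TaskStatus {
    Todo,
    Done,
}

/// Result of splitting one block's raw text (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct ParsedBody {
    pub status: Option<TaskStatus>,
    pub text: String,
    pub custom: HashMap<String, String>,
}

/// Splits raw block text into task status, visible text, and `key:: value` properties.
pub fn parse_block_body(raw: &str) -> ParsedBody {
    let mut lines = raw.lines();
    let first = lines.next().unwrap_or("");

    // A leading marker on the first line sets the task status.
    let (status, headline) = if let Some(rest) = first.strip_prefix("TODO ") {
        (Some(TaskStatus::Todo), rest)
    } else if let Some(rest) = first.strip_prefix("DONE ") {
        (Some(TaskStatus::Done), rest)
    } else {
        (None, first)
    };

    let mut text_lines = vec![headline];
    let mut custom = HashMap::new();
    for line in lines {
        match line.split_once(":: ") {
            // `key:: value` lines become custom properties.
            Some((key, value)) => {
                custom.insert(key.trim().to_string(), value.trim().to_string());
            }
            // Anything else stays part of the block text.
            None => text_lines.push(line),
        }
    }

    ParsedBody { status, text: text_lines.join("\n"), custom }
}
```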

**SurrealDB** (queryable):

```sql
DEFINE TABLE block;
DEFINE FIELD node_id ON block TYPE record(node);
DEFINE FIELD block_id ON block TYPE string;
DEFINE FIELD properties ON block TYPE object;
DEFINE INDEX block_tags ON block COLUMNS properties.tags;
```
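
A nested `Vec<Block>` has to be flattened before it can be stored as rows in a table like this one. The sketch below shows one possible flattening. The `BlockRow` type is hypothetical, and its `parent_id` and `position` columns are assumptions that do not appear in the schema above; they are only one way to keep hierarchy and sibling order recoverable.

```rust
/// Trimmed-down block for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Block {
    pub id: String,
    pub content: String,
    pub children: Vec<Block>,
}

/// One flat row per block (hypothetical shape).
#[derive(Debug, PartialEq)]
pub struct BlockRow {
    pub node_id: String,
    pub block_id: String,
    pub parent_id: Option<String>, // assumption: not in the schema above
    pub position: usize,           // assumption: order among siblings
    pub content: String,
}

/// Flattens a block tree into rows in depth-first order.
pub fn flatten(node_id: &str, blocks: &[Block]) -> Vec<BlockRow> {
    let mut rows = Vec::new();
    push_rows(node_id, None, blocks, &mut rows);
    rows
}

fn push_rows(node_id: &str, parent_id: Option<&str>, blocks: &[Block], rows: &mut Vec<BlockRow>) {
    for (position, block) in blocks.iter().enumerate() {
        rows.push(BlockRow {
            node_id: node_id.to_string(),
            block_id: block.id.clone(),
            parent_id: parent_id.map(str::to_string),
            position,
            content: block.content.clone(),
        });
        // Children record this block as their parent.
        push_rows(node_id, Some(block.id.as_str()), &block.children, rows);
    }
}
```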

### 5. Query Extensions

```rust
// Find all flashcards
graph.find_blocks_by_tag("card")

// Find all TODOs
graph.find_all_todos()

// Find blocks with custom property
node.find_blocks_by_property("priority", "high")
```
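
For the in-memory case, a tag query like the first one above could be a recursive walk over the block tree. The sketch below is a free function over a slice of trimmed-down blocks rather than the `Graph` method itself, whose signature is not defined in this ADR; `collect_by_tag` is a hypothetical helper.

```rust
/// Trimmed-down block for this sketch: tags are stored directly on the block.
#[derive(Debug, Clone, PartialEq)]
pub struct Block {
    pub content: String,
    pub tags: Vec<String>,
    pub children: Vec<Block>,
}

/// Returns every block, at any nesting depth, that carries `tag`.
pub fn find_blocks_by_tag<'a>(blocks: &'a [Block], tag: &str) -> Vec<&'a Block> {
    let mut matches = Vec::new();
    collect_by_tag(blocks, tag, &mut matches);
    matches
}

/// Depth-first traversal: a parent is visited before its children.
fn collect_by_tag<'a>(blocks: &'a [Block], tag: &str, matches: &mut Vec<&'a Block>) {
    for block in blocks {
        if block.tags.iter().any(|t| t == tag) {
            matches.push(block);
        }
        collect_by_tag(&block.children, tag, matches);
    }
}
```

Returning references keeps the query read-only and avoids cloning subtrees; a database-backed implementation would presumably use the tag index instead of walking every node.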

## Consequences

### Positive

- ✅ **Full Logseq Compatibility** - Import/export preserves all block features
- ✅ **Queryable Blocks** - Find #card, TODO, custom properties across KOGRAL
- ✅ **Backward Compatible** - Existing nodes without blocks still work
- ✅ **Type-Safe** - Structured data instead of regex parsing everywhere
- ✅ **Extensible** - Custom block properties supported
- ✅ **Hierarchy Preserved** - Nested blocks maintain parent-child relationships

### Negative

- ⚠️ **Added Complexity** - New data structures, parser, sync logic
- ⚠️ **Dual Representation** - Must keep `content` and `blocks` in sync
- ⚠️ **Storage Overhead** - SurrealDB stores both markdown and structure
- ⚠️ **Migration Required** - Existing data needs parsing to populate blocks

### Neutral

- ⚙️ **Lazy Parsing** - Blocks parsed on-demand (not stored by default)
- ⚙️ **Opt-In** - Config flag `blocks.enabled` to activate features
- ⚙️ **Gradual Adoption** - Can implement in phases

## Implementation Phases

**Phase 1: Foundation** (No behavior change)
- Add `Block` struct to `models/block.rs`
- Add optional `blocks` field to `Node`
- Add config: `blocks.enabled = false` (default off)

**Phase 2: Parser**
- Implement `BlockParser::parse()` (markdown → blocks)
- Implement `BlockParser::serialize()` (blocks → markdown)
- Add `Node::get_blocks()` method (lazy parsing)

**Phase 3: Logseq Integration**
- Update `LogseqImporter` to parse blocks
- Update `LogseqExporter` to serialize blocks
- Test round-trip (Logseq → KOGRAL → Logseq)

**Phase 4: Query API**
- Add `Graph::find_blocks_by_tag()`
- Add `Graph::find_all_todos()`
- Add `Node::find_blocks_by_property()`

**Phase 5: MCP/CLI Integration**
- Add `kb/find_blocks` MCP tool
- Add `kogral find-cards` CLI command
- Add `kogral find-todos` CLI command

**Phase 6: SurrealDB Backend**
- Create `block` table schema
- Index on tags, status, properties
- Store blocks alongside nodes

## Alternatives Considered

### Alternative 1: Blocks as First-Class Nodes

Convert each Logseq block to a separate KOGRAL Node.

**Rejected**: Too granular. The node count explodes and blocks lose their document context.

### Alternative 2: Parser-Only (No Storage)

Keep `content: String`, parse blocks on every access.

**Rejected**: Blocks could not be queried or indexed in the database, and every access would pay the parsing cost.

### Alternative 3: Metadata Field

Store blocks in `metadata: HashMap<String, Value>`.

**Rejected**: Not type-safe, harder to query, no schema validation.

## References

- [Logseq Block Format](https://docs.logseq.com/#/page/blocks)
- [Full Design Document](../logseq-blocks-design.md)
- [Implementation Tracking](https://github.com/.../issues/XXX)

## Notes

**Backward Compatibility Strategy**:
- `content` remains source of truth
- `blocks` is optional enhancement
- Old code works unchanged
- New features opt-in via config

**Migration Path**:
- Existing users: blocks disabled by default
- New users: blocks enabled, parsed on import
- Manual: `kogral reindex --parse-blocks` to populate

---

- **Decision Date**: 2026-01-17
- **Approvers**: TBD
- **Review Date**: After Phase 2 implementation