Logseq Blocks Support - Architecture Design
Problem Statement
Logseq uses content blocks as the fundamental unit of information, not full documents. Each block can have:
- Properties: #card, TODO, DONE, custom properties
- Tags: Inline tags like #flashcard, #important
- References: Block references ((block-id)), page references [[page]]
- Nesting: Outliner-style hierarchy (parent-child blocks)
- Metadata: Block-level properties (unlike page-level frontmatter)
Current KB limitation: Nodes only have content: String (flat markdown). Importing from Logseq loses block structure and properties.
Requirement: Support round-trip import/export with full block fidelity:
Logseq Graph → KOGRAL Import → KOGRAL Storage → KOGRAL Export → Logseq Graph
(blocks preserved) (blocks preserved)
Use Cases
1. Flashcards (#card)
Logseq:
- What is Rust's ownership model? #card
  - Rust uses ownership, borrowing, and lifetimes
  - Three rules: one owner, many borrows XOR one mutable
KB needs to preserve:
- Block with #card property
- Nested answer blocks
- Ability to query all cards
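The flashcard requirement can be sketched as a recursive walk over a block tree. The `Block` type below is a stripped-down stand-in used only for illustration (text, tags, children), and `collect_cards` and `sample_page` are hypothetical helper names, not part of the proposed API.

```rust
/// Minimal stand-in for a block: text, tags, and nested children.
/// (Illustrative only; the proposed model carries more fields.)
#[derive(Debug, Clone)]
pub struct Block {
    pub content: String,
    pub tags: Vec<String>,
    pub children: Vec<Block>,
}

impl Block {
    pub fn new(content: &str, tags: &[&str], children: Vec<Block>) -> Self {
        Block {
            content: content.to_string(),
            tags: tags.iter().map(|t| t.to_string()).collect(),
            children,
        }
    }
}

/// Hypothetical helper: return every block tagged `card`, at any depth.
/// Each returned block still owns its children, so the nested answer
/// blocks travel with the question.
pub fn collect_cards(roots: &[Block]) -> Vec<&Block> {
    let mut cards = Vec::new();
    for block in roots {
        if block.tags.iter().any(|t| t == "card") {
            cards.push(block);
        }
        cards.extend(collect_cards(&block.children));
    }
    cards
}

/// Builds the flashcard example from this section.
pub fn sample_page() -> Vec<Block> {
    vec![Block::new(
        "What is Rust's ownership model?",
        &["card"],
        vec![
            Block::new("Rust uses ownership, borrowing, and lifetimes", &[], vec![]),
            Block::new("Three rules: one owner, many borrows XOR one mutable", &[], vec![]),
        ],
    )]
}
```

Calling `collect_cards(&sample_page())` yields the single question block, with both answer blocks reachable through its `children`.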
2. Task Tracking (TODO/DONE)
Logseq:
- TODO Implement block parser #rust
  - DONE Research block structure
  - TODO Write parser tests
KB needs to preserve:
- Task status per block
- Hierarchical task breakdown
- Tags on tasks
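Per-block status plus hierarchy is what makes roll-ups possible. A minimal sketch, assuming a reduced `Task` type with only two markers; `split_status` and `progress` are illustrative names rather than proposed API:

```rust
/// Task markers used in the example above (a subset, for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Todo,
    Done,
}

/// Reduced task block: optional status, text, and nested subtasks.
#[derive(Debug, Clone)]
pub struct Task {
    pub status: Option<Status>,
    pub text: String,
    pub children: Vec<Task>,
}

/// Split a leading `TODO ` or `DONE ` marker off a block line.
pub fn split_status(line: &str) -> (Option<Status>, &str) {
    if let Some(rest) = line.strip_prefix("TODO ") {
        (Some(Status::Todo), rest)
    } else if let Some(rest) = line.strip_prefix("DONE ") {
        (Some(Status::Done), rest)
    } else {
        (None, line)
    }
}

/// Count (done, total) over a task and all of its descendants.
/// Blocks without a status marker are not counted.
pub fn progress(task: &Task) -> (usize, usize) {
    let mut done = 0;
    let mut total = 0;
    if let Some(status) = task.status {
        total += 1;
        if status == Status::Done {
            done += 1;
        }
    }
    for child in &task.children {
        let (d, t) = progress(child);
        done += d;
        total += t;
    }
    (done, total)
}
```

For the three-task example above, `progress` on the parent reports one task done out of three.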
3. Block References
Logseq:
- Core concept: ((block-uuid-123))
- See also: [[Related Page]]
KB needs to preserve:
- Block-to-block links (not just page-to-page)
- UUID references
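Both reference kinds are delimiter pairs, so they can be pulled out with plain string scanning. The sketch below uses only the standard library; `extract_between`, `block_refs`, and `page_refs` are hypothetical helpers, and nested delimiters are not handled.

```rust
/// Collect the text found between every `open` ... `close` pair in `text`.
/// An opening delimiter without a matching close ends the scan.
pub fn extract_between<'a>(text: &'a str, open: &str, close: &str) -> Vec<&'a str> {
    let mut found = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find(open) {
        let after_open = &rest[start + open.len()..];
        match after_open.find(close) {
            Some(end) => {
                found.push(&after_open[..end]);
                rest = &after_open[end + close.len()..];
            }
            None => break,
        }
    }
    found
}

/// Block references: ((block-uuid))
pub fn block_refs(text: &str) -> Vec<&str> {
    extract_between(text, "((", "))")
}

/// Page references: [[Page Name]]
pub fn page_refs(text: &str) -> Vec<&str> {
    extract_between(text, "[[", "]]")
}
```

Keeping the two extractors separate mirrors the requirement that block-to-block links be tracked independently of page-to-page links.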
4. Block Properties
Logseq:
- This is a block with properties
  property1:: value1
  property2:: value2
KB needs to preserve:
- Custom key-value properties per block
- Property inheritance/override
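This document does not pin down inheritance semantics. One plausible reading, shown here purely as an assumption, is that a block sees its ancestors' properties unless it redefines a key itself. `Props`, `effective_properties`, and `props` are illustrative names.

```rust
use std::collections::HashMap;

/// Key-value properties attached to one block.
pub type Props = HashMap<String, String>;

/// Assumed rule: walk the ancestor chain from outermost to innermost and
/// let each nested level override the keys it redefines.
pub fn effective_properties(chain: &[&Props]) -> Props {
    let mut merged = Props::new();
    for level in chain {
        for (key, value) in level.iter() {
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}

/// Convenience constructor for examples and tests.
pub fn props(pairs: &[(&str, &str)]) -> Props {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

Under this rule, a child that sets only `property2` still inherits `property1` from its parent while replacing the parent's `property2`. If a different rule is chosen (for example, no inheritance at all), only `effective_properties` would need to change.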
Design Options
Option A: Blocks as First-Class Data Structure
Add blocks field to Node:
pub struct Node {
    // ... existing fields ...
    pub content: String,            // Backward compat: flat markdown
    pub blocks: Option<Vec<Block>>, // NEW: Structured blocks
}

pub struct Block {
    pub id: String,                  // UUID or auto-generated
    pub content: String,             // Block text
    pub properties: BlockProperties, // Tags, status, custom props
    pub children: Vec<Block>,        // Nested blocks
    pub created: DateTime<Utc>,
    pub modified: DateTime<Utc>,
}

pub struct BlockProperties {
    pub tags: Vec<String>,               // #card, #important
    pub status: Option<TaskStatus>,      // TODO, DONE, WAITING
    pub custom: HashMap<String, String>, // property:: value
}

pub enum TaskStatus {
    Todo,
    Doing,
    Done,
    Waiting,
    Cancelled,
}
Pros:
- ✅ Type-safe, explicit structure
- ✅ Queryable (find all #card blocks)
- ✅ Preserves hierarchy
- ✅ Supports block-level operations
Cons:
- ❌ Adds complexity to Node
- ❌ Dual representation (content + blocks)
- ❌ Requires migration of existing data
Option B: Parser-Only Approach
Keep content: String, parse blocks on-demand:
pub struct BlockParser;

// Signatures only; bodies are elided in this sketch
impl BlockParser {
    // Parse markdown content into block structure
    fn parse(content: &str) -> Vec<Block>;

    // Serialize blocks back to markdown
    fn serialize(blocks: &[Block]) -> String;
}

// Usage
let blocks = BlockParser::parse(&node.content);
// Tags are stored without the leading '#'
let filtered = blocks
    .iter()
    .filter(|b| b.properties.tags.iter().any(|t| t == "card"));
Pros:
- ✅ No schema changes
- ✅ Backward compatible
- ✅ Simple storage (still just String)
Cons:
- ❌ Parse overhead on every access
- ❌ Can't query blocks in database (SurrealDB)
- ❌ Harder to index/search blocks
Option C: Hybrid Approach (RECOMMENDED)
Combine both: structured storage + lazy parsing:
pub struct Node {
    // ... existing fields ...
    pub content: String, // Source of truth (markdown)

    #[serde(skip_serializing_if = "Option::is_none")]
    pub blocks: Option<Vec<Block>>, // Cached structure (parsed)
}

impl Node {
    // Parse blocks from content if not already cached
    pub fn get_blocks(&mut self) -> &Vec<Block> {
        if self.blocks.is_none() {
            self.blocks = Some(BlockParser::parse(&self.content));
        }
        self.blocks.as_ref().unwrap()
    }

    // Update content from blocks (when blocks modified)
    pub fn sync_blocks_to_content(&mut self) {
        if let Some(ref blocks) = self.blocks {
            self.content = BlockParser::serialize(blocks);
        }
    }
}
Storage Strategy:
- Filesystem - Store as markdown (Logseq compatible):

  - Block 1 #card
    - Nested block
  - Block 2 TODO

- SurrealDB - Store both:

  DEFINE TABLE block SCHEMAFULL;
  DEFINE FIELD node_id ON block TYPE record(node);
  DEFINE FIELD block_id ON block TYPE string;
  DEFINE FIELD content ON block TYPE string;
  DEFINE FIELD properties ON block TYPE object;
  DEFINE FIELD parent_id ON block TYPE option<string>;

  -- Index for queries
  DEFINE INDEX block_tags ON block COLUMNS properties.tags;
  DEFINE INDEX block_status ON block COLUMNS properties.status;
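Storing blocks in a `block` table means the nested tree has to be flattened into rows linked by `parent_id`. A minimal sketch of that step, assuming a reduced `Block` and a hypothetical `BlockRow` shaped after the schema fields above (the database write itself is out of scope here):

```rust
/// Nested block as held in memory (illustrative subset of fields).
pub struct Block {
    pub id: String,
    pub content: String,
    pub children: Vec<Block>,
}

/// Flat row shaped like the `block` table: one row per block, with
/// `parent_id` pointing at the enclosing block (None for roots).
#[derive(Debug, PartialEq, Eq)]
pub struct BlockRow {
    pub block_id: String,
    pub content: String,
    pub parent_id: Option<String>,
}

/// Depth-first walk that emits parents before their children.
pub fn to_rows(roots: &[Block]) -> Vec<BlockRow> {
    fn walk(block: &Block, parent: Option<&str>, out: &mut Vec<BlockRow>) {
        out.push(BlockRow {
            block_id: block.id.clone(),
            content: block.content.clone(),
            parent_id: parent.map(str::to_string),
        });
        for child in &block.children {
            walk(child, Some(block.id.as_str()), out);
        }
    }

    let mut rows = Vec::new();
    for root in roots {
        walk(root, None, &mut rows);
    }
    rows
}
```

Emitting parents first means a row's `parent_id` always refers to a row that has already been produced, which keeps insertion order simple.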
Pros:
- ✅ Best of both worlds
- ✅ Filesystem stays Logseq-compatible
- ✅ SurrealDB can query blocks
- ✅ Lazy parsing (only when needed)
- ✅ Backward compatible
Cons:
- ⚠️ Need to keep content/blocks in sync
- ⚠️ More complex implementation
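One way to contain the sync risk is to make every mutation go through methods that restore the invariant before returning. The sketch below is an assumption about how that could look, not the proposed `Node` API: `Page` is a hypothetical stand-in, and a flat list of block texts replaces the real block tree.

```rust
/// Stand-in for a node with markdown content and a lazily built cache.
pub struct Page {
    content: String,
    /// Cached structure; a flat list of block texts stands in for the tree.
    blocks: Option<Vec<String>>,
}

// Toy parser: one block per line that starts with "- ".
fn parse(content: &str) -> Vec<String> {
    content
        .lines()
        .filter_map(|line| line.trim_start().strip_prefix("- "))
        .map(str::to_string)
        .collect()
}

// Toy serializer: the inverse of `parse` for flat lists.
fn serialize(blocks: &[String]) -> String {
    blocks.iter().map(|b| format!("- {}\n", b)).collect()
}

impl Page {
    pub fn new(content: &str) -> Self {
        Page { content: content.to_string(), blocks: None }
    }

    pub fn content(&self) -> &str {
        &self.content
    }

    /// Editing the markdown invalidates the cache.
    pub fn set_content(&mut self, content: &str) {
        self.content = content.to_string();
        self.blocks = None;
    }

    /// Parse lazily on first access.
    pub fn blocks(&mut self) -> &Vec<String> {
        let content = &self.content;
        self.blocks.get_or_insert_with(|| parse(content))
    }

    /// Edit blocks through a closure; content is rewritten before returning,
    /// so callers never observe the two representations out of sync.
    pub fn edit_blocks(&mut self, edit: impl FnOnce(&mut Vec<String>)) {
        let content = &self.content;
        let blocks = self.blocks.get_or_insert_with(|| parse(content));
        edit(&mut *blocks);
        self.content = serialize(blocks.as_slice());
    }
}
```

Keeping both fields private is the design point: with public fields, a caller can change one representation and forget to update the other.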
Recommended Implementation
Phase 1: Data Model
// crates/kb-core/src/models/block.rs
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use uuid::Uuid;

/// A content block (Logseq-style)
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Block {
    /// Unique block identifier (UUID)
    pub id: String,
    /// Block content (markdown text, excluding nested blocks)
    pub content: String,
    /// Block properties (tags, status, custom)
    pub properties: BlockProperties,
    /// Child blocks (nested hierarchy)
    #[serde(default)]
    pub children: Vec<Block>,
    /// Creation timestamp
    pub created: DateTime<Utc>,
    /// Last modification timestamp
    pub modified: DateTime<Utc>,
    /// Parent block ID (if nested)
    #[serde(skip_serializing_if = "Option::is_none")]
    pub parent_id: Option<String>,
}

/// Block-level properties
/// (PartialEq is derived so tests can compare parsed properties directly)
#[derive(Debug, Clone, Default, PartialEq, Serialize, Deserialize)]
pub struct BlockProperties {
    /// Tags (e.g., #card, #important)
    #[serde(default)]
    pub tags: Vec<String>,
    /// Task status (TODO, DONE, etc.)
    #[serde(skip_serializing_if = "Option::is_none")]
    pub status: Option<TaskStatus>,
    /// Custom properties (property:: value)
    #[serde(default)]
    pub custom: HashMap<String, String>,
    /// Block references ((uuid))
    #[serde(default)]
    pub block_refs: Vec<String>,
    /// Page references ([[page]])
    #[serde(default)]
    pub page_refs: Vec<String>,
}

/// Task status for TODO blocks
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "UPPERCASE")]
pub enum TaskStatus {
    Todo,
    Doing,
    Done,
    Later,
    Now,
    Waiting,
    Cancelled,
}

impl Block {
    /// Create a new block with content
    pub fn new(content: String) -> Self {
        Self {
            id: Uuid::new_v4().to_string(),
            content,
            properties: BlockProperties::default(),
            children: Vec::new(),
            created: Utc::now(),
            modified: Utc::now(),
            parent_id: None,
        }
    }

    /// Add a child block
    pub fn add_child(&mut self, mut child: Block) {
        child.parent_id = Some(self.id.clone());
        self.children.push(child);
        self.modified = Utc::now();
    }

    /// Add a tag to this block
    pub fn add_tag(&mut self, tag: String) {
        if !self.properties.tags.contains(&tag) {
            self.properties.tags.push(tag);
            self.modified = Utc::now();
        }
    }

    /// Set task status
    pub fn set_status(&mut self, status: TaskStatus) {
        self.properties.status = Some(status);
        self.modified = Utc::now();
    }

    /// Get all blocks (self + descendants) as flat list
    pub fn flatten(&self) -> Vec<&Block> {
        let mut result = vec![self];
        for child in &self.children {
            result.extend(child.flatten());
        }
        result
    }

    /// Find block by ID in tree
    pub fn find(&self, id: &str) -> Option<&Block> {
        if self.id == id {
            return Some(self);
        }
        for child in &self.children {
            if let Some(found) = child.find(id) {
                return Some(found);
            }
        }
        None
    }
}
Phase 2: Update Node Model
// crates/kb-core/src/models.rs (modifications)
use crate::models::block::{Block, TaskStatus};

pub struct Node {
    // ... existing fields ...
    pub content: String,

    /// Structured blocks (optional, parsed from content)
    #[serde(skip_serializing_if = "Option::is_none")]
    pub blocks: Option<Vec<Block>>,
}

impl Node {
    /// Get blocks, parsing from content if needed
    pub fn get_blocks(&mut self) -> Result<&Vec<Block>> {
        if self.blocks.is_none() {
            self.blocks = Some(crate::parser::BlockParser::parse(&self.content)?);
        }
        Ok(self.blocks.as_ref().unwrap())
    }

    /// Update content from blocks
    pub fn sync_blocks_to_content(&mut self) {
        if let Some(ref blocks) = self.blocks {
            self.content = crate::parser::BlockParser::serialize(blocks);
        }
    }

    /// Find all blocks with a specific tag
    pub fn find_blocks_by_tag(&mut self, tag: &str) -> Result<Vec<&Block>> {
        let blocks = self.get_blocks()?;
        let mut result = Vec::new();
        for block in blocks {
            for b in block.flatten() {
                if b.properties.tags.iter().any(|t| t == tag) {
                    result.push(b);
                }
            }
        }
        Ok(result)
    }

    /// Find all TODO blocks
    pub fn find_todos(&mut self) -> Result<Vec<&Block>> {
        let blocks = self.get_blocks()?;
        let mut result = Vec::new();
        for block in blocks {
            for b in block.flatten() {
                if matches!(b.properties.status, Some(TaskStatus::Todo)) {
                    result.push(b);
                }
            }
        }
        Ok(result)
    }
}
Phase 3: Block Parser
// crates/kb-core/src/parser/block_parser.rs
use crate::models::block::{Block, BlockProperties, TaskStatus};
use regex::Regex;
use std::collections::HashMap;

pub struct BlockParser;

impl BlockParser {
    /// Parse markdown content into block structure
    ///
    /// Handles:
    /// - Outliner format (- prefix with indentation)
    /// - Tags (#card, #important)
    /// - Task status (TODO, DONE)
    /// - Properties (property:: value)
    /// - Block references (((uuid)))
    /// - Page references ([[page]])
    pub fn parse(content: &str) -> Result<Vec<Block>> {
        let mut blocks = Vec::new();
        // Open blocks from outermost to innermost: (indent_level, block)
        let mut stack: Vec<(usize, Block)> = Vec::new();

        for line in content.lines() {
            // Detect indentation level
            let indent = count_indent(line);
            let trimmed = line.trim_start();

            // Skip empty lines
            if trimmed.is_empty() {
                continue;
            }

            if let Some(block_content) = trimmed.strip_prefix("- ") {
                let block = Self::parse_block_line(block_content)?;

                // Close every open block at the same or a deeper level.
                // A block is attached to its parent only once it is closed,
                // so children added later are not lost to a stale copy.
                while matches!(stack.last(), Some((level, _)) if *level >= indent) {
                    Self::close_block(&mut stack, &mut blocks);
                }
                stack.push((indent, block));
            } else if let Some((_, open)) = stack.last_mut() {
                // Continuation line: `key:: value` belongs to the innermost open block
                if let Some((key, value)) = trimmed.split_once("::") {
                    let key = key.trim();
                    let is_key = !key.is_empty()
                        && key.chars().all(|c| c.is_alphanumeric() || c == '_' || c == '-');
                    if is_key {
                        open.properties
                            .custom
                            .insert(key.to_string(), value.trim().to_string());
                    }
                }
            }
        }

        // Close whatever is still open at end of input
        while !stack.is_empty() {
            Self::close_block(&mut stack, &mut blocks);
        }

        Ok(blocks)
    }

    /// Pop the innermost open block and attach it to its parent (or to the roots)
    fn close_block(stack: &mut Vec<(usize, Block)>, roots: &mut Vec<Block>) {
        if let Some((_, done)) = stack.pop() {
            match stack.last_mut() {
                Some((_, parent)) => parent.add_child(done),
                None => roots.push(done),
            }
        }
    }

    /// Parse a single block line (after "- " prefix)
    fn parse_block_line(line: &str) -> Result<Block> {
        let mut block = Block::new(String::new());
        let mut properties = BlockProperties::default();

        // Extract task status (TODO, DONE, etc.)
        let (status, remaining) = Self::extract_task_status(line);
        properties.status = status;

        // Extract tags (#card, #important)
        let (tags, remaining) = Self::extract_tags(remaining);
        properties.tags = tags;

        // Extract properties (property:: value)
        let (custom_props, remaining) = Self::extract_properties(&remaining);
        properties.custom = custom_props;

        // Extract block references (((uuid)))
        let (block_refs, remaining) = Self::extract_block_refs(&remaining);
        properties.block_refs = block_refs;

        // Extract page references ([[page]])
        let (page_refs, content) = Self::extract_page_refs(&remaining);
        properties.page_refs = page_refs;

        block.content = content.trim().to_string();
        block.properties = properties;
        Ok(block)
    }

    /// Serialize blocks back to markdown
    pub fn serialize(blocks: &[Block]) -> String {
        let mut result = String::new();
        for block in blocks {
            Self::serialize_block(&mut result, block, 0);
        }
        result
    }

    fn serialize_block(output: &mut String, block: &Block, indent: usize) {
        // Write indent (two spaces per level)
        for _ in 0..indent {
            output.push_str("  ");
        }

        // Write prefix
        output.push_str("- ");

        // Write task status
        if let Some(status) = block.properties.status {
            output.push_str(&format!("{:?} ", status).to_uppercase());
        }

        // Write content
        output.push_str(&block.content);

        // Write tags
        for tag in &block.properties.tags {
            output.push_str(&format!(" #{}", tag));
        }

        // Write properties, one per line, indented one level below the block
        if !block.properties.custom.is_empty() {
            output.push('\n');
            for (key, value) in &block.properties.custom {
                for _ in 0..=indent {
                    output.push_str("  ");
                }
                output.push_str(&format!("{}:: {}\n", key, value));
            }
        }
        output.push('\n');

        // Write children recursively
        for child in &block.children {
            Self::serialize_block(output, child, indent + 1);
        }
    }

    // Helper methods for extraction

    fn extract_task_status(line: &str) -> (Option<TaskStatus>, &str) {
        let line = line.trim_start();
        if let Some(rest) = line.strip_prefix("TODO ") {
            (Some(TaskStatus::Todo), rest)
        } else if let Some(rest) = line.strip_prefix("DONE ") {
            (Some(TaskStatus::Done), rest)
        } else if let Some(rest) = line.strip_prefix("DOING ") {
            (Some(TaskStatus::Doing), rest)
        } else if let Some(rest) = line.strip_prefix("LATER ") {
            (Some(TaskStatus::Later), rest)
        } else if let Some(rest) = line.strip_prefix("NOW ") {
            (Some(TaskStatus::Now), rest)
        } else if let Some(rest) = line.strip_prefix("WAITING ") {
            (Some(TaskStatus::Waiting), rest)
        } else if let Some(rest) = line.strip_prefix("CANCELLED ") {
            (Some(TaskStatus::Cancelled), rest)
        } else {
            (None, line)
        }
    }

    fn extract_tags(line: &str) -> (Vec<String>, String) {
        let tag_regex = Regex::new(r"#(\w+)").unwrap();
        let mut tags = Vec::new();
        let mut result = line.to_string();
        for cap in tag_regex.captures_iter(line) {
            if let Some(tag) = cap.get(1) {
                tags.push(tag.as_str().to_string());
                result = result.replace(&format!("#{}", tag.as_str()), "");
            }
        }
        (tags, result.trim().to_string())
    }

    fn extract_properties(line: &str) -> (HashMap<String, String>, String) {
        let prop_regex = Regex::new(r"(\w+)::\s*([^\n]+)").unwrap();
        let mut props = HashMap::new();
        let mut result = line.to_string();
        for cap in prop_regex.captures_iter(line) {
            if let (Some(key), Some(value)) = (cap.get(1), cap.get(2)) {
                props.insert(key.as_str().to_string(), value.as_str().trim().to_string());
                result = result.replace(&cap[0], "");
            }
        }
        (props, result.trim().to_string())
    }

    fn extract_block_refs(line: &str) -> (Vec<String>, String) {
        let ref_regex = Regex::new(r"\(\(([^)]+)\)\)").unwrap();
        let mut refs = Vec::new();
        let mut result = line.to_string();
        for cap in ref_regex.captures_iter(line) {
            if let Some(uuid) = cap.get(1) {
                refs.push(uuid.as_str().to_string());
                result = result.replace(&cap[0], "");
            }
        }
        (refs, result.trim().to_string())
    }

    fn extract_page_refs(line: &str) -> (Vec<String>, String) {
        let page_regex = Regex::new(r"\[\[([^\]]+)\]\]").unwrap();
        let mut pages = Vec::new();
        let result = line.to_string();
        for cap in page_regex.captures_iter(line) {
            if let Some(page) = cap.get(1) {
                pages.push(page.as_str().to_string());
                // Keep [[page]] in content for now (backward compat)
            }
        }
        (pages, result)
    }
}

/// Indentation level: a tab counts as one level, two spaces as one level
fn count_indent(line: &str) -> usize {
    let leading = line.chars().take_while(|c| c.is_whitespace());
    let (tabs, spaces) = leading.fold((0, 0), |(t, s), c| {
        if c == '\t' { (t + 1, s) } else { (t, s + 1) }
    });
    tabs + spaces / 2
}
Phase 4: Logseq Import/Export
// crates/kb-core/src/logseq.rs
use std::path::Path;

use crate::models::block::Block;
use crate::models::{Node, NodeType};
use crate::parser::BlockParser;

pub struct LogseqImporter;

impl LogseqImporter {
    /// Import a Logseq page (markdown file) as a Node
    pub fn import_page(path: &Path) -> Result<Node> {
        let content = std::fs::read_to_string(path)?;

        // Extract frontmatter if present
        let (frontmatter, body) = Self::split_frontmatter(&content);

        // Parse blocks from body
        let blocks = BlockParser::parse(&body)?;

        // Create node with blocks
        let mut node = Node::new(NodeType::Note, Self::extract_title(path));
        node.content = body;
        node.blocks = Some(blocks);

        // Apply frontmatter properties
        if let Some(fm) = frontmatter {
            Self::apply_frontmatter(&mut node, &fm)?;
        }

        Ok(node)
    }

    fn split_frontmatter(content: &str) -> (Option<String>, String) {
        if content.starts_with("---\n") {
            // Search after the opening "---\n" (4 bytes) for the closing fence
            if let Some(end) = content[4..].find("\n---\n") {
                let frontmatter = content[4..4 + end].to_string();
                // Skip the closing "\n---\n" (5 bytes)
                let body = content[4 + end + 5..].to_string();
                return (Some(frontmatter), body);
            }
        }
        (None, content.to_string())
    }

    fn extract_title(path: &Path) -> String {
        path.file_stem()
            .and_then(|s| s.to_str())
            .unwrap_or("Untitled")
            .to_string()
    }

    fn apply_frontmatter(node: &mut Node, frontmatter: &str) -> Result<()> {
        // Parse YAML frontmatter and apply to node
        // ... implementation ...
        Ok(())
    }
}

pub struct LogseqExporter;

impl LogseqExporter {
    /// Export a Node to Logseq page format
    pub fn export_page(node: &Node, path: &Path) -> Result<()> {
        let mut output = String::new();

        // Generate frontmatter
        output.push_str("---\n");
        output.push_str(&Self::generate_frontmatter(node)?);
        output.push_str("---\n\n");

        // Serialize blocks or use content
        if let Some(ref blocks) = node.blocks {
            output.push_str(&BlockParser::serialize(blocks));
        } else {
            output.push_str(&node.content);
        }

        std::fs::write(path, output)?;
        Ok(())
    }

    fn generate_frontmatter(node: &Node) -> Result<String> {
        let mut fm = String::new();
        fm.push_str(&format!("title: {}\n", node.title));
        fm.push_str(&format!("tags: {}\n", node.tags.join(", ")));
        // ... more frontmatter fields ...
        Ok(fm)
    }
}
Query API Extensions
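// New methods in Graph or Query module
impl Graph {
    /// Parse blocks for every node that has not been parsed yet.
    /// Done up front so the queries below only need shared borrows;
    /// returning (&Node, &Block) while a node is still mutably borrowed
    /// would be rejected by the borrow checker.
    fn ensure_blocks_parsed(&mut self) {
        for node in self.nodes.values_mut() {
            // Nodes that fail to parse keep `blocks == None` and are skipped
            let _ = node.get_blocks();
        }
    }

    /// Collect every (node, block) pair whose block satisfies `pred`
    fn collect_blocks(&mut self, pred: impl Fn(&Block) -> bool) -> Vec<(&Node, &Block)> {
        self.ensure_blocks_parsed();
        let mut results = Vec::new();
        for node in self.nodes.values() {
            for root in node.blocks.iter().flatten() {
                for block in root.flatten() {
                    if pred(block) {
                        results.push((node, block));
                    }
                }
            }
        }
        results
    }

    /// Find all blocks with a specific tag across all nodes
    pub fn find_blocks_by_tag(&mut self, tag: &str) -> Vec<(&Node, &Block)> {
        self.collect_blocks(|b| b.properties.tags.iter().any(|t| t == tag))
    }

    /// Find all flashcards (#card blocks)
    pub fn find_flashcards(&mut self) -> Vec<(&Node, &Block)> {
        self.find_blocks_by_tag("card")
    }

    /// Find all TODO items across knowledge base
    pub fn find_all_todos(&mut self) -> Vec<(&Node, &Block)> {
        self.collect_blocks(|b| matches!(b.properties.status, Some(TaskStatus::Todo)))
    }
}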
MCP Tool Extensions
{
  "name": "kogral/find_blocks",
  "description": "Find blocks by tag, status, or properties",
  "inputSchema": {
    "type": "object",
    "properties": {
      "tag": { "type": "string", "description": "Filter by tag (e.g., 'card')" },
      "status": { "type": "string", "enum": ["TODO", "DONE", "DOING"] },
      "property": { "type": "string", "description": "Custom property key" },
      "value": { "type": "string", "description": "Property value to match" }
    }
  }
}
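On the server side, the tool input maps naturally onto a filter with four optional fields. The sketch below assumes that all supplied fields must match (logical AND) and that `value` is only meaningful together with `property`; the schema above does not state either rule. `BlockFilter` and `BlockView` are hypothetical types, with statuses kept as plain strings for brevity.

```rust
use std::collections::HashMap;

/// Mirrors the tool input: every field is optional.
#[derive(Debug, Default)]
pub struct BlockFilter {
    pub tag: Option<String>,
    pub status: Option<String>,
    pub property: Option<String>,
    pub value: Option<String>,
}

/// Illustrative read-only view of the block fields the filter inspects.
#[derive(Debug, Default)]
pub struct BlockView {
    pub tags: Vec<String>,
    pub status: Option<String>,
    pub custom: HashMap<String, String>,
}

impl BlockFilter {
    /// Assumed semantics: a missing field places no constraint,
    /// and all present fields must match.
    pub fn matches(&self, block: &BlockView) -> bool {
        let tag_ok = self
            .tag
            .as_ref()
            .map_or(true, |t| block.tags.contains(t));
        let status_ok = self
            .status
            .as_ref()
            .map_or(true, |s| block.status.as_ref() == Some(s));
        let prop_ok = match (&self.property, &self.value) {
            (Some(key), Some(value)) => block.custom.get(key) == Some(value),
            (Some(key), None) => block.custom.contains_key(key),
            // A value without a property key is treated as no constraint
            (None, _) => true,
        };
        tag_ok && status_ok && prop_ok
    }
}
```

An empty filter matches every block, which gives a "list all blocks" call for free.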
Configuration
# schemas/kb/contracts.ncl (additions)
BlockConfig = {
  enabled | Bool
    | doc "Enable block-level parsing and storage"
    | default = true,
  preserve_hierarchy | Bool
    | doc "Preserve block nesting on import/export"
    | default = true,
  parse_on_load | Bool
    | doc "Automatically parse blocks when loading nodes"
    | default = false, # Lazy parsing by default
  supported_statuses | Array String
    | doc "Supported task statuses"
    | default = ["TODO", "DONE", "DOING", "LATER", "NOW", "WAITING", "CANCELLED"],
}

KbConfig = {
  # ... existing fields ...
  blocks | BlockConfig
    | doc "Block-level features configuration"
    | default = {},
}
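On the Rust side, the contract above would typically be mirrored by a plain struct carrying the same defaults. The following is a sketch under that assumption; how the configuration is actually loaded is not specified here, and `accepts_status` is a hypothetical convenience method.

```rust
/// Rust-side mirror of the `BlockConfig` contract (illustrative).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BlockConfig {
    pub enabled: bool,
    pub preserve_hierarchy: bool,
    pub parse_on_load: bool,
    pub supported_statuses: Vec<String>,
}

impl Default for BlockConfig {
    /// Same defaults as the contract: blocks enabled, lazy parsing.
    fn default() -> Self {
        BlockConfig {
            enabled: true,
            preserve_hierarchy: true,
            parse_on_load: false,
            supported_statuses: ["TODO", "DONE", "DOING", "LATER", "NOW", "WAITING", "CANCELLED"]
                .iter()
                .map(|s| s.to_string())
                .collect(),
        }
    }
}

impl BlockConfig {
    /// True when block features are on and `marker` is a configured status.
    pub fn accepts_status(&self, marker: &str) -> bool {
        self.enabled && self.supported_statuses.iter().any(|s| s == marker)
    }
}
```

Keeping the defaults in one `Default` impl makes it easy to check them against the contract when either side changes.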
Migration Path
- Phase 1: Add Block models (no behavior change)
- Phase 2: Add BlockParser (opt-in via config)
- Phase 3: Update Logseq import/export
- Phase 4: Add block queries to CLI/MCP
- Phase 5: SurrealDB block indexing
Backward Compatibility:
- Existing nodes without blocks field work as before
- content remains source of truth
- blocks is optional cache/structure
- Config flag blocks.enabled to opt-in
Testing Strategy
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_parse_simple_block() {
        let content = "- This is a block #card";
        let blocks = BlockParser::parse(content).unwrap();

        assert_eq!(blocks.len(), 1);
        assert_eq!(blocks[0].content, "This is a block");
        assert_eq!(blocks[0].properties.tags, vec!["card"]);
    }

    #[test]
    fn test_parse_nested_blocks() {
        // Raw strings start at column 0 so indentation reflects nesting only
        let content = r#"
- Parent block
  - Child block 1
  - Child block 2
    - Grandchild
"#;
        let blocks = BlockParser::parse(content).unwrap();

        assert_eq!(blocks.len(), 1);
        assert_eq!(blocks[0].children.len(), 2);
        assert_eq!(blocks[0].children[1].children.len(), 1);
    }

    #[test]
    fn test_parse_todo() {
        let content = "- TODO Implement feature #rust";
        let blocks = BlockParser::parse(content).unwrap();

        assert_eq!(blocks[0].properties.status, Some(TaskStatus::Todo));
        assert_eq!(blocks[0].content, "Implement feature");
        assert_eq!(blocks[0].properties.tags, vec!["rust"]);
    }

    #[test]
    fn test_roundtrip() {
        let original = r#"- Block 1 #card
  - Nested
- TODO Block 2
  priority:: high
"#;
        let blocks = BlockParser::parse(original).unwrap();
        let serialized = BlockParser::serialize(&blocks);
        let reparsed = BlockParser::parse(&serialized).unwrap();

        assert_eq!(blocks.len(), reparsed.len());
        assert_eq!(blocks[0].properties, reparsed[0].properties);
    }
}
Summary
Recommended Approach: Hybrid (Option C)
- Add Block struct with properties, hierarchy
- Extend Node with optional blocks: Option<Vec<Block>>
- Implement bidirectional parser (markdown ↔ blocks)
- Preserve content as source of truth (backward compat)
- Enable block queries in CLI/MCP
- Support round-trip Logseq import/export
Benefits:
- ✅ Full Logseq compatibility
- ✅ Queryable blocks (find #card, TODO, etc.)
- ✅ Backward compatible
- ✅ Extensible (custom properties)
- ✅ Type-safe structure
Trade-offs:
- ⚠️ Added complexity
- ⚠️ Need to sync content ↔ blocks
- ⚠️ More storage for SurrealDB backend
Next Steps:
- Review and approve design
- Implement Phase 1 (Block models)
- Implement Phase 2 (BlockParser)
- Update Logseq import/export
- Add block queries to MCP/CLI