- KG: HNSW + BM25 + RRF(k=60) hybrid search via SurrealDB 3 native indexes
- Fix schema bug: kg_executions missing agent_role/provider/cost_cents (silent empty reads)
- channels: on_agent_inactive hook (AgentStatus::Inactive → Message::error)
- migration 012: adds missing fields + HNSW + BM25 indexes
- docs: ADR-0036, update ADR-0035 + notification-channels feature doc
ADR-0036: Knowledge Graph Hybrid Search — HNSW + BM25 + RRF
Status: Implemented
Date: 2026-02-26
Deciders: VAPORA Team
Technical Story: find_similar_executions was a stub returning recent records; find_similar_rlm_tasks ignored embeddings entirely. A missing schema migration caused all kg_executions reads to silently fail deserialization.
Decision
Replace the stub similarity functions in KGPersistence with a hybrid retrieval pipeline combining:
- HNSW (SurrealDB 3 native): approximate nearest-neighbor vector search over the embedding field
- BM25 (SurrealDB 3 native full-text search): lexical scoring over the task_description field
- Reciprocal Rank Fusion (RRF, k=60): scale-invariant score fusion
Add migration 012_kg_hybrid_search.surql that fixes a pre-existing schema bug (three fields missing from the SCHEMAFULL table) and defines the required indexes.
Context
The Stub Problem
find_similar_executions in persistence.rs discarded its embedding: &[f32] argument entirely and returned the N most-recent successful executions, ordered by timestamp. Any caller relying on semantic proximity was silently receiving chronological results — a correctness bug, not a performance issue.
The Silent Schema Bug
kg_executions was declared SCHEMAFULL in migration 005, but three fields used by PersistedExecution (agent_role, provider, cost_cents) were absent from the schema. SurrealDB drops undefined fields on INSERT in SCHEMAFULL tables. Every subsequent SELECT therefore returned records that failed serde_json::from_value deserialization, and each failure was swallowed by .filter_map(|v| v.ok()). The persistence layer appeared to work (no errors) while returning empty results for every query.
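The failure mode described above can be reproduced without a database. The following is a minimal sketch using only the standard library; the Execution struct, the decode function, and the tuple row shape are hypothetical stand-ins for PersistedExecution and serde_json::from_value, not the actual persistence code. It contrasts the lossy filter_map(.ok()) pattern with collecting into a Result, which surfaces the first error instead of hiding it.

```rust
/// Hypothetical stand-in for a row read back from `kg_executions`.
#[derive(Debug, PartialEq)]
struct Execution {
    task: String,
    provider: String,
}

/// Hypothetical decoder: fails when `provider` is missing, mimicking a
/// deserialization error on a row whose undefined fields were dropped.
fn decode(row: &(&str, Option<&str>)) -> Result<Execution, String> {
    let (task, provider) = *row;
    match provider {
        Some(p) => Ok(Execution {
            task: task.to_string(),
            provider: p.to_string(),
        }),
        None => Err(format!("missing field `provider` in row `{}`", task)),
    }
}

/// Lossy: every row that fails to decode silently disappears.
fn decode_lossy(rows: &[(&str, Option<&str>)]) -> Vec<Execution> {
    rows.iter().filter_map(|r| decode(r).ok()).collect()
}

/// Strict: the first decode failure is returned to the caller.
fn decode_strict(rows: &[(&str, Option<&str>)]) -> Result<Vec<Execution>, String> {
    rows.iter().map(|r| decode(r)).collect()
}
```

When every row lacks the field, decode_lossy returns an empty vector with no error, which matches the symptom in this section, while decode_strict returns an Err naming the missing field.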
Why Not stratum-embeddings SurrealDbStore
stratumiops/crates/stratum-embeddings/src/store/surrealdb.rs implements vector search as a brute-force full scan: it loads all records into memory and computes cosine similarity in-process. This works for document chunk retrieval (bounded dataset per document), but is unsuitable for the knowledge graph, which accumulates unbounded execution records across all agents and tasks over time.
Why Hybrid Over Pure Semantic
Embedding-only retrieval misses exact keyword matches: an agent searching for "cargo clippy warnings" may not find a record titled "clippy deny warnings fix" if the embedding model compresses the phrase differently than the query. BM25 handles exact token overlap that embeddings smooth over.
Alternatives Considered
❌ Pure HNSW semantic search only
- Misses exact keyword matches (e.g., specific error codes, crate names)
- Embedding quality varies across providers; degrades if provider changes
❌ Pure BM25 lexical search only
- Misses paraphrases and semantic variants ("task failed" vs "execution error")
- No relevance for structurally similar tasks with different wording
❌ Tantivy / external FTS engine
- Adds a new process dependency for a capability SurrealDB 3 provides natively
- Requires synchronizing two stores; adds operational complexity
✅ SurrealDB 3 HNSW + BM25 + RRF (chosen)
Single data store, two native index types, no new dependencies, no sync complexity.
Implementation
Migration 012
-- Fix missing fields causing silent deserialization failure
DEFINE FIELD agent_role ON TABLE kg_executions TYPE option<string>;
DEFINE FIELD provider ON TABLE kg_executions TYPE string DEFAULT 'unknown';
DEFINE FIELD cost_cents ON TABLE kg_executions TYPE int DEFAULT 0;

-- BM25 full-text index on task_description
DEFINE ANALYZER kg_text_analyzer
    TOKENIZERS class
    FILTERS lowercase, snowball(english);

DEFINE INDEX idx_kg_executions_ft
    ON TABLE kg_executions
    FIELDS task_description
    SEARCH ANALYZER kg_text_analyzer BM25;

-- HNSW ANN index on embedding (1536-dim, cosine, float32)
DEFINE INDEX idx_kg_executions_hnsw
    ON TABLE kg_executions
    FIELDS embedding
    HNSW DIMENSION 1536 DIST COSINE TYPE F32 M 16 EF_CONSTRUCTION 200;
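HNSW parameters: M 16 (up to 16 links per node per graph layer, a common choice for high-dimensional embeddings); EF_CONSTRUCTION 200 (size of the candidate list during index build; higher values improve recall at the cost of insert speed, and 200 is a widely used setting).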
Query Patterns
HNSW semantic search (<|100,64|> = 100 candidates, ef=64 at query time):
SELECT *, vector::similarity::cosine(embedding, $q) AS cosine_score
FROM kg_executions
WHERE embedding <|100,64|> $q
ORDER BY cosine_score DESC
LIMIT 20
BM25 lexical search (@1@ assigns predicate ID 1; paired with search::score(1)):
SELECT *, search::score(1) AS bm25_score
FROM kg_executions
WHERE task_description @1@ $text
ORDER BY bm25_score DESC
LIMIT 100
RRF Fusion
Cosine similarity is bounded in [-1, 1]; BM25 is unbounded in [0, ∞). Linearly blending the two would require per-corpus normalization. RRF uses only ranks, so it is scale-invariant:
hybrid_score(id) = 1 / (60 + rank_semantic) + 1 / (60 + rank_lexical)
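k=60 is the standard constant from the original RRF paper (Cormack, Clarke & Büttcher, 2009). IDs absent from one ranked list receive rank 0, contributing 1/60, never 0, which prevents complete suppression of single-method results. Note that 1/60 is at least as large as the contribution of any ranked ID, so under this convention an ID missing from a list scores as if it led that list.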
RLM Executions
rlm_executions is SCHEMALESS with a nullable query_embedding field. HNSW indexes require a SCHEMAFULL table with a non-nullable typed field. find_similar_rlm_tasks therefore uses in-memory cosine similarity: it loads candidate records, keeps those with non-empty embeddings, and sorts them by cosine score. This is acceptable because the RLM dataset is bounded per document.
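The in-memory path described above amounts to a filter, a score, and a sort. The sketch below is a standard-library illustration of that shape, not the body of find_similar_rlm_tasks: the RlmRecord struct, its id field, and the top_similar function are hypothetical, and only query_embedding is a name taken from this ADR.

```rust
/// Cosine similarity of two vectors. Returns None for empty,
/// mismatched-length, or zero-norm inputs.
fn cosine(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        None
    } else {
        Some(dot / (norm_a * norm_b))
    }
}

/// Hypothetical record shape; `query_embedding` is nullable as in the ADR.
struct RlmRecord {
    id: String,
    query_embedding: Option<Vec<f32>>,
}

/// Score every record that has a usable embedding and return the
/// `limit` most similar as (id, cosine_score), best first.
fn top_similar<'a>(
    records: &'a [RlmRecord],
    query: &[f32],
    limit: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = records
        .iter()
        .filter_map(|r| {
            // Skip records with a null embedding; `cosine` rejects empty ones.
            let emb = r.query_embedding.as_deref()?;
            cosine(emb, query).map(|s| (r.id.as_str(), s))
        })
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

Each call costs time linear in the number of candidate records times the embedding dimension, which is why this approach only suits a bounded dataset.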
New Public API
impl KGPersistence {
    // Was stub (returned recent records). Now uses HNSW ANN query.
    pub async fn find_similar_executions(
        &self,
        embedding: &[f32],
        limit: usize,
    ) -> anyhow::Result<Vec<PersistedExecution>>;

    // New. HNSW + BM25 + RRF. Either argument may be empty (degrades gracefully).
    pub async fn hybrid_search(
        &self,
        embedding: &[f32],
        text_query: &str,
        limit: usize,
    ) -> anyhow::Result<Vec<HybridSearchResult>>;
}
HybridSearchResult exposes semantic_score, lexical_score, hybrid_score, semantic_rank, lexical_rank — callers can inspect individual signal contributions.
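As an illustration of how a caller might use those fields, the sketch below defines a stand-in struct and a filter on the semantic signal. The field names come from this ADR; the field types, the id field, and the high_confidence helper are assumptions made for the example and may differ from the real HybridSearchResult.

```rust
/// Stand-in for the real type. Field names follow the ADR; the types
/// and the `id` field are assumptions of this sketch.
#[derive(Debug, Clone)]
struct HybridSearchResult {
    id: String,
    semantic_score: f64,
    lexical_score: f64,
    hybrid_score: f64,
    semantic_rank: usize,
    lexical_rank: usize,
}

/// Keep only results whose semantic score meets the threshold,
/// preserving the incoming (hybrid) order.
fn high_confidence(
    results: Vec<HybridSearchResult>,
    min_semantic: f64,
) -> Vec<HybridSearchResult> {
    results
        .into_iter()
        .filter(|r| r.semantic_score >= min_semantic)
        .collect()
}
```

Calling high_confidence(results, 0.7) would apply the semantic_score threshold that this ADR mentions for high-confidence-only retrieval, without re-sorting the hybrid ranking.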
Consequences
Positive
- find_similar_executions returns semantically similar past executions, not recent ones. The correctness bug is fixed.
- hybrid_search exposes both signals; callers can filter by semantic_score ≥ 0.7 for high-confidence-only retrieval.
- No new dependencies. The two indexes are defined in a migration; no Rust dependency change.
- The schema bug fix means all existing kg_executions records round-trip correctly after migration 012 is applied.
Negative / Trade-offs
- HNSW index build is O(n log n) in SurrealDB; large existing datasets will cause migration 012 to take longer than typical DDL migrations. No data migration is needed, only index creation.
- BM25 requires the task_description field to be populated at insert time. Records inserted before this migration with empty or null descriptions will not appear in lexical results.
- rlm_executions hybrid search remains in-memory. A future migration converting rlm_executions to SCHEMAFULL would enable native HNSW for that table too.
Supersedes
- The stub implementation of find_similar_executions (existed since persistence.rs was written).
- Extends ADR-0013 (KG temporal design) with the retrieval layer decision.
Related
- ADR-0013: Knowledge Graph Temporal — original KG design
- ADR-0029: RLM Recursive Language Models — RLM hybrid search (different use case: document chunks, not execution records)
- ADR-0004: SurrealDB — database foundation