ADR-013: Temporal Knowledge Graph with SurrealDB
Status: Accepted | Implemented
Date: 2024-11-01
Deciders: Architecture Team
Technical Story: Enabling collective agent learning through temporal execution history
Decision
Implement a temporal Knowledge Graph in SurrealDB with execution history, learning curves, and similarity search.
Rationale
- Collective Learning: Agents learn from shared experience, not only their own
- Temporal History: A 30/90-day history makes trends identifiable
- Causal Relationships: The graph lets us trace the root causes of problems and their solutions
- Similarity Search: Find past solutions for similar tasks
- SurrealDB Native: Graph queries are built into the same database that holds the relational data
Alternatives Considered
❌ Event Log Only (No Graph)
- Pros: Simple
- Cons: No causal relationships, inefficient search
❌ Separate Graph DB (Neo4j)
- Pros: Optimized for graph queries
- Cons: Data duplication, synchronization complexity
✅ SurrealDB Temporal KG (CHOSEN)
- Unified, temporal, built-in graph queries
Trade-offs
Pros:
- ✅ Temporal data (30/90 day retention)
- ✅ Causal relationships traceable
- ✅ Similarity search for solution discovery
- ✅ Learning curves identify improvement trends
- ✅ Single database (no sync issues)
Cons:
- ⚠️ Graph queries more complex than relational
- ⚠️ Storage overhead for full history
- ⚠️ Retention policy trade-off: longer history = more storage
Implementation
Temporal Data Model:
// crates/vapora-knowledge-graph/src/models.rs
use chrono::{DateTime, Utc};

// One record per task execution by an agent.
pub struct ExecutionRecord {
    pub id: String,
    pub agent_id: String,
    pub task_id: String,
    pub task_type: String,
    pub success: bool,
    pub quality_score: f32,
    pub latency_ms: u32,
    pub cost_cents: u32,
    pub timestamp: DateTime<Utc>,
    pub daily_window: String, // YYYY-MM-DD for aggregation
}

// One aggregated point per agent, task type, and day.
pub struct LearningCurve {
    pub id: String,
    pub agent_id: String,
    pub task_type: String,
    pub day: String, // YYYY-MM-DD
    pub success_rate: f32,
    pub avg_quality: f32,
    pub trend: TrendDirection, // Improving, Stable, Declining
}
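The models refer to a `TrendDirection` type that is not shown here. A minimal sketch of what it could look like follows, using the three variants named in the comment. The `classify_trend` helper and its `tolerance` parameter are hypothetical and only illustrate how a day-over-day change in average quality might map to a direction.

```rust
/// Direction of a learning curve between two consecutive days.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TrendDirection {
    Improving,
    Stable,
    Declining,
}

/// Classifies the change in average quality between two days.
/// `tolerance` is a hypothetical dead band: changes no larger than it count as Stable.
pub fn classify_trend(previous_avg: f32, current_avg: f32, tolerance: f32) -> TrendDirection {
    let delta = current_avg - previous_avg;
    if delta > tolerance {
        TrendDirection::Improving
    } else if delta < -tolerance {
        TrendDirection::Declining
    } else {
        TrendDirection::Stable
    }
}
```

A dead band keeps small day-to-day noise from flipping the reported trend; the right width would depend on how noisy `quality_score` is in practice.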
SurrealDB Schema:
-- Define execution records table
DEFINE TABLE executions;
DEFINE FIELD agent_id ON TABLE executions TYPE string;
DEFINE FIELD task_id ON TABLE executions TYPE string;
DEFINE FIELD task_type ON TABLE executions TYPE string;
DEFINE FIELD success ON TABLE executions TYPE bool;
DEFINE FIELD quality_score ON TABLE executions TYPE float;
DEFINE FIELD timestamp ON TABLE executions TYPE datetime;
DEFINE FIELD daily_window ON TABLE executions TYPE string;
-- Define temporal index for efficient time-range queries
DEFINE INDEX idx_execution_temporal ON TABLE executions
COLUMNS timestamp, daily_window;
-- Define learning curves table
DEFINE TABLE learning_curves;
DEFINE FIELD agent_id ON TABLE learning_curves TYPE string;
DEFINE FIELD task_type ON TABLE learning_curves TYPE string;
DEFINE FIELD day ON TABLE learning_curves TYPE string;
DEFINE FIELD success_rate ON TABLE learning_curves TYPE float;
DEFINE FIELD trend ON TABLE learning_curves TYPE string;
Temporal Query (30-Day Learning Curve):
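// crates/vapora-knowledge-graph/src/learning.rs
use chrono::{Duration, Utc};
use serde::Deserialize;
use surrealdb::{engine::remote::ws::Client, Surreal};
// One aggregated row per day, as returned by the query below.
#[derive(Deserialize)]
struct DailyRow {
    daily_window: String,
    total_tasks: u32,
    successes: u32,
    avg_quality: f32,
}
pub async fn compute_learning_curve(
    db: &Surreal<Client>, // `Client` is the connection type of the `Ws` engine
    agent_id: &str,
    task_type: &str,
    days: u32,
) -> Result<Vec<LearningCurve>> {
    let since = (Utc::now() - Duration::days(days as i64)).format("%Y-%m-%d").to_string();
    // Values are bound as parameters, not interpolated into the query string.
    // SurrealQL has no LAG/OVER window functions, so the trend is derived in Rust.
    let query = r#"
        SELECT daily_window, count() AS total_tasks, count(success = true) AS successes,
            math::mean(quality_score) AS avg_quality
        FROM executions
        WHERE agent_id = $agent_id AND task_type = $task_type AND daily_window >= $since
        GROUP BY daily_window
        ORDER BY daily_window ASC
    "#;
    let rows: Vec<DailyRow> = db
        .query(query)
        .bind(("agent_id", agent_id.to_owned()))
        .bind(("task_type", task_type.to_owned()))
        .bind(("since", since))
        .await?
        .take(0)?;
    // Compare each day's average quality with the previous day's.
    let mut prev: Option<f32> = None;
    let curves = rows.into_iter().map(|row| {
        let trend = match prev {
            Some(p) if row.avg_quality > p => TrendDirection::Improving,
            Some(p) if row.avg_quality < p => TrendDirection::Declining,
            _ => TrendDirection::Stable,
        };
        prev = Some(row.avg_quality);
        LearningCurve {
            id: format!("{agent_id}:{task_type}:{}", row.daily_window), // assumed id scheme
            agent_id: agent_id.to_owned(),
            task_type: task_type.to_owned(),
            success_rate: row.successes as f32 / row.total_tasks as f32,
            avg_quality: row.avg_quality,
            day: row.daily_window,
            trend,
        }
    });
    Ok(curves.collect())
}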
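To make the aggregation concrete, here is a small in-memory sketch of the same per-day grouping, written without a database. The `Execution` struct and `daily_points` function are illustrative stand-ins rather than part of the crate; they only show how success rate and average quality fall out of grouping by `daily_window`.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for `ExecutionRecord`, keeping only the aggregated fields.
pub struct Execution {
    pub daily_window: String, // YYYY-MM-DD
    pub success: bool,
    pub quality_score: f32,
}

/// One point of a learning curve: (day, success_rate, avg_quality).
pub type DailyPoint = (String, f32, f32);

/// Groups executions by day and computes success rate and average quality.
pub fn daily_points(executions: &[Execution]) -> Vec<DailyPoint> {
    // BTreeMap keeps keys sorted; YYYY-MM-DD strings sort chronologically.
    // Each bucket holds (total, successes, quality_sum).
    let mut buckets: BTreeMap<&str, (u32, u32, f32)> = BTreeMap::new();
    for e in executions {
        let bucket = buckets.entry(e.daily_window.as_str()).or_insert((0, 0, 0.0));
        bucket.0 += 1;
        if e.success {
            bucket.1 += 1;
        }
        bucket.2 += e.quality_score;
    }
    buckets
        .into_iter()
        .map(|(day, (total, ok, sum))| {
            (day.to_string(), ok as f32 / total as f32, sum / total as f32)
        })
        .collect()
}
```

Every bucket is created by at least one execution, so the divisions never see a zero total.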
Similarity Search (Find Past Solutions):
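use serde::Deserialize;

// Row shape returned by the query: the execution fields plus the computed score.
#[derive(Deserialize)]
struct ScoredExecution {
    #[serde(flatten)]
    record: ExecutionRecord,
    score: f32,
}

pub async fn find_similar_tasks(
    db: &Surreal<Client>, // surrealdb::engine::remote::ws::Client
    task: &Task,
    limit: u32,
) -> Result<Vec<(ExecutionRecord, f32)>> {
    // Assumption: `task.embedding` is a Vec<f32> computed from the task description,
    // and each execution stores a comparable vector in an `embedding` field.
    // Neither appears in the models or schema above.
    let similarity_threshold = 0.85;
    let query = r#"
        SELECT *, vector::similarity::cosine(embedding, $query_embedding) AS score
        FROM executions
        WHERE success = true
            AND vector::similarity::cosine(embedding, $query_embedding) > $threshold
        ORDER BY score DESC
        LIMIT $limit
    "#;
    let rows: Vec<ScoredExecution> = db
        .query(query)
        .bind(("query_embedding", task.embedding.clone()))
        .bind(("threshold", similarity_threshold))
        .bind(("limit", limit))
        .await?
        .take(0)?;
    Ok(rows.into_iter().map(|r| (r.record, r.score)).collect())
}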
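The scoring step can be illustrated outside the database. The sketch below assumes cosine similarity over embedding vectors, which the source does not confirm as the metric in use; `cosine_similarity` and `top_similar` are hypothetical helper names. It keeps candidates above a threshold, best first, up to a limit.

```rust
/// Cosine similarity of two vectors.
/// Returns 0.0 if the lengths differ or either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Returns (candidate index, score) for candidates scoring above `threshold`,
/// sorted by descending score and truncated to `limit` entries.
pub fn top_similar(
    query: &[f32],
    candidates: &[Vec<f32>],
    threshold: f32,
    limit: usize,
) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = candidates
        .iter()
        .enumerate()
        .map(|(i, c)| (i, cosine_similarity(query, c)))
        .filter(|(_, score)| *score > threshold)
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

This is a linear scan, so it is only suitable for small candidate sets; larger ones would need an index.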
Causal Graph (Problem Resolution):
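pub async fn trace_solution_chain(
    db: &Surreal<Client>, // surrealdb::engine::remote::ws::Client
    problem_task_id: &str, // key part only, without the `tasks:` prefix
) -> Result<Vec<ExecutionRecord>> {
    // Build the record id from a bound parameter rather than string formatting,
    // then follow resolved_by edges out to the linked executions.
    let query = r#"
        SELECT VALUE ->resolved_by->executions.*
        FROM type::thing("tasks", $task_id)
    "#;
    // One inner array per selected task; flatten into a single list.
    let solutions: Vec<Vec<ExecutionRecord>> = db
        .query(query)
        .bind(("task_id", problem_task_id.to_owned()))
        .await?
        .take(0)?;
    Ok(solutions.into_iter().flatten().collect())
}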
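The query above follows a single `resolved_by` hop. For illustration, the sketch below shows how a multi-hop chain could be walked in memory with a breadth-first search that tolerates cycles. The adjacency map and the `trace_chain` name are hypothetical and not part of the crate.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Follows `resolved_by` edges breadth-first from `start` and returns the
/// ids reached, in visit order. Each node is visited at most once, so
/// cycles in the graph do not cause an infinite loop.
pub fn trace_chain(edges: &HashMap<String, Vec<String>>, start: &str) -> Vec<String> {
    let mut visited: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<String> = VecDeque::new();
    let mut chain = Vec::new();

    visited.insert(start.to_string());
    queue.push_back(start.to_string());

    while let Some(node) = queue.pop_front() {
        // A node with no outgoing edges yields an empty iterator here.
        for next in edges.get(&node).into_iter().flatten() {
            if visited.insert(next.clone()) {
                chain.push(next.clone());
                queue.push_back(next.clone());
            }
        }
    }
    chain
}
```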
Key Files:
- /crates/vapora-knowledge-graph/src/learning.rs (learning curve computation)
- /crates/vapora-knowledge-graph/src/persistence.rs (DB persistence)
- /crates/vapora-knowledge-graph/src/models.rs (temporal models)
- /crates/vapora-backend/src/services/ (uses KG for task recommendations)
Verification
# Test learning curve computation
cargo test -p vapora-knowledge-graph test_learning_curve_30day
# Test similarity search
cargo test -p vapora-knowledge-graph test_similarity_search
# Test causal graph traversal
cargo test -p vapora-knowledge-graph test_causal_chain
# Test retention policy (30-day window)
cargo test -p vapora-knowledge-graph test_retention_policy
# Integration test: full KG workflow
cargo test -p vapora-knowledge-graph test_full_kg_lifecycle
# Query performance test
cargo bench -p vapora-knowledge-graph bench_temporal_queries
Expected Output:
- Learning curves computed correctly
- Similarity search finds relevant past executions
- Causal chains traceable
- Retention policy removes old records
- Temporal queries perform well (<100ms)
Consequences
Data Management
- Storage grows ~1MB per 1000 executions (depends on detail level)
- Retention policy: 30 days (users), 90 days (enterprise)
- Archival strategy for historical analysis
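As a rough sizing aid, the storage figure and retention windows above combine into a simple estimate. The sketch below treats ~1 MB per 1000 executions as about 1 KB per execution; `executions_per_day` is a hypothetical workload input, not a measured value.

```rust
/// Approximate bytes per execution, derived from ~1 MB per 1000 executions.
pub const BYTES_PER_EXECUTION: u64 = 1_000_000 / 1_000;

/// Estimated steady-state storage in bytes for a given retention window.
/// `executions_per_day` is a hypothetical workload figure.
pub fn estimated_storage_bytes(executions_per_day: u64, retention_days: u64) -> u64 {
    executions_per_day * retention_days * BYTES_PER_EXECUTION
}
```

Under these assumptions, 10,000 executions per day would come to roughly 300 MB at 30 days of retention and 900 MB at 90 days, before indexes and graph edges.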
Agent Learning
- Agents access KG to find similar past solutions
- Learning curves inform agent selection (see ADR-014)
- Improvement trends visible for monitoring
Observability
- Full audit trail of agent decisions
- Trending analysis for capacity planning
- Incident investigation via causal chains
Scalability
- Graph queries optimized with indexes
- Temporal queries use daily windows (efficient partition)
- Similarity search scales to millions of records
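One reason daily windows work well as a range key is that zero-padded `YYYY-MM-DD` strings compare chronologically as plain strings, so a time-range filter needs no date parsing. A minimal sketch, with `within_retention` as a hypothetical helper name:

```rust
/// Keeps only the daily windows on or after `cutoff` (both YYYY-MM-DD).
/// Zero-padded ISO dates compare chronologically as plain strings.
pub fn within_retention<'a>(windows: &[&'a str], cutoff: &str) -> Vec<&'a str> {
    windows.iter().copied().filter(|w| *w >= cutoff).collect()
}
```

The same comparison underlies the `daily_window >= since` filter in the learning curve query.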
References
- /crates/vapora-knowledge-graph/src/learning.rs (implementation)
- /crates/vapora-knowledge-graph/src/persistence.rs (persistence layer)
- ADR-004 (SurrealDB)
- ADR-014 (Learning Profiles)
- ADR-019 (Temporal Execution History)
Related ADRs: ADR-004 (SurrealDB), ADR-014 (Learning Profiles), ADR-019 (Temporal History)