ADR-013: Temporal Knowledge Graph with SurrealDB

Status: Accepted | Implemented
Date: 2024-11-01
Deciders: Architecture Team
Technical Story: Enabling collective agent learning through temporal execution history


Decision

Implement a temporal Knowledge Graph in SurrealDB with execution history, learning curves, and similarity search.


Rationale

  1. Collective Learning: Agents learn from shared experience, not only from their own
  2. Temporal History: A 30/90-day history makes trends identifiable
  3. Causal Relationships: The graph makes it possible to trace problems back to their root causes and forward to their solutions
  4. Similarity Search: Find past solutions for similar tasks
  5. SurrealDB Native: Graph queries run in the same database as the relational data

Alternatives Considered

❌ Event Log Only (No Graph)

  • Pros: Simple
  • Cons: No causal relationships, inefficient search

❌ Separate Graph DB (Neo4j)

  • Pros: Optimized for graph workloads
  • Cons: Data duplication, synchronization complexity

✅ SurrealDB Temporal KG (CHOSEN)

  • Unified, temporal, graph queries integrated into the same database

Trade-offs

Pros:

  • ✅ Temporal data (30/90 day retention)
  • ✅ Causal relationships traceable
  • ✅ Similarity search for solution discovery
  • ✅ Learning curves identify improvement trends
  • ✅ Single database (no sync issues)

Cons:

  • ⚠️ Graph queries more complex than relational
  • ⚠️ Storage overhead for full history
  • ⚠️ Retention policy trade-off: longer history = more storage

Implementation

Temporal Data Model:

// crates/vapora-knowledge-graph/src/models.rs
use chrono::{DateTime, Utc};

/// One agent execution of a task, stored with its outcome and cost.
pub struct ExecutionRecord {
    pub id: String,
    pub agent_id: String,
    pub task_id: String,
    pub task_type: String,
    pub success: bool,
    pub quality_score: f32,
    pub latency_ms: u32,
    pub cost_cents: u32,
    pub timestamp: DateTime<Utc>,
    pub daily_window: String,  // YYYY-MM-DD for aggregation
}

/// Daily aggregate of one agent's results for one task type.
pub struct LearningCurve {
    pub id: String,
    pub agent_id: String,
    pub task_type: String,
    pub day: String,           // YYYY-MM-DD
    pub success_rate: f32,
    pub avg_quality: f32,
    pub trend: TrendDirection, // Improving, Stable, Declining
}
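The `trend` field needs a rule that turns two consecutive daily values into a direction. The following is a minimal sketch of one such rule, not the crate's actual logic: the enum shape is inferred from the comment in the model, and `classify_trend`, `label_series`, and the `epsilon` tolerance band are hypothetical names introduced for illustration.

```rust
/// Direction of change between two consecutive days.
/// Variant names follow the comment in the model; the real enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TrendDirection {
    Improving,
    Stable,
    Declining,
}

/// Classifies the change from `previous` to `current` average quality.
/// `epsilon` is an assumed tolerance: changes within +/- epsilon count as Stable.
pub fn classify_trend(previous: f32, current: f32, epsilon: f32) -> TrendDirection {
    let delta = current - previous;
    if delta > epsilon {
        TrendDirection::Improving
    } else if delta < -epsilon {
        TrendDirection::Declining
    } else {
        TrendDirection::Stable
    }
}

/// Labels every day in a chronologically ordered series.
/// The first day has no predecessor and is reported as Stable.
pub fn label_series(avg_quality: &[f32], epsilon: f32) -> Vec<TrendDirection> {
    avg_quality
        .iter()
        .enumerate()
        .map(|(i, &quality)| {
            if i == 0 {
                TrendDirection::Stable
            } else {
                classify_trend(avg_quality[i - 1], quality, epsilon)
            }
        })
        .collect()
}
```

A tolerance band keeps small day-to-day noise from flipping the label; the right width depends on how `quality_score` is scaled, which this ADR does not specify.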

SurrealDB Schema:

-- Define execution records table
DEFINE TABLE executions;
DEFINE FIELD agent_id ON TABLE executions TYPE string;
DEFINE FIELD task_id ON TABLE executions TYPE string;
DEFINE FIELD task_type ON TABLE executions TYPE string;
DEFINE FIELD success ON TABLE executions TYPE bool;
DEFINE FIELD quality_score ON TABLE executions TYPE float;
DEFINE FIELD timestamp ON TABLE executions TYPE datetime;
DEFINE FIELD daily_window ON TABLE executions TYPE string;

-- Define temporal index for efficient time-range queries
DEFINE INDEX idx_execution_temporal ON TABLE executions
    COLUMNS timestamp, daily_window;

-- Define learning curves table
DEFINE TABLE learning_curves;
DEFINE FIELD agent_id ON TABLE learning_curves TYPE string;
DEFINE FIELD task_type ON TABLE learning_curves TYPE string;
DEFINE FIELD day ON TABLE learning_curves TYPE string;
DEFINE FIELD success_rate ON TABLE learning_curves TYPE float;
DEFINE FIELD trend ON TABLE learning_curves TYPE string;

Temporal Query (30-Day Learning Curve):

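// crates/vapora-knowledge-graph/src/learning.rs
use chrono::{Duration, Utc};
use serde::Deserialize;
use surrealdb::{engine::remote::ws::Client, Surreal};

// One aggregated row per day, as returned by the query below.
#[derive(Deserialize)]
struct DailyStats {
    daily_window: String,
    total_tasks: u64,
    successes: u64,
    avg_quality: f32,
}

pub async fn compute_learning_curve(
    db: &Surreal<Client>,
    agent_id: &str,
    task_type: &str,
    days: u32,
) -> Result<Vec<LearningCurve>> {
    // Lower bound of the window; YYYY-MM-DD strings compare chronologically.
    let since = (Utc::now() - Duration::days(i64::from(days)))
        .format("%Y-%m-%d")
        .to_string();

    // Parameters are bound rather than interpolated with format!, which left
    // string values unquoted and open to injection. SurrealQL has no
    // LAG ... OVER window function, so the trend is derived in Rust below.
    let query = r#"
        SELECT
            daily_window,
            count() AS total_tasks,
            count(success = true) AS successes,
            math::mean(quality_score) AS avg_quality
        FROM executions
        WHERE agent_id = $agent_id AND task_type = $task_type AND daily_window >= $since
        GROUP BY daily_window
        ORDER BY daily_window ASC
    "#;

    let rows: Vec<DailyStats> = db
        .query(query)
        .bind(("agent_id", agent_id.to_string()))
        .bind(("task_type", task_type.to_string()))
        .bind(("since", since))
        .await?
        .take(0)?;

    // Compare each day's average quality with the previous day's.
    let mut previous: Option<f32> = None;
    Ok(rows.into_iter().map(|row| {
        let trend = match previous.map(|p| row.avg_quality - p) {
            Some(delta) if delta > 0.0 => TrendDirection::Improving,
            Some(delta) if delta < 0.0 => TrendDirection::Declining,
            _ => TrendDirection::Stable,
        };
        previous = Some(row.avg_quality);
        LearningCurve {
            // Assumed ID scheme; the ADR does not define how curve IDs are built.
            id: format!("{agent_id}:{task_type}:{}", row.daily_window),
            agent_id: agent_id.to_string(),
            task_type: task_type.to_string(),
            success_rate: row.successes as f32 / row.total_tasks as f32,
            avg_quality: row.avg_quality,
            trend,
            day: row.daily_window,
        }
    }).collect())
}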
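The per-day aggregation behind a learning curve can also be shown without a database. This is a minimal in-memory sketch under stated assumptions: `Execution` is a cut-down stand-in for `ExecutionRecord`, and `DailyPoint` and `daily_points` are hypothetical names that do not come from the crate.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for the fields of `ExecutionRecord` used here.
#[derive(Debug, Clone)]
pub struct Execution {
    pub daily_window: String, // YYYY-MM-DD
    pub success: bool,
    pub quality_score: f32,
}

/// One point on the learning curve (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub struct DailyPoint {
    pub day: String,
    pub total_tasks: usize,
    pub success_rate: f32,
    pub avg_quality: f32,
}

/// Groups executions by `daily_window` and computes per-day statistics.
/// A BTreeMap keeps keys sorted, and YYYY-MM-DD strings sort chronologically,
/// so the output is ordered by day without an explicit sort.
pub fn daily_points(executions: &[Execution]) -> Vec<DailyPoint> {
    // day -> (total executions, successful executions, sum of quality scores)
    let mut buckets: BTreeMap<&str, (usize, usize, f32)> = BTreeMap::new();
    for execution in executions {
        let bucket = buckets
            .entry(execution.daily_window.as_str())
            .or_insert((0, 0, 0.0));
        bucket.0 += 1;
        if execution.success {
            bucket.1 += 1;
        }
        bucket.2 += execution.quality_score;
    }

    // Every bucket holds at least one execution, so the divisions are safe.
    buckets
        .into_iter()
        .map(|(day, (total, successes, quality_sum))| DailyPoint {
            day: day.to_string(),
            total_tasks: total,
            success_rate: successes as f32 / total as f32,
            avg_quality: quality_sum / total as f32,
        })
        .collect()
}
```

Days without executions produce no point at all, so a consumer that plots the curve has to decide whether to treat such gaps as missing data or as zero activity.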

Similarity Search (Find Past Solutions):

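use serde::Deserialize;
use surrealdb::{engine::remote::ws::Client, Surreal};

// Row shape returned by the query: the execution plus its similarity score.
#[derive(Deserialize)]
struct ScoredExecution {
    #[serde(flatten)]
    record: ExecutionRecord,
    score: f32,
}

pub async fn find_similar_tasks(
    db: &Surreal<Client>,
    task: &Task,
    limit: u32,
) -> Result<Vec<(ExecutionRecord, f32)>> {
    // Minimum cosine similarity for a past execution to count as a match.
    let similarity_threshold = 0.85;

    // Assumption: every execution stores an `embedding` of its task description
    // and `Task` exposes a matching `embedding: Vec<f32>`. Neither field is part
    // of the schema shown above. ExecutionRecord must also derive Deserialize.
    let query = r#"
        SELECT
            *,
            vector::similarity::cosine(embedding, $embedding) AS score
        FROM executions
        WHERE success = true
            AND vector::similarity::cosine(embedding, $embedding) > $threshold
        ORDER BY score DESC
        LIMIT $limit
    "#;

    let rows: Vec<ScoredExecution> = db
        .query(query)
        .bind(("embedding", task.embedding.clone()))
        .bind(("threshold", similarity_threshold))
        .bind(("limit", limit))
        .await?
        .take(0)?;

    // An empty result is a valid answer: no past execution was similar enough.
    Ok(rows.into_iter().map(|row| (row.record, row.score)).collect())
}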
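The scoring step can be illustrated in isolation. The sketch below assumes cosine similarity over embedding vectors, which the ADR does not confirm as the metric in use; `Candidate`, `cosine_similarity`, and `rank_similar` are hypothetical names for this example.

```rust
/// Cosine similarity of two vectors. Returns None when the lengths differ
/// or either vector has zero norm, since the value is undefined there.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Hypothetical in-memory view of a past execution with its embedding.
pub struct Candidate {
    pub task_id: String,
    pub success: bool,
    pub embedding: Vec<f32>,
}

/// Returns up to `limit` successful candidates whose similarity to `query`
/// exceeds `threshold`, best match first.
pub fn rank_similar<'a>(
    query: &[f32],
    candidates: &'a [Candidate],
    threshold: f32,
    limit: usize,
) -> Vec<(&'a Candidate, f32)> {
    let mut scored: Vec<(&Candidate, f32)> = candidates
        .iter()
        .filter(|c| c.success)
        .filter_map(|c| cosine_similarity(query, &c.embedding).map(|score| (c, score)))
        .filter(|(_, score)| *score > threshold)
        .collect();
    // Sort descending by score; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

A linear scan like this is only practical for small candidate sets; at larger scale the comparison would have to be pushed into the database or an index.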

Causal Graph (Problem Resolution):

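use surrealdb::{engine::remote::ws::Client, Surreal};

pub async fn trace_solution_chain(
    db: &Surreal<Client>,
    problem_task_id: &str,
) -> Result<Vec<ExecutionRecord>> {
    // Follow resolved_by edges from the problem task to the executions that
    // solved it. The trailing `.*` fetches full records instead of record IDs.
    // The task ID is bound as a parameter rather than interpolated with format!.
    let query = r#"
        SELECT ->resolved_by->executions.* AS solutions
        FROM type::thing("tasks", $task_id)
    "#;

    db.query(query)
        .bind(("task_id", problem_task_id.to_string()))
        .await?
        // Read the `solutions` field of the first row; None means the task
        // record does not exist.
        .take::<Option<Vec<ExecutionRecord>>>((0, "solutions"))?
        .ok_or(Error::NotFound)
}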
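The query above follows one hop of `resolved_by` edges. Tracing a longer chain means repeating that step until no new nodes appear. The following sketch shows that traversal on an in-memory adjacency map; the map layout and the `solution_chain` function are assumptions made for illustration, not part of the crate.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Follows `resolved_by` edges breadth-first from `start` and returns every
/// node reached, in discovery order. `edges` maps a node ID to the IDs it is
/// resolved by (an assumed in-memory form of the graph).
pub fn solution_chain(edges: &HashMap<String, Vec<String>>, start: &str) -> Vec<String> {
    let mut visited: HashSet<&str> = HashSet::new();
    let mut queue: VecDeque<&str> = VecDeque::new();
    let mut chain = Vec::new();

    visited.insert(start);
    queue.push_back(start);

    while let Some(node) = queue.pop_front() {
        if let Some(neighbours) = edges.get(node) {
            for next in neighbours {
                // `insert` returns false for nodes already seen, which stops
                // the traversal from looping on cyclic graphs.
                if visited.insert(next.as_str()) {
                    chain.push(next.clone());
                    queue.push_back(next.as_str());
                }
            }
        }
    }
    chain
}
```

The visited set matters in practice: a resolution that reopens the original problem creates a cycle, and an unguarded traversal would never terminate.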

Key Files:

  • /crates/vapora-knowledge-graph/src/learning.rs (learning curve computation)
  • /crates/vapora-knowledge-graph/src/persistence.rs (DB persistence)
  • /crates/vapora-knowledge-graph/src/models.rs (temporal models)
  • /crates/vapora-backend/src/services/ (uses KG for task recommendations)

Verification

# Test learning curve computation
cargo test -p vapora-knowledge-graph test_learning_curve_30day

# Test similarity search
cargo test -p vapora-knowledge-graph test_similarity_search

# Test causal graph traversal
cargo test -p vapora-knowledge-graph test_causal_chain

# Test retention policy (30-day window)
cargo test -p vapora-knowledge-graph test_retention_policy

# Integration test: full KG workflow
cargo test -p vapora-knowledge-graph test_full_kg_lifecycle

# Query performance test
cargo bench -p vapora-knowledge-graph bench_temporal_queries

Expected Output:

  • Learning curves computed correctly
  • Similarity search finds relevant past executions
  • Causal chains traceable
  • Retention policy removes old records
  • Temporal queries perform well (<100ms)

Consequences

Data Management

  • Storage grows ~1MB per 1000 executions (depends on detail level)
  • Retention policy: 30 days (users), 90 days (enterprise)
  • Archival strategy for historical analysis
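The figures above allow a back-of-the-envelope storage estimate and a simple pruning rule. This sketch takes the rough rate of 1 MB per 1000 executions at face value; the `Tier` enum, its variant names, and both functions are hypothetical and only illustrate the arithmetic.

```rust
/// Retention tiers from this ADR. The variant names are assumptions:
/// `Standard` stands for the 30-day "users" policy.
#[derive(Debug, Clone, Copy)]
pub enum Tier {
    Standard,
    Enterprise,
}

impl Tier {
    /// Retention window in days for the tier.
    pub fn retention_days(self) -> u32 {
        match self {
            Tier::Standard => 30,
            Tier::Enterprise => 90,
        }
    }
}

/// Steady-state storage in MB once the retention window is full,
/// assuming roughly 1 MB per 1000 executions.
pub fn estimated_storage_mb(executions_per_day: u64, tier: Tier) -> f64 {
    let retained = executions_per_day * u64::from(tier.retention_days());
    retained as f64 / 1000.0
}

/// Drops every daily window older than `cutoff_day` and returns how many
/// were removed. Both sides use YYYY-MM-DD, so string order is date order.
pub fn prune_expired(daily_windows: &mut Vec<String>, cutoff_day: &str) -> usize {
    let before = daily_windows.len();
    daily_windows.retain(|day| day.as_str() >= cutoff_day);
    before - daily_windows.len()
}
```

At 10,000 executions per day this gives about 300 MB for the 30-day tier and 900 MB for the 90-day tier, which makes the cost of longer retention concrete.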

Agent Learning

  • Agents access KG to find similar past solutions
  • Learning curves inform agent selection (see ADR-014)
  • Improvement trends visible for monitoring

Observability

  • Full audit trail of agent decisions
  • Trending analysis for capacity planning
  • Incident investigation via causal chains

Scalability

  • Graph queries optimized with indexes
  • Temporal queries use daily windows (efficient partition)
  • Similarity search scales to millions of records

References

  • /crates/vapora-knowledge-graph/src/learning.rs (implementation)
  • /crates/vapora-knowledge-graph/src/persistence.rs (persistence layer)
  • ADR-004 (SurrealDB)
  • ADR-014 (Learning Profiles)
  • ADR-019 (Temporal Execution History)

Related ADRs: ADR-004 (SurrealDB), ADR-014 (Learning Profiles), ADR-019 (Temporal Execution History)