Vapora/docs/integrations/doc-lifecycle.md
Jesús Pérez d14150da75 feat: Phase 5.3 - Multi-Agent Learning Infrastructure
Implement intelligent agent learning from Knowledge Graph execution history
with per-task-type expertise tracking, recency bias, and learning curves.

## Phase 5.3 Implementation

### Learning Infrastructure ( Complete)
- LearningProfileService with per-task-type expertise metrics
- TaskTypeExpertise model tracking success_rate, confidence, learning curves
- Recency bias weighting: recent 7 days weighted 3x higher (exponential decay)
- Confidence scoring prevents overfitting: min(1.0, executions / 20)
- Learning curves computed from daily execution windows

### Agent Scoring Service ( Complete)
- Unified AgentScore combining SwarmCoordinator + learning profiles
- Scoring formula: 0.3*base + 0.5*expertise + 0.2*confidence
- Rank agents by combined score for intelligent assignment
- Support for recency-biased scoring (recent_success_rate)
- Methods: rank_agents, select_best, rank_agents_with_recency

### KG Integration ( Complete)
- KGPersistence::get_executions_for_task_type() - query by agent + task type
- KGPersistence::get_agent_executions() - all executions for agent
- Coordinator::load_learning_profile_from_kg() - core KG→Learning integration
- Coordinator::load_all_learning_profiles() - batch load for multiple agents
- Convert PersistedExecution → ExecutionData for learning calculations

### Agent Assignment Integration ( Complete)
- AgentCoordinator uses learning profiles for task assignment
- extract_task_type() infers task type from title/description
- assign_task() scores candidates using AgentScoringService
- Fallback to load-based selection if no learning data available
- Learning profiles stored in coordinator.learning_profiles RwLock

### Profile Adapter Enhancements ( Complete)
- create_learning_profile() - initialize empty profiles
- add_task_type_expertise() - set task-type expertise
- update_profile_with_learning() - update swarm profiles from learning

## Files Modified

### vapora-knowledge-graph/src/persistence.rs (+30 lines)
- get_executions_for_task_type(agent_id, task_type, limit)
- get_agent_executions(agent_id, limit)

### vapora-agents/src/coordinator.rs (+100 lines)
- load_learning_profile_from_kg() - core KG integration method
- load_all_learning_profiles() - batch loading for agents
- assign_task() already uses learning-based scoring via AgentScoringService

### Existing Complete Implementation
- vapora-knowledge-graph/src/learning.rs - calculation functions
- vapora-agents/src/learning_profile.rs - data structures and expertise
- vapora-agents/src/scoring.rs - unified scoring service
- vapora-agents/src/profile_adapter.rs - adapter methods

## Tests Passing
- learning_profile: 7 tests 
- scoring: 5 tests 
- profile_adapter: 6 tests 
- coordinator: learning-specific tests 

## Data Flow
1. Task arrives → AgentCoordinator::assign_task()
2. Extract task_type from description
3. Query KG for task-type executions (load_learning_profile_from_kg)
4. Calculate expertise with recency bias
5. Score candidates (SwarmCoordinator + learning)
6. Assign to top-scored agent
7. Execution result → KG → Update learning profiles

## Key Design Decisions
 Recency bias: 7-day half-life with 3x weight for recent performance
 Confidence scoring: min(1.0, total_executions / 20) prevents overfitting
 Hierarchical scoring: 30% base load, 50% expertise, 20% confidence
 KG query limit: 100 recent executions per task-type for performance
 Async loading: load_learning_profile_from_kg supports concurrent loads

## Next: Phase 5.4 - Cost Optimization
Ready to implement budget enforcement and cost-aware provider selection.
2026-01-11 13:03:53 +00:00

14 KiB

Doc-Lifecycle-Manager Integration Guide

Overview

doc-lifecycle-manager (external project) provides complete documentation lifecycle management for VAPORA, including classification, consolidation, semantic search, real-time updates, and enterprise security features.

Project Location: External project (doc-lifecycle-manager) Status: Enterprise-Ready Tests: 155/155 passing | Zero unsafe code


What is doc-lifecycle-manager?

A comprehensive Rust-based system that handles documentation throughout its entire lifecycle:

Core Capabilities (Phases 1-3)

  • Automatic Classification: Categorizes docs (vision, design, specs, ADRs, guides, testing, archive)
  • Duplicate Detection: Finds similar documents with TF-IDF analysis
  • Semantic RAG Indexing: Vector embeddings for semantic search
  • mdBook Generation: Auto-generates documentation websites

Enterprise Features (Phases 4-7)

  • GraphQL API: Semantic document queries with pagination
  • Real-Time Events: WebSocket streaming of doc updates
  • Distributed Tracing: OpenTelemetry with W3C Trace Context
  • Security: mTLS with automatic certificate rotation
  • Performance: Comprehensive benchmarking with percentiles
  • Persistence: SurrealDB backend (feature-gated)

Integration Architecture

Data Flow in VAPORA

Frontend/Agents
    ↓
┌─────────────────────────────────┐
│   VAPORA API Layer (Axum)       │
│   ├─ REST endpoints             │
│   └─ WebSocket gateway          │
└─────────────────────────────────┘
    ↓
┌─────────────────────────────────┐
│  doc-lifecycle-manager Services │
│                                 │
│  ├─ GraphQL Resolver            │
│  ├─ WebSocket Manager           │
│  ├─ Document Classifier         │
│  ├─ RAG Indexer                 │
│  └─ mTLS Auth Manager           │
└─────────────────────────────────┘
    ↓
┌─────────────────────────────────┐
│   Data Layer                    │
│   ├─ SurrealDB (vectors)        │
│   ├─ NATS JetStream (events)    │
│   └─ Redis (cache)              │
└─────────────────────────────────┘

Component Integration Points

1. Documenter Agent ↔ doc-lifecycle-manager

use vapora_doc_lifecycle::prelude::*;

// On task completion
async fn on_task_completed(task_id: &str) {
    let config = PluginConfig::default();
    let mut docs = DocumenterIntegration::new(config)?;
    docs.on_task_completed(task_id).await?;
}

2. Frontend ↔ GraphQL API

{
  documentSearch(query: {
    text_query: "authentication"
    limit: 10
  }) {
    results { id title relevance_score }
  }
}

3. Frontend ↔ WebSocket Events

const ws = new WebSocket("ws://vapora/doc-events");
ws.onmessage = (event) => {
  const { event_type, payload } = JSON.parse(event.data);
  // Update UI on document_indexed, document_updated, etc.
};

4. Agent-to-Agent ↔ NATS JetStream

Task Completed Event
  → Documenter Agent (NATS)
    → Classify + Index
      → Broadcast DocumentIndexed Event
        → All Agents notified

Feature Set by Phase

Phase 1: Foundation & Core Library

  • Error handling and configuration
  • Core abstractions and types

Phase 2: Extended Implementation

  • Document Classifier (7 types)
  • Consolidator (TF-IDF)
  • RAG Indexer (markdown-aware)
  • MDBook Generator

Phase 3: CLI & Automation

  • 4 command handlers
  • 62+ Just recipes
  • 5 NuShell scripts

Phase 4: VAPORA Deep Integration

  • NATS JetStream events
  • Vector store trait
  • Plugin system
  • Agent coordination

Phase 5: Production Hardening

  • Real NATS integration
  • DocServer RBAC (4 roles, 3 visibility levels)
  • Root Files Keeper (auto-update README, CHANGELOG)
  • Kubernetes manifests (7 YAML files)

Phase 6: Multi-Agent VAPORA

  • Agent registry with health checking
  • CI/CD pipeline (GitHub Actions)
  • Prometheus monitoring rules
  • Comprehensive documentation

Phase 7: Advanced Features

  • SurrealDB Backend: Persistent vector store
  • OpenTelemetry: W3C Trace Context support
  • GraphQL API: Query builder with semantic search
  • WebSocket Events: Real-time subscriptions
  • mTLS Auth: Certificate rotation
  • Benchmarking: P95/P99 metrics

How to Use in VAPORA

1. Basic Integration (Documenter Agent)

// In vapora-backend/documenter_agent.rs

use vapora_doc_lifecycle::prelude::*;

impl DocumenterAgent {
    async fn process_task(&self, task: Task) -> Result<()> {
        let config = PluginConfig::default();
        let mut integration = DocumenterIntegration::new(config)?;

        // Automatically classifies, indexes, and generates docs
        integration.on_task_completed(&task.id).await?;

        Ok(())
    }
}

2. GraphQL Queries (Frontend/Agents)

# Search for documentation
query SearchDocs($query: String!) {
  documentSearch(query: {
    text_query: $query
    limit: 10
    visibility: "Public"
  }) {
    results {
      id
      title
      path
      relevance_score
      preview
    }
    total_count
    has_more
  }
}

# Get specific document
query GetDoc($id: ID!) {
  document(id: $id) {
    id
    title
    content
    metadata {
      created_at
      updated_at
      owner_id
    }
  }
}

3. Real-Time Updates (Frontend)

// Connect to doc-lifecycle WebSocket
const docWs = new WebSocket('ws://vapora-api/doc-lifecycle/events');

// Subscribe to document changes
docWs.onopen = () => {
  docWs.send(JSON.stringify({
    type: 'subscribe',
    event_types: ['document_indexed', 'document_updated', 'search_index_rebuilt'],
    min_priority: 5
  }));
};

// Handle updates
docWs.onmessage = (event) => {
  const message = JSON.parse(event.data);

  if (message.event_type === 'document_indexed') {
    console.log('New doc indexed:', message.payload);
    // Refresh documentation view
  }
};

4. Distributed Tracing

All operations are automatically traced:

GET /api/documents?search=auth
  trace_id: 0af7651916cd43dd8448eb211c80319c
  span_id: b7ad6b7169203331

  ├─ graphql_resolver [15ms]
  │  ├─ rbac_check [2ms]
  │  └─ semantic_search [12ms]
  └─ response [1ms]

5. mTLS Security

Service-to-service communication is secured:

# Kubernetes secret for certs
apiVersion: v1
kind: Secret
metadata:
  name: doc-lifecycle-certs
data:
  server.crt: <base64>
  server.key: <base64>
  ca.crt: <base64>

Deployment in VAPORA

Kubernetes Manifests Provided

kubernetes/
├── namespace.yaml                    # Create doc-lifecycle namespace
├── configmap.yaml                    # Configuration
├── deployment.yaml                   # Main service (2 replicas)
├── statefulset-nats.yaml            # NATS JetStream (3 replicas)
├── statefulset-surreal.yaml         # SurrealDB (1 replica)
├── service.yaml                      # Internal services
├── rbac.yaml                         # RBAC configuration
└── prometheus-rules.yaml             # Monitoring rules

Quick Deploy

# Deploy to VAPORA cluster
kubectl apply -f /Tools/doc-lifecycle-manager/kubernetes/

# Verify
kubectl get pods -n doc-lifecycle
kubectl get svc -n doc-lifecycle

Configuration via ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  name: doc-lifecycle-config
  namespace: doc-lifecycle
data:
  config.json: |
    {
      "mode": "full",
      "classification": {
        "auto_classify": true,
        "confidence_threshold": 0.8
      },
      "rag": {
        "enable_embeddings": true,
        "max_chunk_size": 512
      },
      "nats": {
        "server": "nats://nats:4222",
        "jetstream_enabled": true
      },
      "otel": {
        "enabled": true,
        "jaeger_endpoint": "http://jaeger:14268"
      },
      "mtls": {
        "enabled": true,
        "rotation_days": 30
      }
    }

VAPORA Agent Integration

Documenter Agent

// Processes documentation tasks
pub struct DocumenterAgent {
    integration: DocumenterIntegration,
    nats: NatsEventHandler,
}

impl DocumenterAgent {
    pub async fn handle_task(&self, task: Task) -> Result<()> {
        // 1. Classify document
        self.integration.on_task_completed(&task.id).await?;

        // 2. Broadcast via NATS
        let event = DocsUpdatedEvent {
            task_id: task.id,
            doc_count: 5,
        };
        self.nats.publish_docs_updated(event).await?;

        Ok(())
    }
}
// Searches for relevant documentation
pub struct DeveloperAgent;

impl DeveloperAgent {
    pub async fn find_relevant_docs(&self, task: Task) -> Result<Vec<DocumentResult>> {
        // GraphQL query for semantic search
        let query = DocumentQuery {
            text_query: Some(task.description),
            limit: Some(5),
            visibility: Some("Internal".to_string()),
            ..Default::default()
        };

        // Execute search
        resolver.resolve_document_search(query, user).await
    }
}

CodeReviewer Agent (Uses Context)

// Uses documentation as context for reviews
pub struct CodeReviewerAgent;

impl CodeReviewerAgent {
    pub async fn review_with_context(&self, code: &str) -> Result<Review> {
        // Search for related documentation
        let docs = semantic_search(code_summary).await?;

        // Use docs as context in review
        let review = llm_client
            .review_code(code, &docs.to_context_string())
            .await?;

        Ok(review)
    }
}

Performance & Scaling

Expected Performance

Operation Latency Throughput
Classify doc <10ms 1000 docs/sec
GraphQL query <200ms 50 queries/sec
WebSocket broadcast <20ms 1000 events/sec
Semantic search <100ms 50 searches/sec
mTLS validation <5ms N/A

Resource Requirements

Deployment Resources:

  • CPU: 2-4 cores (main service)
  • Memory: 512MB-2GB
  • Storage: 50GB (SurrealDB + vectors)

NATS Requirements:

  • CPU: 1-2 cores
  • Memory: 256MB-1GB
  • Persistent volume: 20GB

Monitoring & Observability

Prometheus Metrics

# Error rate
rate(doc_lifecycle_errors_total[5m])

# Latency
histogram_quantile(0.99, doc_lifecycle_request_duration_seconds)

# Service availability
up{job="doc-lifecycle"}

Distributed Tracing

Traces are sent to Jaeger in W3C format:

Trace: 0af7651916cd43dd8448eb211c80319c
├─ Span: graphql_resolver
│  ├─ Span: rbac_check
│  └─ Span: semantic_search
└─ Span: response

Health Checks

# Liveness probe
curl http://doc-lifecycle:8080/health/live

# Readiness probe
curl http://doc-lifecycle:8080/health/ready

Configuration Reference

Environment Variables

# Core
DOC_LIFECYCLE_MODE=full                          # minimal|standard|full
DOC_LIFECYCLE_ENABLED=true

# Classification
CLASSIFIER_AUTO_CLASSIFY=true
CLASSIFIER_CONFIDENCE_THRESHOLD=0.8

# RAG/Search
RAG_ENABLE_EMBEDDINGS=true
RAG_MAX_CHUNK_SIZE=512
RAG_CHUNK_OVERLAP=50

# NATS
NATS_SERVER_URL=nats://nats:4222
NATS_JETSTREAM_ENABLED=true

# SurrealDB (optional)
SURREAL_DB_URL=ws://surrealdb:8000
SURREAL_NAMESPACE=vapora
SURREAL_DATABASE=documents

# OpenTelemetry
OTEL_ENABLED=true
OTEL_JAEGER_ENDPOINT=http://jaeger:14268
OTEL_SERVICE_NAME=vapora-doc-lifecycle

# mTLS
MTLS_ENABLED=true
MTLS_SERVER_CERT=/etc/vapora/certs/server.crt
MTLS_SERVER_KEY=/etc/vapora/certs/server.key
MTLS_CA_CERT=/etc/vapora/certs/ca.crt
MTLS_ROTATION_DAYS=30

Integration Checklist

Immediate (Ready Now)

  • Core features (Phases 1-3)
  • VAPORA integration (Phase 4)
  • Production hardening (Phase 5)
  • Multi-agent support (Phase 6)
  • Enterprise features (Phase 7)
  • Kubernetes deployment
  • GraphQL API
  • WebSocket events
  • Distributed tracing
  • mTLS security

Planned (Phase 8)

  • Jaeger exporter
  • SurrealDB live testing
  • Load testing
  • Performance tuning
  • Production deployment guide

Troubleshooting

Common Issues

1. NATS Connection Failed

# Check NATS service
kubectl get svc -n doc-lifecycle
kubectl logs -n doc-lifecycle deployment/nats

2. GraphQL Query Timeout

# Check semantic search performance
# Query execution should be < 200ms
# Check RAG index size

3. WebSocket Disconnection

# Verify WebSocket port is open
# Check subscription history size
# Monitor event broadcast latency

References

Documentation Files:

  • /Tools/doc-lifecycle-manager/PHASE_7_COMPLETION.md - Phase 7 details
  • /Tools/doc-lifecycle-manager/PHASES_COMPLETION.md - All phases overview
  • /Tools/doc-lifecycle-manager/INTEGRATION_WITH_VAPORA.md - Integration guide
  • /Tools/doc-lifecycle-manager/kubernetes/README.md - K8s deployment

Source Code:

  • crates/vapora-doc-lifecycle/src/lib.rs - Main library
  • crates/vapora-doc-lifecycle/src/graphql_api.rs - GraphQL resolver
  • crates/vapora-doc-lifecycle/src/websocket_events.rs - WebSocket manager
  • crates/vapora-doc-lifecycle/src/mtls_auth.rs - Security

Support

For questions or issues:

  1. Check documentation in /Tools/doc-lifecycle-manager/
  2. Review test cases for usage examples
  3. Check Kubernetes logs: kubectl logs -n doc-lifecycle <pod>
  4. Monitor with Prometheus/Grafana

Status: Ready for Production Deployment Last Updated: 2025-11-10 Maintainer: VAPORA Team