Vapora/docs/integrations/doc-lifecycle-integration.md
Jesús Pérez d14150da75 feat: Phase 5.3 - Multi-Agent Learning Infrastructure
2026-01-11 13:03:53 +00:00


📚 doc-lifecycle-manager Integration

Dual-Mode: Agent Plugin + Standalone System

Version: 0.1.0
Status: Specification (VAPORA v1.0 Integration)
Purpose: Integration of doc-lifecycle-manager as both a VAPORA component and a standalone tool


🎯 Objective

doc-lifecycle-manager works in two ways:

  1. As a VAPORA agent: the Documenter role uses doc-lifecycle internally
  2. As a standalone system: projects without VAPORA use doc-lifecycle on its own

This allows gradual adoption: start with doc-lifecycle alone, then migrate to VAPORA later.


🔄 Dual-Mode Architecture

Mode 1: Standalone (Without VAPORA)

proyecto-simple/
├── docs/
│   ├── architecture/
│   ├── guides/
│   └── adr/
├── .doc-lifecycle-manager/
│   ├── config.toml
│   ├── templates/
│   └── metadata/
└── .github/workflows/
    └── docs-update.yaml  # Triggered on push

Usage:

# Manual
doc-lifecycle-manager classify docs/
doc-lifecycle-manager consolidate docs/
doc-lifecycle-manager rag-index

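# Via CI/CD (.github/workflows/docs-update.yaml)
# Assumes doc-lifecycle-manager is already installed on the runner; the job name "docs" is illustrative.
on: [push]
jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: doc-lifecycle-manager sync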

Capabilities:

  • Classify docs by type
  • Consolidate duplicates
  • Manage lifecycle (draft → published → archived)
  • Generate RAG index
  • Build presentations (mdBook, Slidev)

Mode 2: As VAPORA Agent (With VAPORA)

proyecto-vapora/
├── .vapora/
│   ├── agents/
│   │   └── documenter/
│   │       ├── config.toml
│   │       └── plugins/
│   │           └── doc-lifecycle-manager/  # Embedded
│   └── ...
├── docs/
└── .coder/

Architecture:

Documenter Agent (Role)
  │
  ├─ Root Files Keeper
  │  ├─ README.md
  │  ├─ CHANGELOG.md
  │  ├─ ROADMAP.md
  │  └─ (auto-generated)
  │
  └─ doc-lifecycle-manager Plugin
     ├─ Classify documents
     ├─ Consolidate duplicates
     ├─ Manage ADRs (from sessions)
     ├─ Generate presentations
     └─ Build RAG index

Workflow:

Task completed
  ↓
Orchestrator publishes: "task_completed" event
  ↓
Documenter Agent subscribes to: vapora.tasks.completed
  ↓
Documenter loads config:
  ├─ Root Files Keeper (built-in)
  └─ doc-lifecycle-manager plugin
  ↓
Executes (in order):
  1. Extract decisions from sessions → doc-lifecycle ADR classification
  2. Update root files (README, CHANGELOG, ROADMAP)
  3. Classify all docs in docs/
  4. Consolidate duplicates
  5. Generate RAG index
  6. (Optional) Build mdBook + Slidev presentations
  ↓
Publishes: "docs_updated" event
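
The event flow above can be sketched with standard-library channels standing in for the message bus. This is a minimal illustration, not VAPORA's implementation: the `Event` enum, `documentation_steps`, and `run_documenter` are hypothetical names, and the steps are only recorded rather than executed.

```rust
use std::sync::mpsc::{Receiver, Sender};

/// Events exchanged between the orchestrator and the Documenter agent.
/// Variant and field names are illustrative, not VAPORA's actual types.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    TaskCompleted { task_id: String },
    DocsUpdated { task_id: String, steps_run: Vec<&'static str> },
}

/// Steps in the order listed in the workflow; the last one is optional.
pub fn documentation_steps(build_presentations: bool) -> Vec<&'static str> {
    let mut steps = vec![
        "extract_decisions",
        "update_root_files",
        "classify_docs",
        "consolidate_duplicates",
        "generate_rag_index",
    ];
    if build_presentations {
        steps.push("build_presentations");
    }
    steps
}

/// Drains `inbox` until every sender is dropped, answering each
/// `TaskCompleted` with a `DocsUpdated` on `outbox`.
pub fn run_documenter(inbox: Receiver<Event>, outbox: Sender<Event>, build_presentations: bool) {
    for event in inbox {
        if let Event::TaskCompleted { task_id } = event {
            // A real agent would perform each step here; the sketch only records them.
            let steps_run = documentation_steps(build_presentations);
            // Ignore the send error if nobody is listening for "docs_updated".
            let _ = outbox.send(Event::DocsUpdated { task_id, steps_run });
        }
    }
}
```

In this sketch the channel pair plays the role of the `vapora.tasks.completed` subscription and the "docs_updated" publication; a real deployment would use whatever broker VAPORA is configured with.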

🔌 Plugin Interface

Documenter Agent Loads doc-lifecycle-manager

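pub struct DocumenterAgent {
    pub root_files_keeper: RootFilesKeeper,
    pub doc_lifecycle: DocLifecycleManager,  // Plugin
    // Read by execute_task below; the type name DocumenterConfig is assumed.
    pub config: DocumenterConfig,
}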

impl DocumenterAgent {
    pub async fn execute_task(
        &mut self,
        task: Task,
    ) -> anyhow::Result<()> {
        // 1. Update root files (always)
        self.root_files_keeper.sync_all(&task).await?;

        // 2. Use doc-lifecycle for deep doc management (if configured)
        if self.config.enable_doc_lifecycle {
            self.doc_lifecycle.classify_docs("docs/").await?;
            self.doc_lifecycle.consolidate_duplicates().await?;
            self.doc_lifecycle.manage_lifecycle().await?;

            // 3. Build presentations
            if self.config.generate_presentations {
                self.doc_lifecycle.generate_mdbook().await?;
                self.doc_lifecycle.generate_slidev().await?;
            }

            // 4. Build RAG index (for search)
            self.doc_lifecycle.build_rag_index().await?;
        }

        Ok(())
    }
}
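
One way to make the gating in `execute_task` testable is to put the plugin behind a trait and substitute a recording double. The sketch below is a simplified, synchronous stand-in under stated assumptions: `DocLifecycle`, `DocumenterConfig`, `run_plugin`, and `RecordingPlugin` are hypothetical names, and errors are plain `String`s rather than `anyhow::Error`.

```rust
/// Synchronous stand-in for the plugin surface used by `execute_task`.
/// The methods in the listing above are async and return `anyhow::Result<()>`.
pub trait DocLifecycle {
    fn classify_docs(&mut self, root: &str) -> Result<(), String>;
    fn consolidate_duplicates(&mut self) -> Result<(), String>;
    fn manage_lifecycle(&mut self) -> Result<(), String>;
    fn generate_mdbook(&mut self) -> Result<(), String>;
    fn generate_slidev(&mut self) -> Result<(), String>;
    fn build_rag_index(&mut self) -> Result<(), String>;
}

/// Flags read by `execute_task`; the struct name is hypothetical.
pub struct DocumenterConfig {
    pub enable_doc_lifecycle: bool,
    pub generate_presentations: bool,
}

/// Mirrors the gating in `execute_task`: nothing runs unless the plugin is enabled.
pub fn run_plugin(config: &DocumenterConfig, plugin: &mut dyn DocLifecycle) -> Result<(), String> {
    if !config.enable_doc_lifecycle {
        return Ok(());
    }
    plugin.classify_docs("docs/")?;
    plugin.consolidate_duplicates()?;
    plugin.manage_lifecycle()?;
    if config.generate_presentations {
        plugin.generate_mdbook()?;
        plugin.generate_slidev()?;
    }
    plugin.build_rag_index()
}

/// Test double that records which calls were made, in order.
#[derive(Default)]
pub struct RecordingPlugin {
    pub calls: Vec<String>,
}

impl RecordingPlugin {
    fn record(&mut self, call: &str) -> Result<(), String> {
        self.calls.push(call.to_string());
        Ok(())
    }
}

impl DocLifecycle for RecordingPlugin {
    fn classify_docs(&mut self, root: &str) -> Result<(), String> {
        self.record(&format!("classify:{root}"))
    }
    fn consolidate_duplicates(&mut self) -> Result<(), String> {
        self.record("consolidate")
    }
    fn manage_lifecycle(&mut self) -> Result<(), String> {
        self.record("lifecycle")
    }
    fn generate_mdbook(&mut self) -> Result<(), String> {
        self.record("mdbook")
    }
    fn generate_slidev(&mut self) -> Result<(), String> {
        self.record("slidev")
    }
    fn build_rag_index(&mut self) -> Result<(), String> {
        self.record("rag_index")
    }
}
```

With the double in place, a test can assert that disabling `enable_doc_lifecycle` produces no plugin calls and that presentations are built only when `generate_presentations` is set.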

🚀 Migration: Standalone → VAPORA

Step 1: Run Standalone

proyecto/
├── docs/
│   ├── architecture/
│   └── adr/
├── .doc-lifecycle-manager/
│   └── config.toml
└── .github/workflows/docs-update.yaml

# Usage: Manual or via CI/CD
doc-lifecycle-manager sync

Step 2: Install VAPORA

# Initialize VAPORA
vapora init

# VAPORA auto-detects existing .doc-lifecycle-manager/
# and integrates it into Documenter agent
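
The auto-detection step could be as simple as a file-existence check. The following is a sketch only; `detect_doc_lifecycle` is a hypothetical helper, and how `vapora init` actually locates the directory is not specified here.

```rust
use std::path::{Path, PathBuf};

/// Returns the standalone config path if `project_root` already contains
/// `.doc-lifecycle-manager/config.toml`, otherwise `None`.
pub fn detect_doc_lifecycle(project_root: &Path) -> Option<PathBuf> {
    let candidate = project_root
        .join(".doc-lifecycle-manager")
        .join("config.toml");
    if candidate.is_file() {
        Some(candidate)
    } else {
        None
    }
}
```

A `Some(path)` result would then feed the `doc_lifecycle_config` setting of the Documenter agent, so the standalone configuration is reused rather than duplicated.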

Step 3: Migrate Workflows

# Before (in CI/CD):
- run: doc-lifecycle-manager sync

# After (in VAPORA):
# - Documenter agent runs automatically post-task
# - CLI still available:
vapora doc-lifecycle classify
vapora doc-lifecycle consolidate
vapora doc-lifecycle rag-index

📋 Configuration

Standalone Config

# .doc-lifecycle-manager/config.toml

[lifecycle]
doc_root = "docs/"
adr_path = "docs/adr/"
archive_days = 180

[classification]
enabled = true
auto_consolidate_duplicates = true
detect_orphaned_docs = true

[rag]
enabled = true
chunk_size = 500
overlap = 50
index_path = ".doc-lifecycle-manager/index.json"

[presentations]
generate_mdbook = true
generate_slidev = true
mdbook_out = "book/"
slidev_out = "slides/"

[lifecycle_rules]
[[rule]]
path_pattern = "docs/guides/*"
lifecycle = "guide"
retention_days = 0  # Never delete

[[rule]]
path_pattern = "docs/experimental/*"
lifecycle = "experimental"
retention_days = 30
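
To illustrate how the `[[rule]]` entries might be evaluated, here is a minimal sketch. It rests on several assumptions that the config does not state: only the `dir/*` pattern form is handled, the first matching rule wins, and a document expires once its age strictly exceeds `retention_days`. The names `Rule`, `rule_for`, and `is_expired` are hypothetical.

```rust
/// One `[[rule]]` entry; field names follow the config keys above.
pub struct Rule {
    pub path_pattern: &'static str,
    pub lifecycle: &'static str,
    pub retention_days: u32,
}

/// Handles only the `dir/*` form used in the sample config: the path must sit
/// directly inside `dir/`. Real glob semantics may differ.
fn pattern_matches(pattern: &str, path: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => path
            .strip_prefix(prefix)
            .map_or(false, |rest| !rest.is_empty() && !rest.contains('/')),
        None => pattern == path,
    }
}

/// First matching rule wins (an assumption; precedence is not defined above).
pub fn rule_for<'a>(rules: &'a [Rule], path: &str) -> Option<&'a Rule> {
    rules.iter().find(|rule| pattern_matches(rule.path_pattern, path))
}

/// `retention_days = 0` means "never delete", as the config comment states.
pub fn is_expired(rule: &Rule, age_days: u32) -> bool {
    rule.retention_days != 0 && age_days > rule.retention_days
}
```

Under these assumptions a guide is never expired regardless of age, while a file under `docs/experimental/` expires on day 31.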

VAPORA Integration Config

# .vapora/.vapora.toml

[documenter]
# Embedded doc-lifecycle config
doc_lifecycle_enabled = true
doc_lifecycle_config = ".doc-lifecycle-manager/config.toml"  # Reuse

[root_files]
auto_update = true
generate_changelog_from_git = true
generate_roadmap_from_tasks = true

🎯 Commands (Both Modes)

Standalone Mode

# Classify documents
doc-lifecycle-manager classify docs/

# Consolidate duplicates
doc-lifecycle-manager consolidate

# Manage lifecycle
doc-lifecycle-manager lifecycle prune --older-than 180d

# Build RAG index
doc-lifecycle-manager rag-index --output index.json

# Generate presentations
doc-lifecycle-manager mdbook build
doc-lifecycle-manager slidev build

VAPORA Integration

# Via documenter agent (automatic post-task)
# Or manual:
vapora doc-lifecycle classify
vapora doc-lifecycle consolidate
vapora doc-lifecycle rag-index

# Root files (via Documenter)
vapora root-files sync

# Full documentation update
vapora document sync --all

📊 Lifecycle States (doc-lifecycle)

Draft
  ├─ In-progress documentation
  ├─ Not indexed
  └─ Not published

Published
  ├─ Ready for users
  ├─ Indexed for RAG
  ├─ Included in presentations
  └─ Linked in README

Updated
  ├─ Recently modified
  ├─ Re-indexed for RAG
  └─ Change log entry created

Archived
  ├─ Outdated
  ├─ Removed from presentations
  ├─ Indexed but marked deprecated
  └─ Can be recovered
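
The states above map naturally onto an enum. The sketch below encodes the listed properties; the type and method names are hypothetical. Two points are assumptions rather than statements from the list: that Updated documents remain in presentations, and that a recovered Archived document returns to Published.

```rust
/// Lifecycle states as listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LifecycleState {
    Draft,
    Published,
    Updated,
    Archived,
}

impl LifecycleState {
    /// Draft is the only state that is not indexed for RAG.
    pub fn indexed_for_rag(self) -> bool {
        !matches!(self, Self::Draft)
    }

    /// Archived documents stay indexed but are marked deprecated.
    pub fn marked_deprecated(self) -> bool {
        matches!(self, Self::Archived)
    }

    /// Published documents are in presentations; Updated is assumed to be too.
    pub fn in_presentations(self) -> bool {
        matches!(self, Self::Published | Self::Updated)
    }

    /// Allowed transitions. Archived -> Published models "can be recovered";
    /// the recovery target is an assumption.
    pub fn can_transition_to(self, next: Self) -> bool {
        use LifecycleState::*;
        matches!(
            (self, next),
            (Draft, Published)
                | (Published, Updated)
                | (Published, Archived)
                | (Updated, Archived)
                | (Archived, Published)
        )
    }
}
```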

🔐 RAG Integration

doc-lifecycle → RAG Index

{
  "doc_id": "ADR-015-batch-workflow",
  "title": "ADR-015: Batch Workflow System",
  "doc_type": "adr",
  "lifecycle_state": "published",
  "created_date": "2025-11-09",
  "last_updated": "2025-11-10",
  "vector_embedding": [0.1, 0.2, ...],  // 1536-dim
  "content_preview": "Decision: Use Rust for batch orchestrator...",
  "tags": ["orchestrator", "workflow", "architecture"],
  "source_session": "sess-2025-11-09-143022",
  "related_adr": ["ADR-010", "ADR-014"],
  "search_keywords": ["batch", "workflow", "orchestrator"]
}
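
Before embedding, documents are split according to the `[rag]` settings (`chunk_size = 500`, `overlap = 50`). A minimal sliding-window sketch follows. It assumes both values are measured in characters, which the config does not specify (they could equally be tokens), and `chunk_text` is a hypothetical name.

```rust
/// Splits `text` into windows of at most `chunk_size` characters, each
/// starting `chunk_size - overlap` characters after the previous one.
/// Units are assumed to be characters; the config does not say.
pub fn chunk_text(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(
        chunk_size > 0 && overlap < chunk_size,
        "overlap must be smaller than chunk_size"
    );
    // Collect chars first so windows never split a multi-byte character.
    let chars: Vec<char> = text.chars().collect();
    let step = chunk_size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + chunk_size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break; // The final window reached the end of the text.
        }
        start += step;
    }
    chunks
}
```

With the sample settings, consecutive chunks share 50 characters, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.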

Search:

# Search documentation
vapora search "batch workflow architecture"

# Results from doc-lifecycle RAG index:
# 1. ADR-015-batch-workflow.md (0.94 relevance)
# 2. batch-workflow-guide.md (0.87)
# 3. orchestrator-design.md (0.71)
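
Relevance scores like those above are commonly obtained by comparing the query embedding with each document's `vector_embedding`. The sketch below uses cosine similarity, which is an assumption: this specification does not say how relevance is computed. `cosine_similarity` and `rank` are hypothetical helpers.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    let (norm_a, norm_b) = (norm(a), norm(b));
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Returns `(doc_id, score)` pairs sorted by descending score.
pub fn rank<'a>(query: &[f32], docs: &'a [(&'a str, Vec<f32>)]) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = docs
        .iter()
        .map(|(id, embedding)| (*id, cosine_similarity(query, embedding)))
        .collect();
    // total_cmp gives a total order on f32, so NaN cannot break the sort.
    scored.sort_by(|left, right| right.1.total_cmp(&left.1));
    scored
}
```

A production index would typically use an approximate nearest-neighbour structure instead of scoring every document, but the ordering principle is the same.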

🎯 Implementation Checklist

Standalone Components

  • Document classifier (by type, domain, lifecycle)
  • Duplicate detector & consolidator
  • Lifecycle state management (Draft→Published→Archived)
  • RAG index builder (chunking, embeddings)
  • mdBook generator
  • Slidev generator
  • CLI interface

VAPORA Integration

  • Documenter agent loads doc-lifecycle-manager
  • Plugin interface (DocLifecycleManager trait)
  • Event subscriptions (vapora.tasks.completed)
  • Config reuse (.doc-lifecycle-manager/ detected)
  • Seamless startup (no additional config)

Migration Tools

  • Detect existing .doc-lifecycle-manager/
  • Auto-configure Documenter agent
  • Preserve existing RAG indexes
  • No data loss during migration

📊 Success Metrics

  • Standalone doc-lifecycle works independently
  • VAPORA auto-detects and loads doc-lifecycle
  • Documenter agent uses both Root Files + doc-lifecycle
  • Migration takes < 5 minutes
  • No duplicate work (each tool owns its domain)
  • RAG indexing automatic and current


Version: 0.1.0
Status: Integration Specification Complete
Purpose: Seamless doc-lifecycle-manager dual-mode integration with VAPORA