Vapora/docs/integrations/doc-lifecycle-integration.md
Jesús Pérez d14150da75 feat: Phase 5.3 - Multi-Agent Learning Infrastructure
Implement intelligent agent learning from Knowledge Graph execution history
with per-task-type expertise tracking, recency bias, and learning curves.

## Phase 5.3 Implementation

### Learning Infrastructure (Complete)
- LearningProfileService with per-task-type expertise metrics
- TaskTypeExpertise model tracking success_rate, confidence, learning curves
- Recency bias weighting: recent 7 days weighted 3x higher (exponential decay)
- Confidence scoring prevents overfitting: min(1.0, executions / 20)
- Learning curves computed from daily execution windows

### Agent Scoring Service (Complete)
- Unified AgentScore combining SwarmCoordinator + learning profiles
- Scoring formula: 0.3*base + 0.5*expertise + 0.2*confidence
- Rank agents by combined score for intelligent assignment
- Support for recency-biased scoring (recent_success_rate)
- Methods: rank_agents, select_best, rank_agents_with_recency

### KG Integration (Complete)
- KGPersistence::get_executions_for_task_type() - query by agent + task type
- KGPersistence::get_agent_executions() - all executions for agent
- Coordinator::load_learning_profile_from_kg() - core KG→Learning integration
- Coordinator::load_all_learning_profiles() - batch load for multiple agents
- Convert PersistedExecution → ExecutionData for learning calculations

### Agent Assignment Integration (Complete)
- AgentCoordinator uses learning profiles for task assignment
- extract_task_type() infers task type from title/description
- assign_task() scores candidates using AgentScoringService
- Fallback to load-based selection if no learning data available
- Learning profiles stored in coordinator.learning_profiles RwLock

### Profile Adapter Enhancements (Complete)
- create_learning_profile() - initialize empty profiles
- add_task_type_expertise() - set task-type expertise
- update_profile_with_learning() - update swarm profiles from learning

## Files Modified

### vapora-knowledge-graph/src/persistence.rs (+30 lines)
- get_executions_for_task_type(agent_id, task_type, limit)
- get_agent_executions(agent_id, limit)

### vapora-agents/src/coordinator.rs (+100 lines)
- load_learning_profile_from_kg() - core KG integration method
- load_all_learning_profiles() - batch loading for agents
- assign_task() already uses learning-based scoring via AgentScoringService

### Existing Complete Implementation
- vapora-knowledge-graph/src/learning.rs - calculation functions
- vapora-agents/src/learning_profile.rs - data structures and expertise
- vapora-agents/src/scoring.rs - unified scoring service
- vapora-agents/src/profile_adapter.rs - adapter methods

## Tests Passing
- learning_profile: 7 tests 
- scoring: 5 tests 
- profile_adapter: 6 tests 
- coordinator: learning-specific tests 

## Data Flow
1. Task arrives → AgentCoordinator::assign_task()
2. Extract task_type from description
3. Query KG for task-type executions (load_learning_profile_from_kg)
4. Calculate expertise with recency bias
5. Score candidates (SwarmCoordinator + learning)
6. Assign to top-scored agent
7. Execution result → KG → Update learning profiles

## Key Design Decisions
- Recency bias: 7-day half-life with 3x weight for recent performance
- Confidence scoring: min(1.0, total_executions / 20) prevents overfitting
- Hierarchical scoring: 30% base load, 50% expertise, 20% confidence
- KG query limit: 100 recent executions per task-type for performance
- Async loading: load_learning_profile_from_kg supports concurrent loads

## Next: Phase 5.4 - Cost Optimization
Ready to implement budget enforcement and cost-aware provider selection.
2026-01-11 13:03:53 +00:00

# 📚 doc-lifecycle-manager Integration

## Dual-Mode: Agent Plugin + Standalone System

**Version**: 0.1.0
**Status**: Specification (VAPORA v1.0 Integration)
**Purpose**: Integration of doc-lifecycle-manager as both a VAPORA component and a standalone tool

---
## 🎯 Objective
**doc-lifecycle-manager** works in two ways:
1. **As a VAPORA agent**: the Documenter role uses doc-lifecycle internally
2. **As a standalone system**: projects without VAPORA use doc-lifecycle on its own

This allows gradual adoption: start with doc-lifecycle alone, then migrate to VAPORA later.

---
## 🔄 Dual-Mode Architecture
### Mode 1: Standalone (Without VAPORA)
```
proyecto-simple/
├── docs/
│   ├── architecture/
│   ├── guides/
│   └── adr/
├── .doc-lifecycle-manager/
│   ├── config.toml
│   ├── templates/
│   └── metadata/
└── .github/workflows/
    └── docs-update.yaml    # Triggered on push
```
**Usage**:
```bash
# Manual
doc-lifecycle-manager classify docs/
doc-lifecycle-manager consolidate docs/
doc-lifecycle-manager index --for-rag

# Via CI/CD, in .github/workflows/docs-update.yaml:
#   on: [push]
#   steps:
#     - run: doc-lifecycle-manager sync
```
**Capabilities**:
- Classify docs by type
- Consolidate duplicates
- Manage lifecycle (draft → published → archived)
- Generate RAG index
- Build presentations (mdBook, Slidev)
---
### Mode 2: As VAPORA Agent (With VAPORA)
```
proyecto-vapora/
├── .vapora/
│   ├── agents/
│   │   └── documenter/
│   │       ├── config.toml
│   │       └── plugins/
│   │           └── doc-lifecycle-manager/    # Embedded
│   └── ...
├── docs/
└── .coder/
```
**Architecture**:
```
Documenter Agent (Role)
├─ Root Files Keeper
│  ├─ README.md
│  ├─ CHANGELOG.md
│  ├─ ROADMAP.md
│  └─ (auto-generated)
└─ doc-lifecycle-manager Plugin
   ├─ Classify documents
   ├─ Consolidate duplicates
   ├─ Manage ADRs (from sessions)
   ├─ Generate presentations
   └─ Build RAG index
```
**Workflow**:
```
Task completed
  ↓
Orchestrator publishes: "task_completed" event
  ↓
Documenter Agent subscribes to: vapora.tasks.completed
  ↓
Documenter loads config:
  ├─ Root Files Keeper (built-in)
  └─ doc-lifecycle-manager plugin
  ↓
Executes (in order):
  1. Extract decisions from sessions → doc-lifecycle ADR classification
  2. Update root files (README, CHANGELOG, ROADMAP)
  3. Classify all docs in docs/
  4. Consolidate duplicates
  5. Generate RAG index
  6. (Optional) Build mdBook + Slidev presentations
  ↓
Publishes: "docs_updated" event
```
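The workflow above can be sketched as a small planning function: given an event subject, it returns the ordered steps the Documenter would run, or nothing if the subject is not handled. This is a minimal sketch under stated assumptions. `DocStep` and `plan_for_event` are hypothetical names that do not come from the VAPORA codebase, and only the subject `vapora.tasks.completed` and the step order are taken from the workflow itself.

```rust
/// Steps the Documenter runs after a task completes, in the order listed
/// in the workflow. (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DocStep {
    ExtractDecisions,
    UpdateRootFiles,
    ClassifyDocs,
    ConsolidateDuplicates,
    GenerateRagIndex,
    BuildPresentations,
}

/// Subject the Documenter subscribes to, as named in the workflow.
pub const TASK_COMPLETED_SUBJECT: &str = "vapora.tasks.completed";

/// Hypothetical helper: returns the ordered steps for an incoming event,
/// or `None` when the subject is not one the Documenter handles.
/// The presentation build is optional, so it is appended only on request.
pub fn plan_for_event(subject: &str, build_presentations: bool) -> Option<Vec<DocStep>> {
    if subject != TASK_COMPLETED_SUBJECT {
        return None;
    }
    let mut steps = vec![
        DocStep::ExtractDecisions,
        DocStep::UpdateRootFiles,
        DocStep::ClassifyDocs,
        DocStep::ConsolidateDuplicates,
        DocStep::GenerateRagIndex,
    ];
    if build_presentations {
        steps.push(DocStep::BuildPresentations);
    }
    Some(steps)
}
```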
---
## 🔌 Plugin Interface
### Documenter Agent Loads doc-lifecycle-manager
```rust
pub struct DocumenterAgent {
    pub root_files_keeper: RootFilesKeeper,
    pub doc_lifecycle: DocLifecycleManager, // Plugin
    // Assumed field: `execute_task` reads `self.config`, so the agent must
    // hold its configuration. `DocumenterConfig` is a placeholder type name.
    pub config: DocumenterConfig,
}

impl DocumenterAgent {
    pub async fn execute_task(
        &mut self,
        task: Task,
    ) -> anyhow::Result<()> {
        // 1. Update root files (always)
        self.root_files_keeper.sync_all(&task).await?;

        // 2. Use doc-lifecycle for deep doc management (if configured)
        if self.config.enable_doc_lifecycle {
            self.doc_lifecycle.classify_docs("docs/").await?;
            self.doc_lifecycle.consolidate_duplicates().await?;
            self.doc_lifecycle.manage_lifecycle().await?;

            // 3. Build presentations
            if self.config.generate_presentations {
                self.doc_lifecycle.generate_mdbook().await?;
                self.doc_lifecycle.generate_slidev().await?;
            }

            // 4. Build RAG index (for search)
            self.doc_lifecycle.build_rag_index().await?;
        }

        Ok(())
    }
}
```
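To make the plugin boundary testable without a real doc-lifecycle-manager, the operations can sit behind a trait with a recording test double. The sketch below is synchronous and uses `Result<(), String>` so that it needs only the standard library, whereas the agent code above is async and returns `anyhow::Result`. The names `DocLifecyclePlugin`, `RecordingPlugin`, and `run_plugin` are hypothetical, and the presentation steps are left out for brevity.

```rust
/// Hypothetical, synchronous view of the plugin surface the Documenter uses.
pub trait DocLifecyclePlugin {
    fn classify_docs(&mut self, root: &str) -> Result<(), String>;
    fn consolidate_duplicates(&mut self) -> Result<(), String>;
    fn manage_lifecycle(&mut self) -> Result<(), String>;
    fn build_rag_index(&mut self) -> Result<(), String>;
}

/// Test double that records which operations were invoked, in order.
#[derive(Default)]
pub struct RecordingPlugin {
    pub calls: Vec<String>,
}

impl DocLifecyclePlugin for RecordingPlugin {
    fn classify_docs(&mut self, root: &str) -> Result<(), String> {
        self.calls.push(format!("classify:{root}"));
        Ok(())
    }
    fn consolidate_duplicates(&mut self) -> Result<(), String> {
        self.calls.push("consolidate".to_string());
        Ok(())
    }
    fn manage_lifecycle(&mut self) -> Result<(), String> {
        self.calls.push("lifecycle".to_string());
        Ok(())
    }
    fn build_rag_index(&mut self) -> Result<(), String> {
        self.calls.push("rag-index".to_string());
        Ok(())
    }
}

/// Runs the plugin only when enabled, mirroring the `enable_doc_lifecycle`
/// guard in `execute_task`. Stops at the first failing step.
pub fn run_plugin<P: DocLifecyclePlugin>(plugin: &mut P, enabled: bool) -> Result<(), String> {
    if !enabled {
        return Ok(());
    }
    plugin.classify_docs("docs/")?;
    plugin.consolidate_duplicates()?;
    plugin.manage_lifecycle()?;
    plugin.build_rag_index()
}
```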
---
## 🚀 Migration: Standalone → VAPORA
### Step 1: Run Standalone
```bash
proyecto/
├── docs/
│   ├── architecture/
│   └── adr/
├── .doc-lifecycle-manager/
│   └── config.toml
└── .github/workflows/docs-update.yaml

# Usage: Manual or via CI/CD
doc-lifecycle-manager sync
```
### Step 2: Install VAPORA
```bash
# Initialize VAPORA
vapora init
# VAPORA auto-detects existing .doc-lifecycle-manager/
# and integrates it into Documenter agent
```
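The auto-detection mentioned here could be as simple as checking for the standalone config file under the project root. The following is a sketch only: `detect_standalone_config` is a hypothetical helper, and the source does not describe how `vapora init` actually performs detection. The directory and file names are the ones used in the layouts in this document.

```rust
use std::path::{Path, PathBuf};

/// Directory name taken from the standalone layout in this document.
const DOC_LIFECYCLE_DIR: &str = ".doc-lifecycle-manager";

/// Hypothetical detection helper: returns the path to an existing
/// standalone config file, or `None` if the project does not have one.
pub fn detect_standalone_config(project_root: &Path) -> Option<PathBuf> {
    let candidate = project_root.join(DOC_LIFECYCLE_DIR).join("config.toml");
    if candidate.is_file() {
        Some(candidate)
    } else {
        None
    }
}
```

A caller would pass the repository root and, on `Some(path)`, point the Documenter at that file instead of generating a new configuration.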
### Step 3: Migrate Workflows
```bash
# Before (in CI/CD):
#   - run: doc-lifecycle-manager sync

# After (in VAPORA):
# - Documenter agent runs automatically post-task
# - CLI still available:
vapora doc-lifecycle classify
vapora doc-lifecycle consolidate
vapora doc-lifecycle rag-index
```
---
## 📋 Configuration
### Standalone Config
```toml
# .doc-lifecycle-manager/config.toml
[lifecycle]
doc_root = "docs/"
adr_path = "docs/adr/"
archive_days = 180
[classification]
enabled = true
auto_consolidate_duplicates = true
detect_orphaned_docs = true
[rag]
enabled = true
chunk_size = 500
overlap = 50
index_path = ".doc-lifecycle-manager/index.json"
[presentations]
generate_mdbook = true
generate_slidev = true
mdbook_out = "book/"
slidev_out = "slides/"
[lifecycle_rules]
[[lifecycle_rules.rule]]
path_pattern = "docs/guides/*"
lifecycle = "guide"
retention_days = 0 # Never delete
[[lifecycle_rules.rule]]
path_pattern = "docs/experimental/*"
lifecycle = "experimental"
retention_days = 30
```
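The `[rag]` settings `chunk_size` and `overlap` suggest a sliding-window chunker. The sketch below shows one way such a window could work. It assumes both values are counted in characters, which the config does not state (they could equally be tokens or words), and `chunk_text` is a hypothetical function name.

```rust
/// Splits `text` into windows of at most `chunk_size` characters. Each window
/// starts `chunk_size - overlap` characters after the previous one, so
/// consecutive chunks share `overlap` characters.
/// Assumption: sizes are counted in characters; the config does not say.
pub fn chunk_text(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(
        chunk_size > 0 && overlap < chunk_size,
        "overlap must be smaller than chunk_size"
    );
    // Collect chars first so slicing never splits a multi-byte character.
    let chars: Vec<char> = text.chars().collect();
    let step = chunk_size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + chunk_size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break; // The last window reached the end of the text.
        }
        start += step;
    }
    chunks
}
```

With the configured values, `chunk_text(&doc, 500, 50)` advances 450 characters per chunk, so a 1000-character document yields three chunks.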
### VAPORA Integration Config
```toml
# .vapora/.vapora.toml
[documenter]
# Embedded doc-lifecycle config
doc_lifecycle_enabled = true
doc_lifecycle_config = ".doc-lifecycle-manager/config.toml" # Reuse
[root_files]
auto_update = true
generate_changelog_from_git = true
generate_roadmap_from_tasks = true
```
---
## 🎯 Commands (Both Modes)
### Standalone Mode
```bash
# Classify documents
doc-lifecycle-manager classify docs/
# Consolidate duplicates
doc-lifecycle-manager consolidate
# Manage lifecycle
doc-lifecycle-manager lifecycle prune --older-than 180d
# Build RAG index
doc-lifecycle-manager rag-index --output index.json
# Generate presentations
doc-lifecycle-manager mdbook build
doc-lifecycle-manager slidev build
```
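The `--older-than 180d` flag and the `retention_days` rule setting both reduce to an age comparison. The sketch below illustrates that check. `parse_days` and `should_prune` are hypothetical helpers, only the `d` suffix shown above is handled, and the source does not say whether pruning archives or deletes a document.

```rust
/// Parses a duration such as "180d" into a number of days.
/// Assumption: only the `d` (days) suffix shown in the command is supported.
pub fn parse_days(spec: &str) -> Option<u64> {
    spec.strip_suffix('d')?.parse().ok()
}

/// Returns true when a document of the given age exceeds its retention.
/// A retention of 0 is treated as "never delete", following the comment
/// on `retention_days = 0` in the standalone config.
pub fn should_prune(age_days: u64, retention_days: u64) -> bool {
    retention_days != 0 && age_days > retention_days
}
```

For example, `parse_days("180d").map(|limit| should_prune(200, limit))` evaluates to `Some(true)`, while any age checked against a retention of 0 is kept.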
### VAPORA Integration
```bash
# Via documenter agent (automatic post-task)
# Or manual:
vapora doc-lifecycle classify
vapora doc-lifecycle consolidate
vapora doc-lifecycle rag-index
# Root files (via Documenter)
vapora root-files sync
# Full documentation update
vapora document sync --all
```
---
## 📊 Lifecycle States (doc-lifecycle)
```
Draft
  ├─ In-progress documentation
  ├─ Not indexed
  └─ Not published

Published
  ├─ Ready for users
  ├─ Indexed for RAG
  ├─ Included in presentations
  └─ Linked in README

Updated
  ├─ Recently modified
  ├─ Re-indexed for RAG
  └─ Change log entry created

Archived
  ├─ Outdated
  ├─ Removed from presentations
  ├─ Indexed but marked deprecated
  └─ Can be recovered
```
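These states map naturally onto an enum with a few predicate methods. The sketch below encodes the indexing and deprecation properties listed above. The transition table in `can_transition_to` is an assumption inferred from "Can be recovered" and the draft → published → archived flow, since the source does not define transitions explicitly, and all names are hypothetical.

```rust
/// The four lifecycle states described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LifecycleState {
    Draft,
    Published,
    Updated,
    Archived,
}

impl LifecycleState {
    /// Drafts are not indexed; every other state appears in the RAG index.
    pub fn is_indexed(self) -> bool {
        !matches!(self, LifecycleState::Draft)
    }

    /// Archived documents stay indexed but are marked deprecated.
    pub fn is_deprecated(self) -> bool {
        matches!(self, LifecycleState::Archived)
    }

    /// Assumed transition table (not specified in the source):
    /// drafts get published, published or updated docs can be updated
    /// or archived, and archived docs can be recovered to published.
    pub fn can_transition_to(self, next: LifecycleState) -> bool {
        use LifecycleState::*;
        matches!(
            (self, next),
            (Draft, Published)
                | (Published, Updated)
                | (Published, Archived)
                | (Updated, Updated)
                | (Updated, Archived)
                | (Archived, Published)
        )
    }
}
```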
---
## 🔐 RAG Integration
### doc-lifecycle → RAG Index
```json
{
  "doc_id": "ADR-015-batch-workflow",
  "title": "ADR-015: Batch Workflow System",
  "doc_type": "adr",
  "lifecycle_state": "published",
  "created_date": "2025-11-09",
  "last_updated": "2025-11-10",
  "vector_embedding": [0.1, 0.2, ...], // 1536-dim
  "content_preview": "Decision: Use Rust for batch orchestrator...",
  "tags": ["orchestrator", "workflow", "architecture"],
  "source_session": "sess-2025-11-09-143022",
  "related_adr": ["ADR-010", "ADR-014"],
  "search_keywords": ["batch", "workflow", "orchestrator"]
}
```
### RAG Search (Via VAPORA Agent Search)
```bash
# Search documentation
vapora search "batch workflow architecture"
# Results from doc-lifecycle RAG index:
# 1. ADR-015-batch-workflow.md (0.94 relevance)
# 2. batch-workflow-guide.md (0.87)
# 3. orchestrator-design.md (0.71)
```
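The relevance scores in the sample output could come from comparing the query embedding with each document's `vector_embedding`. The sketch below ranks documents by cosine similarity. This is an assumption, since the source does not say which similarity measure or vector store VAPORA uses, and `cosine_similarity` and `rank` are hypothetical helpers.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 if either vector has zero magnitude.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Scores each `(doc_id, embedding)` pair against the query and returns
/// the results sorted by similarity, best match first.
pub fn rank<'a>(query: &[f32], docs: &'a [(&'a str, Vec<f32>)]) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = docs
        .iter()
        .map(|(id, embedding)| (*id, cosine_similarity(query, embedding)))
        .collect();
    // Descending order: compare right against left.
    scored.sort_by(|l, r| r.1.total_cmp(&l.1));
    scored
}
```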
---
## 🎯 Implementation Checklist
### Standalone Components
- [ ] Document classifier (by type, domain, lifecycle)
- [ ] Duplicate detector & consolidator
- [ ] Lifecycle state management (Draft→Published→Archived)
- [ ] RAG index builder (chunking, embeddings)
- [ ] mdBook generator
- [ ] Slidev generator
- [ ] CLI interface
### VAPORA Integration
- [ ] Documenter agent loads doc-lifecycle-manager
- [ ] Plugin interface (DocLifecycleManager trait)
- [ ] Event subscriptions (vapora.tasks.completed)
- [ ] Config reuse (.doc-lifecycle-manager/ detected)
- [ ] Seamless startup (no additional config)
### Migration Tools
- [ ] Detect existing .doc-lifecycle-manager/
- [ ] Auto-configure Documenter agent
- [ ] Preserve existing RAG indexes
- [ ] No data loss during migration
---
## 📊 Success Metrics
✅ Standalone doc-lifecycle works independently
✅ VAPORA auto-detects and loads doc-lifecycle
✅ Documenter agent uses both Root Files + doc-lifecycle
✅ Migration takes < 5 minutes
✅ No duplicate work (each tool owns its domain)
✅ RAG indexing automatic and current

---
**Version**: 0.1.0
**Status**: Integration Specification Complete
**Purpose**: Seamless doc-lifecycle-manager dual-mode integration with VAPORA