feat: Phase 5.3 - Multi-Agent Learning Infrastructure (commit d14150da75, Jesús Pérez)

Implements intelligent agent learning from Knowledge Graph execution history
with per-task-type expertise tracking, recency bias, and learning curves.

## Phase 5.3 Implementation

### Learning Infrastructure (✅ Complete)
- LearningProfileService with per-task-type expertise metrics
- TaskTypeExpertise model tracking success_rate, confidence, learning curves
- Recency bias weighting: the last 7 days are weighted ~3x higher (exponential decay)
- Confidence scoring prevents overfitting: `min(1.0, executions / 20)` (both formulas are sketched after this list)
- Learning curves computed from daily execution windows
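A minimal sketch of these two formulas with hypothetical types; the production code lives in `vapora-knowledge-graph/src/learning.rs` and may differ in detail:
```rust
/// One historical execution, reduced to what the formulas need.
struct ExecutionData {
    success: bool,
    age_days: f64, // days since the execution finished
}

/// Recency-biased success rate. The exact decay constant is an assumption:
/// this sketch uses the 7-day half-life from the design notes, which makes
/// the most recent week weigh roughly 2-3x more than older history.
fn recent_success_rate(executions: &[ExecutionData]) -> f64 {
    let mut weighted_hits = 0.0;
    let mut total_weight = 0.0;
    for e in executions {
        let weight = 0.5_f64.powf(e.age_days / 7.0); // exponential decay
        total_weight += weight;
        if e.success {
            weighted_hits += weight;
        }
    }
    if total_weight == 0.0 {
        0.0
    } else {
        weighted_hits / total_weight
    }
}

/// Confidence grows with sample size and saturates at 20 executions,
/// so a lucky streak on three tasks cannot dominate the score.
fn confidence(total_executions: usize) -> f64 {
    (total_executions as f64 / 20.0).min(1.0)
}
```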

### Agent Scoring Service (✅ Complete)
- Unified AgentScore combining SwarmCoordinator + learning profiles
- Scoring formula: `0.3*base + 0.5*expertise + 0.2*confidence`
- Rank agents by combined score for intelligent assignment
- Support for recency-biased scoring (recent_success_rate)
- Methods: rank_agents, select_best, rank_agents_with_recency

### KG Integration (✅ Complete)
- KGPersistence::get_executions_for_task_type() - query by agent + task type
- KGPersistence::get_agent_executions() - all executions for agent
- Coordinator::load_learning_profile_from_kg() - core KG→Learning integration
- Coordinator::load_all_learning_profiles() - batch load for multiple agents
- Convert PersistedExecution → ExecutionData for learning calculations

### Agent Assignment Integration (✅ Complete)
- AgentCoordinator uses learning profiles for task assignment
- extract_task_type() infers the task type from the title/description (sketched after this list)
- assign_task() scores candidates using AgentScoringService
- Fallback to load-based selection if no learning data available
- Learning profiles stored in coordinator.learning_profiles RwLock
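A hedged sketch of what extract_task_type() might look like; the keyword table here is illustrative, not the actual heuristic in `vapora-agents/src/coordinator.rs`:
```rust
/// Infer a task type from free-form title/description text.
/// First keyword match wins; "feature" is the default bucket.
fn extract_task_type(title: &str, description: &str) -> String {
    let text = format!("{} {}", title, description).to_lowercase();
    let rules = [
        ("fix", "bugfix"),
        ("bug", "bugfix"),
        ("refactor", "tech-debt"),
        ("document", "documentation"),
        ("test", "testing"),
    ];
    for (keyword, task_type) in rules {
        if text.contains(keyword) {
            return task_type.to_string();
        }
    }
    "feature".to_string()
}
```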

### Profile Adapter Enhancements (✅ Complete)
- create_learning_profile() - initialize empty profiles
- add_task_type_expertise() - set task-type expertise
- update_profile_with_learning() - update swarm profiles from learning

## Files Modified

### vapora-knowledge-graph/src/persistence.rs (+30 lines)
- get_executions_for_task_type(agent_id, task_type, limit)
- get_agent_executions(agent_id, limit)

### vapora-agents/src/coordinator.rs (+100 lines)
- load_learning_profile_from_kg() - core KG integration method
- load_all_learning_profiles() - batch loading for agents
- assign_task() already uses learning-based scoring via AgentScoringService

### Existing Complete Implementation
- vapora-knowledge-graph/src/learning.rs - calculation functions
- vapora-agents/src/learning_profile.rs - data structures and expertise
- vapora-agents/src/scoring.rs - unified scoring service
- vapora-agents/src/profile_adapter.rs - adapter methods

## Tests Passing
- learning_profile: 7 tests ✅
- scoring: 5 tests ✅
- profile_adapter: 6 tests ✅
- coordinator: learning-specific tests ✅

## Data Flow
1. Task arrives → AgentCoordinator::assign_task()
2. Extract task_type from description
3. Query KG for task-type executions (load_learning_profile_from_kg)
4. Calculate expertise with recency bias
5. Score candidates (SwarmCoordinator + learning)
6. Assign to top-scored agent
7. Execution result → KG → Update learning profiles

## Key Design Decisions
- ✅ Recency bias: 7-day half-life with ~3x weight for recent performance
- ✅ Confidence scoring: `min(1.0, total_executions / 20)` prevents overfitting
- ✅ Hierarchical scoring: 30% base load, 50% expertise, 20% confidence
- ✅ KG query limit: 100 recent executions per task-type for performance
- ✅ Async loading: load_learning_profile_from_kg supports concurrent loads

## Next: Phase 5.4 - Cost Optimization
Ready to implement budget enforcement and cost-aware provider selection.

# Task, Agent & Documentation Manager
## Multi-Agent Task Orchestration & Documentation Sync
**Status**: Production Ready (v1.2.0)
**Date**: January 2026
---
## 🎯 Overview
A system that:
1. **Manages tasks** in multi-agent workflow
2. **Assigns agents** automatically based on expertise
3. **Coordinates execution** in parallel with approval gates
4. **Extracts decisions** as Architecture Decision Records (ADRs)
5. **Maintains documentation** automatically synchronized
---
## 📋 Task Structure
### Task Metadata
Tasks are stored in SurrealDB with the following structure (shown here as TOML):
```toml
[task]
id = "task-089"
type = "feature" # feature | bugfix | enhancement | tech-debt
title = "Implement learning profiles"
description = "Agent expertise tracking with recency bias"
[status]
state = "in-progress" # todo | in-progress | review | done | archived
progress = 60 # 0-100%
created_at = "2026-01-11T10:15:30Z"
updated_at = "2026-01-11T14:30:22Z"
[assignment]
priority = "high" # high | medium | low
assigned_agent = "developer" # Or null if unassigned
assigned_team = "infrastructure"
[estimation]
estimated_hours = 8
actual_hours = null # Updated when complete
[context]
related_tasks = ["task-087", "task-088"]
blocking_tasks = []
blocked_by = []
```
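For illustration, a Rust mirror of this metadata as it might be deserialized with serde; struct and field names here are assumptions for the sketch, not the actual vapora types:
```rust
use serde::Deserialize;

#[derive(Deserialize)]
struct TaskRecord {
    task: TaskMeta,
    status: Status,
    assignment: Assignment,
    // [estimation] and [context] omitted; serde ignores unknown sections.
}

#[derive(Deserialize)]
struct TaskMeta {
    id: String,
    r#type: String, // feature | bugfix | enhancement | tech-debt
    title: String,
    description: String,
}

#[derive(Deserialize)]
struct Status {
    state: String, // todo | in-progress | review | done | archived
    progress: u8,  // 0-100%
}

#[derive(Deserialize)]
struct Assignment {
    priority: String,              // high | medium | low
    assigned_agent: Option<String>, // None if unassigned
    assigned_team: String,
}
```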
### Task Lifecycle
```
┌──────────┐     ┌──────────────┐     ┌──────────┐     ┌──────────┐
│   TODO   │────▶│ IN-PROGRESS  │────▶│  REVIEW  │────▶│   DONE   │
└──────────┘     └──────────────┘     └──────────┘     └──────────┘
      △                                                      │
      │                                                      │
      └───────────────────── ARCHIVED ◀──────────────────────┘
```
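A minimal sketch of the lifecycle as a Rust state machine, assuming the rejected-review and reopen-from-archive transitions implied by the diagrams in this document; the persisted state is a plain string (see the SurrealDB schema later in this doc):
```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum TaskState {
    Todo,
    InProgress,
    Review,
    Done,
    Archived,
}

impl TaskState {
    /// Forward transitions, plus Review bouncing back to InProgress
    /// (rejected review) and Archived reopening as Todo.
    fn can_transition_to(self, next: TaskState) -> bool {
        use TaskState::*;
        matches!(
            (self, next),
            (Todo, InProgress)
                | (InProgress, Review)
                | (Review, Done)
                | (Review, InProgress)
                | (Done, Archived)
                | (Archived, Todo)
        )
    }
}
```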
---
## 🤖 Agent Assignment
### Automatic Selection
When a task is created, SwarmCoordinator assigns the best agent:
1. **Capability Matching**: Filter agents by role matching task type
2. **Learning Profile Lookup**: Get expertise scores for task-type
3. **Load Balancing**: Check current agent load (tasks in progress)
4. **Scoring**: `final_score = 0.3*load + 0.5*expertise + 0.2*confidence` (sketched after this list)
5. **Notification**: Agent receives job via NATS JetStream
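A sketch of steps 3-4 with hypothetical field names; the real implementation is AgentScoringService in `vapora-agents/src/scoring.rs`:
```rust
struct AgentScore {
    agent_id: String,
    load_score: f64, // 1.0 = idle, 0.0 = fully loaded
    expertise: f64,  // per-task-type success rate, recency-biased
    confidence: f64, // min(1.0, executions / 20)
}

impl AgentScore {
    /// The 30/50/20 weighting from the design decisions above.
    fn final_score(&self) -> f64 {
        0.3 * self.load_score + 0.5 * self.expertise + 0.2 * self.confidence
    }
}

/// Rank candidates and pick the highest-scoring agent, if any.
fn select_best(mut candidates: Vec<AgentScore>) -> Option<AgentScore> {
    candidates.sort_by(|a, b| b.final_score().total_cmp(&a.final_score()));
    candidates.into_iter().next()
}
```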
### Agent Roles
| Role | Specialization | Primary Tasks |
|------|---|---|
| **Architect** | System design | Feature planning, ADRs, design reviews |
| **Developer** | Implementation | Code generation, refactoring, debugging |
| **Reviewer** | Quality assurance | Code review, test coverage, style checks |
| **Tester** | QA & Benchmarks | Test suite, performance benchmarks |
| **Documenter** | Documentation | Guides, API docs, README updates |
| **Marketer** | Marketing content | Blog posts, case studies, announcements |
| **Presenter** | Presentations | Slides, deck creation, demo scripts |
| **DevOps** | Infrastructure | CI/CD setup, deployment, monitoring |
| **Monitor** | Health & Alerting | System monitoring, alerts, incident response |
| **Security** | Compliance & Audit | Code security, access control, compliance |
| **ProjectManager** | Coordination | Roadmap, tracking, milestone management |
| **DecisionMaker** | Conflict Resolution | Tie-breaking, escalation, ADR creation |
---
## 🔄 Multi-Agent Workflow Execution
### Sequential Workflow (Phases)
```
Phase 1: Design
└─ Architect creates ADR
└─ Move to Phase 2 (auto on completion)
Phase 2: Development
└─ Developer implements
└─ (Parallel) Documenter writes guide
└─ Move to Phase 3
Phase 3: Review
└─ Reviewer checks code quality
└─ Security audits for compliance
└─ If approved: Move to Phase 4
└─ If rejected: Back to Phase 2
Phase 4: Testing
└─ Tester creates test suite
└─ Tester runs benchmarks
└─ If passing: Move to Phase 5
└─ If failing: Back to Phase 2
Phase 5: Completion
└─ DevOps deploys
└─ Monitor sets up alerts
└─ ProjectManager marks done
```
### Parallel Coordination
Multiple agents work simultaneously when independent:
```
Task: "Add learning profiles"
├─ Architect (ADR) ▶ Created in 2h
├─ Developer (Code) ▶ Implemented in 8h
│ ├─ Reviewer (Review) ▶ Reviewed in 1h (parallel)
│ └─ Documenter (Guide) ▶ Documented in 2h (parallel)
└─ Tester (Tests) ▶ Tests in 3h
└─ Security (Audit) ▶ Audited in 1h (parallel)
```
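An in-process illustration only: the real coordination runs through NATS JetStream, but `tokio::join!` shows the shape of the parallelism, where a phase completes only when all of its independent steps finish:
```rust
async fn review(task_id: &str) -> Result<(), String> {
    println!("reviewing {task_id}");
    Ok(())
}

async fn document(task_id: &str) -> Result<(), String> {
    println!("documenting {task_id}");
    Ok(())
}

/// Run the independent review and documentation steps concurrently.
async fn run_parallel_phase(task_id: &str) -> Result<(), String> {
    let (review_res, doc_res) = tokio::join!(review(task_id), document(task_id));
    review_res?;
    doc_res?;
    Ok(())
}
```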
### Approval Gates
Critical decision points require manual approval:
- **Security Gate**: Must approve if code touches auth/secrets
- **Breaking Changes**: Architect approval required
- **Production Deployment**: DevOps + ProjectManager approval
- **Major Refactoring**: Architect + Lead Developer approval
---
## 📝 Decision Extraction (ADRs)
Every design decision is automatically captured:
### ADR Template
```markdown
# ADR-042: Learning-Based Agent Selection
## Context
Previous agent assignment used simple load balancing (min tasks),
ignoring historical performance data. This led to poor agent-task matches.
## Decision
Implement per-task-type learning profiles with recency bias.
### Key Points
- Success rate weighted by recency (7-day window, 3× weight)
- Confidence scoring prevents small-sample overfitting
- Supports adaptive recovery from temporary degradation
## Consequences
**Positive**:
- 30-50% improvement in task success rate
- Agents improve continuously
**Negative**:
- Requires KG data collection (startup period)
- Learning period ~20 tasks per task-type
## Alternatives Considered
1. Rule-based routing (rejected: no learning)
2. Pure random assignment (rejected: no improvement)
3. Rolling average (rejected: no recency bias)
## Decision Made
Learning profiles with recency bias (over the alternatives above)
```
### ADR Extraction Process
1. **Automatic**: Each task completion generates an execution record
2. **Learning**: If a decision involved trade-offs, it is extracted as an ADR candidate
3. **Curation**: ProjectManager/Architect reviews and approves
4. **Archival**: Stored in docs/architecture/adr/ (numbered, immutable)
---
## 📚 Documentation Synchronization
### Automatic Updates
When tasks complete, documentation is auto-updated:
| Task Type | Auto-Updates |
|---|---|
| Feature | CHANGELOG.md, feature overview, API docs |
| Bugfix | CHANGELOG.md, troubleshooting guide |
| Tech-Debt | Architecture docs, refactoring guide |
| Enhancement | Feature docs, user guide |
| Documentation | Indexed in RAG, updated in search |
### Documentation Lifecycle
```
Task Created
     │
     ▼
Documentation Context Extracted
  ├─ Decision/ADR created
  ├─ Related docs identified
  └─ Change summary prepared
     │
     ▼
Task Execution
  ├─ Code generated
  ├─ Tests created
  └─ Examples documented
     │
     ▼
Task Complete
  ├─ ADR finalized
  ├─ Docs auto-generated
  ├─ CHANGELOG entry created
  └─ Search index updated (RAG)
     │
     ▼
Archival (if stale)
  └─ Moved to docs/archive/
     (kept for historical reference)
```
---
## 🔍 Search & Retrieval (RAG Integration)
### Document Indexing
All generated documentation is indexed for semantic search:
- **Architecture decisions** (ADRs)
- **Feature guides** (how-tos)
- **Code examples** (patterns)
- **Execution history** (knowledge graph)
### Query Examples
User asks: "How do I implement learning profiles?"
System searches:
1. ADRs mentioning "learning"
2. Implementation guides with "learning"
3. Execution history with similar task type
4. Code examples for "learning profiles"
Returns ranked results with sources.
---
## 📊 Metrics & Monitoring
### Task Metrics
- **Success Rate**: % of tasks completed successfully
- **Cycle Time**: Average time from todo → done
- **Agent Utilization**: Tasks per agent per role
- **Decision Quality**: ADRs implemented vs. abandoned
### Agent Metrics (per role)
- **Task Success Rate**: % tasks completed successfully
- **Learning Curve**: Expertise improvement over time
- **Cost per Task**: Average LLM spend per completed task
- **Task Coverage**: Breadth of task-types handled
### Documentation Metrics
- **Coverage**: % of features documented
- **Freshness**: Days since last update
- **Usage**: Search queries hitting each doc
- **Accuracy**: User feedback on doc correctness
---
## 🏗️ Implementation Details
### SurrealDB Schema
```sql
-- Tasks table
DEFINE TABLE tasks SCHEMAFULL;
DEFINE FIELD id ON tasks TYPE string;
DEFINE FIELD type ON tasks TYPE string;
DEFINE FIELD state ON tasks TYPE string;
DEFINE FIELD assigned_agent ON tasks TYPE option<string>;
-- Executions (for learning)
DEFINE TABLE executions SCHEMAFULL;
DEFINE FIELD task_id ON executions TYPE string;
DEFINE FIELD agent_id ON executions TYPE string;
DEFINE FIELD success ON executions TYPE bool;
DEFINE FIELD duration_ms ON executions TYPE number;
DEFINE FIELD cost_cents ON executions TYPE number;
-- ADRs table
DEFINE TABLE adrs SCHEMAFULL;
DEFINE FIELD id ON adrs TYPE string;
DEFINE FIELD task_id ON adrs TYPE string;
DEFINE FIELD title ON adrs TYPE string;
DEFINE FIELD status ON adrs TYPE string; -- draft|approved|archived
```
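A hedged sketch of how the learning queries might read the executions table with the surrealdb crate; the namespace, database, field types, and query text are assumptions, and the real queries also order by a recency timestamp not shown in the schema above:
```rust
use serde::Deserialize;
use surrealdb::engine::remote::ws::Ws;
use surrealdb::Surreal;

#[derive(Deserialize)]
struct ExecutionRow {
    agent_id: String,
    success: bool,
    duration_ms: f64, // TYPE number in the schema; f64 is an assumption
}

/// Fetch up to 100 recent executions for one agent (cf. get_agent_executions).
async fn agent_executions(agent_id: &str) -> surrealdb::Result<Vec<ExecutionRow>> {
    let db = Surreal::new::<Ws>("127.0.0.1:8000").await?;
    db.use_ns("vapora").use_db("kg").await?;
    let mut response = db
        .query("SELECT * FROM executions WHERE agent_id = $agent LIMIT 100")
        .bind(("agent", agent_id.to_string()))
        .await?;
    response.take(0)
}
```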
### NATS Topics
- `tasks.{type}.{priority}` — Task assignments (publish sketch below)
- `agents.{role}.ready` — Agent heartbeats
- `agents.{role}.complete` — Task completion
- `adrs.created` — New ADR events
- `docs.updated` — Documentation changes
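A sketch of publishing a task assignment on the first topic with the async-nats crate; the subject and payload shape are illustrative, and a JetStream stream covering `tasks.>` is assumed to already exist:
```rust
use async_nats::jetstream;

async fn publish_assignment() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let client = async_nats::connect("nats://127.0.0.1:4222").await?;
    let js = jetstream::new(client);

    // tasks.{type}.{priority}: a high-priority feature task.
    let ack = js
        .publish("tasks.feature.high", r#"{"task_id":"task-089"}"#.into())
        .await?  // accepted by the client
        .await?; // acknowledged by JetStream
    println!("persisted at stream sequence {}", ack.sequence);
    Ok(())
}
```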
---
## 🎯 Key Design Patterns
### 1. Event-Driven Coordination
- Task creation → Agent assignment (async via NATS)
- Task completion → Documentation update (eventual consistency)
- No direct API calls between services (loosely coupled)
### 2. Learning from Execution History
- Every task stores execution metadata (success, duration, cost)
- Learning profiles updated from execution data
- Better assignments improve continuously
### 3. Decision Extraction
- Design decisions captured as ADRs
- Immutable record of architectural rationale
- Serves as organizational memory
### 4. Graceful Degradation
- NATS offline: In-memory queue fallback (sketched after this list)
- Agent unavailable: Task re-assigned to next best
- Doc generation failed: Manual entry allowed
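A minimal sketch of the first fallback, with hypothetical names (the real logic lives in the coordinator): buffer jobs in memory while NATS is down, then drain once it returns:
```rust
use std::collections::VecDeque;

struct Dispatcher {
    nats_available: bool,
    local_queue: VecDeque<String>, // in-memory fallback buffer
}

impl Dispatcher {
    fn dispatch(&mut self, job: String) {
        if self.nats_available {
            // Normal path: publish to NATS JetStream (omitted here).
            println!("published {job} to NATS");
        } else {
            // Degraded path: buffer locally; a reconnect handler (not
            // shown) would drain this queue when NATS comes back.
            self.local_queue.push_back(job);
        }
    }
}
```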
---
## 📚 Related Documentation
- **[VAPORA Architecture](vapora-architecture.md)** — System overview
- **[Agent Registry & Coordination](agent-registry-coordination.md)** — Agent patterns
- **[Multi-Agent Workflows](multi-agent-workflows.md)** — Workflow execution
- **[Multi-IA Router](multi-ia-router.md)** — LLM provider selection
- **[Roles, Permissions & Profiles](roles-permissions-profiles.md)** — RBAC
---
**Status**: ✅ Production Ready
**Version**: 1.2.0
**Last Updated**: January 2026