Implement intelligent agent learning from Knowledge Graph execution history with per-task-type expertise tracking, recency bias, and learning curves.

## Phase 5.3 Implementation

### Learning Infrastructure (✅ Complete)

- LearningProfileService with per-task-type expertise metrics
- TaskTypeExpertise model tracking success_rate, confidence, learning curves
- Recency bias weighting: recent 7 days weighted 3x higher (exponential decay)
- Confidence scoring prevents overfitting: min(1.0, executions / 20)
- Learning curves computed from daily execution windows

### Agent Scoring Service (✅ Complete)

- Unified AgentScore combining SwarmCoordinator + learning profiles
- Scoring formula: 0.3*base + 0.5*expertise + 0.2*confidence
- Rank agents by combined score for intelligent assignment
- Support for recency-biased scoring (recent_success_rate)
- Methods: rank_agents, select_best, rank_agents_with_recency

### KG Integration (✅ Complete)

- KGPersistence::get_executions_for_task_type() - query by agent + task type
- KGPersistence::get_agent_executions() - all executions for agent
- Coordinator::load_learning_profile_from_kg() - core KG→Learning integration
- Coordinator::load_all_learning_profiles() - batch load for multiple agents
- Convert PersistedExecution → ExecutionData for learning calculations

### Agent Assignment Integration (✅ Complete)

- AgentCoordinator uses learning profiles for task assignment
- extract_task_type() infers task type from title/description
- assign_task() scores candidates using AgentScoringService
- Fallback to load-based selection if no learning data available
- Learning profiles stored in coordinator.learning_profiles RwLock

### Profile Adapter Enhancements (✅ Complete)

- create_learning_profile() - initialize empty profiles
- add_task_type_expertise() - set task-type expertise
- update_profile_with_learning() - update swarm profiles from learning

## Files Modified

### vapora-knowledge-graph/src/persistence.rs (+30 lines)

- get_executions_for_task_type(agent_id, task_type, limit)
- get_agent_executions(agent_id, limit)

### vapora-agents/src/coordinator.rs (+100 lines)

- load_learning_profile_from_kg() - core KG integration method
- load_all_learning_profiles() - batch loading for agents
- assign_task() already uses learning-based scoring via AgentScoringService

### Existing Complete Implementation

- vapora-knowledge-graph/src/learning.rs - calculation functions
- vapora-agents/src/learning_profile.rs - data structures and expertise
- vapora-agents/src/scoring.rs - unified scoring service
- vapora-agents/src/profile_adapter.rs - adapter methods

## Tests Passing

- learning_profile: 7 tests ✅
- scoring: 5 tests ✅
- profile_adapter: 6 tests ✅
- coordinator: learning-specific tests ✅

## Data Flow

1. Task arrives → AgentCoordinator::assign_task()
2. Extract task_type from description
3. Query KG for task-type executions (load_learning_profile_from_kg)
4. Calculate expertise with recency bias
5. Score candidates (SwarmCoordinator + learning)
6. Assign to top-scored agent
7. Execution result → KG → Update learning profiles

## Key Design Decisions

- ✅ Recency bias: 7-day half-life with 3x weight for recent performance
- ✅ Confidence scoring: min(1.0, total_executions / 20) prevents overfitting
- ✅ Hierarchical scoring: 30% base load, 50% expertise, 20% confidence
- ✅ KG query limit: 100 recent executions per task-type for performance
- ✅ Async loading: load_learning_profile_from_kg supports concurrent loads

## Next: Phase 5.4 - Cost Optimization

Ready to implement budget enforcement and cost-aware provider selection.
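
The recency-bias and confidence formulas from the design decisions above can be sketched in isolation. This is a simplified illustration, not the LearningProfileService implementation: it uses a flat 3x step weight for the recent 7-day window rather than the full exponential decay, and the function names are hypothetical.

```rust
/// Confidence grows linearly with sample size and saturates at 20
/// executions: min(1.0, executions / 20). This keeps a 2-for-2 agent
/// from outranking a 90-for-100 agent on success rate alone.
fn confidence(total_executions: u32) -> f64 {
    (total_executions as f64 / 20.0).min(1.0)
}

/// Recency-biased success rate: executions from the last 7 days count
/// 3x. `results` holds (succeeded, age_in_days) pairs.
fn recent_success_rate(results: &[(bool, u32)]) -> f64 {
    let (mut weighted_success, mut total_weight) = (0.0, 0.0);
    for &(success, age_days) in results {
        let weight = if age_days <= 7 { 3.0 } else { 1.0 };
        total_weight += weight;
        if success {
            weighted_success += weight;
        }
    }
    if total_weight == 0.0 { 0.0 } else { weighted_success / total_weight }
}

fn main() {
    assert_eq!(confidence(10), 0.5);
    assert_eq!(confidence(40), 1.0); // saturates at 1.0
    // One recent success (weight 3) vs. one old failure (weight 1) → 0.75
    assert_eq!(recent_success_rate(&[(true, 2), (false, 30)]), 0.75);
}
```

The step weight means an agent recovering from a bad week regains a high score quickly, which is the "adaptive recovery" property the profiles aim for.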
# Task, Agent & Documentation Manager

## Multi-Agent Task Orchestration & Documentation Sync

**Status**: Production Ready (v1.2.0)
**Date**: January 2026

---

## 🎯 Overview

A system that:

1. **Manages tasks** in a multi-agent workflow
2. **Assigns agents** automatically based on expertise
3. **Coordinates execution** in parallel with approval gates
4. **Extracts decisions** as Architecture Decision Records (ADRs)
5. **Maintains documentation**, keeping it automatically synchronized

---

## 📋 Task Structure

### Task Metadata

Tasks are stored in SurrealDB with the following structure:

```toml
[task]
id = "task-089"
type = "feature" # feature | bugfix | enhancement | tech-debt
title = "Implement learning profiles"
description = "Agent expertise tracking with recency bias"

[status]
state = "in-progress" # todo | in-progress | review | done | archived
progress = 60 # 0-100%
created_at = "2026-01-11T10:15:30Z"
updated_at = "2026-01-11T14:30:22Z"

[assignment]
priority = "high" # high | medium | low
assigned_agent = "developer" # key omitted while unassigned (TOML has no null)
assigned_team = "infrastructure"

[estimation]
estimated_hours = 8
# actual_hours is added when the task completes (TOML has no null)

[context]
related_tasks = ["task-087", "task-088"]
blocking_tasks = []
blocked_by = []
```
### Task Lifecycle

```
┌──────┐     ┌─────────────┐     ┌────────┐     ┌──────┐
│ TODO │────▶│ IN-PROGRESS │────▶│ REVIEW │────▶│ DONE │
└──────┘     └─────────────┘     └────────┘     └──────┘
    △                                               │
    │                                               │
    └────────────────── ARCHIVED ◀──────────────────┘
```
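
The lifecycle above can be encoded as a small state machine. A minimal sketch (hypothetical names; the real states are stored as strings in SurrealDB):

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum TaskState {
    Todo,
    InProgress,
    Review,
    Done,
    Archived,
}

impl TaskState {
    /// Returns true if `self → next` is one of the arrows in the diagram.
    fn can_transition(self, next: TaskState) -> bool {
        use TaskState::*;
        matches!(
            (self, next),
            (Todo, InProgress)
                | (InProgress, Review)
                | (Review, Done)
                | (Done, Archived)  // completed work is archived
                | (Archived, Todo)  // archived tasks may be reopened
        )
    }
}

fn main() {
    assert!(TaskState::Todo.can_transition(TaskState::InProgress));
    assert!(TaskState::Done.can_transition(TaskState::Archived));
    assert!(!TaskState::Todo.can_transition(TaskState::Done)); // no skipping review
}
```

Validating transitions centrally keeps agents from moving a task straight from TODO to DONE without passing the review gate.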
---

## 🤖 Agent Assignment

### Automatic Selection

When a task is created, the SwarmCoordinator assigns the best agent:

1. **Capability Matching**: Filter agents by role matching the task type
2. **Learning Profile Lookup**: Get expertise scores for the task type
3. **Load Balancing**: Check current agent load (tasks in progress)
4. **Scoring**: `final_score = 0.3*load + 0.5*expertise + 0.2*confidence`
5. **Notification**: Agent receives the job via NATS JetStream
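
Steps 3 and 4 can be sketched as a ranking over candidates. This is an illustration with a hypothetical struct, not the real AgentScore from vapora-agents/src/scoring.rs; here `load_score` is treated as availability (1.0 = idle), so higher is better across all three terms.

```rust
struct Candidate {
    id: &'static str,
    load_score: f64, // 1.0 = fully idle, 0.0 = saturated (assumed convention)
    expertise: f64,  // recency-weighted success rate for this task type
    confidence: f64, // min(1.0, executions / 20)
}

impl Candidate {
    /// The weighted sum from step 4 of the selection procedure.
    fn final_score(&self) -> f64 {
        0.3 * self.load_score + 0.5 * self.expertise + 0.2 * self.confidence
    }
}

/// Rank by combined score and pick the top candidate.
fn select_best(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates
        .iter()
        .max_by(|a, b| a.final_score().total_cmp(&b.final_score()))
}

fn main() {
    let candidates = [
        Candidate { id: "dev-1", load_score: 1.0, expertise: 0.2, confidence: 0.3 },
        Candidate { id: "dev-2", load_score: 0.5, expertise: 0.9, confidence: 1.0 },
    ];
    // dev-1: 0.30 + 0.10 + 0.06 = 0.46;  dev-2: 0.15 + 0.45 + 0.20 = 0.80
    assert_eq!(select_best(&candidates).unwrap().id, "dev-2");
}
```

Because expertise carries the 0.5 weight, a proven specialist beats an idle generalist, which is the point of learning-based assignment.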
### Agent Roles

| Role | Specialization | Primary Tasks |
|------|----------------|---------------|
| **Architect** | System design | Feature planning, ADRs, design reviews |
| **Developer** | Implementation | Code generation, refactoring, debugging |
| **Reviewer** | Quality assurance | Code review, test coverage, style checks |
| **Tester** | QA & Benchmarks | Test suite, performance benchmarks |
| **Documenter** | Documentation | Guides, API docs, README updates |
| **Marketer** | Marketing content | Blog posts, case studies, announcements |
| **Presenter** | Presentations | Slides, deck creation, demo scripts |
| **DevOps** | Infrastructure | CI/CD setup, deployment, monitoring |
| **Monitor** | Health & Alerting | System monitoring, alerts, incident response |
| **Security** | Compliance & Audit | Code security, access control, compliance |
| **ProjectManager** | Coordination | Roadmap, tracking, milestone management |
| **DecisionMaker** | Conflict Resolution | Tie-breaking, escalation, ADR creation |
---

## 🔄 Multi-Agent Workflow Execution

### Sequential Workflow (Phases)

```
Phase 1: Design
  └─ Architect creates ADR
  └─ Move to Phase 2 (auto on completion)

Phase 2: Development
  └─ Developer implements
  └─ (Parallel) Documenter writes guide
  └─ Move to Phase 3

Phase 3: Review
  └─ Reviewer checks code quality
  └─ Security audits for compliance
  └─ If approved: Move to Phase 4
  └─ If rejected: Back to Phase 2

Phase 4: Testing
  └─ Tester creates test suite
  └─ Tester runs benchmarks
  └─ If passing: Move to Phase 5
  └─ If failing: Back to Phase 2

Phase 5: Completion
  └─ DevOps deploys
  └─ Monitor sets up alerts
  └─ ProjectManager marks done
```
### Parallel Coordination

Multiple agents work simultaneously when tasks are independent:

```
Task: "Add learning profiles"

├─ Architect (ADR)          ▶ Created in 2h
├─ Developer (Code)         ▶ Implemented in 8h
│   ├─ Reviewer (Review)    ▶ Reviewed in 1h (parallel)
│   └─ Documenter (Guide)   ▶ Documented in 2h (parallel)
│
└─ Tester (Tests)           ▶ Tests in 3h
    └─ Security (Audit)     ▶ Audited in 1h (parallel)
```
### Approval Gates

Critical decision points require manual approval:

- **Security Gate**: Must approve if code touches auth/secrets
- **Breaking Changes**: Architect approval required
- **Production Deployment**: DevOps + ProjectManager approval
- **Major Refactoring**: Architect + Lead Developer approval
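
The gate rules above amount to a mapping from task attributes to required approver roles. A minimal sketch under assumed flag names (the real gating logic lives in the workflow engine and is not shown in this document):

```rust
struct TaskFlags {
    touches_auth_or_secrets: bool,
    breaking_change: bool,
    production_deployment: bool,
    major_refactoring: bool,
}

/// Collect every approver role required before the task may proceed.
/// Roles may repeat if several gates name the same role.
fn required_approvers(flags: &TaskFlags) -> Vec<&'static str> {
    let mut approvers = Vec::new();
    if flags.touches_auth_or_secrets {
        approvers.push("Security");
    }
    if flags.breaking_change {
        approvers.push("Architect");
    }
    if flags.production_deployment {
        approvers.extend(["DevOps", "ProjectManager"]);
    }
    if flags.major_refactoring {
        approvers.extend(["Architect", "Lead Developer"]);
    }
    approvers
}

fn main() {
    let flags = TaskFlags {
        touches_auth_or_secrets: true,
        breaking_change: false,
        production_deployment: true,
        major_refactoring: false,
    };
    assert_eq!(
        required_approvers(&flags),
        ["Security", "DevOps", "ProjectManager"]
    );
}
```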
---

## 📝 Decision Extraction (ADRs)

Every design decision is automatically captured:

### ADR Template

```markdown
# ADR-042: Learning-Based Agent Selection

## Context

Previous agent assignment used simple load balancing (min tasks),
ignoring historical performance data. This led to poor agent-task matches.

## Decision

Implement per-task-type learning profiles with recency bias.

### Key Points
- Success rate weighted by recency (7-day window, 3× weight)
- Confidence scoring prevents small-sample overfitting
- Supports adaptive recovery from temporary degradation

## Consequences

**Positive**:
- 30-50% improvement in task success rate
- Agents improve continuously

**Negative**:
- Requires KG data collection (startup period)
- Learning period ~20 tasks per task-type

## Alternatives Considered

1. Rule-based routing (rejected: no learning)
2. Pure random assignment (rejected: no improvement)
3. Rolling average (rejected: no recency bias)

## Decision Made

Option A: Learning profiles with recency bias
```
### ADR Extraction Process

1. **Automatic**: Each task completion generates an execution record
2. **Learning**: If a decision had trade-offs, extract it as an ADR candidate
3. **Curation**: ProjectManager/Architect reviews and approves
4. **Archival**: Stored in docs/architecture/adr/ (numbered, immutable)
---

## 📚 Documentation Synchronization

### Automatic Updates

When tasks complete, documentation is auto-updated:

| Task Type | Auto-Updates |
|-----------|--------------|
| Feature | CHANGELOG.md, feature overview, API docs |
| Bugfix | CHANGELOG.md, troubleshooting guide |
| Tech-Debt | Architecture docs, refactoring guide |
| Enhancement | Feature docs, user guide |
| Documentation | Indexed in RAG, updated in search |
### Documentation Lifecycle

```
Task Created
     │
     ▼
Documentation Context Extracted
     │
     ├─ Decision/ADR created
     ├─ Related docs identified
     └─ Change summary prepared
     │
     ▼
Task Execution
     │
     ├─ Code generated
     ├─ Tests created
     └─ Examples documented
     │
     ▼
Task Complete
     │
     ├─ ADR finalized
     ├─ Docs auto-generated
     ├─ CHANGELOG entry created
     └─ Search index updated (RAG)
     │
     ▼
Archival (if stale)
     │
     └─ Moved to docs/archive/
        (kept for historical reference)
```
---

## 🔍 Search & Retrieval (RAG Integration)

### Document Indexing

All generated documentation is indexed for semantic search:

- **Architecture decisions** (ADRs)
- **Feature guides** (how-tos)
- **Code examples** (patterns)
- **Execution history** (knowledge graph)

### Query Examples

User asks: "How do I implement learning profiles?"

System searches:

1. ADRs mentioning "learning"
2. Implementation guides with "learning"
3. Execution history with similar task type
4. Code examples for "learning profiles"

Returns ranked results with sources.
---

## 📊 Metrics & Monitoring

### Task Metrics

- **Success Rate**: % of tasks completed successfully
- **Cycle Time**: Average time from todo → done
- **Agent Utilization**: Tasks per agent per role
- **Decision Quality**: ADRs implemented vs. abandoned

### Agent Metrics (per role)

- **Task Success Rate**: % of tasks completed successfully
- **Learning Curve**: Expertise improvement over time
- **Cost per Task**: Average LLM spend per completed task
- **Task Coverage**: Breadth of task-types handled

### Documentation Metrics

- **Coverage**: % of features documented
- **Freshness**: Days since last update
- **Usage**: Search queries hitting each doc
- **Accuracy**: User feedback on doc correctness
---

## 🏗️ Implementation Details

### SurrealDB Schema

```sql
-- Tasks table
DEFINE TABLE tasks SCHEMAFULL;
DEFINE FIELD id ON tasks TYPE string;
DEFINE FIELD type ON tasks TYPE string;
DEFINE FIELD state ON tasks TYPE string;
DEFINE FIELD assigned_agent ON tasks TYPE option<string>;

-- Executions (for learning)
DEFINE TABLE executions SCHEMAFULL;
DEFINE FIELD task_id ON executions TYPE string;
DEFINE FIELD agent_id ON executions TYPE string;
DEFINE FIELD success ON executions TYPE bool;
DEFINE FIELD duration_ms ON executions TYPE number;
DEFINE FIELD cost_cents ON executions TYPE number;

-- ADRs table
DEFINE TABLE adrs SCHEMAFULL;
DEFINE FIELD id ON adrs TYPE string;
DEFINE FIELD task_id ON adrs TYPE string;
DEFINE FIELD title ON adrs TYPE string;
DEFINE FIELD status ON adrs TYPE string; -- draft|approved|archived
```
### NATS Topics

- `tasks.{type}.{priority}` — Task assignments
- `agents.{role}.ready` — Agent heartbeats
- `agents.{role}.complete` — Task completion
- `adrs.created` — New ADR events
- `docs.updated` — Documentation changes
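
The parameterized subjects above can be built with simple formatters. A sketch (hypothetical helper names; actual publishing goes through the NATS client and is omitted here):

```rust
/// Subject for task assignments: tasks.{type}.{priority}
fn task_subject(task_type: &str, priority: &str) -> String {
    format!("tasks.{task_type}.{priority}")
}

/// Subject for agent lifecycle events: agents.{role}.{event}
fn agent_subject(role: &str, event: &str) -> String {
    format!("agents.{role}.{event}")
}

fn main() {
    assert_eq!(task_subject("feature", "high"), "tasks.feature.high");
    assert_eq!(agent_subject("developer", "complete"), "agents.developer.complete");
}
```

Keeping subject construction in one place lets subscribers use NATS wildcards (e.g. `tasks.*.high` for all high-priority work) without drifting from the publisher's layout.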
---

## 🎯 Key Design Patterns

### 1. Event-Driven Coordination

- Task creation → Agent assignment (async via NATS)
- Task completion → Documentation update (eventual consistency)
- No direct API calls between services (loosely coupled)

### 2. Learning from Execution History

- Every task stores execution metadata (success, duration, cost)
- Learning profiles updated from execution data
- Assignment quality improves continuously

### 3. Decision Extraction

- Design decisions captured as ADRs
- Immutable record of architectural rationale
- Serves as organizational memory

### 4. Graceful Degradation

- NATS offline: In-memory queue fallback
- Agent unavailable: Task re-assigned to next best
- Doc generation failed: Manual entry allowed
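
Pattern 2 amounts to folding each execution record into a running per-task-type profile. A minimal sketch (hypothetical struct; the real profile types live in vapora-agents/src/learning_profile.rs and also track recency windows):

```rust
#[derive(Default)]
struct TaskTypeProfile {
    executions: u32,
    successes: u32,
    total_duration_ms: u64,
    total_cost_cents: u64,
}

impl TaskTypeProfile {
    /// Fold one completed execution into the running totals.
    fn record(&mut self, success: bool, duration_ms: u64, cost_cents: u64) {
        self.executions += 1;
        if success {
            self.successes += 1;
        }
        self.total_duration_ms += duration_ms;
        self.total_cost_cents += cost_cents;
    }

    /// Lifetime success rate; 0.0 before any data arrives.
    fn success_rate(&self) -> f64 {
        if self.executions == 0 {
            0.0
        } else {
            self.successes as f64 / self.executions as f64
        }
    }
}

fn main() {
    let mut profile = TaskTypeProfile::default();
    profile.record(true, 4_000, 12);
    profile.record(false, 9_000, 30);
    assert_eq!(profile.success_rate(), 0.5);
    assert_eq!(profile.total_cost_cents, 42);
}
```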
---

## 📚 Related Documentation

- **[VAPORA Architecture](vapora-architecture.md)** — System overview
- **[Agent Registry & Coordination](agent-registry-coordination.md)** — Agent patterns
- **[Multi-Agent Workflows](multi-agent-workflows.md)** — Workflow execution
- **[Multi-IA Router](multi-ia-router.md)** — LLM provider selection
- **[Roles, Permissions & Profiles](roles-permissions-profiles.md)** — RBAC
---

**Status**: ✅ Production Ready
**Version**: 1.2.0
**Last Updated**: January 2026