# Task, Agent & Documentation Manager
## Multi-Agent Task Orchestration & Documentation Sync
**Status**: Production Ready (v1.2.0)
**Date**: January 2026
---
## 🎯 Overview
A system that:
1. **Manages tasks** in multi-agent workflow
2. **Assigns agents** automatically based on expertise
3. **Coordinates execution** in parallel with approval gates
4. **Extracts decisions** as Architecture Decision Records (ADRs)
5. **Maintains documentation** automatically synchronized
---
## 📋 Task Structure
### Task Metadata
Tasks are stored in SurrealDB with the following structure:
```toml
[task]
id = "task-089"
type = "feature" # feature | bugfix | enhancement | tech-debt
title = "Implement learning profiles"
description = "Agent expertise tracking with recency bias"
[status]
state = "in-progress" # todo | in-progress | review | done | archived
progress = 60 # 0-100%
created_at = "2026-01-11T10:15:30Z"
updated_at = "2026-01-11T14:30:22Z"
[assignment]
priority = "high" # high | medium | low
assigned_agent = "developer" # omitted when unassigned
assigned_team = "infrastructure"
[estimation]
estimated_hours = 8
# actual_hours is set when the task completes (omitted until then; TOML has no null)
[context]
related_tasks = ["task-087", "task-088"]
blocking_tasks = []
blocked_by = []
```
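
For orientation, here is a minimal sketch of how this metadata could map onto Rust types with serde; field names mirror the TOML above, but the actual definitions in the vapora codebase may differ.

```rust
use serde::{Deserialize, Serialize};

// Hypothetical mapping of the task metadata onto Rust types;
// the real definitions in the vapora codebase are not shown here.
#[derive(Debug, Serialize, Deserialize)]
#[serde(rename_all = "kebab-case")]
pub enum TaskType {
    Feature,
    Bugfix,
    Enhancement,
    TechDebt,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Task {
    pub id: String,
    #[serde(rename = "type")]
    pub task_type: TaskType,
    pub title: String,
    pub description: String,
    pub state: String,                  // todo | in-progress | review | done | archived
    pub progress: u8,                   // 0-100%
    pub priority: String,               // high | medium | low
    pub assigned_agent: Option<String>, // None if unassigned
    pub estimated_hours: Option<u32>,
    pub actual_hours: Option<u32>,      // set when complete
    pub related_tasks: Vec<String>,
    pub blocking_tasks: Vec<String>,
    pub blocked_by: Vec<String>,
}
```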
### Task Lifecycle
```
┌──────┐     ┌─────────────┐     ┌────────┐     ┌──────┐
│ TODO │────▶│ IN-PROGRESS │────▶│ REVIEW │────▶│ DONE │
└──────┘     └─────────────┘     └────────┘     └──┬───┘
   △                                               │
   └──────────────── ARCHIVED ◀────────────────────┘
```
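
The diagram's transitions can be enforced with a small state machine. A sketch (assumed, not the production implementation):

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TaskState {
    Todo,
    InProgress,
    Review,
    Done,
    Archived,
}

impl TaskState {
    /// Legal transitions per the lifecycle diagram. Review can send a task
    /// back to in-progress (rejection), and archived tasks can be reopened.
    pub fn can_transition_to(self, next: TaskState) -> bool {
        use TaskState::*;
        matches!(
            (self, next),
            (Todo, InProgress)
                | (InProgress, Review)
                | (Review, Done)
                | (Review, InProgress) // rejected in review
                | (Done, Archived)
                | (Archived, Todo) // reopened
        )
    }
}
```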
---
## 🤖 Agent Assignment
### Automatic Selection
When a task is created, SwarmCoordinator assigns the best agent:
1. **Capability Matching**: Filter agents by role matching task type
2. **Learning Profile Lookup**: Get expertise scores for task-type
3. **Load Balancing**: Check current agent load (tasks in progress)
4. **Scoring**: `final_score = 0.3*load_score + 0.5*expertise + 0.2*confidence`, where `load_score` favors less-loaded agents (sketched after this list)
5. **Notification**: Agent receives job via NATS JetStream
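
A minimal sketch of steps 2–4, assuming all inputs are pre-normalized to `[0, 1]`; confidence saturates after roughly 20 executions per task-type (the learning period noted in the ADR template below). The production logic lives in `vapora-agents/src/scoring.rs`.

```rust
/// Combined score per the formula above. All inputs assumed in [0.0, 1.0];
/// `load_score` is higher for less-loaded agents.
fn final_score(load_score: f64, expertise: f64, confidence: f64) -> f64 {
    0.3 * load_score + 0.5 * expertise + 0.2 * confidence
}

/// Confidence grows with sample size and saturates at ~20 executions,
/// which keeps small samples from dominating assignment.
fn confidence(executions: u32) -> f64 {
    (f64::from(executions) / 20.0).min(1.0)
}

/// Rank candidates (id, load_score, expertise, executions) and pick the best.
fn select_best(candidates: &[(String, f64, f64, u32)]) -> Option<&str> {
    candidates
        .iter()
        .max_by(|a, b| {
            let sa = final_score(a.1, a.2, confidence(a.3));
            let sb = final_score(b.1, b.2, confidence(b.3));
            sa.partial_cmp(&sb).unwrap_or(std::cmp::Ordering::Equal)
        })
        .map(|c| c.0.as_str())
}
```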
### Agent Roles
| Role | Specialization | Primary Tasks |
|------|---|---|
| **Architect** | System design | Feature planning, ADRs, design reviews |
| **Developer** | Implementation | Code generation, refactoring, debugging |
| **Reviewer** | Quality assurance | Code review, test coverage, style checks |
| **Tester** | QA & Benchmarks | Test suite, performance benchmarks |
| **Documenter** | Documentation | Guides, API docs, README updates |
| **Marketer** | Marketing content | Blog posts, case studies, announcements |
| **Presenter** | Presentations | Slides, deck creation, demo scripts |
| **DevOps** | Infrastructure | CI/CD setup, deployment, monitoring |
| **Monitor** | Health & Alerting | System monitoring, alerts, incident response |
| **Security** | Compliance & Audit | Code security, access control, compliance |
| **ProjectManager** | Coordination | Roadmap, tracking, milestone management |
| **DecisionMaker** | Conflict Resolution | Tie-breaking, escalation, ADR creation |
---
## 🔄 Multi-Agent Workflow Execution
### Sequential Workflow (Phases)
```
Phase 1: Design
└─ Architect creates ADR
└─ Move to Phase 2 (auto on completion)
Phase 2: Development
└─ Developer implements
└─ (Parallel) Documenter writes guide
└─ Move to Phase 3
Phase 3: Review
└─ Reviewer checks code quality
└─ Security audits for compliance
└─ If approved: Move to Phase 4
└─ If rejected: Back to Phase 2
Phase 4: Testing
└─ Tester creates test suite
└─ Tester runs benchmarks
└─ If passing: Move to Phase 5
└─ If failing: Back to Phase 2
Phase 5: Completion
└─ DevOps deploys
└─ Monitor sets up alerts
└─ ProjectManager marks done
```
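
One way to encode these transitions, including the rejection loops back to Development (a sketch, not the actual coordinator):

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Phase {
    Design,
    Development,
    Review,
    Testing,
    Completion,
}

/// Advance a workflow given the outcome of the current phase.
/// Rejected reviews and failing tests loop back to Development,
/// mirroring the phase diagram above.
fn next_phase(current: Phase, approved: bool) -> Phase {
    use Phase::*;
    match (current, approved) {
        (Design, _) => Development,
        (Development, _) => Review,
        (Review, true) => Testing,
        (Review, false) => Development,
        (Testing, true) => Completion,
        (Testing, false) => Development,
        (Completion, _) => Completion,
    }
}
```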
### Parallel Coordination
Multiple agents work simultaneously when independent:
```
Task: "Add learning profiles"
├─ Architect (ADR) ▶ Created in 2h
├─ Developer (Code) ▶ Implemented in 8h
│ ├─ Reviewer (Review) ▶ Reviewed in 1h (parallel)
│ └─ Documenter (Guide) ▶ Documented in 2h (parallel)
└─ Tester (Tests) ▶ Tests in 3h
└─ Security (Audit) ▶ Audited in 1h (parallel)
```
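
Independent steps can be fanned out concurrently. A sketch using Tokio, with a hypothetical `run_agent` helper; in production the fan-out happens over NATS JetStream:

```rust
// `run_agent` is a hypothetical stand-in for dispatching work to an agent.
async fn run_agent(role: &str, task_id: &str) -> Result<(), String> {
    println!("{role} working on {task_id}");
    Ok(())
}

async fn review_and_document(task_id: &str) -> Result<(), String> {
    // Reviewer and Documenter are independent of each other, so they
    // run in parallel once the Developer finishes.
    let (review, docs) = tokio::join!(
        run_agent("reviewer", task_id),
        run_agent("documenter", task_id),
    );
    review.and(docs)
}
```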
### Approval Gates
Critical decision points require manual approval; a gate-check sketch follows this list:
- **Security Gate**: Must approve if code touches auth/secrets
- **Breaking Changes**: Architect approval required
- **Production Deployment**: DevOps + ProjectManager approval
- **Major Refactoring**: Architect + Lead Developer approval
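
A minimal sketch of resolving which approvals a task needs, using hypothetical boolean task attributes (the real gate configuration is not shown here):

```rust
/// Hypothetical task attributes used to resolve approval gates.
struct GateContext {
    touches_auth_or_secrets: bool,
    breaking_change: bool,
    production_deployment: bool,
    major_refactoring: bool,
}

/// Return the roles whose manual approval is required,
/// per the gate rules listed above.
fn required_approvals(ctx: &GateContext) -> Vec<&'static str> {
    let mut approvers = Vec::new();
    if ctx.touches_auth_or_secrets {
        approvers.push("security");
    }
    if ctx.breaking_change || ctx.major_refactoring {
        approvers.push("architect");
    }
    if ctx.major_refactoring {
        approvers.push("lead-developer");
    }
    if ctx.production_deployment {
        approvers.push("devops");
        approvers.push("project-manager");
    }
    approvers
}
```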
---
## 📝 Decision Extraction (ADRs)
Every design decision is automatically captured:
### ADR Template
```markdown
# ADR-042: Learning-Based Agent Selection
## Context
Previous agent assignment used simple load balancing (min tasks),
ignoring historical performance data. This led to poor agent-task matches.
## Decision
Implement per-task-type learning profiles with recency bias.
### Key Points
- Success rate weighted by recency (7-day window, 3× weight)
- Confidence scoring prevents small-sample overfitting
- Supports adaptive recovery from temporary degradation
## Consequences
**Positive**:
- 30-50% improvement in task success rate
- Agents improve continuously
**Negative**:
- Requires KG data collection (startup period)
- Learning period ~20 tasks per task-type
## Alternatives Considered
1. Rule-based routing (rejected: no learning)
2. Pure random assignment (rejected: no improvement)
3. Rolling average (rejected: no recency bias)
## Decision Made
Learning profiles with recency bias (per the Decision above)
```
### ADR Extraction Process
1. **Automatic**: Each task completion generates execution record
2. **Learning**: If decision had trade-offs, extract as ADR candidate
3. **Curation**: ProjectManager/Architect reviews and approves
4. **Archival**: Stored in docs/architecture/adr/ (numbered, immutable)
---
## 📚 Documentation Synchronization
### Automatic Updates
When tasks complete, documentation is auto-updated (a dispatch sketch follows the table):
| Task Type | Auto-Updates |
|---|---|
| Feature | CHANGELOG.md, feature overview, API docs |
| Bugfix | CHANGELOG.md, troubleshooting guide |
| Tech-Debt | Architecture docs, refactoring guide |
| Enhancement | Feature docs, user guide |
| Documentation | Indexed in RAG, updated in search |
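
A sketch of the dispatch this table implies; `DocTarget` is a hypothetical stand-in for whatever the doc-sync service actually uses:

```rust
/// Hypothetical targets for the doc-sync step; names follow the table above.
#[derive(Debug)]
enum DocTarget {
    Changelog,
    FeatureOverview,
    ApiDocs,
    TroubleshootingGuide,
    ArchitectureDocs,
    RefactoringGuide,
    UserGuide,
    RagIndex,
}

/// Map a completed task's type onto the docs that should be refreshed.
fn docs_to_update(task_type: &str) -> Vec<DocTarget> {
    use DocTarget::*;
    match task_type {
        "feature" => vec![Changelog, FeatureOverview, ApiDocs],
        "bugfix" => vec![Changelog, TroubleshootingGuide],
        "tech-debt" => vec![ArchitectureDocs, RefactoringGuide],
        "enhancement" => vec![FeatureOverview, UserGuide],
        "documentation" => vec![RagIndex],
        _ => vec![],
    }
}
```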
### Documentation Lifecycle
```
Task Created
     │
     ▼
Documentation Context Extracted
  ├─ Decision/ADR created
  ├─ Related docs identified
  └─ Change summary prepared
     │
     ▼
Task Execution
  ├─ Code generated
  ├─ Tests created
  └─ Examples documented
     │
     ▼
Task Complete
  ├─ ADR finalized
  ├─ Docs auto-generated
  ├─ CHANGELOG entry created
  └─ Search index updated (RAG)
     │
     ▼
Archival (if stale)
  └─ Moved to docs/archive/ (kept for historical reference)
```
---
## 🔍 Search & Retrieval (RAG Integration)
### Document Indexing
All generated documentation is indexed for semantic search:
- **Architecture decisions** (ADRs)
- **Feature guides** (how-tos)
- **Code examples** (patterns)
- **Execution history** (knowledge graph)
### Query Examples
User asks: "How do I implement learning profiles?"
System searches:
1. ADRs mentioning "learning"
2. Implementation guides with "learning"
3. Execution history with similar task type
4. Code examples for "learning profiles"
Returns ranked results with sources.
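
A sketch of the final merge step, with a hypothetical `Hit` type standing in for the RAG layer's actual result type:

```rust
/// Hypothetical search hit; the real RAG layer's types are not shown.
struct Hit {
    source: &'static str, // "adr" | "guide" | "history" | "example"
    doc_id: String,
    score: f32,
}

/// Merge hits from all four sources and return them best-first, keeping
/// the source label so answers can cite where each result came from.
fn merge_ranked(mut hits: Vec<Hit>) -> Vec<Hit> {
    hits.sort_by(|a, b| {
        b.score
            .partial_cmp(&a.score)
            .unwrap_or(std::cmp::Ordering::Equal)
    });
    hits
}
```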
---
## 📊 Metrics & Monitoring
### Task Metrics
- **Success Rate**: % of tasks completed successfully
- **Cycle Time**: Average time from todo → done
- **Agent Utilization**: Tasks per agent per role
- **Decision Quality**: ADRs implemented vs. abandoned
### Agent Metrics (per role)
- **Task Success Rate**: % tasks completed successfully
- **Learning Curve**: Expertise improvement over time
- **Cost per Task**: Average LLM spend per completed task
- **Task Coverage**: Breadth of task-types handled
### Documentation Metrics
- **Coverage**: % of features documented
- **Freshness**: Days since last update
- **Usage**: Search queries hitting each doc
- **Accuracy**: User feedback on doc correctness
---
## 🏗️ Implementation Details
### SurrealDB Schema
```sql
-- Tasks table
DEFINE TABLE tasks SCHEMAFULL;
DEFINE FIELD id ON tasks TYPE string;
DEFINE FIELD type ON tasks TYPE string;
DEFINE FIELD state ON tasks TYPE string;
DEFINE FIELD assigned_agent ON tasks TYPE option<string>;
-- Executions (for learning)
DEFINE TABLE executions SCHEMAFULL;
DEFINE FIELD task_id ON executions TYPE string;
DEFINE FIELD agent_id ON executions TYPE string;
DEFINE FIELD success ON executions TYPE bool;
DEFINE FIELD duration_ms ON executions TYPE number;
DEFINE FIELD cost_cents ON executions TYPE number;
-- ADRs table
DEFINE TABLE adrs SCHEMAFULL;
DEFINE FIELD id ON adrs TYPE string;
DEFINE FIELD task_id ON adrs TYPE string;
DEFINE FIELD title ON adrs TYPE string;
DEFINE FIELD status ON adrs TYPE string; -- draft|approved|archived
```
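
The learning loader reads this schema per agent and task-type; the production method is `KGPersistence::get_executions_for_task_type` in `vapora-knowledge-graph/src/persistence.rs`. A hedged sketch using the `surrealdb` Rust client — the endpoint, namespace, and the `task_type`/`started_at` fields are assumptions not shown in the schema above:

```rust
use serde::Deserialize;
use surrealdb::engine::remote::ws::Ws;
use surrealdb::Surreal;

#[derive(Debug, Deserialize)]
struct Execution {
    task_id: String,
    agent_id: String,
    success: bool,
    duration_ms: u64,
    cost_cents: u64,
}

// Fetch recent executions for one agent + task type, capped at 100 rows.
// Connection details are assumptions; auth/signin is omitted for brevity.
async fn executions_for_task_type(
    agent_id: &str,
    task_type: &str,
) -> surrealdb::Result<Vec<Execution>> {
    let db = Surreal::new::<Ws>("127.0.0.1:8000").await?;
    db.use_ns("vapora").use_db("kg").await?;
    let mut res = db
        .query(
            "SELECT * FROM executions \
             WHERE agent_id = $agent AND task_type = $task_type \
             ORDER BY started_at DESC LIMIT 100",
        )
        .bind(("agent", agent_id.to_string()))
        .bind(("task_type", task_type.to_string()))
        .await?;
    res.take(0)
}
```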
### NATS Topics
- `tasks.{type}.{priority}` — Task assignments
- `agents.{role}.ready` — Agent heartbeats
- `agents.{role}.complete` — Task completion
- `adrs.created` — New ADR events
- `docs.updated` — Documentation changes
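
A sketch of publishing an assignment on the `tasks.{type}.{priority}` subject with the `async-nats` client; the JSON payload shape is an assumption, not vapora's actual schema:

```rust
use async_nats::Client;

// Publish a task assignment on `tasks.{type}.{priority}`.
async fn publish_assignment(
    client: &Client,
    task_type: &str,
    priority: &str,
    task_id: &str,
    agent: &str,
) -> Result<(), async_nats::Error> {
    let subject = format!("tasks.{task_type}.{priority}");
    // Assumed payload shape for illustration only.
    let payload = format!(r#"{{"task_id":"{task_id}","agent":"{agent}"}}"#);
    client.publish(subject, payload.into()).await?;
    Ok(())
}
```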
---
## 🎯 Key Design Patterns
### 1. Event-Driven Coordination
- Task creation → Agent assignment (async via NATS)
- Task completion → Documentation update (eventual consistency)
- No direct API calls between services (loosely coupled)
### 2. Learning from Execution History
- Every task stores execution metadata (success, duration, cost)
- Learning profiles updated from execution data
- Assignment quality improves continuously as data accumulates (recency weighting sketched below)
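
A sketch of the recency weighting, assuming an exponential decay tuned so results inside the 7-day window carry roughly 3× the weight of older ones (the exact decay constant in the codebase is not shown here):

```rust
/// One execution record with its age in days and outcome.
struct Exec {
    age_days: f64,
    success: bool,
}

/// Recency-weighted success rate: exponential decay such that an
/// execution today weighs ~3x one from 7 days ago (assumed constant).
fn recent_success_rate(execs: &[Exec]) -> f64 {
    let (mut num, mut den) = (0.0, 0.0);
    for e in execs {
        let w = 3.0f64.powf(-e.age_days / 7.0); // 1.0 today, 1/3 at 7 days
        num += w * if e.success { 1.0 } else { 0.0 };
        den += w;
    }
    if den > 0.0 { num / den } else { 0.0 }
}
```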
### 3. Decision Extraction
- Design decisions captured as ADRs
- Immutable record of architectural rationale
- Serves as organizational memory
### 4. Graceful Degradation
- NATS offline: In-memory queue fallback
- Agent unavailable: Task re-assigned to next best
- Doc generation failed: Manual entry allowed
---
## 📚 Related Documentation
- **[VAPORA Architecture](vapora-architecture.md)** — System overview
- **[Agent Registry & Coordination](agent-registry-coordination.md)** — Agent patterns
- **[Multi-Agent Workflows](multi-agent-workflows.md)** — Workflow execution
- **[Multi-IA Router](multi-ia-router.md)** — LLM provider selection
- **[Roles, Permissions & Profiles](roles-permissions-profiles.md)** — RBAC
---
**Status**: ✅ Production Ready
**Version**: 1.2.0
**Last Updated**: January 2026