# Phase 5.3: Multi-Agent Learning Infrastructure
Implements intelligent agent learning from Knowledge Graph execution history
with per-task-type expertise tracking, recency bias, and learning curves.

## Phase 5.3 Implementation

### Learning Infrastructure (Complete)
- LearningProfileService with per-task-type expertise metrics
- TaskTypeExpertise model tracking success_rate, confidence, learning curves
- Recency bias weighting: recent 7 days weighted 3x higher, with exponential decay for older executions
- Confidence scoring prevents overfitting: `min(1.0, executions / 20)` (see the sketch below)
- Learning curves computed from daily execution windows
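
A minimal sketch of the confidence and recency calculations described above, assuming a 3x weight inside the 7-day window and a 7-day half-life beyond it; the type and function names are illustrative, not the actual `vapora-knowledge-graph` API:

```rust
/// Illustrative execution sample; the real ExecutionData carries more fields.
pub struct ExecutionSample {
    pub success: bool,
    pub age_days: f64, // days since the execution finished
}

/// Confidence grows with sample size and saturates at 1.0 after 20 executions.
pub fn confidence(total_executions: usize) -> f64 {
    (total_executions as f64 / 20.0).min(1.0)
}

/// Recency weight: 3x inside the last 7 days, then exponential decay
/// with a 7-day half-life.
pub fn recency_weight(age_days: f64) -> f64 {
    if age_days <= 7.0 {
        3.0
    } else {
        3.0 * 0.5_f64.powf((age_days - 7.0) / 7.0)
    }
}

/// Recency-biased success rate over an agent's executions for one task type.
pub fn recent_success_rate(executions: &[ExecutionSample]) -> f64 {
    let mut weighted_successes = 0.0;
    let mut total_weight = 0.0;
    for exec in executions {
        let w = recency_weight(exec.age_days);
        total_weight += w;
        if exec.success {
            weighted_successes += w;
        }
    }
    if total_weight == 0.0 {
        0.0
    } else {
        weighted_successes / total_weight
    }
}
```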

### Agent Scoring Service (Complete)
- Unified AgentScore combining SwarmCoordinator + learning profiles
- Scoring formula: `0.3*base + 0.5*expertise + 0.2*confidence` (see the sketch below)
- Rank agents by combined score for intelligent assignment
- Support for recency-biased scoring (recent_success_rate)
- Methods: rank_agents, select_best, rank_agents_with_recency
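
A sketch of the combined score, using the weights from the formula above; the `AgentScore` fields here are assumptions based on this description, not the exact `scoring.rs` definition:

```rust
/// Components feeding the unified agent score (illustrative field names).
pub struct AgentScore {
    pub base_score: f64, // load-based score from SwarmCoordinator, 0.0..=1.0
    pub expertise: f64,  // task-type success rate from the learning profile
    pub confidence: f64, // min(1.0, executions / 20)
}

impl AgentScore {
    /// Hierarchical weighting: 30% base load, 50% expertise, 20% confidence.
    pub fn combined(&self) -> f64 {
        0.3 * self.base_score + 0.5 * self.expertise + 0.2 * self.confidence
    }
}
```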

### KG Integration (Complete)
- KGPersistence::get_executions_for_task_type() - query by agent + task type
- KGPersistence::get_agent_executions() - all executions for agent
- Coordinator::load_learning_profile_from_kg() - core KG→Learning integration
- Coordinator::load_all_learning_profiles() - batch load for multiple agents
- Convert PersistedExecution → ExecutionData for learning calculations

### Agent Assignment Integration (Complete)
- AgentCoordinator uses learning profiles for task assignment
- extract_task_type() infers the task type from the task title/description (see the sketch below)
- assign_task() scores candidates using AgentScoringService
- Fallback to load-based selection if no learning data available
- Learning profiles stored in coordinator.learning_profiles RwLock
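
A hedged sketch of keyword-based task-type inference; the keyword table below is invented for illustration and is not the actual extract_task_type() implementation:

```rust
/// Infer a coarse task type from a task's title and description.
/// Keyword rules are illustrative; the real implementation may differ.
pub fn extract_task_type(title: &str, description: &str) -> String {
    let text = format!("{} {}", title, description).to_lowercase();
    let rules = vec![
        ("bugfix", vec!["fix", "bug", "regression"]),
        ("documentation", vec!["docs", "readme", "guide"]),
        ("testing", vec!["test", "benchmark", "coverage"]),
        ("feature", vec!["implement", "add", "create"]),
    ];
    for (task_type, keywords) in rules {
        if keywords.iter().any(|kw| text.contains(kw)) {
            return task_type.to_string();
        }
    }
    "general".to_string()
}
```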

### Profile Adapter Enhancements (Complete)
- create_learning_profile() - initialize empty profiles
- add_task_type_expertise() - set task-type expertise
- update_profile_with_learning() - update swarm profiles from learning

## Files Modified

### vapora-knowledge-graph/src/persistence.rs (+30 lines)
- get_executions_for_task_type(agent_id, task_type, limit)
- get_agent_executions(agent_id, limit)

### vapora-agents/src/coordinator.rs (+100 lines)
- load_learning_profile_from_kg() - core KG integration method
- load_all_learning_profiles() - batch loading for agents
- assign_task() already uses learning-based scoring via AgentScoringService

### Existing Complete Implementation
- vapora-knowledge-graph/src/learning.rs - calculation functions
- vapora-agents/src/learning_profile.rs - data structures and expertise
- vapora-agents/src/scoring.rs - unified scoring service
- vapora-agents/src/profile_adapter.rs - adapter methods

## Tests Passing
- learning_profile: 7 tests 
- scoring: 5 tests 
- profile_adapter: 6 tests 
- coordinator: learning-specific tests 

## Data Flow
1. Task arrives → AgentCoordinator::assign_task()
2. Extract task_type from description
3. Query KG for task-type executions (load_learning_profile_from_kg)
4. Calculate expertise with recency bias
5. Score candidates (SwarmCoordinator + learning)
6. Assign to top-scored agent
7. Execution result → KG → Update learning profiles

## Key Design Decisions
- Recency bias: 7-day half-life with 3x weight for recent performance
- Confidence scoring: min(1.0, total_executions / 20) prevents overfitting
- Hierarchical scoring: 30% base load, 50% expertise, 20% confidence
- KG query limit: 100 recent executions per task-type for performance
- Async loading: load_learning_profile_from_kg supports concurrent loads

## Next: Phase 5.4 - Cost Optimization
Ready to implement budget enforcement and cost-aware provider selection.

# Task, Agent & Documentation Manager

Multi-Agent Task Orchestration & Documentation Sync

**Status**: Production Ready (v1.2.0) | **Date**: January 2026


## 🎯 Overview

A system that:

  1. Manages tasks in a multi-agent workflow
  2. Assigns agents automatically based on expertise
  3. Coordinates parallel execution with approval gates
  4. Extracts decisions as Architecture Decision Records (ADRs)
  5. Keeps documentation automatically synchronized

## 📋 Task Structure

### Task Metadata

Tasks are stored in SurrealDB with the following structure:

```toml
[task]
id = "task-089"
type = "feature"                    # feature | bugfix | enhancement | tech-debt
title = "Implement learning profiles"
description = "Agent expertise tracking with recency bias"

[status]
state = "in-progress"               # todo | in-progress | review | done | archived
progress = 60                        # 0-100%
created_at = "2026-01-11T10:15:30Z"
updated_at = "2026-01-11T14:30:22Z"

[assignment]
priority = "high"                   # high | medium | low
assigned_agent = "developer"        # Or null if unassigned
assigned_team = "infrastructure"

[estimation]
estimated_hours = 8
actual_hours = null                 # Updated when complete

[context]
related_tasks = ["task-087", "task-088"]
blocking_tasks = []
blocked_by = []
```

### Task Lifecycle

```
┌─────────┐     ┌──────────────┐     ┌────────┐     ┌──────────┐
│  TODO   │────▶│ IN-PROGRESS  │────▶│ REVIEW │────▶│   DONE   │
└─────────┘     └──────────────┘     └────────┘     └──────────┘
       △                                   │
       │                                   │
       └───────────── ARCHIVED ◀───────────┘
```
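
A sketch of the lifecycle as a Rust state machine; the enum mirrors the states above, and the `advance` helper covers only the happy path (rejection and archival loops would be handled separately):

```rust
/// Task states from the lifecycle diagram above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskState {
    Todo,
    InProgress,
    Review,
    Done,
    Archived,
}

/// Happy-path progression through the pipeline.
pub fn advance(state: TaskState) -> Option<TaskState> {
    use TaskState::*;
    match state {
        Todo => Some(InProgress),
        InProgress => Some(Review),
        Review => Some(Done),
        Done => None,     // may later be archived if stale
        Archived => None, // terminal unless the task is revived
    }
}
```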

## 🤖 Agent Assignment

### Automatic Selection

When a task is created, SwarmCoordinator assigns the best agent:

  1. Capability Matching: Filter agents by role matching task type
  2. Learning Profile Lookup: Get expertise scores for task-type
  3. Load Balancing: Check current agent load (tasks in progress)
  4. Scoring: `final_score = 0.3*load + 0.5*expertise + 0.2*confidence` (see the sketch below)
  5. Notification: Agent receives job via NATS JetStream
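
A minimal sketch of steps 2–4, reusing the weights above; candidate filtering and profile lookup are simplified, and the names are assumptions rather than the actual AgentScoringService API:

```rust
/// Candidate agent with its score components already computed
/// (in practice these come from SwarmCoordinator and the learning profiles).
pub struct Candidate {
    pub agent_id: String,
    pub load_score: f64, // 1.0 = idle, 0.0 = fully loaded
    pub expertise: f64,  // recency-biased success rate for this task type
    pub confidence: f64, // min(1.0, executions / 20)
}

/// final_score = 0.3*load + 0.5*expertise + 0.2*confidence
pub fn final_score(c: &Candidate) -> f64 {
    0.3 * c.load_score + 0.5 * c.expertise + 0.2 * c.confidence
}

/// Pick the candidate with the highest combined score.
pub fn select_best(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates
        .iter()
        .max_by(|a, b| final_score(a).total_cmp(&final_score(b)))
}
```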

### Agent Roles

| Role | Specialization | Primary Tasks |
|------|----------------|---------------|
| Architect | System design | Feature planning, ADRs, design reviews |
| Developer | Implementation | Code generation, refactoring, debugging |
| Reviewer | Quality assurance | Code review, test coverage, style checks |
| Tester | QA & benchmarks | Test suites, performance benchmarks |
| Documenter | Documentation | Guides, API docs, README updates |
| Marketer | Marketing content | Blog posts, case studies, announcements |
| Presenter | Presentations | Slides, deck creation, demo scripts |
| DevOps | Infrastructure | CI/CD setup, deployment, monitoring |
| Monitor | Health & alerting | System monitoring, alerts, incident response |
| Security | Compliance & audit | Code security, access control, compliance |
| ProjectManager | Coordination | Roadmap, tracking, milestone management |
| DecisionMaker | Conflict resolution | Tie-breaking, escalation, ADR creation |

## 🔄 Multi-Agent Workflow Execution

### Sequential Workflow (Phases)

```
Phase 1: Design
  └─ Architect creates ADR
     └─ Move to Phase 2 (auto on completion)

Phase 2: Development
  └─ Developer implements
  └─ (Parallel) Documenter writes guide
     └─ Move to Phase 3

Phase 3: Review
  └─ Reviewer checks code quality
  └─ Security audits for compliance
     └─ If approved: Move to Phase 4
     └─ If rejected: Back to Phase 2

Phase 4: Testing
  └─ Tester creates test suite
  └─ Tester runs benchmarks
     └─ If passing: Move to Phase 5
     └─ If failing: Back to Phase 2

Phase 5: Completion
  └─ DevOps deploys
  └─ Monitor sets up alerts
  └─ ProjectManager marks done
```

### Parallel Coordination

Multiple agents work simultaneously when their work is independent:

```
Task: "Add learning profiles"

├─ Architect (ADR)          ▶ Created in 2h
├─ Developer (Code)         ▶ Implemented in 8h
│  ├─ Reviewer (Review)     ▶ Reviewed in 1h (parallel)
│  └─ Documenter (Guide)    ▶ Documented in 2h (parallel)
│
└─ Tester (Tests)           ▶ Tests in 3h
   └─ Security (Audit)      ▶ Audited in 1h (parallel)
```

### Approval Gates

Critical decision points require manual approval (see the sketch below):

- Security Gate: Must approve if code touches auth/secrets
- Breaking Changes: Architect approval required
- Production Deployment: DevOps + ProjectManager approval
- Major Refactoring: Architect + Lead Developer approval
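
A hedged sketch of how these gate rules could be expressed; the `ApprovalGate` type and boolean flags are illustrative, not part of the current implementation:

```rust
/// Conditions that force a manual approval step (illustrative).
#[derive(Debug)]
pub enum ApprovalGate {
    Security,         // code touches auth/secrets
    BreakingChange,   // Architect approval required
    ProductionDeploy, // DevOps + ProjectManager approval
    MajorRefactor,    // Architect + Lead Developer approval
}

/// Decide which gates a task must pass based on simple task flags.
pub fn required_gates(
    touches_auth: bool,
    breaking_change: bool,
    deploys_to_prod: bool,
    major_refactor: bool,
) -> Vec<ApprovalGate> {
    let mut gates = Vec::new();
    if touches_auth {
        gates.push(ApprovalGate::Security);
    }
    if breaking_change {
        gates.push(ApprovalGate::BreakingChange);
    }
    if deploys_to_prod {
        gates.push(ApprovalGate::ProductionDeploy);
    }
    if major_refactor {
        gates.push(ApprovalGate::MajorRefactor);
    }
    gates
}
```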

## 📝 Decision Extraction (ADRs)

Every design decision is automatically captured:

### ADR Template

```markdown
# ADR-042: Learning-Based Agent Selection

## Context

Previous agent assignment used simple load balancing (min tasks),
ignoring historical performance data. This led to poor agent-task matches.

## Decision

Implement per-task-type learning profiles with recency bias.

### Key Points
- Success rate weighted by recency (7-day window, 3× weight)
- Confidence scoring prevents small-sample overfitting
- Supports adaptive recovery from temporary degradation

## Consequences

**Positive**:
- 30-50% improvement in task success rate
- Agents improve continuously

**Negative**:
- Requires KG data collection (startup period)
- Learning period ~20 tasks per task-type

## Alternatives Considered

1. Rule-based routing (rejected: no learning)
2. Pure random assignment (rejected: no improvement)
3. Rolling average (rejected: no recency bias)

## Decision Made

Option A: Learning profiles with recency bias
```

### ADR Extraction Process

  1. Automatic: Each task completion generates execution record
  2. Learning: If decision had trade-offs, extract as ADR candidate
  3. Curation: ProjectManager/Architect reviews and approves
  4. Archival: Stored in docs/architecture/adr/ (numbered, immutable)

## 📚 Documentation Synchronization

### Automatic Updates

When tasks complete, documentation is updated automatically (a dispatch sketch follows the table):

| Task Type | Auto-Updates |
|-----------|--------------|
| Feature | CHANGELOG.md, feature overview, API docs |
| Bugfix | CHANGELOG.md, troubleshooting guide |
| Tech-Debt | Architecture docs, refactoring guide |
| Enhancement | Feature docs, user guide |
| Documentation | Indexed in RAG, updated in search |
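
A minimal sketch of the dispatch implied by the table; the `TaskType` enum and the returned document names simply mirror the rows above:

```rust
/// Task categories used for documentation dispatch (mirrors the table above).
pub enum TaskType {
    Feature,
    Bugfix,
    TechDebt,
    Enhancement,
    Documentation,
}

/// Documents to refresh when a task of this type completes.
pub fn docs_to_update(task_type: &TaskType) -> Vec<&'static str> {
    match task_type {
        TaskType::Feature => vec!["CHANGELOG.md", "feature overview", "API docs"],
        TaskType::Bugfix => vec!["CHANGELOG.md", "troubleshooting guide"],
        TaskType::TechDebt => vec!["architecture docs", "refactoring guide"],
        TaskType::Enhancement => vec!["feature docs", "user guide"],
        TaskType::Documentation => vec!["RAG index", "search index"],
    }
}
```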

### Documentation Lifecycle

```
Task Created
    │
    ▼
Documentation Context Extracted
    │
    ├─ Decision/ADR created
    ├─ Related docs identified
    └─ Change summary prepared
    │
    ▼
Task Execution
    │
    ├─ Code generated
    ├─ Tests created
    └─ Examples documented
    │
    ▼
Task Complete
    │
    ├─ ADR finalized
    ├─ Docs auto-generated
    ├─ CHANGELOG entry created
    └─ Search index updated (RAG)
    │
    ▼
Archival (if stale)
    │
    └─ Moved to docs/archive/
       (kept for historical reference)
```

## 🔍 Search & Retrieval (RAG Integration)

### Document Indexing

All generated documentation is indexed for semantic search:

- Architecture decisions (ADRs)
- Feature guides (how-tos)
- Code examples (patterns)
- Execution history (knowledge graph)

### Query Examples

User asks: "How do I implement learning profiles?"

System searches:

  1. ADRs mentioning "learning"
  2. Implementation guides with "learning"
  3. Execution history with similar task type
  4. Code examples for "learning profiles"

Returns ranked results with sources.


## 📊 Metrics & Monitoring

### Task Metrics

- Success Rate: % of tasks completed successfully
- Cycle Time: Average time from todo → done (see the sketch below)
- Agent Utilization: Tasks per agent per role
- Decision Quality: ADRs implemented vs. abandoned
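
A minimal sketch of how the first two metrics could be computed from completed task records; the `TaskRecord` fields are assumptions for illustration:

```rust
use std::time::Duration;

/// Minimal view of a finished task for metric computation (illustrative).
pub struct TaskRecord {
    pub succeeded: bool,
    pub cycle_time: Duration, // time from todo to done
}

/// Fraction of completed tasks that succeeded.
pub fn success_rate(tasks: &[TaskRecord]) -> f64 {
    if tasks.is_empty() {
        return 0.0;
    }
    tasks.iter().filter(|t| t.succeeded).count() as f64 / tasks.len() as f64
}

/// Average cycle time across completed tasks.
pub fn avg_cycle_time(tasks: &[TaskRecord]) -> Duration {
    if tasks.is_empty() {
        return Duration::ZERO;
    }
    let total: Duration = tasks.iter().map(|t| t.cycle_time).sum();
    total / tasks.len() as u32
}
```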

### Agent Metrics (per role)

- Task Success Rate: % of tasks completed successfully
- Learning Curve: Expertise improvement over time
- Cost per Task: Average LLM spend per completed task
- Task Coverage: Breadth of task-types handled

### Documentation Metrics

- Coverage: % of features documented
- Freshness: Days since last update
- Usage: Search queries hitting each doc
- Accuracy: User feedback on doc correctness

## 🏗️ Implementation Details

### SurrealDB Schema

```sql
-- Tasks table
DEFINE TABLE tasks SCHEMAFULL;
DEFINE FIELD id ON tasks TYPE string;
DEFINE FIELD type ON tasks TYPE string;
DEFINE FIELD state ON tasks TYPE string;
DEFINE FIELD assigned_agent ON tasks TYPE option<string>;

-- Executions (for learning)
DEFINE TABLE executions SCHEMAFULL;
DEFINE FIELD task_id ON executions TYPE string;
DEFINE FIELD agent_id ON executions TYPE string;
DEFINE FIELD success ON executions TYPE bool;
DEFINE FIELD duration_ms ON executions TYPE number;
DEFINE FIELD cost_cents ON executions TYPE number;

-- ADRs table
DEFINE TABLE adrs SCHEMAFULL;
DEFINE FIELD id ON adrs TYPE string;
DEFINE FIELD task_id ON adrs TYPE string;
DEFINE FIELD title ON adrs TYPE string;
DEFINE FIELD status ON adrs TYPE string; -- draft|approved|archived
```
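
Rust-side record types corresponding to this schema might look like the following; the struct names simply mirror the table fields above and are not the actual `vapora` types:

```rust
use serde::{Deserialize, Serialize};

/// Mirrors the `tasks` table defined above (illustrative).
#[derive(Debug, Serialize, Deserialize)]
pub struct TaskRow {
    pub id: String,
    pub r#type: String,
    pub state: String,
    pub assigned_agent: Option<String>,
}

/// Mirrors the `executions` table used for learning (illustrative).
#[derive(Debug, Serialize, Deserialize)]
pub struct ExecutionRow {
    pub task_id: String,
    pub agent_id: String,
    pub success: bool,
    pub duration_ms: u64,
    pub cost_cents: u64,
}
```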

### NATS Topics

- `tasks.{type}.{priority}` — Task assignments (see the sketch below)
- `agents.{role}.ready` — Agent heartbeats
- `agents.{role}.complete` — Task completion
- `adrs.created` — New ADR events
- `docs.updated` — Documentation changes
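
A small sketch of building subjects under this convention; the helper functions are illustrative, and publishing itself (via NATS JetStream) is elided:

```rust
/// Build the task-assignment subject following `tasks.{type}.{priority}`.
pub fn task_subject(task_type: &str, priority: &str) -> String {
    format!("tasks.{task_type}.{priority}")
}

/// Build the per-role completion subject following `agents.{role}.complete`.
pub fn completion_subject(role: &str) -> String {
    format!("agents.{role}.complete")
}

fn main() {
    assert_eq!(task_subject("feature", "high"), "tasks.feature.high");
    assert_eq!(completion_subject("developer"), "agents.developer.complete");
}
```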

## 🎯 Key Design Patterns

### 1. Event-Driven Coordination

- Task creation → Agent assignment (async via NATS)
- Task completion → Documentation update (eventual consistency)
- No direct API calls between services (loosely coupled)

### 2. Learning from Execution History

- Every task stores execution metadata (success, duration, cost)
- Learning profiles are updated from execution data
- Assignment quality improves continuously

### 3. Decision Extraction

- Design decisions captured as ADRs
- Immutable record of architectural rationale
- Serves as organizational memory

### 4. Graceful Degradation

- NATS offline: In-memory queue fallback (see the sketch below)
- Agent unavailable: Task re-assigned to the next-best agent
- Doc generation failed: Manual entry allowed
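
A hedged sketch of the publish-with-fallback idea behind the first bullet; the queue type and the `send` closure standing in for the real NATS publish are illustrative:

```rust
use std::collections::VecDeque;

/// Outbound event destined for NATS (illustrative).
pub struct Event {
    pub subject: String,
    pub payload: Vec<u8>,
}

/// Buffers events in memory while the broker is unreachable and
/// drains the backlog once connectivity returns.
pub struct FallbackPublisher {
    connected: bool,
    backlog: VecDeque<Event>,
}

impl FallbackPublisher {
    /// `send` stands in for the real NATS publish; it returns false on failure.
    pub fn publish(&mut self, event: Event, send: impl Fn(&Event) -> bool) {
        if self.connected && send(&event) {
            return;
        }
        // Broker unreachable: keep the event in the in-memory queue.
        self.connected = false;
        self.backlog.push_back(event);
    }

    /// Drain the backlog after the connection is re-established.
    pub fn reconnect(&mut self, send: impl Fn(&Event) -> bool) {
        self.connected = true;
        while let Some(event) = self.backlog.pop_front() {
            if !send(&event) {
                // Still failing: put the event back and stop draining.
                self.backlog.push_front(event);
                self.connected = false;
                break;
            }
        }
    }
}
```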


**Status**: Production Ready | **Version**: 1.2.0 | **Last Updated**: January 2026