# Task, Agent & Documentation Manager
## Multi-Agent Task Orchestration & Documentation Sync
**Status**: Production Ready (v1.2.0)
**Date**: January 2026
---
## 🎯 Overview
A system that:
1. **Manages tasks** in a multi-agent workflow
2. **Assigns agents** automatically based on expertise
3. **Coordinates execution** in parallel with approval gates
4. **Extracts decisions** as Architecture Decision Records (ADRs)
5. **Keeps documentation** automatically synchronized
---
## 📋 Task Structure
### Task Metadata
Tasks are stored in SurrealDB with the following structure:
```toml
[task]
id = "task-089"
type = "feature" # feature | bugfix | enhancement | tech-debt
title = "Implement learning profiles"
description = "Agent expertise tracking with recency bias"
[status]
state = "in-progress" # todo | in-progress | review | done | archived
progress = 60 # 0-100%
created_at = "2026-01-11T10:15:30Z"
updated_at = "2026-01-11T14:30:22Z"
[assignment]
priority = "high" # high | medium | low
assigned_agent = "developer" # Omit this key if unassigned (TOML has no null)
assigned_team = "infrastructure"
[estimation]
estimated_hours = 8
# actual_hours is omitted until the task completes (TOML has no null)
[context]
related_tasks = ["task-087", "task-088"]
blocking_tasks = []
blocked_by = []
```
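The allowed values listed in the comments above can be checked when a task is loaded. The following is a minimal sketch, assuming a hypothetical `Task` dataclass that mirrors a subset of the fields; the class and its field layout are illustrative and not taken from the VAPORA codebase.

```python
from dataclasses import dataclass, field
from typing import Optional

# Allowed values, copied from the comments in the task metadata example.
TASK_TYPES = {"feature", "bugfix", "enhancement", "tech-debt"}
TASK_STATES = {"todo", "in-progress", "review", "done", "archived"}
PRIORITIES = {"high", "medium", "low"}


@dataclass
class Task:
    """Hypothetical in-memory mirror of the task metadata (assumption)."""
    id: str
    type: str
    title: str
    state: str = "todo"
    progress: int = 0                      # percent, 0-100
    priority: str = "medium"
    assigned_agent: Optional[str] = None   # None means unassigned
    blocked_by: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject values outside the documented enumerations and ranges.
        if self.type not in TASK_TYPES:
            raise ValueError(f"unknown task type: {self.type!r}")
        if self.state not in TASK_STATES:
            raise ValueError(f"unknown task state: {self.state!r}")
        if self.priority not in PRIORITIES:
            raise ValueError(f"unknown priority: {self.priority!r}")
        if not 0 <= self.progress <= 100:
            raise ValueError("progress must be between 0 and 100")


task = Task(
    id="task-089",
    type="feature",
    title="Implement learning profiles",
    state="in-progress",
    progress=60,
    priority="high",
    assigned_agent="developer",
)
```

Validating at construction time keeps invalid states from reaching the assignment and documentation steps later in the pipeline.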
### Task Lifecycle
```
┌─────────┐     ┌──────────────┐     ┌────────┐     ┌──────────┐
│  TODO   │────▶│ IN-PROGRESS  │────▶│ REVIEW │────▶│   DONE   │
└─────────┘     └──────────────┘     └────────┘     └──────────┘
     △                                                    │
     │                                                    │
     └──────────────────── ARCHIVED ◀─────────────────────┘
```
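One way to enforce the lifecycle is a transition table. This sketch reads the diagram as `done → archived` with an archived task able to return to `todo`, and it adds a `review → in-progress` edge for rejected reviews; both readings are assumptions, as are the function names.

```python
# Assumed transition table. The diagram does not spell out every edge, so
# "review" -> "in-progress" (rejection) and "archived" -> "todo" (reopen)
# are interpretations rather than documented behavior.
ALLOWED_TRANSITIONS = {
    "todo": {"in-progress"},
    "in-progress": {"review"},
    "review": {"done", "in-progress"},
    "done": {"archived"},
    "archived": {"todo"},
}


def can_transition(current: str, target: str) -> bool:
    """Return True if moving from `current` to `target` is allowed."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


def advance(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if not can_transition(current, target):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```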
---
## 🤖 Agent Assignment
### Automatic Selection
When a task is created, SwarmCoordinator assigns the best agent:
1. **Capability Matching**: Filter agents by role matching task type
2. **Learning Profile Lookup**: Get expertise scores for task-type
3. **Load Balancing**: Check current agent load (tasks in progress)
4. **Scoring**: `final_score = 0.3*load + 0.5*expertise + 0.2*confidence`
5. **Notification**: Agent receives job via NATS JetStream
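
The scoring step can be sketched as follows. The weights come from the formula above; the `Candidate` type, the `[0, 1]` normalization, and the reading of `load` as an availability score (higher means less busy) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical per-agent inputs to the scoring formula."""
    name: str
    load: float        # assumed availability score in [0, 1]; 1.0 = idle
    expertise: float   # learning-profile score for the task type, [0, 1]
    confidence: float  # confidence in the expertise estimate, [0, 1]


def final_score(c: Candidate) -> float:
    """Apply the documented weights: 0.3 load, 0.5 expertise, 0.2 confidence."""
    return 0.3 * c.load + 0.5 * c.expertise + 0.2 * c.confidence


def select_agent(candidates: list) -> Candidate:
    """Pick the candidate with the highest final score."""
    return max(candidates, key=final_score)


candidates = [
    Candidate("dev-a", load=0.9, expertise=0.4, confidence=0.5),  # idle, less expert
    Candidate("dev-b", load=0.3, expertise=0.9, confidence=0.8),  # busy, more expert
]
best = select_agent(candidates)
```

With these inputs `dev-a` scores 0.57 and `dev-b` scores 0.70, so the busier but more experienced agent is selected, which reflects the 0.5 weight on expertise.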
### Agent Roles
| Role | Specialization | Primary Tasks |
|------|---|---|
| **Architect** | System design | Feature planning, ADRs, design reviews |
| **Developer** | Implementation | Code generation, refactoring, debugging |
| **Reviewer** | Quality assurance | Code review, test coverage, style checks |
| **Tester** | QA & Benchmarks | Test suite, performance benchmarks |
| **Documenter** | Documentation | Guides, API docs, README updates |
| **Marketer** | Marketing content | Blog posts, case studies, announcements |
| **Presenter** | Presentations | Slides, deck creation, demo scripts |
| **DevOps** | Infrastructure | CI/CD setup, deployment, monitoring |
| **Monitor** | Health & Alerting | System monitoring, alerts, incident response |
| **Security** | Compliance & Audit | Code security, access control, compliance |
| **ProjectManager** | Coordination | Roadmap, tracking, milestone management |
| **DecisionMaker** | Conflict Resolution | Tie-breaking, escalation, ADR creation |
---
## 🔄 Multi-Agent Workflow Execution
### Sequential Workflow (Phases)
```
Phase 1: Design
└─ Architect creates ADR
└─ Move to Phase 2 (auto on completion)
Phase 2: Development
└─ Developer implements
└─ (Parallel) Documenter writes guide
└─ Move to Phase 3
Phase 3: Review
└─ Reviewer checks code quality
└─ Security audits for compliance
└─ If approved: Move to Phase 4
└─ If rejected: Back to Phase 2
Phase 4: Testing
└─ Tester creates test suite
└─ Tester runs benchmarks
└─ If passing: Move to Phase 5
└─ If failing: Back to Phase 2
Phase 5: Completion
└─ DevOps deploys
└─ Monitor sets up alerts
└─ ProjectManager marks done
```
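The phase routing above, including the two rejection paths back to Phase 2, can be expressed as a small function. This is a sketch; the phase identifiers and the `next_phase` helper are hypothetical names.

```python
from typing import Optional

# Phase order as listed in the workflow above.
PHASES = ["design", "development", "review", "testing", "completion"]

# Phases whose failure sends the task back to development (Phase 2).
GATED_PHASES = {"review", "testing"}


def next_phase(current: str, passed: bool = True) -> Optional[str]:
    """Return the phase that follows `current`, or None when the workflow ends."""
    if current in GATED_PHASES and not passed:
        return "development"
    index = PHASES.index(current)  # raises ValueError for unknown phases
    return PHASES[index + 1] if index + 1 < len(PHASES) else None
```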
### Parallel Coordination
Agents work simultaneously when their subtasks are independent:
```
Task: "Add learning profiles"
├─ Architect (ADR)          ▶ Created in 2h
├─ Developer (Code)         ▶ Implemented in 8h
│  ├─ Reviewer (Review)     ▶ Reviewed in 1h (parallel)
│  └─ Documenter (Guide)    ▶ Documented in 2h (parallel)
└─ Tester (Tests)           ▶ Tests in 3h
   └─ Security (Audit)      ▶ Audited in 1h (parallel)
```
```
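To see what parallelism buys in this example, the elapsed time can be compared with the total effort. The sketch assumes that the three top-level branches start together and that nested agents start once their parent finishes; that reading of the tree is an interpretation, not something the diagram states.

```python
# Each node is (hours, followers). Followers start when the node finishes
# and run in parallel with each other (assumed semantics).
def elapsed(node) -> int:
    """Wall-clock hours for a node plus its slowest follower chain."""
    hours, followers = node
    return hours + max((elapsed(f) for f in followers), default=0)


def total_effort(node) -> int:
    """Sum of all agent hours in the subtree."""
    hours, followers = node
    return hours + sum(total_effort(f) for f in followers)


branches = [
    (2, []),                    # Architect (ADR)
    (8, [(1, []), (2, [])]),    # Developer, then Reviewer and Documenter
    (3, [(1, [])]),             # Tester, then Security
]

wall_clock = max(elapsed(b) for b in branches)       # longest branch
sequential = sum(total_effort(b) for b in branches)  # one agent at a time
```

Under these assumptions the task finishes in 10 hours of wall-clock time, against 17 hours if every step ran one after another.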
### Approval Gates
Critical decision points require manual approval:
- **Security Gate**: Security must approve any change that touches auth or secrets
- **Breaking Changes**: Architect approval required
- **Production Deployment**: DevOps + ProjectManager approval
- **Major Refactoring**: Architect + Lead Developer approval
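
A gate check reduces to a set comparison between required and collected approvals. In this sketch the gate keys and the `gate_open` and `missing_approvers` helpers are hypothetical; the approver roles are taken from the list above.

```python
# Required approver roles per gate, taken from the list above.
# The gate keys themselves are illustrative names (assumption).
REQUIRED_APPROVERS = {
    "security": {"Security"},
    "breaking-change": {"Architect"},
    "production-deployment": {"DevOps", "ProjectManager"},
    "major-refactoring": {"Architect", "Lead Developer"},
}


def missing_approvers(gate: str, approvals: set) -> set:
    """Return the roles that still need to approve."""
    return REQUIRED_APPROVERS[gate] - approvals


def gate_open(gate: str, approvals: set) -> bool:
    """A gate opens only when every required role has approved."""
    return not missing_approvers(gate, approvals)
```

For example, `gate_open("production-deployment", {"DevOps"})` stays false until `ProjectManager` also approves.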
---
## 📝 Decision Extraction (ADRs)
Every design decision is automatically captured:
### ADR Template
```markdown
# ADR-042: Learning-Based Agent Selection
## Context
Previous agent assignment used simple load balancing (min tasks),
ignoring historical performance data. This led to poor agent-task matches.
## Decision
Implement per-task-type learning profiles with recency bias.
### Key Points
- Success rate weighted by recency (7-day window, 3× weight)
- Confidence scoring prevents small-sample overfitting
- Supports adaptive recovery from temporary degradation
## Consequences
**Positive**:
- 30-50% improvement in task success rate
- Agents improve continuously
**Negative**:
- Requires KG data collection (startup period)
- Learning period ~20 tasks per task-type
## Alternatives Considered
1. Rule-based routing (rejected: no learning)
2. Pure random assignment (rejected: no improvement)
3. Rolling average (rejected: no recency bias)
## Decision Made
Learning profiles with recency bias, as described under Decision.
```
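The recency rule in the example ADR (7-day window, 3× weight) can be illustrated with a weighted success rate. This is a sketch under stated assumptions: executions older than the window get weight 1, and confidence grows linearly until 20 samples, a threshold borrowed from the "~20 tasks" learning period. Neither detail is confirmed by the ADR, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RECENT_WINDOW = timedelta(days=7)   # from the ADR key points
RECENT_WEIGHT = 3.0                 # from the ADR key points
FULL_CONFIDENCE_SAMPLES = 20        # assumption based on "~20 tasks"


def weighted_success_rate(executions, now: datetime) -> float:
    """executions: iterable of (finished_at, success) pairs."""
    weighted_success = 0.0
    total_weight = 0.0
    for finished_at, success in executions:
        # Executions inside the window count three times as much.
        weight = RECENT_WEIGHT if now - finished_at <= RECENT_WINDOW else 1.0
        total_weight += weight
        if success:
            weighted_success += weight
    return weighted_success / total_weight if total_weight else 0.0


def confidence(sample_count: int) -> float:
    """Linear ramp that damps scores built on few samples (assumed shape)."""
    return min(1.0, sample_count / FULL_CONFIDENCE_SAMPLES)


now = datetime(2026, 1, 11, tzinfo=timezone.utc)
history = [
    (now - timedelta(days=30), False),
    (now - timedelta(days=20), False),
    (now - timedelta(days=2), True),
    (now - timedelta(days=1), True),
]
rate = weighted_success_rate(history, now)
```

Here the unweighted success rate is 0.5, while the recency-weighted rate is 0.75 because the two recent successes each count three times. With only four samples, `confidence(4)` is 0.2, so the score would still be treated cautiously.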
### ADR Extraction Process
1. **Automatic**: Each task completion generates an execution record
2. **Learning**: If the decision involved trade-offs, it is extracted as an ADR candidate
3. **Curation**: A ProjectManager or Architect reviews and approves the candidate
4. **Archival**: Approved ADRs are stored in `docs/adrs/` (numbered, immutable)
---
## 📚 Documentation Synchronization
### Automatic Updates
When tasks complete, documentation is auto-updated:
| Task Type | Auto-Updates |
|---|---|
| Feature | CHANGELOG.md, feature overview, API docs |
| Bugfix | CHANGELOG.md, troubleshooting guide |
| Tech-Debt | Architecture docs, refactoring guide |
| Enhancement | Feature docs, user guide |
| Documentation | Indexed in RAG, updated in search |
### Documentation Lifecycle
```
Task Created
    ↓
Documentation Context Extracted
    ├─ Decision/ADR created
    ├─ Related docs identified
    └─ Change summary prepared
    ↓
Task Execution
    ├─ Code generated
    ├─ Tests created
    └─ Examples documented
    ↓
Task Complete
    ├─ ADR finalized
    ├─ Docs auto-generated
    ├─ CHANGELOG entry created
    └─ Search index updated (RAG)
    ↓
Archival (if stale)
    └─ Moved to docs/archive/
       (kept for historical reference)
```
---
## 🔍 Search & Retrieval (RAG Integration)
### Document Indexing
All generated documentation is indexed for semantic search:
- **Architecture decisions** (ADRs)
- **Feature guides** (how-tos)
- **Code examples** (patterns)
- **Execution history** (knowledge graph)
### Query Examples
User asks: "How do I implement learning profiles?"
System searches:
1. ADRs mentioning "learning"
2. Implementation guides with "learning"
3. Execution history with similar task type
4. Code examples for "learning profiles"
Returns ranked results with sources.
---
## 📊 Metrics & Monitoring
### Task Metrics
- **Success Rate**: % of tasks completed successfully
- **Cycle Time**: Average time from todo → done
- **Agent Utilization**: Tasks per agent per role
- **Decision Quality**: ADRs implemented vs. abandoned
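
The first two task metrics are simple aggregates over execution records. A minimal sketch, assuming outcomes arrive as booleans and each finished task carries a creation and a completion timestamp; the function names and sample values are hypothetical.

```python
from datetime import datetime, timedelta


def success_rate(outcomes: list) -> float:
    """Percentage of tasks that completed successfully."""
    return 100.0 * sum(outcomes) / len(outcomes) if outcomes else 0.0


def average_cycle_time(tasks: list) -> timedelta:
    """Mean time from creation (todo) to completion (done)."""
    durations = [done_at - created_at for created_at, done_at in tasks]
    return sum(durations, timedelta()) / len(durations)


# Illustrative sample data, not real measurements.
outcomes = [True, True, False, True]
finished = [
    (datetime(2026, 1, 11, 10, 0), datetime(2026, 1, 11, 18, 0)),  # 8 hours
    (datetime(2026, 1, 12, 9, 0), datetime(2026, 1, 13, 9, 0)),    # 24 hours
]

rate = success_rate(outcomes)
cycle = average_cycle_time(finished)
```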
### Agent Metrics (per role)
- **Task Success Rate**: % of tasks completed successfully
- **Learning Curve**: Improvement in expertise over time
- **Cost per Task**: Average LLM spend per completed task
- **Task Coverage**: Breadth of task-types handled
### Documentation Metrics
- **Coverage**: % of features documented
- **Freshness**: Days since last update
- **Usage**: Search queries hitting each doc
- **Accuracy**: User feedback on doc correctness
---
## 🏗️ Implementation Details
### SurrealDB Schema
```sql
-- Tasks table
DEFINE TABLE tasks SCHEMAFULL;
DEFINE FIELD id ON tasks TYPE string;
DEFINE FIELD type ON tasks TYPE string;
DEFINE FIELD state ON tasks TYPE string;
DEFINE FIELD assigned_agent ON tasks TYPE option<string>;
-- Executions (for learning)
DEFINE TABLE executions SCHEMAFULL;
DEFINE FIELD task_id ON executions TYPE string;
DEFINE FIELD agent_id ON executions TYPE string;
DEFINE FIELD success ON executions TYPE bool;
DEFINE FIELD duration_ms ON executions TYPE number;
DEFINE FIELD cost_cents ON executions TYPE number;
-- ADRs table
DEFINE TABLE adrs SCHEMAFULL;
DEFINE FIELD id ON adrs TYPE string;
DEFINE FIELD task_id ON adrs TYPE string;
DEFINE FIELD title ON adrs TYPE string;
DEFINE FIELD status ON adrs TYPE string; -- draft|approved|archived
```
### NATS Topics
- `tasks.{type}.{priority}` — Task assignments
- `agents.{role}.ready` — Agent heartbeats
- `agents.{role}.complete` — Task completion
- `adrs.created` — New ADR events
- `docs.updated` — Documentation changes
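
Subjects of the form `tasks.{type}.{priority}` work well with NATS wildcards, where `*` matches exactly one token and `>` matches one or more trailing tokens. The sketch below builds a subject and mimics that matching in plain Python for illustration; it does not use a NATS client, and both helper names are hypothetical.

```python
def task_subject(task_type: str, priority: str) -> str:
    """Build a subject following the tasks.{type}.{priority} pattern."""
    return f"tasks.{task_type}.{priority}"


def subject_matches(pattern: str, subject: str) -> bool:
    """Mimic NATS wildcard matching ('>' is assumed to be the last token)."""
    pattern_tokens = pattern.split(".")
    subject_tokens = subject.split(".")
    for i, token in enumerate(pattern_tokens):
        if token == ">":
            # '>' needs at least one remaining token to match.
            return len(subject_tokens) > i
        if i >= len(subject_tokens):
            return False
        if token != "*" and token != subject_tokens[i]:
            return False
    return len(pattern_tokens) == len(subject_tokens)


subject = task_subject("feature", "high")
```

An agent that handles every feature task could subscribe to `tasks.feature.*`, while a monitoring service could subscribe to `tasks.>`.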
---
## 🎯 Key Design Patterns
### 1. Event-Driven Coordination
- Task creation → Agent assignment (async via NATS)
- Task completion → Documentation update (eventual consistency)
- No direct API calls between services (loosely coupled)
### 2. Learning from Execution History
- Every task stores execution metadata (success, duration, cost)
- Learning profiles updated from execution data
- Assignment quality improves continuously as profiles accumulate data
### 3. Decision Extraction
- Design decisions captured as ADRs
- Immutable record of architectural rationale
- Serves as organizational memory
### 4. Graceful Degradation
- NATS offline: Falls back to an in-memory queue
- Agent unavailable: Task is re-assigned to the next-best agent
- Doc generation failed: Manual entry is allowed
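
The in-memory queue fallback can be sketched as a thin wrapper around the publish call. Everything here is hypothetical: the `FallbackPublisher` class, the `send` callable, and the use of `ConnectionError` to signal an offline broker are assumptions, not the actual VAPORA or NATS client API.

```python
from collections import deque


class FallbackPublisher:
    """Buffer messages in memory while the broker is unreachable (sketch)."""

    def __init__(self, send):
        # `send(subject, payload)` is assumed to raise ConnectionError when offline.
        self._send = send
        self._pending = deque()

    def flush(self) -> None:
        """Deliver buffered messages in order; stops at the first failure."""
        while self._pending:
            subject, payload = self._pending[0]
            self._send(subject, payload)  # raises if still offline
            self._pending.popleft()

    def publish(self, subject: str, payload: bytes) -> None:
        try:
            self.flush()                  # keep ordering: older messages first
            self._send(subject, payload)
        except ConnectionError:
            self._pending.append((subject, payload))


# Simulated broker for demonstration purposes.
broker = {"online": False}
sent = []


def send(subject, payload):
    if not broker["online"]:
        raise ConnectionError("broker offline")
    sent.append((subject, payload))


publisher = FallbackPublisher(send)
publisher.publish("tasks.feature.high", b"a")   # buffered while offline
broker["online"] = True
publisher.publish("tasks.bugfix.low", b"b")     # flushes the buffer, then sends
```

Flushing before each new publish preserves message order. A production version would also need a size limit and persistence, since an in-memory buffer is lost on restart.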
---
## 📚 Related Documentation
- **[VAPORA Architecture](vapora-architecture.md)** — System overview
- **[Agent Registry & Coordination](agent-registry-coordination.md)** — Agent patterns
- **[Multi-Agent Workflows](multi-agent-workflows.md)** — Workflow execution
- **[Multi-IA Router](multi-ia-router.md)** — LLM provider selection
- **[Roles, Permissions & Profiles](roles-permissions-profiles.md)** — RBAC
---
**Status**: ✅ Production Ready
**Version**: 1.2.0
**Last Updated**: January 2026