# Task, Agent & Documentation Manager
## Multi-Agent Task Orchestration & Documentation Sync

**Status**: Production Ready (v1.2.0)
**Date**: January 2026

---

## 🎯 Overview

The Task, Agent & Documentation Manager is the subsystem that:
1. **Manages tasks** in a multi-agent workflow
2. **Assigns agents** automatically based on expertise
3. **Coordinates execution** in parallel, with approval gates
4. **Extracts decisions** as Architecture Decision Records (ADRs)
5. **Keeps documentation** synchronized automatically

---

## 📋 Task Structure

### Task Metadata

Tasks are stored in SurrealDB with the following structure (shown here as TOML):

```toml
[task]
id = "task-089"
type = "feature"                    # feature | bugfix | enhancement | tech-debt
title = "Implement learning profiles"
description = "Agent expertise tracking with recency bias"

[status]
state = "in-progress"               # todo | in-progress | review | done | archived
progress = 60                        # 0-100%
created_at = "2026-01-11T10:15:30Z"
updated_at = "2026-01-11T14:30:22Z"

[assignment]
priority = "high"                   # high | medium | low
assigned_agent = "developer"        # omit the key when unassigned (TOML has no null)
assigned_team = "infrastructure"

[estimation]
estimated_hours = 8
# actual_hours is omitted until the task completes (TOML has no null)

[context]
related_tasks = ["task-087", "task-088"]
blocking_tasks = []
blocked_by = []
```

### Task Lifecycle

```
┌─────────┐     ┌──────────────┐     ┌────────┐     ┌──────────┐
│  TODO   │────▶│ IN-PROGRESS  │────▶│ REVIEW │────▶│   DONE   │
└─────────┘     └──────────────┘     └────────┘     └──────────┘
       △                                   │
       │                                   │
       └───────────── ARCHIVED ◀───────────┘
```

---

## 🤖 Agent Assignment

### Automatic Selection

When a task is created, the SwarmCoordinator selects an agent in five steps:

1. **Capability Matching**: Filter agents whose role matches the task type
2. **Learning Profile Lookup**: Fetch each candidate's expertise score for that task type
3. **Load Balancing**: Check each candidate's current load (tasks in progress)
4. **Scoring**: `final_score = 0.3*load + 0.5*expertise + 0.2*confidence`; the highest score wins (the `load` term presumably rewards lighter load, since a raw task count would favor busy agents)
5. **Notification**: The selected agent receives the job via NATS JetStream
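
The scoring step can be sketched as follows. This is a minimal illustration, not the SwarmCoordinator implementation: the `Candidate` type, the `max_load` cap, and the reading of the `load` term as an availability score (1.0 for an idle agent, 0.0 at capacity) are all assumptions. Only the three weights come from the formula above.

```python
from dataclasses import dataclass

# Weights taken from the scoring formula in this document.
W_LOAD, W_EXPERTISE, W_CONFIDENCE = 0.3, 0.5, 0.2


@dataclass
class Candidate:
    """Hypothetical view of an agent at assignment time."""
    name: str
    tasks_in_progress: int
    expertise: float   # 0.0-1.0, from the learning profile
    confidence: float  # 0.0-1.0, grows with sample size


def availability(tasks_in_progress: int, max_load: int = 5) -> float:
    """Assumed load term: 1.0 when idle, 0.0 at or above max_load."""
    return max(0.0, 1.0 - tasks_in_progress / max_load)


def final_score(c: Candidate) -> float:
    """Weighted sum of availability, expertise, and confidence."""
    return (W_LOAD * availability(c.tasks_in_progress)
            + W_EXPERTISE * c.expertise
            + W_CONFIDENCE * c.confidence)


def pick_agent(candidates: list[Candidate]) -> Candidate:
    """Return the highest-scoring candidate."""
    return max(candidates, key=final_score)


candidates = [
    Candidate("dev-a", tasks_in_progress=4, expertise=0.9, confidence=0.8),
    Candidate("dev-b", tasks_in_progress=0, expertise=0.6, confidence=0.5),
]
best = pick_agent(candidates)
```

Under these assumptions the idle `dev-b` (score 0.70) edges out the more expert but busier `dev-a` (score 0.67), which shows how the load term can override a moderate expertise gap.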

### Agent Roles

| Role | Specialization | Primary Tasks |
|------|---|---|
| **Architect** | System design | Feature planning, ADRs, design reviews |
| **Developer** | Implementation | Code generation, refactoring, debugging |
| **Reviewer** | Quality assurance | Code review, test coverage, style checks |
| **Tester** | QA & Benchmarks | Test suite, performance benchmarks |
| **Documenter** | Documentation | Guides, API docs, README updates |
| **Marketer** | Marketing content | Blog posts, case studies, announcements |
| **Presenter** | Presentations | Slides, deck creation, demo scripts |
| **DevOps** | Infrastructure | CI/CD setup, deployment, monitoring |
| **Monitor** | Health & Alerting | System monitoring, alerts, incident response |
| **Security** | Compliance & Audit | Code security, access control, compliance |
| **ProjectManager** | Coordination | Roadmap, tracking, milestone management |
| **DecisionMaker** | Conflict Resolution | Tie-breaking, escalation, ADR creation |

---

## 🔄 Multi-Agent Workflow Execution

### Sequential Workflow (Phases)

```
Phase 1: Design
  └─ Architect creates ADR
     └─ Move to Phase 2 (auto on completion)

Phase 2: Development
  └─ Developer implements
  └─ (Parallel) Documenter writes guide
     └─ Move to Phase 3

Phase 3: Review
  └─ Reviewer checks code quality
  └─ Security audits for compliance
     └─ If approved: Move to Phase 4
     └─ If rejected: Back to Phase 2

Phase 4: Testing
  └─ Tester creates test suite
  └─ Tester runs benchmarks
     └─ If passing: Move to Phase 5
     └─ If failing: Back to Phase 2

Phase 5: Completion
  └─ DevOps deploys
  └─ Monitor sets up alerts
  └─ ProjectManager marks done
```
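
The phase transitions above amount to a small state machine. The sketch below encodes them with hypothetical names (`Phase`, `next_phase`); it illustrates the flow and is not the workflow engine's API.

```python
from enum import IntEnum
from typing import Optional


class Phase(IntEnum):
    DESIGN = 1
    DEVELOPMENT = 2
    REVIEW = 3
    TESTING = 4
    COMPLETION = 5


# Phases whose outcome can send the task back to Development.
GATED = {Phase.REVIEW, Phase.TESTING}


def next_phase(current: Phase, passed: bool = True) -> Optional[Phase]:
    """Return the phase that follows `current`, or None when the workflow ends."""
    if current in GATED and not passed:
        return Phase.DEVELOPMENT
    if current is Phase.COMPLETION:
        return None
    return Phase(current + 1)
```

For example, `next_phase(Phase.REVIEW, passed=False)` returns `Phase.DEVELOPMENT`, matching the "If rejected: Back to Phase 2" branch.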

### Parallel Coordination

Agents whose work is independent run simultaneously:

```
Task: "Add learning profiles"

├─ Architect (ADR)          ▶ Created in 2h
├─ Developer (Code)         ▶ Implemented in 8h
│  ├─ Reviewer (Review)     ▶ Reviewed in 1h (parallel)
│  └─ Documenter (Guide)    ▶ Documented in 2h (parallel)
│
└─ Tester (Tests)           ▶ Tests in 3h
   └─ Security (Audit)      ▶ Audited in 1h (parallel)
```

### Approval Gates

Critical decision points require manual approval:

- **Security Gate**: Security approval required when code touches auth or secrets
- **Breaking Changes**: Architect approval required
- **Production Deployment**: DevOps + ProjectManager approval
- **Major Refactoring**: Architect + Lead Developer approval
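
The gates above can be expressed as a lookup from change characteristics to required approvers. In this sketch the `ChangeSet` flags and the `required_approvers` helper are hypothetical, and it assumes the Security Gate is signed off by the Security role.

```python
from dataclasses import dataclass


@dataclass
class ChangeSet:
    """Hypothetical summary of what a task's changes involve."""
    touches_auth_or_secrets: bool = False
    breaking_change: bool = False
    production_deploy: bool = False
    major_refactor: bool = False


def required_approvers(change: ChangeSet) -> set[str]:
    """Collect the roles that must sign off before the task can proceed."""
    approvers: set[str] = set()
    if change.touches_auth_or_secrets:
        approvers.add("Security")  # assumed approver for the Security Gate
    if change.breaking_change:
        approvers.add("Architect")
    if change.production_deploy:
        approvers |= {"DevOps", "ProjectManager"}
    if change.major_refactor:
        approvers |= {"Architect", "Lead Developer"}
    return approvers
```

A change that trips no gate returns an empty set, so it can proceed without manual approval.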

---

## 📝 Decision Extraction (ADRs)

Design decisions are captured automatically as ADR candidates, then curated before archival:

### ADR Template

```markdown
# ADR-042: Learning-Based Agent Selection

## Context

Previous agent assignment used simple load balancing (min tasks),
ignoring historical performance data. This led to poor agent-task matches.

## Decision

Implement per-task-type learning profiles with recency bias.

### Key Points
- Success rate weighted by recency (7-day window, 3× weight)
- Confidence scoring prevents small-sample overfitting
- Supports adaptive recovery from temporary degradation

## Consequences

**Positive**:
- 30-50% improvement in task success rate
- Agents improve continuously

**Negative**:
- Requires KG data collection (startup period)
- Learning period ~20 tasks per task-type

## Alternatives Considered

1. Rule-based routing (rejected: no learning)
2. Pure random assignment (rejected: no improvement)
3. Rolling average (rejected: no recency bias)

## Decision Made

Learning profiles with recency bias (as described under Decision)
```
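
The "Key Points" in this sample ADR can be made concrete with a small calculation. The sketch below assumes one plausible reading: executions from the last 7 days count three times, older ones count once, and confidence rises linearly until about 20 samples (the learning period the ADR mentions). The function names and the linear confidence ramp are assumptions, not the production algorithm.

```python
from datetime import datetime, timedelta, timezone

RECENT_WINDOW = timedelta(days=7)
RECENT_WEIGHT = 3.0      # "3x weight" from the ADR
FULL_CONFIDENCE_AT = 20  # assumed from "~20 tasks per task-type"


def weighted_success_rate(executions, now):
    """Recency-weighted success rate over (finished_at, success) pairs."""
    weighted_ok = total = 0.0
    for finished_at, success in executions:
        weight = RECENT_WEIGHT if now - finished_at <= RECENT_WINDOW else 1.0
        total += weight
        if success:
            weighted_ok += weight
    return weighted_ok / total if total else 0.0


def confidence(sample_count):
    """Assumed linear ramp: 0.0 with no data, 1.0 from 20 samples on."""
    return min(1.0, sample_count / FULL_CONFIDENCE_AT)


now = datetime(2026, 1, 11, tzinfo=timezone.utc)
history = [
    (now - timedelta(days=1), True),    # recent success, weight 3
    (now - timedelta(days=2), True),    # recent success, weight 3
    (now - timedelta(days=30), False),  # old failure, weight 1
    (now - timedelta(days=40), False),  # old failure, weight 1
]
rate = weighted_success_rate(history, now)  # 6 / 8 = 0.75
```

The unweighted success rate for this history would be 0.5. The recency weighting lifts it to 0.75, which is the "adaptive recovery" effect the ADR describes, while `confidence(4)` stays at 0.2 so four samples alone do not dominate the final score.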

### ADR Extraction Process

1. **Automatic**: Each task completion generates an execution record
2. **Learning**: If the decision involved trade-offs, it is extracted as an ADR candidate
3. **Curation**: The ProjectManager or Architect reviews and approves the candidate
4. **Archival**: The approved ADR is stored in `docs/adrs/` (numbered, immutable)

---

## 📚 Documentation Synchronization

### Automatic Updates

When a task completes, the documents listed for its type are updated automatically:

| Task Type | Auto-Updates |
|---|---|
| Feature | CHANGELOG.md, feature overview, API docs |
| Bugfix | CHANGELOG.md, troubleshooting guide |
| Tech-Debt | Architecture docs, refactoring guide |
| Enhancement | Feature docs, user guide |
| Documentation | Indexed in RAG, updated in search |

### Documentation Lifecycle

```
Task Created
    │
    ▼
Documentation Context Extracted
    │
    ├─ Decision/ADR created
    ├─ Related docs identified
    └─ Change summary prepared
    │
    ▼
Task Execution
    │
    ├─ Code generated
    ├─ Tests created
    └─ Examples documented
    │
    ▼
Task Complete
    │
    ├─ ADR finalized
    ├─ Docs auto-generated
    ├─ CHANGELOG entry created
    └─ Search index updated (RAG)
    │
    ▼
Archival (if stale)
    │
    └─ Moved to docs/archive/
       (kept for historical reference)
```

---

## 🔍 Search & Retrieval (RAG Integration)

### Document Indexing

All generated documentation is indexed for semantic search:

- **Architecture decisions** (ADRs)
- **Feature guides** (how-tos)
- **Code examples** (patterns)
- **Execution history** (knowledge graph)

### Query Examples

User asks: "How do I implement learning profiles?"

System searches:
1. ADRs mentioning "learning"
2. Implementation guides with "learning"
3. Execution history with similar task type
4. Code examples for "learning profiles"

Returns ranked results with sources.

---

## 📊 Metrics & Monitoring

### Task Metrics

- **Success Rate**: % of tasks completed successfully
- **Cycle Time**: Average time from todo → done
- **Agent Utilization**: Tasks per agent per role
- **Decision Quality**: ADRs implemented vs. abandoned

### Agent Metrics (per role)

- **Task Success Rate**: % of tasks completed successfully
- **Learning Curve**: Expertise improvement over time
- **Cost per Task**: Average LLM spend per completed task
- **Task Coverage**: Breadth of task-types handled

### Documentation Metrics

- **Coverage**: % of features documented
- **Freshness**: Days since last update
- **Usage**: Search queries hitting each doc
- **Accuracy**: User feedback on doc correctness

---

## 🏗️ Implementation Details

### SurrealDB Schema

```sql
-- Tasks table
DEFINE TABLE tasks SCHEMAFULL;
DEFINE FIELD id ON tasks TYPE string;
DEFINE FIELD type ON tasks TYPE string;
DEFINE FIELD state ON tasks TYPE string;
DEFINE FIELD assigned_agent ON tasks TYPE option<string>;

-- Executions (for learning)
DEFINE TABLE executions SCHEMAFULL;
DEFINE FIELD task_id ON executions TYPE string;
DEFINE FIELD agent_id ON executions TYPE string;
DEFINE FIELD success ON executions TYPE bool;
DEFINE FIELD duration_ms ON executions TYPE number;
DEFINE FIELD cost_cents ON executions TYPE number;

-- ADRs table
DEFINE TABLE adrs SCHEMAFULL;
DEFINE FIELD id ON adrs TYPE string;
DEFINE FIELD task_id ON adrs TYPE string;
DEFINE FIELD title ON adrs TYPE string;
DEFINE FIELD status ON adrs TYPE string; -- draft|approved|archived
```
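
To illustrate how `executions` rows can feed the agent metrics listed earlier, the sketch below aggregates success rate and cost per successful task for each agent. It works on plain dictionaries shaped like the schema above rather than on a live SurrealDB connection, and the `agent_metrics` helper is hypothetical.

```python
from collections import defaultdict


def agent_metrics(executions):
    """Aggregate per-agent success rate and cost per successful task.

    Each execution is a dict with the fields defined on the `executions`
    table: task_id, agent_id, success, duration_ms, cost_cents.
    """
    grouped = defaultdict(list)
    for row in executions:
        grouped[row["agent_id"]].append(row)

    metrics = {}
    for agent_id, rows in grouped.items():
        successes = sum(1 for r in rows if r["success"])
        total_cost = sum(r["cost_cents"] for r in rows)
        metrics[agent_id] = {
            "success_rate": successes / len(rows),
            # Assumption: failed attempts still count toward spend.
            "cost_per_success_cents": total_cost / successes if successes else None,
        }
    return metrics


rows = [
    {"task_id": "task-087", "agent_id": "developer", "success": True,
     "duration_ms": 1200, "cost_cents": 40},
    {"task_id": "task-088", "agent_id": "developer", "success": False,
     "duration_ms": 900, "cost_cents": 20},
]
result = agent_metrics(rows)
```

In practice the same aggregation would more likely run as a grouped query inside the database; the Python version is only meant to show which fields each metric depends on.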

### NATS Topics (Subjects)

- `tasks.{type}.{priority}` — Task assignments
- `agents.{role}.ready` — Agent heartbeats
- `agents.{role}.complete` — Task completion
- `adrs.created` — New ADR events
- `docs.updated` — Documentation changes
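
A small helper can keep publishers and subscribers consistent with the patterns above. The sketch below is standalone Python with no NATS client involved: `task_subject` and `subject_matches` are hypothetical helpers, and the allowed type and priority values are taken from the task metadata earlier in this document. The matcher follows NATS wildcard rules, where `*` matches exactly one token and `>` matches one or more trailing tokens.

```python
TASK_TYPES = {"feature", "bugfix", "enhancement", "tech-debt"}
PRIORITIES = {"high", "medium", "low"}


def task_subject(task_type: str, priority: str) -> str:
    """Build a `tasks.{type}.{priority}` subject, rejecting unknown values."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown task type: {task_type!r}")
    if priority not in PRIORITIES:
        raise ValueError(f"unknown priority: {priority!r}")
    return f"tasks.{task_type}.{priority}"


def subject_matches(pattern: str, subject: str) -> bool:
    """Minimal NATS-style matching: `*` is one token, `>` is the remainder."""
    p_tokens, s_tokens = pattern.split("."), subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            return len(s_tokens) > i  # `>` needs at least one token
        if i >= len(s_tokens) or (p != "*" and p != s_tokens[i]):
            return False
    return len(p_tokens) == len(s_tokens)
```

For example, a worker that only handles urgent work could subscribe to `tasks.*.high`, which matches `tasks.feature.high` but not `tasks.feature.low`.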

---

## 🎯 Key Design Patterns

### 1. Event-Driven Coordination
- Task creation → Agent assignment (async via NATS)
- Task completion → Documentation update (eventual consistency)
- No direct API calls between services (loosely coupled)

### 2. Learning from Execution History
- Every task stores execution metadata (success, duration, cost)
- Learning profiles updated from execution data
- Assignment quality improves continuously as execution data accumulates

### 3. Decision Extraction
- Design decisions captured as ADRs
- Immutable record of architectural rationale
- Serves as organizational memory

### 4. Graceful Degradation
- NATS offline: In-memory queue fallback
- Agent unavailable: Task re-assigned to next best
- Doc generation failed: Manual entry allowed
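
The first fallback, queuing in memory while NATS is offline, can be sketched as a thin wrapper around whatever publish function the service uses. `FallbackPublisher` is a hypothetical name, and the sketch assumes the underlying publish call raises `ConnectionError` when the broker is unreachable; a real client may signal failure differently.

```python
from collections import deque
from typing import Callable


class FallbackPublisher:
    """Hypothetical wrapper: publish via a broker, queue in memory on failure."""

    def __init__(self, publish: Callable[[str, bytes], None]):
        self._publish = publish
        self._pending: deque = deque()

    def send(self, subject: str, payload: bytes) -> bool:
        """Return True if delivered now, False if queued for later."""
        try:
            self._publish(subject, payload)
            return True
        except ConnectionError:
            self._pending.append((subject, payload))
            return False

    def flush(self) -> int:
        """Retry queued messages in order; stop at the first failure."""
        sent = 0
        while self._pending:
            subject, payload = self._pending[0]
            try:
                self._publish(subject, payload)
            except ConnectionError:
                break
            self._pending.popleft()
            sent += 1
        return sent
```

Messages held only in memory are lost if the process restarts, so this trades durability for availability. A caller that needs strict ordering should also call `flush()` before sending new messages once the broker is back.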

---

## 📚 Related Documentation

- **[VAPORA Architecture](vapora-architecture.md)** — System overview
- **[Agent Registry & Coordination](agent-registry-coordination.md)** — Agent patterns
- **[Multi-Agent Workflows](multi-agent-workflows.md)** — Workflow execution
- **[Multi-IA Router](multi-ia-router.md)** — LLM provider selection
- **[Roles, Permissions & Profiles](roles-permissions-profiles.md)** — RBAC
