Vapora/docs/architecture/multi-agent-workflows.md
Jesús Pérez d14150da75 feat: Phase 5.3 - Multi-Agent Learning Infrastructure
Implement intelligent agent learning from Knowledge Graph execution history
with per-task-type expertise tracking, recency bias, and learning curves.

## Phase 5.3 Implementation

### Learning Infrastructure (Complete)
- LearningProfileService with per-task-type expertise metrics
- TaskTypeExpertise model tracking success_rate, confidence, learning curves
- Recency bias weighting: recent 7 days weighted 3x higher (exponential decay)
- Confidence scoring prevents overfitting: min(1.0, executions / 20)
- Learning curves computed from daily execution windows
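
The recency weighting and confidence cap above can be sketched as follows. The commit message does not spell out the exact decay curve, so the weight function here is an assumption: 3x for a brand-new execution, decaying toward 1x with a 7-day half-life. The `Execution` struct and all function names are hypothetical; only the `min(1.0, executions / 20)` formula comes from the source.

```rust
/// One past execution of a given task type (hypothetical shape).
pub struct Execution {
    pub success: bool,
    /// Age of the execution in days at scoring time.
    pub age_days: f64,
}

/// Assumed weight curve: 3x for a brand-new execution, decaying toward 1x
/// with a 7-day half-life.
pub fn recency_weight(age_days: f64) -> f64 {
    1.0 + 2.0 * 0.5_f64.powf(age_days / 7.0)
}

/// Success rate in which each execution counts in proportion to its recency weight.
pub fn recent_success_rate(executions: &[Execution]) -> f64 {
    let total: f64 = executions.iter().map(|e| recency_weight(e.age_days)).sum();
    if total == 0.0 {
        return 0.0; // no history yet
    }
    let successes: f64 = executions
        .iter()
        .filter(|e| e.success)
        .map(|e| recency_weight(e.age_days))
        .sum();
    successes / total
}

/// Confidence as stated in the source: min(1.0, executions / 20).
pub fn confidence(total_executions: u32) -> f64 {
    (f64::from(total_executions) / 20.0).min(1.0)
}
```

Under this assumed curve, one fresh success and one week-old failure give a recency-weighted success rate of 0.6 rather than the unweighted 0.5.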

### Agent Scoring Service (Complete)
- Unified AgentScore combining SwarmCoordinator + learning profiles
- Scoring formula: 0.3*base + 0.5*expertise + 0.2*confidence
- Rank agents by combined score for intelligent assignment
- Support for recency-biased scoring (recent_success_rate)
- Methods: rank_agents, select_best, rank_agents_with_recency
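
As a rough illustration of the scoring formula, the sketch below combines the three inputs with the 0.3 / 0.5 / 0.2 weights and ranks candidates by the result. The method names `rank_agents` and `select_best` appear in the commit message, but the struct layout and signatures used here are assumptions, not the actual `AgentScoringService` API.

```rust
/// Inputs to the combined score, each assumed to lie in [0, 1].
#[derive(Debug, Clone, PartialEq)]
pub struct AgentScore {
    pub agent_id: String,
    pub base: f64,       // load-based score from the swarm coordinator
    pub expertise: f64,  // learned success rate for the task type
    pub confidence: f64, // min(1.0, executions / 20)
}

impl AgentScore {
    /// Formula from the source: 0.3*base + 0.5*expertise + 0.2*confidence.
    pub fn combined(&self) -> f64 {
        0.3 * self.base + 0.5 * self.expertise + 0.2 * self.confidence
    }
}

/// Sorts candidates from highest to lowest combined score.
pub fn rank_agents(mut candidates: Vec<AgentScore>) -> Vec<AgentScore> {
    candidates.sort_by(|a, b| b.combined().total_cmp(&a.combined()));
    candidates
}

/// Returns the top-ranked candidate, or `None` when there are no candidates.
pub fn select_best(candidates: Vec<AgentScore>) -> Option<AgentScore> {
    rank_agents(candidates).into_iter().next()
}
```

Because expertise carries half the weight, an idle agent with little relevant history can still lose to a busier agent with a strong track record on that task type.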

### KG Integration (Complete)
- KGPersistence::get_executions_for_task_type() - query by agent + task type
- KGPersistence::get_agent_executions() - all executions for agent
- Coordinator::load_learning_profile_from_kg() - core KG→Learning integration
- Coordinator::load_all_learning_profiles() - batch load for multiple agents
- Convert PersistedExecution → ExecutionData for learning calculations

### Agent Assignment Integration (Complete)
- AgentCoordinator uses learning profiles for task assignment
- extract_task_type() infers task type from title/description
- assign_task() scores candidates using AgentScoringService
- Fallback to load-based selection if no learning data available
- Learning profiles stored in coordinator.learning_profiles RwLock
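
One plausible way to infer a task type from the title and description is a keyword lookup, sketched below. The commit message does not say how `extract_task_type()` works, so this signature, the keyword table, and the task-type names other than `ArchitectureDesign` and `GeneralQuery` are all illustrative assumptions.

```rust
/// Keyword table checked in order; keywords and most task-type names are
/// illustrative assumptions.
const KEYWORDS: &[(&str, &str)] = &[
    ("architecture", "ArchitectureDesign"),
    ("design", "ArchitectureDesign"),
    ("review", "CodeReview"),
    ("test", "Testing"),
    ("deploy", "Deployment"),
];

/// Returns the task type of the first keyword found in the title or
/// description, falling back to `GeneralQuery` when nothing matches.
pub fn extract_task_type(title: &str, description: &str) -> &'static str {
    let text = format!("{title} {description}").to_lowercase();
    KEYWORDS
        .iter()
        .find(|(keyword, _)| text.contains(*keyword))
        .map(|(_, task_type)| *task_type)
        .unwrap_or("GeneralQuery")
}
```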

### Profile Adapter Enhancements (Complete)
- create_learning_profile() - initialize empty profiles
- add_task_type_expertise() - set task-type expertise
- update_profile_with_learning() - update swarm profiles from learning

## Files Modified

### vapora-knowledge-graph/src/persistence.rs (+30 lines)
- get_executions_for_task_type(agent_id, task_type, limit)
- get_agent_executions(agent_id, limit)

### vapora-agents/src/coordinator.rs (+100 lines)
- load_learning_profile_from_kg() - core KG integration method
- load_all_learning_profiles() - batch loading for agents
- assign_task() already uses learning-based scoring via AgentScoringService

### Existing Complete Implementation
- vapora-knowledge-graph/src/learning.rs - calculation functions
- vapora-agents/src/learning_profile.rs - data structures and expertise
- vapora-agents/src/scoring.rs - unified scoring service
- vapora-agents/src/profile_adapter.rs - adapter methods

## Tests Passing
- learning_profile: 7 tests 
- scoring: 5 tests 
- profile_adapter: 6 tests 
- coordinator: learning-specific tests 

## Data Flow
1. Task arrives → AgentCoordinator::assign_task()
2. Extract task_type from description
3. Query KG for task-type executions (load_learning_profile_from_kg)
4. Calculate expertise with recency bias
5. Score candidates (SwarmCoordinator + learning)
6. Assign to top-scored agent
7. Execution result → KG → Update learning profiles
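
Steps 3 to 6, together with the load-based fallback, might look roughly like the sketch below. It is a simplified, synchronous stand-in under stated assumptions: the `Candidate` struct, the expertise map keyed by `(agent_id, task_type)`, and this `assign_task` signature are hypothetical and do not reflect the real `AgentCoordinator`.

```rust
use std::collections::HashMap;

/// An agent that could take the task (hypothetical shape).
pub struct Candidate {
    pub agent_id: String,
    /// Current load in [0, 1]; lower means more available.
    pub load: f64,
}

/// Picks an agent for `task_type`. `expertise` maps (agent_id, task_type) to a
/// learned score. If no candidate has learning data, selection falls back to
/// the least-loaded agent.
pub fn assign_task<'a>(
    candidates: &'a [Candidate],
    task_type: &str,
    expertise: &HashMap<(String, String), f64>,
) -> Option<&'a Candidate> {
    let learned: Vec<(&Candidate, f64)> = candidates
        .iter()
        .filter_map(|c| {
            expertise
                .get(&(c.agent_id.clone(), task_type.to_string()))
                .map(|score| (c, *score))
        })
        .collect();

    if learned.is_empty() {
        // Fallback: no learning data for this task type, so use load only.
        return candidates.iter().min_by(|a, b| a.load.total_cmp(&b.load));
    }

    learned
        .into_iter()
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(candidate, _)| candidate)
}
```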

## Key Design Decisions
- Recency bias: 7-day half-life with 3x weight for recent performance
- Confidence scoring: min(1.0, total_executions / 20) prevents overfitting
- Hierarchical scoring: 30% base load, 50% expertise, 20% confidence
- KG query limit: 100 recent executions per task-type for performance
- Async loading: load_learning_profile_from_kg supports concurrent loads

## Next: Phase 5.4 - Cost Optimization
Ready to implement budget enforcement and cost-aware provider selection.
2026-01-11 13:03:53 +00:00


# 🔄 Multi-Agent Workflows
## End-to-End Parallel Task Orchestration
**Version**: 0.1.0
**Status**: Specification (VAPORA v1.0 - Workflows)
**Purpose**: Workflows where 10+ agents work in parallel, coordinated automatically

---
## 🎯 Objective
Orchestrate workflows in which multiple agents work **in parallel** on different aspects of a task, with no manual intervention:
```
Feature Request
    ↓
ProjectManager creates task
    ↓ (parallel)
Architect designs ────────┐
Developer implements ─────├─→ Reviewer reviews ──┐
Tester writes tests ──────┤                      ├─→ DecisionMaker approves
Documenter prepares docs ─┤                      ├─→ DevOps deploys
Security audits ──────────┘                      │
                                                 Marketer promotes
```
---
## 📋 Workflow: Complex Feature End-to-End
### Phase 1: Planning (Serial - Requires Approval)
**Agents**: Architect, ProjectManager, DecisionMaker
**Timeline**: 1-2 hours
```yaml
Workflow: feature-auth-mfa
Status: planning
Created: 2025-11-09T10:00:00Z

Steps:
  1_architect_designs:
    agent: architect
    input: feature_request, project_context
    task_type: ArchitectureDesign
    quality: Critical
    estimated_duration: 45min
    output:
      - design_doc.md
      - adr-001-mfa-strategy.md
      - architecture_diagram.svg

  2_pm_validates:
    dependencies: [1_architect_designs]
    agent: project-manager
    task_type: GeneralQuery
    input: design_doc, project_timeline
    action: validate_feasibility

  3_decision_maker_approves:
    dependencies: [2_pm_validates]
    agent: decision-maker
    task_type: GeneralQuery
    input: design, feasibility_report
    approval_required: true
    escalation_if: ["too risky", "breaks roadmap"]
```
**Output**: Approved ADR, design doc, go/no-go decision

---
### Phase 2: Implementation (Parallel - Maximum Concurrency)
**Agents**: Developer (×3), Tester, Security, Documenter (async)
**Timeline**: 3-5 days
```yaml
4_frontend_dev:
  dependencies: [3_decision_maker_approves]
  agent: developer-frontend
  skill_match: frontend
  input: design_doc, api_spec
  tasks:
    - implement_mfa_ui
    - add_totp_input
    - add_webauthn_button
  parallel_with: [4_backend_dev, 5_security_audit, 6_docs_start]
  max_duration: 4days

4_backend_dev:
  dependencies: [3_decision_maker_approves]
  agent: developer-backend
  skill_match: backend, security
  input: design_doc, database_schema
  tasks:
    - implement_mfa_service
    - add_totp_verification
    - add_webauthn_endpoint
  parallel_with: [4_frontend_dev, 5_security_audit, 6_docs_start]
  max_duration: 4days

5_security_audit:
  dependencies: [3_decision_maker_approves]
  agent: security
  input: design_doc, threat_model
  tasks:
    - threat_modeling
    - security_review
    - vulnerability_scan_plan
  parallel_with: [4_frontend_dev, 4_backend_dev, 6_docs_start]
  can_block_deployment: true

6_docs_start:
  dependencies: [3_decision_maker_approves]
  agent: documenter
  input: design_doc
  tasks:
    - create_adr_doc
    - start_implementation_guide
  parallel_with: [4_frontend_dev, 4_backend_dev, 5_security_audit]
  low_priority: true

Status: in_progress
Parallel_agents: 5
Progress: 60%
Blockers: none
```
**Output**:
- Frontend implementation + PRs
- Backend implementation + PRs
- Security audit report
- Initial documentation
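
The `dependencies` fields above determine which steps can run side by side. A minimal sketch of that resolution, assuming a plain map from step name to dependency names, groups steps into waves in which every step's dependencies sit in earlier waves. The function name `parallel_waves` and the data shape are hypothetical; the scheduler's real design is not specified in this document.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Groups steps into waves: every step in a wave has all of its dependencies
/// in earlier waves, so the steps of one wave can run in parallel.
/// Returns `None` if the graph has a cycle or names an unknown dependency.
pub fn parallel_waves(deps: &BTreeMap<&str, Vec<&str>>) -> Option<Vec<Vec<String>>> {
    let mut done: BTreeSet<&str> = BTreeSet::new();
    let mut waves = Vec::new();

    while done.len() < deps.len() {
        // A step is ready when it has not run yet and all its dependencies have.
        let ready: Vec<&str> = deps
            .iter()
            .filter(|(step, needs)| {
                !done.contains(*step) && needs.iter().all(|d| done.contains(d))
            })
            .map(|(step, _)| *step)
            .collect();

        if ready.is_empty() {
            return None; // nothing can make progress: cycle or missing step
        }
        done.extend(ready.iter().copied());
        waves.push(ready.into_iter().map(String::from).collect());
    }
    Some(waves)
}
```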
---
### Phase 3: Code Review (Parallel but Gated)
**Agents**: CodeReviewer (×2), Security, Tester
**Timeline**: 1-2 days
```yaml
7a_frontend_review:
  dependencies: [4_frontend_dev]
  agent: code-reviewer-frontend
  input: frontend_pr
  actions: [comment, request_changes, approve]
  must_pass: 1  # At least 1 reviewer
  can_block_merge: true

7b_backend_review:
  dependencies: [4_backend_dev]
  agent: code-reviewer-backend
  input: backend_pr
  actions: [comment, request_changes, approve]
  must_pass: 1
  security_required: true  # Security must also approve

7c_security_review:
  dependencies: [4_backend_dev, 5_security_audit]
  agent: security
  input: backend_pr, security_audit
  actions: [scan, approve_or_block]
  critical_vulns_block_merge: true
  high_vulns_require_mitigation: true

7d_test_coverage:
  dependencies: [4_frontend_dev, 4_backend_dev]
  agent: tester
  input: frontend_pr, backend_pr
  actions: [run_tests, check_coverage, benchmark]
  must_pass: tests_passing && coverage > 85%

Status: in_progress
Parallel_reviewers: 4
Approved: frontend_review
Pending: backend_review (awaiting security_review)
Blockers: security_review
```
**Output**:
- Approved PRs (if all pass)
- Comments & requested changes
- Test coverage report
- Security clearance
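
The review gates above (`must_pass`, `critical_vulns_block_merge`, `high_vulns_require_mitigation`) can be read as one merge decision. The sketch below is one possible interpretation; `ReviewState`, `GateDecision`, and `evaluate_merge_gate` are hypothetical names, and the real gate evaluator is still an open item in the implementation checklist.

```rust
/// Aggregated review results for a pull request (hypothetical shape).
pub struct ReviewState {
    pub approvals: u32,
    pub tests_passing: bool,
    pub coverage_percent: f64,
    pub critical_vulns: u32,
    pub unmitigated_high_vulns: u32,
}

#[derive(Debug, PartialEq)]
pub enum GateDecision {
    Pass,
    /// Every reason the merge is blocked, in evaluation order.
    Blocked(Vec<&'static str>),
}

/// Applies the Phase 3 rules: at least one approval, passing tests,
/// coverage above 85%, no critical and no unmitigated high vulnerabilities.
pub fn evaluate_merge_gate(state: &ReviewState) -> GateDecision {
    let mut reasons = Vec::new();
    if state.approvals < 1 {
        reasons.push("no approving reviewer");
    }
    if !state.tests_passing {
        reasons.push("tests are failing");
    }
    if state.coverage_percent <= 85.0 {
        reasons.push("coverage is not above 85%");
    }
    if state.critical_vulns > 0 {
        reasons.push("critical vulnerabilities found");
    }
    if state.unmitigated_high_vulns > 0 {
        reasons.push("high vulnerabilities lack mitigation");
    }
    if reasons.is_empty() {
        GateDecision::Pass
    } else {
        GateDecision::Blocked(reasons)
    }
}
```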
---
### Phase 4: Merge & Deploy (Serial - Ordered)
**Agents**: CodeReviewer, DevOps, Monitor
**Timeline**: 1-2 hours
```yaml
8_merge_to_dev:
  dependencies: [7a_frontend_review, 7b_backend_review, 7c_security_review, 7d_test_coverage]
  agent: code-reviewer
  action: merge_to_dev
  requires: all_approved

9_deploy_staging:
  dependencies: [8_merge_to_dev]
  agent: devops
  environment: staging
  actions: [trigger_ci, deploy_manifests, smoke_test]
  automatic_after_merge: true
  timeout: 30min

10_smoke_test:
  dependencies: [9_deploy_staging]
  agent: tester
  test_type: smoke
  environments: [staging]
  must_pass: all

11_monitor_staging:
  dependencies: [9_deploy_staging]
  agent: monitor
  duration: 1hour
  metrics: [error_rate, latency, cpu, memory]
  alert_if: error_rate > 1% or p99_latency > 500ms

Status: in_progress
Completed: 8_merge_to_dev
In_progress: 9_deploy_staging (20min elapsed)
Pending: 10_smoke_test, 11_monitor_staging
```
**Output**:
- Code merged to dev
- Deployed to staging
- Smoke tests pass
- Monitoring active
---
### Phase 5: Final Validation & Release
**Agents**: DecisionMaker, DevOps, Marketer, Monitor
**Timeline**: 1-3 hours
```yaml
12_final_approval:
  dependencies: [10_smoke_test, 11_monitor_staging]
  agent: decision-maker
  input: test_results, monitoring_report, security_clearance
  action: approve_for_production
  if_blocked: defer_to_next_week

13_deploy_production:
  dependencies: [12_final_approval]
  agent: devops
  environment: production
  deployment_strategy: blue_green  # 0 downtime
  actions: [deploy, health_check, traffic_switch]
  rollback_on: any_error

14_monitor_production:
  dependencies: [13_deploy_production]
  agent: monitor
  duration: 24hours
  alert_thresholds: [error_rate > 0.5%, p99 > 300ms, cpu > 80%]
  auto_rollback_if: critical_error

15_announce_release:
  dependencies: [13_deploy_production]  # Can start once deployed
  agent: marketer
  async: true
  actions: [draft_blog_post, announce_on_twitter, create_demo_video]

16_update_docs:
  dependencies: [13_deploy_production]
  agent: documenter
  async: true
  actions: [update_changelog, publish_guide, update_roadmap]

Status: completed
Deployed: 2025-11-10T14:00:00Z
Monitoring: Active
Release_notes: docs/releases/v1.2.0.md
```
**Output**:
- Deployed to production
- 24h monitoring active
- Blog post + social media
- Docs updated
- Release notes published
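
The `alert_thresholds` of step 14 amount to three comparisons against a metrics snapshot. A small sketch, assuming a hypothetical `Metrics` struct with percentages and milliseconds as units, shows how a monitor agent might report which thresholds are breached. What counts as a `critical_error` for auto-rollback is not defined in this document, so the sketch stops at alerting.

```rust
/// Snapshot of production metrics (field names and units are assumptions).
pub struct Metrics {
    pub error_rate_percent: f64,
    pub p99_latency_ms: f64,
    pub cpu_percent: f64,
}

/// Returns the name of every metric above its step 14 alert threshold:
/// error_rate > 0.5%, p99 > 300ms, cpu > 80%.
pub fn breached_thresholds(m: &Metrics) -> Vec<&'static str> {
    let checks = [
        ("error_rate", m.error_rate_percent > 0.5),
        ("p99", m.p99_latency_ms > 300.0),
        ("cpu", m.cpu_percent > 80.0),
    ];
    checks
        .iter()
        .filter(|(_, breached)| *breached)
        .map(|(name, _)| *name)
        .collect()
}
```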
---
## 🔄 Workflow State Machine
```
Created
  ↓
Planning (serial, approval-gated)
  ├─ Architect designs
  ├─ PM validates
  └─ DecisionMaker approves → GO / NO-GO
  ↓
Implementation (parallel)
  ├─ Frontend dev
  ├─ Backend dev
  ├─ Security audit
  ├─ Tester setup
  └─ Documenter start
  ↓
Review (parallel but gated)
  ├─ Code review
  ├─ Security review
  ├─ Test execution
  └─ Coverage check
  ↓
Merge & Deploy (serial, ordered)
  ├─ Merge to dev
  ├─ Deploy staging
  ├─ Smoke test
  └─ Monitor staging
  ↓
Release (parallel async)
  ├─ Final approval
  ├─ Deploy production
  ├─ Monitor 24h
  ├─ Marketing announce
  └─ Docs update
  ↓
Completed / Rolled back

Transitions:
  - Blocked → can escalate to DecisionMaker
  - Failed → auto-rollback if production
  - Waiting → timeout after N hours
```
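
The phase progression above maps naturally onto an enum with a transition function. The sketch below is an assumption-laden simplification: it models only success and failure events, and it treats a failure during `Release` as the production failure that triggers rollback. Failures in earlier phases return `None`, leaving escalation or retry to the caller. All type and function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkflowState {
    Created,
    Planning,
    Implementation,
    Review,
    MergeAndDeploy,
    Release,
    Completed,
    RolledBack,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    PhaseSucceeded,
    PhaseFailed,
}

/// Returns the next state, or `None` when the diagram defines no transition.
pub fn next_state(state: WorkflowState, event: Event) -> Option<WorkflowState> {
    use WorkflowState::*;
    match (state, event) {
        (Created, Event::PhaseSucceeded) => Some(Planning),
        (Planning, Event::PhaseSucceeded) => Some(Implementation),
        (Implementation, Event::PhaseSucceeded) => Some(Review),
        (Review, Event::PhaseSucceeded) => Some(MergeAndDeploy),
        (MergeAndDeploy, Event::PhaseSucceeded) => Some(Release),
        (Release, Event::PhaseSucceeded) => Some(Completed),
        // Assumption: only a failure in the production phase auto-rolls back.
        (Release, Event::PhaseFailed) => Some(RolledBack),
        _ => None,
    }
}
```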
---
## 🎯 Workflow DSL (YAML/TOML)
### Minimal Example
```yaml
workflow:
  id: feature-auth
  title: Implement MFA
  agents:
    architect:
      role: Architect
      parallel_with: [pm]
    pm:
      role: ProjectManager
      depends_on: [architect]
    developer:
      role: Developer
      depends_on: [pm]
      parallelizable: true
  approval_required_at: [architecture, deploy_production]
  allow_concurrent_agents: 10
  timeline_hours: 48
```
### Complex Example (Feature-complete)
```yaml
workflow:
  id: feature-user-preferences
  title: User Preferences System
  created_at: 2025-11-09T10:00:00Z
  phases:
    phase_1_design:
      duration_hours: 2
      serial: true
      steps:
        - name: architect_designs
          agent: architect
          input: feature_spec
          output: design_doc
        - name: architect_creates_adr
          agent: architect
          depends_on: architect_designs
          output: adr-017.md
        - name: pm_reviews
          agent: project-manager
          depends_on: architect_creates_adr
          approval_required: true
    phase_2_implementation:
      duration_hours: 48
      parallel: true
      max_concurrent_agents: 6
      steps:
        - name: frontend_dev
          agent: developer
          skill_match: frontend
          depends_on: [architect_designs]
        - name: backend_dev
          agent: developer
          skill_match: backend
          depends_on: [architect_designs]
        - name: db_migration
          agent: devops
          depends_on: [architect_designs]
        - name: security_review
          agent: security
          depends_on: [architect_designs]
        - name: docs_start
          agent: documenter
          depends_on: [architect_creates_adr]
          priority: low
    phase_3_review:
      duration_hours: 16
      gate: all_tests_pass && all_reviews_approved
      steps:
        - name: frontend_review
          agent: code-reviewer
          depends_on: frontend_dev
        - name: backend_review
          agent: code-reviewer
          depends_on: backend_dev
        - name: tests
          agent: tester
          depends_on: [frontend_dev, backend_dev]
        - name: deploy_staging
          agent: devops
          depends_on: [frontend_review, backend_review, tests]
    phase_4_release:
      duration_hours: 4
      steps:
        - name: final_approval
          agent: decision-maker
          depends_on: phase_3_review
        - name: deploy_production
          agent: devops
          depends_on: final_approval
          strategy: blue_green
        - name: announce
          agent: marketer
          depends_on: deploy_production
          async: true
```
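
Before a parsed workflow runs, it helps to check that every `depends_on` entry names something that exists. The sketch below assumes the YAML has already been loaded into plain structs (parsing itself would need a YAML crate and is out of scope here) and that a dependency may name either a step or a whole phase, as `final_approval` does above. `Step` and `unknown_dependencies` are hypothetical names.

```rust
use std::collections::HashSet;

/// A workflow step after parsing (hypothetical shape).
pub struct Step {
    pub name: String,
    pub agent: String,
    pub depends_on: Vec<String>,
}

/// Returns (step, dependency) pairs whose dependency names neither a known
/// step nor one of the given phase names.
pub fn unknown_dependencies(steps: &[Step], phases: &[&str]) -> Vec<(String, String)> {
    let mut known: HashSet<&str> = steps.iter().map(|s| s.name.as_str()).collect();
    known.extend(phases.iter().copied());

    let mut missing = Vec::new();
    for step in steps {
        for dep in &step.depends_on {
            if !known.contains(dep.as_str()) {
                missing.push((step.name.clone(), dep.clone()));
            }
        }
    }
    missing
}
```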
---
## 🔧 Runtime: Monitoring & Adjustment
### Dashboard (Real-Time)
```
Workflow: feature-auth-mfa
Status: in_progress (Phase 2/5)
Progress: 45%
Timeline: 2/4 days remaining
Active Agents (5/12):
├─ architect-001 🟢 Designing (80% done)
├─ developer-frontend-001 🟢 Implementing (60% done)
├─ developer-backend-001 🟢 Implementing (50% done)
├─ security-001 🟢 Auditing (70% done)
└─ documenter-001 🟡 Waiting for PR links
Pending Agents (4):
├─ code-reviewer-001 ⏳ Waiting for frontend_dev
├─ code-reviewer-002 ⏳ Waiting for backend_dev
├─ tester-001 ⏳ Waiting for dev completion
└─ devops-001 ⏳ Waiting for reviews
Blockers: none
Issues: none
Risks: none
Timeline Projection:
- Design: ✅ 2h (completed)
- Implementation: 3d (50% done, on track)
- Review: 1d (scheduled)
- Deploy: 4h (scheduled)
Total ETA: 4d (vs 5d planned, 1d early!)
```
### Workflow Adjustments
```rust
/// Runtime adjustments the orchestrator can apply to a running workflow.
pub enum WorkflowAdjustment {
    /// Add more agents if progress is slow.
    AddAgent { agent_role: AgentRole, count: u32 },
    /// Parallelize steps that were serial.
    Parallelize { step_ids: Vec<String> },
    /// Skip optional steps to save time.
    SkipOptionalSteps { step_ids: Vec<String> },
    /// Escalate a blocker to the DecisionMaker.
    EscalateBlocker { step_id: String },
    /// Pause the workflow for manual review.
    Pause { reason: String },
    /// Cancel the workflow if it is infeasible.
    Cancel { reason: String },
}

// Example: if the projected timeline exceeds the plan, add agents.
// This fragment belongs inside an async function that returns a `Result`;
// `workflow`, `projected_timeline`, and `planned_timeline` come from the caller.
if projected_timeline > planned_timeline {
    workflow
        .adjust(WorkflowAdjustment::AddAgent {
            agent_role: AgentRole::Developer,
            count: 2,
        })
        .await?;
}
```
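
To make the adjustment flow concrete, here is a self-contained, synchronous sketch with a reduced set of variants. The `Workflow` struct, the `AgentRole` variants, and the `rebalance` helper are illustrative assumptions; the specification's `adjust` is async and its internals are not defined in this document.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentRole {
    Developer,
    Tester,
}

/// Reduced version of the adjustment enum, for illustration only.
pub enum WorkflowAdjustment {
    AddAgent { agent_role: AgentRole, count: u32 },
    Pause { reason: String },
}

/// Minimal in-memory workflow state (hypothetical).
#[derive(Debug, Default)]
pub struct Workflow {
    pub agents: Vec<AgentRole>,
    pub paused: Option<String>,
}

impl Workflow {
    /// Applies a single adjustment to the workflow state.
    pub fn adjust(&mut self, adjustment: WorkflowAdjustment) {
        match adjustment {
            WorkflowAdjustment::AddAgent { agent_role, count } => {
                self.agents
                    .extend(std::iter::repeat(agent_role).take(count as usize));
            }
            WorkflowAdjustment::Pause { reason } => self.paused = Some(reason),
        }
    }
}

/// Adds two developers when the projection exceeds the plan.
pub fn rebalance(workflow: &mut Workflow, projected_hours: u32, planned_hours: u32) {
    if projected_hours > planned_hours {
        workflow.adjust(WorkflowAdjustment::AddAgent {
            agent_role: AgentRole::Developer,
            count: 2,
        });
    }
}
```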
---
## 🎯 Implementation Checklist
- [ ] Workflow YAML/TOML parser
- [ ] State machine executor (Created→Completed)
- [ ] Parallel task scheduler
- [ ] Dependency resolution (topological sort)
- [ ] Gate evaluation (all_passed, any_approved, etc.)
- [ ] Blocking & escalation logic
- [ ] Rollback on failure
- [ ] Real-time dashboard
- [ ] Audit trail (who did what, when, why)
- [ ] CLI: `vapora workflow run feature-auth.yaml`
- [ ] CLI: `vapora workflow status --id feature-auth`
- [ ] Monitoring & alerting
---
## 📊 Success Metrics
- ✅ 10+ agents coordinated without errors
- ✅ Execution is truly parallel (not serial)
- ✅ Dependencies are respected
- ✅ Approval gates are enforced correctly
- ✅ Rollback works on failure
- ✅ Dashboard updates in real time
- ✅ Workflow completes within 5% of the estimated time
---
**Version**: 0.1.0
**Status**: Specification Complete (VAPORA v1.0)
**Purpose**: Multi-agent parallel workflow orchestration