Vapora/docs/architecture/roles-permissions-profiles.md
Jesús Pérez d14150da75 feat: Phase 5.3 - Multi-Agent Learning Infrastructure
2026-01-11 13:03:53 +00:00

# 👥 Roles, Permissions & Profiles
## Cedar-Based Access Control for Multi-Agent Teams
**Version**: 0.1.0
**Status**: Specification (VAPORA v1.0 - Authorization)
**Purpose**: Fine-grained RBAC + team profiles for agents and humans
---
## 🎯 Objective
Multi-level authorization system based on the **Cedar Policy Engine** (from provisioning):
- ✅ 12 specialized roles (agents + humans)
- ✅ Profiles that group roles (teams)
- ✅ Granular policies (resource-level, context-aware)
- ✅ Complete audit trail
- ✅ Dynamic policy reload (no restart)
---
## 👥 The 12 Roles (+ Admin/Guest)
### Technical Roles
**Architect**
- Permissions: Create ADRs, propose decisions, review architecture
- Restrictions: Can't deploy, can't approve own decisions
- Resources: Design documents, ADR files, architecture diagrams
**Developer**
- Permissions: Create code, push to dev branches, request reviews
- Restrictions: Can't merge to main, can't delete
- Resources: Code files, dev branches, PR creation
**CodeReviewer**
- Permissions: Comment on PRs, approve/request changes, merge to dev
- Restrictions: Can't approve own code, can't force push
- Resources: PRs, review comments, dev branches
**Tester**
- Permissions: Create/modify tests, run benchmarks, report issues
- Restrictions: Can't deploy, can't modify code outside tests
- Resources: Test files, benchmark results, issue reports
### Documentation Roles
**Documenter**
- Permissions: Modify docs/, README, CHANGELOG, update docs/adr/
- Restrictions: Can't modify source code
- Resources: docs/ directory, markdown files
**Marketer**
- Permissions: Create marketing content, modify website
- Restrictions: Can't modify code, docs, or infrastructure
- Resources: marketing/, website, blog posts
**Presenter**
- Permissions: Create presentations, record demos
- Restrictions: Read-only on all code
- Resources: presentations/, demo assets
### Operations Roles
**DevOps**
- Permissions: Approve PRs for deployment, trigger CI/CD, modify manifests
- Restrictions: Can't modify business logic, can't delete environments
- Resources: Kubernetes manifests, CI/CD configs, deployment status
**Monitor**
- Permissions: View all metrics, create alerts, read logs
- Restrictions: Can't modify infrastructure
- Resources: Monitoring dashboards, alert rules, logs
**Security**
- Permissions: Scan code, audit logs, block PRs with critical vulnerabilities
- Restrictions: Can't approve deployments
- Resources: Security scans, audit logs, vulnerability database
### Management Roles
**ProjectManager**
- Permissions: View all tasks, update roadmap, assign work
- Restrictions: Can't merge code, can't approve technical decisions
- Resources: Tasks, roadmap, timelines
**DecisionMaker**
- Permissions: Approve critical decisions, resolve conflicts
- Restrictions: Can't implement decisions
- Resources: Decision queue, escalations
**Orchestrator**
- Permissions: Assign agents to tasks, coordinate workflows
- Restrictions: Can't execute tasks directly
- Resources: Agent registry, task queue, workflows
### Default Roles
**Admin**
- Permissions: Everything
- Restrictions: None
- Resources: All
**Guest**
- Permissions: Read public docs, view public status
- Restrictions: Can't modify anything
- Resources: Public docs, public dashboards
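
The role catalogue above could be modelled as a closed enum, so that an unknown role name is rejected when a profile is parsed. This is only a sketch: `Role`, `RoleGroup`, and `group()` are hypothetical names chosen for illustration and are not taken from the VAPORA codebase.

```rust
use std::str::FromStr;

/// Hypothetical enum mirroring the roles listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Architect, Developer, CodeReviewer, Tester,
    Documenter, Marketer, Presenter,
    DevOps, Monitor, Security,
    ProjectManager, DecisionMaker, Orchestrator,
    Admin, Guest,
}

/// Hypothetical grouping that follows the section headings above.
#[derive(Debug, PartialEq, Eq)]
pub enum RoleGroup { Technical, Documentation, Operations, Management, Default }

impl Role {
    pub fn group(self) -> RoleGroup {
        use Role::*;
        match self {
            Architect | Developer | CodeReviewer | Tester => RoleGroup::Technical,
            Documenter | Marketer | Presenter => RoleGroup::Documentation,
            DevOps | Monitor | Security => RoleGroup::Operations,
            ProjectManager | DecisionMaker | Orchestrator => RoleGroup::Management,
            Admin | Guest => RoleGroup::Default,
        }
    }
}

impl FromStr for Role {
    type Err = String;

    /// Parses the exact role names used in the profile files.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        use Role::*;
        Ok(match s {
            "Architect" => Architect,
            "Developer" => Developer,
            "CodeReviewer" => CodeReviewer,
            "Tester" => Tester,
            "Documenter" => Documenter,
            "Marketer" => Marketer,
            "Presenter" => Presenter,
            "DevOps" => DevOps,
            "Monitor" => Monitor,
            "Security" => Security,
            "ProjectManager" => ProjectManager,
            "DecisionMaker" => DecisionMaker,
            "Orchestrator" => Orchestrator,
            "Admin" => Admin,
            "Guest" => Guest,
            other => return Err(format!("unknown role: {other}")),
        })
    }
}
```

With this shape, `"CodeReviewer".parse::<Role>()` succeeds, while a misspelled role in a profile surfaces as an error instead of silently granting nothing.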
---
## 🏢 Profiles (Team Groupings)
### Frontend Team
```toml
[profile]
name = "Frontend Team"
members = ["alice@example.com", "bob@example.com", "developer-frontend-001"]
roles = ["Developer", "CodeReviewer", "Tester"]
permissions = [
"create_pr_frontend",
"review_pr_frontend",
"test_frontend",
"commit_dev_branch",
]
resource_constraints = [
"path_prefix:frontend/",
]
```
### Backend Team
```toml
[profile]
name = "Backend Team"
members = ["charlie@example.com", "developer-backend-001", "developer-backend-002"]
roles = ["Developer", "CodeReviewer", "Tester", "Security"]
permissions = [
"create_pr_backend",
"review_pr_backend",
"test_backend",
"security_scan",
]
resource_constraints = [
"path_prefix:backend/",
"exclude_path:backend/secrets/",
]
```
### Full Stack Team
```toml
[profile]
name = "Full Stack Team"
members = ["alice@example.com", "architect-001", "reviewer-001"]
roles = ["Architect", "Developer", "CodeReviewer", "Tester", "Documenter"]
permissions = [
"design_features",
"implement_features",
"review_code",
"test_features",
"document_features",
]
```
### DevOps Team
```toml
[profile]
name = "DevOps Team"
members = ["devops-001", "devops-002", "security-001"]
roles = ["DevOps", "Monitor", "Security"]
permissions = [
"trigger_ci_cd",
"deploy_staging",
"deploy_production",
"modify_manifests",
"monitor_health",
"security_audit",
]
```
### Management
```toml
[profile]
name = "Management"
members = ["pm-001", "decision-maker-001", "orchestrator-001"]
roles = ["ProjectManager", "DecisionMaker", "Orchestrator"]
permissions = [
"create_tasks",
"assign_agents",
"make_decisions",
"view_metrics",
]
```
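
The profile format does not spell out how `resource_constraints` entries are evaluated. The sketch below shows one plausible reading, under these assumptions: `exclude_path:` always wins, at least one `path_prefix:` must match when any is present, and a profile without `path_prefix:` entries has no prefix restriction. The function name `path_allowed` is hypothetical.

```rust
/// Returns true if `path` satisfies a profile's `resource_constraints`.
/// Assumed semantics (not defined by the profile format itself):
/// - `exclude_path:X` denies every path that starts with X
/// - if any `path_prefix:X` is present, the path must start with one of them
/// - without `path_prefix:` entries there is no prefix restriction
pub fn path_allowed(constraints: &[&str], path: &str) -> bool {
    let mut has_prefix_rule = false;
    let mut prefix_match = false;
    for constraint in constraints {
        if let Some(excluded) = constraint.strip_prefix("exclude_path:") {
            if path.starts_with(excluded) {
                return false; // exclusions take precedence over prefixes
            }
        } else if let Some(prefix) = constraint.strip_prefix("path_prefix:") {
            has_prefix_rule = true;
            prefix_match |= path.starts_with(prefix);
        }
    }
    !has_prefix_rule || prefix_match
}
```

Under this reading, the Backend Team constraints admit `backend/src/main.rs` but reject both `backend/secrets/key.pem` and anything under `frontend/`.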
---
## 🔐 Cedar Policies (Authorization Rules)
### Policy Structure
```cedar
// Policy: Only CodeReviewers can approve PRs
permit(
    principal in Role::"CodeReviewer",
    action == Action::"approve_pr",
    resource
) when {
    // Can't approve own PR
    principal != resource.author
    && principal.team == resource.team
};

// Policy: Developers can only commit to dev branches
permit(
    principal in Role::"Developer",
    action == Action::"commit",
    resource in Branch::"dev"
) when {
    resource.protection_level == "standard"
};

// Policy: Security can block PRs if critical vulns found
permit(
    principal in Role::"Security",
    action == Action::"block_pr",
    resource
) when {
    resource.vulnerability_severity == "critical"
};

// Policy: DevOps can only deploy approved code
permit(
    principal in Role::"DevOps",
    action == Action::"deploy",
    resource
) when {
    // Cedar sets are tested with `contains`
    resource.approved_by.contains(principal)
    && resource.tests_passing == true
};

// Policy: Monitor can view all logs (read-only)
permit(
    principal in Role::"Monitor",
    action == Action::"view_logs",
    resource
);

// Policy: Documenter can only modify docs/
permit(
    principal in Role::"Documenter",
    action == Action::"modify",
    resource
) when {
    // Cedar matches string prefixes with the `like` operator and `*` wildcard
    resource.path like "docs/*"
    || resource.path == "README.md"
    || resource.path == "CHANGELOG.md"
};
```
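
When testing policies, it can help to state a rule's expected outcome in plain Rust and compare it against the decision the Cedar engine returns. The sketch below mirrors only the first policy (CodeReviewers approving PRs). The `Principal` and `PullRequest` structs and their fields are hypothetical stand-ins for whatever entity data the engine is given.

```rust
/// Hypothetical, simplified view of a principal for test purposes.
pub struct Principal<'a> {
    pub id: &'a str,
    pub roles: &'a [&'a str],
    pub team: &'a str,
}

/// Hypothetical, simplified view of a pull request resource.
pub struct PullRequest<'a> {
    pub author: &'a str,
    pub team: &'a str,
}

/// Expected outcome of the `approve_pr` policy:
/// the principal holds the CodeReviewer role, is not the PR author,
/// and belongs to the same team as the PR.
pub fn may_approve_pr(principal: &Principal, pr: &PullRequest) -> bool {
    principal.roles.contains(&"CodeReviewer")
        && principal.id != pr.author
        && principal.team == pr.team
}
```

A reviewer on the backend team may approve a teammate's PR under this predicate, but not their own PR and not one that belongs to another team.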
### Dynamic Policies (Hot Reload)
```toml
# vapora.toml
[authorization]
cedar_policies_path = ".vapora/policies/"
reload_interval_secs = 30
enable_audit_logging = true
```
```cedar
// .vapora/policies/custom-rules.cedar
// Custom rule: Only Architects from Backend Team can design backend features
permit(
    principal in Team::"Backend Team",
    action == Action::"design_architecture",
    resource in ResourceType::"backend_feature"
) when {
    principal.role == Role::"Architect"
};
```
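
One simple way to decide whether a reload is needed on each `reload_interval_secs` tick is to compare the newest modification time of the `.cedar` files against the value seen on the previous tick. The sketch below assumes that approach; `newest_policy_mtime` and `needs_reload` are hypothetical helper names. Modification times alone can miss some changes (for example, deleting an older file), so a content hash would be a more robust fingerprint.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Newest modification time among `*.cedar` files directly inside `dir`,
/// or `None` if the directory contains no policy files.
pub fn newest_policy_mtime(dir: &Path) -> io::Result<Option<SystemTime>> {
    let mut newest: Option<SystemTime> = None;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().and_then(|ext| ext.to_str()) == Some("cedar") {
            let mtime = fs::metadata(&path)?.modified()?;
            if newest.map_or(true, |seen| mtime > seen) {
                newest = Some(mtime);
            }
        }
    }
    Ok(newest)
}

/// True when the on-disk snapshot differs from the one seen on the last tick.
pub fn needs_reload(last_seen: Option<SystemTime>, current: Option<SystemTime>) -> bool {
    last_seen != current
}
```

A reload loop would call `newest_policy_mtime` every 30 seconds, re-parse the policy set only when `needs_reload` returns true, and then store the new snapshot.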
---
## 🔍 Audit Trail
### Audit Log Entry
```rust
use std::collections::HashMap;

use chrono::{DateTime, Utc};

pub struct AuditLogEntry {
    pub id: String,
    pub timestamp: DateTime<Utc>,
    pub principal_id: String,
    pub principal_type: String, // "agent" or "human"
    pub action: String,
    pub resource: String,
    pub result: AuditResult, // Permitted, Denied, Error
    pub reason: String,
    pub context: HashMap<String, String>,
}

pub enum AuditResult {
    Permitted,
    Denied { reason: String },
    Error { error: String },
}
```
### Audit Retention Policy
```toml
[audit]
retention_days = 2555 # 7 years for compliance
export_formats = ["json", "csv", "syslog"]
sensitive_fields = ["api_key", "password", "token"] # Redact these
```
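
The `sensitive_fields` setting implies a redaction pass over each entry's `context` map before it is stored or exported. A minimal sketch follows, assuming exact key matching and a fixed placeholder string. `redact_context`, `REDACTED`, and `RETENTION_DAYS` are hypothetical names.

```rust
use std::collections::HashMap;

/// Placeholder written in place of sensitive values (an assumption).
pub const REDACTED: &str = "[REDACTED]";

/// 7 years expressed in days, ignoring leap days: 7 * 365 = 2555.
pub const RETENTION_DAYS: u64 = 7 * 365;

/// Overwrites the value of every context key listed in `sensitive_fields`.
/// Keys are compared exactly; nested or differently-cased keys are not handled.
pub fn redact_context(context: &mut HashMap<String, String>, sensitive_fields: &[&str]) {
    for (key, value) in context.iter_mut() {
        if sensitive_fields.contains(&key.as_str()) {
            *value = REDACTED.to_string();
        }
    }
}
```

Running the pass before the entry reaches the log keeps secrets out of every export format at once, instead of relying on each exporter to filter them.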
---
## 🚀 Implementation
### Cedar Policy Engine Integration
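```rust
use cedar_policy::{Authorizer, Decision, Entities, PolicySet, Request, Schema};
use chrono::Utc;

pub struct AuthorizationEngine {
    pub cedar_schema: Schema,
    pub policies: PolicySet,
    pub entities: Entities, // role/team hierarchy consulted by `in` checks
    pub audit_log: Vec<AuditLogEntry>,
}

impl AuthorizationEngine {
    pub async fn check_permission(
        &mut self,
        principal: Principal,
        action: Action,
        resource: Resource,
        context: Context,
    ) -> anyhow::Result<AuthorizationResult> {
        // Assumption: `to_entity_uid()` and `to_cedar_context()` are hypothetical
        // conversions on VAPORA's own types (still to be defined, see the
        // checklist). `Request::new` is shown with its cedar-policy 4.x
        // signature; earlier releases differ.
        let request = Request::new(
            principal.to_entity_uid()?,
            action.to_entity_uid()?,
            resource.to_entity_uid()?,
            context.to_cedar_context()?,
            Some(&self.cedar_schema),
        )?;

        let response =
            Authorizer::new().is_authorized(&request, &self.policies, &self.entities);
        let allowed = response.decision() == Decision::Allow;
        // IDs of the policies that determined the decision
        let reason = response
            .diagnostics()
            .reason()
            .map(|id| id.to_string())
            .collect::<Vec<_>>()
            .join(", ");

        let result = if allowed {
            AuditResult::Permitted
        } else {
            AuditResult::Denied { reason: reason.clone() }
        };
        // Every decision is recorded, whether permitted or denied
        self.audit_log.push(AuditLogEntry {
            id: uuid::Uuid::new_v4().to_string(),
            timestamp: Utc::now(),
            principal_id: principal.id,
            principal_type: principal.principal_type.to_string(),
            action: action.name,
            resource: resource.id,
            result,
            reason: reason.clone(),
            context: Default::default(),
        });

        Ok(AuthorizationResult { allowed, reason })
    }

    pub async fn hot_reload_policies(&mut self) -> anyhow::Result<()> {
        // Read .vapora/policies/ and reload
        // Notify all agents of policy changes
        Ok(())
    }
}
```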
### Context-Aware Authorization
```rust
use chrono::{DateTime, Utc};

pub struct Context {
    pub time: DateTime<Utc>,
    pub ip_address: String,
    pub environment: String, // "dev", "staging", "prod"
    pub is_business_hours: bool,
    pub request_priority: Priority, // Low, Normal, High, Critical
}
```
```cedar
// Policy example: Can only deploy to prod during business hours
permit(
    principal in Role::"DevOps",
    action == Action::"deploy_production",
    resource
) when {
    context.is_business_hours == true
    && context.environment == "prod"
};
```
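
The `is_business_hours` flag has to be computed by the caller before the request reaches the policy engine. The sketch below assumes business hours mean Monday to Friday, 09:00 to 17:59, in whatever time zone the deployment uses. That window is an illustrative assumption, and `is_business_hours`, `DeployContext`, and `may_deploy_production` are hypothetical names. The environment value follows the `"prod"` spelling from the `Context` struct comment.

```rust
/// Assumed business hours: Monday to Friday, 09:00 to 17:59.
/// `weekday` counts from Monday = 0 to Sunday = 6; `hour` is 0..=23.
pub fn is_business_hours(weekday: u8, hour: u8) -> bool {
    weekday < 5 && (9..18).contains(&hour)
}

/// Hypothetical subset of the request context relevant to deployments.
pub struct DeployContext<'a> {
    pub environment: &'a str, // "dev", "staging", "prod"
    pub is_business_hours: bool,
}

/// Plain-Rust mirror of the production deployment rule, useful as a test oracle.
pub fn may_deploy_production(is_devops: bool, ctx: &DeployContext) -> bool {
    is_devops && ctx.is_business_hours && ctx.environment == "prod"
}
```

Keeping the time calculation outside the policy means the Cedar rule only has to test a boolean, and the definition of "business hours" can change without touching policy files.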
---
## 🎯 Implementation Checklist
- [ ] Define Principal (agent_id, role, team, profile)
- [ ] Define Action (create_pr, approve, deploy, etc.)
- [ ] Define Resource (PR, code file, branch, deployment)
- [ ] Implement Cedar policy evaluation
- [ ] Load policies from `.vapora/policies/`
- [ ] Implement hot reload (30s interval)
- [ ] Audit logging for every decision
- [ ] CLI: `vapora auth check --principal X --action Y --resource Z`
- [ ] CLI: `vapora auth policies list/reload`
- [ ] Audit log export (JSON, CSV)
- [ ] Tests (policy enforcement)
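
For the `vapora auth check` item, argument handling could start as small as the sketch below, which parses the three flags from the checklist out of the arguments that follow `vapora auth check`. `AuthCheckArgs` and `parse_auth_check` are hypothetical names, and a real CLI would more likely rely on an argument-parsing crate.

```rust
/// Parsed arguments of the proposed `vapora auth check` command.
#[derive(Debug, PartialEq, Eq)]
pub struct AuthCheckArgs {
    pub principal: String,
    pub action: String,
    pub resource: String,
}

/// Parses `--principal X --action Y --resource Z` in any order.
/// Every flag must be followed by a value, and all three are required.
pub fn parse_auth_check(args: &[&str]) -> Result<AuthCheckArgs, String> {
    let mut principal: Option<String> = None;
    let mut action: Option<String> = None;
    let mut resource: Option<String> = None;

    let mut it = args.iter();
    while let Some(flag) = it.next() {
        let value = it
            .next()
            .ok_or_else(|| format!("missing value for {flag}"))?;
        match *flag {
            "--principal" => principal = Some(value.to_string()),
            "--action" => action = Some(value.to_string()),
            "--resource" => resource = Some(value.to_string()),
            other => return Err(format!("unknown flag: {other}")),
        }
    }

    Ok(AuthCheckArgs {
        principal: principal.ok_or("--principal is required")?,
        action: action.ok_or("--action is required")?,
        resource: resource.ok_or("--resource is required")?,
    })
}
```

The parsed values would then be turned into a principal, action, and resource and passed to the authorization engine, with the decision and deny reason printed to the terminal.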
---
## 📊 Success Metrics
- ✅ Policy evaluation < 10ms
- ✅ Hot reload works without restart
- ✅ Audit log complete and queryable
- ✅ Multi-team isolation working
- ✅ Context-aware rules enforced
- ✅ Deny reasons clear and actionable
---
**Version**: 0.1.0
**Status**: Specification Complete (VAPORA v1.0)
**Purpose**: Cedar-based authorization for multi-agent multi-team platform