feat: Phase 5.3 - Multi-Agent Learning Infrastructure
Implement intelligent agent learning from Knowledge Graph execution history
with per-task-type expertise tracking, recency bias, and learning curves.
## Phase 5.3 Implementation
### Learning Infrastructure (✅ Complete)
- LearningProfileService with per-task-type expertise metrics
- TaskTypeExpertise model tracking success_rate, confidence, learning curves
- Recency bias weighting: recent 7 days weighted 3x higher (exponential decay)
- Confidence scoring prevents overfitting: min(1.0, executions / 20)
- Learning curves computed from daily execution windows
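
The confidence and recency rules above can be sketched in a few lines. This is only an illustration, not the code in `learning.rs`: the function names and the `(age_days, succeeded)` tuple are hypothetical, and the decay curve is one possible reading of "3x weight, exponential decay" (weight 3.0 for a fresh execution, with the excess over 1.0 halving every 7 days).

```rust
/// Confidence grows linearly with sample size and saturates at 20 executions,
/// matching min(1.0, executions / 20) from the commit message.
fn confidence(total_executions: u32) -> f64 {
    (f64::from(total_executions) / 20.0).min(1.0)
}

/// ASSUMPTION: one way to read "3x weight with exponential decay".
/// A fresh execution weighs 3.0; the bonus above 1.0 halves every 7 days.
fn recency_weight(age_days: f64) -> f64 {
    1.0 + 2.0 * 0.5_f64.powf(age_days / 7.0)
}

/// Recency-weighted success rate over hypothetical (age_days, succeeded) pairs.
fn recent_success_rate(executions: &[(f64, bool)]) -> f64 {
    let (hits, total) = executions.iter().fold((0.0, 0.0), |(n, d), &(age, ok)| {
        let w = recency_weight(age);
        (n + if ok { w } else { 0.0 }, d + w)
    });
    // No history means no evidence; report 0.0 rather than dividing by zero.
    if total == 0.0 { 0.0 } else { hits / total }
}

fn main() {
    let history = [(0.0, true), (7.0, false), (30.0, true)];
    println!("confidence after 10 runs: {}", confidence(10));
    println!("recent success rate: {:.3}", recent_success_rate(&history));
}
```

Capping confidence at 20 executions keeps an agent with two lucky runs from outranking one with a long track record.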
### Agent Scoring Service (✅ Complete)
- Unified AgentScore combining SwarmCoordinator + learning profiles
- Scoring formula: 0.3*base + 0.5*expertise + 0.2*confidence
- Rank agents by combined score for intelligent assignment
- Support for recency-biased scoring (recent_success_rate)
- Methods: rank_agents, select_best, rank_agents_with_recency
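
As a rough sketch of the scoring formula above, assuming all three inputs are already normalized to the 0.0..=1.0 range: the struct layout and function signatures here are illustrative and may not match `scoring.rs`; only the weights and the names `AgentScore`, `rank_agents` and `select_best` come from this commit message.

```rust
/// Hypothetical shape for a scored candidate; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct AgentScore {
    agent_id: String,
    base: f64,       // SwarmCoordinator score
    expertise: f64,  // success rate for the task type
    confidence: f64, // min(1.0, executions / 20)
}

impl AgentScore {
    /// 0.3*base + 0.5*expertise + 0.2*confidence, as stated in the commit message.
    fn combined(&self) -> f64 {
        0.3 * self.base + 0.5 * self.expertise + 0.2 * self.confidence
    }
}

/// Sort candidates from highest to lowest combined score.
fn rank_agents(mut candidates: Vec<AgentScore>) -> Vec<AgentScore> {
    candidates.sort_by(|a, b| b.combined().total_cmp(&a.combined()));
    candidates
}

/// Pick the top-ranked candidate, if there is one.
fn select_best(candidates: Vec<AgentScore>) -> Option<AgentScore> {
    rank_agents(candidates).into_iter().next()
}

fn main() {
    let candidates = vec![
        AgentScore { agent_id: "idle-novice".into(), base: 0.9, expertise: 0.2, confidence: 0.1 },
        AgentScore { agent_id: "busy-expert".into(), base: 0.4, expertise: 0.8, confidence: 1.0 },
    ];
    if let Some(best) = select_best(candidates) {
        println!("{} scored {:.2}", best.agent_id, best.combined());
    }
}
```

With these weights, proven expertise outweighs availability: the busier agent wins here because expertise carries half of the total score.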
### KG Integration (✅ Complete)
- KGPersistence::get_executions_for_task_type() - query by agent + task type
- KGPersistence::get_agent_executions() - all executions for agent
- Coordinator::load_learning_profile_from_kg() - core KG→Learning integration
- Coordinator::load_all_learning_profiles() - batch load for multiple agents
- Convert PersistedExecution → ExecutionData for learning calculations
### Agent Assignment Integration (✅ Complete)
- AgentCoordinator uses learning profiles for task assignment
- extract_task_type() infers task type from title/description
- assign_task() scores candidates using AgentScoringService
- Fallback to load-based selection if no learning data available
- Learning profiles stored in coordinator.learning_profiles RwLock
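
The fallback rule above might look roughly like the following. Everything in this sketch is hypothetical (the `Candidate` type, the `assign` function, and using a plain map of learned expertise per agent); it only illustrates the branch between learning-based and load-based selection, not the real `assign_task()`.

```rust
use std::collections::HashMap;

/// Hypothetical candidate record; the real coordinator types may differ.
struct Candidate {
    agent_id: String,
    load: f64, // 0.0 = idle, 1.0 = saturated
}

/// Prefer learned expertise when any candidate has it; otherwise fall back
/// to the least-loaded agent. `learned` maps agent_id to expertise for the task type.
fn assign<'a>(candidates: &'a [Candidate], learned: &HashMap<String, f64>) -> Option<&'a Candidate> {
    let has_learning = candidates.iter().any(|c| learned.contains_key(&c.agent_id));
    if has_learning {
        candidates.iter().max_by(|a, b| {
            // Agents without history are treated as expertise 0.0.
            let sa = learned.get(&a.agent_id).copied().unwrap_or(0.0);
            let sb = learned.get(&b.agent_id).copied().unwrap_or(0.0);
            sa.total_cmp(&sb)
        })
    } else {
        candidates.iter().min_by(|a, b| a.load.total_cmp(&b.load))
    }
}

fn main() {
    let candidates = vec![
        Candidate { agent_id: "a".into(), load: 0.9 },
        Candidate { agent_id: "b".into(), load: 0.1 },
    ];
    let mut learned = HashMap::new();
    learned.insert("a".to_string(), 0.8);
    if let Some(chosen) = assign(&candidates, &learned) {
        println!("assigned to {}", chosen.agent_id);
    }
}
```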
### Profile Adapter Enhancements (✅ Complete)
- create_learning_profile() - initialize empty profiles
- add_task_type_expertise() - set task-type expertise
- update_profile_with_learning() - update swarm profiles from learning
## Files Modified
### vapora-knowledge-graph/src/persistence.rs (+30 lines)
- get_executions_for_task_type(agent_id, task_type, limit)
- get_agent_executions(agent_id, limit)
### vapora-agents/src/coordinator.rs (+100 lines)
- load_learning_profile_from_kg() - core KG integration method
- load_all_learning_profiles() - batch loading for agents
- assign_task() already uses learning-based scoring via AgentScoringService
### Existing Complete Implementation
- vapora-knowledge-graph/src/learning.rs - calculation functions
- vapora-agents/src/learning_profile.rs - data structures and expertise
- vapora-agents/src/scoring.rs - unified scoring service
- vapora-agents/src/profile_adapter.rs - adapter methods
## Tests Passing
- learning_profile: 7 tests ✅
- scoring: 5 tests ✅
- profile_adapter: 6 tests ✅
- coordinator: learning-specific tests ✅
## Data Flow
1. Task arrives → AgentCoordinator::assign_task()
2. Extract task_type from description
3. Query KG for task-type executions (load_learning_profile_from_kg)
4. Calculate expertise with recency bias
5. Score candidates (SwarmCoordinator + learning)
6. Assign to top-scored agent
7. Execution result → KG → Update learning profiles
## Key Design Decisions
✅ Recency bias: 7-day half-life with 3x weight for recent performance
✅ Confidence scoring: min(1.0, total_executions / 20) prevents overfitting
✅ Hierarchical scoring: 30% base load, 50% expertise, 20% confidence
✅ KG query limit: 100 recent executions per task-type for performance
✅ Async loading: load_learning_profile_from_kg supports concurrent loads
## Next: Phase 5.4 - Cost Optimization
Ready to implement budget enforcement and cost-aware provider selection.
2026-01-11 13:03:53 +00:00
use std::time::Instant;
2026-01-11 21:32:56 +00:00
use tracing::{info_span, warn_span, Span};

/// Span context for task execution tracing
pub struct TaskSpan {
    span: Span,
    start: Instant,
}

impl TaskSpan {
    /// Create a new task execution span
    pub fn new(task_id: &str, agent_id: &str, task_type: &str) -> Self {
        let span = info_span!(
            "task_execution",
            task_id = %task_id,
            agent_id = %agent_id,
            task_type = %task_type,
            duration_ms = tracing::field::Empty,
        );

        Self {
            span,
            start: Instant::now(),
        }
    }

    /// Get reference to span for instrumentation
    pub fn span(&self) -> &Span {
        &self.span
    }

    /// Record span completion with duration
    pub fn complete(self) {
        let duration_ms = self.start.elapsed().as_millis() as u64;
        self.span.record("duration_ms", duration_ms);
    }

    /// Record span completion with error
    pub fn error(self, error_msg: &str) {
        let duration_ms = self.start.elapsed().as_millis() as u64;
        self.span.record("duration_ms", duration_ms);
        tracing::error!(
            parent: &self.span,
            error = %error_msg,
            "Task execution failed"
        );
    }
}

/// Span context for agent operations
pub struct AgentSpan {
    span: Span,
}

impl AgentSpan {
    /// Create span for agent registration
    pub fn registration(agent_id: &str, role: &str) -> Self {
        let span = info_span!(
            "agent_registration",
            agent_id = %agent_id,
            role = %role,
        );

        Self { span }
    }

    /// Create span for agent status update
    pub fn status_update(agent_id: &str, load: f64, available: bool) -> Self {
        let span = info_span!(
            "agent_status_update",
            agent_id = %agent_id,
            load = load,
            available = available,
        );

        Self { span }
    }

    /// Create span for agent heartbeat
    pub fn heartbeat(agent_id: &str) -> Self {
        let span = info_span!(
            "agent_heartbeat",
            agent_id = %agent_id,
        );

        Self { span }
    }

    /// Get reference to span
    pub fn span(&self) -> &Span {
        &self.span
    }
}

/// Span context for routing operations
pub struct RoutingSpan {
    span: Span,
    start: Instant,
}

impl RoutingSpan {
    /// Create span for provider selection
    pub fn provider_selection(task_type: &str, candidates: usize) -> Self {
        let span = info_span!(
            "provider_selection",
            task_type = %task_type,
            candidate_count = candidates,
            selected_provider = tracing::field::Empty,
        );

        Self {
            span,
            start: Instant::now(),
        }
    }

    /// Create span for cost calculation
    pub fn cost_calculation(provider: &str) -> Self {
        let span = info_span!(
            "cost_calculation",
            provider = %provider,
            input_tokens = tracing::field::Empty,
            output_tokens = tracing::field::Empty,
            total_cost = tracing::field::Empty,
        );

        Self {
            span,
            start: Instant::now(),
        }
    }

    /// Record selected provider
    pub fn record_selection(&self, provider: &str) {
        self.span.record("selected_provider", provider);
    }

    /// Record cost details
    pub fn record_cost(&self, input_tokens: u64, output_tokens: u64, cost: f64) {
        self.span.record("input_tokens", input_tokens);
        self.span.record("output_tokens", output_tokens);
        self.span.record("total_cost", cost);
    }

    /// Complete routing operation
    pub fn complete(self) {
        let duration_ms = self.start.elapsed().as_millis() as u64;
        tracing::debug!(
            parent: &self.span,
            duration_ms = duration_ms,
            "Routing decision completed"
        );
    }

    /// Get reference to span
    pub fn span(&self) -> &Span {
        &self.span
    }
}

/// Span context for swarm operations
pub struct SwarmSpan {
    span: Span,
    start: Instant,
}

impl SwarmSpan {
    /// Create span for task assignment
    pub fn task_assignment(task_id: &str, assigned_to: &str) -> Self {
        let span = info_span!(
            "swarm_task_assignment",
            task_id = %task_id,
            assigned_to = %assigned_to,
            duration_ms = tracing::field::Empty,
        );

        Self {
            span,
            start: Instant::now(),
        }
    }

    /// Create span for coalition formation
    pub fn coalition_formation(coalition_id: &str, required_roles: usize) -> Self {
        let span = info_span!(
            "swarm_coalition_formation",
            coalition_id = %coalition_id,
            required_roles = required_roles,
            members_recruited = tracing::field::Empty,
            // Declared up front: Span::record ignores fields the span was not created with
            duration_ms = tracing::field::Empty,
        );

        Self {
            span,
            start: Instant::now(),
        }
    }

    /// Create span for consensus voting
    pub fn consensus_voting(proposal_id: &str, voter_count: usize) -> Self {
        let span = info_span!(
            "swarm_consensus",
            proposal_id = %proposal_id,
            voter_count = voter_count,
            consensus_reached = tracing::field::Empty,
            // Declared up front so complete() can record the duration
            duration_ms = tracing::field::Empty,
        );

        Self {
            span,
            start: Instant::now(),
        }
    }

    /// Record members recruited for coalition
    pub fn record_members(&self, count: usize) {
        self.span.record("members_recruited", count);
    }

    /// Record consensus result
    pub fn record_consensus(&self, reached: bool) {
        self.span.record("consensus_reached", reached);
    }

    /// Complete swarm operation and record its duration
    pub fn complete(self) {
        let duration_ms = self.start.elapsed().as_millis() as u64;
        self.span.record("duration_ms", duration_ms);
    }

    /// Get reference to span
    pub fn span(&self) -> &Span {
        &self.span
    }
}

/// Span context for analytics operations
pub struct AnalyticsSpan {
    span: Span,
}

impl AnalyticsSpan {
    /// Create span for event processing
    pub fn event_processing(event_type: &str) -> Self {
        let span = info_span!(
            "analytics_event_processing",
            event_type = %event_type,
            processed = false,
        );

        Self { span }
    }

    /// Create span for alert generation
    pub fn alert_generation(alert_type: &str, severity: &str) -> Self {
        let span = warn_span!(
            "analytics_alert",
            alert_type = %alert_type,
            severity = %severity,
        );

        Self { span }
    }

    /// Create span for aggregation
    pub fn aggregation(window_name: &str) -> Self {
        let span = info_span!(
            "analytics_aggregation",
            window = %window_name,
            aggregated_count = tracing::field::Empty,
        );

        Self { span }
    }

    /// Record aggregation count
    pub fn record_count(&self, count: usize) {
        self.span.record("aggregated_count", count);
    }

    /// Get reference to span
    pub fn span(&self) -> &Span {
        &self.span
    }
}

/// Span context for knowledge graph operations
pub struct KGSpan {
    span: Span,
    start: Instant,
}

impl KGSpan {
    /// Create span for execution recording
    pub fn record_execution(task_id: &str, agent_id: &str) -> Self {
        let span = info_span!(
            "kg_record_execution",
            task_id = %task_id,
            agent_id = %agent_id,
            duration_ms = tracing::field::Empty,
        );

        Self {
            span,
            start: Instant::now(),
        }
    }

    /// Create span for similarity query
    pub fn similarity_query(query_text: &str) -> Self {
        let span = info_span!(
            "kg_similarity_query",
            query_length = query_text.len(),
            matches_found = tracing::field::Empty,
            // Declared up front: Span::record ignores fields the span was not created with
            duration_ms = tracing::field::Empty,
        );

        Self {
            span,
            start: Instant::now(),
        }
    }

    /// Create span for reasoning operation
    pub fn reasoning(operation: &str) -> Self {
        let span = info_span!(
            "kg_reasoning",
            operation = %operation,
            insights_generated = tracing::field::Empty,
            // Declared up front so complete() can record the duration
            duration_ms = tracing::field::Empty,
        );

        Self {
            span,
            start: Instant::now(),
        }
    }

    /// Record number of insights
    pub fn record_insights(&self, count: usize) {
        self.span.record("insights_generated", count);
    }

    /// Record number of matches
    pub fn record_matches(&self, count: usize) {
        self.span.record("matches_found", count);
    }

    /// Complete operation and record its duration
    pub fn complete(self) {
        let duration_ms = self.start.elapsed().as_millis() as u64;
        self.span.record("duration_ms", duration_ms);
    }

    /// Get reference to span
    pub fn span(&self) -> &Span {
        &self.span
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_task_span_creation() {
        let span = TaskSpan::new("task-1", "agent-1", "coding");
        // Span created successfully
        let _ = span.span();
    }

    #[test]
    fn test_agent_span_registration() {
        let span = AgentSpan::registration("agent-1", "developer");
        // Span created successfully
        let _ = span.span();
    }

    #[test]
    fn test_routing_span_selection() {
        let span = RoutingSpan::provider_selection("code_generation", 3);
        span.record_selection("claude");
        // Span should have recorded the provider selection
    }

    #[test]
    fn test_swarm_span_coalition() {
        let span = SwarmSpan::coalition_formation("coal_123", 3);
        span.record_members(3);
        // Span should have recorded member count
    }

    #[test]
    fn test_kg_span_reasoning() {
        let span = KGSpan::reasoning("pattern_detection");
        span.record_insights(5);
        // Span should have recorded insights
    }
}