384 lines
10 KiB
Rust
Raw Normal View History

feat: Phase 5.3 - Multi-Agent Learning Infrastructure Implement intelligent agent learning from Knowledge Graph execution history with per-task-type expertise tracking, recency bias, and learning curves. ## Phase 5.3 Implementation ### Learning Infrastructure (✅ Complete) - LearningProfileService with per-task-type expertise metrics - TaskTypeExpertise model tracking success_rate, confidence, learning curves - Recency bias weighting: recent 7 days weighted 3x higher (exponential decay) - Confidence scoring prevents overfitting: min(1.0, executions / 20) - Learning curves computed from daily execution windows ### Agent Scoring Service (✅ Complete) - Unified AgentScore combining SwarmCoordinator + learning profiles - Scoring formula: 0.3*base + 0.5*expertise + 0.2*confidence - Rank agents by combined score for intelligent assignment - Support for recency-biased scoring (recent_success_rate) - Methods: rank_agents, select_best, rank_agents_with_recency ### KG Integration (✅ Complete) - KGPersistence::get_executions_for_task_type() - query by agent + task type - KGPersistence::get_agent_executions() - all executions for agent - Coordinator::load_learning_profile_from_kg() - core KG→Learning integration - Coordinator::load_all_learning_profiles() - batch load for multiple agents - Convert PersistedExecution → ExecutionData for learning calculations ### Agent Assignment Integration (✅ Complete) - AgentCoordinator uses learning profiles for task assignment - extract_task_type() infers task type from title/description - assign_task() scores candidates using AgentScoringService - Fallback to load-based selection if no learning data available - Learning profiles stored in coordinator.learning_profiles RwLock ### Profile Adapter Enhancements (✅ Complete) - create_learning_profile() - initialize empty profiles - add_task_type_expertise() - set task-type expertise - update_profile_with_learning() - update swarm profiles from learning ## Files Modified ### vapora-knowledge-graph/src/persistence.rs (+30 lines) - get_executions_for_task_type(agent_id, task_type, limit) - get_agent_executions(agent_id, limit) ### vapora-agents/src/coordinator.rs (+100 lines) - load_learning_profile_from_kg() - core KG integration method - load_all_learning_profiles() - batch loading for agents - assign_task() already uses learning-based scoring via AgentScoringService ### Existing Complete Implementation - vapora-knowledge-graph/src/learning.rs - calculation functions - vapora-agents/src/learning_profile.rs - data structures and expertise - vapora-agents/src/scoring.rs - unified scoring service - vapora-agents/src/profile_adapter.rs - adapter methods ## Tests Passing - learning_profile: 7 tests ✅ - scoring: 5 tests ✅ - profile_adapter: 6 tests ✅ - coordinator: learning-specific tests ✅ ## Data Flow 1. Task arrives → AgentCoordinator::assign_task() 2. Extract task_type from description 3. Query KG for task-type executions (load_learning_profile_from_kg) 4. Calculate expertise with recency bias 5. Score candidates (SwarmCoordinator + learning) 6. Assign to top-scored agent 7. Execution result → KG → Update learning profiles ## Key Design Decisions ✅ Recency bias: 7-day half-life with 3x weight for recent performance ✅ Confidence scoring: min(1.0, total_executions / 20) prevents overfitting ✅ Hierarchical scoring: 30% base load, 50% expertise, 20% confidence ✅ KG query limit: 100 recent executions per task-type for performance ✅ Async loading: load_learning_profile_from_kg supports concurrent loads ## Next: Phase 5.4 - Cost Optimization Ready to implement budget enforcement and cost-aware provider selection.
2026-01-11 13:03:53 +00:00
// vapora-agents: Agent registry - manages agent lifecycle and availability
// Phase 2: Complete implementation with 12 agent roles
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use thiserror::Error;
use uuid::Uuid;
#[derive(Debug, Error)]
pub enum RegistryError {
#[error("Agent not found: {0}")]
AgentNotFound(String),
#[error("Agent already registered: {0}")]
AgentAlreadyRegistered(String),
#[error("Maximum agents reached for role: {0}")]
MaxAgentsReached(String),
#[error("Invalid agent state transition: {0}")]
InvalidStateTransition(String),
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub enum AgentStatus {
Active,
Inactive,
Updating,
Error(String),
Scaling,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AgentMetadata {
pub id: String,
pub role: String,
pub name: String,
pub version: String,
pub status: AgentStatus,
pub capabilities: Vec<String>,
pub llm_provider: String,
pub llm_model: String,
pub max_concurrent_tasks: u32,
pub current_tasks: u32,
pub created_at: DateTime<Utc>,
pub last_heartbeat: DateTime<Utc>,
pub uptime_percentage: f64,
pub total_tasks_completed: u64,
}
impl AgentMetadata {
pub fn new(
role: String,
name: String,
llm_provider: String,
llm_model: String,
capabilities: Vec<String>,
) -> Self {
let now = Utc::now();
Self {
id: Uuid::new_v4().to_string(),
role,
name,
version: "0.1.0".to_string(),
status: AgentStatus::Active,
capabilities,
llm_provider,
llm_model,
max_concurrent_tasks: 5,
current_tasks: 0,
created_at: now,
last_heartbeat: now,
uptime_percentage: 100.0,
total_tasks_completed: 0,
}
}
/// Check if agent can accept new tasks
pub fn can_accept_task(&self) -> bool {
self.status == AgentStatus::Active && self.current_tasks < self.max_concurrent_tasks
}
/// Increment task count
pub fn assign_task(&mut self) {
if self.current_tasks < self.max_concurrent_tasks {
self.current_tasks += 1;
}
}
/// Decrement task count
pub fn complete_task(&mut self) {
if self.current_tasks > 0 {
self.current_tasks -= 1;
}
self.total_tasks_completed += 1;
}
}
/// Thread-safe agent registry
#[derive(Clone)]
pub struct AgentRegistry {
inner: Arc<RwLock<AgentRegistryInner>>,
}
struct AgentRegistryInner {
agents: HashMap<String, AgentMetadata>,
running_count: HashMap<String, u32>,
max_agents_per_role: u32,
}
impl AgentRegistry {
pub fn new(max_agents_per_role: u32) -> Self {
Self {
inner: Arc::new(RwLock::new(AgentRegistryInner {
agents: HashMap::new(),
running_count: HashMap::new(),
max_agents_per_role,
})),
}
}
/// Register a new agent
pub fn register_agent(&self, metadata: AgentMetadata) -> Result<String, RegistryError> {
let mut inner = self.inner.write().expect("Failed to acquire write lock");
// Check if agent already registered
if inner.agents.contains_key(&metadata.id) {
return Err(RegistryError::AgentAlreadyRegistered(metadata.id.clone()));
}
// Check if we've reached max agents for this role
let count = inner.running_count.get(&metadata.role).unwrap_or(&0);
if *count >= inner.max_agents_per_role {
return Err(RegistryError::MaxAgentsReached(metadata.role.clone()));
}
let role = metadata.role.clone();
let id = metadata.id.clone();
inner.agents.insert(id.clone(), metadata);
*inner.running_count.entry(role).or_insert(0) += 1;
Ok(id)
}
/// Unregister an agent
pub fn unregister_agent(&self, id: &str) -> Result<(), RegistryError> {
let mut inner = self.inner.write().expect("Failed to acquire write lock");
let agent = inner
.agents
.remove(id)
.ok_or_else(|| RegistryError::AgentNotFound(id.to_string()))?;
if let Some(count) = inner.running_count.get_mut(&agent.role) {
if *count > 0 {
*count -= 1;
}
}
Ok(())
}
/// Get agent metadata
pub fn get_agent(&self, id: &str) -> Option<AgentMetadata> {
let inner = self.inner.read().expect("Failed to acquire read lock");
inner.agents.get(id).cloned()
}
/// Get all agents for a specific role
pub fn get_agents_by_role(&self, role: &str) -> Vec<AgentMetadata> {
let inner = self.inner.read().expect("Failed to acquire read lock");
inner
.agents
.values()
.filter(|a| a.role == role && a.status == AgentStatus::Active)
.cloned()
.collect()
}
/// List all agents
pub fn list_all(&self) -> Vec<AgentMetadata> {
let inner = self.inner.read().expect("Failed to acquire read lock");
inner.agents.values().cloned().collect()
}
/// Update agent status
pub fn update_agent_status(
&self,
id: &str,
status: AgentStatus,
) -> Result<(), RegistryError> {
let mut inner = self.inner.write().expect("Failed to acquire write lock");
let agent = inner
.agents
.get_mut(id)
.ok_or_else(|| RegistryError::AgentNotFound(id.to_string()))?;
agent.status = status;
agent.last_heartbeat = Utc::now();
Ok(())
}
/// Update agent heartbeat
pub fn heartbeat(&self, id: &str) -> Result<(), RegistryError> {
let mut inner = self.inner.write().expect("Failed to acquire write lock");
let agent = inner
.agents
.get_mut(id)
.ok_or_else(|| RegistryError::AgentNotFound(id.to_string()))?;
agent.last_heartbeat = Utc::now();
Ok(())
}
/// Get an available agent for a specific role
pub fn get_available_agent(&self, role: &str) -> Option<AgentMetadata> {
let agents = self.get_agents_by_role(role);
agents
.into_iter()
.filter(|a| a.can_accept_task())
.min_by_key(|a| a.current_tasks)
}
/// Assign task to agent
pub fn assign_task(&self, agent_id: &str) -> Result<(), RegistryError> {
let mut inner = self.inner.write().expect("Failed to acquire write lock");
let agent = inner
.agents
.get_mut(agent_id)
.ok_or_else(|| RegistryError::AgentNotFound(agent_id.to_string()))?;
if !agent.can_accept_task() {
return Err(RegistryError::InvalidStateTransition(
"Agent cannot accept more tasks".to_string(),
));
}
agent.assign_task();
Ok(())
}
/// Complete task for agent
pub fn complete_task(&self, agent_id: &str) -> Result<(), RegistryError> {
let mut inner = self.inner.write().expect("Failed to acquire write lock");
let agent = inner
.agents
.get_mut(agent_id)
.ok_or_else(|| RegistryError::AgentNotFound(agent_id.to_string()))?;
agent.complete_task();
Ok(())
}
/// Get count of agents by role
pub fn count_by_role(&self, role: &str) -> u32 {
let inner = self.inner.read().expect("Failed to acquire read lock");
*inner.running_count.get(role).unwrap_or(&0)
}
/// Get total agent count
pub fn total_count(&self) -> usize {
let inner = self.inner.read().expect("Failed to acquire read lock");
inner.agents.len()
}
}
impl Default for AgentRegistry {
fn default() -> Self {
Self::new(5)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_agent_registration() {
let registry = AgentRegistry::new(5);
let agent = AgentMetadata::new(
"developer".to_string(),
"Developer Agent 1".to_string(),
"claude".to_string(),
"claude-sonnet-4".to_string(),
vec!["coding".to_string()],
);
let id = registry.register_agent(agent).unwrap();
assert!(registry.get_agent(&id).is_some());
assert_eq!(registry.total_count(), 1);
}
#[test]
fn test_max_agents_per_role() {
let registry = AgentRegistry::new(2);
for i in 0..2 {
let agent = AgentMetadata::new(
"developer".to_string(),
format!("Developer {}", i),
"claude".to_string(),
"claude-sonnet-4".to_string(),
vec![],
);
registry.register_agent(agent).unwrap();
}
// Third agent should fail
let agent = AgentMetadata::new(
"developer".to_string(),
"Developer 3".to_string(),
"claude".to_string(),
"claude-sonnet-4".to_string(),
vec![],
);
let result = registry.register_agent(agent);
assert!(result.is_err());
}
#[test]
fn test_agent_task_assignment() {
let _registry = AgentRegistry::new(5);
let mut agent = AgentMetadata::new(
"developer".to_string(),
"Developer Agent".to_string(),
"claude".to_string(),
"claude-sonnet-4".to_string(),
vec![],
);
assert_eq!(agent.current_tasks, 0);
assert!(agent.can_accept_task());
agent.assign_task();
assert_eq!(agent.current_tasks, 1);
agent.complete_task();
assert_eq!(agent.current_tasks, 0);
assert_eq!(agent.total_tasks_completed, 1);
}
#[test]
fn test_get_available_agent() {
let registry = AgentRegistry::new(5);
let agent1 = AgentMetadata::new(
"developer".to_string(),
"Developer 1".to_string(),
"claude".to_string(),
"claude-sonnet-4".to_string(),
vec![],
);
let id1 = registry.register_agent(agent1).unwrap();
let available = registry.get_available_agent("developer");
assert!(available.is_some());
// Assign tasks to fill capacity
for _ in 0..5 {
registry.assign_task(&id1).unwrap();
}
// Should no longer be available
let available = registry.get_available_agent("developer");
assert!(available.is_none());
}
}