# ADR-014: Learning Profiles with Recency Bias

**Status**: Accepted | Implemented
**Date**: 2024-11-01
**Deciders**: Agent Architecture Team
**Technical Story**: Tracking per-task-type agent expertise with recency-weighted learning

---

## Decision

Implement **per-task-type learning profiles with exponential recency bias** to adapt agent selection to each agent's current capability.

---

## Rationale

1. **Recency Bias**: The last 7 days are weighted 3× higher (agents improve quickly)
2. **Per-Task-Type**: One profile per task type (architecture vs code gen vs review)
3. **Avoid Stale Data**: Do not use the all-time average (it may be outdated)
4. **Confidence Score**: Requires 20+ executions before reaching full confidence

---

## Alternatives Considered

### ❌ Simple Average (All-Time)

- **Pros**: Simple
- **Cons**: Old history distorts the score; does not adapt to current improvements

### ❌ Sliding Window (Last N Executions)

- **Pros**: More recent data
- **Cons**: Artificial cutoff; loses historical context

### ✅ Exponential Recency Bias (CHOSEN)

- Weights executions naturally by age; best reflects current capability

---

## Trade-offs

**Pros**:

- ✅ Adapts quickly to agent capability improvements
- ✅ Exponential decay is mathematically sound
- ✅ 20+ execution confidence threshold prevents overfitting
- ✅ Per-task-type specialization

**Cons**:

- ⚠️ Cold start: new agents begin with low confidence
- ⚠️ Requires 20 executions to reach full confidence
- ⚠️ Storage overhead (per agent × per task type)

---

## Implementation

**Learning Profile Model**:

```rust
// crates/vapora-agents/src/learning_profile.rs
use chrono::{DateTime, Utc};

/// A single execution outcome, as stored by `record_execution` below.
pub struct ExecutionRecord {
    pub agent_id: String,
    pub task_type: String,
    pub success: bool,
    pub quality_score: f32,
    pub timestamp: DateTime<Utc>,
}

pub struct TaskTypeLearning {
    pub agent_id: String,
    pub task_type: String,
    pub executions_total: u32,
    pub executions_successful: u32,
    pub avg_quality_score: f32,
    pub avg_latency_ms: f32,
    pub last_updated: DateTime<Utc>,
    pub records: Vec<ExecutionRecord>, // Last 100 executions
}

impl TaskTypeLearning {
    /// Recency weight formula: 3.0 * e^(-days_ago / 7.0) for the last week,
    /// then e^(-days_ago / 7.0) for older executions.
    pub fn compute_recency_weight(days_ago: f64) -> f64 {
        if days_ago <= 7.0 {
            3.0 * (-days_ago / 7.0).exp() // 3× weight for the last week
        } else {
            (-days_ago / 7.0).exp() // Exponential decay afterwards
        }
    }

    /// Weighted expertise score (0.0 - 1.0)
    pub fn expertise_score(&self) -> f32 {
        if self.executions_total == 0 || self.records.is_empty() {
            return 0.0;
        }

        let now = Utc::now();

        let weighted_sum: f64 = self.records
            .iter()
            .map(|r| {
                let days_ago = (now - r.timestamp).num_days() as f64;
                let weight = Self::compute_recency_weight(days_ago);
                (r.quality_score as f64) * weight
            })
            .sum();

        let weight_sum: f64 = self.records
            .iter()
            .map(|r| {
                let days_ago = (now - r.timestamp).num_days() as f64;
                Self::compute_recency_weight(days_ago)
            })
            .sum();

        (weighted_sum / weight_sum) as f32
    }

    /// Confidence score: min(1.0, executions / 20)
    pub fn confidence(&self) -> f32 {
        ((self.executions_total as f32) / 20.0).min(1.0)
    }

    /// Final score combines expertise × confidence
    pub fn score(&self) -> f32 {
        self.expertise_score() * self.confidence()
    }
}
```
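To make the weighting concrete, here is a minimal test sketch (not part of the shipped test suite) that exercises `compute_recency_weight()`, `confidence()`, and `score()`. It assumes the `TaskTypeLearning` and `ExecutionRecord` definitions above live in the same module; the agent name, task type, and numeric expectations are hypothetical and derived only from the formulas in this ADR.

```rust
#[cfg(test)]
mod tests {
    use super::*;
    use chrono::{Duration, Utc};

    #[test]
    fn recency_weight_biases_last_week() {
        // Within the last 7 days the 3× bias factor applies: weight(0) = 3.0.
        assert!((TaskTypeLearning::compute_recency_weight(0.0) - 3.0).abs() < 1e-9);
        // Older executions decay exponentially with a 7-day time constant.
        let w_14 = TaskTypeLearning::compute_recency_weight(14.0);
        assert!((w_14 - (-2.0f64).exp()).abs() < 1e-9);
    }

    #[test]
    fn score_combines_expertise_and_confidence() {
        let now = Utc::now();
        // Helper for building hypothetical execution records.
        let record = |days_ago: i64, quality: f32| ExecutionRecord {
            agent_id: "agent-a".into(),
            task_type: "code_gen".into(),
            success: true,
            quality_score: quality,
            timestamp: now - Duration::days(days_ago),
        };

        // 10 total executions => confidence = 10 / 20 = 0.5,
        // so the final score is half of the recency-weighted expertise score.
        let profile = TaskTypeLearning {
            agent_id: "agent-a".into(),
            task_type: "code_gen".into(),
            executions_total: 10,
            executions_successful: 9,
            avg_quality_score: 0.85,
            avg_latency_ms: 1200.0,
            last_updated: now,
            records: vec![record(1, 0.9), record(30, 0.5)],
        };

        assert!((profile.confidence() - 0.5).abs() < 1e-6);
        // The 1-day-old record dominates, pulling expertise toward 0.9.
        assert!(profile.expertise_score() > 0.8);
        assert!((profile.score() - profile.expertise_score() * 0.5).abs() < 1e-6);
    }
}
```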
**Recording Execution**:

```rust
use surrealdb::{Connection, Surreal};

/// Persists an execution record in the KG and refreshes the agent's
/// learning profile for this task type.
pub async fn record_execution<C: Connection>(
    db: &Surreal<C>,
    agent_id: &str,
    task_type: &str,
    success: bool,
    quality: f32,
) -> Result<()> {
    let record = ExecutionRecord {
        agent_id: agent_id.to_string(),
        task_type: task_type.to_string(),
        success,
        quality_score: quality,
        timestamp: Utc::now(),
    };

    // Store in KG
    db.create("executions").content(&record).await?;

    // Load the existing learning profile (if any)
    let profile = db.query(
        "SELECT * FROM task_type_learning \
         WHERE agent_id = $1 AND task_type = $2"
    )
    .bind((agent_id, task_type))
    .await?;

    // Update counters (incremental)
    // If no profile exists yet, create one with initial values

    Ok(())
}
```

**Agent Selection Using Profiles**:

```rust
pub async fn select_agent_for_task<C: Connection>(
    db: &Surreal<C>,
    task_type: &str,
) -> Result<String> {
    let mut profiles = db.query(
        "SELECT agent_id, expertise_score(), confidence(), score() \
         FROM task_type_learning \
         WHERE task_type = $1 \
         ORDER BY score() DESC \
         LIMIT 1"
    )
    .bind(task_type)
    .await?;

    let best_agent = profiles
        .take::<Option<ProfileRow>>(0)? // ProfileRow: row struct holding the projected agent_id
        .ok_or(Error::NoAgentsAvailable)?;

    Ok(best_agent.agent_id)
}
```

**Scoring Formula**:

```
expertise_score = Σ(quality_score_i × recency_weight_i) / Σ(recency_weight_i)

recency_weight_i = {
  3.0 × e^(-days_ago / 7.0)   if days_ago ≤ 7 days   (3× recent bias)
  e^(-days_ago / 7.0)         if days_ago > 7 days   (exponential decay)
}

confidence  = min(1.0, total_executions / 20)

final_score = expertise_score × confidence
```

**Key Files**:

- `/crates/vapora-agents/src/learning_profile.rs` (profile computation)
- `/crates/vapora-agents/src/scoring.rs` (score calculations)
- `/crates/vapora-agents/src/selector.rs` (agent selection logic)

---

## Verification

```bash
# Test recency weight calculation
cargo test -p vapora-agents test_recency_weight

# Test expertise score with mixed recent/old executions
cargo test -p vapora-agents test_expertise_score

# Test confidence with <20 and >20 executions
cargo test -p vapora-agents test_confidence_score

# Integration: record executions and verify profile updates
cargo test -p vapora-agents test_profile_recording

# Integration: select best agent using profiles
cargo test -p vapora-agents test_agent_selection_by_profile

# Verify cold-start (new agent has low score)
cargo test -p vapora-agents test_cold_start_bias
```

**Expected Output**:

- Recent executions (< 7 days) weighted 3× higher
- Older executions decay exponentially
- New agents (< 20 executions) have lower confidence
- Agents with 20+ executions reach full confidence
- Best agent selected based on recency-weighted score
- Profile updates recorded in the KG

---

## Consequences

### Agent Dynamics

- Agents that improve rapidly rise in the selection order
- Poorly performing agents decline even with historical success
- Learning profiles encourage agent improvement (recent success is rewarded)

### Data Management

- One profile per agent × per task type
- Last 100 executions retained per profile (the rest archived)
- Storage: ~50KB per profile

### Monitoring

- Track which agents are trending up or down
- Identify agents with a cold-start problem
- Alert if all agents for a task type fall below threshold

### User Experience

- Best agents selected automatically
- Selection adapts to agent improvements
- Users see faster task completion over time

---

## References

- `/crates/vapora-agents/src/learning_profile.rs` (profile implementation)
- `/crates/vapora-agents/src/scoring.rs` (scoring logic)
- ADR-013 (Knowledge Graph Temporal)
- ADR-017 (Confidence Weighting)

---

**Related ADRs**: ADR-013 (Knowledge Graph), ADR-017 (Confidence), ADR-018 (Load Balancing)
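---

## Appendix: Worked Scoring Example

The numbers below are purely illustrative (hypothetical agent and execution history); they simply apply the Scoring Formula from the Implementation section.

```
Hypothetical profile: agent "agent-a", task_type "code_gen", 12 total executions

  execution A: quality 0.9,  2 days ago -> weight = 3.0 × e^(-2/7) ≈ 2.254
  execution B: quality 0.6, 21 days ago -> weight =       e^(-3)   ≈ 0.050

  expertise_score = (0.9 × 2.254 + 0.6 × 0.050) / (2.254 + 0.050) ≈ 0.894
  confidence      = min(1.0, 12 / 20) = 0.6
  final_score     ≈ 0.894 × 0.6 ≈ 0.536
```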