Implement intelligent agent learning from Knowledge Graph execution history with per-task-type expertise tracking, recency bias, and learning curves.

## Phase 5.3 Implementation

### Learning Infrastructure (✅ Complete)

- LearningProfileService with per-task-type expertise metrics
- TaskTypeExpertise model tracking success_rate, confidence, learning curves
- Recency bias weighting: recent 7 days weighted 3x higher (exponential decay)
- Confidence scoring prevents overfitting: min(1.0, executions / 20)
- Learning curves computed from daily execution windows

### Agent Scoring Service (✅ Complete)

- Unified AgentScore combining SwarmCoordinator + learning profiles
- Scoring formula: 0.3*base + 0.5*expertise + 0.2*confidence
- Rank agents by combined score for intelligent assignment
- Support for recency-biased scoring (recent_success_rate)
- Methods: rank_agents, select_best, rank_agents_with_recency

### KG Integration (✅ Complete)

- KGPersistence::get_executions_for_task_type() - query by agent + task type
- KGPersistence::get_agent_executions() - all executions for agent
- Coordinator::load_learning_profile_from_kg() - core KG→Learning integration
- Coordinator::load_all_learning_profiles() - batch load for multiple agents
- Convert PersistedExecution → ExecutionData for learning calculations

### Agent Assignment Integration (✅ Complete)

- AgentCoordinator uses learning profiles for task assignment
- extract_task_type() infers task type from title/description
- assign_task() scores candidates using AgentScoringService
- Fallback to load-based selection if no learning data available
- Learning profiles stored in coordinator.learning_profiles RwLock

### Profile Adapter Enhancements (✅ Complete)

- create_learning_profile() - initialize empty profiles
- add_task_type_expertise() - set task-type expertise
- update_profile_with_learning() - update swarm profiles from learning

## Files Modified

### vapora-knowledge-graph/src/persistence.rs (+30 lines)

- get_executions_for_task_type(agent_id, task_type, limit)
- get_agent_executions(agent_id, limit)

### vapora-agents/src/coordinator.rs (+100 lines)

- load_learning_profile_from_kg() - core KG integration method
- load_all_learning_profiles() - batch loading for agents
- assign_task() already uses learning-based scoring via AgentScoringService

### Existing Complete Implementation

- vapora-knowledge-graph/src/learning.rs - calculation functions
- vapora-agents/src/learning_profile.rs - data structures and expertise
- vapora-agents/src/scoring.rs - unified scoring service
- vapora-agents/src/profile_adapter.rs - adapter methods

## Tests Passing

- learning_profile: 7 tests ✅
- scoring: 5 tests ✅
- profile_adapter: 6 tests ✅
- coordinator: learning-specific tests ✅

## Data Flow

1. Task arrives → AgentCoordinator::assign_task()
2. Extract task_type from description
3. Query KG for task-type executions (load_learning_profile_from_kg)
4. Calculate expertise with recency bias
5. Score candidates (SwarmCoordinator + learning)
6. Assign to top-scored agent
7. Execution result → KG → Update learning profiles

## Key Design Decisions

✅ Recency bias: 7-day half-life with 3x weight for recent performance
✅ Confidence scoring: min(1.0, total_executions / 20) prevents overfitting
✅ Hierarchical scoring: 30% base load, 50% expertise, 20% confidence
✅ KG query limit: 100 recent executions per task-type for performance
✅ Async loading: load_learning_profile_from_kg supports concurrent loads

A minimal sketch of this scoring math appears at the end of this section.

## Next: Phase 5.4 - Cost Optimization

Ready to implement budget enforcement and cost-aware provider selection.
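As promised above, a minimal sketch of the scoring math. These are free functions with illustrative signatures, not the actual AgentScoringService API:

```rust
// Sketch of the hierarchical scoring formula (illustrative, not the real API)
fn combined_score(base_load: f64, expertise: f64, total_executions: u32) -> f64 {
    // Confidence ramps 0 → 1 over the first 20 executions,
    // damping scores for task types with little history
    let confidence = (total_executions as f64 / 20.0).min(1.0);
    0.3 * base_load + 0.5 * expertise + 0.2 * confidence
}

// Recency bias: exponential decay with a 7-day half-life,
// so recent executions dominate the expertise estimate
fn recency_weight(age_days: f64) -> f64 {
    0.5_f64.powf(age_days / 7.0)
}
```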
# 🔍 RAG Integration

## Retrieval-Augmented Generation for VAPORA Context

**Version**: 0.1.0
**Status**: Specification (VAPORA v1.0 Integration)
**Purpose**: Integrate the provisioning RAG system into VAPORA for semantic search

---

## 🎯 Objective

**RAG (Retrieval-Augmented Generation)** gives agents grounded context:

- ✅ Agents search for semantically similar documentation
- ✅ ADRs, designs, and guides serve as context for new tasks
- ✅ LLM queries are augmented with relevant documentation
- ✅ Fewer hallucinations, better decisions
- ✅ Complete, production-ready system from provisioning (2,140 lines of Rust)

---
## 🏗️ RAG Architecture

### Components (From Provisioning)

```
RAG System (2,140 lines, production-ready from provisioning)
├─ Chunking Engine
│  ├─ Markdown chunks (with metadata)
│  ├─ KCL chunks (for infrastructure docs)
│  ├─ Nushell chunks (for scripts)
│  └─ Smart splitting (at headers, code blocks)
│
├─ Embeddings
│  ├─ Primary: OpenAI API (text-embedding-3-small)
│  ├─ Fallback: Local ONNX (nomic-embed-text)
│  ├─ Dimension: 1536-dim vectors
│  └─ Batch processing
│
├─ Vector Store
│  ├─ SurrealDB with HNSW index
│  ├─ Fast similarity search
│  ├─ Scalar product distance metric
│  └─ Replication for redundancy
│
├─ Retrieval
│  ├─ Top-K BM25 + semantic hybrid
│  ├─ Threshold filtering (relevance > 0.7)
│  ├─ Context enrichment
│  └─ Ranking/re-ranking
│
└─ Integration
   ├─ Claude API with full context
   ├─ Agent Search tool
   ├─ Workflow context injection
   └─ Decision-making support
```
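Wired together, these components form a single system. A minimal sketch of the top-level aggregate (type names are illustrative, not the final crate API):

```rust
// Illustrative top-level wiring of the components above (not the final API)
pub struct RagSystem {
    pub chunking: ChunkingEngine,     // splits documents into semantic chunks
    pub embeddings: EmbeddingsClient, // OpenAI primary, local ONNX fallback
    pub store: SurrealDB,             // vector store with HNSW index
    pub retriever: HybridRetriever,   // semantic + BM25 hybrid search
}
```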

### Data Flow

```
Document Added to docs/
    ↓
doc-lifecycle-manager classifies
    ↓
RAG Chunking Engine
├─ Split into semantic chunks
└─ Extract metadata (title, type, date)
    ↓
Embeddings Generator
├─ Generate 1536-dim vector per chunk
└─ Batch process for efficiency
    ↓
Vector Store (SurrealDB HNSW)
├─ Store chunk + vector + metadata
└─ Create HNSW index
    ↓
Search Ready
├─ Agent can query
├─ Semantic similarity search
└─ Fast: < 100ms latency
```

---

## 🔧 RAG in VAPORA

### Search Tool (Available to All Agents)
```rust
use std::collections::HashMap;
use std::time::Instant;

pub struct SearchTool {
    pub vector_store: SurrealDB,
    pub embeddings: EmbeddingsClient,
    pub retriever: HybridRetriever,
}

impl SearchTool {
    pub async fn search(
        &self,
        query: String,
        top_k: u32,
        threshold: f64,
    ) -> anyhow::Result<SearchResults> {
        let started = Instant::now();

        // 1. Embed query
        let query_vector = self.embeddings.embed(&query).await?;

        // 2. Search vector store
        let chunk_results = self.vector_store.search_hnsw(
            query_vector,
            top_k,
            threshold,
        ).await?;

        // 3. Enrich with context
        let total_chunks_searched = chunk_results.total_searched; // as reported by the store
        let results = self.enrich_results(chunk_results).await?;

        Ok(SearchResults {
            query,
            results,
            total_chunks_searched,
            search_duration_ms: started.elapsed().as_millis() as u32,
        })
    }

    pub async fn search_with_filters(
        &self,
        query: String,
        filters: SearchFilters,
    ) -> anyhow::Result<SearchResults> {
        // Filter by document type, date, tags before search
        let filtered_documents = self.filter_documents(&filters).await?;
        // ... rest of search
    }
}

pub struct SearchFilters {
    pub doc_type: Option<Vec<String>>,   // ["adr", "guide"]
    pub date_range: Option<(Date, Date)>,
    pub tags: Option<Vec<String>>,       // ["orchestrator", "performance"]
    pub lifecycle_state: Option<String>, // "published", "archived"
}

pub struct SearchResults {
    pub query: String,
    pub results: Vec<SearchResult>,
    pub total_chunks_searched: u32,
    pub search_duration_ms: u32,
}

pub struct SearchResult {
    pub document_id: String,
    pub document_title: String,
    pub chunk_text: String,
    pub relevance_score: f64, // 0.0-1.0
    pub metadata: HashMap<String, String>,
    pub source_url: String,
    pub snippet_context: String, // Surrounding text
}
```

### Agent Usage Example

```rust
// The agent searches for prior context before implementing
impl DeveloperAgent {
    pub async fn implement_feature(
        &mut self,
        task: Task,
    ) -> anyhow::Result<()> {
        // 1. Search for similar features implemented before
        let similar_features = self.search_tool.search(
            format!("implement {} feature like {}", task.domain, task.type_),
            5,    // top_k
            0.75, // threshold
        ).await?;

        // 2. Extract context from results
        let context_docs = similar_features.results
            .iter()
            .map(|r| r.chunk_text.clone())
            .collect::<Vec<_>>();

        // 3. Build LLM prompt with context
        let prompt = format!(
            "Implement the following feature:\n{}\n\nSimilar features implemented:\n{}",
            task.description,
            context_docs.join("\n---\n")
        );

        // 4. Generate code with context
        let _code = self.llm_router.complete(prompt).await?; // applying the code is elided

        Ok(())
    }
}
```

### Documenter Agent Integration

```rust
impl DocumenterAgent {
    pub async fn update_documentation(
        &mut self,
        task: Task,
    ) -> anyhow::Result<()> {
        // 1. Get decisions from the task
        let decisions = task.extract_decisions().await?;

        for decision in decisions {
            // 2. Search existing ADRs to avoid duplicates
            let similar_adrs = self.search_tool.search(
                decision.context.clone(),
                3,   // top_k
                0.8, // threshold
            ).await?;

            // 3. Only document decisions that are not yet covered
            if similar_adrs.results.is_empty() {
                // Create a new ADR
                let adr_content = format!(
                    "# {}\n\n## Context\n{}\n\n## Decision\n{}",
                    decision.title,
                    decision.context,
                    decision.chosen_option,
                );

                // 4. Save and index for RAG
                self.db.save_adr(&adr_content).await?;
                self.rag_system.index_document(&adr_content).await?;
            }
        }

        Ok(())
    }
}
```

---

## 📊 RAG Implementation (From Provisioning)

### Schema (SurrealDB)
```sql
-- RAG chunks table (SurrealDB DEFINE syntax)
DEFINE TABLE rag_chunks SCHEMAFULL
    PERMISSIONS
        FOR select, create FULL
        FOR update, delete NONE;

-- Identifiers (SurrealDB provides the record id implicitly)
DEFINE FIELD document_id ON rag_chunks TYPE string;
DEFINE FIELD chunk_index ON rag_chunks TYPE int;

-- Content
DEFINE FIELD text ON rag_chunks TYPE string;
DEFINE FIELD title ON rag_chunks TYPE string;
DEFINE FIELD doc_type ON rag_chunks TYPE string;

-- Vector (1536-dim embedding)
DEFINE FIELD embedding ON rag_chunks TYPE array<float>;

-- Metadata
DEFINE FIELD created_date ON rag_chunks TYPE datetime;
DEFINE FIELD last_updated ON rag_chunks TYPE datetime;
DEFINE FIELD source_path ON rag_chunks TYPE string;
DEFINE FIELD tags ON rag_chunks TYPE array<string>;
DEFINE FIELD lifecycle_state ON rag_chunks TYPE string;

-- HNSW index over the embedding field
-- (COSINE shown; swap in the scalar-product metric if the target
-- SurrealDB version supports it)
DEFINE INDEX idx_embedding ON rag_chunks
    FIELDS embedding HNSW DIMENSION 1536 DIST COSINE EFC 200 M 16;
```
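For reference, a lookup against this index can use SurrealDB's KNN operator. A sketch via the Rust client, assuming `db` is a connected `surrealdb::Surreal` handle and the target SurrealDB version exposes the `<|K, EF|>` operator and `vector::distance::knn()`:

```rust
// Sketch: top-10 nearest chunks for a query embedding (HNSW search, EF = 40)
let mut response = db
    .query(
        "SELECT document_id, title, text, vector::distance::knn() AS distance \
         FROM rag_chunks \
         WHERE embedding <|10, 40|> $query_embedding \
         ORDER BY distance",
    )
    .bind(("query_embedding", query_vector))
    .await?;
```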

### Chunking Strategy

```rust
pub struct ChunkingEngine;

impl ChunkingEngine {
    pub async fn chunk_document(
        &self,
        document: Document,
    ) -> anyhow::Result<Vec<Chunk>> {
        let chunks = match document.file_type {
            FileType::Markdown => self.chunk_markdown(&document.content)?,
            FileType::KCL => self.chunk_kcl(&document.content)?,
            FileType::Nushell => self.chunk_nushell(&document.content)?,
            _ => self.chunk_text(&document.content)?,
        };

        Ok(chunks)
    }

    fn chunk_markdown(&self, content: &str) -> anyhow::Result<Vec<Chunk>> {
        let mut chunks = Vec::new();
        let mut section = String::new();

        // Split at headers: each line starting with '#' begins a new section
        for line in content.lines() {
            if line.starts_with('#') && !section.is_empty() {
                self.push_section(&mut chunks, &section);
                section.clear();
            }
            section.push_str(line);
            section.push('\n');
        }
        if !section.is_empty() {
            self.push_section(&mut chunks, &section);
        }

        Ok(chunks)
    }

    fn push_section(&self, chunks: &mut Vec<Chunk>, section: &str) {
        // Cap chunk size (~500 tokens, approximated by characters here);
        // oversized sections are split into ~400-character sub-chunks
        if section.len() > 500 {
            let chars: Vec<char> = section.chars().collect();
            for sub_chunk in chars.chunks(400) {
                chunks.push(Chunk {
                    text: sub_chunk.iter().collect(),
                    metadata: Default::default(),
                });
            }
        } else {
            chunks.push(Chunk {
                text: section.to_string(),
                metadata: Default::default(),
            });
        }
    }
}
```

### Embeddings

```rust
pub enum EmbeddingsProvider {
    OpenAI {
        api_key: String,
        model: String, // "text-embedding-3-small": 1536 dims, fast
    },
    Local {
        model_path: String, // path to the ONNX model file
        model: String,      // "nomic-embed-text"
    },
}

pub struct EmbeddingsClient {
    provider: EmbeddingsProvider,
}

#[derive(serde::Deserialize)]
struct OpenAIResponse {
    data: Vec<OpenAIEmbedding>,
}

#[derive(serde::Deserialize)]
struct OpenAIEmbedding {
    embedding: Vec<f32>,
}

impl EmbeddingsClient {
    pub async fn embed(&self, text: &str) -> anyhow::Result<Vec<f32>> {
        match &self.provider {
            EmbeddingsProvider::OpenAI { api_key, model } => {
                // Call the OpenAI embeddings API
                let response = reqwest::Client::new()
                    .post("https://api.openai.com/v1/embeddings")
                    .bearer_auth(api_key)
                    .json(&serde_json::json!({
                        "model": model,
                        "input": text,
                    }))
                    .send()
                    .await?;

                let result: OpenAIResponse = response.json().await?;
                Ok(result.data[0].embedding.clone())
            },
            EmbeddingsProvider::Local { model_path, .. } => {
                // Use the local ONNX model (nomic-embed-text);
                // tokenizing `text` into model inputs is elided here
                let session = ort::Session::builder()?.commit_from_file(model_path)?;

                let inputs = tokenize(text)?; // elided: model-specific tokenizer
                let output = session.run(ort::inputs![inputs]?)?;
                let embedding = output[0].try_extract_tensor::<f32>()?;

                Ok(embedding.iter().copied().collect())
            },
        }
    }

    pub async fn embed_batch(
        &self,
        texts: Vec<String>,
    ) -> anyhow::Result<Vec<Vec<f32>>> {
        // Batch embed for efficiency; the OpenAI API accepts an array input.
        // A sequential fallback is shown here.
        let mut vectors = Vec::with_capacity(texts.len());
        for text in &texts {
            vectors.push(self.embed(text).await?);
        }
        Ok(vectors)
    }
}
```
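The primary/fallback split described in the architecture is not visible inside `embed()` itself. One way to layer it, sketched with two clients (one per provider); note that both providers must emit vectors of the same dimension for a single HNSW index, which is worth verifying for the local model:

```rust
// Sketch: try the OpenAI-backed client first, fall back to the local ONNX client
pub async fn embed_with_fallback(
    primary: &EmbeddingsClient,
    fallback: &EmbeddingsClient,
    text: &str,
) -> anyhow::Result<Vec<f32>> {
    match primary.embed(text).await {
        Ok(vector) => Ok(vector),
        Err(err) => {
            tracing::warn!("primary embeddings failed ({err}); using local fallback");
            fallback.embed(text).await
        }
    }
}
```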

### Retrieval

```rust
use std::collections::HashMap;

pub struct HybridRetriever {
    vector_store: SurrealDB,
    embeddings: EmbeddingsClient,
    bm25_index: BM25Index,
}

impl HybridRetriever {
    pub async fn search(
        &self,
        query: String,
        top_k: u32,
    ) -> anyhow::Result<Vec<ChunkWithScore>> {
        // 1. Semantic search (vector similarity)
        let query_vector = self.embeddings.embed(&query).await?;
        let semantic_results = self.vector_store.search_hnsw(
            query_vector,
            top_k * 2, // over-fetch for re-ranking
            0.5,
        ).await?;

        // 2. BM25 keyword search
        let bm25_results = self.bm25_index.search(&query, top_k * 2)?;

        // 3. Merge and re-rank: reciprocal-rank fusion,
        //    weighted 70% semantic / 30% keyword
        let mut merged: HashMap<String, f64> = HashMap::new();

        for (i, result) in semantic_results.iter().enumerate() {
            let score = 1.0 / (i as f64 + 1.0); // rank-based score
            merged.entry(result.id.clone())
                .and_modify(|s| *s += score * 0.7) // 70% weight
                .or_insert(score * 0.7);
        }

        for (i, result) in bm25_results.iter().enumerate() {
            let score = 1.0 / (i as f64 + 1.0);
            merged.entry(result.id.clone())
                .and_modify(|s| *s += score * 0.3) // 30% weight
                .or_insert(score * 0.3);
        }

        // 4. Sort and return top-k
        let mut final_results: Vec<_> = merged.into_iter().collect();
        final_results.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());

        Ok(final_results.into_iter()
            .take(top_k as usize)
            .map(|(id, score)| {
                // Full chunk content is fetched by id downstream
                ChunkWithScore { id, score }
            })
            .collect())
    }
}
```

---

## 📚 Indexing Workflow

### Automatic Indexing

```
File added to docs/
    ↓
Git hook or workflow trigger
    ↓
doc-lifecycle-manager processes
├─ Classifies document
└─ Publishes "document_added" event
    ↓
RAG system subscribes
├─ Chunks document
├─ Generates embeddings
├─ Stores in SurrealDB
└─ Updates HNSW index
    ↓
Agent Search Tool ready
```
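A minimal sketch of the subscriber side of this flow, assuming a hypothetical `DocumentAdded` event carrying the file path, the illustrative `RagSystem` wiring shown earlier, and hypothetical `load_document`/`insert_chunks` helpers:

```rust
use tokio::sync::mpsc::Receiver;

// Sketch: re-index documents as doc-lifecycle-manager announces them
async fn run_indexer(
    mut events: Receiver<DocumentAdded>, // hypothetical event: { path: String }
    rag: RagSystem,
) -> anyhow::Result<()> {
    while let Some(event) = events.recv().await {
        let document = load_document(&event.path).await?; // hypothetical loader
        let chunks = rag.chunking.chunk_document(document).await?;
        let texts: Vec<String> = chunks.iter().map(|c| c.text.clone()).collect();
        let vectors = rag.embeddings.embed_batch(texts).await?;
        rag.store.insert_chunks(chunks, vectors).await?; // hypothetical; HNSW updates on insert
    }
    Ok(())
}
```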

### Batch Reindexing

```bash
# Periodic full reindex (daily or on demand)
vapora rag reindex --all

# Incremental reindex (only changed docs)
vapora rag reindex --since 1d

# Rebuild HNSW index from scratch
vapora rag rebuild-index --optimize
```

---

## 🎯 Implementation Checklist

- [ ] Port RAG system from provisioning (2,140 lines)
- [ ] Integrate with SurrealDB vector store
- [ ] HNSW index setup + optimization
- [ ] Chunking strategies (Markdown, KCL, Nushell)
- [ ] Embeddings client (OpenAI + local fallback)
- [ ] Hybrid retrieval (semantic + BM25)
- [ ] Search tool for agents
- [ ] doc-lifecycle-manager hooks
- [ ] Indexing workflows
- [ ] Batch reindexing
- [ ] CLI: `vapora rag search`, `vapora rag reindex`
- [ ] Tests + benchmarks
---

## 📊 Success Metrics

✅ Search latency < 100ms (p99)
✅ Relevance score > 0.8 for top results
✅ 1,000+ documents indexed
✅ Memory-efficient HNSW index
✅ Agents find relevant context automatically
✅ Fewer hallucinations: answers grounded in retrieved context
---

**Version**: 0.1.0
**Status**: ✅ Integration Specification Complete
**Purpose**: RAG system for semantic document search in VAPORA