Compare commits

..

2 Commits

Author SHA1 Message Date
Jesús Pérez
765841b18f
feat(capabilities): add vapora-capabilities crate with in-process executor dispatch
- New vapora-capabilities crate: CapabilitySpec, Capability trait, CapabilityRegistry (parking_lot RwLock), CapabilityLoader (TOML overrides), 3 built-ins (code-reviewer, doc-generator, pr-monitor), 22 tests
- Move AgentDefinition to vapora-shared to break capabilities↔agents circular dep
- Wire system_prompt into AgentExecutor via LLMRouter.complete_with_budget
- AgentCoordinator: in-process task dispatch via DashMap<String, Sender<TaskAssignment>>
- server.rs: bootstrap CapabilityRegistry + LLMRouter from env, spawn executors per capability
- Landing page: 620 tests, 21 crates, Capability Packages feature box
- docs: capability-packages feature guide, ADR-0037, CHANGELOG, SUMMARY
2026-02-26 16:43:28 +00:00
Jesús Pérez
27a290b369
feat(kg,channels): hybrid search + agent-inactive notifications
- KG: HNSW + BM25 + RRF(k=60) hybrid search via SurrealDB 3 native indexes
- Fix schema bug: kg_executions missing agent_role/provider/cost_cents (silent empty reads)
- channels: on_agent_inactive hook (AgentStatus::Inactive → Message::error)
- migration 012: adds missing fields + HNSW + BM25 indexes
- docs: ADR-0036, update ADR-0035 + notification-channels feature doc
2026-02-26 15:32:44 +00:00
40 changed files with 2997 additions and 103 deletions

View File

@@ -7,6 +7,43 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
### Added - Capability Packages (`vapora-capabilities`)
#### `vapora-capabilities` — new crate
- `CapabilitySpec`: full bundle struct — `id`, `display_name`, `description`, `agent_role`, `task_types`, `system_prompt`, `mcp_tools`, `preferred_provider`, `preferred_model`, `max_tokens`, `temperature`, `priority`, `parallelizable`
- `Capability` trait: object-safe (`Send + Sync`), single `fn spec() -> CapabilitySpec` + default `fn to_agent_definition() -> AgentDefinition`
- `CustomCapability(CapabilitySpec)` wrapper for TOML-loaded capabilities
- `CapabilityRegistry`: `parking_lot::RwLock<HashMap<String, Arc<dyn Capability>>>` — `register()`, `register_or_replace()`, `override_spec()`, `activate()`, `list_ids()`, `len()`
- `CapabilityLoader`: `parse(toml_str)`, `from_file(path)`, `apply(config, registry)`, `load_and_apply(path, registry)` — partial override via `Option<T>` fields, unknown IDs skipped with warning, idempotent re-application
- Three built-in capabilities:
- `CodeReviewer` — role `code_reviewer`, Claude Opus 4.6, temperature 0.1, max_tokens 8192, tools: file_read/file_list/git_diff/code_search, structured JSON output (severity Critical/High/Medium/Low/Info)
- `DocGenerator` — role `documenter`, Claude Sonnet 4.6, temperature 0.3, max_tokens 16384, tools: file_read/file_list/code_search/file_write, multi-level doc methodology
- `PRMonitor` — role `monitor`, Claude Sonnet 4.6, temperature 0.1, max_tokens 4096, tools: git_diff/git_log/git_status/file_list/file_read, READY/NEEDS_REVIEW/BLOCKED classification
- 22 unit tests + 3 doc-tests
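The `Option<T>`-based partial override mentioned above can be sketched as follows (editor's illustration — the struct and field names are simplified stand-ins, not the crate's actual `CapabilitySpec` API):

```rust
// Hypothetical simplification of Option<T>-based partial overrides:
// every override field is optional, and applying an override replaces
// only the fields that are present, leaving the rest of the base intact.
#[derive(Clone, Debug, PartialEq)]
struct Spec {
    system_prompt: String,
    preferred_model: String,
    temperature: f32,
}

#[derive(Default)]
struct SpecOverride {
    system_prompt: Option<String>,
    preferred_model: Option<String>,
    temperature: Option<f32>,
}

fn apply_override(base: &Spec, ov: &SpecOverride) -> Spec {
    Spec {
        system_prompt: ov
            .system_prompt
            .clone()
            .unwrap_or_else(|| base.system_prompt.clone()),
        preferred_model: ov
            .preferred_model
            .clone()
            .unwrap_or_else(|| base.preferred_model.clone()),
        temperature: ov.temperature.unwrap_or(base.temperature),
    }
}

fn main() {
    let base = Spec {
        system_prompt: "review code".into(),
        preferred_model: "claude-opus-4-6".into(),
        temperature: 0.1,
    };
    // Only the model is overridden; prompt and temperature fall through.
    let ov = SpecOverride {
        preferred_model: Some("claude-sonnet-4-6".into()),
        ..Default::default()
    };
    let merged = apply_override(&base, &ov);
    assert_eq!(merged.preferred_model, "claude-sonnet-4-6");
    assert_eq!(merged.temperature, 0.1);
    // Re-applying the same override is idempotent, as the changelog notes.
    assert_eq!(apply_override(&merged, &ov), merged);
    println!("merged: {:?}", merged);
}
```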
#### `vapora-shared` — `AgentDefinition` moved here
- `AgentDefinition` extracted from `vapora-agents::config` into `vapora-shared::agent_definition`
- Re-exported from `vapora-agents::config` for backward compatibility — zero call-site changes
- Breaks the potential `vapora-agents ↔ vapora-capabilities` circular dependency
#### `vapora-agents` — executor wired to LLM router + capabilities
- `AgentExecutor::with_router(Arc<LLMRouter>)` builder — routes real LLM calls through `complete_with_budget()` using `AgentMetadata::system_prompt` as the system message
- `AgentExecutor::execute_task()` — replaces hardcoded stub; dispatches `task.description` + `task.context` as user prompt; provider name tracked in KG persistence record
- `AgentCoordinator::register_executor_channel(agent_id, Sender<TaskAssignment>)` — registers in-process executor channel
- `AgentCoordinator::assign_task()` — dispatches to registered executor channel (in addition to NATS) without holding `DashMap` shard lock across await
- `server.rs` — initializes `CapabilityRegistry::with_built_ins()` at startup; `build_router_from_env()` builds `LLMRouter` from `LLM_ROUTER_CONFIG` file or `ANTHROPIC_API_KEY`/`OPENAI_API_KEY`/`OLLAMA_URL` env vars; spawns one `AgentExecutor` per capability with router wired; agents from `agents.toml` that have no matching capability also get executors
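The in-process dispatch path can be illustrated with a synchronous stand-in (editor's sketch: `std::sync::mpsc` in place of tokio channels, a locked `HashMap` in place of `DashMap`, and simplified names). The key point mirrors the real coordinator: clone the sender out of the map, then release the lock before sending:

```rust
use std::collections::HashMap;
use std::sync::{mpsc, Mutex};

#[derive(Clone, Debug, PartialEq)]
struct TaskAssignment {
    id: u32,
    description: String,
}

struct Coordinator {
    executor_channels: Mutex<HashMap<String, mpsc::Sender<TaskAssignment>>>,
}

impl Coordinator {
    fn new() -> Self {
        Self { executor_channels: Mutex::new(HashMap::new()) }
    }

    fn register_executor_channel(&self, agent_id: &str, tx: mpsc::Sender<TaskAssignment>) {
        self.executor_channels.lock().unwrap().insert(agent_id.to_string(), tx);
    }

    fn assign_task(&self, agent_id: &str, task: TaskAssignment) -> bool {
        // Clone the sender out of the map first, so the lock is released
        // before delivery — analogous to the async version, which must not
        // hold a DashMap shard lock across an await point.
        let tx = self.executor_channels.lock().unwrap().get(agent_id).cloned();
        match tx {
            Some(tx) => tx.send(task).is_ok(),
            None => false, // no in-process executor; real code also publishes to NATS
        }
    }
}

fn main() {
    let coord = Coordinator::new();
    let (tx, rx) = mpsc::channel();
    coord.register_executor_channel("code-reviewer-agent", tx);
    let task = TaskAssignment { id: 1, description: "review PR".into() };
    assert!(coord.assign_task("code-reviewer-agent", task.clone()));
    assert_eq!(rx.recv().unwrap(), task);
    assert!(!coord.assign_task("unknown-agent", task));
    println!("dispatched in-process");
}
```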
#### Documentation
- `docs/features/capability-packages.md` — new feature reference
- `docs/guides/capability-packages-guide.md` — usage guide (activate built-ins, TOML customization, custom capabilities, env vars)
- **ADR-0037**: design rationale for capability packages, dependency inversion via `vapora-shared`, and in-process executor dispatch
---
### Added - Webhook Notification Channels (`vapora-channels`)
#### `vapora-channels` — new crate
@@ -46,6 +83,37 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
---
### Added - Knowledge Graph Hybrid Search (HNSW + BM25 + RRF)
#### `vapora-knowledge-graph` — real retrieval replaces stub
- `find_similar_executions`: was returning recent records ordered by timestamp; now uses SurrealDB 3 HNSW ANN query (`<|100,64|>`) against the `embedding` field
- `hybrid_search`: new method combining HNSW semantic + BM25 lexical via RRF(k=60) fusion; returns `Vec<HybridSearchResult>` with individual `semantic_score`, `lexical_score`, `hybrid_score`, and rank fields
- `find_similar_rlm_tasks`: was ignoring `query_embedding`; now uses in-memory cosine similarity over SCHEMALESS `rlm_executions` records
- `HybridSearchResult` added to `models.rs` and re-exported from `lib.rs`
- 5 new unit tests: `cosine_similarity` edge cases (orthogonal, identical, empty, partial) + RRF fusion consensus validation
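RRF fusion as used above can be sketched in a few lines (editor's illustration over ranked ID lists; the real method fuses SurrealDB HNSW and BM25 result sets and carries per-result scores):

```rust
use std::collections::HashMap;

// Reciprocal Rank Fusion: a result's fused score is the sum of
// 1/(k + rank) over every ranked list it appears in (k = 60 here,
// ranks counted from 1). Items ranked in both lists beat a single #1.
fn rrf_fuse(semantic: &[&str], lexical: &[&str], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for (rank, id) in semantic.iter().enumerate() {
        *scores.entry(id.to_string()).or_default() += 1.0 / (k + rank as f64 + 1.0);
    }
    for (rank, id) in lexical.iter().enumerate() {
        *scores.entry(id.to_string()).or_default() += 1.0 / (k + rank as f64 + 1.0);
    }
    let mut out: Vec<(String, f64)> = scores.into_iter().collect();
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out
}

fn main() {
    // "b" is mid-ranked in both lists; cross-list consensus wins.
    let fused = rrf_fuse(&["a", "b", "c"], &["b", "d"], 60.0);
    assert_eq!(fused[0].0, "b");
    println!("{:?}", fused);
}
```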
#### `migrations/012_kg_hybrid_search.surql` — schema fix + indexes
- **Schema bug fixed**: `kg_executions` (SCHEMAFULL) was missing `agent_role`, `provider`, `cost_cents` — SurrealDB silently dropped these fields on INSERT, so reads deserialized to empty values with no error; all three fields are now declared
- `DEFINE ANALYZER kg_text_analyzer` — `class` tokenizer + `lowercase` + `snowball(english)` filters
- `DEFINE INDEX idx_kg_executions_ft` — BM25 full-text index on `task_description`
- `DEFINE INDEX idx_kg_executions_hnsw` — HNSW index on `embedding` (1536-dim, cosine, F32, M=16, EF=200)
#### Documentation
- **ADR-0036**: documents HNSW+BM25+RRF decision, the schema bug root cause, and why `stratum-embeddings` brute-force is unsuitable for unbounded KG datasets
---
### Added - `on_agent_inactive` Notification Hook
- `NotificationConfig` gains `on_agent_inactive: Vec<String>` — fires when `update_agent_status` transitions an agent to `AgentStatus::Inactive`
- `update_agent_status` handler in `agents.rs` fires `Message::error("Agent Inactive", ...)` via `state.notify`
- Docs: `on_agent_inactive` added to Events Reference table in `docs/features/notification-channels.md` and to the backend integration section in ADR-0035
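The hook's trigger condition can be sketched as follows (editor's illustration; the edge-triggered check — fire only on a transition *into* Inactive — is an assumption consistent with "transitions an agent to `AgentStatus::Inactive`", not a confirmed detail of the handler):

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum AgentStatus {
    Active,
    Busy,
    Inactive,
}

// Returns a notification message only when the status actually crosses
// into Inactive, so an agent that stays inactive does not re-alert.
fn inactive_notification(old: AgentStatus, new: AgentStatus, agent_id: &str) -> Option<String> {
    if old != AgentStatus::Inactive && new == AgentStatus::Inactive {
        Some(format!("Agent Inactive: {}", agent_id))
    } else {
        None
    }
}

fn main() {
    assert_eq!(
        inactive_notification(AgentStatus::Active, AgentStatus::Inactive, "pr-monitor"),
        Some("Agent Inactive: pr-monitor".to_string())
    );
    // No duplicate alert while the agent remains inactive.
    assert_eq!(
        inactive_notification(AgentStatus::Inactive, AgentStatus::Inactive, "pr-monitor"),
        None
    );
    println!("hook ok");
}
```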
---
### Added - Autonomous Scheduling: Timezone Support and Distributed Fire-Lock
#### `vapora-workflow-engine` — scheduling hardening

Cargo.lock generated
View File

@@ -12338,6 +12338,7 @@ dependencies = [
"axum",
"chrono",
"clap",
"dashmap 6.1.0",
"futures",
"mockall",
"rig-core",
@@ -12352,6 +12353,7 @@ dependencies = [
"tracing",
"tracing-subscriber",
"uuid",
"vapora-capabilities",
"vapora-knowledge-graph",
"vapora-llm-router",
"vapora-shared",
@@ -12430,6 +12432,20 @@ dependencies = [
"wiremock",
]
[[package]]
name = "vapora-capabilities"
version = "1.2.0"
dependencies = [
"parking_lot",
"serde",
"serde_json",
"tempfile",
"thiserror 2.0.18",
"toml 0.9.8",
"tracing",
"vapora-shared",
]
[[package]]
name = "vapora-channels"
version = "1.2.0"

View File

@@ -3,6 +3,7 @@
resolver = "2"
members = [
"crates/vapora-capabilities",
"crates/vapora-channels",
"crates/vapora-backend",
"crates/vapora-frontend",
@@ -37,6 +38,7 @@ categories = ["development-tools", "web-programming"]
[workspace.dependencies]
# Vapora internal crates
vapora-capabilities = { path = "crates/vapora-capabilities" }
vapora-channels = { path = "crates/vapora-channels" }
vapora-shared = { path = "crates/vapora-shared" }
vapora-leptos-ui = { path = "crates/vapora-leptos-ui" }

View File

@@ -86,10 +86,19 @@
- **Chunking Strategies**: Fixed-size, semantic (sentence-aware), code-aware (AST-based for Rust/Python/JS)
- **Sandbox Execution**: WASM tier (<10ms) + Docker tier (80-150ms) with automatic tier selection
- **Multi-Provider LLM**: OpenAI, Claude, Ollama integration with cost tracking
- **Knowledge Graph**: Execution history persistence with learning curves
- **Knowledge Graph**: Temporal execution history with hybrid retrieval — HNSW (semantic) + BM25 (lexical) + RRF fusion via SurrealDB 3 native indexes
- **Production Ready**: 38/38 tests passing, 0 clippy warnings, real SurrealDB persistence
- **Cost Efficient**: Chunk-based processing reduces token usage vs full-document LLM calls
### 🔔 Webhook Notification Channels
- **Three providers**: Slack (Incoming Webhook), Discord (Webhook embed), Telegram (Bot API) — no vendor SDKs, plain HTTP POST
- **Secret resolution at startup**: `${VAR}` / `${VAR:-default}` interpolation built into `ChannelRegistry`; a raw placeholder never reaches the HTTP layer
- **Fire-and-forget**: channel failures never surface as API errors; delivery is logged and continues for remaining targets
- **Backend events**: `on_task_done`, `on_proposal_approved`, `on_proposal_rejected`, `on_agent_inactive`
- **Workflow events**: `on_stage_complete`, `on_stage_failed`, `on_completed`, `on_cancelled` — per-workflow routing config
- **REST API**: `GET /api/v1/channels` (list), `POST /api/v1/channels/:name/test` (connectivity check)
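The `${VAR}` / `${VAR:-default}` resolution described above can be sketched as a small resolver (editor's illustration — the parsing logic and error shape are assumptions; only the two placeholder forms and the resolve-at-startup behavior come from the changelog):

```rust
use std::collections::HashMap;

// Minimal resolver for the two placeholder forms: ${VAR} (must exist)
// and ${VAR:-default} (fallback). Resolution happens once, up front —
// an unresolvable placeholder is an error so it never reaches HTTP.
fn resolve(input: &str, env: &HashMap<String, String>) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after.find('}').ok_or("unclosed ${...}")?;
        let body = &after[..end];
        let (name, default) = match body.split_once(":-") {
            Some((n, d)) => (n, Some(d)),
            None => (body, None),
        };
        match env.get(name).map(String::as_str).or(default) {
            Some(v) => out.push_str(v),
            None => return Err(format!("unresolved placeholder ${{{}}}", name)),
        }
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(out)
}

fn main() {
    let mut env = HashMap::new();
    env.insert("SLACK_TOKEN".to_string(), "xoxb-123".to_string());
    assert_eq!(
        resolve("https://hooks/${SLACK_TOKEN}", &env).unwrap(),
        "https://hooks/xoxb-123"
    );
    assert_eq!(resolve("${CHAT_ID:-default-chat}", &env).unwrap(), "default-chat");
    // A raw placeholder must fail loudly rather than leak downstream.
    assert!(resolve("${MISSING}", &env).is_err());
    println!("resolved ok");
}
```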
### 🧠 Intelligent Learning & Cost Optimization (Phase 5.3 + 5.4)
- **Per-Task-Type Learning**: Agents build expertise profiles from execution history

File diff suppressed because one or more lines are too long

View File

@@ -516,8 +516,8 @@
<div class="container">
<header>
<span class="status-badge" data-en="✅ v1.2.0 | 372 Tests | 100% Pass Rate" data-es="✅ v1.2.0 | 372 Tests | 100% Éxito"
>✅ v1.2.0 | 372 Tests | 100% Pass Rate</span
<span class="status-badge" data-en="✅ v1.2.0 | 620 Tests | 100% Pass Rate" data-es="✅ v1.2.0 | 620 Tests | 100% Éxito"
>✅ v1.2.0 | 620 Tests | 100% Pass Rate</span
>
<div class="logo-container">
<img id="logo-dark" src="/vapora.svg" alt="Vapora - Development Orchestration" style="display: block;" />
@@ -821,6 +821,24 @@
Real-time alerts to Slack, Discord, and Telegram — no vendor SDKs. ${VAR} secret resolution is built into ChannelRegistry construction; tokens never reach the HTTP layer unresolved. Fire-and-forget hooks on task completion, proposal approval/rejection, and workflow lifecycle events.
</p>
</div>
<div class="feature-box" style="border-left-color: #6366f1">
<div class="feature-icon">🧩</div>
<h3
class="feature-title"
style="color: #6366f1"
data-en="Capability Packages"
data-es="Paquetes de Capacidades"
>
Capability Packages
</h3>
<p
class="feature-text"
data-en="Domain-optimized agent bundles — system prompt, preferred LLM model, task types, and MCP tools pre-configured per role. Three built-ins (code-reviewer, doc-generator, pr-monitor) loaded at startup via CapabilityRegistry. TOML overrides let you swap model or prompt without code changes. In-process executor dispatch via DashMap channels — no NATS required for standalone mode. 22 tests."
data-es="Bundles de agentes optimizados por dominio — system prompt, modelo LLM preferido, tipos de tarea y herramientas MCP preconfigurados por rol. Tres built-ins (code-reviewer, doc-generator, pr-monitor) cargados en startup via CapabilityRegistry. Overrides TOML permiten cambiar modelo o prompt sin cambios de código. Dispatch de executor en proceso via canales DashMap — sin NATS requerido en modo standalone. 22 tests."
>
Domain-optimized agent bundles — system prompt, preferred LLM model, task types, and MCP tools pre-configured per role. Three built-ins (code-reviewer, doc-generator, pr-monitor) loaded at startup via CapabilityRegistry. TOML overrides let you swap model or prompt without code changes. In-process executor dispatch via DashMap channels — no NATS required for standalone mode. 22 tests.
</p>
</div>
</div>
</section>
@@ -831,7 +849,7 @@
>
</h2>
<div class="tech-stack">
<span class="tech-badge">Rust (18 crates)</span>
<span class="tech-badge">Rust (21 crates)</span>
<span class="tech-badge">Axum REST API</span>
<span class="tech-badge">SurrealDB</span>
<span class="tech-badge">NATS JetStream</span>
@@ -844,6 +862,7 @@
<span class="tech-badge">MCP Server</span>
<span class="tech-badge">chrono-tz (Cron)</span>
<span class="tech-badge">Webhook Channels</span>
<span class="tech-badge">Capability Packages</span>
</div>
</section>

View File

@@ -17,10 +17,14 @@ path = "src/bin/server.rs"
[dependencies]
# Internal crates
vapora-shared = { workspace = true }
vapora-capabilities = { workspace = true }
vapora-llm-router = { workspace = true }
vapora-knowledge-graph = { workspace = true }
vapora-swarm = { workspace = true }
# Concurrent collections
dashmap = { workspace = true }
# Secrets management
secretumvault = { workspace = true }

View File

@@ -1,6 +1,7 @@
//! VAPORA Agent Server Binary
//! Provides HTTP server for agent coordination and health checks
use std::collections::HashMap;
use std::sync::Arc;
use anyhow::Result;
@@ -8,9 +9,17 @@ use axum::{extract::State, routing::get, Json, Router};
use clap::Parser;
use serde_json::json;
use tokio::net::TcpListener;
use tracing::{error, info};
use vapora_agents::{config::AgentConfig, coordinator::AgentCoordinator, registry::AgentRegistry};
use vapora_llm_router::{BudgetConfig, BudgetManager};
use tokio::sync::mpsc;
use tracing::{error, info, warn};
use vapora_agents::{
config::AgentConfig,
coordinator::AgentCoordinator,
registry::{AgentMetadata, AgentRegistry},
runtime::executor::AgentExecutor,
};
use vapora_capabilities::CapabilityRegistry;
use vapora_llm_router::{BudgetConfig, BudgetManager, LLMRouter, LLMRouterConfig, ProviderConfig};
use vapora_shared::AgentDefinition;
#[derive(Clone)]
struct AppState {
@@ -42,10 +51,8 @@ struct Args {
#[tokio::main]
async fn main() -> Result<()> {
// Parse CLI arguments
let args = Args::parse();
// Initialize tracing
tracing_subscriber::fmt()
.with_env_filter(
tracing_subscriber::EnvFilter::from_default_env()
@@ -55,11 +62,11 @@ async fn main() -> Result<()> {
info!("Starting VAPORA Agent Server");
// Load configuration from environment
// Load agent configuration
let config = AgentConfig::from_env()?;
info!("Loaded configuration from environment");
// Load budget configuration from specified path
// Load budget configuration
let budget_config_path = args.budget_config;
let budget_manager = match BudgetConfig::load_or_default(&budget_config_path) {
Ok(budget_config) => {
@@ -84,16 +91,25 @@
}
};
// Build LLM router from config file or environment credentials
let router = build_router_from_env();
let router = router.map(Arc::new);
// Initialize capability registry with built-in capability packages
let cap_registry = CapabilityRegistry::with_built_ins();
info!(
"Capability registry initialized: {:?}",
cap_registry.list_ids()
);
// Initialize agent registry and coordinator
// Max 10 agents per role (can be configured via environment)
let max_agents_per_role = std::env::var("MAX_AGENTS_PER_ROLE")
.ok()
.and_then(|v| v.parse().ok())
.unwrap_or(10);
let registry = Arc::new(AgentRegistry::new(max_agents_per_role));
let mut coordinator = AgentCoordinator::new(config, registry).await?;
let mut coordinator = AgentCoordinator::new(config.clone(), Arc::clone(&registry)).await?;
// Attach budget manager to coordinator if available
if let Some(ref bm) = budget_manager {
coordinator = coordinator.with_budget_manager(bm.clone());
info!("Budget enforcement enabled for agent coordinator");
@@ -101,9 +117,32 @@
let coordinator = Arc::new(coordinator);
// Spawn one executor per built-in capability, each wired to the LLM router.
// The executor's channel sender is registered with the coordinator so that
// assign_task() dispatches directly in-process.
for cap_id in cap_registry.list_ids() {
spawn_capability_executor(
&cap_id,
&cap_registry,
&registry,
&coordinator,
router.as_ref(),
);
}
// Spawn executors for any agents defined in agents.toml that are NOT
// already covered by a capability package (role not registered yet).
for agent_def in &config.agents {
if registry.get_agents_by_role(&agent_def.role).is_empty() {
spawn_single_config_executor(agent_def, &registry, &coordinator, router.as_ref());
}
}
info!("{} agents registered", registry.total_count());
// Start coordinator
let _coordinator_handle = {
let coordinator = coordinator.clone();
let coordinator = Arc::clone(&coordinator);
tokio::spawn(async move {
if let Err(e) = coordinator.start().await {
error!("Coordinator error: {}", e);
@@ -111,32 +150,221 @@ })
})
};
// Build application state
let state = AppState {
coordinator,
budget_manager,
};
// Build HTTP router
let app = Router::new()
.route("/health", get(health_handler))
.route("/ready", get(readiness_handler))
.with_state(state);
// Start HTTP server
let addr = std::env::var("BIND_ADDR").unwrap_or_else(|_| "0.0.0.0:9000".to_string());
info!("Agent server listening on {}", addr);
let listener = TcpListener::bind(&addr).await?;
axum::serve(listener, app).await?;
// Note: coordinator_handle would be awaited here if needed,
// but axum::serve blocks until server shutdown
Ok(())
}
/// Health check endpoint
/// Activate a capability, register the resulting agent, and spawn its executor.
fn spawn_capability_executor(
cap_id: &str,
cap_registry: &CapabilityRegistry,
registry: &Arc<AgentRegistry>,
coordinator: &Arc<AgentCoordinator>,
router: Option<&Arc<LLMRouter>>,
) {
let def = match cap_registry.activate(cap_id) {
Ok(d) => d,
Err(e) => {
warn!("Failed to activate capability '{}': {}", cap_id, e);
return;
}
};
let metadata = agent_metadata_from_definition(&def);
let agent_id = metadata.id.clone();
if let Err(e) = registry.register_agent(metadata.clone()) {
warn!(
"Failed to register agent for capability '{}': {}",
cap_id, e
);
return;
}
let (tx, rx) = mpsc::channel::<vapora_agents::messages::TaskAssignment>(32);
let executor = build_executor(metadata, rx, router);
tokio::spawn(executor.run());
coordinator.register_executor_channel(agent_id, tx);
info!(
"Spawned executor for capability '{}' (role: {}, provider: {}, model: {})",
cap_id, def.role, def.llm_provider, def.llm_model
);
}
fn spawn_single_config_executor(
agent_def: &AgentDefinition,
registry: &Arc<AgentRegistry>,
coordinator: &Arc<AgentCoordinator>,
router: Option<&Arc<LLMRouter>>,
) {
let metadata = agent_metadata_from_definition(agent_def);
let agent_id = metadata.id.clone();
if let Err(e) = registry.register_agent(metadata.clone()) {
warn!(
"Failed to register config agent '{}': {}",
agent_def.role, e
);
return;
}
let (tx, rx) = mpsc::channel::<vapora_agents::messages::TaskAssignment>(32);
let executor = build_executor(metadata, rx, router);
tokio::spawn(executor.run());
coordinator.register_executor_channel(agent_id, tx);
info!(
"Spawned executor for config agent (role: {}, provider: {})",
agent_def.role, agent_def.llm_provider
);
}
fn agent_metadata_from_definition(def: &AgentDefinition) -> AgentMetadata {
let m = AgentMetadata::new(
def.role.clone(),
format!("{}-agent", def.role),
def.llm_provider.clone(),
def.llm_model.clone(),
def.capabilities.clone(),
);
match &def.system_prompt {
Some(prompt) => m.with_system_prompt(prompt.clone()),
None => m,
}
}
fn build_executor(
metadata: AgentMetadata,
rx: mpsc::Receiver<vapora_agents::messages::TaskAssignment>,
router: Option<&Arc<LLMRouter>>,
) -> AgentExecutor {
let executor = AgentExecutor::new(metadata, rx);
match router {
Some(r) => executor.with_router(Arc::clone(r)),
None => executor,
}
}
/// Build an LLM router from a config file (LLM_ROUTER_CONFIG env var) or
/// from individual provider API key environment variables.
/// Returns None and logs a warning when no credentials are available.
fn build_router_from_env() -> Option<LLMRouter> {
// Try config file path first
if let Ok(path) = std::env::var("LLM_ROUTER_CONFIG") {
match LLMRouterConfig::load(&path) {
Ok(config) => match LLMRouter::new(config) {
Ok(router) => {
info!("LLM router initialized from {}", path);
return Some(router);
}
Err(e) => warn!("LLM router config loaded but init failed: {}", e),
},
Err(e) => warn!("Failed to load LLM router config from {}: {}", path, e),
}
}
// Fall back to building from individual environment variables
let mut providers: HashMap<String, ProviderConfig> = HashMap::new();
if let Ok(api_key) = std::env::var("ANTHROPIC_API_KEY") {
providers.insert(
"claude".to_string(),
ProviderConfig {
enabled: true,
api_key: Some(api_key),
url: None,
model: "claude-sonnet-4-6".to_string(),
max_tokens: 8192,
temperature: 0.7,
cost_per_1m_input: 3.0,
cost_per_1m_output: 15.0,
},
);
}
if let Ok(api_key) = std::env::var("OPENAI_API_KEY") {
providers.insert(
"openai".to_string(),
ProviderConfig {
enabled: true,
api_key: Some(api_key),
url: None,
model: "gpt-4o".to_string(),
max_tokens: 8192,
temperature: 0.7,
cost_per_1m_input: 2.5,
cost_per_1m_output: 10.0,
},
);
}
if let Ok(url) = std::env::var("OLLAMA_URL") {
providers.insert(
"ollama".to_string(),
ProviderConfig {
enabled: true,
api_key: None,
url: Some(url),
model: std::env::var("OLLAMA_MODEL").unwrap_or_else(|_| "llama3.2".to_string()),
max_tokens: 4096,
temperature: 0.7,
cost_per_1m_input: 0.0,
cost_per_1m_output: 0.0,
},
);
}
if providers.is_empty() {
info!("No LLM provider credentials found — executors will use stub responses");
return None;
}
let default_provider = if providers.contains_key("claude") {
"claude".to_string()
} else {
providers.keys().next().unwrap().clone()
};
let config = LLMRouterConfig {
routing: vapora_llm_router::config::RoutingConfig {
default_provider,
cost_tracking_enabled: true,
fallback_enabled: true,
},
providers,
routing_rules: vec![],
};
match LLMRouter::new(config) {
Ok(router) => {
info!("LLM router initialized from environment credentials");
Some(router)
}
Err(e) => {
warn!("Failed to initialize LLM router from env vars: {}", e);
None
}
}
}
async fn health_handler() -> Json<serde_json::Value> {
Json(json!({
"status": "healthy",
@@ -145,10 +373,8 @@ async fn health_handler() -> Json<serde_json::Value> {
}))
}
/// Readiness check endpoint
async fn readiness_handler(State(state): State<AppState>) -> Json<serde_json::Value> {
let is_ready = state.coordinator.is_ready().await;
Json(json!({
"ready": is_ready,
"service": "vapora-agents",

View File

@@ -46,23 +46,10 @@ fn default_agent_timeout() -> u64 {
300
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AgentDefinition {
pub role: String,
pub description: String,
pub llm_provider: String,
pub llm_model: String,
#[serde(default)]
pub parallelizable: bool,
#[serde(default = "default_priority")]
pub priority: u32,
#[serde(default)]
pub capabilities: Vec<String>,
}
fn default_priority() -> u32 {
50
}
/// Re-exported from `vapora-shared` so callers using
/// `vapora_agents::config::AgentDefinition` continue to compile without
/// changes.
pub use vapora_shared::AgentDefinition;
impl AgentConfig {
/// Load configuration from TOML file
@@ -136,6 +123,7 @@ impl Default for AgentConfig {
parallelizable: true,
priority: 80,
capabilities: vec!["coding".to_string()],
system_prompt: None,
}],
}
}
@@ -161,6 +149,7 @@ mod tests {
parallelizable: true,
priority: 80,
capabilities: vec!["coding".to_string()],
system_prompt: None,
}],
};
@@ -184,6 +173,7 @@ mod tests {
parallelizable: true,
priority: 80,
capabilities: vec![],
system_prompt: None,
},
AgentDefinition {
role: "developer".to_string(),
@@ -193,6 +183,7 @@
parallelizable: true,
priority: 80,
capabilities: vec![],
system_prompt: None,
},
],
};
@@ -216,6 +207,7 @@
parallelizable: false,
priority: 100,
capabilities: vec!["architecture".to_string()],
system_prompt: None,
}],
};

View File

@@ -6,7 +6,9 @@ use std::path::PathBuf;
use std::sync::Arc;
use chrono::Utc;
use dashmap::DashMap;
use thiserror::Error;
use tokio::sync::mpsc;
use tracing::{debug, info, warn};
use uuid::Uuid;
use vapora_shared::validation::{SchemaRegistry, ValidationPipeline};
@@ -52,6 +54,10 @@ pub struct AgentCoordinator {
learning_profiles: Arc<std::sync::RwLock<HashMap<String, LearningProfile>>>,
budget_manager: Option<Arc<BudgetManager>>,
validation: Arc<ValidationPipeline>,
/// In-process executor channels, keyed by agent ID.
/// When a channel is registered for the selected agent, the task is
/// dispatched directly to its executor rather than relying solely on NATS.
executor_channels: Arc<DashMap<String, mpsc::Sender<TaskAssignment>>>,
}
impl AgentCoordinator {
@@ -123,6 +129,7 @@ impl AgentCoordinator {
learning_profiles: Arc::new(std::sync::RwLock::new(HashMap::new())),
budget_manager: None,
validation,
executor_channels: Arc::new(DashMap::new()),
})
}
@@ -147,9 +154,18 @@
learning_profiles: Arc::new(std::sync::RwLock::new(HashMap::new())),
budget_manager: None,
validation,
executor_channels: Arc::new(DashMap::new()),
}
}
/// Register an in-process executor channel for an agent.
///
/// When registered, [`Self::assign_task`] dispatches tasks directly to the
/// executor via this channel in addition to (or instead of) NATS.
pub fn register_executor_channel(&self, agent_id: String, tx: mpsc::Sender<TaskAssignment>) {
self.executor_channels.insert(agent_id, tx);
}
/// Set NATS client for inter-agent communication
pub fn with_nats(mut self, client: Arc<async_nats::Client>) -> Self {
self.nats_client = Some(client);
@@ -308,7 +324,23 @@
task_id, agent.id, role, task_type
);
// Publish to NATS if available
// Dispatch to in-process executor if one is registered for this agent.
// Clone the sender out of the DashMap before awaiting to avoid holding
// the shard lock across an await point.
if let Some(tx) = self
.executor_channels
.get(&agent.id)
.map(|r| r.value().clone())
{
if let Err(e) = tx.send(assignment.clone()).await {
warn!(
"Failed to dispatch task {} to in-process executor for agent {}: {}",
task_id, agent.id, e
);
}
}
// Publish to NATS if available (external consumers / distributed mode)
if let Some(nats) = &self.nats_client {
self.publish_message(nats, AgentMessage::TaskAssigned(assignment))
.await?;

View File

@@ -92,6 +92,7 @@ mod tests {
last_heartbeat: chrono::Utc::now(),
uptime_percentage: 99.5,
total_tasks_completed: 10,
system_prompt: None,
};
let profile = ProfileAdapter::create_profile(&agent);
@@ -121,6 +122,7 @@
last_heartbeat: chrono::Utc::now(),
uptime_percentage: 99.0,
total_tasks_completed: 5,
system_prompt: None,
},
AgentMetadata {
id: "agent-2".to_string(),
@@ -137,6 +139,7 @@
last_heartbeat: chrono::Utc::now(),
uptime_percentage: 98.5,
total_tasks_completed: 3,
system_prompt: None,
},
];

View File

@@ -49,6 +49,10 @@ pub struct AgentMetadata {
pub last_heartbeat: DateTime<Utc>,
pub uptime_percentage: f64,
pub total_tasks_completed: u64,
/// Domain-optimized system prompt from capability package.
/// None means the executor uses its default generic prompt.
#[serde(default)]
pub system_prompt: Option<String>,
}
impl AgentMetadata {
@@ -75,9 +79,17 @@ impl AgentMetadata {
last_heartbeat: now,
uptime_percentage: 100.0,
total_tasks_completed: 0,
system_prompt: None,
}
}
/// Builder: attach a domain-optimized system prompt from a capability
/// package.
pub fn with_system_prompt(mut self, prompt: impl Into<String>) -> Self {
self.system_prompt = Some(prompt.into());
self
}
/// Check if agent can accept new tasks
pub fn can_accept_task(&self) -> bool {
self.status == AgentStatus::Active && self.current_tasks < self.max_concurrent_tasks

View File

@@ -1,39 +1,37 @@
// Per-agent execution loop with channel-based task distribution
// Phase 5.5: Persistence of execution history to KG for learning
// Each agent has dedicated executor managing its state machine
use std::sync::Arc;
use chrono::Utc;
use tokio::sync::mpsc;
use tracing::{debug, info, warn};
use vapora_knowledge_graph::{ExecutionRecord, KGPersistence, PersistedExecution};
use vapora_llm_router::EmbeddingProvider;
use vapora_llm_router::{EmbeddingProvider, LLMRouter};
use super::state_machine::{Agent, ExecutionResult, Idle};
use crate::messages::TaskAssignment;
use crate::registry::AgentMetadata;
/// Per-agent executor handling task processing with persistence (Phase 5.5)
/// Per-agent executor handling task processing with KG persistence.
pub struct AgentExecutor {
agent: Agent<Idle>,
task_rx: mpsc::Receiver<TaskAssignment>,
kg_persistence: Option<Arc<KGPersistence>>,
embedding_provider: Option<Arc<dyn EmbeddingProvider>>,
router: Option<Arc<LLMRouter>>,
}
impl AgentExecutor {
/// Create new executor for an agent (Phase 4)
/// Create a new executor without persistence or LLM routing.
pub fn new(metadata: AgentMetadata, task_rx: mpsc::Receiver<TaskAssignment>) -> Self {
Self {
agent: Agent::new(metadata),
task_rx,
kg_persistence: None,
embedding_provider: None,
router: None,
}
}
/// Create executor with persistence (Phase 5.5)
/// Create executor with KG persistence.
pub fn with_persistence(
metadata: AgentMetadata,
task_rx: mpsc::Receiver<TaskAssignment>,
@@ -45,10 +43,25 @@ impl AgentExecutor {
task_rx,
kg_persistence: Some(kg_persistence),
embedding_provider: Some(embedding_provider),
router: None,
}
}
/// Run executor loop, processing tasks until channel closes
/// Attach an LLM router to this executor.
///
/// When set, [`AgentExecutor::run`] forwards each task to the router using
/// `AgentMetadata::system_prompt` as the system message and the task
/// description (+ context when non-empty) as the user message.
/// Budget-aware routing uses `AgentMetadata::role` for spend tracking.
///
/// When absent, the executor produces a stub result — useful in tests that
/// must not make network calls.
pub fn with_router(mut self, router: Arc<LLMRouter>) -> Self {
self.router = Some(router);
self
}
/// Run the executor loop, processing tasks until the channel closes.
pub async fn run(mut self) {
info!(
"AgentExecutor started for agent: {}",
@@ -59,30 +72,25 @@
while let Some(task) = self.task_rx.recv().await {
debug!("Received task: {}", task.id);
// Transition: Idle → Assigned
let agent_assigned = self.agent.assign_task(task.clone());
// Transition: Assigned → Executing
let agent_executing = agent_assigned.start_execution();
let execution_start = Utc::now();
// Execute task (placeholder - in real use, call LLM via vapora-llm-router)
let result = ExecutionResult {
output: "Task executed successfully".to_string(),
input_tokens: 100,
output_tokens: 50,
duration_ms: 500,
};
// Execute before state transitions — state machine consumes self.agent via
// partial move, which would prevent borrowing self.router afterwards.
let (result, provider_used) = self.execute_task(&task).await;
// Transition: Executing → Completed
let agent_assigned = self.agent.assign_task(task.clone());
let agent_executing = agent_assigned.start_execution();
let completed_agent = agent_executing.complete(result.clone());
// Handle result - transition Completed → Idle
self.agent = completed_agent.reset();
// Phase 5.5: Persist execution to Knowledge Graph (after state transition)
self.persist_execution_internal(&task, &result, execution_start, &agent_id)
.await;
self.persist_execution_internal(
&task,
&result,
execution_start,
&agent_id,
&provider_used,
)
.await;
info!("Task {} completed", task.id);
}
@@ -90,27 +98,91 @@ impl AgentExecutor {
info!("AgentExecutor stopped for agent: {}", agent_id);
}
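The ordering comment inside `run` (execute before the state transitions) hinges on a Rust ownership rule: moving `self.agent` into the type-state chain is a partial move of `self`, after which no method taking `&self` can be called. A minimal standalone sketch of that constraint, using stand-in types rather than the real vapora ones:

```rust
use std::marker::PhantomData;

// Type-state markers: transitions consume the agent at compile time.
struct Idle;
struct Executing;

struct Agent<S> {
    name: String,
    _state: PhantomData<S>,
}

impl Agent<Idle> {
    fn new(name: &str) -> Self {
        Agent { name: name.to_string(), _state: PhantomData }
    }
    // Consumes self: the Idle agent ceases to exist after this call.
    fn start(self) -> Agent<Executing> {
        Agent { name: self.name, _state: PhantomData }
    }
}

struct Executor {
    agent: Agent<Idle>,
    router: Option<String>, // stands in for Arc<LLMRouter>
}

impl Executor {
    // Borrows all of `self`, like execute_task(&self) in the real code.
    fn execute(&self) -> String {
        match &self.router {
            Some(r) => format!("routed via {r}"),
            None => "stub result".to_string(),
        }
    }

    fn run_once(self) -> (String, Agent<Executing>) {
        // Must run before the partial move below: once `self.agent` is moved
        // out, `self` can no longer be borrowed as a whole.
        let output = self.execute();
        let executing = self.agent.start();
        (output, executing)
    }
}

fn main() {
    let exec = Executor { agent: Agent::new("a1"), router: None };
    let (out, agent) = exec.run_once();
    assert_eq!(out, "stub result");
    println!("{out}; agent {} now executing", agent.name);
}
```

Swapping the two statements in `run_once` turns the `self.execute()` call into a compile error, which is why the real executor performs the LLM call before the state transitions.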
/// Persist execution record to KG database (Phase 5.5)
/// Execute a task, returning the result and the provider name that handled
/// it.
///
/// When a router is attached, calls `complete_with_budget` with:
/// - `system_prompt` from `AgentMetadata` → sent as the LLM system message
/// - `task.description` (plus `task.context` when meaningful) → user
/// message
/// - `task.required_role` → task-type routing key
/// - `agent.role` → budget-tracking role
///
/// Falls back to a stub result when no router is configured.
async fn execute_task(&self, task: &TaskAssignment) -> (ExecutionResult, String) {
let Some(ref router) = self.router else {
return (
ExecutionResult {
output: "Task executed successfully".to_string(),
input_tokens: 100,
output_tokens: 50,
duration_ms: 500,
},
"unknown".to_string(),
);
};
let prompt = build_prompt(&task.description, &task.context);
let system_prompt = self.agent.metadata.system_prompt.clone();
let provider_name = self.agent.metadata.llm_provider.clone();
let wall_start = std::time::Instant::now();
match router
.complete_with_budget(
&task.required_role,
prompt,
system_prompt,
None,
Some(self.agent.metadata.role.as_str()),
)
.await
{
Ok(response) => {
let duration_ms = wall_start.elapsed().as_millis() as u64;
(
ExecutionResult {
output: response.text,
input_tokens: response.input_tokens,
output_tokens: response.output_tokens,
duration_ms,
},
provider_name,
)
}
Err(e) => {
warn!("LLM call failed for task {}: {}", task.id, e);
let duration_ms = wall_start.elapsed().as_millis() as u64;
(
ExecutionResult {
output: format!("LLM error: {e}"),
input_tokens: 0,
output_tokens: 0,
duration_ms,
},
provider_name,
)
}
}
}
async fn persist_execution_internal(
&self,
task: &TaskAssignment,
result: &ExecutionResult,
execution_start: chrono::DateTime<Utc>,
agent_id: &str,
provider: &str,
) {
if let Some(ref kg_persistence) = self.kg_persistence {
if let Some(ref embedding_provider) = self.embedding_provider {
// Generate embedding for task description
let embedding = match embedding_provider.embed(&task.description).await {
Ok(emb) => emb,
Err(e) => {
warn!("Failed to generate embedding for task {}: {}", task.id, e);
// Use zero vector as fallback
vec![0.0; 1536]
}
};
// Create execution record for KG
let execution_record = ExecutionRecord {
id: task.id.clone(),
task_id: task.id.clone(),
@@ -122,7 +194,7 @@ impl AgentExecutor {
input_tokens: result.input_tokens,
output_tokens: result.output_tokens,
cost_cents: 0,
provider: "unknown".to_string(),
provider: provider.to_string(),
success: true,
error: None,
solution: Some(result.output.clone()),
@@ -130,18 +202,15 @@ impl AgentExecutor {
timestamp: execution_start,
};
// Convert to persisted format
let persisted =
PersistedExecution::from_execution_record(&execution_record, embedding);
// Persist to SurrealDB
if let Err(e) = kg_persistence.persist_execution(persisted).await {
warn!("Failed to persist execution: {}", e);
} else {
debug!("Persisted execution {} to KG", task.id);
}
// Record analytics event
if let Err(e) = kg_persistence
.record_event(
"task_completed",
@@ -154,7 +223,6 @@ impl AgentExecutor {
warn!("Failed to record event: {}", e);
}
// Record token usage event
if let Err(e) = kg_persistence
.record_event(
"token_usage",
@@ -182,6 +250,22 @@ pub enum ExecutorState {
Failed(Agent<super::state_machine::Failed>, String),
}
/// Build the user-facing LLM prompt from a task description and its JSON
/// context.
///
/// The `context` field in [`TaskAssignment`] carries supplementary JSON that
/// may be empty or contain only `"{}"` when no additional context is available.
/// Those cases are skipped so the model receives a clean description-only
/// prompt.
fn build_prompt(description: &str, context: &str) -> String {
let ctx = context.trim();
if ctx.is_empty() || ctx == "{}" {
description.to_string()
} else {
format!("{description}\n\nContext:\n{ctx}")
}
}
#[cfg(test)]
mod tests {
use super::*;
@@ -204,6 +288,7 @@ mod tests {
last_heartbeat: Utc::now(),
uptime_percentage: 100.0,
total_tasks_completed: 0,
system_prompt: None,
};
let (_tx, rx) = mpsc::channel(10);
@@ -211,6 +296,7 @@
assert_eq!(executor.agent.metadata.id, "test-executor");
assert!(executor.kg_persistence.is_none());
assert!(executor.router.is_none());
}
#[test]
@@ -230,6 +316,7 @@
last_heartbeat: Utc::now(),
uptime_percentage: 99.5,
total_tasks_completed: 100,
system_prompt: None,
};
let (_tx, rx) = mpsc::channel(10);
@@ -237,5 +324,26 @@
assert!(!executor.agent.metadata.role.is_empty());
assert!(executor.embedding_provider.is_none());
assert!(executor.router.is_none());
}
#[test]
fn build_prompt_no_context() {
let p = build_prompt("Fix the bug", "");
assert_eq!(p, "Fix the bug");
}
#[test]
fn build_prompt_empty_json_context() {
let p = build_prompt("Fix the bug", "{}");
assert_eq!(p, "Fix the bug");
}
#[test]
fn build_prompt_with_context() {
let p = build_prompt("Fix the bug", r#"{"file":"src/lib.rs"}"#);
assert!(p.contains("Fix the bug"));
assert!(p.contains("src/lib.rs"));
assert!(p.contains("Context:"));
}
}

View File

@@ -163,6 +163,7 @@ mod tests {
last_heartbeat: Utc::now(),
uptime_percentage: 100.0,
total_tasks_completed: 0,
system_prompt: None,
};
// Type-state chain: Idle → Assigned → Executing → Completed → Idle
@@ -213,6 +214,7 @@ mod tests {
last_heartbeat: Utc::now(),
uptime_percentage: 100.0,
total_tasks_completed: 0,
system_prompt: None,
};
let agent = Agent::new(metadata);

View File

@@ -7,6 +7,7 @@ use axum::{
Json,
};
use serde::Deserialize;
use vapora_channels::Message;
use vapora_shared::models::{Agent, AgentStatus};
use crate::api::state::AppState;
@@ -90,8 +91,17 @@ pub async fn update_agent_status(
) -> ApiResult<impl IntoResponse> {
let updated = state
.agent_service
.update_agent_status(&id, payload.status)
.update_agent_status(&id, payload.status.clone())
.await?;
if payload.status == AgentStatus::Inactive {
let msg = Message::error(
"Agent Inactive",
format!("Agent '{}' has transitioned to inactive", id),
);
state.notify(&state.notification_config.clone().on_agent_inactive, msg);
}
Ok(Json(updated))
}

View File

@@ -35,6 +35,7 @@ pub struct Config {
/// on_task_done = ["team-slack"]
/// on_proposal_approved = ["team-slack"]
/// on_proposal_rejected = ["team-slack", "ops-telegram"]
/// on_agent_inactive = ["ops-telegram"]
/// ```
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct NotificationConfig {
@@ -44,6 +45,9 @@ pub struct NotificationConfig {
pub on_proposal_approved: Vec<String>,
#[serde(default)]
pub on_proposal_rejected: Vec<String>,
/// Fires when an agent transitions to `AgentStatus::Inactive`.
#[serde(default)]
pub on_agent_inactive: Vec<String>,
}
/// Server configuration

View File

@@ -0,0 +1,25 @@
[package]
name = "vapora-capabilities"
version.workspace = true
edition.workspace = true
authors.workspace = true
license.workspace = true
repository.workspace = true
rust-version.workspace = true
description = "Pre-built capability packages for VAPORA agents — domain-optimized prompts, tools, and LLM configs"
[lib]
crate-type = ["rlib"]
[dependencies]
vapora-shared = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
toml = { workspace = true }
thiserror = { workspace = true }
tracing = { workspace = true }
parking_lot = { workspace = true }
[dev-dependencies]
tempfile = { workspace = true }

View File

@@ -0,0 +1,97 @@
use crate::capability::{Capability, CapabilitySpec};
const SYSTEM_PROMPT: &str = r#"You are a senior software engineer specializing in code review.
Your analysis must be precise, actionable, and grounded in the actual code.
REVIEW METHODOLOGY:
1. Read all changed files completely before forming conclusions.
2. Categorize each finding by severity: Critical | High | Medium | Low | Info.
3. Reference specific line numbers in every finding.
4. Distinguish logic bugs from style suggestions; report them separately.
CRITICAL (block merge):
- Security: SQL injection, XSS, SSRF, path traversal, unsafe deserialization, missing auth checks
- Memory safety: use-after-free, buffer overflows, data races, unchecked slice indexing
- Correctness: logic errors that produce wrong results, silent data corruption
- Unhandled error conditions that lead to undefined or catastrophic state
HIGH (should fix before merge):
- Missing input validation on user-supplied data at system boundaries
- Resource leaks: unclosed handles, unbounded memory allocations, missing timeouts
- Error swallowing that hides failures from callers (`let _ = ...`, `unwrap_or(Default)` on failures)
- Concurrency hazards: deadlocks, TOCTOU races, missing synchronization primitives
MEDIUM (fix in follow-up, document if deferred):
- Missing test coverage for new code paths
- Performance anti-patterns in hot paths (unnecessary clones, allocations in loops)
- API design issues: inconsistent naming, overly complex function signatures
- Documentation gaps on public interfaces
LOW / INFO:
- Code style inconsistencies (report only if systematic, not isolated occurrences)
- Minor refactoring opportunities that improve readability without changing behavior
OUTPUT FORMAT (JSON only, no prose wrapping):
{
"summary": "<1-2 sentences on overall quality and primary concern>",
"merge_ready": <true|false>,
"findings": [
{
"severity": "Critical|High|Medium|Low|Info",
"file": "<path>",
"line": <number or null>,
"category": "Security|Correctness|Performance|Testing|Documentation|Style",
"description": "<precise description of the issue>",
"suggestion": "<concrete fix or mitigation>"
}
],
"stats": {
"critical": <n>,
"high": <n>,
"medium": <n>,
"low": <n>
}
}
Do not invent findings. If the code is correct in a category, omit those findings.
Precision over quantity: 3 real findings beat 15 speculative ones."#;
/// Pre-built CodeReviewer capability.
///
/// Configures an agent for in-depth code review with security and correctness
/// focus. Uses Claude Opus for deep reasoning on complex code patterns.
/// Temperature 0.1 maximizes determinism — reviews should be reproducible.
#[derive(Debug)]
pub struct CodeReviewer;
impl Capability for CodeReviewer {
fn spec(&self) -> CapabilitySpec {
CapabilitySpec {
id: "code-reviewer".to_string(),
display_name: "Code Reviewer".to_string(),
description: "In-depth code review with security, correctness, and performance focus"
.to_string(),
agent_role: "code_reviewer".to_string(),
task_types: vec![
"code_review".to_string(),
"review".to_string(),
"security_audit".to_string(),
"quality_check".to_string(),
"inspect".to_string(),
],
system_prompt: SYSTEM_PROMPT.to_string(),
mcp_tools: vec![
"file_read".to_string(),
"file_list".to_string(),
"git_diff".to_string(),
"code_search".to_string(),
],
preferred_provider: "claude".to_string(),
preferred_model: "claude-opus-4-6".to_string(),
max_tokens: 8192,
temperature: 0.1,
priority: 90,
parallelizable: true,
}
}
}

View File

@@ -0,0 +1,87 @@
use crate::capability::{Capability, CapabilitySpec};
const SYSTEM_PROMPT: &str = r#"You are a technical documentation specialist who writes clear, accurate documentation from source code.
DOCUMENTATION METHODOLOGY:
1. Read the source code thoroughly before writing anything.
2. Document behavior, not implementation: explain what, not how.
3. Every public function/method gets: purpose, parameters, return value, errors/panics, and a usage example.
4. Module-level docs explain the module's role and its relationships to other modules.
5. Do not document obvious things (e.g., `/// Returns the value` for a getter).
DOCUMENTATION LEVELS:
MODULE DOCS (top of file or lib.rs):
- One-line summary followed by blank line
- Purpose of the module in the broader system
- Key types and their relationships
- Usage example showing common patterns
FUNCTION/METHOD DOCS:
- One-line summary (imperative: "Creates a...", "Returns the...", "Validates...")
- Extended description if behavior is non-obvious
- `# Arguments` section listing each parameter with type and purpose
- `# Returns` section describing the return value and its possible states
- `# Errors` section listing error variants that can be returned (for Result types)
- `# Panics` section if the function can panic (with conditions)
- `# Examples` section with runnable code (in Rust: ```rust)
TYPE DOCS (struct, enum, trait):
- What the type represents (not just a restatement of its name)
- Invariants the type maintains
- How it's constructed (builder pattern, From impls, etc.)
- For enums: when to use each variant
OUTPUT FORMAT:
Produce documentation as markdown-formatted comments in the target language.
For Rust: use `///` for doc comments. For Python: use docstrings. For TypeScript: use JSDoc.
If the input specifies a different format, follow that format.
QUALITY STANDARDS:
- Every sentence must be complete and grammatically correct
- Use present tense for descriptions ("Creates a connection" not "Will create a connection")
- Avoid vague words: "various", "some", "etc.", "and so on"
- Include real values in examples, not placeholders like `your_value_here`
- If a behavior is undefined or implementation-dependent, say so explicitly"#;
/// Pre-built DocGenerator capability.
///
/// Configures an agent for generating comprehensive technical documentation
/// from source code. Uses Claude Sonnet for cost-efficient, high-quality
/// writing. Temperature 0.3 allows natural prose variation while staying
/// factual.
#[derive(Debug)]
pub struct DocGenerator;
impl Capability for DocGenerator {
fn spec(&self) -> CapabilitySpec {
CapabilitySpec {
id: "doc-generator".to_string(),
display_name: "Documentation Generator".to_string(),
description: "Generates comprehensive technical documentation from source code"
.to_string(),
agent_role: "documenter".to_string(),
task_types: vec![
"doc_generation".to_string(),
"documentation".to_string(),
"document".to_string(),
"api_docs".to_string(),
"readme".to_string(),
"write".to_string(),
],
system_prompt: SYSTEM_PROMPT.to_string(),
mcp_tools: vec![
"file_read".to_string(),
"file_list".to_string(),
"code_search".to_string(),
"file_write".to_string(),
],
preferred_provider: "claude".to_string(),
preferred_model: "claude-sonnet-4-6".to_string(),
max_tokens: 16384,
temperature: 0.3,
priority: 70,
parallelizable: true,
}
}
}

View File

@@ -0,0 +1,12 @@
//! Built-in capability packages shipped with VAPORA.
//!
//! Each module provides a unit struct implementing [`crate::Capability`].
//! Obtain instances via [`crate::CapabilityRegistry::with_built_ins()`].
mod code_reviewer;
mod doc_generator;
mod pr_monitor;
pub use code_reviewer::CodeReviewer;
pub use doc_generator::DocGenerator;
pub use pr_monitor::PRMonitor;

View File

@@ -0,0 +1,102 @@
use crate::capability::{Capability, CapabilitySpec};
const SYSTEM_PROMPT: &str = r#"You are a pull request analyst responsible for continuous PR health monitoring.
Your job is to produce an accurate, actionable PR health report: not to block work, but to surface risks.
ANALYSIS SCOPE:
Given a pull request diff and metadata, evaluate:
1. CHANGE IMPACT:
- Identify the scope: is this a localized fix, a cross-cutting refactor, or an API change?
- Flag breaking changes: public API removals/renames, schema changes, protocol changes
- Identify files with the highest risk (auth, crypto, validation, data models)
2. TEST COVERAGE:
- Are new code paths covered by tests?
- Are edge cases tested (empty input, max values, error paths)?
- Are there deleted tests not replaced?
3. DOCUMENTATION:
- Are new public interfaces documented?
- Is the PR description sufficient to understand the change?
4. SECURITY-SENSITIVE CHANGES:
- Authentication / authorization changes
- Cryptographic operations or key handling
- Input validation or sanitization changes
- External HTTP calls to new or changed endpoints
5. MERGE READINESS:
Based on the above, classify the PR as:
- READY: No blocking issues, minor notes at most
- NEEDS_REVIEW: Issues that should be reviewed but don't block
- BLOCKED: Breaking changes, security issues, or critical missing tests
OUTPUT FORMAT (JSON only):
{
"pr_id": "<PR number or title>",
"status": "READY|NEEDS_REVIEW|BLOCKED",
"summary": "<1-2 sentences on the overall change and primary concern>",
"breaking_changes": [
{ "description": "<what changed>", "impact": "<who is affected>" }
],
"security_flags": [
{ "severity": "High|Medium|Low", "file": "<path>", "description": "<issue>" }
],
"test_gaps": [
{ "file": "<path>", "description": "<what is not covered>" }
],
"recommendations": [
"<actionable recommendation>"
],
"merge_checklist": {
"tests_pass": <true|false|null>,
"breaking_changes_documented": <true|false|null>,
"security_review_needed": <true|false>,
"docs_updated": <true|false|null>
}
}
If information is unavailable (e.g., CI status not in the diff), set fields to null.
Be specific: reference file paths and line numbers when flagging issues.
A READY status with honest recommendations is better than an inflated BLOCKED."#;
/// Pre-built PRMonitor capability.
///
/// Configures an agent for pull request health monitoring and merge readiness
/// analysis. Runs autonomously when scheduled (e.g., every 15 minutes on open
/// PRs). Temperature 0.1 ensures consistent, reproducible status reports.
#[derive(Debug)]
pub struct PRMonitor;
impl Capability for PRMonitor {
fn spec(&self) -> CapabilitySpec {
CapabilitySpec {
id: "pr-monitor".to_string(),
display_name: "PR Monitor".to_string(),
description: "Pull request health monitoring and merge readiness analysis".to_string(),
agent_role: "monitor".to_string(),
task_types: vec![
"pr_review".to_string(),
"pr_monitor".to_string(),
"merge_check".to_string(),
"ci_monitor".to_string(),
"monitor".to_string(),
],
system_prompt: SYSTEM_PROMPT.to_string(),
mcp_tools: vec![
"git_diff".to_string(),
"git_log".to_string(),
"git_status".to_string(),
"file_list".to_string(),
"file_read".to_string(),
],
preferred_provider: "claude".to_string(),
preferred_model: "claude-sonnet-4-6".to_string(),
max_tokens: 4096,
temperature: 0.1,
priority: 85,
parallelizable: false,
}
}
}

View File

@@ -0,0 +1,128 @@
use std::fmt;
use serde::{Deserialize, Serialize};
use vapora_shared::AgentDefinition;
/// Complete specification for a capability package.
///
/// Carries everything needed to instantiate a domain-optimized agent:
/// system prompt, tool list, LLM preferences, and task classification hints.
/// Registry-stored capabilities expose this via [`Capability::spec()`].
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub struct CapabilitySpec {
/// Unique kebab-case identifier (e.g., `"code-reviewer"`).
pub id: String,
/// Human-readable name shown in UIs and logs.
pub display_name: String,
/// Brief description of what the capability does.
pub description: String,
/// Agent role name used by the coordinator for task routing (e.g.,
/// `"code_reviewer"`).
pub agent_role: String,
/// Task-type strings the coordinator uses to select this agent.
///
/// These strings are matched by [`vapora_agents::coordinator`]'s
/// `extract_task_type` heuristic — they should overlap with keywords
/// present in task titles/descriptions.
pub task_types: Vec<String>,
/// Domain-optimized system prompt injected before every task execution.
pub system_prompt: String,
/// MCP tool IDs activated for this capability.
///
/// These are semantic names matching what `vapora-mcp-server` exposes.
pub mcp_tools: Vec<String>,
/// Preferred LLM provider name (e.g., `"claude"`, `"openai"`).
pub preferred_provider: String,
/// Preferred model within the provider (e.g., `"claude-opus-4-6"`).
pub preferred_model: String,
/// Maximum output tokens for this domain's tasks.
pub max_tokens: u32,
/// Sampling temperature — lower values produce more deterministic output.
pub temperature: f32,
/// Assignment priority in the range 0-100.
pub priority: u32,
/// Whether multiple instances of this capability can run concurrently.
pub parallelizable: bool,
}
impl CapabilitySpec {
/// Override the system prompt (e.g., from a TOML config file).
pub fn with_system_prompt(mut self, prompt: impl Into<String>) -> Self {
self.system_prompt = prompt.into();
self
}
/// Override the preferred model.
pub fn with_model(mut self, model: impl Into<String>) -> Self {
self.preferred_model = model.into();
self
}
/// Override the preferred provider.
pub fn with_provider(mut self, provider: impl Into<String>) -> Self {
self.preferred_provider = provider.into();
self
}
/// Override max tokens.
pub fn with_max_tokens(mut self, max_tokens: u32) -> Self {
self.max_tokens = max_tokens;
self
}
/// Override temperature.
pub fn with_temperature(mut self, temperature: f32) -> Self {
self.temperature = temperature;
self
}
/// Override priority.
pub fn with_priority(mut self, priority: u32) -> Self {
self.priority = priority;
self
}
}
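The `with_*` methods above are consuming builders: each takes `mut self` and returns it, so overrides chain without intermediate bindings. A standalone sketch of the pattern with a stand-in struct (not the real `CapabilitySpec`):

```rust
#[derive(Debug, Clone, PartialEq)]
struct Spec {
    model: String,
    max_tokens: u32,
    temperature: f32,
}

impl Spec {
    // Each override takes `self` by value and returns it, enabling chaining.
    fn with_model(mut self, model: impl Into<String>) -> Self {
        self.model = model.into();
        self
    }
    fn with_max_tokens(mut self, max_tokens: u32) -> Self {
        self.max_tokens = max_tokens;
        self
    }
}

fn main() {
    let base = Spec {
        model: "claude-opus-4-6".into(),
        max_tokens: 8192,
        temperature: 0.1,
    };
    // Chained overrides; untouched fields keep their base values.
    let tuned = base
        .clone()
        .with_model("claude-sonnet-4-6")
        .with_max_tokens(16384);
    assert_eq!(tuned.model, "claude-sonnet-4-6");
    assert_eq!(tuned.max_tokens, 16384);
    assert_eq!(tuned.temperature, base.temperature);
    println!("{tuned:?}");
}
```

Taking `self` by value (rather than `&mut self`) keeps the spec immutable from the caller's point of view: each override produces a new owned value.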
/// Object-safe trait for a capability package.
///
/// Built-in capabilities are unit structs (e.g.,
/// [`crate::built_in::CodeReviewer`]). TOML-loaded custom capabilities use
/// [`CustomCapability`].
///
/// The default `to_agent_definition()` implementation converts the spec into
/// an [`AgentDefinition`] ready for
/// [`vapora_agents::coordinator::AgentCoordinator`].
pub trait Capability: Send + Sync + fmt::Debug {
/// Return the full specification for this capability.
fn spec(&self) -> CapabilitySpec;
/// Convert to an [`AgentDefinition`] usable by the agent registry.
///
/// The system prompt is embedded into the definition so it is available
/// at the execution site via
/// [`vapora_agents::registry::AgentMetadata::system_prompt`].
fn to_agent_definition(&self) -> AgentDefinition {
let spec = self.spec();
AgentDefinition {
role: spec.agent_role.clone(),
description: format!("[{}] {}", spec.display_name, spec.description),
llm_provider: spec.preferred_provider.clone(),
llm_model: spec.preferred_model.clone(),
parallelizable: spec.parallelizable,
priority: spec.priority,
capabilities: spec.task_types.clone(),
system_prompt: Some(spec.system_prompt.clone()),
}
}
}
/// Wraps a [`CapabilitySpec`] loaded from TOML as a [`Capability`]
/// implementation.
#[derive(Debug)]
pub struct CustomCapability(pub CapabilitySpec);
impl Capability for CustomCapability {
fn spec(&self) -> CapabilitySpec {
self.0.clone()
}
}
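`CustomCapability` leans on the trait's default method: a TOML-loaded spec gets the same conversion path as a built-in. A standalone sketch of this object-safe trait-with-default-method shape, using stand-in types in place of `CapabilitySpec` and `AgentDefinition`:

```rust
use std::fmt;

#[derive(Clone, Debug, PartialEq)]
struct Spec {
    id: String,
    role: String,
    system_prompt: String,
}

#[derive(Debug, PartialEq)]
struct Definition {
    role: String,
    system_prompt: Option<String>,
}

// Object-safe: only &self methods, no generics in the signatures.
trait Capability: Send + Sync + fmt::Debug {
    fn spec(&self) -> Spec;

    // Default impl shared by built-ins and TOML-loaded capabilities alike.
    fn to_definition(&self) -> Definition {
        let spec = self.spec();
        Definition {
            role: spec.role,
            system_prompt: Some(spec.system_prompt),
        }
    }
}

#[derive(Debug)]
struct CustomCapability(Spec);

impl Capability for CustomCapability {
    fn spec(&self) -> Spec {
        self.0.clone()
    }
}

fn main() {
    // Trait objects work because the trait is object-safe.
    let cap: Box<dyn Capability> = Box::new(CustomCapability(Spec {
        id: "db-optimizer".into(),
        role: "db_optimizer".into(),
        system_prompt: "You are a database expert.".into(),
    }));
    let def = cap.to_definition();
    assert_eq!(def.role, "db_optimizer");
    assert!(def.system_prompt.is_some());
    println!("{def:?}");
}
```

Because only `spec()` is required, a one-line wrapper like `CustomCapability` is enough to make any deserialized spec a full capability.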

View File

@@ -0,0 +1,72 @@
//! Pre-built capability packages for VAPORA agents.
//!
//! A capability package bundles everything needed to deploy a domain-optimized
//! agent without manual configuration: system prompt, MCP tool list, LLM model
//! preferences, and task-type classification hints.
//!
//! # Quick Start
//!
//! ```rust
//! use vapora_capabilities::CapabilityRegistry;
//!
//! // 1. Create the registry with all built-ins
//! let registry = CapabilityRegistry::with_built_ins();
//!
//! // 2. Activate a capability → produces an AgentDefinition
//! let def = registry.activate("code-reviewer").unwrap();
//!
//! // 3. Use the definition to instantiate an AgentMetadata in the coordinator
//! assert_eq!(def.role, "code_reviewer");
//! assert!(def.system_prompt.is_some());
//! ```
//!
//! # Built-in Capabilities
//!
//! | ID | Role | Purpose |
//! |----|------|---------|
//! | `code-reviewer` | `code_reviewer` | Security and correctness-focused code review |
//! | `doc-generator` | `documenter` | Technical documentation generation |
//! | `pr-monitor` | `monitor` | Pull request health and merge readiness |
//!
//! # Customization via TOML
//!
//! Override built-in settings or add custom capabilities through a TOML file:
//!
//! ```toml
//! [[override]]
//! id = "code-reviewer"
//! preferred_model = "claude-sonnet-4-6"
//!
//! [[custom]]
//! id = "db-optimizer"
//! display_name = "Database Optimizer"
//! description = "Analyzes and optimizes SQL queries"
//! agent_role = "db_optimizer"
//! task_types = ["db_optimization"]
//! system_prompt = "You are a database expert..."
//! mcp_tools = ["file_read", "code_search"]
//! preferred_provider = "claude"
//! preferred_model = "claude-sonnet-4-6"
//! max_tokens = 4096
//! temperature = 0.2
//! priority = 65
//! parallelizable = true
//! ```
//!
//! Then apply via [`CapabilityLoader`]:
//!
//! ```rust,no_run
//! use vapora_capabilities::{CapabilityRegistry, CapabilityLoader};
//!
//! let registry = CapabilityRegistry::with_built_ins();
//! CapabilityLoader::load_and_apply("config/capabilities.toml", &registry).unwrap();
//! ```
pub mod built_in;
pub mod capability;
pub mod loader;
pub mod registry;
pub use capability::{Capability, CapabilitySpec, CustomCapability};
pub use loader::{CapabilityConfig, CapabilityLoader, CapabilityOverride, LoaderError};
pub use registry::{CapabilityError, CapabilityRegistry};

View File

@@ -0,0 +1,404 @@
use serde::Deserialize;
use thiserror::Error;
use tracing::{debug, info, warn};
use crate::capability::{CapabilitySpec, CustomCapability};
use crate::registry::{CapabilityError, CapabilityRegistry};
#[derive(Debug, Error)]
pub enum LoaderError {
#[error("Failed to read file '{path}': {source}")]
Io {
path: String,
#[source]
source: std::io::Error,
},
#[error("Failed to parse TOML: {0}")]
Parse(#[from] toml::de::Error),
#[error("Registry error applying capabilities: {0}")]
Registry(#[from] CapabilityError),
}
/// Partial override applied on top of an existing built-in capability.
///
/// Only fields set in TOML are applied; unset fields keep their default value.
/// The `id` field identifies which built-in to patch.
#[derive(Debug, Deserialize)]
pub struct CapabilityOverride {
/// ID of the built-in capability to override (e.g., `"code-reviewer"`).
pub id: String,
pub system_prompt: Option<String>,
pub preferred_provider: Option<String>,
pub preferred_model: Option<String>,
pub max_tokens: Option<u32>,
pub temperature: Option<f32>,
pub priority: Option<u32>,
pub mcp_tools: Option<Vec<String>>,
}
/// Configuration file structure for capability customization.
///
/// ```toml
/// # Patch a built-in: use Claude Sonnet instead of Opus for code-reviewer
/// [[override]]
/// id = "code-reviewer"
/// preferred_model = "claude-sonnet-4-6"
/// max_tokens = 16384
///
/// # Add a custom capability not shipped as a built-in
/// [[custom]]
/// id = "db-optimizer"
/// display_name = "Database Optimizer"
/// description = "Analyzes and optimizes SQL queries and schema"
/// agent_role = "db_optimizer"
/// task_types = ["db_optimization", "query_review"]
/// system_prompt = "You are a database performance expert..."
/// mcp_tools = ["file_read", "code_search"]
/// preferred_provider = "claude"
/// preferred_model = "claude-sonnet-4-6"
/// max_tokens = 4096
/// temperature = 0.2
/// priority = 70
/// parallelizable = true
/// ```
#[derive(Debug, Deserialize)]
pub struct CapabilityConfig {
/// Patches applied on top of existing built-in capabilities.
#[serde(default, rename = "override")]
pub overrides: Vec<CapabilityOverride>,
/// Entirely new capabilities not shipped as built-ins.
#[serde(default)]
pub custom: Vec<CapabilitySpec>,
}
/// Loads capability overrides and custom capabilities from TOML configuration.
pub struct CapabilityLoader;
impl CapabilityLoader {
/// Parse capability configuration from a TOML string.
///
/// # Errors
///
/// Returns [`LoaderError::Parse`] if the TOML is malformed.
pub fn parse(toml_content: &str) -> Result<CapabilityConfig, LoaderError> {
let config: CapabilityConfig = toml::from_str(toml_content)?;
Ok(config)
}
/// Parse capability configuration from a file on disk.
///
/// # Errors
///
/// Returns [`LoaderError::Io`] if the file cannot be read,
/// or [`LoaderError::Parse`] if the content is not valid TOML.
pub fn from_file(path: &str) -> Result<CapabilityConfig, LoaderError> {
let content = std::fs::read_to_string(path).map_err(|e| LoaderError::Io {
path: path.to_string(),
source: e,
})?;
Self::parse(&content)
}
/// Apply a parsed [`CapabilityConfig`] to a [`CapabilityRegistry`].
///
/// - Overrides: existing built-in specs are patched with the non-None
/// fields. If the ID does not exist in the registry it is skipped with a
/// warning.
/// - Custom: new capabilities are added. If the ID already exists it is
/// replaced (allowing idempotent re-application of config).
///
/// # Errors
///
/// Returns [`LoaderError::Registry`] only when a custom capability spec has
/// an empty ID.
pub fn apply(
config: CapabilityConfig,
registry: &CapabilityRegistry,
) -> Result<(), LoaderError> {
let override_count = config.overrides.len();
let custom_count = config.custom.len();
// Apply overrides to existing capabilities
for ov in config.overrides {
if !registry.contains(&ov.id) {
warn!(
"Capability override '{}' not found in registry — skipping",
ov.id
);
continue;
}
let base_arc = registry
.get(&ov.id)
.expect("contains check above ensures presence");
let mut spec = base_arc.spec();
if let Some(prompt) = ov.system_prompt {
spec.system_prompt = prompt;
}
if let Some(provider) = ov.preferred_provider {
spec.preferred_provider = provider;
}
if let Some(model) = ov.preferred_model {
spec.preferred_model = model;
}
if let Some(max_tokens) = ov.max_tokens {
spec.max_tokens = max_tokens;
}
if let Some(temperature) = ov.temperature {
spec.temperature = temperature;
}
if let Some(priority) = ov.priority {
spec.priority = priority;
}
if let Some(tools) = ov.mcp_tools {
spec.mcp_tools = tools;
}
debug!("Applied override to capability '{}'", ov.id);
registry.register_or_replace(CustomCapability(spec));
}
// Register custom capabilities (replace if already present)
for spec in config.custom {
if spec.id.is_empty() {
return Err(LoaderError::Registry(CapabilityError::InvalidSpec(
"custom capability id must not be empty".to_string(),
)));
}
debug!("Registering custom capability '{}'", spec.id);
registry.register_or_replace(CustomCapability(spec));
}
info!(
"Applied capability config: {} overrides, {} custom",
override_count, custom_count
);
Ok(())
}
/// Convenience: parse from file and apply to registry in one call.
pub fn load_and_apply(path: &str, registry: &CapabilityRegistry) -> Result<(), LoaderError> {
let config = Self::from_file(path)?;
Self::apply(config, registry)
}
}
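The override path in `apply` patches only the fields a TOML table actually sets. A standalone sketch of that `Option`-per-field merge, with stand-in types instead of the real loader structs:

```rust
#[derive(Debug, Clone, PartialEq)]
struct Spec {
    id: String,
    model: String,
    max_tokens: u32,
}

// Mirrors CapabilityOverride: every field optional except the target id.
struct SpecOverride {
    id: String,
    model: Option<String>,
    max_tokens: Option<u32>,
}

// Non-None override fields win; unset fields keep the base value.
fn apply_override(base: &Spec, ov: &SpecOverride) -> Spec {
    // Unknown target id: return the base unchanged, mirroring the loader's
    // warn-and-skip path for overrides that name no registered capability.
    if ov.id != base.id {
        return base.clone();
    }
    let mut spec = base.clone();
    if let Some(model) = &ov.model {
        spec.model = model.clone();
    }
    if let Some(max_tokens) = ov.max_tokens {
        spec.max_tokens = max_tokens;
    }
    spec
}

fn main() {
    let base = Spec {
        id: "code-reviewer".into(),
        model: "claude-opus-4-6".into(),
        max_tokens: 8192,
    };
    let ov = SpecOverride {
        id: "code-reviewer".into(),
        model: None,
        max_tokens: Some(4096),
    };
    let patched = apply_override(&base, &ov);
    assert_eq!(patched.max_tokens, 4096);
    assert_eq!(patched.model, base.model); // unset field preserved
    println!("{patched:?}");
}
```

Re-applying the same override is idempotent, which is what lets the loader replace the registry entry on every config reload.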
#[cfg(test)]
mod tests {
use super::*;
use crate::CapabilityRegistry;
#[test]
fn parse_empty_config_succeeds() {
let config = CapabilityLoader::parse("").unwrap();
assert!(config.overrides.is_empty());
assert!(config.custom.is_empty());
}
#[test]
fn parse_override_only() {
let toml = r#"
[[override]]
id = "code-reviewer"
preferred_model = "claude-sonnet-4-6"
max_tokens = 16384
"#;
let config = CapabilityLoader::parse(toml).unwrap();
assert_eq!(config.overrides.len(), 1);
assert_eq!(config.overrides[0].id, "code-reviewer");
assert_eq!(
config.overrides[0].preferred_model.as_deref(),
Some("claude-sonnet-4-6")
);
assert_eq!(config.overrides[0].max_tokens, Some(16384));
assert!(config.overrides[0].system_prompt.is_none());
}
#[test]
fn parse_custom_capability() {
let toml = r#"
[[custom]]
id = "db-optimizer"
display_name = "Database Optimizer"
description = "Optimizes SQL queries"
agent_role = "db_optimizer"
task_types = ["db_optimization"]
system_prompt = "You are a database expert."
mcp_tools = ["file_read"]
preferred_provider = "claude"
preferred_model = "claude-sonnet-4-6"
max_tokens = 2048
temperature = 0.2
priority = 65
parallelizable = true
"#;
let config = CapabilityLoader::parse(toml).unwrap();
assert_eq!(config.custom.len(), 1);
assert_eq!(config.custom[0].id, "db-optimizer");
assert_eq!(config.custom[0].agent_role, "db_optimizer");
assert_eq!(config.custom[0].max_tokens, 2048);
}
#[test]
fn apply_override_changes_model() {
let registry = CapabilityRegistry::with_built_ins();
let original_model = registry
.get("code-reviewer")
.unwrap()
.spec()
.preferred_model
.clone();
let toml = r#"
[[override]]
id = "code-reviewer"
preferred_model = "claude-sonnet-4-6"
"#;
let config = CapabilityLoader::parse(toml).unwrap();
CapabilityLoader::apply(config, &registry).unwrap();
let updated = registry.activate("code-reviewer").unwrap();
assert_eq!(updated.llm_model, "claude-sonnet-4-6");
// The original is different (it was opus), so this proves the override took
// effect
assert_ne!(updated.llm_model, original_model);
}
#[test]
fn apply_override_preserves_unset_fields() {
let registry = CapabilityRegistry::with_built_ins();
let original_spec = registry.get("code-reviewer").unwrap().spec();
let toml = r#"
[[override]]
id = "code-reviewer"
max_tokens = 4096
"#;
let config = CapabilityLoader::parse(toml).unwrap();
CapabilityLoader::apply(config, &registry).unwrap();
let updated_spec = registry.get("code-reviewer").unwrap().spec();
assert_eq!(updated_spec.max_tokens, 4096);
// System prompt and provider should be unchanged
assert_eq!(updated_spec.system_prompt, original_spec.system_prompt);
assert_eq!(
updated_spec.preferred_provider,
original_spec.preferred_provider
);
}
#[test]
fn apply_unknown_override_id_is_skipped() {
let registry = CapabilityRegistry::with_built_ins();
let initial_count = registry.len();
let toml = r#"
[[override]]
id = "nonexistent-cap"
max_tokens = 999
"#;
let config = CapabilityLoader::parse(toml).unwrap();
// Should succeed without error — unknown override is skipped with a warning
CapabilityLoader::apply(config, &registry).unwrap();
// Registry count unchanged
assert_eq!(registry.len(), initial_count);
}
#[test]
fn apply_custom_adds_new_capability() {
let registry = CapabilityRegistry::with_built_ins();
let initial_count = registry.len();
let toml = r#"
[[custom]]
id = "test-custom"
display_name = "Test Custom"
description = "A test custom capability"
agent_role = "custom_tester"
task_types = ["custom_task"]
system_prompt = "You are a custom test agent."
mcp_tools = []
preferred_provider = "ollama"
preferred_model = "llama3.2"
max_tokens = 2048
temperature = 0.5
priority = 40
parallelizable = true
"#;
let config = CapabilityLoader::parse(toml).unwrap();
CapabilityLoader::apply(config, &registry).unwrap();
assert_eq!(registry.len(), initial_count + 1);
assert!(registry.contains("test-custom"));
let def = registry.activate("test-custom").unwrap();
assert_eq!(def.role, "custom_tester");
assert_eq!(def.llm_provider, "ollama");
assert_eq!(def.system_prompt.unwrap(), "You are a custom test agent.");
}
#[test]
fn apply_custom_replaces_existing_idempotently() {
let registry = CapabilityRegistry::with_built_ins();
let toml = r#"
[[custom]]
id = "replaceable"
display_name = "V1"
description = "Version 1"
agent_role = "role_v1"
task_types = ["task_v1"]
system_prompt = "V1 prompt"
mcp_tools = []
preferred_provider = "claude"
preferred_model = "claude-sonnet-4-6"
max_tokens = 1024
temperature = 0.5
priority = 50
parallelizable = true
"#;
let config = CapabilityLoader::parse(toml).unwrap();
CapabilityLoader::apply(config, &registry).unwrap();
let toml2 = r#"
[[custom]]
id = "replaceable"
display_name = "V2"
description = "Version 2"
agent_role = "role_v2"
task_types = ["task_v2"]
system_prompt = "V2 prompt"
mcp_tools = []
preferred_provider = "claude"
preferred_model = "claude-opus-4-6"
max_tokens = 2048
temperature = 0.3
priority = 60
parallelizable = false
"#;
let config2 = CapabilityLoader::parse(toml2).unwrap();
CapabilityLoader::apply(config2, &registry).unwrap();
let spec = registry.get("replaceable").unwrap().spec();
assert_eq!(spec.agent_role, "role_v2");
assert_eq!(spec.preferred_model, "claude-opus-4-6");
}
#[test]
fn from_file_nonexistent_returns_io_error() {
let err = CapabilityLoader::from_file("/nonexistent/path/capabilities.toml").unwrap_err();
assert!(matches!(err, LoaderError::Io { .. }));
}
}

View File

@ -0,0 +1,340 @@
use std::sync::Arc;
use parking_lot::RwLock;
use thiserror::Error;
use tracing::{debug, info};
use vapora_shared::AgentDefinition;
use crate::built_in::{CodeReviewer, DocGenerator, PRMonitor};
use crate::capability::{Capability, CustomCapability};
#[derive(Debug, Error)]
pub enum CapabilityError {
#[error("Capability not found: {0}")]
NotFound(String),
#[error("Capability already registered: {0}")]
AlreadyRegistered(String),
#[error("Invalid capability spec: {0}")]
InvalidSpec(String),
}
/// Thread-safe catalog of capability packages.
///
/// Read-heavy (looked up on every task dispatch), written rarely (at startup
/// or when loading user overrides). Uses `parking_lot::RwLock` for
/// uncontended-read performance.
///
/// # Examples
///
/// ```rust
/// use vapora_capabilities::CapabilityRegistry;
///
/// let registry = CapabilityRegistry::with_built_ins();
/// let def = registry.activate("code-reviewer").unwrap();
/// assert_eq!(def.role, "code_reviewer");
/// ```
#[derive(Debug)]
pub struct CapabilityRegistry {
capabilities: RwLock<std::collections::HashMap<String, Arc<dyn Capability>>>,
}
impl CapabilityRegistry {
/// Create an empty registry.
pub fn new() -> Self {
Self {
capabilities: RwLock::new(std::collections::HashMap::new()),
}
}
/// Create a registry pre-populated with all built-in capabilities.
///
/// Built-ins registered: `code-reviewer`, `doc-generator`, `pr-monitor`.
pub fn with_built_ins() -> Self {
let registry = Self::new();
// These are infallible — built-in IDs are unique by construction.
registry
.register(CodeReviewer)
.expect("code-reviewer id collision");
registry
.register(DocGenerator)
.expect("doc-generator id collision");
registry
.register(PRMonitor)
.expect("pr-monitor id collision");
info!(
"CapabilityRegistry initialized with {} built-in capabilities",
registry.len()
);
registry
}
/// Register a capability. Returns [`CapabilityError::AlreadyRegistered`]
/// if a capability with the same ID is already present.
pub fn register<C: Capability + 'static>(&self, capability: C) -> Result<(), CapabilityError> {
let spec = capability.spec();
if spec.id.is_empty() {
return Err(CapabilityError::InvalidSpec(
"capability id must not be empty".to_string(),
));
}
let mut map = self.capabilities.write();
if map.contains_key(&spec.id) {
return Err(CapabilityError::AlreadyRegistered(spec.id));
}
debug!("Registered capability: {}", spec.id);
map.insert(spec.id, Arc::new(capability));
Ok(())
}
/// Register or replace a capability (idempotent — used for TOML overrides).
///
/// If a capability with the same ID already exists it is replaced silently.
/// Use this when applying user-supplied configuration overrides.
pub fn register_or_replace<C: Capability + 'static>(&self, capability: C) {
let spec = capability.spec();
debug!("Registering/replacing capability: {}", spec.id);
self.capabilities
.write()
.insert(spec.id, Arc::new(capability));
}
/// Replace a built-in capability with a custom override spec.
///
/// Returns [`CapabilityError::NotFound`] if the ID does not exist.
pub fn override_spec(
&self,
id: &str,
new_spec: crate::capability::CapabilitySpec,
) -> Result<(), CapabilityError> {
if new_spec.id != id {
return Err(CapabilityError::InvalidSpec(format!(
"override spec id '{}' does not match target id '{}'",
new_spec.id, id
)));
}
let mut map = self.capabilities.write();
if !map.contains_key(id) {
return Err(CapabilityError::NotFound(id.to_string()));
}
debug!("Overriding capability: {}", id);
map.insert(id.to_string(), Arc::new(CustomCapability(new_spec)));
Ok(())
}
/// Retrieve a capability by ID.
pub fn get(&self, id: &str) -> Option<Arc<dyn Capability>> {
self.capabilities.read().get(id).cloned()
}
/// Returns `true` if a capability with the given ID is registered.
pub fn contains(&self, id: &str) -> bool {
self.capabilities.read().contains_key(id)
}
/// List all registered capability IDs.
pub fn list_ids(&self) -> Vec<String> {
let mut ids: Vec<String> = self.capabilities.read().keys().cloned().collect();
ids.sort();
ids
}
/// Return all registered capabilities.
pub fn list_all(&self) -> Vec<Arc<dyn Capability>> {
self.capabilities.read().values().cloned().collect()
}
/// Number of registered capabilities.
pub fn len(&self) -> usize {
self.capabilities.read().len()
}
/// Returns `true` if no capabilities are registered.
pub fn is_empty(&self) -> bool {
self.capabilities.read().is_empty()
}
/// Convert a registered capability to an [`AgentDefinition`].
///
/// This is the primary activation path: look up by ID and produce an
/// `AgentDefinition` that can be used with
/// [`vapora_agents::registry::AgentMetadata::new`] and then registered in
/// the coordinator.
///
/// # Errors
///
/// Returns [`CapabilityError::NotFound`] if no capability with the given ID
/// is registered.
pub fn activate(&self, id: &str) -> Result<AgentDefinition, CapabilityError> {
let cap = self
.get(id)
.ok_or_else(|| CapabilityError::NotFound(id.to_string()))?;
let def = cap.to_agent_definition();
debug!(
"Activated capability '{}' → role '{}' (provider: {}, model: {})",
id, def.role, def.llm_provider, def.llm_model
);
Ok(def)
}
}
impl Default for CapabilityRegistry {
fn default() -> Self {
Self::with_built_ins()
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::capability::CapabilitySpec;
fn minimal_spec(id: &str, role: &str) -> CapabilitySpec {
CapabilitySpec {
id: id.to_string(),
display_name: "Test".to_string(),
description: "Test capability".to_string(),
agent_role: role.to_string(),
task_types: vec!["test_task".to_string()],
system_prompt: "You are a test agent.".to_string(),
mcp_tools: vec![],
preferred_provider: "claude".to_string(),
preferred_model: "claude-sonnet-4-6".to_string(),
max_tokens: 1024,
temperature: 0.5,
priority: 50,
parallelizable: true,
}
}
#[test]
fn built_ins_count_is_three() {
let registry = CapabilityRegistry::with_built_ins();
assert_eq!(registry.len(), 3);
}
#[test]
fn built_ins_ids_are_present() {
let registry = CapabilityRegistry::with_built_ins();
assert!(registry.contains("code-reviewer"));
assert!(registry.contains("doc-generator"));
assert!(registry.contains("pr-monitor"));
}
#[test]
fn get_nonexistent_returns_none() {
let registry = CapabilityRegistry::with_built_ins();
assert!(registry.get("nonexistent-capability").is_none());
}
#[test]
fn activate_produces_agent_definition_with_system_prompt() {
let registry = CapabilityRegistry::with_built_ins();
let def = registry.activate("code-reviewer").unwrap();
assert_eq!(def.role, "code_reviewer");
assert!(
def.system_prompt.is_some(),
"system_prompt must be populated by activation"
);
let prompt = def.system_prompt.unwrap();
assert!(
!prompt.is_empty(),
"system_prompt must not be empty after activation"
);
}
#[test]
fn activate_nonexistent_returns_not_found() {
let registry = CapabilityRegistry::with_built_ins();
let err = registry.activate("does-not-exist").unwrap_err();
assert!(matches!(err, CapabilityError::NotFound(_)));
}
#[test]
fn register_duplicate_returns_already_registered() {
let registry = CapabilityRegistry::new();
let spec = minimal_spec("my-cap", "my_role");
registry
.register(CustomCapability(spec.clone()))
.expect("first registration succeeds");
let err = registry.register(CustomCapability(spec)).unwrap_err();
assert!(matches!(err, CapabilityError::AlreadyRegistered(_)));
}
#[test]
fn register_or_replace_overwrites_existing() {
let registry = CapabilityRegistry::new();
let spec_v1 = minimal_spec("my-cap", "role_v1");
let spec_v2 = minimal_spec("my-cap", "role_v2");
registry.register(CustomCapability(spec_v1)).unwrap();
registry.register_or_replace(CustomCapability(spec_v2));
let retrieved = registry.get("my-cap").unwrap();
assert_eq!(retrieved.spec().agent_role, "role_v2");
}
#[test]
fn override_spec_replaces_built_in_model() {
let registry = CapabilityRegistry::with_built_ins();
let mut new_spec = registry.get("code-reviewer").unwrap().spec();
new_spec.preferred_model = "claude-sonnet-4-6".to_string();
registry.override_spec("code-reviewer", new_spec).unwrap();
let updated = registry.activate("code-reviewer").unwrap();
assert_eq!(updated.llm_model, "claude-sonnet-4-6");
}
#[test]
fn override_spec_wrong_id_returns_invalid() {
let registry = CapabilityRegistry::with_built_ins();
let mut spec = registry.get("code-reviewer").unwrap().spec();
spec.id = "pr-monitor".to_string(); // mismatch
let err = registry.override_spec("code-reviewer", spec).unwrap_err();
assert!(matches!(err, CapabilityError::InvalidSpec(_)));
}
#[test]
fn list_ids_is_sorted() {
let registry = CapabilityRegistry::with_built_ins();
let ids = registry.list_ids();
let mut sorted = ids.clone();
sorted.sort();
assert_eq!(ids, sorted);
}
#[test]
fn activate_doc_generator_correct_role() {
let registry = CapabilityRegistry::with_built_ins();
let def = registry.activate("doc-generator").unwrap();
assert_eq!(def.role, "documenter");
assert!(def.system_prompt.is_some());
}
#[test]
fn activate_pr_monitor_correct_role() {
let registry = CapabilityRegistry::with_built_ins();
let def = registry.activate("pr-monitor").unwrap();
assert_eq!(def.role, "monitor");
assert!(def.system_prompt.is_some());
}
#[test]
fn register_empty_id_returns_invalid() {
let registry = CapabilityRegistry::new();
let spec = minimal_spec("", "role");
let err = registry.register(CustomCapability(spec)).unwrap_err();
assert!(matches!(err, CapabilityError::InvalidSpec(_)));
}
}

View File

@ -20,7 +20,11 @@ pub use analytics::{
pub use error::{KGError, Result};
pub use learning::{apply_recency_bias, calculate_learning_curve};
pub use metrics::{AnalyticsComputation, TimePeriod};
pub use models::{
AgentProfile, CausalRelationship, ExecutionRecord, ExecutionRelation, GraphStatistics,
HybridSearchResult, ProviderAnalytics, ProviderCostForecast, ProviderEfficiency,
ProviderTaskTypeMetrics, Recommendation, SimilarityResult,
};
pub use persistence::{
KGPersistence, PersistedExecution, PersistedRlmExecution, RlmExecutionBuilder,
};

View File

@ -128,6 +128,23 @@ pub struct ProviderTaskTypeMetrics {
pub avg_duration_ms: f64,
}
/// Result from hybrid search combining HNSW semantic and BM25 lexical retrieval
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct HybridSearchResult {
pub execution: crate::persistence::PersistedExecution,
/// Cosine similarity score from HNSW vector search [0.0, 1.0]
pub semantic_score: f64,
/// BM25 relevance score from full-text search (unnormalized, higher = more
/// relevant)
pub lexical_score: f64,
/// RRF fused score: sum of 1/(60 + rank) across both methods
pub hybrid_score: f64,
/// 1-indexed position in the semantic ranked list (list length + 1 if absent)
pub semantic_rank: usize,
/// 1-indexed position in the lexical ranked list (list length + 1 if absent)
pub lexical_rank: usize,
}
/// Provider cost forecast
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ProviderCostForecast {

View File

@ -310,27 +310,227 @@ impl KGPersistence {
Ok(())
}
/// Find similar executions using SurrealDB 3 HNSW approximate nearest
/// neighbor search.
///
/// The `<|100,64|>` operator retrieves 100 HNSW candidates with ef=64
/// search expansion, then reranks by exact cosine similarity. Falls
/// back to recent successful executions when no embedding is provided.
pub async fn find_similar_executions(
&self,
embedding: &[f32],
limit: usize,
) -> anyhow::Result<Vec<PersistedExecution>> {
debug!(
"HNSW semantic search for similar executions (limit: {})",
limit
);
if embedding.is_empty() {
let query = format!(
"SELECT * FROM kg_executions WHERE outcome = 'success' ORDER BY executed_at DESC \
LIMIT {limit}"
);
let mut response = self.db.query(&query).await?;
let raw: Vec<serde_json::Value> = response.take(0)?;
return Ok(raw
.into_iter()
.filter_map(|v| serde_json::from_value(v).ok())
.collect());
}
let results = self
.search_by_embedding_inner(embedding, limit)
.await?
.into_iter()
.map(|(exec, _score)| exec)
.collect();
Ok(results)
}
/// Hybrid search combining HNSW vector similarity and BM25 full-text
/// search, fused via Reciprocal Rank Fusion (RRF, k=60).
///
/// Requires migration 012 (HNSW + BM25 indexes) to be applied.
pub async fn hybrid_search(
&self,
embedding: &[f32],
text_query: &str,
limit: usize,
) -> anyhow::Result<Vec<crate::models::HybridSearchResult>> {
use std::collections::HashMap;
let candidate_limit = (limit * 3).max(30);
let semantic_results = if !embedding.is_empty() {
self.search_by_embedding_inner(embedding, candidate_limit)
.await
.unwrap_or_default()
} else {
Vec::new()
};
let lexical_results = if !text_query.trim().is_empty() {
self.search_by_text_inner(text_query, candidate_limit)
.await
.unwrap_or_default()
} else {
Vec::new()
};
if semantic_results.is_empty() && lexical_results.is_empty() {
return Ok(Vec::new());
}
let semantic_ranked: Vec<String> = semantic_results
.iter()
.map(|(e, _)| e.execution_id.clone())
.collect();
let lexical_ranked: Vec<String> = lexical_results
.iter()
.map(|(e, _)| e.execution_id.clone())
.collect();
let semantic_score_map: HashMap<String, f64> = semantic_results
.iter()
.map(|(e, s)| (e.execution_id.clone(), *s))
.collect();
let lexical_score_map: HashMap<String, f64> = lexical_results
.iter()
.map(|(e, s)| (e.execution_id.clone(), *s))
.collect();
// Merge execution records without double-fetching (prefer semantic)
let mut exec_map: HashMap<String, PersistedExecution> = HashMap::new();
for (exec, _) in semantic_results.into_iter().chain(lexical_results) {
exec_map.entry(exec.execution_id.clone()).or_insert(exec);
}
// Collect unique IDs for RRF
let mut all_ids: Vec<String> = Vec::with_capacity(exec_map.len());
let mut seen = std::collections::HashSet::new();
for id in semantic_ranked.iter().chain(lexical_ranked.iter()) {
if seen.insert(id.clone()) {
all_ids.push(id.clone());
}
}
let mut fused: Vec<crate::models::HybridSearchResult> = all_ids
.into_iter()
.filter_map(|id| {
// RRF: 1-indexed ranks; absent items get rank = list_len + 1
let sem_rank = semantic_ranked
.iter()
.position(|x| x == &id)
.map_or(semantic_ranked.len() + 1, |r| r + 1);
let lex_rank = lexical_ranked
.iter()
.position(|x| x == &id)
.map_or(lexical_ranked.len() + 1, |r| r + 1);
const K: f64 = 60.0;
let hybrid_score = 1.0 / (K + sem_rank as f64) + 1.0 / (K + lex_rank as f64);
let execution = exec_map.remove(&id)?;
Some(crate::models::HybridSearchResult {
semantic_score: semantic_score_map.get(&id).copied().unwrap_or(0.0),
lexical_score: lexical_score_map.get(&id).copied().unwrap_or(0.0),
hybrid_score,
semantic_rank: sem_rank,
lexical_rank: lex_rank,
execution,
})
})
.collect();
fused.sort_by(|a, b| {
b.hybrid_score
.partial_cmp(&a.hybrid_score)
.unwrap_or(std::cmp::Ordering::Equal)
});
fused.truncate(limit);
debug!(
"Hybrid search returned {} results (semantic: {}, lexical: {} unique candidates)",
fused.len(),
semantic_ranked.len(),
lexical_ranked.len()
);
Ok(fused)
}
/// HNSW approximate nearest neighbor search on `embedding` field.
///
/// `<|100,64|>`: retrieve 100 candidates from HNSW index with ef=64.
/// Results are reranked by exact cosine similarity before returning.
async fn search_by_embedding_inner(
&self,
embedding: &[f32],
limit: usize,
) -> anyhow::Result<Vec<(PersistedExecution, f64)>> {
let mut response = self
.db
.query(
"SELECT *, vector::similarity::cosine(embedding, $q) AS cosine_score FROM \
kg_executions WHERE embedding <|100,64|> $q ORDER BY cosine_score DESC",
)
.bind(("q", embedding.to_vec()))
.await?;
let raw: Vec<serde_json::Value> = response.take(0)?;
let results = raw
.into_iter()
.filter_map(|v| serde_json::from_value(v).ok())
.filter_map(|mut v| {
let score = v.get("cosine_score")?.as_f64()?;
if let Some(obj) = v.as_object_mut() {
obj.remove("cosine_score");
obj.remove("id");
}
let exec: PersistedExecution = serde_json::from_value(v).ok()?;
Some((exec, score))
})
.take(limit)
.collect();
Ok(results)
}
/// BM25 full-text search on `task_description` field.
///
/// Requires `idx_kg_executions_ft` index (migration 012).
/// The `@1@` predicate tag pairs with `search::score(1)` for BM25 scoring.
async fn search_by_text_inner(
&self,
text_query: &str,
limit: usize,
) -> anyhow::Result<Vec<(PersistedExecution, f64)>> {
let mut response = self
.db
.query(
"SELECT *, search::score(1) AS bm25_score FROM kg_executions WHERE \
task_description @1@ $text ORDER BY bm25_score DESC LIMIT 100",
)
.bind(("text", text_query.to_string()))
.await?;
let raw: Vec<serde_json::Value> = response.take(0)?;
let results = raw
.into_iter()
.filter_map(|mut v| {
let score = v.get("bm25_score")?.as_f64().unwrap_or(0.0);
if let Some(obj) = v.as_object_mut() {
obj.remove("bm25_score");
obj.remove("id");
}
let exec: PersistedExecution = serde_json::from_value(v).ok()?;
Some((exec, score))
})
.take(limit)
.collect();
Ok(results)
}
@ -751,26 +951,49 @@ impl KGPersistence {
.collect())
}
/// Find similar RLM tasks using cosine similarity on `query_embedding`.
///
/// `rlm_executions` is SCHEMALESS with nullable `query_embedding`, making
/// HNSW indexing impractical. In-memory cosine similarity is used instead
/// since RLM datasets are per-document and bounded in size.
pub async fn find_similar_rlm_tasks(
&self,
query_embedding: &[f32],
limit: usize,
) -> anyhow::Result<Vec<PersistedRlmExecution>> {
debug!(
"Cosine similarity search for similar RLM tasks (limit: {})",
limit
);
let candidate_limit = (limit * 10).max(200);
let query = format!(
"SELECT * FROM rlm_executions WHERE success = true ORDER BY executed_at DESC LIMIT \
{candidate_limit}"
);
let mut response = self.db.query(&query).await?;
let raw: Vec<serde_json::Value> = response.take(0)?;
let candidates: Vec<PersistedRlmExecution> = raw
.into_iter()
.filter_map(|v| serde_json::from_value(v).ok())
.collect();
if query_embedding.is_empty() || candidates.is_empty() {
return Ok(candidates.into_iter().take(limit).collect());
}
let mut scored: Vec<(PersistedRlmExecution, f64)> = candidates
.into_iter()
.filter_map(|exec| {
let emb = exec.query_embedding.as_deref()?;
let sim = cosine_similarity(query_embedding, emb);
Some((exec, sim))
})
.collect();
scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
Ok(scored.into_iter().take(limit).map(|(e, _)| e).collect())
}
/// Get RLM success rate for a specific document
@ -856,6 +1079,22 @@ impl KGPersistence {
}
}
/// Cosine similarity between two equal-length f32 vectors.
/// Returns 0.0 for empty or mismatched-dimension inputs.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f64 {
if a.len() != b.len() || a.is_empty() {
return 0.0;
}
let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
if norm_a > 0.0 && norm_b > 0.0 {
(dot / (norm_a * norm_b)) as f64
} else {
0.0
}
}
#[cfg(test)]
mod tests {
use super::*;
@ -961,4 +1200,67 @@ mod tests {
assert_eq!(execution.success(), true);
assert_eq!(execution.duration_ms(), 1000);
}
#[test]
fn test_cosine_similarity_orthogonal() {
let a = vec![1.0_f32, 0.0, 0.0];
let b = vec![0.0_f32, 1.0, 0.0];
assert!((cosine_similarity(&a, &b) - 0.0).abs() < 1e-6);
}
#[test]
fn test_cosine_similarity_identical() {
let v = vec![0.3_f32, 0.4, 0.5];
let sim = cosine_similarity(&v, &v);
assert!((sim - 1.0).abs() < 1e-5, "identical vectors: {sim}");
}
#[test]
fn test_cosine_similarity_empty() {
assert_eq!(cosine_similarity(&[], &[]), 0.0);
assert_eq!(cosine_similarity(&[1.0], &[]), 0.0);
}
#[test]
fn test_cosine_similarity_partial() {
// [1,0] · [0.5, 0.866] ≈ cos(60°) ≈ 0.5
let a = vec![1.0_f32, 0.0];
let b = vec![0.5_f32, 0.866];
let sim = cosine_similarity(&a, &b);
assert!((sim - 0.5).abs() < 0.01, "got {sim}");
}
#[test]
fn test_rrf_fusion_consensus_wins() {
// Simulate the RRF logic: IDs in both lists should score higher
const K: f64 = 60.0;
let semantic = vec!["exec-a".to_string(), "exec-b".to_string()];
let lexical = vec!["exec-b".to_string(), "exec-c".to_string()];
let score_for = |id: &str| {
let sem_rank = semantic
.iter()
.position(|x| x == id)
.map_or(semantic.len() + 1, |r| r + 1);
let lex_rank = lexical
.iter()
.position(|x| x == id)
.map_or(lexical.len() + 1, |r| r + 1);
1.0 / (K + sem_rank as f64) + 1.0 / (K + lex_rank as f64)
};
let score_b = score_for("exec-b");
let score_a = score_for("exec-a");
let score_c = score_for("exec-c");
// exec-b appears in both lists, should rank highest
assert!(
score_b > score_a,
"exec-b ({score_b}) should beat exec-a ({score_a})"
);
assert!(
score_b > score_c,
"exec-b ({score_b}) should beat exec-c ({score_c})"
);
}
}

View File

@ -0,0 +1,44 @@
use serde::{Deserialize, Serialize};
/// Full specification for deploying a domain-optimized agent.
///
/// Produced by [`vapora_capabilities::CapabilityRegistry::activate`] and
/// consumed by [`vapora_agents::registry::AgentMetadata`] construction.
/// Kept in `vapora-shared` so both crates can reference it without a
/// circular dependency.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AgentDefinition {
/// Agent role name used for task routing (e.g., `"code_reviewer"`).
pub role: String,
/// Human-readable description shown in UIs and logs.
pub description: String,
/// Preferred LLM provider name (e.g., `"claude"`).
pub llm_provider: String,
/// Preferred model within the provider (e.g., `"claude-opus-4-6"`).
pub llm_model: String,
/// Whether multiple instances may run concurrently.
#[serde(default)]
pub parallelizable: bool,
/// Assignment priority 0-100.
#[serde(default = "default_priority")]
pub priority: u32,
/// Task-type strings used by the coordinator for capability-based routing.
#[serde(default)]
pub capabilities: Vec<String>,
/// Domain-optimized system prompt injected before every task.
/// `None` falls back to the agent's generic prompt.
#[serde(default)]
pub system_prompt: Option<String>,
}
fn default_priority() -> u32 {
50
}
impl AgentDefinition {
/// Builder: attach a domain-optimized system prompt.
pub fn with_system_prompt(mut self, prompt: impl Into<String>) -> Self {
self.system_prompt = Some(prompt.into());
self
}
}

View File

@ -1,8 +1,10 @@
// vapora-shared: Shared types and utilities for VAPORA v1.0
// Foundation: Minimal skeleton with core types
pub mod agent_definition;
pub mod error;
pub mod models;
pub mod validation;
pub use agent_definition::AgentDefinition;
pub use error::{Result, VaporaError};

View File

@ -111,13 +111,15 @@ webhook_url = "${SLACK_WEBHOOK_URL}"
on_task_done = ["team-slack"]
on_proposal_approved = ["team-slack", "ops-discord"]
on_proposal_rejected = ["ops-discord"]
on_agent_inactive = ["ops-telegram"]
```
`AppState` gains `channel_registry: Option<Arc<ChannelRegistry>>` and `notification_config: Arc<NotificationConfig>`. Hooks in four existing handlers:
- `update_task_status` — fires `Message::success` on `TaskStatus::Done`
- `approve_proposal` — fires `Message::success`
- `reject_proposal` — fires `Message::warning`
- `update_agent_status` — fires `Message::error` on `AgentStatus::Inactive`
#### New REST Endpoints

View File

@ -0,0 +1,181 @@
# ADR-0036: Knowledge Graph Hybrid Search — HNSW + BM25 + RRF
**Status**: Implemented
**Date**: 2026-02-26
**Deciders**: VAPORA Team
**Technical Story**: `find_similar_executions` was a stub returning recent records; `find_similar_rlm_tasks` ignored embeddings entirely. A missing schema migration caused all `kg_executions` reads to silently fail deserialization.
---
## Decision
Replace the stub similarity functions in `KGPersistence` with a **hybrid retrieval pipeline** combining:
1. **HNSW** (SurrealDB 3 native) — approximate nearest-neighbor vector search over `embedding` field
2. **BM25** (SurrealDB 3 native full-text search) — lexical scoring over `task_description` field
3. **Reciprocal Rank Fusion (RRF, k=60)** — scale-invariant score fusion
Add migration `012_kg_hybrid_search.surql` that fixes a pre-existing schema bug (three fields missing from the `SCHEMAFULL` table) and defines the required indexes.
---
## Context
### The Stub Problem
`find_similar_executions` in `persistence.rs` discarded its `embedding: &[f32]` argument entirely and returned the N most-recent successful executions, ordered by timestamp. Any caller relying on semantic proximity was silently receiving chronological results — a correctness bug, not a performance issue.
### The Silent Schema Bug
`kg_executions` was declared `SCHEMAFULL` in migration 005 but three fields used by `PersistedExecution` (`agent_role`, `provider`, `cost_cents`) were absent from the schema. SurrealDB drops undefined fields on `INSERT` in SCHEMAFULL tables. All subsequent `SELECT` queries returned records that failed `serde_json::from_value` deserialization, which was swallowed by `.filter_map(|v| v.ok())`. The persistence layer appeared to work (no errors) while returning empty results for every query.
### Why Not `stratum-embeddings` SurrealDbStore
`stratumiops/crates/stratum-embeddings/src/store/surrealdb.rs` implements vector search as a brute-force full-scan: it loads all records into memory and computes cosine similarity in-process. This works for document chunk retrieval (bounded dataset per document), but is unsuitable for the knowledge graph which accumulates unbounded execution records across all agents and tasks over time.
### Why Hybrid Over Pure Semantic
Embedding-only retrieval misses exact keyword matches: an agent searching for "cargo clippy warnings" may not find a record titled "clippy deny warnings fix" if the embedding model compresses the phrase differently than the query. BM25 handles exact token overlap that embeddings smooth over.
---
## Alternatives Considered
### ❌ Pure HNSW semantic search only
- Misses exact keyword matches (e.g., specific error codes, crate names)
- Embedding quality varies across providers; degrades if provider changes
### ❌ Pure BM25 lexical search only
- Misses paraphrases and semantic variants ("task failed" vs "execution error")
- No relevance for structurally similar tasks with different wording
### ❌ Tantivy / external FTS engine
- Adds a new process dependency for a capability SurrealDB 3 provides natively
- Requires synchronizing two stores; adds operational complexity
### ✅ SurrealDB 3 HNSW + BM25 + RRF (chosen)
Single data store, two native index types, no new dependencies, no sync complexity.
---
## Implementation
### Migration 012
```sql
-- Fix missing fields causing silent deserialization failure
DEFINE FIELD agent_role ON TABLE kg_executions TYPE option<string>;
DEFINE FIELD provider ON TABLE kg_executions TYPE string DEFAULT 'unknown';
DEFINE FIELD cost_cents ON TABLE kg_executions TYPE int DEFAULT 0;
-- BM25 full-text index on task_description
DEFINE ANALYZER kg_text_analyzer
    TOKENIZERS class
    FILTERS lowercase, snowball(english);

DEFINE INDEX idx_kg_executions_ft
    ON TABLE kg_executions
    FIELDS task_description
    SEARCH ANALYZER kg_text_analyzer BM25;

-- HNSW ANN index on embedding (1536-dim, cosine, float32)
DEFINE INDEX idx_kg_executions_hnsw
    ON TABLE kg_executions
    FIELDS embedding
    HNSW DIMENSION 1536 DIST COSINE TYPE F32 M 16 EF_CONSTRUCTION 200;
```
HNSW parameters: `M 16` (16 edges per node, standard for 1536-dim); `EF_CONSTRUCTION 200` (index build quality vs. insert speed; 200 is the standard default).
### Query Patterns
**HNSW semantic search** (`<|100,64|>` = 100 candidates, ef=64 at query time):
```surql
SELECT *, vector::similarity::cosine(embedding, $q) AS cosine_score
FROM kg_executions
WHERE embedding <|100,64|> $q
ORDER BY cosine_score DESC
LIMIT 20
```
**BM25 lexical search** (`@1@` assigns predicate ID 1; paired with `search::score(1)`):
```surql
SELECT *, search::score(1) AS bm25_score
FROM kg_executions
WHERE task_description @1@ $text
ORDER BY bm25_score DESC
LIMIT 100
```
### RRF Fusion
Cosine similarity is bounded `[0.0, 1.0]`; BM25 scores are unbounded `[0, ∞)`. Blending them linearly would require per-corpus score normalization. RRF sidesteps this entirely: it operates on ranks rather than raw scores, so it is scale-invariant:
```
hybrid_score(id) = 1 / (60 + rank_semantic) + 1 / (60 + rank_lexical)
```
`k=60` is the constant recommended in the original RRF paper (Cormack, Clarke & Buettcher, 2009). IDs absent from one ranked list are assigned rank 0 for that list, contributing a `1/60` floor rather than nothing, so results surfaced by only one method are never completely suppressed.
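The fusion step itself is a few lines of Rust. The sketch below shows standard RRF with 1-based ranks, under the simplifying assumption that an ID absent from one list contributes 0 for that signal (the shipped implementation instead applies the `1/60` floor described above); `rrf_fuse` and its types are illustrative, not the actual `vapora-kg` API.

```rust
use std::collections::HashMap;

const RRF_K: f64 = 60.0;

/// Fuse two 1-based rank maps (id -> rank) into hybrid scores, best first.
/// An ID missing from one map contributes 0 for that signal in this sketch.
fn rrf_fuse(
    semantic: &HashMap<String, usize>,
    lexical: &HashMap<String, usize>,
) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for (id, rank) in semantic.iter().chain(lexical.iter()) {
        *scores.entry(id.clone()).or_insert(0.0) += 1.0 / (RRF_K + *rank as f64);
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest hybrid score first
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    let semantic = HashMap::from([("exec:a".to_string(), 1), ("exec:b".to_string(), 2)]);
    let lexical = HashMap::from([("exec:b".to_string(), 1)]);
    // "exec:b" wins: 1/62 + 1/61 beats "exec:a" at 1/61 alone
    println!("{:?}", rrf_fuse(&semantic, &lexical));
}
```

Because only ranks enter the formula, a record at BM25 score 47.3 and one at cosine 0.91 are fused without any normalization step.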
### RLM Executions
`rlm_executions` is `SCHEMALESS` with a nullable `query_embedding` field. HNSW indexes require a `SCHEMAFULL` table with a non-nullable typed field. `find_similar_rlm_tasks` uses in-memory cosine similarity: loads candidate records, filters those with non-empty embeddings, sorts by cosine score. Acceptable because the RLM dataset is bounded per document.
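That in-memory path can be sketched as follows (names are illustrative, not the actual `find_similar_rlm_tasks` signature): skip records without embeddings, score by cosine similarity, sort, truncate.

```rust
/// Cosine similarity between two equal-length vectors; 0.0 if either is all-zero.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Brute-force top-k: acceptable only because the candidate set is bounded.
fn top_k_similar(query: &[f32], records: &[(String, Vec<f32>)], k: usize) -> Vec<(String, f32)> {
    let mut scored: Vec<(String, f32)> = records
        .iter()
        .filter(|(_, emb)| !emb.is_empty()) // records with no embedding are skipped
        .map(|(id, emb)| (id.clone(), cosine(query, emb)))
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(k);
    scored
}

fn main() {
    let records = vec![
        ("doc:1".to_string(), vec![1.0, 0.0]),
        ("doc:2".to_string(), vec![0.0, 1.0]),
        ("doc:3".to_string(), vec![]), // nullable embedding: filtered out
    ];
    println!("{:?}", top_k_similar(&[1.0, 0.0], &records, 2));
}
```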
### New Public API
```rust
impl KGPersistence {
    // Was a stub (returned recent records). Now uses the HNSW ANN query.
    pub async fn find_similar_executions(
        &self,
        embedding: &[f32],
        limit: usize,
    ) -> anyhow::Result<Vec<PersistedExecution>>;

    // New. HNSW + BM25 + RRF. Either argument may be empty (degrades gracefully).
    pub async fn hybrid_search(
        &self,
        embedding: &[f32],
        text_query: &str,
        limit: usize,
    ) -> anyhow::Result<Vec<HybridSearchResult>>;
}
```
`HybridSearchResult` exposes `semantic_score`, `lexical_score`, `hybrid_score`, `semantic_rank`, `lexical_rank` — callers can inspect individual signal contributions.
---
## Consequences
### Positive
- `find_similar_executions` returns semantically similar past executions, not recent ones. The correctness bug is fixed.
- `hybrid_search` exposes both signals; callers can filter by `semantic_score ≥ 0.7` for high-confidence-only retrieval.
- No new dependencies. The two indexes are defined in a migration; no Rust dependency change.
- The schema bug fix means all existing `kg_executions` records round-trip correctly after migration 012 is applied.
### Negative / Trade-offs
- HNSW index build is `O(n log n)` in SurrealDB; large existing datasets will cause migration 012 to take longer than typical DDL migrations. No data migration is needed — only index creation.
- BM25 requires the `task_description` field to be populated at insert time. Records inserted before this migration with empty or null descriptions will not appear in lexical results.
- `rlm_executions` hybrid search remains in-memory. A future migration converting `rlm_executions` to SCHEMAFULL would enable native HNSW for that table too.
### Supersedes
- The stub implementation of `find_similar_executions` (existed since persistence.rs was written).
- Extends ADR-0013 (KG temporal design) with the retrieval layer decision.
---
## Related
- [ADR-0013: Knowledge Graph Temporal](./0013-knowledge-graph.md) — original KG design
- [ADR-0029: RLM Recursive Language Models](./0029-rlm-recursive-language-models.md) — RLM hybrid search (different use case: document chunks, not execution records)
- [ADR-0004: SurrealDB](./0004-surrealdb-database.md) — database foundation


@ -0,0 +1,185 @@
# ADR-0037: Capability Packages — `vapora-capabilities` Crate
**Status**: Accepted
**Date**: 2026-02-26
**Deciders**: VAPORA Team
**Technical Story**: VAPORA agents carried generic roles (developer, reviewer, architect) but had no domain-specific system prompts or model defaults — every new deployment required manual prompt engineering per agent before first use.
---
## Decision
Introduce a dedicated `vapora-capabilities` crate that provides **zero-config capability bundles**, and relocate `AgentDefinition` to `vapora-shared` to eliminate a circular dependency:
1. `vapora-capabilities` exposes a `Capability` trait, `CapabilitySpec` struct, `CapabilityRegistry`, and `CapabilityLoader` — built-in implementations cover `CodeReviewer`, `DocGenerator`, and `PRMonitor`.
2. `AgentDefinition` is moved from `vapora-agents::config` to `vapora-shared`; `vapora-agents::config` re-exports it for backward compatibility.
3. `AgentExecutor` gains a `with_router(Arc<LLMRouter>)` builder method; the system prompt from `CapabilitySpec` is forwarded as a `Role::System` message through the existing `TypeDialogAdapter::build_messages(prompt, context)` path.
4. `AgentCoordinator` gains in-process executor dispatch via a `DashMap<String, Sender<TaskAssignment>>`; the shard lock is released before `.await` by cloning the `Sender` out of the map.
---
## Context
### Gap: Generic Roles Require Manual Configuration
VAPORA's agent roles were defined as enum variants (`AgentRole::Developer`, `AgentRole::Reviewer`, etc.) with no attached system prompt or model preference. A freshly deployed instance had functionally identical agents regardless of the task domain. Operators had to locate `agents.toml`, write a system prompt, select a model, and restart the agents server before agents produced domain-appropriate responses. This repeated for every deployment.
### Competitive Signal: OpenFang "Hands"
OpenFang's "Hands" concept ships pre-configured agent personas as first-class artifacts: a `code-review` Hand knows which model to use, what temperature to set, and what system prompt to send — activated with a single config entry. VAPORA's equivalent required three files and a server restart. The gap was not capability but packaging.
### Circular Dependency Risk
`AgentDefinition` was the struct that capability specs would need to produce in order to register a new agent. It lived in `vapora-agents::config`. If `vapora-capabilities` imported `vapora-agents` to get `AgentDefinition`, and `vapora-agents` imported `vapora-capabilities` to load built-in capabilities at startup, the workspace would have a compile-time cycle. Cargo rejects dependency cycles, including between crates in the same workspace.
`AgentDefinition` is a plain data-transfer struct — no orchestration logic, no I/O, no runtime state. Its correct home is `vapora-shared`.
---
## Implementation
### Crate Structure (`vapora-capabilities`)
```text
vapora-capabilities/
├── src/
│   ├── lib.rs         — pub re-exports (Capability, CapabilitySpec, CapabilityRegistry, CapabilityLoader)
│   ├── capability.rs  — Capability trait + CapabilitySpec struct
│   ├── registry.rs    — CapabilityRegistry: name → Arc<dyn Capability>
│   ├── loader.rs      — CapabilityLoader: TOML file + env override resolution
│   └── builtins/
│       ├── mod.rs            — register_defaults(registry: &mut CapabilityRegistry)
│       ├── code_reviewer.rs  — CodeReviewer (Opus 4.6, temp 0.1)
│       ├── doc_generator.rs  — DocGenerator (Sonnet 4.6, temp 0.3)
│       └── pr_monitor.rs     — PRMonitor (Sonnet 4.6, temp 0.1)
```
### `Capability` Trait and `CapabilitySpec`
```rust
pub trait Capability: Send + Sync {
    fn name(&self) -> &str;
    fn spec(&self) -> CapabilitySpec;
}

pub struct CapabilitySpec {
    pub system_prompt: String,
    pub model: String,
    pub temperature: f32,
    pub agent_definition: AgentDefinition, // from vapora-shared
}
```
`CapabilitySpec` is intentionally flat — no nested `Option` hierarchies. TOML overrides are applied by `CapabilityLoader` before the spec reaches `AgentExecutor`.
### Built-in Capabilities
| Name | Model | Temperature | Purpose |
|------|-------|-------------|---------|
| `code-reviewer` | `claude-opus-4-6` | 0.1 | Structured code review with severity classification |
| `doc-generator` | `claude-sonnet-4-6` | 0.3 | Module and API documentation generation |
| `pr-monitor` | `claude-sonnet-4-6` | 0.1 | PR diff analysis and merge-readiness assessment |
Temperature 0.1 keeps review output near-deterministic; 0.3 gives generation tasks room for phrasing variation without meaningful hallucination risk for this kind of task.
### TOML Override System
Operators can override any `CapabilitySpec` field without touching source code:
```toml
[capabilities.code-reviewer]
model = "claude-sonnet-4-6"
temperature = 0.2

[capabilities.doc-generator]
system_prompt = "You are a documentation expert specializing in Rust crate APIs."
```
Unset fields retain the built-in default. `CapabilityLoader::load` merges partial TOML structs using `Option<T>` fields internally — a `None` after deserialization means "keep built-in value", not "clear to empty".
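The merge rule can be illustrated with a two-field sketch. The real `CapabilitySpec` has more fields, and `Spec`/`SpecOverride` here are hypothetical stand-ins for the crate's actual types:

```rust
#[derive(Debug, Clone, PartialEq)]
struct Spec {
    model: String,
    temperature: f32,
}

/// Deserialized TOML override: every field is optional, so a key omitted
/// from the file becomes None and preserves the built-in value on merge.
#[derive(Default)]
struct SpecOverride {
    model: Option<String>,
    temperature: Option<f32>,
}

fn merge(base: Spec, ov: SpecOverride) -> Spec {
    Spec {
        model: ov.model.unwrap_or(base.model),
        temperature: ov.temperature.unwrap_or(base.temperature),
    }
}

fn main() {
    let built_in = Spec { model: "claude-opus-4-6".to_string(), temperature: 0.1 };
    let ov = SpecOverride { temperature: Some(0.2), ..Default::default() };
    // model keeps the built-in value; only temperature is overridden
    println!("{:?}", merge(built_in, ov));
}
```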
### `AgentDefinition` Relocation
```text
Before: vapora-agents::config::AgentDefinition
After: vapora-shared::models::AgentDefinition
vapora-agents::config:
pub use vapora_shared::models::AgentDefinition; // backward compat
```
All existing call sites in `vapora-agents`, `vapora-backend`, and `vapora-a2a` continue to compile without change. The re-export is the only required modification in `vapora-agents`.
### `AgentExecutor` — System Prompt Forwarding
`AgentExecutor` gains one builder method:
```rust
impl AgentExecutor {
    pub fn with_router(mut self, router: Arc<LLMRouter>) -> Self {
        self.router = Some(router);
        self
    }
}
```
The capability's `system_prompt` is passed as `context` to `TypeDialogAdapter::build_messages(prompt, context)`, which already mapped `context` to a leading `Role::System` message before this change. No new message-construction logic is needed.
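As a sketch of that mapping (the real `TypeDialogAdapter` types are richer; `Role` and `Message` here are simplified stand-ins), a non-empty `context` becomes the leading system message:

```rust
#[derive(Debug, PartialEq)]
enum Role {
    System,
    User,
}

#[derive(Debug, PartialEq)]
struct Message {
    role: Role,
    content: String,
}

/// Context (the capability's system prompt) maps to a leading Role::System
/// message; the task prompt follows as the user message.
fn build_messages(prompt: &str, context: Option<&str>) -> Vec<Message> {
    let mut messages = Vec::new();
    if let Some(system) = context {
        messages.push(Message { role: Role::System, content: system.to_string() });
    }
    messages.push(Message { role: Role::User, content: prompt.to_string() });
    messages
}

fn main() {
    let msgs = build_messages("Review this diff", Some("You are a strict code reviewer."));
    println!("{:?}", msgs);
}
```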
### `AgentCoordinator` — In-Process Executor Dispatch
```rust
// DashMap shard released before .await by cloning the Sender
let tx: Sender<TaskAssignment> = {
    self.executor_channels
        .get(&agent_id)
        .map(|entry| entry.value().clone())
        .ok_or(CoordinatorError::AgentNotFound(agent_id.clone()))?
};
tx.send(assignment).await?;
```
The `DashMap::get` guard (which holds a shard read lock) is dropped at the end of the block. The `.await` on `tx.send` occurs after the lock is released, eliminating the risk of a Tokio task yielding while holding a `DashMap` shard lock.
---
## Consequences
### Positive
- Deploy VAPORA, activate a capability by name, and it works — no manual system prompt engineering required.
- TOML overrides allow per-deployment model preferences without code changes or recompilation of the capabilities crate.
- `AgentDefinition` in `vapora-shared` is architecturally correct: it is a DTO with no behavior, and `vapora-shared` is the designated home for shared data types.
- The `DashMap` lock-before-await anti-pattern is eliminated from `AgentCoordinator`.
### Negative
- `vapora-agents` binary depends on `vapora-capabilities`. Adding a new built-in capability requires recompiling and redeploying the agents server. Capabilities are not dynamically loaded.
- Built-in system prompts are opinionated. An operator who disagrees with the `CodeReviewer` prompt must override it explicitly; there is no mechanism to reset to "no prompt" after a capability is activated short of removing the config entry.
### Neutral
- Partial TOML override via `Option<T>` fields means unset override fields silently preserve the built-in value. This is the intended behavior but requires operators to understand that omitting a field is not the same as clearing it.
- `vapora-capabilities` has no runtime dependency on SurrealDB or NATS — it is a pure configuration and trait-definition crate. It can be unit-tested without any external services.
---
## Alternatives Considered
### `CapabilityRegistry` Initialization in `vapora-backend`
Register agents via HTTP call to the agents server at backend startup. Rejected: it adds a temporal dependency (agents server must be up before backend can finish booting), introduces a network round-trip on startup, and couples the backend's readiness to the agents server's availability — three failure modes for what is a configuration operation.
### Keep `AgentDefinition` in `vapora-agents`, Expose Capabilities via a Plugin Trait
Define a `CapabilityPlugin` trait in a thin interface crate; `vapora-agents` depends on the interface, `vapora-capabilities` implements it. Rejected: this adds a third crate (`vapora-capability-api`) to resolve what is fundamentally a DTO placement error. `AgentDefinition` has no behavior to isolate behind a trait — moving it to `vapora-shared` is the minimal correct fix.
### Static TOML Baked into `agents.toml` at CLI Time
Generate capability configuration as a static TOML block via `vapora-cli` and write it into `agents.toml`. Rejected: this loses runtime composability (capabilities cannot be activated or deactivated without regenerating the file), makes the override system asymmetric (CLI generates, operator edits by hand), and provides no registry abstraction for future dynamic activation.
---
## Relation to Prior Decisions
- Builds on `AgentCoordinator` patterns established in ADR-0024 and refined in ADR-0033; in-process dispatch via `DashMap<String, Sender>` follows the same lock-discipline patterns applied to the workflow engine.
- `AgentDefinition` relocation is consistent with ADR-0001 (workspace crate responsibilities): `vapora-shared` holds types depended on by multiple crates; orchestration crates hold behavior.
- TOML override resolution follows the same `Option<T>`-merge pattern used in `vapora-llm-router::config` for per-rule model overrides (ADR-0012).


@ -83,6 +83,7 @@ Decisiones únicas que diferencian a VAPORA de otras plataformas de orquestació
| [033](./0033-stratum-orchestrator-workflow-hardening.md) | Workflow Engine Hardening — Persistence · Saga · Cedar | SurrealDB persistence + Saga best-effort rollback + Cedar per-stage auth; stratum patterns implemented natively (no path dep) | ✅ Implemented |
| [034](./0034-autonomous-scheduling.md) | Autonomous Scheduling — Timezone Support and Distributed Fire-Lock | `chrono-tz` IANA-aware cron evaluation + SurrealDB conditional UPDATE fire-lock; no external lock service required | ✅ Implemented |
| [035](./0035-notification-channels.md) | Webhook-Based Notification Channels — `vapora-channels` Crate | Trait-based webhook delivery (Slack/Discord/Telegram) + `${VAR}` secret resolution built into `ChannelRegistry::from_config`; fire-and-forget via `tokio::spawn` | ✅ Implemented |
| [036](./0036-kg-hybrid-search.md) | Knowledge Graph Hybrid Search — HNSW + BM25 + RRF | SurrealDB 3 native HNSW vector index + BM25 full-text; RRF(k=60) fusion; fixes schema bug causing silent empty reads | ✅ Implemented |
---


@ -7,3 +7,4 @@ VAPORA capabilities and overview documentation.
- **[Features Overview](overview.md)** — Complete feature list and descriptions including learning-based agent selection, cost optimization, and swarm coordination
- **[Workflow Orchestrator](workflow-orchestrator.md)** — Multi-stage pipelines, approval gates, artifacts, autonomous scheduling, and distributed fire-lock
- **[Notification Channels](notification-channels.md)** — Webhook delivery to Slack, Discord, and Telegram with built-in secret resolution
- **[Capability Packages](capability-packages.md)** — Pre-built domain-optimized agent bundles (CodeReviewer, DocGenerator, PRMonitor) with system prompts, tool configs, and LLM preferences


@ -0,0 +1,113 @@
# Pre-built Capability Packages
The `vapora-capabilities` crate provides a registry of pre-configured agent capabilities. Each capability is a self-contained bundle that defines everything an agent needs to operate: its system prompt, model preferences, task types, MCP tool bindings, scheduling priority, and inference temperature.
## Overview
A capability package encapsulates:
- `system_prompt` — the instruction context injected as `Role::System` into every LLM call
- `preferred_provider` and `preferred_model` — resolved at runtime through `LLMRouter`
- `task_types` — the set of task labels this capability handles (used by the swarm coordinator for assignment)
- `mcp_tools` — tool names exposed to the agent via the MCP gateway
- `priority` — integer weight (0–100) for swarm scheduling decisions
- `temperature` — controls output determinism; lower values for review/analysis, higher for generative tasks
The value proposition is operational simplicity: call `CapabilityRegistry::with_built_ins()`, pass the resulting definitions to `AgentExecutor`, and the agents are ready. No manual system prompt authoring, no per-agent LLM config wiring.
## Built-in Capabilities
| ID | Role | Purpose |
|----|------|---------|
| `code-reviewer` | `code_reviewer` | Security and correctness-focused code review. Uses Claude Opus 4.6 at temperature 0.1. MCP tools: `file_read`, `file_list`, `git_diff`, `code_search` |
| `doc-generator` | `documenter` | Generates technical documentation from source code. Uses Claude Sonnet 4.6 at temperature 0.3. MCP tools: `file_read`, `file_list`, `code_search`, `file_write` |
| `pr-monitor` | `monitor` | PR health monitoring and merge readiness assessment. Uses Claude Sonnet 4.6 at temperature 0.1. MCP tools: `git_diff`, `git_log`, `git_status`, `file_list`, `file_read` |
## Runtime Flow
At agent server startup:
```rust
let registry = CapabilityRegistry::with_built_ins();
```
This populates the registry with all built-in `CapabilityPackage` instances. When an agent is activated:
```rust
let definition: AgentDefinition = registry.activate("code-reviewer")?;
```
`activate()` resolves the capability into an `AgentDefinition` with `system_prompt` fully populated. `AgentExecutor` receives this definition and prepends the prompt as `Role::System` on every invocation before forwarding the request to the LLM:
```rust
let response = router
    .complete_with_budget(role, model, messages_with_system, budget_ctx)
    .await?;
```
`LLMRouter::complete_with_budget()` applies the capability's `preferred_provider`, `preferred_model`, and token budget. If the budget is exhausted, the router falls back through the configured fallback chain transparently.
## TOML Customization
Capabilities can be overridden or extended via a TOML file without modifying built-ins:
```toml
# Override built-in: use Sonnet instead of Opus for code-reviewer
[[override]]
id = "code-reviewer"
preferred_model = "claude-sonnet-4-6"
max_tokens = 16384

# Add a custom capability
[[custom]]
id = "db-optimizer"
display_name = "Database Optimizer"
description = "Analyzes and optimizes SQL queries and schema"
agent_role = "db_optimizer"
task_types = ["db_optimization", "query_review"]
system_prompt = "You are a database performance expert..."
mcp_tools = ["file_read", "code_search"]
preferred_provider = "claude"
preferred_model = "claude-sonnet-4-6"
max_tokens = 4096
temperature = 0.2
priority = 70
parallelizable = true
```
Load and apply with:
```rust
use vapora_capabilities::{CapabilityRegistry, CapabilityLoader};
let registry = CapabilityRegistry::with_built_ins();
CapabilityLoader::load_and_apply("config/capabilities.toml", &registry)?;
```
`load_and_apply` merges `[[override]]` entries into existing built-ins (only the fields present in TOML are replaced) and inserts `[[custom]]` entries as new packages. The registry never removes entries after construction; an override takes effect by replacing the matching entry in place.
## Architecture
`vapora-capabilities` sits at the base of the dependency graph to avoid circular imports:
```text
vapora-shared
└── vapora-capabilities   (depends on vapora-shared for AgentDefinition)
    └── vapora-agents     (depends on vapora-capabilities for registry access)
```
`vapora-capabilities` imports only `vapora-shared` types (`AgentDefinition`, `AgentRole`, `LLMProvider`). It has no dependency on `vapora-agents`. This means capability definitions can be constructed, tested, and loaded independently of the agent runtime.
## Environment Variables
These configure the LLM router used by all executors that receive a capability definition:
| Variable | Purpose |
|----------|---------|
| `LLM_ROUTER_CONFIG` | Path to the router configuration TOML file |
| `ANTHROPIC_API_KEY` | API key for Claude models (Opus, Sonnet) |
| `OPENAI_API_KEY` | API key for OpenAI models (GPT-4, etc.) |
| `OLLAMA_URL` | Base URL for a local Ollama instance |
| `OLLAMA_MODEL` | Default model name to use when routing to Ollama |
The router config at `LLM_ROUTER_CONFIG` defines routing rules, fallback chains, and per-role budget limits. Capability packages specify a `preferred_provider` and `preferred_model`, but the router enforces budget constraints and can override the preference if a fallback is triggered.


@ -42,6 +42,7 @@ Channel names (`team-slack`, `ops-discord`, `alerts-telegram`) are arbitrary ide
on_task_done = ["team-slack"]
on_proposal_approved = ["team-slack", "ops-discord"]
on_proposal_rejected = ["ops-discord"]
on_agent_inactive = ["ops-telegram"]
```
Each key is an event name; the value is a list of channel names declared in `[channels.*]`. An empty list or absent key means no notification for that event.
@ -199,6 +200,7 @@ There is no built-in retry. A channel that is consistently unreachable produces
| `on_task_done` | Task moved to `Done` status | `Success` |
| `on_proposal_approved` | Proposal approved via API | `Success` |
| `on_proposal_rejected` | Proposal rejected via API | `Warning` |
| `on_agent_inactive` | Agent status transitions to `Inactive` | `Error` |
| `on_stage_complete` | Workflow stage finished | `Info` |
| `on_stage_failed` | Workflow stage failed | `Warning` |
| `on_completed` | Workflow reached terminal `Completed` state | `Success` |


@ -0,0 +1,232 @@
# Capability Packages Guide
## What Is a Capability Package
A capability package bundles everything an agent needs to handle a specific domain into a single reusable unit. Activating one produces an `AgentDefinition` that the coordinator registers and routes tasks to.
Each package carries:
- `system_prompt` — domain-optimized instructions injected as the LLM system message before every task execution
- `preferred_model` / `preferred_provider` — e.g. `claude-opus-4-6` for deep code reasoning, `claude-sonnet-4-6` for cost-efficient writing tasks
- `task_types` — strings matched by the coordinator's `extract_task_type` heuristic against task titles and descriptions to select this agent
- `mcp_tools` — list of MCP tool IDs (`file_read`, `git_diff`, etc.) activated for this agent via `vapora-mcp-server`
- `temperature` / `max_tokens` / `priority` / `parallelizable` — execution parameters controlling output quality, cost, scheduling order, and concurrency
## Built-in Capabilities
| ID | Role | Model | Temp | Max Tokens | Use Case |
|----|------|-------|------|------------|----------|
| `code-reviewer` | `code_reviewer` | `claude-opus-4-6` | 0.1 | 8192 | Security and correctness review; JSON output with severity levels and `merge_ready` flag |
| `doc-generator` | `documenter` | `claude-sonnet-4-6` | 0.3 | 16384 | Source-to-documentation generation with rustdoc/JSDoc/docstring output |
| `pr-monitor` | `monitor` | `claude-sonnet-4-6` | 0.1 | 4096 | PR health check; `READY` / `NEEDS_REVIEW` / `BLOCKED` status output |
The `code-reviewer` uses Opus 4.6 because review tasks benefit from deep reasoning over complex code patterns. Temperature 0.1 keeps findings consistent across repeated runs on the same diff. `pr-monitor` is `parallelizable = false`: concurrent runs on the same PR would produce conflicting status reports.
## Activating Built-ins at Runtime
The agent server calls `CapabilityRegistry::with_built_ins()` at startup automatically. All three built-ins are registered and their executors spawned before the HTTP listener opens — no action required when running the standard agent server (`crates/vapora-agents`).
For programmatic use:
```rust
use vapora_capabilities::CapabilityRegistry;
let registry = CapabilityRegistry::with_built_ins();
// "code-reviewer", "doc-generator", "pr-monitor" are now registered
let def = registry.activate("code-reviewer")?;
// def.role == "code_reviewer"
// def.system_prompt == Some("<full review prompt>")
// def.llm_model == "claude-opus-4-6"
// def.llm_provider == "claude"
```
`activate` returns an `AgentDefinition` from `vapora-shared`. The system prompt is embedded in the definition and available at `def.system_prompt` — the executor injects it before every task without any further lookup.
## Overriding a Built-in
### Via TOML Config File
Override fields are applied on top of the existing built-in spec. Only fields present in TOML are changed; everything else keeps its default. An unknown override `id` is skipped with a warning, not an error.
```toml
# config/capabilities.toml
# Switch code-reviewer to Sonnet for cost savings
[[override]]
id = "code-reviewer"
preferred_model = "claude-sonnet-4-6"
max_tokens = 16384

# Replace the doc-generator system prompt for your tech stack
[[override]]
id = "doc-generator"
system_prompt = """
You are a technical documentation specialist for Rust async systems.
Follow rustdoc conventions. All examples must be runnable.
"""
```
Load and apply at startup (or on config reload):
```rust
use vapora_capabilities::{CapabilityRegistry, CapabilityLoader};
let registry = CapabilityRegistry::with_built_ins();
CapabilityLoader::load_and_apply("config/capabilities.toml", &registry)?;
```
`load_and_apply` reads the file, parses TOML, and applies overrides + custom entries in one call. The call is idempotent — re-applying the same file replaces existing specs rather than erroring.
### Via the Registry API Directly
```rust
use vapora_capabilities::{CapabilityRegistry, CapabilitySpec, CustomCapability};
let registry = CapabilityRegistry::with_built_ins();
// Fetch the current spec, mutate it, push it back
let mut spec = registry.get("code-reviewer").unwrap().spec();
spec = spec.with_model("claude-sonnet-4-6").with_max_tokens(16384);
registry.override_spec("code-reviewer", spec)?;
// Returns CapabilityError::NotFound if the id is not registered
// Returns CapabilityError::InvalidSpec if the spec id does not match the target id
```
## Adding a Custom Capability
Custom entries in TOML are full `CapabilitySpec` definitions — all fields are required. They are registered with `register_or_replace`, so re-applying the config is safe.
```toml
[[custom]]
id = "db-optimizer"
display_name = "Database Optimizer"
description = "Analyzes and optimizes SurrealQL queries and schema"
agent_role = "db_optimizer"
task_types = ["db_optimization", "query_review", "schema_review"]
system_prompt = """
You are a SurrealDB performance expert.
Analyze queries and schema definitions for: index usage, full-table scans,
unnecessary JOINs, missing composite indexes.
Output JSON: { "issues": [...], "optimized_query": "...", "index_suggestions": [...] }
"""
mcp_tools = ["file_read", "code_search"]
preferred_provider = "claude"
preferred_model = "claude-sonnet-4-6"
max_tokens = 4096
temperature = 0.1
priority = 75
parallelizable = true
```
The `task_types` list must overlap with words present in task titles or descriptions. The coordinator's heuristic tokenizes the task text and checks for matches against registered task-type strings. If no match is found, the task falls back to default role assignment. Use lowercase snake\_case strings that reflect verbs and nouns users will write in task titles (`"query_review"`, `"db_optimization"`).
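A heuristic of this kind might look like the following sketch (illustrative only; the coordinator's actual `extract_task_type` may normalize and tokenize differently):

```rust
/// Normalize task text toward lowercase snake_case and check whether any
/// registered task-type string appears in it.
fn matches_task_type(task_text: &str, task_types: &[&str]) -> bool {
    let normalized = task_text.to_lowercase().replace([' ', '-'], "_");
    task_types.iter().any(|ty| normalized.contains(ty))
}

fn main() {
    let task_types = ["db_optimization", "query_review"];
    // A title like "Query review for billing service" matches "query_review"
    println!("{}", matches_task_type("Query review for billing service", &task_types));
    println!("{}", matches_task_type("Refactor logging", &task_types));
}
```

This is why snake_case task types built from words users actually write in titles match reliably, while invented compound terms never fire.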
## Environment Variables
The agent server reads provider credentials from the environment at startup to configure the LLM router.
| Variable | Effect |
|----------|--------|
| `LLM_ROUTER_CONFIG` | Path to a `llm-router.toml` file; takes precedence over all individual API key variables |
| `ANTHROPIC_API_KEY` | Enables the `claude` provider; default model `claude-sonnet-4-6` |
| `OPENAI_API_KEY` | Enables the `openai` provider; default model `gpt-4o` |
| `OLLAMA_URL` | Enables the `ollama` provider (e.g. `http://localhost:11434`) |
| `OLLAMA_MODEL` | Model used with Ollama (default: `llama3.2`) |
| `BUDGET_CONFIG_PATH` | Path to budget config file (default: `config/agent-budgets.toml`) |
If none of `LLM_ROUTER_CONFIG`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, or `OLLAMA_URL` are set, executors run in stub mode — tasks are accepted and return placeholder responses. This is intentional for integration tests and offline development.
## Checking What Is Registered
```rust
let ids = registry.list_ids();
// sorted alphabetically: ["code-reviewer", "doc-generator", "pr-monitor"]
let count = registry.len(); // 3

// Check and activate a specific capability
if registry.contains("db-optimizer") {
    let def = registry.activate("db-optimizer")?;
    println!("role: {}, model: {}", def.role, def.llm_model);
}

// Iterate all registered capabilities (order is HashMap-based, not sorted)
for cap in registry.list_all() {
    let spec = cap.spec();
    println!("{}: {} ({})", spec.id, spec.display_name, spec.preferred_model);
}
```
## Capability Spec Field Reference
| Field | Type | Description |
|-------|------|-------------|
| `id` | `String` | Unique kebab-case identifier (e.g., `"code-reviewer"`) |
| `display_name` | `String` | Human-readable name shown in UIs and logs |
| `description` | `String` | Brief purpose description embedded in the agent's log entries |
| `agent_role` | `String` | Role name used by the coordinator for task routing (e.g., `"code_reviewer"`) |
| `task_types` | `Vec<String>` | Keywords matched against task text by the coordinator heuristic |
| `system_prompt` | `String` | Full system message injected before every task execution |
| `mcp_tools` | `Vec<String>` | MCP tool IDs available to this agent via `vapora-mcp-server` |
| `preferred_provider` | `String` | LLM provider name (`"claude"`, `"openai"`, `"ollama"`) |
| `preferred_model` | `String` | Model ID within the provider (e.g., `"claude-opus-4-6"`) |
| `max_tokens` | `u32` | Maximum output tokens per task execution |
| `temperature` | `f32` | Sampling temperature 0.0–1.0; lower = more deterministic |
| `priority` | `u32` | Assignment priority 0–100; higher = preferred when multiple agents match |
| `parallelizable` | `bool` | Whether multiple instances may run concurrently for the same task type |
## Writing Your Own Built-in
Built-ins are unit structs in `crates/vapora-capabilities/src/built_in/`. Follow this pattern:
```rust
// crates/vapora-capabilities/src/built_in/sql_optimizer.rs
use crate::capability::{Capability, CapabilitySpec};
const SYSTEM_PROMPT: &str = r#"You are a SurrealDB query optimization expert.
Analyze the provided query or schema definition.
Output JSON: { "issues": [...], "optimized": "...", "indexes": [...] }"#;
#[derive(Debug)]
pub struct SqlOptimizer;
impl Capability for SqlOptimizer {
    fn spec(&self) -> CapabilitySpec {
        CapabilitySpec {
            id: "sql-optimizer".to_string(),
            display_name: "SQL Optimizer".to_string(),
            description: "Optimizes SurrealQL queries and schema definitions".to_string(),
            agent_role: "sql_optimizer".to_string(),
            task_types: vec![
                "sql_optimization".to_string(),
                "query_review".to_string(),
                "schema_review".to_string(),
            ],
            system_prompt: SYSTEM_PROMPT.to_string(),
            mcp_tools: vec!["file_read".to_string(), "code_search".to_string()],
            preferred_provider: "claude".to_string(),
            preferred_model: "claude-sonnet-4-6".to_string(),
            max_tokens: 4096,
            temperature: 0.1,
            priority: 75,
            parallelizable: true,
        }
    }
}
```
Then wire it into the module and registry:
```rust
// crates/vapora-capabilities/src/built_in/mod.rs
mod sql_optimizer;
pub use sql_optimizer::SqlOptimizer;
```
```rust
// crates/vapora-capabilities/src/registry.rs — inside with_built_ins()
registry.register(SqlOptimizer).expect("sql-optimizer id collision");
```
The `expect` on `register` is intentional — built-in IDs are unique by construction, and a collision at startup indicates a programming error that must be caught during development, not at runtime.


@ -21,6 +21,7 @@
- [Features Overview](../features/README.md)
- [Platform Capabilities](../features/overview.md)
- [Capability Packages](../features/capability-packages.md)
## Architecture
@@ -63,11 +64,13 @@
- [0026: Shared State](../adrs/0026-shared-state.md)
- [0027: Documentation Layers](../adrs/0027-documentation-layers.md)
- [0033: Workflow Engine Hardening](../adrs/0033-stratum-orchestrator-workflow-hardening.md)
- [0037: Capability Packages](../adrs/0037-capability-packages.md)
## Guides
- [Workflow Persistence, Saga & Cedar](../guides/workflow-saga-persistence.md)
- [RLM Usage Guide](../guides/rlm-usage-guide.md)
- [Capability Packages Guide](../guides/capability-packages-guide.md)
## Integration Guides
@@ -105,8 +108,8 @@
---
-**Documentation Version**: 1.3.0
-**Last Updated**: 2026-02-21
+**Documentation Version**: 1.4.0
+**Last Updated**: 2026-02-26
**Status**: Production Ready
For the latest updates, visit: https://github.com/vapora-platform/vapora


@@ -0,0 +1,31 @@
-- Migration 012: Knowledge Graph Hybrid Search
-- Adds HNSW vector index and BM25 full-text index for hybrid retrieval
-- Fixes missing fields in kg_executions (agent_role, provider, cost_cents)
-- that caused deserialization failures on SELECT queries.
-- Missing fields added to make kg_executions round-trip correctly
DEFINE FIELD agent_role ON TABLE kg_executions TYPE option<string>;
DEFINE FIELD provider ON TABLE kg_executions TYPE string DEFAULT 'unknown';
DEFINE FIELD cost_cents ON TABLE kg_executions TYPE int DEFAULT 0;
-- BM25 full-text search on task descriptions
-- class tokenizer preserves word boundaries (code terms, identifiers)
-- snowball(english) reduces "compiling" → "compil" for better recall
DEFINE ANALYZER kg_text_analyzer
TOKENIZERS class
FILTERS lowercase, snowball(english);
DEFINE INDEX idx_kg_executions_ft
ON TABLE kg_executions
FIELDS task_description
SEARCH ANALYZER kg_text_analyzer BM25;
-- HNSW approximate nearest neighbor index for semantic similarity
-- DIST COSINE: matches cosine similarity used throughout the codebase
-- TYPE F32: float32 embeddings (OpenAI ada-002 / compatible providers)
-- M 16: graph connectivity (16 edges per node, standard for 1536-dim)
-- EFC 200: index build quality vs speed tradeoff (ef_construction)
DEFINE INDEX idx_kg_executions_hnsw
ON TABLE kg_executions
FIELDS embedding
HNSW DIMENSION 1536 DIST COSINE TYPE F32 M 16 EFC 200;
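The two indexes above feed the hybrid retrieval path: BM25 and HNSW each produce a ranked list, which are merged with reciprocal rank fusion (RRF, k = 60). A minimal sketch of RRF over two ranked id lists, assuming ranks start at 1; this is illustrative, not the crate's actual fusion code:

```rust
use std::collections::HashMap;

// Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank(d)).
// k = 60 is the conventional constant and dampens the impact of any
// single list's top ranks.
fn rrf_fuse(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (i, id) in list.iter().enumerate() {
            let rank = i as f64 + 1.0; // ranks are 1-based
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut ranked: Vec<(String, f64)> = scores.into_iter().collect();
    ranked.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    ranked
}

fn main() {
    // Hypothetical result ids from each index.
    let bm25 = vec!["exec:3", "exec:1", "exec:7"]; // full-text ranking
    let hnsw = vec!["exec:3", "exec:9", "exec:1"]; // vector ranking
    let fused = rrf_fuse(&[bm25, hnsw], 60.0);
    // Documents ranked well by both lists rise to the top.
    for (id, score) in &fused {
        println!("{id}: {score:.5}");
    }
}
```

A document that appears in only one list still gets a score, so keyword-only and semantic-only matches both survive fusion instead of being filtered out by an intersection.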