diff --git a/CHANGELOG.md b/CHANGELOG.md index a30c4d0..94e4add 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,6 +7,43 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +### Added - Capability Packages (`vapora-capabilities`) + +#### `vapora-capabilities` — new crate + +- `CapabilitySpec`: full bundle struct — `id`, `display_name`, `description`, `agent_role`, `task_types`, `system_prompt`, `mcp_tools`, `preferred_provider`, `preferred_model`, `max_tokens`, `temperature`, `priority`, `parallelizable` +- `Capability` trait: object-safe (`Send + Sync`), single `fn spec() -> CapabilitySpec` + default `fn to_agent_definition() -> AgentDefinition` +- `CustomCapability(CapabilitySpec)` wrapper for TOML-loaded capabilities +- `CapabilityRegistry`: `parking_lot::RwLock>>` — `register()`, `register_or_replace()`, `override_spec()`, `activate()`, `list_ids()`, `len()` +- `CapabilityLoader`: `parse(toml_str)`, `from_file(path)`, `apply(config, registry)`, `load_and_apply(path, registry)` — partial override via `Option` fields, unknown IDs skipped with warning, idempotent re-application +- Three built-in capabilities: + - `CodeReviewer` — role `code_reviewer`, Claude Opus 4.6, temperature 0.1, max_tokens 8192, tools: file_read/file_list/git_diff/code_search, structured JSON output (severity Critical/High/Medium/Low/Info) + - `DocGenerator` — role `documenter`, Claude Sonnet 4.6, temperature 0.3, max_tokens 16384, tools: file_read/file_list/code_search/file_write, multi-level doc methodology + - `PRMonitor` — role `monitor`, Claude Sonnet 4.6, temperature 0.1, max_tokens 4096, tools: git_diff/git_log/git_status/file_list/file_read, READY/NEEDS_REVIEW/BLOCKED classification +- 22 unit tests + 3 doc-tests + +#### `vapora-shared` — `AgentDefinition` moved here + +- `AgentDefinition` extracted from `vapora-agents::config` into `vapora-shared::agent_definition` +- Re-exported from `vapora-agents::config` for backward compatibility — zero call-site changes +- Breaks the potential `vapora-agents ↔ vapora-capabilities` circular dependency + +#### `vapora-agents` — executor wired to LLM router + capabilities + +- `AgentExecutor::with_router(Arc)` builder — routes real LLM calls through `complete_with_budget()` using `AgentMetadata::system_prompt` as the system message +- `AgentExecutor::execute_task()` — replaces hardcoded stub; dispatches `task.description` + `task.context` as user prompt; provider name tracked in KG persistence record +- `AgentCoordinator::register_executor_channel(agent_id, Sender)` — registers in-process executor channel +- `AgentCoordinator::assign_task()` — dispatches to registered executor channel (in addition to NATS) without holding `DashMap` shard lock across await +- `server.rs` — initializes `CapabilityRegistry::with_built_ins()` at startup; `build_router_from_env()` builds `LLMRouter` from `LLM_ROUTER_CONFIG` file or `ANTHROPIC_API_KEY`/`OPENAI_API_KEY`/`OLLAMA_URL` env vars; spawns one `AgentExecutor` per capability with router wired; agents from `agents.toml` that have no matching capability also get executors + +#### Documentation + +- `docs/features/capability-packages.md` — new feature reference +- `docs/guides/capability-packages-guide.md` — usage guide (activate built-ins, TOML customization, custom capabilities, env vars) +- **ADR-0037**: design rationale for capability packages, dependency inversion via `vapora-shared`, and in-process executor dispatch + +--- + ### Added - Webhook Notification Channels (`vapora-channels`) #### `vapora-channels` — new crate diff --git a/Cargo.lock b/Cargo.lock index c7e291c..f127707 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -12338,6 +12338,7 @@ dependencies = [ "axum", "chrono", "clap", + "dashmap 6.1.0", "futures", "mockall", "rig-core", @@ -12352,6 +12353,7 @@ dependencies = [ "tracing", "tracing-subscriber", "uuid", + "vapora-capabilities", "vapora-knowledge-graph", "vapora-llm-router", "vapora-shared", @@ -12430,6 +12432,20 @@ dependencies = [ "wiremock", ] +[[package]] +name = "vapora-capabilities" +version = "1.2.0" +dependencies = [ + "parking_lot", + "serde", + "serde_json", + "tempfile", + "thiserror 2.0.18", + "toml 0.9.8", + "tracing", + "vapora-shared", +] + [[package]] name = "vapora-channels" version = "1.2.0" diff --git a/Cargo.toml b/Cargo.toml index bd31825..4885e64 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -3,6 +3,7 @@ resolver = "2" members = [ + "crates/vapora-capabilities", "crates/vapora-channels", "crates/vapora-backend", "crates/vapora-frontend", @@ -37,6 +38,7 @@ categories = ["development-tools", "web-programming"] [workspace.dependencies] # Vapora internal crates +vapora-capabilities = { path = "crates/vapora-capabilities" } vapora-channels = { path = "crates/vapora-channels" } vapora-shared = { path = "crates/vapora-shared" } vapora-leptos-ui = { path = "crates/vapora-leptos-ui" } diff --git a/assets/web/index.html b/assets/web/index.html index 09b6cbc..66c4e21 100644 --- a/assets/web/index.html +++ b/assets/web/index.html @@ -1 +1 @@ - Vapora
✅ v1.2.0 | 372 Tests | 100% Pass Rate
Vapora - Development Orchestration

Evaporate complexity

Development Flows

Specialized agents orchestrate pipelines for design, implementation, testing, documentation and deployment. Agents learn from history and optimize costs automatically.
100% self-hosted.

The 4 Problems It Solves

01

Context Switching

Developers jump between tools constantly. Vapora unifies everything in one intelligent system where context flows.

02

Knowledge Fragmentation

Decisions lost in threads, code scattered, docs unmaintained. RLM (Recursive Language Models) with hybrid search (BM25 + semantic) and chunking makes knowledge discoverable even in 100k+ token documents.

03

Manual Coordination

Orchestrating code review, testing, documentation and deployment manually creates bottlenecks. Multi-agent workflows solve this.

04

Dev-Ops Friction

Handoffs between developers and operations lack visibility and context. Vapora maintains unified deployment readiness.

How It Works

🤖

Specialized Agents

71 tests verify agent orchestration, learning profiles, and task assignment. Agents track expertise per task type with 7-day recency bias (3× weight). Real SurrealDB persistence + NATS coordination.

🧠

Intelligent Orchestration

53 tests verify multi-provider routing (Claude, OpenAI, Gemini, Ollama), per-role budget limits, cost tracking, and automatic fallback chains. Swarm coordination with load-balanced assignment using success_rate / (1 + load) formula.

📚

Recursive Language Models (RLM)

Process 100k+ token documents without context limits. Hybrid search combines BM25 (keywords) + semantic embeddings via RRF fusion. Intelligent chunking (Fixed/Semantic/Code) with SurrealDB persistence. Perfect for large codebases and documentation.

🔗

Agent-to-Agent (A2A) Protocol

Distributed agent coordination with task dispatch, status tracking, and result collection. Real SurrealDB persistence (no in-memory HashMap). NATS messaging for async completion. Exponential backoff retry with circuit breaker. 12 integration tests verify real behavior.

🕸️

Knowledge Graph

Temporal execution history with causal relationships. Learning curves from daily windowed aggregations. Similarity search recommends solutions from past tasks. 20 tests verify graph persistence, learning profiles, and execution tracking.

NATS JetStream

Reliable message delivery for agent coordination. JetStream streams for workflow events, task completion, and status updates. Graceful fallback when NATS unavailable. Background subscribers with DashMap for async result delivery.

🗄️

SurrealDB

Multi-model database with graph capabilities. Multi-tenant scopes for workspace isolation. Native graph relations for Knowledge Graph. All queries use parameterized bindings for security. SCHEMAFULL tables with explicit indexes.

🔌

Backend API & MCP Connectors

40+ REST endpoints (projects, tasks, agents, workflows, swarm). WebSocket real-time updates. MCP gateway for external tool integration and plugin system. Multi-tenant SurrealDB scopes. Prometheus metrics at /metrics. 161 tests verify API correctness.

☸️

Cloud-Native & Self-Hosted

161 backend tests + K8s manifests with Kustomize overlays. Health checks, Prometheus metrics (/metrics endpoint), StatefulSets with anti-affinity. Local Docker Compose for development. Zero vendor lock-in.

Technology Stack

Rust (18 crates)Axum REST APISurrealDBNATS JetStreamLeptos WASMKubernetesPrometheusKnowledge GraphRLM (Hybrid Search)A2A ProtocolMCP Server

Available Agents

ArchitectSystem design
DeveloperCode implementation
CodeReviewerQuality assurance
TesterTests & benchmarks
DocumenterDocumentation
MarketerMarketing content
PresenterPresentations
DevOpsCI/CD deployment
MonitorHealth & alerting
SecurityAudit & compliance
ProjectManagerRoadmap tracking
DecisionMakerConflict resolution

Ready for intelligent orchestration?

Built with Rust 🦀 | Open Source | Self-Hosted

Explore on GitHub →

Vapora v1.2.0

Made with Vapora dreams and Rust reality ✨

Intelligent Development Orchestration | Multi-Agent Multi-IA Platform

+ Vapora
✅ v1.2.0 | 620 Tests | 100% Pass Rate
Vapora - Development Orchestration

Evaporate complexity

Development Flows

Specialized agents orchestrate pipelines for design, implementation, testing, documentation and deployment. Agents learn from history and optimize costs automatically.
100% self-hosted.

The 4 Problems It Solves

01

Context Switching

Developers jump between tools constantly. Vapora unifies everything in one intelligent system where context flows.

02

Knowledge Fragmentation

Decisions lost in threads, code scattered, docs unmaintained. RLM (Recursive Language Models) with hybrid search (BM25 + semantic) and chunking makes knowledge discoverable even in 100k+ token documents.

03

Manual Coordination

Orchestrating code review, testing, documentation and deployment manually creates bottlenecks. Multi-agent workflows solve this.

04

Dev-Ops Friction

Handoffs between developers and operations lack visibility and context. Vapora maintains unified deployment readiness.

How It Works

🤖

Specialized Agents

71 tests verify agent orchestration, learning profiles, and task assignment. Agents track expertise per task type with 7-day recency bias (3× weight). Real SurrealDB persistence + NATS coordination.

🧠

Intelligent Orchestration

53 tests verify multi-provider routing (Claude, OpenAI, Gemini, Ollama), per-role budget limits, cost tracking, and automatic fallback chains. Swarm coordination with load-balanced assignment using success_rate / (1 + load) formula.

📚

Recursive Language Models (RLM)

Process 100k+ token documents without context limits. Hybrid search combines BM25 (keywords) + semantic embeddings via RRF fusion. Intelligent chunking (Fixed/Semantic/Code) with SurrealDB persistence. Perfect for large codebases and documentation.

🔗

Agent-to-Agent (A2A) Protocol

Distributed agent coordination with task dispatch, status tracking, and result collection. Real SurrealDB persistence (no in-memory HashMap). NATS messaging for async completion. Exponential backoff retry with circuit breaker. 12 integration tests verify real behavior.

🕸️

Knowledge Graph

Temporal execution history with causal relationships. Learning curves from daily windowed aggregations. Similarity search recommends solutions from past tasks. 20 tests verify graph persistence, learning profiles, and execution tracking.

NATS JetStream

Reliable message delivery for agent coordination. JetStream streams for workflow events, task completion, and status updates. Graceful fallback when NATS unavailable. Background subscribers with DashMap for async result delivery.

🗄️

SurrealDB

Multi-model database with graph capabilities. Multi-tenant scopes for workspace isolation. Native graph relations for Knowledge Graph. All queries use parameterized bindings for security. SCHEMAFULL tables with explicit indexes.

🔌

Backend API & MCP Connectors

40+ REST endpoints (projects, tasks, agents, workflows, swarm). WebSocket real-time updates. MCP gateway for external tool integration and plugin system. Multi-tenant SurrealDB scopes. Prometheus metrics at /metrics. 161 tests verify API correctness.

☸️

Cloud-Native & Self-Hosted

161 backend tests + K8s manifests with Kustomize overlays. Health checks, Prometheus metrics (/metrics endpoint), StatefulSets with anti-affinity. Local Docker Compose for development. Zero vendor lock-in.

Autonomous Scheduling

Cron-triggered workflow execution with IANA timezone support via chrono-tz. Distributed fire-lock using SurrealDB conditional UPDATE prevents double-fires across multi-instance deployments — no external lock service required. 48 tests.

🔔

Webhook Notifications

Real-time alerts to Slack, Discord, and Telegram — no vendor SDKs. ${VAR} secret resolution is built into ChannelRegistry construction; tokens never reach the HTTP layer unresolved. Fire-and-forget hooks on task completion, proposal approval/rejection, and workflow lifecycle events.

🧩

Capability Packages

Domain-optimized agent bundles — system prompt, preferred LLM model, task types, and MCP tools pre-configured per role. Three built-ins (code-reviewer, doc-generator, pr-monitor) loaded at startup via CapabilityRegistry. TOML overrides let you swap model or prompt without code changes. In-process executor dispatch via DashMap channels — no NATS required for standalone mode. 22 tests.

Technology Stack

Rust (21 crates)Axum REST APISurrealDBNATS JetStreamLeptos WASMKubernetesPrometheusKnowledge GraphRLM (Hybrid Search)A2A ProtocolMCP Serverchrono-tz (Cron)Webhook ChannelsCapability Packages

Available Agents

ArchitectSystem design
DeveloperCode implementation
CodeReviewerQuality assurance
TesterTests & benchmarks
DocumenterDocumentation
MarketerMarketing content
PresenterPresentations
DevOpsCI/CD deployment
MonitorHealth & alerting
SecurityAudit & compliance
ProjectManagerRoadmap tracking
DecisionMakerConflict resolution

Ready for intelligent orchestration?

Built with Rust 🦀 | Open Source | Self-Hosted

Explore on GitHub →

Vapora v1.2.0

Made with Vapora dreams and Rust reality ✨

Intelligent Development Orchestration | Multi-Agent Multi-IA Platform

diff --git a/assets/web/src/index.html b/assets/web/src/index.html index 61aaece..b7dc4a1 100644 --- a/assets/web/src/index.html +++ b/assets/web/src/index.html @@ -516,8 +516,8 @@
- ✅ v1.2.0 | 372 Tests | 100% Pass Rate✅ v1.2.0 | 620 Tests | 100% Pass Rate
Vapora - Development Orchestration @@ -821,6 +821,24 @@ Real-time alerts to Slack, Discord, and Telegram — no vendor SDKs. ${VAR} secret resolution is built into ChannelRegistry construction; tokens never reach the HTTP layer unresolved. Fire-and-forget hooks on task completion, proposal approval/rejection, and workflow lifecycle events.

+
+
🧩
+

+ Capability Packages +

+

+ Domain-optimized agent bundles — system prompt, preferred LLM model, task types, and MCP tools pre-configured per role. Three built-ins (code-reviewer, doc-generator, pr-monitor) loaded at startup via CapabilityRegistry. TOML overrides let you swap model or prompt without code changes. In-process executor dispatch via DashMap channels — no NATS required for standalone mode. 22 tests. +

+
@@ -831,7 +849,7 @@ >
- Rust (18 crates) + Rust (21 crates) Axum REST API SurrealDB NATS JetStream @@ -844,6 +862,7 @@ MCP Server chrono-tz (Cron) Webhook Channels + Capability Packages
diff --git a/crates/vapora-agents/Cargo.toml b/crates/vapora-agents/Cargo.toml index 294ee0b..786d33f 100644 --- a/crates/vapora-agents/Cargo.toml +++ b/crates/vapora-agents/Cargo.toml @@ -17,10 +17,14 @@ path = "src/bin/server.rs" [dependencies] # Internal crates vapora-shared = { workspace = true } +vapora-capabilities = { workspace = true } vapora-llm-router = { workspace = true } vapora-knowledge-graph = { workspace = true } vapora-swarm = { workspace = true } +# Concurrent collections +dashmap = { workspace = true } + # Secrets management secretumvault = { workspace = true } diff --git a/crates/vapora-agents/src/bin/server.rs b/crates/vapora-agents/src/bin/server.rs index c1059c5..bd8a4da 100644 --- a/crates/vapora-agents/src/bin/server.rs +++ b/crates/vapora-agents/src/bin/server.rs @@ -1,6 +1,7 @@ //! VAPORA Agent Server Binary //! Provides HTTP server for agent coordination and health checks +use std::collections::HashMap; use std::sync::Arc; use anyhow::Result; @@ -8,9 +9,17 @@ use axum::{extract::State, routing::get, Json, Router}; use clap::Parser; use serde_json::json; use tokio::net::TcpListener; -use tracing::{error, info}; -use vapora_agents::{config::AgentConfig, coordinator::AgentCoordinator, registry::AgentRegistry}; -use vapora_llm_router::{BudgetConfig, BudgetManager}; +use tokio::sync::mpsc; +use tracing::{error, info, warn}; +use vapora_agents::{ + config::AgentConfig, + coordinator::AgentCoordinator, + registry::{AgentMetadata, AgentRegistry}, + runtime::executor::AgentExecutor, +}; +use vapora_capabilities::CapabilityRegistry; +use vapora_llm_router::{BudgetConfig, BudgetManager, LLMRouter, LLMRouterConfig, ProviderConfig}; +use vapora_shared::AgentDefinition; #[derive(Clone)] struct AppState { @@ -42,10 +51,8 @@ struct Args { #[tokio::main] async fn main() -> Result<()> { - // Parse CLI arguments let args = Args::parse(); - // Initialize tracing tracing_subscriber::fmt() .with_env_filter( tracing_subscriber::EnvFilter::from_default_env() @@ -55,11 +62,11 @@ async fn main() -> Result<()> { info!("Starting VAPORA Agent Server"); - // Load configuration from environment + // Load agent configuration let config = AgentConfig::from_env()?; info!("Loaded configuration from environment"); - // Load budget configuration from specified path + // Load budget configuration let budget_config_path = args.budget_config; let budget_manager = match BudgetConfig::load_or_default(&budget_config_path) { Ok(budget_config) => { @@ -84,16 +91,25 @@ async fn main() -> Result<()> { } }; + // Build LLM router from config file or environment credentials + let router = build_router_from_env(); + let router = router.map(Arc::new); + + // Initialize capability registry with built-in capability packages + let cap_registry = CapabilityRegistry::with_built_ins(); + info!( + "Capability registry initialized: {:?}", + cap_registry.list_ids() + ); + // Initialize agent registry and coordinator - // Max 10 agents per role (can be configured via environment) let max_agents_per_role = std::env::var("MAX_AGENTS_PER_ROLE") .ok() .and_then(|v| v.parse().ok()) .unwrap_or(10); let registry = Arc::new(AgentRegistry::new(max_agents_per_role)); - let mut coordinator = AgentCoordinator::new(config, registry).await?; + let mut coordinator = AgentCoordinator::new(config.clone(), Arc::clone(®istry)).await?; - // Attach budget manager to coordinator if available if let Some(ref bm) = budget_manager { coordinator = coordinator.with_budget_manager(bm.clone()); info!("Budget enforcement enabled for agent coordinator"); @@ -101,9 +117,32 @@ async fn main() -> Result<()> { let coordinator = Arc::new(coordinator); + // Spawn one executor per built-in capability, each wired to the LLM router. + // The executor's channel sender is registered with the coordinator so that + // assign_task() dispatches directly in-process. + for cap_id in cap_registry.list_ids() { + spawn_capability_executor( + &cap_id, + &cap_registry, + ®istry, + &coordinator, + router.as_ref(), + ); + } + + // Spawn executors for any agents defined in agents.toml that are NOT + // already covered by a capability package (role not registered yet). + for agent_def in &config.agents { + if registry.get_agents_by_role(&agent_def.role).is_empty() { + spawn_single_config_executor(agent_def, ®istry, &coordinator, router.as_ref()); + } + } + + info!("{} agents registered", registry.total_count()); + // Start coordinator let _coordinator_handle = { - let coordinator = coordinator.clone(); + let coordinator = Arc::clone(&coordinator); tokio::spawn(async move { if let Err(e) = coordinator.start().await { error!("Coordinator error: {}", e); @@ -111,32 +150,221 @@ async fn main() -> Result<()> { }) }; - // Build application state let state = AppState { coordinator, budget_manager, }; - // Build HTTP router let app = Router::new() .route("/health", get(health_handler)) .route("/ready", get(readiness_handler)) .with_state(state); - // Start HTTP server let addr = std::env::var("BIND_ADDR").unwrap_or_else(|_| "0.0.0.0:9000".to_string()); info!("Agent server listening on {}", addr); let listener = TcpListener::bind(&addr).await?; - axum::serve(listener, app).await?; - // Note: coordinator_handle would be awaited here if needed, - // but axum::serve blocks until server shutdown Ok(()) } -/// Health check endpoint +/// Activate a capability, register the resulting agent, and spawn its executor. +fn spawn_capability_executor( + cap_id: &str, + cap_registry: &CapabilityRegistry, + registry: &Arc, + coordinator: &Arc, + router: Option<&Arc>, +) { + let def = match cap_registry.activate(cap_id) { + Ok(d) => d, + Err(e) => { + warn!("Failed to activate capability '{}': {}", cap_id, e); + return; + } + }; + + let metadata = agent_metadata_from_definition(&def); + let agent_id = metadata.id.clone(); + + if let Err(e) = registry.register_agent(metadata.clone()) { + warn!( + "Failed to register agent for capability '{}': {}", + cap_id, e + ); + return; + } + + let (tx, rx) = mpsc::channel::(32); + let executor = build_executor(metadata, rx, router); + + tokio::spawn(executor.run()); + coordinator.register_executor_channel(agent_id, tx); + + info!( + "Spawned executor for capability '{}' (role: {}, provider: {}, model: {})", + cap_id, def.role, def.llm_provider, def.llm_model + ); +} + +fn spawn_single_config_executor( + agent_def: &AgentDefinition, + registry: &Arc, + coordinator: &Arc, + router: Option<&Arc>, +) { + let metadata = agent_metadata_from_definition(agent_def); + let agent_id = metadata.id.clone(); + + if let Err(e) = registry.register_agent(metadata.clone()) { + warn!( + "Failed to register config agent '{}': {}", + agent_def.role, e + ); + return; + } + + let (tx, rx) = mpsc::channel::(32); + let executor = build_executor(metadata, rx, router); + + tokio::spawn(executor.run()); + coordinator.register_executor_channel(agent_id, tx); + + info!( + "Spawned executor for config agent (role: {}, provider: {})", + agent_def.role, agent_def.llm_provider + ); +} + +fn agent_metadata_from_definition(def: &AgentDefinition) -> AgentMetadata { + let m = AgentMetadata::new( + def.role.clone(), + format!("{}-agent", def.role), + def.llm_provider.clone(), + def.llm_model.clone(), + def.capabilities.clone(), + ); + match &def.system_prompt { + Some(prompt) => m.with_system_prompt(prompt.clone()), + None => m, + } +} + +fn build_executor( + metadata: AgentMetadata, + rx: mpsc::Receiver, + router: Option<&Arc>, +) -> AgentExecutor { + let executor = AgentExecutor::new(metadata, rx); + match router { + Some(r) => executor.with_router(Arc::clone(r)), + None => executor, + } +} + +/// Build an LLM router from a config file (LLM_ROUTER_CONFIG env var) or +/// from individual provider API key environment variables. +/// Returns None and logs a warning when no credentials are available. +fn build_router_from_env() -> Option { + // Try config file path first + if let Ok(path) = std::env::var("LLM_ROUTER_CONFIG") { + match LLMRouterConfig::load(&path) { + Ok(config) => match LLMRouter::new(config) { + Ok(router) => { + info!("LLM router initialized from {}", path); + return Some(router); + } + Err(e) => warn!("LLM router config loaded but init failed: {}", e), + }, + Err(e) => warn!("Failed to load LLM router config from {}: {}", path, e), + } + } + + // Fall back to building from individual environment variables + let mut providers: HashMap = HashMap::new(); + + if let Ok(api_key) = std::env::var("ANTHROPIC_API_KEY") { + providers.insert( + "claude".to_string(), + ProviderConfig { + enabled: true, + api_key: Some(api_key), + url: None, + model: "claude-sonnet-4-6".to_string(), + max_tokens: 8192, + temperature: 0.7, + cost_per_1m_input: 3.0, + cost_per_1m_output: 15.0, + }, + ); + } + + if let Ok(api_key) = std::env::var("OPENAI_API_KEY") { + providers.insert( + "openai".to_string(), + ProviderConfig { + enabled: true, + api_key: Some(api_key), + url: None, + model: "gpt-4o".to_string(), + max_tokens: 8192, + temperature: 0.7, + cost_per_1m_input: 2.5, + cost_per_1m_output: 10.0, + }, + ); + } + + if let Ok(url) = std::env::var("OLLAMA_URL") { + providers.insert( + "ollama".to_string(), + ProviderConfig { + enabled: true, + api_key: None, + url: Some(url), + model: std::env::var("OLLAMA_MODEL").unwrap_or_else(|_| "llama3.2".to_string()), + max_tokens: 4096, + temperature: 0.7, + cost_per_1m_input: 0.0, + cost_per_1m_output: 0.0, + }, + ); + } + + if providers.is_empty() { + info!("No LLM provider credentials found — executors will use stub responses"); + return None; + } + + let default_provider = if providers.contains_key("claude") { + "claude".to_string() + } else { + providers.keys().next().unwrap().clone() + }; + + let config = LLMRouterConfig { + routing: vapora_llm_router::config::RoutingConfig { + default_provider, + cost_tracking_enabled: true, + fallback_enabled: true, + }, + providers, + routing_rules: vec![], + }; + + match LLMRouter::new(config) { + Ok(router) => { + info!("LLM router initialized from environment credentials"); + Some(router) + } + Err(e) => { + warn!("Failed to initialize LLM router from env vars: {}", e); + None + } + } +} + async fn health_handler() -> Json { Json(json!({ "status": "healthy", @@ -145,10 +373,8 @@ async fn health_handler() -> Json { })) } -/// Readiness check endpoint async fn readiness_handler(State(state): State) -> Json { let is_ready = state.coordinator.is_ready().await; - Json(json!({ "ready": is_ready, "service": "vapora-agents", diff --git a/crates/vapora-agents/src/config.rs b/crates/vapora-agents/src/config.rs index 7038ff3..3dd4a5b 100644 --- a/crates/vapora-agents/src/config.rs +++ b/crates/vapora-agents/src/config.rs @@ -46,23 +46,10 @@ fn default_agent_timeout() -> u64 { 300 } -#[derive(Debug, Clone, Serialize, Deserialize)] -pub struct AgentDefinition { - pub role: String, - pub description: String, - pub llm_provider: String, - pub llm_model: String, - #[serde(default)] - pub parallelizable: bool, - #[serde(default = "default_priority")] - pub priority: u32, - #[serde(default)] - pub capabilities: Vec, -} - -fn default_priority() -> u32 { - 50 -} +/// Re-exported from `vapora-shared` so callers using +/// `vapora_agents::config::AgentDefinition` continue to compile without +/// changes. +pub use vapora_shared::AgentDefinition; impl AgentConfig { /// Load configuration from TOML file @@ -136,6 +123,7 @@ impl Default for AgentConfig { parallelizable: true, priority: 80, capabilities: vec!["coding".to_string()], + system_prompt: None, }], } } @@ -161,6 +149,7 @@ mod tests { parallelizable: true, priority: 80, capabilities: vec!["coding".to_string()], + system_prompt: None, }], }; @@ -184,6 +173,7 @@ mod tests { parallelizable: true, priority: 80, capabilities: vec![], + system_prompt: None, }, AgentDefinition { role: "developer".to_string(), @@ -193,6 +183,7 @@ mod tests { parallelizable: true, priority: 80, capabilities: vec![], + system_prompt: None, }, ], }; @@ -216,6 +207,7 @@ mod tests { parallelizable: false, priority: 100, capabilities: vec!["architecture".to_string()], + system_prompt: None, }], }; diff --git a/crates/vapora-agents/src/coordinator.rs b/crates/vapora-agents/src/coordinator.rs index ae36cce..a4c8f18 100644 --- a/crates/vapora-agents/src/coordinator.rs +++ b/crates/vapora-agents/src/coordinator.rs @@ -6,7 +6,9 @@ use std::path::PathBuf; use std::sync::Arc; use chrono::Utc; +use dashmap::DashMap; use thiserror::Error; +use tokio::sync::mpsc; use tracing::{debug, info, warn}; use uuid::Uuid; use vapora_shared::validation::{SchemaRegistry, ValidationPipeline}; @@ -52,6 +54,10 @@ pub struct AgentCoordinator { learning_profiles: Arc>>, budget_manager: Option>, validation: Arc, + /// In-process executor channels, keyed by agent ID. + /// When a channel is registered for the selected agent, the task is + /// dispatched directly to its executor rather than relying solely on NATS. + executor_channels: Arc>>, } impl AgentCoordinator { @@ -123,6 +129,7 @@ impl AgentCoordinator { learning_profiles: Arc::new(std::sync::RwLock::new(HashMap::new())), budget_manager: None, validation, + executor_channels: Arc::new(DashMap::new()), }) } @@ -147,9 +154,18 @@ impl AgentCoordinator { learning_profiles: Arc::new(std::sync::RwLock::new(HashMap::new())), budget_manager: None, validation, + executor_channels: Arc::new(DashMap::new()), } } + /// Register an in-process executor channel for an agent. + /// + /// When registered, [`Self::assign_task`] dispatches tasks directly to the + /// executor via this channel in addition to (or instead of) NATS. + pub fn register_executor_channel(&self, agent_id: String, tx: mpsc::Sender) { + self.executor_channels.insert(agent_id, tx); + } + /// Set NATS client for inter-agent communication pub fn with_nats(mut self, client: Arc) -> Self { self.nats_client = Some(client); @@ -308,7 +324,23 @@ impl AgentCoordinator { task_id, agent.id, role, task_type ); - // Publish to NATS if available + // Dispatch to in-process executor if one is registered for this agent. + // Clone the sender out of the DashMap before awaiting to avoid holding + // the shard lock across an await point. + if let Some(tx) = self + .executor_channels + .get(&agent.id) + .map(|r| r.value().clone()) + { + if let Err(e) = tx.send(assignment.clone()).await { + warn!( + "Failed to dispatch task {} to in-process executor for agent {}: {}", + task_id, agent.id, e + ); + } + } + + // Publish to NATS if available (external consumers / distributed mode) if let Some(nats) = &self.nats_client { self.publish_message(nats, AgentMessage::TaskAssigned(assignment)) .await?; diff --git a/crates/vapora-agents/src/profile_adapter.rs b/crates/vapora-agents/src/profile_adapter.rs index b17e843..f901b95 100644 --- a/crates/vapora-agents/src/profile_adapter.rs +++ b/crates/vapora-agents/src/profile_adapter.rs @@ -92,6 +92,7 @@ mod tests { last_heartbeat: chrono::Utc::now(), uptime_percentage: 99.5, total_tasks_completed: 10, + system_prompt: None, }; let profile = ProfileAdapter::create_profile(&agent); @@ -121,6 +122,7 @@ mod tests { last_heartbeat: chrono::Utc::now(), uptime_percentage: 99.0, total_tasks_completed: 5, + system_prompt: None, }, AgentMetadata { id: "agent-2".to_string(), @@ -137,6 +139,7 @@ mod tests { last_heartbeat: chrono::Utc::now(), uptime_percentage: 98.5, total_tasks_completed: 3, + system_prompt: None, }, ]; diff --git a/crates/vapora-agents/src/registry.rs b/crates/vapora-agents/src/registry.rs index 1102ad0..76971fa 100644 --- a/crates/vapora-agents/src/registry.rs +++ b/crates/vapora-agents/src/registry.rs @@ -49,6 +49,10 @@ pub struct AgentMetadata { pub last_heartbeat: DateTime, pub uptime_percentage: f64, pub total_tasks_completed: u64, + /// Domain-optimized system prompt from capability package. + /// None means the executor uses its default generic prompt. + #[serde(default)] + pub system_prompt: Option, } impl AgentMetadata { @@ -75,9 +79,17 @@ impl AgentMetadata { last_heartbeat: now, uptime_percentage: 100.0, total_tasks_completed: 0, + system_prompt: None, } } + /// Builder: attach a domain-optimized system prompt from a capability + /// package. + pub fn with_system_prompt(mut self, prompt: impl Into) -> Self { + self.system_prompt = Some(prompt.into()); + self + } + /// Check if agent can accept new tasks pub fn can_accept_task(&self) -> bool { self.status == AgentStatus::Active && self.current_tasks < self.max_concurrent_tasks diff --git a/crates/vapora-agents/src/runtime/executor.rs b/crates/vapora-agents/src/runtime/executor.rs index c525bcb..b3d6600 100644 --- a/crates/vapora-agents/src/runtime/executor.rs +++ b/crates/vapora-agents/src/runtime/executor.rs @@ -1,39 +1,37 @@ -// Per-agent execution loop with channel-based task distribution -// Phase 5.5: Persistence of execution history to KG for learning -// Each agent has dedicated executor managing its state machine - use std::sync::Arc; use chrono::Utc; use tokio::sync::mpsc; use tracing::{debug, info, warn}; use vapora_knowledge_graph::{ExecutionRecord, KGPersistence, PersistedExecution}; -use vapora_llm_router::EmbeddingProvider; +use vapora_llm_router::{EmbeddingProvider, LLMRouter}; use super::state_machine::{Agent, ExecutionResult, Idle}; use crate::messages::TaskAssignment; use crate::registry::AgentMetadata; -/// Per-agent executor handling task processing with persistence (Phase 5.5) +/// Per-agent executor handling task processing with KG persistence. pub struct AgentExecutor { agent: Agent, task_rx: mpsc::Receiver, kg_persistence: Option>, embedding_provider: Option>, + router: Option>, } impl AgentExecutor { - /// Create new executor for an agent (Phase 4) + /// Create a new executor without persistence or LLM routing. pub fn new(metadata: AgentMetadata, task_rx: mpsc::Receiver) -> Self { Self { agent: Agent::new(metadata), task_rx, kg_persistence: None, embedding_provider: None, + router: None, } } - /// Create executor with persistence (Phase 5.5) + /// Create executor with KG persistence. pub fn with_persistence( metadata: AgentMetadata, task_rx: mpsc::Receiver, @@ -45,10 +43,25 @@ impl AgentExecutor { task_rx, kg_persistence: Some(kg_persistence), embedding_provider: Some(embedding_provider), + router: None, } } - /// Run executor loop, processing tasks until channel closes + /// Attach an LLM router to this executor. + /// + /// When set, [`AgentExecutor::run`] forwards each task to the router using + /// `AgentMetadata::system_prompt` as the system message and the task + /// description (+ context when non-empty) as the user message. + /// Budget-aware routing uses `AgentMetadata::role` for spend tracking. + /// + /// When absent, the executor produces a stub result — useful in tests that + /// must not make network calls. + pub fn with_router(mut self, router: Arc) -> Self { + self.router = Some(router); + self + } + + /// Run the executor loop, processing tasks until the channel closes. pub async fn run(mut self) { info!( "AgentExecutor started for agent: {}", @@ -59,30 +72,25 @@ impl AgentExecutor { while let Some(task) = self.task_rx.recv().await { debug!("Received task: {}", task.id); - // Transition: Idle → Assigned - let agent_assigned = self.agent.assign_task(task.clone()); - - // Transition: Assigned → Executing - let agent_executing = agent_assigned.start_execution(); let execution_start = Utc::now(); - // Execute task (placeholder - in real use, call LLM via vapora-llm-router) - let result = ExecutionResult { - output: "Task executed successfully".to_string(), - input_tokens: 100, - output_tokens: 50, - duration_ms: 500, - }; + // Execute before state transitions — state machine consumes self.agent via + // partial move, which would prevent borrowing self.router afterwards. + let (result, provider_used) = self.execute_task(&task).await; - // Transition: Executing → Completed + let agent_assigned = self.agent.assign_task(task.clone()); + let agent_executing = agent_assigned.start_execution(); let completed_agent = agent_executing.complete(result.clone()); - - // Handle result - transition Completed → Idle self.agent = completed_agent.reset(); - // Phase 5.5: Persist execution to Knowledge Graph (after state transition) - self.persist_execution_internal(&task, &result, execution_start, &agent_id) - .await; + self.persist_execution_internal( + &task, + &result, + execution_start, + &agent_id, + &provider_used, + ) + .await; info!("Task {} completed", task.id); } @@ -90,27 +98,91 @@ impl AgentExecutor { info!("AgentExecutor stopped for agent: {}", agent_id); } - /// Persist execution record to KG database (Phase 5.5) + /// Execute a task, returning the result and the provider name that handled + /// it. + /// + /// When a router is attached, calls `complete_with_budget` with: + /// - `system_prompt` from `AgentMetadata` → sent as the LLM system message + /// - `task.description` (plus `task.context` when meaningful) → user + /// message + /// - `task.required_role` → task-type routing key + /// - `agent.role` → budget-tracking role + /// + /// Falls back to a stub result when no router is configured. + async fn execute_task(&self, task: &TaskAssignment) -> (ExecutionResult, String) { + let Some(ref router) = self.router else { + return ( + ExecutionResult { + output: "Task executed successfully".to_string(), + input_tokens: 100, + output_tokens: 50, + duration_ms: 500, + }, + "unknown".to_string(), + ); + }; + + let prompt = build_prompt(&task.description, &task.context); + let system_prompt = self.agent.metadata.system_prompt.clone(); + let provider_name = self.agent.metadata.llm_provider.clone(); + let wall_start = std::time::Instant::now(); + + match router + .complete_with_budget( + &task.required_role, + prompt, + system_prompt, + None, + Some(self.agent.metadata.role.as_str()), + ) + .await + { + Ok(response) => { + let duration_ms = wall_start.elapsed().as_millis() as u64; + ( + ExecutionResult { + output: response.text, + input_tokens: response.input_tokens, + output_tokens: response.output_tokens, + duration_ms, + }, + provider_name, + ) + } + Err(e) => { + warn!("LLM call failed for task {}: {}", task.id, e); + let duration_ms = wall_start.elapsed().as_millis() as u64; + ( + ExecutionResult { + output: format!("LLM error: {e}"), + input_tokens: 0, + output_tokens: 0, + duration_ms, + }, + provider_name, + ) + } + } + } + async fn persist_execution_internal( &self, task: &TaskAssignment, result: &ExecutionResult, execution_start: chrono::DateTime, agent_id: &str, + provider: &str, ) { if let Some(ref kg_persistence) = self.kg_persistence { if let Some(ref embedding_provider) = self.embedding_provider { - // Generate embedding for task description let embedding = match embedding_provider.embed(&task.description).await { Ok(emb) => emb, Err(e) => { warn!("Failed to generate embedding for task {}: {}", task.id, e); - // Use zero vector as fallback vec![0.0; 1536] } }; - // Create execution record for KG let execution_record = ExecutionRecord { id: task.id.clone(), task_id: task.id.clone(), @@ -122,7 +194,7 @@ impl AgentExecutor { input_tokens: result.input_tokens, output_tokens: result.output_tokens, cost_cents: 0, - provider: "unknown".to_string(), + provider: provider.to_string(), success: true, error: None, solution: Some(result.output.clone()), @@ -130,18 +202,15 @@ impl AgentExecutor { timestamp: execution_start, }; - // Convert to persisted format let persisted = PersistedExecution::from_execution_record(&execution_record, embedding); - // Persist to SurrealDB if let Err(e) = kg_persistence.persist_execution(persisted).await { warn!("Failed to persist execution: {}", e); } else { debug!("Persisted execution {} to KG", task.id); } - // Record analytics event if let Err(e) = kg_persistence .record_event( "task_completed", @@ -154,7 +223,6 @@ impl AgentExecutor { warn!("Failed to record event: {}", e); } - // Record token usage event if let Err(e) = kg_persistence .record_event( "token_usage", @@ -182,6 +250,22 @@ pub enum ExecutorState { Failed(Agent, String), } +/// Build the user-facing LLM prompt from a task description and its JSON +/// context. +/// +/// The `context` field in [`TaskAssignment`] carries supplementary JSON that +/// may be empty or contain only `"{}"` when no additional context is available. +/// Those cases are skipped so the model receives a clean description-only +/// prompt. +fn build_prompt(description: &str, context: &str) -> String { + let ctx = context.trim(); + if ctx.is_empty() || ctx == "{}" { + description.to_string() + } else { + format!("{description}\n\nContext:\n{ctx}") + } +} + #[cfg(test)] mod tests { use super::*; @@ -204,6 +288,7 @@ mod tests { last_heartbeat: Utc::now(), uptime_percentage: 100.0, total_tasks_completed: 0, + system_prompt: None, }; let (_tx, rx) = mpsc::channel(10); @@ -211,6 +296,7 @@ mod tests { assert_eq!(executor.agent.metadata.id, "test-executor"); assert!(executor.kg_persistence.is_none()); + assert!(executor.router.is_none()); } #[test] @@ -230,6 +316,7 @@ mod tests { last_heartbeat: Utc::now(), uptime_percentage: 99.5, total_tasks_completed: 100, + system_prompt: None, }; let (_tx, rx) = mpsc::channel(10); @@ -237,5 +324,26 @@ mod tests { assert!(!executor.agent.metadata.role.is_empty()); assert!(executor.embedding_provider.is_none()); + assert!(executor.router.is_none()); + } + + #[test] + fn build_prompt_no_context() { + let p = build_prompt("Fix the bug", ""); + assert_eq!(p, "Fix the bug"); + } + + #[test] + fn build_prompt_empty_json_context() { + let p = build_prompt("Fix the bug", "{}"); + assert_eq!(p, "Fix the bug"); + } + + #[test] + fn build_prompt_with_context() { + let p = build_prompt("Fix the bug", r#"{"file":"src/lib.rs"}"#); + assert!(p.contains("Fix the bug")); + assert!(p.contains("src/lib.rs")); + assert!(p.contains("Context:")); } } diff --git a/crates/vapora-agents/src/runtime/state_machine.rs b/crates/vapora-agents/src/runtime/state_machine.rs index 1c54485..9bf192a 100644 --- a/crates/vapora-agents/src/runtime/state_machine.rs +++ b/crates/vapora-agents/src/runtime/state_machine.rs @@ -163,6 +163,7 @@ mod tests { last_heartbeat: Utc::now(), uptime_percentage: 100.0, total_tasks_completed: 0, + system_prompt: None, }; // Type-state chain: Idle → Assigned → Executing → Completed → Idle @@ -213,6 +214,7 @@ mod tests { last_heartbeat: Utc::now(), uptime_percentage: 100.0, total_tasks_completed: 0, + system_prompt: None, }; let agent = Agent::new(metadata); diff --git a/crates/vapora-capabilities/Cargo.toml b/crates/vapora-capabilities/Cargo.toml new file mode 100644 index 0000000..373b19b --- /dev/null +++ b/crates/vapora-capabilities/Cargo.toml @@ -0,0 +1,25 @@ +[package] +name = "vapora-capabilities" +version.workspace = true +edition.workspace = true +authors.workspace = true +license.workspace = true +repository.workspace = true +rust-version.workspace = true +description = "Pre-built capability packages for VAPORA agents — domain-optimized prompts, tools, and LLM configs" + +[lib] +crate-type = ["rlib"] + +[dependencies] +vapora-shared = { workspace = true } + +serde = { workspace = true } +serde_json = { workspace = true } +toml = { workspace = true } +thiserror = { workspace = true } +tracing = { workspace = true } +parking_lot = { workspace = true } + +[dev-dependencies] +tempfile = { workspace = true } diff --git a/crates/vapora-capabilities/src/built_in/code_reviewer.rs b/crates/vapora-capabilities/src/built_in/code_reviewer.rs new file mode 100644 index 0000000..656cfbd --- /dev/null +++ b/crates/vapora-capabilities/src/built_in/code_reviewer.rs @@ -0,0 +1,97 @@ +use crate::capability::{Capability, CapabilitySpec}; + +const SYSTEM_PROMPT: &str = r#"You are a senior software engineer specializing in code review. +Your analysis must be precise, actionable, and grounded in the actual code. + +REVIEW METHODOLOGY: +1. Read all changed files completely before forming conclusions. +2. Categorize each finding by severity: Critical | High | Medium | Low | Info. +3. Reference specific line numbers in every finding. +4. Distinguish logic bugs from style suggestions — report them separately. + +CRITICAL (block merge): +- Security: SQL injection, XSS, SSRF, path traversal, unsafe deserialization, missing auth checks +- Memory safety: use-after-free, buffer overflows, data races, unchecked slice indexing +- Correctness: logic errors that produce wrong results, silent data corruption +- Unhandled error conditions that lead to undefined or catastrophic state + +HIGH (should fix before merge): +- Missing input validation on user-supplied data at system boundaries +- Resource leaks: unclosed handles, unbounded memory allocations, missing timeouts +- Error swallowing that hides failures from callers (`let _ = ...`, `unwrap_or(Default)` on failures) +- Concurrency hazards: deadlocks, TOCTOU races, missing synchronization primitives + +MEDIUM (fix in follow-up, document if deferred): +- Missing test coverage for new code paths +- Performance anti-patterns in hot paths (unnecessary clones, allocations in loops) +- API design issues: inconsistent naming, overly complex function signatures +- Documentation gaps on public interfaces + +LOW / INFO: +- Code style inconsistencies (report only if systematic, not isolated occurrences) +- Minor refactoring opportunities that improve readability without changing behavior + +OUTPUT FORMAT (JSON only, no prose wrapping): +{ + "summary": "<1-2 sentences on overall quality and primary concern>", + "merge_ready": , + "findings": [ + { + "severity": "Critical|High|Medium|Low|Info", + "file": "", + "line": , + "category": "Security|Correctness|Performance|Testing|Documentation|Style", + "description": "", + "suggestion": "" + } + ], + "stats": { + "critical": , + "high": , + "medium": , + "low": + } +} + +Do not invent findings. If the code is correct in a category, omit those findings. +Precision over quantity — 3 real findings beat 15 speculative ones."#; + +/// Pre-built CodeReviewer capability. +/// +/// Configures an agent for in-depth code review with security and correctness +/// focus. Uses Claude Opus for deep reasoning on complex code patterns. +/// Temperature 0.1 maximizes determinism — reviews should be reproducible. +#[derive(Debug)] +pub struct CodeReviewer; + +impl Capability for CodeReviewer { + fn spec(&self) -> CapabilitySpec { + CapabilitySpec { + id: "code-reviewer".to_string(), + display_name: "Code Reviewer".to_string(), + description: "In-depth code review with security, correctness, and performance focus" + .to_string(), + agent_role: "code_reviewer".to_string(), + task_types: vec![ + "code_review".to_string(), + "review".to_string(), + "security_audit".to_string(), + "quality_check".to_string(), + "inspect".to_string(), + ], + system_prompt: SYSTEM_PROMPT.to_string(), + mcp_tools: vec![ + "file_read".to_string(), + "file_list".to_string(), + "git_diff".to_string(), + "code_search".to_string(), + ], + preferred_provider: "claude".to_string(), + preferred_model: "claude-opus-4-6".to_string(), + max_tokens: 8192, + temperature: 0.1, + priority: 90, + parallelizable: true, + } + } +} diff --git a/crates/vapora-capabilities/src/built_in/doc_generator.rs b/crates/vapora-capabilities/src/built_in/doc_generator.rs new file mode 100644 index 0000000..700a7f0 --- /dev/null +++ b/crates/vapora-capabilities/src/built_in/doc_generator.rs @@ -0,0 +1,87 @@ +use crate::capability::{Capability, CapabilitySpec}; + +const SYSTEM_PROMPT: &str = r#"You are a technical documentation specialist who writes clear, accurate documentation from source code. + +DOCUMENTATION METHODOLOGY: +1. Read the source code thoroughly before writing anything. +2. Document behavior, not implementation — explain what, not how. +3. Every public function/method gets: purpose, parameters, return value, errors/panics, and a usage example. +4. Module-level docs explain the module's role and its relationships to other modules. +5. Do not document obvious things (e.g., `/// Returns the value` for a getter). + +DOCUMENTATION LEVELS: + +MODULE DOCS (top of file or lib.rs): +- One-line summary followed by blank line +- Purpose of the module in the broader system +- Key types and their relationships +- Usage example showing common patterns + +FUNCTION/METHOD DOCS: +- One-line summary (imperative: "Creates a...", "Returns the...", "Validates...") +- Extended description if behavior is non-obvious +- `# Arguments` section listing each parameter with type and purpose +- `# Returns` section describing the return value and its possible states +- `# Errors` section listing error variants that can be returned (for Result types) +- `# Panics` section if the function can panic (with conditions) +- `# Examples` section with runnable code (in Rust: ```rust) + +TYPE DOCS (struct, enum, trait): +- What the type represents (not just a restatement of its name) +- Invariants the type maintains +- How it's constructed (builder pattern, From impls, etc.) +- For enums: when to use each variant + +OUTPUT FORMAT: +Produce documentation as markdown-formatted comments in the target language. +For Rust: use `///` for doc comments. For Python: use docstrings. For TypeScript: use JSDoc. +If the input specifies a different format, follow that format. + +QUALITY STANDARDS: +- Every sentence must be complete and grammatically correct +- Use present tense for descriptions ("Creates a connection" not "Will create a connection") +- Avoid vague words: "various", "some", "etc.", "and so on" +- Include real values in examples, not placeholders like `your_value_here` +- If a behavior is undefined or implementation-dependent, say so explicitly"#; + +/// Pre-built DocGenerator capability. +/// +/// Configures an agent for generating comprehensive technical documentation +/// from source code. Uses Claude Sonnet for cost-efficient, high-quality +/// writing. Temperature 0.3 allows natural prose variation while staying +/// factual. +#[derive(Debug)] +pub struct DocGenerator; + +impl Capability for DocGenerator { + fn spec(&self) -> CapabilitySpec { + CapabilitySpec { + id: "doc-generator".to_string(), + display_name: "Documentation Generator".to_string(), + description: "Generates comprehensive technical documentation from source code" + .to_string(), + agent_role: "documenter".to_string(), + task_types: vec![ + "doc_generation".to_string(), + "documentation".to_string(), + "document".to_string(), + "api_docs".to_string(), + "readme".to_string(), + "write".to_string(), + ], + system_prompt: SYSTEM_PROMPT.to_string(), + mcp_tools: vec![ + "file_read".to_string(), + "file_list".to_string(), + "code_search".to_string(), + "file_write".to_string(), + ], + preferred_provider: "claude".to_string(), + preferred_model: "claude-sonnet-4-6".to_string(), + max_tokens: 16384, + temperature: 0.3, + priority: 70, + parallelizable: true, + } + } +} diff --git a/crates/vapora-capabilities/src/built_in/mod.rs b/crates/vapora-capabilities/src/built_in/mod.rs new file mode 100644 index 0000000..2d99ecd --- /dev/null +++ b/crates/vapora-capabilities/src/built_in/mod.rs @@ -0,0 +1,12 @@ +//! Built-in capability packages shipped with VAPORA. +//! +//! Each module provides a unit struct implementing [`crate::Capability`]. +//! Obtain instances via [`crate::CapabilityRegistry::with_built_ins()`]. + +mod code_reviewer; +mod doc_generator; +mod pr_monitor; + +pub use code_reviewer::CodeReviewer; +pub use doc_generator::DocGenerator; +pub use pr_monitor::PRMonitor; diff --git a/crates/vapora-capabilities/src/built_in/pr_monitor.rs b/crates/vapora-capabilities/src/built_in/pr_monitor.rs new file mode 100644 index 0000000..faf2fb2 --- /dev/null +++ b/crates/vapora-capabilities/src/built_in/pr_monitor.rs @@ -0,0 +1,102 @@ +use crate::capability::{Capability, CapabilitySpec}; + +const SYSTEM_PROMPT: &str = r#"You are a pull request analyst responsible for continuous PR health monitoring. +Your job is to produce an accurate, actionable PR health report — not to block work, but to surface risks. + +ANALYSIS SCOPE: +Given a pull request diff and metadata, evaluate: + +1. CHANGE IMPACT: + - Identify the scope: is this a localized fix, a cross-cutting refactor, or an API change? + - Flag breaking changes: public API removals/renames, schema changes, protocol changes + - Identify files with the highest risk (auth, crypto, validation, data models) + +2. TEST COVERAGE: + - Are new code paths covered by tests? + - Are edge cases tested (empty input, max values, error paths)? + - Are there deleted tests not replaced? + +3. DOCUMENTATION: + - Are new public interfaces documented? + - Is the PR description sufficient to understand the change? + +4. SECURITY-SENSITIVE CHANGES: + - Authentication / authorization changes + - Cryptographic operations or key handling + - Input validation or sanitization changes + - External HTTP calls to new or changed endpoints + +5. MERGE READINESS: + Based on the above, classify the PR as: + - READY: No blocking issues, minor notes at most + - NEEDS_REVIEW: Issues that should be reviewed but don't block + - BLOCKED: Breaking changes, security issues, or critical missing tests + +OUTPUT FORMAT (JSON only): +{ + "pr_id": "", + "status": "READY|NEEDS_REVIEW|BLOCKED", + "summary": "<1-2 sentences on the overall change and primary concern>", + "breaking_changes": [ + { "description": "", "impact": "" } + ], + "security_flags": [ + { "severity": "High|Medium|Low", "file": "", "description": "" } + ], + "test_gaps": [ + { "file": "", "description": "" } + ], + "recommendations": [ + "" + ], + "merge_checklist": { + "tests_pass": , + "breaking_changes_documented": , + "security_review_needed": , + "docs_updated": + } +} + +If information is unavailable (e.g., CI status not in the diff), set fields to null. +Be specific: reference file paths and line numbers when flagging issues. +A READY status with honest recommendations is better than an inflated BLOCKED."#; + +/// Pre-built PRMonitor capability. +/// +/// Configures an agent for pull request health monitoring and merge readiness +/// analysis. Runs autonomously when scheduled (e.g., every 15 minutes on open +/// PRs). Temperature 0.1 ensures consistent, reproducible status reports. +#[derive(Debug)] +pub struct PRMonitor; + +impl Capability for PRMonitor { + fn spec(&self) -> CapabilitySpec { + CapabilitySpec { + id: "pr-monitor".to_string(), + display_name: "PR Monitor".to_string(), + description: "Pull request health monitoring and merge readiness analysis".to_string(), + agent_role: "monitor".to_string(), + task_types: vec![ + "pr_review".to_string(), + "pr_monitor".to_string(), + "merge_check".to_string(), + "ci_monitor".to_string(), + "monitor".to_string(), + ], + system_prompt: SYSTEM_PROMPT.to_string(), + mcp_tools: vec![ + "git_diff".to_string(), + "git_log".to_string(), + "git_status".to_string(), + "file_list".to_string(), + "file_read".to_string(), + ], + preferred_provider: "claude".to_string(), + preferred_model: "claude-sonnet-4-6".to_string(), + max_tokens: 4096, + temperature: 0.1, + priority: 85, + parallelizable: false, + } + } +} diff --git a/crates/vapora-capabilities/src/capability.rs b/crates/vapora-capabilities/src/capability.rs new file mode 100644 index 0000000..fdce34e --- /dev/null +++ b/crates/vapora-capabilities/src/capability.rs @@ -0,0 +1,128 @@ +use std::fmt; + +use serde::{Deserialize, Serialize}; +use vapora_shared::AgentDefinition; + +/// Complete specification for a capability package. +/// +/// Carries everything needed to instantiate a domain-optimized agent: +/// system prompt, tool list, LLM preferences, and task classification hints. +/// Registry-stored capabilities expose this via [`Capability::spec()`]. +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] +pub struct CapabilitySpec { + /// Unique kebab-case identifier (e.g., `"code-reviewer"`). + pub id: String, + /// Human-readable name shown in UIs and logs. + pub display_name: String, + /// Brief description of what the capability does. + pub description: String, + /// Agent role name used by the coordinator for task routing (e.g., + /// `"code_reviewer"`). + pub agent_role: String, + /// Task-type strings the coordinator uses to select this agent. + /// + /// These strings are matched by [`vapora_agents::coordinator`]'s + /// `extract_task_type` heuristic — they should overlap with keywords + /// present in task titles/descriptions. + pub task_types: Vec, + /// Domain-optimized system prompt injected before every task execution. + pub system_prompt: String, + /// MCP tool IDs activated for this capability. + /// + /// These are semantic names matching what `vapora-mcp-server` exposes. + pub mcp_tools: Vec, + /// Preferred LLM provider name (e.g., `"claude"`, `"openai"`). + pub preferred_provider: String, + /// Preferred model within the provider (e.g., `"claude-opus-4-6"`). + pub preferred_model: String, + /// Maximum output tokens for this domain's tasks. + pub max_tokens: u32, + /// Sampling temperature — lower values produce more deterministic output. + pub temperature: f32, + /// Assignment priority in range 0–100. + pub priority: u32, + /// Whether multiple instances of this capability can run concurrently. + pub parallelizable: bool, +} + +impl CapabilitySpec { + /// Override the system prompt (e.g., from a TOML config file). + pub fn with_system_prompt(mut self, prompt: impl Into) -> Self { + self.system_prompt = prompt.into(); + self + } + + /// Override the preferred model. + pub fn with_model(mut self, model: impl Into) -> Self { + self.preferred_model = model.into(); + self + } + + /// Override the preferred provider. + pub fn with_provider(mut self, provider: impl Into) -> Self { + self.preferred_provider = provider.into(); + self + } + + /// Override max tokens. + pub fn with_max_tokens(mut self, max_tokens: u32) -> Self { + self.max_tokens = max_tokens; + self + } + + /// Override temperature. + pub fn with_temperature(mut self, temperature: f32) -> Self { + self.temperature = temperature; + self + } + + /// Override priority. + pub fn with_priority(mut self, priority: u32) -> Self { + self.priority = priority; + self + } +} + +/// Object-safe trait for a capability package. +/// +/// Built-in capabilities are unit structs (e.g., +/// [`crate::built_in::CodeReviewer`]). TOML-loaded custom capabilities use +/// [`CustomCapability`]. +/// +/// The default `to_agent_definition()` implementation converts the spec into +/// an [`AgentDefinition`] ready for +/// [`vapora_agents::coordinator::AgentCoordinator`]. +pub trait Capability: Send + Sync + fmt::Debug { + /// Return the full specification for this capability. + fn spec(&self) -> CapabilitySpec; + + /// Convert to an [`AgentDefinition`] usable by the agent registry. + /// + /// The system prompt is embedded into the definition so it is available + /// at the execution site via + /// [`vapora_agents::registry::AgentMetadata::system_prompt`]. + fn to_agent_definition(&self) -> AgentDefinition { + let spec = self.spec(); + AgentDefinition { + role: spec.agent_role.clone(), + description: format!("[{}] {}", spec.display_name, spec.description), + llm_provider: spec.preferred_provider.clone(), + llm_model: spec.preferred_model.clone(), + parallelizable: spec.parallelizable, + priority: spec.priority, + capabilities: spec.task_types.clone(), + system_prompt: Some(spec.system_prompt.clone()), + } + } +} + +/// Wraps a [`CapabilitySpec`] loaded from TOML as a [`Capability`] +/// implementation. +#[derive(Debug)] +pub struct CustomCapability(pub CapabilitySpec); + +impl Capability for CustomCapability { + fn spec(&self) -> CapabilitySpec { + self.0.clone() + } +} diff --git a/crates/vapora-capabilities/src/lib.rs b/crates/vapora-capabilities/src/lib.rs new file mode 100644 index 0000000..11c80ce --- /dev/null +++ b/crates/vapora-capabilities/src/lib.rs @@ -0,0 +1,72 @@ +//! Pre-built capability packages for VAPORA agents. +//! +//! A capability package bundles everything needed to deploy a domain-optimized +//! agent without manual configuration: system prompt, MCP tool list, LLM model +//! preferences, and task-type classification hints. +//! +//! # Quick Start +//! +//! ```rust +//! use vapora_capabilities::CapabilityRegistry; +//! +//! // 1. Create the registry with all built-ins +//! let registry = CapabilityRegistry::with_built_ins(); +//! +//! // 2. Activate a capability → produces an AgentDefinition +//! let def = registry.activate("code-reviewer").unwrap(); +//! +//! // 3. Use the definition to instantiate an AgentMetadata in the coordinator +//! assert_eq!(def.role, "code_reviewer"); +//! assert!(def.system_prompt.is_some()); +//! ``` +//! +//! # Built-in Capabilities +//! +//! | ID | Role | Purpose | +//! |----|------|---------| +//! | `code-reviewer` | `code_reviewer` | Security and correctness-focused code review | +//! | `doc-generator` | `documenter` | Technical documentation generation | +//! | `pr-monitor` | `monitor` | Pull request health and merge readiness | +//! +//! # Customization via TOML +//! +//! Override built-in settings or add custom capabilities through a TOML file: +//! +//! ```toml +//! [[override]] +//! id = "code-reviewer" +//! preferred_model = "claude-sonnet-4-6" +//! +//! [[custom]] +//! id = "db-optimizer" +//! display_name = "Database Optimizer" +//! description = "Analyzes and optimizes SQL queries" +//! agent_role = "db_optimizer" +//! task_types = ["db_optimization"] +//! system_prompt = "You are a database expert..." +//! mcp_tools = ["file_read", "code_search"] +//! preferred_provider = "claude" +//! preferred_model = "claude-sonnet-4-6" +//! max_tokens = 4096 +//! temperature = 0.2 +//! priority = 65 +//! parallelizable = true +//! ``` +//! +//! Then apply via [`CapabilityLoader`]: +//! +//! ```rust,no_run +//! use vapora_capabilities::{CapabilityRegistry, CapabilityLoader}; +//! +//! let registry = CapabilityRegistry::with_built_ins(); +//! CapabilityLoader::load_and_apply("config/capabilities.toml", ®istry).unwrap(); +//! ``` + +pub mod built_in; +pub mod capability; +pub mod loader; +pub mod registry; + +pub use capability::{Capability, CapabilitySpec, CustomCapability}; +pub use loader::{CapabilityConfig, CapabilityLoader, CapabilityOverride, LoaderError}; +pub use registry::{CapabilityError, CapabilityRegistry}; diff --git a/crates/vapora-capabilities/src/loader.rs b/crates/vapora-capabilities/src/loader.rs new file mode 100644 index 0000000..a7b5fb4 --- /dev/null +++ b/crates/vapora-capabilities/src/loader.rs @@ -0,0 +1,404 @@ +use serde::Deserialize; +use thiserror::Error; +use tracing::{debug, info, warn}; + +use crate::capability::{CapabilitySpec, CustomCapability}; +use crate::registry::{CapabilityError, CapabilityRegistry}; + +#[derive(Debug, Error)] +pub enum LoaderError { + #[error("Failed to read file '{path}': {source}")] + Io { + path: String, + #[source] + source: std::io::Error, + }, + + #[error("Failed to parse TOML: {0}")] + Parse(#[from] toml::de::Error), + + #[error("Registry error applying capabilities: {0}")] + Registry(#[from] CapabilityError), +} + +/// Partial override applied on top of an existing built-in capability. +/// +/// Only fields set in TOML are applied; unset fields keep their default value. +/// The `id` field identifies which built-in to patch. +#[derive(Debug, Deserialize)] +pub struct CapabilityOverride { + /// ID of the built-in capability to override (e.g., `"code-reviewer"`). + pub id: String, + pub system_prompt: Option, + pub preferred_provider: Option, + pub preferred_model: Option, + pub max_tokens: Option, + pub temperature: Option, + pub priority: Option, + pub mcp_tools: Option>, +} + +/// Configuration file structure for capability customization. +/// +/// ```toml +/// # Patch a built-in: use Claude Sonnet instead of Opus for code-reviewer +/// [[override]] +/// id = "code-reviewer" +/// preferred_model = "claude-sonnet-4-6" +/// max_tokens = 16384 +/// +/// # Add a custom capability not shipped as a built-in +/// [[custom]] +/// id = "db-optimizer" +/// display_name = "Database Optimizer" +/// description = "Analyzes and optimizes SQL queries and schema" +/// agent_role = "db_optimizer" +/// task_types = ["db_optimization", "query_review"] +/// system_prompt = "You are a database performance expert..." +/// mcp_tools = ["file_read", "code_search"] +/// preferred_provider = "claude" +/// preferred_model = "claude-sonnet-4-6" +/// max_tokens = 4096 +/// temperature = 0.2 +/// priority = 70 +/// parallelizable = true +/// ``` +#[derive(Debug, Deserialize)] +pub struct CapabilityConfig { + /// Patches applied on top of existing built-in capabilities. + #[serde(default, rename = "override")] + pub overrides: Vec, + /// Entirely new capabilities not shipped as built-ins. + #[serde(default)] + pub custom: Vec, +} + +/// Loads capability overrides and custom capabilities from TOML configuration. +pub struct CapabilityLoader; + +impl CapabilityLoader { + /// Parse capability configuration from a TOML string. + /// + /// # Errors + /// + /// Returns [`LoaderError::Parse`] if the TOML is malformed. + pub fn parse(toml_content: &str) -> Result { + let config: CapabilityConfig = toml::from_str(toml_content)?; + Ok(config) + } + + /// Parse capability configuration from a file on disk. + /// + /// # Errors + /// + /// Returns [`LoaderError::Io`] if the file cannot be read, + /// or [`LoaderError::Parse`] if the content is not valid TOML. + pub fn from_file(path: &str) -> Result { + let content = std::fs::read_to_string(path).map_err(|e| LoaderError::Io { + path: path.to_string(), + source: e, + })?; + Self::parse(&content) + } + + /// Apply a parsed [`CapabilityConfig`] to a [`CapabilityRegistry`]. + /// + /// - Overrides: existing built-in specs are patched with the non-None + /// fields. If the ID does not exist in the registry it is skipped with a + /// warning. + /// - Custom: new capabilities are added. If the ID already exists it is + /// replaced (allowing idempotent re-application of config). + /// + /// # Errors + /// + /// Returns [`LoaderError::Registry`] only when a custom capability spec has + /// an empty ID. + pub fn apply( + config: CapabilityConfig, + registry: &CapabilityRegistry, + ) -> Result<(), LoaderError> { + let override_count = config.overrides.len(); + let custom_count = config.custom.len(); + + // Apply overrides to existing capabilities + for ov in config.overrides { + if !registry.contains(&ov.id) { + warn!( + "Capability override '{}' not found in registry — skipping", + ov.id + ); + continue; + } + + let base_arc = registry + .get(&ov.id) + .expect("contains check above ensures presence"); + let mut spec = base_arc.spec(); + + if let Some(prompt) = ov.system_prompt { + spec.system_prompt = prompt; + } + if let Some(provider) = ov.preferred_provider { + spec.preferred_provider = provider; + } + if let Some(model) = ov.preferred_model { + spec.preferred_model = model; + } + if let Some(max_tokens) = ov.max_tokens { + spec.max_tokens = max_tokens; + } + if let Some(temperature) = ov.temperature { + spec.temperature = temperature; + } + if let Some(priority) = ov.priority { + spec.priority = priority; + } + if let Some(tools) = ov.mcp_tools { + spec.mcp_tools = tools; + } + + debug!("Applied override to capability '{}'", ov.id); + registry.register_or_replace(CustomCapability(spec)); + } + + // Register custom capabilities (replace if already present) + for spec in config.custom { + if spec.id.is_empty() { + return Err(LoaderError::Registry(CapabilityError::InvalidSpec( + "custom capability id must not be empty".to_string(), + ))); + } + debug!("Registering custom capability '{}'", spec.id); + registry.register_or_replace(CustomCapability(spec)); + } + + info!( + "Applied capability config: {} overrides, {} custom", + override_count, custom_count + ); + Ok(()) + } + + /// Convenience: parse from file and apply to registry in one call. + pub fn load_and_apply(path: &str, registry: &CapabilityRegistry) -> Result<(), LoaderError> { + let config = Self::from_file(path)?; + Self::apply(config, registry) + } +} + +#[cfg(test)] +mod tests { + use super::*; + use crate::CapabilityRegistry; + + #[test] + fn parse_empty_config_succeeds() { + let config = CapabilityLoader::parse("").unwrap(); + assert!(config.overrides.is_empty()); + assert!(config.custom.is_empty()); + } + + #[test] + fn parse_override_only() { + let toml = r#" + [[override]] + id = "code-reviewer" + preferred_model = "claude-sonnet-4-6" + max_tokens = 16384 + "#; + + let config = CapabilityLoader::parse(toml).unwrap(); + assert_eq!(config.overrides.len(), 1); + assert_eq!(config.overrides[0].id, "code-reviewer"); + assert_eq!( + config.overrides[0].preferred_model.as_deref(), + Some("claude-sonnet-4-6") + ); + assert_eq!(config.overrides[0].max_tokens, Some(16384)); + assert!(config.overrides[0].system_prompt.is_none()); + } + + #[test] + fn parse_custom_capability() { + let toml = r#" + [[custom]] + id = "db-optimizer" + display_name = "Database Optimizer" + description = "Optimizes SQL queries" + agent_role = "db_optimizer" + task_types = ["db_optimization"] + system_prompt = "You are a database expert." + mcp_tools = ["file_read"] + preferred_provider = "claude" + preferred_model = "claude-sonnet-4-6" + max_tokens = 2048 + temperature = 0.2 + priority = 65 + parallelizable = true + "#; + + let config = CapabilityLoader::parse(toml).unwrap(); + assert_eq!(config.custom.len(), 1); + assert_eq!(config.custom[0].id, "db-optimizer"); + assert_eq!(config.custom[0].agent_role, "db_optimizer"); + assert_eq!(config.custom[0].max_tokens, 2048); + } + + #[test] + fn apply_override_changes_model() { + let registry = CapabilityRegistry::with_built_ins(); + let original_model = registry + .get("code-reviewer") + .unwrap() + .spec() + .preferred_model + .clone(); + + let toml = r#" + [[override]] + id = "code-reviewer" + preferred_model = "claude-sonnet-4-6" + "#; + + let config = CapabilityLoader::parse(toml).unwrap(); + CapabilityLoader::apply(config, ®istry).unwrap(); + + let updated = registry.activate("code-reviewer").unwrap(); + assert_eq!(updated.llm_model, "claude-sonnet-4-6"); + // The original is different (it was opus), so this proves the override took + // effect + assert_ne!(updated.llm_model, original_model); + } + + #[test] + fn apply_override_preserves_unset_fields() { + let registry = CapabilityRegistry::with_built_ins(); + let original_spec = registry.get("code-reviewer").unwrap().spec(); + + let toml = r#" + [[override]] + id = "code-reviewer" + max_tokens = 4096 + "#; + + let config = CapabilityLoader::parse(toml).unwrap(); + CapabilityLoader::apply(config, ®istry).unwrap(); + + let updated_spec = registry.get("code-reviewer").unwrap().spec(); + assert_eq!(updated_spec.max_tokens, 4096); + // System prompt and provider should be unchanged + assert_eq!(updated_spec.system_prompt, original_spec.system_prompt); + assert_eq!( + updated_spec.preferred_provider, + original_spec.preferred_provider + ); + } + + #[test] + fn apply_unknown_override_id_is_skipped() { + let registry = CapabilityRegistry::with_built_ins(); + let initial_count = registry.len(); + + let toml = r#" + [[override]] + id = "nonexistent-cap" + max_tokens = 999 + "#; + + let config = CapabilityLoader::parse(toml).unwrap(); + // Should succeed without error — unknown override is skipped with a warning + CapabilityLoader::apply(config, ®istry).unwrap(); + + // Registry count unchanged + assert_eq!(registry.len(), initial_count); + } + + #[test] + fn apply_custom_adds_new_capability() { + let registry = CapabilityRegistry::with_built_ins(); + let initial_count = registry.len(); + + let toml = r#" + [[custom]] + id = "test-custom" + display_name = "Test Custom" + description = "A test custom capability" + agent_role = "custom_tester" + task_types = ["custom_task"] + system_prompt = "You are a custom test agent." + mcp_tools = [] + preferred_provider = "ollama" + preferred_model = "llama3.2" + max_tokens = 2048 + temperature = 0.5 + priority = 40 + parallelizable = true + "#; + + let config = CapabilityLoader::parse(toml).unwrap(); + CapabilityLoader::apply(config, ®istry).unwrap(); + + assert_eq!(registry.len(), initial_count + 1); + assert!(registry.contains("test-custom")); + + let def = registry.activate("test-custom").unwrap(); + assert_eq!(def.role, "custom_tester"); + assert_eq!(def.llm_provider, "ollama"); + assert_eq!(def.system_prompt.unwrap(), "You are a custom test agent."); + } + + #[test] + fn apply_custom_replaces_existing_idempotently() { + let registry = CapabilityRegistry::with_built_ins(); + + let toml = r#" + [[custom]] + id = "replaceable" + display_name = "V1" + description = "Version 1" + agent_role = "role_v1" + task_types = ["task_v1"] + system_prompt = "V1 prompt" + mcp_tools = [] + preferred_provider = "claude" + preferred_model = "claude-sonnet-4-6" + max_tokens = 1024 + temperature = 0.5 + priority = 50 + parallelizable = true + "#; + + let config = CapabilityLoader::parse(toml).unwrap(); + CapabilityLoader::apply(config, ®istry).unwrap(); + + let toml2 = r#" + [[custom]] + id = "replaceable" + display_name = "V2" + description = "Version 2" + agent_role = "role_v2" + task_types = ["task_v2"] + system_prompt = "V2 prompt" + mcp_tools = [] + preferred_provider = "claude" + preferred_model = "claude-opus-4-6" + max_tokens = 2048 + temperature = 0.3 + priority = 60 + parallelizable = false + "#; + + let config2 = CapabilityLoader::parse(toml2).unwrap(); + CapabilityLoader::apply(config2, ®istry).unwrap(); + + let spec = registry.get("replaceable").unwrap().spec(); + assert_eq!(spec.agent_role, "role_v2"); + assert_eq!(spec.preferred_model, "claude-opus-4-6"); + } + + #[test] + fn from_file_nonexistent_returns_io_error() { + let err = CapabilityLoader::from_file("/nonexistent/path/capabilities.toml").unwrap_err(); + assert!(matches!(err, LoaderError::Io { .. })); + } +} diff --git a/crates/vapora-capabilities/src/registry.rs b/crates/vapora-capabilities/src/registry.rs new file mode 100644 index 0000000..22866eb --- /dev/null +++ b/crates/vapora-capabilities/src/registry.rs @@ -0,0 +1,340 @@ +use std::sync::Arc; + +use parking_lot::RwLock; +use thiserror::Error; +use tracing::{debug, info}; +use vapora_shared::AgentDefinition; + +use crate::built_in::{CodeReviewer, DocGenerator, PRMonitor}; +use crate::capability::{Capability, CustomCapability}; + +#[derive(Debug, Error)] +pub enum CapabilityError { + #[error("Capability not found: {0}")] + NotFound(String), + + #[error("Capability already registered: {0}")] + AlreadyRegistered(String), + + #[error("Invalid capability spec: {0}")] + InvalidSpec(String), +} + +/// Thread-safe catalog of capability packages. +/// +/// Read-heavy (looked up on every task dispatch), written rarely (at startup +/// or when loading user overrides). Uses `parking_lot::RwLock` for +/// uncontended-read performance. +/// +/// # Examples +/// +/// ```rust +/// use vapora_capabilities::CapabilityRegistry; +/// +/// let registry = CapabilityRegistry::with_built_ins(); +/// let def = registry.activate("code-reviewer").unwrap(); +/// assert_eq!(def.role, "code_reviewer"); +/// ``` +#[derive(Debug)] +pub struct CapabilityRegistry { + capabilities: RwLock>>, +} + +impl CapabilityRegistry { + /// Create an empty registry. + pub fn new() -> Self { + Self { + capabilities: RwLock::new(std::collections::HashMap::new()), + } + } + + /// Create a registry pre-populated with all built-in capabilities. + /// + /// Built-ins registered: `code-reviewer`, `doc-generator`, `pr-monitor`. + pub fn with_built_ins() -> Self { + let registry = Self::new(); + // These are infallible — built-in IDs are unique by construction. + registry + .register(CodeReviewer) + .expect("code-reviewer id collision"); + registry + .register(DocGenerator) + .expect("doc-generator id collision"); + registry + .register(PRMonitor) + .expect("pr-monitor id collision"); + + info!( + "CapabilityRegistry initialized with {} built-in capabilities", + registry.len() + ); + registry + } + + /// Register a capability. Returns [`CapabilityError::AlreadyRegistered`] + /// if a capability with the same ID is already present. + pub fn register(&self, capability: C) -> Result<(), CapabilityError> { + let spec = capability.spec(); + if spec.id.is_empty() { + return Err(CapabilityError::InvalidSpec( + "capability id must not be empty".to_string(), + )); + } + + let mut map = self.capabilities.write(); + if map.contains_key(&spec.id) { + return Err(CapabilityError::AlreadyRegistered(spec.id)); + } + + debug!("Registered capability: {}", spec.id); + map.insert(spec.id, Arc::new(capability)); + Ok(()) + } + + /// Register or replace a capability (idempotent — used for TOML overrides). + /// + /// If a capability with the same ID already exists it is replaced silently. + /// Use this when applying user-supplied configuration overrides. + pub fn register_or_replace(&self, capability: C) { + let spec = capability.spec(); + debug!("Registering/replacing capability: {}", spec.id); + self.capabilities + .write() + .insert(spec.id, Arc::new(capability)); + } + + /// Replace a built-in capability with a custom override spec. + /// + /// Returns [`CapabilityError::NotFound`] if the ID does not exist. + pub fn override_spec( + &self, + id: &str, + new_spec: crate::capability::CapabilitySpec, + ) -> Result<(), CapabilityError> { + if new_spec.id != id { + return Err(CapabilityError::InvalidSpec(format!( + "override spec id '{}' does not match target id '{}'", + new_spec.id, id + ))); + } + + let mut map = self.capabilities.write(); + if !map.contains_key(id) { + return Err(CapabilityError::NotFound(id.to_string())); + } + + debug!("Overriding capability: {}", id); + map.insert(id.to_string(), Arc::new(CustomCapability(new_spec))); + Ok(()) + } + + /// Retrieve a capability by ID. + pub fn get(&self, id: &str) -> Option> { + self.capabilities.read().get(id).cloned() + } + + /// Returns `true` if a capability with the given ID is registered. + pub fn contains(&self, id: &str) -> bool { + self.capabilities.read().contains_key(id) + } + + /// List all registered capability IDs. + pub fn list_ids(&self) -> Vec { + let mut ids: Vec = self.capabilities.read().keys().cloned().collect(); + ids.sort(); + ids + } + + /// Return all registered capabilities. + pub fn list_all(&self) -> Vec> { + self.capabilities.read().values().cloned().collect() + } + + /// Number of registered capabilities. + pub fn len(&self) -> usize { + self.capabilities.read().len() + } + + /// Returns `true` if no capabilities are registered. + pub fn is_empty(&self) -> bool { + self.capabilities.read().is_empty() + } + + /// Convert a registered capability to an [`AgentDefinition`]. + /// + /// This is the primary activation path: look up by ID and produce an + /// `AgentDefinition` that can be used with + /// [`vapora_agents::registry::AgentMetadata::new`] and then registered in + /// the coordinator. + /// + /// # Errors + /// + /// Returns [`CapabilityError::NotFound`] if no capability with the given ID + /// is registered. + pub fn activate(&self, id: &str) -> Result { + let cap = self + .get(id) + .ok_or_else(|| CapabilityError::NotFound(id.to_string()))?; + + let def = cap.to_agent_definition(); + debug!( + "Activated capability '{}' → role '{}' (provider: {}, model: {})", + id, def.role, def.llm_provider, def.llm_model + ); + Ok(def) + } +} + +impl Default for CapabilityRegistry { + fn default() -> Self { + Self::with_built_ins() + } +} + +#[cfg(test)] +mod tests { + use super::*; + use crate::capability::CapabilitySpec; + + fn minimal_spec(id: &str, role: &str) -> CapabilitySpec { + CapabilitySpec { + id: id.to_string(), + display_name: "Test".to_string(), + description: "Test capability".to_string(), + agent_role: role.to_string(), + task_types: vec!["test_task".to_string()], + system_prompt: "You are a test agent.".to_string(), + mcp_tools: vec![], + preferred_provider: "claude".to_string(), + preferred_model: "claude-sonnet-4-6".to_string(), + max_tokens: 1024, + temperature: 0.5, + priority: 50, + parallelizable: true, + } + } + + #[test] + fn built_ins_count_is_three() { + let registry = CapabilityRegistry::with_built_ins(); + assert_eq!(registry.len(), 3); + } + + #[test] + fn built_ins_ids_are_present() { + let registry = CapabilityRegistry::with_built_ins(); + assert!(registry.contains("code-reviewer")); + assert!(registry.contains("doc-generator")); + assert!(registry.contains("pr-monitor")); + } + + #[test] + fn get_nonexistent_returns_none() { + let registry = CapabilityRegistry::with_built_ins(); + assert!(registry.get("nonexistent-capability").is_none()); + } + + #[test] + fn activate_produces_agent_definition_with_system_prompt() { + let registry = CapabilityRegistry::with_built_ins(); + let def = registry.activate("code-reviewer").unwrap(); + + assert_eq!(def.role, "code_reviewer"); + assert!( + def.system_prompt.is_some(), + "system_prompt must be populated by activation" + ); + let prompt = def.system_prompt.unwrap(); + assert!( + !prompt.is_empty(), + "system_prompt must not be empty after activation" + ); + } + + #[test] + fn activate_nonexistent_returns_not_found() { + let registry = CapabilityRegistry::with_built_ins(); + let err = registry.activate("does-not-exist").unwrap_err(); + assert!(matches!(err, CapabilityError::NotFound(_))); + } + + #[test] + fn register_duplicate_returns_already_registered() { + let registry = CapabilityRegistry::new(); + let spec = minimal_spec("my-cap", "my_role"); + registry + .register(CustomCapability(spec.clone())) + .expect("first registration succeeds"); + + let err = registry.register(CustomCapability(spec)).unwrap_err(); + assert!(matches!(err, CapabilityError::AlreadyRegistered(_))); + } + + #[test] + fn register_or_replace_overwrites_existing() { + let registry = CapabilityRegistry::new(); + let spec_v1 = minimal_spec("my-cap", "role_v1"); + let spec_v2 = minimal_spec("my-cap", "role_v2"); + + registry.register(CustomCapability(spec_v1)).unwrap(); + registry.register_or_replace(CustomCapability(spec_v2)); + + let retrieved = registry.get("my-cap").unwrap(); + assert_eq!(retrieved.spec().agent_role, "role_v2"); + } + + #[test] + fn override_spec_replaces_built_in_model() { + let registry = CapabilityRegistry::with_built_ins(); + let mut new_spec = registry.get("code-reviewer").unwrap().spec(); + new_spec.preferred_model = "claude-sonnet-4-6".to_string(); + + registry.override_spec("code-reviewer", new_spec).unwrap(); + + let updated = registry.activate("code-reviewer").unwrap(); + assert_eq!(updated.llm_model, "claude-sonnet-4-6"); + } + + #[test] + fn override_spec_wrong_id_returns_invalid() { + let registry = CapabilityRegistry::with_built_ins(); + let mut spec = registry.get("code-reviewer").unwrap().spec(); + spec.id = "pr-monitor".to_string(); // mismatch + + let err = registry.override_spec("code-reviewer", spec).unwrap_err(); + assert!(matches!(err, CapabilityError::InvalidSpec(_))); + } + + #[test] + fn list_ids_is_sorted() { + let registry = CapabilityRegistry::with_built_ins(); + let ids = registry.list_ids(); + let mut sorted = ids.clone(); + sorted.sort(); + assert_eq!(ids, sorted); + } + + #[test] + fn activate_doc_generator_correct_role() { + let registry = CapabilityRegistry::with_built_ins(); + let def = registry.activate("doc-generator").unwrap(); + assert_eq!(def.role, "documenter"); + assert!(def.system_prompt.is_some()); + } + + #[test] + fn activate_pr_monitor_correct_role() { + let registry = CapabilityRegistry::with_built_ins(); + let def = registry.activate("pr-monitor").unwrap(); + assert_eq!(def.role, "monitor"); + assert!(def.system_prompt.is_some()); + } + + #[test] + fn register_empty_id_returns_invalid() { + let registry = CapabilityRegistry::new(); + let spec = minimal_spec("", "role"); + let err = registry.register(CustomCapability(spec)).unwrap_err(); + assert!(matches!(err, CapabilityError::InvalidSpec(_))); + } +} diff --git a/crates/vapora-shared/src/agent_definition.rs b/crates/vapora-shared/src/agent_definition.rs new file mode 100644 index 0000000..794e1f4 --- /dev/null +++ b/crates/vapora-shared/src/agent_definition.rs @@ -0,0 +1,44 @@ +use serde::{Deserialize, Serialize}; + +/// Full specification for deploying a domain-optimized agent. +/// +/// Produced by [`vapora_capabilities::CapabilityRegistry::activate`] and +/// consumed by [`vapora_agents::registry::AgentMetadata`] construction. +/// Kept in `vapora-shared` so both crates can reference it without a +/// circular dependency. +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct AgentDefinition { + /// Agent role name used for task routing (e.g., `"code_reviewer"`). + pub role: String, + /// Human-readable description shown in UIs and logs. + pub description: String, + /// Preferred LLM provider name (e.g., `"claude"`). + pub llm_provider: String, + /// Preferred model within the provider (e.g., `"claude-opus-4-6"`). + pub llm_model: String, + /// Whether multiple instances may run concurrently. + #[serde(default)] + pub parallelizable: bool, + /// Assignment priority 0–100. + #[serde(default = "default_priority")] + pub priority: u32, + /// Task-type strings used by the coordinator for capability-based routing. + #[serde(default)] + pub capabilities: Vec, + /// Domain-optimized system prompt injected before every task. + /// `None` falls back to the agent's generic prompt. + #[serde(default)] + pub system_prompt: Option, +} + +fn default_priority() -> u32 { + 50 +} + +impl AgentDefinition { + /// Builder: attach a domain-optimized system prompt. + pub fn with_system_prompt(mut self, prompt: impl Into) -> Self { + self.system_prompt = Some(prompt.into()); + self + } +} diff --git a/crates/vapora-shared/src/lib.rs b/crates/vapora-shared/src/lib.rs index cd85957..c3bc710 100644 --- a/crates/vapora-shared/src/lib.rs +++ b/crates/vapora-shared/src/lib.rs @@ -1,8 +1,10 @@ // vapora-shared: Shared types and utilities for VAPORA v1.0 // Foundation: Minimal skeleton with core types +pub mod agent_definition; pub mod error; pub mod models; pub mod validation; +pub use agent_definition::AgentDefinition; pub use error::{Result, VaporaError}; diff --git a/docs/adrs/0037-capability-packages.md b/docs/adrs/0037-capability-packages.md new file mode 100644 index 0000000..f093412 --- /dev/null +++ b/docs/adrs/0037-capability-packages.md @@ -0,0 +1,185 @@ +# ADR-0037: Capability Packages — `vapora-capabilities` Crate + +**Status**: Accepted +**Date**: 2026-02-26 +**Deciders**: VAPORA Team +**Technical Story**: VAPORA agents carried generic roles (developer, reviewer, architect) but had no domain-specific system prompts or model defaults — every new deployment required manual prompt engineering per agent before first use. + +--- + +## Decision + +Introduce a dedicated `vapora-capabilities` crate that provides **zero-config capability bundles**, and relocate `AgentDefinition` to `vapora-shared` to eliminate a circular dependency: + +1. `vapora-capabilities` exposes a `Capability` trait, `CapabilitySpec` struct, `CapabilityRegistry`, and `CapabilityLoader` — built-in implementations cover `CodeReviewer`, `DocGenerator`, and `PRMonitor`. +2. `AgentDefinition` is moved from `vapora-agents::config` to `vapora-shared`; `vapora-agents::config` re-exports it for backward compatibility. +3. `AgentExecutor` gains a `with_router(Arc)` builder method; the system prompt from `CapabilitySpec` is forwarded as a `Role::System` message through the existing `TypeDialogAdapter::build_messages(prompt, context)` path. +4. `AgentCoordinator` gains in-process executor dispatch via a `DashMap>`; the shard lock is released before `.await` by cloning the `Sender` out of the map. + +--- + +## Context + +### Gap: Generic Roles Require Manual Configuration + +VAPORA's agent roles were defined as enum variants (`AgentRole::Developer`, `AgentRole::Reviewer`, etc.) with no attached system prompt or model preference. A freshly deployed instance had functionally identical agents regardless of the task domain. Operators had to locate `agents.toml`, write a system prompt, select a model, and restart the agents server before agents produced domain-appropriate responses. This repeated for every deployment. + +### Competitive Signal: OpenFang "Hands" + +OpenFang's "Hands" concept ships pre-configured agent personas as first-class artifacts: a `code-review` Hand knows which model to use, what temperature to set, and what system prompt to send — activated with a single config entry. VAPORA's equivalent required three files and a server restart. The gap was not capability but packaging. + +### Circular Dependency Risk + +`AgentDefinition` was the struct that capability specs would need to produce in order to register a new agent. It lived in `vapora-agents::config`. If `vapora-capabilities` imported `vapora-agents` to get `AgentDefinition`, and `vapora-agents` imported `vapora-capabilities` to load built-in capabilities at startup, the workspace would have a compile-time cycle. Cargo does not resolve intra-workspace circular dependencies. + +`AgentDefinition` is a plain data-transfer struct — no orchestration logic, no I/O, no runtime state. Its correct home is `vapora-shared`. + +--- + +## Implementation + +### Crate Structure (`vapora-capabilities`) + +```text +vapora-capabilities/ +├── src/ +│ ├── lib.rs — pub re-exports (Capability, CapabilitySpec, CapabilityRegistry, CapabilityLoader) +│ ├── capability.rs — Capability trait + CapabilitySpec struct +│ ├── registry.rs — CapabilityRegistry: name → Arc +│ ├── loader.rs — CapabilityLoader: TOML file + env override resolution +│ └── builtins/ +│ ├── mod.rs — register_defaults(registry: &mut CapabilityRegistry) +│ ├── code_reviewer.rs — CodeReviewer (Opus 4.6, temp 0.1) +│ ├── doc_generator.rs — DocGenerator (Sonnet 4.6, temp 0.3) +│ └── pr_monitor.rs — PRMonitor (Sonnet 4.6, temp 0.1) +``` + +### `Capability` Trait and `CapabilitySpec` + +```rust +pub trait Capability: Send + Sync { + fn name(&self) -> &str; + fn spec(&self) -> CapabilitySpec; +} + +pub struct CapabilitySpec { + pub system_prompt: String, + pub model: String, + pub temperature: f32, + pub agent_definition: AgentDefinition, // from vapora-shared +} +``` + +`CapabilitySpec` is intentionally flat — no nested `Option` hierarchies. TOML overrides are applied by `CapabilityLoader` before the spec reaches `AgentExecutor`. + +### Built-in Capabilities + +| Name | Model | Temperature | Purpose | +|------|-------|-------------|---------| +| `code-reviewer` | `claude-opus-4-6` | 0.1 | Structured code review with severity classification | +| `doc-generator` | `claude-sonnet-4-6` | 0.3 | Module and API documentation generation | +| `pr-monitor` | `claude-sonnet-4-6` | 0.1 | PR diff analysis and merge-readiness assessment | + +Temperature 0.1 for review tasks enforces determinism; 0.3 for generation allows phrasing variation without hallucination risk at this task type. + +### TOML Override System + +Operators can override any `CapabilitySpec` field without touching source code: + +```toml +[capabilities.code-reviewer] +model = "claude-sonnet-4-6" +temperature = 0.2 + +[capabilities.doc-generator] +system_prompt = "You are a documentation expert specializing in Rust crate APIs." +``` + +Unset fields retain the built-in default. `CapabilityLoader::load` merges partial TOML structs using `Option` fields internally — a `None` after deserialization means "keep built-in value", not "clear to empty". + +### `AgentDefinition` Relocation + +```text +Before: vapora-agents::config::AgentDefinition +After: vapora-shared::models::AgentDefinition + +vapora-agents::config: + pub use vapora_shared::models::AgentDefinition; // backward compat +``` + +All existing call sites in `vapora-agents`, `vapora-backend`, and `vapora-a2a` continue to compile without change. The re-export is the only required modification in `vapora-agents`. + +### `AgentExecutor` — System Prompt Forwarding + +`AgentExecutor` gains one builder method: + +```rust +impl AgentExecutor { + pub fn with_router(mut self, router: Arc) -> Self { + self.router = Some(router); + self + } +} +``` + +The capability's `system_prompt` is passed as `context` to `TypeDialogAdapter::build_messages(prompt, context)`, which already mapped `context` to a leading `Role::System` message before this change. No new message-construction logic is needed. + +### `AgentCoordinator` — In-Process Executor Dispatch + +```rust +// DashMap shard released before .await by cloning the Sender +let tx: Sender = { + self.executor_channels + .get(&agent_id) + .map(|entry| entry.value().clone()) + .ok_or(CoordinatorError::AgentNotFound(agent_id.clone()))? +}; +tx.send(assignment).await?; +``` + +The `DashMap::get` guard (which holds a shard read lock) is dropped at the end of the block. The `.await` on `tx.send` occurs after the lock is released, eliminating the risk of a Tokio task yielding while holding a `DashMap` shard lock. + +--- + +## Consequences + +### Positive + +- Deploy VAPORA, activate a capability by name, and it works — no manual system prompt engineering required. +- TOML overrides allow per-deployment model preferences without code changes or recompilation of the capabilities crate. +- `AgentDefinition` in `vapora-shared` is architecturally correct: it is a DTO with no behavior, and `vapora-shared` is the designated home for shared data types. +- The `DashMap` lock-before-await anti-pattern is eliminated from `AgentCoordinator`. + +### Negative + +- `vapora-agents` binary depends on `vapora-capabilities`. Adding a new built-in capability requires recompiling and redeploying the agents server. Capabilities are not dynamically loaded. +- Built-in system prompts are opinionated. An operator who disagrees with the `CodeReviewer` prompt must override it explicitly; there is no mechanism to reset to "no prompt" after a capability is activated short of removing the config entry. + +### Neutral + +- Partial TOML override via `Option` fields means unset override fields silently preserve the built-in value. This is the intended behavior but requires operators to understand that omitting a field is not the same as clearing it. +- `vapora-capabilities` has no runtime dependency on SurrealDB or NATS — it is a pure configuration and trait-definition crate. It can be unit-tested without any external services. + +--- + +## Alternatives Considered + +### `CapabilityRegistry` Initialization in `vapora-backend` + +Register agents via HTTP call to the agents server at backend startup. Rejected: it adds a temporal dependency (agents server must be up before backend can finish booting), introduces a network round-trip on startup, and couples the backend's readiness to the agents server's availability — three failure modes for what is a configuration operation. + +### Keep `AgentDefinition` in `vapora-agents`, Expose Capabilities via a Plugin Trait + +Define a `CapabilityPlugin` trait in a thin interface crate; `vapora-agents` depends on the interface, `vapora-capabilities` implements it. Rejected: this adds a third crate (`vapora-capability-api`) to resolve what is fundamentally a DTO placement error. `AgentDefinition` has no behavior to isolate behind a trait — moving it to `vapora-shared` is the minimal correct fix. + +### Static TOML Baked into `agents.toml` at CLI Time + +Generate capability configuration as a static TOML block via `vapora-cli` and write it into `agents.toml`. Rejected: this loses runtime composability (capabilities cannot be activated or deactivated without regenerating the file), makes the override system asymmetric (CLI generates, operator edits by hand), and provides no registry abstraction for future dynamic activation. + +--- + +## Relation to Prior Decisions + +- Builds on `AgentCoordinator` patterns established in ADR-0024 and refined in ADR-0033; in-process dispatch via `DashMap` follows the same lock-discipline patterns applied to the workflow engine. +- `AgentDefinition` relocation is consistent with ADR-0001 (workspace crate responsibilities): `vapora-shared` holds types depended on by multiple crates; orchestration crates hold behavior. +- TOML override resolution follows the same `Option`-merge pattern used in `vapora-llm-router::config` for per-rule model overrides (ADR-0012). diff --git a/docs/features/README.md b/docs/features/README.md index c2318ef..9a9eff5 100644 --- a/docs/features/README.md +++ b/docs/features/README.md @@ -7,3 +7,4 @@ VAPORA capabilities and overview documentation. - **[Features Overview](overview.md)** — Complete feature list and descriptions including learning-based agent selection, cost optimization, and swarm coordination - **[Workflow Orchestrator](workflow-orchestrator.md)** — Multi-stage pipelines, approval gates, artifacts, autonomous scheduling, and distributed fire-lock - **[Notification Channels](notification-channels.md)** — Webhook delivery to Slack, Discord, and Telegram with built-in secret resolution +- **[Capability Packages](capability-packages.md)** — Pre-built domain-optimized agent bundles (CodeReviewer, DocGenerator, PRMonitor) with system prompts, tool configs, and LLM preferences diff --git a/docs/features/capability-packages.md b/docs/features/capability-packages.md new file mode 100644 index 0000000..9742a54 --- /dev/null +++ b/docs/features/capability-packages.md @@ -0,0 +1,113 @@ +# Pre-built Capability Packages + +The `vapora-capabilities` crate provides a registry of pre-configured agent capabilities. Each capability is a self-contained bundle that defines everything an agent needs to operate: its system prompt, model preferences, task types, MCP tool bindings, scheduling priority, and inference temperature. + +## Overview + +A capability package encapsulates: + +- `system_prompt` — the instruction context injected as `Role::System` into every LLM call +- `preferred_provider` and `preferred_model` — resolved at runtime through `LLMRouter` +- `task_types` — the set of task labels this capability handles (used by the swarm coordinator for assignment) +- `mcp_tools` — tool names exposed to the agent via the MCP gateway +- `priority` — integer weight (0–100) for swarm scheduling decisions +- `temperature` — controls output determinism; lower values for review/analysis, higher for generative tasks + +The value proposition is operational simplicity: call `CapabilityRegistry::with_built_ins()`, pass the resulting definitions to `AgentExecutor`, and the agents are ready. No manual system prompt authoring, no per-agent LLM config wiring. + +## Built-in Capabilities + +| ID | Role | Purpose | +|----|------|---------| +| `code-reviewer` | `code_reviewer` | Security and correctness-focused code review. Uses Claude Opus 4.6 at temperature 0.1. MCP tools: `file_read`, `file_list`, `git_diff`, `code_search` | +| `doc-generator` | `documenter` | Generates technical documentation from source code. Uses Claude Sonnet 4.6 at temperature 0.3. MCP tools: `file_read`, `file_list`, `code_search`, `file_write` | +| `pr-monitor` | `monitor` | PR health monitoring and merge readiness assessment. Uses Claude Sonnet 4.6 at temperature 0.1. MCP tools: `git_diff`, `git_log`, `git_status`, `file_list`, `file_read` | + +## Runtime Flow + +At agent server startup: + +```rust +let registry = CapabilityRegistry::with_built_ins(); +``` + +This populates the registry with all built-in `CapabilityPackage` instances. When an agent is activated: + +```rust +let definition: AgentDefinition = registry.activate("code-reviewer")?; +``` + +`activate()` resolves the capability into an `AgentDefinition` with `system_prompt` fully populated. `AgentExecutor` receives this definition and prepends the prompt as `Role::System` on every invocation before forwarding the request to the LLM: + +```rust +let response = router + .complete_with_budget(role, model, messages_with_system, budget_ctx) + .await?; +``` + +`LLMRouter::complete_with_budget()` applies the capability's `preferred_provider`, `preferred_model`, and token budget. If the budget is exhausted, the router falls back through the configured fallback chain transparently. + +## TOML Customization + +Capabilities can be overridden or extended via a TOML file without modifying built-ins: + +```toml +# Override built-in: use Sonnet instead of Opus for code-reviewer +[[override]] +id = "code-reviewer" +preferred_model = "claude-sonnet-4-6" +max_tokens = 16384 + +# Add a custom capability +[[custom]] +id = "db-optimizer" +display_name = "Database Optimizer" +description = "Analyzes and optimizes SQL queries and schema" +agent_role = "db_optimizer" +task_types = ["db_optimization", "query_review"] +system_prompt = "You are a database performance expert..." +mcp_tools = ["file_read", "code_search"] +preferred_provider = "claude" +preferred_model = "claude-sonnet-4-6" +max_tokens = 4096 +temperature = 0.2 +priority = 70 +parallelizable = true +``` + +Load and apply with: + +```rust +use vapora_capabilities::{CapabilityRegistry, CapabilityLoader}; + +let registry = CapabilityRegistry::with_built_ins(); +CapabilityLoader::load_and_apply("config/capabilities.toml", ®istry)?; +``` + +`load_and_apply` merges `[[override]]` entries into existing built-ins (only specified fields are replaced) and inserts `[[custom]]` entries as new packages. The registry is append-only after construction; overrides operate by replacement of the matching entry. + +## Architecture + +`vapora-capabilities` sits at the base of the dependency graph to avoid circular imports: + +```text +vapora-shared + └── vapora-capabilities (depends on vapora-shared for AgentDefinition) + └── vapora-agents (depends on vapora-capabilities for registry access) +``` + +`vapora-capabilities` imports only `vapora-shared` types (`AgentDefinition`, `AgentRole`, `LLMProvider`). It has no dependency on `vapora-agents`. This means capability definitions can be constructed, tested, and loaded independently of the agent runtime. + +## Environment Variables + +These configure the LLM router used by all executors that receive a capability definition: + +| Variable | Purpose | +|----------|---------| +| `LLM_ROUTER_CONFIG` | Path to the router configuration TOML file | +| `ANTHROPIC_API_KEY` | API key for Claude models (Opus, Sonnet) | +| `OPENAI_API_KEY` | API key for OpenAI models (GPT-4, etc.) | +| `OLLAMA_URL` | Base URL for a local Ollama instance | +| `OLLAMA_MODEL` | Default model name to use when routing to Ollama | + +The router config at `LLM_ROUTER_CONFIG` defines routing rules, fallback chains, and per-role budget limits. Capability packages specify a `preferred_provider` and `preferred_model`, but the router enforces budget constraints and can override the preference if a fallback is triggered. diff --git a/docs/guides/capability-packages-guide.md b/docs/guides/capability-packages-guide.md new file mode 100644 index 0000000..b5283c4 --- /dev/null +++ b/docs/guides/capability-packages-guide.md @@ -0,0 +1,232 @@ +# Capability Packages Guide + +## What Is a Capability Package + +A capability package bundles everything an agent needs to handle a specific domain into a single reusable unit. Activating one produces an `AgentDefinition` that the coordinator registers and routes tasks to. + +Each package carries: + +- `system_prompt` — domain-optimized instructions injected as the LLM system message before every task execution +- `preferred_model` / `preferred_provider` — e.g. `claude-opus-4-6` for deep code reasoning, `claude-sonnet-4-6` for cost-efficient writing tasks +- `task_types` — strings matched by the coordinator's `extract_task_type` heuristic against task titles and descriptions to select this agent +- `mcp_tools` — list of MCP tool IDs (`file_read`, `git_diff`, etc.) activated for this agent via `vapora-mcp-server` +- `temperature` / `max_tokens` / `priority` / `parallelizable` — execution parameters controlling output quality, cost, scheduling order, and concurrency + +## Built-in Capabilities + +| ID | Role | Model | Temp | Max Tokens | Use Case | +|----|------|-------|------|------------|----------| +| `code-reviewer` | `code_reviewer` | `claude-opus-4-6` | 0.1 | 8192 | Security and correctness review; JSON output with severity levels and `merge_ready` flag | +| `doc-generator` | `documenter` | `claude-sonnet-4-6` | 0.3 | 16384 | Source-to-documentation generation with rustdoc/JSDoc/docstring output | +| `pr-monitor` | `monitor` | `claude-sonnet-4-6` | 0.1 | 4096 | PR health check; `READY` / `NEEDS_REVIEW` / `BLOCKED` status output | + +The `code-reviewer` uses Opus 4.6 because review tasks benefit from deep reasoning over complex code patterns. Temperature 0.1 ensures reproducible findings across repeated runs on the same diff. `pr-monitor` is `parallelizable = false` — concurrent runs on the same PR would produce conflicting status reports. + +## Activating Built-ins at Runtime + +The agent server calls `CapabilityRegistry::with_built_ins()` at startup automatically. All three built-ins are registered and their executors spawned before the HTTP listener opens — no action required when running the standard agent server (`crates/vapora-agents`). + +For programmatic use: + +```rust +use vapora_capabilities::CapabilityRegistry; + +let registry = CapabilityRegistry::with_built_ins(); +// "code-reviewer", "doc-generator", "pr-monitor" are now registered + +let def = registry.activate("code-reviewer")?; +// def.role == "code_reviewer" +// def.system_prompt == Some("") +// def.llm_model == "claude-opus-4-6" +// def.llm_provider == "claude" +``` + +`activate` returns an `AgentDefinition` from `vapora-shared`. The system prompt is embedded in the definition and available at `def.system_prompt` — the executor injects it before every task without any further lookup. + +## Overriding a Built-in + +### Via TOML Config File + +Override fields are applied on top of the existing built-in spec. Only fields present in TOML are changed; everything else keeps its default. An unknown override `id` is skipped with a warning, not an error. + +```toml +# config/capabilities.toml + +# Switch code-reviewer to Sonnet for cost savings +[[override]] +id = "code-reviewer" +preferred_model = "claude-sonnet-4-6" +max_tokens = 16384 + +# Replace the doc-generator system prompt for your tech stack +[[override]] +id = "doc-generator" +system_prompt = """ +You are a technical documentation specialist for Rust async systems. +Follow rustdoc conventions. All examples must be runnable. +""" +``` + +Load and apply at startup (or on config reload): + +```rust +use vapora_capabilities::{CapabilityRegistry, CapabilityLoader}; + +let registry = CapabilityRegistry::with_built_ins(); +CapabilityLoader::load_and_apply("config/capabilities.toml", ®istry)?; +``` + +`load_and_apply` reads the file, parses TOML, and applies overrides + custom entries in one call. The call is idempotent — re-applying the same file replaces existing specs rather than erroring. + +### Via the Registry API Directly + +```rust +use vapora_capabilities::{CapabilityRegistry, CapabilitySpec, CustomCapability}; + +let registry = CapabilityRegistry::with_built_ins(); + +// Fetch the current spec, mutate it, push it back +let mut spec = registry.get("code-reviewer").unwrap().spec(); +spec = spec.with_model("claude-sonnet-4-6").with_max_tokens(16384); + +registry.override_spec("code-reviewer", spec)?; +// Returns CapabilityError::NotFound if the id is not registered +// Returns CapabilityError::InvalidSpec if the spec id does not match the target id +``` + +## Adding a Custom Capability + +Custom entries in TOML are full `CapabilitySpec` definitions — all fields are required. They are registered with `register_or_replace`, so re-applying the config is safe. + +```toml +[[custom]] +id = "db-optimizer" +display_name = "Database Optimizer" +description = "Analyzes and optimizes SurrealQL queries and schema" +agent_role = "db_optimizer" +task_types = ["db_optimization", "query_review", "schema_review"] +system_prompt = """ +You are a SurrealDB performance expert. +Analyze queries and schema definitions for: index usage, full-table scans, +unnecessary JOINs, missing composite indexes. +Output JSON: { "issues": [...], "optimized_query": "...", "index_suggestions": [...] } +""" +mcp_tools = ["file_read", "code_search"] +preferred_provider = "claude" +preferred_model = "claude-sonnet-4-6" +max_tokens = 4096 +temperature = 0.1 +priority = 75 +parallelizable = true +``` + +The `task_types` list must overlap with words present in task titles or descriptions. The coordinator's heuristic tokenizes the task text and checks for matches against registered task-type strings. If no match is found, the task falls back to default role assignment. Use lowercase snake\_case strings that reflect verbs and nouns users will write in task titles (`"query_review"`, `"db_optimization"`). + +## Environment Variables + +The agent server reads provider credentials from the environment at startup to configure the LLM router. + +| Variable | Effect | +|----------|--------| +| `LLM_ROUTER_CONFIG` | Path to a `llm-router.toml` file; takes precedence over all individual API key variables | +| `ANTHROPIC_API_KEY` | Enables the `claude` provider; default model `claude-sonnet-4-6` | +| `OPENAI_API_KEY` | Enables the `openai` provider; default model `gpt-4o` | +| `OLLAMA_URL` | Enables the `ollama` provider (e.g. `http://localhost:11434`) | +| `OLLAMA_MODEL` | Model used with Ollama (default: `llama3.2`) | +| `BUDGET_CONFIG_PATH` | Path to budget config file (default: `config/agent-budgets.toml`) | + +If none of `LLM_ROUTER_CONFIG`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, or `OLLAMA_URL` are set, executors run in stub mode — tasks are accepted and return placeholder responses. This is intentional for integration tests and offline development. + +## Checking What Is Registered + +```rust +let ids = registry.list_ids(); +// sorted alphabetically: ["code-reviewer", "doc-generator", "pr-monitor"] + +let count = registry.len(); // 3 + +// Check and activate a specific capability +if registry.contains("db-optimizer") { + let def = registry.activate("db-optimizer")?; + println!("role: {}, model: {}", def.role, def.llm_model); +} + +// Iterate all registered capabilities (order is HashMap-based, not sorted) +for cap in registry.list_all() { + let spec = cap.spec(); + println!("{}: {} ({})", spec.id, spec.display_name, spec.preferred_model); +} +``` + +## Capability Spec Field Reference + +| Field | Type | Description | +|-------|------|-------------| +| `id` | `String` | Unique kebab-case identifier (e.g., `"code-reviewer"`) | +| `display_name` | `String` | Human-readable name shown in UIs and logs | +| `description` | `String` | Brief purpose description embedded in the agent's log entries | +| `agent_role` | `String` | Role name used by the coordinator for task routing (e.g., `"code_reviewer"`) | +| `task_types` | `Vec` | Keywords matched against task text by the coordinator heuristic | +| `system_prompt` | `String` | Full system message injected before every task execution | +| `mcp_tools` | `Vec` | MCP tool IDs available to this agent via `vapora-mcp-server` | +| `preferred_provider` | `String` | LLM provider name (`"claude"`, `"openai"`, `"ollama"`) | +| `preferred_model` | `String` | Model ID within the provider (e.g., `"claude-opus-4-6"`) | +| `max_tokens` | `u32` | Maximum output tokens per task execution | +| `temperature` | `f32` | Sampling temperature 0.0–1.0; lower = more deterministic | +| `priority` | `u32` | Assignment priority 0–100; higher = preferred when multiple agents match | +| `parallelizable` | `bool` | Whether multiple instances may run concurrently for the same task type | + +## Writing Your Own Built-in + +Built-ins are unit structs in `crates/vapora-capabilities/src/built_in/`. Follow this pattern: + +```rust +// crates/vapora-capabilities/src/built_in/sql_optimizer.rs +use crate::capability::{Capability, CapabilitySpec}; + +const SYSTEM_PROMPT: &str = r#"You are a SurrealDB query optimization expert. +Analyze the provided query or schema definition. +Output JSON: { "issues": [...], "optimized": "...", "indexes": [...] }"#; + +#[derive(Debug)] +pub struct SqlOptimizer; + +impl Capability for SqlOptimizer { + fn spec(&self) -> CapabilitySpec { + CapabilitySpec { + id: "sql-optimizer".to_string(), + display_name: "SQL Optimizer".to_string(), + description: "Optimizes SurrealQL queries and schema definitions".to_string(), + agent_role: "sql_optimizer".to_string(), + task_types: vec![ + "sql_optimization".to_string(), + "query_review".to_string(), + "schema_review".to_string(), + ], + system_prompt: SYSTEM_PROMPT.to_string(), + mcp_tools: vec!["file_read".to_string(), "code_search".to_string()], + preferred_provider: "claude".to_string(), + preferred_model: "claude-sonnet-4-6".to_string(), + max_tokens: 4096, + temperature: 0.1, + priority: 75, + parallelizable: true, + } + } +} +``` + +Then wire it into the module and registry: + +```rust +// crates/vapora-capabilities/src/built_in/mod.rs +mod sql_optimizer; +pub use sql_optimizer::SqlOptimizer; +``` + +```rust +// crates/vapora-capabilities/src/registry.rs — inside with_built_ins() +registry.register(SqlOptimizer).expect("sql-optimizer id collision"); +``` + +The `expect` on `register` is intentional — built-in IDs are unique by construction, and a collision at startup indicates a programming error that must be caught during development, not at runtime. diff --git a/docs/src/SUMMARY.md b/docs/src/SUMMARY.md index 820f5d2..a99ae77 100644 --- a/docs/src/SUMMARY.md +++ b/docs/src/SUMMARY.md @@ -21,6 +21,7 @@ - [Features Overview](../features/README.md) - [Platform Capabilities](../features/overview.md) +- [Capability Packages](../features/capability-packages.md) ## Architecture @@ -63,11 +64,13 @@ - [0026: Shared State](../adrs/0026-shared-state.md) - [0027: Documentation Layers](../adrs/0027-documentation-layers.md) - [0033: Workflow Engine Hardening](../adrs/0033-stratum-orchestrator-workflow-hardening.md) + - [0037: Capability Packages](../adrs/0037-capability-packages.md) ## Guides - [Workflow Persistence, Saga & Cedar](../guides/workflow-saga-persistence.md) - [RLM Usage Guide](../guides/rlm-usage-guide.md) +- [Capability Packages Guide](../guides/capability-packages-guide.md) ## Integration Guides @@ -105,8 +108,8 @@ --- -**Documentation Version**: 1.3.0 -**Last Updated**: 2026-02-21 +**Documentation Version**: 1.4.0 +**Last Updated**: 2026-02-26 **Status**: Production Ready For the latest updates, visit: https://github.com/vapora-platform/vapora