Vapora/docs/guides/capability-packages-guide.md
Jesús Pérez 765841b18f
feat(capabilities): add vapora-capabilities crate with in-process executor dispatch
- New vapora-capabilities crate: CapabilitySpec, Capability trait, CapabilityRegistry
     (parking_lot RwLock), CapabilityLoader (TOML overrides), 3 built-ins
     (code-reviewer, doc-generator, pr-monitor), 22 tests
   - Move AgentDefinition to vapora-shared to break capabilities↔agents circular dep
   - Wire system_prompt into AgentExecutor via LLMRouter.complete_with_budget
   - AgentCoordinator: in-process task dispatch via DashMap<String, Sender<TaskAssignment>>
   - server.rs: bootstrap CapabilityRegistry + LLMRouter from env, spawn executors per capability
   - Landing page: 620 tests, 21 crates, Capability Packages feature box
   - docs: capability-packages feature guide, ADR-0037, CHANGELOG, SUMMARY
2026-02-26 16:43:28 +00:00

# Capability Packages Guide
## What Is a Capability Package
A capability package bundles everything an agent needs to handle a specific domain into a single reusable unit. Activating one produces an `AgentDefinition` that the coordinator registers and routes tasks to.
Each package carries:
- `system_prompt` — domain-optimized instructions injected as the LLM system message before every task execution
- `preferred_model` / `preferred_provider` — e.g. `claude-opus-4-6` for deep code reasoning, `claude-sonnet-4-6` for cost-efficient writing tasks
- `task_types` — strings matched by the coordinator's `extract_task_type` heuristic against task titles and descriptions to select this agent
- `mcp_tools` — list of MCP tool IDs (`file_read`, `git_diff`, etc.) activated for this agent via `vapora-mcp-server`
- `temperature` / `max_tokens` / `priority` / `parallelizable` — execution parameters controlling output quality, cost, scheduling order, and concurrency
## Built-in Capabilities
| ID | Role | Model | Temp | Max Tokens | Use Case |
|----|------|-------|------|------------|----------|
| `code-reviewer` | `code_reviewer` | `claude-opus-4-6` | 0.1 | 8192 | Security and correctness review; JSON output with severity levels and `merge_ready` flag |
| `doc-generator` | `documenter` | `claude-sonnet-4-6` | 0.3 | 16384 | Source-to-documentation generation with rustdoc/JSDoc/docstring output |
| `pr-monitor` | `monitor` | `claude-sonnet-4-6` | 0.1 | 4096 | PR health check; `READY` / `NEEDS_REVIEW` / `BLOCKED` status output |
The `code-reviewer` uses Opus 4.6 because review tasks benefit from deep reasoning over complex code patterns. Temperature 0.1 ensures reproducible findings across repeated runs on the same diff. `pr-monitor` is `parallelizable = false` — concurrent runs on the same PR would produce conflicting status reports.
## Activating Built-ins at Runtime
The agent server calls `CapabilityRegistry::with_built_ins()` at startup automatically. All three built-ins are registered and their executors spawned before the HTTP listener opens — no action required when running the standard agent server (`crates/vapora-agents`).
For programmatic use:
```rust
use vapora_capabilities::CapabilityRegistry;
let registry = CapabilityRegistry::with_built_ins();
// "code-reviewer", "doc-generator", "pr-monitor" are now registered
let def = registry.activate("code-reviewer")?;
// def.role == "code_reviewer"
// def.system_prompt == Some("<full review prompt>")
// def.llm_model == "claude-opus-4-6"
// def.llm_provider == "claude"
```
`activate` returns an `AgentDefinition` from `vapora-shared`. The system prompt is embedded in the definition and available at `def.system_prompt` — the executor injects it before every task without any further lookup.
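The prompt injection described above can be pictured with a small sketch. It is not the `AgentExecutor` code: the `Role` and `Message` types and the `build_messages` helper are hypothetical, and only the `Option`-typed system prompt is taken from the snippet above.

```rust
/// Hypothetical chat roles; not part of vapora-shared.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
}

/// Hypothetical message type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    content: String,
}

/// Builds the message list for one task: the system prompt (if any) first,
/// then the task text as the user message.
fn build_messages(system_prompt: Option<&str>, task_text: &str) -> Vec<Message> {
    let mut messages = Vec::with_capacity(2);
    if let Some(prompt) = system_prompt {
        messages.push(Message {
            role: Role::System,
            content: prompt.to_string(),
        });
    }
    messages.push(Message {
        role: Role::User,
        content: task_text.to_string(),
    });
    messages
}
```

With a definition whose `system_prompt` is `None`, the same helper would send only the user message, which is one plausible way to handle agents that carry no capability prompt.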
## Overriding a Built-in
### Via TOML Config File
Override fields are applied on top of the existing built-in spec. Only fields present in TOML are changed; everything else keeps its default. An unknown override `id` is skipped with a warning, not an error.
```toml
# config/capabilities.toml
# Switch code-reviewer to Sonnet for cost savings
[[override]]
id = "code-reviewer"
preferred_model = "claude-sonnet-4-6"
max_tokens = 16384
# Replace the doc-generator system prompt for your tech stack
[[override]]
id = "doc-generator"
system_prompt = """
You are a technical documentation specialist for Rust async systems.
Follow rustdoc conventions. All examples must be runnable.
"""
```
Load and apply at startup (or on config reload):
```rust
use vapora_capabilities::{CapabilityRegistry, CapabilityLoader};
let registry = CapabilityRegistry::with_built_ins();
CapabilityLoader::load_and_apply("config/capabilities.toml", &registry)?;
```
`load_and_apply` reads the file, parses TOML, and applies overrides + custom entries in one call. The call is idempotent — re-applying the same file replaces existing specs rather than erroring.
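The merge rule for overrides (only fields present in the TOML change) can be sketched with `Option` fields. The `Spec` and `SpecOverride` types below are simplified, hypothetical stand-ins and not the crate's `CapabilitySpec` or loader types.

```rust
/// Simplified stand-in for a capability spec (hypothetical; the real
/// `CapabilitySpec` has more fields).
#[derive(Debug, Clone, PartialEq)]
struct Spec {
    id: String,
    preferred_model: String,
    max_tokens: u32,
    temperature: f32,
}

/// Hypothetical override record: every field except `id` is optional,
/// mirroring an `[[override]]` table where an absent key means "keep".
#[derive(Debug, Clone)]
struct SpecOverride {
    id: String,
    preferred_model: Option<String>,
    max_tokens: Option<u32>,
    temperature: Option<f32>,
}

/// Applies an override on top of a base spec. Fields left as `None`
/// keep the base value.
fn apply_override(base: &Spec, ov: &SpecOverride) -> Spec {
    // The caller is expected to have looked up `base` by `ov.id`.
    debug_assert_eq!(base.id, ov.id);
    Spec {
        id: base.id.clone(),
        preferred_model: ov
            .preferred_model
            .clone()
            .unwrap_or_else(|| base.preferred_model.clone()),
        max_tokens: ov.max_tokens.unwrap_or(base.max_tokens),
        temperature: ov.temperature.unwrap_or(base.temperature),
    }
}
```

Applying the same override a second time to the result yields an identical spec, which is the sense in which re-applying a config file is idempotent.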
### Via the Registry API Directly
```rust
use vapora_capabilities::{CapabilityRegistry, CapabilitySpec, CustomCapability};
let registry = CapabilityRegistry::with_built_ins();
// Fetch the current spec, mutate it, push it back
let mut spec = registry.get("code-reviewer").unwrap().spec();
spec = spec.with_model("claude-sonnet-4-6").with_max_tokens(16384);
registry.override_spec("code-reviewer", spec)?;
// Returns CapabilityError::NotFound if the id is not registered
// Returns CapabilityError::InvalidSpec if the spec id does not match the target id
```
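The two failure cases noted in the comments above can be illustrated with a toy registry. This is a sketch under assumptions: `MiniRegistry`, `RegistryError`, and the two-field `Spec` are hypothetical, and the error payloads and the order of the checks are guesses rather than the crate's actual behavior.

```rust
use std::collections::HashMap;

/// Hypothetical error type modeled on the two variants named above;
/// the payloads are assumptions.
#[derive(Debug, PartialEq)]
enum RegistryError {
    NotFound(String),
    InvalidSpec(String),
}

/// Minimal stand-in spec with only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct Spec {
    id: String,
    preferred_model: String,
}

/// Toy registry keyed by capability id (not the real `CapabilityRegistry`).
#[derive(Default)]
struct MiniRegistry {
    specs: HashMap<String, Spec>,
}

impl MiniRegistry {
    /// Replaces the spec stored under `target_id`.
    fn override_spec(&mut self, target_id: &str, spec: Spec) -> Result<(), RegistryError> {
        // The target must already be registered.
        if !self.specs.contains_key(target_id) {
            return Err(RegistryError::NotFound(target_id.to_string()));
        }
        // The replacement must carry the same id as the target.
        if spec.id != target_id {
            return Err(RegistryError::InvalidSpec(format!(
                "spec id '{}' does not match target '{}'",
                spec.id, target_id
            )));
        }
        self.specs.insert(target_id.to_string(), spec);
        Ok(())
    }
}
```

Rejecting a mismatched id keeps the map key and the stored spec in agreement, so a later lookup by id never returns a spec that claims a different identity.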
## Adding a Custom Capability
Custom entries in TOML are full `CapabilitySpec` definitions — all fields are required. They are registered with `register_or_replace`, so re-applying the config is safe.
```toml
[[custom]]
id = "db-optimizer"
display_name = "Database Optimizer"
description = "Analyzes and optimizes SurrealQL queries and schema"
agent_role = "db_optimizer"
task_types = ["db_optimization", "query_review", "schema_review"]
system_prompt = """
You are a SurrealDB performance expert.
Analyze queries and schema definitions for: index usage, full-table scans,
unnecessary JOINs, missing composite indexes.
Output JSON: { "issues": [...], "optimized_query": "...", "index_suggestions": [...] }
"""
mcp_tools = ["file_read", "code_search"]
preferred_provider = "claude"
preferred_model = "claude-sonnet-4-6"
max_tokens = 4096
temperature = 0.1
priority = 75
parallelizable = true
```
The `task_types` list must overlap with words present in task titles or descriptions. The coordinator's heuristic tokenizes the task text and checks for matches against registered task-type strings. If no match is found, the task falls back to default role assignment. Use lowercase snake\_case strings that reflect verbs and nouns users will write in task titles (`"query_review"`, `"db_optimization"`).
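To make the matching rule concrete, here is a minimal sketch of a token-based matcher. The guide does not show the body of `extract_task_type`, so `match_task_type` and its tokenization (lowercase the text, split on anything other than ASCII letters, digits, and underscores) are assumptions for illustration.

```rust
/// Hypothetical matcher: returns the first registered task type that
/// appears as a whole token in the task title or description.
fn match_task_type<'a>(
    title: &str,
    description: &str,
    task_types: &[&'a str],
) -> Option<&'a str> {
    let text = format!("{} {}", title, description).to_lowercase();
    // Assumed tokenization: split on anything that is not [a-z0-9_].
    let tokens: Vec<&str> = text
        .split(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .filter(|t| !t.is_empty())
        .collect();
    task_types
        .iter()
        .copied()
        .find(|tt| tokens.iter().any(|tok| *tok == *tt))
}
```

Under this assumed tokenization, a title such as "Run query_review on orders table" matches `query_review`, while "Review the query" matches nothing and would fall back to default role assignment. The real heuristic may be more lenient.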
## Environment Variables
The agent server reads provider credentials from the environment at startup to configure the LLM router.
| Variable | Effect |
|----------|--------|
| `LLM_ROUTER_CONFIG` | Path to a `llm-router.toml` file; takes precedence over all individual API key variables |
| `ANTHROPIC_API_KEY` | Enables the `claude` provider; default model `claude-sonnet-4-6` |
| `OPENAI_API_KEY` | Enables the `openai` provider; default model `gpt-4o` |
| `OLLAMA_URL` | Enables the `ollama` provider (e.g. `http://localhost:11434`) |
| `OLLAMA_MODEL` | Model used with Ollama (default: `llama3.2`) |
| `BUDGET_CONFIG_PATH` | Path to budget config file (default: `config/agent-budgets.toml`) |
If none of `LLM_ROUTER_CONFIG`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, or `OLLAMA_URL` are set, executors run in stub mode — tasks are accepted and return placeholder responses. This is intentional for integration tests and offline development.
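The precedence and fallback rules in this section can be summarized as a small decision function. This is a sketch, not the server's bootstrap code: `RouterMode` and `resolve_mode` are hypothetical names, and treating an empty value as unset is an assumption.

```rust
/// Hypothetical summary of how the router could end up configured.
#[derive(Debug, PartialEq)]
enum RouterMode {
    /// `LLM_ROUTER_CONFIG` is set: use that file, ignore the key variables.
    ConfigFile(String),
    /// One or more providers enabled through individual variables.
    Providers(Vec<&'static str>),
    /// Nothing configured: executors return placeholder responses.
    Stub,
}

/// Resolves the mode from an environment lookup. Pass
/// `|k| std::env::var(k).ok()` to read the real process environment.
fn resolve_mode(get: impl Fn(&str) -> Option<String>) -> RouterMode {
    // Treat empty values as unset (an assumption of this sketch).
    let set = |key: &str| get(key).filter(|v| !v.is_empty());

    if let Some(path) = set("LLM_ROUTER_CONFIG") {
        return RouterMode::ConfigFile(path);
    }
    let mut providers = Vec::new();
    if set("ANTHROPIC_API_KEY").is_some() {
        providers.push("claude");
    }
    if set("OPENAI_API_KEY").is_some() {
        providers.push("openai");
    }
    if set("OLLAMA_URL").is_some() {
        providers.push("ollama");
    }
    if providers.is_empty() {
        RouterMode::Stub
    } else {
        RouterMode::Providers(providers)
    }
}
```

Taking the lookup as a closure keeps the function testable without mutating the process environment, which is convenient for the integration-test scenario mentioned above.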
## Checking What Is Registered
```rust
let ids = registry.list_ids();
// sorted alphabetically: ["code-reviewer", "doc-generator", "pr-monitor"]
let count = registry.len(); // 3
// Check and activate a specific capability
if registry.contains("db-optimizer") {
    let def = registry.activate("db-optimizer")?;
    println!("role: {}, model: {}", def.role, def.llm_model);
}
// Iterate all registered capabilities (order is HashMap-based, not sorted)
for cap in registry.list_all() {
    let spec = cap.spec();
    println!("{}: {} ({})", spec.id, spec.display_name, spec.preferred_model);
}
```
## Capability Spec Field Reference
| Field | Type | Description |
|-------|------|-------------|
| `id` | `String` | Unique kebab-case identifier (e.g., `"code-reviewer"`) |
| `display_name` | `String` | Human-readable name shown in UIs and logs |
| `description` | `String` | Brief purpose description embedded in the agent's log entries |
| `agent_role` | `String` | Role name used by the coordinator for task routing (e.g., `"code_reviewer"`) |
| `task_types` | `Vec<String>` | Keywords matched against task text by the coordinator heuristic |
| `system_prompt` | `String` | Full system message injected before every task execution |
| `mcp_tools` | `Vec<String>` | MCP tool IDs available to this agent via `vapora-mcp-server` |
| `preferred_provider` | `String` | LLM provider name (`"claude"`, `"openai"`, `"ollama"`) |
| `preferred_model` | `String` | Model ID within the provider (e.g., `"claude-opus-4-6"`) |
| `max_tokens` | `u32` | Maximum output tokens per task execution |
| `temperature` | `f32` | Sampling temperature 0.0–1.0; lower = more deterministic |
| `priority` | `u32` | Assignment priority 0–100; higher = preferred when multiple agents match |
| `parallelizable` | `bool` | Whether multiple instances may run concurrently for the same task type |
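The `priority` rule (higher is preferred when several agents match) amounts to a maximum over the matching candidates. The sketch below is illustrative only: `Candidate` and `pick_by_priority` are hypothetical, and the tie-break on id is an assumption added so the result is deterministic.

```rust
/// Minimal candidate record (hypothetical; only the fields used here).
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    id: &'static str,
    priority: u32,
}

/// Picks the candidate with the highest priority. Ties go to the
/// lexicographically smaller id (an assumption of this sketch).
fn pick_by_priority(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates.iter().max_by(|a, b| {
        a.priority
            .cmp(&b.priority)
            // Reverse the id comparison so the smaller id ranks higher.
            .then_with(|| b.id.cmp(a.id))
    })
}
```

An empty candidate list returns `None`, which corresponds to the case where no registered agent matches the task.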
## Writing Your Own Built-in
Built-ins are unit structs in `crates/vapora-capabilities/src/built_in/`. Follow this pattern:
```rust
// crates/vapora-capabilities/src/built_in/sql_optimizer.rs
use crate::capability::{Capability, CapabilitySpec};
const SYSTEM_PROMPT: &str = r#"You are a SurrealDB query optimization expert.
Analyze the provided query or schema definition.
Output JSON: { "issues": [...], "optimized": "...", "indexes": [...] }"#;
#[derive(Debug)]
pub struct SqlOptimizer;
impl Capability for SqlOptimizer {
    fn spec(&self) -> CapabilitySpec {
        CapabilitySpec {
            id: "sql-optimizer".to_string(),
            display_name: "SQL Optimizer".to_string(),
            description: "Optimizes SurrealQL queries and schema definitions".to_string(),
            agent_role: "sql_optimizer".to_string(),
            task_types: vec![
                "sql_optimization".to_string(),
                "query_review".to_string(),
                "schema_review".to_string(),
            ],
            system_prompt: SYSTEM_PROMPT.to_string(),
            mcp_tools: vec!["file_read".to_string(), "code_search".to_string()],
            preferred_provider: "claude".to_string(),
            preferred_model: "claude-sonnet-4-6".to_string(),
            max_tokens: 4096,
            temperature: 0.1,
            priority: 75,
            parallelizable: true,
        }
    }
}
```
Then wire it into the module and registry:
```rust
// crates/vapora-capabilities/src/built_in/mod.rs
mod sql_optimizer;
pub use sql_optimizer::SqlOptimizer;
```
```rust
// crates/vapora-capabilities/src/registry.rs — inside with_built_ins()
registry.register(SqlOptimizer).expect("sql-optimizer id collision");
```
The `expect` on `register` is intentional — built-in IDs are unique by construction, and a collision at startup indicates a programming error that must be caught during development, not at runtime.
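The difference between the strict `register` used for built-ins and the lenient `register_or_replace` used for custom TOML entries can be sketched with a toy map. `MiniRegistry` and both method signatures below are hypothetical simplifications. The real methods take a capability value rather than two strings.

```rust
use std::collections::HashMap;

/// Toy registry mapping capability id to display name (hypothetical;
/// it only demonstrates the two insertion policies).
#[derive(Default)]
struct MiniRegistry {
    entries: HashMap<String, String>,
}

impl MiniRegistry {
    /// Strict insert: a duplicate id is an error and the existing
    /// entry is left untouched.
    fn register(&mut self, id: &str, name: &str) -> Result<(), String> {
        if self.entries.contains_key(id) {
            return Err(format!("capability '{}' already registered", id));
        }
        self.entries.insert(id.to_string(), name.to_string());
        Ok(())
    }

    /// Lenient insert: a duplicate id overwrites the previous entry,
    /// so re-applying the same config is safe.
    fn register_or_replace(&mut self, id: &str, name: &str) {
        self.entries.insert(id.to_string(), name.to_string());
    }
}
```

Calling `.expect(...)` on the strict variant turns a duplicate built-in id into a startup panic, while config-driven entries go through the lenient variant and never fail on reload.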