Schema Validation Pipeline
Runtime validation system for MCP tools and agent task assignments using Nickel contracts.
Overview
The Schema Validation Pipeline prevents downstream errors by validating inputs before execution. It uses Nickel schemas with contracts to enforce type safety, business rules, and data constraints at runtime.
Problem Solved: VAPORA previously assumed inputs were valid, so bad data surfaced only as downstream failures. This caused:
- Invalid UUIDs reaching database queries
- Empty strings bypassing business logic
- Out-of-range priorities corrupting task queues
- Malformed contexts breaking agent execution
Solution: Validate all inputs against Nickel schemas before execution.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│                         Client Request                          │
│          (MCP Tool Invocation / Agent Task Assignment)          │
└────────────────────────────┬────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                       ValidationPipeline                        │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ 1. Load schema from SchemaRegistry (cached)              │   │
│  │ 2. Validate types (String, Number, Array, Object)        │   │
│  │ 3. Check required fields                                 │   │
│  │ 4. Apply contracts (NonEmpty, UUID, Range, etc.)         │   │
│  │ 5. Apply default values                                  │   │
│  │ 6. Return ValidationResult (valid + errors + data)       │   │
│  └──────────────────────────────────────────────────────────┘   │
└────────────────────────────┬────────────────────────────────────┘
                             │
                  ┌──────────┴──────────┐
                  │                     │
                  ▼                     ▼
           ┌──────────────┐      ┌─────────────┐
           │    Valid?    │      │  Invalid?   │
           │   Execute    │      │   Reject    │
           │  with data   │      │ with errors │
           └──────────────┘      └─────────────┘
Components
1. ValidationPipeline
Core validation engine in crates/vapora-shared/src/validation/pipeline.rs.
Key Methods:
pub async fn validate(
&self,
schema_name: &str,
input: &Value
) -> Result<ValidationResult>
Validation Steps:
- Load compiled schema from registry
- Validate field types (String, Number, Bool, Array, Object)
- Check required fields (reject if missing)
- Apply contracts (NonEmpty, UUID, Range, Email, etc.)
- Apply default values for optional fields
- Return ValidationResult with errors (if any)
Strict Mode: Rejects any input field that the schema does not declare.
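The steps above can be sketched end to end with the standard library alone. This is a minimal illustration, not the actual pipeline: the `Value` enum is a hypothetical stand-in for `serde_json::Value`, only two contracts are modeled, the function is synchronous, and errors are plain strings.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a JSON value; the real pipeline takes a `&Value`.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Str(String),
    Num(f64),
    Bool(bool),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FieldType { String, Number, Bool }

// Only two contracts are modeled in this sketch.
#[derive(Debug, Clone)]
pub enum Contract { NonEmpty, Between(f64, f64) }

pub struct FieldSchema {
    pub field_type: FieldType,
    pub required: bool,
    pub contracts: Vec<Contract>,
    pub default: Option<Value>,
}

fn type_of(v: &Value) -> FieldType {
    match v {
        Value::Str(_) => FieldType::String,
        Value::Num(_) => FieldType::Number,
        Value::Bool(_) => FieldType::Bool,
    }
}

fn satisfies(c: &Contract, v: &Value) -> bool {
    match (c, v) {
        (Contract::NonEmpty, Value::Str(s)) => !s.is_empty(),
        (Contract::Between(lo, hi), Value::Num(n)) => lo <= n && n <= hi,
        _ => false, // contract applied to a value of the wrong type
    }
}

/// Returns the validated data (defaults applied) or every error found.
pub fn validate(
    schema: &HashMap<String, FieldSchema>,
    input: &HashMap<String, Value>,
    strict: bool,
) -> Result<HashMap<String, Value>, Vec<String>> {
    let mut errors = Vec::new();
    let mut data = HashMap::new();

    for (name, field) in schema {
        match input.get(name) {
            // Step 2: type check
            Some(value) if type_of(value) != field.field_type => {
                errors.push(format!("Field '{name}': expected {:?}", field.field_type));
            }
            // Step 4: contracts
            Some(value) => {
                for c in &field.contracts {
                    if !satisfies(c, value) {
                        errors.push(format!("Field '{name}' violates {c:?}"));
                    }
                }
                data.insert(name.clone(), value.clone());
            }
            // Step 3: required fields
            None if field.required => errors.push(format!("Missing field '{name}'")),
            // Step 5: defaults for absent optional fields
            None => {
                if let Some(d) = &field.default {
                    data.insert(name.clone(), d.clone());
                }
            }
        }
    }
    // Strict mode: reject fields the schema does not declare
    if strict {
        for name in input.keys().filter(|k| !schema.contains_key(*k)) {
            errors.push(format!("Unknown field '{name}'"));
        }
    }
    if errors.is_empty() { Ok(data) } else { Err(errors) }
}
```

Collecting every error instead of stopping at the first one matches the error response format later in this document, which lists several violations at once.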
2. SchemaRegistry
Schema loading and caching in crates/vapora-shared/src/validation/schema_registry.rs.
Features:
- Caching: Compiled schemas cached in memory (Arc + RwLock)
- Hot Reload: invalidate(schema_name) to reload without restart
- Schema Sources: File system, embedded string, or URL (future)
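A minimal sketch of the caching and invalidation behavior, assuming a synchronous `std::sync::RwLock` (the real registry appears to be async) and a hypothetical `get_or_load` method that takes the loader as a closure. `SchemaCache` and the one-field `CompiledSchema` are illustrative names only.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical compiled schema; only the name matters for this sketch.
#[derive(Debug)]
pub struct CompiledSchema {
    pub name: String,
}

#[derive(Default)]
pub struct SchemaCache {
    schemas: RwLock<HashMap<String, Arc<CompiledSchema>>>,
}

impl SchemaCache {
    /// Returns the cached schema, calling `load` only on a cache miss.
    pub fn get_or_load<F>(&self, name: &str, load: F) -> Arc<CompiledSchema>
    where
        F: FnOnce(&str) -> CompiledSchema,
    {
        // Fast path: many readers can hold the read lock at once.
        if let Some(schema) = self.schemas.read().unwrap().get(name) {
            return Arc::clone(schema);
        }
        // Slow path: take the write lock and re-check, because another
        // thread may have inserted the schema in the meantime.
        let mut map = self.schemas.write().unwrap();
        map.entry(name.to_string())
            .or_insert_with(|| Arc::new(load(name)))
            .clone()
    }

    /// Drops one schema so the next lookup reloads it.
    pub fn invalidate(&self, name: &str) {
        self.schemas.write().unwrap().remove(name);
    }

    /// Drops every cached schema.
    pub fn invalidate_all(&self) {
        self.schemas.write().unwrap().clear();
    }
}
```

Handing out `Arc<CompiledSchema>` means an in-flight validation keeps using the schema it started with even if the entry is invalidated midway.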
Schema Structure:
pub struct CompiledSchema {
pub name: String,
pub fields: HashMap<String, FieldSchema>,
}
pub struct FieldSchema {
pub field_type: FieldType,
pub required: bool,
pub contracts: Vec<Contract>,
pub default: Option<Value>,
}
3. NickelBridge
CLI integration for Nickel operations in crates/vapora-shared/src/validation/nickel_bridge.rs.
Operations:
- typecheck(path): Validate Nickel syntax
- export(path): Export schema as JSON
- query(path, field): Query specific field
- is_available(): Check if Nickel CLI is installed
Timeout Protection: 30s default to prevent DoS from malicious Nickel code.
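One way to enforce such a timeout on a child process with the standard library alone is to poll `try_wait` against a deadline and kill the child when it passes. The helper name `run_with_timeout` is hypothetical, and the actual NickelBridge may use an async runtime instead.

```rust
use std::io;
use std::process::{Command, Output, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Runs `cmd` and returns `Ok(None)` if it had to be killed after `timeout`.
///
/// Sketch limitation: output is read only after the child exits, so a child
/// that fills the pipe buffer will stall until the timeout fires.
pub fn run_with_timeout(cmd: &mut Command, timeout: Duration) -> io::Result<Option<Output>> {
    let mut child = cmd
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;
    let deadline = Instant::now() + timeout;

    loop {
        // `try_wait` returns immediately; `Some(_)` means the child exited.
        if child.try_wait()?.is_some() {
            return child.wait_with_output().map(Some);
        }
        if Instant::now() >= deadline {
            let _ = child.kill(); // the child may have exited in the meantime
            child.wait()?; // reap the process so it does not linger as a zombie
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

With this helper, an export call could look like `run_with_timeout(Command::new("nickel").args(["export", path]), Duration::from_secs(30))`, treating `Ok(None)` as a timeout error.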
Nickel Schemas
Located in schemas/ directory (workspace root).
Directory Structure
schemas/
├── tools/ # MCP tool parameter validation
│ ├── kanban_create_task.ncl
│ ├── kanban_update_task.ncl
│ ├── assign_task_to_agent.ncl
│ ├── get_project_summary.ncl
│ └── get_agent_capabilities.ncl
└── agents/ # Agent task assignment validation
└── task_assignment.ncl
Schema Format
{
tool_name = "example_tool",
parameters = {
# Required field with contracts
user_id
| String
| doc "User UUID"
| std.string.NonEmpty
| std.string.match "^[0-9a-f]{8}-[0-9a-f]{4}-...$",
# Optional field with default
priority
| Number
| doc "Priority score (0-100)"
| std.number.between 0 100
| default = 50,
},
}
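On the Rust side, a UUID contract like the one on `user_id` can be checked without a regex engine. The pattern in the schema above is elided, so this sketch assumes the canonical 8-4-4-4-12 lowercase hexadecimal layout; `is_canonical_uuid` is a hypothetical helper name.

```rust
/// Returns true if `s` has the canonical 8-4-4-4-12 lowercase-hex UUID shape.
/// Assumption: lowercase only, mirroring the `[0-9a-f]` classes in the schema.
pub fn is_canonical_uuid(s: &str) -> bool {
    const GROUPS: [usize; 5] = [8, 4, 4, 4, 12];
    let parts: Vec<&str> = s.split('-').collect();
    parts.len() == GROUPS.len()
        && parts.iter().zip(GROUPS).all(|(part, len)| {
            part.len() == len
                && part.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
        })
}
```

The check is purely syntactic: it does not inspect the UUID version or variant bits.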
Supported Contracts
| Contract | Description | Example |
|---|---|---|
| std.string.NonEmpty | String cannot be empty | Required text fields |
| std.string.length.min N | Minimum length | min 3 for titles |
| std.string.length.max N | Maximum length | max 200 for titles |
| std.string.match PATTERN | Regex validation | UUID format |
| std.number.between A B | Numeric range | between 0 100 |
| std.number.greater_than N | Minimum value (exclusive) | > -1 |
| std.number.less_than N | Maximum value (exclusive) | < 1000 |
| std.enum.TaggedUnion | Enum validation | `[ |
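As an illustration of how the string and number contracts in the table might map onto Rust checks, here is a minimal sketch. The `Contract` variants and the `Input` type are hypothetical, `between` is assumed to be inclusive on both ends, and the regex and enum contracts are left out to keep the example dependency-free.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Contract {
    NonEmpty,
    MinLength(usize),
    MaxLength(usize),
    Between(f64, f64), // assumed inclusive
    GreaterThan(f64),  // exclusive
    LessThan(f64),     // exclusive
}

/// Hypothetical input wrapper: a contract sees either text or a number.
pub enum Input<'a> {
    Text(&'a str),
    Number(f64),
}

/// Checks one contract, returning a message naming the violated contract.
pub fn check(contract: &Contract, input: &Input) -> Result<(), String> {
    use Contract::*;
    let ok = match (contract, input) {
        (NonEmpty, Input::Text(s)) => !s.is_empty(),
        // Lengths are counted in characters, not bytes.
        (MinLength(n), Input::Text(s)) => s.chars().count() >= *n,
        (MaxLength(n), Input::Text(s)) => s.chars().count() <= *n,
        (Between(lo, hi), Input::Number(x)) => lo <= x && x <= hi,
        (GreaterThan(n), Input::Number(x)) => x > n,
        (LessThan(n), Input::Number(x)) => x < n,
        // A string contract on a number (or vice versa) always fails.
        _ => false,
    };
    if ok {
        Ok(())
    } else {
        Err(format!("value violates {contract:?}"))
    }
}
```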
Integration Points
MCP Server
Location: crates/vapora-mcp-server/src/main.rs
// Initialize validation pipeline
let schema_dir = std::env::var("VAPORA_SCHEMA_DIR")
.unwrap_or_else(|_| "schemas".to_string());
let registry = Arc::new(SchemaRegistry::new(PathBuf::from(&schema_dir)));
let validation = Arc::new(ValidationPipeline::new(registry));
// Add to AppState
#[derive(Clone)]
struct AppState {
validation: Arc<ValidationPipeline>,
}
// Validate in handler
async fn invoke_tool(
State(state): State<AppState>,
Json(request): Json<InvokeToolRequest>,
) -> impl IntoResponse {
let schema_name = format!("tools/{}", request.tool);
let validation_result = state
.validation
.validate(&schema_name, &request.parameters)
.await?;
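if !validation_result.valid {
// Render each error as a string for the response body
let validation_errors: Vec<String> =
validation_result.errors.iter().map(|e| e.to_string()).collect();
return (StatusCode::BAD_REQUEST, Json(validation_errors));
}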
// Execute with validated data
let validated_params = validation_result.validated_data.unwrap();
// ...
}
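Because the schema name in the handler is built from the client-supplied tool name, it may be worth restricting which names can be turned into file paths. The following sketch shows one possible guard; `schema_path` is a hypothetical helper, and the allowed character set and the `.ncl` suffix handling are assumptions about how the registry resolves names.

```rust
use std::path::{Path, PathBuf};

/// Maps a schema name such as "tools/kanban_create_task" to a `.ncl` path
/// under `schema_dir`. Returns `None` for names that could escape the
/// directory (empty segments, "..", absolute paths, unexpected characters).
pub fn schema_path(schema_dir: &Path, schema_name: &str) -> Option<PathBuf> {
    let segment_ok = |seg: &str| {
        !seg.is_empty()
            && seg
                .chars()
                .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
    };
    if !schema_name.split('/').all(segment_ok) {
        return None;
    }
    Some(schema_dir.join(format!("{schema_name}.ncl")))
}
```

Rejecting the name before touching the file system keeps a request for a tool like `../../etc/passwd` from ever reaching the schema loader.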
Agent Coordinator
Location: crates/vapora-agents/src/coordinator.rs
pub struct AgentCoordinator {
validation: Arc<ValidationPipeline>,
// ...
}
impl AgentCoordinator {
pub async fn assign_task(
&self,
role: &str,
title: String,
description: String,
context: String,
priority: u32,
) -> Result<String, CoordinatorError> {
// Validate inputs
let input = serde_json::json!({
"role": role,
"title": &title,
"description": &description,
"context": &context,
"priority": priority,
});
let validation_result = self
.validation
.validate("agents/task_assignment", &input)
.await?;
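if !validation_result.valid {
// `errors` is a Vec<ValidationError>, so render each error before joining
let messages: Vec<String> = validation_result
.errors
.iter()
.map(|e| e.to_string())
.collect();
return Err(CoordinatorError::ValidationError(messages.join(", ")));
}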
// Continue with validated inputs
// ...
}
}
Usage Patterns
1. Validating MCP Tool Inputs
// In MCP server handler
let validation_result = state
.validation
.validate("tools/kanban_create_task", &input)
.await?;
if !validation_result.valid {
let errors: Vec<String> = validation_result
.errors
.iter()
.map(|e| e.to_string())
.collect();
return (StatusCode::BAD_REQUEST, Json(json!({
"success": false,
"validation_errors": errors,
})));
}
// Use validated data with defaults applied
let validated_data = validation_result.validated_data.unwrap();
2. Validating Agent Task Assignments
// In AgentCoordinator
let input = serde_json::json!({
"role": role,
"title": title,
"description": description,
"context": context,
"priority": priority,
});
let validation_result = self
.validation
.validate("agents/task_assignment", &input)
.await?;
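if !validation_result.valid {
warn!("Validation failed: {:?}", validation_result.errors);
// Render each ValidationError as a string before joining
let messages: Vec<String> = validation_result.errors.iter().map(|e| e.to_string()).collect();
return Err(CoordinatorError::ValidationError(
format!("Invalid input: {}", messages.join(", "))
));
}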
3. Hot Reloading Schemas
// Invalidate single schema
registry.invalidate("tools/kanban_create_task").await;
// Invalidate all schemas (useful for config reload)
registry.invalidate_all().await;
Testing Schemas
Validate Syntax
nickel typecheck schemas/tools/kanban_create_task.ncl
Export as JSON
nickel export schemas/tools/kanban_create_task.ncl
Query Specific Field
nickel query --field parameters.title schemas/tools/kanban_create_task.ncl
Adding New Schemas
- Create a .ncl file in the appropriate directory (tools/ or agents/)
- Define tool_name or schema_name
- Define parameters or fields with types and contracts
- Add doc annotations for documentation
- Test with nickel typecheck
- Restart services or use hot-reload
Example:
# schemas/tools/my_new_tool.ncl
{
tool_name = "my_new_tool",
parameters = {
name
| String
| doc "User name (3-50 chars)"
| std.string.NonEmpty
| std.string.length.min 3
| std.string.length.max 50,
age
| Number
| doc "User age (0-120)"
| std.number.between 0 120,
email
| String
| doc "User email address"
| std.string.Email,
active
| Bool
| doc "Account status"
| default = true,
},
}
Configuration
Environment Variables
VAPORA_SCHEMA_DIR: Schema directory path (default: "schemas")
In tests:
std::env::set_var("VAPORA_SCHEMA_DIR", "../../schemas");
In production:
export VAPORA_SCHEMA_DIR=/app/schemas
Performance Characteristics
- Schema Loading: ~5-10ms (first load, then cached)
- Validation: ~0.1-0.5ms per request (in-memory)
- Hot Reload: ~10-20ms (invalidates cache, reloads from disk)
Optimization: SchemaRegistry uses Arc<RwLock<HashMap>> for concurrent reads.
Security Considerations
Timeout Protection
NickelBridge enforces 30s timeout on all CLI operations to prevent:
- Infinite loops in malicious Nickel code
- DoS attacks via crafted schemas
- Resource exhaustion
Input Sanitization
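Contracts help mitigate (as a first line of defense, alongside parameterized queries and output escaping):
- SQL injection (UUID/Email validation rejects free-form text in identifier fields)
- XSS attacks (length limits bound the size of text fields)
- Buffer overflows (via max length constraints)
- Type confusion (via strict type checking)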
Schema Validation
All schemas must pass nickel typecheck before deployment.
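A pre-deployment check needs the list of schema files to pass to `nickel typecheck`. The sketch below walks a schema directory recursively and collects every `.ncl` file; `collect_schemas` is a hypothetical helper, and the step that invokes the CLI on each path is left out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects all `.ncl` files under `dir`, sorted within each level.
pub fn collect_schemas(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            // Descend into subdirectories such as tools/ and agents/.
            found.extend(collect_schemas(&path)?);
        } else if path.extension().map_or(false, |ext| ext == "ncl") {
            found.push(path);
        }
    }
    found.sort();
    Ok(found)
}
```

A CI step could then run `nickel typecheck` on each returned path and fail the build on the first non-zero exit status.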
Error Handling
ValidationResult
pub struct ValidationResult {
pub valid: bool,
pub errors: Vec<ValidationError>,
pub validated_data: Option<Value>,
}
pub enum ValidationError {
MissingField(String),
TypeMismatch { field: String, expected: String, got: String },
ContractViolation { field: String, contract: String, value: String },
InvalidSchema(String),
}
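The usage patterns call `to_string()` on each error, which implies a `Display` implementation. A minimal sketch of one follows; the message wording is an assumption and may differ from the messages the real implementation produces.

```rust
use std::fmt;

#[derive(Debug)]
pub enum ValidationError {
    MissingField(String),
    TypeMismatch { field: String, expected: String, got: String },
    ContractViolation { field: String, contract: String, value: String },
    InvalidSchema(String),
}

impl fmt::Display for ValidationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::MissingField(field) => write!(f, "Field '{field}' is required"),
            Self::TypeMismatch { field, expected, got } => {
                write!(f, "Field '{field}' expected {expected}, got {got}")
            }
            Self::ContractViolation { field, contract, value } => {
                write!(f, "Field '{field}' violates {contract} (value: {value})")
            }
            Self::InvalidSchema(reason) => write!(f, "Invalid schema: {reason}"),
        }
    }
}

// Lets the error work with `?` and `Box<dyn Error>`.
impl std::error::Error for ValidationError {}
```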
Error Response Format
{
"success": false,
"error": "Validation failed",
"validation_errors": [
"Field 'project_id' must match UUID pattern",
"Field 'title' must be at least 3 characters",
"Field 'priority' must be between 0 and 100"
]
}
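To show how a list of error messages becomes the response body above, here is a dependency-free sketch. The integration code in this document uses `serde_json`'s `json!` macro, which handles escaping for you; the hand-written `escape_json` and `error_response` helpers below are hypothetical and exist only to keep the example self-contained.

```rust
/// Escapes a string for embedding inside a JSON string literal.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds the compact JSON body for a failed validation.
pub fn error_response(errors: &[String]) -> String {
    let items: Vec<String> = errors
        .iter()
        .map(|e| format!("\"{}\"", escape_json(e)))
        .collect();
    format!(
        "{{\"success\":false,\"error\":\"Validation failed\",\"validation_errors\":[{}]}}",
        items.join(",")
    )
}
```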
Troubleshooting
Schema Not Found
Error: Schema file not found: schemas/tools/my_tool.ncl
Solution: Check VAPORA_SCHEMA_DIR environment variable and ensure schema file exists.
Nickel CLI Not Available
Error: Nickel CLI not found in PATH
Solution: Install Nickel CLI:
cargo install nickel-lang-cli
Validation Always Fails
Error: All requests rejected with validation errors
Solution: Check schema syntax with nickel typecheck, verify field names match exactly.
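When debugging mismatched field names, it can help to diff the field names a request actually sends against those the schema declares. A minimal sketch with hypothetical names (`FieldDiff`, `diff_fields`):

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
pub struct FieldDiff {
    /// Declared in the schema but absent from the input (may include optional fields).
    pub missing: Vec<String>,
    /// Present in the input but not declared in the schema.
    pub unknown: Vec<String>,
}

/// Compares field names exactly (case-sensitive) and reports both directions.
pub fn diff_fields(schema_fields: &[&str], input_fields: &[&str]) -> FieldDiff {
    let schema: BTreeSet<&str> = schema_fields.iter().copied().collect();
    let input: BTreeSet<&str> = input_fields.iter().copied().collect();
    FieldDiff {
        missing: schema.difference(&input).map(|s| s.to_string()).collect(),
        unknown: input.difference(&schema).map(|s| s.to_string()).collect(),
    }
}
```

For example, a client sending `projectId` against a schema that declares `project_id` shows up as one missing and one unknown field, which points directly at the naming mismatch.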
Future Enhancements
- Remote schema loading (HTTP/S3)
- Schema versioning and migration
- Custom contract plugins
- GraphQL schema generation from Nickel
- OpenAPI spec generation
Related Documentation
- Nickel Language Documentation
- VAPORA Architecture Overview
- Agent Coordination
- MCP Protocol Integration
References
- Implementation: crates/vapora-shared/src/validation/
- Schemas: schemas/
- Tests: crates/vapora-shared/tests/validation_integration.rs
- MCP Integration: crates/vapora-mcp-server/src/main.rs
- Agent Integration: crates/vapora-agents/src/coordinator.rs