Implement intelligent agent learning from Knowledge Graph execution history with per-task-type expertise tracking, recency bias, and learning curves.

## Phase 5.3 Implementation

### Learning Infrastructure (✅ Complete)

- LearningProfileService with per-task-type expertise metrics
- TaskTypeExpertise model tracking success_rate, confidence, learning curves
- Recency bias weighting: recent 7 days weighted 3x higher (exponential decay)
- Confidence scoring prevents overfitting: min(1.0, executions / 20)
- Learning curves computed from daily execution windows

### Agent Scoring Service (✅ Complete)

- Unified AgentScore combining SwarmCoordinator + learning profiles
- Scoring formula: 0.3*base + 0.5*expertise + 0.2*confidence
- Rank agents by combined score for intelligent assignment
- Support for recency-biased scoring (recent_success_rate)
- Methods: rank_agents, select_best, rank_agents_with_recency

### KG Integration (✅ Complete)

- KGPersistence::get_executions_for_task_type() - query by agent + task type
- KGPersistence::get_agent_executions() - all executions for agent
- Coordinator::load_learning_profile_from_kg() - core KG→Learning integration
- Coordinator::load_all_learning_profiles() - batch load for multiple agents
- Convert PersistedExecution → ExecutionData for learning calculations

### Agent Assignment Integration (✅ Complete)

- AgentCoordinator uses learning profiles for task assignment
- extract_task_type() infers task type from title/description
- assign_task() scores candidates using AgentScoringService
- Fallback to load-based selection if no learning data available
- Learning profiles stored in coordinator.learning_profiles RwLock

### Profile Adapter Enhancements (✅ Complete)

- create_learning_profile() - initialize empty profiles
- add_task_type_expertise() - set task-type expertise
- update_profile_with_learning() - update swarm profiles from learning

## Files Modified

### vapora-knowledge-graph/src/persistence.rs (+30 lines)

- get_executions_for_task_type(agent_id, task_type, limit)
- get_agent_executions(agent_id, limit)

### vapora-agents/src/coordinator.rs (+100 lines)

- load_learning_profile_from_kg() - core KG integration method
- load_all_learning_profiles() - batch loading for agents
- assign_task() already uses learning-based scoring via AgentScoringService

### Existing Complete Implementation

- vapora-knowledge-graph/src/learning.rs - calculation functions
- vapora-agents/src/learning_profile.rs - data structures and expertise
- vapora-agents/src/scoring.rs - unified scoring service
- vapora-agents/src/profile_adapter.rs - adapter methods

## Tests Passing

- learning_profile: 7 tests ✅
- scoring: 5 tests ✅
- profile_adapter: 6 tests ✅
- coordinator: learning-specific tests ✅

## Data Flow

1. Task arrives → AgentCoordinator::assign_task()
2. Extract task_type from description
3. Query KG for task-type executions (load_learning_profile_from_kg)
4. Calculate expertise with recency bias
5. Score candidates (SwarmCoordinator + learning)
6. Assign to top-scored agent
7. Execution result → KG → Update learning profiles

## Key Design Decisions

- ✅ Recency bias: 7-day half-life with 3x weight for recent performance
- ✅ Confidence scoring: min(1.0, total_executions / 20) prevents overfitting
- ✅ Hierarchical scoring: 30% base load, 50% expertise, 20% confidence
- ✅ KG query limit: 100 recent executions per task-type for performance
- ✅ Async loading: load_learning_profile_from_kg supports concurrent loads

## Next: Phase 5.4 - Cost Optimization

Ready to implement budget enforcement and cost-aware provider selection.
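
The scoring formula and confidence cap quoted above can be illustrated with a small sketch. This is plain Python for illustration only, not the Rust `AgentScoringService` API: the function signatures, the assumption that `base` and `expertise` lie in the range 0 to 1, and the two sample agents are all hypothetical.

```python
def confidence(total_executions: int, saturation: int = 20) -> float:
    """Confidence grows linearly with sample size and is capped at 1.0."""
    return min(1.0, total_executions / saturation)


def combined_score(base: float, expertise: float, conf: float) -> float:
    """Weighted blend quoted in the commit: 30% base, 50% expertise, 20% confidence."""
    return 0.3 * base + 0.5 * expertise + 0.2 * conf


def rank_agents(candidates: dict) -> list:
    """Return (agent, score) pairs sorted best first.

    `candidates` maps an agent name to (base, expertise, total_executions).
    """
    scored = [
        (name, combined_score(base, expertise, confidence(executions)))
        for name, (base, expertise, executions) in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical agents: a newcomer with a perfect record over 2 runs
# and a veteran with a 90% success rate over 40 runs.
candidates = {
    "newcomer": (0.9, 1.0, 2),
    "veteran": (0.6, 0.9, 40),
}
ranking = rank_agents(candidates)
```

With these made-up numbers the veteran ranks first: the newcomer's confidence is only 2 / 20 = 0.1, so a perfect record over two runs is not enough to win the assignment.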
# VAPORA Provisioning Integration

Integration documentation for deploying VAPORA v1.0 using Provisioning.

## Overview

VAPORA can be deployed using **Provisioning**, a Rust-based infrastructure-as-code platform that manages Kubernetes clusters, services, and workflows.

The Provisioning workspace is located at: `/Users/Akasha/Development/vapora/provisioning/vapora-wrksp/`

## Provisioning Workspace Structure

```
provisioning/vapora-wrksp/
├── workspace.toml           # Master configuration
├── kcl/                     # Infrastructure schemas (KCL)
│   ├── cluster.k            # Cluster definition
│   ├── namespace.k          # Namespace configuration
│   ├── backend.k            # Backend deployment
│   ├── frontend.k           # Frontend deployment
│   └── agents.k             # Agent deployment
├── taskservs/               # Service definitions (TOML)
│   ├── surrealdb.toml       # SurrealDB service
│   ├── nats.toml            # NATS service
│   ├── backend.toml         # Backend service
│   ├── frontend.toml        # Frontend service
│   └── agents.toml          # Agents service
└── workflows/               # Batch operations (YAML)
    ├── deploy-full-stack.yaml
    ├── deploy-infra.yaml
    ├── deploy-services.yaml
    └── health-check.yaml
```

## Integration Points

### 1. Cluster Management

Provisioning creates and manages the Kubernetes cluster:

```bash
cd provisioning/vapora-wrksp
provisioning cluster create --config workspace.toml
```

This creates:
- K3s/RKE2 cluster
- Storage class (Rook Ceph or local-path)
- Ingress controller (nginx)
- Service mesh (optional Istio)

### 2. Service Deployment

Services are defined in `taskservs/` and deployed via workflows:

```bash
provisioning workflow run workflows/deploy-full-stack.yaml
```

This deploys all VAPORA components in order:
1. SurrealDB (StatefulSet)
2. NATS JetStream (Deployment)
3. Backend API (Deployment)
4. Frontend UI (Deployment)
5. Agents (Deployment)
6. MCP Server (Deployment)

### 3. Infrastructure as Code (KCL)

KCL schemas in `kcl/` define infrastructure resources:

**Example: `kcl/backend.k`**
```python
schema BackendDeployment:
    name: str = "vapora-backend"
    namespace: str = "vapora"
    replicas: int = 2
    image: str = "vapora/backend:latest"
    port: int = 8080

    env:
        SURREALDB_URL: str = "http://surrealdb:8000"
        NATS_URL: str = "nats://nats:4222"
        JWT_SECRET: str = "${SECRET:jwt-secret}"
```

### 4. Taskserv Definitions

Taskservs define how services are deployed and managed:

**Example: `taskservs/backend.toml`**
```toml
[service]
name = "vapora-backend"
type = "deployment"
namespace = "vapora"

[deployment]
replicas = 2
image = "vapora/backend:latest"
port = 8080

[health]
liveness = "/health"
readiness = "/health"

[dependencies]
requires = ["surrealdb", "nats"]
```

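
A `requires` list like the one in `taskservs/backend.toml` implies a deployment order. The sketch below shows one way such an order could be derived, using Python's standard `graphlib` module. It is not how Provisioning resolves dependencies internally, which this document does not describe. Only the `backend` entry comes from the example; the `frontend` and `agents` entries are assumptions made for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Maps each taskserv to the taskservs it requires.
# "backend" mirrors taskservs/backend.toml; the other entries are assumed.
requires = {
    "surrealdb": [],
    "nats": [],
    "backend": ["surrealdb", "nats"],
    "frontend": ["backend"],         # assumption
    "agents": ["backend", "nats"],   # assumption
}


def deployment_order(requires: dict) -> list:
    """Return an order in which every taskserv comes after its requirements."""
    return list(TopologicalSorter(requires).static_order())


def has_cycle(requires: dict) -> bool:
    """Report whether the dependency graph contains a circular requirement."""
    try:
        deployment_order(requires)
    except CycleError:
        return True
    return False


order = deployment_order(requires)
```

Several valid orders can exist (for example, `surrealdb` and `nats` may be swapped), so the sketch guarantees only that each service appears after everything it requires.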
### 5. Workflows

Workflows orchestrate complex deployment tasks:

**Example: `workflows/deploy-full-stack.yaml`**
```yaml
name: deploy-full-stack
description: Deploy complete VAPORA stack

steps:
  - name: create-namespace
    taskserv: namespace
    action: create

  - name: deploy-database
    taskserv: surrealdb
    action: deploy
    wait: true

  - name: deploy-messaging
    taskserv: nats
    action: deploy
    wait: true

  - name: deploy-services
    parallel: true
    tasks:
      - taskserv: backend
      - taskserv: frontend
      - taskserv: agents
      - taskserv: mcp-server

  - name: health-check
    action: validate
```

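
To make the step semantics concrete, here is a minimal sketch of how a runner might execute the workflow: ordinary steps run one at a time, and a step marked `parallel: true` runs its tasks together. The step list is transcribed by hand from the YAML above to avoid a YAML dependency. The functions `execution_batches` and `run` and the `deploy` callback are hypothetical and do not represent the Provisioning engine.

```python
from concurrent.futures import ThreadPoolExecutor

# Hand-transcribed from workflows/deploy-full-stack.yaml.
workflow_steps = [
    {"name": "create-namespace", "taskserv": "namespace", "action": "create"},
    {"name": "deploy-database", "taskserv": "surrealdb", "action": "deploy", "wait": True},
    {"name": "deploy-messaging", "taskserv": "nats", "action": "deploy", "wait": True},
    {"name": "deploy-services", "parallel": True, "tasks": [
        {"taskserv": "backend"},
        {"taskserv": "frontend"},
        {"taskserv": "agents"},
        {"taskserv": "mcp-server"},
    ]},
    {"name": "health-check", "action": "validate"},
]


def execution_batches(steps: list) -> list:
    """Group steps into batches; items within one batch may run concurrently."""
    batches = []
    for step in steps:
        if step.get("parallel"):
            batches.append([task["taskserv"] for task in step["tasks"]])
        else:
            # Steps without a taskserv (such as health-check) use their name.
            batches.append([step.get("taskserv", step["name"])])
    return batches


def run(steps: list, deploy) -> None:
    """Run batches in order, finishing each batch before starting the next."""
    for batch in execution_batches(steps):
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            # list() forces completion and re-raises any exception from deploy.
            list(pool.map(deploy, batch))


deployed = []
run(workflow_steps, deployed.append)  # stand-in for a real deploy call
```

In this sketch every batch acts as a barrier, which matches the `wait: true` steps; how Provisioning treats a step without `wait` is not specified here.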
## Provisioning vs. Vanilla K8s

| Aspect | Provisioning | Vanilla K8s |
|--------|-------------|-------------|
| Cluster Creation | Automated (RKE2/K3s) | Manual |
| Service Mesh | Optional Istio | Manual |
| Secrets | RustyVault integration | kubectl create secret |
| Workflows | Declarative YAML | Manual kubectl |
| Rollback | Built-in | Manual |
| Monitoring | Prometheus auto-configured | Manual |

## Advantages of Provisioning

1. **Unified Management**: Single tool for cluster, services, and workflows
2. **Type Safety**: KCL schemas provide compile-time validation
3. **Reproducibility**: Infrastructure and services defined as code
4. **Dependency Management**: Automatic service ordering
5. **Secret Management**: Integration with RustyVault
6. **Rollback**: Automatic rollback on failure

## Migration from Vanilla K8s

If you have an existing K8s deployment using `/kubernetes/` manifests:

1. **Import existing manifests**:
   ```bash
   provisioning import kubernetes/*.yaml --output kcl/
   ```

2. **Generate taskservs**:
   ```bash
   provisioning taskserv generate --from-kcl kcl/*.k
   ```

3. **Create workflow**:
   ```bash
   provisioning workflow create --interactive
   ```

4. **Deploy**:
   ```bash
   provisioning workflow run workflows/deploy-full-stack.yaml
   ```

## Deployment Workflow

### Using Provisioning (Recommended for Production)

```bash
# 1. Navigate to workspace
cd provisioning/vapora-wrksp

# 2. Validate configuration
provisioning validate --all

# 3. Create cluster
provisioning cluster create --config workspace.toml

# 4. Deploy infrastructure
provisioning workflow run workflows/deploy-infra.yaml

# 5. Deploy services
provisioning workflow run workflows/deploy-services.yaml

# 6. Health check
provisioning workflow run workflows/health-check.yaml

# 7. Monitor
provisioning health-check --all
```

### Using Vanilla K8s (Manual)

```bash
# Use vanilla K8s manifests
cd /Users/Akasha/Development/vapora
nu scripts/deploy-k8s.nu
```

## Validation

To validate Provisioning configuration without executing:

```bash
# From project root
nu scripts/validate-provisioning.nu
```

This checks:
- Workspace exists
- KCL schemas are valid
- Taskserv definitions exist
- Workflows are well-formed

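
The existence checks in that list can be sketched in a few lines. The following is not a port of `scripts/validate-provisioning.nu`, whose contents are not shown in this document. It is a hypothetical helper, `check_workspace`, that verifies only the layout described under "Provisioning Workspace Structure"; it does not validate KCL schemas or parse workflow YAML.

```python
import tempfile
from pathlib import Path

# Subdirectory -> glob pattern, taken from the workspace layout in this document.
EXPECTED_FILES = {
    "kcl": "*.k",
    "taskservs": "*.toml",
    "workflows": "*.yaml",
}


def check_workspace(root) -> list:
    """Return a list of layout problems; an empty list means the layout looks complete."""
    root = Path(root)
    if not root.is_dir():
        return [f"workspace not found: {root}"]
    problems = []
    if not (root / "workspace.toml").is_file():
        problems.append("missing workspace.toml")
    for subdir, pattern in EXPECTED_FILES.items():
        directory = root / subdir
        if not directory.is_dir() or not any(directory.glob(pattern)):
            problems.append(f"no {pattern} files in {subdir}/")
    return problems


# Demonstration against a throwaway directory rather than a real workspace.
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp) / "vapora-wrksp"
    missing_report = check_workspace(workspace)  # directory does not exist yet
    workspace.mkdir()
    (workspace / "workspace.toml").touch()
    for subdir, pattern in EXPECTED_FILES.items():
        (workspace / subdir).mkdir()
        (workspace / subdir / pattern.replace("*", "example")).touch()
    complete_report = check_workspace(workspace)
```

A check like this is cheap enough to run before `provisioning validate`, which remains the authoritative test of schema and workflow validity.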
## Next Steps

1. **Review Configuration**:
   - Update `workspace.toml` with your cluster details
   - Modify KCL schemas for your environment
   - Adjust resource limits in taskservs

2. **Test Locally**:
   - Use K3s for local testing
   - Validate with `--dry-run` flag

3. **Deploy to Production**:
   - Use RKE2 for production cluster
   - Enable Istio service mesh
   - Configure external load balancer

4. **Monitor**:
   - Use built-in Prometheus/Grafana
   - Configure alerting
   - Set up log aggregation

## Troubleshooting

### Provisioning not installed

```bash
# Install Provisioning (Rust-based)
cargo install provisioning-cli
```

### Workspace validation fails

```bash
cd provisioning/vapora-wrksp
provisioning validate --verbose
```

### Deployment stuck

```bash
# Check workflow status
provisioning workflow status <workflow-id>

# View logs
provisioning logs --taskserv backend

# Rollback
provisioning rollback --to-version <version>
```

## Documentation References

- **Provisioning Documentation**: See `provisioning/vapora-wrksp/README.md`
- **KCL Language Guide**: https://kcl-lang.io/docs/
- **Taskserv Specification**: `provisioning/vapora-wrksp/taskservs/README.md`
- **Workflow Syntax**: `provisioning/vapora-wrksp/workflows/README.md`

## Notes

- **IMPORTANT**: Provisioning integration is **validated** but not executed in this phase
- All configuration files exist and are valid
- Deployment with Provisioning is deferred to a manual production rollout
- For immediate testing, use vanilla K8s deployment: `nu scripts/deploy-k8s.nu`
- Provisioning provides advanced features (service mesh, auto-scaling, rollback)
- Vanilla K8s deployment is simpler and requires less infrastructure

## Support

For issues related to:
- **VAPORA deployment**: Check `/kubernetes/README.md` and `DEPLOYMENT.md`
- **Provisioning workspace**: See `provisioning/vapora-wrksp/README.md`
- **Scripts**: Run `nu scripts/<script-name>.nu --help`