2026-02-14 20:10:55 +00:00
# VAPORA Architecture
feat: Phase 5.3 - Multi-Agent Learning Infrastructure
Implement intelligent agent learning from Knowledge Graph execution history
with per-task-type expertise tracking, recency bias, and learning curves.
## Phase 5.3 Implementation
### Learning Infrastructure (✅ Complete)
- LearningProfileService with per-task-type expertise metrics
- TaskTypeExpertise model tracking success_rate, confidence, learning curves
- Recency bias weighting: recent 7 days weighted 3x higher (exponential decay)
- Confidence scoring prevents overfitting: min(1.0, executions / 20)
- Learning curves computed from daily execution windows
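
The recency weighting and confidence cap above can be sketched as follows. This is a minimal illustration, not the actual `LearningProfileService` code: the `Execution` struct, the function signatures, and the step-function weight (3x inside 7 days, 1x outside) are assumptions made for the example. The notes also mention exponential decay, which this sketch does not model.

```rust
/// One past execution of a task type by an agent (hypothetical shape).
#[derive(Debug, Clone, Copy)]
pub struct Execution {
    /// Age of the execution in days at scoring time.
    pub age_days: f64,
    pub success: bool,
}

/// Weight of an execution: 3x inside the 7-day window, 1x otherwise.
/// Assumption: a simple step function stands in for the real decay curve.
pub fn recency_weight(age_days: f64) -> f64 {
    if age_days <= 7.0 {
        3.0
    } else {
        1.0
    }
}

/// Recency-weighted success rate in [0, 1]; 0.0 when there is no history.
pub fn recent_success_rate(history: &[Execution]) -> f64 {
    let total: f64 = history.iter().map(|e| recency_weight(e.age_days)).sum();
    if total == 0.0 {
        return 0.0;
    }
    let succeeded: f64 = history
        .iter()
        .filter(|e| e.success)
        .map(|e| recency_weight(e.age_days))
        .sum();
    succeeded / total
}

/// Confidence as stated in the notes: min(1.0, executions / 20).
pub fn confidence(executions: usize) -> f64 {
    (executions as f64 / 20.0).min(1.0)
}
```

With one success from yesterday and one failure from a month ago, the weighted rate is 3 / (3 + 1), so the recent result dominates. An agent with 5 executions gets a confidence of 0.25, which keeps a short lucky streak from ranking it above well-tested agents.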
### Agent Scoring Service (✅ Complete)
- Unified AgentScore combining SwarmCoordinator + learning profiles
- Scoring formula: 0.3*base + 0.5*expertise + 0.2*confidence
- Rank agents by combined score for intelligent assignment
- Support for recency-biased scoring (recent_success_rate)
- Methods: rank_agents, select_best, rank_agents_with_recency
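
As a rough sketch of the scoring formula above: the function names `rank_agents` and `select_best` come from the notes, but their signatures, the tuple-based candidate input, and the `AgentScore` fields shown here are assumptions for illustration only.

```rust
/// Combined score for one agent (hypothetical field layout).
#[derive(Debug, Clone, PartialEq)]
pub struct AgentScore {
    pub agent_id: String,
    pub combined: f64,
}

/// Scoring formula from the notes: 0.3*base + 0.5*expertise + 0.2*confidence.
pub fn combined_score(base: f64, expertise: f64, confidence: f64) -> f64 {
    0.3 * base + 0.5 * expertise + 0.2 * confidence
}

/// Rank candidates given as (agent_id, base, expertise, confidence),
/// highest combined score first.
pub fn rank_agents(candidates: &[(&str, f64, f64, f64)]) -> Vec<AgentScore> {
    let mut scores: Vec<AgentScore> = candidates
        .iter()
        .map(|&(id, base, expertise, conf)| AgentScore {
            agent_id: id.to_string(),
            combined: combined_score(base, expertise, conf),
        })
        .collect();
    // total_cmp gives a total order on f64, so sorting cannot panic on NaN.
    scores.sort_by(|a, b| b.combined.total_cmp(&a.combined));
    scores
}

/// Best candidate, or None when the candidate list is empty.
pub fn select_best(candidates: &[(&str, f64, f64, f64)]) -> Option<AgentScore> {
    rank_agents(candidates).into_iter().next()
}
```

Because expertise carries half of the weight, an idle agent with little relevant history (base 1.0, expertise 0.2, confidence 0.1) scores 0.42, while a busier agent with strong history (base 0.5, expertise 0.9, confidence 1.0) scores 0.80 and wins the assignment.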
### KG Integration (✅ Complete)
- KGPersistence::get_executions_for_task_type() - query by agent + task type
- KGPersistence::get_agent_executions() - all executions for agent
- Coordinator::load_learning_profile_from_kg() - core KG→Learning integration
- Coordinator::load_all_learning_profiles() - batch load for multiple agents
- Convert PersistedExecution → ExecutionData for learning calculations
### Agent Assignment Integration (✅ Complete)
- AgentCoordinator uses learning profiles for task assignment
- extract_task_type() infers task type from title/description
- assign_task() scores candidates using AgentScoringService
- Fallback to load-based selection if no learning data available
- Learning profiles stored in coordinator.learning_profiles RwLock
### Profile Adapter Enhancements (✅ Complete)
- create_learning_profile() - initialize empty profiles
- add_task_type_expertise() - set task-type expertise
- update_profile_with_learning() - update swarm profiles from learning
## Files Modified
### vapora-knowledge-graph/src/persistence.rs (+30 lines)
- get_executions_for_task_type(agent_id, task_type, limit)
- get_agent_executions(agent_id, limit)
### vapora-agents/src/coordinator.rs (+100 lines)
- load_learning_profile_from_kg() - core KG integration method
- load_all_learning_profiles() - batch loading for agents
- assign_task() already uses learning-based scoring via AgentScoringService
### Existing Complete Implementation
- vapora-knowledge-graph/src/learning.rs - calculation functions
- vapora-agents/src/learning_profile.rs - data structures and expertise
- vapora-agents/src/scoring.rs - unified scoring service
- vapora-agents/src/profile_adapter.rs - adapter methods
## Tests Passing
- learning_profile: 7 tests ✅
- scoring: 5 tests ✅
- profile_adapter: 6 tests ✅
- coordinator: learning-specific tests ✅
## Data Flow
1. Task arrives → AgentCoordinator::assign_task()
2. Extract task_type from description
3. Query KG for task-type executions (load_learning_profile_from_kg)
4. Calculate expertise with recency bias
5. Score candidates (SwarmCoordinator + learning)
6. Assign to top-scored agent
7. Execution result → KG → Update learning profiles
## Key Design Decisions
- ✅ Recency bias: 7-day half-life with 3x weight for recent performance
- ✅ Confidence scoring: min(1.0, total_executions / 20) prevents overfitting
- ✅ Hierarchical scoring: 30% base load, 50% expertise, 20% confidence
- ✅ KG query limit: 100 recent executions per task-type for performance
- ✅ Async loading: load_learning_profile_from_kg supports concurrent loads
## Next: Phase 5.4 - Cost Optimization
Ready to implement budget enforcement and cost-aware provider selection.

Comprehensive documentation of VAPORA's system architecture, design patterns, and implementation details.
## Architecture Layers
### 1. Protocol Layer
**A2A (Agent-to-Agent) Protocol**
- Standard protocol for agent-to-agent communication
- JSON-RPC 2.0 specification compliance
- Agent discovery via Agent Card
- Task dispatch and lifecycle tracking
- See: [ADR-0001: A2A Protocol Implementation](adr/0001-a2a-protocol-implementation.md)

**MCP (Model Context Protocol)**
- Real MCP transport with Stdio and SSE support
- 6 integrated tools for task/agent management
- Backend client integration
- Tool registry with JSON Schema validation
### 2. Server Layer

**vapora-a2a (A2A Server)**
- Axum-based HTTP server
- Endpoints: agent discovery, task dispatch, status query, health, metrics
- JSON-RPC 2.0 request/response handling
- **SurrealDB persistent storage** (production-ready)
- **NATS async coordination** for task lifecycle events
- **Prometheus metrics** (/metrics endpoint)
- Integration: AgentCoordinator, TaskManager, CoordinatorBridge
- Tasks survive server restarts (persistent)
- Background NATS listeners for TaskCompleted/TaskFailed

**vapora-a2a-client (A2A Client)**
- HTTP client library for A2A protocol
- Methods for discovery, dispatch, query
- **Exponential backoff retry** with jitter (100ms → 5s)
- Smart retry logic (5xx/network YES, 4xx NO)
- Timeout and error handling
- Full serialization support
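
The client's retry policy can be illustrated with a small sketch. It assumes the delay doubles per attempt from 100 ms up to a 5 s cap, and that jitter keeps half of the delay fixed and randomizes the other half. The constant names, the `Failure` enum, and the jitter scheme are assumptions for this example, not the `vapora-a2a-client` API. The random fraction is passed in by the caller so the function stays deterministic and testable.

```rust
use std::time::Duration;

const BASE_DELAY_MS: u64 = 100;
const MAX_DELAY_MS: u64 = 5_000;

/// Delay before retry number `attempt` (0-based).
/// `jitter_fraction` is a random value in [0, 1] supplied by the caller.
pub fn backoff_delay(attempt: u32, jitter_fraction: f64) -> Duration {
    // base * 2^attempt, saturating instead of overflowing for large attempts.
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    let capped = BASE_DELAY_MS.saturating_mul(factor).min(MAX_DELAY_MS);
    // Keep half of the delay fixed and scale the other half by the jitter.
    let half = capped / 2;
    let extra = ((capped - half) as f64 * jitter_fraction.clamp(0.0, 1.0)) as u64;
    Duration::from_millis(half + extra)
}

/// Failure categories relevant to the retry decision (hypothetical type).
#[derive(Debug)]
pub enum Failure {
    Network,
    Http(u16),
}

/// Retry network errors and 5xx responses; never retry 4xx responses.
pub fn should_retry(failure: &Failure) -> bool {
    match failure {
        Failure::Network => true,
        Failure::Http(code) => (500..=599).contains(code),
    }
}
```

Client errors such as 404 are not retried because repeating the same request cannot succeed, while 5xx and network failures are often transient. The jitter spreads retries from many clients so they do not all hit the server at the same instant.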
### 3. Infrastructure Layer

**Kubernetes Deployment**
- StatefulSet-based deployment via Kustomize
- Environment-specific overlays (dev, prod)
- RBAC, resource quotas, anti-affinity
- ConfigMap-based A2A integration
- See: [ADR-0002: Kubernetes Deployment Strategy](adr/0002-kubernetes-deployment-strategy.md)
### 4. Integration Layer

**AgentCoordinator Integration**
- CoordinatorBridge maps A2A tasks to internal agents
- Task state management
- Background completion tracking

**Backend Integration**
- SurrealDB for persistent storage
- Multi-tenant scope isolation
- REST API endpoints
## Key Components

### A2A Protocol Types

| Type | Purpose | Location |
|------|---------|----------|
| `A2aTask` | Task request | protocol.rs |
| `A2aMessage` | Message with text/file parts | protocol.rs |
| `A2aTaskStatus` | Task state and result | protocol.rs |
| `A2aTaskResult` | Execution result with artifacts | protocol.rs |
| `AgentCard` | Agent capability advertisement | agent_card.rs |
### Error Handling

Two-layer strategy:
- **Domain Layer:** Type-safe Rust errors via `thiserror`
- **Protocol Layer:** JSON-RPC 2.0 error format
- See: [ADR-0003: Error Handling and JSON-RPC 2.0 Compliance](adr/0003-error-handling-and-json-rpc-compliance.md)
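
A minimal sketch of how the two layers might connect is shown below. The `DomainError` variants and the `JsonRpcError` struct are hypothetical, and `Display` is implemented by hand so the example needs only the standard library. In the real crates, `thiserror` would derive it. The numeric codes follow the JSON-RPC 2.0 specification: -32602 for invalid params, -32603 for internal error, and the -32000 to -32099 range for implementation-defined server errors.

```rust
use std::fmt;

/// Domain-layer error (hypothetical variants for illustration).
#[derive(Debug)]
pub enum DomainError {
    TaskNotFound(String),
    InvalidParams(String),
    Internal(String),
}

impl fmt::Display for DomainError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DomainError::TaskNotFound(id) => write!(f, "task not found: {id}"),
            DomainError::InvalidParams(why) => write!(f, "invalid params: {why}"),
            DomainError::Internal(why) => write!(f, "internal error: {why}"),
        }
    }
}

impl std::error::Error for DomainError {}

/// Protocol-layer error object in the JSON-RPC 2.0 shape (code + message).
#[derive(Debug, PartialEq)]
pub struct JsonRpcError {
    pub code: i32,
    pub message: String,
}

impl From<&DomainError> for JsonRpcError {
    fn from(err: &DomainError) -> Self {
        let code = match err {
            DomainError::InvalidParams(_) => -32602,
            DomainError::Internal(_) => -32603,
            // Assumption: an implementation-defined server error code.
            DomainError::TaskNotFound(_) => -32000,
        };
        JsonRpcError {
            code,
            message: err.to_string(),
        }
    }
}
```

Keeping the conversion in a single `From` implementation means handlers can work with typed domain errors and only translate to wire-format codes at the protocol boundary.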
## Crate Dependencies

```
vapora-a2a (A2A Server)
├── vapora-agents
├── vapora-shared
├── axum
├── tokio
└── serde/serde_json

vapora-a2a-client (A2A Client)
├── vapora-a2a
├── vapora-shared
├── reqwest
├── tokio
└── serde/serde_json

vapora-mcp-server (MCP Transport)
├── vapora-agents
├── vapora-shared
├── axum
├── reqwest
└── serde/serde_json
```
## LLM Routing & Cost Management

**NEW: Two comprehensive guides for working with LLM providers (Claude, OpenAI, Gemini, Ollama)**

### For Developers

- **[LLM Provider Patterns](llm-provider-patterns.md)** - Four implementation approaches:
  1. **Mocks** - Zero-cost development without API subscriptions
  2. **SDK Direct** - Full integration with official APIs
  3. **Add Provider** - Extending VAPORA with custom providers
  4. **End-to-End** - Complete request-to-response flow

- **[LLM Provider Implementation Guide](llm-provider-implementation.md)** - How VAPORA implements it today:
  - LLMClient trait abstraction (Claude, OpenAI, Gemini, Ollama)
  - Hybrid routing engine (rules + dynamic + manual override)
  - Cost tracking & token accounting
  - Three-tier budget enforcement
  - Automatic fallback chains
  - Production code examples
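
To make the idea of a fallback chain over a client trait concrete, here is a small sketch. The `LlmClient` trait shape, the synchronous `complete` method, and the `MockClient` type are assumptions invented for this example. The real LLMClient trait is described in the implementation guide and is likely async with richer request and response types.

```rust
/// Minimal provider abstraction (hypothetical, synchronous for brevity).
pub trait LlmClient {
    fn name(&self) -> &str;
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Mock provider: returns a canned reply, or fails when `reply` is None.
pub struct MockClient {
    pub name: &'static str,
    pub reply: Option<&'static str>,
}

impl LlmClient for MockClient {
    fn name(&self) -> &str {
        self.name
    }

    fn complete(&self, _prompt: &str) -> Result<String, String> {
        self.reply
            .map(str::to_string)
            .ok_or_else(|| format!("{} unavailable", self.name))
    }
}

/// Try each provider in order and return the first success as
/// (provider name, completion). If all fail, return every error.
pub fn complete_with_fallback(
    chain: &[&dyn LlmClient],
    prompt: &str,
) -> Result<(String, String), Vec<String>> {
    let mut errors = Vec::new();
    for client in chain {
        match client.complete(prompt) {
            Ok(text) => return Ok((client.name().to_string(), text)),
            Err(e) => errors.push(e),
        }
    }
    Err(errors)
}
```

With a chain of a failing `gemini` mock followed by an `ollama` mock, the call returns the `ollama` reply. The same mock type also shows how local development can run without any API subscription.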
### Key Insights

| Scenario | Pattern | Cost |
|----------|---------|------|
| Local development | Mocks | $0 |
| CI/integration tests | SDK + mocks | $0 |
| Staging/production | SDK real | Varies |
| Privacy-critical | Ollama local | $0 |
| Cost-optimized | Gemini + Ollama fallback | $0.005/1k |

**Start here**: [llm-provider-patterns.md](llm-provider-patterns.md) for patterns without subscriptions.

---
## Related Documentation

- [ADR Index](adr/README.md) - Architecture decision records
- [multi-ia-router.md](multi-ia-router.md) - Detailed LLM router specification
- Related ADRs:
  - [ADR-0007: Multi-Provider LLM Support](../adrs/0007-multi-provider-llm.md)
  - [ADR-0012: Three-Tier LLM Routing](../adrs/0012-llm-routing-tiers.md)
  - [ADR-0015: Budget Enforcement](../adrs/0015-budget-enforcement.md)
  - [ADR-0016: Cost Efficiency Ranking](../adrs/0016-cost-efficiency-ranking.md)