# ADR-015: AI Integration Architecture for Intelligent Infrastructure Provisioning

## Status

**Accepted** - 2025-01-08

## Context

The provisioning platform has evolved to include complex workflows for infrastructure configuration, deployment, and management. Current interaction patterns require deep technical knowledge of Nickel schemas, cloud provider APIs, networking concepts, and security best practices. This creates barriers to entry and slows down infrastructure provisioning for operators who are not infrastructure experts.

### The Infrastructure Complexity Problem

**Current state challenges**:

1. **Knowledge Barrier**: Deep Nickel, cloud, and networking expertise required
   - Understanding the Nickel type system and contracts
   - Knowing cloud provider resource relationships
   - Configuring security policies correctly
   - Debugging deployment failures
2. **Manual Configuration**: All configs hand-written
   - Repetitive boilerplate for common patterns
   - Easy to make mistakes (typos, missing fields)
   - No intelligent suggestions or autocomplete
   - Trial-and-error debugging
3. **Limited Assistance**: No contextual help
   - Documentation is separate from the workflow
   - No explanation of validation errors
   - No suggestions for fixing issues
   - No learning from past deployments
4. **Troubleshooting Difficulty**: Manual log analysis
   - Deployment failures require expert analysis
   - No automated root cause detection
   - No suggested fixes based on similar issues
   - Long time-to-resolution

### AI Integration Opportunities

1. **Natural Language to Configuration**:
   - User: "Create a production PostgreSQL cluster with encryption and daily backups"
   - AI: Generates validated Nickel configuration
2. **AI-Assisted Form Filling**:
   - User starts typing in a typdialog web form
   - AI suggests values based on context
   - AI explains validation errors in plain language
3. **Intelligent Troubleshooting**:
   - Deployment fails
   - AI analyzes logs and suggests fixes
   - AI generates corrected configuration
4. **Configuration Optimization**:
   - AI analyzes workload patterns
   - AI suggests performance improvements
   - AI detects security misconfigurations
5. **Learning from Operations**:
   - AI indexes past deployments
   - AI suggests configurations based on similar workloads
   - AI predicts potential issues

### AI Components Overview

The system integrates multiple AI components:

1. **typdialog-ai**: AI-assisted form interactions
2. **typdialog-ag**: AI agents for autonomous operations
3. **typdialog-prov-gen**: AI-powered configuration generation
4. **platform/crates/ai-service**: Core AI service backend
5. **platform/crates/mcp-server**: Model Context Protocol server
6. **platform/crates/rag**: Retrieval-Augmented Generation system

### Requirements for AI Integration

- ✅ **Natural Language Understanding**: Parse user intent from free-form text
- ✅ **Schema-Aware Generation**: Generate valid Nickel configurations
- ✅ **Context Retrieval**: Access documentation, schemas, past deployments
- ✅ **Security Enforcement**: Cedar policies control AI access
- ✅ **Human-in-the-Loop**: All AI actions require human approval
- ✅ **Audit Trail**: Complete logging of AI operations
- ✅ **Multi-Provider Support**: OpenAI, Anthropic, local models
- ✅ **Cost Control**: Rate limiting and budget management
- ✅ **Observability**: Trace AI decisions and reasoning

## Decision

Integrate a **comprehensive AI system** consisting of:

1. **AI-Assisted Interfaces** (typdialog-ai)
2. **Autonomous AI Agents** (typdialog-ag)
3. **AI Configuration Generator** (typdialog-prov-gen)
4. **Core AI Infrastructure** (ai-service, mcp-server, rag)

All AI components are **schema-aware**, **security-enforced**, and **human-supervised**.
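To make the "natural language → configuration" flow concrete: for the PostgreSQL request quoted in the Context above, the generator would be expected to emit something like the following Nickel fragment. This is an illustrative sketch only; the field names are assumptions for this example, not the actual schema under `provisioning/schemas`.

```nickel
# Hypothetical output for: "Create a production PostgreSQL cluster
# with encryption and daily backups" (field names are illustrative)
{
  database = {
    engine = 'postgres,
    version = "16",
    storage_gb = 100,
    backup_retention_days = 7,   # daily backups, one week retained
    encryption_enabled = true,
    environment = "production",
  }
}
```

Any such fragment still passes through schema validation, Cedar policy checks, and human approval before deployment.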
### Architecture Diagram

```text
┌─────────────────────────────────────────────────────────────────┐
│ User Interfaces                                                 │
│                                                                 │
│ Natural Language: "Create production K8s cluster in AWS"        │
│ Typdialog Forms:  AI-assisted field suggestions                 │
│ CLI:              provisioning ai generate-config "description" │
└────────────────────────────┬────────────────────────────────────┘
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│ AI Frontend Layer                                               │
│                                                                 │
│ typdialog-ai (AI-Assisted Forms)                                │
│   - Natural language form filling                               │
│   - Real-time AI suggestions                                    │
│   - Validation error explanations                               │
│   - Context-aware autocomplete                                  │
│ typdialog-ag (AI Agents)                                        │
│   - Autonomous task execution                                   │
│   - Multi-step workflow automation                              │
│   - Learning from feedback                                      │
│   - Agent collaboration                                         │
│ typdialog-prov-gen (Config Generator)                           │
│   - Natural language → Nickel config                            │
│   - Template-based generation                                   │
│   - Best practice injection                                     │
│   - Validation and refinement                                   │
└────────────────────────────┬────────────────────────────────────┘
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│ Core AI Infrastructure (platform/crates/)                       │
│                                                                 │
│ ai-service (Central AI Service)                                 │
│   - Request routing and orchestration                           │
│   - Authentication and authorization (Cedar)                    │
│   - Rate limiting and cost control                              │
│   - Caching and optimization                                    │
│   - Audit logging and observability                             │
│   - Multi-provider abstraction                                  │
│                                                                 │
│ mcp-server (Model Context Protocol)                             │
│   - LLM integration, multi-provider                             │
│     (OpenAI, Anthropic, local models)                           │
│   - Tool calling: nickel_validate, schema_query,                │
│     config_generate, cedar_check                                │
│   - Context management                                          │
│                                                                 │
│ rag (Retrieval-Augmented Generation)                            │
│   - Vector store (Qdrant/Milvus)                                │
│   - Embeddings (text-embed)                                     │
│   - Index: Nickel schemas, documentation,                       │
│     past deploys, best practices                                │
│   - Query: "How to configure Postgres with encryption?"         │
│     → Retrieval: relevant docs + examples                       │
└────────────────────────────┬────────────────────────────────────┘
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│ Integration Points                                              │
│                                                                 │
│ Nickel Validation │ SecretumVault (Secrets) │ Cedar (AI Policies)│
│ Orchestrator (Deploy) │ Typdialog (Forms) │ Audit (All AI Ops)  │
└────────────────────────────┬────────────────────────────────────┘
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│ Output: Validated Nickel Configuration                          │
│                                                                 │
│ ✅ Schema-validated                                             │
│ ✅ Security-checked (Cedar policies)                            │
│ ✅ Human-approved                                               │
│ ✅ Audit-logged                                                 │
│ ✅ Ready for deployment                                         │
└─────────────────────────────────────────────────────────────────┘
```

### Component Responsibilities

**typdialog-ai** (AI-Assisted Forms):
- Real-time form field suggestions based on context
- Natural language form filling
- Validation error explanations in plain English
- Context-aware autocomplete for configuration values
- Integration with the typdialog web UI

**typdialog-ag** (AI Agents):
- Autonomous task execution (multi-step workflows)
- Agent collaboration (multiple agents working together)
- Learning from user feedback and past operations
- Goal-oriented behavior (achieve the outcome, not just execute steps)
- Safety boundaries (cannot deploy without approval)

**typdialog-prov-gen** (Config Generator):
- Natural language → Nickel configuration
- Template-based generation with customization
- Best practice injection (security, performance, HA)
- Iterative refinement based on validation feedback
- Integration with the Nickel schema system

**ai-service** (Core AI Service):
- Central request router for all AI operations
- Authentication and authorization (Cedar policies)
- Rate limiting and cost control
- Caching (reduce LLM API calls)
- Audit logging (all AI operations)
- Multi-provider abstraction (OpenAI, Anthropic, local)

**mcp-server** (Model Context Protocol):
- LLM integration (OpenAI, Anthropic, local models)
- Tool calling framework (nickel_validate, schema_query, etc.)
- Context management (conversation history, schemas)
- Streaming responses for real-time feedback
- Error handling and retries

**rag** (Retrieval-Augmented Generation):
- Vector store (Qdrant/Milvus) for embeddings
- Document indexing (Nickel schemas, docs, deployments)
- Semantic search (find relevant context)
- Embedding generation (text-embedding-3-large)
- Query expansion and reranking

## Rationale

### Why AI Integration Is Essential

| Aspect | Manual Config | AI-Assisted (chosen) |
| ------ | ------------- | -------------------- |
| **Learning Curve** | 🔴 Steep | 🟢 Gentle |
| **Time to Deploy** | 🔴 Hours | 🟢 Minutes |
| **Error Rate** | 🔴 High | 🟢 Low (validated) |
| **Documentation Access** | 🔴 Separate | 🟢 Contextual |
| **Troubleshooting** | 🔴 Manual | 🟢 AI-assisted |
| **Best Practices** | ⚠️ Manual enforcement | ✅ Auto-injected |
| **Consistency** | ⚠️ Varies by operator | ✅ Standardized |
| **Scalability** | 🔴 Limited by expertise | 🟢 AI scales knowledge |

### Why Schema-Aware AI Is Critical

Traditional AI code generation fails for infrastructure because:

```text
Generic AI (like GitHub Copilot):
❌ Generates syntactically correct but semantically wrong configs
❌ Doesn't understand cloud provider constraints
❌ No validation against schemas
❌ No security policy enforcement
❌ Hallucinated resource names/IDs
```

**Schema-aware AI** (our approach):

```nickel
# Nickel schema provides ground truth
{
  Database = {
    engine | [| 'postgres, 'mysql, 'mongodb |],
    version | String,
    storage_gb | Number,
    backup_retention_days | Number,
  }
}

# AI generates ONLY valid configs
# AI knows:
# - Valid engine values ('postgres', not 'postgresql')
# - Required fields (all listed above)
# - Type constraints (storage_gb is Number, not String)
# - Nickel contracts (if defined)
```

**Result**: AI cannot generate invalid configs.
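This guarantee is enforced mechanically: generated output is validated against the schema and, on failure, the validation errors are fed back into the next generation attempt. A minimal sketch of that loop, with the LLM call and `nickel_validate` replaced by in-memory stubs (all names here are illustrative, not the real `ai-service` API):

```rust
// Sketch of the generate → validate → refine loop. The stub "LLM"
// first emits the invalid engine tag 'postgresql; given the validator's
// feedback, the retry emits the schema-valid 'postgres.

#[derive(Debug)]
struct Validation { errors: Vec<String> }
impl Validation { fn is_valid(&self) -> bool { self.errors.is_empty() } }

// Stub LLM call: behavior depends on validation feedback from the last attempt.
fn generate(prompt: &str, feedback: &[String]) -> String {
    if feedback.is_empty() {
        format!("{{ engine = 'postgresql, storage_gb = 100 }} # from: {prompt}")
    } else {
        format!("{{ engine = 'postgres, storage_gb = 100 }} # from: {prompt}")
    }
}

// Stub validator: enforces the engine enum from the schema above.
fn validate(config: &str) -> Validation {
    let errors = if config.contains("'postgresql") {
        vec!["engine: expected one of 'postgres, 'mysql, 'mongodb".to_string()]
    } else {
        vec![]
    };
    Validation { errors }
}

// Retry until the config validates, feeding errors back each round.
fn generate_valid(prompt: &str, max_attempts: usize) -> Option<String> {
    let mut feedback = Vec::new();
    for _ in 0..max_attempts {
        let config = generate(prompt, &feedback);
        let v = validate(&config);
        if v.is_valid() {
            return Some(config);
        }
        feedback = v.errors; // context for the next attempt
    }
    None
}

fn main() {
    let config = generate_valid("postgres with encryption", 3).expect("should converge");
    assert!(config.contains("'postgres,"));
    println!("valid config: {config}");
}
```

In the real system the feedback channel is the MCP tool result from `nickel_validate`, so the model sees the exact contract violation rather than a canned string.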
### Why RAG (Retrieval-Augmented Generation) Is Essential

LLMs alone have limitations:

```text
Pure LLM:
❌ Knowledge cutoff (no recent updates)
❌ Hallucinations (invents plausible-sounding configs)
❌ No project-specific knowledge
❌ No access to past deployments
```

**RAG-enhanced LLM**:

```text
Query: "How to configure Postgres with encryption?"

RAG retrieves:
- Nickel schema:   provisioning/schemas/database.ncl
- Documentation:   docs/user/database-encryption.md
- Past deployment: workspaces/prod/postgres-encrypted.ncl
- Best practice:   .claude/patterns/secure-database.md

LLM generates answer WITH retrieved context:
✅ Accurate (based on actual schemas)
✅ Project-specific (uses our patterns)
✅ Proven (learned from past deployments)
✅ Secure (follows our security guidelines)
```

### Why Human-in-the-Loop Is Non-Negotiable

AI-generated infrastructure configs require human approval:

```rust
// All AI operations require approval
pub async fn ai_generate_config(request: GenerateRequest) -> Result<Config, AIError> {
    let ai_generated = ai_service.generate(request).await?;

    // Validate against the Nickel schema
    let validation = nickel_validate(&ai_generated)?;
    if !validation.is_valid() {
        return Err(AIError::InvalidGeneration(validation.errors));
    }

    // Check Cedar policies
    let authorized = cedar_authorize(user, "approve_ai_config", &ai_generated)?;
    if !authorized {
        return Err(AIError::PermissionDenied);
    }

    // Require explicit human approval
    let approval = prompt_user_approval(&ai_generated).await?;
    if !approval.approved {
        audit_log("AI config rejected by user", &ai_generated);
        return Err(AIError::RejectedByUser);
    }

    audit_log("AI config approved by user", &ai_generated);
    Ok(ai_generated)
}
```

**Why**:

- Infrastructure changes have real-world cost and security impact
- AI can make mistakes (hallucinations, misunderstandings)
- Compliance requires human accountability
- Learning opportunity (human reviews teach the AI)

### Why Multi-Provider Support Matters

No single LLM provider is best for all tasks:

| Provider | Best For | Considerations |
| -------- | -------- | -------------- |
| **Anthropic (Claude)** | Long context, accuracy | ✅ Best for complex configs |
| **OpenAI (GPT-4)** | Tool calling, speed | ✅ Best for quick suggestions |
| **Local (Llama, Mistral)** | Privacy, cost | ✅ Best for air-gapped envs |

**Strategy**:

- Complex config generation → Claude (long context)
- Real-time form suggestions → GPT-4 (fast)
- Air-gapped deployments → Local models (privacy)

## Consequences

### Positive

- **Accessibility**: Non-experts can provision infrastructure
- **Productivity**: 10x faster configuration creation
- **Quality**: AI injects best practices automatically
- **Consistency**: Standardized configurations across teams
- **Learning**: Users learn from AI explanations
- **Troubleshooting**: AI-assisted debugging reduces MTTR
- **Documentation**: Contextual help embedded in the workflow
- **Safety**: Schema validation prevents invalid configs
- **Security**: Cedar policies control AI access
- **Auditability**: Complete trail of AI operations

### Negative

- **Dependency**: Requires LLM API access (or local models)
- **Cost**: LLM API calls have per-token cost
- **Latency**: AI responses take 1-5 seconds
- **Accuracy**: AI can still make mistakes (needs validation)
- **Trust**: Users must understand AI limitations
- **Complexity**: Additional infrastructure to operate
- **Privacy**: Configs sent to LLM providers (unless local)

### Mitigation Strategies

**Cost Control**:

```toml
[ai.rate_limiting]
requests_per_minute = 60
tokens_per_day = 1000000
cost_limit_per_day = "100.00"  # USD

[ai.caching]
enabled = true
ttl = "1h"  # Cache similar queries to reduce API calls
```

**Latency Optimization**:

```rust
// Streaming responses for real-time feedback
pub async fn ai_generate_stream(request: GenerateRequest) -> impl Stream<Item = String> {
    ai_service
        .generate_stream(request)
        .await
        .map(|chunk| chunk.text)
}
```

**Privacy (Local Models)**:
```toml
[ai]
provider = "local"
model_path = "/opt/provisioning/models/llama-3-70b"
# No data leaves the network
```

**Validation (Defense in Depth)**:

```text
AI generates config
        ↓
Nickel schema validation (syntax, types, contracts)
        ↓
Cedar policy check (security, compliance)
        ↓
Human approval (final gate)
        ↓
Deployment
```

**Observability**:

```toml
[ai.observability]
trace_all_requests = true
store_conversations = true
conversation_retention = "30d"

# Every AI operation logged:
# - Input prompt
# - Retrieved context (RAG)
# - Generated output
# - Validation results
# - Human approval decision
```

## Alternatives Considered

### Alternative 1: No AI Integration

**Pros**: Simpler, no LLM dependencies
**Cons**: Steep learning curve, slow provisioning, manual troubleshooting
**Decision**: REJECTED - Poor user experience (10x slower provisioning, high error rate)

### Alternative 2: Generic AI Code Generation (GitHub Copilot approach)

**Pros**: Existing tools, well-known UX
**Cons**: Not schema-aware, generates invalid configs, no validation
**Decision**: REJECTED - Inadequate for infrastructure (correctness critical)

### Alternative 3: AI Only for Documentation/Search

**Pros**: Lower risk (AI doesn't generate configs)
**Cons**: Missed opportunity for 10x productivity gains
**Decision**: REJECTED - Too conservative

### Alternative 4: Fully Autonomous AI (No Human Approval)

**Pros**: Maximum automation
**Cons**: Unacceptable risk for infrastructure changes
**Decision**: REJECTED - Safety and compliance requirements

### Alternative 5: Single LLM Provider Lock-in

**Pros**: Simpler integration
**Cons**: Vendor lock-in, no flexibility for different use cases
**Decision**: REJECTED - Multi-provider abstraction provides flexibility

## Implementation Details

### AI Service API

```rust
// platform/crates/ai-service/src/lib.rs
#[async_trait]
pub trait AIService {
    async fn generate_config(
        &self,
        prompt: &str,
        schema: &NickelSchema,
        context: Option<RAGContext>,
    ) -> Result<GeneratedConfig>;

    async fn suggest_field_value(
        &self,
        field: &FieldDefinition,
        partial_input: &str,
        form_context: &FormContext,
    ) -> Result<Vec<Suggestion>>;

    async fn explain_validation_error(
        &self,
        error: &ValidationError,
        config: &Config,
    ) -> Result<ErrorExplanation>;

    async fn troubleshoot_deployment(
        &self,
        deployment_id: &str,
        logs: &DeploymentLogs,
    ) -> Result<TroubleshootingReport>;
}

pub struct AIServiceImpl {
    mcp_client: MCPClient,
    rag: RAGService,
    cedar: CedarEngine,
    audit: AuditLogger,
    rate_limiter: RateLimiter,
    cache: Cache,
}

#[async_trait]
impl AIService for AIServiceImpl {
    async fn generate_config(
        &self,
        prompt: &str,
        schema: &NickelSchema,
        context: Option<RAGContext>,
    ) -> Result<GeneratedConfig> {
        // Check authorization
        self.cedar.authorize(current_user(), "ai:generate_config", schema)?;

        // Rate limiting
        self.rate_limiter.check(current_user()).await?;

        // Retrieve relevant context via RAG
        let rag_context = match context {
            Some(ctx) => ctx,
            None => self.rag.retrieve(prompt, schema).await?,
        };

        // Generate config via MCP
        let generated = self.mcp_client.generate(
            prompt,
            schema,
            rag_context,
            &["nickel_validate", "schema_query"],
        ).await?;

        // Validate the generated config
        let validation = nickel_validate(&generated.config)?;
        if !validation.is_valid() {
            return Err(AIError::InvalidGeneration(validation.errors));
        }

        // Audit log
        self.audit.log(AIOperation::GenerateConfig {
            user: current_user(),
            prompt: prompt.to_string(),
            schema: schema.name(),
            generated: generated.config.clone(),
            validation: validation.clone(),
        });

        Ok(GeneratedConfig {
            config: generated.config,
            explanation: generated.explanation,
            confidence: generated.confidence,
            validation,
        })
    }
}
```

### MCP Server Integration

```rust
// platform/crates/mcp-server/src/lib.rs
pub struct MCPClient {
    provider: Box<dyn LLMProvider>,
    tools: ToolRegistry,
}

#[async_trait]
pub trait LLMProvider {
    async fn generate(&self, request: GenerateRequest) -> Result<GenerateResponse>;
    async fn generate_stream(&self, request: GenerateRequest) -> Result<ResponseStream>;
}

// Tool definitions for the LLM
pub struct ToolRegistry {
    tools: HashMap<&'static str, Tool>,
}

impl ToolRegistry {
    pub fn new() -> Self {
        let mut tools = HashMap::new();

        tools.insert("nickel_validate", Tool {
            name: "nickel_validate",
            description: "Validate Nickel configuration against schema",
            parameters: json!({
                "type": "object",
                "properties": {
                    "config": {"type": "string"},
                    "schema_path": {"type": "string"},
                },
                "required": ["config", "schema_path"],
            }),
            handler: Box::new(|params| async move {
                let config = params["config"].as_str().unwrap();
                let schema = params["schema_path"].as_str().unwrap();
                nickel_validate_tool(config, schema).await
            }),
        });

        tools.insert("schema_query", Tool {
            name: "schema_query",
            description: "Query Nickel schema for field information",
            parameters: json!({
                "type": "object",
                "properties": {
                    "schema_path": {"type": "string"},
                    "query": {"type": "string"},
                },
                "required": ["schema_path"],
            }),
            handler: Box::new(|params| async move {
                let schema = params["schema_path"].as_str().unwrap();
                let query = params.get("query").and_then(|v| v.as_str());
                schema_query_tool(schema, query).await
            }),
        });

        Self { tools }
    }
}
```

### RAG System Implementation

```rust
// platform/crates/rag/src/lib.rs
pub struct RAGService {
    vector_store: Box<dyn VectorStore>,
    embeddings: EmbeddingModel,
    indexer: DocumentIndexer,
}

impl RAGService {
    pub async fn index_all(&self) -> Result<()> {
        // Index Nickel schemas
        self.index_schemas("provisioning/schemas").await?;
        // Index documentation
        self.index_docs("docs").await?;
        // Index past deployments
        self.index_deployments("workspaces").await?;
        // Index best practices
        self.index_patterns(".claude/patterns").await?;
        Ok(())
    }

    pub async fn retrieve(
        &self,
        query: &str,
        schema: &NickelSchema,
    ) -> Result<RAGContext> {
        // Generate the query embedding
        let query_embedding = self.embeddings.embed(query).await?;

        // Search the vector store
        let results = self.vector_store.search(
            query_embedding,
            10,
            Some(json!({ "schema": schema.name() })),
        ).await?;

        // Rerank results
        let reranked = self.rerank(query, results).await?;

        // Build context
        Ok(RAGContext {
            query: query.to_string(),
            schema_definition: schema.to_string(),
            relevant_docs: reranked.iter()
                .take(5)
                .map(|r| r.content.clone())
                .collect(),
            similar_configs: self.find_similar_configs(schema).await?,
            best_practices: self.find_best_practices(schema).await?,
        })
    }
}

#[async_trait]
pub trait VectorStore {
    async fn insert(&self, id: &str, embedding: Vec<f32>, metadata: Value) -> Result<()>;
    async fn search(&self, embedding: Vec<f32>, top_k: usize, filter: Option<Value>) -> Result<Vec<SearchResult>>;
}

// Qdrant implementation
pub struct QdrantStore {
    client: qdrant::QdrantClient,
    collection: String,
}
```

### typdialog-ai Integration

```rust
// typdialog-ai/src/form_assistant.rs
pub struct FormAssistant {
    ai_service: Arc<dyn AIService>,
}

impl FormAssistant {
    pub async fn suggest_field_value(
        &self,
        field: &FieldDefinition,
        partial_input: &str,
        form_context: &FormContext,
    ) -> Result<Vec<Suggestion>> {
        self.ai_service.suggest_field_value(field, partial_input, form_context).await
    }

    pub async fn explain_error(
        &self,
        error: &ValidationError,
        field_value: &str,
    ) -> Result<String> {
        let explanation = self.ai_service.explain_validation_error(error, field_value).await?;

        Ok(format!(
            "Error: {}\nExplanation: {}\nSuggested fix: {}",
            error.message,
            explanation.plain_english,
            explanation.suggested_fix,
        ))
    }

    pub async fn fill_from_natural_language(
        &self,
        description: &str,
        form_schema: &FormSchema,
    ) -> Result<HashMap<String, Value>> {
        let prompt = format!(
            "User wants to: {}\nForm schema: {}\nGenerate field values:",
            description,
            serde_json::to_string_pretty(form_schema)?,
        );

        let generated = self.ai_service.generate_config(
            &prompt,
            &form_schema.nickel_schema,
            None,
        ).await?;

        Ok(generated.field_values)
    }
}
```

### typdialog-ag Agents

```rust
// typdialog-ag/src/agent.rs
pub struct ProvisioningAgent {
    ai_service: Arc<dyn AIService>,
    orchestrator: Arc<Orchestrator>,
    max_iterations: usize,
}

impl ProvisioningAgent {
    pub async fn execute_goal(&self, goal: &str) -> Result<AgentResult> {
        let mut state = AgentState::new(goal);

        for _iteration in 0..self.max_iterations {
            // AI determines the next action
            let action = self.ai_service.agent_next_action(&state).await?;

            // Execute the action (with human approval for critical operations)
            let result = self.execute_action(&action, &state).await?;

            // Update state
            state.update(action, result);

            // Check whether the goal is achieved
            if state.goal_achieved() {
                return Ok(AgentResult::Success(state));
            }
        }

        Err(AgentError::MaxIterationsReached)
    }

    async fn execute_action(
        &self,
        action: &AgentAction,
        state: &AgentState,
    ) -> Result<ActionResult> {
        match action {
            AgentAction::GenerateConfig { description } => {
                let config = self.ai_service.generate_config(
                    description,
                    &state.target_schema,
                    Some(state.context.clone()),
                ).await?;
                Ok(ActionResult::ConfigGenerated(config))
            },
            AgentAction::Deploy { config } => {
                // Require human approval for deployment
                let approval = prompt_user_approval(
                    "Agent wants to deploy. Approve?",
                    config,
                ).await?;
                if !approval.approved {
                    return Ok(ActionResult::DeploymentRejected);
                }
                let deployment = self.orchestrator.deploy(config).await?;
                Ok(ActionResult::Deployed(deployment))
            },
            AgentAction::Troubleshoot { deployment_id } => {
                let report = self.ai_service.troubleshoot_deployment(
                    deployment_id,
                    &self.orchestrator.get_logs(deployment_id).await?,
                ).await?;
                Ok(ActionResult::TroubleshootingReport(report))
            },
        }
    }
}
```

### Cedar Policies for AI

```cedar
// AI cannot access secrets without explicit permission
forbid(
    principal == Service::"ai-service",
    action == Action::"read",
    resource in Secret::"*"
);

// AI can generate configs for non-production environments without approval
permit(
    principal == Service::"ai-service",
    action == Action::"generate_config",
    resource in Schema::"*"
) when {
    resource.environment in ["dev", "staging"]
};

// AI config generation for production requires senior engineer approval
permit(
    principal in Group::"senior-engineers",
    action == Action::"approve_ai_config",
    resource in Config::"*"
) when {
    resource.environment == "production" &&
    resource.generated_by == "ai-service"
};

// AI agents cannot deploy without human approval
forbid(
    principal == Service::"ai-agent",
    action == Action::"deploy",
    resource == Infrastructure::"*"
) unless {
    context.human_approved == true
};
```

## Testing Strategy

**Unit Tests**:

```rust
#[tokio::test]
async fn test_ai_config_generation_validates() {
    let ai_service = mock_ai_service();

    let generated = ai_service.generate_config(
        "Create a PostgreSQL database with encryption",
        &postgres_schema(),
        None,
    ).await.unwrap();

    // Must validate against the schema
    assert!(generated.validation.is_valid());
    assert_eq!(generated.config["engine"], "postgres");
    assert_eq!(generated.config["encryption_enabled"], true);
}

#[tokio::test]
async fn test_ai_cannot_access_secrets() {
    let ai_service = ai_service_with_cedar();

    let result = ai_service.get_secret("database/password").await;

    assert!(result.is_err());
    assert_eq!(result.unwrap_err(), AIError::PermissionDenied);
}
```

**Integration Tests**:

```rust
#[tokio::test]
async fn test_end_to_end_ai_config_generation() {
    // User provides natural language
    let description = "Create a production Kubernetes cluster in AWS with 5 nodes";

    // AI generates the config
    let generated = ai_service.generate_config(description).await.unwrap();

    // Nickel validation
    let validation = nickel_validate(&generated.config).await.unwrap();
    assert!(validation.is_valid());

    // Human approval
    let approval = Approval {
        user: "senior-engineer@example.com",
        approved: true,
        timestamp: Utc::now(),
    };

    // Deploy
    let deployment = orchestrator.deploy_with_approval(
        generated.config,
        approval,
    ).await.unwrap();

    assert_eq!(deployment.status, DeploymentStatus::Success);
}
```

**RAG Quality Tests**:

```rust
#[tokio::test]
async fn test_rag_retrieval_accuracy() {
    let rag = rag_service();

    // Index test documents
    rag.index_all().await.unwrap();

    // Query
    let context = rag.retrieve(
        "How to configure PostgreSQL with encryption?",
        &postgres_schema(),
    ).await.unwrap();

    // Should retrieve relevant docs
    assert!(context.relevant_docs.iter().any(|doc| {
        doc.contains("encryption") && doc.contains("postgres")
    }));

    // Should retrieve similar configs
    assert!(!context.similar_configs.is_empty());
}
```

## Security Considerations

**AI Access Control**:

```text
AI Service Permissions (enforced by Cedar):
✅ CAN: Read Nickel schemas
✅ CAN: Generate configurations
✅ CAN: Query documentation
✅ CAN: Analyze deployment logs (sanitized)
❌ CANNOT: Access secrets directly
❌ CANNOT: Deploy without approval
❌ CANNOT: Modify Cedar policies
❌ CANNOT: Access user credentials
```

**Data Privacy**:

```toml
[ai.privacy]
# Sanitize before sending to the LLM
sanitize_secrets = true
sanitize_pii = true
sanitize_credentials = true

# What gets sent to the LLM:
# ✅ Nickel schemas (public)
# ✅ Documentation (public)
# ✅ Error messages (sanitized)
# ❌ Secret values (never)
# ❌ Passwords (never)
# ❌ API keys (never)
```

**Audit Trail**:

```rust
// Every AI operation is logged
pub struct AIAuditLog {
    timestamp: DateTime<Utc>,
    user: UserId,
    operation: AIOperation,
    input_prompt: String,
    generated_output: String,
    validation_result: ValidationResult,
    human_approval: Option<ApprovalRecord>,
    deployment_outcome: Option<DeploymentOutcome>,
}
```

## Cost Analysis

**Estimated Costs** (per month, based on typical usage):

```text
Assumptions:
- 100 active users
- 10 AI config generations per user per day
- Average prompt: 2000 tokens
- Average response: 1000 tokens

Provider: Anthropic Claude Sonnet
Cost: $3 per 1M input tokens, $15 per 1M output tokens

Monthly tokens:
= 100 users × 10 generations × 30 days × (2000 input + 1000 output tokens)
= 100 × 10 × 30 × 3000 tokens
= 90M tokens (60M input + 30M output)

Monthly cost:
= (60M input × $3/1M) + (30M output × $15/1M)
= $180 + $450
= $630/month

With caching (50% hit rate):
= $315/month
```

**Cost optimization strategies**:

- Caching (50-80% cost reduction)
- Streaming (lower latency, same cost)
- Local models for non-critical operations (zero marginal cost)
- Rate limiting (prevent runaway costs)

## References

- [Model Context Protocol (MCP)](https://modelcontextprotocol.io/)
- [Anthropic Claude API](https://docs.anthropic.com/claude/reference/getting-started)
- [OpenAI GPT-4 API](https://platform.openai.com/docs/api-reference)
- [Qdrant Vector Database](https://qdrant.tech/)
- [RAG Survey Paper](https://arxiv.org/abs/2312.10997)
- ADR-008: Cedar Authorization (AI access control)
- ADR-011: Nickel Migration (schema-driven AI)
- ADR-013: Typdialog Web UI Backend (AI-assisted forms)
- ADR-014: SecretumVault Integration (AI-secret isolation)

---

**Status**: Accepted
**Last Updated**: 2025-01-08
**Implementation**: Planned (High Priority)
**Estimated Complexity**: Very Complex
**Dependencies**: ADR-008, ADR-011, ADR-013, ADR-014