
# AI-Powered Infrastructure Automation

This module provides comprehensive AI capabilities for the provisioning system, enabling natural language infrastructure generation and management.

## Features

### 🤖 **Core AI Capabilities**

- Natural language Nickel file generation
- Intelligent template creation
- Infrastructure query processing
- Configuration validation and improvement
- Chat/webhook integration

### 📝 **Nickel Generation Types**

- **Server Configurations** (`servers.ncl`) - Generate server definitions with storage, networking, and services
- **Provider Defaults** (`*_defaults.ncl`) - Create provider-specific default settings
- **Settings Configuration** (`settings.ncl`) - Generate main infrastructure settings
- **Cluster Configuration** - Kubernetes and container orchestration setups
- **Task Services** - Individual service configurations
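
The `-t` flag shown later under "Command Line Interface" selects among these types. As a sketch, a wrapper script could loop over the types to stage one file each; this version only prints the commands it would run (the provider and descriptions are placeholder values):

```shell
# Dry-run sketch: emit one `ai gen` command per generation type using
# the documented -t/-p/-i/-o flags (provider/description are examples).
gen_commands() {
  for t in server defaults settings; do
    echo "./provisioning ai gen -t $t -p upcloud -i 'dev environment' -o generated_${t}.ncl"
  done
}
gen_commands
```

Piping the output to `sh` would execute the staged commands once they have been reviewed.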

### 🔧 **AI Providers Supported**

- **OpenAI** (GPT-4, GPT-3.5)
- **Anthropic Claude** (Claude 3.5 Sonnet, Claude 3)
- **Generic/Local** (Ollama, local LLM APIs)

## Configuration

### Environment Variables

```bash
# Enable AI functionality
export PROVISIONING_AI_ENABLED=true

# Set provider
export PROVISIONING_AI_PROVIDER="openai"  # or "claude", "generic"

# API Keys (choose based on provider)
export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key" 
export LLM_API_KEY="your-generic-api-key"

# Optional overrides
export PROVISIONING_AI_MODEL="gpt-4"
export PROVISIONING_AI_TEMPERATURE="0.3"
export PROVISIONING_AI_MAX_TOKENS="2048"
```
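
Internally, the module presumably selects the matching key for the configured provider; a minimal shell sketch of that mapping (the function name is invented, the variable names come from the list above):

```shell
# Hypothetical helper: map a provider name to the API-key variable it
# uses (env var names from the documentation above).
resolve_key_var() {
  case "$1" in
    openai)  echo "OPENAI_API_KEY" ;;
    claude)  echo "ANTHROPIC_API_KEY" ;;
    generic) echo "LLM_API_KEY" ;;
    *)       echo "unknown provider: $1" >&2; return 1 ;;
  esac
}

resolve_key_var "${PROVISIONING_AI_PROVIDER:-openai}"
```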

### Nickel Configuration

```nickel
let settings = import "settings.ncl" in

settings.Settings {
    ai = settings.AIProvider {
        enabled = true,
        provider = "openai",
        model = "gpt-4",
        max_tokens = 2048,
        temperature = 0.3,
        enable_template_ai = true,
        enable_query_ai = true,
        enable_webhook_ai = false,
    }
}
```

### YAML Configuration (`ai.yaml`)

```yaml
enabled: true
provider: "openai"
model: "gpt-4"
max_tokens: 2048
temperature: 0.3
timeout: 30
enable_template_ai: true
enable_query_ai: true
enable_webhook_ai: false
```

## Usage

### 🎯 **Command Line Interface**

#### Generate Infrastructure with AI

```bash
# Interactive generation
./provisioning ai generate --interactive

# Generate specific configurations
./provisioning ai gen -t server -p upcloud -i "3 Kubernetes nodes with Ceph storage" -o servers.ncl
./provisioning ai gen -t defaults -p aws -i "Production environment in us-west-2" -o aws_defaults.ncl
./provisioning ai gen -t settings -i "E-commerce platform with secrets management" -o settings.ncl

# Enhanced generation with validation
./provisioning generate-ai servers "High-availability Kubernetes cluster with 3 control planes and 5 workers" --validate --provider upcloud

# Improve existing configurations
./provisioning ai improve -i existing_servers.ncl -o improved_servers.ncl

# Validate and fix Nickel files
./provisioning ai validate -i servers.ncl
```

#### Interactive AI Chat

```bash
# Start chat session
./provisioning ai chat

# Single query
./provisioning ai chat -i "How do I set up a 3-node Kubernetes cluster with persistent storage?"

# Test AI connectivity
./provisioning ai test

# Show configuration
./provisioning ai config
```

### 🧠 **Programmatic API**

#### Generate Nickel Files

```nushell
use lib_provisioning/ai/templates.nu *

# Generate server configuration
let servers = (generate_server_nickel "3 Kubernetes nodes for production workloads" "upcloud" "servers.ncl")

# Generate provider defaults
let defaults = (generate_defaults_nickel "High-availability setup in EU region" "aws" "aws_defaults.ncl")

# Generate complete infrastructure
let result = (generate_full_infra_ai "E-commerce platform with database and caching" "upcloud" "" false)
```

#### Process Natural Language Queries

```nushell
use lib_provisioning/ai/lib.nu *

# Process infrastructure queries
let response = (ai_process_query "Show me all servers with high CPU usage")

# Generate templates
let template = (ai_generate_template "Docker Swarm cluster with monitoring" "cluster")

# Validate configurations
let validation = (validate_and_fix_nickel "servers.ncl")
```

### 🌐 **Webhook Integration**

#### HTTP Webhook

```bash
curl -X POST http://your-server/webhook \
  -H "Content-Type: application/json" \
  -d '{
    "message": "generate 3 kubernetes servers with monitoring",
    "user_id": "user123",
    "channel": "infrastructure"
  }'
```
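
If the message text comes from user input, it needs JSON escaping before being interpolated into the payload. A minimal POSIX-shell sketch (it handles only quotes and backslashes; a real client should build the body with `jq` or a JSON library):

```shell
# Minimal payload builder: escape backslashes and double quotes before
# interpolating user text into the JSON body (newlines are not handled).
json_escape() { printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'; }

build_payload() {
  printf '{"message":"%s","user_id":"%s","channel":"%s"}' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "$3")"
}

build_payload 'generate 3 kubernetes servers with monitoring' user123 infrastructure
```

The result can then be passed to `curl` with `-d "$payload"`.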

#### Slack Integration

```nushell
# Process Slack webhook payload
let slack_payload = {
    text: "generate upcloud defaults for development",
    user_id: "U123456",
    channel_id: "C789012"
}

let response = (process_slack_webhook $slack_payload)
```

#### Discord Integration

```nushell
# Process Discord webhook
let discord_payload = {
    content: "show infrastructure status",
    author: { id: "123456789" },
    channel_id: "987654321"
}

let response = (process_discord_webhook $discord_payload)
```

## Examples

### 🏗️ **Infrastructure Generation Examples**

#### 1. Kubernetes Cluster Setup

```bash
./provisioning generate-ai servers "
High-availability Kubernetes cluster with:
- 3 control plane nodes (4 CPU, 8GB RAM each)
- 5 worker nodes (8 CPU, 16GB RAM each) 
- Dedicated storage nodes with Ceph
- Private networking with load balancer
- Monitoring and logging stack
" --provider upcloud --output k8s_cluster_servers.ncl --validate
```

#### 2. AWS Production Environment

```bash
./provisioning generate-ai defaults "
AWS production environment configuration:
- Region: us-west-2 (primary), us-east-1 (backup)
- High availability across 3 AZs
- Enhanced security with VPC, private subnets
- Auto-scaling groups for applications
- RDS Multi-AZ for databases
- ElastiCache for caching
- CloudFront CDN
- Route53 DNS management
" --provider aws --output aws_prod_defaults.ncl
```

#### 3. Development Environment

```bash
./provisioning generate-ai infra "
Development environment for a microservices application:
- 1 jump/bastion server
- 3 application servers (containerized workloads)
- 1 database server (PostgreSQL)
- 1 cache server (Redis)
- Shared NFS storage
- Development tools (Git, CI/CD agents)
- Monitoring (Prometheus, Grafana)
" --provider local --interactive
```

### 💬 **Chat Examples**

**User:** *"I need to set up a scalable web application infrastructure"*

**AI:** *"I can help you design a scalable web infrastructure! Let me generate a configuration that includes:*

*- Load balancers for high availability*
*- Auto-scaling application servers*
*- Database with read replicas*
*- Caching layer*
*- CDN for static assets*

*Would you like me to create this for a specific provider like AWS, UpCloud, or local deployment?"*

**User:** *"Generate it for UpCloud with monitoring included"*

**AI:** *"Perfect! I'll generate an UpCloud configuration with monitoring. Here's your infrastructure setup:*

```nickel
let upcloud_prov = import "upcloud_prov.ncl" in
{
  servers = [
    # Load balancer
    upcloud_prov.Server_upcloud {
      hostname = "lb-01",
      plan = "2xCPU-4GB",
      # ... configuration
    },
    # Application servers with auto-scaling
    # Database servers with replication
    # Monitoring stack with Prometheus/Grafana
  ]
}
```

*This configuration includes 7 servers optimized for high availability and performance.
Would you like me to explain any specific part or generate additional configurations?"*

### 🚀 **Advanced Features**

#### Interactive Configuration Builder

```bash
./provisioning ai generate --interactive
```

This launches an interactive session that asks specific questions to build optimal configurations:

1. **Infrastructure Purpose** - Web app, data processing, ML training, etc.
2. **Scale Requirements** - Number of users, traffic patterns, growth projections
3. **Provider Preference** - Cloud provider selection and regions
4. **Service Requirements** - Databases, caching, storage, monitoring
5. **Security Needs** - Compliance requirements, network isolation
6. **Budget Constraints** - Cost optimization preferences

#### Configuration Optimization

```bash
# Analyze and improve existing configurations
./provisioning ai improve existing_config.ncl --output optimized_config.ncl

# Get AI suggestions for performance improvements
./provisioning ai query --prompt "How can I optimize this configuration for better performance?" --context file:servers.ncl
```

## Integration with Existing Workflows

### 🔄 **Workflow Integration**

1. **Generate** configurations with AI
2. **Validate** using Nickel compiler
3. **Review** and customize as needed  
4. **Apply** using provisioning commands
5. **Monitor** and iterate

```bash
# Complete workflow example
./provisioning generate-ai servers "Production Kubernetes cluster" --validate --output servers.ncl
./provisioning server create --check  # Review before creation
./provisioning server create          # Actually create infrastructure
```

### 🛡️ **Security & Best Practices**

- **API Keys**: Store in environment variables, never in code
- **Validation**: Always validate AI-generated configurations
- **Review**: Human review recommended for production deployments
- **Version Control**: Track all generated configurations
- **Testing**: Use `--check` mode for dry runs
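
One way a team might enforce the human-review step before `server create`, assuming a project-specific convention (invented here) of stamping generated files with a `NEEDS-REVIEW` marker that reviewers remove:

```shell
# Illustrative review gate (the marker convention is invented for this
# sketch): block the apply step while a file still carries the marker.
review_gate() {
  f="$1"
  [ -f "$f" ] || { echo "missing: $f" >&2; return 1; }
  if grep -q 'NEEDS-REVIEW' "$f"; then
    echo "blocked: $f is still marked NEEDS-REVIEW" >&2
    return 1
  fi
  echo "ok: $f"
}
```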

### 🧪 **Testing & Development**

```bash
# Test AI functionality
./provisioning ai test

# Test webhook processing
./provisioning ai webhook test

# Debug mode for troubleshooting
./provisioning generate-ai servers "test setup" --debug
```

## Architecture

### 🏗️ **Module Structure**

```plaintext
ai/
├── lib.nu          # Core AI functionality and API integration  
├── templates.nu    # Nickel template generation functions
├── webhook.nu      # Chat/webhook processing
├── mod.nu          # Module exports
└── README.md      # This documentation
```

### 🔌 **Integration Points**

- **Settings System** - AI configuration management
- **Secrets Management** - Integration with SOPS/KMS for secure API keys
- **Template Engine** - Enhanced with AI-generated content
- **Validation System** - Automated Nickel syntax checking
- **CLI Commands** - Natural language command processing

### 🌊 **Data Flow**

1. **Input** - Natural language description or chat message
2. **Intent Detection** - Parse and understand user requirements
3. **Context Building** - Gather relevant infrastructure context
4. **AI Processing** - Generate appropriate Nickel configurations
5. **Validation** - Syntax and semantic validation
6. **Output** - Formatted Nickel files and user feedback
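
The hand-off between stages can be pictured with stubs; every function below is invented for illustration and is not part of the module's API:

```shell
# Toy walk-through of the six stages above (all functions are stubs).
detect_intent() { echo "generate-servers"; }           # stage 2
build_context() { echo "provider=upcloud"; }           # stage 3
ai_generate()   { echo "servers = []  # from: $1"; }   # stage 4
validate_out()  { echo "$1" | grep -q '^servers' && echo valid || echo invalid; }  # stage 5

input="3 Kubernetes nodes with Ceph storage"           # stage 1
config=$(ai_generate "$input ($(build_context))")
echo "intent=$(detect_intent) status=$(validate_out "$config")"  # stage 6
```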

This AI integration transforms the provisioning system into an intelligent infrastructure automation platform
that understands natural language and generates production-ready configurations.