provisioning/examples/complete-workflow.md
2025-10-07 11:12:02 +01:00

Complete Workflow Example: Kubernetes Cluster with Package System

This example walks through the complete workflow for deploying a production Kubernetes cluster with the new KCL package and module loader system.

Scenario

Deploy a 3-node Kubernetes cluster with:

  • 1 master node
  • 2 worker nodes
  • Cilium CNI
  • Containerd runtime
  • UpCloud provider
  • Production-ready configuration

Prerequisites

  1. Core provisioning package installed
  2. UpCloud credentials configured
  3. SSH keys set up
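
The prerequisites above can be checked before starting. The following is a minimal sketch, not part of the provisioning tooling: the helper names `missing_commands` and `has_key_file` are hypothetical, and the tool list and key path in the usage comment are assumptions based on the commands and paths used later in this example.

```bash
#!/usr/bin/env bash
# Preflight sketch: report which prerequisites are missing before Step 1.
# Helper names, tool list, and key path are assumptions; adjust to your setup.

# Print the name of every command in the argument list that is not on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Succeed only if the given file exists and is non-empty.
has_key_file() {
    [ -s "$1" ]
}

# Example usage (not run automatically):
#   missing_commands kcl nu ssh tree
#   has_key_file "$HOME/.ssh/id_rsa.pub" || echo "SSH public key not found"
```

An empty result from `missing_commands` means every listed tool was found. UpCloud credentials are not checked here because this example does not state how they are stored.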

Step 1: Environment Setup

# Build and install the core package (adjust the path to your own checkout)
cd /Users/Akasha/project-provisioning
./provisioning/tools/kcl-packager.nu build --version 1.0.0
./provisioning/tools/kcl-packager.nu install dist/provisioning-1.0.0.tar.gz

# Verify installation
kcl list packages | grep provisioning

Step 2: Create Workspace

# Create new workspace from template
mkdir -p workspace/infra/production-k8s
cd workspace/infra/production-k8s

# Initialize workspace structure
../../../provisioning/tools/workspace-init.nu . init

# Verify structure
tree -a .

Expected output:

.
├── kcl.mod
├── servers.k
├── README.md
├── .gitignore
├── .taskservs/
├── .providers/
├── .clusters/
├── .manifest/
├── data/
├── tmp/
├── resources/
└── clusters/
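
Instead of comparing the `tree` output by eye, the layout can be checked with a short script. This is a sketch under the assumption that the entries listed above are exactly what `workspace-init.nu` creates; the function name `missing_workspace_entries` is hypothetical.

```bash
# Print every expected workspace entry that is missing under the given directory.
# The entry list mirrors the expected `tree` output shown above.
missing_workspace_entries() {
    local dir="${1:-.}" entry
    for entry in kcl.mod servers.k README.md .gitignore \
                 .taskservs .providers .clusters .manifest \
                 data tmp resources clusters; do
        [ -e "$dir/$entry" ] || printf '%s\n' "$entry"
    done
}

# Example usage (not run automatically):
#   missing_workspace_entries .
```

No output means the workspace has every expected file and directory. Any printed name points at an entry that the init step did not create.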

Step 3: Discover Available Modules

# Discover available taskservs
../../../provisioning/core/cli/module-loader discover taskservs

# Search for Kubernetes-related modules
../../../provisioning/core/cli/module-loader discover taskservs kubernetes

# Discover providers
../../../provisioning/core/cli/module-loader discover providers

# Print the results as JSON
../../../provisioning/core/cli/module-loader discover taskservs --format json

Step 4: Load Required Modules

# Load Kubernetes stack taskservs
../../../provisioning/core/cli/module-loader load taskservs . [kubernetes, cilium, containerd]

# Load UpCloud provider
../../../provisioning/core/cli/module-loader load providers . [upcloud]

# Verify loading
../../../provisioning/core/cli/module-loader list taskservs .
../../../provisioning/core/cli/module-loader list providers .

Check generated files:

# Check auto-generated imports
cat taskservs.k
cat providers.k

# Check manifest
cat .manifest/taskservs.yaml
cat .manifest/providers.yaml

Step 5: Configure Infrastructure

Edit servers.k to configure the Kubernetes cluster:

# Production Kubernetes Cluster Configuration
import provisioning.settings as settings
import provisioning.server as server
import provisioning.defaults as defaults

# Import loaded modules (auto-generated)
import .taskservs.kubernetes.kubernetes as k8s
import .taskservs.cilium.cilium as cilium
import .taskservs.containerd.containerd as containerd
import .providers.upcloud as upcloud

# Cluster settings
k8s_settings: settings.Settings = {
    main_name = "production-k8s"
    main_title = "Production Kubernetes Cluster"

    # Configure paths
    settings_path = "./data/settings.yaml"
    defaults_provs_dirpath = "./defs"
    prov_data_dirpath = "./data"
    created_taskservs_dirpath = "./tmp/k8s-deployment"
    prov_resources_path = "./resources"
    created_clusters_dirpath = "./tmp/k8s-clusters"
    prov_clusters_path = "./clusters"

    # Kubernetes cluster settings
    cluster_admin_host = ""  # Set by provider (first master)
    cluster_admin_port = 22
    cluster_admin_user = "admin"
    servers_wait_started = 60  # K8s nodes need more time

    runset = {
        wait = True
        output_format = "human"
        output_path = "tmp/k8s-deployment"
        inventory_file = "./k8s-inventory.yaml"
        use_time = True
    }

    # Secrets configuration
    secrets = {
        provider = "sops"
        sops_config = {
            age_key_file = "~/.age/keys.txt"
            use_age = True
        }
    }
}

# Production Kubernetes cluster servers
production_servers: [server.Server] = [
    # Control plane node
    {
        hostname = "k8s-master-01"
        title = "Kubernetes Master Node 01"

        # Production specifications
        time_zone = "UTC"
        running_wait = 20
        running_timeout = 400
        storage_os_find = "name: debian-12 | arch: x86_64"

        # Network configuration
        network_utility_ipv4 = True
        network_public_ipv4 = True
        priv_cidr_block = "10.0.0.0/24"

        # User settings
        user = "admin"
        user_ssh_port = 22
        fix_local_hosts = True
        labels = "env: production, role: control-plane, tier: master"

        # Taskservs configuration
        taskservs = [
            {
                name = "containerd"
                profile = "production"
                install_mode = "library"
            },
            {
                name = "kubernetes"
                profile = "master"
                install_mode = "library-server"
            },
            {
                name = "cilium"
                profile = "master"
                install_mode = "library"
            }
        ]
    },

    # Worker nodes
    {
        hostname = "k8s-worker-01"
        title = "Kubernetes Worker Node 01"

        time_zone = "UTC"
        running_wait = 20
        running_timeout = 400
        storage_os_find = "name: debian-12 | arch: x86_64"

        network_utility_ipv4 = True
        network_public_ipv4 = True
        priv_cidr_block = "10.0.0.0/24"

        user = "admin"
        user_ssh_port = 22
        fix_local_hosts = True
        labels = "env: production, role: worker, tier: compute"

        taskservs = [
            {
                name = "containerd"
                profile = "production"
                install_mode = "library"
            },
            {
                name = "kubernetes"
                profile = "worker"
                install_mode = "library"
            },
            {
                name = "cilium"
                profile = "worker"
                install_mode = "library"
            }
        ]
    },

    {
        hostname = "k8s-worker-02"
        title = "Kubernetes Worker Node 02"

        time_zone = "UTC"
        running_wait = 20
        running_timeout = 400
        storage_os_find = "name: debian-12 | arch: x86_64"

        network_utility_ipv4 = True
        network_public_ipv4 = True
        priv_cidr_block = "10.0.0.0/24"

        user = "admin"
        user_ssh_port = 22
        fix_local_hosts = True
        labels = "env: production, role: worker, tier: compute"

        taskservs = [
            {
                name = "containerd"
                profile = "production"
                install_mode = "library"
            },
            {
                name = "kubernetes"
                profile = "worker"
                install_mode = "library"
            },
            {
                name = "cilium"
                profile = "worker"
                install_mode = "library"
            }
        ]
    }
]

# Export for provisioning system
{
    settings = k8s_settings
    servers = production_servers
}

Step 6: Validate Configuration

# Validate KCL configuration
kcl run servers.k

# Validate workspace
../../../provisioning/core/cli/module-loader validate .

# Check workspace info
../../../provisioning/tools/workspace-init.nu . info

Step 7: Configure Provider Defaults

# Create provider configuration directory
mkdir -p defs

# Create UpCloud provider defaults (example)
cat > defs/upcloud_defaults.k << 'EOF'
# UpCloud Provider Defaults
import provisioning.defaults as defaults

upcloud_defaults: defaults.ServerDefaults = {
    lock = False
    time_zone = "UTC"
    running_wait = 15
    running_timeout = 300

    # UpCloud specific settings
    storage_os_find = "name: debian-12 | arch: x86_64"

    # Network settings
    network_utility_ipv4 = True
    network_public_ipv4 = True

    # SSH settings
    ssh_key_path = "~/.ssh/id_rsa.pub"
    user = "admin"
    user_ssh_port = 22
    fix_local_hosts = True

    # Provider label
    labels = "provider: upcloud"
}

upcloud_defaults
EOF

Step 8: Deploy Infrastructure

# Create servers with check mode first
../../../provisioning/core/cli/provisioning server create --infra . --check

# If validation passes, deploy for real
../../../provisioning/core/cli/provisioning server create --infra .

# Monitor server creation
../../../provisioning/core/cli/provisioning server list --infra .
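
The check-then-deploy sequence above can be wrapped so that the real run only happens when the check run succeeds. This is a minimal sketch: `check_then_apply` is a hypothetical helper, and it assumes the command exits with a non-zero status when `--check` finds a problem, which this example does not confirm.

```bash
# Run a command with --check appended; run it for real only if the check succeeds.
# Assumes the command signals a failed check through a non-zero exit status.
check_then_apply() {
    if "$@" --check; then
        "$@"
    else
        echo "check failed, not applying: $*" >&2
        return 1
    fi
}

# Example usage (not run automatically):
#   check_then_apply ../../../provisioning/core/cli/provisioning server create --infra .
```

Keeping both invocations in one function means the checked command and the applied command cannot drift apart.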

Step 9: Install Taskservs

# Install containerd on all nodes
../../../provisioning/core/cli/provisioning taskserv create containerd --infra .

# Install Kubernetes (this will set up master and join workers)
../../../provisioning/core/cli/provisioning taskserv create kubernetes --infra .

# Install Cilium CNI
../../../provisioning/core/cli/provisioning taskserv create cilium --infra .
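
The three installs above run in a fixed order: container runtime, then Kubernetes, then the CNI. The sketch below runs them as a loop that stops at the first failure, so a later taskserv is never installed on top of a broken earlier one. The `install_taskservs` function and the `PROVISIONING` variable are hypothetical conveniences, and the sketch assumes the CLI returns a non-zero exit status on failure.

```bash
# Install taskservs in the given order, stopping at the first failure.
# PROVISIONING is a hypothetical override so the CLI path lives in one place.
PROVISIONING="${PROVISIONING:-../../../provisioning/core/cli/provisioning}"

install_taskservs() {
    local name
    for name in "$@"; do
        echo "installing $name"
        "$PROVISIONING" taskserv create "$name" --infra . || {
            echo "failed on $name, stopping" >&2
            return 1
        }
    done
}

# Example usage (not run automatically):
#   install_taskservs containerd kubernetes cilium
```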

Step 10: Verify Cluster

# SSH to master node and verify cluster
../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra .

# On the master node:
kubectl get nodes
kubectl get pods -A
kubectl get services -A

# Test Cilium connectivity
cilium status
cilium connectivity test
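
For a scripted version of the node check, the output of `kubectl get nodes --no-headers` can be filtered for nodes that are not ready. This is a small sketch with a hypothetical helper name, `not_ready_nodes`. It relies on the default column layout, where the first field is the node name and the second is its status.

```bash
# Read `kubectl get nodes --no-headers` output on stdin and print the names of
# nodes whose STATUS column (second field) is not exactly "Ready".
# Note: a cordoned node shows "Ready,SchedulingDisabled" and is also reported.
not_ready_nodes() {
    awk '$2 != "Ready" { print $1 }'
}

# Example usage on the master node (not run automatically):
#   kubectl get nodes --no-headers | not_ready_nodes
```

An empty result means all three nodes joined the cluster and report `Ready`.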

Step 11: Deploy Sample Application

Create a test deployment to verify the cluster:

# Create namespace
kubectl create namespace test-app

# Deploy nginx
kubectl create deployment nginx --image=nginx:latest -n test-app
kubectl expose deployment nginx --port=80 --type=ClusterIP -n test-app

# Verify deployment
kubectl get pods -n test-app
kubectl get services -n test-app

Step 12: Cluster Management

# Add monitoring (example)
../../../provisioning/core/cli/module-loader load taskservs . [prometheus, grafana]

# Confirm the new modules are loaded
../../../provisioning/core/cli/module-loader list taskservs .

# Deploy monitoring stack
../../../provisioning/core/cli/provisioning taskserv create prometheus --infra .
../../../provisioning/core/cli/provisioning taskserv create grafana --infra .

Step 13: Backup and Documentation

# Create cluster documentation
cat > cluster-info.md << 'EOF'
# Production Kubernetes Cluster

## Cluster Details
- **Name**: production-k8s
- **Nodes**: 3 (1 master, 2 workers)
- **CNI**: Cilium
- **Runtime**: Containerd
- **Provider**: UpCloud

## Node Information
- k8s-master-01: Control plane node
- k8s-worker-01: Worker node
- k8s-worker-02: Worker node

## Loaded Modules
- kubernetes (master/worker profiles)
- cilium (cluster networking)
- containerd (container runtime)
- upcloud (cloud provider)

## Management Commands
```bash
# SSH to master
../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra .

# Update cluster
../../../provisioning/core/cli/provisioning taskserv generate kubernetes --infra .
```
EOF

# Backup workspace
cp -r . ../production-k8s-backup-$(date +%Y%m%d)

# Commit to version control
git add .
git commit -m "Initial Kubernetes cluster deployment with package system"


Troubleshooting

Module Loading Issues

# If modules don't load properly
../../../provisioning/core/cli/module-loader discover taskservs
../../../provisioning/core/cli/module-loader load taskservs . [kubernetes, cilium, containerd] --force

# Check generated imports
cat taskservs.k

KCL Compilation Issues

# Check for syntax errors
kcl check servers.k

# Validate specific schemas
kcl run --dry-run servers.k

Provider Authentication Issues

# Check provider configuration
cat .providers/upcloud/provision_upcloud.k

# Verify credentials
../../../provisioning/core/cli/provisioning server price --provider upcloud

Kubernetes Setup Issues

# Check taskserv logs
tail -f tmp/k8s-deployment/kubernetes-*.log

# Verify SSH connectivity
../../../provisioning/core/cli/provisioning server ssh k8s-master-01 --infra . --command "systemctl status kubelet"

Next Steps

  1. Scale the cluster: Add more worker nodes
  2. Add storage: Load and configure storage taskservs (rook-ceph, mayastor)
  3. Setup monitoring: Deploy Prometheus/Grafana stack
  4. Configure ingress: Set up ingress controllers
  5. Implement GitOps: Configure ArgoCD or Flux

This example demonstrates the complete workflow from workspace creation to production Kubernetes cluster deployment using the new package-based system.