# Update Existing Infrastructure

**Goal**: Safely update running infrastructure with minimal downtime
**Time**: 15-30 minutes
**Difficulty**: Intermediate

## Overview

This guide covers:

1. Checking for updates
2. Planning update strategies
3. Updating task services
4. Rolling updates
5. Rollback procedures
6. Verification

## Update Strategies

### Strategy 1: In-Place Updates (Fastest)

**Best for**: Non-critical environments, development, staging

```bash
# Direct update without downtime consideration
provisioning t create <taskserv> --infra <project>
```

### Strategy 2: Rolling Updates (Recommended)

**Best for**: Production environments, high availability

```bash
# Update servers one by one
provisioning s update --infra <project> --rolling
```

### Strategy 3: Blue-Green Deployment (Safest)

**Best for**: Critical production, zero-downtime requirements

```bash
# Create new infrastructure, switch traffic, remove old
provisioning ws init <project>-green
# ... configure and deploy
# ... switch traffic
provisioning ws delete <project>-blue
```

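The "Best for" notes above amount to a lookup from environment type to strategy. A minimal sketch of that lookup follows; the function name `choose_strategy` and the environment labels are assumptions made for illustration and are not part of the `provisioning` CLI.

```bash
# Hypothetical helper: suggest an update strategy for an environment label.
# The labels (dev, staging, production, critical) are illustrative assumptions.
choose_strategy() {
  case "$1" in
    dev|development|staging) echo "in-place" ;;
    production|prod)         echo "rolling" ;;
    critical)                echo "blue-green" ;;
    *) echo "unknown environment: $1" >&2; return 1 ;;
  esac
}
```

For example, `choose_strategy production` prints `rolling`, which corresponds to Strategy 2.
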
## Step 1: Check for Updates

### 1.1 Check All Task Services

```bash
# Check all taskservs for updates
provisioning t check-updates
```

**Expected Output:**

```text
📦 Task Service Update Check:

NAME         CURRENT   LATEST    STATUS
kubernetes   1.29.0    1.30.0    ⬆️ update available
containerd   1.7.13    1.7.13    ✅ up-to-date
cilium       1.14.5    1.15.0    ⬆️ update available
postgres     15.5      16.1      ⬆️ update available
redis        7.2.3     7.2.3     ✅ up-to-date

Updates available: 3
```

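When scripting around this check, it can help to reduce the table to just the names that need attention. The sketch below assumes the output keeps the whitespace-separated `NAME CURRENT LATEST` column order shown in the sample; `list_updatable` is a hypothetical helper, not a CLI subcommand.

```bash
# Print the names of task services whose CURRENT and LATEST columns differ.
# Reads the table on stdin. Assumes whitespace-separated NAME CURRENT LATEST
# columns as in the sample output (an assumption about the real format).
list_updatable() {
  awk '$1 != "NAME" && NF >= 3 && $2 ~ /^[0-9]/ && $3 ~ /^[0-9]/ && $2 != $3 { print $1 }'
}
```

A possible use is `provisioning t check-updates | list_updatable`, which for the sample above would print `kubernetes`, `cilium`, and `postgres`, one per line.
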
### 1.2 Check Specific Task Service

```bash
# Check specific taskserv
provisioning t check-updates kubernetes
```

**Expected Output:**

```text
📦 Kubernetes Update Check:

Current: 1.29.0
Latest: 1.30.0
Status: ⬆️ Update available

Changelog:
• Enhanced security features
• Performance improvements
• Bug fixes in kube-apiserver
• New workload resource types

Breaking Changes:
• None

Recommended: ✅ Safe to update
```

### 1.3 Check Version Status

```bash
# Show detailed version information
provisioning version show
```

**Expected Output:**

```text
📋 Component Versions:

COMPONENT    CURRENT   LATEST    DAYS OLD   STATUS
kubernetes   1.29.0    1.30.0    45         ⬆️ update
containerd   1.7.13    1.7.13    0          ✅ current
cilium       1.14.5    1.15.0    30         ⬆️ update
postgres     15.5      16.1      60         ⬆️ update (major)
redis        7.2.3     7.2.3     0          ✅ current
```

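The `(major)` marker in the status column matters for planning, because major jumps are the ones that tend to need migration. As a rough sketch of how such a classification can be derived from two version strings, the hypothetical `bump_kind` function below compares the dotted components; it assumes purely numeric `MAJOR.MINOR[.PATCH]` versions like those in the table.

```bash
# Classify the jump between two dotted versions as major, minor, patch, or none.
# Assumes purely numeric MAJOR.MINOR[.PATCH] versions; missing parts count as 0.
# The body runs in a subshell so the IFS change does not leak to the caller.
bump_kind() (
  cur=$1 new=$2
  IFS=.
  set -- $cur; c1=${1:-0} c2=${2:-0} c3=${3:-0}
  set -- $new; n1=${1:-0} n2=${2:-0} n3=${3:-0}
  if   [ "$c1" -ne "$n1" ]; then echo major
  elif [ "$c2" -ne "$n2" ]; then echo minor
  elif [ "$c3" -ne "$n3" ]; then echo patch
  else echo none
  fi
)
```

With the sample versions, `bump_kind 15.5 16.1` prints `major` and `bump_kind 1.29.0 1.30.0` prints `minor`.
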
### 1.4 Check for Security Updates

```bash
# Check for security-related updates
provisioning version updates --security-only
```

## Step 2: Plan Your Update

### 2.1 Review Current Configuration

```bash
# Show current infrastructure
provisioning show settings --infra my-production
```

### 2.2 Backup Configuration

```bash
# Create configuration backup
cp -r workspace/infra/my-production workspace/infra/my-production.backup-$(date +%Y%m%d)

# Or use built-in backup
provisioning ws backup my-production
```

**Expected Output:**

```text
✅ Backup created: workspace/backups/my-production-20250930.tar.gz
```

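One caveat with the manual `cp -r` approach: if the dated target directory already exists (for instance after a second backup on the same day), `cp -r` copies the source *into* it instead of replacing it. The sketch below guards against that; `backup_path` and `backup_config` are hypothetical helper names, and the `workspace/infra/<name>` layout is taken from the command above.

```bash
# Build the dated backup directory name used by the manual cp approach.
# The optional second argument overrides the date stamp (defaults to today).
backup_path() {
  printf 'workspace/infra/%s.backup-%s\n' "$1" "${2:-$(date +%Y%m%d)}"
}

# Copy the infra directory, refusing to overwrite or nest into an existing backup.
backup_config() {
  src="workspace/infra/$1"
  dest="$(backup_path "$1")"
  if [ ! -d "$src" ]; then
    echo "missing: $src" >&2
    return 1
  fi
  if [ -e "$dest" ]; then
    echo "backup already exists: $dest" >&2
    return 1
  fi
  cp -r "$src" "$dest" && echo "$dest"
}
```

Running `backup_config my-production` from the workspace root prints the path of the new backup directory on success.
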
### 2.3 Create Update Plan

```bash
# Generate update plan
provisioning plan update --infra my-production
```

**Expected Output:**

```text
📝 Update Plan for my-production:

Phase 1: Minor Updates (Low Risk)
• containerd: No update needed
• redis: No update needed

Phase 2: Patch Updates (Medium Risk)
• cilium: 1.14.5 → 1.15.0 (estimated 5 minutes)

Phase 3: Major Updates (High Risk - Requires Testing)
• kubernetes: 1.29.0 → 1.30.0 (estimated 15 minutes)
• postgres: 15.5 → 16.1 (estimated 10 minutes, may require data migration)

Recommended Order:
1. Update cilium (low risk)
2. Update kubernetes (test in staging first)
3. Update postgres (requires maintenance window)

Total Estimated Time: 30 minutes
Recommended: Test in staging environment first
```

## Step 3: Update Task Services

### 3.1 Update Non-Critical Service (Cilium Example)

#### Dry-Run Update

```bash
# Test update without applying
provisioning t create cilium --infra my-production --check
```

**Expected Output:**

```text
🔍 CHECK MODE: Simulating Cilium update

Current: 1.14.5
Target: 1.15.0

Would perform:
1. Download Cilium 1.15.0
2. Update configuration
3. Rolling restart of Cilium pods
4. Verify connectivity

Estimated downtime: <1 minute per node
No errors detected. Ready to update.
```

#### Generate Updated Configuration

```bash
# Generate new configuration
provisioning t generate cilium --infra my-production
```

**Expected Output:**

```text
✅ Generated Cilium configuration (version 1.15.0)
Saved to: workspace/infra/my-production/taskservs/cilium.ncl
```

#### Apply Update

```bash
# Apply update
provisioning t create cilium --infra my-production
```

**Expected Output:**

```text
🚀 Updating Cilium on my-production...

Downloading Cilium 1.15.0... ⏳
✅ Downloaded

Updating configuration... ⏳
✅ Configuration updated

Rolling restart: web-01... ⏳
✅ web-01 updated (Cilium 1.15.0)

Rolling restart: web-02... ⏳
✅ web-02 updated (Cilium 1.15.0)

Verifying connectivity... ⏳
✅ All nodes connected

🎉 Cilium update complete!
Version: 1.14.5 → 1.15.0
Downtime: 0 minutes
```

#### Verify Update

```bash
# Verify updated version
provisioning version taskserv cilium
```

**Expected Output:**

```text
📦 Cilium Version Info:

Installed: 1.15.0
Latest: 1.15.0
Status: ✅ Up-to-date

Nodes:
✅ web-01: 1.15.0 (running)
✅ web-02: 1.15.0 (running)
```

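The four steps of this subsection (dry run, generate, apply, verify) can be chained so that a failure at any step stops the sequence. The following is a minimal sketch under two assumptions that this guide does not confirm: that each `provisioning` subcommand exits non-zero on failure, and that wrapping the CLI in a variable (`PROVISIONING_BIN`, a hypothetical name) is acceptable for rehearsing with a stub.

```bash
# Command used to invoke the CLI; overridable so the flow can be rehearsed
# with a stub. PROVISIONING_BIN is a hypothetical variable for this sketch.
PROVISIONING_BIN="${PROVISIONING_BIN:-provisioning}"

# Dry-run, generate, apply, then verify one task service.
# Stops at the first failing step (assumes non-zero exit codes on failure).
update_taskserv() {
  svc="$1"; infra="$2"
  "$PROVISIONING_BIN" t create "$svc" --infra "$infra" --check ||
    { echo "dry run failed for $svc" >&2; return 1; }
  "$PROVISIONING_BIN" t generate "$svc" --infra "$infra" ||
    { echo "config generation failed for $svc" >&2; return 2; }
  "$PROVISIONING_BIN" t create "$svc" --infra "$infra" ||
    { echo "update failed for $svc" >&2; return 3; }
  "$PROVISIONING_BIN" version taskserv "$svc"
}
```

Calling `update_taskserv cilium my-production` runs the same commands as above in order. The distinct return codes make it possible to tell which step failed.
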
### 3.2 Update Critical Service (Kubernetes Example)

#### Test in Staging First

```bash
# If you have a staging environment
provisioning t create kubernetes --infra my-staging --check
provisioning t create kubernetes --infra my-staging

# Run integration tests
provisioning test kubernetes --infra my-staging
```

#### Backup Current State

```bash
# Back up Kubernetes state
# Note: `get all` covers common workload resources only; ConfigMaps, Secrets,
# and custom resources need to be exported separately
kubectl get all -A -o yaml > k8s-backup-$(date +%Y%m%d).yaml

# Back up etcd (if using external etcd)
provisioning t backup kubernetes --infra my-production
```

#### Schedule Maintenance Window

```bash
# Set maintenance mode (optional, if supported)
provisioning maintenance enable --infra my-production --duration 30m
```

#### Update Kubernetes

```bash
# Update control plane first
provisioning t create kubernetes --infra my-production --control-plane-only
```

**Expected Output:**

```text
🚀 Updating Kubernetes control plane on my-production...

Draining control plane: web-01... ⏳
✅ web-01 drained

Updating control plane: web-01... ⏳
✅ web-01 updated (Kubernetes 1.30.0)

Uncordoning: web-01... ⏳
✅ web-01 ready

Verifying control plane... ⏳
✅ Control plane healthy

🎉 Control plane update complete!
```

```bash
# Update worker nodes one by one
provisioning t create kubernetes --infra my-production --workers-only --rolling
```

**Expected Output:**

```text
🚀 Updating Kubernetes workers on my-production...

Rolling update: web-02...
Draining... ⏳
✅ Drained (pods rescheduled)

Updating... ⏳
✅ Updated (Kubernetes 1.30.0)

Uncordoning... ⏳
✅ Ready

Waiting for pods to stabilize... ⏳
✅ All pods running

🎉 Worker update complete!
Updated: web-02
Version: 1.30.0
```

#### Verify Update

```bash
# Verify Kubernetes cluster
kubectl get nodes
provisioning version taskserv kubernetes
```

**Expected Output:**

```text
NAME     STATUS   ROLES           AGE   VERSION
web-01   Ready    control-plane   30d   v1.30.0
web-02   Ready    <none>          30d   v1.30.0
```

```bash
# Run smoke tests
provisioning test kubernetes --infra my-production
```

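Reading the node table by eye works for two nodes but not for twenty. As a sketch, the check can be automated by parsing the standard `kubectl get nodes` columns (`NAME STATUS ROLES AGE VERSION`); the helper name `nodes_at_version` is hypothetical.

```bash
# Succeeds only if every node row on stdin is Ready and reports the expected
# version. Prints each offending row's name, status, and version.
# Expects the default `kubectl get nodes` columns: NAME STATUS ROLES AGE VERSION.
nodes_at_version() {
  awk -v want="$1" '
    $1 == "NAME" { next }            # skip the header row
    NF >= 5 {
      seen++
      if ($2 != "Ready" || $5 != want) { print $1, $2, $5; bad++ }
    }
    END { exit ((seen == 0 || bad > 0) ? 1 : 0) }
  '
}
```

A typical call would be `kubectl get nodes | nodes_at_version v1.30.0`. A node that is still cordoned shows `Ready,SchedulingDisabled` in the STATUS column and is therefore reported as not ready, which is the desired outcome right after a rolling update.
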
### 3.3 Update Database (PostgreSQL Example)

⚠️ **WARNING**: Database updates may require data migration. Always back up first!

#### Backup Database

```bash
# Back up PostgreSQL database
provisioning t backup postgres --infra my-production
```

**Expected Output:**

```text
🗄️ Backing up PostgreSQL...

Creating dump: my-production-postgres-20250930.sql... ⏳
✅ Dump created (2.3 GB)

Compressing... ⏳
✅ Compressed (450 MB)

Saved to: workspace/backups/postgres/my-production-20250930.sql.gz
```

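Before relying on the dump for a major upgrade, it is worth confirming that the compressed file is at least intact. The sketch below uses `gzip -t`, which tests archive integrity without extracting; `verify_backup` is a hypothetical helper name, and the check says nothing about whether the SQL inside restores cleanly.

```bash
# Succeeds only if the file exists, is non-empty, and passes gzip's integrity
# test. This does not prove the dump restores; it only rules out a missing,
# empty, or truncated archive.
verify_backup() {
  if [ ! -s "$1" ]; then
    echo "missing or empty: $1" >&2
    return 1
  fi
  if ! gzip -t "$1" 2>/dev/null; then
    echo "corrupt archive: $1" >&2
    return 1
  fi
}
```

For the sample above, that would be `verify_backup workspace/backups/postgres/my-production-20250930.sql.gz`. A full restore into a scratch database remains the only conclusive test.
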
#### Check Compatibility

```bash
# Check if data migration is needed
provisioning t check-migration postgres --from 15.5 --to 16.1
```

**Expected Output:**

```text
🔍 PostgreSQL Migration Check:

From: 15.5
To: 16.1

Migration Required: ✅ Yes (major version change)

Steps Required:
1. Dump database with pg_dump
2. Stop PostgreSQL 15.5
3. Install PostgreSQL 16.1
4. Initialize new data directory
5. Restore from dump

Estimated Time: 15-30 minutes (depending on data size)
Estimated Downtime: 15-30 minutes

Recommended: Use streaming replication for zero-downtime upgrade
```

#### Perform Update

```bash
# Update PostgreSQL (with automatic migration)
provisioning t create postgres --infra my-production --migrate
```

**Expected Output:**

```text
🚀 Updating PostgreSQL on my-production...

⚠️ Major version upgrade detected (15.5 → 16.1)
Automatic migration will be performed

Dumping database... ⏳
✅ Database dumped (2.3 GB)

Stopping PostgreSQL 15.5... ⏳
✅ Stopped

Installing PostgreSQL 16.1... ⏳
✅ Installed

Initializing new data directory... ⏳
✅ Initialized

Restoring database... ⏳
✅ Restored (2.3 GB)

Starting PostgreSQL 16.1... ⏳
✅ Started

Verifying data integrity... ⏳
✅ All tables verified

🎉 PostgreSQL update complete!
Version: 15.5 → 16.1
Downtime: 18 minutes
```

#### Verify Update

```bash
# Verify PostgreSQL
provisioning version taskserv postgres
ssh db-01 "psql --version"
```

## Step 4: Update Multiple Services

### 4.1 Batch Update (Sequentially)

```bash
# Update multiple taskservs one by one
provisioning t update --infra my-production --taskservs cilium,containerd,redis
```

**Expected Output:**

```text
🚀 Updating 3 taskservs on my-production...

[1/3] Updating cilium... ⏳
✅ cilium updated (1.15.0)

[2/3] Updating containerd... ⏳
✅ containerd updated (1.7.14)

[3/3] Updating redis... ⏳
✅ redis updated (7.2.4)

🎉 All updates complete!
Updated: 3 taskservs
Total time: 8 minutes
```

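If finer control over a sequential batch is needed, for example to stop as soon as one service fails, the same effect can be approximated with a loop over the single-service command from Step 3. This is a sketch only: it assumes `provisioning t create` exits non-zero on failure, and `PROVISIONING_BIN` and `update_in_order` are hypothetical names introduced here.

```bash
# Command used to invoke the CLI; overridable so the loop can be rehearsed
# with a stub. PROVISIONING_BIN is a hypothetical variable for this sketch.
PROVISIONING_BIN="${PROVISIONING_BIN:-provisioning}"

# Update a comma-separated list of task services in order.
# Stops at the first failure so later services are left untouched.
update_in_order() {
  infra="$1"
  old_ifs=$IFS
  IFS=,
  set -- $2            # split the list on commas into positional parameters
  IFS=$old_ifs
  for svc in "$@"; do
    "$PROVISIONING_BIN" t create "$svc" --infra "$infra" ||
      { echo "stopped at $svc" >&2; return 1; }
  done
}
```

For instance, `update_in_order my-production cilium,containerd,redis` would leave `redis` untouched if `containerd` failed, which keeps the blast radius of a bad update small.
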
### 4.2 Parallel Update (Non-Dependent Services)

```bash
# Update taskservs in parallel (if they don't depend on each other)
provisioning t update --infra my-production --taskservs redis,postgres --parallel
```

**Expected Output:**

```text
🚀 Updating 2 taskservs in parallel on my-production...

redis: Updating... ⏳
postgres: Updating... ⏳

redis: ✅ Updated (7.2.4)
postgres: ✅ Updated (16.1)

🎉 All updates complete!
Updated: 2 taskservs
Total time: 3 minutes (parallel)
```

## Step 5: Update Server Configuration

### 5.1 Update Server Resources

```bash
# Edit server configuration
provisioning sops workspace/infra/my-production/servers.ncl
```

**Example: Upgrade server plan**

```nickel
# Before
{
  name = "web-01",
  plan = "1xCPU-2GB", # Old plan
}

# After
{
  name = "web-01",
  plan = "2xCPU-4GB", # New plan
}
```

```bash
# Apply server update
provisioning s update --infra my-production --check
provisioning s update --infra my-production
```

### 5.2 Update Server OS

```bash
# Update operating system packages
provisioning s update --infra my-production --os-update
```

**Expected Output:**

```text
🚀 Updating OS packages on my-production servers...

web-01: Updating packages... ⏳
✅ web-01: 24 packages updated

web-02: Updating packages... ⏳
✅ web-02: 24 packages updated

db-01: Updating packages... ⏳
✅ db-01: 24 packages updated

🎉 OS updates complete!
```

## Step 6: Rollback Procedures

### 6.1 Rollback Task Service

If an update fails or causes issues:

```bash
# Rollback to previous version
provisioning t rollback cilium --infra my-production
```

**Expected Output:**

```text
🔄 Rolling back Cilium on my-production...

Current: 1.15.0
Target: 1.14.5 (previous version)

Rolling back: web-01... ⏳
✅ web-01 rolled back

Rolling back: web-02... ⏳
✅ web-02 rolled back

Verifying connectivity... ⏳
✅ All nodes connected

🎉 Rollback complete!
Version: 1.15.0 → 1.14.5
```

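Rollback is most useful when it is triggered automatically rather than remembered under pressure. The sketch below wires the update, a health check, and the rollback together. It assumes that both `provisioning t create` and `provisioning health` signal failure through a non-zero exit code, which this guide does not state; `PROVISIONING_BIN` and `update_or_rollback` are hypothetical names.

```bash
# Command used to invoke the CLI; overridable so the flow can be rehearsed
# with a stub. PROVISIONING_BIN is a hypothetical variable for this sketch.
PROVISIONING_BIN="${PROVISIONING_BIN:-provisioning}"

# Apply an update, then check health; roll the service back if either fails.
# Returns 0 after a healthy update and 1 after a rollback was attempted.
update_or_rollback() {
  svc="$1"; infra="$2"
  if "$PROVISIONING_BIN" t create "$svc" --infra "$infra" &&
     "$PROVISIONING_BIN" health --infra "$infra"; then
    echo "$svc updated"
    return 0
  fi
  echo "$svc update failed, rolling back" >&2
  "$PROVISIONING_BIN" t rollback "$svc" --infra "$infra"
  return 1
}
```

A call such as `update_or_rollback cilium my-production` either prints `cilium updated` or leaves the service on its previous version and returns a failure status for the caller to act on.
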
### 6.2 Rollback from Backup

```bash
# Restore configuration from backup
provisioning ws restore my-production --from workspace/backups/my-production-20250930.tar.gz
```

### 6.3 Emergency Rollback

```bash
# Complete infrastructure rollback
provisioning rollback --infra my-production --to-snapshot <snapshot-id>
```

## Step 7: Post-Update Verification

### 7.1 Verify All Components

```bash
# Check overall health
provisioning health --infra my-production
```

**Expected Output:**

```text
🏥 Health Check: my-production

Servers:
✅ web-01: Healthy
✅ web-02: Healthy
✅ db-01: Healthy

Task Services:
✅ kubernetes: 1.30.0 (healthy)
✅ containerd: 1.7.13 (healthy)
✅ cilium: 1.15.0 (healthy)
✅ postgres: 16.1 (healthy)

Clusters:
✅ buildkit: 2/2 replicas (healthy)

Overall Status: ✅ All systems healthy
```

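Services often need a little time to settle after a restart, so a single health check right after an update can report a transient failure. A minimal polling sketch follows. It matches on the `Overall Status` summary line from the sample output, which is an assumption about the real output format, and `PROVISIONING_BIN` and `wait_healthy` are hypothetical names.

```bash
# Command used to invoke the CLI; overridable so the loop can be rehearsed
# with a stub. PROVISIONING_BIN is a hypothetical variable for this sketch.
PROVISIONING_BIN="${PROVISIONING_BIN:-provisioning}"

# Poll the health check until it reports the all-healthy summary line.
# Arguments: infra name, max attempts (default 10), delay in seconds (default 30).
wait_healthy() {
  infra="$1"; attempts="${2:-10}"; delay="${3:-30}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$PROVISIONING_BIN" health --infra "$infra" |
       grep -q 'Overall Status:.*All systems healthy'; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt is still to come
    if [ "$i" -le "$attempts" ]; then sleep "$delay"; fi
  done
  return 1
}
```

With the defaults, `wait_healthy my-production` gives the infrastructure up to roughly five minutes to report healthy before the caller treats the update as failed.
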
### 7.2 Verify Version Updates

```bash
# Verify all versions are updated
provisioning version show
```

### 7.3 Run Integration Tests

```bash
# Run comprehensive tests
provisioning test all --infra my-production
```

**Expected Output:**

```text
🧪 Running Integration Tests...

[1/5] Server connectivity... ⏳
✅ All servers reachable

[2/5] Kubernetes health... ⏳
✅ All nodes ready, all pods running

[3/5] Network connectivity... ⏳
✅ All services reachable

[4/5] Database connectivity... ⏳
✅ PostgreSQL responsive

[5/5] Application health... ⏳
✅ All applications healthy

🎉 All tests passed!
```

### 7.4 Monitor for Issues

```bash
# Monitor logs for errors
provisioning logs --infra my-production --follow --level error
```

## Update Checklist

Use this checklist for production updates:

- [ ] Check for available updates
- [ ] Review changelog and breaking changes
- [ ] Create configuration backup
- [ ] Test update in staging environment
- [ ] Schedule maintenance window
- [ ] Notify team/users of maintenance
- [ ] Update non-critical services first
- [ ] Verify each update before proceeding
- [ ] Update critical services with rolling updates
- [ ] Backup database before major updates
- [ ] Verify all components after update
- [ ] Run integration tests
- [ ] Monitor for issues (30 minutes minimum)
- [ ] Document any issues encountered
- [ ] Close maintenance window

## Common Update Scenarios

### Scenario 1: Minor Security Patch

```bash
# Quick security update
provisioning t check-updates --security-only
provisioning t update --infra my-production --security-patches --yes
```

### Scenario 2: Major Version Upgrade

```bash
# Careful major version update
provisioning ws backup my-production
provisioning t check-migration <service> --from X.Y --to X+1.Y
provisioning t create <service> --infra my-production --migrate
provisioning test all --infra my-production
```

### Scenario 3: Emergency Hotfix

```bash
# Apply critical hotfix immediately
provisioning t create <service> --infra my-production --hotfix --yes
```

## Troubleshooting Updates

### Issue: Update fails mid-process

**Solution:**

```bash
# Check update status
provisioning t status <taskserv> --infra my-production

# Resume failed update
provisioning t update <taskserv> --infra my-production --resume

# Or rollback
provisioning t rollback <taskserv> --infra my-production
```

### Issue: Service not starting after update

**Solution:**

```bash
# Check logs
provisioning logs <taskserv> --infra my-production

# Verify configuration
provisioning t validate <taskserv> --infra my-production

# Rollback if necessary
provisioning t rollback <taskserv> --infra my-production
```

### Issue: Data migration fails

**Solution:**

```bash
# Check migration logs
provisioning t migration-logs <taskserv> --infra my-production

# Restore from backup
provisioning t restore <taskserv> --infra my-production --from <backup-file>
```

## Best Practices

1. **Always Test First**: Test updates in staging before production
2. **Back Up Everything**: Create backups before any update
3. **Update Gradually**: Update one service at a time
4. **Monitor Closely**: Watch for errors after each update
5. **Have a Rollback Plan**: Always have a rollback strategy
6. **Document Changes**: Keep update logs for reference
7. **Schedule Wisely**: Update during low-traffic periods
8. **Verify Thoroughly**: Run tests after each update

## Next Steps

- **[Customize Guide](customize-infrastructure.md)** - Customize your infrastructure
- **[From Scratch Guide](from-scratch.md)** - Deploy new infrastructure
- **[Workflow Guide](../development/workflow.md)** - Automate with workflows

## Quick Reference

```bash
# Update workflow
provisioning t check-updates
provisioning ws backup my-production
provisioning t create <taskserv> --infra my-production --check
provisioning t create <taskserv> --infra my-production
provisioning version taskserv <taskserv>
provisioning health --infra my-production
provisioning test all --infra my-production
```

---

*This guide is part of the provisioning project documentation. Last updated: 2025-09-30*