# Update Infrastructure Guide
Guide for safely updating existing infrastructure deployments.
## Overview
This guide covers strategies and procedures for updating provisioned infrastructure, including servers, task services, and cluster configurations.
## Prerequisites
Before updating infrastructure:
- ✅ Backup current configuration
- ✅ Test updates in development environment
- ✅ Review changelog and breaking changes
- ✅ Schedule maintenance window
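
The backup prerequisite can be enforced rather than remembered. The following is a minimal sketch, assuming `provisioning backup create --all` (listed under "Before Update" in this guide) exits non-zero on failure; `safe_update` is a hypothetical helper name, not part of the CLI.

```bash
# Sketch: refuse to start an update unless a fresh backup succeeds.
# Assumption: `provisioning backup create --all` exits non-zero on failure.
# `safe_update` is a hypothetical helper, not a provisioning command.
safe_update() {
    if ! provisioning backup create --all; then
        echo "backup failed; aborting update" >&2
        return 1
    fi
    # Run whatever update command the caller passed in.
    "$@"
}

# Usage (not executed here):
#   safe_update provisioning taskserv update kubernetes --version 1.29.0
```

Wrapping the update this way means a failed backup stops the procedure before anything changes.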
## Update Strategies
### 1. In-Place Update
Update existing resources without replacement:
```bash
# Check for available updates
provisioning version check
# Update specific taskserv
provisioning taskserv update kubernetes --version 1.29.0 --check
# Update all taskservs
provisioning taskserv update --all --check
```
**Pros**: Fast, no additional resources required

**Cons**: Risk of service interruption while the update is applied

---
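One way to lower the risk of an in-place update is to preview it first and apply it only if the preview succeeds. This sketch assumes that `--check` is a preview (dry-run) flag and that omitting it applies the change; neither is confirmed by this guide. `check_then_apply` is a hypothetical helper.

```bash
# Sketch: preview first, apply only if the preview succeeds.
# Assumptions: `--check` only previews the change, and running the same
# command without it applies the change. `check_then_apply` is hypothetical.
check_then_apply() {
    # "$@" is the update command without --check, e.g.
    #   provisioning taskserv update kubernetes --version 1.29.0
    if ! "$@" --check; then
        echo "check failed; nothing was applied" >&2
        return 1
    fi
    "$@"
}

# Usage (not executed here):
#   check_then_apply provisioning taskserv update kubernetes --version 1.29.0
```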
### 2. Rolling Update
Update resources one at a time:
```bash
# Enable rolling update strategy
provisioning config set update.strategy rolling
# Update cluster with rolling strategy
provisioning cluster update my-cluster --rolling --max-unavailable 1
```
**Pros**: No downtime, gradual rollout

**Cons**: Slower, requires multiple nodes

---
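To illustrate what a "max unavailable" limit means, the sketch below groups nodes into batches no larger than that limit, so at most that many nodes are out of service at once. It only shows the batching idea; how the CLI actually schedules nodes is not described in this guide, and `rolling_batches` and the node names are hypothetical.

```bash
# Sketch: group nodes into batches of at most $1 (the "max unavailable" value).
# Hypothetical helper; the ( ... ) body keeps its variables local.
rolling_batches() (
    max=$1
    shift
    if [ "$max" -lt 1 ]; then
        echo "max unavailable must be at least 1" >&2
        exit 1
    fi
    batch=""
    count=0
    for node in "$@"; do
        batch="${batch:+$batch }$node"
        count=$((count + 1))
        if [ "$count" -eq "$max" ]; then
            echo "$batch"
            batch=""
            count=0
        fi
    done
    # Emit the final, possibly smaller, batch.
    if [ -n "$batch" ]; then
        echo "$batch"
    fi
)

rolling_batches 2 node-1 node-2 node-3 node-4 node-5
# prints:
#   node-1 node-2
#   node-3 node-4
#   node-5
```

With a limit of 1, as in the command above, every node forms its own batch, which is the slowest but most conservative rollout.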
### 3. Blue-Green Deployment
Create new infrastructure alongside old:
```bash
# Create new "green" environment
provisioning workspace create my-cluster-green
# Deploy updated infrastructure
provisioning cluster create my-cluster --workspace my-cluster-green
# Test green environment
provisioning test env cluster my-cluster-green
# Switch traffic to green
provisioning cluster switch my-cluster-green --production
# Clean up the old "blue" environment once green is verified
provisioning workspace delete my-cluster-blue --confirm
```
**Pros**: Zero downtime, easy rollback

**Cons**: Requires 2x resources temporarily

---
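The key property of a blue-green deployment is that traffic moves only after the green environment has been verified. A minimal sketch of that gate, using the commands shown above and assuming each exits non-zero on failure; `blue_green_cutover` is a hypothetical wrapper.

```bash
# Sketch: switch traffic only if the green environment passes its tests.
# Assumption: every provisioning command exits non-zero on failure.
# `blue_green_cutover` is a hypothetical wrapper, not a CLI command.
blue_green_cutover() {
    cluster=$1
    green=$2
    provisioning workspace create "$green" || return 1
    provisioning cluster create "$cluster" --workspace "$green" || return 1
    if ! provisioning test env cluster "$green"; then
        echo "green failed tests; traffic stays on blue" >&2
        return 1
    fi
    provisioning cluster switch "$green" --production
}

# Usage (not executed here):
#   blue_green_cutover my-cluster my-cluster-green
```

The cleanup step is deliberately left out of the wrapper so the old environment remains available for rollback until someone removes it by hand.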
## Update Procedures
### Updating Task Services
```bash
# List installed taskservs with versions
provisioning taskserv list --with-versions
# Check for updates
provisioning taskserv check-updates
# Update specific service
provisioning taskserv update kubernetes \
--version 1.29.0 \
--backup \
--check
# Verify update
provisioning taskserv status kubernetes
```
### Updating Server Configuration
```bash
# Update server plan (resize)
provisioning server update web-01 \
--plan 4xCPU-8GB \
--check
# Update server zone (migrate)
provisioning server migrate web-01 \
--to-zone us-west-2 \
--check
```
### Updating Cluster Configuration
```bash
# Update cluster configuration
provisioning cluster update my-cluster \
--config updated-config.k \
--backup \
--check
# Apply configuration changes
provisioning cluster apply my-cluster
```
## Rollback Procedures
If an update fails, roll back to the previous state:
```bash
# List available backups
provisioning backup list
# Roll back to a specific backup
provisioning backup restore my-cluster-20251010-1200 --confirm
# Verify rollback
provisioning cluster status my-cluster
```
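When several backups exist, the newest one for a cluster is usually the rollback target. The sketch below picks it from a list of names, assuming names follow the `<cluster>-YYYYMMDD-HHMM` pattern of the example above and arrive one per line. The real output format of `provisioning backup list` is not shown in this guide and may differ; `latest_backup` is a hypothetical helper.

```bash
# Sketch: pick the newest backup name for a cluster from a list of names.
# Assumptions: names look like <cluster>-YYYYMMDD-HHMM and arrive one per
# line on stdin. `latest_backup` is a hypothetical helper.
latest_backup() {
    # Fixed-width timestamps sort chronologically as plain text.
    grep -E "^$1-[0-9]{8}-[0-9]{4}\$" | sort | tail -n 1
}

printf '%s\n' \
    my-cluster-20251009-0900 \
    my-cluster-20251010-1200 \
    other-cluster-20251011-0800 |
    latest_backup my-cluster
# prints: my-cluster-20251010-1200
```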
## Post-Update Verification
After updating, verify system health:
```bash
# Check system status
provisioning status
# Verify all services
provisioning taskserv list --health
# Run smoke tests
provisioning test quick kubernetes
provisioning test quick postgres
# Check orchestrator
provisioning workflow orchestrator
```
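Running the checks one by one makes it easy to stop at the first failure and miss later ones. The sketch below runs every check, keeps going on failure, and prints a summary. It assumes each command exits non-zero when its check fails; `verify_all` is a hypothetical helper that reads one command per line.

```bash
# Sketch: run every health check, continue on failure, report a summary.
# Assumption: each command exits non-zero when its check fails.
# `verify_all` is a hypothetical helper; the ( ... ) body keeps variables local.
verify_all() (
    failed=0
    while IFS= read -r cmd; do
        [ -n "$cmd" ] || continue
        # Word-split the line into a command and its arguments.
        if $cmd </dev/null; then
            echo "ok:   $cmd"
        else
            echo "FAIL: $cmd"
            failed=$((failed + 1))
        fi
    done
    echo "$failed check(s) failed"
    [ "$failed" -eq 0 ]
)

# Usage (not executed here):
#   printf '%s\n' \
#       "provisioning status" \
#       "provisioning taskserv list --health" \
#       "provisioning test quick kubernetes" \
#       "provisioning test quick postgres" |
#       verify_all
```

The function's exit status is non-zero if any check failed, so it can gate later steps such as removing old backups.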
## Update Best Practices
### Before Update
1. **Backup everything**: `provisioning backup create --all`
2. **Review docs**: Check taskserv update notes
3. **Test first**: Use test environment
4. **Schedule window**: Plan for maintenance time
### During Update
1. **Monitor logs**: `provisioning logs follow`
2. **Check health**: Run `provisioning health` at regular intervals
3. **Verify phases**: Ensure each phase completes
4. **Document changes**: Keep update log
### After Update
1. **Verify functionality**: Run test suite
2. **Check performance**: Monitor metrics
3. **Review logs**: Check for errors
4. **Update documentation**: Record changes
5. **Cleanup**: Remove old backups after verification
## Automated Updates
Enable automatic updates for non-critical changes:
```bash
# Configure auto-update policy
provisioning config set auto-update.enabled true
provisioning config set auto-update.strategy minor
provisioning config set auto-update.schedule "0 2 * * 0" # Weekly Sunday 2AM
# Check auto-update status
provisioning config show auto-update
```
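The schedule string above looks like a standard five-field cron expression; assuming the tool interprets it that way, the sketch below labels each field. `explain_cron` is a hypothetical helper for illustration only.

```bash
# Sketch: label the five fields of a cron expression.
# `explain_cron` is a hypothetical helper, not part of the CLI.
explain_cron() (
    set -f  # keep "*" literal instead of expanding it as a glob
    set -- $1
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
        "$1" "$2" "$3" "$4" "$5"
)

explain_cron "0 2 * * 0"
# prints: minute=0 hour=2 day-of-month=* month=* day-of-week=0
```

In standard cron syntax, day-of-week `0` is Sunday, so this schedule fires at 02:00 every Sunday.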
## Update Notifications
Configure notifications for update events:
```bash
# Enable update notifications
provisioning config set notifications.updates.enabled true
provisioning config set notifications.updates.email "admin@example.com"
# Test notifications
provisioning test notification update-available
```
## Troubleshooting Updates
### Common Issues
**Update Fails Mid-Process**:
```bash
# Check update status
provisioning update status
# Resume failed update
provisioning update resume --from-checkpoint
# Or roll back
provisioning update rollback
```
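The resume and rollback commands can be chained so that a rollback happens only when resuming also fails. A minimal sketch, assuming both commands exit non-zero on failure; `recover_update` is a hypothetical wrapper.

```bash
# Sketch: try to resume a failed update; roll back if resuming fails too.
# Assumption: both commands exit non-zero on failure.
# `recover_update` is a hypothetical wrapper, not a CLI command.
recover_update() {
    # Show the current state first so it appears in the terminal or log.
    provisioning update status
    if provisioning update resume --from-checkpoint; then
        echo "update resumed"
        return 0
    fi
    echo "resume failed; rolling back" >&2
    provisioning update rollback
}

# Usage (not executed here):
#   recover_update
```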
**Service Incompatibility**:
```bash
# Check compatibility
provisioning taskserv compatibility kubernetes 1.29.0
# See dependency tree
provisioning taskserv dependencies kubernetes
```
**Configuration Conflicts**:
```bash
# Validate configuration
provisioning validate config
# Show configuration diff
provisioning config diff --before --after
```
## Related Documentation
- **[Quick Start Guide](../quickstart/01-prerequisites.md)** - Initial setup
- **[Service Management](../user/SERVICE_MANAGEMENT_GUIDE.md)** - Service operations
- **[Backup & Restore](../operations/backup-restore.md)** - Backup procedures
- **[Troubleshooting](../user/troubleshooting-guide.md)** - Common issues
---
**Need Help?** Run `provisioning help update` or see [Troubleshooting Guide](../user/troubleshooting-guide.md).