# GitHub Actions CI/CD Guide for VAPORA Provisioning

Complete guide for setting up and using GitHub Actions workflows for VAPORA deployment automation.

## Overview

Five integrated GitHub Actions workflows provide end-to-end CI/CD automation:

1. **validate-and-build.yml** - Configuration validation and artifact generation
2. **deploy-docker.yml** - Docker Compose deployment automation
3. **deploy-kubernetes.yml** - Kubernetes deployment automation
4. **health-check.yml** - Automated health monitoring and diagnostics
5. **rollback.yml** - Safe deployment rollback with pre-checks

---

## Quick Setup

### 1. Prerequisites

- GitHub repository with access to Actions
- Docker Hub account (for image pushes, optional)
- Kubernetes cluster with kubeconfig (for K8s deployments)
- Slack workspace (for notifications, optional)

### 2. Required Secrets

Add these secrets to your GitHub repository (Settings → Secrets and variables → Actions):

```bash
# Kubeconfig for Kubernetes deployments
KUBE_CONFIG_CI          # For CI/test cluster (optional)
KUBE_CONFIG_STAGING     # For staging Kubernetes cluster
KUBE_CONFIG_PRODUCTION  # For production Kubernetes cluster

# Optional: Slack notifications
SLACK_WEBHOOK           # Default Slack webhook
SLACK_WEBHOOK_ALERTS    # Critical alerts webhook

# Optional: Docker registry
DOCKER_USERNAME         # Docker Hub username
DOCKER_PASSWORD         # Docker Hub access token
```

### 3. Encode Kubeconfig for Secrets

```bash
# Convert kubeconfig to single-line base64 (tr strips the line wrapping GNU base64 adds)
base64 < ~/.kube/config | tr -d '\n'

# Store in GitHub Secrets as KUBE_CONFIG_STAGING, etc.
```
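Before storing the value, it can help to confirm that it decodes back to the original file. The following is a minimal sketch, not part of the workflows themselves; `encode_kubeconfig` and `verify_kubeconfig_secret` are hypothetical helper names, and the final commented step assumes the GitHub CLI (`gh`) is installed and authenticated.

```bash
# Encode a kubeconfig as single-line base64.
encode_kubeconfig() {
  base64 < "$1" | tr -d '\n'
}

# Succeed only if the encoded value decodes back to the original file.
verify_kubeconfig_secret() {
  printf '%s' "$1" | base64 -d 2>/dev/null | cmp -s - "$2"
}

# Example (the last step assumes the GitHub CLI):
#   encoded=$(encode_kubeconfig ~/.kube/config)
#   verify_kubeconfig_secret "$encoded" ~/.kube/config && echo "round-trip OK"
#   gh secret set KUBE_CONFIG_STAGING --body "$encoded"
```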

### 4. Enable GitHub Actions

1. Go to repository Settings
2. Click "Actions" → "General"
3. Enable "Allow all actions and reusable workflows"
4. Set "Workflow permissions" to "Read and write permissions"

---

## Workflows in Detail

### 1. Validate & Build (validate-and-build.yml)

**Purpose**: Validate all configurations and generate deployment artifacts

**Triggers**:
- Push to `main` or `develop` branches (if provisioning files change)
- Manual dispatch with custom mode selection
- Pull requests affecting provisioning

**Jobs**:
- `validate-configs` - Validates solo, multiuser, and enterprise modes
- `build-artifacts` - Generates JSON, TOML, YAML, and Kubernetes manifests

**Outputs**:
- `deployment-artifacts` - All configuration and manifest files
- `build-logs` - Pipeline execution logs
- `validation-logs-*` - Per-mode validation reports

**Usage**:

```bash
# Automatic on push
git commit -m "Update provisioning config"
git push origin main

# Manual trigger
# Go to Actions → Validate & Build → Run workflow
# Select mode: solo, multiuser, or enterprise
```
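A manual run can also be started from a terminal. This is a minimal sketch, assuming the GitHub CLI (`gh`) is installed and authenticated for the repository; `run_workflow` is a hypothetical helper, and using `mode` as the input name for validate-and-build is an assumption based on the inputs the deploy workflows list.

```bash
# Trigger a workflow_dispatch run with the GitHub CLI.
# Arguments: workflow file name, then any number of key=value inputs.
run_workflow() (
  workflow=$1
  shift
  # Rewrite each "key=value" argument as "-f key=value" for gh.
  n=$#
  while [ "$n" -gt 0 ]; do
    set -- "$@" -f "$1"
    shift
    n=$((n - 1))
  done
  gh workflow run "$workflow" "$@"
)

# Examples:
#   run_workflow validate-and-build.yml mode=multiuser
#   run_workflow deploy-kubernetes.yml mode=enterprise environment=staging dry_run=true
```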

**Example Outputs**:
```
artifacts/
├── config-solo.json
├── config-multiuser.json
├── config-enterprise.json
├── vapora-solo.toml
├── vapora-multiuser.toml
├── vapora-enterprise.toml
├── vapora-solo.yaml
├── vapora-multiuser.yaml
├── vapora-enterprise.yaml
├── configmap.yaml
├── deployment.yaml
├── docker-compose.yml
└── MANIFEST.md
```

---

### 2. Deploy to Docker (deploy-docker.yml)

**Purpose**: Deploy VAPORA to Docker Compose

**Triggers**:
- Manual dispatch with configuration options
- Automatic trigger after validate-and-build on `develop` branch

**Required Inputs**:
- `mode` - Deployment mode (solo, multiuser, enterprise)
- `environment` - Target environment (development, staging, production)
- `dry_run` - Test without actual deployment

**Features**:
- Validates Docker Compose configuration
- Pulls base images
- Starts services
- Performs health checks
- Auto-comments on PRs with deployment details
- Slack notifications

**Usage**:

```
# Via GitHub UI
1. Go to Actions → Deploy to Docker
2. Click "Run workflow"
3. Select:
   - Mode: multiuser
   - Dry run: false
   - Environment: staging
4. Click "Run workflow"
```

**Service Endpoints** (after deployment):
```
- Backend: http://localhost:8001
- Frontend: http://localhost:3000
- Agents: http://localhost:8002
- LLM Router: http://localhost:8003
- SurrealDB: http://localhost:8000
- Health: http://localhost:8001/health
```

**Local testing with same files**:
```bash
# Download artifacts from workflow
cd deploy/docker
docker compose up -d

# View logs
docker compose logs -f backend

# Check health
curl http://localhost:8001/health
```
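Services started with `docker compose up -d` may need a few seconds before the health endpoint responds, so a single `curl` can fail spuriously. The sketch below polls until the endpoint answers or the attempts run out. `wait_for_health` is a hypothetical helper, and the defaults of 10 attempts and 3 seconds are arbitrary assumptions, not values taken from the workflow.

```bash
# Poll a health endpoint until it returns a successful status or attempts run out.
# Arguments: URL, max attempts (default 10), seconds between attempts (default 3).
wait_for_health() (
  url=$1
  attempts=${2:-10}
  interval=${3:-3}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP 4xx/5xx responses.
    if curl -fsS -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      exit 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
  echo "unhealthy after $attempts attempt(s)" >&2
  exit 1
)

# Example:
#   wait_for_health http://localhost:8001/health 20 3
```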

---

### 3. Deploy to Kubernetes (deploy-kubernetes.yml)

**Purpose**: Deploy VAPORA to Kubernetes cluster

**Triggers**:
- Manual dispatch with full configuration options
- Workflow dispatch with environment selection

**Required Inputs**:
- `mode` - Deployment mode
- `environment` - Target environment (staging, production)
- `dry_run` - Dry-run test (recommended first)
- `rollout_timeout` - Max time to wait for rollout (default: 300s)

**Features**:
- Validates Kubernetes manifests
- Creates VAPORA namespace
- Applies ConfigMap with configuration
- Deploys all three services
- Waits for rollout completion
- Performs health checks
- Annotation tracking for deployments
- Slack notifications

**Usage**:

```
# Via GitHub UI
1. Go to Actions → Deploy to Kubernetes
2. Click "Run workflow"
3. Select:
   - Mode: enterprise
   - Environment: staging
   - Dry run: true  # Always test first!
   - Rollout timeout: 300
4. Click "Run workflow"

# After dry-run verification, re-run with dry_run: false
```

**Deployment Steps**:
1. Validate manifests (dry-run)
2. Create vapora namespace
3. Apply ConfigMap
4. Apply Deployments
5. Wait for backend rollout (5m timeout)
6. Wait for agents rollout
7. Wait for llm-router rollout
8. Verify pod health

**Verification Commands**:
```bash
# Check deployments
kubectl get deployments -n vapora
kubectl get pods -n vapora

# View logs
kubectl logs -f deployment/vapora-backend -n vapora

# Check events
kubectl get events -n vapora --sort-by='.lastTimestamp'

# Port forward for local testing
kubectl port-forward -n vapora svc/vapora-backend 8001:8001
curl http://localhost:8001/health

# View rollout history
kubectl rollout history deployment/vapora-backend -n vapora
```
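The rollout waits in the deployment steps can be reproduced by hand with `kubectl rollout status`. This is a rough sketch of that sequence, not the workflow's actual script; `wait_for_rollouts` is a hypothetical helper, and the deployment names `vapora-agents` and `vapora-llm-router` are assumptions (only `vapora-backend` appears in this guide).

```bash
# Wait for each deployment to finish rolling out; stop at the first failure.
# Arguments: namespace, timeout in seconds, then one or more deployment names.
wait_for_rollouts() (
  namespace=$1
  timeout=$2
  shift 2
  for name in "$@"; do
    kubectl rollout status "deployment/$name" -n "$namespace" \
      --timeout="${timeout}s" || exit 1
  done
)

# Example (the last two deployment names are assumed):
#   wait_for_rollouts vapora 300 vapora-backend vapora-agents vapora-llm-router
```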

---

### 4. Health Check & Monitoring (health-check.yml)

**Purpose**: Continuous health monitoring across platforms

**Triggers**:
- Schedule: Every 15 minutes
- Schedule: Every 6 hours
- Manual dispatch with custom parameters

**Features**:
- Docker: Container status, HTTP health checks
- Kubernetes: Deployment replicas, pod phases, service health
- Automatic issue creation on failures
- Diagnostics collection
- Slack notifications

**Usage**:

```
# Via GitHub UI for manual run
1. Go to Actions → Health Check & Monitoring
2. Click "Run workflow"
3. Select:
   - Target: kubernetes
   - Count: 5 (run 5 checks)
   - Interval: 30 (30 seconds between checks)
4. Click "Run workflow"
```
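The Count and Interval inputs describe a repeat-and-tally loop. The sketch below illustrates that idea under stated assumptions: it is not the workflow's implementation, `run_health_checks` is a hypothetical helper, and the probe command is whatever check you pass in.

```bash
# Run a probe COUNT times, INTERVAL seconds apart, and tally the results.
# Arguments: count, interval in seconds, then the probe command and its arguments.
run_health_checks() (
  count=$1
  interval=$2
  shift 2
  passed=0
  failed=0
  i=1
  while [ "$i" -le "$count" ]; do
    if "$@" > /dev/null 2>&1; then
      passed=$((passed + 1))
    else
      failed=$((failed + 1))
    fi
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
  echo "passed=$passed failed=$failed"
  # Exit status is non-zero if any probe failed.
  [ "$failed" -eq 0 ]
)

# Example: five probes, 30 seconds apart, against the backend health endpoint.
#   run_health_checks 5 30 curl -fsS http://localhost:8001/health
```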

**Automatic Monitoring**:
- Every 15 minutes: Quick health check
- Every 6 hours: Comprehensive diagnostics

**What Gets Checked** (Kubernetes):
- Deployment replica status
- Pod readiness conditions
- Service availability
- ConfigMap data
- Recent events
- Resource usage (if metrics-server available)

**What Gets Checked** (Docker):
- Container status (Up/Down)
- HTTP endpoint health (200 status)
- Service responsiveness
- Docker network status
- Docker volumes

**Reports Generated**:
- `docker-health.log` - Docker health check output
- `k8s-health.log` - Kubernetes health check output
- `k8s-diagnostics.log` - Full K8s diagnostics
- `docker-diagnostics.log` - Full Docker diagnostics
- `HEALTH_REPORT.md` - Summary report

---

### 5. Rollback Deployment (rollback.yml)

**Purpose**: Safe deployment rollback with pre-checks and verification

**Triggers**:
- Manual dispatch only (safety feature)

**Required Inputs**:
- `target` - Rollback target (kubernetes or docker)
- `environment` - Environment to roll back (staging or production)
- `deployment` - Specific deployment or "all"
- `revision` - Kubernetes revision (0 = previous)

**Features**:
- Pre-rollback safety checks
- Deployment history snapshot
- Automatic rollback execution
- Post-rollback verification
- Health check after rollback
- GitHub issue creation with summary
- Slack alerts

**Usage** (Kubernetes):

```
# Via GitHub UI
1. Go to Actions → Rollback Deployment
2. Click "Run workflow"
3. Select:
   - Target: kubernetes
   - Environment: staging
   - Deployment: all
   - Revision: 0 (roll back to previous)
4. Click "Run workflow"

# To roll back to a specific revision:
# Check: kubectl rollout history deployment/vapora-backend -n vapora
# Set revision to the desired number instead of 0
```
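For reference, the equivalent manual rollback uses `kubectl rollout undo`, where `--to-revision=0` (kubectl's default) means the previous revision. This is a minimal sketch of the manual path, not the workflow's own script; `rollback_deployment` is a hypothetical helper, and the 300s status timeout simply mirrors the default rollout timeout mentioned earlier.

```bash
# Roll a deployment back, then wait for the rollback to settle.
# Arguments: deployment name, revision (default 0 = previous), namespace (default vapora).
rollback_deployment() (
  name=$1
  revision=${2:-0}
  namespace=${3:-vapora}
  kubectl rollout undo "deployment/$name" -n "$namespace" --to-revision="$revision" &&
    kubectl rollout status "deployment/$name" -n "$namespace" --timeout=300s
)

# Examples:
#   rollback_deployment vapora-backend      # previous revision
#   rollback_deployment vapora-backend 4    # a specific revision from the history
```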

**Usage** (Docker):

```
# Via GitHub UI
1. Go to Actions → Rollback Deployment
2. Click "Run workflow"
3. Select:
   - Target: docker
   - Environment: staging
4. Click "Run workflow"

# Follow the manual rollback guide in artifacts
```

**Rollback Process**:
1. Pre-rollback checks and snapshot
2. Store current deployment history
3. Execute rollback (automatic for K8s, guided for Docker)
4. Verify rollback status
5. Check pod health
6. Generate reports
7. Create GitHub issue
8. Send Slack alert

**Verification After Rollback**:
```bash
# Kubernetes
kubectl get pods -n vapora
kubectl logs -f deployment/vapora-backend -n vapora
curl http://localhost:8001/health  # After port-forward

# Docker
docker compose ps
docker compose logs backend
curl http://localhost:8001/health
```

---

## CI/CD Pipelines & Common Workflows

### Workflow 1: Local Development

```
Developer creates feature branch
↓
Push to GitHub
↓
[Validate & Build] triggers automatically
↓
Download artifacts
↓
[Deploy to Docker] manually for local testing
↓
Test locally with docker compose
↓
Create PR (artifact links included)
↓
Merge to develop when approved
```

### Workflow 2: Staging Deployment

```
Merge PR to develop
↓
[Validate & Build] runs automatically
↓
Download artifacts
↓
Run [Deploy to Kubernetes] manually with dry-run
↓
Review dry-run output
↓
Run [Deploy to Kubernetes] again with dry-run: false
↓
[Health Check] verifies deployment
↓
Staging environment live
```

### Workflow 3: Production Deployment

```
Code review and approval
↓
Merge PR to main
↓
[Validate & Build] runs automatically
↓
Manual approval for production
↓
Run [Deploy to Kubernetes] with dry-run: true
↓
Review changes carefully
↓
Run [Deploy to Kubernetes] with dry-run: false
↓
[Health Check] monitoring (automatic: quick checks every 15 minutes, diagnostics every 6 hours)
↓
Production deployment complete
```

### Workflow 4: Emergency Rollback

```
Production issue detected
↓
[Health Check] alerts in Slack
↓
Investigate issue
↓
Run [Rollback Deployment] manually
↓
GitHub issue created automatically
↓
[Health Check] verifies rollback
↓
Services restored
↓
Incident investigation begins
```

---

## Environment Configuration

### Staging Environment

- **Branch**: develop
- **Auto-deploy**: No (manual only)
- **Dry-run default**: Yes (test first)
- **Notifications**: SLACK_WEBHOOK
- **Protection**: Requires approval for merge to main

### Production Environment

- **Branch**: main
- **Auto-deploy**: No (manual only)
- **Dry-run default**: Yes (always test first)
- **Notifications**: SLACK_WEBHOOK_ALERTS
- **Protection**: Requires PR review, status checks must pass

---

## Artifacts & Downloads

All workflow artifacts are available in the Actions tab for 30 to 90 days, depending on artifact type:

```
Actions → [Specific Workflow] → [Run] → Artifacts
```
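Artifacts can also be fetched from a terminal. The following is a minimal sketch, assuming the GitHub CLI (`gh`) is installed and authenticated; `download_latest_artifact` is a hypothetical helper, and it picks the most recent run of a workflow regardless of whether that run succeeded.

```bash
# Download one named artifact from the most recent run of a workflow.
# Arguments: workflow file, artifact name, destination directory (default: artifacts).
download_latest_artifact() (
  workflow=$1
  artifact=$2
  dest=${3:-artifacts}
  run_id=$(gh run list --workflow "$workflow" --limit 1 \
    --json databaseId --jq '.[0].databaseId')
  if [ -z "$run_id" ] || [ "$run_id" = "null" ]; then
    echo "no runs found for $workflow" >&2
    exit 1
  fi
  gh run download "$run_id" --name "$artifact" --dir "$dest"
)

# Example:
#   download_latest_artifact validate-and-build.yml deployment-artifacts ./artifacts
```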

**Available Artifacts**:
- `deployment-artifacts` - Configuration and manifests
- `validation-logs-*` - Per-mode validation reports
- `build-logs` - CI/CD pipeline logs
- `docker-deployment-logs-*` - Docker deployment details
- `k8s-deployment-*` - Kubernetes deployment details
- `health-check-*` - Health monitoring reports
- `rollback-logs-*` - Rollback execution details
- `rollback-snapshot-*` - Pre-rollback state snapshot

---

## Troubleshooting

### Build Fails: "Config not found"
```
Solution: Ensure provisioning/schemas/ files exist and are committed
          Check path references in validate-config.nu
```

### Deploy Fails: "kubeconfig not found"
```
Solution: 1. Verify KUBE_CONFIG_STAGING/PRODUCTION secrets exist
          2. Ensure kubeconfig is properly base64 encoded
          3. Test: echo "$KUBE_CONFIG_STAGING" | base64 -d
          4. Re-encode if corrupted: base64 < ~/.kube/config | tr -d '\n'
```

### Health Check: "No kubeconfig available"
```
Solution: Configure at least KUBE_CONFIG_STAGING secret
          Health check tries CI first, then falls back to staging
```

### Docker Deploy: "Docker daemon not accessible"
```
Solution: Docker is preinstalled on GitHub-hosted Linux runners such as ubuntu-latest
          Run deploy-docker on a runner that has Docker available
```

### Deployment Hangs: "Waiting for rollout"
```
Solution: 1. Check pod logs: kubectl logs -n vapora <pod>
          2. Describe pod: kubectl describe pod -n vapora <pod>
          3. Increase rollout_timeout in workflow
          4. Check resource requests/limits in deployment.yaml
```
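When a rollout appears stuck, comparing ready and desired replica counts shows quickly whether pods are failing to become ready. This is a rough diagnostic sketch, not something the workflows are known to run; `replica_status` and `is_fully_ready` are hypothetical helpers. A deployment scaled to zero is treated as not ready here, which is a choice made for this example.

```bash
# Print "ready/desired" replica counts for a deployment.
# Arguments: deployment name, namespace (default vapora).
replica_status() {
  kubectl get deployment "$1" -n "${2:-vapora}" \
    -o jsonpath='{.status.readyReplicas}/{.spec.replicas}'
}

# Succeed only when at least one replica is desired and all of them are ready.
# Kubernetes omits readyReplicas when it is zero, so an empty value counts as 0.
is_fully_ready() (
  status=$1
  ready=${status%%/*}
  desired=${status##*/}
  [ "${ready:-0}" -eq "${desired:-0}" ] && [ "${desired:-0}" -gt 0 ]
)

# Example:
#   is_fully_ready "$(replica_status vapora-backend)" || echo "backend not fully ready"
```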

---

## Slack Integration

### Set Up Slack Webhooks

1. Create Slack App: https://api.slack.com/apps
2. Enable Incoming Webhooks
3. Create webhook for #deployments channel
4. Copy webhook URL
5. Add to GitHub Secrets:
   - `SLACK_WEBHOOK` - General notifications
   - `SLACK_WEBHOOK_ALERTS` - Critical alerts

### Slack Message Examples

**Build Success**:
```
✅ VAPORA Artifact Build Complete
Mode: multiuser | Artifacts ready for deployment
```

**Deployment Success**:
```
✅ VAPORA Docker deployment successful!
Mode: multiuser | Environment: staging
```

**Health Check Alert**:
```
❌ VAPORA Health Check Failed
Target: kubernetes | Create issue for investigation
```

**Rollback Alert**:
```
🔙 VAPORA Rollback Executed
Target: kubernetes | Environment: production
Executed By: @user | Verify service health
```
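To test a webhook before wiring it into the workflows, a message like those above can be posted with `curl`. This is a minimal sketch, assuming `SLACK_WEBHOOK` is exported in the local shell; `json_escape`, `slack_payload`, and `notify_slack` are hypothetical helpers, and the escaping handles only backslashes and double quotes, not newlines or control characters.

```bash
# Escape backslashes and double quotes so the text is safe inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the {"text": "..."} payload accepted by Slack incoming webhooks.
slack_payload() {
  printf '{"text":"%s"}' "$(json_escape "$1")"
}

# Post a message to the webhook URL held in SLACK_WEBHOOK.
notify_slack() {
  if [ -z "${SLACK_WEBHOOK:-}" ]; then
    echo "SLACK_WEBHOOK is not set" >&2
    return 1
  fi
  curl -fsS -X POST -H 'Content-Type: application/json' \
    --data "$(slack_payload "$1")" "$SLACK_WEBHOOK"
}

# Example:
#   notify_slack "VAPORA Docker deployment successful! Mode: multiuser | Environment: staging"
```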

---

## Security Best Practices

✅ **Do**:
- Always run Kubernetes deployments with `dry_run: true` first
- Review artifacts before production deployment
- Enable branch protection rules on main
- Use environment secrets (staging vs production)
- Require PR reviews before merge
- Monitor health checks after deployment
- Keep kubeconfig.backup safely stored
- Rotate secrets regularly

❌ **Don't**:
- Commit secrets to repository
- Deploy directly to production without testing
- Disable workflow validation steps
- Skip health checks after deployment
- Use same kubeconfig for all environments
- Merge unreviewed PRs
- Change production without approval
- Share kubeconfig over unencrypted channels

---

## Monitoring & Alerts

### Automated Monitoring

- **Health checks**: Every 15 minutes
- **Comprehensive diagnostics**: Every 6 hours
- **Issue creation**: On health check failures
- **Slack alerts**: On critical failures

### Manual Monitoring

```bash
# Real-time logs
kubectl logs -f deployment/vapora-backend -n vapora

# Watch pods
kubectl get pods -n vapora --watch

# Metrics
kubectl top pods -n vapora

# Events
kubectl get events -n vapora --sort-by='.lastTimestamp'
```

---

## FAQ

**Q: Can I deploy multiple modes simultaneously?**
A: No, workflows serialize deployments. Deploy to staging first, then production.

**Q: How do I revert a failed deployment?**
A: Use the Rollback Deployment workflow. It automatically reverts to the previous revision.

**Q: What if validation fails?**
A: Fix the configuration error and push again. The workflow re-runs automatically.

**Q: Can I skip health checks?**
A: No, health checks are mandatory for safety. They run automatically after each deployment.

**Q: How long do artifacts stay?**
A: 30 to 90 days, depending on artifact type. Download and archive important ones.

**Q: What if kubeconfig expires?**
A: Update the secret in GitHub Settings → Secrets and variables → Actions with the new kubeconfig.

**Q: Can I deploy to multiple clusters?**
A: Yes, create separate secrets (KUBE_CONFIG_PROD_US, KUBE_CONFIG_PROD_EU) and workflows.

---

## Support & Documentation

- **Workflow Logs**: Actions → [Workflow Name] → [Run] → View logs
- **Artifacts**: Actions → [Workflow Name] → [Run] → Artifacts section
- **Issues**: GitHub Issues automatically created on failures
- **Slack**: Check #deployments channel for notifications

---

**Last Updated**: January 12, 2026
**Status**: Complete and production-ready
**Workflows**: 5 (validate-and-build, deploy-docker, deploy-kubernetes, health-check, rollback)