Vapora/provisioning/.github/GITHUB_ACTIONS_GUIDE.md
Jesús Pérez a395bd972f
chore: add cd/ci ops
2026-01-12 03:36:55 +00:00

GitHub Actions CI/CD Guide for VAPORA Provisioning

Complete guide for setting up and using GitHub Actions workflows for VAPORA deployment automation.

Overview

Five integrated GitHub Actions workflows provide end-to-end CI/CD automation:

  1. validate-and-build.yml - Configuration validation and artifact generation
  2. deploy-docker.yml - Docker Compose deployment automation
  3. deploy-kubernetes.yml - Kubernetes deployment automation
  4. health-check.yml - Automated health monitoring and diagnostics
  5. rollback.yml - Safe deployment rollback with pre-checks

Quick Setup

1. Prerequisites

  • GitHub repository with access to Actions
  • Docker Hub account (for image pushes, optional)
  • Kubernetes cluster with kubeconfig (for K8s deployments)
  • Slack workspace (for notifications, optional)

2. Required Secrets

Add these secrets to your GitHub repository (Settings → Secrets and variables → Actions):

# Kubeconfig for Kubernetes deployments
KUBE_CONFIG_CI              # For CI/test cluster (optional)
KUBE_CONFIG_STAGING         # For staging Kubernetes cluster
KUBE_CONFIG_PRODUCTION      # For production Kubernetes cluster

# Optional: Slack notifications
SLACK_WEBHOOK               # Default Slack webhook
SLACK_WEBHOOK_ALERTS        # Critical alerts webhook

# Optional: Docker registry
DOCKER_USERNAME             # Docker Hub username
DOCKER_PASSWORD             # Docker Hub access token

3. Encode Kubeconfig for Secrets

# Convert kubeconfig to base64
base64 < ~/.kube/config | tr -d '\n'   # single line, easier to paste into a secret

# Store in GitHub Secrets as KUBE_CONFIG_STAGING, etc.
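
Before pasting the encoded value into a secret, it can help to confirm that it decodes back to the original file. The sketch below is one way to do that; the function names are hypothetical, and `base64 -d` is assumed to behave as it does elsewhere in this guide (older macOS releases may need `-D` instead).

```shell
#!/bin/sh
# Encode a file as single-line base64 (line wraps stripped for easy pasting).
encode_kubeconfig() {
  base64 < "$1" | tr -d '\n'
}

# Decode base64 read from stdin.
decode_kubeconfig() {
  base64 -d
}

# Round-trip check: succeed only if decoding the encoded file reproduces it.
verify_roundtrip() {
  encode_kubeconfig "$1" | decode_kubeconfig | cmp -s - "$1"
}

# Example:
# verify_roundtrip ~/.kube/config && echo "encoding OK"
```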

4. Enable GitHub Actions

  1. Go to repository Settings
  2. Click "Actions" → "General"
  3. Enable "Allow all actions and reusable workflows"
  4. Set "Workflow permissions" to "Read and write permissions"

Workflows in Detail

1. Validate & Build (validate-and-build.yml)

Purpose: Validate all configurations and generate deployment artifacts

Triggers:

  • Push to main or develop branches (if provisioning files change)
  • Manual dispatch with custom mode selection
  • Pull requests affecting provisioning

Jobs:

  • validate-configs - Validates solo, multiuser, and enterprise modes
  • build-artifacts - Generates JSON, TOML, YAML, and Kubernetes manifests

Outputs:

  • deployment-artifacts - All configuration and manifest files
  • build-logs - Pipeline execution logs
  • validation-logs-* - Per-mode validation reports

Usage:

# Automatic on push
git commit -m "Update provisioning config"
git push origin main

# Manual trigger
# Go to Actions → Validate & Build → Run workflow
# Select mode: solo, multiuser, or enterprise

Example Outputs:

artifacts/
├── config-solo.json
├── config-multiuser.json
├── config-enterprise.json
├── vapora-solo.toml
├── vapora-multiuser.toml
├── vapora-enterprise.toml
├── vapora-solo.yaml
├── vapora-multiuser.yaml
├── vapora-enterprise.yaml
├── configmap.yaml
├── deployment.yaml
├── docker-compose.yml
└── MANIFEST.md

2. Deploy to Docker (deploy-docker.yml)

Purpose: Deploy VAPORA to Docker Compose

Triggers:

  • Manual dispatch with configuration options
  • Automatic trigger after validate-and-build on develop branch

Required Inputs:

  • mode - Deployment mode (solo, multiuser, enterprise)
  • environment - Target environment (development, staging, production)
  • dry_run - Test without actual deployment

Features:

  • Validates Docker Compose configuration
  • Pulls base images
  • Starts services
  • Performs health checks
  • Auto-comments on PRs with deployment details
  • Slack notifications

Usage:

# Via GitHub UI
1. Go to Actions → Deploy to Docker
2. Click "Run workflow"
3. Select:
   - Mode: multiuser
   - Dry run: false
   - Environment: staging
4. Click "Run workflow"

Service Endpoints (after deployment):

- Backend: http://localhost:8001
- Frontend: http://localhost:3000
- Agents: http://localhost:8002
- LLM Router: http://localhost:8003
- SurrealDB: http://localhost:8000
- Health: http://localhost:8001/health

Local testing with the same files:

# Download artifacts from workflow
cd deploy/docker
docker compose up -d

# View logs
docker compose logs -f backend

# Check health
curl http://localhost:8001/health
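
Services often need a few seconds after `docker compose up -d` before the health endpoint responds, so a single `curl` can fail spuriously. A retry loop along these lines is one option; `wait_for_healthy` is a hypothetical helper, and the attempt count and delay are illustrative values rather than settings taken from the workflow.

```shell
#!/bin/sh
# Hypothetical helper: retry a probe command until it succeeds or attempts run out.
# Usage: wait_for_healthy <attempts> <delay-seconds> <command> [args...]
wait_for_healthy() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    # Sleep between attempts, but not after the final one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  echo "unhealthy after $attempts attempt(s)" >&2
  return 1
}

# Example (endpoint from this guide; -f makes curl fail on HTTP error codes):
# wait_for_healthy 10 3 curl -fsS http://localhost:8001/health
```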

3. Deploy to Kubernetes (deploy-kubernetes.yml)

Purpose: Deploy VAPORA to Kubernetes cluster

Triggers:

  • Manual dispatch only (workflow_dispatch), with mode, environment, dry-run, and rollout timeout inputs

Required Inputs:

  • mode - Deployment mode
  • environment - Target environment (staging, production)
  • dry_run - Dry-run test (recommended first)
  • rollout_timeout - Max time to wait for rollout (default: 300s)

Features:

  • Validates Kubernetes manifests
  • Creates VAPORA namespace
  • Applies ConfigMap with configuration
  • Deploys all three services (backend, agents, llm-router)
  • Waits for rollout completion
  • Performs health checks
  • Annotation tracking for deployments
  • Slack notifications

Usage:

# Via GitHub UI
1. Go to Actions → Deploy to Kubernetes
2. Click "Run workflow"
3. Select:
   - Mode: enterprise
   - Environment: staging
   - Dry run: true    # Always test first!
   - Rollout timeout: 300
4. Click "Run workflow"

# After dry-run verification, re-run with dry_run: false

Deployment Steps:

  1. Validate manifests (dry-run)
  2. Create vapora namespace
  3. Apply ConfigMap
  4. Apply Deployments
  5. Wait for backend rollout (5m timeout)
  6. Wait for agents rollout
  7. Wait for llm-router rollout
  8. Verify pod health

Verification Commands:

# Check deployments
kubectl get deployments -n vapora
kubectl get pods -n vapora

# View logs
kubectl logs -f deployment/vapora-backend -n vapora

# Check events
kubectl get events -n vapora --sort-by='.lastTimestamp'

# Port forward for local testing
kubectl port-forward -n vapora svc/vapora-backend 8001:8001
curl http://localhost:8001/health

# View rollout history
kubectl rollout history deployment/vapora-backend -n vapora

4. Health Check & Monitoring (health-check.yml)

Purpose: Continuous health monitoring across platforms

Triggers:

  • Schedule: Every 15 minutes
  • Schedule: Every 6 hours
  • Manual dispatch with custom parameters

Features:

  • Docker: Container status, HTTP health checks
  • Kubernetes: Deployment replicas, pod phases, service health
  • Automatic issue creation on failures
  • Diagnostics collection
  • Slack notifications

Usage:

# Via GitHub UI for manual run
1. Go to Actions → Health Check & Monitoring
2. Click "Run workflow"
3. Select:
   - Target: kubernetes
   - Count: 5 (run 5 checks)
   - Interval: 30 (30 seconds between checks)
4. Click "Run workflow"
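
The Count and Interval inputs describe a simple repeat loop. A minimal sketch of that behaviour, assuming the check itself is any command whose exit status signals health (`run_checks` is a hypothetical name, not the workflow's implementation):

```shell
#!/bin/sh
# Hypothetical helper: run a check <count> times, <interval> seconds apart,
# and report how many runs failed. Returns non-zero if any run failed.
run_checks() {
  count=$1
  interval=$2
  shift 2
  failures=0
  n=1
  while [ "$n" -le "$count" ]; do
    "$@" >/dev/null 2>&1 || failures=$((failures + 1))
    # Wait between runs, but not after the last one.
    [ "$n" -lt "$count" ] && sleep "$interval"
    n=$((n + 1))
  done
  echo "checks=$count failures=$failures"
  [ "$failures" -eq 0 ]
}

# Example: 5 checks, 30 seconds apart, against the backend deployment.
# run_checks 5 30 kubectl rollout status deployment/vapora-backend -n vapora --timeout=10s
```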

Automatic Monitoring:

  • Every 15 minutes: Quick health check
  • Every 6 hours: Comprehensive diagnostics

What Gets Checked (Kubernetes):

  • Deployment replica status
  • Pod readiness conditions
  • Service availability
  • ConfigMap data
  • Recent events
  • Resource usage (if metrics-server available)

What Gets Checked (Docker):

  • Container status (Up/Down)
  • HTTP endpoint health (200 status)
  • Service responsiveness
  • Docker network status
  • Docker volumes

Reports Generated:

  • docker-health.log - Docker health check output
  • k8s-health.log - Kubernetes health check output
  • k8s-diagnostics.log - Full K8s diagnostics
  • docker-diagnostics.log - Full Docker diagnostics
  • HEALTH_REPORT.md - Summary report

5. Rollback Deployment (rollback.yml)

Purpose: Safe deployment rollback with pre-checks and verification

Triggers:

  • Manual dispatch only (safety feature)

Required Inputs:

  • target - Rollback target (kubernetes or docker)
  • environment - Environment to rollback (staging or production)
  • deployment - Specific deployment or "all"
  • revision - Kubernetes revision (0 = previous)

Features:

  • Pre-rollback safety checks
  • Deployment history snapshot
  • Automatic rollback execution
  • Post-rollback verification
  • Health check after rollback
  • GitHub issue creation with summary
  • Slack alerts

Usage (Kubernetes):

# Via GitHub UI
1. Go to Actions → Rollback Deployment
2. Click "Run workflow"
3. Select:
   - Target: kubernetes
   - Environment: staging
   - Deployment: all
   - Revision: 0 (rollback to previous)
4. Click "Run workflow"

# To rollback to specific revision
# Check kubectl rollout history deployment/vapora-backend -n vapora
# Set revision to desired number instead of 0
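
The revision rule can be expressed as a small command builder. This is a sketch under the assumption that the workflow ultimately calls `kubectl rollout undo`; `rollback_cmd` is a hypothetical helper that only prints the command.

```shell
#!/bin/sh
# Hypothetical helper: print the kubectl rollback command for a deployment.
# Revision 0 (or omitted) means "previous revision", which is kubectl's default,
# so --to-revision is only added for a specific revision number.
rollback_cmd() {
  deployment=$1
  revision=${2:-0}
  cmd="kubectl rollout undo deployment/$deployment -n vapora"
  if [ "$revision" -gt 0 ]; then
    cmd="$cmd --to-revision=$revision"
  fi
  echo "$cmd"
}

rollback_cmd vapora-backend 0
rollback_cmd vapora-backend 3
```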

Usage (Docker):

# Via GitHub UI
1. Go to Actions → Rollback Deployment
2. Click "Run workflow"
3. Select:
   - Target: docker
   - Environment: staging
4. Click "Run workflow"

# Follow the manual rollback guide in artifacts

Rollback Process:

  1. Pre-rollback checks and snapshot
  2. Store current deployment history
  3. Execute rollback (automatic for K8s, guided for Docker)
  4. Verify rollback status
  5. Check pod health
  6. Generate reports
  7. Create GitHub issue
  8. Send Slack alert

Verification After Rollback:

# Kubernetes
kubectl get pods -n vapora
kubectl logs -f deployment/vapora-backend -n vapora
curl http://localhost:8001/health  # After port-forward

# Docker
docker compose ps
docker compose logs backend
curl http://localhost:8001/health

CI/CD Pipelines & Common Workflows

Workflow 1: Local Development

Developer creates feature branch
    ↓
Push to GitHub
    ↓
[Validate & Build] triggers automatically
    ↓
Download artifacts
    ↓
[Deploy to Docker] manually for local testing
    ↓
Test locally with docker compose
    ↓
Create PR (artifact links included)
    ↓
Merge to develop when approved

Workflow 2: Staging Deployment

Merge PR to develop
    ↓
[Validate & Build] runs automatically
    ↓
Download artifacts
    ↓
Run [Deploy to Kubernetes] manually with dry-run
    ↓
Review dry-run output
    ↓
Run [Deploy to Kubernetes] again with dry-run: false
    ↓
[Health Check] verifies deployment
    ↓
Staging environment live

Workflow 3: Production Deployment

Code review and approval
    ↓
Merge PR to main
    ↓
[Validate & Build] runs automatically
    ↓
Manual approval for production
    ↓
Run [Deploy to Kubernetes] with dry-run: true
    ↓
Review changes carefully
    ↓
Run [Deploy to Kubernetes] with dry-run: false
    ↓
[Health Check] monitoring (quick check every 15 minutes, full diagnostics every 6 hours)
    ↓
Production deployment complete

Workflow 4: Emergency Rollback

Production issue detected
    ↓
[Health Check] alerts in Slack
    ↓
Investigate issue
    ↓
Run [Rollback Deployment] manually
    ↓
GitHub issue created automatically
    ↓
[Health Check] verifies rollback
    ↓
Services restored
    ↓
Incident investigation begins

Environment Configuration

Staging Environment

  • Branch: develop
  • Auto-deploy: No (manual only)
  • Dry-run default: Yes (test first)
  • Notifications: SLACK_WEBHOOK
  • Protection: Requires approval for merge to main

Production Environment

  • Branch: main
  • Auto-deploy: No (manual only)
  • Dry-run default: Yes (always test first)
  • Notifications: SLACK_WEBHOOK_ALERTS
  • Protection: Requires PR review, status checks must pass

Artifacts & Downloads

All workflow artifacts are available in the Actions tab for 30-90 days:

Actions → [Specific Workflow] → [Run] → Artifacts

Available Artifacts:

  • deployment-artifacts - Configuration and manifests
  • validation-logs-* - Per-mode validation reports
  • build-logs - CI/CD pipeline logs
  • docker-deployment-logs-* - Docker deployment details
  • k8s-deployment-* - Kubernetes deployment details
  • health-check-* - Health monitoring reports
  • rollback-logs-* - Rollback execution details
  • rollback-snapshot-* - Pre-rollback state snapshot

Troubleshooting

Build Fails: "Config not found"

Solution: Ensure provisioning/schemas/ files exist and are committed
          Check path references in validate-config.nu

Deploy Fails: "kubeconfig not found"

Solution: 1. Verify KUBE_CONFIG_STAGING/PRODUCTION secrets exist
          2. Ensure kubeconfig is properly base64 encoded
          3. Test a local copy of the value: echo "$KUBE_CONFIG_STAGING" | base64 -d
          4. Re-encode if corrupted: base64 < ~/.kube/config | tr -d '\n'
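
A quick sanity check for an encoded value is to decode it and look for a key every kubeconfig contains. The sketch below checks only for a top-level `clusters:` line, so passing it does not prove the credentials work; `looks_like_kubeconfig` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical check: does a base64 string decode to something kubeconfig-shaped?
# Only the presence of a top-level "clusters:" key is tested.
looks_like_kubeconfig() {
  printf '%s' "$1" | base64 -d 2>/dev/null | grep -q '^clusters:'
}

# Example, using a local copy of the value you stored as a secret:
# looks_like_kubeconfig "$KUBE_CONFIG_STAGING" || echo "secret is not a valid kubeconfig" >&2
```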

Health Check: "No kubeconfig available"

Solution: Configure at least KUBE_CONFIG_STAGING secret
          Health check tries CI first, then falls back to staging

Docker Deploy: "Docker daemon not accessible"

Solution: Docker is preinstalled on GitHub-hosted Linux runners such as ubuntu-latest
          Run deploy-docker there, or on a self-hosted runner with the Docker daemon running

Deployment Hangs: "Waiting for rollout"

Solution: 1. Check pod logs: kubectl logs -n vapora <pod>
          2. Describe pod: kubectl describe pod -n vapora <pod>
          3. Increase rollout_timeout in workflow
          4. Check resource requests/limits in deployment.yaml

Slack Integration

Setup Slack Webhooks

  1. Create Slack App: https://api.slack.com/apps
  2. Enable Incoming Webhooks
  3. Create webhook for #deployments channel
  4. Copy webhook URL
  5. Add to GitHub Secrets:
    • SLACK_WEBHOOK - General notifications
    • SLACK_WEBHOOK_ALERTS - Critical alerts
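
Once a webhook URL exists, it can be tested from a terminal before wiring it into a workflow. The sketch below builds the minimal JSON body that Slack incoming webhooks accept (a single `text` field); `slack_payload` is a hypothetical helper that escapes only backslashes and double quotes, so multi-line messages are out of scope.

```shell
#!/bin/sh
# Hypothetical helper: build a minimal Slack incoming-webhook payload.
# Escapes backslashes and double quotes only; newlines are not handled.
slack_payload() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"text":"%s"}\n' "$escaped"
}

# Example send (requires SLACK_WEBHOOK to be set in the environment):
# curl -fsS -X POST -H 'Content-Type: application/json' \
#   --data "$(slack_payload 'VAPORA Docker deployment successful!')" "$SLACK_WEBHOOK"
```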

Slack Message Examples

Build Success:

✅ VAPORA Artifact Build Complete
Mode: multiuser | Artifacts ready for deployment

Deployment Success:

✅ VAPORA Docker deployment successful!
Mode: multiuser | Environment: staging

Health Check Alert:

❌ VAPORA Health Check Failed
Target: kubernetes | Create issue for investigation

Rollback Alert:

🔙 VAPORA Rollback Executed
Target: kubernetes | Environment: production
Executed By: @user | Verify service health

Security Best Practices

Do:

  • Always run Kubernetes deployments with dry_run: true first
  • Review artifacts before production deployment
  • Enable branch protection rules on main
  • Use environment secrets (staging vs production)
  • Require PR reviews before merge
  • Monitor health checks after deployment
  • Keep a kubeconfig backup stored securely
  • Rotate secrets regularly

Don't:

  • Commit secrets to repository
  • Deploy directly to production without testing
  • Disable workflow validation steps
  • Skip health checks after deployment
  • Use same kubeconfig for all environments
  • Merge unreviewed PRs
  • Change production without approval
  • Share kubeconfig over unencrypted channels

Monitoring & Alerts

Automated Monitoring

  • Health checks: Every 15 minutes
  • Comprehensive diagnostics: Every 6 hours
  • Issue creation: On health check failures
  • Slack alerts: On critical failures

Manual Monitoring

# Real-time logs
kubectl logs -f deployment/vapora-backend -n vapora

# Watch pods
kubectl get pods -n vapora --watch

# Metrics
kubectl top pods -n vapora

# Events
kubectl get events -n vapora --sort-by='.lastTimestamp'

FAQ

Q: Can I deploy multiple modes simultaneously?
A: No, workflows serialize deployments. Deploy to staging first, then production.

Q: How do I revert a failed deployment?
A: Use the Rollback Deployment workflow. For Kubernetes it reverts to the previous revision by default (revision 0); Docker rollbacks follow the guided manual procedure.

Q: What if validation fails?
A: Fix the configuration error and push again. The workflow re-runs automatically.

Q: Can I skip health checks?
A: No, health checks are mandatory for safety. They run automatically after each deployment.

Q: How long do artifacts stay?
A: 30-90 days depending on artifact type. Download and archive important ones.

Q: What if kubeconfig expires?
A: Update the secret in GitHub under Settings → Secrets and variables → Actions with the new kubeconfig.

Q: Can I deploy to multiple clusters?
A: Yes, create separate secrets (KUBE_CONFIG_PROD_US, KUBE_CONFIG_PROD_EU) and workflows.


Support & Documentation

  • Workflow Logs: Actions → [Workflow Name] → [Run] → View logs
  • Artifacts: Actions → [Workflow Name] → [Run] → Artifacts section
  • Issues: GitHub Issues automatically created on failures
  • Slack: Check #deployments channel for notifications

Last Updated: January 12, 2026
Status: Complete and production-ready
Workflows: 5 (validate-and-build, deploy-docker, deploy-kubernetes, health-check, rollback)