provisioning/docs/src/getting-started/verification.md

373 lines
6.7 KiB
Markdown
Raw Normal View History

2026-01-17 03:58:28 +00:00
# Verification
Validate the Provisioning platform installation and infrastructure health.
## Installation Verification
### CLI and Core Tools
```bash
# Check CLI version
provisioning version
# Verify Nushell
nu --version # 0.109.1+
# Verify Nickel
nickel --version # 1.15.1+
# Check SOPS and Age
sops --version # 3.10.2+
age --version # 1.2.1+
# Verify K9s
k9s version # 0.50.6+
```
### Configuration Validation
```bash
# Validate all configuration files
provisioning validate config
# Check environment
provisioning env
# Show all configuration
provisioning allenv
```
Expected output:
```text
Configuration validation: PASSED
- User config: ~/.config/provisioning/user_config.yaml ✓
- System defaults: provisioning/config/config.defaults.toml ✓
- Provider credentials: configured ✓
```
### Provider Connectivity
```bash
# List available providers
provisioning providers
# Test provider connection (UpCloud example)
provisioning provider test upcloud
# Test provider connection (AWS example)
provisioning provider test aws
```
## Workspace Verification
### Workspace Structure
```bash
# List workspaces
provisioning workspace list
# Show current workspace
provisioning workspace current
# Verify workspace structure
ls -la <workspace-name>/
```
Expected structure:
```text
workspace-name/
├── infra/ # Infrastructure Nickel schemas
├── config/ # Workspace configuration
├── extensions/ # Custom extensions
└── runtime/ # State and logs
```
### Workspace Configuration
```bash
# Show workspace configuration
provisioning config show
# Validate workspace-specific config
provisioning validate config --workspace <name>
```
## Infrastructure Verification
### Server Health
```bash
# List all servers
provisioning server list
# Check server status
provisioning server status <hostname>
# Test SSH connectivity
provisioning server ssh <hostname> -- echo "Connection successful"
```
### Task Service Health
```bash
# List installed task services
provisioning taskserv list
# Check service status
provisioning taskserv status <service-name>
# Verify service health
provisioning taskserv health <service-name>
```
### Cluster Health
For Kubernetes clusters:
```bash
# SSH to control plane
provisioning server ssh <control-hostname>
# Check cluster nodes
kubectl get nodes
# Check system pods
kubectl get pods -n kube-system
# Check cluster info
kubectl cluster-info
```
## Platform Services Verification
### Orchestrator Service
```bash
# Check orchestrator status
curl [http://localhost:5000/health](http://localhost:5000/health)
# View orchestrator version
curl [http://localhost:5000/version](http://localhost:5000/version)
# List active workflows
provisioning workflow list
```
Expected response:
```json
{
"status": "healthy",
"version": "x.x.x",
"uptime": "2h 15m"
}
```
### Control Center
```bash
# Check control center
curl [http://localhost:8080/health](http://localhost:8080/health)
# Access web UI
open [http://localhost:8080](http://localhost:8080) # macOS
xdg-open [http://localhost:8080](http://localhost:8080) # Linux
```
### Native Plugins
```bash
# List registered plugins
nu -c "plugin list"
# Verify plugins loaded
nu -c "plugin use nu_plugin_auth; plugin use nu_plugin_kms; plugin use nu_plugin_orchestrator"
```
## Security Verification
### Secrets Management
```bash
# Verify SOPS configuration
cat ~/.config/provisioning/.sops.yaml
# Test encryption/decryption
echo "test secret" > /tmp/test-secret.txt
sops -e /tmp/test-secret.txt > /tmp/test-secret.enc
sops -d /tmp/test-secret.enc
rm /tmp/test-secret.*
```
### SSH Keys
```bash
# Verify SSH keys exist
ls -la ~/.ssh/provisioning_*
# Test SSH key permissions
ls -l ~/.ssh/provisioning_* | awk '{print $1}'
# Should show: -rw------- (600)
```
### Encrypted Configuration
```bash
# Verify user config encryption
file ~/.config/provisioning/user_config.yaml
# Should show: SOPS encrypted data or YAML
```
## Troubleshooting Common Issues
### CLI Not Found
```bash
# Check PATH
echo $PATH | tr ':' '
' | grep provisioning
# Verify symlink
ls -l /usr/local/bin/provisioning
# Try direct execution
/path/to/project-provisioning/provisioning/core/cli/provisioning version
```
### Provider Authentication Fails
```bash
# Verify credentials are set
provisioning config show | grep -A5 providers
# Test with debug mode
provisioning --debug provider test <provider-name>
# Check network connectivity
ping -c 3 api.upcloud.com # UpCloud
ping -c 3 ec2.amazonaws.com # AWS
```
### Nickel Schema Errors
```bash
# Type-check schema
nickel typecheck <schema-file>.ncl
# Validate with verbose output
provisioning validate config --verbose
# Format Nickel file
nickel fmt <schema-file>.ncl
```
### Server SSH Fails
```bash
# Verify SSH key
ssh-add -l | grep provisioning
# Test direct SSH
ssh -i ~/.ssh/provisioning_rsa root@<server-ip>
# Check server status
provisioning server status <hostname>
```
### Task Service Installation Fails
```bash
# Check dependencies
provisioning taskserv dependencies <service>
# Verify server has resources
provisioning server ssh <hostname> -- df -h
provisioning server ssh <hostname> -- free -h
# Enable debug mode
provisioning --debug taskserv create <service>
```
## Health Check Checklist
Complete verification checklist:
```bash
# Core tools
[x] Nushell 0.109.1+
[x] Nickel 1.15.1+
[x] SOPS 3.10.2+
[x] Age 1.2.1+
[x] K9s 0.50.6+
# Configuration
[x] User config valid
[x] Provider credentials configured
[x] Workspace initialized
# Provider connectivity
[x] Provider API accessible
[x] Authentication successful
# Infrastructure (if deployed)
[x] Servers running
[x] SSH connectivity working
[x] Task services installed
[x] Cluster healthy
# Platform services (if running)
[x] Orchestrator responsive
[x] Control center accessible
[x] Plugins registered
# Security
[x] Secrets encrypted
[x] SSH keys secured
[x] Configuration protected
```
## Performance Verification
### Response Times
```bash
# CLI response time
time provisioning version
# Provider API response time
time provisioning provider test <provider>
# Orchestrator response time
time curl [http://localhost:5000/health](http://localhost:5000/health)
```
Acceptable ranges:
- CLI commands: <1 second
- Provider API: <3 seconds
- Orchestrator API: <100ms
### Resource Usage
```bash
# Check system resources
htop # Interactive process viewer
# Check disk usage
df -h
# Check memory usage
free -h
```
## Next Steps
Once verification is complete:
- [Workspace Management](../guides/workspace-management.md) - Manage multiple workspaces
- [Nickel Guide](../infrastructure/nickel-guide.md) - Master infrastructure-as-code
- [Batch Workflows](../infrastructure/batch-workflows.md) - Multi-cloud orchestration