# Storage Backends Guide
This guide covers the orchestrator's storage backend options, how to configure each one, and how to migrate data between them.
## Overview
The orchestrator supports three storage backends through a pluggable architecture:
- Filesystem - JSON file-based storage (default)
- SurrealDB Embedded - Local database with RocksDB engine
- SurrealDB Server - Remote SurrealDB server connection
All backends implement the same `TaskStorage` trait, so the orchestrator talks to each of them through one interface and data can be migrated from any backend to any other.
## Backend Comparison
| Feature | Filesystem | SurrealDB Embedded | SurrealDB Server |
|---|---|---|---|
| Setup Complexity | Minimal | Low | Medium |
| External Dependencies | None | None | SurrealDB Server |
| Storage Format | JSON Files | RocksDB | Remote DB |
| ACID Transactions | No | Yes | Yes |
| Authentication/RBAC | Basic | Advanced | Advanced |
| Real-time Subscriptions | No | Yes | Yes |
| Audit Logging | Manual | Automatic | Automatic |
| Metrics Collection | Basic | Advanced | Advanced |
| Task Dependencies | Simple | Graph-based | Graph-based |
| Horizontal Scaling | No | No | Yes |
| Backup/Recovery | File Copy | Database Backup | Server Backup |
| Performance | Good | Excellent | Variable |
| Memory Usage | Low | Medium | Low |
| Disk Usage | Medium | Optimized | Minimal |
## 1. Filesystem Backend
### Overview
The default storage backend using JSON files for task persistence. Ideal for development and simple deployments.
### Configuration
```bash
# Default configuration
./orchestrator --storage-type filesystem --data-dir ./data
# Custom data directory
./orchestrator --storage-type filesystem --data-dir /var/lib/orchestrator
```
### File Structure
```plaintext
data/
└── queue.rkvs/
    ├── tasks/
    │   ├── uuid1.json       # Individual task records
    │   ├── uuid2.json
    │   └── ...
    └── queue/
        ├── uuid1.json       # Queue entries with priority
        ├── uuid2.json
        └── ...
```
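
Because each task and queue entry is one JSON file, the state of the store can be summarized with ordinary shell tools. The helpers below are a minimal sketch, assuming the layout shown above; `count_json_files` and `summarize_store` are hypothetical names, not part of the orchestrator.

```bash
# Count the *.json files directly inside a directory (non-recursive).
# Prints 0 when the directory does not exist.
count_json_files() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo 0
        return 0
    fi
    find "$dir" -maxdepth 1 -type f -name '*.json' | wc -l | tr -d ' '
}

# Print a one-line summary for a data directory that follows the layout above.
summarize_store() {
    root="$1/queue.rkvs"
    echo "tasks=$(count_json_files "$root/tasks") queue=$(count_json_files "$root/queue")"
}
```

For example, `summarize_store ./data` prints a line of the form `tasks=N queue=M`.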
### Features
- ✅ **Simple Setup**: No external dependencies
- ✅ **Transparency**: Human-readable JSON files
- ✅ **Backup**: Standard file system tools
- ✅ **Debugging**: Direct file inspection
- ❌ **ACID**: No transaction guarantees
- ❌ **Concurrency**: Basic file locking
- ❌ **Advanced Features**: Limited auth/audit
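
Since a backup is just a copy of the data directory, a timestamped archive is often enough. The following is a sketch, not an official tool: `backup_data_dir` is a hypothetical helper, and it assumes the orchestrator is stopped (or idle) while the archive is written, because the filesystem backend offers no transactional snapshot.

```bash
# Archive a data directory into a timestamped tarball and print the archive path.
# Assumption: nothing is writing to the data directory while this runs.
backup_data_dir() {
    data_dir="$1"
    backup_dir="$2"
    [ -d "$data_dir" ] || { echo "no such directory: $data_dir" >&2; return 1; }
    mkdir -p "$backup_dir" || return 1
    archive="$backup_dir/orchestrator-data-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -C makes the paths inside the archive relative to the parent of the data directory
    tar -czf "$archive" -C "$(dirname "$data_dir")" "$(basename "$data_dir")" || return 1
    echo "$archive"
}
```

A call such as `backup_data_dir ./data ./backups` writes the archive under `./backups` and prints its path.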
### Best Use Cases
- Development environments
- Single-instance deployments
- Simple task orchestration
- Environments that restrict or prohibit external dependencies
## 2. SurrealDB Embedded
### Overview
Local SurrealDB database using RocksDB storage engine. Provides advanced database features without external dependencies.
### Configuration
```bash
# Build with SurrealDB support
cargo build --features surrealdb
# Run with embedded SurrealDB
./orchestrator --storage-type surrealdb-embedded --data-dir ./data
```
### Database Schema
- **tasks**: Main task records with full metadata
- **task_queue**: Priority queue with scheduling info
- **users**: Authentication and RBAC
- **audit_log**: Complete operation history
- **metrics**: Performance and usage statistics
- **task_events**: Real-time event stream
### Features
- ✅ **ACID Transactions**: Reliable data consistency
- ✅ **Advanced Queries**: SQL-like syntax with graph support
- ✅ **Real-time Events**: Live query subscriptions
- ✅ **Built-in Auth**: User management and RBAC
- ✅ **Audit Logging**: Automatic operation tracking
- ✅ **No External Deps**: Self-contained database
- ❌ **Horizontal Scaling**: Single-node only
### Configuration Options
```bash
# Custom database location
./orchestrator --storage-type surrealdb-embedded \
  --data-dir /var/lib/orchestrator/db
# With specific namespace/database
./orchestrator --storage-type surrealdb-embedded \
  --data-dir ./data \
  --surrealdb-namespace production \
  --surrealdb-database orchestrator
```
### Best Use Cases
- Production single-node deployments
- Applications requiring ACID guarantees
- Advanced querying and analytics
- Real-time monitoring requirements
- Audit logging compliance
## 3. SurrealDB Server
### Overview
Remote SurrealDB server connection providing full distributed database capabilities with horizontal scaling.
### Prerequisites
1. **SurrealDB Server**: Running instance accessible via network
2. **Authentication**: Valid credentials for database access
3. **Network**: Reliable connectivity to SurrealDB server
### SurrealDB Server Setup
```bash
# Install SurrealDB
curl -sSf https://install.surrealdb.com | sh
# Start server
surreal start --log trace --user root --pass root memory
# Or with file storage
surreal start --log trace --user root --pass root file:orchestrator.db
# Or with TiKV (distributed)
surreal start --log trace --user root --pass root tikv://localhost:2379
```
### Configuration
```bash
# Basic server connection
./orchestrator --storage-type surrealdb-server \
  --surrealdb-url ws://localhost:8000 \
  --surrealdb-username admin \
  --surrealdb-password secret
# Production configuration
./orchestrator --storage-type surrealdb-server \
  --surrealdb-url wss://surreal.production.com:8000 \
  --surrealdb-namespace prod \
  --surrealdb-database orchestrator \
  --surrealdb-username orchestrator-service \
  --surrealdb-password "$SURREALDB_PASSWORD"
```
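
The production example reads the password from `$SURREALDB_PASSWORD`. If that variable is unset, the shell silently passes an empty string, so it can help to check first. The helper below is a sketch; `require_env` is a hypothetical name and not something the orchestrator provides.

```bash
# Fail with a clear message when a required environment variable is unset or empty.
# Returns 2 for a malformed variable name, 1 for a missing value, 0 otherwise.
require_env() {
    name="$1"
    case "$name" in
        ''|[0-9]*|*[!A-Za-z0-9_]*)
            echo "invalid variable name: $name" >&2
            return 2 ;;
    esac
    # The name was validated above, so this indirect lookup is safe to eval
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "error: environment variable $name is not set" >&2
        return 1
    fi
}
```

A launch script could then start with `require_env SURREALDB_PASSWORD || exit 1` before invoking `./orchestrator`.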
### Features
- ✅ **Distributed**: Multi-node clustering support
- ✅ **Horizontal Scaling**: Handle massive workloads
- ✅ **Multi-tenancy**: Namespace and database isolation
- ✅ **Real-time Collaboration**: Multiple orchestrator instances
- ✅ **Advanced Security**: Enterprise authentication
- ✅ **High Availability**: Fault-tolerant deployments
- ❌ **Complexity**: Requires server management
- ❌ **Network Dependency**: Requires reliable connectivity
### Best Use Cases
- Distributed production deployments
- Multiple orchestrator instances
- High availability requirements
- Large-scale task orchestration
- Multi-tenant environments
## Migration Between Backends
### Migration Tool
Use the migration script to move data between any backend combination:
```bash
# Interactive migration wizard
./scripts/migrate-storage.nu --interactive
# Direct migration examples
./scripts/migrate-storage.nu --from filesystem --to surrealdb-embedded \
  --source-dir ./data --target-dir ./surrealdb-data
./scripts/migrate-storage.nu --from surrealdb-embedded --to surrealdb-server \
  --source-dir ./data \
  --surrealdb-url ws://localhost:8000 \
  --username admin --password secret
# Validation and dry-run
./scripts/migrate-storage.nu validate --from filesystem --to surrealdb-embedded
./scripts/migrate-storage.nu --from filesystem --to surrealdb-embedded --dry-run
```
### Migration Features
- **Data Integrity**: Complete validation before and after migration
- **Progress Tracking**: Real-time progress with throughput metrics
- **Rollback Support**: Automatic rollback on failures
- **Selective Migration**: Filter by task status, date range, etc.
- **Batch Processing**: Configurable batch sizes for performance
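
One way to combine the dry-run and the real migration is a small wrapper that only proceeds when the dry run succeeds. This is a sketch under assumptions: `safe_migrate` and the `MIGRATE_CMD` variable are hypothetical, and the wrapper assumes the script exits non-zero when a dry run finds problems. It uses only the `--from`, `--to`, and `--dry-run` flags shown above.

```bash
# Path to the migration script; can be overridden, e.g. for testing.
MIGRATE_CMD="${MIGRATE_CMD:-./scripts/migrate-storage.nu}"

# Run the migration with --dry-run first and repeat it for real only on success.
# Usage: safe_migrate <from> <to> [extra flags, passed through unchanged]
safe_migrate() {
    from="$1"
    to="$2"
    shift 2
    if ! "$MIGRATE_CMD" --from "$from" --to "$to" "$@" --dry-run; then
        echo "dry run failed; migration not started" >&2
        return 1
    fi
    "$MIGRATE_CMD" --from "$from" --to "$to" "$@"
}
```

For example: `safe_migrate filesystem surrealdb-embedded --source-dir ./data --target-dir ./surrealdb-data`.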
### Migration Scenarios
#### Development to Production
```bash
# Migrate from filesystem (dev) to SurrealDB embedded (production)
./scripts/migrate-storage.nu --from filesystem --to surrealdb-embedded \
  --source-dir ./dev-data --target-dir ./prod-data \
  --batch-size 100 --verify
```
#### Scaling Up
```bash
# Migrate from embedded to server for distributed setup
./scripts/migrate-storage.nu --from surrealdb-embedded --to surrealdb-server \
  --source-dir ./data \
  --surrealdb-url ws://production-surreal:8000 \
  --username orchestrator --password "$PROD_PASSWORD" \
  --namespace production --database main
```
#### Disaster Recovery
```bash
# Migrate from server back to filesystem for emergency backup
./scripts/migrate-storage.nu --from surrealdb-server --to filesystem \
  --surrealdb-url ws://failing-server:8000 \
  --username admin --password "$PASSWORD" \
  --target-dir ./emergency-backup
```
## Performance Considerations
### Filesystem
- **Strengths**: Low memory usage, simple debugging
- **Limitations**: File I/O bottlenecks, no concurrent writes
- **Optimization**: Fast SSD, regular cleanup of old tasks
### SurrealDB Embedded
- **Strengths**: Excellent single-node performance, ACID guarantees
- **Limitations**: Memory usage scales with data size
- **Optimization**: Adequate RAM, SSD storage, regular compaction
### SurrealDB Server
- **Strengths**: Horizontal scaling, shared state
- **Limitations**: Network latency, server dependency
- **Optimization**: Low-latency network, connection pooling, server tuning
## Security Considerations
### Filesystem
- **File Permissions**: Restrict access to data directory
- **Backup Security**: Encrypt backup files
- **Network**: No network exposure
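
Restricting access to the data directory can be done with a symbolic `chmod`. The sketch below assumes the orchestrator runs as the directory's owner and that no other account needs to read the files; `secure_data_dir` and `mode_of` are hypothetical helper names.

```bash
# Restrict a data directory to its owner: no group or other access at any depth.
secure_data_dir() {
    data_dir="$1"
    [ -d "$data_dir" ] || { echo "no such directory: $data_dir" >&2; return 1; }
    # u+rwX keeps the owner's access (execute only on directories and files
    # that are already executable); go-rwx removes all group/other permissions
    chmod -R u+rwX,go-rwx "$data_dir"
}

# Print the 10-character mode string for a path, for a quick visual check.
mode_of() {
    ls -ld "$1" | cut -c1-10
}
```

After `secure_data_dir ./data`, `mode_of ./data` should report a mode with no group or other bits set.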
### SurrealDB Embedded
- **File Permissions**: Secure database files
- **Encryption**: Database-level encryption available
- **Access Control**: Built-in user management
### SurrealDB Server
- **Network Security**: Use TLS/WSS connections
- **Authentication**: Strong passwords, regular rotation
- **Authorization**: Role-based access control
- **Audit**: Complete operation logging
## Troubleshooting
### Common Issues
#### Filesystem Backend
```bash
# Permission issues
sudo chown -R "$USER:$USER" ./data
chmod -R 755 ./data
# Corrupted JSON files (deleting the file discards that task record)
rm ./data/queue.rkvs/tasks/corrupted-file.json
```
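
Deleting a corrupted file is irreversible. A more cautious variant moves it aside so it can still be inspected or repaired. This is a sketch; `quarantine_file` is a hypothetical helper and the quarantine location is an arbitrary choice.

```bash
# Move a suspect file into a quarantine directory instead of deleting it.
# Prints the new location so the file can be inspected or restored later.
quarantine_file() {
    file="$1"
    quarantine_dir="$2"
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    mkdir -p "$quarantine_dir" || return 1
    # A timestamp suffix avoids clobbering an earlier quarantined copy
    dest="$quarantine_dir/$(basename "$file").$(date +%Y%m%d-%H%M%S)"
    mv "$file" "$dest" || return 1
    echo "$dest"
}
```

For example: `quarantine_file ./data/queue.rkvs/tasks/corrupted-file.json ./data-quarantine`.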
#### SurrealDB Embedded
```bash
# Database corruption (this deletes the database; confirm a backup exists first)
rm -rf ./data/orchestrator.db
# Restore from backup or re-initialize
# Permission issues
sudo chown -R "$USER:$USER" ./data
```
#### SurrealDB Server
```bash
# Connection issues
telnet surreal-server 8000
# Check server status and network connectivity
# Authentication failures
# Verify credentials and user permissions
```
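
A single connectivity probe can fail for transient reasons, for instance while the server is still starting. A generic retry loop helps tell a brief outage from a persistent one. The helper below is a sketch; `retry` is a hypothetical function, and it uses plain global shell variables, so it should not be nested.

```bash
# Run a command up to <attempts> times, sleeping <delay> seconds between tries.
# Usage: retry <attempts> <delay> <command> [args...]
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}
```

Where `nc` is installed, a port check could be written as `retry 5 2 nc -z surreal-server 8000`.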
### Debugging Commands
```bash
# List available storage types
./orchestrator --help | grep storage-type
# Validate configuration
./orchestrator --storage-type filesystem --data-dir ./data --dry-run
# Test migration
./scripts/migrate-storage.nu validate --from filesystem --to surrealdb-embedded
# Monitor migration progress
./scripts/migrate-storage.nu --from filesystem --to surrealdb-embedded --verbose
```
## Recommendations
### Development
- **Use**: Filesystem backend
- **Rationale**: Simple setup, easy debugging, no external dependencies
### Single-Node Production
- **Use**: SurrealDB Embedded
- **Rationale**: ACID guarantees, advanced features, no external dependencies
### Distributed Production
- **Use**: SurrealDB Server
- **Rationale**: Horizontal scaling, high availability, multi-instance support
### Migration Path
1. **Start**: Filesystem (development)
2. **Scale**: SurrealDB Embedded (single-node production)
3. **Distribute**: SurrealDB Server (multi-node production)
This progressive approach allows teams to start simple and scale as requirements grow, with seamless migration between each stage.
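
The recommendations above can be encoded as a small lookup for use in deployment scripts. This is only a sketch: `storage_type_for` and the profile names `development`, `single-node`, and `distributed` are hypothetical labels chosen for illustration, while the printed values are the `--storage-type` values used throughout this guide.

```bash
# Map a deployment profile to the storage type recommended in this guide.
storage_type_for() {
    case "$1" in
        development)  echo "filesystem" ;;
        single-node)  echo "surrealdb-embedded" ;;
        distributed)  echo "surrealdb-server" ;;
        *)
            echo "unknown profile: $1 (expected development, single-node, or distributed)" >&2
            return 1 ;;
    esac
}
```

A launcher could then pass `--storage-type "$(storage_type_for single-node)"` to the orchestrator.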