Storage Backends Guide
This guide covers the orchestrator's storage backends: what each one offers, how to configure it, and how to migrate data between them.
Overview
The orchestrator supports three storage backends through a pluggable architecture:
- Filesystem - JSON file-based storage (default)
- SurrealDB Embedded - Local database with RocksDB engine
- SurrealDB Server - Remote SurrealDB server connection
All backends implement the same TaskStorage trait, ensuring consistent behavior and seamless migration.
Backend Comparison
| Feature | Filesystem | SurrealDB Embedded | SurrealDB Server |
|---|---|---|---|
| Setup Complexity | Minimal | Low | Medium |
| External Dependencies | None | None | SurrealDB Server |
| Storage Format | JSON Files | RocksDB | Remote DB |
| ACID Transactions | No | Yes | Yes |
| Authentication/RBAC | Basic | Advanced | Advanced |
| Real-time Subscriptions | No | Yes | Yes |
| Audit Logging | Manual | Automatic | Automatic |
| Metrics Collection | Basic | Advanced | Advanced |
| Task Dependencies | Simple | Graph-based | Graph-based |
| Horizontal Scaling | No | No | Yes |
| Backup/Recovery | File Copy | Database Backup | Server Backup |
| Performance | Good | Excellent | Variable |
| Memory Usage | Low | Medium | Low |
| Disk Usage | Medium | Optimized | Minimal |
1. Filesystem Backend
Overview
The default storage backend using JSON files for task persistence. Ideal for development and simple deployments.
Configuration
# Default configuration
./orchestrator --storage-type filesystem --data-dir ./data
# Custom data directory
./orchestrator --storage-type filesystem --data-dir /var/lib/orchestrator
File Structure
data/
└── queue.rkvs/
    ├── tasks/
    │   ├── uuid1.json    # Individual task records
    │   ├── uuid2.json
    │   └── ...
    └── queue/
        ├── uuid1.json    # Queue entries with priority
        ├── uuid2.json
        └── ...
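Because the records are plain files, the layout can be inspected with ordinary shell tools. The sketch below assumes a record's ID is its file name without the `.json` suffix, which the tree suggests but this guide does not state. `list_ids` and `queued_without_task` are hypothetical helper names, not part of the orchestrator.

```shell
# Hypothetical helpers for inspecting the filesystem backend's layout.
# Assumption: a record's ID is its file name minus the .json suffix.

# Print the IDs found in one subdirectory ("tasks" or "queue"), sorted.
list_ids() (
    data_dir=$1
    kind=$2
    for f in "$data_dir/queue.rkvs/$kind"/*.json; do
        [ -e "$f" ] || continue          # the glob matched nothing
        basename "$f" .json
    done | sort
)

# Print the IDs of queue entries that have no matching task record.
queued_without_task() (
    data_dir=$1
    list_ids "$data_dir" queue | while IFS= read -r id; do
        [ -e "$data_dir/queue.rkvs/tasks/$id.json" ] || printf '%s\n' "$id"
    done
)
```

For example, `queued_without_task ./data` prints nothing when every queue entry has a task record.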
Features
- ✅ Simple Setup: No external dependencies
- ✅ Transparency: Human-readable JSON files
- ✅ Backup: Standard file system tools
- ✅ Debugging: Direct file inspection
- ❌ ACID: No transaction guarantees
- ❌ Concurrency: Basic file locking only, no concurrent writes
- ❌ Advanced Features: Limited auth/audit
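As an illustration of backing up with standard file system tools, the sketch below archives the data directory into a timestamped tarball. `backup_data_dir` and the archive naming scheme are hypothetical. It also assumes the orchestrator is stopped while the archive is written, so that no file changes mid-copy.

```shell
# Hypothetical helper: archive the data directory into a timestamped tarball
# and print the archive path.
# Assumption: the orchestrator is stopped, so no file changes during the copy.
backup_data_dir() (
    data_dir=$1
    backup_dir=$2
    stamp=$(date +%Y%m%dT%H%M%S)
    archive="$backup_dir/orchestrator-data-$stamp.tar.gz"
    mkdir -p "$backup_dir" || exit 1
    # -C makes the archived paths relative to the data directory's parent.
    tar -czf "$archive" -C "$(dirname "$data_dir")" "$(basename "$data_dir")" || exit 1
    printf '%s\n' "$archive"
)
```

To restore, unpack the archive into the parent of the data directory with `tar -xzf <archive> -C <parent-dir>`.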
Best Use Cases
- Development environments
- Single-instance deployments
- Simple task orchestration
- Environments with strict dependency requirements
2. SurrealDB Embedded
Overview
Local SurrealDB database using RocksDB storage engine. Provides advanced database features without external dependencies.
Configuration
# Build with SurrealDB support
cargo build --features surrealdb
# Run with embedded SurrealDB
./orchestrator --storage-type surrealdb-embedded --data-dir ./data
Database Schema
- tasks: Main task records with full metadata
- task_queue: Priority queue with scheduling info
- users: Authentication and RBAC
- audit_log: Complete operation history
- metrics: Performance and usage statistics
- task_events: Real-time event stream
Features
- ✅ ACID Transactions: Reliable data consistency
- ✅ Advanced Queries: SQL-like syntax with graph support
- ✅ Real-time Events: Live query subscriptions
- ✅ Built-in Auth: User management and RBAC
- ✅ Audit Logging: Automatic operation tracking
- ✅ No External Deps: Self-contained database
- ❌ Horizontal Scaling: Single-node only
Configuration Options
# Custom database location
./orchestrator --storage-type surrealdb-embedded \
    --data-dir /var/lib/orchestrator/db
# With specific namespace/database
./orchestrator --storage-type surrealdb-embedded \
    --data-dir ./data \
    --surrealdb-namespace production \
    --surrealdb-database orchestrator
Best Use Cases
- Production single-node deployments
- Applications requiring ACID guarantees
- Advanced querying and analytics
- Real-time monitoring requirements
- Audit logging compliance
3. SurrealDB Server
Overview
Connects to a remote SurrealDB server. Several orchestrator instances can share one database, and the server can scale horizontally when it runs on a distributed storage engine such as TiKV.
Prerequisites
- SurrealDB Server: Running instance accessible via network
- Authentication: Valid credentials for database access
- Network: Reliable connectivity to SurrealDB server
SurrealDB Server Setup
# Install SurrealDB
curl -sSf https://install.surrealdb.com | sh
# Start server with in-memory storage (data is lost when the server stops)
surreal start --log trace --user root --pass root memory
# Or with file storage
surreal start --log trace --user root --pass root file:orchestrator.db
# Or with TiKV (distributed)
surreal start --log trace --user root --pass root tikv://localhost:2379
Configuration
# Basic server connection
./orchestrator --storage-type surrealdb-server \
    --surrealdb-url ws://localhost:8000 \
    --surrealdb-username admin \
    --surrealdb-password secret
# Production configuration
./orchestrator --storage-type surrealdb-server \
    --surrealdb-url wss://surreal.production.com:8000 \
    --surrealdb-namespace prod \
    --surrealdb-database orchestrator \
    --surrealdb-username orchestrator-service \
    --surrealdb-password "$SURREALDB_PASSWORD"
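A small wrapper can keep connection details out of scripts and fail early when one is missing. This is a sketch, not part of the orchestrator: `start_with_server_backend`, `ORCHESTRATOR_BIN`, and every environment variable except `SURREALDB_PASSWORD` are hypothetical names, and the namespace and database defaults are simply copied from the production example above.

```shell
# Hypothetical wrapper: start the orchestrator against a SurrealDB server,
# reading connection details from environment variables.
# Only the command-line flags come from this guide; the variable names
# (other than SURREALDB_PASSWORD) are assumptions of this sketch.
start_with_server_backend() (
    # Abort with a message on stderr if a required variable is unset or empty.
    : "${SURREALDB_URL:?must be set, for example wss://host:8000}"
    : "${SURREALDB_USERNAME:?must be set}"
    : "${SURREALDB_PASSWORD:?must be set}"

    "${ORCHESTRATOR_BIN:-./orchestrator}" \
        --storage-type surrealdb-server \
        --surrealdb-url "$SURREALDB_URL" \
        --surrealdb-namespace "${SURREALDB_NAMESPACE:-prod}" \
        --surrealdb-database "${SURREALDB_DATABASE:-orchestrator}" \
        --surrealdb-username "$SURREALDB_USERNAME" \
        --surrealdb-password "$SURREALDB_PASSWORD" \
        "$@"
)
```

Any extra arguments are passed through unchanged. Setting `ORCHESTRATOR_BIN=echo` prints the command line instead of running it, which is handy for checking the configuration.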
Features
- ✅ Distributed: Multi-node clustering support
- ✅ Horizontal Scaling: Handle massive workloads
- ✅ Multi-tenancy: Namespace and database isolation
- ✅ Real-time Collaboration: Multiple orchestrator instances
- ✅ Advanced Security: Enterprise authentication
- ✅ High Availability: Fault-tolerant deployments
- ❌ Complexity: Requires server management
- ❌ Network Dependency: Requires reliable connectivity
Best Use Cases
- Distributed production deployments
- Multiple orchestrator instances
- High availability requirements
- Large-scale task orchestration
- Multi-tenant environments
Migration Between Backends
Migration Tool
Use the migration script to move data between any backend combination:
# Interactive migration wizard
./scripts/migrate-storage.nu --interactive
# Direct migration examples
./scripts/migrate-storage.nu --from filesystem --to surrealdb-embedded \
    --source-dir ./data --target-dir ./surrealdb-data
./scripts/migrate-storage.nu --from surrealdb-embedded --to surrealdb-server \
    --source-dir ./data \
    --surrealdb-url ws://localhost:8000 \
    --username admin --password secret
# Validation and dry-run
./scripts/migrate-storage.nu validate --from filesystem --to surrealdb-embedded
./scripts/migrate-storage.nu --from filesystem --to surrealdb-embedded --dry-run
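The validate, dry-run, and migrate steps can be chained so that the real migration runs only after the first two succeed. The sketch below assumes the script exits with a non-zero status on failure, which this guide does not confirm. `safe_migrate` and `MIGRATE_CMD` are hypothetical names.

```shell
# Hypothetical wrapper: validate, dry-run, then migrate, stopping at the
# first step that fails.
# Assumption: the migration script exits non-zero when a step fails.
# MIGRATE_CMD is an override introduced by this sketch, mainly for testing.
safe_migrate() (
    from=$1
    to=$2
    shift 2
    migrate=${MIGRATE_CMD:-./scripts/migrate-storage.nu}

    "$migrate" validate --from "$from" --to "$to" \
        || { echo "validation failed, nothing migrated" >&2; exit 1; }
    "$migrate" --from "$from" --to "$to" "$@" --dry-run \
        || { echo "dry run failed, nothing migrated" >&2; exit 1; }
    # Both checks passed: run the real migration with the same options.
    "$migrate" --from "$from" --to "$to" "$@"
)
```

Usage mirrors the direct examples above: `safe_migrate filesystem surrealdb-embedded --source-dir ./data --target-dir ./surrealdb-data`.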
Migration Features
- Data Integrity: Complete validation before and after migration
- Progress Tracking: Real-time progress with throughput metrics
- Rollback Support: Automatic rollback on failures
- Selective Migration: Filter by task status, date range, etc.
- Batch Processing: Configurable batch sizes for performance
Migration Scenarios
Development to Production
# Migrate from filesystem (dev) to SurrealDB embedded (production)
./scripts/migrate-storage.nu --from filesystem --to surrealdb-embedded \
    --source-dir ./dev-data --target-dir ./prod-data \
    --batch-size 100 --verify
Scaling Up
# Migrate from embedded to server for distributed setup
./scripts/migrate-storage.nu --from surrealdb-embedded --to surrealdb-server \
    --source-dir ./data \
    --surrealdb-url ws://production-surreal:8000 \
    --username orchestrator --password "$PROD_PASSWORD" \
    --namespace production --database main
Disaster Recovery
# Migrate from server back to filesystem for emergency backup
./scripts/migrate-storage.nu --from surrealdb-server --to filesystem \
    --surrealdb-url ws://failing-server:8000 \
    --username admin --password "$PASSWORD" \
    --target-dir ./emergency-backup
Performance Considerations
Filesystem
- Strengths: Low memory usage, simple debugging
- Limitations: File I/O bottlenecks, no concurrent writes
- Optimization: Fast SSD, regular cleanup of old tasks
SurrealDB Embedded
- Strengths: Excellent single-node performance, ACID guarantees
- Limitations: Memory usage scales with data size
- Optimization: Adequate RAM, SSD storage, regular compaction
SurrealDB Server
- Strengths: Horizontal scaling, shared state
- Limitations: Network latency, server dependency
- Optimization: Low-latency network, connection pooling, server tuning
Security Considerations
Filesystem
- File Permissions: Restrict access to data directory
- Backup Security: Encrypt backup files
- Network: No network exposure
SurrealDB Embedded
- File Permissions: Secure database files
- Encryption: Database-level encryption available
- Access Control: Built-in user management
SurrealDB Server
- Network Security: Use TLS/WSS connections
- Authentication: Strong passwords, regular rotation
- Authorization: Role-based access control
- Audit: Complete operation logging
Troubleshooting
Common Issues
Filesystem Backend
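# Permission issues: make the current user the owner, then limit access to that user
sudo chown -R "$USER":"$USER" ./data
chmod -R u=rwX,go= ./data
# Corrupted JSON files: remove the damaged record (keep a copy first if it may be recoverable)
rm ./data/queue.rkvs/tasks/corrupted-file.json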
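When it is unclear which task files are damaged, they can be found by parsing each one. The sketch below moves unparseable files aside instead of deleting them. `json_ok`, `quarantine_corrupt_tasks`, and the `quarantine/` directory are hypothetical, and the check needs either `python3` or `jq` on the PATH.

```shell
# Hypothetical helpers: find task files that are not valid JSON and move them
# into a quarantine directory rather than deleting them.

# Return 0 if the file parses as JSON. Uses python3 when available, else jq.
json_ok() {
    if command -v python3 >/dev/null 2>&1; then
        python3 -m json.tool "$1" >/dev/null 2>&1
    else
        jq empty "$1" >/dev/null 2>&1
    fi
}

# Move every unparseable task file to <data_dir>/quarantine/ and report it.
# The quarantine location is an assumption of this sketch.
quarantine_corrupt_tasks() (
    data_dir=$1
    quarantine="$data_dir/quarantine"
    for f in "$data_dir/queue.rkvs/tasks"/*.json; do
        [ -e "$f" ] || continue          # the glob matched nothing
        if ! json_ok "$f"; then
            mkdir -p "$quarantine" || exit 1
            mv "$f" "$quarantine/" || exit 1
            printf 'quarantined %s\n' "$(basename "$f")"
        fi
    done
)
```

Run it as `quarantine_corrupt_tasks ./data` with the orchestrator stopped. Any queue entry with the same file name is left in place and may need separate cleanup.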
SurrealDB Embedded
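# Database corruption: stop the orchestrator, then remove the damaged database
# (this deletes all stored data; keep a copy first if no backup exists)
rm -rf ./data/orchestrator.db
# Restore from backup or re-initialize
# Permission issues
sudo chown -R "$USER":"$USER" ./data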
SurrealDB Server
# Connection issues
telnet surreal-server 8000
# Check server status and network connectivity
# Authentication failures
# Verify credentials and user permissions
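The connectivity check can be scripted against the same URL that is passed to `--surrealdb-url`. This is a sketch with hypothetical helper names. It assumes a netcat build that supports `-z` (connect without sending data) and `-w` (timeout), and it does not handle IPv6 literal addresses.

```shell
# Hypothetical helpers for a quick reachability check of the SurrealDB endpoint.

# Split ws://host:port or wss://host:port into "host port".
# When the port is omitted, fall back to the WebSocket defaults (80 / 443).
ws_host_port() (
    url=$1
    scheme=${url%%://*}
    rest=${url#*://}        # drop the scheme
    rest=${rest%%/*}        # drop any path
    case $rest in
        *:*) host=${rest%:*}; port=${rest##*:} ;;
        *)   host=$rest
             case $scheme in wss) port=443 ;; *) port=80 ;; esac ;;
    esac
    printf '%s %s\n' "$host" "$port"
)

# Return 0 if a TCP connection to the endpoint succeeds within 3 seconds.
# Assumes a netcat that supports -z and -w.
check_surreal_reachable() (
    # Intentionally unquoted: split "host port" into two positional parameters.
    set -- $(ws_host_port "$1")
    nc -z -w 3 "$1" "$2"
)
```

For example, `check_surreal_reachable ws://surreal-server:8000 && echo reachable`. A successful TCP connection only shows that the port is open; authentication problems still have to be checked separately.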
Debugging Commands
# List available storage types
./orchestrator --help | grep storage-type
# Validate configuration
./orchestrator --storage-type filesystem --data-dir ./data --dry-run
# Test migration
./scripts/migrate-storage.nu validate --from filesystem --to surrealdb-embedded
# Monitor migration progress
./scripts/migrate-storage.nu --from filesystem --to surrealdb-embedded --verbose
Recommendations
Development
- Use: Filesystem backend
- Rationale: Simple setup, easy debugging, no external dependencies
Single-Node Production
- Use: SurrealDB Embedded
- Rationale: ACID guarantees, advanced features, no external dependencies
Distributed Production
- Use: SurrealDB Server
- Rationale: Horizontal scaling, high availability, multi-instance support
Migration Path
- Start: Filesystem (development)
- Scale: SurrealDB Embedded (single-node production)
- Distribute: SurrealDB Server (multi-node production)
This progressive approach lets teams start simple and move to a more capable backend as requirements grow, using the migration tool at each stage.
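The recommendations above can be encoded in a deployment script so the backend choice is made in one place. The sketch below is illustrative only: `recommend_backend` and the three deployment labels are hypothetical names, while the printed values are the `--storage-type` values used throughout this guide.

```shell
# Hypothetical helper: map a deployment kind to the recommended storage type.
# The deployment labels are assumptions of this sketch.
recommend_backend() {
    case $1 in
        development) echo filesystem ;;
        single-node) echo surrealdb-embedded ;;
        distributed) echo surrealdb-server ;;
        *)
            echo "unknown deployment '$1' (expected development, single-node, or distributed)" >&2
            return 1
            ;;
    esac
}
```

For example: `./orchestrator --storage-type "$(recommend_backend single-node)" --data-dir ./data`.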