prvng_platform/crates/orchestrator/docs/storage-backends.md
2026-01-14 03:20:59 +00:00

# Storage Backends Guide

This document provides comprehensive guidance on the orchestrator's storage backend options, configuration, and migration between them.

## Overview

The orchestrator supports three storage backends through a pluggable architecture:

1. Filesystem - JSON file-based storage (default)
2. SurrealDB Embedded - Local database with RocksDB engine
3. SurrealDB Server - Remote SurrealDB server connection

All backends implement the same TaskStorage trait, ensuring consistent behavior and seamless migration.

## Backend Comparison

| Feature | Filesystem | SurrealDB Embedded | SurrealDB Server |
| --- | --- | --- | --- |
| Setup Complexity | Minimal | Low | Medium |
| External Dependencies | None | None | SurrealDB Server |
| Storage Format | JSON Files | RocksDB | Remote DB |
| ACID Transactions | No | Yes | Yes |
| Authentication/RBAC | Basic | Advanced | Advanced |
| Real-time Subscriptions | No | Yes | Yes |
| Audit Logging | Manual | Automatic | Automatic |
| Metrics Collection | Basic | Advanced | Advanced |
| Task Dependencies | Simple | Graph-based | Graph-based |
| Horizontal Scaling | No | No | Yes |
| Backup/Recovery | File Copy | Database Backup | Server Backup |
| Performance | Good | Excellent | Variable |
| Memory Usage | Low | Medium | Low |
| Disk Usage | Medium | Optimized | Minimal |

## 1. Filesystem Backend

### Overview

The default storage backend using JSON files for task persistence. Ideal for development and simple deployments.

### Configuration

```bash
# Default configuration
./orchestrator --storage-type filesystem --data-dir ./data

# Custom data directory
./orchestrator --storage-type filesystem --data-dir /var/lib/orchestrator
```

### File Structure

```text
data/
└── queue.rkvs/
    ├── tasks/
    │   ├── uuid1.json   # Individual task records
    │   ├── uuid2.json
    │   └── ...
    └── queue/
        ├── uuid1.json   # Queue entries with priority
        ├── uuid2.json
        └── ...
```

### Features

- Simple Setup: No external dependencies
- Transparency: Human-readable JSON files
- Backup: Standard file system tools
- Debugging: Direct file inspection
- ACID: No transaction guarantees
- Concurrency: Basic file locking
- Advanced Features: Limited auth/audit

### Best Use Cases

- Development environments
- Single-instance deployments
- Simple task orchestration
- Environments with strict dependency requirements

## 2. SurrealDB Embedded

### Overview

Local SurrealDB database using RocksDB storage engine. Provides advanced database features without external dependencies.

### Configuration

```bash
# Build with SurrealDB support
cargo build --features surrealdb

# Run with embedded SurrealDB
./orchestrator --storage-type surrealdb-embedded --data-dir ./data
```

### Database Schema

- tasks: Main task records with full metadata
- task_queue: Priority queue with scheduling info
- users: Authentication and RBAC
- audit_log: Complete operation history
- metrics: Performance and usage statistics
- task_events: Real-time event stream

### Features

- ACID Transactions: Reliable data consistency
- Advanced Queries: SQL-like syntax with graph support
- Real-time Events: Live query subscriptions
- Built-in Auth: User management and RBAC
- Audit Logging: Automatic operation tracking
- No External Deps: Self-contained database
- Horizontal Scaling: Single-node only

### Configuration Options

```bash
# Custom database location
./orchestrator --storage-type surrealdb-embedded \
  --data-dir /var/lib/orchestrator/db

# With specific namespace/database
./orchestrator --storage-type surrealdb-embedded \
  --data-dir ./data \
  --surrealdb-namespace production \
  --surrealdb-database orchestrator
```

### Best Use Cases

- Production single-node deployments
- Applications requiring ACID guarantees
- Advanced querying and analytics
- Real-time monitoring requirements
- Audit logging compliance

## 3. SurrealDB Server

### Overview

Remote SurrealDB server connection providing full distributed database capabilities with horizontal scaling.

### Prerequisites

1. SurrealDB Server: Running instance accessible via network
2. Authentication: Valid credentials for database access
3. Network: Reliable connectivity to SurrealDB server

### SurrealDB Server Setup

```bash
# Install SurrealDB
curl -sSf https://install.surrealdb.com | sh

# Start server
surreal start --log trace --user root --pass root memory

# Or with file storage
surreal start --log trace --user root --pass root file:orchestrator.db

# Or with TiKV (distributed)
surreal start --log trace --user root --pass root tikv://localhost:2379
```

### Configuration

```bash
# Basic server connection
./orchestrator --storage-type surrealdb-server \
  --surrealdb-url ws://localhost:8000 \
  --surrealdb-username admin \
  --surrealdb-password secret

# Production configuration
./orchestrator --storage-type surrealdb-server \
  --surrealdb-url wss://surreal.production.com:8000 \
  --surrealdb-namespace prod \
  --surrealdb-database orchestrator \
  --surrealdb-username orchestrator-service \
  --surrealdb-password "$SURREALDB_PASSWORD"
```

### Features

- Distributed: Multi-node clustering support
- Horizontal Scaling: Handle massive workloads
- Multi-tenancy: Namespace and database isolation
- Real-time Collaboration: Multiple orchestrator instances
- Advanced Security: Enterprise authentication
- High Availability: Fault-tolerant deployments
- Complexity: Requires server management
- Network Dependency: Requires reliable connectivity

### Best Use Cases

- Distributed production deployments
- Multiple orchestrator instances
- High availability requirements
- Large-scale task orchestration
- Multi-tenant environments

## Migration Between Backends

### Migration Tool

Use the migration script to move data between any backend combination:

```bash
# Interactive migration wizard
./scripts/migrate-storage.nu --interactive

# Direct migration examples
./scripts/migrate-storage.nu --from filesystem --to surrealdb-embedded \
  --source-dir ./data --target-dir ./surrealdb-data

./scripts/migrate-storage.nu --from surrealdb-embedded --to surrealdb-server \
  --source-dir ./data \
  --surrealdb-url ws://localhost:8000 \
  --username admin --password secret

# Validation and dry-run
./scripts/migrate-storage.nu validate --from filesystem --to surrealdb-embedded
./scripts/migrate-storage.nu --from filesystem --to surrealdb-embedded --dry-run
```

### Migration Features

- Data Integrity: Complete validation before and after migration
- Progress Tracking: Real-time progress with throughput metrics
- Rollback Support: Automatic rollback on failures
- Selective Migration: Filter by task status, date range, etc.
- Batch Processing: Configurable batch sizes for performance

### Migration Scenarios

#### Development to Production

```bash
# Migrate from filesystem (dev) to SurrealDB embedded (production)
./scripts/migrate-storage.nu --from filesystem --to surrealdb-embedded \
  --source-dir ./dev-data --target-dir ./prod-data \
  --batch-size 100 --verify
```

#### Scaling Up

```bash
# Migrate from embedded to server for distributed setup
./scripts/migrate-storage.nu --from surrealdb-embedded --to surrealdb-server \
  --source-dir ./data \
  --surrealdb-url ws://production-surreal:8000 \
  --username orchestrator --password "$PROD_PASSWORD" \
  --namespace production --database main
```

#### Disaster Recovery

```bash
# Migrate from server back to filesystem for emergency backup
./scripts/migrate-storage.nu --from surrealdb-server --to filesystem \
  --surrealdb-url ws://failing-server:8000 \
  --username admin --password "$PASSWORD" \
  --target-dir ./emergency-backup
```

## Performance Considerations

### Filesystem

- Strengths: Low memory usage, simple debugging
- Limitations: File I/O bottlenecks, no concurrent writes
- Optimization: Fast SSD, regular cleanup of old tasks

### SurrealDB Embedded

- Strengths: Excellent single-node performance, ACID guarantees
- Limitations: Memory usage scales with data size
- Optimization: Adequate RAM, SSD storage, regular compaction

### SurrealDB Server

- Strengths: Horizontal scaling, shared state
- Limitations: Network latency, server dependency
- Optimization: Low-latency network, connection pooling, server tuning

## Security Considerations

### Filesystem

- File Permissions: Restrict access to data directory
- Backup Security: Encrypt backup files
- Network: No network exposure

### SurrealDB Embedded

- File Permissions: Secure database files
- Encryption: Database-level encryption available
- Access Control: Built-in user management

### SurrealDB Server

- Network Security: Use TLS/WSS connections
- Authentication: Strong passwords, regular rotation
- Authorization: Role-based access control
- Audit: Complete operation logging

## Troubleshooting

### Common Issues

#### Filesystem Backend

```bash
# Permission issues
sudo chown -R $USER:$USER ./data
chmod -R 755 ./data

# Corrupted JSON files
rm ./data/queue.rkvs/tasks/corrupted-file.json
```

#### SurrealDB Embedded

```bash
# Database corruption
rm -rf ./data/orchestrator.db
# Restore from backup or re-initialize

# Permission issues
sudo chown -R $USER:$USER ./data
```

#### SurrealDB Server

```bash
# Connection issues
telnet surreal-server 8000
# Check server status and network connectivity

# Authentication failures
# Verify credentials and user permissions
```

### Debugging Commands

```bash
# List available storage types
./orchestrator --help | grep storage-type

# Validate configuration
./orchestrator --storage-type filesystem --data-dir ./data --dry-run

# Test migration
./scripts/migrate-storage.nu validate --from filesystem --to surrealdb-embedded

# Monitor migration progress
./scripts/migrate-storage.nu --from filesystem --to surrealdb-embedded --verbose
```

## Recommendations

### Development

- Use: Filesystem backend
- Rationale: Simple setup, easy debugging, no external dependencies

### Single-Node Production

- Use: SurrealDB Embedded
- Rationale: ACID guarantees, advanced features, no external dependencies

### Distributed Production

- Use: SurrealDB Server
- Rationale: Horizontal scaling, high availability, multi-instance support

### Migration Path

1. Start: Filesystem (development)
2. Scale: SurrealDB Embedded (single-node production)
3. Distribute: SurrealDB Server (multi-node production)

This progressive approach allows teams to start simple and scale as requirements grow, with seamless migration between each stage.
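
The recommendations and migration path above could be captured in a small shell helper for deployment scripts. This is only a sketch and not part of the orchestrator: the function names `recommended_storage_type` and `next_backend` and the stage labels `development`, `single-node`, and `distributed` are hypothetical. Only the three `--storage-type` values are taken from this guide.

```bash
#!/usr/bin/env bash

# Hypothetical helper: map a deployment stage label to the --storage-type
# value this guide recommends for it. The stage labels are assumptions.
recommended_storage_type() {
  case "$1" in
    development) echo "filesystem" ;;
    single-node) echo "surrealdb-embedded" ;;
    distributed) echo "surrealdb-server" ;;
    *)
      echo "unknown stage: $1" >&2
      return 1
      ;;
  esac
}

# Hypothetical helper: print the next backend on the migration path
# (filesystem -> surrealdb-embedded -> surrealdb-server).
# Prints nothing when the current backend is already the last stage.
next_backend() {
  case "$1" in
    filesystem) echo "surrealdb-embedded" ;;
    surrealdb-embedded) echo "surrealdb-server" ;;
    surrealdb-server) : ;;
    *)
      echo "unknown backend: $1" >&2
      return 1
      ;;
  esac
}
```

With these functions sourced, a launch script might run `./orchestrator --storage-type "$(recommended_storage_type development)" --data-dir ./data`, and `next_backend filesystem` would supply the `--to` value for the first migration step.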