2025-10-07 11:05:08 +01:00

# Polkadot Validator Task Service
## Overview
The Polkadot Validator task service provides a production-ready installation and configuration of a [Polkadot](https://polkadot.network/) validator node. Validators are critical infrastructure components that secure the Polkadot network by producing blocks, participating in consensus, and finalizing transactions. This service includes comprehensive security hardening, monitoring, and operational tools.
## Features
### Validator Core Functions
- **Block Production** - Aura-based block authoring and validation (note that the Polkadot relay chain itself authors blocks with BABE)
- **ELVES Consensus** - Ethereum-Like Validation Execution System support
- **Network Finality** - GRANDPA finality gadget participation
- **Hybrid Consensus** - Support for multiple consensus mechanisms
- **Consensus Participation** - Active participation in network consensus
- **Session Key Management** - Automated key generation, rotation, and backup
- **Slashing Protection** - Built-in protections against slashing conditions
### Security & Hardening
- **System Hardening** - Comprehensive systemd security configuration
- **Firewall Integration** - Automatic UFW/firewalld configuration
- **Fail2ban Protection** - Intrusion detection and prevention
- **Key Security** - Encrypted key backup with Age/SOPS support
- **Access Control** - SSH restrictions and user isolation
### Monitoring & Alerting
- **Health Monitoring** - Comprehensive system and validator health checks
- **Prometheus Integration** - Native metrics export for monitoring
- **Block Production Tracking** - Monitor validator performance
- **Network Connectivity** - Peer and network status monitoring
- **Alerting System** - Syslog integration with custom alerts
### Operational Features
- **Automated Setup** - Complete validator deployment and configuration
- **Session Rotation** - Automated session key rotation with safety checks
- **Backup & Recovery** - Secure key backup and restoration procedures
- **Performance Optimization** - Validator-optimized configuration settings
- **Multi-Chain Support** - Polkadot, Kusama, and Westend support
## Configuration
### Basic Validator Configuration
```kcl
validator: PolkadotValidator = {
    name: "polkadot-validator"
    version: "1.5.0"
    run_user: {
        name: "polkadot"
        home: "/home/polkadot"
    }
    chain: "polkadot"
    base_path: "/var/lib/polkadot"
    ports: {
        p2p_port: 30333
        prometheus_port: 9615
    }
    validator_mode: true
    telemetry_enabled: false
}
```
### Production Validator Configuration
```kcl
validator: PolkadotValidator = {
    name: "polkadot-validator-prod"
    version: "1.5.0"
    run_user: {
        name: "polkadot"
        group: "polkadot"
        home: "/opt/polkadot"
    }
    chain: "polkadot"
    base_path: "/var/lib/polkadot"
    ports: {
        p2p_port: 30333
        prometheus_port: 9615
    }
    validator_mode: true
    consensus: {
        type: "aura"  # Options: "aura", "elves", "hybrid"
        elves_support: true
        hybrid_fallback: false
    }
    rpc: {
        enabled: false  # Disabled for validator security
    }
    security: {
        firewall_enabled: true
        fail2ban_enabled: true
        ssh_restrictions: true
        key_backup_enabled: true
        backup_encryption_key: "/etc/polkadot/backup.key"
    }
    monitoring: {
        enabled: true
        prometheus_external: false
        health_check_interval: 60
        alert_thresholds: {
            peer_count_min: 10
            block_production_delay_max: 30
            finalization_lag_max: 10
        }
    }
    performance: {
        database_cache: 2048
        state_cache_size: 2147483648
        max_peers: 50
        sync_mode: "warp"
        pruning: {
            mode: "state"
            blocks_to_keep: 256
        }
    }
    session_keys: {
        rotation_enabled: true
        rotation_interval: "7d"
        backup_enabled: true
    }
    reserved_nodes: [
        "/ip4/10.0.1.10/tcp/30333/p2p/12D3KooW...",
        "/ip4/10.0.1.11/tcp/30333/p2p/12D3KooW..."
    ]
    bootnodes: [
        "/dns/bootnode-0.polkadot.io/tcp/30333/p2p/12D3KooWEyoppNCUx8Yx66oV9fJnriXwCcXwDDUA2kj6vnc6iDEp"
    ]
    log_level: "info"
    telemetry_enabled: false  # Disabled for validator privacy
}
```
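The `rotation_interval: "7d"` setting implies a time-based rotation schedule. The sketch below shows one way such an interval could be converted to seconds and compared with the time of the last rotation. The helper names `interval_to_seconds` and `rotation_due` are hypothetical, the `s`/`m`/`h`/`d` suffixes are an assumption, and the shipped `session-rotation.sh` may work differently.

```bash
#!/bin/sh
# Hypothetical helpers, not part of the task service scripts.

# Convert an interval such as "7d", "12h", "30m" or "45s" to seconds.
interval_to_seconds() {
    value=${1%[smhd]}      # numeric part, e.g. "7"
    unit=${1#"$value"}     # suffix, e.g. "d" (empty means seconds)
    case "$value" in
        ''|*[!0-9]*) echo "unsupported interval: $1" >&2; return 1 ;;
    esac
    case "$unit" in
        d) echo $((value * 86400)) ;;
        h) echo $((value * 3600)) ;;
        m) echo $((value * 60)) ;;
        *) echo "$value" ;;
    esac
}

# Succeed when at least one interval has passed since the last rotation.
# $1 = last rotation (epoch seconds), $2 = now (epoch seconds), $3 = interval
rotation_due() {
    interval=$(interval_to_seconds "$3") || return 2
    [ $(( $2 - $1 )) -ge "$interval" ]
}
```

For example, `rotation_due "$last_rotation" "$(date +%s)" 7d` succeeds once seven days have elapsed, where `last_rotation` is a timestamp the operator would have to record.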
### High-Availability Validator Setup
```kcl
validator: PolkadotValidator = {
    name: "polkadot-validator-ha"
    # ... base configuration
    high_availability: {
        enabled: true
        backup_nodes: [
            "validator-backup-1.company.com",
            "validator-backup-2.company.com"
        ]
        failover_timeout: 300
        sync_check_interval: 30
    }
    monitoring: {
        enabled: true
        prometheus_external: true
        prometheus_port: 9615
        custom_metrics: true
        alertmanager_webhook: "https://alerts.company.com/webhook"
    }
    security: {
        firewall_enabled: true
        fail2ban_enabled: true
        ssh_restrictions: true
        allowed_ssh_users: ["admin", "operator"]
        key_backup_enabled: true
        backup_encryption_key: "/etc/polkadot/backup.key"
        auto_updates: true
    }
    network: {
        external_addresses: [
            "/ip4/203.0.113.1/tcp/30333"
        ]
        reserved_only: true
        reserved_nodes: [
            "/ip4/10.0.1.10/tcp/30333/p2p/12D3KooW...",
            "/ip4/10.0.1.11/tcp/30333/p2p/12D3KooW...",
            "/ip4/10.0.1.12/tcp/30333/p2p/12D3KooW..."
        ]
        max_peers: 25
    }
}
```
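Failover needs care: if two nodes sign with the same session keys at the same time, the validator can equivocate and be slashed. The sketch below illustrates a conservative decision rule built around `failover_timeout`. It assumes the timeout is measured in seconds since the primary last responded, and the function `should_failover` is hypothetical rather than part of this task service.

```bash
#!/bin/sh
# Hypothetical decision helper, shown only to illustrate the rule.

# Allow a backup node to take over only when BOTH conditions hold:
#   $1 = seconds since the primary last answered a health check
#   $2 = failover_timeout from the configuration (assumed to be seconds)
#   $3 = "stopped" once an operator or fencing step has confirmed that the
#        primary validator service is no longer running
should_failover() {
    [ "$1" -ge "$2" ] && [ "$3" = "stopped" ]
}
```

With the values above, `should_failover 320 300 stopped` succeeds, while `should_failover 320 300 unknown` does not, because the primary might still be signing.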
### ELVES Consensus Validator Configuration
```kcl
validator: PolkadotValidator = {
    name: "polkadot-elves-validator"
    version: "1.5.0"
    run_user: {
        name: "polkadot"
        group: "polkadot"
        home: "/opt/polkadot"
    }
    chain: "polkadot"
    base_path: "/var/lib/polkadot"
    ports: {
        p2p_port: 30333
        prometheus_port: 9615
    }
    validator_mode: true
    consensus: {
        type: "elves"
        elves_config: {
            epoch_duration: 2400  # blocks per epoch
            validators_per_epoch: 21
            proposal_timeout: 3000
            prevote_timeout: 3000
            precommit_timeout: 3000
            commit_timeout: 1000
            ethereum_compatibility: true
        }
        finality: {
            type: "grandpa"
            grandpa_interval: 8
        }
    }
    ethereum_compatibility: {
        enabled: true
        chain_id: 1
        evm_runtime: true
    }
    session_keys: {
        aura_key: "auto-generate"
        grandpa_key: "auto-generate"
        elves_key: "auto-generate"
        rotation_enabled: true
        rotation_interval: "7d"
    }
    monitoring: {
        enabled: true
        elves_metrics: true
        ethereum_metrics: true
        consensus_transition_alerts: true
    }
    performance: {
        database_cache: 4096
        state_cache_size: 4294967296
        evm_cache_size: 1073741824
        max_peers: 50
    }
}
```
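To get a feel for the timing values in this configuration, the arithmetic can be worked through in a few lines of shell. This is only a sketch: it assumes a 6-second block time (Polkadot's target) and that the four round timeouts share one unit, which the configuration does not state.

```bash
#!/bin/sh
# Assumption: 6-second block time; adjust for chains with a different target.
BLOCK_TIME_SECONDS=6

# Wall-clock length of an epoch, given its length in blocks.
epoch_seconds() {
    echo $(( $1 * BLOCK_TIME_SECONDS ))
}

# Sum of the per-round timeouts: proposal, prevote, precommit, commit.
round_timeout_total() {
    echo $(( $1 + $2 + $3 + $4 ))
}

epoch_seconds 2400                        # 2400 blocks * 6 s = 14400 s (4 hours)
round_timeout_total 3000 3000 3000 1000   # 10000, in the config's timeout unit
```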
## Usage
### Deploy Validator
```bash
./core/nulib/provisioning taskserv create polkadot-validator --infra <infrastructure-name>
```
### List Available Task Services
```bash
./core/nulib/provisioning taskserv list
```
### SSH to Validator Server
```bash
./core/nulib/provisioning server ssh <validator-server>
```
### Service Management
```bash
# Check validator status
systemctl status polkadot-validator
# Start/stop validator
systemctl start polkadot-validator
systemctl stop polkadot-validator
systemctl restart polkadot-validator
# View validator logs
journalctl -u polkadot-validator -f
# Check validator health
sudo -u polkadot /opt/polkadot/scripts/validator-monitor.sh
```
### Session Key Management
```bash
# Generate new session keys
sudo -u polkadot /opt/polkadot/scripts/validator-keys.sh generate
# Backup session keys
sudo -u polkadot /opt/polkadot/scripts/validator-keys.sh backup
# Rotate session keys (with safety checks)
sudo -u polkadot /opt/polkadot/scripts/session-rotation.sh
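# Verify a key is present in the node's keystore (params: public key, then key type)
# Requires RPC to be enabled and reachable on localhost
curl -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "author_hasKey", "params":["0x...", "aura"]}' \
http://localhost:9933/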
```
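The RPC calls in this document all share the same request shape, so a small wrapper can cut down on quoting mistakes. The following is a minimal sketch: `rpc_payload` and `rpc_call` are hypothetical helper names, and the default `RPC_URL` is simply the endpoint used in the examples above.

```bash
#!/bin/sh
# Hypothetical helpers; RPC_URL defaults to the endpoint used in this document.
RPC_URL=${RPC_URL:-http://localhost:9933/}

# Build a JSON-RPC 2.0 request body.
# $1 = method name, $2 = params as a JSON array (defaults to []).
rpc_payload() {
    printf '{"id":1,"jsonrpc":"2.0","method":"%s","params":%s}\n' "$1" "${2:-[]}"
}

# Send the request. Needs curl and a node with RPC enabled.
rpc_call() {
    curl -s -H "Content-Type: application/json" \
        -d "$(rpc_payload "$1" "$2")" "$RPC_URL"
}
```

For example, `rpc_call system_health` queries node health, and `rpc_call author_hasSessionKeys '["0x..."]'` passes a parameter list.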
### Validator Operations
```bash
# Check that the node holds the given session keys (this does not confirm the validator is in the active set)
curl -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "author_hasSessionKeys", "params":["0x..."]}' \
http://localhost:9933/
# Monitor block production
curl -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "chain_getHeader", "params":[]}' \
http://localhost:9933/
# Check validator metrics
curl http://localhost:9615/metrics | grep polkadot_
# ELVES Consensus Operations
# Check ELVES consensus state
curl -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "elves_getConsensusState", "params":[]}' \
http://localhost:9933/
# Monitor ELVES epoch information
curl -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "elves_getCurrentEpoch", "params":[]}' \
http://localhost:9933/
# Check ELVES validator participation
curl -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "elves_getValidatorParticipation", "params":[]}' \
http://localhost:9933/
# Monitor consensus transitions (hybrid mode)
curl -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "elves_getConsensusTransitions", "params":[]}' \
http://localhost:9933/
```
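`chain_getHeader` returns the block number as a hex string, which is awkward to compare by eye. The sketch below pulls the `number` field out of a response and prints it in decimal. `header_number` is a hypothetical helper that uses `sed` so that no JSON tool is required; with `jq` installed, a proper JSON parser would be the more robust choice.

```bash
#!/bin/sh
# Hypothetical helper: read a chain_getHeader JSON response on stdin and
# print the block number in decimal. Fails if no "number" field is found.
header_number() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\(0x[0-9a-fA-F]*\)".*/\1/p' |
        { read -r hex && printf '%d\n' "$hex"; }
}
```

Running the `chain_getHeader` request twice a few seconds apart and piping each response through `header_number` shows whether the best block is advancing.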
### Security Operations
```bash
# Check firewall status
sudo ufw status verbose
# or for RHEL/CentOS
sudo firewall-cmd --list-all
# Monitor fail2ban
sudo fail2ban-client status polkadot-validator
# Check SSH access logs
sudo journalctl -u ssh | grep polkadot
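# Verify key backup integrity (decrypt to /dev/null so key material is never printed)
sudo -u polkadot age -d -i /etc/polkadot/backup.key \
/var/backups/polkadot/keys-latest.age > /dev/null && echo "backup decrypts cleanly"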
```
## Architecture
### Validator Architecture
```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Network       │────│  Validator Node  │────│   Monitoring    │
│   Peers         │    │                  │    │   & Alerts      │
│                 │    │ • Block Author   │    │                 │
│ • Other Vals    │────│ • Consensus      │────│ • Prometheus    │
│ • Full Nodes    │    │ • Finality       │    │ • Health Checks │
│ • Bootnodes     │    │ • Key Mgmt       │    │ • Alerting      │
└─────────────────┘    └──────────────────┘    └─────────────────┘
```
### Security Layers
```
┌─────────────────────────────────────────────────────────────┐
│                     Security Hardening                      │
├─────────────────────────────────────────────────────────────┤
│ Network Security   │ System Security    │ Key Security      │
│                    │                    │                   │
│ • Firewall (UFW)   │ • Systemd Hardening│ • Encrypted Keys  │
│ • Fail2ban         │ • User Isolation   │ • Secure Backup   │
│ • SSH Restrictions │ • Auto Updates     │ • Key Rotation    │
│ • Reserved Nodes   │ • File Permissions │ • Age Encryption  │
├─────────────────────────────────────────────────────────────┤
│                    Monitoring & Alerting                    │
├─────────────────────────────────────────────────────────────┤
│                 Polkadot Validator Process                  │
└─────────────────────────────────────────────────────────────┘
```
### Network Ports
- **P2P Port (30333)** - Peer-to-peer validator network
- **Prometheus Port (9615)** - Metrics (internal access only)
- **SSH Port (22)** - Restricted administrative access
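On UFW-based hosts, the port policy above could translate into rules like the ones this sketch prints. It is a dry run only: `firewall_rules` is a hypothetical helper that echoes commands instead of running them, the monitoring source address is an operator-supplied assumption, and the rules the task service actually installs may differ.

```bash
#!/bin/sh
# Hypothetical dry-run helper: prints ufw commands, does not execute them.
# $1 = address allowed to scrape Prometheus metrics (assumption: one host)
# $2 = P2P port, $3 = Prometheus port, $4 = SSH port (defaults from the list above)
firewall_rules() {
    monitor_src=$1
    p2p_port=${2:-30333}
    prometheus_port=${3:-9615}
    ssh_port=${4:-22}
    echo "ufw default deny incoming"
    echo "ufw allow ${p2p_port}/tcp"
    echo "ufw allow from ${monitor_src} to any port ${prometheus_port} proto tcp"
    echo "ufw limit ${ssh_port}/tcp"
}
```

Reviewing the output of `firewall_rules 10.0.1.20` before applying anything keeps the metrics port closed to everyone except the named monitoring host; the address here is only an example.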
### File Structure
```
/var/lib/polkadot/              # Main data directory
├── chains/                     # Chain-specific data
├── keystore/                   # Session keys (encrypted)
├── node.key                    # Node identity key
└── validator-state/            # Validator state data
/opt/polkadot/                  # Validator tools
├── scripts/                    # Management scripts
│   ├── validator-keys.sh       # Key management
│   ├── session-rotation.sh     # Key rotation
│   └── validator-monitor.sh    # Health monitoring
└── backups/                    # Key backups (encrypted)
/etc/polkadot/                  # Configuration
├── validator.conf              # Main configuration
├── backup.key                  # Backup encryption key
└── monitoring.conf             # Monitoring configuration
/var/log/polkadot/              # Logs
├── validator.log               # Validator logs
├── monitoring.log              # Monitoring logs
└── security.log                # Security events
```
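Because the keystore and `backup.key` hold sensitive material, it can help to audit the layout above for files that other accounts can read. A minimal sketch follows, with `loose_permissions` as a hypothetical helper name; it only reports files and changes nothing.

```bash
#!/bin/sh
# Hypothetical audit helper: list regular files under a directory that are
# readable by the group or by other users.
loose_permissions() {
    find "$1" -type f \( -perm -040 -o -perm -004 \) -print
}
```

An empty result from `sudo sh -c '. ./audit.sh; loose_permissions /var/lib/polkadot/keystore'` would indicate that only the owner can read the keys; `audit.sh` is just an example file name for the function above.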
## Supported Operating Systems
- Ubuntu 20.04+ / Debian 11+
- CentOS 8+ / RHEL 8+ / Fedora 35+
## System Requirements
### Minimum Validator Requirements
- **RAM**: 16GB (32GB recommended)
- **Storage**: 200GB NVMe SSD (500GB+ recommended)
- **CPU**: 4 cores (8 cores recommended, high clock speed)
- **Network**: Dedicated server with excellent connectivity
- **Uptime**: 99.9% or better availability
### Production Validator Requirements
- **RAM**: 32GB+ (64GB for optimal performance)
- **Storage**: 1TB+ NVMe SSD with high IOPS
- **CPU**: 8+ cores, 3.0GHz+ base clock
- **Network**: Dedicated bare metal server, multiple network paths
- **Backup**: Secondary server for failover
- **Monitoring**: 24/7 monitoring and alerting
### Network Requirements
- **Latency** - Low latency to other validators (< 100ms)
- **Bandwidth** - High bandwidth with unlimited data
- **Redundancy** - Multiple network paths for reliability
- **IP Address** - Static public IP address
- **DDoS Protection** - DDoS mitigation service recommended
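The minimum hardware figures above can be checked on a candidate host before deployment. The sketch below is one possible preflight script for Linux: `check_min` and `preflight` are hypothetical names, the thresholds are the minimums listed in this section, and RAM is rounded to the nearest GiB because the kernel reports slightly less than the installed amount.

```bash
#!/bin/sh
# Hypothetical preflight helpers for a Linux host.

# Compare a measured value with a minimum and print a PASS/FAIL line.
# $1 = label, $2 = measured value, $3 = required minimum
check_min() {
    if [ "$2" -ge "$3" ]; then
        echo "PASS $1: $2 >= $3"
    else
        echo "FAIL $1: $2 < $3"
        return 1
    fi
}

# Check RAM, CPU cores and free disk space against the minimum requirements.
# $1 = path whose filesystem should hold the chain data (defaults to /)
preflight() {
    status=0
    mem_gb=$(awk '/^MemTotal:/ {print int($2 / 1048576 + 0.5)}' /proc/meminfo)
    cores=$(nproc)
    disk_gb=$(df -P -k "${1:-/}" | awk 'NR==2 {print int($4 / 1048576)}')
    check_min "RAM (GB)" "$mem_gb" 16 || status=1
    check_min "CPU cores" "$cores" 4 || status=1
    check_min "Free disk (GB)" "$disk_gb" 200 || status=1
    return "$status"
}
```

Calling `preflight /var/lib/polkadot` prints one line per requirement and returns a non-zero status if any minimum is missed.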
## Troubleshooting
### Validator Performance Issues
```bash
# Check validator health
sudo -u polkadot /opt/polkadot/scripts/validator-monitor.sh
# Confirm the node still holds its session keys (block production stops without them)
curl -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "author_hasSessionKeys", "params":["0x..."]}' \
http://localhost:9933/
# Check system resources
htop
iostat -x 1
df -h /var/lib/polkadot
# Analyze validator metrics
curl -s http://localhost:9615/metrics | grep -E "(block_height|finality|peers)"
```
### Session Key Issues
```bash
# Check session keys status
sudo -u polkadot /opt/polkadot/scripts/validator-keys.sh status
# Verify keys in keystore
ls -la /var/lib/polkadot/keystore/
# Test key accessibility (every keystore file must be readable by the polkadot user)
sudo -u polkadot sh -c 'for f in /var/lib/polkadot/keystore/*; do [ -r "$f" ] || echo "unreadable: $f"; done'
# Check if keys are set on-chain
# Note: session.nextKeys is a runtime storage item, not a JSON-RPC method.
# Query it in Polkadot-JS Apps (Developer > Chain state > session > nextKeys)
# for the validator account, or via state_getStorage with the hashed storage key.
```
### Network Connectivity Issues
```bash
# Check connected peers
curl -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "system_peers", "params":[]}' \
http://localhost:9933/
# Test P2P connectivity
telnet other-validator-ip 30333
# Check network configuration
ip route get 8.8.8.8
netstat -tlnp | grep :30333
# Monitor network traffic
sudo netstat -i
sudo iftop -i eth0
```
### Security Issues
```bash
# Check firewall status
sudo ufw status numbered
sudo fail2ban-client status
# Review security logs
sudo journalctl -u polkadot-validator | grep -i security
sudo tail -f /var/log/auth.log
# Check for intrusion attempts
sudo fail2ban-client status sshd
sudo grep "Failed password" /var/log/auth.log
# Verify file permissions
ls -la /var/lib/polkadot/keystore/
sudo find /var/lib/polkadot -type f -perm /o+r
```
### Backup and Recovery Issues
```bash
# Test backup integrity
sudo -u polkadot /opt/polkadot/scripts/validator-keys.sh verify-backup
# Restore from backup
sudo -u polkadot /opt/polkadot/scripts/validator-keys.sh restore /path/to/backup
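# Check backup encryption (decrypt and discard the plaintext instead of printing it)
sudo -u polkadot age -d -i /etc/polkadot/backup.key \
/var/backups/polkadot/keys-latest.age > /dev/null && echo "backup decrypts with this key"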
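# Verify session key recovery
sudo systemctl stop polkadot-validator
# Restore keys (same restore command as above)
sudo -u polkadot /opt/polkadot/scripts/validator-keys.sh restore /path/to/backup
sudo systemctl start polkadot-validator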
```
## Security Best Practices
### Validator Security
- **Key Management** - Never expose session keys, use secure backup
- **Network Isolation** - Use reserved nodes and firewall restrictions
- **Regular Updates** - Keep validator software and OS updated
- **Monitoring** - Implement comprehensive monitoring and alerting
- **Physical Security** - Secure physical access to validator hardware
### Operational Security
- **Access Control** - Limit administrative access to essential personnel
- **Change Management** - Document and review all configuration changes
- **Incident Response** - Have clear incident response procedures
- **Regular Audits** - Perform regular security audits and reviews
- **Backup Testing** - Regularly test backup and recovery procedures
### Network Security
- **DDoS Protection** - Use DDoS mitigation services
- **VPN Access** - Use VPN for administrative access
- **Network Monitoring** - Monitor for unusual network activity
- **Peer Filtering** - Use reserved nodes to control peer connections
- **Traffic Analysis** - Regular analysis of network traffic patterns
## Performance Optimization
### Hardware Optimization
- **NVMe Storage** - Use high-performance NVMe SSDs
- **Memory** - Sufficient RAM for database caching
- **CPU** - High clock speed processors for single-threaded performance
- **Network** - Low-latency network connections
### Configuration Optimization
- **Database Cache** - Optimize database cache size
- **State Cache** - Configure appropriate state cache
- **Peer Limits** - Limit peers to reduce network overhead
- **Pruning** - Use state pruning to manage disk usage
### System Optimization
- **CPU Affinity** - Pin validator process to specific cores
- **I/O Scheduler** - Use appropriate I/O scheduler for SSDs
- **Network Tuning** - Optimize TCP settings for low latency
- **Memory Management** - Configure memory management for validator workload
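Before changing the I/O scheduler, it is worth checking which one is active. On Linux, `/sys/block/<device>/queue/scheduler` lists the available schedulers with the active one in square brackets. The helper below, a hypothetical `active_scheduler`, extracts that name. Which scheduler suits a given NVMe drive is left to the operator; `none` and `mq-deadline` are common choices.

```bash
#!/bin/sh
# Hypothetical helper: read a scheduler line such as
# "[none] mq-deadline kyber" on stdin and print the bracketed (active) entry.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}
```

For example, `active_scheduler < /sys/block/nvme0n1/queue/scheduler` prints the current scheduler, where `nvme0n1` is a placeholder device name. For CPU affinity, `taskset -cp <cores> <pid>` pins an already running process.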
## Monitoring and Alerting
### Key Metrics to Monitor
```bash
# Block production rate
curl -s http://localhost:9615/metrics | grep polkadot_block_height
# Finalization lag
curl -s http://localhost:9615/metrics | grep polkadot_finality_
# Peer connections
curl -s http://localhost:9615/metrics | grep polkadot_peers
# System resources
curl -s http://localhost:9615/metrics | grep -E "(cpu|memory|disk)"
```
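Finalization lag is the gap between the best block and the last finalized block, so it can be derived from two samples of the block height metric. The sketch below assumes the metric is exposed as `polkadot_block_height` with a single `status` label (`best` and `finalized`) and that samples carry no trailing timestamp. Metric and label names vary between node versions, so check the `/metrics` output first. `metric_value` and `finality_lag` are hypothetical helpers.

```bash
#!/bin/sh
# Hypothetical helpers for Prometheus text-format output read from stdin.

# Print the value of the first sample line that starts with $1.
# Assumes the value is the last field (no trailing timestamp).
metric_value() {
    awk -v name="$1" 'index($0, name) == 1 { print $NF; exit }'
}

# Print best block height minus finalized block height.
# Assumed metric and label names; adjust to what the node actually exports.
finality_lag() {
    metrics=$(cat)
    best=$(printf '%s\n' "$metrics" | metric_value 'polkadot_block_height{status="best"}')
    finalized=$(printf '%s\n' "$metrics" | metric_value 'polkadot_block_height{status="finalized"}')
    [ -n "$best" ] && [ -n "$finalized" ] || return 1
    echo $((best - finalized))
}
```

Piping `curl -s http://localhost:9615/metrics` into `finality_lag` gives a number that can be compared with a threshold such as `finalization_lag_max`.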
### Prometheus Configuration
```yaml
# prometheus.yml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'polkadot-validator'
    static_configs:
      - targets: ['localhost:9615']
    scrape_interval: 30s
    metrics_path: '/metrics'
```
### Alerting Rules
```yaml
# validator-alerts.yml
groups:
  - name: polkadot-validator
    rules:
      - alert: ValidatorDown
        expr: up{job="polkadot-validator"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Polkadot validator is down"
      - alert: LowPeerCount
        expr: polkadot_peers < 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low peer count: {{ $value }}"
      - alert: HighFinalizationLag
        expr: polkadot_finality_lag > 10
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High finalization lag: {{ $value }}"
```
## Resources
- **Official Documentation**: [wiki.polkadot.network/docs/maintain-validator](https://wiki.polkadot.network/docs/maintain-validator)
- **Validator Guide**: [guide.kusama.network/docs/mirror-maintain-guides-how-to-validate-kusama](https://guide.kusama.network/docs/mirror-maintain-guides-how-to-validate-kusama)
- **GitHub Repository**: [paritytech/polkadot-sdk](https://github.com/paritytech/polkadot-sdk) (successor to the archived paritytech/polkadot repository)
- **Validator Community**: [matrix.to/#/#polkadot-validator-lounge:web3.foundation](https://matrix.to/#/#polkadot-validator-lounge:web3.foundation)
- **Telemetry (for testnets)**: [telemetry.polkadot.io](https://telemetry.polkadot.io)