666 lines
22 KiB
Markdown
Raw Permalink Normal View History

2025-10-07 11:05:08 +01:00
# PostgreSQL Task Service
## Overview
The PostgreSQL task service provides a complete installation and configuration of [PostgreSQL](https://www.postgresql.org/), one of the world's most advanced open-source relational database systems. PostgreSQL offers enterprise-class performance, reliability, and feature robustness, making it suitable for applications ranging from single-machine applications to large internet-facing applications with many concurrent users.
## Features
### Core Database Features
- **ACID Compliance** - Full ACID (Atomicity, Consistency, Isolation, Durability) compliance
- **Advanced SQL Support** - Comprehensive SQL standards compliance with extensions
- **Data Types** - Rich set of native data types including JSON, XML, arrays, and custom types
- **Indexing** - Multiple indexing strategies (B-tree, Hash, GiST, SP-GiST, GIN, BRIN)
- **Full Text Search** - Built-in full-text search capabilities
### Advanced Features
- **Stored Procedures** - Support for multiple procedural languages (PL/pgSQL, PL/Python, etc.)
- **Triggers** - Advanced trigger system for data integrity and automation
- **Views & Materialized Views** - Complex view support including updatable and materialized views
- **Common Table Expressions (CTEs)** - Recursive and non-recursive CTEs
- **Window Functions** - Advanced analytical functions
### High Availability & Scalability
- **Streaming Replication** - Built-in master-slave replication
- **Logical Replication** - Selective data replication and multi-master scenarios
- **Point-in-Time Recovery (PITR)** - Continuous archiving and point-in-time recovery
- **Connection Pooling** - Built-in connection pooling with pgBouncer integration
- **Partitioning** - Table partitioning for large datasets
### Security Features
- **Authentication Methods** - Multiple authentication methods (md5, scram-sha-256, LDAP, etc.)
- **Role-Based Access Control** - Granular permission system with roles and privileges
- **Row Level Security (RLS)** - Fine-grained access control at the row level
- **Encryption** - Data encryption at rest and in transit
- **SSL/TLS Support** - Secure connections with certificate authentication
### Monitoring & Management
- **Statistics Collector** - Comprehensive database statistics and performance metrics
- **Query Planner** - Advanced query optimization and execution planning
- **Auto-vacuum** - Automatic maintenance and optimization
- **Extensions** - Rich ecosystem of extensions (PostGIS, TimescaleDB, etc.)
- **Backup & Recovery** - Multiple backup strategies and recovery options
## Configuration
### Basic Configuration
```kcl
postgres: Postgres = {
name: "postgres"
postgres_version: "15"
vers_num: 15
run_path: "/usr/bin/psql"
lib_path: "/var/lib/postgresql"
data_path: "/var/lib/postgresql/15/main"
etc_path: "/etc/postgresql"
config_file: "postgresql.conf"
run_user: "postgres"
run_group: "postgres"
run_user_home: "/var/lib/postgresql"
}
```
### Production Configuration
```kcl
postgres: Postgres = {
name: "postgres"
postgres_version: "15"
vers_num: 15
run_path: "/usr/bin/psql"
lib_path: "/var/lib/postgresql"
data_path: "/var/lib/postgresql/15/main"
etc_path: "/etc/postgresql"
config_file: "postgresql.conf"
run_user: "postgres"
run_group: "postgres"
run_user_home: "/var/lib/postgresql"
performance: {
max_connections: 200
shared_buffers: "256MB"
effective_cache_size: "1GB"
maintenance_work_mem: "64MB"
checkpoint_completion_target: 0.9
wal_buffers: "16MB"
default_statistics_target: 100
random_page_cost: 1.1
effective_io_concurrency: 200
work_mem: "4MB"
min_wal_size: "1GB"
max_wal_size: "4GB"
}
security: {
ssl: true
ssl_cert_file: "/etc/ssl/certs/postgresql.crt"
ssl_key_file: "/etc/ssl/private/postgresql.key"
password_encryption: "scram-sha-256"
row_security: true
log_connections: true
log_disconnections: true
log_statement: "all"
}
backup: {
enabled: true
wal_level: "replica"
archive_mode: true
archive_command: "cp %p /var/lib/postgresql/15/archive/%f"
max_wal_senders: 3
checkpoint_segments: 32
}
}
```
### High-Availability Master Configuration
```kcl
postgres: Postgres = {
name: "postgres-master"
postgres_version: "15"
vers_num: 15
# ... base configuration
replication: {
role: "master"
wal_level: "replica"
max_wal_senders: 10
max_replication_slots: 10
synchronous_standby_names: "standby1,standby2"
synchronous_commit: "on"
hot_standby: false
archive_mode: true
archive_command: "cp %p /var/lib/postgresql/15/archive/%f"
restore_command: "cp /var/lib/postgresql/15/archive/%f %p"
archive_timeout: 300
}
monitoring: {
log_min_duration_statement: 1000
log_checkpoints: true
log_connections: true
log_disconnections: true
log_lock_waits: true
log_temp_files: 10485760
track_activities: true
track_counts: true
track_io_timing: true
track_functions: "all"
}
performance: {
max_connections: 500
shared_buffers: "2GB"
effective_cache_size: "6GB"
maintenance_work_mem: "256MB"
work_mem: "8MB"
wal_buffers: "64MB"
checkpoint_completion_target: 0.9
max_wal_size: "8GB"
min_wal_size: "2GB"
random_page_cost: 1.1
effective_io_concurrency: 300
}
}
```
### High-Availability Standby Configuration
```kcl
postgres: Postgres = {
name: "postgres-standby"
postgres_version: "15"
vers_num: 15
# ... base configuration
replication: {
role: "standby"
hot_standby: true
max_standby_archive_delay: "30s"
max_standby_streaming_delay: "30s"
hot_standby_feedback: true
primary_conninfo: "host=postgres-master port=5432 user=replicator password=replicator_password application_name=standby1"
restore_command: "cp /var/lib/postgresql/15/archive/%f %p"
recovery_target_timeline: "latest"
standby_mode: true
trigger_file: "/var/lib/postgresql/15/promote"
}
performance: {
max_connections: 300
shared_buffers: "2GB"
effective_cache_size: "6GB"
work_mem: "8MB"
}
}
```
### Development Configuration
```kcl
postgres: Postgres = {
name: "postgres-dev"
postgres_version: "15"
vers_num: 15
# ... base configuration
development: {
log_statement: "all"
log_min_duration_statement: 0
log_line_prefix: "%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h "
log_checkpoints: true
log_connections: true
log_disconnections: true
log_lock_waits: true
deadlock_timeout: "1s"
log_temp_files: 0
}
performance: {
max_connections: 50
shared_buffers: "128MB"
effective_cache_size: "512MB"
work_mem: "2MB"
maintenance_work_mem: "32MB"
fsync: false # Only for development!
synchronous_commit: false # Only for development!
full_page_writes: false # Only for development!
}
}
```
### Connection Pooling Configuration
```kcl
postgres: Postgres = {
name: "postgres-pooled"
postgres_version: "15"
vers_num: 15
# ... base configuration
connection_pooling: {
pgbouncer: {
enabled: true
listen_addr: "0.0.0.0"
listen_port: 6432
pool_mode: "transaction"
max_client_conn: 1000
default_pool_size: 25
max_db_connections: 50
reserve_pool_size: 5
reserve_pool_timeout: 5
server_round_robin: 1
ignore_startup_parameters: "extra_float_digits"
auth_type: "md5"
auth_file: "/etc/pgbouncer/userlist.txt"
admin_users: "postgres"
stats_users: "stats"
}
}
databases: [
{
name: "app_production"
owner: "appuser"
encoding: "UTF8"
lc_collate: "en_US.UTF-8"
lc_ctype: "en_US.UTF-8"
template: "template0"
connections: 200
},
{
name: "app_staging"
owner: "appuser"
encoding: "UTF8"
lc_collate: "en_US.UTF-8"
lc_ctype: "en_US.UTF-8"
template: "template0"
connections: 50
}
]
}
```
## Usage
### Deploy PostgreSQL
```bash
./core/nulib/provisioning taskserv create postgres --infra <infrastructure-name>
```
### List Available Task Services
```bash
./core/nulib/provisioning taskserv list
```
### SSH to PostgreSQL Server
```bash
./core/nulib/provisioning server ssh <postgres-server>
```
### Service Management
```bash
# Check PostgreSQL status
systemctl status postgresql
# Start/stop PostgreSQL
systemctl start postgresql
systemctl stop postgresql
systemctl restart postgresql
# Check PostgreSQL version
sudo -u postgres psql -c "SELECT version();"
# Check cluster status
pg_lsclusters
```
### Database Administration
```bash
# Connect to PostgreSQL
sudo -u postgres psql
# Create database
sudo -u postgres createdb myapp
# Create user
sudo -u postgres createuser --interactive appuser
# Grant privileges
sudo -u postgres psql -c "GRANT ALL PRIVILEGES ON DATABASE myapp TO appuser;"
# List databases
sudo -u postgres psql -l
# List users
sudo -u postgres psql -c "\du"
```
### Database Operations
```sql
-- Connect to database
\c myapp
-- Create table
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Insert data
INSERT INTO users (username, email) VALUES ('admin', 'admin@company.com');
-- Query data
SELECT * FROM users;
-- Create index
CREATE INDEX idx_users_email ON users(email);
-- Show table structure
\d users
-- Show database size
SELECT pg_size_pretty(pg_database_size('myapp'));
```
### Backup and Restore
```bash
# Full database backup
sudo -u postgres pg_dump myapp > /backup/myapp_$(date +%Y%m%d).sql
# Compressed backup
sudo -u postgres pg_dump -Fc myapp > /backup/myapp_$(date +%Y%m%d).dump
# All databases backup
sudo -u postgres pg_dumpall > /backup/all_databases_$(date +%Y%m%d).sql
# Restore database
sudo -u postgres psql myapp < /backup/myapp_backup.sql
# Restore compressed backup
sudo -u postgres pg_restore -d myapp /backup/myapp_backup.dump
# Point-in-time recovery setup
sudo -u postgres psql -c "SELECT pg_start_backup('backup_label');"
```
### Replication Management
```bash
# Check replication status (on master)
sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"
# Check replication lag
sudo -u postgres psql -c "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()));"
# Promote standby to master
sudo -u postgres pg_ctl promote -D /var/lib/postgresql/15/main
# Create replication slot
sudo -u postgres psql -c "SELECT * FROM pg_create_physical_replication_slot('standby_slot');"
```
### Performance Monitoring
```bash
# Check active connections
sudo -u postgres psql -c "SELECT count(*) FROM pg_stat_activity;"
# Check database statistics
sudo -u postgres psql -c "SELECT * FROM pg_stat_database;"
# Check table statistics
sudo -u postgres psql -c "SELECT * FROM pg_stat_user_tables;"
# Check index usage
sudo -u postgres psql -c "SELECT * FROM pg_stat_user_indexes;"
# Check slow queries
sudo -u postgres psql -c "SELECT query, mean_time, calls FROM pg_stat_statements ORDER BY mean_time DESC LIMIT 10;"
```
### Maintenance Operations
```bash
# Analyze database
sudo -u postgres psql -c "ANALYZE;"
# Vacuum database
sudo -u postgres psql -c "VACUUM;"
# Full vacuum (locks tables)
sudo -u postgres psql -c "VACUUM FULL;"
# Reindex database
sudo -u postgres psql -c "REINDEX DATABASE myapp;"
# Check database integrity
sudo -u postgres psql myapp -c "SELECT * FROM pg_check_database('myapp');"
```
## Architecture
### System Architecture
```
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Applications │────│ PostgreSQL │────│ Data Storage │
│ │ │ │ │ │
│ • Web Apps │ │ • Query Engine │ │ • Data Files │
│ • APIs │────│ • Transaction │────│ • WAL Files │
│ • Analytics │ │ Manager │ │ • Index Files │
│ • Reporting │ │ • Buffer Pool │ │ • Archive Files │
└─────────────────┘ └──────────────────┘ └─────────────────┘
```
### PostgreSQL Process Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ Postmaster │
│ (Main Process) │
├─────────────────────────────────────────────────────────────┤
│ Backend Processes │ Utility Processes │ Logger Process │
│ │ │ │
│ • postgres (client) │ • autovacuum │ • syslogger │
│ • postgres (worker) │ • stats collector │ • log writer │
│ • postgres (wal) │ • checkpointer │ • log rotator │
│ • postgres (bg) │ • wal writer │ │
├─────────────────────────────────────────────────────────────┤
│ Shared Memory │
│ │
│ • Shared Buffers • WAL Buffers • Lock Tables │
│ • Proc Array • CLOG • Commit Log │
└─────────────────────────────────────────────────────────────┘
```
### Replication Architecture
```
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Master │────│ WAL Shipping │────│ Standby │
│ │ │ │ │ │
│ • Primary DB │ │ • Streaming │ │ • Hot Standby │
│ • WAL Writer │────│ • Archive │────│ • WAL Receiver │
│ • WAL Sender │ │ • File Based │ │ • Startup Proc │
│ • Checkpointer │ │ • Synchronous │ │ • Read Queries │
└─────────────────┘ └──────────────────┘ └─────────────────┘
```
### File Structure
```
/var/lib/postgresql/15/main/ # Data directory
├── base/ # Database files
├── global/ # Cluster-wide tables
├── pg_wal/ # Write-ahead log files
├── pg_xact/ # Transaction commit status
├── pg_multixact/ # Multixact status
├── pg_notify/ # LISTEN/NOTIFY status
├── pg_serial/ # Serializable isolation info
├── pg_snapshots/ # Exported snapshots
├── pg_stat/ # Statistics files
├── pg_stat_tmp/ # Temporary statistics files
├── pg_subtrans/ # Subtransaction status
├── pg_tblspc/ # Tablespace symbolic links
├── pg_twophase/ # Prepared transaction files
├── postgresql.conf # Main configuration file
├── pg_hba.conf # Client authentication config
├── pg_ident.conf # User name mapping
└── postmaster.pid # Process ID file
/etc/postgresql/15/main/ # Configuration directory
├── postgresql.conf # Main configuration
├── pg_hba.conf # Host-based authentication
├── pg_ident.conf # Ident map configuration
└── start.conf # Auto-start configuration
/var/log/postgresql/ # Log directory
├── postgresql-15-main.log # Main log file
└── postgresql-15-main.log.* # Rotated log files
```
## Supported Operating Systems
- Ubuntu 20.04+ / Debian 11+
- CentOS 8+ / RHEL 8+ / Fedora 35+
- Amazon Linux 2+
- SUSE Linux Enterprise 15+
- Windows Server 2019+
## System Requirements
### Minimum Requirements
- **RAM**: 2GB (4GB+ recommended)
- **Storage**: 20GB (100GB+ for production)
- **CPU**: 2 cores (4+ cores recommended)
- **Network**: 100Mbps (1Gbps+ for replication)
### Production Requirements
- **RAM**: 8GB+ (32GB+ for large databases)
- **Storage**: 100GB+ NVMe SSD (high IOPS)
- **CPU**: 4+ cores (16+ cores for high load)
- **Network**: 1Gbps+ with low latency
### Performance Requirements
- **Disk IOPS**: 1000+ IOPS for data directory
- **Memory**: 25% of total RAM for shared_buffers
- **CPU**: High-frequency processors for query processing
- **Network**: Low latency for replication (<10ms)
## Troubleshooting
### Service Issues
```bash
# Check PostgreSQL status
systemctl status postgresql
# View PostgreSQL logs
tail -f /var/log/postgresql/postgresql-15-main.log
# Check configuration
sudo -u postgres psql -c "SHOW config_file;"
# Test configuration
sudo -u postgres postgres --check-config
```
### Connection Issues
```bash
# Check listening ports
netstat -tlnp | grep postgres
ss -tlnp | grep postgres
# Test local connection
sudo -u postgres psql -h localhost
# Check pg_hba.conf
sudo cat /etc/postgresql/15/main/pg_hba.conf
# Check active connections
sudo -u postgres psql -c "SELECT * FROM pg_stat_activity;"
```
### Performance Issues
```bash
# Check database size
sudo -u postgres psql -c "SELECT pg_size_pretty(pg_database_size('myapp'));"
# Check slow queries
sudo -u postgres psql -c "SELECT query, mean_time FROM pg_stat_statements ORDER BY mean_time DESC LIMIT 10;"
# Check lock conflicts
sudo -u postgres psql -c "SELECT * FROM pg_locks WHERE NOT granted;"
# Check buffer hit ratio
sudo -u postgres psql -c "SELECT sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) AS hit_ratio FROM pg_statio_user_tables;"
```
### Replication Issues
```bash
# Check replication status
sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"
# Check replication lag
sudo -u postgres psql -c "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()));"
# Check WAL files
ls -la /var/lib/postgresql/15/main/pg_wal/
# Check archive command
sudo -u postgres psql -c "SHOW archive_command;"
```
### Data Corruption Issues
```bash
# Check database corruption
sudo -u postgres psql myapp -c "SELECT * FROM pg_check_database('myapp');"
# Check table corruption
sudo -u postgres psql myapp -c "SELECT * FROM pg_check_table('users');"
# Repair corrupted indexes
sudo -u postgres psql myapp -c "REINDEX TABLE users;"
# Check WAL consistency
sudo -u postgres pg_waldump /var/lib/postgresql/15/main/pg_wal/000000010000000000000001
```
## Security Considerations
### Authentication Security
- **Strong Passwords** - Use strong passwords for database users
- **Certificate Authentication** - Use SSL certificates for authentication
- **LDAP Integration** - Integrate with corporate LDAP systems
- **Role-Based Access** - Implement proper role hierarchy
### Network Security
- **SSL/TLS Encryption** - Enable SSL for all connections
- **Firewall Rules** - Restrict database port access
- **VPN Access** - Use VPN for remote database access
- **Network Segmentation** - Isolate database network
### Data Security
- **Encryption at Rest** - Encrypt database files
- **Row Level Security** - Implement fine-grained access control
- **Audit Logging** - Enable comprehensive audit logging
- **Backup Encryption** - Encrypt database backups
### Access Control
- **Principle of Least Privilege** - Grant minimal necessary permissions
- **Regular Security Audits** - Perform regular security reviews
- **User Account Management** - Regular user account cleanup
- **Database Hardening** - Follow PostgreSQL security best practices
## Performance Optimization
### Hardware Optimization
- **Storage** - Use NVMe SSDs for data and WAL directories
- **Memory** - Adequate RAM for buffer pool and OS cache
- **CPU** - High-frequency processors for query execution
- **Network** - High-bandwidth, low-latency network
### Configuration Tuning
- **Memory Settings** - Optimize shared_buffers and work_mem
- **Checkpoint Tuning** - Configure checkpoint intervals
- **WAL Settings** - Optimize WAL buffer and archiving
- **Query Planner** - Tune statistics and cost parameters
### Query Optimization
- **Index Strategy** - Proper indexing for query patterns
- **Query Analysis** - Use EXPLAIN ANALYZE for optimization
- **Statistics Updates** - Regular ANALYZE operations
- **Partitioning** - Table partitioning for large datasets
### Maintenance Optimization
- **Auto-vacuum** - Configure auto-vacuum parameters
- **Connection Pooling** - Use pgBouncer for connection management
- **Monitoring** - Continuous performance monitoring
- **Capacity Planning** - Regular capacity planning and scaling
## Resources
- **Official Documentation**: [postgresql.org/docs](https://www.postgresql.org/docs/)
- **PostgreSQL Wiki**: [wiki.postgresql.org](https://wiki.postgresql.org/)
- **Community Support**: [postgresql.org/support](https://www.postgresql.org/support/)
- **Performance Tuning**: [pgtune.leopard.in.ua](https://pgtune.leopard.in.ua/)
- **Extensions**: [pgxn.org](https://pgxn.org/)