Advanced open-source relational database with Patroni-managed high availability, etcd consensus store, and PgBouncer connection pooling.
Addresses below are RFC 5737 documentation ranges or placeholders - swap in your own.
Table of Contents#
- Overview
- Setup
- PostgreSQL and Components
- SSL/TLS Configuration
- Performance Tuning
- WAL Archiving and PITR
- Logical Replication
- Database Management
- Troubleshooting
- See Also
- Sources
1. Overview#
This guide covers a Patroni-managed PostgreSQL 17 cluster with etcd as the distributed configuration store and PgBouncer as the connection pooler. The architecture provides automatic leader election, streaming replication, and transparent connection routing.
Replace all placeholders (<...>) with real values.
Architecture#
Clients
|
PgBouncer (:6432) -- connection pooling
|
PostgreSQL (:5432) -- managed by Patroni
|
etcd (:2379) -- cluster state and leader election2. Setup#
2.1 Environment Variables#
export DB_PWD='<postgres-super-pw>'
export DB_REPLICA=$(openssl rand -base64 24)
export INIT_PASS=$(openssl rand -base64 24)
export CURRENT_NODE='node01'
export CURRENT_IP='192.0.2.111'
export IP_SQL01='192.0.2.111'
export IP_SQL02='192.0.2.112'
export IP_SQL03='192.0.2.211'2.2 Preparation#
# LVM / filesystem
pvcreate /dev/sdb
vgcreate database /dev/sdb
lvcreate -l 100%FREE -n data database
mkfs.xfs -f -i size=512 /dev/mapper/database-data
# Persistent mounts
cat >> /etc/fstab <<EOF
# PostgreSQL data volume
/dev/mapper/database-data /database xfs defaults 1 2
/database/patroni /var/lib/patroni none bind 0 0
EOF
mkdir -p /database/patroni /var/lib/patroni
mount -a3. PostgreSQL and Components#
3.1 etcd#
dnf install -y etcd patroni-etcd
firewall-cmd --permanent --add-port=2379/tcp
firewall-cmd --permanent --add-port=2380/tcp
firewall-cmd --reloadcat > /etc/etcd/etcd.conf <<EOF
[member]
ETCD_NAME="${CURRENT_NODE}"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_PEER_URLS="http://${CURRENT_IP}:2380"
ETCD_LISTEN_CLIENT_URLS="http://${CURRENT_IP}:2379"
[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://${CURRENT_IP}:2380"
ETCD_INITIAL_CLUSTER="node1=http://${IP_SQL01}:2380,node2=http://${IP_SQL02}:2380,node3=http://${IP_SQL03}:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="${INIT_PASS}"
ETCD_ADVERTISE_CLIENT_URLS="http://${CURRENT_IP}:2379"
ETCD_ENABLE_V2="true"
EOFsystemctl daemon-reload
systemctl enable --now etcd3.2 Patroni#
3.2.1 Preparation#
dnf install -y postgresql17 postgresql17-server patroni
firewall-cmd --permanent --add-service=postgresql
firewall-cmd --permanent --add-port=8008/tcp
firewall-cmd --reload
mkdir -p /var/lib/patroni/17/data
chown -R postgres:postgres /var/lib/patroni3.2.2 Configuration#
cat > /etc/patroni/patroni.yml <<EOF
scope: patroni
namespace: /db/
name: ${CURRENT_NODE}
restapi:
listen: ${CURRENT_IP}:8008
connect_address: ${CURRENT_IP}:8008
etcd:
hosts: ${IP_SQL01}:2379,${IP_SQL02}:2379,${IP_SQL03}:2379
bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
postgresql:
use_pg_rewind: true
initdb:
- encoding: UTF8
- data-checksums
pg_hba:
- host replication replicator 127.0.0.1/32 scram-sha-256
- host replication replicator ${IP_SQL01}/24 scram-sha-256
- host replication replicator ${IP_SQL02}/24 scram-sha-256
- host replication replicator ${IP_SQL03}/24 scram-sha-256
- host all all 0.0.0.0/0 scram-sha-256
users:
admin:
password: ${DB_PWD}
options: [createrole, createdb]
postgresql:
listen: ${CURRENT_IP}:5432
connect_address: ${CURRENT_IP}:5432
bin_dir: /usr/pgsql-17/bin
data_dir: /var/lib/patroni/17/data
pgpass: /tmp/pgpass
authentication:
replication:
username: replicator
password: ${DB_REPLICA}
superuser:
username: postgres
password: ${DB_PWD}
parameters:
unix_socket_directories: '.'
tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false
EOF3.2.3 Service#
cat > /etc/systemd/system/patroni.service <<EOF
[Unit]
Description=Patroni PostgreSQL 17 HA Manager
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=postgres
ExecStart=/usr/bin/patroni /etc/patroni/patroni.yml
KillMode=process
TimeoutSec=30
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now patroni3.3 PgBouncer#
3.3.1 Preparation#
dnf install -y pgbouncer
firewall-cmd --permanent --add-port=6432/tcp
firewall-cmd --reload
mkdir -p /etc/pgbouncer3.3.2 Configuration#
cat > /etc/pgbouncer/pgbouncer.ini <<EOF
[databases]
* = host=${CURRENT_IP} port=5432
[pgbouncer]
listen_addr = ${CURRENT_IP}
listen_port = 6432
logfile = /var/log/pgbouncer/pgbouncer.log
pidfile = /run/pgbouncer/pgbouncer.pid
auth_type = hba
auth_hba_file = /var/lib/patroni/17/data/pg_hba.conf
auth_file = /etc/pgbouncer/.secrets
admin_users = postgres
stats_users = stats,postgres
pool_mode = session
server_reset_query = DISCARD ALL
max_client_conn = 256
default_pool_size = 194
ignore_startup_parameters = options,application_name,extra_float_digits
stats_period = 1800
EOFpython /etc/pgbouncer/mkauth.py /etc/pgbouncer/.secrets \
"host=${CURRENT_IP} port=5432 dbname=postgres user=postgres password=${DB_PWD}"
chown -R pgbouncer:pgbouncer /etc/pgbouncer3.3.3 Service#
cat > /etc/systemd/system/pgbouncer.service <<'EOF'
[Unit]
Description=PgBouncer Connection Pooler
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
User=pgbouncer
Environment=BOUNCERCONF=/etc/pgbouncer/pgbouncer.ini
ExecStart=/usr/bin/pgbouncer ${BOUNCERCONF}
ExecReload=/bin/kill -HUP $MAINPID
TimeoutSec=300
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now pgbouncer4. SSL/TLS Configuration#
4.1 Generate Server Certificates#
mkdir -p /var/lib/patroni/17/ssl && cd /var/lib/patroni/17/ssl
# CA
openssl genrsa -out ca-key.pem 4096
openssl req -new -x509 -days 3650 -key ca-key.pem -out ca.pem \
-subj "/CN=PostgreSQL CA"
# Server certificate (SAN must include all node IPs)
openssl genrsa -out server-key.pem 4096
openssl req -new -key server-key.pem -out server-req.pem \
-subj "/CN=$(hostname)"
cat > server-ext.cnf <<EOF
subjectAltName=IP:${CURRENT_IP},IP:127.0.0.1,DNS:$(hostname)
EOF
openssl x509 -req -days 3650 -in server-req.pem \
-CA ca.pem -CAkey ca-key.pem -CAcreateserial \
-out server-cert.pem -extfile server-ext.cnf
chown postgres:postgres /var/lib/patroni/17/ssl/*
chmod 600 /var/lib/patroni/17/ssl/*-key.pem4.2 PostgreSQL SSL Parameters#
Add to the parameters section of patroni.yml:
postgresql:
parameters:
ssl: "on"
ssl_ca_file: /var/lib/patroni/17/ssl/ca.pem
ssl_cert_file: /var/lib/patroni/17/ssl/server-cert.pem
ssl_key_file: /var/lib/patroni/17/ssl/server-key.pem
ssl_min_protocol_version: TLSv1.24.3 Enforce SSL in pg_hba.conf#
Replace host with hostssl in Patroni's pg_hba entries:
pg_hba:
- hostssl replication replicator 127.0.0.1/32 scram-sha-256
- hostssl replication replicator <subnet>/24 scram-sha-256
- hostssl all all 0.0.0.0/0 scram-sha-2564.4 Verify SSL Connection#
psql "host=${CURRENT_IP} port=5432 dbname=postgres user=postgres sslmode=verify-full sslrootcert=/var/lib/patroni/17/ssl/ca.pem"-- Check SSL status of current connection
SELECT ssl, version, cipher FROM pg_stat_ssl WHERE pid = pg_backend_pid();5. Performance Tuning#
5.1 Memory Settings#
Add to patroni.yml under postgresql.parameters:
postgresql:
parameters:
# Shared memory - 25% of total RAM
shared_buffers: 4GB
# Per-query work memory
work_mem: 64MB
# Maintenance operations (VACUUM, CREATE INDEX)
maintenance_work_mem: 1GB
# OS cache reliance
effective_cache_size: 12GB
# Background writer
bgwriter_lru_maxpages: 100
bgwriter_lru_multiplier: 2.05.2 WAL and Checkpoint Settings#
postgresql:
parameters:
# WAL settings
wal_level: replica # 'logical' if using logical replication
max_wal_size: 4GB
min_wal_size: 1GB
wal_buffers: 64MB
checkpoint_timeout: 15min
checkpoint_completion_target: 0.9
# Write-ahead logging
synchronous_commit: "on" # 'off' for async (faster, risk of data loss)
wal_compression: lz45.3 Connection and Parallelism#
postgresql:
parameters:
max_connections: 200
max_parallel_workers_per_gather: 4
max_parallel_workers: 8
max_worker_processes: 16
max_parallel_maintenance_workers: 45.4 Autovacuum Tuning#
postgresql:
parameters:
autovacuum_max_workers: 4
autovacuum_naptime: 30s
autovacuum_vacuum_threshold: 50
autovacuum_vacuum_scale_factor: 0.05
autovacuum_analyze_threshold: 50
autovacuum_analyze_scale_factor: 0.025
autovacuum_vacuum_cost_delay: 2ms
autovacuum_vacuum_cost_limit: 10005.5 Query Planning#
postgresql:
parameters:
random_page_cost: 1.1 # Lower for SSD (default 4.0)
effective_io_concurrency: 200 # Higher for SSD
default_statistics_target: 2006. WAL Archiving and PITR#
6.1 Configure WAL Archiving#
Add to patroni.yml:
postgresql:
parameters:
wal_level: replica
archive_mode: "on"
archive_command: 'rsync -a %p /archive/wal/%f'
archive_timeout: 300 # Force switch after 5 minutes of inactivity# Create archive directory
mkdir -p /archive/wal
chown postgres:postgres /archive/wal6.2 Base Backup#
pg_basebackup -U replicator -D /archive/base/$(date +%Y%m%d) \
-Fp -Xs -P -R --checkpoint=fast6.3 Point-in-Time Recovery#
Create a recovery.signal file and configure postgresql.conf:
# Stop PostgreSQL, restore base backup to data directory
systemctl stop patroni
# Restore the base backup
rm -rf /var/lib/patroni/17/data/*
cp -a /archive/base/20260322/* /var/lib/patroni/17/data/
# Configure recovery target
cat >> /var/lib/patroni/17/data/postgresql.conf <<EOF
restore_command = 'cp /archive/wal/%f %p'
recovery_target_time = '2026-03-22 14:30:00+00'
recovery_target_action = 'promote'
EOF
touch /var/lib/patroni/17/data/recovery.signal
chown -R postgres:postgres /var/lib/patroni/17/data
systemctl start patroni6.4 Verify Recovery#
-- Check if recovery is complete
SELECT pg_is_in_recovery(); -- false = promoted to primary
-- Check the recovery timeline
SELECT timeline_id FROM pg_control_recovery();7. Logical Replication#
Logical replication replicates individual tables (or all tables) between PostgreSQL instances, including across major versions.
7.1 Prerequisites#
# On the publisher (source), set in patroni.yml
postgresql:
parameters:
wal_level: logical
max_replication_slots: 10
max_wal_senders: 107.2 Create Publication (Source)#
-- Replicate all tables in a schema
CREATE PUBLICATION my_pub FOR ALL TABLES;
-- Or specific tables
CREATE PUBLICATION my_pub FOR TABLE orders, customers;
-- With filtering
CREATE PUBLICATION my_pub FOR TABLE orders WHERE (status = 'active');7.3 Create Subscription (Target)#
On the target PostgreSQL instance (tables must already exist with matching schema):
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=<source_ip> port=5432 dbname=appdb user=replicator password=<password>'
PUBLICATION my_pub;7.4 Monitor Logical Replication#
-- On publisher: check replication slots
SELECT slot_name, active, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;
-- On subscriber: check subscription status
SELECT subname, received_lsn, latest_end_lsn, latest_end_time
FROM pg_stat_subscription;
-- Check replication lag
SELECT slot_name,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS lag
FROM pg_replication_slots
WHERE slot_type = 'logical';7.5 Manage Subscriptions#
-- Pause replication
ALTER SUBSCRIPTION my_sub DISABLE;
-- Resume
ALTER SUBSCRIPTION my_sub ENABLE;
-- Drop
DROP SUBSCRIPTION my_sub;8. Database Management#
8.1 Creating Users and Databases#
CREATE ROLE app_user WITH LOGIN PASSWORD '<password>';
CREATE DATABASE appdb OWNER app_user;8.2 Exporting and Importing#
pg_dump -Fc -U postgres -d appdb > appdb.dump
pg_restore -U postgres -d appdb_restored appdb.dump8.3 Granting Permissions#
GRANT CONNECT ON DATABASE appdb TO app_user;
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT,INSERT,UPDATE,DELETE ON ALL TABLES IN SCHEMA public TO app_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT,INSERT,UPDATE,DELETE ON TABLES TO app_user;8.4 Basic psql Commands#
| Command | Description |
|---|---|
\l | List databases |
\c <db> | Connect to database |
\dt | List tables |
\d <table> | Describe table |
\du | List roles |
\dn | List schemas |
\di | List indexes |
\df | List functions |
\i <file> | Run SQL file |
\timing | Toggle query timing |
\x | Toggle expanded display |
\q | Quit |
8.5 Cluster Operations#
| Task | Command |
|---|---|
| Cluster status | patronictl list |
| Manual failover | patronictl failover |
| Switchover | patronictl switchover |
| Pause | patronictl pause |
| Resume | patronictl resume |
| Restart PostgreSQL | patronictl restart patroni |
| Edit DCS config | patronictl edit-config |
| Reinitialize replica | patronictl reinit patroni <node> |
8.6 Backup and Restore#
| Action | Example |
|---|---|
| Physical backup | pg_basebackup -U replicator -D /backup/base -Fp -Xs -P |
| WAL archive command | archive_command = 'rsync -a %p /archive/%f' |
| Restore WAL | restore_command = 'cp /archive/%f %p' |
| Logical backup | pg_dump -Fc -U postgres db > db.dump |
| Parallel dump | pg_dump -Fc -j 4 -U postgres db > db.dump |
| Restore | pg_restore -U postgres -d db -j 4 db.dump |
8.7 Monitoring#
| What | Command |
|---|---|
| Cluster health | patronictl list |
| Replication lag | SELECT client_addr, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) FROM pg_stat_replication; |
| etcd status | etcdctl --endpoints=${CURRENT_IP}:2379 endpoint health |
| PgBouncer pools | echo "SHOW POOLS;" | psql -h ${CURRENT_IP} -p 6432 -U postgres pgbouncer |
| Active queries | SELECT pid, query_start, state, query FROM pg_stat_activity WHERE state != 'idle'; |
| Table bloat | SELECT schemaname, relname, n_dead_tup, last_autovacuum FROM pg_stat_user_tables ORDER BY n_dead_tup DESC; |
| Index usage | SELECT indexrelname, idx_scan, idx_tup_read FROM pg_stat_user_indexes ORDER BY idx_scan; |
| Cache hit ratio | SELECT round(100.0 * sum(blks_hit) / nullif(sum(blks_hit) + sum(blks_read), 0), 2) AS ratio FROM pg_stat_database; |
9. Troubleshooting#
| Issue | Cause | Solution |
|---|---|---|
| Database access denied | PgBouncer auth file out of sync | Run chown -R pgbouncer:pgbouncer /etc/pgbouncer; verify user in .secrets |
| Patroni fails to start | Wrong data dir ownership or etcd unreachable | Check chown postgres:postgres /var/lib/patroni/17/data; journalctl -u patroni |
| etcd quorum lost | Majority of members unreachable | etcdctl endpoint status --write-out=table; restart or re-add unhealthy member |
| Replication lag growing | Heavy writes, slow replica disk, or network | Check pg_stat_replication; tune max_wal_size; verify disk I/O on replica |
| PgBouncer connection errors | Pool exhausted or auth mismatch | echo "SHOW STATS;" | psql -p 6432 pgbouncer; increase default_pool_size |
| Disk space running low | WAL accumulation or table bloat | df -h /database; check archive_command success; run VACUUM FULL on bloated tables |
| SSL connection refused | Certificate paths wrong or permissions | Verify ssl_cert_file path; ensure chmod 600 on key files; check pg_hba for hostssl |
| Logical replication slot growing | Subscriber down or lagging | Check pg_replication_slots for inactive slots; drop orphaned slots |
| High CPU from autovacuum | Large table, aggressive settings | Increase autovacuum_vacuum_cost_delay; check pg_stat_user_tables for bloat |