GlusterFS is a scalable, distributed network filesystem that aggregates storage bricks from multiple servers into a single, unified namespace.
Addresses below are RFC 5737 documentation ranges or placeholders - swap in your own.
Table of Contents#
- Overview
- Architecture
- Volume Types
- Installation
- Disk Preparation
- Firewall Configuration
- Volume Configuration
- Security and Access Control
- Client Configuration
- Geo-Replication
- Snapshots
- Monitoring
- Troubleshooting
- See Also
- Sources
1. Overview#
GlusterFS pools storage resources from commodity servers into a large parallel filesystem. It operates entirely in user space with no kernel modifications required on the server side. Key characteristics:
- No metadata server - eliminates a single point of failure; clients use an elastic hash algorithm to locate data
- POSIX-compatible - applications need no modification to use GlusterFS volumes
- Modular translator architecture - features like replication, striping, caching, and encryption are implemented as stackable translators
- Scales to petabytes - tested in production at hundreds of nodes
GlusterFS stores data in units called "bricks," where each brick is an exported directory on a server backed by a local filesystem (XFS is recommended).
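To make the "no metadata server" point concrete, the sketch below shows hash-based placement in miniature: every client runs the same deterministic function on a file name and arrives at the same brick without asking a central server. This is an illustration only. The helper name pick_brick is hypothetical, and it uses cksum with a plain modulo, whereas the real distribute translator uses its own hash function and per-directory hash ranges.

```shell
# Toy placement: hash the file name, then map the hash onto the brick list.
# NOT the real GlusterFS algorithm; pick_brick is a hypothetical helper.
pick_brick() (
    name=$1
    shift                                   # remaining arguments are the bricks
    hash=$(printf '%s' "$name" | cksum | cut -d ' ' -f 1)
    index=$(( hash % $# ))                  # 0 .. (number of bricks - 1)
    shift "$index"
    printf '%s\n' "$1"
)

# The same name always maps to the same brick, on any client.
pick_brick report.pdf node1:/data/brick1 node2:/data/brick1 node3:/data/brick1
```

Because the result depends only on the name and the brick list, no lookup service is needed; the trade-off is that changing the brick list changes placements, which is why a rebalance follows adding bricks.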
2. Architecture#
        +---------------------+
        |  GlusterFS Client   |   (FUSE, NFS-Ganesha, or libgfapi)
        +----------+----------+
                   |
            +------+------+
            | Translator  |   (DHT, AFR, EC, etc.)
            |    Stack    |
            +------+------+
                   |
  +----------+----------+----------+
  | Brick 1  | Brick 2  | Brick 3  |   (Server-side, local XFS/ext4)
  | node-1   | node-2   | node-3   |
  +----------+----------+----------+
Key processes:
- glusterd - management daemon running on every node; handles volume configuration and peer management
- glusterfsd - brick process; one per brick, serves data to clients
- glusterfs (FUSE) - client-side mount process
3. Volume Types#
GlusterFS supports several volume types, each suited to different workloads:
Distribute#
Spreads files across bricks using a hash algorithm. No redundancy; losing one brick loses the files it holds.
gluster volume create dist-vol node1:/data/brick1 node2:/data/brick1
- Capacity: Sum of all bricks
- Use case: Maximum capacity when redundancy is handled elsewhere (e.g., underlying RAID)
Replicate#
Maintains identical copies of files across bricks. Equivalent to RAID 1.
gluster volume create repl-vol replica 3 \
    node1:/data/brick1 node2:/data/brick1 node3:/data/brick1
- Capacity: Size of one brick
- Use case: High availability, small to medium volumes
Arbiter#
A variant of replicate where the third brick stores only file and directory names and their metadata, not file data. Provides split-brain protection at lower storage cost than a full 3-way replica.
gluster volume create arb-vol replica 3 arbiter 1 \
    node1:/data/brick1 node2:/data/brick1 node3:/data/arbiter1
- Capacity: Size of one data brick (arbiter brick needs only ~1-5% of data brick size)
- Use case: Deployments with two full data nodes plus a lightweight third node, needing split-brain protection without tripling storage
Stripe (Deprecated)#
Splits individual files across bricks for parallel access to large files. Deprecated in earlier releases and removed in GlusterFS 6.0; use sharding instead.
Dispersed (Erasure Coding)#
Splits files into fragments with configurable redundancy, similar to RAID 5/6.
gluster volume create disp-vol disperse 6 redundancy 2 \
    node{1..6}:/data/brick1
- Capacity: (Total bricks - redundancy) x brick size
- Use case: Large-scale storage with space-efficient redundancy
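The capacity formula above can be checked with a few lines of shell arithmetic. This is a minimal sketch for a single dispersed set; dispersed_capacity is a hypothetical helper, not a gluster command, and the 4 TB brick size is an assumed example value.

```shell
# Usable capacity of one dispersed set, in the same unit as the brick size.
# dispersed_capacity is a hypothetical helper for planning, not part of gluster.
dispersed_capacity() (
    bricks=$1
    redundancy=$2
    brick_size=$3
    echo $(( ($bricks - $redundancy) * $brick_size ))
)

# disperse 6, redundancy 2, 4 TB bricks
dispersed_capacity 6 2 4    # prints 16
```

With 6 bricks and redundancy 2, any 2 bricks can fail without data loss, and two thirds of the raw capacity stays usable.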
Distributed-Replicate#
Distributes files across replica sets. Combines scalability with redundancy.
gluster volume create dist-repl-vol replica 3 \
    node{1..6}:/data/brick1
This creates 2 replica sets of 3 bricks each. Files are distributed across sets, replicated within each set.
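Which bricks end up in the same replica set depends on the order in which they are listed: each consecutive group of "replica" bricks forms one set. The sketch below prints that grouping for a brick list so the layout can be reviewed before running volume create. The helper name replica_sets is hypothetical.

```shell
# Group an ordered brick list into replica sets of the given size.
# replica_sets is a hypothetical planning helper, not a gluster command.
replica_sets() (
    replica=$1
    shift
    set_no=1
    while [ "$#" -ge "$replica" ]; do
        members=""
        i=0
        while [ "$i" -lt "$replica" ]; do
            members="$members $1"           # take the next brick in order
            shift
            i=$((i + 1))
        done
        printf 'set %d:%s\n' "$set_no" "$members"
        set_no=$((set_no + 1))
    done
)

replica_sets 3 node1 node2 node3 node4 node5 node6
# set 1: node1 node2 node3
# set 2: node4 node5 node6
```

Listing bricks so that each group spans different servers keeps all copies of a file off a single machine.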
Distributed-Dispersed#
Distributes files across dispersed sub-volumes.
gluster volume create dist-disp-vol disperse 3 redundancy 1 \
    node{1..6}:/data/brick1
4. Installation#
Debian/Ubuntu#
sudo apt update
sudo apt install -y glusterfs-server
sudo systemctl start glusterd
sudo systemctl enable glusterd
RHEL/CentOS/Rocky#
sudo dnf install -y centos-release-gluster10
sudo dnf install -y glusterfs-server
sudo systemctl start glusterd
sudo systemctl enable glusterd
Arch Linux#
sudo pacman -S glusterfs
sudo systemctl start glusterd
sudo systemctl enable glusterd
5. Disk Preparation#
XFS is the recommended filesystem for bricks due to its extended attribute support and performance characteristics.
# Format the partition with 512-byte inodes, leaving room for extended attributes
# (assumes /dev/sdb1 has already been created)
sudo mkfs.xfs -i size=512 /dev/sdb1
# Create the brick mount point
sudo mkdir -p /data/brick1
# Add to fstab for persistent mounting
echo '/dev/sdb1 /data/brick1 xfs defaults 0 0' | sudo tee -a /etc/fstab
# Mount all
sudo mount -a
Always use a dedicated partition or disk for bricks. Never place bricks on the root filesystem.
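A quick pre-flight check can catch a brick directory that accidentally sits on the root filesystem, for example because the mount failed. The sketch below is one possible way to do that using only POSIX df; brick_on_rootfs is a hypothetical helper name.

```shell
# Succeeds (exit 0) when the given path lives on the root filesystem.
# brick_on_rootfs is a hypothetical helper; it relies only on POSIX df output.
brick_on_rootfs() (
    mount_point=$(df -P "$1" 2>/dev/null | awk 'NR == 2 { print $6 }')
    [ "$mount_point" = "/" ]
)

if brick_on_rootfs /data/brick1; then
    echo "WARNING: /data/brick1 is on the root filesystem" >&2
fi
```

Run the check after mounting the brick. A path that does not exist yields no warning, so create the directory first.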
6. Firewall Configuration#
GlusterFS requires several ports for inter-node communication:
| Port | Protocol | Purpose |
|---|---|---|
| 111 | TCP/UDP | Portmapper (rpcbind) |
| 24007 | TCP | glusterd management |
| 24008 | TCP | glusterd RDMA (if used) |
| 49152+ | TCP | Brick ports (one per brick, starting at 49152) |
# Using firewalld
sudo firewall-cmd --zone=public --add-port=111/tcp --permanent
sudo firewall-cmd --zone=public --add-port=111/udp --permanent
sudo firewall-cmd --zone=public --add-port=24007-24008/tcp --permanent
sudo firewall-cmd --zone=public --add-port=49152-49251/tcp --permanent
sudo firewall-cmd --reload
The brick port range is configurable in /etc/glusterfs/glusterd.vol. Adjust firewall rules if you change the default range.
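Since each brick takes one port counting up from the base port, the range to open follows from the number of bricks a node will host. A minimal sketch, assuming the default base port of 49152 from the table above; brick_port_range is a hypothetical helper.

```shell
# Port range needed for COUNT bricks on one node.
# The optional second argument overrides the assumed default base port of 49152.
# brick_port_range is a hypothetical helper, not a gluster command.
brick_port_range() (
    count=$1
    base=${2:-49152}
    echo "${base}-$(( $base + $count - 1 ))"
)

brick_port_range 100    # prints 49152-49251
```

The 49152-49251 range in the firewalld example therefore covers 100 bricks per node; leave some headroom for bricks added later.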
7. Volume Configuration#
Peer Probing#
Before creating volumes, establish trust between nodes:
# Run from one node to add others to the trusted storage pool
sudo gluster peer probe node2
sudo gluster peer probe node3
# Verify peer status
sudo gluster peer status
Creating and Starting a Volume#
# Create a replica-3 volume
sudo gluster volume create myvol replica 3 \
node1:/data/brick1/myvol \
node2:/data/brick1/myvol \
node3:/data/brick1/myvol
# Start the volume
sudo gluster volume start myvol
# Verify volume info
sudo gluster volume info myvol
Tuning Volume Options#
# Set the read cache size (io-cache translator)
sudo gluster volume set myvol performance.cache-size 256MB
# Enable client-side read-ahead
sudo gluster volume set myvol performance.read-ahead on
# Set number of I/O threads
sudo gluster volume set myvol performance.io-thread-count 32
# Enable sharding (for large files)
sudo gluster volume set myvol features.shard on
sudo gluster volume set myvol features.shard-block-size 64MB
# List all current options
sudo gluster volume get myvol all
8. Security and Access Control#
IP-Based Access Restriction#
# Allow only specific networks (quote the value so the shell does not expand the wildcards)
sudo gluster volume set myvol auth.allow '192.0.2.*,198.51.100.*'
# Deny specific hosts
sudo gluster volume set myvol auth.reject 192.0.2.100
TLS/SSL Encryption#
GlusterFS supports TLS for encrypting management and I/O traffic:
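# Enable TLS on the I/O path (client-to-brick connections)
sudo gluster volume set myvol client.ssl on
sudo gluster volume set myvol server.ssl on
# Restrict access to these certificate common names
sudo gluster volume set myvol auth.ssl-allow node1,node2,node3,client1
Management traffic is encrypted separately: create the file /var/lib/glusterd/secure-access on every node and client, then restart glusterd. By default, GlusterFS reads the certificate, private key, and CA bundle from /etc/ssl/glusterfs.pem, /etc/ssl/glusterfs.key, and /etc/ssl/glusterfs.ca on each node and client.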
9. Client Configuration#
FUSE Mount (Native Client)#
# Install the GlusterFS client
sudo apt install -y glusterfs-client # Debian/Ubuntu
sudo dnf install -y glusterfs-fuse # RHEL/CentOS
# Mount the volume
sudo mkdir -p /mnt/glusterfs
sudo mount -t glusterfs node1:/myvol /mnt/glusterfs
Persistent Mount via fstab#
node1:/myvol /mnt/glusterfs glusterfs defaults,_netdev,backupvolfile-server=node2 0 0
The backupvolfile-server option provides failover if the primary node is unreachable during mount.
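When the same entry has to be generated for many clients, the fstab line can be assembled from its parts. The sketch below only prints the line and touches nothing on disk; gluster_fstab_line is a hypothetical helper, and the host, volume, and mount point names are the placeholders used throughout this page.

```shell
# Print an fstab entry for a native GlusterFS mount with one backup volfile server.
# gluster_fstab_line is a hypothetical helper; it only writes to stdout.
gluster_fstab_line() (
    primary=$1
    volume=$2
    mount_point=$3
    backup=$4
    printf '%s:/%s %s glusterfs defaults,_netdev,backupvolfile-server=%s 0 0\n' \
        "$primary" "$volume" "$mount_point" "$backup"
)

gluster_fstab_line node1 myvol /mnt/glusterfs node2
```

Review the output, then append it with sudo tee -a /etc/fstab. The backup server matters only while fetching the volume file at mount time; once mounted, the client talks to all bricks directly.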
NFS Access#
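GlusterFS can serve clients that cannot run the native FUSE client over NFS. The built-in Gluster NFS server (NFSv3 only) is controlled by the nfs.disable option and is disabled by default in recent releases:
sudo gluster volume set myvol nfs.disable off
For NFSv4, use NFS-Ganesha instead and configure the Ganesha export file separately.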
10. Geo-Replication#
Geo-replication asynchronously mirrors a volume to a remote GlusterFS cluster, useful for disaster recovery and content distribution.
Setup#
# Generate the common pem keys used by geo-replication
# (passwordless SSH from this node to remote-node must already be set up)
sudo gluster system:: execute gsec_create
# Create the session and push the public keys to the remote cluster
sudo gluster volume geo-replication myvol \
    remote-node::remote-vol create push-pem
# Start geo-replication
sudo gluster volume geo-replication myvol \
    remote-node::remote-vol start
Monitoring#
# Check geo-replication status
sudo gluster volume geo-replication myvol \
    remote-node::remote-vol status detail
Key Options#
# Set the number of parallel sync workers
sudo gluster volume geo-replication myvol \
    remote-node::remote-vol config sync-jobs 4
# Use changelog-based change detection for efficiency
# (the option is spelled change_detector in some documentation)
sudo gluster volume geo-replication myvol \
    remote-node::remote-vol config change-detector changelog
11. Snapshots#
GlusterFS supports volume-level snapshots using thinly-provisioned LVM underneath.
Prerequisites#
- Bricks must reside on LVM thin-provisioned logical volumes
- The snapshot scheduler must be enabled (needed only for scheduled snapshots)
Creating Snapshots#
# Create a snapshot
sudo gluster snapshot create snap1 myvol
# List snapshots
sudo gluster snapshot list myvol
# Activate a snapshot for mounting
sudo gluster snapshot activate snap1
# Restore a volume from a snapshot (volume must be stopped)
sudo gluster volume stop myvol
sudo gluster snapshot restore snap1
sudo gluster volume start myvol
Snapshot Scheduling#
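# Enable shared storage, which the scheduler requires
sudo gluster volume set all cluster.enable-shared-storage enable
# Initialize the scheduler on every node, then enable it
sudo snap_scheduler.py init
sudo snap_scheduler.py enable
# Create a scheduled snapshot (every day at 02:00)
sudo snap_scheduler.py add "daily-snap" "0 2 * * *" myvol
Snapshot Configuration#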
# Set max snapshots per volume (hard limit)
sudo gluster snapshot config snap-max-hard-limit 256
# Set soft limit percentage (warning threshold)
sudo gluster snapshot config snap-max-soft-limit 90
# Enable auto-delete of the oldest snapshot once the soft limit is crossed
sudo gluster snapshot config auto-delete enable
12. Monitoring#
Built-in Commands#
# Volume status with detailed brick info
sudo gluster volume status myvol detail
# Brick-level I/O stats
sudo gluster volume profile myvol start
sudo gluster volume profile myvol info
# Self-heal status (for replicated volumes)
sudo gluster volume heal myvol info
# Count entries pending heal on each brick
sudo gluster volume heal myvol statistics heal-count
Prometheus with gluster_exporter#
Use gluster_exporter to expose GlusterFS metrics to Prometheus:
# Install and run
gluster_exporter --metrics-path=/metrics --listen=:9189
Key metrics to monitor:
| Metric | Description |
|---|---|
| Brick disk usage | Prevents capacity exhaustion |
| Heal pending count | Nonzero indicates ongoing or stalled recovery |
| Peer connection state | Detects disconnected nodes or a network partition |
| Volume throughput | Tracks read/write performance trends |
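The heal pending count can also be derived from the CLI without an exporter. The sketch below sums the "Number of entries" lines that gluster volume heal <vol> info prints for each brick. sum_heal_entries is a hypothetical helper that reads the command output on stdin, and the sample input is illustrative rather than captured from a real cluster.

```shell
# Sum the per-brick "Number of entries" lines from heal info output on stdin.
# sum_heal_entries is a hypothetical helper; non-numeric counts are treated as 0.
sum_heal_entries() {
    awk -F': ' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}

# Illustrative sample input, not captured from a real cluster.
sum_heal_entries <<'EOF'
Brick node1:/data/brick1/myvol
Status: Connected
Number of entries: 2

Brick node2:/data/brick1/myvol
Status: Connected
Number of entries: 0

Brick node3:/data/brick1/myvol
Status: Connected
Number of entries: 3
EOF
```

On a live cluster the equivalent call would be sudo gluster volume heal myvol info | sum_heal_entries; a value that stays above zero across several checks suggests a stalled heal.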
Log Locations#
| Log | Path |
|---|---|
| glusterd | /var/log/glusterfs/glusterd.log |
| Brick | /var/log/glusterfs/bricks/<brick-path>.log |
| Client (FUSE) | /var/log/glusterfs/<volume>.log |
| Geo-replication | /var/log/glusterfs/geo-replication/ |
| Self-heal | /var/log/glusterfs/glustershd.log |
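The <brick-path> part of a brick log name is the brick directory with its leading slash dropped and the remaining slashes replaced by hyphens. A minimal sketch of that mapping, with brick_log_path as a hypothetical helper name:

```shell
# Map a brick directory to its log file under /var/log/glusterfs/bricks/.
# brick_log_path is a hypothetical helper, not a gluster command.
brick_log_path() (
    brick_dir=${1#/}                        # drop the leading slash
    echo "/var/log/glusterfs/bricks/$(printf '%s' "$brick_dir" | tr '/' '-').log"
)

brick_log_path /data/brick1/myvol
# /var/log/glusterfs/bricks/data-brick1-myvol.log
```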
13. Troubleshooting#
| Issue | Cause | Solution |
|---|---|---|
| Peer Rejected during probe | Mismatched GlusterFS versions or existing stale peer entries | Ensure identical versions on all nodes; clear /var/lib/glusterd/peers/ on the rejecting node and re-probe |
| Split-brain on replicated volume | Network partition caused divergent writes | Run gluster volume heal <vol> info split-brain, then resolve with gluster volume heal <vol> split-brain source-brick <brick> |
| Volume mount fails with Transport endpoint not connected | glusterfsd brick process crashed or is unreachable | Restart the brick: gluster volume start <vol> force; check brick logs |
| Self-heal not progressing | Heal daemon stalled or too many entries | Trigger a heal with gluster volume heal <vol>; if the self-heal daemon is down, restart it with gluster volume start <vol> force; check glustershd.log |
| High latency on small file workloads | Metadata overhead in distributed-replicate | Enable the metadata-cache option group: gluster volume set <vol> group metadata-cache |
| Brick full, volume degraded | Uneven data distribution or capacity mismatch | Add new bricks if capacity is short, then rebalance with gluster volume rebalance <vol> start |
| Geo-replication stuck in Faulty state | SSH key issues or changelog corruption | Check SSH connectivity; restart the geo-rep session; if the changelog is corrupt, delete the session with gluster volume geo-replication <vol> <remote>::<rvol> delete reset-sync-time and recreate it so it resyncs from scratch |
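For the peer-related rows, a first diagnostic step is to list peers whose state is anything other than "Peer in Cluster (Connected)". The sketch below filters gluster peer status output read from stdin. unhealthy_peers is a hypothetical helper, the sample input (including the UUIDs) is illustrative, and exact output spacing may differ between versions.

```shell
# Print "host: state" for every peer not in the healthy state.
# unhealthy_peers is a hypothetical helper reading `gluster peer status` on stdin.
unhealthy_peers() {
    awk -F': ' '
        /^Hostname:/ { host = $2 }
        /^State:/ && $2 != "Peer in Cluster (Connected)" { print host ": " $2 }
    '
}

# Illustrative sample input, not captured from a real cluster.
unhealthy_peers <<'EOF'
Number of Peers: 2

Hostname: node2
Uuid: 00000000-0000-0000-0000-000000000002
State: Peer in Cluster (Connected)

Hostname: node3
Uuid: 00000000-0000-0000-0000-000000000003
State: Peer Rejected (Connected)
EOF
# node3: Peer Rejected (Connected)
```

On a live node the equivalent call would be sudo gluster peer status | unhealthy_peers; empty output means every peer is connected and in the cluster.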