GlusterFS is a scalable, distributed network filesystem that aggregates storage bricks from multiple servers into a single, unified namespace.

Addresses below are RFC 5737 documentation ranges or placeholders - swap in your own.

Table of Contents#

  1. Overview
  2. Architecture
  3. Volume Types
  4. Installation
  5. Disk Preparation
  6. Firewall Configuration
  7. Volume Configuration
  8. Security and Access Control
  9. Client Configuration
  10. Geo-Replication
  11. Snapshots
  12. Monitoring
  13. Troubleshooting
  14. See Also
  15. Sources

1. Overview#

GlusterFS pools storage resources from commodity servers into a large parallel filesystem. It operates entirely in user space with no kernel modifications required on the server side. Key characteristics:

  • No metadata server - eliminates a single point of failure; clients use an elastic hash algorithm to locate data
  • POSIX-compatible - applications need no modification to use GlusterFS volumes
  • Modular translator architecture - features like replication, striping, caching, and encryption are implemented as stackable translators
  • Scales to petabytes - tested in production at hundreds of nodes
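
As a rough illustration of how placement can work without a metadata server, the sketch below derives a brick from the file name alone. It is a simplification and not the DHT translator's real algorithm: the function name pick_brick, the use of cksum, and the modulo step are assumptions made for this example, whereas GlusterFS uses its own hash function and per-directory hash ranges.

```shell
# Illustrative only: cksum and a modulo stand in for the real elastic hash.
# pick_brick NAME BRICK... prints the brick that NAME maps to.
pick_brick() (
    name=$1
    shift
    # cksum prints "<crc> <length>"; keep only the CRC
    crc=$(printf '%s' "$name" | cksum | cut -d ' ' -f 1)
    # Skip (crc mod brick-count) bricks, then print the next one
    shift $(( crc % $# ))
    printf '%s\n' "$1"
)
```

Calling pick_brick report.pdf node1:/data/brick1 node2:/data/brick1 returns the same brick every time, on any client, which is why no central lookup service is needed.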

GlusterFS stores data in units called "bricks," where each brick is an exported directory on a server backed by a local filesystem (XFS is recommended).

2. Architecture#

+----------------------+
|   GlusterFS Client   |  (FUSE, NFS-Ganesha, or libgfapi)
+----------+-----------+
           |
    +------+-------+
    |  Translator  |  (DHT, AFR, EC, etc.)
    |    Stack     |
    +------+-------+
           |
+----------+----------+----------+
|  Brick 1 |  Brick 2 |  Brick 3 |  (Server-side, local XFS/ext4)
|  node-1  |  node-2  |  node-3  |
+----------+----------+----------+

Key processes:

  • glusterd - management daemon running on every node; handles volume configuration and peer management
  • glusterfsd - brick process; one per brick, serves data to clients
  • glusterfs (FUSE) - client-side mount process

3. Volume Types#

GlusterFS supports several volume types, each suited to different workloads:

Distribute#

Spreads files across bricks using a hash algorithm. No redundancy; losing one brick loses the files it holds.

gluster volume create dist-vol node1:/data/brick1 node2:/data/brick1

  • Capacity: Sum of all bricks
  • Use case: Maximum capacity when redundancy is handled elsewhere (e.g., underlying RAID)

Replicate#

Maintains identical copies of files across bricks. Equivalent to RAID 1.

gluster volume create repl-vol replica 3 \
  node1:/data/brick1 node2:/data/brick1 node3:/data/brick1

  • Capacity: Size of one brick
  • Use case: High availability, small to medium volumes

Arbiter#

A variant of replicate in which the third brick of each replica set stores only file and directory names and metadata (including extended attributes), not file contents. It provides split-brain protection at lower storage cost than a full 3-way replica.

gluster volume create arb-vol replica 3 arbiter 1 \
  node1:/data/brick1 node2:/data/brick1 node3:/data/arbiter1

  • Capacity: Size of one data brick (the arbiter brick typically needs only ~1-5% of a data brick, depending on the number of files)
  • Use case: Deployments with two full storage nodes plus a lightweight third node that need split-brain protection without tripling storage
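
The arbiter's space requirement grows with the number of files, not with the amount of data. The sketch below applies the sizing rule of thumb from the upstream arbiter documentation as I understand it (4 KB per file, with the file count estimated as brick size divided by average file size). Treat both the rule and the function name arbiter_min_kb as assumptions to verify against your version's documentation.

```shell
# Rule of thumb (assumption, see lead-in):
#   arbiter size >= 4 KB * (data brick size / average file size)
# arbiter_min_kb BRICK_KB AVG_FILE_KB prints the estimate in KB.
arbiter_min_kb() (
    brick_kb=$1       # size of the largest data brick, in KB
    avg_file_kb=$2    # expected average file size, in KB
    echo $(( 4 * brick_kb / avg_file_kb ))
)
```

For a 1 TB data brick (1073741824 KB), an average file size of 2 GB gives 2048 KB, while an average of 64 KB gives 67108864 KB (64 GB, about 6% of the data brick). Small-file workloads can therefore exceed the 1-5% range quoted above.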

Stripe (Deprecated)#

Splits individual files across bricks for parallel access to large files. The stripe translator was deprecated and then removed in GlusterFS 6.0; use sharding instead.

Dispersed (Erasure Coding)#

Splits files into fragments with configurable redundancy, similar to RAID 5/6.

gluster volume create disp-vol disperse 6 redundancy 2 \
  node{1..6}:/data/brick1

  • Capacity: (Total bricks - redundancy) x brick size
  • Use case: Large-scale storage with space-efficient redundancy
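
The capacity rules for the volume types above can be checked with a little shell arithmetic. This is a minimal sketch: the function names are made up for this example, every brick is assumed to be the same size, and filesystem overhead is ignored.

```shell
# All sizes use the same unit (for example TB). Brick counts must be a
# multiple of the replica or disperse count, as gluster itself requires.

# Distribute: capacity is the sum of all bricks
distribute_capacity() (
    bricks=$1; brick_size=$2
    echo $(( bricks * brick_size ))
)

# Replicate / distributed-replicate: one brick's worth per replica set
replica_capacity() (
    bricks=$1; replica=$2; brick_size=$3
    echo $(( bricks / replica * brick_size ))
)

# Dispersed / distributed-dispersed: (disperse - redundancy) bricks' worth
# of usable space per dispersed set
disperse_capacity() (
    bricks=$1; disperse=$2; redundancy=$3; brick_size=$4
    echo $(( bricks / disperse * (disperse - redundancy) * brick_size ))
)
```

With six 10 TB bricks, disperse_capacity 6 6 2 10 prints 40, replica_capacity 6 3 10 prints 20, and distribute_capacity 6 10 prints 60.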

Distributed-Replicate#

Distributes files across replica sets. Combines scalability with redundancy.

gluster volume create dist-repl-vol replica 3 \
  node{1..6}:/data/brick1

Creates 2 replica sets of 3 bricks each. Files are distributed across sets, replicated within each set.
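
Replica sets are formed from consecutive bricks in the order given on the command line. The sketch below prints that grouping so it can be checked before the volume is created; replica_sets is a hypothetical helper for this example, not a gluster command.

```shell
# replica_sets REPLICA_COUNT BRICK...
# Prints one line per replica set, grouping bricks in command-line order.
replica_sets() (
    replica=$1
    shift
    set_no=1
    i=0
    line=""
    for brick in "$@"; do
        line="$line $brick"
        i=$(( i + 1 ))
        # Every REPLICA_COUNT bricks closes one replica set
        if [ $(( i % replica )) -eq 0 ]; then
            printf 'set %d:%s\n' "$set_no" "$line"
            set_no=$(( set_no + 1 ))
            line=""
        fi
    done
)
```

When a server contributes more than one brick, order the list so that no set contains two bricks from the same server; otherwise a single server failure can take out more than one copy.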

Distributed-Dispersed#

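Distributes files across dispersed sub-volumes. With disperse 3 redundancy 1 and six bricks, the command below creates two dispersed sets of three bricks each, and each set tolerates the loss of one brick.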

gluster volume create dist-disp-vol disperse 3 redundancy 1 \
  node{1..6}:/data/brick1

4. Installation#

Debian/Ubuntu#

sudo apt update
sudo apt install -y glusterfs-server
sudo systemctl start glusterd
sudo systemctl enable glusterd

RHEL/CentOS/Rocky#

sudo dnf install -y centos-release-gluster10
sudo dnf install -y glusterfs-server
sudo systemctl start glusterd
sudo systemctl enable glusterd

Arch Linux#

sudo pacman -S glusterfs
sudo systemctl start glusterd
sudo systemctl enable glusterd

5. Disk Preparation#

XFS is the recommended filesystem for bricks due to its extended attribute support and performance characteristics.

# Format the partition (assumes /dev/sdb1 has already been created)
sudo mkfs.xfs -i size=512 /dev/sdb1

# Create the brick mount point
sudo mkdir -p /data/brick1

# Add to fstab for persistent mounting
echo '/dev/sdb1 /data/brick1 xfs defaults 0 0' | sudo tee -a /etc/fstab

# Mount all
sudo mount -a

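Always use a dedicated partition or disk for bricks, and never place bricks on the root filesystem. Use a subdirectory of the mount point (for example /data/brick1/myvol) as the brick directory: gluster refuses to create a brick directly on a mount point unless force is given, and a subdirectory is simply missing when the filesystem fails to mount, so data cannot land on the underlying directory by accident.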
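
A quick pre-flight check can catch a brick path that would end up on the root filesystem, for instance because the brick filesystem did not mount. This is a minimal sketch built on df; the helper names are made up for this example, and mount points containing spaces are not handled.

```shell
# mount_point_of PATH prints the mount point of the filesystem holding PATH.
mount_point_of() (
    # df -P prints a header line, then one line whose last field is the
    # mount point
    df -P "$1" | awk 'NR == 2 { print $NF }'
)

# brick_on_rootfs PATH succeeds (exit 0) when PATH lives on the root filesystem.
brick_on_rootfs() (
    [ "$(mount_point_of "$1")" = "/" ]
)
```

For example, brick_on_rootfs /data/brick1 && echo "brick filesystem is not mounted" could run before creating a volume.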

6. Firewall Configuration#

GlusterFS requires several ports for inter-node communication:

Port     Protocol  Purpose
111      TCP/UDP   Portmapper (rpcbind)
24007    TCP       glusterd management
24008    TCP       glusterd RDMA (if used)
49152+   TCP       Brick ports (one per brick, starting at 49152)

# Using firewalld
sudo firewall-cmd --zone=public --add-port=111/tcp --permanent
sudo firewall-cmd --zone=public --add-port=111/udp --permanent
sudo firewall-cmd --zone=public --add-port=24007-24008/tcp --permanent
sudo firewall-cmd --zone=public --add-port=49152-49251/tcp --permanent
sudo firewall-cmd --reload

The brick port range is configurable in /etc/glusterfs/glusterd.vol. Adjust firewall rules if you change the default range.
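
The range opened above, 49152-49251, covers 100 brick ports per node. To size the rule for a different brick count or base port, the arithmetic can be wrapped in a small helper. The function names are hypothetical, and the second function only prints the firewall-cmd line without running it.

```shell
# brick_port_range BRICKS [BASE_PORT]
# Prints the port range needed for BRICKS bricks on one node.
brick_port_range() (
    bricks=$1
    base=${2:-49152}    # default first brick port
    echo "${base}-$(( base + bricks - 1 ))"
)

# firewalld_brick_rule BRICKS [BASE_PORT]
# Prints (does not execute) the matching firewalld command.
firewalld_brick_rule() (
    echo "firewall-cmd --zone=public --add-port=$(brick_port_range "$@")/tcp --permanent"
)
```

Leaving headroom above the current brick count avoids reopening the firewall each time a brick is added.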

7. Volume Configuration#

Peer Probing#

Before creating volumes, establish trust between nodes:

# Run from one node to add others to the trusted storage pool
sudo gluster peer probe node2
sudo gluster peer probe node3

# Verify peer status
sudo gluster peer status
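
Before creating a volume it helps to confirm that every probed peer is actually connected. The sketch below counts peers in the text that gluster peer status prints. It assumes the usual layout (a "Number of Peers: N" line and one "State: ... (Connected)" line per peer), which may differ between releases; the function names are made up for this example.

```shell
# Both helpers read `gluster peer status` output on standard input.

# Total number of peers reported
peers_total() (
    sed -n 's/^Number of Peers: //p'
)

# Number of peers whose state line ends in "(Connected)"
peers_connected() (
    # grep -c exits non-zero when the count is 0; keep the exit status clean
    grep -c '^State: .*(Connected)$' || true
)
```

A typical use is to capture the output once, for example status=$(sudo gluster peer status), feed it to both helpers with printf '%s\n' "$status", and proceed only when the two numbers match.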

Creating and Starting a Volume#

# Create a replica-3 volume
sudo gluster volume create myvol replica 3 \
  node1:/data/brick1/myvol \
  node2:/data/brick1/myvol \
  node3:/data/brick1/myvol

# Start the volume
sudo gluster volume start myvol

# Verify volume info
sudo gluster volume info myvol

Tuning Volume Options#

# Set the read cache size (io-cache translator, which runs on the client side)
sudo gluster volume set myvol performance.cache-size 256MB

# Enable client-side read-ahead
sudo gluster volume set myvol performance.read-ahead on

# Set number of I/O threads
sudo gluster volume set myvol performance.io-thread-count 32

# Enable sharding (for large files)
sudo gluster volume set myvol features.shard on
sudo gluster volume set myvol features.shard-block-size 64MB

# List all current options
sudo gluster volume get myvol all

8. Security and Access Control#

IP-Based Access Restriction#

# Allow only specific networks (quoted so the shell does not expand the wildcards)
sudo gluster volume set myvol auth.allow '192.0.2.*,198.51.100.*'

# Deny specific hosts
sudo gluster volume set myvol auth.reject 192.0.2.100

TLS/SSL Encryption#

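GlusterFS supports TLS for encrypting management and I/O traffic. The volume options below enable TLS on the I/O path (client-to-brick connections) for one volume:

# Enable TLS on the I/O path
sudo gluster volume set myvol client.ssl on
sudo gluster volume set myvol server.ssl on
# Restrict access to these certificate common names
sudo gluster volume set myvol auth.ssl-allow node1,node2,node3,client1

Each node and client needs its certificate in /etc/ssl/glusterfs.pem, its private key in /etc/ssl/glusterfs.key, and the trusted CA certificates concatenated in /etc/ssl/glusterfs.ca. To encrypt management (glusterd) traffic as well, create the empty file /var/lib/glusterd/secure-access on every node and client, then restart glusterd.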

9. Client Configuration#

FUSE Mount (Native Client)#

# Install the GlusterFS client
sudo apt install -y glusterfs-client   # Debian/Ubuntu
sudo dnf install -y glusterfs-fuse     # RHEL/CentOS

# Mount the volume
sudo mkdir -p /mnt/glusterfs
sudo mount -t glusterfs node1:/myvol /mnt/glusterfs

Persistent Mount via fstab#

node1:/myvol /mnt/glusterfs glusterfs defaults,_netdev,backupvolfile-server=node2 0 0

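The backupvolfile-server option names a fallback server for fetching the volume file if the primary node is unreachable at mount time. Once the volume is mounted, the client talks to all bricks directly, so node1 is not a single point of failure for I/O.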

NFS Access#

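Clients that cannot run the native FUSE client can reach a volume over NFS. NFS-Ganesha is the recommended NFS server (NFSv3 and NFSv4); it runs as a separate service and is configured through its own export file, not through a volume option.

The built-in Gluster NFS server (gNFS, NFSv3 only) is deprecated and is not built into many current packages. Where it is still available, it is enabled per volume:

sudo gluster volume set myvol nfs.disable off

Keep nfs.disable set to on for volumes exported through NFS-Ganesha, since the two servers conflict.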

10. Geo-Replication#

Geo-replication asynchronously mirrors a volume to a remote GlusterFS cluster, useful for disaster recovery and content distribution.

Setup#

# Generate the common pem key file on the primary cluster
# (passwordless SSH from this node to remote-node must already work)
sudo gluster system:: execute gsec_create

# Create the session and push the keys to the remote cluster
sudo gluster volume geo-replication myvol \
  remote-node::remote-vol create push-pem

# Start geo-replication
sudo gluster volume geo-replication myvol \
  remote-node::remote-vol start

Monitoring#

# Check geo-replication status
sudo gluster volume geo-replication myvol \
  remote-node::remote-vol status detail

Key Options#

# Set the number of parallel sync workers
sudo gluster volume geo-replication myvol \
  remote-node::remote-vol config sync-jobs 4

# Use changelog-based change detection (more efficient than a full crawl)
sudo gluster volume geo-replication myvol \
  remote-node::remote-vol config change-detector changelog

11. Snapshots#

GlusterFS supports volume-level snapshots using thinly-provisioned LVM underneath.

Prerequisites#

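  • Each brick must reside on its own thin-provisioned LVM logical volume
  • The volume must be started, with all bricks online, when the snapshot is taken
  • Scheduled snapshots additionally need the snapshot scheduler (see Snapshot Scheduling)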

Creating Snapshots#

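# Create a snapshot (no-timestamp keeps the name exactly as given;
# by default a GMT timestamp is appended to the snapshot name)
sudo gluster snapshot create snap1 myvol no-timestamp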

# List snapshots
sudo gluster snapshot list myvol

# Activate a snapshot for mounting
sudo gluster snapshot activate snap1

# Restore a volume from a snapshot (volume must be stopped)
sudo gluster volume stop myvol
sudo gluster snapshot restore snap1
sudo gluster volume start myvol

Snapshot Scheduling#

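# Prerequisite: shared storage enabled and mounted on all nodes
sudo gluster volume set all cluster.enable-shared-storage enable

# Initialize the scheduler on every node, then enable it once
sudo snap_scheduler.py init
sudo snap_scheduler.py enable

# Create a scheduled snapshot (every day at 02:00)
sudo snap_scheduler.py add "daily-snap" "0 2 * * *" "myvol"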

Snapshot Configuration#

# Set max snapshots per volume (hard limit)
sudo gluster snapshot config snap-max-hard-limit 256

# Set soft limit percentage (warning threshold)
sudo gluster snapshot config snap-max-soft-limit 90

# Automatically delete the oldest snapshot once the soft limit is exceeded
sudo gluster snapshot config auto-delete enable
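
The soft limit is a percentage of the hard limit, so the snapshot count at which it takes effect has to be derived. A minimal sketch follows; effective_soft_limit is a made-up helper, and the shell's integer division rounds down, which may not match how gluster itself rounds.

```shell
# effective_soft_limit HARD_LIMIT SOFT_PERCENT
# Prints the snapshot count that corresponds to the soft limit.
effective_soft_limit() (
    hard=$1
    pct=$2
    echo $(( hard * pct / 100 ))
)
```

With the values used above, effective_soft_limit 256 90 prints 230.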

12. Monitoring#

Built-in Commands#

# Volume status with detailed brick info
sudo gluster volume status myvol detail

# Brick-level I/O stats
sudo gluster volume profile myvol start
sudo gluster volume profile myvol info

# Self-heal status (for replicated volumes)
sudo gluster volume heal myvol info

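# Per-brick summary of entries pending heal or in split-brain
# (the older "info heal-failed" subcommand is no longer available in current releases)
sudo gluster volume heal myvol info summary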
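
For alerting, the per-brick counts reported by gluster volume heal myvol info can be reduced to a single number. The sketch below sums the "Number of entries:" lines from that output. It assumes the usual text layout, which can vary between releases, and pending_heals is a hypothetical helper name.

```shell
# pending_heals reads `gluster volume heal <vol> info` output on stdin and
# prints the total number of entries awaiting heal across all bricks.
pending_heals() (
    # Non-numeric values such as "-" (shown for unreachable bricks) count as 0
    awk -F': ' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
)
```

A total that stays above zero across several checks points to a stalled heal and is worth an alert.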

Prometheus with gluster_exporter#

Use gluster_exporter to expose GlusterFS metrics to Prometheus:

# Run the exporter (flag names differ between exporter projects and releases; confirm with --help)
gluster_exporter --metrics-path=/metrics --listen-address=:9189

Key metrics to monitor:

Metric                 Description
Brick disk usage       Prevents capacity exhaustion
Heal pending count     Nonzero indicates ongoing or stalled recovery
Peer connection state  Detects node failures or network partitions
Volume throughput      Tracks read/write performance trends

Log Locations#

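Log              Path
glusterd         /var/log/glusterfs/glusterd.log
Brick            /var/log/glusterfs/bricks/<brick-path>.log
Client (FUSE)    /var/log/glusterfs/<mount-point>.log
Geo-replication  /var/log/glusterfs/geo-replication/
Self-heal        /var/log/glusterfs/glustershd.log

In brick and client log names, the path is written with its slashes replaced by hyphens (a brick at /data/brick1/myvol logs to data-brick1-myvol.log).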
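
Brick and client log file names are derived from a path: the brick directory for brick logs and the mount point (not the volume name) for native client logs, with slashes turned into hyphens. The sketch below reproduces that naming so the right file can be located from a script. The helper names are made up, and the default log directory is assumed.

```shell
# gluster_log_name PATH
# Strips the leading and trailing slash, then replaces "/" with "-".
gluster_log_name() (
    printf '%s\n' "$1" | sed -e 's|^/||' -e 's|/$||' -e 's|/|-|g'
)

# Log file of a FUSE client mounted at MOUNT_POINT
client_log_path() (
    echo "/var/log/glusterfs/$(gluster_log_name "$1").log"
)

# Log file of the brick exported from BRICK_DIR
brick_log_path() (
    echo "/var/log/glusterfs/bricks/$(gluster_log_name "$1").log"
)
```

For example, client_log_path /mnt/glusterfs prints /var/log/glusterfs/mnt-glusterfs.log.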

13. Troubleshooting#

Issue: Peer Rejected during probe
  Cause: Mismatched GlusterFS versions or stale peer/volume configuration
  Solution: Ensure identical versions on all nodes. On the rejected node, stop glusterd, remove everything under /var/lib/glusterd/ except glusterd.info, start glusterd, probe a healthy peer, then restart glusterd

Issue: Split-brain on replicated volume
  Cause: Network partition caused divergent writes
  Solution: List affected files with gluster volume heal <vol> info split-brain, then resolve with gluster volume heal <vol> split-brain source-brick <host>:<brick-path> [<file>]

Issue: Volume mount fails with "Transport endpoint is not connected"
  Cause: glusterfsd brick process crashed or is unreachable
  Solution: Restart missing brick processes with gluster volume start <vol> force; check the brick logs

Issue: Self-heal not progressing
  Cause: Heal daemon stalled or too many entries
  Solution: Respawn the self-heal daemon with gluster volume start <vol> force, trigger a heal with gluster volume heal <vol> (or gluster volume heal <vol> full), and check glustershd.log

Issue: High latency on small file workloads
  Cause: Metadata overhead in distributed-replicate
  Solution: Enable metadata caching through the metadata-cache option group: gluster volume set <vol> group metadata-cache

Issue: Brick full, volume degraded
  Cause: Uneven data distribution or capacity mismatch
  Solution: Rebalance with gluster volume rebalance <vol> start; if the volume is simply out of space, add bricks first with gluster volume add-brick and then rebalance

Issue: Geo-replication stuck in Faulty state
  Cause: SSH key issues or changelog corruption
  Solution: Check SSH connectivity and restart the geo-rep session (stop, then start). If the changelog is corrupt, stop the session, delete it with gluster volume geo-replication <vol> <remote>::<rvol> delete reset-sync-time, then recreate and start it to force a full resync

See Also#

Sources#