GlusterFS · ArchWorks

GlusterFS is a scalable, distributed network filesystem that aggregates storage bricks from multiple servers into a single, unified namespace.

Addresses below are RFC 5737 documentation ranges or placeholders - swap in your own.

Table of Contents#

Overview
Architecture
Volume Types
Installation
Disk Preparation
Firewall Configuration
Volume Configuration
Security and Access Control
Client Configuration
Geo-Replication
Snapshots
Monitoring
Troubleshooting
See Also
Sources

1. Overview#

GlusterFS pools storage resources from commodity servers into a large parallel filesystem. It operates entirely in user space with no kernel modifications required on the server side. Key characteristics:

No metadata server - eliminates a single point of failure; clients use an elastic hash algorithm to locate data
POSIX-compatible - applications need no modification to use GlusterFS volumes
Modular translator architecture - features like replication, striping, caching, and encryption are implemented as stackable translators
Scales to petabytes - tested in production at hundreds of nodes

GlusterFS stores data in units called "bricks," where each brick is an exported directory on a server backed by a local filesystem (XFS is recommended).

2. Architecture#

+---------------------+
|  GlusterFS Client   |  (FUSE, NFS-Ganesha, or libgfapi)
+----------+----------+
           |
    +------+------+
    | Translator  |  (DHT, AFR, EC, etc.)
    |    Stack    |
    +------+------+
           |
+----------+----------+----------+
|  Brick 1 |  Brick 2 |  Brick 3 |  (Server-side, local XFS/ext4)
|  node-1  |  node-2  |  node-3  |
+----------+----------+----------+

Key processes:

glusterd - management daemon running on every node; handles volume configuration and peer management
glusterfsd - brick process; one per brick, serves data to clients
glusterfs (FUSE) - client-side mount process

3. Volume Types#

GlusterFS supports several volume types, each suited to different workloads:

Distribute#

Spreads files across bricks using a hash algorithm. No redundancy; losing one brick loses the files it holds.

gluster volume create dist-vol node1:/data/brick1 node2:/data/brick1

Capacity: Sum of all bricks
Use case: Maximum capacity when redundancy is handled elsewhere (e.g., underlying RAID)

Replicate#

Maintains identical copies of files across bricks. Equivalent to RAID 1.

gluster volume create repl-vol replica 3 \
  node1:/data/brick1 node2:/data/brick1 node3:/data/brick1

Capacity: Size of one brick
Use case: High availability, small to medium volumes

Arbiter#

A variant of replicate where the third brick stores only file names and metadata (directory structure and extended attributes) instead of file data. It provides split-brain protection at lower storage cost than a full 3-way replica.

gluster volume create arb-vol replica 3 arbiter 1 \
  node1:/data/brick1 node2:/data/brick1 node3:/data/arbiter1

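Capacity: Size of one data brick (the arbiter brick holds no file data, so it is typically sized at a few percent of a data brick; its needs grow with file count rather than data size)
Use case: Two full data nodes plus a lightweight third node, for split-brain protection without tripling storage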

Stripe (Deprecated)#

Splits individual files across bricks for parallel access to large files. Stripe volumes are deprecated and were removed in GlusterFS 6; use sharding instead.

Dispersed (Erasure Coding)#

Splits files into fragments with configurable redundancy, similar to RAID 5/6.

gluster volume create disp-vol disperse 6 redundancy 2 \
  node{1..6}:/data/brick1

Capacity: (Total bricks - redundancy) x brick size
Use case: Large-scale storage with space-efficient redundancy
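
The capacity formula above can be checked with a few lines of shell arithmetic. This is a minimal sketch: `disperse_capacity` is a hypothetical helper name, and the 10 TB brick size is an assumed figure for illustration only.

```shell
# Usable capacity of a single dispersed set:
#   (bricks - redundancy) x brick size
# Arguments: brick count, redundancy count, brick size (integer, any unit).
disperse_capacity() {
  local bricks="$1" redundancy="$2" brick_size="$3"
  echo $(( (bricks - redundancy) * brick_size ))
}

# The disperse 6 / redundancy 2 layout with assumed 10 TB bricks
disperse_capacity 6 2 10   # prints 40
```

With six bricks and redundancy 2, four bricks' worth of space holds data and two bricks' worth holds redundancy.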

Distributed-Replicate#

Distributes files across replica sets. Combines scalability with redundancy.

gluster volume create dist-repl-vol replica 3 \
  node{1..6}:/data/brick1

Creates 2 replica sets of 3 bricks each. Files are distributed across sets, replicated within each set.
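
Replica sets are formed from consecutive bricks in the order they appear on the command line, so brick order decides which nodes hold copies of the same file. The sketch below only prints that grouping and does not call gluster; `replica_sets` is a hypothetical helper name.

```shell
# Print how a brick list is grouped into replica sets: each run of
# <replica> consecutive bricks forms one set.
# Arguments: replica count, then the bricks in command-line order.
replica_sets() {
  local replica="$1" i=0 brick
  shift
  for brick in "$@"; do
    if [ $((i % replica)) -eq 0 ]; then
      printf 'set %d:' $((i / replica + 1))
    fi
    printf ' %s' "$brick"
    i=$((i + 1))
    if [ $((i % replica)) -eq 0 ]; then printf '\n'; fi
  done
  # Terminate the line if the last set is incomplete
  if [ $((i % replica)) -ne 0 ]; then printf '\n'; fi
}

replica_sets 3 node1:/data/brick1 node2:/data/brick1 node3:/data/brick1 \
  node4:/data/brick1 node5:/data/brick1 node6:/data/brick1
```

When a node contributes more than one brick, list the bricks so that each set spans different nodes; otherwise a single node failure can take out every copy of a file.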

Distributed-Dispersed#

Distributes files across dispersed sub-volumes.

gluster volume create dist-disp-vol disperse 3 redundancy 1 \
  node{1..6}:/data/brick1
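
Before running a create command like the one above, it can help to sanity-check the numbers. The sketch below assumes two rules: the brick count must be a multiple of the disperse count, and the disperse count must be greater than twice the redundancy. `check_disperse_layout` is a hypothetical helper, not a gluster command.

```shell
# Validate a distributed-dispersed layout and print the sub-volume count.
# Arguments: total bricks, disperse count, redundancy count.
check_disperse_layout() {
  local total="$1" disperse="$2" redundancy="$3"
  if [ "$redundancy" -lt 1 ] || [ "$disperse" -le $((2 * redundancy)) ]; then
    echo "invalid: disperse count must exceed 2 x redundancy" >&2
    return 1
  fi
  if [ $((total % disperse)) -ne 0 ]; then
    echo "invalid: brick count must be a multiple of the disperse count" >&2
    return 1
  fi
  echo "$((total / disperse)) sub-volumes of $((disperse - redundancy))+$redundancy"
}

check_disperse_layout 6 3 1   # prints: 2 sub-volumes of 2+1
```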

4. Installation#

Debian/Ubuntu#

sudo apt update
sudo apt install -y glusterfs-server
sudo systemctl start glusterd
sudo systemctl enable glusterd

RHEL/CentOS/Rocky#

sudo dnf install -y centos-release-gluster10
sudo dnf install -y glusterfs-server
sudo systemctl start glusterd
sudo systemctl enable glusterd

Arch Linux#

sudo pacman -S glusterfs
sudo systemctl start glusterd
sudo systemctl enable glusterd

5. Disk Preparation#

XFS is the recommended filesystem for bricks due to its extended attribute support and performance characteristics.

# Format the partition (assumes /dev/sdb1 already exists; -i size=512 leaves room for extended attributes)
sudo mkfs.xfs -i size=512 /dev/sdb1

# Create the brick mount point
sudo mkdir -p /data/brick1

# Add to fstab for persistent mounting
echo '/dev/sdb1 /data/brick1 xfs defaults 0 0' | sudo tee -a /etc/fstab

# Mount all
sudo mount -a

Always use a dedicated partition or disk for bricks. Never place bricks on the root filesystem.
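
A quick way to catch a brick directory that was created but never mounted is to compare its mount point with `/`. This is a small sketch; `on_root_fs` is a hypothetical helper, and it reports a nonexistent path as not being on the root filesystem.

```shell
# Succeeds (exit 0) when the given path lives on the root filesystem.
on_root_fs() {
  local mountpoint
  # df -P prints one POSIX-format line per filesystem; column 6 is the mount point.
  mountpoint=$(df -P "$1" 2>/dev/null | awk 'NR == 2 { print $6 }')
  [ "$mountpoint" = "/" ]
}

brick_dir=/data/brick1   # brick mount point used in this section
if on_root_fs "$brick_dir"; then
  echo "WARNING: $brick_dir is on the root filesystem" >&2
fi
```

Running this after `mount -a` and before `gluster volume create` flags the case where the fstab entry failed and the brick would silently land on `/`.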

6. Firewall Configuration#

GlusterFS requires several ports for inter-node communication:

Port	Protocol	Purpose
111	TCP/UDP	Portmapper (rpcbind)
24007	TCP	glusterd management
24008	TCP	glusterd RDMA (if used)
49152+	TCP	Brick ports (one per brick, starting at 49152)

# Using firewalld
sudo firewall-cmd --zone=public --add-port=111/tcp --permanent
sudo firewall-cmd --zone=public --add-port=111/udp --permanent
sudo firewall-cmd --zone=public --add-port=24007-24008/tcp --permanent
sudo firewall-cmd --zone=public --add-port=49152-49251/tcp --permanent
sudo firewall-cmd --reload

The brick port range is configurable in /etc/glusterfs/glusterd.vol. Adjust firewall rules if you change the default range.
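
To size the firewall range, a rough calculation from the number of bricks per node is enough. The sketch below assumes ports are handed out consecutively from the base port; `brick_port_range` is a hypothetical helper. Since a restarted brick may be assigned a different port, leaving some headroom is sensible.

```shell
# Port range needed for a given number of bricks on one node.
# Arguments: brick count, optional base port (defaults to 49152).
brick_port_range() {
  local bricks="$1" base="${2:-49152}"
  echo "${base}-$((base + bricks - 1))"
}

brick_port_range 100   # prints 49152-49251, the range opened above
```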

7. Volume Configuration#

Peer Probing#

Before creating volumes, establish trust between nodes:

# Run from one node to add others to the trusted storage pool
sudo gluster peer probe node2
sudo gluster peer probe node3

# Verify peer status
sudo gluster peer status

Creating and Starting a Volume#

# Create a replica-3 volume
sudo gluster volume create myvol replica 3 \
  node1:/data/brick1/myvol \
  node2:/data/brick1/myvol \
  node3:/data/brick1/myvol

# Start the volume
sudo gluster volume start myvol

# Verify volume info
sudo gluster volume info myvol

Tuning Volume Options#

# Set the read cache size (io-cache translator)
sudo gluster volume set myvol performance.cache-size 256MB

# Enable client-side read-ahead
sudo gluster volume set myvol performance.read-ahead on

# Set number of I/O threads
sudo gluster volume set myvol performance.io-thread-count 32

# Enable sharding (for large files)
sudo gluster volume set myvol features.shard on
sudo gluster volume set myvol features.shard-block-size 64MB

# List all current options
sudo gluster volume get myvol all

8. Security and Access Control#

IP-Based Access Restriction#

# Allow only specific networks
sudo gluster volume set myvol auth.allow '192.0.2.*,198.51.100.*'

# Deny specific hosts
sudo gluster volume set myvol auth.reject 192.0.2.100

TLS/SSL Encryption#

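GlusterFS supports TLS for encrypting I/O traffic (configured per volume) and management traffic (configured per node):

# Enable TLS on the I/O path between clients and bricks
sudo gluster volume set myvol client.ssl on
sudo gluster volume set myvol server.ssl on
# Restrict access to these certificate common names
sudo gluster volume set myvol auth.ssl-allow 'node1,node2,node3,client1'

Each node and client needs its private key in /etc/ssl/glusterfs.key, its certificate in /etc/ssl/glusterfs.pem, and the trusted CA bundle in /etc/ssl/glusterfs.ca. To encrypt management (glusterd) traffic as well, create the empty file /var/lib/glusterd/secure-access on every node and client, then restart glusterd.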
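
For a test cluster, self-signed certificates are usually enough. The sketch below assumes the `openssl` CLI is installed and uses the default file names `glusterfs.key` and `glusterfs.pem`; `make_gluster_cert` is a hypothetical helper, and it writes to a scratch directory rather than to /etc/ssl.

```shell
# Create a private key and a self-signed certificate for one node or client.
# Arguments: output directory, common name (the identity auth.ssl-allow matches).
make_gluster_cert() {
  local dir="$1" cn="$2"
  openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -subj "/CN=${cn}" \
    -keyout "${dir}/glusterfs.key" \
    -out "${dir}/glusterfs.pem" 2>/dev/null
}

# Example: generate into a scratch directory, then inspect the subject.
workdir=$(mktemp -d)
make_gluster_cert "$workdir" node1
openssl x509 -in "${workdir}/glusterfs.pem" -noout -subject
```

With self-signed certificates, the CA bundle on each machine is the concatenation of every peer's `glusterfs.pem`. Production setups would normally use certificates issued by a real CA instead.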

9. Client Configuration#

FUSE Mount (Native Client)#

# Install the GlusterFS client
sudo apt install -y glusterfs-client   # Debian/Ubuntu
sudo dnf install -y glusterfs-fuse     # RHEL/CentOS

# Mount the volume
sudo mkdir -p /mnt/glusterfs
sudo mount -t glusterfs node1:/myvol /mnt/glusterfs

Persistent Mount via fstab#

node1:/myvol /mnt/glusterfs glusterfs defaults,_netdev,backupvolfile-server=node2 0 0

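The backupvolfile-server option names a second server to fetch the volume file from if node1 is unreachable at mount time. Once mounted, the client talks to all bricks directly, so this option matters only during the mount itself.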
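
When several clients need the same entry, generating the fstab line avoids typos in the option string. A minimal sketch follows; `gluster_fstab_line` is a hypothetical helper that only prints the line.

```shell
# Build an fstab entry for a GlusterFS FUSE mount.
# Arguments: primary server, backup server, volume name, mount point.
gluster_fstab_line() {
  local primary="$1" backup="$2" volume="$3" mountpoint="$4"
  printf '%s:/%s %s glusterfs defaults,_netdev,backupvolfile-server=%s 0 0\n' \
    "$primary" "$volume" "$mountpoint" "$backup"
}

gluster_fstab_line node1 node2 myvol /mnt/glusterfs
```

To install the entry, pipe the output to `sudo tee -a /etc/fstab` and run `sudo mount -a`.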

NFS Access#

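Clients that cannot run the native FUSE client can reach a volume over NFS. The legacy built-in Gluster NFS server (NFSv3 only, and not shipped in every build) is enabled per volume:

sudo gluster volume set myvol nfs.disable off

For NFSv4, use NFS-Ganesha instead: leave nfs.disable on and configure the Ganesha export file separately.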

10. Geo-Replication#

Geo-replication asynchronously mirrors a volume to a remote GlusterFS cluster, useful for disaster recovery and content distribution.

Setup#

# Generate SSH keys and establish passwordless access
sudo gluster system:: execute gsec_create

# Push the public key to the remote cluster
sudo gluster volume geo-replication myvol \
  remote-node::remote-vol create push-pem

# Start geo-replication
sudo gluster volume geo-replication myvol \
  remote-node::remote-vol start

Monitoring#

# Check geo-replication status
sudo gluster volume geo-replication myvol \
  remote-node::remote-vol status detail

Key Options#

# Set the number of parallel sync jobs per worker
sudo gluster volume geo-replication myvol \
  remote-node::remote-vol config sync-jobs 4

# Use changelog-based change detection for efficiency (spelled change_detector in newer releases)
sudo gluster volume geo-replication myvol \
  remote-node::remote-vol config change-detector changelog

11. Snapshots#

GlusterFS supports volume-level snapshots using thinly-provisioned LVM underneath.

Prerequisites#

Bricks must reside on LVM thin-provisioned logical volumes
Each brick must be on its own thin LV that contains no other data (the snapshot scheduler is needed only for scheduled snapshots)

Creating Snapshots#

# Create a snapshot (no-timestamp keeps the name exactly "snap1"; by default a GMT timestamp is appended)
sudo gluster snapshot create snap1 myvol no-timestamp

# List snapshots
sudo gluster snapshot list myvol

# Activate a snapshot for mounting
sudo gluster snapshot activate snap1

# Restore a volume from a snapshot (volume must be stopped)
sudo gluster volume stop myvol
sudo gluster snapshot restore snap1
sudo gluster volume start myvol

Snapshot Scheduling#

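# Initialize the scheduler (run on every node; requires the shared storage volume:
# gluster volume set all cluster.enable-shared-storage enable)
sudo snap_scheduler.py init

# Enable the scheduler (run once, from any node)
sudo snap_scheduler.py enable

# Create a scheduled snapshot (every day at 02:00)
sudo snap_scheduler.py add "daily-snap" "0 2 * * *" myvol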

Snapshot Configuration#

# Set max snapshots per volume (hard limit)
sudo gluster snapshot config snap-max-hard-limit 256

# Set soft limit percentage (warning threshold)
sudo gluster snapshot config snap-max-soft-limit 90

# Enable auto-delete of the oldest snapshot once the soft limit is crossed
sudo gluster snapshot config auto-delete enable
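
The soft limit is a percentage of the hard limit, so the snapshot count at which it triggers has to be derived. The sketch below shows the arithmetic; `soft_limit_count` is a hypothetical helper, and rounding down is an assumption rather than documented behavior.

```shell
# Snapshot count corresponding to a soft-limit percentage of the hard limit.
# Arguments: hard limit, soft limit in percent.
soft_limit_count() {
  local hard="$1" percent="$2"
  echo $(( hard * percent / 100 ))   # integer division rounds down
}

soft_limit_count 256 90   # prints 230
```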

12. Monitoring#

Built-in Commands#

# Volume status with detailed brick info
sudo gluster volume status myvol detail

# Brick-level I/O stats
sudo gluster volume profile myvol start
sudo gluster volume profile myvol info

# Self-heal status (for replicated volumes)
sudo gluster volume heal myvol info

# Per-brick summary of pending heals and split-brain entries
sudo gluster volume heal myvol info summary
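
For alerting, a single number is often easier to handle than the full listing. The sketch below sums the `Number of entries:` lines that `gluster volume heal <vol> info` prints per brick. `count_heal_entries` is a hypothetical helper, and the heredoc is canned sample text standing in for real command output.

```shell
# Sum the "Number of entries:" lines from heal info output read on stdin.
# Bricks that report "-" (not connected) are skipped.
count_heal_entries() {
  awk -F': *' '/^Number of entries:/ && $2 ~ /^[0-9]+$/ { total += $2 }
               END { print total + 0 }'
}

# Canned sample; in practice: sudo gluster volume heal myvol info | count_heal_entries
count_heal_entries <<'EOF'
Brick node1:/data/brick1/myvol
/dir/file1
/dir/file2
Status: Connected
Number of entries: 2

Brick node2:/data/brick1/myvol
Status: Connected
Number of entries: 0

Brick node3:/data/brick1/myvol
Status: Transport endpoint is not connected
Number of entries: -
EOF
```

A total that stays above zero across several checks suggests that healing is stalled rather than merely in progress.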

Prometheus with gluster_exporter#

Use gluster_exporter to expose GlusterFS metrics to Prometheus:

# Run the exporter (install it first; flag names vary between exporter projects and versions)
gluster_exporter --metrics-path=/metrics --listen-address=:9189

Key metrics to monitor:

Metric	Description
Brick disk usage	Prevents capacity exhaustion
Heal pending count	Nonzero indicates ongoing or stalled recovery
Peer connection state	Detects disconnected nodes or a network partition
Volume throughput	Tracks read/write performance trends

Log Locations#

Log	Path
glusterd	`/var/log/glusterfs/glusterd.log`
Brick	`/var/log/glusterfs/bricks/<brick-path>.log`
Client (FUSE)	`/var/log/glusterfs/<mount-point>.log` (slashes in the mount path become hyphens, e.g. `mnt-glusterfs.log`)
Geo-replication	`/var/log/glusterfs/geo-replication/`
Self-heal	`/var/log/glusterfs/glustershd.log`
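
Brick log names are derived from the brick path: the leading slash is dropped and the remaining slashes become hyphens. The sketch below applies that mapping; `brick_log_path` is a hypothetical helper, and the log directory is assumed to be the default shown in the table.

```shell
# Derive the brick log file from a brick path.
# Argument: absolute brick path as given to "gluster volume create".
brick_log_path() {
  local brick_path
  brick_path="${1#/}"   # drop the leading slash
  echo "/var/log/glusterfs/bricks/$(printf '%s' "$brick_path" | tr '/' '-').log"
}

brick_log_path /data/brick1/myvol
# prints /var/log/glusterfs/bricks/data-brick1-myvol.log
```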

13. Troubleshooting#

Issue	Cause	Solution
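`Peer Rejected` in peer status	Volume configuration on the node does not match the rest of the pool (often after a version mismatch or stale state)	Ensure identical versions on all nodes; on the rejected node, stop glusterd, delete everything in `/var/lib/glusterd/` except `glusterd.info`, start glusterd, and re-probe from a healthy node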
Split-brain on replicated volume	Network partition caused divergent writes	Run `gluster volume heal <vol> info split-brain`, then resolve with `gluster volume heal <vol> split-brain source-brick <brick>`
Volume mount fails with `Transport endpoint not connected`	glusterfsd brick process crashed or is unreachable	Restart the brick: `gluster volume start <vol> force`; check brick logs
Self-heal not progressing	Heal daemon stalled or too many entries	Trigger a heal with `gluster volume heal <vol>`; if the self-heal daemon is down, respawn it with `gluster volume start <vol> force`; check `glustershd.log`
High latency on small file workloads	Metadata overhead in distributed-replicate	Enable `metadata-cache` translator; consider `group metadata-cache` preset: `gluster volume set <vol> group metadata-cache`
Brick full, volume degraded	Uneven data distribution or capacity mismatch	Add bricks with `gluster volume add-brick`, then redistribute data with `gluster volume rebalance <vol> start`
Geo-replication stuck in `Faulty` state	SSH key issues or changelog corruption	Check SSH connectivity; restart the geo-rep session; if the changelog is corrupt, stop the session, remove it with `gluster volume geo-replication <vol> <remote>::<rvol> delete reset-sync-time`, then recreate and start it so the sync begins from scratch