A lightweight, distributed block storage system for Kubernetes that provides persistent storage with built-in replication, snapshots, and backup.
Table of Contents#
- Overview
- Architecture
- Installation
- Volume Management
- Snapshots and Backups
- Disaster Recovery
- Performance Tuning
- Monitoring
- Multi-Node Failure Handling
- Troubleshooting
- See Also
- Sources
1. Overview#
Longhorn turns existing disk storage on Kubernetes nodes into a distributed, replicated persistent storage layer. Each volume is an independent microservice with its own controller and replicas, making the system resilient to individual node or disk failures.
Key features:
- Replicated storage - each volume is replicated across a configurable number of nodes
- Snapshots - point-in-time volume snapshots without service disruption
- Backups - incremental backups to S3-compatible or NFS external storage
- Disaster recovery - cross-cluster volume replication for DR scenarios
- Dynamic provisioning - integrates with Kubernetes StorageClass for automatic PV provisioning
- Web UI - built-in dashboard for volume, snapshot, and backup management
- Self-healing - automatic replica rebuilding when nodes recover or are replaced
- ReadWriteMany - RWX support via NFS
2. Architecture#
| Component | Role |
|---|---|
| Longhorn Manager | DaemonSet on every node; manages volumes, replicas, and the Longhorn API |
| Longhorn Engine | Storage controller for each volume; handles I/O and replication |
| Replica | A copy of the volume data stored on a node's disk; each volume has multiple replicas |
| CSI Driver | Kubernetes CSI plugin for dynamic provisioning and volume attachment |
| UI | Web-based management interface |
| Instance Manager | Manages engine and replica processes on each node |
| Share Manager | Provides NFS-based RWX access when needed |
| Backing Image Manager | Manages backing images for volume creation |
Data flow:
Pod -> CSI -> Longhorn Engine (controller) -> Replica 1 (Node A)
                                           -> Replica 2 (Node B)
                                           -> Replica 3 (Node C)
3. Installation#
3.1 Prerequisites#
- Kubernetes v1.25+
- kubectl and Helm 3
- Each node needs: open-iscsi installed, iscsid running, and at least one extra disk or partition (or use root disk with path configuration)
- Recommended: nfs-common on every node (for RWX and backup mounts)
Verify prerequisites:
# Check iscsid on each node
systemctl status iscsid
# Install if missing (Debian/Ubuntu)
sudo apt install open-iscsi
sudo systemctl enable --now iscsid
# Install if missing (RHEL/CentOS)
sudo yum install iscsi-initiator-utils
sudo systemctl enable --now iscsid
Or use the Longhorn environment check script:
curl -sSfL https://raw.githubusercontent.com/longhorn/longhorn/master/scripts/environment_check.sh | bash
3.2 Install via Helm#
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --create-namespace \
  --version 1.7.2 \
  --set defaultSettings.defaultReplicaCount=3
3.3 Install via kubectl#
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.7.2/deploy/longhorn.yaml
3.4 Verify Installation#
kubectl -n longhorn-system get pods
All pods (manager, driver-deployer, CSI components, UI) should be Running.
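To spot pods that are not yet Running without reading the whole listing, a small filter can help. This is a minimal sketch: the function name `not_running_pods` is hypothetical, and it assumes the standard `kubectl get pods --no-headers` column order (NAME, READY, STATUS, ...).

```shell
# Print the name of every pod whose STATUS column is not "Running".
# Reads `kubectl get pods --no-headers` output on stdin.
not_running_pods() {
  awk '$3 != "Running" { print $1 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl -n longhorn-system get pods --no-headers | not_running_pods
```

An empty result means every listed pod reports Running.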
3.5 Access the UI#
Port-forward:
kubectl -n longhorn-system port-forward svc/longhorn-frontend 8080:80
Or create an Ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: longhorn-ingress
  namespace: longhorn-system
  annotations:
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: longhorn-basic-auth
spec:
  rules:
    - host: <longhorn-hostname>
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: longhorn-frontend
                port:
                  number: 80
4. Volume Management#
4.1 StorageClass#
Longhorn creates a default StorageClass named longhorn. To customize replication and placement, define an additional StorageClass:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-fast
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  dataLocality: "best-effort"
  diskSelector: "ssd"
  nodeSelector: "storage"
4.2 Create a PVC#
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: <pvc-name>
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: <size>
4.3 ReadWriteMany (RWX)#
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: <pvc-name>
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: longhorn
  resources:
    requests:
      storage: <size>
Longhorn serves RWX volumes via an NFS share manager pod.
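Since the RWO and RWX claims differ only in access mode, the manifest can be generated from a few arguments. The following is a sketch, not part of Longhorn: `render_pvc` is a hypothetical helper, and the defaults (ReadWriteOnce, the longhorn StorageClass) mirror the examples above.

```shell
# Render a PVC manifest on stdout.
# Arguments: <pvc-name> <size> [access-mode] [storage-class]
render_pvc() {
  name="$1"
  size="$2"
  mode="${3:-ReadWriteOnce}"
  class="${4:-longhorn}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  accessModes:
    - ${mode}
  storageClassName: ${class}
  resources:
    requests:
      storage: ${size}
EOF
}

# Usage (requires kubectl access):
#   render_pvc shared-data 5Gi ReadWriteMany | kubectl apply -f -
```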
4.4 Volume Expansion#
Longhorn supports online volume expansion if allowVolumeExpansion: true is set on the StorageClass:
kubectl patch pvc <pvc-name> -p '{"spec":{"resources":{"requests":{"storage":"<new-size>"}}}}'
4.5 Data Locality#
| Setting | Behavior |
|---|---|
| disabled | No locality preference (default) |
| best-effort | Try to keep a replica on the same node as the consuming pod |
| strict-local | Volume only accessible on the node with the data; no replication |
4.6 Node and Disk Selection#
Tag nodes and disks in the Longhorn UI or via annotations, then use nodeSelector and diskSelector in the StorageClass to control placement.
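For the annotation route, node tags are passed as a JSON array. The sketch below builds that array from plain arguments; `tags_json` is a hypothetical helper, and the annotation name `node.longhorn.io/default-node-tags` in the usage comment is an assumption to verify against the Longhorn version in use (as far as I know it only takes effect on nodes that have no tags yet).

```shell
# Build a JSON array of tag strings from the arguments.
tags_json() {
  out=""
  for tag in "$@"; do
    # Append a comma separator only when out already holds an element.
    out="${out:+$out,}\"$tag\""
  done
  printf '[%s]\n' "$out"
}

# Usage (requires kubectl access; annotation name is an assumption):
#   kubectl annotate node <node-name> \
#     node.longhorn.io/default-node-tags="$(tags_json storage)"
```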
5. Snapshots and Backups#
5.1 Snapshots#
Snapshots are point-in-time captures of volume data stored locally on replica disks:
# Create via kubectl
kubectl -n longhorn-system apply -f - <<EOF
apiVersion: longhorn.io/v1beta2
kind: Snapshot
metadata:
  name: <snapshot-name>
spec:
  volume: <volume-name>
  # Ask Longhorn to take a new snapshot rather than reference an existing one
  createSnapshot: true
EOF
Or use the Longhorn UI: Volume > Take Snapshot.
5.2 Recurring Snapshots#
Schedule automatic snapshots via a RecurringJob:
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: snapshot-hourly
  namespace: longhorn-system
spec:
  cron: "0 * * * *"
  task: snapshot
  retain: 24
  concurrency: 1
  groups:
    - default
5.3 Backup Target Configuration#
Configure an external backup target (S3 or NFS) in Longhorn settings:
S3:
# In Longhorn settings or Helm values
defaultSettings:
  backupTarget: s3://<bucket-name>@<region>/
  backupTargetCredentialSecret: longhorn-backup-secret
Create the credentials Secret:
apiVersion: v1
kind: Secret
metadata:
  name: longhorn-backup-secret
  namespace: longhorn-system
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: <access-key>
  AWS_SECRET_ACCESS_KEY: <secret-key>
  AWS_ENDPOINTS: https://<s3-endpoint>
NFS:
defaultSettings:
  backupTarget: nfs://<nfs-server>:/<export-path>
5.4 Create a Backup#
From the UI: Volume > Create Backup.
Or schedule recurring backups:
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: backup-daily
  namespace: longhorn-system
spec:
  cron: "0 2 * * *"
  task: backup
  retain: 7
  concurrency: 1
  groups:
    - default
5.5 Restore from Backup#
- In the UI, go to Backup > select backup > Restore
- Or create a volume from backup via the API/CLI:
# List backups
kubectl -n longhorn-system get backups
# Restore creates a new volume from the backup
# Use the Longhorn UI or API to initiate restore
5.6 Restore from Snapshot#
In the UI: Volume > Snapshots > Revert to selected snapshot. This overwrites the current volume data.
6. Disaster Recovery#
6.1 DR Volume (Cross-Cluster Replication)#
A DR volume is a standby replica of a volume in another cluster, synced from the same backup target:
- Configure both clusters to use the same S3/NFS backup target
- Create regular backups of the source volume
- In the DR cluster, create a DR volume pointing to the same backup:
Longhorn UI > Volume > Create DR Volume
  - Name: <dr-volume-name>
  - Backup URL: s3://<bucket>@<region>/?volume=<source-volume>
- The DR volume polls for new backups and incrementally restores them
- To activate: detach the source volume, then activate the DR volume in the DR cluster
6.2 Full Cluster Restore#
If the entire source cluster is lost:
- Deploy Longhorn on a new cluster
- Configure the same backup target
- Browse available backups in the Longhorn UI
- Restore each volume from its latest backup
- Re-create PVs and PVCs pointing to the restored volumes
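The last step can be scripted by generating a static PV that points at the restored Longhorn volume. This is a sketch under assumptions: `render_static_pv` is a hypothetical helper, the ext4 filesystem, Retain policy, and ReadWriteOnce mode are illustrative choices, and the PV references the volume through the CSI `volumeHandle` field using the driver name from the StorageClass in section 4.1.

```shell
# Render a static PersistentVolume bound to an existing Longhorn volume.
# Arguments: <pv-name> <longhorn-volume-name> <size>
render_static_pv() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: $1
spec:
  capacity:
    storage: $3
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: longhorn
  csi:
    driver: driver.longhorn.io
    fsType: ext4
    volumeHandle: $2
EOF
}

# Usage (requires kubectl access):
#   render_static_pv <pv-name> <restored-volume-name> <size> | kubectl apply -f -
```

A PVC that sets `volumeName` to the PV name and requests the same StorageClass and size then binds to it.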
6.3 Volume Clone#
Create an identical copy of an existing volume for testing or migration:
Longhorn UI > Volume > Create PV/PVC from Volume (clone)
7. Performance Tuning#
7.1 Replica Count#
Fewer replicas reduce write latency but lower fault tolerance:
| Replicas | Write behavior | Fault tolerance |
|---|---|---|
| 1 | Fastest writes, no redundancy | None |
| 2 | Moderate, tolerates 1 node loss | 1 node |
| 3 | Default, tolerates 2 node losses | 2 nodes |
Set per StorageClass or per volume.
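The trade-off in the table reduces to simple arithmetic. As a sketch (both function names are hypothetical): a volume stays available as long as one replica survives, and every replica can grow to the full volume size, so the raw figure is an upper bound on disk consumption.

```shell
# Number of replica-holding nodes that can be lost while one replica survives.
# Argument: <replica-count>
tolerated_node_losses() {
  echo $(( $1 - 1 ))
}

# Upper bound on raw disk space consumed across the cluster, in GiB.
# Arguments: <volume-size-gib> <replica-count>
raw_storage_gib() {
  echo $(( $1 * $2 ))
}

# Example: a 10 GiB volume with 3 replicas
tolerated_node_losses 3   # prints 2
raw_storage_gib 10 3      # prints 30
```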
7.2 Data Locality#
Set dataLocality: best-effort to keep a replica on the same node as the workload, reducing network hops for reads.
7.3 Guaranteed Instance Manager CPU#
Reserve CPU for the instance manager pods, which run the Longhorn engine and replica processes, to avoid I/O stalls under load:
defaultSettings:
  guaranteedInstanceManagerCPU: 12
Value is the percentage of a node's total allocatable CPU reserved for each instance manager pod.
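To see what the setting means in absolute terms, the percentage can be converted to millicores. This sketch assumes the percentage is applied to the node's allocatable CPU, as the Longhorn settings reference describes it; the function name and the 8-core example node are hypothetical.

```shell
# CPU reserved for one instance manager pod, in millicores.
# Arguments: <node-allocatable-millicpu> <guaranteed-percentage>
reserved_millicpu() {
  echo $(( $1 * $2 / 100 ))
}

# Example: a node with 8 allocatable cores (8000m) at the value 12
reserved_millicpu 8000 12   # prints 960
```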
7.4 Storage Network#
Isolate Longhorn replication traffic on a dedicated network to avoid contention with application traffic:
defaultSettings:
  storageNetwork: kube-system/storage-net
Requires a Multus CNI configuration.
7.5 Disk Type and Scheduling#
- Use SSDs for performance-sensitive workloads; tag disks as ssd in the Longhorn UI
- Reference the tag in StorageClass: diskSelector: "ssd"
- Set storageOverProvisioningPercentage and storageMinimalAvailablePercentage to prevent disk exhaustion
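The two percentages act as scheduling limits on each disk. The sketch below shows the arithmetic as I understand it from the Longhorn settings reference, so treat both formulas as assumptions: over-provisioning caps the total scheduled size relative to capacity minus reserved space, and the minimal-available check requires that placing a new replica still leaves more than the given share of capacity free. Function names and figures are hypothetical.

```shell
# Upper bound on the total size of replicas schedulable to a disk, in GiB.
# Arguments: <capacity-gib> <reserved-gib> <over-provisioning-percentage>
max_schedulable_gib() {
  echo $(( ($1 - $2) * $3 / 100 ))
}

# Succeeds if placing a new replica leaves more than <min-pct> percent
# of the disk's capacity available.
# Arguments: <available-gib> <capacity-gib> <new-replica-gib> <min-pct>
leaves_enough_free() {
  [ $(( ($1 - $3) * 100 )) -gt $(( $2 * $4 )) ]
}

# Example: 500 GiB disk, 50 GiB reserved, 200% over-provisioning
max_schedulable_gib 500 50 200   # prints 900
```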
7.6 Volume Parameters#
| Parameter | Default | Tuning advice |
|---|---|---|
| numberOfReplicas | 3 | Lower for dev/test, keep 3 for production |
| staleReplicaTimeout | 2880 min | Reduce for faster cleanup of failed replicas |
| revisionCounterDisabled | false | Set true for performance if data integrity checks are done at the application level |
| dataLocality | disabled | Set best-effort for latency-sensitive workloads |
8. Monitoring#
8.1 Prometheus Metrics#
Longhorn exposes metrics at the /metrics endpoint on the Longhorn Manager:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: longhorn
  namespace: longhorn-system
spec:
  selector:
    matchLabels:
      app: longhorn-manager
  endpoints:
    - port: manager
Key metrics:
| Metric | Description |
|---|---|
| longhorn_volume_actual_size_bytes | Actual disk usage per volume |
| longhorn_volume_capacity_bytes | Provisioned capacity per volume |
| longhorn_volume_state | Volume state (for example attached or detached) |
| longhorn_volume_robustness | Robustness status (healthy, degraded, faulted) |
| longhorn_node_storage_capacity_bytes | Total storage capacity per node |
| longhorn_node_storage_usage_bytes | Storage usage per node |
| longhorn_disk_capacity_bytes | Capacity per disk |
| longhorn_disk_usage_bytes | Usage per disk |
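Without a Prometheus server, the raw /metrics output can still be checked by hand. The filter below is a sketch that lists volumes whose robustness value is not 1; it assumes 1 means healthy (consistent with the alert rules in section 8.3 using 2 for degraded and 3 for faulted) and that each sample carries a `volume` label. The function name, and the `longhorn-backend` service and port 9500 in the usage comment, are assumptions.

```shell
# Print the volume label of every longhorn_volume_robustness sample
# whose value is not 1. Reads Prometheus text-format metrics on stdin.
unhealthy_volumes() {
  grep '^longhorn_volume_robustness{' |
    awk '$NF != 1' |
    sed -n 's/.*volume="\([^"]*\)".*/\1/p'
}

# Usage (requires kubectl access; service name and port are assumptions):
#   kubectl -n longhorn-system port-forward svc/longhorn-backend 9500:9500 &
#   curl -s http://localhost:9500/metrics | unhealthy_volumes
```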
8.2 Grafana Dashboard#
Import the official Longhorn Grafana dashboard (ID: 13032). The disk scheduling thresholds from section 7.5 can be set via Helm values:
defaultSettings:
  storageOverProvisioningPercentage: 200
  storageMinimalAvailablePercentage: 15
8.3 Alerts#
Recommended Prometheus alert rules:
groups:
  - name: longhorn
    rules:
      - alert: LonghornVolumeDegraded
        expr: longhorn_volume_robustness == 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Longhorn volume {{ $labels.volume }} is degraded"
      - alert: LonghornVolumeFaulted
        expr: longhorn_volume_robustness == 3
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Longhorn volume {{ $labels.volume }} is faulted"
      - alert: LonghornNodeStorageLow
        expr: (longhorn_node_storage_usage_bytes / longhorn_node_storage_capacity_bytes) > 0.85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Longhorn node {{ $labels.node }} storage usage above 85%"
9. Multi-Node Failure Handling#
9.1 Volume Robustness States#
| State | Meaning | Action |
|---|---|---|
| Healthy | All replicas are online | None needed |
| Degraded | One or more replicas offline, but volume is functional | Longhorn auto-rebuilds replicas when nodes recover |
| Faulted | No healthy replica remains, volume inaccessible | Manual intervention required (salvage or restore from backup) |
9.2 Node Drain and Maintenance#
Before draining a node:
# Cordon the node
kubectl cordon <node-name>
# Check volume replica distribution
# Ensure no volume has all replicas on the node being drained
# Drain (Longhorn respects PDB)
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
Longhorn will automatically rebuild replicas on other available nodes.
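The replica distribution check before draining can be automated. The sketch below reads "volume node" pairs and prints the volumes whose replicas all sit on the given node. The function name is hypothetical, and the `spec.volumeName` and `spec.nodeID` field paths in the usage comment are assumptions about the Longhorn Replica resource that should be verified against the installed CRDs.

```shell
# Print volumes that have every replica on the given node.
# Argument: <node-name>; stdin: one "<volume> <node>" pair per line.
volumes_only_on_node() {
  awk -v node="$1" '
    { total[$1]++; if ($2 == node) on_node[$1]++ }
    END { for (v in total) if (total[v] == on_node[v]) print v }
  ' | sort
}

# Usage (requires kubectl access; field paths are assumptions):
#   kubectl -n longhorn-system get replicas.longhorn.io --no-headers \
#     -o custom-columns=VOLUME:.spec.volumeName,NODE:.spec.nodeID \
#     | volumes_only_on_node <node-name>
```

Any volume it prints would lose all of its replicas during the drain, so add a replica elsewhere first.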
9.3 Handling Simultaneous Node Failures#
If more nodes fail than the replica count minus one:
- Check volume status in the Longhorn UI
- For faulted volumes, use "Salvage" to recover from the remaining replica
- If the failed nodes will return, wait; replicas auto-resync on recovery
- If nodes are permanently lost, salvage what you can and restore from backup for the rest
9.4 Node Auto-Eviction#
Configure Longhorn to automatically evict replicas from a node before it goes down:
Longhorn UI > Node > Edit > Eviction Requested: true
Longhorn will rebuild replicas on other nodes before the node is removed.
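The same toggle can be applied from the command line by patching the Longhorn node resource. This is a sketch: `eviction_patch` is a hypothetical helper, and the `allowScheduling` and `evictionRequested` field names are assumptions about the `nodes.longhorn.io` spec; disabling scheduling alongside eviction keeps new replicas from landing on the node.

```shell
# Print a JSON merge patch that disables scheduling on a Longhorn node
# and sets the eviction flag. Argument: [true|false], default true.
eviction_patch() {
  printf '{"spec":{"allowScheduling":false,"evictionRequested":%s}}\n' "${1:-true}"
}

# Usage (requires kubectl access; field names are assumptions):
#   kubectl -n longhorn-system patch nodes.longhorn.io <node-name> \
#     --type merge -p "$(eviction_patch)"
```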
Troubleshooting#
| Issue | Cause | Solution |
|---|---|---|
| Volume stuck in Attaching | iSCSI not running on the node | Ensure iscsid is running: systemctl start iscsid |
| Replica rebuild fails | Insufficient disk space on target node | Free disk space or add storage; check storageMinimalAvailablePercentage |
| Volume degraded after node reboot | Replicas not yet rebuilt | Wait for automatic rebuild; check Longhorn Manager logs |
| PVC stuck in Pending | StorageClass misconfigured or no schedulable nodes | Verify StorageClass; check node scheduling in the Longhorn UI |
| Backup fails to S3 | Wrong credentials or endpoint | Verify the backup target Secret; test S3 access from within the cluster |
| Volume faulted, all replicas lost | Multiple simultaneous node failures | Use "Salvage" if any replica exists; otherwise restore from backup |
| Slow I/O performance | Replicas on slow disks or high network latency | Enable data locality; use SSD disk selector; isolate storage network |
| RWX volume not mounting | nfs-common not installed on nodes | Install nfs-common (or nfs-utils on RHEL) on all nodes |
| Engine upgrade stuck | Old engine images still in use | Drain and reschedule affected volumes; then upgrade engine images |
| UI inaccessible | Frontend pod not running or Ingress misconfigured | Check longhorn-frontend pod status; verify Ingress rules |