Native Docker clustering and orchestration tool that turns a pool of Docker hosts into a single virtual host with service scheduling, scaling, and rolling updates.

Deprecated: Docker Swarm is in maintenance-only mode and receives no new features. For new container orchestration deployments, use Kubernetes or Docker Compose (for single-host setups). Existing Swarm clusters continue to function, but migration planning is recommended.

Table of Contents

  1. Overview
  2. Installation
  3. Firewall Configuration
  4. Cluster Setup
  5. Cheat Sheet
  6. Troubleshooting

1. Overview

Docker Swarm mode is built into the Docker Engine and provides native orchestration capabilities:

  • Declarative service model: Define the desired state and Swarm maintains it
  • Scaling: Scale services up or down with a single command
  • Rolling updates: Update services incrementally, task by task, with no downtime when a service runs multiple replicas
  • Service discovery: Built-in DNS-based service discovery and load balancing
  • Mutual TLS: Automatic TLS encryption between nodes
  • Secrets management: Encrypted storage for sensitive data

Node roles:

Role     Description
Manager  Orchestrates the cluster, schedules tasks, maintains the Raft consensus state. Odd numbers recommended (3 or 5).
Worker   Executes container workloads. No access to cluster management.
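
The odd-number recommendation follows from Raft's majority rule: a swarm with N managers needs floor(N/2) + 1 of them reachable to keep quorum, so it tolerates floor((N - 1)/2) manager failures. The sketch below only prints that arithmetic; the function names are illustrative and not part of Docker.

```shell
# Quorum size for a swarm with N managers: a strict majority.
manager_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of managers that can fail while quorum is still held.
manager_fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 4 5; do
  echo "managers=$n quorum=$(manager_quorum "$n") tolerated_failures=$(manager_fault_tolerance "$n")"
done
```

Going from 3 to 4 managers raises the quorum from 2 to 3 while the number of tolerated failures stays at 1, which is why even manager counts add cost without adding resilience.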

2. Installation

Docker Swarm mode is included with Docker Engine. Install Docker using your distribution's package manager:

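# Arch Linux
sudo pacman -S docker

# Debian / Ubuntu (the docker-ce packages come from Docker's apt repository, which must be configured first)
sudo apt install docker-ce docker-ce-cli containerd.io

# RHEL / Fedora (the docker-ce packages come from Docker's dnf repository, which must be configured first)
sudo dnf install docker-ce docker-ce-cli containerd.io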

Enable and start Docker:

sudo systemctl enable --now docker

3. Firewall Configuration

Swarm requires the following ports open between all nodes:

Port  Protocol   Purpose
2377  TCP        Cluster management and Raft consensus
7946  TCP + UDP  Node-to-node communication (gossip)
4789  UDP        VXLAN overlay network traffic (default data path port)

iptables

# Manager and worker nodes
sudo iptables -A INPUT -p tcp --dport 2377 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 7946 -j ACCEPT
sudo iptables -A INPUT -p udp --dport 7946 -j ACCEPT
sudo iptables -A INPUT -p udp --dport 4789 -j ACCEPT

# Persist rules (Debian/Ubuntu)
sudo apt install iptables-persistent
sudo netfilter-persistent save

# Persist rules (RHEL/Fedora)
sudo dnf install iptables-services
sudo service iptables save

firewalld

sudo firewall-cmd --permanent --zone=public --add-port=2377/tcp
sudo firewall-cmd --permanent --zone=public --add-port=7946/tcp
sudo firewall-cmd --permanent --zone=public --add-port=7946/udp
sudo firewall-cmd --permanent --zone=public --add-port=4789/udp
sudo firewall-cmd --reload

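nftables

# Merge these rules into the input chain of your existing ruleset.
# A chain without a "type filter hook input" declaration is never evaluated.
table inet filter {
  chain input {
    tcp dport 2377 accept comment "Swarm management"
    tcp dport 7946 accept comment "Swarm gossip TCP"
    udp dport 7946 accept comment "Swarm gossip UDP"
    udp dport 4789 accept comment "Swarm VXLAN overlay"
  }
}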

ufw

sudo ufw allow 2377/tcp
sudo ufw allow 7946/tcp
sudo ufw allow 7946/udp
sudo ufw allow 4789/udp

Note: If you run in a VMware NSX environment, use a custom data path port (e.g., --data-path-port 14789) to avoid conflicts with NSX's own VXLAN traffic on port 4789. Open that custom port instead.
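
When the data path port is customized, only the last entry of the port list changes. As a minimal sketch, the helper below prints the port/protocol pairs from the table above for a given data path port and then shows, as a dry run, the matching ufw commands. The function name swarm_ports is hypothetical, and 14789 is just the example port from the note.

```shell
# Print the port/protocol pairs Swarm needs, one per line.
# $1: data path (VXLAN) port; defaults to 4789.
swarm_ports() {
  data_path_port="${1:-4789}"
  printf '%s\n' "2377/tcp" "7946/tcp" "7946/udp" "${data_path_port}/udp"
}

# Dry run: print the ufw commands for a custom data path port
# instead of executing them.
swarm_ports 14789 | while read -r rule; do
  echo "sudo ufw allow $rule"
done
```

Printing the commands first makes it easy to review them before running them, and the same list can be fed to any of the firewall tools above.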

4. Cluster Setup

4.1 Initializing the Swarm

On the first manager node:

# Basic initialization
docker swarm init

# Specify the advertise address (required on multi-NIC hosts)
docker swarm init --advertise-addr <manager-ip>

# Custom data path port (for VMware NSX or port conflict scenarios)
docker swarm init --data-path-port 14789
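
After initialization it can be useful to confirm that the engine actually reports the node as a swarm member. The sketch below wraps docker info with a Go-template format string; the function names are illustrative, and it assumes the local Docker daemon is reachable by the current user.

```shell
# Print this node's swarm state as reported by the local engine,
# for example "active", "inactive", or "locked".
swarm_state() {
  docker info --format '{{.Swarm.LocalNodeState}}'
}

# Succeed only if the node is an active swarm member.
swarm_is_active() {
  [ "$(swarm_state)" = "active" ]
}

# Usage (on the node where `docker swarm init` was run):
#   swarm_is_active && docker node ls
```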

4.2 Joining Nodes

After initialization, get the join tokens:

# Get the worker join token
docker swarm join-token worker

# Get the manager join token
docker swarm join-token manager

On each node to join:

# Join as a worker
docker swarm join --token <worker-token> <manager-ip>:2377

# Join as a manager
docker swarm join --token <manager-token> <manager-ip>:2377
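
For scripted provisioning, the token can be read without the surrounding help text by using the -q (--quiet) flag of docker swarm join-token. The following is a sketch under that assumption; swarm_join_command is a hypothetical helper, and 192.0.2.10 is a placeholder documentation address.

```shell
# Build the join command for a given role ("worker" or "manager").
# Must run on a manager, because only managers can read join tokens.
# $1: role, $2: manager address reachable from the joining node.
swarm_join_command() {
  role="$1"
  manager_addr="$2"
  # -q prints only the token, without the usage hint.
  token="$(docker swarm join-token -q "$role")" || return 1
  echo "docker swarm join --token $token $manager_addr:2377"
}

# Usage:
#   swarm_join_command worker 192.0.2.10
```

The printed command can then be executed on the joining node, for example over SSH. Join tokens grant cluster membership, so treat the output as a credential.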

4.3 Draining Manager Nodes

To prevent manager nodes from running application workloads, set their availability to drain:

docker node update --availability drain <node_name>

A drained manager still takes part in Raft consensus and cluster management but receives no service tasks. Tasks already running on it are shut down and rescheduled onto nodes with active availability.
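
In clusters with several managers, the drain step can be applied to all of them in one pass. A minimal sketch, assuming it runs on a manager; drain_managers is a hypothetical helper name, while the --filter role=manager and -q options belong to docker node ls.

```shell
# Set every manager node to drain so that only workers run service tasks.
# `docker node ls -q --filter role=manager` prints one node ID per line.
drain_managers() {
  docker node ls -q --filter role=manager | while read -r node_id; do
    docker node update --availability drain "$node_id"
  done
}

# Usage:
#   drain_managers
#   docker node ls --format '{{.Hostname}} {{.ManagerStatus}} {{.Availability}}'
```

The second usage line lists each node with its manager status and availability, which is a quick way to confirm the result.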

5. Cheat Sheet

5.1 Swarm Commands

Command                                       Description
docker swarm init                             Initialize a new swarm
docker swarm init --advertise-addr <ip>       Initialize with a specific advertise address
docker swarm join --token <token> <ip>:2377   Join a node to the swarm
docker swarm leave                            Leave the swarm (worker)
docker swarm leave --force                    Force a manager to leave the swarm
docker swarm join-token worker                Display the worker join token
docker swarm join-token manager               Display the manager join token
docker swarm join-token --rotate worker       Rotate the worker join token
docker swarm unlock                           Unlock a locked swarm after restart
docker swarm unlock-key                       Display the unlock key
docker swarm unlock-key --rotate              Rotate the unlock key
docker swarm update --autolock=true           Enable autolock on the swarm
docker swarm ca                               Display the root CA certificate (add --rotate to rotate it)

5.2 Node Commands

Command                                              Description
docker node ls                                       List all nodes in the swarm
docker node inspect <node>                           Show detailed node information
docker node ps <node>                                List tasks running on a node
docker node promote <node>                           Promote a worker to manager
docker node demote <node>                            Demote a manager to worker
docker node rm <node>                                Remove a node from the swarm
docker node update --availability active <node>      Set node to accept tasks
docker node update --availability pause <node>       Prevent new tasks, keep existing
docker node update --availability drain <node>       Reschedule all tasks off the node
docker node update --label-add <key>=<value> <node>  Add a label to a node
docker node update --label-rm <key> <node>           Remove a label from a node

5.3 Service Commands

Command                                                Description
docker service create --name <name> <image>            Create a new service
docker service ls                                      List all services
docker service ps <service>                            List tasks (containers) for a service
docker service inspect <service>                       Show detailed service information
docker service logs <service>                          Show service logs
docker service scale <service>=<n>                     Scale to N replicas
docker service update --image <image>:<tag> <service>  Update the service image (rolling update)
docker service update --force <service>                Force redeployment of all tasks
docker service rollback <service>                      Roll back to the previous version
docker service rm <service>                            Remove a service
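
The create and update commands are typically combined with an update policy so that a bad image does not take down every replica at once. The sketch below shows one conservative policy using the --update-parallelism, --update-delay, and --update-failure-action flags of docker service create; the helper name, the replica count, and the nginx tags are placeholders chosen for illustration.

```shell
# Create a replicated service with a conservative rolling-update policy:
# one task at a time, 10s between batches, automatic rollback on failure.
# $1: service name, $2: image reference.
create_rolling_service() {
  docker service create \
    --name "$1" \
    --replicas 3 \
    --update-parallelism 1 \
    --update-delay 10s \
    --update-failure-action rollback \
    "$2"
}

# Usage:
#   create_rolling_service web nginx:1.25
#   docker service update --image nginx:1.27 web
```

With this policy, the later docker service update replaces tasks one by one and reverts to the previous service definition if the update fails.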

5.4 Stack Commands

Command                                      Description
docker stack deploy -c compose.yaml <stack>  Deploy a stack from a Compose file
docker stack ls                              List all stacks
docker stack ps <stack>                      List tasks in a stack
docker stack services <stack>                List services in a stack
docker stack rm <stack>                      Remove a stack
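
For reference, a stack file is an ordinary Compose file whose deploy section carries the Swarm-specific settings. The sketch below writes a minimal one from a shell heredoc. The service name web, the stack name demo, the nginx image, and the published port are placeholders, and the version key is included on the assumption that some older CLI releases require it for docker stack deploy.

```shell
# Write a minimal Compose file that `docker stack deploy` can consume.
# $1: output path.
write_demo_stack() {
  cat > "$1" <<'EOF'
version: "3.8"
services:
  web:
    image: nginx:1.27
    ports:
      - "8080:80"
    deploy:
      replicas: 2
      update_config:
        parallelism: 1
        delay: 10s
EOF
}

# Usage:
#   write_demo_stack compose.yaml
#   docker stack deploy -c compose.yaml demo
#   docker stack services demo
```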

5.5 Secret and Config Commands

Command                                        Description
echo "secret" | docker secret create <name> -  Create a secret from stdin (echo appends a trailing newline to the value)
docker secret create <name> <file>             Create a secret from a file
docker secret ls                               List all secrets
docker secret inspect <name>                   Inspect a secret (metadata only)
docker secret rm <name>                        Remove a secret
docker config create <name> <file>             Create a config from a file
docker config ls                               List all configs
docker config rm <name>                        Remove a config
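
When a secret is created from stdin with echo, the trailing newline becomes part of the stored value, which can break password comparisons. A minimal sketch that avoids this by using printf; create_secret and the secret name db_password are hypothetical names used for illustration.

```shell
# Create a secret from a value without a trailing newline.
# printf '%s' writes the value exactly as given, unlike echo.
# $1: secret name, $2: secret value.
create_secret() {
  printf '%s' "$2" | docker secret create "$1" -
}

# Usage:
#   create_secret db_password 'example-value'
#   docker service create --name <name> --secret db_password <image>
```

Inside the service's Linux containers the secret is then available as the file /run/secrets/db_password. Values typed on the command line can end up in shell history, so for real credentials reading from a file is usually the safer route.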

6. Troubleshooting

Issue: Error response from daemon: This node is not a swarm manager
  Cause: Command run on a worker node
  Solution: Run the command on a manager node, or promote this node from a manager

Issue: Node shows as Down in docker node ls
  Cause: Network issue or Docker stopped on the node
  Solution: Verify connectivity on ports 2377 and 7946; restart Docker on the node

Issue: could not find a leader after manager restart
  Cause: Swarm autolock enabled
  Solution: Run docker swarm unlock with the unlock key

Issue: Service stuck in Pending state
  Cause: No node meets constraints or resources exhausted
  Solution: Check docker service ps <service> for error messages; verify node labels and resource availability

Issue: Overlay network unreachable between nodes
  Cause: Firewall blocking UDP 4789 (or custom data path port)
  Solution: Open the VXLAN port on all nodes

Issue: rpc error: transport is closing
  Cause: Raft consensus lost (majority of managers down)
  Solution: Restore quorum by running docker swarm init --force-new-cluster on a surviving manager

Issue: Rolling update fails
  Cause: New image crashes or health check fails
  Solution: Check docker service ps <service> for task errors; run docker service rollback <service> to revert

Issue: network not found during stack deploy
  Cause: Network from a previous deployment was not cleaned up
  Solution: Remove the old stack first with docker stack rm <stack>, wait, then redeploy

Issue: Tasks rescheduled after node drain
  Cause: Expected behavior
  Solution: Drain triggers task migration; set the node back to active when maintenance is complete
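
Several of the entries above start with docker service ps, whose default output truncates the error column. As a sketch, the helper below prints each task with its node, state, and full error text using the --no-trunc and --format options; the function name is hypothetical.

```shell
# Show every task of a service with its node, current state,
# and untruncated error message.
# $1: service name.
service_task_errors() {
  docker service ps --no-trunc \
    --format '{{.Name}} on {{.Node}}: {{.CurrentState}} {{.Error}}' "$1"
}

# Usage:
#   service_task_errors <service>
```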
