MetalLB is a load balancer implementation for bare-metal Kubernetes clusters. It provides external IP addresses for services using Layer 2 or BGP advertisement.
Addresses below are RFC 5737 documentation ranges or placeholders - swap in your own.
Table of Contents#
- Overview
- Architecture
- Installation
- Layer 2 Configuration
- BGP Configuration
- Advanced Features
- Troubleshooting
- See Also
- Sources
1. Overview#
Cloud Kubernetes providers offer integrated load balancers, but bare-metal clusters lack this feature. MetalLB fills the gap by implementing LoadBalancer-type services on bare-metal infrastructure. When a service of type LoadBalancer is created, MetalLB assigns an external IP from a configured pool and advertises it to the network.
Key features:
- Layer 2 mode - uses ARP (IPv4) or NDP (IPv6) to announce service IPs on the local network
- BGP mode - peers with network routers to advertise service IPs via BGP
- IP address pools - define multiple pools with different address ranges and policies
- CRD-based configuration - all configuration via Kubernetes custom resources (v0.13+)
- Selective advertisement - control which pools are advertised by which method and to which peers
- Dual-stack - supports both IPv4 and IPv6 addresses
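
Once a LoadBalancer service exists, the assigned address can be read back from the Service status. The sketch below wraps a standard kubectl JSONPath query in a small helper; the function name lb_ip and the service name in the usage note are illustrative and not part of MetalLB.

```sh
# lb_ip SERVICE [NAMESPACE]
# Prints the external IP recorded in the Service status, or nothing while the
# Service is still pending. lb_ip is a hypothetical helper name.
lb_ip() {
  kubectl get service "$1" \
    --namespace "${2:-default}" \
    --output jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```

For example, lb_ip my-service prints the address that MetalLB assigned from the pool. Empty output usually means no address has been assigned yet.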
2. Architecture#
| Component | Role |
|---|---|
| Controller | Deployment that handles IP assignment and pool management |
| Speaker | DaemonSet on every node; announces assigned IPs via ARP/NDP (L2) or BGP |
Layer 2 Mode#
One speaker node becomes the "leader" for each service IP and responds to ARP/NDP requests with its own MAC address. All traffic for that IP flows through the leader node, which then forwards it via kube-proxy. If the leader node fails, another speaker takes over.
Limitations:
- Single node handles all traffic for a given IP (no true load balancing across nodes)
- Failover relies on the memberlist protocol (typically 5-10 seconds)
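
To see which node currently answers for a service IP, one option is to list the events recorded against the Service. This is a minimal sketch that assumes the speaker records an event on the Service when it starts announcing; the helper name service_events is hypothetical, and the exact event wording may differ between MetalLB versions.

```sh
# service_events SERVICE [NAMESPACE]
# Lists events whose involved object is the given Service. In L2 mode the
# announcing node is expected to show up here (assumption, see lead-in).
service_events() {
  kubectl get events \
    --namespace "${2:-default}" \
    --field-selector "involvedObject.kind=Service,involvedObject.name=$1"
}
```

Running it before and after draining the leader node is a simple way to observe a failover.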
BGP Mode#
Every speaker node establishes a BGP session with one or more upstream routers and advertises the service IPs. The routers distribute traffic across all advertising nodes using ECMP (Equal-Cost Multi-Path).
Advantages:
- True load balancing across nodes
- Works with existing network infrastructure
- Supports traffic policies and community strings
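
The ECMP behavior can be pictured as a per-flow hash: packets of one connection always map to the same next hop, while different connections spread across nodes. The sketch below is an illustration only. It uses cksum as a stand-in for the router's hash, which in practice is vendor-specific, and ecmp_pick is a hypothetical helper name.

```sh
# ecmp_pick FLOW NODE_COUNT
# Maps a flow description (for example "src:port->dst:port/proto") to a node
# index in the range 0..NODE_COUNT-1. cksum stands in for the router's hash.
ecmp_pick() {
  crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $(( crc % $2 ))
}
```

Calling ecmp_pick "192.0.2.1:40000->203.0.113.10:80/tcp" 3 twice returns the same index both times, which is why an established connection stays on one node as long as the set of advertising nodes does not change.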
3. Installation#
3.1 Prerequisites#
- Kubernetes v1.13+ (v1.25+ recommended)
- kubectl configured for the cluster
- For BGP mode: a router that supports BGP peering
3.2 Install via Manifest#

```sh
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.9/config/manifests/metallb-native.yaml
```

3.3 Install via Helm#

```sh
helm repo add metallb https://metallb.github.io/metallb
helm repo update
helm install metallb metallb/metallb \
  --namespace metallb-system \
  --create-namespace
```

3.4 Verify Installation#

```sh
kubectl get pods -n metallb-system
```

Both the controller and speaker pods must be Running.
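
In scripts it is often more convenient to block until the pods are ready than to poll the pod list by hand. The following is a sketch using kubectl wait. The default selector app=metallb is an assumption that matches the manifest install; Helm releases typically use different labels, and wait_for_metallb is a hypothetical helper name.

```sh
# wait_for_metallb [SELECTOR] [TIMEOUT]
# Blocks until every pod matching SELECTOR in metallb-system is Ready, or
# fails once TIMEOUT elapses. The default selector is an assumption.
wait_for_metallb() {
  kubectl wait pod \
    --namespace metallb-system \
    --for=condition=Ready \
    --selector "${1:-app=metallb}" \
    --timeout "${2:-90s}"
}
```

If the default selector matches nothing, kubectl get pods -n metallb-system --show-labels shows which labels the pods actually carry.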
3.5 Enabling Strict ARP (kube-proxy IPVS mode)#
If kube-proxy runs in IPVS mode, enable strict ARP to prevent conflicts:

```sh
kubectl get configmap kube-proxy -n kube-system -o yaml | \
  sed -e "s/strictARP: false/strictARP: true/" | \
  kubectl apply -f - -n kube-system
```

4. Layer 2 Configuration#
4.1 IP Address Pool#

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: <pool-name>
  namespace: metallb-system
spec:
  addresses:
    - <start-ip>-<end-ip>
    - <cidr-range>
  autoAssign: true
  avoidBuggyIPs: true
```

| Field | Description |
|---|---|
| addresses | List of IP ranges (CIDR or start-end format) |
| autoAssign | If false, only services requesting this pool by name get an IP |
| avoidBuggyIPs | Skips addresses ending in .0 and .255, which some network devices mishandle |
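
When sizing a pool it helps to know how many addresses a start-end range actually yields, especially with avoidBuggyIPs enabled. The sketch below counts IPv4 addresses in an inclusive range using plain shell arithmetic. The helper names ip_to_int and pool_size are hypothetical, and the logic only approximates what the field descriptions above say rather than reproducing MetalLB's allocator.

```sh
# ip_to_int A.B.C.D
# Converts a dotted-quad IPv4 address to its integer value.
ip_to_int() {
  ip=$1
  a=${ip%%.*}; ip=${ip#*.}
  b=${ip%%.*}; ip=${ip#*.}
  c=${ip%%.*}; d=${ip#*.}
  echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

# pool_size START END [AVOID_BUGGY]
# Counts addresses in the inclusive range START..END. When AVOID_BUGGY is
# "true", addresses whose last octet is 0 or 255 are not counted.
pool_size() {
  i=$(ip_to_int "$1")
  end=$(ip_to_int "$2")
  avoid=${3:-false}
  count=0
  while [ "$i" -le "$end" ]; do
    last=$(( i & 255 ))
    if [ "$avoid" = "true" ] && { [ "$last" -eq 0 ] || [ "$last" -eq 255 ]; }; then
      i=$(( i + 1 ))
      continue
    fi
    count=$(( count + 1 ))
    i=$(( i + 1 ))
  done
  echo "$count"
}
```

For instance, pool_size 203.0.113.10 203.0.113.20 prints 11, and pool_size 198.51.100.0 198.51.100.255 true prints 254.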
4.2 L2 Advertisement#

```yaml
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: <advertisement-name>
  namespace: metallb-system
spec:
  ipAddressPools:
    - <pool-name>
  nodeSelectors:
    - matchLabels:
        kubernetes.io/os: linux
  interfaces:
    - <interface-name>
```

| Field | Description |
|---|---|
| ipAddressPools | Which pools to advertise (omit for all pools) |
| nodeSelectors | Restrict advertisement to specific nodes |
| interfaces | Restrict ARP responses to specific network interfaces |
4.3 Multiple Pools#

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: public-pool
  namespace: metallb-system
spec:
  addresses:
    - 203.0.113.10-203.0.113.20
---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: internal-pool
  namespace: metallb-system
spec:
  addresses:
    - 198.51.100.200-198.51.100.250
  autoAssign: false
```

Request a specific pool:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: <service-name>
  annotations:
    metallb.universe.tf/address-pool: internal-pool
spec:
  type: LoadBalancer
  # ...
```

4.4 Request a Specific IP#

```yaml
apiVersion: v1
kind: Service
metadata:
  name: <service-name>
spec:
  type: LoadBalancer
  loadBalancerIP: <desired-ip>
  # ...
```

4.5 IP Sharing#
Multiple services can share the same external IP if they use different ports:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: service-a
  annotations:
    metallb.universe.tf/allow-shared-ip: "shared-key"
spec:
  type: LoadBalancer
  loadBalancerIP: <shared-ip>
  ports:
    - port: 80
---
apiVersion: v1
kind: Service
metadata:
  name: service-b
  annotations:
    metallb.universe.tf/allow-shared-ip: "shared-key"
spec:
  type: LoadBalancer
  loadBalancerIP: <shared-ip>
  ports:
    - port: 443
```

5. BGP Configuration#
5.1 BGP Peer#

```yaml
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: <peer-name>
  namespace: metallb-system
spec:
  myASN: <local-asn>
  peerASN: <remote-asn>
  peerAddress: <router-ip>
  peerPort: 179
  holdTime: 90s
  keepaliveTime: 30s
  nodeSelectors:
    - matchLabels:
        kubernetes.io/os: linux
  # Authentication: normally set only one of password or passwordSecret.
  password: <bgp-password>
  passwordSecret:
    name: <secret-name>
    namespace: metallb-system
```

5.2 BGP Advertisement#

```yaml
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: <advertisement-name>
  namespace: metallb-system
spec:
  ipAddressPools:
    - <pool-name>
  aggregationLength: 32
  aggregationLengthV6: 128
  localPref: 100
  communities:
    - <community-string>
  peers:
    - <peer-name>
```

| Field | Description |
|---|---|
| aggregationLength | Prefix length for route aggregation (32 = per-IP) |
| localPref | BGP LOCAL_PREF attribute for route selection |
| communities | BGP community strings (e.g., 65535:65281 for no-export) |
| peers | Restrict advertisement to specific BGP peers |
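
The effect of aggregationLength is easiest to see by computing the prefix that covers a given service IP. The sketch below masks an IPv4 address down to a chosen length with shell arithmetic; aggregate_prefix is a hypothetical helper, and it only illustrates the prefix arithmetic, not how the speaker builds its advertisements.

```sh
# aggregate_prefix IP LENGTH
# Prints the IPv4 prefix of the given length that contains IP.
aggregate_prefix() {
  ip=$1
  len=$2
  a=${ip%%.*}; ip=${ip#*.}
  b=${ip%%.*}; ip=${ip#*.}
  c=${ip%%.*}; d=${ip#*.}
  addr=$(( (a << 24) + (b << 16) + (c << 8) + d ))
  # Keep the top LENGTH bits and clear the host bits.
  mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  net=$(( addr & mask ))
  echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))/$len"
}
```

With a length of 32, aggregate_prefix 203.0.113.10 32 prints 203.0.113.10/32, one route per service IP. With a length of 24, every address in the example pool collapses into the single route 203.0.113.0/24.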
5.3 Multiple BGP Peers (Redundancy)#

```yaml
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: router-1
  namespace: metallb-system
spec:
  myASN: 64500
  peerASN: 64501
  peerAddress: <router-1-ip>
---
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: router-2
  namespace: metallb-system
spec:
  myASN: 64500
  peerASN: 64501
  peerAddress: <router-2-ip>
```

5.4 Community Strings#
Define reusable community references:

```yaml
apiVersion: metallb.io/v1beta1
kind: Community
metadata:
  name: communities
  namespace: metallb-system
spec:
  communities:
    - name: no-export
      value: 65535:65281
    - name: no-advertise
      value: 65535:65282
```

Reference in BGPAdvertisement:

```yaml
spec:
  communities:
    - no-export
```

6. Advanced Features#
6.1 Traffic Policy#
Control how traffic reaches pods:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: <service-name>
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  # ...
```

| Policy | Behavior |
|---|---|
| Cluster (default) | Traffic distributed across all nodes via kube-proxy; source IP is NATed |
| Local | Traffic only sent to nodes with matching pods; preserves source IP |
With externalTrafficPolicy: Local in L2 mode, MetalLB announces the IP only from a node that runs at least one pod of the service.
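
Because the Local policy ties reachability to pod placement, it is useful to know which nodes currently host endpoints for a service. The sketch below reads node names from the Service's Endpoints object with a kubectl JSONPath query; endpoint_nodes is a hypothetical helper name, and clusters that rely only on EndpointSlices may need a different query.

```sh
# endpoint_nodes SERVICE [NAMESPACE]
# Prints the distinct names of nodes hosting ready endpoints of SERVICE.
endpoint_nodes() {
  kubectl get endpoints "$1" \
    --namespace "${2:-default}" \
    --output jsonpath='{range .subsets[*].addresses[*]}{.nodeName}{"\n"}{end}' |
    sort -u
}
```

If the list is empty, no node is eligible to receive traffic under the Local policy, which usually shows up as an assigned but unreachable IP.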
6.2 Internal Traffic Policy#

```yaml
spec:
  internalTrafficPolicy: Local
```

Restricts cluster-internal traffic to pods on the same node.
6.3 Dual-Stack (IPv4 + IPv6)#

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: dual-stack-pool
  namespace: metallb-system
spec:
  addresses:
    - 198.51.100.240-198.51.100.250
    - fd00::1-fd00::10
```

6.4 FRRouting (FRR) Backend#
MetalLB supports FRR as an alternative BGP backend (more mature BGP implementation):

```sh
helm install metallb metallb/metallb \
  --namespace metallb-system \
  --set speaker.frr.enabled=true
```

FRR provides:
- BFD (Bidirectional Forwarding Detection) for faster failover
- VRF support
- Advanced route filtering
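
To put "faster failover" in numbers: BFD declares a peer down after roughly the negotiated interval multiplied by the detect multiplier. The sketch below does that arithmetic; it simplifies the negotiation rules in the BFD specification, and bfd_detect_ms is a hypothetical helper name.

```sh
# bfd_detect_ms INTERVAL_MS MULTIPLIER
# Approximate BFD detection time in milliseconds, assuming both sides settle
# on the same interval.
bfd_detect_ms() {
  echo $(( $1 * $2 ))
}
```

With a 300 ms interval and a multiplier of 3, bfd_detect_ms 300 3 prints 900, so a failure is detected in under a second. Without BFD, a dead peer is only noticed once the BGP hold timer expires.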
6.5 BFD Profile (with FRR)#

```yaml
apiVersion: metallb.io/v1beta1
kind: BFDProfile
metadata:
  name: fast-detect
  namespace: metallb-system
spec:
  receiveInterval: 300
  transmitInterval: 300
  detectMultiplier: 3
  echoInterval: 50
  echoMode: true
  passiveMode: false
  minimumTtl: 254
```

Reference in BGPPeer:

```yaml
spec:
  bfdProfile: fast-detect
```

Troubleshooting#
| Issue | Cause | Solution |
|---|---|---|
| Service stuck in Pending with no external IP | No IPAddressPool configured or pool exhausted | Create an IPAddressPool; check pool usage with kubectl get ipaddresspool -n metallb-system |
| External IP assigned but unreachable | L2Advertisement missing or wrong interface | Create an L2Advertisement for the pool; verify speaker pods are on the correct network |
| L2 failover takes too long | Default memberlist timeout | Tune memberlist parameters; consider BGP mode for faster failover |
| BGP session not establishing | Wrong ASN, IP, or firewall blocking port 179 | Verify peer config; check speaker pod logs; ensure port 179 TCP is open |
| Traffic only reaches one node (L2) | Expected behavior for L2 mode | L2 funnels all traffic through one leader node; use BGP for true ECMP load balancing |
| Source IP always the node IP | externalTrafficPolicy: Cluster SNAT behavior | Set externalTrafficPolicy: Local to preserve source IP |
| kube-proxy IPVS ARP conflict | strictARP not enabled | Enable strictARP: true in kube-proxy ConfigMap |
| Shared IP not working | Mismatched sharing key or overlapping ports | Ensure both services have identical metallb.universe.tf/allow-shared-ip annotation values and do not use the same port |
| BGP routes not propagating | Community string filtering on upstream router | Check router BGP config; verify community strings match expectations |
| Speaker crashlooping | Missing RBAC or node connectivity issues | Check speaker pod logs; ensure memberlist port (7946) is open between nodes |
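
Several of the solutions above start with the speaker logs. The sketch below collects recent log lines from all speaker pods in one call. The default selector component=speaker is an assumption that matches the manifest install; Helm releases may label pods differently, and speaker_logs is a hypothetical helper name.

```sh
# speaker_logs [SELECTOR] [LINES]
# Prints the last LINES log lines from every pod matching SELECTOR in
# metallb-system, each prefixed with the name of the pod it came from.
speaker_logs() {
  kubectl logs \
    --namespace metallb-system \
    --selector "${1:-component=speaker}" \
    --tail "${2:-50}" \
    --prefix
}
```

Piping the output through grep for the service IP or the peer address narrows it down to the relevant announcements or BGP session messages.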