An open-source multi-cluster Kubernetes management platform that provides a unified dashboard for deploying, managing, and monitoring clusters across any infrastructure.
Table of Contents
- Overview
- Architecture
- Installation
- Post-Installation Setup
- Cluster Management
- RBAC and User Management
- External Authentication
- Application Deployment
- Backup and Restore
- Upgrading Rancher
- Troubleshooting
- See Also
- Sources
1. Overview
Rancher provides a management layer on top of Kubernetes clusters, unifying operations across on-premises, cloud, and edge environments. It handles cluster provisioning, user authentication, monitoring, alerting, and application cataloging from a single interface.
Key features:
- Multi-cluster management - manage any CNCF-conformant Kubernetes cluster from one dashboard
- Cluster provisioning - create clusters on bare metal (RKE2/K3s), cloud providers (EKS, AKS, GKE), or vSphere
- Application catalog - deploy Helm charts from the built-in catalog or custom repositories
- RBAC - fine-grained role-based access control integrated with external identity providers
- Monitoring - built-in Prometheus/Grafana stack deployment
- CIS scanning - Kubernetes security benchmark scanning
- Continuous delivery - integrated Fleet for GitOps at scale
2. Architecture
| Component | Role |
|---|---|
| Rancher Server | Central management server (runs as a Deployment in a Kubernetes cluster) |
| Rancher Agent | Runs on each managed cluster; connects back to the Rancher server |
| Authentication Proxy | Handles SSO, LDAP, SAML, and local auth |
| Cluster Controller | Manages the lifecycle of downstream clusters |
| Fleet | Built-in GitOps engine for continuous delivery |
| Webhook | Validates and mutates resources per Rancher policies |
Rancher itself runs on a "local" Kubernetes cluster (often called the management cluster) and manages "downstream" clusters. The Rancher server does not need inbound network access to the downstream clusters: the agent on each downstream cluster initiates an outbound HTTPS connection to the Rancher server.
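
To illustrate the outbound-only connection model, the following sketch checks from a downstream node that the Rancher server is reachable over HTTPS. It assumes the server answers `pong` on its `/ping` endpoint; the helper name `check_rancher_ping` and the hostname in the usage line are illustrative placeholders, not part of Rancher.

```shell
# Sketch: verify that a downstream node can reach the Rancher server.
# check_rancher_ping is a hypothetical helper name.
check_rancher_ping() {
  server_url="${1%/}"   # strip a single trailing slash, if any
  # -s: quiet; -k: skip TLS verification (handy with a private CA, drop it
  # once the CA is trusted); --max-time: fail fast if the port is blocked
  reply="$(curl -sk --max-time 10 "${server_url}/ping")" || return 1
  [ "$reply" = "pong" ]
}

# Usage (run on a downstream node; the hostname is a placeholder):
#   check_rancher_ping "https://rancher.example.com" && echo "reachable"
```

A non-zero exit status usually points to DNS resolution or a firewall blocking outbound port 443 rather than to Rancher itself.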
3. Installation
3.1 Prerequisites
- A Kubernetes cluster for the Rancher server (RKE2, K3s, or any conformant cluster)
- Helm 3
- kubectl configured for the management cluster
- A DNS name for Rancher (e.g., rancher.<domain>)
- A TLS certificate (cert-manager, own certificate, or Let's Encrypt)
3.2 Install cert-manager (if using Rancher-generated or Let's Encrypt certs)

    kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.16.2/cert-manager.crds.yaml
    helm repo add jetstack https://charts.jetstack.io
    helm repo update
    helm install cert-manager jetstack/cert-manager \
      --namespace cert-manager \
      --create-namespace \
      --version v1.16.2

3.3 Add the Rancher Helm Repository
Choose a release channel:
| Channel | Repository URL | Use case |
|---|---|---|
| Latest | https://releases.rancher.com/server-charts/latest | Development, testing |
| Stable | https://releases.rancher.com/server-charts/stable | Production |
| Alpha | https://releases.rancher.com/server-charts/alpha | Experimental features |

    helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
    helm repo update

3.4 Create the Namespace

    kubectl create namespace cattle-system

3.5 Install Rancher
With Rancher-generated certificates (cert-manager):

    helm install rancher rancher-stable/rancher \
      --namespace cattle-system \
      --version 2.10.3 \
      --set hostname=rancher.<domain> \
      --set replicas=3 \
      --set bootstrapPassword=<initial-password>

With Let's Encrypt:

    helm install rancher rancher-stable/rancher \
      --namespace cattle-system \
      --version 2.10.3 \
      --set hostname=rancher.<domain> \
      --set replicas=3 \
      --set ingress.tls.source=letsEncrypt \
      --set letsEncrypt.email=<your-email> \
      --set letsEncrypt.ingress.class=nginx \
      --set bootstrapPassword=<initial-password>

With your own certificate:
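
    kubectl -n cattle-system create secret tls tls-rancher-ingress \
      --cert=<tls-cert-path> \
      --key=<tls-key-path>

    # privateCA=true applies only when the certificate is signed by a private CA
    # (Rancher then also expects the CA certificate in a secret named tls-ca);
    # omit the flag for a publicly trusted certificate.
    helm install rancher rancher-stable/rancher \
      --namespace cattle-system \
      --version 2.10.3 \
      --set hostname=rancher.<domain> \
      --set replicas=3 \
      --set ingress.tls.source=secret \
      --set privateCA=true \
      --set bootstrapPassword=<initial-password>

3.6 Verify Installation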

    kubectl -n cattle-system rollout status deploy/rancher
    kubectl -n cattle-system get pods

Wait until all pods are Running and the rollout is complete.
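
The two checks above can be wrapped in a small helper that fails if the rollout does not finish in time, which is convenient in install scripts. This is only a sketch: `wait_for_rancher` is a hypothetical name, and the 300-second default timeout is an arbitrary assumption.

```shell
# Sketch: block until the Rancher Deployment is rolled out, then list pods.
# wait_for_rancher is a hypothetical helper; the defaults are assumptions.
wait_for_rancher() {
  namespace="${1:-cattle-system}"
  timeout="${2:-300s}"
  # rollout status blocks until the Deployment is fully rolled out and
  # exits non-zero when the timeout expires
  kubectl -n "$namespace" rollout status deploy/rancher --timeout="$timeout" || return 1
  # list the pods so their state can be inspected
  kubectl -n "$namespace" get pods
}

# Usage:
#   wait_for_rancher                      # cattle-system, 300s
#   wait_for_rancher cattle-system 600s   # longer timeout for slow image pulls
```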
4. Post-Installation Setup
4.1 First Login
- Navigate to https://rancher.<domain> in your browser
- Enter the bootstrap password set during installation
- Set a new admin password
- Configure the Rancher server URL (should match the hostname)
4.2 Configure the Server URL
If you need to change it later:

    Global Settings > server-url > Edit > https://rancher.<domain>

4.3 Enable Monitoring
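Deploy the built-in Prometheus/Grafana monitoring stack:

    Cluster > Apps > Charts > Monitoring > Install

Or via Helm. The monitoring chart and its CRD chart are published in the Rancher charts repository (https://charts.rancher.io), not in the server chart repository added in section 3.3, so add that repository and install the CRD chart first:

    helm repo add rancher-charts https://charts.rancher.io
    helm install rancher-monitoring-crd rancher-charts/rancher-monitoring-crd \
      --namespace cattle-monitoring-system \
      --create-namespace
    helm install rancher-monitoring rancher-charts/rancher-monitoring \
      --namespace cattle-monitoring-system

4.4 Enable Logging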

    Cluster > Apps > Charts > Logging > Install

Supports Elasticsearch, Splunk, Fluentd, Kafka, and Syslog as outputs.
4.5 Enable CIS Benchmark Scanning

    Cluster > Apps > Charts > CIS Benchmark > Install

Runs the CIS Kubernetes Benchmark and generates compliance reports.
5. Cluster Management
5.1 Import an Existing Cluster

    Cluster Management > Import Existing > Generic

Copy and run the provided kubectl apply command on the target cluster. The Rancher agent will connect back to the Rancher server.
5.2 Create a New Cluster (RKE2/K3s)

    Cluster Management > Create > Custom

- Select Kubernetes version and CNI
- Configure node roles (etcd, control plane, worker)
- Copy the registration command and run it on each node
5.3 Create a Cloud-Hosted Cluster

    Cluster Management > Create > Amazon EKS / Azure AKS / Google GKE

Provide cloud credentials, select region, node size, and Kubernetes version.
5.4 Cluster Operations
| Operation | Location |
|---|---|
| Edit cluster config | Cluster > Edit Config |
| Rotate certificates | Cluster > Rotate Certificates |
| Take etcd snapshot | Cluster > Snapshots > Take Snapshot |
| Restore from snapshot | Cluster > Snapshots > Restore |
| Download kubeconfig | Cluster > Download KubeConfig |
6. RBAC and User Management
6.1 Role Types
| Level | Scope | Examples |
|---|---|---|
| Global | Entire Rancher installation | Administrator, Standard User, User-Base |
| Cluster | Single cluster | Cluster Owner, Cluster Member |
| Project/Namespace | Namespaces within a cluster | Project Owner, Project Member, Read-Only |
6.2 Built-in Global Roles
| Role | Permissions |
|---|---|
| Administrator | Full access to all Rancher resources and all clusters |
| Standard User | Can create new clusters and manage clusters they own |
| User-Base | Login access only; no cluster permissions until granted |
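6.3 Create Custom Roles

    Users & Authentication > Roles > Create

Custom roles grant specific verbs on specific API resources at the global, cluster, or project level. As in Kubernetes RBAC, permissions are additive: a role can only grant access, so restrict a user by assigning a narrower role rather than by adding a deny rule.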
6.4 Assign Roles to Users

    Cluster > Members > Add

Select the user and assign a cluster role. For project-level access:

    Cluster > Projects/Namespaces > <Project> > Members > Add

6.5 Project and Namespace Isolation
Projects group namespaces and apply shared RBAC, quotas, and network policies:

    Cluster > Projects/Namespaces > Create Project

Configure:
- Resource quotas (CPU, memory, pod count)
- Default resource limits for containers
- Network isolation between projects
7. External Authentication
7.1 LDAP / Active Directory

    Users & Authentication > Auth Provider > ActiveDirectory / OpenLDAP

Required fields:
| Field | Value |
|---|---|
| Hostname | <ldap-server>:<port> |
| Service Account DN | CN=<service-account>,OU=...,DC=... |
| User Search Base | OU=Users,DC=<domain>,DC=com |
| Group Search Base | OU=Groups,DC=<domain>,DC=com |
| User Login Attribute | sAMAccountName (AD) or uid (LDAP) |
7.2 SAML (Okta, ADFS, PingIdentity)

    Users & Authentication > Auth Provider > SAML

- Create a SAML application in your IdP
- Enter the Metadata XML URL or upload the metadata file
- Map SAML attributes to Rancher fields (display name, username, groups)
- Test login before saving
7.3 GitHub / GitLab

    Users & Authentication > Auth Provider > GitHub / GitLab

- Create an OAuth application in GitHub/GitLab
- Enter the Client ID and Client Secret
- Optionally restrict to specific organizations or groups
7.4 OIDC (Keycloak, Azure AD, etc.)

    Users & Authentication > Auth Provider > OpenID Connect

Required fields:
| Field | Value |
|---|---|
| Client ID | <oidc-client-id> |
| Client Secret | <oidc-client-secret> |
| Issuer URL | https://<idp-domain>/... |
| Auth Endpoint | Auto-discovered from issuer |
| Scopes | openid profile email |
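
The auto-discovered endpoints come from the provider's OpenID Connect discovery document, which the OIDC Discovery specification places at `/.well-known/openid-configuration` under the issuer URL. The sketch below builds that URL so the document can be inspected before configuring Rancher; `oidc_discovery_url` is a hypothetical helper name and the issuer in the usage line is a placeholder.

```shell
# Sketch: derive the OIDC discovery URL from an issuer URL.
# oidc_discovery_url is a hypothetical helper name.
oidc_discovery_url() {
  issuer="${1%/}"   # drop one trailing slash so the path is not doubled
  printf '%s/.well-known/openid-configuration\n' "$issuer"
}

# Usage: inspect the endpoints the IdP advertises (placeholder issuer)
#   curl -s "$(oidc_discovery_url "https://idp.example.com/realms/demo")"
# The JSON response includes "authorization_endpoint" and "token_endpoint".
```

If the request fails or returns HTML instead of JSON, the issuer URL is likely wrong, which is a common cause of OIDC login failures.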
8. Application Deployment
8.1 Helm Chart Catalog
Rancher includes built-in chart repositories (Rancher charts, partner charts) and supports adding custom repositories:

    Cluster > Apps > Repositories > Create

    Name: <repo-name>
    Target: http(s)://<chart-repo-url>
    # or Git: https://github.com/<org>/<repo>.git

8.2 Deploy from Catalog

    Cluster > Apps > Charts > Select Chart > Install

Configure values via the UI form or paste custom YAML values.
8.3 Fleet (GitOps)
Fleet is Rancher's built-in GitOps engine for deploying workloads across clusters at scale:

    Continuous Delivery > Git Repos > Add Repository

    Name: <repo-name>
    Repository URL: https://github.com/<org>/<repo>.git
    Branch: main
    Paths:
      - /manifests
    Target Clusters:
      Cluster Selector:
        matchLabels:
          env: production

Fleet automatically syncs the Git repository to all matching clusters.
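
The same settings can also be expressed declaratively as a Fleet `GitRepo` resource. The sketch below prints such a manifest; it assumes the `fleet.cattle.io/v1alpha1` API and the `fleet-default` workspace namespace, and `render_gitrepo` is a hypothetical helper name. The repository in the usage line is a placeholder.

```shell
# Sketch: print a Fleet GitRepo manifest for the settings shown above.
# render_gitrepo is a hypothetical helper; the namespace, path, and label
# selector are assumptions to adapt to your environment.
render_gitrepo() {
  name="$1"
  repo_url="$2"
  branch="${3:-main}"
  cat <<EOF
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: ${name}
  namespace: fleet-default
spec:
  repo: ${repo_url}
  branch: ${branch}
  paths:
    - manifests
  targets:
    - clusterSelector:
        matchLabels:
          env: production
EOF
}

# Usage (against the management cluster):
#   render_gitrepo my-app https://github.com/example-org/example-repo.git | kubectl apply -f -
```

Keeping the definition in a manifest lets the GitOps configuration itself be version-controlled instead of living only in the UI.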
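9. Backup and Restore
9.1 Install the Backup Operator

    Cluster (local) > Apps > Charts > Rancher Backups > Install

Or via Helm. The backup chart and its CRD chart are published in the Rancher charts repository (https://charts.rancher.io), so add that repository and install the CRD chart first:

    helm repo add rancher-charts https://charts.rancher.io
    helm install rancher-backup-crd rancher-charts/rancher-backup-crd \
      --namespace cattle-resources-system \
      --create-namespace
    helm install rancher-backup rancher-charts/rancher-backup \
      --namespace cattle-resources-system

9.2 Create a Backup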

    apiVersion: resources.cattle.io/v1
    kind: Backup
    metadata:
      name: <backup-name>
    spec:
      resourceSetName: rancher-resource-set
      storageLocation:
        s3:
          credentialSecretName: <s3-creds-secret>
          credentialSecretNamespace: <namespace>
          bucketName: <bucket>
          region: <region>
          endpoint: <s3-endpoint>
      schedule: "0 2 * * *"
      retentionCount: 10

For local storage (default PV):

    apiVersion: resources.cattle.io/v1
    kind: Backup
    metadata:
      name: <backup-name>
    spec:
      resourceSetName: rancher-resource-set

9.3 Restore from Backup

    apiVersion: resources.cattle.io/v1
    kind: Restore
    metadata:
      name: <restore-name>
    spec:
      backupFilename: <backup-filename>
      storageLocation:
        s3:
          credentialSecretName: <s3-creds-secret>
          credentialSecretNamespace: <namespace>
          bucketName: <bucket>
          region: <region>
          endpoint: <s3-endpoint>

9.4 etcd Snapshots (Downstream Clusters)
For RKE2/K3s downstream clusters, Rancher manages etcd snapshots:

    Cluster > Snapshots > Take Snapshot

Snapshots are stored locally on the cluster nodes and optionally in S3.
To restore:

    Cluster > Snapshots > Select Snapshot > Restore

10. Upgrading Rancher
10.1 Pre-Upgrade Checklist
- Take a backup of the Rancher server (see section 9)
- Take etcd snapshots of downstream clusters
- Review the release notes for breaking changes
- Verify Kubernetes version compatibility in the support matrix
10.2 Upgrade via Helm

    helm repo update
    helm get values rancher -n cattle-system -o yaml > rancher-values.yaml
    helm upgrade rancher rancher-stable/rancher \
      --namespace cattle-system \
      --version <new-version> \
      -f rancher-values.yaml

10.3 Verify Upgrade

    kubectl -n cattle-system rollout status deploy/rancher
    kubectl -n cattle-system get pods

Check the Rancher UI footer for the new version number.
10.4 Rollback
If the upgrade fails:

    helm rollback rancher -n cattle-system

Or restore from the backup taken before the upgrade.
11. Troubleshooting
| Issue | Cause | Solution |
|---|---|---|
| UI unreachable after install | Ingress not configured or cert-manager not ready | Check Ingress: kubectl get ingress -n cattle-system; verify cert-manager pods are Running |
| Cluster stuck in Provisioning | Agent cannot reach the Rancher server | Verify network connectivity; check that the Rancher server URL is resolvable from downstream nodes |
| Agent not connecting | Firewall blocking outbound 443 | Open outbound HTTPS from downstream nodes to the Rancher server hostname |
| cattle-cluster-agent crashlooping | Mismatched Rancher server URL or expired certificate | Verify server-url in global settings; renew TLS certificates |
| Authentication login fails | Wrong LDAP/SAML configuration | Test bind credentials; check attribute mappings; review Rancher server logs |
| Helm chart install fails from catalog | Repository unreachable or chart version mismatch | Refresh the repository; check network access from the cluster |
| Monitoring stack not deploying | Insufficient cluster resources | Ensure nodes have enough CPU and memory; check PVC availability for Prometheus |
| Backup CRD not found | Backup operator not installed | Install the rancher-backup chart before creating Backup resources |
| Fleet sync failing | Git credentials missing or branch does not exist | Add Git credentials in Continuous Delivery > Settings; verify branch name |
| Upgrade breaks downstream clusters | Version incompatibility | Check the support matrix; roll back Rancher and upgrade downstream clusters first if needed |
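
Several of the checks in the table can be gathered in one pass. The sketch below is a hypothetical helper (`rancher_diag`) that assumes the Rancher server pods carry the label `app=rancher` and the agent pods the label `app=cattle-cluster-agent`; adjust the selectors if your installation differs.

```shell
# Sketch: collect basic diagnostics for the Rancher server or a cluster agent.
# rancher_diag is a hypothetical helper; the label selectors are assumptions.
# Point kubectl at the management cluster for "server" and at the
# downstream cluster for "agent".
rancher_diag() {
  target="${1:-server}"
  case "$target" in
    server)
      kubectl -n cattle-system get ingress
      kubectl -n cattle-system get pods
      kubectl -n cattle-system logs -l app=rancher --tail=50
      ;;
    agent)
      kubectl -n cattle-system get pods
      kubectl -n cattle-system logs -l app=cattle-cluster-agent --tail=50
      ;;
    *)
      echo "usage: rancher_diag [server|agent]" >&2
      return 2
      ;;
  esac
}

# Usage:
#   rancher_diag server
#   rancher_diag agent
```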