A GitOps Kubernetes cluster, layer by layer

The full writeup behind Three-layer GitOps on K3s, in production. Internal specifics are scrubbed; everything here is generic enough to run on your own gear. Replace example.com, the placeholder IPs, and <placeholders> with your own values.

Build a 6-node K3s HA cluster (3 control-plane + 3 worker, embedded etcd) and drive every workload from a single git repository. Calico for CNI, MetalLB for load balancers, Traefik for ingress, Longhorn for storage, Vault for secrets, ArgoCD for reconciliation. Nothing is deployed by hand: a kubectl apply of one root manifest brings the whole platform up in dependency order.

Address note: the 192.0.2.x addresses below come from the RFC 5737 documentation range; the 10.x and 172.31.x ranges are private (RFC 1918) CIDRs used as examples. Swap in your own LAN subnet and cluster CIDRs.

Table of Contents#

Overview
Prerequisites
Host Preparation
Firewall
Install K3s in HA Mode
Install Tooling: Helm, kubectl plugins, k9s
CNI: Calico
Load Balancer: MetalLB
Storage: Longhorn
The GitOps Repository
ArgoCD and the App-of-Apps Pattern
Three Helm Deployment Patterns
Secret Management with Vault + AVP
Networking and Network Policies
Databases: PostgreSQL and MariaDB
Platform Services: Harbor, Monitoring
Namespace Convention
Workload Distribution
Dependency Updates with Renovate
Day-to-Day Operations
Troubleshooting

1. Overview#

The target is a K3s HA cluster, 6 nodes, split 3 control-plane and 3 worker. K3s runs in HA mode with embedded etcd (--cluster-init). Calico provides the CNI, MetalLB hands out LoadBalancer IPs, Traefik terminates ingress, Longhorn provides replicated block storage, Vault holds secrets, and ArgoCD reconciles every workload from a single git repository.

The defining property is that nothing in the cluster is deployed by kubectl apply by hand. Everything is deployed by a git push. ArgoCD watches the repo and the cluster; if they diverge, it converges the cluster back to the repo. A fresh cluster comes up with one kubectl apply of a root manifest and then nothing else.

Example node layout (replace hostnames and IPs with your own):

Node	IP	Role	Purpose
`k3s-control-01`	`192.0.2.51`	Control	K3s server, etcd, platform services
`k3s-control-02`	`192.0.2.52`	Control	K3s server, etcd, platform services
`k3s-control-03`	`192.0.2.53`	Control	K3s server, etcd, platform services
`k3s-worker-01`	`192.0.2.54`	Worker	Application workloads
`k3s-worker-02`	`192.0.2.55`	Worker	Application workloads
`k3s-worker-03`	`192.0.2.56`	Worker	Application workloads

Virtual IP: 192.0.2.50, handed out by MetalLB to the Traefik LoadBalancer service. All external DNS records point here.
Node labels: example.com/role=control on control nodes, example.com/role=worker on workers. Helm charts use nodeSelector to place workloads on the right tier.

2. Prerequisites#


Requirement	Specification
OS	RHEL 9+, Debian 11+, or compatible
Access	Root privileges on every node
Network	Outbound internet access

Minimum per node#

Resource	Specification
CPU	2 cores (more for control nodes running etcd + platform)
Memory	2 GB minimum, 8 GB+ realistic for a platform node
Storage	10 GB free for the OS; Longhorn needs its own disk/partition per node
Kernel	5.4 or later

Required software on each host: curl. The host firewall is disabled and Kubernetes manages the packet rules instead (see section 4).

3. Host Preparation#

Run on every node before installing K3s.

Update packages:

dnf update -y      # RHEL family; apt on Debian

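Kernel networking settings. The net.bridge.* keys only exist once the br_netfilter module is loaded, so run modprobe br_netfilter first and add br_netfilter to /etc/modules-load.d/k8s.conf so it loads at boot. Then add to /etc/sysctl.d/k8s.conf: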

net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1

Apply:

sysctl --system

Disable swap. Kubernetes requires it off:

swapoff -a
# then comment out the swap line in /etc/fstab to persist

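SELinux (RHEL family). Permissive at minimum. setenforce 0 switches the running system; the enforcing=0 kernel argument keeps it permissive across reboots (selinux=0 would disable SELinux entirely rather than make it permissive):

setenforce 0
grubby --update-kernel ALL --args enforcing=0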

Install required tools:
```
dnf install -y curl
```

4. Firewall#

Disable the host firewall on cluster nodes and let Kubernetes manage the packet rules.

K3s programs its own iptables/nftables chains through kube-proxy and the CNI (Calico here) for pod, service, and NodePort traffic. A host firewall like firewalld or ufw runs its own chains on top of that, and the two fight: a firewalld reload flushes or reorders chains the CNI installed, masquerade rules collide, and pod-to-pod or pod-to-service traffic starts dropping intermittently in ways that are miserable to debug. The K3s docs call this out directly - firewalld is known to conflict, and the supported answer on a trusted network is to turn it off.

# RHEL family
systemctl disable --now firewalld
# Debian / Ubuntu
systemctl disable --now ufw

Security comes from the right layers instead of the host firewall:

A perimeter firewall (router, OPNsense, or a cloud security group) controls what reaches the nodes from outside - normally just 80/443 to the ingress, and 6443 to the API from trusted admin networks.
Kubernetes NetworkPolicy controls pod-to-pod traffic inside the cluster, enforced by Calico.

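If you cannot disable the host firewall (shared L2, compliance), do not let it filter pod traffic. Put the Calico tunnel interface and the cluster CIDRs in a trusted zone, then open the node ports. With Calico in IPIP mode, the nodes also need BGP (179/tcp) and IP-in-IP (IP protocol 4) between each other; 4789/udp only matters if you switch to VXLAN:

firewall-cmd --permanent --zone=trusted --add-interface=tunl0        # Calico IPIP device (cni0 only exists with flannel)
firewall-cmd --permanent --zone=trusted --add-source=<pod-cidr>      # e.g. K3s default 10.42.0.0/16
firewall-cmd --permanent --zone=trusted --add-source=<service-cidr>  # e.g. K3s default 10.43.0.0/16
for p in 53/tcp 53/udp 80/tcp 179/tcp 443/tcp 2049/tcp 2379-2380/tcp 3260/tcp 4789/udp 6443/tcp 10250/tcp; do
  firewall-cmd --permanent --add-port="$p"
done
firewall-cmd --permanent --add-rich-rule='rule protocol value="4" accept'   # IP-in-IP between nodes
firewall-cmd --reload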

5. Install K3s in HA Mode#

Disable the bundled Traefik and the bundled flannel CNI; Calico and a GitOps-managed Traefik replace them. The aggressive failover flags are deliberate: the upstream defaults add up to a little under six minutes from node death to pod eviction (a 40 s grace period plus a 300 s toleration), which is not high availability for a small cluster where every node matters.

First control node (--cluster-init starts a new etcd cluster). Two extra flags belong with this stack: --disable servicelb turns off the bundled ServiceLB so it does not compete with MetalLB for LoadBalancer services, and --disable-network-policy turns off the embedded policy controller because Calico enforces NetworkPolicy itself. Keep the flag set identical on every server:

curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--cluster-init --disable traefik --disable servicelb \
  --flannel-backend=none --disable-network-policy \
  --kube-controller-manager-arg=node-monitor-period=2s \
  --kube-controller-manager-arg=node-monitor-grace-period=16s \
  --kube-apiserver-arg=default-not-ready-toleration-seconds=30 \
  --kube-apiserver-arg=default-unreachable-toleration-seconds=30" sh -

mkdir -p "$HOME/.kube"
cp /etc/rancher/k3s/k3s.yaml "$HOME/.kube/config"

Flag	Effect
`node-monitor-period=2s`	How often the controller manager checks node status (default 5s)
`node-monitor-grace-period=16s`	Time without a heartbeat before a node is marked `NotReady` (default 40s)
`default-not-ready-toleration-seconds=30`	Pod eviction delay after `NotReady` (default 300s)
`default-unreachable-toleration-seconds=30`	Pod eviction delay after `Unreachable` (default 300s)

Worst-case node-death detection lands around 46 seconds (16 s grace period plus 30 s toleration); add pod startup and you are back in service within ~90 seconds of a hard failure. Larger clusters can afford to be more conservative.

Additional control nodes join the existing etcd cluster with --server pointing at the first node and the shared node token (/var/lib/rancher/k3s/server/node-token on the first node):

curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--server https://192.0.2.51:6443 --disable traefik --disable servicelb \
  --flannel-backend=none --disable-network-policy \
  --kube-controller-manager-arg=node-monitor-period=2s \
  --kube-controller-manager-arg=node-monitor-grace-period=16s \
  --kube-apiserver-arg=default-not-ready-toleration-seconds=30 \
  --kube-apiserver-arg=default-unreachable-toleration-seconds=30" \
  K3S_TOKEN="<node-token>" sh -

Worker nodes join as agents:

curl -sfL https://get.k3s.io | K3S_URL="https://192.0.2.51:6443" K3S_TOKEN="<node-token>" sh -

Paths after install: binary /usr/local/bin/k3s, kubeconfig /etc/rancher/k3s/k3s.yaml, systemd unit /etc/systemd/system/k3s.service (k3s-agent.service on workers). The installer enables and starts the service automatically.

etcd tuning. Longhorn writes a lot of per-volume CRD state to etcd, and under that load the default election timeout (1000 ms) occasionally fires and triggers a spurious leader election. Raising the heartbeat interval to 500 ms and the election timeout to 5000 ms was the lowest setting at which the spurious elections stopped in this cluster. Pass both to the embedded etcd through K3s's --etcd-arg flag (or the equivalent etcd-arg entry in the K3s config file) on every server.
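
As a sketch of the config-file form, assuming the servers read /etc/rancher/k3s/config.yaml (the K3s default location) and that the other install flags stay on the command line, the two etcd settings could look like this. The values are the ones quoted above; the millisecond units and defaults in the comments are etcd's own.

```yaml
# /etc/rancher/k3s/config.yaml on each server node (sketch, not the author's file)
etcd-arg:
  - "heartbeat-interval=500"   # milliseconds; etcd default is 100
  - "election-timeout=5000"    # milliseconds; etcd default is 1000
```

Use the same values on all three servers and restart k3s one node at a time so the cluster keeps quorum while the change rolls through.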

Verify:

systemctl status k3s
kubectl get nodes      # all 6 should appear (Ready once the CNI is up)

Label the nodes so charts can target a tier:

kubectl label node k3s-control-01 k3s-control-02 k3s-control-03 example.com/role=control
kubectl label node k3s-worker-01  k3s-worker-02  k3s-worker-03  example.com/role=worker

6. Install Tooling: Helm, kubectl plugins, k9s#

# Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm version

# k9s (check the latest tag on the releases page)
curl -L https://github.com/derailed/k9s/releases/download/v0.32.7/k9s_Linux_amd64.tar.gz -o k9s.tar.gz
tar -xzf k9s.tar.gz -C /usr/local/bin k9s      # extract only the binary, not LICENSE/README
k9s version

The remaining components (Calico, MetalLB, Longhorn, Traefik) can be installed by hand once to bring the cluster to life, then folded under ArgoCD so the repo owns them. The manual install commands below are the bootstrap; the GitOps form is in section 11.

7. CNI: Calico#

K3s was started with --flannel-backend=none, so install Calico explicitly.

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.29.1/manifests/calico.yaml

Set a custom IP pool. default-ipv4-ippool.yml:

apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 172.31.0.0/16
  blockSize: 26
  ipipMode: Always
  natOutgoing: true
  nodeSelector: all()
  vxlanMode: Never

kubectl apply -f default-ipv4-ippool.yml
kubectl get pods -n kube-system        # calico-node pods should go Running

Optional kubectl-calico plugin for IPAM inspection:

curl -L https://github.com/projectcalico/calico/releases/download/v3.29.1/calicoctl-linux-amd64 -o /usr/local/bin/kubectl-calico
chmod +x /usr/local/bin/kubectl-calico
kubectl calico ipam show

CIDR mismatch caveat. If you let K3s default its pod CIDR to 10.42.0.0/16 but configure Calico's IP pool as 172.31.0.0/16, cross-node traffic is SNAT'd to the IPIP tunnel interface (tunl0) addresses. That breaks NetworkPolicy podSelector matching for cross-node traffic. Either keep the two CIDRs aligned, or add the tunl0 IPs as ipBlock entries in any NetworkPolicy that needs cross-node communication. Aligning the CIDRs from the start is the cleaner fix.
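
One way to keep the two CIDRs aligned is to hand K3s the Calico pool's range at install time. A minimal sketch, assuming the K3s config file at /etc/rancher/k3s/config.yaml is used and that it is in place before the first server starts (the pod CIDR cannot simply be changed on a running cluster):

```yaml
# /etc/rancher/k3s/config.yaml on every server, written before the first start (sketch)
cluster-cidr: "172.31.0.0/16"   # must equal spec.cidr of the Calico IPPool above
service-cidr: "10.43.0.0/16"    # K3s default, listed only for completeness
```

The same effect comes from passing --cluster-cidr=172.31.0.0/16 in INSTALL_K3S_EXEC; either way the value has to match on all servers.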

8. Load Balancer: MetalLB#

helm repo add metallb https://metallb.github.io/metallb
helm repo update
helm install metallb metallb/metallb --namespace metallb-system --create-namespace

Layer 2 address pool. metal.yaml:

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: ingress-vip-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.0.2.50/32
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: ingress-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - ingress-vip-pool

kubectl apply -f metal.yaml
kubectl get pods -n metallb-system
kubectl get services -A         # once Traefik is deployed (section 11), its LoadBalancer service should show the VIP

That VIP (192.0.2.50) is the single entry point. Point your wildcard DNS record (*.example.com) at it.

9. Storage: Longhorn#

Longhorn provides replicated block storage. In a 3-control-node layout, run replicas across the three control nodes (or across whichever nodes carry dedicated storage disks).

helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace
kubectl get pods -n longhorn-system

StorageClass: longhorn (set it default).
Replication: 3 replicas per volume, one per storage node.
RWX volumes: served via NFS share-manager pods for ReadWriteMany access (this is why port 2049 is open).

Expose the UI through Traefik (or NGINX if that is your ingress). Example NGINX Ingress:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: longhorn-ingress
  namespace: longhorn-system
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: 'false'
    nginx.ingress.kubernetes.io/proxy-body-size: 10000m
spec:
  ingressClassName: nginx
  rules:
  - host: longhorn.example.com
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: longhorn-frontend
            port:
              number: 80

kubectl apply -f longhorn-ingress.yml
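
Since this cluster routes through Traefik, the same UI can be exposed with an IngressRoute instead. This is a sketch under assumptions: it uses the traefik.io/v1alpha1 API of Traefik v3, an entry point named websecure (the Traefik chart's default name), and an empty tls block that falls back to Traefik's default certificate. The resource name is made up.

```yaml
# Hypothetical Traefik equivalent of the NGINX Ingress above
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: longhorn-ui               # illustrative name
  namespace: longhorn-system
spec:
  entryPoints:
    - websecure                   # assumed entry point name
  routes:
    - match: Host(`longhorn.example.com`)
      kind: Rule
      services:
        - name: longhorn-frontend
          port: 80
  tls: {}                         # serve with the default (wildcard) certificate
```

The Longhorn UI has no authentication of its own, so in practice a route like this would sit behind a Traefik middleware or be limited to an admin network.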

Stale volume attachments. If a pod cannot mount a Longhorn volume (MountDevice timeout), work through three steps. First, find and delete the stale VolumeAttachment (kubectl get volumeattachment | grep <pvc-id>, then kubectl delete volumeattachment <name>). Second, clear the volume's nodeID (kubectl patch volumes.longhorn.io <pvc-id> -n longhorn-system --type merge -p '{"spec":{"nodeID":""}}'). Third, for RWX volumes, delete the share-manager pod to force the NFS server to restart.

10. The GitOps Repository#

One repository holds the entire desired state. Point ArgoCD at it; never kubectl apply by hand.

kubernetes/
  argo/                     # ArgoCD Application manifests
    bootstrap/              # Phase 0: infra (Traefik, MetalLB, Longhorn, cert-manager, operators)
    platform/               # Phase 1: shared services (Vault, Postgres, registry, monitoring)
    apps/                   # Phase 2: user applications
    root-bootstrap.yaml     # App-of-apps for the bootstrap phase
    root-platform.yaml      # App-of-apps for the platform phase
    root-apps.yaml          # App-of-apps for the apps phase
  apps/
    helm/                   # Helm values and umbrella charts per application
      traefik/values.yaml   # Values override for the Traefik chart
      registry/Chart.yaml   # Umbrella chart with the registry chart as a dependency
      registry/values.yaml
      ...
    manifests/              # Raw Kubernetes YAML per namespace
      traefik/              # NetworkPolicies, quotas, extra resources
      vault-ha/             # Namespace config, network policies
      ...
  infra/                    # Node-level setup scripts (k3s install, Vault init)
  .gitlab-ci.yml            # CI pipeline (yamllint, kubeconform, helm validate, Renovate)
  .gitlab/renovate.json     # Renovate dependency update config

How changes are made#

Edit files in the repo (values.yaml, manifests, ArgoCD Applications).
Commit and push to master.
ArgoCD detects the change and syncs within ~3 minutes (or instantly via a webhook).
ArgoCD applies the new state to the cluster.

Never kubectl apply directly - ArgoCD reverts it on the next sync. The repo is the only way in.

Commit convention#

Single-line commits: feat:, fix:, docs:, refactor:, chore:. Add a scope for the component, e.g. fix(traefik): increase proxy timeout.

CI pipeline#

Every push validates:

yamllint - YAML syntax.
manifest-validate - kubeconform against the target Kubernetes schema.
helm-validate - helm template on every umbrella chart.
argocd-validate - ArgoCD Application YAML structure.

11. ArgoCD and the App-of-Apps Pattern#

ArgoCD lives in its own namespace and manages every other Application through the app-of-apps pattern: one Application that creates more Application resources. Three roots, one per layer.

argo/
  root-bootstrap.yaml    sync-wave 0
  root-platform.yaml     sync-wave 1
  root-apps.yaml         sync-wave 2
  bootstrap/             Layer 0 Applications
  platform/              Layer 1 Applications
  apps/                  Layer 2 Applications

Each root is an Application pointing at the directory below it; each directory holds child Application resources. Sync waves enforce ordering: Layer 1 does not start until Layer 0 is healthy, Layer 2 not until Layer 1 is healthy. A fresh cluster comes up in the right order from one command:

kubectl apply -f argo/root-bootstrap.yaml

Forty minutes later you have the whole platform. The three layers:

Bootstrap (Phase 0) - infrastructure. Operators and CRDs that depend on nothing else.

App	Chart source	Namespace
traefik	`traefik.github.io/charts`	`traefik-system`
metallb	`metallb.github.io/metallb`	`metallb-system`
longhorn	`charts.longhorn.io`	`longhorn-system`
cert-manager	`charts.jetstack.io`	`cert-core`
gatekeeper	`open-policy-agent.github.io`	`gatekeeper-system`
cnpg	`cloudnative-pg.github.io`	`cnpg-core`
mariadb-operator	`helm.mariadb.com`	`mariadb-core`
reflector	`emberstack.github.io`	`argocd-system`
argocd	git (manifests)	`argocd-system`
cluster-config	git (manifests)	various
volume-snapshot-crds	git (manifests)	`kube-system`

Platform (Phase 1) - shared dependencies. Services that applications rely on.

App	Description	Namespace
vault-transit	Vault auto-unseal backend	`vault-core`
vault-ha	3-node Vault HA with Raft	`vault-core`
valkey	Redis-compatible cache (Sentinel HA)	`valkey-core`
postgres	CNPG 3-node PostgreSQL cluster	`cnpg-core`
registry	Container registry with proxy cache	`registry-core`
object-store	Distributed object storage	`objectstore-core`
kube-prometheus-stack	Prometheus + Grafana + Alertmanager	`monitoring-core`
loki	Log aggregation	`monitoring-core`
promtail	Log shipping DaemonSet	`monitoring-core`

Apps (Phase 2) - user applications. They use the platform but do not provide it.

App	Description	Namespace
awx	Ansible automation platform	`awx`
netbox	IP address management	`netbox`
asset-mgmt	IT asset management	`asset-mgmt`
filebrowser	Web file manager	`filebrowser`
ci-runner	GitLab CI runner (Kubernetes executor)	`ci-runner`
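
For orientation, a root Application of the kind described in this section could look like the sketch below. It is not the author's manifest: the project name, the destination, and the exact path are assumptions, and <this repo> stands for the repository URL as in the other examples.

```yaml
# argo/root-bootstrap.yaml (sketch)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root-bootstrap
  namespace: argocd-system
  annotations:
    argocd.argoproj.io/sync-wave: "0"   # root-platform would carry "1", root-apps "2"
spec:
  project: default                      # assumed project
  source:
    repoURL: <this repo>
    targetRevision: master
    path: argo/bootstrap                # directory holding the child Applications
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd-system
  syncPolicy:
    automated:
      selfHeal: true                    # revert manual drift
      prune: true                       # delete resources removed from git
```

One thing to verify in your own setup: recent ArgoCD versions do not assess the health of Application resources by default, so wave ordering between Applications may need the Application health check restored in argocd-cm.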

Reconciliation is the point#

"I can deploy with git" is not the win - a shell script does that. The win is reconciliation. With selfHeal: true and prune: true on every Application, manual changes to the cluster are reverted within ~180 seconds. Three consequences:

The repo is the truth. To know what is running, read master, not kubectl get all -A.
No manual hotfixes survive. Edit a Deployment directly and ArgoCD reverts it.
Rebuild is one command. kubectl apply -f argo/root-bootstrap.yaml on a fresh K3s install, and the same cluster comes back in under an hour.

When something is genuinely on fire, self-heal works against you - every fix you try gets reverted. Disable auto-sync on the affected app, fix by hand, commit the fix, re-enable.

Access ArgoCD#

# Admin password
kubectl -n argocd-system get secret argocd-initial-admin-secret \
  -o jsonpath='{.data.password}' | base64 -d

# Port-forward if DNS is not yet pointed at the VIP
kubectl port-forward svc/argocd-server -n argocd-system 8080:443
# then open https://localhost:8080

12. Three Helm Deployment Patterns#

ArgoCD can render manifests more than one way. The repo uses three, picked per app based on whether the app needs secrets and whether the chart exposes the knobs the app needs.

Pattern 1: Multi-source Helm (no secrets)#

Pull the chart from the upstream Helm repo, merge it with a values.yaml from the git repo, optionally add a third source for plain manifests (network policies, quotas). Used for most bootstrap and platform apps.

# argo/bootstrap/traefik-app.yaml
sources:
  - repoURL: https://traefik.github.io/charts
    chart: traefik
    targetRevision: 39.0.1
    helm:
      releaseName: traefik
      valueFiles:
        - $values/apps/helm/traefik/values.yaml
  - repoURL: <this repo>
    targetRevision: master
    ref: values
  - repoURL: <this repo>
    targetRevision: master
    path: apps/manifests/traefik

To bump the chart version: edit targetRevision, commit, push.

Pattern 2: Umbrella chart + AVP (apps needing secrets)#

A small local Chart.yaml declares the upstream chart as a dependency. A custom ArgoCD management plugin (avp-helm) renders it through a pipeline that injects Vault secrets at sync time.

# apps/helm/registry/Chart.yaml
dependencies:
  - name: harbor
    version: 1.18.2
    repository: https://helm.goharbor.io

# argo/platform/registry-app.yaml
sources:
  - repoURL: <this repo>
    path: apps/helm/registry
    plugin:
      name: avp-helm

To bump: change version in Chart.yaml, commit, push. ArgoCD runs helm dependency update + AVP automatically.

The render pipeline:

helm dependency update
  -> helm template
  -> sed (URL-decode AVP placeholders)
  -> argocd-vault-plugin generate

The sed step matters: Helm URL-encodes the <path:...> placeholders that argocd-vault-plugin looks for. Helm sees <path:kv/data/foo#bar> and serialises it as %3Cpath%3A.../%3E; AVP cannot find encoded placeholders. One regex decodes them before AVP runs, the rendered manifests get the real values, and the placeholders never touch the cluster.
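
The author's actual regex is not shown, so the following is only a sketch of what such a decode step could look like. It assumes the encoded characters are `<`, `>`, `:` and `#` (the first three appear in the example above; `%23` for `#` is an assumption), and it only touches lines that contain an encoded `<path:` marker so unrelated percent-encoded strings are left alone. The function name is hypothetical.

```shell
# Sketch of the decode step: reads rendered manifests on stdin, writes to stdout.
# Only lines containing an encoded "<path:" marker are rewritten.
decode_avp_placeholders() {
  sed '/%3Cpath%3A/{
s/%3C/</g
s/%3E/>/g
s/%3A/:/g
s/%23/#/g
}'
}

# Example: one encoded placeholder, one unrelated line that must stay as-is.
printf '%s\n' \
  'password: %3Cpath%3Akv/data/argocd/apps/helm/netbox%23db_password%3E' \
  'note: 100%3A1 ratio' | decode_avp_placeholders
```

In the pipeline this would sit between helm template and argocd-vault-plugin generate, for example helm template . | decode_avp_placeholders | argocd-vault-plugin generate -.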

Pattern 3: Kustomize + Helm post-render (when the chart has no injection point)#

Some charts hardcode a field you need and expose no value to override it - a missing volume, a sidecar, a cleanupPolicy. Forking the chart is one answer. The other is to render the chart through kustomize and patch the output, so you change container configuration without building a custom image or maintaining a fork.

A kustomization.yaml in apps/helm/<app>/ references the upstream chart in a helmCharts: block and applies patches: on top. The avp-helm plugin auto-detects it (a directory holds either a Chart.yaml or a kustomization.yaml, never both) and renders with kustomize build --enable-helm instead of helm template. The Vault secret-injection step runs on the kustomize output the same way.

# apps/helm/<app>/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# NEVER set a top-level "namespace:" here. It rewrites the namespace of every
# resource, including cross-namespace CRs the chart emits (e.g. an operator
# Database CR that must target another namespace). That mistake caused a
# production data-loss incident.
helmCharts:
  - name: <chart>
    repo: <upstream-helm-repo>
    version: <pinned-version>
    valuesFile: values.yaml
    releaseName: <release>
    namespace: <destination-namespace>
resources:
  - extra-config.yaml
patches:
  - target: { kind: StatefulSet, name: <release> }
    patch: |-
      <strategic merge patch>

values.yaml here is passed straight to the chart, with no umbrella sub-chart wrapping. Reach for this only when the chart genuinely cannot express the change through values.

The warning that makes this dangerous#

kustomize build --enable-helm inside the ArgoCD repo-server has been seen to render a SHORTER manifest than the exact same command on a workstation - same chart, same values. The cause is not fully isolated (most likely helm/kustomize version drift between the workstation and the plugin container). When it happens, ArgoCD diffs the partial render against the live cluster, decides the missing resources are extraneous, and prunes them. With the wrong defaults that means data loss:

Foot-gun	What pruning does
An operator `Database` CR with `cleanupPolicy: Delete` (a common chart default)	The operator drops the real database. Tables and rows gone.
A `StatefulSet` with `volumeClaimTemplates`	Pods deleted. The PVCs survive, since a StatefulSet does not cascade-delete them, but the workload is down until the StatefulSet is restored under the same VCT name.
A managed-Postgres `Cluster` CR with a delete reclaim policy	Same as the database case, data dropped.

Before adopting this pattern for any app:

Render the kustomization.yaml locally with the same helm and kustomize versions as the repo-server, and diff against the pure helm template output. Resource set, kinds, names, and namespaces must match. A single missing Database/Cluster CR is a stop-the-PR finding.
Patch every stateful CR the chart emits to a retain policy (cleanupPolicy: Skip, or the operator's equivalent) before the first sync, so a future accidental prune cannot drop data.
Set prune: false on the Application for the first cutover. Check status.operationState.syncResult for any pruned stateful resource before re-enabling prune.
Take a fresh database dump immediately before, kept off the resource being migrated.
Do it in a maintenance window. Treat it as risk-equivalent to a database engine upgrade.
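
The prune guard from the checklist is a one-line change on the Application. A minimal sketch of the relevant fragment, assuming the app otherwise uses the same automated sync policy as the rest of the repo:

```yaml
# Fragment of the ArgoCD Application for the app being migrated (sketch)
spec:
  syncPolicy:
    automated:
      selfHeal: true
      prune: false   # first cutover only: ArgoCD reports would-be prunes instead of deleting
```

With prune off, resources missing from the render show up as out of sync rather than being removed, which gives you a chance to inspect them before setting prune back to true in a follow-up commit.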

13. Secret Management with Vault + AVP#

HashiCorp Vault (3-node HA with Raft) holds every secret. The ArgoCD Vault Plugin injects them at deploy time. The repo never contains a real secret - only <path:...> references.

Bootstrapping Vault HA#

Vault HA is hard to bootstrap, because it needs an unseal mechanism that exists before Vault HA does. You cannot unseal Vault with secrets stored in Vault.

The escape hatch is a second, smaller Vault running in Transit mode. Transit is an encryption API: it does not store the secrets you want, it encrypts and decrypts an unseal token on demand. The HA Vault auto-unseals against the Transit Vault.

Both live in the same namespace. Sync-wave 0 brings up the Transit Vault; sync-wave 1 brings up the HA Vault, which auto-unseals against Transit and is immediately ready. The Transit Vault holds only the HA Vault's unseal key - no application secrets - and is fenced off by namespace network policies and Vault auth policy. Different blast radii: losing Transit loses the ability to cold-start a fresh HA Vault; losing HA loses application secrets.
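
A sketch of how the HA Vault's side of this could be expressed, assuming the hashicorp/vault Helm chart's value layout. The Transit service name, key name, mount path, and Secret name are all hypothetical; in an umbrella chart the whole block would be nested under the dependency's key.

```yaml
# Hypothetical values for the HA Vault (hashicorp/vault chart layout assumed)
server:
  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      config: |
        ui = true
        listener "tcp" {
          tls_disable     = 1
          address         = "[::]:8200"
          cluster_address = "[::]:8201"
        }
        storage "raft" {
          path = "/vault/data"
        }
        # Auto-unseal against the Transit Vault; names below are assumptions.
        seal "transit" {
          address    = "http://vault-transit.vault-core.svc.cluster.local:8200"
          key_name   = "autounseal"
          mount_path = "transit/"
        }
        service_registration "kubernetes" {}
  # The transit seal reads its token from VAULT_TOKEN; the Secret name is assumed.
  extraSecretEnvironmentVars:
    - envName: VAULT_TOKEN
      secretName: vault-transit-token
      secretKey: token
```

The token in that Secret only needs permission to encrypt and decrypt with the one Transit key, which keeps the Transit Vault's blast radius as small as the paragraph above describes.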

Vault access#

Internal: http://vault-ha.vault-core.svc.cluster.local:8200
External: https://vault.example.com
Tokens: keep the root token sealed away; use a scoped admin token for daily work.

Secret paths#

Path pattern	Purpose
`kv/data/argocd/platform/<app>#<key>`	Platform service secrets
`kv/data/argocd/apps/helm/<app>#<key>`	User application secrets

Using secrets in values#

In an AVP-processed values.yaml:

config:
  database:
    password: <path:kv/data/argocd/apps/helm/netbox#db_password>

AVP replaces the placeholder with the real value from Vault at sync time. Vault is the source of truth; the repo never sees the secret value. Used for database credentials, OIDC client secrets, SMTP passwords, registry credentials.

14. Networking and Network Policies#

CNI: Calico#

Pod CIDR: keep K3s and Calico aligned (see the section 7 caveat).
Service CIDR: 10.43.0.0/16 (K3s default).
Tunneling: IPIP (tunl0 interfaces on each node).

Load balancer and ingress#

MetalLB in Layer 2 mode hands 192.0.2.50 to Traefik's LoadBalancer service. All HTTP/HTTPS enters through Traefik on port 443. TLS terminates on a wildcard cert for *.example.com, replicated to every namespace by Reflector. Routing uses Traefik IngressRoute CRDs.

Default-deny network policies#

Every namespace gets a default-deny policy for both ingress and egress; allowed traffic is enumerated explicitly.

Policy	Purpose
`default-deny-ingress/egress`	Block everything by default
`allow-same-namespace`	Intra-namespace traffic
`allow-dns-egress`	DNS resolution (`kube-system:53`)
`allow-monitoring`	Prometheus scraping from `monitoring-core`
`allow-traefik`	Ingress from `traefik-system`
`allow-internet-egress`	External traffic (excludes pod/service CIDRs)
`allow-cluster-services-egress`	Access to ClusterIP services

App-specific policies add egress to the databases, Vault, SMTP, and so on.
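
The first and third rows of the table could be written as the two policies below. This is a sketch using the standard networking.k8s.io/v1 API; the netbox namespace is just an example, and the policy names are illustrative rather than the author's.

```yaml
# Deny all ingress and egress for every pod in the namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: netbox
spec:
  podSelector: {}                 # empty selector = all pods in the namespace
  policyTypes: [Ingress, Egress]
---
# Re-allow DNS lookups against kube-system
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: netbox
spec:
  podSelector: {}
  policyTypes: [Egress]
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
```

NetworkPolicies are additive, so each further allow-* policy in the table widens what the default-deny baseline blocks without having to edit it.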

15. Databases: PostgreSQL and MariaDB#

PostgreSQL (CloudNativePG)#

The CNPG operator manages a 3-node PostgreSQL cluster. Install the operator (bootstrap layer) then declare a Cluster:

helm repo add cnpg https://cloudnative-pg.github.io/charts
helm install cnpg --namespace cnpg-core --create-namespace cnpg/cloudnative-pg

pg-cluster.yml:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres
  namespace: cnpg-core
spec:
  instances: 3
  storage:
    size: 10Gi
  bootstrap:
    initdb:
      options:
        - --encoding=UTF8
        - --locale=en_US.UTF-8

kubectl apply -f pg-cluster.yml
kubectl get pods -n cnpg-core    # postgres-1/-2/-3 should run

CNPG publishes discovery services automatically: postgres-rw.cnpg-core.svc (read-write, follows the primary) and postgres-ro.cnpg-core.svc (read-only replicas). Back up with a daily VolumeSnapshot.
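
The daily snapshot can be declared with CNPG's ScheduledBackup resource. A minimal sketch, assuming the Cluster above has been given a snapshot class under spec.backup.volumeSnapshot.className and that the volume-snapshot CRDs from the bootstrap layer are installed; the resource name and the 02:00 schedule are illustrative.

```yaml
# Hypothetical daily VolumeSnapshot backup for the "postgres" cluster
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: postgres-daily            # illustrative name
  namespace: cnpg-core
spec:
  schedule: "0 0 2 * * *"         # six fields, seconds first: 02:00:00 every day
  backupOwnerReference: self
  method: volumeSnapshot
  cluster:
    name: postgres
```

Note the six-field cron format: CNPG schedules include a leading seconds field, so a five-field expression copied from a CronJob will not mean what it meant there.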

MariaDB (MariaDB Operator)#

Single instance in mariadb-core, used by apps that need MySQL/MariaDB. Back up with a daily CronJob and a retention window (e.g. 30 days).

16. Platform Services: Harbor, Monitoring#

Container registry (Harbor)#

Harbor runs as an umbrella-chart + AVP app (it needs DB and admin secrets) and uses the external CNPG PostgreSQL rather than its bundled database.

The external-DB secret (rendered by AVP, never committed plain):

apiVersion: v1
kind: Secret
metadata:
  name: postgres-harbor
  namespace: registry-core
  labels:
    cnpg.io/reload: "true"
stringData:
  host: postgres-rw.cnpg-core.svc.cluster.local
  port: "5432"
  coreDatabase: harbor
  username: harbor
  password: <path:kv/data/argocd/platform/registry#db_password>
type: Opaque

Key values.yaml fragments - run multiple replicas of every component, point the database at CNPG, and enable Harbor's cache layer. Pull-through caching of upstream images is set up separately, as proxy-cache projects inside Harbor itself:

externalURL: https://registry.example.com
expose:
  ingress:
    hosts:
      core: registry.example.com

cache:
  enabled: true
  expireHours: 87600

core:       { replicas: 3 }
jobservice: { replicas: 3 }
registry:   { replicas: 3 }
trivy:      { replicas: 3 }
portal:     { replicas: 3 }

database:
  type: external
  external:
    existingSecret: postgres-harbor
    host: postgres-rw.cnpg-core.svc.cluster.local
    port: 5432
    coreDatabase: harbor
    username: harbor

The default admin login is admin with a password set at install; retrieve it from the core secret:

kubectl get secret harbor-core -n registry-core \
  -o jsonpath="{.data.HARBOR_ADMIN_PASSWORD}" | base64 --decode; echo

Monitoring#

monitoring-core runs the kube-prometheus-stack (Prometheus, Grafana, Alertmanager), Loki for logs, and Promtail as a DaemonSet shipping logs from every node. This is cluster-internal monitoring; it is not built to watch external infrastructure. Alertmanager routes alerts by email to an ops address via your SMTP relay (smtp.example.com:25).

17. Namespace Convention#

The suffix tells you what tier something is in. RBAC policies attach to suffix patterns and apply automatically to new namespaces in the same tier.

Suffix	Purpose	Examples
`-system`	Cluster infrastructure operators	`longhorn-system`, `metallb-system`, `traefik-system`
`-core`	Shared platform dependencies	`cnpg-core`, `vault-core`, `registry-core`, `monitoring-core`
(none)	Application namespaces	`awx`, `netbox`, `filebrowser`

One app per namespace, always. Mixing two unrelated workloads in one namespace is how one app ends up able to exfiltrate the other's secrets via a misapplied ServiceAccount.

Pair the convention with topology spread constraints on every multi-replica workload (maxSkew: 1, whenUnsatisfiable: DoNotSchedule, on kubernetes.io/hostname) so a single node failure removes at most one replica of anything.
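
In a Deployment or StatefulSet the constraint looks like the fragment below. This is a sketch: the app.kubernetes.io/name label and its value are placeholders, and the labelSelector has to match the labels the workload's own pods carry.

```yaml
# Fragment of a 3-replica workload spec (sketch)
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app.kubernetes.io/name: example-app    # placeholder label
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: example-app
```

DoNotSchedule is strict: if a tier has fewer eligible nodes than the spread needs, the extra replica stays Pending rather than doubling up on a node.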

18. Workload Distribution#

Control nodes run platform state: Vault, PostgreSQL, MariaDB, the registry, the cache, object storage, the CNPG/MariaDB operators, cert-manager, the policy controller, Reflector, ArgoCD.
Worker nodes run user apps and the observability stack: AWX, Netbox, asset management, file browser, CI runner, Grafana, Prometheus, Alertmanager, Loki.
DaemonSets run on all nodes: Traefik, Promtail, the Longhorn CSI, the MetalLB speaker, Calico, node-exporter.

Charts target a tier with nodeSelector against the example.com/role label from section 5.
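
In a chart's values.yaml this is usually a two-line setting. The sketch assumes the chart exposes a top-level nodeSelector value, which many do but not all; some nest it per component.

```yaml
# apps/helm/<app>/values.yaml fragment (sketch)
nodeSelector:
  example.com/role: control   # or "worker" for user applications
```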

19. Dependency Updates with Renovate#

Renovate runs on a weekly schedule (weekends) via CI and opens merge requests for:

Helm chart versions in ArgoCD Application manifests (argocd manager).
Umbrella chart dependencies in Chart.yaml (auto-detected).
Container image tags in values.yaml (helm-values manager).
Container images in raw manifests (custom regex).
CI tool versions in the pipeline file.

Config in .gitlab/renovate.json. Rules worth copying:

3-day stability window for Helm chart updates.
Auto-merge patch updates for CI tools only.
Major updates never auto-merged.
Vault and the monitoring stack grouped into single MRs so related bumps land together.

20. Day-to-Day Operations#

Deploy a new application#

Create apps/helm/<app>/values.yaml (and Chart.yaml if it needs AVP).
Create apps/manifests/<app>/ with namespace.yaml, network-policy.yaml, quota.yaml.
Create argo/apps/<app>-app.yaml (the ArgoCD Application).
Add the app to argo/root-apps.yaml.
If it needs secrets, add them to Vault under kv/argocd/apps/helm/<app> (the KV v2 CLI path; AVP placeholders reference the same secret as kv/data/argocd/apps/helm/<app>).
Commit and push.

Update a Helm chart version#

Multi-source apps: edit targetRevision in argo/*/...-app.yaml.
Umbrella apps: edit version in apps/helm/<app>/Chart.yaml.

Check cluster health#

kubectl get nodes                          # all 6 Ready
kubectl get pods -A | grep -Ev 'Running|Completed'   # only the header line should remain
kubectl get app -n argocd-system           # all Synced + Healthy

Force an ArgoCD resync#

kubectl annotate app <app-name> -n argocd-system \
  argocd.argoproj.io/refresh=hard --overwrite

Everyday kubectl#

Command	Description
`kubectl cluster-info`	Cluster endpoints
`kubectl get nodes` / `kubectl describe node <node>`	Node state
`kubectl get pods -n <ns>` / `kubectl describe pod <pod> -n <ns>`	Pod state and events
`kubectl logs <pod> -n <ns> --tail=50`	Recent logs (`-c <container>` for a sidecar)
`kubectl get events -n <ns> --sort-by=.lastTimestamp | tail -20`	Recent events
`kubectl exec <pod> -n <ns> -- <command>`	Run a command in a pod
`kubectl port-forward svc/<svc> -n <ns> <local>:<remote>`	Tunnel a service locally
`kubectl get crd` / `kubectl describe crd <name>`	Custom resource definitions

21. Troubleshooting#

Symptom	First move
Cannot reach the cluster	`kubectl config view` - verify the kubeconfig and API endpoint
Calico pods stuck	Check the VXLAN/IPIP config in the manifest; `kubectl logs -n kube-system calico-node-<id>`
Ingress not working	Confirm DNS points at the VIP; `kubectl describe ingress <name>`
K3s service issues	`journalctl -u k3s` (or `k3s-agent` on workers)
Pod will not mount a Longhorn volume	Clear stale `VolumeAttachment` + the volume `nodeID` (see section 9)
Cross-node NetworkPolicy not matching	CIDR mismatch / `tunl0` SNAT (see section 7)
External PostgreSQL connection fails	Verify the CNPG secret and pod-to-DB connectivity from the namespace
Pod stuck not starting	`kubectl describe pod <pod> -n <ns>` and read the events

GitOps does not solve debugging - kubectl logs/describe/get events is still the toolset. ArgoCD watches what is deployed, not what is happening at runtime. It is also the wrong tool for fast iteration: every change is a commit, a push, and a sync, which takes 5-30 seconds end to end when a webhook triggers it and up to the ~180-second poll interval longer when it does not. That is fine for production-shaped changes and painful for tweaking a chart. Use a throwaway dev cluster and helm upgrade directly for that.
