Kubernetes has become the default platform for running production workloads, but its defaults are aimed at developer ergonomics rather than security. A freshly installed cluster will happily run privileged containers, mount the host filesystem, expose the dashboard, and grant broad service account permissions. Hardening Kubernetes is less about exotic tooling and more about systematically closing those defaults at every layer: image, pod, cluster, network, and runtime. This guide walks through the controls that disproportionately reduce blast radius.
Securing the image supply chain
Everything in Kubernetes ultimately runs container images, so this is where hardening starts.
- Use minimal base images – Distroless, Alpine, or `scratch` where possible. Fewer packages means a smaller CVE surface and a faster scan. Build images with multi-stage Dockerfiles so build tools never ship to production.
- Pin by digest, not just tag – `image: myapp:1.4.2` can change; `image: myapp@sha256:...` cannot. Pinning by digest prevents tag-poisoning and supply-chain swaps.
- Scan in CI and at admission – Run Trivy, Grype, Snyk, or your registry's built-in scanner on every build, and use an admission controller (e.g. Kyverno, OPA Gatekeeper, or a registry policy) to block images with critical CVEs from being deployed.
- Sign and verify – Use Cosign / Sigstore to sign images at build time and have the cluster verify signatures via policy. This stops an attacker who pushes a malicious image with a real-looking name.
- Run as a non-root user – Build the image with a dedicated `USER` (UID > 10000), and enforce it at the pod level. Many vulnerabilities require root inside the container to be useful.
A maintained internal registry, with scanning and signing baked in, is usually a better long-term investment than ad-hoc image hygiene per team.
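As a rough illustration of the digest-pinning and non-root points above, a Deployment might reference its image as follows. This is a minimal sketch: the registry host, the `myapp` names, the `<digest>` placeholder, and UID 10001 are all assumptions for illustration, not values from any real environment.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp                      # hypothetical workload name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          # Pinned by digest: the tag is omitted, so a re-pushed tag cannot
          # change what runs. Replace <digest> with the value your build emits.
          image: registry.example.com/myapp@sha256:<digest>
          securityContext:
            # Should match the USER set in the image (assumed here to be 10001).
            runAsNonRoot: true
            runAsUser: 10001
```

Pinning by digest means every release changes the manifest, so this approach generally works best when the CI pipeline writes the digest into the manifest itself rather than relying on a human to copy it.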
Pod-level controls: PSA, securityContext, and friends
The Pod Security Admission (PSA) controller is the built-in replacement for PodSecurityPolicy, which was deprecated in v1.21 and removed in v1.25. It enforces one of three Pod Security Standards profiles per namespace: privileged, baseline, or restricted. Production namespaces should run restricted as the enforced level, with baseline at most for legacy workloads.
Every pod should set a `securityContext` that blocks the obvious foot-guns:
- `runAsNonRoot: true` and an explicit `runAsUser`.
- `allowPrivilegeEscalation: false`.
- `readOnlyRootFilesystem: true` with explicit writeable volume mounts where needed.
- `capabilities: drop: ["ALL"]`, adding back only what the workload actually needs (rarely anything).
- `seccompProfile: type: RuntimeDefault` (or a stricter custom profile).
- No `hostNetwork`, `hostPID`, `hostIPC`, or `hostPath` mounts unless there is a documented reason.
These few settings prevent the majority of container-escape and privilege-escalation techniques used in real-world Kubernetes attacks.
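The pod-level settings above can be sketched as a namespace labelled for PSA plus a pod that satisfies the restricted profile. The `payments` namespace, the pod and image names, the `<digest>` placeholder, and UID 10001 are hypothetical; the `pod-security.kubernetes.io/*` labels and the `securityContext` fields are the standard ones.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments                   # hypothetical namespace
  labels:
    # Reject pods that violate the restricted profile.
    pod-security.kubernetes.io/enforce: restricted
    # Also surface violations as audit annotations and client warnings.
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
apiVersion: v1
kind: Pod
metadata:
  name: hardened-example           # hypothetical pod name
  namespace: payments
spec:
  securityContext:                 # pod-level defaults for all containers
    runAsNonRoot: true
    runAsUser: 10001
    runAsGroup: 10001
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/myapp@sha256:<digest>
      securityContext:             # container-level settings
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: tmp
          mountPath: /tmp          # explicit writable path on a read-only root
  volumes:
    - name: tmp
      emptyDir: {}
```

When migrating an existing namespace, one common approach is to start with only the `audit` and `warn` labels set to `restricted`, fix what they report, and add `enforce` last.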
RBAC, service accounts, and secrets
RBAC is where most clusters quietly hand out too much power.
- Disable automounting of default service account tokens – Set `automountServiceAccountToken: false` on the ServiceAccount or in the pod spec unless the workload actually needs to call the Kubernetes API. A stolen token from a compromised pod is often the entry point for cluster takeover.
- One service account per workload – Never share. Bind it to a Role with only the verbs and resources it needs (`get`, `list` on its own ConfigMaps, perhaps; not `cluster-admin`).
- Avoid wildcard ClusterRoles – Audit any binding to `cluster-admin`, `edit`, or roles with `*` verbs or resources. Tools like `kubectl-who-can` and `rakkess` help.
- Use OIDC for human users – Tie human kubectl access to your identity provider, with short-lived tokens and group-based RBAC. Long-lived kubeconfigs on laptops are a perennial leak source.
- Don't treat base64 as protection – Kubernetes Secrets are only base64-encoded, not encrypted, by default, so they are no safer than a ConfigMap to anyone who can read them. Enable encryption at rest in etcd, and prefer a real secret manager (Vault, AWS/Azure/GCP secret managers via CSI driver, or External Secrets Operator) for anything sensitive.
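A least-privilege setup along the lines of the service account points above might look like the following sketch. The `orders` namespace and the `orders-api` names are hypothetical, and the example assumes the workload only needs to read ConfigMaps in its own namespace.

```yaml
# Stop the namespace's default service account from handing out tokens.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: orders                # hypothetical namespace
automountServiceAccountToken: false
---
# Dedicated identity for one workload that does need the API.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: orders-api                 # hypothetical workload identity
  namespace: orders
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: orders-api-config-reader
  namespace: orders
rules:
  - apiGroups: [""]                # core API group
    resources: ["configmaps"]
    verbs: ["get", "list"]         # read-only, scoped to this namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: orders-api-config-reader
  namespace: orders
subjects:
  - kind: ServiceAccount
    name: orders-api
    namespace: orders
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: orders-api-config-reader
```

A Role and RoleBinding keep the grant inside one namespace; reaching for a ClusterRole and ClusterRoleBinding here would widen it to the whole cluster. If the workload only ever reads a single known ConfigMap, the rule can be narrowed further with `resourceNames` and the `get` verb alone.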
Network policy and isolation
A default Kubernetes cluster is a flat L3 network: every pod can reach every other pod on any port. That is rarely what you want.
- Default-deny network policies – Apply a baseline `NetworkPolicy` per namespace that denies all ingress and egress, then explicitly allow what each workload needs. Calico, Cilium, and most CNIs support this.
- Egress controls – Restrict which external destinations workloads can reach. Most application pods do not need outbound internet access, and blocking it stops a lot of post-exploit behaviour (C2, data exfiltration).
- Namespace as a trust boundary – Group workloads by trust level into separate namespaces, and write policies between them. Sensitive workloads (payment, identity, build systems) should be in their own namespaces with strict ingress.
- Service mesh for mTLS – Istio, Linkerd, or Cilium's mesh features add automatic mutual TLS between pods. Useful for zero-trust service-to-service auth, especially across teams.
- Protect the control plane – Restrict who can reach the API server (private endpoint, VPN, or IP allowlists), turn off the legacy dashboard or put it behind strong auth, and audit access to etcd, kubelet, and node SSH.
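The default-deny approach described above can be sketched with two policies: one that blocks everything in a namespace and one that restores DNS lookups, which most workloads need once egress is denied. The `orders` namespace is hypothetical, and the `k8s-app: kube-dns` label is an assumption about how the cluster's DNS pods are labelled; check your own cluster before relying on it.

```yaml
# Deny all ingress and egress for every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: orders                # hypothetical namespace
spec:
  podSelector: {}                  # empty selector matches all pods
  policyTypes:
    - Ingress
    - Egress
---
# Allow DNS queries to the cluster DNS pods only.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: orders
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        # namespaceSelector and podSelector in the same list item are ANDed:
        # DNS pods inside kube-system, not either condition on its own.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns    # assumed label for the DNS pods
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

NetworkPolicies are additive, so each workload then gets its own narrowly scoped allow policy on top of this baseline. They only take effect if the installed CNI plugin enforces them.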
Runtime detection and incident readiness
Even a hardened cluster will eventually face suspicious activity, and you want to see it.
- Enable audit logging – Kubernetes audit logs are off by default in many distributions. Turn them on, ship them to a SIEM, and write rules for things like creation of pods with `hostPath`, exec into pods in sensitive namespaces, service-account token requests from unusual sources, and changes to RBAC bindings.
- Container runtime detection – Falco, Tetragon, or your EDR's container module can detect behaviours like a shell spawning inside a pod, unexpected outbound connections, writes to `/etc`, or kernel-module loads. Sysdig and Aqua offer commercial equivalents.
- Image and workload inventory – Keep a live record of which images and versions are running in which namespaces. When the next critical CVE drops, you want to answer "where am I exposed?" in minutes, not days.
- Practise the response – Tabletop scenarios such as "a pod is making unexpected outbound connections", "a service account token has been seen in a public repo", or "an admission webhook is blocking deployments" reveal whether your team can actually act on the alerts you've configured.
- Patch the cluster itself – Kubernetes minor versions only get patches for about a year. Plan upgrades; running an unsupported control plane is its own vulnerability.
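For the audit logging point above, a starting audit policy might look like the sketch below. The choice of levels is an assumption about what a SIEM would want to see, not a recommended standard; on self-managed control planes the file is typically passed to the API server with `--audit-policy-file`, while managed distributions usually expose their own setting instead.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - RequestReceived                # skip the duplicate early-stage event
rules:
  # Rules are evaluated in order; the first match wins.

  # Record who exec'd or attached into pods.
  - level: Metadata
    resources:
      - group: ""
        resources: ["pods/exec", "pods/attach"]

  # Capture full request and response bodies for RBAC changes.
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]

  # Record service account token requests.
  - level: Metadata
    resources:
      - group: ""
        resources: ["serviceaccounts/token"]

  # Log pod creation with the request body so hostPath volumes are visible.
  - level: Request
    verbs: ["create"]
    resources:
      - group: ""
        resources: ["pods"]

  # Never log Secret or ConfigMap contents, only metadata.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]

  # Catch-all: metadata for everything else.
  - level: Metadata
```

Detection rules in the SIEM can then key on these events, for example by alerting on `pods/exec` in sensitive namespaces or on any new binding to `cluster-admin`.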
Container and Kubernetes security is not a single tool you buy; it is a layered policy that you apply consistently across image build, deployment, runtime, and observability. The good news is that almost every control above can be enforced as code (admission policies, NetworkPolicies, RBAC manifests), so once it is set up, it scales with the platform rather than fighting it.