Kubernetes Upgrades - How Not to Mess Up?
- RNREDDY

- Sep 11

You may have heard about the Reddit Kubernetes upgrade horror story: a 314-minute outage caused by a version upgrade from 1.23 to 1.24.
Whether you're running a startup's first cluster or managing production at scale, no one is immune to upgrade challenges.
Kubernetes releases move fast, and the N-2 support policy means staying on top of upgrades is critical.
Minor version timelines can quickly leave your cluster unsupported if upgrades are delayed.

The Kubernetes upgrade documentation already covers the technical steps for upgrading.
Let’s not dive into that again here.
Instead, I’ve depicted the official upgrade process as a checklist in the illustration below for quick reference.
How can we do it? Don’t worry, we’ll discuss it in detail right after this!

Phase 1: PLAN
DEV → Latest Kubernetes version to catch early issues and test breaking changes.
STAGING → DEV - 1 minor version to nail compatibility and smooth the path to production.
PROD → Close to STAGING for simplified workflows and reduced upgrade risks.
YAMLs → Per environment via Kustomize to handle configuration differences.
Testing Time → 2 weeks in staging, 1 month in dev for minor versions before production rollout.
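To make the plan above checkable, here is a minimal Python sketch that validates the version spread across environments. The function names are hypothetical, and "close to STAGING" is interpreted here as "at most one minor version behind staging", which is my assumption rather than a rule from the plan itself.

```python
def parse_minor(version):
    """Return (major, minor) from a string such as 'v1.29.3' or '1.29'."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def check_plan(dev, staging, prod, max_prod_lag=1):
    """Return a list of problems with the dev/staging/prod version plan.

    max_prod_lag is an assumed threshold for how far prod may trail staging.
    """
    dev_v, stg_v, prod_v = parse_minor(dev), parse_minor(staging), parse_minor(prod)
    if len({dev_v[0], stg_v[0], prod_v[0]}) != 1:
        return ["environments are on different major versions"]

    problems = []
    # STAGING should sit exactly one minor version behind DEV.
    if dev_v[1] - stg_v[1] != 1:
        problems.append(
            f"staging should be one minor behind dev (dev={dev}, staging={staging})"
        )
    # PROD should stay close to STAGING and never run ahead of it.
    lag = stg_v[1] - prod_v[1]
    if lag < 0:
        problems.append("prod is ahead of staging")
    elif lag > max_prod_lag:
        problems.append(f"prod is {lag} minor versions behind staging")
    return problems


if __name__ == "__main__":
    print(check_plan("1.30.2", "1.29.5", "1.28.9"))
```

An empty list means the three environments match the plan; anything else is a short description of what drifted.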
Phase 2: PREPARE
Add Kubernetes EOL and release dates to your calendar to stay on top of timelines.
Upgrade dev with a new cluster once the target version reaches patch .2.
Keep the old dev cluster as a fallback while monitoring the new one for issues.
Upgrade staging to one minor version behind dev; a new cluster is optional.
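The "wait for patch .2" rule and the EOL calendar reminder can both be turned into small checks. The sketch below is only illustrative: the function names are made up, and the dates in the example are placeholders, not real Kubernetes EOL dates.

```python
from datetime import date


def ready_for_dev(target_version, min_patch=2):
    """True once the target version has reached the given patch level.

    Accepts strings such as '1.30.2', 'v1.30.2' or '1.30.2-gke.1'.
    A version without a patch component counts as patch 0.
    """
    core = target_version.lstrip("v").split("-")[0].split("+")[0]
    parts = core.split(".")
    patch = int(parts[2]) if len(parts) > 2 else 0
    return patch >= min_patch


def days_until_eol(eol, today):
    """Number of days left before a version's end-of-life date."""
    return (eol - today).days


if __name__ == "__main__":
    print(ready_for_dev("1.30.1"))  # below the .2 threshold
    print(ready_for_dev("1.30.2"))  # at the .2 threshold
    # Placeholder dates for illustration only.
    print(days_until_eol(date(2030, 1, 31), date(2030, 1, 1)))
```

A check like `days_until_eol` could feed a calendar reminder or a CI warning, so the upgrade gets scheduled well before the window closes.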
Phase 3: ACT
Use Pluto to check for deprecated or removed API paths in configs and Helm charts.
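Use Nova to find outdated Helm releases, then confirm that CNIs, CoreDNS, and other dependencies have versions compatible with the target release.
Back up cluster resources and persistent volumes with Velero, and on self-managed control planes also take an etcd snapshot (for example with etcdctl), to safeguard critical data and enable disaster recovery.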
Monitor the upgrade progress and cluster health visually using Lens.
Follow Kubernetes upgrade documentation to upgrade the control plane and nodes sequentially.
Reboot nodes safely and automatically post-upgrade using Kured to apply OS updates.
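To show what the deprecated-API check in the first step is looking for, here is a rough Python sketch of the idea. It is not how Pluto is implemented and is no substitute for it: the lookup table is a small hand-picked subset of removed APIs, the function name is hypothetical, and the manifests are scanned with regular expressions instead of a real YAML parser.

```python
import re

# Small, incomplete subset of (apiVersion, kind) pairs and the release
# in which each was removed. A real tool ships a much longer list.
REMOVED_APIS = {
    ("extensions/v1beta1", "Ingress"): (1, 22),
    ("batch/v1beta1", "CronJob"): (1, 25),
    ("policy/v1beta1", "PodSecurityPolicy"): (1, 25),
    ("autoscaling/v2beta2", "HorizontalPodAutoscaler"): (1, 26),
}


def find_removed_apis(manifest_text, target):
    """Return (apiVersion, kind, removed_in) for objects gone by `target`.

    `target` is a (major, minor) tuple such as (1, 25).
    Only top-level apiVersion/kind fields are inspected.
    """
    findings = []
    for doc in re.split(r"^---\s*$", manifest_text, flags=re.MULTILINE):
        api = re.search(r"^apiVersion:\s*(\S+)", doc, flags=re.MULTILINE)
        kind = re.search(r"^kind:\s*(\S+)", doc, flags=re.MULTILINE)
        if not (api and kind):
            continue
        key = (api.group(1).strip("'\""), kind.group(1).strip("'\""))
        removed_in = REMOVED_APIS.get(key)
        if removed_in is not None and target >= removed_in:
            findings.append((key[0], key[1], f"{removed_in[0]}.{removed_in[1]}"))
    return findings


# Made-up manifests for illustration.
SAMPLE = """\
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: nightly-report
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
"""

if __name__ == "__main__":
    for api_version, kind, removed_in in find_removed_apis(SAMPLE, (1, 25)):
        print(f"{kind} uses {api_version}, removed in {removed_in}")
```

Running the same scan against each hop of the upgrade path shows which manifests need migrating before that hop, which is the kind of breakage you want to find before the control plane is upgraded.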
Worthy TLDR:
Can we skip minor versions and upgrade directly to the latest Kubernetes release?
No, upgrades must follow minor versions sequentially.
How often should we upgrade Kubernetes to stay secure and supported?
At least once every 12–14 months to stay within the N-2 support window, since each minor release is patched for roughly that long.
Is it better to upgrade an existing cluster or create a new one and migrate workloads?
For clusters behind by more than two versions, starting fresh is often easier.
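The three answers above can be sketched in a few lines of Python. The function names and the "more than two hops means start fresh" threshold are illustrative choices based on the answers, not an official rule.

```python
def _minor(version):
    """Return (major, minor) from a string such as 'v1.26.4' or '1.26'."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def upgrade_path(current, target):
    """List every minor version to step through, in order."""
    cur_major, cur_minor = _minor(current)
    tgt_major, tgt_minor = _minor(target)
    if cur_major != tgt_major or tgt_minor < cur_minor:
        raise ValueError(f"cannot plan an upgrade from {current} to {target}")
    # No skipping: each intermediate minor version is its own upgrade.
    return [f"{cur_major}.{m}" for m in range(cur_minor + 1, tgt_minor + 1)]


def is_supported(version, latest, window=2):
    """True if `version` is within N-2 of the latest minor release."""
    (major, minor), (latest_major, latest_minor) = _minor(version), _minor(latest)
    return major == latest_major and 0 <= latest_minor - minor <= window


def suggest_strategy(current, latest):
    """Rule of thumb: more than two versions behind, consider a new cluster."""
    hops = len(upgrade_path(current, latest))
    if hops > 2:
        return "new cluster + migrate"
    return "in-place, one minor at a time"


if __name__ == "__main__":
    print(upgrade_path("1.23", "1.26"))      # ['1.24', '1.25', '1.26']
    print(is_supported("1.27", "1.30"))      # False
    print(suggest_strategy("1.23", "1.26"))  # new cluster + migrate
```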


