How to Design Azure Kubernetes Service (AKS) Cluster
- RNREDDY

- Nov 26, 2025
Before you design an Azure Kubernetes Service cluster, it helps to step back and understand how Kubernetes itself is structured.
At its core, Kubernetes separates responsibilities between the control plane and the worker nodes. The control plane hosts the API server, scheduler, controller manager, and etcd. These components maintain the desired state of the cluster and make decisions on scheduling, orchestration, and cluster health.
Worker nodes run the kubelet, kube-proxy, and the container runtime, which together handle pod execution, service routing, and communication with the control plane.
Cluster Architecture Design

1. Pick how your workloads will run
Decide what actually runs inside your cluster. For example:
API services on general purpose nodes
Background workers on CPU heavy nodes
ML or video apps on GPU nodes
Low priority jobs on spot nodes
This directly defines how many node pools you need.
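As a rough illustration of how workload types translate into node pools, here is a minimal Python sketch. The workload names, the profile labels, and the pool names are all hypothetical; the point is only that each distinct profile ends up as its own pool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    profile: str  # one of "general", "cpu", "gpu", "spot" (labels assumed for this sketch)


# Hypothetical mapping from workload profile to node pool name.
POOL_FOR_PROFILE = {
    "general": "apipool",
    "cpu": "workerpool",
    "gpu": "gpupool",
    "spot": "spotpool",
}


def plan_node_pools(workloads):
    """Group workload names by the node pool their profile maps to."""
    pools = {}
    for w in workloads:
        pools.setdefault(POOL_FOR_PROFILE[w.profile], []).append(w.name)
    return pools


workloads = [
    Workload("orders-api", "general"),
    Workload("billing-api", "general"),
    Workload("report-worker", "cpu"),
    Workload("video-transcode", "gpu"),
    Workload("nightly-cleanup", "spot"),
]
pools = plan_node_pools(workloads)
print(len(pools), "node pools:", sorted(pools))
```

Four distinct profiles produce four pools here. If two profiles can share the same VM size and scaling behaviour, mapping them to one pool keeps the cluster simpler.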
2. Split system and application capacity
Create a small system pool for kube-system components.
Create user pools that run only application pods.
This avoids situations where app pods starve cluster DNS or the CNI plugin of resources and break the cluster.
3. Choose node sizes with real numbers
Select sizing based on known workload patterns. Examples:
API workloads: D series (for example D4 or D8 sizes)
Memory heavy: E series
High IOPS: L series
GPU: NC/ND series
Keep nodes consistent within each pool to avoid unpredictable scaling.
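To show what "real numbers" can look like, the sketch below estimates how many nodes a pool needs from pod resource requests. The allocatable figures (3800m CPU and 12500 MiB on a 4 vCPU / 16 GiB node) are illustrative assumptions, not published AKS values; check the actual allocatable capacity of your chosen size with `kubectl describe node`.

```python
import math


def nodes_needed(replicas, pod_cpu_m, pod_mem_mib, node_cpu_m, node_mem_mib):
    """Return how many identical nodes are needed to fit `replicas` identical pods.

    CPU is in millicores and memory in MiB. The tighter of the two
    resources decides how many pods fit on one node.
    """
    pods_per_node = min(node_cpu_m // pod_cpu_m, node_mem_mib // pod_mem_mib)
    if pods_per_node == 0:
        raise ValueError("a single pod does not fit on this node size")
    return math.ceil(replicas / pods_per_node)


# Assumed example: 30 API replicas, each requesting 500m CPU and 1024 MiB.
count = nodes_needed(
    replicas=30,
    pod_cpu_m=500,
    pod_mem_mib=1024,
    node_cpu_m=3800,     # assumed allocatable CPU per node
    node_mem_mib=12500,  # assumed allocatable memory per node
)
print(count)  # 5: CPU is the limit at 7 pods per node
```

This ignores daemonsets and per-node pod limits, so treat the result as a lower bound before adding upgrade headroom.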
4. Design how the cluster scales
Enable the cluster autoscaler on every pool so node count follows demand, then scale pods on top of it:
Use HPA for CPU or memory based workloads
Use KEDA for queue length, HTTP rate, or event driven scaling
Keep at least one extra node of headroom per pool for safe upgrades.
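For intuition on how HPA arrives at a replica count, the sketch below applies the scaling formula from the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. It is a simplification that leaves out the tolerance band and stabilization windows the real controller applies, and the numbers are made up.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas):
    """Simplified HPA calculation: scale proportionally, then clamp."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas averaging 90% CPU against a 60% target -> scale out to 6.
print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10))

# 2 replicas averaging 30% CPU -> formula says 1, but the floor of 2 holds.
print(desired_replicas(2, 30, 60, min_replicas=2, max_replicas=10))
```

When the new replicas no longer fit on existing nodes, the cluster autoscaler adds nodes to the pool, which is why both layers need to be enabled together.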
5. Decide how you isolate workloads
Pick one of these:
One AKS cluster per environment (most common)
Shared cluster with strict namespace and policy boundaries
Add taints and tolerations, paired with node selectors or node affinity, to keep workloads on the right node pools.
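The following sketch models, in simplified form, how the scheduler decides whether a pod may land on a tainted node. It only handles the `NoSchedule` effect and the `Equal` and `Exists` operators, and the `sku=gpu` taint is an assumed example rather than anything AKS sets for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str  # for example "NoSchedule"


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty means "any effect"


def tolerates(tol, taint):
    """True if this toleration matches this taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key != taint.key:
        return False
    return tol.operator == "Exists" or tol.value == taint.value


def can_schedule(pod_tolerations, node_taints):
    """A pod fits only if every NoSchedule taint on the node is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in pod_tolerations)
        for taint in node_taints
        if taint.effect == "NoSchedule"
    )


gpu_pool_taints = [Taint("sku", "gpu", "NoSchedule")]       # assumed taint
ml_pod = [Toleration("sku", "Equal", "gpu", "NoSchedule")]  # tolerates it
api_pod = []                                                # no tolerations

print(can_schedule(ml_pod, gpu_pool_taints))   # True
print(can_schedule(api_pod, gpu_pool_taints))  # False
```

A toleration only permits a pod to run on the tainted pool; it does not steer it there. Pinning the ML pod to the GPU pool still needs a node selector or node affinity.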
6. Plan storage per workload, not per cluster
Choose storage based on use case:
Azure Disks for databases or single writer apps
Azure Files for shared read/write
Keep databases on managed services unless you have a strong reason not to.
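The storage rules above can be written down as a small decision function. This is only a sketch of the guidance in this section; the function name and parameters are hypothetical. The access modes in the return values reflect that Azure Disks attach to a single node (ReadWriteOnce) while Azure Files can be mounted by many pods at once (ReadWriteMany).

```python
def pick_storage(kind, writers=1, self_host_db=False):
    """Suggest a storage option for one workload.

    kind: "database" or any other label for a regular app (labels assumed).
    writers: number of pods that write to the same volume.
    self_host_db: True only if there is a strong reason to run the
        database inside the cluster.
    """
    if kind == "database" and not self_host_db:
        return "managed database service"
    if writers > 1:
        return "Azure Files (ReadWriteMany)"
    return "Azure Disks (ReadWriteOnce)"


print(pick_storage("database"))
print(pick_storage("database", self_host_db=True))
print(pick_storage("cms", writers=3))
```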
7. Map identity and secrets to how apps run
The cluster uses a managed identity to talk to Azure resources
Apps use workload identity for their own Azure access
All secrets come from Key Vault, not from raw Kubernetes secrets
Network Topology Design
In this layout, the hub VNet acts as the central security and connectivity layer. It hosts shared services such as Azure Firewall, the VPN or ExpressRoute gateway for on-premises links, and Azure Bastion for secure admin access. The hub does not run workloads. It simply controls and inspects traffic.

The spoke VNet is where the AKS cluster runs. It is split into dedicated subnets for cluster nodes, ingress resources, Application Gateway, and private endpoints. Each subnet has a focused purpose, which keeps routing predictable and limits the blast radius.
The hub and spoke VNets are connected using VNet peering. Route tables in the spoke ensure all outbound traffic from AKS flows through the Azure Firewall in the hub rather than directly to the internet. This gives a controlled egress path, consistent security enforcement, and clear separation between platform services and workload infrastructure.
This model scales well, lets you attach multiple spokes in the future, and keeps the AKS environment isolated while benefiting from shared security controls in the hub.
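Before creating the VNets, it can help to sanity-check the address plan. The sketch below uses Python's standard `ipaddress` module to confirm that the hub and spoke ranges do not overlap (a requirement for peering) and that each spoke subnet sits inside the spoke without colliding with its neighbours. All address ranges and subnet names here are assumptions chosen for illustration.

```python
import ipaddress

hub = ipaddress.ip_network("10.0.0.0/16")    # assumed hub range
spoke = ipaddress.ip_network("10.1.0.0/16")  # assumed spoke range

# Assumed subnet layout for the spoke, one subnet per purpose.
spoke_subnets = {
    "aks-nodes": ipaddress.ip_network("10.1.0.0/22"),
    "ingress": ipaddress.ip_network("10.1.4.0/24"),
    "app-gateway": ipaddress.ip_network("10.1.5.0/24"),
    "private-endpoints": ipaddress.ip_network("10.1.6.0/24"),
}


def validate(vnet, subnets):
    """Raise ValueError if a subnet is outside the VNet or overlaps another."""
    nets = list(subnets.values())
    for net in nets:
        if not net.subnet_of(vnet):
            raise ValueError(f"{net} is outside {vnet}")
    for i, a in enumerate(nets):
        for b in nets[i + 1:]:
            if a.overlaps(b):
                raise ValueError(f"{a} overlaps {b}")
    return True


peering_ok = not hub.overlaps(spoke)
layout_ok = validate(spoke, spoke_subnets)

# Azure reserves 5 addresses in every subnet.
usable_node_ips = spoke_subnets["aks-nodes"].num_addresses - 5
print(peering_ok, layout_ok, usable_node_ips)  # True True 1019
```

The node subnet is deliberately the largest because, depending on the network plugin, pods may also draw addresses from it.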
Traffic Flow Design
User traffic from the internet first reaches the Web Application Firewall (WAF) in the spoke. The WAF filters and validates each request, then forwards allowed traffic to the internal load balancer inside the AKS subnet. The load balancer distributes the request to the appropriate pod running in the cluster. Responses follow the same path in reverse back to the user.

For outbound traffic, AKS workloads send requests from the cluster nodes toward the hub through VNet peering. The traffic is forced through Azure Firewall, which inspects and filters all egress before it reaches the internet. Responses return to the workload through the same controlled path.
This ensures both inbound and outbound flows are inspected, routed predictably, and kept isolated between the spoke workload environment and the hub security layer.
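To see why a default route is enough to force egress through the firewall, the sketch below mimics longest-prefix-match route selection for the AKS subnet. The route entries, the address ranges, and the firewall private IP `10.0.1.4` are assumptions for illustration, not values taken from a real deployment.

```python
import ipaddress

# Assumed effective routes for the AKS node subnet.
ROUTES = [
    (ipaddress.ip_network("10.1.0.0/16"), "VnetLocal"),
    (ipaddress.ip_network("10.0.0.0/16"), "VNetPeering"),
    # User-defined default route pointing at the firewall in the hub.
    (ipaddress.ip_network("0.0.0.0/0"), "VirtualAppliance 10.0.1.4"),
]


def next_hop(destination):
    """Pick the matching route with the longest prefix."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    _, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop


print(next_hop("10.1.0.10"))      # pod-to-node traffic stays in the spoke
print(next_hop("10.0.2.5"))       # hub services go over the peering
print(next_hop("93.184.216.34"))  # everything else goes to the firewall
```

Because more specific routes always win, in-VNet traffic is unaffected while any destination without a narrower match falls through to the firewall.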


