EKS Pod Limits: When Your Node Just Can't Fit One More Pod
We bootstrapped ArgoCD on EKS and one of its pods got stuck in Pending with "Too many pods." The t3.medium limit of 17 pods caught us off guard. Here's why the limit exists, how to calculate it, and what your options are.
The Symptom
After scaling up our mypie-infra ArgoCD ApplicationSet, several infra add-ons were deployed: cert-manager, the AWS Load Balancer Controller (LBC), metrics-server, and Atlantis. At the same time, ArgoCD was running its own 7-pod control plane. Everything scheduled fine until a restart of the argocd-repo-server pod triggered a rolling update.
The new pod couldn’t be scheduled:
Events:
Warning FailedScheduling 14m default-scheduler
0/1 nodes are available: 1 Too many pods.
preemption: 0/1 nodes are available: 1 No preemption victims found.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-1-11-55.eu-central-1.compute.internal Ready <none> 95m v1.32.12-eks-f69f56f
One node, 17 pods, node at capacity, one pod can’t schedule.
Why Does t3.medium Cap Out at 17 Pods?
Amazon EKS uses the VPC CNI plugin (aws-node), which assigns each pod a real VPC IP address. The number of IPs available on a node is determined by the instance type’s ENI (Elastic Network Interface) limits:
max pods = (number of ENIs) × (IPs per ENI - 1) + 2
For t3.medium:
| Attribute | Value |
|---|---|
| Max ENIs | 3 |
| Max IPs per ENI | 6 |
max pods = 3 × (6 - 1) + 2 = 3 × 5 + 2 = 17
The kubelet advertises this limit as the node’s pod capacity, and the Kubernetes scheduler won’t place more pods on a node than it declares it can hold.
You can verify the limit:
kubectl get node <node-name> -o jsonpath='{.status.allocatable.pods}'
# 17
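To reproduce that number by hand, the formula from this section can be sketched as a small shell function. The name max_pods is a hypothetical helper, not part of any AWS tooling; its two arguments are the ENI count and the IPs-per-ENI value from the table above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: default VPC CNI pod limit (no prefix delegation).
# $1 = max ENIs for the instance type, $2 = max IPv4 addresses per ENI.
max_pods() {
  # One address on each ENI is the ENI's own primary IP, hence the "- 1".
  # The "+ 2" is commonly explained as room for host-network pods
  # (aws-node, kube-proxy) that do not consume a pod IP.
  echo $(( $1 * ($2 - 1) + 2 ))
}

max_pods 3 6   # t3.medium, prints 17
```

If the result differs from the node’s allocatable pods value, the kubelet was probably started with an explicit max-pods override.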
What Pods Were Taking Up All 17 Slots?
$ kubectl get pods -A --field-selector spec.nodeName=ip-10-1-11-55... \
-o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name' \
| sort
Decomposed:
| Component | Pod Count |
|---|---|
| kube-system: aws-node, kube-proxy, coredns ×2, ebs-csi ×3, metrics-server, LBC ×2 | ~11 |
| argocd: application-controller, applicationset-controller, dex, redis, server, notifications | 6 |
| Total | 17 |
The argocd-repo-server pod — the one that needed to restart — would have been pod #18.
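A per-namespace tally like the one in the table can be produced from the kubectl output instead of being counted by hand. This is a minimal sketch, assuming the default `kubectl get pods -A --no-headers` layout where the namespace is the first column; count_by_namespace is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get pods -A --no-headers` style lines
# on stdin (namespace in column 1) and prints "<namespace> <count>" per line.
count_by_namespace() {
  awk '{ count[$1]++ } END { for (ns in count) print ns, count[ns] }' | sort
}

# Usage (replace <node-name> with the name shown by `kubectl get nodes`):
#   kubectl get pods -A --no-headers \
#     --field-selector spec.nodeName=<node-name> | count_by_namespace
```

The counts should add up to the node’s allocatable pods value when the node is full.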
Option 1: Add More Nodes (Quickest)
Scale the node group via AWS CLI:
aws eks update-nodegroup-config \
--cluster-name mypie-eks-staging \
--nodegroup-name mypie-eks-staging-general \
--scaling-config minSize=1,maxSize=3,desiredSize=2 \
--region eu-central-1
The second node joined in ~3 minutes. The pending pod scheduled immediately.
Update Terraform to match (so the next apply doesn’t reset desired to 1):
resource "aws_eks_node_group" "general" {
  scaling_config {
    desired_size = var.environment == "production" ? 3 : 2 # was 1
    min_size     = var.environment == "production" ? 3 : 1
    max_size     = var.environment == "production" ? 10 : 3
  }

  # Prevent Terraform from overriding autoscaler-managed desired count
  lifecycle {
    ignore_changes = [scaling_config[0].desired_size]
  }
}
Option 2: Enable VPC CNI Prefix Delegation (More Pods Per Node)
If you want to keep a single t3.medium but fit more pods, enable prefix delegation on the VPC CNI. This assigns /28 CIDR prefixes to ENI slots instead of individual IPs, multiplying capacity by 16:
new max = (ENIs) × (IPs per ENI - 1) × 16 + 2
t3.medium: 3 × 5 × 16 + 2 = 242 pods (EKS caps this at 110 for instance types with fewer than 30 vCPUs)
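The same arithmetic, including the cap, can be sketched in shell. The helper name max_pods_prefix and its vCPU argument are hypothetical, and the 250 ceiling for larger instances is an assumption that does not come from this article; check the EKS max pods calculator for your instance type.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pod limit with prefix delegation enabled.
# $1 = max ENIs, $2 = max IPv4 addresses per ENI, $3 = vCPU count.
max_pods_prefix() {
  # Each secondary IP slot now holds a /28 prefix, i.e. 16 addresses.
  raw=$(( $1 * ($2 - 1) * 16 + 2 ))
  # Assumed ceilings: 110 below 30 vCPUs, 250 otherwise.
  if [ "$3" -lt 30 ]; then cap=110; else cap=250; fi
  if [ "$raw" -gt "$cap" ]; then echo "$cap"; else echo "$raw"; fi
}

max_pods_prefix 3 6 2   # t3.medium (2 vCPUs): raw 242, prints 110
```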
Enable it on the EKS add-on:
aws eks update-addon \
--cluster-name mypie-eks-staging \
--addon-name vpc-cni \
--configuration-values '{"env":{"ENABLE_PREFIX_DELEGATION":"true","WARM_PREFIX_TARGET":"1"}}' \
--region eu-central-1
Important: After enabling prefix delegation, existing nodes need to be recycled for the new limits to take effect. The CNI calculates available IPs on startup. Rolling-restart the aws-node daemonset and cordon/drain/replace nodes, or simply scale-in and scale-out the node group.
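The cordon and drain step for a single node could look like the sketch below. recycle_node is a hypothetical helper name; it only evicts workloads, and terminating the instance so the node group launches a replacement is left to you.

```shell
#!/usr/bin/env bash
# Hypothetical helper: evict workloads from one node so it can be replaced.
# $1 = node name as shown by `kubectl get nodes`.
recycle_node() {
  # Stop new pods from landing on the node.
  kubectl cordon "$1" || return 1
  # Evict running pods; DaemonSet pods such as aws-node are left in place.
  kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data
}

# Usage: recycle_node <node-name>
```

On a single-node cluster, add the replacement node first, otherwise the evicted pods have nowhere to go.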
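You also need to update the kubelet’s --max-pods setting. For managed node groups, set the max-pods value via a launch template with a custom bootstrap script (on the Amazon Linux 2 EKS AMI, bootstrap.sh accepts --use-max-pods false together with --kubelet-extra-args '--max-pods=110'), or use the EKS eks-max-pods parameter. The ENI limits that feed the calculation can be looked up per instance type:
# Show the network limits for t3.medium (max ENIs and IPv4 addresses per ENI)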
aws ec2 describe-instance-types \
--instance-types t3.medium \
--query 'InstanceTypes[0].NetworkInfo' \
--output table
Or use the EKS max pods calculator.
Option 3: Use a Larger Instance Type
The simplest long-term option: just use a bigger instance. t3.large gives you 35 pods, t3.xlarge gives 58, m5.large gives 29.
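Those pod counts follow from the default formula, max pods = ENIs × (IPs per ENI - 1) + 2. The sketch below recomputes them; the ENI and IP limits in the embedded table are taken as assumptions from the EC2 instance type documentation rather than from this article, and pod_limits is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints "<instance type> <default max pods>" per line.
# Table columns: <type> <max ENIs> <IPv4 addresses per ENI>.
# Assumed values; re-check with `aws ec2 describe-instance-types`.
pod_limits() {
  while read -r itype enis ips; do
    echo "$itype $(( enis * (ips - 1) + 2 ))"
  done <<'EOF'
t3.medium 3 6
t3.large 3 12
t3.xlarge 4 15
m5.large 3 10
EOF
}

pod_limits
```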
For staging environments, t3.large is usually sufficient and cost-effective enough.
resource "aws_eks_node_group" "general" {
instance_types = ["t3.large"] # was "t3.medium"
}
Note: changing instance_types in an EKS managed node group replaces the node group (forces a new node group creation and old one deletion). Plan for node drain.
Which Option We Chose
For our staging cluster, we went with Option 1 — scaling to 2 nodes. It was the fastest fix and keeps costs low (2 × t3.medium ≈ $0.09/hr). We also added ignore_changes on desired_size so Cluster Autoscaler can freely scale between the min and max.
For the production cluster, we configured 3 nodes from the start with t3.large instances.
Key Takeaways
- t3.medium on EKS is limited to 17 pods by default because of VPC CNI IP address limits.
- The formula: max_pods = (ENIs) × (IPs-per-ENI - 1) + 2.
- The symptom is FailedScheduling: 0/1 nodes available: 1 Too many pods. Check allocatable pods with kubectl get node -o jsonpath='{.status.allocatable.pods}'.
- Fastest fix: add a second node. Most scalable fix: enable prefix delegation. Long-term fix: right-size the instance type.
- Add ignore_changes = [scaling_config[0].desired_size] in Terraform if using Cluster Autoscaler, so Terraform doesn’t reset the count on every apply.