EKS Cost Optimization Playbook: From Visibility to Savings

Published by Vishnu Siddarth on Jan 18, 2026

Introduction

If you run production workloads on Amazon EKS, you likely have untapped savings sitting in your cluster right now. Finding them requires more than high-level advice. This playbook combines observability through AWS tools, hands-on autoscaler configurations (Karpenter, Cluster Autoscaler, HPA, VPA), a decision matrix for Spot versus Fargate versus Reserved capacity, and a step-by-step rollout plan with real before-and-after numbers.

Before & After: Real Savings in 90 Days

A typical mid-size engineering team running 200 pods across 40 m5.large nodes, spending $8,400 monthly, can expect these results after implementing this playbook:

Before: $8,400/month

  • 40 On-Demand m5.large nodes at 25% average utilization

  • Oversized pod requests

  • No autoscaling

  • Dev clusters running 24/7

After: $2,800/month

  • 15 nodes via Karpenter (70% Spot, 30% On-Demand with Savings Plans)

  • Right-sized requests

  • Dev clusters scaled to zero overnight

Net Savings: $5,600/month (67% reduction)

These numbers are representative, not exceptional. Most teams implementing this playbook end-to-end achieve 50-70% cost reduction within three months.

Key Highlights

  • EKS cost waste primarily comes from oversized pods, idle EC2 nodes, orphaned EBS volumes, and unnecessary cross-AZ traffic

  • AWS Split Cost Allocation Data combined with native billing tools enables pod-level cost visibility and accurate attribution

  • Right-sizing pod requests typically cuts 30 to 40 percent of node capacity requirements

  • Karpenter delivers faster, cheaper autoscaling than Cluster Autoscaler and reduces compute costs by 30 to 50 percent

  • Spot Instances offer 60 to 90 percent savings for tolerant workloads when paired with proper PDBs and diversification

  • A hybrid compute strategy using On-Demand baseline, Spot variability, and Savings Plans commitments produces the largest long-term ROI

  • Most teams achieve 50 to 70 percent EKS savings in three months by implementing this playbook end to end

Why EKS Costs Spiral: Anatomy of Spend

The single biggest driver of EKS cost waste is mis-sized pod resource requests. When developers overestimate CPU and memory needs, often requesting 4 cores when workloads use only 1, Kubernetes provisions additional nodes to satisfy those inflated requests. This creates a cascade effect: more nodes lead to higher EC2 charges, more EBS volumes, increased cross-AZ traffic, and lower overall cluster utilization. Fixing pod resource requests first makes everything downstream significantly cheaper.
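A quick back-of-envelope calculation makes the cascade concrete. The ~$0.048 per vCPU-hour figure below assumes m5.large On-Demand pricing of roughly $0.096/hr for 2 vCPUs; your actual rates will vary:

```shell
# Cost of one pod that requests 4 vCPUs but uses only 1 (illustrative):
# 3 idle vCPUs x ~$0.048/vCPU-hr x 730 hrs/month
awk 'BEGIN{printf "wasted per pod: $%.2f/month\n", (4 - 1) * 0.048 * 730}'
```

Multiply roughly $105/month of stranded capacity across dozens or hundreds of over-requested pods and the node bill explains itself.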

Understanding Cost Components

EKS spending breaks down into several key components, each with its own cost leaks:

EC2 Compute forms the largest expense category. When teams overprovision instance types or leave nodes running during idle periods, waste compounds quickly. A single m5.large instance running 24/7 costs roughly $70 monthly. Multiply that by dozens of underutilized nodes, and you're looking at thousands in unnecessary spend.

EBS Volumes accumulate silently. Developers provision persistent volumes for stateful workloads, but those volumes persist even after pods terminate unless storage lifecycle controls clean them up. Orphaned EBS volumes billed at $0.10 per GB-month add up to significant waste over time.

Data Transfer Costs catch teams off guard. Cross-AZ traffic within your VPC costs $0.01 per GB in each direction. For data-intensive applications processing terabytes monthly, this becomes a five-figure line item. NAT Gateway processing fees add another layer, charging $0.045 per GB processed.

EKS Control Plane costs $0.10 per hour per cluster, totaling $73 monthly. While this fee stays constant, running multiple clusters for different environments multiplies this base cost unnecessarily.
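As a sanity check, the rates above translate into monthly figures like these (a sketch; the 1 TB traffic volume is an assumed workload, not a benchmark):

```shell
# Monthly cost sketches from the published per-unit rates above:
awk 'BEGIN{printf "EKS control plane (1 cluster): $%.2f/month\n", 0.10 * 730}'
awk 'BEGIN{printf "cross-AZ, 1 TB each direction: $%.2f/month\n", 1000 * 0.01 * 2}'
awk 'BEGIN{printf "NAT processing, 1 TB: $%.2f/month\n", 1000 * 0.045}'
```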

Common Cost Leaks

  • Oversized pod resource requests force Kubernetes to provision more nodes than workloads actually need. A pod requesting 4 CPU cores but using only 1 core wastes 75% of allocated capacity.

  • Abandoned workloads from completed experiments continue consuming resources months after teams stop caring about them.

  • Noisy neighbor problems arise when resource-intensive pods land on shared nodes, forcing other workloads onto additional instances.

Quick Decision Matrix: Workload → Autoscaler → Compute

| Workload Type | Best Autoscaler | Compute Strategy | Expected Savings |
|---|---|---|---|
| Stateless web apps | Karpenter + HPA | 70% Spot, 30% On-Demand | 60-75% |
| Batch/CI-CD | Karpenter | 90% Spot, 10% On-Demand | 75-85% |
| Databases/stateful | Cluster Autoscaler | 100% On-Demand + Savings Plans | 40-50% |
| ML inference | Karpenter | 50% Spot, 50% On-Demand | 50-60% |
| Low-traffic services | Fargate | Pay-per-use | 20% (ops efficiency gain) |
| Development/staging | Karpenter | 100% Spot, scale-to-zero | 80-90% |

Fargate Trade-off: ~20% higher compute cost for 80% reduction in operational overhead. Choose when operational simplicity justifies the premium.

Observe: Visibility and Allocation

You cannot optimize what you cannot measure. Establishing cost visibility requires connecting AWS billing data with Kubernetes resource allocation through a structured cost allocation method.

Enable AWS Split Cost Allocation Data

AWS Split Cost Allocation Data for EKS provides native, granular cost visibility down to the pod level. As of October 2025, this feature supports importing up to 50 Kubernetes custom labels per pod as cost allocation tags.

Setup Steps:

  1. Navigate to AWS Billing Console → Cost Management Preferences

  2. Enable Split Cost Allocation Data for Amazon EKS

  3. Choose your measurement method:

    • Resource requests (recommended): Allocates costs based on CPU/memory requests

    • Amazon Managed Service for Prometheus: Uses actual utilization metrics

    • CloudWatch Container Insights: Provides more granular visibility


  4. For accelerated computing workloads (GPU, Trainium, Inferentia): Split cost allocation automatically tracks these resources alongside CPU and memory (feature added September 2025)

  5. Enable Cost and Usage Reports (CUR) delivery to an S3 bucket with hourly granularity

Configure Cost Visualization

After enabling split cost allocation (data appears within 24 hours):

Use AWS Cost Explorer for basic visualization:

  • Filter by aws:eks:cluster-name, aws:eks:namespace, aws:eks:workload-name

  • Group costs by custom Kubernetes labels imported as tags

  • Set up Cost Anomaly Detection for automatic alerts

For Advanced Analysis:

  • Deploy the Containers Cost Allocation Dashboard in Amazon QuickSight

  • Query CUR data using Amazon Athena for custom reports

  • Create Cost Categories to map Kubernetes costs to business units

Map Billing to Kubernetes

Establish consistent labeling across all workloads:

metadata:
  labels:
    team: platform
    environment: production
    app: api-gateway
    cost-center: engineering

These labels automatically appear in CUR as resourceTags/kubernetes.io/team, resourceTags/kubernetes.io/environment, etc., enabling precise cost allocation and chargeback.

Quick Wins: Low-Effort High-Impact Changes

Several optimization tactics deliver immediate savings with minimal implementation effort.

1. Tune Resource Requests and Limits

Analyze actual pod consumption using kubectl top pods to compare requested resources against actual usage. A pod requesting 2 CPU cores but averaging 0.5 cores wastes 75% of its allocation.

resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 1000m
    memory: 1Gi

This change alone can reduce required node capacity by 30-40% in over-provisioned clusters.

2. Scale Down Development Clusters

Development and testing environments rarely need full capacity overnight. Implement cluster schedulers or simple cron jobs:

# Scale down at 7 PM
0 19 * * * kubectl scale deployment --all --replicas=1 -n development

# Scale up at 7 AM 
0 7 * * * kubectl scale deployment --all --replicas=3 -n development

3. Remove Abandoned Workloads

Audit deployments that haven't been updated in 90+ days:

kubectl get deployments --all-namespaces \
  --sort-by=.metadata.creationTimestamp

This housekeeping typically recovers 10-15% of cluster capacity.

4. Choose Right-Sized Node Types

Analyze workload patterns to select optimal instances:

  • Memory-intensive: r5 instances provide more RAM per dollar

  • Compute-bound: c5 instances offer more CPU capacity

  • Burstable workloads: t3/t4g instances save costs during idle periods

Autoscaling Deep Dive: Karpenter vs Cluster Autoscaler

Cluster Autoscaler: Stable and Predictable

The traditional choice that monitors pending pods and scales Auto Scaling Groups.

Setup:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  template:
    spec:
      containers:
      - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.33.0
        name: cluster-autoscaler
        command:
          - ./cluster-autoscaler
          - --cloud-provider=aws
          - --skip-nodes-with-local-storage=false
          - --expander=least-waste
          - --scale-down-utilization-threshold=0.5

Characteristics:

  • Scale-up latency: 2-5 minutes

  • Average utilization: 25-40%

  • Works within predefined node groups

  • Mature, stable, multi-cloud portable

Karpenter: Fast and Flexible (Recommended for AWS)

Provisions nodes directly via EC2 APIs, selecting optimal instance types based on pending pod requirements. As of July 2025, Karpenter v1.5 introduced faster bin-packing and "emptiness-first" consolidation.

Basic NodePool Configuration:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      # v1 requires a nodeClassRef pointing at an EC2NodeClass
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["m5.large", "m5a.large", "m5n.large"]
  limits:
    cpu: 1000
    memory: 1000Gi


Performance Advantages:

  • Scale-up latency: 45-60 seconds (vs 2-5 minutes for CA)

  • Average utilization: 60-70% (vs 25-40% for CA)

  • Cost reduction: Real-world teams report 20-30% cluster-wide savings, up to 90% for CI/CD workloads

  • Intelligent bin-packing: Consolidates workloads aggressively without breaking SLOs
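How aggressively Karpenter consolidates is tunable per NodePool. A minimal sketch of the v1 `disruption` block (values here are illustrative defaults, not recommendations):

```yaml
spec:
  disruption:
    # Reclaim empty nodes first, then repack underutilized ones
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s
```

Lengthening `consolidateAfter` trades some savings for fewer node replacements, which matters for workloads sensitive to pod churn.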

Decision Framework

Choose Cluster Autoscaler when:

  • You need multi-cloud portability

  • Regulatory requirements mandate managed node group controls

  • Workload patterns are highly predictable

  • Team prefers proven, stable technology

Choose Karpenter when:

  • Optimizing aggressively for cost on AWS

  • Requiring fast scale-up response (<60s)

  • Running substantial Spot Instance workloads

  • Needing flexible instance type selection

Horizontal and Vertical Pod Autoscalers

HPA scales pod replicas based on metrics:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

VPA adjusts pod resource requests automatically:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  updatePolicy:
    updateMode: "Auto"

VPA analyzes historical usage and right-sizes requests, eliminating guesswork.
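If automatic evictions are too disruptive for a workload, VPA can run in recommendation-only mode: swap the `updatePolicy` above for the fragment below and read the suggested requests from the object's status (for example via `kubectl describe vpa app-vpa`):

```yaml
  updatePolicy:
    updateMode: "Off"   # compute recommendations, never evict pods
```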

Compute Choice Matrix: Spot, On-Demand, Fargate, and Commitments

Spot Instances: 60-90% Savings

Spot Instances offer dramatic discounts on spare EC2 capacity. They work exceptionally well for fault-tolerant workloads like batch processing, CI/CD pipelines, and stateless web services.

Configure Spot-friendly NodePools with diversification:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-pool
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["m5.large", "m5a.large", "m5n.large", "c5.large"]
      taints:
        - key: workload-type
          value: spot
          effect: NoSchedule
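With the taint in place, only pods that explicitly tolerate it will land on Spot nodes. The matching toleration goes in the pod spec:

```yaml
spec:
  tolerations:
  - key: workload-type
    operator: Equal
    value: spot
    effect: NoSchedule
```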

Implement Pod Disruption Budgets:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web-app

Real-world Spot adoption typically delivers 60-75% cost reduction for suitable workloads.

On-Demand Instances: Guaranteed Availability

Use for critical workloads requiring guaranteed availability:

  • Databases and stateful applications

  • Message queues

  • Application pods that cannot tolerate interruptions

Price predictability and instant availability justify the premium cost.

AWS Fargate: Zero Infrastructure Management

Fargate eliminates node management entirely, charging only for pod vCPU and memory consumption.

Pricing (as of December 2025):

  • $0.04048 per vCPU-hour

  • $0.004445 per GB-hour

Trade-off: Approximately 20% higher compute costs than equivalent EC2 instances, but removes 80% of operational overhead—no node patching, no capacity planning, no autoscaler configuration.
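Those rates let you verify the premium yourself. A quick check for an m5.large-shaped pod (2 vCPU, 8 GiB), assuming ~$0.096/hr m5.large On-Demand pricing:

```shell
# Fargate vs EC2 for the same 2 vCPU / 8 GiB footprint, 730 hrs/month:
awk 'BEGIN{printf "Fargate: $%.2f/month\n", (2 * 0.04048 + 8 * 0.004445) * 730}'
awk 'BEGIN{printf "EC2 m5.large: $%.2f/month\n", 0.096 * 730}'
awk 'BEGIN{printf "premium: %.0f%%\n", ((2 * 0.04048 + 8 * 0.004445) / 0.096 - 1) * 100}'
```

The Fargate figure comes out roughly 20% above the EC2 equivalent, consistent with the trade-off stated above.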

Consider Fargate when:

  • Operational simplicity justifies the premium

  • Running low-traffic services or scheduled jobs

  • Small teams without dedicated infrastructure engineers

  • Need instant scaling without pre-provisioned capacity

For cost-sensitive production workloads at scale, EC2 with Karpenter delivers better economics.

Savings Plans and Reserved Instances: 40-72% Discounts

Apply commitments to baseline capacity:

Compute Savings Plans: Flexibility across instance families and regions (40-50% discount)

Reserved Instances: Slightly higher discounts (up to 72%) but locked to specific instance types

Strategy: Purchase Savings Plans for predictable baseline capacity, use Spot for variable demand.

Tagging and Cost Allocation Best Practices

Establish a tagging taxonomy covering:

  • Team: Engineering team responsible

  • Environment: Production, staging, development

  • Application: Specific service name

  • Cost-center: Business unit for chargeback

apiVersion: v1
kind: Pod
metadata:
  labels:
    team: platform
    environment: production
    app: api-gateway
    cost-center: engineering

AWS Split Cost Allocation automatically imports these labels as tags with the prefix resourceTags/kubernetes.io/, enabling filtered analysis in Cost Explorer and CUR queries.

KPIs, Alerts, and Guardrails for FinOps

Track These Metrics

  1. Cost per namespace: Monthly spend attributed to each Kubernetes namespace

  2. Cost per pod: Average cost per running pod

  3. CPU/Memory efficiency: Ratio of utilized resources to requested resources

  4. Idle node percentage: Proportion of provisioned capacity unused

  5. Wasted spend: Dollar value of over-provisioned resources

Configure Budget Alerts

Use AWS Budgets with Cost Anomaly Detection:

# Set budget threshold
aws budgets create-budget \
  --account-id 123456789 \
  --budget file://budget-config.json
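The referenced `budget-config.json` follows the AWS Budgets `CreateBudget` schema. A minimal hypothetical example (the name and amount are placeholders):

```json
{
  "BudgetName": "eks-monthly",
  "BudgetType": "COST",
  "TimeUnit": "MONTHLY",
  "BudgetLimit": {
    "Amount": "9000",
    "Unit": "USD"
  }
}
```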

Set Karpenter Limits

Prevent runaway scaling:

spec:
  limits:
    cpu: "1000"
    memory: "1000Gi"

Get Expert Help

Organizations implementing this playbook typically reduce EKS costs by 50-70% within three months. The combination of visibility, right-sizing, intelligent autoscaling, and strategic use of Spot Instances delivers sustainable savings without compromising reliability.

Ready to uncover your hidden EKS savings? Visit Opsolute or book a free assessment to get started.

Opsolute provides comprehensive EKS cost optimization as a service, combining automated analysis, expert recommendations, and hands-on implementation support to help you achieve these savings faster with less risk.