Kubernetes Cost Optimization: 10 Strategies to Reduce Cloud Spend
The top Kubernetes cost optimisation strategies are: (1) right-size pod resource requests and limits using VPA recommendations; (2) use Spot/Preemptible instances for non-critical workloads (60–80% cheaper); (3) implement the Horizontal Pod Autoscaler to match replica counts to actual load; (4) set namespace ResourceQuotas to prevent runaway consumption; (5) delete unused namespaces and orphaned persistent volumes; (6) use KEDA for event-driven scaling (scale to zero for idle workloads); (7) run the Cluster Autoscaler to right-size the node pool; (8) scale dev/staging environments down on a schedule; (9) deploy cost visibility tooling such as Kubecost or OpenCost; (10) cover baseline load with Reserved Instances. Typical savings: 30–50% of existing K8s cloud spend.
Why Kubernetes Costs Spiral
Kubernetes makes resource allocation easy — which means over-provisioning is common. The three biggest cost drivers:
- Over-specified resource requests: Pods request 2 CPU and 4GB RAM but average 0.2 CPU and 500MB. You pay for 2 CPU.
- Under-utilised node pools: Cluster nodes running at 20% utilisation because the cluster was sized for peak.
- Always-on workloads that could scale to zero: Batch jobs, dev/staging environments, and low-traffic services running 24/7.
Strategy 1: Right-Size Pod Resource Requests
Resource requests determine which node a pod is scheduled on — and which node you pay for. Oversized requests waste reserved capacity.
Use the Vertical Pod Autoscaler (VPA) in recommendation mode:
```bash
# Run VPA in "Off" mode (recommendations only)
kubectl apply -f vpa.yaml
kubectl describe vpa my-deployment
# Shows: Recommended CPU: 250m, Memory: 512Mi
#        Current request: CPU: 2000m, Memory: 4096Mi
```
Adjust requests based on VPA recommendations. Typical finding: 40–70% of pods are over-provisioned.
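The VPA object itself is a short manifest. A minimal sketch, assuming a Deployment named `my-deployment` (swap in your own workload name):

```yaml
# Illustrative VPA in recommendation-only mode: it computes suggested
# requests but never evicts or resizes pods.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-deployment-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  updatePolicy:
    updateMode: "Off"   # recommendations only, no automatic changes
```

Once you trust the recommendations, you can switch `updateMode` to `"Auto"` to let VPA apply them, but starting in `"Off"` mode keeps the rollout risk-free.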
Strategy 2: Use Spot / Preemptible Instances
Spot Instances (AWS) / Preemptible VMs (GCP) cost 60–80% less than on-demand. They can be reclaimed at short notice (a 2-minute warning on AWS, as little as 30 seconds on GCP), so they are appropriate for:
- Stateless web services (with multiple replicas)
- Batch processing jobs
- CI/CD runners
- Dev and staging environments
Use node pools with mixed instance types: on-demand for critical pods, Spot for everything else.
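As a sketch of the scheduling side: on EKS, managed node groups label their nodes with a capacity type, so a fault-tolerant workload can be steered onto Spot nodes with a `nodeSelector`. The deployment name and image below are illustrative:

```yaml
# Hypothetical example: pin a stateless worker to Spot capacity on EKS.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker
spec:
  replicas: 3
  selector:
    matchLabels: { app: batch-worker }
  template:
    metadata:
      labels: { app: batch-worker }
    spec:
      nodeSelector:
        eks.amazonaws.com/capacityType: SPOT   # label set by EKS managed node groups
      containers:
        - name: worker
          image: registry.example.com/batch-worker:latest   # placeholder image
```

GKE and AKS expose equivalent node labels for their Spot/Preemptible pools; critical pods simply omit the selector and land on on-demand nodes.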
Strategy 3: Cluster Autoscaler
Automatically adds nodes when pods are pending and removes nodes that fall below a utilisation threshold:

```yaml
# AWS EKS Cluster Autoscaler (illustrative Helm-values excerpt)
clusterAutoscaler:
  enabled: true
  scaleDownUtilizationThreshold: 0.5   # nodes below 50% utilisation become scale-down candidates
  scaleDownDelay: "10m"
```

This eliminates paying for idle nodes during off-peak hours.
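One practical caveat: scale-down only proceeds if every pod on the node can be evicted. For workloads that genuinely must not be moved, the Cluster Autoscaler honours a standard annotation, sketched here on a hypothetical pod:

```yaml
# Pods annotated safe-to-evict: "false" block the autoscaler from
# draining their node. Use sparingly, or nodes will never scale down.
apiVersion: v1
kind: Pod
metadata:
  name: stateful-job
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
    - name: job
      image: registry.example.com/stateful-job:latest   # placeholder image
```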
Strategy 4: Horizontal Pod Autoscaler (HPA)
Scale deployments based on CPU/memory or custom metrics:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
Strategy 5: KEDA for Scale-to-Zero
KEDA (Kubernetes Event-Driven Autoscaling) can scale workloads to zero replicas when there is no traffic — and back up when needed:
- Ideal for background workers, queue processors, scheduled jobs
- Scale to zero = zero cost for idle workloads
- Triggers: queue depth, cron schedule, HTTP traffic, custom metrics
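A KEDA `ScaledObject` ties a trigger to a workload. A minimal sketch using the cron trigger, with an assumed Deployment name of `report-worker`:

```yaml
# Hypothetical example: keep a worker at zero replicas outside
# business hours, and at 2 replicas during the working day.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: report-worker-scaler
spec:
  scaleTargetRef:
    name: report-worker        # the Deployment KEDA manages
  minReplicaCount: 0           # scale to zero when no trigger is active
  maxReplicaCount: 5
  triggers:
    - type: cron
      metadata:
        timezone: Europe/London
        start: "0 8 * * 1-5"   # scale up at 8am, weekdays
        end: "0 18 * * 1-5"    # back to zero at 6pm
        desiredReplicas: "2"
```

Swapping the trigger for a queue-depth scaler (e.g. SQS or RabbitMQ) gives the same scale-to-zero behaviour driven by backlog instead of the clock.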
Strategy 6: Namespace Resource Quotas
Prevent any single team or service from consuming the entire cluster:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-frontend
spec:
  hard:
    requests.cpu: "10"
    requests.memory: "20Gi"
    limits.cpu: "20"
    limits.memory: "40Gi"
    persistentvolumeclaims: "10"
```
Strategy 7: Scheduled Scaling for Dev Environments
Scale dev/staging to zero at night and weekends:
```yaml
# Using kube-downscaler or a KEDA cron trigger (illustrative config)
schedule: "0 18 * * MON-FRI"         # Scale down at 6pm
scaleUpSchedule: "0 8 * * MON-FRI"   # Scale up at 8am
```
Savings: dev environments typically represent 15–25% of K8s spend.
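With kube-downscaler specifically, the schedule lives as an annotation on the workload (or its namespace), so no extra controller config is needed per service. A sketch, with deployment and namespace names assumed:

```yaml
# Hypothetical example: kube-downscaler scales this Deployment to zero
# outside the annotated uptime window.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: staging-api
  namespace: staging
  annotations:
    downscaler/uptime: "Mon-Fri 08:00-18:00 Europe/London"
```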
Strategy 8: Delete Orphaned Resources
Run monthly cleanup:
```bash
# Find Released PersistentVolumes (their claims are gone,
# but the underlying storage still bills)
kubectl get pv | grep Released
# Find deployments scaled to zero that may no longer be needed
kubectl get deployments --all-namespaces | grep "0/0"
# Review namespaces and delete stale ones
kubectl get namespaces
```
Strategy 9: Use Cost Visibility Tools
You cannot optimise what you cannot see:
- Kubecost — shows cost per namespace, deployment, and pod
- OpenCost — CNCF open-source cost monitoring
- AWS Cost Explorer — EKS cost breakdown by tag
- Datadog Cost Management — real-time K8s spend with anomaly detection
Strategy 10: Reserved Instances for Baseline Load
Use Reserved Instances (1-year or 3-year) for the stable baseline of your node pool. Use Spot for burst capacity. A mix of 60% Reserved + 40% Spot typically delivers 40–50% total savings vs all on-demand.
Need a Kubernetes cost audit? Contact our cloud team to book a free K8s cost review.
About the Author
Director – AI Product Strategy, Development, Sales & Business Development, Ortem Technologies
Praveen Jha is the Director of AI Product Strategy, Development, Sales & Business Development at Ortem Technologies. With deep expertise in technology consulting and enterprise sales, he helps businesses identify the right digital transformation strategies, from mobile and AI solutions to cloud-native platforms. He writes about technology adoption, business growth, and building software partnerships that deliver real ROI.