Performance Issues
Overview
Diagnosis and resolution of Kubernetes performance problems — CPU throttling, memory pressure, I/O bottlenecks, slow API server, and HPA/VPA misconfigurations.
Performance Triage Checklist
# 1. Pod-level resource usage
kubectl top pods -n <ns> --sort-by=cpu | head -20
kubectl top pods -n <ns> --sort-by=memory | head -20
# 2. Node-level resource usage
kubectl top nodes
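# 3. Check for CPU throttling (share of CFS periods in which the container was throttled)
# Prometheus (cAdvisor): container_cpu_cfs_throttled_periods_total / container_cpu_cfs_periods_total
# kubectl get --raw has no --data-urlencode flag, so port-forward and query with curl -G
# (assumes the prometheus-operated Service exposes its web port on 9090)
kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090 &
curl -sG http://localhost:9090/api/v1/query \
--data-urlencode 'query=rate(container_cpu_cfs_throttled_periods_total{namespace="production"}[5m]) /
rate(container_cpu_cfs_periods_total{namespace="production"}[5m]) > 0.25' | jq .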
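# 4. Check for OOM kills
# OOMKilling events are typically attached to the Node (e.g. by node-problem-detector), so search all namespaces
kubectl get events -A --field-selector reason=OOMKilling
# Per-pod view: pods with a container whose last termination reason was OOMKilled
kubectl get pods -n <ns> -o json | \
jq -r '.items[] | select(any(.status.containerStatuses[]?; .lastState.terminated.reason=="OOMKilled")) | .metadata.name'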
# 5. Check resource requests vs limits
kubectl get pods -n <ns> -o json | \
jq -r '.items[] | .metadata.name + ": " +
(.spec.containers[0].resources | "req_cpu=" + (.requests.cpu//"none") +
" limit_cpu=" + (.limits.cpu//"none") +
" req_mem=" + (.requests.memory//"none") +
" limit_mem=" + (.limits.memory//"none"))'
CPU Throttling
CPU throttling occurs when:
A container uses up its CFS quota, which is derived from its cpu limit.
Linux CFS (Completely Fair Scheduler) bandwidth control then pauses it for the remainder of the 100ms period.
Symptoms:
- p99 latency much higher than p50 (spiky latency)
- Application feels "frozen" for short bursts
- Prometheus metric: container_cpu_cfs_throttled_seconds_total is increasing
Throttling vs no-limit:
cpu limit = 500m → quota of 50ms CPU time per 100ms period; throttled as soon as that 50ms is consumed (4 busy threads consume it in 12.5ms of wall-clock time)
no cpu limit → container can burst to full node CPU, but may compete with neighbors
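The 500m example above can be worked through numerically. The following is a minimal sketch, assuming the default 100ms CFS period and threads that stay busy from the start of each period; the function names are illustrative and not part of any tool.

```shell
# Convert a cpu limit in millicores to the CFS quota: microseconds of CPU
# time the container may consume per period (default period: 100000 us).
cfs_quota_us() {
  local limit_millicores=$1
  local period_us=${2:-100000}
  echo $(( limit_millicores * period_us / 1000 ))
}

# Wall-clock milliseconds spent throttled in each period when `threads`
# threads are busy from the start of the period (worst case, assumed here).
throttled_ms_per_period() {
  local limit_millicores=$1
  local threads=$2
  local period_us=100000
  local quota_us
  quota_us=$(cfs_quota_us "$limit_millicores" "$period_us")
  # N busy threads burn the quota N times faster than wall-clock time.
  local runnable_us=$(( quota_us / threads ))
  if [ "$runnable_us" -ge "$period_us" ]; then
    echo 0
  else
    echo $(( (period_us - runnable_us) / 1000 ))
  fi
}

cfs_quota_us 500               # 50000
throttled_ms_per_period 500 1  # 50
throttled_ms_per_period 500 4  # 87
```

The last call shows why multi-threaded services are hit hardest: four busy threads exhaust a 500m quota in 12.5ms and then sit idle for the remaining 87.5ms of the period (printed as 87 because of integer division).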
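# Check if throttling is occurring
# cgroup v2 (most current distros):
kubectl exec <pod> -n <ns> -- cat /sys/fs/cgroup/cpu.stat
# Look for: nr_throttled > 0 and throttled_usec (microseconds) > 0
# cgroup v1:
kubectl exec <pod> -n <ns> -- cat /sys/fs/cgroup/cpu/cpu.stat
# Look for: throttled_time (nanoseconds) > 0
# PromQL query: share of CFS periods throttled, per container
rate(container_cpu_cfs_throttled_periods_total{pod=~"payments-api.*"}[5m]) /
rate(container_cpu_cfs_periods_total{pod=~"payments-api.*"}[5m])
# > 25% throttle ratio is a problem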
# Fix options:
# 1. Increase cpu limit:
kubectl patch deployment payments-api -n production \
-p '{"spec":{"template":{"spec":{"containers":[{"name":"payments-api","resources":{"limits":{"cpu":"2000m"}}}]}}}}'
# 2. Remove cpu limit entirely (for latency-sensitive services):
# Risk: pod can consume entire node CPU and starve neighbors
# Mitigate: use cpu request (guarantees share) without limit
# 3. Profile the application to reduce CPU usage
# Go: pprof; Java: async-profiler; Node: clinic.js
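When Prometheus is not available, the same throttle ratio can be estimated directly from cpu.stat, which reports nr_periods and nr_throttled alongside the throttled time on both cgroup v1 and v2. A small sketch, assuming awk is available on the machine running kubectl; throttle_pct is a hypothetical helper name.

```shell
# Read cpu.stat content on stdin and print the percentage of CFS periods
# in which the container was throttled (counters are since container start).
throttle_pct() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0) printf "%.1f\n", 100 * throttled / periods
      else print "0.0"
    }'
}

# Example with canned input; against a live pod, pipe in the output of
#   kubectl exec <pod> -n <ns> -- cat /sys/fs/cgroup/cpu.stat
printf 'nr_periods 1000\nnr_throttled 250\nthrottled_usec 4200000\n' | throttle_pct   # 25.0
```

Because the counters are cumulative, a long-running container can hide a recent problem; sampling twice and comparing the deltas gives a view closer to the PromQL rate.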
Memory Pressure and OOM
# Check OOMKill history (events are typically recorded against the Node, so search all namespaces)
kubectl get events -A -o json | \
jq '.items[] | select(.reason=="OOMKilling") |
{object:.involvedObject.name, msg:.message, time:.lastTimestamp}'
# Current memory usage vs limit
kubectl top pod <pod> -n <ns>
kubectl get pod <pod> -n <ns> \
-o jsonpath='{.spec.containers[0].resources.limits.memory}'
# Java heap sizing (common mistake)
# Container limit: 2Gi
# Container-aware JVMs (JDK 10+, 8u191+; -XX:+UseContainerSupport is on by default):
#   default max heap = 25% of the cgroup limit = 512Mi → most of the 2Gi goes unused
# Older JVMs size the heap from HOST memory, not the container cgroup limit:
#   e.g. on a 64Gi node the default max heap is 16Gi → OOMKilled at 2Gi
# Non-heap (metaspace, code cache, thread stacks, GC overhead): budget ~512Mi
# Fix: add -XX:MaxRAMPercentage=75 or explicit -Xmx1500m, leaving room for non-heap
# Go memory: no memory limit by default (the runtime does not read the cgroup limit)
# Add: GOMEMLIMIT=1750MiB (Go 1.19+); GOGC=100 is already the default
# GOMEMLIMIT is a soft limit: the GC works harder as the heap approaches it, before the container limit is hit
# Node.js heap
# --max-old-space-size=1536 (value in MiB: 1.5GiB for a 2GiB container)
# VPA recommendation for memory
kubectl describe vpa <vpa-name> -n <ns>
# Look for: target memory recommendation
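The runtime settings above all follow the same pattern: take the container memory limit and leave headroom for non-heap memory. A rough sketch of that arithmetic follows; the 75% and 85% fractions are assumptions chosen to match the 2Gi examples in this section, not official recommendations, and runtime_mem_flags is a hypothetical helper.

```shell
# Print suggested memory flags for a container memory limit given in MiB.
# Fractions are illustrative assumptions; tune them per workload.
runtime_mem_flags() {
  local limit_mib=$1
  echo "java: -Xmx$(( limit_mib * 75 / 100 ))m"
  echo "go: GOMEMLIMIT=$(( limit_mib * 85 / 100 ))MiB"
  echo "node: --max-old-space-size=$(( limit_mib * 75 / 100 ))"
}

runtime_mem_flags 2048
# java: -Xmx1536m
# go: GOMEMLIMIT=1740MiB
# node: --max-old-space-size=1536
```

For Java, -XX:MaxRAMPercentage=75 expresses the same intent without hard-coding a number, which keeps the flag valid when the limit changes.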
Slow Application Response
# Step 1: Identify which tier is slow
# - Is it the database? The upstream API? The app itself?
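# Inline -w format: a local curl-format.txt file does not exist inside the netshoot pod
kubectl run netshoot --image=nicolaka/netshoot --rm -it -- \
curl -o /dev/null -s http://payments-api.production.svc:8080/ \
-w 'dns=%{time_namelookup}s connect=%{time_connect}s ttfb=%{time_starttransfer}s total=%{time_total}s\n'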
# Step 2: Check connection pool exhaustion
# App log: "too many open connections", "pool timeout", "ETIMEDOUT"
# Fix: tune pool size, or find connection leak
# Step 3: CPU throttling causing latency spikes
# See CPU throttling section above
# Step 4: GC pauses (Java/Go)
# Go: check GODEBUG=gctrace=1 output
# Java: check GC logs (-Xlog:gc* on JDK 9+; -XX:+PrintGCDetails on JDK 8)
# Fix: tune GC settings, increase heap headroom
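# Step 5: Network latency
# ClusterIP Services usually do not answer ICMP, so ping a pod IP rather than the Service name
kubectl get pod <pod> -n production -o jsonpath='{.status.podIP}'
kubectl run netshoot --image=nicolaka/netshoot --rm -it -- \
ping -c 10 <pod-ip>
# Inter-pod latency should be <1ms on same node, <5ms across nodes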
# Step 6: DNS lookup latency
kubectl run netshoot --image=nicolaka/netshoot --rm -it -- \
dig payments-api.production.svc.cluster.local
# Read the ";; Query time:" line: DNS should resolve in <5ms; if >50ms → CoreDNS issue
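The curl timings from Step 1 are cumulative (each value includes the phases before it), so the slow tier is easier to spot after subtracting them. A minimal sketch, assuming the four standard write-out values time_namelookup, time_connect, time_starttransfer, and time_total in seconds; curl_phases is a hypothetical helper, and "wait" is used loosely for the time between connect and first byte.

```shell
# Turn cumulative curl timings (seconds) into per-phase durations (ms).
# Args: time_namelookup time_connect time_starttransfer time_total
curl_phases() {
  awk -v dns="$1" -v conn="$2" -v ttfb="$3" -v total="$4" 'BEGIN {
    printf "dns=%.0fms connect=%.0fms wait=%.0fms transfer=%.0fms\n",
      dns * 1000, (conn - dns) * 1000, (ttfb - conn) * 1000, (total - ttfb) * 1000
  }'
}

curl_phases 0.004 0.006 0.250 0.260
# dns=4ms connect=2ms wait=244ms transfer=10ms
```

In this sample the 244ms wait points at the application or its dependencies (Steps 2 to 4), while a large dns or connect value would point at Steps 5 and 6 instead.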
HPA Not Scaling Fast Enough
# Check HPA status
kubectl describe hpa payments-api -n production
# Conditions:
# AbleToScale: True
# ScalingActive: True
# ScalingLimited: False
# Current vs target metric
kubectl get hpa payments-api -n production
# TARGETS: 850m/200m ← 4x over target, should be scaling
# Why is it slow to scale?
# 1. scaleUp.stabilizationWindowSeconds > 0 (default: 0 for scale-up)
# 2. scaleUp.policies[].value too small (e.g., a Pods policy that allows only 2 new pods per minute)
kubectl get hpa payments-api -n production -o yaml | grep -A10 behavior
# 3. Pods are slow to become Ready (long readiness probe initialDelay)
kubectl describe pod <new-pod> -n production | grep "Readiness"
# 4. metrics-server lag (scrape interval is set by --metric-resolution; commonly 15s to 60s depending on version and manifest)
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/production/pods | \
jq '.items[].timestamp'
# Fix: configure scale-up behavior for faster response (goes under spec: of an autoscaling/v2 HPA)
behavior:
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Pods
      value: 10
      periodSeconds: 60 # allow up to 10 pods per minute
    - type: Percent
      value: 100
      periodSeconds: 60 # or double replicas per minute
    selectPolicy: Max
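To see how the target ratio and the scale-up policies interact, the arithmetic can be sketched as below. This is a simplified model, assuming the documented formula desired = ceil(current × metric / target) and a single policy period; it ignores the HPA tolerance band, minReplicas/maxReplicas, and stabilization. The function names are illustrative.

```shell
# Desired replicas from the HPA ratio formula (tolerance ignored).
# Args: current_replicas current_metric target_metric
hpa_desired() {
  awk -v r="$1" -v cur="$2" -v tgt="$3" 'BEGIN {
    d = r * cur / tgt
    c = int(d)
    if (c < d) c++      # round up
    print c
  }'
}

# Highest replica count reachable within one period under a Pods policy and
# a Percent policy combined with selectPolicy: Max.
# Args: replicas_at_period_start pods_policy_value percent_policy_value
hpa_scaleup_cap() {
  local replicas=$1
  local pods_policy=$2
  local percent_policy=$3
  local by_percent=$(( (replicas * percent_policy + 99) / 100 ))  # rounded up
  local step=$pods_policy
  if [ "$by_percent" -gt "$step" ]; then step=$by_percent; fi
  echo $(( replicas + step ))
}

hpa_desired 4 850 200       # 17
hpa_scaleup_cap 4 10 100    # 14
```

With 4 replicas at 850m against a 200m target, the HPA wants 17 replicas but the policies above allow at most 14 in the first minute, so reaching the target takes a second period.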
Node Resource Exhaustion
# Check allocatable vs requested resources on node
kubectl describe node <node> | grep -A15 "Allocated resources"
# Requests: CPU=3500m/4000m (87%), Memory=6Gi/8Gi (75%)
# Find which pods are consuming most resources
kubectl top pods -n production --sort-by=memory | head -20
# Check for resource request inflation (pods requesting more than they use)
# PromQL: actual usage vs request (working set excludes reclaimable page cache)
# container_memory_working_set_bytes{container!=""} / on(namespace,pod,container) kube_pod_container_resource_requests{resource="memory"}
# Node is fully packed → new pods stay Pending
# Options:
# 1. Scale out: add nodes or use Karpenter auto-provisioning
# 2. Reduce requests: use VPA to right-size
# 3. Use bin-packing: set the kube-scheduler NodeResourcesFit scoring strategy to MostAllocated (a scheduler setting, not an HPA one)
# Check node capacity including extended resources
kubectl describe node <node> | grep -E "Capacity:|Allocatable:" -A10
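The percentages in the "Allocated resources" output are simply requests divided by allocatable. A small sketch for reproducing them from raw quantities, assuming CPU values are either whole cores ("4") or millicores ("3500m"); fractional cores such as "0.5" are not handled, and the function names are hypothetical.

```shell
# Normalize a CPU quantity ("4" or "3500m") to millicores.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

# Percentage of allocatable CPU already claimed by requests (rounded down).
# Args: requested allocatable
cpu_request_pct() {
  local req
  local alloc
  req=$(to_millicores "$1")
  alloc=$(to_millicores "$2")
  echo $(( req * 100 / alloc ))
}

cpu_request_pct 3500m 4   # 87
```

A node at 87% requested CPU can still show low actual usage in kubectl top nodes; the scheduler only considers requests, which is why inflated requests leave pods Pending on idle nodes.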
Disk I/O Bottleneck
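# Symptoms: high iowait, slow database, application timeouts
# Check iowait on node (the stock ubuntu image has no iostat, so install sysstat first)
kubectl debug node/<node> -it --image=ubuntu -- \
bash -c 'apt-get update -qq && apt-get install -y -qq sysstat && iostat -x 1 5'
# Look for: %util near 100%, high r_await / w_await (ms)
# Which process is causing I/O (-P: processes only, -o: only those doing I/O)
kubectl debug node/<node> -it --image=ubuntu -- \
bash -c 'apt-get update -qq && apt-get install -y -qq iotop && iotop -Po'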
# Check if the volume is exhausting its EBS burst budget (applies to gp2, not gp3)
# gp3: 3000 IOPS and 125 MiB/s baseline regardless of size (no burst bucket)
# gp2: baseline IOPS = 3 × size in GiB (min 100); smaller volumes burst to 3000 from a credit bucket
PV_HANDLE=$(kubectl get pv <pv> -o jsonpath='{.spec.csi.volumeHandle}') # EBS volume ID for CSI-provisioned PVs
aws cloudwatch get-metric-statistics \
--namespace AWS/EBS --metric-name BurstBalance \
--dimensions Name=VolumeId,Value=$PV_HANDLE \
--start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
--period 300 --statistics Average
# On macOS/BSD date, use -v-1H instead of -d '1 hour ago'
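# Raise provisioned IOPS/throughput on gp3 if a higher sustained baseline is needed
# StorageClass parameters are immutable, so patching the class is rejected; modify the volume directly
aws ec2 modify-volume --volume-id $PV_HANDLE --volume-type gp3 --iops 6000 --throughput 250
# Note: for new PVCs, create a new StorageClass with iops: "6000" and throughput: "250"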
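The gp2 rule above (3 IOPS per GiB, bursting to 3000) determines how long a small volume can sustain a burst before BurstBalance reaches zero. A back-of-the-envelope sketch, assuming the published gp2 figures of a 100 IOPS floor, a 16000 IOPS ceiling, and a 5.4 million credit bucket, with the volume driven at the full 3000 IOPS; the function names are illustrative.

```shell
# Baseline IOPS for a gp2 volume of the given size in GiB.
gp2_baseline_iops() {
  local size_gib=$1
  local iops=$(( size_gib * 3 ))
  if [ "$iops" -lt 100 ]; then iops=100; fi
  if [ "$iops" -gt 16000 ]; then iops=16000; fi
  echo "$iops"
}

# Seconds a full credit bucket lasts at a sustained 3000 IOPS.
# Volumes whose baseline is already >= 3000 do not rely on burst credits.
gp2_burst_seconds() {
  local baseline
  baseline=$(gp2_baseline_iops "$1")
  if [ "$baseline" -ge 3000 ]; then
    echo "no-burst"
  else
    echo $(( 5400000 / (3000 - baseline) ))
  fi
}

gp2_baseline_iops 100   # 300
gp2_burst_seconds 100   # 2000
```

A 100 GiB gp2 volume therefore sustains 3000 IOPS for roughly 33 minutes and then drops to 300, which often shows up as a database that slows down partway through a batch job.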
# Use fio to benchmark actual disk performance
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: fio-test
  namespace: production
spec:
  restartPolicy: Never # run the benchmark once instead of restarting it in a loop
  containers:
  - name: fio
    image: nixery.dev/fio
    command: ["fio", "--name=randread", "--ioengine=libaio",
              "--rw=randread", "--bs=4k", "--numjobs=4",
              "--runtime=30", "--filename=/data/test",
              "--size=1G", "--direct=1"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: payments-data
EOF
API Server Latency
# API server request latency (high → controllers slow to reconcile)
kubectl get --raw '/metrics' | grep apiserver_request_duration_seconds_bucket | \
grep -v "^#" | tail -5
# PromQL: p99 request latency
histogram_quantile(0.99,
sum(rate(apiserver_request_duration_seconds_bucket[5m])) by (le, verb, resource)
)
# Common causes of API server slowness:
# 1. etcd latency > 100ms
kubectl get --raw '/metrics' | grep etcd_request_duration
# 2. Too many watch connections (informer leak)
kubectl get --raw '/metrics' | grep apiserver_longrunning_requests | grep WATCH
# 3. Large list requests (missing pagination, large objects)
kubectl get --raw '/metrics' | grep apiserver_request_total | grep LIST
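# Fix: have controllers read from shared informers (watch-backed caches) instead of repeated LISTs; paginate large lists (limit/continue)
# Fix: narrow LIST calls with field selectors / label selectors
# Fix: keep large blobs out of etcd (ConfigMaps and Secrets are stored in etcd too); store them externally and reference them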
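To find which resources drive the LIST load from cause 3, the raw /metrics output can be aggregated per resource. A minimal sketch, assuming the Prometheus text format with resource and verb labels on apiserver_request_total; list_requests_by_resource is a hypothetical helper, and the counts are cumulative since API server start.

```shell
# Read Prometheus text-format metrics on stdin and print the total LIST
# request count per resource, highest first.
list_requests_by_resource() {
  awk '
    /^apiserver_request_total[{]/ && /verb="LIST"/ {
      # Anchor on "{" or "," so that subresource="..." is not matched.
      if (match($0, /[{,]resource="[^"]*"/)) {
        res = substr($0, RSTART + 11, RLENGTH - 12)
        count[res] += $NF
      }
    }
    END { for (r in count) printf "%d %s\n", count[r], r }
  ' | sort -rn
}

# Against a live cluster:
#   kubectl get --raw '/metrics' | list_requests_by_resource | head
```

Resources at the top of this list are the first candidates for pagination or for moving the client to an informer.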
Related
- 06 — HPA Flow — HPA calculation and scaling triggers
- 01 — Pod Failures — OOMKilled and eviction
- 06 — Node Issues — node-level resource pressure