kube-proxy


kube-proxy is the network rules agent that runs on every node. Its sole job is to implement the Service abstraction at the network level: when a packet destined for a Service ClusterIP arrives at the node, kube-proxy's rules translate (DNAT) that virtual IP into a real pod IP and load-balance across healthy endpoints. kube-proxy itself never handles any packet in user-space — it programs the kernel to do the work.

For a deep treatment of how Services work end-to-end, see Networking §Services and kube-proxy Internals. This page focuses on kube-proxy as a node component: its process, modes, configuration, and operations.

What kube-proxy Does (and Doesn't Do)

What it does

  • Watches Service and EndpointSlice objects from the API server
  • Programs iptables NAT rules, IPVS virtual servers, or nftables rules to implement ClusterIP load balancing
  • Implements NodePort — opens host ports and forwards to pod IPs
  • Implements ExternalIP — adds DNAT for externally-routable IPs
  • Handles sessionAffinity: ClientIP via iptables recent-match or IPVS persistence
  • Handles externalTrafficPolicy: Local — only forwards to local node endpoints for NodePort/LB traffic, preserving source IP

What it does NOT do

  • Does not forward packets itself — all forwarding is done by the kernel
  • Does not implement pod-to-pod networking — that's the CNI plugin's job
  • Does not enforce NetworkPolicy — that's the CNI plugin (e.g., Calico, Cilium)
  • Does not provide DNS — that's CoreDNS
  • Is not needed on clusters that use an eBPF-based Service implementation (Cilium kube-proxy replacement, Calico eBPF)
kube-proxy is optional with eBPF CNIs
Clusters running Cilium or Calico with eBPF mode can completely replace kube-proxy. In these setups, eBPF programs attached to network interfaces handle Service load balancing directly in the kernel's fast path without any iptables rules. See eBPF Networking.

Proxy Modes

kube-proxy supports three implementation modes, selected via --proxy-mode:

iptables (default)

Programs chains in the PREROUTING and OUTPUT netfilter hooks. Each Service gets a chain; each endpoint gets a rule using --probability for random load balancing. Rule count grows linearly with Services × endpoints.

Scale limit: ~10,000 Services (rule traversal is O(n)).

Used when: default on most managed clusters; no kernel IPVS module required.

ipvs

Uses the Linux IPVS kernel module (LVS). Creates a virtual server per Service ClusterIP:port and real servers per endpoint. Hash table lookup: O(1) regardless of Service count.

Scale limit: 100,000+ Services with negligible per-rule overhead.

Used when: large clusters (>1000 Services) or when advanced LB algorithms are needed.

nftables (alpha 1.29, beta 1.31, GA 1.33)

Uses the nftables API, the successor to iptables, and requires Linux kernel 5.13 or newer. Verdict maps provide O(1) lookups similar to IPVS without requiring a separate kernel module. Intended as the replacement for iptables mode on modern Linux distributions.

Used when: modern distros (RHEL 9, Ubuntu 22.04+) where iptables-legacy is deprecated.

iptables Mode — Rule Structure

In iptables mode, kube-proxy creates a predictable chain hierarchy in the nat table. Understanding this structure is essential for debugging connectivity issues.

Packet (dst: ClusterIP)
  PREROUTING -j KUBE-SERVICES   (external traffic)
  OUTPUT     -j KUBE-SERVICES
    KUBE-SERVICES: matches dst ClusterIP:port, -j KUBE-SVC-<hash> (one rule per Service)
      KUBE-SVC-<hash>:
        --probability 0.33 -j KUBE-SEP-1
        --probability 0.50 -j KUBE-SEP-2
        -j KUBE-SEP-3
          KUBE-SEP-1: DNAT → 10.244.1.5:8080
          KUBE-SEP-2: DNAT → 10.244.2.8:8080
          KUBE-SEP-3: DNAT → 10.244.3.2:8080
    KUBE-NODEPORTS: matches dport 30000-32767, -j KUBE-SVC-<hash>
  POSTROUTING
    KUBE-POSTROUTING: MASQUERADE
    (MASQUERADE applied to pod→Service traffic so return packets route back correctly)
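The probabilities in the KUBE-SVC chain (0.33, 0.50, then an unconditional jump) look uneven but yield a uniform split, because each rule only sees the traffic that fell through the rules above it. The following is a minimal sketch of that arithmetic; the function names are illustrative and are not part of kube-proxy.

```python
from fractions import Fraction


def rule_probabilities(n):
    """Per-rule match probability for n endpoints, evaluated top to bottom."""
    # Rule i (0-based) only sees traffic that earlier rules did not match,
    # so it has to match 1/(n - i) of what reaches it.
    return [Fraction(1, n - i) for i in range(n)]


def effective_shares(probs):
    """Overall fraction of all traffic that each rule ends up handling."""
    shares, remaining = [], Fraction(1)
    for p in probs:
        shares.append(remaining * p)
        remaining *= 1 - p
    return shares


probs = rule_probabilities(3)
print([str(p) for p in probs])                     # ['1/3', '1/2', '1']
print([str(s) for s in effective_shares(probs)])   # ['1/3', '1/3', '1/3']
```

The last rule always has probability 1, which is why it appears in the chain without a --probability match.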

Key iptables Chains

Chain | Table | Purpose
KUBE-SERVICES | nat | Entry point; one rule per Service matching dst ClusterIP:port; jumps to the per-Service chain
KUBE-SVC-<hash> | nat | Per-Service chain; random load balancing across endpoints using the statistic match with --probability
KUBE-SEP-<hash> | nat | Per-endpoint chain; performs DNAT to the pod IP:port
KUBE-NODEPORTS | nat | One rule per allocated NodePort (default range 30000-32767); jumps to the per-Service chain
KUBE-POSTROUTING | nat | MASQUERADE for packets that kube-proxy marked for SNAT (for example, a pod reaching itself through its own Service) so return traffic routes back correctly
KUBE-FORWARD | filter | Allows forwarding for established connections and explicitly accepted traffic
KUBE-FIREWALL | filter | Drops packets carrying kube-proxy's drop mark in older releases; newer releases use it to block non-local traffic addressed to 127.0.0.0/8
# Inspect kube-proxy's iptables rules
iptables -t nat -L KUBE-SERVICES -n --line-numbers | head -30

# Find the chain for a specific Service
SVC_IP=$(kubectl get svc my-service -o jsonpath='{.spec.clusterIP}')
iptables -t nat -L KUBE-SERVICES -n | grep $SVC_IP

# Follow the chain for that Service
iptables -t nat -L KUBE-SVC-XXXXXXXXXX -n --line-numbers

# Count total kube-proxy rules (can be large)
iptables-save | grep -c KUBE

IPVS Mode — Virtual Servers

In IPVS mode, kube-proxy creates a dummy interface called kube-ipvs0 and assigns all Service ClusterIPs to it. This makes the kernel recognize those IPs as local, enabling IPVS to intercept packets before they leave the node. IPVS maintains a hash table of virtual servers and real servers in kernel space.

# Check IPVS virtual servers (requires ipvsadm)
ipvsadm -Ln

# Output example:
# IP Virtual Server version 1.2.1 (size=4096)
# Prot LocalAddress:Port Scheduler Flags
#   -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
# TCP  10.96.0.1:443 rr
#   -> 10.0.1.10:6443               Masq    1      5          0
# TCP  10.96.0.10:53 rr
#   -> 10.244.0.5:53                Masq    1      0          0
#   -> 10.244.0.6:53                Masq    1      0          0
# TCP  10.100.50.20:80 rr
#   -> 10.244.1.5:8080              Masq    1      12         0
#   -> 10.244.2.8:8080              Masq    1      8          0

# Show the kube-ipvs0 dummy interface with all ClusterIPs
ip addr show kube-ipvs0 | grep "inet " | head -20

IPVS Scheduling Algorithms

Algorithm | Flag | Description
Round Robin | rr | Default. Distributes connections equally across backends. No state tracking.
Least Connection | lc | Sends to the backend with the fewest active connections. Better for heterogeneous request durations.
Destination Hashing | dh | Hashes the destination IP to pick a backend, so traffic to the same destination always lands on the same backend. Not keyed on the client; use sh for source-based stickiness.
Source Hashing | sh | Consistent src IP → backend mapping. Gives client stickiness without explicit timeout tracking.
Shortest Expected Delay | sed | Considers both active connections and weight. Approximates least-loaded server.
Weighted Round Robin | wrr | Weight-based round robin. kube-proxy currently assigns equal weight to all endpoints.
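To make the difference between rr and sh concrete, here is a small sketch of the two selection strategies. It is a simplified model: the kernel's sh scheduler uses its own hash function, and SHA-256 is only a stand-in here. The backend addresses are taken from the ipvsadm example above; the client IP is hypothetical.

```python
import hashlib
from itertools import cycle

backends = ["10.244.1.5:8080", "10.244.2.8:8080"]


def round_robin(backends):
    """rr: hand out backends in rotation, regardless of who the client is."""
    return cycle(backends)


def source_hash(client_ip, backends):
    """sh: the same client IP always maps to the same backend."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]


rr = round_robin(backends)
print([next(rr) for _ in range(4)])   # alternates between the two backends

first = source_hash("192.0.2.10", backends)
second = source_hash("192.0.2.10", backends)
print(first == second)                # True: the mapping is sticky per client
```

Note that with a modulo-based mapping like this one, adding or removing a backend reshuffles many clients, so stickiness only holds while the endpoint set is stable.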
IPVS requires kernel module and conntrack tuning
IPVS mode requires the ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh, and nf_conntrack kernel modules. The conntrack table (nf_conntrack_max) must also be sized appropriately, because IPVS still uses conntrack for NAT tracking. A limit of 65536, a common kernel default, is insufficient for large clusters; set it to at least 1M on large nodes.
# Check required kernel modules for IPVS
lsmod | grep -e ip_vs -e nf_conntrack

# Load modules if needed
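# -a loads every listed module; without it, modprobe treats the extra names as parameters of ip_vs
modprobe -a ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack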

# Make persistent (Ubuntu/Debian)
cat >> /etc/modules-load.d/ipvs.conf << 'EOF'
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF

# Tune conntrack (critical for IPVS at scale)
sysctl -w net.netfilter.nf_conntrack_max=1048576
sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=86400

Service Type Implementation

Service type | What kube-proxy does | What it does NOT do
ClusterIP | Programs DNAT rules for the ClusterIP:port → endpoint IPs | Does not provision cloud load balancers (that's CCM)
NodePort | Opens the NodePort on ALL nodes (KUBE-NODEPORTS chain); forwards to endpoints | Does not ensure the port is accessible through firewalls
LoadBalancer | Same as NodePort (backend); cloud LB is provisioned by CCM | Does not provision or program the cloud LB
ExternalIP | Programs DNAT for the explicitly specified external IP addresses | Does not route external IPs to the node (that's BGP/cloud routing)
Headless (clusterIP: None) | No rules; DNS returns pod IPs directly, with no kube-proxy involvement | N/A

externalTrafficPolicy

This field on a Service controls how traffic arriving at a NodePort or LoadBalancer is handled:

externalTrafficPolicy: Cluster (default)

Traffic arriving at any node's NodePort is forwarded to any healthy endpoint in the cluster, which may add an extra hop to another node. The source IP is SNAT'd to the node's IP, so the client IP is lost.

Client → Node A :30080
  → SNAT to Node A IP
  → Pod on Node B :8080
  (src IP = Node A, not client)

externalTrafficPolicy: Local

Traffic is only forwarded to endpoints on the same node. If no local endpoints exist, the connection is dropped (health checks will fail, causing the LB to route elsewhere). Source IP is preserved.

Client → Node A :30080
  → Pod on Node A :8080 only
  → src IP = Client IP preserved

  (Node A with no local pods:
   connection dropped — LB
   will not route here)
internalTrafficPolicy (1.26 GA)
Analogous to externalTrafficPolicy but for traffic originating from inside the cluster. When set to Local, ClusterIP traffic is only forwarded to endpoints on the same node. Useful for node-local caches (DaemonSets) where you want pods to always hit the local instance.
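The effect of the Cluster and Local policies on endpoint selection can be sketched as a simple filter. This is an illustrative model only, not kube-proxy's code; the function name, the tuple layout, and the node names are assumptions made for the example.

```python
def select_endpoints(endpoints, node, policy):
    """Return the pod IPs a node may forward to under the given policy.

    endpoints: list of (pod_ip, node_name) tuples for ready endpoints.
    policy: "Cluster" or "Local".
    """
    if policy == "Local":
        # Only endpoints on the node that received the traffic are eligible.
        return [ip for ip, ep_node in endpoints if ep_node == node]
    # Cluster: every ready endpoint is eligible, wherever it runs.
    return [ip for ip, _ in endpoints]


endpoints = [("10.244.1.5", "node-a"), ("10.244.2.8", "node-b")]

print(select_endpoints(endpoints, "node-a", "Cluster"))  # both pods
print(select_endpoints(endpoints, "node-a", "Local"))    # only the pod on node-a
print(select_endpoints(endpoints, "node-c", "Local"))    # [] -> traffic is dropped
```

The empty result for node-c corresponds to the "no local endpoints" case described above, where the node fails the load balancer's health check.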

EndpointSlices

kube-proxy has fully migrated from watching Endpoints objects to watching EndpointSlice objects (the EndpointSlice API went GA in 1.21; kube-proxy on Linux has used it by default since 1.19). EndpointSlices shard endpoints into slices of up to 100 endpoints each by default, drastically reducing the size of individual watch events.

With old Endpoints: a single endpoint change in a Service with 1000 pods triggered a 1000-entry object update broadcast to every kube-proxy in the cluster. With EndpointSlice: only the slice containing the changed endpoint is updated — at most 100 endpoints per event.
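The size difference can be illustrated with a short sketch that shards 1000 synthetic endpoints and counts how many slices one change touches. It assumes endpoints are packed into slices in order, which is a simplification; the real EndpointSlice controller places endpoints differently and does not rebalance eagerly. The IP addresses are made up for the example.

```python
MAX_ENDPOINTS_PER_SLICE = 100  # default slice size described above


def shard(endpoints, size=MAX_ENDPOINTS_PER_SLICE):
    """Split a flat endpoint list into consecutive slices of at most `size`."""
    return [endpoints[i:i + size] for i in range(0, len(endpoints), size)]


# 1000 synthetic pod IPs: 10.244.0.1 .. 10.244.3.250
endpoints = [f"10.244.{i // 250}.{i % 250 + 1}" for i in range(1000)]
slices = shard(endpoints)

changed = "10.244.1.8"  # the one endpoint whose state changes
affected = [s for s in slices if changed in s]

print(len(slices))       # 10 slices in total
print(len(affected))     # 1 slice has to be re-sent to watchers
print(len(affected[0]))  # 100 endpoints in that update, not 1000
```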

# View EndpointSlices for a Service
kubectl get endpointslices -l kubernetes.io/service-name=my-service

# EndpointSlice object
kubectl get endpointslice my-service-abc12 -o yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: my-service-abc12
  labels:
    kubernetes.io/service-name: my-service
    endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: 8080
endpoints:
  - addresses: ["10.244.1.5"]
    conditions:
      ready: true
      serving: true       # Is currently serving (even during termination)
      terminating: false  # Is in graceful termination
    nodeName: worker-1
    zone: us-east-1a
    hints:
      forZones:
        - name: us-east-1a  # Topology Aware Routing hint

Topology Aware Routing

When the service.kubernetes.io/topology-mode: auto annotation is set on a Service (the annotation was introduced in 1.27), the EndpointSlice controller populates the hints.forZones field on each endpoint. kube-proxy then prefers endpoints whose hints match the zone of the node it runs on, which reduces cross-zone traffic costs.

apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    service.kubernetes.io/topology-mode: "auto"   # Enable topology-aware routing
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
Topology routing requires balanced endpoint distribution
If endpoints are not evenly distributed across zones, the EndpointSlice controller disables topology hints for that Service (to avoid routing all traffic to a zone with very few endpoints). Monitor endpoint_slice_controller_endpoints_added_per_sync and check EndpointSlice hints fields to verify topology routing is active.
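How a proxy might consume the hints can be sketched as a filter with a safe fallback: if the node's zone is unknown, any endpoint lacks hints, or no endpoint is hinted for the node's zone, all endpoints stay eligible. This is a simplified model written for illustration, not kube-proxy's implementation; the dict layout and function name are assumptions.

```python
def filter_by_zone(endpoints, node_zone):
    """Return the endpoints a node should use, honouring zone hints.

    endpoints: list of dicts with "ip" and "for_zones" (list of zone names).
    node_zone: zone of the node doing the filtering, or None if unknown.
    """
    if not node_zone:
        return endpoints
    if any(not ep["for_zones"] for ep in endpoints):
        # A single endpoint without hints disables filtering for the Service.
        return endpoints
    local = [ep for ep in endpoints if node_zone in ep["for_zones"]]
    # Never filter down to nothing: fall back to the full set.
    return local or endpoints


eps = [
    {"ip": "10.244.1.5", "for_zones": ["us-east-1a"]},
    {"ip": "10.244.2.8", "for_zones": ["us-east-1b"]},
]

print([ep["ip"] for ep in filter_by_zone(eps, "us-east-1a")])  # ['10.244.1.5']
print([ep["ip"] for ep in filter_by_zone(eps, "us-east-1c")])  # both endpoints
```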

KubeProxyConfiguration Reference

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration

# --- Core settings ---
bindAddress: "0.0.0.0"
clusterCIDR: "10.244.0.0/16"   # Pod CIDR; used to detect traffic that needs SNAT
hostnameOverride: ""            # Override node name if different from hostname

# --- Proxy mode ---
mode: "ipvs"                    # iptables | ipvs | nftables | "" (defaults to iptables on Linux)

# --- IPVS settings (only relevant when mode=ipvs) ---
ipvs:
  scheduler: "rr"               # rr | lc | dh | sh | sed | wrr
  syncPeriod: "30s"             # Full sync interval
  minSyncPeriod: "1s"           # Minimum time between partial syncs
  tcpTimeout: "0s"              # 0 = use kernel default (900s)
  tcpFinTimeout: "0s"
  udpTimeout: "0s"
  excludeCIDRs: []              # CIDRs to exclude from IPVS (e.g., cloud metadata)
  strictARP: true               # Required for MetalLB L2 mode; sets arp_ignore=1 and arp_announce=2

# --- iptables settings ---
iptables:
  masqueradeAll: false          # SNAT all traffic (not just pod-to-Service)
  masqueradeBit: 14             # Bit in fwmark used to identify traffic for masquerade
  syncPeriod: "30s"
  minSyncPeriod: "1s"

# --- nftables settings ---
nftables:
  masqueradeAll: false
  masqueradeBit: 14
  syncPeriod: "30s"
  minSyncPeriod: "1s"

# --- API server connection ---
clientConnection:
  kubeconfig: "/var/lib/kube-proxy/kubeconfig.conf"
  acceptContentTypes: ""
  contentType: "application/vnd.kubernetes.protobuf"
  qps: 10
  burst: 20                     # Increase for large clusters with frequent endpoint churn

# --- Ports ---
metricsBindAddress: "127.0.0.1:10249"   # Prometheus metrics
healthzBindAddress: "0.0.0.0:10256"     # Health check

# --- Feature gates ---
featureGates:
  TopologyAwareHints: true

kube-proxy DaemonSet

kube-proxy runs as a DaemonSet in kube-system. It requires elevated host privileges to program kernel networking rules:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-proxy
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: kube-proxy
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s-app: kube-proxy
    spec:
      priorityClassName: system-node-critical
      tolerations:
        - operator: Exists            # Run on ALL nodes including control plane
      hostNetwork: true               # Needs host network namespace for iptables
      serviceAccountName: kube-proxy
      containers:
        - name: kube-proxy
          image: registry.k8s.io/kube-proxy:v1.29.3
          command:
            - /usr/local/bin/kube-proxy
            - --config=/var/lib/kube-proxy/config.conf
            - --hostname-override=$(NODE_NAME)
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          securityContext:
            privileged: true          # Required for iptables/IPVS/nftables
          volumeMounts:
            - mountPath: /var/lib/kube-proxy
              name: kube-proxy
            - mountPath: /run/xtables.lock
              name: xtables-lock
              readOnly: false
            - mountPath: /lib/modules
              name: lib-modules
              readOnly: true
      volumes:
        - name: kube-proxy
          configMap:
            name: kube-proxy
        - name: xtables-lock
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
        - name: lib-modules
          hostPath:
            path: /lib/modules
xtables-lock prevents concurrent rule programming
The /run/xtables.lock volume mount ensures kube-proxy and any other process (e.g., a CNI plugin) that modifies iptables use the same lock file. Without this, concurrent iptables modifications can cause partial rule application or iptables: Resource temporarily unavailable errors.

Prometheus Metrics

Metric | Type | Description
kubeproxy_sync_proxy_rules_duration_seconds | Histogram | Time to sync all proxy rules. Alert if p99 > 10s.
kubeproxy_sync_proxy_rules_last_queued_timestamp_seconds | Gauge | When the last sync was triggered. Large lag = kube-proxy falling behind.
kubeproxy_sync_proxy_rules_iptables_total | Gauge | Total number of iptables rules currently programmed (iptables mode)
kubeproxy_sync_proxy_rules_iptables_restore_failures_total | Counter | Failed iptables-restore calls. Any increase means rule updates are not being applied.
kubeproxy_network_programming_duration_seconds | Histogram | End-to-end latency from endpoint change detected to rules programmed

Alerting Rules

groups:
  - name: kube-proxy
    rules:
      - alert: KubeProxyRuleSyncLatencyHigh
        expr: |
          histogram_quantile(0.99,
            rate(kubeproxy_sync_proxy_rules_duration_seconds_bucket[5m])
          ) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "kube-proxy rule sync p99 > 10s on {{ $labels.instance }}"
          description: "May indicate iptables rule count too high — consider IPVS or eBPF mode"

      - alert: KubeProxyIptablesRestoreFailure
        expr: increase(kubeproxy_sync_proxy_rules_iptables_restore_failures_total[5m]) > 0
        labels:
          severity: critical
        annotations:
          summary: "iptables-restore failing on {{ $labels.instance }} — Service rules not being applied"

      - alert: KubeProxyNotRunning
        expr: kube_daemonset_status_number_ready{daemonset="kube-proxy"} < kube_daemonset_status_desired_number_scheduled{daemonset="kube-proxy"}
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "kube-proxy not running on all nodes — Service connectivity broken"

Troubleshooting Runbooks

Runbook 1: Service ClusterIP not reachable from pod
# 1. Verify Service and endpoints exist
kubectl get svc my-service
kubectl get endpoints my-service
# If ENDPOINTS shows <none>: the selector doesn't match any ready pods

# 2. Verify kube-proxy is running on target node
kubectl get pods -n kube-system -l k8s-app=kube-proxy -o wide
kubectl logs -n kube-system kube-proxy-xxxxx --tail=50

# 3. Check iptables rules for the ClusterIP
SVC_IP=$(kubectl get svc my-service -o jsonpath='{.spec.clusterIP}')
iptables -t nat -L KUBE-SERVICES -n | grep $SVC_IP
# If missing: kube-proxy hasn't synced yet or is failing

# 4. Check IPVS (if in IPVS mode)
ipvsadm -Ln | grep -A5 $SVC_IP

# 5. Test connectivity directly to pod IP (bypassing Service)
POD_IP=$(kubectl get endpoints my-service -o jsonpath='{.subsets[0].addresses[0].ip}')
kubectl exec -it test-pod -- curl http://$POD_IP:8080

# 6. Check kube-proxy sync errors
# With -l, kubectl logs returns only the last 10 lines per pod unless --tail is set
kubectl logs -n kube-system -l k8s-app=kube-proxy --tail=500 | grep -i error | tail -20
Runbook 2: NodePort not accessible from outside cluster
# 1. Get the NodePort
kubectl get svc my-service -o jsonpath='{.spec.ports[0].nodePort}'
# e.g. 30080

# 2. Verify KUBE-NODEPORTS chain has the rule
iptables -t nat -L KUBE-NODEPORTS -n | grep 30080
# Should see: tcp dpt:30080 -> KUBE-SVC-...

# 3. Check firewall / security groups (cloud)
# The node's firewall must allow TCP/UDP 30000-32767 from external

# 4. Check which addresses kube-proxy serves NodePorts on
kubectl get cm -n kube-system kube-proxy -o yaml | grep -e bindAddress -e nodePortAddresses

# 5. Test from inside the node
NODE_IP=$(kubectl get node worker-1 -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}')
curl http://$NODE_IP:30080

# 6. If externalTrafficPolicy: Local, verify local endpoints exist
kubectl get endpoints my-service -o yaml | grep -A5 nodeName
Runbook 3: iptables rule sync is slow — kube-proxy lagging
# Symptom: new Services take minutes to become reachable
# or kubeproxy_sync_proxy_rules_duration_seconds p99 > 10s

# 1. Count current iptables rules
iptables-save | wc -l
iptables-save | grep -c KUBE
# If > 10000 rules: time to consider IPVS or eBPF

# 2. Check Services with many endpoints (each endpoint = 1 rule)
# Emit "count<TAB>namespace<TAB>name" lines so sort can order by count
kubectl get endpoints --all-namespaces -o json | jq -r '
  .items[] | [
    ([.subsets[]?.addresses[]?] | length),
    .metadata.namespace,
    .metadata.name
  ] | @tsv' | sort -rn | head -10

# 3. Switch to IPVS mode
# Edit kube-proxy ConfigMap:
kubectl edit cm kube-proxy -n kube-system
# Change: mode: "ipvs"
# Add: ipvs.strictARP: true (if using MetalLB)

# 4. Restart kube-proxy DaemonSet to apply
kubectl rollout restart daemonset/kube-proxy -n kube-system

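# 5. Clean up legacy iptables rules after switching to IPVS
# WARNING: review carefully before running in production. This rewrites the
# whole ruleset, and IPVS mode itself keeps a few KUBE-* chains that
# kube-proxy has to re-create on its next sync.
iptables-save | grep -v KUBE | iptables-restore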
Runbook 4: Source IP lost — need to preserve client IP
# Symptom: application sees node IP instead of client IP

# 1. Check current externalTrafficPolicy
kubectl get svc my-service -o jsonpath='{.spec.externalTrafficPolicy}'
# Returns: Cluster (source IP is SNAT'd)

# 2. Change to Local to preserve source IP
kubectl patch svc my-service \
  -p '{"spec":{"externalTrafficPolicy":"Local"}}'

# 3. Verify the Service has local endpoints on target nodes
kubectl get endpoints my-service -o yaml | grep nodeName

# 4. Configure your cloud LB health check to respect Local policy
# Most cloud LBs will stop routing to nodes with 0 local endpoints
# Health check: GET /healthz on the Service's healthCheckNodePort
# Kubernetes allocates this port automatically for type: LoadBalancer with externalTrafficPolicy=Local
kubectl get svc my-service -o jsonpath='{.spec.healthCheckNodePort}'

# 5. To keep in-cluster traffic on node-local endpoints as well (1.26+):
kubectl patch svc my-service \
  -p '{"spec":{"internalTrafficPolicy":"Local"}}'
Runbook 5: IPVS conntrack table exhaustion
# Symptom: connections randomly fail; dmesg shows "nf_conntrack: table full, dropping packet"

# 1. Check current conntrack usage
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max
# If count approaching max: need to increase max

# 2. Increase conntrack max (immediately, not persistent)
sysctl -w net.netfilter.nf_conntrack_max=2097152
sysctl -w net.core.netdev_max_backlog=250000

# 3. Make persistent
cat >> /etc/sysctl.d/99-conntrack.conf << 'EOF'
net.netfilter.nf_conntrack_max = 2097152
net.netfilter.nf_conntrack_tcp_timeout_established = 86400
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
EOF
sysctl --system

# 4. Tune IPVS timeouts to reduce stale entries
kubectl edit cm kube-proxy -n kube-system
# Set: ipvs.tcpTimeout: "900s"  (15 minutes, same as the kernel default; lower it to expire idle entries sooner)
#      ipvs.tcpFinTimeout: "30s"

# 5. Monitor conntrack with Prometheus
# node_nf_conntrack_entries vs node_nf_conntrack_entries_limit
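The count-versus-max comparison from step 1 can be wrapped in a small helper for ad hoc checks on a node. This is a sketch, not a monitoring tool: the 80% warning threshold is an assumption chosen for illustration, and the /proc files only exist when the nf_conntrack module is loaded.

```python
from pathlib import Path

PROC_DIR = Path("/proc/sys/net/netfilter")


def conntrack_status(count, maximum, warn_at=0.8):
    """Summarise conntrack table usage; warn_at is an illustrative threshold."""
    ratio = count / maximum
    level = "WARN" if ratio >= warn_at else "OK"
    return f"{level}: {count}/{maximum} entries ({ratio:.0%})"


def read_conntrack(proc_dir=PROC_DIR):
    """Read the live counters from /proc (requires nf_conntrack to be loaded)."""
    count = int((proc_dir / "nf_conntrack_count").read_text())
    maximum = int((proc_dir / "nf_conntrack_max").read_text())
    return count, maximum


print(conntrack_status(60000, 65536))   # WARN: 60000/65536 entries (92%)

# Only query the kernel when the counters are actually present.
if (PROC_DIR / "nf_conntrack_count").exists():
    print(conntrack_status(*read_conntrack()))
```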

kube-proxy Alternatives

Alternative | Mechanism | Advantage over kube-proxy | Requirement
Cilium (kube-proxy replacement) | eBPF programs on tc/XDP hooks | O(1) lookup, no conntrack for Service traffic, lower latency | Kernel 4.19+; Cilium CNI installed
Calico eBPF | eBPF | Same as Cilium; integrates with Calico networking | Calico CNI in eBPF mode
kube-router | IPVS + BGP | Combines kube-proxy (IPVS) + CNI + BGP in one binary | Replaces kube-proxy DaemonSet entirely
MetalLB (speaker) | BGP/L2 advertisement | Handles LoadBalancer type on bare metal (not a kube-proxy replacement) | Bare metal or non-cloud clusters

Production Best Practices

  1. Use IPVS mode for clusters with >1000 Services. iptables rule count grows as Services × endpoints, and rule traversal is O(n). IPVS uses hash tables (O(1)) and handles 100,000+ Services without degradation. Verify the kernel modules are loaded on all nodes before switching.
  2. Set strictARP: true in IPVS mode when using MetalLB or kube-vip. Without it, ARP requests for Service IPs may be answered by the wrong interface, breaking L2 advertisement.
  3. Tune the conntrack table size on IPVS nodes. IPVS still uses conntrack for NAT. Default max (65536) is exhausted quickly on busy nodes. Start with nf_conntrack_max = 1048576 and monitor node_nf_conntrack_entries.
  4. Use externalTrafficPolicy: Local for services where client IP matters (TLS SNI, rate limiting by IP, geo-routing). Accept that this requires the load balancer to health-check nodes and stop routing to nodes with no local endpoints.
  5. Consider replacing kube-proxy with Cilium's kube-proxy replacement for new clusters. eBPF-based Service implementation has measurably lower latency (no conntrack NAT for service traffic) and is simpler to operate — one less DaemonSet to manage.
  6. Monitor kubeproxy_sync_proxy_rules_duration_seconds p99. Values above 1s in iptables mode indicate the rule table is growing too large. Alert at 5s; act at 10s.
  7. Set minSyncPeriod to at least 1s. Without a minimum sync period, kube-proxy re-programs all rules for every single endpoint change event, causing CPU spikes during rolling deployments.
  8. Use Topology Aware Routing (service.kubernetes.io/topology-mode: auto) in multi-zone clusters to reduce cross-zone traffic costs, but verify the endpoint distribution is balanced enough for hints to be active.
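The effect of minSyncPeriod in item 7 can be modelled as simple event coalescing: a sync runs immediately if enough time has passed since the previous one, otherwise it is deferred and absorbs every change that arrives in the meantime. The sketch below is a simplified model under that assumption, not kube-proxy's actual scheduler; the function name and timestamps are hypothetical.

```python
def sync_times(event_times, min_sync_period):
    """Return the times at which a rule sync would run.

    event_times: timestamps (seconds) of endpoint or Service changes.
    min_sync_period: minimum spacing (seconds) between two syncs.
    """
    syncs = []
    for t in sorted(event_times):
        if not syncs:
            syncs.append(t)  # first change triggers a sync immediately
        elif t > syncs[-1]:
            # Run now if allowed, otherwise wait until the period has elapsed.
            syncs.append(max(t, syncs[-1] + min_sync_period))
        # else: the change is absorbed by a sync already scheduled at syncs[-1]
    return syncs


# A rolling deployment produces a burst of changes, then one late change.
events = [0, 0.1, 0.2, 0.3, 5.0]
print(sync_times(events, 1.0))  # [0, 1.0, 5.0]
print(sync_times(events, 0))    # [0, 0.1, 0.2, 0.3, 5.0]
```

With a 1s period the burst of four changes costs two syncs instead of four, which is the CPU saving the recommendation is after.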