Network Policies

Last updated: 2025

On this page

  1. NetworkPolicy Data Model
  2. CNI Enforcement Requirement
  3. Full Spec Reference
  4. Selectors: Pod, Namespace, IP Block
  5. Default-Deny Pattern
  6. Common Patterns
  7. Namespace Isolation
  8. Egress Control
  9. DNS Egress
  10. Cross-Namespace Access
  11. CNI Comparison
  12. Cilium L7 Policies
  13. Testing & Verification
  14. Anti-Patterns
  15. Metrics & Alerts
  16. Best Practices
Coverage checklist
  • NetworkPolicy allow-list model (additive, no deny rules)
  • Stateful connection tracking semantics
  • CNI enforcement requirement — no-op without policy-capable CNI
  • CNI comparison: Cilium, Calico, Weave, Flannel (no policy)
  • Full spec: podSelector, policyTypes, ingress, egress
  • Peer selectors: podSelector, namespaceSelector, ipBlock
  • AND logic: podSelector + namespaceSelector in same element
  • OR logic: multiple elements in from/to arrays
  • Ports: port, endPort (range), protocol
  • Default-deny all ingress + egress pattern
  • Default-allow all (empty policy)
  • Common patterns: web tier, microservice mesh, DB isolation
  • Namespace isolation: deny cross-namespace + allow same-ns
  • Egress control: API server access, external SaaS, S3
  • DNS egress: must allow UDP/TCP 53 before anything else works
  • Cross-namespace: labelSelector + namespaceSelector AND pattern
  • Monitoring namespace access to all pods
  • Cilium CiliumNetworkPolicy: L7 HTTP rules, DNS-based egress
  • Calico GlobalNetworkPolicy
  • Testing: netcat/curl verification pods, netshoot
  • Inspektor Gadget / Hubble for traffic visibility
  • Anti-patterns: missing DNS egress, overlapping policies confusion, empty podSelector on deny
  • hostNetwork pods bypass NetworkPolicy callout
  • 5 metrics, 4 alerts, 5 runbooks
  • 8 best practices

NetworkPolicy Data Model

NetworkPolicy is an additive allow-list system: there are no deny rules. Once at least one policy selects a pod for a direction (ingress or egress), traffic in that direction is allowed only if some policy permits it; everything else in that direction is denied. If no NetworkPolicy selects a pod, all traffic to and from that pod is allowed (open by default).

WITHOUT any NetworkPolicy:
  Pod A ←───────────────────── Pod B   ✅ allowed (no policy = all allowed)
  Pod A ─────────────────────► Internet ✅ allowed

WITH a NetworkPolicy selecting Pod A:
  The policy only applies to the direction(s) listed in policyTypes.
  Within a listed direction, any traffic NOT matched by an allow rule is DENIED for that pod.

  Example: policy selects pod-a, policyTypes: [Ingress]
    Ingress to pod-a: ONLY what the ingress rules allow (implicit deny for rest)
    Egress from pod-a: ALL allowed (no egress policy = unrestricted egress)

STATEFUL: NetworkPolicy is connection-tracking aware.
  If ingress from B→A is allowed, the return traffic A→B for that connection
  is automatically allowed — you do not need a separate egress rule for responses.
Open-by-default is commonly misunderstood. Many engineers assume that creating any NetworkPolicy puts the whole namespace into default-deny. It does not. A NetworkPolicy only affects the pods it selects; pods not selected by any policy remain completely open. To get default-deny, you must explicitly create a policy that selects all pods with an empty podSelector: {} and lists the direction in policyTypes with no rules (see Default-Deny Pattern).
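
The decision logic described above can be sketched as a small model. This is a simplified illustration, not how any CNI implements enforcement: the `Policy` class and the function names are hypothetical, only ingress and matchLabels are modelled, and namespaces are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical, simplified stand-in for a NetworkPolicy (ingress only)."""
    pod_selector: dict                                 # matchLabels; {} selects every pod
    ingress_from: list = field(default_factory=list)   # one matchLabels dict per allowed source


def matches(selector: dict, labels: dict) -> bool:
    """matchLabels semantics: every selector key/value must be present on the pod."""
    return all(labels.get(key) == value for key, value in selector.items())


def ingress_allowed(policies: list, dest_labels: dict, src_labels: dict) -> bool:
    applicable = [p for p in policies if matches(p.pod_selector, dest_labels)]
    if not applicable:
        return True  # no policy selects the destination pod: open by default
    # Additive: a single matching rule in any applicable policy is enough.
    return any(matches(rule, src_labels) for p in applicable for rule in p.ingress_from)


deny_all = Policy(pod_selector={})  # selects every pod, allows nothing
allow_frontend = Policy(pod_selector={"app": "api"}, ingress_from=[{"app": "frontend"}])
api, frontend, batch = {"app": "api"}, {"app": "frontend"}, {"app": "batch"}

print(ingress_allowed([], api, batch))                             # True
print(ingress_allowed([deny_all], api, frontend))                  # False
print(ingress_allowed([deny_all, allow_frontend], api, frontend))  # True
print(ingress_allowed([deny_all, allow_frontend], api, batch))     # False
```

The third call shows why there is no need for a deny rule: the default-deny policy contributes no allow rules, and the second policy adds exactly one.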

CNI Enforcement Requirement

The Kubernetes API accepts NetworkPolicy objects regardless of whether the cluster's CNI plugin can enforce them. If your CNI does not support NetworkPolicy, all policies are silently ignored — no error, no warning, no effect.

| CNI Plugin | NetworkPolicy Support | L7 Policies | Notes |
| --- | --- | --- | --- |
| Cilium | ✅ Full | ✅ CiliumNetworkPolicy (HTTP, DNS, Kafka, gRPC) | eBPF-based; best performance; Hubble observability; recommended |
| Calico | ✅ Full | Partial (via GlobalNetworkPolicy + application layer) | iptables or eBPF dataplane; GlobalNetworkPolicy for cluster-wide rules |
| Weave Net | ✅ Full | ❌ | L3/L4 only; less active development |
| Antrea | ✅ Full | Partial (AntreaNetworkPolicy) | OVS-based; VMware-backed; ClusterNetworkPolicy CRD |
| Flannel | ❌ None | ❌ | NetworkPolicy objects are accepted but completely ignored |
| kubenet | ❌ None | ❌ | Default in many managed K8s; no policy enforcement |
| AWS VPC CNI | ✅ (when enabled, or with Calico) | | Native NetworkPolicy support (VPC CNI v1.14 and later) must be enabled explicitly; older setups install Calico for enforcement on EKS |
| GKE Dataplane V2 | ✅ Full (Cilium-based) | Partial | Enabled with --enable-dataplane-v2 |

Verify your CNI enforces policies before relying on them. Run this test: create a default-deny policy, then verify that a pod that previously could connect to another pod can no longer do so. Do not assume enforcement; test it. On EKS, NetworkPolicy objects are no-ops until the VPC CNI's network policy support is enabled or a policy engine such as Calico is installed.

Full Spec Reference

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-server-policy
  namespace: production         # NetworkPolicy is namespace-scoped
spec:
  podSelector:                  # which pods this policy applies to
    matchLabels:
      app: api-server           # empty {} = all pods in the namespace
  policyTypes:
  - Ingress                     # this policy controls ingress traffic
  - Egress                      # this policy controls egress traffic
                                # if omitted: Ingress is always assumed; Egress is added only if an egress stanza is present
  ingress:
  - from:                       # list of allowed ingress sources (OR between elements)
    - podSelector:              # pods matching this selector IN THE SAME NAMESPACE
        matchLabels:
          app: frontend
    - namespaceSelector:        # all pods in namespaces matching this (separate OR element)
        matchLabels:
          kubernetes.io/metadata.name: monitoring
    ports:
    - protocol: TCP
      port: 8080
    - protocol: TCP
      port: 8443
      endPort: 8450             # port range [8443-8450], GA 1.25

  egress:
  - to:                         # allowed egress destinations
    - podSelector:
        matchLabels:
          app: postgres
    ports:
    - protocol: TCP
      port: 5432
  - to:                         # DNS must be explicitly allowed for egress policies
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
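
The ports entries in the spec above, including the endPort range, can be read as the following check. This is a minimal sketch with a hypothetical `port_matches` helper; it assumes numeric ports only (named ports are not modelled) and relies on the API default of TCP when protocol is omitted.

```python
def port_matches(rule: dict, protocol: str, port: int) -> bool:
    """rule: one entry of a NetworkPolicy ports list, as a plain dict."""
    if rule.get("protocol", "TCP") != protocol:  # protocol defaults to TCP
        return False
    low = rule["port"]
    high = rule.get("endPort", low)              # no endPort: a single port
    return low <= port <= high


ingress_ports = [
    {"protocol": "TCP", "port": 8080},
    {"protocol": "TCP", "port": 8443, "endPort": 8450},
]


def any_port_matches(rules: list, protocol: str, port: int) -> bool:
    return any(port_matches(rule, protocol, port) for rule in rules)


print(any_port_matches(ingress_ports, "TCP", 8445))  # True
print(any_port_matches(ingress_ports, "TCP", 8451))  # False
print(any_port_matches(ingress_ports, "UDP", 8080))  # False
```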

Selectors: Pod, Namespace, IP Block

Selector Combination Logic

The from and to arrays contain peer elements. Understanding the AND vs OR logic is critical to writing correct policies:

# OR logic: two separate elements in the array
from:
- podSelector:          # element 1: pods with app=frontend IN SAME NAMESPACE
    matchLabels:
      app: frontend
- namespaceSelector:    # element 2: ALL pods in namespaces with env=prod
    matchLabels:
      env: prod
# Result: traffic from (frontend pods in same ns) OR (any pod in env=prod namespaces)

# AND logic: both selectors in the SAME element
from:
- podSelector:          # SAME element: both conditions must be true
    matchLabels:
      app: frontend
  namespaceSelector:
    matchLabels:
      env: prod
# Result: traffic from (frontend pods AND in env=prod namespaces only)
# Critically: this does NOT allow frontend pods in the same namespace
# unless the same namespace also has the env=prod label.
AND vs OR is the most common NetworkPolicy bug. Using two separate elements creates OR logic. Combining both selectors in one element creates AND logic. Most "cross-namespace access" policies should use AND (same element) to mean "pods with label X that are also in namespace Y". Using OR means "pods with label X in the policy's own namespace OR all pods in namespace Y", which is usually not intended.
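
To make the AND/OR distinction concrete, here is a rough model of how a from array is evaluated. The function names and the dict layout are hypothetical simplifications (matchLabels only, no ipBlock); the point is that selectors inside one element are combined with AND, while elements are combined with OR.

```python
def match_labels(selector: dict, labels: dict) -> bool:
    return all(labels.get(key) == value for key, value in selector.items())


def peer_matches(peer, policy_ns, src_ns, src_ns_labels, src_pod_labels) -> bool:
    """peer: one element of from/to with 'podSelector' and/or 'namespaceSelector'."""
    ns_selector = peer.get("namespaceSelector")
    pod_selector = peer.get("podSelector")
    if ns_selector is None:
        ns_ok = src_ns == policy_ns  # podSelector alone: policy's own namespace only
    else:
        ns_ok = match_labels(ns_selector, src_ns_labels)
    pod_ok = True if pod_selector is None else match_labels(pod_selector, src_pod_labels)
    return ns_ok and pod_ok          # AND within one element


def from_allows(peers, **source) -> bool:
    return any(peer_matches(peer, **source) for peer in peers)  # OR across elements


or_peers = [{"podSelector": {"app": "frontend"}},
            {"namespaceSelector": {"env": "prod"}}]
and_peers = [{"podSelector": {"app": "frontend"},
              "namespaceSelector": {"env": "prod"}}]

# A batch pod in another namespace that happens to carry env=prod.
batch_in_prod_ns = dict(policy_ns="production", src_ns="shop",
                        src_ns_labels={"env": "prod"}, src_pod_labels={"app": "batch"})

print(from_allows(or_peers, **batch_in_prod_ns))   # True  (matched by element 2)
print(from_allows(and_peers, **batch_in_prod_ns))  # False (not a frontend pod)
```

The OR version admits the batch pod, which is the unintended over-permission this section warns about.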

podSelector

# Match all pods in the namespace
from:
- podSelector: {}

# Match pods with specific labels
from:
- podSelector:
    matchLabels:
      app: frontend
      tier: web

# Match with expression operator
from:
- podSelector:
    matchExpressions:
    - key: app
      operator: In
      values: ["frontend", "api-gateway"]

namespaceSelector

# Match by namespace name (kubernetes.io/metadata.name is auto-set on all namespaces)
from:
- namespaceSelector:
    matchLabels:
      kubernetes.io/metadata.name: monitoring

# Match by custom namespace label
from:
- namespaceSelector:
    matchLabels:
      team: platform

# Allow all pods in all namespaces (use with extreme care)
from:
- namespaceSelector: {}
kubernetes.io/metadata.name is automatically set on all namespaces. Since Kubernetes 1.21, the kubernetes.io/metadata.name label is automatically applied to every namespace with the namespace's own name as the value. Use this to write namespace-by-name selectors without adding custom labels. This is the recommended pattern for precise namespace matching.

ipBlock

# Allow ingress from specific external IP range, excluding a subnet
from:
- ipBlock:
    cidr: 203.0.113.0/24        # allow from this range
    except:
    - 203.0.113.50/32           # except this specific IP

# Allow egress to an external SaaS API
to:
- ipBlock:
    cidr: 52.84.0.0/14          # AWS CloudFront range for api.example.com
ipBlock does not work well with dynamic cloud IPs. Using ipBlock requires maintaining static CIDR lists. For external SaaS with dynamic IPs (AWS S3, external APIs), use Cilium's FQDN-based egress policy (DNS-based) instead of ipBlock — it resolves the DNS and creates dynamic IP rules automatically.
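
The cidr/except semantics of ipBlock can be checked offline with Python's standard ipaddress module. This is only an illustration of the matching rule; `ip_block_matches` is a hypothetical helper, not part of any Kubernetes client.

```python
import ipaddress


def ip_block_matches(ip: str, cidr: str, except_cidrs=()) -> bool:
    """True if ip is inside cidr and not inside any of the except ranges."""
    address = ipaddress.ip_address(ip)
    if address not in ipaddress.ip_network(cidr):
        return False
    return not any(address in ipaddress.ip_network(exc) for exc in except_cidrs)


print(ip_block_matches("203.0.113.10", "203.0.113.0/24", ["203.0.113.50/32"]))  # True
print(ip_block_matches("203.0.113.50", "203.0.113.0/24", ["203.0.113.50/32"]))  # False
print(ip_block_matches("198.51.100.1", "203.0.113.0/24", ["203.0.113.50/32"]))  # False
```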

Default-Deny Pattern

Default Deny All Ingress

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}        # selects ALL pods in the namespace
  policyTypes:
  - Ingress
  # no ingress stanza = deny all ingress
  # after this policy, pods receive NO ingress until explicitly allowed

Default Deny All Egress

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  # no egress stanza = deny all egress
  # IMPORTANT: DNS will also stop working — add DNS allow rule separately

Default Deny All (Both Directions)

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
  # This is the strictest baseline — all traffic denied until explicitly allowed
  # Apply this to every tenant namespace, then add targeted allow policies

Allow All (Explicitly Open)

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-ingress
  namespace: development        # for dev namespace, open access
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - {}                          # empty rule = allow all ingress sources

Common Patterns

Pattern 1: Three-Tier Web Application

Internet → [Ingress Controller] → [Frontend pods] → [API pods] → [Database pods]
                                       app=frontend      app=api      app=postgres
# 1. Default deny all
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: {name: default-deny-all, namespace: production}
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
---
# 2. Allow ingress from ingress controller to frontend
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: {name: allow-frontend-ingress, namespace: production}
spec:
  podSelector:
    matchLabels: {app: frontend}
  policyTypes: [Ingress]
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx
      podSelector:
        matchLabels:
          app.kubernetes.io/name: ingress-nginx
    ports:
    - {protocol: TCP, port: 3000}
---
# 3. Allow frontend → API
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: {name: allow-api-from-frontend, namespace: production}
spec:
  podSelector:
    matchLabels: {app: api}
  policyTypes: [Ingress]
  ingress:
  - from:
    - podSelector:
        matchLabels: {app: frontend}
    ports:
    - {protocol: TCP, port: 8080}
---
# 4. Allow API → Database
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: {name: allow-db-from-api, namespace: production}
spec:
  podSelector:
    matchLabels: {app: postgres}
  policyTypes: [Ingress]
  ingress:
  - from:
    - podSelector:
        matchLabels: {app: api}
    ports:
    - {protocol: TCP, port: 5432}
---
# 5. Allow API egress: DNS + database only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: {name: api-egress, namespace: production}
spec:
  podSelector:
    matchLabels: {app: api}
  policyTypes: [Egress]
  egress:
  - to:                          # DNS
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels: {k8s-app: kube-dns}
    ports:
    - {protocol: UDP, port: 53}
    - {protocol: TCP, port: 53}
  - to:                          # database
    - podSelector:
        matchLabels: {app: postgres}
    ports:
    - {protocol: TCP, port: 5432}

Pattern 2: Allow Prometheus Scraping

# Allow metrics scraping from monitoring namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape
  namespace: production
spec:
  podSelector: {}               # applies to all pods in production
  policyTypes: [Ingress]
  ingress:
  - from:
    - namespaceSelector:        # AND: must be in monitoring namespace
        matchLabels:
          kubernetes.io/metadata.name: monitoring
      podSelector:              # AND: must be prometheus pod
        matchLabels:
          app.kubernetes.io/name: prometheus
    ports:
    - protocol: TCP
      port: 9090                # or use named port if your pods use consistent naming
    - protocol: TCP
      port: 8080                # common metrics endpoint

Pattern 3: Allow kube-apiserver Access

# Operators and controllers that need to call the kube-apiserver
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-apiserver-egress
  namespace: my-operator
spec:
  podSelector:
    matchLabels:
      app: my-operator
  policyTypes: [Egress]
  egress:
  - ports:
    - protocol: TCP
      port: 443         # kube-apiserver HTTPS
    - protocol: TCP
      port: 6443        # alternative port (kubeadm default)
    # No 'to' selector = allow to all destinations on this port
    # For stricter: add ipBlock with control plane node IPs

Namespace Isolation

A common multi-tenant requirement is to allow all intra-namespace traffic while blocking cross-namespace traffic. This requires two policies working together:

# Policy 1: deny all ingress and egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: team-alpha
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
---
# Policy 2: allow intra-namespace communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: team-alpha
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
  ingress:
  - from:
    - podSelector: {}           # empty = all pods in same namespace (NOT cross-namespace)
  egress:
  - to:
    - podSelector: {}           # all pods in same namespace
  - to:                         # DNS always needed for egress
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - {protocol: UDP, port: 53}
    - {protocol: TCP, port: 53}
Empty podSelector {} in from/to means same namespace only. When you write from: [{podSelector: {}}], it matches all pods in the same namespace as the policy. It does NOT match pods in other namespaces. This is the correct pattern for intra-namespace allow-all without cross-namespace bleed.

Egress Control

Egress policies are critical for preventing data exfiltration and lateral movement after a container compromise. A compromised container with unrestricted egress can beacon to attacker infrastructure, exfiltrate data, scan the internal network, or call external command-and-control endpoints.

Egress to External APIs

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payment-service-egress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes: [Egress]
  egress:
  - to:                          # DNS
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels: {k8s-app: kube-dns}
    ports:
    - {protocol: UDP, port: 53}
    - {protocol: TCP, port: 53}
  - to:                          # Stripe API (illustrative IPs; confirm against the provider's published ranges)
    - ipBlock:
        cidr: 54.187.174.169/32
    - ipBlock:
        cidr: 54.187.205.235/32
    ports:
    - {protocol: TCP, port: 443}
  - to:                          # Internal database
    - podSelector:
        matchLabels: {app: postgres}
    ports:
    - {protocol: TCP, port: 5432}

DNS Egress — The Most Forgotten Rule

DNS is the most commonly forgotten rule when implementing egress NetworkPolicies. Without an explicit DNS allow rule, all name resolution fails immediately after applying an egress deny-all policy — causing every service discovery, external call, and inter-service communication to fail with "name resolution failure" rather than "connection refused".

Apply egress default-deny without DNS allow = immediate application failure. Pod DNS queries go to CoreDNS in kube-system on UDP/TCP port 53 (or the cluster DNS IP, which is kube-dns Service's ClusterIP). If your egress policy blocks this, all DNS lookups fail. The symptom is dial tcp: lookup <hostname>: no such host — which looks like a DNS misconfiguration, not a NetworkPolicy issue.
# Always include this DNS egress rule with any egress default-deny policy
# Apply this as a separate policy to all pods in the namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}                # all pods
  policyTypes: [Egress]
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns      # CoreDNS pod label in kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP              # DNS over TCP for large responses (DNSSEC, SRV records)
      port: 53

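# Alternative: allow egress to the ClusterIP of kube-dns Service
# (cluster DNS IP is typically 10.96.0.10; verify with: kubectl get svc kube-dns -n kube-system)
# Caveat: whether policy is evaluated before or after the Service IP is rewritten
# to a pod IP depends on the CNI, so an ipBlock on a ClusterIP may not match. Test it.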
# egress:
# - to:
#   - ipBlock:
#       cidr: 10.96.0.10/32
#   ports:
#   - {protocol: UDP, port: 53}
#   - {protocol: TCP, port: 53}
Verify CoreDNS pod labels in your cluster. kubectl get pods -n kube-system -l k8s-app=kube-dns — if this returns no pods, check the actual labels on your CoreDNS pods: kubectl get pods -n kube-system --show-labels | grep coredns. Some distributions label CoreDNS differently (e.g., app=coredns).
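
A missing DNS rule is easy to catch before a policy is applied. The sketch below is a hypothetical lint, assuming the policy has been loaded into a plain dict (for example from YAML or from kubectl output in JSON form). It only inspects ports and protocols and deliberately ignores the to selectors, so it cannot tell whether the rule actually points at the cluster DNS pods.

```python
DNS_PORT = 53


def dns_protocols_allowed(policy: dict) -> set:
    """Protocols on which some egress rule opens port 53 (destinations ignored)."""
    allowed = set()
    for rule in policy.get("spec", {}).get("egress") or []:
        ports = rule.get("ports")
        if not ports:                      # no ports listed: all ports, all protocols
            allowed |= {"UDP", "TCP"}
            continue
        for entry in ports:
            protocol = entry.get("protocol", "TCP")
            low = entry.get("port")
            if low is None:                # protocol only: every port of that protocol
                allowed.add(protocol)
            elif isinstance(low, int) and low <= DNS_PORT <= entry.get("endPort", low):
                allowed.add(protocol)      # named (string) ports are skipped
    return allowed


def missing_dns_protocols(policy: dict) -> set:
    return {"UDP", "TCP"} - dns_protocols_allowed(policy)


udp_only = {"spec": {"egress": [{"ports": [{"protocol": "UDP", "port": 53}]}]}}
both = {"spec": {"egress": [{"ports": [{"protocol": "UDP", "port": 53},
                                       {"protocol": "TCP", "port": 53}]}]}}

print(sorted(missing_dns_protocols(udp_only)))  # ['TCP']
print(sorted(missing_dns_protocols(both)))      # []
```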

Cross-Namespace Access Patterns

Allow Specific Namespace → All Pods

# Allow all pods in 'monitoring' namespace to scrape any pod in 'production'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-monitoring
  namespace: production
spec:
  podSelector: {}
  policyTypes: [Ingress]
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring

Allow Specific Pod in Specific Namespace (AND logic)

# Allow ONLY Prometheus pods IN the monitoring namespace
# NOT: any pod with app=prometheus anywhere, NOR: all pods in monitoring
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-only
  namespace: production
spec:
  podSelector: {}
  policyTypes: [Ingress]
  ingress:
  - from:
    - namespaceSelector:         # ← same element: AND logic
        matchLabels:
          kubernetes.io/metadata.name: monitoring
      podSelector:               # ← both must be true simultaneously
        matchLabels:
          app.kubernetes.io/name: prometheus

Istio / Service Mesh Sidecar Bypass

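NetworkPolicy and a service mesh operate at different layers: NetworkPolicy enforces L3/L4 (IP addresses and TCP/UDP ports), while Istio enforces identity and L7 rules inside Envoy. With Istio sidecars, the redirect to Envoy's inbound port 15006 happens inside the destination pod's network namespace, so the CNI generally still evaluates the original application port, and policies should keep allowing that port. Some mesh modes do change the port seen on the wire: in Istio ambient mode, for example, pod-to-pod traffic is tunneled over port 15008 (HBONE), which ingress policies must then allow. After enabling a mesh, re-test your policies rather than assuming the ports are unchanged.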

CNI Comparison for Network Policy

| Feature | Cilium | Calico | Antrea | Weave |
| --- | --- | --- | --- | --- |
| Standard NetworkPolicy | ✅ | ✅ | ✅ | ✅ |
| L7 HTTP policies | ✅ CiliumNetworkPolicy | Partial | Partial | ❌ |
| DNS-based egress (FQDN) | ✅ CiliumNetworkPolicy | ✅ GlobalNetworkPolicy (Calico Enterprise/Cloud) | | |
| Cluster-wide (non-namespace) policies | ✅ CiliumClusterwideNetworkPolicy | ✅ GlobalNetworkPolicy | ✅ ClusterNetworkPolicy | |
| Traffic visibility / flow logs | ✅ Hubble | ✅ Felix logs | ✅ Antrea Flow Exporter | Limited |
| Dataplane | eBPF | iptables or eBPF | OVS | VXLAN + iptables |
| Performance at scale | Excellent | Good | Good | Moderate |
| Node-to-node encryption | ✅ WireGuard | ✅ WireGuard / IPsec | ✅ IPsec | ✅ (sleeve mode) |

Cilium L7 Policies

Standard Kubernetes NetworkPolicy operates at L3/L4 only (IP addresses and ports). Cilium extends this with L7 policies that can allow or deny based on HTTP method/path, DNS hostname, Kafka topic, or gRPC method.

Cilium HTTP L7 Policy

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: api-http-policy
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api-server
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: GET
          path: /api/v1/.*      # regex — allow all GET on /api/v1/
        - method: POST
          path: /api/v1/orders  # allow POST only to this path
        # PUT, DELETE, etc. on other paths are denied

Cilium DNS-Based Egress (FQDN)

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-external-apis
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: payment-service
  egress:
  - toFQDNs:
    - matchName: "api.stripe.com"          # exact FQDN
    - matchPattern: "*.amazonaws.com"      # wildcard pattern
    toPorts:
    - ports:
      - port: "443"
        protocol: TCP
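  - toEndpoints:                           # still need DNS
    - matchLabels:
        k8s:io.kubernetes.pod.namespace: kube-system
        k8s-app: kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: ANY
      rules:
        dns:
        - matchPattern: "*"                # send lookups through Cilium's DNS proxy so toFQDNs can learn the IPs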

Calico GlobalNetworkPolicy

# Applies across all namespaces — useful for cluster-wide baseline rules
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: deny-all-except-dns
spec:
  selector: all()               # all endpoints in the cluster
  order: 1000                   # lower order = higher priority
  types:
  - Egress
  egress:
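  - action: Allow
    protocol: UDP
    destination:
      selector: k8s-app == "kube-dns"
      namespaceSelector: kubernetes.io/metadata.name == "kube-system"
      ports: [53]
  - action: Allow                 # DNS over TCP, needed for large responses
    protocol: TCP
    destination:
      selector: k8s-app == "kube-dns"
      namespaceSelector: kubernetes.io/metadata.name == "kube-system"
      ports: [53]
  - action: Deny                  # note: all() also covers system namespaces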

Testing & Verification

Verification Test Pods

# Launch a test pod to verify connectivity
kubectl run test-client --image=nicolaka/netshoot --rm -it \
  --labels="app=test-client" \
  -n production \
  -- bash

# Inside the pod, test connections:
# DNS resolution
nslookup postgres.production.svc.cluster.local

# TCP connectivity to a service
nc -zv postgres 5432              # should succeed if policy allows
nc -zv -w 3 some-blocked-service 8080  # should time out: policy drops are silent ("Connection refused" usually means nothing is listening)

# HTTP check
curl -sv http://api-server:8080/health

# Connectivity to external
curl -sv https://api.stripe.com   # should fail if egress policy is restrictive

# Create a curl pod with a specific label for policy testing
# (give it the label that the policies expect)
kubectl run policy-test \
  --image=curlimages/curl:latest \
  --labels="app=frontend" \
  --rm -it \
  -n production \
  -- sh

Hubble — Cilium's Traffic Visibility

# Install Hubble CLI
# https://docs.cilium.io/en/stable/observability/hubble/

# Watch all flows in a namespace
hubble observe --namespace production --follow

# Watch dropped flows (policy denies)
hubble observe --namespace production --verdict DROPPED --follow

# Watch flows between specific pods
hubble observe --namespace production \
  --from-pod production/frontend \
  --to-pod production/api-server \
  --follow

# Flow output shows: direction, source, destination, port, verdict (FORWARDED/DROPPED), policy

Inspektor Gadget — eBPF-based Debugging

# Trace network traffic with NetworkPolicy audit
kubectl gadget trace network -n production

# Trace DNS queries
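kubectl gadget trace dns -n production

# Note: the commands above use the built-in gadget syntax of older releases.
# Newer Inspektor Gadget releases run image-based gadgets instead
# (for example: kubectl gadget run trace_dns); check kubectl gadget --help.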

Policy Simulation (Calico)

# Calico policy testing: check if a connection would be allowed/denied
# (availability of this subcommand depends on your calicoctl build; confirm with calicoctl --help)
calicoctl policy check \
  --source-selector "app=frontend" \
  --dest-selector "app=api-server" \
  --dest-port 8080 \
  --protocol TCP

Anti-Patterns

| Anti-Pattern | Problem | Fix |
| --- | --- | --- |
| Egress deny without DNS allow | All name resolution fails silently; misleading error messages | Always add DNS egress allow rule first, before any other egress policy |
| Assuming no policy = deny | Pods without any NetworkPolicy selecting them are fully open | Always apply explicit default-deny-all policy to every namespace |
| Using OR when AND is intended for cross-namespace | Allows pods with the label in the policy's own namespace, plus all pods in the selected namespace | Put both podSelector and namespaceSelector in the same element (AND logic) |
| Missing policyTypes | If you specify only an ingress stanza without policyTypes, Kubernetes infers the Ingress policyType; explicitly listing policyTypes is clearer and safer | Always explicitly set policyTypes: [Ingress, Egress] on deny-all policies |
| Using ipBlock for cloud service endpoints | Cloud provider IPs change; policy breaks silently when IP changes | Use Cilium FQDN policies or Calico DNS-based policies (Calico Enterprise/Cloud) |
| Forgetting ephemeral containers / debug pods | kubectl debug pods bypass NetworkPolicy if they get a different label set | Admission policy: require all pods to have specific network labels; or namespace label audit |
| Policies without namespace scope in multi-tenant clusters | A team can create policies in their namespace that accidentally allow cross-namespace traffic from unexpected sources | Review all NetworkPolicy objects in tenant namespaces; use Gatekeeper to enforce constraints |

Pods using hostNetwork bypass NetworkPolicy entirely. A pod with spec.hostNetwork: true uses the node's network namespace and is not subject to NetworkPolicy rules. CNI plugins cannot intercept traffic at the host network level for NetworkPolicy enforcement. This is why hostNetwork should be restricted to privileged system pods only.
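
Because hostNetwork pods sit outside NetworkPolicy, it helps to know which ones exist. The following is a minimal sketch of such an audit; `host_network_pods` is a hypothetical helper and assumes the pod manifests have already been loaded as dicts (for instance from the items of kubectl get pods -A -o json).

```python
def host_network_pods(pods: list) -> list:
    """Return 'namespace/name' for every pod manifest with spec.hostNetwork: true."""
    return sorted(
        f'{pod["metadata"].get("namespace", "default")}/{pod["metadata"]["name"]}'
        for pod in pods
        if pod.get("spec", {}).get("hostNetwork") is True
    )


pods = [
    {"metadata": {"name": "kube-proxy-abc12", "namespace": "kube-system"},
     "spec": {"hostNetwork": True}},
    {"metadata": {"name": "api-7d9f", "namespace": "production"},
     "spec": {"containers": []}},
]

print(host_network_pods(pods))  # ['kube-system/kube-proxy-abc12']
```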

Metrics & Alerts

Key Metrics

| Metric | Source | What It Tells You |
| --- | --- | --- |
| cilium_drop_count_total{reason="Policy denied"} | Cilium | Number of packets dropped by NetworkPolicy; spikes indicate new policy violations or misconfigurations |
| cilium_forward_count_total | Cilium | Total forwarded packets; baseline for traffic volume |
| calico_felix_iptables_restore_errors_total | Calico Felix | iptables rule application failures: policy not being enforced |
| network_policy_controller_iptables_restore_errors_total | kube-proxy / CNI | Errors applying network policy rules to the dataplane |
| hubble_flows_processed_total{verdict="DROPPED"} | Hubble | Dropped flows by source/destination/namespace; use to identify policy gaps |

Alerts

groups:
- name: network-policy.rules
  rules:

  - alert: HighNetworkPolicyDropRate
    expr: |
      rate(cilium_drop_count_total{reason="Policy denied"}[5m]) > 100
    for: 2m
    annotations:
      summary: "High policy drop rate on {{ $labels.node }}"
      description: "NetworkPolicy is dropping >100 packets/sec — likely misconfiguration or intrusion attempt"
    labels:
      severity: warning

  - alert: DNSResolutionFailures
    expr: |
      rate(coredns_dns_responses_total{rcode="SERVFAIL"}[5m]) > 10
    for: 1m
    annotations:
      summary: "High DNS SERVFAIL rate — possible NetworkPolicy blocking DNS egress"
    labels:
      severity: warning

  - alert: NamespaceWithoutDefaultDenyPolicy
    # Placeholder: a valid rule needs an expr. Back it with a metric exported by
    # a periodic check or an OPA Gatekeeper audit constraint
    annotations:
      summary: "Namespace {{ $labels.namespace }} has no default-deny NetworkPolicy"
    labels:
      severity: warning

  - alert: UnexpectedEgressToInternet
    # Placeholder: a valid rule needs an expr. Back it with a Hubble/Falco signal
    # for egress flows to public IPs from unexpected pods
    annotations:
      summary: "Pod making unexpected external connections: {{ $labels.pod }}"
    labels:
      severity: high
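
The periodic check behind a "namespace without default-deny" alert could look roughly like this. It is a sketch under stated assumptions: the function names are hypothetical, and the inputs are a list of namespace names plus NetworkPolicy objects already loaded as dicts (for example the items of kubectl get networkpolicy -A -o json). Exporting the result as a metric is left out.

```python
def is_default_deny(policy: dict, direction: str) -> bool:
    """True if the policy selects all pods and lists `direction` with no rules."""
    spec = policy.get("spec", {})
    selector = spec.get("podSelector") or {}
    selects_all = not selector.get("matchLabels") and not selector.get("matchExpressions")
    types = spec.get("policyTypes")
    if not types:  # API default: Ingress always, Egress only if egress rules exist
        types = ["Ingress"] + (["Egress"] if spec.get("egress") else [])
    has_rules = bool(spec.get(direction.lower()))
    return selects_all and direction in types and not has_rules


def namespaces_missing_default_deny(namespaces, policies, direction="Ingress"):
    covered = {p["metadata"]["namespace"] for p in policies if is_default_deny(p, direction)}
    return sorted(set(namespaces) - covered)


policies = [
    {"metadata": {"name": "default-deny-all", "namespace": "production"},
     "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]}},
    {"metadata": {"name": "allow-all-ingress", "namespace": "development"},
     "spec": {"podSelector": {}, "policyTypes": ["Ingress"], "ingress": [{}]}},
]
namespaces = ["production", "development", "team-alpha"]

print(namespaces_missing_default_deny(namespaces, policies, "Ingress"))
# ['development', 'team-alpha']
```

Note that the allow-all policy in development is not counted: its single empty rule opens ingress rather than closing it.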

Runbooks

  1. Service suddenly unreachable after NetworkPolicy change: First check DNS: kubectl exec -it <pod> -- nslookup <service>. If DNS fails, check DNS egress rule. If DNS succeeds but connection fails, check ingress policy on destination: kubectl get networkpolicy -n <ns> and trace with Hubble: hubble observe --namespace <ns> --verdict DROPPED. Temporarily remove the new policy to confirm it's the cause.
  2. All pods in namespace suddenly lose connectivity: Check for a recently applied default-deny policy: kubectl get networkpolicy -n <ns> --sort-by=.metadata.creationTimestamp. If default-deny-all was applied without DNS allow, add the DNS allow policy immediately: apply the allow-dns-egress policy from this page.
  3. Cross-namespace access not working despite policy: Verify the AND vs OR selector logic. Use kubectl describe networkpolicy <name> -n <ns> to see the evaluated selectors. Test with a pod that has the exact labels specified. Verify the namespace label: kubectl get namespace <ns> --show-labels.
  4. NetworkPolicy objects exist but appear to have no effect: Verify the CNI plugin supports NetworkPolicy: kubectl get pods -n kube-system | grep -E "calico|cilium|weave|antrea". If Flannel or kubenet, policies are no-ops. Verify the CNI agent is running on all nodes. Check CNI agent logs for errors.
  5. Intermittent connection failures under load: Check for connection tracking table exhaustion (nf_conntrack overflow) on nodes; the symptom is connections failing randomly under load. Compare cat /proc/sys/net/netfilter/nf_conntrack_count against nf_conntrack_max. Increase nf_conntrack_max via node sysctl, or consider Cilium's eBPF dataplane, which tracks connections in its own eBPF maps rather than in the netfilter conntrack table.
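
For the conntrack check in runbook 5, the two counters can be turned into a utilization figure. This is a small sketch meant to run on a node; the function names and the 0.8 warning threshold are assumptions for illustration, not recommended values.

```python
from pathlib import Path

PROC_DIR = Path("/proc/sys/net/netfilter")
WARN_THRESHOLD = 0.8  # hypothetical threshold; tune for your environment


def read_conntrack(proc_dir: Path = PROC_DIR) -> tuple:
    """Read (count, max) from the netfilter sysctl files on a Linux node."""
    count = int((proc_dir / "nf_conntrack_count").read_text())
    maximum = int((proc_dir / "nf_conntrack_max").read_text())
    return count, maximum


def conntrack_utilization(count: int, maximum: int) -> float:
    if maximum <= 0:
        raise ValueError("nf_conntrack_max must be positive")
    return count / maximum


def conntrack_near_limit(count: int, maximum: int, threshold: float = WARN_THRESHOLD) -> bool:
    return conntrack_utilization(count, maximum) >= threshold


print(conntrack_utilization(65536, 262144))  # 0.25
print(conntrack_near_limit(250000, 262144))  # True
```

On a node, `conntrack_near_limit(*read_conntrack())` combines the two steps.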

Best Practices

  1. Start with default-deny-all in every tenant namespace, then allow explicitly. Apply the default-deny-all policy (both Ingress and Egress) to every namespace when it is created. Automate this with a Kyverno ClusterPolicy that generates NetworkPolicy resources upon namespace creation. This ensures namespaces are secure by default.
  2. Always add DNS allow before any egress deny policy. The DNS allow rule is the first egress rule that must exist before any other egress policy is applied. Package it as a standard policy that is automatically applied to all namespaces, separate from application-specific policies.
  3. Use the kubernetes.io/metadata.name label for namespace selectors. This label is automatically set on all namespaces and cannot be overridden by namespace owners (only kube-apiserver sets it). It provides reliable namespace-by-name matching without requiring custom label management.
  4. Document AND vs OR logic in policy comments. The single most common NetworkPolicy bug is confusing AND (same element) vs OR (separate elements) in from/to arrays. Add YAML comments to every policy explaining which logic is intended: # AND: must be prometheus pod AND in monitoring namespace.
  5. Use Cilium CiliumNetworkPolicy for L7 control where needed. Standard NetworkPolicy can't distinguish between GET /api/health (safe) and DELETE /api/users (dangerous) on the same port. For services handling sensitive operations, implement L7 HTTP policies via Cilium to enforce method/path restrictions at the network layer.
  6. Use FQDN-based egress policies instead of ipBlock for external services. Cloud provider IPs change without notice, and ipBlock policies silently break when they do. Cilium's toFQDNs or Calico's DNS-based policies (a Calico Enterprise/Cloud feature) update the allowed IPs as DNS responses change.
  7. Implement continuous NetworkPolicy testing in CI/CD. Add network connectivity tests to your deployment pipeline: after deploying new policies, verify that allowed connections succeed and denied connections fail. Use netshoot or a purpose-built connectivity check pod. Policy regressions are silent — a missing allow rule just breaks things without explaining why.
  8. Enable Hubble or CNI-level flow logs for baseline and anomaly detection. Without traffic visibility, you can't know if policies are working as intended. Enable Hubble (Cilium) or Calico flow logs. Store 7+ days of dropped flow data in your SIEM. Alert on unusual egress to internet from unexpected pods — this is your primary container exfiltration detection signal.