Network Policies
On this page
- NetworkPolicy Data Model
- CNI Enforcement Requirement
- Full Spec Reference
- Selectors: Pod, Namespace, IP Block
- Default-Deny Pattern
- Common Patterns
- Namespace Isolation
- Egress Control
- DNS Egress
- Cross-Namespace Access
- CNI Comparison
- Cilium L7 Policies
- Testing & Verification
- Anti-Patterns
- Metrics & Alerts
- Best Practices
Coverage checklist
- NetworkPolicy allow-list model (additive, no deny rules)
- Stateful connection tracking semantics
- CNI enforcement requirement — no-op without policy-capable CNI
- CNI comparison: Cilium, Calico, Weave, Flannel (no policy)
- Full spec: podSelector, policyTypes, ingress, egress
- Peer selectors: podSelector, namespaceSelector, ipBlock
- AND logic: podSelector + namespaceSelector in same element
- OR logic: multiple elements in from/to arrays
- Ports: port, endPort (range), protocol
- Default-deny all ingress + egress pattern
- Default-allow all (empty policy)
- Common patterns: web tier, microservice mesh, DB isolation
- Namespace isolation: deny cross-namespace + allow same-ns
- Egress control: API server access, external SaaS, S3
- DNS egress: must allow UDP/TCP 53 before anything else works
- Cross-namespace: labelSelector + namespaceSelector AND pattern
- Monitoring namespace access to all pods
- Cilium CiliumNetworkPolicy: L7 HTTP rules, DNS-based egress
- Calico GlobalNetworkPolicy
- Testing: netcat/curl verification pods, netshoot
- Inspektor Gadget / Hubble for traffic visibility
- Anti-patterns: missing DNS egress, overlapping policies confusion, empty podSelector on deny
- hostNetwork pods bypass NetworkPolicy callout
- 5 metrics, 4 alerts, 5 runbooks
- 8 best practices
NetworkPolicy Data Model
NetworkPolicy is an additive allow-list system: there are no deny rules. Once at least one policy selects a pod for a direction (ingress or egress), traffic in that direction is allowed only if some policy's rule permits it; everything else in that direction is denied. If no NetworkPolicy selects a pod, all traffic to and from that pod is allowed (open by default).
WITHOUT any NetworkPolicy:
Pod A ←───────────────────── Pod B ✅ allowed (no policy = all allowed)
Pod A ─────────────────────► Internet ✅ allowed
WITH a NetworkPolicy selecting Pod A:
The policy only applies to the direction(s) listed in policyTypes.
Within a listed direction, any traffic NOT matched by a policy rule is DENIED for that pod.
Example: policy selects pod-a, policyTypes: [Ingress]
Ingress to pod-a: ONLY what the ingress rules allow (implicit deny for rest)
Egress from pod-a: ALL allowed (no egress policy = unrestricted egress)
STATEFUL: NetworkPolicy is connection-tracking aware.
If ingress from B→A is allowed, the return traffic A→B for that connection
is automatically allowed — you do not need a separate egress rule for responses.
To deny all traffic in a direction, create a policy that uses podSelector: {} and specifies the direction with no rules (see Default-Deny Pattern).
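The evaluation model above can be sketched in a few lines of Python. This is a simplified illustration, not the real API: the `Policy` class, `selects`, and `is_allowed` are hypothetical names, peers are reduced to plain strings, and namespaces and ports are ignored.

```python
from dataclasses import dataclass, field

@dataclass
class Policy:
    """Simplified stand-in for a NetworkPolicy (hypothetical model)."""
    pod_selector: dict      # matchLabels only; {} selects every pod
    policy_types: list      # subset of ["Ingress", "Egress"]
    allowed_peers: dict = field(default_factory=dict)  # direction -> set of peer names

def selects(policy, pod_labels):
    # An empty selector matches all pods; otherwise every key/value must match.
    return all(pod_labels.get(k) == v for k, v in policy.pod_selector.items())

def is_allowed(pod_labels, direction, peer, policies):
    applicable = [p for p in policies
                  if selects(p, pod_labels) and direction in p.policy_types]
    if not applicable:
        return True  # no policy covers this direction: open by default
    # Additive: one permitting policy is enough; a "deny" cannot be expressed.
    return any(peer in p.allowed_peers.get(direction, set()) for p in applicable)

policies = [Policy({"app": "pod-a"}, ["Ingress"], {"Ingress": {"pod-b"}})]
pod_a = {"app": "pod-a"}
print(is_allowed(pod_a, "Ingress", "pod-b", policies))    # True
print(is_allowed(pod_a, "Ingress", "pod-c", policies))    # False
print(is_allowed(pod_a, "Egress", "internet", policies))  # True
```

The third call returns True because the only policy lists Ingress, so egress from pod-a stays unrestricted.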
CNI Enforcement Requirement
The Kubernetes API accepts NetworkPolicy objects regardless of whether the cluster's CNI plugin can enforce them. If your CNI does not support NetworkPolicy, all policies are silently ignored — no error, no warning, no effect.
| CNI Plugin | NetworkPolicy Support | L7 Policies | Notes |
|---|---|---|---|
| Cilium | ✅ Full | ✅ CiliumNetworkPolicy (HTTP, DNS, Kafka, gRPC) | eBPF-based; best performance; Hubble observability; recommended |
| Calico | ✅ Full | Partial (via GlobalNetworkPolicy + application layer) | iptables or eBPF dataplane; GlobalNetworkPolicy for cluster-wide rules |
| Weave Net | ✅ Full | ❌ | L3/L4 only; less active development |
| Antrea | ✅ Full | Partial (AntreaNetworkPolicy) | OVS-based; VMware-backed; ClusterNetworkPolicy CRD |
| Flannel | ❌ None | ❌ | NetworkPolicy objects are accepted but completely ignored |
| kubenet | ❌ None | ❌ | Default in many managed K8s; no policy enforcement |
| AWS VPC CNI | ✅ (native since v1.14; older versions via Calico) | ❌ | Native enforcement must be enabled in the VPC CNI add-on; older versions require installing Calico for NetworkPolicy on EKS |
| GKE Dataplane V2 | ✅ Full (Cilium-based) | Partial | Enabled with --enable-dataplane-v2 |
Full Spec Reference
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-server-policy
  namespace: production # NetworkPolicy is namespace-scoped
spec:
  podSelector: # which pods this policy applies to
    matchLabels:
      app: api-server # empty {} = all pods in the namespace
  policyTypes:
    - Ingress # this policy controls ingress traffic
    - Egress # this policy controls egress traffic
    # if omitted: inferred from presence of ingress/egress stanzas
  ingress:
    - from: # list of allowed ingress sources (OR between elements)
        - podSelector: # pods matching this selector IN THE SAME NAMESPACE
            matchLabels:
              app: frontend
        - namespaceSelector: # all pods in namespaces matching this (separate OR element)
            matchLabels:
              kubernetes.io/metadata.name: monitoring
      ports:
        - protocol: TCP
          port: 8080
        - protocol: TCP
          port: 8443
          endPort: 8450 # port range [8443-8450], GA 1.25
  egress:
    - to: # allowed egress destinations
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    - to: # DNS must be explicitly allowed for egress policies
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
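The ports entries in the spec, including the endPort range, can be read as a simple predicate. The sketch below is an illustration only: `port_rule_matches` is a hypothetical helper, and named ports (string values for port) are not modeled.

```python
def port_rule_matches(rule, protocol, port):
    """Check one entry of a NetworkPolicy `ports` list against a connection."""
    # The API defaults protocol to TCP when it is omitted.
    if rule.get("protocol", "TCP") != protocol:
        return False
    # An entry without `port` matches every port of that protocol.
    if "port" not in rule:
        return True
    start = rule["port"]
    end = rule.get("endPort", start)  # endPort turns a single port into a range
    return start <= port <= end

range_rule = {"protocol": "TCP", "port": 8443, "endPort": 8450}
print(port_rule_matches(range_rule, "TCP", 8445))  # True
print(port_rule_matches(range_rule, "TCP", 8451))  # False
print(port_rule_matches(range_rule, "UDP", 8445))  # False
```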
Selectors: Pod, Namespace, IP Block
Selector Combination Logic
The from and to arrays contain peer elements. Understanding the AND vs OR logic is critical to writing correct policies:
OR logic — two separate elements in the array:
from:
  - podSelector: # element 1: pods with app=frontend IN SAME NAMESPACE
      matchLabels:
        app: frontend
  - namespaceSelector: # element 2: ALL pods in namespaces with env=prod
      matchLabels:
        env: prod
Result: traffic from (frontend pods in same ns) OR (any pod in env=prod namespaces)
AND logic — both selectors in the SAME element:
from:
  - podSelector: # SAME element: both conditions must be true
      matchLabels:
        app: frontend
    namespaceSelector:
      matchLabels:
        env: prod
Result: traffic from (frontend pods AND in env=prod namespaces only)
Critically: this does NOT allow frontend pods in the same namespace
unless the same namespace also has env=prod label.
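The AND/OR rules above can be checked with a small Python model. This is a sketch under simplifying assumptions: selectors are plain matchLabels dictionaries, ipBlock peers are left out, and the function names and the "shop" namespace are hypothetical.

```python
def labels_match(selector, labels):
    """matchLabels-style check; an empty selector matches anything."""
    return all(labels.get(k) == v for k, v in selector.items())

def peer_matches(peer, src_pod_labels, src_ns_labels, src_ns, policy_ns):
    """One element of a from/to array: selectors inside it are ANDed."""
    has_pod = "podSelector" in peer
    has_ns = "namespaceSelector" in peer
    if has_ns:
        if not labels_match(peer["namespaceSelector"], src_ns_labels):
            return False
    elif src_ns != policy_ns:
        return False  # podSelector alone only reaches the policy's own namespace
    if has_pod and not labels_match(peer["podSelector"], src_pod_labels):
        return False
    return has_pod or has_ns

def from_matches(peers, *src):
    """Elements of the array are ORed."""
    return any(peer_matches(p, *src) for p in peers)

or_rule = [{"podSelector": {"app": "frontend"}},
           {"namespaceSelector": {"env": "prod"}}]
and_rule = [{"podSelector": {"app": "frontend"},
             "namespaceSelector": {"env": "prod"}}]

# A backend pod in another namespace that carries env=prod
backend = ({"app": "backend"}, {"env": "prod"}, "shop", "production")
print(from_matches(or_rule, *backend))   # True  (element 2 matches)
print(from_matches(and_rule, *backend))  # False (pod label does not match)

# A frontend pod in the policy's own namespace, which has no env label
local_frontend = ({"app": "frontend"}, {}, "production", "production")
print(from_matches(or_rule, *local_frontend))   # True  (element 1 matches)
print(from_matches(and_rule, *local_frontend))  # False (namespace lacks env=prod)
```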
podSelector
# Match all pods in the namespace
from:
  - podSelector: {}
# Match pods with specific labels
from:
  - podSelector:
      matchLabels:
        app: frontend
        tier: web
# Match with expression operator
from:
  - podSelector:
      matchExpressions:
        - key: app
          operator: In
          values: ["frontend", "api-gateway"]
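For readers unsure how matchLabels and matchExpressions combine, the following Python sketch mirrors the label-selector semantics: all requirements are ANDed, and the four operators are In, NotIn, Exists, and DoesNotExist. The function name `selector_matches` is a hypothetical helper, not part of any client library.

```python
def selector_matches(selector, labels):
    """Evaluate a label selector dict against a pod's labels."""
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In":
            ok = key in labels and labels[key] in values
        elif op == "NotIn":
            ok = labels.get(key) not in values  # an absent key also satisfies NotIn
        elif op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True  # an empty selector matches every pod

sel = {"matchExpressions": [
    {"key": "app", "operator": "In", "values": ["frontend", "api-gateway"]}]}
print(selector_matches(sel, {"app": "api-gateway"}))  # True
print(selector_matches(sel, {"app": "postgres"}))     # False
print(selector_matches({}, {"app": "postgres"}))      # True
```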
namespaceSelector
# Match by namespace name (kubernetes.io/metadata.name is auto-set on all namespaces)
from:
  - namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: monitoring
# Match by custom namespace label
from:
  - namespaceSelector:
      matchLabels:
        team: platform
# Allow all pods in all namespaces (use with extreme care)
from:
  - namespaceSelector: {}
The kubernetes.io/metadata.name label is automatically applied to every namespace, with the namespace's own name as the value. Use it to write namespace-by-name selectors without adding custom labels. This is the recommended pattern for precise namespace matching.
ipBlock
# Allow ingress from specific external IP range, excluding a subnet
from:
  - ipBlock:
      cidr: 203.0.113.0/24 # allow from this range
      except:
        - 203.0.113.50/32 # except this specific IP
# Allow egress to an external SaaS API
to:
  - ipBlock:
      cidr: 52.84.0.0/14 # AWS CloudFront range for api.example.com
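The cidr/except semantics of ipBlock can be reproduced with Python's standard ipaddress module. This is an illustrative sketch; `ip_block_matches` is a hypothetical helper and says nothing about how a particular CNI sees source addresses after NAT.

```python
import ipaddress

def ip_block_matches(ip_block, address):
    """True if address falls inside cidr and outside every except range."""
    addr = ipaddress.ip_address(address)
    if addr not in ipaddress.ip_network(ip_block["cidr"]):
        return False
    return not any(addr in ipaddress.ip_network(excluded)
                   for excluded in ip_block.get("except", []))

block = {"cidr": "203.0.113.0/24", "except": ["203.0.113.50/32"]}
print(ip_block_matches(block, "203.0.113.10"))  # True
print(ip_block_matches(block, "203.0.113.50"))  # False
print(ip_block_matches(block, "198.51.100.1"))  # False
```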
Default-Deny Pattern
Default Deny All Ingress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {} # selects ALL pods in the namespace
  policyTypes:
    - Ingress
  # no ingress stanza = deny all ingress
  # after this policy, pods receive NO ingress until explicitly allowed
Default Deny All Egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
  # no egress stanza = deny all egress
  # IMPORTANT: DNS will also stop working, so add a DNS allow rule separately
Default Deny All (Both Directions)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  # This is the strictest baseline: all traffic denied until explicitly allowed
  # Apply this to every tenant namespace, then add targeted allow policies
Allow All (Explicitly Open)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-ingress
  namespace: development # for dev namespace, open access
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - {} # empty rule = allow all ingress sources
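The difference between "no ingress rules" (deny all) and "one empty rule" (allow all) is easy to get wrong. The Python sketch below classifies a policy spec along those lines; `classify_ingress` and its return labels are hypothetical names chosen for illustration.

```python
def classify_ingress(spec):
    """Summarize what a single policy spec does to ingress for the pods it selects."""
    # When policyTypes is omitted, Ingress is always assumed.
    if "Ingress" not in spec.get("policyTypes", ["Ingress"]):
        return "unrestricted"  # this policy says nothing about ingress
    rules = spec.get("ingress") or []
    if not rules:
        return "deny-all"      # direction listed, no rules
    # A rule with neither `from` nor `ports` matches every source and port.
    if any(not r.get("from") and not r.get("ports") for r in rules):
        return "allow-all"
    return "selective"

default_deny = {"podSelector": {}, "policyTypes": ["Ingress"]}
allow_all = {"podSelector": {}, "policyTypes": ["Ingress"], "ingress": [{}]}
egress_only = {"podSelector": {}, "policyTypes": ["Egress"]}

print(classify_ingress(default_deny))  # deny-all
print(classify_ingress(allow_all))     # allow-all
print(classify_ingress(egress_only))   # unrestricted
```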
Common Patterns
Pattern 1: Three-Tier Web Application
Internet → [Ingress Controller] → [Frontend pods] → [API pods] → [Database pods]
                                  app=frontend      app=api      app=postgres
# 1. Default deny all
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: {name: default-deny-all, namespace: production}
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
---
# 2. Allow ingress from ingress controller to frontend
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: {name: allow-frontend-ingress, namespace: production}
spec:
  podSelector:
    matchLabels: {app: frontend}
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector: # AND: ingress-nginx namespace and ingress-nginx pod
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
          podSelector:
            matchLabels:
              app.kubernetes.io/name: ingress-nginx
      ports:
        - {protocol: TCP, port: 3000}
---
# 3. Allow frontend → API
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: {name: allow-api-from-frontend, namespace: production}
spec:
  podSelector:
    matchLabels: {app: api}
  policyTypes: [Ingress]
  ingress:
    - from:
        - podSelector:
            matchLabels: {app: frontend}
      ports:
        - {protocol: TCP, port: 8080}
---
# 4. Allow API → Database
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: {name: allow-db-from-api, namespace: production}
spec:
  podSelector:
    matchLabels: {app: postgres}
  policyTypes: [Ingress]
  ingress:
    - from:
        - podSelector:
            matchLabels: {app: api}
      ports:
        - {protocol: TCP, port: 5432}
---
# 5. Allow API egress: DNS + database only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: {name: api-egress, namespace: production}
spec:
  podSelector:
    matchLabels: {app: api}
  policyTypes: [Egress]
  egress:
    - to: # DNS
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels: {k8s-app: kube-dns}
      ports:
        - {protocol: UDP, port: 53}
        - {protocol: TCP, port: 53}
    - to: # database
        - podSelector:
            matchLabels: {app: postgres}
      ports:
        - {protocol: TCP, port: 5432}
---
# 6. Allow frontend egress: DNS + API only
# (needed because policy 1 also denies egress; without this the frontend cannot open connections to the API)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: {name: frontend-egress, namespace: production}
spec:
  podSelector:
    matchLabels: {app: frontend}
  policyTypes: [Egress]
  egress:
    - to: # DNS
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels: {k8s-app: kube-dns}
      ports:
        - {protocol: UDP, port: 53}
        - {protocol: TCP, port: 53}
    - to: # API
        - podSelector:
            matchLabels: {app: api}
      ports:
        - {protocol: TCP, port: 8080}
Pattern 2: Allow Prometheus Scraping
# Allow metrics scraping from monitoring namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape
  namespace: production
spec:
  podSelector: {} # applies to all pods in production
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector: # AND: must be in monitoring namespace
            matchLabels:
              kubernetes.io/metadata.name: monitoring
          podSelector: # AND: must be prometheus pod
            matchLabels:
              app.kubernetes.io/name: prometheus
      ports:
        - protocol: TCP
          port: 9090 # or use named port if your pods use consistent naming
        - protocol: TCP
          port: 8080 # common metrics endpoint
Pattern 3: Allow kube-apiserver Access
# Operators and controllers that need to call the kube-apiserver
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-apiserver-egress
  namespace: my-operator
spec:
  podSelector:
    matchLabels:
      app: my-operator
  policyTypes: [Egress]
  egress:
    - ports:
        - protocol: TCP
          port: 443 # kube-apiserver HTTPS
        - protocol: TCP
          port: 6443 # alternative port (kubeadm default)
      # No 'to' selector = allow to all destinations on this port
      # For stricter: add ipBlock with control plane node IPs
Namespace Isolation
A common multi-tenant requirement is to allow all intra-namespace traffic while blocking cross-namespace traffic. This requires two policies working together:
# Policy 1: deny all ingress and egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: team-alpha
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
---
# Policy 2: allow intra-namespace communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: team-alpha
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - podSelector: {} # empty = all pods in same namespace (NOT cross-namespace)
  egress:
    - to:
        - podSelector: {} # all pods in same namespace
    - to: # DNS always needed for egress
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - {protocol: UDP, port: 53}
        - {protocol: TCP, port: 53}
When a peer consists only of an empty podSelector, as in from: [{podSelector: {}}], it matches all pods in the same namespace as the policy. It does NOT match pods in other namespaces. This is the correct pattern for intra-namespace allow-all without cross-namespace bleed.
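When the same two policies are needed in many tenant namespaces, generating them is less error-prone than copying YAML by hand. The Python sketch below emits the isolation pair as a JSON List, which kubectl can apply like any other manifest. The helper names and the "team-beta" namespace are hypothetical; the policy contents mirror the two policies above.

```python
import json

DNS_RULE = {
    "to": [{
        "namespaceSelector": {"matchLabels": {"kubernetes.io/metadata.name": "kube-system"}},
        "podSelector": {"matchLabels": {"k8s-app": "kube-dns"}},
    }],
    "ports": [{"protocol": "UDP", "port": 53}, {"protocol": "TCP", "port": 53}],
}

def network_policy(name, namespace, spec):
    return {"apiVersion": "networking.k8s.io/v1", "kind": "NetworkPolicy",
            "metadata": {"name": name, "namespace": namespace}, "spec": spec}

def isolation_policies(namespace):
    """Default-deny plus allow-same-namespace (with DNS egress) for one namespace."""
    both = ["Ingress", "Egress"]
    deny_all = network_policy("default-deny-all", namespace,
                              {"podSelector": {}, "policyTypes": both})
    allow_same = network_policy("allow-same-namespace", namespace, {
        "podSelector": {}, "policyTypes": both,
        "ingress": [{"from": [{"podSelector": {}}]}],
        "egress": [{"to": [{"podSelector": {}}]}, DNS_RULE],
    })
    return [deny_all, allow_same]

def manifest(namespaces):
    items = [p for ns in namespaces for p in isolation_policies(ns)]
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2)

print(manifest(["team-alpha", "team-beta"]))
```

The output can be written to a file and applied with kubectl apply -f, or produced by whatever automation creates the namespaces.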
Egress Control
Egress policies are critical for preventing data exfiltration and lateral movement after a container compromise. A compromised container with unrestricted egress can beacon to attacker infrastructure, exfiltrate data, scan the internal network, or call external command-and-control endpoints.
Egress to External APIs
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payment-service-egress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes: [Egress]
  egress:
    - to: # DNS
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels: {k8s-app: kube-dns}
      ports:
        - {protocol: UDP, port: 53}
        - {protocol: TCP, port: 53}
    - to: # Stripe API (static IP range)
        - ipBlock:
            cidr: 54.187.174.169/32
        - ipBlock:
            cidr: 54.187.205.235/32
      ports:
        - {protocol: TCP, port: 443}
    - to: # Internal database
        - podSelector:
            matchLabels: {app: postgres}
      ports:
        - {protocol: TCP, port: 5432}
DNS Egress — The Most Forgotten Rule
DNS is the most commonly forgotten rule when implementing egress NetworkPolicies. Without an explicit DNS allow rule, all name resolution fails immediately after applying an egress deny-all policy — causing every service discovery, external call, and inter-service communication to fail with "name resolution failure" rather than "connection refused".
The typical symptom is an application error such as dial tcp: lookup <hostname>: no such host, which looks like a DNS misconfiguration, not a NetworkPolicy issue.
# Always include this DNS egress rule with any egress default-deny policy
# Apply this as a separate policy to all pods in the namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {} # all pods
  policyTypes: [Egress]
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns # CoreDNS pod label in kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP # DNS over TCP for large responses (DNSSEC, SRV records)
          port: 53
# Alternative: allow egress to the ClusterIP of kube-dns Service
# (cluster DNS IP is typically 10.96.0.10; verify with: kubectl get svc kube-dns -n kube-system)
# Caveat: behavior varies by CNI. Many enforce policy after Service address translation,
# in which case the DNS pod IPs are matched rather than the ClusterIP. Test before relying on this.
# egress:
#   - to:
#       - ipBlock:
#           cidr: 10.96.0.10/32
#     ports:
#       - {protocol: UDP, port: 53}
#       - {protocol: TCP, port: 53}
Verify the DNS pod label before relying on the selector: kubectl get pods -n kube-system -l k8s-app=kube-dns. If this returns no pods, check the actual labels on your CoreDNS pods: kubectl get pods -n kube-system --show-labels | grep coredns. Some distributions label CoreDNS differently (e.g., app=coredns).
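A missing DNS rule is easy to catch mechanically before policies are applied. The Python sketch below is a coarse, namespace-level lint over policy objects loaded as dictionaries: it flags namespaces that restrict egress but never open port 53 on both UDP and TCP. It is a heuristic only (it ignores podSelector and the `to` peers), and the function names are hypothetical.

```python
def opened_ports(policy):
    """Collect (protocol, port) pairs opened by any egress rule of the policy."""
    opened = set()
    for rule in policy["spec"].get("egress") or []:
        for p in rule.get("ports") or []:
            opened.add((p.get("protocol", "TCP"), p.get("port")))
    return opened

def allows_everything(policy):
    # An egress rule with neither `to` nor `ports` allows all destinations.
    return any(not r.get("to") and not r.get("ports")
               for r in policy["spec"].get("egress") or [])

def namespaces_missing_dns(policies):
    restricted, covered = set(), set()
    for policy in policies:
        if "Egress" not in policy["spec"].get("policyTypes", []):
            continue
        ns = policy["metadata"]["namespace"]
        restricted.add(ns)
        if allows_everything(policy) or {("UDP", 53), ("TCP", 53)} <= opened_ports(policy):
            covered.add(ns)
    return sorted(restricted - covered)

deny = {"metadata": {"name": "default-deny-egress", "namespace": "production"},
        "spec": {"podSelector": {}, "policyTypes": ["Egress"]}}
dns = {"metadata": {"name": "allow-dns-egress", "namespace": "production"},
       "spec": {"podSelector": {}, "policyTypes": ["Egress"],
                "egress": [{"ports": [{"protocol": "UDP", "port": 53},
                                      {"protocol": "TCP", "port": 53}]}]}}

print(namespaces_missing_dns([deny]))       # ['production']
print(namespaces_missing_dns([deny, dns]))  # []
```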
Cross-Namespace Access Patterns
Allow Specific Namespace → All Pods
# Allow all pods in 'monitoring' namespace to scrape any pod in 'production'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-monitoring
  namespace: production
spec:
  podSelector: {}
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
Allow Specific Pod in Specific Namespace (AND logic)
# Allow ONLY Prometheus pods IN the monitoring namespace
# NOT: any pod with app=prometheus anywhere, NOR: all pods in monitoring
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-only
  namespace: production
spec:
  podSelector: {}
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector: # same element: AND logic
            matchLabels:
              kubernetes.io/metadata.name: monitoring
          podSelector: # both must be true simultaneously
            matchLabels:
              app.kubernetes.io/name: prometheus
Istio / Service Mesh Sidecar Bypass
CNI Comparison for Network Policy
| Feature | Cilium | Calico | Antrea | Weave |
|---|---|---|---|---|
| Standard NetworkPolicy | ✅ | ✅ | ✅ | ✅ |
| L7 HTTP policies | ✅ CiliumNetworkPolicy | Partial (application layer policy) | Partial | ❌ |
| DNS-based egress (FQDN) | ✅ CiliumNetworkPolicy | ✅ GlobalNetworkPolicy (Calico Enterprise/Cloud) | ❌ | ❌ |
| Cluster-wide (non-namespace) policies | ✅ CiliumClusterwideNetworkPolicy | ✅ GlobalNetworkPolicy | ✅ ClusterNetworkPolicy | ❌ |
| Traffic visibility / flow logs | ✅ Hubble | ✅ Felix logs | ✅ Antrea Flow Exporter | Limited |
| Dataplane | eBPF | iptables or eBPF | OVS | VXLAN + iptables |
| Performance at scale | Excellent | Good | Good | Moderate |
| Node-to-node encryption | ✅ WireGuard | ✅ WireGuard / IPsec | ✅ IPsec | ✅ (sleeve mode) |
Cilium L7 Policies
Standard Kubernetes NetworkPolicy operates at L3/L4 only (IP addresses and ports). Cilium extends this with L7 policies that can allow or deny based on HTTP method/path, DNS hostname, Kafka topic, or gRPC method.
Cilium HTTP L7 Policy
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: api-http-policy
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api-server
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: /api/v1/.* # regex: allow all GET on /api/v1/
              - method: POST
                path: /api/v1/orders # allow POST only to this path
              # PUT, DELETE, etc. on other paths are denied
Cilium DNS-Based Egress (FQDN)
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-external-apis
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: payment-service
  egress:
    - toFQDNs:
        - matchName: "api.stripe.com" # exact FQDN
        - matchPattern: "*.amazonaws.com" # wildcard pattern
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
    - toEndpoints: # still need DNS
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchPattern: "*" # send lookups through Cilium's DNS proxy so toFQDNs can learn the resolved IPs
Calico GlobalNetworkPolicy
# Applies across all namespaces; useful for cluster-wide baseline rules
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: deny-all-except-dns
spec:
  selector: all() # all endpoints in the cluster
  order: 1000 # lower order = higher priority
  types:
    - Egress
  egress:
    - action: Allow
      protocol: UDP
      destination:
        selector: k8s-app == "kube-dns"
        namespaceSelector: kubernetes.io/metadata.name == "kube-system"
        ports: [53]
    - action: Deny
Testing & Verification
Verification Test Pods
# Launch a test pod to verify connectivity
kubectl run test-client --image=nicolaka/netshoot --rm -it \
--labels="app=test-client" \
-n production \
-- bash
# Inside the pod, test connections:
# DNS resolution
nslookup postgres.production.svc.cluster.local
# TCP connectivity to a service
nc -zv -w 3 postgres 5432 # should succeed if policy allows
nc -zv -w 3 some-blocked-service 8080 # should fail, typically with a timeout (policy drops are silent, so "Connection refused" is unusual)
# HTTP check
curl -sv http://api-server:8080/health
# Connectivity to external
curl -sv https://api.stripe.com # should fail if egress policy is restrictive
# Create a curl pod with a specific label for policy testing
# (give it the label that the policies under test expect)
kubectl run policy-test \
  --image=curlimages/curl:latest \
  --labels="app=frontend" \
  --rm -it \
  -n production \
  -- sh
Hubble — Cilium's Traffic Visibility
# Install Hubble CLI
# https://docs.cilium.io/en/stable/observability/hubble/
# Watch all flows in a namespace
hubble observe --namespace production --follow
# Watch dropped flows (policy denies)
hubble observe --namespace production --verdict DROPPED --follow
# Watch flows between specific pods
hubble observe --namespace production \
--from-pod production/frontend \
--to-pod production/api-server \
--follow
# Flow output shows: direction, source, destination, port, verdict (FORWARDED/DROPPED), policy
Inspektor Gadget — eBPF-based Debugging
# Trace network traffic with NetworkPolicy audit
kubectl gadget trace network -n production
# Trace DNS queries
kubectl gadget trace dns -n production
Policy Simulation (Calico)
# Calico policy testing — check if a connection would be allowed/denied
calicoctl policy check \
--source-selector "app=frontend" \
--dest-selector "app=api-server" \
--dest-port 8080 \
--protocol TCP
Anti-Patterns
| Anti-Pattern | Problem | Fix |
|---|---|---|
| Egress deny without DNS allow | All name resolution fails silently; misleading error messages | Always add DNS egress allow rule first, before any other egress policy |
| Assuming no policy = deny | Pods without any NetworkPolicy selecting them are fully open | Always apply explicit default-deny-all policy to every namespace |
| Using OR when AND is intended for cross-namespace | Allows any pod with the label from any namespace, or all pods in the namespace | Put both podSelector and namespaceSelector in the same element (AND logic) |
| Missing policyTypes | If you specify only ingress stanza without policyTypes, Kubernetes infers Ingress policyType; but explicitly listing policyTypes is clearer and safer | Always explicitly set policyTypes: [Ingress, Egress] on deny-all policies |
| Using ipBlock for cloud service endpoints | Cloud provider IPs change; policy breaks silently when IP changes | Use Cilium FQDN policies or Calico DNS-based policies |
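| Forgetting ephemeral containers / debug pods | Ephemeral containers share the target pod's network namespace and inherit its policies, but a copy created with kubectl debug --copy-to is a new pod that escapes label-based policies if it gets a different label set | Admission policy: require all pods to have specific network labels; or namespace label audit |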
| Policies without namespace scope in multi-tenant clusters | A team can create policies in their namespace that accidentally allow cross-namespace traffic from unexpected sources | Review all NetworkPolicy objects in tenant namespaces; use Gatekeeper to enforce constraints |
hostNetwork pods bypass NetworkPolicy: a pod with spec.hostNetwork: true uses the node's network namespace and is not subject to NetworkPolicy rules. CNI plugins cannot intercept traffic at the host network level for NetworkPolicy enforcement. This is why hostNetwork should be restricted to privileged system pods only.
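The "Missing policyTypes" row in the table above is easier to reason about with the defaulting rule written out. The Python sketch below shows how the effective policyTypes are derived when the field is omitted: Ingress is always assumed, and Egress is added only if egress rules are present. `effective_policy_types` is a hypothetical helper name.

```python
def effective_policy_types(spec):
    """Return the policyTypes the API server will act on for this spec."""
    if spec.get("policyTypes"):
        return list(spec["policyTypes"])
    types = ["Ingress"]      # always assumed when policyTypes is omitted
    if spec.get("egress"):   # Egress only if at least one egress rule exists
        types.append("Egress")
    return types

# Intended as "deny all egress", but policyTypes was forgotten:
forgotten = {"podSelector": {}}
print(effective_policy_types(forgotten))  # ['Ingress']

explicit = {"podSelector": {}, "policyTypes": ["Egress"]}
print(effective_policy_types(explicit))   # ['Egress']
```

The first spec ends up denying ingress and leaving egress open, the opposite of what was intended, which is why deny-all policies should always list policyTypes explicitly.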
Metrics & Alerts
Key Metrics
| Metric | Source | What It Tells You |
|---|---|---|
| cilium_drop_count_total{reason="Policy denied"} | Cilium | Number of packets dropped by NetworkPolicy; spikes indicate new policy violations or misconfigurations |
| cilium_forward_count_total | Cilium | Total forwarded packets; baseline for traffic volume |
| calico_felix_iptables_restore_errors_total | Calico Felix | iptables rule application failures: policy not being enforced |
| network_policy_controller_iptables_restore_errors_total | kube-proxy / CNI | Errors applying network policy rules to the dataplane |
| hubble_flows_processed_total{verdict="DROPPED"} | Hubble | Dropped flows by source/destination/namespace; use to identify policy gaps |
Alerts
groups:
  - name: network-policy.rules
    rules:
      - alert: HighNetworkPolicyDropRate
        expr: |
          rate(cilium_drop_count_total{reason="Policy denied"}[5m]) > 100
        for: 2m
        annotations:
          summary: "High policy drop rate on {{ $labels.node }}"
          description: "NetworkPolicy is dropping >100 packets/sec: likely misconfiguration or intrusion attempt"
        labels:
          severity: warning
      - alert: DNSResolutionFailures
        expr: |
          rate(coredns_dns_responses_total{rcode="SERVFAIL"}[5m]) > 10
        for: 1m
        annotations:
          summary: "High DNS SERVFAIL rate; possible NetworkPolicy blocking DNS egress"
        labels:
          severity: warning
      - alert: NamespaceWithoutDefaultDenyPolicy
        # Implement via periodic check or OPA Gatekeeper audit constraint
        annotations:
          summary: "Namespace {{ $labels.namespace }} has no default-deny NetworkPolicy"
        labels:
          severity: warning
      - alert: UnexpectedEgressToInternet
        # Implement via Hubble/Falco: egress flows to public IPs from unexpected pods
        annotations:
          summary: "Pod making unexpected external connections: {{ $labels.pod }}"
        labels:
          severity: high
Runbooks
- Service suddenly unreachable after NetworkPolicy change: First check DNS: kubectl exec -it <pod> -- nslookup <service>. If DNS fails, check the DNS egress rule. If DNS succeeds but the connection fails, check the ingress policy on the destination with kubectl get networkpolicy -n <ns>, and trace with Hubble: hubble observe --namespace <ns> --verdict DROPPED. Temporarily remove the new policy to confirm it's the cause.
- All pods in namespace suddenly lose connectivity: Check for a recently applied default-deny policy: kubectl get networkpolicy -n <ns> --sort-by=.metadata.creationTimestamp. If default-deny-all was applied without DNS allow, add the DNS allow policy immediately: apply the allow-dns-egress policy from this page.
- Cross-namespace access not working despite policy: Verify the AND vs OR selector logic. Use kubectl describe networkpolicy <name> -n <ns> to see the evaluated selectors. Test with a pod that has the exact labels specified. Verify the namespace label: kubectl get namespace <ns> --show-labels.
- NetworkPolicy objects exist but appear to have no effect: Verify the CNI plugin supports NetworkPolicy: kubectl get pods -n kube-system | grep -E "calico|cilium|weave|antrea". If the cluster runs Flannel or kubenet, policies are no-ops. Verify the CNI agent is running on all nodes. Check CNI agent logs for errors.
- Intermittent connection failures under load: Check for connection tracking table exhaustion (nf_conntrack overflow) on nodes; the symptom is connections failing randomly under load. Compare cat /proc/sys/net/netfilter/nf_conntrack_count against nf_conntrack_max. Increase nf_conntrack_max via node sysctl, or switch to Cilium's eBPF dataplane, which keeps its own connection-tracking maps instead of relying on netfilter conntrack.
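The conntrack comparison in the last runbook comes down to one ratio. The Python sketch below reads the two counters from /proc on a node and reports how full the table is. The 0.8 threshold is an assumption chosen for illustration, not a recommended value, and the function names are hypothetical.

```python
from pathlib import Path

COUNT_PATH = "/proc/sys/net/netfilter/nf_conntrack_count"
MAX_PATH = "/proc/sys/net/netfilter/nf_conntrack_max"

def read_int(path):
    return int(Path(path).read_text().strip())

def conntrack_usage(count_path=COUNT_PATH, max_path=MAX_PATH):
    """Fraction of the conntrack table currently in use (0.0 to 1.0)."""
    return read_int(count_path) / read_int(max_path)

def is_near_exhaustion(usage, threshold=0.8):
    # threshold is an assumed alerting level; tune it for your nodes
    return usage >= threshold

if __name__ == "__main__" and Path(COUNT_PATH).exists():
    usage = conntrack_usage()
    print(f"conntrack usage: {usage:.1%}, near exhaustion: {is_near_exhaustion(usage)}")
```

Run it on the node itself (or in a privileged debug pod with host /proc access), since the counters are per-node.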
Best Practices
- Start with default-deny-all in every tenant namespace, then allow explicitly. Apply the default-deny-all policy (both Ingress and Egress) to every namespace when it is created. Automate this with a Kyverno ClusterPolicy that generates NetworkPolicy resources upon namespace creation. This ensures namespaces are secure by default.
- Always add DNS allow before any egress deny policy. The DNS allow rule is the first egress rule that must exist before any other egress policy is applied. Package it as a standard policy that is automatically applied to all namespaces, separate from application-specific policies.
- Use the kubernetes.io/metadata.name label for namespace selectors. This label is automatically set on all namespaces and cannot be overridden by namespace owners (only kube-apiserver sets it). It provides reliable namespace-by-name matching without requiring custom label management.
- Document AND vs OR logic in policy comments. The single most common NetworkPolicy bug is confusing AND (same element) vs OR (separate elements) in from/to arrays. Add YAML comments to every policy explaining which logic is intended, for example: # AND: must be prometheus pod AND in monitoring namespace.
- Use Cilium CiliumNetworkPolicy for L7 control where needed. Standard NetworkPolicy can't distinguish between GET /api/health (safe) and DELETE /api/users (dangerous) on the same port. For services handling sensitive operations, implement L7 HTTP policies via Cilium to enforce method/path restrictions at the network layer.
- Use FQDN-based egress policies instead of ipBlock for external services. Cloud provider IPs change without notice. ipBlock policies silently break when IPs change. Cilium's toFQDNs or Calico's DNS-based policies dynamically resolve and update IP rules as DNS responses change.
- Implement continuous NetworkPolicy testing in CI/CD. Add network connectivity tests to your deployment pipeline: after deploying new policies, verify that allowed connections succeed and denied connections fail. Use netshoot or a purpose-built connectivity check pod. Policy regressions are silent: a missing allow rule just breaks things without explaining why.
- Enable Hubble or CNI-level flow logs for baseline and anomaly detection. Without traffic visibility, you can't know if policies are working as intended. Enable Hubble (Cilium) or Calico flow logs. Store 7+ days of dropped flow data in your SIEM. Alert on unusual egress to the internet from unexpected pods; this is your primary container exfiltration detection signal.
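The CI/CD testing practice above can start as a very small script run from a test pod that carries the labels under test. The Python sketch below checks a list of expected-reachable and expected-blocked endpoints and returns the mismatches. The host names come from the verification examples on this page; the function names, the 3-second timeout, and the injectable probe are assumptions made for illustration.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """True if a TCP connection succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refusals, and DNS failures
        return False

def run_checks(checks, probe=can_connect):
    """checks: list of (host, port, expected_reachable). Returns mismatches."""
    failures = []
    for host, port, expected in checks:
        actual = probe(host, port)
        if actual != expected:
            failures.append((host, port, expected, actual))
    return failures

checks = [
    ("postgres", 5432, True),               # policy should allow this
    ("some-blocked-service", 8080, False),  # policy should block this
]

# Demonstration with a fake probe so the example runs outside a cluster.
reachable = {("postgres", 5432)}
fake_probe = lambda host, port: (host, port) in reachable
print(run_checks(checks, probe=fake_probe))  # []
```

In a pipeline, call run_checks(checks) with the default probe inside the test pod and fail the job when the returned list is non-empty.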