Admission Controllers
Coverage checklist
- Admission pipeline: AuthN → AuthZ → Mutating → Validation → Validating → etcd
- Mutating runs before validating; mutating can run twice (reinvocation)
- Built-in plugins: complete table of key plugins
- MutatingAdmissionWebhook configuration full spec
- ValidatingAdmissionWebhook configuration full spec
- rules: apiGroups, apiVersions, resources, scope
- namespaceSelector for webhook bypass
- objectSelector for fine-grained matching
- failurePolicy: Fail vs Ignore trade-offs
- sideEffects: None, NoneOnDryRun, Some, Unknown
- reinvocationPolicy: Never vs IfNeeded
- timeoutSeconds (max 30s)
- matchPolicy: Exact vs Equivalent
- OPA/Gatekeeper: ConstraintTemplate, Constraint, audit
- Gatekeeper audit mode: violations without enforcement
- Gatekeeper mutation with Assign/AssignMetadata
- Kyverno: ClusterPolicy validate/mutate/generate rules
- Kyverno background scanning
- Kyverno image verification (cosign)
- ValidatingAdmissionPolicy (CEL): GA 1.30
- VAP: expression, messageExpression, paramKind
- Common policies: image registry, no latest tag, required labels, resource limits, no privileged
- Testing: dry-run, kwok, admission-webhook-tester
- Webhook performance: latency budget, caching, HA deployment
- 5 metrics, 4 alerts, 5 runbooks, 8 best practices
Admission Pipeline
The admission pipeline is the final gate before a Kubernetes object is persisted to etcd. It runs after authentication and authorization have succeeded. The pipeline has two phases, mutating then validating. Mutating webhooks are called serially, one after another, so each one sees the changes made by the webhooks before it; validating webhooks are called in parallel because they cannot modify the object.
Request
  │
  ▼
Authentication (X.509 / OIDC / SA token)
  │
  ▼
Authorization (RBAC / Node / Webhook)
  │
  ▼
┌─────────────────────────────────────────────┐
│ MUTATING ADMISSION                          │
│ (webhooks called serially, in order)        │
│                                             │
│ Built-in mutating plugins:                  │
│   ServiceAccount token injector             │
│   LimitRanger (apply defaults)              │
│                                             │
│ External mutating webhooks:                 │
│   Istio sidecar injector                    │
│   Vault Agent injector                      │
│   Custom mutating webhooks                  │
└─────────────────────────────────────────────┘
  │
  ▼  (reinvocation: mutating webhooks may run again if the object was modified)
  │
  ▼
Object Schema Validation (OpenAPI v3 schema check)
  │
  ▼
┌─────────────────────────────────────────────┐
│ VALIDATING ADMISSION                        │
│ (webhooks called in parallel)               │
│                                             │
│ Built-in validating plugins:                │
│   PodSecurity (enforce check)               │
│   ResourceQuota                             │
│   NamespaceLifecycle                        │
│   ValidatingAdmissionPolicy (CEL)           │
│                                             │
│ External validating webhooks:               │
│   OPA/Gatekeeper                            │
│   Kyverno (validate rules)                  │
│   Custom validating webhooks                │
└─────────────────────────────────────────────┘
  │
  ▼
Persist to etcd
Built-in Admission Plugins
Built-in plugins are compiled into kube-apiserver and enabled via --enable-admission-plugins. The following are on by default or critical to enable:
| Plugin | Phase | Purpose | Default |
|---|---|---|---|
| NamespaceLifecycle | Validating | Reject creates in terminating namespaces; protect system namespaces from deletion | ✅ On |
| LimitRanger | Mutating + Validating | Apply LimitRange defaults to pods; reject pods exceeding limits | ✅ On |
| ServiceAccount | Mutating | Auto-inject default SA, imagePullSecrets, and token volume into pods | ✅ On |
| NodeRestriction | Validating | Restrict kubelet to only modify its own Node/Pod objects; block taint/label manipulation | ❌ Off in the kube-apiserver defaults; enabled by kubeadm (recommended) |
| PodSecurity | Validating | Enforce Pod Security Standards (privileged/baseline/restricted) per namespace | ✅ On (default since 1.23; GA in 1.25) |
| ResourceQuota | Validating | Enforce namespace ResourceQuota; reject objects that would exceed quotas | ✅ On |
| MutatingAdmissionWebhook | Mutating | Call registered MutatingWebhookConfiguration endpoints | ✅ On |
| ValidatingAdmissionWebhook | Validating | Call registered ValidatingWebhookConfiguration endpoints | ✅ On |
| ValidatingAdmissionPolicy | Validating | CEL-based in-process validation (GA 1.30); no external webhook needed | ✅ On (1.30+) |
| DefaultStorageClass | Mutating | Set the default StorageClass on PVCs that do not request one | ✅ On |
| DefaultTolerationSeconds | Mutating | Add default 300s toleration for node not-ready/unreachable taints | ✅ On |
| Priority | Mutating + Validating | Set pod priority from PriorityClass; validate priority values | ✅ On |
| StorageObjectInUseProtection | Validating | Prevent deletion of PVCs/PVs that are in use | ✅ On |
| AlwaysPullImages | Mutating | Force imagePullPolicy: Always on all pods; prevents use of locally cached images in multi-tenant clusters | ❌ Off (opt-in) |
| ImagePolicyWebhook | Validating | Delegate image policy to external webhook; alternative to OPA/Kyverno for image policy | ❌ Off (opt-in) |
| EventRateLimit | Validating | Rate-limit event creation; prevents event storms from overwhelming etcd | ❌ Off (opt-in, recommended) |
# Show the admission-plugin flags and their compiled-in defaults (from the help text)
kubectl exec -n kube-system kube-apiserver-controlplane -- \
  kube-apiserver -h 2>&1 | grep "admission-plugins"
# Check which admission flags are set on a running cluster (static-pod control plane; the pod name varies)
# The jsonpath output is a JSON array, so split it on commas
kubectl get pod kube-apiserver-controlplane -n kube-system \
  -o jsonpath='{.spec.containers[0].command}' | tr ',' '\n' \
  | grep "admission"
Webhook Configuration
Webhooks are registered via MutatingWebhookConfiguration and ValidatingWebhookConfiguration cluster-scoped resources. The API server calls the registered endpoint for matching operations.
MutatingWebhookConfiguration
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: pod-mutator
  annotations:
    cert-manager.io/inject-ca-from: webhook-system/webhook-cert  # cert-manager injects CA bundle
webhooks:
  - name: pod-mutator.example.com        # unique name; must be FQDN format
    admissionReviewVersions: ["v1", "v1beta1"]
    clientConfig:
      service:
        name: pod-mutator
        namespace: webhook-system
        path: /mutate-pods
        port: 443
      # caBundle:                         # OR inject with cert-manager annotation
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
        operations: ["CREATE", "UPDATE"]  # operations that trigger the webhook
        scope: "Namespaced"               # Namespaced, Cluster, or * (both)
    namespaceSelector:                    # only call webhook for pods in matching namespaces
      matchExpressions:
        - key: webhook-injection
          operator: In
          values: ["enabled"]
    objectSelector:                       # only call for objects matching this selector
      matchExpressions:
        - key: app.kubernetes.io/managed-by
          operator: NotIn
          values: ["helm"]                # skip Helm-managed pods
    failurePolicy: Fail                   # Fail or Ignore (see below)
    sideEffects: None                     # None or NoneOnDryRun (v1 rejects Some and Unknown)
    timeoutSeconds: 10                    # max 30; default 10
    reinvocationPolicy: IfNeeded          # Never or IfNeeded (re-run if object changed)
    matchPolicy: Equivalent               # Exact or Equivalent (handle API version aliases)
ValidatingWebhookConfiguration
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: policy-validator
webhooks:
  - name: pod-validator.example.com
    admissionReviewVersions: ["v1"]
    clientConfig:
      service:
        name: policy-validator
        namespace: policy-system
        path: /validate-pods
        port: 443
    rules:
      - apiGroups: ["", "apps"]
        apiVersions: ["v1"]
        resources: ["pods", "deployments", "statefulsets", "daemonsets"]
        operations: ["CREATE", "UPDATE"]
    namespaceSelector:
      matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values: ["kube-system", "kube-public", "cert-manager", "istio-system"]
    failurePolicy: Fail
    sideEffects: None
    timeoutSeconds: 15
AdmissionReview Request/Response
// Webhook receives AdmissionReview request:
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "request": {
    "uid": "705ab4f5-6393-11e8-b7cc-42010a800002",
    "kind": {"group": "", "version": "v1", "kind": "Pod"},
    "resource": {"group": "", "version": "v1", "resource": "pods"},
    "namespace": "production",
    "operation": "CREATE",
    "userInfo": {
      "username": "system:serviceaccount:ci:deployer",
      "groups": ["system:serviceaccounts", "system:authenticated"]
    },
    "object": { /* full pod spec */ },
    "oldObject": null,
    "dryRun": false,
    "options": { /* admission options */ }
  }
}
// Webhook must respond with AdmissionReview (the comments are illustrative; real payloads are plain JSON):
// Allow:
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "705ab4f5-6393-11e8-b7cc-42010a800002",
    "allowed": true,
    // For mutating: include patch:
    "patchType": "JSONPatch",
    "patch": "W3sib3AiOiAiYWRkIiwgInBhdGgiOiAiL21ldGFkYXRhL2xhYmVscy9pbmplY3RlZCIsICJ2YWx1ZSI6ICJ0cnVlIn1d"
    // base64(JSON patch array)
  }
}
// Deny (apiVersion and kind are required here as well; uid must echo request.uid):
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "...",
    "allowed": false,
    "status": {
      "code": 403,
      "message": "Image from unauthorized registry: docker.io/nginx:latest"
    }
  }
}
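The response shapes above can be assembled programmatically. The following is a minimal sketch in Python using only the standard library; the function name `build_response` and its parameters are hypothetical and not part of any Kubernetes client library. It echoes the request `uid`, base64-encodes a JSON Patch for the allow case, and attaches a `status` for the deny case.

```python
import base64
import json


def build_response(review: dict, allowed: bool, patch_ops=None, message: str = "") -> dict:
    """Build an admission.k8s.io/v1 AdmissionReview response for a request (hypothetical helper)."""
    # The response must echo the uid of the request it answers.
    response = {"uid": review["request"]["uid"], "allowed": allowed}
    if allowed and patch_ops:
        # The patch is a JSON Patch (RFC 6902) array, serialized and base64-encoded.
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch_ops).encode()).decode()
    if not allowed:
        response["status"] = {"code": 403, "message": message}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


review = {"request": {"uid": "705ab4f5-6393-11e8-b7cc-42010a800002", "operation": "CREATE"}}
ops = [{"op": "add", "path": "/metadata/labels/injected", "value": "true"}]

allow = build_response(review, True, patch_ops=ops)
deny = build_response(review, False, message="Image from unauthorized registry")
print(json.dumps(allow, indent=2))
print(json.dumps(deny, indent=2))
```

A real webhook would wrap this in an HTTPS handler and read `review` from the request body; that part is omitted here.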
failurePolicy, sideEffects, reinvocationPolicy
failurePolicy
| Value | Behavior on Webhook Error | When to Use |
|---|---|---|
| Fail | API server rejects the request; object is NOT created/updated | Security-critical policies (image signing, required labels). Default recommendation for validating webhooks, and the default in admissionregistration.k8s.io/v1. |
| Ignore | Webhook error is silently ignored; object proceeds as if webhook was not called | Mutations that are convenient but not security-critical (e.g., adding optional labels). Or during webhook rollout/testing. |
Warning: with failurePolicy: Ignore, if the webhook pod crashes, all pods can be created without the security check. Security webhooks must use failurePolicy: Fail. The tradeoff is that a crashed webhook blocks all matching API calls, which is why webhook reliability and an HA deployment are critical.
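To make the trade-off concrete, here is a small Python model of how the two failurePolicy values resolve a webhook call. It is a sketch of the decision table only, not API server code; the function name and the outcome strings are made up for illustration.

```python
from typing import Optional


def admission_outcome(failure_policy: str, verdict: Optional[bool]) -> str:
    """Model the result of one webhook call.

    verdict is True/False when the webhook answered, or None when the call
    itself failed (timeout, crashed pod, TLS error).
    """
    if verdict is None:
        if failure_policy == "Fail":
            return "rejected"            # fail closed: the request is blocked
        if failure_policy == "Ignore":
            return "admitted-unchecked"  # fail open: the policy was not evaluated
        raise ValueError(f"unknown failurePolicy: {failure_policy}")
    return "admitted" if verdict else "rejected"


for policy in ("Fail", "Ignore"):
    print(policy, "-> webhook down:", admission_outcome(policy, None))
```

The only difference between the two policies is the `verdict is None` branch, which is exactly the case that occurs during a webhook outage.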
sideEffects
| Value | Meaning | Dry-Run Behavior |
|---|---|---|
| None | Webhook has no side effects; safe to call on dry-run requests | Called on dry-run |
| NoneOnDryRun | Has side effects on real requests but suppresses them when the request has dryRun: true | Called on dry-run |
| Some | Has side effects (accepted only in v1beta1; rejected by admissionregistration.k8s.io/v1) | NOT called; a matching dry-run request fails |
| Unknown | Unknown side effects (v1beta1 default; rejected by v1) | NOT called; a matching dry-run request fails |
sideEffects: None allows the webhook to run during kubectl apply --dry-run=server, which enables users to preview whether their object would be admitted without actually creating it. Webhooks registered with Some or Unknown are never called for dry-run requests; a matching dry-run request fails instead, which makes server-side dry-run unusable for the resources they match.
reinvocationPolicy
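Mutating webhooks may need to run more than once if another plugin modifies the object after they have run. Setting reinvocationPolicy: IfNeeded lets the API server call the webhook again when a later mutating plugin or webhook changes the object after the webhook's initial call. A webhook that opts in must be idempotent, because it can see an object it has already mutated.
| Value | Behavior |
|---|---|
| Never | Webhook is called at most once per admission evaluation; default |
| IfNeeded | Webhook may be called again if the object is modified by another admission plugin after its initial call; the number of extra invocations is not guaranteed |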
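Because a reinvoked webhook can receive an object it has already patched, the mutation logic should produce no patch the second time. The sketch below shows one way to do that in Python for a label injection; `label_patch` and the label key `example.com/injected` are hypothetical names chosen for the example.

```python
def label_patch(pod: dict, key: str, value: str) -> list:
    """Return JSON Patch ops that set a label, or [] if it is already set (idempotent)."""
    labels = pod.get("metadata", {}).get("labels")
    if labels is None:
        # The labels map does not exist yet, so create it in one operation.
        return [{"op": "add", "path": "/metadata/labels", "value": {key: value}}]
    if labels.get(key) == value:
        return []  # already applied: a reinvocation changes nothing
    # JSON Pointer escaping (RFC 6901): "~" becomes "~0", then "/" becomes "~1".
    escaped = key.replace("~", "~0").replace("/", "~1")
    return [{"op": "add", "path": f"/metadata/labels/{escaped}", "value": value}]


pod = {"metadata": {"name": "demo"}}
first = label_patch(pod, "example.com/injected", "true")
pod["metadata"]["labels"] = {"example.com/injected": "true"}  # effect of applying `first`
second = label_patch(pod, "example.com/injected", "true")
print(first)
print(second)
```

The second call returns an empty patch list, which is the property that makes IfNeeded safe to enable.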
timeoutSeconds and matchPolicy
timeoutSeconds: 10       # max 30; contributes directly to API call latency
                         # webhook timeout + network latency must fit within the client's request timeout
                         # keep webhook response time <5s; set timeout to 2x P99 latency
matchPolicy: Equivalent  # Equivalent (default): also match requests made through other API groups/versions of the same resource
                         # e.g., a rule for apps/v1 deployments also intercepts a request sent to apps/v1beta1
                         # Exact: only the exact apiGroup/apiVersion/resource specified
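The "2x P99" rule of thumb from the comments above can be turned into a small calculation. This Python sketch assumes you have a list of observed webhook latencies in seconds (for example, exported from your metrics system); the function name, the nearest-rank percentile method, and the default multiplier are illustrative choices. The result is clamped to the 1 to 30 second integer range that timeoutSeconds accepts.

```python
import math


def suggest_timeout_seconds(latencies_s, multiplier=2.0, floor=1, ceiling=30):
    """Suggest a timeoutSeconds value as multiplier x P99, clamped to [floor, ceiling]."""
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_s)
    # Nearest-rank P99: the smallest sample with at least 99% of samples at or below it.
    rank = math.ceil(0.99 * len(ordered))
    p99 = ordered[rank - 1]
    return max(floor, min(ceiling, math.ceil(multiplier * p99)))


samples = [0.2] * 95 + [4.0] * 5   # hypothetical latency samples, in seconds
print(suggest_timeout_seconds(samples))
```

With these hypothetical samples the P99 is 4.0s, so the suggestion is 8. A webhook whose P99 already approaches 15s would be clamped to the 30s maximum and is better fixed than given a longer timeout.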
Namespace & Object Selectors
namespaceSelector — Which namespaces the webhook applies to
# Pattern: opt-in (webhook only runs in labeled namespaces)
namespaceSelector:
  matchLabels:
    admission.webhook.example.com/enabled: "true"
# Label the namespace to enable: kubectl label namespace production admission.webhook.example.com/enabled=true
# Pattern: opt-out (webhook runs everywhere EXCEPT labeled namespaces)
namespaceSelector:
  matchExpressions:
    - key: admission.webhook.example.com/skip
      operator: DoesNotExist
# Pattern: exclude system namespaces from policy webhook
namespaceSelector:
  matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values:
        - kube-system
        - kube-public
        - kube-node-lease
        - cert-manager
        - istio-system
        - my-webhook-namespace   # CRITICAL: exclude webhook's own namespace (prevent self-blocking)
Always add the webhook's own namespace to the NotIn exclusion list.
objectSelector — Filter by object labels
# Only call webhook for pods with specific labels
objectSelector:
  matchLabels:
    sidecar-injection: "true"   # opt-in: only inject sidecar where requested
# Skip webhook for objects created by controllers (reduces noise)
objectSelector:
  matchExpressions:
    - key: app.kubernetes.io/managed-by
      operator: NotIn
      values: ["kube-controller-manager", "daemonset-controller"]
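Both selectors use standard Kubernetes label-selector semantics. The Python sketch below evaluates a selector against a label map so the opt-in, opt-out, and NotIn patterns can be checked by hand; `matches` is a hypothetical helper, not a client-library function. Note that NotIn also matches objects that lack the key entirely.

```python
def matches(selector: dict, labels: dict) -> bool:
    """Evaluate a label selector (matchLabels + matchExpressions) against labels."""
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key, op, values = expr["key"], expr["operator"], expr.get("values", [])
        if op == "In":
            ok = key in labels and labels[key] in values
        elif op == "NotIn":
            ok = key not in labels or labels[key] not in values  # missing key matches
        elif op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True


opt_out = {"matchExpressions": [
    {"key": "admission.webhook.example.com/skip", "operator": "DoesNotExist"}]}
not_system = {"matchExpressions": [
    {"key": "kubernetes.io/metadata.name", "operator": "NotIn",
     "values": ["kube-system", "kube-public"]}]}

print(matches(opt_out, {"team": "payments"}))
print(matches(not_system, {"kubernetes.io/metadata.name": "kube-system"}))
```

An empty selector (`{}`) matches everything, which is why a webhook without selectors is called for every namespace.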
OPA / Gatekeeper
OPA/Gatekeeper implements validating admission policies in Rego (OPA's policy language), delivered through a ValidatingWebhookConfiguration plus Gatekeeper CRDs. Mutation is also supported, but through dedicated mutation CRDs (Assign, AssignMetadata) rather than Rego. Gatekeeper additionally has an audit mode that scans existing resources for policy violations without blocking new ones.
Gatekeeper Architecture:
kube-apiserver
  │ (ValidatingWebhookConfiguration; MutatingWebhookConfiguration for mutation)
  ▼
Gatekeeper Webhook Pod
  ├── validates against: Constraint objects
  │     └── Constraint references: ConstraintTemplate (Rego code)
  └── audit controller: scans existing objects → Constraint status.violations[]
CRD hierarchy:
  ConstraintTemplate (cluster-scoped) → defines the Rego code + CRD schema
  Constraint (instance of ConstraintTemplate) → applies policy to specific resources
ConstraintTemplate — Define the Policy
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels   # this creates a new CRD: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:               # parameter: list of required label keys
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels
        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("Missing required labels: %v", [missing])
        }
Constraint — Instantiate the Policy
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels            # kind = spec.crd.spec.names.kind of the ConstraintTemplate
metadata:
  name: pods-must-have-team-label
spec:
  enforcementAction: deny          # deny (block), warn (allow + warning), dryrun (audit only)
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaceSelector:             # apply to specific namespaces
      matchExpressions:
        - key: environment
          operator: In
          values: ["production", "staging"]
    excludedNamespaces:            # always skip these namespaces
      - kube-system
      - cert-manager
  parameters:
    labels: ["team", "app"]        # must have both labels
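The Rego rule in the ConstraintTemplate is a set difference between the required label keys (from the Constraint's parameters) and the keys present on the object. The same logic in Python, as a rough sketch for reasoning about expected results before writing Rego tests; `missing_labels` is a hypothetical name and this is not how Gatekeeper evaluates policies internally.

```python
def missing_labels(obj: dict, required: list) -> set:
    """Return the required label keys that the object does not carry."""
    provided = set(obj.get("metadata", {}).get("labels") or {})
    return set(required) - provided


pod = {"metadata": {"name": "api", "labels": {"team": "payments"}}}
missing = missing_labels(pod, ["team", "app"])
if missing:
    print(f"Missing required labels: {sorted(missing)}")
```

A non-empty result corresponds to the `count(missing) > 0` condition that produces a violation in the template.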
Checking Violations
# List all constraints and their violation counts (constraints are cluster-scoped)
kubectl get constraints
# See violations for a specific constraint
kubectl describe k8srequiredlabels pods-must-have-team-label
# status.violations[] lists each violating object
# Audit runs automatically on a configurable interval (--audit-interval, default 60s).
# Read the violation count recorded by the most recent audit run:
kubectl get k8srequiredlabels pods-must-have-team-label \
  -o jsonpath='{.status.totalViolations}'
Gatekeeper Mutation (AssignMetadata CRD)
# Gatekeeper mutation: add a default label to pods that do not already have it.
# Labels and annotations are set with AssignMetadata; Assign is for fields outside metadata.
apiVersion: mutations.gatekeeper.sh/v1
kind: AssignMetadata
metadata:
  name: add-default-team-label
spec:
  match:
    scope: Namespaced
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaceSelector:
      matchLabels:
        environment: production
    excludedNamespaces: ["kube-system"]
  location: "metadata.labels.injected-by"   # path to the label (Gatekeeper location syntax)
  parameters:
    assign:
      value: "gatekeeper"                   # value to set
Kyverno
Kyverno is a Kubernetes-native policy engine that uses YAML (not Rego) for policy definitions. It supports validate, mutate, generate, and verify rules within a single ClusterPolicy resource. It also supports background scanning of existing resources.
Kyverno Architecture:
  Webhook: ValidatingWebhookConfiguration + MutatingWebhookConfiguration
  Background controller: scans existing resources for policy violations
  Reports: PolicyReport, ClusterPolicyReport CRDs
Rule types in a single ClusterPolicy:
  validate     → check and block/warn
  mutate       → modify incoming objects (patch or strategic merge)
  generate     → create new resources when a trigger resource is created
  verifyImages → verify container image signatures (cosign/Notation)
ClusterPolicy: validate rule
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce   # Enforce (block) or Audit (report only)
  background: true                   # also check existing resources in background
  rules:
    - name: require-image-tag
      match:
        any:
          - resources:
              kinds: ["Pod"]
              namespaceSelector:
                matchExpressions:
                  - key: kubernetes.io/metadata.name
                    operator: NotIn
                    values: ["kube-system", "kube-public"]
      exclude:
        any:
          - resources:
              namespaces: ["ci-testing"]   # exclude ci namespace
      validate:
        message: "Image tag 'latest' is not allowed. Use a specific tag."
        pattern:
          spec:
            containers:
              - image: "!*:latest"         # Kyverno pattern: negate with !
            =(initContainers):
              - image: "!*:latest"
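One subtlety worth checking when testing this policy: the `!*:latest` pattern is a string match, so an image written without any tag (for example `nginx`), which the runtime resolves to `latest`, does not end in `:latest`. The Python sketch below resolves the effective tag so test fixtures can cover that case; the helper names are hypothetical, and the parsing is a simplification that only handles the common `[registry[:port]/]name[:tag][@digest]` forms.

```python
def image_tag(image: str):
    """Return the effective tag, 'latest' if no tag is given, or None if pinned by digest."""
    if "@" in image:
        return None  # digest-pinned references do not float
    # Only the last path component can carry a tag; earlier colons are registry ports.
    last = image.rsplit("/", 1)[-1]
    if ":" in last:
        return last.rsplit(":", 1)[1]
    return "latest"


def uses_latest(image: str) -> bool:
    return image_tag(image) == "latest"


for ref in ("nginx", "nginx:latest", "localhost:5000/nginx",
            "registry.example.com/app:1.2.3", "app@sha256:abc123"):
    print(ref, uses_latest(ref))
```

If untagged images should also be rejected, a second rule that requires a tag (for example a `"*:*"` pattern) is one common way to close the gap.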
ClusterPolicy: mutate rule
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-pod-labels
spec:
  rules:
    - name: add-managed-by-label
      match:
        any:
          - resources:
              kinds: ["Pod"]
      mutate:
        patchStrategicMerge:
          metadata:
            labels:
              +(managed-by): "kyverno"     # + prefix: only add if not already present
          spec:
            containers:
              - (name): "*"                # match all containers
                +(resources):
                  +(limits):
                    +(memory): "256Mi"     # add default memory limit if not set
ClusterPolicy: generate rule
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: generate-default-deny-policy
spec:
  rules:
    - name: create-default-networkpolicy
      match:
        any:
          - resources:
              kinds: ["Namespace"]
      generate:
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        name: default-deny-all
        namespace: "{{request.object.metadata.name}}"   # same namespace as new Namespace
        synchronize: true                               # keep generated resource in sync with policy
        data:
          spec:
            podSelector: {}
            policyTypes: [Ingress, Egress]
ClusterPolicy: verifyImages rule (cosign)
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: Enforce
  background: false                  # image verification must be real-time, not background
  rules:
    - name: verify-cosign-signature
      match:
        any:
          - resources:
              kinds: ["Pod"]
      verifyImages:
        - imageReferences:
            - "registry.example.com/*"   # images from our registry must be signed
          attestors:
            - entries:
                - keyless:
                    subject: "https://github.com/myorg/myrepo/.github/workflows/release.yaml@refs/heads/main"
                    issuer: "https://token.actions.githubusercontent.com"
                    rekor:
                      url: https://rekor.sigstore.dev
          mutateDigest: true             # replace :tag with @digest after verification
Policy Reports
# View policy audit results
kubectl get policyreport -n production
kubectl get clusterpolicyreport
kubectl describe policyreport -n production
# Shows: pass/fail/warn/error counts per rule, failing resources
# Install Policy Reporter UI for dashboard view (the chart lives in its own Helm repository)
helm repo add policy-reporter https://kyverno.github.io/policy-reporter
helm install policy-reporter policy-reporter/policy-reporter \
  --namespace policy-reporter --create-namespace \
  --set ui.enabled=true
ValidatingAdmissionPolicy (CEL)
ValidatingAdmissionPolicy (VAP) is a built-in, in-process validating admission mechanism using CEL (Common Expression Language). GA in Kubernetes 1.30. It runs inside the API server — no external webhook pod needed. This improves latency, reliability (no external dependency), and simplicity for common policies.
Traditional webhook:            ValidatingAdmissionPolicy (VAP):
  API server → network →          API server → CEL expression
  webhook pod →                   (evaluated in-process, no network hop)
  response → API server
Tradeoff:
  VAP: faster, simpler, no webhook pod to manage, no TLS cert management
  VAP: limited expressiveness vs Rego (no general loops, limited functions)
Use VAP for: simple field checks, label enforcement, resource constraints
Use Gatekeeper/Kyverno for: complex logic, cross-resource checks, mutation, generation
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: deny-privileged-pods
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  variables:                       # pre-compute values for reuse in expressions
    - name: containers
      expression: "object.spec.containers + object.spec.?initContainers.orValue([]) + object.spec.?ephemeralContainers.orValue([])"
  validations:
    - expression: |
        variables.containers.all(c,
          !has(c.securityContext) ||
          !has(c.securityContext.privileged) ||
          c.securityContext.privileged == false
        )
      message: "Privileged containers are not allowed"
      messageExpression: |
        "Privileged containers found: " + variables.containers.filter(c,
          has(c.securityContext) && has(c.securityContext.privileged) && c.securityContext.privileged
        ).map(c, c.name).join(", ")
  auditAnnotations:
    - key: "privileged-container-check"
      valueExpression: "'checked'"
---
# ValidatingAdmissionPolicyBinding: binds policy to namespaces/resources
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: deny-privileged-pods-binding
spec:
  policyName: deny-privileged-pods
  validationActions: [Deny]        # Deny, Warn, or Audit
  matchResources:
    namespaceSelector:
      matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values: ["kube-system", "kube-public"]
VAP with Parameters
# Policy that accepts parameters from a separate config object
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-labels
spec:
  paramKind:                       # type of the parameter object
    apiVersion: v1
    kind: ConfigMap
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
  validations:
    - expression: |
        params.data.requiredLabels.split(",").all(label,
          has(object.metadata.labels) && label in object.metadata.labels
        )
      messageExpression: "'Missing required labels: ' + params.data.requiredLabels"
---
# Binding with paramRef
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: require-labels-production
spec:
  policyName: require-labels
  paramRef:
    name: production-required-labels   # reference to a ConfigMap
    namespace: policy-system
    parameterNotFoundAction: Deny      # deny if ConfigMap not found
  validationActions: [Deny]
  matchResources:
    namespaceSelector:
      matchLabels:
        environment: production
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: production-required-labels
  namespace: policy-system
data:
  requiredLabels: "team,app,environment"
Common Policy Examples
1. Require images from approved registries (Kyverno)
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: allowed-registries
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-registry
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Image must be from approved registry (registry.example.com or gcr.io/my-project)"
        pattern:
          spec:
            containers:
              - image: "registry.example.com/* | gcr.io/my-project/*"
            =(initContainers):
              - image: "registry.example.com/* | gcr.io/my-project/*"
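To sanity-check which images a registry allow-list would accept before rolling the policy out, the wildcard match can be approximated offline. This Python sketch uses `fnmatch` from the standard library as a stand-in for Kyverno's wildcard matching; it is an approximation for building test fixtures, not Kyverno's matcher, and the helper names are hypothetical.

```python
from fnmatch import fnmatchcase

ALLOWED = ["registry.example.com/*", "gcr.io/my-project/*"]


def image_allowed(image: str, patterns=ALLOWED) -> bool:
    """True if the image reference matches at least one allowed pattern."""
    return any(fnmatchcase(image, pattern) for pattern in patterns)


def disallowed_images(pod: dict) -> list:
    """List images in containers and initContainers that match no allowed pattern."""
    spec = pod.get("spec", {})
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    return [c["image"] for c in containers if not image_allowed(c["image"])]


pod = {"spec": {
    "containers": [{"name": "app", "image": "registry.example.com/team/app:1.4.2"}],
    "initContainers": [{"name": "init", "image": "docker.io/nginx:latest"}],
}}
print(disallowed_images(pod))
```

Short references such as `nginx` carry no registry prefix in the manifest, so they fail the match as written, which is usually the desired outcome for an allow-list.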
2. Require resource limits (VAP)
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-resource-limits
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  variables:
    - name: containers
      expression: "object.spec.containers"
  validations:
    - expression: |
        variables.containers.all(c,
          has(c.resources) &&
          has(c.resources.limits) &&
          has(c.resources.limits.memory) &&
          has(c.resources.limits.cpu)
        )
      message: "All containers must have CPU and memory limits set"
3. Inject default security context (Kyverno mutate)
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-security-context
spec:
  rules:
    - name: add-seccomp-profile
      match:
        any:
          - resources:
              kinds: ["Pod"]
      mutate:
        patchStrategicMerge:
          spec:
            +(securityContext):
              +(seccompProfile):
                +(type): RuntimeDefault
            containers:
              - (name): "*"
                +(securityContext):
                  +(allowPrivilegeEscalation): false
                  +(readOnlyRootFilesystem): true
4. Auto-generate NetworkPolicy on namespace creation (Kyverno generate)
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-networkpolicy
spec:
  rules:
    - name: default-deny-all
      match:
        any:
          - resources:
              kinds: ["Namespace"]
      generate:
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        name: default-deny-all
        namespace: "{{request.object.metadata.name}}"
        synchronize: true
        data:
          spec:
            podSelector: {}
            policyTypes: [Ingress, Egress]
Testing Admission Webhooks
Server-side dry-run
# Test if a resource would be admitted without creating it
kubectl apply -f pod.yaml --dry-run=server
# Runs through full admission pipeline including webhooks with sideEffects: None/NoneOnDryRun
# Returns validation errors without creating the resource
# Test deletion dry-run
kubectl delete pod my-pod --dry-run=server
Kyverno CLI for policy testing
# Install kyverno CLI
brew install kyverno
# Test a policy against a resource offline (no cluster needed)
kyverno apply policy.yaml --resource pod.yaml
# Test with multiple resources
kyverno apply policies/ --resource resources/
# Run in CI pipeline
kyverno apply policies/ --resource resources/ --detailed-results
# exit code 0 = all pass, non-zero = violations
Conftest (OPA policy unit testing)
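# Unit test plain Rego policies (deny/warn/violation rules) against manifests offline.
# Gatekeeper ConstraintTemplates read input.review and input.parameters, so test those with Gatekeeper's gator CLI instead.
conftest test pod.yaml --policy rego/
# Run in CI (--all-namespaces evaluates every Rego package, not Kubernetes namespaces)
conftest test manifests/ --policy policies/ --all-namespaces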
Manual webhook testing
# Send a synthetic AdmissionReview to your webhook directly
curl -k -X POST \
https://my-webhook.webhook-system.svc/validate-pods \
-H "Content-Type: application/json" \
-d @admission-review.json
# Check webhook certificate is valid
openssl s_client -connect my-webhook.webhook-system.svc:443 \
-CAfile /path/to/ca.crt
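The `admission-review.json` file used by the curl command has to be written by hand or generated. A minimal Python sketch for generating one is shown below; the function name, the `test-user` identity, and the sample pod are assumptions made for the example, and only the fields shown in the AdmissionReview request earlier on this page are populated.

```python
import json
import uuid


def make_admission_review(pod: dict, namespace: str, operation: str = "CREATE",
                          dry_run: bool = False) -> dict:
    """Build a synthetic admission.k8s.io/v1 AdmissionReview request for a Pod."""
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "request": {
            "uid": str(uuid.uuid4()),  # the webhook must echo this uid in its response
            "kind": {"group": "", "version": "v1", "kind": "Pod"},
            "resource": {"group": "", "version": "v1", "resource": "pods"},
            "namespace": namespace,
            "operation": operation,
            "userInfo": {"username": "test-user", "groups": ["system:authenticated"]},
            "object": pod,
            "oldObject": None,
            "dryRun": dry_run,
        },
    }


pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo", "namespace": "production"},
    "spec": {"containers": [{"name": "app", "image": "docker.io/nginx:latest"}]},
}
review = make_admission_review(pod, "production")
print(json.dumps(review, indent=2))
```

Redirecting the output to `admission-review.json` produces a payload that can be posted with the curl command above; comparing the `uid` in the webhook's reply against the one sent is a quick correctness check.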
Performance & Reliability
Latency Budget
Webhook latency adds directly to the API server response time for admitted operations. Mutating webhooks are called one after another, so their latencies add up; validating webhooks are called in parallel, so that phase takes as long as its slowest webhook. Guidelines:
- Keep webhook response time under 100ms P99 for non-blocking paths
- Set timeoutSeconds to 2× your P99 webhook latency (min 10s for safety)
- Validating webhooks are called in parallel (phase latency = slowest webhook); mutating webhooks run serially (phase latency = sum)
- Cache policy decisions in-memory where possible (policy doesn't change per-request)
- Use namespaceSelector and objectSelector aggressively to skip non-matching resources
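The latency budget can be estimated with simple arithmetic, given that mutating webhooks run serially and validating webhooks run in parallel. The Python sketch below is a back-of-the-envelope model under that assumption; it ignores reinvocation, built-in plugins, and network overhead, and the function name and sample numbers are illustrative.

```python
def admission_latency(mutating_s, validating_s) -> float:
    """Estimate webhook-added latency: serial sum of mutating + slowest validating."""
    return sum(mutating_s) + max(validating_s, default=0.0)


# Typical case: observed P99 latencies per webhook, in seconds (hypothetical values)
typical = admission_latency(mutating_s=[0.02, 0.05], validating_s=[0.03, 0.08])

# Worst case: every webhook runs into its configured timeoutSeconds
worst = admission_latency(mutating_s=[10, 10], validating_s=[15, 5])

print(f"typical: {typical:.2f}s, worst case: {worst}s")
```

The worst-case figure is the one to compare against client request timeouts: two mutating webhooks with 10s timeouts plus a 15s validating timeout can already exceed a 30s client budget.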
High Availability
# Deploy webhook pods with anti-affinity and PDB
apiVersion: apps/v1
kind: Deployment
metadata:
  name: policy-webhook
  namespace: webhook-system
spec:
  replicas: 3
  selector:
    matchLabels:
      app: policy-webhook
  strategy:
    rollingUpdate:
      maxUnavailable: 0   # never drop below the desired replica count during an update
      maxSurge: 1
  template:
    metadata:
      labels:
        app: policy-webhook   # must match the selector, the anti-affinity rule, and the PDB
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: policy-webhook
              topologyKey: kubernetes.io/hostname   # one pod per node
      # containers: omitted here; add the webhook container spec
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: policy-webhook
  namespace: webhook-system
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: policy-webhook
Add tolerations for node-role.kubernetes.io/control-plane and node.kubernetes.io/not-ready to ensure the webhook can run during cluster events.
Metrics & Alerts
Key Metrics
| Metric | Source | What It Tells You |
|---|---|---|
| apiserver_admission_webhook_rejection_count{name,type,operation,error_type,rejection_code} | kube-apiserver | Admission rejections per webhook; track trends per policy name. error_type distinguishes policy denials (no_error) from webhook call failures |
| apiserver_admission_webhook_admission_duration_seconds{name,type} | kube-apiserver | Per-webhook admission latency; P99 >5s is concerning |
| apiserver_admission_step_admission_duration_seconds{type} | kube-apiserver | Total admission phase latency (mutating/validating); impact on overall API latency |
| kyverno_policy_results_total{policy_name,rule_name,rule_result,rule_type} | Kyverno | Pass/fail/warn counts per policy rule |
| gatekeeper_violations{enforcement_action} | Gatekeeper | Total violations found by the last audit, per enforcement action; per-constraint detail is in each Constraint's status |
Alerts
groups:
  - name: admission.rules
    rules:
      - alert: AdmissionWebhookHighLatency
        expr: |
          histogram_quantile(0.99,
            sum by (le, name) (
              rate(apiserver_admission_webhook_admission_duration_seconds_bucket[5m])
            )
          ) > 5
        for: 5m
        annotations:
          summary: "Admission webhook {{ $labels.name }} P99 latency > 5s"
          description: "High webhook latency degrades all API calls for the matched resource type"
        labels:
          severity: warning
      - alert: AdmissionWebhookFailOpen
        expr: |
          sum by (name) (rate(apiserver_admission_webhook_fail_open_count[5m])) > 0
          or
          sum by (name) (rate(apiserver_admission_webhook_rejection_count{error_type!="no_error"}[5m])) > 0
        annotations:
          summary: "Webhook {{ $labels.name }} is returning errors; check failurePolicy"
          description: "If failurePolicy=Ignore, policy is not being enforced. If failurePolicy=Fail, API calls are being blocked."
        labels:
          severity: high
      - alert: GatekeeperViolationsHigh
        expr: sum by (enforcement_action) (gatekeeper_violations) > 10
        for: 30m
        annotations:
          summary: "Gatekeeper audit reports >10 violations (enforcement action {{ $labels.enforcement_action }})"
        labels:
          severity: warning
      - alert: KyvernoPolicyViolationRateHigh
        expr: sum by (policy_name, rule_name) (rate(kyverno_policy_results_total{rule_result="fail"}[5m])) > 5
        annotations:
          summary: "Kyverno policy {{ $labels.policy_name }}/{{ $labels.rule_name }} failing at high rate"
        labels:
          severity: warning
Runbooks
- Webhook blocking all pod creates in a namespace: Identify the blocking webhook: kubectl get events -n <ns> | grep "denied the request". The error message names the webhook. Options: fix the pod spec to comply with the policy; add the namespace to the webhook's namespaceSelector exclusion temporarily; use kubectl apply --dry-run=server to iterate. If an emergency bypass is needed and the webhook is Kyverno/Gatekeeper, switch the policy to Audit mode temporarily and document why.
- Webhook crashed, all pod creates failing (failurePolicy: Fail): Check webhook pod status: kubectl get pods -n <webhook-ns>. Check logs: kubectl logs -l app=<webhook> -n <webhook-ns>. If the webhook pod itself cannot start (caught in its own policy), temporarily switch the failing webhook to failurePolicy: Ignore to allow recovery: kubectl patch mutatingwebhookconfiguration <name> --type=json -p='[{"op":"replace","path":"/webhooks/0/failurePolicy","value":"Ignore"}]'. Fix the issue, then restore failurePolicy: Fail.
- Gatekeeper violations discovered in audit: Run kubectl describe <constraintKind> <constraintName> to list violating objects. Prioritize by severity. For each violation: either fix the object to comply, or update the constraint to exclude the specific case with justification. Track violations in a JIRA/Linear issue. Do not remove the constraint; fix the violation. Set an SLA for remediation (e.g., critical = 24h, high = 7 days).
- Webhook admission latency causing API timeout: Check apiserver_admission_webhook_admission_duration_seconds P99. Identify the slow webhook. Optimize the webhook: add caching, reduce external calls, increase webhook replicas. If the webhook is a 3rd party (Istio, Vault injector), check their known performance issues. As a short-term fix: reduce the webhook's scope with tighter namespaceSelector and objectSelector to reduce call volume.
- False positive blocking legitimate workload: Identify the specific policy rule causing the rejection (error message should name it). Review whether the rejection is correct or a false positive. If false positive: update the policy's exclude section or fix the pattern/expression. Test the fix with Kyverno CLI or --dry-run=server. For Gatekeeper: update the Constraint's spec.match.excludedNamespaces or add exception annotation support via parameter. Document all exceptions with justification.
Best Practices
- Start policies in audit/warn mode, then graduate to enforce. New policies applied in enforce mode immediately break workloads that don't comply. Instead: deploy with enforcementAction: dryrun (Gatekeeper), validationFailureAction: Audit (Kyverno), or validationActions: [Audit] (VAP). Review violations over 2+ weeks. Fix all violations. Then switch to enforce. This prevents surprise breakages during rollout.
- Use ValidatingAdmissionPolicy (CEL) for simple checks; no webhook pod is needed. For straightforward policies (require labels, forbid privileged, check resource limits), VAP runs in-process with no additional infrastructure, no TLS management, and no network round trip. Reserve external webhooks (Kyverno/Gatekeeper) for complex logic, mutations, and cross-resource checks.
- Always exclude the webhook's own namespace from its selectors. The webhook pod's namespace must always be in the exclusion list to prevent self-blocking during webhook upgrades and restarts. Forgetting this is a common admission webhook deployment mistake.
- Deploy webhook pods with replicas ≥ 2, PDB minAvailable ≥ 1, and anti-affinity. A single-replica webhook with failurePolicy: Fail is a single point of failure for all API operations it matches. Three replicas with a PDB keep the webhook available during node failures, maintenance, and rolling updates.
- Set failurePolicy: Fail on security-critical webhooks, Ignore on convenience mutations. There is no middle ground: a webhook that can silently fail open is not a security control. If the policy is security-critical (image signing, no privileged containers), it must use Fail. If it's convenience (adding optional labels), Ignore is appropriate. Document which webhooks are security-critical vs convenience.
- Test policies offline in CI with Kyverno CLI or conftest before cluster deployment. Every policy change should be tested against a suite of compliant and non-compliant resource examples in CI before being applied to the cluster. This catches regressions (policy change breaks existing workload) and false negatives (policy doesn't catch what it should) before they reach production.
- Use tight namespaceSelector and objectSelector to minimize webhook invocations. Every webhook invocation adds admission latency. Restrict webhooks to only the namespaces and object types they need to evaluate. A webhook that applies to all pods in all namespaces, including kube-system and monitoring, adds latency to every pod create in the cluster, which matters at scale.
- Monitor admission rejection rates and latency continuously. A sudden spike in rejection rate means a new policy is blocking legitimate workloads. A sudden spike in latency means a webhook is struggling. Both are P1 operational issues. Set up Grafana dashboards for apiserver_admission_webhook_rejection_count and apiserver_admission_webhook_admission_duration_seconds P99, alerting at defined thresholds.