🛡️ Policy Enforcement

Policy Enforcement in Kubernetes

A complete guide to Kubernetes admission control, OPA Gatekeeper, Kyverno, Pod Security Admission, and policy-as-code workflows, so that every workload meets security, operational, and compliance standards before it lands in the cluster.

Admission Control Architecture
Pod Security Admission
OPA Gatekeeper
Kyverno
Gatekeeper vs Kyverno
Common Policy Library
Policy Exceptions
Policy Testing
Policy as Code in GitOps
Audit & Compliance Reporting
Alerting & Monitoring
Best Practices

Admission Control Architecture

Every write request to the Kubernetes API (create, update, delete, connect) passes through an ordered admission chain before it is persisted to etcd; read requests (get, list, watch) skip admission. Policy engines hook into this chain as dynamic admission webhooks. Because the API server calls them synchronously, they act as enforceable gates rather than after-the-fact detectors.

kubectl apply / CI pipeline / GitOps reconciler
                    │
                    ▼
┌─────────────────────────────────────────────────────────────┐
│                      kube-apiserver                         │
│                                                             │
│  1. Authentication (certificate / OIDC / bearer token)      │
│  2. Authorization  (RBAC / ABAC / Node)                     │
│  3. Admission                                               │
│     ├── Mutating admission (webhooks called serially)       │
│     │   ├── Kyverno mutate webhook                          │
│     │   ├── Gatekeeper mutation webhook                     │
│     │   └── cert-manager / linkerd injectors                │
│     ├── Object Schema Validation                            │
│     └── Validating admission (webhooks called in parallel)  │
│         ├── Gatekeeper validating webhook                   │
│         ├── Kyverno validate webhook                        │
│         └── Pod Security Admission (built-in plugin)        │
│                                                             │
│  4. Persist to etcd (only if every admission check admits)  │
└─────────────────────────────────────────────────────────────┘
          │                              │
          ▼                              ▼
  ALLOW + audit log          DENY (HTTP 403 back to client)
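To make the synchronous gate concrete, here is a minimal Python sketch of the response a validating webhook sends back to the API server. The AdmissionReview v1 envelope (apiVersion, kind, response.uid, response.allowed, response.status) follows the Kubernetes admission API; the function name and the sample rule (reject objects without a team label) are hypothetical and only illustrate the shape of the exchange.

```python
def review_response(review: dict) -> dict:
    """Build an AdmissionReview v1 response for an incoming AdmissionReview request."""
    request = review["request"]
    labels = request["object"].get("metadata", {}).get("labels") or {}

    # Hypothetical rule for illustration: the object must carry a "team" label.
    allowed = "team" in labels

    # The response must echo the request uid so the API server can correlate it.
    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {"code": 403, "message": "missing required label: team"}

    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

A real webhook would serve this over HTTPS at the path named in its webhook configuration; the sketch covers only the decision payload.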

⚠️

Webhook availability matters. Validating webhooks with failurePolicy: Fail block all matching requests if the webhook pod is unavailable. Deploy policy engines with podAntiAffinity across zones, set PodDisruptionBudgets, and size replicas to survive a zone failure. failurePolicy: Ignore is an escape hatch, but it fails open: matching requests are admitted unchecked while the webhook is down.
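The trade-off between the two failure policies can be written as a small decision function. This is an illustrative Python sketch, not API server code; the function name and the use of None to represent an unreachable or timed-out webhook are assumptions made for the example.

```python
from typing import Optional


def admission_outcome(webhook_verdict: Optional[bool], failure_policy: str) -> bool:
    """Return True if the request is admitted.

    webhook_verdict is True/False when the webhook answered, and None when it
    could not be reached or timed out (an assumption of this sketch).
    """
    if webhook_verdict is not None:
        return webhook_verdict      # the webhook answered: its verdict decides
    if failure_policy == "Fail":
        return False                # fail closed: block while the webhook is down
    if failure_policy == "Ignore":
        return True                 # fail open: admit unchecked
    raise ValueError(f"unknown failurePolicy: {failure_policy}")
```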

Webhook Configuration Anatomy

# ValidatingWebhookConfiguration (simplified — Gatekeeper creates this)
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: gatekeeper-validating-webhook-configuration
webhooks:
- name: validation.gatekeeper.sh
  admissionReviewVersions: ["v1"]
  clientConfig:
    service:
      name: gatekeeper-webhook-service
      namespace: gatekeeper-system
      path: /v1/admit
  rules:
  - apiGroups: ["*"]
    apiVersions: ["*"]
    operations: ["CREATE","UPDATE"]
    resources: ["*"]
    scope: "*"               # also match cluster-scoped objects such as Namespace
  namespaceSelector:
    matchExpressions:
    - key: admission.gatekeeper.sh/ignore
      operator: DoesNotExist
  failurePolicy: Fail
  sideEffects: None
  timeoutSeconds: 10
  matchPolicy: Equivalent   # catches admission of old API versions too
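The namespaceSelector above decides whether a request is sent to the webhook at all. The following Python sketch mirrors how matchExpressions are evaluated against a namespace's labels under standard Kubernetes label-selector semantics; the function name is hypothetical.

```python
def selector_matches(labels: dict, expressions: list) -> bool:
    """Evaluate label-selector matchExpressions; every expression must hold."""
    for expr in expressions:
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        elif op == "In":
            ok = key in labels and labels[key] in values
        elif op == "NotIn":
            # NotIn also matches objects that do not carry the key at all.
            ok = key not in labels or labels[key] not in values
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True
```

With the selector from the configuration above, a namespace that carries the admission.gatekeeper.sh/ignore label is skipped by the webhook, whatever the label's value.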

Built-in Admission Plugins

Plugin	Type	Purpose	Enabled by Default
`PodSecurity`	Validating	Enforce Pod Security Standards per namespace	Yes (beta in 1.23, GA in 1.25)
`LimitRanger`	Mutating+Validating	Apply LimitRange defaults and enforce bounds	Yes
`ResourceQuota`	Validating	Block creation when namespace quota exceeded	Yes
`DefaultStorageClass`	Mutating	Add default StorageClass annotation to PVCs	Yes
`MutatingAdmissionWebhook`	Mutating	Call registered mutating webhooks	Yes
`ValidatingAdmissionWebhook`	Validating	Call registered validating webhooks	Yes
`ValidatingAdmissionPolicy`	Validating	CEL-based in-process policies (1.30 GA)	Yes (1.30+)

Pod Security Admission (PSA)

PSA is the built-in replacement for PodSecurityPolicy, which was deprecated in 1.21 and removed in 1.25. It enforces the three predefined Pod Security Standards at the namespace level via labels. There are no CRDs and no external webhook to operate, so the operational overhead is minimal.

The Three Pod Security Standards

Privileged

Completely unrestricted. Allows all host namespaces, privileged containers, any capabilities. Reserved for system-level DaemonSets (CNI, node agents, eBPF tools) in kube-system-equivalent namespaces.

Baseline

Prevents known privilege escalations. Disallows privileged containers, hostPID/hostNetwork/hostIPC, dangerous capabilities (NET_ADMIN, SYS_ADMIN), host path volumes, hostPort. Suitable for most general workloads.

Restricted

Heavily hardened. Requires runAsNonRoot: true and allowPrivilegeEscalation: false, requires containers to drop ALL capabilities (only NET_BIND_SERVICE may be added back), requires seccompProfile: RuntimeDefault or Localhost, and disallows all volume types except configMap, csi, downwardAPI, emptyDir, ephemeral, persistentVolumeClaim, projected, and secret. Target for internet-facing services.

PSA Modes

Mode	Label Key	Behavior	Use Case
`enforce`	`pod-security.kubernetes.io/enforce`	Reject violating pods synchronously	Production namespaces
`audit`	`pod-security.kubernetes.io/audit`	Allow but add audit annotation to API audit log	Monitoring compliance without breakage
`warn`	`pod-security.kubernetes.io/warn`	Allow but return HTTP warning header to client	Gradual rollout — warns kubectl users

Namespace Labeling Examples

# Recommended: enforce baseline, warn+audit restricted
apiVersion: v1
kind: Namespace
metadata:
  name: payments-api
  labels:
    # Reject pods that aren't at least baseline
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: latest
    # Warn and audit against restricted (see violations without breaking)
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest
---
# Infrastructure namespace — privileged for CNI/monitoring DaemonSets
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
  labels:
    pod-security.kubernetes.io/enforce: privileged

Labeling Existing Namespaces with Dry-Run

# Check what would break before enforcing
kubectl label --dry-run=server --overwrite ns my-app \
  pod-security.kubernetes.io/enforce=restricted 2>&1 | grep -i warning

# Bulk check all namespaces against restricted
# (only a change to the enforce label evaluates existing pods and returns warnings)
kubectl label --dry-run=server --overwrite ns --all \
  pod-security.kubernetes.io/enforce=restricted 2>&1 | grep -i warning

Compliant Pod securityContext for Restricted

spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 65532
    runAsGroup: 65532
    fsGroup: 65532
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
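As a rough pre-flight check, the fields in the securityContext above can be verified before a manifest reaches the cluster. The Python sketch below covers only a subset of the restricted profile (runAsNonRoot, seccompProfile, allowPrivilegeEscalation, and dropping ALL capabilities); it is not a reimplementation of Pod Security Admission, and the function name is hypothetical.

```python
def restricted_violations(pod_spec: dict) -> list:
    """Return human-readable problems for a subset of the restricted profile."""
    pod_sc = pod_spec.get("securityContext") or {}
    problems = []
    for container in pod_spec.get("containers", []):
        sc = container.get("securityContext") or {}
        name = container.get("name", "?")

        # Container-level settings take precedence over pod-level ones.
        if sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot")) is not True:
            problems.append(f"{name}: runAsNonRoot must be true")

        seccomp = (sc.get("seccompProfile") or pod_sc.get("seccompProfile") or {}).get("type")
        if seccomp not in ("RuntimeDefault", "Localhost"):
            problems.append(f"{name}: seccompProfile must be RuntimeDefault or Localhost")

        if sc.get("allowPrivilegeEscalation") is not False:
            problems.append(f"{name}: allowPrivilegeEscalation must be false")

        dropped = (sc.get("capabilities") or {}).get("drop") or []
        if "ALL" not in dropped:
            problems.append(f"{name}: capabilities.drop must include ALL")
    return problems
```

The spec shown above yields an empty list; readOnlyRootFilesystem is a good practice but is not required by the restricted profile, so the sketch does not check it.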

OPA Gatekeeper

Gatekeeper extends OPA (Open Policy Agent) into Kubernetes as a set of CRDs. Policies are expressed as ConstraintTemplates (Rego logic) and instantiated as Constraints (parameters). This two-layer model lets platform teams ship reusable policy libraries while teams configure enforcement parameters.

ConstraintTemplate                    Constraint (instance)
  name: K8sRequiredLabels               kind: K8sRequiredLabels
  rego: deny if labels missing          name: require-team-label
  creates CRD: K8sRequiredLabels        params: labels: ["team","env"]
          │                                     │
          └──────────────────┬──────────────────┘
                             │
               Gatekeeper admission webhook
                             │
               Evaluates Rego against object
                             │
                   ┌─────────┴──────────┐
                 ALLOW       DENY (with violation message)

Audit controller polls all existing objects
  → populates status.violations[] on each Constraint (catches pre-existing violations)

Install Gatekeeper

helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
helm repo update

helm install gatekeeper gatekeeper/gatekeeper \
  --namespace gatekeeper-system \
  --create-namespace \
  --version 3.17.1 \
  --set replicas=3 \
  --set auditInterval=60 \
  --set auditMatchKindOnly=false \
  --set constraintViolationsLimit=100 \
  --set disableMutation=false \
  --set logDenies=true \
  --set emitAdmissionEvents=true \
  --set emitAuditEvents=true

ConstraintTemplate: Required Labels

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
  annotations:
    metadata.gatekeeper.sh/title: "Required Labels"
    metadata.gatekeeper.sh/description: "Requires specified labels on resources"
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
  - target: admission.k8s.gatekeeper.sh
    rego: |
      package k8srequiredlabels

      violation[{"msg": msg}] {
        provided := {label | input.review.object.metadata.labels[label]}
        required := {label | label := input.parameters.labels[_]}
        missing  := required - provided
        count(missing) > 0
        msg := sprintf("Missing required labels: %v", [missing])
      }
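For readers new to Rego, the violation rule above is a set difference: required labels minus provided labels. A Python equivalent of the same logic, written as a sketch with hypothetical function names, looks like this:

```python
def missing_labels(obj: dict, required: list) -> set:
    """Mirror of the Rego rule: required label keys minus the keys present on the object."""
    provided = set((obj.get("metadata", {}).get("labels") or {}).keys())
    return set(required) - provided


def violation_message(obj: dict, required: list):
    """Return a violation message, or None when every required label is present."""
    missing = missing_labels(obj, required)
    if not missing:
        return None
    return f"Missing required labels: {sorted(missing)}"
```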

Constraint: Enforce Labels on Namespaces

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-team-env-labels
spec:
  enforcementAction: deny     # deny | warn | dryrun
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Namespace"]
    excludedNamespaces:
    - kube-system
    - kube-public
    - gatekeeper-system
    - cert-manager
    - argocd
  parameters:
    labels: ["team", "env", "cost-center"]

ConstraintTemplate: Allowed Image Registries

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sallowedrepos
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedRepos
      validation:
        openAPIV3Schema:
          type: object
          properties:
            repos:
              type: array
              items: {type: string}
  targets:
  - target: admission.k8s.gatekeeper.sh
    rego: |
      package k8sallowedrepos

      violation[{"msg": msg}] {
        container := input_containers[_]
        not strings.any_prefix_match(container.image, input.parameters.repos)
        msg := sprintf("Container %q image %q not from allowed registries %v",
          [container.name, container.image, input.parameters.repos])
      }

      input_containers[c] {
        c := input.review.object.spec.containers[_]
      }
      input_containers[c] {
        c := input.review.object.spec.initContainers[_]
      }
      input_containers[c] {
        c := input.review.object.spec.ephemeralContainers[_]
      }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: allowed-image-registries
spec:
  enforcementAction: deny
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Pod"]
  parameters:
    repos:
    - "123456789.dkr.ecr.us-east-1.amazonaws.com/"
    - "gcr.io/distroless/"
    - "registry.k8s.io/"

Gatekeeper Mutation

# Add a default cost-center label to workloads that do not set one
apiVersion: mutations.gatekeeper.sh/v1
kind: AssignMetadata
metadata:
  name: add-cost-center-label
spec:
  match:
    scope: Namespaced
    kinds:
    - apiGroups: ["apps"]
      kinds: ["Deployment","StatefulSet","DaemonSet"]
  location: "metadata.labels.cost-center"
  parameters:
    assign:
      value: "platform"
---
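# Set readOnlyRootFilesystem if not explicitly set
apiVersion: mutations.gatekeeper.sh/v1
kind: Assign
metadata:
  name: set-readonly-rootfs
spec:
  applyTo:                     # required for Assign: the GVKs this mutator may change
  - groups: [""]
    versions: ["v1"]
    kinds: ["Pod"]
  match:
    scope: Namespaced
    kinds:
    - apiGroups: [""]
      kinds: ["Pod"]
  location: "spec.containers[name:*].securityContext.readOnlyRootFilesystem"
  parameters:
    pathTests:                 # only mutate when the field is absent
    - subPath: "spec.containers[name:*].securityContext.readOnlyRootFilesystem"
      condition: MustNotExist
    assign:
      value: true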

Reading Audit Violations

# Show all violations across all constraints
kubectl get constraints -A
kubectl describe k8srequiredlabels require-team-env-labels

# Structured violation output
kubectl get k8srequiredlabels require-team-env-labels -o json \
  | jq '.status.violations'

# Count by constraint
kubectl get constraints -o json | jq '
  .items[] | {
    constraint: .metadata.name,
    violations: (.status.violations | length)
  }'
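The same per-constraint count can be produced in a script, for example to feed a report. This Python sketch assumes its input is the parsed output of kubectl get constraints -o json; it prefers status.totalViolations because the violations list is capped by constraintViolationsLimit. The function name is hypothetical.

```python
def violation_counts(constraint_list: dict) -> dict:
    """Map 'Kind/name' to the number of audit violations for each constraint."""
    counts = {}
    for item in constraint_list.get("items", []):
        status = item.get("status") or {}
        total = status.get("totalViolations")
        if total is None:
            # Fall back to the (possibly truncated) list of recorded violations.
            total = len(status.get("violations") or [])
        counts[f'{item["kind"]}/{item["metadata"]["name"]}'] = total
    return counts
```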

Kyverno

Kyverno is a Kubernetes-native policy engine that uses YAML (with JMESPath expressions) instead of a separate policy language. Policies are ClusterPolicy or namespace-scoped Policy resources whose rules validate, mutate, generate, or verify images (verifyImages).

Install Kyverno

helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update

helm install kyverno kyverno/kyverno \
  --namespace kyverno \
  --create-namespace \
  --version 3.2.6 \
  --set admissionController.replicas=3 \
  --set backgroundController.replicas=2 \
  --set cleanupController.replicas=1 \
  --set reportsController.replicas=1 \
  --set features.deferredLoading.enabled=true \
  --set features.policyExceptions.enabled=true \
  --set features.globalContext.enabled=true

ClusterPolicy: Validate

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-pod-probes
  annotations:
    policies.kyverno.io/title: Require Liveness and Readiness Probes
    policies.kyverno.io/category: Best Practices
    policies.kyverno.io/severity: medium
    policies.kyverno.io/subject: Pod
    policies.kyverno.io/description: >-
      Containers without liveness probes are not restarted when they hang.
      Pods without readiness probes receive traffic before they are ready.
spec:
  validationFailureAction: Enforce   # Enforce | Audit
  background: true   # also scan existing resources
  rules:
  - name: check-container-probes
    match:
      any:
      - resources:
          kinds: ["Pod"]
          operations: ["CREATE","UPDATE"]
    exclude:
      any:
      - resources:
          namespaces: ["kube-system","monitoring","kyverno","argocd"]
      - subjects:
        - kind: ServiceAccount
          name: "argo-rollouts"
          namespace: "argo-rollouts"
    validate:
      message: "Liveness and readiness probes are required for all containers."
      foreach:
      - list: "request.object.spec.containers"
        deny:
          conditions:
            any:
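            # "|| ''" yields an empty string when the probe is absent, so the
            # variable always resolves and the comparison below is well-defined
            - key: "{{ element.livenessProbe || '' }}"
              operator: Equals
              value: ""
            - key: "{{ element.readinessProbe || '' }}"
              operator: Equals
              value: ""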

ClusterPolicy: Mutate — Inject Default Resources

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-resources
spec:
  validationFailureAction: Audit
  background: false
  rules:
  - name: add-resource-defaults
    match:
      any:
      - resources:
          kinds: ["Pod"]
          operations: ["CREATE"]
    mutate:
      foreach:
      - list: "request.object.spec.containers[]"
        patchStrategicMerge:
          spec:
            containers:
            - name: "{{ element.name }}"
              resources:
                requests:
                  cpu: "{{ element.resources.requests.cpu || '100m' }}"
                  memory: "{{ element.resources.requests.memory || '128Mi' }}"
                limits:
                  memory: "{{ element.resources.limits.memory || '256Mi' }}"
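The effect of this mutation is "fill in only what is missing". A Python sketch of the same defaulting logic, using the values from the policy above, can help when reasoning about what a mutated pod will look like; the function and constant names are hypothetical.

```python
import copy

# Defaults taken from the policy above.
DEFAULTS = {
    "requests": {"cpu": "100m", "memory": "128Mi"},
    "limits": {"memory": "256Mi"},
}


def apply_resource_defaults(pod: dict) -> dict:
    """Return a copy of the pod with missing requests/limits filled in."""
    mutated = copy.deepcopy(pod)
    for container in mutated["spec"]["containers"]:
        resources = container.setdefault("resources", {})
        for section, values in DEFAULTS.items():
            target = resources.setdefault(section, {})
            for key, value in values.items():
                target.setdefault(key, value)   # keep explicitly set values
    return mutated
```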

ClusterPolicy: Generate — NetworkPolicy on Namespace Create

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: generate-default-network-policy
spec:
  rules:
  - name: generate-deny-all-ingress
    match:
      any:
      - resources:
          kinds: ["Namespace"]
          operations: ["CREATE"]
    exclude:
      any:
      - resources:
          names: ["kube-system","kube-public","kube-node-lease",
                  "monitoring","kyverno","argocd","cert-manager"]
    generate:
      synchronize: true   # keep the generated NetworkPolicy in sync; recreate it if deleted or modified
      apiVersion: networking.k8s.io/v1
      kind: NetworkPolicy
      name: deny-all-ingress
      namespace: "{{ request.object.metadata.name }}"
      data:
        spec:
          podSelector: {}
          policyTypes: ["Ingress"]
          # No ingress rules = deny all. Teams must add their own allow rules.

ClusterPolicy: Verify Image Signatures

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: Enforce
  background: false
  webhookTimeoutSeconds: 30
  rules:
  - name: verify-cosign-keyless
    match:
      any:
      - resources:
          kinds: ["Pod"]
          operations: ["CREATE","UPDATE"]
    verifyImages:
    - imageReferences:
      - "123456789.dkr.ecr.us-east-1.amazonaws.com/*"
      mutateDigest: true       # replace tag with digest (immutability)
      verifyDigest: true
      required: true
      attestors:
      - count: 1
        entries:
        - keyless:
            subject: "https://github.com/myorg/myrepo/.github/workflows/ci.yaml@refs/heads/main"
            issuer: "https://token.actions.githubusercontent.com"
            rekor:
              url: https://rekor.sigstore.dev

ℹ️

mutateDigest: true is a security-critical feature: at admission time it rewrites a mutable reference such as :v1.2.3 into one pinned by its @sha256:... digest, so the pod keeps running the verified image even if the tag is later repointed.

Kyverno CLI — Test Locally

# Install
brew install kyverno   # or: curl -sL https://github.com/kyverno/kyverno/releases/...

# Test a policy against resources without a cluster
kyverno apply policy.yaml --resource pod.yaml

# Test a policy with values (simulating variables)
kyverno apply policy.yaml --resource pod.yaml \
  --values-file values.yaml

# Run a full test suite
kyverno test .    # looks for kyverno-test.yaml in current dir

# Evaluate a policy against the live cluster and print a policy report
kyverno apply policy.yaml --cluster --policy-report

Kyverno Test Manifest

# kyverno-test.yaml
apiVersion: cli.kyverno.io/v1alpha1
kind: Test
metadata:
  name: require-pod-probes-test
policies:
- require-pod-probes.yaml
resources:
- good-pod.yaml
- bad-pod.yaml
results:
- policy: require-pod-probes
  rule: check-container-probes
  kind: Pod
  resources:
  - good-pod
  result: pass
- policy: require-pod-probes
  rule: check-container-probes
  kind: Pod
  resources:
  - bad-pod
  result: fail

Gatekeeper vs Kyverno

Dimension	OPA Gatekeeper	Kyverno
Policy language	Rego (OPA's declarative language)	YAML + JMESPath expressions
Learning curve	High (Rego has unique syntax/semantics)	Low (K8s-native YAML)
Rule types	Validate + Mutate (separate CRDs)	Validate + Mutate + Generate in one ClusterPolicy
Mutation support	AssignMetadata, Assign, ModifySet CRDs	patchStrategicMerge, patchesJSON6902, foreach
Resource generation	No native generate	Yes — generate NetworkPolicy, RBAC, etc. on triggers
Image verification	Not built in; possible through an external data provider (e.g., Ratify)	Yes, verifyImages with cosign keyless/key-based
External data	OPA external data providers (HTTP cache)	Global context (K8s resources, API calls)
Testing tooling	gator CLI (gator test / gator verify), conftest, opa test	kyverno test (full test suite with pass/fail)
Policy Reports	via status.violations (custom)	PolicyReport / ClusterPolicyReport (K8s standard CRD)
Audit mode	Separate audit controller, enforcementAction: dryrun	background: true + validationFailureAction: Audit
Exceptions	excludedNamespaces in match, Config match exemptions, namespace ignore label	PolicyException CRD
Community adoption	OPA is a CNCF graduated project; strong enterprise adoption	CNCF incubating project; rapidly growing; popular with Kubernetes-focused teams
Best for	Teams already using OPA/Rego for other policy; complex cross-resource logic	Teams wanting K8s-native approach; need generate rules; image signing

ℹ️

You can run both. A common pattern: Kyverno handles validate + mutate + generate + image verification; Gatekeeper handles complex Rego-based cross-resource validation (e.g., "no two services with same external hostname"). Ensure webhook timeout budgets don't cascade.

ValidatingAdmissionPolicy (VAP) — Built-in CEL

Kubernetes 1.30 GA'd ValidatingAdmissionPolicy — lightweight CEL-based validation without an external webhook. Ideal for simple constraints where you don't want another running pod.

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-run-as-non-root
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE","UPDATE"]
      resources: ["pods"]
  validations:
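  # has() guards avoid evaluation errors when securityContext is unset
  - expression: >-
      (has(object.spec.securityContext) &&
       has(object.spec.securityContext.runAsNonRoot) &&
       object.spec.securityContext.runAsNonRoot) ||
      object.spec.containers.all(c,
        has(c.securityContext) &&
        has(c.securityContext.runAsNonRoot) &&
        c.securityContext.runAsNonRoot)
    message: "Pods must set runAsNonRoot: true"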
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: require-run-as-non-root-binding
spec:
  policyName: require-run-as-non-root
  validationActions: [Deny]
  matchResources:
    namespaceSelector:
      matchLabels:
        enforce-security: "true"

Common Policy Library

These are the policies every production platform should enforce. Ship them in enforcementAction: dryrun / Audit first to find violations, then graduate to deny / Enforce.

Policy	Risk Mitigated	Recommended Action	Notes
Require resource limits & requests	OOM kills, CPU starvation, node pressure	Enforce	Use LimitRange defaults as fallback
Require liveness + readiness probes	Traffic to not-ready pods, no auto-restart	Enforce	Exclude Jobs and one-shot containers
Disallow privileged containers	Node escape, kernel access	Enforce	Allow in kube-system with annotation
Disallow hostPID / hostIPC / hostNetwork	Process snooping, network sniffing	Enforce	Allow for CNI/monitoring DaemonSets
Disallow hostPath volumes	Host filesystem read/write	Enforce	Allow specific paths for log agents
Require non-root user	Container breakout impact	Enforce	Set runAsNonRoot + runAsUser ≥ 1000
Disallow privilege escalation	setuid binary exploitation	Enforce	`allowPrivilegeEscalation: false`
Drop ALL capabilities	Linux capability abuse	Enforce	Allow NET_BIND_SERVICE explicitly
Require readOnlyRootFilesystem	Reduce attack surface for malware	Audit → Enforce	Many apps need emptyDir for /tmp
Require seccompProfile: RuntimeDefault	Syscall surface reduction	Enforce	Required for restricted PSS
Allowed image registries	Supply chain compromise	Enforce	Allowlist internal ECR + distroless
Require image digest (not tag)	Tag mutation attacks	Enforce (via mutateDigest)	Kyverno verifyImages handles this
Require cosign signature	Unsigned / untrusted images	Enforce	Gate on specific namespaces first
Disallow latest tag	Non-reproducible deployments	Enforce	Block pods with image ending in :latest
Require team + env labels	Ownership, cost attribution	Enforce on Namespace/Deployment	Required for cost chargeback
Require PodDisruptionBudget	Accidental zero-replica drain	Audit	Complex to enforce — use audit + alert
Require NetworkPolicy exists	Unrestricted lateral movement	Audit → Enforce	Kyverno generate creates deny-all
Max replicas without HPA	Over-provisioning	Audit	Warn if replicas > 3 without HPA
Disallow NodePort services	Direct node exposure	Enforce	Allow ClusterIP + LoadBalancer only
Require Ingress TLS	Plaintext traffic	Enforce	Check spec.tls is set

Kyverno: Disallow Latest Tag

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
  annotations:
    policies.kyverno.io/title: Disallow Latest Tag
    policies.kyverno.io/severity: medium
spec:
  validationFailureAction: Enforce
  background: true
  rules:
  - name: require-image-tag
    match:
      any:
      - resources:
          kinds: ["Pod"]
          operations: ["CREATE","UPDATE"]
    exclude:
      any:
      - resources:
          namespaces: ["kube-system"]
    validate:
      message: "Image tag ':latest' or missing tag is not allowed. Use a specific tag or digest."
      foreach:
      - list: "request.object.spec.containers"
        deny:
          conditions:
            any:
            - key: "{{ element.image }}"
              operator: Equals
              value: "*:latest"
            - key: "{{ element.image }}"
              operator: NotEquals
              value: "*:*"   # images without any tag
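Wildcard matching on image strings has edge cases worth knowing about: an image such as localhost:5000/nginx contains a colon (the registry port) but no tag, so the "*:*" pattern treats it as tagged. The Python sketch below shows a stricter way to decide whether an image reference is untagged or uses latest. It is an illustration of the parsing rule, not Kyverno's implementation, and the function name is hypothetical.

```python
def uses_latest_or_no_tag(image: str) -> bool:
    """True when the image has no tag or the tag is 'latest', and no digest pins it."""
    if "@" in image:
        return False                      # pinned by digest
    # Only the last path segment can carry a tag; earlier colons are registry ports.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return True                       # no tag means an implicit :latest
    return last_segment.rsplit(":", 1)[1] == "latest"
```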

Kyverno: Require Resource Limits

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-requests-limits
  annotations:
    policies.kyverno.io/title: Require Resource Requests and Limits
    policies.kyverno.io/severity: high
spec:
  validationFailureAction: Enforce
  background: true
  rules:
  - name: validate-resources
    match:
      any:
      - resources:
          kinds: ["Pod"]
          operations: ["CREATE","UPDATE"]
    exclude:
      any:
      - resources:
          namespaces: ["kube-system","kyverno"]
    validate:
      message: "CPU/memory requests and memory limits are required on all containers."
      foreach:
      - list: "request.object.spec.containers"
        deny:
          conditions:
            any:
            - key: "{{ element.resources.requests.cpu || '' }}"
              operator: Equals
              value: ""
            - key: "{{ element.resources.requests.memory || '' }}"
              operator: Equals
              value: ""
            - key: "{{ element.resources.limits.memory || '' }}"
              operator: Equals
              value: ""

Policy Exceptions

No policy library is perfect — there will always be legitimate exceptions (legacy apps, specialized system components, vendor-provided DaemonSets). Manage these explicitly rather than widening the policy's exclusion scope.

Kyverno PolicyException

apiVersion: kyverno.io/v2
kind: PolicyException
metadata:
  name: datadog-agent-exception
  namespace: monitoring        # exceptions are namespaced
spec:
  exceptions:
  - policyName: disallow-privileged-containers
    ruleNames:
    - check-privileged
  - policyName: require-requests-limits
    ruleNames:
    - validate-resources
  match:
    any:
    - resources:
        kinds: ["Pod"]
        namespaces: ["monitoring"]
        selector:
          matchLabels:
            app: datadog-agent
  # podSecurity lists Pod Security Standard controls to exempt (none here).
  # Track expiry and re-review dates in the Git exception registry.
  podSecurity: []

⚠️

Gate PolicyException creation. Use RBAC to allow only the platform team to create PolicyExceptions. Add a Kyverno policy that requires annotations: approved-by: platform-team on every PolicyException. Track all exceptions in a GitOps repo for auditability.

Gatekeeper Exemption via Namespace Label

# Exclude a namespace from ALL Gatekeeper webhooks
# (Gatekeeper only accepts this label on namespaces listed via --exempt-namespace)
kubectl label namespace legacy-app \
  admission.gatekeeper.sh/ignore=no-validation

# Per-constraint: use excludedNamespaces in Constraint spec
spec:
  match:
    excludedNamespaces:
    - legacy-app
    - vendor-system

Exception Tracking in Git

policy/
  library/
    require-probes.yaml
    disallow-privileged.yaml
    allowed-registries.yaml
  constraints/
    cluster-wide-constraints.yaml
  exceptions/
    README.md          # exception registry with justification
    datadog-agent.yaml
    legacy-payment-service.yaml   # expires: 2025-12-31, ticket: PLAT-4892
  tests/
    require-probes/
      kyverno-test.yaml
      good-pod.yaml
      bad-pod.yaml
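Expiry dates are only useful if something checks them. One option is a small CI script that flags exceptions past their review date. The Python sketch below assumes each exception manifest carries a hypothetical annotation, policy.example.com/expires, holding an ISO date; neither the annotation nor the function is part of Kyverno or Gatekeeper.

```python
from datetime import date

# Hypothetical annotation key used only for this sketch.
EXPIRES_ANNOTATION = "policy.example.com/expires"


def expired_exceptions(exceptions: list, today: date) -> list:
    """Return the names of exception manifests whose expiry date has passed."""
    expired = []
    for manifest in exceptions:
        metadata = manifest.get("metadata", {})
        expires = (metadata.get("annotations") or {}).get(EXPIRES_ANNOTATION)
        if expires and date.fromisoformat(expires) < today:
            expired.append(metadata.get("name"))
    return expired
```

Failing the pipeline when this list is non-empty forces an explicit renewal or removal instead of letting exceptions linger.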

Policy Testing

Policy changes must be tested before reaching production. The testing pyramid applies: unit tests (kyverno test / conftest), integration against a Kind cluster, then graduated rollout in audit mode.

Kyverno Test Suite Structure

# tests/require-probes/kyverno-test.yaml
apiVersion: cli.kyverno.io/v1alpha1
kind: Test
metadata:
  name: require-probes-tests
policies:
- ../../library/require-probes.yaml
resources:
- good-pod.yaml
- bad-pod-no-liveness.yaml
- bad-pod-no-readiness.yaml
- excluded-namespace-pod.yaml
variables: variables.yaml
results:
- policy: require-pod-probes
  rule: check-container-probes
  kind: Pod
  resources:
  - good-pod
  result: pass
- policy: require-pod-probes
  rule: check-container-probes
  kind: Pod
  resources:
  - bad-pod-no-liveness
  - bad-pod-no-readiness
  result: fail
- policy: require-pod-probes
  rule: check-container-probes
  kind: Pod
  resources:
  - excluded-namespace-pod  # in kube-system
  result: skip

conftest for Gatekeeper Rego

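# Test Rego policy logic in isolation
# (Gatekeeper templates read input.review.object, so wrap raw manifests in a
#  review envelope first, or use Gatekeeper's gator CLI for template tests)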
conftest test pod.json \
  --policy policy/library/required-labels.rego \
  --namespace k8srequiredlabels

# Test all policies against a manifest directory
conftest test manifests/ \
  --policy policy/library/ \
  --all-namespaces

# Test a live object by piping its JSON from the cluster
kubectl get pod my-pod -o json | conftest test - \
  --policy policy/library/required-labels.rego

GitHub Actions Policy CI

name: Policy Tests
on:
  pull_request:
    paths: ["policy/**"]

jobs:
  kyverno-test:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - name: Install Kyverno CLI
      run: |
        curl -sL "https://github.com/kyverno/kyverno/releases/download/v1.12.0/kyverno-cli_v1.12.0_linux_x86_64.tar.gz" \
          | sudo tar -xz -C /usr/local/bin kyverno
    - name: Run Kyverno tests
      run: kyverno test policy/tests/

  audit-against-kind:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - name: Create Kind cluster
      uses: helm/kind-action@v1.9.0
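    - name: Install Kyverno
      run: |
        helm repo add kyverno https://kyverno.github.io/kyverno/
        helm repo update
        helm install kyverno kyverno/kyverno -n kyverno --create-namespace --wait
    - name: Apply policies in Audit mode
      run: |
        # Rewrite Enforce to Audit per file so each manifest is applied on its own
        find policy/library -name "*.yaml" -print0 | while IFS= read -r -d '' f; do
          sed 's/validationFailureAction: Enforce/validationFailureAction: Audit/' "$f" \
            | kubectl apply -f -
        done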
    - name: Apply test manifests
      run: kubectl apply -f policy/tests/fixtures/
    - name: Check policy reports
      run: |
        kubectl get policyreport -A -o json | \
          jq '.items[].results[] | select(.result=="fail") | .message' | \
          tee /tmp/violations.txt
        # Fail CI if unexpected violations
        [ -s /tmp/violations.txt ] && exit 1 || true

Graduated Rollout Strategy

1. Audit mode cluster-wide: deploy with validationFailureAction: Audit (or enforcementAction: dryrun for Gatekeeper). Let background scans evaluate all existing resources, then review the reported violations.
2. Warn mode on new namespaces: label new namespaces with the PSA warn label. Developers see warnings in kubectl output without failures.
3. Enforce on non-production first: flip to Enforce in dev/staging namespaces. Fix violations that surface.
4. Enforce on production: after a 2-week soak in staging with zero new violations, promote to production. Document the timeline in the GitOps PR.
5. Enforce globally: remove per-namespace overrides. Only named PolicyExceptions remain.

Policy as Code in GitOps

Policies must be in Git — not applied ad-hoc. The GitOps loop (see 02-gitops.html) ensures policies are version-controlled, reviewed, and automatically reconciled. Drift from approved policy triggers alerts.

Argo CD Application for Policy

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cluster-policies
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "-2"   # before workloads
spec:
  project: platform
  source:
    repoURL: https://github.com/myorg/platform
    targetRevision: main
    path: policy/
    kustomize:
      namePrefix: ""
  destination:
    server: https://kubernetes.default.svc
    namespace: kyverno
  syncPolicy:
    automated:
      prune: true
      selfHeal: true    # revert manual policy changes immediately
    syncOptions:
    - CreateNamespace=true
    - ServerSideApply=true

Policy Repo Structure

policy/
├── kustomization.yaml
├── library/
│   ├── security/
│   │   ├── disallow-privileged.yaml
│   │   ├── require-non-root.yaml
│   │   ├── drop-all-capabilities.yaml
│   │   ├── disallow-host-namespaces.yaml
│   │   └── require-seccomp.yaml
│   ├── best-practices/
│   │   ├── require-probes.yaml
│   │   ├── require-resource-limits.yaml
│   │   ├── disallow-latest-tag.yaml
│   │   └── require-labels.yaml
│   ├── supply-chain/
│   │   ├── allowed-registries.yaml
│   │   └── verify-image-signatures.yaml
│   └── networking/
│       ├── disallow-nodeport.yaml
│       └── require-ingress-tls.yaml
├── exceptions/
│   ├── kustomization.yaml
│   └── datadog-agent.yaml
└── tests/
    └── ...

Gatekeeper Config: Syncing Referential Data

# Gatekeeper Config — sync K8s resources into OPA cache
# (needed for policies that reference other objects)
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: gatekeeper-system
spec:
  sync:
    syncOnly:
    - group: ""
      version: "v1"
      kind: Namespace
    - group: ""
      version: "v1"
      kind: Service
    - group: "networking.k8s.io"
      version: "v1"
      kind: Ingress
  validation:
    traces: []
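
Synced objects become available to Rego under data.inventory. The sketch below, loosely modeled on the unique-ingress-host template from the Gatekeeper policy library, shows why the Ingress sync entry above matters: the rule compares the incoming Ingress against every Ingress already in the cache. The template name, constraint name, and message text are illustrative assumptions, and the Rego uses the classic (v0) syntax.

```yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8suniqueingresshost
spec:
  crd:
    spec:
      names:
        kind: K8sUniqueIngressHost
  targets:
  - target: admission.k8s.gatekeeper.sh
    rego: |
      package k8suniqueingresshost

      # True when both objects are the same Ingress (an update of itself)
      same_object(a, b) {
        a.metadata.namespace == b.metadata.namespace
        a.metadata.name == b.metadata.name
      }

      violation[{"msg": msg}] {
        input.review.kind.kind == "Ingress"
        host := input.review.object.spec.rules[_].host
        # Cache layout: data.inventory.namespace[<ns>][<groupVersion>][<kind>][<name>]
        other := data.inventory.namespace[other_ns]["networking.k8s.io/v1"]["Ingress"][other_name]
        other.spec.rules[_].host == host
        not same_object(other, input.review.object)
        msg := sprintf("ingress host %v is already used by %v/%v", [host, other_ns, other_name])
      }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sUniqueIngressHost
metadata:
  name: unique-ingress-host
spec:
  enforcementAction: dryrun   # start in audit-only mode
  match:
    kinds:
    - apiGroups: ["networking.k8s.io"]
      kinds: ["Ingress"]
```

Without the Ingress entry in the Config, data.inventory holds no Ingress objects and the rule never reports a violation.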

Audit & Compliance Reporting

Kyverno PolicyReport

# PolicyReport is namespaced (one per namespace in older Kyverno, one per resource in newer releases)
kubectl get policyreport -n payments-api -o yaml

# ClusterPolicyReport is cluster-scoped — for non-namespaced resources
kubectl get clusterpolicyreport -o yaml

# Summary: pass/fail/warn/error counts per policy
kubectl get policyreport -A \
  -o jsonpath='{range .items[*]}{.metadata.namespace}: {.summary}{"\n"}{end}'

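# Find all failing resources across cluster
# (.results[]? tolerates reports without results; per-resource reports record
#  the target in .scope rather than results[].resources, hence the fallback)
kubectl get policyreport -A -o json | jq '
  .items[] | .metadata.namespace as $ns | .scope as $scope |
  .results[]? | select(.result == "fail") | {
    namespace: $ns,
    resource: (.resources[0].name // $scope.name),
    policy: .policy,
    message: .message
  }'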

Policy Reporter Dashboard

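# Policy Reporter: visualization layer over PolicyReports
# Value names differ between chart major versions; check `helm show values` first.
# monitoring.enabled=true creates a ServiceMonitor.
helm repo add policy-reporter https://kyverno.github.io/policy-reporter
helm repo update
helm install policy-reporter policy-reporter/policy-reporter \
  --namespace policy-reporter \
  --create-namespace \
  --set ui.enabled=true \
  --set kyvernoPlugin.enabled=true \
  --set monitoring.enabled=true \
  --set target.slack.webhook="https://hooks.slack.com/..." \
  --set target.slack.minimumSeverity="high"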

K8s Audit Logs for Policy Actions

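# Audit policy on kube-apiserver — capture requests that may be denied at admission
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Audit rules cannot match on the response, so log every pod and deployment
# create/update at RequestResponse level and filter for denials downstream
- level: RequestResponse
  omitStages: []
  resources:
  - group: ""        # core API group
    resources: ["pods"]
  - group: "apps"    # deployments live in apps, not the core group
    resources: ["deployments"]
  verbs: ["create","update"]
  # Filter in SIEM/log pipeline for 4xx responseStatus.code values
  # (403 is common; the exact code is chosen by the denying webhook)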

Alerting & Monitoring

Gatekeeper Prometheus Metrics

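# Key metrics exposed by Gatekeeper
# (webhook metrics from gatekeeper-controller-manager, audit metrics from gatekeeper-audit)
gatekeeper_violations                           # gauge: violations found by the last audit, labeled by enforcement_action
gatekeeper_audit_last_run_time                  # gauge: Unix timestamp of last audit
gatekeeper_audit_duration_seconds               # histogram: audit run duration
gatekeeper_validation_request_count             # counter: validating webhook requests, labeled by admission_status (allow/deny)
gatekeeper_validation_request_duration_seconds  # histogram: validating webhook latency
# Exact names can vary by Gatekeeper version (for example a _total suffix on counters); confirm against /metrics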

PrometheusRule

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: policy-enforcement-alerts
  namespace: monitoring
  labels:
    prometheus: kube-prometheus
    role: alert-rules
spec:
  groups:
  - name: policy.enforcement
    interval: 60s
    rules:

    # Kyverno webhook pod health
    - alert: KyvernoWebhookPodsLow
      expr: |
        kube_deployment_status_replicas_available{
          namespace="kyverno",
          deployment="kyverno-admission-controller"
        } < 2
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Kyverno admission controller replicas below minimum"
        description: "Only {{ $value }} replica(s) available. Policy enforcement may be degraded."

    # Gatekeeper audit violations spiking
    - alert: GatekeeperHighViolations
      expr: |
        sum(gatekeeper_violations) by (enforcement_action) > 50
      for: 15m
      labels:
        severity: warning
      annotations:
        summary: "High number of Gatekeeper violations in audit"
        description: "{{ $value }} violations detected. Run 'kubectl get constraints -A' to review."

    # Policy webhook latency
    - alert: PolicyWebhookHighLatency
      expr: |
        histogram_quantile(0.99,
          sum by (le) (rate(gatekeeper_validation_request_duration_seconds_bucket[5m]))
        ) > 2
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Gatekeeper webhook P99 latency exceeds 2s"
        description: "High webhook latency may cause timeout-based admission failures."

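    # Kyverno background scan stale: no background-scan results in the last hour.
    # kyverno_policy_results_total is a counter, so staleness shows up as zero increase.
    # The rule_execution_cause label follows the Kyverno metrics docs; verify it in your version.
    - alert: KyvernoAuditStale
      expr: |
        sum(increase(kyverno_policy_results_total{
          rule_execution_cause="background_scan"
        }[1h])) == 0
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "Kyverno policy audit not running"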

    # New validate-rule failures recorded by Kyverno
    # (rule_type and rule_result are the Kyverno metric labels for rule kind and outcome)
    - alert: KyvernoCriticalPolicyViolation
      expr: |
        increase(kyverno_policy_results_total{
          rule_type="validate",
          rule_result="fail"
        }[10m]) > 0
      for: 0m
      labels:
        severity: warning
      annotations:
        summary: "New Kyverno policy violations detected"
        description: "Check 'kubectl get policyreport -A' for details."

Grafana Dashboard Panels

Panel	Query	Visualization
Violations by enforcement action	`sum by (enforcement_action) (gatekeeper_violations)`	Bar chart
Webhook admit/deny ratio	`sum by (admission_status) (rate(gatekeeper_validation_request_count[5m]))`	Stacked area
P99 webhook latency	`histogram_quantile(0.99, sum by (le) (rate(gatekeeper_validation_request_duration_seconds_bucket[5m])))`	Stat (threshold: >1s yellow, >2s red)
Kyverno pass/fail trend	`sum by (rule_result) (rate(kyverno_policy_results_total[5m]))`	Time series
Namespace PSS compliance	Custom query on namespace labels	Table

Best Practices

Audit Before Enforce

Always deploy new policies in Audit/dryrun mode first. Run for at least 2 weeks and fix all violations before flipping to Enforce. A policy that goes straight to Enforce can block existing workloads the next time they are updated, rescheduled, or scaled.
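
As a rough illustration of audit-first rollout, the sketch below shows the audit-mode switch in each engine. The policy names, the `team` label, and the parameters shape of K8sRequiredLabels are assumptions; the parameters must match whatever your ConstraintTemplate defines.

```yaml
# Kyverno: record violations in PolicyReports without blocking requests
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label          # hypothetical policy name
spec:
  validationFailureAction: Audit    # change to Enforce after the audit period
  background: true                  # also evaluate resources that already exist
  rules:
  - name: check-team-label
    match:
      any:
      - resources:
          kinds: ["Pod"]
    validate:
      message: "The label `team` is required."
      pattern:
        metadata:
          labels:
            team: "?*"              # any non-empty value
---
# Gatekeeper: dryrun records violations in the constraint status only
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-team-label          # hypothetical constraint name
spec:
  enforcementAction: dryrun         # change to deny after the audit period
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Namespace"]
  parameters:
    labels: ["team"]                # assumed schema; depends on the template
```

Flipping to enforcement is then a one-line change in Git, which keeps the rollout reviewable.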

Policy as Code in GitOps

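Store all policies in Git with Argo CD selfHeal: true on the policy Application. Manual changes to live policies are reverted automatically, usually within seconds to a few minutes depending on Argo CD's reconciliation settings, so the cluster does not stay out of sync with the reviewed state.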

Sync Waves for Policies

Install the Gatekeeper/Kyverno engine in an earlier Argo CD sync wave than the policies (for example -3 for the engine and -2 for policies). Lower waves sync first, and the policy CRDs must exist before policies can be applied. A workload in wave 0 is then validated by already-running webhooks.

PodDisruptionBudget on Policy Engine

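Set a PodDisruptionBudget with minAvailable: 2 on the Kyverno/Gatekeeper admission controllers and run at least 3 replicas so node drains can still proceed. This prevents drains from taking down all webhook replicas simultaneously, which with failurePolicy: Fail would block every matching create and update request, including new pods.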
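
A minimal PodDisruptionBudget for the Kyverno admission controller could look like the sketch below. The selector labels are assumptions based on common Helm chart conventions; confirm them with `kubectl get pods -n kyverno --show-labels` before applying.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: kyverno-admission-controller
  namespace: kyverno
spec:
  minAvailable: 2          # voluntary evictions may never drop below 2 ready pods
  selector:
    matchLabels:
      # Assumed labels; adjust to match your deployment's pod labels
      app.kubernetes.io/component: admission-controller
      app.kubernetes.io/instance: kyverno
```

A PDB only covers voluntary disruptions such as drains, so it complements rather than replaces spreading replicas across zones.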

Minimize Webhook Scope

Use namespaceSelector and objectSelector to minimize what your webhook is called for. A webhook called for every resource in the cluster adds latency proportional to your policy count.
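
The fragment below sketches how these selectors narrow a webhook's scope. It is a generic ValidatingWebhookConfiguration with hypothetical names (example-policy, policy-system, the policy.example.com/skip label). Kyverno and Gatekeeper generate their own webhook configurations, so in practice the equivalent settings usually go through their Helm values or configuration rather than a hand-written object.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-policy-webhook        # hypothetical
webhooks:
- name: validate.policy.example.com   # hypothetical
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Fail
  timeoutSeconds: 10
  clientConfig:
    service:
      name: example-policy            # hypothetical service
      namespace: policy-system
      path: /validate
  # Only call the webhook for pod creates and updates
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["pods"]
  # Skip system namespaces entirely
  namespaceSelector:
    matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values: ["kube-system", "policy-system"]
  # Skip objects that carry an explicit opt-out label
  objectSelector:
    matchExpressions:
    - key: policy.example.com/skip
      operator: DoesNotExist
```

Excluding the policy engine's own namespace also avoids a deadlock where the webhook must be running in order to admit its own pods.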

Test Policies in CI

Run kyverno test or conftest in every PR that touches the policy directory. Include both positive (compliant) and negative (violating) fixtures. Aim for >80% coverage of Rego rules, as reported by opa test --coverage.

Gate PolicyExceptions

Create a Kyverno policy that requires annotations.approved-by: platform-team on every PolicyException. Log all exception creations to your SIEM. Review quarterly and expire by date.

No Blanket Webhook Ignore

Labeling a namespace admission.gatekeeper.sh/ignore disables ALL Gatekeeper constraints in that namespace. This is an escape hatch, not standard practice. Use named PolicyExceptions or per-constraint exclusions instead.

Coverage: 05 · Policy Enforcement

Admission control architecture diagram (authentication → authorization → mutating webhooks → schema validation → validating webhooks → etcd)
ValidatingWebhookConfiguration anatomy (clientConfig, rules, namespaceSelector, failurePolicy, matchPolicy: Equivalent)
Built-in admission plugins reference table (PodSecurity/LimitRanger/ResourceQuota/DefaultStorageClass/MutatingAdmissionWebhook/ValidatingAdmissionWebhook/ValidatingAdmissionPolicy)
Pod Security Standards: Privileged / Baseline / Restricted with allowed/disallowed controls
PSA modes: enforce / audit / warn with label keys and behavior differences
Namespace labeling examples (enforce baseline + warn/audit restricted dual-mode)
kubectl dry-run PSS audit commands (check breakage before enforcing)
Compliant Pod securityContext for restricted PSS (runAsNonRoot/runAsUser/fsGroup/seccompProfile/allowPrivilegeEscalation:false/capabilities:drop:ALL/readOnlyRootFilesystem)
Gatekeeper architecture (ConstraintTemplate + Constraint two-layer model + audit controller)
Gatekeeper Helm install (replicas/auditInterval/logDenies/emitEvents)
ConstraintTemplate: K8sRequiredLabels (Rego: provided-required-missing set arithmetic)
Constraint: K8sRequiredLabels with enforcementAction/match/excludedNamespaces/parameters
ConstraintTemplate: K8sAllowedRepos (all container types: containers/initContainers/ephemeralContainers)
Gatekeeper Mutation: AssignMetadata (label injection) + Assign (security context defaults)
Reading audit violations with kubectl (describe, jsonpath, jq count by constraint)
Kyverno install (4 controllers: admission/background/cleanup/reports; features: policyExceptions/globalContext)
Kyverno ClusterPolicy: validate rule (require-pod-probes with foreach, exclude namespaces+serviceaccounts)
Kyverno ClusterPolicy: mutate rule (add-default-resources with foreach + patchStrategicMerge + JMESPath defaults)
Kyverno ClusterPolicy: generate rule (NetworkPolicy on Namespace CREATE with synchronize:true)
Kyverno ClusterPolicy: verifyImages (cosign keyless, mutateDigest:true, verifyDigest, subject regexp, rekor)
Kyverno CLI: apply, test, report commands
Kyverno test manifest (name/policies/resources/results with pass/fail/skip)
Gatekeeper vs Kyverno 13-dimension comparison table
ValidatingAdmissionPolicy (CEL, 1.30 GA): require-run-as-non-root + ValidatingAdmissionPolicyBinding
Common policy library table: 20 policies with risk, recommended action, notes
Kyverno: disallow-latest-tag (foreach deny with wildcard operator)
Kyverno: require-requests-limits (foreach deny on missing cpu/memory requests and memory limits)
PolicyException CRD: datadog-agent example with policyName+ruleNames+match+expiry pattern
Gatekeeper exemption: admission.gatekeeper.sh/ignore label + per-constraint excludedNamespaces
Exception tracking in Git (exceptions/ directory with README and dated tickets)
Kyverno test suite (kyverno-test.yaml with fixtures + good/bad/excluded resources)
conftest for Gatekeeper Rego (test, all-namespaces, parse K8s YAML to JSON)
GitHub Actions policy CI (kyverno test + Kind integration test with PolicyReport failure check)
Graduated rollout strategy: 5-step progression (audit → warn → enforce non-prod → enforce prod → global)
Argo CD Application for policies (sync-wave -2, selfHeal:true, ServerSideApply)
Policy repo structure (library/security/best-practices/supply-chain/networking + exceptions + tests)
Gatekeeper Config CRD (sync Namespace/Service/Ingress into OPA cache for cross-resource policies)
Kyverno PolicyReport and ClusterPolicyReport (get, jsonpath summary, jq fail extraction)
Policy Reporter Helm install (ui/kyvernoPlugin/monitoring/Slack target)
K8s audit policy for admission denials (RequestResponse level for pod create/update)
Gatekeeper Prometheus metrics reference (violations/audit_last_run/request_count/latency)
PrometheusRule: KyvernoWebhookPodsLow/GatekeeperHighViolations/PolicyWebhookHighLatency/KyvernoAuditStale/KyvernoCriticalPolicyViolation
Grafana dashboard panels for policy (violations/admit-deny ratio/P99 latency/pass-fail trend)
8 best practices cards (audit-before-enforce/GitOps selfHeal/sync waves/PDB/minimize scope/CI testing/gate exceptions/no blanket ignore)
