Testing Strategies
Overview
Testing Kubernetes applications requires a wider test pyramid than traditional software. In addition to unit and integration tests, you need cluster-level tests that validate Kubernetes-specific behaviour: Does the pod start? Does the probe work? Does the NetworkPolicy allow the right traffic? Does the HPA scale correctly?
            ┌──────────────┐
            │ Chaos Tests  │  ← infrastructure resilience
            └──────┬───────┘
           ┌───────┴────────┐
           │   E2E Tests    │  ← full user journeys in cluster
           └───────┬────────┘
          ┌────────┴─────────┐
          │  Contract Tests  │  ← API consumer/producer contracts
          └────────┬─────────┘
     ┌─────────────┴─────────────┐
     │     Integration Tests     │  ← service + DB + K8s primitives
     └─────────────┬─────────────┘
┌──────────────────┴──────────────────┐
│             Unit Tests              │  ← pure logic, no I/O
└─────────────────────────────────────┘
Unit Tests
Unit tests run without a cluster. They cover pure business logic, parsing, and transformations. Target: under 1 second per test and at least 80% coverage on core logic.
// Go example: table-driven test of pure business logic (no cluster, no I/O).
// The package name is assumed from the ./internal/payments path used below.

package payments

import "testing"

func TestCalculatePaymentFee(t *testing.T) {
	tests := []struct {
		name     string
		amount   int64
		currency string
		want     int64
	}{
		{"USD small", 1000, "USD", 29},
		{"EUR small", 1000, "EUR", 25},
		{"USD large", 100000, "USD", 230},
	}
	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			got := CalculatePaymentFee(tc.amount, tc.currency)
			if got != tc.want {
				t.Errorf("CalculatePaymentFee(%d, %s) = %d, want %d",
					tc.amount, tc.currency, got, tc.want)
			}
		})
	}
}
# Run unit tests with race detection and coverage
go test ./... -race -coverprofile=coverage.out -covermode=atomic
# View coverage report
go tool cover -html=coverage.out
# Run specific package
go test ./internal/payments/... -v -run TestCalculatePaymentFee
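When business logic depends on an external service, a unit test can stay cluster-free by injecting the dependency through an interface and substituting an in-memory fake. The following is a minimal sketch of that idea; `RateProvider`, `fakeRates`, and `ConvertMinorUnits` are hypothetical names invented for illustration and are not part of the payments service described here.

```go
package main

import (
	"errors"
	"fmt"
)

// RateProvider is a hypothetical dependency that would normally call an
// external exchange-rate service.
type RateProvider interface {
	Rate(from, to string) (float64, error)
}

// fakeRates is an in-memory test double; it needs no network access.
type fakeRates map[string]float64

func (f fakeRates) Rate(from, to string) (float64, error) {
	r, ok := f[from+"/"+to]
	if !ok {
		return 0, errors.New("no rate for " + from + "/" + to)
	}
	return r, nil
}

// ConvertMinorUnits converts a non-negative amount in minor units (for
// example cents) using the injected provider, rounding half up.
func ConvertMinorUnits(p RateProvider, amount int64, from, to string) (int64, error) {
	rate, err := p.Rate(from, to)
	if err != nil {
		return 0, fmt.Errorf("convert %s to %s: %w", from, to, err)
	}
	return int64(float64(amount)*rate + 0.5), nil
}

func main() {
	rates := fakeRates{"USD/EUR": 0.5}
	got, err := ConvertMinorUnits(rates, 1000, "USD", "EUR")
	fmt.Println(got, err)
}
```

Because the fake satisfies the same interface as the real client, the production code path is exercised unchanged while the test remains fast and deterministic.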
Integration Tests
Integration tests verify that your service interacts correctly with real dependencies (a real database, a real message queue, a real Redis) without needing a full Kubernetes cluster. Use testcontainers-go or Docker Compose to spin up the dependencies.
testcontainers-go
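//go:build integration

package integration_test

import (
	"context"
	"database/sql"
	"testing"
	"time"

	// Assumption: lib/pq provides the "postgres" database/sql driver.
	_ "github.com/lib/pq"

	"github.com/testcontainers/testcontainers-go"
	"github.com/testcontainers/testcontainers-go/modules/postgres"
	"github.com/testcontainers/testcontainers-go/wait"
)

// runMigrations, NewPaymentRepository, and Payment are assumed to come from
// the service's own packages; import them as needed.

func TestPaymentRepository(t *testing.T) {
	ctx := context.Background()

	// Spin up a real Postgres container.
	// Newer testcontainers-go releases deprecate RunContainer in favour of postgres.Run.
	pgContainer, err := postgres.RunContainer(ctx,
		testcontainers.WithImage("postgres:16-alpine"),
		postgres.WithDatabase("payments_test"),
		postgres.WithUsername("test"),
		postgres.WithPassword("test"),
		testcontainers.WithWaitStrategy(
			wait.ForLog("database system is ready to accept connections").
				WithOccurrence(2).
				WithStartupTimeout(30*time.Second),
		),
	)
	if err != nil {
		t.Fatalf("start postgres container: %v", err)
	}
	t.Cleanup(func() {
		if err := pgContainer.Terminate(ctx); err != nil {
			t.Logf("terminate postgres container: %v", err)
		}
	})

	connStr, err := pgContainer.ConnectionString(ctx, "sslmode=disable")
	if err != nil {
		t.Fatalf("get connection string: %v", err)
	}

	// Run migrations
	db, err := sql.Open("postgres", connStr)
	if err != nil {
		t.Fatalf("open database: %v", err)
	}
	defer db.Close()
	runMigrations(db)

	// Test the repository
	repo := NewPaymentRepository(db)
	payment, err := repo.Create(ctx, &Payment{Amount: 1000, Currency: "USD"})
	if err != nil {
		t.Fatalf("Create failed: %v", err)
	}
	if payment.ID == "" {
		t.Error("expected non-empty payment ID")
	}
}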
# Run integration tests (requires Docker)
go test ./integration/... -tags=integration -timeout=120s
Cluster-Level Tests with envtest
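controller-runtime/envtest starts a real API server and etcd binary locally, with no cluster needed, and lets you test controllers, webhooks, and admission logic against a real Kubernetes API. The binaries are typically installed with setup-envtest and located through the KUBEBUILDER_ASSETS environment variable. envtest runs no kubelet and no built-in controllers, so Pods are never scheduled or started; questions such as "does the pod start?" still need a real cluster (see the E2E sections).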
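package controllers_test

import (
	"context"
	"path/filepath"
	"testing"
	"time"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes/scheme"
	"k8s.io/client-go/rest"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/envtest"

	// Hypothetical import path; point this at your own API package.
	paymentv1 "example.com/payments/api/v1"
)

var (
	testEnv   *envtest.Environment
	cfg       *rest.Config
	k8sClient client.Client
	ctx       context.Context
	cancel    context.CancelFunc
)

func TestControllers(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "Controller Suite")
}

var _ = BeforeSuite(func() {
	ctx, cancel = context.WithCancel(context.Background())

	testEnv = &envtest.Environment{
		CRDDirectoryPaths: []string{
			filepath.Join("..", "config", "crd", "bases"),
		},
		ErrorIfCRDPathMissing: true,
	}

	var err error
	cfg, err = testEnv.Start()
	Expect(err).NotTo(HaveOccurred())

	// Register the Payment types so the client can encode and decode them.
	// AddToScheme is assumed to exist, as in kubebuilder-generated API packages.
	Expect(paymentv1.AddToScheme(scheme.Scheme)).To(Succeed())

	k8sClient, err = client.New(cfg, client.Options{Scheme: scheme.Scheme})
	Expect(err).NotTo(HaveOccurred())

	// The PaymentReconciler must also be started here (for example with a
	// controller-runtime manager running in a goroutine); without it the
	// ConfigMap asserted below is never created.
})

var _ = AfterSuite(func() {
	cancel()
	Expect(testEnv.Stop()).To(Succeed())
})

var _ = Describe("PaymentReconciler", func() {
	It("creates a ConfigMap when a Payment is created", func() {
		payment := &paymentv1.Payment{
			ObjectMeta: metav1.ObjectMeta{Name: "test-payment", Namespace: "default"},
			Spec:       paymentv1.PaymentSpec{Amount: 100},
		}
		Expect(k8sClient.Create(ctx, payment)).To(Succeed())

		cm := &corev1.ConfigMap{}
		Eventually(func() error {
			return k8sClient.Get(ctx, types.NamespacedName{
				Name: "payment-config", Namespace: "default",
			}, cm)
		}, 10*time.Second, 100*time.Millisecond).Should(Succeed())
	})
})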
E2E Tests — kubectl/kuttl
kuttl (KUbernetes Test TooL) runs test cases as YAML files against a real cluster. It applies manifests, asserts state, and cleans up.
# tests/e2e/payment-flow/00-create-payment.yaml
apiVersion: kuttl.dev/v1beta1
kind: TestStep
apply:
  - payment.yaml
---
# tests/e2e/payment-flow/01-assert.yaml
apiVersion: kuttl.dev/v1beta1
kind: TestAssert
timeout: 60
collectors:
  - type: pod
    selector: app=payments-api
---
# Assert the payment object reaches Ready state
apiVersion: payments.acme.com/v1
kind: Payment
metadata:
  name: test-payment
status:
  phase: Ready
# Install kuttl
kubectl krew install kuttl
# Run E2E tests against current cluster context
kubectl kuttl test --config kuttl-test.yaml
# kuttl-test.yaml
apiVersion: kuttl.dev/v1beta1
kind: TestSuite
testDirs:
- ./tests/e2e
startKIND: false # use existing cluster context
timeout: 120
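A kuttl assert file lists only the fields you care about: the step passes once the live object contains every listed field with the same value, and extra fields on the live object are ignored. The sketch below illustrates that partial-match idea on plain maps. `matchesSubset` is a hypothetical helper, and the list handling is deliberately simplified (lists must match exactly here), so treat it as an illustration rather than kuttl's actual matching code.

```go
package main

import (
	"fmt"
	"reflect"
)

// matchesSubset reports whether every field in expected is present in actual
// with an equal value. Nested maps are compared recursively; all other
// values, including lists, must be deeply equal.
func matchesSubset(expected, actual map[string]any) bool {
	for key, want := range expected {
		got, ok := actual[key]
		if !ok {
			return false
		}
		wantMap, wantIsMap := want.(map[string]any)
		gotMap, gotIsMap := got.(map[string]any)
		if wantIsMap && gotIsMap {
			if !matchesSubset(wantMap, gotMap) {
				return false
			}
			continue
		}
		if !reflect.DeepEqual(want, got) {
			return false
		}
	}
	return true
}

func main() {
	// Mirrors the assert file: only kind and status.phase are checked.
	expected := map[string]any{
		"kind":   "Payment",
		"status": map[string]any{"phase": "Ready"},
	}
	live := map[string]any{
		"kind":     "Payment",
		"metadata": map[string]any{"name": "test-payment", "uid": "abc"},
		"status":   map[string]any{"phase": "Ready", "observedGeneration": 1},
	}
	fmt.Println(matchesSubset(expected, live))
}
```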
E2E Tests with Chainsaw (kuttl successor)
Chainsaw is the next-generation test tool from the Kyverno project, with better assertion syntax and parallel test execution.
# chainsaw-test.yaml
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  name: payment-e2e
spec:
  steps:
    - name: create-payment
      try:
        - apply:
            file: payment.yaml
        - assert:
            file: payment-ready.yaml
        # Runs on the machine executing chainsaw, so the Service must be
        # reachable from there (for example, when chainsaw runs in-cluster)
        - command:
            entrypoint: curl
            args: ["-sf", "http://payments-api.staging:8080/healthz"]
      catch:
        - describe:
            apiVersion: payments.acme.com/v1
            kind: Payment
        - podLogs:
            selector: app=payments-api
      finally:
        - delete:
            file: payment.yaml
# Run chainsaw tests
chainsaw test --test-dir ./tests/e2e
Helm Test Integration
After every Helm deploy in CI, run helm test to validate the release:
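helm upgrade --install payments-api ./charts/payments-api \
  --namespace staging \
  --values values-staging.yaml \
  --wait \
  --timeout 5m
# Run helm tests immediately after
helm test payments-api --namespace staging --logs
# A failed helm test exits non-zero, which fails the CI job. It does not roll
# back the release: --atomic only covers a failed upgrade, so run
# `helm rollback payments-api --namespace staging` if you want to revert.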
Policy Tests with Kyverno
Test Kyverno policies without a running cluster using kyverno test:
# kyverno-test/kyverno-test.yaml
# Uses the current Kyverno CLI test schema; kind: Pod is assumed from the
# resource file names.
apiVersion: cli.kyverno.io/v1alpha1
kind: Test
metadata:
  name: require-resource-limits-test
policies:
  - ../../policies/require-resource-limits.yaml
resources:
  - resources/compliant-pod.yaml
  - resources/non-compliant-pod.yaml
results:
  - policy: require-resource-limits
    rule: check-resource-limits
    kind: Pod
    resources:
      - compliant-pod
    result: pass
  - policy: require-resource-limits
    rule: check-resource-limits
    kind: Pod
    resources:
      - non-compliant-pod
    result: fail
kyverno test kyverno-test/
Chaos Testing
Chaos tests verify your application survives infrastructure failures. See 09 — Disaster Recovery for the full chaos toolchain. Below is the developer-facing subset for per-service chaos testing.
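Chaos Mesh Pod Kill
# Kill one random payments-api pod every 5 minutes while this resource exists.
# Chaos Mesh 2.x expresses recurring experiments with the Schedule kind; the
# inline scheduler field belongs to the 1.x API. The test window is bounded by
# applying and deleting the resource.
apiVersion: chaos-mesh.org/v1alpha1
kind: Schedule
metadata:
  name: payments-api-pod-kill
  namespace: chaos-testing
spec:
  schedule: "@every 5m"
  type: PodChaos
  historyLimit: 5
  concurrencyPolicy: Forbid
  podChaos:
    action: pod-kill
    mode: one
    selector:
      namespaces: [staging]
      labelSelectors:
        app: payments-api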
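# Apply chaos, run load test, verify no errors
kubectl apply -f pod-kill-chaos.yaml
# CLI flags take precedence over the stages defined in the script, so set --vus explicitly
k6 run --vus 50 --duration 10m load-test.js
kubectl delete -f pod-kill-chaos.yaml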
Network Latency Injection
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: payments-latency
  namespace: chaos-testing
spec:
  action: delay
  mode: all
  selector:
    namespaces: [staging]
    labelSelectors:
      app: payments-api
  delay:
    latency: "100ms"
    correlation: "25"
    jitter: "50ms"
  duration: "5m"
  direction: to # delay packets sent by the selected pods (the default direction)
Load Testing with k6
// load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';

const errorRate = new Rate('errors');

export const options = {
  stages: [
    { duration: '2m', target: 50 },  // ramp up to 50 VUs
    { duration: '5m', target: 50 },  // hold at 50 VUs
    { duration: '2m', target: 100 }, // ramp up to 100 VUs
    { duration: '5m', target: 100 }, // hold at 100 VUs
    { duration: '2m', target: 0 },   // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(99)<500'], // 99th percentile < 500ms
    errors: ['rate<0.01'],            // error rate < 1%
  },
};

export default function () {
  const res = http.post(
    'http://payments-api.staging.svc.cluster.local:8080/payments',
    JSON.stringify({ amount: 1000, currency: 'USD', idempotency_key: `key-${__VU}-${__ITER}` }),
    { headers: { 'Content-Type': 'application/json' } }
  );
  check(res, {
    'status is 201': (r) => r.status === 201,
    'response time < 200ms': (r) => r.timings.duration < 200,
  });
  // Status 0 means the request never completed (for example, a refused
  // connection while a pod restarts), so count it as an error too
  errorRate.add(res.status === 0 || res.status >= 400);
  sleep(0.1);
}
# Run k6 from a pod inside the cluster so the Service DNS name resolves
# (-i without -t, because stdin is a file rather than a terminal)
kubectl run k6 --image=grafana/k6:latest --rm -i --restart=Never -- \
  run - < load-test.js
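The two thresholds in the script are the pass/fail criteria for the run: the 99th-percentile request duration must stay under 500 ms and the error rate under 1%. The sketch below shows that evaluation on raw samples. `percentile` and `thresholdsPass` are hypothetical helpers, and the nearest-rank percentile used here is an assumption made for simplicity; k6 may compute percentiles differently, so values can differ slightly on small samples.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0 < p <= 100) of samples using the
// nearest-rank method. It returns 0 for an empty input.
func percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...) // copy so the caller's slice is not reordered
	sort.Float64s(sorted)
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

// thresholdsPass mirrors the thresholds in the script: p(99) below 500 ms
// and an error rate below 1%. A run with no requests is treated as a failure.
func thresholdsPass(durationsMs []float64, failed, total int) bool {
	if total == 0 {
		return false
	}
	errorRate := float64(failed) / float64(total)
	return percentile(durationsMs, 99) < 500 && errorRate < 0.01
}

func main() {
	durations := []float64{120, 180, 90, 450, 200}
	fmt.Println(percentile(durations, 99), thresholdsPass(durations, 0, len(durations)))
}
```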
CI Test Stages Summary
| Stage | Tool | Runs on | Blocks merge? |
|---|---|---|---|
| Unit tests | go test / jest / pytest | Every PR | Yes |
| Lint + type check | golangci-lint / eslint / mypy | Every PR | Yes |
| Security scan (code) | gosec / semgrep | Every PR | Yes |
| Image build | BuildKit | Every PR | Yes |
| Image scan (CVE) | Trivy | Every PR | Yes (on CRITICAL findings) |
| Policy test | kyverno test | Every PR | Yes |
| Integration tests | testcontainers | Every PR | Yes |
| Helm test | helm test | Post-deploy to staging | Yes |
| E2E tests | kuttl / chainsaw | Post-deploy to staging | Yes |
| Load test | k6 | Scheduled / release branch | No (informational) |
| Chaos test | chaos-mesh | Scheduled weekly | No |
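
The "Blocks merge?" column can be read as a simple gating rule: a failure only stops the pipeline if its stage is marked as blocking. The sketch below encodes the table as data and applies that rule. `Stage` and `blockingFailures` are hypothetical names, and the image scan is simplified to a plain blocking stage even though the table restricts it to CRITICAL findings.

```go
package main

import "fmt"

// Stage is one row of the CI table: a name and whether a failure blocks.
type Stage struct {
	Name     string
	Blocking bool
}

// stages mirrors the summary table above.
var stages = []Stage{
	{"Unit tests", true},
	{"Lint + type check", true},
	{"Security scan (code)", true},
	{"Image build", true},
	{"Image scan (CVE)", true}, // simplification: the table blocks only on CRITICAL
	{"Policy test", true},
	{"Integration tests", true},
	{"Helm test", true},
	{"E2E tests", true},
	{"Load test", false},
	{"Chaos test", false},
}

// blockingFailures returns the names of failed stages that should block,
// in pipeline order. Failures of informational stages are ignored.
func blockingFailures(stages []Stage, failed map[string]bool) []string {
	var out []string
	for _, s := range stages {
		if s.Blocking && failed[s.Name] {
			out = append(out, s.Name)
		}
	}
	return out
}

func main() {
	failed := map[string]bool{"Unit tests": true, "Load test": true}
	fmt.Println(blockingFailures(stages, failed))
}
```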
Related
- Local Development — running tests against local cluster
- CI/CD Pipelines — where tests plug into the pipeline
- Progressive Delivery — using test results to gate canary promotion
- Disaster Recovery — full chaos testing