API Aggregation
API Aggregation is the mechanism by which Kubernetes allows third-party API servers to be served through the same kube-apiserver endpoint, making them appear as native Kubernetes APIs. An aggregated API server (AA server) registers itself with the main API server via an APIService object. From then on, requests to that API group/version are proxied transparently to the extension server.
This is one of two primary extension points for adding new API types to Kubernetes. The other is Custom Resource Definitions (CRDs), covered in Platform Engineering §CRDs. Understanding the trade-offs between them is essential for platform engineers.
Motivation and History
Before API Aggregation (introduced in Kubernetes 1.7), the only way to add new resource types was to fork kube-apiserver or use ThirdPartyResources (TPR), a predecessor to CRDs. The aggregation layer solved several problems:
- Arbitrary storage backends — the extension server can store data in any database, not just etcd.
- Custom validation and admission beyond what CRDs can express — the extension server implements its own logic in code, not via CEL or OpenAPI schemas.
- Subresource semantics — e.g., /scale, /status, /exec with complex behaviors that CRD subresources cannot replicate.
- kubectl and API discovery compatibility — clients see the aggregated APIs as first-class API groups via /apis discovery, supporting kubectl, client-go, and informers natively.
The most prominent aggregated API server in core Kubernetes today is the Metrics Server, which serves metrics.k8s.io/v1beta1 and is used by kubectl top and the HPA controller. Another major consumer is the Service Catalog (now largely replaced by other patterns).
Aggregation Layer Architecture
The aggregation layer is implemented by the kube-aggregator library, which is compiled directly into kube-apiserver. It handles all API requests first, routing built-in API groups to local handlers and aggregated groups to the registered extension servers.
The APIService Object
An APIService is a cluster-scoped resource in the apiregistration.k8s.io/v1 group. It maps an API group/version to a backend Service or marks it as local (built-in).
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io   # Convention: <version>.<group>
spec:
  service:
    namespace: kube-system
    name: metrics-server
    port: 443                    # Port on the Service
  group: metrics.k8s.io
  version: v1beta1
  groupPriorityMinimum: 100      # Higher = preferred in API discovery
  versionPriority: 100           # Higher = shown first among versions
  insecureSkipTLSVerify: false   # NEVER true in production
  caBundle:                      # CA that signed the extension server cert
status:
  conditions:
  - type: Available
    status: "True"
    reason: Passed
    message: all checks passed
APIService for Local (Built-in) Groups
Built-in API groups such as v1 (core), apps/v1, batch/v1 also have APIService objects — but with no service field. This means "handle locally":
kubectl get apiservices | head -20
# NAME SERVICE AVAILABLE AGE
# v1. Local True 300d
# v1.apps Local True 300d
# v1.batch Local True 300d
# v1beta1.metrics.k8s.io kube-system/metrics-server True 120d
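The naming convention and the Local marker in the listing above can be sketched in a few lines of Go. This is an illustrative model only: the apiService and serviceRef types are hypothetical stand-ins, not the real apiregistration.k8s.io types.

```go
package main

import "fmt"

// serviceRef is a hypothetical stand-in for the spec.service reference.
type serviceRef struct {
	Namespace string
	Name      string
}

// apiService is a hypothetical, trimmed-down model of an APIService.
type apiService struct {
	Group   string      // empty for the legacy core group
	Version string
	Service *serviceRef // nil means "handle locally"
}

// name builds the conventional object name, <version>.<group>.
// The core group has an empty name, which yields "v1.".
func (a apiService) name() string {
	return a.Version + "." + a.Group
}

// backend mirrors the SERVICE column printed by kubectl get apiservices.
func (a apiService) backend() string {
	if a.Service == nil {
		return "Local"
	}
	return a.Service.Namespace + "/" + a.Service.Name
}

func main() {
	services := []apiService{
		{Group: "", Version: "v1"},
		{Group: "apps", Version: "v1"},
		{Group: "metrics.k8s.io", Version: "v1beta1",
			Service: &serviceRef{Namespace: "kube-system", Name: "metrics-server"}},
	}
	for _, s := range services {
		fmt.Printf("%-28s %s\n", s.name(), s.backend())
	}
}
```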
Request Flow Through the Aggregation Layer
Step by step for a request to GET /apis/metrics.k8s.io/v1beta1/nodes:
- AuthN — kube-apiserver authenticates the client using its normal authenticators (OIDC, client cert, service account token). See kube-apiserver §Authentication.
- AuthZ — RBAC is evaluated against the aggregated API group/resource, same as for native resources. The RBAC rules are stored in etcd and enforced by kube-apiserver, not the extension server.
- Admission — kube-apiserver's own admission chain does not run for proxied requests. Extension servers built on k8s.io/apiserver run admission themselves on write requests, including MutatingAdmissionWebhook and ValidatingAdmissionWebhook, so webhooks can still be registered for aggregated types.
- Route lookup — The aggregation layer matches the URL prefix (/apis/metrics.k8s.io) against the APIService registry.
- Proxy to extension server — kube-apiserver forwards the request via an HTTPS reverse proxy. It adds X-Remote-User, X-Remote-Group, and X-Remote-Extra-* headers identifying the original caller.
- Extension server re-validates identity — The extension server reads the request headers and validates them using the extension-apiserver-authentication ConfigMap.
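The route lookup step can be illustrated with a small, self-contained sketch. It assumes a simplified registry keyed by group/version; the registry and groupVersion types and the route function are hypothetical names for this example, not the kube-aggregator implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// groupVersion identifies an API group/version pair.
type groupVersion struct {
	Group   string
	Version string
}

// registry maps each registered group/version to its backend:
// "Local" for built-in groups, "<namespace>/<service>" for aggregated ones.
type registry map[groupVersion]string

// parseGroupVersion extracts the group/version from a request path.
// /api/<version>/... is the legacy core group (empty group name);
// /apis/<group>/<version>/... covers every named group.
func parseGroupVersion(path string) (groupVersion, bool) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	switch {
	case len(parts) >= 2 && parts[0] == "api":
		return groupVersion{Group: "", Version: parts[1]}, true
	case len(parts) >= 3 && parts[0] == "apis":
		return groupVersion{Group: parts[1], Version: parts[2]}, true
	}
	return groupVersion{}, false
}

// route returns the backend registered for the path, if any.
func (r registry) route(path string) (string, bool) {
	gv, ok := parseGroupVersion(path)
	if !ok {
		return "", false
	}
	backend, ok := r[gv]
	return backend, ok
}

func main() {
	reg := registry{
		{Group: "", Version: "v1"}:                    "Local",
		{Group: "apps", Version: "v1"}:                "Local",
		{Group: "metrics.k8s.io", Version: "v1beta1"}: "kube-system/metrics-server",
	}
	paths := []string{
		"/api/v1/pods",
		"/apis/metrics.k8s.io/v1beta1/nodes",
		"/apis/unknown.example.com/v1/things",
	}
	for _, p := range paths {
		backend, ok := reg.route(p)
		fmt.Printf("%-40s -> %q (registered=%v)\n", p, backend, ok)
	}
}
```

A request whose group/version has no APIService entry is simply not routable, which is why a missing or deleted APIService makes the whole group disappear for clients.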
Authentication Delegation
Because the extension server receives proxied requests (not the original client TLS certificate), it needs a way to trust the identity headers injected by kube-apiserver. This is done via delegated authentication:
extension-apiserver-authentication ConfigMap
# Contains the CA and requestheader configuration
kubectl get configmap extension-apiserver-authentication -n kube-system -o yaml
data:
  client-ca-file: |
    -----BEGIN CERTIFICATE-----
    ... (cluster CA — used to verify client certs in direct calls)
    -----END CERTIFICATE-----
  requestheader-allowed-names: '["front-proxy-client"]'
  requestheader-client-ca-file: |
    -----BEGIN CERTIFICATE-----
    ... (front-proxy CA — used to verify kube-apiserver's proxy cert)
    -----END CERTIFICATE-----
  requestheader-extra-headers-prefix: '["X-Remote-Extra-"]'
  requestheader-group-headers: '["X-Remote-Group"]'
  requestheader-username-headers: '["X-Remote-User"]'
The extension server uses this ConfigMap to:
- Verify that the proxying entity (kube-apiserver) presents a TLS certificate signed by the requestheader-client-ca.
- Only trust the identity headers (X-Remote-User, X-Remote-Group) when the proxy's certificate CN matches requestheader-allowed-names.
- For direct calls (without a proxy), verify the client cert against client-ca-file.
An extension server that trusts X-Remote-User without verifying that the request came from a trusted proxy is vulnerable to impersonation. Any client with network access to the extension server pod could set arbitrary headers and bypass authentication entirely.
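The trust rule can be sketched as a pure function. This is a simplified illustration, not the k8s.io/apiserver implementation: authenticateProxied and userInfo are hypothetical names, and the caller is assumed to pass the CN of a client certificate that the TLS layer has already verified against requestheader-client-ca-file (an empty string means no verified certificate).

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// userInfo is a hypothetical container for the proxied identity.
type userInfo struct {
	Name   string
	Groups []string
	Extra  map[string][]string
}

// authenticateProxied returns the identity carried in the X-Remote-* headers,
// but only when the verified peer CN is listed in allowedNames.
// An empty allowedNames list accepts any verified certificate.
func authenticateProxied(peerCN string, allowedNames []string, h http.Header) (userInfo, error) {
	if peerCN == "" {
		return userInfo{}, errors.New("no verified front-proxy certificate: identity headers ignored")
	}
	allowed := len(allowedNames) == 0
	for _, n := range allowedNames {
		if n == peerCN {
			allowed = true
			break
		}
	}
	if !allowed {
		return userInfo{}, fmt.Errorf("certificate CN %q is not an allowed front proxy", peerCN)
	}
	name := h.Get("X-Remote-User")
	if name == "" {
		return userInfo{}, errors.New("missing X-Remote-User header")
	}
	u := userInfo{Name: name, Groups: h.Values("X-Remote-Group"), Extra: map[string][]string{}}
	for key, vals := range h {
		if strings.HasPrefix(key, "X-Remote-Extra-") {
			u.Extra[strings.ToLower(strings.TrimPrefix(key, "X-Remote-Extra-"))] = vals
		}
	}
	return u, nil
}

func main() {
	h := http.Header{}
	h.Set("X-Remote-User", "alice")
	h.Add("X-Remote-Group", "dev")
	h.Add("X-Remote-Group", "system:authenticated")

	// Trusted proxy: headers are honoured.
	fmt.Println(authenticateProxied("front-proxy-client", []string{"front-proxy-client"}, h))
	// Direct caller without a verified proxy certificate: headers are rejected.
	fmt.Println(authenticateProxied("", []string{"front-proxy-client"}, h))
}
```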
Authorization Delegation
Extension servers delegate authorization back to kube-apiserver via the SubjectAccessReview API:
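// Extension server delegates authz to kube-apiserver.
// Assumes authorizationv1 = k8s.io/api/authorization/v1 and
// metav1 = k8s.io/apimachinery/pkg/apis/meta/v1 are imported.
sar := &authorizationv1.SubjectAccessReview{
	Spec: authorizationv1.SubjectAccessReviewSpec{
		User: r.Header.Get("X-Remote-User"),
		// Each group arrives as its own X-Remote-Group header value.
		Groups: r.Header.Values("X-Remote-Group"),
		ResourceAttributes: &authorizationv1.ResourceAttributes{
			// Nodes are cluster-scoped, so Namespace is left empty.
			Verb:     "get",
			Group:    "metrics.k8s.io",
			Resource: "nodes",
		},
	},
}
result, err := k8sClient.AuthorizationV1().SubjectAccessReviews().Create(ctx, sar, metav1.CreateOptions{})
if err != nil {
	// Fail closed: an unreachable authorizer must not grant access.
	http.Error(w, "Internal Server Error", http.StatusInternalServerError)
	return
}
if !result.Status.Allowed {
	http.Error(w, "Forbidden", http.StatusForbidden)
	return
}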
This means RBAC rules written for the aggregated API group are evaluated by kube-apiserver's authorizer, not the extension server. The extension server just asks kube-apiserver "is this user allowed to do X?"
Building an Aggregated API Server
The apiserver-builder (now superseded by apiserver-runtime) and the reference implementation in k8s.io/apiserver provide a framework for building extension servers. The extension server is effectively a small kube-apiserver for your own API group, sharing most of the same machinery.
Core Libraries
| Library | Purpose |
|---|---|
| k8s.io/apiserver/pkg/server | Generic API server framework — HTTP server, request pipeline, audit, authentication delegation |
| k8s.io/apiserver/pkg/registry | REST storage interfaces — rest.Storage, rest.Getter, rest.Creater, etc. |
| k8s.io/apiserver/pkg/admission | Admission framework for the extension server's own admission plugins |
| k8s.io/apiserver/plugin/pkg/authenticator | Delegated authentication implementation reading from extension-apiserver-authentication |
| k8s.io/apiserver/plugin/pkg/authorizer/webhook | Delegated authorization via SubjectAccessReview |
| sigs.k8s.io/apiserver-runtime | Higher-level framework — builder pattern, auto-generates boilerplate |
Minimal Extension Server Skeleton
package main

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apiserver/pkg/registry/rest"
	genericapiserver "k8s.io/apiserver/pkg/server"
	"k8s.io/apiserver/pkg/server/options"
)

// Scheme, Codecs, GroupName, SchemeGroupVersion, NewWidgetStorage,
// NewWidgetStatusStorage, client and stopCh are defined elsewhere in the project.

func main() {
	// Standard recommended options (etcd, security, audit, features)
	recommended := options.NewRecommendedOptions("registry", Codecs.LegacyCodec(SchemeGroupVersion))
	recommended.Etcd.StorageConfig.EncodeVersioner = ...
	recommended.SecureServing.BindPort = 6443

	// Build server config from options
	config := genericapiserver.NewRecommendedConfig(Codecs)
	if err := recommended.ApplyTo(config); err != nil {
		panic(err)
	}

	// Create the generic API server
	genericServer, err := config.Complete().New("my-extension-server", genericapiserver.NewEmptyDelegate())
	if err != nil {
		panic(err)
	}

	// Install API groups
	apiGroupInfo := genericapiserver.NewDefaultAPIGroupInfo(GroupName, Scheme, metav1.ParameterCodec, Codecs)
	apiGroupInfo.VersionedResourcesStorageMap["v1"] = map[string]rest.Storage{
		"widgets":        NewWidgetStorage(client),
		"widgets/status": NewWidgetStatusStorage(client),
	}
	if err := genericServer.InstallAPIGroup(&apiGroupInfo); err != nil {
		panic(err)
	}

	// Run until stopCh is closed
	if err := genericServer.PrepareRun().Run(stopCh); err != nil {
		panic(err)
	}
}
Storage Backends
Extension servers are not limited to etcd. Common storage patterns:
etcd (via generic server)
Use registry.NewStore with an etcd3.New backend. This is the default for AA servers that want the same storage as core resources — consistent watch semantics, MVCC, and encryption at rest.
In-memory / Computed
Metrics Server stores nothing — it computes resource usage by querying kubelet's /stats/summary endpoint and returns the result directly. Ideal for read-only aggregation APIs.
External database
ServiceCatalog stored OSB instance data in its own database. Any store implementing rest.Storage works — PostgreSQL, Cassandra, etc.
Proxy to another API
The extension server translates Kubernetes API requests into calls to another system (e.g., an external metrics system, a VM manager). The AA layer makes them look like native Kubernetes resources.
APIService Availability and Health Checks
The aggregation layer continuously monitors the health of registered extension servers. Each APIService has an Available condition. kube-apiserver sets it by periodically requesting the group/version discovery document (GET /apis/<group>/<version>) from the extension server, via the registered Service.
# Check APIService availability
kubectl get apiservices
# NAME SERVICE AVAILABLE AGE
# v1beta1.metrics.k8s.io kube-system/metrics-server True 45d
# v1alpha1.custom.example.com my-ns/my-extension False 2d
# Describe to see why it's unavailable
kubectl describe apiservice v1alpha1.custom.example.com
# Status:
# Conditions:
# Last Transition Time: 2026-01-10T14:23:00Z
# Message: failing or missing response from
# https://10.96.50.32:443/apis/custom.example.com/v1alpha1: Get
# "https://10.96.50.32:443/apis/custom.example.com/v1alpha1": dial
# tcp 10.96.50.32:443: connect: connection refused
# Reason: FailedDiscoveryCheck
# Status: False
# Type: Available
When an APIService reports Available: False, kube-apiserver will return errors for requests to that group/version. This also breaks kubectl api-resources if the unavailability causes a discovery request to hang: some clients use a short timeout and silently skip unavailable groups, while others will surface the error.
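How a single probe result turns into the Available condition can be sketched as a small mapping function. The Passed and FailedDiscoveryCheck reasons and their messages come from the examples in this document; the availableCondition function, the condition type, and the "bad status" wording are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// condition is a hypothetical, trimmed-down APIService condition.
type condition struct {
	Status  string
	Reason  string
	Message string
}

// availableCondition maps the outcome of one probe request to a condition.
// A transport error or a non-2xx status marks the APIService unavailable.
func availableCondition(url string, statusCode int, err error) condition {
	switch {
	case err != nil:
		return condition{
			Status:  "False",
			Reason:  "FailedDiscoveryCheck",
			Message: fmt.Sprintf("failing or missing response from %s: %v", url, err),
		}
	case statusCode < 200 || statusCode >= 300:
		return condition{
			Status:  "False",
			Reason:  "FailedDiscoveryCheck",
			Message: fmt.Sprintf("failing or missing response from %s: bad status %d", url, statusCode),
		}
	}
	return condition{Status: "True", Reason: "Passed", Message: "all checks passed"}
}

func main() {
	url := "https://10.96.50.32:443" // example Service address from the output above
	fmt.Printf("%+v\n", availableCondition(url, 200, nil))
	fmt.Printf("%+v\n", availableCondition(url, 503, nil))
	fmt.Printf("%+v\n", availableCondition(url, 0, errors.New("connect: connection refused")))
}
```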
APIService vs CRD: When to Use Which
| Criterion | CRD | Aggregated API Server |
|---|---|---|
| Operational complexity | Low — just apply a YAML | High — deploy + maintain an additional server binary |
| Storage backend | Always etcd (same cluster) | Any — etcd, external DB, in-memory, computed |
| Schema validation | OpenAPI v3 + CEL | Arbitrary Go code |
| Subresource flexibility | Only /status and /scale subresources | Any subresource with custom semantics (e.g., /exec, /logs, /proxy) |
| Watch/list performance | kube-apiserver's own watchCache | Extension server must implement its own watch semantics |
| RBAC | Enforced by kube-apiserver | Enforced by kube-apiserver (via delegation) |
| Admission webhooks | Supported | Can implement admission in-process or delegate |
| Availability impact | CRD types always available (stored in etcd) | API group unavailable if extension server is down |
| Versioning / conversion | Conversion webhooks (the None strategy only rewrites apiVersion) | Arbitrary conversion logic in server code |
| Protobuf / encoding | JSON (and YAML) only; protobuf is not supported for custom resources | Full protobuf support using standard runtime |
| Best for | Operators, controllers, custom resources stored in cluster state | APIs over external systems, metrics, computed data, complex subresources |
Metrics Server Deep Dive
The Metrics Server is the canonical production example of an aggregated API server. It serves the metrics.k8s.io/v1beta1 group, used by kubectl top and the HPA controller's CPU/memory autoscaling (see kube-controller-manager §HPA).
Data Flow
Metrics Server does not use Prometheus or any persistent storage. It polls kubelets directly, holds the last-collected metrics in memory, and serves them on demand. This means:
- Metrics are available only for live nodes and pods.
- Metrics are sampled, not continuous — there is a 15-second collection interval.
- Restarting Metrics Server loses all data (irrelevant because it is re-scraped within 15s).
- It is NOT a replacement for Prometheus — use Prometheus for historical metrics, dashboards, and alerting.
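The consequences listed above follow from holding only the latest scrape in memory. A minimal sketch of such a store is shown below; the store and usage types are hypothetical and are not taken from the Metrics Server source.

```go
package main

import (
	"fmt"
	"sync"
)

// usage is a hypothetical point-in-time resource reading.
type usage struct {
	CPUMillicores int64
	MemoryBytes   int64
}

// store keeps only the most recent scrape. Nothing is persisted,
// so a restart starts empty until the next scrape completes.
type store struct {
	mu     sync.RWMutex
	latest map[string]usage
}

func newStore() *store {
	return &store{latest: map[string]usage{}}
}

// replace swaps in a complete scrape. Objects missing from the new
// scrape disappear, so only live nodes and pods are ever served.
func (s *store) replace(scrape map[string]usage) {
	next := make(map[string]usage, len(scrape))
	for name, u := range scrape {
		next[name] = u
	}
	s.mu.Lock()
	s.latest = next
	s.mu.Unlock()
}

// get serves the last collected reading on demand.
func (s *store) get(name string) (usage, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	u, ok := s.latest[name]
	return u, ok
}

func main() {
	s := newStore()
	s.replace(map[string]usage{"node-a": {CPUMillicores: 250, MemoryBytes: 1 << 30}})
	fmt.Println(s.get("node-a"))

	// node-a was drained and is absent from the next scrape.
	s.replace(map[string]usage{"node-b": {CPUMillicores: 100, MemoryBytes: 1 << 29}})
	fmt.Println(s.get("node-a"))
}
```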
Metrics Server Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      containers:
      - name: metrics-server
        image: registry.k8s.io/metrics-server/metrics-server:v0.7.0
        args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        # Only use in lab environments — skips kubelet TLS verification
        # - --kubelet-insecure-tls
        ports:
        - containerPort: 10250
          name: https
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      volumes:
      - name: tmp-dir
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "Metrics-server"
spec:
  selector:
    k8s-app: metrics-server
  ports:
  - port: 443
    protocol: TCP
    targetPort: https
Required RBAC for Metrics Server
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:metrics-server
rules:
- apiGroups: [""]
  resources: ["nodes/metrics"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["pods", "nodes"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
# Required for delegated auth
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
# Required to read extension-apiserver-authentication configmap
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
Every extension server's service account needs the system:auth-delegator ClusterRoleBinding so it can create TokenReview and SubjectAccessReview objects for delegated authentication and authorization. Without this, the extension server cannot validate tokens or check authorization.
Custom Metrics and External Metrics APIs
Beyond metrics.k8s.io (resource metrics), two additional aggregated APIs power HPA's external signal sources:
| API Group | Purpose | Common Implementations |
|---|---|---|
| metrics.k8s.io/v1beta1 | CPU and memory usage per pod/node (from kubelet) | Metrics Server |
| custom.metrics.k8s.io/v1beta2 | Custom metrics from within the cluster (e.g., RPS, queue depth) | Prometheus Adapter, KEDA, KEDA HTTP Add-on |
| external.metrics.k8s.io/v1beta1 | Metrics from external systems (e.g., SQS queue length, Datadog metric) | KEDA, Datadog Cluster Agent, Azure Monitor Adapter |
Prometheus Adapter: HPA on Custom Metrics
# prometheus-adapter ConfigMap — maps Prometheus queries to k8s metric names
apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
data:
  config.yaml: |
    rules:
    - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
# HPA using custom metric
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
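The link between the Prometheus series http_requests_total and the metric name http_requests_per_second used by the HPA is the name rule in the adapter config. The sketch below reproduces that rewrite with Go's regexp package, on the assumption that the adapter's matches/as pair behaves like a regular-expression replacement with ${1} capture references; renameSeries is a hypothetical helper, not part of the adapter.

```go
package main

import (
	"fmt"
	"regexp"
)

// renameSeries applies a matches/as pair to a Prometheus series name.
// It reports false when the pattern is invalid or does not match,
// in which case the rule would not expose the series at all.
func renameSeries(matches, as, series string) (string, bool) {
	re, err := regexp.Compile(matches)
	if err != nil || !re.MatchString(series) {
		return "", false
	}
	return re.ReplaceAllString(series, as), true
}

func main() {
	for _, series := range []string{"http_requests_total", "queue_depth"} {
		name, ok := renameSeries("^(.*)_total", "${1}_per_second", series)
		fmt.Printf("%-22s -> %q (exposed=%v)\n", series, name, ok)
	}
}
```

If the name an HPA asks for does not match what the rule produces, the HPA reports the metric as unknown even though the series exists in Prometheus.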
TLS Configuration for Extension Servers
Extension servers must serve HTTPS. Two TLS relationships exist:
kube-apiserver → extension server (proxy TLS)
The caBundle in the APIService object is the CA that signed the extension server's serving certificate. kube-apiserver uses this to verify the extension server's identity when proxying.
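# Generate a self-signed serving cert for the extension server.
# The Service DNS name must appear as a subjectAltName: clients built with
# current Go releases ignore the CN when verifying hostnames.
openssl req -x509 -newkey rsa:4096 \
  -keyout server.key -out server.crt -days 365 -nodes \
  -subj "/CN=my-extension-svc.my-ns.svc" \
  -addext "subjectAltName=DNS:my-extension-svc.my-ns.svc"
# Patch the APIService caBundle (a self-signed cert acts as its own CA)
CA=$(base64 -w0 server.crt)
kubectl patch apiservice v1alpha1.myapi.example.com \
  --type=merge -p "{\"spec\":{\"caBundle\":\"$CA\"}}"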
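Whether a caBundle actually matches a serving certificate can be checked offline with Go's crypto/x509 package. The following is a rough sketch of that check under stated assumptions: verifyAgainstCABundle and newSelfSignedCert are hypothetical helpers, and the certificate is generated in-process only so that the example is self-contained.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// newSelfSignedCert returns a PEM-encoded self-signed certificate whose
// subjectAltName contains dnsName. It exists only to feed the example.
func newSelfSignedCert(dnsName string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: dnsName},
		DNSNames:              []string{dnsName},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		IsCA:                  true,
		BasicConstraintsValid: true,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

// verifyAgainstCABundle checks that servingCertPEM chains to a CA in
// caBundlePEM and is valid for dnsName.
func verifyAgainstCABundle(caBundlePEM, servingCertPEM []byte, dnsName string) error {
	roots := x509.NewCertPool()
	if !roots.AppendCertsFromPEM(caBundlePEM) {
		return errors.New("caBundle contains no usable certificates")
	}
	block, _ := pem.Decode(servingCertPEM)
	if block == nil {
		return errors.New("serving certificate is not PEM encoded")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return err
	}
	_, err = cert.Verify(x509.VerifyOptions{Roots: roots, DNSName: dnsName})
	return err
}

func main() {
	const host = "my-extension-svc.my-ns.svc"
	certPEM, err := newSelfSignedCert(host)
	if err != nil {
		panic(err)
	}
	fmt.Println("matching caBundle:", verifyAgainstCABundle(certPEM, certPEM, host))

	otherCA, err := newSelfSignedCert("unrelated-ca.example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println("wrong caBundle:   ", verifyAgainstCABundle(otherCA, certPEM, host))
}
```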
Extension server → kube-apiserver (front-proxy TLS)
kube-apiserver adds a client certificate (the front-proxy certificate) to proxied requests. The extension server reads requestheader-client-ca-file from the extension-apiserver-authentication ConfigMap to verify this certificate.
# kube-apiserver flags for front-proxy
--requestheader-client-ca-file=/etc/kubernetes/front-proxy-ca.crt
--requestheader-allowed-names=front-proxy-client
--requestheader-username-headers=X-Remote-User
--requestheader-group-headers=X-Remote-Group
--requestheader-extra-headers-prefix=X-Remote-Extra-
--proxy-client-cert-file=/etc/kubernetes/front-proxy-client.crt
--proxy-client-key-file=/etc/kubernetes/front-proxy-client.key
API Discovery and the /apis Endpoint
Aggregated API groups automatically appear in the API discovery responses served by kube-apiserver:
# List all API group/versions (includes aggregated)
kubectl api-versions
# apps/v1
# batch/v1
# custom.metrics.k8s.io/v1beta2
# metrics.k8s.io/v1beta1
# List all API resources (includes aggregated resources)
kubectl api-resources | grep metrics
# Raw discovery
curl -k https://$APISERVER/apis/metrics.k8s.io/v1beta1
The discovery aggregation works as follows: kube-apiserver queries each extension server's /apis/<group>/<version> discovery endpoint and merges the results into the global discovery response. If an extension server is unavailable, its group is omitted or marked unavailable in the discovery response.
Aggregated Discovery (v1 GA — 1.30)
Before Aggregated Discovery, every API client performed O(N) discovery requests, one per API group/version, to build the full API resource table. In clusters with many extension servers and CRDs, this could take seconds. Aggregated Discovery (GA in 1.30) provides a single endpoint that returns all API groups and resources in one response:
# Single request for all API resources
GET /apis
Accept: application/json;as=APIGroupDiscoveryList;v=v2;g=apidiscovery.k8s.io,application/json
# Returns APIGroupDiscoveryList with all groups, versions, and resources
# Dramatically reduces kubectl startup time and client initialization overhead
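For clients that do not go through client-go, the request above can be built with the standard library alone. This is a minimal sketch: newAggregatedDiscoveryRequest is a hypothetical helper, the base URL is an example value, and authentication and TLS setup are left out.

```go
package main

import (
	"fmt"
	"net/http"
)

// aggregatedDiscoveryAccept asks for the aggregated discovery document.
// The trailing application/json lets servers without the feature
// answer with the legacy, unaggregated response instead.
const aggregatedDiscoveryAccept = "application/json;as=APIGroupDiscoveryList;v=v2;g=apidiscovery.k8s.io,application/json"

// newAggregatedDiscoveryRequest builds a GET /apis request that
// negotiates the aggregated discovery format via the Accept header.
func newAggregatedDiscoveryRequest(apiServerURL string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, apiServerURL+"/apis", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", aggregatedDiscoveryAccept)
	return req, nil
}

func main() {
	req, err := newAggregatedDiscoveryRequest("https://kubernetes.default.svc")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println("Accept:", req.Header.Get("Accept"))
}
```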
kube-apiserver Flags for Aggregation
| Flag | Default | Description |
|---|---|---|
| --requestheader-client-ca-file | — | CA to verify the front-proxy client certificate |
| --requestheader-allowed-names | — | Common names allowed for front-proxy (usually front-proxy-client) |
| --requestheader-username-headers | — | Header containing the authenticated username (usually X-Remote-User) |
| --requestheader-group-headers | — | Header containing the authenticated groups (usually X-Remote-Group) |
| --requestheader-extra-headers-prefix | — | Prefix for extra info headers (usually X-Remote-Extra-) |
| --proxy-client-cert-file | — | TLS client cert kube-apiserver presents to extension servers |
| --proxy-client-key-file | — | TLS client key for the above |
| --enable-aggregator-routing | false | Route to pod IPs directly rather than via Service ClusterIP (useful when kube-proxy is not available on control plane nodes) |
If --requestheader-client-ca-file is not set on kube-apiserver, the aggregation layer will not inject identity headers, and all extension server requests will appear as anonymous. This is a common misconfiguration in self-managed clusters and custom kubeadm setups.
Prometheus Metrics
| Metric | Type | Description |
|---|---|---|
| aggregator_unavailable_apiservice | Gauge | 1 if the APIService is currently unavailable, 0 if available |
| aggregator_unavailable_apiservice_total | Counter | Total count of times an APIService became unavailable |
| apiserver_request_duration_seconds | Histogram | Request latency broken down by verb, group, resource, subresource — includes proxied requests |
| apiserver_proxy_tunnel_sync_latency_secs | Histogram | Time to establish the proxy tunnel to an extension server |
Alerting Rules
groups:
- name: api-aggregation
  rules:
  - alert: APIServiceUnavailable
    expr: aggregator_unavailable_apiservice == 1
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "APIService {{ $labels.name }} is unavailable"
      description: "Requests to this API group will fail until the extension server recovers"
  - alert: MetricsServerUnavailable
    expr: aggregator_unavailable_apiservice{name="v1beta1.metrics.k8s.io"} == 1
    for: 2m
    labels:
      severity: critical
    annotations:
      summary: "Metrics Server unavailable — HPA and kubectl top are broken"
  - alert: AggregatedAPIHighLatency
    # Aggregated metrics APIs are identified by the group label;
    # their resource label is just "nodes" or "pods".
    expr: |
      histogram_quantile(0.99,
        sum by (le) (rate(apiserver_request_duration_seconds_bucket{
          group=~".*metrics.k8s.io"
        }[5m]))
      ) > 2
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Aggregated API p99 latency > 2s — extension server may be slow"
Troubleshooting Runbooks
Runbook 1: kubectl top returns "Error from server (ServiceUnavailable)"
# Symptom: kubectl top nodes / kubectl top pods returns error
# 1. Check APIService status
kubectl get apiservice v1beta1.metrics.k8s.io
# If AVAILABLE is False:
kubectl describe apiservice v1beta1.metrics.k8s.io
# 2. Check Metrics Server pod
kubectl get pods -n kube-system -l k8s-app=metrics-server
kubectl logs -n kube-system -l k8s-app=metrics-server --tail=50
# 3. Check if Metrics Server can reach kubelets
# Common issue: kubelet TLS not trusted
# Fix: add --kubelet-insecure-tls (lab) or configure proper CA
kubectl logs -n kube-system deploy/metrics-server | grep "failed to"
# 4. Check Service and endpoints
kubectl get svc metrics-server -n kube-system
kubectl get endpoints metrics-server -n kube-system
# 5. Verify RBAC bindings
kubectl get clusterrolebinding | grep metrics
kubectl get rolebinding -n kube-system | grep metrics
# 6. Check if APIService caBundle matches Metrics Server cert
kubectl get apiservice v1beta1.metrics.k8s.io -o jsonpath='{.spec.caBundle}' \
| base64 -d | openssl x509 -text -noout | grep CN
Runbook 2: Extension server unavailable — APIService not Available
# 1. Get detailed condition
kubectl describe apiservice v1alpha1.myapi.example.com
# Look at the Message field in conditions
# 2. Common causes:
# a) Pod not running
kubectl get pods -n my-ns -l app=my-extension-server
# b) Service port mismatch
kubectl get svc -n my-ns my-extension-svc -o yaml | grep -A5 ports
# Must match the port in APIService spec
# c) TLS cert mismatch (caBundle doesn't match serving cert)
kubectl get apiservice v1alpha1.myapi.example.com -o jsonpath='{.spec.caBundle}' \
| base64 -d | openssl x509 -noout -fingerprint
# Compare with cert the server is actually serving:
openssl s_client -connect my-extension-svc.my-ns.svc.cluster.local:443 \
  </dev/null 2>/dev/null | openssl x509 -noout -fingerprint
# d) Extension server itself is returning non-200 on /healthz
kubectl exec -it debug-pod -- curl -k https://my-extension-svc.my-ns.svc.cluster.local:443/healthz
# 3. Force re-check by deleting and recreating the APIService
kubectl delete apiservice v1alpha1.myapi.example.com
kubectl apply -f apiservice.yaml
Runbook 3: "403 Forbidden" from extension server
# Symptom: kubectl get myresources.myapi.example.com returns 403
# 1. Check if RBAC ClusterRole exists for the group
kubectl get clusterrole | grep myapi
# 2. Check user's permissions
kubectl auth can-i get myresources.myapi.example.com --as=alice
# 3. Create ClusterRole and Binding
kubectl create clusterrole myapi-reader \
--verb=get,list,watch \
--resource=myresources.myapi.example.com
kubectl create clusterrolebinding alice-myapi \
--clusterrole=myapi-reader \
--user=alice
# 4. Check if extension server is doing its own authz (not delegating)
# Extension server logs should show SubjectAccessReview calls to kube-apiserver
kubectl logs -n my-ns deploy/my-extension-server | grep SubjectAccessReview
# 5. Verify system:auth-delegator binding exists
kubectl get clusterrolebinding | grep auth-delegator | grep my-extension
Runbook 4: HPA not scaling — custom metrics unavailable
# Symptom: HPA shows "unknown" for custom metric target
# 1. Check if custom.metrics.k8s.io APIService is available
kubectl get apiservice | grep custom.metrics
# 2. Test the metric directly
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta2/namespaces/default/pods/*/http_requests_per_second"
# 3. Check Prometheus Adapter logs
kubectl logs -n monitoring deploy/prometheus-adapter --tail=50 | grep -i error
# 4. Verify the metric exists in Prometheus
# Port-forward to Prometheus and query:
kubectl port-forward -n monitoring svc/prometheus 9090:9090 &
curl "http://localhost:9090/api/v1/query?query=http_requests_total" | jq '.data.result | length'
# 5. Validate adapter config maps Prometheus series correctly
kubectl get configmap adapter-config -n monitoring -o yaml
# Check seriesQuery matches the Prometheus metric name
# 6. Check HPA events
kubectl describe hpa my-app-hpa | tail -20
Runbook 5: API discovery broken — "no kind is registered for the type"
# Symptom: client-go or kubectl fails with type registration errors
# Often seen during kubectl api-resources --verbs=list
# 1. Check for unavailable APIServices causing discovery failures
kubectl get apiservices | grep -v "True"
# Any False entries will cause partial discovery failures
# 2. Check aggregation layer logs in kube-apiserver
kubectl logs -n kube-system kube-apiserver-controlplane \
| grep -i "aggregat" | tail -20
# 3. If a stale APIService points to a deleted service, delete it
kubectl delete apiservice v1beta1.stale.example.com
# 4. Force kubectl to re-cache discovery
rm -rf ~/.kube/cache/discovery/
kubectl api-resources
# 5. For client-go: reset the discovery client cache
# In code: use a DiscoveryClient with NoCache or
# call Invalidate() on the cached discovery client
Production Best Practices
- Never use insecureSkipTLSVerify: true in production. The caBundle in the APIService must contain the CA that signed the extension server's serving certificate. Use cert-manager to automate certificate provisioning and rotation.
- Deploy extension servers with ≥2 replicas and a PodDisruptionBudget. A down extension server makes the entire API group unavailable. Use Deployments with minReadySeconds and readiness probes on /healthz.
- Always bind the system:auth-delegator ClusterRole and the extension-apiserver-authentication-reader Role to your extension server's service account. Without these, delegated auth/authz will silently fail.
- Use --enable-aggregator-routing on control-plane-only nodes where kube-proxy may not run. This routes proxy traffic directly to pod IPs, bypassing Services that require kube-proxy iptables rules.
- Monitor aggregator_unavailable_apiservice per service. Alert immediately — a broken APIService can disrupt HPA, kubectl, and any controller watching that API group.
- Implement /healthz, /readyz, and /livez in your extension server. The aggregation layer sets the APIService Available condition from periodic requests to the group/version discovery endpoint, so the server should not report ready until it can serve discovery. If the condition stays False after the server has recovered, deleting and recreating the APIService forces a re-check.
- Prefer CRDs unless you have a specific need for a custom storage backend or complex subresources. AA servers have significantly higher operational burden.
- Use groupPriorityMinimum and versionPriority thoughtfully. Setting very high priorities causes your API group to be preferred in discovery, which can confuse clients if multiple groups serve the same resource name.