Control Plane — Complete Internal Architecture
Deep internals of the five core control-plane components, their gRPC/REST APIs, reconciliation loops, leader election, HA topology, failure modes, and production operations.
What Is the Control Plane?
The control plane is the set of processes that implement the desired-state management loop of the cluster. It never runs your workloads; it only watches, decides, and instructs. Every cluster mutation flows through the control plane, and every node periodically reports status back to it.
The Five Core Components
kube-apiserver
The single, authoritative REST API gateway for the entire cluster. All state reads and writes go through it. Stateless and horizontally scalable. Persists every object to etcd.
etcd
Strongly consistent, distributed key-value store. The ground truth for all cluster state. Uses Raft consensus. Loss of etcd = loss of the cluster.
kube-scheduler
Watches for unscheduled Pods and assigns each to a Node. Two-phase: Filter (which nodes can run this pod?) → Score (which is optimal?). Highly extensible via plugins.
kube-controller-manager
Runs 30+ reconciliation loops (controllers) in a single binary. Each controller drives actual cluster state toward desired state. Node controller, ReplicaSet controller, Job controller, etc.
cloud-controller-manager
Optional. Integrates with cloud provider APIs for LoadBalancer provisioning, Node address annotation, Route programming. Decouples cloud logic from core controllers.
Control Plane Architecture Diagram
Figure 1: Control plane component communication topology. All external and internal traffic routes through kube-apiserver. etcd is only accessible by apiserver. The scheduler and controllers only talk to the apiserver — never directly to nodes.
Communication Matrix
Knowing who talks to whom, over which protocol and port, and with which authentication method is critical for firewall rules, mTLS policy, and audit log interpretation.
| Source | Destination | Port(s) | Protocol | AuthN Method | Direction | Notes |
|---|---|---|---|---|---|---|
| kubectl / API clients | kube-apiserver | 6443 | HTTPS (TLS 1.3) | x509 cert / token / OIDC | → CP | External-facing. Must be behind LB in HA. |
| kube-apiserver | etcd | 2379 | gRPC / TLS | mTLS (client cert) | → etcd | Only apiserver talks to etcd. Dedicated cert. |
| kube-scheduler | kube-apiserver | 6443 | HTTPS + HTTP/2 | in-cluster ServiceAccount / kubeconfig | → CP | Watches unscheduled Pods; sets spec.nodeName through the pods/binding subresource |
| kube-controller-manager | kube-apiserver | 6443 | HTTPS + HTTP/2 | in-cluster ServiceAccount / kubeconfig | → CP | Each controller has separate SA + RBAC |
| cloud-controller-manager | kube-apiserver | 6443 | HTTPS + HTTP/2 | in-cluster ServiceAccount / kubeconfig | → CP | Also calls cloud provider API externally |
| kubelet | kube-apiserver | 6443 | HTTPS + HTTP/2 | TLS bootstrap → node cert | → CP | Node authn group system:nodes |
| kube-apiserver | kubelet | 10250 | HTTPS | apiserver presents client cert to kubelet | → Node | exec, log, portforward, attach |
| kube-proxy | kube-apiserver | 6443 | HTTPS + HTTP/2 | ServiceAccount / kubeconfig | → CP | Watches Services and EndpointSlices |
| etcd leader | etcd followers | 2380 | gRPC / TLS | mTLS (peer cert) | intra-etcd | Raft replication and heartbeats |
| kube-apiserver (HA) | kube-apiserver peers | — | — | — | independent | APIservers are stateless; they don't talk to each other. LB distributes. |
Component Roles — Internal Mechanics
kube-apiserver — The API Gateway
The apiserver is the only component that reads from and writes to etcd. It is completely stateless: all state lives in etcd. Multiple apiserver replicas can run simultaneously because they share etcd as the source of truth.
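Internal request lifecycle, in brief: authentication, authorization, admission control (mutating, then validating), and persistence to etcd. Full detail is in 04-kubernetes-api-model.html §Request Lifecycle.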
The apiserver also proxies requests to webhooks (MutatingAdmissionWebhook, ValidatingAdmissionWebhook) and to aggregated API servers (metrics-server, custom API servers via APIService). It handles Watch via long-lived HTTP/2 streaming connections — see the Informer architecture in 04-kubernetes-api-model.html §Informers.
Key apiserver flags
# Minimal production-relevant flags
kube-apiserver \
--advertise-address= # IP used by kubelet to reach apiserver
--bind-address=0.0.0.0 # Listen on all interfaces
--secure-port=6443 # HTTPS port (default)
--etcd-servers=https://etcd-0:2379,https://etcd-1:2379,https://etcd-2:2379
--etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
--etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
--etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
--client-ca-file=/etc/kubernetes/pki/ca.crt
--tls-cert-file=/etc/kubernetes/pki/apiserver.crt
--tls-private-key-file=/etc/kubernetes/pki/apiserver.key
--authorization-mode=Node,RBAC
--enable-admission-plugins=NodeRestriction,PodSecurity,MutatingAdmissionWebhook,ValidatingAdmissionWebhook
--service-cluster-ip-range=10.96.0.0/12
--service-node-port-range=30000-32767
--allow-privileged=false
--audit-log-path=/var/log/kubernetes/audit.log
--audit-policy-file=/etc/kubernetes/audit-policy.yaml
--feature-gates=... # feature flags
--max-requests-inflight=800 # concurrency limit
--max-mutating-requests-inflight=400
--request-timeout=60s
etcd — The Ground Truth
etcd is a distributed key-value store using the Raft consensus algorithm. It guarantees strong consistency — every read reflects the most recently committed write (given quorum). Kubernetes stores every API object as a protobuf-encoded value at a path like /registry/{group}/{resource}/{namespace}/{name}.
Raft requires a quorum of ⌊n/2⌋ + 1 members. A 3-member cluster tolerates 1 failure (quorum = 2). A 5-member cluster tolerates 2 failures (quorum = 3). Always run an odd number of members.
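The quorum arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function names `quorum` and `failures_tolerated` are illustrative and not part of etcd.

```python
def quorum(members: int) -> int:
    """Smallest majority for a Raft cluster of the given size."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """Members that can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"members={n} quorum={quorum(n)} tolerated={failures_tolerated(n)}")
```

Running the loop shows why even member counts are avoided: going from 3 to 4 members raises the quorum without raising the number of failures tolerated.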
etcd Raft Leader Election Flow
- All members start as Followers. Each has a randomized election timeout (etcd's base --election-timeout defaults to 1000ms; the Raft paper suggests 150–300ms).
- When a Follower doesn't receive a heartbeat within its timeout, it transitions to Candidate and increments its term.
- The Candidate sends RequestVote RPCs to all peers. Each peer grants one vote per term to the first Candidate that has an up-to-date log.
- If the Candidate receives votes from a majority, it becomes Leader.
- The Leader sends periodic heartbeat AppendEntries RPCs (etcd's --heartbeat-interval defaults to 100ms) to prevent elections.
- All client writes (from apiserver) go to the Leader. It appends the entry to its log, replicates to followers, and commits once a majority acknowledges.
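The voting rule in the steps above (one vote per term, granted only to a Candidate whose log is at least as up to date) can be sketched as follows. This is a toy model, not etcd's Raft implementation: the `Member` class, its fields, and `run_election` are hypothetical names, and timeouts, networking, and log replication are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    last_log_term: int = 0
    last_log_index: int = 0
    current_term: int = 0
    voted_for: dict = field(default_factory=dict)  # term -> candidate name

    def request_vote(self, term, candidate, cand_log_term, cand_log_index):
        """Grant at most one vote per term, and only to an up-to-date log."""
        if term < self.current_term:
            return False  # stale candidate
        self.current_term = term
        already = self.voted_for.get(term)
        if already is not None and already != candidate:
            return False  # this term's vote is already spent
        up_to_date = (cand_log_term, cand_log_index) >= (
            self.last_log_term, self.last_log_index)
        if not up_to_date:
            return False
        self.voted_for[term] = candidate
        return True


def run_election(candidate: Member, peers: list) -> bool:
    """Candidate bumps its term, votes for itself, then asks every peer."""
    candidate.current_term += 1
    term = candidate.current_term
    candidate.voted_for[term] = candidate.name
    votes = 1
    for peer in peers:
        if peer.request_vote(term, candidate.name,
                             candidate.last_log_term, candidate.last_log_index):
            votes += 1
    cluster_size = len(peers) + 1
    return votes >= cluster_size // 2 + 1


if __name__ == "__main__":
    a, b, c = Member("a", 1, 5), Member("b", 1, 5), Member("c", 1, 3)
    print("a elected:", run_election(a, [b, c]))
    print("c elected:", run_election(c, [a, b]))
```

In the usage example, member `c` has a shorter log than its peers, so its later candidacy is refused even though it uses a higher term.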
# Inspect etcd leader
ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/peer.crt \
--key=/etc/kubernetes/pki/etcd/peer.key \
endpoint status --write-out=table
# Output columns: ENDPOINT, ID, VERSION, DB SIZE, IS LEADER, RAFT TERM, RAFT INDEX
kube-scheduler — Two-Phase Placement
The scheduler watches the apiserver for Pods in Pending state with spec.nodeName == "". For each such pod, it runs a two-phase algorithm:
Phase 1: Filter (Predicates)
Eliminate nodes that cannot run this pod. Built-in filter plugins include:
- NodeResourcesFit: CPU/memory requests must fit
- NodeSelector: nodeSelector labels must match
- NodeAffinity: requiredDuringScheduling rules
- TaintToleration: pod must tolerate all taints
- PodTopologySpread: spread constraints
- VolumeBinding: PVCs must be bindable to node
- InterPodAffinity: pod anti-affinity hard rules
- NodePorts: host ports must be free
Phase 2: Score (Priorities)
Rank remaining feasible nodes. Built-in score plugins include:
- LeastAllocated: prefer nodes with most free resources
- BalancedAllocation: balance CPU vs memory
- NodeAffinityPriority: preferredDuringScheduling
- InterPodAffinityPriority: soft affinity/anti-affinity
- ImageLocality: prefer nodes with image already pulled
- TaintToleration: deprioritize tainted nodes
Highest-scoring node wins. Scheduler writes pod.spec.nodeName via a Bind operation.
The scheduler uses the Scheduling Framework (introduced v1.15, stable v1.19) — a plugin-based system where all logic is expressed as plugins implementing extension points: PreFilter, Filter, PostFilter, PreScore, Score, NormalizeScore, Reserve, Permit, PreBind, Bind, PostBind.
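The Filter then Score flow can be illustrated with a small model. This is a sketch under simplifying assumptions, not the scheduler's code: `Node`, `Pod`, `feasible`, `score`, and `schedule` are hypothetical names, taints are reduced to plain strings, and the score is a toy "most leftover resources" measure rather than the real LeastAllocated formula.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_m: int   # free CPU in millicores
    free_mem_mi: int  # free memory in MiB
    labels: dict = field(default_factory=dict)
    taints: set = field(default_factory=set)


@dataclass
class Pod:
    cpu_m: int
    mem_mi: int
    node_selector: dict = field(default_factory=dict)
    tolerations: set = field(default_factory=set)


def feasible(pod: Pod, node: Node) -> bool:
    """Filter phase: resources fit, selector matches, all taints tolerated."""
    fits = pod.cpu_m <= node.free_cpu_m and pod.mem_mi <= node.free_mem_mi
    selected = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    tolerated = node.taints <= pod.tolerations
    return fits and selected and tolerated


def score(pod: Pod, node: Node) -> int:
    """Score phase (toy): more leftover CPU and memory is better."""
    return (node.free_cpu_m - pod.cpu_m) + (node.free_mem_mi - pod.mem_mi)


def schedule(pod: Pod, nodes: list):
    """Return the name of the best feasible node, or None if none fits."""
    candidates = [n for n in nodes if feasible(pod, n)]
    if not candidates:
        return None
    return max(candidates, key=lambda n: score(pod, n)).name


if __name__ == "__main__":
    nodes = [
        Node("small", 500, 512),
        Node("big", 4000, 8192),
        Node("tainted", 8000, 16384, taints={"dedicated"}),
    ]
    print(schedule(Pod(1000, 1024), nodes))
```

Keeping the two phases as separate functions mirrors the plugin split: a filter only answers yes or no, while a score only ranks nodes that already passed every filter.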
Scheduler Internals: Scheduling Queue and Backoff
The scheduler maintains a priority queue (heap-based) ordered by Pod priority. When a pod fails scheduling, it goes into a backoff queue with exponential backoff (by default 1s → 2s → 4s → 8s, capped at 10s; set by podInitialBackoffSeconds and podMaxBackoffSeconds). Failed pods are retried when cluster state changes (node added, resource freed, taint removed).
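The capped exponential backoff described above reduces to one line of arithmetic. The sketch below is illustrative only: `backoff_delay` is a hypothetical helper, and the starting delay and cap are passed in explicitly rather than read from any scheduler configuration.

```python
def backoff_delay(attempt: int, initial: float, cap: float) -> float:
    """Delay before retry number `attempt` (1-based): doubles each time, capped."""
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    return min(initial * 2 ** (attempt - 1), cap)


if __name__ == "__main__":
    # Illustrative values; substitute the cluster's configured backoff settings.
    print([backoff_delay(i, initial=1.0, cap=10.0) for i in range(1, 7)])
```

The cap matters more than the starting value in practice: after a handful of failures every retry waits the maximum, so a pod that stays unschedulable is retried at a steady rate instead of ever more slowly.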
# Check scheduling failures
kubectl get events --field-selector reason=FailedScheduling
kubectl describe pod <pod-name> # "Events" section shows scheduler decisions
# Enable verbose scheduler logs
kube-scheduler --v=10 # Logs filter/score results per node
# Inspect scheduler metrics
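# /metrics requires an authorized bearer token; $TOKEN is assumed to hold one
curl -sk -H "Authorization: Bearer $TOKEN" https://localhost:10259/metrics | grep scheduler_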
kube-controller-manager — The Reconciliation Engine
The kube-controller-manager runs 30+ controllers in a single binary with a single process. Each controller is an independent goroutine running a reconciliation loop:
// Generic reconciliation loop pattern (all controllers follow this)
for {
    desired := readDesiredState(apiserver) // via Informer cache
    actual := readActualState(apiserver)   // via Informer cache
    if desired != actual {
        makeChanges(apiserver) // Create/Update/Delete child objects
    }
    // Sleep or wait for next Informer event
}
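To make the pattern concrete, here is a minimal in-memory sketch of a replica-count reconciler. It is not how the ReplicaSet controller is implemented: `reconcile` and `apply_actions` are hypothetical helpers, pods are plain strings, and the naming scheme is a toy that ignores collisions.

```python
def reconcile(desired_replicas: int, pods: list) -> list:
    """Compare desired and actual state; return the actions that close the gap."""
    diff = desired_replicas - len(pods)
    if diff > 0:
        return [("create", f"pod-{len(pods) + i}") for i in range(diff)]
    if diff < 0:
        return [("delete", name) for name in pods[diff:]]
    return []  # already converged: nothing to do


def apply_actions(actions: list, pods: list) -> None:
    """Stand-in for API calls: mutate the in-memory pod list."""
    for verb, name in actions:
        if verb == "create":
            pods.append(name)
        else:
            pods.remove(name)


if __name__ == "__main__":
    pods = ["pod-0"]
    for desired in (3, 3, 1):
        actions = reconcile(desired, pods)
        apply_actions(actions, pods)
        print(desired, actions, pods)
```

The second pass with the same desired count produces no actions. That idempotence is what lets a controller re-run safely after a restart or a missed event.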
The key controllers and what they manage:
| Controller | Watches | Creates / Manages | Key Action |
|---|---|---|---|
| ReplicationController | RC objects + Pods | Pods | Maintain pod count == spec.replicas |
| ReplicaSet controller | RS objects + Pods | Pods | Maintain pod count, owns pods via ownerRef |
| Deployment controller | Deployments + RS | ReplicaSets | Manage rolling updates, rollbacks |
| StatefulSet controller | StatefulSets + Pods + PVCs | Pods, PVCs | Ordered pod creation, stable network IDs |
| DaemonSet controller | DaemonSets + Nodes + Pods | Pods (one per node) | Schedule pod to every matching node |
| Job controller | Jobs + Pods | Pods | Run pods to completion, handle retries |
| CronJob controller | CronJobs | Jobs | Create Job objects on schedule |
| Node controller | Nodes | — | Taint nodes unreachable after 40s, evict pods after 5min |
| Endpoints controller | Services + Pods | Endpoints | Update endpoints when pod IPs change (legacy; EndpointSlice preferred) |
| EndpointSlice controller | Services + Pods | EndpointSlices | Scale-friendly endpoint tracking |
| Namespace controller | Namespaces | — | Delete all objects when namespace is deleted |
| ServiceAccount controller | Namespaces | ServiceAccounts | Create default SA in every namespace |
| Token controller | ServiceAccounts + Secrets | Secrets (token) | Create SA token secrets (legacy; no longer auto-generated since v1.24) |
| PersistentVolume controller | PVs + PVCs | — | Bind PVCs to PVs, handle reclaim policies |
| ResourceQuota controller | ResourceQuotas | — | Track and enforce quota usage |
| GarbageCollection controller | All objects | — | Delete orphaned objects via ownerReferences |
| TTLAfterFinished controller | Jobs | — | Delete finished Jobs after TTL |
The --controllers flag can selectively disable specific controllers (e.g., --controllers=*,-ttl; the * keeps all other on-by-default controllers enabled).
cloud-controller-manager — Cloud Integration
Introduced to decouple cloud-provider-specific logic from the core Kubernetes binaries. Before its introduction (pre-v1.6), every cloud provider's logic lived in-tree in kube-controller-manager and kubelet, which tied provider changes to the Kubernetes release cycle.
The CCM runs three cloud-specific controllers:
- Node controller: Annotates new Nodes with cloud metadata (instance type, zone, region, provider ID). Deletes Node objects when cloud instance is terminated.
- Route controller: Programs cloud routing tables so pod-to-pod traffic flows across nodes (GCP VPC native routing, AWS VPC CNI, etc.).
- Service controller: Watches Services of type: LoadBalancer. Creates/deletes cloud load balancers. Updates status.loadBalancer.ingress with the allocated IP/hostname.
High Availability Control Plane
Stacked vs External etcd Topology
Stacked etcd (Default kubeadm)
etcd runs on the same nodes as the control plane components. Simpler to manage. Used by default in kubeadm clusters.
- Each CP node: apiserver + scheduler + controller-manager + etcd
- etcd cluster: 3 members on 3 CP nodes
- Risk: Losing a CP node loses both a control plane member AND an etcd member simultaneously
- Minimum: 3 CP nodes for quorum tolerance
External etcd
etcd runs on separate dedicated nodes. More resilient, harder to operate.
- CP nodes: apiserver + scheduler + controller-manager (no etcd)
- etcd nodes: 3 or 5 dedicated etcd members
- CP node failure doesn't affect etcd quorum
- Requires more nodes: minimum 3 CP + 3 etcd = 6 nodes
- Recommended for production clusters > 50 nodes
apiserver HA — Stateless Horizontal Scale
Since the apiserver is stateless (all state in etcd), you can run any number of replicas. A TCP load balancer (or VIP via keepalived/haproxy) distributes client connections across all healthy apiserver instances. There is no leader election for apiservers — all replicas serve reads and writes concurrently.
# Verify HA apiserver (kubeadm)
kubectl get pods -n kube-system -l component=kube-apiserver
# Show the API server URL your kubeconfig targets (in HA this is normally the VIP/LB, not an individual apiserver)
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
# Each kubeconfig points to the VIP; inspect apiserver endpoints
kubectl get endpoints kubernetes -n default
Scheduler and Controller-Manager Leader Election
Unlike the apiserver, the scheduler and controller-manager are NOT safe to run as multiple active instances — two schedulers could assign the same pod to two nodes. They use Kubernetes Lease-based leader election:
The leader election mechanism:
- Each replica tries to acquire a Lease object in the kube-system namespace.
- The Lease has a leaseDurationSeconds (default 15s) and a renewDeadlineSeconds (default 10s).
- The leader continuously renews the Lease by updating renewTime.
- If the leader fails to renew within leaseDuration, another replica acquires the Lease and becomes leader.
- Non-leaders sleep and periodically attempt to acquire the Lease.
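The acquire-or-renew decision in the list above can be sketched with an in-memory lease. This is a simplified model, not client-go's leader election: the `Lease` dataclass and `try_acquire_or_renew` are hypothetical names, time is passed in as a number, and the optimistic-concurrency check that a real update performs against the API server is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: Optional[str] = None
    renew_time: float = 0.0
    lease_duration: float = 15.0  # seconds


def try_acquire_or_renew(lease: Lease, identity: str, now: float) -> bool:
    """Return True if `identity` holds the lease after this attempt."""
    expired = lease.holder is None or now - lease.renew_time > lease.lease_duration
    if lease.holder == identity or expired:
        # Current holder renews, or a challenger takes over an expired lease.
        lease.holder = identity
        lease.renew_time = now
        return True
    return False  # someone else holds a still-valid lease


if __name__ == "__main__":
    lease = Lease()
    for identity, now in [("a", 0), ("b", 5), ("a", 10), ("b", 20), ("b", 26)]:
        won = try_acquire_or_renew(lease, identity, now)
        print(f"t={now:>2} {identity} -> {'leader' if won else 'standby'}")
```

In the usage example, replica `b` takes over only once more than `lease_duration` seconds have passed since `a` last renewed.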
# Check scheduler leader
kubectl get lease kube-scheduler -n kube-system -o yaml
# holderIdentity: kube-scheduler-node1_abc-uuid
# Check controller-manager leader
kubectl get lease kube-controller-manager -n kube-system -o yaml
# Watch for leadership changes
kubectl get lease -n kube-system --watch
# Controller-manager leader election flags
kube-controller-manager \
--leader-elect=true \
--leader-elect-lease-duration=15s \
--leader-elect-renew-deadline=10s \
--leader-elect-retry-period=2s
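A stalled or partitioned leader may keep acting until it notices that it can no longer renew, so two replicas can briefly both believe they are leader, for up to leaseDurationSeconds. This is the "split-brain" window. Kubernetes mitigates this with optimistic locking (resourceVersion conflicts), but double-scheduling is theoretically possible during this window.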
Static Pods — How Control Plane Components Run
On clusters provisioned by kubeadm, all control plane components (apiserver, etcd, scheduler, controller-manager) run as static pods. Static pods are managed directly by the kubelet on the node — not by the apiserver or any controller.
The kubelet's static pod path (staticPodPath in the kubelet configuration, or the deprecated --pod-manifest-path flag; kubeadm sets it to /etc/kubernetes/manifests/) contains YAML files. The kubelet watches this directory and creates/restarts pods whenever files are added, modified, or deleted.
# Static pod manifests location (kubeadm)
ls /etc/kubernetes/manifests/
# etcd.yaml
# kube-apiserver.yaml
# kube-controller-manager.yaml
# kube-scheduler.yaml
# Static pods appear as mirror pods in the API server
# (prefixed with node name, e.g. kube-apiserver-control-plane-1)
kubectl get pods -n kube-system
# Editing a static pod manifest immediately restarts the component
# (kubelet detects inotify change and recreates the pod)
vim /etc/kubernetes/manifests/kube-apiserver.yaml
Component Startup Order and Dependencies
PKI and Certificate Architecture
Every communication in the control plane uses TLS. The control plane PKI is a hierarchy of Certificate Authorities managed by kubeadm (or manually for more complex setups).
# View all cluster certificates and their expiry
kubeadm certs check-expiration
# Renew all certificates (kubeadm)
kubeadm certs renew all
# View cert details
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text | grep -A 5 "Subject Alternative"
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -dates
# Check the subject of the apiserver's etcd client cert
openssl x509 -in /etc/kubernetes/pki/apiserver-etcd-client.crt -noout -text | grep "Subject:"
Health Check Endpoints and Monitoring
Every control plane component exposes health check endpoints. These are polled by load balancers, monitoring systems, and readiness probes in the static pod manifests.
| Component | Endpoint | Port | Expected Response |
|---|---|---|---|
| kube-apiserver | /healthz | 6443 | ok |
| kube-apiserver | /readyz | 6443 | ok (all checks pass) |
| kube-apiserver | /livez | 6443 | ok |
| kube-apiserver | /metrics | 6443 | Prometheus text format |
| kube-scheduler | /healthz | 10259 | ok |
| kube-scheduler | /metrics | 10259 | Prometheus text format |
| kube-controller-manager | /healthz | 10257 | ok |
| kube-controller-manager | /metrics | 10257 | Prometheus text format |
| etcd | /health | 2379 | {"health":"true"} |
| etcd | /metrics | 2381 | Prometheus text format |
# Check apiserver health from within cluster
kubectl get --raw /healthz
kubectl get --raw /readyz
kubectl get --raw /livez
kubectl get --raw '/readyz?verbose' # Shows each check's status (quoted so the shell does not glob the "?")
# Individual readiness checks
kubectl get --raw /readyz/poststarthook/rbac/bootstrap-roles
kubectl get --raw /readyz/etcd
# From node directly
curl -sk https://localhost:6443/healthz --cert /etc/kubernetes/pki/admin.crt --key /etc/kubernetes/pki/admin.key
# Check scheduler and controller-manager from CP node
curl -sk https://127.0.0.1:10259/healthz
curl -sk https://127.0.0.1:10257/healthz
# etcd health check
ETCDCTL_API=3 etcdctl endpoint health \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key=/etc/kubernetes/pki/etcd/healthcheck-client.key
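When polling the verbose readiness endpoint from a script, it helps to extract just the failing checks. The sketch below assumes the verbose body lists one check per line, prefixed with `[+]` for passing and `[-]` for failing; both that format and the helper name `failed_checks` are assumptions for illustration, and the sample text is made up rather than captured from a cluster.

```python
def failed_checks(verbose_body: str) -> list:
    """Names of checks reported as failing in a verbose health response."""
    failed = []
    for line in verbose_body.splitlines():
        line = line.strip()
        if line.startswith("[-]"):
            # e.g. "[-]etcd failed: reason withheld" -> "etcd"
            failed.append(line[3:].split(" ", 1)[0])
    return failed


if __name__ == "__main__":
    # Hypothetical sample body, not real cluster output.
    sample = (
        "[+]ping ok\n"
        "[-]etcd failed: reason withheld\n"
        "[+]poststarthook/rbac/bootstrap-roles ok\n"
        "readyz check failed\n"
    )
    print(failed_checks(sample))
```

A monitoring script could feed this the body returned by the verbose readiness request and alert on any non-empty result.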
Critical Metrics to Monitor
| Metric | Component | Alert Threshold | Meaning |
|---|---|---|---|
| apiserver_request_duration_seconds | apiserver | p99 > 1s | API request latency by verb/resource |
| apiserver_request_total | apiserver | error rate > 1% | Total requests by code/verb/resource |
| etcd_request_duration_seconds | etcd | p99 > 100ms | etcd operation latency (critical for apiserver throughput) |
| etcd_server_leader_changes_seen_total | etcd | > 0 in 5min | etcd leader elections (indicates instability) |
| etcd_mvcc_db_total_size_in_bytes | etcd | > 80% of configured quota | etcd database size (default quota 2GB; 8GB is the suggested maximum) |
| scheduler_schedule_attempts_total | scheduler | unschedulable > 0 | Pods that couldn't be scheduled |
| scheduler_pending_pods | scheduler | > 50 sustained | Pods waiting to be scheduled |
| workqueue_depth | controller-manager | > 100 sustained | Controller reconciliation queue backlog |
| workqueue_queue_duration_seconds | controller-manager | p99 > 5s | Time items wait in controller work queue |
| apiserver_current_inflight_requests | apiserver | near max-requests-inflight | Concurrency pressure on apiserver |
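As an illustration of the "error rate > 1%" threshold, the sketch below computes a 5xx ratio from request counts grouped by HTTP status code. It assumes the counts are increases over the alerting window (counters such as apiserver_request_total are cumulative); `error_rate` is a hypothetical helper and the numbers are invented.

```python
def error_rate(requests_by_code: dict) -> float:
    """Fraction of requests whose HTTP status code is 5xx."""
    total = sum(requests_by_code.values())
    if total == 0:
        return 0.0
    errors = sum(n for code, n in requests_by_code.items()
                 if 500 <= int(code) <= 599)
    return errors / total


if __name__ == "__main__":
    # Invented per-window request counts keyed by status code.
    window = {"200": 9700, "404": 100, "500": 150, "503": 50}
    rate = error_rate(window)
    print(f"error rate {rate:.2%}, alert: {rate > 0.01}")
```

Client errors (4xx) are left out of the numerator on purpose: they usually reflect caller mistakes rather than control-plane trouble.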
Troubleshooting the Control Plane
apiserver Not Responding
# Step 1: Check if the static pod is running on CP node
ssh cp-node-1
crictl ps | grep kube-apiserver
# Step 2: Check kubelet logs (manages static pods)
journalctl -u kubelet --since "10 minutes ago" | grep apiserver
# Step 3: Check apiserver logs (kubectl works only if another apiserver replica is still reachable)
kubectl logs -n kube-system kube-apiserver-cp-node-1
# Or directly via crictl:
crictl logs $(crictl ps --name kube-apiserver -q)
# Step 4: Check manifest for syntax errors
cat /etc/kubernetes/manifests/kube-apiserver.yaml | python3 -c "import sys,yaml;yaml.safe_load(sys.stdin)"
# Step 5: Verify etcd connectivity
ETCDCTL_API=3 etcdctl endpoint health ...
# If etcd is down → apiserver will not serve writes (reads from cache possible)
# Step 6: Check certificates
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -dates
kubeadm certs check-expiration
Scheduler Not Scheduling Pods
# Check if scheduler is running
kubectl get pods -n kube-system -l component=kube-scheduler
# Check leader election
kubectl get lease kube-scheduler -n kube-system
# Check scheduler logs for specific pod
kubectl logs -n kube-system kube-scheduler-cp-node-1 | grep "Failed to schedule"
kubectl logs -n kube-system kube-scheduler-cp-node-1 | grep <pod-name>
# Get scheduling failure events
kubectl get events --field-selector reason=FailedScheduling --all-namespaces
# Describe the unscheduled pod
kubectl describe pod <pod-name> # Look at "Events:" section
# Common causes:
# - Insufficient CPU/memory on all nodes
# - Taint with no toleration
# - NodeSelector/Affinity that matches no nodes
# - PVC cannot be bound (VolumeBinding plugin failure)
Controller Not Reconciling
# Check controller-manager status
kubectl get pods -n kube-system -l component=kube-controller-manager
kubectl logs -n kube-system kube-controller-manager-cp-node-1 --tail=100
# Check leader
kubectl get lease kube-controller-manager -n kube-system
# Check work queue metrics (if prometheus is available)
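kubectl port-forward -n kube-system kube-controller-manager-cp-node-1 10257:10257 &
# /metrics requires an authorized bearer token; $TOKEN is assumed to hold one
curl -sk -H "Authorization: Bearer $TOKEN" https://localhost:10257/metrics | grep workqueue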
# Check for throttling / rate limiting
kubectl logs ... | grep "Throttling request"
# Check RBAC permissions for specific controller SA
kubectl auth can-i list deployments --as=system:serviceaccount:kube-system:deployment-controller
etcd Issues
# etcd cannot reach quorum
# Symptom: apiserver returns 503 for writes, etcd logs show "failed to reach quorum"
ETCDCTL_API=3 etcdctl endpoint status --write-out=table ...
# etcd database size too large
ETCDCTL_API=3 etcdctl endpoint status ... # check DB SIZE column
# Compact and defragment:
ETCDCTL_API=3 etcdctl compact "$(ETCDCTL_API=3 etcdctl endpoint status --write-out=json | jq -r '.[0].Status.header.revision')"
ETCDCTL_API=3 etcdctl defrag
# etcd slow writes / high fsync latency
# Check disk performance:
iostat -x 1 5
# etcd is sensitive to fsync latency; use SSDs; avoid NFS/network storage
# Check etcd alarms
ETCDCTL_API=3 etcdctl alarm list
ETCDCTL_API=3 etcdctl alarm disarm # After resolving the root cause
Upgrading the Control Plane
Control plane upgrades must be done before upgrading worker nodes. The supported pattern is to upgrade one minor version at a time (e.g., 1.29 → 1.30, not 1.29 → 1.31).
# Using kubeadm (standard upgrade flow)
# 1. Upgrade kubeadm on first CP node
apt-get update && apt-get install -y kubeadm=1.30.0-1.1
# 2. Verify upgrade plan
kubeadm upgrade plan
# 3. Apply upgrade to first CP node (upgrades apiserver, scheduler, controller-manager, etcd)
kubeadm upgrade apply v1.30.0
# 4. Upgrade kubeadm on remaining CP nodes
# Then on each additional CP node:
kubeadm upgrade node
# 5. Upgrade kubelet and kubectl on CP nodes
apt-get install -y kubelet=1.30.0-1.1 kubectl=1.30.0-1.1
systemctl daemon-reload && systemctl restart kubelet
# 6. Then upgrade worker nodes (drain → upgrade kubelet → uncordon)
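The one-minor-version-at-a-time rule can be checked before starting an upgrade. The following is a minimal sketch; `parse_minor` and `is_supported_upgrade` are hypothetical helpers, not part of kubeadm, and the check covers only the control plane version jump, not kubelet skew.

```python
def parse_minor(version: str) -> tuple:
    """'v1.30.0' or '1.30.0-1.1' -> (1, 30)."""
    core = version.lstrip("v").split("-", 1)[0]
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def is_supported_upgrade(current: str, target: str) -> bool:
    """True for a patch upgrade or a jump of exactly one minor version."""
    cur_major, cur_minor = parse_minor(current)
    tgt_major, tgt_minor = parse_minor(target)
    return cur_major == tgt_major and 0 <= tgt_minor - cur_minor <= 1


if __name__ == "__main__":
    for cur, tgt in [("1.29.4", "v1.30.0"), ("1.29.4", "v1.31.0")]:
        print(cur, "->", tgt, "supported:", is_supported_upgrade(cur, tgt))
```

A guard like this fits naturally at the top of an upgrade script, before any package is installed.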
Production Control Plane Checklist
20-Item Production Readiness Checklist
| # | Item | Default | Production Setting |
|---|---|---|---|
| 1 | etcd node count | 1 (single) | 3 or 5 (odd, for quorum) |
| 2 | apiserver replicas | 1 | 3 (behind VIP/LB) |
| 3 | etcd encryption at rest | Disabled | Enable with --encryption-provider-config |
| 4 | Audit logging | Disabled | Enable with policy covering Secrets, RBAC, exec |
| 5 | Certificate rotation | Manual | Use cert-manager or kubeadm auto-renewal + alerting |
| 6 | etcd backup | None | Automated hourly snapshots to off-cluster storage |
| 7 | etcd disk | Any | Dedicated NVMe SSD; separate from OS disk |
| 8 | API server request limits | 400 non-mutating / 200 mutating in flight | Tune per cluster size; enable APF FlowSchemas |
| 9 | etcd quota | 2GB | Increase to 8GB for large clusters; add compaction cron |
| 10 | Authorization mode | AlwaysAllow (dev) | --authorization-mode=Node,RBAC |
| 11 | Admission plugins | minimal | Enable PodSecurity, NodeRestriction, ResourceQuota |
| 12 | anonymous-auth | true | Disable: --anonymous-auth=false |
| 13 | insecure-port | 0 (disabled v1.20+; flag removed in v1.24) | On versions before v1.24, ensure --insecure-port=0 |
| 14 | profiling | Enabled | Disable in production: --profiling=false |
| 15 | etcd peer encryption | mTLS | Verify peer certs are separate from client certs |
| 16 | Control plane node isolation | Tainted NoSchedule by kubeadm (node-role.kubernetes.io/control-plane) | Keep the taint; do not run application workloads on CP nodes |
| 17 | Resource requests on CP pods | None (static pods) | Set requests in static pod manifests; use separate resource classes |
| 18 | etcd compaction | Auto (every 5min default) | Verify compaction is running; alert on DB growth rate |
| 19 | OIDC integration | x509 only | Integrate with org IdP (Dex, Okta) for human user authn |
| 20 | Monitoring coverage | None | Prometheus + Alertmanager for all 10 metrics above |
Dependency Graph and Next Files
Prerequisites (Already Covered)
This File Covers
- Control plane component roles
- Communication topology
- HA: stacked vs external etcd
- Leader election mechanism
- Static pods and bootstrap
- PKI and certificate hierarchy
- Health checks and metrics
- Upgrade procedure