File Path: 00-foundations/01-history-of-kubernetes.html
Prerequisites: 00-introduction.html (core concepts and terminology)
Concepts Covered
Google Borg, Google Omega, Kubernetes origin, CNCF donation, Release history, Design decisions, API evolution, Community governance, Borg vs K8s architecture, Key contributors, Major feature milestones, Ecosystem growth

History of Kubernetes

Kubernetes did not emerge in a vacuum. It carries over a decade of hard-won operational knowledge from running the world's largest fleets of containers at Google. Understanding this lineage is not merely academic — the architectural decisions made in Borg and Omega explain why Kubernetes works the way it does today, and why certain design choices that seem unusual have deep roots in production experience at scale.

Why this history matters for engineers
When you hit a Kubernetes design limitation or wonder why a feature works a certain way, tracing it back to Borg's operational constraints or Omega's transaction model gives you the mental model to predict behaviour, choose the right workaround, and make better architectural decisions.

1 · Google Borg (2003–2013)

1.1 Origin and Purpose

Around 2003, Google's infrastructure team built the first version of what would become Borg. The immediate trigger was the rapid growth of Google's internal services: web search and its crawler at first, and later products such as Gmail, Maps, and YouTube, each of which needed to run on thousands of machines simultaneously. Managing this by hand was impossible.

Borg's design goal was to maximise utilisation of Google's data centre hardware while providing a reliable, scalable runtime for both long-running services (user-facing production jobs) and batch workloads (MapReduce, analytics jobs) on the same shared fleet of machines.

1.2 Borg Architecture

Borgmaster

  • Centralised control plane
  • Replicated 5× for HA (Paxos)
  • Handles all scheduling
  • Stores state in Paxos-replicated log
  • Serves read/write API (not HTTP)
  • Equivalent: kube-apiserver + scheduler + etcd

Borglet

  • Agent on every machine
  • Launches/stops tasks
  • Reports machine state to Borgmaster
  • Restarts failed tasks locally
  • Manages local cgroup resources
  • Equivalent: kubelet

Borg Scheduler

  • Feasibility checking (like K8s filter plugins)
  • Scoring with a hybrid of worst-fit (spreading) and best-fit (packing), chosen to reduce stranded resources
  • Resource reclamation: reclaims reserved-but-unused resources for lower-priority work
  • Priority and quota system
  • Equivalent: kube-scheduler
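
The two scheduling phases listed above (feasibility checking, then scoring) can be sketched in a few lines. This is a minimal illustration, not Borg's or kube-scheduler's actual algorithm: the `Machine` and `Task` types, the `leftover` score, and the two policy names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    free_cpu: float  # free CPU cores (hypothetical unit)
    free_mem: float  # free memory in GiB (hypothetical unit)


@dataclass
class Task:
    cpu: float
    mem: float


def feasible(machine: Machine, task: Task) -> bool:
    """Phase 1, feasibility: does the task fit on this machine at all?"""
    return machine.free_cpu >= task.cpu and machine.free_mem >= task.mem


def _fraction_left(free: float, needed: float) -> float:
    """Share of a free resource that would remain after placement."""
    return (free - needed) / free if free > 0 else 0.0


def leftover(machine: Machine, task: Task) -> float:
    """Phase 2, scoring input: average leftover fraction across CPU and memory."""
    return (_fraction_left(machine.free_cpu, task.cpu)
            + _fraction_left(machine.free_mem, task.mem)) / 2


def schedule(machines, task, policy="best_fit"):
    """Return the chosen machine name, or None if no machine is feasible."""
    candidates = [m for m in machines if feasible(m, task)]
    if not candidates:
        return None
    if policy == "best_fit":
        # Pack tightly: the machine with the least leftover wins.
        chosen = min(candidates, key=lambda m: leftover(m, task))
    else:
        # "worst_fit": spread load, the machine with the most leftover wins.
        chosen = max(candidates, key=lambda m: leftover(m, task))
    return chosen.name
```

Best fit leaves large machines free for large tasks but concentrates failures; worst fit spreads load but fragments capacity. A hybrid scorer sits between the two.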

1.3 Borg Key Concepts That Influenced Kubernetes

| Borg concept | Kubernetes equivalent | Lesson learned |
| --- | --- | --- |
| Job (set of identical Tasks) | ReplicaSet / Deployment | Grouping identical tasks simplifies management |
| Task | Pod / Container | The atomic scheduling unit |
| Alloc (resource envelope shared by tasks) | Pod (shared network/cgroup envelope for containers) | Co-located containers should share IPC and network |
| Priority classes (prod vs. batch) | PriorityClass, QoS classes | Mixed workloads need preemption and tiered eviction |
| Resource reclamation (compaction) | VPA, resource requests vs. limits | Applications over-request; reclaim unused capacity |
| BNS (Borg Name Service) | Service + CoreDNS | Service discovery must be cluster-native, not DNS-only |
| Sigma (Borg UI/debugger) | kubectl, Dashboard | Operators need rich introspection tooling |
| Faborg (failure injection) | Chaos engineering tools | Build for failure, not just availability |
| BCL job files (Borg Configuration Language) | Kubernetes YAML manifests | Declarative job descriptions beat imperative scripts |

1.4 Lessons Borg Taught Google (and Kubernetes)

The 10 most important operational lessons from Borg
  1. Don't expose raw machine IDs to users. Borg users could see which machine a task ran on and sometimes hard-coded assumptions about it. This created fragility. Kubernetes abstracts node identity — workloads should be location-agnostic.
  2. Allocs (pods) are the right unit of co-location. Tasks that needed to communicate or share data should run as a group with shared resources. This became the Pod — the fundamental scheduling atom in K8s.
  3. Introspection is critical at scale. Sigma (Borg's dashboarding/debugging system) was the most heavily used internal tool. K8s invested early in kubectl describe and events, and later added the metrics API.
  4. The master is the bottleneck. Borgmaster's monolithic architecture limited scale. K8s addressed this with watch semantics (not poll), horizontal API server scaling, and etcd as a separate store.
  5. Users want higher-level abstractions. Raw Borg jobs were too low-level. Users built frameworks on top. K8s formalised this as Deployments, StatefulSets, DaemonSets — and made them extensible via CRDs.
  6. Configuration sprawl is a real problem. BCL (Borg Configuration Language) became too complex. K8s YAML, combined with Helm/Kustomize/CUE, gives structured templating without an embedded language.
  7. Resource reclamation matters enormously. Real utilisation is often 10–30% of requested. Borg reclaimed unused reservations. K8s exposes this via requests vs. limits, and VPA automatically right-sizes.
  8. Priority and preemption are necessary for mixed workloads. Running batch next to production requires a clear priority hierarchy and controlled preemption. K8s PriorityClass + preemption implement this.
  9. Treat infrastructure as code. Borg job files were versioned in source control at Google. K8s doubles down on this — GitOps (ArgoCD/Flux) is the standard operating model.
  10. Health checking must be cluster-native. External health monitors that SSH into machines do not scale. Liveness and readiness probes are executed by the kubelet, which runs on the same node as the workload.
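
Lesson 8 (priority and preemption) can be made concrete with a small sketch. It is a deliberate simplification and not the kube-scheduler's real preemption logic: tasks are hypothetical `(name, priority, cpu)` tuples, only CPU is considered, and the function name `pick_victims` is made up for the example.

```python
def pick_victims(running, incoming_priority, cpu_needed, free_cpu):
    """Choose tasks to evict so that an incoming task fits.

    running: list of (name, priority, cpu) tuples on one machine.
    Returns the list of victim names (possibly empty), or None when the
    incoming task cannot fit even after evicting every lower-priority task.
    """
    victims = []
    # Evict the lowest-priority tasks first.
    for name, priority, cpu in sorted(running, key=lambda t: t[1]):
        if free_cpu >= cpu_needed:
            break  # enough room already, stop evicting
        if priority >= incoming_priority:
            break  # never preempt equal or higher priority work
        victims.append(name)
        free_cpu += cpu
    return victims if free_cpu >= cpu_needed else None
```

With batch work at low priority and production at high priority, a production task evicts batch tasks when needed, while a batch task can never displace production.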

2 · Google Omega (2011–2013)

2.1 What Omega Was

While Borg continued evolving, a separate Google infrastructure team designed Omega as a research prototype to address Borg's architectural limitations. Omega was published academically in the 2013 EuroSys paper "Omega: flexible, scalable schedulers for large compute clusters".

2.2 Omega's Key Innovations

| Innovation | Description | K8s impact |
| --- | --- | --- |
| Shared-state scheduling | All schedulers see the full cluster state via a shared, consistent in-memory store, so there is no single scheduler bottleneck. Multiple schedulers run in parallel with optimistic concurrency (OCC). | K8s allows multiple schedulers via schedulerName. The scheduler framework plugin model comes from Omega's extensibility ideas. |
| Optimistic concurrency control | Schedulers speculatively assign tasks; a transaction commits only if no conflicts occurred. Conflicts trigger retry, not blocking. | K8s uses resourceVersion optimistic locking on API objects. Etcd's MVCC is the underlying mechanism. |
| No central scheduler lock | Borg's scheduler held a global lock. Omega's OCC model eliminated this, allowing 10× throughput. | K8s API server is horizontally scalable; controllers and schedulers run independently using watch + compare-and-swap. |
| Cell state visibility | All components can read the full cluster state, which enables richer scheduling decisions (e.g., topology awareness). | K8s's watch API makes full cluster state available to any authorised client, enabling topology-aware scheduling, custom schedulers, and cluster-autoscaler. |
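
The optimistic concurrency idea in the table can be sketched with a tiny in-memory store. This is an illustration under stated assumptions, not the Kubernetes API or etcd client: `Store`, `Conflict`, and `bind_with_retry` are hypothetical names, and an integer stands in for resourceVersion.

```python
class Conflict(Exception):
    """Raised when a write is based on a stale resource version."""


class Store:
    """In-memory stand-in for the API server plus etcd (hypothetical)."""

    def __init__(self):
        self._objects = {}  # key -> (resource_version, value)

    def get(self, key):
        # Unknown keys read as version 0 with no value.
        return self._objects.get(key, (0, None))

    def update(self, key, expected_version, value):
        """Compare-and-swap: write only if the version is unchanged."""
        current_version, _ = self.get(key)
        if current_version != expected_version:
            raise Conflict(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._objects[key] = (current_version + 1, value)
        return current_version + 1


def bind_with_retry(store, pod, node, max_attempts=3):
    """Read-modify-write loop: retry on conflict instead of taking a lock."""
    for _ in range(max_attempts):
        version, bound_to = store.get(pod)
        if bound_to is not None:
            return bound_to  # another scheduler won; accept its decision
        try:
            store.update(pod, version, node)
            return node
        except Conflict:
            continue  # state changed underneath us; re-read and retry
    raise Conflict(f"{pod}: gave up after {max_attempts} attempts")
```

No writer ever blocks another: a stale write simply fails and the caller re-reads, which is the behaviour a Kubernetes client sees as a conflict error on update.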

2.3 Architecture Comparison: Borg vs Omega vs Kubernetes

[Figure 1 diagram: three panels linked by arrows labelled "informed" and "influenced"]
Borg: Borgmaster (5×, Paxos) combining scheduler, state, and API in one monolith, with a Borglet agent on each machine. Monolithic control plane; single scheduler (locked); internal-only API; Paxos for HA; 2003–present (Google internal).
Omega: shared-state store (consistent in-memory cell state) with Scheduler A (batch jobs) and Scheduler B (service jobs). Parallel schedulers (OCC); no global scheduler lock; research prototype; 2013 EuroSys paper; never deployed at scale.
Kubernetes: kube-apiserver (REST API + Watch + etcd), scheduler (plugin framework), controllers (30+ reconcilers). Decoupled components; open standard REST API; etcd (Raft) for HA; CRD extensibility; 2014–present (open source).
Figure 1 — Architectural evolution: Borg (monolithic, internal) → Omega (parallel schedulers, OCC, research) → Kubernetes (open, decoupled, extensible).

3 · Birth of Kubernetes (2013–2014)

3.1 The Origin Story

In mid-2013, three Google engineers (Joe Beda, Brendan Burns, and Craig McLuckie), then working on Google's cloud products and familiar with Borg and Omega, began a skunkworks project internally to build an open-source container orchestrator. Their insight was that Docker's container format was becoming an industry standard, but nobody had built the orchestration layer for it yet.

They were joined by Brian Grant (who led much of the API design), Tim Hockin (networking), and many others. The project was internally code-named "Project Seven" (a reference to Star Trek's Seven of Nine — a Borg character, fittingly).

3.2 Founding Design Principles

The founding team made several explicit architectural decisions that deliberately diverged from Borg:

| Decision | Borg approach | Kubernetes approach | Rationale |
| --- | --- | --- | --- |
| API surface | Internal RPC, not HTTP | Public REST API over HTTPS | Open ecosystem; any language can speak HTTP |
| State storage | Paxos-replicated in-memory in Borgmaster | External etcd (Raft) | Decouple storage from control plane; API servers are stateless |
| Job model | Single "Job" concept | Multiple resource types (Pod, RC, Service, etc.) | Different workloads need different management semantics |
| Networking | Custom Google fabric | CNI plugin interface | Cloud-agnostic; any network fabric can implement CNI |
| Storage | Google's Colossus (GFS) | CSI plugin interface | Vendor-neutral; EBS, GCP PD, Ceph, NFS all work the same |
| Identity model | Internal LOAS/GAIA | Pluggable auth (x509, OIDC, tokens) | Works with any existing enterprise identity system |
| Label system | No labels; metadata was encoded in job names and not queryable across jobs | First-class labels with selectors | Flexible grouping, decouples controllers from specific resource names |
| Extension model | Fork Borg (very hard) | CRDs + Webhooks + Aggregation API | Users can extend without forking the core |
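
The label-and-selector row is easiest to see in code. The sketch below covers only equality-based matching and uses made-up pod names and labels; real Kubernetes selectors also support set-based expressions, which are left out here.

```python
def matches(selector: dict, labels: dict) -> bool:
    """Equality-based selector: every selector key/value must be present."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(pods: dict, selector: dict) -> list:
    """Return the names of pods whose labels satisfy the selector."""
    return [name for name, labels in pods.items() if matches(selector, labels)]


# Hypothetical pods: names and labels are illustrative only.
pods = {
    "web-1": {"app": "web", "track": "stable"},
    "web-2": {"app": "web", "track": "canary"},
    "db-1": {"app": "db"},
}
```

A Service selecting `{"app": "web"}` picks up both web pods regardless of their names, and narrowing the selector to `{"app": "web", "track": "canary"}` redirects it without touching the pods themselves.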

3.3 First Public Commit and Launch

June 2014

First public commit — Kubernetes v0.1

Joe Beda made the first public commit to GitHub on June 6, 2014. The initial commit contained roughly 47,500 lines of Go, Bash, and Markdown across 250 files. The core concepts were already there: Pods, Replication Controllers, Services, and a REST API. Google announced the project at DockerCon 2014.

July 2014

Microsoft, Red Hat, IBM, Docker join

Within one month of the announcement, major vendors committed to contributing. This was the first sign that K8s would become the industry standard rather than a Google-proprietary system.

November 2014

v0.4 — Namespaces, resource quotas, persistent volumes

Early features that are still core today: Namespaces for multi-tenancy, ResourceQuotas to limit namespace consumption, and the first PersistentVolume implementation (GCE PD-backed).

4 · Kubernetes v1.0 and CNCF (2015)

July 2015

Kubernetes v1.0 Released (stable)

Announced at OSCON 2015, v1.0 was declared production-ready. Along with the release, Google and the Linux Foundation announced the Cloud Native Computing Foundation (CNCF), with Kubernetes as its first seed project. This was a strategic move: by donating K8s to a neutral foundation, Google signalled that no single vendor would control it — which accelerated enterprise adoption.

November 2015

KubeCon North America I — 500 attendees

The first KubeCon had 500 attendees. By 2023 it regularly draws 10,000+ in person and tens of thousands virtually. This growth arc mirrors the explosive adoption of K8s across the industry.

2015–2016

v1.1 – v1.3: Horizontal autoscaling, federation, node resource management

v1.1 added the Horizontal Pod Autoscaler (HPA) and Job objects. v1.2 added ConfigMaps, Ingress, DaemonSets, FlexVolume, and rolling updates through the Deployment object (beta). v1.3 added init containers (alpha), PetSets (alpha, later renamed StatefulSets), and cluster federation (alpha).

5 · Major Version Milestones (1.0 → 1.32)

Full milestone table — v1.0 through v1.32 with key features
| Version | Date | Key features / changes |
| --- | --- | --- |
| v1.0 | Jul 2015 | GA release, CNCF donation. Pods, RCs, Services, Namespaces, basic auth. |
| v1.1 | Nov 2015 | HPA, Job objects, iptables-based kube-proxy. |
| v1.2 | Mar 2016 | Deployments (beta), ConfigMaps, Ingress (beta), DaemonSets, FlexVolume. 1000-node clusters. |
| v1.3 | Jul 2016 | Init containers (alpha), PetSets (alpha), Cluster Federation (alpha). 2000-node clusters. |
| v1.4 | Sep 2016 | kubeadm (alpha), ScheduledJobs (alpha, CronJob predecessor), PodDisruptionBudgets (alpha). |
| v1.5 | Dec 2016 | StatefulSets (beta, renamed from PetSets), kubefed (federation), Windows node support (alpha), CRI (alpha), RBAC (alpha). |
| v1.6 | Mar 2017 | RBAC (beta), etcd v3 default, 5000-node clusters, dynamic volume provisioning (GA). |
| v1.7 | Jun 2017 | Network Policy GA, StatefulSet rolling updates (beta), API aggregation layer (beta), CustomResourceDefinitions (beta, replacing ThirdPartyResources). |
| v1.8 | Sep 2017 | RBAC GA, CronJob (beta), Priority and Preemption (alpha), Volume snapshots (alpha). |
| v1.9 | Dec 2017 | Workloads API GA (Deployments, DaemonSets, ReplicaSets, StatefulSets stable under apps/v1). CSI (alpha). |
| v1.10 | Mar 2018 | External cloud provider support (cloud-controller-manager alpha). CSI (beta). |
| v1.11 | Jun 2018 | IPVS kube-proxy (stable), CoreDNS GA. Dynamic kubelet config (beta). |
| v1.12 | Sep 2018 | RuntimeClass (alpha), TLS bootstrapping improvements. |
| v1.13 | Dec 2018 | kubeadm GA, CSI GA, CoreDNS becomes the default DNS server. SimplestKubernetesRelease™: fewest features, highest quality. |
| v1.14 | Mar 2019 | Windows nodes (stable), PersistentLocalVolumes GA, kubectl plugin mechanism (stable). |
| v1.15 | Jun 2019 | CRD structural schemas, CustomResourceWebhookConversion, go module support. |
| v1.16 | Sep 2019 | CRD GA (v1); beta workload APIs for Deployments/DaemonSets/ReplicaSets (extensions/v1beta1, apps/v1beta1, apps/v1beta2) no longer served; EndpointSlices (alpha). |
| v1.17 | Dec 2019 | Cloud provider labels GA, Volume snapshots (beta), CSI migration (beta, moving in-tree to CSI). |
| v1.18 | Mar 2020 | Topology manager (beta), Server-side apply (beta), IngressClass resource, HPA v2 (beta). |
| v1.19 | Aug 2020 | Ingress GA, Immutable Secrets/ConfigMaps (beta), EndpointSlices (beta), Storage capacity tracking (alpha). |
| v1.20 | Dec 2020 | Docker shim deprecation announced. Volume snapshots GA, API Priority and Fairness (beta), graceful node shutdown (alpha). |
| v1.21 | Apr 2021 | CronJob GA, Immutable Secrets/ConfigMaps GA, PodDisruptionBudget GA, EndpointSlices GA, IPv4/IPv6 dual-stack (beta), PodSecurityPolicy deprecated. |
| v1.22 | Aug 2021 | Server-side apply GA, PodSecurity admission (alpha), memory manager (beta), removal of beta Ingress/RBAC/CRD APIs. |
| v1.23 | Dec 2021 | FlexVolume deprecated, HPA v2 GA, IPv4/IPv6 dual-stack GA, ephemeral containers (beta), PodSecurity admission (beta, replacing PSP). |
| v1.24 | May 2022 | Docker shim removed. gRPC probes (beta), OpenAPI v3 (beta). |
| v1.25 | Aug 2022 | PodSecurityPolicy removed (deprecated since 1.21). PodSecurity admission GA, ephemeral containers GA, CSI migration complete for most in-tree providers, cgroup v2 (stable). |
| v1.26 | Dec 2022 | Cross-namespace VolumeDataSource (alpha), CPUManager static policy improvements, ValidatingAdmissionPolicy (alpha). |
| v1.27 | Apr 2023 | SeccompDefault GA, In-place pod resource resize (alpha), node log access API (alpha). |
| v1.28 | Aug 2023 | Retroactive default StorageClass (GA), NodeSwap (beta), sidecar containers (alpha, KEP-753). |
| v1.29 | Dec 2023 | ReadWriteOncePod PV access mode (GA), KMS v2 encryption at rest (GA), sidecar containers (beta), LoadBalancer IP mode (alpha). |
| v1.30 | Apr 2024 | Structured auth config (beta), ValidatingAdmissionPolicy GA. |
| v1.31 | Aug 2024 | AppArmor GA, persistent volume last phase transition time GA, nftables kube-proxy (beta). |
| v1.32 | Dec 2024 | Asynchronous preemption (alpha), mutating admission policies (alpha), DRA structured parameters (beta). |

6 · The Docker Shim Removal — A Case Study in API Evolution

One of the most misunderstood events in Kubernetes history is the removal of the Docker shim in v1.24. This section explains exactly what happened and why it was the right decision.

6.1 What Was the Docker Shim?

When Kubernetes introduced the Container Runtime Interface (CRI) in v1.5, Docker did not natively implement CRI. To continue supporting Docker, the Kubernetes team added a shim — a translation layer built into the kubelet that converted CRI calls into Docker API calls. This shim was maintained inside the kubelet source tree.

[Figure 2 diagram]
Before v1.24 (with dockershim): kubelet → CRI gRPC → dockershim (inside kubelet) → Docker API → dockerd → containerd (via docker) → runc / OCI
After v1.24 (direct CRI, simpler and faster): kubelet → CRI gRPC (direct) → containerd or CRI-O → runc / OCI. Two fewer hops mean lower latency; fewer moving parts mean easier debugging.
Figure 2 — Docker shim removal: before (kubelet→shim→dockerd→containerd→runc) vs after (kubelet→containerd→runc). Containers kept running; only the shim was removed.
Common misconception: "Kubernetes dropped Docker support"
Kubernetes did NOT stop running Docker-built images. Docker images are OCI-compliant container images; they run perfectly on containerd or CRI-O. What was removed was the special-case shim that translated Kubernetes CRI calls into the Docker daemon API. If you build images with Docker, they continue to work unchanged.

6.2 Removal Timeline

| Version | Action |
| --- | --- |
| v1.5 (Dec 2016) | CRI interface introduced; dockershim added as compatibility layer |
| v1.20 (Dec 2020) | dockershim deprecated; warning added to kubelet logs |
| v1.23 (Dec 2021) | Last release with dockershim in-tree |
| v1.24 (May 2022) | dockershim removed from kubelet. cri-dockerd external shim available for Docker users. |

7 · CNCF Ecosystem Growth

Kubernetes' donation to the CNCF in 2015 seeded an entire ecosystem of complementary projects. The CNCF now has 160+ projects spanning every layer of the cloud-native stack.

7.1 Key CNCF Projects by Layer

| Layer | Notable projects (CNCF status where noted) | Role |
| --- | --- | --- |
| Runtime | containerd, CRI-O | CRI-compliant container runtimes |
| Networking | CNI (spec), Cilium, Calico (not a CNCF project) | Pod networking, NetworkPolicy, eBPF |
| Storage | Rook, Longhorn | Cloud-native distributed storage |
| Service mesh | Istio (graduated 2023), Linkerd (graduated) | mTLS, traffic management, observability |
| Monitoring | Prometheus, Thanos | Metrics collection, long-term storage |
| Tracing | OpenTelemetry, Jaeger | Distributed tracing, OTLP |
| Logging | Fluentd, Fluent Bit | Log aggregation and routing |
| GitOps / CD | Argo CD (graduated), Flux (graduated) | Declarative continuous delivery |
| Package management | Helm | Kubernetes application packaging |
| Policy | OPA (graduated) | Policy-as-code for admission, authorization |
| Security | Falco, TUF, in-toto | Runtime threat detection, supply chain |
| Cluster lifecycle | Cluster API | Declarative cluster provisioning |
| Registry | Harbor | Container image registry with scanning |

7.2 The Container Orchestrator Wars (2015–2017)

Kubernetes did not become the de-facto standard without competition. Three orchestrators competed from 2015 to 2017:

Docker Swarm, Apache Mesos, and Kubernetes — comparative analysis
| Dimension | Docker Swarm | Apache Mesos + Marathon | Kubernetes |
| --- | --- | --- | --- |
| Origin | Docker Inc., 2015 | Mesos: UC Berkeley, 2009, adopted by Twitter, Airbnb, and Apple; Marathon: Mesosphere | Google, 2014 |
| Ease of setup | Extremely easy (built into Docker) | Complex multi-component stack | Moderate; kubeadm simplified this |
| Scheduling model | Simple, host-affinity only | Two-level (Mesos + Marathon). Very flexible for heterogeneous workloads. | Rich multi-constraint scheduler with plugin framework |
| API | Docker Compose-like YAML, Docker API | Marathon REST API, Mesos API | Open versioned REST API, CRDs |
| Networking | Docker overlay network | Custom; CNI support added later | CNI plugin model adopted early |
| Extensibility | Limited | High (frameworks), but complex | Very high (CRDs, webhooks, operators) |
| Ecosystem | Docker-centric | Datacenter-oriented, Hadoop/Spark focus | CNCF, vendor-neutral, cloud-native |
| Outcome | Niche use; Docker's Enterprise business was sold to Mirantis in 2019 | Still used at scale in some enterprises; D2iQ (formerly Mesosphere) pivoted to Kubernetes | Industry standard as of 2018–2019 |

Kubernetes won for several reasons: a rich API, a strong community, CNCF neutrality, powerful extensibility (CRDs), and managed Kubernetes services from all major cloud providers (GKE generally available in 2015, AKS and EKS in 2018) that removed much of the operational complexity.

8 · Managed Kubernetes Services

The most significant accelerator for enterprise Kubernetes adoption was managed services — cloud providers abstracting away control plane management entirely.

| Service | Provider | GA date | Notable features |
| --- | --- | --- | --- |
| GKE (Google Kubernetes Engine) | Google Cloud | Aug 2015 | Autopilot mode, Workload Identity, GKE Sandbox (gVisor), integrated logging |
| AKS (Azure Kubernetes Service) | Microsoft Azure | Jun 2018 | Virtual nodes (ACI), Azure AD integration, confidential computing nodes |
| EKS (Elastic Kubernetes Service) | AWS | Jun 2018 | Fargate serverless nodes, IAM for ServiceAccounts, EKS Anywhere (bare metal) |
| DOKS (DigitalOcean Kubernetes) | DigitalOcean | May 2019 | Simplified cluster management for smaller teams |
| OKE (Oracle Container Engine) | Oracle Cloud | 2018 | Virtual nodes, free control plane |
| ROKS (Red Hat OpenShift on IBM Cloud) | IBM Cloud | 2019 | OpenShift layer (SCC, Routes, Operators) on top of Kubernetes |

8.1 On-Premises Distributions

| Distribution | Vendor | Key differentiator |
| --- | --- | --- |
| OpenShift | Red Hat | Enterprise hardening, SCCs, Routes, built-in CI/CD (Tekton), Operator Framework |
| Rancher (RKE/RKE2) | SUSE | Multi-cluster management, simplified UI, Rancher Desktop for local dev |
| Tanzu Kubernetes Grid | VMware/Broadcom | vSphere integration, Carvel tooling, regulated industry focus |
| k3s | Rancher/SUSE | Lightweight (<100MB binary), SQLite or etcd, ARM support, edge/IoT |
| microk8s | Canonical | Single-binary snap, add-on ecosystem, Ubuntu-native |
| kind (Kubernetes IN Docker) | SIG Testing | Local testing clusters inside Docker containers; CI/CD pipeline standard |
| minikube | Community | Single-node local dev cluster, multi-driver (Docker, QEMU, VirtualBox) |

9 · Community Governance and SIGs

Kubernetes is governed by the CNCF Technical Oversight Committee (TOC) and the Kubernetes Steering Committee. The project is divided into Special Interest Groups (SIGs) and Working Groups (WGs), each owning a specific domain.

9.1 Key SIGs and Their Scope

| SIG | Domain | Key deliverables |
| --- | --- | --- |
| SIG API Machinery | Core API server, CRDs, webhooks, client-go | API versioning, server-side apply, CRD validation, watch semantics |
| SIG Apps | Workload APIs | Deployments, StatefulSets, DaemonSets, Jobs, CronJobs |
| SIG Node | kubelet, CRI, cgroups, resource management | kubelet, resource manager, device plugins, topology manager |
| SIG Network | CNI, Services, Ingress, Gateway API, DNS, NetworkPolicy | kube-proxy, EndpointSlices, dual-stack, network policy spec |
| SIG Storage | CSI, PV/PVC lifecycle, volume plugins | CSI spec, dynamic provisioning, volume snapshots, volume health |
| SIG Scheduling | kube-scheduler, scheduler framework | Scheduling framework, topology-aware scheduling, descheduler |
| SIG Auth | RBAC, AuthN/AuthZ, Secrets, admission, certificates | RBAC, TokenRequest API, BoundServiceAccount tokens, audit |
| SIG Security | Pod security, supply chain, security policy | PodSecurity admission, security benchmarks, SLSA compliance |
| SIG Instrumentation | Metrics, logging, events, tracing | metrics-server, structured logging, OpenTelemetry integration |
| SIG Cluster Lifecycle | kubeadm, cluster bootstrap, upgrades | kubeadm, cluster-api, upgrade tooling |
| SIG Multicluster | Federation, multi-cluster services | MCS (Multi-Cluster Services), KubeFed |
| SIG Windows | Windows node support | Windows containerd, GMSA, HostProcess containers |
| WG Batch | Batch workloads at scale | JobSet, indexed jobs, pod failure policy, job backoff |
| WG Structured Logging | Contextual logging | Contextual logging, structured JSON output |

9.2 The KEP Process — How Features Get Added

All significant changes to Kubernetes go through a Kubernetes Enhancement Proposal (KEP). This is a design document that follows a lifecycle:

Provisional → Implementable → Implemented → Deferred / Withdrawn

Feature maturity stages (tracked by each KEP):
  Alpha   (opt-in, off by default behind a feature gate): one release minimum
  Beta    (feature gate usually on by default; API details may still change): two releases minimum
  Stable  (GA, cannot be removed without a deprecation period)

Feature gates control alpha/beta features:
  --feature-gates=NewFeature=true     # Set on the kubelet, kube-apiserver, and other components

# List feature gates and their state as reported by the API server
# (kubernetes_feature_enabled metric, exposed by recent Kubernetes versions)
kubectl get --raw /metrics | grep kubernetes_feature_enabled
# Or check explicitly set gates in the apiserver flags (run on a control plane node):
ps aux | grep kube-apiserver | grep feature-gates

# Check feature gate defaults for your version
# https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
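
When auditing many nodes it can help to turn the value of a `--feature-gates` flag into structured data. The helper below is a hypothetical sketch that only handles the comma-separated `Name=true|false` form; `parse_feature_gates` is not part of any Kubernetes tool, and `OldFeature` is an invented gate name.

```python
def parse_feature_gates(flag_value: str) -> dict:
    """Parse 'A=true,B=false' into {'A': True, 'B': False}."""
    gates = {}
    for item in flag_value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty input and trailing commas
        name, sep, raw = item.partition("=")
        raw = raw.strip().lower()
        if not sep or raw not in ("true", "false"):
            raise ValueError(f"invalid feature gate entry: {item!r}")
        gates[name.strip()] = raw == "true"
    return gates


# Example: the value that follows --feature-gates= on a component's command line.
gates = parse_feature_gates("NewFeature=true,OldFeature=false")
```

Comparing the resulting dictionaries across control plane and node components is a quick way to spot gates that were enabled in one place but not another.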

10 · API Deprecation and Removal Policy

One of the most operationally important aspects of Kubernetes history is its API deprecation policy. Understanding this prevents surprise breakage during upgrades.

| API maturity | Deprecation notice required | Minimum support period after deprecation |
| --- | --- | --- |
| GA (v1, v2…) | Yes, via release notes | 12 months OR 3 releases (whichever is longer) |
| Beta (v1beta1, v1beta2…) | Yes | 9 months OR 3 releases (whichever is longer) |
| Alpha (v1alpha1…) | No guarantee; may disappear in any release | None; always opt-in, never enabled by default |
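
The "whichever is longer" rule in the table is simple arithmetic once a release cadence is fixed. The sketch below assumes a cadence in months per minor release (about four at the time of writing, about three in earlier years); the `POLICY` mapping and function name are illustrative, not an official tool.

```python
import math

# (minimum months, minimum releases) per API maturity level, from the table above.
POLICY = {"ga": (12, 3), "beta": (9, 3), "alpha": (0, 0)}


def min_supported_releases(maturity: str, months_per_release: float) -> int:
    """Releases a deprecated API must keep working, under the assumed cadence."""
    min_months, min_releases = POLICY[maturity]
    # Releases needed to cover the time-based minimum.
    by_time = math.ceil(min_months / months_per_release)
    return max(min_releases, by_time)
```

With a four-month cadence both GA and beta APIs get three releases, because three releases already span twelve months. With a three-month cadence the time-based rule wins for GA APIs and stretches the window to four releases.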
Notable breaking removals that caught teams off-guard
  • v1.16: extensions/v1beta1 Deployments, ReplicaSets, DaemonSets removed. Replacement: apps/v1.
  • v1.22: networking.k8s.io/v1beta1 Ingress removed. Replacement: networking.k8s.io/v1.
  • v1.25: PodSecurityPolicy removed. Replacement: PodSecurity admission controller.
  • v1.25: batch/v1beta1 CronJob removed. Replacement: batch/v1.
Review deprecated API usage before every upgrade and migrate manifests to the replacement API versions (the kubectl convert plugin can help with the rewrite).
# Detect deprecated API usage in your cluster before upgrading
# Option 1: pluto (recommended tool)
pluto detect-all-in-cluster --target-versions k8s=v1.26

# Option 2: API server metric that records requests made to deprecated APIs
kubectl get --raw /metrics | grep apiserver_requested_deprecated_apis

# Option 3: kubent (kube no-trouble)
kubent

11 · Production Implications of Kubernetes History

Understanding this history directly impacts how you run Kubernetes in production:

7 production lessons from K8s architectural history
  1. The declarative API is non-negotiable. The entire ecosystem — GitOps, operators, autoscalers — depends on it. Avoid imperative kubectl run or kubectl create in production; always use kubectl apply -f with versioned manifests.
  2. etcd is the source of truth. The Borg lesson: store state outside the control plane components. Back up etcd. Treat it as your most critical service — it IS your cluster. Without etcd, you cannot recover the cluster state.
  3. Controllers are eventually consistent. Like Borg's reconciliation loops, Kubernetes controllers do not guarantee immediate consistency. A kubectl apply returns success when the API server accepts the write — not when the change has fully propagated. Design your tooling around watch + status checking, not timing assumptions.
  4. The watch API scales better than polling. Both Borg and Omega learned that polling creates thundering herds. K8s controllers use informers (List+Watch with local cache). Your own operators should do the same — use controller-runtime or client-go informers.
  5. Upgrades require version skew awareness. K8s's version skew policy (kubelets may trail the API server by at most two minor versions, three since v1.28) is strict for good reasons: the Borg team learned that running mixed versions creates impossible-to-debug race conditions. Always upgrade the control plane first, then nodes, and never skip minor versions.
  6. Labels and selectors decouple services from implementation. Borg's tight coupling between job name and service discovery was a maintenance burden. K8s label selectors mean you can replace all Pods in a Deployment without changing any Service configuration — blue/green is just a label swap.
  7. Extension points exist for a reason — use them. CRDs, admission webhooks, and scheduler plugins are the designed extension points. Adding features by forking K8s (as some operators tried early on) creates an unmaintainable upgrade nightmare. The operator pattern solved this.
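
Lesson 3 (controllers are eventually consistent) can be illustrated with a toy reconcile loop. The sketch assumes a controller that changes one replica per pass, which is a simplification chosen for the example; `reconcile_step` and `apply_and_wait` are hypothetical names, not client-go or controller-runtime APIs.

```python
def reconcile_step(desired: int, observed: int) -> int:
    """One level-triggered pass: move observed state one step toward desired."""
    if observed < desired:
        return observed + 1  # create one replica
    if observed > desired:
        return observed - 1  # delete one replica
    return observed          # already converged, nothing to do


def apply_and_wait(desired: int, observed: int, max_steps: int = 100):
    """Accepting the desired state is instant; convergence takes several passes.

    Returns (final_observed, passes_needed). Callers check status instead of
    assuming the change is complete as soon as the write is accepted.
    """
    steps = 0
    while observed != desired:
        if steps >= max_steps:
            raise TimeoutError("state did not converge in time")
        observed = reconcile_step(desired, observed)
        steps += 1
    return observed, steps
```

The same loop handles scale-up, scale-down, and the no-op case, because each pass compares the full desired state with the full observed state rather than reacting to individual events.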

References

  • Large-scale cluster management at Google with Borg — Verma et al., EuroSys 2015. The foundational paper.
  • Omega: flexible, scalable schedulers for large compute clusters — Schwarzkopf et al., EuroSys 2013.
  • Borg, Omega, and Kubernetes — Burns, Grant, Oppenheimer, Brewer, Wilkes. ACM Queue, 2016. The best single overview of the lineage.
  • Kubernetes GitHub — github.com/kubernetes/kubernetes — CHANGELOG.md for per-version history
  • Kubernetes Enhancement Proposals — github.com/kubernetes/enhancements
  • CNCF Landscape — landscape.cncf.io
  • Kubernetes release notes — kubernetes.io/docs/setup/release/notes
  • The Kubernetes Book — Nigel Poulton (updated annually)