Multi-Tenancy in Kubernetes
Complete guide to isolating tenants on shared Kubernetes clusters — namespace tenancy models, RBAC design, ResourceQuota, LimitRange, NetworkPolicy, Hierarchical Namespaces, virtual clusters, and the Capsule & Loft operator patterns.
Tenancy Models
Kubernetes multi-tenancy is not a single feature but a spectrum of isolation strategies. The right model depends on your trust level, regulatory requirements, blast-radius tolerance, and team autonomy needs.
| Model | Isolation Mechanism | Tenant Trust | Best For | Not For |
|---|---|---|---|---|
| Namespace per team | RBAC only | High (internal teams) | Single org, controlled developers | External/untrusted tenants |
| Namespace + policy | RBAC + Quota + NetworkPolicy + PSA | Medium | Multiple teams on shared platform | Hostile tenants, compliance-heavy |
| Capsule Tenant | Above + tenant-owner RBAC + Ingress/StorageClass scoping | Medium-Low | Self-service multi-team platforms | Regulatory isolation requirements |
| vcluster | Virtual K8s API server per tenant (pods on shared nodes) | Low-Medium | Dev environments, CI, ISV testing | Hostile workloads (shared kernel) |
| Dedicated cluster | Separate control plane + nodes | Any | PCI/HIPAA/strict compliance, enterprise customers | Cost-sensitive, many small tenants |
| Dedicated nodes | NodeSelector/Taints + RBAC | Medium | GPU tenants, high-memory workloads | Full kernel isolation |
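The dedicated-nodes row can be made concrete with a taint on the tenant's node pool plus a matching toleration and node selector on the tenant's pods. This is a minimal sketch: the `tenant=payments` label and taint, the pod name, and the image are illustrative assumptions, not names used elsewhere in this guide.

```yaml
# Assumption: the platform team has already labeled and tainted the pool, e.g.
#   kubectl label node <node> tenant=payments
#   kubectl taint node <node> tenant=payments:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: payments-worker            # hypothetical workload
  namespace: payments-worker-production
spec:
  nodeSelector:
    tenant: payments               # keeps this pod ON the tenant's nodes
  tolerations:
    - key: tenant                  # lets this pod past the taint that keeps others OFF
      operator: Equal
      value: payments
      effect: NoSchedule
  containers:
    - name: worker
      image: registry.example.com/payments/worker:1.0.0   # placeholder image
```

The taint and the selector do different jobs: the taint repels other tenants, while the selector stops this tenant from drifting onto shared nodes. Because any pod author can write a toleration, an admission policy would normally restrict which namespaces may use it.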
Namespace Design
The namespace is the fundamental unit of tenancy in Kubernetes. Good namespace design is the foundation of everything else — RBAC, quotas, network policies, and monitoring all apply at the namespace boundary.
Namespace Naming Conventions
# Pattern: {team}-{service}-{environment}
payments-api-production
payments-api-staging
payments-worker-production
platform-tools-shared
# Alternative: {org}-{team}-{env} (slashes are not valid in namespace names)
myorg-platform-prod
myorg-payments-staging
myorg-analytics-dev
# System namespaces (never put tenant workloads here)
kube-system # K8s system components
kube-public # cluster-info ConfigMap
kube-node-lease # node heartbeat leases
monitoring # prometheus-stack
argocd # Argo CD
kyverno # policy engine
cert-manager # certificate management
ingress-nginx # ingress controller
Required Namespace Labels
apiVersion: v1
kind: Namespace
metadata:
name: payments-api-production
labels:
# Ownership (required by policy — see 05-policy-enforcement.html)
team: payments
env: production
cost-center: CC-4892
# Pod Security Standards
pod-security.kubernetes.io/enforce: baseline
pod-security.kubernetes.io/warn: restricted
pod-security.kubernetes.io/audit: restricted
# Monitoring
prometheus.io/scrape: "true"
# Tenant grouping (used by Capsule / HNC)
tenant: payments-tribe
annotations:
# Human-readable metadata
team-email: payments-team@company.com
slack-channel: "#payments-alerts"
oncall-rotation: pagerduty-schedule-P123456
runbook: https://wiki.company.com/payments/runbooks
Namespace Topology Patterns
One Namespace per Service × Env
Fine-grained isolation. Each service in each environment gets its own namespace. Works well for microservices. Can lead to namespace sprawl (>100 namespaces) but gives precise quota and RBAC control.
Examples: payments-api-prod, payments-api-staging, payments-worker-prod
One Namespace per Team × Env
All services for a team share a namespace per environment. Simpler to manage, but services within the team share quota and network policies. Good starting point for smaller orgs.
Examples: payments-production, payments-staging, analytics-production
RBAC for Tenants
RBAC is the primary access control mechanism for tenant isolation. The key principle: tenants get namespace-scoped permissions, either Roles or shared ClusterRoles granted through a namespaced RoleBinding (never a ClusterRoleBinding), and platform engineers hold the cluster-wide permissions needed to manage shared infrastructure.
Standard Tenant Role Tiers
| Role Name | Allowed Resources | Denied Resources | Assigned To |
|---|---|---|---|
| tenant-admin | All namespace resources except ResourceQuota/LimitRange/NetworkPolicy | ResourceQuota, LimitRange, NetworkPolicy, RoleBinding to cluster-admin | Team leads |
| tenant-developer | Deployments, Services, Ingresses, ConfigMaps, Secrets (read), Jobs, CronJobs, HPA, PVC | RBAC resources, Quota, NetworkPolicy | Engineers |
| tenant-viewer | Get/list/watch all namespace resources (read-only) | All create/update/delete/patch | On-call, stakeholders |
| tenant-ci | Deployments (patch/update), ConfigMaps (create/update), Pods (get/list), Jobs (create) | Secrets write, RBAC, delete Deployments | CI/CD service accounts |
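The tenant-viewer tier from the table could be expressed as a read-only ClusterRole along these lines. This is a sketch rather than a definitive role: the resource list is an assumption, and it deliberately leaves out Secrets, since read access to Secrets exposes credentials to on-call and stakeholder accounts.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tenant-viewer
  labels:
    rbac.platform.io/managed: "true"
rules:
  # Core resources, read-only. Secrets are intentionally omitted (assumption).
  - apiGroups: [""]
    resources: ["pods","pods/log","services","endpoints","configmaps","persistentvolumeclaims","events","resourcequotas","limitranges"]
    verbs: ["get","list","watch"]
  - apiGroups: ["apps"]
    resources: ["deployments","statefulsets","daemonsets","replicasets"]
    verbs: ["get","list","watch"]
  - apiGroups: ["batch"]
    resources: ["jobs","cronjobs"]
    verbs: ["get","list","watch"]
  - apiGroups: ["networking.k8s.io"]
    resources: ["ingresses","networkpolicies"]
    verbs: ["get","list","watch"]
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    verbs: ["get","list","watch"]
```

Like the other tiers, it only takes effect in a namespace once a RoleBinding there references it.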
ClusterRole: tenant-developer
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: tenant-developer
labels:
rbac.platform.io/managed: "true"
rules:
# Workloads
- apiGroups: ["apps"]
resources: ["deployments","statefulsets","daemonsets","replicasets"]
verbs: ["get","list","watch","create","update","patch","delete"]
- apiGroups: [""]
resources: ["pods","pods/log","pods/exec","pods/portforward"]
verbs: ["get","list","watch","create","delete"]
- apiGroups: ["batch"]
resources: ["jobs","cronjobs"]
verbs: ["get","list","watch","create","update","patch","delete"]
# Config and secrets (read secrets, write configmaps)
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get","list","watch","create","update","patch","delete"]
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get","list","watch"] # create/update only through External Secrets Operator (ESO)
# Networking
- apiGroups: [""]
resources: ["services","endpoints"]
verbs: ["get","list","watch","create","update","patch","delete"]
- apiGroups: ["networking.k8s.io"]
resources: ["ingresses"]
verbs: ["get","list","watch","create","update","patch","delete"]
# Storage
- apiGroups: [""]
resources: ["persistentvolumeclaims"]
verbs: ["get","list","watch","create","delete"]
# Autoscaling
- apiGroups: ["autoscaling"]
resources: ["horizontalpodautoscalers"]
verbs: ["get","list","watch","create","update","patch","delete"]
# Observability (tenants manage their own ServiceMonitors, PodMonitors, and PrometheusRules)
- apiGroups: ["monitoring.coreos.com"]
resources: ["servicemonitors","prometheusrules","podmonitors"]
verbs: ["get","list","watch","create","update","patch","delete"]
# Events
- apiGroups: [""]
resources: ["events"]
verbs: ["get","list","watch"]
RoleBinding: Assign Team to Namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: payments-team-developer
namespace: payments-api-production
subjects:
# Bind an Okta/OIDC group
- kind: Group
name: "okta-group:payments-engineers"
apiGroup: rbac.authorization.k8s.io
# Also bind specific service accounts
- kind: ServiceAccount
name: payments-ci-deployer
namespace: payments-api-production
roleRef:
kind: ClusterRole
name: tenant-developer
apiGroup: rbac.authorization.k8s.io
ClusterRole: tenant-ci (for CI/CD service accounts)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: tenant-ci
rules:
- apiGroups: ["apps"]
resources: ["deployments","statefulsets"]
verbs: ["get","list","watch","patch","update"]
- apiGroups: [""]
resources: ["pods","pods/log"]
verbs: ["get","list","watch"]
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get","list","watch","create","update","patch"]
- apiGroups: ["batch"]
resources: ["jobs"]
verbs: ["get","list","watch","create","delete"]
- apiGroups: ["argoproj.io"]
resources: ["rollouts"]
verbs: ["get","list","watch","patch","update"]
# Needed for rollout status checks
- apiGroups: ["argoproj.io"]
resources: ["analysisruns","experiments"]
verbs: ["get","list","watch"]
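To put the tenant-ci role to use, a CI service account needs its own RoleBinding in each target namespace. The sketch below reuses the payments-ci-deployer account from the earlier RoleBinding; the binding name `payments-ci` is an assumption.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payments-ci                 # hypothetical binding name
  namespace: payments-api-production
subjects:
  - kind: ServiceAccount
    name: payments-ci-deployer
    namespace: payments-api-production
roleRef:
  kind: ClusterRole
  name: tenant-ci                   # narrower than tenant-developer
  apiGroup: rbac.authorization.k8s.io
```

RBAC permissions are additive, so a service account bound to both tenant-ci and tenant-developer ends up with the union of the two. Binding CI accounts only to tenant-ci keeps the pipeline's permissions at the narrower set.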
Preventing Privilege Escalation via RBAC
# Kyverno policy: prevent tenants from creating RoleBindings
# to cluster-admin or platform-reserved ClusterRoles
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: restrict-rolebinding-clusterroles
spec:
validationFailureAction: Enforce
rules:
- name: block-cluster-admin-binding
match:
any:
- resources:
kinds: ["RoleBinding","ClusterRoleBinding"]
operations: ["CREATE","UPDATE"]
validate:
message: "Binding to cluster-admin or platform ClusterRoles is not allowed."
deny:
conditions:
any:
- key: "{{ request.object.roleRef.name }}"
operator: In
value: ["cluster-admin","system:masters","platform-admin"]
Resource Isolation
Without resource controls, a single noisy tenant can starve the entire cluster. ResourceQuota caps total consumption per namespace; LimitRange sets defaults and bounds per container.
ResourceQuota Tiers
# Small team namespace
apiVersion: v1
kind: ResourceQuota
metadata:
name: tenant-quota-small
namespace: analytics-dev
spec:
hard:
# Compute
requests.cpu: "4"
requests.memory: 8Gi
limits.cpu: "8"
limits.memory: 16Gi
# Pod count caps runaway pod creation (for example a misconfigured Job or autoscaler)
pods: "40"
# Storage
requests.storage: 100Gi
persistentvolumeclaims: "10"
# K8s object limits
count/deployments.apps: "20"
count/services: "20"
count/configmaps: "50"
count/secrets: "30"
count/ingresses.networking.k8s.io: "10"
# Restrict LoadBalancer (expensive) and NodePort
services.loadbalancers: "0"
services.nodeports: "0"
---
# Large production namespace
apiVersion: v1
kind: ResourceQuota
metadata:
name: tenant-quota-large
namespace: payments-api-production
spec:
hard:
requests.cpu: "32"
requests.memory: 64Gi
limits.cpu: "64"
limits.memory: 128Gi
pods: "200"
requests.storage: 1Ti
persistentvolumeclaims: "50"
count/deployments.apps: "100"
count/services: "100"
services.loadbalancers: "2" # limited but allowed for prod
LimitRange: Container Defaults
apiVersion: v1
kind: LimitRange
metadata:
name: tenant-limits
namespace: payments-api-production
spec:
limits:
# Container defaults (applied when no resources spec is set)
- type: Container
default: # limit (max) default
cpu: "500m"
memory: 512Mi
defaultRequest: # request default (500m / 125m = 4, within maxLimitRequestRatio)
cpu: "125m"
memory: 128Mi
max: # hard ceiling per container
cpu: "4"
memory: 8Gi
min: # minimum (prevents setting zero)
cpu: "10m"
memory: 32Mi
maxLimitRequestRatio: # limits cannot be more than 4× requests
cpu: "4"
memory: "4"
# Pod-level limit (sum of all containers)
- type: Pod
max:
cpu: "8"
memory: 16Gi
# PVC storage bounds
- type: PersistentVolumeClaim
max:
storage: 50Gi
min:
storage: 1Gi
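To see how these bounds apply to a concrete workload, consider a container with explicit resources. The sketch below works through the min, max, and ratio checks in comments; the pod name and image are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ratio-example               # hypothetical pod
  namespace: payments-api-production
spec:
  containers:
    - name: api
      image: registry.example.com/payments/api:1.0.0   # placeholder image
      resources:
        requests:
          cpu: 250m       # above min 10m
          memory: 512Mi   # above min 32Mi
        limits:
          cpu: "1"        # 1000m / 250m = 4, exactly at the ratio ceiling; below max 4
          memory: 2Gi     # 2048Mi / 512Mi = 4; below max 8Gi
# Counter-example: requests.cpu 100m with limits.cpu "1" gives a ratio of 10,
# which exceeds maxLimitRequestRatio 4, so admission would reject the pod.
```

The Pod-level entry is checked as well: the container limits in a pod must sum to no more than 8 CPU and 16Gi.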
PriorityClass for Tenant Tiers
# Platform team creates these; tenants reference them
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: platform-critical
value: 1000000
globalDefault: false
description: "Platform system components. Never preempted."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: production-high
value: 100
globalDefault: false
description: "Production workloads — preempts batch/dev"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: production-default
value: 50
globalDefault: true
description: "Default for workloads without explicit priority"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: batch-low
value: 10
preemptionPolicy: Never # pods in this class never preempt other pods; they wait for free capacity
description: "Batch jobs: use leftover capacity only"
# Kyverno: enforce tenants use allowed PriorityClasses
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: restrict-priority-classes
spec:
validationFailureAction: Enforce
rules:
- name: check-priority-class
match:
any:
- resources:
kinds: ["Pod"]
operations: ["CREATE"]
exclude:
any:
- resources:
namespaces: ["kube-system","monitoring","argocd"]
validate:
message: "Only production-high, production-default, or batch-low PriorityClasses allowed."
deny:
conditions:
any:
- key: "{{ request.object.spec.priorityClassName || 'production-default' }}"
operator: AnyNotIn
value: ["production-high","production-default","batch-low",""]
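On the tenant side, using one of these classes is a single field in the pod template. A minimal sketch of a batch Job that opts into batch-low follows; the Job name and image are assumptions.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report              # hypothetical job
  namespace: analytics-dev
spec:
  template:
    spec:
      priorityClassName: batch-low  # must be one of the allowed classes
      restartPolicy: Never
      containers:
        - name: report
          image: registry.example.com/analytics/report:1.0.0   # placeholder image
```

Pods that omit priorityClassName receive the class marked globalDefault (production-default here).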
Network Isolation
NetworkPolicies provide the network-layer tenant fence. By default, all pods in a cluster can reach all other pods regardless of namespace. You must explicitly create deny-all ingress + deny-all egress policies and then open only what each tenant needs.
Baseline Tenant NetworkPolicy Set
# 1. Deny all ingress and egress by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
namespace: payments-api-production
spec:
podSelector: {}
policyTypes: ["Ingress","Egress"]
# No rules = deny all
---
# 2. Allow DNS (required for all pods)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-dns
namespace: payments-api-production
spec:
podSelector: {}
policyTypes: ["Egress"]
egress:
- ports:
- port: 53
protocol: UDP
- port: 53
protocol: TCP
---
# 3. Allow intra-namespace traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-same-namespace
namespace: payments-api-production
spec:
podSelector: {}
policyTypes: ["Ingress","Egress"]
ingress:
- from:
- podSelector: {} # any pod in same namespace
egress:
- to:
- podSelector: {}
---
# 4. Allow ingress from NGINX ingress controller
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-ingress-controller
namespace: payments-api-production
spec:
podSelector: {}
policyTypes: ["Ingress"]
ingress:
- from:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: ingress-nginx
ports:
- port: 8080
- port: 8443
---
# 5. Allow Prometheus scraping
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-prometheus-scrape
namespace: payments-api-production
spec:
podSelector: {}
policyTypes: ["Ingress"]
ingress:
- from:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: monitoring
ports:
- port: 8080 # metrics port
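The allow-dns policy above permits port 53 to any destination. A tighter variant limits the destination to the cluster DNS pods. This sketch assumes DNS runs in kube-system with the label `k8s-app: kube-dns` (the usual CoreDNS label); clusters using NodeLocal DNSCache or a differently labeled DNS deployment would need other selectors.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: payments-api-production
spec:
  podSelector: {}
  policyTypes: ["Egress"]
  egress:
    - to:
        # One list item with both selectors: namespace AND pod label must match
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns     # assumption: default CoreDNS label
      ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
```

This closes off port 53 as a generic egress channel to arbitrary hosts.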
Cross-Namespace Service Communication
# Allow payments-api to call payments-worker in a different namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-from-payments-api
namespace: payments-worker-production # applied in TARGET namespace
spec:
podSelector:
matchLabels:
app: worker
policyTypes: ["Ingress"]
ingress:
- from:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: payments-api-production
podSelector:
matchLabels:
app: api # AND condition: namespace AND pod label
ports:
- port: 9000
protocol: TCP
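The AND condition depends entirely on YAML list structure, and the OR form is easy to write by accident. The sketch below shows the overly broad variant for contrast; the policy name is hypothetical.

```yaml
# Anti-pattern: two separate items under `from` are ORed together
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-payments-api-too-broad   # hypothetical, for contrast only
  namespace: payments-worker-production
spec:
  podSelector:
    matchLabels:
      app: worker
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - namespaceSelector:          # item 1: ANY pod in payments-api-production
            matchLabels:
              kubernetes.io/metadata.name: payments-api-production
        - podSelector:                # item 2: pods labeled app=api in THIS namespace
            matchLabels:
              app: api
      ports:
        - port: 9000
          protocol: TCP
```

The only difference from the correct policy is the extra `-` before podSelector. It turns one combined peer into two independent peers, so every pod in payments-api-production is admitted.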
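NetworkPolicy is enforced only by a CNI plugin that supports it (Flannel does not; Calico, Cilium, and Weave do). Verify each policy with netshoot or kubectl-netpol before trusting it.
Cilium NetworkPolicy (Advanced)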
# Cilium extends NetworkPolicy with L7 HTTP rules
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
name: payments-api-l7
namespace: payments-api-production
spec:
endpointSelector:
matchLabels:
app: payments-api
ingress:
- fromEndpoints:
- matchLabels:
app: frontend
toPorts:
- ports:
- port: "8080"
protocol: TCP
rules:
http:
- method: "POST"
path: "^/api/v1/payments$" # only this specific endpoint
- method: "GET"
path: "^/api/v1/payments/[0-9]+$"
Hierarchical Namespaces (HNC)
HNC (hierarchical-namespace-controller) adds a parent-child namespace tree. Resources like RoleBindings, NetworkPolicies, and LimitRanges propagate from parent to children automatically — eliminating manual duplication across namespaces that belong to the same tenant tree.
Install HNC
kubectl apply -f https://github.com/kubernetes-sigs/hierarchical-namespaces/releases/download/v1.1.0/default.yaml
# Install the kubectl-hns plugin (distributed separately via krew)
kubectl krew install hns
Create Namespace Hierarchy
# Create root tenant namespace
kubectl create ns payments-tribe
# Create child namespaces under the tenant root
kubectl hns create payments-api-prod -n payments-tribe
kubectl hns create payments-api-staging -n payments-tribe
kubectl hns create payments-worker-prod -n payments-tribe
# View the hierarchy
kubectl hns tree payments-tribe
# Output:
# payments-tribe
# ├── payments-api-prod
# ├── payments-api-staging
# └── payments-worker-prod
Propagate RoleBindings via HierarchicalNamespace
# SubnamespaceAnchor (HNC's way of declaring child namespaces in YAML)
apiVersion: hnc.x-k8s.io/v1alpha2
kind: SubnamespaceAnchor
metadata:
name: payments-api-prod
namespace: payments-tribe # parent
---
# RoleBinding in parent automatically propagates to all children
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: payments-team-developer
namespace: payments-tribe # set once here...
# No annotation needed: HNC copies RoleBindings to every descendant namespace by default.
# (propagate.hnc.x-k8s.io/select takes a label selector and would restrict propagation.)
subjects:
- kind: Group
name: "okta-group:payments-engineers"
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: tenant-developer
apiGroup: rbac.authorization.k8s.io
HNC Resource Propagation Configuration
# HNCConfiguration: control what resource types propagate
apiVersion: hnc.x-k8s.io/v1alpha2
kind: HNCConfiguration
metadata:
name: config
spec:
resources:
# Roles and RoleBindings always propagate; HNC enforces this, so they are not listed here
- resource: networkpolicies
group: networking.k8s.io
mode: Propagate # Propagate | Ignore | Remove
- resource: limitranges
group: ""
mode: Propagate
- resource: resourcequotas
group: ""
mode: Ignore # Quotas should be set per-namespace, not inherited
Capsule Operator
Capsule provides a higher-level Tenant CRD that aggregates multiple namespaces under a tenant owner with shared policies, quotas, and allowed resource classes. Tenant owners can self-service create namespaces within their Tenant, but the platform team retains control over what those namespaces can do.
Install Capsule
helm repo add projectcapsule https://projectcapsule.github.io/charts
helm repo update
helm install capsule projectcapsule/capsule \
--namespace capsule-system \
--create-namespace \
--version 0.7.2 \
--set manager.options.forceTenantPrefix=true \
--set "manager.options.capsuleUserGroups[0]=capsule.clastix.io"
Tenant CRD
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
name: payments
spec:
# Tenant owners can create/delete namespaces within this tenant
owners:
- name: alice@company.com
kind: User
- name: payments-team-leads
kind: Group
# Namespace quota
namespaceOptions:
quota: 10 # max namespaces this tenant can own
additionalMetadata:
labels:
tenant: payments
cost-center: CC-4892
pod-security.kubernetes.io/enforce: baseline
# Aggregate resource quota across ALL tenant namespaces
resourceQuotas:
scope: Tenant # aggregated across namespaces
items:
- hard:
requests.cpu: "32"
requests.memory: 64Gi
limits.cpu: "64"
limits.memory: 128Gi
pods: "200"
requests.storage: 500Gi
# LimitRange applied to every tenant namespace
limitRanges:
items:
- limits:
- type: Container
default:
cpu: "500m"
memory: 512Mi
defaultRequest:
cpu: "100m"
memory: 128Mi
# Allowed StorageClasses
storageClasses:
allowed:
- gp3
- efs-sc
# Allowed IngressClasses
ingressOptions:
allowedClasses:
allowed:
- nginx
allowedHostnames:
allowed:
- "*.payments.company.com"
# Network policies added to every tenant namespace
networkPolicies:
items:
- spec:
podSelector: {}
policyTypes: ["Ingress","Egress"]
ingress: [] # deny all by default; tenant owners add their own
# Allowed node selectors (restrict to tenant node pool)
nodeSelector:
node.kubernetes.io/workload: general
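# Metadata added to every tenant pod. Note: this seccomp annotation is deprecated;
# current Kubernetes releases ignore it in favor of securityContext.seccompProfile.
podOptions:
additionalMetadata:
annotations:
seccomp.security.alpha.kubernetes.io/pod: runtime/default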
# Image pull policies enforced on this tenant
imagePullPolicies:
- Always
Tenant Owner Creates Namespaces
# Alice (tenant owner) can self-service create namespaces
# With forceTenantPrefix=true, Capsule rejects names that lack the tenant prefix
kubectl create ns payments-api-prod # allowed: starts with "payments-"
kubectl create ns payments-staging # allowed; counts toward the quota of 10
# Alice cannot create namespaces outside her tenant
kubectl create ns analytics-dev # DENIED: missing the "payments-" prefix
Virtual Clusters (vcluster)
vcluster creates a fully functional virtual Kubernetes cluster running as pods inside a host namespace. Each vcluster has its own API server, controller manager, and data store (etcd or, with k3s, an embedded SQLite database by default), but its workloads run as regular pods on the host cluster's nodes.
Install vcluster
helm repo add loft-sh https://charts.loft.sh
helm repo update
# Create a vcluster for team-a
helm install vcluster-team-a loft-sh/vcluster \
--namespace vcluster-team-a \
--create-namespace \
--version 0.20.0 \
--set controlPlane.distro.k3s.enabled=true \
--set controlPlane.statefulSet.resources.requests.cpu=200m \
--set controlPlane.statefulSet.resources.requests.memory=256Mi \
--set controlPlane.statefulSet.resources.limits.memory=1Gi \
--set sync.toHost.ingresses.enabled=true \
--set sync.toHost.persistentVolumes.enabled=true
vcluster Configuration (vcluster.yaml)
controlPlane:
distro:
k3s:
enabled: true
image:
tag: "v1.31.0-k3s1"
statefulSet:
highAvailability:
replicas: 1 # increase to 3 for production vclusters
resources:
requests:
cpu: 200m
memory: 256Mi
limits:
memory: 1Gi
sync:
toHost:
ingresses:
enabled: true
persistentVolumes:
enabled: false # only if tenant needs dynamic PV provisioning
storageClasses:
enabled: false # use host storage classes
fromHost:
nodes:
enabled: true
selector:
all: true
# Restrict what the vcluster can do on the host
# (the 0.20 vcluster.yaml format configures this under `policies`; `isolation` belongs to the older values format)
policies:
networkPolicy:
enabled: true
resourceQuota:
enabled: true
quota:
requests.cpu: "10"
requests.memory: 20Gi
limits.cpu: "20"
limits.memory: 40Gi
pods: "100"
Access a vcluster
# Install vcluster CLI
curl -L -o vcluster "https://github.com/loft-sh/vcluster/releases/latest/download/vcluster-linux-amd64"
chmod +x vcluster && mv vcluster /usr/local/bin/
# Connect and switch kubeconfig to vcluster
vcluster connect vcluster-team-a --namespace vcluster-team-a
# Team-a now sees their own cluster
kubectl get nodes # shows virtual node
kubectl get ns # only default, kube-system (virtual)
# Disconnect (returns to host cluster context)
vcluster disconnect
vcluster vs Namespace Tenancy
| Dimension | Namespace Tenancy | vcluster |
|---|---|---|
| API isolation | Shared API server — tenants see cluster-scoped resources | Dedicated virtual API server per tenant |
| CRD isolation | CRDs are cluster-scoped — shared with all tenants | Virtual CRDs isolated per vcluster |
| RBAC isolation | ClusterRole vs Role boundary | Full cluster-admin inside vcluster |
| K8s version | All tenants on same version | Each vcluster can run different K8s version |
| Operator deployment | Shared operators (Prometheus, cert-manager) | Tenant can install their own operators |
| Resource overhead | ~0 (namespaces are free) | ~200m CPU + 256Mi RAM per vcluster control plane |
| Node isolation | Pods share nodes unless tainted | Pods still share host nodes (soft isolation) |
| Blast radius | Namespace escape possible with privileged containers | Same — still shares host kernel |
| Best for | Internal teams, cost-sensitive | ISV customers, dev/CI environments, different K8s versions |
Self-Service Namespace Provisioning
Platform teams should not be the bottleneck for namespace creation. Automate it through the developer portal (see 04-developer-portal.html) or Crossplane/Kyverno generate.
Crossplane Namespace Composition
# Platform team defines: what a "team namespace" looks like
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
name: team-namespace.platform.io
spec:
compositeTypeRef:
apiVersion: platform.io/v1alpha1
kind: XTeamNamespace
resources:
- name: namespace
base:
apiVersion: kubernetes.crossplane.io/v1alpha1
kind: Object
spec:
forProvider:
manifest:
apiVersion: v1
kind: Namespace
patches:
- type: FromCompositeFieldPath
fromFieldPath: spec.teamName
toFieldPath: spec.forProvider.manifest.metadata.name
transforms:
- type: string
string:
fmt: "%s-production"
- name: resource-quota
base:
apiVersion: kubernetes.crossplane.io/v1alpha1
kind: Object
spec:
forProvider:
manifest:
apiVersion: v1
kind: ResourceQuota
spec:
hard:
requests.cpu: "8"
requests.memory: 16Gi
patches:
- type: FromCompositeFieldPath
fromFieldPath: spec.teamName
toFieldPath: spec.forProvider.manifest.metadata.namespace
transforms:
- type: string
string:
fmt: "%s-production"
---
# Composite Resource Definition (the "API" teams use)
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
name: xteamnamespaces.platform.io
spec:
group: platform.io
names:
kind: XTeamNamespace
plural: xteamnamespaces
claimNames:
kind: TeamNamespace
plural: teamnamespaces
versions:
- name: v1alpha1
served: true
referenceable: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
teamName:
type: string
quotaTier:
type: string
enum: ["small","medium","large"]
required: ["teamName","quotaTier"]
Team Claims a Namespace
# Analytics team requests a namespace by creating a claim
apiVersion: platform.io/v1alpha1
kind: TeamNamespace
metadata:
name: analytics-production
namespace: platform-claims # claims namespace
spec:
teamName: analytics
quotaTier: medium
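The claim sets quotaTier, but the Composition above hardcodes the quota, so the tier has no effect yet. One way to wire it in is a map transform on the resource-quota entry. The fragment below is a sketch: the small and large values mirror the ResourceQuota tiers earlier in this guide, while the medium values (the Composition's current 8 CPU / 16Gi) are an assumption.

```yaml
# Fragment: append to the `patches:` list of the resource-quota entry in the Composition
- type: FromCompositeFieldPath
  fromFieldPath: spec.quotaTier
  # Bracket syntax addresses a map key that itself contains a dot
  toFieldPath: spec.forProvider.manifest.spec.hard[requests.cpu]
  transforms:
    - type: map
      map:
        small: "4"
        medium: "8"      # assumption: matches the currently hardcoded value
        large: "32"
- type: FromCompositeFieldPath
  fromFieldPath: spec.quotaTier
  toFieldPath: spec.forProvider.manifest.spec.hard[requests.memory]
  transforms:
    - type: map
      map:
        small: 8Gi
        medium: 16Gi     # assumption: matches the currently hardcoded value
        large: 64Gi
```

With this in place the analytics claim, which asks for medium, would receive 8 CPU and 16Gi of requests. The XRD's enum keeps unmapped tier names from reaching the transform.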
Per-Tenant Monitoring
Tenants need visibility into their own workloads without seeing other tenants' data. The standard pattern is multi-tenant Prometheus with label-based isolation, or separate Prometheus instances per tenant.
Prometheus RBAC: Namespace-Scoped Scraping
# ServiceAccount for per-namespace Prometheus or tenant scrape
apiVersion: v1
kind: ServiceAccount
metadata:
name: prometheus-tenant
namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: prometheus-tenant-view
namespace: payments-api-production # only this tenant's namespace
subjects:
- kind: ServiceAccount
name: prometheus-tenant
namespace: monitoring
roleRef:
kind: ClusterRole
name: view
apiGroup: rbac.authorization.k8s.io
Prometheus Operator: Namespace-Scoped ServiceMonitor Discovery
# prometheus-values.yaml (in monitoring namespace)
prometheus:
prometheusSpec:
# Only discover ServiceMonitors in namespaces with this label
serviceMonitorNamespaceSelector:
matchLabels:
prometheus.io/scrape: "true"
serviceMonitorSelector: {}
# External label identifying this cluster when metrics are federated or remote-written
externalLabels:
cluster: prod-us-east-1
# Only discover PrometheusRules in namespaces with this label
ruleNamespaceSelector:
matchLabels:
prometheus.io/scrape: "true"
Grafana Tenant Isolation
# Grafana Organizations for hard tenant isolation
# OR: Grafana Teams + folder-based RBAC for soft isolation
# datasource per-tenant pointing to Thanos/Mimir with namespace filter
apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-datasources
namespace: monitoring
data:
datasources.yaml: |
apiVersion: 1
datasources:
- name: Prometheus-Payments
type: prometheus
url: http://thanos-query:9090
jsonData:
httpMethod: POST
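# Appends a query parameter to every request (not an HTTP header); it restricts results only if the query endpoint enforces it, for example through a label-enforcing proxy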
customQueryParameters: "namespace=payments-api-production"
access: proxy
PrometheusRule for Tenant Quota Alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: tenant-quota-alerts
namespace: monitoring
spec:
groups:
- name: tenant.quota
rules:
# CPU request usage > 90% of quota
- alert: TenantCPUQuotaNearLimit
expr: |
(
kube_resourcequota{type="used", resource="requests.cpu"}
/
kube_resourcequota{type="hard", resource="requests.cpu"}
) > 0.9
for: 10m
labels:
severity: warning
annotations:
summary: "Namespace {{ $labels.namespace }} CPU quota above 90%"
description: "{{ $value | humanizePercentage }} of CPU request quota used."
# Memory limit usage > 90% of quota
- alert: TenantMemoryQuotaNearLimit
expr: |
(
kube_resourcequota{type="used", resource="limits.memory"}
/
kube_resourcequota{type="hard", resource="limits.memory"}
) > 0.9
for: 10m
labels:
severity: warning
annotations:
summary: "Namespace {{ $labels.namespace }} memory quota above 90%"
# Pod count > 80% of quota
- alert: TenantPodCountHigh
expr: |
(
kube_resourcequota{type="used", resource="pods"}
/
kube_resourcequota{type="hard", resource="pods"}
) > 0.8
for: 15m
labels:
severity: info
annotations:
summary: "Namespace {{ $labels.namespace }} pod count above 80% of quota"
Best Practices
Match Isolation to Trust Level
Internal teams on a shared cluster need namespace + RBAC + policy. External customers or regulated workloads need dedicated nodes or dedicated clusters. Namespace tenancy alone is not a security boundary.
Automate Namespace Provisioning
Never hand-provision namespaces. Use Crossplane compositions, Backstage templates, or Capsule self-service. Every namespace must come with ResourceQuota, LimitRange, default NetworkPolicies, and RBAC — not as optional afterthoughts.
Default-Deny NetworkPolicy
Kyverno's generate rule (see 05-policy-enforcement.html) creates a deny-all NetworkPolicy in every new namespace automatically. Teams must explicitly declare what traffic they allow. Never start with allow-all.
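A generate rule of that kind might look like the following sketch. The policy and rule names are assumptions, and the policy described in 05-policy-enforcement.html may differ in detail.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-deny            # hypothetical policy name
spec:
  rules:
    - name: default-deny-all
      match:
        any:
          - resources:
              kinds: ["Namespace"]
      generate:
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        name: default-deny-all
        namespace: "{{request.object.metadata.name}}"   # the namespace being created
        synchronize: true           # recreate the policy if a tenant deletes it
        data:
          spec:
            podSelector: {}
            policyTypes: ["Ingress","Egress"]
```

Kyverno's service account needs permission to create NetworkPolicies for this to work. System namespaces would normally be excluded, as in the PriorityClass policy earlier.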
Quota Before Workloads
Use Argo CD sync waves to apply ResourceQuota (wave -1) before the Application that deploys workloads (wave 0). A namespace with no quota is a resource bomb waiting to go off.
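As a sketch of the ordering, the quota can carry a sync-wave annotation so that Argo CD applies it ahead of resources in the default wave 0. The manifest reuses the small-tier values from earlier in this guide.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota-small
  namespace: analytics-dev
  annotations:
    argocd.argoproj.io/sync-wave: "-1"   # lower waves sync first; unannotated resources are wave 0
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "40"
```

The annotation orders resources within a single sync. When quota and workloads live in separate Applications, the same annotation goes on the Application manifests managed by a parent app.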
LimitRange as Safety Net
LimitRange defaults catch workloads deployed without explicit resource specs. Set defaultRequest low and default limit moderate — this prevents unbounded containers while not blocking teams who haven't tuned their resources yet.
Tenant-Scoped RBAC, Not ClusterRoleBindings
Tenants get RoleBindings to ClusterRoles (for reuse), never ClusterRoleBindings unless the role is explicitly cluster-safe. Audit ClusterRoleBindings quarterly — any binding to a non-platform team is a red flag.
HNC or Capsule for Namespace Trees
Managing 50+ namespaces manually leads to drift — some namespaces missing NetworkPolicies, others missing LimitRanges. HNC propagation or Capsule's Tenant CRD ensures uniform configuration without repeated YAML.
Audit Cross-Namespace Access
NetworkPolicies that allow broad namespaceSelector: {} (all namespaces) are lateral movement highways. Every cross-namespace allow rule must specify both a namespace label AND a pod label selector to limit scope to the minimum necessary.
Coverage: 06 · Multi-Tenancy
- Tenancy models spectrum diagram (soft namespace → namespace+policy → Capsule → vcluster → dedicated cluster)
- Tenancy models comparison table (isolation mechanism / trust level / best for / not for)
- Security isolation callout (namespaces ≠ security boundary; kernel escape via privileged containers)
- Namespace naming conventions (team-service-env pattern; system namespace list)
- Required namespace labels (team/env/cost-center/PSA/prometheus/tenant/email/slack/oncall annotations)
- Namespace topology patterns: one-per-service×env vs one-per-team×env comparison
- Standard tenant role tiers table (tenant-admin / tenant-developer / tenant-viewer / tenant-ci)
- ClusterRole: tenant-developer (full YAML covering apps/pods/configmaps/secrets-read/services/ingresses/PVCs/HPA/ServiceMonitors/events)
- RoleBinding: payments-team-developer (Okta group + ServiceAccount binding)
- ClusterRole: tenant-ci (Deployments patch/update, Jobs, Rollouts, AnalysisRuns)
- Kyverno policy: restrict-rolebinding-clusterroles (block binding to cluster-admin/system:masters)
- ResourceQuota tiers: small (4cpu/8Gi/40 pods) and large (32cpu/64Gi/200 pods) with object count limits and services.loadbalancers:0
- LimitRange: Container defaults (cpu 100m→500m/memory 128Mi→512Mi), max bounds, maxLimitRequestRatio 4×, Pod-level max, PVC storage bounds
- PriorityClass tiers: platform-critical/production-high/production-default/batch-low with preemptionPolicy:Never for batch
- Kyverno policy: restrict-priority-classes (deny use of non-allowlisted PriorityClasses)
- NetworkPolicy set: default-deny-all / allow-dns / allow-same-namespace / allow-ingress-controller / allow-prometheus-scrape
- Cross-namespace NetworkPolicy (AND condition: namespaceSelector + podSelector)
- Cilium NetworkPolicy CRD for L7 HTTP path/method rules
- CNI requirement callout (Flannel does not enforce NetworkPolicy; use Calico/Cilium/Weave)
- HNC install (kubectl apply + krew plugin)
- kubectl hns commands: create child namespaces, tree view
- SubnamespaceAnchor YAML; propagate.hnc.x-k8s.io/select annotation on RoleBinding
- HNCConfiguration: Propagate rolebindings+networkpolicies+limitranges, Ignore resourcequotas
- Capsule Helm install (forceTenantPrefix, capsuleUserGroups)
- Capsule Tenant CRD: owners/namespaceOptions/resourceQuotas(scope:Tenant)/limitRanges/storageClasses/ingressOptions(allowedHostnames)/networkPolicies/nodeSelector/podOptions/imagePullPolicies
- Tenant owner self-service namespace creation (forceTenantPrefix enforcement)
- vcluster architecture diagram (virtual API server + sync to host namespace)
- vcluster Helm install (k3s distro, resources, sync.toHost ingresses/PVs)
- vcluster.yaml: controlPlane/sync/isolation (networkPolicy+resourceQuota)
- vcluster CLI: connect, disconnect commands
- vcluster vs namespace tenancy 9-dimension comparison table
- Crossplane Composition for team namespace (Namespace + ResourceQuota + patches from spec.teamName)
- CompositeResourceDefinition (XTeamNamespace/TeamNamespace with teamName+quotaTier schema)
- TeamNamespace claim YAML example
- Prometheus RBAC: ServiceAccount + RoleBinding for namespace-scoped scraping
- Prometheus Operator values: serviceMonitorNamespaceSelector, externalLabels, ruleNamespaceSelector
- Grafana datasource per-tenant with customQueryParameters namespace filter
- PrometheusRule: TenantCPUQuotaNearLimit / TenantMemoryQuotaNearLimit / TenantPodCountHigh (kube_resourcequota ratio alerts)
- 8 best practices cards (trust level matching / automated provisioning / default-deny NetworkPolicy / quota before workloads / LimitRange safety net / namespace-scoped RBAC / HNC or Capsule for trees / audit cross-namespace access)