Enter the Cluster

Photograph By Sarah Kilian
Infrastructure From Scratch - This article is part of a series.
Part 3: This Article

Growing Pains

The Docker setup worked great. CI/CD was humming along. And then at work, the requirements changed. Multiple services needed to scale independently. Some needed to run on a schedule. Others needed to survive node failures without manual intervention. Docker Compose on a single machine wasn’t going to cut it anymore.

Enter Kubernetes. Or as I like to call it, “Docker Compose’s older sibling who went to business school.”

The Translation Guide

If you already understand Docker, Kubernetes concepts map pretty directly. The names just get fancier.

| Docker Compose | Kubernetes | What Changed |
| --- | --- | --- |
| Container | Pod | A pod can have multiple containers (sidecars) |
| Service (in compose) | Deployment | Manages replicas, rolling updates, rollbacks |
| Port mapping | Service (ClusterIP) | DNS-based discovery instead of port numbers |
| `docker compose up --scale <svc>=3` | HPA (Horizontal Pod Autoscaler) | Auto-scales based on CPU/memory metrics |
| Docker network | Namespace | Logical isolation + RBAC boundaries |
| `restart: always` | Built-in self-healing | K8s restarts crashed pods automatically |
| `.env` file | Secrets / ConfigMaps | Managed by the cluster, not files on disk |

The biggest mental shift: you stop thinking about where things run and start thinking about what should be running. You tell Kubernetes “I want 3 replicas of this service” and it figures out which nodes to put them on.
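A minimal sketch of that declarative idea — the metadata and label names here are illustrative, reusing the image tag from this post:

```yaml
# Hypothetical Deployment: declare *what* should run; the scheduler
# decides *where* the pods land across the cluster's nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  replicas: 3 # "I want 3 replicas" — Kubernetes keeps this true
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
        - name: app
          image: registry.example.com/my-service:v1.2.3
```

If a node dies and takes a pod with it, the Deployment controller notices the replica count dropped below 3 and schedules a replacement elsewhere — no manual intervention.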

One Image, Many Roles

One pattern I use at work is deploying the same Docker image with different startup commands. Instead of building separate images for the API server, the scheduler, and a background worker, it’s one image — three deployments:

```yaml
# The API server — handles HTTP requests
containers:
  - name: app
    image: registry.example.com/my-service:v1.2.3
    command: ["npm", "run", "start:server"] # serves the API
    resources:
      requests:
        cpu: 100m # minimum CPU guaranteed
        memory: 384Mi # minimum memory guaranteed
      limits:
        cpu: 500m # maximum CPU allowed
        memory: 512Mi # maximum memory allowed

# The scheduler — runs cron jobs (only 1 replica to avoid duplicates)
containers:
  - name: app
    image: registry.example.com/my-service:v1.2.3
    command: ["npm", "run", "start:scheduler"] # processes scheduled tasks
```

Same codebase, same image, different entry points. The scheduler runs as a single replica (you don’t want two instances both trying to send the same scheduled email). The API server scales to handle traffic.
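The third role, the background worker, follows the same pattern — a sketch, assuming a hypothetical `start:worker` npm script:

```yaml
# The background worker — pulls jobs off a queue; safe to run in parallel
containers:
  - name: app
    image: registry.example.com/my-service:v1.2.3
    command: ["npm", "run", "start:worker"] # illustrative entry point
```

Unlike the scheduler, workers can usually scale horizontally, since each job is claimed by exactly one worker from the queue.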

Scaling on Autopilot
Horizontal Pod Autoscaler (HPA) adjusts replicas based on real metrics. No more guessing how many instances you need:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-service
spec:
  scaleTargetRef: # which Deployment to scale
    apiVersion: apps/v1
    kind: Deployment
    name: my-service
  minReplicas: 3 # never go below 3
  maxReplicas: 10 # never go above 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # scale up when CPU > 70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80 # scale up when memory > 80%
```

During off-peak hours, you run 3 pods. Traffic spikes? Kubernetes spins up more, up to 10. Traffic drops? It scales back down. You pay for what you use (mostly).
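If the scale-down feels too twitchy, the `autoscaling/v2` API also accepts a `behavior` stanza under `spec`. A sketch of one knob:

```yaml
# Optional: require 5 minutes of sustained low usage before removing pods,
# so a brief lull between traffic spikes doesn't thrash replica counts.
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
```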

Are You Alive? Are You Ready?

Kubernetes has two types of health checks, and the distinction matters:

Liveness probe: “Is this container still alive?” If it fails, Kubernetes kills the container and restarts it. This catches containers that are technically running but stuck (deadlocked, out of memory, infinite loop).

Readiness probe: “Can this container handle traffic right now?” If it fails, Kubernetes removes the pod from the Service’s load balancer. The container stays running — it just doesn’t receive new requests until it’s ready again.

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30 # give the app 30s to start
  periodSeconds: 10 # check every 10s
  failureThreshold: 3 # 3 failures = restart

readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 10 # shorter — we want to serve traffic ASAP
  periodSeconds: 5 # check more frequently
  failureThreshold: 3
```

Both hit the same /health endpoint, but the readiness probe starts earlier and checks more often. You want to start serving traffic as soon as possible, but you want to be sure the container is actually functional before you do.

Resource Requests: The Scheduler’s Cheat Sheet

Every container declares what it needs (requests) and its maximum (limits):

```yaml
resources:
  requests:
    cpu: 100m # "I need at least 0.1 CPU cores"
    memory: 256Mi # "I need at least 256MB RAM"
  limits:
    cpu: 500m # "Don't let me use more than 0.5 cores"
    memory: 512Mi # "Kill me if I exceed 512MB"
```

Requests affect scheduling — Kubernetes places pods on nodes that have enough capacity. Limits enforce boundaries — exceed your memory limit and the container gets OOM-killed. Set requests too low and your pods get scheduled on crowded nodes. Set them too high and you waste resources.

Getting these numbers right is more art than science. Start generous, monitor actual usage (we’ll cover that in a later post), and adjust.

The K9s Connection

Remember k9s from the terminal post? This is where it shines. Instead of typing `kubectl get pods -n my-namespace`, you type `:pods` in k9s and navigate visually. View logs, shell into containers, port-forward — all with keyboard shortcuts. Once you’re managing a cluster with dozens of pods across multiple namespaces, a TUI makes a real difference.

What’s Under the Cluster

So far I’ve talked about what runs on Kubernetes, but not what runs Kubernetes. The cluster itself needs to live somewhere — and managing that infrastructure is its own challenge. We run ours on Google Kubernetes Engine (GKE), which handles the control plane so we don’t have to. That’s the next post.

Aaron Yong
Building things for the web. Writing about development, Linux, cloud, and everything in between.
