**Elements of Kubernetes**
published: 17 December 2020
In this article, I lay out the elements of Kubernetes _from an application
developer's perspective_. Familiarity with deploying applications in a
production environment is expected[^familiarity].
Rather than explaining Kubernetes from the ground up and building a pie in the
sky that no-one will eat because it has strawberries and whipped cream and who
likes that, I introduce Kubernetes as a solution for problems you are
encountering in high-traffic production environments right now:
- Your application needs to serve 1000s of requests per second so you need many instances of the application running
- Clients of your application need to connect to one IP address and requests will be routed to the different application instances automatically
- You need to be able to push changes to your application without downtime
- You need to be able to rollback changes to your application
- Your application needs to be _warmed up_: things need to happen before your application can start serving up requests. For example, the application instance needs to sync its data with the data of other instances
- When an instance of your application dies (and is thus no longer able to serve requests), that instance needs to become invisible to clients of your application. Traffic should be guided to still-functioning instances of your application
- To save costs during the night, when you need less capacity for your main application, you want to reuse some of the available capacity to run batch jobs
- There will be times when there will not be enough machines to serve your traffic (during a promotion, or, say, the holiday period), so you want to increase, _scale up_, the number of physical machines you have available. After the busy period, you want to get back to normal and scale down
It will come as no surprise (if you _are_ surprised: Sinterklaas does indeed
exist _and_ the world is a beautiful place where no-one drags themselves
through wondering _where and when in the name it all went wrong_) that
_Kubernetes_ is _a_ solution for those problems. I'll refer to _Kubernetes_ by
its common abbreviation of _k8s_ -- pronounced _kates_, because that's what all
the cool kids do and I'll do the same even though it's aeons since I was a kid.
Even though Kubernetes is a large hammer, it does hammer the above problems squarely
away. I disparagingly say _hammer_ as I say almost everything disparagingly but
also because k8s is able to do more than what you'd need to solve the above
stated problems. Blame that on its historic origins at a Cloud Provider
where many different applications need to run on the same cluster of
machines in a desperate capitalistic attempt to optimize usage of
infrastructure. Given the problems I laid out, I'm not _that_ interested
in running different applications on the same infrastructure, although the
particular requirement to save cost during the night and run some batch jobs on
the same machines that your application runs on during the day, is designed to
hint at that use.
The requirements as laid out above come from my personal experience, and the
limited knowledge I have of k8s comes from _Kubernetes in Action_ by
Marko Lukša. A book that, like all good books, gave me all the misplaced
confidence to play an expert on the Internet. The book goes in
depth and is useful for infrastructure administrators (which is a whole world
of joy unto itself that I will barely touch on here). In this article, I mostly
stick to what I know, application development, and I'll go over every single
one of the above problems/requirements and discuss how k8s helps solve them.
# Instances of your Application
Your application needs to serve 1000s of requests per second so you need many instances of the application running
## A Containerized Application
As an example application, I'll run a web server that serves up the sentence
```
Hi from HOST
```
where `HOST` will be the host the application is running on. We're in a brave
new world, so this application is packed up so it can run as a Docker
container. I posted the source of this application on
and I pushed the container image to the [Docker Hub](https://hub.docker.com/)
with name [stijnh/hi_app](https://hub.docker.com/r/stijnh/hi).
If at this point you feel slightly lost and it's related to this article, this
may be a good time to take a break to read up on the elements of Docker, or
have a Snickers, what do I know.
The container image is public so you can try out the app on your laptop right now with:
```shell
docker container run -d -p 8111:8111 -t stijnh/hi
```
Recall that `docker container run` indicates you're asking to run a container
image; `-d` indicates you're going to do that in the background as a daemon
(rather than interactively), `-p 8111:8111` means you're mapping your `localhost`
port `8111` (on your laptop) to port `8111` on your docker container behind which
your app is running, and the final argument `stijnh/hi` indicates the image
you're using to get the container from (`-t` allocates a pseudo-TTY).
Open your browser and navigate to `localhost:8111`. You will see
```
Hi from fbd698569ec5
```
where `fbd698569ec5` will be different for you as this
is the container's hostname.
At this point you have _1 instance of your application running_. It does so as
a container on your laptop. Your laptop is the _machine_ (or for people that
like their engineering terms like their sunsets, with drama, your _bare metal_).
## Multiple Instances of your Application
You have 1 instance of your app running, but you'll have thousands of users that
you need to say hi to, so you can't have only 1 instance -- you have dreams,
megalomaniac dreams! More instances! At least 3!
But first I need to define a tad more precisely what we need 3 of. If I need 3
instances of the application, I need 3 containers (each container runs
the app). As this example is focused on a particular set of requirements that k8s solves, I've
not mentioned other real-world requirements like logging. Typically, your
application will be writing logs, and you'll need something to _rotate_ these
logs away to more permanent storage (to [AWS's S3](https://aws.amazon.com/s3/)
for example, where you could query it using
[Athena](https://aws.amazon.com/athena/)). I want this process of log
rotating in a separate container (other apps could use it as well so it's not
specific to this app). I want these 2 containers always running
together[^sidecar], so if I say I want 3 instances of my app, I actually want 3
instances of my app with the logging container. Hence, the first level of
abstraction -- the base unit k8s deals with -- is _not_ a container, it's
something called a _pod_. So let's do some more wrapping, and wrap that
container into a _Pod_ manifest[^manifest] that describes what the pod should
look like. If you recall `Dockerfile`s, then this is similar in philosophy: in
a `Dockerfile` you describe your app so that `docker` knows how to build an
image, in a _Pod_ manifest you describe what all goes in your _Pod_, most
importantly what container images to use so that k8s knows how to create a Pod out
of several containers.
Pods can be defined in [YAML](https://en.wikipedia.org/wiki/YAML) files, so
I'll have a file `hi-pod.yaml`:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hi-pod
spec:
  containers:
  - image: stijnh/hi
    name: hi
    ports:
    - containerPort: 8111
```
which is not doing much more than naming my pod `hi-pod` and indicating that
it's running the container [`stijnh/hi`](https://hub.docker.com/r/stijnh/hi).
Let's go over the pod definition, line by line:
`apiVersion: v1` indicates what API version of k8s this k8s _resource_ is defined
for (_Pods_, together with a whole lot of other things, are named _resources_ in
k8s), in this case `v1`. I have no memory, nor patience, for remembering API versions,
but there are [explanations of what version to use](https://matthewpalmer.net/kubernetes-app-developer/articles/kubernetes-apiversion-definition-guide.html).

`kind: Pod` indicates the type of resource you're defining, a `Pod`.

The `metadata` section indicates metadata about the pod, in this case its name `hi-pod`.
```yaml
spec:
  containers:
  - image: stijnh/hi
    name: hi
    ports:
    - containerPort: 8111
```

indicates the pod's specification, in this case a list of 1 container, where
the container is described by its image tag `stijnh/hi`, given the name `hi`, and
exposing the port `8111`. A pattern worth remembering is the 4 sections that
you will see with other types of resources as well: the `apiVersion`, the
`kind`, the `metadata`, and the `spec`.
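To make the logging-sidecar idea from earlier concrete, here's a sketch of what a two-container pod manifest could look like. The `log-rotator` image, its name, the shared volume, and the assumption that the app writes its logs to a file under `/var/log/hi` are all hypothetical, stand-ins for whatever log shipper and log path you'd actually use:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hi-pod
spec:
  containers:
  - image: stijnh/hi            # the app (assumed to write logs to /var/log/hi)
    name: hi
    ports:
    - containerPort: 8111
    volumeMounts:
    - name: logs
      mountPath: /var/log/hi
  - image: example/log-rotator  # hypothetical sidecar that ships logs off the node
    name: log-rotator
    volumeMounts:
    - name: logs
      mountPath: /var/log/hi
  volumes:
  - name: logs                  # scratch space shared by both containers
    emptyDir: {}
```

Both containers live and die together: if the pod is rescheduled, they move as one unit, which is exactly the behavior we wanted from "3 instances of my app with the logging container".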
To deploy this pod, I assume you're on your laptop and have something like
[minikube](https://github.com/kubernetes/minikube) installed. As I'm tackling
k8s from the perspective of an application developer, I do not tackle setting
up more than a toy cluster using `minikube`. If you want to sing along, go
check out [minikube](https://github.com/kubernetes/minikube) and install it.
While installing `minikube`, you might have read that minikube will give you a
_single-node cluster_. You have only ever heard cluster in the context of "What
a cluster...! Who ate the last Oreos??!!!" Your feeling for the English
language tells you that clusters may be related to that exclamation of frustration,
but not exactly the same. My working definition of a _cluster_ is that it's a
set of nodes. What's a _node_? Again, my working definition is _that's a
machine, an EC2 instance, a laptop, some kind of thing with a CPU and
memory, an old Pentium tower in a dusty basement, ..._ A typical mapping in my
head would be _if I have 40 [EC2](https://aws.amazon.com/ec2/) instances in the EU region to serve traffic_,
that's a _cluster_ of _40_ nodes.
With `minikube` running, try this:
```shell
$ kubectl get nodes
NAME       STATUS   ROLES    AGE   VERSION
minikube   Ready    master   0d    v1.18.3
```
`kubectl` is the command-line tool for interacting with your cluster (asking it
about nodes, pods, ...). The output of the command shows 1 node, called
`minikube`. All later examples involve that 1 node. In real life, you'll
work with clusters that have many more nodes.
We have a cluster with 1 node. We have our `hi` app, tucked away in a
container, and we defined a pod that is supposed to run that container. Let's
deploy that pod using `kubectl`:
```shell
$ kubectl create -f hi-pod.yaml
```
Verify that your pod is created:
```shell
$ kubectl get pods
NAME     READY   STATUS    RESTARTS   AGE
hi-pod   1/1     Running   0          54s
```
This shows that the pod is `Running` and has been doing so for `54s`. Note the levels of indirection:
- when you ran the app `hi.js` on your laptop, you could navigate to `localhost:8111` and see the `Hi from...`
- when you ran the app in a container, you needed to forward the localhost's port `8111` _into_ the container using `docker container run -d -p 8111:8111 -t stijnh/hi`
- when you run the app in a container in a pod, you need to do what?
You could forward the port via `kubectl`, but that is mostly for debugging.
I'll answer this question in the next section where I'll talk about how to make
your pod visible to other pods. For now, I'll just show how to check your logs
on the pod with name `hi-pod`:
```shell
$ kubectl logs hi-pod
hi: listening on port 8111 of host hi-pod
```
That's indeed what the app logs when you start it. Note that the host is listed as `hi-pod` which
corresponds to the name we gave the pod in `hi-pod.yaml`.
See more details for your pod by doing a `describe`:
```shell
$ kubectl describe pod hi-pod
```
Scroll down to the last section, `Events`, which lists what happened to the pod
(scheduling, pulling the image, starting the container):

```shell
Type    Reason     Age   From    Message
----    ------     ----  ----    -------
```
"Yes, but no," I say. "The unit of operation when your application is managed by k8s is a pod, so we want 4 pods, each in turn running 1 container."
"So I _was_ right," you say.
"Stop being pedantic and pay attention." There's a k8s resource type (the first one we met was a `Pod`) that allows to specify how many pods k8s to keep around: a [`Deployment`](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/). The word on the street is that you should specify more than 1 pod, to ward off loneliness. As with pods, you specify a deployment using a YAML manifests. The line that gets me my 4 pods is: ```yaml spec: replicas: 3 ``` Glad to see you're awake Grasshopper. Four of course: ```yaml spec: replicas: 4 ``` How will you describe _which_ pod you want 4 replicas of? Kubernetes uses _labels_ to determine which replicas you're talking about. On the _Pod_ side, you have to make sure your pods are labeled, and then you indicate in your `Deployment` what labels you're describing. An example of a label is `app=hi`. Another label could be `tier=dev`, or `color=blue`. The latter is obviously no good, `color=yellow` is be better. Labels are _key/value pairs_, and you can have multiple different labels on a pod (but you cannot have 2 labels with the same _key_: so `color=blue` and `color=yellow` does not work). On the `Deployment` side, you then indicate that you want 3 replicas of all pods matching _these labels_. For example, you indicate that you are describing pods with label `app=hi`: ```yaml spec: selector: matchLabels: app: hi ``` Easy enough.
"But but, we did not give our pod this label, so there's nothing to find."
Yes. The final piece of such a `Deployment` is to describe what kind of pods you want to create, a so-called _pod template_:

```yaml
spec:
  template:
    metadata:
      labels:
        app: hi
    spec:
      containers:
      - image: stijnh/hi
        name: hi
        ports:
        - containerPort: 8111
```

The latter looks like the pod definition earlier, except that it also specifies that the pod needs to have a label `app=hi` as metadata. In fact, now that we have a deployment you can forget all about that initial pod definition. This `Deployment` knows all it needs to know to create your 3 pods (_4! pay attention!_). Put it all together:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hi-deployment
spec:
  replicas: 4
  selector:
    matchLabels:
      app: hi
  template:
    metadata:
      labels:
        app: hi
    spec:
      containers:
      - image: stijnh/hi
        name: hi
        ports:
        - containerPort: 8111
```

This specification describes a `Deployment` with name `hi-deployment`. It specifies that at all times there need to be 4 replicas of a `Pod` matching `app=hi`. The `template` indicates how the `Deployment` will go about creating new pods: it will run the container `stijnh/hi` and it will label the pod with `app=hi`. The latter is important as this ensures that the just-created Pod is managed by the `Deployment`. Imagine if we created a `Pod` with label `app=yo`: the `Deployment` would never conclude it reached 4 replicas of pods with label `app=hi`, so pods would keep on getting created.
Before you try this `Deployment`, delete all resources you have so far, and then check that you have 0 pods running:

```shell
$ kubectl delete all --all
$ kubectl get pods
```

Now, create a deployment using the usual `create` based on the above YAML (`hi-deployment.yaml`):

```shell
$ kubectl create -f hi-deployment.yaml
```

And then watch the magic:

```shell
$ kubectl get pods
NAME                             READY   STATUS              RESTARTS   AGE
hi-deployment-5f7b895fd9-5hb5h   0/1     ContainerCreating   0          3s
hi-deployment-5f7b895fd9-dt8kf   0/1     ContainerCreating   0          3s
hi-deployment-5f7b895fd9-r58kc   0/1     ContainerCreating   0          3s
hi-deployment-5f7b895fd9-wgwpp   0/1     ContainerCreating   0          3s
```

Note the `STATUS` `ContainerCreating`, and then a couple of seconds later:

```shell
$ kubectl get pods
NAME                             READY   STATUS    RESTARTS   AGE
hi-deployment-5f7b895fd9-5hb5h   1/1     Running   0          8s
hi-deployment-5f7b895fd9-dt8kf   1/1     Running   0          8s
hi-deployment-5f7b895fd9-r58kc   1/1     Running   0          8s
hi-deployment-5f7b895fd9-wgwpp   1/1     Running   0          8s
```

Verify that the pods have the label `app=hi` so they're under management of the `Deployment`:

```shell
$ kubectl get pods --show-labels
NAME                             READY   STATUS    RESTARTS   AGE     LABELS
hi-deployment-5f7b895fd9-5hb5h   1/1     Running   0          2m40s   app=hi,pod-template-hash=5f7b895fd9
hi-deployment-5f7b895fd9-dt8kf   1/1     Running   0          2m40s   app=hi,pod-template-hash=5f7b895fd9
hi-deployment-5f7b895fd9-r58kc   1/1     Running   0          2m40s   app=hi,pod-template-hash=5f7b895fd9
hi-deployment-5f7b895fd9-wgwpp   1/1     Running   0          2m40s   app=hi,pod-template-hash=5f7b895fd9
```

Yep, there we have it, `app=hi`. I promised that the `Deployment` specifies that at all times there need to be 4 replicas. So what if you delete one?
For example, delete the first pod:

```shell
$ kubectl delete pod hi-deployment-5f7b895fd9-5hb5h
```

Then check your pods with label `app=hi` -- you can use the `-l` flag to only see pods with that label:

```shell
$ kubectl get pods -l app=hi
NAME                             READY   STATUS    RESTARTS   AGE
hi-deployment-5f7b895fd9-dt8kf   1/1     Running   0          6m6s
hi-deployment-5f7b895fd9-h7mz7   1/1     Running   0          82s
hi-deployment-5f7b895fd9-r58kc   1/1     Running   0          6m6s
hi-deployment-5f7b895fd9-wgwpp   1/1     Running   0          6m6s
```

Still 4 pods, shucks! But if you look at the `NAME`s you see that `hi-deployment-5f7b895fd9-5hb5h` is gone and replaced by the new pod `hi-deployment-5f7b895fd9-h7mz7` (note its `AGE` of only `82s`). The `Deployment` guaranteed that 4 replicas are running at all times. The Zombie Apocalypse is nigh.

We checked the 4 pods using `kubectl get pods`. You can also get more info on the `Deployment` resource:

```shell
$ kubectl get deployment
NAME            READY   UP-TO-DATE   AVAILABLE   AGE
hi-deployment   4/4     4            4           8m48s
```

Note that it indicates that `4` out of `4` pods are ready. In this article, I'll stay at the abstraction of `Deployment` and `Pods`, but if you do the following, you'll see that there's something called a [`ReplicaSet`](https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/) created as well:

```shell
$ kubectl get replicaset
NAME                       DESIRED   CURRENT   READY   AGE
hi-deployment-5f7b895fd9   4         4         4       10m
```

It's actually this `ReplicaSet`, created by the `Deployment`, that makes sure that the replicas are kept at `4` (see the `DESIRED` column). The `Deployment` does not create the pods directly. I'll not go into more detail around `ReplicaSet`s as you'll usually not deal with them directly.

![A Deployment](./diagrams/deployment.svg)

In the previous, I kept referring to 4 instances of the application or 4 pods, not 4 nodes. The 4 pods running the containers are all running on the same node on your laptop.
In reality, you'll have more than 1 node to run your application, and your application instances may have been scheduled on different nodes to accommodate, for example, the CPU requirements of your application.

# Clients of your Application

Recall the 2nd requirement I listed in the Introduction:

Clients of your application need to connect to one IP address and requests will be routed to the different application instances automatically

## (Internal) Clients on the Same Cluster

First consider clients on the same cluster. Why is it tricky for clients to connect to your application?

- Your application runs on several pods. Each pod has its own IP. For a client to connect to a pod, it needs to know the IP of the pod, but since pods are _ephemeral_ (they could be removed when a node fails etc.), that IP may change, so even if your client has the IP of the pod, that IP may become invalid over time
- There are multiple instances of your application running (4 in our case), so the client needs to know all 4 IPs and then select one to connect to

How do you avoid a client needing to know that list of ever-changing IPs? Another k8s resource to the rescue: a _Service_. A _Service_ gives you 1 IP and load-balances requests to that IP by redirecting them to the pods that are able to serve the request. As with _Deployments_ before, one question is how the Service will know what Pods it fronts. Labels! Remember that all of the pods in our current deployment have label `app=hi`. We can define a _Service_ that provides the IP and load-balancing for exactly those pods.
Create a file `hi_service.yaml`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: hi-service
spec:
  ports:
  - port: 80
    targetPort: 8111
  selector:
    app: hi
```

I defined a service (`kind: Service`), a name for the service (`hi-service`), and I specified that port `80` of the service forwards to port `8111` of the container (if you scroll up, you can see that `8111` is indeed the port our `hi` app is listening on). Finally, by using a `selector`, I specified that this service fronts pods with label `app=hi`. Create the service as usual:

```shell
$ kubectl create -f hi_service.yaml
```

And verify it's there:

```shell
$ kubectl get services
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
hi-service   ClusterIP   10.105.43.171
```