11 minutes away from Kubernetes

A hyper-condensed 101 of the container orchestration platform in plain(ish) English to hit the ground running

Photo by Manuel Nägeli on Unsplash

Kubernetes (K8s) is an open-source system for automating deployment, scaling, and management of containerized applications.

I think it’s a short yet simple definition. The most common containerisation platform used with Kubernetes is Docker, so this article will start from a simple Docker image and explain the different moving parts of a Kubernetes cluster that let you deploy your application, run it, scale it and manage it.

The basics

For starters, if you’ve never heard of the term cluster, just think of it as a group of computers working as a unit, with some processes coordinating the communication and actions between all these computers, or nodes. The process of managing this, and the workloads running in the cluster, is called orchestration (as in “Kubernetes is an open-source container-orchestration system for…”).

Orchestrating the workload

The application container is wrapped in a Pod, the smallest computing unit managed by Kubernetes
To avoid managing individual Pods, workloads are organised in ReplicaSets and Deployments
Containers run in Pods; a ReplicaSet is in charge of ensuring a given number of those Pods are running at any point in time; and a Deployment orchestrates all the steps necessary to get the Pods up and running.
The Service is the method by which requests from outside your workload are routed to your Pods
  • NodePort — your Service is exposed on a static port on each Node, making it accessible from outside the cluster.
  • LoadBalancer — the cluster requests a load balancer from the cloud provider, in the form of a static IP (extra charges might apply here). This also makes the Service accessible from outside the cluster.
  • ExternalName — creates a CNAME record in Kubernetes’ internal DNS (for example subdomain.domain.com) that “overrides” the real DNS record (only within the cluster). For instance, say you’re migrating your workload from your own datacenter and your application normally reaches its DB at db-01.example.com. Before migrating everything to Kubernetes you want to move only the test environment, so you deploy the DB in your cluster. You don’t need to change your application code: by creating an ExternalName Service, all the requests are properly re-routed to the new instance.
Example of how an ExternalName service overrides DNS resolution within the cluster
In the cluster on the left there’s no ExternalName Service, so requests to db-01.example.com resolve normally. In the cluster on the right the Service exists, so the same domain resolves internally.
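As a rough sketch of the manifest shape (both names are made up; the exact values depend on your own DNS setup):

apiVersion: v1
kind: Service
metadata:
  name: db-01                              # hypothetical name matching the host you want to intercept
spec:
  type: ExternalName
  externalName: db-test.cluster.internal   # hypothetical host the CNAME record will point to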
A headless Service will generate A records for all the Pod IPs it’s serving, under a domain name matching the Service name
An example of A records automatically configured by the Service, pointing to the Pod IPs
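For reference, a headless Service is just a regular Service with clusterIP set to None; a minimal sketch (name, label and port are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: my-db                  # Pod A records are published under this Service name
spec:
  clusterIP: None              # “headless”: no virtual IP; DNS returns the Pod IPs directly
  selector:
    app: my-db                 # Pods carrying this label are the ones that get A records
  ports:
    - port: 5432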
Pods are labeled and Services use these labels to determine how and where to route the incoming requests.
A Service uses Pod labels to determine where to route the requests (yes, the third colour is indigo; to me it was just purple but, as it turns out, it’s not)
  • ReplicaSet — the controller responsible for ensuring a certain number of Pods are running at any point in time.
  • Deployment — the orchestra director: it organises all the resources so your Pods run the way you want them to. This is what you’ll mainly be working with.
  • Service — a bridge to the “outside world” (a minimal sketch of all three follows this list).
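To make the three concrete, here is a minimal, illustrative Deployment plus a NodePort Service that selects its Pods by label (the names, image and ports are placeholders, not anything prescribed by Kubernetes):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
spec:
  replicas: 3                  # the Deployment creates a ReplicaSet that keeps 3 Pods running
  selector:
    matchLabels:
      app: my-app
  template:                    # Pod template: every Pod it creates carries the app=my-app label
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:1.0    # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: NodePort               # exposed on a static port on every Node
  selector:
    app: my-app                # routes to any Pod carrying this label
  ports:
    - port: 80
      targetPort: 8080

Applying this creates the Deployment, which in turn creates a ReplicaSet that keeps three app: my-app Pods running; the Service routes traffic to whichever of those Pods exist at any given moment.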

Bonus track: Autoscalers

There’s another component called the HorizontalPodAutoscaler (HPA) that, based on some configurable metrics, adjusts the number of Pods running in a ReplicaSet so the incoming requests can be served properly; this number fluctuates between previously defined upper and lower bounds. CPU, network packets per second and HTTP requests per second are just some of the metrics you can mix and match. This autoscaling only affects Pods, so it needs to be complemented by Node autoscaling; otherwise, depending on the bounds configured, there might not be enough resources to host the new Pods and some of them might not be scheduled.
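As an illustration only (the target, bounds and threshold are invented), an HPA using the autoscaling/v2 API and average CPU utilisation as its metric might look like this:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:                  # what to scale: the Deployment (and therefore its ReplicaSet)
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 2                   # lower bound
  maxReplicas: 10                  # upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU across the Pods exceeds 70%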

Configuration objects

In order to tell Kubernetes how to run our workloads, there are two main configuration objects:

  • Secrets — for sensitive information
  • ConfigMaps — for plain, non-sensitive configuration
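A hedged sketch of how a Secret is defined and then consumed as an environment variable (the names and key are invented):

apiVersion: v1
kind: Secret
metadata:
  name: db-credentials            # hypothetical name
type: Opaque
stringData:                       # written in plain text here; stored base64-encoded by Kubernetes
  DB_PASSWORD: change-me

# In a container spec, the value can then be injected as an environment variable:
#   env:
#     - name: DB_PASSWORD
#       valueFrom:
#         secretKeyRef:
#           name: db-credentials
#           key: DB_PASSWORD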

Security

Kubernetes has something called a ServiceAccount: a way to represent a (machine) user that is allowed to execute certain operations in the cluster. You can define as many as you want, but usually the default one is enough. A ServiceAccount can be modified and, for example, given permission to pull from a private Container Registry (the repository the Pods’ Docker images are pulled from) that requires authentication.
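For example (names purely illustrative), a ServiceAccount can reference an image-pull Secret, and the Pod template then opts into that ServiceAccount:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-runner                    # hypothetical name
imagePullSecrets:
  - name: my-registry-credentials     # a docker-registry type Secret created beforehand

# In the Pod template of a Deployment:
#   spec:
#     serviceAccountName: app-runner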

Kubernetes internal architecture considerations

  • What I described above are some of what I call the logical components of the cluster. Kubernetes also has the concept of Nodes; these, in contrast, I call the physical part: the bare-metal servers, VMs, etc. where the Pods are actually allocated and consume resources from.
Nodes are organised in NodePools and workloads can be configured to have higher affinity with a NodePool than others
  • Kubernetes has its own administrative workload, and that needs to run on your Nodes, so take this into account when sizing the Node pools. The cloud provider you use might add its own services on top, so that’s another thing to factor in.
  • Kubernetes organises the objects deployed in the cluster in namespaces. These provide another layer of isolation and security, so, for instance, it’s possible to restrict certain namespaces to certain users. By default, all user-generated workloads go to the default namespace, whilst Kubernetes’ own workload lives in the kube-system namespace.
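As a sketch of both ideas (the nodepool label and names are assumptions, not standard values), a Deployment can state its namespace explicitly and pin its Pods to a NodePool via a Node label:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker
  namespace: default               # explicit namespace; omitting it lands the object in default anyway
spec:
  replicas: 1
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      nodeSelector:
        nodepool: high-memory      # hypothetical label carried by the Nodes of a given NodePool
      containers:
        - name: worker
          image: my-worker:1.0     # placeholder image

Softer preferences, rather than this hard requirement, can be expressed with nodeAffinity and preferredDuringSchedulingIgnoredDuringExecution instead of nodeSelector.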

Stateless vs stateful

It’s one of the main differentiators between workloads, and it basically comes down to this question: do I need to keep the data stored in this particular Pod?

In Kubernetes, a stateless deployment is simply called a Deployment, while a stateful one is called a StatefulSet.
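As a skeletal illustration (the Service name, image and storage size are placeholders), a StatefulSet looks a lot like a Deployment, but it references a headless Service for stable per-Pod DNS names and can request one persistent volume per Pod:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-db
spec:
  serviceName: my-db             # a headless Service (like the one sketched earlier) for stable per-Pod DNS names
  replicas: 3
  selector:
    matchLabels:
      app: my-db
  template:
    metadata:
      labels:
        app: my-db
    spec:
      containers:
        - name: db
          image: my-db:1.0       # placeholder image
          volumeMounts:
            - name: data
              mountPath: /var/lib/data
  volumeClaimTemplates:          # one PersistentVolumeClaim per Pod, so each Pod keeps its own data
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi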

The API

Why an API?

The Kubernetes API is a REST API and, like any other API, it lets you query and modify the components of your cluster. Every operation executed by a user or a process, whether internal or external, goes through this API.

kubectl create deploy my-deployment --image=my-app
kubectl apply -f my-deployment.yaml

Some other command examples

You can find these in the official Kubernetes docs; the only reason I’m including them here is to show how clear they tend to be, so the learning curve is relatively flat.

kubectl get pods
kubectl describe deploy my-deployment
kubectl delete rs my-replicaset

