Looking back, 2017 was the year Kubernetes conquered the container orchestration space. For years, rivals such as Docker Swarm and Mesos had been offering their own container orchestration tools, and now both have added support for Kubernetes within their ecosystems. The largest cloud providers, such as AWS, Microsoft Azure and Oracle Cloud, announced Kubernetes integrations into their respective platforms, not to mention Google, where Kubernetes originally came from. So every developer would benefit from learning at least the basics of Kubernetes. That's exactly what we are going to do in this post.
Before we get started I want you to watch this awesome animated guide first. Then come back and we will discuss the details:
Did you watch it? NO? GO BACK TO THE VIDEO YOU STUBBORN LITTLE DEVELOPER! Good. Now let's get the formal definition of Kubernetes out of the way:
Kubernetes is a system for managing containerized applications across a cluster of nodes
In simple terms, you have a group of machines (e.g. VMs) and containerized applications (e.g. Dockerized applications), and Kubernetes will help you to easily manage those apps across those machines. We will see a practical example later.
Kubernetes components
A Kubernetes cluster consists of a Master and Nodes:
The Master is the controlling machine and hosts the components that act as the main management contact point for users. Nodes are where your containerized apps run. Simply put, you run your containerized apps on nodes and you control them through the master.
Both master and nodes have very important components which we discuss below.
Master Components
Etcd is a consistent and highly-available key-value store used as Kubernetes’ backing store for all cluster data. Basically, it is a database for Kubernetes data and represents the state of the cluster.
API Server is what exposes the Kubernetes API, as its name suggests. It is the main management point of the entire cluster and acts as the bridge between various components, disseminating information and commands. In simple terms, it is the frontend of the Kubernetes control plane.
Controller Manager is responsible for regulating the state of the cluster and performing routine tasks. For example, the replication controller ensures that the number of replicas defined for a service matches the number currently deployed on the cluster. Another example is the endpoints controller adjusting, well, endpoints by watching for changes in Etcd.
Scheduler Service is what assigns workloads to nodes. This is how it does it (a rough sketch of what it reads follows this list):
- Reads the workload's operating requirements
- Analyzes the current infrastructure environment
- Places the workload on an acceptable node (or nodes)
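For example, here is a rough sketch of the kind of information the scheduler reads from a pod spec. The name, image, numbers and the disktype label are all made up for illustration:

apiVersion: v1
kind: Pod
metadata:
  name: my-app                # hypothetical name, for illustration only
spec:
  containers:
  - name: my-app
    image: nginx:1.13         # any container image would do here
    resources:
      requests:
        cpu: 250m             # the scheduler only picks nodes that still have
        memory: 256Mi         # at least this much unreserved CPU and memory
  nodeSelector:
    disktype: ssd             # optional: restrict placement to nodes labeled disktype=ssd

The requests and the node selector are the "operating requirements", and the scheduler matches them against what each node currently has available.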
Node Components
Docker is used in order to run your containers, duh!
rkt can be used as an alternative to Docker.
Kubelet is the main contact point for each node with the cluster group, relaying information to and from the control plane services (the master).
Proxy is used for maintaining network rules and performing connection forwarding. This is what makes the Kubernetes service abstraction work.
You won't interact with these components directly, but it is good to know what is happening behind the magic.
While the components above are something you don't strictly have to know, the following you must know. Pay great attention.
Kubernetes Work Units
- Pod is the most basic unit in Kubernetes. It represents a unit of deployment, i.e. a single instance of an application, which may consist of either a single container or a small number of tightly coupled containers that share resources (for example, a Cloud SQL Proxy container should run in the same pod as the main application). Besides the application container (or containers), a pod encapsulates storage resources, a unique network IP and options that govern how the container(s) should run.
You rarely have to deploy pods directly (I never have); mostly you will attach to one for debugging and testing purposes.
- Service groups together logical collections of pods that perform the same function and presents them as a single entity. It also acts as a basic load balancer between those pods, so consumers don't have to worry about anything beyond a single access location.
- Label is an arbitrary tag used to mark work units. Labels are basic key-value pairs, and they are what enable services to group pods together. Say you give your pods the label "microservice: auth"; a service with the same selector ("microservice: auth") will then be able to forward traffic to those pods.
- Deployment provides a declarative syntax to create and update pods. You tell a deployment your desired state (how many, how fast, when) and it changes the actual state to the desired state at a specified rate. A minimal manifest sketch follows this list.
- Ingress manages external access to the services. It provides load balancing, SSL termination and path/host based routing, which are its advantages over services of type "LoadBalancer". See below for more details.
- Ingress Controller is what implements Ingress definitions. That is, you write what you need in Ingress objects and an ingress controller turns them into reality. In other words, an Ingress by itself is nothing without an Ingress Controller.
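To make pods, labels, deployments and services a bit more concrete, here is a minimal sketch of a deployment and a service tied together by the "microservice: auth" label. The names, image and ports are made up for illustration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth
spec:
  replicas: 3                       # desired state: three identical pods
  selector:
    matchLabels:
      microservice: auth
  template:                         # the pod template
    metadata:
      labels:
        microservice: auth          # every pod created from this template gets the label
    spec:
      containers:
      - name: auth
        image: example/auth:1.0     # hypothetical application image
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: auth
spec:
  selector:
    microservice: auth              # forward traffic to any pod carrying this label
  ports:
  - port: 80
    targetPort: 8080

Apply something like this with kubectl apply -f and the deployment keeps three pods running while the service load balances requests across them.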
Full example
I know that this is all theory and theory is boring. You need to set these things up yourself in order to fully understand. That's why you should carefully go through this post and get your hands dirty:
Did you go through the example? Pretty cool huh? Kubernetes makes everything very easy. Now that you have seen a practical example, read these common pitfalls that I have been a victim of while learning Kubernetes. They will save you weeks of your time.
Common pitfalls
- Using services of type "LoadBalancer" to expose apps externally: In most tutorials, even in the official documentation, LoadBalancer services are used to expose the application. The reason is that it is really easy to do and great for testing. However, when you want to do SSL termination or route/host based routing, services are not your friends. Use Ingress for real applications (a rough sketch follows this list).
- GKE Ingress Controller: In GKE, you don't have to manage your own ingress controller because GKE has one managed for you. It is great and it works great. However, it cannot force https at the time of this writing. Maybe that will change in the future, but for now you will have to manage your own Ingress Controller if https is a must for your app, which it should be in 2018. See the full example above for how to do that.
- SSL certificates: Don't manage them yourself. Use kube-lego, which automatically renews your certificates when they are about to expire.
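As an illustration of that first pitfall, here is a rough sketch of an Ingress doing path based routing with SSL termination, using the extensions/v1beta1 API that was current when this was written. The hosts, service names and secret name are made up:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: main-ingress
spec:
  tls:
  - hosts:
    - app.example.com
    secretName: app-example-tls     # certificate lives in this secret (kube-lego can keep it renewed)
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api                  # path based routing
        backend:
          serviceName: api          # hypothetical services running in the cluster
          servicePort: 80
      - path: /
        backend:
          serviceName: dashboard
          servicePort: 80

Remember that this object does nothing on its own; an ingress controller (GKE's managed one or one you run yourself, e.g. the NGINX ingress controller) has to pick it up and turn it into actual load balancer configuration.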
Bonus
- Zero Downtime: By using something called a readiness probe together with a rolling update strategy, it is very easy to achieve zero-downtime deployments (a rough sketch follows at the end of this section). Let me know in the comments if you want a full post showing how to do this.
- Don't be afraid to switch to Kubernetes: Kubernetes is a new technology and is full of dark magic. That's why it is very natural to be afraid of switching from your old tools to Kubernetes, especially in production. I know I was terrified. So what I did was switch gradually. The first step was to forward 10% of the production traffic to our Kubernetes cluster and the remaining 90% to our old setup. The next step was to monitor how it was doing. If it was doing OK, we changed those numbers to 30% and 70%, and so on until 100% went to the Kubernetes cluster and 0% to our old setup. This way, you can make sure that your new Kubernetes cluster will do just fine even in production. We were using NGINX in our old setup, and this is how we split traffic between upstreams:
upstream dashboard_app_server {
server old-setup.com weight=9;
server new-kubernetes-cluster.com weight=1;
}
It means that 90% of the traffic goes to the old setup (old-setup.com) and 10% goes to the new Kubernetes cluster (new-kubernetes-cluster.com). Pretty easy.
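And since the zero downtime bonus above is just words, here is a rough sketch of what the readiness probe plus rolling update combination looks like in a deployment. The health path, port and numbers are made up for illustration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0             # never take an old pod down before its replacement is ready
      maxSurge: 1                   # roll out one extra pod at a time
  selector:
    matchLabels:
      microservice: auth
  template:
    metadata:
      labels:
        microservice: auth
    spec:
      containers:
      - name: auth
        image: example/auth:1.1     # hypothetical new version being rolled out
        readinessProbe:
          httpGet:
            path: /healthz          # hypothetical health check endpoint
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10

Kubernetes only sends traffic to a pod once its readiness probe passes, so combined with the rolling update settings above, old pods keep serving until the new ones are actually ready.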
Conclusion
I wish I had this material when I was learning Kubernetes. It would have saved me weeks of my time. I hope it saves some time for somebody else. Make sure to check out the full example. And always remember this quote from Kelsey Hightower himself:
Kubernetes is going to set you free. But it is going to piss you off first.
Thanks for reading.
Fight on!
You may also find this related post interesting: Nginx Ingress Controller