This article should help you to decide if you are ready to use Kubernetes and whether it’s the right platform to use for your application.
What is Kubernetes?
Kubernetes is a portable, extensible, open-source platform for managing containerised workloads and services, which facilitates both declarative configuration and automation. Kubernetes services, support and tools are widely available, and it has a large, rapidly growing ecosystem.
The name Kubernetes originates from Greek, meaning helmsman or pilot. The abbreviation K8s comes from counting the eight letters between the “K” and the “s”. Kubernetes combines Google’s experience of running production workloads at scale with best-of-breed ideas and practices from the community. Google open-sourced the Kubernetes project in 2014 (see 15 years of Google’s experience for more information).
Historically, organisations ran applications on physical servers. Typical challenges of using physical servers are:
- No easy way to define resource boundaries, leading to allocation issues in which one app consumes what it requires and starves other apps of the resources that they require to perform adequately.
- Scaling in any direction (up or down) is difficult and costly. To scale up often means financial outlay for new equipment and the cost/through-life-support that this involves. To scale down typically leads to under-utilised resources, which is not cost effective.
Virtualisation was developed to address many of the cost and performance bottlenecks of physical hardware and introduces efficiencies by allowing you to emulate multiple servers, otherwise known as Virtual Machines (VMs). These isolated VMs are configured to share the same hardware resources (CPU, memory, storage, networking), ensuring that you can effectively manage resource allocation and balance performance utilisation across multiple applications on the same physical hardware.
Virtualisation also addresses the issue of scalability. Adding, removing, and moving workloads becomes a very simple process, as does sharing workloads across disparate hardware. However, each VM runs a full operating system (OS), and a full stack of components, on top of virtualised hardware. Each VM therefore has the same costly lifecycle-management issues as its physical counterpart, e.g. OS upgrades and patching, access control, performance and logging.
Before containerisation, server virtualisation was the most efficient way of running applications. However, it always carries a resource overhead, which grows with the number of servers you maintain, because each VM must run a whole OS, including the kernel.
The ancestor of containerisation, chroot, has been around since Version 7 Unix was released in 1979. Isolating a running process from the root filesystem, and effectively simulating a root directory for it, became known as a chroot jail. Containerisation provides a lightweight platform in which each container shares the OS with other containers. Like a VM, a container has its own filesystem, share of CPU, memory, process space, and more. Furthermore, it is decoupled from the underlying infrastructure, making it portable across clouds and OS distributions.
Sharing a kernel has its drawbacks. I will attempt to cover this in detail in a separate article focused on Container Security and Platform hardening. I mention it here so you can bear it in mind. The risks are fairly well understood, and there are plenty of resources on the internet that dive into more detail. Having said that, some of the more obscure mitigations are dependent on your environment and can only be identified following assessment by an experienced Security Practitioner.
If you have containerised applications, you are ready to consider using Kubernetes to run your workloads.
When NOT to use Kubernetes
If you do not have the buy-in of Senior Stakeholders, stop here!
Building out a containerisation platform is an expensive undertaking. In the short term, many companies find that they are financing both the AWS infrastructure and the original on-premises infrastructure. In the medium term, and as confidence grows, the original infrastructure can be shut down and decommissioned, and perhaps the datacentre can be exited altogether.
If you do not have a Capacity Plan, stop here!
Before you set about building any new platform, you will need to understand the short-term and medium-term demands at the very least (long-term projections can help at this stage too). This is critical to ensure that you do not have to stop to redesign and redeploy some of your infrastructure. DNS is a typical example of this.
Using a Capacity Plan, you can also begin to put together a financial forecast and begin to understand the Total Cost of Ownership (TCO). The TCO can vary greatly, so it’s a good idea to contrast and compare your container platform options to make the most cost-effective decision.
If you are not familiar with Agile Delivery Practice, or ready to adopt a DevOps culture, stop here!
To make the platform cost effective, you will need to understand how to make the most of what you have, and where and how savings can be made. This will involve driving out old processes, so if you are not ready to adopt the culture, or are resistant to change, you will not easily realise the potential cost savings and optimisation that the platform can offer.
Kubernetes enables the speedy commissioning of code into a live production environment. With the fast turnaround of development efforts and low operational overhead, you will make the most effective use of its capability. Without this culture shift, you will find yourselves using Kubernetes in much the same way as you managed your previous infrastructure and will never realise any cost saving.
Do not use Kubernetes if you do not have trained engineers who can manage the platform. Tinkering with manifests can result in undesired consequences. Knowing what you are doing here is key to any successful outcome.
When to use Kubernetes
If your application can be containerised, and you have addressed concerns outlined in the above section - "When NOT to use Kubernetes" - then read on. Bear in mind that this was by no means an exhaustive list; depending on your specific use case, there may well be other reasons.
When starting out with Kubernetes it is important to understand the capabilities, as each will require some forethought which can help to shape your configuration. Kubernetes provides you with:
- Service discovery and load balancing: Kubernetes can expose a container using the DNS name or using an IP address. If traffic to a container is high, Kubernetes can load balance and distribute the network traffic so that the deployment is stable.
- Storage orchestration: Kubernetes allows you to automatically mount a storage system of your choice, such as local storage, public cloud providers and more.
- Automated rollouts and rollbacks: You can describe the desired state for your deployed containers using Kubernetes, and it can change the actual state to the desired state at a controlled rate. For example, you can automate Kubernetes to create new containers for your deployment, remove existing containers and adopt all their resources to the new container.
- Automatic bin packing: You provide Kubernetes with a cluster of nodes that it can use to run containerised tasks. Declaring to Kubernetes how much CPU and memory (RAM) each container needs enables the Scheduler to dynamically assess and fit containers onto your nodes to make the best use of your resources.
- Self-healing: Kubernetes restarts containers that fail, replaces containers, kills containers that do not respond to your user-defined health check, and does not advertise them to clients until they are ready to serve.
- Secret and configuration management: Kubernetes lets you store and manage sensitive information, such as passwords, OAuth tokens and SSH keys. You can deploy and update secrets and application configuration without rebuilding your container images, and without exposing secrets in your stack configuration.
Of course, this all depends entirely on your implementation. Not all Kubernetes platforms are equal and unless you know what to look for, you may select a Kubernetes platform that cannot support secure secrets.
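To make the bin-packing idea more concrete, here is a minimal sketch of fitting declared CPU and memory requests onto nodes. It is a simplified first-fit-decreasing heuristic written for illustration only, not the algorithm the Kubernetes Scheduler actually uses (the real Scheduler filters and scores nodes on many more criteria). The `Node` class, the `place` function and all figures are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical worker node with its remaining free capacity."""
    name: str
    cpu_m: int      # free CPU in millicores
    mem_mib: int    # free memory in MiB
    pods: list = field(default_factory=list)


def place(requests, nodes):
    """Assign each (name, cpu_m, mem_mib) request to the first node that fits.

    Returns the names of the requests that could not be placed anywhere.
    """
    unplaced = []
    # Consider the largest requests first so they are not squeezed out
    # by many small ones (first-fit decreasing).
    ordered = sorted(requests, key=lambda r: (r[1], r[2]), reverse=True)
    for name, cpu_m, mem_mib in ordered:
        for node in nodes:
            if node.cpu_m >= cpu_m and node.mem_mib >= mem_mib:
                node.cpu_m -= cpu_m
                node.mem_mib -= mem_mib
                node.pods.append(name)
                break
        else:
            # No node had enough free CPU and memory for this request.
            unplaced.append(name)
    return unplaced
```

The sketch only works because every workload declares what it needs. Without declared requests there is nothing to pack against, which is why sizing your containers deserves forethought before you onboard.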
One of the most powerful features of Kubernetes is its ability to orchestrate the workload using declarative configuration. However, to fully utilise this capability your application must adhere to a few rules based loosely on the 12-factor-app checklist:
- Your containerised application must be able to start quickly, minimising the rolling upgrade time. Databases and other slow-starting services that rely on strongly consistent storage are not recommended for running on Kubernetes. Similarly, monolithic applications are a poor fit, as they usually have a very large codebase and many dependencies, which means bloated container images that are slow to pull and start.
- Your containerised application must have readiness and liveness HTTP/TCP endpoints for Kubernetes to probe. These allow Kubernetes to determine whether the service is healthy and whether a restart is required to restore it. When requests to the liveness endpoint fail repeatedly (up to a configurable failure threshold), Kubernetes restarts the container. The purpose of the readiness endpoint is to ensure that incoming requests are only routed to this container when the application is ready. The readiness probe is also useful for an application reload; for example, if a feature switch has been turned on, the application is alive but needs a bit of time without any requests being routed to it so that it can reload the feature-switch configuration from the database.
- A start-up probe should be configured for a slow-starting application. The aptly named start-up probe holds off both the readiness and the liveness probes until the application has started. The container will be killed if the start-up probe has still not succeeded after the allocated time.
- Your containerised application should support graceful shutdowns when SIGTERM is received. It should complete any ongoing requests before terminating. If the application doesn’t handle SIGTERM and is still running when the termination grace period (30 seconds by default) expires, Kubernetes will forcefully terminate it with SIGKILL.
- The application should be stateless. A stateless application is one that doesn’t persist application or session-related data locally. Any state should be managed externally, ideally using a PaaS offering such as a queue, a messaging service or a database. Local data persistence should be treated as a "code smell" and tackled by refactoring it to be external. There are exceptions to this rule; for example, databases. Running databases on Kubernetes is possible, but they need to handle the fact that pods are designed to be mortal. This means that the databases need to gracefully handle shard failure, replication failure and failover.
- Rely on environment variables for your application configuration; Kubernetes facilitates this with ConfigMaps and Secrets. This increases security and makes environment separation a breeze.
- Stream your application logs to stdout so you don’t have to deal with log files and stale file handles. Make sure your observability platform supports the aggregation of these logs. We find that Fluentd is prevalent in this space.
If you have applications that are somewhere close to this, you are just about ready to onboard.
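Several items on the checklist above can be illustrated together. The following is a minimal sketch, using only the Python standard library, of a service that takes its configuration from environment variables, logs to stdout, exposes liveness and readiness endpoints, and drains on SIGTERM. The endpoint paths (`/healthz`, `/ready`) and the variable names (`APP_PORT`, `LOG_LEVEL`) are assumptions made for this example; Kubernetes does not mandate them, and you would point your probe configuration at whatever paths you choose.

```python
import logging
import os
import signal
import sys
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Configuration comes from environment variables, which a ConfigMap or
# Secret can populate. APP_PORT and LOG_LEVEL are hypothetical names.
PORT = int(os.environ.get("APP_PORT", "8080"))
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")

# Log to stdout rather than to files inside the container.
logging.basicConfig(stream=sys.stdout, level=LOG_LEVEL)
log = logging.getLogger("app")

ready = threading.Event()          # set once the app can serve traffic
shutting_down = threading.Event()  # set when SIGTERM arrives


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":    # liveness: the process is responsive
            self._reply(200, "alive")
        elif self.path == "/ready":    # readiness: safe to route traffic here
            ok = ready.is_set() and not shutting_down.is_set()
            self._reply(200 if ok else 503, "ready" if ok else "not ready")
        else:
            self._reply(404, "not found")

    def _reply(self, status, text):
        body = text.encode()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        # Send access logs through the stdout logger instead of stderr.
        log.debug(fmt, *args)


def handle_sigterm(signum, frame):
    """Stop reporting ready; main() then shuts the server down."""
    log.info("SIGTERM received, draining")
    shutting_down.set()


def main():
    server = ThreadingHTTPServer(("0.0.0.0", PORT), ProbeHandler)
    # Non-daemon handler threads let server_close() wait for in-flight requests.
    server.daemon_threads = False
    signal.signal(signal.SIGTERM, handle_sigterm)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    ready.set()
    log.info("listening on port %d", PORT)
    shutting_down.wait()     # block until SIGTERM arrives
    server.shutdown()        # stop accepting new connections
    server.server_close()    # wait for in-flight requests, then release the socket
    log.info("shutdown complete")
```

Calling `main()` starts the service. Keeping liveness and readiness separate is deliberate: during a drain or a configuration reload the application can report "not ready" so that traffic is routed elsewhere, while still reporting "alive" so that it is not restarted.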
How many clusters do I need?
There is no simple answer to this as it depends on cost, what your organisation model looks like, and how many different teams and applications and services you have.
One thing to consider with a single-cluster-multiple-apps approach is that you will have to start thinking about namespaces, Role-Based Access Control (RBAC) and network segregation, as well as the operational cost this may add due to the additional tooling and policy definition. All of this sounds great, of course, but it is only achievable if you know what you are doing with it.
Going back to my earlier point about avoiding the use of Kubernetes unless you have trained engineers who can manage the platform - in addition to engineers who can manage BAU operations, you will also need Architects and Security professionals to assist you with your infrastructure. You should adopt a compliance framework and use tooling such as Airwalk Airview in order to mitigate any risks attributed to unsolicited or malicious configuration changes.
Introducing guardrails as early as possible is key to a successful outcome. Starting an undertaking like this without them can, and almost always does, introduce risks that cannot be confidently remediated without a complete rebuild from scratch.
You have a capacity plan, you understand the financials, and you understand how to containerise your applications. But choosing a Kubernetes platform is not an easy decision. You might like to read another of our blogs - Selecting a Cloud Service Provider - before taking your next step.
Airwalk Reply has designed and deployed Kubernetes infrastructure, controls and pipelines for some of the world’s largest organisations. Contact us for more information, advice or assistance.