How to Love Kubernetes and Not Wreck The Planet Part I: Elasticity and Utilization
Earlier this year, I had the enormous honour of delivering a keynote at KubeCon Europe. I wanted to start a conversation about how our community should be thinking about climate change and what actions we should be taking.
The starting point is … we have a problem. The earth is getting warmer. It’s already a degree warmer than it was in pre-industrial times, and it seems likely to go up at least half a degree or a degree more. A couple of degrees warmer doesn’t sound too terrible — we’d barely notice that if we stepped outside. It’s easy to imagine a pleasant future of balmy summers and extra helpings of ice cream.
The reality is likely to be much more uncomfortable. There will be droughts. There will be floods. Some islands might disappear back into the ocean. We will see fiercer hurricanes and more fires.
Climate change can feel a bit abstract — something that happens somewhere else, at some point in the future. In fact, we’re already feeling the effects now, and they’re going to get worse. I was struck by the immediacy of climate change when my IBM Garage team worked with a startup called The Climate Service. The Climate Service helps businesses quantify climate risk. We worked with them to update their application to allow it to scale up to meet growing demand (with Kubernetes as a platform, naturally). As we migrated to the new platform, we noticed what seemed to be a bug with the flood risk graph for Tokyo; the line zoomed to the top of the chart by 2030, and then flatlined on the upper axis. We assumed we had an error in the new database or logic — but we didn’t. It turns out that in parts of Tokyo, the situation really is that bad, that soon.
But what does it have to do with us? It turns out the tech industry is a significant contributor to climate change. Let’s compare it to flying. Aviation is responsible for around 2.5% of worldwide energy usage. Data centres account for 1–3%. Even if that is (currently) a little less than aviation, it’s the same order of magnitude. With Kubernetes’ rapid rise in popularity, it’s a safe bet that at least some of that 1–3% is on Kubernetes-hosted workloads.
The problem Kubernetes solves is running things in containers. Once you have more than one container or interesting networking, you’re going to need an orchestrator. That’s where Kubernetes comes in; the Kubernetes platform provides support for configuring, connecting, and isolating container workloads. A Kubernetes cluster allows an arbitrary number of containers to run nicely together, so in principle, no one ever needs more than one cluster. In practice, that’s massively, emphatically, wildly not-true. It’s so not-true that sometimes, the cluster ends up as the unit of deployment, rather than the container. Many organisations find they have a lot of clusters, a situation known as kubesprawl. Metrics aren’t easy to find, but I was able to get them for my employer; IBM Cloud has a nice managed Kubernetes service, and at the time of writing each account had 21 active clusters. That’s a lot of clusters.
Utilization and elasticity
Why does this matter? It all has to do with utilization and elasticity. Utilization is how much of a system’s processing capacity is actually used; it’s a measure of efficiency. Elasticity helps you achieve high utilization; if a system can scale up and down in response to demand, utilization can be kept high.
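To make the link between elasticity and utilization concrete, here is a minimal sketch. The demand figures and the 25% headroom are made-up assumptions, not measurements; the point is only that capacity which follows demand wastes less than capacity sized for the peak.

```python
import math

def utilization(used, capacity):
    """Fraction of provisioned capacity that is actually used."""
    return sum(used) / sum(capacity)

# Hypothetical demand over eight time slots, in CPU cores.
demand = [2, 2, 3, 8, 10, 4, 2, 2]

# Fixed capacity: provisioned for the peak, all the time.
fixed = [max(demand)] * len(demand)

# Elastic capacity: follows demand, with an assumed 25% headroom.
elastic = [math.ceil(d * 1.25) for d in demand]

fixed_utilization = utilization(demand, fixed)
elastic_utilization = utilization(demand, elastic)
print(f"fixed: {fixed_utilization:.0%}, elastic: {elastic_utilization:.0%}")
```

With these invented numbers, the fixed system spends most of its time idle, while the elastic one keeps most of what it provisions busy.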
Kubernetes is great for elasticity at the container level; the number of instances of a container can be scaled up or down using the horizontal auto-scaler, and the ‘size’ of containers can be adjusted using the vertical auto-scaler. If you don’t want to use the auto-scaler, changing the number of instances is a one-character change in a YAML file, followed by kubectl apply.
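The core of the horizontal auto-scaler’s decision, as described in the Kubernetes documentation, is a simple ratio: scale the current replica count by how far the observed metric is from its target. The sketch below is a simplification; the function name and default bounds are illustrative, and it leaves out the tolerance band and stabilization behaviour that the real auto-scaler applies.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Simplified horizontal scaling rule: ceil(current * observed / target),
    clamped to the configured bounds. The defaults here are assumptions."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))

# Four replicas running hot (90% CPU against a 60% target) scale up...
print(desired_replicas(4, 90, 60))
# ...and four nearly idle replicas (20% CPU) scale down.
print(desired_replicas(4, 20, 60))
```

The same rule that adds instances under load also removes them when demand falls, which is what keeps container-level utilization high.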
However, scaling an application down, but keeping the cluster the same size, may not free as many resources as hoped. Both the worker nodes and the control plane will continue to use resources.
Can we just make the cluster smaller? Yes, but we’re limited. Worker nodes are tied to physical (or virtualised) hardware and have to be provisioned in advance. Changing the number of worker nodes or their capacity manually is possible, but it’s not trivial. Cluster auto-scaling is also possible, but it needs a plugin, only really makes sense in a public cloud or a sophisticated private cloud with provisioning APIs, takes time, and is more inclined to scale up than down. (No one wants an eager auto-scaler to inadvertently starve their app of resources and impact users.)
Even if it gets scaled right down, having a whole Kubernetes cluster dedicated to running a small number of small containers isn’t ideal. Kubernetes itself has a runtime overhead (it’s doing a lot of clever things!). Large clusters do need larger control planes, but even tiny clusters have a minimum resource requirement for the control plane; if the cluster is very very small, a large proportion of the overall compute resources will be dedicated to running the control plane, rather than the application. That’s not especially efficient.
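A back-of-the-envelope calculation shows why tiny clusters are inefficient. Every number in this sketch is hypothetical (the control plane floor, the workload size, the number of teams), and it assumes a shared cluster can still run on the minimum control plane, which will not hold indefinitely as clusters grow.

```python
def control_plane_share(control_plane_cores, worker_cores):
    """Fraction of a cluster's cores consumed by the control plane."""
    return control_plane_cores / (control_plane_cores + worker_cores)

# Hypothetical figures, chosen only to illustrate the shape of the problem.
MIN_CONTROL_PLANE_CORES = 6  # assumed minimum control plane footprint
SMALL_WORKLOAD_CORES = 4     # assumed workload of one small team
TEAMS = 5

# One small team in its own cluster.
small_share = control_plane_share(MIN_CONTROL_PLANE_CORES, SMALL_WORKLOAD_CORES)

# Five teams with a cluster each, versus the same workloads in one shared cluster.
separate_total = TEAMS * (MIN_CONTROL_PLANE_CORES + SMALL_WORKLOAD_CORES)
shared_total = MIN_CONTROL_PLANE_CORES + TEAMS * SMALL_WORKLOAD_CORES
shared_share = control_plane_share(MIN_CONTROL_PLANE_CORES,
                                   TEAMS * SMALL_WORKLOAD_CORES)

print(f"own cluster: {small_share:.0%} of cores go to the control plane")
print(f"shared cluster: {shared_share:.0%} of cores go to the control plane")
print(f"total cores: {separate_total} separate vs {shared_total} shared")
```

Under these assumptions, the separate clusters spend more cores on control planes than on applications, and sharing roughly halves the total footprint.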
What about serverless? Serverless computing can be super-elastic, because the number of application instances can drop to zero, and only scales up when needed. Kubernetes has a nice serverless capability, Knative. However, if Knative is being run in a dedicated cluster, scaling the number of application container instances to zero may not reduce energy consumption all that much; the enclosing cluster itself is still consuming resources.
Managed Knative is a better bet — or, more precisely, Knative running in someone else’s cluster. That allows resources to be shared across a larger pool, taking full advantage of the elasticity of serverless. However, this isn’t how most containers are run.
Why are there so many clusters?
What’s behind cluster fragmentation? It’s not malice — no one deliberately sets out to consume extra resources and warm the planet! There are several contributing factors. The first issue is a business one; getting costs to flow to the right place is easier if the cluster topology mirrors the org chart. This is Conway’s law, but for infrastructure.
The second issue is a technical one; for good engineering reasons, we want our workloads to be isolated, and namespace isolation is not enough. Namespace isolation is a great start — but some things do end up shared across a whole cluster.
For example, earlier this year I had issues with an interaction between my Tekton pipeline and my Prometheus logging. When builds ran, my whole cluster slowed to a crawl. That was a minor inconvenience because it was a staging cluster being used by a single team; if other teams had also been relying on the cluster, we would have been on the receiving end of some irritated Slack messages. The performance problem was solvable (by updating the logging config), but it sure was annoying to all the users of the cluster until someone got around to fixing it. For a more generic solution, we could have used namespace resource quotas to cap resource utilization, but they’re not enabled out of the box.
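As a rough illustration of what a namespace resource quota looks like, the sketch below builds a ResourceQuota manifest and prints it as JSON, which kubectl apply accepts as well as YAML. The namespace name and the limits are invented for the example; sensible values depend entirely on the cluster and the workload.

```python
import json

def resource_quota(namespace, cpu_requests, memory_requests,
                   cpu_limits, memory_limits):
    """Build a ResourceQuota manifest capping the total CPU and memory
    that pods in one namespace may request and consume."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {
            "hard": {
                "requests.cpu": cpu_requests,
                "requests.memory": memory_requests,
                "limits.cpu": cpu_limits,
                "limits.memory": memory_limits,
            }
        },
    }

# "ci-builds" and the figures below are hypothetical.
manifest = resource_quota("ci-builds", "4", "8Gi", "8", "16Gi")
print(json.dumps(manifest, indent=2))
```

Saving the output to a file and applying it with kubectl apply -f would cap what a noisy build namespace can take from its neighbours. One caveat: once a compute quota is in place, pods in that namespace generally need to declare their own requests and limits, or they may be rejected.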
Name collisions and scope errors can be even harder to fix. Although Kubernetes resources can be scoped to the namespace level, custom resource definitions (CRDs) are always global. IBM has been working to containerise its middleware stack. One of the challenges along the way has been ensuring all the middleware can coexist in the same cluster. Within a large product portfolio like IBM’s, com.ibm.myresource is not a well-qualified name! Name clashes amongst third-party dependencies further complicate the picture. Without careful planning, different CRDs may end up with the same fully qualified name, which causes interestingly non-deterministic effects at runtime. Avoiding clashes required lots of testing, as well as internal coordination to pre-empt any overlapping name choices.
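Some of that coordination can be automated. A CRD’s name takes the form plural.group, so a pre-flight check can look for two products claiming the same combination before they ever meet in a cluster. This is only a sketch of the idea; the product names and API groups are hypothetical, and it is not the process IBM used.

```python
from collections import defaultdict

def crd_name(plural, group):
    """A CRD's metadata.name is its plural resource name plus its API group."""
    return f"{plural}.{group}"

def find_clashes(definitions):
    """Return the CRD names claimed by more than one owner."""
    owners = defaultdict(list)
    for owner, plural, group in definitions:
        owners[crd_name(plural, group)].append(owner)
    return {name: who for name, who in owners.items() if len(who) > 1}

# Hypothetical products and API groups.
definitions = [
    ("product-a", "myresources", "example.com"),
    ("product-b", "myresources", "example.com"),
    ("product-c", "myresources", "product-c.example.com"),
]

clashes = find_clashes(definitions)
print(clashes)
```

Here product-a and product-b collide, while product-c avoids the problem by qualifying its API group with the product name.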
And then there’s security. Kubernetes is not secure by default (although OpenShift helps a bit by setting stricter defaults). It’s a well-established (and thoroughly sensible) best practice to isolate production workloads in their own cluster. However, that doesn’t mean everything needs to be isolated. Consider co-locating several lower environments in the same cluster so they can pool resources.
Multi-tenancy is a hard problem, but it can yield big cost and climate benefits. If you’re helping build the Kubernetes ecosystem, design with multi-tenancy and elasticity in mind. If you’re a platform user, be a bit brave (except in prod!) and aim for multi-tenant systems. Look at the auto-scaling and elasticity options available to you, and take advantage of them. Monitor your utilization, and re-arrange things if some of your clusters are consistently under-utilized.
So what happens if you get it exactly right? Your clusters are the optimum size, neither too big nor too small, your utilization is high, and your elasticity is good. Is this a climate win? Well … not necessarily! To see why high utilization may not be enough, see Part II, Revenge of The Zombies.