Kubernetes – Scaling a cluster

All these techniques are great for scaling the application, but what about the cluster itself? At some point, you will pack the nodes full and need more resources to schedule new pods for your workloads.

Autoscaling

When you create your cluster, you can customize the starting number of nodes (minions) with the NUM_MINIONS environment variable. By default, it is set to 4.

Additionally, the Kubernetes team has started to build autoscaling capability into the cluster itself. Currently, this is only supported on GCE and GKE, but work is being done on other providers. This capability utilizes the KUBE_AUTOSCALER_MIN_NODES, KUBE_AUTOSCALER_MAX_NODES, and KUBE_ENABLE_CLUSTER_AUTOSCALER environment variables.

The following example shows how to set the environment variables for autoscaling before running kube-up.sh:

$ export NUM_MINIONS=5
$ export KUBE_AUTOSCALER_MIN_NODES=2
$ export KUBE_AUTOSCALER_MAX_NODES=5
$ export KUBE_ENABLE_CLUSTER_AUTOSCALER=true

Bear in mind that changing these variables after the cluster is started will have no effect; you would need to tear down the cluster and create it once again. For that reason, later in this section we will also look at how to add nodes to an existing cluster without rebuilding it.

Once you start a cluster with these settings, it will automatically scale up and down between the minimum and maximum limits, based on compute resource usage in the cluster.

GKE clusters also support autoscaling at launch when using the alpha features. The preceding example translates to flags such as --enable-autoscaling --min-nodes=2 --max-nodes=5 in a command-line launch.
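
As a sketch, such a launch might look like the following; the cluster name and zone are placeholders, and on older gcloud releases these flags may only be available under the alpha command group:

$ gcloud container clusters create autoscaling-cluster \
    --zone us-central1-b \
    --num-nodes=3 \
    --enable-autoscaling --min-nodes=2 --max-nodes=5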

Scaling up the cluster on GCE

If you wish to scale out an existing cluster, you can do so in a few steps. Manually scaling up your cluster on GCE is actually quite easy: the existing plumbing uses managed instance groups, which allow you to add more machines of a standard configuration to the group via an instance template.

You can see this template easily in the GCE console. First, open the console; by default, this should open your default project console. If you are using another project for your Kubernetes cluster, simply select it from the project drop-down at the top of the page.

On the side panel, look under Compute and then Compute Engine, and select Instance templates. You should see a template titled kubernetes-minion-template. Note that the name could vary slightly if you’ve customized your cluster naming settings. Click on that template to see the details. Refer to the following screenshot:

The GCE Instance template for minions

You’ll see a number of settings, but the meat of the template is under Custom metadata. Here, you will see a number of environment variables and also a startup script that is run after a new machine instance is created. These are the core components that allow us to create new machines and have them automatically added to the available cluster nodes.
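
You can also inspect the same template from the command line with gcloud; the name here is the default one, so substitute yours if it differs:

$ gcloud compute instance-templates describe kubernetes-minion-template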

Because the template for new machines is already created, it is very simple to scale out our cluster in GCE. Once in the Compute section of the console, simply go to Instance groups located right above the Instance templates link on the side panel. Again, you should see a group titled kubernetes-minion-group or something similar. Click on that group to see the details, as shown in the following screenshot:

The GCE instance group for minions

You’ll see a page with a CPU metrics graph and three instances listed here. By default, the cluster creates three nodes. We can modify this group by clicking on the EDIT GROUP button at the top of the page:

The GCE instance group edit page

You should see the kubernetes-minion-template that we reviewed a moment ago selected under Instance template. You’ll also see an Autoscaling setting, which is Off by default, and an instance count of 3. Simply increment this to 4 and click on Save. You’ll be taken back to the group details page, where a pop-up dialog will show the pending changes.

You’ll also see some autohealing properties on the Instance groups edit page. Autohealing recreates failed instances and allows you to set health checks, as well as an initial delay period before an action is taken.
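
The same resize can be scripted rather than clicked through, since managed instance groups can be resized with gcloud. This is a sketch: the group name is the default one shown above, and the zone is a placeholder for your cluster's actual zone:

$ gcloud compute instance-groups managed resize kubernetes-minion-group \
    --size 4 \
    --zone us-central1-b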

In a few minutes, you’ll have a new instance listed on the details page. We can test that this is ready using the get nodes command from the command line:

$ kubectl get nodes

A word of caution on autoscaling and scaling down in general: first, if we repeat the earlier process and decrease the count back down to three, GCE will remove one node. However, it will not necessarily be the node you just added. The good news is that pods will be rescheduled on the remaining nodes, but Kubernetes can only reschedule them where resources are available. If you are close to full capacity and shut down a node, there is a good chance that some pods will not have a place to be rescheduled. In addition, this is not a live migration, so any application state will be lost in the transition. The bottom line is that you should carefully consider the implications before scaling down or implementing an autoscaling scheme.
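
If you need to remove a particular node, one way to soften the blow is to drain it first so that its pods are evicted and rescheduled gracefully before the machine goes away. This is a minimal sketch, assuming a recent enough kubectl; <node-name> is a placeholder for the node you intend to remove:

# drain cordons the node (marks it unschedulable) and then evicts its pods;
# once it completes, the instance can be removed from the group more safely
$ kubectl drain <node-name> --ignore-daemonsets

Note that draining does not preserve in-memory application state, so the caution above still applies.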

For more information on general autoscaling in GCE, refer to https://cloud.google.com/compute/docs/autoscaler/?hl=en_US#scaling_based_on_cpu_utilization.

Scaling up the cluster on AWS

The AWS provider code also makes it very easy to scale up your cluster. Similar to GCE, the AWS setup uses autoscaling groups to create the default four minion nodes. In the future, these autoscaling groups will hopefully be integrated into the Kubernetes cluster autoscaling functionality. For now, we will walk through a manual setup.

This can also be easily modified using the CLI or the web console. In the console, from the EC2 page, simply go to the Auto Scaling Groups section at the bottom of the menu on the left. You should see a name similar to kubernetes-minion-group. Select this group and you will see the details shown in the following screenshot:

Kubernetes minion autoscaling details

We can scale this group up easily by clicking on Edit. Then, change the Desired, Min, and Max values to 5 and click on Save. In a few minutes, you’ll have the fifth node available. You can once again check this using the get nodes command.
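
The equivalent change from the CLI mentioned earlier is a one-liner. This is a sketch using the AWS CLI; confirm your actual group name first, for example with aws autoscaling describe-auto-scaling-groups:

$ aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name kubernetes-minion-group \
    --min-size 5 --max-size 5 --desired-capacity 5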

Scaling down is the same process, but remember the considerations we discussed in the previous Scaling up the cluster on GCE section. Workloads could get abandoned or, at the very least, unexpectedly restarted.

Scaling manually

For other providers, creating new minions may not be an automated process. Depending on your provider, you’ll need to perform various manual steps. It can be helpful to look at the provider-specific scripts in the cluster directory.
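
As a starting point, the per-provider scripts live under the cluster directory of the Kubernetes source tree; the exact layout varies by release:

$ cd kubernetes
$ ls cluster/    # one subdirectory per provider (aws, gce, vagrant, and so on)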
