Kubernetes – Built-in monitoring


If you recall from Chapter 1, Introduction to Kubernetes, we noted that our nodes were already running a number of monitoring services. We can see these once again by running the get pods command with the kube-system namespace specified as follows:

$ kubectl get pods --namespace=kube-system

The following screenshot is the result of the preceding command:

System pod listing

Again, we see a variety of services, but how does this all fit together? If you recall the node (formerly minions) section from Chapter 2, Building a Foundation with Core Kubernetes Constructs, each node runs a kubelet. The kubelet is the main interface through which nodes interact with and update the API server. One such update is the metrics of the node’s resources. The actual reporting of resource usage is performed by a program named cAdvisor.

The cAdvisor program is another open source project from Google, which provides various metrics on container resource use. Metrics include CPU, memory, and network statistics. There is no need to tell cAdvisor about individual containers; it collects the metrics for all containers on a node and reports this back to the kubelet, which in turn reports to Heapster.

Google’s open source projects: Google has a variety of open source projects related to Kubernetes. Check them out, use them, and even contribute your own code!

Both cAdvisor and Heapster can be found on GitHub:

  • cAdvisor: https://github.com/google/cadvisor
  • Heapster: https://github.com/kubernetes/heapster

Contrib is a catch-all term for a variety of components that are not part of core Kubernetes. It can be found at https://github.com/kubernetes/contrib. LevelDB is a key store library that was used in the creation of InfluxDB. It can be found at https://github.com/google/leveldb.

Heapster is yet another open source project from Google; you may start to see a theme emerging here (see the preceding information box). Heapster runs in a container on one of the minion nodes and aggregates the data from a kubelet. A simple REST interface is provided to query the data.

When using the GCE setup, a few additional packages are set up for us, which saves us time and gives us a complete package to monitor our container workloads. As we can see from the preceding System pod listing screenshot, there is another pod with influx-grafana in the title.

InfluxDB is described on its official website as follows:

An open-source distributed time series database with no external dependencies.

InfluxDB is based on a key store package (refer to the previous Google’s open source projects information box) and is perfect to store and query event- or time-based statistics such as those provided by Heapster.

Finally, we have Grafana, which provides a dashboard and graphing interface for the data stored in InfluxDB. Using Grafana, users can create a custom monitoring dashboard and get immediate visibility into the health of their Kubernetes cluster, and therefore their entire container infrastructure.

Exploring Heapster

Let’s quickly look at the REST interface by running SSH to the node that is running the Heapster pod. First, we can list the pods to find the one that is running Heapster, as follows:

$ kubectl get pods --namespace=kube-system

The name of the pod should start with monitoring-heapster. Run a describe command to see which node it is running on, as follows:

$ kubectl describe pods/<Heapster monitoring Pod> --namespace=kube-system

From the output in the following screenshot, we can see that the pod is running in kubernetes-minion-merd. Also note the IP for the pod, a few lines down, as we will need that in a moment:

Heapster pod details
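If you would rather not read the node and pod IP out of the describe output by eye, kubectl can print just those two fields. The following is a minimal sketch; heapster_location is a hypothetical helper name, and it assumes a kubectl version that supports the jsonpath output format:

```shell
# Print "<node name> <pod IP>" for a pod in the kube-system namespace.
# Usage: heapster_location <Heapster monitoring Pod>
# heapster_location is a hypothetical helper, not part of kubectl.
heapster_location() {
  kubectl get pod "$1" --namespace=kube-system \
    -o jsonpath='{.spec.nodeName} {.status.podIP}'
}
```

The first value is the node to SSH into, and the second is the IP we will need for the REST call.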

Next, we can SSH to this box with the familiar gcloud ssh command, as follows:

$ gcloud compute --project "<Your project ID>" ssh --zone "<your gce zone>" "<kubernetes minion from describe>"

From here, we can access the Heapster REST API directly using the pod’s IP address. Remember that pod IPs are routable not only in the containers but also on the nodes themselves. The Heapster API is listening on port 8082, and we can get a full list of metrics at /api/v1/metric-export-schema/.

Let’s look at the list now by issuing a curl command to the pod IP address we saved from the describe command, as follows:

$ curl -G <Heapster IP from describe>:8082/api/v1/metric-export-schema/
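If we plan to query the API more than once, it can be convenient to build the URL in one place. This is only a sketch: heapster_schema_url is a hypothetical helper, and the IP address in the example is a placeholder for the pod IP from the describe output.

```shell
# Build the metric schema URL for a Heapster pod IP.
# Port 8082 and the path are the ones given in the text above.
heapster_schema_url() {
  printf 'http://%s:8082/api/v1/metric-export-schema/\n' "$1"
}

# Example, run from the node (10.244.1.5 is a placeholder IP):
#   curl -G "$(heapster_schema_url 10.244.1.5)"
```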

We will see a listing that is quite long. The first section shows all the metrics available. The last two sections list fields by which we can filter and group. For your convenience, I’ve added the following tables, which are a little bit easier to read:

  • The number of milliseconds since the container was started
  • The cumulative CPU usage on all cores
  • The CPU limit in millicores
  • Total memory usage
  • Total working set usage; the working set is the memory that is being used, and is not easily dropped by the kernel
  • The memory limit
  • The number of page faults
  • The number of major page faults
  • The cumulative number of bytes received over the network
  • The cumulative number of errors while receiving over the network
  • The cumulative number of bytes sent over the network
  • The cumulative number of errors while sending over the network
  • The total number of bytes consumed on a filesystem
  • The total size of the filesystem in bytes
  • The number of available bytes remaining in the filesystem

Table 6.1. Available Heapster metrics



  • The node name where the container ran
  • The host name where the container ran
  • An identifier specific to a host, which is set by the cloud provider or user
  • The user-defined image name that is run inside the container
  • The user-provided name of the container or full container name for system containers
  • The name of the pod
  • The unique ID of the pod
  • The namespace of the pod
  • The unique ID of the namespace of the pod
  • A comma-separated list of user-provided labels

Table 6.2. Available Heapster fields
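Because the raw schema listing is long, it can help to print only the names it contains. The sketch below assumes that the JSON output labels each metric and field with a "name" key, which is worth confirming against your own output. list_names is a hypothetical helper, and the sample input in the comment is made up for illustration.

```shell
# Read JSON on stdin and print the value of every "name" key, one per line.
# Assumption: entries in the schema look like "name": "<value>".
list_names() {
  grep -o '"name": *"[^"]*"' | sed 's/^"name": *"//; s/"$//'
}

# Example with made-up input:
#   printf '{"metrics":[{"name":"network/rx"}]}' | list_names
```

On the node, the curl output from the earlier step could be piped straight into list_names.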

Customizing our dashboards

Now that we have the fields, we can have some fun. Recall the Grafana page that we looked at in Chapter 1, Introduction to Kubernetes. Let’s pull that up again by going to our cluster’s monitoring URL. Note that you may need to log in with your cluster credentials. The link takes the following format: https://<your master IP>/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
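The monitoring link can be assembled from the master IP with a one-line helper. This is a small sketch that only follows the link format given above; grafana_url is a hypothetical name, and 203.0.113.10 in the example is a placeholder address.

```shell
# Build the Grafana proxy URL for a given master IP, using the
# link format from the text above.
grafana_url() {
  printf 'https://%s/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana\n' "$1"
}

# Example with a placeholder master IP:
grafana_url 203.0.113.10
```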

We’ll see the default Home dashboard. Click on the down arrow next to Home and select Cluster. This shows the Kubernetes cluster dashboard, and now we can add our own statistics to the board. Scroll all the way to the bottom and click on Add a Row. This should create a space for a new row and present a green tab on the left-hand side of the screen.

Let’s start by adding a view into the filesystem usage for each node (minion). Click on the green tab to expand, and then select Add Panel and then Graph. An empty graph should appear on the screen, along with a query panel for our custom graph.

The first field in this panel should show a query that starts with SELECT mean("value") FROM. Click on the A character next to this field to expand it. Leave the first field next to FROM as default and then click on the next field with the select measurement value. A drop-down menu will appear with the Heapster metrics we saw in the previous tables. Select filesystem/usage_bytes_gauge. Now, in the SELECT row, click on mean() and then on the x symbol to remove it. Next, click on the + symbol at the end of the row and choose selectors and then max. Then, you’ll see a GROUP BY row with time($interval) and fill(none). Carefully click on fill and not on the (none) portion, and again on x to remove it.

Then, click on the + symbol at the end of the row and select tag(hostname). Finally, at the bottom of the screen we should see a Group by time interval. Enter 5s there and you should have something similar to the following screenshot:

Heapster pod details
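It may help to see the query that these clicks build up as text. The following sketch prints what such a query would plausibly look like in InfluxDB’s query language; the exact text Grafana generates can differ between versions, and panel_query is a hypothetical helper. $timeFilter and $interval are placeholders that Grafana fills in itself, with $interval driven by the 5s value we entered.

```shell
# Print an InfluxQL-style query: an aggregate of "value" from one
# measurement, grouped by time and by a single tag.
# Arguments: <aggregate> <measurement> <tag>
# $timeFilter and $interval are left as literal text for Grafana to replace.
panel_query() {
  printf 'SELECT %s("value") FROM "%s" WHERE $timeFilter GROUP BY time($interval), "%s"\n' \
    "$1" "$2" "$3"
}

# The graph built above: max filesystem usage per hostname.
panel_query max filesystem/usage_bytes_gauge hostname
```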

Next, let’s click on the Axes tab, so that we can set the units and legend. Under Left Y Axis, click on the field next to Unit and set it to data | bytes and Label to Disk Space Used. Under Right Y Axis, set Unit to none | none. Next, on the Legend tab, make sure to check Show in Options and Max in Values.

Now, let’s quickly go to the General tab and choose a title. In my case, I named mine Filesystem Disk Usage by Node (max).

We don’t want to lose this nice new graph we’ve created, so let’s click on the save icon in the top-right corner. It looks like a floppy disk (you can do a Google image search if you don’t know what this is).

After we click on the save icon, we will see a green dialog box that verifies that the dashboard was saved. We can now click the x symbol above the graph details panel and below the graph itself.

This will return us to the dashboard page. If we scroll all the way down, we will see our new graph. Let’s add another panel to this row. Again, use the green tab and then select Add Panel | singlestat. Once again, an empty panel will appear with a setting form below it.

Let’s say we want to watch a particular node and monitor network usage. We can easily do this by first going to the Metrics tab. Then, expand the query field and set the second value in the FROM field to network/rx. Now, we can specify the WHERE clause by clicking the + symbol at the end of the row and choosing hostname from the drop-down. After hostname =, click on select tag value and choose one of the minion nodes from the list.

Finally, leave mean() for the second SELECT field, as shown in the following screenshot:

Singlestat options
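Written out as text, the singlestat query differs from the graph query mainly in its WHERE clause, which pins the result to one node. This is again only a sketch of what the generated query would plausibly look like; node_query is a hypothetical helper, and the node name in the example is the one from the earlier describe output, so substitute your own.

```shell
# Print an InfluxQL-style query for the mean of "value" from one
# measurement, restricted to a single hostname.
# Arguments: <measurement> <hostname>
# $timeFilter and $interval are left as literal text for Grafana to replace.
node_query() {
  printf 'SELECT mean("value") FROM "%s" WHERE "hostname" = '\''%s'\'' AND $timeFilter GROUP BY time($interval)\n' \
    "$1" "$2"
}

# Bytes received on one node.
node_query network/rx kubernetes-minion-merd
```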

In the Options tab, make sure that Unit format is set to data bytes and check the Show box next to Spark lines. The spark line gives us a quick historical view of the recent variations in the value. We can use Background mode to take up the entire background; by default, it uses the area below the value.

In Coloring, we can optionally check the Value or Background box and choose Thresholds and Colors. This will allow us to choose different colors for the value based on the threshold tier we specify. Note that an unformatted version of the number must be used for threshold values.
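Since thresholds need raw numbers, a quick conversion saves some mental arithmetic. The sketch below assumes binary units (1 MiB = 1024 × 1024 bytes); if your panel formats bytes in decimal units, multiply by 1,000,000 instead. mib_to_bytes is a hypothetical helper, and the 10 and 100 MiB tiers are arbitrary example values.

```shell
# Convert a whole number of mebibytes to a plain byte count.
mib_to_bytes() {
  echo "$(( $1 * 1024 * 1024 ))"
}

# Comma-separated values for two tiers at 10 MiB and 100 MiB.
echo "$(mib_to_bytes 10),$(mib_to_bytes 100)"   # prints 10485760,104857600
```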

Now, let’s go back to the General tab and set the title as Network bytes received (Node35ao). Use the identifier for your minion node.

Once again, let’s save our work and return to the dashboard. We should now have a row that looks like the following screenshot:

Custom dashboard panels

Grafana has a number of other panel types that you can play with, such as Dashboard list, Plugin list, Table, and Text.

As we can see, it is pretty easy to build a custom dashboard and monitor the health of our cluster at a glance.
