Kubernetes – StatefulSets

The purpose of StatefulSets is to provide some consistency and predictability to application deployments with stateful data. Thus far, we have deployed applications to the cluster while defining only loose requirements for resources such as compute and storage. The cluster has scheduled our workload on any node that can meet these requirements. While we can use some of these constraints to deploy in a more predictable manner, it would be helpful to have a construct built to provide this consistency.

StatefulSets were known as PetSets while in alpha (versions 1.3 and 1.4), moved to beta under their current name in version 1.5, and reached GA in version 1.9 with the apps/v1 API.

This is where StatefulSets come in. StatefulSets provide us first with numbered and reliable naming for both network access and storage claims. The pods themselves are named with the following convention, where N runs from 0 to one less than the number of replicas:

"Name of Set"-N

This means that a StatefulSet called db with three replicas will create the following pods:

db-0
db-1
db-2

This gives Kubernetes a way to associate network names and PersistentVolumes with specific pods. It also serves to order the creation and termination of pods. Pods are started in order from the lowest ordinal to the highest and terminated in the reverse order.
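
To make the naming and ordering concrete, here is a minimal shell sketch that prints the pod names a set would get, first in start order and then in termination order. The function names statefulset_pods and statefulset_pods_reverse are hypothetical helpers written for this illustration; they only format strings and do not talk to a cluster.

```shell
# Hypothetical helper: print the pod names for a StatefulSet in start order.
# Usage: statefulset_pods NAME REPLICAS
statefulset_pods() {
  name=$1
  replicas=$2
  i=0
  while [ "$i" -lt "$replicas" ]; do
    printf '%s-%s\n' "$name" "$i"
    i=$((i + 1))
  done
}

# Hypothetical helper: print the same names in termination order
# (highest ordinal first).
# Usage: statefulset_pods_reverse NAME REPLICAS
statefulset_pods_reverse() {
  name=$1
  i=$(($2 - 1))
  while [ "$i" -ge 0 ]; do
    printf '%s-%s\n' "$name" "$i"
    i=$((i - 1))
  done
}

statefulset_pods db 3          # start order
statefulset_pods_reverse db 3  # termination order
```

For a set called db with three replicas, the first call lists db-0 through db-2 and the second lists them from db-2 back down to db-0.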

A stateful example

Let’s take a look at an example of a stateful application. First, we will want to create and use a StorageClass, as we discussed earlier. This will allow us to hook into the Google Cloud Persistent Disk provisioner. The Kubernetes community is building provisioners for a variety of StorageClasses, including GCP and AWS. Each provisioner has its own set of parameters available. Both the GCP and AWS provisioners let you choose the type of disk (solid-state, standard, and so on) as well as the zone, which needs to match the zone of the pod attaching to the disk. AWS additionally allows you to specify encryption parameters as well as IOPS for provisioned IOPS volumes. There are a number of other provisioners in the works, including Azure and a variety of non-cloud options. Save the following listing as solidstate-sc.yaml:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: solidstate
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  zone: us-central1-b

Use the following command with the preceding listing to create a StorageClass backed by SSD persistent disks in us-central1-b:

$ kubectl create -f solidstate-sc.yaml

Next, we will create a StatefulSet with our trusty httpwhalesay demo. While this application does not include any real state, we can see the storage claims and explore the communication path, as shown in the listing sayhey-statefulset.yaml:

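apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: whaleset
spec:
  serviceName: sayhey-svc
  replicas: 3
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: sayhey
  template:
    metadata:
      labels:
        app: sayhey
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: sayhey
        image: jonbaier/httpwhalesay:0.2
        command: ["node", "index.js", "Whale it up!."]
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
      annotations:
        volume.beta.kubernetes.io/storage-class: solidstate
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi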

Use the following command to start the creation of this StatefulSet. If you observe pod creation closely, you will see it create whaleset-0, whaleset-1, and whaleset-2 in succession:

$ kubectl create -f sayhey-statefulset.yaml

Immediately after this, we can see our StatefulSet and the corresponding pods using the familiar get subcommand:

$ kubectl get statefulsets
$ kubectl get pods

These commands should produce output similar to the following images:

StatefulSet listing

The get pods output will show the following:

Pods created by StatefulSet

Depending on your timing, some pods may still be in the process of being created. As you can see in the preceding screenshot, the third pod is still being spun up.

We can also see the volumes the set has created and claimed for each pod. First are the PersistentVolumes themselves:

$ kubectl get pv

The preceding command should show three PersistentVolumes, each bound to a claim named www-whaleset-N. Notice that the size is 1Gi and the access mode is ReadWriteOnce (RWO), just as we defined in the volumeClaimTemplates section of our StatefulSet:

The PersistentVolumes listing

Next, we can look at the PersistentVolumeClaim that reserves the volumes for each pod:

$ kubectl get pvc

The following is the output of the preceding command:

The PersistentVolumeClaim listing

You’ll notice many of the same settings here as with the PersistentVolumes themselves. You might also notice that each claim name (which also appears in the CLAIM column of the previous listing) looks like www-whaleset-N. www is the volume claim template name we specified in the preceding YAML definition, and it is also the name used by the volume mount. It is joined with the pod name to form the PersistentVolumeClaim name. This gives us one more way to ensure that the proper disk is linked with its matching pod.
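
As a small illustration of how the claim names line up with the pods, the following shell sketch builds the expected claim name for each ordinal. The pvc_name function is a hypothetical helper for this example only; it simply joins the claim template name, the set name, and the ordinal with hyphens.

```shell
# Hypothetical helper: build the claim name for one pod in a StatefulSet.
# Usage: pvc_name TEMPLATE_NAME STATEFULSET_NAME ORDINAL
pvc_name() {
  printf '%s-%s-%s\n' "$1" "$2" "$3"
}

# Print the claim names expected for the three whaleset pods.
for ordinal in 0 1 2; do
  pvc_name www whaleset "$ordinal"
done
```

The names printed here can be compared against the NAME column of kubectl get pvc to confirm that each pod has its own claim.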

Another area where this alignment is important is in network communication. StatefulSets also provide consistent naming here. Before we can see this in action, let’s create a service, saved as sayhey-svc.yaml, so that we have a common entry point for incoming requests:

apiVersion: v1
kind: Service
metadata:
  name: sayhey-svc
  labels:
    app: sayhey
spec:
  ports:
  - port: 80
    name: web
  # clusterIP: None makes this a headless service
  clusterIP: None
  selector:
    app: sayhey

$ kubectl create -f sayhey-svc.yaml

Now, let’s open a shell in one of the pods and see if we can communicate with another in the set:

$ kubectl exec -it whaleset-0 -- bash

The preceding command gives us a bash shell in the first whaleset pod. We can now use the service name to make a simple HTTP request. We can use both the short name, sayhey-svc, and the fully qualified name, sayhey-svc.default.svc.cluster.local:

$ curl sayhey-svc
$ curl sayhey-svc.default.svc.cluster.local

You’ll see an output similar to the following screenshot. The service endpoint acts as a common communication point for all three pods:

HTTP whalesay curl output (whaleset-0 Pod)

Now, let’s see if we can communicate with a specific pod in the StatefulSet. As we noticed earlier, the StatefulSet named the pods in an orderly manner. It also gives them hostnames in a similar fashion so that there is a specific DNS entry for each pod in the set. Again, we will see the convention of "Name of Set"-N, followed by the fully qualified service name. The following example shows this for whaleset-1, which is the second pod in our set:

$ curl whaleset-1.sayhey-svc.default.svc.cluster.local

Running this command from our existing Bash shell in whaleset-0 will show us the output from whaleset-1:

HTTP whalesay curl output (whaleset-1 Pod)
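
The DNS names used above follow a fixed pattern, which the following shell sketch spells out. The helpers service_fqdn and pod_fqdn are hypothetical names introduced for this illustration, and the sketch assumes the cluster uses the default cluster.local domain; adjust CLUSTER_DOMAIN if your cluster is configured differently.

```shell
# Assumption: the cluster uses the default DNS domain.
CLUSTER_DOMAIN=cluster.local

# Hypothetical helper: fully qualified name of a service.
# Usage: service_fqdn SERVICE NAMESPACE
service_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "$CLUSTER_DOMAIN"
}

# Hypothetical helper: fully qualified name of one pod in a StatefulSet
# that is governed by a headless service.
# Usage: pod_fqdn STATEFULSET ORDINAL SERVICE NAMESPACE
pod_fqdn() {
  printf '%s-%s.%s\n' "$1" "$2" "$(service_fqdn "$3" "$4")"
}

service_fqdn sayhey-svc default
pod_fqdn whaleset 1 sayhey-svc default
```

From the shell inside whaleset-0, the second name is the one we passed to curl to reach whaleset-1 directly.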

You can now leave this shell by typing exit.

For learning purposes, it may also be instructive to describe some of the items from this section in more detail. For example, kubectl describe svc sayhey-svc will show us all three pod IP addresses in the service endpoints.
