Kubernetes – Lessons learned from production

Kubernetes has been around long enough now that many companies run it in production. In our day jobs, we’ve seen Kubernetes run in production across a number of industry verticals and in numerous configurations. Let’s explore what folks across the industry are doing when serving customer-facing workloads. At a high level, there are several key areas:

  • Make sure to set limits in your cluster.
  • Use the appropriate workload types for your application.
  • Label everything! Labels are very flexible and can contain a lot of information that can help identify an object, route traffic, or determine placement (see the sketch after this list).
  • Don’t rely on default values; configure things explicitly where it matters.
  • Tune the default settings of the core Kubernetes components.
  • Use load balancers as opposed to exposing services directly on a node’s port.
  • Build your Infrastructure as Code and use provisioning tools such as CloudFormation or Terraform, and configuration tools such as Chef, Ansible, or Puppet.
  • Consider not running stateful workloads in production clusters until you build up expertise in Kubernetes.
  • Investigate higher-function templating languages to maintain the state of your cluster. We’ll explore a few options for an immutable infrastructure in the following chapter.
  • Use RBAC, the principle of least privilege, and separation of concerns wherever possible.
  • Use TLS-enabled communication for all intra-cluster chatter. You can set up TLS and certificate rotation for kubelet communication in your cluster.
  • Until you’re comfortable with managing Kubernetes, build lots of small clusters. It’s more operational overhead, but it will get you into the deep end faster: you’ll see more failures and feel the operator burden sooner.
  • As you get better at Kubernetes, build bigger clusters that use namespaces, network segmentation, and the authorization features to break up your cluster into pieces.
  • Once you’re running a few large clusters, manage them with kubefed.
  • If you can, use your cloud service provider’s built-in high-availability features for Kubernetes. For example, run Regional Clusters on GCP with GKE. This feature spreads your nodes across several availability zones in a region, which provides resilience against a single-zone failure and gives you the conceptual building blocks for zero-downtime upgrades of your master nodes.
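
To make the labeling advice concrete, here’s a minimal sketch of a pod carrying labels and a Service whose selector routes traffic to it. The label keys and values (app, env) are arbitrary choices for illustration:

apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web
    env: production
spec:
  containers:
  - name: web
    image: nginx
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web         # routes traffic to any pod carrying these labels
    env: production
  ports:
  - port: 80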

In the next section, we’ll explore one of these concepts – limits – in more detail.

Setting limits

If you’ve done work with containers before, you will know that one of the first and easiest things to set up for your containers is resource control along the following dimensions:

  • CPU
  • Memory

For each of these, Kubernetes distinguishes between requests (the amount the scheduler reserves for a container) and limits (a hard ceiling the container cannot exceed).

You may be familiar with setting runtime resource limits with Docker’s CLI, which provides flags to limit these items and more:

docker run -it --cpu-rt-runtime=950000 \
 --ulimit rtprio=99 \
 --memory=1024m \
 --cpus=".5" \
 alpine /bin/sh

Here, we’re setting a real-time runtime parameter, creating a ulimit, and setting memory and CPU quotas. The story evolves a bit in Kubernetes, as you can scope these limits to a specific namespace, which allows you to characterize your limits by the domains of your cluster. There are four overarching parameters for working with resource limits in Kubernetes:

spec.containers[].resources.limits.cpu
spec.containers[].resources.requests.cpu
spec.containers[].resources.limits.memory
spec.containers[].resources.requests.memory
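
Here’s a minimal sketch of how these four parameters appear in a pod manifest; the pod name, image, and specific values are arbitrary choices for illustration:

apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: resource-demo
    image: nginx
    resources:
      requests:        # what the scheduler reserves for this container
        cpu: 250m
        memory: 128Mi
      limits:          # the ceiling the container may not exceed
        cpu: 500m
        memory: 256Mi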

Scheduling limits

When you create a pod with a memory request, Kubernetes looks for a node with the right labels and selectors that has enough of the resource types, CPU and memory, that the pod requires. Kubernetes ensures that the sum of the resource requests of the pods scheduled onto a node never exceeds that node’s capacity. This can sometimes result in unexpected outcomes: a node can be full in terms of requested capacity even while its actual utilization is low. The scheduler works from requests rather than observed usage by design, in order to accommodate varying load levels.

You can look through a pod’s events to find out when this has occurred:

$ kubectl describe pod web | grep -C 3 Events
Events:
FirstSeen LastSeen Count From SubobjectPath Reason Message
74s 15s 2 {scheduler } FailedScheduling Failed for reason PodExceedsFreeCPU and possibly others

You can address these issues by removing unneeded pods, ensuring that your pod’s requests don’t exceed what any single available node can offer, or simply adding more resources to the cluster.
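
One way to see how much of a node’s capacity has already been requested is to describe the node (the node name node01 here is an assumption; substitute one of your own nodes):

$ kubectl describe node node01 | grep -A 8 "Allocated resources"

The Allocated resources section shows the total CPU and memory requests and limits on the node as a fraction of what it can schedule.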

Memory limit example

Let’s walk through an example. First, we’ll create a namespace to house our memory limit:

master $ kubectl create namespace low-memory-area
namespace "low-memory-area" created

Once we’ve created the namespace, we can create a file that sets a LimitRange object, which will allow us to enforce a default for memory limits and requests. Create a file called memory-default.yaml with the following contents:

apiVersion: v1
kind: LimitRange
metadata:
 name: mem-limit-range
spec:
 limits:
 - default:
     memory: 512Mi
   defaultRequest:
     memory: 256Mi
   type: Container

And now, we can create it in the namespace:

master $ kubectl create -f memory-default.yaml --namespace=low-memory-area
limitrange "mem-limit-range" created

Let’s create a pod without a memory limit, in the low-memory-area namespace, and see what happens.

Create the following low-memory-pod.yaml file:

apiVersion: v1
kind: Pod
metadata:
 name: low-mem-demo
spec:
 containers:
 - name: low-mem-demo
   image: redis

Then, we can create the pod with this command:

kubectl create -f low-memory-pod.yaml --namespace=low-memory-area
pod "low-mem-demo" created

Let’s see whether our resource constraints were added to the pod’s container configuration, without us having to explicitly specify them in the pod spec. Notice the memory limits in place! We’ve removed some of the informational output for readability:

kubectl get pod low-mem-demo --output=yaml --namespace=low-memory-area

Here’s the output of the preceding code:

apiVersion: v1
kind: Pod
metadata:
 annotations:
   kubernetes.io/limit-ranger: 'LimitRanger plugin set: memory request for container
     low-mem-demo; memory limit for container low-mem-demo'
 creationTimestamp: 2018-09-20T01:41:40Z
 name: low-mem-demo
 namespace: low-memory-area
 resourceVersion: "1132"
 selfLink: /api/v1/namespaces/low-memory-area/pods/low-mem-demo
 uid: 52610141-bc76-11e8-a910-0242ac11006a
spec:
 containers:
 - image: redis
   imagePullPolicy: Always
   name: low-mem-demo
   resources:
     limits:
       memory: 512Mi
     requests:
       memory: 256Mi
   terminationMessagePath: /dev/termination-log
   terminationMessagePolicy: File
   volumeMounts:
   - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
     name: default-token-t6xqm
     readOnly: true
 dnsPolicy: ClusterFirst
 nodeName: node01
 restartPolicy: Always
 schedulerName: default-scheduler
 securityContext: {}
 serviceAccount: default
 serviceAccountName: default
 terminationGracePeriodSeconds: 30
 tolerations:
 - effect: NoExecute
   key: node.kubernetes.io/not-ready
   operator: Exists
   tolerationSeconds: 300
 - effect: NoExecute
   key: node.kubernetes.io/unreachable
   operator: Exists
   tolerationSeconds: 300
 volumes:
 - name: default-token-t6xqm
   secret:
     defaultMode: 420
     secretName: default-token-t6xqm
 hostIP: 172.17.1.21
 phase: Running
 podIP: 10.32.0.3
 qosClass: Burstable
 startTime: 2018-09-20T01:41:40Z

You can delete the pod with the following command:

kubectl delete pod low-mem-demo --namespace=low-memory-area
pod "low-mem-demo" deleted

There are a lot of options for configuring resource limits. If you specify a memory limit for a container but don’t specify a request, Kubernetes sets the request equal to the limit. That will look like the following:

resources:
 limits:
   memory: 4096Mi
 requests:
   memory: 4096Mi

In a cluster with diverse workloads and API-driven relationships, it’s incredibly important to set memory limits on your containers and their corresponding applications in order to prevent misbehaving applications from disrupting your cluster. Services don’t implicitly know about each other, so they’re very susceptible to resource exhaustion if you don’t configure limits correctly.
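
To see a limit enforced, you can run a container that tries to allocate more memory than its limit allows; the kubelet will OOM-kill it and the pod status will show OOMKilled. This is a sketch following the common stress-testing pattern; the polinux/stress image and the specific values are assumptions for illustration:

apiVersion: v1
kind: Pod
metadata:
  name: oom-demo
spec:
  containers:
  - name: oom-demo
    image: polinux/stress
    resources:
      limits:
        memory: 100Mi
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]  # allocates well past the 100Mi limit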

Scheduling CPU constraints

Let’s look at another type of resource management: the constraint. We’ll use the CPU dimension here, and we’ll explore how to set the maximum and minimum values for available resources for a given container and pod in a namespace. There are a number of reasons you might want to constrain CPU on a Kubernetes cluster:

  • If you have a namespaced cluster that has different levels of production and non-production workloads, you may want to specify higher limits for your production workloads. You can allow quad-core CPU consumption for production; pin development, staging, or UAT-type workloads to a single CPU; or stagger them according to environment needs.
  • You can also reject pods that request more CPU resources than your nodes have available. If you’re running a certain type of machine on a cloud service provider, you can ensure that workloads that require X cores aren’t scheduled on machines with fewer than X cores.

CPU constraints example

Let’s go ahead and create another namespace in which to hold our example:

kubectl create namespace cpu-low-area

Now, let’s set up a LimitRange for CPU constraints, which are measured in millicpus. If you request 500m, you’re asking for 500 millicpus, or millicores, which is equivalent to 0.5 in decimal notation. When you request 0.5 or 500m, you’re asking for half of a CPU in whatever form your platform provides it (vCPU, core, hyperthread, or vCore).
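
As a quick sketch of the notation, these two request fragments ask for exactly the same amount of CPU:

resources:
  requests:
    cpu: "0.5"   # half a CPU, decimal form

resources:
  requests:
    cpu: 500m    # the same half CPU, millicpu form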

As we did previously, let’s create a file, called cpu-constraint.yaml, that contains a LimitRange for our CPU constraints:

apiVersion: v1
kind: LimitRange
metadata:
 name: cpu-demo-range
spec:
 limits:
 - max:
     cpu: "500m"
   min:
     cpu: "300m"
   type: Container

Now, we can create the LimitRange:

kubectl create -f cpu-constraint.yaml --namespace=cpu-low-area

Once we create the LimitRange, we can inspect it. What you’ll notice is that both default and defaultRequest are set to the same value as the maximum, because we didn’t specify them; Kubernetes falls back to max:

kubectl get limitrange cpu-demo-range --output=yaml --namespace=cpu-low-area

limits:
- default:
    cpu: 500m
  defaultRequest:
    cpu: 500m
  max:
    cpu: 500m
  min:
    cpu: 300m
  type: Container

This is the intended behavior. When further containers are scheduled in this namespace, Kubernetes first checks to see whether the pod specifies a request and limit. If it doesn’t, the defaults are applied. Next, the controller confirms that the CPU request is at least the lower bound in the LimitRange, 300m. It also checks the upper bound to make sure that the container is not asking for more than 500m.
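
We haven’t actually created a pod in this namespace yet, so let’s add one to inspect. This is a sketch: the redis image and the explicit 300m request are assumptions chosen to match the output that follows, and the 500m limit will be filled in by the LimitRange default. Save it as cpu-demo-pod.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo-range
spec:
  containers:
  - name: cpu-demo-range
    image: redis
    resources:
      requests:
        cpu: 300m   # explicit request; the limit is defaulted by the LimitRange

Then, create it in the namespace:

kubectl create -f cpu-demo-pod.yaml --namespace=cpu-low-area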

Now, you can check the container constraints by looking at the YAML output of the pod:

kubectl get pod cpu-demo-range --output=yaml --namespace=cpu-low-area

resources:
 limits:
   cpu: 500m
 requests:
   cpu: 300m

Now, don’t forget to delete the pod:

kubectl delete pod cpu-demo-range --namespace=cpu-low-area
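
If you want to clean up fully, you can also delete the namespaces we created for these examples, which removes their LimitRange objects along with them:

kubectl delete namespace low-memory-area cpu-low-area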
