Kubernetes – Core constructs

Now, let’s dive a little deeper and explore some of the core abstractions Kubernetes provides. These abstractions will make it easier to think about our applications and ease the burden of life cycle management, high availability, and scheduling.

Pods

Pods allow you to keep related containers close in terms of the network and hardware infrastructure. Data can live near the application, so processing can be done without incurring a high latency from network traversal. Similarly, common data can be stored on volumes that are shared between a number of containers. Pods essentially allow you to logically group containers and pieces of our application stacks together.

While pods may run one or more containers inside, the pod itself may be one of many that is running on a Kubernetes node (minion). As we’ll see, pods give us a logical group of containers across which we can then replicate, schedule, and balance service endpoints.
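
To see what sharing data between containers in a single pod might look like, here is a minimal sketch, with the pod name, container names, and images chosen purely for illustration (they are not used elsewhere in this chapter), of two containers mounting the same emptyDir volume:

apiVersion: v1
kind: Pod
metadata:
  name: shared-data-pod
spec:
  volumes:
  - name: shared-data
    emptyDir: {}
  containers:
  - name: writer
    image: busybox
    # writes a file into the shared volume, then stays alive
    command: ["sh", "-c", "echo hello > /data/index.html && sleep 3600"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  - name: reader
    image: busybox
    # can read /data/index.html written by the other container
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: shared-data
      mountPath: /data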

Pod example

Let’s take a quick look at a pod in action. We’ll spin up a simple web server on the cluster. You’ll need a GCE cluster running for this; if you don’t already have one started, refer to the Our first cluster section in Chapter 1, Introduction to Kubernetes.

Now, let’s make a directory for our definitions. In this example, I’ll create a book-examples folder under our home directory:

$ mkdir book-examples
$ cd book-examples
$ mkdir 02_example
$ cd 02_example
You can download the example code files from your account at http://www.packtpub.com for all of the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files emailed directly to you.

Use your favorite editor to create the following file and name it nodejs-pod.yaml:

apiVersion: v1 
kind: Pod 
metadata: 
  name: node-js-pod 
spec: 
  containers: 
  - name: node-js-pod 
    image: bitnami/apache:latest 
    ports: 
    - containerPort: 80

This file defines a pod named node-js-pod running the latest bitnami/apache container on port 80. We can create it using the following command:

$ kubectl create -f nodejs-pod.yaml
pod "node-js-pod" created

This gives us a pod running the specified container. We can see more information on the pod by running the following command:

$ kubectl describe pods/node-js-pod

You’ll see a good deal of information, such as the pod’s status, IP address, and even relevant log events. You’ll note that the pod IP address is a private IP address, so we cannot access it directly from our local machine. Not to worry, as the kubectl exec command mirrors Docker’s exec functionality. You can get the pod IP address in a number of ways; a simple get of the pod with a template output will look up the IP address in the pod’s status:

$ kubectl get pod node-js-pod --template={{.status.podIP}}

You can use that IP address directly, or substitute the command in backticks so that its output is passed to curl when we exec into the pod. Once the pod shows it’s in a running state, we can use this feature to run a command inside it:

$ kubectl exec node-js-pod -- curl <private ip address>

--or--

$ kubectl exec node-js-pod -- curl `kubectl get pod node-js-pod --template={{.status.podIP}}`
By default, this runs a command in the first container it finds, but you can select a specific one using the -c argument.
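
Our pod runs only a single container, so -c is optional here, but as a quick sketch (reusing the container name from our definition file), selecting the container explicitly looks like this:

$ kubectl exec node-js-pod -c node-js-pod -- uname -a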

After running the curl command, you should see some HTML code. We’ll have a prettier view later in this chapter, but for now, we can see that our pod is indeed running as expected.

If you have experience with containers, you’ve probably also exec’d into a container. You can do something very similar with Kubernetes:

master $ kubectl exec -it node-js-pod -- /bin/bash
root@node-js-pod:/opt/bitnami/apache/htdocs# exit
master $ 

You can also run other commands directly in the container with the exec command. Note that you’ll need to use two dashes to separate your command’s arguments in case they clash with arguments that kubectl itself recognizes:

$ kubectl exec node-js-pod ls / 
$ kubectl exec node-js-pod ps aux
$ kubectl exec node-js-pod -- uname -a

Labels

Labels give us another level of categorization, which becomes very helpful in terms of everyday operations and management. Similar to tags, labels can be used as the basis of service discovery as well as a useful grouping tool for day-to-day operations and management tasks. Labels are attached to Kubernetes objects and are simple key-value pairs. You will see them on pods, replication controllers, replica sets, services, and so on. Labels themselves and the keys/values inside of them are based on a constrained set of variables, so that queries against them can be evaluated efficiently using optimized algorithms and data structures.

The label indicates to Kubernetes which resources to work with for a variety of operations. Think of it as a filtering option. Labels are meant to be meaningful and usable to operators and application developers, but they do not imply any semantics to the cluster itself. Labels are used for the organization and selection of subsets of objects, and can be added to objects at creation time and/or modified at any time during cluster operations. Labels are also leveraged for management purposes: for example, when you want to know all of the backing containers for a particular service, you can normally get them via the labels on those containers, which correspond to the service at hand. With this type of management, you often end up with multiple labels on an object.
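
As a quick illustration using the node-js-pod we created earlier (the label keys and values here are purely illustrative), labels can be attached after creation and then used to filter listings:

$ kubectl label pod node-js-pod environment=dev tier=web
$ kubectl get pods -l environment=dev
$ kubectl get pods --show-labels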

Kubernetes cluster management is often a cross-cutting operation, involving scaling up of different resources and services, management of multiple storage devices and dozens of nodes and is therefore a highly multi-dimensional operation.

Labels allow horizontal, vertical, and diagonal encapsulation of Kubernetes objects. You’ll often see labels such as the following:

  • environment: dev, environment: integration, environment: staging, environment: UAT, environment: production
  • tier: web, tier: stateless, tier: stateful, tier: protected
  • tenancy: org1, tenancy: org2

Once you’ve mastered labels, you can use selectors to identify a group of objects based on a particular combination of labels. There are currently equality-based and set-based selectors. Equality-based selectors let operators filter by key/value pairs; in order to be selected, an object must match all specified constraints. This kind of selector is often used to choose a particular node, perhaps to run against particularly speedy storage. Set-based selectors are more expressive, and allow the operator to filter a key against a set of values (for example, in, notin, or exists). This kind of selector is often used to determine where an object belongs, such as a tier, tenancy zone, or environment.

In short, an object may have many labels attached to it, but a selector can provide uniqueness to an object or set of objects.
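
To make the two selector styles concrete, here is a sketch of an equality-based query and a set-based query against the illustrative labels listed previously (assuming pods carrying those labels exist in your cluster):

$ kubectl get pods -l environment=production,tier=web
$ kubectl get pods -l 'environment in (staging,UAT),tier notin (protected)'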

We will take a look at labels in more depth later in this chapter, but first we will explore the remaining three constructs: services, replication controllers, and replica sets.

The container’s afterlife

As Werner Vogels, CTO of AWS, famously said, everything fails all the time; containers and pods can and will crash, become corrupted, or maybe even just get accidentally shut off by a clumsy administrator poking around on one of the nodes. Strong policy and security practices such as enforcing least privilege curtail some of these incidents, but involuntary workload slaughter happens and is simply a fact of operations.

Luckily, Kubernetes provides two very valuable constructs to keep this somber affair all tidied up behind the curtains. Services and replication controllers/replica sets give us the ability to keep our applications running with little interruption and graceful recovery.

Services

Services allow us to abstract access away from the consumers of our applications. Using a reliable endpoint, users and other programs can access pods running on your cluster seamlessly. This stands in direct contrast to the ephemeral nature of one of our core Kubernetes constructs: pods.

Pods by definition are ephemeral and when they die they are not resurrected. If we trust that replication controllers will do their job to create and destroy pods as necessary, we’ll need another construct to create a logical separation and policy for access.

Here we have services, which use a label selector to target an ever-changing group of pods. Services are important because we want frontend services that don’t care about the specifics of backend services, and vice versa. While the pods that compose those tiers are fungible, the service (backed by replication controllers that keep those pods running) presents a stable interface and therefore decouples the different tiers of an application.

For applications that require an IP address, there’s a Virtual IP (VIP) available which can send round robin traffic to a backend pod. With cloud-native applications or microservices, Kubernetes provides the Endpoints API for simple communication between services.

K8s achieves this by making sure that every node in the cluster runs a proxy named kube-proxy. As the name suggests, the job of kube-proxy is to proxy communication from a service endpoint back to the corresponding pod that is running the actual application:

The kube-proxy architecture

Membership of the service load balancing pool is determined by the use of selectors and labels. Pods with matching labels are added to the list of candidates where the service forwards traffic. A virtual IP address and port are used as the entry points for the service, and the traffic is then forwarded to a random pod on a target port defined by either K8s or your definition file.

Updates to service definitions are monitored and coordinated from the K8s cluster Master and propagated to the kube-proxy daemons running on each node.

At the moment, kube-proxy is running on the node host itself. There are plans to containerize this and the kubelet by default in the future.

A service is a RESTful object, which relies on a POST transaction to the apiserver to create a new instance of the Kubernetes object. Here’s an example of a simple service definition in a file named service-example.yaml:

kind: Service
apiVersion: v1
metadata:
  name: gsw-k8s-3-service
spec:
  selector:
    app: gswk8sApp
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080

This creates a service named gsw-k8s-3-service, which listens on port 80 and forwards traffic to target port 8080 on pods carrying the key/value label app: gswk8sApp. While the selector is continuously evaluated by a controller, the result of the IP address assignment (also called the cluster IP) will be posted to the endpoints object of gsw-k8s-3-service. The kind field is required, as is ports, while selector and type are optional.
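
To try this out, a simple sketch of the workflow is to create the service from the file and then inspect the cluster IP and the endpoints object it maintains (the endpoints list will stay empty until pods labeled app: gswk8sApp are actually running):

$ kubectl create -f service-example.yaml
$ kubectl get service gsw-k8s-3-service
$ kubectl get endpoints gsw-k8s-3-service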

Aside from the strategy outlined previously, kube-proxy can implement the virtual IP for services in several other ways. There are three different proxy modes that we’ll mention here, but will investigate in later chapters:

  • userspace
  • iptables
  • IPVS

Replication controllers and replica sets

Replication controllers have been deprecated in favor of Deployments, which configure ReplicaSets. This method is a more robust manner of application replication, and was developed in response to feedback from the community running containers. We’ll explore Deployments, Jobs, ReplicaSets, DaemonSets, and StatefulSets further in Chapter 4, Implementing Reliable Container-Native Applications. The following information is left here for reference.

Replication controllers (RCs), as the name suggests, manage the number of pod replicas (and the container images they include) running across the cluster’s nodes. They ensure that the specified number of copies of an image is running at all times. RCs ensure that a pod, or a set of identical pods, is always up and available to serve application traffic.

As you start to operationalize your containers and pods, you’ll need a way to roll out updates, scale the number of copies running (both up and down), or simply ensure that at least one instance of your stack is always running. RCs create a high-level mechanism to make sure that things are operating correctly across the entire application and cluster. Pods created by RCs are replaced if they fail, and are deleted when terminated. RCs are recommended for use even if you only have a single pod in your application.

RCs are simply charged with ensuring that you have the desired scale for your application. You define the number of pod replicas you want running and give it a template for how to create new pods. Just like services, we’ll use selectors and labels to define a pod’s membership in an RC.
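
As a sketch of how those pieces fit together (the name and label are illustrative, and we reuse the bitnami/apache image from earlier), a minimal RC definition declares the desired replica count, a selector, and a pod template:

apiVersion: v1
kind: ReplicationController
metadata:
  name: node-js-rc
spec:
  replicas: 3
  selector:
    app: node-js
  template:
    metadata:
      labels:
        app: node-js
    spec:
      containers:
      - name: node-js
        image: bitnami/apache:latest
        ports:
        - containerPort: 80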

Not every workload requires the strict behavior of the replication controller, which is ideal for long-running processes. For short-lived workloads, job controllers can be used instead; they allow jobs to run to a completion state and are well suited for batch work.
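
For contrast, here is a minimal, illustrative Job sketch (the name, image, and command are assumptions made for the example) that runs its pod to completion rather than keeping it alive:

apiVersion: batch/v1
kind: Job
metadata:
  name: one-off-job
spec:
  template:
    spec:
      containers:
      - name: worker
        # a stand-in for real batch work
        image: busybox
        command: ["sh", "-c", "echo processing batch && sleep 5"]
      restartPolicy: Never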

Replica sets are a new type, currently in beta, that represent an improved version of replication controllers. Currently, the main difference consists of being able to use the new set-based label selectors, as we will see in the following examples.
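
As a sketch of that difference (the names are again illustrative; the apps/v1 apiVersion shown here is what current clusters use now that replica sets have graduated from beta), a ReplicaSet selector can include set-based matchExpressions, which a replication controller cannot:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: node-js-rs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: node-js
    matchExpressions:
    - {key: environment, operator: In, values: [dev, staging]}
  template:
    metadata:
      labels:
        app: node-js
        environment: dev
    spec:
      containers:
      - name: node-js
        image: bitnami/apache:latest
        ports:
        - containerPort: 80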
