Kubernetes Pod Readiness Monitoring

Alexandru Ersenie
Categories: Tech

Lately I have been busy implementing some curl-based monitors that retrieve readiness information about the pods belonging to a deployment in Kubernetes.

You can look at this from two points of view:

  1. Is my deployment ready?
  2. Do I still have the required quorum of pods in ready state?

While the first question can be translated into a boolean condition (is my service alive or not?), the second one is more interesting.

Any monitoring tool you might use will eventually perform checks and interpret the results either as a boolean (true, false) or against a threshold. Today I’ll look at the threshold perspective and try to answer the second question: do I have enough pods in ready state for a deployment?

Query Kubernetes API

There is no better way to get the information than directly from the source. By source I mean the Kubernetes API, which is very detailed and delivers information in today’s most popular format, JSON.

So I’ll be using two favorite tools of mine, curl and jq. For those of you who have not worked with jq yet, wake up… you’ve missed a lot!

jq is basically a very lightweight command-line JSON processor. Read more about it at https://stedolan.github.io/jq/

Get information on pods belonging to a namespace

Authentication

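In order to communicate with the Kubernetes API you’ll need either access through kubectl or an authentication token provided by your cluster administrator. I’ll be using the latter in this post. My token is stored in a file called kube.token, and I export it into a variable (named HEADER here, although it holds only the token):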

	export HEADER=$(cat kube.token)
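A slightly more defensive way to load the token could look like the sketch below. It assumes the token lives in kube.token as above; the helper name load_token is hypothetical and only illustrates the idea of failing early, so that later curl calls never send an empty Bearer token.

```shell
# Hypothetical helper (not part of the original script): read a bearer token
# from a file and fail early if the file is missing or empty.
load_token() {
  local file="$1"
  if [ ! -r "$file" ]; then
    echo "token file '$file' is not readable" >&2
    return 1
  fi
  # Strip newlines and carriage returns that would break the HTTP header.
  local token
  token=$(tr -d '\r\n' < "$file")
  if [ -z "$token" ]; then
    echo "token file '$file' is empty" >&2
    return 1
  fi
  printf '%s' "$token"
}

# Usage, keeping the variable name used in this post:
#   HEADER=$(load_token kube.token) || exit 3
#   export HEADER
```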

Get pod information using curl

Once the token is exported, we can pass it in the Authorization header of the curl request:

curl -sS --insecure --header "Authorization: Bearer $HEADER" https://myserver:6443/api/v1/namespaces/test/pods
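Since every request in this post repeats the same options, they can be wrapped once. The following is only a sketch: the helper name k8s_get, the variables K8S_SERVER and K8S_CACERT, and the ca.crt path are assumptions for illustration. It also swaps --insecure for --cacert, which only works if you have the cluster CA certificate at hand.

```shell
# Assumed settings (placeholders, adjust to your cluster):
K8S_SERVER="${K8S_SERVER:-https://myserver:6443}"   # API server used in this post
K8S_CACERT="${K8S_CACERT:-ca.crt}"                  # hypothetical path to the cluster CA

# Hypothetical helper: GET an API path with the bearer token from $HEADER.
#   --fail       exit non-zero on HTTP errors (401, 403, 404, ...)
#   --cacert     verify the server certificate instead of using --insecure
#   --max-time   give up after 10 seconds so a monitor never hangs
k8s_get() {
  curl --silent --show-error --fail --max-time 10 \
       --cacert "$K8S_CACERT" \
       --header "Authorization: Bearer $HEADER" \
       "${K8S_SERVER}$1"
}

# Usage:
#   k8s_get /api/v1/namespaces/test/pods
```

With --fail, an expired token or a wrong namespace shows up as a non-zero exit code instead of an error document being piped into jq.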

Depending on the number of pods in your namespace, you will get a relatively large JSON body back. In order to retrieve only the information for the pods belonging to a deployment, I will use the labeling mechanism in Kubernetes. Let’s see all values of the app.kubernetes.io/name label in my namespace:

ns=test;curl  -s -k  --header "Authorization: Bearer $HEADER"  https://myserver:6443/api/v1/namespaces/${ns}/pods | jq '.items[] ' | grep -e 'app.kubernetes.io/name' | sort | uniq

This will return the following labels:
"app.kubernetes.io/name": "frontend",
"app.kubernetes.io/name": "backend",
"app.kubernetes.io/name": "database",

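The same list can be produced with jq alone, without grep, sort and uniq. This is a minimal sketch; the function name list_app_names is hypothetical, and it expects the JSON pod list on stdin.

```shell
# Hypothetical helper: read a pod list (JSON) on stdin and print each distinct
# value of the app.kubernetes.io/name label, one per line.
# Pods without that label are skipped; unique sorts and de-duplicates.
list_app_names() {
  jq -r '[.items[].metadata.labels["app.kubernetes.io/name"] | select(. != null)]
         | unique
         | .[]'
}

# Usage:
#   curl -s -k --header "Authorization: Bearer $HEADER" \
#     https://myserver:6443/api/v1/namespaces/test/pods | list_app_names
```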
We can see there are three labels I can use. Let’s use the frontend label and keep only the pods that carry it. I will use jq’s select function for filtering:

ns=test;curl -s -k --header "Authorization: Bearer $HEADER" https://myserver:6443/api/v1/namespaces/${ns}/pods | jq -r '.items[] | select(.metadata.labels."app.kubernetes.io/name"=="frontend")'
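The filter can also be applied on the server side, because the Kubernetes list endpoints accept a labelSelector query parameter. A minimal sketch, with a hypothetical helper name pods_by_app and the same server and token variable as above:

```shell
# Hypothetical helper: list the pods of a namespace that carry a given
# app.kubernetes.io/name label, letting the API server do the filtering.
# --get makes curl append the --data-urlencode value to the URL as a query string.
pods_by_app() {
  local ns="$1" app="$2"
  curl -s -k --get \
       --header "Authorization: Bearer $HEADER" \
       --data-urlencode "labelSelector=app.kubernetes.io/name=${app}" \
       "https://myserver:6443/api/v1/namespaces/${ns}/pods"
}

# Usage:
#   pods_by_app test frontend | jq -r '.items[]'
```

Either way, what comes back is the full pod object for every matching pod.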

This will return a lot of information about each matching pod, including:

Pod Metadata

This will return information such as name, namespace, selfLink for querying the pod directly, creation timestamp, labels, owner references and so on.

ns=test;curl -s -k --header "Authorization: Bearer $HEADER" https://myserver:6443/api/v1/namespaces/${ns}/pods | jq -r '.items[] | select(.metadata.labels."app.kubernetes.io/name"=="frontend") | .metadata'

A typical response will look like this:

{
  "name": "frontend-7ddcb6647f-8xvcc",
  "generateName": "frontend-7ddcb6647f-",
  "namespace": "test",
  "selfLink": "/api/v1/namespaces/test/pods/frontend-7ddcb6647f-8xvcc",
  "uid": "3e62ee83-0449-11e9-b3fd-005056aa17f3",
  "resourceVersion": "65539833",
  "creationTimestamp": "2018-12-20T11:20:23Z",
  "labels": {
    "app.kubernetes.io/part-of": "mystack",
    "pod-template-hash": "3887622039",
    "app.kubernetes.io/component": "frontend",
    "app.kubernetes.io/managed-by": "Tiller",
    "app.kubernetes.io/name": "frontend"
  },
  "ownerReferences": [
    {
      "apiVersion": "extensions/v1beta1",
      "kind": "ReplicaSet",
      "name": "frontend-7ddcb6647f",
      "uid": "3bd9a084-febc-11e8-b3fd-005056aa17f3",
      "controller": true,
      "blockOwnerDeletion": true
    }
  ]
}
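For monitoring you rarely need the whole metadata block. As a small sketch (the function name pod_summary is hypothetical), the following prints one tab-separated line per pod with its name, creation timestamp and the name of the owning ReplicaSet:

```shell
# Hypothetical helper: read a pod list (JSON) on stdin and print
# name, creation timestamp and first owner reference as tab-separated columns.
# Pods without an owner reference get "-" in the last column.
pod_summary() {
  jq -r '.items[]
         | [ .metadata.name,
             .metadata.creationTimestamp,
             (.metadata.ownerReferences[0].name // "-") ]
         | @tsv'
}

# Usage:
#   curl -s -k --header "Authorization: Bearer $HEADER" \
#     https://myserver:6443/api/v1/namespaces/test/pods | pod_summary
```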

Pod specification

This will return information about restart policies, readiness probes, liveness probes, volumes, secrets, containers… pretty much everything about pod behavior and pod resources.

ns=test;curl -s -k --header "Authorization: Bearer $HEADER" https://myserver:6443/api/v1/namespaces/${ns}/pods | jq -r '.items[] | select(.metadata.labels."app.kubernetes.io/name"=="frontend") | .spec'
{
  "volumes": [
    {
      "name": "logging-volume",
      "configMap": {
        "name": "logging.properties",
        "items": [
          {
            "key": "logging.properties",
            "path": "logging.properties"
          }
        ],
        "defaultMode": 420
      }
    },
    {
      "name": "applications-volume",
      "emptyDir": {}
    },
    {
      "name": "generated-volume",
      "emptyDir": {}
    },
    {
      "name": "sessions-volume",
      "emptyDir": {}
    },
    {
      "name": "backup-volume",
      "emptyDir": {}
    },
    {
      "name": "garbage-collection-log",
      "emptyDir": {}
    },
    {
      "name": "lib-databases-volume",
      "emptyDir": {}
    },
    {
      "name": "logs",
      "emptyDir": {}
    },
    {
      "name": "tmp",
      "emptyDir": {}
    },
    {
      "name": "wrapper",
      "configMap": {
        "name": "scripts",
        "defaultMode": 484
      }
    },
    {
      "name": "secret-certificates",
      "secret": {
        "secretName": "secret-certificates",
        "defaultMode": 420
      }
    },
    {
      "name": "default-token-qg4md",
      "secret": {
        "secretName": "default-token-qg4md",
        "defaultMode": 420
      }
    }
  ],
  "containers": [
    {
      "name": "sidecar-logger-payara-gc",
      "image": "myregistry:5000/docker-base-centos:7.4.1708-7",
      "command": [
        "/scripts/entrypoint.sh"
      ],
      "resources": {
        "limits": {
          "cpu": "10m",
          "memory": "10Mi"
        },
        "requests": {
          "cpu": "10m",
          "memory": "10Mi"
        }
      },
      "volumeMounts": [
        {
          "name": "garbage-collection-log",
          "mountPath": "/var/log/glassfish/domain1"
        },
        {
          "name": "wrapper",
          "mountPath": "/scripts"
        },
        {
          "name": "default-token-qg4md",
          "readOnly": true,
          "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount"
        }
      ],
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "imagePullPolicy": "IfNotPresent"
    }
  ],
  "restartPolicy": "Always",
  "terminationGracePeriodSeconds": 45,
  "dnsPolicy": "ClusterFirst",
  "serviceAccountName": "default",
  "serviceAccount": "default",
  "nodeName": "node2",
  "securityContext": {},
  "schedulerName": "default-scheduler",
  "tolerations": [
    {
      "key": "node.kubernetes.io/not-ready",
      "operator": "Exists",
      "effect": "NoExecute",
      "tolerationSeconds": 300
    },
    {
      "key": "node.kubernetes.io/unreachable",
      "operator": "Exists",
      "effect": "NoExecute",
      "tolerationSeconds": 300
    }
  ]
}
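Since readiness is driven by readiness probes, it can be handy to see which containers define one at all. A container without a readiness probe is treated as ready as soon as it is running, so it never lowers the ready count. The sketch below uses a hypothetical function name, container_probes, and reads the pod list from stdin:

```shell
# Hypothetical helper: for every container of every pod, print
# pod name, container name, image and whether a readinessProbe is defined.
container_probes() {
  jq -r '.items[]
         | .metadata.name as $pod
         | .spec.containers[]
         | [ $pod, .name, .image, (has("readinessProbe") | tostring) ]
         | @tsv'
}

# Usage:
#   curl -s -k --header "Authorization: Bearer $HEADER" \
#     https://myserver:6443/api/v1/namespaces/test/pods | container_probes
```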

Pod status

This is the most interesting part of all, and it is the one I’ll be using. It gives me information such as:

  • pod ip
  • status for each container in the pod
  • readiness state for the pod
  • readiness state for each container in the pod

So basically I can get the state not only of the pod, but of its individual containers as well, which allows detailed monitoring and state information.

ns=test;curl -s -k --header "Authorization: Bearer $HEADER" https://myserver:6443/api/v1/namespaces/${ns}/pods | jq -r '.items[] | select(.metadata.labels."app.kubernetes.io/name"=="frontend") | .status'
{
  "phase": "Running",
  "conditions": [
    {
      "type": "Initialized",
      "status": "True",
      "lastProbeTime": null,
      "lastTransitionTime": "2018-12-20T11:20:23Z"
    },
    {
      "type": "Ready",
      "status": "True",
      "lastProbeTime": null,
      "lastTransitionTime": "2018-12-20T11:32:58Z"
    },
    {
      "type": "PodScheduled",
      "status": "True",
      "lastProbeTime": null,
      "lastTransitionTime": "2018-12-20T11:20:23Z"
    }
  ],
  "hostIP": "10.49.11.102",
  "podIP": "10.88.171.27",
  "startTime": "2018-12-20T11:20:23Z",
  "containerStatuses": [
    {
      "name": "frontend-container",
      "state": {
        "running": {
          "startedAt": "2018-12-20T11:21:10Z"
        }
      },
      "lastState": {},
      "ready": true,
      "restartCount": 0,
      "image": "myregistry:5000/frontend:3.12-1",
      "imageID": "docker-pullable://myregistry:5000/frontend-9@sha256:653046d4279f979349cf50fd6d38ce19264c1e483290647d74a74c238bc5dcc0",
      "containerID": "docker://9e40d5920b3b322b10e667aad6ef014a45199a46a4a3a14349a677dd4f81b46c"
    },
    {
      "name": "sidecar-logger-payara-gc",
      "state": {
        "running": {
          "startedAt": "2018-12-20T11:21:12Z"
        }
      },
      "lastState": {},
      "ready": true,
      "restartCount": 0,
      "image": "myregistry:5000/docker-base-centos:7.4.1708-7",
      "imageID": "docker-pullable://myregistry:5000/docker-base-centos@sha256:35db6ad1cf8ca4f39be519daa7f4bb7688d986536d646832c6cc09e3bd1efc0d",
      "containerID": "docker://5cca236e67a6be2fbec56640e2a2ff0705a008c68721d861d4b9100d97289471"
    }
  ],
  "qosClass": "Burstable"
}
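The per-container readiness in containerStatuses makes it possible to name the container that keeps a pod from becoming ready. A minimal sketch, with the hypothetical function name not_ready_containers, reading the pod list from stdin:

```shell
# Hypothetical helper: print every container whose "ready" flag is not true,
# together with its pod name and restart count.
# The trailing ? skips pods that have no containerStatuses yet (e.g. pending pods).
not_ready_containers() {
  jq -r '.items[]
         | .metadata.name as $pod
         | .status.containerStatuses[]?
         | select(.ready != true)
         | "\($pod)/\(.name) restarts=\(.restartCount)"'
}

# Usage:
#   curl -s -k --header "Authorization: Bearer $HEADER" \
#     https://myserver:6443/api/v1/namespaces/test/pods | not_ready_containers
```

An empty output means that every reported container is ready.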

Get status of pods based on their label

Now that we know where to find the information, let’s retrieve the status of all pods carrying our chosen label. These are the steps we will perform:

  • Get all pods
  • Keep the ones labeled as frontend
  • Retrieve only the status.conditions entries for these pods
  • Keep only the conditions where type is Ready and status is True. These belong to the pods that are in ready state, ready to accept requests, ready to be taken into the service… just ready
  • Count the remaining lines, one per ready pod

ns=test;curl -s -k --header "Authorization: Bearer $HEADER" https://myserver:6443/api/v1/namespaces/${ns}/pods | jq -r '.items[] | select(.metadata.labels."app.kubernetes.io/name"=="frontend") | .status.conditions[] | select(.type=="Ready" and .status=="True") | .status' | wc -l | awk '{print $1}'
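The counting can also be done inside jq, which saves the wc and awk steps. This is a sketch under the same assumptions as the one-liner above; the function name count_ready_pods is hypothetical, and the label value is passed in with --arg instead of being hard-coded.

```shell
# Hypothetical helper: read a pod list (JSON) on stdin and print how many pods
# with the given app.kubernetes.io/name label have the condition Ready=True.
count_ready_pods() {
  local app="$1"
  jq --arg app "$app" '
    [ .items[]
      | select(.metadata.labels["app.kubernetes.io/name"] == $app)
      | select(any(.status.conditions[]?; .type == "Ready" and .status == "True"))
    ] | length'
}

# Usage:
#   ready=$(curl -s -k --header "Authorization: Bearer $HEADER" \
#     https://myserver:6443/api/v1/namespaces/test/pods | count_ready_pods frontend)
```

One difference worth knowing: if the response is empty or not valid JSON, jq prints no number at all, whereas wc -l prints 0. That makes it possible to tell "no ready pods" apart from "the query failed".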

This will return the number of ready pods. You can now compare it against a threshold and turn the result into an exit code, which a monitoring/alerting tool such as Nagios can use to generate alerts.
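To illustrate the threshold step, here is a minimal sketch that maps a ready count to the exit codes Nagios plugins use (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). It is not the script from my repository; the function name check_ready_pods and the threshold semantics (alert when the count is at or below the limit) are assumptions chosen for the example.

```shell
# Hypothetical helper: compare a ready-pod count against two thresholds and
# print a Nagios-style status line. Returns the plugin exit code:
#   0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
check_ready_pods() {
  local ready="$1" warn="$2" crit="$3"
  # Anything that is not a plain non-negative integer is reported as UNKNOWN.
  case "$ready" in
    ''|*[!0-9]*)
      echo "UNKNOWN - could not determine the number of ready pods"
      return 3 ;;
  esac
  if [ "$ready" -le "$crit" ]; then
    echo "CRITICAL - $ready pods ready (critical at <= $crit)"
    return 2
  elif [ "$ready" -le "$warn" ]; then
    echo "WARNING - $ready pods ready (warning at <= $warn)"
    return 1
  fi
  echo "OK - $ready pods ready"
  return 0
}

# Usage, with thresholds chosen only for illustration:
#   check_ready_pods "$ready" 2 1; exit $?
```

For a deployment with three replicas, the example thresholds would warn when only two pods are ready and go critical when one or none is left.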

Putting it to work: integrate it with Nagios

You can now integrate all of this as a template in Nagios. The full shell script is in my Git repository:

https://github.com/alexandru-ersenie/kubernetes-monitoring

Have fun, leave a message, fork, contribute…
