OpenShift Analyze a Node Recipe

From a browser

  1. Administrator > Compute > Nodes > $NODENAME > Terminal

From the command line

  1. Ensure you're logged in with oc (for example, oc whoami should print your user name)
  2. Find the relevant node(s):
    $ oc get nodes -o wide
    NAME  STATUS  ROLES          AGE   VERSION          INTERNAL-IP  EXTERNAL-IP  OS-IMAGE  KERNEL-VERSION              CONTAINER-RUNTIME
    ...1  Ready   master,worker  105d  v1.18.3+2fbd7c7  ...1         ...4         Red Hat   3.10.0-1160.2.2.el7.x86_64  cri-o://1.18.3-19.rhaos4.5.git9264b4f.el7
    ...2  Ready   master,worker  105d  v1.18.3+2fbd7c7  ...2         ...2         Red Hat   3.10.0-1160.2.2.el7.x86_64  cri-o://1.18.3-19.rhaos4.5.git9264b4f.el7
    ...4  Ready   master,worker  105d  v1.18.3+2fbd7c7  ...4         ...0         Red Hat   3.10.0-1160.2.2.el7.x86_64  cri-o://1.18.3-19.rhaos4.5.git9264b4f.el7
  3. If needed, review node resource usage:
    $ oc adm top nodes
    NAME   CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
    ...1   592m         3%     8344Mi          29%       
    ...2   958m         6%     8675Mi          30%       
    ...4   1139m        7%     9523Mi          33%
  4. Remote into a node:
    $ oc debug node/$NODENAME -t
  5. The debug pod runs in its own container on the node. Usually you want to act as if you're logged in to the node itself, so chroot into the node's filesystem, which is mounted at /host:
    sh-4.2# chroot /host
    sh-4.2# whoami
    root
  6. If you need utilities that aren't installed on the node, package them into an image and run that image on the node. The image stream belongs to a particular project, so -n must be specified. In this use case you probably do not want to run chroot, because the relevant binaries are in the container rather than on the node. For example:
    $ oc debug node/$NODENAME -t --image=image-registry.openshift-image-registry.svc:5000/$PROJECT/$IMAGE -n $PROJECT
    sh-5.0# gcore
    usage:  gcore [-a] [-o prefix] pid1 [pid2...pidN]
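
Step 3 can be narrowed to only the busy nodes. The following is a minimal sketch, not part of the original recipe: busy_nodes is a hypothetical helper name, and the default threshold of 80 is an arbitrary assumption. It reads oc adm top nodes output on standard input and prints the name of every node whose CPU% or MEMORY% is at or above the threshold.

```shell
# Hypothetical helper (not an oc subcommand): list nodes at or above a usage threshold.
# Input:  `oc adm top nodes` output on stdin (header line, then one line per node).
# Output: the NAME of each node whose CPU% or MEMORY% is >= the threshold.
busy_nodes() {
  threshold="${1:-80}"   # assumed default; pick whatever suits your cluster
  awk -v t="$threshold" 'NR > 1 {
    cpu = $3; mem = $5                     # columns: NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
    sub(/%/, "", cpu); sub(/%/, "", mem)   # strip the percent sign before comparing
    if (cpu + 0 >= t || mem + 0 >= t) print $1
  }'
}

# Usage (requires a logged-in oc session):
#   oc adm top nodes | busy_nodes 30
```

With the sample output from step 3, a threshold of 30 would select nodes ...2 and ...4, because their memory usage is 30% and 33%.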

Map container PID to node PID

  1. Either run chroot /host first (in which case you lose access to your debug pod's binaries), or add --root /host/run/runc immediately after runc in each command below (for example, runc --root /host/run/runc list).
  2. Find the PIDs that might be interesting. For example, if we're searching for a Java PID:
    # pgrep -f java
    73138
    77182
    84890
    100958
  3. Use runc list to find the container IDs for those PIDs that are in containers. For example:
    # for pid in $(pgrep -f java); do runc list | awk -v pid="$pid" '$2 == pid {print $1}'; done
    76d7cbc64b8411fc04390c940fe14c797d4a996a00a56d1014312a7aa7b6d260
    1a4a983095d84776603b8d77ff625ace91bb492db12fbee42666145d05324dee
    ef77108b440d2ca5631a3781f0bb440e904d86f818457381c0fb820d8f6aa3fd
  4. Use runc state to get details about a container. For example (some output removed):
    # runc state 76d7cbc64b8411fc04390c940fe14c797d4a996a00a56d1014312a7aa7b6d260
    {
      "id": "76d7cbc64b8411fc04390c940fe14c797d4a996a00a56d1014312a7aa7b6d260",
      "pid": 73138,
      "status": "running",
      "bundle": "/var/data/crioruntimestorage/overlay-containers/76d7cbc64b8411fc04390c940fe14c797d4a996a00a56d1014312a7aa7b6d260/userdata",
      "rootfs": "/var/data/criorootstorage/overlay/a83cefbb7952694e724af131870657c6b13043f9fc847b7c4757457224a308da/merged",
      "created": "2021-02-15T20:36:59.080374202Z",
      "annotations": {
         "io.container.manager": "cri-o",
         "io.kubernetes.container.hash": "6e7830a0",
         "io.kubernetes.container.name": "...",
         "io.kubernetes.container.ports": "[{\"containerPort\":9443,\"protocol\":\"TCP\"},{\"containerPort\":9080,\"protocol\":\"TCP\"}]",
         "io.kubernetes.container.restartCount": "0",
         "io.kubernetes.cri-o.Image": "...",
         "io.kubernetes.cri-o.Name": "k8s_...-7d57d6599f-5qq7z_..._722d428b-d0ac-4b1e-b4f5-503b1b76c1e4_0",
         "io.kubernetes.pod.name": "...-7d57d6599f-5qq7z",
         "io.kubernetes.pod.namespace": "...",
      },
    }
  5. If needed, go to the rootfs directory for access to the container's filesystem. For example:
    # head /var/data/criorootstorage/overlay/a83cefbb7952694e724af131870657c6b13043f9fc847b7c4757457224a308da/merged/logs/messages.log
    ********************************************************************************
    product = Open Liberty 21.0.0.1 (wlp-1.0.48.cl210120210113-1459)
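
Steps 3 through 5 can be chained together. The following is a minimal sketch under stated assumptions, not part of the original recipe: pids_to_containers and state_field are hypothetical helper names, runc list is assumed to print the container ID in the first column and the PID in the second, and runc state is assumed to print one "key": "value" pair per line as in the sample output above. Both helpers only read text from standard input, so they work with either chroot /host or runc --root /host/run/runc.

```shell
# Hypothetical helper: map PIDs to container IDs.
# Input:  `runc list` output on stdin; the PIDs of interest as arguments.
# Output: "PID CONTAINER_ID" for each argument that equals a container's PID column.
pids_to_containers() {
  awk -v pids="$*" '
    BEGIN { n = split(pids, a, " "); for (i = 1; i <= n; i++) want[a[i]] = 1 }
    NR > 1 && ($2 in want) { print $2, $1 }   # skip the header; compare the whole PID field
  '
}

# Hypothetical helper: print the string value of a top-level key such as rootfs or bundle.
# Input:  `runc state` JSON on stdin, assumed to have one "key": "value" pair per line.
# Numeric values such as pid are not quoted and are therefore not matched.
state_field() {
  sed -n 's/^[[:space:]]*"'"$1"'":[[:space:]]*"\(.*\)",\{0,1\}[[:space:]]*$/\1/p' | head -n 1
}

# Usage (on the node, after chroot /host):
#   runc list | pids_to_containers $(pgrep -f java)
#   runc state $CONTAINERID | state_field rootfs
```

PIDs that do not appear in the runc list PID column (for example, a Java process running directly on the node) produce no output line.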