Kubernetes
Kubernetes (also known as k8s or kube) is an open-source system for automating deployment, scaling, and management of containerized applications.
Basic terms:
- Pod: Collection of one or more containers running on the same node with shared resources such as storage and IP addresses.
- Deployment: A controller that creates and manages a replicated set of one or more pods from a pod template.
- Service: Wires together pods by exposing deployments to each other. A service is essentially a load balancer/reverse proxy to a set of pods, using a selector and an access policy. A service is normally named service-name.namespace:port and provides a permanent, internal host name for applications to use.
- Operator: Manages application state and exposes interfaces to manage the application.
Architecture
A Kubernetes cluster is a set of nodes. Each node runs the kubelet agent to monitor pods, kube-proxy to maintain network rules, and a container runtime such as Docker, containerd, CRI-O, or any other Container Runtime Interface (CRI)-compliant runtime. Worker nodes run applications and master nodes manage the cluster.
Master Nodes
Master nodes, collectively called the control plane, administer the worker nodes. Each master node runs etcd for a highly-available key-value store of cluster data, cloud-controller-manager to interact with any underlying cloud infrastructure, kube-apiserver to expose APIs for the control plane, kube-scheduler for assigning pods to nodes, and kube-controller-manager to manage controllers (the last three may be called Master Services).
kubectl
kubectl is a command line interface to manage a Kubernetes cluster.
Cluster Context
kubectl may use multiple clusters. The available clusters may be shown with the following command, and the current cluster is denoted with *:
$ kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
default/c103-:30595/IAM#email c103-:30595 IAM#email/c103-:30595 testodo4
* docker-desktop docker-desktop docker-desktop
The API endpoints may be displayed with:
$ kubectl config view -o jsonpath='{"Cluster name\tServer\n"}{range .clusters[*]}{.name}{"\t"}{.cluster.server}{"\n"}{end}'
Cluster name Server
c103-:30595 https://c103-.com:30595
docker-desktop https://kubernetes.docker.internal:6443
Change Cluster Context
$ kubectl config use-context docker-desktop
Switched to context "docker-desktop".
Delete Cluster Context
$ kubectl config delete-context docker-desktop
deleted context docker-desktop from ~/.kube/config
etcd
etcd stores the current and desired states of the cluster, role-based access control (RBAC) rules, application environment information, and non-application user data.
High Availability
Run at least three master nodes for high availability (an odd number, so that etcd can maintain quorum) and size each appropriately.
Objects
Kubernetes Objects represent the intended state of system resources. Controllers act through resources to try to achieve the desired state. The spec property is the desired state and the status property is the object's current status.
Labels
Objects may have metadata key/value pair labels and objects may be grouped by label(s) using selectors.
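For example, a pod might carry labels that a selector later matches (the names here are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-1          # illustrative name
  labels:
    app: myapp         # arbitrary key/value labels
    tier: frontend
spec:
  containers:
  - name: web
    image: nginx
```

A selector such as kubectl get pods -l app=myapp,tier=frontend would then select this pod.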
Resources
Kubernetes Resources are API endpoints that store and control a collection of Kubernetes objects (e.g. pods). Common resources:
- Deployment: Collections of pods.
- ReplicaSet: Ensures that a specified number of replicas of a pod are running, although Deployments (which include a ReplicaSet) are generally used directly instead.
- StatefulSet: A Deployment for stateful applications, with stable pod identities and persistent storage.
- Service: Provides internal network access to a logical set of pods (Deployments or StatefulSets).
- Ingress: Provides external network access to a Service. An Ingress is also called a Route.
- ConfigMap: Non-confidential key-value configuration pairs.
- Secret: Confidential key-value configuration pairs.
- PersistentVolume: Persistent storage.
- StorageClass: Groups storage by different classes of quality-of-service and other characteristics.
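As a sketch of the configuration resources above, a ConfigMap and a Secret might look like this (names and values are illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config       # illustrative name
data:
  LOG_LEVEL: info        # non-confidential configuration
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret       # illustrative name
type: Opaque
stringData:              # plain text here; stored base64-encoded
  DB_PASSWORD: changeme
```

Both may be mounted as files or exposed as environment variables in a pod spec.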
List resource kinds
$ kubectl api-resources
NAME SHORTNAMES APIGROUP NAMESPACED KIND
pods po true Pod
[...]
Namespace
A namespace is a logical isolation unit or "project" used to group objects/resources, policies to restrict users, constraints to enforce quotas through ResourceQuotas, and service accounts to automatically manage resources.
List namespaces
$ kubectl get namespaces --show-labels
NAME STATUS AGE LABELS
default Active 8d <none>
kube-node-lease Active 8d <none>
kube-public Active 8d <none>
kube-system Active 8d <none>
kubernetes-dashboard Active 6d22h <none>
Create namespace
kubectl create namespace testns1
Show current namespace (if any)
kubectl config view --minify | grep namespace
Change current namespace
kubectl config set-context --current --namespace=${NAMESPACE}
Reset to no namespace:
kubectl config set-context --current --namespace=
Nodes
List Nodes
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
docker-desktop Ready master 8d v1.19.7 192.168.65.4 <none> Docker Desktop 5.10.25-linuxkit docker://20.10.5
Controllers
Kubernetes Controllers run a reconciliation loop indefinitely while enabled and continuously attempt to control a set of resources to reach a desired state (e.g. minimum number of pods).
Deployments
Deployments define a collection of one or more pods and configure container templates with a name, image, resources, storage volumes, and health checks, as well as a deployment strategy for how to create/recreate a deployment, and triggers for when to do so.
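A minimal Deployment manifest, using the liberty1 example from this page (the replica count and port are illustrative), might look like:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: liberty1
  namespace: testns1
spec:
  replicas: 2                  # desired number of pods
  selector:
    matchLabels:
      app: liberty1            # must match the template labels
  template:
    metadata:
      labels:
        app: liberty1
    spec:
      containers:
      - name: liberty1
        image: icr.io/appcafe/websphere-liberty
        ports:
        - containerPort: 9080  # Liberty's default HTTP port
```

Apply with kubectl apply -f and the Deployment controller reconciles the running pods toward this spec.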
List Deployments
$ kubectl get deployments --all-namespaces
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system coredns 2/2 2 2 8d
kube-system metrics-server 1/1 1 1 6d22h
kubernetes-dashboard dashboard-metrics-scraper 1/1 1 1 6d22h
kubernetes-dashboard kubernetes-dashboard 1/1 1 1 6d22h
Create Deployment
kubectl create deployment ${DEPLOYMENT} --image=${FROM} --namespace=${NAMESPACE}
For example:
kubectl create deployment liberty1 --image=icr.io/appcafe/websphere-liberty --namespace=testns1
List pods for a Deployment
kubectl get pods -l=app=${DEPLOYMENT} --namespace=${NAMESPACE}
For example:
kubectl get pods -l=app=liberty1 --namespace=testns1
With custom columns:
$ kubectl get pods -l=app=liberty1 -o=custom-columns=NAME:.metadata.name,NAMESPACE:.metadata.namespace,STATUS:.status.phase,NODE:.spec.nodeName,STARTED:.status.startTime --namespace=testns1
NAME NAMESPACE STATUS NODE STARTED
liberty1-585d8dfd6-2vb6c testns1 Running worker2 2022-04-25T18:34:58Z
Delete Deployment
kubectl delete deployment.apps/${DEPLOYMENT} --namespace=${NAMESPACE}
Scale Deployment
kubectl scale deployment ${DEPLOYMENT} --replicas=${PODS} --namespace=${NAMESPACE}
Print logs for all pods in a deployment
kubectl logs "--selector=app=${DEPLOYMENT}" --prefix=true --all-containers=true --namespace=${NAMESPACE}
Pods
Create, run, and remote into a new pod
kubectl run -i --tty fedora --image=fedora -- sh
Operators
Kubernetes Operators are Kubernetes-native applications: controller pods for custom resources (CRs), normally representing a logical application, that interact with the API server to automate actions. Operators are based on a Custom Resource Definition (CRD).
OperatorHub is a public registry of operators.
Operator SDK is one way to build operators.
Operator logs
Find the operator's API resource:
$ kubectl api-resources | awk 'NR==1 || /containerdiagnostic/'
NAME SHORTNAMES APIGROUP NAMESPACED KIND
containerdiagnostics diagnostic.ibm.com true ContainerDiagnostic
Then find the pods for them:
$ kubectl get pods --all-namespaces | awk 'NR==1 || /containerdiag/'
NAMESPACE NAME READY STATUS RESTARTS AGE
containerdiagoperator-system containerdiagoperator-controller-manager-5976b5bb4c-2szb7 2/2 Running 0 19m
Print logs for the manager container:
$ kubectl logs containerdiagoperator-controller-manager-5976b5bb4c-2szb7 --namespace=containerdiagoperator-system --container=manager
[...]
2021-06-23T16:04:56.624Z INFO setup starting manager
Operator Lifecycle Manager
The Operator Lifecycle Manager (OLM) may be used to install and manage operators in a Kubernetes cluster.
List operator catalogs
$ kubectl get catalogsource --all-namespaces
NAMESPACE NAME DISPLAY TYPE PUBLISHER AGE
openshift-marketplace community-operators Community Operators grpc Red Hat 69d
openshift-marketplace certified-operators Certified Operators grpc Red Hat 69d
openshift-marketplace redhat-marketplace Red Hat Marketplace grpc Red Hat 69d
openshift-marketplace redhat-operators Red Hat Operators grpc Red Hat 69d
openshift-marketplace ibm-operator-catalog IBM Operator Catalog grpc IBM 61d
List all operators
$ kubectl get packagemanifest --all-namespaces
NAMESPACE NAME CATALOG AGE
openshift-marketplace ibm-spectrum-scale-csi-operator Community Operators 69d
openshift-marketplace syndesis Community Operators 69d
openshift-marketplace openshift-nfd-operator Community Operators 69d
[...]
Operator Catalogs
The most common operator catalogs are:
- Kubernetes Community Operators: Hosted at https://operatorhub.io/ and submitted via GitHub k8s-operatorhub/community-operators. Must only use API objects supported by the Kubernetes API.
- OpenShift Community Operators: Shown in OpenShift and OKD and submitted via GitHub redhat-openshift-ecosystem/community-operators-prod. May use OCP-specific resources like Routes, ImageStreams, etc. Certified operators are generally built in RHEL or UBI.
CPU and Memory Resource Limits
A container may be configured with CPU and/or memory resource requests and limits. A request is the minimum amount of a resource that is required by (and reserved for) a container and is used to decide if a node has sufficient capacity to start a new container. A limit puts a cap on a container's usage of that resource. If there are sufficient available resources, a container may use more than the requested amount of resource, up to the limit. If only a limit is specified, the request is set equal to the limit.
Therefore, if the request is less than the limit, the system may become overcommitted. For resources such as memory, this may lead to the Linux OOM killer activating and killing processes, with Killed in application logs and kernel: Memory cgroup out of memory: Killed process in node logs (e.g. oc debug node/$NODE -t followed by chroot /host journalctl).
CPU Resources
CPU resources are gauged in terms of a vCPU/core in the cloud, or a CPU hyperthread on bare metal. The m suffix means millicpu (or millicore), so 0.5 (or half) of one CPU is equivalent to 500m (or 500 millicpu). CPU resources may be specified in either form (i.e. 0.5 or 500m), although the general recommendation is to use millicpu. CPU limits are evaluated every quota period per CPU, which defaults to 100ms. For example, a CPU limit of 500m means that a container may use no more than half of 1 CPU in any 100ms period. Values larger than 1000m may be specified if there is more than one CPU. For details, review the Linux kernel CFS bandwidth control documentation.
Many recommend using CPU limits. If containers exhaust node CPU, the kubelet process may become resource starved and cause the node to enter the NotReady state. The throttling metric counts the number of times the CPU limit is exceeded. However, there have been cases of throttling occurring even when the limit is not hit, generally fixed in Linux kernels >= 4.14.154, 4.19.84, and 5.3.9. One solution is to increase CPU requests and limits, although this may reduce density on nodes. Some specify a CPU request but no limit. Review additional OpenShift guidance on overcommit.
Memory Resources
Memory resources are gauged in terms of bytes. The suffixes K, M, G, etc. may be used for multiples of 1000, and the suffixes Ki, Mi, Gi, etc. may be used for multiples of 1024.
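The request/limit distinction described above is set per container; a sketch with illustrative values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo       # illustrative name
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: 250m           # a quarter CPU, reserved for scheduling
        memory: 64Mi        # 64 * 1024^2 bytes
      limits:
        cpu: 500m           # throttled above half a CPU per quota period
        memory: 128Mi       # OOM-killed if usage exceeds this
```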
Events
View Latest Events
$ kubectl get events --all-namespaces
NAMESPACE LAST SEEN TYPE REASON OBJECT MESSAGE
kube-system 7m8s Normal Scheduled pod/metrics-server-6b5c979cf8-t8496 Successfully assigned kube-system/metrics-server-6b5c979cf8-t8496 to docker-desktop
kube-system 7m6s Normal Pulling pod/metrics-server-6b5c979cf8-t8496 Pulling image "k8s.gcr.io/metrics-server/metrics-server:v0.4.3"
[...]
Horizontal Pod Autoscaler
The Horizontal Pod Autoscaler (HPA) scales the number of Pods in a replication controller, deployment, replica set or stateful set based on metrics such as CPU utilization.
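A minimal HPA sketch targeting the liberty1 deployment (the thresholds are illustrative, and the metrics server must be installed; autoscaling/v2 requires a reasonably recent cluster):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: liberty1-hpa         # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: liberty1
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale up above 70% average CPU
```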
Day X
Day 1 activities generally include installation and configuration activities.
Day 2 activities generally include scaling up and down, reconfiguration, updates, backups, failovers, restores, etc.
In general, operators are used to implement day 1 and day 2 activities.
Pod Affinity
Example ensuring that not all pods run on the same node:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: "app"
operator: In
values:
- myappname
topologyKey: "kubernetes.io/hostname"
NodePorts
Set externalTrafficPolicy: Local on a Kubernetes service so that the NodePort won't open on every node but only on the nodes where the pods are actually running.
Clustering
- Cluster size limitations: https://kubernetes.io/docs/setup/best-practices/cluster-large/#support
- Without Cluster Federation (Ubernetes), clusters should not span dispersed data centers and "stretching an OpenShift Cluster Platform across multiple data centers is not recommended".
Jobs
A Job may be used to run one or more pods until a specified number have successfully completed. A CronJob is a Job on a repeating schedule. Note:
A Replication Controller manages Pods which are not expected to terminate (e.g. web servers), and a Job manages Pods that are expected to terminate (e.g. batch tasks).
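A CronJob wraps a Job template in a schedule; a sketch with illustrative names (batch/v1 on Kubernetes >= 1.21, batch/v1beta1 on older clusters):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-cleanup      # illustrative name
spec:
  schedule: "0 2 * * *"      # 02:00 every day, standard cron syntax
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never   # required for Jobs; OnFailure also allowed
          containers:
          - name: cleanup
            image: kgibm/containerdiagsmall
            command: ["sh", "-c", "echo cleaning"]
```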
List jobs
# kubectl get jobs -o wide
NAME COMPLETIONS DURATION AGE CONTAINERS IMAGES SELECTOR
myjobname 1/1 5s 34s myjobcontainername kgibm/containerdiagsmall controller-uid=5078824a-fad1-4961-af97-62d387ef2fc7
Create job
printf '{"apiVersion": "batch/v1","kind": "Job", "metadata": {"name": "%s"}, "spec": {"template": {"spec": {"restartPolicy": "Never", "containers": [{"name": "%s", "image": "%s", "command": %s}]}}}}' myjobname myjobcontainername kgibm/containerdiagsmall '["ls", "-l"]' | kubectl create -f -
Describe job
$ kubectl describe job myjobname
[...]
Start Time: Wed, 23 Jun 2021 08:20:59 -0700
Completed At: Wed, 23 Jun 2021 08:21:04 -0700
Duration: 5s
Pods Statuses: 0 Running / 1 Succeeded / 0 Failed
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 73s job-controller Created pod: myjobname-d9rr5
Normal Completed 68s job-controller Job completed
Print job logs
$ kubectl logs myjobname-d9rr5
total 60
lrwxrwxrwx 1 root root 7 Jan 26 06:05 bin -> usr/bin
[...]
DaemonSets
A DaemonSet may be used to run persistent pods on all or a subset of nodes. Note:
DaemonSets are similar to Deployments in that they both create Pods, and those Pods have processes which are not expected to terminate (e.g. web servers, storage servers). Use a Deployment for stateless services, like frontends, where scaling up and down the number of replicas and rolling out updates are more important than controlling exactly which host the Pod runs on. Use a DaemonSet when it is important that a copy of a Pod always run on all or certain hosts, and when it needs to start before other Pods.
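A DaemonSet sketch that runs one pod per node (the name and log-shipping image are illustrative):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-logger          # illustrative name
spec:
  selector:
    matchLabels:
      app: node-logger
  template:
    metadata:
      labels:
        app: node-logger     # must match the selector
    spec:
      containers:
      - name: logger
        image: fluent/fluentd   # illustrative log-shipping image
```

Unlike a Deployment, there is no replicas field: the number of pods follows the number of (matching) nodes.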
Services
List Services
$ kubectl get services --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 8d
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 8d
kube-system metrics-server ClusterIP 10.102.139.243 <none> 443/TCP 6d23h
kubernetes-dashboard dashboard-metrics-scraper ClusterIP 10.107.135.44 <none> 8000/TCP 7d
kubernetes-dashboard kubernetes-dashboard ClusterIP 10.97.139.73 <none> 443/TCP 7d
Create Service
By default, services are exposed on a ClusterIP, which is internal to the cluster.
kubectl expose deployment ${DEPLOYMENT} --port=${EXTERNALPORT} --target-port=${PODPORT} --namespace=${NAMESPACE}
For example:
kubectl expose deployment liberty1 --port=80 --target-port=9080 --namespace=testns1
To expose a service on a NodePort (i.e. a random port between 30000-32767 on each node):
kubectl expose deployment liberty1 --port=80 --target-port=9080 --type=NodePort --namespace=testns1
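The NodePort expose above corresponds roughly to this Service manifest (the explicit nodePort is optional and illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: liberty1
  namespace: testns1
spec:
  type: NodePort
  selector:
    app: liberty1        # pods carrying this label back the service
  ports:
  - port: 80             # service port inside the cluster
    targetPort: 9080     # container port on the pods
    # nodePort: 30187    # optional; otherwise assigned from 30000-32767
```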
Then, access the service at the LoadBalancer Ingress host on port NodePort:
$ kubectl describe services liberty1 --namespace=testns1
Name: liberty1
Namespace: testns1
Labels: app=liberty1
Annotations: <none>
Selector: app=liberty1
Type: NodePort
IP: 10.107.0.163
LoadBalancer Ingress: localhost
Port: <unset> 80/TCP
TargetPort: 9080/TCP
NodePort: <unset> 30187/TCP
Endpoints: 10.1.0.36:9080,10.1.0.37:9080
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
For example:
$ curl -I http://localhost:30187/
HTTP/1.1 200 OK
[...]
Delete Service
kubectl delete service/${DEPLOYMENT} --namespace=${NAMESPACE}
Ingresses
An Ingress exposes services outside of the cluster network. Before creating an ingress, you must create at least one Ingress Controller to manage the ingress. By default, no ingress controller is installed. A commonly used ingress controller which is supported by Kubernetes is the nginx ingress controller.
Create nginx Ingress controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.46.0/deploy/static/provider/cloud/deploy.yaml
kubectl wait --namespace ingress-nginx --for=condition=ready pod --selector=app.kubernetes.io/component=controller --timeout=120s
See https://kubernetes.github.io/ingress-nginx/deploy/
Create Ingress
printf '{"apiVersion":"networking.k8s.io/v1","kind":"Ingress","metadata":{"name":"%s","annotations":{"nginx.ingress.kubernetes.io/rewrite-target":"/"}},"spec":{"rules":[{"http":{"paths":[{"path":"%s","pathType":"Prefix","backend":{"service":{"name":"%s","port":{"number":80}}}}]}}]}}' "${INGRESS}" "${INGRESSPATH}" "${SERVICE}" | kubectl create -f - --namespace=${NAMESPACE}
For example:
printf '{"apiVersion":"networking.k8s.io/v1","kind":"Ingress","metadata":{"name":"%s","annotations":{"nginx.ingress.kubernetes.io/rewrite-target":"/"}},"spec":{"rules":[{"http":{"paths":[{"path":"%s","pathType":"Prefix","backend":{"service":{"name":"%s","port":{"number":80}}}}]}}]}}' "ingress1" "/" "liberty1" | kubectl create -f - --namespace=testns1
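The printf one-liner above corresponds to this manifest:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress1
  namespace: testns1
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: liberty1   # the service created earlier
            port:
              number: 80
```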
List Ingresses
$ kubectl get ingresses --all-namespaces
NAMESPACE NAME CLASS HOSTS ADDRESS PORTS AGE
testns1 ingress1 <none> * localhost 80 63s
Describe Ingress
$ kubectl describe ingress ${INGRESS} --namespace=${NAMESPACE}
Name: ingress1
Namespace: testns1
Address: localhost
Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
Host Path Backends
---- ---- --------
*
/ liberty1:80 (10.1.0.44:9080,10.1.0.47:9080)
Annotations: nginx.ingress.kubernetes.io/rewrite-target: /
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Sync 51s (x2 over 93s) nginx-ingress-controller Scheduled for sync
Delete Ingress
kubectl delete ingress/${INGRESS} --namespace=${NAMESPACE}
Authentication
Kubernetes authentication supports service accounts and normal users. Normal users are managed through external mechanisms rather than by Kubernetes itself:
It is assumed that a cluster-independent service manages normal users [...]
Kubernetes does not have objects which represent normal user accounts. Normal users cannot be added to a cluster through an API call.
[...] any user that presents a valid certificate signed by the cluster's certificate authority (CA) is considered authenticated.
[...] Kubernetes determines the username from the common name field in the 'subject' of the cert.
[...] client certificates can also indicate a user's group memberships using the certificate's organization fields. To include multiple group memberships for a user, include multiple organization fields in the certificate.
List Service Accounts
$ kubectl get serviceaccounts
NAME SECRETS AGE
default 1 136m
Retrieve Default Service Account Token
The default
service account token may be retrieved:
$ TOKEN=$(kubectl get secrets -o jsonpath="{.items[?(@.metadata.annotations['kubernetes\.io/service-account\.name']=='default')].data.token}" | base64 --decode)
$ echo ${TOKEN}
This may be then used in an API request. For example:
$ curl -X GET https://kubernetes.docker.internal:6443/api --header "Authorization: Bearer ${TOKEN}" --insecure
{
"kind": "APIVersions",
"versions": [
"v1"
],
[...]
Role-Based Access Control
Role-Based Access Control (RBAC) implements authorization in Kubernetes. Roles are namespace-scoped and ClusterRoles are cluster-scoped. RoleBindings and ClusterRoleBindings attach users and/or groups to a set of Roles or ClusterRoles, respectively.
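A sketch of a namespace-scoped Role and a RoleBinding attaching a user to it (the role name and user are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader           # illustrative name
  namespace: testns1
rules:
- apiGroups: [""]            # "" means the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods            # illustrative name
  namespace: testns1
subjects:
- kind: User
  name: jane                 # illustrative user (see Authentication above)
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```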
List Roles
$ kubectl get roles --all-namespaces
NAMESPACE NAME CREATED AT
kube-public system:controller:bootstrap-signer 2021-04-27T15:24:35Z
[...]
List Role Bindings
$ kubectl get rolebindings --all-namespaces
NAMESPACE NAME ROLE AGE
kube-public system:controller:bootstrap-signer Role/system:controller:bootstrap-signer 138m
[...]
List Cluster Roles
$ kubectl get clusterroles
NAME CREATED AT
admin 2021-04-27T15:24:34Z
cluster-admin 2021-04-27T15:24:34Z
edit 2021-04-27T15:24:34Z
system:basic-user 2021-04-27T15:24:34Z
[...]
List Cluster Role Bindings
$ kubectl get clusterrolebindings
NAME ROLE AGE
cluster-admin ClusterRole/cluster-admin 135m
[...]
Monitoring
Show CPU and memory usage:
kubectl top pods --all-namespaces
kubectl top pods --containers --all-namespaces
kubectl top nodes
Tekton Pipelines
Tekton pipelines describes CI/CD pipelines as code using Kubernetes custom resources. Terms:
- Task: set of sequential steps
- Pipeline: set of sequential tasks
Technologies such as OpenShift Pipelines, Jenkins, JenkinsX, etc. use Tekton to implement their CI/CD workflow on top of Kubernetes.
Appsody
Appsody was a way to create application stacks using predefined templates. It has been superseded by OpenShift Do (odo).
Helm
Helm charts group together YAML templates that define a logical application release and its required Kubernetes resources.
Common commands
- Show Helm CLI version:
helm version
- Show available options:
helm show values .
- Install a chart:
helm install $NAME .
- List installed charts:
helm ls
- Upgrade a chart:
helm upgrade $NAME .
- Rollback an upgrade:
helm rollback $NAME 1
Kubernetes Dashboard
Kubernetes Dashboard is a simple web interface for Kubernetes. Example installation:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.2.0/aio/deploy/recommended.yaml
kubectl proxy
- Open http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
- Use a login token such as the default service account token
- Change the namespace at the top as needed and explore.
To delete the dashboard, use the same YAML as above:
kubectl delete -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.2.0/aio/deploy/recommended.yaml
Kubernetes Metrics Server
Kubernetes Metrics Server provides basic container resource metrics for consumers such as Kubernetes Dashboard. Example installation:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
- For a development installation, allow insecure certificates:
kubectl patch deployment metrics-server -n kube-system --type 'json' -p '[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'
- If using Kubernetes Dashboard, refresh the Pods view after a few minutes; if the metrics server is working, an overall CPU usage graph appears.
To delete the metrics-server, use the same YAML as above:
kubectl delete -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Knative
Knative helps deploy and manage serverless workloads.