Introduction:
Resource management in Kubernetes means specifying and controlling how much of each resource, such as CPU and memory, a container within a Pod needs. Resource management ensures efficient utilization and fair allocation of resources for the containers within a Kubernetes cluster.
In this guide we will look at three resource management mechanisms, with examples:
1. Resource Quotas
2. Resource Requests
3. Resource Limits
The goal of resource limits and quotas is to ensure each application gets its "fair share" of resources, no more and no less.
Pre-requisites: If you're using a minikube cluster for this hands-on, make sure the metrics-server addon is enabled so you can view metrics:
minikube addons enable metrics-server
Resource Quotas
Resource quotas are native Kubernetes objects that impose limits on resource consumption. Administrators can set up resource quotas to control resource consumption on a per-namespace level, ensuring each namespace only uses a fair share of resources.
Let us create a ResourceQuota and set the pod limit to 5.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: res-quota
spec:
  hard:
    pods: "5"
kubectl apply -f res-quota.yaml
kubectl get resourcequota
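Once the quota is applied, the output should look roughly like this (column layout varies slightly between kubectl versions, and AGE will differ in your cluster):

```shell
$ kubectl get resourcequota
NAME        AGE   REQUEST     LIMIT
res-quota   10s   pods: 0/5
```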
As you can see, the hard limit for pods is set to 5 and the current usage is 0. You can do the same for any particular namespace and set the requests and limits for its pods.
Next, let's see what happens if we create more pods than the allocated quota allows.
Create a simple deployment with replicas set to 10 and apply the manifest file. You can create a basic deployment.yaml using the official documentation. I already had a deployment running, so I simply scaled it up to 10 replicas.
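Assuming the deployment is named nginx-deployment (as in the events shown later), the scale step could look like this; editing replicas: in the manifest and re-applying works equally well:

```shell
# Scale the existing deployment to 10 replicas
# (the deployment name is an assumption; substitute your own)
kubectl scale deployment nginx-deployment --replicas=10
```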
Now if you list the pods, you will see only 5 pods running in the deployment, matching the ResourceQuota.
We can check the events to understand more about the error.
kubectl get events | grep replicaset/nginx-deployment
The error states that the quota was exceeded, along with the current usage and the configured limit. The extra pods won't be scheduled; only as many pods as the quota allows will run.
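For reference, the quota rejection in the events typically reads along these lines (the ReplicaSet hash and counts depend on your setup):

```shell
$ kubectl get events | grep replicaset/nginx-deployment
Warning  FailedCreate  replicaset/nginx-deployment-xxxxx  Error creating: pods "nginx-deployment-xxxxx-yyyyy" is forbidden: exceeded quota: res-quota, requested: pods=1, used: pods=5, limited: pods=5
```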
Additionally, you can add memory and CPU to the spec of the ResourceQuota to cap the resources allocated in the default namespace, or any other namespace. You can check the below example for the same.
Example:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: res-quota
spec:
  hard:
    pods: "5"
    memory: "200Mi"
    cpu: "0.5"
kubectl apply -f res-quota.yaml
kubectl describe resourcequota
This object restricts the total CPU requested in the default namespace to 0.5 CPU core, and the total memory requested to 200Mi.
With this you can find out how much of quota is available and how much can be allocated to new pods.
Any combination of CPU and memory requests is allowed as long as the totals do not breach the limits defined in the ResourceQuota. Once a limit is reached, new pods will fail to deploy to that namespace: the API server uses the ResourceQuota admission controller to determine whether any restriction would be breached, and declines to create the pod if so.
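One side effect worth knowing: once a quota constrains cpu or memory, every new pod in that namespace must declare requests for those resources, or it will be rejected outright. A minimal conforming pod might look like this (the name and values here are purely illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: quota-fit-demo      # hypothetical name
spec:
  containers:
  - name: nginx
    image: nginx:latest
    resources:
      requests:
        cpu: "200m"         # fits within the 0.5 CPU quota
        memory: "100Mi"     # fits within the 200Mi quota
```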
Resource Requests
Requests are used to specify the minimum amount of resources (CPU and memory) that a Pod needs. Kubernetes uses these requests to decide which node to place the pod on, ensuring that the node has enough available resources to meet the pod's requirements. Resource requests are defined in the spec.containers[].resources.requests field of your pod manifest.
Let us consider an example and try to get the metrics used by the resources.
apiVersion: v1
kind: Pod
metadata:
  name: requests-demo
spec:
  containers:
  - name: nginx
    image: nginx:latest
    resources:
      requests:
        cpu: "100m"
        memory: "150Mi"
Requests don’t cap the actual resource usage of pods: once scheduled to a node, a pod is free to use more CPU and memory than it requested, as long as there is spare capacity on the node.
Resource Limits
A limit is the maximum amount of a resource that a container is allowed to use. If you don't configure a limit, the container can consume as much as the node allows (unless a LimitRange in the namespace assigns a default). Limits help ensure that one Pod does not consume all the resources on a node, which could degrade performance for everything else running there.
Example:
apiVersion: v1
kind: Pod
metadata:
  name: limits-demo
spec:
  containers:
  - name: nginx
    image: nginx:latest
    resources:
      limits:
        cpu: 100m
        memory: 150Mi
In the above example the pod is limited to 100m (100 millicores, or 0.1 CPU core) and 150Mi (mebibytes) of memory. It is important to set resource limits appropriately, based on your application's requirements.
Now we will combine both requests and limits and use them in an example.
First we will create a pod with both a memory request and a memory limit defined. In short, we are both requesting and limiting the memory for the pod.
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: stress
    image: polinux/stress
    resources:
      requests:
        memory: "100Mi"
      limits:
        memory: "200Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]
In the above example, the memory request is 100Mi and the memory limit is 200Mi. The stress command stresses the memory by allocating 150 megabytes ('--vm-bytes 150M').
Create the above pod using the below command:
kubectl apply -f pod.yaml
kubectl get pods
kubectl top po resource-demo
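The output of kubectl top should look roughly like this (the CPU figure varies from run to run):

```shell
$ kubectl top po resource-demo
NAME            CPU(cores)   MEMORY(bytes)
resource-demo   42m          151Mi
```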
Now you can see that the memory used by the pod is around 151Mi, which matches the --vm-bytes argument we provided in the pod YAML. This value is within the limit, so the pod keeps running.
Next, we will increase the argument to a value that exceeds the allocated limit and check the pod's status.
Delete the existing pod, modify it with the below code and create the pod again.
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: stress
    image: polinux/stress
    resources:
      requests:
        memory: "100Mi"
      limits:
        memory: "200Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "350M", "--vm-hang", "1"]
In the above example we pass 350M of memory as the argument, which exceeds the 200Mi resource limit.
kubectl apply -f pod.yaml
Now you can clearly see the pod being killed because we stressed it with more memory than its limit allows. The container is terminated and the pod's status eventually shows the error OOMKilled (Out Of Memory).
Also, if you set resource requests and limits absurdly high, for example requesting 1000Gi of memory with a limit of 2000Gi, the pod will fail to schedule: the events will show an "Insufficient memory" message. You can try the same with an oversized CPU request.
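The scheduling failure shows up in the pod's events, roughly like this (the node counts depend on your cluster size):

```shell
$ kubectl describe pod resource-demo
...
Warning  FailedScheduling  default-scheduler  0/1 nodes are available: 1 Insufficient memory.
```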
Let us now combine both CPU and memory. Earlier we defined a ResourceQuota capping the CPU, memory, and pod count for the namespace. Now we will add a pod with its own requests and limits, and then check the CPU and memory still available after its creation.
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: stress
    image: polinux/stress
    resources:
      requests:
        memory: "50Mi"
        cpu: "100m"
      limits:
        memory: "50Mi"
        cpu: "100m"
    command: ["stress"]
    args: ["--cpu", "2"]
kubectl apply -f pod.yaml
Now we will check the resource quota to see the requests and limits consumed by, and still available for, the pods.
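With the pod running, kubectl describe resourcequota reports usage against each hard limit, roughly like this (assuming the 0.5 CPU / 200Mi quota from earlier):

```shell
$ kubectl describe resourcequota res-quota
Name:       res-quota
Namespace:  default
Resource    Used   Hard
--------    ----   ----
cpu         100m   500m
memory      50Mi   200Mi
pods        1      5
```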
As you can see, the pod's CPU and memory have been counted against the quota. We can see both the resources used so far and the headroom left for further pods.
Conclusion
The more resources a Kubernetes cluster consumes, the more expensive it is to run. We should always specify our resource requests and limits explicitly. If we leave them out, containers run unconstrained, which may sound convenient, but it lets a single misbehaving application consume a node's resources and starve everything else running there.
By correctly specifying our compute resource quota, requests and limits, we can reduce overspending, improve performance, and ensure efficient use of our Kubernetes resources.
If you liked this article please like and share with your network!
Keep learning and sharing !
Follow Bala for more such posts 💯🚀