Horizontal Pod Autoscaler
(this post is part of the material I cover in my devops course)
In Kubernetes, a HorizontalPodAutoscaler (or HPA) automatically updates a workload resource (such as a Deployment or StatefulSet), scaling the number of pods to match the observed load.
Scaling basics
- Horizontal scaling means running more processing units, whereas vertical scaling means making a single processing unit more powerful (read more here); the sketch below illustrates the difference.
- The HPA scales out by automatically running more pods in response to load.
- The HPA itself is an object that records the desired scaling configuration: which workload to scale, the replica bounds, and the metric target.
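As a rough illustration of the difference (a sketch only; the resource values and the replica count are arbitrary):

# Vertical scaling: make each pod bigger
resources:
  requests:
    cpu: "500m"
    memory: "256Mi"

# Horizontal scaling: run more pods of the same size
replicas: 5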
Let's see an example.
Metrics server
- First, make sure that the metrics-server is working. We have covered that (at least for minikube) here.
- Wait patiently until you can see metrics data for your pods; the quick check below can help.
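A quick way to verify this (assuming metrics-server runs in the kube-system namespace, which is the usual setup):

$> kubectl -n kube-system get deployment metrics-server
$> kubectl top nodes
$> kubectl top pods

Once kubectl top returns numbers instead of an error, the HPA has the metrics it needs.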
Running the HPA
- Here's the HPA definition:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: myhpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mydep
  minReplicas: 1
  maxReplicas: 3
  targetCPUUtilizationPercentage: 60
- This HorizontalPodAutoscaler targets the mydep Deployment and keeps it between 1 and 3 replicas, aiming for an average CPU utilization of 60% of the pods' CPU requests.
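For reference, the same autoscaler can also be written against the newer autoscaling/v2 API (a sketch; for this simple CPU target the behaviour is equivalent):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myhpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mydep
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60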
- Just apply it:
$>
$> kubectl apply -f HPA.yaml
horizontalpodautoscaler.autoscaling/myhpa created
$>
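As a side note, once the target Deployment exists, the same object could also be created imperatively (a sketch, equivalent to the YAML above):

$> kubectl autoscale deployment mydep --cpu-percent=60 --min=1 --max=3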
Deployment and load
- Next, here's the deployment definition:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mydep
spec:
  selector:
    matchLabels:
      app: mydep
  template:
    metadata:
      labels:
        app: mydep
    spec:
      containers:
      - name: mydep-a
        image: alpine:latest
        command:
        - sleep
        - "3600"
        resources:
          requests:
            cpu: "100m"
- This is a simple deployment; note that there is no replicas field, since the HPA will manage the replica count. Also note the CPU request of 100m: the HPA needs CPU requests on the containers in order to compute the CPU utilization percentage.
- Apply it to see a single pod:
$>
$> kubectl apply -f mydep.yaml
deployment.apps/mydep created
$> kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
mydep-68bf56555d-grmn2   1/1     Running   0          6s
$>
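At this point it is worth looking at the HPA itself (illustrative; the exact columns and values depend on your cluster and Kubernetes version). Once metrics start flowing, the TARGETS column should show the current CPU utilization against the 60% target instead of <unknown>:

$> kubectl get hpa myhpa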
- The only thing left to do is to create some load:
$>
$> kubectl exec -it mydep-68bf56555d-grmn2 -- sh
/ #
/ # for i in 1 2 3 4; do while : ; do : ; done & done
/ #
/ # ps -e
PID   USER     TIME  COMMAND
    1 root      0:00 sleep 3600
    7 root      0:00 sh
   14 root      0:03 sh
   15 root      0:03 sh
   16 root      0:04 sh
   17 root      0:03 sh
   18 root      0:00 ps -e
/ #
/ # exit
$>
$> kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
mydep-68bf56555d-grmn2   1/1     Running   0          110s
mydep-68bf56555d-j5bnr   1/1     Running   0          47s
mydep-68bf56555d-xdspd   1/1     Running   0          47s
$>
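You can also watch the scale-up from the HPA's point of view (a sketch; get hpa -w streams updates, and describe hpa shows the scaling events recorded by the controller):

$> kubectl get hpa myhpa -w
$> kubectl describe hpa myhpa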
- After some time, we can see that the load is concentrated on just one pod (the one we ran the loop in):
$>
$> kubectl top pods
NAME                     CPU(cores)   MEMORY(bytes)
mydep-68bf56555d-grmn2   3947m        0Mi
mydep-68bf56555d-j5bnr   0m           0Mi
mydep-68bf56555d-xdspd   0m           0Mi
$>
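This also explains why we end up at exactly 3 replicas. The controller roughly follows the formula from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). Plugging in the numbers above (values are illustrative):

# CPU request per pod is 100m; the busy pod uses ~3947m, i.e. ~3947% utilization vs a 60% target
desiredReplicas = ceil(1 * 3947 / 60) = 66   -> capped at maxReplicas = 3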
- You can exec into this pod again, kill those background loops, and watch the number of pods go back down to 1.
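One way to do that (a sketch; the PIDs are the ones reported by ps -e above, and the pod name will differ in your cluster). Keep in mind that the HPA scales down conservatively, so expect to wait a few minutes (the default downscale stabilization window is 5 minutes) before the extra pods disappear:

$> kubectl exec -it mydep-68bf56555d-grmn2 -- sh
/ # kill 14 15 16 17
/ # exit
$> kubectl get pods -w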