Horizontal Pod Autoscaler

(this post is part of the material I cover in my devops course)

In Kubernetes, a HorizontalPodAutoscaler (or HPA) automatically updates a workload resource (such as a Deployment or StatefulSet), scaling it out with more pods to match demand.

Scaling basics

  • Horizontal scaling means running more processing units, while vertical scaling means making a single processing unit more powerful.
    (read more here)
  • HPA scales out by running more pods automatically, in response to load.
  • The HPA itself is an object that records the desired scaling configuration.
    Let's see an example.

Metrics server

  • First, make sure that the metrics-server is working.
    We have covered that (at least for minikube) here.
  • So please wait patiently until you can see metrics data for your pods.

Running the HPA

  • Here's the HPA definition:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: myhpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mydep
  minReplicas: 1
  maxReplicas: 3
  targetCPUUtilizationPercentage: 60
  • This is a HorizontalPodAutoscaler definition that keeps between 1 and 3 pods. It targets the mydep Deployment.
  • Just apply it:
$> 
$> kubectl apply -f HPA.yaml 
horizontalpodautoscaler.autoscaling/myhpa created
$> 
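Once applied, the controller periodically compares the observed metric to the target using the standard HPA formula, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to minReplicas/maxReplicas. A minimal sketch of that calculation with the numbers from myhpa above (the function name and sample values are illustrative, not part of Kubernetes):

```python
import math

def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=3):
    """Sketch of the core HPA formula:
    desired = ceil(current * currentMetric / targetMetric),
    clamped to [minReplicas, maxReplicas]."""
    desired = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, desired))

# One pod at ~180% of its CPU request, target 60% -> ceil(1 * 180/60) = 3
print(desired_replicas(1, 180, 60))  # 3
# Load gone, utilization near 0 -> clamped back up to minReplicas
print(desired_replicas(3, 0, 60))    # 1
```

This is why, later in this post, one overloaded pod is enough to drive the Deployment up to 3 replicas.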

Deployment and load

  • First, here's a deployment definition:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mydep
spec:
  selector:
    matchLabels:
      app: mydep
  template:
    metadata:
      labels:
        app: mydep
    spec:
      containers:
      - name: mydep-a
        image: alpine:latest
        command:
        - sleep
        - "3600"
        resources:
          requests:
            cpu: "100m"
  • This is a simple deployment, but note that there is no replicas field: the HPA will manage the replica count itself.
  • Apply it to see a single pod:
$> 
$> kubectl apply -f mydep.yaml 
deployment.apps/mydep created
$> kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
mydep-68bf56555d-grmn2   1/1     Running   0          6s
$> 
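Note that targetCPUUtilizationPercentage is measured against each container's CPU request, so with the 100m request above, the 60% target means the HPA reacts when average usage across pods goes above roughly 60m. A small sketch of that calculation (the function name is illustrative, not a kubectl feature):

```python
def average_utilization_pct(usage_millicores, request_millicores=100):
    """Utilization as the HPA sees it: observed CPU usage divided by
    the container's CPU request, as a percentage."""
    return 100 * usage_millicores / request_millicores

# A pod burning 90m against a 100m request is at 90% -> above the 60% target
print(average_utilization_pct(90))  # 90.0
# An idle pod is at 0% -> it pulls the deployment-wide average down
print(average_utilization_pct(0))   # 0.0
```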
  • The only thing left is to create some load:
$> 
$> kubectl exec -it mydep-68bf56555d-grmn2 -- sh
/ # 
/ # for i in 1 2 3 4; do while : ; do : ; done & done
/ # 
/ # ps -e
PID   USER     TIME  COMMAND
    1 root      0:00 sleep 3600
    7 root      0:00 sh
   14 root      0:03 sh
   15 root      0:03 sh
   16 root      0:04 sh
   17 root      0:03 sh
   18 root      0:00 ps -e
/ # 
/ # exit
$> 
$> kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
mydep-68bf56555d-grmn2   1/1     Running   0          110s
mydep-68bf56555d-j5bnr   1/1     Running   0          47s
mydep-68bf56555d-xdspd   1/1     Running   0          47s
$> 
  • After some time, we can see the load concentrated on just one pod (the one we ran the loop in):
$> 
$> kubectl top pods
NAME                     CPU(cores)   MEMORY(bytes)   
mydep-68bf56555d-grmn2   3947m        0Mi             
mydep-68bf56555d-j5bnr   0m           0Mi             
mydep-68bf56555d-xdspd   0m           0Mi             
$>
  • You can exec into this pod again, kill those processes, and watch the number of pods drop back to 1.
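For reference, the four background busy loops used above simply spin to pin CPU cores. The same kind of load generator can be sketched in Python (illustrative only, not part of the cluster setup — it runs anywhere, with a deadline instead of infinite loops):

```python
import multiprocessing
import time

def burn(seconds):
    """Busy-wait until the deadline, keeping one core fully occupied."""
    deadline = time.time() + seconds
    while time.time() < deadline:
        pass  # spin, like `while : ; do : ; done` in the shell

if __name__ == "__main__":
    # Four spinning processes, like the four backgrounded shell loops
    procs = [multiprocessing.Process(target=burn, args=(1,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

Processes (rather than threads) are used so each loop can occupy its own core, mirroring the four independent shell loops.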