---
title: " Autoscaling in Kubernetes "
date: 2016-07-12
slug: autoscaling-in-kubernetes
url: /blog/2016/07/Autoscaling-In-Kubernetes
author: >
  Jerzy Szczepkowski (Google),
  Marcin Wielgus (Google)
---
_**Editor's note:** this post is part of a [series of in-depth articles](/blog/2016/07/five-days-of-kubernetes-1-3) on what's new in Kubernetes 1.3_
Customers using Kubernetes respond to end user requests quickly and ship software faster than ever before. But what happens when you build a service that is even more popular than you planned for, and run out of compute? In [Kubernetes 1.3](https://kubernetes.io/blog/2016/07/kubernetes-1-3-bridging-cloud-native-and-enterprise-workloads/), we are proud to announce that we have a solution: autoscaling. On [Google Compute Engine](https://cloud.google.com/compute/) (GCE) and [Google Container Engine](https://cloud.google.com/container-engine/) (GKE) (and coming soon on [AWS](https://aws.amazon.com/)), Kubernetes will automatically scale up your cluster as soon as you need it, and scale it back down to save you money when you don’t.
### Benefits of Autoscaling
To understand better where autoscaling provides the most value, let’s start with an example. Imagine you have a 24/7 production service whose load varies over time: very busy during the day in the US, and relatively low at night. Ideally, we would want the number of nodes in the cluster and the number of pods in a deployment to dynamically adjust to the load to meet end user demand. The new Cluster Autoscaler, together with the Horizontal Pod Autoscaler, can handle this for you automatically.
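To give a flavor of the pod-level half of that equation, a Horizontal Pod Autoscaler can be attached to a deployment with a single kubectl command. The sketch below targets a hypothetical deployment named web and is shown only as a preview, not as part of the setup that follows:
```
# Keep between 1 and 10 replicas of a (hypothetical) "web" deployment,
# scaling so that average CPU utilization stays around 50%
kubectl autoscale deployment web --min=1 --max=10 --cpu-percent=50
```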
### Setting Up Autoscaling on GCE
The following instructions apply to GCE. For GKE, please check the autoscaling section of the cluster operations manual, available [here](https://cloud.google.com/container-engine/docs/clusters/operations#create_a_cluster_with_autoscaling).
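On GKE, cluster autoscaling is configured with flags at cluster creation time. The sketch below uses a hypothetical cluster name, and depending on your gcloud version the flags may only be available in the beta command group:
```
# Create a GKE cluster (hypothetical name "my-cluster") that starts with 2 nodes
# and lets the autoscaler grow it to at most 5
gcloud container clusters create my-cluster \
  --num-nodes=2 \
  --enable-autoscaling --min-nodes=2 --max-nodes=5
```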
Before we begin, we need to have an active GCE project with Google Cloud Monitoring, Google Cloud Logging and Stackdriver enabled. For more information on project creation, please read our [Getting Started Guide](https://github.com/kubernetes/kubernetes/blob/master/docs/getting-started-guides/gce.md#prerequisites). We also need to download a recent version of the Kubernetes project (version 1.3.0 or later).
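One way to get the release is to download and unpack the official release tarball. The URL below assumes the standard release bucket; adjust it to the version you want:
```
# Download and unpack a Kubernetes release (v1.3.0 or later)
wget https://storage.googleapis.com/kubernetes-release/release/v1.3.0/kubernetes.tar.gz
tar xzf kubernetes.tar.gz
cd kubernetes
```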
First, we set up a cluster with Cluster Autoscaler turned on. The number of nodes in the cluster will start at 2, and autoscale up to a maximum of 5. To implement this, we’ll export the following environment variables:
```
export NUM_NODES=2
export KUBE_AUTOSCALER_MIN_NODES=2
export KUBE_AUTOSCALER_MAX_NODES=5
export KUBE_ENABLE_CLUSTER_AUTOSCALER=true
```
and start the cluster by running:
```
./cluster/kube-up.sh
```
The kube-up.sh script creates a cluster together with the Cluster Autoscaler add-on. The autoscaler will try to add new nodes to the cluster if there are pending pods that could be scheduled on a new node.
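Later on, an easy way to watch the autoscaler react is to look for pods stuck in Pending alongside the node list; a simple sketch using grep:
```
# Pods stuck in Pending are what triggers a scale-up
kubectl get pods --all-namespaces | grep Pending

# The node count should grow (up to KUBE_AUTOSCALER_MAX_NODES) while pods are pending
kubectl get nodes
```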
Let’s look at our cluster; it should have two nodes:
```
$ kubectl get nodes
NAME                           STATUS                     AGE
kubernetes-master              Ready,SchedulingDisabled   2m
kubernetes-minion-group-de5q   Ready                      2m
kubernetes-minion-group-yhdx   Ready                      1m
```
#### Run & Expose PHP-Apache Server
To demonstrate autoscaling, we will use a custom Docker image based on the php-apache server. The image can be found [here](https://github.com/kubernetes/kubernetes/blob/8caeec429ee1d2a9df7b7a41b21c626346b456fb/docs/user-guide/horizontal-pod-autoscaling/image). It defines an [index.php](https://github.com/kubernetes/kubernetes/blob/8caeec429ee1d2a9df7b7a41b21c626346b456fb/docs/user-guide/horizontal-pod-autoscaling/image/index.php) page which performs some CPU-intensive computations.
First, we’ll start a deployment running the image and expose it as a service:
```
$ kubectl run php-apache \
  --image=gcr.io/google_containers/hpa-example \