---
title: " Autoscaling in Kubernetes "
date: 2016-07-12
slug: autoscaling-in-kubernetes
url: /blog/2016/07/Autoscaling-In-Kubernetes
author: >
  Jerzy Szczepkowski (Google),
  Marcin Wielgus (Google)
---
_**Editor's note:** this post is part of a [series of in-depth articles](/blog/2016/07/five-days-of-kubernetes-1-3) on what's new in Kubernetes 1.3_
Customers using Kubernetes respond to end user requests quickly and ship software faster than ever before. But what happens when you build a service that is even more popular than you planned for, and run out of compute? In [Kubernetes 1.3](https://kubernetes.io/blog/2016/07/kubernetes-1-3-bridging-cloud-native-and-enterprise-workloads/), we are proud to announce that we have a solution: autoscaling. On [Google Compute Engine](https://cloud.google.com/compute/) (GCE) and [Google Container Engine](https://cloud.google.com/container-engine/) (GKE) (and coming soon on [AWS](https://aws.amazon.com/)), Kubernetes will automatically scale up your cluster as soon as you need it, and scale it back down to save you money when you don’t.
### Benefits of Autoscaling
To understand better where autoscaling provides the most value, let’s start with an example. Imagine you have a 24/7 production service whose load varies over time: very busy during the day in the US, and relatively low at night. Ideally, we would want the number of nodes in the cluster and the number of pods in a deployment to dynamically adjust to the load to meet end user demand. The new Cluster Autoscaler, together with the Horizontal Pod Autoscaler, can handle this for you automatically.
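To give a flavor of the pod-level half of that equation, a Horizontal Pod Autoscaler can be attached to a deployment with a single kubectl command. The sketch below targets a hypothetical deployment named web and is shown only as a preview, not as part of the setup that follows:
```
# Keep between 1 and 10 replicas of a (hypothetical) "web" deployment,
# scaling so that average CPU utilization stays around 50%
kubectl autoscale deployment web --min=1 --max=10 --cpu-percent=50
```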
### Setting Up Autoscaling on GCE
The following instructions apply to GCE. For GKE, please check the autoscaling section of the cluster operations manual, available [here](https://cloud.google.com/container-engine/docs/clusters/operations#create_a_cluster_with_autoscaling).
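On GKE, cluster autoscaling is configured with flags at cluster creation time. The sketch below uses a hypothetical cluster name, and depending on your gcloud version the flags may only be available in the beta command group:
```
# Create a GKE cluster (hypothetical name "my-cluster") that starts with 2 nodes
# and lets the autoscaler grow it to at most 5
gcloud container clusters create my-cluster \
  --num-nodes=2 \
  --enable-autoscaling --min-nodes=2 --max-nodes=5
```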
Before we begin, we need to have an active GCE project with Google Cloud Monitoring, Google Cloud Logging and Stackdriver enabled. For more information on project creation, please read our [Getting Started Guide](https://github.com/kubernetes/kubernetes/blob/master/docs/getting-started-guides/gce.md#prerequisites). We also need to download a recent version of the Kubernetes project (version 1.3.0 or later).
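One way to get the release is to download and unpack the official release tarball. The URL below assumes the standard release bucket; adjust it to the version you want:
```
# Download and unpack a Kubernetes release (v1.3.0 or later)
wget https://storage.googleapis.com/kubernetes-release/release/v1.3.0/kubernetes.tar.gz
tar xzf kubernetes.tar.gz
cd kubernetes
```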
First, we set up a cluster with Cluster Autoscaler turned on. The number of nodes in the cluster will start at 2, and autoscale up to a maximum of 5. To implement this, we’ll export the following environment variables:
```
export NUM_NODES=2
export KUBE_AUTOSCALER_MIN_NODES=2
export KUBE_AUTOSCALER_MAX_NODES=5
export KUBE_ENABLE_CLUSTER_AUTOSCALER=true
```
and start the cluster by running:
```
./cluster/kube-up.sh
```
The kube-up.sh script creates a cluster together with the Cluster Autoscaler add-on. The autoscaler will try to add new nodes to the cluster if there are pending pods that could be scheduled on a new node.
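Later on, an easy way to watch the autoscaler react is to look for pods stuck in Pending alongside the node list; a simple sketch using grep:
```
# Pods stuck in Pending are what triggers a scale-up
kubectl get pods --all-namespaces | grep Pending

# The node count should grow (up to KUBE_AUTOSCALER_MAX_NODES) while pods are pending
kubectl get nodes
```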
Let’s look at our cluster; it should have two nodes:
```
$ kubectl get nodes
NAME                           STATUS                     AGE
kubernetes-master              Ready,SchedulingDisabled   2m
kubernetes-minion-group-de5q   Ready                      2m
kubernetes-minion-group-yhdx   Ready                      1m
```
#### Run & Expose PHP-Apache Server
To demonstrate autoscaling, we will use a custom Docker image based on the php-apache server. The image can be found [here](https://github.com/kubernetes/kubernetes/blob/8caeec429ee1d2a9df7b7a41b21c626346b456fb/docs/user-guide/horizontal-pod-autoscaling/image). It defines an [index.php](https://github.com/kubernetes/kubernetes/blob/8caeec429ee1d2a9df7b7a41b21c626346b456fb/docs/user-guide/horizontal-pod-autoscaling/image/index.php) page which performs some CPU-intensive computations.
First, we’ll start a deployment running the image and expose it as a service:
```
$ kubectl run php-apache \
  --image=gcr.io/google_containers/hpa-example \