---
title: " Introducing Kubeflow - A Composable, Portable, Scalable ML Stack Built for Kubernetes "
date: 2017-12-21
slug: introducing-kubeflow-composable
url: /blog/2017/12/Introducing-Kubeflow-Composable
author: >
  Jeremy Lewi (Google),
  David Aronchick (Google)
---
## Kubernetes and Machine Learning
Kubernetes has quickly become the hybrid solution for deploying complicated workloads anywhere. While it started with just stateless services, customers have begun to move complex workloads to the platform, taking advantage of the rich APIs, reliability, and performance that Kubernetes provides. One of the fastest-growing use cases is machine learning, with Kubernetes emerging as the deployment platform of choice for ML workloads.
Building any production-ready machine learning system involves various components, often mixing vendors and hand-rolled solutions. Connecting and managing these services, even for moderately sophisticated setups, introduces significant complexity and raises the barrier to adopting machine learning. Infrastructure engineers often spend a significant amount of time manually tweaking deployments and hand-rolling solutions before a single model can be tested.
Worse, these deployments are so tightly coupled to the clusters they were deployed to that the stacks are immobile, meaning that moving a model from a laptop to a highly scalable cloud cluster is effectively impossible without significant re-architecture. All of these differences add up to wasted effort and create opportunities to introduce bugs at each transition.
## Introducing Kubeflow
To address these concerns, we’re announcing the creation of the Kubeflow project, a new open source GitHub repo dedicated to making ML stacks on Kubernetes easy to use, fast to deploy, and extensible. This repository contains:
- JupyterHub to create and manage interactive Jupyter notebooks
- A TensorFlow [Custom Resource Definition](/docs/concepts/api-extension/custom-resources/) (CRD) that can be configured to use CPUs or GPUs, and adjusted to the size of a cluster with a single setting (a minimal example manifest follows this list)
- A TF Serving container
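To give a feel for the TensorFlow CRD, here is a minimal sketch of what submitting a training job might look like. This assumes the early `tensorflow.org/v1alpha1` TfJob API; the container image and replica layout below are placeholders, not a tested recipe:
```
# Illustrative only: a single-replica TfJob using the assumed v1alpha1 API.
# Replace the image with your own training container.
kubectl apply -f - <<EOF
apiVersion: "tensorflow.org/v1alpha1"
kind: "TfJob"
metadata:
  name: example-job
spec:
  replicaSpecs:
    - replicas: 1
      tfReplicaType: MASTER
      template:
        spec:
          containers:
            - name: tensorflow
              image: gcr.io/your-project/your-training-image
          restartPolicy: OnFailure
EOF
```
Scaling out would then be a matter of adding WORKER and PS replica specs with higher replica counts.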
Because this solution relies on Kubernetes, it runs wherever Kubernetes runs. Just spin up a cluster and go!
## Using Kubeflow
Let's suppose you are working with two different Kubernetes clusters: a local [minikube](https://github.com/kubernetes/minikube) cluster and a [GKE cluster with GPUs](https://docs.google.com/forms/d/1JNnoUe1_3xZvAogAi16DwH6AjF2eu08ggED24OGO7Xc/viewform?edit_requested=true), and that you have two [kubectl contexts](/docs/tasks/access-application-cluster/configure-access-multiple-clusters/#define-clusters-users-and-contexts) defined, named `minikube` and `gke`.
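Before going further, it is worth confirming that both contexts exist under those names (an assumption on our part; a GKE context, for example, is typically named `gke_<project>_<zone>_<cluster>` unless you rename it):
```
# Both minikube and gke should appear in this list.
kubectl config get-contexts
```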
First we need to initialize our [ksonnet](https://github.com/ksonnet) application and install the Kubeflow packages. (To use ksonnet, you must first install it on your operating system; the instructions for doing so are [here](https://github.com/ksonnet/ksonnet).)
```
ks init my-kubeflow
cd my-kubeflow
ks registry add kubeflow \
    github.com/google/kubeflow/tree/master/kubeflow
ks pkg install kubeflow/core
ks pkg install kubeflow/tf-serving
ks pkg install kubeflow/tf-job
ks generate core kubeflow-core --name=kubeflow-core
```
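If you are curious what the last command produced, `ks generate` writes a component file into the app directory (assuming ksonnet's default layout):
```
# The generated component should appear as components/kubeflow-core.jsonnet.
ls components
```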
We can now define [environments](https://ksonnet.io/docs/concepts#environment) corresponding to our two clusters.
```
kubectl config use-context minikube
ks env add minikube
kubectl config use-context gke
ks env add gke
```
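You can verify that both environments were registered, and which cluster each one points at, with ksonnet's environment listing (output columns vary slightly between ksonnet versions):
```
ks env list
```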
And we’re done! Now just deploy the components to each environment. First, on minikube:
```
ks apply minikube -c kubeflow-core
```
And then on our multi-node GKE cluster for quicker training:
```
ks apply gke -c kubeflow-core
```
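Once an apply finishes, a quick sanity check is to list the pods in the target cluster; the Kubeflow components, such as the JupyterHub pod `tf-hub-0`, should come up, though exact names may vary with the package version:
```
# The kubeflow-core components (JupyterHub, the TfJob operator, etc.)
# should appear here once the apply has completed.
kubectl get pods
```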
Because it is easy to deploy the same rich ML stack everywhere, drift and rewriting between these environments are kept to a minimum.
To access either deployment, you can execute the following command:
```
kubectl port-forward tf-hub-0 8100:8000
```
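Since the first port is the local one, JupyterHub (served by the `tf-hub-0` pod) should then be reachable in your browser at http://127.0.0.1:8100.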