Visualizing Kubelet Performance with the Node Performance Dashboard

---
title: "Visualize Kubelet Performance with Node Dashboard"
date: 2016-11-17
slug: visualize-kubelet-performance-with-node-dashboard
url: /blog/2016/11/Visualize-Kubelet-Performance-With-Node-Dashboard
author: >
  Zhou Fang (Google)
---

_Since this article was published, the Node Performance Dashboard was retired and is no longer available._

_This retirement happened in early 2019, as part of the_ `kubernetes/contrib` _[repository deprecation](https://github.com/kubernetes-retired/contrib/issues/3007)_.

In Kubernetes 1.4, we introduced a new node performance analysis tool, called the _node performance dashboard_, to visualize and explore the behavior of the Kubelet in much richer detail. This new feature makes it easier for Kubelet developers to understand and improve code performance, and lets cluster maintainers set configuration according to the provided Service Level Objectives (SLOs).

**Background**

A Kubernetes cluster is made up of both master and worker nodes. The master node manages the cluster’s state, and the worker nodes do the actual work of running and managing pods. To do so, on each worker node, a binary called the [Kubelet](/docs/admin/kubelet/) watches for any changes in pod configuration and takes corresponding actions to make sure that containers run successfully. High performance of the Kubelet, such as low latency in converging on a new pod configuration and efficient housekeeping with low resource usage, is essential for the entire Kubernetes cluster. To measure this performance, Kubernetes uses [end-to-end (e2e) tests](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/e2e-tests.md#overview) to continuously monitor how benchmarks change in the latest builds as new features are added.

**Kubernetes SLOs are defined by the following benchmarks**:

- **API responsiveness**: 99% of all API calls return in less than 1s.
- **Pod startup time**: 99% of pods and their containers (with pre-pulled images) start within 5s.

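
As a rough illustration of how percentile-based SLOs like these can be checked against a set of latency samples, here is a minimal Python sketch. It is not the code used by the Kubernetes e2e tests: the function names, the sample data, and the choice of the nearest-rank percentile method are all assumptions made for this example.

```python
import math

# Thresholds taken from the SLO list above; the constant names are illustrative.
API_LATENCY_SLO_SECONDS = 1.0
POD_STARTUP_SLO_SECONDS = 5.0


def percentile(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def meets_slo(samples, threshold_seconds, pct=99):
    """True if the pct-th percentile latency is strictly below the threshold.

    Simplification: "within 5s" is treated the same way as "less than 1s".
    """
    return percentile(samples, pct) < threshold_seconds


if __name__ == "__main__":
    # Hypothetical measurements, in seconds.
    api_latencies = [0.2] * 99 + [3.0]   # one slow call out of 100
    pod_startups = [2.0] * 98 + [6.0] * 2  # two slow pods out of 100

    print(meets_slo(api_latencies, API_LATENCY_SLO_SECONDS))  # True
    print(meets_slo(pod_startups, POD_STARTUP_SLO_SECONDS))   # False
```

With 100 samples, a single outlier still leaves the 99th percentile under the threshold, while two outliers push it over. This is why the sample size matters when interpreting a "99% of ..." objective.
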
Prior to the 1.4 release, we only measured and defined these at the cluster level, which opened up the risk that other factors could influence the results. Beyond these, we also want more performance-related SLOs, such as the maximum number of pods for a specific machine type, allowing maximum utilization of your cluster. To do the measurement correctly, we want to introduce a set of tests isolated to just a node’s performance. In addition, we aim to collect more fine-grained resource usage and operation tracing data of the Kubelet from the new tests.

**Data Collection**

The node-specific density and resource usage tests have been part of the e2e-node test set since 1.4. Resource usage is measured by a standalone cAdvisor pod, which allows a more flexible monitoring interval than the cAdvisor integrated into the Kubelet. The performance data, such as latency and resource usage percentiles, are recorded in persistent test result logs. The tests also record time series data, such as the creation time and running time of pods, as well as real-time resource usage. Tracing data of Kubelet operations are recorded in the Kubelet log, which is stored together with the test results.

**Node Performance Dashboard**

Since Kubernetes 1.4, we have been continuously building the newest Kubelet code and running node performance tests. The data is collected by our new performance dashboard, available at [node-perf-dash.k8s.io](http://node-perf-dash.k8s.io/). Figure 1 gives a preview of the dashboard. You can start to explore it by selecting a test, either using the drop-down list of short test names (region (a)) or by choosing test options one by one (region (b)). The test details show up in region (c), which contains the full test name from Ginkgo (the Go test framework used by Kubernetes). Then select a node type (image and machine) in region (d).

![](https://lh5.googleusercontent.com/xREqs-NpWw2isELQ3YekYYMXRsY0fTs0t8lBR5xbZDB02mOAfQAnidXo8AF9hOICBUFI20kD6BVvTR0vDS1ErgQ8fVxP530TWUkyZTeV_KziI9uHvZOrHk5E304MeiLfdEPG2fzz)

This article discusses the introduction of the node performance dashboard in Kubernetes 1.4, a tool for visualizing and analyzing Kubelet behavior. It highlights the importance of Kubelet performance for the entire Kubernetes cluster and how the dashboard aids in understanding and improving code performance for developers and setting configurations based on Service Level Objectives (SLOs). The dashboard collects data from end-to-end node tests, measuring resource usage and operation tracing to monitor Kubelet performance. As of early 2019, the Node Performance Dashboard was retired.