5th chunk of `content/en/blog/_posts/2016-11-00-Visualize-Kubelet-Performance-With-Node-Dashboard.md`
 |
| Figure 4. Pod startup latency when creating 105 pods. |
Looking specifically at build #162, we can see the tracing data plotted in the pod creation latency chart (Figure 5). Each curve is an accumulated histogram of the number of pod operations that have already arrived at a certain tracing probe. The timestamp of each tracing probe is collected either from the performance tests or by parsing the Kubelet log. Currently we collect the following tracing data (a sketch of how such log-derived timestamps can be aggregated follows the list):
- "create" (in test): the test creates pods through API client;
- "running" (in test): the test watches that pods are running from API server;
- "pod\_config\_change": pod config change detected by Kubelet SyncLoop;
- "runtime\_manager": runtime manager starts to create containers;
- "infra\_container\_start": the infra container of a pod starts;
- "container\_start': the container of a pod starts;
- "pod\_running": a pod is running;
- "pod\_status\_running": status manager updates status for a running pod;
The time series chart illustrates that it takes a long time for the status manager to update pod status (the data for "running" is not shown since it overlaps with "pod\_status\_running"). We found that this latency is introduced by the queries-per-second (QPS) limit the Kubelet applies to its requests to the API server (the default is 5). Knowing this, we ran additional tests with a higher QPS limit and observed that the "running" curve gradually converges with "pod\_running", resulting in much lower latency. The previous e2e pod startup results therefore reflect the combined latency of the Kubelet and of uploading status, so the performance of the Kubelet itself is underestimated.
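
To get a rough sense of why a QPS limit of 5 matters at this scale, the back-of-envelope Go sketch below (our own illustration, not part of the dashboard) models status uploads for 105 pods as a rate-limited queue: at 5 requests per second the last status update lags by roughly 20 seconds, all of which shows up as extra "startup" latency when observed from the API server.

```go
// qps_backlog.go: back-of-envelope model of how the Kubelet's API QPS limit
// delays status updates for a batch of pods (illustration only).
package main

import "fmt"

func main() {
	const (
		pods = 105 // pods created in the benchmark
		qps  = 5.0 // default Kubelet-to-API-server QPS limit
	)
	// If all pods become running at roughly the same time, the i-th status
	// update cannot be sent before i/qps seconds have passed.
	lastUpdateDelay := float64(pods-1) / qps
	fmt.Printf("with QPS=%.0f, the last of %d status updates lags by ~%.0f s\n",
		qps, pods, lastUpdateDelay)

	// Raising the QPS limit shrinks the reporting backlog proportionally.
	for _, q := range []float64{5, 10, 20, 50} {
		fmt.Printf("QPS=%2.0f -> backlog ~%5.1f s\n", q, float64(pods-1)/q)
	}
}
```

To our knowledge, the relevant knob is the Kubelet's `--kube-api-qps` flag (with `--kube-api-burst` controlling the burst size); check the flags and defaults for your Kubernetes version before tuning.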
| 