processes in different control groups both read the same file
(ultimately relying on the same blocks on disk), the corresponding
memory charge is split between the control groups. It's nice, but
it also means that when a cgroup is terminated, it could increase the
memory usage of another cgroup, because they're not splitting the cost
anymore for those memory pages.

### CPU metrics: `cpuacct.stat`

Now that we've covered memory metrics, everything else is
simple in comparison. CPU metrics are in the
`cpuacct` controller.

For each container, a pseudo-file `cpuacct.stat` contains the CPU usage
accumulated by the processes of the container, broken down into `user` and
`system` time. The distinction is:

- `user` time is the amount of time a process has direct control of the CPU,
  executing process code.
- `system` time is the time the kernel is executing system calls on behalf of
  the process.

Those times are expressed in ticks of 1/100th of a second, also called "user
jiffies". There are `USER_HZ` _"jiffies"_ per second, and on x86 systems,
`USER_HZ` is 100. Historically, this mapped exactly to the number of scheduler
"ticks" per second, but higher frequency scheduling and
[tickless kernels](https://lwn.net/Articles/549580/) have made the number of
ticks irrelevant.
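
As a minimal sketch of how to read these counters, assuming a cgroup v1
setup where the `cpuacct` hierarchy is mounted at `/sys/fs/cgroup/cpuacct`
and the container's cgroup lives under `docker/<full-container-id>` (the
exact layout varies with your distribution and cgroup driver, and
`my_container` is just a placeholder name):

```console
$ CONTAINER_ID=$(docker inspect --format '{{.Id}}' my_container)
$ cat /sys/fs/cgroup/cpuacct/docker/$CONTAINER_ID/cpuacct.stat
```

The output has two lines, `user` and `system`, each followed by a tick
count; dividing by `USER_HZ` (100 on x86) converts them to seconds.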

### Block I/O metrics

Block I/O is accounted in the `blkio` controller.
Different metrics are scattered across different files. While you can
find in-depth details in the [blkio-controller](https://www.kernel.org/doc/Documentation/cgroup-v1/blkio-controller.txt)
file in the kernel documentation, here is a short list of the most
relevant ones:

`blkio.sectors`
: Contains the number of 512-byte sectors read and written by the processes
  that are members of the cgroup, device by device. Reads and writes are merged
  in a single counter.

`blkio.io_service_bytes`
: Indicates the number of bytes read and written by the cgroup. It has 4
  counters per device, because for each device, it differentiates between
  synchronous vs. asynchronous I/O, and reads vs. writes.

`blkio.io_serviced`
: The number of I/O operations performed, regardless of their size. It also has
  4 counters per device.

`blkio.io_queued`
: Indicates the number of I/O operations currently queued for this cgroup. In
  other words, if the cgroup isn't doing any I/O, this is zero. The opposite is
  not true: if there is no I/O queued, it doesn't mean that the cgroup is idle
  (I/O-wise). It could be doing purely synchronous reads on an otherwise
  quiescent device, which can therefore handle them immediately, without
  queuing. Also, while it's helpful to figure out which cgroup is putting
  stress on the I/O subsystem, keep in mind that it's a relative quantity. Even
  if a process group doesn't perform more I/O, its queue size can increase
  simply because the overall load on the device increases.
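
As a rough example, assuming the same cgroup v1 layout as above (the `blkio`
hierarchy mounted at `/sys/fs/cgroup/blkio`, with the container's cgroup
under `docker/<full-container-id>`), you could dump the per-device byte
counters like this:

```console
$ cat /sys/fs/cgroup/blkio/docker/$CONTAINER_ID/blkio.io_service_bytes
```

Each line pairs a `major:minor` device number with an operation type
(`Read`, `Write`, `Sync`, `Async`, or `Total`) and a byte count, followed by
a final `Total` line summed across devices.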

### Network metrics

Network metrics aren't exposed directly by control groups. There is a
good explanation for that: network interfaces exist within the context
of _network namespaces_. The kernel could probably accumulate metrics
about packets and bytes sent and received by a group of processes, but
those metrics wouldn't be very useful. You want per-interface metrics
(because traffic happening on the local `lo`
interface doesn't really count). But since processes in a single cgroup
can belong to multiple network namespaces, those metrics would be harder
to interpret: multiple network namespaces mean multiple `lo`
interfaces, potentially multiple `eth0` interfaces, and so on. This is why
there is no easy way to gather network metrics with control groups.

Instead, you can gather network metrics from other sources.

#### iptables

iptables (or rather, the netfilter framework for which iptables is just
an interface) can do some serious accounting.

For instance, you can set up a rule to account for the outbound HTTP
traffic on a web server:

```console
$ iptables -I OUTPUT -p tcp --sport 80
```
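
Because the rule has no `-j` target, it only counts matching packets and lets
them fall through to the next rule. One way to read the accumulated counters
back (exact output formatting varies with your iptables version) is:

```console
$ iptables -nxvL OUTPUT
```

Here `-v` includes the packet and byte counters, `-x` prints them as exact
values rather than rounded ones, and `-n` skips reverse DNS lookups.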