processes in different control groups both read the same file
(ultimately relying on the same blocks on disk), the corresponding
memory charge is split between the control groups. It's nice, but
it also means that when a cgroup is terminated, it can increase the memory
usage of another cgroup, because the cost of those memory pages is no longer
shared.
### CPU metrics: `cpuacct.stat`
Now that we've covered memory metrics, everything else is
simple in comparison. CPU metrics are in the
`cpuacct` controller.
For each container, a pseudo-file `cpuacct.stat` contains the CPU usage
accumulated by the processes of the container, broken down into `user` and
`system` time. The distinction is:
- `user` time is the amount of time a process has direct control of the CPU,
executing process code.
- `system` time is the time the kernel is executing system calls on behalf of
the process.
Those times are expressed in ticks of 1/100th of a second, also called "user
jiffies". There are `USER_HZ` _"jiffies"_ per second, and on x86 systems,
`USER_HZ` is 100. Historically, this mapped exactly to the number of scheduler
"ticks" per second, but higher frequency scheduling and
[tickless kernels](https://lwn.net/Articles/549580/) have made the number of
ticks irrelevant.
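For example, assuming the cgroup v1 hierarchy is mounted under `/sys/fs/cgroup`
and Docker is using the `cgroupfs` driver (the exact path depends on your
system and cgroup driver), you could read the counters of a container named
`my_container` (a placeholder) like this; the output values are illustrative:

```console
$ CONTAINER_ID=$(docker inspect --format '{{.Id}}' my_container)
$ cat /sys/fs/cgroup/cpuacct/docker/$CONTAINER_ID/cpuacct.stat
user 2327
system 462
```

Dividing each value by `USER_HZ` (100 on x86) gives the cumulative CPU time in
seconds: here, about 23 seconds of user time and about 4.6 seconds of system
time.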
### Block I/O metrics
Block I/O is accounted in the `blkio` controller.
Different metrics are scattered across different files. While you can
find in-depth details in the [blkio-controller](https://www.kernel.org/doc/Documentation/cgroup-v1/blkio-controller.txt)
file in the kernel documentation, here is a short list of the most
relevant ones:
`blkio.sectors`
: Contains the number of 512-byte sectors read and written by the processes
that are members of the cgroup, device by device. Reads and writes are merged
in a single counter.
`blkio.io_service_bytes`
: Indicates the number of bytes read and written by the cgroup. It has 4
counters per device, because for each device, it differentiates between
synchronous vs. asynchronous I/O, and reads vs. writes (see the sample read
after this list).
`blkio.io_serviced`
: The number of I/O operations performed, regardless of their size. It also has
4 counters per device.
`blkio.io_queued`
: Indicates the number of I/O operations currently queued for this cgroup. In
other words, if the cgroup isn't doing any I/O, this is zero. The opposite is
not true: if there is no I/O queued, it doesn't mean that the cgroup is idle
(I/O-wise). It could be doing purely synchronous reads on an otherwise
quiescent device, which can therefore handle them immediately, without queuing.
Also, while it's helpful to figure out which cgroup is putting stress on the
I/O subsystem, keep in mind that it's a relative quantity. Even if a process
group doesn't perform more I/O, its queue size can increase just because the
device load increases due to other processes.
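As a rough illustration, and assuming the same path conventions as in the CPU
example above (again, paths vary with your cgroup driver), reading
`blkio.io_service_bytes` for a container could produce output along these
lines, with illustrative byte counts:

```console
$ cat /sys/fs/cgroup/blkio/docker/$CONTAINER_ID/blkio.io_service_bytes
8:0 Read 21913600
8:0 Write 4096
8:0 Sync 21917696
8:0 Async 0
8:0 Total 21917696
Total 21917696
```

Here `8:0` is the `major:minor` number of the block device (`/dev/sda` on most
systems); the four per-device counters described above appear as the `Read`,
`Write`, `Sync`, and `Async` lines, with `Total` summing them up.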
### Network metrics
Network metrics aren't exposed directly by control groups. There is a
good explanation for that: network interfaces exist within the context
of _network namespaces_. The kernel could probably accumulate metrics
about packets and bytes sent and received by a group of processes, but
those metrics wouldn't be very useful. You want per-interface metrics
(because traffic happening on the local `lo`
interface doesn't really count). But since processes in a single cgroup
can belong to multiple network namespaces, those metrics would be harder
to interpret: multiple network namespaces means multiple `lo`
interfaces, potentially multiple `eth0`
interfaces, and so on. This is why there is no easy way to gather network
metrics with control groups.
Instead you can gather network metrics from other sources.
#### iptables
iptables (or rather, the netfilter framework for which iptables is just
an interface) can do some serious accounting.
For instance, you can set up a rule to account for the outbound HTTP
traffic on a web server:
```console
$ iptables -I OUTPUT -p tcp --sport 80