Reserved CPU List, Eviction Thresholds, and Enforcing Node Allocatable

`reservedSystemCPUs` is meant to define an explicit CPU set for OS system daemons and Kubernetes system daemons. It is intended for systems that do not define separate top-level cgroups for OS system daemons and Kubernetes system daemons with regard to the cpuset resource. If the kubelet **does not** have `kubeReservedCgroup` and `systemReservedCgroup` set, the explicit cpuset provided by `reservedSystemCPUs` takes precedence over the CPUs defined by the `kubeReserved` and `systemReserved` options.

This option is specifically designed for Telco/NFV use cases where uncontrolled interrupts/timers may impact workload performance. You can use this option to define the explicit cpuset for the system/Kubernetes daemons as well as the interrupts/timers, so that the remaining CPUs on the system can be used exclusively for workloads, with less impact from uncontrolled interrupts/timers. To move the system daemons, Kubernetes daemons, and interrupts/timers to the explicit cpuset defined by this option, a mechanism outside Kubernetes must be used. For example, on CentOS you can do this using the tuned toolset.

### Eviction Thresholds

**KubeletConfiguration setting**: `evictionHard: {memory.available: "100Mi", nodefs.available: "10%", nodefs.inodesFree: "5%", imagefs.available: "15%"}`. Example value: `{memory.available: "<500Mi"}`

Memory pressure at the node level leads to system OOMs, which affect the entire node and all pods running on it. Nodes can go offline temporarily until memory has been reclaimed. To avoid (or reduce the probability of) system OOMs, the kubelet provides [out of resource](/docs/concepts/scheduling-eviction/node-pressure-eviction/) management. Evictions are supported for `memory` and `ephemeral-storage` only. By reserving some memory via the `evictionHard` setting, the `kubelet` attempts to evict pods whenever memory availability on the node drops below the reserved value.
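
As a minimal sketch of the `evictionHard` setting described above, the following KubeletConfiguration fragment raises the memory threshold to 500Mi. The quantity is written in the same plain form as the defaults quoted above. The other three signals are repeated at those default values on the assumption that signals left out of a customized `evictionHard` may not inherit their defaults; check this against the node pressure eviction page for your Kubernetes version.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  # Illustrative value: start evicting pods when available memory drops below 500Mi.
  memory.available: "500Mi"
  # Repeated at the default values quoted above (assumption: omitted
  # signals may not fall back to their defaults once evictionHard is set).
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
```

With this fragment, the 500Mi of memory held back for evictions is not available to pods.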

Hypothetically, if system daemons did not exist on a node, pods could not use more than `capacity - eviction-hard`. For this reason, resources reserved for evictions are not available for pods.

### Enforcing Node Allocatable

**KubeletConfiguration setting**: `enforceNodeAllocatable: [pods]`. Example value: `[pods,system-reserved,kube-reserved]`

The scheduler treats 'Allocatable' as the available `capacity` for pods.

`kubelet` enforces 'Allocatable' across pods by default. Enforcement is performed by evicting pods whenever the overall usage across all pods exceeds 'Allocatable'. More details on eviction policy can be found on the [node pressure eviction](/docs/concepts/scheduling-eviction/node-pressure-eviction/) page. This enforcement is controlled by specifying the `pods` value in the KubeletConfiguration setting `enforceNodeAllocatable`.

Optionally, `kubelet` can be made to enforce `kubeReserved` and `systemReserved` by specifying `kube-reserved` & `system-reserved` values in

This section details how to use `reservedSystemCPUs` to define an explicit CPU set for system daemons, particularly in Telco/NFV use cases. It then explains how to configure `evictionHard` to manage memory pressure and avoid system OOMs by evicting pods when memory availability drops below a certain threshold. Finally, it describes how to enforce 'Allocatable' resources across pods using `enforceNodeAllocatable`, including the options to enforce `kubeReserved` and `systemReserved`.
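
The three settings summarized above could be combined in a single kubelet configuration file, roughly as sketched below. The CPU list `0-3` and the 500Mi memory threshold are illustrative assumptions, not recommendations. `enforceNodeAllocatable` is left at `pods` only, because this sketch does not define `kubeReservedCgroup` or `systemReservedCgroup`.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Hypothetical CPU list: reserve CPUs 0-3 for OS and Kubernetes system daemons.
# Moving daemons and interrupts/timers onto these CPUs is done outside Kubernetes.
reservedSystemCPUs: "0-3"
evictionHard:
  # Illustrative memory threshold; the other signals use the defaults quoted earlier.
  memory.available: "500Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
# Enforce 'Allocatable' across pods only (the default).
enforceNodeAllocatable:
- pods
```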