StorPool and Cgroups

Machines that run StorPool have a storpool.slice cgroup, in which all StorPool core services run. On a properly configured machine, every slice has a memory limit and a memory+swap limit configured. This guarantees two things:

  • The kernel will have enough memory to run properly (explained later).

  • The storage system (StorPool) will not suffer from an out-of-memory (OOM) situation caused by other software running on the machine.

The cpuset controller in the storpool.slice is used to:

  • Dedicate CPUs only for StorPool (that other slices do not have access to).

  • Pin the StorPool services to the dedicated CPUs in a specific manner to optimize performance.

Cgroup configuration

To make Cgroup configurations persistent across reboots, create your desired configuration files in /etc/cgconfig.d/. When the machine boots, the cgconfig service runs and applies the configuration using the cgconfigparser tool. The service and the tool are part of the base Linux installation.
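As a rough illustration of what such a configuration file contains, the sketch below renders one group stanza in the cgconfig.conf syntax. The helper name render_group and the limit value are hypothetical and chosen only for the example; in practice the files are expected to come from storpool_cg rather than from a script like this.

```python
def render_group(name, controllers):
    """Render one cgconfig 'group' stanza as text.

    controllers maps a controller name (for example "memory") to a dict
    of parameter -> value. Every value is written in double quotes.
    """
    lines = [f"group {name} {{"]
    for controller, params in controllers.items():
        lines.append(f"    {controller} {{")
        for key, value in params.items():
            lines.append(f'        {key} = "{value}";')
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical 8 GiB limit, used for illustration only.
limit = str(8 * 1024 ** 3)
stanza = render_group(
    "storpool.slice",
    {
        "memory": {
            "memory.limit_in_bytes": limit,
            "memory.memsw.limit_in_bytes": limit,
            "memory.swappiness": "0",
        }
    },
)
print(stanza)
```

The output could be saved as a file under /etc/cgconfig.d/, but it should be reviewed before it is applied to a machine.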

Note

Writing the configuration files for cgconfig by hand is difficult and error-prone. It is recommended to generate them using the storpool_cg utility provided by StorPool; for details, see Introduction to storpool_cg.

Remember the following:

  • After you create the configuration files you need to apply the configuration to the machine. Do this by restarting the cgconfig service, or by parsing the configuration files with cgconfigparser.

  • Restarting the cgconfig service could delete cgroups and leak (potentially all) running processes into the root cgroup!

    This is dangerous, as resource limitations will no longer be applied, and it might be hard to recover to a desired state.

    If you are trying to migrate a cgroups configuration on a StorPool node, check the storpool_cg migration options described in Configuration options and parameters and Updating already configured machines.

Memory configuration

For the root cgroup, memory.use_hierarchy should be set to 1, so that a hierarchical memory model is used for the cgroups.

For all slices:

  • memory.move_charge_at_immigrate should be set to at least 1, and for the storpool.slice it should be set to 3.

  • memory.limit_in_bytes and memory.memsw.limit_in_bytes should be set to the same appropriate value.

For the storpool.slice, memory.swappiness should be set to 0.
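The per-slice rules above can be expressed as a small checker. This is a minimal sketch that works on plain dictionaries with integer values; the function name check_memory_settings and the idea of collecting the settings into dictionaries first are assumptions made for the example, not part of the StorPool tooling.

```python
STORPOOL_SLICE = "storpool.slice"


def check_memory_settings(slices):
    """Return a list of rule violations (empty when all rules hold).

    slices maps a slice name to a dict of cgroup memory parameters
    whose values are integers.
    """
    problems = []
    for name, params in slices.items():
        is_storpool = name == STORPOOL_SLICE
        move_charge = params["memory.move_charge_at_immigrate"]
        if is_storpool and move_charge != 3:
            problems.append(f"{name}: move_charge_at_immigrate should be 3")
        elif not is_storpool and move_charge < 1:
            problems.append(f"{name}: move_charge_at_immigrate should be at least 1")
        # The memory limit and the memory+swap limit must match.
        if params["memory.limit_in_bytes"] != params["memory.memsw.limit_in_bytes"]:
            problems.append(f"{name}: memory and memory+swap limits differ")
        if is_storpool and params.get("memory.swappiness") != 0:
            problems.append(f"{name}: swappiness should be 0")
    return problems
```

An empty result means the supplied settings satisfy the rules listed in this section; it says nothing about whether the limit values themselves are appropriate for the machine.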

Note

To ensure enough memory for the kernel, the sum of the memory limits of all slices should be at least 1G short of the total machine memory as reported by the kernel.
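The arithmetic behind this note can be sketched as follows. The example assumes that "1G" means 1 GiB and that the total memory is taken from the MemTotal line of /proc/meminfo (reported in kB); the function names and the slice limits are illustrative only.

```python
GIB = 1024 ** 3


def mem_total_bytes(meminfo_text):
    """Extract MemTotal (reported in kB) from /proc/meminfo text, in bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) * 1024
    raise ValueError("MemTotal not found")


def kernel_headroom(total_bytes, slice_limits):
    """Memory left for the kernel after all slice limits are subtracted."""
    return total_bytes - sum(slice_limits.values())


def has_enough_headroom(total_bytes, slice_limits, reserve=GIB):
    """True when at least `reserve` bytes stay outside all slice limits."""
    return kernel_headroom(total_bytes, slice_limits) >= reserve


# Illustrative machine with 32 GiB of memory and made-up slice limits.
sample_meminfo = "MemTotal:       33554432 kB\nMemFree:        1024 kB\n"
limits = {
    "storpool.slice": 8 * GIB,
    "system.slice": 4 * GIB,
    "user.slice": 2 * GIB,
    "machine.slice": 17 * GIB,
}
total = mem_total_bytes(sample_meminfo)
print(has_enough_headroom(total, limits))
```

On a real machine the text would be read from /proc/meminfo and the limits from the actual cgroup configuration.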

memory:storpool.slice has two memory subslices: common and alloc. The storpool.slice/alloc subslice limits the memory usage of the mgmt, iscsi and bridge services, while the storpool.slice/common subslice is used for everything else. Memory limits should be configured for both subslices, and their sum should be equal to the storpool.slice memory limit.
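Because the two subslice limits must add up to the storpool.slice limit, choosing one of them fixes the other. A minimal sketch, with a hypothetical helper name and made-up values:

```python
GIB = 1024 ** 3


def common_limit(storpool_limit, alloc_limit):
    """Limit for storpool.slice/common so that alloc + common == storpool.slice."""
    if not 0 < alloc_limit < storpool_limit:
        raise ValueError("alloc limit must be positive and below the storpool.slice limit")
    return storpool_limit - alloc_limit


# Illustrative values: an 8 GiB storpool.slice with 2 GiB for alloc.
print(common_limit(8 * GIB, 2 * GIB) // GIB)
```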

Cpuset configuration

Dedicated CPUs for StorPool should be set in cpuset.cpus of storpool.slice. The rest of the CPUs should be used for all other slices’ cpuset.cpus. The cpuset.cpu_exclusive flag in storpool.slice should be set to 1 to ensure that the other slices cannot use the CPUs dedicated for StorPool.
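The split between StorPool CPUs and the CPUs left for the other slices can be computed from the usual cpuset list notation (for example 0-3,8). The sketch below is only an illustration; the helper names and the CPU numbers are assumptions made for the example.

```python
def parse_cpu_list(text):
    """Parse a cpuset list such as "0-3,8" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def format_cpu_list(cpus):
    """Format a set of CPU numbers back into cpuset list notation."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu  # extend the current range
        else:
            ranges.append([cpu, cpu])  # start a new range
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def other_slices_cpus(all_cpus, storpool_cpus):
    """CPUs that remain for the non-StorPool slices."""
    return format_cpu_list(parse_cpu_list(all_cpus) - parse_cpu_list(storpool_cpus))


# Illustrative 8-CPU machine with CPUs 2 and 3 dedicated to StorPool.
print(other_slices_cpus("0-7", "2-3"))
```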

Warning

This can be overridden by setting the SP_CPUS_NOT_EXCLUSIVE parameter when running the storpool_cg conf command (see Configuration parameters). Note that this is not recommended and not supported, and should be used with great caution and only for a very good reason.

cpuset:storpool.slice should have a subslice for each StorPool service running on the machine. So, if you have two servers, beacon, mgmt and block running, there should be storpool.slice/{server,server_1,beacon,mgmt,block} subslices. These subslices are used to assign the services to specific CPUs. For example, setting the cpuset.cpus of storpool.slice/beacon to 2 restricts storpool_beacon to run on CPU#2.

Note that for nodes that do not have hardware-accelerated network cards (see Network interfaces), the storpool.slice also needs to include one additional CPU for the NIC, although there is no nic subslice. That CPU must not be assigned to any of the subslices (it should be left unassigned), as StorPool will use it to process NIC interrupts.
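The pinning rules above can be checked mechanically: every subslice must use only CPUs that belong to storpool.slice, and on nodes without hardware-accelerated NICs at least one storpool.slice CPU must stay out of all subslices. The following is a minimal sketch; the function name is hypothetical, and apart from beacon on CPU 2 the CPU numbers are made up for illustration.

```python
def check_pinning(storpool_cpus, subslices, needs_nic_cpu=False):
    """Return a list of problems found in a cpuset layout (empty if none).

    storpool_cpus is the set of CPUs in storpool.slice; subslices maps
    a subslice name to the set of CPUs assigned to it.
    """
    problems = []
    used = set()
    for name, cpus in subslices.items():
        outside = cpus - storpool_cpus
        if outside:
            problems.append(f"{name}: CPUs {sorted(outside)} are not in storpool.slice")
        used |= cpus
    # Without a hardware-accelerated NIC, one CPU must remain unassigned.
    if needs_nic_cpu and not (storpool_cpus - used):
        problems.append("no unassigned CPU left in storpool.slice for NIC interrupts")
    return problems


layout = {"server": {3}, "server_1": {4}, "beacon": {2}, "mgmt": {5}, "block": {6}}
# CPU 1 is in storpool.slice but in no subslice, so it is free for the NIC.
print(check_pinning({1, 2, 3, 4, 5, 6}, layout, needs_nic_cpu=True))
```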

For storpool.slice and each of its subslices, cpuset.mems should be set to all available NUMA nodes on the machine (for example, 0-3 on a machine with 4 NUMA nodes).
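One way to obtain the list of NUMA nodes is the kernel's /sys/devices/system/node/online file, which already uses the list notation expected by cpuset.mems. The sketch below assumes that file is the desired source; the function names and the group list are illustrative.

```python
from pathlib import Path


def online_numa_nodes(path="/sys/devices/system/node/online"):
    """Return the online NUMA nodes in list notation, for example "0-3"."""
    return Path(path).read_text().strip()


def cpuset_mems_for(groups, mems):
    """Map every given cgroup name to the same cpuset.mems value."""
    return {group: mems for group in groups}


# Typical use on a Linux machine (reads sysfs):
#   mems = online_numa_nodes()
#   cpuset_mems_for(["storpool.slice", "storpool.slice/beacon"], mems)
```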