StorPool and Cgroups
Machines that run StorPool have a storpool.slice, where all StorPool core
services run. On a properly configured machine, all slices have configured
memory (and memory+swap) limits. This is done to guarantee two things:
The kernel will have enough memory to run properly (explained later).
The storage system (StorPool) will not suffer from an out-of-memory (OOM) situation caused by other software running on the machine.
The cpuset controller in the storpool.slice is used to:
Dedicate CPUs to StorPool that other slices do not have access to.
Pin the StorPool services to the dedicated CPUs in a specific manner to optimize performance.
Cgroup configuration
To make Cgroup configurations persistent across reboots, create your desired
configuration files in /etc/cgconfig.d/. When the machine boots, the cgconfig
service runs and applies the configuration using the cgconfigparser tool. The
service and the tool are part of the base Linux installation.
Note
Writing the configuration files for cgconfig by hand can be difficult. It is
recommended to generate them using the storpool_cg utility provided by
StorPool; for details, see Introduction to storpool_cg.
Remember the following:
After you create the configuration files, you need to apply the configuration to the machine. Do this by restarting the cgconfig service, or by parsing the configuration files with cgconfigparser.
Restarting the cgconfig service could result in cgroups deletion and leaking (potentially all) running processes to the root cgroup! This is dangerous, as resource limitations will no longer be applied, and it might be hard to recover to a desired state.
If you are trying to migrate a cgroups configuration on a StorPool node, check the storpool_cg migration options described in Configuration options and parameters and Updating already configured machines.
Memory configuration
For the root cgroup, memory.use_hierarchy should be set to 1, so that a
hierarchical memory model is used for the cgroups.
For all slices:
memory.move_charge_at_immigrate should be set to at least 1, and for the storpool.slice it should be set to 3.
memory.limit_in_bytes and memory.memsw.limit_in_bytes should be set to the same appropriate value.
For the storpool.slice, memory.swappiness should be set to 0.
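The memory settings above can be spot-checked on a running node. The sketch below is only an illustration: it assumes the cgroup v1 memory controller is mounted at a path such as /sys/fs/cgroup/memory and that swap accounting is enabled (otherwise memory.memsw.limit_in_bytes does not exist). The helper names read_int and check_memory_settings are hypothetical and are not part of StorPool or storpool_cg.

```python
from pathlib import Path


def read_int(path):
    """Read a single integer value from a cgroup control file."""
    return int(Path(path).read_text().strip())


def check_memory_settings(mem_root, slices):
    """Return a list of problems found; an empty list means all checks passed.

    mem_root: mount point of the cgroup v1 memory controller (assumed layout).
    slices:   names of the slice cgroups to check, e.g. "storpool.slice".
    """
    root = Path(mem_root)
    problems = []

    # The root cgroup must use the hierarchical memory model.
    if read_int(root / "memory.use_hierarchy") != 1:
        problems.append("root: memory.use_hierarchy should be 1")

    for name in slices:
        cg = root / name
        is_storpool = name == "storpool.slice"

        # storpool.slice needs 3; every other slice needs at least 1.
        charge = read_int(cg / "memory.move_charge_at_immigrate")
        if is_storpool and charge != 3:
            problems.append(f"{name}: memory.move_charge_at_immigrate should be 3")
        elif not is_storpool and charge < 1:
            problems.append(f"{name}: memory.move_charge_at_immigrate should be >= 1")

        # The memory and memory+swap limits must be identical.
        limit = read_int(cg / "memory.limit_in_bytes")
        memsw = read_int(cg / "memory.memsw.limit_in_bytes")
        if limit != memsw:
            problems.append(f"{name}: memory and memory+swap limits differ")

        # Only storpool.slice is required to have swappiness 0.
        if is_storpool and read_int(cg / "memory.swappiness") != 0:
            problems.append(f"{name}: memory.swappiness should be 0")

    return problems
```

For example, check_memory_settings("/sys/fs/cgroup/memory", ["storpool.slice", "system.slice", "user.slice"]) returns an empty list when every rule holds. The slice names other than storpool.slice are examples and depend on the machine.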
Note
To ensure enough memory for the kernel, the sum of the memory limits of all slices should be at least 1G short of the total machine memory as reported by the kernel.
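The rule in this note is simple arithmetic, sketched below. The function names are hypothetical, "1G" is interpreted here as 1 GiB (an assumption), and the slice names and sizes in the example are invented for illustration.

```python
GIB = 1024 ** 3


def kernel_headroom(total_bytes, slice_limits):
    """Memory left outside all slice limits (negative if over-committed)."""
    return total_bytes - sum(slice_limits.values())


def leaves_enough_for_kernel(total_bytes, slice_limits, reserve=GIB):
    """True if the slice limits leave at least `reserve` bytes for the kernel."""
    return kernel_headroom(total_bytes, slice_limits) >= reserve


# Illustrative numbers only: a 64 GiB machine with four slices.
example_limits = {
    "storpool.slice": 8 * GIB,
    "system.slice": 4 * GIB,
    "user.slice": 2 * GIB,
    "machine.slice": 48 * GIB,
}
example_ok = leaves_enough_for_kernel(64 * GIB, example_limits)
```

With these numbers the limits add up to 62 GiB, so 2 GiB remain for the kernel and the check passes. Raising machine.slice to 50 GiB would leave no headroom and the check would fail.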
memory:storpool.slice has two memory subslices: common and alloc.
The storpool.slice/alloc subslice is used to limit the memory usage of the
mgmt, iscsi and bridge services, while the storpool.slice/common subslice is
for everything else. Their memory limits should also be configured, and their
sum should be equal to the storpool.slice memory limit.
Cpuset configuration
Dedicated CPUs for StorPool should be set in cpuset.cpus of storpool.slice.
The rest of the CPUs should be used for all other slices’ cpuset.cpus. The
cpuset.cpu_exclusive flag in storpool.slice should be set to 1 to ensure that
the other slices cannot use the CPUs dedicated for StorPool.
Warning
This can be overridden by setting the SP_CPUS_NOT_EXCLUSIVE parameter when
running the storpool_cg conf command (see Configuration parameters). Note that
this is not recommended and not supported, and should be used with great
caution and only for a very good reason.
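The split between the CPUs dedicated to StorPool and the CPUs left for the other slices can be worked out mechanically. The sketch below is an illustration with hypothetical helper names. It handles the common comma-and-range form of cpuset.cpus (for example 0-3,8) and does not handle any other list syntax.

```python
def parse_cpu_list(text):
    """Expand a cpuset list such as "0-3,8" into a set of CPU numbers."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpu_list(cpus):
    """Collapse a set of CPU numbers back into cpuset list syntax."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu  # extend the current run
        else:
            ranges.append([cpu, cpu])  # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def split_cpus(all_cpus, storpool_cpus):
    """Return (cpuset.cpus for storpool.slice, cpuset.cpus for other slices)."""
    everything = parse_cpu_list(all_cpus)
    dedicated = parse_cpu_list(storpool_cpus)
    if not dedicated <= everything:
        raise ValueError("StorPool CPUs must exist on the machine")
    return format_cpu_list(dedicated), format_cpu_list(everything - dedicated)
```

For a 16-CPU machine where CPUs 1-3 and 9-11 are dedicated to StorPool, split_cpus("0-15", "1-3,9-11") gives "1-3,9-11" for storpool.slice and "0,4-8,12-15" for every other slice. The CPU numbers are made up for the example.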
cpuset:storpool.slice should have a subslice for each running StorPool
service on the machine. So, if you have two servers, beacon, mgmt and block
running, there should be storpool.slice/{server,server_1,beacon,mgmt,block}
subslices. These are used to assign the services to specific CPUs. This is
achieved by, for example, setting the cpuset.cpus of the storpool.slice/beacon
to 2. This will restrict the storpool_beacon to run on CPU#2.
Note that for nodes that do not have hardware-accelerated network cards (see
Network interfaces), the storpool.slice will also need to include one
additional CPU for the NIC, but there is no nic subslice. That CPU must not be
assigned to any of the subslices (it should be left unassigned), as StorPool
will assign the CPU to process NIC interrupts.
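A layout like the one described above can be sanity-checked with a few set operations. The sketch below is illustrative: check_pinning is a hypothetical helper, and the CPU numbers in the example are invented (only the beacon on CPU 2 comes from the text). It reports subslice CPUs that fall outside storpool.slice and lists the CPUs of the slice that no subslice uses, which is where the extra NIC CPU would show up on a node without hardware-accelerated network cards.

```python
def check_pinning(slice_cpus, subslice_cpus):
    """Validate subslice CPU assignments against the parent slice.

    slice_cpus:    set of CPU numbers in storpool.slice's cpuset.cpus.
    subslice_cpus: dict mapping subslice name to its set of CPU numbers.
    Returns (problems, unassigned): a list of messages and the set of
    slice CPUs that are not used by any subslice.
    """
    problems = []
    assigned = set()
    for name, cpus in subslice_cpus.items():
        outside = cpus - slice_cpus
        if outside:
            problems.append(
                f"{name}: CPUs {sorted(outside)} are not in storpool.slice"
            )
        assigned |= cpus
    return problems, slice_cpus - assigned


# Hypothetical layout: CPUs 1-6 dedicated to StorPool, CPU 6 left for the NIC.
example_problems, example_unassigned = check_pinning(
    {1, 2, 3, 4, 5, 6},
    {
        "server": {3},
        "server_1": {4},
        "beacon": {2},
        "mgmt": {5},
        "block": {1},
    },
)
```

In this example there are no problems and CPU 6 is the only unassigned CPU, so it is the one available for NIC interrupt processing.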
For storpool.slice and each of its subslices, cpuset.mems should be set to
all available NUMA nodes on the machine (for example, 0-3 on a machine with 4
NUMA nodes).