Volume overrides

Volume disk set overrides (volume overrides for short) replace the target disk sets for particular object IDs of a volume with disks different from the ones inherited from a parent snapshot or assigned by the allocator.

About

This feature is useful when many volumes are created from the same parent snapshot. This is the usual case when a virtual machine template is used to create many virtual machines, all of which end up with the same type of OS root disk. Such volumes usually share the same OS, filesystem type, and behaviour. In the common case, the filesystem journal is overwritten at the same block device offset on all of them. For example, a cron job that runs on all such virtual machines at the same time (an unattended upgrade, for instance) leads to writes to the same object or set of objects, so the same few disks in the cluster end up processing all of these writes. This puts an excessive load on that set of disks and degrades performance, either when the drives start to aggregate all the overwritten data or simply from the extra load.

How to use it

A set of tools can be used to collect metadata for all objects from each disk through the API, and to determine which objects receive the most writes in the cluster. The tools then calculate suitable overrides for these objects. As a result, even under excessive load on these particular offsets in all volumes created from the same parent, the objects end up on different sets of disks instead of the ones inherited from the parent snapshot in the original virtual machine template.
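The selection step described above can be illustrated with a minimal sketch. It is not the actual implementation of the StorPool tooling: the input format (one "<object-id> <write-count>" pair per line) and the function name top_overwritten are hypothetical and chosen only for illustration.

```shell
# Hypothetical input on stdin: one "<object-id> <write-count>" pair per line.
# Print the IDs of the N most overwritten objects, busiest first.
top_overwritten() {
    n=$1
    # Sort by write count (descending), break ties by object ID (ascending),
    # keep the first N lines, and print only the object IDs.
    sort -k2,2nr -k1,1n | head -n "$n" | awk '{ print $1 }'
}

# Example: four objects, pick the two with the most writes.
printf '%s\n' '0 12' '1 950' '2 40' '3 7' | top_overwritten 2
```

With the sample data, objects 1 and 2 are selected. These would be the candidates for an override.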

The tooling looks for a template called overrides, whose placeTail parameter specifies the target placement group for the disks that replace those of the most overwritten objects. For example, if a cluster has one template with hybrid placement (one or more replicas on HDDs and the tail on SSD or NVMe drives), the override would have to be the SSD or NVMe placement group. An example:

-------------------------------------------------------------------------------------------------
| template  | size  | rdnd. | placeHead | placeAll | placeTail | iops  |  bw   | parent | flags |
-------------------------------------------------------------------------------------------------
| hybrid    |     - |     3 | hdd       | hdd      | ssd       |     - |     - |        |       |
| overrides |     - |     - | default   | default  | ssd       |     - |     - |        |       |
-------------------------------------------------------------------------------------------------

Multiple templates use the placeTail specification of the same overrides template. For example, in a cluster with an SSD-only template and an HDD-only template, the drives for the most overwritten objects are overridden with SSD disks.

The tool that collects the data and computes the overrides is /usr/lib/storpool/collect_override_data. The resulting overrides.json file can be loaded as follows.

To load the overrides:

# storpool balancer override add-from-file ./overrides.json

To see how the data will be redistributed:

# storpool balancer disks

To commit the overrides so that the data is actually redistributed:

# storpool balancer commit
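The three steps above can be wrapped in a small script, sketched here under some assumptions. The helper names run, stage_overrides, and commit_overrides, and the DRY_RUN variable, are hypothetical and not part of StorPool; only the storpool balancer commands themselves come from this section. With DRY_RUN=1 the commands are printed instead of executed, which allows the sequence to be checked on a machine without a cluster.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Load the overrides from a file and show the planned redistribution.
stage_overrides() {
    run storpool balancer override add-from-file "${1:-./overrides.json}" &&
    run storpool balancer disks
}

# Apply the loaded overrides. Call this only after reviewing the output
# of stage_overrides.
commit_overrides() {
    run storpool balancer commit
}

# Dry run: only prints the two staging commands.
DRY_RUN=1
stage_overrides ./overrides.json
```

Keeping the commit in a separate function leaves room to inspect the output of storpool balancer disks before any data is moved.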

Note that once overrides are loaded, they are recalculated on future rebalancing operations. For more information about balancing, see 18. Rebalancing the cluster.

Note

As of 19.3 revision 19.01.2268.656ce3b10, loaded overrides are visible with storpool balancer disks and require a storpool balancer commit to be applied.

By default, the 9600 most overwritten objects are overridden, which corresponds to 300 GiB of virtual space. A different number can be set with the MAX_OBJ_COUNT environment variable of the collect_override_data tool.
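The relation between the object count and the virtual space can be worked out from the default figures: 300 GiB divided by 9600 objects gives 32 MiB per object. The sketch below uses that derived value to estimate the space covered by other object counts. The function name override_space_gib is hypothetical, and the 32 MiB object size is an inference from the numbers above, not a value stated in this section.

```shell
# Object size implied by the defaults: 300 GiB / 9600 objects = 32 MiB.
OBJ_SIZE_MIB=32

# Print the virtual space in GiB covered by the given number of objects.
override_space_gib() {
    echo $(( $1 * OBJ_SIZE_MIB / 1024 ))
}

override_space_gib 9600   # default object count, prints 300
override_space_gib 4800   # half the default, prints 150
```

To limit the overrides to 150 GiB of virtual space, for example, the tool could be started with MAX_OBJ_COUNT=4800 set in its environment. Any arguments the tool itself requires are not covered here.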

History

This feature is available starting with release 19.1 revision 19.01.1511.0b533fb.