Enable TRIM/Discard operations for KVM-based clouds
By issuing TRIM/Discard commands (trims), the operating system notifies the underlying storage device which data is marked as deleted in the file system and can be effectively erased from the physical media. TRIM in the ATA command set (known as UNMAP in the SCSI command set) is well known to storage vendors and has been in the Linux kernel for a long time. Supported in the most used Linux file systems (ext4, XFS, Btrfs, and so on), these commands help reduce SSD drive wear and prevent progressive performance degradation of write operations in long-term use.
This document covers all steps needed to configure and enable trims inside virtual machines (VMs) and ensure these commands are propagated to the storage system. The steps are for virtio-scsi (/dev/sd*) and virtio-blk (/dev/vd*) virtual disks.
Verify the storage solution
The first step is ensuring the chosen storage solution supports trim operations on its provided virtual drives. On a Linux host with an attached virtual drive, check the values of DISC-GRAN and DISC-MAX in the output of the lsblk --discard command. In a StorPool cluster with an attached volume (volume-1) on a client, issue the following command:
# lsblk --discard /dev/storpool/volume-1
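On a volume with trim support, the output will look similar to this (illustrative values; the actual numbers depend on the storage system):
NAME     DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
volume-1        0        4K       1M         0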
If the values in the DISC-GRAN and DISC-MAX columns are non-zero (e.g. 4K and 1M), trims are supported. If the result contains only zeroes, there is no trim support.
Please consult StorPool support if you don’t see TRIM enabled at this stage.
Virtualization stack
Since version 1.5 of qemu, trims for virtio-scsi devices are supported and can be enabled. For virtio-blk, this capability was included in qemu 4.0. To check the version of the used qemu, use:
For RHEL and its derivatives:
# /usr/libexec/qemu-kvm --version
For Ubuntu:
# /usr/bin/qemu-system-x86_64 --version
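The output will look similar to this (illustrative; the exact version string and packaging suffix differ between distributions):
QEMU emulator version 4.2.0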
If your host’s OS is relatively new (RHEL 8.x or Ubuntu 20.x), trims are supported for both virtio-scsi and virtio-blk devices. Otherwise, it is probably time to consider OS upgrades.
Also note that for virtio-blk, TRIM is supported only for q35-type virtual machines.
Virtual machine type
From qemu version 4.0, trims on virtio-blk devices are available. But there is one catch: they work only for VMs whose type is q35. With the older i440fx type, trims are available only for virtio-scsi drives.
To check the VM type, use:
# virsh qemu-monitor-command $VM_NAME --hmp info qom-tree | head -n 1
The output will be something like this:
/machine (pc-q35-rhel8.2.0-machine)
or
/machine (pc-i440fx-2.11-machine)
An alternative way is to check the running VM’s XML definition in libvirt, if the orchestration uses it. For this approach, the xmllint command is needed, which is provided by the libxml2 package (in RHEL) or by libxml2-utils (in Ubuntu). To get the VM type:
# virsh dumpxml $VM_NAME | xmllint --xpath 'string(/domain/os/type/@machine)' - ; echo
And the output will be something like this:
pc-q35-rhel8.2.0
or
pc-i440fx-2.11
Virtual drives discard
The next step is verifying that the VM is configured to propagate the trims to the storage system. First, let’s check the VM’s definition for the necessary parameters. Get the definition by:
# virsh dumpxml $VM_NAME | less
For each disk with type='block', ensure that there is discard='unmap' in its driver configuration. For virtio-blk, it should look like this:
...
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none' io='native' discard='unmap'/>
<target dev='vda' bus='virtio'/>
<alias name='virtio-disk0'/>
...
For virtio-scsi, it should look like this:
...
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none' io='native' discard='unmap'/>
<target dev='sda' bus='scsi'/>
<alias name='scsi0-0-0-0'/>
...
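If discard='unmap' is missing from a disk's driver element, one way to add it (a sketch; in orchestrated environments the setting usually belongs in the orchestration's disk configuration instead) is to edit the persistent domain definition and then power-cycle the VM so the change takes effect:
# virsh edit $VM_NAME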
Then, check that these settings are actually in effect, using qemu-monitor-command and the alias of the device. For virtio-blk, the command is:
# virsh qemu-monitor-command $VM_NAME --cmd '{"execute": "qom-get", "arguments": { "path": "/machine/peripheral/virtio-disk0", "property": "discard" }}'
If trims are enabled, it will return something like:
{"return": true, "id": "libvirt-2872196"}
For virtio-scsi devices, check discard_granularity with:
# virsh qemu-monitor-command $VM_NAME --cmd '{"execute": "qom-get", "arguments": { "path": "/machine/peripheral/scsi0-0-0-0", "property": "discard_granularity" }}'
{"return":4096, "id": "libvirt-2872649"}
If the value of return is bigger than 0, trims are available.
You can also check the command line of a running VM:
# ps axww | grep $VM_NAME
2972318 ? Sl 302:11 /usr/bin/qemu-kvm -name guest=$VM_NAME,debug-threads=on … -blockdev {"driver": "host_device", … "discard": "unmap"} … -blockdev {"driver": "host_device", … "discard": "unmap"} …
Guest configuration
Trim operations have been supported in the virtio-scsi kernel module for a long time, but they appeared in the virtio-blk kernel module only relatively recently, introduced in kernel version 4.20-rc7 (see virtio-blk: modernize sysfs attribute creation). Use an appropriate version of the guest kernel to take advantage of this feature. Ubuntu 18.04 LTS offers kernel 5.4, the same version as in 20.x LTS. RHEL and its derivatives, like AlmaLinux, added discard support from version 8.1, with plans to backport it to RHEL 7, too (Solution 4780831).
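To confirm which kernel version the guest actually runs, check from inside the VM (the output below is illustrative):
# uname -r
5.4.0-26-generic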
After the OS is chosen and installed, there are a few more steps to ensure that trims are available inside the VM. First, check the discard parameters of the virtual drives:
# lsblk --discard
If the DISC-GRAN and DISC-MAX columns are not 0, the device is ready for trimming. Test it by:
# fstrim --verbose --all
/: 5.4 GiB (5793656832 bytes) trimmed on /dev/vda1
Output like this means that everything works as expected.
The last step for the administrator is to configure how trim operations will happen.
The first option is for the OS to issue a trim command after each delete. If that is the chosen strategy, add discard to the mount options of the device or partition in /etc/fstab:
UUID=f41e390f-835b-4223-a9bb-9b45984ddf8d / xfs defaults,discard 0 0
Remount with these options, and everything is done. However, observe how the VM behaves for a while, especially with frequent deletions: because the OS issues a trim command after each delete, there is a chance of impact on the whole VM performance.
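For example, the root file system can be remounted in place so the new option takes effect without a reboot (a sketch; it assumes /etc/fstab was already updated as above):
# mount -o remount /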
An alternative approach is to execute fstrim periodically, once a day or even once a week, depending on the intensity of delete operations. If this is the chosen method, there is a systemd timer for that. Enable and start it:
# systemctl enable --now fstrim.timer
And the job is done.
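As an optional check, confirm that the timer is active and see when it will fire next:
# systemctl list-timers fstrim.timer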
Note
There is one thing that needs consideration in this approach: if fstrim.timer runs in multiple virtual machines, the administrator should spread the execution times evenly. Otherwise, if all fstrim.timer units run on the same day, at the same time, then depending on the amount of trimmed data there could be peaks in storage system utilization which impact the performance.
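One way to spread the executions, as a sketch, is a drop-in override in each VM that adds a random delay to the timer (the six-hour window here is an arbitrary example):
# systemctl edit fstrim.timer
Then add in the editor:
[Timer]
RandomizedDelaySec=6h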
Note
For StorPool’s internal systems, we utilize both the discard option in /etc/fstab and a periodic fstrim, as discard alone does seem to miss some places.
Note
Also note that on ext4, fstrim discards only regions that were not previously discarded, but on xfs it discards all the empty space of the partition. Expect fstrim to be slower on xfs.
Hints
- If the file system is placed on an LVM volume, find the option issue_discards in /etc/lvm/lvm.conf and set it to 1. That tells LVM to send discards to an LV’s underlying physical volumes when the LV no longer uses the physical volumes’ space, e.g. on lvremove or lvreduce.
- If the file system is on a LUKS-encrypted partition, add the option discard for that partition in /etc/crypttab. Edit and save the file, reboot, and all is done (see the example lines after this list).
- If qemu-agent is installed and running inside the VM, and trims are allowed, the administrator can issue the fstrim commands from the hypervisor:
# virsh domfstrim --domain $VM_NAME
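For reference, the LVM and LUKS hints above might translate to lines like these (a sketch; the mapping name and UUID are only examples):
/etc/lvm/lvm.conf:  issue_discards = 1
/etc/crypttab:      luks-root UUID=f41e390f-835b-4223-a9bb-9b45984ddf8d none luks,discard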