OnApp LVM to LVMSP disk migration

There is a helper tool named lvm2lvmsp.sh that automates the creation of the StorPool volumes. It requires the virsh domain ID or name of a given VM as its argument. Optionally, all running VMs can be queried and processed by supplying the -a argument; in this case the domain list is ignored.

/usr/local/lvmsp/lvm2lvmsp.sh [-v] [-y] [-c compute] <-a|virsh_domain_id_or_name>

If LVMSP is not installed on the initial Compute node during the Preparation phase, the remote Compute node can be queried from a host with LVMSP installed by using the -c compute_host argument. The -v argument enables slightly more verbose messages.

By default the tool only shows the commands that would be run. To actually apply the changes, the -y argument must be set.
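
For example, a dry run for all running VMs, followed by actually applying the changes for a single VM with verbose messages, could look like this (the domain name is the one used in the examples below):

# dry run: only print the commands that would be executed, for every running VM
/usr/local/lvmsp/lvm2lvmsp.sh -a

# apply the changes for a single VM, with verbose messages
/usr/local/lvmsp/lvm2lvmsp.sh -v -y xnujnrdugvqdjc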

Limitations

Most vDisk-related operations, such as disk resize, will not work properly while a VM is in the process of migration.

Migrating a VM that is in the process of disk migration will reset the RAID1 synchronization. In that case the raw StorPool volume must be re-attached to the degraded RAID1 array, and the resynchronization will restart.

Preparation

The following one-time configurations should be made in advance:

  1. A StorPool template matching the LVM VG must be created (see the example sketch after this list):

storpool template onapp-dhksnikrhgqrad ...

Note

This enables LVMSP to handle the creation of all new VM disks. Deploying a new vDisk or an entirely new VM will use a separate StorPool volume for each vDisk.

  2. It is recommended to configure LVM to issue TRIM/DISCARD when deleting native LVM logical volumes with /sbin/lvremove:

Hint

Edit /etc/lvm/lvm.conf and set issue_discards = 1 in the devices section.

devices {
      ...
      issue_discards = 1
}

Note

If there are different OS generations in the cluster, the native LVM logical volumes should be removed from a node where this option is available.
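
A possible completion of the storpool template command from step 1 is shown below. This is a minimal sketch only, assuming placement groups named hdd and ssd exist; the actual replication and placement parameters must match the specific StorPool cluster:

# illustrative parameters only - adjust replication and placement groups to the cluster
storpool template onapp-dhksnikrhgqrad replication 3 placeAll hdd placeTail ssd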

Once the above steps are completed, the VM disk migration starts by creating StorPool volumes with the same size as the native LVM logical volumes, tagged with a StorPool volume tag named lvm with value i (lvm=i).

For better control of the migration process, it is recommended to migrate the VMs in batches.

It is recommended to use the provided lvm2lvmsp.sh helper tool:

[root@s18 lvmsp]# ./lvm2lvmsp.sh -y xnujnrdugvqdjc
# [xnujnrdugvqdjc] processing ...
# [xnujnrdugvqdjc] /dev/dm-0 /dev/onapp-dhksnikrhgqrad/gjfisfgsudthkc DMtable {0 10485760 linear 252:4 2048}
# [xnujnrdugvqdjc] Creating volume onapp-dhksnikrhgqrad:gjfisfgsudthkc, size 5368709120 Bytes (5GiB), tag:lvm=i
(0) volume onapp-dhksnikrhgqrad:gjfisfgsudthkc size 5368709120B; set tag lvm=i [xnujnrdugvqdjc]
# [xnujnrdugvqdjc] /dev/dm-1 /dev/onapp-dhksnikrhgqrad/ppeonkmjbrkghn DMtable {0 2097152 linear 252:4 10487808}
# [xnujnrdugvqdjc] Creating volume onapp-dhksnikrhgqrad:ppeonkmjbrkghn, size 1073741824 Bytes (1GiB), tag:lvm=i
(0) volume onapp-dhksnikrhgqrad:ppeonkmjbrkghn size 1073741824B; set tag lvm=i [xnujnrdugvqdjc]
# [xnujnrdugvqdjc] /dev/dm-2 /dev/onapp-dhksnikrhgqrad/hxqdofizwvndsr DMtable {0 209715200 linear 252:4 12584960}
# [xnujnrdugvqdjc] Creating volume onapp-dhksnikrhgqrad:hxqdofizwvndsr, size 107374182400 Bytes (100GiB), tag:lvm=i
(0) volume onapp-dhksnikrhgqrad:hxqdofizwvndsr size 107374182400B; set tag lvm=i [xnujnrdugvqdjc]

Note

It is possible to access a remote host over password-less SSH by using the -c compute_hostname argument.

[root@s20 lvmsp]# ./lvm2lvmsp.sh -y -c s18 xnujnrdugvqdjc
# [xnujnrdugvqdjc] processing on compute s18 ...
# [xnujnrdugvqdjc] on s18 /dev/dm-0 /dev/onapp-dhksnikrhgqrad/gjfisfgsudthkc DMtable {0 10485760 linear 252:4 2048}
# [xnujnrdugvqdjc] Creating volume onapp-dhksnikrhgqrad:gjfisfgsudthkc, size 5368709120 Bytes (5GiB), tag:lvm=i
(0) volume onapp-dhksnikrhgqrad:gjfisfgsudthkc size 5368709120B = 5120M = 5G; set tag lvm=i [xnujnrdugvqdjc]
# [xnujnrdugvqdjc] on s18 /dev/dm-1 /dev/onapp-dhksnikrhgqrad/ppeonkmjbrkghn DMtable {0 2097152 linear 252:4 10487808}
# [xnujnrdugvqdjc] Creating volume onapp-dhksnikrhgqrad:ppeonkmjbrkghn, size 1073741824 Bytes (1GiB), tag:lvm=i
(0) volume onapp-dhksnikrhgqrad:ppeonkmjbrkghn size 1073741824B = 1024M = 1G; set tag lvm=i [xnujnrdugvqdjc]
# [xnujnrdugvqdjc] on s18 /dev/dm-2 /dev/onapp-dhksnikrhgqrad/hxqdofizwvndsr DMtable {0 209715200 linear 252:4 12584960}
# [xnujnrdugvqdjc] Creating volume onapp-dhksnikrhgqrad:hxqdofizwvndsr, size 107374182400 Bytes (100GiB), tag:lvm=i
(0) volume onapp-dhksnikrhgqrad:hxqdofizwvndsr size 107374182400B = 102400M = 100G; set tag lvm=i [xnujnrdugvqdjc]
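
Before proceeding, the newly created volumes can be cross-checked from any node with the StorPool CLI installed (a quick sketch; the pattern is the VG/template name from the examples above):

storpool volume list | grep onapp-dhksnikrhgqrad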

Once the StorPool volumes are ready, the VM must be migrated (or rebooted) via the OnApp CP. For each of the tagged volumes, LVMSP will create a degraded RAID1 array on top of the LVM volume and replace the symlink in /dev/VG/LV.
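
Whether the degraded RAID1 arrays have been created after the migration or reboot can be verified on the compute node with the standard device-mapper tools (a sketch):

# list only the device-mapper tables that use the dm-raid target in raid1 mode
dmsetup table | grep -w raid1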

Data synchronization

Once the VM is running, the previously prepared StorPool volumes must be attached to the RAID1 arrays, at which point the kernel starts syncing the data from the native LVM volumes to the LVMSP volumes.

The lvm2lvmsp.sh tool will detect the RAID1 arrays and attach the raw StorPool volumes:

[root@s20 lvmsp]# ./lvm2lvmsp.sh -v -y xnujnrdugvqdjc
# [xnujnrdugvqdjc] processing ...
# [xnujnrdugvqdjc] /dev/dm-1 /dev/onapp-dhksnikrhgqrad/gjfisfgsudthkc DMtable {0 10485760 raid raid1 3 0 region_size 8192 2 - 253:0 - -}
# [xnujnrdugvqdjc] Attaching onapp-dhksnikrhgqrad:gjfisfgsudthkc as the second disk to RAID1 'gjfisfgsudthkc-raid1'
# dmsetup: suspend, reload, resume, table & status:
0 10485760 raid raid1 3 0 region_size 8192 2 - 253:0 - 252:1
0 10485760 raid raid1 2 aa 0/10485760 resync 0
# [xnujnrdugvqdjc] /dev/dm-3 /dev/onapp-dhksnikrhgqrad/ppeonkmjbrkghn DMtable {0 2097152 raid raid1 3 0 region_size 8192 2 - 253:2 - -}
# [xnujnrdugvqdjc] Attaching onapp-dhksnikrhgqrad:ppeonkmjbrkghn as the second disk to RAID1 'ppeonkmjbrkghn-raid1'
# dmsetup: suspend, reload, resume, table & status:
0 2097152 raid raid1 3 0 region_size 8192 2 - 253:2 - 252:2
0 2097152 raid raid1 2 aa 0/2097152 resync 0
# [xnujnrdugvqdjc] /dev/dm-5 /dev/onapp-dhksnikrhgqrad/hxqdofizwvndsr DMtable {0 209715200 raid raid1 3 0 region_size 8192 2 - 253:4 - -}
# [xnujnrdugvqdjc] Attaching onapp-dhksnikrhgqrad:hxqdofizwvndsr as the second disk to RAID1 'hxqdofizwvndsr-raid1'
# dmsetup: suspend, reload, resume, table & status:
0 209715200 raid raid1 3 0 region_size 8192 2 - 253:4 - 252:3
0 209715200 raid raid1 2 aa 0/209715200 resync 0

Each subsequent call of lvm2lvmsp.sh will display the progress of the synchronization:

[root@s20 lvmsp]# ./lvm2lvmsp.sh -v -y xnujnrdugvqdjc
# [xnujnrdugvqdjc] processing ...
# [xnujnrdugvqdjc] /dev/dm-1 /dev/onapp-dhksnikrhgqrad/gjfisfgsudthkc DMtable {0 10485760 raid raid1 3 0 region_size 8192 2 - 253:0 - 252:1}
# [xnujnrdugvqdjc] DMstatus {0 10485760 raid raid1 2 aa 3932928/10485760 resync 0} 30.0% completed, 3GB left
# [xnujnrdugvqdjc] /dev/dm-3 /dev/onapp-dhksnikrhgqrad/ppeonkmjbrkghn DMtable {0 2097152 raid raid1 3 0 region_size 8192 2 - 253:2 - 252:2}
# [xnujnrdugvqdjc] DMstatus {0 2097152 raid raid1 2 AA 2097152/2097152 idle 0} ~~ {READY}
# [xnujnrdugvqdjc] /dev/dm-5 /dev/onapp-dhksnikrhgqrad/hxqdofizwvndsr DMtable {0 209715200 raid raid1 3 0 region_size 8192 2 - 253:4 - 252:3}
# [xnujnrdugvqdjc] DMstatus {0 209715200 raid raid1 2 aa 0/209715200 resync 0} 0% completed, 100GiB left

Note

The example shows that the first VM disk is 30% synchronized, the second one is fully in sync, and the third disk (the largest one) has not started synchronizing yet.
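
The same progress information can be obtained without the helper tool by querying the device-mapper status directly (a sketch; the RAID1 device name is the one from the examples above):

dmsetup status gjfisfgsudthkc-raid1
# or follow the progress continuously:
watch -n 10 'dmsetup status gjfisfgsudthkc-raid1'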

Tip

RAID1 performance tuning options are described in the RAID1 resync tuning section below.

When all VM disks are in sync, the lvm2lvmsp.sh script will remove the lvm=i tag from the StorPool volumes:

[root@s20 lvmsp]# ./lvm2lvmsp.sh -v -y xnujnrdugvqdjc
# [xnujnrdugvqdjc] processing ...
# [xnujnrdugvqdjc] /dev/dm-1 /dev/onapp-dhksnikrhgqrad/gjfisfgsudthkc DMtable {0 10485760 raid raid1 3 0 region_size 8192 2 - 253:0 - 252:1}
# [xnujnrdugvqdjc] DMstatus {0 10485760 raid raid1 2 AA 10485760/10485760 idle 0} ~~ {READY}
# [xnujnrdugvqdjc] /dev/dm-3 /dev/onapp-dhksnikrhgqrad/ppeonkmjbrkghn DMtable {0 2097152 raid raid1 3 0 region_size 8192 2 - 253:2 - 252:2}
# [xnujnrdugvqdjc] DMstatus {0 2097152 raid raid1 2 AA 2097152/2097152 idle 0} ~~ {READY}
# [xnujnrdugvqdjc] /dev/dm-5 /dev/onapp-dhksnikrhgqrad/hxqdofizwvndsr DMtable {0 209715200 raid raid1 3 0 region_size 8192 2 - 253:4 - 252:3}
# [xnujnrdugvqdjc] DMstatus {0 209715200 raid raid1 2 AA 209715200/209715200 idle 0} ~~ {READY}
# [xnujnrdugvqdjc] volumes:3 raidReady:3
# [xnujnrdugvqdjc] removing the 'lvm' tag from volume onapp-dhksnikrhgqrad:ppeonkmjbrkghn ...
# [xnujnrdugvqdjc] removing the 'lvm' tag from volume onapp-dhksnikrhgqrad:hxqdofizwvndsr ...
# [xnujnrdugvqdjc] removing the 'lvm' tag from volume onapp-dhksnikrhgqrad:gjfisfgsudthkc ...

Note

Once the lvm=i tag is removed, LVMSP applies its standard operations to these volumes.

Finalizing

On the next VM migration or reboot via the OnApp CP, LVMSP will destroy the RAID1 arrays, and the VM will run from the LVMSP disks (StorPool volumes).

The native LVM logical volumes can be removed only after the VM has been migrated. Special care must be taken to avoid the LVMSP wrappers by using the full path to the native LVM tools, for example /sbin/lvremove:

/sbin/lvremove /dev/onapp-dhksnikrhgqrad/ppeonkmjbrkghn
/sbin/lvremove /dev/onapp-dhksnikrhgqrad/hxqdofizwvndsr
/sbin/lvremove /dev/onapp-dhksnikrhgqrad/gjfisfgsudthkc
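
Before removing the logical volumes, the native LVs in the volume group can be reviewed with the full path to the native lvs tool (a sketch; an 'o' character in the Attr column would indicate an LV that is still open and must not be removed yet):

/sbin/lvs -o lv_name,lv_size,lv_attr onapp-dhksnikrhgqrad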

RAID1 resync tuning

dev.raid.{speed_limit_max,speed_limit_min}

The resync process is managed by the kernel in a way that reduces the impact on the RAID1 device performance. The variables that can be tweaked to improve the resync performance are as follows:

  • /proc/sys/dev/raid/speed_limit_min - the minimum guaranteed resync speed when there are active BIOs. This value shouldn’t be touched (default: 1000)

  • /proc/sys/dev/raid/speed_limit_max - the maximum resync speed allowed per device. This value could be tweaked to achieve better sync performance (default: 200000)

The values are in KiB/s and can be changed using sysctl as follows:

sysctl -w dev.raid.speed_limit_max=550000
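
To check the current values and make the change persistent across reboots, a sysctl drop-in file can be used (the file name and value below are illustrative):

# current values
sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max

# persist the higher resync limit (tune the value for the specific cluster)
cat > /etc/sysctl.d/90-raid-resync.conf <<'EOF'
dev.raid.speed_limit_max = 550000
EOF
sysctl -p /etc/sysctl.d/90-raid-resync.conf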