Bridge throughput performance
The throughput performance of a bridge connection (see Overview) depends on several factors (in no particular order): network throughput, network latency, CPU speed, and disk latency. Each of them can become a bottleneck and may require additional tuning to get higher throughput out of the available link between the two sites.
Network
For high-throughput links, latency is the most important factor for achieving high link utilization. For example, a low-latency 10 Gbps link will be easily saturated (provided crypto is off), but with higher latency some tuning of the TCP window size is required. The same applies to lower-bandwidth links with higher latency.
In these cases the send buffer size can be increased in small increments until the TCP window is optimal. For more information on how to update the send buffer size in each location, see Location.
Note
To find the best send buffer size for throughput from the primary to the backup site, fill a volume with data in the primary (source) site, then create a backup to the backup (remote) site. While observing the bandwidth utilized, increase the send buffers in small increments in both the source and the destination cluster until the throughput either stops rising or reaches an acceptable level.
There is a theoretical value for the buffer size, the bandwidth-delay product: the bandwidth of the link in bytes per second multiplied by the RTT of the link in seconds. For example, approximating 1 Gbps as 100 MiB/s (104857600 bytes/s), a link with 20 ms of RTT would need 104857600 * 0.02 = 2097152 bytes of buffer, or 2 MiB, to keep the link full. The theoretical value should always be tested, because other factors like QoS or network issues also affect the throughput.
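The arithmetic is easy to script when planning buffer sizes for several links. Below is a minimal Python sketch of the bandwidth-delay product calculation; the helper name and the example values are illustrative only:

    # Bandwidth-delay product: the theoretical send buffer size needed
    # to keep a link full at a given round-trip time.
    def bdp_bytes(bandwidth_bytes_per_s: float, rtt_s: float) -> float:
        """Buffer size in bytes = bandwidth (bytes/s) * RTT (s)."""
        return bandwidth_bytes_per_s * rtt_s

    # 1 Gbps approximated as 100 MiB/s (104857600 bytes/s), 20 ms RTT:
    print(bdp_bytes(104857600, 0.02))  # 2097152.0 bytes, i.e. 2 MiB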
Note that increasing the send buffers above this value can lead to delays when recovering a backup in the opposite direction.
Further sysctl changes might be required, depending on the NIC driver. For more information, check the /usr/share/doc/storpool/examples/bridge/90-StorPoolBridgeTcp.conf file on the node with the storpool_bridge service.
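As an illustration only (not the contents of the shipped file, which should be consulted directly), raising the TCP buffer limits on Linux typically involves sysctl settings along these lines:

    # Illustrative values only; the shipped 90-StorPoolBridgeTcp.conf and
    # your NIC driver documentation take precedence.
    net.core.rmem_max = 67108864
    net.core.wmem_max = 67108864
    net.ipv4.tcp_rmem = 4096 87380 67108864
    net.ipv4.tcp_wmem = 4096 65536 67108864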
CPU
The CPU usually becomes a bottleneck only when crypto is enabled. In such situations it helps to move the bridge service to a node with a faster CPU. There is no requirement that this node is a server; it can be any node in the cluster.
If a faster CPU is not available in the same cluster, it could also help to set the SP_BRIDGE_SLEEP_TYPE configuration option to hsleep, or even to no (see Type of sleep for the bridge service).
Note that when this is configured, the storpool_cg tool (see Introduction to storpool_cg) will attempt to isolate a full CPU core (with the second hardware thread free from other processes).
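As a sketch, and assuming the usual KEY=value format of /etc/storpool.conf (verify the file location and the current value in your deployment), the option would be set like this:

    # Reduce bridge wake-up latency at the cost of dedicated CPU time;
    # hsleep and no are the values described above.
    SP_BRIDGE_SLEEP_TYPE=hsleep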
Disk throughput
The default remote recovery setting maxRemoteRecoveryRequests is relatively low, especially for dedicated backup clusters (see Local and remote recovery). Thus the underlying disks in the receiving cluster can be left underutilized, and their effective throughput at this low queue depth becomes the bottleneck (this does not happen with flash media). The parameter can be tuned for higher parallelism.
Here is an example: in a small cluster of 3 nodes with 8 disks each, the default setting translates to a queue depth of 48 from the bridge, while the underlying disks can accept 8 * 3 * 32 = 768 requests, and the bridge service itself allows 2048 in-flight requests by default on a 10 Gbps link (256 on a 1 Gbps link).
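The numbers above can be reproduced with a short Python sketch; the per-disk figure of 2 recovery requests is an assumption inferred from the 48 total, not a documented constant:

    # Queue-depth arithmetic for the example cluster above.
    nodes = 3
    disks_per_node = 8
    per_disk_queue_depth = 32       # requests each disk can accept
    recovery_requests_per_disk = 2  # assumed; inferred from the 48 total

    disks = nodes * disks_per_node                     # 24 disks
    bridge_queue = disks * recovery_requests_per_disk  # from the bridge
    disk_capacity = disks * per_disk_queue_depth       # what disks can take
    print(bridge_queue, disk_capacity)                 # 48 768
    # The bridge service itself allows 2048 in-flight requests on a
    # 10 Gbps link (256 on 1 Gbps), so the recovery setting is the
    # limiting factor here.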
As with the throughput tuning, any change in this parameter needs to be tested.