Redundancy

StorPool provides two mechanisms for protecting data from unplanned events: replication and erasure coding.

Tip

When planning replication and erasure coding schemes, you can use the StorPool capacity planner tool. For details, see StorPool capacity planner.

Replication

With replication, redundancy is provided by writing multiple copies (replicas) of the data synchronously across the cluster. You can set the number of copies as needed. The replication level directly determines the number of fault sets or servers that may be down without interrupting the service. For example, with triple replication up to two servers may be down simultaneously without losing access to the data; depending on how fault sets are configured (see Fault sets), this can mean all disks in two full nodes (or groups of nodes) belonging to two different fault sets, or simply two nodes.

Each volume or snapshot can be replicated on a different set of drives; each set of drives is defined through a placement group. A volume can either keep all of its copies in a single set of drives spread across different nodes, or keep each copy in a different set of drives. Replication is controlled through a number of parameters; for details, see Volumes and Placement groups.
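To make the relationship between the replication level and fault tolerance more concrete, here is a minimal, purely illustrative Python sketch. It is not the StorPool API, and all names in it are hypothetical; it simply models replicas placed on distinct nodes and checks that every block stays readable while any two nodes are down:

    # Purely illustrative sketch, not the StorPool API; all names are hypothetical.
    # Models replicas placed on distinct nodes and checks data availability.
    from itertools import combinations

    REPLICATION = 3  # triple replication: three copies of every block

    # Hypothetical placement: each block has its copies on three different nodes.
    placement = {
        "block-0": {"node-1", "node-2", "node-3"},
        "block-1": {"node-2", "node-3", "node-4"},
        "block-2": {"node-1", "node-4", "node-5"},
    }

    def all_blocks_available(failed_nodes):
        """A block stays readable while at least one copy is on a healthy node."""
        return all(nodes - failed_nodes for nodes in placement.values())

    # With triple replication, any two nodes may be down without losing access.
    all_nodes = {n for nodes in placement.values() for n in nodes}
    for failed in combinations(all_nodes, REPLICATION - 1):
        assert all_blocks_available(set(failed))
    print("Data remains available with any", REPLICATION - 1, "nodes down.")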

Tip

When using the replication mechanism, StorPool recommends having 3 copies as a standard for critical data.

Triple replication

Triple replication requires at least three nodes; five or more are recommended.

With triple replication each block of data is stored on three different storage nodes. This protects the data against two simultaneous failures - for example, one node is down for maintenance, and a drive on another node fails.

Dual replication

Dual replication can be used for non-critical data, or for data that can be recreated from other sources. Dual-replicated data can tolerate a single failure without service interruption.

This type of replication is suitable for test and staging environments, and can even be deployed on a single-node cluster (not recommended for production deployments). It is also commonly used on larger HDD-based backup clusters.

Erasure Coding

As of release 21.0 (revision 21.0.75.1e0880427), StorPool supports erasure coding on NVMe drives.

Features

The erasure coding mechanism reduces the amount of data stored on the same hardware set, while at the same time preserving the level of data protection. It provides the following advantages:

  • Cross-node data protection

    Erasure-coded data is always protected across servers with two parity objects, so that any two servers can fail and user data remains safe.

  • Delayed batch-encoding

    Incoming data is initially written with triple replication. The erasure coding mechanism is automatically applied later. This way the data processing overhead is significantly reduced, and the impact on latency for user I/O operations is minimized.

  • Designed for always-on operations

    Up to two storage nodes can be rebooted or brought down for maintenance while the storage system keeps running, and all data is available and in use.

  • A pure software feature

    The implementation requires no additional hardware components.

Redundancy schemes

StorPool supports three redundancy schemes for erasure coding: 2+2, 4+2, and 8+2. You can choose which one to use based on the size of your cluster. The naming of the schemes follows the k+m pattern:

  • k is the number of data blocks stored.

  • m is the number of parity blocks stored.

  • A redundancy scheme can recover the data when any m blocks (or fewer) are lost.

For example, 4+2 stores 4 data blocks and protects them with two parity blocks. It can operate and recover when any 2 drives or nodes are lost.
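The following toy Python sketch illustrates this property; it is not StorPool's actual encoding, just a model of the fact that the data can be rebuilt from any k of the k + m stored blocks:

    # Toy illustration of the k+m property, not StorPool's actual encoding.
    from itertools import combinations

    K, M = 4, 2                                    # the 4+2 scheme
    blocks = [f"block-{i}" for i in range(K + M)]  # k data blocks + m parity blocks

    def recoverable(lost):
        """The data can be rebuilt as long as at least k blocks survive."""
        return len(blocks) - len(lost) >= K

    # Any combination of up to m lost blocks (drives or nodes) is tolerated.
    assert all(recoverable(set(lost))
               for r in range(M + 1)
               for lost in combinations(blocks, r))
    print(f"The {K}+{M} scheme tolerates the loss of any {M} blocks.")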

When planning, consider the minimum required number of nodes (or fault sets) for each scheme:

Scheme    Minimum nodes    Raw space used    Overhead
2+2       5+               2.4x              140%
4+2       7+               1.8x              80%
8+2       11+              1.5x              50%

For example, storing 1 TB of user data using the 8+2 scheme requires 1.5 TB of raw storage capacity.
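The arithmetic behind the table can be sketched in a few lines of Python; the raw-space multipliers below are taken directly from the table, and the helper names are illustrative only:

    # Illustrative arithmetic only; the multipliers come from the table above.
    RAW_SPACE_MULTIPLIER = {"2+2": 2.4, "4+2": 1.8, "8+2": 1.5}

    def raw_capacity_needed(scheme, user_data_tb):
        """Raw storage capacity (TB) needed for the given amount of user data."""
        return user_data_tb * RAW_SPACE_MULTIPLIER[scheme]

    def overhead_percent(scheme):
        """Overhead relative to the user data, as in the 'Overhead' column."""
        return (RAW_SPACE_MULTIPLIER[scheme] - 1.0) * 100

    print(raw_capacity_needed("8+2", 1.0))  # 1.5 TB of raw capacity for 1 TB of user data
    print(overhead_percent("8+2"))          # 50.0 (percent)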

The nodes have to be relatively similar in size. A mix that includes only a few very large nodes may make it impossible to use their capacity efficiently.

Note

Erasure coding requires making snapshots on a regular basis. Make sure your cluster is configured to create snapshots regularly, for example using the VolumeCare service. A single periodic snapshot per volume is required; more snapshots are optional.

FAQ

For more information about StorPool’s erasure coding implementation, see the Erasure coding section of the frequently asked questions.