StorPool Capacity Planner

1. Introduction

The storpool_capacity_planner is a tool that determines which StorPool redundancy modes a specific StorPool cluster can support. The StorPool redundancy modes are:

Factor   Overhead   Min. No. of nodes
------   --------   -----------------
3        3.3        3
2+2      2.4        5
4+2      1.8        7
8+2      1.5        11

The factor “3”, a.k.a. “3R”, is StorPool’s standard triple replication. The rest, in the form “N+2”, are the Erasure Coding (EC) modes. For details, see 14. Redundancy.
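The overhead column translates raw capacity into usable capacity: usable = raw / overhead. A minimal sketch in Python (this is an illustration, not part of the tool):

```python
# Overhead factors per redundancy mode, taken from the table above.
OVERHEAD = {"3R": 3.3, "EC(2+2)": 2.4, "EC(4+2)": 1.8, "EC(8+2)": 1.5}

def usable_tb(raw_tb, mode):
    """Usable capacity is the raw capacity divided by the mode's overhead."""
    return round(raw_tb / OVERHEAD[mode], 3)

# For the 35.505TB cluster shown in the online example below:
print(usable_tb(35.505, "3R"))       # -> 10.759
print(usable_tb(35.505, "EC(2+2)"))  # -> 14.794
```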

2. Usage

The tool can be run in one of the following ways:

  • Online:

    Performed on a machine with the storpool CLI installed (data taken from a live cluster).

  • Offline:

    By passing the --disk-list-json FILE and --fault-set-list-json FILE options.

  • Planning:

    By passing the --csv-file CSV_FILE option with a CSV file that has been generated by a previous run of the tool.

3. Mode support calculation

For a cluster to support a given mode, the total raw capacity of each fault set must be less than or equal to the cluster's total raw capacity divided by the mode's minimum number of nodes.

Every node is its own fault set, apart from nodes that are combined into custom fault sets.
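The rule above can be sketched as a small Python check (a hypothetical illustration of the stated conditions, not the tool's actual code): a mode is eligible when the cluster has enough fault sets and no fault set exceeds its share of the total.

```python
def mode_eligible(fault_set_sizes_tb, min_nodes):
    """Check the two conditions described above: enough fault sets, and no
    fault set larger than total capacity / minimum number of nodes."""
    total = sum(fault_set_sizes_tb)
    if len(fault_set_sizes_tb) < min_nodes:
        return False
    return max(fault_set_sizes_tb) <= total / min_nodes

# Fault set sizes from the online example in the next section:
cluster = [5.118, 3.518, 3.518, 5.758, 5.118, 5.117, 1.599, 1.919, 3.839]
print(mode_eligible(cluster, 3))   # 3R: True
print(mode_eligible(cluster, 11))  # EC(8+2): False (only 9 fault sets)
```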

4. Examples

4.1. Online, checking for mode eligibility

Running the storpool_capacity_planner on a live cluster will report which modes it can support:

$ storpool_capacity_planner
Original cluster:
Cluster: raw=35.505TB, fault sets:
    fault set: name=FAKE_FOR_NODE_1 size=5.118TB
    fault set: name=FAKE_FOR_NODE_2 size=3.518TB
    fault set: name=FAKE_FOR_NODE_3 size=3.518TB
    fault set: name=FAKE_FOR_NODE_4 size=5.758TB
    fault set: name=FAKE_FOR_NODE_5 size=5.118TB
    fault set: name=FAKE_FOR_NODE_6 size=5.117TB
    fault set: name=FAKE_FOR_NODE_7 size=1.599TB
    fault set: name=FAKE_FOR_NODE_8 size=1.919TB
    fault set: name=FAKE_FOR_NODE_9 size=3.839TB


3R/3.3/3: Result: OK
Cluster: raw=35.505TB (0.0TB), usable=10.759TB, mode=3R/3.3/3

EC(2+2)/2.4/5: Result: OK
Cluster: raw=35.505TB (0.0TB), usable=14.794TB, mode=EC(2+2)/2.4/5

EC(4+2)/1.8/7: Result OK (cluster has been modified!)
Cluster: raw=33.586TB (-1.919TB), usable=18.659TB, mode=EC(4+2)/1.8/7

EC(8+2)/1.5/11: Result: Not compatible, reasons:
Cluster does not have enough nodes. Actual: 9, expected: >= 11

The tool has calculated that the cluster supports “3R” and “EC 2+2” without any modifications to the underlying capacity.

It has also calculated that the cluster can support EC 4+2, but only after ~1.9TB of raw capacity is excluded (note the negative value in the brackets).

EC 8+2 is not supported because there are not enough nodes.
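The -1.919TB adjustment for EC 4+2 can be reproduced by repeatedly capping oversized fault sets at total/min_nodes. This is a sketch of one way such a reduction could be computed; the tool's actual algorithm is not documented here.

```python
def shrink_to_fit(fault_set_sizes_tb, min_nodes):
    """Cap each fault set at total/min_nodes. Removing capacity also lowers
    the cap, so repeat until no fault set exceeds it."""
    fs = list(fault_set_sizes_tb)
    while True:
        cap = sum(fs) / min_nodes
        over = [i for i, s in enumerate(fs) if s > cap + 1e-9]
        if not over:
            return fs
        for i in over:
            fs[i] = cap

cluster = [5.118, 3.518, 3.518, 5.758, 5.118, 5.117, 1.599, 1.919, 3.839]
fitted = shrink_to_fit(cluster, 7)  # EC(4+2) needs at least 7 nodes
removed = sum(cluster) - sum(fitted)
print(round(removed, 2))  # roughly 1.92TB, matching the tool's -1.919TB
```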

4.2. Planning, adding disks to an existing cluster

Note

Depending on the storage system and its configuration, adding more disks does not always translate directly into more storage capacity.

This is the case for distributed systems like StorPool: disk additions need to be planned with the specific cluster, its nodes, and its fault sets in mind.

Consider the following cluster with capacities in TB:

Node     Size (TB)   Disks (TB)
------   ---------   ----------
Node 1   10          5 + 5
Node 2   8           4 + 4
Node 3   6           2 + 2 + 2
Node 4   2           1 + 1
Node 5   2           1 + 1
Node 6   2           1 + 1

Running the capacity planner on this cluster reports that it supports 3R with raw=32.977TB (-0.002TB), that is, after excluding 0.002TB of raw capacity. The space available to users from this capacity is usable=9.993TB.

$ storpool_capacity_planner
Original cluster:
Cluster: raw=32.979TB, fault sets:
    fault set: name=FAKE_FOR_NODE_10 size=10.994TB
    fault set: name=FAKE_FOR_NODE_20 size=8.795TB
    fault set: name=FAKE_FOR_NODE_30 size=6.596TB
    fault set: name=FAKE_FOR_NODE_40 size=2.198TB
    fault set: name=FAKE_FOR_NODE_50 size=2.198TB
    fault set: name=FAKE_FOR_NODE_60 size=2.198TB


3R/3.3/3: Result OK (cluster has been modified!)
Cluster: raw=32.977TB (-0.002TB), usable=9.993TB, mode=3R/3.3/3

...

You can generate a CSV representation of the cluster and modify the disk layout. Afterwards, you can provide the modified CSV to the tool to check the potential effects of adding, removing, or shuffling disks in the cluster.

Assume you generate such a CSV file, modify it so that an extra 1TB disk is added to Node 1, and save it as with_extra_1tb_to_node_1.csv. You then provide this file to the tool using the --csv option. As shown in the example below, the addition would not increase the storage capacity: the usable size would remain usable=9.993TB, while the raw space would decrease by 1.1TB, raw=32.977TB (-1.1TB):

$ storpool_capacity_planner --csv with_extra_1tb_to_node_1.csv
Original cluster:
Cluster: raw=34.077TB, fault sets:
    fault set: name=FAKE_FOR_NODE_10 size=12.092TB
    fault set: name=FAKE_FOR_NODE_20 size=8.795TB
    fault set: name=FAKE_FOR_NODE_30 size=6.596TB
    fault set: name=FAKE_FOR_NODE_40 size=2.198TB
    fault set: name=FAKE_FOR_NODE_50 size=2.198TB
    fault set: name=FAKE_FOR_NODE_60 size=2.198TB


3R/3.3/3: Result OK (cluster has been modified!)
Cluster: raw=32.977TB (-1.1TB), usable=9.993TB, mode=3R/3.3/3

...

Using the same approach, you can check what would happen when adding a 1TB disk to Node 6. As shown in the example below, this does increase the cluster’s usable space: originally it was usable=9.993TB, and now it is usable=10.326TB. Note also that the cluster has not rejected any of the extra capacity, raw=34.077TB (0.0TB), with no negative value inside the brackets:

$ storpool_capacity_planner --csv with_extra_1tb_to_node_6.csv
Original cluster:
Cluster: raw=34.077TB, fault sets:
    fault set: name=FAKE_FOR_NODE_10 size=10.993TB
    fault set: name=FAKE_FOR_NODE_20 size=8.795TB
    fault set: name=FAKE_FOR_NODE_30 size=6.596TB
    fault set: name=FAKE_FOR_NODE_40 size=2.198TB
    fault set: name=FAKE_FOR_NODE_50 size=2.198TB
    fault set: name=FAKE_FOR_NODE_60 size=3.297TB


3R/3.3/3: Result: OK
Cluster: raw=34.077TB (0.0TB), usable=10.326TB, mode=3R/3.3/3

...
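The difference between the two additions follows from the balance rule in section 3: capacity added to the largest fault set only pushes it further over its cap, while capacity added to a small fault set counts in full. A self-contained sketch of that arithmetic (assuming usable = raw/overhead after capping each fault set at total/min_nodes; an illustration, not the tool itself):

```python
def usable_3r_tb(fault_set_sizes_tb, overhead=3.3, min_nodes=3):
    """Cap oversized fault sets at total/min_nodes, then apply the 3R
    overhead factor to the remaining raw capacity."""
    fs = list(fault_set_sizes_tb)
    while True:
        cap = sum(fs) / min_nodes
        over = [i for i, s in enumerate(fs) if s > cap + 1e-9]
        if not over:
            return round(sum(fs) / overhead, 3)
        for i in over:
            fs[i] = cap

# Fault set sizes from the two planning examples above:
extra_on_node_1 = [12.092, 8.795, 6.596, 2.198, 2.198, 2.198]
extra_on_node_6 = [10.993, 8.795, 6.596, 2.198, 2.198, 3.297]
print(usable_3r_tb(extra_on_node_1))  # -> 9.993  (no gain)
print(usable_3r_tb(extra_on_node_6))  # -> 10.326 (full gain)
```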