StorPool Capacity Planner
1. Introduction
The storpool_capacity_planner is a tool that determines which StorPool redundancy modes a specific StorPool cluster can support. The StorPool redundancy modes are:
Factor | Overhead | Min. No. of nodes
---|---|---
3 | 3.3 | 3
2+2 | 2.4 | 5
4+2 | 1.8 | 7
8+2 | 1.5 | 11
The factor “3”, a.k.a. “3R”, is StorPool’s standard triple replication. The rest, in the form of “N+2”, are the Erasure Coding (EC) modes. For details, see 14. Redundancy.
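As a first-order illustration of the table (a sketch based only on the figures above; the real tool also accounts for fault-set balance, as described in section 3), the overhead and node-count columns map directly to mode eligibility and a usable-capacity estimate:

```python
# Redundancy modes from the table above: (overhead factor, minimum node count).
MODES = {
    "3R": (3.3, 3),
    "EC(2+2)": (2.4, 5),
    "EC(4+2)": (1.8, 7),
    "EC(8+2)": (1.5, 11),
}

def supported_modes(raw_tb, num_nodes):
    """Return the modes the node count allows, with usable = raw / overhead.

    A first-order estimate only: the fault-set rule (section 3) can further
    reduce the effective raw capacity before the overhead is applied.
    """
    return {
        mode: raw_tb / overhead
        for mode, (overhead, min_nodes) in MODES.items()
        if num_nodes >= min_nodes
    }
```

For the 9-node, 35.505TB cluster used in the examples below, this yields the reported usable figures for 3R (~10.759TB) and EC 2+2 (~14.794TB), and correctly excludes EC 8+2 for lack of nodes.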
2. Usage
The tool can be run in one of the following ways:

- Online: performed on a machine with the storpool CLI installed (data is taken from a live cluster).
- Offline: by passing the --disk-list-json FILE and --fault-set-list-json FILE options.
- Planning: by passing the --csv-file CSV_FILE option with a CSV file that has been generated by a previous run of the tool.
3. Mode support calculation
The total raw capacity of each fault set must be less than or equal to the total cluster capacity divided by the minimum number of nodes required by the mode.
Every node is its own fault set, apart from nodes that are combined into custom fault sets.
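This rule can be read as a capping procedure: any fault set larger than total/min_nodes has its excess capacity left unused, which lowers the total, so the cap is recomputed until it stabilizes. A minimal sketch of that reading (illustrative only, not StorPool’s actual implementation):

```python
def effective_raw(fault_set_sizes, min_nodes, tolerance=1e-9):
    """Iteratively cap fault sets at total/min_nodes and return capped sizes.

    Illustrative reading of the rule above: capacity above the cap is
    treated as unusable for the given mode.
    """
    sizes = list(fault_set_sizes)
    while True:
        cap = sum(sizes) / min_nodes
        over = [i for i, s in enumerate(sizes) if s > cap + tolerance]
        if not over:
            return sizes
        for i in over:
            sizes[i] = cap
```

Applied to the fault-set sizes of the 9-node cluster in section 4.1, this reproduces the tool’s EC 4+2 adjustment (effective raw of roughly 33.586TB) while leaving the cluster unchanged for 3R.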
4. Examples
4.1. Online, checking for mode eligibility
Running the storpool_capacity_planner on a live cluster will report which modes it can support:
$ storpool_capacity_planner
Original cluster:
Cluster: raw=35.505TB, fault sets:
fault set: name=FAKE_FOR_NODE_1 size=5.118TB
fault set: name=FAKE_FOR_NODE_2 size=3.518TB
fault set: name=FAKE_FOR_NODE_3 size=3.518TB
fault set: name=FAKE_FOR_NODE_4 size=5.758TB
fault set: name=FAKE_FOR_NODE_5 size=5.118TB
fault set: name=FAKE_FOR_NODE_6 size=5.117TB
fault set: name=FAKE_FOR_NODE_7 size=1.599TB
fault set: name=FAKE_FOR_NODE_8 size=1.919TB
fault set: name=FAKE_FOR_NODE_9 size=3.839TB
3R/3.3/3: Result: OK
Cluster: raw=35.505TB (0.0TB), usable=10.759TB, mode=3R/3.3/3
EC(2+2)/2.4/5: Result: OK
Cluster: raw=35.505TB (0.0TB), usable=14.794TB, mode=EC(2+2)/2.4/5
EC(4+2)/1.8/7: Result OK (cluster has been modified!)
Cluster: raw=33.586TB (-1.919TB), usable=18.659TB, mode=EC(4+2)/1.8/7
EC(8+2)/1.5/11: Result: Not compatible, reasons:
Cluster does not have enough nodes. Actual: 9, expected: >= 11
The tool has calculated that the cluster supports “3R” and “EC 2+2” without any modifications to the underlying capacity.
It has also calculated that the cluster can support EC 4+2, provided that ~1.9TB of raw capacity is excluded (note the -1.919TB adjustment in the output).
EC 8+2 is not supported because there are not enough nodes.
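The EC 4+2 line combines both rules from section 3 and the overhead table: the capped raw capacity divided by the mode’s overhead gives the reported usable figure. A quick arithmetic check, using values copied from the output above:

```python
# Values copied from the planner output above.
capped_raw = 33.586   # raw after the -1.919TB fault-set adjustment
overhead = 1.8        # EC(4+2) overhead factor from the modes table
usable = capped_raw / overhead
print(f"{usable:.3f}TB")  # matches the reported usable=18.659TB
```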
4.2. Planning, adding disks to an existing cluster
Note
Depending on the storage system and its configuration, adding more disks does not always translate directly into more storage capacity. This is the case for distributed systems like StorPool: disk additions need to be planned with the specific cluster, its nodes, and its fault sets in mind.
Consider the following cluster with capacities in TB:
Node | Size (TB) | Disks (TB)
---|---|---
Node 1 | 10 | 5 + 5
Node 2 | 8 | 4 + 4
Node 3 | 6 | 2 + 2 + 2
Node 4 | 2 | 1 + 1
Node 5 | 2 | 1 + 1
Node 6 | 2 | 1 + 1
Running the capacity planner on this cluster reports that it supports 3R. The effective raw capacity is raw=32.977TB (-0.002TB), and the space available to users from this capacity is usable=9.993TB.
$ storpool_capacity_planner
Original cluster:
Cluster: raw=32.979TB, fault sets:
fault set: name=FAKE_FOR_NODE_10 size=10.994TB
fault set: name=FAKE_FOR_NODE_20 size=8.795TB
fault set: name=FAKE_FOR_NODE_30 size=6.596TB
fault set: name=FAKE_FOR_NODE_40 size=2.198TB
fault set: name=FAKE_FOR_NODE_50 size=2.198TB
fault set: name=FAKE_FOR_NODE_60 size=2.198TB
3R/3.3/3: Result OK (cluster has been modified!)
Cluster: raw=32.977TB (-0.002TB), usable=9.993TB, mode=3R/3.3/3
...
You can generate a CSV representation of the cluster and modify the disk layout. Afterwards, you can provide the modified CSV to the tool to check the potential effects of adding, removing, or shuffling disks in the cluster.
Assume you have a CSV file called with_extra_1tb_to_node_1.csv, modified so that an extra 1TB disk is added to Node 1. You provide this file to the tool using the --csv option. As shown in the example below, this would not increase the storage capacity: the tool reports that the usable size remains usable=9.993TB, and the raw space is decreased by 1.1TB, raw=32.977TB (-1.1TB):
$ storpool_capacity_planner --csv with_extra_1tb_to_node_1.csv
Original cluster:
Cluster: raw=34.077TB, fault sets:
fault set: name=FAKE_FOR_NODE_10 size=12.092TB
fault set: name=FAKE_FOR_NODE_20 size=8.795TB
fault set: name=FAKE_FOR_NODE_30 size=6.596TB
fault set: name=FAKE_FOR_NODE_40 size=2.198TB
fault set: name=FAKE_FOR_NODE_50 size=2.198TB
fault set: name=FAKE_FOR_NODE_60 size=2.198TB
3R/3.3/3: Result OK (cluster has been modified!)
Cluster: raw=32.977TB (-1.1TB), usable=9.993TB, mode=3R/3.3/3
...
Using the same approach, you can check what would happen when adding a 1TB disk to Node 6. As shown in the example below, this increases the cluster’s usable space: originally it was usable=9.993TB, and now it is usable=10.326TB. Note also that the cluster has not rejected the extra capacity, raw=34.077TB (0.0TB) (no negative value inside the brackets):
$ storpool_capacity_planner --csv with_extra_1tb_to_node_6.csv
Original cluster:
Cluster: raw=34.077TB, fault sets:
fault set: name=FAKE_FOR_NODE_10 size=10.993TB
fault set: name=FAKE_FOR_NODE_20 size=8.795TB
fault set: name=FAKE_FOR_NODE_30 size=6.596TB
fault set: name=FAKE_FOR_NODE_40 size=2.198TB
fault set: name=FAKE_FOR_NODE_50 size=2.198TB
fault set: name=FAKE_FOR_NODE_60 size=3.297TB
3R/3.3/3: Result: OK
Cluster: raw=34.077TB (0.0TB), usable=10.326TB, mode=3R/3.3/3
...
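Both planning outcomes follow from the fault-set rule in section 3: for 3R the cap is one third of the total, so growing the already-largest Node 1 only adds capacity that gets capped away, while growing the small Node 6 is fully accepted. A standalone check, with fault-set sizes copied from the two outputs above (the iterative capping is an illustrative reading of the tool’s behaviour, not its actual implementation):

```python
def effective_raw_tb(sizes, min_nodes, tolerance=1e-9):
    """Iteratively cap fault sets at total/min_nodes; return the capped total."""
    sizes = list(sizes)
    while True:
        cap = sum(sizes) / min_nodes
        over = [i for i, s in enumerate(sizes) if s > cap + tolerance]
        if not over:
            return sum(sizes)
        for i in over:
            sizes[i] = cap

OVERHEAD_3R = 3.3  # from the redundancy modes table

# Extra 1TB disk on Node 1: the new capacity is capped away.
node1 = effective_raw_tb([12.092, 8.795, 6.596, 2.198, 2.198, 2.198], 3)
# Extra 1TB disk on Node 6: the new capacity is fully accepted.
node6 = effective_raw_tb([10.993, 8.795, 6.596, 2.198, 2.198, 3.297], 3)
```

This reproduces the reported figures: roughly raw=32.977TB and usable=9.993TB for the Node 1 variant, and raw=34.077TB and usable=10.326TB for the Node 6 variant.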