Common monitoring via the StorPool API
Introduction
The operations of a StorPool cluster can be monitored via the API. It returns JSON and is easy to automate and integrate with different monitoring systems. Below is an explanation of the different elements that can be monitored and their meaning.
Most of this work is based on the running monitoring system of StorPool, available at https://monitoring.storpool.com. If you are a customer and don’t have access to it, please contact us via our ticketing system to request your credentials.
To see the JSON for a specific command, try storpool -B -j COMMAND, like this:
[root@one1 ~]# storpool -B -j disk list
{
"data" : {
"101" : {
"agAllocated" : 544,
"agCount" : 1444,
"agFree" : 900,
"agFreeNotTrimmed" : 1,
"agFreeing" : 1,
"agFull" : 0,
"agMaxSizeFull" : 0,
"agMaxSizePartial" : 4,
"agPartial" : 535,
"aggregateScore" : {
"entries" : 0,
"space" : 1,
"total" : 1
},
"description" : "",
"device" : "/dev/sda1",
"empty" : false,
...
Note
The -B option means “batch”; it will retry the request if there are transient errors.
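The JSON output lends itself to automation. Below is a minimal Python sketch (not part of the StorPool distribution) that wraps the CLI call shown above; the sp_json helper name is our own and is reused in the examples that follow:

import json
import subprocess

def sp_json(*args):
    """Run 'storpool -B -j <args...>' and return the parsed JSON."""
    result = subprocess.run(
        ["storpool", "-B", "-j", *args],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)

# Example: per-disk data, keyed by disk ID as in the output above
disks = sp_json("disk", "list")["data"]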
Internal elements
These elements do not have a status of their own, but provide information about other components.
Tasks
Available in StorPool’s command line interface (CLI) via task list.
The tasks that run in the cluster are operations that fall in one of these three categories:
Transaction ID 0 - recovery tasks: a drive is recovering the data changes it missed while its server was down
Transaction ID 1 - bridge tasks: sending data between clusters (“Multi-site”)
Everything else - balancer/relocator and similar tasks
The following can be done with the information from the tasks:
To see if there is a server or drive in recovery. This is useful for deciding when a service or node can be restarted, as it’s recommended to wait for recoveries to finish before doing any other maintenance.
To see if there is re-balancing going on, which can affect cluster performance.
To see if any bridge operation is in progress.
To track the progress of the tasks, you can see the count of completed objects per task. Please note this cannot be used for estimating completion time, as the total size of the objects cannot be estimated from this output.
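As an illustration, a monitoring check could classify the running tasks by transaction ID as described above. This is only a sketch reusing the sp_json helper from the introduction; the transactionId field name and the list layout of the task list JSON are assumptions and should be verified against the actual output:

# Hedged sketch: classify tasks by transaction ID (field name assumed).
tasks = sp_json("task", "list")["data"]

recovery, bridge, other = [], [], []
for task in tasks:
    txn = task.get("transactionId")
    if txn == 0:
        recovery.append(task)   # a drive is recovering missed data
    elif txn == 1:
        bridge.append(task)     # multi-site bridge transfer
    else:
        other.append(task)      # balancer/relocator and similar

if recovery:
    print("WARNING: recovery in progress, postpone other maintenance")
if other:
    print("NOTICE: re-balancing in progress, performance may be affected")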
Attachments
Available in the CLI via attach list.
The attachments are volumes presented as block devices to a specific node. The information here can be used to determine which clients have no attachments, and whose downtime therefore does not affect any known users of the storage system.
Visible elements
These are the elements of the system that should be monitored directly.
Networks
Available in the CLI via net list.
This is a list of the network interfaces and the state of the storpool_beacon services on each node. Please note that in the JSON there are two types of network definitions per host: one is network, the other is rdma (for InfiniBand networks), and both should count towards the number of interfaces for the node.
You should monitor here that:
There is the same number of network interfaces for each node, and that it’s at least two. StorPool does not support clusters in production with less than two interfaces.
All networks for the node are up.
beaconStatus is in the NODE_UP state.
clusterStatus is in the CNODE_UP state.
joined is true.
There is no node missing from the list of networks.
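A hedged sketch of these checks, reusing the sp_json helper from the introduction. The beaconStatus, clusterStatus and joined fields are the ones named above; the exact nesting of the net list JSON and the network/rdma keys per node are assumptions:

nets = sp_json("net", "list")["data"]

iface_counts = {}
for node_id, node in nets.items():
    # count both "network" and "rdma" entries towards the node's interfaces
    n_ifaces = len(node.get("network", {})) + len(node.get("rdma", {}))
    iface_counts[node_id] = n_ifaces
    if n_ifaces < 2:
        print(f"CRITICAL: node {node_id} has fewer than two interfaces")
    if node.get("beaconStatus") != "NODE_UP":
        print(f"CRITICAL: node {node_id} beacon is not NODE_UP")
    if node.get("clusterStatus") != "CNODE_UP":
        print(f"CRITICAL: node {node_id} is not CNODE_UP")
    if not node.get("joined"):
        print(f"CRITICAL: node {node_id} has not joined the cluster")

# comparing node_id against an expected inventory (not shown) covers the
# "missing node" check
if len(set(iface_counts.values())) > 1:
    print("WARNING: nodes report a different number of interfaces")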
Services
Available in the CLI via service list.
These are the different services running on the nodes that perform their separate tasks.
Globals that need to be monitored for all services:
Is the service up (status in the JSON)?
Are all the services running the same major version (16.01, 16.02) of StorPool?
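A sketch of these two global checks, again reusing sp_json. The status field is mentioned above; the per-service-type nesting, the exact status string and the version field name are assumptions about the service list JSON:

services = sp_json("service", "list")["data"]

versions = set()
for svc_type, instances in services.items():      # e.g. servers, clients, mgmt
    for svc_id, svc in instances.items():
        if svc.get("status") != "running":        # exact status value is an assumption
            print(f"CRITICAL: {svc_type} {svc_id} is not up")
        major = ".".join(str(svc.get("version", "")).split(".")[:2])
        versions.add(major)                       # keep only the major version

if len(versions) > 1:
    print(f"WARNING: mixed StorPool major versions: {sorted(versions)}")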
Server
Performs communication with drives and provides access to them.
Note that for multi-server installations, the server’s ID (SID) in the JSON encodes two parts: the node id (N) and the server instance id (I). The formula is SID = I*4096 + N, and the instances are counted from 0.
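For example, decoding a SID back into its two components (the value 4119 is just an illustration):

sid = 4119               # example SID reported in the JSON
instance = sid // 4096   # server instance id I -> 1 (instances start at 0)
node_id = sid % 4096     # node id N -> 23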
What should be monitored on server instances:
Is it in recovery (see tasks above)?
Client
Provides the OS and the processes running on it with access to StorPool volumes.
For every client, the list of active requests can be fetched via client N activeRequests in the CLI. It should be verified that there are no requests that have been waiting for more than 1 second, and a warning should be generated if such requests are found.
Alerts for this service can be suppressed if there are no attachments on it, so as not to generate warnings for hosts in maintenance.
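A hedged sketch of this check for a single client, reusing sp_json. The client N activeRequests command comes from the text above; the requests list and the msecActive age field are assumed names and should be checked against the real output:

client_id = "1"          # example client ID
data = sp_json("client", client_id, "activeRequests")["data"]

# requests older than 1 second (1000 ms); field names are assumptions
stuck = [r for r in data.get("requests", []) if r.get("msecActive", 0) > 1000]
if stuck:
    print(f"WARNING: client {client_id} has {len(stuck)} requests waiting over 1s")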
Management
Provides access to the management of the StorPool cluster.
What should be monitored for this service:
There needs to be more than one such service in the cluster; the recommended number is 3.
There needs to be exactly one (1) node that’s active.
iSCSI
Provides access via iSCSI to the StorPool cluster.
If you have iSCSI clients, you need at least one of these to be running. Also, please note that you should keep track of whether such a service existed in the cluster, as it might not get reported if the whole cluster goes down and comes back up without these services present.
Bridge
Provides a snapshot transfer service between clusters.
If you have bridge services, you need at least one of these to be running. Also, please note that you should keep track of whether such a service existed in the cluster, as it might not get reported if the whole cluster goes down and comes back up without these services present.
Disks
Available in the CLI via disk list.
Note
For monitoring disk usage (and template usage below) we recommend using hysteresis. This means that if the change between consecutive checks is less than 1%, the status should not be changed, even if a value has gone above (or below) a specific watermark.
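A minimal sketch of such hysteresis, interpreting “1%” as one percentage point of usage between two consecutive checks:

def with_hysteresis(usage_pct, prev_pct, prev_status, new_status):
    """Keep the previous status if usage moved by less than 1 point."""
    if prev_pct is not None and abs(usage_pct - prev_pct) < 1.0:
        return prev_status
    return new_status

# example: usage crossed a watermark, but only by 0.4 points since the last run
print(with_hysteresis(90.2, 89.8, "WARNING", "CRITICAL"))   # stays WARNING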
Disks have the most options that need to be monitored.
For disk state, the following needs to be monitored:
If the disk is seen by the cluster (the device field is not empty)
If the disk is not in recovery (see tasks above)
If the disk was scrubbed less than 2 weeks ago (lastScrubCompleted field)
StorPool scrubs the drives of the system every week, but that operation can be delayed because of load. Also please note that prior to 16.02 new drives get their lastScrubCompleted field set to 0.
If a disk that was known to be in the system is not there now.
This can happen while the server is starting and hasn’t added the disk yet. For this reason we recommend having an “inventory” system that keeps track of which drives were reported by StorPool, and removing drives from it either manually or when a drive hasn’t shown up for more than 2 hours.
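A sketch of the state checks that can be expressed directly from the disk list JSON (device and lastScrubCompleted appear in the output above); we assume lastScrubCompleted is a UNIX timestamp and skip drives that report 0:

import time

TWO_WEEKS = 14 * 24 * 3600
disks = sp_json("disk", "list")["data"]

for disk_id, disk in disks.items():
    if not disk.get("device"):
        print(f"CRITICAL: disk {disk_id} is not seen by the cluster")
    scrubbed = disk.get("lastScrubCompleted", 0)
    if scrubbed and time.time() - scrubbed > TWO_WEEKS:
        print(f"WARNING: disk {disk_id} last scrubbed more than two weeks ago")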
For disk usage stats, the following values need to be checked, with watermarks for warning/critical state, and with hysteresis:
General disk usage: agAllocated/agCount
Objects usage: objectsAllocated/objectsCount
entriesFree should be above a certain threshold; we recommend a warning when it drops below 100000 and critical when it drops below 70000.
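A sketch of these usage checks, continuing with the disks dictionary from the previous example. The entriesFree thresholds are the ones recommended above; the percentage watermarks are placeholders you should set yourself, and the hysteresis handling is omitted for brevity:

USAGE_WARN, USAGE_CRIT = 85.0, 95.0      # example watermarks, not StorPool defaults

for disk_id, disk in disks.items():
    ag_pct = 100.0 * disk["agAllocated"] / disk["agCount"]
    obj_pct = 100.0 * disk["objectsAllocated"] / disk["objectsCount"]
    usage = max(ag_pct, obj_pct)
    if usage > USAGE_CRIT:
        print(f"CRITICAL: disk {disk_id} usage {usage:.1f}%")
    elif usage > USAGE_WARN:
        print(f"WARNING: disk {disk_id} usage {usage:.1f}%")

    entries = disk["entriesFree"]
    if entries < 70000:
        print(f"CRITICAL: disk {disk_id} entriesFree={entries}")
    elif entries < 100000:
        print(f"WARNING: disk {disk_id} entriesFree={entries}")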
Disk errors should be monitored not on their absolute value, but on the velocity of that value, i.e. the rate of change. The recommended watermark for marking a disk critical is more than 100 errors within 48 hours.
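A sketch of the velocity check; the name of the error counter in the JSON is not shown here and is left to the caller, and the previous sample must be persisted between monitoring runs (also not shown):

ERROR_LIMIT = 100        # more than this many new errors within 48 hours is critical

def check_error_velocity(disk_id, errors_now, errors_48h_ago):
    """Compare the current error counter with the one sampled 48 hours ago."""
    if errors_now - errors_48h_ago > ERROR_LIMIT:
        print(f"CRITICAL: disk {disk_id} gained over {ERROR_LIMIT} errors in 48h")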
Currently there is no recommended way to monitor the imbalance of data across the disks.
For every drive, the list of active requests can be fetched via disk N activeRequests in the CLI. It should be verified that there are no requests that have been waiting for more than 1 second, and a warning should be generated if such requests are found.
Templates
Available in the CLI via template status.
Template status is the standard way of seeing the amount of free space in the cluster.
Note
For a discussion of the meaning of “free space” in StorPool please refer to the documentation, as it’s not the same as “space you can allocate”, but “data you can write in”.
For easier understanding of the data, we recommend not getting data for all templates, but grouping them based on placeHead, placeAll, placeTail and replication. Otherwise you’ll have repeated data.
Also, note that there can be two or more templates that have overlapping placement groups, or placement groups with overlapping drives. This means that the sum of the free space of several templates can be more than the actual amount of data you can write to the system.
For a template, first you need to check if there are any volumes allocated on it. A template without volumes (i.e. one that’s not in use) should not generate any alerts.
The available space is stored in stored->free, the total space in stored->capacity. This is also recorded for the placement groups, and that information should also be displayed, so that it is clear which placement group is the limiting factor.
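A sketch of this grouping and reporting, reusing sp_json. The placeHead, placeAll, placeTail, replication and stored fields are the ones named above; we assume template status returns a list of template objects, and the per-placement-group breakdown is left out:

templates = sp_json("template", "status")["data"]

groups = {}
for tmpl in templates:
    key = (tmpl["placeHead"], tmpl["placeAll"], tmpl["placeTail"], tmpl["replication"])
    groups.setdefault(key, tmpl)         # one representative template per group

for key, tmpl in groups.items():
    free, capacity = tmpl["stored"]["free"], tmpl["stored"]["capacity"]
    print(f"{key}: {free} free of {capacity}")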
Volumes
Available in the CLI via volume status.
As of 16.01/16.02 this is not recommended to be run often, as it would generate too much load on the cluster.
General cluster status
Most of the above checks are too fine-grained and do not provide an easy-to-understand picture of the state of the cluster. Here are some ideas on how to group the data above to get a better picture:
Disks
If there are disks missing/gone from just one server, that should be a warning condition; if there are missing disks on more than one server, it’s critical.
Please note that this does not apply when using replication factor 2; there, every missing drive should be treated as critical.
Any drive above the critical usage watermark should set the condition to critical.
Any drive in recovery should set the condition to warning.
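A sketch of rolling the per-disk conditions up into one status, using the disks dictionary fetched in the Disks section above. Attributing a missing disk to its server (the serverId field here) is an assumption about the disk list JSON; with replication factor 2, any missing drive should be treated as critical regardless:

missing_by_server = {}
for disk_id, disk in disks.items():
    if not disk.get("device"):                        # disk not seen by the cluster
        srv = disk.get("serverId", "unknown")         # assumed field name
        missing_by_server.setdefault(srv, []).append(disk_id)

if len(missing_by_server) > 1:
    status = "CRITICAL"      # missing disks on more than one server
elif missing_by_server:
    status = "WARNING"       # missing disks on a single server only
else:
    status = "OK"
print(f"cluster disk status: {status}")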
Networks
Any critical conditions in a network should be treated as such for the cluster.
Any node running only a client and with no attachments should not trigger the above.
Services
In general, the loss of one service should be treated as a warning, and the loss of more than one as critical.
Cluster services
There are a few services for the whole cluster that need to be monitored:
relocator should always be on.
balancer is currently (as of 16.01 and 16.02) handled by an external service and shouldn’t be monitored.