Metrics collected and available from StorPool¶
1. Overview¶
A StorPool cluster collects and sends metrics about its performance and related statistics. These are described below.
Customers can access the data in one of the following ways:
- via https://analytics.storpool.com/ on pre-defined dashboards;
- via direct access to the InfluxDB instance for their cluster (please contact StorPool support to get access);
- via an InfluxDB instance of their own (or another database that supports InfluxDB’s line protocol), by configuring SP_STATDB_ADDITIONAL in /etc/storpool.conf; see the last section in this document.
The Grafana dashboards used by StorPool at https://analytics.storpool.com/ are available on request, via StorPool support.
2. Internals¶
This section provides details on how the data collection and the InfluxDB instances operate, which may be helpful to customers who would like to run their own metrics database for StorPool.
2.1. storpool_stat operations¶
The storpool_stat tool is responsible for collecting and sending most of the information to the InfluxDB instances. Its general flow is as follows:
- on start, it forks one child process per measurement type and one per receiving database;
- all measurement processes collect data and atomically write a file every few seconds in /tmp/storpool_stat;
- the sending processes take the data from the files, push it to the databases, and then delete the files;
- if any file is found to be older than two days, it is removed.
Note
It has been known for storpool_stat to fill up /tmp on loss of network connectivity, on nodes with a large number of measured elements (CPUs, volumes).
To configure an extra metrics database for data to be pushed to, the SP_STATDB_ADDITIONAL parameter needs to be set in storpool.conf. It must contain a full URL to the write endpoint of the database, for example http://USER:PASSWORD@10.1.2.3:8086/write?db=metrics. Note that if the URL scheme is https, the endpoint will need a valid certificate.
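A minimal sketch of the resulting configuration line, reusing the illustrative credentials and address from the example above (replace them with the values for your own database):
# /etc/storpool.conf (illustrative values)
SP_STATDB_ADDITIONAL=http://USER:PASSWORD@10.1.2.3:8086/write?db=metrics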
2.2. Data interpolation between s1 and m1 retention policies¶
In StorPool’s initial deployment, this was done by continuous queries. Below is what was used:
CREATE CONTINUOUS QUERY "1s_to_1m" ON DBNAME BEGIN SELECT mean(*) INTO m1.:MEASUREMENT FROM /.*/ GROUP BY time(1m),* FILL(NULL) END;
CREATE CONTINUOUS QUERY "1s_to_1m_max" ON DBNAME BEGIN SELECT max(*) INTO m1.:MEASUREMENT FROM /.*/ GROUP BY time(1m),* FILL(NULL) END'
This solution did not scale to larger numbers of databases, as the continuous queries are executed sequentially with no parallelization. For this purpose, StorPool developed cqsched (available to customers on request via StorPool support), which processes multiple databases and measurements in parallel.
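For reference, the same downsampling can also be performed with a one-off query, for example to backfill a gap after the metrics database was unavailable. The following is a sketch only; DBNAME, the one-hour time range, and the assumption that the raw data lives in the s1 retention policy should be adapted to the actual setup:
-- downsample the last hour of per-second data into per-minute means (illustrative)
SELECT mean(*) INTO DBNAME.m1.:MEASUREMENT FROM DBNAME.s1./.*/ WHERE time > now() - 1h GROUP BY time(1m),* FILL(NULL)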
2.3. Disk usage and IO of InfluxDB databases¶
The I/O requirements of a single database are remarkably modest. For planning purposes, note that a cluster sends a data point every second for:
- every CPU thread;
- every attached volume;
- every HDD or SSD drive on a node.
For disk usage, as an example, a cluster with ~800 attached volumes and ~800 CPU threads takes 11 GiB of space for the s1 data, and 78 GiB for the m1 data.
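One rough way to see how many elements (CPU threads, volumes, drives) push points into a given database is to look at its series cardinality. This is a sketch only; DBNAME is a placeholder for the database name:
-- approximate number of distinct series (roughly one per measured element per measurement) in the database
SHOW SERIES CARDINALITY ON DBNAME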
3. Data structure¶
The main unit is a “measurement”, which holds the data for one specific kind of metric (disk I/O, CPU usage, etc.).
All of the data below has tags and fields. In short, a “tag” is used to filter and group data, while a “field” holds the values used for calculations and graphs.
Note
More information can be found in InfluxDB’s documentation at https://docs.influxdata.com/influxdb/v1.8/concepts/key_concepts/.
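For example, for the cpustat measurement described below, a query might filter and group by the hostname and cpu tags while aggregating the idle field; the host name here is only a placeholder:
-- mean idle time per CPU thread of one node over the last hour (illustrative)
SELECT mean("idle") FROM "s1"."cpustat" WHERE "hostname" = 'node1' AND time > now() - 1h GROUP BY "cpu", time(1m)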
There are two retention policies for data: the per-second data (s1) is retained for 48 hours, and the per-minute data (m1) is retained for 365 days. All data from storpool_stat is pushed into s1, and the InfluxDB instances take care of downsampling it to per-minute data, either via continuous queries or other means.
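For a self-hosted InfluxDB, a minimal sketch of creating these two retention policies could look as follows. It assumes a single-node database (REPLICATION 1) and makes s1 the default policy so that incoming writes land in it (alternatively, the rp= parameter can be added to the write URL); DBNAME is a placeholder:
-- per-second data, retained for 48 hours, receives the incoming writes
CREATE RETENTION POLICY "s1" ON DBNAME DURATION 48h REPLICATION 1 DEFAULT
-- per-minute downsampled data, retained for 365 days
CREATE RETENTION POLICY "m1" ON DBNAME DURATION 365d REPLICATION 1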
4. Measurements reference¶
The instances contain the following measurements:
4.1. cpustat¶
Collected by: storpool_stat
Collected from: /proc/schedstat, /proc/stat
For extra information, see the documentation in the Linux kernel in Documentation/filesystems/proc.txt for the two files above.
Tags:
- cpu - the CPU thread the stats are for;
- hostname - hostname of the node;
- labels - pipe-delimited (|) list of StorPool services that are pinned on the CPU;
- server - SP_OURID of the node.
Fields:
- guest - Amount of time running a guest (virtual machine);
- guest_nice - Amount of time running a lower-priority (“nice”) guest (virtual machine);
- idle - Amount of time the CPU has been idle;
- iowait - Amount of time the CPU has been idle and waiting for I/O;
- irq - Amount of time the CPU has processed interrupts;
- nice - Amount of time the CPU was running lower-priority (“nice”) tasks;
- run - Amount of time a task has been running on the CPU;
- runwait - Sum of run and wait;
- softirq - Amount of time the CPU has processed software interrupts;
- steal - Amount of time the CPU was not able to run because the host didn’t allow this (has meaning only for virtual machines);
- system - Amount of time the CPU was executing kernel (and non-IRQ) tasks;
- user - Amount of time the CPU was executing in user-space;
- wait - Amount of time task(s) have been waiting to run on the CPU.
Note
run and wait come from the scheduler stats. Their main benefit is that they allow for the contention of the system to be measured; for example, wait on a host running virtual machines translates directly to steal inside the virtual machines.
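As a hedged illustration, the following query shows that contention per node; it assumes the per-second data is queried directly from the s1 retention policy:
-- mean scheduler wait time per node over the last hour, a direct measure of CPU contention (illustrative)
SELECT mean("wait") FROM "s1"."cpustat" WHERE time > now() - 1h GROUP BY "hostname", time(1m)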
4.2. disk¶
Collected by: storpool_monitor, pushed via the monitoring system
Collected from: storpool disk list
This measurement is basically storpool disk list, collected once a minute.
Tags:
- id - Disk ID in StorPool;
- isWbc - Does the drive have write-back cache enabled;
- model - Drive model;
- noFlush - Is the drive initialized to not send FLUSH commands;
- noFua - Is the drive initialized not to use FUA (Force Unit Access);
- noTrim - Is the drive initialized not to use TRIM/DISCARD;
- serial - Serial number of the drive;
- serverId - Server instance ID of the storpool_server working with the drive;
- ssd - Is the drive an SSD/NVMe device.
Fields:
- agAllocated - Allocation groups currently in use;
- agCount - Total number of allocation groups;
- agFree - Free allocation groups;
- agFreeNotTrimmed - (internal) free allocation groups that haven’t been trimmed yet;
- agFreeing - (internal) allocation groups currently being freed;
- agFull - (internal) allocation groups that are full;
- agMaxSizeFull - (internal) allocation groups that are full with max-sized entries;
- agMaxSizePartial - (internal) allocation groups that have only max-sized entries, but are not full;
- agPartial - (internal) allocation groups that are partially full;
- aggregateScore_entries - aggregate score for entries;
- aggregateScore_space - aggregate score for space;
- aggregateScore_total - combined aggregate score;
- entriesAllocated - Entries currently in use;
- entriesCount - Total number of entries;
- entriesFree - Free entries;
- lastScrubCompleted - Timestamp of the last completed scrubbing operation;
- objectsAllocated - Objects currently in use;
- objectsCount - Total number of objects;
- objectsFree - Free objects;
- objectsOnDiskSize - Total amount of user data on the drive (sum of all data in objects);
- scrubbedBytes - Progress of the current scrubbing operation;
- scrubbingBW - Bandwidth of the current scrubbing operation;
- scrubbingFinishAfter - Estimated ETA of the scrubbing operation;
- scrubbingStartedBefore - Approximate start of the scrubbing operation;
- sectorsCount - Number of sectors of the drive;
- totalErrorsDetected - Total errors detected by checksum verification on the drive.
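As a hedged example of working with this measurement, the current allocation group usage per drive can be derived from the agAllocated and agCount fields; expressing it as a percentage is just one possible presentation:
-- latest allocation group usage per drive, as a percentage (illustrative)
SELECT last("agAllocated") * 100 / last("agCount") FROM "s1"."disk" WHERE time > now() - 1h GROUP BY "id"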
4.3. diskiostat¶
Collected by: storpool_stat
Collected from: /proc/diskstats, storpool_initdisk --list
This measurement collects I/O stats for all HDD and SATA SSD drives on the system. For extra information, see the documentation in the Linux kernel in Documentation/admin-guide/iostats.rst.
Tags:
- device - device name;
- hostname - host name of the node;
- server - SP_OURID of the node;
- sp_id - disk ID in StorPool (if applicable). Journal devices are prefixed with j;
- ssd - is the drive an SSD.
Fields:
- queue_depth - queue utilization;
- r_wait - wait time for read operations;
- read_bytes - bytes transferred for read operations;
- reads - number of read operations;
- reads_merges - merged read operations;
- utilization - device utilization (time busy);
- w_wait - wait time for write operations;
- wait - total wait time;
- write_bytes - bytes transferred for write operations;
- write_merges - merged write operations;
- writes - number of write operations.
4.4. diskstat¶
Collected by: storpool_stat
Collected from: /usr/lib/storpool/server_stat
These metrics show the performance of the drives and their operations as seen by the storpool_server processes.
Tags:
- disk - the ID of the disk in StorPool;
- hostname - host name of the server;
- server - SP_OURID of the server.
Fields:
- aggregation_completion_time -
- aggregations - number of aggregation operations performed;
- disk_initiated_read_bytes -
- disk_initiated_reads -
- disk_read_operations_completion_time -
- disk_reads_completion_time -
- disk_trims_bytes -
- disk_trims_count -
- disk_write_operations_completion_time -
- disk_writes_completion_time -
- entry_group_switches - (internal);
- max_disk_writes_completion_time -
- max_outstanding_read_requests - peak read requests in the queue;
- max_outstanding_write_requests - peak write requests in the queue;
- max_transfer_time -
- metadata_completion_time -
- pct_utilization_aggregation - drive utilization for aggregation;
- pct_utilization_metadata - drive utilization for metadata operations;
- pct_utilization_reads - drive utilization for read operations;
- pct_utilization_server_reads - drive utilization for server reads;
- pct_utilization_sys - drive utilization for system operations;
- pct_utilization_total - drive utilization in total;
- pct_utilization_total2 -
- pct_utilization_unknwon -
- pct_utilization_user -
- pct_utilization_writes - drive utilization for write operations;
- queued_read_requests - number of read operations in the queue;
- queued_write_requests - number of write operations in the queue;
- read_balance_forward_double_dispatch -
- read_balance_forward_double_dispatch_pct -
- read_balance_forward_rcvd -
- read_balance_forwards_sent -
- read_bytes - bytes transferred for read operations;
- reads - read operations;
- reads_completion_time -
- server_read_bytes - bytes transferred for server reads;
- server_reads - server reads (requests from other servers);
- transfer_average_time -
- trims - TRIM operations issued to the device;
- write_bytes - bytes transferred for write operations;
- writes - write operations;
- writes_completion_time -
4.5. iostat¶
Collected by: storpool_stat
Collected from: /proc/schedstat, /proc/stat
This measurement collects data about the I/O usage and latency of the volumes attached on hosts via the native StorPool driver. The fields are the same as for diskiostat.
Tags:
- hostname - host name of the node;
- server - SP_OURID of the node;
- volume - volume name.
Fields:
- queue_depth - queue utilization;
- r_wait - wait time for read operations;
- read_bytes - bytes transferred for read operations;
- reads - number of read operations;
- reads_merges - merged read operations;
- utilization - device utilization (time busy);
- w_wait - wait time for write operations;
- wait - total wait time;
- write_bytes - bytes transferred for write operations;
- write_merges - merged write operations;
- writes - number of write operations.
4.6. iscsisession¶
Collected by: storpool_monitor, pushed via the monitoring system
Collected from: storpool iscsi sessions list
This measurement is storpool iscsi sessions list, collected once a minute. The data in it consists of counters, not differences, except for the tasks_* fields, which show the current usage of the task queue; see the example query after the field list below for one way to work with the counters.
Tags:
- ISID - ISID;
- connectionId - numeric ID of the connection;
- controllerId - SP_OURID of the target exporting node;
- hwPort - network interface number (0/1);
- initiator - initiator IQN;
- initiatorIP - initiator IP address;
- initiatorId - internal numeric ID for the initiator;
- initiatorPort - initiator originating TCP port;
- localMSS - MSS for the TCP connection;
- portalIP - IP of the portal;
- portalPort - TCP port of the portal;
- status - status of the connection;
- target - target name;
- targetId - numerical target ID;
- timeCreated - timestamp of connection creation.
Fields:
- dataHoles - data “holes” observed, either because of dropped packets or reordering;
- discardedBytes - amount of data discarded;
- discardedPackets - number of packets discarded;
- newBytesIn - bytes in SYN packets;
- newBytesOut - bytes in SYN and/or ACK packets;
- newPacketsIn - number of SYN packets;
- newPacketsOut - number of SYN and/or ACK packets;
- retransmitsAcks - number of fast retransmits;
- retransmitsAcks2 - number of second retransmits;
- retransmitsTimeout - number of retransmissions because of a timeout;
- retransmittedBytes - amount of retransmitted data;
- retransmittedPackets - number of retransmitted packets;
- stats_dataIn - amount of data received;
- stats_dataOut - amount of data sent;
- stats_login - number of login requests;
- stats_loginRsp - number of login responses;
- stats_logout - number of logout requests;
- stats_logoutRsp - number of logout responses;
- stats_nopIn - number of NOPs received;
- stats_nopOut - number of NOPs sent;
- stats_r2t -
- stats_reject -
- stats_scsi -
- stats_scsiRsp -
- stats_snack -
- stats_task -
- stats_taskRsp -
- stats_text -
- stats_textRsp -
- tasks_aborted - task slots with ABORT tasks;
- tasks_dataOut - total number of tasks for sending data;
- tasks_dataResp - tasks responding with data;
- tasks_inFreeList - available task slots;
- tasks_processing - task slots currently being processed;
- tasks_queued - task slots queued for processing;
- tcp_remoteMSS - MSS advertised by the remote side;
- tcp_remoteWindowSize - remote side window size;
- tcp_wscale - TCP window scaling factor;
- totalBytesIn - Total bytes received;
- totalBytesOut - Total bytes sent;
- totalPacketsIn - Total packets received;
- totalPacketsOut - Total packets sent.
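Since these fields are counters, a rate usually has to be derived at query time, for example with non_negative_derivative; the target IQN below is only a placeholder:
-- receive throughput of one iSCSI target in bytes per second, derived from the totalBytesIn counter (illustrative)
SELECT non_negative_derivative(last("totalBytesIn"), 1s) FROM "s1"."iscsisession" WHERE "target" = 'iqn.2019-08.com.example:target0' AND time > now() - 1h GROUP BY "initiator", time(1m)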
4.7. memstat¶
Collected by: storpool_stat
Collected from: /sys/fs/cgroup/memory/**/memory.stat
This measurement describes the memory usage of the node and its cgroups. For the full description of the fields, see the documentation in the Linux kernel in Documentation/admin-guide/cgroup-v1/memory.rst.
Tags:
- cgroup - name of the cgroup;
- hostname - host name of the node;
- server - SP_OURID of the node.
Fields:
active_anon
active_file
cache
hierarchical_memory_limit
hierarchical_memsw_limit
inactive_anon
inactive_file
mapped_file
pgfault
pgmajfault
pgpgin
pgpgout
rss
rss_huge
swap
total_active_anon
total_active_file
total_cache
total_inactive_anon
total_inactive_file
total_mapped_file
total_pgfault
total_pgmajfault
total_pgpgin
total_pgpgout
total_rss
total_rss_huge
total_swap
total_unevictable
unevictable
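A hedged example of following the memory usage of a single cgroup per node; the cgroup name is only a placeholder:
-- resident set size of one cgroup on each node over the last hour (illustrative)
SELECT mean("total_rss") FROM "s1"."memstat" WHERE "cgroup" = 'machine.slice' AND time > now() - 1h GROUP BY "hostname", time(1m)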
4.8. netstat¶
Collected by: storpool_stat
Collected from: /usr/lib/storpool/sdump
This measurement collects network stats for every StorPool service running on any node, for the StorPool network protocol.
Tags:
- hostname - host name of the node;
- network - network ID;
- server - SP_OURID of the node;
- service - name of the service.
Fields:
- getNotSpace - number of requests rejected because of no space in local queues;
- rxBytes - received bytes;
- rxChecksumError - received packets with checksum errors;
- rxDataChecksumErrors - received packets with checksum error in the data;
- rxDataHoles - “holes” in the received data, caused by either packet loss or reordering;
- rxDataPackets - received packets with data;
- rxHwChecksumError - received packets with checksum errors detected by hardware;
- rxNotForUs - received packets not destined to this service;
- rxPackets - total received packets;
- rxShort - received packets that were too short/truncated;
- txBytes - transmitted bytes;
- txBytesLocal - transmitted bytes to services on the same node;
- txDropNoBuf - dropped packets because no buffers were available;
- txGetPackets - transmitted get packets (requesting data from other services);
- txPackets - total transmitted packets;
- txPacketsLocal - transmitted packets to services on the same node;
- txPingPackets - transmitted ping packets.
4.9. servicestat¶
Collected by: storpool_stat
Collected from: /usr/lib/storpool/sdump
Tags:
- hostname - hostname of the node;
- server - SP_OURID of the node;
- service - service name.
Fields:
- data_transfers - number of successful data transfers;
- data_transfers_failed - number of failed data transfers;
- loops_per_second - processing loops done by the service;
- slept_for_usecs - amount of time the service has been idle.
4.10. template¶
Collected by: storpool_monitor, pushed via the monitoring system
Collected from: storpool template status
This measurement tracks the amount of used/free space in a StorPool cluster.
Tags:
- placeAll - placement group name for placeAll;
- placeHead - placement group name for placeHead;
- placeTail - placement group name for placeTail;
- templatename - template name.
Fields:
- availablePlaceAll - available space in the placeAll placement group;
- availablePlaceHead - available space in the placeHead placement group;
- availablePlaceTail - available space in the placeTail placement group;
- capacity - total capacity of the template;
- capacityPlaceAll - capacity of the placeAll placement group;
- capacityPlaceHead - capacity of the placeHead placement group;
- capacityPlaceTail - capacity of the placeTail placement group;
- free - available space in the template;
- objectsCount - number of objects;
- onDiskSize - total size of data stored on disks in this template;
- removingSnapshotsCount - number of snapshots being deleted;
- size - total provisioned size on the template;
- snapshotsCount - number of snapshots;
- storedSize - total amount of data stored in the template;
- stored_internal_u1 - internal value;
- stored_internal_u1_placeAll - internal value;
- stored_internal_u1_placeHead - internal value;
- stored_internal_u1_placeTail - internal value;
- stored_internal_u2 - internal value;
- stored_internal_u2_placeAll - internal value;
- stored_internal_u2_placeHead - internal value;
- stored_internal_u2_placeTail - internal value;
- stored_internal_u3 - internal value (estimate of space “lost” due to disbalance);
- stored_internal_u3_placeAll - internal value;
- stored_internal_u3_placeHead - internal value;
- stored_internal_u3_placeTail - internal value;
- totalSize - the number of bytes of all volumes based on this template, including the StorPool replication overhead;
- volumesCount - number of volumes.
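As a hedged closing example, the free-to-capacity ratio of each template can be followed over time with a query along these lines:
-- fraction of each template's capacity that is still free, sampled hourly over the last day (illustrative)
SELECT last("free") / last("capacity") FROM "s1"."template" WHERE time > now() - 1d GROUP BY "templatename", time(1h)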