Metrics collected and available from StorPool
Overview
A StorPool cluster collects and sends metrics about its performance and related statistics. These are described below.
Customers can access the data:

- Via analytics.storpool.com on pre-defined dashboards.
- Via direct access to the InfluxDB instance for their cluster (please contact StorPool support to get access).
- Via an InfluxDB instance (or another database that supports InfluxDB’s line protocol) of their own, by configuring SP_STATDB_ADDITIONAL in /etc/storpool.conf; see the last section in this document.
The Grafana dashboards used by StorPool at analytics.storpool.com are available on request, via StorPool support.
Internals
In this section you can find some details on the operation of the data collection and InfluxDB, which could be helpful to customers who would like to operate their own metrics database for StorPool.
storpool_stat operations
The storpool_stat tool is responsible for collecting and sending most of the
information to the InfluxDB instances. Its general flow is as follows:
- On start, it forks one child per measurement type and one per receiving database.
- All measurement processes collect data and atomically write a file every few seconds in /tmp/storpool_stat.
- The sending processes take the data from the files, push it to the databases, and then delete the files.
- If any file is found to be older than two days, it is removed.
Note
It has been known for storpool_stat to fill up /tmp on loss of
network connectivity for nodes with a large number of measured elements
(CPUs, volumes).
To configure an extra metrics database that data will be pushed to, set the
SP_STATDB_ADDITIONAL parameter in storpool.conf. It must
contain a full URL to the write endpoint of the database, for example
http://USER:PASSWORD@10.1.2.3:8086/write?db=metrics. Note that if the URL
scheme is https, the endpoint will need a valid certificate.
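In /etc/storpool.conf this is a single key=value line; USER, PASSWORD, the address, and the database name below are placeholders for your own setup:

SP_STATDB_ADDITIONAL=http://USER:PASSWORD@10.1.2.3:8086/write?db=metrics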
Data downsampling between s1 and m1 retention policies
In StorPool’s initial deployment, this was done with continuous queries. Below is what was used:
CREATE CONTINUOUS QUERY "1s_to_1m" ON DBNAME BEGIN SELECT mean(*) INTO m1.:MEASUREMENT FROM /.*/ GROUP BY time(1m),* FILL(NULL) END
CREATE CONTINUOUS QUERY "1s_to_1m_max" ON DBNAME BEGIN SELECT max(*) INTO m1.:MEASUREMENT FROM /.*/ GROUP BY time(1m),* FILL(NULL) END'
This solution did not scale to a larger number of databases, as the continuous
queries are executed sequentially with no parallelization. For this reason,
StorPool developed cqsched (available to customers on request from StorPool
support) to process multiple databases and measurements in parallel.
Disk usage and IO of InfluxDB databases
The I/O requirements of a single database are remarkably modest. For planning purposes, note that a cluster sends a data point every second for:
- every CPU thread
- every attached volume
- every HDD or SSD drive on a node.
For disk usage, as an example, a cluster with ~800 attached volumes and ~800 CPU
threads takes 11 GiB of space for the s1 data and 78 GiB for the m1 data.
Data structure
The main unit is a “measurement”, which contains the data for a specific type of metric (disk I/O, CPU usage, and so on).
All of the data below has tags and fields. In short, a “tag” is used to filter data, and a “field” is used for calculations and plotting graphs.
For more information, see InfluxDB’s documentation at https://docs.influxdata.com/influxdb/v1/concepts/key_concepts/.
There are two retention policies for data: the per-second data (s1) is
retained for 7 days, and the per-minute data (m1) is retained for 730 days (2
years). All data from storpool_stat is pushed into s1, and the InfluxDB
instances take care of downsampling it to per-minute data, either via continuous
queries or by other means.
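On a self-hosted InfluxDB 1.x instance, equivalent retention policies could be created as sketched below; DBNAME is a placeholder, and REPLICATION 1 assumes a single-node InfluxDB. s1 is marked as the default policy, since storpool_stat writes into it:

CREATE RETENTION POLICY "s1" ON DBNAME DURATION 7d REPLICATION 1 DEFAULT
CREATE RETENTION POLICY "m1" ON DBNAME DURATION 730d REPLICATION 1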
Measurements reference
The instances contain the following measurements:
bridgestatus
Collected by: storpool_monitor, pushed via the monitoring system.
Collected from: storpool remoteBridge status
This measurement is basically storpool remoteBridge status collected once a
minute.
The tcpInfo_* fields are a direct copy of the tcp_info structure in the
Linux kernel for the relevant connection. An example query over the counter fields is shown after the field list below.
Tags:
- clusterId - ID of the remote cluster
- connectionState - current state of the connection (string)
- ip - IP address of the remote bridge
- lastErrno - errno of the last error
- lastError - string representation of the last error
- protocolVersion - StorPool bridge protocol version.
Fields:
- countersSinceConnect_bytesRecv - bytes received in the current connection
- countersSinceConnect_bytesSent - bytes sent in the current connection
- countersSinceStart_bytesRecv - bytes received from the peer since the start of the process
- countersSinceStart_bytesSent - bytes sent to the peer since the start of the process
- tcpInfo_tcpi_advmss - this and all fields below are tcp_info fields
- tcpInfo_tcpi_ato
- tcpInfo_tcpi_backoff
- tcpInfo_tcpi_ca_state
- tcpInfo_tcpi_fackets
- tcpInfo_tcpi_last_ack_recv
- tcpInfo_tcpi_last_ack_sent
- tcpInfo_tcpi_last_data_recv
- tcpInfo_tcpi_last_data_sent
- tcpInfo_tcpi_lost
- tcpInfo_tcpi_options
- tcpInfo_tcpi_pmtu
- tcpInfo_tcpi_probes
- tcpInfo_tcpi_rcv_mss
- tcpInfo_tcpi_rcv_rtt
- tcpInfo_tcpi_rcv_space
- tcpInfo_tcpi_rcv_ssthresh
- tcpInfo_tcpi_rcv_wscale
- tcpInfo_tcpi_reordering
- tcpInfo_tcpi_retrans
- tcpInfo_tcpi_retransmits
- tcpInfo_tcpi_rto
- tcpInfo_tcpi_rtt
- tcpInfo_tcpi_rttvar
- tcpInfo_tcpi_sacked
- tcpInfo_tcpi_snd_cwnd
- tcpInfo_tcpi_snd_mss
- tcpInfo_tcpi_snd_ssthresh
- tcpInfo_tcpi_snd_wscale
- tcpInfo_tcpi_state
- tcpInfo_tcpi_total_retrans
- tcpInfo_tcpi_unacked
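Since the countersSinceStart_* fields are cumulative, a transfer rate can be derived at query time. A sketch, assuming direct InfluxQL access with the cluster’s database selected and the data available in the s1 retention policy:

SELECT non_negative_derivative(max("countersSinceStart_bytesSent"), 1s) FROM "s1"."bridgestatus" WHERE time > now() - 6h GROUP BY time(1m), "clusterId"

This plots the outgoing bandwidth of the bridge per remote cluster, averaged per minute.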
cpustat
Collected by: storpool_stat
Collected from: /proc/schedstat, /proc/stat
For extra information, see the documentation of the Linux kernel at https://docs.kernel.org/filesystems/proc.html for the two files above.
Tags:
- cpu - the CPU thread the stats are for
- hostname - hostname of the node
- labels - pipe-delimited (|) list of StorPool services that are pinned on the CPU
- server - SP_OURID of the node.
Fields:
- guest - Amount of time running a guest (virtual machine)
- guest_nice - Amount of time running a lower-priority (“nice”) guest (virtual machine)
- idle - Amount of time the CPU has been idle
- iowait - Amount of time the CPU has been idle and waiting for I/O
- irq - Amount of time the CPU has processed interrupts
- nice - Amount of time the CPU was running lower-priority (“nice”) tasks
- run - Amount of time a task has been running on the CPU
- runwait - Sum of run and wait
- softirq - Amount of time the CPU has processed software interrupts
- steal - Amount of time the CPU was not able to run because the host didn’t allow it (has meaning only for virtual machines)
- system - Amount of time the CPU was executing kernel (and non-IRQ) tasks
- user - Amount of time the CPU was executing in user space
- wait - Amount of time task(s) have been waiting to run on the CPU.
Note
run and wait come from the scheduler stats. Their main benefit
is that they allow the contention of the system to be measured;
for example, wait on a host running virtual machines translates
directly to steal inside the virtual machines.
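As an illustration (the hostname value is a placeholder, and the query assumes direct InfluxQL access with the cluster’s database selected), per-CPU contention over the last hour could be queried like this:

SELECT mean("wait") FROM "s1"."cpustat" WHERE "hostname" = 'node1' AND time > now() - 1h GROUP BY time(1m), "cpu"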
disk
Collected by: storpool_monitor, pushed via the monitoring system
Collected from: storpool disk list
This measurement is basically storpool disk list collected once a minute.
Tags:
- id - Disk ID in StorPool
- isWbc - Does the drive have write-back cache enabled
- model - Drive model
- noFlush - Is the drive initialized to not send FLUSH commands
- noFua - Is the drive initialized not to use FUA (Force Unit Access)
- noTrim - Is the drive initialized not to use TRIM/DISCARD
- serial - Serial number of the drive
- serverId - Server instance ID of the storpool_server working with the drive
- ssd - Is the drive an SSD/NVMe device.
Fields:
- agAllocated - Allocation groups currently in use
- agCount - Total number of allocation groups
- agFree - Free allocation groups
- agFreeNotTrimmed - (internal) free allocation groups that haven’t been trimmed yet
- agFreeing - (internal) allocation groups currently being freed
- agFull - (internal) allocation groups that are full
- agMaxSizeFull - (internal) allocation groups that are full with max-sized entries
- agMaxSizePartial - (internal) allocation groups that have only max-sized entries, but are not full
- agPartial - (internal) allocation groups that are partially full
- aggregateScore_entries - aggregate score for entries
- aggregateScore_space - aggregate score for space
- aggregateScore_total - combined aggregate score
- entriesAllocated - Entries currently in use
- entriesCount - Total number of entries
- entriesFree - Free entries
- lastScrubCompleted - Time stamp of the last completed scrubbing operation
- objectsAllocated - Objects currently in use
- objectsCount - Total number of objects
- objectsFree - Free objects
- objectsOnDiskSize - Total amount of user data on the drive (sum of all data in objects)
- scrubbedBytes - Progress of the current scrubbing operation
- scrubbingBW - Bandwidth of the current scrubbing operation
- scrubbingFinishAfter - Estimated ETA of the scrubbing operation
- scrubbingStartedBefore - Approximate start of the scrubbing operation
- sectorsCount - Number of sectors of the drive
- totalErrorsDetected - Total errors detected by checksum verification on the drive.
diskiostat
Collected by: storpool_stat
Collected from: /proc/diskstats, storpool_initdisk --list
This measurement collects I/O stats for all HDD and SATA SSD drives on the system. For extra information, see the documentation of the Linux kernel at https://docs.kernel.org/admin-guide/iostats.html.
Tags:
- device - device name
- hostname - host name of the node
- server - SP_OURID of the node
- sp_id - disk ID in StorPool (if applicable); journal devices are prefixed with j
- ssd - is the drive an SSD.
Fields:
- queue_depth - queue utilization
- r_wait - wait time for read operations
- read_bytes - bytes transferred for read operations
- reads - number of read operations
- reads_merges - merged read operations
- utilization - device utilization (time busy)
- w_wait - wait time for write operations
- wait - total wait time
- write_bytes - bytes transferred for write operations
- write_merges - merged write operations
- writes - number of write operations
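For example, a sketch of a query plotting per-device read and write latency from this measurement (same access assumptions as the earlier examples):

SELECT mean("r_wait"), mean("w_wait") FROM "s1"."diskiostat" WHERE time > now() - 1h GROUP BY time(1m), "hostname", "device"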
diskstat
Collected by: storpool_stat
Collected from: /usr/lib/storpool/server_stat
These metrics show the performance of the drives and their operations as seen from the
storpool_server processes.
Tags:
- disk - the ID of the disk in StorPool
- hostname - host name of the server
- server - SP_OURID of the server
Fields:
- aggregation_completion_time
- aggregations - number of aggregation operations performed
- disk_initiated_read_bytes
- disk_initiated_reads
- disk_read_operations_completion_time
- disk_reads_completion_time
- disk_trims_bytes
- disk_trims_count
- disk_write_operations_completion_time
- disk_writes_completion_time
- entry_group_switches - (internal)
- max_disk_writes_completion_time
- max_outstanding_read_requests - peak read requests in the queue
- max_outstanding_write_requests - peak write requests in the queue
- max_transfer_time
- metadata_completion_time
- pct_utilization_aggregation - drive utilization for aggregation
- pct_utilization_metadata - drive utilization for metadata operations
- pct_utilization_reads - drive utilization for read operations
- pct_utilization_server_reads - drive utilization for server reads
- pct_utilization_sys - drive utilization for system operations
- pct_utilization_total - drive utilization in total
- pct_utilization_total2
- pct_utilization_unknwon
- pct_utilization_user
- pct_utilization_writes - drive utilization for write operations
- queued_read_requests - number of read operations in the queue
- queued_write_requests - number of write operations in the queue
- read_balance_forward_double_dispatch
- read_balance_forward_double_dispatch_pct
- read_balance_forward_rcvd
- read_balance_forwards_sent
- read_bytes - bytes transferred for read operations
- reads - read operations
- reads_completion_time
- server_read_bytes - bytes transferred for server reads
- server_reads - server reads (requests from other servers)
- transfer_average_time
- trims - TRIM operations issued to the device
- write_bytes - bytes transferred for write operations
- writes - write operations
- writes_completion_time
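A sketch of a query comparing this server-side view of drive utilization across disks (same access assumptions as the earlier examples):

SELECT max("pct_utilization_total") FROM "s1"."diskstat" WHERE time > now() - 1h GROUP BY time(1m), "server", "disk"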
iostat
Collected by: storpool_stat
Collected from: /proc/schedstat, /proc/stat
This measurement collects I/O usage and latency data for the volumes attached to hosts via the native StorPool driver. The fields are the same as for diskiostat.
Tags:
- hostname - host name of the node
- server - SP_OURID of the node
- volume - volume name.
Fields:
- queue_depth - queue utilization
- r_wait - wait time for read operations
- read_bytes - bytes transferred for read operations
- reads - number of read operations
- reads_merges - merged read operations
- utilization - device utilization (time busy)
- w_wait - wait time for write operations
- wait - total wait time
- write_bytes - bytes transferred for write operations
- write_merges - merged write operations
- writes - number of write operations
iscsisession
Collected by: storpool_monitor, pushed via the monitoring system
Collected from: storpool iscsi sessions list
This measurement is storpool iscsi sessions list, collected once a minute.
The data in it consists of cumulative counters, not differences, except for the tasks_* fields, which
show the current usage of the task queue; see the example query after the field list below.
Tags:
- ISID - ISID
- connectionId - numeric ID of the connection
- controllerId - SP_OURID of the target exporting node
- hwPort - network interface number (0/1)
- initiator - initiator IQN
- initiatorIP - initiator IP address
- initiatorId - internal numeric ID for the initiator
- initiatorPort - initiator originating TCP port
- localMSS - MSS for the TCP connection
- portalIP - IP of the portal
- portalPort - TCP port of the portal
- status - status of the connection
- target - target name
- targetId - numerical target ID
- timeCreated - timestamp of connection creation
Fields:
- dataHoles - data “holes” observed, either because of dropped packets or reordering
- discardedBytes - amount of data discarded
- discardedPackets - number of packets discarded
- newBytesIn - bytes in SYN packets
- newBytesOut - bytes in SYN and/or ACK packets
- newPacketsIn - number of SYN packets
- newPacketsOut - number of SYN and/or ACK packets
- retransmitsAcks - number of fast retransmits
- retransmitsAcks2 - number of second retransmits
- retransmitsTimeout - number of retransmissions because of a timeout
- retransmittedBytes - amount of retransmitted data
- retransmittedPackets - number of retransmitted packets
- stats_dataIn - amount of data received
- stats_dataOut - amount of data sent
- stats_login - number of login requests
- stats_loginRsp - number of login responses
- stats_logout - number of logout requests
- stats_logoutRsp - number of logout responses
- stats_nopIn - number of NOPs received
- stats_nopOut - number of NOPs sent
- stats_r2t
- stats_reject
- stats_scsi
- stats_scsiRsp
- stats_snack
- stats_task
- stats_taskRsp
- stats_text
- stats_textRsp
- tasks_aborted - task slots with ABORT tasks
- tasks_dataOut - total number of tasks for sending data
- tasks_dataResp - tasks responding with data
- tasks_inFreeList - available task slots
- tasks_processing - task slots currently being processed
- tasks_queued - task slots queued for processing
- tcp_remoteMSS - MSS advertised by the remote side
- tcp_remoteWindowSize - remote side window size
- tcp_wscale - TCP window scaling factor
- totalBytesIn - total bytes received
- totalBytesOut - total bytes sent
- totalPacketsIn - total packets received
- totalPacketsOut - total packets sent.
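Because these are cumulative counters, rates are derived at query time. A sketch (the target name is a placeholder):

SELECT non_negative_derivative(max("totalBytesIn"), 1s) FROM "s1"."iscsisession" WHERE "target" = 'tgt1' AND time > now() - 1h GROUP BY time(1m), "initiator"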
memstat
Collected by: storpool_stat
Collected from: /sys/fs/cgroup/memory/**/memory.stat
This measurement describes the memory usage of the node and its cgroups. For the full description of the fields, see the documentation of the Linux kernel at https://docs.kernel.org/admin-guide/cgroup-v1/memory.html.
Tags:
- cgroup - name of the cgroup
- hostname - host name of the node
- server - SP_OURID of the node.
Fields:
- active_anon
- active_file
- cache
- hierarchical_memory_limit
- hierarchical_memsw_limit
- inactive_anon
- inactive_file
- mapped_file
- pgfault
- pgmajfault
- pgpgin
- pgpgout
- rss
- rss_huge
- swap
- total_active_anon
- total_active_file
- total_cache
- total_inactive_anon
- total_inactive_file
- total_mapped_file
- total_pgfault
- total_pgmajfault
- total_pgpgin
- total_pgpgout
- total_rss
- total_rss_huge
- total_swap
- total_unevictable
- unevictable
netstat
Collected by: storpool_stat
Collected from: /usr/lib/storpool/sdump
This measurement collects network statistics for the StorPool network protocol, for every StorPool service running on each node.
Tags:
- hostname - host name of the node
- network - network ID
- server - SP_OURID of the node
- service - name of the service.
Fields:
- getNotSpace - number of requests rejected because of no space in local queues
- rxBytes - received bytes
- rxChecksumError - received packets with checksum errors
- rxDataChecksumErrors - received packets with checksum errors in the data
- rxDataHoles - “holes” in the received data, caused by either packet loss or reordering
- rxDataPackets - received packets with data
- rxHwChecksumError - received packets with checksum errors detected by hardware
- rxNotForUs - received packets not destined to this service
- rxPackets - total received packets
- rxShort - received packets that were too short/truncated
- txBytes - transmitted bytes
- txBytesLocal - bytes transmitted to services on the same node
- txDropNoBuf - packets dropped because no buffers were available
- txGetPackets - transmitted get packets (requesting data from other services)
- txPackets - total transmitted packets
- txPacketsLocal - packets transmitted to services on the same node
- txPingPackets - transmitted ping packets.
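Assuming these fields are cumulative counters like the bridge status data, a simple packet-loss indicator could be sketched as:

SELECT non_negative_derivative(max("rxDataHoles"), 1s) FROM "s1"."netstat" WHERE time > now() - 1h GROUP BY time(1m), "hostname", "service"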
servicestat
Collected by: storpool_stat
Collected from: /usr/lib/storpool/sdump
Tags:
- hostname - hostname of the node
- server - SP_OURID of the node
- service - service name
Fields:
- data_transfers - number of successful data transfers
- data_transfers_failed - number of failed data transfers
- loops_per_second - processing loops done by the service
- slept_for_usecs - amount of time the service has been idle
task
Collected by: storpool_monitor, pushed via the monitoring system
Collected from: storpool task list
This measurement tracks the active tasks in the cluster.
Tags:
- diskId - disk initiating the tasks
- transactionId - transaction ID of the task (0 is RECOVERY, 1 is bridge, 2 is balancer)
Fields:
- allObjects - the sum of all objects in the task
- completedObjects - completed objects
- dispatchedObjects - objects currently being processed
- unresolvedObjects - objects not yet resolved.
template
Collected by: storpool_monitor, pushed via the monitoring system
Collected from: storpool template status
This measurement tracks the amount of used/free space in a StorPool cluster.
Tags:
- placeAll - placement group name for placeAll
- placeHead - placement group name for placeHead
- placeTail - placement group name for placeTail
- templatename - template name
Fields:
- availablePlaceAll - available space in the placeAll placement group
- availablePlaceHead - available space in the placeHead placement group
- availablePlaceTail - available space in the placeTail placement group
- capacity - total capacity of the template
- capacityPlaceAll - capacity of the placeAll placement group
- capacityPlaceHead - capacity of the placeHead placement group
- capacityPlaceTail - capacity of the placeTail placement group
- free - available space in the template
- objectsCount - number of objects
- onDiskSize - total size of the data stored on disks in this template
- removingSnapshotsCount - number of snapshots being deleted
- size - total provisioned size of the template
- snapshotsCount - number of snapshots
- storedSize - total amount of data stored in the template
- stored_internal_u1 - internal value
- stored_internal_u1_placeAll - internal value
- stored_internal_u1_placeHead - internal value
- stored_internal_u1_placeTail - internal value
- stored_internal_u2 - internal value
- stored_internal_u2_placeAll - internal value
- stored_internal_u2_placeHead - internal value
- stored_internal_u2_placeTail - internal value
- stored_internal_u3 - internal value (estimate of space “lost” due to disbalance)
- stored_internal_u3_placeAll - internal value
- stored_internal_u3_placeHead - internal value
- stored_internal_u3_placeTail - internal value
- totalSize - the number of bytes of all volumes based on this template, including the StorPool replication overhead
- volumesCount - number of volumes
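For capacity planning, the long-term m1 data is usually the more convenient source. A sketch of a per-template free-space trend over the last 30 days:

SELECT mean("free") / mean("capacity") FROM "m1"."template" WHERE time > now() - 30d GROUP BY time(1d), "templatename"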
per_host_status
Collected by: storpool_stat
Collected from: the ph_status splib module
This measurement collects inventory and per-host data for monitoring and host-wide system checks.
Fields:
- rootcgprocesses - Shows processes that run without a memory constraint in the node’s root cgroup; used for monitoring to prevent OOM and deadlocks.
- apichecks - Checks that the API address and port are both reachable, as a part of monitoring (catches blocked ports).
- iscsi_checks - Reports which iSCSI remote portals are unreachable from this node and their MTU, as part of monitoring node and cluster-wide network issues.
- service_checks - Reports the running and enabled state of all local StorPool services.
- inventorydata - Collects the following, used for inventory and comprehensive alerts:
  - allkernels - list of all installed kernel versions on the node.
  - by-id - list of all device symlinks in the /dev/disk/by-id directory.
  - by-path - list of all device symlinks in the /dev/disk/by-path directory.
  - conf_sums - map of sha256sum checksums of /etc/storpool.conf and all files in the /etc/storpool.conf.d/ directory.
  - cputype - the CPU architecture as recognized by the tooling in splib.
  - df - list of lines output by df.
  - dmidecode - raw output of dmidecode, used mostly for RAM compatibility alerts.
  - free_-m - the output of the free -m command.
  - fstab - the raw contents of /etc/fstab.
  - kernel - the running kernel of the node.
  - lldp - the output from lldpcli show neighbours in JSON.
  - lsblk - the raw output from lsblk.
  - lscpu - the raw output from lscpu.
  - lshw_-json - the JSON output from lshw.
  - lsmod - the raw output from lsmod.
  - lspci_-DvvA_linux-proc - the raw output from lspci -DvvA linux-proc.
  - lspci_-k - the raw output from lspci -k, used mostly for device driver inventory.
  - mounts - list of lines output from /proc/mounts.
  - net - map of symlink names and the paths they lead to, for all devices in the /sys/class/net directory.
  - nvme_list - the raw output from nvme list.
  - os - the operating system string as detected by the tooling in splib.
  - revision - the contents of /etc/storpool_revision, as well as the output from the storpool_revision tool in JSON format.
  - spconf - map of key/value pairs for all resolved values from the configuration files /etc/storpool.conf and /etc/storpool.conf.d/*.conf for this node.
  - sprdma - map of files and their contents in the /sys/class/storpool_rdma/storpool_rdma/state directory.
  - taint - list of all modules reported as tainted (to detect live-patched kernels).
  - unsupportedkernels - all kernels later than the presently running one that do not have StorPool kernel modules installed.
  - vfgenconf - the JSON configuration created for the network interfaces used by StorPool to enable hardware acceleration.
- net_info - Reports the present network state as reported by /usr/lib/storpool/storpool_ping netInfo from the point of view of each local service; used for more comprehensive monitoring checks.
- systemd - Reports the node’s systemd units and their state.
- sysctl - Reports the following sysctl values:
  - kernel.core_uses_pid
  - kernel.panic
  - kernel.panic_on_oops
  - kernel.softlockup_panic
  - kernel.unknown_nmi_panic
  - kernel.panic_on_unrecovered_nmi
  - kernel.panic_on_io_nmi
  - kernel.hung_task_panic
  - vm.panic_on_oom
  - vm.dirty_background_bytes
  - vm.dirty_expire_centisecs
  - vm.dirty_writeback_centisecs
  - vm.oom_dump_tasks
  - vm.oom_kill_allocating_task
  - kernel.sysrq
  - net.ipv4.ip_forward
  - net.ipv6.conf.all.forwarding
  - net.ipv6.conf.default.forwarding
  - net.nf_conntrack_max
  - net.ipv4.tcp_rmem
  - net.ipv4.conf.all.arp_filter
  - net.ipv4.conf.all.arp_announce
  - net.ipv4.conf.all.arp_ignore
  - net.ipv4.conf.default.arp_filter
  - net.ipv4.conf.default.arp_announce
  - net.ipv4.conf.default.arp_ignore
- proc_cmdline - The cmdline of the running kernel.
- one_version - The version of the OpenNebula plugin, if one is installed.
- cloudstack_info2 - Information about the installed CloudStack plugin for StorPool.
- kdumpctl_status - Reports the output from kdump-config status or kdumpctl status, depending on the OS version; used to ensure the kdump service is configured and running correctly.
- iscsi_tool - Reports the output from the following views of /usr/lib/storpool/iscsi_tool:
  - ip net list
  - ip neigh list
  - ip route list
These are used to monitor various local node parameters for more comprehensive monitoring alerts.