Metrics collected and available from StorPool
Overview
A StorPool cluster collects and sends metrics about its performance and related statistics. These are described below.
Customers can access the data in one of the following ways:
Via https://analytics.storpool.com/ on pre-defined dashboards
Via direct access to the InfluxDB instance for their cluster (please contact StorPool support to get access)
Via an InfluxDB instance of their own (or another database that supports InfluxDB’s line protocol), by configuring SP_STATDB_ADDITIONAL in /etc/storpool.conf; see the last section in this document.
The Grafana dashboards used by StorPool at https://analytics.storpool.com/ are available on request, via StorPool support.
Internals
This section provides some details on the operation of the data collection and InfluxDB, which could be helpful to customers who would like to operate their own metrics database for StorPool.
storpool_stat operations
The storpool_stat tool is responsible for collecting and sending most of the information to the InfluxDB instances. Its general flow is as follows:
On start, it forks one child per measurement type and one per receiving database
All measurement processes collect data and atomically write a file in /tmp/storpool_stat every few seconds
The sending processes take the data from the files, push it to the databases, and then delete the files
Any file found to be older than two days is removed.
Note
It has been known for storpool_stat to fill up /tmp on loss of network connectivity, for nodes with a large number of measured elements (CPUs, volumes).
To configure an extra metrics database for data to be pushed to, the SP_STATDB_ADDITIONAL parameter needs to be set in storpool.conf. It must contain a full URL to the write endpoint of the database, for example http://USER:PASSWORD@10.1.2.3:8086/write?db=metrics. Note that if the URL scheme is https, the endpoint will need a valid certificate.
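For example, a minimal sketch of the setting in /etc/storpool.conf, reusing the placeholder credentials and address from above:

SP_STATDB_ADDITIONAL=http://USER:PASSWORD@10.1.2.3:8086/write?db=metrics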
Data interpolation between s1 and m1 retention policies
In StorPool’s initial deployment, this was done by continuous queries. Below is what was used:
CREATE CONTINUOUS QUERY "1s_to_1m" ON DBNAME BEGIN SELECT mean(*) INTO m1.:MEASUREMENT FROM /.*/ GROUP BY time(1m),* FILL(NULL) END
CREATE CONTINUOUS QUERY "1s_to_1m_max" ON DBNAME BEGIN SELECT max(*) INTO m1.:MEASUREMENT FROM /.*/ GROUP BY time(1m),* FILL(NULL) END'
This solution did not scale for a larger number of databases, as the continuous queries are executed sequentially with no parallelization. For this purpose, StorPool developed cqsched (available to customers on request via StorPool support) to process multiple databases and measurements in parallel.
Disk usage and IO of InfluxDB databases
The IO requirements of a single database are remarkably modest. For planning purposes, you should note that a cluster sends a data point every second for:
every CPU thread
every attached volume
every HDD or SSD drive on a node.
For disk usage, as an example, a cluster with ~800 attached volumes and ~800 CPU threads takes 11GiB of space for the s1 retention policy, and 78GiB for m1.
Data structure
The main unit is a “measurement”, which contains the data for a specific kind of metric (disk I/O, CPU usage, etc.).
All of the data below has tags and fields. Basically, a “tag” is used to filter and group data, and a “field” holds the values used for calculations and graphs.
For more information, see InfluxDB’s documentation at https://docs.influxdata.com/influxdb/v1/concepts/key_concepts/.
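As an illustration, a single data point in InfluxDB’s line protocol for the cpustat measurement described below could look like this (all values here are hypothetical):

cpustat,hostname=node1,cpu=4,server=12 user=10.1,idle=980.5,iowait=3.2 1700000000000000000

The comma-separated key=value pairs after the measurement name are tags, the pairs after the first space are fields, and the trailing number is a nanosecond timestamp.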
There are two retention policies for data: the per-second data (s1) is retained for 7 days, and the per-minute data (m1) is retained for 730 days (2 years). All data from storpool_stat is pushed into s1, and the InfluxDB instances take care of downsampling it to per-minute data, either via continuous queries or by other means.
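On a self-hosted InfluxDB 1.x instance, a minimal sketch of these two retention policies could look like the following (DBNAME is a placeholder, as in the continuous query examples above):

CREATE RETENTION POLICY "s1" ON DBNAME DURATION 7d REPLICATION 1 DEFAULT
CREATE RETENTION POLICY "m1" ON DBNAME DURATION 730d REPLICATION 1

Marking s1 as the default policy assumes that incoming writes should land in s1, as described above.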
Measurements reference
The instances contain the following measurements:
bridgestatus
Collected by: storpool_monitor, pushed via the monitoring system.
Collected from: storpool remoteBridge status
This measurement is basically storpool remoteBridge status collected once a minute.
The tcpInfo_* fields are a direct copy of the tcp_info structure in the Linux kernel for the relevant connection.
Tags:
clusterId - ID of the remote cluster
connectionState - current state of the connection (string)
ip - IP address of the remote bridge
lastErrno - errno of the last error
lastError - string representation of the last error
protocolVersion - StorPool bridge protocol version.
Fields:
countersSinceConnect_bytesRecv - bytes received in the current connection
countersSinceConnect_bytesSent - bytes sent in the current connection
countersSinceStart_bytesRecv - bytes received from peer since the start of the process
countersSinceStart_bytesSent - bytes sent to peer since the start of the process
tcpInfo_tcpi_advmss - this and all fields below are tcp_info fields
tcpInfo_tcpi_ato
tcpInfo_tcpi_backoff
tcpInfo_tcpi_ca_state
tcpInfo_tcpi_fackets
tcpInfo_tcpi_last_ack_recv
tcpInfo_tcpi_last_ack_sent
tcpInfo_tcpi_last_data_recv
tcpInfo_tcpi_last_data_sent
tcpInfo_tcpi_lost
tcpInfo_tcpi_options
tcpInfo_tcpi_pmtu
tcpInfo_tcpi_probes
tcpInfo_tcpi_rcv_mss
tcpInfo_tcpi_rcv_rtt
tcpInfo_tcpi_rcv_space
tcpInfo_tcpi_rcv_ssthresh
tcpInfo_tcpi_rcv_wscale
tcpInfo_tcpi_reordering
tcpInfo_tcpi_retrans
tcpInfo_tcpi_retransmits
tcpInfo_tcpi_rto
tcpInfo_tcpi_rtt
tcpInfo_tcpi_rttvar
tcpInfo_tcpi_sacked
tcpInfo_tcpi_snd_cwnd
tcpInfo_tcpi_snd_mss
tcpInfo_tcpi_snd_ssthresh
tcpInfo_tcpi_snd_wscale
tcpInfo_tcpi_state
tcpInfo_tcpi_total_retrans
tcpInfo_tcpi_unacked
cpustat
Collected by: storpool_stat
Collected from: /proc/schedstat, /proc/stat
For extra information, see the documentation of the Linux kernel at https://docs.kernel.org/filesystems/proc.html for the two files above.
Tags:
cpu - the CPU thread the stats are for
hostname - hostname of the node
labels - pipe-delimited (|) list of StorPool services that are pinned on the CPU
server - SP_OURID of the node.
Fields:
guest - Amount of time running a guest (virtual machine)
guest_nice - Amount of time running a lower-priority (“nice”) guest (virtual machine)
idle - Amount of time the CPU has been idle
iowait - Amount of time the CPU has been idle and waiting for I/O
irq - Amount of time the CPU has processed interrupts
nice - Amount of time the CPU was running lower-priority (“nice”) tasks
run - Amount of time a task has been running on the CPU
runwait - Sum of run and wait
softirq - Amount of time the CPU has processed software interrupts
steal - Amount of time the CPU was not able to run because the host didn’t allow this (has meaning only for virtual machines)
system - Amount of time the CPU was executing kernel (and non-IRQ) tasks
user - Amount of time the CPU was executing in user space
wait - Amount of time task(s) have been waiting to run on the CPU.
Note
run and wait come from the scheduler stats. Their main benefit is that they allow the contention of the system to be measured; for example, wait on a host running virtual machines translates directly to steal inside the virtual machines.
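As an example, a query sketch for charting per-CPU I/O wait on one node from the per-second data (the host name is a placeholder; if the fields turn out to be cumulative counters rather than per-interval values, they would need to be wrapped in non_negative_derivative()):

SELECT mean("iowait") FROM "s1"."cpustat" WHERE "hostname" = 'node1' AND time > now() - 1h GROUP BY time(1m), "cpu"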
disk
Collected by: storpool_monitor, pushed via the monitoring system
Collected from: storpool disk list
This measurement is basically storpool disk list collected once a minute.
Tags:
id - Disk ID in StorPool
isWbc - Does the drive have write-back cache enabled
model - Drive model
noFlush - Is the drive initialized to not send FLUSH commands
noFua - Is the drive initialized not to use FUA (Force Unit Access)
noTrim - Is the drive initialized not to use TRIM/DISCARD
serial - Serial number of the drive
serverId - Server instance ID of the storpool_server working with the drive
ssd - Is the drive an SSD/NVMe device.
Fields:
agAllocated - Allocation groups currently in use
agCount - Total number of allocation groups
agFree - Free allocation groups
agFreeNotTrimmed - (internal) free allocation groups that haven’t been trimmed yet
agFreeing - (internal) allocation groups currently being freed
agFull - (internal) allocation groups that are full
agMaxSizeFull - (internal) allocation groups that are full with max-sized entries
agMaxSizePartial - (internal) allocation groups that have only max-sized entries, but are not full
agPartial - (internal) allocation groups that are partially full
aggregateScore_entries - aggregate score for entries
aggregateScore_space - aggregate score for space
aggregateScore_total - combined aggregate score
entriesAllocated - Entries currently in use
entriesCount - Total number of entries
entriesFree - Free entries
lastScrubCompleted - Time stamp of the last completed scrubbing operation
objectsAllocated - Objects currently in use
objectsCount - Total number of objects
objectsFree - Free objects
objectsOnDiskSize - Total amount of user data on the drive (sum of all data in objects)
scrubbedBytes - Progress of the current scrubbing operation
scrubbingBW - Bandwidth of the current scrubbing operation
scrubbingFinishAfter - estimated ETA of the scrubbing operation
scrubbingStartedBefore - approximate start of the scrubbing operation
sectorsCount - Number of sectors of the drive
totalErrorsDetected - Total errors detected by checksum verification on the drive.
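For instance, a query sketch for watching checksum errors per drive over the last day (field and tag names as documented above):

SELECT max("totalErrorsDetected") FROM "s1"."disk" WHERE time > now() - 1d GROUP BY time(1h), "id"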
diskiostat
Collected by: storpool_stat
Collected from: /proc/diskstats, storpool_initdisk --list
This measurement collects I/O stats for all HDD and SATA SSD drives on the system. For extra information, see the documentation of the Linux kernel at https://docs.kernel.org/admin-guide/iostats.html.
Tags:
device - device name
hostname - host name of the node
server - SP_OURID of the node
sp_id - disk ID in StorPool (if applicable). Journal devices are prefixed with j
ssd - is the drive an SSD.
Fields:
queue_depth - queue utilization
r_wait - wait time for read operations
read_bytes - bytes transferred for read operations
reads - number of read operations
reads_merges - merged read operations
utilization - device utilization (time busy)
w_wait - wait time for write operations
wait - total wait time
write_bytes - bytes transferred for write operations
write_merges - merged write operations
writes - number of write operations
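For example, a query sketch showing how busy each drive on a node has been (the host name is a placeholder):

SELECT mean("utilization") FROM "s1"."diskiostat" WHERE "hostname" = 'node1' AND time > now() - 1h GROUP BY time(1m), "device"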
diskstat
Collected by: storpool_stat
Collected from: /usr/lib/storpool/server_stat
These metrics show the performance of the drives and operations as seen from the storpool_server processes.
Tags:
disk - the ID of the disk in StorPool
hostname - host name of the server
server - SP_OURID of the server
Fields:
aggregation_completion_time -
aggregations - number of aggregation operations performed
disk_initiated_read_bytes -
disk_initiated_reads -
disk_read_operations_completion_time -
disk_reads_completion_time -
disk_trims_bytes -
disk_trims_count -
disk_write_operations_completion_time -
disk_writes_completion_time -
entry_group_switches - (internal)
max_disk_writes_completion_time -
max_outstanding_read_requests - peak read requests in the queue
max_outstanding_write_requests - peak write requests in the queue
max_transfer_time -
metadata_completion_time -
pct_utilization_aggregation - drive utilization for aggregation
pct_utilization_metadata - drive utilization for metadata operations
pct_utilization_reads - drive utilization for read operations
pct_utilization_server_reads - drive utilization for server reads
pct_utilization_sys - drive utilization for system operations
pct_utilization_total - drive utilization in total
pct_utilization_total2 -
pct_utilization_unknwon -
pct_utilization_user -
pct_utilization_writes - drive utilization for write operations
queued_read_requests - number of read operations in the queue
queued_write_requests - number of write operations in the queue
read_balance_forward_double_dispatch -
read_balance_forward_double_dispatch_pct -
read_balance_forward_rcvd -
read_balance_forwards_sent -
read_bytes - bytes transferred for read operations
reads - read operations
reads_completion_time -
server_read_bytes - bytes transferred for server reads
server_reads - server reads (requests from other servers)
transfer_average_time -
trims - TRIM operations issued to the device
write_bytes - bytes transferred for write operations
writes - write operations
writes_completion_time -
iostat
Collected by: storpool_stat
Collected from: /proc/schedstat, /proc/stat
This measurement collects data for the I/O usage and latency of the volumes attached on hosts via the native StorPool driver. The fields are the same as for diskiostat.
Tags:
hostname - host name of the node
server - SP_OURID of the node
volume - volume name.
Fields:
queue_depth - queue utilization
r_wait - wait time for read operations
read_bytes - bytes transferred for read operations
reads - number of read operations
reads_merges - merged read operations
utilization - device utilization (time busy)
w_wait - wait time for write operations
wait - total wait time
write_bytes - bytes transferred for write operations
write_merges - merged write operations
writes - number of write operations
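For example, a query sketch showing the average read and write latency of a single volume (the volume name is a placeholder):

SELECT mean("r_wait"), mean("w_wait") FROM "s1"."iostat" WHERE "volume" = 'testvolume' AND time > now() - 1h GROUP BY time(1m)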
iscsisession
Collected by: storpool_monitor, pushed via the monitoring system
Collected from: storpool iscsi sessions list
This measurement is storpool iscsi sessions list, collected once a minute. The data in it is counters, not differences, except for the tasks_* fields, which are the current usage of the task queue.
Tags:
ISID - ISID
connectionId - numeric ID of the connection
controllerId - SP_OURID of the target exporting node
hwPort - network interface number (0/1)
initiator - initiator IQN
initiatorIP - initiator IP address
initiatorId - internal numeric ID for the initiator
initiatorPort - initiator originating TCP port
localMSS - MSS for the TCP connection
portalIP - IP of the portal
portalPort - TCP port of the portal
status - status of the connection
target - target name
targetId - numerical target ID
timeCreated - timestamp of connection creation
Fields:
dataHoles - data “holes” observed, either because of dropped packets or reordering
discardedBytes - amount of data discarded
discardedPackets - number of packets discarded
newBytesIn - bytes in SYN packets
newBytesOut - bytes in SYN and/or ACK packets
newPacketsIn - number of SYN packets
newPacketsOut - number of SYN and/or ACK packets
retransmitsAcks - number of fast retransmits
retransmitsAcks2 - number of second retransmits
retransmitsTimeout - number of retransmissions because of a timeout
retransmittedBytes - amount of retransmitted data
retransmittedPackets - number of retransmitted packets
stats_dataIn - amount of data received
stats_dataOut - amount of data sent
stats_login - number of login requests
stats_loginRsp - number of login responses
stats_logout - number of logout requests
stats_logoutRsp - number of logout responses
stats_nopIn - number of NOPs received
stats_nopOut - number of NOPs sent
stats_r2t -
stats_reject -
stats_scsi -
stats_scsiRsp -
stats_snack -
stats_task -
stats_taskRsp -
stats_text -
stats_textRsp -
tasks_aborted - task slots with ABORT tasks
tasks_dataOut - total number of tasks for sending data
tasks_dataResp - tasks responding with data
tasks_inFreeList - available task slots
tasks_processing - task slots currently being processed
tasks_queued - task slots queued for processing
tcp_remoteMSS - MSS advertised from the remote side
tcp_remoteWindowSize - remote side window size
tcp_wscale - TCP window scaling factor
totalBytesIn - Total bytes received
totalBytesOut - Total bytes sent
totalPacketsIn - Total packets received
totalPacketsOut - Total packets sent.
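Since most of these fields are counters, rates can be derived at query time; for example, a sketch of a per-initiator retransmission rate (the retention policy and time window are illustrative):

SELECT non_negative_derivative(max("retransmittedPackets"), 1m) FROM "s1"."iscsisession" WHERE time > now() - 6h GROUP BY time(1m), "initiator"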
memstat
Collected by: storpool_stat
Collected from: /sys/fs/cgroup/memory/**/memory.stat
This measurement describes the memory usage of the node and its cgroups. For the full description of the fields, see the documentation of the Linux kernel at https://docs.kernel.org/admin-guide/cgroup-v1/memory.html.
Tags:
cgroup - name of the cgroup
hostname - host name of the node
server - SP_OURID of the node.
Fields:
active_anon
active_file
cache
hierarchical_memory_limit
hierarchical_memsw_limit
inactive_anon
inactive_file
mapped_file
pgfault
pgmajfault
pgpgin
pgpgout
rss
rss_huge
swap
total_active_anon
total_active_file
total_cache
total_inactive_anon
total_inactive_file
total_mapped_file
total_pgfault
total_pgmajfault
total_pgpgin
total_pgpgout
total_rss
total_rss_huge
total_swap
total_unevictable
unevictable
netstat
Collected by: storpool_stat
Collected from: /usr/lib/storpool/sdump
This measurement collects network statistics for the StorPool network protocol, for every StorPool service running on each node.
Tags:
hostname - host name of the node
network - network ID
server - SP_OURID of the node
service - name of the service.
Fields:
getNotSpace - number of requests rejected because of no space in local queues
rxBytes - received bytes
rxChecksumError - received packets with checksum errors
rxDataChecksumErrors - received packets with checksum error in the data
rxDataHoles - “holes” in the received data, caused by either packet loss or reordering
rxDataPackets - received packets with data
rxHwChecksumError - received packets with checksum errors detected by hardware
rxNotForUs - received packets not destined to this service
rxPackets - total received packets
rxShort - received packets that were too short/truncated
txBytes - transmitted bytes
txBytesLocal - transmitted bytes to services on the same node
txDropNoBuf - dropped packets because no buffers were available
txGetPackets - transmitted get packets (requesting data from other services)
txPackets - total transmitted packets
txPacketsLocal - transmitted packets to services on the same node
txPingPackets - transmitted ping packets.
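For example, a query sketch for spotting data holes (an indication of packet loss or reordering) per node and service; the derivative assumes these counters are cumulative:

SELECT non_negative_derivative(max("rxDataHoles"), 1m) FROM "s1"."netstat" WHERE time > now() - 1h GROUP BY time(1m), "hostname", "service"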
servicestat
Collected by: storpool_stat
Collected from: /usr/lib/storpool/sdump
Tags:
hostname - hostname of the node
server - SP_OURID of the node
service - service name
Fields:
data_transfers - number of successful data transfers
data_transfers_failed - number of failed data transfers
loops_per_second - processing loops done by the service
slept_for_usecs - amount of time the service has been idle
task
Collected by: storpool_monitor, pushed via the monitoring system
Collected from: storpool task list
This measurement tracks the active tasks in the cluster.
Tags:
diskId - disk initiating the tasks
transactionId - transaction ID of the task (0 is RECOVERY, 1 is bridge, 2 is balancer)
Fields:
allObjects - the sum of all objects in the task
completedObjects - completed objects
dispatchedObjects - objects currently being processed
unresolvedObjects - objects not yet resolved.
template
Collected by: storpool_monitor, pushed via the monitoring system
Collected from: storpool template status
This measurement tracks the amount of used/free space in a StorPool cluster.
Tags:
placeAll - placement group name for placeAll
placeHead - placement group name for placeHead
placeTail - placement group name for placeTail
templatename - template name
Fields:
availablePlaceAll - available space in placeAll placement group
availablePlaceHead - available space in placeHead placement group
availablePlaceTail - available space in placeTail placement group
capacity - total capacity of the template
capacityPlaceAll - capacity of the placeAll placement group
capacityPlaceHead - capacity of the placeHead placement group
capacityPlaceTail - capacity of the placeTail placement group
free - available space in the template
objectsCount - number of objects
onDiskSize - total size of data stored on disks in this template
removingSnapshotsCount - number of snapshots being deleted
size - total provisioned size on the template
snapshotsCount - number of snapshots
storedSize - total amount of data stored in the template
stored_internal_u1 - internal value
stored_internal_u1_placeAll - internal value
stored_internal_u1_placeHead - internal value
stored_internal_u1_placeTail - internal value
stored_internal_u2 - internal value
stored_internal_u2_placeAll - internal value
stored_internal_u2_placeHead - internal value
stored_internal_u2_placeTail - internal value
stored_internal_u3 - internal value (estimate of space “lost” due to disbalance)
stored_internal_u3_placeAll - internal value
stored_internal_u3_placeHead - internal value
stored_internal_u3_placeTail - internal value
totalSize - The number of bytes of all volumes based on this template, including the StorPool replication overhead
volumesCount - number of volumes
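For example, a query sketch tracking the remaining free space per template from the raw data (note that fields downsampled into m1 by the continuous queries shown earlier carry mean_/max_ prefixes):

SELECT last("free") FROM "s1"."template" WHERE time > now() - 1d GROUP BY time(1h), "templatename"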
per_host_status
This measurement collects inventory and per-host data for monitoring and host-wide system checks.
Collected by: storpool_stat
Collected from: the ph_status splib module.
Fields:
rootcgprocesses - Shows processes that run without a memory constraint in the node’s root cgroup, used for monitoring to prevent OOM and deadlocks.
apichecks - Checks that the API address and port are both reachable as a part of monitoring (catches blocked ports).
iscsi_checks - Reports which iSCSI remote portals are unreachable from this node and their MTU, as part of monitoring node and cluster-wide network issues.
service_checks - Reports all StorPool local services and their running and enabled state.
inventorydata - Collects the following, used for inventory and comprehensive alerts:
allkernels - list of all installed kernel versions on the node.
by-id - list of all device symlinks in the /dev/disk/by-id directory.
by-path - list of all device symlinks in the /dev/disk/by-path directory.
conf_sums - map of sha256sum checksums of /etc/storpool.conf and all files in the /etc/storpool.conf.d/ directory.
cputype - the CPU architecture as recognized by the tooling in splib.
df - list of lines output of df.
dmidecode - raw output of dmidecode, used mostly for RAM compatibility alerts.
free_-m - the output of the free -m command.
fstab - the raw contents of /etc/fstab.
kernel - the running kernel of the node.
lldp - the output from lldpcli show neighbours in JSON.
lsblk - the raw output from lsblk.
lscpu - the raw output from lscpu.
lshw_-json - the JSON output from lshw.
lsmod - the raw output from lsmod.
lspci_-DvvA_linux-proc - the raw output from lspci -DvvA linux-proc.
lspci_-k - the raw output from lspci -k, used mostly for device driver inventory.
mounts - list of lines output from /proc/mounts.
net - map of symlink names and the paths they lead to for all devices in the /sys/class/net directory.
nvme_list - the raw output from nvme list.
os - the operating system string as detected by the tooling in splib.
revision - the contents of /etc/storpool_revision, as well as the output from the storpool_revision tool in JSON format.
spconf - map of key/value pairs for all resolved values from the configuration files /etc/storpool.conf and /etc/storpool.conf.d/*.conf for this node.
sprdma - map of files and their contents in the /sys/class/storpool_rdma/storpool_rdma/state directory.
taint - list of all modules reported as tainted (to detect live-patched kernels).
unsupportedkernels - all kernels later than the presently running one that do not have StorPool kernel modules installed.
vfgenconf - the JSON configuration created for the network interfaces used for StorPool to enable hardware acceleration.
net_info - Reports the present network state as reported by /usr/lib/storpool/storpool_ping netInfo from the point of view of each local service, used for more comprehensive monitoring checks.
systemd - Reports the node’s systemd units and their state.
sysctl - Reports the following sysctl values:
kernel.core_uses_pid
kernel.panic
kernel.panic_on_oops
kernel.softlockup_panic
kernel.unknown_nmi_panic
kernel.panic_on_unrecovered_nmi
kernel.panic_on_io_nmi
kernel.hung_task_panic
vm.panic_on_oom
vm.dirty_background_bytes
vm.dirty_expire_centisecs
vm.dirty_writeback_centisecs
vm.oom_dump_tasks
vm.oom_kill_allocating_task
kernel.sysrq
net.ipv4.ip_forward
net.ipv6.conf.all.forwarding
net.ipv6.conf.default.forwarding
net.nf_conntrack_max
net.ipv4.tcp_rmem
net.ipv4.conf.all.arp_filter
net.ipv4.conf.all.arp_announce
net.ipv4.conf.all.arp_ignore
net.ipv4.conf.default.arp_filter
net.ipv4.conf.default.arp_announce
net.ipv4.conf.default.arp_ignore
proc_cmdline - The cmdline of the running kernel.
one_version - The version of the OpenNebula plugin if one is installed.
cloudstack_info2 - Information about the installed CloudStack plugin for StorPool.
kdumpctl_status - Reports the output from kdump-config status or kdumpctl status, depending on the OS version, used to ensure the kdump service is configured and running correctly.
iscsi_tool - Reports the output from the following views of /usr/lib/storpool/iscsi_tool:
ip net list
ip neigh list
ip route list
Used to monitor various local node parameters for more comprehensive monitoring alerts.