Xeon Scalable BIOS & OS tuning
1. Intro
Xeon Scalable systems need to be tuned differently, both in the CPU (BIOS) settings and in the OS. The main reason is the existence of HWP (hardware-controlled P-states). Quoting:
Note
HWP is a technology introduced in Skylake which lets the CPU select its own stepping speed without the usage of the CPU Multiplier. Additionally it throttles/boosts itself much faster, which improves overall CPU performance. With enabled HWP you don’t need to create SSDTs with CPU P-States anymore.
Faster frequency stepping means that workloads observe lower latency.
The settings are in two parts, BIOS and OS.
2. TLDR
Options:
Where | Option | Old value | New value | Note |
---|---|---|---|---|
BIOS | Hardware P-State | Disabled | Native Mode | |
BIOS | Autonomous C-State | Disabled | Enabled | |
OS | Performance bias | 0 | 15 | cpupower set -b |
OS | Frequency governor | performance | powersave | cpupower frequency-set -g |
Attention
For Xeon Scalable Gen2, Autonomous C-State should be Disabled.
Tested on:
Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz
kernel 3.10.0-862.2.3.el7.x86_64
Results:
Latency was tested with lat-r-4k-1 on a single NVMe drive in the same server, through StorPool.
Power usage was measured with ipmitool sdr list.
Measure | Unit | Old value | New value |
---|---|---|---|
Latency | us | 60 | 67 |
Power usage | Watt | ~150 | ~100 |
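The power figures were read from ipmitool sdr list. As a rough sketch of how the relevant readings could be pulled out of that output, the helper below keeps only the sensors reported in Watts. The function name watt_readings is hypothetical, and it assumes the usual three-column "name | reading | status" layout of sdr list; sensor names differ between server vendors.

```shell
#!/bin/sh
# Hypothetical helper: reads `ipmitool sdr list` output on stdin and prints
# "sensor name: reading" for every sensor whose reading is given in Watts.
# Assumes the "name | reading | status" column layout.
watt_readings() {
    awk -F'|' '$2 ~ /Watts/ {
        gsub(/^ +| +$/, "", $1)   # trim padding around the sensor name
        gsub(/^ +| +$/, "", $2)   # trim padding around the reading
        print $1 ": " $2
    }'
}
```

Usage would be along the lines of ipmitool sdr list | watt_readings, run once with the old settings and once with the new ones.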
3. BIOS
To make HWP work, two settings were changed. Both are in CPU -> Power management.
The main setting that enables it is in Hardware PM State Control, where Hardware P-State should be set to Native Mode.
A secondary setting, which should be helpful (we haven’t tested without it), is in CPU C-State and is called Autonomous Core C-State; it should be enabled.
4. BIOS Walkthrough
These are screenshots of all related pages in the BIOS. There are changes on only two of them, the rest are for completeness.
First page of the BIOS:
Go to the Advanced tab:
In it, go to the CPU Configuration menu:
In there, at the bottom, go to Advanced Power Management Configuration:
In it, there are four separate options. First, CPU P-state control:
Then, in Hardware PM State Control, there are different options for Hardware P-States. Choose Native Mode:
Next, CPU C State control. Here, Autonomous Core C-State needs to be enabled:
And last, Package C state control:
5. OS
If HWP is enabled, you should be able to see it in dmesg:
[ 4.158697] intel_pstate: HWP enabled
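The same check can be scripted. The following is a minimal sketch, not part of the original procedure: the function names hwp_enabled and cpu_has_hwp are made up for illustration. It assumes the kernel log contains the exact message shown above and that a CPU supporting HWP lists the hwp flag in /proc/cpuinfo.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the kernel log on stdin contains the
# intel_pstate message that reports HWP as enabled.
hwp_enabled() {
    grep -q 'intel_pstate: HWP enabled'
}

# Hypothetical helper: succeeds if the first "flags" line of /proc/cpuinfo
# (read from stdin) contains the standalone "hwp" flag.
# -w ensures that hwp_notify and similar flags alone do not count.
cpu_has_hwp() {
    grep -m1 '^flags' | grep -qw 'hwp'
}
```

For example, dmesg | hwp_enabled && echo "HWP active" checks the running kernel, and cpu_has_hwp < /proc/cpuinfo checks whether the CPU advertises the feature at all. If the second check passes but the first does not, the BIOS setting is the likely place to look.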
The settings needed are as follows:
performance bias 15
frequency governor powersave
The first one is a change to our procedure in rc.local. The line
cpupower set -b 0
should be
cpupower set -b 15
The second one is added to rc.local with the following line:
cpupower frequency-set -g powersave
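Put together, the rc.local part could look roughly like the sketch below. The wrapper function and its name apply_hwp_tuning are assumptions added for illustration; only the two cpupower commands come from the procedure above.

```shell
#!/bin/sh
# Hypothetical wrapper around the two OS settings; stops at the first failure
# so that a missing or failing cpupower is visible in the boot log.
apply_hwp_tuning() {
    # Performance bias: 15 (the procedure previously used 0).
    cpupower set -b 15 || { echo "failed to set performance bias" >&2; return 1; }
    # Frequency governor: powersave (previously performance).
    cpupower frequency-set -g powersave || { echo "failed to set governor" >&2; return 1; }
}
```

Calling apply_hwp_tuning from rc.local applies both settings at boot. Afterwards, cpupower info -b and cpupower frequency-info -p should report the new bias and governor.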
6. Rationale
The tests were performed with fio on a StorPool volume placed on a single NVMe drive in the same server. With these settings we observed a ~5us difference in latency and a ~50W power saving.