Xeon Scalable BIOS & OS tuning

1. Intro

Xeon Scalable systems need CPU and OS tuning that differs from our usual settings. The main reason is HWP (hardware-managed P-states). Quoting:

Note

HWP is a technology introduced in Skylake which lets the CPU select its own stepping speed without the usage of the CPU Multiplier. Additionally it throttles/boosts itself much faster, which improves overall CPU performance. With enabled HWP you don’t need to create SSDTs with CPU P-States anymore.

Because the CPU changes frequency faster, workloads observe less latency.

The settings fall into two parts: BIOS and OS.

2. TLDR

Options:

Where  Option              Old value    New value    Note
-----  ------------------  -----------  -----------  -------------------------
BIOS   Hardware P-State    Disabled     Native Mode
BIOS   Autonomous C-State  Disabled     Enabled
OS     Performance bias    0            15           cpupower set -b
OS     Frequency governor  performance  powersave    cpupower frequency-set -g

Attention

For Xeon Scalable Gen2, Autonomous C-State should be left Disabled.

Tested on:

  • Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz

  • kernel 3.10.0-862.2.3.el7.x86_64

Results:

  • Latency was tested with lat-r-4k-1 on a single NVMe drive in the same server, through StorPool.

  • Power usage was measured with ipmitool sdr list.

Measure      Unit  Old value  New value
-----------  ----  ---------  ---------
Latency      us    60         67
Power usage  Watt  ~150       ~100
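The power readings above come from ipmitool sdr list, which prints one sensor per line as "name | reading | status". The sketch below filters that output down to the readings reported in Watts. The helper name watt_readings is made up for this example, and sensor names differ between vendors, so treat it as a starting point rather than the exact procedure used for the numbers above.

```shell
# Print "sensor name: value" for every reading in Watts.
# Input: the output of `ipmitool sdr list` on stdin.
# watt_readings is a hypothetical helper name, not part of ipmitool.
watt_readings() {
    awk -F'|' '$2 ~ /Watts/ {
        gsub(/^ +| +$/, "", $1)   # trim blanks around the sensor name
        split($2, v, " ")         # v[1] is the numeric reading
        print $1 ": " v[1]
    }'
}

# Usage (needs ipmitool and access to the BMC, typically as root):
#   ipmitool sdr list | watt_readings
```

Running it once with the old settings and once with the new ones, under the same load, gives the two values to compare.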

3. BIOS

To make HWP work, two settings were changed. Both are under CPU -> Power management.

The main setting is under Hardware PM State Control: Hardware P-State must be set to Native Mode.

A secondary setting, which should help (we haven’t tested without it), is under CPU C-State: Autonomous Core C-State, which should be enabled.

4. BIOS Walkthrough

These are screenshots of all related pages in the BIOS. Only two of them contain changes; the rest are included for completeness.

First page of the BIOS:

Initial page

Go to the Advanced tab:

xeon_scalable_bios/01advanced.png

In it, go to the CPU Configuration menu:

xeon_scalable_bios/02processor.png

In there, on the bottom, go to the Advanced Power Management Configuration:

xeon_scalable_bios/03apm.png

In it, there are four separate options. First, CPU P-state control:

xeon_scalable_bios/04pstate.png

Then, in Hardware PM State Control, Hardware P-States offers several options. Choose Native Mode:

xeon_scalable_bios/06hwpstate-open.png

Next, CPU C State control. Here, the Autonomous Core C-State needs to be enabled:

xeon_scalable_bios/07cstate.png

And last, Package C state control:

xeon_scalable_bios/08pkgcstate.png

5. OS

If HWP is enabled, you should see it in dmesg:

[    4.158697] intel_pstate: HWP enabled
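The check above can be scripted, for example as part of a post-install verification. This is a minimal sketch that only looks for the message shown above in the kernel log; the function name hwp_enabled is made up for this example.

```shell
# Succeed if the kernel log on stdin reports that intel_pstate enabled HWP.
# hwp_enabled is a hypothetical helper name.
hwp_enabled() {
    grep -q 'intel_pstate: HWP enabled'
}

# Usage (reading the kernel log may require root):
#   if dmesg | hwp_enabled; then
#       echo "HWP is enabled"
#   else
#       echo "HWP is NOT enabled, check the BIOS settings"
#   fi
```

Note that dmesg only holds the message while the boot log is still in the kernel ring buffer, so on a long-running host a negative result is not conclusive.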

The settings needed are as follows:

  • performance bias 15

  • frequency governor powersave

The first one is a change in our procedure in rc.local. The line

cpupower set -b 0

should be

cpupower set -b 15

The second one is added to rc.local with the following line:

cpupower frequency-set -g powersave
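Once rc.local has run, both settings can be verified. The sketch below assumes the standard cpufreq sysfs layout (cpuN/cpufreq/scaling_governor); check_governor and its optional second argument, a directory to scan instead of the real sysfs tree, are made up for this example.

```shell
# Report every CPU whose cpufreq governor differs from the wanted one.
# $1: wanted governor, e.g. powersave
# $2: CPU sysfs directory (optional, defaults to the real one)
# Returns 0 if all CPUs match, 1 on a mismatch, 2 if no CPU was found.
# check_governor is a hypothetical helper name.
check_governor() {
    want="$1"
    root="${2:-/sys/devices/system/cpu}"
    seen=0
    rc=0
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue
        seen=1
        got=$(cat "$f")
        if [ "$got" != "$want" ]; then
            echo "$f: $got (expected $want)"
            rc=1
        fi
    done
    [ "$seen" -eq 1 ] || return 2
    return "$rc"
}

# Usage:
#   check_governor powersave && echo "all CPUs use powersave"
#   cpupower info -b            # prints the current performance bias (expected: 15)
#   cpupower frequency-info -p  # prints the current cpufreq policy
```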

6. Rationale

We ran the tests with fio on a StorPool volume placed on a single NVMe drive in the same server. With these settings we observed a ~5 us difference in latency and a ~50 W power saving.
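A comparable latency test can be approximated with a fio invocation like the one below. It is a sketch under assumptions, not necessarily the exact job behind the results: it assumes that lat-r-4k-1 stands for random reads with 4k blocks at queue depth 1, and the function name, the DRY_RUN switch, the 60-second runtime, and the device path in the usage lines are all made up for this example.

```shell
# Run (or, with DRY_RUN set, only print) a 4k random-read latency test
# at queue depth 1 against the block device given as $1.
# Assumption: this is what the test name lat-r-4k-1 stands for.
lat_r_4k_1() {
    dev="$1"
    set -- fio --name=lat-r-4k-1 --filename="$dev" --readonly \
        --rw=randread --bs=4k --iodepth=1 --numjobs=1 \
        --ioengine=libaio --direct=1 --time_based --runtime=60
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"   # show the command line without running it
    else
        "$@"
    fi
}

# Usage (the volume path is hypothetical):
#   DRY_RUN=1 lat_r_4k_1 /dev/storpool/testvol   # review the command first
#   lat_r_4k_1 /dev/storpool/testvol             # then run it
```

The completion latency reported by fio for each run is the value to compare between the old and the new settings.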