diff options
author | Rafael J. Wysocki <rafael.j.wysocki@intel.com> | 2025-02-25 14:13:52 +0300 |
---|---|---|
committer | Rafael J. Wysocki <rafael.j.wysocki@intel.com> | 2025-02-25 14:13:52 +0300 |
commit | de585eac08b972c86faa060a77263af3d209ed24 (patch) | |
tree | f003a300b55ec190e1ee315c5e592972cb4e7f34 /tools/perf/scripts/python/libxed.py | |
parent | 5e7e39ae15b0ea370e783a9326fdd1d91357fc3e (diff) | |
parent | 5c350410999653dff8d2975d794088e4c166e8b5 (diff) | |
download | linux-de585eac08b972c86faa060a77263af3d209ed24.tar.xz |
Merge branch 'cpuidle-menu'
This work had been triggered by a report that commit 0611a640e60a
("eventpoll: prefer kfree_rcu() in __ep_remove()") had caused the
critical-jOPS metric of the SPECjbb 2015 benchmark [1] to drop by around
50% even though it generally reduced kernel overhead. Indeed, it was
found during further investigation that the total interrupt rate while
running the SPECjbb workload had fallen as a result of that commit by
55% and the local timer interrupt rate had fallen by almost 80%.
That turned out to cause the menu cpuidle governor to select the deepest
idle state supplied by the cpuidle driver (intel_idle) much more often
which added significantly more idle state latency to the workload and
that led to the decrease of the critical-jOPS score.
Interestingly enough, this problem was not visible when the teo cpuidle
governor was used instead of menu, so it appeared to be specific to the
latter. CPU wakeup event statistics collected while running the
workload indicated that the menu governor was effectively ignoring non-
timer wakeup information and all of its idle state selection decisions
appeared to be based on timer wakeups only. Thus, it appeared that the
reduction of the local timer interrupt rate caused the governor to
predict a idle duration much more often while running the workload and
the deepest idle state was selected significantly more often as a result
of that.
A subsequent inspection of the get_typical_interval() function in the
menu governor indicated that it might return UINT_MAX too often which
then caused the governor's decisions to be based entirely on information
related to timers.
Generally speaking, UINT_MAX is returned by get_typical_interval() if it
cannot make a prediction based on the most recent idle intervals data
with sufficiently high confidence, but at least in some cases this means
that useful information is not taken into account at all which may lead
to significant idle state selection mistakes. Moreover, this is not
really unlikely to happen.
One issue with get_typical_interval() is that, when it eliminates
outliers from the sample set in an attempt to reduce the standard
deviation (and so improve the prediction confidence), it does that by
dropping high-end samples only, while samples at the low end of the set
are retained. However, the samples at the low end very well may be the
outliers and they should be eliminated from the sample set instead of
the high-end samples. Accordingly, the likelihood of making a
meaningful idle duration prediction can be improved by making it also
eliminate low-end samples if they are farther from the average than
high-end samples.
Another issue is that get_typical_interval() gives up after eliminating
1/4 of the samples if the standard deviation is still not as low as
desired (within 1/6 of the average or within 20 us if the average is
close to 0), but the remaining samples in the set still represent useful
information at that point and discarding them altogether may lead to
suboptimal idle state selection.
For instance, the largest idle duration value in the get_typical_interval()
data set is the maximum idle duration observed recently and it is likely
that the upcoming idle duration will not exceed it. Therefore, in the
absence of a better choice, this value can be used as an upper bound on
the target residency of the idle state to select.
* cpuidle-menu:
cpuidle: menu: Update documentation after get_typical_interval() changes
cpuidle: menu: Avoid discarding useful information
cpuidle: menu: Eliminate outliers on both ends of the sample set
cpuidle: menu: Tweak threshold use in get_typical_interval()
cpuidle: menu: Use one loop for average and variance computations
cpuidle: menu: Drop a redundant local variable
Diffstat (limited to 'tools/perf/scripts/python/libxed.py')
0 files changed, 0 insertions, 0 deletions