summaryrefslogtreecommitdiff
path: root/include/linux
diff options
context:
space:
mode:
authorPeter Zijlstra <peterz@infradead.org>2023-10-10 00:04:25 +0300
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>2023-11-20 13:58:53 +0300
commitb0ebeb5956e5eae19c1520fb78197a252aabc4c3 (patch)
treea3eb420efc07d0552d75779ececb976f14ca7be9 /include/linux
parentad5cb6deb41417ef41b9d6ff54f789212108606f (diff)
downloadlinux-b0ebeb5956e5eae19c1520fb78197a252aabc4c3.tar.xz
perf: Optimize perf_cgroup_switch()
[ Upstream commit f06cc667f79909e9175460b167c277b7c64d3df0 ] Namhyung reported that bd2756811766 ("perf: Rewrite core context handling") regresses context switch overhead when perf-cgroup is in use together with 'slow' PMUs like uncore. Specifically, perf_cgroup_switch()'s perf_ctx_disable() / ctx_sched_out() etc.. all iterate the full list of active PMUs for that CPU, even if they don't have cgroup events. Previously there was cgrp_cpuctx_list which linked the relevant PMUs together, but that got lost in the rework. Instead of re-instruducing a similar list, let the perf_event_pmu_context iteration skip those that do not have cgroup events. This avoids growing multiple versions of the perf_event_pmu_context iteration. Measured performance (on a slightly different patch): Before) $ taskset -c 0 ./perf bench sched pipe -l 10000 -G AAA,BBB # Running 'sched/pipe' benchmark: # Executed 10000 pipe operations between two processes Total time: 0.901 [sec] 90.128700 usecs/op 11095 ops/sec After) $ taskset -c 0 ./perf bench sched pipe -l 10000 -G AAA,BBB # Running 'sched/pipe' benchmark: # Executed 10000 pipe operations between two processes Total time: 0.065 [sec] 6.560100 usecs/op 152436 ops/sec Fixes: bd2756811766 ("perf: Rewrite core context handling") Reported-by: Namhyung Kim <namhyung@kernel.org> Debugged-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20231009210425.GC6307@noisy.programming.kicks-ass.net Signed-off-by: Sasha Levin <sashal@kernel.org>
Diffstat (limited to 'include/linux')
-rw-r--r--include/linux/perf_event.h1
1 files changed, 1 insertions, 0 deletions
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 7b5406e3288d..16913318af93 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -879,6 +879,7 @@ struct perf_event_pmu_context {
unsigned int embedded : 1;
unsigned int nr_events;
+ unsigned int nr_cgroups;
atomic_t refcount; /* event <-> epc */
struct rcu_head rcu_head;