sched/membarrier: Use per-CPU mutexes for targeted commands - kernel/linux.git


author	Aniket Gattani <aniketgattani@google.com>	2026-05-04 00:22:03 +0300
committer	Peter Zijlstra <peterz@infradead.org>	2026-05-05 13:50:48 +0300
commit	89976cd73739dcb73745705a63ccc67a8be26cdf (patch)
tree	e2053441035aa8479da13373eb1b4a072764453b /include/linux
parent	b00192d78bb4f8b85d3545c7f86722610f088554 (diff)
download	linux-89976cd73739dcb73745705a63ccc67a8be26cdf.tar.xz

sched/membarrier: Use per-CPU mutexes for targeted commands

Currently, the membarrier system call uses a single global mutex (`membarrier_ipi_mutex`) to serialize expedited commands. This causes significant contention on large systems when multiple threads invoke membarrier concurrently, even if they target different CPUs.

This contention becomes critical when combined with CFS bandwidth throttling/unthrottling, during which interrupts can be disabled for relatively long periods on target CPUs. If membarrier is waiting for a response from such a CPU, it holds the global mutex, blocking all other membarrier calls on the system. This cascade effect can lead to hard lockups when thousands of threads stall waiting for the mutex.

Optimize `MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ` when a specific CPU is targeted by introducing per-CPU mutexes. Broadcast commands and commands without a specific CPU target continue to use the global mutex. This prevents the cascade lockup scenario.

As measured by the stress test introduced in the subsequent patch, on an AMD Turin machine with 384 CPUs (2 NUMA nodes with SMT=2), this optimization yields 200x more throughput.

Signed-off-by: Aniket Gattani <aniketgattani@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260503212205.3714217-2-aniketgattani@google.com
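
The lock-selection idea described above can be illustrated with a small user-space sketch. This is not the kernel patch: it uses pthread mutexes, and every name in it (`sketch_pick_mutex`, `sketch_run_expedited`, `SKETCH_NR_CPUS`, `SKETCH_NO_CPU`) is hypothetical. It shows only that a command naming a specific CPU takes that CPU's mutex while everything else takes the global one. How the real code keeps broadcast and targeted commands from interfering with each other is not covered by the commit message and is not modelled here.

```c
/* User-space illustration only; all identifiers are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 8     /* assumed CPU count for the illustration */
#define SKETCH_NO_CPU  (-1)  /* assumed marker for "no specific CPU targeted" */

/* One global mutex, as in the behaviour the commit message describes as current. */
static pthread_mutex_t sketch_global_mutex = PTHREAD_MUTEX_INITIALIZER;

/* One mutex per CPU, used only by commands that target that CPU. */
static pthread_mutex_t sketch_cpu_mutex[SKETCH_NR_CPUS];

/* Initialise the per-CPU mutexes; call once before anything else. */
static void sketch_init(void)
{
    for (int i = 0; i < SKETCH_NR_CPUS; i++)
        pthread_mutex_init(&sketch_cpu_mutex[i], NULL);
}

/*
 * Pick the mutex for a command. A valid target CPU gets its own mutex;
 * anything else (including out-of-range values, a choice made by this
 * sketch) falls back to the global mutex.
 */
static pthread_mutex_t *sketch_pick_mutex(int cpu)
{
    if (cpu >= 0 && cpu < SKETCH_NR_CPUS)
        return &sketch_cpu_mutex[cpu];
    return &sketch_global_mutex;
}

/*
 * Run one "expedited command" under the selected mutex.
 * Returns 1 if a per-CPU mutex was used, 0 if the global mutex was used.
 */
static int sketch_run_expedited(int cpu)
{
    pthread_mutex_t *m = sketch_pick_mutex(cpu);
    int targeted = (m != &sketch_global_mutex);

    assert(m != NULL);
    pthread_mutex_lock(m);
    /* Placeholder: the kernel would signal the target CPU(s) and wait here. */
    pthread_mutex_unlock(m);
    return targeted;
}
```

With this arrangement, two callers targeting CPU 3 and CPU 4 take different mutexes, so a slow response from one CPU delays only the callers that target that same CPU.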

Diffstat (limited to 'include/linux')

0 files changed, 0 insertions, 0 deletions

