author		Guo Ren <guoren@linux.alibaba.com>	2023-01-05 05:19:52 +0300
committer	Ingo Molnar <mingo@kernel.org>	2023-01-05 13:01:50 +0300
commit		4282494a20cdcaf38d553f2c2ff6f252084f979c (patch)
tree		bea8c4e3804947e024255c90804de55bbb37e4f4 /kernel/locking
parent		512dee0c00ad9e9c7ae9f11fc6743702ea40caff (diff)
download	linux-4282494a20cdcaf38d553f2c2ff6f252084f979c.tar.xz
locking/qspinlock: Micro-optimize pending state waiting for unlock
When we're pending, we only care about the locked value; an xchg_tail from a later waiter does not affect the pending state. That means the hardware thread can stay in a sleep state and leave the rest of the core's execution-unit and pipeline resources to the other hardware threads. This concerns SMT hw-threads within the same core, not entry into a low-power state: the coherence granularity between cores is a cacheline, but the granularity between SMT hw-threads of the same core can be a byte, handled internally by the LSU. For example, when a hw-thread yields the core's resources to the other hw-threads, this patch helps it stay asleep instead of being woken up by another hw-thread's xchg_tail.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20230105021952.3090070-1-guoren@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
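Editorial note (not part of the commit message): the optimization relies on the qspinlock word layout, in which the tail occupies the upper 16 bits while the locked byte is bits 0-7, so xchg_tail() never writes the byte a pending waiter needs to watch. A minimal sketch of that layout and of the before/after wait conditions follows; bit positions and field names are taken from kernel/locking/qspinlock_types.h (NR_CPUS < 16K, little-endian case), and the struct name is changed here to mark it as an illustration.

/*
 * Illustration only -- mirrors struct qspinlock from
 * kernel/locking/qspinlock_types.h (NR_CPUS < 16K, little-endian):
 *
 *   bits  0- 7: locked byte
 *   bit      8: pending
 *   bits  9-15: unused
 *   bits 16-17: tail index
 *   bits 18-31: tail CPU (+1)
 */
struct qspinlock_sketch {
	union {
		atomic_t val;		/* the whole 32-bit lock word */
		struct {
			u8  locked;	/* cleared by the owner's unlock */
			u8  pending;
		};
		struct {
			u16 locked_pending;
			u16 tail;	/* the only field xchg_tail() writes */
		};
	};
};

/*
 * Before: watch the full word, so an xchg_tail() by a later waiter dirties
 * the watched location and can wake the spinning SMT hw-thread:
 *	atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_MASK));
 *
 * After: watch only the locked byte, which xchg_tail() never touches:
 *	smp_cond_load_acquire(&lock->locked, !VAL);
 */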
Diffstat (limited to 'kernel/locking')
-rw-r--r--	kernel/locking/qspinlock.c	4
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 2b23378775fe..ebe6b8ec7cb3 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -371,7 +371,7 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	/*
 	 * We're pending, wait for the owner to go away.
 	 *
-	 * 0,1,1 -> 0,1,0
+	 * 0,1,1 -> *,1,0
 	 *
 	 * this wait loop must be a load-acquire such that we match the
 	 * store-release that clears the locked bit and create lock
@@ -380,7 +380,7 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * barriers.
 	 */
 	if (val & _Q_LOCKED_MASK)
-		atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_MASK));
+		smp_cond_load_acquire(&lock->locked, !VAL);
 
 	/*
 	 * take ownership and clear the pending bit.
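
Editorial note: smp_cond_load_acquire(ptr, cond) spins on a single location until the condition on the loaded value (available as VAL) holds, then provides acquire ordering, matching the owner's store-release that clears the locked byte. Below is a rough sketch of its generic expansion, paraphrased from include/asm-generic/barrier.h; architectures may override it with a lighter-weight wait mechanism, and the _sketch suffix marks this as an illustration, not the kernel's actual definition.

/*
 * Rough sketch of the generic smp_cond_load_acquire():
 * busy-wait until "cond_expr" (written in terms of VAL) is true,
 * then upgrade to acquire ordering so later critical-section
 * accesses cannot be reordered before the final load.
 */
#define smp_cond_load_acquire_sketch(ptr, cond_expr)			\
({									\
	typeof(*(ptr)) VAL;						\
	for (;;) {							\
		VAL = READ_ONCE(*(ptr));				\
		if (cond_expr)						\
			break;						\
		cpu_relax();		/* polite spin hint */		\
	}								\
	smp_acquire__after_ctrl_dep();	/* acquire upgrade */		\
	VAL;								\
})

/* Usage as in the patch: wait until the owner clears the locked byte. */
/* smp_cond_load_acquire(&lock->locked, !VAL); */

With the patch, VAL is the 8-bit locked byte, so the wait condition is simply !VAL and the tail half-word is no longer part of the watched location.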