locking: Introduce smp_mb__after_spinlock()

Since its inception, our understanding of ACQUIRE, esp. as applied to
spinlocks, has changed somewhat. Also, I wonder if, with a simple
change, we cannot make it provide more.

The problem with the comment is that the STORE done by spin_lock isn't
itself ordered by the ACQUIRE, and therefore a later LOAD can pass over
it and cross with any prior STORE, rendering the default WMB
insufficient (pointed out by Alan).

Now, this is only really a problem on PowerPC and ARM64, both of
which already defined smp_mb__before_spinlock() as a smp_mb().

At the same time, we can get a much stronger construct if we place
that same barrier _inside_ the spin_lock(). In that case we upgrade the
RCpc spinlock to an RCsc. That would make all schedule() calls
fully transitive against one another.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
author: Peter Zijlstra <peterz@infradead.org> 2016-09-05 12:37:53 +0300
committer: Ingo Molnar <mingo@kernel.org> 2017-08-10 13:29:02 +0300
commit: d89e588ca4081615216cc25f2489b0281ac0bfe9 (patch)
tree: 9f3fd5958adb8b6a0a86065ca0c0603fc73c3c06 /kernel/sched
parent: ff7a5fb0f1d510997a845e0d227f30831ff38d9d (diff)
download: linux-d89e588ca4081615216cc25f2489b0281ac0bfe9.tar.xz
1 files changed, 2 insertions, 2 deletions
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0869b20fba81..9fece583a1f0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1967,8 +1967,8 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	 * reordered with p->state check below. This pairs with mb() in
 	 * set_current_state() the waiting thread does.
 	 */
-	smp_mb__before_spinlock();
 	raw_spin_lock_irqsave(&p->pi_lock, flags);
+	smp_mb__after_spinlock();
 	if (!(p->state & state))
 		goto out;
 
@@ -3281,8 +3281,8 @@ static void __sched notrace __schedule(bool preempt)
 	 * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
 	 * done by the caller to avoid the race with signal_wake_up().
 	 */
-	smp_mb__before_spinlock();
 	rq_lock(rq, &rf);
+	smp_mb__after_spinlock();
 
 	/* Promote REQ to ACT */
 	rq->clock_update_flags <<= 1;
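
The ordering that both hunks rely on can be sketched in userspace with C11 atomics. This is only an analogy, not kernel code: every `toy_*` name below is hypothetical, the acquire-only `atomic_flag` lock stands in for an RCpc spinlock, and a `memory_order_seq_cst` fence stands in for smp_mb__after_spinlock() and for the mb() in set_current_state(). The C11 memory model is not the Linux kernel memory model, so treat this as an illustration of where the barrier sits rather than a proof of the kernel's guarantees.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names are kernel API. */
struct toy_task {
	atomic_flag pi_lock;	/* acquire-only spinlock (RCpc analogue) */
	atomic_int state;	/* nonzero while the task intends to sleep */
	atomic_int cond;	/* the condition the sleeper waits for */
};

static void toy_task_init(struct toy_task *p)
{
	atomic_flag_clear(&p->pi_lock);
	atomic_init(&p->state, 0);
	atomic_init(&p->cond, 0);
}

static void toy_spin_lock(atomic_flag *l)
{
	/* ACQUIRE only: the lock's own store is not ordered against later loads. */
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;
}

static void toy_spin_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Assumed analogue of smp_mb__after_spinlock(): a full fence after the lock. */
static void toy_mb_after_spinlock(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/* Waker side, shaped like the try_to_wake_up() hunk. */
static bool toy_try_to_wake_up(struct toy_task *p, int state_mask)
{
	bool woke = false;

	/* Prior STORE: publish the condition before taking the lock. */
	atomic_store_explicit(&p->cond, 1, memory_order_relaxed);

	toy_spin_lock(&p->pi_lock);
	toy_mb_after_spinlock();	/* order the store above before the load below */

	/* Later LOAD: must not be satisfied before the condition store. */
	if (atomic_load_explicit(&p->state, memory_order_relaxed) & state_mask) {
		atomic_store_explicit(&p->state, 0, memory_order_relaxed);
		woke = true;
	}
	toy_spin_unlock(&p->pi_lock);
	return woke;
}

/* Sleeper side, shaped like set_current_state() followed by a condition check. */
static bool toy_prepare_to_sleep(struct toy_task *p, int state)
{
	atomic_store_explicit(&p->state, state, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* pairs with the waker's fence */

	/* Returns true if the task should still go to sleep. */
	return atomic_load_explicit(&p->cond, memory_order_relaxed) == 0;
}
```

With a full fence on both sides, at least one of the two threads observes the other's store: either the sleeper sees `cond` set and does not sleep, or the waker sees `state` set and wakes it. Without the fence after the lock, the waker's load of `state` could be reordered ahead of its store to `cond`, which is the lost-wakeup case the commit message describes.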