kernel/linux.git/include/asm-generic/qspinlock.h, branch v6.6.131

asm-generic: qspinlock: fix queued_spin_value_unlocked() implementation

2023-12-20T16:01:59+00:00

[ Upstream commit 125b0bb95dd6bec81b806b997a4ccb026eeecf8f ] We really don't want to do atomic_read() or anything like that, since we already have the value, not the lock. The whole point of this is that we've loaded the lock from memory, and we want to check whether the value we loaded was a locked one or not. The main use of this is the lockref code, which loads both the lock and the reference count in one atomic operation, and then works on that combined value. With the atomic_read(), the compiler would pointlessly spill the value to the stack, in order to then be able to read it back "atomically". This is the qspinlock version of commit c6f4a9002252 ("asm-generic: ticket-lock: Optimize arch_spin_value_unlocked()") which fixed this same bug for ticket locks. Cc: Guo Ren Cc: Ingo Molnar Cc: Waiman Long Link: https://lore.kernel.org/all/CAHk-=whNRv0v6kQiV5QO6DJhjH4KEL36vWQ6Re8Csrnh4zbRkQ@mail.gmail.com/ Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin

asm-generic: qspinlock: Indicate the use of mixed-size atomics

2022-05-11T18:49:47+00:00

The qspinlock implementation depends on having well behaved mixed-size atomics. This is true on the more widely-used platforms, but these requirements are somewhat subtle and may not be satisfied by all the platforms that qspinlock is used on. Document these requirements, so ports that use qspinlock can more easily determine if they meet these requirements. Signed-off-by: Peter Zijlstra (Intel) Acked-by: Waiman Long Reviewed-by: Arnd Bergmann Signed-off-by: Palmer Dabbelt

qspinlock: use signed temporaries for cmpxchg

2020-10-26T19:19:48+00:00

When building with W=2, the build log is flooded with include/asm-generic/qrwlock.h:65:56: warning: pointer targets in passing argument 2 of 'atomic_try_cmpxchg_acquire' differ in signedness [-Wpointer-sign] include/asm-generic/qrwlock.h:92:53: warning: pointer targets in passing argument 2 of 'atomic_try_cmpxchg_acquire' differ in signedness [-Wpointer-sign] include/asm-generic/qspinlock.h:68:55: warning: pointer targets in passing argument 2 of 'atomic_try_cmpxchg_acquire' differ in signedness [-Wpointer-sign] include/asm-generic/qspinlock.h:82:52: warning: pointer targets in passing argument 2 of 'atomic_try_cmpxchg_acquire' differ in signedness [-Wpointer-sign] The atomics are built on top of signed integers, but the caller doesn't actually care. Just use signed types as well. Fixes: 27df89689e25 ("locking/spinlocks: Remove an instruction from spin and write locks") Signed-off-by: Arnd Bergmann

Merge tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

2020-08-07T17:33:50+00:00

Pull powerpc updates from Michael Ellerman: - Add support for (optionally) using queued spinlocks & rwlocks. - Support for a new faster system call ABI using the scv instruction on Power9 or later. - Drop support for the PROT_SAO mmap/mprotect flag as it will be unsupported on Power10 and future processors, leaving us with no way to implement the functionality it requests. This risks breaking userspace, though we believe it is unused in practice. - A bug fix for, and then the removal of, our custom stack expansion checking. We now allow stack expansion up to the rlimit, like other architectures. - Remove the remnants of our (previously disabled) topology update code, which tried to react to NUMA layout changes on virtualised systems, but was prone to crashes and other problems. - Add PMU support for Power10 CPUs. - A change to our signal trampoline so that we don't unbalance the link stack (branch return predictor) in the signal delivery path. - Lots of other cleanups, refactorings, smaller features and so on as usual. Thanks to: Abhishek Goel, Alastair D'Silva, Alexander A. Klimov, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Balamuruhan S, Bharata B Rao, Bill Wendling, Bin Meng, Cédric Le Goater, Chris Packham, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Dan Williams, David Lamparter, Desnes A. Nunes do Rosario, Erhard F., Finn Thain, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geoff Levand, Greg Kurz, Gustavo A. R. Silva, Hari Bathini, Harish, Imre Kaloz, Joel Stanley, Joe Perches, John Crispin, Jordan Niethe, Kajol Jain, Kamalesh Babulal, Kees Cook, Laurent Dufour, Leonardo Bras, Li RongQing, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Cave-Ayland, Michal Suchanek, Milton Miller, Mimi Zohar, Murilo Opsfelder Araujo, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Oliver O'Halloran, Palmer Dabbelt, Pedro Miraglia Franco de Carvalho, Philippe Bergheaud, Pingfan Liu, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Randy Dunlap, Ravi Bangoria, Sachin Sant, Sam Bobroff, Sandipan Das, Santosh Sivaraj, Satheesh Rajendran, Shirisha Ganta, Sourabh Jain, Srikar Dronamraju, Stan Johnson, Stephen Rothwell, Thadeu Lima de Souza Cascardo, Thiago Jung Bauermann, Tom Lane, Vaibhav Jain, Vladis Dronov, Wei Yongjun, Wen Xiong, YueHaibing. * tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (337 commits) selftests/powerpc: Fix pkey syscall redefinitions powerpc: Fix circular dependency between percpu.h and mmu.h powerpc/powernv/sriov: Fix use of uninitialised variable selftests/powerpc: Skip vmx/vsx/tar/etc tests on older CPUs powerpc/40x: Fix assembler warning about r0 powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric powerpc/papr_scm: Fetch nvdimm performance stats from PHYP cpuidle: pseries: Fixup exit latency for CEDE(0) cpuidle: pseries: Add function to parse extended CEDE records cpuidle: pseries: Set the latency-hint before entering CEDE selftests/powerpc: Fix online CPU selection powerpc/perf: Consolidate perf_callchain_user_[64|32]() powerpc/pseries/hotplug-cpu: Remove double free in error path powerpc/pseries/mobility: Add pr_debug() for device tree changes powerpc/pseries/mobility: Set pr_fmt() powerpc/cacheinfo: Warn if cache object chain becomes unordered powerpc/cacheinfo: Improve diagnostics about malformed cache lists powerpc/cacheinfo: Use name@unit instead of full DT path in debug messages powerpc/cacheinfo: Set pr_fmt() powerpc: fix function annotations to avoid section mismatch warnings with gcc-10 ...

locking/qspinlock: Do not include atomic.h from qspinlock_types.h

2020-07-29T14:14:19+00:00

This patch breaks a header loop involving qspinlock_types.h. The issue is that qspinlock_types.h includes atomic.h, which then eventually includes kernel.h which could lead back to the original file via spinlock_types.h. As ATOMIC_INIT is now defined by linux/types.h, there is no longer any need to include atomic.h from qspinlock_types.h. This also allows the CONFIG_PARAVIRT hack to be removed since it was trying to prevent exactly this loop. Signed-off-by: Herbert Xu Signed-off-by: Peter Zijlstra (Intel) Acked-by: Waiman Long Link: https://lkml.kernel.org/r/20200729123316.GC7047@gondor.apana.org.au

powerpc/pseries: Implement paravirt qspinlocks for SPLPAR

2020-07-26T14:01:29+00:00

This implements the generic paravirt qspinlocks using H_PROD and H_CONFER to kick and wait. This uses an un-directed yield to any CPU rather than the directed yield to a pre-empted lock holder that paravirtualised simple spinlocks use, that requires no kick hcall. This is something that could be investigated and improved in future. Performance results can be found in the commit which added queued spinlocks. Signed-off-by: Nicholas Piggin Acked-by: Peter Zijlstra (Intel) Acked-by: Waiman Long Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200724131423.1362108-5-npiggin@gmail.com

powerpc/64s: Implement queued spinlocks and rwlocks

2020-07-26T14:01:23+00:00

These have shown significantly improved performance and fairness when spinlock contention is moderate to high on very large systems. With this series including subsequent patches, on a 16 socket 1536 thread POWER9, a stress test such as same-file open/close from all CPUs gets big speedups, 11620op/s aggregate with simple spinlocks vs 384158op/s (33x faster), where the difference in throughput between the fastest and slowest thread goes from 7x to 1.4x. Thanks to the fast path being identical in terms of atomics and barriers (after a subsequent optimisation patch), single threaded performance is not changed (no measurable difference). On smaller systems, performance and fairness seems to be generally improved. Using dbench on tmpfs as a test (that starts to run into kernel spinlock contention), a 2-socket OpenPOWER POWER9 system was tested with bare metal and KVM guest configurations. Results can be found here: https://github.com/linuxppc/issues/issues/305#issuecomment-663487453 Observations are: - Queued spinlocks are equal when contention is insignificant, as expected and as measured with microbenchmarks. - When there is contention, on bare metal queued spinlocks have better throughput and max latency at all points. - When virtualised, queued spinlocks are slightly worse approaching peak throughput, but significantly better throughput and max latency at all points beyond peak, until queued spinlock maximum latency rises when clients are 2x vCPUs. The regressions haven't been analysed very well yet, there are a lot of things that can be tuned, particularly the paravirtualised locking, but the numbers already look like a good net win even on relatively small systems. Signed-off-by: Nicholas Piggin Acked-by: Peter Zijlstra (Intel) Acked-by: Waiman Long Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200724131423.1362108-4-npiggin@gmail.com

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157

2019-05-30T18:26:37+00:00

Based on 3 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version [author] [kishon] [vijay] [abraham] [i] [kishon]@[ti] [com] this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version [author] [graeme] [gregory] [gg]@[slimlogic] [co] [uk] [author] [kishon] [vijay] [abraham] [i] [kishon]@[ti] [com] [based] [on] [twl6030]_[usb] [c] [author] [hema] [hk] [hemahk]@[ti] [com] this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1105 file(s). Signed-off-by: Thomas Gleixner Reviewed-by: Allison Randal Reviewed-by: Richard Fontana Reviewed-by: Kate Stewart Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070033.202006027@linutronix.de Signed-off-by: Greg Kroah-Hartman

locking/spinlocks: Remove an instruction from spin and write locks

2018-10-02T07:49:42+00:00

Both spin locks and write locks currently do: f0 0f b1 17 lock cmpxchg %edx,(%rdi) 85 c0 test %eax,%eax 75 05 jne [slowpath] This 'test' insn is superfluous; the cmpxchg insn sets the Z flag appropriately. Peter pointed out that using atomic_try_cmpxchg_acquire() will let the compiler know this is true. Comparing before/after disassemblies show the only effect is to remove this insn. Take this opportunity to make the spin & write lock code resemble each other more closely and have similar likely() hints. Suggested-by: Peter Zijlstra Signed-off-by: Matthew Wilcox Signed-off-by: Peter Zijlstra (Intel) Acked-by: Will Deacon Cc: Arnd Bergmann Cc: Linus Torvalds Cc: Thomas Gleixner Cc: Waiman Long Link: http://lkml.kernel.org/r/20180820162639.GC25153@bombadil.infradead.org Signed-off-by: Ingo Molnar

locking/spinlocks: Clean up comment and #ifndef for {,queued_}spin_is_locked()

2018-05-15T06:11:15+00:00

Removes "#ifndef queued_spin_is_locked" from the generic code: this is unused and it's reasonable to conclude that it will continue to be unused. Also removes the comment about spin_is_locked() from mutex_is_locked(): the comment remains valid but not particularly useful. Suggested-by: Will Deacon Signed-off-by: Andrea Parri Signed-off-by: Paul E. McKenney Acked-by: Will Deacon Cc: Andrew Morton Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: akiyks@gmail.com Cc: boqun.feng@gmail.com Cc: dhowells@redhat.com Cc: j.alglave@ucl.ac.uk Cc: linux-arch@vger.kernel.org Cc: luc.maranget@inria.fr Cc: npiggin@gmail.com Cc: parri.andrea@gmail.com Cc: stern@rowland.harvard.edu Link: http://lkml.kernel.org/r/1526338889-7003-3-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar