summaryrefslogtreecommitdiff
path: root/arch/s390/lib
AgeCommit message (Collapse)AuthorFilesLines
2017-08-29s390/uaccess: avoid mvcos jump labelMartin Schwidefsky1-12/+26
If the kernel is compiled for z10 or later machines the uaccess code inlines the mvcos instruction. The facility bit 27 which indicates the availability of MVCOS has to be set. The have_mvcos jump label will always be true. Make the generation of the have_mvcos jump label conditional on !CONFIG_HAVE_MARCH_Z10_FEATURES. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-26s390/spinlock: add niai spinlock hintsMartin Schwidefsky1-36/+51
The z14 machine introduces new mode of the next-instruction-access-intent NIAI instruction. With NIAI-8 it is possible to pin a cache-line on a CPU for a small amount of time, NIAI-7 releases the cache-line again. Finally NIAI-4 can be used to prevent the CPU to speculatively access memory beyond the compare-and-swap instruction to get the lock. Use these instruction in the spinlock code. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-26s390/time: add support for the TOD clock epoch extensionMartin Schwidefsky1-1/+1
The TOD epoch extension adds 8 epoch bits to the TOD clock to provide a continuous clock after 2042/09/17. The store-clock-extended (STCKE) instruction will store the epoch index in the first byte of the 16 bytes stored by the instruction. The read_boot_clock64 and the read_presistent_clock64 functions need to take the additional bits into account to give the correct result after 2042/09/17. The clock-comparator register will stay 64 bit wide. The comparison of the clock-comparator with the TOD clock is limited to bytes 1 to 8 of the extended TOD format. To deal with the overflow problem due to an epoch change the clock-comparator sign control in CR0 can be used to switch the comparison of the 64-bit TOD clock with the clock-comparator to a signed comparison. The decision between the signed vs. unsigned clock-comparator comparisons is done at boot time. Only if the TOD clock is in the second half of a 142 year epoch the signed comparison is used. This solves the epoch overflow issue as long as the machine is booted at least once in an epoch. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-05-09s390/uaccess: use sane length for __strncpy_from_user()Heiko Carstens1-2/+2
The average string that is copied from user space to kernel space is rather short. E.g. booting a system involves about 50.000 strncpy_from_user() calls where the NULL terminated string has an average size of 27 bytes. By default our s390 specific strncpy_from_user() implementation however copies up to 4096 bytes, which is a waste of cpu cycles and cache lines. Therefore reduce the default length to L1_CACHE_BYTES (256 bytes), which also reduces the average execution time of strncpy_from_user() by 30-40%. Alternatively we could have switched to the generic strncpy_from_user() implementation, however it turned out that that variant would be slower than the now optimized s390 variant. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-05-03s390/uprobes: fix compile for !KPROBESHeiko Carstens1-0/+1
Fix the following compile error(s) if CONFIG_KPROBES is disabled: arch/s390/kernel/uprobes.c:79:14: error: implicit declaration of function 'probe_get_fixup_type' arch/s390/kernel/uprobes.c:87:14: error: 'FIXUP_PSW_NORMAL' undeclared (first use in this function) Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-05-02Merge branch 'for-linus' of ↵Linus Torvalds1-55/+29
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: - three merges for KVM/s390 with changes for vfio-ccw and cpacf. The patches are included in the KVM tree as well, let git sort it out. - add the new 'trng' random number generator - provide the secure key verification API for the pkey interface - introduce the z13 cpu counters to perf - add a new system call to set up the guarded storage facility - simplify TASK_SIZE and arch_get_unmapped_area - export the raw STSI data related to CPU topology to user space - ... and the usual churn of bug-fixes and cleanups. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (74 commits) s390/crypt: use the correct module alias for paes_s390. s390/cpacf: Introduce kma instruction s390/cpacf: query instructions use unique parameters for compatibility with KMA s390/trng: Introduce s390 TRNG device driver. s390/crypto: Provide s390 specific arch random functionality. s390/crypto: Add new subfunctions to the cpacf PRNO function. s390/crypto: Renaming PPNO to PRNO. s390/pageattr: avoid unnecessary page table splitting s390/mm: simplify arch_get_unmapped_area[_topdown] s390/mm: make TASK_SIZE independent from the number of page table levels s390/gs: add regset for the guarded storage broadcast control block s390/kvm: Add use_cmma field to mm_context_t s390/kvm: Add PGSTE manipulation functions vfio: ccw: improve error handling for vfio_ccw_mdev_remove vfio: ccw: remove unnecessary NULL checks of a pointer s390/spinlock: remove compare and delay instruction s390/spinlock: use atomic primitives for spinlocks s390/cpumf: simplify detection of guest samples s390/pci: remove forward declaration s390/pci: increase the PCI_NR_FUNCTIONS default ...
2017-04-12s390/spinlock: remove compare and delay instructionMartin Schwidefsky1-28/+5
The CAD instruction never worked quite as expected for the spinlock code. It has been disabled by default with git commit 61b0b01686d48220, if the "cad" kernel parameter is specified it is enabled for both user space and the spinlock code. Leave the option to enable the instruction for user space but remove it from the spinlock code. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-04-12s390/spinlock: use atomic primitives for spinlocksMartin Schwidefsky1-38/+35
Add a couple more __atomic_xxx function to atomic_ops.h and use them to replace the compare-and-swap inlines in the spinlock code. This changes the type of the lock value from unsigned int to int. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-03-30s390: get rid of zeroing, switch to RAW_COPY_USERAl Viro1-45/+23
[folded a fix from Martin] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-02-17s390: replace ACCESS_ONCE with READ_ONCEChristian Borntraeger1-1/+1
Remove the last places of ACCESS_ONCE in s390 code. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-17s390: Audit and remove any remaining unnecessary uses of module.hPaul Gortmaker4-4/+5
Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each change instance for the presence of either and replace as needed. An instance where module_param was used without moduleparam.h was also fixed, as well as implicit use of ptrace.h and string.h headers. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16s390/lib: improve memmove, memset and memcpyHeiko Carstens1-15/+13
Improve the memmove implementation to save one instruction and use better label names. Also use better label names for the memset and memcpy implementations so everything looks consistent. Suggested-by: Jens Remus <jremus@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14s390/lib: add missing memory barriers to string inline assembliesHeiko Carstens1-6/+6
We have a couple of inline assemblies like memchr() and strlen() that read from memory, but tell the compiler only they need the addresses of the strings they access. This allows the compiler to omit the initialization of such strings and therefore generate broken code. Add the missing memory barrier to all string related inline assemblies to fix this potential issue. It looks like the compiler currently does not generate broken code due to these bugs. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14Merge branch 'for-linus' of ↵Linus Torvalds1-0/+39
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "The main bulk of the s390 patches for the 4.10 merge window: - Add support for the contiguous memory allocator. - The recovery for I/O errors in the dasd device driver is improved, the driver will now remove channel paths that are not working properly. - Additional fields are added to /proc/sysinfo, the extended partition name and the partition UUID. - New naming for PCI devices with system defined UIDs. - The last few remaining alloc_bootmem calls are converted to memblock. - The thread_info structure is stripped down and moved to the task_struct. The only field left in thread_info is the flags field. - Rework of the arch topology code to fix a fake numa issue. - Refactoring of the atomic primitives and add a new preempt_count implementation. - Clocksource steering for the STP sync check offsets. - The s390 specific headers are changed to make them usable with CLANG. - Bug fixes and cleanup" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (70 commits) s390/cpumf: Use configuration level indication for sampling data s390: provide memmove implementation s390: cleanup arch/s390/kernel Makefile s390: fix initrd corruptions with gcov/kcov instrumented kernels s390: exclude early C code from gcov profiling s390/dasd: channel path aware error recovery s390/dasd: extend dasd path handling s390: remove unused labels from entry.S s390/vmlogrdr: fix IUCV buffer allocation s390/crypto: unlock on error in prng_tdes_read() s390/sysinfo: show partition extended name and UUID if available s390/numa: pin all possible cpus to nodes early s390/numa: establish cpu to node mapping early s390/topology: use cpu_topology array instead of per cpu variable s390/smp: initialize cpu_present_mask in setup_arch s390/topology: always use s390 specific sched_domain_topology_level s390/smp: use smp_get_base_cpu() helper function s390/numa: always use logical cpu and core ids s390: Remove VLAIS in ptff() and clear_table() s390: fix machine check panic stack switch ...
2016-12-12s390: provide memmove implementationHeiko Carstens1-0/+39
Provide an s390 specific memmove implementation which is faster than the generic implementation which copies byte-wise. For non-destructive (as defined by the mvc instruction) memmove operations the following table compares the old default implementation versus the new s390 specific implementation: size old new 1 1ns 8ns 2 2ns 8ns 4 4ns 8ns 8 7ns 8ns 16 17ns 8ns 32 35ns 8ns 64 65ns 9ns 128 146ns 10ns 256 298ns 11ns 512 537ns 11ns 1024 1193ns 19ns 2048 2405ns 36ns So only for very small sizes the old implementation is faster. For overlapping memmoves, where the mvc instruction can't be used, the new implementation is as slow as the old one. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-22locking/spinlocks, s390: Implement vcpu_is_preempted(cpu)Christian Borntraeger1-17/+8
This implements the s390 version for vcpu_is_preempted(cpu), by reworking the existing smp_vcpu_scheduled() function into arch_vcpu_is_preempted(). We can then also get rid of the local cpu_is_preempted() function by moving the CIF_ENABLED_WAIT test into arch_vcpu_is_preempted(). Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David.Laight@ACULAB.COM Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: benh@kernel.crashing.org Cc: boqun.feng@gmail.com Cc: bsingharora@gmail.com Cc: dave@stgolabs.net Cc: jgross@suse.com Cc: kernellwp@gmail.com Cc: konrad.wilk@oracle.com Cc: linuxppc-dev@lists.ozlabs.org Cc: mpe@ellerman.id.au Cc: paulmck@linux.vnet.ibm.com Cc: paulus@samba.org Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Cc: virtualization@lists.linux-foundation.org Cc: will.deacon@arm.com Cc: xen-devel-request@lists.xenproject.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1478077718-37424-6-git-send-email-xinhui.pan@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-15Merge branch 'kbuild' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild updates from Michal Marek: - EXPORT_SYMBOL for asm source by Al Viro. This does bring a regression, because genksyms no longer generates checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is working on a patch to fix this. Plus, we are talking about functions like strcpy(), which rarely change prototypes. - Fixes for PPC fallout of the above by Stephen Rothwell and Nick Piggin - fixdep speedup by Alexey Dobriyan. - preparatory work by Nick Piggin to allow architectures to build with -ffunction-sections, -fdata-sections and --gc-sections - CONFIG_THIN_ARCHIVES support by Stephen Rothwell - fix for filenames with colons in the initramfs source by me. * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits) initramfs: Escape colons in depfile ppc: there is no clear_pages to export powerpc/64: whitelist unresolved modversions CRCs kbuild: -ffunction-sections fix for archs with conflicting sections kbuild: add arch specific post-link Makefile kbuild: allow archs to select link dead code/data elimination kbuild: allow architectures to use thin archives instead of ld -r kbuild: Regenerate genksyms lexer kbuild: genksyms fix for typeof handling fixdep: faster CONFIG_ search ia64: move exports to definitions sparc32: debride memcpy.S a bit [sparc] unify 32bit and 64bit string.h sparc: move exports to definitions ppc: move exports to definitions arm: move exports to definitions s390: move exports to definitions m68k: move exports to definitions alpha: move exports to actual definitions x86: move exports to actual definitions ...
2016-08-17Merge branch 'for-linus' of ↵Linus Torvalds1-9/+7
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "A couple of bug fixes, minor cleanup and a change to the default config" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/dasd: fix failing CUIR assignment under LPAR s390/pageattr: handle numpages parameter correctly s390/dasd: fix hanging device after clear subchannel s390/qdio: avoid reschedule of outbound tasklet once killed s390/qdio: remove checks for ccw device internal state s390/qdio: fix double return code evaluation s390/qdio: get rid of spin_lock_irqsave usage s390/cio: remove subchannel_id from ccw_device_private s390/qdio: obtain subchannel_id via ccw_device_get_schid() s390/cio: stop using subchannel_id from ccw_device_private s390/config: make the vector optimized crc function builtin s390/lib: fix memcmp and strstr s390/crc32-vx: Fix checksum calculation for small sizes s390: clarify compressed image code path
2016-08-09Merge tag 'usercopy-v4.8' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull usercopy protection from Kees Cook: "Tbhis implements HARDENED_USERCOPY verification of copy_to_user and copy_from_user bounds checking for most architectures on SLAB and SLUB" * tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: mm: SLUB hardened usercopy support mm: SLAB hardened usercopy support s390/uaccess: Enable hardened usercopy sparc/uaccess: Enable hardened usercopy powerpc/uaccess: Enable hardened usercopy ia64/uaccess: Enable hardened usercopy arm64/uaccess: Enable hardened usercopy ARM: uaccess: Enable hardened usercopy x86/uaccess: Enable hardened usercopy mm: Hardened usercopy mm: Implement stack frame object validation mm: Add is_migrate_cma_page
2016-08-08s390/lib: fix memcmp and strstrChristian Borntraeger1-9/+7
if two string compare equal the clcle instruction will update the string addresses to point _after_ the string. This might already be on a different page, so we should not use these pointer to calculate the difference as in that case the calculation of the difference can cause oopses. The return value of memcmp does not need the difference, we can just reuse the condition code and return for CC=1 (All bytes compared, first operand low) -1 and for CC=2 (All bytes compared, first operand high) +1 strstr also does not need the diff. While fixing this, make the common function clcle "correct on its own" by using l1 instead of l2 for the first length. strstr will call this with l2 for both strings. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Fixes: db7f5eef3dc0 ("s390/lib: use basic blocks for inline assemblies") Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-08s390: move exports to definitionsAl Viro1-0/+3
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-07-27s390/uaccess: Enable hardened usercopyKees Cook1-0/+2
Enables CONFIG_HARDENED_USERCOPY checks on s390. Signed-off-by: Kees Cook <keescook@chromium.org>
2016-06-28s390/lib: use basic blocks for inline assembliesHeiko Carstens1-24/+26
Use only simple inline assemblies which consist of a single basic block if the register asm construct is being used. Otherwise gcc would generate broken code if the compiler option --sanitize-coverage=trace-pc would be used. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-06-13s390/uaccess: fix whitespace damageHeiko Carstens1-3/+3
Fix some whitespace damage that was introduced by me with a query-replace when removing 31 bit support. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-04-15s390/spinlock: avoid yield to non existent cpuHeiko Carstens1-0/+1
arch_spin_lock_wait_flags() checks if a spinlock is not held before trying a compare and swap instruction. If the lock is unlocked it tries the compare and swap instruction, however if a different cpu grabbed the lock in the meantime the instruction will fail as expected. Subsequently the arch_spin_lock_wait_flags() incorrectly tries to figure out if the cpu that holds the lock is running. However it is using the wrong cpu number for this (-1) and then will also yield the current cpu to the wrong cpu. Fix this by adding a missing continue statement. Fixes: 470ada6b1a1d ("s390/spinlock: refactor arch_spin_lock_wait[_flags]") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-23s390/xor: optimized xor routing using the XC instructionMartin Schwidefsky2-1/+135
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-11-27s390/spinlock: do not yield to a CPU in udelay/mdelayMartin Schwidefsky1-8/+17
It does not make sense to try to relinquish the time slice with diag 0x9c to a CPU in a state that does not allow to schedule the CPU. The scenario where this can happen is a CPU waiting in udelay/mdelay while holding a spin-lock. Add a CIF bit to tag a CPU in enabled wait and use it to detect that the yield of a CPU will not be successful and skip the diagnose call. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-11-27s390/spinlock: avoid diagnose loopMartin Schwidefsky1-9/+19
The spinlock implementation calls the diagnose 0x9c / 0x44 immediately if the SIGP sense running reported the target CPU as not running. The diagnose 0x9c is a hint to the hypervisor to schedule the target CPU in preference to the source CPU that issued the diagnose. It can happen that on return from the diagnose the target CPU has not been scheduled yet, e.g. if the target logical CPU is on another physical CPU and the hypervisor did not want to migrate the logical CPU. Avoid the immediate repeat of the diagnose instruction, instead do the retry loop before the next invocation of diagnose 0x9c. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-14s390/bitops: remove 31 bit related commentsHeiko Carstens1-3/+1
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-14s390/udelay: make udelay have busy loop semanticsHeiko Carstens1-16/+14
When using systemtap it was observed that our udelay implementation is rather suboptimal if being called from a kprobe handler installed by systemtap. The problem observed when a kprobe was installed on lock_acquired(). When the probe was hit the kprobe handler did call udelay, which set up an (internal) timer and reenabled interrupts (only the clock comparator interrupt) and waited for the interrupt. This is an optimization to avoid that the cpu is busy looping while waiting that enough time passes. The problem is that the interrupt handler still does call irq_enter()/irq_exit() which then again can lead to a deadlock, since some accounting functions may take locks as well. If one of these locks is the same, which caused lock_acquired() to be called, we have a nice deadlock. This patch reworks the udelay code for the interrupts disabled case to immediately leave the low level interrupt handler when the clock comparator interrupt happens. That way no C code is being called and the deadlock cannot happen anymore. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-14s390/spinlock: use correct barriersChristian Borntraeger1-2/+2
_raw_write_lock_wait first sets the high order bit to indicate a pending writer and then waits for the reader to drop to zero. smp_rmb by definition only orders reads against reads. Let's use a full smp_mb instead. As right now smp_rmb is implemented as full serialization, this needs no stable backport, but this patch will be necessary if we reimplement smp_rmb. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-04Merge branch 'locking-core-for-linus' of ↵Linus Torvalds1-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking and atomic updates from Ingo Molnar: "Main changes in this cycle are: - Extend atomic primitives with coherent logic op primitives (atomic_{or,and,xor}()) and deprecate the old partial APIs (atomic_{set,clear}_mask()) The old ops were incoherent with incompatible signatures across architectures and with incomplete support. Now every architecture supports the primitives consistently (by Peter Zijlstra) - Generic support for 'relaxed atomics': - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return() - atomic_read_acquire() - atomic_set_release() This came out of porting qwrlock code to arm64 (by Will Deacon) - Clean up the fragile static_key APIs that were causing repeat bugs, by introducing a new one: DEFINE_STATIC_KEY_TRUE(name); DEFINE_STATIC_KEY_FALSE(name); which define a key of different types with an initial true/false value. Then allow: static_branch_likely() static_branch_unlikely() to take a key of either type and emit the right instruction for the case. To be able to know the 'type' of the static key we encode it in the jump entry (by Peter Zijlstra) - Static key self-tests (by Jason Baron) - qrwlock optimizations (by Waiman Long) - small futex enhancements (by Davidlohr Bueso) - ... and misc other changes" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits) jump_label/x86: Work around asm build bug on older/backported GCCs locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics locking/qrwlock: Implement queue_write_unlock() using smp_store_release() locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t' locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic locking/static_keys: Make verify_keys() static jump label, locking/static_keys: Update docs locking/static_keys: Provide a selftest jump_label: Provide a self-test s390/uaccess, locking/static_keys: employ static_branch_likely() x86, tsc, locking/static_keys: Employ static_branch_likely() locking/static_keys: Add selftest locking/static_keys: Add a new static_key interface locking/static_keys: Rework update logic locking/static_keys: Add static_key_{en,dis}able() helpers ...
2015-08-19s390/uaccess: remove uaccess_primary kernel parameterHeiko Carstens1-14/+1
get_user() and put_user() are inline functions in the meantime again. Both will generate the mvcos instruction if compiled with -march=z10 (or greater). The kernel parameter "uaccess_primary" can only change the behavior of out-of-line uaccess functions like copy_from_user() to not use the mvcos instruction, but not for the above named inlined functions. Therefore it is quite useless and the parameter can be removed. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-08-07s390/lib: export __delayGuenter Roeck1-0/+1
__delay is exported by most architectures, and may be used in modules. Since it is not exported for s390, s390:allmodconfig currently fails to build with ERROR: "__delay" [drivers/net/phy/mdio-octeon.ko] undefined! Fixes: a6d678645210 ("net: mdio-octeon: Modify driver to work on both ThunderX and Octeon") Cc: Radha Mohan Chintakuntla <rchintakuntla@cavium.com> Cc: David Daney <david.daney@cavium.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-08-03s390/uaccess, locking/static_keys: employ static_branch_likely()Heiko Carstens1-6/+6
Use the new static_branch_likely() primitive to make sure that the most likely case is executed without taking an unconditional branch. This wasn't possible with the old jump label primitives. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20150729064600.GB3953@osiris Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-25s390: remove "64" suffix from mem64.S and swsusp_asm64.SHeiko Carstens2-1/+1
Rename two more files which I forgot. Also remove the "asm" from the swsusp_asm64.S file, since the ".S" suffix already makes it obvious that this file contains assembler code. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-03-25s390: remove 31 bit supportHeiko Carstens6-420/+62
Remove the 31 bit support in order to reduce maintenance cost and effectively remove dead code. Since a couple of years there is no distribution left that comes with a 31 bit kernel. The 31 bit kernel also has been broken since more than a year before anybody noticed. In addition I added a removal warning to the kernel shown at ipl for 5 minutes: a960062e5826 ("s390: add 31 bit warning message") which let everybody know about the plan to remove 31 bit code. We didn't get any response. Given that the last 31 bit only machine was introduced in 1999 let's remove the code. Anybody with 31 bit user space code can still use the compat mode. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-01-23s390/spinlock: add compare-and-delay to lock wait loopsMartin Schwidefsky1-7/+45
Add the compare-and-delay instruction to the spin-lock and rw-lock retry loops. A CPU executing the compare-and-delay instruction stops until the lock value has changed. This is done to make the locking code for contended locks to behave better in regard to the multi- hreading facility. A thread of a core executing a compare-and-delay will allow the other threads of a core to get a larger share of the core resources. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-10-17s390/uprobes: fix kprobes dependencyJan Willeke1-1/+1
If kprobes is disabled uprobes will not compile. Fix this by including the correct header files. Signed-off-by: Jan Willeke <willeke@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-10-09s390/idle: consolidate idle functions and definitionsMartin Schwidefsky1-2/+2
Move the C functions and definitions related to the idle state handling to arch/s390/include/asm/idle.h and arch/s390/kernel/idle.c. The function s390_get_idle_time is renamed to arch_cpu_idle_time and vtime_stop_cpu to enabled_wait. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25s390/uprobes: common library for kprobes and uprobesJan Willeke2-0/+161
This patch moves common functions from kprobes.c to probes.c. Thus its possible for uprobes to use them without enabling kprobes. Signed-off-by: Jan Willeke <willeke@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25s390/rwlock: use the interlocked-access facility 1 instructionsMartin Schwidefsky1-0/+34
Make use of the load-and-add, load-and-or and load-and-and instructions to atomically update the read-write lock without a compare-and-swap loop. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25s390/rwlock: improve writer fairnessMartin Schwidefsky1-5/+9
Set the write-lock bit in the out-of-line rwlock code to indicate that a writer is waiting. Additional readers will no be able to get the lock until at least one writer got the lock. Additional writers have to wait for the first writer to release the lock again. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25s390/rwlock: remove interrupt-enabling rwlock variant.Martin Schwidefsky1-50/+0
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25s390/rwlock: use directed yield for write-locked rwlocksMartin Schwidefsky1-19/+30
Add an owner field to the arch_rwlock_t to be able to pass the timeslice of a virtual CPU with diagnose 0x9c to the lock owner in case the rwlock is write-locked. The undirected yield in case the rwlock is acquired writable but the lock is read-locked is removed. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-05-20s390/spinlock: refactor arch_spin_lock_wait[_flags]Martin Schwidefsky1-34/+47
Reorder the spinlock wait code to make it more readable. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-05-20s390/rwlock: add missing local_irq_restore callsMartin Schwidefsky1-0/+2
The out of line _raw_read_lock_wait_flags/_raw_write_lock_wait_flags functions for the arch_read_lock_flags/arch_write_lock_flags calls fail to re-enable the interrupts after another unsuccessful try to get the lock with compare-and-swap. The following wait would be done with interrupts disabled which is suboptimal. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-05-20s390/spinlock,rwlock: always to a load-and-test firstMartin Schwidefsky1-13/+16
In case a lock is contended it is better to do a load-and-test first before trying to get the lock with compare-and-swap. This helps to avoid unnecessary cache invalidations of the cacheline for the lock if the CPU has to wait for the lock. For an uncontended lock doing the compare-and-swap directly is a bit better, if the CPU does not have the cacheline in its cache yet the compare-and-swap will get it read-write immediately while a load-and-test would get it read-only first. Always to the load-and-test first to avoid the cacheline invalidations for the contended case outweight the potential read-only to read-write cacheline upgrade for the uncontended case. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-05-20s390/spinlock: fix system hang with spin_retry <= 0Gerald Schaefer1-6/+8
On LPAR, when spin_retry is set to <= 0, arch_spin_lock_wait() and arch_spin_lock_wait_flags() may end up in a while(1) loop w/o doing any compare and swap operation. To fix this, use do/while instead of for loop. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-05-20s390/uaccess: simplify control register updatesMartin Schwidefsky1-5/+5
Always switch to the kernel ASCE in switch_mm. Load the secondary space ASCE in finish_arch_post_lock_switch after checking that any pending page table operations have completed. The primary ASCE is loaded in entry[64].S. With this the update_primary_asce call can be removed from the switch_to macro and from the start of switch_mm function. Remove the load_primary argument from update_user_asce/clear_user_asce, rename update_user_asce to set_user_asce and rename update_primary_asce to load_kernel_asce. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>