summaryrefslogtreecommitdiff
path: root/arch/arm64/include/asm
diff options
context:
space:
mode:
authorWill Deacon <will.deacon@arm.com>2018-12-12 17:17:20 +0300
committerWill Deacon <will.deacon@arm.com>2018-12-12 17:43:35 +0300
commit6e4ede698d1c17102e42c6f734f7d8cf60df28d5 (patch)
tree3c1e1d857756822ed612c0451cca580abf9d8255 /arch/arm64/include/asm
parentc3296a1391cb0ffbea95ee2249af63cb45e4899b (diff)
downloadlinux-6e4ede698d1c17102e42c6f734f7d8cf60df28d5.tar.xz
arm64: percpu: Fix LSE implementation of value-returning pcpu atomics
Commit 959bf2fd03b5 ("arm64: percpu: Rewrite per-cpu ops to allow use of LSE atomics") introduced alternative code sequences for the arm64 percpu atomics, so that the LSE instructions can be patched in at runtime if they are supported by the CPU. Unfortunately, when patching in the LSE sequence for a value-returning pcpu atomic, the argument registers are the wrong way round. The implementation of this_cpu_add_return() therefore ends up adding uninitialised stack to the percpu variable and returning garbage. As it turns out, there aren't very many users of the value-returning percpu atomics in mainline and we only spotted this due to a failure in the kprobes selftests. In this case, when attempting to single-step over the out-of-line instruction slot, the debug monitors would not be enabled because calling this_cpu_inc_return() on the kernel debug monitor refcount would fail to detect the transition from 0. We would consequently execute past the slot and take an undefined instruction exception from the kernel, resulting in a BUG: | kernel BUG at arch/arm64/kernel/traps.c:421! | PREEMPT SMP | pc : do_undefinstr+0x268/0x278 | lr : do_undefinstr+0x124/0x278 | Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____)) | Call trace: | do_undefinstr+0x268/0x278 | el1_undef+0x10/0x78 | 0xffff00000803c004 | init_kprobes+0x150/0x180 | do_one_initcall+0x74/0x178 | kernel_init_freeable+0x188/0x224 | kernel_init+0x10/0x100 | ret_from_fork+0x10/0x1c Fix the argument order to get the value-returning pcpu atomics working correctly when implemented using the LSE instructions. Reported-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
Diffstat (limited to 'arch/arm64/include/asm')
-rw-r--r--arch/arm64/include/asm/percpu.h2
1 files changed, 1 insertions, 1 deletions
diff --git a/arch/arm64/include/asm/percpu.h b/arch/arm64/include/asm/percpu.h
index f7b1bbbb6f12..6b81dd8cee01 100644
--- a/arch/arm64/include/asm/percpu.h
+++ b/arch/arm64/include/asm/percpu.h
@@ -94,7 +94,7 @@ __percpu_##name##_return_case_##sz(void *ptr, unsigned long val) \
" stxr" #sfx "\t%w[loop], %" #w "[ret], %[ptr]\n" \
" cbnz %w[loop], 1b", \
/* LSE atomics */ \
- #op_lse "\t%" #w "[ret], %" #w "[val], %[ptr]\n" \
+ #op_lse "\t%" #w "[val], %" #w "[ret], %[ptr]\n" \
#op_llsc "\t%" #w "[ret], %" #w "[ret], %" #w "[val]\n" \
__nops(2)) \
: [loop] "=&r" (loop), [ret] "=&r" (ret), \