summaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel
AgeCommit message (Collapse)AuthorFilesLines
2019-09-16powerpc/tm: Fix restoring FP/VMX facility incorrectly on interruptsGustavo Romero1-16/+2
[ Upstream commit a8318c13e79badb92bc6640704a64cc022a6eb97 ] When in userspace and MSR FP=0 the hardware FP state is unrelated to the current process. This is extended for transactions where if tbegin is run with FP=0, the hardware checkpoint FP state will also be unrelated to the current process. Due to this, we need to ensure this hardware checkpoint is updated with the correct state before we enable FP for this process. Unfortunately we get this wrong when returning to a process from a hardware interrupt. A process that starts a transaction with FP=0 can take an interrupt. When the kernel returns back to that process, we change to FP=1 but with hardware checkpoint FP state not updated. If this transaction is then rolled back, the FP registers now contain the wrong state. The process looks like this: Userspace: Kernel Start userspace with MSR FP=0 TM=1 < ----- ... tbegin bne Hardware interrupt ---- > <do_IRQ...> .... ret_from_except restore_math() /* sees FP=0 */ restore_fp() tm_active_with_fp() /* sees FP=1 (Incorrect) */ load_fp_state() FP = 0 -> 1 < ----- Return to userspace with MSR TM=1 FP=1 with junk in the FP TM checkpoint TM rollback reads FP junk When returning from the hardware exception, tm_active_with_fp() is incorrectly making restore_fp() call load_fp_state() which is setting FP=1. The fix is to remove tm_active_with_fp(). tm_active_with_fp() is attempting to handle the case where FP state has been changed inside a transaction. In this case the checkpointed and transactional FP state is different and hence we must restore the FP state (ie. we can't do lazy FP restore inside a transaction that's used FP). It's safe to remove tm_active_with_fp() as this case is handled by restore_tm_state(). restore_tm_state() detects if FP has been using inside a transaction and will set load_fp and call restore_math() to ensure the FP state (checkpoint and transaction) is restored. This is a data integrity problem for the current process as the FP registers are corrupted. It's also a security problem as the FP registers from one process may be leaked to another. Similarly for VMX. A simple testcase to replicate this will be posted to tools/testing/selftests/powerpc/tm/tm-poison.c This fixes CVE-2019-15031. Fixes: a7771176b439 ("powerpc: Don't enable FP/Altivec if not checkpointed") Cc: stable@vger.kernel.org # 4.15+ Signed-off-by: Gustavo Romero <gromero@linux.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190904045529.23002-2-gromero@linux.vnet.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16powerpc/tm: Remove msr_tm_active()Breno Leitao1-12/+9
[ Upstream commit 5c784c8414fba11b62e12439f11e109fb5751f38 ] Currently msr_tm_active() is a wrapper around MSR_TM_ACTIVE() if CONFIG_PPC_TRANSACTIONAL_MEM is set, or it is just a function that returns false if CONFIG_PPC_TRANSACTIONAL_MEM is not set. This function is not necessary, since MSR_TM_ACTIVE() just do the same and could be used, removing the dualism and simplifying the code. This patchset remove every instance of msr_tm_active() and replaced it by MSR_TM_ACTIVE(). Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16KVM: PPC: Use ccr field in pt_regs struct embedded in vcpu structPaul Mackerras1-2/+2
[ Upstream commit fd0944baad806dfb4c777124ec712c55b714ff51 ] When the 'regs' field was added to struct kvm_vcpu_arch, the code was changed to use several of the fields inside regs (e.g., gpr, lr, etc.) but not the ccr field, because the ccr field in struct pt_regs is 64 bits on 64-bit platforms, but the cr field in kvm_vcpu_arch is only 32 bits. This changes the code to use the regs.ccr field instead of cr, and changes the assembly code on 64-bit platforms to use 64-bit loads and stores instead of 32-bit ones. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16powerpc/64: mark start_here_multiplatform as __refChristophe Leroy1-0/+2
[ Upstream commit 9c4e4c90ec24652921e31e9551fcaedc26eec86d ] Otherwise, the following warning is encountered: WARNING: vmlinux.o(.text+0x3dc6): Section mismatch in reference from the variable start_here_multiplatform to the function .init.text:.early_setup() The function start_here_multiplatform() references the function __init .early_setup(). This is often because start_here_multiplatform lacks a __init annotation or the annotation of .early_setup is wrong. Fixes: 56c46bba9bbf ("powerpc/64: Fix booting large kernels with STRICT_KERNEL_RWX") Cc: Russell Currey <ruscur@russell.cc> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16powerpc/tm: Fix FP/VMX unavailable exceptions inside a transactionGustavo Romero1-1/+2
commit 8205d5d98ef7f155de211f5e2eb6ca03d95a5a60 upstream. When we take an FP unavailable exception in a transaction we have to account for the hardware FP TM checkpointed registers being incorrect. In this case for this process we know the current and checkpointed FP registers must be the same (since FP wasn't used inside the transaction) hence in the thread_struct we copy the current FP registers to the checkpointed ones. This copy is done in tm_reclaim_thread(). We use thread->ckpt_regs.msr to determine if FP was on when in userspace. thread->ckpt_regs.msr represents the state of the MSR when exiting userspace. This is setup by check_if_tm_restore_required(). Unfortunatley there is an optimisation in giveup_all() which returns early if tsk->thread.regs->msr (via local variable `usermsr`) has FP=VEC=VSX=SPE=0. This optimisation means that check_if_tm_restore_required() is not called and hence thread->ckpt_regs.msr is not updated and will contain an old value. This can happen if due to load_fp=255 we start a userspace process with MSR FP=1 and then we are context switched out. In this case thread->ckpt_regs.msr will contain FP=1. If that same process is then context switched in and load_fp overflows, MSR will have FP=0. If that process now enters a transaction and does an FP instruction, the FP unavailable will not update thread->ckpt_regs.msr (the bug) and MSR FP=1 will be retained in thread->ckpt_regs.msr. tm_reclaim_thread() will then not perform the required memcpy and the checkpointed FP regs in the thread struct will contain the wrong values. The code path for this happening is: Userspace: Kernel Start userspace with MSR FP/VEC/VSX/SPE=0 TM=1 < ----- ... tbegin bne fp instruction FP unavailable ---- > fp_unavailable_tm() tm_reclaim_current() tm_reclaim_thread() giveup_all() return early since FP/VMX/VSX=0 /* ckpt MSR not updated (Incorrect) */ tm_reclaim() /* thread_struct ckpt FP regs contain junk (OK) */ /* Sees ckpt MSR FP=1 (Incorrect) */ no memcpy() performed /* thread_struct ckpt FP regs not fixed (Incorrect) */ tm_recheckpoint() /* Put junk in hardware checkpoint FP regs */ .... < ----- Return to userspace with MSR TM=1 FP=1 with junk in the FP TM checkpoint TM rollback reads FP junk This is a data integrity problem for the current process as the FP registers are corrupted. It's also a security problem as the FP registers from one process may be leaked to another. This patch moves up check_if_tm_restore_required() in giveup_all() to ensure thread->ckpt_regs.msr is updated correctly. A simple testcase to replicate this will be posted to tools/testing/selftests/powerpc/tm/tm-poison.c Similarly for VMX. This fixes CVE-2019-15030. Fixes: f48e91e87e67 ("powerpc/tm: Fix FP and VMX register corruption") Cc: stable@vger.kernel.org # 4.12+ Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190904045529.23002-1-gromero@linux.vnet.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-29powerpc: Allow flush_(inval_)dcache_range to work across ranges >4GBAlastair D'Silva1-2/+2
The upstream commit: 22e9c88d486a ("powerpc/64: reuse PPC32 static inline flush_dcache_range()") has a similar effect, but since it is a rewrite of the assembler to C, is too invasive for stable. This patch is a minimal fix to address the issue in assembler. This patch applies cleanly to v5.2, v4.19 & v4.14. When calling flush_(inval_)dcache_range with a size >4GB, we were masking off the upper 32 bits, so we would incorrectly flush a range smaller than intended. This patch replaces the 32 bit shifts with 64 bit ones, so that the full size is accounted for. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31powerpc/tm: Fix oops on sigreturn on systems without TMMichael Neuling2-0/+8
commit f16d80b75a096c52354c6e0a574993f3b0dfbdfe upstream. On systems like P9 powernv where we have no TM (or P8 booted with ppc_tm=off), userspace can construct a signal context which still has the MSR TS bits set. The kernel tries to restore this context which results in the following crash: Unexpected TM Bad Thing exception at c0000000000022fc (msr 0x8000000102a03031) tm_scratch=800000020280f033 Oops: Unrecoverable exception, sig: 6 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 0 PID: 1636 Comm: sigfuz Not tainted 5.2.0-11043-g0a8ad0ffa4 #69 NIP: c0000000000022fc LR: 00007fffb2d67e48 CTR: 0000000000000000 REGS: c00000003fffbd70 TRAP: 0700 Not tainted (5.2.0-11045-g7142b497d8) MSR: 8000000102a03031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[E]> CR: 42004242 XER: 00000000 CFAR: c0000000000022e0 IRQMASK: 0 GPR00: 0000000000000072 00007fffb2b6e560 00007fffb2d87f00 0000000000000669 GPR04: 00007fffb2b6e728 0000000000000000 0000000000000000 00007fffb2b6f2a8 GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR12: 0000000000000000 00007fffb2b76900 0000000000000000 0000000000000000 GPR16: 00007fffb2370000 00007fffb2d84390 00007fffea3a15ac 000001000a250420 GPR20: 00007fffb2b6f260 0000000010001770 0000000000000000 0000000000000000 GPR24: 00007fffb2d843a0 00007fffea3a14a0 0000000000010000 0000000000800000 GPR28: 00007fffea3a14d8 00000000003d0f00 0000000000000000 00007fffb2b6e728 NIP [c0000000000022fc] rfi_flush_fallback+0x7c/0x80 LR [00007fffb2d67e48] 0x7fffb2d67e48 Call Trace: Instruction dump: e96a0220 e96a02a8 e96a0330 e96a03b8 394a0400 4200ffdc 7d2903a6 e92d0c00 e94d0c08 e96d0c10 e82d0c18 7db242a6 <4c000024> 7db243a6 7db142a6 f82d0c18 The problem is the signal code assumes TM is enabled when CONFIG_PPC_TRANSACTIONAL_MEM is enabled. This may not be the case as with P9 powernv or if `ppc_tm=off` is used on P8. This means any local user can crash the system. Fix the problem by returning a bad stack frame to the user if they try to set the MSR TS bits with sigreturn() on systems where TM is not supported. Found with sigfuz kernel selftest on P9. This fixes CVE-2019-13648. Fixes: 2b0a576d15e0 ("powerpc: Add new transactional memory state to the signal context") Cc: stable@vger.kernel.org # v3.9 Reported-by: Praveen Pandey <Praveen.Pandey@in.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190719050502.405-1-mikey@neuling.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31powerpc/eeh: Handle hugepages in ioremap spaceOliver O'Halloran1-3/+12
[ Upstream commit 33439620680be5225c1b8806579a291e0d761ca0 ] In commit 4a7b06c157a2 ("powerpc/eeh: Handle hugepages in ioremap space") support for using hugepages in the vmalloc and ioremap areas was enabled for radix. Unfortunately this broke EEH MMIO error checking. Detection works by inserting a hook which checks the results of the ioreadXX() set of functions. When a read returns a 0xFFs response we need to check for an error which we do by mapping the (virtual) MMIO address back to a physical address, then mapping physical address to a PCI device via an interval tree. When translating virt -> phys we currently assume the ioremap space is only populated by PAGE_SIZE mappings. If a hugepage mapping is found we emit a WARN_ON(), but otherwise handles the check as though a normal page was found. In pathalogical cases such as copying a buffer containing a lot of 0xFFs from BAR memory this can result in the system not booting because it's too busy printing WARN_ON()s. There's no real reason to assume huge pages can't be present and we're prefectly capable of handling them, so do that. Fixes: 4a7b06c157a2 ("powerpc/eeh: Handle hugepages in ioremap space") Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190710150517.27114-1-oohall@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-07-31powerpc/pci/of: Fix OF flags parsing for 64bit BARsAlexey Kardashevskiy1-0/+2
[ Upstream commit df5be5be8735ef2ae80d5ae1f2453cd81a035c4b ] When the firmware does PCI BAR resource allocation, it passes the assigned addresses and flags (prefetch/64bit/...) via the "reg" property of a PCI device device tree node so the kernel does not need to do resource allocation. The flags are stored in resource::flags - the lower byte stores PCI_BASE_ADDRESS_SPACE/etc bits and the other bytes are IORESOURCE_IO/etc. Some flags from PCI_BASE_ADDRESS_xxx and IORESOURCE_xxx are duplicated, such as PCI_BASE_ADDRESS_MEM_PREFETCH/PCI_BASE_ADDRESS_MEM_TYPE_64/etc. When parsing the "reg" property, we copy the prefetch flag but we skip on PCI_BASE_ADDRESS_MEM_TYPE_64 which leaves the flags out of sync. The missing IORESOURCE_MEM_64 flag comes into play under 2 conditions: 1. we remove PCI_PROBE_ONLY for pseries (by hacking pSeries_setup_arch() or by passing "/chosen/linux,pci-probe-only"); 2. we request resource alignment (by passing pci=resource_alignment= via the kernel cmd line to request PAGE_SIZE alignment or defining ppc_md.pcibios_default_alignment which returns anything but 0). Note that the alignment requests are ignored if PCI_PROBE_ONLY is enabled. With 1) and 2), the generic PCI code in the kernel unconditionally decides to: - reassign the BARs in pci_specified_resource_alignment() (works fine) - write new BARs to the device - this fails for 64bit BARs as the generic code looks at IORESOURCE_MEM_64 (not set) and writes only lower 32bits of the BAR and leaves the upper 32bit unmodified which breaks BAR mapping in the hypervisor. This fixes the issue by copying the flag. This is useful if we want to enforce certain BAR alignment per platform as handling subpage sized BARs is proven to cause problems with hotplug (SLOF already aligns BARs to 64k). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Shawn Anastasio <shawn@anastas.io> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-07-26powerpc/watchpoint: Restore NV GPRs while returning from exceptionRavi Bangoria1-2/+7
commit f474c28fbcbe42faca4eb415172c07d76adcb819 upstream. powerpc hardware triggers watchpoint before executing the instruction. To make trigger-after-execute behavior, kernel emulates the instruction. If the instruction is 'load something into non-volatile register', exception handler should restore emulated register state while returning back, otherwise there will be register state corruption. eg, adding a watchpoint on a list can corrput the list: # cat /proc/kallsyms | grep kthread_create_list c00000000121c8b8 d kthread_create_list Add watchpoint on kthread_create_list->prev: # perf record -e mem:0xc00000000121c8c0 Run some workload such that new kthread gets invoked. eg, I just logged out from console: list_add corruption. next->prev should be prev (c000000001214e00), \ but was c00000000121c8b8. (next=c00000000121c8b8). WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0 CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69 ... NIP __list_add_valid+0xb4/0xc0 LR __list_add_valid+0xb0/0xc0 Call Trace: __list_add_valid+0xb0/0xc0 (unreliable) __kthread_create_on_node+0xe0/0x260 kthread_create_on_node+0x34/0x50 create_worker+0xe8/0x260 worker_thread+0x444/0x560 kthread+0x160/0x1a0 ret_from_kernel_thread+0x5c/0x70 List corruption happened because it uses 'load into non-volatile register' instruction: Snippet from __kthread_create_on_node: c000000000136be8: addis r29,r2,-19 c000000000136bec: ld r29,31424(r29) if (!__list_add_valid(new, prev, next)) c000000000136bf0: mr r3,r30 c000000000136bf4: mr r5,r28 c000000000136bf8: mr r4,r29 c000000000136bfc: bl c00000000059a2f8 <__list_add_valid+0x8> Register state from WARN_ON(): GPR00: c00000000059a3a0 c000007ff23afb50 c000000001344e00 0000000000000075 GPR04: 0000000000000000 0000000000000000 0000001852af8bc1 0000000000000000 GPR08: 0000000000000001 0000000000000007 0000000000000006 00000000000004aa GPR12: 0000000000000000 c000007ffffeb080 c000000000137038 c000005ff62aaa00 GPR16: 0000000000000000 0000000000000000 c000007fffbe7600 c000007fffbe7370 GPR20: c000007fffbe7320 c000007fffbe7300 c000000001373a00 0000000000000000 GPR24: fffffffffffffef7 c00000000012e320 c000007ff23afcb0 c000000000cb8628 GPR28: c00000000121c8b8 c000000001214e00 c000007fef5b17e8 c000007fef5b17c0 Watchpoint hit at 0xc000000000136bec. addis r29,r2,-19 => r29 = 0xc000000001344e00 + (-19 << 16) => r29 = 0xc000000001214e00 ld r29,31424(r29) => r29 = *(0xc000000001214e00 + 31424) => r29 = *(0xc00000000121c8c0) 0xc00000000121c8c0 is where we placed a watchpoint and thus this instruction was emulated by emulate_step. But because handle_dabr_fault did not restore emulated register state, r29 still contains stale value in above register state. Fixes: 5aae8a5370802 ("powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors") Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: stable@vger.kernel.org # 2.6.36+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26powerpc/32s: fix suspend/resume when IBATs 4-7 are usedChristophe Leroy1-8/+65
commit 6ecb78ef56e08d2119d337ae23cb951a640dc52d upstream. Previously, only IBAT1 and IBAT2 were used to map kernel linear mem. Since commit 63b2bc619565 ("powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX"), we may have all 8 BATs used for mapping kernel text. But the suspend/restore functions only save/restore BATs 0 to 3, and clears BATs 4 to 7. Make suspend and restore functions respectively save and reload the 8 BATs on CPUs having MMU_FTR_USE_HIGH_BATS feature. Reported-by: Andreas Schwab <schwab@linux-m68k.org> Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11pstore: Convert buf_lock to semaphoreKees Cook1-2/+0
commit ea84b580b95521644429cc6748b6c2bf27c8b0f3 upstream. Instead of running with interrupts disabled, use a semaphore. This should make it easier for backends that may need to sleep (e.g. EFI) when performing a write: |BUG: sleeping function called from invalid context at kernel/sched/completion.c:99 |in_atomic(): 1, irqs_disabled(): 1, pid: 2236, name: sig-xstate-bum |Preemption disabled at: |[<ffffffff99d60512>] pstore_dump+0x72/0x330 |CPU: 26 PID: 2236 Comm: sig-xstate-bum Tainted: G D 4.20.0-rc3 #45 |Call Trace: | dump_stack+0x4f/0x6a | ___might_sleep.cold.91+0xd3/0xe4 | __might_sleep+0x50/0x90 | wait_for_completion+0x32/0x130 | virt_efi_query_variable_info+0x14e/0x160 | efi_query_variable_store+0x51/0x1a0 | efivar_entry_set_safe+0xa3/0x1b0 | efi_pstore_write+0x109/0x140 | pstore_dump+0x11c/0x330 | kmsg_dump+0xa4/0xd0 | oops_exit+0x22/0x30 ... Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Fixes: 21b3ddd39fee ("efi: Don't use spinlocks for efi vars") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-04jump_label: move 'asm goto' support test to KconfigMasahiro Yamada1-2/+0
commit e9666d10a5677a494260d60d1fa0b73cc7646eb3 upstream. Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label". The jump label is controlled by HAVE_JUMP_LABEL, which is defined like this: #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) # define HAVE_JUMP_LABEL #endif We can improve this by testing 'asm goto' support in Kconfig, then make JUMP_LABEL depend on CC_HAS_ASM_GOTO. Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will match to the real kernel capability. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: Sedat Dilek <sedat.dilek@gmail.com> [nc: Fix trivial conflicts in 4.19 arch/xtensa/kernel/jump_label.c doesn't exist yet Ensured CC_HAVE_ASM_GOTO and HAVE_JUMP_LABEL were sufficiently eliminated] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-31powerpc/64: Fix booting large kernels with STRICT_KERNEL_RWXRussell Currey1-1/+3
[ Upstream commit 56c46bba9bbfe229b4472a5be313c44c5b714a39 ] With STRICT_KERNEL_RWX enabled anything marked __init is placed at a 16M boundary. This is necessary so that it can be repurposed later with different permissions. However, in kernels with text larger than 16M, this pushes early_setup past 32M, incapable of being reached by the branch instruction. Fix this by setting the CTR and branching there instead. Fixes: 1e0fc9d1eb2b ("powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs") Signed-off-by: Russell Currey <ruscur@russell.cc> [mpe: Fix it to work on BE by using DOTSYM()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-31powerpc/watchdog: Use hrtimers for per-CPU heartbeatNicholas Piggin1-41/+40
[ Upstream commit 7ae3f6e130e8dc6188b59e3b4ebc2f16e9c8d053 ] Using a jiffies timer creates a dependency on the tick_do_timer_cpu incrementing jiffies. If that CPU has locked up and jiffies is not incrementing, the watchdog heartbeat timer for all CPUs stops and creates false positives and confusing warnings on local CPUs, and also causes the SMP detector to stop, so the root cause is never detected. Fix this by using hrtimer based timers for the watchdog heartbeat, like the generic kernel hardlockup detector. Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reported-by: Ravikumar Bangoria <ravi.bangoria@in.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-16powerpc/powernv/idle: Restore IAMR after idleRussell Currey1-0/+20
commit a3f3072db6cad40895c585dce65e36aab997f042 upstream. Without restoring the IAMR after idle, execution prevention on POWER9 with Radix MMU is overwritten and the kernel can freely execute userspace without faulting. This is necessary when returning from any stop state that modifies user state, as well as hypervisor state. To test how this fails without this patch, load the lkdtm driver and do the following: $ echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT which won't fault, then boot the kernel with powersave=off, where it will fault. Applying this patch will fix this. Fixes: 3b10d0095a1e ("powerpc/mm/radix: Prevent kernel execution of user space") Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-16powerpc/64s: Include cpu headerBreno Leitao1-0/+1
commit 42e2acde1237878462b028f5a27d9cc5bea7502c upstream. Current powerpc security.c file is defining functions, as cpu_show_meltdown(), cpu_show_spectre_v{1,2} and others, that are being declared at linux/cpu.h header without including the header file that contains these declarations. This is being reported by sparse, which thinks that these functions are static, due to the lack of declaration: arch/powerpc/kernel/security.c:105:9: warning: symbol 'cpu_show_meltdown' was not declared. Should it be static? arch/powerpc/kernel/security.c:139:9: warning: symbol 'cpu_show_spectre_v1' was not declared. Should it be static? arch/powerpc/kernel/security.c:161:9: warning: symbol 'cpu_show_spectre_v2' was not declared. Should it be static? arch/powerpc/kernel/security.c:209:6: warning: symbol 'stf_barrier' was not declared. Should it be static? arch/powerpc/kernel/security.c:289:9: warning: symbol 'cpu_show_spec_store_bypass' was not declared. Should it be static? This patch simply includes the proper header (linux/cpu.h) to match function definition and declaration. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Joel Stanley <joel@jms.id.au> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Major Hayden <major@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-16powerpc/smp: Fix NMI IPI xmon timeoutNicholas Piggin1-64/+29
[ Upstream commit 88b9a3d1425a436e95c41f09986fdae2daee437a ] The xmon debugger IPI handler waits in the callback function while xmon is still active. This means they don't complete the IPI, and the initiator always times out waiting for them. Things manage to work after the timeout because there is some fallback logic to keep NMI IPI state sane in case of the timeout, but this is a bit ugly. This patch changes NMI IPI back to half-asynchronous (i.e., wait for everyone to call in, do not wait for IPI function to complete), but the complexity is avoided by going one step further and allowing new IPIs to be issued before the IPI functions to all complete. If synchronization against that is required, it is left up to the caller, but current callers don't require that. In fact with the timeout handling, callers must be able to cope with this already. Fixes: 5b73151fff63 ("powerpc: NMI IPI make NMI IPIs fully sychronous") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
2019-05-16powerpc/smp: Fix NMI IPI timeoutNicholas Piggin1-2/+3
[ Upstream commit 1b5fc84aba170bdfe3533396ca9662ceea1609b7 ] The NMI IPI timeout logic is broken, if __smp_send_nmi_ipi() times out on the first condition, delay_us will be zero which will send it into the second spin loop with no timeout so it will spin forever. Fixes: 5b73151fff63 ("powerpc: NMI IPI make NMI IPIs fully sychronous") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
2019-05-14powerpc/speculation: Support 'mitigations=' cmdline optionJosh Poimboeuf2-4/+4
commit 782e69efb3dfed6e8360bc612e8c7827a901a8f9 upstream Configure powerpc CPU runtime speculation bug mitigations in accordance with the 'mitigations=' cmdline option. This affects Meltdown, Spectre v1, Spectre v2, and Speculative Store Bypass. The default behavior is unchanged. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86) Reviewed-by: Jiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <bp@alien8.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Jon Masters <jcm@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-arch@vger.kernel.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Price <steven.price@arm.com> Cc: Phil Auld <pauld@redhat.com> Link: https://lkml.kernel.org/r/245a606e1a42a558a310220312d9b6adb9159df6.1555085500.git.jpoimboe@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-08kmemleak: powerpc: skip scanning holes in the .bss sectionCatalin Marinas1-0/+7
[ Upstream commit 298a32b132087550d3fa80641ca58323c5dfd4d9 ] Commit 2d4f567103ff ("KVM: PPC: Introduce kvm_tmp framework") adds kvm_tmp[] into the .bss section and then free the rest of unused spaces back to the page allocator. kernel_init kvm_guest_init kvm_free_tmp free_reserved_area free_unref_page free_unref_page_prepare With DEBUG_PAGEALLOC=y, it will unmap those pages from kernel. As the result, kmemleak scan will trigger a panic when it scans the .bss section with unmapped pages. This patch creates dedicated kmemleak objects for the .data, .bss and potentially .data..ro_after_init sections to allow partial freeing via the kmemleak_free_part() in the powerpc kvm_free_tmp() function. Link: http://lkml.kernel.org/r/20190321171917.62049-1-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Qian Cai <cai@lca.pw> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: Qian Cai <cai@lca.pw> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Avi Kivity <avi@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-02powerpc/vdso32: fix CLOCK_MONOTONIC on PPC64Christophe Leroy1-1/+1
[ Upstream commit dd9a994fc68d196a052b73747e3366c57d14a09e ] Commit b5b4453e7912 ("powerpc/vdso64: Fix CLOCK_MONOTONIC inconsistencies across Y2038") changed the type of wtom_clock_sec to s64 on PPC64. Therefore, VDSO32 needs to read it with a 4 bytes shift in order to retrieve the lower part of it. Fixes: b5b4453e7912 ("powerpc/vdso64: Fix CLOCK_MONOTONIC inconsistencies across Y2038") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20powerpc/pseries: Remove prrn_work workqueueNathan Fontenot1-14/+3
[ Upstream commit cd24e457fd8b2d087d9236700c8d2957054598bf ] When a PRRN event is received we are already running in a worker thread. Instead of spawning off another worker thread on the prrn_work workqueue to handle the PRRN event we can just call the PRRN handler routine directly. With this update we can also pass the scope variable for the PRRN event directly to the handler instead of it being a global variable. This patch fixes the following oops mnessage we are seeing in PRRN testing: Oops: Bad kernel stack pointer, sig: 6 [#1] SMP NR_CPUS=2048 NUMA pSeries Modules linked in: nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache binfmt_misc reiserfs vfat fat rpadlpar_io(X) rpaphp(X) tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag af_packet xfs libcrc32c dm_service_time ibmveth(X) ses enclosure scsi_transport_sas rtc_generic btrfs xor raid6_pq sd_mod ibmvscsi(X) scsi_transport_srp ipr(X) libata sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4 Supported: Yes, External 54 CPU: 7 PID: 18967 Comm: kworker/u96:0 Tainted: G X 4.4.126-94.22-default #1 Workqueue: pseries hotplug workque pseries_hp_work_fn task: c000000775367790 ti: c00000001ebd4000 task.ti: c00000070d140000 NIP: 0000000000000000 LR: 000000001fb3d050 CTR: 0000000000000000 REGS: c00000001ebd7d40 TRAP: 0700 Tainted: G X (4.4.126-94.22-default) MSR: 8000000102081000 <41,VEC,ME5 CR: 28000002 XER: 20040018 4 CFAR: 000000001fb3d084 40 419 1 3 GPR00: 000000000000000040000000000010007 000000001ffff400 000000041fffe200 GPR04: 000000000000008050000000000000000 000000001fb15fa8 0000000500000500 GPR08: 000000000001f40040000000000000001 0000000000000000 000005:5200040002 GPR12: 00000000000000005c000000007a05400 c0000000000e89f8 000000001ed9f668 GPR16: 000000001fbeff944000000001fbeff94 000000001fb545e4 0000006000000060 GPR20: ffffffffffffffff4ffffffffffffffff 0000000000000000 0000000000000000 GPR24: 00000000000000005400000001fb3c000 0000000000000000 000000001fb1b040 GPR28: 000000001fb240004000000001fb440d8 0000000000000008 0000000000000000 NIP [0000000000000000] 5 (null) LR [000000001fb3d050] 031fb3d050 Call Trace: 4 Instruction dump: 4 5:47 12 2 XXXXXXXX XXXXXXXX XXXXX4XX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXX5XX XXXXXXXX 60000000 60000000 60000000 60000000 ---[ end trace aa5627b04a7d9d6b ]--- 3NMI watchdog: BUG: soft lockup - CPU#27 stuck for 23s! [kworker/27:0:13903] Modules linked in: nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache binfmt_misc reiserfs vfat fat rpadlpar_io(X) rpaphp(X) tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag af_packet xfs libcrc32c dm_service_time ibmveth(X) ses enclosure scsi_transport_sas rtc_generic btrfs xor raid6_pq sd_mod ibmvscsi(X) scsi_transport_srp ipr(X) libata sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4 Supported: Yes, External CPU: 27 PID: 13903 Comm: kworker/27:0 Tainted: G D X 4.4.126-94.22-default #1 Workqueue: events prrn_work_fn task: c000000747cfa390 ti: c00000074712c000 task.ti: c00000074712c000 NIP: c0000000008002a8 LR: c000000000090770 CTR: 000000000032e088 REGS: c00000074712f7b0 TRAP: 0901 Tainted: G D X (4.4.126-94.22-default) MSR: 8000000100009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22482044 XER: 20040000 CFAR: c0000000008002c4 SOFTE: 1 GPR00: c000000000090770 c00000074712fa30 c000000000f09800 c000000000fa1928 6:02 GPR04: c000000775f5e000 fffffffffffffffe 0000000000000001 c000000000f42db8 GPR08: 0000000000000001 0000000080000007 0000000000000000 0000000000000000 GPR12: 8006210083180000 c000000007a14400 NIP [c0000000008002a8] _raw_spin_lock+0x68/0xd0 LR [c000000000090770] mobility_rtas_call+0x50/0x100 Call Trace: 59 5 [c00000074712fa60] [c000000000090770] mobility_rtas_call+0x50/0x100 [c00000074712faf0] [c000000000090b08] pseries_devicetree_update+0xf8/0x530 [c00000074712fc20] [c000000000031ba4] prrn_work_fn+0x34/0x50 [c00000074712fc40] [c0000000000e0390] process_one_work+0x1a0/0x4e0 [c00000074712fcd0] [c0000000000e0870] worker_thread+0x1a0/0x6105:57 2 [c00000074712fd80] [c0000000000e8b18] kthread+0x128/0x150 [c00000074712fe30] [c0000000000096f8] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 2c090000 40c20010 7d40192d 40c2fff0 7c2004ac 2fa90000 40de0018 5:540030 3 e8010010 ebe1fff8 7c0803a6 4e800020 <7c210b78> e92d0000 89290009 792affe3 Signed-off-by: John Allen <jallen@linux.ibm.com> Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-17powerpc/tm: Limit TM code inside PPC_TRANSACTIONAL_MEMBreno Leitao1-5/+18
commit 897bc3df8c5aebb54c32d831f917592e873d0559 upstream. Commit e1c3743e1a20 ("powerpc/tm: Set MSR[TS] just prior to recheckpoint") moved a code block around and this block uses a 'msr' variable outside of the CONFIG_PPC_TRANSACTIONAL_MEM, however the 'msr' variable is declared inside a CONFIG_PPC_TRANSACTIONAL_MEM block, causing a possible error when CONFIG_PPC_TRANSACTION_MEM is not defined. error: 'msr' undeclared (first use in this function) This is not causing a compilation error in the mainline kernel, because 'msr' is being used as an argument of MSR_TM_ACTIVE(), which is defined as the following when CONFIG_PPC_TRANSACTIONAL_MEM is *not* set: #define MSR_TM_ACTIVE(x) 0 This patch just fixes this issue avoiding the 'msr' variable usage outside the CONFIG_PPC_TRANSACTIONAL_MEM block, avoiding trusting in the MSR_TM_ACTIVE() definition. Cc: stable@vger.kernel.org Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Fixes: e1c3743e1a20 ("powerpc/tm: Set MSR[TS] just prior to recheckpoint") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-05powerpc/64s: Clear on-stack exception marker upon exception returnNicolai Stange1-0/+7
[ Upstream commit eddd0b332304d554ad6243942f87c2fcea98c56b ] The ppc64 specific implementation of the reliable stacktracer, save_stack_trace_tsk_reliable(), bails out and reports an "unreliable trace" whenever it finds an exception frame on the stack. Stack frames are classified as exception frames if the STACK_FRAME_REGS_MARKER magic, as written by exception prologues, is found at a particular location. However, as observed by Joe Lawrence, it is possible in practice that non-exception stack frames can alias with prior exception frames and thus, that the reliable stacktracer can find a stale STACK_FRAME_REGS_MARKER on the stack. It in turn falsely reports an unreliable stacktrace and blocks any live patching transition to finish. Said condition lasts until the stack frame is overwritten/initialized by function call or other means. In principle, we could mitigate this by making the exception frame classification condition in save_stack_trace_tsk_reliable() stronger: in addition to testing for STACK_FRAME_REGS_MARKER, we could also take into account that for all exceptions executing on the kernel stack - their stack frames's backlink pointers always match what is saved in their pt_regs instance's ->gpr[1] slot and that - their exception frame size equals STACK_INT_FRAME_SIZE, a value uncommonly large for non-exception frames. However, while these are currently true, relying on them would make the reliable stacktrace implementation more sensitive towards future changes in the exception entry code. Note that false negatives, i.e. not detecting exception frames, would silently break the live patching consistency model. Furthermore, certain other places (diagnostic stacktraces, perf, xmon) rely on STACK_FRAME_REGS_MARKER as well. Make the exception exit code clear the on-stack STACK_FRAME_REGS_MARKER for those exceptions running on the "normal" kernel stack and returning to kernelspace: because the topmost frame is ignored by the reliable stack tracer anyway, returns to userspace don't need to take care of clearing the marker. Furthermore, as I don't have the ability to test this on Book 3E or 32 bits, limit the change to Book 3S and 64 bits. Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for the consistency model") Reported-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-03powerpc/security: Fix spectre_v2 reportingMichael Ellerman1-15/+8
commit 92edf8df0ff2ae86cc632eeca0e651fd8431d40d upstream. When I updated the spectre_v2 reporting to handle software count cache flush I got the logic wrong when there's no software count cache enabled at all. The result is that on systems with the software count cache flush disabled we print: Mitigation: Indirect branch cache disabled, Software count cache flush Which correctly indicates that the count cache is disabled, but incorrectly says the software count cache flush is enabled. The root of the problem is that we are trying to handle all combinations of options. But we know now that we only expect to see the software count cache flush enabled if the other options are false. So split the two cases, which simplifies the logic and fixes the bug. We were also missing a space before "(hardware accelerated)". The result is we see one of: Mitigation: Indirect branch serialisation (kernel only) Mitigation: Indirect branch cache disabled Mitigation: Software count cache flush Mitigation: Software count cache flush (hardware accelerated) Fixes: ee13cb249fab ("powerpc/64s: Add support for software count cache flush") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Michael Neuling <mikey@neuling.org> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-03powerpc/fsl: Fix the flush of branch predictor.Christophe Leroy1-0/+1
commit 27da80719ef132cf8c80eb406d5aeb37dddf78cc upstream. The commit identified below adds MC_BTB_FLUSH macro only when CONFIG_PPC_FSL_BOOK3E is defined. This results in the following error on some configs (seen several times with kisskb randconfig_defconfig) arch/powerpc/kernel/exceptions-64e.S:576: Error: Unrecognized opcode: `mc_btb_flush' make[3]: *** [scripts/Makefile.build:367: arch/powerpc/kernel/exceptions-64e.o] Error 1 make[2]: *** [scripts/Makefile.build:492: arch/powerpc/kernel] Error 2 make[1]: *** [Makefile:1043: arch/powerpc] Error 2 make: *** [Makefile:152: sub-make] Error 2 This patch adds a blank definition of MC_BTB_FLUSH for other cases. Fixes: 10c5e83afd4a ("powerpc/fsl: Flush the branch predictor at each kernel entry (64bit)") Cc: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-03powerpc/fsl: Fixed warning: orphan section `__btb_flush_fixup'Diana Craciun1-6/+12
commit 039daac5526932ec731e4499613018d263af8b3e upstream. Fixed the following build warning: powerpc-linux-gnu-ld: warning: orphan section `__btb_flush_fixup' from `arch/powerpc/kernel/head_44x.o' being placed in section `__btb_flush_fixup'. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-03powerpc/fsl: Update Spectre v2 reportingDiana Craciun1-1/+4
commit dfa88658fb0583abb92e062c7a9cd5a5b94f2a46 upstream. Report branch predictor state flush as a mitigation for Spectre variant 2. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-03powerpc/fsl: Enable runtime patching if nospectre_v2 boot arg is usedDiana Craciun1-0/+1
commit 3bc8ea8603ae4c1e09aca8de229ad38b8091fcb3 upstream. If the user choses not to use the mitigations, replace the code sequence with nops. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-03powerpc/fsl: Flush the branch predictor at each kernel entry (32 bit)Diana Craciun2-0/+21
commit 7fef436295bf6c05effe682c8797dfcb0deb112a upstream. In order to protect against speculation attacks on indirect branches, the branch predictor is flushed at kernel entry to protect for the following situations: - userspace process attacking another userspace process - userspace process attacking the kernel Basically when the privillege level change (i.e.the kernel is entered), the branch predictor state is flushed. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-03powerpc/fsl: Flush the branch predictor at each kernel entry (64bit)Diana Craciun2-1/+30
commit 10c5e83afd4a3f01712d97d3bb1ae34d5b74a185 upstream. In order to protect against speculation attacks on indirect branches, the branch predictor is flushed at kernel entry to protect for the following situations: - userspace process attacking another userspace process - userspace process attacking the kernel Basically when the privillege level change (i.e. the kernel is entered), the branch predictor state is flushed. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-03powerpc/fsl: Add nospectre_v2 command line argumentDiana Craciun1-0/+21
commit f633a8ad636efb5d4bba1a047d4a0f1ef719aa06 upstream. When the command line argument is present, the Spectre variant 2 mitigations are disabled. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-03powerpc/fsl: Add infrastructure to fixup branch predictor flushDiana Craciun1-0/+8
commit 76a5eaa38b15dda92cd6964248c39b5a6f3a4e9d upstream. In order to protect against speculation attacks (Spectre variant 2) on NXP PowerPC platforms, the branch predictor should be flushed when the privillege level is changed. This patch is adding the infrastructure to fixup at runtime the code sections that are performing the branch predictor flush depending on a boot arg parameter which is added later in a separate patch. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-27powerpc/vdso64: Fix CLOCK_MONOTONIC inconsistencies across Y2038Michael Ellerman1-2/+2
commit b5b4453e7912f056da1ca7572574cada32ecb60c upstream. Jakub Drnec reported: Setting the realtime clock can sometimes make the monotonic clock go back by over a hundred years. Decreasing the realtime clock across the y2k38 threshold is one reliable way to reproduce. Allegedly this can also happen just by running ntpd, I have not managed to reproduce that other than booting with rtc at >2038 and then running ntp. When this happens, anything with timers (e.g. openjdk) breaks rather badly. And included a test case (slightly edited for brevity): #define _POSIX_C_SOURCE 199309L #include <stdio.h> #include <time.h> #include <stdlib.h> #include <unistd.h> long get_time(void) { struct timespec tp; clock_gettime(CLOCK_MONOTONIC, &tp); return tp.tv_sec + tp.tv_nsec / 1000000000; } int main(void) { long last = get_time(); while(1) { long now = get_time(); if (now < last) { printf("clock went backwards by %ld seconds!\n", last - now); } last = now; sleep(1); } return 0; } Which when run concurrently with: # date -s 2040-1-1 # date -s 2037-1-1 Will detect the clock going backward. The root cause is that wtom_clock_sec in struct vdso_data is only a 32-bit signed value, even though we set its value to be equal to tk->wall_to_monotonic.tv_sec which is 64-bits. Because the monotonic clock starts at zero when the system boots the wall_to_montonic.tv_sec offset is negative for current and future dates. Currently on a freshly booted system the offset will be in the vicinity of negative 1.5 billion seconds. However if the wall clock is set past the Y2038 boundary, the offset from wall to monotonic becomes less than negative 2^31, and no longer fits in 32-bits. When that value is assigned to wtom_clock_sec it is truncated and becomes positive, causing the VDSO assembly code to calculate CLOCK_MONOTONIC incorrectly. That causes CLOCK_MONOTONIC to jump ahead by ~4 billion seconds which it is not meant to do. Worse, if the time is then set back before the Y2038 boundary CLOCK_MONOTONIC will jump backward. We can fix it simply by storing the full 64-bit offset in the vdso_data, and using that in the VDSO assembly code. We also shuffle some of the fields in vdso_data to avoid creating a hole. The original commit that added the CLOCK_MONOTONIC support to the VDSO did actually use a 64-bit value for wtom_clock_sec, see commit a7f290dad32e ("[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel") (Nov 2005). However just 3 days later it was converted to 32-bits in commit 0c37ec2aa88b ("[PATCH] powerpc: vdso fixes (take #2)"), and the bug has existed since then AFAICS. Fixes: 0c37ec2aa88b ("[PATCH] powerpc: vdso fixes (take #2)") Cc: stable@vger.kernel.org # v2.6.15+ Link: http://lkml.kernel.org/r/HaC.ZfES.62bwlnvAvMP.1STMMj@seznam.cz Reported-by: Jakub Drnec <jaydee@email.cz> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23powerpc/traps: Fix the message printed when stack overflowsChristophe Leroy1-2/+2
commit 9bf3d3c4e4fd82c7174f4856df372ab2a71005b9 upstream. Today's message is useless: [ 42.253267] Kernel stack overflow in process (ptrval), r1=c65500b0 This patch fixes it: [ 66.905235] Kernel stack overflow in process sh[356], r1=c65560b0 Fixes: ad67b74d2469 ("printk: hash addresses printed with %p") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Use task_pid_nr()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23powerpc/traps: fix recoverability of machine check handling on book3s/32Christophe Leroy1-4/+4
commit 0bbea75c476b77fa7d7811d6be911cc7583e640f upstream. Looks like book3s/32 doesn't set RI on machine check, so checking RI before calling die() will always be fatal allthought this is not an issue in most cases. Fixes: b96672dd840f ("powerpc: Machine check interrupt is a non-maskable interrupt") Fixes: daf00ae71dad ("powerpc/traps: restore recoverability of machine_check interrupts") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23powerpc/ptrace: Simplify vr_get/set() to avoid GCC warningMichael Ellerman1-2/+8
commit ca6d5149d2ad0a8d2f9c28cbe379802260a0a5e0 upstream. GCC 8 warns about the logic in vr_get/set(), which with -Werror breaks the build: In function ‘user_regset_copyin’, inlined from ‘vr_set’ at arch/powerpc/kernel/ptrace.c:628:9: include/linux/regset.h:295:4: error: ‘memcpy’ offset [-527, -529] is out of the bounds [0, 16] of object ‘vrsave’ with type ‘union <anonymous>’ [-Werror=array-bounds] arch/powerpc/kernel/ptrace.c: In function ‘vr_set’: arch/powerpc/kernel/ptrace.c:623:5: note: ‘vrsave’ declared here } vrsave; This has been identified as a regression in GCC, see GCC bug 88273. However we can avoid the warning and also simplify the logic and make it more robust. Currently we pass -1 as end_pos to user_regset_copyout(). This says "copy up to the end of the regset". The definition of the regset is: [REGSET_VMX] = { .core_note_type = NT_PPC_VMX, .n = 34, .size = sizeof(vector128), .align = sizeof(vector128), .active = vr_active, .get = vr_get, .set = vr_set }, The end is calculated as (n * size), ie. 34 * sizeof(vector128). In vr_get/set() we pass start_pos as 33 * sizeof(vector128), meaning we can copy up to sizeof(vector128) into/out-of vrsave. The on-stack vrsave is defined as: union { elf_vrreg_t reg; u32 word; } vrsave; And elf_vrreg_t is: typedef __vector128 elf_vrreg_t; So there is no bug, but we rely on all those sizes lining up, otherwise we would have a kernel stack exposure/overwrite on our hands. Rather than relying on that we can pass an explict end_pos based on the sizeof(vrsave). The result should be exactly the same but it's more obviously not over-reading/writing the stack and it avoids the compiler warning. Reported-by: Meelis Roos <mroos@linux.ee> Reported-by: Mathieu Malaterre <malat@debian.org> Cc: stable@vger.kernel.org Tested-by: Mathieu Malaterre <malat@debian.org> Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23powerpc: Fix 32-bit KVM-PR lockup and host crash with MacOS guestMark Cave-Ayland1-1/+1
commit fe1ef6bcdb4fca33434256a802a3ed6aacf0bd2f upstream. Commit 8792468da5e1 "powerpc: Add the ability to save FPU without giving it up" unexpectedly removed the MSR_FE0 and MSR_FE1 bits from the bitmask used to update the MSR of the previous thread in __giveup_fpu() causing a KVM-PR MacOS guest to lockup and panic the host kernel. Leaving FE0/1 enabled means unrelated processes might receive FPEs when they're not expecting them and crash. In particular if this happens to init the host will then panic. eg (transcribed): qemu-system-ppc[837]: unhandled signal 8 at 12cc9ce4 nip 12cc9ce4 lr 12cc9ca4 code 0 systemd[1]: unhandled signal 8 at 202f02e0 nip 202f02e0 lr 001003d4 code 0 Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b Reinstate these bits to the MSR bitmask to enable MacOS guests to run under 32-bit KVM-PR once again without issue. Fixes: 8792468da5e1 ("powerpc: Add the ability to save FPU without giving it up") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23powerpc/32: Clear on-stack exception marker upon exception returnChristophe Leroy1-0/+9
commit 9580b71b5a7863c24a9bd18bcd2ad759b86b1eff upstream. Clear the on-stack STACK_FRAME_REGS_MARKER on exception exit in order to avoid confusing stacktrace like the one below. Call Trace: [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable) [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180 [c0e9dd10] [c0895130] memchr+0x24/0x74 [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574 [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8 [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4 --- interrupt: c0e9df00 at 0x400f330 LR = init_stack+0x1f00/0x2000 [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable) [c0e9df20] [c0c27e44] early_irq_init+0x38/0x108 [c0e9df50] [c0c15434] start_kernel+0x310/0x488 [c0e9dff0] [00003484] 0x3484 With this patch the trace becomes: Call Trace: [c0e9dca0] [c01c42c0] print_address_description+0x64/0x2bc (unreliable) [c0e9dcd0] [c01c46a4] kasan_report+0xfc/0x180 [c0e9dd10] [c0895150] memchr+0x24/0x74 [c0e9dd30] [c00a9e58] msg_print_text+0x124/0x574 [c0e9dde0] [c00ab730] console_unlock+0x114/0x4f8 [c0e9de40] [c00adc80] vprintk_emit+0x188/0x1c4 [c0e9de80] [c00ae3e4] printk+0xa8/0xcc [c0e9df20] [c0c27e44] early_irq_init+0x38/0x108 [c0e9df50] [c0c15434] start_kernel+0x310/0x488 [c0e9dff0] [00003484] 0x3484 Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-27powerpc/8xx: fix setting of pagetable for Abatron BDI debug tool.Christophe Leroy1-1/+2
[ Upstream commit fb0bdec51a4901b7dd088de0a1e365e1b9f5cd21 ] Commit 8c8c10b90d88 ("powerpc/8xx: fix handling of early NULL pointer dereference") moved the loading of r6 earlier in the code. As some functions are called inbetween, r6 needs to be loaded again with the address of swapper_pg_dir in order to set PTE pointers for the Abatron BDI. Fixes: 8c8c10b90d88 ("powerpc/8xx: fix handling of early NULL pointer dereference") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-12powerpc/fadump: Do not allow hot-remove memory from fadump reserved area.Mahesh Salgaonkar1-2/+8
[ Upstream commit 0db6896ff6332ba694f1e61b93ae3b2640317633 ] For fadump to work successfully there should not be any holes in reserved memory ranges where kernel has asked firmware to move the content of old kernel memory in event of crash. Now that fadump uses CMA for reserved area, this memory area is now not protected from hot-remove operations unless it is cma allocated. Hence, fadump service can fail to re-register after the hot-remove operation, if hot-removed memory belongs to fadump reserved region. To avoid this make sure that memory from fadump reserved area is not hot-removable if fadump is registered. However, if user still wants to remove that memory, he can do so by manually stopping fadump service before hot-remove operation. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-12powerpc/32: Add .data..Lubsan_data*/.data..Lubsan_type* sections explicitlyMathieu Malaterre1-0/+4
[ Upstream commit beba24ac59133cb36ecd03f9af9ccb11971ee20e ] When both `CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y` and `CONFIG_UBSAN=y` are set, link step typically produce numberous warnings about orphan section: + powerpc-linux-gnu-ld -EB -m elf32ppc -Bstatic --orphan-handling=warn --build-id --gc-sections -X -o .tmp_vmlinux1 -T ./arch/powerpc/kernel/vmlinux.lds --who le-archive built-in.a --no-whole-archive --start-group lib/lib.a --end-group powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_data393' from `init/main.o' being placed in section `.data..Lubsan_data393'. powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_data394' from `init/main.o' being placed in section `.data..Lubsan_data394'. ... powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_type11' from `init/main.o' being placed in section `.data..Lubsan_type11'. powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_type12' from `init/main.o' being placed in section `.data..Lubsan_type12'. ... This commit remove those warnings produced at W=1. Link: https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg135407.html Suggested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13powerpc/tm: Set MSR[TS] just prior to recheckpointBreno Leitao2-15/+49
commit e1c3743e1a20647c53b719dbf28b48f45d23f2cd upstream. On a signal handler return, the user could set a context with MSR[TS] bits set, and these bits would be copied to task regs->msr. At restore_tm_sigcontexts(), after current task regs->msr[TS] bits are set, several __get_user() are called and then a recheckpoint is executed. This is a problem since a page fault (in kernel space) could happen when calling __get_user(). If it happens, the process MSR[TS] bits were already set, but recheckpoint was not executed, and SPRs are still invalid. The page fault can cause the current process to be de-scheduled, with MSR[TS] active and without tm_recheckpoint() being called. More importantly, without TEXASR[FS] bit set also. Since TEXASR might not have the FS bit set, and when the process is scheduled back, it will try to reclaim, which will be aborted because of the CPU is not in the suspended state, and, then, recheckpoint. This recheckpoint will restore thread->texasr into TEXASR SPR, which might be zero, hitting a BUG_ON(). kernel BUG at /build/linux-sf3Co9/linux-4.9.30/arch/powerpc/kernel/tm.S:434! cpu 0xb: Vector: 700 (Program Check) at [c00000041f1576d0] pc: c000000000054550: restore_gprs+0xb0/0x180 lr: 0000000000000000 sp: c00000041f157950 msr: 8000000100021033 current = 0xc00000041f143000 paca = 0xc00000000fb86300 softe: 0 irq_happened: 0x01 pid = 1021, comm = kworker/11:1 kernel BUG at /build/linux-sf3Co9/linux-4.9.30/arch/powerpc/kernel/tm.S:434! Linux version 4.9.0-3-powerpc64le (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) enter ? for help [c00000041f157b30] c00000000001bc3c tm_recheckpoint.part.11+0x6c/0xa0 [c00000041f157b70] c00000000001d184 __switch_to+0x1e4/0x4c0 [c00000041f157bd0] c00000000082eeb8 __schedule+0x2f8/0x990 [c00000041f157cb0] c00000000082f598 schedule+0x48/0xc0 [c00000041f157ce0] c0000000000f0d28 worker_thread+0x148/0x610 [c00000041f157d80] c0000000000f96b0 kthread+0x120/0x140 [c00000041f157e30] c00000000000c0e0 ret_from_kernel_thread+0x5c/0x7c This patch simply delays the MSR[TS] set, so, if there is any page fault in the __get_user() section, it does not have regs->msr[TS] set, since the TM structures are still invalid, thus avoiding doing TM operations for in-kernel exceptions and possible process reschedule. With this patch, the MSR[TS] will only be set just before recheckpointing and setting TEXASR[FS] = 1, thus avoiding an interrupt with TM registers in invalid state. Other than that, if CONFIG_PREEMPT is set, there might be a preemption just after setting MSR[TS] and before tm_recheckpoint(), thus, this block must be atomic from a preemption perspective, thus, calling preempt_disable/enable() on this code. It is not possible to move tm_recheckpoint to happen earlier, because it is required to get the checkpointed registers from userspace, with __get_user(), thus, the only way to avoid this undesired behavior is delaying the MSR[TS] set. The 32-bits signal handler seems to be safe this current issue, but, it might be exposed to the preemption issue, thus, disabling preemption in this chunk of code. Changes from v2: * Run the critical section with preempt_disable. Fixes: 87b4e5393af7 ("powerpc/tm: Fix return of active 64bit signals") Cc: stable@vger.kernel.org (v3.9+) Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-13Revert "powerpc/tm: Unset MSR[TS] if not recheckpointing"Greg Kroah-Hartman2-29/+9
This reverts commit a9935a12768851762089fda8e5a9daaf0231808e which is commit 6f5b9f018f4c7686fd944d920209d1382d320e4e upstream. It breaks the powerpc build, so drop it from the tree until a fix goes upstream. Reported-by: Guenter Roeck <linux@roeck-us.net> Cc: Breno Leitao <leitao@debian.org> Cc: Michal Suchánek <msuchanek@suse.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-13powerpc: Disable -Wbuiltin-requires-header when setjmp is usedJoel Stanley1-0/+3
commit aea447141c7e7824b81b49acd1bc785506fba46e upstream. The powerpc kernel uses setjmp which causes a warning when building with clang: In file included from arch/powerpc/xmon/xmon.c:51: ./arch/powerpc/include/asm/setjmp.h:15:13: error: declaration of built-in function 'setjmp' requires inclusion of the header <setjmp.h> [-Werror,-Wbuiltin-requires-header] extern long setjmp(long *); ^ ./arch/powerpc/include/asm/setjmp.h:16:13: error: declaration of built-in function 'longjmp' requires inclusion of the header <setjmp.h> [-Werror,-Wbuiltin-requires-header] extern void longjmp(long *, long); ^ This *is* the header and we're not using the built-in setjump but rather the one in arch/powerpc/kernel/misc.S. As the compiler warning does not make sense, it for the files where setjmp is used. Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> [mpe: Move subdir-ccflags in xmon/Makefile to not clobber -Werror] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-13powerpc: consolidate -mno-sched-epilog into FTRACE flagsNicholas Piggin2-5/+5
commit 2a056f58fd33ccc6a0261b552b0f17e7fa4a12f3 upstream. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09powerpc/tm: Unset MSR[TS] if not recheckpointingBreno Leitao2-9/+29
commit 6f5b9f018f4c7686fd944d920209d1382d320e4e upstream. There is a TM Bad Thing bug that can be caused when you return from a signal context in a suspended transaction but with ucontext MSR[TS] unset. This forces regs->msr[TS] to be set at syscall entrance (since the CPU state is transactional). It also calls treclaim() to flush the transaction state, which is done based on the live (mfmsr) MSR state. Since user context MSR[TS] is not set, then restore_tm_sigcontexts() is not called, thus, not executing recheckpoint, keeping the CPU state as not transactional. When calling rfid, SRR1 will have MSR[TS] set, but the CPU state is non transactional, causing the TM Bad Thing with the following stack: [ 33.862316] Bad kernel stack pointer 3fffd9dce3e0 at c00000000000c47c cpu 0x8: Vector: 700 (Program Check) at [c00000003ff7fd40] pc: c00000000000c47c: fast_exception_return+0xac/0xb4 lr: 00003fff865f442c sp: 3fffd9dce3e0 msr: 8000000102a03031 current = 0xc00000041f68b700 paca = 0xc00000000fb84800 softe: 0 irq_happened: 0x01 pid = 1721, comm = tm-signal-sigre Linux version 4.9.0-3-powerpc64le (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) WARNING: exception is not recoverable, can't continue The same problem happens on 32-bits signal handler, and the fix is very similar, if tm_recheckpoint() is not executed, then regs->msr[TS] should be zeroed. This patch also fixes a sparse warning related to lack of indentation when CONFIG_PPC_TRANSACTIONAL_MEM is set. Fixes: 2b0a576d15e0e ("powerpc: Add new transactional memory state to the signal context") CC: Stable <stable@vger.kernel.org> # 3.10+ Signed-off-by: Breno Leitao <leitao@debian.org> Tested-by: Michal Suchánek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09powerpc/fsl: Fix spectre_v2 mitigations reportingDiana Craciun1-1/+1
commit 7d8bad99ba5a22892f0cad6881289fdc3875a930 upstream. Currently for CONFIG_PPC_FSL_BOOK3E the spectre_v2 file is incorrect: $ cat /sys/devices/system/cpu/vulnerabilities/spectre_v2 "Mitigation: Software count cache flush" Which is wrong. Fix it to report vulnerable for now. Fixes: ee13cb249fab ("powerpc/64s: Add support for software count cache flush") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-19powerpc: Look for "stdout-path" when setting up legacy consolesBenjamin Herrenschmidt1-1/+5
commit bf3d6afbb234156749b640b6c50f714967a85964 upstream. Commit 78e5dfea84dc ("powerpc: dts: replace 'linux,stdout-path' with 'stdout-path'") broke the default console on a number of embedded PowerPC systems, because it failed to also update the code in arch/powerpc/kernel/legacy_serial.c to look for that property in addition to the old one. This fixes it. Fixes: 78e5dfea84dc ("powerpc: dts: replace 'linux,stdout-path' with 'stdout-path'") Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>