summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2018-09-15arm64: Handle mismatched cache typeSuzuki K Poulose2-4/+16
commit 314d53d297980676011e6fd83dac60db4a01dc70 upstream. Track mismatches in the cache type register (CTR_EL0), other than the D/I min line sizes and trap user accesses if there are any. Fixes: be68a8aaf925 ("arm64: cpufeature: Fix CTR_EL0 field definitions") Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15arm64: Fix mismatched cache line size detectionSuzuki K Poulose3-4/+11
commit 4c4a39dd5fe2d13e2d2fa5fceb8ef95d19fc389a upstream. If there is a mismatch in the I/D min line size, we must always use the system wide safe value both in applications and in the kernel, while performing cache operations. However, we have been checking more bits than just the min line sizes, which triggers false negatives. We may need to trap the user accesses in such cases, but not necessarily patch the kernel. This patch fixes the check to do the right thing as advertised. A new capability will be added to check mismatches in other fields and ensure we trap the CTR accesses. Fixes: be68a8aaf925 ("arm64: cpufeature: Fix CTR_EL0 field definitions") Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15arm64: cpu_errata: include required headersArnd Bergmann1-0/+2
commit 94a5d8790e79ab78f499d2d9f1ff2cab63849d9f upstream. Without including psci.h and arm-smccc.h, we now get a build failure in some configurations: arch/arm64/kernel/cpu_errata.c: In function 'arm64_update_smccc_conduit': arch/arm64/kernel/cpu_errata.c:278:10: error: 'psci_ops' undeclared (first use in this function); did you mean 'sysfs_ops'? arch/arm64/kernel/cpu_errata.c: In function 'arm64_set_ssbd_mitigation': arch/arm64/kernel/cpu_errata.c:311:3: error: implicit declaration of function 'arm_smccc_1_1_hvc' [-Werror=implicit-function-declaration] arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_2, state, NULL); Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15x86: kvm: avoid unused variable warningArnd Bergmann1-3/+1
commit 7288bde1f9df6c1475675419bdd7725ce84dec56 upstream. Removing one of the two accesses of the maxphyaddr variable led to a harmless warning: arch/x86/kvm/x86.c: In function 'kvm_set_mmio_spte_mask': arch/x86/kvm/x86.c:6563:6: error: unused variable 'maxphyaddr' [-Werror=unused-variable] Removing the #ifdef seems to be the nicest workaround, as it makes the code look cleaner than adding another #ifdef. Fixes: 28a1f3ac1d0c ("kvm: x86: Set highest physical address bits in non-present/reserved SPTEs") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: stable@vger.kernel.org # L1TF Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15kvm: x86: Set highest physical address bits in non-present/reserved SPTEsJunaid Shahid2-7/+44
commit 28a1f3ac1d0c8558ee4453d9634dad891a6e922e upstream. Always set the 5 upper-most supported physical address bits to 1 for SPTEs that are marked as non-present or reserved, to make them unusable for L1TF attacks from the guest. Currently, this just applies to MMIO SPTEs. (We do not need to mark PTEs that are completely 0 as physical page 0 is already reserved.) This allows mitigation of L1TF without disabling hyper-threading by using shadow paging mode instead of EPT. Signed-off-by: Junaid Shahid <junaids@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15Revert "ARM: imx_v6_v7_defconfig: Select ULPI support"Fabio Estevam1-2/+0
This reverts commit 2059e527a659cf16d6bb709f1c8509f7a7623fc4. This commit causes reboot to fail on imx6 wandboard, so let's revert it. Cc: <stable@vger.kernel.org> #4.14 Reported-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15s390/lib: use expoline for all bcr instructionsMartin Schwidefsky1-4/+8
commit 5eda25b10297684c1f46a14199ec00210f3c346e upstream. The memove, memset, memcpy, __memset16, __memset32 and __memset64 function have an additional indirect return branch in form of a "bzr" instruction. These need to use expolines as well. Cc: <stable@vger.kernel.org> # v4.17+ Fixes: 97489e0663 ("s390/lib: use expoline for indirect branches") Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15x86/xen: don't write ptes directly in 32-bit PV guestsJuergen Gross1-4/+3
commit f7c90c2aa4004808dff777ba6ae2c7294dd06851 upstream. In some cases 32-bit PAE PV guests still write PTEs directly instead of using hypercalls. This is especially bad when clearing a PTE as this is done via 32-bit writes which will produce intermediate L1TF attackable PTEs. Change the code to use hypercalls instead. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clearJuergen Gross1-4/+3
commit b2d7a075a1ccef2fb321d595802190c8e9b39004 upstream. Using only 32-bit writes for the pte will result in an intermediate L1TF vulnerable PTE. When running as a Xen PV guest this will at once switch the guest to shadow mode resulting in a loss of performance. Use arch_atomic64_xchg() instead which will perform the requested operation atomically with all 64 bits. Some performance considerations according to: https://software.intel.com/sites/default/files/managed/ad/dc/Intel-Xeon-Scalable-Processor-throughput-latency.pdf The main number should be the latency, as there is no tight loop around native_ptep_get_and_clear(). "lock cmpxchg8b" has a latency of 20 cycles, while "lock xchg" (with a memory operand) isn't mentioned in that document. "lock xadd" (with xadd having 3 cycles less latency than xchg) has a latency of 11, so we can assume a latency of 14 for "lock xchg". Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jan Beulich <jbeulich@suse.com> Tested-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> [ Atomic operations gained an arch_ prefix in 8bf705d13039 ("locking/atomic/x86: Switch atomic.h to use atomic-instrumented.h") so s/arch_atomic64_xchg/atomic64_xchg/ for backport.] Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15ARM: rockchip: Force CONFIG_PM on Rockchip systemsMarc Zyngier1-0/+1
[ Upstream commit d1558dfd9f22c99a5b8e1354ad881ee40749da89 ] A number of the Rockchip-specific drivers (IOMMU, display controllers) are now assuming that CONFIG_PM is set, and may completely misbehave if that's not the case. Since there is hardly any reason for this configuration option not to be selected anyway, let's require it (in the same way Tegra already does). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15arm64: rockchip: Force CONFIG_PM on Rockchip systemsMarc Zyngier1-0/+1
[ Upstream commit 7db7a8f5638a2ffe0c0c0d55b5186b6191fd6af7 ] A number of the Rockchip-specific drivers (IOMMU, display controllers) are now assuming that CONFIG_PM is set, and may completely misbehave if that's not the case. Since there is hardly any reason for this configuration option not to be selected anyway, let's require it (in the same way Tegra already does). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15kvm: nVMX: Fix fault vector for VMX operation at CPL > 0Jim Mattson1-2/+2
[ Upstream commit 36090bf43a6b835a42f515cb515ff6fa293a25fe ] The fault that should be raised for a privilege level violation is #GP rather than #UD. Fixes: 727ba748e110b4 ("kvm: nVMX: Enforce cpl=0 for VMX instructions") Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15KVM: vmx: track host_state.loaded using a loaded_vmcs pointerSean Christopherson1-7/+15
[ Upstream commit bd9966de4e14fb559e89a06f7f5c9aab2cc028b9 ] Using 'struct loaded_vmcs*' to track whether the CPU registers contain host or guest state kills two birds with one stone. 1. The (effective) boolean host_state.loaded is poorly named. It does not track whether or not host state is loaded into the CPU registers (which most readers would expect), but rather tracks if host state has been saved AND guest state is loaded. 2. Using a loaded_vmcs pointer provides a more robust framework for the optimized guest/host state switching, especially when consideration per-VMCS enhancements. To that end, WARN_ONCE if we try to switch to host state with a different VMCS than was last used to save host state. Resolve an occurrence of the new WARN by setting loaded_vmcs after the call to vmx_vcpu_put() in vmx_switch_vmcs(). Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15powerpc/pseries: Avoid using the size greater than RTAS_ERROR_LOG_MAX.Mahesh Salgaonkar1-1/+1
[ Upstream commit 74e96bf44f430cf7a01de19ba6cf49b361cdfd6e ] The global mce data buffer that used to copy rtas error log is of 2048 (RTAS_ERROR_LOG_MAX) bytes in size. Before the copy we read extended_log_length from rtas error log header, then use max of extended_log_length and RTAS_ERROR_LOG_MAX as a size of data to be copied. Ideally the platform (phyp) will never send extended error log with size > 2048. But if that happens, then we have a risk of buffer overrun and corruption. Fix this by using min_t instead. Fixes: d368514c3097 ("powerpc: Fix corruption when grabbing FWNMI data") Reported-by: Michal Suchanek <msuchanek@suse.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15powerpc/64s: Make rfi_flush_fallback a little more robustMichael Ellerman1-0/+6
[ Upstream commit 78ee9946371f5848ddfc88ab1a43867df8f17d83 ] Because rfi_flush_fallback runs immediately before the return to userspace it currently runs with the user r1 (stack pointer). This means if we oops in there we will report a bad kernel stack pointer in the exception entry path, eg: Bad kernel stack pointer 7ffff7150e40 at c0000000000023b4 Oops: Bad kernel stack pointer, sig: 6 [#1] LE SMP NR_CPUS=32 NUMA PowerNV Modules linked in: CPU: 0 PID: 1246 Comm: klogd Not tainted 4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3 #7 NIP: c0000000000023b4 LR: 0000000010053e00 CTR: 0000000000000040 REGS: c0000000fffe7d40 TRAP: 4100 Not tainted (4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3) MSR: 9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE> CR: 44000442 XER: 20000000 CFAR: c00000000000bac8 IRQMASK: c0000000f1e66a80 GPR00: 0000000002000000 00007ffff7150e40 00007fff93a99900 0000000000000020 ... NIP [c0000000000023b4] rfi_flush_fallback+0x34/0x80 LR [0000000010053e00] 0x10053e00 Although the NIP tells us where we were, and the TRAP number tells us what happened, it would still be nicer if we could report the actual exception rather than barfing about the stack pointer. We an do that fairly simply by loading the kernel stack pointer on entry and restoring the user value before returning. That way we see a regular oops such as: Unrecoverable exception 4100 at c00000000000239c Oops: Unrecoverable exception, sig: 6 [#1] LE SMP NR_CPUS=32 NUMA PowerNV Modules linked in: CPU: 0 PID: 1251 Comm: klogd Not tainted 4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty #40 NIP: c00000000000239c LR: 0000000010053e00 CTR: 0000000000000040 REGS: c0000000f1e17bb0 TRAP: 4100 Not tainted (4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty) MSR: 9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE> CR: 44000442 XER: 20000000 CFAR: c00000000000bac8 IRQMASK: 0 ... NIP [c00000000000239c] rfi_flush_fallback+0x3c/0x80 LR [0000000010053e00] 0x10053e00 Call Trace: [c0000000f1e17e30] [c00000000000b9e4] system_call+0x5c/0x70 (unreliable) Note this shouldn't make the kernel stack pointer vulnerable to a meltdown attack, because it should be flushed from the cache before we return to userspace. The user r1 value will be in the cache, because we load it in the return path, but that is harmless. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15powerpc/platforms/85xx: fix t1042rdb_diu.c build errors & warningRandy Dunlap1-0/+4
[ Upstream commit f5daf77a55ef0e695cc90c440ed6503073ac5e07 ] Fix build errors and warnings in t1042rdb_diu.c by adding header files and MODULE_LICENSE(). ../arch/powerpc/platforms/85xx/t1042rdb_diu.c:152:1: warning: data definition has no type or storage class early_initcall(t1042rdb_diu_init); ../arch/powerpc/platforms/85xx/t1042rdb_diu.c:152:1: error: type defaults to 'int' in declaration of 'early_initcall' [-Werror=implicit-int] ../arch/powerpc/platforms/85xx/t1042rdb_diu.c:152:1: warning: parameter names (without types) in function declaration and WARNING: modpost: missing MODULE_LICENSE() in arch/powerpc/platforms/85xx/t1042rdb_diu.o Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15powerpc: Fix size calculation using resource_size()Dan Carpenter1-1/+1
[ Upstream commit c42d3be0c06f0c1c416054022aa535c08a1f9b39 ] The problem is the the calculation should be "end - start + 1" but the plus one is missing in this calculation. Fixes: 8626816e905e ("powerpc: add support for MPIC message register API") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15powerpc/uaccess: Enable get_user(u64, *p) on 32-bitMichael Ellerman1-3/+10
[ Upstream commit f7a6947cd49b7ff4e03f1b4f7e7b223003d752ca ] Currently if you build a 32-bit powerpc kernel and use get_user() to load a u64 value it will fail to build with eg: kernel/rseq.o: In function `rseq_get_rseq_cs': kernel/rseq.c:123: undefined reference to `__get_user_bad' This is hitting the check in __get_user_size() that makes sure the size we're copying doesn't exceed the size of the destination: #define __get_user_size(x, ptr, size, retval) do { retval = 0; __chk_user_ptr(ptr); if (size > sizeof(x)) (x) = __get_user_bad(); Which doesn't immediately make sense because the size of the destination is u64, but it's not really, because __get_user_check() etc. internally create an unsigned long and copy into that: #define __get_user_check(x, ptr, size) ({ long __gu_err = -EFAULT; unsigned long __gu_val = 0; The problem being that on 32-bit unsigned long is not big enough to hold a u64. We can fix this with a trick from hpa in the x86 code, we statically check the type of x and set the type of __gu_val to either unsigned long or unsigned long long. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15s390/kdump: Fix memleak in nt_vmcoreinfoPhilipp Rudo1-5/+12
[ Upstream commit 2d2e7075b87181ed0c675e4936e20bdadba02e1f ] The vmcoreinfo of a crashed system is potentially fragmented. Thus the crash kernel has an intermediate step where the vmcoreinfo is copied into a temporary, continuous buffer in the crash kernel memory. This temporary buffer is never freed. Free it now to prevent the memleak. While at it replace all occurrences of "VMCOREINFO" by its corresponding macro to prevent potential renaming issues. Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15x86/mce: Add notifier_block forward declarationArnd Bergmann1-0/+1
[ Upstream commit 704ae091b061082b37a9968621af4c290c641d50 ] Without linux/irq.h, there is no declaration of notifier_block, leading to a build warning: In file included from arch/x86/kernel/cpu/mcheck/threshold.c:10: arch/x86/include/asm/mce.h:151:46: error: 'struct notifier_block' declared inside parameter list will not be visible outside of this definition or declaration [-Werror] It's sufficient to declare the struct tag here, which avoids pulling in more header files. Fixes: 447ae3166702 ("x86: Don't include linux/irq.h from asm/hardirq.h") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Nicolai Stange <nstange@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20180817100156.3009043-1-arnd@arndb.de Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09arm64: mm: always enable CONFIG_HOLES_IN_ZONEJames Morse1-1/+0
commit f52bb98f5aded4c43e52f5ce19fb83f7261e9e73 upstream. Commit 6d526ee26ccd ("arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA") only enabled HOLES_IN_ZONE for NUMA systems because the NUMA code was choking on the missing zone for nomap pages. This problem doesn't just apply to NUMA systems. If the architecture doesn't set HAVE_ARCH_PFN_VALID, pfn_valid() will return true if the pfn is part of a valid sparsemem section. When working with multiple pages, the mm code uses pfn_valid_within() to test each page it uses within the sparsemem section is valid. On most systems memory comes in MAX_ORDER_NR_PAGES chunks which all have valid/initialised struct pages. In this case pfn_valid_within() is optimised out. Systems where this isn't true (e.g. due to nomap) should set HOLES_IN_ZONE and provide HAVE_ARCH_PFN_VALID so that mm tests each page as it works with it. Currently non-NUMA arm64 systems can't enable HOLES_IN_ZONE, leading to a VM_BUG_ON(): | page:fffffdff802e1780 is uninitialized and poisoned | raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff | raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff | page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) | ------------[ cut here ]------------ | kernel BUG at include/linux/mm.h:978! | Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [...] | CPU: 1 PID: 25236 Comm: dd Not tainted 4.18.0 #7 | Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 | pstate: 40000085 (nZcv daIf -PAN -UAO) | pc : move_freepages_block+0x144/0x248 | lr : move_freepages_block+0x144/0x248 | sp : fffffe0071177680 [...] | Process dd (pid: 25236, stack limit = 0x0000000094cc07fb) | Call trace: | move_freepages_block+0x144/0x248 | steal_suitable_fallback+0x100/0x16c | get_page_from_freelist+0x440/0xb20 | __alloc_pages_nodemask+0xe8/0x838 | new_slab+0xd4/0x418 | ___slab_alloc.constprop.27+0x380/0x4a8 | __slab_alloc.isra.21.constprop.26+0x24/0x34 | kmem_cache_alloc+0xa8/0x180 | alloc_buffer_head+0x1c/0x90 | alloc_page_buffers+0x68/0xb0 | create_empty_buffers+0x20/0x1ec | create_page_buffers+0xb0/0xf0 | __block_write_begin_int+0xc4/0x564 | __block_write_begin+0x10/0x18 | block_write_begin+0x48/0xd0 | blkdev_write_begin+0x28/0x30 | generic_perform_write+0x98/0x16c | __generic_file_write_iter+0x138/0x168 | blkdev_write_iter+0x80/0xf0 | __vfs_write+0xe4/0x10c | vfs_write+0xb4/0x168 | ksys_write+0x44/0x88 | sys_write+0xc/0x14 | el0_svc_naked+0x30/0x34 | Code: aa1303e0 90001a01 91296421 94008902 (d4210000) | ---[ end trace 1601ba47f6e883fe ]--- Remove the NUMA dependency. Link: https://www.spinics.net/lists/arm-kernel/msg671851.html Cc: <stable@vger.kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09sys: don't hold uts_sem while accessing userspace memoryJann Horn3-44/+49
commit 42a0cc3478584d4d63f68f2f5af021ddbea771fa upstream. Holding uts_sem as a writer while accessing userspace memory allows a namespace admin to stall all processes that attempt to take uts_sem. Instead, move data through stack buffers and don't access userspace memory while uts_sem is held. Cc: stable@vger.kernel.org Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09ARM: tegra: Fix Tegra30 Cardhu PCA954x resetJon Hunter1-0/+1
commit 6e1811900b6fe6f2b4665dba6bd6ed32c6b98575 upstream. On all versions of Tegra30 Cardhu, the reset signal to the NXP PCA9546 I2C mux is connected to the Tegra GPIO BB0. Currently, this pin on the Tegra is not configured as a GPIO but as a special-function IO (SFIO) that is multiplexing the pin to an I2S controller. On exiting system suspend, I2C commands sent to the PCA9546 are failing because there is no ACK. Although it is not possible to see exactly what is happening to the reset during suspend, by ensuring it is configured as a GPIO and driven high, to de-assert the reset, the failures are no longer seen. Please note that this GPIO is also used to drive the reset signal going to the camera connector on the board. However, given that there is no camera support currently for Cardhu, this should not have any impact. Fixes: 40431d16ff11 ("ARM: tegra: enable PCA9546 on Cardhu") Cc: stable@vger.kernel.org Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09xtensa: increase ranges in ___invalidate_{i,d}cache_allMax Filippov1-2/+2
commit fec3259c9f747c039f90e99570540114c8d81a14 upstream. Cache invalidation macros use cache line size to iterate over invalidated cache lines, assuming that all cache ways are invalidated by single instruction, but xtensa ISA recommends to not assume that for future compatibility: In some implementations all ways at index Addry-1..z are invalidated regardless of the specified way, but for future compatibility this behavior should not be assumed. Iterate over all cache ways in ___invalidate_icache_all and ___invalidate_dcache_all. Cc: stable@vger.kernel.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09xtensa: limit offsets in __loop_cache_{all,page}Max Filippov1-25/+40
commit be75de25251f7cf3e399ca1f584716a95510d24a upstream. When building kernel for xtensa cores with big cache lines (e.g. 128 bytes or more) __loop_cache_all and __loop_cache_page may generate assembly instructions with immediate fields that are too big. This results in the following build errors: arch/xtensa/mm/misc.S: Assembler messages: arch/xtensa/mm/misc.S:464: Error: operand 2 of 'diwbi' has invalid value '256' arch/xtensa/mm/misc.S:464: Error: operand 2 of 'diwbi' has invalid value '384' arch/xtensa/kernel/head.S: Assembler messages: arch/xtensa/kernel/head.S:172: Error: operand 2 of 'diu' has invalid value '256' arch/xtensa/kernel/head.S:172: Error: operand 2 of 'diu' has invalid value '384' arch/xtensa/kernel/head.S:176: Error: operand 2 of 'iiu' has invalid value '256' arch/xtensa/kernel/head.S:176: Error: operand 2 of 'iiu' has invalid value '384' arch/xtensa/kernel/head.S:255: Error: operand 2 of 'diwb' has invalid value '256' arch/xtensa/kernel/head.S:255: Error: operand 2 of 'diwb' has invalid value '384' Add parameter max_immed to these macros and use it to limit values of immediate operands. Extract common code of these macros into the new macro __loop_cache_unroll. Cc: stable@vger.kernel.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09KVM: PPC: Book3S: Fix guest DMA when guest partially backed by THP pagesPaul Mackerras1-7/+10
commit 8cfbdbdc24815417a3ab35101ccf706b9a23ff17 upstream. Commit 76fa4975f3ed ("KVM: PPC: Check if IOMMU page is contained in the pinned physical page", 2018-07-17) added some checks to ensure that guest DMA mappings don't attempt to map more than the guest is entitled to access. However, errors in the logic mean that legitimate guest requests to map pages for DMA are being denied in some situations. Specifically, if the first page of the range passed to mm_iommu_get() is mapped with a normal page, and subsequent pages are mapped with transparent huge pages, we end up with mem->pageshift == 0. That means that the page size checks in mm_iommu_ua_to_hpa() and mm_iommu_up_to_hpa_rm() will always fail for every page in that region, and thus the guest can never map any memory in that region for DMA, typically leading to a flood of error messages like this: qemu-system-ppc64: VFIO_MAP_DMA: -22 qemu-system-ppc64: vfio_dma_map(0x10005f47780, 0x800000000000000, 0x10000, 0x7fff63ff0000) = -22 (Invalid argument) The logic errors in mm_iommu_get() are: (a) use of 'ua' not 'ua + (i << PAGE_SHIFT)' in the find_linux_pte() call (meaning that find_linux_pte() returns the pte for the first address in the range, not the address we are currently up to); (b) use of 'pageshift' as the variable to receive the hugepage shift returned by find_linux_pte() - for a normal page this gets set to 0, leading to us setting mem->pageshift to 0 when we conclude that the pte returned by find_linux_pte() didn't match the page we were looking at; (c) comparing 'compshift', which is a page order, i.e. log base 2 of the number of pages, with 'pageshift', which is a log base 2 of the number of bytes. To fix these problems, this patch introduces 'cur_ua' to hold the current user address and uses that in the find_linux_pte() call; introduces 'pteshift' to hold the hugepage shift found by find_linux_pte(); and compares 'pteshift' with 'compshift + PAGE_SHIFT' rather than 'compshift'. The patch also moves the local_irq_restore to the point after the PTE pointer returned by find_linux_pte() has been dereferenced because otherwise the PTE could change underneath us, and adds a check to avoid doing the find_linux_pte() call once mem->pageshift has been reduced to PAGE_SHIFT, as an optimization. Fixes: 76fa4975f3ed ("KVM: PPC: Check if IOMMU page is contained in the pinned physical page") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09KVM: VMX: fixes for vmentry_l1d_flush module parameterPaolo Bonzini1-10/+16
commit 0027ff2a75f9dcf0537ac0a65c5840b0e21a4950 upstream. Two bug fixes: 1) missing entries in the l1d_param array; this can cause a host crash if an access attempts to reach the missing entry. Future-proof the get function against any overflows as well. However, the two entries VMENTER_L1D_FLUSH_EPT_DISABLED and VMENTER_L1D_FLUSH_NOT_REQUIRED must not be accepted by the parse function, so disable them there. 2) invalid values must be rejected even if the CPU does not have the bug, so test for them before checking boot_cpu_has(X86_BUG_L1TF) ... and a small refactoring, since the .cmd field is redundant with the index in the array. Reported-by: Bandan Das <bsd@redhat.com> Cc: stable@vger.kernel.org Fixes: a7b9020b06ec6d7c3f3b0d4ef1a9eba12654f4f7 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09powerpc/powernv/pci: Work around races in PCI bridge enablingBenjamin Herrenschmidt1-0/+37
commit db2173198b9513f7add8009f225afa1f1c79bcc6 upstream. The generic code is racy when multiple children of a PCI bridge try to enable it simultaneously. This leads to drivers trying to access a device through a not-yet-enabled bridge, and this EEH errors under various circumstances when using parallel driver probing. There is work going on to fix that properly in the PCI core but it will take some time. x86 gets away with it because (outside of hotplug), the BIOS enables all the bridges at boot time. This patch does the same thing on powernv by enabling all bridges that have child devices at boot time, thus avoiding subsequent races. It's suitable for backporting to stable and distros, while the proper PCI fix will probably be significantly more invasive. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09powerpc/pseries: Fix endianness while restoring of r3 in MCE handler.Mahesh Salgaonkar1-1/+1
commit cd813e1cd7122f2c261dce5b54d1e0c97f80e1a5 upstream. During Machine Check interrupt on pseries platform, register r3 points RTAS extended event log passed by hypervisor. Since hypervisor uses r3 to pass pointer to rtas log, it stores the original r3 value at the start of the memory (first 8 bytes) pointed by r3. Since hypervisor stores this info and rtas log is in BE format, linux should make sure to restore r3 value in correct endian format. Without this patch when MCE handler, after recovery, returns to code that that caused the MCE may end up with Data SLB access interrupt for invalid address followed by kernel panic or hang. Severe Machine check interrupt [Recovered] NIP [d00000000ca301b8]: init_module+0x1b8/0x338 [bork_kernel] Initiator: CPU Error type: SLB [Multihit] Effective address: d00000000ca70000 cpu 0xa: Vector: 380 (Data SLB Access) at [c0000000fc7775b0] pc: c0000000009694c0: vsnprintf+0x80/0x480 lr: c0000000009698e0: vscnprintf+0x20/0x60 sp: c0000000fc777830 msr: 8000000002009033 dar: a803a30c000000d0 current = 0xc00000000bc9ef00 paca = 0xc00000001eca5c00 softe: 3 irq_happened: 0x01 pid = 8860, comm = insmod vscnprintf+0x20/0x60 vprintk_emit+0xb4/0x4b0 vprintk_func+0x5c/0xd0 printk+0x38/0x4c init_module+0x1c0/0x338 [bork_kernel] do_one_initcall+0x54/0x230 do_init_module+0x8c/0x248 load_module+0x12b8/0x15b0 sys_finit_module+0xa8/0x110 system_call+0x58/0x6c --- Exception: c00 (System Call) at 00007fff8bda0644 SP (7fffdfbfe980) is in userspace This patch fixes this issue. Fixes: a08a53ea4c97 ("powerpc/le: Enable RTAS events support") Cc: stable@vger.kernel.org # v3.15+ Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09powerpc/fadump: handle crash memory ranges array index overflowHari Bathini2-17/+77
commit 1bd6a1c4b80a28d975287630644e6b47d0f977a5 upstream. Crash memory ranges is an array of memory ranges of the crashing kernel to be exported as a dump via /proc/vmcore file. The size of the array is set based on INIT_MEMBLOCK_REGIONS, which works alright in most cases where memblock memory regions count is less than INIT_MEMBLOCK_REGIONS value. But this count can grow beyond INIT_MEMBLOCK_REGIONS value since commit 142b45a72e22 ("memblock: Add array resizing support"). On large memory systems with a few DLPAR operations, the memblock memory regions count could be larger than INIT_MEMBLOCK_REGIONS value. On such systems, registering fadump results in crash or other system failures like below: task: c00007f39a290010 ti: c00000000b738000 task.ti: c00000000b738000 NIP: c000000000047df4 LR: c0000000000f9e58 CTR: c00000000010f180 REGS: c00000000b73b570 TRAP: 0300 Tainted: G L X (4.4.140+) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22004484 XER: 20000000 CFAR: c000000000008500 DAR: 000007a450000000 DSISR: 40000000 SOFTE: 0 ... NIP [c000000000047df4] smp_send_reschedule+0x24/0x80 LR [c0000000000f9e58] resched_curr+0x138/0x160 Call Trace: resched_curr+0x138/0x160 (unreliable) check_preempt_curr+0xc8/0xf0 ttwu_do_wakeup+0x38/0x150 try_to_wake_up+0x224/0x4d0 __wake_up_common+0x94/0x100 ep_poll_callback+0xac/0x1c0 __wake_up_common+0x94/0x100 __wake_up_sync_key+0x70/0xa0 sock_def_readable+0x58/0xa0 unix_stream_sendmsg+0x2dc/0x4c0 sock_sendmsg+0x68/0xa0 ___sys_sendmsg+0x2cc/0x2e0 __sys_sendmsg+0x5c/0xc0 SyS_socketcall+0x36c/0x3f0 system_call+0x3c/0x100 as array index overflow is not checked for while setting up crash memory ranges causing memory corruption. To resolve this issue, dynamically allocate memory for crash memory ranges and resize it incrementally, in units of pagesize, on hitting array size limit. Fixes: 2df173d9e85d ("fadump: Initialize elfcore header and add PT_LOAD program headers.") Cc: stable@vger.kernel.org # v3.4+ Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> [mpe: Just use PAGE_SIZE directly, fixup variable placement] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09Fix kexec forbidding kernels signed with keys in the secondary keyring to bootYannik Sembritzki1-1/+1
commit ea93102f32244e3f45c8b26260be77ed0cc1d16c upstream. The split of .system_keyring into .builtin_trusted_keys and .secondary_trusted_keys broke kexec, thereby preventing kernels signed by keys which are now in the secondary keyring from being kexec'd. Fix this by passing VERIFY_USE_SECONDARY_KEYRING to verify_pefile_signature(). Fixes: d3bfe84129f6 ("certs: Add a secondary system keyring that can be added to dynamically") Signed-off-by: Yannik Sembritzki <yannik@sembritzki.me> Signed-off-by: David Howells <dhowells@redhat.com> Cc: kexec@lists.infradead.org Cc: keyrings@vger.kernel.org Cc: linux-security-module@vger.kernel.org Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05MIPS: lib: Provide MIPS64r6 __multi3() for GCC < 7Paul Burton1-3/+3
commit 690d9163bf4b8563a2682e619f938e6a0443947f upstream. Some versions of GCC suboptimally generate calls to the __multi3() intrinsic for MIPS64r6 builds, resulting in link failures due to the missing function: LD vmlinux.o MODPOST vmlinux.o kernel/bpf/verifier.o: In function `kmalloc_array': include/linux/slab.h:631: undefined reference to `__multi3' fs/select.o: In function `kmalloc_array': include/linux/slab.h:631: undefined reference to `__multi3' ... We already have a workaround for this in which we provide the instrinsic, but we do so selectively for GCC 7 only. Unfortunately the issue occurs with older GCC versions too - it has been observed with both GCC 5.4.0 & GCC 6.4.0. MIPSr6 support was introduced in GCC 5, so all major GCC versions prior to GCC 8 are affected and we extend our workaround accordingly to all MIPS64r6 builds using GCC versions older than GCC 8. Signed-off-by: Paul Burton <paul.burton@mips.com> Reported-by: Vladimir Kondratiev <vladimir.kondratiev@intel.com> Fixes: ebabcf17bcd7 ("MIPS: Implement __multi3 for GCC7 MIPS64r6 builds") Patchwork: https://patchwork.linux-mips.org/patch/20297/ Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # 4.15+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05MIPS: Change definition of cpu_relax() for Loongson-3Huacai Chen1-0/+13
commit a30718868915fbb991a9ae9e45594b059f28e9ae upstream. Linux expects that if a CPU modifies a memory location, then that modification will eventually become visible to other CPUs in the system. Loongson 3 CPUs include a Store Fill Buffer (SFB) which sits between a core & its L1 data cache, queueing memory accesses & allowing for faster forwarding of data from pending stores to younger loads from the core. Unfortunately the SFB prioritizes loads such that a continuous stream of loads may cause a pending write to be buffered indefinitely. This is problematic if we end up with 2 CPUs which each perform a store that the other polls for - one or both CPUs may end up with their stores buffered in the SFB, never reaching cache due to the continuous reads from the poll loop. Such a deadlock condition has been observed whilst running qspinlock code. This patch changes the definition of cpu_relax() to smp_mb() for Loongson-3, forcing a flush of the SFB on SMP systems which will cause any pending writes to make it as far as the L1 caches where they will become visible to other CPUs. If the kernel is not compiled for SMP support, this will expand to a barrier() as before. This workaround matches that currently implemented for ARM when CONFIG_ARM_ERRATA_754327=y, which was introduced by commit 534be1d5a2da ("ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore"). Although the workaround is only required when the Loongson 3 SFB functionality is enabled, and we only began explicitly enabling that functionality in v4.7 with commit 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT"), existing or future firmware may enable the SFB which means we may need the workaround backported to earlier kernels too. [paul.burton@mips.com: - Reword commit message & comment. - Limit stable backport to v3.15+ where we support Loongson 3 CPUs.] Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Paul Burton <paul.burton@mips.com> References: 534be1d5a2da ("ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore") References: 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT") Patchwork: https://patchwork.linux-mips.org/patch/19830/ Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: Huacai Chen <chenhuacai@gmail.com> Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05MIPS: Always use -march=<arch>, not -<arch> shortcutsPaul Burton1-8/+4
commit 344ebf09949c31bcb8818d8458b65add29f1d67b upstream. The VDSO Makefile filters CFLAGS to select a subset which it uses whilst building the VDSO ELF. One of the flags it allows through is the -march= flag that selects the architecture/ISA to target. Unfortunately in cases where CONFIG_CPU_MIPS32_R{1,2}=y and the toolchain defaults to building for MIPS64, the main MIPS Makefile ends up using the short-form -<arch> flags in cflags-y. This is because the calls to cc-option always fail to use the long-form -march=<arch> flag due to the lack of an -mabi=<abi> flag in KBUILD_CFLAGS at the point where the cc-option function is executed. The resulting GCC invocation is something like: $ mips64-linux-gcc -Werror -march=mips32r2 -c -x c /dev/null -o tmp cc1: error: '-march=mips32r2' is not compatible with the selected ABI These short-form -<arch> flags are dropped by the VDSO Makefile's filtering, and so we attempt to build the VDSO without specifying any architecture. This results in an attempt to build the VDSO using whatever the compiler's default architecture is, regardless of whether that is suitable for the kernel configuration. One encountered build failure resulting from this mismatch is a rejection of the sync instruction if the kernel is configured for a MIPS32 or MIPS64 r1 or r2 target but the toolchain defaults to an older architecture revision such as MIPS1 which did not include the sync instruction: CC arch/mips/vdso/gettimeofday.o /tmp/ccGQKoOj.s: Assembler messages: /tmp/ccGQKoOj.s:273: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:329: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:520: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:714: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1009: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1066: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1114: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1279: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1334: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1374: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1459: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1514: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1814: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:2002: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:2066: Error: opcode not supported on this processor: mips1 (mips1) `sync' make[2]: *** [scripts/Makefile.build:318: arch/mips/vdso/gettimeofday.o] Error 1 make[1]: *** [scripts/Makefile.build:558: arch/mips/vdso] Error 2 make[1]: *** Waiting for unfinished jobs.... This can be reproduced for example by attempting to build pistachio_defconfig using Arnd's GCC 8.1.0 mips64 toolchain from kernel.org: https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/8.1.0/x86_64-gcc-8.1.0-nolibc-mips64-linux.tar.xz Resolve this problem by using the long-form -march=<arch> in all cases, which makes it through the arch/mips/vdso/Makefile's filtering & is thus consistently used to build both the kernel proper & the VDSO. The use of cc-option to prefer the long-form & fall back to the short-form flags makes no sense since the short-form is just an abbreviation for the also-supported long-form in all GCC versions that we support building with. This means there is no case in which we have to use the short-form -<arch> flags, so we can simply remove them. The manual redefinition of _MIPS_ISA is removed naturally along with the use of the short-form flags that it accompanied, and whilst here we remove the separate assembler ISA selection. I suspect that both of these were only required due to the mips32 vs mips2 mismatch that was introduced by commit 59b3e8e9aac6 ("[MIPS] Makefile crapectomy.") and fixed but not cleaned up by commit 9200c0b2a07c ("[MIPS] Fix Makefile bugs for MIPS32/MIPS64 R1 and R2."). I've marked this for backport as far as v4.4 where the MIPS VDSO was introduced. In earlier kernels there should be no ill effect to using the short-form flags. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.4+ Reviewed-by: James Hogan <jhogan@kernel.org> Patchwork: https://patchwork.linux-mips.org/patch/19579/ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05MIPS: Correct the 64-bit DSP accumulator register sizeMaciej W. Rozycki3-3/+3
commit f5958b4cf4fc38ed4583ab83fb7c4cd1ab05f47b upstream. Use the `unsigned long' rather than `__u32' type for DSP accumulator registers, like with the regular MIPS multiply/divide accumulator and general-purpose registers, as all are 64-bit in 64-bit implementations and using a 32-bit data type leads to contents truncation on context saving. Update `arch_ptrace' and `compat_arch_ptrace' accordingly, removing casts that are similarly not used with multiply/divide accumulator or general-purpose register accesses. Signed-off-by: Maciej W. Rozycki <macro@mips.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: e50c0a8fa60d ("Support the MIPS32 / MIPS64 DSP ASE.") Patchwork: https://patchwork.linux-mips.org/patch/19329/ Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-fsdevel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # 2.6.15+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05kprobes/arm: Fix %p uses in error messagesMasami Hiramatsu2-3/+2
commit 75b2f5f5911fe7a2fc82969b2b24dde34e8f820d upstream. Fix %p uses in error messages by removing it and using general dumper. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Howells <dhowells@redhat.com> Cc: David S . Miller <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jon Medhurst <tixy@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tobin C . Harding <me@tobin.cc> Cc: Will Deacon <will.deacon@arm.com> Cc: acme@kernel.org Cc: akpm@linux-foundation.org Cc: brueckner@linux.vnet.ibm.com Cc: linux-arch@vger.kernel.org Cc: rostedt@goodmis.org Cc: schwidefsky@de.ibm.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/lkml/152491905361.9916.15300852365956231645.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05s390/pci: fix out of bounds access during irq setupSebastian Ott1-0/+2
commit 866f3576a72b2233a76dffb80290f8086dc49e17 upstream. During interrupt setup we allocate interrupt vectors, walk the list of msi descriptors, and fill in the message data. Requesting more interrupts than supported on s390 can lead to an out of bounds access. When we restrict the number of interrupts we should also stop walking the msi list after all supported interrupts are handled. Cc: stable@vger.kernel.org Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05s390/numa: move initial setup of node_to_cpumask_mapMartin Schwidefsky1-14/+2
commit fb7d7518b0d65955f91c7b875c36eae7694c69bd upstream. The numa_init_early initcall sets the node_to_cpumask_map[0] to the full cpu_possible_mask. Unfortunately this early_initcall is too late, the NUMA setup for numa=emu is done even earlier. The order of calls is numa_setup() -> emu_update_cpu_topology(), then the early_initcalls(), followed by sched_init_domains(). Starting with git commit 051f3ca02e46432c0965e8948f00c07d8a2f09c0 "sched/topology: Introduce NUMA identity node sched domain" the incorrect node_to_cpumask_map[0] really screws up the domain setup and the kernel panics with the follow oops: Cc: <stable@vger.kernel.org> # v4.15+ Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05s390/qdio: reset old sbal_state flagsJulian Wiedmann1-1/+0
commit 64e03ff72623b8c2ea89ca3cb660094e019ed4ae upstream. When allocating a new AOB fails, handle_outbound() is still capable of transmitting the selected buffer (just without async completion). But if a previous transfer on this queue slot used async completion, its sbal_state flags field is still set to QDIO_OUTBUF_STATE_FLAG_PENDING. So when the upper layer driver sees this stale flag, it expects an async completion that never happens. Fix this by unconditionally clearing the flags field. Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks") Cc: <stable@vger.kernel.org> #v3.2+ Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05s390: fix br_r1_trampoline for machines without exrlMartin Schwidefsky1-2/+0
commit 26f843848bae973817b3587780ce6b7b0200d3e4 upstream. For machines without the exrl instruction the BFP jit generates code that uses an "br %r1" instruction located in the lowcore page. Unfortunately there is a cut & paste error that puts an additional "larl %r1,.+14" instruction in the code that clobbers the branch target address in %r1. Remove the larl instruction. Cc: <stable@vger.kernel.org> # v4.17+ Fixes: de5cb6eb51 ("s390: use expoline thunks in the BPF JIT") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05s390/mm: fix addressing exception after suspend/resumeGerald Schaefer1-1/+1
commit 37a366face294facb9c9d9fdd9f5b64a27456cbd upstream. Commit c9b5ad546e7d "s390/mm: tag normal pages vs pages used in page tables" accidentally changed the logic in arch_set_page_states(), which is used by the suspend/resume code. set_page_stable(page, order) was changed to set_page_stable_dat(page, 0). After this, only the first page of higher order pages will be set to stable, and a write to one of the unstable pages will result in an addressing exception. Fix this by using "order" again, instead of "0". Fixes: c9b5ad546e7d ("s390/mm: tag normal pages vs pages used in page tables") Cc: stable@vger.kernel.org # 4.14+ Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()Jann Horn1-0/+4
commit f12d11c5c184626b4befdee3d573ec8237405a33 upstream. Reset the KASAN shadow state of the task stack before rewinding RSP. Without this, a kernel oops will leave parts of the stack poisoned, and code running under do_exit() can trip over such poisoned regions and cause nonsensical false-positive KASAN reports about stack-out-of-bounds bugs. This does not wipe the exception stacks; if an oops happens on an exception stack, it might result in random KASAN false-positives from other tasks afterwards. This is probably relatively uninteresting, since if the kernel oopses on an exception stack, there are most likely bigger things to worry about. It'd be more interesting if vmapped stacks and KASAN were compatible, since then handle_stack_overflow() would oops from exception stack context. Fixes: 2deb4be28077 ("x86/dumpstack: When OOPSing, rewind the stack before do_exit()") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: kasan-dev@googlegroups.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180828184033.93712-1-jannh@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+Andi Kleen3-6/+45
commit cc51e5428ea54f575d49cfcede1d4cb3a72b4ec4 upstream. On Nehalem and newer core CPUs the CPU cache internally uses 44 bits physical address space. The L1TF workaround is limited by this internal cache address width, and needs to have one bit free there for the mitigation to work. Older client systems report only 36bit physical address space so the range check decides that L1TF is not mitigated for a 36bit phys/32GB system with some memory holes. But since these actually have the larger internal cache width this warning is bogus because it would only really be needed if the system had more than 43bits of memory. Add a new internal x86_cache_bits field. Normally it is the same as the physical bits field reported by CPUID, but for Nehalem and newerforce it to be at least 44bits. Change the L1TF memory size warning to use the new cache_bits field to avoid bogus warnings and remove the bogus comment about memory size. Fixes: 17dbca119312 ("x86/speculation/l1tf: Add sysfs reporting for l1tf") Reported-by: George Anchev <studio@anchev.net> Reported-by: Christopher Snowhill <kode54@gmail.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org Cc: Michael Hocko <mhocko@suse.com> Cc: vbabka@suse.cz Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180824170351.34874-1-andi@firstfloor.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05x86/spectre: Add missing family 6 check to microcode checkAndi Kleen1-0/+3
commit 1ab534e85c93945f7862378d8c8adcf408205b19 upstream. The check for Spectre microcodes does not check for family 6, only the model numbers. Add a family 6 check to avoid ambiguity with other families. Fixes: a5b296636453 ("x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes") Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180824170351.34874-2-andi@firstfloor.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05x86/irqflags: Mark native_restore_fl extern inlineNick Desaulniers1-1/+2
commit 1f59a4581b5ecfe9b4f049a7a2cf904d8352842d upstream. This should have been marked extern inline in order to pick up the out of line definition in arch/x86/kernel/irqflags.S. Fixes: 208cbb325589 ("x86/irqflags: Provide a declaration for native_save_fl") Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180827214011.55428-1-ndesaulniers@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05x86/nmi: Fix NMI uaccess race against CR3 switchingAndy Lutomirski4-1/+53
commit 4012e77a903d114f915fc607d6d2ed54a3d6c9b1 upstream. A NMI can hit in the middle of context switching or in the middle of switch_mm_irqs_off(). In either case, CR3 might not match current->mm, which could cause copy_from_user_nmi() and friends to read the wrong memory. Fix it by adding a new nmi_uaccess_okay() helper and checking it in copy_from_user_nmi() and in __copy_from_user_nmi()'s callers. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rik van Riel <riel@surriel.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jann Horn <jannh@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/dd956eba16646fd0b15c3c0741269dfd84452dac.1535557289.git.luto@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05x86/vdso: Fix lsl operand orderSamuel Neves1-1/+1
commit e78e5a91456fcecaa2efbb3706572fe043766f4d upstream. In the __getcpu function, lsl is using the wrong target and destination registers. Luckily, the compiler tends to choose %eax for both variables, so it has been working so far. Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available") Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180901201452.27828-1-sneves@dei.uc.pt Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05x86/kvm/vmx: Remove duplicate l1d flush definitionsJosh Poimboeuf1-3/+0
commit 94d7a86c21a3d6046bf4616272313cb7d525075a upstream. These are already defined higher up in the file. Fixes: 7db92e165ac8 ("x86/kvm: Move l1tf setup function") Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/d7ca03ae210d07173452aeed85ffe344301219a5.1534253536.git.jpoimboe@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabledThomas Gleixner1-4/+4
commit 024d83cadc6b2af027e473720f3c3da97496c318 upstream. Mikhail reported the following lockdep splat: WARNING: possible irq lock inversion dependency detected CPU 0/KVM/10284 just changed the state of lock: 000000000d538a88 (&st->lock){+...}, at: speculative_store_bypass_update+0x10b/0x170 but this lock was taken by another, HARDIRQ-safe lock in the past: (&(&sighand->siglock)->rlock){-.-.} and interrupts could create inverse lock ordering between them. Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&st->lock); local_irq_disable(); lock(&(&sighand->siglock)->rlock); lock(&st->lock); <Interrupt> lock(&(&sighand->siglock)->rlock); *** DEADLOCK *** The code path which connects those locks is: speculative_store_bypass_update() ssb_prctl_set() do_seccomp() do_syscall_64() In svm_vcpu_run() speculative_store_bypass_update() is called with interupts enabled via x86_virt_spec_ctrl_set_guest/host(). This is actually a false positive, because GIF=0 so interrupts are disabled even if IF=1; however, we can easily move the invocations of x86_virt_spec_ctrl_set_guest/host() into the interrupt disabled region to cure it, and it's a good idea to keep the GIF=0/IF=1 area as small and self-contained as possible. Fixes: 1f50ddb4f418 ("x86/speculation: Handle HT correctly on AMD") Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Borislav Petkov <bp@suse.de> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: kvm@vger.kernel.org Cc: x86@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05x86/process: Re-export start_thread()Rian Hunter1-0/+1
commit dc76803e57cc86589c4efcb5362918f9b0c0436f upstream. The consolidation of the start_thread() functions removed the export unintentionally. This breaks binfmt handlers built as a module. Add it back. Fixes: e634d8fc792c ("x86-64: merge the standard and compat start_thread() functions") Signed-off-by: Rian Hunter <rian@alum.mit.edu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Dmitry Safonov <dima@arista.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180819230854.7275-1-rian@alum.mit.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>