summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2017-06-13Merge branch 'uuid-types' of bombadil.infradead.org:public_git/uuid into ↵Christoph Hellwig2-3/+3
nvme-base
2017-06-12Merge tag 'v4.12-rc5' into for-4.13/blockJens Axboe97-438/+727
We've already got a few conflicts and upcoming work depends on some of the changes that have gone into mainline as regression fixes for this series. Pull in 4.12-rc5 to resolve these conflicts and make it easier on down stream trees to continue working on 4.13 changes. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-12Merge branch 'for-linus' of ↵Linus Torvalds5-19/+0
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull key subsystem fixes from James Morris: "Here are a bunch of fixes for Linux keyrings, including: - Fix up the refcount handling now that key structs use the refcount_t type and the refcount_t ops don't allow a 0->1 transition. - Fix a potential NULL deref after error in x509_cert_parse(). - Don't put data for the crypto algorithms to use on the stack. - Fix the handling of a null payload being passed to add_key(). - Fix incorrect cleanup an uninitialised key_preparsed_payload in key_update(). - Explicit sanitisation of potentially secure data before freeing. - Fixes for the Diffie-Helman code" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (23 commits) KEYS: fix refcount_inc() on zero KEYS: Convert KEYCTL_DH_COMPUTE to use the crypto KPP API crypto : asymmetric_keys : verify_pefile:zero memory content before freeing KEYS: DH: add __user annotations to keyctl_kdf_params KEYS: DH: ensure the KDF counter is properly aligned KEYS: DH: don't feed uninitialized "otherinfo" into KDF KEYS: DH: forbid using digest_null as the KDF hash KEYS: sanitize key structs before freeing KEYS: trusted: sanitize all key material KEYS: encrypted: sanitize all key material KEYS: user_defined: sanitize key payloads KEYS: sanitize add_key() and keyctl() key payloads KEYS: fix freeing uninitialized memory in key_update() KEYS: fix dereferencing NULL payload with nonzero length KEYS: encrypted: use constant-time HMAC comparison KEYS: encrypted: fix race causing incorrect HMAC calculations KEYS: encrypted: fix buffer overread in valid_master_desc() KEYS: encrypted: avoid encrypting/decrypting stack buffers KEYS: put keyring if install_session_keyring_to_cred() fails KEYS: Delete an error message for a failed memory allocation in get_derived_key() ...
2017-06-11Merge tag 'hexagon-for-linus-v4.12-rc5' of ↵Linus Torvalds1-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hexagon fix from Guenter Roeck: "This fixes a build error seen when building hexagon images. Richard sent me an Ack, but didn't reply when asked if he wants me to send the patch to you directly, so I figured I'd just do it" * tag 'hexagon-for-linus-v4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hexagon: Use raw_copy_to_user
2017-06-11Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds13-32/+40
Pull KVM fixes from Paolo Bonzini: "Bug fixes (ARM, s390, x86)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: async_pf: avoid async pf injection when in guest mode KVM: cpuid: Fix read/write out-of-bounds vulnerability in cpuid emulation arm: KVM: Allow unaligned accesses at HYP arm64: KVM: Allow unaligned accesses at EL2 arm64: KVM: Preserve RES1 bits in SCTLR_EL2 KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages KVM: nVMX: Fix exception injection kvm: async_pf: fix rcu_irq_enter() with irqs enabled KVM: arm/arm64: vgic-v3: Fix nr_pre_bits bitfield extraction KVM: s390: fix ais handling vs cpu model KVM: arm/arm64: Fix isues with GICv2 on GICv3 migration
2017-06-11KVM: async_pf: avoid async pf injection when in guest modeWanpeng Li3-4/+7
INFO: task gnome-terminal-:1734 blocked for more than 120 seconds. Not tainted 4.12.0-rc4+ #8 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. gnome-terminal- D 0 1734 1015 0x00000000 Call Trace: __schedule+0x3cd/0xb30 schedule+0x40/0x90 kvm_async_pf_task_wait+0x1cc/0x270 ? __vfs_read+0x37/0x150 ? prepare_to_swait+0x22/0x70 do_async_page_fault+0x77/0xb0 ? do_async_page_fault+0x77/0xb0 async_page_fault+0x28/0x30 This is triggered by running both win7 and win2016 on L1 KVM simultaneously, and then gives stress to memory on L1, I can observed this hang on L1 when at least ~70% swap area is occupied on L0. This is due to async pf was injected to L2 which should be injected to L1, L2 guest starts receiving pagefault w/ bogus %cr2(apf token from the host actually), and L1 guest starts accumulating tasks stuck in D state in kvm_async_pf_task_wait() since missing PAGE_READY async_pfs. This patch fixes the hang by doing async pf when executing L1 guest. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-11hexagon: Use raw_copy_to_userGuenter Roeck1-3/+2
Commit ac4691fac8ad ("hexagon: switch to RAW_COPY_USER") replaced __copy_to_user_hexagon() with raw_copy_to_user(), but did not catch all callers, resulting in the following build error. arch/hexagon/mm/uaccess.c: In function '__clear_user_hexagon': arch/hexagon/mm/uaccess.c:40:3: error: implicit declaration of function '__copy_to_user_hexagon' Fixes: ac4691fac8ad ("hexagon: switch to RAW_COPY_USER") Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Richard Kuo <rkuo@codeaurora.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-06-10Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: a Geode fix plus a microcode loader fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/intel: Clear patch pointer before jettisoning the initrd x86/cpu/cyrix: Add alternative Device ID of Geode GX1 SoC
2017-06-10Merge tag 'iommu-fixes-v4.12-rc4' of ↵Linus Torvalds2-16/+16
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: - another compile-fix for my header cleanup - a couple of fixes for the recently merged IOMMU probe deferal code - fixes for ACPI/IORT code necessary with IOMMU probe deferal * tag 'iommu-fixes-v4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: arm: dma-mapping: Reset the device's dma_ops ACPI/IORT: Move the check to get iommu_ops from translated fwspec ARM: dma-mapping: Don't tear down third-party mappings ACPI/IORT: Ignore all errors except EPROBE_DEFER iommu/of: Ignore all errors except EPROBE_DEFER iommu/of: Fix check for returning EPROBE_DEFER iommu/dma: Fix function declaration
2017-06-09Merge tag 'powerpc-4.12-5' of ↵Linus Torvalds16-50/+109
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Mostly fairly minor, of note are: - Fix percpu allocations to be NUMA aware - Limit 4k page size config to 64TB virtual address space - Avoid needlessly restoring FP and vector registers Thanks to Aneesh Kumar K.V, Breno Leitao, Christophe Leroy, Frederic Barrat, Madhavan Srinivasan, Michael Bringmann, Nicholas Piggin, Vaibhav Jain" * tag 'powerpc-4.12-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/book3s64: Move PPC_DT_CPU_FTRs and enable it by default powerpc/mm/4k: Limit 4k page size config to 64TB virtual address space cxl: Fix error path on bad ioctl powerpc/perf: Fix Power9 test_adder fields powerpc/numa: Fix percpu allocations to be NUMA aware cxl: Avoid double free_irq() for psl,slice interrupts powerpc/kernel: Initialize load_tm on task creation powerpc/kernel: Fix FP and vector register restoration powerpc/64: Reclaim CPU_FTR_SUBCORE powerpc/hotplug-mem: Fix missing endian conversion of aa_index powerpc/sysdev/simple_gpio: Fix oops in gpio save_regs function powerpc/spufs: Fix coredump of SPU contexts powerpc/64s: Add dt_cpu_ftrs boot time setup option
2017-06-09Merge tag 'armsoc-fixes' of ↵Linus Torvalds9-8/+35
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Been sitting on these for a couple of weeks waiting on some larger batches to come in but it's been pretty quiet. Just your garden variety fixes here: - A few maintainers updates (ep93xx, Exynos, TI, Marvell) - Some PM fixes for Atmel/at91 and Marvell - A few DT fixes for Marvell, Versatile, TI Keystone, bcm283x - A reset driver patch to set module license for symbol access" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: MAINTAINERS: EP93XX: Update maintainership MAINTAINERS: remove kernel@stlinux.com obsolete mailing list ARM: dts: versatile: use #include "..." to include local DT MAINTAINERS: add device-tree files to TI DaVinci entry ARM: at91: select CONFIG_ARM_CPU_SUSPEND ARM: dts: keystone-k2l: fix broken Ethernet due to disabled OSR arm64: defconfig: enable some core options for 64bit Rockchip socs arm64: marvell: dts: fix interrupts in 7k/8k crypto nodes reset: hi6220: Set module license so that it can be loaded MAINTAINERS: add irqchip related drivers to Marvell EBU maintainers MAINTAINERS: sort F entries for Marvell EBU maintainers ARM: davinci: PM: Do not free useful resources in normal path in 'davinci_pm_init' ARM: davinci: PM: Free resources in error handling path in 'davinci_pm_init' ARM: dts: bcm283x: Reserve first page for firmware memory: atmel-ebi: mark PM ops as __maybe_unused MAINTAINERS: Remove Javier Martinez Canillas as reviewer for Exynos
2017-06-09block: introduce new block status code typeChristoph Hellwig2-3/+5
Currently we use nornal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have a errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return somethings that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruite to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-09security/keys: add CONFIG_KEYS_COMPAT to KconfigBilal Amarni5-19/+0
CONFIG_KEYS_COMPAT is defined in arch-specific Kconfigs and is missing for several 64-bit architectures : mips, parisc, tile. At the moment and for those architectures, calling in 32-bit userspace the keyctl syscall would return an ENOSYS error. This patch moves the CONFIG_KEYS_COMPAT option to security/keys/Kconfig, to make sure the compatibility wrapper is registered by default for any 64-bit architecture as long as it is configured with CONFIG_COMPAT. [DH: Modified to remove arm64 compat enablement also as requested by Eric Biggers] Signed-off-by: Bilal Amarni <bilal.amarni@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> cc: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-08Merge tag 'kvm-s390-master-4.12-1' of ↵Paolo Bonzini3-5/+2
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fix for master (4.12) - The newly created AIS capability enables the feature unconditionally and ignores the cpu model
2017-06-08KVM: cpuid: Fix read/write out-of-bounds vulnerability in cpuid emulationWanpeng Li1-9/+11
If "i" is the last element in the vcpu->arch.cpuid_entries[] array, it potentially can be exploited the vulnerability. this will out-of-bounds read and write. Luckily, the effect is small: /* when no next entry is found, the current entry[i] is reselected */ for (j = i + 1; ; j = (j + 1) % nent) { struct kvm_cpuid_entry2 *ej = &vcpu->arch.cpuid_entries[j]; if (ej->function == e->function) { It reads ej->maxphyaddr, which is user controlled. However... ej->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT; After cpuid_entries there is int maxphyaddr; struct x86_emulate_ctxt emulate_ctxt; /* 16-byte aligned */ So we have: - cpuid_entries at offset 1B50 (6992) - maxphyaddr at offset 27D0 (6992 + 3200 = 10192) - padding at 27D4...27DF - emulate_ctxt at 27E0 And it writes in the padding. Pfew, writing the ops field of emulate_ctxt would have been much worse. This patch fixes it by modding the index to avoid the out-of-bounds access. Worst case, i == j and ej->function == e->function, the loop can bail out. Reported-by: Moguofang <moguofang@huawei.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Guofang Mo <moguofang@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-08Merge tag 'kvm-arm-for-v4.12-rc5-take2' of ↵Paolo Bonzini4-12/+18
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM Fixes for v4.12-rc5 - Take 2 Changes include: - Fix an issue with migrating GICv2 VMs on GICv3 systems. - Squashed a bug for gicv3 when figuring out preemption levels. - Fix a potential null pointer derefence in KVM happening under memory pressure. - Maintain RES1 bits in the SCTLR_EL2 to make sure KVM works on new architecture revisions. - Allow unaligned accesses at EL2/HYP
2017-06-08powerpc/book3s64: Move PPC_DT_CPU_FTRs and enable it by defaultMichael Ellerman2-11/+11
The PPC_DT_CPU_FTRs is a bit misplaced in menuconfig, it shows up with other general kernel options. It's really more at home in the "Platform Support" section, so move it there. Also enable it by default, for Book3s 64. It does mostly nothing unless the device tree properties are found, and we will want it enabled eventually in distro kernels, so turn it on to start getting more testing. Fixes: 5a61ef74f269 ("powerpc/64s: Support new device tree binding for discovering CPU features") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-08powerpc/mm/4k: Limit 4k page size config to 64TB virtual address spaceAneesh Kumar K.V4-16/+15
Supporting 512TB requires us to do a order 3 allocation for level 1 page table (pgd). This results in page allocation failures with certain workloads. For now limit 4k linux page size config to 64TB. Fixes: f6eedbba7a26 ("powerpc/mm/hash: Increase VA range to 128TB") Reported-by: Hugh Dickins <hughd@google.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-08x86/microcode/intel: Clear patch pointer before jettisoning the initrdDominik Brodowski1-0/+3
During early boot, load_ucode_intel_ap() uses __load_ucode_intel() to obtain a pointer to the relevant microcode patch (embedded in the initrd), and stores this value in 'intel_ucode_patch' to speed up the microcode patch application for subsequent CPUs. On resuming from suspend-to-RAM, however, load_ucode_ap() calls load_ucode_intel_ap() for each non-boot-CPU. By then the initramfs is long gone so the pointer stored in 'intel_ucode_patch' no longer points to a valid microcode patch. Clear that pointer so that we effectively fall back to the CPU hotplug notifier callbacks to update the microcode. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> [ Edit and massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # 4.10.. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170607095819.9754-1-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-0/+6
Pull networking fixes from David Miller: 1) Made TCP congestion control documentation match current reality, from Anmol Sarma. 2) Various build warning and failure fixes from Arnd Bergmann. 3) Fix SKB list leak in ipv6_gso_segment(). 4) Use after free in ravb driver, from Eugeniu Rosca. 5) Don't use udp_poll() in ping protocol driver, from Eric Dumazet. 6) Don't crash in PCI error recovery of cxgb4 driver, from Guilherme Piccoli. 7) _SRC_NAT_DONE_BIT needs to be cleared using atomics, from Liping Zhang. 8) Use after free in vxlan deletion, from Mark Bloch. 9) Fix ordering of NAPI poll enabled in ethoc driver, from Max Filippov. 10) Fix stmmac hangs with TSO, from Niklas Cassel. 11) Fix crash in CALIPSO ipv6, from Richard Haines. 12) Clear nh_flags properly on mpls link up. From Roopa Prabhu. 13) Fix regression in sk_err socket error queue handling, noticed by ping applications. From Soheil Hassas Yeganeh. 14) Update mlx4/mlx5 MAINTAINERS information. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits) net: stmmac: fix a broken u32 less than zero check net: stmmac: fix completely hung TX when using TSO net: ethoc: enable NAPI before poll may be scheduled net: bridge: fix a null pointer dereference in br_afspec ravb: Fix use-after-free on `ifconfig eth0 down` net/ipv6: Fix CALIPSO causing GPF with datagram support net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value Revert "sit: reload iphdr in ipip6_rcv" i40e/i40evf: proper update of the page_offset field i40e: Fix state flags for bit set and clean operations of PF iwlwifi: fix host command memory leaks iwlwifi: fix min API version for 7265D, 3168, 8000 and 8265 iwlwifi: mvm: clear new beacon command template struct iwlwifi: mvm: don't fail when removing a key from an inexisting sta iwlwifi: pcie: only use d0i3 in suspend/resume if system_pm is set to d0i3 iwlwifi: mvm: fix firmware debug restart recording iwlwifi: tt: move ucode_loaded check under mutex iwlwifi: mvm: support ibss in dqa mode iwlwifi: mvm: Fix command queue number on d0i3 flow iwlwifi: mvm: rs: start using LQ command color ...
2017-06-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds17-116/+201
Pull sparc fixes from David Miller: 1) Fix TLB context wrap races, from Pavel Tatashin. 2) Cure some gcc-7 build issues. 3) Handle invalid setup_hugepagesz command line values properly, from Liam R Howlett. 4) Copy TSB using the correct address shift for the huge TSB, from Mike Kravetz. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: delete old wrap code sparc64: new context wrap sparc64: add per-cpu mm of secondary contexts sparc64: redefine first version sparc64: combine activate_mm and switch_mm sparc64: reset mm cpumask after wrap sparc/mm/hugepages: Fix setup_hugepagesz for invalid values. sparc: Machine description indices can vary sparc64: mm: fix copy_tsb to correctly copy huge page TSBs arch/sparc: support NR_CPUS = 4096 sparc64: Add __multi3 for gcc 7.x and later. sparc64: Fix build warnings with gcc 7. arch/sparc: increase CONFIG_NODES_SHIFT on SPARC64 to 5
2017-06-06sparc64: delete old wrap codePavel Tatashin6-45/+1
The old method that is using xcall and softint to get new context id is deleted, as it is replaced by a method of using per_cpu_secondary_mm without xcall to perform the context wrap. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06sparc64: new context wrapPavel Tatashin1-27/+54
The current wrap implementation has a race issue: it is called outside of the ctx_alloc_lock, and also does not wait for all CPUs to complete the wrap. This means that a thread can get a new context with a new version and another thread might still be running with the same context. The problem is especially severe on CPUs with shared TLBs, like sun4v. I used the following test to very quickly reproduce the problem: - start over 8K processes (must be more than context IDs) - write and read values at a memory location in every process. Very quickly memory corruptions start happening, and what we read back does not equal what we wrote. Several approaches were explored before settling on this one: Approach 1: Move smp_new_mmu_context_version() inside ctx_alloc_lock, and wait for every process to complete the wrap. (Note: every CPU must WAIT before leaving smp_new_mmu_context_version_client() until every one arrives). This approach ends up with deadlocks, as some threads own locks which other threads are waiting for, and they never receive softint until these threads exit smp_new_mmu_context_version_client(). Since we do not allow the exit, deadlock happens. Approach 2: Handle wrap right during mondo interrupt. Use etrap/rtrap to enter into into C code, and issue new versions to every CPU. This approach adds some overhead to runtime: in switch_mm() we must add some checks to make sure that versions have not changed due to wrap while we were loading the new secondary context. (could be protected by PSTATE_IE but that degrades performance as on M7 and older CPUs as it takes 50 cycles for each access). Also, we still need a global per-cpu array of MMs to know where we need to load new contexts, otherwise we can change context to a thread that is going way (if we received mondo between switch_mm() and switch_to() time). Finally, there are some issues with window registers in rtrap() when context IDs are changed during CPU mondo time. The approach in this patch is the simplest and has almost no impact on runtime. We use the array with mm's where last secondary contexts were loaded onto CPUs and bump their versions to the new generation without changing context IDs. If a new process comes in to get a context ID, it will go through get_new_mmu_context() because of version mismatch. But the running processes do not need to be interrupted. And wrap is quicker as we do not need to xcall and wait for everyone to receive and complete wrap. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06sparc64: add per-cpu mm of secondary contextsPavel Tatashin2-2/+4
The new wrap is going to use information from this array to figure out mm's that currently have valid secondary contexts setup. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06sparc64: redefine first versionPavel Tatashin2-4/+4
CTX_FIRST_VERSION defines the first context version, but also it defines first context. This patch redefines it to only include the first context version. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06sparc64: combine activate_mm and switch_mmPavel Tatashin1-20/+1
The only difference between these two functions is that in activate_mm we unconditionally flush context. However, there is no need to keep this difference after fixing a bug where cpumask was not reset on a wrap. So, in this patch we combine these. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06sparc64: reset mm cpumask after wrapPavel Tatashin1-0/+2
After a wrap (getting a new context version) a process must get a new context id, which means that we would need to flush the context id from the TLB before running for the first time with this ID on every CPU. But, we use mm_cpumask to determine if this process has been running on this CPU before, and this mask is not reset after a wrap. So, there are two possible fixes for this issue: 1. Clear mm cpumask whenever mm gets a new context id 2. Unconditionally flush context every time process is running on a CPU This patch implements the first solution Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06sparc/mm/hugepages: Fix setup_hugepagesz for invalid values.Liam R. Howlett1-1/+2
hugetlb_bad_size needs to be called on invalid values. Also change the pr_warn to a pr_err to better align with other platforms. Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06sparc: Machine description indices can varyJames Clarke2-4/+65
VIO devices were being looked up by their index in the machine description node block, but this often varies over time as devices are added and removed. Instead, store the ID and look up using the type, config handle and ID. Signed-off-by: James Clarke <jrtc27@jrtc27.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112541 Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06sparc64: mm: fix copy_tsb to correctly copy huge page TSBsMike Kravetz2-6/+12
When a TSB grows beyond its current capacity, a new TSB is allocated and copy_tsb is called to copy entries from the old TSB to the new. A hash shift based on page size is used to calculate the index of an entry in the TSB. copy_tsb has hard coded PAGE_SHIFT in these calculations. However, for huge page TSBs the value REAL_HPAGE_SHIFT should be used. As a result, when copy_tsb is called for a huge page TSB the entries are placed at the incorrect index in the newly allocated TSB. When doing hardware table walk, the MMU does not match these entries and we end up in the TSB miss handling code. This code will then create and write an entry to the correct index in the TSB. We take a performance hit for the table walk miss and recreation of these entries. Pass a new parameter to copy_tsb that is the page size shift to be used when copying the TSB. Suggested-by: Anthony Yznaga <anthony.yznaga@oracle.com> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06arch/sparc: support NR_CPUS = 4096Jane Chu2-6/+15
Linux SPARC64 limits NR_CPUS to 4064 because init_cpu_send_mondo_info() only allocates a single page for NR_CPUS mondo entries. Thus we cannot use all 4096 CPUs on some SPARC platforms. To fix, allocate (2^order) pages where order is set according to the size of cpu_list for possible cpus. Since cpu_list_pa and cpu_mondo_block_pa are not used in asm code, there are no imm13 offsets from the base PA that will break because they can only reach one page. Orabug: 25505750 Signed-off-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Atish Patra <atish.patra@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06arm: KVM: Allow unaligned accesses at HYPMarc Zyngier1-3/+2
We currently have the HSCTLR.A bit set, trapping unaligned accesses at HYP, but we're not really prepared to deal with it. Since the rest of the kernel is pretty happy about that, let's follow its example and set HSCTLR.A to zero. Modern CPUs don't really care. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-06-06arm64: KVM: Allow unaligned accesses at EL2Marc Zyngier1-2/+3
We currently have the SCTLR_EL2.A bit set, trapping unaligned accesses at EL2, but we're not really prepared to deal with it. So far, this has been unnoticed, until GCC 7 started emitting those (in particular 64bit writes on a 32bit boundary). Since the rest of the kernel is pretty happy about that, let's follow its example and set SCTLR_EL2.A to zero. Modern CPUs don't really care. Cc: stable@vger.kernel.org Reported-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-06-06arm64: KVM: Preserve RES1 bits in SCTLR_EL2Marc Zyngier2-4/+10
__do_hyp_init has the rather bad habit of ignoring RES1 bits and writing them back as zero. On a v8.0-8.2 CPU, this doesn't do anything bad, but may end-up being pretty nasty on future revisions of the architecture. Let's preserve those bits so that we don't have to fix this later on. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-06-06KVM: nVMX: Fix exception injectionWanpeng Li1-1/+1
WARNING: CPU: 3 PID: 2840 at arch/x86/kvm/vmx.c:10966 nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel] CPU: 3 PID: 2840 Comm: qemu-system-x86 Tainted: G OE 4.12.0-rc3+ #23 RIP: 0010:nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel] Call Trace: ? kvm_check_async_pf_completion+0xef/0x120 [kvm] ? rcu_read_lock_sched_held+0x79/0x80 vmx_queue_exception+0x104/0x160 [kvm_intel] ? vmx_queue_exception+0x104/0x160 [kvm_intel] kvm_arch_vcpu_ioctl_run+0x1171/0x1ce0 [kvm] ? kvm_arch_vcpu_load+0x47/0x240 [kvm] ? kvm_arch_vcpu_load+0x62/0x240 [kvm] kvm_vcpu_ioctl+0x384/0x7b0 [kvm] ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm] ? __fget+0xf3/0x210 do_vfs_ioctl+0xa4/0x700 ? __fget+0x114/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 This is triggered occasionally by running both win7 and win2016 in L2, in addition, EPT is disabled on both L1 and L2. It can't be reproduced easily. Commit 0b6ac343fc (KVM: nVMX: Correct handling of exception injection) mentioned that "KVM wants to inject page-faults which it got to the guest. This function assumes it is called with the exit reason in vmcs02 being a #PF exception". Commit e011c663 (KVM: nVMX: Check all exceptions for intercept during delivery to L2) allows to check all exceptions for intercept during delivery to L2. However, there is no guarantee the exit reason is exception currently, when there is an external interrupt occurred on host, maybe a time interrupt for host which should not be injected to guest, and somewhere queues an exception, then the function nested_vmx_check_exception() will be called and the vmexit emulation codes will try to emulate the "Acknowledge interrupt on exit" behavior, the warning is triggered. Reusing the exit reason from the L2->L0 vmexit is wrong in this case, the reason must always be EXCEPTION_NMI when injecting an exception into L1 as a nested vmexit. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Fixes: e011c663b9c7 ("KVM: nVMX: Check all exceptions for intercept during delivery to L2") Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-06-06kvm: async_pf: fix rcu_irq_enter() with irqs enabledPaolo Bonzini1-1/+1
native_safe_halt enables interrupts, and you just shouldn't call rcu_irq_enter() with interrupts enabled. Reorder the call with the following local_irq_disable() to respect the invariant. Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-06-06powerpc/perf: Fix Power9 test_adder fieldsMadhavan Srinivasan1-2/+2
Commit 8d911904f3ce4 ('powerpc/perf: Add restrictions to PMC5 in power9 DD1') was added to restrict the use of PMC5 in Power9 DD1. Intention was to disable the use of PMC5 using raw event code. But instead of updating the power9_isa207_pmu structure (used on DD1), the commit incorrectly updated the power9_pmu structure. Fix it. Fixes: 8d911904f3ce ("powerpc/perf: Add restrictions to PMC5 in power9 DD1") Reported-by: Shriya <shriyak@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Tested-by: Shriya <shriyak@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-06powerpc/numa: Fix percpu allocations to be NUMA awareMichael Ellerman2-2/+16
In commit 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we switched to the generic implementation of cpu_to_node(), which uses a percpu variable to hold the NUMA node for each CPU. Unfortunately we neglected to notice that we use cpu_to_node() in the allocation of our percpu areas, leading to a chicken and egg problem. In practice what happens is when we are setting up the percpu areas, cpu_to_node() reports that all CPUs are on node 0, so we allocate all percpu areas on node 0. This is visible in the dmesg output, as all pcpu allocs being in group 0: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31 pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39 pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47 To fix it we need an early_cpu_to_node() which can run prior to percpu being setup. We already have the numa_cpu_lookup_table we can use, so just plumb it in. With the patch dmesg output shows two groups, 0 and 1: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31 pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39 pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47 We can also check the data_offset in the paca of various CPUs, with the fix we see: CPU 0: data_offset = 0x0ffe8b0000 CPU 24: data_offset = 0x1ffe5b0000 And we can see from dmesg that CPU 24 has an allocation on node 1: node 0: [mem 0x0000000000000000-0x0000000fffffffff] node 1: [mem 0x0000001000000000-0x0000001fffffffff] Cc: stable@vger.kernel.org # v3.16+ Fixes: 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-06powerpc/kernel: Initialize load_tm on task creationBreno Leitao1-0/+1
Currently tsk->thread.load_tm is not initialized in the task creation and can contain garbage on a new task. This is an undesired behaviour, since it affects the timing to enable and disable the transactional memory laziness (disabling and enabling the MSR TM bit, which affects TM reclaim and recheckpoint in the scheduling process). Fixes: 5d176f751ee3 ("powerpc: tm: Enable transactional memory (TM) lazily for userspace") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-05sparc64: Add __multi3 for gcc 7.x and later.David S. Miller2-0/+36
Reported-by: Waldemar Brodkorb <wbx@openadk.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-05Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds4-13/+15
Pull ARM fixes from Russell King: "Three fixes this time around: - Two fixes for noMMU, fixing the decompressor header layout, and preventing a build error with some configurations. - Fixing the hyp-stub updates that went in during the merge window for platforms that use MCPM" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8677/1: boot/compressed: fix decompressor header layout for v7-M ARM: 8676/1: NOMMU: provide pgprot_device() macro ARM: 8675/1: MCPM: ensure not to enter __hyp_soft_restart from loopback and cpu_power_down
2017-06-05S390/sysinfo: use uuid_is_null instead of opencoding itChristoph Hellwig2-3/+3
And switch to use uuid_t instead of the old uuid_be type. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2017-06-05ARM: 8677/1: boot/compressed: fix decompressor header layout for v7-MArd Biesheuvel2-10/+11
As reported by Patrice, the header layout of the decompressor is incorrect when building for v7-M. In this case, the __nop macro resolves to 'mov r0, r0', which is emitted as a narrow encoding, resulting in the header data fields to end up at lower offsets than required. Given the variety of targets we need to support with the same code, the startup sequence is a bit of a jumble, and uses instructions and macros whose encoding widths cannot be specified (badr), or only exist in a narrow encoding (bx) So force the use of a wide encoding in __nop, and replace the start sequence with a simple jump to the label marking the start of code, preceded by a Thumb2 mode switch if required (using explicit wide encodings where appropriate). The label itself can be moved to the start of code [where it belongs] due to the larger range of branch instructions as compared to adr instructions. Reported-by: Patrice CHOTARD <patrice.chotard@st.com> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-06-05ARM: 8676/1: NOMMU: provide pgprot_device() macroVladimir Murzin1-0/+1
NOMMU build leads to the following error: CC drivers/pci/mmap.o drivers/pci/mmap.c: In function 'pci_mmap_resource_range': drivers/pci/mmap.c:60:3: error: implicit declaration of function 'pgprot_device' [-Werror=implicit-function-declaration] vma->vm_page_prot = pgprot_device(vma->vm_page_prot); ^ cc1: some warnings being treated as errors scripts/Makefile.build:302: recipe for target 'drivers/pci/mmap.o' failed make[2]: *** [drivers/pci/mmap.o] Error 1 scripts/Makefile.build:561: recipe for target 'drivers/pci' failed make[1]: *** [drivers/pci] Error 2 Makefile:1016: recipe for target 'drivers' failed make: *** [drivers] Error 2 Fix it with support of pgprot_device() macro for NOMMU. Fixes: 00d2904ffeac ("ARM/PCI: Use generic pci_mmap_resource_range()") Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-06-05x86/cpu/cyrix: Add alternative Device ID of Geode GX1 SoCChristian Sünkenberg1-0/+1
A SoC variant of Geode GX1, notably NSC branded SC1100, seems to report an inverted Device ID in its DIR0 configuration register, specifically 0xb instead of the expected 0x4. Catch this presumably quirky version so it's properly recognized as GX1 and has its cache switched to write-back mode, which provides a significant performance boost in most workloads. SC1100's datasheet "Geode™ SC1100 Information Appliance On a Chip", states in section 1.1.7.1 "Device ID" that device identification values are specified in SC1100's device errata. These, however, seem to not have been publicly released. Wading through a number of boot logs and /proc/cpuinfo dumps found on pastebin and blogs, this patch should mostly be relevant for a number of now admittedly aging Soekris NET4801 and PC Engines WRAP devices, the latter being the platform this issue was discovered on. Performance impact was verified using "openssl speed", with write-back caching scaling throughput between -3% and +41%. Signed-off-by: Christian Sünkenberg <christian.suenkenberg@student.kit.edu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1496596719.26725.14.camel@student.kit.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-05powerpc/kernel: Fix FP and vector register restorationBreno Leitao1-0/+2
Currently tsk->thread->load_vec and load_fp are not initialized during task creation, which can lead to garbage values in these variables (non-zero values). These variables will be checked later in restore_math() to validate if the FP and vector registers are being utilized. Since these values might be non-zero, the restore_math() will continue to save the FP and vectors even if they were never utilized by the userspace application. load_fp and load_vec counters will then overflow (they wrap at 255) and the FP and Altivec will be finally disabled, but before that condition is reached (counter overflow) several context switches will have restored FP and vector registers without need, causing a performance degradation. Fixes: 70fe3d980f5f ("powerpc: Restore FPU/VEC/VSX if previously used") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Gustavo Romero <gusbromero@gmail.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-03Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-0/+6
Merge misc fixes from Andrew Morton: "15 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: scripts/gdb: make lx-dmesg command work (reliably) mm: consider memblock reservations for deferred memory initialization sizing mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified mlock: fix mlock count can not decrease in race condition mm/migrate: fix refcount handling when !hugepage_migration_supported() dax: fix race between colliding PMD & PTE entries mm: avoid spurious 'bad pmd' warning messages mm/page_alloc.c: make sure OOM victim can try allocations with no watermarks once pcmcia: remove left-over %Z format slub/memcg: cure the brainless abuse of sysfs attributes initramfs: fix disabling of initramfs (and its compression) mm: clarify why we want kmalloc before falling backto vmallock frv: declare jiffies to be located in the .data section include/linux/gfp.h: fix ___GFP_NOLOCKDEP value ksm: prevent crash after write_protect_page fails
2017-06-03frv: declare jiffies to be located in the .data sectionMatthias Kaehlcke1-0/+6
Commit 7c30f352c852 ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp") removed a section specification from the jiffies declaration that caused conflicts on some platforms. Unfortunately this change broke the build for frv: kernel/built-in.o: In function `__do_softirq': (.text+0x6460): relocation truncated to fit: R_FRV_GPREL12 against symbol `jiffies' defined in *ABS* section in .tmp_vmlinux1 kernel/built-in.o: In function `__do_softirq': (.text+0x6574): relocation truncated to fit: R_FRV_GPREL12 against symbol `jiffies' defined in *ABS* section in .tmp_vmlinux1 kernel/built-in.o: In function `pwq_activate_delayed_work': workqueue.c:(.text+0x15b9c): relocation truncated to fit: R_FRV_GPREL12 against symbol `jiffies' defined in *ABS* section in .tmp_vmlinux1 ... Add __jiffy_arch_data to the declaration of jiffies and use it on frv to include the section specification. For all other platforms __jiffy_arch_data (currently) has no effect. Fixes: 7c30f352c852 ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp") Link: http://lkml.kernel.org/r/20170516221333.177280-1-mka@chromium.org Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: David Howells <dhowells@redhat.com> Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-02Merge tag 'arm64-fixes' of ↵Linus Torvalds2-4/+6
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: "ACPI-related fixes for arm64: - GICC MADT entry validity check fix - Skip IRQ registration with pmu=off in an ACPI guest - struct acpi_pci_root_ops freeing on error path" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: ARM64/ACPI: Fix BAD_MADT_GICC_ENTRY() macro implementation drivers/perf: arm_pmu_acpi: avoid perf IRQ init when guest PMU is off ARM64: PCI: Fix struct acpi_pci_root_ops allocation failure path
2017-06-02Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds3-15/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - revert a broken PAT commit that broke a number of systems - fix two preemptability warnings/bugs that can trigger under certain circumstances, in the debug code and in the microcode loader" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "x86/PAT: Fix Xorg regression on CPUs that don't support PAT" x86/debug/32: Convert a smp_processor_id() call to raw to avoid DEBUG_PREEMPT warning x86/microcode/AMD: Change load_microcode_amd()'s param to bool to fix preemptibility bug