summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)AuthorFilesLines
2018-01-12crypto: hash - annotate algorithms taking optional keyEric Biggers1-0/+1
We need to consistently enforce that keyed hashes cannot be used without setting the key. To do this we need a reliable way to determine whether a given hash algorithm is keyed or not. AF_ALG currently does this by checking for the presence of a ->setkey() method. However, this is actually slightly broken because the CRC-32 algorithms implement ->setkey() but can also be used without a key. (The CRC-32 "key" is not actually a cryptographic key but rather represents the initial state. If not overridden, then a default initial state is used.) Prepare to fix this by introducing a flag CRYPTO_ALG_OPTIONAL_KEY which indicates that the algorithm has a ->setkey() method, but it is not required to be called. Then set it on all the CRC-32 algorithms. The same also applies to the Adler-32 implementation in Lustre. Also, the cryptd and mcryptd templates have to pass through the flag from their underlying algorithm. Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-01-12powerpc/xive: Add interrupt flag to disable automatic EOIBenjamin Herrenschmidt2-1/+5
This will be used by KVM in order to keep escalation interrupts in the non-EOI (masked) state after they fire. They will be re-enabled directly in HW by KVM when needed. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-12powerpc/xive: Move definition of ESB bitsBenjamin Herrenschmidt2-35/+35
From xive.h to xive-regs.h since it's a HW register definition and it can be used from assembly Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-12powerpc/PCI: Deprecate pci_get_bus_and_slot()Sinan Kaya3-4/+5
pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as where a PCI device is present. This restricts the device drivers to be reused for other domain numbers. Getting ready to remove pci_get_bus_and_slot() function in favor of pci_get_domain_bus_and_slot(). Use pci_get_domain_bus_and_slot() with a domain number of 0 as the code is not ready to consume multiple domains and existing code used domain number 0. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <helgaas@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-11Merge tag 'kvm-ppc-fixes-4.15-3' of ↵Paolo Bonzini3-29/+64
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master PPC KVM fixes for 4.15 Four commits here, including two that were tagged but never merged. Three of them are for the HPT resizing code; two of those fix a user-triggerable use-after-free in the host, and one that fixes stale TLB entries in the guest. The remaining commit fixes a bug causing PR KVM guests under PowerVM to fail to start.
2018-01-11PCI: Add #defines for Completion Timeout Disable featureBjorn Helgaas1-3/+3
Add #defines for the Completion Timeout Disable feature and use them. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc: rename dma_direct_ to dma_nommu_Christoph Hellwig14-67/+67
We want to use the dma_direct_ namespace for a generic implementation, so rename powerpc to the second best choice: dma_nommu_. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-10dma-mapping: move dma_mark_clean to dma-direct.hChristoph Hellwig1-2/+0
And unlike the other helpers we don't require a <asm/dma-direct.h> as this helper is a special case for ia64 only, and this keeps it as simple as possible. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-10dma-mapping: move swiotlb arch helpers to a new headerChristoph Hellwig3-25/+30
phys_to_dma, dma_to_phys and dma_capable are helpers published by architecture code for use of swiotlb and xen-swiotlb only. Drivers are not supposed to use these directly, but use the DMA API instead. Move these to a new asm/dma-direct.h helper, included by a linux/dma-direct.h wrapper that provides the default linear mapping unless the architecture wants to override it. In the MIPS case the existing dma-coherent.h is reused for now as untangling it will take a bit of work. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Robin Murphy <robin.murphy@arm.com>
2018-01-10powerpc: Don't preempt_disable() in show_cpuinfo()Benjamin Herrenschmidt1-11/+0
This causes warnings from cpufreq mutex code. This is also rather unnecessary and ineffective. If we really want to prevent concurrent unplug, we could take the unplug read lock but I don't see this being critical. Fixes: cd77b5ce208c ("powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/xmon: Don't print hashed pointers in paca dumpMichael Ellerman1-11/+11
Remember when the biggest problem we had to worry about was hashed pointers, those were the days. These were missed in my earlier patch because they don't match "%p", but the macro is hiding a "%p", so these all end up being hashed, which is not what we want in xmon. Convert them to "%px". Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/xmon: Add RFI flush related fields to paca dumpMichael Ellerman1-0/+4
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/powernv: Check device-tree for RFI flush settingsOliver O'Halloran1-0/+49
New device-tree properties are available which tell the hypervisor settings related to the RFI flush. Use them to determine the appropriate flush instruction to use, and whether the flush is required. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/pseries: Query hypervisor for RFI flush settingsMichael Neuling1-0/+35
A new hypervisor call is available which tells the guest settings related to the RFI flush. Use it to query the appropriate flush instruction(s), and whether the flush is required. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/64s: Support disabling RFI flush with no_rfi_flush and noptiMichael Ellerman1-1/+23
Because there may be some performance overhead of the RFI flush, add kernel command line options to disable it. We add a sensibly named 'no_rfi_flush' option, but we also hijack the x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we see 'nopti' we can guess that the user is trying to avoid any overhead of Meltdown mitigations, and it means we don't have to educate every one about a different command line option. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/64s: Add support for RFI flush of L1-D cacheMichael Ellerman9-8/+286
On some CPUs we can prevent the Meltdown vulnerability by flushing the L1-D cache on exit from kernel to user mode, and from hypervisor to guest. This is known to be the case on at least Power7, Power8 and Power9. At this time we do not know the status of the vulnerability on other CPUs such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale CPUs. As more information comes to light we can enable this, or other mechanisms on those CPUs. The vulnerability occurs when the load of an architecturally inaccessible memory region (eg. userspace load of kernel memory) is speculatively executed to the point where its result can influence the address of a subsequent speculatively executed load. In order for that to happen, the first load must hit in the L1, because before the load is sent to the L2 the permission check is performed. Therefore if no kernel addresses hit in the L1 the vulnerability can not occur. We can ensure that is the case by flushing the L1 whenever we return to userspace. Similarly for hypervisor vs guest. In order to flush the L1-D cache on exit, we add a section of nops at each (h)rfi location that returns to a lower privileged context, and patch that with some sequence. Newer firmwares are able to advertise to us that there is a special nop instruction that flushes the L1-D. If we do not see that advertised, we fall back to doing a displacement flush in software. For guest kernels we support migration between some CPU versions, and different CPUs may use different flush instructions. So that we are prepared to migrate to a machine with a different flush instruction activated, we may have to patch more than one flush instruction at boot if the hypervisor tells us to. In the end this patch is mostly the work of Nicholas Piggin and Michael Ellerman. However a cast of thousands contributed to analysis of the issue, earlier versions of the patch, back ports testing etc. Many thanks to all of them. Tested-by: Jon Masters <jcm@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10KVM: PPC: Book3S HV: Always flush TLB in kvmppc_alloc_reset_hpt()David Gibson1-2/+4
The KVM_PPC_ALLOCATE_HTAB ioctl(), implemented by kvmppc_alloc_reset_hpt() is supposed to completely clear and reset a guest's Hashed Page Table (HPT) allocating or re-allocating it if necessary. In the case where an HPT of the right size already exists and it just zeroes it, it forces a TLB flush on all guest CPUs, to remove any stale TLB entries loaded from the old HPT. However, that situation can arise when the HPT is resizing as well - or even when switching from an RPT to HPT - so those cases need a TLB flush as well. So, move the TLB flush to trigger in all cases except for errors. Cc: stable@vger.kernel.org # v4.10+ Fixes: f98a8bf9ee20 ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB ioctl() to change HPT size") Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-10KVM: PPC: Book3S PR: Fix WIMG handling under pHypAlexey Kardashevskiy2-0/+3
Commit 96df226 ("KVM: PPC: Book3S PR: Preserve storage control bits") added code to preserve WIMG bits but it missed 2 special cases: - a magic page in kvmppc_mmu_book3s_64_xlate() and - guest real mode in kvmppc_handle_pagefault(). For these ptes, WIMG was 0 and pHyp failed on these causing a guest to stop in the very beginning at NIP=0x100 (due to bd9166ffe "KVM: PPC: Book3S PR: Exit KVM on failed mapping"). According to LoPAPR v1.1 14.5.4.1.2 H_ENTER: The hypervisor checks that the WIMG bits within the PTE are appropriate for the physical page number else H_Parameter return. (For System Memory pages WIMG=0010, or, 1110 if the SAO option is enabled, and for IO pages WIMG=01**.) This hence initializes WIMG to non-zero value HPTE_R_M (0x10), as expected by pHyp. [paulus@ozlabs.org - fix compile for 32-bit] Cc: stable@vger.kernel.org # v4.11+ Fixes: 96df226 "KVM: PPC: Book3S PR: Preserve storage control bits" Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Ruediger Oertel <ro@suse.de> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-10Construct init thread stack in the linker script rather than by unionDavid Howells1-3/+0
Construct the init thread stack in the linker script rather than doing it by means of a union so that ia64's init_task.c can be got rid of. The following symbols are then made available from INIT_TASK_DATA() linker script macro: init_thread_union init_stack INIT_TASK_DATA() also expands the region to THREAD_SIZE to accommodate the size of the init stack. init_thread_union is given its own section so that it can be placed into the stack space in the right order. I'm assuming that the ia64 ordering is correct and that the task_struct is first and the thread_info second. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: Will Deacon <will.deacon@arm.com> (arm64) Tested-by: Palmer Dabbelt <palmer@sifive.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2018-01-09powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNELNicholas Piggin1-1/+28
In the SLB miss handler we may be returning to user or kernel. We need to add a check early on and save the result in the cr4 register, and then we bifurcate the return path based on that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-09powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNELNicholas Piggin1-2/+16
Similar to the syscall return path, in fast_exception_return we may be returning to user or kernel context. We already have a test for that, because we conditionally restore r13. So use that existing test and branch, and bifurcate the return based on that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-09powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNELNicholas Piggin1-1/+11
In the syscall exit path we may be returning to user or kernel context. We already have a test for that, because we conditionally restore r13. So use that existing test and branch, and bifurcate the return based on that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-09powerpc/64s: Simple RFI macro conversionsNicholas Piggin6-28/+34
This commit does simple conversions of rfi/rfid to the new macros that include the expected destination context. By simple we mean cases where there is a single well known destination context, and it's simply a matter of substituting the instruction for the appropriate macro. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-09powerpc/64: Add macros for annotating the destination of rfid/hrfidNicholas Piggin2-0/+35
The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is used for switching from the kernel to userspace, and from the hypervisor to the guest kernel. However it can and is also used for other transitions, eg. from real mode kernel code to virtual mode kernel code, and it's not always clear from the code what the destination context is. To make it clearer when reading the code, add macros which encode the expected destination context. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-2/+9
2018-01-09powerpc: remove unused flush_write_buffers definitionChristoph Hellwig1-3/+0
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-09Merge branch 'topic/ppc-kvm' into fixesMichael Ellerman2-0/+31
Merge the topic branch with share with the kvm-ppc tree. In this case we need to share the definition of a new hypervisor call and associated flags.
2018-01-09powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapperMichael Neuling2-0/+31
A new hypervisor call has been defined to communicate various characteristics of the CPU to guests. Add definitions for the hcall number, flags and a wrapper function. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-09powerpc64: Add .opd based function descriptor dereferenceSergey Senozhatsky4-0/+31
We are moving towards separate kernel and module function descriptor dereference callbacks. This patch enables it for powerpc64. For pointers that belong to the kernel - Added __start_opd and __end_opd pointers, to track the kernel .opd section address range; - Added dereference_kernel_function_descriptor(). Now we will dereference only function pointers that are within [__start_opd, __end_opd); For pointers that belong to a module - Added dereference_module_function_descriptor() to handle module function descriptor dereference. Now we will dereference only pointers that are within [module->opd.start, module->opd.end). Link: http://lkml.kernel.org/r/20171109234830.5067-4-sergey.senozhatsky@gmail.com To: Tony Luck <tony.luck@intel.com> To: Fenghua Yu <fenghua.yu@intel.com> To: Helge Deller <deller@gmx.de> To: Benjamin Herrenschmidt <benh@kernel.crashing.org> To: Paul Mackerras <paulus@samba.org> To: Michael Ellerman <mpe@ellerman.id.au> To: James Bottomley <jejb@parisc-linux.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-ia64@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-kernel@vger.kernel.org Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Tested-by: Santosh Sivaraj <santosh@fossix.org> #powerpc Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-01-08mm: split altmap memory map allocation from normal caseChristoph Hellwig1-1/+4
No functional changes, just untangling the call chain and document why the altmap is passed around the hotplug code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-01-08mm: pass the vmem_altmap to vmemmap_freeChristoph Hellwig1-3/+2
We can just pass this on instead of having to do a radix tree lookup without proper locking a few levels into the callchain. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-01-08mm: pass the vmem_altmap to arch_remove_memory and __remove_pagesChristoph Hellwig1-4/+2
We can just pass this on instead of having to do a radix tree lookup without proper locking 2 levels into the callchain. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-01-08mm: pass the vmem_altmap to vmemmap_populateChristoph Hellwig1-5/+2
We can just pass this on instead of having to do a radix tree lookup without proper locking a few levels into the callchain. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-01-08mm: pass the vmem_altmap to arch_add_memory and __add_pagesChristoph Hellwig1-2/+3
We can just pass this on instead of having to do a radix tree lookup without proper locking 2 levels into the callchain. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-01-08powerpc/pseries: Make RAS IRQ explicitly dependent on DLPAR WQMichael Ellerman3-4/+22
The hotplug code uses its own workqueue to handle IRQ requests (pseries_hp_wq), however that workqueue is initialized after init_ras_IRQ(). That can lead to a kernel panic if any hotplug interrupts fire after init_ras_IRQ() but before pseries_hp_wq is initialised. eg: UDP-Lite hash table entries: 2048 (order: 0, 65536 bytes) NET: Registered protocol family 1 Unpacking initramfs... (qemu) object_add memory-backend-ram,id=mem1,size=10G (qemu) device_add pc-dimm,id=dimm1,memdev=mem1 Unable to handle kernel paging request for data at address 0xf94d03007c421378 Faulting instruction address: 0xc00000000012d744 Oops: Kernel access of bad area, sig: 11 [#1] LE SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2-ziviani+ #26 task: (ptrval) task.stack: (ptrval) NIP: c00000000012d744 LR: c00000000012d744 CTR: 0000000000000000 REGS: (ptrval) TRAP: 0380 Not tainted (4.15.0-rc2-ziviani+) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28088042 XER: 20040000 CFAR: c00000000012d3c4 SOFTE: 0 ... NIP [c00000000012d744] __queue_work+0xd4/0x5c0 LR [c00000000012d744] __queue_work+0xd4/0x5c0 Call Trace: [c0000000fffefb90] [c00000000012d744] __queue_work+0xd4/0x5c0 (unreliable) [c0000000fffefc70] [c00000000012dce4] queue_work_on+0xb4/0xf0 This commit makes the RAS IRQ registration explicitly dependent on the creation of the pseries_hp_wq. Reported-by: Min Deng <mdeng@redhat.com> Reported-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Tested-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-06Merge tag 'powerpc-4.15-6' of ↵Linus Torvalds1-1/+6
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: "Just one fix to correctly return SEGV_ACCERR when we take a SEGV on a mapped region. The bug was introduced in the refactoring of the page fault handler we did in the previous release. Thanks to John Sperbeck" * tag 'powerpc-4.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERR
2018-01-03arch: Remove clkdev.h asm-generic from KbuildStephen Boyd1-1/+0
Now that every architecture is using the generic clkdev.h file and we no longer include asm/clkdev.h anywhere in the tree, we can remove it. Cc: Russell King <linux@armlinux.org.uk> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: <linux-arch@vger.kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2018-01-02powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERRJohn Sperbeck1-1/+6
The recent refactoring of the powerpc page fault handler in commit c3350602e876 ("powerpc/mm: Make bad_area* helper functions") caused access to protected memory regions to indicate SEGV_MAPERR instead of the traditional SEGV_ACCERR in the si_code field of a user-space signal handler. This can confuse debug libraries that temporarily change the protection of memory regions, and expect to use SEGV_ACCERR as an indication to restore access to a region. This commit restores the previous behavior. The following program exhibits the issue: $ ./repro read || echo "FAILED" $ ./repro write || echo "FAILED" $ ./repro exec || echo "FAILED" #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <signal.h> #include <sys/mman.h> #include <assert.h> static void segv_handler(int n, siginfo_t *info, void *arg) { _exit(info->si_code == SEGV_ACCERR ? 0 : 1); } int main(int argc, char **argv) { void *p = NULL; struct sigaction act = { .sa_sigaction = segv_handler, .sa_flags = SA_SIGINFO, }; assert(argc == 2); p = mmap(NULL, getpagesize(), (strcmp(argv[1], "write") == 0) ? PROT_READ : 0, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); assert(p != MAP_FAILED); assert(sigaction(SIGSEGV, &act, NULL) == 0); if (strcmp(argv[1], "read") == 0) printf("%c", *(unsigned char *)p); else if (strcmp(argv[1], "write") == 0) *(unsigned char *)p = 0; else if (strcmp(argv[1], "exec") == 0) ((void (*)(void))p)(); return 1; /* failed to generate SEGV */ } Fixes: c3350602e876 ("powerpc/mm: Make bad_area* helper functions") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: John Sperbeck <jsperbeck@google.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Add commit references in change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller5-9/+30
net/ipv6/ip6_gre.c is a case of parallel adds. include/trace/events/tcp.h is a little bit more tricky. The removal of in-trace-macro ifdefs in 'net' paralleled with moving show_tcp_state_name and friends over to include/trace/events/sock.h in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28kernel/irq: Extend lockdep class for request mutexAndrew Lunn1-1/+3
The IRQ code already has support for lockdep class for the lock mutex in an interrupt descriptor. Extend this to add a second class for the request mutex in the descriptor. Not having a class is resulting in false positive splats in some code paths. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: linus.walleij@linaro.org Cc: grygorii.strashko@ti.com Cc: f.fainelli@gmail.com Link: https://lkml.kernel.org/r/1512234664-21555-1-git-send-email-andrew@lunn.ch
2017-12-23Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PTI preparatory patches from Thomas Gleixner: "Todays Advent calendar window contains twentyfour easy to digest patches. The original plan was to have twenty three matching the date, but a late fixup made that moot. - Move the cpu_entry_area mapping out of the fixmap into a separate address space. That's necessary because the fixmap becomes too big with NRCPUS=8192 and this caused already subtle and hard to diagnose failures. The top most patch is fresh from today and cures a brain slip of that tall grumpy german greybeard, who ignored the intricacies of 32bit wraparounds. - Limit the number of CPUs on 32bit to 64. That's insane big already, but at least it's small enough to prevent address space issues with the cpu_entry_area map, which have been observed and debugged with the fixmap code - A few TLB flush fixes in various places plus documentation which of the TLB functions should be used for what. - Rename the SYSENTER stack to CPU_ENTRY_AREA stack as it is used for more than sysenter now and keeping the name makes backtraces confusing. - Prevent LDT inheritance on exec() by moving it to arch_dup_mmap(), which is only invoked on fork(). - Make vysycall more robust. - A few fixes and cleanups of the debug_pagetables code. Check PAGE_PRESENT instead of checking the PTE for 0 and a cleanup of the C89 initialization of the address hint array which already was out of sync with the index enums. - Move the ESPFIX init to a different place to prepare for PTI. - Several code moves with no functional change to make PTI integration simpler and header files less convoluted. - Documentation fixes and clarifications" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/cpu_entry_area: Prevent wraparound in setup_cpu_entry_area_ptes() on 32bit init: Invoke init_espfix_bsp() from mm_init() x86/cpu_entry_area: Move it out of the fixmap x86/cpu_entry_area: Move it to a separate unit x86/mm: Create asm/invpcid.h x86/mm: Put MMU to hardware ASID translation in one place x86/mm: Remove hard-coded ASID limit checks x86/mm: Move the CR3 construction functions to tlbflush.h x86/mm: Add comments to clarify which TLB-flush functions are supposed to flush what x86/mm: Remove superfluous barriers x86/mm: Use __flush_tlb_one() for kernel memory x86/microcode: Dont abuse the TLB-flush interface x86/uv: Use the right TLB-flush API x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack x86/doc: Remove obvious weirdnesses from the x86 MM layout documentation x86/mm/64: Improve the memory map documentation x86/ldt: Prevent LDT inheritance on exec x86/ldt: Rework locking arch, mm: Allow arch_dup_mmap() to fail x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode ...
2017-12-22Merge tag 'powerpc-4.15-5' of ↵Linus Torvalds4-7/+27
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "This is all fairly boring, except that there's two KVM fixes that you'd normally get via Paul's kvm-ppc tree. He's away so I picked them up. I was waiting to see if he would apply them, which is why they have only been in my tree since today. But they were on the list for a while and have been tested on the relevant hardware. Of note is two fixes for KVM XIVE (Power9 interrupt controller). These would normally go via the KVM tree but Paul is away so I've picked them up. Other than that, two fixes for error handling in the IMC driver, and one for a potential oops in the BHRB code if the hardware records a branch address that has subsequently been unmapped, and finally a s/%p/%px/ in our oops code. Thanks to: Anju T Sudhakar, Cédric Le Goater, Laurent Vivier, Madhavan Srinivasan, Naveen N. Rao, Ravi Bangoria" * tag 'powerpc-4.15-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: KVM: PPC: Book3S HV: Fix pending_pri value in kvmppc_xive_get_icp() KVM: PPC: Book3S: fix XIVE migration of pending interrupts powerpc/kernel: Print actual address of regs when oopsing powerpc/perf: Fix kfree memory allocated for nest pmus powerpc/perf/imc: Fix nest-imc cpuhotplug callback failure powerpc/perf: Dereference BHRB entries safely
2017-12-22arch, mm: Allow arch_dup_mmap() to failThomas Gleixner1-2/+3
In order to sanitize the LDT initialization on x86 arch_dup_mmap() must be allowed to fail. Fix up all instances. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: dan.j.williams@intel.com Cc: hughd@google.com Cc: keescook@google.com Cc: kirill.shutemov@linux.intel.com Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-2/+4
Lots of overlapping changes. Also on the net-next side the XDP state management is handled more in the generic layers so undo the 'net' nfp fix which isn't applicable in net-next. Include a necessary change by Jakub Kicinski, with log message: ==================== cls_bpf no longer takes care of offload tracking. Make sure netdevsim performs necessary checks. This fixes a warning caused by TC trying to remove a filter it has not added. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-22powerpc/mm: Add proper pte access check helper for other platformsAneesh Kumar K.V2-0/+46
pte_access_premitted get called in get_user_pages_fast path. If we have marked the pte PROT_NONE, we should not allow a read access on the address. With the current implementation we are not checking the READ and only check for WRITE. This is needed on archs like ppc64 that implement PROT_NONE using _PAGE_USER access instead of _PAGE_PRESENT. Also add pte_user check just to make sure we are not accessing kernel mapping. Even though there is code duplication, keeping the low level pte accessors different for different platforms helps in code readability. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-22powerpc/mm/book3s/64: Add proper pte access check helperAneesh Kumar K.V1-0/+41
pte_access_premitted get called in get_user_pages_fast path. If we have marked the pte PROT_NONE, we should not allow a read access on the address. With the current implementation we are not checking the READ and only check for WRITE. This is needed on archs like ppc64 that implement PROT_NONE using RWX access instead of _PAGE_PRESENT. Also add pte_user check just to make sure we are not accessing kernel mapping. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-22powerpc/mm/hugetlb: Use pte_access_permitted for hugetlb access checkAneesh Kumar K.V1-3/+1
No functional change in this patch. This update gup_hugepte to use the helper. This will help later when we add memory keys. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-22KVM: PPC: Book3S HV: Fix pending_pri value in kvmppc_xive_get_icp()Laurent Vivier1-1/+2
When we migrate a VM from a POWER8 host (XICS) to a POWER9 host (XICS-on-XIVE), we have an error: qemu-kvm: Unable to restore KVM interrupt controller state \ (0xff000000) for CPU 0: Invalid argument This is because kvmppc_xics_set_icp() checks the new state is internaly consistent, and especially: ... 1129 if (xisr == 0) { 1130 if (pending_pri != 0xff) 1131 return -EINVAL; ... On the other side, kvmppc_xive_get_icp() doesn't set neither the pending_pri value, nor the xisr value (set to 0) (and kvmppc_xive_set_icp() ignores the pending_pri value) As xisr is 0, pending_pri must be set to 0xff. Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Laurent Vivier <lvivier@redhat.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-22KVM: PPC: Book3S: fix XIVE migration of pending interruptsCédric Le Goater1-2/+2
When restoring a pending interrupt, we are setting the Q bit to force a retrigger in xive_finish_unmask(). But we also need to force an EOI in this case to reach the same initial state : P=1, Q=0. This can be done by not setting 'old_p' for pending interrupts which will inform xive_finish_unmask() that an EOI needs to be sent. Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.12+ Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-20powerpc: capture the PTE format changes in the dump pte reportRam Pai1-1/+2
The H_PAGE_F_SECOND,H_PAGE_F_GIX are not in the 64K main-PTE. capture these changes in the dump pte report. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>