summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)AuthorFilesLines
2019-11-14KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernelMichael Ellerman3-0/+41
On some systems that are vulnerable to Spectre v2, it is up to software to flush the link stack (return address stack), in order to protect against Spectre-RSB. When exiting from a guest we do some house keeping and then potentially exit to C code which is several stack frames deep in the host kernel. We will then execute a series of returns without preceeding calls, opening up the possiblity that the guest could have poisoned the link stack, and direct speculative execution of the host to a gadget of some sort. To prevent this we add a flush of the link stack on exit from a guest. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-14powerpc/book3s64: Fix link stack flush on context switchMichael Ellerman4-4/+54
In commit ee13cb249fab ("powerpc/64s: Add support for software count cache flush"), I added support for software to flush the count cache (indirect branch cache) on context switch if firmware told us that was the required mitigation for Spectre v2. As part of that code we also added a software flush of the link stack (return address stack), which protects against Spectre-RSB between user processes. That is all correct for CPUs that activate that mitigation, which is currently Power9 Nimbus DD2.3. What I got wrong is that on older CPUs, where firmware has disabled the count cache, we also need to flush the link stack on context switch. To fix it we create a new feature bit which is not set by firmware, which tells us we need to flush the link stack. We set that when firmware tells us that either of the existing Spectre v2 mitigations are enabled. Then we adjust the patching code so that if we see that feature bit we enable the link stack flush. If we're also told to flush the count cache in software then we fall through and do that also. On the older CPUs we don't need to do do the software count cache flush, firmware has disabled it, so in that case we patch in an early return after the link stack flush. The naming of some of the functions is awkward after this patch, because they're called "count cache" but they also do link stack. But we'll fix that up in a later commit to ease backporting. This is the fix for CVE-2019-18660. Reported-by: Anthony Steinhauser <asteinhauser@google.com> Fixes: ee13cb249fab ("powerpc/64s: Add support for software count cache flush") Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc/fsl_booke/kaslr: export offset in VMCOREINFO ELF notesJason Yan1-0/+1
Like all other architectures such as x86 or arm64, include KASLR offset in VMCOREINFO ELF notes to assist in debugging. After this, we can use crash --kaslr option to parse vmcore generated from a kaslr kernel. Note: The crash tool needs to support --kaslr too. Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc/fsl_booke/kaslr: dump out kernel offset information on panicJason Yan2-0/+25
When kaslr is enabled, the kernel offset is different for every boot. This brings some difficult to debug the kernel. Dump out the kernel offset when panic so that we can easily debug the kernel. This code is derived from x86/arm64 which has similar functionality. Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Tested-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc/fsl_booke/kaslr: support nokaslr cmdline parameterJason Yan1-0/+7
One may want to disable kaslr when boot, so provide a cmdline parameter 'nokaslr' to support this. Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Tested-by: Diana Craciun <diana.craciun@nxp.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc/fsl_booke/kaslr: clear the original kernel if randomizedJason Yan3-0/+14
The original kernel still exists in the memory, clear it now. Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Tested-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc/fsl_booke/32: randomize the kernel image offsetJason Yan1-2/+323
After we have the basic support of relocate the kernel in some appropriate place, we can start to randomize the offset now. Entropy is derived from the banner and timer, which will change every build and boot. This not so much safe so additionally the bootloader may pass entropy via the /chosen/kaslr-seed node in device tree. We will use the first 512M of the low memory to randomize the kernel image. The memory will be split in 64M zones. We will use the lower 8 bit of the entropy to decide the index of the 64M zone. Then we chose a 16K aligned offset inside the 64M zone to put the kernel in. We also check if we will overlap with some areas like the dtb area, the initrd area or the crashkernel area. If we cannot find a proper area, kaslr will be disabled and boot from the original kernel. Some pieces of code are derived from arch/x86/boot/compressed/kaslr.c or arch/arm64/kernel/kaslr.c such as rotate_xor(). Credit goes to Kees and Ard. Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Tested-by: Diana Craciun <diana.craciun@nxp.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc/fsl_booke/32: implement KASLR infrastructureJason Yan9-17/+109
This patch add support to boot kernel from places other than KERNELBASE. Since CONFIG_RELOCATABLE has already supported, what we need to do is map or copy kernel to a proper place and relocate. Freescale Book-E parts expect lowmem to be mapped by fixed TLB entries(TLB1). The TLB1 entries are not suitable to map the kernel directly in a randomized region, so we chose to copy the kernel to a proper place and restart to relocate. The offset of the kernel was not randomized yet(a fixed 64M is set). We will randomize it in the next patch. Signed-off-by: Jason Yan <yanaijie@huawei.com> Tested-by: Diana Craciun <diana.craciun@nxp.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net> [mpe: Use PTRRELOC() in early_init()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc/fsl_booke/32: introduce reloc_kernel_entry() helperJason Yan2-0/+14
Add a new helper reloc_kernel_entry() to jump back to the start of the new kernel. After we put the new kernel in a randomized place we can use this new helper to enter the kernel and begin to relocate again. Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Tested-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc/fsl_booke/32: introduce create_kaslr_tlb_entry() helperJason Yan2-0/+36
Add a new helper create_kaslr_tlb_entry() to create a tlb entry by the virtual and physical address. This is a preparation to support boot kernel at a randomized address. Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Tested-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc: introduce kernstart_virt_addr to store the kernel baseJason Yan2-0/+4
Now the kernel base is a fixed value - KERNELBASE. To support KASLR, we need a variable to store the kernel base. Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Tested-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc: move memstart_addr and kernstart_addr to init-common.cJason Yan3-10/+5
These two variables are both defined in init_32.c and init_64.c. Move them to init-common.c and make them __ro_after_init. Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Tested-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc: unify definition of M_IF_NEEDEDJason Yan4-29/+14
M_IF_NEEDED is defined too many times. Move it to a common place and rename it to MAS2_M_IF_NEEDED which is much readable. Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Tested-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc/fadump: when fadump is supported register the fadump sysfs files.Michal Suchanek1-9/+6
Currently it is not possible to distinguish the case when fadump is supported by firmware and disabled in kernel and completely unsupported using the kernel sysfs interface. User can investigate the devicetree but it is more reasonable to provide sysfs files in case we get some fadumpv2 in the future. With this patch sysfs files are available whenever fadump is supported by firmware. There is duplicate message about lack of support by firmware in fadump_reserve_mem and setup_fadump. Remove the duplicate message in setup_fadump. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191107164757.15140-1-msuchanek@suse.de
2019-11-13powerpc/perf: remove current_is_64bit()Michal Suchanek1-16/+1
Since commit ed1cd6deb013 ("powerpc: Activate CONFIG_THREAD_INFO_IN_TASK") current_is_64bit() is quivalent to !is_32bit_task(). Remove the redundant function. Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190912194633.12045-1-msuchanek@suse.de
2019-11-13powerpc/eeh: differentiate duplicate detection messageSam Bobroff1-2/+2
Currently when an EEH error is detected, the system log receives the same (or almost the same) message twice: EEH: PHB#0 failure detected, location: N/A EEH: PHB#0 failure detected, location: N/A or EEH: eeh_dev_check_failure: Frozen PHB#0-PE#0 detected EEH: Frozen PHB#0-PE#0 detected This looks like a bug, but in fact the messages are from different functions and mean slightly different things. So keep both but change one of the messages slightly, so that it's clear they are different: EEH: PHB#0 failure detected, location: N/A EEH: Recovering PHB#0, location: N/A or EEH: eeh_dev_check_failure: Frozen PHB#0-PE#0 detected EEH: Recovering PHB#0-PE#0 Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/43817cb6e6631b0828b9a6e266f60d1f8ca8eb22.1571288375.git.sbobroff@linux.ibm.com
2019-11-13powerpc/pseries/hotplug-memory: Change rc variable to boolLeonardo Bras1-3/+3
Changes the return variable to bool (as the return value) and avoids doing a ternary operation before returning. Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802133914.30413-1-leonardo@linux.ibm.com
2019-11-13powerpc: use <asm-generic/dma-mapping.h>Christoph Hellwig2-18/+1
The powerpc version of dma-mapping.h only contains a version of get_arch_dma_ops that always return NULL. Replace it with the asm-generic version that does the same. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190807150752.17894-1-hch@lst.de
2019-11-13powerpc/xive: Prevent page fault issues in the machine crash handlerCédric Le Goater1-0/+9
When the machine crash handler is invoked, all interrupts are masked but interrupts which have not been started yet do not have an ESB page mapped in the Linux address space. This crashes the 'crash kexec' sequence on sPAPR guests. To fix, force the mapping of the ESB page when an interrupt is being mapped in the Linux IRQ number space. This is done by setting the initial state of the interrupt to OFF which is not necessarily the case on PowerNV. Fixes: 243e25112d06 ("powerpc/xive: Native exploitation of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031063100.3864-1-clg@kaod.org
2019-11-13powerpc/64s/exception: Fix kaup -> kuap typoAndrew Donnellan1-3/+3
It's KUAP, not KAUP. Fix typo in INT_COMMON macro. Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191022060603.24101-1-ajd@linux.ibm.com
2019-11-13powerpc: Replace GPL boilerplate with SPDX identifiersThomas Huth4-65/+3
The FSF does not reside in "675 Mass Ave, Cambridge" anymore... let's simply use proper SPDX identifiers instead. Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190828060737.32531-1-thuth@redhat.com
2019-11-13powerpc/book3s/mm: Update Oops message to print the correct translation in useAneesh Kumar K.V1-4/+11
Avoids confusion when printing Oops message like below Faulting instruction address: 0xc00000000008bdb4 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV This was because we never clear the MMU_FTR_HPTE_TABLE feature flag even if we run with radix translation. It was discussed that we should look at this feature flag as an indication of the capability to run hash translation and we should not clear the flag even if we run in radix translation. All the code paths check for radix_enabled() check and if found true consider we are running with radix translation. Follow the same sequence for finding the MMU translation string to be used in Oops message. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190711145814.17970-1-aneesh.kumar@linux.ibm.com
2019-11-13powerpc/spufs: remove set but not used variable 'ctx'YueHaibing1-2/+0
arch/powerpc/platforms/cell/spufs/inode.c:201:22: warning: variable ctx set but not used [-Wunused-but-set-variable] It is not used since commit 67cba9fd6456 ("move spu_forget() into spufs_rmdir()") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191023134423.15052-1-yuehaibing@huawei.com
2019-11-13powerpc/powernv/ioda: using kfree_rcu() to simplify the codeYueHaibing1-9/+1
The callback function of call_rcu() just calls a kfree(), so we can use kfree_rcu() instead of call_rcu() + callback function. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190711141818.18044-1-yuehaibing@huawei.com
2019-11-13powerpc/powernv: Make some symbols staticYueHaibing3-4/+4
Fix sparse warnings: arch/powerpc/platforms/powernv/opal-psr.c:20:1: warning: symbol 'psr_mutex' was not declared. Should it be static? arch/powerpc/platforms/powernv/opal-psr.c:27:3: warning: symbol 'psr_attrs' was not declared. Should it be static? arch/powerpc/platforms/powernv/opal-powercap.c:20:1: warning: symbol 'powercap_mutex' was not declared. Should it be static? arch/powerpc/platforms/powernv/opal-sensor-groups.c:20:1: warning: symbol 'sg_mutex' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190702131733.44100-1-yuehaibing@huawei.com
2019-11-13powerpc/configs: remove obsolete CONFIG_INET_XFRM_MODE_* and ↵YueHaibing43-126/+0
CONFIG_INET6_XFRM_MODE_* These Kconfig options has been removed in commit 4c145dce2601 ("xfrm: make xfrm modes builtin") So there is no point to keep it in defconfigs any longer. Signed-off-by: YueHaibing <yuehaibing@huawei.com> [mpe: Extract from cross arch patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190612071901.21736-1-yuehaibing@huawei.com
2019-11-13powerpc/pseries: Fix platform_no_drv_owner.cocci warningsYueHaibing1-1/+0
Remove .owner field if calls are used which set it automatically Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190218133950.95225-1-yuehaibing@huawei.com
2019-11-13powerpc/pseries: Drop pointless static qualifier in vpa_debugfs_init()YueHaibing1-1/+1
There is no need to have the 'struct dentry *vpa_dir' variable static since new value always be assigned before use it. Fixes: c6c26fb55e8e ("powerpc/pseries: Export raw per-CPU VPA data via debugfs") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190218125644.87448-1-yuehaibing@huawei.com
2019-11-13powerpc/powernv/npu: Fix debugfs_simple_attr.cocci warningsYueHaibing1-4/+4
Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE for debugfs files. Semantic patch information: Rationale: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file() imposes some significant overhead as compared to DEFINE_DEBUGFS_ATTRIBUTE + debugfs_create_file_unsafe(). Generated by: scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1545705876-63132-1-git-send-email-yuehaibing@huawei.com
2019-11-13powerpc/64s: Fix debugfs_simple_attr.cocci warningsYueHaibing1-10/+14
Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE for debugfs files. Semantic patch information: Rationale: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file() imposes some significant overhead as compared to DEFINE_DEBUGFS_ATTRIBUTE + debugfs_create_file_unsafe(). Generated by: scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1543498518-107601-1-git-send-email-yuehaibing@huawei.com
2019-11-13powerpc/pseries: Use correct event modifier in rtas_parse_epow_errlog()YueHaibing1-1/+1
rtas_parse_epow_errlog() should pass 'modifier' to handle_system_shutdown, because event modifier only use bottom 4 bits. Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191023134838.21280-1-yuehaibing@huawei.com
2019-11-13powerpc/watchpoint: Don't ignore extraneous exceptions blindlyRavi Bangoria1-21/+31
On powerpc, watchpoint match range is double-word granular. On a watchpoint hit, DAR is set to the first byte of overlap between actual access and watched range. And thus it's quite possible that DAR does not point inside user specified range. Ex, say user creates a watchpoint with address range 0x1004 to 0x1007. So hw would be configured to watch from 0x1000 to 0x1007. If there is a 4 byte access from 0x1002 to 0x1005, DAR will point to 0x1002 and thus interrupt handler considers it as extraneous, but it's actually not, because part of the access belongs to what user has asked. Instead of blindly ignoring the exception, get actual address range by analysing an instruction, and ignore only if actual range does not overlap with user specified range. Note: The behavior is unchanged for 8xx. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191017093204.7511-5-ravi.bangoria@linux.ibm.com
2019-11-13powerpc/watchpoint: Fix ptrace code that muck around with address/lenRavi Bangoria2-8/+5
ptrace_set_debugreg() does not consider new length while overwriting the watchpoint. Fix that. ppc_set_hwdebug() aligns watchpoint address to doubleword boundary but does not change the length. If address range is crossing doubleword boundary and length is less then 8, we will lose samples from second doubleword. So fix that as well. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191017093204.7511-4-ravi.bangoria@linux.ibm.com
2019-11-13powerpc/watchpoint: Fix length calculation for unaligned targetRavi Bangoria5-23/+56
Watchpoint match range is always doubleword(8 bytes) aligned on powerpc. If the given range is crossing doubleword boundary, we need to increase the length such that next doubleword also get covered. Ex, address len = 6 bytes |=========. |------------v--|------v--------| | | | | | | | | | | | | | | | | | |---------------|---------------| <---8 bytes---> In such case, current code configures hw as: start_addr = address & ~HW_BREAKPOINT_ALIGN len = 8 bytes And thus read/write in last 4 bytes of the given range is ignored. Fix this by including next doubleword in the length. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191017093204.7511-3-ravi.bangoria@linux.ibm.com
2019-11-13powerpc/watchpoint: Introduce macros for watchpoint lengthRavi Bangoria4-6/+9
We are hadrcoding length everywhere in the watchpoint code. Introduce macros for the length and use them. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191017093204.7511-2-ravi.bangoria@linux.ibm.com
2019-11-13powerpc/security: Fix wrong message when RFI Flush is disableGustavo L. F. Walbon1-10/+6
The issue was showing "Mitigation" message via sysfs whatever the state of "RFI Flush", but it should show "Vulnerable" when it is disabled. If you have "L1D private" feature enabled and not "RFI Flush" you are vulnerable to meltdown attacks. "RFI Flush" is the key feature to mitigate the meltdown whatever the "L1D private" state. SEC_FTR_L1D_THREAD_PRIV is a feature for Power9 only. So the message should be as the truth table shows: CPU | L1D private | RFI Flush | sysfs ----|-------------|-----------|------------------------------------- P9 | False | False | Vulnerable P9 | False | True | Mitigation: RFI Flush P9 | True | False | Vulnerable: L1D private per thread P9 | True | True | Mitigation: RFI Flush, L1D private per thread P8 | False | False | Vulnerable P8 | False | True | Mitigation: RFI Flush Output before this fix: # cat /sys/devices/system/cpu/vulnerabilities/meltdown Mitigation: RFI Flush, L1D private per thread # echo 0 > /sys/kernel/debug/powerpc/rfi_flush # cat /sys/devices/system/cpu/vulnerabilities/meltdown Mitigation: L1D private per thread Output after fix: # cat /sys/devices/system/cpu/vulnerabilities/meltdown Mitigation: RFI Flush, L1D private per thread # echo 0 > /sys/kernel/debug/powerpc/rfi_flush # cat /sys/devices/system/cpu/vulnerabilities/meltdown Vulnerable: L1D private per thread Signed-off-by: Gustavo L. F. Walbon <gwalbon@linux.ibm.com> Signed-off-by: Mauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190502210907.42375-1-gwalbon@linux.ibm.com
2019-11-13powerpc/crypto: Add cond_resched() in crc-vpmsum self-testChris Smart1-0/+1
The stress test for vpmsum implementations executes a long for loop in the kernel. This blocks the scheduler, which prevents other tasks from running, resulting in a warning. This fix adds a call to cond_reshed() at the end of each loop, which allows the scheduler to run other tasks as required. Signed-off-by: Chris Smart <chris.smart@humanservices.gov.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191103233356.5472-1-chris.smart@humanservices.gov.au
2019-11-13powerpc/pseries/cmm: Simulation modeDavid Hildenbrand1-8/+30
Let's allow to test the implementation without needing HW support. When "simulate=1" is specified when loading the module, we bypass all HW checks and HW calls. The sysfs file "simulate_loan_target_kb" can be used to simulate HW requests. The simualtion mode can be activated using: modprobe cmm debug=1 simulate=1 And the requested loan target can be changed using: echo X > /sys/devices/system/cmm/cmm0/simulate_loan_target_kb Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-11-david@redhat.com
2019-11-13powerpc/pseries/cmm: Switch to balloon_page_alloc()David Hildenbrand1-2/+1
balloon_page_alloc() will use GFP_HIGHUSER_MOVABLE in case we have CONFIG_BALLOON_COMPACTION. This is now possible, as balloon pages are movable with CONFIG_BALLOON_COMPACTION. Without CONFIG_BALLOON_COMPACTION, GFP_HIGHUSER is used. Note that apart from that, balloon_page_alloc() uses the following flags: __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN And current code used: GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC GFP_HIGHUSER/GFP_HIGHUSER_MOVABLE include __GFP_RECLAIM | __GFP_IO | __GFP_FS | __GFP_HARDWALL | __GFP_HIGHMEM GFP_NOIO is __GFP_RECLAIM. With CONFIG_BALLOON_COMPACTION, we essentially add: __GFP_IO | __GFP_FS | __GFP_HARDWALL | __GFP_HIGHMEM | __GFP_MOVABLE Without CONFIG_BALLOON_COMPACTION, we essentially add: __GFP_IO | __GFP_FS | __GFP_HARDWALL | __GFP_HIGHMEM I assume this is fine, as this is what all other balloon compaction users use. If it turns out to be a problem, we could add __GFP_MOVABLE manually if we have CONFIG_BALLOON_COMPACTION. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-10-david@redhat.com
2019-11-13powerpc/pseries/cmm: Implement balloon compactionDavid Hildenbrand2-14/+120
We can now get rid of the cmm_lock and completely rely on the balloon compaction internals, which now also manage the page list and the lock. Inflated/"loaned" pages are now movable. Memory blocks that contain such pages can get offlined. Also, all such pages will be marked PageOffline() and can therefore be excluded in memory dumps using recent versions of makedumpfile. Don't switch to balloon_page_alloc() yet (due to the GFP_NOIO). Will do that separately to discuss this change in detail. Signed-off-by: David Hildenbrand <david@redhat.com> [mpe: Add isolated_pages-- in cmm_migratepage() as suggested by David] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-9-david@redhat.com
2019-11-13powerpc/pseries/cmm: Convert loaned_pages to an atomic_long_tDavid Hildenbrand1-16/+19
When switching to balloon compaction, we want to drop the cmm_lock and completely rely on the balloon compaction list lock internally. loaned_pages is currently protected under the cmm_lock. Note: Right now cmm_alloc_pages() and cmm_free_pages() can be called at the same time, e.g., via the thread and a concurrent OOM notifier. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-8-david@redhat.com
2019-11-13powerpc/pseries/cmm: Rip out memory isolate notifierDavid Hildenbrand1-96/+1
The memory isolate notifier was added to allow to offline memory blocks that contain inflated/"loaned" pages. We can achieve the same using the balloon compaction framework. Get rid of the memory isolate notifier. Also, we can get rid of cmm_mem_going_offline(), as we will never reach that code path now when we have allocated memory in the balloon (allocated pages are unmovable and will no longer be special-cased using the memory isolation notifier). Leave the memory notifier in place, so we can still back off in case memory gets offlined. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-7-david@redhat.com
2019-11-13powerpc/pseries/cmm: Use adjust_managed_page_count() insted of totalram_pages_*David Hildenbrand1-3/+3
adjust_managed_page_count() performs a totalram_pages_add(), but also adjusts the managed pages of the zone. Let's use that instead, similar to virtio-balloon. Use it before freeing a page. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-6-david@redhat.com
2019-11-13powerpc/pseries/cmm: Drop page arrayDavid Hildenbrand1-127/+36
We can simply store the pages in a list (page->lru), no need for a separate data structure (+ complicated handling). This is how most other balloon drivers store allocated pages without additional tracking data. For the notifiers, use page_to_pfn() to check if a page is in the applicable range. Use page_to_phys() in plpar_page_set_loaned() and plpar_page_set_active() (I assume due to the __pa() that's the right thing to do). Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-5-david@redhat.com
2019-11-13powerpc/pseries/cmm: Cleanup rc handling in cmm_init()David Hildenbrand1-4/+3
No need to initialize rc. Also, let's return 0 directly when succeeding. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-4-david@redhat.com
2019-11-13powerpc/pseries/cmm: Report errors when registering notifiers failsDavid Hildenbrand1-2/+6
If we don't set the rc, we will return "0", making it look like we succeeded. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-3-david@redhat.com
2019-11-13powerpc/pseries/cmm: Implement release() function for sysfs deviceDavid Hildenbrand1-0/+5
When unloading the module, one gets ------------[ cut here ]------------ Device 'cmm0' does not have a release() function, it is broken and must be fixed. See Documentation/kobject.txt. WARNING: CPU: 0 PID: 19308 at drivers/base/core.c:1244 .device_release+0xcc/0xf0 ... We only have one static fake device. There is nothing to do when releasing the device (via cmm_exit()). Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-2-david@redhat.com
2019-11-13powerpc/pseries: Enable support for ibm,drc-info propertyTyrel Datwyler1-1/+1
Advertise client support for the PAPR architected ibm,drc-info device tree property during CAS handshake. Fixes: c7a3275e0f9e ("powerpc/pseries: Revert support for ibm,drc-info devtree property") Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573449697-5448-11-git-send-email-tyreld@linux.ibm.com
2019-11-13powerpc/pseries: Add cpu DLPAR support for drc-info propertyTyrel Datwyler1-15/+112
Older firmwares provided information about Dynamic Reconfig Connectors (DRC) through several device tree properties, namely ibm,drc-types, ibm,drc-indexes, ibm,drc-names, and ibm,drc-power-domains. New firmwares have the ability to present this same information in a much condensed format through a device tree property called ibm,drc-info. The existing cpu DLPAR hotplug code only understands the older DRC property format when validating the drc-index of a cpu during a hotplug add. This updates those code paths to use the ibm,drc-info property, when present, instead for validation. Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573449697-5448-4-git-send-email-tyreld@linux.ibm.com
2019-11-13powerpc/pseries: Fix drc-info mappings of logical cpus to drc-indexTyrel Datwyler1-13/+10
There are a couple subtle errors in the mapping between cpu-ids and a cpus associated drc-index when using the new ibm,drc-info property. The first is that while drc-info may have been a supported firmware feature at boot it is possible we have migrated to a CEC with older firmware that doesn't support the ibm,drc-info property. In that case the device tree would have been updated after migration to remove the ibm,drc-info property and replace it with the older style ibm,drc-* properties for types, indexes, names, and power-domains. PAPR even goes as far as dictating that if we advertise support for drc-info that we are capable of supporting either property type at runtime. The second is that the first value of the ibm,drc-info property is the int encoded count of drc-info entries. As such "value" returned by of_prop_next_u32() is pointing at that count, and not the first element of the first drc-info entry as is expected by the of_read_drc_info_cell() helper. Fix the first by ignoring DRC-INFO firmware feature and instead testing directly for ibm,drc-info, and then falling back to the old style ibm,drc-indexes in the case it doesn't exit. Fix the second by incrementing value to the next element prior to parsing drc-info entries. Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573449697-5448-3-git-send-email-tyreld@linux.ibm.com