summaryrefslogtreecommitdiff
path: root/arch/powerpc/platforms/powernv/setup.c
AgeCommit message (Collapse)AuthorFilesLines
2021-04-22powerpc/64s: remove unneeded semicolonYang Li1-1/+1
Eliminate the following coccicheck warning: ./arch/powerpc/platforms/powernv/setup.c:160:2-3: Unneeded semicolon Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1612236877-104974-1-git-send-email-yang.lee@linux.alibaba.com
2021-02-08powerpc/pci: Move PHB discovery for PCI_DN using platformsOliver O'Halloran1-3/+1
Make powernv, pseries, powermac and maple use ppc_mc.discover_phbs. These platforms need to be done together because they all depend on pci_dn's being created from the DT. The pci_dn contains a pointer to the relevant pci_controller so they need to be created after the pci_controller structures are available, but before PCI devices are scanned. Currently this ordering is provided by initcalls and the sequence is: 1. PHBs are discovered (setup_arch) (early boot, pre-initcalls) 2. pci_dn are created from the unflattended DT (core initcall) 3. PHBs are scanned pcibios_init() (subsys initcall) The new ppc_md.discover_phbs() function is also a core_initcall so we can't guarantee ordering between the creation of pci_controllers and the creation of pci_dn's which require a pci_controller. We could use the postcore, or core_sync initcall levels, but it's cleaner to just move the pci_dn setup into the per-PHB inits which occur inside of .discover_phb() for these platforms. This brings the boot-time path in line with the PHB hotplug path that is used for pseries DLPAR operations too. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [mpe: Squash powermac & maple in to avoid breakage those platforms, convert memblock allocs to use kmalloc to avoid warnings] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201103043523.916109-2-oohall@gmail.com
2020-12-02powerpc/64s/powernv: Fix memory corruption when saving SLB entries on MCENicholas Piggin1-2/+7
This can be hit by an HPT guest running on an HPT host and bring down the host, so it's quite important to fix. Fixes: 7290f3b3d3e6 ("powerpc/64s/powernv: machine check dump SLB contents") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201128070728.825934-2-npiggin@gmail.com
2020-11-19powerpc/64s: rename pnv|pseries_setup_rfi_flush to _setup_security_mitigationsDaniel Axtens1-3/+4
pseries|pnv_setup_rfi_flush already does the count cache flush setup, and we just added entry and uaccess flushes. So the name is not very accurate any more. In both platforms we then also immediately setup the STF flush. Rename them to _setup_security_mitigations and fold the STF flush in. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2020-11-19powerpc/64s: flush L1D after user accessesNicholas Piggin1-2/+8
IBM Power9 processors can speculatively operate on data in the L1 cache before it has been completely validated, via a way-prediction mechanism. It is not possible for an attacker to determine the contents of impermissible memory using this method, since these systems implement a combination of hardware and software security measures to prevent scenarios where protected data could be leaked. However these measures don't address the scenario where an attacker induces the operating system to speculatively execute instructions using data that the attacker controls. This can be used for example to speculatively bypass "kernel user access prevention" techniques, as discovered by Anthony Steinhauser of Google's Safeside Project. This is not an attack by itself, but there is a possibility it could be used in conjunction with side-channels or other weaknesses in the privileged code to construct an attack. This issue can be mitigated by flushing the L1 cache between privilege boundaries of concern. This patch flushes the L1 cache after user accesses. This is part of the fix for CVE-2020-4788. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2020-11-19powerpc/64s: flush L1D on kernel entryNicholas Piggin1-0/+11
IBM Power9 processors can speculatively operate on data in the L1 cache before it has been completely validated, via a way-prediction mechanism. It is not possible for an attacker to determine the contents of impermissible memory using this method, since these systems implement a combination of hardware and software security measures to prevent scenarios where protected data could be leaked. However these measures don't address the scenario where an attacker induces the operating system to speculatively execute instructions using data that the attacker controls. This can be used for example to speculatively bypass "kernel user access prevention" techniques, as discovered by Anthony Steinhauser of Google's Safeside Project. This is not an attack by itself, but there is a possibility it could be used in conjunction with side-channels or other weaknesses in the privileged code to construct an attack. This issue can be mitigated by flushing the L1 cache between privilege boundaries of concern. This patch flushes the L1 cache on kernel entry. This is part of the fix for CVE-2020-4788. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2020-09-08powerpc/powernv: Print helpful message when cores guardedJoel Stanley1-0/+24
Often the firmware will guard out cores after a crash. This often undesirable, and is not immediately noticeable. This adds an informative message when a CPU device tree nodes are marked bad in the device tree. Signed-off-by: Joel Stanley <joel@jms.id.au> [mpe: Use an eye-catcher that's less likely to get us in trouble] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190801051630.5804-1-joel@jms.id.au
2020-07-20powerpc/mm/radix: Create separate mappings for hot-plugged memoryAneesh Kumar K.V1-1/+9
To enable memory unplug without splitting kernel page table mapping, we force the max mapping size to the LMB size. LMB size is the unit in which hypervisor will do memory add/remove operation. Pseries systems supports max LMB size of 256MB. Hence on pseries, we now end up mapping memory with 2M page size instead of 1G. To improve that we want hypervisor to hint the kernel about the hotplug memory range. That was added that as part of commit b6eca183e23e ("powerpc/kernel: Enables memory hot-remove after reboot on pseries guests") But PowerVM doesn't provide that hint yet. Once we get PowerVM updated, we can then force the 2M mapping only to hot-pluggable memory region using memblock_is_hotpluggable(). Till then let's depend on LMB size for finding the mapping page size for linear range. With this change KVM guest will also be doing linear mapping with 2M page size. The actual TLB benefit of mapping guest page table entries with hugepage size can only be materialized if the partition scoped entries are also using the same or higher page size. A guest using 1G hugetlbfs backing guest memory can have a performance impact with the above change. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [mpe: Fold in fix from Aneesh spotted by lkp@intel.com] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200709131925.922266-5-aneesh.kumar@linux.ibm.com
2020-03-04powerpc/powernv: Add explicit fast-reboot supportOliver O'Halloran1-0/+2
Add a way to manually invoke a fast-reboot rather than setting the NVRAM flag. The idea is to allow userspace to invoke a fast-reboot using the optional string argument to the reboot() system call, or using the xmon zr command so we don't need to leave around a persistent changes on a system to use the feature. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200217024833.30580-2-oohall@gmail.com
2020-03-04powerpc/powernv: Treat an empty reboot string as defaultOliver O'Halloran1-1/+1
Treat an empty reboot cmd string the same as a NULL string. This squashes a spurious unsupported reboot message that sometimes gets out when using xmon. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200217024833.30580-1-oohall@gmail.com
2020-01-23powerpc/powernv: Allow manually invoking special rebootsOliver O'Halloran1-0/+4
OPAL provides several different kinds of reboot for the kernel to use, namely forcing a full reboot, platform error reboot and MPIPL. Right now triggering the alternative resets requires some ad-hoc method such as triggering a kernel crash and hoping the stars align. It's sometimes handy to be able to trigger one of these resets directly, so add a way to do that. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191101085522.3055-2-oohall@gmail.com
2019-08-30powerpc/64s/powernv: machine check dump SLB contentsNicholas Piggin1-0/+9
Re-use the code introduced in pseries to save and dump the contents of the SLB in the case of an SLB involved machine check exception. This patch also avoids allocating the SLB save array on pseries radix. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-9-npiggin@gmail.com
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner1-5/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-02powerpc/hmi: Fix kernel hang when TB is in error state.Mahesh Salgaonkar1-1/+4
On TOD/TB errors timebase register stops/freezes until HMI error recovery gets TOD/TB back into running state. On successful recovery, TB starts running again and udelay() that relies on TB value continues to function properly. But in case when HMI fails to recover from TOD/TB errors, the TB register stay freezed. With TB not running the __delay() function keeps looping and never return. If __delay() is called while in panic path then system hangs and never reboots after panic. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-03powerpc/powernv: Make possible for user to force a full ipl cec rebootVaibhav Jain1-6/+30
Ever since fast reboot is enabled by default in opal, opal_cec_reboot() will use fast-reset instead of full IPL to perform system reboot. This leaves the user with no direct way to force a full IPL reboot except changing an nvram setting that persistently disables fast-reset for all subsequent reboots. This patch provides a more direct way for the user to force a one-shot full IPL reboot by passing the command line argument 'full' to the reboot command. So the user will be able to tweak the reboot behavior via: $ sudo reboot full # Force a full ipl reboot skipping fast-reset or $ sudo reboot # default reboot path (usually fast-reset) The reboot command passes the un-parsed command argument to the kernel via the 'Reboot' syscall which is then passed on to the arch function pnv_restart(). The patch updates pnv_restart() to handle this cmd-arg and issues opal_cec_reboot2 with OPAL_REBOOT_FULL_IPL to force a full IPL reset. Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-09-19powerpc/pseries: Flush SLB contents on SLB MCE errors.Mahesh Salgaonkar1-0/+11
On pseries, as of today system crashes if we get a machine check exceptions due to SLB errors. These are soft errors and can be fixed by flushing the SLBs so the kernel can continue to function instead of system crash. We do this in real mode before turning on MMU. Otherwise we would run into nested machine checks. This patch now fetches the rtas error log in real mode and flushes the SLBs on SLB/ERAT errors. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michal Suchanek <msuchanek@suse.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-07powerpc/powernv: Query firmware for count cache flush settingsMichael Ellerman1-0/+7
Look for fw-features properties to determine the appropriate settings for the count cache flush, and then call the generic powerpc code to set it up based on the security feature flags. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-07powerpc/64: Call setup_barrier_nospec() from setup_arch()Michael Ellerman1-1/+0
Currently we require platform code to call setup_barrier_nospec(). But if we add an empty definition for the !CONFIG_PPC_BARRIER_NOSPEC case then we can call it in setup_arch(). Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-07powerpc/xive: Remove xive_kexec_teardown_cpu()Benjamin Herrenschmidt1-1/+1
It's identical to xive_teardown_cpu() so just use the latter Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-07Merge tag 'powerpc-4.18-1' of ↵Linus Torvalds1-9/+2
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - Support for split PMD page table lock on 64-bit Book3S (Power8/9). - Add support for HAVE_RELIABLE_STACKTRACE, so we properly support live patching again. - Add support for patching barrier_nospec in copy_from_user() and syscall entry. - A couple of fixes for our data breakpoints on Book3S. - A series from Nick optimising TLB/mm handling with the Radix MMU. - Numerous small cleanups to squash sparse/gcc warnings from Mathieu Malaterre. - Several series optimising various parts of the 32-bit code from Christophe Leroy. - Removal of support for two old machines, "SBC834xE" and "C2K" ("GEFanuc,C2K"), which is why the diffstat has so many deletions. And many other small improvements & fixes. There's a few out-of-area changes. Some minor ftrace changes OK'ed by Steve, and a fix to our powernv cpuidle driver. Then there's a series touching mm, x86 and fs/proc/task_mmu.c, which cleans up some details around pkey support. It was ack'ed/reviewed by Ingo & Dave and has been in next for several weeks. Thanks to: Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Balbir Singh, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King, Dave Hansen, Fabio Estevam, Finn Thain, Frederic Barrat, Gautham R. Shenoy, Haren Myneni, Hari Bathini, Ingo Molnar, Jonathan Neuschäfer, Josh Poimboeuf, Kamalesh Babulal, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Greer, Mathieu Malaterre, Matthew Wilcox, Michael Neuling, Michal Suchanek, Naveen N. Rao, Nicholas Piggin, Nicolai Stange, Olof Johansson, Paul Gortmaker, Paul Mackerras, Peter Rosin, Pridhiviraj Paidipeddi, Ram Pai, Rashmica Gupta, Ravi Bangoria, Russell Currey, Sam Bobroff, Samuel Mendoza-Jonas, Segher Boessenkool, Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stewart Smith, Thiago Jung Bauermann, Torsten Duwe, Vaibhav Jain, Wei Yongjun, Wolfram Sang, Yisheng Xie, YueHaibing" * tag 'powerpc-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (251 commits) powerpc/64s/radix: Fix missing ptesync in flush_cache_vmap cpuidle: powernv: Fix promotion from snooze if next state disabled powerpc: fix build failure by disabling attribute-alias warning in pci_32 ocxl: Fix missing unlock on error in afu_ioctl_enable_p9_wait() powerpc-opal: fix spelling mistake "Uniterrupted" -> "Uninterrupted" powerpc: fix spelling mistake: "Usupported" -> "Unsupported" powerpc/pkeys: Detach execute_only key on !PROT_EXEC powerpc/powernv: copy/paste - Mask SO bit in CR powerpc: Remove core support for Marvell mv64x60 hostbridges powerpc/boot: Remove core support for Marvell mv64x60 hostbridges powerpc/boot: Remove support for Marvell mv64x60 i2c controller powerpc/boot: Remove support for Marvell MPSC serial controller powerpc/embedded6xx: Remove C2K board support powerpc/lib: optimise PPC32 memcmp powerpc/lib: optimise 32 bits __clear_user() powerpc/time: inline arch_vtime_task_switch() powerpc/Makefile: set -mcpu=860 flag for the 8xx powerpc: Implement csum_ipv6_magic in assembly powerpc/32: Optimise __csum_partial() powerpc/lib: Adjust .balign inside string functions for PPC32 ...
2018-06-03powerpc/64s: Enable barrier_nospec based on firmware settingsMichal Suchanek1-0/+1
Check what firmware told us and enable/disable the barrier_nospec as appropriate. We err on the side of enabling the barrier, as it's no-op on older systems, see the comment for more detail. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-05-22powerpc/64s: Add support for a store forwarding barrier at kernel entry/exitNicholas Piggin1-0/+1
On some CPUs we can prevent a vulnerability related to store-to-load forwarding by preventing store forwarding between privilege domains, by inserting a barrier in kernel entry and exit paths. This is known to be the case on at least Power7, Power8 and Power9 powerpc CPUs. Barriers must be inserted generally before the first load after moving to a higher privilege, and after the last store before moving to a lower privilege, HV and PR privilege transitions must be protected. Barriers are added as patch sections, with all kernel/hypervisor entry points patched, and the exit points to lower privilge levels patched similarly to the RFI flush patching. Firmware advertisement is not implemented yet, so CPU flush types are hard coded. Thanks to Michal Suchánek for bug fixes and review. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michal Suchánek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-07Revert "powerpc/powernv: Increase memory block size to 1GB on radix"Balbir Singh1-9/+1
This commit was a stop-gap to prevent crashes on hotunplug, caused by the mismatch between the 1G mappings used for the linear mapping and the memory block size. Those issues are now resolved because we split the linear mapping at hotunplug time if necessary, as implemented in commit 4dd5f8a99e79 ("powerpc/mm/radix: Split linear mapping on hot-unplug"). Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Tested-by: Rashmica Gupta <rashmica.g@gmail.com> Tested-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-03powerpc/powernv: Always stop secondaries before reboot/shutdownNicholas Piggin1-10/+5
Currently powernv reboot and shutdown requests just leave secondaries to do their own things. This is undesirable because they can trigger any number of watchdogs while waiting for reboot, but also we don't know what else they might be doing -- they might be causing trouble, trampling memory, etc. The opal scheduled flash update code already ran into watchdog problems due to flashing taking a long time, and it was fixed with 2196c6f1ed ("powerpc/powernv: Return secondary CPUs to firmware before FW update"), which returns secondaries to opal. It's been found that regular reboots can take over 10 seconds, which can result in the hard lockup watchdog firing, reboot: Restarting system [ 360.038896709,5] OPAL: Reboot request... Watchdog CPU:0 Hard LOCKUP Watchdog CPU:44 detected Hard LOCKUP other CPUS:16 Watchdog CPU:16 Hard LOCKUP watchdog: BUG: soft lockup - CPU#16 stuck for 3s! [swapper/16:0] This patch removes the special case for flash update, and calls smp_send_stop in all cases before calling reboot/shutdown. smp_send_stop could return CPUs to OPAL, the main reason not to is that the request could come from a NMI that interrupts OPAL code, so re-entry to OPAL can cause a number of problems. Putting secondaries into simple spin loops improves the chances of a successful reboot. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31Merge branch 'topic/paca' into nextMichael Ellerman1-2/+2
Bring in yet another series that touches KVM code, and might need to be merged into the kvm-ppc branch to resolve conflicts. This required some changes in pnv_power9_force_smt4_catch/release() due to the paca array becomming an array of pointers.
2018-03-30powerpc/64: Use array of paca pointers and allocate pacas individuallyNicholas Piggin1-2/+2
Change the paca array into an array of pointers to pacas. Allocate pacas individually. This allows flexibility in where the PACAs are allocated. Future work will allocate them node-local. Platforms that don't have address limits on PACAs would be able to defer PACA allocations until later in boot rather than allocate all possible ones up-front then freeing unused. This is slightly more overhead (one additional indirection) for cross CPU paca references, but those aren't too common. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-27powerpc/powernv: Use the security flags in pnv_setup_rfi_flush()Michael Ellerman1-31/+10
Now that we have the security flags we can significantly simplify the code in pnv_setup_rfi_flush(), because we can use the flags instead of checking device tree properties and because the security flags have pessimistic defaults. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-27powerpc/powernv: Set or clear security feature flagsMichael Ellerman1-0/+56
Now that we have feature flags for security related things, set or clear them based on what we see in the device tree provided by firmware. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-23powerpc/powernv: Support firmware disable of RFI flushMichael Ellerman1-0/+4
Some versions of firmware will have a setting that can be configured to disable the RFI flush, add support for it. Fixes: 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush settings") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10powerpc/powernv: Check device-tree for RFI flush settingsOliver O'Halloran1-0/+49
New device-tree properties are available which tell the hypervisor settings related to the RFI flush. Use them to determine the appropriate flush instruction to use, and whether the flush is required. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-10Merge branch 'fixes' into nextMichael Ellerman1-1/+9
We have some dependencies & conflicts between patches in fixes and things to go in next, both in the radix TLB flush code and the IMC PMU driver. So merge fixes into next.
2017-11-07powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfoShriya1-1/+1
The call to /proc/cpuinfo in turn calls cpufreq_quick_get() which returns the last frequency requested by the kernel, but may not reflect the actual frequency the processor is running at. This patch makes a call to cpufreq_get() instead which returns the current frequency reported by the hardware. Fixes: fb5153d05a7d ("powerpc: powernv: Implement ppc_md.get_proc_freq()") Signed-off-by: Shriya <shriyak@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-21powerpc/powernv: Enable TM without suspend if possibleMichael Ellerman1-0/+23
Some Power9 revisions can run in a mode where TM operates without suspended state. If we find ourself on a CPU that might be in this mode, we query OPAL to check, and if so we reenable TM in CPU features, and enable a new user feature to signal to userspace that we are in this mode. We do not enable the "normal" user feature, PPC_FEATURE2_HTM, but we do enable PPC_FEATURE2_HTM_NOSC because that indicates to userspace that the kernel will abort transactions on syscall entry, which is true regardless of the suspend mode. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-06powerpc/powernv: Increase memory block size to 1GB on radixAnton Blanchard1-1/+9
Memory hot unplug on PowerNV radix hosts is broken. Our memory block size is 256MB but since we map the linear region with very large pages, each pte we tear down maps 1GB. A hot unplug of one 256MB memory block results in 768MB of memory getting unintentionally unmapped. At this point we are likely to oops. Fix this by increasing our memory block size to 1GB on PowerNV radix hosts. Fixes: 4b5d62ca17a1 ("powerpc/mm: add radix__remove_section_mapping()") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/powernv: Implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESETNicholas Piggin1-0/+1
This allows MSR[EE]=0 lockups to be detected on an OPAL (bare metal) system similarly to the hcall NMI IPI on pseries guests, when the platform/firmware supports it. This is an example of CPU10 spinning with interrupts hard disabled: Watchdog CPU:32 detected Hard LOCKUP other CPUS:10 Watchdog CPU:10 Hard LOCKUP CPU: 10 PID: 4410 Comm: bash Not tainted 4.13.0-rc7-00074-ge89ce1f89f62-dirty #34 task: c0000003a82b4400 task.stack: c0000003af55c000 NIP: c0000000000a7b38 LR: c000000000659044 CTR: c0000000000a7b00 REGS: c00000000fd23d80 TRAP: 0100 Not tainted (4.13.0-rc7-00074-ge89ce1f89f62-dirty) MSR: 90000000000c1033 <SF,HV,ME,IR,DR,RI,LE> CR: 28422222 XER: 20000000 CFAR: c0000000000a7b38 SOFTE: 0 GPR00: c000000000659044 c0000003af55fbb0 c000000001072a00 0000000000000078 GPR04: c0000003c81b5c80 c0000003c81cc7e8 9000000000009033 0000000000000000 GPR08: 0000000000000000 c0000000000a7b00 0000000000000001 9000000000001003 GPR12: c0000000000a7b00 c00000000fd83200 0000000010180df8 0000000010189e60 GPR16: 0000000010189ed8 0000000010151270 000000001018bd88 000000001018de78 GPR20: 00000000370a0668 0000000000000001 00000000101645e0 0000000010163c10 GPR24: 00007fffd14d6294 00007fffd14d6290 c000000000fba6f0 0000000000000004 GPR28: c000000000f351d8 0000000000000078 c000000000f4095c 0000000000000000 NIP [c0000000000a7b38] sysrq_handle_xmon+0x38/0x40 LR [c000000000659044] __handle_sysrq+0xe4/0x270 Call Trace: [c0000003af55fbd0] [c000000000659044] __handle_sysrq+0xe4/0x270 [c0000003af55fc70] [c000000000659810] write_sysrq_trigger+0x70/0xa0 [c0000003af55fca0] [c0000000003da650] proc_reg_write+0xb0/0x110 [c0000003af55fcf0] [c0000000003423bc] __vfs_write+0x6c/0x1b0 [c0000003af55fd90] [c000000000344398] vfs_write+0xd8/0x240 [c0000003af55fde0] [c00000000034632c] SyS_write+0x6c/0x110 [c0000003af55fe30] [c00000000000b220] system_call+0x58/0x6c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Use kernel types for opal_signal_system_reset()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-10powerpc/powernv: Tell OPAL about our MMU mode on POWER9Benjamin Herrenschmidt1-1/+10
That will allow OPAL to configure the CPU in an optimal way. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-12Merge branch 'topic/xive' (early part) into nextMichael Ellerman1-3/+12
This merges the arch part of the XIVE support, leaving the final commit with the KVM specific pieces dangling on the branch for Paul to merge via the kvm-ppc tree.
2017-04-10powerpc/xive: Native exploitation of the XIVE interrupt controllerBenjamin Herrenschmidt1-3/+12
The XIVE interrupt controller is the new interrupt controller found in POWER9. It supports advanced virtualization capabilities among other things. Currently we use a set of firmware calls that simulate the old "XICS" interrupt controller but this is fairly inefficient. This adds the framework for using XIVE along with a native backend which OPAL for configuration. Later, a backend allowing the use in a KVM or PowerVM guest will also be provided. This disables some fast path for interrupts in KVM when XIVE is enabled as these rely on the firmware emulation code which is no longer available when the XIVE is used natively by Linux. A latter patch will make KVM also directly exploit the XIVE, thus recovering the lost performance (and more). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Fixup pr_xxx("XIVE:"...), don't split pr_xxx() strings, tweak Kconfig so XIVE_NATIVE selects XIVE and depends on POWERNV, fix build errors when SMP=n, fold in fixes from Ben: Don't call cpu_online() on an invalid CPU number Fix irq target selection returning out of bounds cpu# Extra sanity checks on cpu numbers ] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31powerpc/mm: Add translation mode information in /proc/cpuinfoAneesh Kumar K.V1-0/+4
With this we have on powernv and pseries /proc/cpuinfo reporting timebase : 512000000 platform : PowerNV model : 8247-22L machine : PowerNV 8247-22L firmware : OPAL MMU : Hash Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-30powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead.Thiago Jung Bauermann1-3/+3
Commit 2965faa5e03d ("kexec: split kexec_load syscall from kexec core code") introduced CONFIG_KEXEC_CORE so that CONFIG_KEXEC means whether the kexec_load system call should be compiled-in and CONFIG_KEXEC_FILE means whether the kexec_file_load system call should be compiled-in. These options can be set independently from each other. Since until now powerpc only supported kexec_load, CONFIG_KEXEC and CONFIG_KEXEC_CORE were synonyms. That is not the case anymore, so we need to make a distinction. Almost all places where CONFIG_KEXEC was being used should be using CONFIG_KEXEC_CORE instead, since kexec_file_load also needs that code compiled in. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21powerpc: Get rid of ppc_md.init_early()Benjamin Herrenschmidt1-2/+3
It is now called right after platform probe, so the probe function can just do the job. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21powerpc/64: Move 64-bit probe_machine() to later in the boot processBenjamin Herrenschmidt1-3/+1
We no long need the machine type that early, so we can move probe_machine() to after the device-tree has been expanded. This will allow further consolidation. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21powerpc/64: Move MMU backend selection out of platform codeBenjamin Herrenschmidt1-5/+0
We move it into early_mmu_init() based on firmware features. For PS3, we have to move the setting of these into early_init_devtree(). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-23powerpc/powernv: set power_save func after the idle states are initializedShreyas B. Prabhu1-1/+1
pnv_init_idle_states() discovers supported idle states from the device tree and does the required initialization. Set power_save function pointer only after this initialization is done Otherwise on machines which don't support nap, eg. Power9, the kernel will crash when it tries to nap. Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01powerpc/mm/radix: Add radix callbacks for early init routinesAneesh Kumar K.V1-1/+4
This adds routines for early setup for radix. We use device tree property "ibm,processor-radix-AP-encodings" to find supported page sizes. If we don't find the above we consider 64K and 4K as supported page sizes. We do map vmemap using 2M page size if we can. The linear mapping is done such that we use required page size for that range. For example memory of 3.5G is mapped such that we use 1G mapping till 3G range and use 2M mapping for the rest. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPALStewart Smith1-4/+4
Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is just OPALv3, with nobody ever expecting anything on pre-OPALv3 to be cared about or supported by mainline kernels. So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL exclusively. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17powerpc/powernv: Remove OPALv2 firmware define and referencesStewart Smith1-4/+0
OPALv2 only ever existed in the lab and didn't escape to the world. All OPAL systems in the wild are OPALv3. The probability of there being an OPALv2 system still powered on anywhere inside IBM is approximately zero, let alone anyone expecting to run mainline kernels. So, start to remove references to OPALv2. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-06powerpc/kexec: Wait 1s for secondaries to enter OPALSamuel Mendoza-Jonas1-8/+13
Always include a timeout when waiting for secondary cpus to enter OPAL in the kexec path, rather than only when crashing. Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-20powerpc/powernv: Reset HILE before kexec_sequence()Samuel Mendoza-Jonas1-0/+7
On powernv secondary cpus are returned to OPAL, and will then enter the target kernel in big-endian. However if it is set the HILE bit will persist, causing the first exception in the target kernel to be delivered in litte-endian regardless of the current endianness. If running on top of OPAL make sure the HILE bit is reset once we've finished waiting for all of the secondaries to be returned to OPAL. Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-18powerpc/powernv: move dma_get_required_mask from pnv_phb to pci_controller_opsAndrew Donnellan1-9/+0
Simplify the dma_get_required_mask call chain by moving it from pnv_phb to pci_controller_ops, similar to commit 763d2d8df1ee ("powerpc/powernv: Move dma_set_mask from pnv_phb to pci_controller_ops"). Previous call chain: 0) call dma_get_required_mask() (kernel/dma.c) 1) call ppc_md.dma_get_required_mask, if it exists. On powernv, that points to pnv_dma_get_required_mask() (platforms/powernv/setup.c) 2) device is PCI, therefore call pnv_pci_dma_get_required_mask() (platforms/powernv/pci.c) 3) call phb->dma_get_required_mask if it exists 4) it only exists in the ioda case, where it points to pnv_pci_ioda_dma_get_required_mask() (platforms/powernv/pci-ioda.c) New call chain: 0) call dma_get_required_mask() (kernel/dma.c) 1) device is PCI, therefore call pci_controller_ops.dma_get_required_mask if it exists 2) in the ioda case, that points to pnv_pci_ioda_dma_get_required_mask() (platforms/powernv/pci-ioda.c) In the p5ioc2 case, the call chain remains the same - dma_get_required_mask() does not find either a ppc_md call or pci_controller_ops call, so it calls __dma_get_required_mask(). Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>