path: root/arch/powerpc
2021-12-19  Merge tag 'powerpc-5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux  (Linus Torvalds; 2 files, -10/+36)

    Pull powerpc fixes from Michael Ellerman:
     "Fix a recently introduced oops at boot on 85xx in some configurations.
      Fix crashes when loading some livepatch modules with STRICT_MODULE_RWX.
      Thanks to Joe Lawrence, Russell Currey, and Xiaoming Ni"

    * tag 'powerpc-5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
      powerpc/module_64: Fix livepatching for RO modules
      powerpc/85xx: Fix oops when CONFIG_FSL_PMC=n
2021-12-19  Merge branch 'topic/ppc-kvm' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into HEAD  (Paolo Bonzini; 29 files, -730/+1558)

    Fix conflicts between the memslot overhaul and commit 511d25d6b789f
    ("KVM: PPC: Book3S: Suppress warnings when allocating too big memory
    slots") from the powerpc tree.
2021-12-17  arch: Remove leftovers from prism54 wireless driver  (Alexandre Ghiti; 1 file, -1/+0)

    This driver was removed, so remove all references to it.

    Fixes: d249ff28b1d8 ("intersil: remove obsolete prism54 wireless driver")
    Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
    Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-12-17  Documentation, arch: Remove leftovers from CIFS_WEAK_PW_HASH  (Alexandre Ghiti; 1 file, -1/+0)

    This config was removed, so remove all references to it.

    Fixes: 76a3c92ec9e0 ("cifs: remove support for NTLM and weaker authentication algorithms")
    Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
    Reviewed-by: Steve French <smfrench@gmail.com>
    Acked-by: Arnd Bergmann <arnd@arndb.de> [arch/arm/configs]
    Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-12-17  Documentation, arch: Remove leftovers from raw device  (Alexandre Ghiti; 1 file, -1/+0)

    The raw device interface was removed, so remove all references to the
    configs related to it.

    Fixes: 603e4922f1c8 ("remove the raw driver")
    Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
    Acked-by: Arnd Bergmann <arnd@arndb.de> [arch/arm/configs]
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-12-17  of/fdt: Rework early_init_dt_scan_memory() to call directly  (Rob Herring; 1 file, -11/+10)

    Use of the of_scan_flat_dt() function predates libfdt and is
    discouraged, as libfdt provides a nicer set of APIs. Rework
    early_init_dt_scan_memory() to be called directly and use libfdt.

    Cc: John Crispin <john@phrozen.org>
    Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Frank Rowand <frowand.list@gmail.com>
    Cc: linux-mips@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Reviewed-by: Frank Rowand <frank.rowand@sony.com>
    Signed-off-by: Rob Herring <robh@kernel.org>
    Tested-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211215150102.1303588-1-robh@kernel.org
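This and the two following of/fdt commits all move in the same direction: from a callback registered with of_scan_flat_dt() to a direct libfdt lookup. A minimal sketch of that style, with a hypothetical function (the real conversions live in drivers/of/fdt.c):

    #include <linux/errno.h>
    #include <linux/libfdt.h>
    #include <linux/string.h>

    /* Hypothetical example: look the node up directly with libfdt
     * instead of walking the whole flat tree via a callback. */
    static int scan_chosen_direct(const void *fdt, char *cmdline, size_t len)
    {
            int node = fdt_path_offset(fdt, "/chosen");
            const char *prop;
            int size;

            if (node < 0)
                    return -ENOENT;         /* no /chosen node in this DT */

            prop = fdt_getprop(fdt, node, "bootargs", &size);
            if (prop && size > 0)
                    strscpy(cmdline, prop, len);
            return 0;
    }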
2021-12-17  of/fdt: Rework early_init_dt_scan_root() to call directly  (Rob Herring; 1 file, -2/+2)

    Use of the of_scan_flat_dt() function predates libfdt and is
    discouraged, as libfdt provides a nicer set of APIs. Rework
    early_init_dt_scan_root() to be called directly and use libfdt.

    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Frank Rowand <frowand.list@gmail.com>
    Cc: linuxppc-dev@lists.ozlabs.org
    Signed-off-by: Rob Herring <robh@kernel.org>
    Reviewed-by: Frank Rowand <frank.rowand@sony.com>
    Link: https://lore.kernel.org/r/20211118181213.1433346-3-robh@kernel.org
2021-12-17  of/fdt: Rework early_init_dt_scan_chosen() to call directly  (Rob Herring; 2 files, -4/+2)

    Use of the of_scan_flat_dt() function predates libfdt and is
    discouraged, as libfdt provides a nicer set of APIs. Rework
    early_init_dt_scan_chosen() to be called directly and use libfdt.

    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Frank Rowand <frowand.list@gmail.com>
    Cc: linuxppc-dev@lists.ozlabs.org
    Signed-off-by: Rob Herring <robh@kernel.org>
    Reviewed-by: Frank Rowand <frank.rowand@sony.com>
    Link: https://lore.kernel.org/r/20211118181213.1433346-2-robh@kernel.org
2021-12-17  powerpc/mpic_u3msi: Use msi_for_each_desc()  (Thomas Gleixner; 1 file, -7/+2)

    Replace the about-to-vanish iterators and make use of the filtering.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Link: https://lore.kernel.org/r/20211206210748.576162169@linutronix.de
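This and the next four commits (fsl_msi, pasemi, axon_msi, 4xx/hsta) apply the same two-line conversion. A hedged sketch of the pattern, with teardown_one() standing in for each driver's per-descriptor work:

    #include <linux/msi.h>
    #include <linux/pci.h>

    static void teardown_one(struct msi_desc *entry)
    {
            /* driver-specific teardown; hypothetical placeholder */
    }

    static void teardown_msi_irqs(struct pci_dev *pdev)
    {
            struct msi_desc *entry;

            /* Old style: for_each_pci_msi_entry(entry, pdev) plus a manual
             * "if (entry->irq)" check. New style: let the MSI core filter
             * for descriptors that already have an interrupt associated. */
            msi_for_each_desc(entry, &pdev->dev, MSI_DESC_ASSOCIATED)
                    teardown_one(entry);
    }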
2021-12-17  powerpc/fsl_msi: Use msi_for_each_desc()  (Thomas Gleixner; 1 file, -6/+2)

    Replace the about-to-vanish iterators and make use of the filtering.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Link: https://lore.kernel.org/r/20211206210748.522641685@linutronix.de
2021-12-17  powerpc/pasemi/msi: Convert to msi_for_each_desc()  (Thomas Gleixner; 1 file, -7/+2)

    Replace the about-to-vanish iterators and make use of the filtering.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Link: https://lore.kernel.org/r/20211206210748.468512783@linutronix.de
2021-12-17  powerpc/cell/axon_msi: Convert to msi_for_each_desc()  (Thomas Gleixner; 1 file, -5/+2)

    Replace the about-to-vanish iterators and make use of the filtering.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Link: https://lore.kernel.org/r/20211206210748.414712173@linutronix.de
2021-12-17  powerpc/4xx/hsta: Rework MSI handling  (Thomas Gleixner; 1 file, -5/+2)

    Replace the about-to-vanish iterators and make use of the filtering.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Link: https://lore.kernel.org/r/20211206210748.359766435@linutronix.de
2021-12-17  powerpc/pseries/msi: Let core code check for contiguous entries  (Thomas Gleixner; 1 file, -25/+8)

    Set the domain info flag and remove the check.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Link: https://lore.kernel.org/r/20211210221814.720998720@linutronix.de
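A sketch of what "set the domain info flag" plausibly looks like; the flag name follows the generic MSI domain rework of this series, and the ops/chip fields are illustrative, not the actual pseries diff:

    static struct msi_domain_info pseries_msi_domain_info = {
            .flags = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
                     MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX |
                     MSI_FLAG_MSIX_CONTIGUOUS,   /* core now enforces contiguity */
            .ops   = &pseries_msi_domain_ops,    /* illustrative */
            .chip  = &pseries_msi_irq_chip,      /* illustrative */
    };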
2021-12-17  PCI/MSI: Use msi_desc::msi_index  (Thomas Gleixner; 1 file, -2/+2)

    The usage of msi_desc::pci::entry_nr is confusing at best. It is the
    index into the MSI[X] descriptor table. Use msi_desc::msi_index, which
    is shared between all MSI incarnations, instead of having PCI-specific
    storage for no added value.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Michael Kelley <mikelley@microsoft.com>
    Tested-by: Nishanth Menon <nm@ti.com>
    Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Acked-by: Bjorn Helgaas <bhelgaas@google.com>
    Link: https://lore.kernel.org/r/20211210221814.602911509@linutronix.de
2021-12-17  powerpc/pseries/msi: Use PCI device properties  (Thomas Gleixner; 1 file, -2/+1)

    ... instead of fiddling with MSI descriptors.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Link: https://lore.kernel.org/r/20211210221813.556202506@linutronix.de
2021-12-17  powerpc/cell/axon_msi: Use PCI device property  (Thomas Gleixner; 1 file, -4/+1)

    ... instead of fiddling with MSI descriptors.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Acked-by: Arnd Bergmann <arnd@arndb.de>
    Link: https://lore.kernel.org/r/20211210221813.493922179@linutronix.de
2021-12-16  powerpc: wii_defconfig: Enable the RTC driver  (Emmanuel Gil Peyrot; 1 file, -1/+1)

    This selects the rtc-gamecube driver, which provides a real-time clock
    on this platform.

    Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
    Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
    Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
    Link: https://lore.kernel.org/r/20211215175501.6761-6-linkmauve@linkmauve.fr
2021-12-16  powerpc/pseries: use slab context cpumask allocation in CPU hotplug init  (Nicholas Piggin; 1 file, -4/+5)

    Slab is up at this point, and using the bootmem allocator triggers a
    warning. Switch to using the regular cpumask allocator.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
    Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
    Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211105132923.1582514-1-npiggin@gmail.com
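A minimal sketch of the switch described above (the variable name is illustrative, not the actual diff):

    #include <linux/cpumask.h>
    #include <linux/slab.h>

    static cpumask_var_t node_recorded_ids_map;     /* illustrative name */

    static int __init hotplug_init_cpumask(void)
    {
            /* Slab is initialised by the time CPU hotplug init runs, so the
             * regular allocator applies; a memblock/bootmem allocation here
             * would trigger a warning. */
            if (!zalloc_cpumask_var(&node_recorded_ids_map, GFP_KERNEL))
                    return -ENOMEM;
            return 0;
    }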
2021-12-16  powerpc/64s/interrupt: avoid saving CFAR in some asynchronous interrupts  (Nicholas Piggin; 1 file, -0/+63)

    Reading the CFAR register is quite costly (~20 cycles on POWER9). It is
    worth saving for most synchronous interrupts, but for asynchronous ones
    it is much less important.

    Doorbell, external, and decrementer interrupts are the important
    asynchronous ones. HV interrupts can't skip CFAR if KVM HV is possible,
    because the interrupt might be a guest exit that requires CFAR to be
    preserved. But the important pseries interrupts can avoid loading CFAR.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210922145452.352571-7-npiggin@gmail.com
2021-12-16  powerpc/64/interrupt: reduce expensive debug tests  (Nicholas Piggin; 1 file, -4/+10)

    Move the assertions requiring restart table searches under
    CONFIG_PPC_IRQ_SOFT_MASK_DEBUG.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210922145452.352571-6-npiggin@gmail.com
2021-12-16  powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless perf is in use  (Nicholas Piggin; 4 files, -27/+67)

    Enabling MSR[EE] in interrupt handlers while interrupts are still soft
    masked allows PMIs to profile interrupt handlers to some degree, beyond
    what SIAR latching allows. When perf is not being used, this is almost
    useless work.

    It requires an extra mtmsrd in the irq handler, and it also opens the
    door to masked interrupts hitting and requiring replay, which is more
    expensive than just taking them directly. This effect can be noticeable
    in high-IRQ workloads.

    Avoid enabling MSR[EE] unless perf is currently in use. This saves
    about 60 cycles (or 8%) on a simple decrementer interrupt
    microbenchmark. Replayed interrupts drop from 1.4% of all interrupts
    taken to 0.003%.

    This does prevent the soft-NMI interrupt being taken in these handlers,
    but that's not too reliable anyway. The SMP watchdog will continue to
    be the reliable way to catch lockups.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210922145452.352571-5-npiggin@gmail.com
2021-12-16  powerpc/64s/perf: add power_pmu_wants_prompt_pmi to say whether perf wants PMIs to be soft-NMI  (Nicholas Piggin; 2 files, -0/+33)

    Interrupt code enables MSR[EE] in some irq handlers while keeping local
    irqs disabled via soft-mask, allowing PMI interrupts to be taken as
    soft-NMI to improve profiling of irq handlers.

    When perf is not enabled, there is no point in doing this; it is just
    additional overhead. So provide a function that can say if PMIs should
    be taken promptly if possible.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210922145452.352571-4-npiggin@gmail.com
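A hedged sketch of how the preceding two commits fit together: the predicate added here gates the hard-enable in the irq handler wrappers. The helper name is from the commit above; the wrapper context is illustrative:

    static inline void maybe_hard_enable_for_pmi(void)
    {
            /* Soft-masked irq context: hard-enabling MSR[EE] lets PMIs in
             * as soft-NMIs, but the extra mtmsrd only pays off when perf
             * is actually running. */
            if (power_pmu_wants_prompt_pmi())
                    __hard_irq_enable();
    }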
2021-12-16  powerpc/64s/interrupt: handle MSR EE and RI in interrupt entry wrapper  (Nicholas Piggin; 4 files, -38/+42)

    The mtmsrd to enable MSR[RI] can be combined with the mtmsrd to enable
    MSR[EE] in interrupt entry code, for those interrupts which enable EE.
    This helps performance of important synchronous interrupts (e.g., page
    faults).

    This is similar to what commit dd152f70bdc1 ("powerpc/64s: system call
    avoid setting MSR[RI] until we set MSR[EE]") does for system calls.

    Do this by enabling EE and RI together at the beginning of the entry
    wrapper if PACA_IRQ_HARD_DIS is clear, and only enabling RI if it is
    set.

    Asynchronous interrupts set PACA_IRQ_HARD_DIS, but synchronous ones
    leave it unchanged, so by default they always get EE=1 unless they have
    interrupted a caller that is hard disabled. When the sync interrupt
    later calls interrupt_cond_local_irq_enable(), it will not require
    another mtmsrd because MSR[EE] was already enabled here.

    This avoids one mtmsrd L=1 for synchronous interrupts on 64s, which
    saves about 20 cycles on POWER9. And for kernel-mode interrupts, both
    synchronous and asynchronous, this saves an additional 40 cycles due to
    the mtmsrd being moved ahead of mfspr SPRN_AMR, which prevents an SPR
    scoreboard stall.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210922145452.352571-3-npiggin@gmail.com
2021-12-16  powerpc/64/interrupt: make normal synchronous interrupts enable MSR[EE] if possible  (Nicholas Piggin; 1 file, -1/+18)

    Make synchronous interrupt handler entry wrappers enable MSR[EE] if
    MSR[EE] was enabled in the interrupted context. IRQs are soft-disabled
    at this point so there is no change to high level code, but it means a
    masked interrupt could fire.

    This is a performance disadvantage for interrupts which do not later
    call interrupt_cond_local_irq_enable(), because an additional mtmsrd or
    wrtee instruction is executed. However the important synchronous
    interrupts (e.g., page fault) do enable interrupts, so the performance
    disadvantage is mostly avoided.

    In the next patch, MSR[RI] enabling can be combined with MSR[EE]
    enabling, which mitigates the performance drop for the former and gives
    a performance advantage for the latter interrupts, on 64s machines. 64e
    is coming along for the ride for now to avoid divergences with 64s in
    this tricky code.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210922145452.352571-2-npiggin@gmail.com
2021-12-16  powerpc/pseries/vas: Don't print an error when VAS is unavailable  (Nicholas Piggin; 1 file, -2/+9)

    KVM does not support VAS, so guests always print a useless error on
    boot:

        vas: HCALL(398) error -2, query_type 0, result buffer 0x57f2000

    Change this to only print the message if the error is not H_FUNCTION.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211126052133.1664375-1-npiggin@gmail.com
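A sketch of the resulting error handling, assuming an hcall wrapper shaped like the one in the pseries VAS code (treat the names as illustrative):

    static long query_vas_caps(u8 query_type, u64 result)
    {
            long rc;

            rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
                                          query_type, result);
            if (rc == H_FUNCTION) {
                    /* hcall not implemented: no VAS, e.g. running under KVM.
                     * Not an error worth shouting about on every boot. */
                    pr_debug("VAS is not supported by this hypervisor\n");
                    return -ENOTSUPP;
            }
            if (rc != H_SUCCESS)
                    pr_err("HCALL(%x) error %ld, query_type %u, result buffer 0x%llx\n",
                           H_QUERY_VAS_CAPABILITIES, rc, query_type, result);
            return rc;
    }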
2021-12-16  powerpc/perf: Add data source encodings for power10 platform  (Kajol Jain; 1 file, -12/+42)

    The code represents memory/cache level data based on the PERF_MEM_LVL_*
    namespace, which is in the process of deprecation in favour of the
    newer composite PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_,HOPS_} fields.

    Add data source encodings to represent cache/memory data based on the
    newer composite PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_,HOPS_} fields: data
    coming from local memory/remote memory/distant memory and
    remote/distant cache hits. In order to represent data coming from
    OpenCAPI cache/memory, we use the LVLNUM "PMEM" field, which is used to
    present persistent memory accesses.

    Result on a power10 system with the patch changes:

      localhost:# ./perf mem report --sort="mem,sym,dso" --stdio
      # Overhead       Samples  Memory access             Symbol                          Shared Object
      # ........  ............  ........................  ..............................  ................
        29.46%          2331    L1 or L1 hit              [.] __random                    libc-2.28.so
        23.11%          2121    L1 or L1 hit              [.] producer_populate_cache     producer_consumer
        18.56%          1758    L1 or L1 hit              [.] __random_r                  libc-2.28.so
        15.64%          1559    L2 or L2 hit              [.] __random                    libc-2.28.so
        .....
         0.09%             5    Remote socket, same board Any cache hit  [.] __random     libc-2.28.so
         0.07%             4    Remote socket, same board Any cache hit  [.] __random     libc-2.28.so
        .....

    Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
    Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211206091749.87585-5-kjain@linux.ibm.com
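An illustrative sketch of the composite encoding (not the actual power10 table; the HOPS_* constants were introduced around this series, so availability depends on kernel version): describing a "remote socket, same board" cache hit with the newer fields instead of PERF_MEM_LVL_*:

    #include <linux/perf_event.h>

    static u64 encode_remote_cache_hit(void)
    {
            union perf_mem_data_src dsrc = { .val = 0 };

            dsrc.mem_op      = PERF_MEM_OP_LOAD;
            dsrc.mem_lvl_num = PERF_MEM_LVLNUM_ANY_CACHE;
            dsrc.mem_remote  = PERF_MEM_REMOTE_REMOTE;
            dsrc.mem_hops    = PERF_MEM_HOPS_2;     /* remote socket, same board */
            dsrc.mem_snoop   = PERF_MEM_SNOOP_HIT;
            return dsrc.val;        /* reported via perf_sample_data */
    }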
2021-12-16  powerpc/perf: Add encodings to represent data based on newer composite PERF_MEM_LVLNUM* fields  (Kajol Jain; 1 file, -3/+3)

    The code represents data coming from L1/L2/L3 cache hits based on the
    PERF_MEM_LVL_* namespace, which is in the process of deprecation in
    favour of the newer composite PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_,HOPS_}
    fields.

    Add data source encodings to represent L1/L2/L3 cache hits based on the
    newer composite PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_,HOPS_} fields for
    power10 and older platforms.

    Result on a power9 system without the patch changes:

      localhost:# ./perf mem report --sort="mem,sym,dso" --stdio
      # Overhead       Samples  Memory access  Symbol                             Shared Object
      # ........  ............  .............  .................................  ................
        29.51%             1    L2 hit         [k] perf_event_exec                [kernel.vmlinux]
        27.05%             1    L1 hit         [k] perf_ctx_unlock                [kernel.vmlinux]
        13.93%             1    L1 hit         [k] vtime_delta                    [kernel.vmlinux]
        13.11%             1    L1 hit         [k] prepend_path.isra.11           [kernel.vmlinux]
         8.20%             1    L1 hit         [.] 00000038.plt_call.__GI_strlen  libc-2.28.so
         8.20%             1    L1 hit         [k] perf_event_interrupt           [kernel.vmlinux]

    Result on a power9 system with the patch changes:

      localhost:# ./perf mem report --sort="mem,sym,dso" --stdio
      # Overhead       Samples  Memory access  Symbol                       Shared Object
      # ........  ............  .............  ...........................  ................
        36.63%             1    L2 or L2 hit   [k] perf_event_exec          [kernel.vmlinux]
        25.50%             1    L1 or L1 hit   [k] vtime_delta              [kernel.vmlinux]
        13.12%             1    L1 or L1 hit   [k] unmap_region             [kernel.vmlinux]
        12.62%             1    L1 or L1 hit   [k] perf_sample_event_took   [kernel.vmlinux]
         6.93%             1    L1 or L1 hit   [k] perf_ctx_unlock          [kernel.vmlinux]
         5.20%             1    L1 or L1 hit   [.] __memcpy_power7          libc-2.28.so

    Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
    Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211206091749.87585-4-kjain@linux.ibm.com
2021-12-16  powerpc: gamecube_defconfig: Enable the RTC driver  (Emmanuel Gil Peyrot; 1 file, -1/+1)

    This selects the rtc-gamecube driver, which provides a real-time clock
    on this platform.

    Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
    Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
    Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
    Link: https://lore.kernel.org/r/20211215175501.6761-5-linkmauve@linkmauve.fr
2021-12-16  powerpc: wii.dts: Expose HW_SRNPROT on this platform  (Emmanuel Gil Peyrot; 1 file, -0/+5)

    This Hollywood register isn't properly understood, but it can allow or
    reject access to the SRAM, which we need for RTC usage, so set it if it
    wasn't already set correctly beforehand.

    See https://wiibrew.org/wiki/Hardware/Hollywood_Registers#HW_SRNPROT

    Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
    Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
    Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
    Link: https://lore.kernel.org/r/20211215175501.6761-4-linkmauve@linkmauve.fr
2021-12-15  Merge branch 'topic/ppc-kvm' into next  (Michael Ellerman; 3 files, -15/+13)

    Bring in some more KVM commits from our KVM topic branch.
2021-12-14  powerpc/module_64: Fix livepatching for RO modules  (Russell Currey; 1 file, -8/+34)

    Livepatching a loaded module involves applying relocations through
    apply_relocate_add(), which attempts to write to read-only memory when
    CONFIG_STRICT_MODULE_RWX=y. Work around this by performing these
    writes through the text poke area, using patch_instruction().

    R_PPC_REL24 is the only relocation type generated by the kpatch-build
    userspace tool or the klp-convert kernel tree that I observed applying
    a relocation to a post-init module. A more comprehensive solution is
    planned, but using patch_instruction() for R_PPC_REL24 should serve as
    a sufficient fix.

    This does have a performance impact: I observed ~15% overhead in
    module_load() on POWER8 bare metal with checksum verification off.

    Fixes: c35717c71e98 ("powerpc: Set ARCH_HAS_STRICT_MODULE_RWX")
    Cc: stable@vger.kernel.org # v5.14+
    Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
    Signed-off-by: Russell Currey <ruscur@russell.cc>
    Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
    [mpe: Check return codes from patch_instruction()]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211214121248.777249-1-mpe@ellerman.id.au
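A hedged paraphrase of the R_PPC_REL24 handling this describes (masks follow the ELF ABI LI field; not the verbatim module_64.c diff):

    static int patch_rel24(u32 *location, unsigned long value)
    {
            u32 insn;

            value -= (unsigned long)location;       /* convert to PC-relative */
            if (value + 0x2000000 > 0x3ffffff || (value & 3) != 0)
                    return -ERANGE;                 /* out of 24-bit branch range */

            /* Keep the opcode/AA/LK bits, splice in the new LI field, and
             * write via the text poke area instead of the RO module text. */
            insn = (*location & ~0x03fffffc) | (value & 0x03fffffc);
            return patch_instruction(location, ppc_inst(insn));
    }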
2021-12-14  KVM: PPC: Book3S HV P9: Use kvm_arch_vcpu_get_wait() to get rcuwait object  (Sean Christopherson; 1 file, -2/+3)

    Use kvm_arch_vcpu_get_wait() to get a vCPU's rcuwait object instead of
    using vcpu->wait directly in kvmhv_run_single_vcpu(). Functionally,
    this is a nop as vcpu->arch.waitp is guaranteed to point at vcpu->wait.
    But that is not obvious at first glance, and a future change coming in
    via the KVM tree, commit 510958e99721 ("KVM: Force PPC to define its
    own rcuwait object"), will hide vcpu->wait from architectures that
    define __KVM_HAVE_ARCH_WQP to prevent generic KVM from attempting to
    wake a vCPU with the wrong rcuwait object.

    Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211213174556.3871157-1-seanjc@google.com
2021-12-13  exit: Add and use make_task_dead.  (Eric W. Biederman; 1 file, -4/+4)

    There are two big uses of do_exit. The first is its designed use, to be
    the guts of the exit(2) system call. The second use is to terminate a
    task after something catastrophic has happened, like a NULL pointer
    dereference in kernel code.

    Add a function make_task_dead that is initially exactly the same as
    do_exit, to cover the cases where do_exit is called to handle
    catastrophic failure. In time this can probably be reduced to just a
    light wrapper around do_task_dead. For now keep it exactly the same so
    that there will be no behavioral differences introduced by this new
    concept.

    Replace all of the uses of do_exit that perform catastrophic task
    cleanup with make_task_dead to make it clear what the code is doing.

    As part of this, rename rewind_stack_do_exit to
    rewind_stack_and_make_dead.

    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-12-13  Merge tag 'v5.16-rc5' into locking/core, to pick up fixes  (Ingo Molnar; 18 files, -68/+71)

    Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-12-11  Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next  (Jakub Kicinski; 2 files, -4/+4)

    Andrii Nakryiko says:

    ====================
    bpf-next 2021-12-10 v2

    We've added 115 non-merge commits during the last 26 day(s) which
    contain a total of 182 files changed, 5747 insertions(+), 2564
    deletions(-).

    The main changes are:

    1) Various samples fixes, from Alexander Lobakin.
    2) BPF CO-RE support in kernel and light skeleton, from Alexei Starovoitov.
    3) A batch of new unified APIs for libbpf, logging improvements, version
       querying, etc. Also a batch of old deprecations for old APIs and various
       bug fixes, in preparation for libbpf 1.0, from Andrii Nakryiko.
    4) BPF documentation reorganization and improvements, from Christoph
       Hellwig and Dave Tucker.
    5) Support for declarative initialization of BPF_MAP_TYPE_PROG_ARRAY in
       libbpf, from Hengqi Chen.
    6) Verifier log fixes, from Hou Tao.
    7) Runtime-bounded loops support with bpf_loop() helper, from Joanne Koong.
    8) Extend branch record capturing to all platforms that support it, from
       Kajol Jain.
    9) Light skeleton codegen improvements, from Kumar Kartikeya Dwivedi.
    10) bpftool doc-generating script improvements, from Quentin Monnet.
    11) Two libbpf v0.6 bug fixes, from Shuyi Cheng and Vincent Minet.
    12) Deprecation warning fix for perf/bpf_counter, from Song Liu.
    13) MAX_TAIL_CALL_CNT unification and MIPS build fix for libbpf, from
        Tiezhu Yang.
    14) BTF_KIND_TYPE_TAG follow-up fixes, from Yonghong Song.
    15) Selftests fixes and improvements, from Ilya Leoshkevich, Jean-Philippe
        Brucker, Jiri Olsa, Maxim Mikityanskiy, Tirthendu Sarkar, Yucong Sun,
        and others.

    * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (115 commits)
      libbpf: Add "bool skipped" to struct bpf_map
      libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition
      bpftool: Switch bpf_object__load_xattr() to bpf_object__load()
      selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr()
      selftests/bpf: Add test for libbpf's custom log_buf behavior
      selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load()
      libbpf: Deprecate bpf_object__load_xattr()
      libbpf: Add per-program log buffer setter and getter
      libbpf: Preserve kernel error code and remove kprobe prog type guessing
      libbpf: Improve logging around BPF program loading
      libbpf: Allow passing user log setting through bpf_object_open_opts
      libbpf: Allow passing preallocated log_buf when loading BTF into kernel
      libbpf: Add OPTS-based bpf_btf_load() API
      libbpf: Fix bpf_prog_load() log_buf logic for log_level 0
      samples/bpf: Remove unneeded variable
      bpf: Remove redundant assignment to pointer t
      selftests/bpf: Fix a compilation warning
      perf/bpf_counter: Use bpf_map_create instead of bpf_create_map
      samples: bpf: Fix 'unknown warning group' build warning on Clang
      samples: bpf: Fix xdp_sample_user.o linking with Clang
      ...
    ====================

    Link: https://lore.kernel.org/r/20211210234746.2100561-1-andrii@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10  arch: Make ARCH_STACKWALK independent of STACKTRACE  (Peter Zijlstra; 1 file, -2/+1)

    Make arch_stack_walk() available for ARCH_STACKWALK architectures
    without it being entangled in STACKTRACE.

    Link: https://lore.kernel.org/lkml/20211022152104.356586621@infradead.org/
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    [Mark: rebase, drop unnecessary arm change]
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Cc: Albert Ou <aou@eecs.berkeley.edu>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Christian Borntraeger <borntraeger@de.ibm.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Palmer Dabbelt <palmer@dabbelt.com>
    Cc: Paul Walmsley <paul.walmsley@sifive.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Link: https://lore.kernel.org/r/20211129142849.3056714-2-mark.rutland@arm.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-09  KVM: powerpc: Use Makefile.kvm for common files  (David Woodhouse; 1 file, -6/+2)

    It's all fairly baroque, but in the end I don't think there's any
    reason for $(KVM)/irqchip.o to have been handled differently, as they
    all end up in $(kvm-y) in the end anyway, regardless of whether they
    get there via $(common-objs-y) or the CPU-specific object lists.

    Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
    Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
    Message-Id: <20211121125451.9489-7-dwmw2@infradead.org>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-09  powerpc/powermac: Add additional missing lockdep_register_key()  (Christophe Leroy; 1 file, -0/+2)

    Commit df1f679d19ed ("powerpc/powermac: Add missing
    lockdep_register_key()") fixed a problem that was causing a WARNING.

    There are two other places in the same file with the same problem,
    originating from commit 9e607f72748d ("i2c_powermac: shut up lockdep
    warning"). Add the missing lockdep_register_key() calls.

    Fixes: 9e607f72748d ("i2c_powermac: shut up lockdep warning")
    Reported-by: Erhard Furtner <erhard_f@mailbox.org>
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Depends-on: df1f679d19ed ("powerpc/powermac: Add missing lockdep_register_key()")
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=200055
    Link: https://lore.kernel.org/r/2c7e421874e21b2fb87813d768cf662f630c2ad4.1638984999.git.christophe.leroy@csgroup.eu
2021-12-09  powerpc/fadump: Fix inaccurate CPU state info in vmcore generated with panic  (Hari Bathini; 2 files, -0/+18)

    In the panic path, fadump is triggered via a panic notifier function.
    Before calling panic notifier functions, smp_send_stop() gets called,
    which stops all CPUs except the panic'ing CPU. Commit 8389b37dffdc
    ("powerpc: stop_this_cpu: remove the cpu from the online map.") and
    again commit bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()")
    started marking CPUs as offline while stopping them. So, if a kernel
    has either of the above commits, a vmcore captured with fadump via the
    panic path has register data processed only for the panic'ing CPU.

    Sample output of crash-utility with such a vmcore:

      # crash vmlinux vmcore
      ...
            KERNEL: vmlinux
          DUMPFILE: vmcore  [PARTIAL DUMP]
              CPUS: 1
              DATE: Wed Nov 10 09:56:34 EST 2021
            UPTIME: 00:00:42
      LOAD AVERAGE: 2.27, 0.69, 0.24
             TASKS: 183
          NODENAME: XXXXXXXXX
           RELEASE: 5.15.0+
           VERSION: #974 SMP Wed Nov 10 04:18:19 CST 2021
           MACHINE: ppc64le  (2500 Mhz)
            MEMORY: 8 GB
             PANIC: "Kernel panic - not syncing: sysrq triggered crash"
               PID: 3394
           COMMAND: "bash"
              TASK: c0000000150a5f80  [THREAD_INFO: c0000000150a5f80]
               CPU: 1
             STATE: TASK_RUNNING (PANIC)

      crash> p -x __cpu_online_mask
      __cpu_online_mask = $1 = {
        bits = {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
      }
      crash> p -x __cpu_active_mask
      __cpu_active_mask = $2 = {
        bits = {0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
      }

    While this has been the case since fadump was introduced, the issue was
    not identified for two probable reasons:

      - In general, the bulk of the vmcores analyzed were from crashes due
        to an exception.

      - The above did change since commit 8341f2f222d7 ("sysrq: Use panic()
        to force a crash") started using panic() instead of dereferencing a
        NULL pointer to force a kernel crash. But then commit de6e5d38417e
        ("powerpc: smp_send_stop do not offline stopped CPUs") stopped
        marking CPUs as offline, until commit bab26238bbd4 ("powerpc:
        Offline CPU in stop_this_cpu()") reverted that change.

    To ensure post-processing of register data for all other CPUs happens
    as intended, let the panic() function take the crash-friendly path
    (read crash_smp_send_stop()) with the help of the
    crash_kexec_post_notifiers option. Also, as register data for all CPUs
    is captured by f/w, skip the IPI callbacks here for fadump, to avoid
    any complications in finding the right backtraces.

    Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211207103719.91117-2-hbathini@linux.ibm.com
2021-12-09  powerpc: handle kdump appropriately with crash_kexec_post_notifiers option  (Hari Bathini; 1 file, -0/+30)

    Kdump can be triggered after panic_notifiers since commit f06e5153f4ae2
    ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump
    after panic_notifers") introduced the crash_kexec_post_notifiers
    option. But using this option means smp_send_stop(), which marks all
    other CPUs as offline, gets called before kdump is triggered. As a
    result, kdump routines fail to save the other CPUs' registers.

    To fix this, the kdump-friendly crash_smp_send_stop() function was
    introduced with kernel commit 0ee59413c967 ("x86/panic: replace
    smp_send_stop() with kdump friendly version in panic path"). Override
    this kdump-friendly weak function to handle the
    crash_kexec_post_notifiers option appropriately on powerpc.

    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
    [Fixed signature of crash_stop_this_cpu() - reported by lkp@intel.com]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211207103719.91117-1-hbathini@linux.ibm.com
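A minimal sketch of overriding the weak kdump-friendly stop function; crash_save_other_cpus() is a hypothetical stand-in for the architecture's register-saving IPI machinery:

    void crash_smp_send_stop(void)
    {
            static bool stopped;

            /* Idempotent: panic() and crash_kexec() may both call this. */
            if (stopped)
                    return;
            stopped = true;

            /* Unlike plain smp_send_stop(), save the other CPUs' register
             * state before stopping them, so vmcore post-processing can
             * still see every CPU. */
            crash_save_other_cpus();        /* hypothetical helper */
    }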
2021-12-09  powerpc/cell: Fix clang -Wimplicit-fallthrough warning  (Anders Roxell; 1 file, -0/+1)

    Clang warns:

      arch/powerpc/platforms/cell/pervasive.c:81:2: error: unannotated fall-through between switch labels
              case SRR1_WAKEEE:
              ^
      arch/powerpc/platforms/cell/pervasive.c:81:2: note: insert 'break;' to avoid fall-through
              case SRR1_WAKEEE:
              ^
              break;
      1 error generated.

    Clang is more pedantic than GCC, which does not warn when falling
    through to a case that is just break or return. Clang's version is more
    in line with the kernel's own stance in deprecated.rst. Add the missing
    break to silence the warning.

    Fixes: 6e83985b0f6e ("powerpc/cbe: Do not process external or decremeter interrupts from sreset")
    Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
    Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
    Reviewed-by: Nathan Chancellor <nathan@kernel.org>
    Reviewed-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211207110228.698956-1-anders.roxell@linaro.org
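For context, the shape of the fix (paraphrased from pervasive.c; the wrapper function name is hypothetical):

    static void cbe_handle_wakeup(struct pt_regs *regs)
    {
            switch (regs->msr & SRR1_WAKEMASK) {
            case SRR1_WAKEDEC:
                    set_dec(1);
                    break;  /* the added break; this case previously fell
                             * through into SRR1_WAKEEE */
            case SRR1_WAKEEE:
                    do_IRQ(regs);
                    break;
            }
    }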
2021-12-09  powerpc/inst: Optimise copy_inst_from_kernel_nofault()  (Christophe Leroy; 2 files, -18/+24)

    copy_inst_from_kernel_nofault() uses copy_from_kernel_nofault() to copy
    one or two 32-bit words. This means calling an out-of-line function
    which itself calls back copy_from_kernel_nofault_allowed(), then
    performs a generic copy with loops.

    Rewrite copy_inst_from_kernel_nofault() to do everything at a single
    place and use __get_kernel_nofault() directly to perform single
    accesses without loops.

    Although the generic function uses pagefault_disable(), it is not
    required on powerpc because do_page_fault() bails out earlier when a
    kernel mode fault happens on a kernel address.

    As the function has now become very small, inline it.

    With this change, on an 8xx the time spent in the loop in
    ftrace_replace_code() is reduced by 23% at function tracer activation
    and 27% at nop tracer activation. The overall time to activate the
    function tracer (measured with shell command 'time') is 570ms before
    the patch and 470ms after the patch.

    Even vmlinux size is reduced (by 152 instructions).

    Before the patch:

      00000018 <copy_inst_from_kernel_nofault>:
        18:  94 21 ff e0  stwu    r1,-32(r1)
        1c:  7c 08 02 a6  mflr    r0
        20:  38 a0 00 04  li      r5,4
        24:  93 e1 00 1c  stw     r31,28(r1)
        28:  7c 7f 1b 78  mr      r31,r3
        2c:  38 61 00 08  addi    r3,r1,8
        30:  90 01 00 24  stw     r0,36(r1)
        34:  48 00 00 01  bl      34 <copy_inst_from_kernel_nofault+0x1c>
                          34: R_PPC_REL24  copy_from_kernel_nofault
        38:  2c 03 00 00  cmpwi   r3,0
        3c:  40 82 00 0c  bne     48 <copy_inst_from_kernel_nofault+0x30>
        40:  81 21 00 08  lwz     r9,8(r1)
        44:  91 3f 00 00  stw     r9,0(r31)
        48:  80 01 00 24  lwz     r0,36(r1)
        4c:  83 e1 00 1c  lwz     r31,28(r1)
        50:  38 21 00 20  addi    r1,r1,32
        54:  7c 08 03 a6  mtlr    r0
        58:  4e 80 00 20  blr

    After the patch (before inlining):

      00000018 <copy_inst_from_kernel_nofault>:
        18:  3d 20 b0 00  lis     r9,-20480
        1c:  7c 04 48 40  cmplw   r4,r9
        20:  7c 69 1b 78  mr      r9,r3
        24:  41 80 00 14  blt     38 <copy_inst_from_kernel_nofault+0x20>
        28:  81 44 00 00  lwz     r10,0(r4)
        2c:  38 60 00 00  li      r3,0
        30:  91 49 00 00  stw     r10,0(r9)
        34:  4e 80 00 20  blr
        38:  38 60 ff de  li      r3,-34
        3c:  4e 80 00 20  blr
        40:  38 60 ff f2  li      r3,-14
        44:  4e 80 00 20  blr

    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    [mpe: Add clang workaround, with version check as suggested by Nathan]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/0d5b12183d5176dd702d29ad94c39c384e51c78f.1638208156.git.christophe.leroy@csgroup.eu
2021-12-09  powerpc/inst: Move ppc_inst_t definition in asm/reg.h  (Christophe Leroy; 4 files, -11/+13)

    Because of circular inclusion of asm/hw_breakpoint.h, we need to move
    the definition of ppc_inst_t into asm/reg.h, out of inst.h, so that
    asm/hw_breakpoint.h gets it without including asm/inst.h.

    Also remove asm/inst.h from asm/uprobes.h as it's not needed anymore.

    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/4b79f1491118af96b1ac0735e74aeca02ea4c04e.1638208156.git.christophe.leroy@csgroup.eu
2021-12-09  powerpc/inst: Define ppc_inst_t as u32 on PPC32  (Christophe Leroy; 2 files, -8/+15)

    Unlike the PPC64 ABI, PPC32 uses the stack to pass a parameter defined
    as a struct, even when the struct has a single simple element. To avoid
    that, define ppc_inst_t as u32 on PPC32.

    Keep it as 'struct ppc_inst' when __CHECKER__ is defined so that sparse
    can perform type checking.

    Also revert commit 511eea5e2ccd ("powerpc/kprobes: Fix Oops by passing
    ppc_inst as a pointer to emulate_step() on ppc32"), as now the
    instruction to be emulated is passed as a register to emulate_step().

    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/c6d0c46f598f76ad0b0a88bc0d84773bd921b17c.1638208156.git.christophe.leroy@csgroup.eu
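A sketch of the arrangement this describes (simplified from the series): a bare u32 on PPC32 so the value is passed in a register, but still a distinct struct under sparse so type checking keeps working:

    #if defined(CONFIG_PPC64) || defined(__CHECKER__)
    typedef struct {
            u32 val;
    #ifdef CONFIG_PPC64
            u32 suffix;     /* second word of a prefixed instruction */
    #endif
    } __packed ppc_inst_t;
    #else
    typedef u32 ppc_inst_t; /* fits in a register; no stack passing */
    #endif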
2021-12-09  powerpc/inst: Define ppc_inst_t  (Christophe Leroy; 23 files, -112/+112)

    In order to stop using 'struct ppc_inst' on PPC32, define a ppc_inst_t
    typedef.

    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/fe5baa2c66fea9db05a8b300b3e8d2880a42596c.1638208156.git.christophe.leroy@csgroup.eu
2021-12-09  powerpc/inst: Refactor ___get_user_instr()  (Christophe Leroy; 1 file, -11/+2)

    The PPC64 version of ___get_user_instr() can be used for PPC32 as well,
    by simply disabling the suffix part with IS_ENABLED(CONFIG_PPC64).

    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/1f0ede830ccb33a659119a55cb590820c27004db.1638208156.git.christophe.leroy@csgroup.eu
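A sketch close to, but not verbatim from, the unified macro: the prefixed-instruction suffix read is compiled out on PPC32 via IS_ENABLED() instead of keeping a duplicate #ifdef copy (this assumes a ppc_inst_prefix() fallback exists on PPC32):

    #define ___get_user_instr(gu_op, dest, ptr)                             \
    ({                                                                      \
            long __gui_ret;                                                 \
            u32 __user *__gui_ptr = (u32 __user *)ptr;                      \
            ppc_inst_t __gui_inst;                                          \
            unsigned int __prefix, __suffix;                                \
                                                                            \
            __gui_ret = gu_op(__prefix, __gui_ptr);                         \
            if (__gui_ret == 0) {                                           \
                    /* Only a prefixed insn (PPC64) has a second word; on   \
                     * PPC32 the IS_ENABLED() test compiles this away. */   \
                    if (IS_ENABLED(CONFIG_PPC64) &&                         \
                        (__prefix >> 26) == OP_PREFIX) {                    \
                            __gui_ret = gu_op(__suffix, __gui_ptr + 1);     \
                            __gui_inst = ppc_inst_prefix(__prefix,          \
                                                         __suffix);         \
                    } else {                                                \
                            __gui_inst = ppc_inst(__prefix);                \
                    }                                                       \
                    if (__gui_ret == 0)                                     \
                            (dest) = __gui_inst;                            \
            }                                                               \
            __gui_ret;                                                      \
    })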
2021-12-09  powerpc/32s: Allocate one 256k IBAT instead of two consecutive 128k IBATs  (Christophe Leroy; 1 file, -3/+2)

    Today we have the following IBATs allocated:

      ---[ Instruction Block Address Translation ]---
      0: 0xc0000000-0xc03fffff 0x00000000    4M Kernel x m
      1: 0xc0400000-0xc05fffff 0x00400000    2M Kernel x m
      2: 0xc0600000-0xc06fffff 0x00600000    1M Kernel x m
      3: 0xc0700000-0xc077ffff 0x00700000  512K Kernel x m
      4: 0xc0780000-0xc079ffff 0x00780000  128K Kernel x m
      5: 0xc07a0000-0xc07bffff 0x007a0000  128K Kernel x m
      6: -
      7: -

    The two 128K BATs should be a single 256K one instead.

    When _etext is not aligned to 128Kbytes, the system will allocate all
    necessary BATs to the lower 128Kbytes boundary, then allocate an
    additional 128Kbytes BAT for the remaining block.

    Instead, align the top to 128Kbytes so that the function directly
    allocates a 256Kbytes last block:

      ---[ Instruction Block Address Translation ]---
      0: 0xc0000000-0xc03fffff 0x00000000    4M Kernel x m
      1: 0xc0400000-0xc05fffff 0x00400000    2M Kernel x m
      2: 0xc0600000-0xc06fffff 0x00600000    1M Kernel x m
      3: 0xc0700000-0xc077ffff 0x00700000  512K Kernel x m
      4: 0xc0780000-0xc07bffff 0x00780000  256K Kernel x m
      5: -
      6: -
      7: -

    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/ab58b296832b0ec650e2203200e060adbcb2677d.1637930421.git.christophe.leroy@csgroup.eu
2021-12-09  powerpc: Remove CONFIG_PPC_HAVE_KUAP and CONFIG_PPC_HAVE_KUEP  (Christophe Leroy; 1 file, -21/+0)

    All platforms now have KUAP and KUEP, so remove CONFIG_PPC_HAVE_KUAP
    and CONFIG_PPC_HAVE_KUEP.

    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/a3c007ad0951965199e6ab2ef1035966bc66e771.1634627931.git.christophe.leroy@csgroup.eu
2021-12-09  powerpc/kuap: Wire-up KUAP on book3e/64  (Christophe Leroy; 2 files, -6/+35)

    This adds KUAP support to book3e/64. This is done by reading the
    content of SPRN_MAS1 and checking the TID at the time the user pgtable
    is loaded.

    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/e2c2c9375afd4bbc06aa904d0103a5f5102a2b1a.1634627931.git.christophe.leroy@csgroup.eu