path: root/arch
Age | Commit message | Author | Files | Lines
3 days | LoongArch: dts: loongson-2k2000: Add default interrupt controller address cells | Binbin Zhou | 1 | -0/+3
commit e65df3f77ecd59d3a8647d19df82b22a6ce210a9 upstream. Add missing address-cells 0 to the Local I/O, Extend I/O and PCH-PIC interrupt controller nodes to silence the W=1 warning: loongson-2k2000.dtsi:364.5-49: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@9,0:interrupt-map: Missing property '#address-cells' in node /bus@10000000/interrupt-controller@10000000, using 0 as fallback Value '0' is correct because: 1. The LIO/EIO/PCH interrupt controllers do not have children, 2. the interrupt-map property (in the PCI node) consists of five components, and the fourth component, "parent unit address", whose size is defined by '#address-cells' of the node pointed to by the interrupt-parent component, is not used (=0) Cc: stable@vger.kernel.org Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 days | LoongArch: dts: loongson-2k1000: Fix i2c-gpio node names | Binbin Zhou | 1 | -2/+2
commit 14ea5a3625881d79f75418c66e3a7d98db8518e1 upstream. The binding wants the node to be named "i2c-number", but those are named "i2c-gpio-number" instead. Thus rename those to i2c-0, i2c-1 to adhere to the binding and suppress dtbs_check warnings. Cc: stable@vger.kernel.org Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 days | LoongArch: dts: loongson-2k1000: Add default interrupt controller address cells | Binbin Zhou | 1 | -0/+2
commit 81e8cb7e504a5adbcc48f7f954bf3c2aa9b417f8 upstream. Add missing address-cells 0 to the Local I/O interrupt controller node to silence the W=1 warning: loongson-2k1000.dtsi:498.5-55: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@9,0:interrupt-map: Missing property '#address-cells' in node /bus@10000000/interrupt-controller@1fe01440, using 0 as fallback Value '0' is correct because: 1. The Local I/O interrupt controller does not have children, 2. the interrupt-map property (in the PCI node) consists of five components, and the fourth component, "parent unit address", whose size is defined by '#address-cells' of the node pointed to by the interrupt-parent component, is not used (=0) Cc: stable@vger.kernel.org Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 days | LoongArch: dts: loongson-2k0500: Add default interrupt controller address cells | Binbin Zhou | 1 | -0/+3
commit c4461754e6fe7e12a3ff198cce4707e3e20e43d4 upstream. Add missing address-cells 0 to the Local I/O and Extend I/O interrupt controller nodes to silence the W=1 warning: loongson-2k0500.dtsi:513.5-51: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@0,0:interrupt-map: Missing property '#address-cells' in node /bus@10000000/interrupt-controller@1fe11600, using 0 as fallback Value '0' is correct because: 1. The Local I/O & Extend I/O interrupt controllers do not have children, 2. the interrupt-map property (in the PCI node) consists of five components, and the fourth component, "parent unit address", whose size is defined by '#address-cells' of the node pointed to by the interrupt-parent component, is not used (=0) Cc: stable@vger.kernel.org Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 days | LoongArch: Fix PMU counter allocation for mixed-type event groups | Lisa Robinson | 1 | -3/+18
commit a91f86e27087f250a5d9c89bb4a427b9c30fd815 upstream. When validating a perf event group, validate_group() unconditionally attempts to allocate hardware PMU counters for the leader, sibling events and the new event being added. This is incorrect for mixed-type groups. If a PERF_TYPE_SOFTWARE event is part of the group, the current code still tries to allocate a hardware PMU counter for it, which can wrongly consume hardware PMU resources and cause spurious allocation failures. Fix this by only allocating PMU counters for hardware events during group validation, and skipping software events. A trimmed down reproducer is as simple as this:

    #include <stdio.h>
    #include <assert.h>
    #include <unistd.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <linux/perf_event.h>

    int main (int argc, char *argv[])
    {
            struct perf_event_attr attr = { 0 };
            int fds[5];

            attr.disabled = 1;
            attr.exclude_kernel = 1;
            attr.exclude_hv = 1;
            attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
                               PERF_FORMAT_TOTAL_TIME_RUNNING |
                               PERF_FORMAT_ID | PERF_FORMAT_GROUP;
            attr.size = sizeof (attr);

            attr.type = PERF_TYPE_SOFTWARE;
            attr.config = PERF_COUNT_SW_DUMMY;
            fds[0] = syscall (SYS_perf_event_open, &attr, 0, -1, -1, 0);
            assert (fds[0] >= 0);

            attr.type = PERF_TYPE_HARDWARE;
            attr.config = PERF_COUNT_HW_CPU_CYCLES;
            fds[1] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
            assert (fds[1] >= 0);

            attr.type = PERF_TYPE_HARDWARE;
            attr.config = PERF_COUNT_HW_INSTRUCTIONS;
            fds[2] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
            assert (fds[2] >= 0);

            attr.type = PERF_TYPE_HARDWARE;
            attr.config = PERF_COUNT_HW_BRANCH_MISSES;
            fds[3] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
            assert (fds[3] >= 0);

            attr.type = PERF_TYPE_HARDWARE;
            attr.config = PERF_COUNT_HW_CACHE_REFERENCES;
            fds[4] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
            assert (fds[4] >= 0);

            printf ("PASSED\n");
            return 0;
    }

Cc: stable@vger.kernel.org Fixes: b37042b2bb7c ("LoongArch: Add perf events support") Signed-off-by: Lisa Robinson <lisa@bytefly.space> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
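A minimal sketch of the corrected validation logic (the counter-allocation helper spelling is assumed for illustration; is_software_event() and for_each_sibling_event() are the generic perf helpers):

    static int validate_group(struct perf_event *event)
    {
            struct perf_event *sibling, *leader = event->group_leader;
            struct cpu_hw_events fake_cpuc;

            memset(&fake_cpuc, 0, sizeof(fake_cpuc));

            /* Software events never occupy a hardware PMU counter. */
            if (!is_software_event(leader) &&
                loongarch_pmu_alloc_counter(&fake_cpuc, &leader->hw) < 0)
                    return -EINVAL;

            for_each_sibling_event(sibling, leader) {
                    if (!is_software_event(sibling) &&
                        loongarch_pmu_alloc_counter(&fake_cpuc, &sibling->hw) < 0)
                            return -EINVAL;
            }

            if (!is_software_event(event) &&
                loongarch_pmu_alloc_counter(&fake_cpuc, &event->hw) < 0)
                    return -EINVAL;

            return 0;
    }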
3 days | x86/resctrl: Fix memory bandwidth counter width for Hygon | Xiaochen Shen | 2 | -2/+16
commit 7517e899e1b87b4c22a92c7e40d8733c48e4ec3c upstream. The memory bandwidth calculation relies on reading the hardware counter and measuring the delta between samples. To ensure accurate measurement, the software reads the counter frequently enough to prevent it from rolling over twice between reads. The default Memory Bandwidth Monitoring (MBM) counter width is 24 bits. Hygon CPUs provide a 32-bit width counter, but they do not support the MBM capability CPUID leaf (0xF.[ECX=1]:EAX) to report the width offset (from 24 bits). Consequently, the kernel falls back to the 24-bit default counter width, which causes incorrect overflow handling on Hygon CPUs. Fix this by explicitly setting the counter width offset to 8 bits (resulting in a 32-bit total counter width) for Hygon CPUs. Fixes: d8df126349da ("x86/cpu/hygon: Add missing resctrl_cpu_detect() in bsp_init helper") Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251209062650.1536952-3-shenxiaochen@open-hieco.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
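The width feeds the wraparound-safe delta computation, so an understated width corrupts any delta larger than 2^24; a sketch of that arithmetic and of the described fix (the variable and vendor-check spellings are assumed):

    /* delta modulo the hardware counter width */
    static u64 mbm_delta(u64 prev, u64 cur, unsigned int width)
    {
            u64 shift = 64 - width;

            return ((cur << shift) - (prev << shift)) >> shift;
    }

    /* Hygon reports no CPUID width offset, so set it explicitly */
    if (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
            mbm_width_offset = 8;      /* 24-bit default + 8 = 32 bits */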
3 days | x86/resctrl: Add missing resctrl initialization for Hygon | Xiaochen Shen | 1 | -2/+4
commit 6ee98aabdc700b5705e4f1833e2edc82a826b53b upstream. Hygon CPUs supporting Platform QoS features currently undergo partial resctrl initialization through resctrl_cpu_detect() in the Hygon BSP init helper and AMD/Hygon common initialization code. However, several critical data structures remain uninitialized for Hygon CPUs in the following paths: - get_mem_config()-> __rdt_get_mem_config_amd(): rdt_resource::membw,alloc_capable hw_res::num_closid - rdt_init_res_defs()->rdt_init_res_defs_amd(): rdt_resource::cache hw_res::msr_base,msr_update Add the missing AMD/Hygon common initialization to ensure proper Platform QoS functionality on Hygon CPUs. Fixes: d8df126349da ("x86/cpu/hygon: Add missing resctrl_cpu_detect() in bsp_init helper") Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251209062650.1536952-2-shenxiaochen@open-hieco.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 days | x86/kaslr: Recognize all ZONE_DEVICE users as physaddr consumers | Dan Williams | 1 | -5/+5
commit 269031b15c1433ff39e30fa7ea3ab8f0be9d6ae2 upstream. Commit 7ffb791423c7 ("x86/kaslr: Reduce KASLR entropy on most x86 systems") is too narrow. The effect being mitigated in that commit is caused by ZONE_DEVICE, on which PCI_P2PDMA has a dependency. ZONE_DEVICE, in general, lets any physical address be added to the direct-map. I.e. not only ACPI hotplug ranges, CXL Memory Windows, or EFI Specific Purpose Memory, but also any PCI MMIO range for the DEVICE_PRIVATE and PCI_P2PDMA cases. Update the mitigation (limiting KASLR entropy) to apply in all ZONE_DEVICE=y cases. Distro kernels typically have PCI_P2PDMA=y, so the practical exposure of this problem is limited to the PCI_P2PDMA=n case. A potential path to recover entropy would be to walk ACPI and determine the limits for hotplug and PCI MMIO before kernel_randomize_memory(). On smaller systems that could yield some KASLR address bits. This needs additional investigation to determine if some limited ACPI table scanning can happen this early without an open coded solution like arch/x86/boot/compressed/acpi.c needs to deploy. Cc: Ingo Molnar <mingo@kernel.org> Cc: Kees Cook <kees@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Michal Hocko <mhocko@suse.com> Fixes: 7ffb791423c7 ("x86/kaslr: Reduce KASLR entropy on most x86 systems") Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Balbir Singh <balbirs@nvidia.com> Tested-by: Yasunori Goto <y-goto@fujitsu.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: http://patch.msgid.link/692e08b2516d4_261c1100a3@dwillia2-mobl4.notmuch Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
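A sketch of the widened gate in the KASLR sizing path (only the IS_ENABLED() switch reflects the described change; the helper names here are hypothetical):

    if (IS_ENABLED(CONFIG_ZONE_DEVICE))
            /* any physical address may join the direct map later */
            memory_tb = full_physical_address_space_tb();   /* hypothetical */
    else
            memory_tb = boot_memory_plus_padding_tb();      /* hypothetical */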
3 days | x86/fpu: Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1 | Sean Christopherson | 2 | -3/+38
commit b45f721775947a84996deb5c661602254ce25ce6 upstream. When loading guest XSAVE state via KVM_SET_XSAVE, and when updating XFD in response to a guest WRMSR, clear XFD-disabled features in the saved (or to be restored) XSTATE_BV to ensure KVM doesn't attempt to load state for features that are disabled via the guest's XFD. Because the kernel executes XRSTOR with the guest's XFD, saving XSTATE_BV[i]=1 with XFD[i]=1 will cause XRSTOR to #NM and panic the kernel. E.g. if fpu_update_guest_xfd() sets XFD without clearing XSTATE_BV: ------------[ cut here ]------------ WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#29: amx_test/848 Modules linked in: kvm_intel kvm irqbypass CPU: 29 UID: 1000 PID: 848 Comm: amx_test Not tainted 6.19.0-rc2-ffa07f7fd437-x86_amx_nm_xfd_non_init-vm #171 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:exc_device_not_available+0x101/0x110 Call Trace: <TASK> asm_exc_device_not_available+0x1a/0x20 RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90 switch_fpu_return+0x4a/0xb0 kvm_arch_vcpu_ioctl_run+0x1245/0x1e40 [kvm] kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm] __x64_sys_ioctl+0x8f/0xd0 do_syscall_64+0x62/0x940 entry_SYSCALL_64_after_hwframe+0x4b/0x53 </TASK> ---[ end trace 0000000000000000 ]--- This can happen if the guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1, and a host IRQ triggers kernel_fpu_begin() prior to the vmexit handler's call to fpu_update_guest_xfd(). And if userspace stuffs XSTATE_BV[i]=1 via KVM_SET_XSAVE: ------------[ cut here ]------------ WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#14: amx_test/867 Modules linked in: kvm_intel kvm irqbypass CPU: 14 UID: 1000 PID: 867 Comm: amx_test Not tainted 6.19.0-rc2-2dace9faccd6-x86_amx_nm_xfd_non_init-vm #168 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:exc_device_not_available+0x101/0x110 Call Trace: <TASK> asm_exc_device_not_available+0x1a/0x20 RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90 fpu_swap_kvm_fpstate+0x6b/0x120 kvm_load_guest_fpu+0x30/0x80 [kvm] kvm_arch_vcpu_ioctl_run+0x85/0x1e40 [kvm] kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm] __x64_sys_ioctl+0x8f/0xd0 do_syscall_64+0x62/0x940 entry_SYSCALL_64_after_hwframe+0x4b/0x53 </TASK> ---[ end trace 0000000000000000 ]--- The new behavior is consistent with the AMX architecture. Per Intel's SDM, XSAVE saves XSTATE_BV as '0' for components that are disabled via XFD (and non-compacted XSAVE saves the initial configuration of the state component): If XSAVE, XSAVEC, XSAVEOPT, or XSAVES is saving the state component i, the instruction does not generate #NM when XCR0[i] = IA32_XFD[i] = 1; instead, it operates as if XINUSE[i] = 0 (and the state component was in its initial state): it saves bit i of XSTATE_BV field of the XSAVE header as 0; in addition, XSAVE saves the initial configuration of the state component (the other instructions do not save state component i). Alternatively, KVM could always do XRSTOR with XFD=0, e.g. by using a constant XFD based on the set of enabled features when XSAVEing for a struct fpu_guest. However, having XSTATE_BV[i]=1 for XFD-disabled features can only happen in the above interrupt case, or in similar scenarios involving preemption on preemptible kernels, because fpu_swap_kvm_fpstate()'s call to save_fpregs_to_fpstate() saves the outgoing FPU state with the current XFD; and that is (on all but the first WRMSR to XFD) the guest XFD. Therefore, XFD can only go out of sync with XSTATE_BV in those narrow windows, and we can consider it (de facto) part of the KVM ABI that KVM_GET_XSAVE returns XSTATE_BV[i]=0 for XFD-disabled features. Reported-by: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Fixes: 820a6ee944e7 ("kvm: x86: Add emulation for IA32_XFD", 2022-01-14) Signed-off-by: Sean Christopherson <seanjc@google.com> [Move clearing of XSTATE_BV from fpu_copy_uabi_to_guest_fpstate to kvm_vcpu_ioctl_x86_set_xsave. - Paolo] Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
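The core of the described fix reduces to one masking step on the saved header; a sketch (the function name is illustrative; xregs_state and header.xfeatures are the kernel's FPU types):

    static void clear_xfd_disabled_features(struct xregs_state *xsave, u64 xfd)
    {
            /* XRSTOR #NMs on XSTATE_BV[i]=1 && XFD[i]=1; per the SDM,
             * hardware XSAVE would have stored 0 for such components. */
            xsave->header.xfeatures &= ~xfd;
    }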
9 days | riscv: pgtable: Cleanup useless VA_USER_XXX definitions | Guo Ren (Alibaba DAMO Academy) | 1 | -4/+0
[ Upstream commit 5e5be092ffadcab0093464ccd9e30f0c5cce16b9 ] These macros are not used after commit b5b4287accd7 ("riscv: mm: Use hint address in mmap if available"). Clean up the VA_USER_XXX definitions in asm/pgtable.h. Fixes: b5b4287accd7 ("riscv: mm: Use hint address in mmap if available") Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20251201005850.702569-1-guoren@kernel.org Signed-off-by: Paul Walmsley <pjw@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 days | arm64: dts: mba8mx: Fix Ethernet PHY IRQ support | Alexander Stein | 1 | -1/+1
[ Upstream commit 89e87d0dc87eb3654c9ae01afc4a18c1c6d1e523 ] Ethernet PHY interrupt mode is level triggered. Adjust the mode accordingly. Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Fixes: 70cf622bb16e ("arm64: dts: mba8mx: Add Ethernet PHY IRQ support") Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 days | arm64: dts: imx8qm-ss-dma: correct the dma channels of lpuart | Sherry Sun | 1 | -4/+4
[ Upstream commit a988caeed9d918452aa0a68de2c6e94d86aa43ba ] Commit 616effc0272b5 ("arm64: dts: imx8: Fix lpuart DMA channel order") swapped the uart rx and tx channels in the common imx8-ss-dma.dtsi, but missed updating imx8qm-ss-dma.dtsi. Commit 5a8e9b022e569 ("arm64: dts: imx8qm-ss-dma: Pass lpuart dma-names") simply added dma-names as the binding doc requires. Correct the lpuart0 - lpuart3 dma rx and tx channels, and use defines for the FSL_EDMA_RX flag. Fixes: 5a8e9b022e56 ("arm64: dts: imx8qm-ss-dma: Pass lpuart dma-names") Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 days | arm64: dts: imx8mp: Fix LAN8740Ai PHY reference clock on DH electronics i.MX8M Plus DHCOM | Marek Vasut | 1 | -0/+1
[ Upstream commit c63749a7ddc59ac6ec0b05abfa0a21af9f2c1d38 ] Add missing 'clocks' property to LAN8740Ai PHY node, to allow the PHY driver to manage LAN8740Ai CLKIN reference clock supply. This fixes sporadic link bouncing caused by interruptions on the PHY reference clock, by letting the PHY driver manage the reference clock and assure there are no interruptions. This follows the matching PHY driver recommendation described in commit bedd8d78aba3 ("net: phy: smsc: LAN8710/20: add phy refclk in support") Fixes: 8d6712695bc8 ("arm64: dts: imx8mp: Add support for DH electronics i.MX8M Plus DHCOM and PDK2") Signed-off-by: Marek Vasut <marek.vasut@mailbox.org> Tested-by: Christoph Niedermaier <cniedermaier@dh-electronics.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 days | ARM: dts: imx6q-ba16: fix RTC interrupt level | Ian Ray | 1 | -1/+1
[ Upstream commit e6a4eedd49ce27c16a80506c66a04707e0ee0116 ] RTC interrupt level should be set to "LOW". This was revealed by the introduction of commit: f181987ef477 ("rtc: m41t80: use IRQ flags obtained from fwnode") which changed the way IRQ type is obtained. Fixes: 56c27310c1b4 ("ARM: dts: imx: Add Advantech BA-16 Qseven module") Signed-off-by: Ian Ray <ian.ray@gehealthcare.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 days | arm64: dts: add off-on-delay-us for usdhc2 regulator | Haibo Chen | 1 | -0/+1
[ Upstream commit ca643894a37a25713029b36cfe7d1bae515cac08 ] According to the SD spec, an SD card power reset requires the card supply voltage to drop below 0.5V and stay there for over 1ms; otherwise, when the supply is next brought back to 3.3V, the card cannot support SD3.0 mode again. To meet this requirement on the imx8qm-mek board, add a 4.8ms delay between sd power off and power on. Fixes: 307fd14d4b14 ("arm64: dts: imx: add imx8qm mek support") Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Haibo Chen <haibo.chen@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 days | arm64: dts: ti: k3-am62-lp-sk-nand: Rename pinctrls to fix schema warnings | Wadim Egorov | 1 | -1/+1
[ Upstream commit cf5e8adebe77917a4cc95e43e461cdbd857591ce ] Rename pinctrl nodes to comply with naming conventions required by pinctrl-single schema. Fixes: e569152274fec ("arm64: dts: ti: am62-lp-sk: Add overlay for NAND expansion card") Signed-off-by: Wadim Egorov <w.egorov@phytec.de> Link: https://patch.msgid.link/20251127122733.2523367-3-w.egorov@phytec.de Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 days | alpha: don't reference obsolete termio struct for TC* constants | Sam James | 1 | -4/+4
[ Upstream commit 9aeed9041929812a10a6d693af050846942a1d16 ] Similar in nature to ab107276607af90b13a5994997e19b7b9731e251. glibc-2.42 drops the legacy termio struct, but the ioctls.h header still defines some TC* constants in terms of termio (via sizeof). Hardcode the values instead. This fixes building Python for example, which falls over like: ./Modules/termios.c:1119:16: error: invalid application of 'sizeof' to incomplete type 'struct termio' Link: https://bugs.gentoo.org/961769 Link: https://bugs.gentoo.org/962600 Signed-off-by: Sam James <sam@gentoo.org> Reviewed-by: Magnus Lindholm <linmag7@gmail.com> Link: https://lore.kernel.org/r/6ebd3451908785cad53b50ca6bc46cfe9d6bc03c.1764922497.git.sam@gentoo.org Signed-off-by: Magnus Lindholm <linmag7@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 days | ARM: 9461/1: Disable HIGHPTE on PREEMPT_RT kernels | Sebastian Andrzej Siewior | 1 | -1/+1
[ Upstream commit fedadc4137234c3d00c4785eeed3e747fe9036ae ] gup_pgd_range() is invoked with disabled interrupts and invokes __kmap_local_page_prot() via pte_offset_map(), gup_p4d_range(). With HIGHPTE enabled, __kmap_local_page_prot() invokes kmap_high_get() which uses a spinlock_t via lock_kmap_any(). This leads to a sleeping-while-atomic error on PREEMPT_RT because spinlock_t becomes a sleeping lock and must not be acquired in atomic context. The loop in map_new_virtual() uses a wait_queue_head_t for wake up, which also uses a spinlock_t. Since HIGHPTE is rarely needed at all, turn it off for PREEMPT_RT to allow the use of get_user_pages_fast(). [arnd: rework patch to turn off HIGHPTE instead of HAVE_FAST_GUP] Co-developed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 days | csky: fix csky_cmpxchg_fixup not working | Yang Li | 1 | -2/+2
[ Upstream commit 809ef03d6d21d5fea016bbf6babeec462e37e68c ] In the csky_cmpxchg_fixup function, it is incorrect to use the global variable csky_cmpxchg_stw to determine the address where the exception occurred. The global variable csky_cmpxchg_stw stores the opcode at the time of the exception, while &csky_cmpxchg_stw gives the address where the exception occurred. Signed-off-by: Yang Li <yang.li85200@gmail.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
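A sketch of the corrected comparison (the surrounding fixup plumbing and the csky_cmpxchg_ldw retry label are assumed):

    extern unsigned long csky_cmpxchg_ldw, csky_cmpxchg_stw;

    /* Match the trapping PC against the *address* of the marked store,
     * not against the opcode value stored at that address. */
    if (regs->pc == (unsigned long)&csky_cmpxchg_stw)
            regs->pc = (unsigned long)&csky_cmpxchg_ldw;  /* retry from the load */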
9 days | arm64: Fix cleared E0POE bit after cpu_suspend()/resume() | Yeoreum Yun | 2 | -1/+9
commit bdf3f4176092df5281877cacf42f843063b4784d upstream. TCR2_ELx.E0POE is set during smp_init(). However, this bit is not reprogrammed when the CPU enters suspension and later resumes via cpu_resume(), as __cpu_setup() does not re-enable E0POE and there is no save/restore logic for the TCR2_ELx system register. As a result, the E0POE feature no longer works after cpu_resume(). To address this, save and restore TCR2_EL1 in the cpu_suspend()/cpu_resume() path, rather than adding related logic to __cpu_setup(), taking into account possible future extensions of the TCR2_ELx feature. Fixes: bf83dae90fbc ("arm64: enable the Permission Overlay Extension for EL0") Cc: <stable@vger.kernel.org> # 6.12.x Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
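Conceptually the fix extends the suspend/resume context pair; a C-level sketch (the real save/restore lives in the cpu_do_suspend/cpu_do_resume paths, and the gating condition and context slot name are illustrative):

    if (system_supports_poe()) {            /* gate is illustrative */
            /* on suspend */
            ctx->tcr2_el1 = read_sysreg_s(SYS_TCR2_EL1);
            /* on resume */
            write_sysreg_s(ctx->tcr2_el1, SYS_TCR2_EL1);
    }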
2026-01-08 | powerpc/pseries/cmm: adjust BALLOON_MIGRATE when migrating pages | David Hildenbrand | 1 | -0/+1
[ Upstream commit 0da2ba35c0d532ca0fe7af698b17d74c4d084b9a ] Let's properly adjust BALLOON_MIGRATE like the other drivers. Note that the INFLATE/DEFLATE events are triggered from the core when enqueueing/dequeueing pages. This was found by code inspection. Link: https://lkml.kernel.org/r/20251021100606.148294-3-david@redhat.com Fixes: fe030c9b85e6 ("powerpc/pseries/cmm: Implement balloon compaction") Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | mm/balloon_compaction: convert balloon_page_delete() to balloon_page_finalize() | David Hildenbrand | 1 | -1/+1
[ Upstream commit 15504b1163007bbfbd9a63460d5c14737c16e96d ] Let's move the removal of the page from the balloon list into the single caller, to remove the dependency on the PG_isolated flag and clarify locking requirements. Note that for now, balloon_page_delete() was used on two paths: (1) Removing a page from the balloon for deflation through balloon_page_list_dequeue() (2) Removing an isolated page from the balloon for migration in the per-driver migration handlers. Isolated pages were already removed from the balloon list during isolation. So instead of relying on the flag, we can just distinguish both cases directly and handle it accordingly in the caller. We'll shuffle the operations a bit such that they logically make more sense (e.g., remove from the list before clearing flags). In balloon migration functions we can now move the balloon_page_finalize() out of the balloon lock and perform the finalization just before dropping the balloon reference. Document that the page lock is currently required when modifying the movability aspects of a page; hopefully we can soon decouple this from the page lock. Link: https://lkml.kernel.org/r/20250704102524.326966-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Brendan Jackman <jackmanb@google.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Brauner <brauner@kernel.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Gregory Price <gourry@gourry.net> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 0da2ba35c0d5 ("powerpc/pseries/cmm: adjust BALLOON_MIGRATE when migrating pages") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | x86/microcode/AMD: Select which microcode patch to load | Borislav Petkov (AMD) | 1 | -37/+67
commit 8d171045069c804e5ffaa18be590c42c6af0cf3f upstream. All microcode patches up to the proper BIOS Entrysign fix are loaded only after the sha256 signature carried in the driver has been verified. Microcode patches issued after the Entrysign fix has been applied do not need that signature verification anymore. In order to not abandon machines which haven't received the BIOS update yet, add the capability to select which microcode patch to load. The corresponding microcode container supplied through firmware-linux has been modified to carry two patches per CPU type (family/model/stepping) so that the proper one gets selected. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Waiman Long <longman@redhat.com> Link: https://patch.msgid.link/20251027133818.4363-1-bp@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | ARM: dts: microchip: sama7g5: fix uart fifo size to 32 | Nicolas Ferre | 1 | -2/+2
[ Upstream commit 5654889a94b0de5ad6ceae3793e7f5e0b61b50b6 ] On some flexcom nodes related to uart, the fifo sizes were wrong: fix them to 32 data entries. Fixes: 7540629e2fc7 ("ARM: dts: at91: add sama7g5 SoC DT and sama7g5-ek") Cc: stable@vger.kernel.org # 5.15+ Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20251114103313.20220-2-nicolas.ferre@microchip.com Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | lib/crypto: riscv/chacha: Avoid s0/fp register | Vivian Wang | 1 | -3/+2
commit 43169328c7b4623b54b7713ec68479cebda5465f upstream. In chacha_zvkb, avoid using the s0 register, which is the frame pointer, by reallocating KEY0 to t5. This makes stack traces available if e.g. a crash happens in chacha_zvkb. No frame pointer maintenance is otherwise required since this is a leaf function. Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Fixes: bb54668837a0 ("crypto: riscv - add vector crypto accelerated ChaCha20") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20251202-riscv-chacha_zvkb-fp-v2-1-7bd00098c9dc@iscas.ac.cn Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | LoongArch: BPF: Sign extend kfunc call arguments | Hengqi Chen | 2 | -0/+42
commit 3f5a238f24d7b75f9efe324d3539ad388f58536e upstream. The kfunc calls are native calls so they should follow LoongArch calling conventions. Sign-extend their arguments properly to avoid a kernel panic. This is done by adding a new emit_abi_ext() helper. The emit_abi_ext() helper performs the extension in place, i.e. on a value already stored in the target register (note: this is different from the existing sign_extend() helper, and thus we can't reuse it). Cc: stable@vger.kernel.org Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
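On LoongArch a 32-bit value is sign-extended in place with "addi.w rd, rj, 0"; a sketch of what the new helper has to emit per 32-bit argument (the emitter spelling is assumed):

    /* sign-extend the low 32 bits of 'reg' in place, per the native ABI */
    static void emit_abi_ext(struct jit_ctx *ctx, int reg)
    {
            emit_insn(ctx, addiw, reg, reg, 0);
    }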
2026-01-08 | LoongArch: BPF: Zero-extend bpf_tail_call() index | Hengqi Chen | 1 | -0/+2
commit eb71f5c433e1c6dff089b315881dec40a88a7baf upstream. The bpf_tail_call() index should be treated as a u32 value. Let's zero-extend it to avoid calling wrong BPF progs. See similar fixes for x86 [1]) and arm64 ([2]) for more details. [1]: https://github.com/torvalds/linux/commit/90caccdd8cc0215705f18b92771b449b01e2474a [2]: https://github.com/torvalds/linux/commit/16338a9b3ac30740d49f5dfed81bac0ffa53b9c7 Cc: stable@vger.kernel.org Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
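Conceptually the JIT must compute index = (u32)index before the bounds check; on LoongArch that is a single "bstrpick.d rd, rj, 31, 0". A sketch (register and helper spellings assumed):

    /* a2 holds the tail-call index: clear bits 63:32 before comparing
     * against max_entries and the tail-call count */
    emit_zext_32(ctx, LOONGARCH_GPR_A2, true);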
2026-01-08 | LoongArch: Refactor register restoration in ftrace_common_return | Chenghao Duan | 1 | -4/+10
commit 45cb47c628dfbd1994c619f3eac271a780602826 upstream. Refactor the register restoration sequence in the ftrace_common_return function to clearly distinguish between the logic of normal returns and direct call returns in function tracing scenarios. The logic is as follows: 1. In the case of a normal return, the execution flow returns to the traced function, and ftrace must ensure that the register data is consistent with the state when the function was entered. ra = parent return address; t0 = traced function return address. 2. In the case of a direct call return, the execution flow jumps to the custom trampoline function, and ftrace must ensure that the register data is consistent with the state when ftrace was entered. ra = traced function return address; t0 = parent return address. Cc: stable@vger.kernel.org Fixes: 9cdc3b6a299c ("LoongArch: ftrace: Add direct call support") Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | x86/microcode/AMD: Fix Entrysign revision check for Zen5/Strix Halo | Rong Zhang | 1 | -1/+1
commit 150b1b97e27513535dcd3795d5ecd28e61b6cb8c upstream. Zen5 also contains family 1Ah, models 70h-7Fh, which are mistakenly missing from cpu_has_entrysign(). Add the missing range. Fixes: 8a9fb5129e8e ("x86/microcode/AMD: Limit Entrysign signature checking to known generations") Signed-off-by: Rong Zhang <i@rong.moe> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@kernel.org Link: https://patch.msgid.link/20251229182245.152747-1-i@rong.moe Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | LoongArch: Use unsigned long for _end and _text | Tiezhu Yang | 1 | -2/+2
commit a258a3cb1895e3acf5f2fe245d17426e894bc935 upstream. It is better to use unsigned long rather than long for _end and _text to calculate the kernel length. Cc: stable@vger.kernel.org # v6.3+ Fixes: e5f02b51fa0c ("LoongArch: Add support for kernel address space layout randomization (KASLR)") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
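LoongArch kernel virtual addresses have the top bit set, so a (long) cast makes both symbols negative; computing the length in unsigned arithmetic keeps every intermediate well-defined:

    /* kernel image length from the linker symbols */
    unsigned long kernel_length = (unsigned long)&_end - (unsigned long)&_text;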
2026-01-08 | LoongArch: Use __pmd()/__pte() for swap entry conversions | WangYuli | 1 | -2/+2
commit 4a71df151e703b5e7e85b33369cee59ef2665e61 upstream. The __pmd() and __pte() helper macros provide the correct initialization syntax and abstraction for the pmd_t and pte_t types. Use __pmd() to fix the following warning about __swp_entry_to_pmd() with gcc-15 under specific configs [1]: In file included from ./include/linux/pgtable.h:6, from ./include/linux/mm.h:31, from ./include/linux/pagemap.h:8, from arch/loongarch/mm/init.c:14: ./include/linux/swapops.h: In function ‘swp_entry_to_pmd’: ./arch/loongarch/include/asm/pgtable.h:302:34: error: missing braces around initializer [-Werror=missing-braces] 302 | #define __swp_entry_to_pmd(x) ((pmd_t) { (x).val | _PAGE_HUGE }) | ^ ./include/linux/swapops.h:559:16: note: in expansion of macro ‘__swp_entry_to_pmd’ 559 | return __swp_entry_to_pmd(arch_entry); | ^~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Also update __swp_entry_to_pte() to use __pte() for consistency. [1]. https://download.01.org/0day-ci/archive/20251119/202511190316.luI90kAo-lkp@intel.com/config Cc: stable@vger.kernel.org Signed-off-by: Yuli Wang <wangyl5933@chinaunicom.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
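The resulting definitions, reconstructed from the warning text above:

    #define __swp_entry_to_pte(x)   __pte((x).val)
    #define __swp_entry_to_pmd(x)   __pmd((x).val | _PAGE_HUGE)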
2026-01-08 | LoongArch: Fix build errors for CONFIG_RANDSTRUCT | Huacai Chen | 1 | -2/+2
commit 3c250aecef62da81deb38ac6738ac0a88d91f1fc upstream. When CONFIG_RANDSTRUCT is enabled, members of task_struct are randomized. There is a chance that TASK_STACK_CANARY falls outside the 12-bit immediate range and causes build errors. TASK_STACK_CANARY is naturally aligned, so fix it by replacing ld.d/st.d with ldptr.d/stptr.d, which have 14-bit immediates. Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202511240656.0NaPcJs1-lkp@intel.com/ Suggested-by: Rui Wang <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | LoongArch: Correct the calculation logic of thread_count | Qiang Ma | 1 | -1/+7
commit 1de0ae21f136efa6c5d8a4d3e07b7d1ca39c750f upstream. For thread_count, the current calculation method has a maximum of 255, which may not be sufficient in the future. Therefore, we are correcting it now. Reference: SMBIOS Specification, 7.5 Processor Information (Type 4)[1] [1]: https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.9.0.pdf Cc: stable@vger.kernel.org Signed-off-by: Qiang Ma <maqianga@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
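Per SMBIOS 3.x, the Type 4 Thread Count byte saturates: a value of 0FFh means the real count is in the 16-bit Thread Count 2 field. A sketch of the corrected logic (the struct field spellings are assumed):

    if (t->thread_count == 0xff && smbios_version >= 0x0300)
            threads = t->thread_count2;   /* 16-bit field, SMBIOS >= 3.0 */
    else
            threads = t->thread_count;    /* legacy 8-bit field */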
2026-01-08 | LoongArch: Add new PCI ID for pci_fixup_vgadev() | Huacai Chen | 1 | -0/+2
commit bf3fa8f232a1eec8d7b88dcd9e925e60f04f018d upstream. Loongson-2K3000 has a new PCI ID (0x7a46) for its display controller. Add it for pci_fixup_vgadev(), since we prefer a discrete graphics card as the default boot device if present. Cc: stable@vger.kernel.org Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
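The registration presumably mirrors the existing Loongson display controller entries; a sketch using the standard fixup macro:

    /* prefer a discrete VGA device over the 0x7a46 display controller */
    DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_LOONGSON, 0x7a46,
                                  PCI_CLASS_DISPLAY_VGA, 8, pci_fixup_vgadev);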
2026-01-08 | powerpc/pseries/cmm: call balloon_devinfo_init() also without CONFIG_BALLOON_COMPACTION | David Hildenbrand | 1 | -1/+1
commit fc6bcf9ac4de76f5e7bcd020b3c0a86faff3f2d5 upstream. Patch series "powerpc/pseries/cmm: two smaller fixes". Two smaller fixes identified while doing a bigger rework. This patch (of 2): We always have to initialize the balloon_dev_info, even when compaction is not configured in: otherwise the containing list and the lock are left uninitialized. Likely not many such configs exist in practice, but let's CC stable to be sure. This was found by code inspection. Link: https://lkml.kernel.org/r/20251021100606.148294-1-david@redhat.com Link: https://lkml.kernel.org/r/20251021100606.148294-2-david@redhat.com Fixes: fe030c9b85e6 ("powerpc/pseries/cmm: Implement balloon compaction") Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | perf/x86/amd/uncore: Fix the return value of amd_uncore_df_event_init() on error | Sandipan Das | 1 | -4/+1
commit 01439286514ce9d13b8123f8ec3717d7135ff1d6 upstream. If amd_uncore_event_init() fails, return an error irrespective of the pmu_version. Setting hwc->config should be safe even if there is an error so use this opportunity to simplify the code. Closes: https://lore.kernel.org/all/aTaI0ci3vZ44lmBn@stanley.mountain/ Fixes: d6389d3ccc13 ("perf/x86/amd/uncore: Refactor uncore management") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/076935e23a70335d33bd6e23308b75ae0ad35ba2.1765268667.git.sandipan.das@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
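The described simplification, sketched (the event mask name is assumed):

    static int amd_uncore_df_event_init(struct perf_event *event)
    {
            struct hw_perf_event *hwc = &event->hw;

            /* safe to set even if the init below fails */
            hwc->config = event->attr.config & AMD64_RAW_EVENT_MASK_NB;

            /* propagate errors regardless of pmu_version */
            return amd_uncore_event_init(event);
    }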
2026-01-08 | parisc: entry: set W bit for !compat tasks in syscall_restore_rfi() | Sven Schnelle | 2 | -1/+6
commit 5fb1d3ce3e74a4530042795e1e065422295f1371 upstream. When the kernel leaves to userspace via syscall_restore_rfi(), the W bit is not set in the new PSW. This doesn't cause any problems because there's no 64 bit userspace for parisc. Simple static binaries are usually loaded at addresses way below the 32 bit limit so the W bit doesn't matter. Fix this by setting the W bit when TIF_32BIT is not set. Signed-off-by: Sven Schnelle <svens@stackframe.org> Cc: stable@vger.kernel.org Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | parisc: entry.S: fix space adjustment on interruption for 64-bit userspace | Sven Schnelle | 1 | -3/+8
commit 1aa4524c0c1b54842c4c0a370171d11b12d0709b upstream. In wide mode, the IASQ registers contain the upper part of the GVA during interruption. This needs to be reversed before the space is used - otherwise it contains parts of the IAOQ. See "Processing Resources / Interruption Instruction Address Queues" on page 2-13 of the Parisc 2.0 Architecture Manual for an explanation. The IAOQ/IASQ space_adjust was skipped for interruptions other than itlb misses. However, the code in handle_interruption() checks whether iasq[0] contains a valid space. Due to the bits not being masked out, this match failed and the process was killed. Also add space_adjust for IAOQ1/IASQ1 so ptregs contains sane values. Signed-off-by: Sven Schnelle <svens@stackframe.org> Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | powerpc/64s/slb: Fix SLB multihit issue during SLB preload | Donet Tom | 5 | -98/+0
commit 00312419f0863964625d6dcda8183f96849412c6 upstream. On systems using the hash MMU, there is a software SLB preload cache that mirrors the entries loaded into the hardware SLB buffer. This preload cache is subject to periodic eviction — typically after every 256 context switches — to remove old entries. To optimize performance, the kernel skips switch_mmu_context() in switch_mm_irqs_off() when the prev and next mm_struct are the same. However, on hash MMU systems, this can lead to inconsistencies between the hardware SLB and the software preload cache. If an SLB entry for a process is evicted from the software cache on one CPU, and the same process later runs on another CPU without executing switch_mmu_context(), the hardware SLB may retain stale entries. If the kernel then attempts to reload that entry, it can trigger an SLB multi-hit error. The following timeline shows how stale SLB entries are created and can cause a multi-hit error when a process moves between CPUs without an MMU context switch. CPU 0 CPU 1 ----- ----- Process P exec swapper/1 load_elf_binary begin_new_exc activate_mm switch_mm_irqs_off switch_mmu_context switch_slb /* * This invalidates all * the entries in the HW * and setup the new HW * SLB entries as per the * preload cache. */ context_switch sched_migrate_task migrates process P to cpu-1 Process swapper/0 context switch (to process P) (uses mm_struct of Process P) switch_mm_irqs_off() switch_slb load_slb++ /* * load_slb becomes 0 here * and we evict an entry from * the preload cache with * preload_age(). We still * keep HW SLB and preload * cache in sync, that is * because all HW SLB entries * anyways gets evicted in * switch_slb during SLBIA. * We then only add those * entries back in HW SLB, * which are currently * present in preload_cache * (after eviction). */ load_elf_binary continues... setup_new_exec() slb_setup_new_exec() sched_switch event sched_migrate_task migrates process P to cpu-0 context_switch from swapper/0 to Process P switch_mm_irqs_off() /* * Since both prev and next mm struct are same we don't call * switch_mmu_context(). This will cause the HW SLB and SW preload * cache to go out of sync in preload_new_slb_context. Because there * was an SLB entry which was evicted from both HW and preload cache * on cpu-1. Now later in preload_new_slb_context(), when we will try * to add the same preload entry again, we will add this to the SW * preload cache and then will add it to the HW SLB. Since on cpu-0 * this entry was never invalidated, hence adding this entry to the HW * SLB will cause a SLB multi-hit error. */ load_elf_binary continues... START_THREAD start_thread preload_new_slb_context /* * This tries to add a new EA to preload cache which was earlier * evicted from both cpu-1 HW SLB and preload cache. This caused the * HW SLB of cpu-0 to go out of sync with the SW preload cache. The * reason for this was, that when we context switched back on CPU-0, * we should have ideally called switch_mmu_context() which will * bring the HW SLB entries on CPU-0 in sync with SW preload cache * entries by setting up the mmu context properly. But we didn't do * that since the prev mm_struct running on cpu-0 was same as the * next mm_struct (which is true for swapper / kernel threads). So * now when we try to add this new entry into the HW SLB of cpu-0, * we hit a SLB multi-hit error. */
WARNING: CPU: 0 PID: 1810970 at arch/powerpc/mm/book3s64/slb.c:62 assert_slb_presence+0x2c/0x50 Modules linked in: CPU: 0 UID: 0 PID: 1810970 Comm: dd Not tainted 6.16.0-rc3-dirty #12 VOLUNTARY Hardware name: IBM pSeries (emulated by qemu) POWER8 (architected) 0x4d0200 0xf000004 of:SLOF,HEAD hv:linux,kvm pSeries NIP: c00000000015426c LR: c0000000001543b4 CTR: 0000000000000000 REGS: c0000000497c77e0 TRAP: 0700 Not tainted (6.16.0-rc3-dirty) MSR: 8000000002823033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE> CR: 28888482 XER: 00000000 CFAR: c0000000001543b0 IRQMASK: 3 <...> NIP [c00000000015426c] assert_slb_presence+0x2c/0x50 LR [c0000000001543b4] slb_insert_entry+0x124/0x390 Call Trace: 0x7fffceb5ffff (unreliable) preload_new_slb_context+0x100/0x1a0 start_thread+0x26c/0x420 load_elf_binary+0x1b04/0x1c40 bprm_execve+0x358/0x680 do_execveat_common+0x1f8/0x240 sys_execve+0x58/0x70 system_call_exception+0x114/0x300 system_call_common+0x160/0x2c4
From the above analysis, during early exec the hardware SLB is cleared, and entries from the software preload cache are reloaded into hardware by switch_slb. However, preload_new_slb_context and slb_setup_new_exec also attempt to load some of the same entries, which can trigger a multi-hit. In most cases, these additional preloads simply hit existing entries and add nothing new. Removing these functions avoids redundant preloads and eliminates the multi-hit issue. This patch removes these two functions. We tested process switching performance using the context_switch benchmark on POWER9/hash, and observed no regression. Without this patch: 129041 ops/sec With this patch: 129341 ops/sec We also measured SLB faults during boot, and the counts are essentially the same with and without this patch. SLB faults without this patch: 19727 SLB faults with this patch: 19786 Fixes: 5434ae74629a ("powerpc/64s/hash: Add a SLB preload cache") cc: stable@vger.kernel.org Suggested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/0ac694ae683494fe8cadbd911a1a5018d5d3c541.1761834163.git.ritesh.list@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | powerpc, mm: Fix mprotect on book3s 32-bit | Dave Vasilevsky | 2 | -1/+13
commit 78fc63ffa7813e33681839bb33826c24195f0eb7 upstream. On 32-bit book3s with hash-MMUs, tlb_flush() was a no-op. This was unnoticed because all uses until recently were for unmaps, and thus handled by __tlb_remove_tlb_entry(). After commit 4a18419f71cd ("mm/mprotect: use mmu_gather") in kernel 5.19, tlb_gather_mmu() started being used for mprotect as well. This caused mprotect to simply not work on these machines:

    int *ptr = mmap(NULL, 4096, PROT_READ|PROT_WRITE,
                    MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    *ptr = 1;                        // force HPTE to be created
    mprotect(ptr, 4096, PROT_READ);
    *ptr = 2;                        // should segfault, but succeeds

Fixed by making tlb_flush() actually flush TLB pages. This finally agrees with the behaviour of book3s64's tlb_flush(). Fixes: 4a18419f71cd ("mm/mprotect: use mmu_gather") Cc: stable@vger.kernel.org Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Dave Vasilevsky <dave@vasilevsky.ca> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20251116-vasi-mprotect-g3-v3-1-59a9bd33ba00@vasilevsky.ca Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | arm64: dts: ti: k3-j721e-sk: Fix pinmux for pin Y1 used by power regulator | Siddharth Vadapalli | 1 | -6/+6
commit 51f89c488f2ecc020f82bfedd77482584ce8027a upstream. The SoC pin Y1 is incorrectly defined in the WKUP Pinmux device-tree node (pinctrl@4301c000) leading to the following silent failure: pinctrl-single 4301c000.pinctrl: mux offset out of range: 0x1dc (0x178) According to the datasheet for the J721E SoC [0], the pin Y1 belongs to the MAIN Pinmux device-tree node (pinctrl@11c000). This is confirmed by the address of the pinmux register for it on page 142 of the datasheet which is 0x00011C1DC. Hence fix it. [0]: https://www.ti.com/lit/ds/symlink/tda4vm.pdf Fixes: 97b67cc102dc ("arm64: dts: ti: k3-j721e-sk: Add DT nodes for power regulators") Cc: stable@vger.kernel.org Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Reviewed-by: Yemike Abhilash Chandra <y-abhilashchandra@ti.com> Link: https://patch.msgid.link/20251119160148.2752616-1-s-vadapalli@ti.com Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | x86/msi: Make irq_retrigger() functional for posted MSI | Thomas Gleixner | 2 | -0/+30
[ Upstream commit 0edc78b82bea85e1b2165d8e870a5c3535919695 ] Luigi reported that retriggering a posted MSI interrupt does not work correctly. The reason is that the retrigger happens at the vector domain by sending an IPI to the actual vector on the target CPU. That works correctly exactly once because the posted MSI interrupt chip does not issue an EOI as that's only required for the posted MSI notification vector itself. As a consequence the vector becomes stale in the ISR, which not only affects this vector but also any lower priority vector in the affected APIC because the ISR bit is not cleared. Luigi proposed to set the vector in the remap PIR bitmap and raise the posted MSI notification vector. That works, but that still does not cure a related problem: If there is ever a stray interrupt on such a vector, then the related APIC ISR bit becomes stale due to the lack of EOI as described above. Unlikely to happen, but if it happens it's not debuggable at all. So instead of playing games with the PIR, this can be actually solved for both cases by: 1) Keeping track of the posted interrupt vector handler state 2) Implementing a posted MSI specific irq_ack() callback which checks that state. If the posted vector handler is inactive it issues an EOI, otherwise it delegates that to the posted handler. This is correct versus affinity changes and concurrent events on the posted vector as the actual handler invocation is serialized through the interrupt descriptor lock. Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs") Reported-by: Luigi Rizzo <lrizzo@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Luigi Rizzo <lrizzo@google.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251125214631.044440658@linutronix.de Closes: https://lore.kernel.org/lkml/20251124104836.3685533-1-lrizzo@google.com [ DEFINE_PER_CPU_CACHE_HOT => DEFINE_PER_CPU ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
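A sketch of the described irq_ack() split (the per-CPU flag name is illustrative; the notification vector handler would set and clear the flag around dispatching the posted interrupts):

    static DEFINE_PER_CPU(bool, posted_msi_handler_active);

    static void posted_msi_irq_ack(struct irq_data *d)
    {
            /* Inside the notification handler, the handler's own EOI for
             * the notification vector suffices; otherwise EOI here so a
             * retriggered or stray vector does not leave its ISR bit set. */
            if (!__this_cpu_read(posted_msi_handler_active))
                    apic_eoi();
    }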
2026-01-08 | ARM: dts: microchip: sama5d2: fix spi flexcom fifo size to 32 | Nicolas Ferre | 1 | -5/+5
commit 7d5864dc5d5ea6a35983dd05295fb17f2f2f44ce upstream. Unlike standalone spi peripherals, on sama5d2 the flexcom spi instances have a fifo size of 32 data entries. Fix the flexcom/spi nodes where this property is wrong. Fixes: 6b9a3584c7ed ("ARM: dts: at91: sama5d2: Add missing flexcom definitions") Cc: stable@vger.kernel.org # 5.8+ Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20251114140225.30372-1-nicolas.ferre@microchip.com Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | arm64: Revamp HCR_EL2.E2H RES1 detection | Marc Zyngier | 1 | -6/+32
[ Upstream commit ca88ecdce5f51874a7c151809bd2c936ee0d3805 ] We currently have two ways to identify CPUs that only implement FEAT_VHE and not FEAT_E2H0: - either they advertise it via ID_AA64MMFR4_EL1.E2H0, - or the HCR_EL2.E2H bit is RAO/WI However, there is a third category of "cpus" that fall between these two cases: on CPUs that do not implement FEAT_FGT, it is IMPDEF whether an access to ID_AA64MMFR4_EL1 can trap to EL2 when the register value is zero. A consequence of this is that on systems such as Neoverse V2, an NV guest cannot reliably detect that it is in a VHE-only configuration (E2H is writable, and ID_AA64MMFR4_EL1 is 0), despite the hypervisor's best effort to repaint the id register. Replace the RAO/WI test by a sequence that makes use of the VHE register remapping between EL1 and EL2 to detect this situation, and work out whether we get the VHE behaviour even after having set HCR_EL2.E2H to 0. This solves the NV problem, and provides a more reliable acid test for CPUs that do not completely follow the letter of the architecture while providing a RES1 behaviour for HCR_EL2.E2H. Suggested-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Tested-by: Jan Kotas <jank@cadence.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/15A85F2B-1A0C-4FA7-9FE4-EEC2203CC09E@global.cadence.com Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu() | Ahmed Genidi | 4 | -8/+5
[ Upstream commit 3855a7b91d42ebf3513b7ccffc44807274978b3d ] When KVM is in protected mode, host calls to PSCI are proxied via EL2, and cold entries from CPU_ON, CPU_SUSPEND, and SYSTEM_SUSPEND bounce through __kvm_hyp_init_cpu() at EL2 before entering the host kernel's entry point at EL1. While __kvm_hyp_init_cpu() initializes SPSR_EL2 for the exception return to EL1, it does not initialize SCTLR_EL1. Due to this, it's possible to enter EL1 with SCTLR_EL1 in an UNKNOWN state. In practice this has been seen to result in kernel crashes after CPU_ON as a result of SCTLR_EL1.M being 1 in violation of the initial core configuration specified by PSCI. Fix this by initializing SCTLR_EL1 for cold entry to the host kernel. As it's necessary to write to SCTLR_EL12 in VHE mode, this initialization is moved into __kvm_host_psci_cpu_entry() where we can use write_sysreg_el1(). The remnants of the '__init_el2_nvhe_prepare_eret' macro are folded into its only caller, as this is clearer than having the macro. Fixes: cdf367192766ad11 ("KVM: arm64: Intercept host's CPU_ON SMCs") Reported-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ahmed Genidi <ahmed.genidi@arm.com> [ Mark: clarify commit message, handle E2H, move to C, remove macro ] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ahmed Genidi <ahmed.genidi@arm.com> Cc: Ben Horgan <ben.horgan@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Reviewed-by: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/20250227180526.1204723-3-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | KVM: arm64: Initialize HCR_EL2.E2H early | Mark Rutland | 3 | -19/+34
[ Upstream commit 7a68b55ff39b0a1638acb1694c185d49f6077a0d ] On CPUs without FEAT_E2H0, HCR_EL2.E2H is RES1, but may reset to an UNKNOWN value out of reset and consequently may not read as 1 unless it has been explicitly initialized. We handled this for the head.S boot code in commits: 3944382fa6f22b54 ("arm64: Treat HCR_EL2.E2H as RES1 when ID_AA64MMFR4_EL1.E2H0 is negative") b3320142f3db9b3f ("arm64: Fix early handling of FEAT_E2H0 not being implemented") Unfortunately, we forgot to apply a similar fix to the KVM PSCI entry points used when relaying CPU_ON, CPU_SUSPEND, and SYSTEM SUSPEND. When KVM is entered via these entry points, the value of HCR_EL2.E2H may be consumed before it has been initialized (e.g. by the 'init_el2_state' macro). Initialize HCR_EL2.E2H early in these paths such that it can be consumed reliably. The existing code in head.S is factored out into a new 'init_el2_hcr' macro, and this is used in the __kvm_hyp_init_cpu() function common to all the relevant PSCI entry points. For clarity, I've tweaked the assembly used to check whether ID_AA64MMFR4_EL1.E2H0 is negative. The bitfield is extracted as a signed value, and this is checked with a signed-greater-or-equal (GE) comparison. As the hyp code will reconfigure HCR_EL2 later in ___kvm_hyp_init(), all bits other than E2H are initialized to zero in __kvm_hyp_init_cpu(). Fixes: 3944382fa6f22b54 ("arm64: Treat HCR_EL2.E2H as RES1 when ID_AA64MMFR4_EL1.E2H0 is negative") Fixes: b3320142f3db9b3f ("arm64: Fix early handling of FEAT_E2H0 not being implemented") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ahmed Genidi <ahmed.genidi@arm.com> Cc: Ben Horgan <ben.horgan@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250227180526.1204723-2-mark.rutland@arm.com [maz: fixed LT->GE thinko] Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | s390/ipl: Clear SBP flag when bootprog is set | Sven Schnelle | 2 | -12/+37
commit b1aa01d31249bd116b18c7f512d3e46b4b4ad83b upstream. With z16 a new flag 'search boot program' was introduced for list-directed IPL (SCSI, NVMe, ECKD DASD). If this flag is set, e.g. via selecting the "Automatic" value for the "Boot program selector" control on an HMC load panel, it is copied to the reipl structure from the initial ipl structure. When a user now sets a boot prog via sysfs, the flag is not cleared and the bootloader will again automatically select the boot program, ignoring user configuration. To avoid that, clear the SBP flag when a bootprog sysfs file is written. Cc: stable@vger.kernel.org Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | powerpc/kexec: Enable SMT before waking offline CPUs | Nysal Jan K.A. | 1 | -0/+19
commit c2296a1e42418556efbeb5636c4fa6aa6106713a upstream. If SMT is disabled or a partial SMT state is enabled, when a new kernel image is loaded for kexec, on reboot the following warning is observed: kexec: Waking offline cpu 228. WARNING: CPU: 0 PID: 9062 at arch/powerpc/kexec/core_64.c:223 kexec_prepare_cpus+0x1b0/0x1bc [snip] NIP kexec_prepare_cpus+0x1b0/0x1bc LR kexec_prepare_cpus+0x1a0/0x1bc Call Trace: kexec_prepare_cpus+0x1a0/0x1bc (unreliable) default_machine_kexec+0x160/0x19c machine_kexec+0x80/0x88 kernel_kexec+0xd0/0x118 __do_sys_reboot+0x210/0x2c4 system_call_exception+0x124/0x320 system_call_vectored_common+0x15c/0x2ec This occurs as add_cpu() fails due to cpu_bootable() returning false for CPUs that fail the cpu_smt_thread_allowed() check or non primary threads if SMT is disabled. Fix the issue by enabling SMT and resetting the number of SMT threads to the number of threads per core, before attempting to wake up all present CPUs. Fixes: 38253464bc82 ("cpu/SMT: Create topology_smt_thread_allowed()") Reported-by: Sachin P Bappalige <sachinpb@linux.ibm.com> Cc: stable@vger.kernel.org # v6.6+ Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com> Signed-off-by: Nysal Jan K.A. <nysal@linux.ibm.com> Tested-by: Samir M <samir@linux.ibm.com> Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20251028105516.26258-1-nysal@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | KVM: nSVM: Clear exit_code_hi in VMCB when synthesizing nested VM-Exits | Sean Christopherson | 2 | -3/+6
commit da01f64e7470988f8607776aa7afa924208863fb upstream. Explicitly clear exit_code_hi in the VMCB when synthesizing "normal" nested VM-Exits, as the full exit code is a 64-bit value (spoiler alert), and all exit codes for non-failing VMRUN use only bits 31:0. Cc: Jim Mattson <jmattson@google.com> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Cc: stable@vger.kernel.org Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251113225621.1688428-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08 | KVM: nSVM: Set exit_code_hi to -1 when synthesizing SVM_EXIT_ERR (failed VMRUN) | Sean Christopherson | 1 | -2/+2
commit f402ecd7a8b6446547076f4bd24bd5d4dcc94481 upstream. Set exit_code_hi to -1u as a temporary band-aid to fix a long-standing (effectively since KVM's inception) bug where KVM treats the exit code as a 32-bit value, when in reality it's a 64-bit value. Per the APM, offset 0x70 is a single 64-bit value: 070h 63:0 EXITCODE And a sane reading of the error values defined in "Table C-1. SVM Intercept Codes" is that negative values use the full 64 bits:

    -1  VMEXIT_INVALID        Invalid guest state in VMCB
    -2  VMEXIT_BUSY           BUSY bit was set in the VMSA
    -3  VMEXIT_IDLE_REQUIRED  The sibling thread is not in an idle state
    -4  VMEXIT_INVALID_PMC    Invalid PMC state

And that interpretation is confirmed by testing on Milan and Turin (by setting bits in CR0[63:32] to generate VMEXIT_INVALID on VMRUN). Furthermore, Xen has treated exitcode as a 64-bit value since HVM support was added in 2006 (see Xen commit d1bd157fbc ("Big merge the HVM full-virtualisation abstractions.")). Cc: Jim Mattson <jmattson@google.com> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Cc: stable@vger.kernel.org Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251113225621.1688428-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
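Taken together, the two nSVM changes keep the 64-bit EXITCODE field coherent; a sketch against the VMCB layout, where exit_code and exit_code_hi are the two 32-bit halves of offset 070h:

    /* normal synthesized nested exit: codes fit in bits 31:0 */
    vmcb12->control.exit_code    = exit_code;
    vmcb12->control.exit_code_hi = 0;

    /* failed VMRUN: VMEXIT_INVALID is the 64-bit value -1 */
    vmcb12->control.exit_code    = SVM_EXIT_ERR;   /* low half: -1u */
    vmcb12->control.exit_code_hi = -1u;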