kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2 days	unwind_user/x86: Fix arch=um build	Peter Zijlstra	1	-0/+4
	commit aa7387e79a5cff0585cd1b9091944142a06872b6 upstream. Add CONFIG_HAVE_UNWIND_USER_FP guards to make sure this code doesn't break arch=um builds. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Closes: https://lore.kernel.org/oe-kbuild-all/202510291919.FFGyU7nq-lkp@intel.com/ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	powerpc64/bpf: do not increment tailcall count when prog is NULL	Hari Bathini	1	-9/+14
	commit 521bd39d9d28ce54cbfec7f9b89c94ad4fdb8350 upstream. Do not increment tailcall count, if tailcall did not succeed due to missing BPF program. Fixes: ce0761419fae ("powerpc/bpf: Implement support for tail calls") Cc: stable@vger.kernel.org Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20260303181031.390073-2-hbathini@linux.ibm.com [ Conflict due to missing feature commit 2ed2d8f6fb38 ("powerpc64/bpf: Support tailcalls with subprogs") resolved accordingly. ] Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	arm64: dts: imx8mn-tqma8mqnl: fix LDO5 power off	Markus Niebel	2	-6/+29
	commit 8adc841d43ebceabec996c9dcff6e82d3e585268 upstream. Fix SD card removal caused by automatic LDO5 power off after boot To prevent this, add vqmmc regulator for USDHC, using a GPIO-controlled regulator that is supplied by LDO5. Since this is implemented on SoM but used on baseboards with SD-card interface, implement the functionality on SoM part and optionally enable it on baseboards if needed. Signed-off-by: Markus Niebel <Markus.Niebel@ew.tq-group.com> Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	LoongArch: KVM: Handle the case that EIOINTC's coremap is empty	Huacai Chen	1	-1/+1
	commit b97bd69eb0f67b5f961b304d28e9ba45e202d841 upstream. EIOINTC's coremap in eiointc_update_sw_coremap() can be empty, currently we get a cpuid with -1 in this case, but we actually need 0 because it's similar as the case that cpuid >= 4. This fix an out-of-bounds access to kvm_arch::phyid_map::phys_map[]. Cc: <stable@vger.kernel.org> Fixes: 3956a52bc05bd81 ("LoongArch: KVM: Add EIOINTC read and write functions") Reported-by: Aurelien Jarno <aurel32@debian.org> Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1131431 Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	LoongArch: KVM: Make kvm_get_vcpu_by_cpuid() more robust	Huacai Chen	1	-0/+3
	commit 2db06c15d8c7a0ccb6108524e16cd9163753f354 upstream. kvm_get_vcpu_by_cpuid() takes a cpuid parameter whose type is int, so cpuid can be negative. Let kvm_get_vcpu_by_cpuid() return NULL for this case so as to make it more robust. This fix an out-of-bounds access to kvm_arch::phyid_map::phys_map[]. Cc: <stable@vger.kernel.org> Fixes: 73516e9da512adc ("LoongArch: KVM: Add vcpu mapping from physical cpuid") Reported-by: Aurelien Jarno <aurel32@debian.org> Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1131431 Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	LoongArch: Workaround LS2K/LS7A GPU DMA hang bug	Huacai Chen	1	-0/+80
	commit 95db0c9f526d583634cddb2e5914718570fbac87 upstream. 1. Hardware limitation: GPU, DC and VPU are typically PCI device 06.0, 06.1 and 06.2. They share some hardware resources, so when configure the PCI 06.0 device BAR1, DMA memory access cannot be performed through this BAR, otherwise it will cause hardware abnormalities. 2. In typical scenarios of reboot or S3/S4, DC access to memory through BAR is not prohibited, resulting in GPU DMA hangs. 3. Workaround method: When configuring the 06.0 device BAR1, turn off the memory access of DC, GPU and VPU (via DC's CRTC registers). Cc: stable@vger.kernel.org Signed-off-by: Qianhai Wu <wuqianhai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	LoongArch: vDSO: Emit GNU_EH_FRAME correctly	Xi Ruoyao	6	-10/+53
	commit e4878c37f6679fdea91b27a0f4e60a871f0b7bad upstream. With -fno-asynchronous-unwind-tables and --no-eh-frame-hdr (the default of the linker), the GNU_EH_FRAME segment (specified by vdso.lds.S) is empty. This is not valid, as the current DWARF specification mandates the first byte of the EH frame to be the version number 1. It causes some unwinders to complain, for example the ClickHouse query profiler spams the log with messages: clickhouse-server[365854]: libunwind: unsupported .eh_frame_hdr version: 127 at 7ffffffb0000 Here "127" is just the byte located at the p_vaddr (0, i.e. the beginning of the vDSO) of the empty GNU_EH_FRAME segment. Cross- checking with /proc/365854/maps has also proven 7ffffffb0000 is the start of vDSO in the process VM image. In LoongArch the -fno-asynchronous-unwind-tables option seems just a MIPS legacy, and MIPS only uses this option to satisfy the MIPS-specific "genvdso" program, per the commit cfd75c2db17e ("MIPS: VDSO: Explicitly use -fno-asynchronous-unwind-tables"). IIRC it indicates some inherent limitation of the MIPS ELF ABI and has nothing to do with LoongArch. So we can simply flip it over to -fasynchronous-unwind-tables and pass --eh-frame-hdr for linking the vDSO, allowing the profilers to unwind the stack for statistics even if the sample point is taken when the PC is in the vDSO. However simply adjusting the options above would exploit an issue: when the libgcc unwinder saw the invalid GNU_EH_FRAME segment, it silently falled back to a machine-specific routine to match the code pattern of rt_sigreturn() and extract the registers saved in the sigframe if the code pattern is matched. As unwinding from signal handlers is vital for libgcc to support pthread cancellation etc., the fall-back routine had been silently keeping the LoongArch Linux systems functioning since Linux 5.19. But when we start to emit GNU_EH_FRAME with the correct format, fall-back routine will no longer be used and libgcc will fail to unwind the sigframe, and unwinding from signal handlers will no longer work, causing dozens of glibc test failures. To make it possible to unwind from signal handlers again, it's necessary to code the unwind info in __vdso_rt_sigreturn via .cfi_* directives. The offsets in the .cfi_* directives depend on the layout of struct sigframe, notably the offset of sigcontext in the sigframe. To use the offset in the assembly file, factor out struct sigframe into a header to allow asm-offsets.c to output the offset for assembly. To work around a long-term issue in the libgcc unwinder (the pc is unconditionally substracted by 1: doing so is technically incorrect for a signal frame), a nop instruction is included with the two real instructions in __vdso_rt_sigreturn in the same FDE PC range. The same hack has been used on x86 for a long time. Cc: stable@vger.kernel.org Fixes: c6b99bed6b8f ("LoongArch: Add VDSO and VSYSCALL support") Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	LoongArch: Fix missing NULL checks for kstrdup()	Li Jun	1	-4/+3
	commit 3a28daa9b7d7c2ddf2c722e9e95d7e0928bf0cd1 upstream. 1. Replace "of_find_node_by_path("/")" with "of_root" to avoid multiple calls to "of_node_put()". 2. Fix a potential kernel oops during early boot when memory allocation fails while parsing CPU model from device tree. Cc: stable@vger.kernel.org Signed-off-by: Li Jun <lijun01@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	KVM: x86/mmu: Only WARN in direct MMUs when overwriting shadow-present SPTE	Sean Christopherson	1	-1/+2
	commit df83746075778958954aa0460cca55f4b3fc9c02 upstream. Adjust KVM's sanity check against overwriting a shadow-present SPTE with a another SPTE with a different target PFN to only apply to direct MMUs, i.e. only to MMUs without shadowed gPTEs. While it's impossible for KVM to overwrite a shadow-present SPTE in response to a guest write, writes from outside the scope of KVM, e.g. from host userspace, aren't detected by KVM's write tracking and so can break KVM's shadow paging rules. ------------[ cut here ]------------ pfn != spte_to_pfn(*sptep) WARNING: arch/x86/kvm/mmu/mmu.c:3069 at mmu_set_spte+0x1e4/0x440 [kvm], CPU#0: vmx_ept_stale_r/872 Modules linked in: kvm_intel kvm irqbypass CPU: 0 UID: 1000 PID: 872 Comm: vmx_ept_stale_r Not tainted 7.0.0-rc2-eafebd2d2ab0-sink-vm #319 PREEMPT Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:mmu_set_spte+0x1e4/0x440 [kvm] Call Trace: <TASK> ept_page_fault+0x535/0x7f0 [kvm] kvm_mmu_do_page_fault+0xee/0x1f0 [kvm] kvm_mmu_page_fault+0x8d/0x620 [kvm] vmx_handle_exit+0x18c/0x5a0 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xc55/0x1c20 [kvm] kvm_vcpu_ioctl+0x2d5/0x980 [kvm] __x64_sys_ioctl+0x8a/0xd0 do_syscall_64+0xb5/0x730 entry_SYSCALL_64_after_hwframe+0x4b/0x53 </TASK> ---[ end trace 0000000000000000 ]--- Fixes: 11d45175111d ("KVM: x86/mmu: Warn if PFN changes on shadow-present SPTE in shadow MMU") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	KVM: x86/mmu: Drop/zap existing present SPTE even when creating an MMIO SPTE	Sean Christopherson	1	-6/+8
	commit aad885e774966e97b675dfe928da164214a71605 upstream. When installing an emulated MMIO SPTE, do so after dropping/zapping the existing SPTE (if it's shadow-present). While commit a54aa15c6bda3 was right about it being impossible to convert a shadow-present SPTE to an MMIO SPTE due to a _guest_ write, it failed to account for writes to guest memory that are outside the scope of KVM. E.g. if host userspace modifies a shadowed gPTE to switch from a memslot to emulted MMIO and then the guest hits a relevant page fault, KVM will install the MMIO SPTE without first zapping the shadow-present SPTE. ------------[ cut here ]------------ is_shadow_present_pte(*sptep) WARNING: arch/x86/kvm/mmu/mmu.c:484 at mark_mmio_spte+0xb2/0xc0 [kvm], CPU#0: vmx_ept_stale_r/4292 Modules linked in: kvm_intel kvm irqbypass CPU: 0 UID: 1000 PID: 4292 Comm: vmx_ept_stale_r Not tainted 7.0.0-rc2-eafebd2d2ab0-sink-vm #319 PREEMPT Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:mark_mmio_spte+0xb2/0xc0 [kvm] Call Trace: <TASK> mmu_set_spte+0x237/0x440 [kvm] ept_page_fault+0x535/0x7f0 [kvm] kvm_mmu_do_page_fault+0xee/0x1f0 [kvm] kvm_mmu_page_fault+0x8d/0x620 [kvm] vmx_handle_exit+0x18c/0x5a0 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xc55/0x1c20 [kvm] kvm_vcpu_ioctl+0x2d5/0x980 [kvm] __x64_sys_ioctl+0x8a/0xd0 do_syscall_64+0xb5/0x730 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x47fa3f </TASK> ---[ end trace 0000000000000000 ]--- Reported-by: Alexander Bulekov <bkov@amazon.com> Debugged-by: Alexander Bulekov <bkov@amazon.com> Suggested-by: Fred Griffoul <fgriffo@amazon.co.uk> Fixes: a54aa15c6bda3 ("KVM: x86/mmu: Handle MMIO SPTEs directly in mmu_set_spte()") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	x86/fred: Fix early boot failures on SEV-ES/SNP guests	Nikunj A Dadhania	2	-0/+20
	commit 3645eb7e3915990a149460c151a00894cb586253 upstream. FRED-enabled SEV-(ES,SNP) guests fail to boot due to the following issues in the early boot sequence: * FRED does not have a #VC exception handler in the dispatch logic * Early FRED #VC exceptions attempt to use uninitialized per-CPU GHCBs instead of boot_ghcb Add X86_TRAP_VC case to fred_hwexc() with a new exc_vmm_communication() function that provides the unified entry point FRED requires, dispatching to existing user/kernel handlers based on privilege level. The function is already declared via DECLARE_IDTENTRY_VC(). Fix early GHCB access by falling back to boot_ghcb in __sev_{get,put}_ghcb() when per-CPU GHCBs are not yet initialized. Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code") Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: <stable@kernel.org> # 6.12+ Link: https://patch.msgid.link/20260318075654.1792916-4-nikunj@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	x86/cpu: Remove X86_CR4_FRED from the CR4 pinned bits mask	Borislav Petkov (AMD)	1	-1/+1
	commit 411df123c017169922cc767affce76282b8e6c85 upstream. Commit in Fixes added the FRED CR4 bit to the CR4 pinned bits mask so that whenever something else modifies CR4, that bit remains set. Which in itself is a perfectly fine idea. However, there's an issue when during boot FRED is initialized: first on the BSP and later on the APs. Thus, there's a window in time when exceptions cannot be handled. This becomes particularly nasty when running as SEV-{ES,SNP} or TDX guests which, when they manage to trigger exceptions during that short window described above, triple fault due to FRED MSRs not being set up yet. See Link tag below for a much more detailed explanation of the situation. So, as a result, the commit in that Link URL tried to address this shortcoming by temporarily disabling CR4 pinning when an AP is not online yet. However, that is a problem in itself because in this case, an attack on the kernel needs to only modify the online bit - a single bit in RW memory - and then disable CR4 pinning and then disable SM*P, leading to more and worse things to happen to the system. So, instead, remove the FRED bit from the CR4 pinning mask, thus obviating the need to temporarily disable CR4 pinning. If someone manages to disable FRED when poking at CR4, then idt_invalidate() would make sure the system would crash'n'burn on the first exception triggered, which is a much better outcome security-wise. Fixes: ff45746fbf00 ("x86/cpu: Add X86_CR4_FRED macro") Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> # 6.12+ Link: https://lore.kernel.org/r/177385987098.1647592.3381141860481415647.tip-bot2@tip-bot2 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	x86/cpu: Enable FSGSBASE early in cpu_init_exception_handling()	Nikunj A Dadhania	1	-6/+12
	commit 05243d490bb7852a8acca7b5b5658019c7797a52 upstream. Move FSGSBASE enablement from identify_cpu() to cpu_init_exception_handling() to ensure it is enabled before any exceptions can occur on both boot and secondary CPUs. == Background == Exception entry code (paranoid_entry()) uses ALTERNATIVE patching based on X86_FEATURE_FSGSBASE to decide whether to use RDGSBASE/WRGSBASE instructions or the slower RDMSR/SWAPGS sequence for saving/restoring GSBASE. On boot CPU, ALTERNATIVE patching happens after enabling FSGSBASE in CR4. When the feature is available, the code is permanently patched to use RDGSBASE/WRGSBASE, which require CR4.FSGSBASE=1 to execute without triggering == Boot Sequence == Boot CPU (with CR pinning enabled): trap_init() cpu_init() <- Uses unpatched code (RDMSR/SWAPGS) x2apic_setup() ... arch_cpu_finalize_init() identify_boot_cpu() identify_cpu() cr4_set_bits(X86_CR4_FSGSBASE) # Enables the feature # This becomes part of cr4_pinned_bits ... alternative_instructions() <- Patches code to use RDGSBASE/WRGSBASE Secondary CPUs (with CR pinning enabled): start_secondary() cr4_init() <- Code already patched, CR4.FSGSBASE=1 set implicitly via cr4_pinned_bits cpu_init() <- exceptions work because FSGSBASE is already enabled Secondary CPU (with CR pinning disabled): start_secondary() cr4_init() <- Code already patched, CR4.FSGSBASE=0 cpu_init() x2apic_setup() rdmsrq(MSR_IA32_APICBASE) <- Triggers #VC in SNP guests exc_vmm_communication() paranoid_entry() <- Uses RDGSBASE with CR4.FSGSBASE=0 (patched code) ... ap_starting() identify_secondary_cpu() identify_cpu() cr4_set_bits(X86_CR4_FSGSBASE) <- Enables the feature, which is too late == CR Pinning == Currently, for secondary CPUs, CR4.FSGSBASE is set implicitly through CR-pinning: the boot CPU sets it during identify_cpu(), it becomes part of cr4_pinned_bits, and cr4_init() applies those pinned bits to secondary CPUs. This works but creates an undocumented dependency between cr4_init() and the pinning mechanism. == Problem == Secondary CPUs boot after alternatives have been applied globally. They execute already-patched paranoid_entry() code that uses RDGSBASE/WRGSBASE instructions, which require CR4.FSGSBASE=1. Upcoming changes to CR pinning behavior will break the implicit dependency, causing secondary CPUs to generate #UD. This issue manifests itself on AMD SEV-SNP guests, where the rdmsrq() in x2apic_setup() triggers a #VC exception early during cpu_init(). The #VC handler (exc_vmm_communication()) executes the patched paranoid_entry() path. Without CR4.FSGSBASE enabled, RDGSBASE instructions trigger #UD. == Fix == Enable FSGSBASE explicitly in cpu_init_exception_handling() before loading exception handlers. This makes the dependency explicit and ensures both boot and secondary CPUs have FSGSBASE enabled before paranoid_entry() executes. Fixes: c82965f9e530 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit") Reported-by: Borislav Petkov <bp@alien8.de> Suggested-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Cc: <stable@kernel.org> Link: https://patch.msgid.link/20260318075654.1792916-2-nikunj@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	KVM: arm64: Discard PC update state on vcpu reset	Marc Zyngier	1	-0/+14
	commit 1744a6ef48b9a48f017e3e1a0d05de0a6978396e upstream. Our vcpu reset suffers from a particularly interesting flaw, as it does not correctly deal with state that will have an effect on the execution flow out of reset. Take the following completely random example, never seen in the wild and that never resulted in a couple of sleepless nights: /s - vcpu-A issues a PSCI_CPU_OFF using the SMC conduit - SMC being a trapped instruction (as opposed to HVC which is always normally executed), we annotate the vcpu as needing to skip the next instruction, which is the SMC itself - vcpu-A is now safely off - vcpu-B issues a PSCI_CPU_ON for vcpu-A, providing a starting PC - vcpu-A gets reset, get the new PC, and is sent on its merry way - right at the point of entering the guest, we notice that a PC increment is pending (remember the earlier SMC?) - vcpu-A skips its first instruction... What could possibly go wrong? Well, I'm glad you asked. For pKVM as a NV guest, that first instruction is extremely significant, as it indicates whether the CPU is booting or resuming. Having skipped that instruction, nothing makes any sense anymore, and CPU hotplugging fails. This is all caused by the decoupling of PC update from the handling of an exception that triggers such update, making it non-obvious what affects what when. Fix this train wreck by discarding all the PC-affecting state on vcpu reset. Fixes: f5e30680616ab ("KVM: arm64: Move __adjust_pc out of line") Cc: stable@vger.kernel.org Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://patch.msgid.link/20260312140850.822968-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	s390/entry: Scrub r12 register on kernel entry	Vasily Gorbik	1	-0/+3
	commit 0738d395aab8fae3b5a3ad3fc640630c91693c27 upstream. Before commit f33f2d4c7c80 ("s390/bp: remove TIF_ISOLATE_BP"), all entry handlers loaded r12 with the current task pointer (lg %r12,__LC_CURRENT) for use by the BPENTER/BPEXIT macros. That commit removed TIF_ISOLATE_BP, dropping both the branch prediction macros and the r12 load, but did not add r12 to the register clearing sequence. Add the missing xgr %r12,%r12 to make the register scrub consistent across all entry points. Fixes: f33f2d4c7c80 ("s390/bp: remove TIF_ISOLATE_BP") Cc: stable@kernel.org Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	s390/barrier: Make array_index_mask_nospec() __always_inline	Vasily Gorbik	1	-2/+2
	commit c5c0a268b38adffbb2e70e6957017537ff54c157 upstream. Mark array_index_mask_nospec() as __always_inline to guarantee the mitigation is emitted inline regardless of compiler inlining decisions. Fixes: e2dd833389cc ("s390: add optimized array_index_mask_nospec") Cc: stable@kernel.org Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	s390/syscalls: Add spectre boundary for syscall dispatch table	Greg Kroah-Hartman	1	-1/+4
	commit 48b8814e25d073dd84daf990a879a820bad2bcbd upstream. The s390 syscall number is directly controlled by userspace, but does not have an array_index_nospec() boundary to prevent access past the syscall function pointer tables. Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Fixes: 56e62a737028 ("s390: convert to generic entry") Cc: stable@kernel.org Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/2026032404-sterling-swoosh-43e6@gregkh Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	x86/efi: efi_unmap_boot_services: fix calculation of ranges_to_free size	Mike Rapoport (Microsoft)	1	-1/+1
	[ Upstream commit 217c0a5c177a3d4f7c8497950cbf5c36756e8bbb ] ranges_to_free array should have enough room to store the entire EFI memmap plus an extra element for NULL entry. The calculation of this array size wrongly adds 1 to the overall size instead of adding 1 to the number of elements. Add parentheses to properly size the array. Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: a4b0bf6a40f3 ("x86/efi: defer freeing of boot services memory") Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	powerpc64/ftrace: fix OOL stub count with clang	Hari Bathini	1	-2/+2
	[ Upstream commit 875612a7745013a43c67493cb0583ee3f7476344 ] The total number of out-of-line (OOL) stubs required for function tracing is determined using the following command: $(OBJDUMP) -r -j __patchable_function_entries vmlinux.o While this works correctly with GNU objdump, llvm-objdump does not list the expected relocation records for this section. Fix this by using the -d option and counting R_PPC64_ADDR64 relocation entries. This works as desired with both objdump and llvm-objdump. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20260127084926.34497-3-hbathini@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	sh: platform_early: remove pdev->driver_override check	Danilo Krummrich	1	-4/+0
	[ Upstream commit c5f60e3f07b6609562d21efda878e83ce8860728 ] In commit 507fd01d5333 ("drivers: move the early platform device support to arch/sh") platform_match() was copied over to the sh platform_early code, accidentally including the driver_override check. This check does not make sense for platform_early, as sysfs is not even available in first place at this point in the boot process, hence remove the check. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Fixes: 507fd01d5333 ("drivers: move the early platform device support to arch/sh") Link: https://lore.kernel.org/all/DH4M3DJ4P58T.1BGVAVXN71Z09@kernel.org/ Signed-off-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	s390/mm: Add missing secure storage access fixups for donated memory	Janosch Frank	1	-2/+9
	[ Upstream commit b00be77302d7ec4ad0367bb236494fce7172b730 ] There are special cases where secure storage access exceptions happen in a kernel context for pages that don't have the PG_arch_1 bit set. That bit is set for non-exported guest secure storage (memory) but is absent on storage donated to the Ultravisor since the kernel isn't allowed to export donated pages. Prior to this patch we would try to export the page by calling arch_make_folio_accessible() which would instantly return since the arch bit is absent signifying that the page was already exported and no further action is necessary. This leads to secure storage access exception loops which can never be resolved. With this patch we unconditionally try to export and if that fails we fixup. Fixes: 084ea4d611a3 ("s390/mm: add (non)secure page access exceptions handlers") Reported-by: Heiko Carstens <hca@linux.ibm.com> Suggested-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Tested-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	x86/perf: Make sure to program the counter value for stopped events on migration	Peter Zijlstra	1	-1/+3
	[ Upstream commit f1cac6ac62d28a9a57b17f51ac5795bf250c12d3 ] Both Mi Dapeng and Ian Rogers noted that not everything that sets HES_STOPPED is required to EF_UPDATE. Specifically the 'step 1' loop of rescheduling explicitly does EF_UPDATE to ensure the counter value is read. However, then 'step 2' simply leaves the new counter uninitialized when HES_STOPPED, even though, as noted above, the thing that stopped them might not be aware it needs to EF_RELOAD -- since it didn't EF_UPDATE on stop. One such location that is affected is throttling, throttle does pmu->stop(, 0); and unthrottle does pmu->start(, 0); possibly restarting an uninitialized counter. Fixes: a4eaf7f14675 ("perf: Rework the PMU methods") Reported-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Reported-by: Ian Rogers <irogers@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://patch.msgid.link/20260311204035.GX606826@noisy.programming.kicks-ass.net Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	arm64: realm: Fix PTE_NS_SHARED for 52bit PA support	Suzuki K Poulose	1	-1/+2
	[ Upstream commit 8c6e9b60f5c7985a9fe41320556a92d7a33451df ] With LPA/LPA2, the top bits of the PFN (Bits[51:48]) end up in the lower bits of the PTE. So, simply creating a mask of the "top IPA bit" doesn't work well for these configurations to set the "top" bit at the output of Stage1 translation. Fix this by using the __phys_to_pte_val() to do the right thing for all configurations. Tested using, kvmtool, placing the memory at a higher address (-m <size>@<Addr>). e.g: # lkvm run --realm -c 4 -m 512M@@128T -k Image --console serial sh-5.0# dmesg \| grep "LPA2\\|RSI" [ 0.000000] RME: Using RSI version 1.0 [ 0.000000] CPU features: detected: 52-bit Virtual Addressing (LPA2) [ 0.777354] CPU features: detected: 52-bit Virtual Addressing for KVM (LPA2) Fixes: 399306954996 ("arm64: realm: Query IPA size from the RMM") Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Steven Price <steven.price@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	x86/platform/uv: Handle deconfigured sockets	Kyle Meyer	1	-2/+16
	commit 1f6aa5bbf1d0f81a8a2aafc16136e7dd9a609ff3 upstream. When a socket is deconfigured, it's mapped to SOCK_EMPTY (0xffff). This causes a panic while allocating UV hub info structures. Fix this by using NUMA_NO_NODE, allowing UV hub info structures to be allocated on valid nodes. Fixes: 8a50c5851927 ("x86/platform/uv: UV support for sub-NUMA clustering") Signed-off-by: Kyle Meyer <kyle.meyer@hpe.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/ab2BmGL0ehVkkjKk@hpe.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 days	perf/x86: Move event pointer setup earlier in x86_pmu_enable()	Breno Leitao	1	-1/+2
	commit 8d5fae6011260de209aaf231120e8146b14bc8e0 upstream. A production AMD EPYC system crashed with a NULL pointer dereference in the PMU NMI handler: BUG: kernel NULL pointer dereference, address: 0000000000000198 RIP: x86_perf_event_update+0xc/0xa0 Call Trace: <NMI> amd_pmu_v2_handle_irq+0x1a6/0x390 perf_event_nmi_handler+0x24/0x40 The faulting instruction is `cmpq $0x0, 0x198(%rdi)` with RDI=0, corresponding to the `if (unlikely(!hwc->event_base))` check in x86_perf_event_update() where hwc = &event->hw and event is NULL. drgn inspection of the vmcore on CPU 106 showed a mismatch between cpuc->active_mask and cpuc->events[]: active_mask: 0x1e (bits 1, 2, 3, 4) events[1]: 0xff1100136cbd4f38 (valid) events[2]: 0x0 (NULL, but active_mask bit 2 set) events[3]: 0xff1100076fd2cf38 (valid) events[4]: 0xff1100079e990a90 (valid) The event that should occupy events[2] was found in event_list[2] with hw.idx=2 and hw.state=0x0, confirming x86_pmu_start() had run (which clears hw.state and sets active_mask) but events[2] was never populated. Another event (event_list[0]) had hw.state=0x7 (STOPPED\|UPTODATE\|ARCH), showing it was stopped when the PMU rescheduled events, confirming the throttle-then-reschedule sequence occurred. The root cause is commit 7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss") which moved the cpuc->events[idx] assignment out of x86_pmu_start() and into step 2 of x86_pmu_enable(), after the PERF_HES_ARCH check. This broke any path that calls pmu->start() without going through x86_pmu_enable() -- specifically the unthrottle path: perf_adjust_freq_unthr_events() -> perf_event_unthrottle_group() -> perf_event_unthrottle() -> event->pmu->start(event, 0) -> x86_pmu_start() // sets active_mask but not events[] The race sequence is: 1. A group of perf events overflows, triggering group throttle via perf_event_throttle_group(). All events are stopped: active_mask bits cleared, events[] preserved (x86_pmu_stop no longer clears events[] after commit 7e772a93eb61). 2. While still throttled (PERF_HES_STOPPED), x86_pmu_enable() runs due to other scheduling activity. Stopped events that need to move counters get PERF_HES_ARCH set and events[old_idx] cleared. In step 2 of x86_pmu_enable(), PERF_HES_ARCH causes these events to be skipped -- events[new_idx] is never set. 3. The timer tick unthrottles the group via pmu->start(). Since commit 7e772a93eb61 removed the events[] assignment from x86_pmu_start(), active_mask[new_idx] is set but events[new_idx] remains NULL. 4. A PMC overflow NMI fires. The handler iterates active counters, finds active_mask[2] set, reads events[2] which is NULL, and crashes dereferencing it. Move the cpuc->events[hwc->idx] assignment in x86_pmu_enable() to before the PERF_HES_ARCH check, so that events[] is populated even for events that are not immediately started. This ensures the unthrottle path via pmu->start() always finds a valid event pointer. Fixes: 7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260310-perf-v2-1-4a3156fce43c@debian.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 days	perf/x86/intel: Add missing branch counters constraint apply	Dapeng Mi	1	-10/+21
	commit 1d07bbd7ea36ea0b8dfa8068dbe67eb3a32d9590 upstream. When running the command: 'perf record -e "{instructions,instructions:p}" -j any,counter sleep 1', a "shift-out-of-bounds" warning is reported on CWF. UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15 shift exponent 64 is too large for 64-bit type 'long long unsigned int' ...... intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7 intel_pmu_lbr_save_brstack+0xc0/0x4c0 setup_arch_pebs_sample_data+0x114b/0x2400 The warning occurs because the second "instructions:p" event, which involves branch counters sampling, is incorrectly programmed to fixed counter 0 instead of the general-purpose (GP) counters 0-3 that support branch counters sampling. Currently only GP counters 0-3 support branch counters sampling on CWF, any event involving branch counters sampling should be programed on GP counters 0-3. Since the counter index of fixed counter 0 is 32, it leads to the "src" value in below code is right shifted 64 bits and trigger the "shift-out-of-bounds" warning. cnt = (src >> (order[j] * LBR_INFO_BR_CNTR_BITS)) & LBR_INFO_BR_CNTR_MASK; The root cause is the loss of the branch counters constraint for the new event in the branch counters sampling event group. Since it isn't yet part of the sibling list. This results in the second "instructions:p" event being programmed on fixed counter 0 incorrectly instead of the appropriate GP counters 0-3. To address this, we apply the missing branch counters constraint for the last event in the group. Additionally, we introduce a new function, `intel_set_branch_counter_constr()`, to apply the branch counters constraint and avoid code duplication. Fixes: 33744916196b ("perf/x86/intel: Support branch counters logging") Reported-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260228053320.140406-2-dapeng1.mi@linux.intel.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 days	arm64: dts: renesas: rzg3s-smarc-som: Set bypass for Versa3 PLL2	Claudiu Beznea	1	-1/+1
	[ Upstream commit 6dcbb6f070cccabc6a13d640a5a84de581fdd761 ] The default settings for the Versa3 device on the Renesas RZ/G3S SMARC SoM board have PLL2 disabled. PLL2 was later enabled together with audio support, as it is required to support both 44.1 kHz and 48 kHz audio. With PLL2 enabled, it was observed that Linux occasionally either hangs during boot (the last log message being related to the I2C probe) or randomly crashes. This was mainly reproducible on cold boots. During debugging, it was also noticed that the Unicode replacement character (�) sometimes appears on the serial console. Further investigation traced this to the configuration applied through the Versa3 register at offset 0x1c, which controls PLL enablement. The appearance of the Unicode replacement character suggested an issue with the SoC reference clock. The RZ/G3S reference clock is provided by the Versa3 clock generator (REF output). After checking with the Renesas Versa3 hardware team, it was found that this is related to the PLL2 lock bit being set through the renesas,settings DT property. The PLL lock bit must be set to avoid unstable clock output from the PLL. However, due to the Versa3 hardware design, when a PLL lock bit is set, all outputs (including the REF clock) are temporarily disabled until the configured PLLs become stable. As an alternative, the bypass bit can be used. This does not interrupt the PLL2 output or any other Versa3 outputs, but it may result in temporary instability on PLL2 output while the configuration is applied. Since PLL2 feeds only the audio path and audio is not used during early boot, this is acceptable and does not affect system boot. Drop the PLL2 lock bit and set the bypass bit instead. This has been tested with more than 1000 cold boots. Fixes: a94253232b04 ("arm64: dts: renesas: rzg3s-smarc-som: Add versa3 clock generator node") Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20260302135703.162601-1-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	arm64: dts: renesas: r9a09g087: Fix CPG register region sizes	Lad Prabhakar	1	-2/+2
	[ Upstream commit f459672cf3ffd3c062973838951418271aa2ceef ] The CPG register regions were incorrectly sized. Update them to match the actual hardware specification: - First region (0x80280000): 0x1000 -> 0x10000 (64kiB) - Second region (0x81280000): 0x9000 -> 0x10000 (64kiB) Fixes: 4b3d31f0b81fe ("arm64: dts: renesas: Add initial SoC DTSI for the RZ/N2H SoC") Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20260213131742.3606334-3-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	arm64: dts: renesas: r9a09g077: Fix CPG register region sizes	Lad Prabhakar	1	-2/+2
	[ Upstream commit b12985ceca18bcf67f176883175d544daad5e00e ] The CPG register regions were incorrectly sized. Update them to match the actual hardware specification: - First region (0x80280000): 0x1000 -> 0x10000 (64kiB) - Second region (0x81280000): 0x9000 -> 0x10000 (64kiB) Fixes: d17b34744f5e4 ("arm64: dts: renesas: Add initial support for the Renesas RZ/T2H SoC") Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20260213131742.3606334-2-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	arm64: dts: renesas: r9a09g057: Remove wdt{0,2,3} nodes	Fabrizio Castro	1	-30/+0
	[ Upstream commit a3f34651de4287138c0da19ba321ad72622b4af3 ] The HW user manual for the Renesas RZ/V2H(P) SoC (a.k.a r9a09g057) states that only WDT1 is supposed to be accessed by the CA55 cores. WDT0 is supposed to be used by the CM33 core, WDT2 is supposed to be used by the CR8 core 0, and WDT3 is supposed to be used by the CR8 core 1. Remove wdt{0,2,3} from the SoC specific device tree to make it compliant with the specification from the HW manual. This change is harmless as there are currently no users of the wdt{0,2,3} device tree nodes, only the wdt1 node is actually used. Fixes: 095105496e7d ("arm64: dts: renesas: r9a09g057: Add WDT0-WDT3 nodes") Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20260203124247.7320-3-fabrizio.castro.jz@renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	arm64: dts: renesas: r9a09g057: Add RTC node	Ovidiu Panait	1	-0/+15
	[ Upstream commit cfc733da4e79018f88d8ac5f3a5306abbba8ef89 ] Add RTC node to Renesas RZ/V2H ("R9A09G057") SoC DTSI. Signed-off-by: Ovidiu Panait <ovidiu.panait.rb@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20251107210706.45044-4-ovidiu.panait.rb@renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Stable-dep-of: a3f34651de42 ("arm64: dts: renesas: r9a09g057: Remove wdt{0,2,3} nodes") Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	arm64: dts: renesas: rzv2-evk-cn15-sd: Add ramp delay for SD0 regulator	Lad Prabhakar	1	-0/+1
	[ Upstream commit 5c03465ecf6a56b7b261df9594f0e10612f53a50 ] Set an appropriate ramp delay for the SD0 I/O voltage regulator in the CN15 SD overlay to make UHS-I voltage switching reliable during card initialization. This issue was observed on the RZ/V2H EVK, while the same UHS-I cards worked on the RZ/V2N EVK without problems. Adding the ramp delay makes the behavior consistent and avoids SD init timeouts. Before this change SD0 could fail with: mmc0: error -110 whilst initialising SD card With the delay in place UHS-I cards enumerate correctly: mmc0: new UHS-I speed SDR104 SDXC card at address aaaa mmcblk0: mmc0:aaaa SR64G 59.5 GiB mmcblk0: p1 Fixes: 3d6c2bc7629c8 ("arm64: dts: renesas: Add CN15 eMMC and SD overlays for RZ/V2H and RZ/V2N EVKs") Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20260123225957.1007089-5-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	arm64: dts: renesas: rzt2h-n2h-evk: Add ramp delay for SD0 card regulator	Lad Prabhakar	1	-0/+1
	[ Upstream commit bb70589b67039e491dd60cf71272884e926a0f95 ] Add a ramp delay of 60 uV/us to the vqmmc_sdhi0 voltage regulator to fix UHS-I SD card detection failures. Measurements on CN78 pin 4 showed the actual voltage ramp time to be 21.86ms when switching between 3.3V and 1.8V. A 25ms ramp delay has been configured to provide adequate margin. The calculation is based on the voltage delta of 1.5V (3.3V - 1.8V): 1500000 uV / 60 uV/us = 25000 us (25ms) Prior to this patch, UHS-I cards failed to initialize with: mmc0: error -110 whilst initialising SD card After this patch, UHS-I cards are properly detected on SD0: mmc0: new UHS-I speed SDR104 SDXC card at address aaaa mmcblk0: mmc0:aaaa SR64G 59.5 GiB Fixes: d065453e5ee09 ("arm64: dts: renesas: rzt2h-rzn2h-evk: Enable SD card slot") Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20260123225957.1007089-2-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	LoongArch: Check return values for set_memory_{rw,rox}	Tiezhu Yang	1	-2/+13
	[ Upstream commit 431ce839dad66d0d56fb604785452c6a57409f35 ] set_memory_rw() and set_memory_rox() may fail, so we should check the return values and return immediately in larch_insn_text_copy(). Cc: stable@vger.kernel.org Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> [ kept `stop_machine()` instead of `stop_machine_cpuslocked()` ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 days	parisc: Flush correct cache in cacheflush() syscall	Helge Deller	1	-2/+2
	commit 2c98a8fbd6aa647414c6248dacf254ebe91c79ad upstream. The assembly flush instructions were swapped for I- and D-cache flags: SYSCALL_DEFINE3(cacheflush, ...) { if (cache & DCACHE) { "fic ...\n" } if (cache & ICACHE && error == 0) { "fdc ...\n" } Fix it by using fdc for DCACHE, and fic for ICACHE flushing. Reported-by: Felix Lechner <felix.lechner@lease-up.com> Fixes: c6d96328fecd ("parisc: Add cacheflush() syscall") Cc: <stable@vger.kernel.org> # v6.5+ Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 days	LoongArch: No need to flush icache if text copy failed	Tiezhu Yang	1	-2/+4
	commit d3b8491961207ac967795c34375890407fd51a45 upstream. If copy_to_kernel_nofault() failed, no need to flush icache and just return immediately. Cc: stable@vger.kernel.org Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 days	LoongArch: Give more information if kmem access failed	Tiezhu Yang	1	-2/+12
	commit a47f0754bdd01f971c9715acdbdd3a07515c8f83 upstream. If memory access such as copy_{from, to}_kernel_nofault() failed, its users do not know what happened, so it is very useful to print the exception code for such cases. Furthermore, it is better to print the caller function to know where is the entry. Here are the low level call chains: copy_from_kernel_nofault() copy_from_kernel_nofault_loop() __get_kernel_nofault() copy_to_kernel_nofault() copy_to_kernel_nofault_loop() __put_kernel_nofault() Cc: stable@vger.kernel.org Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	KVM: SVM: Set/clear CR8 write interception when AVIC is (de)activated	Sean Christopherson	2	-5/+9
	[ Upstream commit 87d0f901a9bd8ae6be57249c737f20ac0cace93d ] Explicitly set/clear CR8 write interception when AVIC is (de)activated to fix a bug where KVM leaves the interception enabled after AVIC is activated. E.g. if KVM emulates INIT=>WFS while AVIC is deactivated, CR8 will remain intercepted in perpetuity. On its own, the dangling CR8 intercept is "just" a performance issue, but combined with the TPR sync bug fixed by commit d02e48830e3f ("KVM: SVM: Sync TPR from LAPIC into VMCB::V_TPR even if AVIC is active"), the danging intercept is fatal to Windows guests as the TPR seen by hardware gets wildly out of sync with reality. Note, VMX isn't affected by the bug as TPR_THRESHOLD is explicitly ignored when Virtual Interrupt Delivery is enabled, i.e. when APICv is active in KVM's world. I.e. there's no need to trigger update_cr8_intercept(), this is firmly an SVM implementation flaw/detail. WARN if KVM gets a CR8 write #VMEXIT while AVIC is active, as KVM should never enter the guest with AVIC enabled and CR8 writes intercepted. Fixes: 3bbf3565f48c ("svm: Do not intercept CR8 when enable AVIC") Cc: stable@vger.kernel.org Cc: Jim Mattson <jmattson@google.com> Cc: Naveen N Rao (AMD) <naveen@kernel.org> Cc: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Naveen N Rao (AMD) <naveen@kernel.org> Reviewed-by: Jim Mattson <jmattson@google.com> Link: https://patch.msgid.link/20260203190711.458413-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> [Squash fix to avic_deactivate_vmcb. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	KVM: SVM: Add a helper to look up the max physical ID for AVIC	Naveen N Rao	1	-6/+20
	[ Upstream commit f2f6e67a56dc88fea7e9b10c4e79bb01d97386b7 ] To help with a future change, add a helper to look up the maximum physical ID depending on the vCPU AVIC mode. No functional change intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Naveen N Rao (AMD) <naveen@kernel.org> Link: https://lore.kernel.org/r/0ab9bf5e20a3463a4aa3a5ea9bbbac66beedf1d1.1757009416.git.naveen@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Stable-dep-of: 87d0f901a9bd ("KVM: SVM: Set/clear CR8 write interception when AVIC is (de)activated") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	KVM: SVM: Limit AVIC physical max index based on configured max_vcpu_ids	Naveen N Rao	1	-2/+5
	[ Upstream commit 574ef752d4aea04134bc121294d717f4422c2755 ] KVM allows VMMs to specify the maximum possible APIC ID for a virtual machine through KVM_CAP_MAX_VCPU_ID capability so as to limit data structures related to APIC/x2APIC. Utilize the same to set the AVIC physical max index in the VMCB, similar to VMX. This helps hardware limit the number of entries to be scanned in the physical APIC ID table speeding up IPI broadcasts for virtual machines with smaller number of vCPUs. Unlike VMX, SVM AVIC requires a single page to be allocated for the Physical APIC ID table and the Logical APIC ID table, so retain the existing approach of allocating those during VM init. Signed-off-by: Naveen N Rao (AMD) <naveen@kernel.org> Link: https://lore.kernel.org/r/adb07ccdb3394cd79cb372ba6bcc69a4e4d4ef54.1757009416.git.naveen@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Stable-dep-of: 87d0f901a9bd ("KVM: SVM: Set/clear CR8 write interception when AVIC is (de)activated") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	KVM: arm64: Eagerly init vgic dist/redist on vgic creation	Marc Zyngier	1	-16/+16
	[ Upstream commit ac6769c8f948dff33265c50e524aebf9aa6f1be0 ] If vgic_allocate_private_irqs_locked() fails for any odd reason, we exit kvm_vgic_create() early, leaving dist->rd_regions uninitialised. kvm_vgic_dist_destroy() then comes along and walks into the weeds trying to free the RDs. Got to love this stuff. Solve it by moving all the static initialisation early, and make sure that if we fail halfway, we're in a reasonable shape to perform the rest of the teardown. While at it, reset the vgic model on failure, just in case... Reported-by: syzbot+f6a46b038fc243ac0175@syzkaller.appspotmail.com Tested-by: syzbot+f6a46b038fc243ac0175@syzkaller.appspotmail.com Fixes: b3aa9283c0c50 ("KVM: arm64: vgic: Hoist SGI/PPI alloc from vgic_init() to kvm_create_vgic()") Link: https://lore.kernel.org/r/69a2d58c.050a0220.3a55be.003b.GAE@google.com Link: https://patch.msgid.link/20260228164559.936268-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	KVM: arm64: gic: Set vgic_model before initing private IRQs	Sascha Bischoff	1	-4/+4
	[ Upstream commit 9435c1e1431003e23aa34ef8e46c30d09c3dbcb5 ] Different GIC types require the private IRQs to be initialised differently. GICv5 is the culprit as it supports both a different number of private IRQs, and all of these are PPIs (there are no SGIs). Moreover, as GICv5 uses the top bits of the interrupt ID to encode the type, the intid also needs to computed differently. Up until now, the GIC model has been set after initialising the private IRQs for a VCPU. Move this earlier to ensure that the GIC model is available when configuring the private IRQs. While we're at it, also move the setting of the in_kernel flag and implementation revision to keep them grouped together as before. Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260128175919.3828384-7-sascha.bischoff@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org> Stable-dep-of: ac6769c8f948 ("KVM: arm64: Eagerly init vgic dist/redist on vgic creation") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	x86/apic: Disable x2apic on resume if the kernel expects so	Shashank Balaji	1	-0/+6
	commit 8cc7dd77a1466f0ec58c03478b2e735a5b289b96 upstream. When resuming from s2ram, firmware may re-enable x2apic mode, which may have been disabled by the kernel during boot either because it doesn't support IRQ remapping or for other reasons. This causes the kernel to continue using the xapic interface, while the hardware is in x2apic mode, which causes hangs. This happens on defconfig + bare metal + s2ram. Fix this in lapic_resume() by disabling x2apic if the kernel expects it to be disabled, i.e. when x2apic_mode = 0. The ACPI v6.6 spec, Section 16.3 [1] says firmware restores either the pre-sleep configuration or initial boot configuration for each CPU, including MSR state: When executing from the power-on reset vector as a result of waking from an S2 or S3 sleep state, the platform firmware performs only the hardware initialization required to restore the system to either the state the platform was in prior to the initial operating system boot, or to the pre-sleep configuration state. In multiprocessor systems, non-boot processors should be placed in the same state as prior to the initial operating system boot. (further ahead) If this is an S2 or S3 wake, then the platform runtime firmware restores minimum context of the system before jumping to the waking vector. This includes: CPU configuration. Platform runtime firmware restores the pre-sleep configuration or initial boot configuration of each CPU (MSR, MTRR, firmware update, SMBase, and so on). Interrupts must be disabled (for IA-32 processors, disabled by CLI instruction). (and other things) So at least as per the spec, re-enablement of x2apic by the firmware is allowed if "x2apic on" is a part of the initial boot configuration. [1] https://uefi.org/specs/ACPI/6.6/16_Waking_and_Sleeping.html#initialization [ bp: Massage. ] Fixes: 6e1cb38a2aef ("x64, x2apic/intr-remap: add x2apic support, including enabling interrupt-remapping") Co-developed-by: Rahul Bukte <rahul.bukte@sony.com> Signed-off-by: Rahul Bukte <rahul.bukte@sony.com> Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260306-x2apic-fix-v2-1-bee99c12efa3@sony.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	powerpc64/bpf: fix the address returned by bpf_get_func_ip	Hari Bathini	1	-9/+19
	commit 157820264ac3dadfafffad63184b883eb28f9ae0 upstream. bpf_get_func_ip() helper function returns the address of the traced function. It relies on the IP address stored at ctx - 16 by the bpf trampoline. On 64-bit powerpc, this address is recovered from LR accounting for OOL trampoline. But the address stored here was off by 4-bytes. Ensure the address is the actual start of the traced function. Reported-by: Abhishek Dubey <adubey@linux.ibm.com> Fixes: d243b62b7bd3 ("powerpc64/bpf: Add support for bpf trampolines") Cc: stable@vger.kernel.org Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20260303181031.390073-3-hbathini@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	powerpc64/bpf: fix kfunc call support	Hari Bathini	2	-9/+94
	commit 01b6ac72729610ae732ca2a66e3a642e23f6cd60 upstream. Commit 61688a82e047 ("powerpc/bpf: enable kfunc call") inadvertently enabled kfunc call support for 32-bit powerpc but that support will not be possible until ABI mismatch between 32-bit powerpc and eBPF is handled in 32-bit powerpc JIT code. Till then, advertise support only for 64-bit powerpc. Also, in powerpc ABI, caller needs to extend the arguments properly based on signedness. The JIT code is responsible for handling this explicitly for kfunc calls as verifier can't handle this for each architecture-specific ABI needs. But this was not taken care of while kfunc call support was enabled for powerpc. Fix it by handling this with bpf_jit_find_kfunc_model() and using zero_extend() & sign_extend() helper functions. Fixes: 61688a82e047 ("powerpc/bpf: enable kfunc call") Cc: stable@vger.kernel.org Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20260303181031.390073-7-hbathini@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	powerpc/pseries: Correct MSI allocation tracking	Nam Cao	1	-1/+1
	commit 35e4f2a17eb40288f9bcdb09549fa04a63a96279 upstream. The per-device MSI allocation calculation in pseries_irq_domain_alloc() is clearly wrong. It can still happen to work when nr_irqs is 1. Correct it. Fixes: c0215e2d72de ("powerpc/pseries: Fix MSI-X allocation failure when quota is exceeded") Cc: stable@vger.kernel.org Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> [maddy: Fixed Nilay's reviewed-by tag] Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20260302003948.1452016-1-namcao@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	s390/xor: Fix xor_xc_5() inline assembly	Vasily Gorbik	1	-1/+0
	commit 5f25805303e201f3afaff0a90f7c7ce257468704 upstream. xor_xc_5() contains a larl 1,2f that is not used by the asm and is not declared as a clobber. This can corrupt a compiler-allocated value in %r1 and lead to miscompilation. Remove the instruction. Fixes: 745600ed6965 ("s390/lib: Use exrl instead of ex in xor functions") Cc: stable@vger.kernel.org Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	s390/xor: Fix xor_xc_2() inline assembly constraints	Heiko Carstens	1	-2/+2
	commit f775276edc0c505dc0f782773796c189f31a1123 upstream. The inline assembly constraints for xor_xc_2() are incorrect. "bytes", "p1", and "p2" are input operands, while all three of them are modified within the inline assembly. Given that the function consists only of this inline assembly it seems unlikely that this may cause any problems, however fix this in any case. Fixes: 2cfc5f9ce7f5 ("s390/xor: optimized xor routing using the XC instruction") Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20260302133500.1560531-2-hca@linux.ibm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	s390/stackleak: Fix __stackleak_poison() inline assembly constraint	Heiko Carstens	1	-1/+1
	commit 674c5ff0f440a051ebf299d29a4c013133d81a65 upstream. The __stackleak_poison() inline assembly comes with a "count" operand where the "d" constraint is used. "count" is used with the exrl instruction and "d" means that the compiler may allocate any register from 0 to 15. If the compiler would allocate register 0 then the exrl instruction would not or the value of "count" into the executed instruction - resulting in a stackframe which is only partially poisoned. Use the correct "a" constraint, which excludes register 0 from register allocation. Fixes: 2a405f6bb3a5 ("s390/stackleak: provide fast __stackleak_poison() implementation") Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20260302133500.1560531-4-hca@linux.ibm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-19	parisc: Check kernel mapping earlier at bootup	Helge Deller	1	-8/+12
	commit 17c144f1104bfc29a3ce3f7d0931a1bfb7a3558c upstream. The check if the initial mapping is sufficient needs to happen much earlier during bootup. Move this test directly to the start_parisc() function and use native PDC iodc functions to print the warning, because panic() and printk() are not functional yet. This fixes boot when enabling various KALLSYSMS options which need much more space. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v6.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>