summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2026-03-04drm/xe/vm: Skip ufence association for CPU address mirror VMA during MAPHimal Prasad Ghimiray1-1/+2
[ Upstream commit 7f08cc5b3cc3bf6416f8b55bff906f67ed75637d ] The MAP operation for a CPU address mirror VMA does not require ufence association because such mappings are not GPU-synchronized and do not participate in GPU job completion signaling. Remove the unnecessary ufence addition for this case to avoid -EBUSY failure in check_ufence of unbind ops. Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251125075628.1182481-6-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04drm/xe: Covert return of -EBUSY to -ENOMEM in VM bind IOCTLMatthew Brost1-1/+10
[ Upstream commit 6028f59620927aee2e15a424004012ae05c50684 ] xe_vma_userptr_pin_pages can return -EBUSY but -EBUSY has special meaning in VM bind IOCTLs that user fence is pending that is attached to the VMA. Convert -EBUSY to -ENOMEM in this case as -EBUSY in practice means we are low or out of memory. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patch.msgid.link/20251122012502.382587-2-matthew.brost@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04drm/xe/ggtt: Use scope-based runtime pmMatt Roper1-2/+1
[ Upstream commit 8a579f4b2476fd1df07e2bca9fedc82a39a56a65 ] Switch the GGTT code to scope-based runtime PM for consistency with other parts of the driver. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patch.msgid.link/20251118164338.3572146-51-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04parisc: Prevent interrupts during rebootHelge Deller1-0/+3
[ Upstream commit 35ac5a728c878594f2ea6c43b57652a16be3c968 ] Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04x86/sev: Use kfree_sensitive() when freeing a SNP message descriptorBorislav Petkov (AMD)1-2/+1
[ Upstream commit af05e558988ed004a20fc4de7d0f80cfbba663f0 ] Use the proper helper instead of an open-coded variant. Closes: https://lore.kernel.org/r/202512202235.WHPQkLZu-lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@inria.fr> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://patch.msgid.link/20260112114147.GBaWTd-8HSy_Xp4S3X@fat_crate.local Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04soc/tegra: pmc: Fix unsafe generic_handle_irq() callPrathamesh Shete1-30/+74
[ Upstream commit e6d96073af681780820c94079b978474a8a44413 ] Currently, when resuming from system suspend on Tegra platforms, the following warning is observed: WARNING: CPU: 0 PID: 14459 at kernel/irq/irqdesc.c:666 Call trace: handle_irq_desc+0x20/0x58 (P) tegra186_pmc_wake_syscore_resume+0xe4/0x15c syscore_resume+0x3c/0xb8 suspend_devices_and_enter+0x510/0x540 pm_suspend+0x16c/0x1d8 The warning occurs because generic_handle_irq() is being called from a non-interrupt context which is considered as unsafe. Fix this warning by deferring generic_handle_irq() call to an IRQ work which gets executed in hard IRQ context where generic_handle_irq() can be called safely. When PREEMPT_RT kernels are used, regular IRQ work (initialized with init_irq_work) is deferred to run in per-CPU kthreads in preemptible context rather than hard IRQ context. Hence, use the IRQ_WORK_INIT_HARD variant so that with PREEMPT_RT kernels, the IRQ work is processed in hardirq context instead of being deferred to a thread which is required for calling generic_handle_irq(). On non-PREEMPT_RT kernels, both init_irq_work() and IRQ_WORK_INIT_HARD() execute in IRQ context, so this change has no functional impact for standard kernel configurations. Signed-off-by: Petlozu Pravareshwar <petlozup@nvidia.com> Signed-off-by: Prathamesh Shete <pshete@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> [treding@nvidia.com: miscellaneous cleanups] Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04soc: imx8m: Fix error handling for clk_prepare_enable()Peng Fan1-1/+5
[ Upstream commit f6ef3d9ff81240e9bcc030f2da132eb0f8a761d7 ] imx8m_soc_prepare() directly returns the result of clk_prepare_enable(), which skips proper cleanup if the clock enable fails. Check the return value of clk_prepare_enable() and release resources if failure. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Closes: https://lore.kernel.org/r/202601111406.ZVV3YaiU-lkp@intel.com/ Signed-off-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Marco Felsch <m.felsch@pengutronix.de> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04arm64: tegra: smaug: Add usb-role-switch supportDiogo Ivo1-0/+2
[ Upstream commit dfa93788dd8b2f9c59adf45ecf592082b1847b7b ] The USB2 port on Smaug is configured for OTG operation but lacked the required 'usb-role-switch' property, leading to a failed probe and a non-functioning USB port. Add the property along with setting the default role to host. Signed-off-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04EDAC/igen6: Add two Intel Amston Lake SoCs supportQiuxu Zhuo1-0/+4
[ Upstream commit 41ca2155d62b0b0d217f59e1bce18362d0c2446f ] Intel Amston Lake SoCs with IBECC (In-Band ECC) capability share the same IBECC registers as Alder Lake-N SoCs. Add two new compute die IDs for Amston Lake SoC products to enable EDAC support. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Jianfeng Gao <jianfeng.gao@intel.com> Link: https://patch.msgid.link/20251124065457.3630949-2-qiuxu.zhuo@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04EDAC/igen6: Add more Intel Panther Lake-H SoCs supportLili Li1-0/+20
[ Upstream commit 4c36e6106997b6ad8f4a279b4bdbca3ed6f53c6c ] Add more Intel Panther Lake-H SoC compute die IDs for EDAC support. Signed-off-by: Lili Li <lili.li@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Link: https://patch.msgid.link/20251124131537.3633983-1-qiuxu.zhuo@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04Revert "arm64: zynqmp: Add an OP-TEE node to the device tree"Tomas Melin1-5/+0
[ Upstream commit c197179990124f991fca220d97fac56779a02c6d ] This reverts commit 06d22ed6b6635b17551f386b50bb5aaff9b75fbe. OP-TEE logic in U-Boot automatically injects a reserved-memory node along with optee firmware node to kernel device tree. The injection logic is dependent on that there is no manually defined optee node. Having the node in zynqmp.dtsi effectively breaks OP-TEE's insertion of the reserved-memory node, causing memory access violations during runtime. Signed-off-by: Tomas Melin <tomas.melin@vaisala.com> Signed-off-by: Michal Simek <michal.simek@amd.com> Link: https://lore.kernel.org/r/20251125-revert-zynqmp-optee-v1-1-d2ce4c0fcaf6@vaisala.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04firmware: arm_ffa: Unmap Rx/Tx buffers on init failureHaoxiang Li1-0/+1
[ Upstream commit 9fda364cb78c8b9e1abe4029f877300c94655742 ] ffa_init() maps the Rx/Tx buffers via ffa_rxtx_map() but on the partition setup failure path it never unmaps them. Add the missing ffa_rxtx_unmap() call in the error path so that the Rx/Tx buffers are properly released before freeing the backing pages. Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn> Message-Id: <20251210031656.56194-1-lihaoxiang@isrc.iscas.ac.cn> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04pstore: ram_core: fix incorrect success return when vmap() failsRuipeng Qi1-0/+7
[ Upstream commit 05363abc7625cf18c96e67f50673cd07f11da5e9 ] In persistent_ram_vmap(), vmap() may return NULL on failure. If offset is non-zero, adding offset_in_page(start) causes the function to return a non-NULL pointer even though the mapping failed. persistent_ram_buffer_map() therefore incorrectly returns success. Subsequent access to prz->buffer may dereference an invalid address and cause crashes. Add proper NULL checking for vmap() failures. Signed-off-by: Ruipeng Qi <ruipengqi3@gmail.com> Link: https://patch.msgid.link/20260203020358.3315299-1-ruipengqi3@gmail.com Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04sched/debug: Fix updating of ppos on server write opsJoel Fernandes1-3/+4
[ Upstream commit 6080fb211672aec6ce8f2f5a2e0b4eae736f2027 ] Updating "ppos" on error conditions does not make much sense. The pattern is to return the error code directly without modifying the position, or modify the position on success and return the number of bytes written. Since on success, the return value of apply is 0, there is no point in modifying ppos either. Fix it by removing all this and just returning error code or number of bytes written on success. Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Juri Lelli <juri.lelli@redhat.com> Reviewed-by: Andrea Righi <arighi@nvidia.com> Acked-by: Tejun Heo <tj@kernel.org> Tested-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/20260126100050.3854740-3-arighi@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04char: tpm: cr50: Remove IRQF_ONESHOTSebastian Andrzej Siewior2-3/+2
[ Upstream commit 1affd29ffbd50125a5492c6be1dbb1f04be18d4f ] Passing IRQF_ONESHOT ensures that the interrupt source is masked until the secondary (threaded) handler is done. If only a primary handler is used then the flag makes no sense because the interrupt can not fire (again) while its handler is running. The flag also prevents force-threading of the primary handler and the irq-core will warn about this. Remove IRQF_ONESHOT from irqflags. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://patch.msgid.link/20260128095540.863589-10-bigeasy@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04mailbox: bcm-ferxrm-mailbox: Use default primary handlerSebastian Andrzej Siewior1-12/+2
[ Upstream commit 03843d95a4a4e0ba22ad4fcda65ccf21822b104c ] request_threaded_irq() is invoked with a primary and a secondary handler and no flags are passed. The primary handler is the same as irq_default_primary_handler() so there is no need to have an identical copy. The lack of the IRQF_ONESHOT flag can be dangerous because the interrupt source is not masked while the threaded handler is active. This means, especially on LEVEL typed interrupt lines, the interrupt can fire again before the threaded handler had a chance to run. Use the default primary interrupt handler by specifying NULL and set IRQF_ONESHOT so the interrupt source is masked until the secondary handler is done. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260128095540.863589-5-bigeasy@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04crypto: hisilicon/qm - move the barrier before writing to the mailbox registerChenghai Huang1-1/+5
[ Upstream commit ebf35d8f9368816c930f5d70783a72716fab5e19 ] Before sending the data via the mailbox to the hardware, to ensure that the data accessed by the hardware is the most up-to-date, a write barrier should be added before writing to the mailbox register. The current memory barrier is placed after writing to the register, the barrier order should be modified to be before writing to the register. Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04bpftool: Fix dependencies for static buildIhor Solodrai1-2/+2
[ Upstream commit 08a7491843224f8b96518fbe70d9e48163046054 ] When building selftests/bpf with EXTRA_LDFLAGS=-static the follwoing error happens: LINK /ws/linux/tools/testing/selftests/bpf/tools/build/bpftool/bootstrap/bpftool /usr/bin/x86_64-linux-gnu-ld.bfd: /usr/lib/gcc/x86_64-linux-gnu/15/../../../x86_64-linux-gnu/libcrypto.a(libcrypto-lib-dso_dlfcn.o): in function `dlfcn_globallookup': [...] /usr/bin/x86_64-linux-gnu-ld.bfd: /usr/lib/gcc/x86_64-linux-gnu/15/../../../x86_64-linux-gnu/libcrypto.a(libcrypto-lib-c_zlib.o): in function `zlib_oneshot_expand_block': (.text+0xc64): undefined reference to `uncompress' /usr/bin/x86_64-linux-gnu-ld.bfd: /usr/lib/gcc/x86_64-linux-gnu/15/../../../x86_64-linux-gnu/libcrypto.a(libcrypto-lib-c_zlib.o): in function `zlib_oneshot_compress_block': (.text+0xce4): undefined reference to `compress' collect2: error: ld returned 1 exit status make[1]: *** [Makefile:252: /ws/linux/tools/testing/selftests/bpf/tools/build/bpftool/bootstrap/bpftool] Error 1 make: *** [Makefile:327: /ws/linux/tools/testing/selftests/bpf/tools/sbin/bpftool] Error 2 make: *** Waiting for unfinished jobs.... This is caused by wrong order of dependencies in the Makefile. Fix it. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260128211255.376933-1-ihor.solodrai@linux.dev Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/MSI: Unmap MSI-X region on errorHaoxiang Li1-1/+3
[ Upstream commit 1a8d4c6ecb4c81261bcdf13556abd4a958eca202 ] msix_capability_init() fails to unmap the MSI-X region if msix_setup_interrupts() fails. Add the missing iounmap() for that error path. [ tglx: Massaged change log ] Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260125144452.2103812-1-lihaoxiang@isrc.iscas.ac.cn Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04clocksource/drivers/timer-integrator-ap: Add missing Kconfig dependency on OFBartosz Golaszewski1-0/+1
[ Upstream commit 2246464821e2820572e6feefca2029f17629cc50 ] This driver accesses the of_aliases global variable declared in linux/of.h and defined in drivers/base/of.c. It requires OF support or will cause a link failure. Add the missing Kconfig dependency. Closes: https://lore.kernel.org/oe-kbuild-all/202601152233.og6LdeUo-lkp@intel.com/ Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20260116111723.10585-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04clocksource/drivers/sh_tmu: Always leave device running after probeNiklas Söderlund1-18/+0
[ Upstream commit b1278972b08e480990e2789bdc6a7c918bc349be ] The TMU device can be used as both a clocksource and a clockevent provider. The driver tries to be smart and power itself on and off, as well as enabling and disabling its clock when it's not in operation. This behavior is slightly altered if the TMU is used as an early platform device in which case the device is left powered on after probe, but the clock is still enabled and disabled at runtime. This has worked for a long time, but recent improvements in PREEMPT_RT and PROVE_LOCKING have highlighted an issue. As the TMU registers itself as a clockevent provider, clockevents_register_device(), it needs to use raw spinlocks internally as this is the context of which the clockevent framework interacts with the TMU driver. However in the context of holding a raw spinlock the TMU driver can't really manage its power state or clock with calls to pm_runtime_*() and clk_*() as these calls end up in other platform drivers using regular spinlocks to control power and clocks. This mix of spinlock contexts trips a lockdep warning. ============================= [ BUG: Invalid wait context ] 6.18.0-arm64-renesas-09926-gee959e7c5e34 #1 Not tainted ----------------------------- swapper/0/0 is trying to lock: ffff000008c9e180 (&dev->power.lock){-...}-{3:3}, at: __pm_runtime_resume+0x38/0x88 other info that might help us debug this: context-{5:5} 1 lock held by swapper/0/0: ccree e6601000.crypto: ARM CryptoCell 630P Driver: HW version 0xAF400001/0xDCC63000, Driver version 5.0 #0: ffff8000817ec298 ccree e6601000.crypto: ARM ccree device initialized (tick_broadcast_lock){-...}-{2:2}, at: __tick_broadcast_oneshot_control+0xa4/0x3a8 stack backtrace: CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.18.0-arm64-renesas-09926-gee959e7c5e34 #1 PREEMPT Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT) Call trace: show_stack+0x14/0x1c (C) dump_stack_lvl+0x6c/0x90 dump_stack+0x14/0x1c __lock_acquire+0x904/0x1584 lock_acquire+0x220/0x34c _raw_spin_lock_irqsave+0x58/0x80 __pm_runtime_resume+0x38/0x88 sh_tmu_clock_event_set_oneshot+0x84/0xd4 clockevents_switch_state+0xfc/0x13c tick_broadcast_set_event+0x30/0xa4 __tick_broadcast_oneshot_control+0x1e0/0x3a8 tick_broadcast_oneshot_control+0x30/0x40 cpuidle_enter_state+0x40c/0x680 cpuidle_enter+0x30/0x40 do_idle+0x1f4/0x280 cpu_startup_entry+0x34/0x40 kernel_init+0x0/0x130 do_one_initcall+0x0/0x230 __primary_switched+0x88/0x90 For non-PREEMPT_RT builds this is not really an issue, but for PREEMPT_RT builds where normal spinlocks can sleep this might be an issue. Be cautious and always leave the power and clock running after probe. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20251202221341.1856773-1-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04arm64/ftrace,bpf: Fix partial regs after bpf_prog_runJiri Olsa2-0/+26
[ Upstream commit 276f3b6daf6024ae2742afd161e7418a5584a660 ] Mahe reported issue with bpf_override_return helper not working when executed from kprobe.multi bpf program on arm. The problem is that on arm we use alternate storage for pt_regs object that is passed to bpf_prog_run and if any register is changed (which is the case of bpf_override_return) it's not propagated back to actual pt_regs object. Fixing this by introducing and calling ftrace_partial_regs_update function to propagate the values of changed registers (ip and stack). Reported-by: Mahe Tardy <mahe.tardy@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/bpf/20260112121157.854473-1-jolsa@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04perf/core: Fix slow perf_event_task_exit() with LBR callstacksNamhyung Kim1-2/+18
[ Upstream commit 4960626f956d63dce57f099016c2ecbe637a8229 ] I got a report that a task is stuck in perf_event_exit_task() waiting for global_ctx_data_rwsem. On large systems with lots threads, it'd have performance issues when it grabs the lock to iterate all threads in the system to allocate the context data. And it'd block task exit path which is problematic especially under memory pressure. perf_event_open perf_event_alloc attach_perf_ctx_data attach_global_ctx_data percpu_down_write (global_ctx_data_rwsem) for_each_process_thread alloc_task_ctx_data do_exit perf_event_exit_task percpu_down_read (global_ctx_data_rwsem) It should not hold the global_ctx_data_rwsem on the exit path. Let's skip allocation for exiting tasks and free the data carefully. Reported-by: Rosalie Fang <rosaliefang@google.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260112165157.1919624-1-namhyung@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04bpf: Properly mark live registers for indirect jumpsAnton Protopopov1-0/+6
[ Upstream commit d1aab1ca576c90192ba961094d51b0be6355a4d6 ] For a `gotox rX` instruction the rX register should be marked as used in the compute_insn_live_regs() function. Fix this. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20260114162544.83253-2-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04genirq/cpuhotplug: Notify about affinity changes breaking the affinity maskImran Khan3-11/+23
[ Upstream commit dd9f6d30c64001ca4dde973ac04d8d155e856743 ] During CPU offlining the interrupts affined to that CPU are moved to other online CPUs, which might break the original affinity mask if the outgoing CPU was the last online CPU in that mask. This change is not propagated to irq_desc::affinity_notify(), which leaves users of the affinity notifier mechanism with stale information. Avoid this by scheduling affinity change notification work for interrupts that were affined to the CPU being offlined, if the new target CPU is not part of the original affinity mask. Since irq_set_affinity_locked() uses the same logic to schedule affinity change notification work, split out this logic into a dedicated function and use that at both places. [ tglx: Removed the EXPORT(), removed the !SMP stub, moved the prototype, added a lockdep assert instead of a comment, fixed up coding style and name space. Polished and clarified the change log ] Signed-off-by: Imran Khan <imran.f.khan@oracle.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260113143727.1041265-1-imran.f.khan@oracle.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04bpf: Recognize special arithmetic shift in the verifierAlexei Starovoitov1-0/+39
[ Upstream commit bffacdb80b93b7b5e96b26fad64cc490a6c7d6c7 ] cilium bpf_wiregard.bpf.c when compiled with -O1 fails to load with the following verifier log: 192: (79) r2 = *(u64 *)(r10 -304) ; R2=pkt(r=40) R10=fp0 fp-304=pkt(r=40) ... 227: (85) call bpf_skb_store_bytes#9 ; R0=scalar() 228: (bc) w2 = w0 ; R0=scalar() R2=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 229: (c4) w2 s>>= 31 ; R2=scalar(smin=0,smax=umax=0xffffffff,smin32=-1,smax32=0,var_off=(0x0; 0xffffffff)) 230: (54) w2 &= -134 ; R2=scalar(smin=0,smax=umax=umax32=0xffffff7a,smax32=0x7fffff7a,var_off=(0x0; 0xffffff7a)) ... 232: (66) if w2 s> 0xffffffff goto pc+125 ; R2=scalar(smin=umin=umin32=0x80000000,smax=umax=umax32=0xffffff7a,smax32=-134,var_off=(0x80000000; 0x7fffff7a)) ... 238: (79) r4 = *(u64 *)(r10 -304) ; R4=scalar() R10=fp0 fp-304=scalar() 239: (56) if w2 != 0xffffff78 goto pc+210 ; R2=0xffffff78 // -136 ... 258: (71) r1 = *(u8 *)(r4 +0) R4 invalid mem access 'scalar' The error might confuse most bpf authors, since fp-304 slot had 'pkt' pointer at insn 192 and became 'scalar' at 238. That happened because bpf_skb_store_bytes() clears all packet pointers including those in the stack. On the first glance it might look like a bug in the source code, since ctx->data pointer should have been reloaded after the call to bpf_skb_store_bytes(). The relevant part of cilium source code looks like this: // bpf/lib/nodeport.h int dsr_set_ipip6() { if (ctx_adjust_hroom(...)) return DROP_INVALID; // -134 if (ctx_store_bytes(...)) return DROP_WRITE_ERROR; // -141 return 0; } bool dsr_fail_needs_reply(int code) { if (code == DROP_FRAG_NEEDED) // -136 return true; return false; } tail_nodeport_ipv6_dsr() { ret = dsr_set_ipip6(...); if (!IS_ERR(ret)) { ... } else { if (dsr_fail_needs_reply(ret)) return dsr_reply_icmp6(...); } } The code doesn't have arithmetic shift by 31 and it reloads ctx->data every time it needs to access it. So it's not a bug in the source code. The reason is DAGCombiner::foldSelectCCToShiftAnd() LLVM transformation: // If this is a select where the false operand is zero and the compare is a // check of the sign bit, see if we can perform the "gzip trick": // select_cc setlt X, 0, A, 0 -> and (sra X, size(X)-1), A // select_cc setgt X, 0, A, 0 -> and (not (sra X, size(X)-1)), A The conditional branch in dsr_set_ipip6() and its return values are optimized into BPF_ARSH plus BPF_AND: 227: (85) call bpf_skb_store_bytes#9 228: (bc) w2 = w0 229: (c4) w2 s>>= 31 ; R2=scalar(smin=0,smax=umax=0xffffffff,smin32=-1,smax32=0,var_off=(0x0; 0xffffffff)) 230: (54) w2 &= -134 ; R2=scalar(smin=0,smax=umax=umax32=0xffffff7a,smax32=0x7fffff7a,var_off=(0x0; 0xffffff7a)) after insn 230 the register w2 can only be 0 or -134, but the verifier approximates it, since there is no way to represent two scalars in bpf_reg_state. After fallthough at insn 232 the w2 can only be -134, hence the branch at insn 239: (56) if w2 != -136 goto pc+210 should be always taken, and trapping insn 258 should never execute. LLVM generated correct code, but the verifier follows impossible path and rejects valid program. To fix this issue recognize this special LLVM optimization and fork the verifier state. So after insn 229: (c4) w2 s>>= 31 the verifier has two states to explore: one with w2 = 0 and another with w2 = 0xffffffff which makes the verifier accept bpf_wiregard.c A similar pattern exists were OR operation is used in place of the AND operation, the verifier detects that pattern as well by forking the state before the OR operation with a scalar in range [-1,0]. Note there are 20+ such patterns in bpf_wiregard.o compiled with -O1 and -O2, but they're rarely seen in other production bpf programs, so push_stack() approach is not a concern. Reported-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Co-developed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260112201424.816836-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04bpf: net_sched: Use the correct destructor kfunc typeSami Tolvanen1-1/+7
[ Upstream commit c99d97b46631c4bea0c14b7581b7a59214601e63 ] With CONFIG_CFI enabled, the kernel strictly enforces that indirect function calls use a function pointer type that matches the target function. As bpf_kfree_skb() signature differs from the btf_dtor_kfunc_t pointer type used for the destructor calls in bpf_obj_free_fields(), add a stub function with the correct type to fix the type mismatch. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260110082548.113748-8-samitolvanen@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04bpf: crypto: Use the correct destructor kfunc typeSami Tolvanen1-1/+7
[ Upstream commit b40a5d724f29fc2eed23ff353808a9aae616b48a ] With CONFIG_CFI enabled, the kernel strictly enforces that indirect function calls use a function pointer type that matches the target function. I ran into the following type mismatch when running BPF self-tests: CFI failure at bpf_obj_free_fields+0x190/0x238 (target: bpf_crypto_ctx_release+0x0/0x94; expected type: 0xa488ebfc) Internal error: Oops - CFI: 00000000f2008228 [#1] SMP ... As bpf_crypto_ctx_release() is also used in BPF programs and using a void pointer as the argument would make the verifier unhappy, add a simple stub function with the correct type and register it as the destructor kfunc instead. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Tested-by: Viktor Malik <vmalik@redhat.com> Link: https://lore.kernel.org/r/20260110082548.113748-7-samitolvanen@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04gendwarfksyms: Fix build on 32-bit hostsSami Tolvanen2-3/+6
[ Upstream commit ddc54f912a551f6eb0bbcfc3880f45fe27a252cb ] We have interchangeably used unsigned long for some of the types defined in elfutils, assuming they're always 64-bit. This obviously fails when building gendwarfksyms on 32-bit hosts. Fix the types. Reported-by: Michal Suchánek <msuchanek@suse.de> Closes: https://lore.kernel.org/linux-modules/aRcxzPxtJblVSh1y@kitsune.suse.cz/ Tested-by: Michal Suchánek <msuchanek@suse.de> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04perf/x86/intel: Add Airmont NPMartin Schiller1-0/+1
[ Upstream commit a08340fd291671c54d379d285b2325490ce90ddd ] The Intel / MaxLinear Airmont NP (aka Lightning Mountain) supports the same architectual and non-architecural events as Airmont. Signed-off-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://patch.msgid.link/20251124074846.9653-3-ms@dev.tdt.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04perf/x86/cstate: Add Airmont NPMartin Schiller1-0/+1
[ Upstream commit 3006911f284d769b0f66c12b39da130325ef1440 ] From the perspective of Intel cstate residency counters, the Airmont NP (aka Lightning Mountain) is identical to the Airmont. Signed-off-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://patch.msgid.link/20251124074846.9653-4-ms@dev.tdt.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04perf/x86/msr: Add Airmont NPMartin Schiller1-0/+1
[ Upstream commit 63dbadcafc1f4d1da796a8e2c0aea1e561f79ece ] Like Airmont, the Airmont NP (aka Intel / MaxLinear Lightning Mountain) supports SMI_COUNT MSR. Signed-off-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://patch.msgid.link/20251124074846.9653-2-ms@dev.tdt.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04irqchip/riscv-imsic: Add a CPU pm notifier to restore the IMSIC on exitNick Hu1-8/+31
[ Upstream commit f48b4bd0915bf61ac12b8c65c7939ebd03bc8abf ] The IMSIC might be reset when the system enters a low power state, but on exit nothing restores the registers, which prevents interrupt delivery. Solve this by registering a CPU power management notifier, which restores the IMSIC on exit. Signed-off-by: Nick Hu <nick.hu@sifive.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Cyan Yang <cyan.yang@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com> Link: https://patch.msgid.link/20251202-preserve-aplic-imsic-v3-1-1844fbf1fe92@sifive.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04bpf: verifier improvement in 32bit shift sign extension patternCupertino Miranda1-11/+7
[ Upstream commit d18dec4b8990048ce75f0ece32bb96b3fbd3f422 ] This patch improves the verifier to correctly compute bounds for sign extension compiler pattern composed of left shift by 32bits followed by a sign right shift by 32bits. Pattern in the verifier was limitted to positive value bounds and would reset bound computation for negative values. New code allows both positive and negative values for sign extension without compromising bound computation and verifier to pass. This change is required by GCC which generate such pattern, and was detected in the context of systemd, as described in the following GCC bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119731 Three new tests were added in verifier_subreg.c. Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com> Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Cc: David Faust <david.faust@oracle.com> Cc: Jose Marchesi <jose.marchesi@oracle.com> Cc: Elena Zannoni <elena.zannoni@oracle.com> Link: https://lore.kernel.org/r/20251202180220.11128-2-cupertino.miranda@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04sparc: don't reference obsolete termio struct for TC* constantsSam James1-4/+4
[ Upstream commit be0bccffcde3308150d2a90e55fc10e249098909 ] Similar in nature to commit ab107276607a ("powerpc: Fix struct termio related ioctl macros"). glibc-2.42 drops the legacy termio struct, but the ioctls.h header still defines some TC* constants in terms of termio (via sizeof). Hardcode the values instead. This fixes building Python for example, which falls over like: ./Modules/termios.c:1119:16: error: invalid application of 'sizeof' to incomplete type 'struct termio' Link: https://bugs.gentoo.org/961769 Link: https://bugs.gentoo.org/962600 Signed-off-by: Sam James <sam@gentoo.org> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04sparc: Synchronize user stack on fork and cloneAndreas Larsson1-14/+24
[ Upstream commit e38eba3b77878ada327a572a41596a3b0b44e522 ] Flush all uncommitted user windows before calling the generic syscall handlers for clone, fork, and vfork. Prior to entering the arch common handlers sparc_{clone|fork|vfork}, the arch-specific syscall wrappers for these syscalls will attempt to flush all windows (including user windows). In the window overflow trap handlers on both SPARC{32|64}, if the window can't be stored (i.e due to MMU related faults) the routine backups the user window and increments a thread counter (wsaved). By adding a synchronization point after the flush attempt, when fault handling is enabled, any uncommitted user windows will be flushed. Link: https://sourceware.org/bugzilla/show_bug.cgi?id=31394 Closes: https://lore.kernel.org/sparclinux/fe5cc47167430007560501aabb28ba154985b661.camel@physik.fu-berlin.de/ Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Ludwig Rydberg <ludwig.rydberg@gaisler.com> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/20260119144753.27945-2-ludwig.rydberg@gaisler.com Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04block: decouple secure erase size limit from discard size limitLuke Wang2-5/+22
[ Upstream commit ee81212f74a57c5d2b56cf504f40d528dac6faaf ] Secure erase should use max_secure_erase_sectors instead of being limited by max_discard_sectors. Separate the handling of REQ_OP_SECURE_ERASE from REQ_OP_DISCARD to allow each operation to use its own size limit. Signed-off-by: Luke Wang <ziniu.wang_1@nxp.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04blk-mq-sched: unify elevators checking for async requestsYu Kuai4-3/+8
[ Upstream commit 1db61b0afdd7e8aa9289c423fdff002603b520b5 ] bfq and mq-deadline consider sync writes as async requests and only reserve tags for sync reads by async_depth, however, kyber doesn't consider sync writes as async requests for now. Consider the case there are lots of dirty pages, and user use fsync to flush dirty pages. In this case sched_tags can be exhausted by sync writes and sync reads can stuck waiting for tag. Hence let kyber follow what mq-deadline and bfq did, and unify async requests checking for all elevators. Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04blk-mq-debugfs: add missing debugfs_mutex in blk_mq_debugfs_register_hctxs()Yu Kuai1-0/+2
[ Upstream commit 9d20fd6ce1ba9733cd5ac96fcab32faa9fc404dd ] In blk_mq_update_nr_hw_queues(), debugfs_mutex is not held while creating debugfs entries for hctxs. Hence add debugfs_mutex there, it's safe because queue is not frozen. Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04xenbus: Use .freeze/.thaw to handle xenbus devicesJason Andryuk1-3/+1
[ Upstream commit e08dd1ee49838750a514e83c0aa60cd12ba6ecbb ] The goal is to fix s2idle and S3 for Xen PV devices. A domain resuming from s3 or s2idle disconnects its PV devices during resume. The backends are not expecting this and do not reconnect. b3e96c0c7562 ("xen: use freeze/restore/thaw PM events for suspend/ resume/chkpt") changed xen_suspend()/do_suspend() from PMSG_SUSPEND/PMSG_RESUME to PMSG_FREEZE/PMSG_THAW/PMSG_RESTORE, but the suspend/resume callbacks remained. .freeze/restore are used with hiberation where Linux restarts in a new place in the future. .suspend/resume are useful for runtime power management for the duration of a boot. The current behavior of the callbacks works for an xl save/restore or live migration where the domain is restored/migrated to a new location and connecting to a not-already-connected backend. Change xenbus_pm_ops to use .freeze/thaw/restore and drop the .suspend/resume hook. This matches the use in drivers/xen/manage.c for save/restore and live migration. With .suspend/resume empty, PV devices are left connected during s2idle and s3, so PV devices are not changed and work after resume. Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Acked-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Message-ID: <20251119224731.61497-2-jason.andryuk@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04ACPI: battery: fix incorrect charging status when current is zeroAta İlhan Köktürk1-1/+8
[ Upstream commit bb1256e0ddc7e9e406164319769b9f8d8389f056 ] On some laptops, such as the Huawei Matebook series, the embedded controller continues to report "Charging" status even when the charge threshold is reached and no current is being drawn. This incorrect reporting prevents the system from switching to battery power profiles, leading to significantly higher power (e.g., 18W instead of 7W during browsing) and missed remaining battery time estimation. Validate the "Charging" state by checking if rate_now is zero. If the hardware reports charging but the current is zero, report "Not Charging" to user space. Signed-off-by: Ata İlhan Köktürk <atailhan2006@gmail.com> [ rjw: Whitespace fix, braces added to an inner if (), new comment rewrite ] [ rjw: Changelog edits ] Link: https://patch.msgid.link/20260129144856.43058-1-atailhan2006@gmail.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04ACPI: scan: Use async schedule function in acpi_scan_clear_dep_fn()Yicong Yang1-26/+15
[ Upstream commit 7cf28b3797a81b616bb7eb3e90cf131afc452919 ] The device object rescan in acpi_scan_clear_dep_fn() is scheduled on a system workqueue which is not guaranteed to be finished before entering userspace. This may cause some key devices to be missing when userspace init task tries to find them. Two issues observed on RISCV platforms: - Kernel panic due to userspace init cannot have an opened console. The console device scanning is queued by acpi_scan_clear_dep_queue() and not finished by the time userspace init process running, thus by the time userspace init runs, no console is present. - Entering rescue shell due to the lack of root devices (PCIe nvme in our case). Same reason as above, the PCIe host bridge scanning is queued on a system workqueue and finished after init process runs. The reason is because both devices (console, PCIe host bridge) depend on riscv-aplic irqchip to serve their interrupts (console's wired interrupt and PCI's INTx interrupts). In order to keep the dependency, these devices are scanned and created after initializing riscv-aplic. The riscv-aplic is initialized in device_initcall() and a device scan work is queued via acpi_scan_clear_dep_queue(), which is close to the time userspace init process is run. Since system_dfl_wq is used in acpi_scan_clear_dep_queue() with no synchronization, the issues will happen if userspace init runs before these devices are ready. The solution is to wait for the queued work to complete before entering userspace init. One possible way would be to use a dedicated workqueue instead of system_dfl_wq, and explicitly flush it somewhere in the initcall stage before entering userspace. Another way is to use async_schedule_dev_nocall() for scanning these devices. It's designed for asynchronous initialization and will work in the same way as before because it's using a dedicated unbound workqueue as well, but the kernel init code calls async_synchronize_full() right before entering userspace init which will wait for the work to complete. Compared to a dedicated workqueue, the second approach is simpler because the async schedule framework takes care of all of the details. The ACPI code only needs to focus on its job. A dedicated workqueue for this could also be redundant because some platforms don't need acpi_scan_clear_dep_queue() for their device scanning. Signed-off-by: Yicong Yang <yang.yicong@picoheart.com> [ rjw: Subject adjustment, changelog edits ] Link: https://patch.msgid.link/20260128132848.93638-1-yang.yicong@picoheart.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04ACPI: x86: s2idle: Invoke Microsoft _DSM Function 9 (Turn On Display)Jakob Riemenschneider1-0/+6
[ Upstream commit 229ecbaac6b31f89c554b77eb407377a5eade7d4 ] Windows 11, version 22H2 introduced a new function index (Function 9) to the Microsoft LPS0 _DSM, titled "Turn On Display Notification". According to Microsoft documentation, this function signals to the system firmware that the OS intends to turn on the display when exiting Modern Standby. This allows the firmware to release Power Limits (PLx) earlier. Crucially, this patch fixes a functional issue observed on the Lenovo Yoga Slim 7i Aura (15ILL9), where system fans and keyboard backlights fail to resume after suspend. Investigation linked shows the EC on this device turns off these components during sleep but requires the Function 9 notification to wake them up again. This patch defines the new function index (ACPI_MS_TURN_ON_DISPLAY) and invokes it in acpi_s2idle_restore_early_lps0(). The execution order is updated to match the logic of an "intent" signal: 1. LPS0 Exit (Function 6) 2. Turn On Display Intent (Function 9) 3. Modern Standby Exit (Function 8) 4. Screen On (Function 4) Invoking Function 9 before the Modern Standby Exit ensures the firmware has time to restore power rails and functionality (like fans) before the software fully exits the sleep state. Link: https://learn.microsoft.com/en-us/windows-hardware/design/device-experiences/modern-standby-firmware-notifications#turn-on-display-notification-function-9 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220505 Suggested-by: Antheas Kapenekakis <antheas@antheas.dev> Signed-off-by: Jakob Riemenschneider <riemenschneiderjakob@gmail.com> Link: https://patch.msgid.link/20260127200121.1292216-1-riemenschneiderjakob@gmail.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04perf/cxlpmu: Replace IRQF_ONESHOT with IRQF_NO_THREADSebastian Andrzej Siewior1-1/+1
[ Upstream commit ab26d9c85554c4ff1d95ca8341522880ed9219d6 ] Passing IRQF_ONESHOT ensures that the interrupt source is masked until the secondary (threaded) handler is done. If only a primary handler is used then the flag makes no sense because the interrupt can not fire (again) while its handler is running. The flag also disallows force-threading of the primary handler and the irq-core will warn about this. The intention here was probably not allowing forced-threading. Replace IRQF_ONESHOT with IRQF_NO_THREAD. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04arm64: mte: Set TCMA1 whenever MTE is present in the kernelCarl Worth1-5/+5
[ Upstream commit a4e5927115f30a301f9939ed43e6a21a343e06ad ] Set the TCMA1 bit so that access to TTBR1 addresses with 0xf in their tag bits will be treated as tag unchecked. This is important to avoid unwanted tag checking on some systems. Specifically, SCTLR_EL1.TCF can be set to indicate that no tag check faults are desired. But the architecture doesn't guarantee that in this case the system won't still perform tag checks. Use TCMA1 to ensure that undesired tag checks are not performed. This bit was already set in the KASAN case. Adding it to the non-KASAN case prevents tag checking since all TTBR1 address will have a value of 0xf in their tag bits. This patch has been measured on an Ampere system to improve the following: * Eliminate over 98% of kernel-side tag checks during "perf bench futex hash", as measured with "perf stat". * Eliminate all MTE overhead (was previously a 25% performance penalty) from the Phoronix pts/memcached benchmark (1:10 Set:Get ration with 96 cores). Reported-by: Taehyun Noh <taehyun@utexas.edu> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Carl Worth <carl@os.amperecomputing.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04s390/perf: Disable register readout on sampling eventsThomas Richter1-1/+1
[ Upstream commit b2c04fc1239062b39ddfdd8731ee1a10810dfb74 ] Running commands # ./perf record -IR0,R1 -a sleep 1 extracts and displays register value of general purpose register r1 and r0. However the value displayed of any register is random and does not reflect the register value recorded at the time of the sample interrupt. The sampling device driver on s390 creates a very large buffer for the hardware to store the samples. Only when that large buffer gets full an interrupt is generated and many hundreds of sample entries are processed and copied to the kernel ring buffer and eventually get copied to the perf tool. It is during the copy to the kernel ring buffer that each sample is processed (on s390) and at that time the register values are extracted. This is not the original goal, the register values should be read when the samples are created not when the samples are copied to the kernel ring buffer. Prevent this event from being installed in the first place and return -EOPNOTSUPP. This is already the case for PERF_SAMPLE_REGS_USER. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Jan Polensky <japo@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04cpufreq: dt-platdev: Block the driver from probing on more QC platformsKonrad Dybcio1-0/+3
[ Upstream commit 7b781899072c5701ef9538c365757ee9ab9c00bd ] Add a number of QC platforms to the blocklist, they all use either the qcom-cpufreq-hw driver. Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04rust: cpufreq: always inline functions using build_assert with argumentsAlexandre Courbot1-0/+2
[ Upstream commit 8c8b12a55614ea05953e8d695e700e6e1322a05d ] `build_assert` relies on the compiler to optimize out its error path. Functions using it with its arguments must thus always be inlined, otherwise the error path of `build_assert` might not be optimized out, triggering a build error. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04md raid: fix hang when stopping arrays with metadata through dm-raidHeinz Mauelshagen1-6/+8
[ Upstream commit cefcb9297fbdb6d94b61787b4f8d84f55b741470 ] When using device-mapper's dm-raid target, stopping a RAID array can cause the system to hang under specific conditions. This occurs when: - A dm-raid managed device tree is suspended from top to bottom (the top-level RAID device is suspended first, followed by its underlying metadata and data devices) - The top-level RAID device is then removed Removing the top-level device triggers a hang in the following sequence: the dm-raid destructor calls md_stop(), which tries to flush the write-intent bitmap by writing to the metadata sub-devices. However, these devices are already suspended, making them unable to complete the write-intent operations and causing an indefinite block. Fix: - Prevent bitmap flushing when md_stop() is called from dm-raid destructor context and avoid a quiescing/unquescing cycle which could also cause I/O - Still allow write-intent bitmap flushing when called from dm-raid suspend context This ensures that RAID array teardown can complete successfully even when the underlying devices are in a suspended state. This second patch uses md_is_rdwr() to distinguish between suspend and destructor paths as elaborated on above. Link: https://lore.kernel.org/linux-raid/CAM23VxqYrwkhKEBeQrZeZwQudbiNey2_8B_SEOLqug=pXxaFrA@mail.gmail.com Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04md-cluster: fix NULL pointer dereference in process_metadata_updateJiasheng Jiang1-1/+6
[ Upstream commit f150e753cb8dd756085f46e86f2c35ce472e0a3c ] The function process_metadata_update() blindly dereferences the 'thread' pointer (acquired via rcu_dereference_protected) within the wait_event() macro. While the code comment states "daemon thread must exist", there is a valid race condition window during the MD array startup sequence (md_run): 1. bitmap_load() is called, which invokes md_cluster_ops->join(). 2. join() starts the "cluster_recv" thread (recv_daemon). 3. At this point, recv_daemon is active and processing messages. 4. However, mddev->thread (the main MD thread) is not initialized until later in md_run(). If a METADATA_UPDATED message is received from a remote node during this specific window, process_metadata_update() will be called while mddev->thread is still NULL, leading to a kernel panic. To fix this, we must validate the 'thread' pointer. If it is NULL, we release the held lock (no_new_dev_lockres) and return early, safely ignoring the update request as the array is not yet fully ready to process it. Link: https://lore.kernel.org/linux-raid/20260117145903.28921-1-jiashengjiangcool@gmail.com Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com> Signed-off-by: Sasha Levin <sashal@kernel.org>