summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2026-04-08Input: logibm - remove driverDmitry Torokhov3-177/+0
Bus mice use specialized bus interface implemented via an ISA add-in cards. They were superseded by PS/2 and later USB. Kconfig entry for the Logitech bus mice states that they "are rather rare these days". This statement was true in 2002 and is no less true in 2024. Remove the driver. Link: https://patch.msgid.link/20240808172733.1194442-3-dmitry.torokhov@gmail.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2026-04-08Input: inport - remove driverDmitry Torokhov3-194/+0
Inport (ATI XL and Microsoft) mice use specialized bus interface implemented via an ISA add-in card. Have been superseded by PS/2 and then USB, and are historical curiosity by now. Remove the driver. Link: https://patch.msgid.link/20240808172733.1194442-2-dmitry.torokhov@gmail.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2026-04-08Input: qt1070 - inline i2c_check_functionality checkThorsten Blum1-2/+1
Inline the i2c_check_functionality() check, since the function returns a boolean status rather than an error code. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://patch.msgid.link/20260408141926.1181389-4-thorsten.blum@linux.dev Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2026-04-08Input: qt1050 - inline i2c_check_functionality checkThorsten Blum1-2/+1
Inline the i2c_check_functionality() check, since the function returns a boolean status rather than an error code. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://patch.msgid.link/20260408141926.1181389-3-thorsten.blum@linux.dev Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2026-04-08firmware: efi: Never declare sysfb_primary_display on x86Thomas Zimmermann1-1/+1
The x86 architecture comes with its own instance of the global state variable sysfb_primary_display. Never declare it in the EFI subsystem. Fix the test for CONFIG_FIRMWARE_EDID accordingly. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: e65ca1646311 ("efi: export sysfb_primary_display for EDID") Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2026-04-08Documentation: gpio: update the preferred method for using software node lookupBartosz Golaszewski2-14/+45
In its current version, the manual for converting of board files from using GPIO lookup tables to software nodes recommends leaving the software nodes representing GPIO controllers as "free-floating", not attached objects and relying on the matching of their names against the GPIO controller's name. This is an abuse of the software node API and makes it impossible to create fw_devlinks between GPIO suppliers and consumers in this case. We want to remove this behavior from GPIOLIB and to this end, work on converting all existing drivers to using "attached" software nodes. Except for a few corner-cases where board files define consumers depending on GPIO controllers described in firmware - where we need to reference a real firmware node from a software node - which requires a more complex approach, most board files can easily be converted to using propert firmware node lookup. Update the documentation to recommend attaching the GPIO chip's software nodes to the actual platform devices and show how to do it. Reviewed-by: Linus Walleij <linusw@kernel.org> Link: https://patch.msgid.link/20260403-doc-gpio-swnodes-v2-1-c705f5897b80@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2026-04-08dma-fence: correct kernel-doc function parameter @flagsRandy Dunlap1-2/+2
'make htmldocs' complains that dma_fence_unlock_irqrestore() is missing a description of its @flags parameter. The description is there but it is missing a ':' sign. Add that and correct the possessive form of "its". WARNING: ../include/linux/dma-fence.h:414 function parameter 'flags' not described in 'dma_fence_unlock_irqrestore' Fixes: 3e5067931b5d ("dma-buf: abstract fence locking v2") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20260407043649.2015894-1-rdunlap@infradead.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2026-04-08ASoC: amd: acp-da7219-max98357a: tidyup acp_soc_is_rltk_max()Kuninori Morimoto1-3/+1
acp-da7219-max98357a() user exists behind it. No need to has pre-define. Remove it. And it is local function, add static. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://patch.msgid.link/87h5pmvxfm.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-04-08Merge tag 'md-7.1-20260407' of ↵Jens Axboe7-62/+405
git://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux into for-7.1/block Pull MD changes from Yu Kuai: "Bug Fixes: - avoid a sysfs deadlock when clearing array state (Yu Kuai) - validate raid5 journal payloads before reading metadata (Junrui Luo) - fall back to the correct bitmap operations after version mismatches (Yu Kuai) - serialize overlapping writes on writemostly raid1 disks (Xiao Ni) - wake raid456 reshape waiters before suspend (Yu Kuai) - prevent retry_aligned_read() from triggering soft lockups (Chia-Ming Chang) Improvements: - switch raid0 strip zone and devlist allocations to kvmalloc helpers (Gregory Price) - track clean unwritten stripes for proactive RAID5 parity building (Yu Kuai) - speed up initial llbitmap sync with write_zeroes_unmap support (Yu Kuai) Cleanups: - remove the unused static md workqueue definition (Abd-Alrhman Masalkhi)" * tag 'md-7.1-20260407' of git://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux: md/raid5: fix soft lockup in retry_aligned_read() md: wake raid456 reshape waiters before suspend md/raid1: serialize overlap io for writemostly disk md/md-llbitmap: optimize initial sync with write_zeroes_unmap support md/md-llbitmap: add CleanUnwritten state for RAID-5 proactive parity building md: add fallback to correct bitmap_ops on version mismatch md/raid5: validate payload size before accessing journal metadata md: remove unused static md_wq workqueue md/raid0: use kvzalloc/kvfree for strip_zone and devlist allocations md: fix array_state=clear sysfs deadlock
2026-04-08ASoC: stm32_sai: fix incorrect BCLK polarity for DSP_A/B, LEFT_JTomasz Merta1-0/+3
The STM32 SAI driver do not set the clock strobing bit (CKSTR) for DSP_A, DSP_B and LEFT_J formats, causing data to be sampled on the wrong BCLK edge when SND_SOC_DAIFMT_NB_NF is used. Per ALSA convention, NB_NF requires sampling on the rising BCLK edge. The STM32MP25 SAI reference manual states that CKSTR=1 is required for signals received by the SAI to be sampled on the SCK rising edge. Without setting CKSTR=1, the SAI samples on the falling edge, violating the NB_NF convention. For comparison, the NXP FSL SAI driver correctly sets FSL_SAI_CR2_BCP for DSP_A, DSP_B and LEFT_J, consistent with its I2S handling. This patch adds SAI_XCR1_CKSTR for DSP_A, DSP_B and LEFT_J in stm32_sai_set_dai_fmt which was verified empirically with a cs47l35 codec. RIGHT_J (LSB) is not investigated and addressed by this patch. Note: the STM32 I2S driver (stm32_i2s_set_dai_fmt) may have the same issue for DSP_A mode, as I2S_CGFR_CKPOL is not set. This has not been verified and is left for a separate investigation. Signed-off-by: Tomasz Merta <tommerta@gmail.com> Link: https://patch.msgid.link/20260408084056.20588-1-tommerta@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-04-08ASoC: SOF: Intel: hda: modify period size constraints for ACE4Kai Vehmanen1-2/+12
Intel ACE4 based products set more strict constraints on HDA BDLE start address and length alignment. Add a constraint to align period size to 128 bytes. The commit removes the "minimum as per HDA spec" comment. This comment was misleading as spec actually does allow a 2 byte BDLE length, and more importantly, period size also directly impacts how the BDLE start addresses are aligned, so it is not sufficient just to consider allowed buffer length. Fixes: d3df422f66e8 ("ASoC: SOF: Intel: add initial support for NVL-S") Cc: stable@vger.kernel.org Reported-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Link: https://patch.msgid.link/20260408084514.24325-3-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-04-08ALSA: hda/intel: enforce stricter period-size alignment for Intel NVLKai Vehmanen1-2/+5
Intel ACE4 based products set more strict constraints on HDA BDLE start address and length alignment. Modify capability flags to drop AZX_DCAPS_NO_ALIGN_BUFSIZE for Intel Nova Lake platforms. Fixes: 7f428282fde3 ("ALSA: hda: controllers: intel: add support for Nova Lake") Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20260408084514.24325-2-peter.ujfalusi@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-04-08ASoC: nau8325: Add software reset during probeNeo Chang1-1/+8
Currently, the driver only performs a hardware reset during the I2C probe sequence. To ensure all internal states of the codec are properly cleared without affecting the configuration registers, a software reset is also required. According to the hardware specification, writing to the Software Reset register (R01) twice will reset all internal states safely. This patch adds the nau8325_software_reset() function, executes it right after the hardware reset in the probe function, and marks the R01 register as writeable in the regmap configuration. Signed-off-by: Neo Chang <YLCHANG2@nuvoton.com> Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Link: https://patch.msgid.link/20260408052639.187149-1-YLCHANG2@nuvoton.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-04-08selftests: nft_queue.sh: add a parallel stress testFernando Fernandez Mancera2-18/+115
Introduce a new stress test to check for race conditions in the nfnetlink_queue subsystem, where an entry is freed while another CPU is concurrently walking the global rhashtable. To trigger this, `nf_queue.c` is extended with two new flags: * -O (out-of-order): Buffers packet IDs and flushes them in reverse. * -b (bogus verdicts): Floods the kernel with non-existent packet IDs. The bogus verdict loop forces the kernel's lookup function to perform full rhashtable bucket traversals (-ENOENT). Combined with reverse-order flushing and heavy parallel UDP/ping flooding across 8 queues, this puts the nfnetlink_queue code under pressure. Joint work with Florian Westphal. Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08netfilter: nfnetlink_queue: make hash table per queueFlorian Westphal2-91/+49
Sharing a global hash table among all queues is tempting, but it can cause crash: BUG: KASAN: slab-use-after-free in nfqnl_recv_verdict+0x11ac/0x15e0 [nfnetlink_queue] [..] nfqnl_recv_verdict+0x11ac/0x15e0 [nfnetlink_queue] nfnetlink_rcv_msg+0x46a/0x930 kmem_cache_alloc_node_noprof+0x11e/0x450 struct nf_queue_entry is freed via kfree, but parallel cpu can still encounter such an nf_queue_entry when walking the list. Alternative fix is to free the nf_queue_entry via kfree_rcu() instead, but as we have to alloc/free for each skb this will cause more mem pressure. Cc: Scott Mitchell <scott.k.mitch1@gmail.com> Fixes: e19079adcd26 ("netfilter: nfnetlink_queue: optimize verdict lookup with hash table") Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08netfilter: nft_ct: fix use-after-free in timeout object destroyTuan Do2-1/+2
nft_ct_timeout_obj_destroy() frees the timeout object with kfree() immediately after nf_ct_untimeout(), without waiting for an RCU grace period. Concurrent packet processing on other CPUs may still hold RCU-protected references to the timeout object obtained via rcu_dereference() in nf_ct_timeout_data(). Add an rcu_head to struct nf_ct_timeout and use kfree_rcu() to defer freeing until after an RCU grace period, matching the approach already used in nfnetlink_cttimeout.c. KASAN report: BUG: KASAN: slab-use-after-free in nf_conntrack_tcp_packet+0x1381/0x29d0 Read of size 4 at addr ffff8881035fe19c by task exploit/80 Call Trace: nf_conntrack_tcp_packet+0x1381/0x29d0 nf_conntrack_in+0x612/0x8b0 nf_hook_slow+0x70/0x100 __ip_local_out+0x1b2/0x210 tcp_sendmsg_locked+0x722/0x1580 __sys_sendto+0x2d8/0x320 Allocated by task 75: nft_ct_timeout_obj_init+0xf6/0x290 nft_obj_init+0x107/0x1b0 nf_tables_newobj+0x680/0x9c0 nfnetlink_rcv_batch+0xc29/0xe00 Freed by task 26: nft_obj_destroy+0x3f/0xa0 nf_tables_trans_destroy_work+0x51c/0x5c0 process_one_work+0x2c4/0x5a0 Fixes: 7e0b2b57f01d ("netfilter: nft_ct: add ct timeout support") Cc: stable@vger.kernel.org Signed-off-by: Tuan Do <tuan@calif.io> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08netfilter: ip6t_eui64: reject invalid MAC header for all packetsZhengchuan Liang1-2/+1
`eui64_mt6()` derives a modified EUI-64 from the Ethernet source address and compares it with the low 64 bits of the IPv6 source address. The existing guard only rejects an invalid MAC header when `par->fragoff != 0`. For packets with `par->fragoff == 0`, `eui64_mt6()` can still reach `eth_hdr(skb)` even when the MAC header is not valid. Fix this by removing the `par->fragoff != 0` condition so that packets with an invalid MAC header are rejected before accessing `eth_hdr(skb)`. Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Co-developed-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Yuan Tan <yuantan098@gmail.com> Suggested-by: Xin Liu <bird@lzu.edu.cn> Tested-by: Ren Wei <enjou1224z@gmail.com> Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08netfilter: xt_multiport: validate range encoding in checkentryRen Wei1-4/+30
ports_match_v1() treats any non-zero pflags entry as the start of a port range and unconditionally consumes the next ports[] element as the range end. The checkentry path currently validates protocol, flags and count, but it does not validate the range encoding itself. As a result, malformed rules can mark the last slot as a range start or place two range starts back to back, leaving ports_match_v1() to step past the last valid ports[] element while interpreting the rule. Reject malformed multiport v1 rules in checkentry by validating that each range start has a following element and that the following element is not itself marked as another range start. Fixes: a89ecb6a2ef7 ("[NETFILTER]: x_tables: unify IPv4/IPv6 multiport match") Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Co-developed-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Yuan Tan <yuantan098@gmail.com> Suggested-by: Xin Liu <bird@lzu.edu.cn> Tested-by: Yuhang Zheng <z1652074432@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08netfilter: nfnetlink_log: initialize nfgenmsg in NLMSG_DONE terminatorXiang Mei1-4/+4
When batching multiple NFLOG messages (inst->qlen > 1), __nfulnl_send() appends an NLMSG_DONE terminator with sizeof(struct nfgenmsg) payload via nlmsg_put(), but never initializes the nfgenmsg bytes. The nlmsg_put() helper only zeroes alignment padding after the payload, not the payload itself, so four bytes of stale kernel heap data are leaked to userspace in the NLMSG_DONE message body. Use nfnl_msg_put() to build the NLMSG_DONE terminator, which initializes the nfgenmsg payload via nfnl_fill_hdr(), consistent with how __build_packet_message() already constructs NFULNL_MSG_PACKET headers. Fixes: 29c5d4afba51 ("[NETFILTER]: nfnetlink_log: fix sending of multipart messages") Reported-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08ipvs: fix NULL deref in ip_vs_add_service error pathWeiming Shi1-1/+0
When ip_vs_bind_scheduler() succeeds in ip_vs_add_service(), the local variable sched is set to NULL. If ip_vs_start_estimator() subsequently fails, the out_err cleanup calls ip_vs_unbind_scheduler(svc, sched) with sched == NULL. ip_vs_unbind_scheduler() passes the cur_sched NULL check (because svc->scheduler was set by the successful bind) but then dereferences the NULL sched parameter at sched->done_service, causing a kernel panic at offset 0x30 from NULL. Oops: general protection fault, [..] [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037] RIP: 0010:ip_vs_unbind_scheduler (net/netfilter/ipvs/ip_vs_sched.c:69) Call Trace: <TASK> ip_vs_add_service.isra.0 (net/netfilter/ipvs/ip_vs_ctl.c:1500) do_ip_vs_set_ctl (net/netfilter/ipvs/ip_vs_ctl.c:2809) nf_setsockopt (net/netfilter/nf_sockopt.c:102) [..] Fix by simply not clearing the local sched variable after a successful bind. ip_vs_unbind_scheduler() already detects whether a scheduler is installed via svc->scheduler, and keeping sched non-NULL ensures the error path passes the correct pointer to both ip_vs_unbind_scheduler() and ip_vs_scheduler_put(). While the bug is older, the problem popups in more recent kernels (6.2), when the new error path is taken after the ip_vs_start_estimator() call. Fixes: 705dd3444081 ("ipvs: use kthreads for stats estimation") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Acked-by: Simon Horman <horms@kernel.org> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08drm/i915/gt: fix refcount underflow in intel_engine_park_heartbeatSebastian Brzezinka1-8/+18
A use-after-free / refcount underflow is possible when the heartbeat worker and intel_engine_park_heartbeat() race to release the same engine->heartbeat.systole request. The heartbeat worker reads engine->heartbeat.systole and calls i915_request_put() on it when the request is complete, but clears the pointer in a separate, non-atomic step. Concurrently, a request retirement on another CPU can drop the engine wakeref to zero, triggering __engine_park() -> intel_engine_park_heartbeat(). If the heartbeat timer is pending at that point, cancel_delayed_work() returns true and intel_engine_park_heartbeat() reads the stale non-NULL systole pointer and calls i915_request_put() on it again, causing a refcount underflow: ``` <4> [487.221889] Workqueue: i915-unordered engine_retire [i915] <4> [487.222640] RIP: 0010:refcount_warn_saturate+0x68/0xb0 ... <4> [487.222707] Call Trace: <4> [487.222711] <TASK> <4> [487.222716] intel_engine_park_heartbeat.part.0+0x6f/0x80 [i915] <4> [487.223115] intel_engine_park_heartbeat+0x25/0x40 [i915] <4> [487.223566] __engine_park+0xb9/0x650 [i915] <4> [487.223973] ____intel_wakeref_put_last+0x2e/0xb0 [i915] <4> [487.224408] __intel_wakeref_put_last+0x72/0x90 [i915] <4> [487.224797] intel_context_exit_engine+0x7c/0x80 [i915] <4> [487.225238] intel_context_exit+0xf1/0x1b0 [i915] <4> [487.225695] i915_request_retire.part.0+0x1b9/0x530 [i915] <4> [487.226178] i915_request_retire+0x1c/0x40 [i915] <4> [487.226625] engine_retire+0x122/0x180 [i915] <4> [487.227037] process_one_work+0x239/0x760 <4> [487.227060] worker_thread+0x200/0x3f0 <4> [487.227068] ? __pfx_worker_thread+0x10/0x10 <4> [487.227075] kthread+0x10d/0x150 <4> [487.227083] ? __pfx_kthread+0x10/0x10 <4> [487.227092] ret_from_fork+0x3d4/0x480 <4> [487.227099] ? __pfx_kthread+0x10/0x10 <4> [487.227107] ret_from_fork_asm+0x1a/0x30 <4> [487.227141] </TASK> ``` Fix this by replacing the non-atomic pointer read + separate clear with xchg() in both racing paths. xchg() is a single indivisible hardware instruction that atomically reads the old pointer and writes NULL. This guarantees only one of the two concurrent callers obtains the non-NULL pointer and performs the put, the other gets NULL and skips it. Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/work_items/15880 Fixes: 058179e72e09 ("drm/i915/gt: Replace hangcheck by heartbeats") Cc: <stable@vger.kernel.org> # v5.5+ Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com> Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/d4c1c14255688dd07cc8044973c4f032a8d1559e.1775038106.git.sebastian.brzezinka@intel.com (cherry picked from commit 13238dc0ee4f9ab8dafa2cca7295736191ae2f42) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2026-04-08Merge branch kvm-arm64/misc-7.1 into kvmarm-master/nextMarc Zyngier4-12/+11
* kvm-arm64/misc-7.1: KVM: arm64: selftests: Avoid testing the IMPDEF behavior KVM: arm64: Destroy stage-2 page-table in kvm_arch_destroy_vm() KVM: arm64: Don't leave mmu->pgt dangling on kvm_init_stage2_mmu() error KVM: arm64: Prevent the host from using an smc with imm16 != 0 Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08Merge branch kvm-arm64/vgic-fixes-7.1 into kvmarm-master/nextMarc Zyngier12-92/+125
* kvm-arm64/vgic-fixes-7.1: : . : FIrst pass at fixing a number of vgic-v5 bugs that were found : after the merge of the initial series. : . KVM: arm64: Advertise ID_AA64PFR2_EL1.GCIE KVM: arm64: vgic-v5: Fold PPI state for all exposed PPIs KVM: arm64: set_id_regs: Allow GICv3 support to be set at runtime KVM: arm64: Don't advertises GICv3 in ID_PFR1_EL1 if AArch32 isn't supported KVM: arm64: Correctly plumb ID_AA64PFR2_EL1 into pkvm idreg handling KVM: arm64: Move GICv5 timer PPI validation into timer_irqs_are_valid() KVM: arm64: Remove evaluation of timer state in kvm_cpu_has_pending_timer() KVM: arm64: Kill arch_timer_context::direct field KVM: arm64: vgic-v5: Correctly set dist->ready once initialised KVM: arm64: vgic-v5: Make the effective priority mask a strict limit KVM: arm64: vgic-v5: Cast vgic_apr to u32 to avoid undefined behaviours KVM: arm64: vgic-v5: Transfer edge pending state to ICH_PPI_PENDRx_EL2 KVM: arm64: vgic-v5: Hold config_lock while finalizing GICv5 PPIs KVM: arm64: Account for RESx bits in __compute_fgt() KVM: arm64: Fix writeable mask for ID_AA64PFR2_EL1 arm64: Fix field references for ICH_PPI_DVIR[01]_EL2 KVM: arm64: Don't skip per-vcpu NV initialisation KVM: arm64: vgic: Don't reset cpuif/redist addresses at finalize time Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08Merge branch kvm-arm64/pkvm-protected-guest into kvmarm-master/nextMarc Zyngier24-232/+1384
* kvm-arm64/pkvm-protected-guest: (41 commits) : . : pKVM support for protected guests, implementing the very long : awaited support for anonymous memory, as the elusive guestmem : has failed to deliver on its promises despite a multi-year : effort. Patches courtesy of Will Deacon. From the initial cover : letter: : : "[...] this patch series implements support for protected guest : memory with pKVM, where pages are unmapped from the host as they are : faulted into the guest and can be shared back from the guest using pKVM : hypercalls. Protected guests are created using a new machine type : identifier and can be booted to a shell using the kvmtool patches : available at [2], which finally means that we are able to test the pVM : logic in pKVM. Since this is an incremental step towards full isolation : from the host (for example, the CPU register state and DMA accesses are : not yet isolated), creating a pVM requires a developer Kconfig option to : be enabled in addition to booting with 'kvm-arm.mode=protected' and : results in a kernel taint." : . KVM: arm64: Don't hold 'vm_table_lock' across guest page reclaim KVM: arm64: Allow get_pkvm_hyp_vm() to take a reference to a dying VM KVM: arm64: Prevent teardown finalisation of referenced 'hyp_vm' drivers/virt: pkvm: Add Kconfig dependency on DMA_RESTRICTED_POOL KVM: arm64: Rename PKVM_PAGE_STATE_MASK KVM: arm64: Extend pKVM page ownership selftests to cover guest hvcs KVM: arm64: Extend pKVM page ownership selftests to cover forced reclaim KVM: arm64: Register 'selftest_vm' in the VM table KVM: arm64: Extend pKVM page ownership selftests to cover guest donation KVM: arm64: Add some initial documentation for pKVM KVM: arm64: Allow userspace to create protected VMs when pKVM is enabled KVM: arm64: Implement the MEM_UNSHARE hypercall for protected VMs KVM: arm64: Implement the MEM_SHARE hypercall for protected VMs KVM: arm64: Add hvc handler at EL2 for hypercalls from protected VMs KVM: arm64: Return -EFAULT from VCPU_RUN on access to a poisoned pte KVM: arm64: Reclaim faulting page from pKVM in spurious fault handler KVM: arm64: Introduce hypercall to force reclaim of a protected page KVM: arm64: Annotate guest donations with handle and gfn in host stage-2 KVM: arm64: Change 'pkvm_handle_t' to u16 KVM: arm64: Introduce host_stage2_set_owner_metadata_locked() ... Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08Merge branch kvm-arm64/spe-trbe-nvhe into kvmarm-master/nextMarc Zyngier3-25/+95
* kvm-arm64/spe-trbe-nvhe: : . : Fix SPE and TRBE nVHE world switch which can otherwise result in : pretty bad behaviours, as they have the nasty habit of performing : out of context speculative page table walks. : : Patches courtesy of Will Deacon. : . KVM: arm64: Don't pass host_debug_state to BRBE world-switch routines KVM: arm64: Disable SPE Profiling Buffer when running in guest context KVM: arm64: Disable TRBE Trace Buffer Unit when running in guest context Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08Merge branch kvm-arm64/user_mem_abort-rework into kvmarm-master/nextMarc Zyngier1-220/+310
* kvm-arm64/user_mem_abort-rework: (30 commits) : . : user_mem_abort() has become an absolute pain to maintain, : to the point that each single fix is likely to introduce : *two* new bugs. : : Deconstruct the whole thing in logical units, reducing : the amount of visible and/or mutable state between functions, : and finally making the code a bit more maintainable. : . KVM: arm64: Convert gmem_abort() to struct kvm_s2_fault_desc KVM: arm64: Simplify integration of adjust_nested_*_perms() KVM: arm64: Directly expose mapping prot and kill kvm_s2_fault KVM: arm64: Move device mapping management into kvm_s2_fault_pin_pfn() KVM: arm64: Replace force_pte with a max_map_size attribute KVM: arm64: Move kvm_s2_fault.{pfn,page} to kvm_s2_vma_info KVM: arm64: Restrict the scope of the 'writable' attribute KVM: arm64: Kill logging_active from kvm_s2_fault KVM: arm64: Move VMA-related information to kvm_s2_fault_vma_info KVM: arm64: Kill topup_memcache from kvm_s2_fault KVM: arm64: Kill exec_fault from kvm_s2_fault KVM: arm64: Kill write_fault from kvm_s2_fault KVM: arm64: Constrain fault_granule to kvm_s2_fault_map() KVM: arm64: Replace fault_is_perm with a helper KVM: arm64: Move fault context to const structure KVM: arm64: Make fault_ipa immutable KVM: arm64: Kill fault->ipa KVM: arm64: Clean up control flow in kvm_s2_fault_map() KVM: arm64: Hoist MTE validation check out of MMU lock path KVM: arm64: Optimize early exit checks in kvm_s2_fault_pin_pfn() ... Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08Merge branch kvm-arm64/pkvm-psci into kvmarm-master/nextMarc Zyngier7-59/+43
* kvm-arm64/pkvm-psci: : . : Cleanup of the pKVM PSCI relay CPU entry code, making it slightly : easier to follow, should someone have to wade into these waters : ever again. : . KVM: arm64: Remove extra ISBs when using msr_hcr_el2 KVM: arm64: pkvm: Use direct function pointers for cpu_{on,resume} KVM: arm64: pkvm: Turn __kvm_hyp_init_cpu into an inner label KVM: arm64: pkvm: Simplify BTI handling on CPU boot KVM: arm64: pkvm: Move error handling to the end of kvm_hyp_cpu_entry Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08Merge branch kvm-arm64/nv-s2-debugfs into kvmarm-master/nextMarc Zyngier4-30/+68
* kvm-arm64/nv-s2-debugfs: : . : Expand the stage-2 ptdump infrastructure to be able to display : the content of the shadow s2 tables generated by nested virt. : : Patches courtesy of Wei-Lin Chang. : . KVM: arm64: ptdump: Initialize parser_state before pgtable walk KVM: arm64: nv: Expose shadow page tables in debugfs KVM: arm64: ptdump: Make KVM ptdump code s2 mmu aware Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08Merge branch kvm-arm64/vgic-v5-ppi into kvmarm-master/nextMarc Zyngier47-382/+3204
* kvm-arm64/vgic-v5-ppi: (40 commits) : . : Add initial GICv5 support for KVM guests, only adding PPI support : for the time being. Patches courtesy of Sascha Bischoff. : : From the cover letter: : : "This is v7 of the patch series to add the virtual GICv5 [1] device : (vgic_v5). Only PPIs are supported by this initial series, and the : vgic_v5 implementation is restricted to the CPU interface, : only. Further patch series are to follow in due course, and will add : support for SPIs, LPIs, the GICv5 IRS, and the GICv5 ITS." : . KVM: arm64: selftests: Add no-vgic-v5 selftest KVM: arm64: selftests: Introduce a minimal GICv5 PPI selftest KVM: arm64: gic-v5: Communicate userspace-driveable PPIs via a UAPI Documentation: KVM: Introduce documentation for VGICv5 KVM: arm64: gic-v5: Probe for GICv5 device KVM: arm64: gic-v5: Set ICH_VCTLR_EL2.En on boot KVM: arm64: gic-v5: Introduce kvm_arm_vgic_v5_ops and register them KVM: arm64: gic-v5: Hide FEAT_GCIE from NV GICv5 guests KVM: arm64: gic: Hide GICv5 for protected guests KVM: arm64: gic-v5: Mandate architected PPI for PMU emulation on GICv5 KVM: arm64: gic-v5: Enlighten arch timer for GICv5 irqchip/gic-v5: Introduce minimal irq_set_type() for PPIs KVM: arm64: gic-v5: Initialise ID and priority bits when resetting vcpu KVM: arm64: gic-v5: Create and initialise vgic_v5 KVM: arm64: gic-v5: Support GICv5 interrupts with KVM_IRQ_LINE KVM: arm64: gic-v5: Implement direct injection of PPIs KVM: arm64: Introduce set_direct_injection irq_op KVM: arm64: gic-v5: Trap and mask guest ICC_PPI_ENABLERx_EL1 writes KVM: arm64: gic-v5: Check for pending PPIs KVM: arm64: gic-v5: Clear TWI if single task running ... Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08Merge branch kvm-arm64/hyp-tracing into kvmarm-master/nextMarc Zyngier65-120/+4863
* kvm-arm64/hyp-tracing: (40 commits) : . : EL2 tracing support, adding both 'remote' ring-buffer : infrastructure and the tracing itself, courtesy of : Vincent Donnefort. From the cover letter: : : "The growing set of features supported by the hypervisor in protected : mode necessitates debugging and profiling tools. Tracefs is the : ideal candidate for this task: : : * It is simple to use and to script. : : * It is supported by various tools, from the trace-cmd CLI to the : Android web-based perfetto. : : * The ring-buffer, where are stored trace events consists of linked : pages, making it an ideal structure for sharing between kernel and : hypervisor. : : This series first introduces a new generic way of creating remote events and : remote buffers. Then it adds support to the pKVM hypervisor." : . tracing: selftests: Extend hotplug testing for trace remotes tracing: Non-consuming read for trace remotes with an offline CPU tracing: Adjust cmd_check_undefined to show unexpected undefined symbols tracing: Restore accidentally removed SPDX tag KVM: arm64: avoid unused-variable warning tracing: Generate undef symbols allowlist for simple_ring_buffer KVM: arm64: tracing: add ftrace dependency tracing: add more symbols to whitelist tracing: Update undefined symbols allow list for simple_ring_buffer KVM: arm64: Fix out-of-tree build for nVHE/pKVM tracing tracing: selftests: Add hypervisor trace remote tests KVM: arm64: Add selftest event support to nVHE/pKVM hyp KVM: arm64: Add hyp_enter/hyp_exit events to nVHE/pKVM hyp KVM: arm64: Add event support to the nVHE/pKVM hyp and trace remote KVM: arm64: Add trace reset to the nVHE/pKVM hyp KVM: arm64: Sync boot clock with the nVHE/pKVM hyp KVM: arm64: Add trace remote for the nVHE/pKVM hyp KVM: arm64: Add tracing capability for the nVHE/pKVM hyp KVM: arm64: Support unaligned fixmap in the pKVM hyp KVM: arm64: Initialise hyp_nr_cpus for nVHE hyp ... Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08perf/events: Replace READ_ONCE() with standard pgtable accessorsAnshuman Khandual1-3/+3
Replace raw READ_ONCE() dereferences of pgtable entries with corresponding standard page table accessors pxdp_get() in perf_get_pgtable_size(). These accessors default to READ_ONCE() on platforms that don't override them. So there is no functional change on such platforms. However arm64 platform is being extended to support 128 bit page tables via a new architecture feature i.e FEAT_D128 in which case READ_ONCE() will not provide required single copy atomic access for 128 bit page table entries. Although pxdp_get() accessors can later be overridden on arm64 platform to extend required single copy atomicity support on 128 bit entries. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260227062744.2215491-1-anshuman.khandual@arm.com
2026-04-08sched/rt: Cleanup global RT bandwidth functionsMichal Koutný1-24/+0
The commit 5f6bd380c7bdb ("sched/rt: Remove default bandwidth control") and followup changes made a few of the functions unnecessary, drop them for simplicity. Signed-off-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260323-sched-rert_groups-v3-3-1e7d5ed6b249@suse.com
2026-04-08sched/rt: Move group schedulability check to sched_rt_global_validate()Michal Koutný1-9/+8
The sched_rt_global_constraints() function is a remnant that used to set up global RT throttling but that is no more since commit 5f6bd380c7bdb ("sched/rt: Remove default bandwidth control") and the function ended up only doing schedulability check. Move the check into the validation function where it fits better. (The order of validations sched_dl_global_validate() and sched_rt_global_validate() shouldn't matter.) Signed-off-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260323-sched-rert_groups-v3-2-1e7d5ed6b249@suse.com
2026-04-08sched/rt: Skip group schedulable check with rt_group_sched=0Michal Koutný1-3/+2
The warning from the commit 87f1fb77d87a6 ("sched: Add RT_GROUP WARN checks for non-root task_groups") is wrong -- it assumes that only task_groups with rt_rq are traversed, however, the schedulability check would iterate all task_groups even when rt_group_sched=0 is disabled at boot time but some non-root task_groups exist. The schedulability check is supposed to validate: a) that children don't overcommit its parent, b) no RT task group overcommits global RT limit. but with rt_group_sched=0 there is no (non-trivial) hierarchy of RT groups, therefore skip the validation altogether. Otherwise, writes to the global sched_rt_runtime_us knob will be rejected with incorrect validation error. This fix is immaterial with CONFIG_RT_GROUP_SCHED=n. Fixes: 87f1fb77d87a6 ("sched: Add RT_GROUP WARN checks for non-root task_groups") Signed-off-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260323-sched-rert_groups-v3-1-1e7d5ed6b249@suse.com
2026-04-08sched/deadline: Use revised wakeup rule for dl_serverPeter Zijlstra1-1/+1
John noted that commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server") unfixed the issue from commit a3a70caf7906 ("sched/deadline: Fix dl_server behaviour"). The issue in commit 115135422562 was for wakeups of the server after the deadline; in which case you *have* to start a new period. The case for a3a70caf7906 is wakeups before the deadline. Now, because the server is effectively running a least-laxity policy, it means that any wakeup during the runnable phase means dl_entity_overflow() will be true. This means we need to adjust the runtime to allow it to still run until the existing deadline expires. Use the revised wakeup rule for dl_defer entities. Fixes: 115135422562 ("sched/deadline: Fix 'stuck' dl_server") Reported-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Juri Lelli <juri.lelli@redhat.com> Tested-by: John Stultz <jstultz@google.com> Link: https://patch.msgid.link/20260404102244.GB22575@noisy.programming.kicks-ass.net
2026-04-08x86/fpu: Correct the comment explaining what xfeatures_in_use() doesBorislav Petkov (AMD)1-1/+1
It returns the mask of the features which are being currently used, i.e., NOT in their initial configuration. No functional changes. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2026-04-08Merge branch 'thermal-intel'Rafael J. Wysocki2-6/+21
Merge updates of Intel thermal drivers for 7.1: - Replace cpumask_weight() in intel_hfi_offline() with cpumask_empty() which is generally more efficient (Yury Norov) - Add support for reading DDR data rate from PCI config space on Nova Lake platforms to the int340x thermal driver (Srinivas Pandruvada) * thermal-intel: thermal: intel: hfi: use cpumask_empty() in intel_hfi_offline() thermal: intel: int340x: Read DDR data rate for Nova Lake
2026-04-08thermal: core: Suspend thermal zones later and resume them earlierRafael J. Wysocki3-42/+29
To avoid some undesirable interactions between thermal zone suspend and resume with user space that is running when those operations are carried out, move them closer to the suspend and resume of devices, respectively, by updating dpm_prepare() to carry out thermal zone suspend and dpm_complete() to start thermal zone resume (that will continue asynchronously). This also makes the code easier to follow by removing one, arguably redundant, level of indirection represented by the thermal PM notifier. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Armin Wolf <W_Armin@gmx.de> Link: https://patch.msgid.link/2036875.PYKUYFuaPT@rafael.j.wysocki
2026-04-08thermal: core: Allocate thermal_class staticallyRafael J. Wysocki1-18/+12
Define thermal_class as a static structure to simplify thermal_init() and to simplify thermal class availability checks that will need to be carried out during the suspend and resume of thermal zones after subsequent changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/10831981.nUPlyArG6x@rafael.j.wysocki
2026-04-08thermal: core: Adjust thermal_wq allocation flagsRafael J. Wysocki1-2/+1
The thermal workqueue doesn't need to be freezable or per-CPU, so drop WQ_FREEZABLE and WQ_PERCPU from the flags when allocating it. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [ rjw: Subject rewrite ] Link: https://patch.msgid.link/3413335.44csPzL39Z@rafael.j.wysocki Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-04-08thermal: core: Drop redundant check from thermal_zone_device_update()Rafael J. Wysocki1-7/+1
Since __thermal_zone_device_update() checks if tz->state is TZ_STATE_READY and bails out immediately otherwise, it is not necessary to check the thermal_zone_is_present() return value in thermal_zone_device_update(). Namely, tz->state is equal to TZ_STATE_FLAG_INIT initially and that flag is only cleared in thermal_zone_init_complete() after adding tz to the list of thermal zones, and thermal_zone_exit() sets TZ_STATE_FLAG_EXIT in tz->state while removing tz from that list. Thus tz->state is not TZ_STATE_READY when tz is not in the list and the check mentioned above is redundant. Accordingly, drop the redundant thermal_zone_is_present() check from thermal_zone_device_update() and drop the former altogether because it has no more users. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/3406806.aeNJFYEL58@rafael.j.wysocki
2026-04-08thermal: core: Free thermal zone ID later during removalRafael J. Wysocki1-2/+4
The thermal zone removal ordering is different from the thermal zone registration rollback path ordering and the former is arguably problematic because freeing a thermal zone ID prematurely may cause it to be used during the registration of another thermal zone which may fail as a result. Prevent that from occurring by changing the thermal zone removal ordering to reflect the thermal zone registration rollback path ordering. Also more the ida_destroy() call from thermal_zone_device_unregister() to thermal_release() for consistency. Fixes: b31ef8285b19 ("thermal core: convert ID allocation to IDA") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/5063934.GXAFRqVoOG@rafael.j.wysocki
2026-04-08thermal: core: Fix thermal zone governor cleanup issuesRafael J. Wysocki1-3/+4
If thermal_zone_device_register_with_trips() fails after adding a thermal governor to the thermal zone being registered, the governor is not removed from it as appropriate which may lead to a memory leak. In turn, thermal_zone_device_unregister() calls thermal_set_governor() without acquiring the thermal zone lock beforehand which may race with a governor update via sysfs and may lead to a use-after-free in that case. Address these issues by adding two thermal_set_governor() calls, one to thermal_release() to remove the governor from the given thermal zone, and one to the thermal zone registration error path to cover failures preceding the thermal zone device registration. Fixes: e33df1d2f3a0 ("thermal: let governors have private data for each thermal zone") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/5092923.31r3eYUQgx@rafael.j.wysocki
2026-04-08pmdomain: qcom: rpmhpd: Add power domains for Hawi SoCFenglin Wu1-0/+38
Add the RPMh power domains required for the Hawi SoC. This includes new definitions for domains supplying specific hardware components: - DCX: supplies VDD_DISP - GBX: supplies VDD_GFX_BX Reviewed-by: Taniya Das <taniya.das@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Fenglin Wu <fenglin.wu@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2026-04-08pmdomain: Merge branch dt into nextUlf Hansson2-0/+13
Merge the immutable branch dt into next, to allow the updated DT bindings to be tested together with the pmdomain changes that are targeted for the next release. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2026-04-08dt-bindings: power: qcom,rpmhpd: Add RPMh power domain for Hawi SoCFenglin Wu2-0/+13
Document the RPMh power domain for Hawi SoC, and add definitions for the new power domains which present in Hawi SoC: - RPMHPD_DCX (Display Core X): supplies VDD_DISP for the display subsystem - RPMHPD_GBX (Graphics Box): supplies VDD_GFX_BX for the GPU/graphics subsystem Also, add constants for new power domain levels that supported in Hawi SoC, including: LOW_SVS_D3_0, LOW_SVS_D1_0, LOW_SVS_D0_0, SVS_L2_0, TURBO_L1_0/1/2, TURBO_L1_0/1/2. Signed-off-by: Fenglin Wu <fenglin.wu@oss.qualcomm.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2026-04-08pmdomain: qcom: cpr: add COMPILE_TEST supportRosen Penev1-1/+1
Allows the buildbots to build the driver on other platforms. There's nothing special arch specific thing going on here. Signed-off-by: Rosen Penev <rosenp@gmail.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2026-04-08entry: Split preemption from irqentry_exit_to_kernel_mode()Mark Rutland1-14/+59
Some architecture-specific work needs to be performed between the state management for exception entry/exit and the "real" work to handle the exception. For example, arm64 needs to manipulate a number of exception masking bits, with different exceptions requiring different masking. Generally this can all be hidden in the architecture code, but for arm64 the current structure of irqentry_exit_to_kernel_mode() makes this particularly difficult to handle in a way that is correct, maintainable, and efficient. The gory details are described in the thread surrounding: https://lore.kernel.org/lkml/acPAzdtjK5w-rNqC@J2N7QTR9R3/ The summary is: * Currently, irqentry_exit_to_kernel_mode() handles both involuntary preemption AND state management necessary for exception return. * When scheduling (including involuntary preemption), arm64 needs to have all arm64-specific exceptions unmasked, though regular interrupts must be masked. * Prior to the state management for exception return, arm64 needs to mask a number of arm64-specific exceptions, and perform some work with these exceptions masked (with RCU watching, etc). While in theory it is possible to handle this with a new arch_*() hook called somewhere under irqentry_exit_to_kernel_mode(), this is fragile and complicated, and doesn't match the flow used for exception return to user mode, which has a separate 'prepare' step (where preemption can occur) prior to the state management. To solve this, refactor irqentry_exit_to_kernel_mode() to match the style of {irqentry,syscall}_exit_to_user_mode(), moving preemption logic into a new irqentry_exit_to_kernel_mode_preempt() function, and moving state management in a new irqentry_exit_to_kernel_mode_after_preempt() function. The existing irqentry_exit_to_kernel_mode() is left as a caller of both of these, avoiding the need to modify existing callers. There should be no functional change as a result of this change. [ tglx: Updated kernel doc ] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260407131650.3813777-6-mark.rutland@arm.com
2026-04-08entry: Split kernel mode logic from irqentry_{enter,exit}()Mark Rutland2-95/+142
The generic irqentry code has entry/exit functions specifically for exceptions taken from user mode, but doesn't have entry/exit functions specifically for exceptions taken from kernel mode. It would be helpful to have separate entry/exit functions specifically for exceptions taken from kernel mode. This would make the structure of the entry code more consistent, and would make it easier for architectures to manage logic specific to exceptions taken from kernel mode. Move the logic specific to kernel mode out of irqentry_enter() and irqentry_exit() into new irqentry_enter_from_kernel_mode() and irqentry_exit_to_kernel_mode() functions. These are marked __always_inline and placed in irq-entry-common.h, as with irqentry_enter_from_user_mode() and irqentry_exit_to_user_mode(), so that they can be inlined into architecture-specific wrappers. The existing out-of-line irqentry_enter() and irqentry_exit() functions retained as callers of the new functions. The lockdep assertion from irqentry_exit() is moved into irqentry_exit_to_user_mode() and irqentry_exit_to_kernel_mode(). This was previously missing from irqentry_exit_to_user_mode() when called directly, and any new lockdep assertion failure relating from this change is a latent bug. Aside from the lockdep change noted above, there should be no functional change as a result of this change. [ tglx: Updated kernel doc ] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260407131650.3813777-5-mark.rutland@arm.com
2026-04-08entry: Move irqentry_enter() prototype laterMark Rutland1-22/+22
Subsequent patches will rework the irqentry_*() functions. The end result (and the intermediate diffs) will be much clearer if the prototype for the irqentry_enter() function is moved later, immediately before the prototype of the irqentry_exit() function. Move the prototype later. This is purely a move; there should be no functional change as a result of this change. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260407131650.3813777-4-mark.rutland@arm.com