From c5c3ef8d49e15d2fc1cec4ad7c91d81b99977440 Mon Sep 17 00:00:00 2001 From: Michael Kelley Date: Tue, 17 Feb 2026 10:23:34 -0800 Subject: Drivers: hv: vmbus: Provide option to skip VMBus unload on panic Currently, VMBus code initiates a VMBus unload in the panic path so that if a kdump kernel is loaded, it can start fresh in setting up its own VMBus connection. However, a driver for the VMBus virtual frame buffer may need to flush dirty portions of the frame buffer back to the Hyper-V host so that panic information is visible in the graphics console. To support such flushing, provide exported functions for the frame buffer driver to specify that the VMBus unload should not be done by the VMBus driver, and to initiate the VMBus unload itself. Together these allow a frame buffer driver to delay the VMBus unload until after it has completed the flush. Ideally, the VMBus driver could use its own panic-path callback to do the unload after all frame buffer drivers have finished. But DRM frame buffer drivers use the kmsg dump callback, and there are no callbacks after that in the panic path. Hence this somewhat messy approach to properly sequencing the frame buffer flush and the VMBus unload. Fixes: 3671f3777758 ("drm/hyperv: Add support for drm_panic") Signed-off-by: Michael Kelley Reviewed-by: Long Li Signed-off-by: Wei Liu --- include/linux/hyperv.h | 3 +++ 1 file changed, 3 insertions(+) (limited to 'include/linux') diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 964f1be8150c..41a3d82f0722 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1336,6 +1336,9 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, bool fb_overlap_ok); void vmbus_free_mmio(resource_size_t start, resource_size_t size); +void vmbus_initiate_unload(bool crash); +void vmbus_set_skip_unload(bool skip); + /* * GUID definitions of various offer types - services offered to the guest. */ -- cgit v1.2.3 From 840b740a35bf969734e0f2e44c21289fdd03079e Mon Sep 17 00:00:00 2001 From: Michael Kelley Date: Tue, 26 May 2026 07:13:04 -0700 Subject: mshv: Add conditional VMBus dependency When the VMBus driver is not part of the kernel (CONFIG_HYPERV_VMBUS=n), the MSHV root driver fails to link: ERROR: modpost: "hv_vmbus_exists" [drivers/hv/mshv_root.ko] undefined! Fix this while meeting these requirements: * It must be possible to include the MSHV root driver without the VMBus driver. In such case, the MSHV root driver can be built-in to the kernel image, or it can be built as a separate module. * If both the MSHV root driver and the VMBus driver are present, the MSHV root driver and VMBus driver can both be built-in, or they can both be separate modules. Or the MSHV root driver can be a module while the VMBus driver can be built-in, but the reverse is disallowed. Regardless of the build choices, the VMBus driver must be loaded before the MSHV driver in order for the SynIC to be managed properly (see comments in the MSHV SynIC code). The fix has two parts: * Add a Kconfig entry for MSHV_ROOT to depend on HYPERV_VMBUS if HYPERV_VMBUS is present. The entry disallows MSHV_ROOT being built-in when HYPERV_VMBUS is a module, but without requiring that HYPERV_VMBUS be built. * Add a stub implementation of hv_vmbus_exists() for when the VMBus driver is not present so that the MSHV root driver has no module dependency on VMBus. When the VMBus driver *is* present, the module dependency ensures that the VMBus driver loads first when both are built as modules. Existing code ensures that the VMBus driver loads first if it is built-in. The VMBus driver uses subsys_initcall(), which is initcall level 4. The MSHV root driver uses module_init(), which becomes device_init() when built-in, and device_init() is initcall level 6. Reported-by: Arnd Bergmann Closes: https://lore.kernel.org/all/20260520074044.923728-1-arnd@kernel.org/ Signed-off-by: Michael Kelley Acked-by: Arnd Bergmann Reviewed-by: Jork Loeser Reviewed-by: Hardik Garg Signed-off-by: Wei Liu --- drivers/hv/Kconfig | 1 + include/linux/hyperv.h | 4 ++++ 2 files changed, 5 insertions(+) (limited to 'include/linux') diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig index 2d0b3fcb0ff8..aa11bcefddf2 100644 --- a/drivers/hv/Kconfig +++ b/drivers/hv/Kconfig @@ -74,6 +74,7 @@ config MSHV_ROOT # e.g. When withdrawing memory, the hypervisor gives back 4k pages in # no particular order, making it impossible to reassemble larger pages depends on PAGE_SIZE_4KB + depends on HYPERV_VMBUS if HYPERV_VMBUS select EVENTFD select VIRT_XFER_TO_GUEST_WORK select HMM_MIRROR diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 41a3d82f0722..734b7ef98f4d 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1304,7 +1304,11 @@ static inline void *hv_get_drvdata(struct hv_device *dev) struct device *hv_get_vmbus_root_device(void); +#if IS_ENABLED(CONFIG_HYPERV_VMBUS) bool hv_vmbus_exists(void); +#else +static inline bool hv_vmbus_exists(void) { return false; } +#endif struct hv_ring_buffer_debug_info { u32 current_interrupt_mask; -- cgit v1.2.3 From 3c2d42b8ee345b17a4ba56b0f6492d1ff4c1178e Mon Sep 17 00:00:00 2001 From: Wupeng Ma Date: Fri, 22 May 2026 09:03:05 +0800 Subject: mm/memory-failure: fix hugetlb_lock AA deadlock in get_huge_page_for_hwpoison Two concurrent madvise(MADV_HWPOISON) calls on the same hugetlb page can trigger a recursive spinlock self-deadlock (AA deadlock) on hugetlb_lock when racing with a concurrent unmap: thread#0 thread#1 -------- -------- madvise(folio, MADV_HWPOISON) -> poisons the folio successfully madvise(folio, MADV_HWPOISON) unmap(folio) try_memory_failure_hugetlb get_huge_page_for_hwpoison spin_lock_irq(&hugetlb_lock) <- held __get_huge_page_for_hwpoison hugetlb_update_hwpoison() -> MF_HUGETLB_FOLIO_PRE_POISONED goto out: folio_put() refcount: 1 -> 0 free_huge_folio() spin_lock_irqsave(&hugetlb_lock) -> AA DEADLOCK! The out: path in __get_huge_page_for_hwpoison() calls folio_put() to drop the GUP reference while the hugetlb_lock is still held by the hugetlb.c wrapper get_huge_page_for_hwpoison(). If concurrent unmap has released the page table mapping reference, folio_put() drops the folio refcount to zero, triggering free_huge_folio() which attempts to re-acquire the non-recursive hugetlb_lock. Fix this by moving hugetlb_lock acquisition from the hugetlb.c wrapper into get_huge_page_for_hwpoison(). Place spin_unlock_irq() before the folio_put() at the out: label so the folio is always released outside the lock. [akpm@linux-foundation.org: fix race, rename label per Miaohe] Link: https://sashiko.dev/#/patchset/20260522010305.4099834-1-mawupeng1@huawei.com Link: https://lore.kernel.org/f39f405e-4b4b-8f79-70fe-a2b5b62114eb@huawei.com Link: https://lore.kernel.org/20260522010305.4099834-1-mawupeng1@huawei.com Fixes: 405ce051236c ("mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()") Signed-off-by: Wupeng Ma Acked-by: Oscar Salvador (SUSE) Acked-by: Muchun Song Reviewed-by: Kefeng Wang Acked-by: Miaohe Lin Cc: David Hildenbrand Cc: Liam Howlett Cc: Lorenzo Stoakes Cc: Michal Hocko Cc: Mike Rapoport Cc: Naoya Horiguchi Cc: Suren Baghdasaryan Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton --- include/linux/hugetlb.h | 8 -------- include/linux/mm.h | 8 -------- mm/hugetlb.c | 11 ----------- mm/memory-failure.c | 19 ++++++++++--------- 4 files changed, 10 insertions(+), 36 deletions(-) (limited to 'include/linux') diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 5957bc25efa8..2abaf99321e9 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -153,8 +153,6 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end, long freed); bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list); int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, bool unpoison); -int get_huge_page_for_hwpoison(unsigned long pfn, int flags, - bool *migratable_cleared); void folio_putback_hugetlb(struct folio *folio); void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason); void hugetlb_fix_reserve_counts(struct inode *inode); @@ -421,12 +419,6 @@ static inline int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, return 0; } -static inline int get_huge_page_for_hwpoison(unsigned long pfn, int flags, - bool *migratable_cleared) -{ - return 0; -} - static inline void folio_putback_hugetlb(struct folio *folio) { } diff --git a/include/linux/mm.h b/include/linux/mm.h index 06bbe9eba636..fc2acedf0b76 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4975,8 +4975,6 @@ extern int soft_offline_page(unsigned long pfn, int flags); */ extern const struct attribute_group memory_failure_attr_group; extern void memory_failure_queue(unsigned long pfn, int flags); -extern int __get_huge_page_for_hwpoison(unsigned long pfn, int flags, - bool *migratable_cleared); void num_poisoned_pages_inc(unsigned long pfn); void num_poisoned_pages_sub(unsigned long pfn, long i); #else @@ -4984,12 +4982,6 @@ static inline void memory_failure_queue(unsigned long pfn, int flags) { } -static inline int __get_huge_page_for_hwpoison(unsigned long pfn, int flags, - bool *migratable_cleared) -{ - return 0; -} - static inline void num_poisoned_pages_inc(unsigned long pfn) { } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 1b1d4f87a3a4..c921287489de 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -7161,17 +7161,6 @@ int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, bool unpoison return ret; } -int get_huge_page_for_hwpoison(unsigned long pfn, int flags, - bool *migratable_cleared) -{ - int ret; - - spin_lock_irq(&hugetlb_lock); - ret = __get_huge_page_for_hwpoison(pfn, flags, migratable_cleared); - spin_unlock_irq(&hugetlb_lock); - return ret; -} - /** * folio_putback_hugetlb - unisolate a hugetlb folio * @folio: the isolated hugetlb folio diff --git a/mm/memory-failure.c b/mm/memory-failure.c index ee42d4361309..d47aef256a32 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1966,20 +1966,19 @@ void folio_clear_hugetlb_hwpoison(struct folio *folio) folio_free_raw_hwp(folio, true); } -/* - * Called from hugetlb code with hugetlb_lock held. - */ -int __get_huge_page_for_hwpoison(unsigned long pfn, int flags, +static int get_huge_page_for_hwpoison(unsigned long pfn, int flags, bool *migratable_cleared) { struct page *page = pfn_to_page(pfn); - struct folio *folio = page_folio(page); + struct folio *folio; bool count_increased = false; int ret, rc; + spin_lock_irq(&hugetlb_lock); + folio = page_folio(page); if (!folio_test_hugetlb(folio)) { ret = MF_HUGETLB_NON_HUGEPAGE; - goto out; + goto out_unlock; } else if (flags & MF_COUNT_INCREASED) { ret = MF_HUGETLB_IN_USED; count_increased = true; @@ -1995,13 +1994,13 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags, } else { ret = MF_HUGETLB_RETRY; if (!(flags & MF_NO_RETRY)) - goto out; + goto out_unlock; } rc = hugetlb_update_hwpoison(folio, page); if (rc >= MF_HUGETLB_FOLIO_PRE_POISONED) { ret = rc; - goto out; + goto out_unlock; } /* @@ -2013,8 +2012,10 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags, *migratable_cleared = true; } + spin_unlock_irq(&hugetlb_lock); return ret; -out: +out_unlock: + spin_unlock_irq(&hugetlb_lock); if (count_increased) folio_put(folio); return ret; -- cgit v1.2.3 From 6d99479799c69c3cb588fcda19c81d8f61d64ecd Mon Sep 17 00:00:00 2001 From: Qing Wang Date: Tue, 2 Jun 2026 11:08:54 +0800 Subject: rseq: Fix using an uninitialized stack variable in rseq_exit_user_update() There is an bug in which an uninitialized stack variable is used in rseq_exit_user_update() as reported by syzbot: BUG: KMSAN: kernel-infoleak in rseq_set_ids_get_csaddr include/linux/rseq_entry.h:502 [inline] The local variable: struct rseq_ids ids = { .cpu_id = task_cpu(t), .mm_cid = task_mm_cid(t), .node_id = cpu_to_node(ids.cpu_id), }; According to the C standard, the evaluation order of expressions in an initializer list is indeterminately sequenced. The compiler (Clang, in this KMSAN build) evaluates `cpu_to_node(ids.cpu_id)` *before* `ids.cpu_id` is initialized with `task_cpu(t)`. This is fixed by moving the assignment of ids.node_id outside the structure initialization. Fixes: 82f572449cfe ("rseq: Implement read only ABI enforcement for optimized RSEQ V2 mode") Closes: https://syzkaller.appspot.com/bug?extid=185a631927096f9da2fc Reported-by: syzbot+185a631927096f9da2fc@syzkaller.appspotmail.com Signed-off-by: Qing Wang Signed-off-by: Peter Zijlstra (Intel) Acked-by: Mark Rutland Link: https://patch.msgid.link/20260602030854.574038-1-wangqing7171@gmail.com --- include/linux/rseq_entry.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'include/linux') diff --git a/include/linux/rseq_entry.h b/include/linux/rseq_entry.h index 63bc72086e75..ed9da6e41a2a 100644 --- a/include/linux/rseq_entry.h +++ b/include/linux/rseq_entry.h @@ -635,10 +635,11 @@ static __always_inline bool rseq_exit_user_update(struct pt_regs *regs, struct t return true; } + int cpu = task_cpu(t); struct rseq_ids ids = { - .cpu_id = task_cpu(t), + .cpu_id = cpu, .mm_cid = task_mm_cid(t), - .node_id = cpu_to_node(ids.cpu_id), + .node_id = cpu_to_node(cpu), }; return rseq_update_usr(t, regs, &ids); -- cgit v1.2.3 From 0652a3daa78723f955b1ebeb621665ce72bec53e Mon Sep 17 00:00:00 2001 From: Eva Kurchatova Date: Wed, 3 Jun 2026 18:31:42 +0300 Subject: tracing: Fix CFI violation in probestub being called by tprobes The probestub is a function to allow tprobes to hook to a tracepoint to gain access to its parameters. The function itself is only referenced by the tracepoint structure which lives in the __tracepoint section. objtool explicitly ignores that section and when processing functions in the kernel, if it detects one that has no references it will seal it to have its ENDBR stripped on boot up. This means when a tprobe is attached to the sched_wakeup tracepoint, when it is triggered it will call __probestub_sched_wakeup and due to the missing ENDBR on a CFI-enabled machine it will take a #CP exception. Fix this by adding CFI_NOSEAL annotation to probestub declaration. Cc: stable@vger.kernel.org Acked-by: Masami Hiramatsu (Google) Link: https://patch.msgid.link/20260603153147.573589-1-eva.kurchatova@virtuozzo.com Fixes: d5173f753750 ("objtool: Exclude __tracepoints data from ENDBR checks") Signed-off-by: Eva Kurchatova [ Updated change log ] Signed-off-by: Steven Rostedt --- include/linux/tracepoint.h | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'include/linux') diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index 763eea4d80d8..2d2b9f8cdda4 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -20,6 +20,7 @@ #include #include #include +#include struct module; struct tracepoint; @@ -389,6 +390,13 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p) void __probestub_##_name(void *__data, proto) \ { \ } \ + /* \ + * Annotate the probestub 'CFI_NOSEAL' to stop objtool from \ + * requesting the kernel remove the ENDBR, because the only \ + * references to the function are in the __tracepoint section, \ + * that objtool doesn't scan. \ + */ \ + CFI_NOSEAL(__probestub_##_name); \ DEFINE_STATIC_CALL(tp_func_##_name, __traceiter_##_name); \ DEFINE_RUST_DO_TRACE(_name, TP_PROTO(proto), TP_ARGS(args)) -- cgit v1.2.3 From 979c294509f9248fe1e7c358d582fb37dd5ca12d Mon Sep 17 00:00:00 2001 From: Nathan Chancellor Date: Thu, 4 Jun 2026 17:33:21 -0700 Subject: cfi: Include uaccess.h for get_kernel_nofault() After commit 0652a3daa787 ("tracing: Fix CFI violation in probestub being called by tprobes"), there are many build errors when building ARCH=arm multi_v7_defconfig + CONFIG_CFI=y like: In file included from drivers/base/devres.c:17: In file included from drivers/base/trace.h:16: In file included from include/linux/tracepoint.h:23: include/linux/cfi.h:44:6: error: call to undeclared function 'get_kernel_nofault'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 44 | if (get_kernel_nofault(hash, func - cfi_get_offset())) | ^ 1 error generated. get_kernel_nofault() is called in the generic version of cfi_get_func_hash() but nothing ensures uaccess.h is always included for a proper expansion and prototype. Include uaccess.h in cfi.h to clear up the errors. Cc: stable@vger.kernel.org Fixes: 0652a3daa787 ("tracing: Fix CFI violation in probestub being called by tprobes") Signed-off-by: Nathan Chancellor Acked-by: Masami Hiramatsu (Google) Reviewed-by: Sami Tolvanen Signed-off-by: Linus Torvalds --- include/linux/cfi.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/linux') diff --git a/include/linux/cfi.h b/include/linux/cfi.h index 1fd22ea6eba4..0f220d29225c 100644 --- a/include/linux/cfi.h +++ b/include/linux/cfi.h @@ -9,6 +9,7 @@ #include #include +#include #include #ifdef CONFIG_CFI -- cgit v1.2.3