summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2026-04-22Linux 6.19.14v6.19.14linux-6.19.yGreg Kroah-Hartman1-1/+1
Link: https://lore.kernel.org/r/20260420153934.013228280@linuxfoundation.org Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Peter Schneider <pschneider1968@googlemail.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Brett A C Sheffield <bacs@librecast.net> Tested-by: Miguel Ojeda <ojeda@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Tested-by: Justin M. Forbes <jforbes@fedoraproject.org> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Barry K. Nathan <barryn@pobox.com> Tested-by: Pavel Machek (CIP) <pavel@nabladev.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22dma-mapping: handle DMA_ATTR_CPU_CACHE_CLEAN in trace outputLeon Romanovsky1-1/+2
commit 6f45b1604cf43945ef472ae4ef30354025307c19 upstream. Tracing prints decoded DMA attribute flags, but it does not yet include the recently added DMA_ATTR_CPU_CACHE_CLEAN. Add support for decoding and displaying this attribute in the trace output. Fixes: 61868dc55a11 ("dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260316-dma-debug-overlap-v3-2-1dde90a7f08b@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22dma-debug: Allow multiple invocations of overlapping entriesLeon Romanovsky1-3/+3
commit eca58535b154e6951327319afda94ac80eae7dc3 upstream. Repeated DMA mappings with DMA_ATTR_CPU_CACHE_CLEAN trigger the following splat. This prevents using the attribute in cases where a DMA region is shared and reused more than seven times. ------------[ cut here ]------------ DMA-API: exceeded 7 overlapping mappings of cacheline 0x000000000438c440 WARNING: kernel/dma/debug.c:467 at add_dma_entry+0x219/0x280, CPU#4: ibv_rc_pingpong/1644 Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat xt_addrtype br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay mlx5_fwctl zram zsmalloc mlx5_ib fuse rpcrdma rdma_ucm ib_uverbs ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_core ib_core CPU: 4 UID: 2733 PID: 1644 Comm: ibv_rc_pingpong Not tainted 6.19.0+ #129 PREEMPT Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:add_dma_entry+0x221/0x280 Code: c0 0f 84 f2 fe ff ff 83 e8 01 89 05 6d 99 11 01 e9 e4 fe ff ff 0f 8e 1f ff ff ff 48 8d 3d 07 ef 2d 01 be 07 00 00 00 48 89 e2 <67> 48 0f b9 3a e9 06 ff ff ff 48 c7 c7 98 05 2b 82 c6 05 72 92 28 RSP: 0018:ff1100010e657970 EFLAGS: 00010002 RAX: 0000000000000007 RBX: ff1100010234eb00 RCX: 0000000000000000 RDX: ff1100010e657970 RSI: 0000000000000007 RDI: ffffffff82678660 RBP: 000000000438c440 R08: 0000000000000228 R09: 0000000000000000 R10: 00000000000001be R11: 000000000000089d R12: 0000000000000800 R13: 00000000ffffffef R14: 0000000000000202 R15: ff1100010234eb00 FS: 00007fb15f3f6740(0000) GS:ff110008dcc19000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb15f32d3a0 CR3: 0000000116f59001 CR4: 0000000000373eb0 Call Trace: <TASK> debug_dma_map_sg+0x1b4/0x390 __dma_map_sg_attrs+0x6d/0x1a0 dma_map_sgtable+0x19/0x30 ib_umem_get+0x284/0x3b0 [ib_uverbs] mlx5_ib_reg_user_mr+0x68/0x2a0 [mlx5_ib] ib_uverbs_reg_mr+0x17f/0x2a0 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xc2/0x130 [ib_uverbs] ib_uverbs_cmd_verbs+0xa0b/0xae0 [ib_uverbs] ? ib_uverbs_handler_UVERBS_METHOD_QUERY_PORT_SPEED+0xe0/0xe0 [ib_uverbs] ? mmap_region+0x7a/0xb0 ? do_mmap+0x3b8/0x5c0 ib_uverbs_ioctl+0xa7/0x110 [ib_uverbs] __x64_sys_ioctl+0x14f/0x8b0 ? ksys_mmap_pgoff+0xc5/0x190 do_syscall_64+0x8c/0xbf0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7fb15f5e4eed Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00 RSP: 002b:00007ffe09a5c540 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffe09a5c5d0 RCX: 00007fb15f5e4eed RDX: 00007ffe09a5c5f0 RSI: 00000000c0181b01 RDI: 0000000000000003 RBP: 00007ffe09a5c590 R08: 0000000000000028 R09: 00007ffe09a5c794 R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffe09a5c794 R13: 000000000000000c R14: 0000000025a49170 R15: 000000000000000c </TASK> ---[ end trace 0000000000000000 ]--- Fixes: 61868dc55a11 ("dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260316-dma-debug-overlap-v3-1-1dde90a7f08b@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22mm/userfaultfd: fix hugetlb fault mutex hash calculationJianhui Zhou2-1/+18
commit 0217c7fb4de4a40cee667eb21901f3204effe5ac upstream. In mfill_atomic_hugetlb(), linear_page_index() is used to calculate the page index for hugetlb_fault_mutex_hash(). However, linear_page_index() returns the index in PAGE_SIZE units, while hugetlb_fault_mutex_hash() expects the index in huge page units. This mismatch means that different addresses within the same huge page can produce different hash values, leading to the use of different mutexes for the same huge page. This can cause races between faulting threads, which can corrupt the reservation map and trigger the BUG_ON in resv_map_release(). Fix this by introducing hugetlb_linear_page_index(), which returns the page index in huge page granularity, and using it in place of linear_page_index(). Link: https://lkml.kernel.org/r/20260310110526.335749-1-jianhuizzzzz@gmail.com Fixes: a08c7193e4f1 ("mm/filemap: remove hugetlb special casing in filemap.c") Signed-off-by: Jianhui Zhou <jianhuizzzzz@gmail.com> Reported-by: syzbot+f525fd79634858f478e7@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f525fd79634858f478e7 Acked-by: SeongJae Park <sj@kernel.org> Reviewed-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Jane Chu <jane.chu@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: JonasZhou <JonasZhou@zhaoxin.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22media: hackrf: fix to not free memory after the device is registered in ↵Jeongjun Park1-3/+4
hackrf_probe() commit 3b7da2b4d0fe014eff181ed37e3bf832eb8ed258 upstream. In hackrf driver, the following race condition occurs: ``` CPU0 CPU1 hackrf_probe() kzalloc(); // alloc hackrf_dev .... v4l2_device_register(); .... fd = sys_open("/path/to/dev"); // open hackrf fd .... v4l2_device_unregister(); .... kfree(); // free hackrf_dev .... sys_ioctl(fd, ...); v4l2_ioctl(); video_is_registered() // UAF!! .... sys_close(fd); v4l2_release() // UAF!! hackrf_video_release() kfree(); // DFB!! ``` When a V4L2 or video device is unregistered, the device node is removed so new open() calls are blocked. However, file descriptors that are already open-and any in-flight I/O-do not terminate immediately; they remain valid until the last reference is dropped and the driver's release() is invoked. Therefore, freeing device memory on the error path after hackrf_probe() has registered dev it will lead to a race to use-after-free vuln, since those already-open handles haven't been released yet. And since release() free memory too, race to use-after-free and double-free vuln occur. To prevent this, if device is registered from probe(), it should be modified to free memory only through release() rather than calling kfree() directly. Cc: <stable@vger.kernel.org> Reported-by: syzbot+6ffd76b5405c006a46b7@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6ffd76b5405c006a46b7 Reported-by: syzbot+f1b20958f93d2d250727@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f1b20958f93d2d250727 Fixes: 8bc4a9ed8504 ("[media] hackrf: add support for transmitter") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22media: vidtv: fix pass-by-value structs causing MSAN warningsAbd-Alrhman Masalkhi3-28/+28
commit 5f8e73bde67e931468bc2a1860d78d72f0c6ba41 upstream. vidtv_ts_null_write_into() and vidtv_ts_pcr_write_into() take their argument structs by value, causing MSAN to report uninit-value warnings. While only vidtv_ts_null_write_into() has triggered a report so far, both functions share the same issue. Fix by passing both structs by const pointer instead, avoiding the stack copy of the struct along with its MSAN shadow and origin metadata. The functions do not modify the structs, which is enforced by the const qualifier. Fixes: f90cf6079bf67 ("media: vidtv: add a bridge driver") Cc: stable@vger.kernel.org Reported-by: syzbot+96f901260a0b2d29cd1a@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=96f901260a0b2d29cd1a Tested-by: syzbot+96f901260a0b2d29cd1a@syzkaller.appspotmail.com Suggested-by: Yihan Ding <dingyihan@uniontech.com> Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22nilfs2: fix NULL i_assoc_inode dereference in nilfs_mdt_save_to_shadow_mapDeepanshu Kartikey1-0/+3
commit 4a4e0328edd9e9755843787d28f16dd4165f8b48 upstream. The DAT inode's btree node cache (i_assoc_inode) is initialized lazily during btree operations. However, nilfs_mdt_save_to_shadow_map() assumes i_assoc_inode is already initialized when copying dirty pages to the shadow map during GC. If NILFS_IOCTL_CLEAN_SEGMENTS is called immediately after mount before any btree operation has occurred on the DAT inode, i_assoc_inode is NULL leading to a general protection fault. Fix this by calling nilfs_attach_btree_node_cache() on the DAT inode in nilfs_dat_read() at mount time, ensuring i_assoc_inode is always initialized before any GC operation can use it. Reported-by: syzbot+4b4093b1f24ad789bf37@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=4b4093b1f24ad789bf37 Tested-by: syzbot+4b4093b1f24ad789bf37@syzkaller.appspotmail.com Fixes: e897be17a441 ("nilfs2: fix lockdep warnings in page operations for btree nodes") Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22media: as102: fix to not free memory after the device is registered in ↵Jeongjun Park1-0/+2
as102_usb_probe() commit 8bd29dbe03fc5b0f039ab2395ff37b64236d2f0c upstream. In as102_usb driver, the following race condition occurs: ``` CPU0 CPU1 as102_usb_probe() kzalloc(); // alloc as102_dev_t .... usb_register_dev(); fd = sys_open("/path/to/dev"); // open as102 fd .... usb_deregister_dev(); .... kfree(); // free as102_dev_t .... sys_close(fd); as102_release() // UAF!! as102_usb_release() kfree(); // DFB!! ``` When a USB character device registered with usb_register_dev() is later unregistered (via usb_deregister_dev() or disconnect), the device node is removed so new open() calls fail. However, file descriptors that are already open do not go away immediately: they remain valid until the last reference is dropped and the driver's .release() is invoked. In as102, as102_usb_probe() calls usb_register_dev() and then, on an error path, does usb_deregister_dev() and frees as102_dev_t right away. If userspace raced a successful open() before the deregistration, that open FD will later hit as102_release() --> as102_usb_release() and access or free as102_dev_t again, occur a race to use-after-free and double-free vuln. The fix is to never kfree(as102_dev_t) directly once usb_register_dev() has succeeded. After deregistration, defer freeing memory to .release(). In other words, let release() perform the last kfree when the final open FD is closed. Cc: <stable@vger.kernel.org> Reported-by: syzbot+47321e8fd5a4c84088db@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=47321e8fd5a4c84088db Fixes: cd19f7d3e39b ("[media] as102: fix leaks at failure paths in as102_usb_probe()") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in ↵Shardul Bankar1-5/+3
pre_exit commit 60a25ef8dacb3566b1a8c4de00572a498e2a3bf9 upstream. wg_netns_pre_exit() manually acquires rtnl_lock() inside the pernet .pre_exit callback. This causes a hung task when another thread holds rtnl_mutex - the cleanup_net workqueue (or the setup_net failure rollback path) blocks indefinitely in wg_netns_pre_exit() waiting to acquire the lock. Convert to .exit_rtnl, introduced in commit 7a60d91c690b ("net: Add ->exit_rtnl() hook to struct pernet_operations."), where the framework already holds RTNL and batches all callbacks under a single rtnl_lock()/rtnl_unlock() pair, eliminating the contention window. The rcu_assign_pointer(wg->creating_net, NULL) is safe to move from .pre_exit to .exit_rtnl (which runs after synchronize_rcu()) because all RCU readers of creating_net either use maybe_get_net() - which returns NULL for a dying namespace with zero refcount - or access net->user_ns which remains valid throughout the entire ops_undo_list sequence. Reported-by: syzbot+f2fbf7478a35a94c8b7c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?id=cb64c22a492202ca929e18262fdb8cb89e635c70 Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com> [ Jason: added __net_exit and __read_mostly annotations that were missing. ] Fixes: 900575aa33a3 ("wireguard: device: avoid circular netns references") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://patch.msgid.link/20260414153944.2742252-5-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22bcache: fix cached_dev.sb_bio use-after-free and crashMingzhe Zou1-0/+7
commit fec114a98b8735ee89c75216c45a78e28be0f128 upstream. In our production environment, we have received multiple crash reports regarding libceph, which have caught our attention: ``` [6888366.280350] Call Trace: [6888366.280452] blk_update_request+0x14e/0x370 [6888366.280561] blk_mq_end_request+0x1a/0x130 [6888366.280671] rbd_img_handle_request+0x1a0/0x1b0 [rbd] [6888366.280792] rbd_obj_handle_request+0x32/0x40 [rbd] [6888366.280903] __complete_request+0x22/0x70 [libceph] [6888366.281032] osd_dispatch+0x15e/0xb40 [libceph] [6888366.281164] ? inet_recvmsg+0x5b/0xd0 [6888366.281272] ? ceph_tcp_recvmsg+0x6f/0xa0 [libceph] [6888366.281405] ceph_con_process_message+0x79/0x140 [libceph] [6888366.281534] ceph_con_v1_try_read+0x5d7/0xf30 [libceph] [6888366.281661] ceph_con_workfn+0x329/0x680 [libceph] ``` After analyzing the coredump file, we found that the address of dc->sb_bio has been freed. We know that cached_dev is only freed when it is stopped. Since sb_bio is a part of struct cached_dev, rather than an alloc every time. If the device is stopped while writing to the superblock, the released address will be accessed at endio. This patch hopes to wait for sb_write to complete in cached_dev_free. It should be noted that we analyzed the cause of the problem, then tell all details to the QWEN and adopted the modifications it made. Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn> Fixes: cafe563591446 ("bcache: A block layer cache") Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Coly Li <colyli@fnnas.com> Link: https://patch.msgid.link/20260322134102.480107-1-colyli@fnnas.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ALSA: 6fire: fix use-after-free on disconnectBerk Cem Goksel1-5/+12
commit b9c826916fdce6419b94eb0cd8810fdac18c2386 upstream. In usb6fire_chip_abort(), the chip struct is allocated as the card's private data (via snd_card_new with sizeof(struct sfire_chip)). When snd_card_free_when_closed() is called and no file handles are open, the card and embedded chip are freed synchronously. The subsequent chip->card = NULL write then hits freed slab memory. Call trace: usb6fire_chip_abort sound/usb/6fire/chip.c:59 [inline] usb6fire_chip_disconnect+0x348/0x358 sound/usb/6fire/chip.c:182 usb_unbind_interface+0x1a8/0x88c drivers/usb/core/driver.c:458 ... hub_event+0x1a04/0x4518 drivers/usb/core/hub.c:5953 Fix by moving the card lifecycle out of usb6fire_chip_abort() and into usb6fire_chip_disconnect(). The card pointer is saved in a local before any teardown, snd_card_disconnect() is called first to prevent new opens, URBs are aborted while chip is still valid, and snd_card_free_when_closed() is called last so chip is never accessed after the card may be freed. Fixes: a0810c3d6dd2 ("ALSA: 6fire: Release resources at card release") Cc: stable@vger.kernel.org Cc: Andrey Konovalov <andreyknvl@gmail.com> Signed-off-by: Berk Cem Goksel <berkcgoksel@gmail.com> Link: https://patch.msgid.link/20260410051341.1069716-1-berkcgoksel@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22hwmon: (powerz) Fix use-after-free on USB disconnectSanman Pradhan1-2/+6
commit 08e57f5e1a9067d5fbf33993aa7f51d60b3d13a4 upstream. After powerz_disconnect() frees the URB and releases the mutex, a subsequent powerz_read() call can acquire the mutex and call powerz_read_data(), which dereferences the freed URB pointer. Fix by: - Setting priv->urb to NULL in powerz_disconnect() so that powerz_read_data() can detect the disconnected state. - Adding a !priv->urb check at the start of powerz_read_data() to return -ENODEV on a disconnected device. - Moving usb_set_intfdata() before hwmon registration so the disconnect handler can always find the priv pointer. Fixes: 4381a36abdf1c ("hwmon: add POWER-Z driver") Cc: stable@vger.kernel.org Signed-off-by: Sanman Pradhan <psanman@juniper.net> Link: https://lore.kernel.org/r/20260410002521.422645-2-sanman.pradhan@hpe.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22media: em28xx: fix use-after-free in em28xx_v4l2_open()Abhishek Kumar1-4/+10
commit a66485a934c7187ae8e36517d40615fa2e961cff upstream. em28xx_v4l2_open() reads dev->v4l2 without holding dev->lock, creating a race with em28xx_v4l2_init()'s error path and em28xx_v4l2_fini(), both of which free the em28xx_v4l2 struct and set dev->v4l2 to NULL under dev->lock. This race leads to two issues: - use-after-free in v4l2_fh_init() when accessing vdev->ctrl_handler, since the video_device is embedded in the freed em28xx_v4l2 struct. - NULL pointer dereference in em28xx_resolution_set() when accessing v4l2->norm, since dev->v4l2 has been set to NULL. Fix this by moving the mutex_lock() before the dev->v4l2 read and adding a NULL check for dev->v4l2 under the lock. Reported-by: syzbot+c025d34b8eaa54c571b8@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c025d34b8eaa54c571b8 Fixes: 8139a4d583ab ("[media] em28xx: move v4l2 user counting fields from struct em28xx to struct v4l2") Cc: stable@vger.kernel.org Signed-off-by: Abhishek Kumar <abhishek_sts8@yahoo.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22media: mediatek: vcodec: fix use-after-free in encoder release pathFan Wu1-0/+9
commit 76e35091ffc722ba39b303e48bc5d08abb59dd56 upstream. The fops_vcodec_release() function frees the context structure (ctx) without first cancelling any pending or running work in ctx->encode_work. This creates a race window where the workqueue handler (mtk_venc_worker) may still be accessing the context memory after it has been freed. Race condition: CPU 0 (release path) CPU 1 (workqueue) --------------------- ------------------ fops_vcodec_release() v4l2_m2m_ctx_release() v4l2_m2m_cancel_job() // waits for m2m job "done" mtk_venc_worker() v4l2_m2m_job_finish() // m2m job "done" // BUT worker still running! // post-job_finish access: other ctx dereferences // UAF if ctx already freed // returns (job "done") kfree(ctx) // ctx freed Root cause: The v4l2_m2m_ctx_release() only waits for the m2m job lifecycle (via TRANS_RUNNING flag), not the workqueue lifecycle. After v4l2_m2m_job_finish() is called, the m2m framework considers the job complete and v4l2_m2m_ctx_release() returns, but the worker function continues executing and may still access ctx. The work is queued during encode operations via: queue_work(ctx->dev->encode_workqueue, &ctx->encode_work) The worker function accesses ctx->m2m_ctx, ctx->dev, and other ctx fields even after calling v4l2_m2m_job_finish(). This vulnerability was confirmed with KASAN by running an instrumented test module that widens the post-job_finish race window. KASAN detected: BUG: KASAN: slab-use-after-free in mtk_venc_worker+0x159/0x180 Read of size 4 at addr ffff88800326e000 by task kworker/u8:0/12 Workqueue: mtk_vcodec_enc_wq mtk_venc_worker Allocated by task 47: __kasan_kmalloc+0x7f/0x90 fops_vcodec_open+0x85/0x1a0 Freed by task 47: __kasan_slab_free+0x43/0x70 kfree+0xee/0x3a0 fops_vcodec_release+0xb7/0x190 Fix this by calling cancel_work_sync(&ctx->encode_work) before kfree(ctx). This ensures the workqueue handler is both cancelled (if pending) and synchronized (waits for any running handler to complete) before the context is freed. Placement rationale: The fix is placed after v4l2_ctrl_handler_free() and before list_del_init(&ctx->list). At this point, all m2m operations are done (v4l2_m2m_ctx_release() has returned), and we need to ensure the workqueue is synchronized before removing ctx from the list and freeing it. Note: The open error path does NOT need cancel_work_sync() because INIT_WORK() only initializes the work structure - it does not schedule it. Work is only scheduled later during device_run() operations. Fixes: 0934d3759615 ("media: mediatek: vcodec: separate decoder and encoder") Cc: stable@vger.kernel.org Signed-off-by: Fan Wu <fanwu01@zju.edu.cn> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22media: vidtv: fix nfeeds state corruption on start_streaming failureRuslan Valiyev1-1/+3
commit a0e5a598fe9a4612b852406b51153b881592aede upstream. syzbot reported a memory leak in vidtv_psi_service_desc_init [1]. When vidtv_start_streaming() fails inside vidtv_start_feed(), the nfeeds counter is left incremented even though no feed was actually started. This corrupts the driver state: subsequent start_feed calls see nfeeds > 1 and skip starting the mux, while stop_feed calls eventually try to stop a non-existent stream. This state corruption can also lead to memory leaks, since the mux and channel resources may be partially allocated during a failed start_streaming but never cleaned up, as the stop path finds dvb->streaming == false and returns early. Fix by decrementing nfeeds back when start_streaming fails, keeping the counter in sync with the actual number of active feeds. [1] BUG: memory leak unreferenced object 0xffff888145b50820 (size 32): comm "syz.0.17", pid 6068, jiffies 4294944486 backtrace (crc 90a0c7d4): vidtv_psi_service_desc_init+0x74/0x1b0 drivers/media/test-drivers/vidtv/vidtv_psi.c:288 vidtv_channel_s302m_init+0xb1/0x2a0 drivers/media/test-drivers/vidtv/vidtv_channel.c:83 vidtv_channels_init+0x1b/0x40 drivers/media/test-drivers/vidtv/vidtv_channel.c:524 vidtv_mux_init+0x516/0xbe0 drivers/media/test-drivers/vidtv/vidtv_mux.c:518 vidtv_start_streaming drivers/media/test-drivers/vidtv/vidtv_bridge.c:194 [inline] vidtv_start_feed+0x33e/0x4d0 drivers/media/test-drivers/vidtv/vidtv_bridge.c:239 Fixes: f90cf6079bf67 ("media: vidtv: add a bridge driver") Cc: stable@vger.kernel.org Reported-by: syzbot+639ebc6ec75e96674741@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=639ebc6ec75e96674741 Signed-off-by: Ruslan Valiyev <linuxoid@gmail.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22mm: blk-cgroup: fix use-after-free in cgwb_release_workfn()Breno Leitao1-2/+3
commit 8f5857be99f1ed1fa80991c72449541f634626ee upstream. cgwb_release_workfn() calls css_put(wb->blkcg_css) and then later accesses wb->blkcg_css again via blkcg_unpin_online(). If css_put() drops the last reference, the blkcg can be freed asynchronously (css_free_rwork_fn -> blkcg_css_free -> kfree) before blkcg_unpin_online() dereferences the pointer to access blkcg->online_pin, resulting in a use-after-free: BUG: KASAN: slab-use-after-free in blkcg_unpin_online (./include/linux/instrumented.h:112 ./include/linux/atomic/atomic-instrumented.h:400 ./include/linux/refcount.h:389 ./include/linux/refcount.h:432 ./include/linux/refcount.h:450 block/blk-cgroup.c:1367) Write of size 4 at addr ff11000117aa6160 by task kworker/71:1/531 Workqueue: cgwb_release cgwb_release_workfn Call Trace: <TASK> blkcg_unpin_online (./include/linux/instrumented.h:112 ./include/linux/atomic/atomic-instrumented.h:400 ./include/linux/refcount.h:389 ./include/linux/refcount.h:432 ./include/linux/refcount.h:450 block/blk-cgroup.c:1367) cgwb_release_workfn (mm/backing-dev.c:629) process_scheduled_works (kernel/workqueue.c:3278 kernel/workqueue.c:3385) Freed by task 1016: kfree (./include/linux/kasan.h:235 mm/slub.c:2689 mm/slub.c:6246 mm/slub.c:6561) css_free_rwork_fn (kernel/cgroup/cgroup.c:5542) process_scheduled_works (kernel/workqueue.c:3302 kernel/workqueue.c:3385) ** Stack based on commit 66672af7a095 ("Add linux-next specific files for 20260410") I am seeing this crash sporadically in Meta fleet across multiple kernel versions. A full reproducer is available at: https://github.com/leitao/debug/blob/main/reproducers/repro_blkcg_uaf.sh (The race window is narrow. To make it easily reproducible, inject a msleep(100) between css_put() and blkcg_unpin_online() in cgwb_release_workfn(). With that delay and a KASAN-enabled kernel, the reproducer triggers the splat reliably in less than a second.) Fix this by moving blkcg_unpin_online() before css_put(), so the cgwb's CSS reference keeps the blkcg alive while blkcg_unpin_online() accesses it. Link: https://lore.kernel.org/20260413-blkcg-v1-1-35b72622d16c@debian.org Fixes: 59b57717fff8 ("blkcg: delay blkg destruction until after writeback has finished") Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: David Hildenbrand <david@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: JP Kobryn <inwardvessel@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22mm/kasan: fix double free for kasan pXdsRitesh Harjani (IBM)1-4/+4
commit 51d8c78be0c27ddb91bc2c0263941d8b30a47d3b upstream. kasan_free_pxd() assumes the page table is always struct page aligned. But that's not always the case for all architectures. E.g. In case of powerpc with 64K pagesize, PUD table (of size 4096) comes from slab cache named pgtable-2^9. Hence instead of page_to_virt(pxd_page()) let's just directly pass the start of the pxd table which is passed as the 1st argument. This fixes the below double free kasan issue seen with PMEM: radix-mmu: Mapped 0x0000047d10000000-0x0000047f90000000 with 2.00 MiB pages ================================================================== BUG: KASAN: double-free in kasan_remove_zero_shadow+0x9c4/0xa20 Free of addr c0000003c38e0000 by task ndctl/2164 CPU: 34 UID: 0 PID: 2164 Comm: ndctl Not tainted 6.19.0-rc1-00048-gea1013c15392 #157 VOLUNTARY Hardware name: IBM,9080-HEX POWER10 (architected) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_012) hv:phyp pSeries Call Trace: dump_stack_lvl+0x88/0xc4 (unreliable) print_report+0x214/0x63c kasan_report_invalid_free+0xe4/0x110 check_slab_allocation+0x100/0x150 kmem_cache_free+0x128/0x6e0 kasan_remove_zero_shadow+0x9c4/0xa20 memunmap_pages+0x2b8/0x5c0 devm_action_release+0x54/0x70 release_nodes+0xc8/0x1a0 devres_release_all+0xe0/0x140 device_unbind_cleanup+0x30/0x120 device_release_driver_internal+0x3e4/0x450 unbind_store+0xfc/0x110 drv_attr_store+0x78/0xb0 sysfs_kf_write+0x114/0x140 kernfs_fop_write_iter+0x264/0x3f0 vfs_write+0x3bc/0x7d0 ksys_write+0xa4/0x190 system_call_exception+0x190/0x480 system_call_vectored_common+0x15c/0x2ec ---- interrupt: 3000 at 0x7fff93b3d3f4 NIP: 00007fff93b3d3f4 LR: 00007fff93b3d3f4 CTR: 0000000000000000 REGS: c0000003f1b07e80 TRAP: 3000 Not tainted (6.19.0-rc1-00048-gea1013c15392) MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 48888208 XER: 00000000 <...> NIP [00007fff93b3d3f4] 0x7fff93b3d3f4 LR [00007fff93b3d3f4] 0x7fff93b3d3f4 ---- interrupt: 3000 The buggy address belongs to the object at c0000003c38e0000 which belongs to the cache pgtable-2^9 of size 4096 The buggy address is located 0 bytes inside of 4096-byte region [c0000003c38e0000, c0000003c38e1000) The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x3c38c head: order:2 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:c0000003bfd63e01 flags: 0x63ffff800000040(head|node=6|zone=0|lastcpupid=0x7ffff) page_type: f5(slab) raw: 063ffff800000040 c000000140058980 5deadbeef0000122 0000000000000000 raw: 0000000000000000 0000000080200020 00000000f5000000 c0000003bfd63e01 head: 063ffff800000040 c000000140058980 5deadbeef0000122 0000000000000000 head: 0000000000000000 0000000080200020 00000000f5000000 c0000003bfd63e01 head: 063ffff800000002 c00c000000f0e301 00000000ffffffff 00000000ffffffff head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000004 page dumped because: kasan: bad access detected [ 138.953636] [ T2164] Memory state around the buggy address: [ 138.953643] [ T2164] c0000003c38dff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 138.953652] [ T2164] c0000003c38dff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 138.953661] [ T2164] >c0000003c38e0000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 138.953669] [ T2164] ^ [ 138.953675] [ T2164] c0000003c38e0080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 138.953684] [ T2164] c0000003c38e0100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 138.953692] [ T2164] ================================================================== [ 138.953701] [ T2164] Disabling lock debugging due to kernel taint Link: https://lkml.kernel.org/r/2f9135c7866c6e0d06e960993b8a5674a9ebc7ec.1771938394.git.ritesh.list@gmail.com Fixes: 0207df4fa1a8 ("kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN") Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ASoC: qcom: q6apm: move component registration to unmanaged versionSrinivas Kandagatla1-2/+12
commit 6ec1235fc941dac6c011b30ee01d9220ff87e0cd upstream. q6apm component registers dais dynamically from ASoC toplology, which are allocated using device managed version apis. Allocating both component and dynamic dais using managed version could lead to incorrect free ordering, dai will be freed while component still holding references to it. Fix this issue by moving component to unmanged version so that the dai pointers are only freeded after the component is removed. ================================================================== BUG: KASAN: slab-use-after-free in snd_soc_del_component_unlocked+0x3d4/0x400 [snd_soc_core] Read of size 8 at addr ffff00084493a6e8 by task kworker/u48:0/3426 Tainted: [W]=WARN Hardware name: LENOVO 21N2ZC5PUS/21N2ZC5PUS, BIOS N42ET57W (1.31 ) 08/08/2024 Workqueue: pdr_notifier_wq pdr_notifier_work [pdr_interface] Call trace: show_stack+0x28/0x7c (C) dump_stack_lvl+0x60/0x80 print_report+0x160/0x4b4 kasan_report+0xac/0xfc __asan_report_load8_noabort+0x20/0x34 snd_soc_del_component_unlocked+0x3d4/0x400 [snd_soc_core] snd_soc_unregister_component_by_driver+0x50/0x88 [snd_soc_core] devm_component_release+0x30/0x5c [snd_soc_core] devres_release_all+0x13c/0x210 device_unbind_cleanup+0x20/0x190 device_release_driver_internal+0x350/0x468 device_release_driver+0x18/0x30 bus_remove_device+0x1a0/0x35c device_del+0x314/0x7f0 device_unregister+0x20/0xbc apr_remove_device+0x5c/0x7c [apr] device_for_each_child+0xd8/0x160 apr_pd_status+0x7c/0xa8 [apr] pdr_notifier_work+0x114/0x240 [pdr_interface] process_one_work+0x500/0xb70 worker_thread+0x630/0xfb0 kthread+0x370/0x6c0 ret_from_fork+0x10/0x20 Allocated by task 77: kasan_save_stack+0x40/0x68 kasan_save_track+0x20/0x40 kasan_save_alloc_info+0x44/0x58 __kasan_kmalloc+0xbc/0xdc __kmalloc_node_track_caller_noprof+0x1f4/0x620 devm_kmalloc+0x7c/0x1c8 snd_soc_register_dai+0x50/0x4f0 [snd_soc_core] soc_tplg_pcm_elems_load+0x55c/0x1eb8 [snd_soc_core] snd_soc_tplg_component_load+0x4f8/0xb60 [snd_soc_core] audioreach_tplg_init+0x124/0x1fc [snd_q6apm] q6apm_audio_probe+0x10/0x1c [snd_q6apm] snd_soc_component_probe+0x5c/0x118 [snd_soc_core] soc_probe_component+0x44c/0xaf0 [snd_soc_core] snd_soc_bind_card+0xad0/0x2370 [snd_soc_core] snd_soc_register_card+0x3b0/0x4c0 [snd_soc_core] devm_snd_soc_register_card+0x50/0xc8 [snd_soc_core] x1e80100_platform_probe+0x208/0x368 [snd_soc_x1e80100] platform_probe+0xc0/0x188 really_probe+0x188/0x804 __driver_probe_device+0x158/0x358 driver_probe_device+0x60/0x190 __device_attach_driver+0x16c/0x2a8 bus_for_each_drv+0x100/0x194 __device_attach+0x174/0x380 device_initial_probe+0x14/0x20 bus_probe_device+0x124/0x154 deferred_probe_work_func+0x140/0x220 process_one_work+0x500/0xb70 worker_thread+0x630/0xfb0 kthread+0x370/0x6c0 ret_from_fork+0x10/0x20 Freed by task 3426: kasan_save_stack+0x40/0x68 kasan_save_track+0x20/0x40 __kasan_save_free_info+0x4c/0x80 __kasan_slab_free+0x78/0xa0 kfree+0x100/0x4a4 devres_release_all+0x144/0x210 device_unbind_cleanup+0x20/0x190 device_release_driver_internal+0x350/0x468 device_release_driver+0x18/0x30 bus_remove_device+0x1a0/0x35c device_del+0x314/0x7f0 device_unregister+0x20/0xbc apr_remove_device+0x5c/0x7c [apr] device_for_each_child+0xd8/0x160 apr_pd_status+0x7c/0xa8 [apr] pdr_notifier_work+0x114/0x240 [pdr_interface] process_one_work+0x500/0xb70 worker_thread+0x630/0xfb0 kthread+0x370/0x6c0 ret_from_fork+0x10/0x20 Fixes: 5477518b8a0e ("ASoC: qdsp6: audioreach: add q6apm support") Cc: Stable@vger.kernel.org Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com> Link: https://patch.msgid.link/20260402081118.348071-2-srinivas.kandagatla@oss.qualcomm.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22KVM: x86: Use scratch field in MMIO fragment to hold small write valuesSean Christopherson2-2/+15
commit 0b16e69d17d8c35c5c9d5918bf596c75a44655d3 upstream. When exiting to userspace to service an emulated MMIO write, copy the to-be-written value to a scratch field in the MMIO fragment if the size of the data payload is 8 bytes or less, i.e. can fit in a single chunk, instead of pointing the fragment directly at the source value. This fixes a class of use-after-free bugs that occur when the emulator initiates a write using an on-stack, local variable as the source, the write splits a page boundary, *and* both pages are MMIO pages. Because KVM's ABI only allows for physically contiguous MMIO requests, accesses that split MMIO pages are separated into two fragments, and are sent to userspace one at a time. When KVM attempts to complete userspace MMIO in response to KVM_RUN after the first fragment, KVM will detect the second fragment and generate a second userspace exit, and reference the on-stack variable. The issue is most visible if the second KVM_RUN is performed by a separate task, in which case the stack of the initiating task can show up as truly freed data. ================================================================== BUG: KASAN: use-after-free in complete_emulated_mmio+0x305/0x420 Read of size 1 at addr ffff888009c378d1 by task syz-executor417/984 CPU: 1 PID: 984 Comm: syz-executor417 Not tainted 5.10.0-182.0.0.95.h2627.eulerosv2r13.x86_64 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0xbe/0xfd print_address_description.constprop.0+0x19/0x170 __kasan_report.cold+0x6c/0x84 kasan_report+0x3a/0x50 check_memory_region+0xfd/0x1f0 memcpy+0x20/0x60 complete_emulated_mmio+0x305/0x420 kvm_arch_vcpu_ioctl_run+0x63f/0x6d0 kvm_vcpu_ioctl+0x413/0xb20 __se_sys_ioctl+0x111/0x160 do_syscall_64+0x30/0x40 entry_SYSCALL_64_after_hwframe+0x67/0xd1 RIP: 0033:0x42477d Code: <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007faa8e6890e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00000000004d7338 RCX: 000000000042477d RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005 RBP: 00000000004d7330 R08: 00007fff28d546df R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004d733c R13: 0000000000000000 R14: 000000000040a200 R15: 00007fff28d54720 The buggy address belongs to the page: page:0000000029f6a428 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x9c37 flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff) raw: 000fffffc0000000 0000000000000000 ffffea0000270dc8 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888009c37780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff888009c37800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >ffff888009c37880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff888009c37900: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff888009c37980: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== The bug can also be reproduced with a targeted KVM-Unit-Test by hacking KVM to fill a large on-stack variable in complete_emulated_mmio(), i.e. by overwrite the data value with garbage. Limit the use of the scratch fields to 8-byte or smaller accesses, and to just writes, as larger accesses and reads are not affected thanks to implementation details in the emulator, but add a sanity check to ensure those details don't change in the future. Specifically, KVM never uses on-stack variables for accesses larger that 8 bytes, e.g. uses an operand in the emulator context, and *all* reads are buffered through the mem_read cache. Note! Using the scratch field for reads is not only unnecessary, it's also extremely difficult to handle correctly. As above, KVM buffers all reads through the mem_read cache, and heavily relies on that behavior when re-emulating the instruction after a userspace MMIO read exit. If a read splits a page, the first page is NOT an MMIO page, and the second page IS an MMIO page, then the MMIO fragment needs to point at _just_ the second chunk of the destination, i.e. its position in the mem_read cache. Taking the "obvious" approach of copying the fragment value into the destination when re-emulating the instruction would clobber the first chunk of the destination, i.e. would clobber the data that was read from guest memory. Fixes: f78146b0f923 ("KVM: Fix page-crossing MMIO") Suggested-by: Yashu Zhang <zhangjiaji1@huawei.com> Reported-by: Yashu Zhang <zhangjiaji1@huawei.com> Closes: https://lore.kernel.org/all/369eaaa2b3c1425c85e8477066391bc7@huawei.com Cc: stable@vger.kernel.org Tested-by: Tom Lendacky <thomas.lendacky@gmail.com> Tested-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Link: https://patch.msgid.link/20260225012049.920665-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcacheLinus Torvalds6-17/+17
commit 809b997a5ce945ab470f70c187048fe4f5df20bf upstream. This finishes the work on these odd functions that were only implemented by a handful of architectures. The 'flushcache' function was only used from the iterator code, and let's make it do the same thing that the nontemporal version does: remove the two underscores and add the user address checking. Yes, yes, the user address checking is also done at iovec import time, but we have long since walked away from the old double-underscore thing where we try to avoid address checking overhead at access time, and these functions shouldn't be so special and old-fashioned. The arm64 version already did the address check, in fact, so there it's just a matter of renaming it. For powerpc and x86-64 we now do the proper user access boilerplate. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22x86: rename and clean up __copy_from_user_inatomic_nocache()Linus Torvalds8-19/+20
commit 5de7bcaadf160c1716b20a263cf8f5b06f658959 upstream. Similarly to the previous commit, this renames the somewhat confusingly named function. But in this case, it was at least less confusing: the __copy_from_user_inatomic_nocache is indeed copying from user memory, and it is indeed ok to be used in an atomic context, so it will not warn about it. But the previous commit also removed the NTB mis-use of the __copy_from_user_inatomic_nocache() function, and as a result every call-site is now _actually_ doing a real user copy. That means that we can now do the proper user pointer verification too. End result: add proper address checking, remove the double underscores, and change the "nocache" to "nontemporal" to more accurately describe what this x86-only function actually does. It might be worth noting that only the target is non-temporal: the actual user accesses are normal memory accesses. Also worth noting is that non-x86 targets (and on older 32-bit x86 CPU's before XMM2 in the Pentium III) we end up just falling back on a regular user copy, so nothing can actually depend on the non-temporal semantics, but that has always been true. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22x86-64: rename misleadingly named '__copy_user_nocache()' functionLinus Torvalds6-16/+16
commit d187a86de793f84766ea40b9ade7ac60aabbb4fe upstream. This function was a masterclass in bad naming, for various historical reasons. It claimed to be a non-cached user copy. It is literally _neither_ of those things. It's a specialty memory copy routine that uses non-temporal stores for the destination (but not the source), and that does exception handling for both source and destination accesses. Also note that while it works for unaligned targets, any unaligned parts (whether at beginning or end) will not use non-temporal stores, since only words and quadwords can be non-temporal on x86. The exception handling means that it _can_ be used for user space accesses, but not on its own - it needs all the normal "start user space access" logic around it. But typically the user space access would be the source, not the non-temporal destination. That was the original intention of this, where the destination was some fragile persistent memory target that needed non-temporal stores in order to catch machine check exceptions synchronously and deal with them gracefully. Thus that non-descriptive name: one use case was to copy from user space into a non-cached kernel buffer. However, the existing users are a mix of that intended use-case, and a couple of random drivers that just did this as a performance tweak. Some of those random drivers then actively misused the user copying version (with STAC/CLAC and all) to do kernel copies without ever even caring about the exception handling, _just_ for the non-temporal destination. Rename it as a first small step to actually make it halfway sane, and change the prototype to be more normal: it doesn't take a user pointer unless the caller has done the proper conversion, and the argument size is the full size_t (it still won't actually copy more than 4GB in one go, but there's also no reason to silently truncate the size argument in the caller). Finally, use this now sanely named function in the NTB code, which mis-used a user copy version (with STAC/CLAC and all) of this interface despite it not actually being a user copy at all. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22checkpatch: add support for Assisted-by tagSasha Levin1-0/+10
commit d1db4118489fffd2b2f612140b7acbb477880839 upstream. The Assisted-by tag was introduced in Documentation/process/coding-assistants.rst for attributing AI tool contributions to kernel patches. However, checkpatch.pl did not recognize this tag, causing two issues: WARNING: Non-standard signature: Assisted-by: ERROR: Unrecognized email address: 'AGENT_NAME:MODEL_VERSION' Fix this by: 1. Adding Assisted-by to the recognized $signature_tags list 2. Skipping email validation for Assisted-by lines since they use the AGENT_NAME:MODEL_VERSION format instead of an email address 3. Warning when the Assisted-by value doesn't match the expected format Link: https://lkml.kernel.org/r/20260311215818.518930-1-sashal@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Reported-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Joe Perches <joe@perches.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ocfs2: fix out-of-bounds write in ocfs2_write_end_inlineJoseph Qi1-0/+10
[ Upstream commit 7bc5da4842bed3252d26e742213741a4d0ac1b14 ] KASAN reports a use-after-free write of 4086 bytes in ocfs2_write_end_inline, called from ocfs2_write_end_nolock during a copy_file_range splice fallback on a corrupted ocfs2 filesystem mounted on a loop device. The actual bug is an out-of-bounds write past the inode block buffer, not a true use-after-free. The write overflows into an adjacent freed page, which KASAN reports as UAF. The root cause is that ocfs2_try_to_write_inline_data trusts the on-disk id_count field to determine whether a write fits in inline data. On a corrupted filesystem, id_count can exceed the physical maximum inline data capacity, causing writes to overflow the inode block buffer. Call trace (crash path): vfs_copy_file_range (fs/read_write.c:1634) do_splice_direct splice_direct_to_actor iter_file_splice_write ocfs2_file_write_iter generic_perform_write ocfs2_write_end ocfs2_write_end_nolock (fs/ocfs2/aops.c:1949) ocfs2_write_end_inline (fs/ocfs2/aops.c:1915) memcpy_from_folio <-- KASAN: write OOB So add id_count upper bound check in ocfs2_validate_inode_block() to alongside the existing i_size check to fix it. Link: https://lkml.kernel.org/r/20260403063830.3662739-1-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: syzbot+62c1793956716ea8b28a@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=62c1793956716ea8b28a Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ocfs2: validate inline data i_size during inode readDeepanshu Kartikey1-6/+19
[ Upstream commit 1524af3685b35feac76662cc551cbc37bd14775f ] When reading an inode from disk, ocfs2_validate_inode_block() performs various sanity checks but does not validate the size of inline data. If the filesystem is corrupted, an inode's i_size can exceed the actual inline data capacity (id_count). This causes ocfs2_dir_foreach_blk_id() to iterate beyond the inline data buffer, triggering a use-after-free when accessing directory entries from freed memory. In the syzbot report: - i_size was 1099511627576 bytes (~1TB) - Actual inline data capacity (id_count) is typically <256 bytes - A garbage rec_len (54648) caused ctx->pos to jump out of bounds - This triggered a UAF in ocfs2_check_dir_entry() Fix by adding a validation check in ocfs2_validate_inode_block() to ensure inodes with inline data have i_size <= id_count. This catches the corruption early during inode read and prevents all downstream code from operating on invalid data. Link: https://lkml.kernel.org/r/20251212052132.16750-1-kartikey406@gmail.com Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Reported-by: syzbot+c897823f699449cc3eb4@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c897823f699449cc3eb4 Tested-by: syzbot+c897823f699449cc3eb4@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/20251211115231.3560028-1-kartikey406@gmail.com/T/ [v1] Link: https://lore.kernel.org/all/20251212040400.6377-1-kartikey406@gmail.com/T/ [v2] Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 7bc5da4842be ("ocfs2: fix out-of-bounds write in ocfs2_write_end_inline") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22KVM: x86: Use __DECLARE_FLEX_ARRAY() for UAPI structures with VLAsDavid Woodhouse2-11/+12
[ Upstream commit 2619da73bb2f10d88f7e1087125c40144fdf0987 ] Commit 94dfc73e7cf4 ("treewide: uapi: Replace zero-length arrays with flexible-array members") broke the userspace API for C++. These structures ending in VLAs are typically a *header*, which can be followed by an arbitrary number of entries. Userspace typically creates a larger structure with some non-zero number of entries, for example in QEMU's kvm_arch_get_supported_msr_feature(): struct { struct kvm_msrs info; struct kvm_msr_entry entries[1]; } msr_data = {}; While that works in C, it fails in C++ with an error like: flexible array member 'kvm_msrs::entries' not at end of 'struct msr_data' Fix this by using __DECLARE_FLEX_ARRAY() for the VLA, which uses [0] for C++ compilation. Fixes: 94dfc73e7cf4 ("treewide: uapi: Replace zero-length arrays with flexible-array members") Cc: stable@vger.kernel.org Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://patch.msgid.link/3abaf6aefd6e5efeff3b860ac38421d9dec908db.camel@infradead.org [sean: tag for stable@] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22KVM: Remove subtle "struct kvm_stats_desc" pseudo-overlaySean Christopherson14-75/+70
[ Upstream commit da142f3d373a6ddaca0119615a8db2175ddc4121 ] Remove KVM's internal pseudo-overlay of kvm_stats_desc, which subtly aliases the flexible name[] in the uAPI definition with a fixed-size array of the same name. The unusual embedded structure results in compiler warnings due to -Wflex-array-member-not-at-end, and also necessitates an extra level of dereferencing in KVM. To avoid the "overlay", define the uAPI structure to have a fixed-size name when building for the kernel. Opportunistically clean up the indentation for the stats macros, and replace spaces with tabs. No functional change intended. Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org> Closes: https://lore.kernel.org/all/aPfNKRpLfhmhYqfP@kspp Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> [..] Acked-by: Anup Patel <anup@brainfault.org> Reviewed-by: Bibo Mao <maobibo@loongson.cn> Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://patch.msgid.link/20251205232655.445294-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Stable-dep-of: 2619da73bb2f ("KVM: x86: Use __DECLARE_FLEX_ARRAY() for UAPI structures with VLAs") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22selftests/bpf: Test refinement of single-value tnumPaul Chaignon1-0/+137
commit e6ad477d1bf8829973cddd9accbafa9d1a6cd15a upstream. This patch introduces selftests to cover the new bounds refinement logic introduced in the previous patch. Without the previous patch, the first two tests fail because of the invariant violation they trigger. The last test fails because the R10 access is not detected as dead code. In addition, all three tests fail because of R0 having a non-constant value in the verifier logs. In addition, the last two cases are covering the negative cases: when we shouldn't refine the bounds because the u64 and tnum overlap in at least two values. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/90d880c8cf587b9f7dc715d8961cd1b8111d01a8.1772225741.git.paul.chaignon@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> [shung-hsi.yu: test for backported upstream commit efc11a667878 ("bpf: Improve bounds when tnum has a single possible value")] Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22mm: call ->free_folio() directly in folio_unmap_invalidate()Matthew Wilcox (Oracle)3-3/+7
commit 615d9bb2ccad42f9e21d837431e401db2e471195 upstream. We can only call filemap_free_folio() if we have a reference to (or hold a lock on) the mapping. Otherwise, we've already removed the folio from the mapping so it no longer pins the mapping and the mapping can be removed, causing a use-after-free when accessing mapping->a_ops. Follow the same pattern as __remove_mapping() and load the free_folio function pointer before dropping the lock on the mapping. That lets us make filemap_free_folio() static as this was the only caller outside filemap.c. Link: https://lore.kernel.org/20260413184314.3419945-1-willy@infradead.org Fixes: fb7d3bc41493 ("mm/filemap: drop streaming/uncached pages when writeback completes") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: Google Big Sleep <big-sleep-vuln-reports+bigsleep-501448199@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jan Kara <jack@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22KVM: SEV: Drop WARN on large size for KVM_MEMORY_ENCRYPT_REG_REGIONSean Christopherson1-4/+7
commit 8acffeef5ef720c35e513e322ab08e32683f32f2 upstream. Drop the WARN in sev_pin_memory() on npages overflowing an int, as the WARN is comically trivially to trigger from userspace, e.g. by doing: struct kvm_enc_region range = { .addr = 0, .size = -1ul, }; __vm_ioctl(vm, KVM_MEMORY_ENCRYPT_REG_REGION, &range); Note, the checks in sev_mem_enc_register_region() that presumably exist to verify the incoming address+size are completely worthless, as both "addr" and "size" are u64s and SEV is 64-bit only, i.e. they _can't_ be greater than ULONG_MAX. That wart will be cleaned up in the near future. if (range->addr > ULONG_MAX || range->size > ULONG_MAX) return -EINVAL; Opportunistically add a comment to explain why the code calculates the number of pages the "hard" way, e.g. instead of just shifting @ulen. Fixes: 78824fabc72e ("KVM: SVM: fix svn_pin_memory()'s use of get_user_pages_fast()") Cc: stable@vger.kernel.org Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Tested-by: Liam Merwick <liam.merwick@oracle.com> Link: https://patch.msgid.link/20260313003302.3136111-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22KVM: SEV: Lock all vCPUs when synchronzing VMSAs for SNP launch finishSean Christopherson1-4/+12
commit cb923ee6a80f4e604e6242a4702b59251e61a380 upstream. Lock all vCPUs when synchronizing and encrypting VMSAs for SNP guests, as allowing userspace to manipulate and/or run a vCPU while its state is being synchronized would at best corrupt vCPU state, and at worst crash the host kernel. Opportunistically assert that vcpu->mutex is held when synchronizing its VMSA (the SEV-ES path already locks vCPUs). Fixes: ad27ce155566 ("KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260310234829.2608037-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22KVM: SEV: Disallow LAUNCH_FINISH if vCPUs are actively being createdSean Christopherson2-2/+15
commit 624bf3440d7214b62c22d698a0a294323f331d5d upstream. Reject LAUNCH_FINISH for SEV-ES and SNP VMs if KVM is actively creating one or more vCPUs, as KVM needs to process and encrypt each vCPU's VMSA. Letting userspace create vCPUs while LAUNCH_FINISH is in-progress is "fine", at least in the current code base, as kvm_for_each_vcpu() operates on online_vcpus, LAUNCH_FINISH (all SEV+ sub-ioctls) holds kvm->mutex, and fully onlining a vCPU in kvm_vm_ioctl_create_vcpu() is done under kvm->mutex. I.e. there's no difference between an in-progress vCPU and a vCPU that is created entirely after LAUNCH_FINISH. However, given that concurrent LAUNCH_FINISH and vCPU creation can't possibly work (for any reasonable definition of "work"), since userspace can't guarantee whether a particular vCPU will be encrypted or not, disallow the combination as a hardening measure, to reduce the probability of introducing bugs in the future, and to avoid having to reason about the safety of future changes related to LAUNCH_FINISH. Cc: Jethro Beekman <jethro@fortanix.com> Closes: https://lore.kernel.org/all/b31f7c6e-2807-4662-bcdd-eea2c1e132fa@fortanix.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260310234829.2608037-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22KVM: SEV: Protect *all* of sev_mem_enc_register_region() with kvm->lockSean Christopherson1-4/+2
commit b6408b6cec5df76a165575777800ef2aba12b109 upstream. Take and hold kvm->lock for before checking sev_guest() in sev_mem_enc_register_region(), as sev_guest() isn't stable unless kvm->lock is held (or KVM can guarantee KVM_SEV_INIT{2} has completed and can't rollack state). If KVM_SEV_INIT{2} fails, KVM can end up trying to add to a not-yet-initialized sev->regions_list, e.g. triggering a #GP Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 110 UID: 0 PID: 72717 Comm: syz.15.11462 Tainted: G U W O 6.16.0-smp-DEV #1 NONE Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 12.52.0-0 10/28/2024 RIP: 0010:sev_mem_enc_register_region+0x3f0/0x4f0 ../include/linux/list.h:83 Code: <41> 80 3c 04 00 74 08 4c 89 ff e8 f1 c7 a2 00 49 39 ed 0f 84 c6 00 RSP: 0018:ffff88838647fbb8 EFLAGS: 00010256 RAX: dffffc0000000000 RBX: 1ffff92015cf1e0b RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff888367870000 RBP: ffffc900ae78f050 R08: ffffea000d9e0007 R09: 1ffffd4001b3c000 R10: dffffc0000000000 R11: fffff94001b3c001 R12: 0000000000000000 R13: ffff8982ab0bde00 R14: ffffc900ae78f058 R15: 0000000000000000 FS: 00007f34e9dc66c0(0000) GS:ffff89ee64d33000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe180adef98 CR3: 000000047210e000 CR4: 0000000000350ef0 Call Trace: <TASK> kvm_arch_vm_ioctl+0xa72/0x1240 ../arch/x86/kvm/x86.c:7371 kvm_vm_ioctl+0x649/0x990 ../virt/kvm/kvm_main.c:5363 __se_sys_ioctl+0x101/0x170 ../fs/ioctl.c:51 do_syscall_x64 ../arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x6f/0x1f0 ../arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f34e9f7e9a9 Code: <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f34e9dc6038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f34ea1a6080 RCX: 00007f34e9f7e9a9 RDX: 0000200000000280 RSI: 000000008010aebb RDI: 0000000000000007 RBP: 00007f34ea000d69 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007f34ea1a6080 R15: 00007ffce77197a8 </TASK> with a syzlang reproducer that looks like: syz_kvm_add_vcpu$x86(0x0, &(0x7f0000000040)={0x0, &(0x7f0000000180)=ANY=[], 0x70}) (async) syz_kvm_add_vcpu$x86(0x0, &(0x7f0000000080)={0x0, &(0x7f0000000180)=ANY=[@ANYBLOB="..."], 0x4f}) (async) r0 = openat$kvm(0xffffffffffffff9c, &(0x7f0000000200), 0x0, 0x0) r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0) r2 = openat$kvm(0xffffffffffffff9c, &(0x7f0000000240), 0x0, 0x0) r3 = ioctl$KVM_CREATE_VM(r2, 0xae01, 0x0) ioctl$KVM_SET_CLOCK(r3, 0xc008aeba, &(0x7f0000000040)={0x1, 0x8, 0x0, 0x5625e9b0}) (async) ioctl$KVM_SET_PIT2(r3, 0x8010aebb, &(0x7f0000000280)={[...], 0x5}) (async) ioctl$KVM_SET_PIT2(r1, 0x4070aea0, 0x0) (async) r4 = ioctl$KVM_CREATE_VM(0xffffffffffffffff, 0xae01, 0x0) openat$kvm(0xffffffffffffff9c, 0x0, 0x0, 0x0) (async) ioctl$KVM_SET_USER_MEMORY_REGION(r4, 0x4020ae46, &(0x7f0000000400)={0x0, 0x0, 0x0, 0x2000, &(0x7f0000001000/0x2000)=nil}) (async) r5 = ioctl$KVM_CREATE_VCPU(r4, 0xae41, 0x2) close(r0) (async) openat$kvm(0xffffffffffffff9c, &(0x7f0000000000), 0x8000, 0x0) (async) ioctl$KVM_SET_GUEST_DEBUG(r5, 0x4048ae9b, &(0x7f0000000300)={0x4376ea830d46549b, 0x0, [0x46, 0x0, 0x0, 0x0, 0x0, 0x1000]}) (async) ioctl$KVM_RUN(r5, 0xae80, 0x0) Opportunistically use guard() to avoid having to define a new error label and goto usage. Fixes: 1e80fdc09d12 ("KVM: SVM: Pin guest memory when SEV is active") Cc: stable@vger.kernel.org Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Link: https://patch.msgid.link/20260310234829.2608037-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22KVM: SEV: Reject attempts to sync VMSA of an already-launched/encrypted vCPUSean Christopherson1-0/+3
commit 9b9f7962e3e879d12da2bf47e02a24ec51690e3d upstream. Reject synchronizing vCPU state to its associated VMSA if the vCPU has already been launched, i.e. if the VMSA has already been encrypted. On a host with SNP enabled, accessing guest-private memory generates an RMP #PF and panics the host. BUG: unable to handle page fault for address: ff1276cbfdf36000 #PF: supervisor write access in kernel mode #PF: error_code(0x80000003) - RMP violation PGD 5a31801067 P4D 5a31802067 PUD 40ccfb5063 PMD 40e5954063 PTE 80000040fdf36163 SEV-SNP: PFN 0x40fdf36, RMP entry: [0x6010fffffffff001 - 0x000000000000001f] Oops: Oops: 0003 [#1] SMP NOPTI CPU: 33 UID: 0 PID: 996180 Comm: qemu-system-x86 Tainted: G OE Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: Dell Inc. PowerEdge R7625/0H1TJT, BIOS 1.5.8 07/21/2023 RIP: 0010:sev_es_sync_vmsa+0x54/0x4c0 [kvm_amd] Call Trace: <TASK> snp_launch_update_vmsa+0x19d/0x290 [kvm_amd] snp_launch_finish+0xb6/0x380 [kvm_amd] sev_mem_enc_ioctl+0x14e/0x720 [kvm_amd] kvm_arch_vm_ioctl+0x837/0xcf0 [kvm] kvm_vm_ioctl+0x3fd/0xcc0 [kvm] __x64_sys_ioctl+0xa3/0x100 x64_sys_call+0xfe0/0x2350 do_syscall_64+0x81/0x10f0 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7ffff673287d </TASK> Note, the KVM flaw has been present since commit ad73109ae7ec ("KVM: SVM: Provide support to launch and run an SEV-ES guest"), but has only been actively dangerous for the host since SNP support was added. With SEV-ES, KVM would "just" clobber guest state, which is totally fine from a host kernel perspective since userspace can clobber guest state any time before sev_launch_update_vmsa(). Fixes: ad27ce155566 ("KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command") Reported-by: Jethro Beekman <jethro@fortanix.com> Closes: https://lore.kernel.org/all/d98692e2-d96b-4c36-8089-4bc1e5cc3d57@fortanix.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260310234829.2608037-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22KVM: selftests: Remove duplicate LAUNCH_UPDATE_VMSA call in SEV-ES migrate testSean Christopherson1-2/+0
commit 25a642b6abc98bbbabbf2baef9fc498bbea6aee6 upstream. Drop the explicit KVM_SEV_LAUNCH_UPDATE_VMSA call when creating an SEV-ES VM in the SEV migration test, as sev_vm_create() automatically updates the VMSA pages for SEV-ES guests. The only reason the duplicate call doesn't cause visible problems is because the test doesn't actually try to run the vCPUs. That will change when KVM adds a check to prevent userspace from re-launching a VMSA (which corrupts the VMSA page due to KVM writing encrypted private memory). Fixes: 69f8e15ab61f ("KVM: selftests: Use the SEV library APIs in the intra-host migration test") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260310234829.2608037-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardownKoichiro Den1-18/+1
commit 0da63230d3ec1ec5fcc443a2314233e95bfece54 upstream. epf_ntb_epc_destroy() duplicates the teardown that the caller is supposed to perform later. This leads to an oops when .allow_link fails or when .drop_link is performed. The following is an example oops of the former case: Unable to handle kernel paging request at virtual address dead000000000108 [...] [dead000000000108] address between user and kernel address ranges Internal error: Oops: 0000000096000044 [#1] SMP [...] Call trace: pci_epc_remove_epf+0x78/0xe0 (P) pci_primary_epc_epf_link+0x88/0xa8 configfs_symlink+0x1f4/0x5a0 vfs_symlink+0x134/0x1d8 do_symlinkat+0x88/0x138 __arm64_sys_symlinkat+0x74/0xe0 [...] Remove the helper, and drop pci_epc_put(). EPC device refcounting is tied to the configfs EPC group lifetime, and pci_epc_put() in the .drop_link path is sufficient. Fixes: e35f56bb0330 ("PCI: endpoint: Support NTB transfer between RC and EP") Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260226084142.2226875-2-den@valinux.co.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22PCI: endpoint: pci-epf-vntb: Stop cmd_handler work in epf_ntb_epc_cleanupKoichiro Den1-0/+1
commit d799984233a50abd2667a7d17a9a710a3f10ebe2 upstream. Disable the delayed work before clearing BAR mappings and doorbells to avoid running the handler after resources have been torn down. Unable to handle kernel paging request at virtual address ffff800083f46004 [...] Internal error: Oops: 0000000096000007 [#1] SMP [...] Call trace: epf_ntb_cmd_handler+0x54/0x200 [pci_epf_vntb] (P) process_one_work+0x154/0x3b0 worker_thread+0x2c8/0x400 kthread+0x148/0x210 ret_from_fork+0x10/0x20 Fixes: e35f56bb0330 ("PCI: endpoint: Support NTB transfer between RC and EP") Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260226084142.2226875-4-den@valinux.co.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ocfs2: handle invalid dinode in ocfs2_group_extendZhengYuan Huang1-3/+7
commit 4a1c0ddc6e7bcf2e9db0eeaab9340dcfe97f448f upstream. [BUG] kernel BUG at fs/ocfs2/resize.c:308! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI RIP: 0010:ocfs2_group_extend+0x10aa/0x1ae0 fs/ocfs2/resize.c:308 Code: 8b8520ff ffff83f8 860f8580 030000e8 5cc3c1fe Call Trace: ... ocfs2_ioctl+0x175/0x6e0 fs/ocfs2/ioctl.c:869 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl fs/ioctl.c:583 [inline] __x64_sys_ioctl+0x197/0x1e0 fs/ioctl.c:583 x64_sys_call+0x1144/0x26a0 arch/x86/include/generated/asm/syscalls_64.h:17 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x93/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x76/0x7e ... [CAUSE] ocfs2_group_extend() assumes that the global bitmap inode block returned from ocfs2_inode_lock() has already been validated and BUG_ONs when the signature is not a dinode. That assumption is too strong for crafted filesystems because the JBD2-managed buffer path can bypass structural validation and return an invalid dinode to the resize ioctl. [FIX] Validate the dinode explicitly in ocfs2_group_extend(). If the global bitmap buffer does not contain a valid dinode, report filesystem corruption with ocfs2_error() and fail the resize operation instead of crashing the kernel. Link: https://lkml.kernel.org/r/20260401092303.3709187-1-gality369@gmail.com Fixes: 10995aa2451a ("ocfs2: Morph the haphazard OCFS2_IS_VALID_DINODE() checks.") Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRYTejas Bharambe2-10/+7
commit 7de554cabf160e331e4442e2a9ad874ca9875921 upstream. filemap_fault() may drop the mmap_lock before returning VM_FAULT_RETRY, as documented in mm/filemap.c: "If our return value has VM_FAULT_RETRY set, it's because the mmap_lock may be dropped before doing I/O or by lock_folio_maybe_drop_mmap()." When this happens, a concurrent munmap() can call remove_vma() and free the vm_area_struct via RCU. The saved 'vma' pointer in ocfs2_fault() then becomes a dangling pointer, and the subsequent trace_ocfs2_fault() call dereferences it -- a use-after-free. Fix this by saving ip_blkno as a plain integer before calling filemap_fault(), and removing vma from the trace event. Since ip_blkno is copied by value before the lock can be dropped, it remains valid regardless of what happens to the vma or inode afterward. Link: https://lkml.kernel.org/r/20260410083816.34951-1-tejas.bharambe@outlook.com Fixes: 614a9e849ca6 ("ocfs2: Remove FILE_IO from masklog.") Signed-off-by: Tejas Bharambe <tejas.bharambe@outlook.com> Reported-by: syzbot+a49010a0e8fcdeea075f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a49010a0e8fcdeea075f Suggested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ocfs2: fix possible deadlock between unlink and dio_end_io_writeJoseph Qi1-2/+1
commit b02da26a992db0c0e2559acbda0fc48d4a2fd337 upstream. ocfs2_unlink takes orphan dir inode_lock first and then ip_alloc_sem, while in ocfs2_dio_end_io_write, it acquires these locks in reverse order. This creates an ABBA lock ordering violation on lock classes ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE] and ocfs2_file_ip_alloc_sem_key. Lock Chain #0 (orphan dir inode_lock -> ip_alloc_sem): ocfs2_unlink ocfs2_prepare_orphan_dir ocfs2_lookup_lock_orphan_dir inode_lock(orphan_dir_inode) <- lock A __ocfs2_prepare_orphan_dir ocfs2_prepare_dir_for_insert ocfs2_extend_dir ocfs2_expand_inline_dir down_write(&oi->ip_alloc_sem) <- Lock B Lock Chain #1 (ip_alloc_sem -> orphan dir inode_lock): ocfs2_dio_end_io_write down_write(&oi->ip_alloc_sem) <- Lock B ocfs2_del_inode_from_orphan() inode_lock(orphan_dir_inode) <- Lock A Deadlock Scenario: CPU0 (unlink) CPU1 (dio_end_io_write) ------ ------ inode_lock(orphan_dir_inode) down_write(ip_alloc_sem) down_write(ip_alloc_sem) inode_lock(orphan_dir_inode) Since ip_alloc_sem is to protect allocation changes, which is unrelated with operations in ocfs2_del_inode_from_orphan. So move ocfs2_del_inode_from_orphan out of ip_alloc_sem to fix the deadlock. Link: https://lkml.kernel.org/r/20260306032211.1016452-1-joseph.qi@linux.alibaba.com Reported-by: syzbot+67b90111784a3eac8c04@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=67b90111784a3eac8c04 Fixes: a86a72a4a4e0 ("ocfs2: take ip_alloc_sem in ocfs2_dio_get_block & ocfs2_dio_end_io_write") Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22media: vidtv: fix NULL pointer dereference in vidtv_channel_pmt_match_sectionsRuslan Valiyev1-0/+4
commit f8e1fc918a9fe67103bcda01d20d745f264d00a7 upstream. syzbot reported a general protection fault in vidtv_psi_desc_assign [1]. vidtv_psi_pmt_stream_init() can return NULL on memory allocation failure, but vidtv_channel_pmt_match_sections() does not check for this. When tail is NULL, the subsequent call to vidtv_psi_desc_assign(&tail->descriptor, desc) dereferences a NULL pointer offset, causing a general protection fault. Add a NULL check after vidtv_psi_pmt_stream_init(). On failure, clean up the already-allocated stream chain and return. [1] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] RIP: 0010:vidtv_psi_desc_assign+0x24/0x90 drivers/media/test-drivers/vidtv/vidtv_psi.c:629 Call Trace: <TASK> vidtv_channel_pmt_match_sections drivers/media/test-drivers/vidtv/vidtv_channel.c:349 [inline] vidtv_channel_si_init+0x1445/0x1a50 drivers/media/test-drivers/vidtv/vidtv_channel.c:479 vidtv_mux_init+0x526/0xbe0 drivers/media/test-drivers/vidtv/vidtv_mux.c:519 vidtv_start_streaming drivers/media/test-drivers/vidtv/vidtv_bridge.c:194 [inline] vidtv_start_feed+0x33e/0x4d0 drivers/media/test-drivers/vidtv/vidtv_bridge.c:239 Fixes: f90cf6079bf67 ("media: vidtv: add a bridge driver") Cc: stable@vger.kernel.org Reported-by: syzbot+1f5bcc7c919ec578777a@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1f5bcc7c919ec578777a Signed-off-by: Ruslan Valiyev <linuxoid@gmail.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22arm64: mm: Handle invalid large leaf mappings correctlyRyan Roberts5-59/+48
commit 15bfba1ad77fad8e45a37aae54b3c813b33fe27c upstream. It has been possible for a long time to mark ptes in the linear map as invalid. This is done for secretmem, kfence, realm dma memory un/share, and others, by simply clearing the PTE_VALID bit. But until commit a166563e7ec37 ("arm64: mm: support large block mapping when rodata=full") large leaf mappings were never made invalid in this way. It turns out various parts of the code base are not equipped to handle invalid large leaf mappings (in the way they are currently encoded) and I've observed a kernel panic while booting a realm guest on a BBML2_NOABORT system as a result: [ 15.432706] software IO TLB: Memory encryption is active and system is using DMA bounce buffers [ 15.476896] Unable to handle kernel paging request at virtual address ffff000019600000 [ 15.513762] Mem abort info: [ 15.527245] ESR = 0x0000000096000046 [ 15.548553] EC = 0x25: DABT (current EL), IL = 32 bits [ 15.572146] SET = 0, FnV = 0 [ 15.592141] EA = 0, S1PTW = 0 [ 15.612694] FSC = 0x06: level 2 translation fault [ 15.640644] Data abort info: [ 15.661983] ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000 [ 15.694875] CM = 0, WnR = 1, TnD = 0, TagAccess = 0 [ 15.723740] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 15.755776] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000081f3f000 [ 15.800410] [ffff000019600000] pgd=0000000000000000, p4d=180000009ffff403, pud=180000009fffe403, pmd=00e8000199600704 [ 15.855046] Internal error: Oops: 0000000096000046 [#1] SMP [ 15.886394] Modules linked in: [ 15.900029] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 7.0.0-rc4-dirty #4 PREEMPT [ 15.935258] Hardware name: linux,dummy-virt (DT) [ 15.955612] pstate: 21400005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 15.986009] pc : __pi_memcpy_generic+0x128/0x22c [ 16.006163] lr : swiotlb_bounce+0xf4/0x158 [ 16.024145] sp : ffff80008000b8f0 [ 16.038896] x29: ffff80008000b8f0 x28: 0000000000000000 x27: 0000000000000000 [ 16.069953] x26: ffffb3976d261ba8 x25: 0000000000000000 x24: ffff000019600000 [ 16.100876] x23: 0000000000000001 x22: ffff0000043430d0 x21: 0000000000007ff0 [ 16.131946] x20: 0000000084570010 x19: 0000000000000000 x18: ffff00001ffe3fcc [ 16.163073] x17: 0000000000000000 x16: 00000000003fffff x15: 646e612065766974 [ 16.194131] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 [ 16.225059] x11: 0000000000000000 x10: 0000000000000010 x9 : 0000000000000018 [ 16.256113] x8 : 0000000000000018 x7 : 0000000000000000 x6 : 0000000000000000 [ 16.287203] x5 : ffff000019607ff0 x4 : ffff000004578000 x3 : ffff000019600000 [ 16.318145] x2 : 0000000000007ff0 x1 : ffff000004570010 x0 : ffff000019600000 [ 16.349071] Call trace: [ 16.360143] __pi_memcpy_generic+0x128/0x22c (P) [ 16.380310] swiotlb_tbl_map_single+0x154/0x2b4 [ 16.400282] swiotlb_map+0x5c/0x228 [ 16.415984] dma_map_phys+0x244/0x2b8 [ 16.432199] dma_map_page_attrs+0x44/0x58 [ 16.449782] virtqueue_map_page_attrs+0x38/0x44 [ 16.469596] virtqueue_map_single_attrs+0xc0/0x130 [ 16.490509] virtnet_rq_alloc.isra.0+0xa4/0x1fc [ 16.510355] try_fill_recv+0x2a4/0x584 [ 16.526989] virtnet_open+0xd4/0x238 [ 16.542775] __dev_open+0x110/0x24c [ 16.558280] __dev_change_flags+0x194/0x20c [ 16.576879] netif_change_flags+0x24/0x6c [ 16.594489] dev_change_flags+0x48/0x7c [ 16.611462] ip_auto_config+0x258/0x1114 [ 16.628727] do_one_initcall+0x80/0x1c8 [ 16.645590] kernel_init_freeable+0x208/0x2f0 [ 16.664917] kernel_init+0x24/0x1e0 [ 16.680295] ret_from_fork+0x10/0x20 [ 16.696369] Code: 927cec03 cb0e0021 8b0e0042 a9411c26 (a900340c) [ 16.723106] ---[ end trace 0000000000000000 ]--- [ 16.752866] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b [ 16.792556] Kernel Offset: 0x3396ea200000 from 0xffff800080000000 [ 16.818966] PHYS_OFFSET: 0xfff1000080000000 [ 16.837237] CPU features: 0x0000000,00060005,13e38581,957e772f [ 16.862904] Memory Limit: none [ 16.876526] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]--- This panic occurs because the swiotlb memory was previously shared to the host (__set_memory_enc_dec()), which involves transitioning the (large) leaf mappings to invalid, sharing to the host, then marking the mappings valid again. But pageattr_p[mu]d_entry() would only update the entry if it is a section mapping, since otherwise it concluded it must be a table entry so shouldn't be modified. But p[mu]d_sect() only returns true if the entry is valid. So the result was that the large leaf entry was made invalid in the first pass then ignored in the second pass. It remains invalid until the above code tries to access it and blows up. The simple fix would be to update pageattr_pmd_entry() to use !pmd_table() instead of pmd_sect(). That would solve this problem. But the ptdump code also suffers from a similar issue. It checks pmd_leaf() and doesn't call into the arch-specific note_page() machinery if it returns false. As a result of this, ptdump wasn't even able to show the invalid large leaf mappings; it looked like they were valid which made this super fun to debug. the ptdump code is core-mm and pmd_table() is arm64-specific so we can't use the same trick to solve that. But we already support the concept of "present-invalid" for user space entries. And even better, pmd_leaf() will return true for a leaf mapping that is marked present-invalid. So let's just use that encoding for present-invalid kernel mappings too. Then we can use pmd_leaf() where we previously used pmd_sect() and everything is magically fixed. Additionally, from inspection kernel_page_present() was broken in a similar way, so I'm also updating that to use pmd_leaf(). The transitional page tables component was also similarly broken; it creates a copy of the kernel page tables, making RO leaf mappings RW in the process. It also makes invalid (but-not-none) pte mappings valid. But it was not doing this for large leaf mappings. This could have resulted in crashes at kexec- or hibernate-time. This code is fixed to flip "present-invalid" mappings back to "present-valid" at all levels. Finally, I have hardened split_pmd()/split_pud() so that if it is passed a "present-invalid" leaf, it will maintain that property in the split leaves, since I wasn't able to convince myself that it would only ever be called for "present-valid" leaves. Fixes: a166563e7ec3 ("arm64: mm: support large block mapping when rodata=full") Cc: stable@vger.kernel.org Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22vfio/xe: Reorganize the init to decouple migration from resetMichał Winiarski1-18/+25
commit 1b81ed612e12ea9df8c5cb6f0ddd4419fd0b8ac8 upstream. Attempting to issue reset on VF devices that don't support migration leads to the following: BUG: unable to handle page fault for address: 00000000000011f8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 2 UID: 0 PID: 7443 Comm: xe_sriov_flr Tainted: G S U 7.0.0-rc1-lgci-xe-xe-4588-cec43d5c2696af219-nodebug+ #1 PREEMPT(lazy) Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023 RIP: 0010:xe_sriov_vfio_wait_flr_done+0xc/0x80 [xe] Code: ff c3 cc cc cc cc 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 41 54 53 <83> bf f8 11 00 00 02 75 61 41 89 f4 85 f6 74 52 48 8b 47 08 48 89 RSP: 0018:ffffc9000f7c39b8 EFLAGS: 00010202 RAX: ffffffffa04d8660 RBX: ffff88813e3e4000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffc9000f7c39c8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff888101a48800 R13: ffff88813e3e4150 R14: ffff888130d0d008 R15: ffff88813e3e40d0 FS: 00007877d3d0d940(0000) GS:ffff88890b6d3000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000011f8 CR3: 000000015a762000 CR4: 0000000000f52ef0 PKRU: 55555554 Call Trace: <TASK> xe_vfio_pci_reset_done+0x49/0x120 [xe_vfio_pci] pci_dev_restore+0x3b/0x80 pci_reset_function+0x109/0x140 reset_store+0x5c/0xb0 dev_attr_store+0x17/0x40 sysfs_kf_write+0x72/0x90 kernfs_fop_write_iter+0x161/0x1f0 vfs_write+0x261/0x440 ksys_write+0x69/0xf0 __x64_sys_write+0x19/0x30 x64_sys_call+0x259/0x26e0 do_syscall_64+0xcb/0x1500 ? __fput+0x1a2/0x2d0 ? fput_close_sync+0x3d/0xa0 ? __x64_sys_close+0x3e/0x90 ? x64_sys_call+0x1b7c/0x26e0 ? do_syscall_64+0x109/0x1500 ? __task_pid_nr_ns+0x68/0x100 ? __do_sys_getpid+0x1d/0x30 ? x64_sys_call+0x10b5/0x26e0 ? do_syscall_64+0x109/0x1500 ? putname+0x41/0x90 ? do_faccessat+0x1e8/0x300 ? __x64_sys_access+0x1c/0x30 ? x64_sys_call+0x1822/0x26e0 ? do_syscall_64+0x109/0x1500 ? tick_program_event+0x43/0xa0 ? hrtimer_interrupt+0x126/0x260 ? irqentry_exit+0xb2/0x710 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7877d5f1c5a4 Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d a5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89 RSP: 002b:00007fff48e5f908 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007877d5f1c5a4 RDX: 0000000000000001 RSI: 00007877d621b0c9 RDI: 0000000000000009 RBP: 0000000000000001 R08: 00005fb49113b010 R09: 0000000000000007 R10: 0000000000000000 R11: 0000000000000202 R12: 00007877d621b0c9 R13: 0000000000000009 R14: 00007fff48e5fac0 R15: 00007fff48e5fac0 </TASK> This is caused by the fact that some of the xe_vfio_pci_core_device members needed for handling reset are only initialized as part of migration init. Fix the problem by reorganizing the code to decouple VF init from migration init. Fixes: 1f5556ec8b9ef ("vfio/xe: Add device specific vfio_pci driver variant for Intel graphics") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7352 Cc: stable@vger.kernel.org Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20260410224948.900550-1-michal.winiarski@intel.com Signed-off-by: Alex Williamson <alex@shazbot.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22dcache: Limit the minimal number of bucket to twoZhihao Cheng1-2/+2
commit f08fe8891c3eeb63b73f9f1f6d97aa629c821579 upstream. There is an OOB read problem on dentry_hashtable when user sets 'dhash_entries=1': BUG: unable to handle page fault for address: ffff888b30b774b0 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page Oops: Oops: 0000 [#1] SMP PTI RIP: 0010:__d_lookup+0x56/0x120 Call Trace: d_lookup.cold+0x16/0x5d lookup_dcache+0x27/0xf0 lookup_one_qstr_excl+0x2a/0x180 start_dirop+0x55/0xa0 simple_start_creating+0x8d/0xa0 debugfs_start_creating+0x8c/0x180 debugfs_create_dir+0x1d/0x1c0 pinctrl_init+0x6d/0x140 do_one_initcall+0x6d/0x3d0 kernel_init_freeable+0x39f/0x460 kernel_init+0x2a/0x260 There will be only one bucket in dentry_hashtable when dhash_entries is set as one, and d_hash_shift is calculated as 32 by dcache_init(). Then, following process will access more than one buckets(which memory region is not allocated) in dentry_hashtable: d_lookup b = d_hash(hash) dentry_hashtable + ((u32)hashlen >> d_hash_shift) // The C standard defines the behavior of right shift amounts // exceeding the bit width of the operand as undefined. The // result of '(u32)hashlen >> d_hash_shift' becomes 'hashlen', // so 'b' will point to an unallocated memory region. hlist_bl_for_each_entry_rcu(b) hlist_bl_first_rcu(head) h->first // read OOB! Fix it by limiting the minimal number of dentry_hashtable bucket to two, so that 'd_hash_shift' won't exceeds the bit width of type u32. Cc: stable@vger.kernel.org Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Link: https://patch.msgid.link/20260130034853.215819-1-chengzhihao1@huawei.com Reviewed-by: Yang Erkun <yangerkun@huawei.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ALSA: ctxfi: Limit PTP to a single pageHarin Lee1-1/+1
commit e9418da50d9e5c496c22fe392e4ad74c038a94eb upstream. Commit 391e69143d0a increased CT_PTP_NUM from 1 to 4 to support 256 playback streams, but the additional pages are not used by the card correctly. The CT20K2 hardware already has multiple VMEM_PTPAL registers, but using them separately would require refactoring the entire virtual memory allocation logic. ct_vm_map() always uses PTEs in vm->ptp[0].area regardless of CT_PTP_NUM. On AMD64 systems, a single PTP covers 512 PTEs (2M). When aggregate memory allocations exceed this limit, ct_vm_map() tries to access beyond the allocated space and causes a page fault: BUG: unable to handle page fault for address: ffffd4ae8a10a000 Oops: Oops: 0002 [#1] SMP PTI RIP: 0010:ct_vm_map+0x17c/0x280 [snd_ctxfi] Call Trace: atc_pcm_playback_prepare+0x225/0x3b0 ct_pcm_playback_prepare+0x38/0x60 snd_pcm_do_prepare+0x2f/0x50 snd_pcm_action_single+0x36/0x90 snd_pcm_action_nonatomic+0xbf/0xd0 snd_pcm_ioctl+0x28/0x40 __x64_sys_ioctl+0x97/0xe0 do_syscall_64+0x81/0x610 entry_SYSCALL_64_after_hwframe+0x76/0x7e Revert CT_PTP_NUM to 1. The 256 SRC_RESOURCE_NUM and playback_count remain unchanged. Fixes: 391e69143d0a ("ALSA: ctxfi: Bump playback substreams to 256") Cc: stable@vger.kernel.org Signed-off-by: Harin Lee <me@harin.net> Link: https://patch.msgid.link/20260406074857.216034-1-me@harin.net Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22Docs/admin-guide/mm/damon/reclaim: warn commit_inputs vs param updates raceSeongJae Park1-0/+4
commit 0beba407d4585a15b0dc09f2064b5b3ddcb0e857 upstream. Patch series "Docs/admin-guide/mm/damon: warn commit_inputs vs other params race". Writing 'Y' to the commit_inputs parameter of DAMON_RECLAIM and DAMON_LRU_SORT, and writing other parameters before the commit_inputs request is completely processed can cause race conditions. While the consequence can be bad, the documentation is not clearly describing that. Add clear warnings. The issue was discovered [1,2] by sashiko. This patch (of 2): DAMON_RECLAIM handles commit_inputs request inside kdamond thread, reading the module parameters. If the user updates the module parameters while the kdamond thread is reading those, races can happen. To avoid this, the commit_inputs parameter shows whether it is still in the progress, assuming users wouldn't update parameters in the middle of the work. Some users might ignore that. Add a warning about the behavior. The issue was discovered in [1] by sashiko. Link: https://lore.kernel.org/20260329153052.46657-2-sj@kernel.org Link: https://lore.kernel.org/20260319161620.189392-3-objecting@objecting.org [1] Link: https://lore.kernel.org/20260319161620.189392-2-objecting@objecting.org [3] Fixes: 81a84182c343 ("Docs/admin-guide/mm/damon/reclaim: document 'commit_inputs' parameter") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> # 5.19.x Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22USB: serial: option: add Telit Cinterion FN990A MBIM compositionFabio Porcedda1-0/+2
commit f8cc59ecc22841be5deb07b549c0c6a2657cd5f9 upstream. Add the following Telit Cinterion FN990A MBIM composition: 0x1074: MBIM + tty (AT/NMEA) + tty (AT) + tty (AT) + tty (diag) + DPL (Data Packet Logging) + adb T: Bus=01 Lev=01 Prnt=04 Port=06 Cnt=01 Dev#= 7 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1074 Rev=05.04 S: Manufacturer=Telit Wireless Solutions S: Product=FN990 S: SerialNumber=70628d0c C: #Ifs= 8 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=8f(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 7 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Cc: stable@vger.kernel.org Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22selftests/mm: hmm-tests: don't hardcode THP size to 2MBAlistair Popple1-67/+16
commit f9d7975c52c00b3685cf9a90a81023d17817d991 upstream. Several HMM tests hardcode TWOMEG as the THP size. This is wrong on architectures where the PMD size is not 2MB such as arm64 with 64K base pages where THP is 512MB. Fix this by using read_pmd_pagesize() from vm_util instead. While here also replace the custom file_read_ulong() helper used to parse the default hugetlbfs page size from /proc/meminfo with the existing default_huge_page_size() from vm_util. Link: https://lore.kernel.org/20260331063445.3551404-3-apopple@nvidia.com Link: https://lore.kernel.org/linux-mm/8bd0396a-8997-4d2e-a13f-5aac033083d7@linux.dev/ Fixes: fee9f6d1b8df ("mm/hmm/test: add selftests for HMM") Fixes: 519071529d2a ("selftests/mm/hmm-tests: new tests for zone device THP migration") Signed-off-by: Alistair Popple <apopple@nvidia.com> Reported-by: Zenghui Yu <zenghui.yu@linux.dev> Closes: https://lore.kernel.org/linux-mm/8bd0396a-8997-4d2e-a13f-5aac033083d7@linux.dev/ Reviewed-by: Balbir Singh <balbirs@nvidia.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22staging: sm750fb: fix division by zero in ps_to_hz()Junrui Luo1-0/+3
commit 75a1621e4f91310673c9acbcbb25c2a7ff821cd3 upstream. ps_to_hz() is called from hw_sm750_crtc_set_mode() without validating that pixclock is non-zero. A zero pixclock passed via FBIOPUT_VSCREENINFO causes a division by zero. Fix by rejecting zero pixclock in lynxfb_ops_check_var(), consistent with other framebuffer drivers. Fixes: 81dee67e215b ("staging: sm750fb: add sm750 to staging") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Link: https://patch.msgid.link/SYBPR01MB7881AFBFCE28CCF528B35D0CAF4BA@SYBPR01MB7881.ausprd01.prod.outlook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22wifi: rtw88: fix device leak on probe failureJohan Hovold1-2/+1
commit bbb15e71156cd9f5e1869eee7207a06ea8e96c39 upstream. Driver core holds a reference to the USB interface and its parent USB device while the interface is bound to a driver and there is no need to take additional references unless the structures are needed after disconnect. This driver takes a reference to the USB device during probe but does not to release it on all probe errors (e.g. when descriptor parsing fails). Drop the redundant device reference to fix the leak, reduce cargo culting, make it easier to spot drivers where an extra reference is needed, and reduce the risk of further memory leaks. Fixes: a82dfd33d123 ("wifi: rtw88: Add common USB chip support") Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/netdev/2026022319-turbofan-darkened-206d@gregkh/ Cc: stable@vger.kernel.org # 6.2 Cc: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20260306085144.12064-19-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>