summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
5 daysNFSD: Remove NFSERR_EAGAINChuck Lever2-3/+0
[ Upstream commit c6c209ceb87f64a6ceebe61761951dcbbf4a0baa ] I haven't found an NFSERR_EAGAIN in RFCs 1094, 1813, 7530, or 8881. None of these RFCs have an NFS status code that match the numeric value "11". Based on the meaning of the EAGAIN errno, I presume the use of this status in NFSD means NFS4ERR_DELAY. So replace the one usage of nfserr_eagain, and remove it from NFSD's NFS status conversion tables. As far as I can tell, NFSERR_EAGAIN has existed since the pre-git era, but was not actually used by any code until commit f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed."), at which time it become possible for NFSD to return a status code of 11 (which is not valid NFS protocol). Fixes: f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed.") Cc: stable@vger.kernel.org Reviewed-by: NeilBrown <neil@brown.name> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 daysnfs_common: factor out nfs_errtbl and nfs_stat_to_errnoMike Snitzer1-0/+16
[ Upstream commit 4806ded4c14c5e8fdc6ce885d83221a78c06a428 ] Common nfs_stat_to_errno() is used by both fs/nfs/nfs2xdr.c and fs/nfs/nfs3xdr.c Will also be used by fs/nfsd/localio.c Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Stable-dep-of: c6c209ceb87f ("NFSD: Remove NFSERR_EAGAIN") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 daysNFS: trace: show TIMEDOUT instead of 0x6eChen Hanxiao1-0/+1
[ Upstream commit cef48236dfe55fa266d505e8a497963a7bc5ef2a ] __nfs_revalidate_inode may return ETIMEDOUT. print symbol of ETIMEDOUT in nfs trace: before: cat-5191 [005] 119.331127: nfs_revalidate_inode_exit: error=-110 (0x6e) after: cat-1738 [004] 44.365509: nfs_revalidate_inode_exit: error=-110 (TIMEDOUT) Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Stable-dep-of: c6c209ceb87f ("NFSD: Remove NFSERR_EAGAIN") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 daysnetdev: preserve NETIF_F_ALL_FOR_ALL across TSO updatesDi Zhu1-1/+2
[ Upstream commit 02d1e1a3f9239cdb3ecf2c6d365fb959d1bf39df ] Directly increment the TSO features incurs a side effect: it will also directly clear the flags in NETIF_F_ALL_FOR_ALL on the master device, which can cause issues such as the inability to enable the nocache copy feature on the bonding driver. The fix is to include NETIF_F_ALL_FOR_ALL in the update mask, thereby preventing it from being cleared. Fixes: b0ce3508b25e ("bonding: allow TSO being set on bonding master") Signed-off-by: Di Zhu <zhud@hygon.cn> Link: https://patch.msgid.link/20251224012224.56185-1-zhud@hygon.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysnet: Remove RTNL dance for SIOCBRADDIF and SIOCBRDELIF.Thadeu Lima de Souza Cascardo1-4/+2
commit ed3ba9b6e280e14cc3148c1b226ba453f02fa76c upstream. SIOCBRDELIF is passed to dev_ioctl() first and later forwarded to br_ioctl_call(), which causes unnecessary RTNL dance and the splat below [0] under RTNL pressure. Let's say Thread A is trying to detach a device from a bridge and Thread B is trying to remove the bridge. In dev_ioctl(), Thread A bumps the bridge device's refcnt by netdev_hold() and releases RTNL because the following br_ioctl_call() also re-acquires RTNL. In the race window, Thread B could acquire RTNL and try to remove the bridge device. Then, rtnl_unlock() by Thread B will release RTNL and wait for netdev_put() by Thread A. Thread A, however, must hold RTNL after the unlock in dev_ifsioc(), which may take long under RTNL pressure, resulting in the splat by Thread B. Thread A (SIOCBRDELIF) Thread B (SIOCBRDELBR) ---------------------- ---------------------- sock_ioctl sock_ioctl `- sock_do_ioctl `- br_ioctl_call `- dev_ioctl `- br_ioctl_stub |- rtnl_lock | |- dev_ifsioc ' ' |- dev = __dev_get_by_name(...) |- netdev_hold(dev, ...) . / |- rtnl_unlock ------. | | |- br_ioctl_call `---> |- rtnl_lock Race | | `- br_ioctl_stub |- br_del_bridge Window | | | |- dev = __dev_get_by_name(...) | | | May take long | `- br_dev_delete(dev, ...) | | | under RTNL pressure | `- unregister_netdevice_queue(dev, ...) | | | | `- rtnl_unlock \ | |- rtnl_lock <-' `- netdev_run_todo | |- ... `- netdev_run_todo | `- rtnl_unlock |- __rtnl_unlock | |- netdev_wait_allrefs_any |- netdev_put(dev, ...) <----------------' Wait refcnt decrement and log splat below To avoid blocking SIOCBRDELBR unnecessarily, let's not call dev_ioctl() for SIOCBRADDIF and SIOCBRDELIF. In the dev_ioctl() path, we do the following: 1. Copy struct ifreq by get_user_ifreq in sock_do_ioctl() 2. Check CAP_NET_ADMIN in dev_ioctl() 3. Call dev_load() in dev_ioctl() 4. Fetch the master dev from ifr.ifr_name in dev_ifsioc() 3. can be done by request_module() in br_ioctl_call(), so we move 1., 2., and 4. to br_ioctl_stub(). Note that 2. is also checked later in add_del_if(), but it's better performed before RTNL. SIOCBRADDIF and SIOCBRDELIF have been processed in dev_ioctl() since the pre-git era, and there seems to be no specific reason to process them there. [0]: unregister_netdevice: waiting for wpan3 to become free. Usage count = 2 ref_tracker: wpan3@ffff8880662d8608 has 1/1 users at __netdev_tracker_alloc include/linux/netdevice.h:4282 [inline] netdev_hold include/linux/netdevice.h:4311 [inline] dev_ifsioc+0xc6a/0x1160 net/core/dev_ioctl.c:624 dev_ioctl+0x255/0x10c0 net/core/dev_ioctl.c:826 sock_do_ioctl+0x1ca/0x260 net/socket.c:1213 sock_ioctl+0x23a/0x6c0 net/socket.c:1318 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:906 [inline] __se_sys_ioctl fs/ioctl.c:892 [inline] __x64_sys_ioctl+0x1a4/0x210 fs/ioctl.c:892 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcb/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 893b19587534 ("net: bridge: fix ioctl locking") Reported-by: syzkaller <syzkaller@googlegroups.com> Reported-by: yan kang <kangyan91@outlook.com> Reported-by: yue sun <samsun1006219@gmail.com> Closes: https://lore.kernel.org/netdev/SY8P300MB0421225D54EB92762AE8F0F2A1D32@SY8P300MB0421.AUSP300.PROD.OUTLOOK.COM/ Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250316192851.19781-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> [cascardo: fixed conflict at dev_ifsioc] Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 daysmm/balloon_compaction: convert balloon_page_delete() to balloon_page_finalize()David Hildenbrand1-27/+16
[ Upstream commit 15504b1163007bbfbd9a63460d5c14737c16e96d ] Let's move the removal of the page from the balloon list into the single caller, to remove the dependency on the PG_isolated flag and clarify locking requirements. Note that for now, balloon_page_delete() was used on two paths: (1) Removing a page from the balloon for deflation through balloon_page_list_dequeue() (2) Removing an isolated page from the balloon for migration in the per-driver migration handlers. Isolated pages were already removed from the balloon list during isolation. So instead of relying on the flag, we can just distinguish both cases directly and handle it accordingly in the caller. We'll shuffle the operations a bit such that they logically make more sense (e.g., remove from the list before clearing flags). In balloon migration functions we can now move the balloon_page_finalize() out of the balloon lock and perform the finalization just before dropping the balloon reference. Document that the page lock is currently required when modifying the movability aspects of a page; hopefully we can soon decouple this from the page lock. Link: https://lkml.kernel.org/r/20250704102524.326966-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Brendan Jackman <jackmanb@google.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Brauner <brauner@kernel.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pé rez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Gregory Price <gourry@gourry.net> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 0da2ba35c0d5 ("powerpc/pseries/cmm: adjust BALLOON_MIGRATE when migrating pages") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 daysmm/balloon_compaction: make balloon page compaction callbacks staticMiaohe Lin1-22/+0
[ Upstream commit 504c1cabe325df65c18ef38365ddd1a41c6b591b ] Since commit b1123ea6d3b3 ("mm: balloon: use general non-lru movable page feature"), these functions are called via balloon_aops callbacks. They're not called directly outside this file. So make them static and clean up the relevant code. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Link: https://lore.kernel.org/r/20220125132221.2220-1-linmiaohe@huawei.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Stable-dep-of: 0da2ba35c0d5 ("powerpc/pseries/cmm: adjust BALLOON_MIGRATE when migrating pages") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 daysmptcp: pm: ignore unknown endpoint flagsMatthieu Baerts (NGI0)1-0/+1
[ Upstream commit 0ace3297a7301911e52d8195cb1006414897c859 ] Before this patch, the kernel was saving any flags set by the userspace, even unknown ones. This doesn't cause critical issues because the kernel is only looking at specific ones. But on the other hand, endpoints dumps could tell the userspace some recent flags seem to be supported on older kernel versions. Instead, ignore all unknown flags when parsing them. By doing that, the userspace can continue to set unsupported flags, but it has a way to verify what is supported by the kernel. Note that it sounds better to continue accepting unsupported flags not to change the behaviour, but also that eases things on the userspace side by adding "optional" endpoint types only supported by newer kernel versions without having to deal with the different kernel versions. A note for the backports: there will be conflicts in mptcp.h on older versions not having the mentioned flags, the new line should still be added last, and the '5' needs to be adapted to have the same value as the last entry. Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-1-9e4781a6c1b8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> [ GENMASK(5, 0) => GENMASK(3, 0) + context ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 daysALSA: wavefront: Use standard print APITakashi Iwai1-4/+0
[ Upstream commit 8b4ac5429938dd5f1fbf2eea0687f08cbcccb6be ] Use the standard print API with dev_*() instead of the old house-baked one. It gives better information and allows dynamically control of debug prints. Reviewed-by: Jaroslav Kysela <perex@perex.cz> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20240807133452.9424-36-tiwai@suse.de Stable-dep-of: 0c4a13ba8859 ("ALSA: wavefront: Fix integer overflow in sample size validation") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 daystpm: Cap the number of PCR banksJarkko Sakkinen1-3/+6
[ Upstream commit faf07e611dfa464b201223a7253e9dc5ee0f3c9e ] tpm2_get_pcr_allocation() does not cap any upper limit for the number of banks. Cap the limit to eight banks so that out of bounds values coming from external I/O cause on only limited harm. Cc: stable@vger.kernel.org # v5.10+ Fixes: bcfff8384f6c ("tpm: dynamically allocate the allocated_banks array") Tested-by: Lai Yi <yi1.lai@linux.intel.com> Reviewed-by: Jonathan McDowell <noodles@meta.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@opinsys.com> [ added backward-compatible define for TPM_MAX_DIGEST_SIZE to support older ima_init.c code still using that macro name ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 daysusb: gadget: udc: fix use-after-free in usb_gadget_state_workJimmy Hu1-0/+5
[ Upstream commit baeb66fbd4201d1c4325074e78b1f557dff89b5b ] A race condition during gadget teardown can lead to a use-after-free in usb_gadget_state_work(), as reported by KASAN: BUG: KASAN: invalid-access in sysfs_notify+0x2c/0xd0 Workqueue: events usb_gadget_state_work The fundamental race occurs because a concurrent event (e.g., an interrupt) can call usb_gadget_set_state() and schedule gadget->work at any time during the cleanup process in usb_del_gadget(). Commit 399a45e5237c ("usb: gadget: core: flush gadget workqueue after device removal") attempted to fix this by moving flush_work() to after device_del(). However, this does not fully solve the race, as a new work item can still be scheduled *after* flush_work() completes but before the gadget's memory is freed, leading to the same use-after-free. This patch fixes the race condition robustly by introducing a 'teardown' flag and a 'state_lock' spinlock to the usb_gadget struct. The flag is set during cleanup in usb_del_gadget() *before* calling flush_work() to prevent any new work from being scheduled once cleanup has commenced. The scheduling site, usb_gadget_set_state(), now checks this flag under the lock before queueing the work, thus safely closing the race window. Fixes: 5702f75375aa9 ("usb: gadget: udc-core: move sysfs_notify() to a workqueue") Cc: stable <stable@kernel.org> Signed-off-by: Jimmy Hu <hhhuuu@google.com> Link: https://patch.msgid.link/20251023054945.233861-1-hhhuuu@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [ Adjust context ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 daysgenalloc.h: fix htmldocs warningAndrew Morton1-0/+1
[ Upstream commit 5393802c94e0ab1295c04c94c57bcb00222d4674 ] WARNING: include/linux/genalloc.h:52 function parameter 'start_addr' not described in 'genpool_algo_t' Fixes: 52fbf1134d47 ("lib/genalloc.c: fix allocation of aligned buffer from non-aligned chunk") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lkml.kernel.org/r/20251127130624.563597e3@canb.auug.org.au Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Cc: Alexey Skidanov <alexey.skidanov@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysmedia: v4l2-mem2mem: Fix outdated documentationLaurent Pinchart1-2/+1
commit 082b86919b7a94de01d849021b4da820a6cb89dc upstream. Commit cbd9463da1b1 ("media: v4l2-mem2mem: Avoid calling .device_run in v4l2_m2m_job_finish") deferred calls to .device_run() to a work queue to avoid recursive calls when a job is finished right away from .device_run(). It failed to update the v4l2_m2m_job_finish() documentation that still states the function must not be called from .device_run(). Fix it. Fixes: cbd9463da1b1 ("media: v4l2-mem2mem: Avoid calling .device_run in v4l2_m2m_job_finish") Cc: stable@vger.kernel.org Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 daysefi/cper: align ARM CPER type with UEFI 2.9A/2.10 specsMauro Carvalho Chehab1-5/+5
[ Upstream commit 96b010536ee020e716d28d9b359a4bcd18800aeb ] Up to UEFI spec 2.9, the type byte of CPER struct for ARM processor was defined simply as: Type at byte offset 4: - Cache error - TLB Error - Bus Error - Micro-architectural Error All other values are reserved Yet, there was no information about how this would be encoded. Spec 2.9A errata corrected it by defining: - Bit 1 - Cache Error - Bit 2 - TLB Error - Bit 3 - Bus Error - Bit 4 - Micro-architectural Error All other values are reserved That actually aligns with the values already defined on older versions at N.2.4.1. Generic Processor Error Section. Spec 2.10 also preserve the same encoding as 2.9A. Adjust CPER and GHES handling code for both generic and ARM processors to properly handle UEFI 2.9A and 2.10 encoding. Link: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-information Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysefi/cper: Add a new helper function to print bitmasksMauro Carvalho Chehab1-0/+2
[ Upstream commit a976d790f49499ccaa0f991788ad8ebf92e7fd5c ] Add a helper function to print a string with names associated to each bit field. A typical example is: const char * const bits[] = { "bit 3 name", "bit 4 name", "bit 5 name", }; char str[120]; unsigned int bitmask = BIT(3) | BIT(5); #define MASK GENMASK(5,3) cper_bits_to_str(str, sizeof(str), FIELD_GET(MASK, bitmask), bits, ARRAY_SIZE(bits)); The above code fills string "str" with "bit 3 name|bit 5 name". Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysALSA: uapi: Fix typo in asound.h commentAndres J Rosa1-1/+1
[ Upstream commit 9a97857db0c5655b8932f86b5d18bb959079b0ee ] Fix 'level-shit' to 'level-shift' in struct snd_cea_861_aud_if comment. Fixes: 7ba1c40b536e ("ALSA: Add definitions for CEA-861 Audio InfoFrames") Signed-off-by: Andres J Rosa <andyrosa@gmail.com> Link: https://patch.msgid.link/20251203162509.1822-1-andyrosa@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysblock: fix comment for op_is_zone_mgmt() to include RESET_ALLshechenglong1-4/+1
[ Upstream commit 8a32282175c964eb15638e8dfe199fc13c060f67 ] REQ_OP_ZONE_RESET_ALL is a zone management request, and op_is_zone_mgmt() has returned true for it. Update the comment to remove the misleading exception note so the documentation matches the implementation. Fixes: 12a1c9353c47 ("block: fix op_is_zone_mgmt() to handle REQ_OP_ZONE_RESET_ALL") Signed-off-by: shechenglong <shechenglong@xfusion.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysfs_context: drop the unused lsm_flags memberOndrej Mosnacek2-2/+1
[ Upstream commit 4e04143c869c5b6d499fbd5083caa860d5c942c3 ] This isn't ever used by VFS now, and it couldn't even work. Any FS that uses the SECURITY_LSM_NATIVE_LABELS flag needs to also process the value returned back from the LSM, so it needs to do its security_sb_set_mnt_opts() call on its own anyway. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Stable-dep-of: 8675c69816e4 ("NFS: Automounted filesystems should inherit ro,noexec,nodev,sync flags") Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysNFSv4: Add some support for case insensitive filesystemsTrond Myklebust2-0/+4
[ Upstream commit 1ab5be4ac5b1c9ce39ce1037c45b68d2ce6eede0 ] Add capabilities to allow the NFS client to recognise when it is dealing with case insensitive and case preserving filesystems. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Stable-dep-of: 518c32a1bc4f ("NFS: Initialise verifiers for visible dentries in nfs_atomic_open()") Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysNFS: don't unhash dentry during unlink/renameNeilBrown1-0/+9
[ Upstream commit 3c59366c207e4c6c6569524af606baf017a55c61 ] NFS unlink() (and rename over existing target) must determine if the file is open, and must perform a "silly rename" instead of an unlink (or before rename) if it is. Otherwise the client might hold a file open which has been removed on the server. Consequently if it determines that the file isn't open, it must block any subsequent opens until the unlink/rename has been completed on the server. This is currently achieved by unhashing the dentry. This forces any open attempt to the slow-path for lookup which will block on i_rwsem on the directory until the unlink/rename completes. A future patch will change the VFS to only get a shared lock on i_rwsem for unlink, so this will no longer work. Instead we introduce an explicit interlock. A special value is stored in dentry->d_fsdata while the unlink/rename is running and ->d_revalidate blocks while that value is present. When ->d_revalidate unblocks, the dentry will be invalid. This closes the race without requiring exclusion on i_rwsem. d_fsdata is already used in two different ways. 1/ an IS_ROOT directory dentry might have a "devname" stored in d_fsdata. Such a dentry doesn't have a name and so cannot be the target of unlink or rename. For safety we check if an old devname is still stored, and remove it if it is. 2/ a dentry with DCACHE_NFSFS_RENAMED set will have a 'struct nfs_unlinkdata' stored in d_fsdata. While this is set maydelete() will fail, so an unlink or rename will never proceed on such a dentry. Neither of these can be in effect when a dentry is the target of unlink or rename. So we can expect d_fsdata to be NULL, and store a special value ((void*)1) which is given the name NFS_FSDATA_BLOCKED to indicate that any lookup will be blocked. The d_count() is incremented under d_lock() when a lookup finds the dentry, so we check d_count() is low, and set NFS_FSDATA_BLOCKED under the same lock to avoid any races. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Stable-dep-of: bd4928ec799b ("NFS: Avoid changing nlink when file removes and attribute updates race") Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysnetfilter: nf_conncount: rework API to use sk_buff directlyFernando Fernandez Mancera1-9/+8
[ Upstream commit be102eb6a0e7c03db00e50540622f4e43b2d2844 ] When using nf_conncount infrastructure for non-confirmed connections a duplicated track is possible due to an optimization introduced since commit d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC"). In order to fix this introduce a new conncount API that receives directly an sk_buff struct. It fetches the tuple and zone and the corresponding ct from it. It comes with both existing conncount variants nf_conncount_count_skb() and nf_conncount_add_skb(). In addition remove the old API and adjust all the users to use the new one. This way, for each sk_buff struct it is possible to check if there is a ct present and already confirmed. If so, skip the add operation. Fixes: d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Stable-dep-of: 69894e5b4c5e ("netfilter: nft_connlimit: update the count if add was skipped") Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysnetfilter: nf_conncount: reduce unnecessary GCWilliam Tu1-0/+1
[ Upstream commit d265929930e2ffafc744c0ae05fb70acd53be1ee ] Currently nf_conncount can trigger garbage collection (GC) at multiple places. Each GC process takes a spin_lock_bh to traverse the nf_conncount_list. We found that when testing port scanning use two parallel nmap, because the number of connection increase fast, the nf_conncount_count and its subsequent call to __nf_conncount_add take too much time, causing several CPU lockup. This happens when user set the conntrack limit to +20,000, because the larger the limit, the longer the list that GC has to traverse. The patch mitigate the performance issue by avoiding unnecessary GC with a timestamp. Whenever nf_conncount has done a GC, a timestamp is updated, and beforce the next time GC is triggered, we make sure it's more than a jiffies. By doin this we can greatly reduce the CPU cycles and avoid the softirq lockup. To reproduce it in OVS, $ ovs-appctl dpctl/ct-set-limits zone=1,limit=20000 $ ovs-appctl dpctl/ct-get-limits At another machine, runs two nmap $ nmap -p1- <IP> $ nmap -p1- <IP> Signed-off-by: William Tu <u9012063@gmail.com> Co-authored-by: Yifeng Sun <pkusunyifeng@gmail.com> Reported-by: Greg Rose <gvrose8192@gmail.com> Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Stable-dep-of: 69894e5b4c5e ("netfilter: nft_connlimit: update the count if add was skipped") Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysvirtio: fix virtqueue_set_affinity() docsMichael S. Tsirkin1-1/+1
[ Upstream commit 43236d8bbafff94b423afecc4a692dd90602d426 ] Rewrite the comment for better grammar and clarity. Fixes: 75a0a52be3c2 ("virtio: introduce an API to set affinity for a virtqueue") Message-Id: <e317e91bd43b070e5eaec0ebbe60c5749d02e2dd.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysvdpa: Sync calls set/get config/status with cf_mutexEli Cohen1-0/+3
[ Upstream commit 73bc0dbb591baea322a7319c735e5f6c7dba9cfb ] Add wrappers to get/set status and protect these operations with cf_mutex to serialize these operations with respect to get/set config operations. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-4-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Stable-dep-of: e40b6abe0b12 ("virtio_vdpa: fix misleading return in void function") Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysvdpa: Introduce query of device config layoutParav Pandit2-0/+8
[ Upstream commit ad69dd0bf26b88ec6ab26f8bbe5cd74fbed7672a ] Introduce a command to query a device config layout. An example query of network vdpa device: $ vdpa dev add name bar mgmtdev vdpasim_net $ vdpa dev config show bar: mac 00:35:09:19:48:05 link up link_announce false mtu 1500 $ vdpa dev config show -jp { "config": { "bar": { "mac": "00:35:09:19:48:05", "link ": "up", "link_announce ": false, "mtu": 1500, } } } Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211026175519.87795-3-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Stable-dep-of: e40b6abe0b12 ("virtio_vdpa: fix misleading return in void function") Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysvdpa: Introduce and use vdpa device get, set config helpersParav Pandit1-15/+4
[ Upstream commit 6dbb1f1687a2ccdfc5b84b0a35bbc6dfefc4de3b ] Subsequent patches enable get and set configuration either via management device or via vdpa device' config ops. This requires synchronization between multiple callers to get and set config callbacks. Features setting also influence the layout of the configuration fields endianness. To avoid exposing synchronization primitives to callers, introduce helper for setting the configuration and use it. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20211026175519.87795-2-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Stable-dep-of: e40b6abe0b12 ("virtio_vdpa: fix misleading return in void function") Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysbacklight: lp855x: Fix lp855x.h kernel-doc warningsRandy Dunlap1-2/+2
[ Upstream commit 2d45db63260c6ae3cf007361e04a1c41bd265084 ] Add a missing struct short description and a missing leading " *" to lp855x.h to avoid kernel-doc warnings: Warning: include/linux/platform_data/lp855x.h:126 missing initial short description on line: * struct lp855x_platform_data Warning: include/linux/platform_data/lp855x.h:131 bad line: Only valid when mode is PWM_BASED. Fixes: 7be865ab8634 ("backlight: new backlight driver for LP855x devices") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Daniel Thompson (RISCstar) <danielt@kernel.org> Link: https://patch.msgid.link/20251111060916.1995920-1-rdunlap@infradead.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
5 dayswifi: ieee80211: correct FILS status codesRia Thomas1-2/+2
[ Upstream commit 24d4da5c2565313c2ad3c43449937a9351a64407 ] The FILS status codes are set to 108/109, but the IEEE 802.11-2020 spec defines them as 112/113. Update the enum so it matches the specification and keeps the kernel consistent with standard values. Fixes: a3caf7440ded ("cfg80211: Add support for FILS shared key authentication offload") Signed-off-by: Ria Thomas <ria.thomas@morsemicro.com> Reviewed-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Link: https://patch.msgid.link/20251124125637.3936154-1-ria.thomas@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
5 dayscoresight-etm4x: add isb() before reading the TRCSTATRYuanfang Zhang1-0/+4
[ Upstream commit 4ff6039ffb79a4a8a44b63810a8a2f2b43264856 ] As recommended by section 4.3.7 ("Synchronization when using system instructions to progrom the trace unit") of ARM IHI 0064H.b, the self-hosted trace analyzer must perform a Context synchronization event between writing to the TRCPRGCTLR and reading the TRCSTATR. Additionally, add an ISB between the each read of TRCSTATR on coresight_timeout() when using system instructions to program the trace unit. Fixes: 1ab3bb9df5e3 ("coresight: etm4x: Add necessary synchronization for sysreg access") Signed-off-by: Yuanfang Zhang <quic_yuanfang@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250116-etm_sync-v4-1-39f2b05e9514@quicinc.com Stable-dep-of: 64eb04ae5452 ("coresight: etm4x: Add context synchronization before enabling trace") Signed-off-by: Sasha Levin <sashal@kernel.org>
5 dayskmsan: introduce __no_sanitize_memory and __no_kmsan_checksAlexander Potapenko2-0/+29
[ Upstream commit 9b448bc25b776daab3215393c3ce6953dd3bb8ad ] __no_sanitize_memory is a function attribute that instructs KMSAN to skip a function during instrumentation. This is needed to e.g. implement the noinstr functions. __no_kmsan_checks is a function attribute that makes KMSAN ignore the uninitialized values coming from the function's inputs, and initialize the function's outputs. Functions marked with this attribute can't be inlined into functions not marked with it, and vice versa. This behavior is overridden by __always_inline. __SANITIZE_MEMORY__ is a macro that's defined iff the file is instrumented with KMSAN. This is not the same as CONFIG_KMSAN, which is defined for every file. Link: https://lkml.kernel.org/r/20220915150417.722975-8-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: ced37e9ceae5 ("x86/dumpstack: Prevent KASAN false positive warnings in __show_regs()") Signed-off-by: Sasha Levin <sashal@kernel.org>
5 dayscompiler-gcc.h: Define __SANITIZE_ADDRESS__ under hwaddress sanitizerKees Cook1-0/+8
[ Upstream commit 9a48e7564ac83fb0f1d5b0eac5fe8a7af62da398 ] When Clang is using the hwaddress sanitizer, it sets __SANITIZE_ADDRESS__ explicitly: #if __has_feature(address_sanitizer) || __has_feature(hwaddress_sanitizer) /* Emulate GCC's __SANITIZE_ADDRESS__ flag */ #define __SANITIZE_ADDRESS__ #endif Once hwaddress sanitizer was added to GCC, however, a separate define was created, __SANITIZE_HWADDRESS__. The kernel is expecting to find __SANITIZE_ADDRESS__ in either case, though, and the existing string macros break on supported architectures: #if (defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)) && \ !defined(__SANITIZE_ADDRESS__) where as other architectures (like arm32) have no idea about hwaddress sanitizer and just check for __SANITIZE_ADDRESS__: #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__) This would lead to compiler foritfy self-test warnings when building with CONFIG_KASAN_SW_TAGS=y: warning: unsafe memmove() usage lacked '__read_overflow2' symbol in lib/test_fortify/read_overflow2-memmove.c warning: unsafe memcpy() usage lacked '__write_overflow' symbol in lib/test_fortify/write_overflow-memcpy.c ... Sort this out by also defining __SANITIZE_ADDRESS__ in GCC under the hwaddress sanitizer. Suggested-by: Arnd Bergmann <arnd@arndb.de> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Will Deacon <will@kernel.org> Cc: Arvind Sankar <nivedita@alum.mit.edu> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: llvm@lists.linux.dev Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Marco Elver <elver@google.com> Link: https://lore.kernel.org/r/20211020200039.170424-1-keescook@chromium.org Stable-dep-of: ced37e9ceae5 ("x86/dumpstack: Prevent KASAN false positive warnings in __show_regs()") Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysinet: Avoid ehash lookup race in inet_ehash_insert()Xuanqiang Luo1-0/+13
[ Upstream commit 1532ed0d0753c83e72595f785f82b48c28bbe5dc ] Since ehash lookups are lockless, if one CPU performs a lookup while another concurrently deletes and inserts (removing reqsk and inserting sk), the lookup may fail to find the socket, an RST may be sent. The call trace map is drawn as follows: CPU 0 CPU 1 ----- ----- inet_ehash_insert() spin_lock() sk_nulls_del_node_init_rcu(osk) __inet_lookup_established() (lookup failed) __sk_nulls_add_node_rcu(sk, list) spin_unlock() As both deletion and insertion operate on the same ehash chain, this patch introduces a new sk_nulls_replace_node_init_rcu() helper functions to implement atomic replacement. Fixes: 5e0724d027f0 ("tcp/dccp: fix hashdance race for passive sessions") Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev> Signed-off-by: Xuanqiang Luo <luoxuanqiang@kylinos.cn> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20251015020236.431822-3-xuanqiang.luo@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysrculist: Add hlist_nulls_replace_rcu() and hlist_nulls_replace_init_rcu()Xuanqiang Luo1-0/+59
[ Upstream commit 9c4609225ec1cb551006d6a03c7c4ad8cb5584c0 ] Add two functions to atomically replace RCU-protected hlist_nulls entries. Keep using WRITE_ONCE() to assign values to ->next and ->pprev, as mentioned in the patch below: commit efd04f8a8b45 ("rcu: Use WRITE_ONCE() for assignments to ->next for rculist_nulls") commit 860c8802ace1 ("rcu: Use WRITE_ONCE() for assignments to ->pprev for hlist_nulls") Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Xuanqiang Luo <luoxuanqiang@kylinos.cn> Link: https://patch.msgid.link/20251015020236.431822-2-xuanqiang.luo@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 1532ed0d0753 ("inet: Avoid ehash lookup race in inet_ehash_insert()") Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysRevert "xfrm: destroy xfrm_state synchronously on net exit path"Sabrina Dubroca1-9/+3
[ Upstream commit 2a198bbec6913ae1c90ec963750003c6213668c7 ] This reverts commit f75a2804da391571563c4b6b29e7797787332673. With all states (whether user or kern) removed from the hashtables during deletion, there's no need for synchronous destruction of states. xfrm6_tunnel states still need to have been destroyed (which will be the case when its last user is deleted (not destroyed)) so that xfrm6_tunnel_free_spi removes it from the per-netns hashtable before the netns is destroyed. This has the benefit of skipping one synchronize_rcu per state (in __xfrm_state_destroy(sync=true)) when we exit a netns. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
5 daysxfrm: delete x->tunnel as we delete xSabrina Dubroca1-1/+0
[ Upstream commit b441cf3f8c4b8576639d20c8eb4aa32917602ecd ] The ipcomp fallback tunnels currently get deleted (from the various lists and hashtables) as the last user state that needed that fallback is destroyed (not deleted). If a reference to that user state still exists, the fallback state will remain on the hashtables/lists, triggering the WARN in xfrm_state_fini. Because of those remaining references, the fix in commit f75a2804da39 ("xfrm: destroy xfrm_state synchronously on net exit path") is not complete. We recently fixed one such situation in TCP due to defered freeing of skbs (commit 9b6412e6979f ("tcp: drop secpath at the same time as we currently drop dst")). This can also happen due to IP reassembly: skbs with a secpath remain on the reassembly queue until netns destruction. If we can't guarantee that the queues are flushed by the time xfrm_state_fini runs, there may still be references to a (user) xfrm_state, preventing the timely deletion of the corresponding fallback state. Instead of chasing each instance of skbs holding a secpath one by one, this patch fixes the issue directly within xfrm, by deleting the fallback state as soon as the last user state depending on it has been deleted. Destruction will still happen when the final reference is dropped. A separate lockdep class for the fallback state is required since we're going to lock x->tunnel while x is locked. Fixes: 9d4139c76905 ("netns xfrm: per-netns xfrm_state_all list") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07ata: libata-scsi: Fix system suspend for a security locked driveNiklas Cassel1-0/+1
[ Upstream commit b11890683380a36b8488229f818d5e76e8204587 ] Commit cf3fc037623c ("ata: libata-scsi: Fix ata_to_sense_error() status handling") fixed ata_to_sense_error() to properly generate sense key ABORTED COMMAND (without any additional sense code), instead of the previous bogus sense key ILLEGAL REQUEST with the additional sense code UNALIGNED WRITE COMMAND, for a failed command. However, this broke suspend for Security locked drives (drives that have Security enabled, and have not been Security unlocked by boot firmware). The reason for this is that the SCSI disk driver, for the Synchronize Cache command only, treats any sense data with sense key ILLEGAL REQUEST as a successful command (regardless of ASC / ASCQ). After commit cf3fc037623c ("ata: libata-scsi: Fix ata_to_sense_error() status handling") the code that treats any sense data with sense key ILLEGAL REQUEST as a successful command is no longer applicable, so the command fails, which causes the system suspend to be aborted: sd 1:0:0:0: PM: dpm_run_callback(): scsi_bus_suspend returns -5 sd 1:0:0:0: PM: failed to suspend async: error -5 PM: Some devices failed to suspend, or early wake event detected To make suspend work once again, for a Security locked device only, return sense data LOGICAL UNIT ACCESS NOT AUTHORIZED, the actual sense data which a real SCSI device would have returned if locked. The SCSI disk driver treats this sense data as a successful command. Cc: stable@vger.kernel.org Reported-by: Ilia Baryshnikov <qwelias@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220704 Fixes: cf3fc037623c ("ata: libata-scsi: Fix ata_to_sense_error() status handling") Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> [ Adjust context ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07usb: deprecate the third argument of usb_maxpacket()Vincent Mailhol1-11/+5
[ Upstream commit 0f08c2e7458e25c967d844170f8ad1aac3b57a02 ] This is a transitional patch with the ultimate goal of changing the prototype of usb_maxpacket() from: | static inline __u16 | usb_maxpacket(struct usb_device *udev, int pipe, int is_out) into: | static inline u16 usb_maxpacket(struct usb_device *udev, int pipe) The third argument of usb_maxpacket(): is_out gets removed because it can be derived from its second argument: pipe using usb_pipeout(pipe). Furthermore, in the current version, ubs_pipeout(pipe) is called regardless in order to sanitize the is_out parameter. In order to make a smooth change, we first deprecate the is_out parameter by simply ignoring it (using a variadic function) and will remove it later, once all the callers get updated. The body of the function is reworked accordingly and is_out is replaced by usb_pipeout(pipe). The WARN_ON() calls become unnecessary and get removed. Finally, the return type is changed from __u16 to u16 because this is not a UAPI function. Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/r/20220317035514.6378-2-mailhol.vincent@wanadoo.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: 69aeb5073123 ("Input: pegasus-notetaker - fix potential out-of-bounds access") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07net: tls: Cancel RX async resync request on rcd_delta overflowShahar Shitrit1-0/+6
[ Upstream commit c15d5c62ab313c19121f10e25d4fec852bd1c40c ] When a netdev issues a RX async resync request for a TLS connection, the TLS module handles it by logging record headers and attempting to match them to the tcp_sn provided by the device. If a match is found, the TLS module approves the tcp_sn for resynchronization. While waiting for a device response, the TLS module also increments rcd_delta each time a new TLS record is received, tracking the distance from the original resync request. However, if the device response is delayed or fails (e.g due to unstable connection and device getting out of tracking, hardware errors, resource exhaustion etc.), the TLS module keeps logging and incrementing, which can lead to a WARN() when rcd_delta exceeds the threshold. To address this, introduce tls_offload_rx_resync_async_request_cancel() to explicitly cancel resync requests when a device response failure is detected. Call this helper also as a final safeguard when rcd_delta crosses its threshold, as reaching this point implies that earlier cancellation did not occur. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1761508983-937977-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07kernel.h: Move ARRAY_SIZE() to a separate headerAlejandro Colomar3-6/+15
[ Upstream commit 3cd39bc3b11b8d34b7d7c961a35fdfd18b0ebf75 ] Touching files so used for the kernel, forces 'make' to recompile most of the kernel. Having those definitions in more granular files helps avoid recompiling so much of the kernel. Signed-off-by: Alejandro Colomar <alx@kernel.org> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20230817143352.132583-2-lucas.segarra.fernandez@intel.com [andy: reduced to cover only string.h for now] Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Stable-dep-of: 896f1a2493b5 ("net: qlogic/qede: fix potential out-of-bounds read in qede_tpa_cont() and qede_tpa_end()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07mm/ksm: fix flag-dropping behavior in ksm_madviseJakub Acs1-1/+1
[ Upstream commit f04aad36a07cc17b7a5d5b9a2d386ce6fae63e93 ] syzkaller discovered the following crash: (kernel BUG) [ 44.607039] ------------[ cut here ]------------ [ 44.607422] kernel BUG at mm/userfaultfd.c:2067! [ 44.608148] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI [ 44.608814] CPU: 1 UID: 0 PID: 2475 Comm: reproducer Not tainted 6.16.0-rc6 #1 PREEMPT(none) [ 44.609635] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [ 44.610695] RIP: 0010:userfaultfd_release_all+0x3a8/0x460 <snip other registers, drop unreliable trace> [ 44.617726] Call Trace: [ 44.617926] <TASK> [ 44.619284] userfaultfd_release+0xef/0x1b0 [ 44.620976] __fput+0x3f9/0xb60 [ 44.621240] fput_close_sync+0x110/0x210 [ 44.622222] __x64_sys_close+0x8f/0x120 [ 44.622530] do_syscall_64+0x5b/0x2f0 [ 44.622840] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 44.623244] RIP: 0033:0x7f365bb3f227 Kernel panics because it detects UFFD inconsistency during userfaultfd_release_all(). Specifically, a VMA which has a valid pointer to vma->vm_userfaultfd_ctx, but no UFFD flags in vma->vm_flags. The inconsistency is caused in ksm_madvise(): when user calls madvise() with MADV_UNMEARGEABLE on a VMA that is registered for UFFD in MINOR mode, it accidentally clears all flags stored in the upper 32 bits of vma->vm_flags. Assuming x86_64 kernel build, unsigned long is 64-bit and unsigned int and int are 32-bit wide. This setup causes the following mishap during the &= ~VM_MERGEABLE assignment. VM_MERGEABLE is a 32-bit constant of type unsigned int, 0x8000'0000. After ~ is applied, it becomes 0x7fff'ffff unsigned int, which is then promoted to unsigned long before the & operation. This promotion fills upper 32 bits with leading 0s, as we're doing unsigned conversion (and even for a signed conversion, this wouldn't help as the leading bit is 0). & operation thus ends up AND-ing vm_flags with 0x0000'0000'7fff'ffff instead of intended 0xffff'ffff'7fff'ffff and hence accidentally clears the upper 32-bits of its value. Fix it by changing `VM_MERGEABLE` constant to unsigned long, using the BIT() macro. Note: other VM_* flags are not affected: This only happens to the VM_MERGEABLE flag, as the other VM_* flags are all constants of type int and after ~ operation, they end up with leading 1 and are thus converted to unsigned long with leading 1s. Note 2: After commit 31defc3b01d9 ("userfaultfd: remove (VM_)BUG_ON()s"), this is no longer a kernel BUG, but a WARNING at the same place: [ 45.595973] WARNING: CPU: 1 PID: 2474 at mm/userfaultfd.c:2067 but the root-cause (flag-drop) remains the same. [akpm@linux-foundation.org: rust bindgen wasn't able to handle BIT(), from Miguel] Link: https://lore.kernel.org/oe-kbuild-all/202510030449.VfSaAjvd-lkp@intel.com/ Link: https://lkml.kernel.org/r/20251001090353.57523-2-acsjakub@amazon.de Fixes: 7677f7fd8be7 ("userfaultfd: add minor fault registration mode") Signed-off-by: Jakub Acs <acsjakub@amazon.de> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: SeongJae Park <sj@kernel.org> Tested-by: Alice Ryhl <aliceryhl@google.com> Tested-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Cc: Xu Xin <xu.xin16@zte.com.cn> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ acsjakub: drop rust-compatibility change (no rust in 5.15) ] Signed-off-by: Jakub Acs <acsjakub@amazon.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07bpf: Add bpf_prog_run_data_pointers()Eric Dumazet1-0/+20
[ Upstream commit 4ef92743625818932b9c320152b58274c05e5053 ] syzbot found that cls_bpf_classify() is able to change tc_skb_cb(skb)->drop_reason triggering a warning in sk_skb_reason_drop(). WARNING: CPU: 0 PID: 5965 at net/core/skbuff.c:1192 __sk_skb_reason_drop net/core/skbuff.c:1189 [inline] WARNING: CPU: 0 PID: 5965 at net/core/skbuff.c:1192 sk_skb_reason_drop+0x76/0x170 net/core/skbuff.c:1214 struct tc_skb_cb has been added in commit ec624fe740b4 ("net/sched: Extend qdisc control block with tc control block"), which added a wrong interaction with db58ba459202 ("bpf: wire in data and data_end for cls_act_bpf"). drop_reason was added later. Add bpf_prog_run_data_pointers() helper to save/restore the net_sched storage colliding with BPF data_meta/data_end. Fixes: ec624fe740b4 ("net/sched: Extend qdisc control block with tc control block") Reported-by: syzbot <syzkaller@googlegroups.com> Closes: https://lore.kernel.org/netdev/6913437c.a70a0220.22f260.013b.GAE@google.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20251112125516.1563021-1-edumazet@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07net_sched: act_connmark: use RCU in tcf_connmark_dump()Eric Dumazet1-0/+1
[ Upstream commit 0d752877705c0252ef2726e4c63c5573f048951c ] Also storing tcf_action into struct tcf_connmark_parms makes sure there is no discrepancy in tcf_connmark_act(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 62b656e43eae ("net: sched: act_connmark: initialize struct tc_ife to fix kernel leak") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07net/sched: act_connmark: transition to percpu stats and rcuPedro Tammela1-2/+7
[ Upstream commit 288864effe33885988d53faf7830b35cb9a84c7a ] The tc action act_connmark was using shared stats and taking the per action lock in the datapath. Improve it by using percpu stats and rcu. perf before: - 13.55% tcf_connmark_act - 81.18% _raw_spin_lock 80.46% native_queued_spin_lock_slowpath perf after: - 2.85% tcf_connmark_act tdc results: 1..15 ok 1 2002 - Add valid connmark action with defaults ok 2 56a5 - Add valid connmark action with control pass ok 3 7c66 - Add valid connmark action with control drop ok 4 a913 - Add valid connmark action with control pipe ok 5 bdd8 - Add valid connmark action with control reclassify ok 6 b8be - Add valid connmark action with control continue ok 7 d8a6 - Add valid connmark action with control jump ok 8 aae8 - Add valid connmark action with zone argument ok 9 2f0b - Add valid connmark action with invalid zone argument ok 10 9305 - Add connmark action with unsupported argument ok 11 71ca - Add valid connmark action and replace it ok 12 5f8f - Add valid connmark action with cookie ok 13 c506 - Replace connmark with invalid goto chain control ok 14 6571 - Delete connmark action with valid index ok 15 3426 - Delete connmark action with invalid index Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Stable-dep-of: 62b656e43eae ("net: sched: act_connmark: initialize struct tc_ife to fix kernel leak") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07net: sched: act: move global static variable net_id to tc_action_opsZhengchao Shao1-0/+1
[ Upstream commit acd0a7ab6334f35c3720120d53f79eb8e9b3ac2e ] Each tc action module has a corresponding net_id, so put net_id directly into the structure tc_action_ops. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 62b656e43eae ("net: sched: act_connmark: initialize struct tc_ife to fix kernel leak") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07compiler_types: Move unused static inline functions warning to W=2Peter Zijlstra1-3/+2
[ Upstream commit 9818af18db4bfefd320d0fef41390a616365e6f7 ] Per Nathan, clang catches unused "static inline" functions in C files since commit 6863f5643dd7 ("kbuild: allow Clang to find unused static inline functions for W=1 build"). Linus said: > So I entirely ignore W=1 issues, because I think so many of the extra > warnings are bogus. > > But if this one in particular is causing more problems than most - > some teams do seem to use W=1 as part of their test builds - it's fine > to send me a patch that just moves bad warnings to W=2. > > And if anybody uses W=2 for their test builds, that's THEIR problem.. Here is the change to bump the warning from W=1 to W=2. Fixes: 6863f5643dd7 ("kbuild: allow Clang to find unused static inline functions for W=1 build") Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20251106105000.2103276-1-andriy.shevchenko@linux.intel.com [nathan: Adjust comment as well] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07net/cls_cgroup: Fix task_get_classid() during qdisc runYafang Shao1-1/+1
[ Upstream commit 66048f8b3cc7e462953c04285183cdee43a1cb89 ] During recent testing with the netem qdisc to inject delays into TCP traffic, we observed that our CLS BPF program failed to function correctly due to incorrect classid retrieval from task_get_classid(). The issue manifests in the following call stack: bpf_get_cgroup_classid+5 cls_bpf_classify+507 __tcf_classify+90 tcf_classify+217 __dev_queue_xmit+798 bond_dev_queue_xmit+43 __bond_start_xmit+211 bond_start_xmit+70 dev_hard_start_xmit+142 sch_direct_xmit+161 __qdisc_run+102 <<<<< Issue location __dev_xmit_skb+1015 __dev_queue_xmit+637 neigh_hh_output+159 ip_finish_output2+461 __ip_finish_output+183 ip_finish_output+41 ip_output+120 ip_local_out+94 __ip_queue_xmit+394 ip_queue_xmit+21 __tcp_transmit_skb+2169 tcp_write_xmit+959 __tcp_push_pending_frames+55 tcp_push+264 tcp_sendmsg_locked+661 tcp_sendmsg+45 inet_sendmsg+67 sock_sendmsg+98 sock_write_iter+147 vfs_write+786 ksys_write+181 __x64_sys_write+25 do_syscall_64+56 entry_SYSCALL_64_after_hwframe+100 The problem occurs when multiple tasks share a single qdisc. In such cases, __qdisc_run() may transmit skbs created by different tasks. Consequently, task_get_classid() retrieves an incorrect classid since it references the current task's context rather than the skb's originating task. Given that dev_queue_xmit() always executes with bh disabled, we can use softirq_count() instead to obtain the correct classid. The simple steps to reproduce this issue: 1. Add network delay to the network interface: such as: tc qdisc add dev bond0 root netem delay 1.5ms 2. Build two distinct net_cls cgroups, each with a network-intensive task 3. Initiate parallel TCP streams from both tasks to external servers. Under this specific condition, the issue reliably occurs. The kernel eventually dequeues an SKB that originated from Task-A while executing in the context of Task-B. It is worth noting that it will change the established behavior for a slightly different scenario: <sock S is created by task A> <class ID for task A is changed> <skb is created by sock S xmit and classified> prior to this patch the skb will be classified with the 'new' task A classid, now with the old/original one. The bpf_get_cgroup_classid_curr() function is a more appropriate choice for this case. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Thomas Graf <tgraf@suug.ch> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250902062933.30087-1-laoar.shao@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07net: nfc: nci: Increase NCI_DATA_TIMEOUT to 3000 msJuraj Šarinay1-1/+1
[ Upstream commit 21f82062d0f241e55dd59eb630e8710862cc90b4 ] An exchange with a NFC target must complete within NCI_DATA_TIMEOUT. A delay of 700 ms is not sufficient for cryptographic operations on smart cards. CardOS 6.0 may need up to 1.3 seconds to perform 256-bit ECDH or 3072-bit RSA. To prevent brute-force attacks, passports and similar documents introduce even longer delays into access control protocols (BAC/PACE). The timeout should be higher, but not too much. The expiration allows us to detect that a NFC target has disappeared. Signed-off-by: Juraj Šarinay <juraj@sarinay.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250902113630.62393-1-juraj@sarinay.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07dmaengine: sh: setup_xref error handlingThomas Andreatta1-1/+1
[ Upstream commit d9a3e9929452780df16f3414f0d59b5f69d058cf ] This patch modifies the type of setup_xref from void to int and handles errors since the function can fail. `setup_xref` now returns the (eventual) error from `dmae_set_dmars`|`dmae_set_chcr`, while `shdma_tx_submit` handles the result, removing the chunks from the queue and marking PM as idle in case of an error. Signed-off-by: Thomas Andreatta <thomas.andreatta2000@gmail.com> Link: https://lore.kernel.org/r/20250827152442.90962-1-thomas.andreatta2000@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07bpf: Don't use %pK through printkThomas Weißschuh1-1/+1
[ Upstream commit 2caa6b88e0ba0231fb4ff0ba8e73cedd5fb81fc8 ] In the past %pK was preferable to %p as it would not leak raw pointer values into the kernel log. Since commit ad67b74d2469 ("printk: hash addresses printed with %p") the regular %p has been improved to avoid this issue. Furthermore, restricted pointers ("%pK") were never meant to be used through printk(). They can still unintentionally leak raw pointers or acquire sleeping locks in atomic contexts. Switch to the regular pointer formatting which is safer and easier to reason about. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250811-restricted-pointers-bpf-v1-1-a1d7cc3cb9e7@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07block: make REQ_OP_ZONE_OPEN a write operationDamien Le Moal1-5/+5
[ Upstream commit 19de03b312d69a7e9bacb51c806c6e3f4207376c ] A REQ_OP_OPEN_ZONE request changes the condition of a sequential zone of a zoned block device to the explicitly open condition (BLK_ZONE_COND_EXP_OPEN). As such, it should be considered a write operation. Change this operation code to be an odd number to reflect this. The following operation numbers are changed to keep the numbering compact. No problems were reported without this change as this operation has no data. However, this unifies the zone operation to reflect that they modify the device state and also allows strengthening checks in the block layer, e.g. checking if this operation is not issued against a read-only device. Fixes: 6c1b1da58f8c ("block: add zone open, close and finish operations") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> [ relocated REQ_OP_ZONE_APPEND from 15 to 21 to resolve numbering conflict ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>