summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2025-07-15smc: Fix various oops due to inet_sock type confusion.Kuniyuki Iwashima2-4/+18
syzbot reported weird splats [0][1] in cipso_v4_sock_setattr() while freeing inet_sk(sk)->inet_opt. The address was freed multiple times even though it was read-only memory. cipso_v4_sock_setattr() did nothing wrong, and the root cause was type confusion. The cited commit made it possible to create smc_sock as an INET socket. The issue is that struct smc_sock does not have struct inet_sock as the first member but hijacks AF_INET and AF_INET6 sk_family, which confuses various places. In this case, inet_sock.inet_opt was actually smc_sock.clcsk_data_ready(), which is an address of a function in the text segment. $ pahole -C inet_sock vmlinux struct inet_sock { ... struct ip_options_rcu * inet_opt; /* 784 8 */ $ pahole -C smc_sock vmlinux struct smc_sock { ... void (*clcsk_data_ready)(struct sock *); /* 784 8 */ The same issue for another field was reported before. [2][3] At that time, an ugly hack was suggested [4], but it makes both INET and SMC code error-prone and hard to change. Also, yet another variant was fixed by a hacky commit 98d4435efcbf3 ("net/smc: prevent NULL pointer dereference in txopt_get"). Instead of papering over the root cause by such hacks, we should not allow non-INET socket to reuse the INET infra. Let's add inet_sock as the first member of smc_sock. [0]: kvfree_call_rcu(): Double-freed call. rcu_head 000000006921da73 WARNING: CPU: 0 PID: 6718 at mm/slab_common.c:1956 kvfree_call_rcu+0x94/0x3f0 mm/slab_common.c:1955 Modules linked in: CPU: 0 UID: 0 PID: 6718 Comm: syz.0.17 Tainted: G W 6.16.0-rc4-syzkaller-g7482bb149b9f #0 PREEMPT Tainted: [W]=WARN Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : kvfree_call_rcu+0x94/0x3f0 mm/slab_common.c:1955 lr : kvfree_call_rcu+0x94/0x3f0 mm/slab_common.c:1955 sp : ffff8000a03a7730 x29: ffff8000a03a7730 x28: 00000000fffffff5 x27: 1fffe000184823d3 x26: dfff800000000000 x25: ffff0000c2411e9e x24: ffff0000dd88da00 x23: ffff8000891ac9a0 x22: 00000000ffffffea x21: ffff8000891ac9a0 x20: ffff8000891ac9a0 x19: ffff80008afc2480 x18: 00000000ffffffff x17: 0000000000000000 x16: ffff80008ae642c8 x15: ffff700011ede14c x14: 1ffff00011ede14c x13: 0000000000000004 x12: ffffffffffffffff x11: ffff700011ede14c x10: 0000000000ff0100 x9 : 5fa3c1ffaf0ff000 x8 : 5fa3c1ffaf0ff000 x7 : 0000000000000001 x6 : 0000000000000001 x5 : ffff8000a03a7078 x4 : ffff80008f766c20 x3 : ffff80008054d360 x2 : 0000000000000000 x1 : 0000000000000201 x0 : 0000000000000000 Call trace: kvfree_call_rcu+0x94/0x3f0 mm/slab_common.c:1955 (P) cipso_v4_sock_setattr+0x2f0/0x3f4 net/ipv4/cipso_ipv4.c:1914 netlbl_sock_setattr+0x240/0x334 net/netlabel/netlabel_kapi.c:1000 smack_netlbl_add+0xa8/0x158 security/smack/smack_lsm.c:2581 smack_inode_setsecurity+0x378/0x430 security/smack/smack_lsm.c:2912 security_inode_setsecurity+0x118/0x3c0 security/security.c:2706 __vfs_setxattr_noperm+0x174/0x5c4 fs/xattr.c:251 __vfs_setxattr_locked+0x1ec/0x218 fs/xattr.c:295 vfs_setxattr+0x158/0x2ac fs/xattr.c:321 do_setxattr fs/xattr.c:636 [inline] file_setxattr+0x1b8/0x294 fs/xattr.c:646 path_setxattrat+0x2ac/0x320 fs/xattr.c:711 __do_sys_fsetxattr fs/xattr.c:761 [inline] __se_sys_fsetxattr fs/xattr.c:758 [inline] __arm64_sys_fsetxattr+0xc0/0xdc fs/xattr.c:758 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151 el0_svc+0x58/0x180 arch/arm64/kernel/entry-common.c:879 el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:898 el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600 [1]: Unable to handle kernel write to read-only memory at virtual address ffff8000891ac9a8 KASAN: probably user-memory-access in range [0x0000000448d64d40-0x0000000448d64d47] Mem abort info: ESR = 0x000000009600004e EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x0e: level 2 permission fault Data abort info: ISV = 0, ISS = 0x0000004e, ISS2 = 0x00000000 CM = 0, WnR = 1, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000207144000 [ffff8000891ac9a8] pgd=0000000000000000, p4d=100000020f950003, pud=100000020f951003, pmd=0040000201000781 Internal error: Oops: 000000009600004e [#1] SMP Modules linked in: CPU: 0 UID: 0 PID: 6946 Comm: syz.0.69 Not tainted 6.16.0-rc4-syzkaller-g7482bb149b9f #0 PREEMPT Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025 pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : kvfree_call_rcu+0x31c/0x3f0 mm/slab_common.c:1971 lr : add_ptr_to_bulk_krc_lock mm/slab_common.c:1838 [inline] lr : kvfree_call_rcu+0xfc/0x3f0 mm/slab_common.c:1963 sp : ffff8000a28a7730 x29: ffff8000a28a7730 x28: 00000000fffffff5 x27: 1fffe00018b09bb3 x26: 0000000000000001 x25: ffff80008f66e000 x24: ffff00019beaf498 x23: ffff00019beaf4c0 x22: 0000000000000000 x21: ffff8000891ac9a0 x20: ffff8000891ac9a0 x19: 0000000000000000 x18: 00000000ffffffff x17: ffff800093363000 x16: ffff80008052c6e4 x15: ffff700014514ecc x14: 1ffff00014514ecc x13: 0000000000000004 x12: ffffffffffffffff x11: ffff700014514ecc x10: 0000000000000001 x9 : 0000000000000001 x8 : ffff00019beaf7b4 x7 : ffff800080a94154 x6 : 0000000000000000 x5 : ffff8000935efa60 x4 : 0000000000000008 x3 : ffff80008052c7fc x2 : 0000000000000001 x1 : ffff8000891ac9a0 x0 : 0000000000000001 Call trace: kvfree_call_rcu+0x31c/0x3f0 mm/slab_common.c:1967 (P) cipso_v4_sock_setattr+0x2f0/0x3f4 net/ipv4/cipso_ipv4.c:1914 netlbl_sock_setattr+0x240/0x334 net/netlabel/netlabel_kapi.c:1000 smack_netlbl_add+0xa8/0x158 security/smack/smack_lsm.c:2581 smack_inode_setsecurity+0x378/0x430 security/smack/smack_lsm.c:2912 security_inode_setsecurity+0x118/0x3c0 security/security.c:2706 __vfs_setxattr_noperm+0x174/0x5c4 fs/xattr.c:251 __vfs_setxattr_locked+0x1ec/0x218 fs/xattr.c:295 vfs_setxattr+0x158/0x2ac fs/xattr.c:321 do_setxattr fs/xattr.c:636 [inline] file_setxattr+0x1b8/0x294 fs/xattr.c:646 path_setxattrat+0x2ac/0x320 fs/xattr.c:711 __do_sys_fsetxattr fs/xattr.c:761 [inline] __se_sys_fsetxattr fs/xattr.c:758 [inline] __arm64_sys_fsetxattr+0xc0/0xdc fs/xattr.c:758 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151 el0_svc+0x58/0x180 arch/arm64/kernel/entry-common.c:879 el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:898 el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600 Code: aa1f03e2 52800023 97ee1e8d b4000195 (f90006b4) Fixes: d25a92ccae6b ("net/smc: Introduce IPPROTO_SMC") Reported-by: syzbot+40bf00346c3fe40f90f2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/686d9b50.050a0220.1ffab7.0020.GAE@google.com/ Tested-by: syzbot+40bf00346c3fe40f90f2@syzkaller.appspotmail.com Reported-by: syzbot+f22031fad6cbe52c70e7@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/686da0f3.050a0220.1ffab7.0022.GAE@google.com/ Reported-by: syzbot+271fed3ed6f24600c364@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=271fed3ed6f24600c364 # [2] Link: https://lore.kernel.org/netdev/99f284be-bf1d-4bc4-a629-77b268522fff@huawei.com/ # [3] Link: https://lore.kernel.org/netdev/20250331081003.1503211-1-wangliang74@huawei.com/ # [4] Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: D. Wythe <alibuda@linux.alibaba.com> Reviewed-by: Wang Liang <wangliang74@huawei.com> Link: https://patch.msgid.link/20250711060808.2977529-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-15dev: Pass netdevice_tracker to dev_get_by_flags_rcu().Kuniyuki Iwashima3-12/+14
This is a follow-up for commit eb1ac9ff6c4a5 ("ipv6: anycast: Don't hold RTNL for IPV6_JOIN_ANYCAST."). We should not add a new device lookup API without netdevice_tracker. Let's pass netdevice_tracker to dev_get_by_flags_rcu() and rename it with netdev_ prefix to match other newer APIs. Note that we always use GFP_ATOMIC for netdev_hold() as it's expected to be called under RCU. Suggested-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/netdev/20250708184053.102109f6@kernel.org/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250711051120.2866855-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-15net: phy: micrel: Add ksz9131_resume()Biju Das1-1/+9
The Renesas RZ/G3E SMARC EVK uses KSZ9131RNXC phy. On deep power state, PHY loses the power and on wakeup the rgmii delays are not reconfigured causing it to fail. Replace the callback kszphy_resume()->ksz9131_resume() for reconfiguring the rgmii_delay when it exits from PM suspend state. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250711054029.48536-1-biju.das.jz@bp.renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-15io_uring/net: cast min_not_zero() typeJens Axboe1-1/+1
kernel test robot reports that xtensa complains about different signedness for a min_not_zero() comparison. Cast the int part to size_t to avoid this issue. Fixes: e227c8cdb47b ("io_uring/net: use passed in 'len' in io_recv_buf_select()") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202507150504.zO5FsCPm-lkp@intel.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-07-15KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPUs have been createdKai Huang2-4/+7
Reject the KVM_SET_TSC_KHZ VM ioctl when vCPUs have been created and update the documentation to reflect it. The VM scope KVM_SET_TSC_KHZ ioctl is used to set up the default TSC frequency that all subsequently created vCPUs can use. It is only intended to be called before any vCPU is created. Allowing it to be called after that only results in confusion but nothing good. Note this is an ABI change. But currently in Qemu (the de facto userspace VMM) only TDX uses this VM ioctl, and it is only called once before creating any vCPU, therefore the risk of breaking userspace is pretty low. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Nikunj A Dadhania <nikunj@amd.com> Link: https://lore.kernel.org/r/135a35223ce8d01cea06b6cef30bfe494ec85827.1752444335.git.kai.huang@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-07-15scsi: ufs: ufs-qcom: Enable QUnipro Internal Clock GatingNitin Rawat2-0/+30
Enable internal clock gating for Qualcomm UFS host controller by setting the following attributes to 1 during host controller initialization: - DL_VS_CLK_CFG - PA_VS_CLK_CFG_REG - DME_VS_CORE_CLK_CTRL.DME_HW_CGC_EN This change is necessary to support the internal clock gating mechanism in Qualcomm UFS host controller. This is power saving feature and hence driver can continue to function correctly despite any error in enabling these feature. Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com> Link: https://lore.kernel.org/r/20250714075336.2133-4-quic_nitirawa@quicinc.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-15scsi: ufs: core: Add ufshcd_dme_rmw() to modify DME attributesNitin Rawat2-0/+25
Introduce ufshcd_dme_rmw() API to read, modify, and write DME attributes in UFS host controllers using a mask and value. Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com> Link: https://lore.kernel.org/r/20250714075336.2133-3-quic_nitirawa@quicinc.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-15scsi: ufs: ufs-qcom: Update esi_vec_mask for HW major version >= 6Bao D. Nguyen1-2/+1
The MCQ feature and ESI are supported by all Qualcomm UFS controller versions 6 and above. Therefore, update the ESI vector mask in the UFS_MEM_CFG3 register for platforms with major version number of 6 or higher. Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bao D. Nguyen <quic_nguyenb@quicinc.com> Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com> Link: https://lore.kernel.org/r/20250714075336.2133-2-quic_nitirawa@quicinc.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-07-15pNFS: Fix disk addr range check in block/scsi layoutSergey Bashirov1-2/+2
At the end of the isect translation, disc_addr represents the physical disk offset. Thus, end calculated from disk_addr is also a physical disk offset. Therefore, range checking should be done using map->disk_offset, not map->start. Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250702133226.212537-1-sergeybashirov@gmail.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15pNFS: Fix stripe mapping in block/scsi layoutSergey Bashirov1-2/+3
Because of integer division, we need to carefully calculate the disk offset. Consider the example below for a stripe of 6 volumes, a chunk size of 4096, and an offset of 70000. chunk = div_u64(offset, dev->chunk_size) = 70000 / 4096 = 17 offset = chunk * dev->chunk_size = 17 * 4096 = 69632 disk_offset_wrong = div_u64(offset, dev->nr_children) = 69632 / 6 = 11605 disk_chunk = div_u64(chunk, dev->nr_children) = 17 / 6 = 2 disk_offset = disk_chunk * dev->chunk_size = 2 * 4096 = 8192 Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250701122341.199112-1-sergeybashirov@gmail.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15pNFS: Handle RPC size limit for layoutcommitsSergey Bashirov1-3/+8
When there are too many block extents for a layoutcommit, they may not all fit into the maximum-sized RPC. This patch allows the generic pnfs code to properly handle -ENOSPC returned by the block/scsi layout driver and trigger additional layoutcommits if necessary. Co-developed-by: Konstantin Evtushenko <koevtushenko@yandex.com> Signed-off-by: Konstantin Evtushenko <koevtushenko@yandex.com> Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250630183537.196479-5-sergeybashirov@gmail.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15pNFS: Add prepare commit trace to block/scsi layoutSergey Bashirov3-3/+38
Replace dprintk with trace event in ext_tree_prepare_commit() function. Co-developed-by: Konstantin Evtushenko <koevtushenko@yandex.com> Signed-off-by: Konstantin Evtushenko <koevtushenko@yandex.com> Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250630183537.196479-4-sergeybashirov@gmail.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15pNFS: Fix extent encoding in block/scsi layoutSergey Bashirov1-6/+74
The ext_tree_encode_commit() function may be called multiple times for the same file, layout, and last written byte if the provided buffer is not large enough to encode all extents in it. The first problem is that the last written byte field must be zeroed only on a successful call, otherwise we will lose its actual value and get an integer overflow on the next encoding attempt. The second problem is that we can't count and encode in one pass. The extent state changes during encoding, so if we return -ENOSPC but have already encoded some extents into a small buffer, they will not be re-encoded into a new larger buffer on the next try. As a result, the client never commits these extents to the server. Co-developed-by: Konstantin Evtushenko <koevtushenko@yandex.com> Signed-off-by: Konstantin Evtushenko <koevtushenko@yandex.com> Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250630183537.196479-3-sergeybashirov@gmail.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15pNFS: Fix uninited ptr deref in block/scsi layoutSergey Bashirov1-5/+15
The error occurs on the third attempt to encode extents. When function ext_tree_prepare_commit() reallocates a larger buffer to retry encoding extents, the "layoutupdate_pages" page array is initialized only after the retry loop. But ext_tree_free_commitdata() is called on every iteration and tries to put pages in the array, thus dereferencing uninitialized pointers. An additional problem is that there is no limit on the maximum possible buffer_size. When there are too many extents, the client may create a layoutcommit that is larger than the maximum possible RPC size accepted by the server. During testing, we observed two typical scenarios. First, one memory page for extents is enough when we work with small files, append data to the end of the file, or preallocate extents before writing. But when we fill a new large file without preallocating, the number of extents can be huge, and counting the number of written extents in ext_tree_encode_commit() does not help much. Since this number increases even more between unlocking and locking of ext_tree, the reallocated buffer may not be large enough again and again. Co-developed-by: Konstantin Evtushenko <koevtushenko@yandex.com> Signed-off-by: Konstantin Evtushenko <koevtushenko@yandex.com> Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250630183537.196479-2-sergeybashirov@gmail.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15NFS: Remove unused function nfs_umountDr. David Alan Gilbert2-69/+0
nfs_umount() has been unused since 2013's commit 4580a92d44e2 ("NFS: Use server-recommended security flavor by default (NFSv3)") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20250218215250.263709-1-linux@treblig.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15SUNRPC: Remove unused xdr functionsDr. David Alan Gilbert2-119/+0
Remove a bunch of unused xdr_*decode* functions: The last use of xdr_decode_netobj() was removed in 2021 by: commit 7cf96b6d0104 ("lockd: Update the NLMv4 SHARE arguments decoder to use struct xdr_stream") The last use of xdr_decode_string_inplace() was removed in 2021 by: commit 3049e974a7c7 ("lockd: Update the NLMv4 FREE_ALL arguments decoder to use struct xdr_stream") The last use of xdr_stream_decode_opaque() was removed in 2024 by: commit fed8a17c61ff ("xdrgen: typedefs should use the built-in string and opaque functions") The functions xdr_stream_decode_string() and xdr_stream_decode_opaque_dup() were both added in 2018 by the commit 0e779aa70308 ("SUNRPC: Add helpers for decoding opaque and string types") but never used. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20250712233006.403226-1-linux@treblig.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15nfs: create a kernel keyringChristoph Hellwig1-0/+35
Create a kernel .nfs keyring similar to the nvme .nvme one. Unlike for a userspace-created keyrind, tlshd is a possesor of the keys with this and thus the keys don't need user read permissions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Link: https://lore.kernel.org/r/20250515115107.33052-3-hch@lst.de Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15NFS: support the kernel keyring for TLSChristoph Hellwig1-0/+42
Allow tlshd to use a per-mount key from the kernel keyring similar to NVMe over TCP. Note that tlshd expects keys and certificates stored in the kernel keyring to be in DER format, not the PEM format used for file based keys and certificates, so they need to be converted before they are added to the keyring, which is a bit unexpected. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Link: https://lore.kernel.org/r/20250515115107.33052-2-hch@lst.de Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15NFS: Allow folio migration for the case of mode == MIGRATE_SYNCTrond Myklebust1-2/+6
When the mode is MIGRATE_SYNC, we are allowed to call nfs_wb_folio() under the folio lock. Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15nfs: new tracepoint in match_stateid operationJeff Layton2-0/+61
Add new tracepoints in the NFSv4 match_stateid minorversion op that show the info in both stateids. Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20250618-nfs-tracepoints-v2-4-540c9fb48da2@kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15nfs: new tracepoint in nfs_delegation_need_returnJeff Layton2-0/+49
Add a tracepoint in the function that decides whether to return a delegation to the server. Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20250618-nfs-tracepoints-v2-3-540c9fb48da2@kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15nfs: add a tracepoint to nfs_inode_detach_delegation_lockedJeff Layton2-0/+3
We have tracepoints for setting a delegation and reclaiming them. Add a tracepoint for when the delegation is being detached from the inode. Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20250618-nfs-tracepoints-v2-2-540c9fb48da2@kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15nfs: add cache_validity to the nfs_inode_event tracepointsJeff Layton1-2/+6
Managing the cache_validity flags is the deep voodoo of NFS cache coherency. Let's have a little extra visibility into that value via the nfs_inode_event tracepoints. Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20250618-nfs-tracepoints-v2-1-540c9fb48da2@kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15NFS: remove unused pnfs_ld_data field from struct nfs_serverAnthony Iliopoulos1-1/+0
The last code that was using this was removed via commit 20d655d6197d ("pnfs/blocklayout: use the device id cache") which was merged in v3.18-rc1, so it can be removed completely. Signed-off-by: Anthony Iliopoulos <ailiop@suse.com> Link: https://lore.kernel.org/r/20250613094439.82338-4-ailiop@suse.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15NFS: remove unused time_delta field from struct nfs_serverAnthony Iliopoulos2-2/+0
The last code that was using this was removed via commit ca0daa277aca ("NFS: Cache aggressively when file is open for writing") which was merged in v4.8-rc1, so it can be removed completely. Signed-off-by: Anthony Iliopoulos <ailiop@suse.com> Link: https://lore.kernel.org/r/20250613094439.82338-3-ailiop@suse.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15NFS: remove unused wpages field from struct nfs_serverAnthony Iliopoulos2-2/+0
The wpages field is not serving any purpose since commit c63c7b051395 ("NFS: Fix a race when doing NFS write coalescing") which was merged in v2.6.22-rc1. Remove it completely. Signed-off-by: Anthony Iliopoulos <ailiop@suse.com> Link: https://lore.kernel.org/r/20250613094439.82338-2-ailiop@suse.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15pnfs: add pnfs_ds_connect trace pointTigran Mkrtchyan3-5/+36
This tracepoint aims to expose pnfs DS connect status Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Link: https://lore.kernel.org/r/20250610151246.9147-1-tigran.mkrtchyan@desy.de Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15nfs: use lock_two_nondirectories()NeilBrown1-23/+2
Rather than open-coding this function call it to make intention clear and to use "correct" nesting levels (parent and child are for directories). This is purely cosmetic with no expected change in behaviour. Signed-off-by: NeilBrown <neil@brown.name> Link: https://lore.kernel.org/r/174942091741.608730.3327223511347232829@noble.neil.brown.name Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15NFS: Return the file btime in the statx results when appropriateTrond Myklebust2-3/+15
If the server supports the NFSv4.x "create_time" attribute, then return it as part of the statx results. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/eae27d6467e08aaa67e0ac6ae7119263a0f83349.1748515333.git.bcodding@redhat.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15nfs: Add timecreate to nfs inodeAnne Marie Merritt6-4/+65
Add tracking of the create time (a.k.a. btime) along with corresponding bitfields, request, and decode xdr routines. Signed-off-by: Anne Marie Merritt <annemarie.merritt@primarydata.com> Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/1e3677b0655fa2bbaba0817b41d111d94a06e5ee.1748515333.git.bcodding@redhat.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15Expand the type of nfs_fattr->validTrond Myklebust3-29/+29
We need to be able to track more than 32 attributes per inode. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/1e3405fca54efd0be7c91c1da77917b94f5dfcc4.1748515333.git.bcodding@redhat.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-15KVM: SVM: Flush cache only on CPUs running SEV guestZheyun Shen2-9/+63
On AMD CPUs without ensuring cache consistency, each memory page reclamation in an SEV guest triggers a call to do WBNOINVD/WBINVD on all CPUs, thereby affecting the performance of other programs on the host. Typically, an AMD server may have 128 cores or more, while the SEV guest might only utilize 8 of these cores. Meanwhile, host can use qemu-affinity to bind these 8 vCPUs to specific physical CPUs. Therefore, keeping a record of the physical core numbers each time a vCPU runs can help avoid flushing the cache for all CPUs every time. Take care to allocate the cpumask used to track which CPUs have run a vCPU when copying or moving an "encryption context", as nothing guarantees memory in a mirror VM is a strict subset of the ASID owner, and the destination VM for intrahost migration needs to maintain it's own set of CPUs. E.g. for intrahost migration, if a CPU was used for the source VM but not the destination VM, then it can only have cached memory that was accessible to the source VM. And a CPU that was run in the source is also used by the destination is no different than a CPU that was run in the destination only. Note, KVM is guaranteed to do flush caches prior to sev_vm_destroy(), thanks to kvm_arch_guest_memory_reclaimed for SEV and SEV-ES, and kvm_arch_gmem_invalidate() for SEV-SNP. I.e. it's safe to free the cpumask prior to unregistering encrypted regions and freeing the ASID. Opportunistically clean up sev_vm_destroy()'s comment regarding what is (implicitly, what isn't) skipped for mirror VMs. Cc: Srikanth Aithal <sraithal@amd.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Zheyun Shen <szy0127@sjtu.edu.cn> Link: https://lore.kernel.org/r/20250522233733.3176144-9-seanjc@google.com Link: https://lore.kernel.org/all/935a82e3-f7ad-47d7-aaaf-f3d2b62ed768@amd.com Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-07-15Merge branch 'bpf-tcp-exactly-once-socket-iteration'Martin KaFai Lau3-84/+679
Jordan Rife says: ==================== bpf: tcp: Exactly-once socket iteration TCP socket iterators use iter->offset to track progress through a bucket, which is a measure of the number of matching sockets from the current bucket that have been seen or processed by the iterator. On subsequent iterations, if the current bucket has unprocessed items, we skip at least iter->offset matching items in the bucket before adding any remaining items to the next batch. However, iter->offset isn't always an accurate measure of "things already seen" when the underlying bucket changes between reads, which can lead to repeated or skipped sockets. Instead, this series remembers the cookies of the sockets we haven't seen yet in the current bucket and resumes from the first cookie in that list that we can find on the next iteration. This is a continuation of the work started in [1]. This series largely replicates the patterns applied to UDP socket iterators, applying them instead to TCP socket iterators. CHANGES ======= v5 -> v6: * In patch ten ("selftests/bpf: Create established sockets in socket iterator tests"), use poll() to choose a socket that has a connection ready to be accept()ed. Before, connect_to_server would set the O_NONBLOCK flag on all listening sockets so that accept_from_one could loop through them all and find the one that connect_to_addr_str connected to. However, this is subtly buggy and could potentially lead to test flakes, since the 3 way handshake isn't necessarily done when connect returns, so it's possible none of the accept() calls succeed. Use poll() instead to guarantee that the socket we accept() from is ready and eliminate the need for the O_NONBLOCK flag (Martin). v4 -> v5: * Move WARN_ON_ONCE before the `done` label in patch two ("bpf: tcp: Make sure iter->batch always contains a full bucket snapshot"") (Martin). * Remove unnecessary kfunc declaration in patch eleven ("selftests/bpf: Create iter_tcp_destroy test program") (Martin). * Make sure to close the socket fd at the end of `destroy` in patch twelve ("selftests/bpf: Add tests for bucket resume logic in established sockets") (Martin). v3 -> v4: * Drop braces around sk_nulls_for_each_from in patch five ("bpf: tcp: Avoid socket skips and repeats during iteration") (Stanislav). * Add a break after the TCP_SEQ_STATE_ESTABLISHED case in patch five (Stanislav). * Add an `if (sock_type == SOCK_STREAM)` check before assigning TCP_LISTEN to skel->rodata->ss in patch eight ("selftests/bpf: Allow for iteration over multiple states") to more clearly express the intent that the option is only consumed for SOCK_STREAM tests (Stanislav). * Move the `i = 0` assignment into the for loop in patch ten ("selftests/bpf: Create established sockets in socket iterator tests") (Stanislav). v2 -> v3: * Unroll the loop inside bpf_iter_tcp_batch to make the logic easier to follow in patch two ("bpf: tcp: Make sure iter->batch always contains a full bucket snapshot"). This gets rid of the `resizes` variable from v2 and eliminates the extra conditional that checks how many batch resize attempts have occurred so far (Stanislav). Note: This changes the behavior slightly. Before, in the case that the second call to tcp_seek_last_pos (and later bpf_iter_tcp_resume) advances to a new bucket, which may happen if the current bucket is emptied after releasing its lock, the `resizes` "budget" would be reset, the net effect being that we would try a batch resize with GFP_USER at most once per bucket. Now, we try to resize the batch with GFP_USER at most once per call, so it makes it slightly more likely that we hit the GFP_NOWAIT scenario. However, this edge case should be rare in practice anyway, and the new behavior is more or less consistent with the original retry logic, so avoid the loop and prefer code clarity. * Move the call to bpf_iter_tcp_put_batch out of bpf_iter_tcp_realloc_batch and call it directly before invoking bpf_iter_tcp_realloc_batch with GFP_USER inside bpf_iter_tcp_batch. /Don't/ call it before invoking bpf_iter_tcp_realloc_batch the second time while we hold the lock with GFP_NOWAIT. This avoids a conditional inside bpf_iter_tcp_realloc_batch from v2 that only calls bpf_iter_tcp_put_batch if flags != GFP_NOWAIT and is a bit more explicit (Stanislav). * Adjust patch five ("bpf: tcp: Avoid socket skips and repeats during iteration") to fit with the new logic in patch two. v1 -> v2: * In patch five ("bpf: tcp: Avoid socket skips and repeats during iteration"), remove unnecessary bucket bounds checks in bpf_iter_tcp_resume. In either case, if st->bucket is outside the current table's range then bpf_iter_tcp_resume_* calls *_get_first which immediately returns NULL anyway and the logic will fall through. (Martin) * Add a check at the top of bpf_iter_tcp_resume_listening and bpf_iter_tcp_resume_established to see if we're done with the current bucket and advance it immediately instead of wasting time finding the first matching socket in that bucket with (listening|established)_get_first. In v1, we originally discussed adding logic to advance the bucket in bpf_iter_tcp_seq_next and bpf_iter_tcp_seq_stop, but after trying this the logic seemed harder to track. Overall, keeping everything inside bpf_iter_tcp_resume_* seemed a bit clearer. (Martin) * Instead of using a timeout in the last patch ("selftests/bpf: Add tests for bucket resume logic in established sockets") to wait for sockets to leave the ehash table after calling close(), use bpf_sock_destroy to deterministically destroy and remove them. This introduces one more patch ("selftests/bpf: Create iter_tcp_destroy test program") to create the iterator program that destroys a selected socket. Drive this through a destroy() function in the last patch which, just like close(), accepts a socket file descriptor. (Martin) * Introduce one more patch ("selftests/bpf: Allow for iteration over multiple states") to fix a latent bug in iter_tcp_soreuse where the sk->sk_state != TCP_LISTEN check was ignored. Add the "ss" variable to allow test code to configure which socket states to allow. [1]: https://lore.kernel.org/bpf/20250502161528.264630-1-jordan@jrife.io/ ==================== Link: https://patch.msgid.link/20250714180919.127192-1-jordan@jrife.io Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2025-07-15selftests/bpf: Add tests for bucket resume logic in established socketsJordan Rife1-0/+293
Replicate the set of test cases used for UDP socket iterators to test similar scenarios for TCP established sockets. Signed-off-by: Jordan Rife <jordan@jrife.io> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me>
2025-07-15selftests/bpf: Create iter_tcp_destroy test programJordan Rife1-0/+21
Prepare for bucket resume tests for established TCP sockets by creating a program to immediately destroy and remove sockets from the TCP ehash table, since close() is not deterministic. Signed-off-by: Jordan Rife <jordan@jrife.io> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me>
2025-07-15selftests/bpf: Create established sockets in socket iterator testsJordan Rife1-4/+85
Prepare for bucket resume tests for established TCP sockets by creating established sockets. Collect socket fds from connect() and accept() sides and pass them to test cases. Signed-off-by: Jordan Rife <jordan@jrife.io> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me>
2025-07-15selftests/bpf: Make ehash buckets configurable in socket iterator testsJordan Rife1-1/+18
Prepare for bucket resume tests for established TCP sockets by making the number of ehash buckets configurable. Subsequent patches force all established sockets into the same bucket by setting ehash_buckets to one. Signed-off-by: Jordan Rife <jordan@jrife.io> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me>
2025-07-15jfs: stop using write_cache_pagesChristoph Hellwig1-3/+5
Stop using the obsolete write_cache_pages and use writeback_iter directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2025-07-15jfs: truncate good inode pages when hard link is 0Lizhi Xu1-1/+1
The fileset value of the inode copy from the disk by the reproducer is AGGR_RESERVED_I. When executing evict, its hard link number is 0, so its inode pages are not truncated. This causes the bugon to be triggered when executing clear_inode() because nrpages is greater than 0. Reported-by: syzbot+6e516bb515d93230bc7b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6e516bb515d93230bc7b Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2025-07-15jfs: jfs_xtree: replace XT_GETPAGE macro with xt_getpage()Suchit Karunakaran1-64/+78
Replace legacy XT_GETPAGE macro with an inline function that returns a xtpage_t pointer and update all instances of XT_GETPAGE in jfs_xtree.c Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com> Simplified xt_getpage by removing size and rc arguments. Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2025-07-15jfs: Regular file corruption checkEdward Adam Davis1-0/+3
The reproducer builds a corrupted file on disk with a negative i_size value. Add a check when opening this file to avoid subsequent operation failures. Reported-by: syzbot+630f6d40b3ccabc8e96e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=630f6d40b3ccabc8e96e Tested-by: syzbot+630f6d40b3ccabc8e96e@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2025-07-15jfs: upper bound check of tree index in dbAllocAGArnaud Lecomte1-0/+6
When computing the tree index in dbAllocAG, we never check if we are out of bounds realative to the size of the stree. This could happen in a scenario where the filesystem metadata are corrupted. Reported-by: syzbot+cffd18309153948f3c3e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=cffd18309153948f3c3e Tested-by: syzbot+cffd18309153948f3c3e@syzkaller.appspotmail.com Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2025-07-15rust: types: require `ForeignOwnable::into_foreign` return non-nullAndreas Hindborg1-0/+1
The intended implementations of `ForeignOwnable` will not return null pointers from `into_foreign`, as this would render the implementation of `try_from_foreign` useless. Current users of `ForeignOwnable` rely on `into_foreign` returning non-null pointers. So require `into_foreign` to return non-null pointers. Suggested-by: Benno Lossin <lossin@kernel.org> Suggested-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org> Reviewed-by: Benno Lossin <lossin@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250612-pointed-to-v3-2-b009006d86a1@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-07-15rust: types: add FOREIGN_ALIGN to ForeignOwnableAndreas Hindborg7-63/+70
The current implementation of `ForeignOwnable` is leaking the type of the opaque pointer to consumers of the API. This allows consumers of the opaque pointer to rely on the information that can be extracted from the pointer type. To prevent this, change the API to the version suggested by Maira Canal (link below): Remove `ForeignOwnable::PointedTo` in favor of a constant, which specifies the alignment of the pointers returned by `into_foreign`. With this change, `ArcInner` no longer needs `pub` visibility, so change it to private. Suggested-by: Alice Ryhl <aliceryhl@google.com> Suggested-by: Maíra Canal <mcanal@igalia.com> Link: https://lore.kernel.org/r/20240309235927.168915-3-mcanal@igalia.com Acked-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Benno Lossin <lossin@kernel.org> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250612-pointed-to-v3-1-b009006d86a1@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-07-15rust: rbtree: simplify finding `current` in `remove_current`Onur Özkan1-16/+7
The previous version used a verbose `match` to get `current`, which may be slightly confusing at first glance. This change makes it shorter and more clearly expresses the intent: prefer `next` if available, otherwise fall back to `prev`. Signed-off-by: Onur Özkan <work@onurozkan.dev> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Link: https://lore.kernel.org/r/20250708075850.25789-1-work@onurozkan.dev Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-07-15rust: helpers: sort includes alphabeticallyKrishna Ketan Rai1-1/+1
The helper includes should be sorted alphabetically as indicated by the comment at the top of the file, but they were not. Sort them properly. Suggested-by: Alice Ryhl <aliceryhl@google.com> Link: https://github.com/Rust-for-Linux/linux/issues/1174 Signed-off-by: Krishna Ketan Rai <prafulrai522@gmail.com> Link: https://lore.kernel.org/r/20250629152533.889-1-prafulrai522@gmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-07-15rust: uaccess: use newtype for user pointersAlice Ryhl3-12/+63
Currently, Rust code uses a typedef for unsigned long to represent userspace addresses. This is unfortunate because it means that userspace addresses could accidentally be mixed up with other integers. To alleviate that, we introduce a new UserPtr struct that wraps a raw pointer to represent a userspace address. By using a struct, type checking enforces that userspace addresses cannot be mixed up with anything else. This is similar to the __user annotation in C that detects cases where user pointers are mixed with non-user pointers. Note that unlike __user pointers in C, this type is just a pointer without a target type. This means that it can't detect cases such as mixing up which struct this user pointer references. However, that is okay due to the way this is intended to be used - generally, you create a UserPtr in your ioctl callback from the provided usize *before* dispatching on which ioctl is in use, and then after dispatching on the ioctl you pass the UserPtr into a UserSliceReader or UserSliceWriter; selecting the target type does not happen until you have obtained the UserSliceReader/Writer. The UserPtr type is not marked with #[derive(Debug)], which means that it's not possible to print values of this type. This avoids ASLR leakage. The type is added to the prelude as it is a fairly fundamental type similar to c_int. The wrapping_add() method is renamed to wrapping_byte_add() for consistency with the method name found on raw pointers. Reviewed-by: Benno Lossin <lossin@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Christian Schrefl <chrisi.schrefl@gmail.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250616-userptr-newtype-v3-1-5ff7b2d18d9e@google.com [ Reworded title. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-07-15iommu/tegra241-cmdqv: import IOMMUFD module namespaceArnd Bergmann1-0/+2
The tegra variant of smmu-v3 now uses the iommufd mmap interface but is missing the corresponding import: ERROR: modpost: module arm_smmu_v3 uses symbol _iommufd_object_depend from namespace IOMMUFD, but does not import it. ERROR: modpost: module arm_smmu_v3 uses symbol iommufd_viommu_report_event from namespace IOMMUFD, but does not import it. ERROR: modpost: module arm_smmu_v3 uses symbol _iommufd_destroy_mmap from namespace IOMMUFD, but does not import it. ERROR: modpost: module arm_smmu_v3 uses symbol _iommufd_object_undepend from namespace IOMMUFD, but does not import it. ERROR: modpost: module arm_smmu_v3 uses symbol _iommufd_alloc_mmap from namespace IOMMUFD, but does not import it. Fixes: b135de24cfc0 ("iommu/tegra241-cmdqv: Add user-space use support") Link: https://patch.msgid.link/r/20250714205747.3475772-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-15rust: use `#[used(compiler)]` to fix build and `modpost` with Rust >= 1.89.0Miguel Ojeda6-8/+13
Starting with Rust 1.89.0 (expected 2025-08-07), the Rust compiler fails to build the `rusttest` target due to undefined references such as: kernel...-cgu.0:(.text....+0x116): undefined reference to `rust_helper_kunit_get_current_test' Moreover, tooling like `modpost` gets confused: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/nova/nova.o ERROR: modpost: missing MODULE_LICENSE() in drivers/gpu/nova-core/nova_core.o The reason behind both issues is that the Rust compiler will now [1] treat `#[used]` as `#[used(linker)]` instead of `#[used(compiler)]` for our targets. This means that the retain section flag (`R`, `SHF_GNU_RETAIN`) will be used and that they will be marked as `unique` too, with different IDs. In turn, that means we end up with undefined references that did not get discarded in `rusttest` and that multiple `.modinfo` sections are generated, which confuse tooling like `modpost` because they only expect one. Thus start using `#[used(compiler)]` to keep the previous behavior and to be explicit about what we want. Sadly, it is an unstable feature (`used_with_arg`) [2] -- we will talk to upstream Rust about it. The good news is that it has been available for a long time (Rust >= 1.60) [3]. The changes should also be fine for previous Rust versions, since they behave the same way as before [4]. Alternatively, we could use `#[no_mangle]` or `#[export_name = ...]` since those still behave like `#[used(compiler)]`, but of course it is not really what we want to express, and it requires other changes to avoid symbol conflicts. Cc: David Wood <david@davidtw.co> Cc: Wesley Wiser <wwiser@gmail.com> Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs). Link: https://github.com/rust-lang/rust/pull/140872 [1] Link: https://github.com/rust-lang/rust/issues/93798 [2] Link: https://github.com/rust-lang/rust/pull/91504 [3] Link: https://godbolt.org/z/sxzWTMfzW [4] Reviewed-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Björn Roy Baron <bjorn3_gh@protonmail.com> Link: https://lore.kernel.org/r/20250712160103.1244945-3-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-07-15dt-bindings: gpu: mali-bifrost: Add Allwinner A523 compatibleMikhail Kalashnikov1-0/+1
Add a compatible for the Allwinner A523 SoC, with an integrated ARM Mali G57 MC1 (Valhall-JM) GPU. Signed-off-by: Mikhail Kalashnikov <iuncuim@gmail.com> Link: https://lore.kernel.org/r/20250711035730.17507-2-iuncuim@gmail.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>