summaryrefslogtreecommitdiff
path: root/include/uapi/linux
AgeCommit message (Collapse)AuthorFilesLines
2025-05-08platform/x86: ISST: Support SST-PP revision 2Srinivas Pandruvada1-0/+26
SST PP revision 2 added fabric 1 P0, P1 and Pm frequencies. Export them by using a new IOCTL ISST_IF_GET_PERF_LEVEL_FABRIC_INFO. This IOCTL requires platforms with SST PP revision 2 or higher. To accommodate potential future increases in fabric count and avoid ABI changes, support is extended for up to 8 fabrics. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20250506163531.1061185-3-srinivas.pandruvada@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-05-08bpf: Clarify handling of mark and tstamp by redirect_peerPaul Chaignon1-0/+3
When switching network namespaces with the bpf_redirect_peer helper, the skb->mark and skb->tstamp fields are not zeroed out like they can be on a typical netns switch. This patch clarifies that in the helper description. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/ccc86af26d43c5c0b776bcba2601b7479c0d46d0.1746460653.git.paul.chaignon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08fs: add atomic write unit max opt to statxJohn Garry1-2/+6
XFS will be able to support large atomic writes (atomic write > 1x block) in future. This will be achieved by using different operating methods, depending on the size of the write. Specifically a new method of operation based in FS atomic extent remapping will be supported in addition to the current HW offload-based method. The FS method will generally be appreciably slower performing than the HW-offload method. However the FS method will be typically able to contribute to achieving a larger atomic write unit max limit. XFS will support a hybrid mode, where HW offload method will be used when possible, i.e. HW offload is used when the length of the write is supported, and for other times FS-based atomic writes will be used. As such, there is an atomic write length at which the user may experience appreciably slower performance. Advertise this limit in a new statx field, stx_atomic_write_unit_max_opt. When zero, it means that there is no such performance boundary. Masks STATX{_ATTR}_WRITE_ATOMIC can be used to get this new field. This is ok for older kernels which don't support this new field, as they would report 0 in this field (from zeroing in cp_statx()) already. Furthermore those older kernels don't support large atomic writes - apart from block fops, but there would be consistent performance there for atomic writes in range [unit min, unit max]. Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: John Garry <john.g.garry@oracle.com>
2025-05-07Merge tag 'wireless-next-2025-05-06' of ↵Jakub Kicinski1-0/+6
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes Berg says: ==================== wireless features, notably * stack - free SKBTX_WIFI_STATUS flag - fixes for VLAN multicast in multi-link - improve codel parameters (revert some old twiddling) * ath12k - Enable AHB support for IPQ5332. - Add monitor interface support to QCN9274. - Add MLO support to WCN7850. - Add 802.11d scan offload support to WCN7850. * ath11k - Restore hibernation support * iwlwifi - EMLSR on two 5 GHz links * mwifiex - cleanups/refactoring along with many other small features/cleanups * tag 'wireless-next-2025-05-06' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (177 commits) Revert "wifi: iwlwifi: clean up config macro" wifi: iwlwifi: move phy_filters to fw_runtime wifi: iwlwifi: pcie: make sure to lock rxq->read wifi: iwlwifi: add definitions for iwl_mac_power_cmd version 2 wifi: iwlwifi: clean up config macro wifi: iwlwifi: mld: simplify iwl_mld_rx_fill_status() wifi: iwlwifi: mld: rx: simplify channel handling wifi: iwlwifi: clean up band in RX metadata wifi: iwlwifi: mld: skip unknown FW channel load values wifi: iwlwifi: define API for external FSEQ images wifi: iwlwifi: mld: allow EMLSR on separated 5 GHz subbands wifi: iwlwifi: mld: use cfg80211_chandef_get_width() wifi: iwlwifi: mld: fix iwl_mld_emlsr_disallowed_with_link() return wifi: iwlwifi: mld: clarify variable type wifi: iwlwifi: pcie: add support for the reset handshake in MSI wifi: mac80211_hwsim: Prevent tsf from setting if beacon is disabled wifi: mac80211: restructure tx profile retrieval for MLO MBSSID wifi: nl80211: add link id of transmitted profile for MLO MBSSID wifi: ieee80211: Add helpers to fetch EMLSR delay and timeout values wifi: mac80211: update ML STA with EML capabilities ... ==================== Link: https://patch.msgid.link/20250506174656.119970-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07devlink: define enum for attr types of dynamic attributesJiri Pirko1-0/+15
Devlink param and health reporter fmsg use attributes with dynamic type which is determined according to a different type. Currently used values are NLA_*. The problem is, they are not part of UAPI. They may change which would cause a break. To make this future safe, introduce a enum that shadows NLA_* values in it and is part of UAPI. Also, this allows to possibly carry types that are unrelated to NLA_* values. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250505114513.53370-3-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06io_uring/zcrx: dmabuf backed zerocopy receivePavel Begunkov1-1/+5
Add support for dmabuf backed zcrx areas. To use it, the user should pass IORING_ZCRX_AREA_DMABUF in the struct io_uring_zcrx_area_reg flags field and pass a dmabuf fd in the dmabuf_fd field. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20bb1890e60a82ec945ab36370d1fd54be414ab6.1746097431.git.asml.silence@gmail.com Link: https://lore.kernel.org/io-uring/6e37db97303212bbd8955f9501cf99b579f8aece.1746547722.git.asml.silence@gmail.com [axboe: fold in fixup] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-06io_uring: enable per-io write streamsKeith Busch1-0/+4
Allow userspace to pass a per-I/O write stream in the SQE: __u8 write_stream; The __u8 type matches the size the filesystems and block layer support. Application can query the supported values from the block devices max_write_streams sysfs attribute. Unsupported values are ignored by file operations that do not support write streams or rejected with an error by those that support them. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Link: https://lore.kernel.org/r/20250506121732.8211-7-joshi.k@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-06BackMerge tag 'v6.15-rc5' into drm-nextDave Airlie5-38/+63
Linux 6.15-rc5, requested by tzimmerman for fixes required in drm-next. Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-05-05block: remove bounce buffering supportChristoph Hellwig1-1/+1
The block layer bounce buffering support is unused now, remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20250505081138.3435992-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-04dm mpath: Interface for explicit probing of active pathsKevin Wolf1-2/+7
Multipath cannot directly provide failover for ioctls in the kernel because it doesn't know what each ioctl means and which result could indicate a path error. Userspace generally knows what the ioctl it issued means and if it might be a path error, but neither does it know which path the ioctl took nor does it necessarily have the privileges to fail a path using the control device. In order to allow userspace to address this situation, implement a DM_MPATH_PROBE_PATHS ioctl that prompts the dm-mpath driver to probe all active paths in the current path group to see whether they still work, and fail them if not. If this returns success, userspace can retry the ioctl and expect that the previously hit bad path is now failed (or working again). The immediate motivation for this is the use of SG_IO in QEMU for SCSI passthrough. Following a failed SG_IO ioctl, QEMU will trigger probing to ensure that all active paths are actually alive, so that retrying SG_IO at least has a lower chance of failing due to a path error. However, the problem is broader than just SG_IO (it affects any ioctl), and if applications need failover support for other ioctls, the same probing can be used. This is not implemented on the DM control device, but on the DM mpath block devices, to allow all users who have access to such a block device to make use of this interface, specifically to implement failover for ioctls. For the same reason, it is also unprivileged. Its implementation is effectively just a bunch of reads, which could already be issued by userspace, just without any guarantee that all the rights paths are selected. The probing implemented here is done fully synchronously path by path; probing all paths concurrently is left as an improvement for the future. Co-developed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-05-03futex: Implement FUTEX2_MPOLPeter Zijlstra1-1/+1
Extend the futex2 interface to be aware of mempolicy. When FUTEX2_MPOL is specified and there is a MPOL_PREFERRED or home_node specified covering the futex address, use that hash-map. Notably, in this case the futex will go to the global node hashtable, even if it is a PRIVATE futex. When FUTEX2_NUMA|FUTEX2_MPOL is specified and the user specified node value is FUTEX_NO_NODE, the MPOL lookup (as described above) will be tried first before reverting to setting node to the local node. [bigeasy: add CONFIG_FUTEX_MPOL, add MPOL to FUTEX2_VALID_MASK, write the node only to user if FUTEX_NO_NODE was supplied] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250416162921.513656-18-bigeasy@linutronix.de
2025-05-03futex: Implement FUTEX2_NUMAPeter Zijlstra1-0/+7
Extend the futex2 interface to be numa aware. When FUTEX2_NUMA is specified for a futex, the user value is extended to two words (of the same size). The first is the user value we all know, the second one will be the node to place this futex on. struct futex_numa_32 { u32 val; u32 node; }; When node is set to ~0, WAIT will set it to the current node_id such that WAKE knows where to find it. If userspace corrupts the node value between WAIT and WAKE, the futex will not be found and no wakeup will happen. When FUTEX2_NUMA is not set, the node is simply an extension of the hash, such that traditional futexes are still interleaved over the nodes. This is done to avoid having to have a separate !numa hash-table. [bigeasy: ensure to have at least hashsize of 4 in futex_init(), add pr_info() for size and allocation information. Cast the naddr math to void*] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250416162921.513656-17-bigeasy@linutronix.de
2025-05-03futex: Allow to make the private hash immutableSebastian Andrzej Siewior1-0/+2
My initial testing showed that: perf bench futex hash reported less operations/sec with private hash. After using the same amount of buckets in the private hash as used by the global hash then the operations/sec were about the same. This changed once the private hash became resizable. This feature added an RCU section and reference counting via atomic inc+dec operation into the hot path. The reference counting can be avoided if the private hash is made immutable. Extend PR_FUTEX_HASH_SET_SLOTS by a fourth argument which denotes if the private should be made immutable. Once set (to true) the a further resize is not allowed (same if set to global hash). Add PR_FUTEX_HASH_GET_IMMUTABLE which returns true if the hash can not be changed. Update "perf bench" suite. For comparison, results of "perf bench futex hash -s": - Xeon CPU E5-2650, 2 NUMA nodes, total 32 CPUs: - Before the introducing task local hash shared Averaged 1.487.148 operations/sec (+- 0,53%), total secs = 10 private Averaged 2.192.405 operations/sec (+- 0,07%), total secs = 10 - With the series shared Averaged 1.326.342 operations/sec (+- 0,41%), total secs = 10 -b128 Averaged 141.394 operations/sec (+- 1,15%), total secs = 10 -Ib128 Averaged 851.490 operations/sec (+- 0,67%), total secs = 10 -b8192 Averaged 131.321 operations/sec (+- 2,13%), total secs = 10 -Ib8192 Averaged 1.923.077 operations/sec (+- 0,61%), total secs = 10 128 is the default allocation of hash buckets. 8192 was the previous amount of allocated hash buckets. - Xeon(R) CPU E7-8890 v3, 4 NUMA nodes, total 144 CPUs: - Before the introducing task local hash shared Averaged 1.810.936 operations/sec (+- 0,26%), total secs = 20 private Averaged 2.505.801 operations/sec (+- 0,05%), total secs = 20 - With the series shared Averaged 1.589.002 operations/sec (+- 0,25%), total secs = 20 -b1024 Averaged 42.410 operations/sec (+- 0,20%), total secs = 20 -Ib1024 Averaged 740.638 operations/sec (+- 1,51%), total secs = 20 -b65536 Averaged 48.811 operations/sec (+- 1,35%), total secs = 20 -Ib65536 Averaged 1.963.165 operations/sec (+- 0,18%), total secs = 20 1024 is the default allocation of hash buckets. 65536 was the previous amount of allocated hash buckets. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://lore.kernel.org/r/20250416162921.513656-16-bigeasy@linutronix.de
2025-05-03futex: Add basic infrastructure for local task local hashSebastian Andrzej Siewior1-0/+5
The futex hash is system wide and shared by all tasks. Each slot is hashed based on futex address and the VMA of the thread. Due to randomized VMAs (and memory allocations) the same logical lock (pointer) can end up in a different hash bucket on each invocation of the application. This in turn means that different applications may share a hash bucket on the first invocation but not on the second and it is not always clear which applications will be involved. This can result in high latency's to acquire the futex_hash_bucket::lock especially if the lock owner is limited to a CPU and can not be effectively PI boosted. Introduce basic infrastructure for process local hash which is shared by all threads of process. This hash will only be used for a PROCESS_PRIVATE FUTEX operation. The hashmap can be allocated via: prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, num); A `num' of 0 means that the global hash is used instead of a private hash. Other values for `num' specify the number of slots for the hash and the number must be power of two, starting with two. The prctl() returns zero on success. This function can only be used before a thread is created. The current status for the private hash can be queried via: num = prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_GET_SLOTS); which return the current number of slots. The value 0 means that the global hash is used. Values greater than 0 indicate the number of slots that are used. A negative number indicates an error. For optimisation, for the private hash jhash2() uses only two arguments the address and the offset. This omits the VMA which is always the same. [peterz: Use 0 for global hash. A bit shuffling and renaming. ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250416162921.513656-13-bigeasy@linutronix.de
2025-05-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-35/+57
Cross-merge networking fixes after downstream PR (net-6.15-rc5). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-01Merge tag 'net-6.15-rc5' of ↵Linus Torvalds1-5/+0
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Happy May Day. Things have calmed down on our end (knock on wood), no outstanding investigations. Including fixes from Bluetooth and WiFi. Current release - fix to a fix: - igc: fix lock order in igc_ptp_reset Current release - new code bugs: - Revert "wifi: iwlwifi: make no_160 more generic", fixes regression to Killer line of devices reported by a number of people - Revert "wifi: iwlwifi: add support for BE213", initial FW is too buggy - number of fixes for mld, the new Intel WiFi subdriver Previous releases - regressions: - wifi: mac80211: restore monitor for outgoing frames - drv: vmxnet3: fix malformed packet sizing in vmxnet3_process_xdp - eth: bnxt_en: fix timestamping FIFO getting out of sync on reset, delivering stale timestamps - use sock_gen_put() in the TCP fraglist GRO heuristic, don't assume every socket is a full socket Previous releases - always broken: - sched: adapt qdiscs for reentrant enqueue cases, fix list corruptions - xsk: fix race condition in AF_XDP generic RX path, shared UMEM can't be protected by a per-socket lock - eth: mtk-star-emac: fix spinlock recursion issues on rx/tx poll - btusb: avoid NULL pointer dereference in skb_dequeue() - dsa: felix: fix broken taprio gate states after clock jump" * tag 'net-6.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits) net: vertexcom: mse102x: Fix RX error handling net: vertexcom: mse102x: Add range check for CMD_RTS net: vertexcom: mse102x: Fix LEN_MASK net: vertexcom: mse102x: Fix possible stuck of SPI interrupt net: hns3: defer calling ptp_clock_register() net: hns3: fixed debugfs tm_qset size net: hns3: fix an interrupt residual problem net: hns3: store rx VLAN tag offload state for VF octeon_ep: Fix host hang issue during device reboot net: fec: ERR007885 Workaround for conventional TX net: lan743x: Fix memleak issue when GSO enabled ptp: ocp: Fix NULL dereference in Adva board SMA sysfs operations net: use sock_gen_put() when sk_state is TCP_TIME_WAIT bnxt_en: fix module unload sequence bnxt_en: Fix ethtool -d byte order for 32-bit values bnxt_en: Fix out-of-bound memcpy() during ethtool -w bnxt_en: Fix coredump logic to free allocated buffer bnxt_en: delay pci_alloc_irq_vectors() in the AER path bnxt_en: call pci_alloc_irq_vectors() after bnxt_reserve_rings() bnxt_en: Add missing skb_mark_for_recycle() in bnxt_rx_vlan() ...
2025-04-30media: uapi: cec-funcs.h: use CEC_LOG_ADDR_BROADCASTHans Verkuil1-20/+20
The cec-funcs.h header sets the destination to 0xf for those messages that can only be broadcast. Instead of writing: msg->msg[0] |= 0xf; /* broadcast */ just write: msg->msg[0] |= CEC_LOG_ADDR_BROADCAST; which is more descriptive and allows us to drop the comment. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2025-04-30Merge tag 'nf-next-25-04-29' of ↵Jakub Kicinski1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following batch contains Netfilter updates for net-next: 1) Replace msecs_to_jiffies() by secs_to_jiffies(), from Easwar Hariharan. 2) Allow to compile xt_cgroup with cgroupsv2 support only, from Michal Koutny. 3) Prepare for sock_cgroup_classid() removal by wrapping it around ifdef, also from Michal Koutny. 4) Remove redundant pointer fetch on conntrack template, from Xuanqiang Luo. 5) Re-format one block in the tproxy documentation for consistency, from Chen Linxuan. 6) Expose set element count and type via netlink attributes, from Florian Westphal. * tag 'nf-next-25-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: export set count and backend name to userspace docs: tproxy: fix formatting for nft code block netfilter: conntrack: Remove redundant NFCT_ALIGN call net: cgroup: Guard users of sock_cgroup_classid() netfilter: xt_cgroup: Make it independent from net_cls netfilter: xt_IDLETIMER: convert timeouts to secs_to_jiffies() ==================== Link: https://patch.msgid.link/20250428221254.3853-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-29netlink: specs: ethtool: Remove UAPI duplication of phy-upstream enumKory Maincent1-5/+0
The phy-upstream enum is already defined in the ethtool.h UAPI header and used by the ethtool userspace tool. However, the ethtool spec does not reference it, causing YNL to auto-generate a duplicate and redundant enum. Fix this by updating the spec to reference the existing UAPI enum in ethtool.h. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20250425171419.947352-1-kory.maincent@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-29netfilter: nf_tables: export set count and backend name to userspaceFlorian Westphal1-0/+4
nf_tables picks a suitable set backend implementation (bitmap, hash, rbtree..) based on the userspace requirements. Figuring out the chosen backend requires information about the set flags and the kernel version. Export this to userspace so nft can include this information in '--debug=netlink' output. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-04-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc4Alexei Starovoitov3-32/+60
Cross-merge bpf and other fixes after downstream PRs. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-28Merge 6.15-rc4 into usb-nextGreg Kroah-Hartman4-33/+63
We need the USB fixes in here as well, and this resolves the following merge conflicts that were reported in linux-next: drivers/usb/chipidea/ci_hdrc_imx.c drivers/usb/host/xhci.h Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-26pidfs: get rid of __pidfd_prepare()Christian Brauner1-1/+1
Fold it into pidfd_prepare() and rename PIDFD_CLONE to PIDFD_STALE to indicate that the passed pid might not have task linkage and no explicit check for that should be performed. Link: https://lore.kernel.org/20250425-work-pidfs-net-v2-3-450a19461e75@kernel.org Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: David Rheinsberg <david@readahead.eu> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-04-25tcp: fastopen: pass TFO child indication through getsockoptJeremy Harris1-0/+1
tcp: fastopen: pass TFO child indication through getsockopt Note that this uses up the last bit of a field in struct tcp_info Signed-off-by: Jeremy Harris <jgh@exim.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20250423124334.4916-3-jgh@exim.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24Merge tag 'landlock-6.15-rc4' of ↵Linus Torvalds1-30/+57
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock fixes from Mickaël Salaün: "Fix some Landlock audit issues, add related tests, and updates documentation" * tag 'landlock-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: landlock: Update log documentation landlock: Fix documentation for landlock_restrict_self(2) landlock: Fix documentation for landlock_create_ruleset(2) selftests/landlock: Add PID tests for audit records selftests/landlock: Factor out audit fixture in audit_test landlock: Log the TGID of the domain creator landlock: Remove incorrect warning
2025-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-3/+6
Cross-merge networking fixes after downstream PR (net-6.15-rc4). This pull includes wireless and a fix to vxlan which isn't in Linus's tree just yet. The latter creates with a silent conflict / build breakage, so merging it now to avoid causing problems. drivers/net/vxlan/vxlan_vnifilter.c 094adad91310 ("vxlan: Use a single lock to protect the FDB table") 087a9eb9e597 ("vxlan: vnifilter: Fix unlocked deletion of default FDB entry") https://lore.kernel.org/20250423145131.513029-1-idosch@nvidia.com No "normal" conflicts, or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23wifi: nl80211: add link id of transmitted profile for MLO MBSSIDRameshkumar Sundaram1-0/+6
During non-transmitted (nontx) profile configuration, interface index of the transmitted (tx) profile is used to retrieve the wireless device (wdev) associated with it. With MLO, this 'wdev' may be part of an MLD with more than one link, hence only interface index is not sufficient anymore to retrieve the correct tx profile. Add a new attribute to configure link id of tx profile. Signed-off-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com> Co-developed-by: Muna Sinada <muna.sinada@oss.qualcomm.com> Signed-off-by: Muna Sinada <muna.sinada@oss.qualcomm.com> Co-developed-by: Aloka Dixit <aloka.dixit@oss.qualcomm.com> Signed-off-by: Aloka Dixit <aloka.dixit@oss.qualcomm.com> Link: https://patch.msgid.link/20250408184501.3715887-2-aloka.dixit@oss.qualcomm.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-04-23Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2-2/+3
Pull virtio fixes from Michael Tsirkin: "A small number of fixes: - virtgpu is exempt from reset shutdown fow now - a more complete fix is in the works - spec compliance fixes in: - virtio-pci cap commands - vhost_scsi_send_bad_target - virtio console resize - missing locking fix in vhost-scsi - virtio ring - a KCSAN false positive fix - VHOST_*_OWNER documentation fix" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost-scsi: Fix vhost_scsi_send_status() vhost-scsi: Fix vhost_scsi_send_bad_target() vhost-scsi: protect vq->log_used with vq->mutex vhost_task: fix vhost_task_create() documentation virtio_console: fix order of fields cols and rows virtio_console: fix missing byte order handling for cols and rows virtgpu: don't reset on shutdown virtio_ring: Fix data race by tagging event_triggered as racy for KCSAN vhost: fix VHOST_*_OWNER documentation virtio_pci: Use self group type for cap commands
2025-04-21ublk: Add UBLK_U_CMD_UPDATE_SIZEOmri Mann1-0/+8
Currently ublk only allows the size of the ublkb block device to be set via UBLK_CMD_SET_PARAMS before UBLK_CMD_START_DEV is triggered. This does not provide support for extendable user-space block devices without having to stop and restart the underlying ublkb block device causing IO interruption. This patch adds a new ublk command UBLK_U_CMD_UPDATE_SIZE to allow the ublk block device to be resized on-the-fly. Feature flag UBLK_F_UPDATE_SIZE is also added to indicate support. Signed-off-by: Omri Mann <omri@nvidia.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/2a370ab1-d85b-409d-b762-f9f3f6bdf705@nvidia.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc3Alexei Starovoitov1-1/+3
Cross-merge bpf and other fixes after downstream PRs. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-21io_uring: add support for IORING_OP_PIPEJens Axboe1-0/+2
This works just like pipe2(2), except it also supports fixed file descriptors. Used in a similar fashion as for other fd instantiating opcodes (like accept, socket, open, etc), where sqe->file_slot is set appropriately if two direct descriptors are desired rather than a set of normal file descriptors. sqe->addr must be set to a pointer to an array of 2 integers, which is where the fixed/normal file descriptors are copied to. sqe->pipe_flags contains flags, same as what is allowed for pipe2(2). Future expansion of per-op private flags can go in sqe->ioprio, like we do for other opcodes that take both a "syscall" flag set and an io_uring opcode specific flag set. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-19PCI: Add lane equalization register offsetsKrishna Chaitanya Chundru1-1/+11
As per PCIe spec 6.0.1, add PCIe lane equalization register offset for data rates 8.0 GT/s, 32.0 GT/s and 64.0 GT/s. Also add a macro for defining data rate 64.0 GT/s physical layer capability ID. Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://patch.msgid.link/20250328-preset_v6-v9-4-22cfa0490518@oss.qualcomm.com
2025-04-18net: add UAPI to the header guard in various network headersJakub Kicinski19-46/+46
fib_rule, ip6_tunnel, and a whole lot of if_* headers lack the customary _UAPI in the header guard. Without it YNL build can't protect from in tree and system headers both getting included. YNL doesn't need most of these but it's annoying to have to fix them one by one. Note that header installation strips this _UAPI prefix so this should result in no change to the end user. Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250416200840.1338195-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17ovpn: introduce the ovpn_socket objectAntonio Quartulli1-0/+1
This specific structure is used in the ovpn kernel module to wrap and carry around a standard kernel socket. ovpn takes ownership of passed sockets and therefore an ovpn specific objects is attached to them for status tracking purposes. Initially only UDP support is introduced. TCP will come in a later patch. Cc: willemdebruijn.kernel@gmail.com Signed-off-by: Antonio Quartulli <antonio@openvpn.net> Link: https://patch.msgid.link/20250415-b4-ovpn-v26-6-577f6097b964@openvpn.net Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-17ovpn: add basic interface creation/destruction/management routinesAntonio Quartulli1-0/+15
Add basic infrastructure for handling ovpn interfaces. Tested-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Antonio Quartulli <antonio@openvpn.net> Link: https://patch.msgid.link/20250415-b4-ovpn-v26-3-577f6097b964@openvpn.net Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-17ovpn: add basic netlink supportAntonio Quartulli1-0/+109
This commit introduces basic netlink support with family registration/unregistration functionalities and stub pre/post-doit. More importantly it introduces the YAML uAPI description along with its auto-generated files: - include/uapi/linux/ovpn.h - drivers/net/ovpn/netlink-gen.c - drivers/net/ovpn/netlink-gen.h Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Antonio Quartulli <antonio@openvpn.net> Link: https://patch.msgid.link/20250415-b4-ovpn-v26-2-577f6097b964@openvpn.net Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-17landlock: Update log documentationMickaël Salaün1-26/+38
Fix and improve documentation related to landlock_restrict_self(2)'s flags. Update the LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF documentation according to the current semantic. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250416154716.1799902-3-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-04-17landlock: Fix documentation for landlock_restrict_self(2)Mickaël Salaün1-25/+36
Fix, deduplicate, and improve rendering of landlock_restrict_self(2)'s flags documentation. The flags are now rendered like the syscall's parameters and description. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250416154716.1799902-2-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-04-17landlock: Fix documentation for landlock_create_ruleset(2)Mickaël Salaün1-5/+9
Move and fix the flags documentation, and improve formatting. It makes more sense and it eases maintenance to document syscall flags in landlock.h, where they are defined. This is already the case for landlock_restrict_self(2)'s flags. The flags are now rendered like the syscall's parameters and description. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250416154716.1799902-1-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-04-15io_uring/zcrx: return ifq id to the userPavel Begunkov1-1/+3
IORING_OP_RECV_ZC requests take a zcrx object id via sqe::zcrx_ifq_idx, which binds it to the corresponding if / queue. However, we don't return that id back to the user. It's fine as currently there can be only one zcrx and the user assumes that its id should be 0, but as we'll need multiple zcrx objects in the future let's explicitly pass it back on registration. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8714667d370651962f7d1a169032e5f02682a73e.1744722517.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-15fuse: add more control over cache invalidation behaviourLuis Henriques1-1/+5
Currently userspace is able to notify the kernel to invalidate the cache for an inode. This means that, if all the inodes in a filesystem need to be invalidated, then userspace needs to iterate through all of them and do this kernel notification separately. This patch adds the concept of 'epoch': each fuse connection will have the current epoch initialized and every new dentry will have it's d_time set to the current epoch value. A new operation will then allow userspace to increment the epoch value. Every time a dentry is d_revalidate()'ed, it's epoch is compared with the current connection epoch and invalidated if it's value is different. Signed-off-by: Luis Henriques <luis@igalia.com> Tested-by: Laura Promberger <laura.promberger@cern.ch> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-04-15rxrpc: Add the security index for yfs-rxgkDavid Howells1-0/+31
Add the security index and abort codes for the YFS variant of rxgk. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20250411095303.2316168-6-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15rxrpc: Allow CHALLENGEs to the passed to the app for a RESPONSEDavid Howells1-12/+34
Allow the app to request that CHALLENGEs be passed to it through an out-of-band queue that allows recvmsg() to pick it up so that the app can add data to it with sendmsg(). This will allow the application (AFS or userspace) to interact with the process if it wants to and put values into user-defined fields. This will be used by AFS when talking to a fileserver to supply that fileserver with a crypto key by which callback RPCs can be encrypted (ie. notifications from the fileserver to the client). Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250411095303.2316168-5-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15net: bridge: Add offload_fail_notification boptJoseph Huang1-0/+1
Add BR_BOOLOPT_MDB_OFFLOAD_FAIL_NOTIFICATION bool option. Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250411150323.1117797-3-Joseph.Huang@garmin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15net: bridge: mcast: Add offload failed mdb flagJoseph Huang1-4/+5
Add MDB_FLAGS_OFFLOAD_FAILED and MDB_PG_FLAGS_OFFLOAD_FAILED to indicate that an attempt to offload the MDB entry to switchdev has failed. Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250411150323.1117797-2-Joseph.Huang@garmin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14vhost: fix VHOST_*_OWNER documentationStefano Garzarella1-2/+2
VHOST_OWNER_SET and VHOST_OWNER_RESET are used in the documentation instead of VHOST_SET_OWNER and VHOST_RESET_OWNER respectively. To avoid confusion, let's use the right names in the documentation. No change to the API, only the documentation is involved. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20250303085237.19990-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-04-14virtio_pci: Use self group type for cap commandsDaniel Jurgens1-0/+1
Section 2.12.1.2 of v1.4 of the VirtIO spec states: The device and driver capabilities commands are currently defined for self group type. 1. VIRTIO_ADMIN_CMD_CAP_ID_LIST_QUERY 2. VIRTIO_ADMIN_CMD_DEVICE_CAP_GET 3. VIRTIO_ADMIN_CMD_DRIVER_CAP_SET Fixes: bfcad518605d ("virtio: Manage device and driver capabilities via the admin commands") Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Message-Id: <20250304161442.90700-1-danielj@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-04-12drm/amdkfd: add smi events for process start and endEric Huang1-0/+5
rocm-smi will be able to show the events for KFD process start/end, it is the implementation of this feature. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11ALSA: Add USB audio device jack typeWesley Cheng1-1/+2
Add an USB jack type, in order to support notifying of a valid USB audio device. Since USB audio devices can have a slew of different configurations that reach beyond the basic headset and headphone use cases, classify these devices differently. Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com> Acked-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250409194804.3773260-8-quic_wcheng@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-11tcp: add LINUX_MIB_PAWS_TW_REJECTED counterJiayuan Chen1-0/+1
When TCP is in TIME_WAIT state, PAWS verification uses LINUX_PAWSESTABREJECTED, which is ambiguous and cannot be distinguished from other PAWS verification processes. We added a new counter, like the existing PAWS_OLD_ACK one. Also we update the doc with previously missing PAWS_OLD_ACK. usage: ''' nstat -az | grep PAWSTimewait TcpExtPAWSTimewait 1 0.0 ''' Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250409112614.16153-3-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>