summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2026-04-10net: remove the netif_get_rx_queue_lease_locked() helpersJakub Kicinski1-5/+0
The netif_get_rx_queue_lease_locked() API hides the locking and the descend onto the leased queue. Making the code harder to follow (at least to me). Remove the API and open code the descend a bit. Most of the code now looks like: if (!leased) return __helper(x); hw_rxq = .. netdev_lock(hw_rxq->dev); ret = __helper(x); netdev_unlock(hw_rxq->dev); return ret; Of course if we have more code paths that need the wrapping we may need to revisit. For now, IMHO, having to know what netif_get_rx_queue_lease_locked() does is not worth the 20LoC it saves. Link: https://patch.msgid.link/20260408151251.72bd2482@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10Merge branch 'netkit-support-for-io_uring-zero-copy-and-af_xdp'Jakub Kicinski6-13/+75
Daniel Borkmann says: ==================== netkit: Support for io_uring zero-copy and AF_XDP Containers use virtual netdevs to route traffic from a physical netdev in the host namespace. They do not have access to the physical netdev in the host and thus can't use memory providers or AF_XDP that require reconfiguring/restarting queues in the physical netdev. This patchset adds the concept of queue leasing to virtual netdevs that allow containers to use memory providers and AF_XDP at native speed. Leased queues are bound to a real queue in a physical netdev and act as a proxy. Memory providers and AF_XDP operations take an ifindex and queue id, so containers would pass in an ifindex for a virtual netdev and a queue id of a leased queue, which then gets proxied to the underlying real queue. We have implemented support for this concept in netkit and tested the latter against Nvidia ConnectX-6 (mlx5) as well as Broadcom BCM957504 (bnxt_en) 100G NICs. For more details see the individual patches. ==================== Link: https://patch.msgid.link/20260402231031.447597-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10netkit: Add netkit notifier to check for unregistering devicesDaniel Borkmann1-0/+2
Add a netdevice notifier in netkit to watch for NETDEV_UNREGISTER events. If the target device is indeed NETREG_UNREGISTERING and previously leased a queue to a netkit device, then collect the related netkit devices and batch-unregister_netdevice_many() them. If this were not done, then the netkit device would hold a reference on the physical device preventing it from going away. However, in case of both io_uring zero-copy as well as AF_XDP this situation is handled gracefully and the allocated resources are torn down. In the case where mentioned infra is used through netkit, the applications have a reference on netkit, and netkit in turn holds a reference on the physical device. In order to have netkit release the reference on the physical device, we need such watcher to then unregister the netkit ones. This is generally quite similar to the dependency handling in case of tunnels (e.g. vxlan bound to a underlying netdev) where the tunnel device gets removed along with the physical device. # ip a [...] 4: enp10s0f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether e8:eb:d3:a3:43:f6 brd ff:ff:ff:ff:ff:ff inet 10.0.0.2/24 scope global enp10s0f0np0 valid_lft forever preferred_lft forever [...] 8: nk@NONE: <BROADCAST,MULTICAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff [...] # rmmod mlx5_ib # rmmod mlx5_core [...] [ 309.261822] mlx5_core 0000:0a:00.0 mlx5_0: Port: 1 Link DOWN [ 344.235236] mlx5_core 0000:0a:00.1: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) [ 344.246948] mlx5_core 0000:0a:00.1: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) [ 344.463754] mlx5_core 0000:0a:00.1: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) [ 344.770155] mlx5_core 0000:0a:00.1: E-Switch: cleanup [...] # ip a [...] [ both enp10s0f0np0 and nk gone ] [...] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: David Wei <dw@davidwei.uk> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260402231031.447597-13-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10netkit: Add single device mode for netkitDaniel Borkmann1-0/+6
Add a single device mode for netkit instead of netkit pairs. The primary target for the paired devices is to connect network namespaces, of course, and support has been implemented in projects like Cilium [0]. For the rxq leasing the plan is to support two main scenarios related to single device mode: * For the use-case of io_uring zero-copy, the control plane can either set up a netkit pair where the peer device can perform rxq leasing which is then tied to the lifetime of the peer device, or the control plane can use a regular netkit pair to connect the hostns to a Pod/container and dynamically add/remove rxq leasing through a single device without having to interrupt the device pair. In the case of io_uring, the memory pool is used as skb non-linear pages, and thus the skb will go its way through the regular stack into netkit. Things like the netkit policy when no BPF is attached or skb scrubbing etc apply as-is in case the paired devices are used, or if the backend memory is tied to the single device and traffic goes through a paired device. * For the use-case of AF_XDP, the control plane needs to use netkit in the single device mode. The single device mode currently enforces only a pass policy when no BPF is attached, and does not yet support BPF link attachments for AF_XDP. skbs sent to that device get dropped at the moment. Given AF_XDP operates at a lower layer of the stack tying this to the netkit pair did not make sense. In future, the plan is to allow BPF at the XDP layer which can: i) process traffic coming from the AF_XDP application (e.g. QEMU with AF_XDP backend) to filter egress traffic or to push selected egress traffic up to the single netkit device to the local stack (e.g. DHCP requests), and ii) vice-versa skbs sent to the single netkit into the AF_XDP application (e.g. DHCP replies). Also, the control-plane can dynamically manage rxq leasing for the single netkit device without having to interrupt (e.g. down/up cycle) the main netkit pair for the Pod which has traffic going in and out. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: David Wei <dw@davidwei.uk> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Jordan Rife <jordan@jrife.io> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://docs.cilium.io/en/stable/operations/performance/tuning/#netkit-device-mode [0] Link: https://patch.msgid.link/20260402231031.447597-11-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10net: Proxy netdev_queue_get_dma_dev for leased queuesDavid Wei1-1/+3
Extend netdev_queue_get_dma_dev to return the physical device of the real rxq for DMA in case the queue was leased. This allows memory providers like io_uring zero-copy or devmem to bind to the physically leased rxq via virtual devices such as netkit. Signed-off-by: David Wei <dw@davidwei.uk> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260402231031.447597-8-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10net: Slightly simplify net_mp_{open,close}_rxqDaniel Borkmann1-6/+2
net_mp_open_rxq is currently not used in the tree as all callers are using __net_mp_open_rxq directly, and net_mp_close_rxq is only used once while all other locations use __net_mp_close_rxq. Consolidate into a single API, netif_mp_{open,close}_rxq, using the netif_ prefix to indicate that the caller is responsible for locking. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: David Wei <dw@davidwei.uk> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260402231031.447597-6-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10net: Add lease info to queue-get responseDaniel Borkmann1-0/+14
Populate nested lease info to the queue-get response that returns the ifindex, queue id with type and optionally netns id if the device resides in a different netns. Example with ynl client when using AF_XDP via queue leasing: # ip a [...] 4: enp10s0f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp/id:24 qdisc mq state UP group default qlen 1000 link/ether e8:eb:d3:a3:43:f6 brd ff:ff:ff:ff:ff:ff inet 10.0.0.2/24 scope global enp10s0f0np0 valid_lft forever preferred_lft forever inet6 fe80::eaeb:d3ff:fea3:43f6/64 scope link proto kernel_ll valid_lft forever preferred_lft forever [...] # ethtool -i enp10s0f0np0 driver: mlx5_core [...] # ynl --family netdev --output-json --do queue-get \ --json '{"ifindex": 4, "id": 15, "type": "rx"}' {'id': 15, 'ifindex': 4, 'lease': {'ifindex': 8, 'netns-id': 0, 'queue': {'id': 1, 'type': 'rx'}}, 'napi-id': 8227, 'type': 'rx', 'xsk': {}} # ip netns list foo (id: 0) # ip netns exec foo ip a [...] 8: nk@NONE: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff inet6 fe80::200:ff:fe00:0/64 scope link proto kernel_ll valid_lft forever preferred_lft forever [...] # ip netns exec foo ethtool -i nk driver: netkit [...] # ip netns exec foo ls /sys/class/net/nk/queues/ rx-0 rx-1 tx-0 # ip netns exec foo ynl --family netdev --output-json --do queue-get \ --json '{"ifindex": 8, "id": 1, "type": "rx"}' {"id": 1, "type": "rx", "ifindex": 8, "xsk": {}} Note that the caller of netdev_nl_queue_fill_one() holds the netdevice lock. For the queue-get we do not lock both devices. When queues get {un,}leased, both devices are locked, thus if __netif_get_rx_queue_lease() returns a lease pointer, it points to a valid device. The netns-id is fetched via peernet2id_alloc() similarly as done in OVS. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: David Wei <dw@davidwei.uk> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260402231031.447597-4-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10net: Implement netdev_nl_queue_create_doitDaniel Borkmann3-6/+37
Implement netdev_nl_queue_create_doit which creates a new rx queue in a virtual netdev and then leases it to a rx queue in a physical netdev. Example with ynl client: # ynl --family netdev --output-json --do queue-create \ --json '{"ifindex": 8, "type": "rx", "lease": {"ifindex": 4, "queue": {"type": "rx", "id": 15}}}' {'id': 1} Note that the netdevice locking order is always from the virtual to the physical device. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: David Wei <dw@davidwei.uk> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260402231031.447597-3-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10net: Add queue-create operationDaniel Borkmann1-0/+11
Add a ynl netdev family operation called queue-create that creates a new queue on a netdevice: name: queue-create attribute-set: queue flags: [admin-perm] do: request: attributes: - ifindex - type - lease reply: &queue-create-op attributes: - id This is a generic operation such that it can be extended for various use cases in future. Right now it is mandatory to specify ifindex, the queue type which is enforced to rx and a lease. The newly created queue id is returned to the caller. A queue from a virtual device can have a lease which refers to another queue from a physical device. This is useful for memory providers and AF_XDP operations which take an ifindex and queue id to allow applications to bind against virtual devices in containers. The lease couples both queues together and allows to proxy the operations from a virtual device in a container to the physical device. In future, the nested lease attribute can be lifted and made optional for other use-cases such as dynamic queue creation for physical netdevs. The lack of lease and the specification of the physical device as an ifindex will imply that we need a real queue to be allocated. Similarly, the queue type enforcement to rx can then be lifted as well to support tx. An early implementation had only driver-specific integration [0], but in order for other virtual devices to reuse, it makes sense to have this as a generic API in core net. For leasing queues, the virtual netdev must have real_num_rx_queues less than num_rx_queues at the time of calling queue-create. The queue-type must be rx as only rx queues are supported for leasing for now. We also enforce that the queue-create ifindex must point to a virtual device, and that the nested lease attribute's ifindex must point to a physical device. The nested lease attribute set contains a netns-id attribute which is optional and can specify a netns-id relative to the caller's netns. It requires cap_net_admin and if the netns-id attribute is not specified, the lease ifindex will be retrieved from the current netns. Also, it is modeled as an s32 type similarly as done elsewhere in the stack. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: David Wei <dw@davidwei.uk> Signed-off-by: David Wei <dw@davidwei.uk> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://bpfconf.ebpf.io/bpfconf2025/bpfconf2025_material/lsfmmbpf_2025_netkit_borkmann.pdf [0] Link: https://patch.msgid.link/20260402231031.447597-2-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10Merge tag 'drm-misc-next-fixes-2026-04-09' of ↵Dave Airlie1-2/+2
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next Short summary of fixes pull: dma-buf: - fence: fix docs for dma_fence_unlock_irqrestore() fb-helper: - unlock in error path gem-shmem: - fix PMD write update gem-vram: - remove obsolete documentation ivpu: - fix device-recovery handling Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260409113921.GA181028@linux.fritz.box
2026-04-10ublk: widen ublk_shmem_buf_reg.len to __u64 for 4GB buffer supportMing Lei1-1/+2
The __u32 len field cannot represent a 4GB buffer (0x100000000 overflows to 0). Change it to __u64 so buffers up to 4GB can be registered. Add a reserved field for alignment and validate it is zero. The kernel enforces a default max of 4GB (UBLK_SHMEM_BUF_SIZE_MAX) which may be increased in future. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Link: https://patch.msgid.link/20260409133020.3780098-2-tom.leiming@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-04-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski20-39/+136
Cross-merge networking fixes after downstream PR (net-7.0-rc8). Conflicts: net/ipv6/seg6_iptunnel.c c3812651b522f ("seg6: separate dst_cache for input and output paths in seg6 lwtunnel") 78723a62b969a ("seg6: add per-route tunnel source address") https://lore.kernel.org/adZhwtOYfo-0ImSa@sirena.org.uk net/ipv4/icmp.c fde29fd934932 ("ipv4: icmp: fix null-ptr-deref in icmp_build_probe()") d98adfbdd5c01 ("ipv4: drop ipv6_stub usage and use direct function calls") https://lore.kernel.org/adO3dccqnr6j-BL9@sirena.org.uk Adjacent changes: drivers/net/ethernet/stmicro/stmmac/chain_mode.c 51f4e090b9f8 ("net: stmmac: fix integer underflow in chain mode") 6b4286e05508 ("net: stmmac: rename STMMAC_GET_ENTRY() -> STMMAC_NEXT_ENTRY()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09Merge branch 'acpi-apei'Rafael J. Wysocki1-0/+11
Merge ACPI APEI updates for 7.1-rc1: - Add devm_ghes_register_vendor_record_notifier(), use it in the PCI hisi driver, and Add NVIDIA vendor CPER record handler (Kai-Heng Feng) * acpi-apei: ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler PCI: hisi: Use devm_ghes_register_vendor_record_notifier() ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier()
2026-04-09Merge branch 'acpi-driver'Rafael J. Wysocki3-3/+6
Merge ACPI core driver core driver updates and assorted driver updates related to ACPI support for 7.1-rc1: - Clean up the ACPI AC and ACPI PAD (processor aggregator device) drivers (Rafael Wysocki) - Rework checking for duplicate video bus devices and consolidate pnp.bus_id workarounds handling in the ACPI video bus driver (Rafael Wysocki) - Update the ACPI core device drivers to stop setting acpi_device_name() unnecessarily (Rafael Wysocki) - Rearrange code using acpi_device_class() in the ACPI core device drivers and update them to stop setting acpi_device_class() unnecessarily (Rafael Wysocki) - Define ACPI_AC_CLASS in one place (Rafael Wysocki) - Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver to bind to platform devices instead of ACPI devices (Rafael Wysocki) * acpi-driver: watchdog: ni903x_wdt: Convert to a platform driver ACPI: PAD: xen: Convert to a platform driver ACPI: AC: Define ACPI_AC_CLASS in one place ACPI: driver: Do not set acpi_device_class() unnecessarily ACPI: driver: Avoid using pnp.device_class for netlink handling ACPI: event: Redefine acpi_notifier_call_chain() ACPI: driver: Do not set acpi_device_name() unnecessarily ACPI: video: Consolidate pnp.bus_id workarounds handling ACPI: video: Rework checking for duplicate video bus devices driver core: auxiliary bus: Introduce dev_is_auxiliary() ACPI: PAD: Rearrange notify handler installation and removal ACPI: AC: Get rid of unnecessary declarations
2026-04-09Merge branch 'acpi-cmos-rtc'Rafael J. Wysocki2-9/+10
Merge updates related to the CMOS RTC driver and x86/ACPI CMOS RTC support for 7.1-rc1: - Add ACPI support to the platform device interface in the CMOS RTC driver, make the ACPI core device enumeration code create a platform device for the CMOS RTC, and drop CMOS RTC PNP device support (Rafael Wysocki) - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD driver and clean up the CMOS RTC ACPI address space handler (Rafael Wysocki) - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT and allow that driver to work without a dedicated IRQ if the ACPI alarm is used (Rafael Wysocki) * acpi-cmos-rtc: rtc: cmos: Do not require IRQ if ACPI alarm is used rtc: cmos: Enable ACPI alarm if advertised in ACPI FADT ACPI: TAD/x86: cmos_rtc: Consolidate address space handler setup rtc: cmos: Drop PNP device support x86: rtc: Drop PNP device check ACPI: PNP: Drop CMOS RTC PNP device support ACPI: x86/rtc-cmos: Use platform device for driver binding ACPI: x86: cmos_rtc: Create a CMOS RTC platform device ACPI: x86: cmos_rtc: Improve coordination with ACPI TAD driver ACPI: x86: cmos_rtc: Clean up address space handler driver
2026-04-09Merge branches 'acpi-processor' and 'acpi-cppc'Rafael J. Wysocki2-1/+23
Merge ACPI processor driver updates and ACPI CPPC library updates for 7.1-rc1: - Address multiple assorted issues and clean up the code in the ACPI processor idle driver (Huisong Li) - Replace strlcat() in the ACPI processor idle drive with a better alternative (Andy Shevchenko) - Rearrange and clean up acpi_processor_errata_piix4() (Rafael Wysocki) - Move reference performance to capabilities and fix an uninitialized variable in the ACPI CPPC library (Pengjie Zhang) - Add support for the Performance Limited Register to the ACPI CPPC library (Sumit Gupta) - Add cppc_get_perf() API to read performance controls, extend cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC library warn on missing mandatory DESIRED_PERF register (Sumit Gupta) - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in target callbacks to allow it to control performance bounds via standard scaling_min_freq and scaling_max_freq sysfs attributes and add sysfs documentation for the Performance Limited Register to it (Sumit Gupta) * acpi-processor: ACPI: processor: idle: Reset cpuidle on C-state list changes cpuidle: Extract and export no-lock variants of cpuidle_unregister_device() ACPI: processor: idle: Fix NULL pointer dereference in hotplug path ACPI: processor: idle: Reset power_setup_done flag on initialization failure ACPI: processor: Rearrange and clean up acpi_processor_errata_piix4() ACPI: processor: idle: Replace strlcat() with better alternative ACPI: processor: idle: Remove redundant static variable and rename cstate check function ACPI: processor: idle: Move max_cstate update out of the loop ACPI: processor: idle: Remove redundant cstate check in acpi_processor_power_init ACPI: processor: idle: Add missing bounds check in flatten_lpi_states() * acpi-cppc: ACPI: CPPC: Check cpc_read() return values consistently ACPI: CPPC: Fix uninitialized ref variable in cppc_get_perf_caps() ACPI: CPPC: Move reference performance to capabilities cpufreq: CPPC: Add sysfs documentation for perf_limited ACPI: CPPC: add APIs and sysfs interface for perf_limited cpufreq: cppc: Update MIN_PERF/MAX_PERF in target callbacks cpufreq: CPPC: Update cached perf_ctrls on sysfs write ACPI: CPPC: Extend cppc_set_epp_perf() for FFH/SystemMemory ACPI: CPPC: Warn on missing mandatory DESIRED_PERF register ACPI: CPPC: Add cppc_get_perf() API to read performance controls
2026-04-09ASoC: Yet another round of SDCA fixesMark Brown1-0/+5
Charles Keepax <ckeepax@opensource.cirrus.com> says: Another round of SDCA fixes a couple of fix to the IRQ cleanup from Richard, and a minor tweak to the IRQ handling from me.
2026-04-09Merge tag 'sound-7.0' of ↵Linus Torvalds1-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Still a bit higher amount than wished, but nothing looks really scary, and all changes are about nice and smooth device-specific fixes. - HD-audio quirks, one revert for a regression and another oneliner - AMD ACP quirks - Fixes for SDCA interrupt handling - A few Intel SOF, avs and NVL fixes - Fixes for TAS2552 DT, NAU8325, and STM32" * tag 'sound-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ASoC: amd: acp: update DMI quirk and add ACP DMIC for Lenovo platforms ASoC: SDCA: Unregister IRQ handlers on module remove ASoC: SDCA: mask Function_Status value ASoC: SDCA: Fix overwritten var within for loop ASoC: stm32_sai: fix incorrect BCLK polarity for DSP_A/B, LEFT_J ASoC: SOF: Intel: hda: modify period size constraints for ACE4 ALSA: hda/intel: enforce stricter period-size alignment for Intel NVL ASoC: nau8325: Add software reset during probe Revert "ALSA: hda/realtek: Add quirk for Gigabyte Technology to fix headphone" ASoC: Intel: avs: Fix memory leak in avs_register_i2s_test_boards() ASoC: SOF: Intel: fix iteration in is_endpoint_present() ASoC: SOF: Intel: Fix endpoint index if endpoints are missing ASoC: SDCA: Fix errors in IRQ cleanup ASoC: amd: acp: add Lenovo P16s G5 AMD quirk for legacy SDW machine ASoC: dt-bindings: ti,tas2552: Add sound-dai-cells ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14IAH10
2026-04-09Merge tag 'pmdomain-v7.0-rc6' of ↵Linus Torvalds1-74/+0
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain fixes from Ulf Hansson: - imx: Prevent hang at power down for imx8mp-blk-ctrl - thead: Fix buffer overflow for TH1520 AON driver - Change Ulf Hansson's email * tag 'pmdomain-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: MAINTAINERS, mailmap: Change Ulf Hansson's email pmdomain: imx8mp-blk-ctrl: Keep the NOC_HDCP clock enabled firmware: thead: Fix buffer overflow and use standard endian macros
2026-04-09bitops: Update kernel-doc for sign_extendXX()Andy Shevchenko1-2/+8
The sign_extendXX() lack of Return section and have other style issues. Address that by updating kernel-doc accordingly. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Yury Norov <ynorov@nvidia.com>
2026-04-09Merge tag 'net-7.0-rc8' of ↵Linus Torvalds6-5/+28
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter, IPsec and wireless. This is again considerably bigger than the old average. No known outstanding regressions. Current release - regressions: - net: increase IP_TUNNEL_RECURSION_LIMIT to 5 - eth: ice: fix PTP timestamping broken by SyncE code on E825C Current release - new code bugs: - eth: stmmac: dwmac-motorcomm: fix eFUSE MAC address read failure Previous releases - regressions: - core: fix cross-cache free of KFENCE-allocated skb head - sched: act_csum: validate nested VLAN headers - rxrpc: fix call removal to use RCU safe deletion - xfrm: - wait for RCU readers during policy netns exit - fix refcount leak in xfrm_migrate_policy_find - wifi: rt2x00usb: fix devres lifetime - mptcp: fix slab-use-after-free in __inet_lookup_established - ipvs: fix NULL deref in ip_vs_add_service error path - eth: - airoha: fix memory leak in airoha_qdma_rx_process() - lan966x: fix use-after-free and leak in lan966x_fdma_reload() Previous releases - always broken: - ipv6: ioam: fix potential NULL dereferences in __ioam6_fill_trace_data() - ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump - bridge: guard local VLAN-0 FDB helpers against NULL vlan group - xsk: tailroom reservation and MTU validation - rxrpc: - fix to request an ack if window is limited - fix RESPONSE authenticator parser OOB read - netfilter: nft_ct: fix use-after-free in timeout object destroy - batman-adv: hold claim backbone gateways by reference - eth: - stmmac: fix PTP ref clock for Tegra234 - idpf: fix PREEMPT_RT raw/bh spinlock nesting for async VC handling - ipa: fix GENERIC_CMD register field masks for IPA v5.0+" * tag 'net-7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (104 commits) net: lan966x: fix use-after-free and leak in lan966x_fdma_reload() net: lan966x: fix page pool leak in error paths net: lan966x: fix page_pool error handling in lan966x_fdma_rx_alloc_page_pool() nfc: pn533: allocate rx skb before consuming bytes l2tp: Drop large packets with UDP encap net: ipa: fix event ring index not programmed for IPA v5.0+ net: ipa: fix GENERIC_CMD register field masks for IPA v5.0+ MAINTAINERS: Add Prashanth as additional maintainer for amd-xgbe driver devlink: Fix incorrect skb socket family dumping af_unix: read UNIX_DIAG_VFS data under unix_state_lock Revert "mptcp: add needs_id for netlink appending addr" mptcp: fix slab-use-after-free in __inet_lookup_established net: txgbe: leave space for null terminators on property_entry net: ioam6: fix OOB and missing lock rxrpc: proc: size address buffers for %pISpc output rxrpc: only handle RESPONSE during service challenge rxrpc: Fix buffer overread in rxgk_do_verify_authenticator() rxrpc: Fix leak of rxgk context in rxgk_verify_response() rxrpc: Fix integer overflow in rxgk_verify_response() rxrpc: Fix missing error checks for rxkad encryption/decryption failure ...
2026-04-09memblock: Permit existing reserved regions to be marked RSRV_KERNArd Biesheuvel1-0/+1
Permit existing memblock reservations to be marked as RSRV_KERN. This will be used by the EFI code on x86 to distinguish between reservations of boot services data regions that have actual significance to the kernel and regions that are reserved temporarily to work around buggy firmware. Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2026-04-09jbd2: store jinode dirty range in PAGE_SIZE unitsLi Chen1-12/+22
jbd2_inode fields are updated under journal->j_list_lock, but some paths read them without holding the lock (e.g. fast commit helpers and ordered truncate helpers). READ_ONCE() alone is not sufficient for the dirty range fields when they are stored as loff_t because 32-bit platforms can observe torn loads. Store the dirty range in PAGE_SIZE units as pgoff_t instead. Represent the dirty range end as an exclusive end page. This avoids a special sentinel value and keeps MAX_LFS_FILESIZE on 32-bit representable. Publish a new dirty range by updating end_page before start_page, and treat start_page >= end_page as empty in the accessor for robustness. Use READ_ONCE() on the read side and WRITE_ONCE() on the write side for the dirty range and i_flags to match the existing lockless access pattern. Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Li Chen <me@linux.beauty> Link: https://patch.msgid.link/20260306085643.465275-5-me@linux.beauty Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2026-04-09jbd2: add jinode dirty range accessorsLi Chen1-0/+14
Provide a helper to fetch jinode dirty ranges in bytes. This lets filesystem callbacks avoid depending on the internal representation, preparing for a later conversion to page units. Suggested-by: Andreas Dilger <adilger@dilger.ca> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Li Chen <me@linux.beauty> Link: https://patch.msgid.link/20260306085643.465275-2-me@linux.beauty Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2026-04-09RDMA/mana_ib: Support memory windowsKonstantin Taranov1-0/+5
Implement .alloc_mw() and .dealloc_mw() for mana device. This is just the basic infrastructure, MW is not practically usable until additional kernel support for allowing user space to submit MW work requests is completed. Link: https://patch.msgid.link/r/20260331090851.2276205-1-kotaranov@linux.microsoft.com Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2026-04-09kernfs: pass struct ns_common instead of const void * for namespace tagsChristian Brauner7-45/+54
kernfs has historically used const void * to pass around namespace tags used for directory-level namespace filtering. The only current user of this is sysfs network namespace tagging where struct net pointers are cast to void *. Replace all const void * namespace parameters with const struct ns_common * throughout the kernfs, sysfs, and kobject namespace layers. This includes the kobj_ns_type_operations callbacks, kobject_namespace(), and all sysfs/kernfs APIs that accept or return namespace tags. Passing struct ns_common is needed because various codepaths require access to the underlying namespace. A struct ns_common can always be converted back to the concrete namespace type (e.g., struct net) via container_of() or to_ns_common() in the reverse direction. This is a preparatory change for switching to ns_id-based directory iteration to prevent a KASLR pointer leak through the current use of raw namespace pointers as hash seeds and comparison keys. Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-04-09Merge branches 'fixes', 'arm/smmu/updates', 'arm/smmu/bindings', 'riscv', ↵Will Deacon4-12/+99
'intel/vt-d', 'amd/amd-vi' and 'core' into next
2026-04-09LoongArch: KVM: Add DMSINTC device supportSong Gao1-0/+2
Add device model for DMSINTC interrupt controller, implement basic create/destroy/set_attr interfaces, and register device model to kvm device table. Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Song Gao <gaosong@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2026-04-09ALSA: hda: Add a simple GPIO setup helper functionTakashi Iwai1-0/+4
Introduce a common GPIO setup helper function, so that we can clean up the open code found in many codec drivers later. Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20260409093826.1317626-3-tiwai@suse.de
2026-04-09ALSA: hda: Add sync version of snd_hda_codec_write()Takashi Iwai1-0/+11
We used snd_hda_codec_read() for the verb write when a synchronization is needed after the write, e.g. for the power state toggle or such cases. It works in principle, but it looks rather confusing and too hackish. For improving the code readability, introduce a new helper function, snd_hda_codec_write_sync(), which is another variant of snd_hda_codec_write(), and replace the existing snd_hda_codec_read() calls with this one. No behavior change but just the code refactoring. Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20260409093826.1317626-2-tiwai@suse.de
2026-04-09net/mlx5: Add icm_mng_function_id_mode cap bitMoshe Shemesh1-1/+7
Introduce the capability bit icm_mng_function_id_mode to indicate that the device firmware uses vhca_id instead of function_id as the effective identifier for the firmware commands MANAGE_PAGES, QUERY_PAGES, and page request event. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Akiva Goldberger <agoldberger@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260403090028.137783-3-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-04-09net/mlx5: Rename MLX5_PF page counter type to MLX5_SELFMoshe Shemesh1-1/+1
The MLX5_PF enum value in mlx5_func_type is used to track firmware page allocations for the page manager function itself, which is either the ECPF on SmartNIC systems or the host PF when there is no ECPF. Rename it to MLX5_SELF to accurately reflect that this counter tracks pages allocated by the manager for its own use, regardless of whether it is a PF or ECPF. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260403090028.137783-2-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-04-09Merge tag 'coresight-next-v7.1' of ↵Greg Kroah-Hartman1-10/+4
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/coresight/linux into char-misc-next Suzuki writes: coresight: Updates for Linux v7.1 CoreSight self hosted tracing subsystem updates for Linux v7.1, includes: - Fix unregistration related issues - Clean up CTI power management and sysfs code - Miscellaneous fixes - MAINTAINERS: Add Leo Yan as Reviewer - MAINTAINERS: Update Mike's email address Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> * tag 'coresight-next-v7.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/coresight/linux: (25 commits) coresight: tpdm: fix invalid MMIO access issue coresight: tpdm: add traceid_show for checking traceid coresight: platform: check the availability of the endpoint before parse coresight: cti: fix the check condition in inout_sel_store MAINTAINERS: coresight: Add Leo Yan as Reviewer coresight: cti: Properly handle negative offsets in cti_reg32_{show|store}() coresight: cti: Remove hw_enabled flag coresight: cti: Remove hw_powered flag coresight: cti: Rename cti_active() to cti_is_active() coresight: cti: Remove CPU power management code coresight: cti: Access ASICCTL only when implemented coresight: cti: Fix register reads coresight: cti: Make spinlock usage consistent drivers/hwtracing/coresight: remove unneeded variable in tmc_crashdata_release() MAINTAINERS: Change e-mail address for reviewer coresight: ctcu: fix the spin_bug coresight: Unify bus unregistration via coresight_unregister() coresight: Do not mix success path with failure handling coresight: Move sink validation into etm_perf_add_symlink_sink() coresight: Refactor sysfs connection group cleanup ...
2026-04-09Merge branch 'for-linus' into for-nextTakashi Iwai38-76/+257
Pull 7.0-devel branch for further development of HD-audio codec quirks. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-04-09devlink: Add resource scope filtering to resource dumpOr Har-Toov1-0/+11
Allow filtering the resource dump to device-level or port-level resources using the 'scope' option. Example - dump only device-level resources: $ devlink resource show scope dev pci/0000:03:00.0: name max_local_SFs size 128 unit entry dpipe_tables none name max_external_SFs size 128 unit entry dpipe_tables none pci/0000:03:00.1: name max_local_SFs size 128 unit entry dpipe_tables none name max_external_SFs size 128 unit entry dpipe_tables none Example - dump only port-level resources: $ devlink resource show scope port pci/0000:03:00.0/196608: name max_SFs size 128 unit entry dpipe_tables none pci/0000:03:00.0/196609: name max_SFs size 128 unit entry dpipe_tables none pci/0000:03:00.1/196708: name max_SFs size 128 unit entry dpipe_tables none pci/0000:03:00.1/196709: name max_SFs size 128 unit entry dpipe_tables none Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260407194107.148063-11-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09devlink: Add port-level resource registration infrastructureOr Har-Toov1-0/+8
The current devlink resource infrastructure supports only device-level resources. Some hardware resources are associated with specific ports rather than the entire device, and today we have no way to show resource per-port. Add support for registering resources at the port level. Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260407194107.148063-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09devlink: Refactor resource functions to be genericOr Har-Toov1-1/+1
Currently the resource functions take devlink pointer as parameter and take the resource list from there. Allow resource functions to work with other resource lists that will be added in next patches and not only with the devlink's resource list. Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260407194107.148063-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09net: dsa: eliminate <linux/dsa/loop.h>Vladimir Oltean1-42/+0
There is no reason at all to export these data types to the global include directory. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260406212158.721806-5-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09net: dsa: remove unused platform_data definitionsVladimir Oltean1-3/+0
Pretty self-explanatory, nobody needs these. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260406212158.721806-4-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09net: dsa: clean up struct dsa_chip_dataVladimir Oltean1-20/+0
This has accumulated some fields which are no longer parsed by the core or set by any driver. Remove them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260406212158.721806-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09net: dsa: remove struct platform_dataVladimir Oltean1-17/+0
This is not used anywhere in the kernel. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260406212158.721806-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09mptcp: better mptcp-level RTT estimatorPaolo Abeni1-1/+1
The current MPTCP-level RTT estimator has several issues. On high speed links, the MPTCP-level receive buffer auto-tuning happens with a frequency well above the TCP-level's one. That in turn can cause excessive/unneeded receive buffer increase. On such links, the initial rtt_us value is considerably higher than the actual delay, and the current mptcp_rcv_space_adjust() updates msk->rcvq_space.rtt_us with a period equal to the such field previous value. If the initial rtt_us is 40ms, its first update will happen after 40ms, even if the subflows see actual RTT orders of magnitude lower. Additionally: - setting the msk RTT to the maximum among all the subflows RTTs makes DRS constantly overshooting the rcvbuf size when a subflow has considerable higher latency than the other(s). - during unidirectional bulk transfers with multiple active subflows, the TCP-level RTT estimator occasionally sees considerably higher value than the real link delay, i.e. when the packet scheduler reacts to an incoming ACK on given subflow pushing data on a different subflow. - currently inactive but still open subflows (i.e. switched to backup mode) are always considered when computing the msk-level RTT. Address the all the issues above with a more accurate RTT estimation strategy: the MPTCP-level RTT is set to the minimum of all the subflows actually feeding data into the MPTCP receive buffer, using a small sliding window. While at it, also use EWMA to compute the msk-level scaling_ratio, to that MPTCP can avoid traversing the subflow list is mptcp_rcv_space_adjust(). Use some care to avoid updating msk and ssk level fields too often. Fixes: a6b118febbab ("mptcp: add receive buffer auto-tuning") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260407-net-next-mptcp-reduce-rbuf-v2-1-0d1d135bf6f6@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09net: dropreason: add MACVLAN_BROADCAST_BACKLOG and IPVLAN_MULTICAST_BACKLOGEric Dumazet1-0/+12
ipvlan and macvlan use queues to process broadcast/multicast packets from a work queue. Under attack these queues can drop packets. Add MACVLAN_BROADCAST_BACKLOG drop_reason for macvlan broadcast queue. Add IPVLAN_MULTICAST_BACKLOG drop_reason for ipvlan multicast queue. Use different reasons as some deployments use both ipvlan and macvlan. Also change ipvlan_rcv_frame() to use SKB_DROP_REASON_DEV_READY when the device is not UP. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260407150710.1640747-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09codel: annotate data-races in codel_dump_stats()Eric Dumazet1-21/+24
codel_dump_stats() only runs with RTNL held, reading fields that can be changed in qdisc fast path. Add READ_ONCE()/WRITE_ONCE() annotations. Alternative would be to acquire the qdisc spinlock, but our long-term goal is to make qdisc dump operations lockless as much as we can. tc_codel_xstats fields don't need to be latched atomically, otherwise this bug would have been caught earlier. No change in kernel size: $ scripts/bloat-o-meter -t vmlinux.0 vmlinux add/remove: 0/0 grow/shrink: 1/1 up/down: 3/-1 (2) Function old new delta codel_qdisc_dequeue 2462 2465 +3 codel_dump_stats 250 249 -1 Total: Before=29739919, After=29739921, chg +0.00% Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260407143053.1570620-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09bonding: remove unused bond_is_first_slave and bond_is_last_slave macrosXiang Mei1-3/+0
Since commit 2884bf72fb8f ("net: bonding: fix use-after-free in bond_xmit_broadcast()"), bond_is_last_slave() was only used in bond_xmit_broadcast(). After the recent fix replaced that usage with a simple index comparison, bond_is_last_slave() has no remaining callers. bond_is_first_slave() likewise has no callers. Remove both unused macros. Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260404220412.444753-1-xmei5@asu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09dt-bindings: clock: qcom: Add Nord Global Clock ControllerTaniya Das4-0/+438
Add device tree bindings for the global clock controller on Qualcomm Nord platform. The global clock controller on Nord SoC is divided into multiple clock controllers (GCC,SE_GCC,NE_GCC and NW_GCC). Add each of the bindings to define the clock controllers. Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260403-nord-clks-v1-3-018af14979fd@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-04-09dt-bindings: clock: qcom: Document the Nord SoC TCSR Clock ControllerTaniya Das1-0/+26
The Nord SoC TCSR block provides CLKREF clocks for DP, PCIe, UFS, SGMII and USB. Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com> [Shawn: Use compatible qcom,nord-tcsrcc rather than qcom,nord-tcsr] Signed-off-by: Shawn Guo <shengchao.guo@oss.qualcomm.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260403-nord-clks-v1-1-018af14979fd@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-04-09scsi: libsas: Delete unused to_dom_device() and to_dev_attr()Thomas Weißschuh1-4/+0
These macros are unused and to_dev_attr() will conflict with an upcoming centralization of general attribute macros. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20260408-libsas-cleanup-v1-1-826325bbc0ba@weissschuh.net Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2026-04-09Merge tag 'nf-26-04-08' of ↵Jakub Kicinski2-1/+1
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Florian Westphal says: ==================== netfilter updates for net I only included crash fixes, as we're closer to a release, rest will be handled via -next. 1) Fix a NULL pointer dereference in ip_vs_add_service error path, from Weiming Shi, bug added in 6.2 development cycle. 2) Don't leak kernel data bytes from allocator to userspace: nfnetlink_log needs to init the trailing NLMSG_DONE terminator. From Xiang Mei. 3) xt_multiport match lacks range validation, bogus userspace request will cause out-of-bounds read. From Ren Wei. 4) ip6t_eui64 match must reject packets with invalid mac header before calling eth_hdr. Make existing check unconditional. From Zhengchuan Liang. 5) nft_ct timeout policies are free'd via kfree() while they may still be reachable by other cpus that process a conntrack object that uses such a timeout policy. Existing reaping of entries is not sufficient because it doesn't wait for a grace period. Use kfree_rcu(). From Tuan Do. 6/7) Make nfnetlink_queue hash table per queue. As-is we can hit a page fault in case underlying page of removed element was free'd. Per-queue hash prevents parallel lookups. This comes with a test case that demonstrates the bug, from Fernando Fernandez Mancera. * tag 'nf-26-04-08' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: selftests: nft_queue.sh: add a parallel stress test netfilter: nfnetlink_queue: make hash table per queue netfilter: nft_ct: fix use-after-free in timeout object destroy netfilter: ip6t_eui64: reject invalid MAC header for all packets netfilter: xt_multiport: validate range encoding in checkentry netfilter: nfnetlink_log: initialize nfgenmsg in NLMSG_DONE terminator ipvs: fix NULL deref in ip_vs_add_service error path ==================== Link: https://patch.msgid.link/20260408163512.30537-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09rxrpc: Fix to request an ack if window is limitedMarc Dionne1-0/+1
Peers may only send immediate acks for every 2 UDP packets received. When sending a jumbogram, it is important to check that there is sufficient window space to send another same sized jumbogram following the current one, and request an ack if there isn't. Failure to do so may cause the call to stall waiting for an ack until the resend timer fires. Where jumbograms are in use this causes a very significant drop in performance. Fixes: fe24a5494390 ("rxrpc: Send jumbo DATA packets") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-10-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>