summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2026-01-27dt-bindings: net: dsa: lantiq,gswip: add Intel GSW150Daniel Golle1-0/+2
Add compatible strings for the Intel GSW150 which is apparently identical or at least compatible with the Lantiq PEB7084 Ethernet switch IC. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/1dc62de5263e8536d5960b837bc5dad7b8f42fad.1769099517.git.daniel@makrotopia.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27dt-bindings: net: dsa: lantiq,gswip: use correct node nameDaniel Golle1-2/+2
Ethernet PHYs should use nodes named 'ethernet-phy@'. Rename the Ethernet PHY nodes in the example to comply. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/94f439aa17d7b51fb367877df4fb84c8c07c7ce4.1769099517.git.daniel@makrotopia.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27Merge branch 'vsock-add-namespace-support-to-vhost-vsock-and-loopback'Paolo Abeni14-140/+1531
Bobby Eshleman says: ==================== vsock: add namespace support to vhost-vsock and loopback This series adds namespace support to vhost-vsock and loopback. It does not add namespaces to any of the other guest transports (virtio-vsock, hyperv, or vmci). The current revision supports two modes: local and global. Local mode is complete isolation of namespaces, while global mode is complete sharing between namespaces of CIDs (the original behavior). The mode is set using the parent namespace's /proc/sys/net/vsock/child_ns_mode and inherited when a new namespace is created. The mode of the current namespace can be queried by reading /proc/sys/net/vsock/ns_mode. The mode can not change after the namespace has been created. Modes are per-netns. This allows a system to configure namespaces independently (some may share CIDs, others are completely isolated). This also supports future possible mixed use cases, where there may be namespaces in global mode spinning up VMs while there are mixed mode namespaces that provide services to the VMs, but are not allowed to allocate from the global CID pool (this mode is not implemented in this series). Additionally, added tests for the new namespace features: tools/testing/selftests/vsock/vmtest.sh 1..25 ok 1 vm_server_host_client ok 2 vm_client_host_server ok 3 vm_loopback ok 4 ns_host_vsock_ns_mode_ok ok 5 ns_host_vsock_child_ns_mode_ok ok 6 ns_global_same_cid_fails ok 7 ns_local_same_cid_ok ok 8 ns_global_local_same_cid_ok ok 9 ns_local_global_same_cid_ok ok 10 ns_diff_global_host_connect_to_global_vm_ok ok 11 ns_diff_global_host_connect_to_local_vm_fails ok 12 ns_diff_global_vm_connect_to_global_host_ok ok 13 ns_diff_global_vm_connect_to_local_host_fails ok 14 ns_diff_local_host_connect_to_local_vm_fails ok 15 ns_diff_local_vm_connect_to_local_host_fails ok 16 ns_diff_global_to_local_loopback_local_fails ok 17 ns_diff_local_to_global_loopback_fails ok 18 ns_diff_local_to_local_loopback_fails ok 19 ns_diff_global_to_global_loopback_ok ok 20 ns_same_local_loopback_ok ok 21 ns_same_local_host_connect_to_local_vm_ok ok 22 ns_same_local_vm_connect_to_local_host_ok ok 23 ns_delete_vm_ok ok 24 ns_delete_host_ok ok 25 ns_delete_both_ok SUMMARY: PASS=25 SKIP=0 FAIL=0 Thanks again for everyone's help and reviews! Suggested-by: Sargun Dhillon <sargun@sargun.me> Signed-off-by: Bobby Eshleman <bobbyeshleman@gmail.com> v15: https://lore.kernel.org/r/20260116-vsock-vmtest-v15-0-bbfd1a668548@meta.com v14: https://lore.kernel.org/r/20260112-vsock-vmtest-v14-0-a5c332db3e2b@meta.com v13: https://lore.kernel.org/all/20251223-vsock-vmtest-v13-0-9d6db8e7c80b@meta.com/ v12: https://lore.kernel.org/r/20251126-vsock-vmtest-v12-0-257ee21cd5de@meta.com v11: https://lore.kernel.org/r/20251120-vsock-vmtest-v11-0-55cbc80249a7@meta.com v10: https://lore.kernel.org/r/20251117-vsock-vmtest-v10-0-df08f165bf3e@meta.com v9: https://lore.kernel.org/all/20251111-vsock-vmtest-v9-0-852787a37bed@meta.com v8: https://lore.kernel.org/r/20251023-vsock-vmtest-v8-0-dea984d02bb0@meta.com v7: https://lore.kernel.org/r/20251021-vsock-vmtest-v7-0-0661b7b6f081@meta.com v6: https://lore.kernel.org/r/20250916-vsock-vmtest-v6-0-064d2eb0c89d@meta.com v5: https://lore.kernel.org/r/20250827-vsock-vmtest-v5-0-0ba580bede5b@meta.com v4: https://lore.kernel.org/r/20250805-vsock-vmtest-v4-0-059ec51ab111@meta.com v2: https://lore.kernel.org/kvm/20250312-vsock-netns-v2-0-84bffa1aa97a@gmail.com v1: https://lore.kernel.org/r/20200116172428.311437-1-sgarzare@redhat.com ==================== Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-0-2859a7512097@meta.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27selftests/vsock: add tests for namespace deletionBobby Eshleman1-0/+84
Add tests that validate vsock sockets are resilient to deleting namespaces. The vsock sockets should still function normally. The function check_ns_delete_doesnt_break_connection() is added to re-use the step-by-step logic of 1) setup connections, 2) delete ns, 3) check that the connections are still ok. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-12-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27selftests/vsock: add tests for host <-> vm connectivity with namespacesBobby Eshleman1-4/+568
Add tests to validate namespace correctness using vsock_test and socat. The vsock_test tool is used to validate expected success tests, but socat is used for expected failure tests. socat is used to ensure that connections are rejected outright instead of failing due to some other socket behavior (as tested in vsock_test). Additionally, socat is already required for tunneling TCP traffic from vsock_test. Using only one of the vsock_test tests like 'test_stream_client_close_client' would have yielded a similar result, but doing so wouldn't remove the socat dependency. Additionally, check for the dependency socat. socat needs special handling beyond just checking if it is on the path because it must be compiled with support for both vsock and unix. The function check_socat() checks that this support exists. Add more padding to test name printf strings because the tests added in this patch would otherwise overflow. Add vm_dmesg_* helpers to encapsulate checking dmesg for oops and warnings. Add ability to pass extra args to host-side vsock_test so that tests that cause false positives may be skipped with arg --skip. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-11-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27selftests/vsock: add namespace tests for CID collisionsBobby Eshleman1-0/+78
Add tests to verify CID collision rules across different vsock namespace modes. 1. Two VMs with the same CID cannot start in different global namespaces (ns_global_same_cid_fails) 2. Two VMs with the same CID can start in different local namespaces (ns_local_same_cid_ok) 3. VMs with the same CID can coexist when one is in a global namespace and another is in a local namespace (ns_global_local_same_cid_ok and ns_local_global_same_cid_ok) The tests ns_global_local_same_cid_ok and ns_local_global_same_cid_ok make sure that ordering does not matter. The tests use a shared helper function namespaces_can_boot_same_cid() that attempts to start two VMs with identical CIDs in the specified namespaces and verifies whether VM initialization failed or succeeded. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-10-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27selftests/vsock: add tests for proc sys vsock ns_modeBobby Eshleman1-2/+138
Add tests for the /proc/sys/net/vsock/{ns_mode,child_ns_mode} interfaces. Namely, that they accept/report "global" and "local" strings and enforce their access policies. Start a convention of commenting the test name over the test description. Add test name comments over test descriptions that existed before this convention. Add a check_netns() function that checks if the test requires namespaces and if the current kernel supports namespaces. Skip tests that require namespaces if the system does not have namespace support. This patch is the first to add tests that do *not* re-use the same shared VM. For that reason, it adds a run_ns_tests() function to run these tests and filter out the shared VM tests. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-9-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27selftests/vsock: use ss to wait for listeners instead of /proc/netBobby Eshleman1-17/+30
Replace /proc/net parsing with ss(8) for detecting listening sockets in wait_for_listener() functions and add support for TCP, VSOCK, and Unix socket protocols. The previous implementation parsed /proc/net/tcp using awk to detect listening sockets, but this approach could not support vsock because vsock does not export socket information to /proc/net/. Instead, use ss so that we can detect listeners on tcp, vsock, and unix. The protocol parameter is now required for all wait_for_listener family functions (wait_for_listener, vm_wait_for_listener, host_wait_for_listener) to explicitly specify which socket type to wait for. ss is added to the dependency check in check_deps(). Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-8-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27selftests/vsock: add vm_dmesg_{warn,oops}_count() helpersBobby Eshleman1-4/+15
These functions are reused by the VM tests to collect and compare dmesg warnings and oops counts. The future VM-specific tests use them heavily. This patches relies on vm_ssh() already supporting namespaces. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-7-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27selftests/vsock: prepare vm management helpers for namespacesBobby Eshleman1-32/+69
Add namespace support to vm management, ssh helpers, and vsock_test wrapper functions. This enables running VMs and test helpers in specific namespaces, which is required for upcoming namespace isolation tests. The functions still work correctly within the init ns, though the caller must now pass "init_ns" explicitly. No functional changes for existing tests. All have been updated to pass "init_ns" explicitly. Affected functions (such as vm_start() and vm_ssh()) now wrap their commands with 'ip netns exec' when executing commands in non-init namespaces. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-6-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27selftests/vsock: add namespace helpers to vmtest.shBobby Eshleman1-0/+32
Add functions for initializing namespaces with the different vsock NS modes. Callers can use add_namespaces() and del_namespaces() to create namespaces global0, global1, local0, and local1. The add_namespaces() function initializes global0, local0, etc... with their respective vsock NS mode by toggling child_ns_mode before creating the namespace. Remove namespaces upon exiting the program in cleanup(). This is unlikely to be needed for a healthy run, but it is useful for tests that are manually killed mid-test. This patch is in preparation for later namespace tests. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-5-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27selftests/vsock: increase timeout to 1200Bobby Eshleman1-1/+1
Increase the timeout from 300s to 1200s. On a modern bare metal server my last run showed the new set of tests taking ~400s. Multiply by an (arbitrary) factor of three to account for slower/nested runners. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-4-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27vsock: add netns support to virtio transportsBobby Eshleman5-40/+84
Add netns support to loopback and vhost. Keep netns disabled for virtio-vsock, but add necessary changes to comply with common API updates. This is the patch in the series when vhost-vsock namespaces actually come online. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-3-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27virtio: set skb owner of virtio_transport_reset_no_sock() replyBobby Eshleman1-0/+6
Associate reply packets with the sending socket. When vsock must reply with an RST packet and there exists a sending socket (e.g., for loopback), setting the skb owner to the socket correctly handles reference counting between the skb and sk (i.e., the sk stays alive until the skb is freed). This allows the net namespace to be used for socket lookups for the duration of the reply skb's lifetime, preventing race conditions between the namespace lifecycle and vsock socket search using the namespace pointer. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-2-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27vsock: add netns to vsock coreBobby Eshleman12-51/+437
Add netns logic to vsock core. Additionally, modify transport hook prototypes to be used by later transport-specific patches (e.g., *_seqpacket_allow()). Namespaces are supported primarily by changing socket lookup functions (e.g., vsock_find_connected_socket()) to take into account the socket namespace and the namespace mode before considering a candidate socket a "match". This patch also introduces the sysctl /proc/sys/net/vsock/ns_mode to report the mode and /proc/sys/net/vsock/child_ns_mode to set the mode for new namespaces. Add netns functionality (initialization, passing to transports, procfs, etc...) to the af_vsock socket layer. Later patches that add netns support to transports depend on this patch. This patch changes the allocation of random ports for connectible vsocks in order to avoid leaking the random port range starting point to other namespaces. dgram_allow(), stream_allow(), and seqpacket_allow() callbacks are modified to take a vsk in order to perform logic on namespace modes. In future patches, the net will also be used for socket lookups in these functions. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-1-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27net: ethernet: ti: netcp: Use u64_stats_t with u64_stats_sync properlyDavid Yang2-12/+12
On 64bit arches, struct u64_stats_sync is empty and provides no help against load/store tearing. Convert to u64_stats_t to ensure atomic operations. Note that does not mean the code is now tear-free: there're u32 counters unprotected by u64_stats or anything else. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260123164841.2890054-1-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27netdevsim: use u64_stats_t with u64_stats_sync properlyDavid Yang2-12/+14
On 64bit arches, struct u64_stats_sync is empty and provides no help against load/store tearing. Convert to u64_stats_t to ensure atomic operations. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260123211101.2929547-1-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27selftests: net: fix wrong boolean evaluation in __exit__Gal Pressman1-1/+1
The __exit__ method receives ex_type as the exception class when an exception occurs. The previous code used implicit boolean evaluation: terminate = self.terminate or (self._exit_wait and ex_type) ^^^^^^^^^^^ In Python, the and operator can be used with non-boolean values, but it does not always return a boolean result. This is probably not what we want, because 'self._exit_wait and ex_type' could return the actual ex_type value (the exception class) rather than a boolean True when an exception occurs. Use explicit `ex_type is not None` check to properly evaluate whether an exception occurred, returning a boolean result. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20260125105524.773993-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27r8169: remove optional size argument in calls to strscpyHeiner Kallweit2-5/+4
Minor simplification of the code by removing the optional size argument. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/d560ec66-848e-4290-818a-ce28f39de493@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27r8169: add support for extended chip version id and RTL9151ASJaven Xu2-9/+39
The bits in register TxConfig used for chip identification aren't sufficient for the number of upcoming chip versions. Therefore a register is added with extended chip version information, for compatibility purposes it's called TX_CONFIG_V2. First chip to use the extended chip identification is RTL9151AS. Signed-off-by: Javen Xu <javen_xu@realsil.com.cn> [hkallweit1@gmail.com: add support for extended XID where XID is printed] Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/a3525b74-a1aa-43f6-8413-56615f6fa795@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net: usb: replace unnecessary get_link functions with usbnet_get_linkEthan Nelson-Moore3-25/+4
usbnet_get_link calls mii_link_ok if the device has a MII defined in its usbnet struct and no check_connect function defined there. This is true of these drivers, so their custom get_link functions which call mii_link_ok are useless. Remove them in favor of usbnet_get_link. Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Peter Korsgaard <peter@korsgaard.com> Link: https://patch.msgid.link/20260124082217.82351-1-enelsonmoore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net: usb: smsc95xx: use phy_do_ioctl_running functionEthan Nelson-Moore1-9/+1
The smsc95xx_ioctl function behaves identically to the phy_do_ioctl_running function. Remove it and use the phy_do_ioctl_running function directly instead. Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260124080751.78488-1-enelsonmoore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27Merge branch 'code-clean-up'Jakub Kicinski3-102/+24
Justin Chen says: ==================== code clean up Clean up and streamlined some code that is no longer needed due to older HW support being dropped. ==================== Link: https://patch.msgid.link/20260122194949.1145107-1-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net: bcmasp: streamline early exit in probeJustin Chen1-13/+14
Streamline the bcmasp_probe early exit. As support for other functionality is added(i.e. ptp), it is easier to keep track of early exit cleanup when it is all in one place. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20260122194949.1145107-3-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net: bcmasp: clean up some legacy logicJustin Chen3-89/+10
Removed wol_irq check. This was needed for brcm,asp-v2.0, which was removed in previous commits. Removed bcmasp_intf_ops. These function pointers were added to make it easier to implement pseudo channels. These channels were removed in newer versions of the hardware and were never implemented. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20260122194949.1145107-2-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net: alacritech: Use u64_stats_t with u64_stats_sync properlyDavid Yang2-29/+29
On 64bit arches, struct u64_stats_sync is empty and provides no help against load/store tearing. Convert to u64_stats_t to ensure atomic operations. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260122185113.2760355-1-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net: include <linux/hex.h> from sysctl_net_core.cEric Dumazet1-0/+1
Needed for hex_byte_pack(). x86_64 was already including it, but some arches were not. Fixes: 37b0ea8fef56 ("net: expand NETDEV_RSS_KEY_LEN to 256 bytes") Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/netdev/aXeka0KYBnrkwUcF@sirena.org.uk/ Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260126174731.2767372-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26dt-bindings: net: dsa: fix typos in bindings docsAkiyoshi Kurita1-1/+1
Fix "alway" -> "always" in lan9303.txt and marvell,mv88e6xxx.yaml. Signed-off-by: Akiyoshi Kurita <weibu@redadmin.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260123150211.2646235-1-weibu@redadmin.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26Merge branch '200GbE' of ↵Jakub Kicinski13-1146/+1417
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== refactor IDPF resource access Pavan Kumar Linga says: Queue and vector resources for a given vport, are stored in the idpf_vport structure. At the time of configuration, these resources are accessed using vport pointer. Meaning, all the config path functions are tied to the default queue and vector resources of the vport. There are use cases which can make use of config path functions to configure queue and vector resources that are not tied to any vport. One such use case is PTP secondary mailbox creation (it would be in a followup series). To configure queue and interrupt resources for such cases, we can make use of the existing config infrastructure by passing the necessary queue and vector resources info. To achieve this, group the existing queue and vector resources into default resource group and refactor the code to pass the resource pointer to the config path functions. This series also includes patches which generalizes the send virtchnl message APIs and mailbox API that are necessary for the implementation of PTP secondary mailbox. * '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: idpf: generalize mailbox API idpf: avoid calling get_rx_ptypes for each vport idpf: generalize send virtchnl message API idpf: remove vport pointer from queue sets idpf: add rss_data field to RSS function parameters idpf: reshuffle idpf_vport struct members to avoid holes idpf: move some iterator declarations inside for loops idpf: move queue resources to idpf_q_vec_rsrc structure idpf: introduce idpf_q_vec_rsrc struct and move vector resources to it idpf: introduce local idpf structure to store virtchnl queue chunks ==================== Link: https://patch.msgid.link/20260122223601.2208759-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: usb: sr9700: rename register write commands for clarityEthan Nelson-Moore2-6/+6
SR_WR_REG and SR_WR_REGS may be confused at a cursory glance. Rename them to be more easily differentiated to prevent this. Suggested-by: Andrew Lunn <andrew+netdev@lunn.ch> Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Peter Korsgaard <peter@korsgaard.com> Link: https://patch.msgid.link/20260123080409.64165-1-enelsonmoore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: usb: sr9700: use ETH_ALEN instead of magic numberEthan Nelson-Moore1-1/+1
The driver hardcodes the number 6 as the number of bytes to write to the SR_PAR register, which stores the MAC address. Use ETH_ALEN instead to make the code clearer. Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Peter Korsgaard <peter@korsgaard.com> Link: https://patch.msgid.link/20260123070645.56434-1-enelsonmoore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26Merge branch 'net-neighbour-notify-changes-atomically'Jakub Kicinski1-57/+93
Petr Machata says: ==================== net: neighbour: Notify changes atomically Andy Roulin and Francesco Ruggeri have apparently independently both hit an issue with the current neighbor notification scheme. Francesco reported the issue in [1]. In a response[2] to that report, Andy said: neigh_update sends a rtnl notification if an update, e.g., nud_state change, was done but there is no guarantee of ordering of the rtnl notifications. Consider the following scenario: userspace thread kernel thread ================ ============= neigh_update write_lock_bh(n->lock) n->nud_state = STALE write_unlock_bh(n->lock) neigh_notify neigh_fill_info read_lock_bh(n->lock) ndm->nud_state = STALE read_unlock_bh(n->lock) --------------------------> neigh:update write_lock_bh(n->lock) n->nud_state = REACHABLE write_unlock_bh(n->lock) neigh_notify neigh_fill_info read_lock_bh(n->lock) ndm->nud_state = REACHABLE read_unlock_bh(n->lock) rtnl_nofify RTNL REACHABLE sent <-------- rtnl_notify RTNL STALE sent In this scenario, the kernel neigh is updated first to STALE and then REACHABLE but the netlink notifications are sent out of order, first REACHABLE and then STALE. The solution presented in [2] was to extend the critical region to include both the call to neigh_fill_info(), as well as rtnl_notify(). Then we have a guarantee that whatever state was captured by neigh_fill_info(), will be sent right away. The above scenario can thus not happen. This is how this patchset begins: patches #1 and #2 add helper duals to neigh_fill_info() and __neigh_notify() such that the __-prefixed function assumes the neighbor lock is held, and the unprefixed one is a thin wrapper that manages locking. This extends locking further than Andy's patch, but makes for a clear code and supports the following part. At that point, the original race is gone. But what can happen is the following race, where the notification does not reflect the change that was made: userspace thread kernel thread ================ ============= neigh_update write_lock_bh(n->lock) n->nud_state = STALE write_unlock_bh(n->lock) --------------------------> neigh:update write_lock_bh(n->lock) n->nud_state = REACHABLE write_unlock_bh(n->lock) neigh_notify read_lock_bh(n->lock) __neigh_fill_info ndm->nud_state = REACHABLE rtnl_notify read_unlock_bh(n->lock) RTNL REACHABLE sent <-------- neigh_notify read_lock_bh(n->lock) __neigh_fill_info ndm->nud_state = REACHABLE rtnl_notify read_unlock_bh(n->lock) RTNL REACHABLE sent again Here, even though neigh_update() made a change to STALE, it later sends a notification with a NUD of REACHABLE. The obvious solution to fix this race is to move the notifier to the same critical section that actually makes the change. Sending a notification in fact involves two things: invoking the internal notifier chain, and sending the netlink notification. The overall approach in this patchset is to move the netlink notification to the critical section of the change, while keeping the internal notifier intact. Since the motion is not obviously correct, the patchset presents the change in series of incremental steps with discussion in commit messages. Please see details in the patches themselves. Reproducer ========== To consistently reproduce, I injected an mdelay before the rtnl_notify() call. Since only one thread should delay, a bit of instrumentation was needed to see where the call originates. The mdelay was then only issued on the call stack rooted in the RTNL request. Then the general idea is to issue an "ip neigh replace" to mark a neighbor entry as failed. In parallel to that, inject an ARP burst that validates the entry. This is all observed with an "ip monitor neigh", where one can see either a REACHABLE->FAILED transition, or FAILED->REACHABLE, while the actual state at the end of the sequence is always REACHABLE. With the patchset, only FAILED->REACHABLE is ever observed in the monitor. Alternatives ============ Another approach to solving the issue would be to have a per-neighbor queue of notification digests, each with a set of fields necessary for formatting a notification. In pseudocode, a neighbor update would look something like this: neighbor_update: - lock - do update - allocate notification digest, fill partially, mark not-committed - unlock - critical-section-breaking stuff (probes, ARP Q, etc.) - lock - fill in missing details to the digest (notably neigh->probes) - mark the digest as committed - while (front of the digest queue is committed) - pop it, convert to notifier, send the notification - unlock This adds more complexity and would imply more changes to the code, which is why I think the approach presented in this patchset is better. But it would allow us to retain the overall structure of the code while giving us accurate notifications. A third approach would be to consider the second race not very serious and be OK with seeing a notification that does not reflect the change that prompted it. Then a two-patch prefix of this patchset would be all that is needed. [1]: https://lore.kernel.org/20220606230107.D70B55EC0B30@us226.sjc.aristanetworks.com [2]: https://lore.kernel.org/ed6768c1-80b8-aee2-e545-b51661d49336@nvidia.com ==================== Link: https://patch.msgid.link/cover.1769012464.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: core: neighbour: Make another netlink notification atomicallyPetr Machata1-4/+9
Similarly to the issue from the previous patch, neigh_timer_handler() also updates the neighbor separately from formatting and sending the netlink notification message. We have not seen reports to the effect of this causing trouble, but in theory, the same sort of issues could have come up: neigh_timer_handler() would make changes as necessary, but before formatting and sending a notification, is interrupted before sending by another thread, which makes a parallel change and sends its own message. The message send that is prompted by an earlier change thus contains information that does not reflect the change having been made. To solve this, the netlink notification needs to be in the same critical section that updates the neighbor. The critical section is ended by the neigh_probe() call which drops the lock before calling solicit. Stretching the critical section over the solicit call is problematic, because that can then involved all sorts of forwarding callbacks. Therefore, like in the previous patch, split the netlink notification away from the internal one and move it ahead of the probe call. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/e440118511cbdbe1d88eb0d71c9047116feb96e0.1769012464.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: core: neighbour: Make one netlink notification atomicallyPetr Machata1-3/+5
As noted in a previous patch, one race remains in the current code. A kernel thread might interrupt a userspace thread after the change is done, but before formatting and sending the message. Then what we would see is two messages with the same contents: userspace thread kernel thread ================ ============= neigh_update write_lock_bh(n->lock) n->nud_state = STALE write_unlock_bh(n->lock) --------------------------> neigh:update write_lock_bh(n->lock) n->nud_state = REACHABLE write_unlock_bh(n->lock) neigh_notify read_lock_bh(n->lock) __neigh_fill_info ndm->nud_state = REACHABLE rtnl_notify read_unlock_bh(n->lock) RTNL REACHABLE sent <-------- neigh_notify read_lock_bh(n->lock) __neigh_fill_info ndm->nud_state = REACHABLE rtnl_notify read_unlock_bh(n->lock) RTNL REACHABLE sent again The solution is to send the netlink message inside the critical section where the neighbor is changed, so that it reflects the notified-upon neighbor state. To that end, in __neigh_update(), move the current neigh_notify() call up to said critical section, and convert it to __neigh_notify(), because the lock is held. This motion crosses calls to neigh_update_managed_list(), neigh_update_gc_list() and neigh_update_process_arp_queue(), all of which potentially unlock and give an opportunity for the above race. This also crosses a call to neigh_update_process_arp_queue() which calls neigh->output(), which might be neigh_resolve_output() calls neigh_event_send() calls neigh_event_send_probe() calls __neigh_event_send() calls neigh_probe(), which touches neigh->probes, an update which will now not be visible in the notification. However, there is indication that there is no promise that these changes will be accurately projected to notifications: fib6_table_lookup() indirectly calls route.c's find_match() calls rt6_probe(), which looks up a neighbor and call __neigh_set_probe_once(), which sets neigh->probes to 0, but neither this nor the caller seems to send a notification. Additionally, the neighbor object that the neigh_probe() mentioned above is called on, might be the alternative neighbor looked up for the ARP queue packet destination. If that is the case, the changed value of n1->probes is not notified anywhere. So at least in some circumstances, the reported number of probes needs to be assumed to change without notification. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/ceb44995498eb52375cb2d46c3245bdb9e74b355.1769012464.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: core: neighbour: Reorder netlink & internal notificationPetr Machata1-2/+2
The netlink message needs to be send inside the critical section where the neighbor is changed, so that it reflects the notified-upon neighbor state. On the other hand, there is no such need in case of notifier chain: the listeners do not assume lock, and often in fact just schedule a delayed work to act on the neighbor later. At least one in fact also takes the neighbor lock. This requires that the netlink notification be done before the internal notifier chain message is sent. That is safe to do, because the current listeners, as well as __neigh_notify(), only read the updated neighbor fields, and never modify them. (Apart from locking.) Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/f3ef74d5460f14c4d102b8a5857d4a6624da9a5a.1769012464.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: core: neighbour: Inline neigh_update_notify() callsPetr Machata1-11/+10
The obvious idea behind the helper is to keep together the two bits that should be done either both or neither: the internal notifier chain message, and the netlink notification. To make sure that the notification sent reflects the change being made, the netlink message needs to be send inside the critical section where the neighbor is changed. But for the notifier chain, there is no such need: the listeners do not assume lock, and often in fact just schedule a delayed work to act on the neighbor later. At least one in fact also takes the neighbor lock. Therefore these two items have each different locking needs. Now we could unlock inside the helper, but I find that error prone, and the fact that the notification is conditional in the first place does not help to make the call site obvious. So in this patch, the helper is instead removed and the body, which is just these two calls, inlined. That way we can use each notifier independently. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/e65dce5882bc6f4aa2530b8a4877d0e003071a1a.1769012464.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: core: neighbour: Process ARP queue laterPetr Machata1-1/+7
ARP queue processing unlocks the neighbor lock, which can allow another thread to asynchronously perform a neighbor update and send an out of order notification. Therefore this needs to be done after the notification is sent. Move it just before the end of the critical section. Since neigh_update_process_arp_queue() unlocks, it does not form a part of the critical section anymore but it can benefit from the lock being taken. The intention is to eventually do the RTNL notification before this call. This motion crosses a call to neigh_update_is_router(), which should not influence processing of the ARP queue. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/9ea7159e71430ebdc837ebcc880a76b7e82e52a4.1769012464.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: core: neighbour: Extract ARP queue processing to a helper functionPetr Machata1-35/+43
In order to make manipulation with this bit of code clearer, extract it to a helper function, neigh_update_process_arp_queue(). Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/8b0fa0abe2cf0e24484903f5436fe0ac64163057.1769012464.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: core: neighbour: Call __neigh_notify() under a lockPetr Machata1-6/+12
Andy Roulin has described an issue with the current neighbor notification scheme as follows. This was also presented publicly at the link below. neigh_update sends a rtnl notification if an update, e.g., nud_state change, was done but there is no guarantee of ordering of the rtnl notifications. Consider the following scenario: userspace thread kernel thread ================ ============= neigh_update write_lock_bh(n->lock) n->nud_state = STALE write_unlock_bh(n->lock) neigh_notify neigh_fill_info read_lock_bh(n->lock) ndm->nud_state = STALE read_unlock_bh(n->lock) --------------------------> neigh:update write_lock_bh(n->lock) n->nud_state = REACHABLE write_unlock_bh(n->lock) neigh_notify neigh_fill_info read_lock_bh(n->lock) ndm->nud_state = REACHABLE read_unlock_bh(n->lock) rtnl_nofify RTNL REACHABLE sent <-------- rtnl_notify RTNL STALE sent In this scenario, the kernel neigh is updated first to STALE and then REACHABLE but the netlink notifications are sent out of order, first REACHABLE and then STALE. The solution is to send the netlink message inside the same critical section that formats the message. That way both the contents and ordering of the message reflect the same state, and we cannot see the abovementioned out-of-order delivery. Even with this patch, a remaining issue that the contents of the message may not reflect the changes made to the neighbor. A kernel thread might still interrupt a userspace thread after the change is done, but before formatting and sending the message. Then what we would see is two messages with the same contents. The following patches will attempt to address that issue. To support those future patches, convert __neigh_notify() to a helper that assumes that the neighbor lock is already taken by having it call __neigh_fill_info() instead of neigh_fill_info(). Add a new helper, neigh_notify(), which takes the lock before calling __neigh_notify(). Migrate all callers to use the latter. Link: https://lore.kernel.org/netdev/ed6768c1-80b8-aee2-e545-b51661d49336@nvidia.com/ Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/4b4368dcc5f5a7e407009cb6c36b69cfb5282864.1769012464.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: core: neighbour: Add a neigh_fill_info() helper for when lock not heldPetr Machata1-7/+17
The netlink message needs to be formatted and sent inside the critical section where the neighbor is changed, so that it reflects the notified-upon neighbor state. Because it will happen inside an already existing critical section, it has to assume that the neighbor lock is held. Add a helper __neigh_fill_info(), which is like neigh_fill_info(), but makes this assumption. Convert neigh_fill_info() to a wrapper around this new helper. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/7ec20113d5d809200e3534d3ed8f0004514914b8.1769012464.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26ipv4: igmp: annotate data-races around idev->mr_maxdelayEric Dumazet2-3/+3
idev->mr_maxdelay is read and written locklessly, add READ_ONCE()/WRITE_ONCE() annotations. While we are at it, make this field an u32. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20260122172247.2429403-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26ipvlan: remove ipvlan_ht_addr_lookup()Eric Dumazet1-21/+34
ipvlan_ht_addr_lookup() is called four times and not inlined. Split it to ipvlan_ht_addr_lookup6() and ipvlan_ht_addr_lookup4() and rework ipvlan_addr_lookup() to call these helpers once, so that they are (auto)inlined. After this change, ipvlan_addr_lookup() is faster, and we save 350 bytes of text on x86_64. $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new add/remove: 0/2 grow/shrink: 1/0 up/down: 123/-473 (-350) Function old new delta ipvlan_addr_lookup 467 590 +123 __pfx_ipvlan_ht_addr_lookup 16 - -16 ipvlan_ht_addr_lookup 457 - -457 Total: Before=22571833, After=22571483, chg -0.00% Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260122165049.2366985-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: expand NETDEV_RSS_KEY_LEN to 256 bytesEric Dumazet3-10/+16
NETDEV_RSS_KEY_LEN has been set to 52 bytes in 2014, until now. Jakub suggested we bump the size to 128 bytes or more. Some drivers (like idpf) were already working around the core limit. Since this change might cause some issues in admin scripts, bump it directly to 256 in one go. tjbp26:~# cat /proc/sys/net/core/netdev_rss_key | wc -c 768 tjbp26:~# ethtool -x eth1 RX flow hash indirection table for eth1 with 32 RX ring(s): ... RSS hash key: fe:16:5b:2f:93:85:c2:c9:c1:ef:bd:60:c6:e0:2b:99:4d:bf:b7:14:c8:1e:8d:cb:31:17:51:da:55:eb:91:d9:9e:f9:89:9b:44:a1:dc:08:72:3a:b3:d6:31:86:9a:fe:02:3a:0d:eb:a1:7c:f5:a3:51:3b:08:56:c9:3f:71:69:01:ba:70:38 RSS hash function: toeplitz: on xor: off crc32: off Suggested-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/netdev/20260122075206.504ec591@kernel.org/ Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260122190349.2771064-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26Merge branch 'net-few-critical-helpers-are-inlined-again'Jakub Kicinski6-40/+47
Eric Dumazet says: ==================== net: few critical helpers are inlined again Recent devmem additions increased stack depth. Some helpers that were inlined in the past are now out-of-line. ==================== Link: https://patch.msgid.link/20260122045720.1221017-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: inline get_netmem() and put_netmem()Eric Dumazet2-23/+28
These helpers are used in network fast paths. Only call out-of-line helpers for netmem case. We might consider inlining __get_netmem() and __put_netmem() in the future. $ scripts/bloat-o-meter -t vmlinux.3 vmlinux.4 add/remove: 6/6 grow/shrink: 22/1 up/down: 2614/-646 (1968) Function old new delta pskb_carve 1669 1894 +225 gro_pull_from_frag0 - 206 +206 get_page 190 380 +190 skb_segment 3561 3747 +186 put_page 595 765 +170 skb_copy_ubufs 1683 1822 +139 __pskb_trim_head 276 401 +125 __pskb_copy_fclone 734 858 +124 skb_zerocopy 1092 1215 +123 pskb_expand_head 892 1008 +116 skb_split 828 940 +112 skb_release_data 297 409 +112 ___pskb_trim 829 941 +112 __skb_zcopy_downgrade_managed 120 226 +106 tcp_clone_payload 530 634 +104 esp_ssg_unref 191 294 +103 dev_gro_receive 1464 1514 +50 __put_netmem - 41 +41 __get_netmem - 41 +41 skb_shift 1139 1175 +36 skb_try_coalesce 681 714 +33 __pfx_put_page 112 144 +32 __pfx_get_page 32 64 +32 __pskb_pull_tail 1137 1168 +31 veth_xdp_get 250 267 +17 __pfx_gro_pull_from_frag0 - 16 +16 __pfx___put_netmem - 16 +16 __pfx___get_netmem - 16 +16 __pfx_put_netmem 16 - -16 __pfx_gro_try_pull_from_frag0 16 - -16 __pfx_get_netmem 16 - -16 put_netmem 114 - -114 get_netmem 130 - -130 napi_gro_frags 929 771 -158 gro_try_pull_from_frag0 196 - -196 Total: Before=22565857, After=22567825, chg +0.01% Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260122045720.1221017-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: inline net_is_devmem_iov()Eric Dumazet3-11/+13
1) Inline this small helper to reduce code size and decrease cpu costs. 2) Constify its argument. 3) Move it to include/net/netmem.h, as a prereq for the following patch. $ scripts/bloat-o-meter -t vmlinux.2 vmlinux.3 add/remove: 0/2 grow/shrink: 0/4 up/down: 0/-158 (-158) Function old new delta validate_xmit_skb 866 857 -9 __pfx_net_is_devmem_iov 16 - -16 net_is_devmem_iov 22 - -22 get_netmem 152 130 -22 put_netmem 140 114 -26 tcp_recvmsg_locked 3860 3797 -63 Total: Before=22566015, After=22565857, chg -0.00% Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260122045720.1221017-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26gro: change the BUG_ON() in gro_pull_from_frag0()Eric Dumazet1-1/+1
Replace the BUG_ON() which never fired with a DEBUG_NET_WARN_ON_ONCE() $ scripts/bloat-o-meter -t vmlinux.1 vmlinux.2 add/remove: 2/2 grow/shrink: 1/1 up/down: 370/-254 (116) Function old new delta gro_try_pull_from_frag0 - 196 +196 napi_gro_frags 771 929 +158 __pfx_gro_try_pull_from_frag0 - 16 +16 __pfx_gro_pull_from_frag0 16 - -16 dev_gro_receive 1514 1464 -50 gro_pull_from_frag0 188 - -188 Total: Before=22565899, After=22566015, chg +0.00% Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260122045720.1221017-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26net: always inline skb_frag_unref() and __skb_frag_unref()Eric Dumazet1-5/+5
clang is not inlining skb_frag_unref() and __skb_frag_unref() in gro fast path. It also does not inline gro_try_pull_from_frag0(). Using __always_inline fixes this issue, makes the kernel faster _and_ smaller. Also change __skb_frag_ref(), skb_frag_ref() and skb_page_unref() to let them inlined for the last patch in this series. $ scripts/bloat-o-meter -t vmlinux.0 vmlinux.1 add/remove: 2/6 grow/shrink: 1/2 up/down: 218/-511 (-293) Function old new delta gro_pull_from_frag0 - 188 +188 __pfx_gro_pull_from_frag0 - 16 +16 skb_shift 1125 1139 +14 __pfx_skb_frag_unref 16 - -16 __pfx_gro_try_pull_from_frag0 16 - -16 __pfx___skb_frag_unref 16 - -16 __skb_frag_unref 36 - -36 skb_frag_unref 59 - -59 dev_gro_receive 1608 1514 -94 napi_gro_frags 892 771 -121 gro_try_pull_from_frag0 153 - -153 Total: Before=22566192, After=22565899, chg -0.00% Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260122045720.1221017-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26Merge branch 'u64_stats-introduce-u64_stats_copy'Jakub Kicinski4-5/+20
David Yang says: ==================== u64_stats: Introduce u64_stats_copy() On 64bit arches, struct u64_stats_sync is empty and provides no help against load/store tearing. memcpy() should not be considered atomic against u64 values. Use u64_stats_copy() instead. ==================== Link: https://patch.msgid.link/20260120092137.2161162-1-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26vxlan: vnifilter: fix memcpy with u64_statsDavid Yang1-1/+1
On 64bit arches, struct u64_stats_sync is empty and provides no help against load/store tearing. memcpy() should not be considered atomic against u64 values. Use u64_stats_copy() instead. Signed-off-by: David Yang <mmyangfl@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260120092137.2161162-5-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>