Age  Commit message  Author  Files  Lines
2024-11-16Merge tag 'nf-next-24-11-15' of ↵Jakub Kicinski11-127/+484
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Extended netlink error reporting if nfnetlink attribute parser fails, from Donald Hunter.

2) Incorrect request_module() module, from Simon Horman.

3) A series of patches to reduce memory consumption for set element transactions. Florian Westphal says:

   "When doing a flush on a set or mass adding/removing elements from a set, each element needs to allocate 96 bytes to hold the transactional state.

   In such cases, virtually all the information in struct nft_trans_elem is the same.

   Change nft_trans_elem to a flex-array, i.e. a single nft_trans_elem can hold multiple set element pointers.

   The number of elements that can be stored in one nft_trans_elem is limited by the slab allocator, this series limits the compaction to at most 62 elements as it caps the reallocation to 2048 bytes of memory."

4) A series of patches to prepare the transition to dscp_t in .flowi_tos. From Guillaume Nault.

5) Support for bitwise operations with two source registers, from Jeremy Sowden.

* tag 'nf-next-24-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: bitwise: add support for doing AND, OR and XOR directly
  netfilter: bitwise: rename some boolean operation functions
  netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t.
  netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t.
  netfilter: rpfilter: Convert rpfilter_mt() to dscp_t.
  netfilter: flow_offload: Convert nft_flow_route() to dscp_t.
  netfilter: ipv4: Convert ip_route_me_harder() to dscp_t.
  netfilter: nf_tables: allocate element update information dynamically
  netfilter: nf_tables: switch trans_elem to real flex array
  netfilter: nf_tables: prepare nft audit for set element compaction
  netfilter: nf_tables: prepare for multiple elements in nft_trans_elem structure
  netfilter: nf_tables: add nft_trans_commit_list_add_elem helper
  netfilter: bpf: Pass string literal as format argument of request_module()
  netfilter: nfnetlink: Report extack policy errors for batched ops
====================

Link: https://patch.msgid.link/20241115133207.8907-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15netfilter: bitwise: add support for doing AND, OR and XOR directlyJeremy Sowden2-11/+131
Hitherto, these operations have been converted in user space to mask-and-xor operations on one register and two immediate values, and it is the latter which have been evaluated by the kernel. We add support for evaluating these operations directly in kernel space on one register and either an immediate value or a second register. Pablo made a few changes to the original patch: - EINVAL if NFTA_BITWISE_SREG2 is used with fast version. - Allow _AND,_OR,_XOR with _DATA != sizeof(u32) - Dump _SREG2 or _DATA with _AND,_OR,_XOR Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
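The user-space conversion mentioned in this message rests on a simple identity: with an immediate operand v, each of AND, OR and XOR can be written as (x & mask) ^ xor. A minimal Python sketch of that identity follows. The function names and the single 32-bit register width are assumptions made for illustration; this is not kernel or nft code.

```python
REG_MASK = 0xFFFFFFFF  # assumption: one 32-bit register


def mask_and_xor(x, mask, xor):
    """Evaluate the mask-and-xor form: (x & mask) ^ xor."""
    return ((x & mask) ^ xor) & REG_MASK


def encode_and(v):
    # x & v == (x & v) ^ 0
    return v & REG_MASK, 0


def encode_or(v):
    # x | v == (x & ~v) ^ v
    return ~v & REG_MASK, v & REG_MASK


def encode_xor(v):
    # x ^ v == (x & all-ones) ^ v
    return REG_MASK, v & REG_MASK


x, v = 0x12345678, 0x0F0F00FF
print(hex(mask_and_xor(x, *encode_or(v))), hex(x | v))
```

The encoding needs v when the rule is built, so it only covers immediate operands; an operand held in a second register is known only at evaluation time.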
2024-11-15netfilter: bitwise: rename some boolean operation functionsJeremy Sowden2-20/+24
In the next patch we add support for doing AND, OR and XOR operations directly in the kernel, so rename some functions and an enum constant related to mask-and-xor boolean operations. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t.Guillaume Nault1-1/+1
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
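As a rough model of the conversion described in these dscp_t patches: the DSCP occupies the upper six bits of the old TOS byte and the lower two bits carry ECN. The Python functions below are stand-ins written for illustration, not the kernel helpers themselves.

```python
INET_DSCP_MASK = 0xFC  # upper six bits of the DS field; the low two are ECN


def dsfield_to_dscp(dsfield):
    """Model of extracting the DSCP: clear the two ECN bits."""
    return dsfield & INET_DSCP_MASK


def dscp_to_dsfield(dscp):
    """Model of the temporary conversion back to a plain 8-bit value."""
    return dscp & 0xFF


tos = 0xB9  # DSCP 46 (0xB8 once shifted into place) with ECN bits 0b01
print(hex(dscp_to_dsfield(dsfield_to_dscp(tos))))  # prints 0xb8
```

In this model the round trip differs from reading the TOS byte directly only in that the ECN bits are dropped.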
2024-11-15netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t.Guillaume Nault1-1/+2
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: rpfilter: Convert rpfilter_mt() to dscp_t.Guillaume Nault1-1/+1
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: flow_offload: Convert nft_flow_route() to dscp_t.Guillaume Nault1-2/+2
Use ip4h_dscp() instead of reading ip_hdr()->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Also, remove the comment about the net/ip.h include file, since it's now required for the ip4h_dscp() helper too. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: ipv4: Convert ip_route_me_harder() to dscp_t.Guillaume Nault1-1/+1
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15Merge branch 'net-make-rss-rxnfc-semantics-more-explicit'Jakub Kicinski6-19/+113
Edward Cree says:

====================
net: make RSS+RXNFC semantics more explicit

The original semantics of ntuple filters with FLOW_RSS were not fully understood by all drivers, some ignoring the ring_cookie from the flow rule. Require this support to be explicitly declared by the driver for filters relying on it to be inserted, and add self-test coverage for this functionality. Also teach ethtool_check_max_channel() about this.
====================

Link: https://patch.msgid.link/cover.1731499021.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15selftest: extend test_rss_context_queue_reconfigure for action additionEdward Cree1-0/+26
The combination of ntuple action (ring_cookie) and RSS context can cause an ntuple rule to target a higher queue than appears in any RSS indirection table or directly in the ntuple rule, since the two numbers are added together. Verify the logic that prevents reducing the queue count in this case. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/58276b800ab78c0a79c1918046ccae7fe45ba802.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15selftest: validate RSS+ntuple filters with nonzero ring_cookieEdward Cree1-1/+40
Test creates an ntuple filter with 'action 2' and an RSS context whose indirection table has entries 0 and 1. Resulting traffic should go to queues 2 and 3; verify that it never hits queues 0 and 1. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/114afdf4d2867f72ed27751e8e08fe8b128a8529.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15selftest: include dst-ip in ethtool ntuple rulesEdward Cree1-6/+6
sfc hardware does not support filters with only ipproto + dst-port; adding dst-ip to the flow spec allows the rss_ctx test to be run on these devices. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://patch.msgid.link/8e5d23c8f21310c23c080cc7bcd31b76f8fd3096.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15net: ethtool: account for RSS+RXNFC add semantics when checking channel countEdward Cree1-12/+30
In ethtool_check_max_channel(), the new RX count must not only cover the max queue indices in RSS indirection tables and RXNFC destinations separately, but must also, for RXNFC rules with FLOW_RSS, cover the sum of the destination queue and the maximum index in the associated RSS context's indirection table, since that is the highest queue that the rule can actually deliver traffic to. It could be argued that the max queue across all custom RSS contexts (ethtool_get_max_rss_ctx_channel()) need no longer be considered, since any context to which packets can actually be delivered will be targeted by some RXNFC rule and its max will thus be allowed for by ethtool_get_max_rxnfc_channel(). For simplicity we keep both checks, so even RSS contexts unused by any RXNFC rule must fit the channel count. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/43257d375434bef388e36181492aa4c458b88336.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
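The channel-count rule in this message can be sketched as follows. This is a simplified Python model, not the kernel implementation: the data layout (a dict of custom contexts and a list of (ring_cookie, context) tuples) and both function names are hypothetical.

```python
def highest_rx_queue(default_indir, contexts, rules):
    """Return the highest RX queue index that traffic can reach.

    default_indir: indirection table of the default RSS context
    contexts:      {context_id: indirection table} for custom RSS contexts
    rules:         list of (ring_cookie, context_id or None) RXNFC rules
    """
    highest = max(default_indir, default=0)
    # Custom contexts are counted even when no rule points at them.
    for table in contexts.values():
        highest = max(highest, max(table, default=0))
    for ring_cookie, ctx in rules:
        target = ring_cookie
        if ctx is not None:
            # FLOW_RSS: base queue plus the largest table entry.
            target += max(contexts[ctx], default=0)
        highest = max(highest, target)
    return highest


def channel_count_ok(new_rx_count, default_indir, contexts, rules):
    """The new RX count must be larger than every reachable queue index."""
    return new_rx_count > highest_rx_queue(default_indir, contexts, rules)


print(channel_count_ok(3, [0, 1], {1: [0, 1]}, [(2, 1)]))
print(channel_count_ok(4, [0, 1], {1: [0, 1]}, [(2, 1)]))
```

With a rule whose ring_cookie is 2 and a context table of entries 0 and 1, the highest reachable queue is 3, so the model refuses a reduction to 3 RX channels even though no table entry or rule names queue 3 on its own.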
2024-11-15net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts inEdward Cree4-0/+11
Ethtool ntuple filters with FLOW_RSS were originally defined as adding the base queue ID (ring_cookie) to the value from the indirection table, so that the same table could distribute over more than one set of queues when used by different filters. However, some drivers / hardware ignore the ring_cookie, and simply use the indirection table entries as queue IDs directly. Thus, for drivers which have not opted in by setting ethtool_ops.cap_rss_rxnfc_adds to declare that they support the original (addition) semantics, reject in ethtool_set_rxnfc any filter which combines FLOW_RSS and a nonzero ring. (For a ring_cookie of zero, both behaviours are equivalent.) Set the cap bit in sfc, as it is known to support this feature. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://patch.msgid.link/cc3da0844083b0e301a33092a6299e4042b65221.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
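A minimal Python model of the two behaviours described in this message, under stated assumptions: the function names are hypothetical, and the boolean argument stands in for the driver's ethtool_ops.cap_rss_rxnfc_adds bit.

```python
def rule_allowed(flow_rss, ring_cookie, cap_rss_rxnfc_adds):
    """Model of the check: FLOW_RSS with a nonzero ring needs driver opt-in."""
    if flow_rss and ring_cookie != 0 and not cap_rss_rxnfc_adds:
        return False
    return True


def target_queues(ring_cookie, indir_table):
    """Addition semantics: base queue plus each indirection table entry."""
    return sorted({ring_cookie + entry for entry in indir_table})


print(target_queues(2, [0, 1, 0, 1]))
print(rule_allowed(flow_rss=True, ring_cookie=2, cap_rss_rxnfc_adds=False))
```

A ring_cookie of zero passes the check regardless of the capability bit, because adding zero gives the same queues as using the table entries directly.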
2024-11-15dt-bindings: net: sff,sfp: Fix "interrupts" property typoRob Herring (Arm)1-1/+1
The example has "interrupt" property which is not a defined property. It should be "interrupts" instead. "interrupts" also should not contain a phandle. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20241113225825.1785588-2-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15dt-bindings: net: mdio-mux-gpio: Drop undocumented "marvell,reg-init"Rob Herring (Arm)1-32/+0
"marvell,reg-init" is not yet documented by schema. It's irrelevant to the example, so just drop it. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20241113225713.1784118-2-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15net: sparx5: add missing lan969x Kconfig dependencyArnd Bergmann2-2/+2
The sparx5 switchdev driver can be built either with or without support for the Lan969x switch. However, it cannot be built-in when the lan969x driver is a loadable module because of a link-time dependency: arm-linux-gnueabi-ld: drivers/net/ethernet/microchip/sparx5/sparx5_main.o:(.rodata+0xd44): undefined reference to `lan969x_desc' Add a Kconfig dependency to reflect this in Kconfig, allowing all the valid configurations but forcing sparx5 to be a loadable module as well if lan969x is. Fixes: 98a01119608d ("net: sparx5: add compatible string for lan969x") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241113115513.4132548-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15net: enetc: clean up before returning in probe()Dan Carpenter1-3/+5
We recently added this error path. We need to call enetc_pci_remove() before returning. It cleans up the resources from enetc_pci_probe(). Fixes: 99100d0d9922 ("net: enetc: add preliminary support for i.MX95 ENETC PF") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/93888efa-c838-4682-a7e5-e6bf318e844e@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15mdio: Remove mdio45_ethtool_gset_npage()Alistair Francis2-175/+0
The mdio45_ethtool_gset_npage() function isn't called, so let's remove it. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Link: https://patch.msgid.link/20241112105430.438491-2-alistair@alistair23.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15include: mdio: Remove mdio45_ethtool_gset()Alistair Francis1-16/+0
mdio45_ethtool_gset() is never called, so let's remove it. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Link: https://patch.msgid.link/20241112105430.438491-1-alistair@alistair23.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15Merge tag 'for-netdev' of ↵Jakub Kicinski3-84/+226
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Martin KaFai Lau says:

====================
pull-request: bpf-next 2024-11-14

We've added 9 non-merge commits during the last 4 day(s) which contain a total of 3 files changed, 226 insertions(+), 84 deletions(-).

The main changes are:

1) Fixes to bpf_msg_push/pop_data and test_sockmap. The changes have a dependency on the other changes in the bpf-next/net branch, from Zijian Zhang.

2) Drop netns codes from mptcp test. Reuse the common helpers in test_progs, from Geliang Tang.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  bpf, sockmap: Fix sk_msg_reset_curr
  bpf, sockmap: Several fixes to bpf_msg_pop_data
  bpf, sockmap: Several fixes to bpf_msg_push_data
  selftests/bpf: Add more tests for test_txmsg_push_pop in test_sockmap
  selftests/bpf: Add push/pop checking for msg_verify_data in test_sockmap
  selftests/bpf: Fix total_bytes in msg_loop_rx in test_sockmap
  selftests/bpf: Fix SENDPAGE data logic in test_sockmap
  selftests/bpf: Add txmsg_pass to pull/push/pop in test_sockmap
  selftests/bpf: Drop netns helpers in mptcp
====================

Link: https://patch.msgid.link/20241114202832.3187927-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15Merge branch 'ipv4-prepare-bpf-helpers-to-flowi4_tos-conversion'Jakub Kicinski2-2/+2
Guillaume Nault says:

====================
ipv4: Prepare bpf helpers to .flowi4_tos conversion.

Continue the process of making a dscp_t variable available when setting .flowi4_tos. This series focuses on the BPF helpers that initialise a struct flowi4 manually.

The objective is to eventually convert .flowi4_tos to dscp_t (to get type annotation and prevent ECN bits from interfering with DSCP).
====================

Link: https://patch.msgid.link/cover.1731064982.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15bpf: lwtunnel: Prepare bpf_lwt_xmit_reroute() to future .flowi4_tos conversion.Guillaume Nault1-1/+1
Use ip4h_dscp() to get the DSCP from the IPv4 header, then convert the dscp_t value to __u8 with inet_dscp_to_dsfield(). Then, when we'll convert .flowi4_tos to dscp_t, we'll just have to drop the inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/8338a12377c44f698a651d1ce357dd92bdf18120.1731064982.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15bpf: ipv4: Prepare __bpf_redirect_neigh_v4() to future .flowi4_tos conversion.Guillaume Nault1-1/+1
Use ip4h_dscp() to get the DSCP from the IPv4 header, then convert the dscp_t value to __u8 with inet_dscp_to_dsfield(). Then, when we'll convert .flowi4_tos to dscp_t, we'll just have to drop the inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/35eacc8955003e434afb1365d404193cc98a9579.1731064982.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15Merge branch 'tools-net-ynl-rework-async-notification-handling'Jakub Kicinski2-33/+44
Donald Hunter says:

====================
tools/net/ynl: rework async notification handling

Revert patch 1bf70e6c3a53 which modified check_ntf() and instead add a new poll_ntf() with async notification semantics. See patch 2 for a detailed description.
====================

Link: https://patch.msgid.link/20241113090843.72917-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15tools/net/ynl: add async notification handlingDonald Hunter2-10/+34
The notification handling in ynl is currently very simple, using sleep() to wait a period of time and then handling all the buffered messages in a single batch.

This patch adds async notification handling so that messages can be processed as they are received. This makes it possible to use ynl as a library that supplies notifications in a timely manner.

- Add poll_ntf() to be a generator that yields 1 notification at a time and blocks until a notification is available.
- Add a --duration parameter to the CLI, with --sleep as an alias.

    ./tools/net/ynl/cli.py \
        --spec <SPEC> --subscribe <TOPIC> [ --duration <SECS> ]

The cli will report any notifications for duration seconds and then exit. If duration is not specified, then it will poll forever, until interrupted.

Here is an example python snippet that shows how to use ynl as a library for receiving notifications:

    ynl = YnlFamily(f"{dir}/rt_route.yaml")
    ynl.ntf_subscribe('rtnlgrp-ipv4-route')

    for event in ynl.poll_ntf():
        handle(event)

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20241113090843.72917-3-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
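The generator pattern described in this message (yield one notification at a time, block while none is available, optionally stop after a duration) can be sketched independently of ynl. The code below is not the ynl implementation: poll_events, its parameters and the queue-based event source are hypothetical stand-ins for a netlink socket.

```python
import queue
import time


def poll_events(source, duration=None, interval=0.05):
    """Yield items from a queue one at a time as they arrive.

    source:   a queue.Queue standing in for the notification socket
    duration: seconds to keep polling, or None to poll until interrupted
    interval: how long each blocking wait lasts before re-checking the deadline
    """
    deadline = None if duration is None else time.monotonic() + duration
    while deadline is None or time.monotonic() < deadline:
        try:
            # Block briefly; hand the item to the caller as soon as it arrives.
            yield source.get(timeout=interval)
        except queue.Empty:
            continue


events = queue.Queue()
for name in ("route-add", "route-del"):
    events.put(name)

for event in poll_events(events, duration=0.2):
    print(event)
```

Compared with sleeping and then draining a buffer, the caller sees each event as soon as it is available, and the duration only bounds how long the loop keeps waiting.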
2024-11-15Revert "tools/net/ynl: improve async notification handling"Donald Hunter2-36/+23
This reverts commit 1bf70e6c3a5346966c25e0a1ff492945b25d3f80. This modification to check_ntf() is being reverted so that its behaviour remains equivalent to ynl_ntf_check() in the C YNL. Instead a new poll_ntf() will be added in a separate patch. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20241113090843.72917-2-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15Merge branch ↵Jakub Kicinski6-37/+39
'net-phy-switch-eee_broken_modes-to-linkmode-bitmap-and-add-accessor'

Heiner Kallweit says:

====================
net: phy: switch eee_broken_modes to linkmode bitmap and add accessor

eee_broken_modes has a eee_cap1 register layout currently. This doesn't allow to flag e.g. 2.5Gbps or 5Gbps BaseT EEE as broken. To overcome this limitation switch eee_broken_modes to a linkmode bitmap. Add an accessor for the bitmap and use it in r8169.
====================

Link: https://patch.msgid.link/405734c5-0ed4-40e4-9ac9-91084b9536d6@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15r8169: copy vendor driver 2.5G/5G EEE advertisement constraintsHeiner Kallweit2-12/+10
Vendor driver r8125 doesn't advertise 2.5G EEE on RTL8125A, and r8126 doesn't advertise 5G EEE. Likely there are compatibility issues, therefore do the same in r8169. With this change we don't have to disable 2.5G EEE advertisement in rtl8125a_config_eee_phy() any longer. We use new phylib accessor phy_set_eee_broken() to mark the respective EEE modes as broken. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/ce185e10-8a2f-4cf8-a49b-fd8fb3c3c8a1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15net: phy: add phy_set_eee_brokenHeiner Kallweit1-0/+10
Add an accessor for eee_broken_modes, so that drivers don't have to deal with phylib internals. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/0f8ee279-d40d-4489-a3b0-d993472d744a@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15net: phy: convert eee_broken_modes to a linkmode bitmapHeiner Kallweit4-25/+19
eee_broken_modes has a eee_cap1 register layout currently. This doesn't allow to flag e.g. 2.5Gbps or 5Gbps BaseT EEE as broken. To overcome this limitation switch eee_broken_modes to a linkmode bitmap. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/dfe0c9ff-84b0-4328-86d7-e917ebc084a1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski282-1272/+2592
Cross-merge networking fixes after downstream PR (net-6.12-rc8).

Conflicts:

tools/testing/selftests/net/.gitignore
  252e01e68241 ("selftests: net: add netlink-dumps to .gitignore")
  be43a6b23829 ("selftests: ncdevmem: Move ncdevmem under drivers/net/hw")
https://lore.kernel.org/all/20241113122359.1b95180a@canb.auug.org.au/

drivers/net/phy/phylink.c
  671154f174e0 ("net: phylink: ensure PHY momentary link-fails are handled")
  7530ea26c810 ("net: phylink: remove "using_mac_select_pcs"")

Adjacent changes:

drivers/net/ethernet/stmicro/stmmac/dwmac-intel-plat.c
  5b366eae7193 ("stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines")
  e96321fad3ad ("net: ethernet: Switch back to struct platform_driver::remove()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14Merge tag 'net-6.12-rc8' of ↵Linus Torvalds41-109/+543
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bluetooth.

  Quite calm week. No new regression under investigation.

  Current release - regressions:

   - eth: revert "igb: Disable threaded IRQ for igb_msix_other"

  Current release - new code bugs:

   - bluetooth: btintel: direct exception event to bluetooth stack

  Previous releases - regressions:

   - core: fix data-races around sk->sk_forward_alloc
   - netlink: terminate outstanding dump on socket close
   - mptcp: error out earlier on disconnect
   - vsock: fix accept_queue memory leak
   - phylink: ensure PHY momentary link-fails are handled
   - eth: mlx5:
      - fix null-ptr-deref in add rule err flow
      - lock FTE when checking if active
   - eth: dwmac-mediatek: fix inverted handling of mediatek,mac-wol

  Previous releases - always broken:

   - sched: fix u32's systematic failure to free IDR entries for hnodes.
   - sctp: fix possible UAF in sctp_v6_available()
   - eth: bonding: add ns target multicast address to slave device
   - eth: mlx5: fix msix vectors to respect platform limit
   - eth: icssg-prueth: fix 1 PPS sync"

* tag 'net-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits)
  net: sched: u32: Add test case for systematic hnode IDR leaks
  selftests: bonding: add ns multicast group testing
  bonding: add ns target multicast address to slave device
  net: ti: icssg-prueth: Fix 1 PPS sync
  stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines
  net: Make copy_safe_from_sockptr() match documentation
  net: stmmac: dwmac-mediatek: Fix inverted handling of mediatek,mac-wol
  ipmr: Fix access to mfc_cache_list without lock held
  samples: pktgen: correct dev to DEV
  net: phylink: ensure PHY momentary link-fails are handled
  mptcp: pm: use _rcu variant under rcu_read_lock
  mptcp: hold pm lock when deleting entry
  mptcp: update local address flags when setting it
  net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes.
  MAINTAINERS: Re-add cancelled Renesas driver sections
  Revert "igb: Disable threaded IRQ for igb_msix_other"
  Bluetooth: btintel: Direct exception event to bluetooth stack
  Bluetooth: hci_core: Fix calling mgmt_device_connected
  virtio/vsock: Improve MSG_ZEROCOPY error handling
  vsock: Fix sk_error_queue memory leak
  ...
2024-11-14Merge tag 'bcachefs-2024-11-13' of git://evilpiepirate.org/bcachefsLinus Torvalds14-19/+72
Pull bcachefs fixes from Kent Overstreet:
 "This fixes one minor regression from the btree cache fixes (in the scan_for_btree_nodes repair path) - and the shutdown path fix is the big one here, in terms of bugs closed:

   - Assorted tiny syzbot fixes

   - Shutdown path fix: "bch2_btree_write_buffer_flush_going_ro()"

     The shutdown path wasn't flushing the btree write buffer, leading to shutting down while we still had operations in flight. This fixes a whole slew of syzbot bugs, and undoubtedly other strange heisenbugs.

* tag 'bcachefs-2024-11-13' of git://evilpiepirate.org/bcachefs:
  bcachefs: Fix assertion pop in bch2_ptr_swab()
  bcachefs: Fix journal_entry_dev_usage_to_text() overrun
  bcachefs: Allow for unknown key types in backpointers fsck
  bcachefs: Fix assertion pop in topology repair
  bcachefs: Fix hidden btree errors when reading roots
  bcachefs: Fix validate_bset() repair path
  bcachefs: Fix missing validation for bch_backpointer.level
  bcachefs: Fix bch_member.btree_bitmap_shift validation
  bcachefs: bch2_btree_write_buffer_flush_going_ro()
2024-11-14eth: fbnic: Add support to dump registersMohsin Bashir5-1/+186
Add support for the 'ethtool -d <dev>' command to retrieve and print a register dump for fbnic. The dump defaults to version 1 and consists of two parts: all the register sections that can be dumped linearly, and an RPC RAM section that is structured in an interleaved fashion and requires special handling. For each register section, the dump also contains the start and end boundary information which can simplify parsing. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20241112222605.3303211-1-mohsin.bashr@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-14netfilter: nf_tables: allocate element update information dynamicallyFlorian Westphal2-24/+43
Move the timeout/expire/flag members from nft_trans_one_elem struct into a dynamically allocated structure, only needed when timeout update was requested. This halves the size of the nft_trans_one_elem struct and allows compacting up to 124 elements in one transaction container rather than 62. This halves memory requirements for a large flush or insert transaction, where ->update remains NULL. Care has to be taken to release the extra data in all spots, including abort path. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14netfilter: nf_tables: switch trans_elem to real flex arrayFlorian Westphal1-0/+90
When queueing a set element add or removal operation to the transaction log, check if the previous operation already asks for the identical operation on the same set. If so, store the element reference in the preceding operation. This significantly reduces memory consumption when many set add/delete operations appear in a single transaction. Example: 10k elements require 937kb of memory (10k allocations from kmalloc-96 slab). Assuming we can compact 4 elements in the same set, 468 kbytes are needed (64 bytes for base struct, nft_trans_elem, 32 bytes for nft_trans_one_elem structure, so 2500 allocations from kmalloc-192 slab). For large batch updates we can compact up to 62 elements into one single nft_trans_elem structure (~65% mem reduction): (64 bytes for base struct, nft_trans_elem, 32 bytes for nft_trans_one_elem struct). We can halve the size of the nft_trans_one_elem struct by moving timeout/expire/update_flags into a dynamically allocated structure; this allows storing 124 elements in a 2k slab nft_trans_elem struct. This is done in a followup patch. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
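The figures quoted in this message can be re-derived with a few lines of arithmetic. The Python sketch below uses only the sizes stated in the message (96, 64 and 32 bytes, a 2048-byte cap); the variable names are illustrative, and the assumption that every fully compacted container occupies a whole 2048-byte allocation is a simplification that gives an upper bound.

```python
import math

KMALLOC_CAP = 2048   # reallocation cap quoted in the message
BASE = 64            # bytes for the base struct, nft_trans_elem
PER_ELEM = 32        # bytes per nft_trans_one_elem
UNCOMPACTED = 96     # bytes per element before compaction
N = 10_000           # elements in the example transaction


def max_elems(per_elem):
    """Elements that fit in one container under the 2048-byte cap."""
    return (KMALLOC_CAP - BASE) // per_elem


before = N * UNCOMPACTED                    # 960000 bytes, about 937 KiB
four = (N // 4) * (BASE + 4 * PER_ELEM)     # 2500 * 192 = 480000 bytes
# Assumption: each container is charged the full 2048 bytes.
full = math.ceil(N / max_elems(PER_ELEM)) * KMALLOC_CAP

print(max_elems(PER_ELEM), max_elems(PER_ELEM // 2))
print(before / 1024, four / 1024)
print(round(100 * (1 - full / before)))
```

The two capacities come out as 62 and 124, and the reduction for fully compacted containers rounds to 65 percent, matching the message.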
2024-11-14netfilter: nf_tables: prepare nft audit for set element compactionFlorian Westphal1-3/+18
nftables audit log format emits the number of added/deleted rules, sets, set elements and so on, to userspace:

    table=t1 family=2 entries=4 op=nft_register_set
                      ~~~~~~~~~

At this time, the 'entries' key is the number of transactions that will be applied. The upcoming set element compression will coalesce subsequent adds/deletes to the same set requests in the same transaction request to conserve memory. Without this patch, we'd under-report the number of altered elements. Increment the audit counter by the number of elements to keep the reported entries value the same. Without this, nft_audit.sh selftest fails because the recorded (expected) entries key is smaller than the expected one. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14netfilter: nf_tables: prepare for multiple elements in nft_trans_elem structureFlorian Westphal2-76/+173
Add helpers to release the individual elements contained in the trans_elem container structure. No functional change intended. Followup patch will add 'nelems' member and will turn 'priv' into a flexible array. These helpers can then loop over all elements. Care needs to be taken to handle a mix of new elements and existing elements that are being updated (e.g. timeout refresh). Before this patch, NEWSETELEM transaction with update is released early so nft_trans_set_elem_destroy() won't get called, so we need to skip elements marked as update. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14netfilter: nf_tables: add nft_trans_commit_list_add_elem helperFlorian Westphal1-5/+16
Add and use a wrapper to append trans_elem structures to the transaction log. Unlike the existing helper, pass a gfp_t to indicate if sleeping is allowed. This will be used by a followup patch to realloc nft_trans_elem structures after they gain a flexible array member to reduce number of such container structures on the transaction list. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14netfilter: bpf: Pass string literal as format argument of request_module()Simon Horman1-1/+1
Both gcc-14 and clang-18 report that passing a non-string literal as the format argument of request_module() is potentially insecure.

E.g. clang-18 says:

  .../nf_bpf_link.c:46:24: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
     46 | err = request_module(mod);
        |                      ^~~
  .../kmod.h:25:55: note: expanded from macro 'request_module'
     25 | #define request_module(mod...) __request_module(true, mod)
        |                                                       ^~~
  .../nf_bpf_link.c:46:24: note: treat the string as an argument to avoid this
     46 | err = request_module(mod);
        |                      ^
        |                      "%s",
  .../kmod.h:25:55: note: expanded from macro 'request_module'
     25 | #define request_module(mod...) __request_module(true, mod)
        |                                                       ^

It is always the case where the contents of mod is safe to pass as the format argument. That is, in my understanding, it never contains any format escape sequences.

But, it seems better to be safe than sorry. And, as a bonus, compiler output becomes less verbose by addressing this issue as suggested by clang-18.

Compile tested only.

Signed-off-by: Simon Horman <horms@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14netfilter: nfnetlink: Report extack policy errors for batched opsDonald Hunter1-1/+1
The nftables batch processing does not currently populate extack with policy errors. Fix this by passing extack when parsing batch messages. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14net: sched: u32: Add test case for systematic hnode IDR leaksAlexandre Ferrieux1-0/+24
Add a tdc test case to exercise the just-fixed systematic leak of IDR entries in u32 hnode disposal. Given the IDR in question is confined to the range [1..0x7FF], it is sufficient to create/delete the same filter 2048 times to fill it up and get a nonzero exit status from "tc filter add". Signed-off-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20241113100428.360460-1-alexandre.ferrieux@orange.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
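The arithmetic behind "2048 times" can be checked with a toy model. This is a sketch under stated assumptions: `struct id_pool` and its helpers are hypothetical and are not the kernel IDR API; only the range [1..0x7FF] is taken from the commit message.

```c
#include <stdbool.h>
#include <string.h>

#define ID_MIN 1
#define ID_MAX 0x7FF	/* range quoted in the commit message */

/* Hypothetical toy allocator, one flag per ID. */
struct id_pool {
	bool used[ID_MAX + 1];
};

/* Return the lowest free ID in [ID_MIN, ID_MAX], or -1 if none is left. */
static int id_pool_alloc(struct id_pool *p)
{
	for (int id = ID_MIN; id <= ID_MAX; id++) {
		if (!p->used[id]) {
			p->used[id] = true;
			return id;
		}
	}
	return -1;
}

static void id_pool_free(struct id_pool *p, int id)
{
	p->used[id] = false;
}

/* Run up to `rounds` create/delete cycles on a fresh pool. When `leaky`
 * is true, the delete step forgets to release the ID. Returns how many
 * cycles managed to allocate an ID. */
static int run_cycles(struct id_pool *p, int rounds, bool leaky)
{
	int ok = 0;

	memset(p, 0, sizeof(*p));
	for (int i = 0; i < rounds; i++) {
		int id = id_pool_alloc(p);

		if (id < 0)
			break;
		ok++;
		if (!leaky)
			id_pool_free(p, id);
	}
	return ok;
}
```

The range holds 0x7FF = 2047 IDs, so with a leak the 2048th cycle is the first one that cannot allocate, while a correct delete path lets all 2048 cycles succeed.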
2024-11-14Merge branch 'bonding-fix-ns-targets-not-work-on-hardware-nic'Paolo Abeni4-3/+151
Hangbin Liu says: ==================== bonding: fix ns targets not work on hardware NIC The first patch fixes ns targets not working on hardware NICs when bonding sets arp_validate. The second patch adds a related selftest for bonding. v4: Thanks Nikolay for the comments: use bond_slave_ns_maddrs_{add/del} with clear name fix comments typos remove _slave_set_ns_maddrs underscore directly update bond_option_arp_validate_set() change logic v3: use ndisc_mc_map to convert the mcast mac address (Jay Vosburgh) v2: only add/del mcast group on backup slaves when arp_validate is set (Jay Vosburgh) arp_validate doesn't support 3ad, tlb, alb. So let's only do it on ab mode. ==================== Link: https://patch.msgid.link/20241111101650.27685-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-14selftests: bonding: add ns multicast group testingHangbin Liu1-1/+53
Add a test to make sure the backup slaves join the correct multicast group when arp_validate is enabled and ns_ip6_target is set. Here is the result:
TEST: arp_validate (active-backup ns_ip6_target arp_validate 0) [ OK ]
TEST: arp_validate (join mcast group) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 1) [ OK ]
TEST: arp_validate (join mcast group) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 2) [ OK ]
TEST: arp_validate (join mcast group) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 3) [ OK ]
TEST: arp_validate (join mcast group) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 4) [ OK ]
TEST: arp_validate (join mcast group) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 5) [ OK ]
TEST: arp_validate (join mcast group) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 6) [ OK ]
TEST: arp_validate (join mcast group) [ OK ]
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-14bonding: add ns target multicast address to slave deviceHangbin Liu3-2/+98
Commit 4598380f9c54 ("bonding: fix ns validation on backup slaves") tried to resolve the issue where backup slaves couldn't be brought up when receiving IPv6 Neighbor Solicitation (NS) messages. However, this fix only worked for drivers that receive all multicast messages, such as the veth interface. For standard drivers, the NS multicast message is silently dropped because the slave device is not a member of the NS target multicast group. To address this, we need to make the slave device join the NS target multicast group, ensuring it can receive these IPv6 NS messages to validate the slave’s status properly. There are three policies before joining the multicast group: 1. All settings must be under active-backup mode (alb and tlb do not support arp_validate), with backup slaves and slaves supporting multicast. 2. We can add or remove multicast groups when arp_validate changes. 3. Other operations, such as enslaving, releasing, or setting NS targets, need to be guarded by arp_validate. Fixes: 4e24be018eb9 ("bonding: add new parameter ns_targets") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
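For reference, the address arithmetic involved can be sketched in userspace C. This assumes the "NS target multicast group" is the standard solicited-node multicast address of the target (RFC 4291) and that its link-layer address follows the usual IPv6-over-Ethernet mapping (RFC 2464); the function names are hypothetical and this is not the bonding driver's code.

```c
#include <stdint.h>
#include <string.h>

/* Build the solicited-node multicast address ff02::1:ffXX:XXXX from the
 * low 24 bits of a unicast IPv6 address (RFC 4291). */
static void solicited_node_mcast(const uint8_t target[16], uint8_t group[16])
{
	memset(group, 0, 16);
	group[0] = 0xff;
	group[1] = 0x02;
	group[11] = 0x01;
	group[12] = 0xff;
	memcpy(&group[13], &target[13], 3);
}

/* Map an IPv6 multicast address to its Ethernet multicast MAC:
 * 33:33 followed by the last four bytes of the address (RFC 2464). */
static void ipv6_mcast_to_mac(const uint8_t group[16], uint8_t mac[6])
{
	mac[0] = 0x33;
	mac[1] = 0x33;
	memcpy(&mac[2], &group[12], 4);
}
```

For a target of 2001:db8::1234:5678 this gives the group ff02::1:ff34:5678 and the MAC 33:33:ff:34:56:78. A NIC that filters multicast in hardware delivers NS frames for that target only after this MAC has been added to its filter, which is why the slave has to join the group.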
2024-11-14net: ti: icssg-prueth: Fix 1 PPS syncMeghana Malladi2-2/+23
The first PPS latch time needs to be calculated by the driver (in rounded off seconds) and configured as the start time offset for the cycle. After synchronizing two PTP clocks running as master/slave, missing this would cause master and slave to start immediately with some milliseconds drift which causes the PPS signal to never synchronize with the PTP master. Fixes: 186734c15886 ("net: ti: icssg-prueth: add packet timestamping and ptp support") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: MD Danish Anwar <danishanwar@ti.com> Link: https://patch.msgid.link/20241111095842.478833-1-m-malladi@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
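Aligning a start time to a whole second can be sketched as follows. This is an illustration only: `next_second_boundary()` is a hypothetical helper, it assumes "rounded off" means rounding up to the next second, and it does not show how the driver reads the clock or programs the offset.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Round a nanosecond timestamp up to the next whole-second boundary.
 * A value that is already aligned is returned unchanged. */
static uint64_t next_second_boundary(uint64_t now_ns)
{
	uint64_t rem = now_ns % NSEC_PER_SEC;

	return rem ? now_ns + (NSEC_PER_SEC - rem) : now_ns;
}
```

Starting the pulse cycle at such a boundary, rather than at the instant the feature is enabled, removes the sub-second offset the commit message describes.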
2024-11-14Merge branch 'net-dsa-microchip-add-lan9646-switch-support'Jakub Kicinski7-3/+75
Tristram Ha says: ==================== net: dsa: microchip: Add LAN9646 switch support This series of patches is to add LAN9646 switch support to the KSZ DSA driver. ==================== Link: https://patch.msgid.link/20241109015705.82685-1-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14net: dsa: microchip: Add LAN9646 switch support to KSZ DSA driverTristram Ha6-3/+74
The LAN9646 is a 6-port switch with functions similar to the KSZ9897. It has 4 internal PHYs and 1 SGMII port. The chip id read from hardware is the same as that of the KSZ9477, so the driver needs to create a new chip id and group the allowable functions under its chip data structure to differentiate the product. Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Link: https://patch.msgid.link/20241109015705.82685-3-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14dt-bindings: net: dsa: microchip: Add LAN9646 switch supportTristram Ha1-0/+1
The LAN9646 is a 6-port switch with functions similar to the KSZ9897. It has 4 internal PHYs and 1 SGMII port. Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20241109015705.82685-2-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>