summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)AuthorFilesLines
2025-02-26drivers: net: xgene: Don't use "proxy" headersAndy Shevchenko1-2/+8
Update header inclusions to follow IWYU (Include What You Use) principle. In this case replace *gpio.h, which are subject to remove by the GPIOLIB subsystem, with the respective headers that are being used by the driver. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20250224120037.3801609-1-andriy.shevchenko@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-26net: stmmac: dwc-qos: clean up clock initialisationRussell King (Oracle)1-14/+19
Clean up the clock initialisation by providing a helper to find a named clock in the bulk clocks, and provide the name of the stmmac clock in match data so we can locate the stmmac clock in generic code. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/E1tmbhj-004vSz-Pt@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-26net: stmmac: dwc-qos: name struct plat_stmmacenet_data consistentlyRussell King (Oracle)1-16/+16
Most of the stmmac driver uses "plat_dat" to name the platform data, and "data" is used for the glue driver data. make dwc-qos follow this pattern to avoid silly mistakes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/E1tmbhe-004vSt-M3@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-26net: ethernet: ti: am65-cpsw: select PAGE_POOLSascha Hauer1-0/+1
am65-cpsw uses page_pool_dev_alloc_pages(), thus needs PAGE_POOL selected to avoid linker errors. This is missing since the driver started to use page_pool helpers in 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support") Fixes: 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support") Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250224-net-am654-nuss-kconfig-v2-1-c124f4915c92@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-26Octeontx2-af: RPM: Register driver with PCI subsys IDsHariprasad Kelam2-2/+14
Although the PCI device ID and Vendor ID for the RPM (MAC) block have remained the same across Octeon CN10K and the next-generation CN20K silicon, Hardware architecture has changed (NIX mapped RPMs and RFOE Mapped RPMs). Add PCI Subsystem IDs to the device table to ensure that this driver can be probed from NIX mapped RPM devices only. Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://patch.msgid.link/20250224035603.1220913-1-hkelam@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25eth: fbnic: Update return value in kdocMohsin Bashir1-1/+1
Fix return value in kdoc for fbnic_netdev_alloc() Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-25eth: fbnic: Consolidate PUL_USER CSR sectionMohsin Bashir1-38/+35
Move PUL_USER CSRs in the relevant section, update the end boundary address, and remove the redundant definition of end boundary. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-25eth: fbnic: Add PCIe registers dumpMohsin Bashir2-0/+6
Provide coverage to PCIe registers in ethtool register dump Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-25Merge branch 'mlx5-next' into wip/leon-for-nextLeon Romanovsky19-155/+263
This is merge of shared branch between RDMA and net-next trees. * mlx5-next: (550 commits) net/mlx5: Change POOL_NEXT_SIZE define value and make it global net/mlx5: Add new health syndrome error and crr bit offset Linux 6.14-rc3 ... Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-02-25net/mlx5: Use secs_to_jiffies() instead of msecs_to_jiffies()Thorsten Blum1-1/+1
Use secs_to_jiffies() and simplify the code. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Saeed Mahameed <saeed@kernel.org> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Link: https://patch.msgid.link/20250221085350.198024-3-thorsten.blum@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net/mlx5e: Support RX xfrm state selector's UPSPEC for packet offloadJianbo Liu3-2/+242
Previously, the upper layer matches are added for the decryption rule when xfrm selector's UPSPEC is specified in the command. However, it's impossible as packets are not decrypted, and there is no way to do match on the upper protocol (TCP/UDP) with specific source/destination port. The result is that packets are not decrypted by hardware because of this mismatch. Instead, they are forwarded to kernel, and decryption is done by software. To resolve this issue, this patch adds new table (sa_sel) after status table and before policy table. When UPSPEC's proto is specified in xfrm state's selector, a rule is added in status table to forward the decrypted packets to sa_sel table, where the corresponding rule for selector's UPSPEC is added, and packet's upper headers are checked there. If matched, they will be forward to policy table to do policy check. Otherwise, they are dropped immediately. Besides, add a global count for this kind of packet drop. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250220213959.504304-9-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net/mlx5e: Add pass flow group for IPSec RX status tableJianbo Liu1-1/+50
This flow group is added for the pass rules for both crypto offload and packet offload. It is placed at the end of the table, and right before the miss group. There are two entries, and the default pass rules for both offloads are added in this group. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250220213959.504304-8-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net/mlx5e: Add num_reserved_entries param for ipsec_ft_create()Jianbo Liu1-7/+8
Add parameter for ipsec_ft_create() to pass the number of the reserved entries when creating auto-grouped flow table. It's used to create table with pre-defined group(s) which may have more than one rule. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250220213959.504304-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net/mlx5e: Skip IPSec RX policy check for crypto offloadJianbo Liu1-25/+56
For crypto offload, there is no xfrm policy rule offloaded to hardware, so no need to continue with policy check for it. Previously, for crypto offload, the hardware metadata reg c4 is not used and not changed, but set to ASO_OK(0) before decryption to avoid garbage data. Then a default rule is added to check ipsec_syndrome and this register. Packets are forwarded to policy table if succeed, or drop if fails. According to hardware document, this register value could be 0, 1. So a special value (0xAA), which is not used by hardware, is chosen as an indication for crypto offload. It is set to c4 before decryption. Then a default rule, which matches on 0xAA (and ipsec_syndrome on 0), is added, which means packets are done by crypto offload, and sends them to kernel directly, thus skips the policy check. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250220213959.504304-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net/mlx5e: Move IPSec policy check after decryptionJianbo Liu3-56/+145
Currently, xfrm policy check is done before decryption in mlx5 driver. If matching any policy, packets are forwarded to xfrm state table for decryption. But this is exact opposite to what software does. For kernel implementation, xfrm decode is unconditionally activated whenever an IPSec packet reaches the input flow if there’s a matching state rule. This patch changes the order, move policy check after decryption. Besides, a miss flow table is added at the end for legacy mode, to make it easier to update the default destination of the steering rules. So ESP packets are firstly forwarded to SA table for decryption, then the result is checked in status table. If the decryption succeeds, packets are forwarded to another table to check xfrm policy rules. When a policy with allow action is matched, if in legacy mode packets are forwarded to miss flow table with one rule to forward them to RoCE tables, if in switchdev mode they are forwarded directly to TC root chain instead. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250220213959.504304-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net/mlx5e: Add correct match to check IPSec syndromes for switchdev modeJianbo Liu3-7/+39
In commit dddb49b63d86 ("net/mlx5e: Add IPsec and ASO syndromes check in HW"), IPSec and ASO syndromes checks after decryption for the specified ASO object were added. But they are correct only for eswith in legacy mode. For switchdev mode, metadata register c1 is used to save the mapped id (not ASO object id). So, need to change the match accordingly for the check rules in status table. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250220213959.504304-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net/mlx5e: Change the destination of IPSec RX SA miss ruleJianbo Liu1-3/+17
For eswitch in legacy mode, the packets decrypted in RX SA table will continue to be processed for RoCE. But this is not necessary for the un-decrypted packets, which don't match any decryption rules but hit the miss rule at the end of the table. So, change the destination of miss rule to TTC default one and skip RoCE. For eswitch in switchdev mode, the destination is unchanged. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250220213959.504304-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net/mlx5e: Add helper function to update IPSec default destinationJianbo Liu1-4/+10
The default destination of IPSec steering rules for MPV mode will be updated when the master device is brought up or down. Move the common code into the helper function. It’s convenient to update destinations in later patches. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250220213959.504304-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: wangxun: Replace the judgement of MAC type with flagsJiawen Wu4-6/+9
Since device MAC types are constantly being added, the judgments of wx->mac.type are complex. Try to convert the types to flags depending on functions. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/20250221065718.197544-2-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: txgbe: Add basic support for new AML devicesJiawen Wu9-54/+333
There is a new 40/25/10 Gigabit Ethernet device. To support basic functions, PHYLINK is temporarily skipped as it is intended to implement these configurations in the firmware. And the associated link IRQ is also skipped. And Implement the new SW-FW interaction interface, which use 64 Byte message buffer. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/20250221065718.197544-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: ethernet: renesas: rcar_gen4_ptp: Remove bool conversionThorsten Blum1-1/+1
Remove the unnecessary bool conversion and simplify the code. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Link: https://patch.msgid.link/20250223233613.100518-2-thorsten.blum@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: stmmac: thead: ensure divisor gives proper rateRussell King (Oracle)1-8/+4
thead was checking that the stmmac_clk rate was a multiple of the RGMII rates for 1G and 100M, but didn't check for 10M. Rather than use this with hard-coded speeds, check that the calculated divisor gives the required rate by multplying the transmit clock rate back up to the stmmac clock rate and checking that it agrees. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Drew Fustini <drew@pdp7.com> Link: https://patch.msgid.link/E1tlToD-004W3g-HB@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: stmmac: thead: use rgmii_clock() for RGMII clock rateRussell King (Oracle)1-11/+5
Switch to using rgmii_clock() to get the RGMII TXC rate, and calculate the divisor from the parent clock rate and the rate indicated by rgmii_clock(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Drew Fustini <drew@pdp7.com> Link: https://patch.msgid.link/E1tlTo8-004W3a-CO@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: stmmac: Correct usage of maximum queue number macrosKunihiko Hayashi2-6/+5
The maximum numbers of each Rx and Tx queues are defined by MTL_MAX_RX_QUEUES and MTL_MAX_TX_QUEUES respectively. There are some places where Rx and Tx are used in reverse. There is no issue when the Tx and Rx macros have the same value, but should correct usage of macros for maximum queue number to keep consistency and prevent unexpected mistakes. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Huacai Chen <chenhuacai@kernel.org> Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Link: https://patch.msgid.link/20250221051818.4163678-1-hayashi.kunihiko@socionext.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25ibmvnic: simplify ibmvnic_set_queue_affinity()Yury Norov1-7/+11
A loop based on cpumask_next_wrap() opencodes the dedicated macro for_each_online_cpu_wrap(). Use it as it improves readability and simplifies maintenance. This also helps to drop cpumask handling code in the caller function. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Nick Child <nnac123@linux.ibm.com>
2025-02-24net: stmmac: qcom-ethqos: use rgmii_clock() to set the link clockRussell King (Oracle)1-18/+5
The link clock operates at twice the RGMII clock rate. Therefore, we can use the rgmii_clock() helper to set this clock rate. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vinod Koul <vkoul@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1tlRMK-004Vsx-Ss@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-23net/mlx5: Change POOL_NEXT_SIZE define value and make it globalPatrisious Haddad4-6/+7
Change POOL_NEXT_SIZE define value from 0 to BIT(30), since this define is used to request the available maximum sized flow table, and zero doesn't make sense for it, whereas some places in the driver use zero explicitly expecting the smallest table size possible but instead due to this define they end up allocating the biggest table size unawarely. In addition move the definition to "include/linux/mlx5/fs.h" to expose the define to IB driver as well, while appropriately renaming it. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219085808.349923-3-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-02-22net: cadence: macb: Implement BQLSean Anderson1-2/+18
Implement byte queue limits to allow queuing disciplines to account for packets enqueued in the ring buffer but not yet transmitted. There are a separate set of transmit functions for AT91 that I haven't touched since I don't have hardware to test on. Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20250220164257.96859-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-22net: cadence: macb: Synchronize stats calculationsSean Anderson2-2/+12
Stats calculations involve a RMW to add the stat update to the existing value. This is currently not protected by any synchronization mechanism, so data races are possible. Add a spinlock to protect the update. The reader side could be protected using u64_stats, but we would still need a spinlock for the update side anyway. And we always do an update immediately before reading the stats anyway. Fixes: 89e5785fc8a6 ("[PATCH] Atmel MACB ethernet driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20250220162950.95941-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-22net: stmmac: print stmmac_init_dma_engine() errors using netdev_err()Russell King (Oracle)1-2/+2
stmmac_init_dma_engine() uses dev_err() which leads to errors being reported as e.g: dwc-eth-dwmac 2490000.ethernet: Failed to reset the dma dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed stmmac_init_dma_engine() is only called from stmmac_hw_setup() which itself uses netdev_err(), and we will have a net_device setup. So, change the dev_err() to netdev_err() to give consistent error messages. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tl5y1-004UgG-8X@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-22Merge tag 'for-netdev' of ↵Jakub Kicinski4-34/+125
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-02-20 We've added 19 non-merge commits during the last 8 day(s) which contain a total of 35 files changed, 1126 insertions(+), 53 deletions(-). The main changes are: 1) Add TCP_RTO_MAX_MS support to bpf_set/getsockopt, from Jason Xing 2) Add network TX timestamping support to BPF sock_ops, from Jason Xing 3) Add TX metadata Launch Time support, from Song Yoong Siang * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: igc: Add launch time support to XDP ZC igc: Refactor empty frame insertion for launch time support net: stmmac: Add launch time support to XDP ZC selftests/bpf: Add launch time request to xdp_hw_metadata xsk: Add launch time hardware offload support to XDP Tx metadata selftests/bpf: Add simple bpf tests in the tx path for timestamping feature bpf: Support selective sampling for bpf timestamping bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback net-timestamp: Prepare for isolating two modes of SO_TIMESTAMPING bpf: Disable unsafe helpers in TX timestamping callbacks bpf: Prevent unsafe access to the sock fields in the BPF timestamping callback bpf: Prepare the sock_ops ctx and call bpf prog for TX timestamping bpf: Add networking timestamping support to bpf_get/setsockopt() selftests/bpf: Add rto max for bpf_setsockopt test bpf: Support TCP_RTO_MAX_MS for bpf_setsockopt ==================== Link: https://patch.msgid.link/20250221022104.386462-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-22gve: Add RSS cache for non RSS device option scenarioZiwei Xiao4-23/+205
Not all the devices have the capability for the driver to query for the registered RSS configuration. The driver can discover this by checking the relevant device option during setup. If it cannot, the driver needs to store the RSS config cache and directly return such cache when queried by the ethtool. RSS config is inited when driver probes. Also the default RSS config will be adjusted when there is RX queue count change. At this point, only keys of GVE_RSS_KEY_SIZE and indirection tables of GVE_RSS_INDIR_SIZE are supported. Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://patch.msgid.link/20250219200451.3348166-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-22net: Use link/peer netns in newlink() of rtnl_link_opsXiao Liang1-2/+2
Add two helper functions - rtnl_newlink_link_net() and rtnl_newlink_peer_net() for netns fallback logic. Peer netns falls back to link netns, and link netns falls back to source netns. Convert the use of params->net in netdevice drivers to one of the helper functions for clarity. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-4-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-22rtnetlink: Pack newlink() params into structXiao Liang1-2/+5
There are 4 net namespaces involved when creating links: - source netns - where the netlink socket resides, - target netns - where to put the device being created, - link netns - netns associated with the device (backend), - peer netns - netns of peer device. Currently, two nets are passed to newlink() callback - "src_net" parameter and "dev_net" (implicitly in net_device). They are set as follows, depending on netlink attributes in the request. +------------+-------------------+---------+---------+ | peer netns | IFLA_LINK_NETNSID | src_net | dev_net | +------------+-------------------+---------+---------+ | | absent | source | target | | absent +-------------------+---------+---------+ | | present | link | link | +------------+-------------------+---------+---------+ | | absent | peer | target | | present +-------------------+---------+---------+ | | present | peer | link | +------------+-------------------+---------+---------+ When IFLA_LINK_NETNSID is present, the device is created in link netns first and then moved to target netns. This has some side effects, including extra ifindex allocation, ifname validation and link events. These could be avoided if we create it in target netns from the beginning. On the other hand, the meaning of src_net parameter is ambiguous. It varies depending on how parameters are passed. It is the effective link (or peer netns) by design, but some drivers ignore it and use dev_net instead. To provide more netns context for drivers, this patch packs existing newlink() parameters, along with the source netns, link netns and peer netns, into a struct. The old "src_net" is renamed to "net" to avoid confusion with real source netns, and will be deprecated later. The use of src_net are converted to params->net trivially. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-3-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21xfrm: provide common xdo_dev_offload_ok callback implementationLeon Romanovsky7-121/+0
Almost all drivers except bond and nsim had same check if device can perform XFRM offload on that specific packet. The check was that packet doesn't have IPv4 options and IPv6 extensions. In NIC drivers, the IPv4 HELEN comparison was slightly different, but the intent was to check for the same conditions. So let's chose more strict variant as a common base. Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-02-21octeontx2: hide unused labelArnd Bergmann2-0/+4
A previous patch introduces a build-time warning when CONFIG_DCB is disabled: drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c: In function 'otx2_probe': drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c:3217:1: error: label 'err_free_zc_bmap' defined but not used [-Werror=unused-label] drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c: In function 'otx2vf_probe': drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c:740:1: error: label 'err_free_zc_bmap' defined but not used [-Werror=unused-label] Add the same #ifdef check around it. Fixes: efabce290151 ("octeontx2-pf: AF_XDP zero copy receive support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Suman Ghosh <sumang@marvell.com> Link: https://patch.msgid.link/20250219162239.1376865-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net/mlx5e: Separate extended link modes request from link modes type selectionShahar Shitrit1-9/+4
The function ext_requested() serves two distinct purposes: it checks if extended link modes were requested, and it selects whether to use extended or legacy link modes. This change separates these two purposes. Now, ext_link_mode_requested() is used directly for checking if extended link modes are requested, while the selection of extended modes is handled independently based on the autonegotiation status. By making this distinction, the logic for determining whether to select extended or legacy link modes is clearer. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219114112.403808-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net/mlx5e: Change eth_proto parameter namingShahar Shitrit1-2/+2
eth_proto_cap parameter represents the supported link modes, while eth_proto_admin refers to the configured ones. The function get_advertising() retrieves the configured link modes, thus we update its parameter name to eth_proto_admin. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219114112.403808-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net/mlx5e: Introduce ptys2ethtool_process_link()Shahar Shitrit1-25/+10
The functions ptys2ethtool_supported_link(), ptys2ethtool_adver_link() share the same code, thus, in order to remove code duplication we introduce a new function ptys2ethtool_process_link() to handle the processing of both supported and advertised link modes. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219114112.403808-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net/mlx5e: Refactor ptys2ethtool_adver_link()Shahar Shitrit1-12/+8
The function ptys2ethtool_adver_link() contains duplicated code that is found in mlx5e_ethtool_get_speed_arr(). To eliminate this redundancy, we update mlx5e_ethtool_get_speed_arr() to select the appropriate table based on the ext argument passed by the caller, rather than querying the supported mode locally. This allows us to replace the current logic in ptys2ethtool_adver_link() with a call to mlx5e_ethtool_get_speed_arr(). This adjustment aligns with the ptys2ethtool_supported_link() function and prepares for an upcoming patch that reduces code duplication. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219114112.403808-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net/mlx5: Bridge, correct config option descriptionCosmin Ratiu1-2/+2
The implication of the previous help text was that without this option enabled, representor devices couldn't be added to a bridge device, while in fact that was possible, just that rules didn't get offloaded to hw. This commit clarifies the help text. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219114112.403808-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: stmmac: dwmac-loongson: Add fix_soc_reset() callbackQunqin Zhao1-0/+14
Loongson's DWMAC device may take nearly two seconds to complete DMA reset, however, the default waiting time for reset is 200 milliseconds. Therefore, the following error message may appear: [14.427169] dwmac-loongson-pci 0000:00:03.2: Failed to reset the dma Fixes: 803fc61df261 ("net: stmmac: dwmac-loongson: Add Loongson Multi-channels GMAC support") Cc: stable@vger.kernel.org Signed-off-by: Qunqin Zhao <zhaoqunqin@loongson.cn> Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Yanteng Si <si.yanteng@linux.dev> Link: https://patch.msgid.link/20250219020701.15139-1-zhaoqunqin@loongson.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21igc: Add launch time support to XDP ZCSong Yoong Siang2-2/+60
Enable Launch Time Control (LTC) support for XDP zero copy via XDP Tx metadata framework. This patch has been tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel I225-LM Ethernet controller. Below are the test steps and result. Test 1: Send a single packet with the launch time set to 1 s in the future. Test steps: 1. On the DUT, start the xdp_hw_metadata selftest application: $ sudo ./xdp_hw_metadata enp2s0 -l 1000000000 -L 1 2. On the Link Partner, send a UDP packet with VLAN priority 1 to port 9091 of the DUT. Result: When the launch time is set to 1 s in the future, the delta between the launch time and the transmit hardware timestamp is 0.016 us, as shown in printout of the xdp_hw_metadata application below. 0x562ff5dc8880: rx_desc[4]->addr=84110 addr=84110 comp_addr=84110 EoP rx_hash: 0xE343384 with RSS type:0x1 HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to User RX-time sec:0.0002 (183.103 usec) XDP RX-time: 1734578015467651698 (sec:1734578015.4677) delta to User RX-time sec:0.0001 (80.309 usec) No rx_vlan_tci or rx_vlan_proto, err=-95 0x562ff5dc8880: ping-pong with csum=561c (want c7dd) csum_start=34 csum_offset=6 HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to HW Launch-time sec:1.0000 (1000000.000 usec) 0x562ff5dc8880: complete tx idx=4 addr=4018 HW Launch-time: 1734578016467548904 (sec:1734578016.4675) delta to HW TX-complete-time sec:0.0000 (0.016 usec) HW TX-complete-time: 1734578016467548920 (sec:1734578016.4675) delta to User TX-complete-time sec:0.0000 (32.546 usec) XDP RX-time: 1734578015467651698 (sec:1734578015.4677) delta to User TX-complete-time sec:0.9999 (999929.768 usec) HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to HW TX-complete-time sec:1.0000 (1000000.016 usec) 0x562ff5dc8880: complete rx idx=132 addr=84110 Test 2: Send 1000 packets with a 10 ms interval and the launch time set to 500 us in the future. Test steps: 1. On the DUT, start the xdp_hw_metadata selftest application: $ sudo chrt -f 99 ./xdp_hw_metadata enp2s0 -l 500000 -L 1 > \ /dev/shm/result.log 2. On the Link Partner, send 1000 UDP packets with a 10 ms interval and VLAN priority 1 to port 9091 of the DUT. Result: When the launch time is set to 500 us in the future, the average delta between the launch time and the transmit hardware timestamp is 0.016 us, as shown in the analysis of /dev/shm/result.log below. The XDP launch time works correctly in sending 1000 packets continuously. Min delta: 0.005 us Avr delta: 0.016 us Max delta: 0.031 us Total packets forwarded: 1000 Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20250216093430.957880-6-yoong.siang.song@intel.com
2025-02-21igc: Refactor empty frame insertion for launch time supportSong Yoong Siang1-32/+50
Refactor the code for inserting an empty frame into a new function igc_insert_empty_frame(). This change extracts the logic for inserting an empty packet from igc_xmit_frame_ring() into a separate function, allowing it to be reused in future implementations, such as the XDP zero copy transmit function. Remove the igc_desc_unused() checking in igc_init_tx_empty_descriptor() because the number of descriptors needed is guaranteed. Ensure that skb allocation and DMA mapping work for the empty frame, before proceeding to fill in igc_tx_buffer info, context descriptor, and data descriptor. Rate limit the error messages for skb allocation and DMA mapping failures. Update the comment to indicate that the 2 descriptors needed by the empty frame are already taken into consideration in igc_xmit_frame_ring(). Handle the case where the insertion of an empty frame fails and explain the reason behind this handling. Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20250216093430.957880-5-yoong.siang.song@intel.com
2025-02-21net: stmmac: Add launch time support to XDP ZCSong Yoong Siang2-0/+15
Enable launch time (Time-Based Scheduling) support for XDP zero copy via the XDP Tx metadata framework. This patch has been tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel Tiger Lake platform. Below are the test steps and result. Test 1: Send a single packet with the launch time set to 1 s in the future. Test steps: 1. On the DUT, start the xdp_hw_metadata selftest application: $ sudo ./xdp_hw_metadata enp0s30f4 -l 1000000000 -L 1 2. On the Link Partner, send a UDP packet with VLAN priority 1 to port 9091 of the DUT. Result: When the launch time is set to 1 s in the future, the delta between the launch time and the transmit hardware timestamp is 16.963 us, as shown in printout of the xdp_hw_metadata application below. 0x55b5864717a8: rx_desc[4]->addr=88100 addr=88100 comp_addr=88100 EoP No rx_hash, err=-95 HW RX-time: 1734579065767717328 (sec:1734579065.7677) delta to User RX-time sec:0.0004 (375.624 usec) XDP RX-time: 1734579065768004454 (sec:1734579065.7680) delta to User RX-time sec:0.0001 (88.498 usec) No rx_vlan_tci or rx_vlan_proto, err=-95 0x55b5864717a8: ping-pong with csum=5619 (want 0000) csum_start=34 csum_offset=6 HW RX-time: 1734579065767717328 (sec:1734579065.7677) delta to HW Launch-time sec:1.0000 (1000000.000 usec) 0x55b5864717a8: complete tx idx=4 addr=4018 HW Launch-time: 1734579066767717328 (sec:1734579066.7677) delta to HW TX-complete-time sec:0.0000 (16.963 usec) HW TX-complete-time: 1734579066767734291 (sec:1734579066.7677) delta to User TX-complete-time sec:0.0001 (130.408 usec) XDP RX-time: 1734579065768004454 (sec:1734579065.7680) delta to User TX-complete-time sec:0.9999 (999860.245 usec) HW RX-time: 1734579065767717328 (sec:1734579065.7677) delta to HW TX-complete-time sec:1.0000 (1000016.963 usec) 0x55b5864717a8: complete rx idx=132 addr=88100 Test 2: Send 1000 packets with a 10 ms interval and the launch time set to 500 us in the future. Test steps: 1. On the DUT, start the xdp_hw_metadata selftest application: $ sudo chrt -f 99 ./xdp_hw_metadata enp0s30f4 -l 500000 -L 1 > \ /dev/shm/result.log 2. On the Link Partner, send 1000 UDP packets with a 10 ms interval and VLAN priority 1 to port 9091 of the DUT. Result: When the launch time is set to 500 us in the future, the average delta between the launch time and the transmit hardware timestamp is 13.854 us, as shown in the analysis of /dev/shm/result.log below. The XDP launch time works correctly in sending 1000 packets continuously. Min delta: 08.410 us Avr delta: 13.854 us Max delta: 17.076 us Total packets forwarded: 1000 Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Link: https://patch.msgid.link/20250216093430.957880-4-yoong.siang.song@intel.com
2025-02-21eth: fbnic: Add ethtool support for IRQ coalescingMohsin Bashir6-8/+121
Add ethtool support to configure the IRQ coalescing behavior. Support separate timers for Rx and Tx for time based coalescing. For frame based configuration, currently we only support the Rx side. The hardware allows configuration of descriptor count instead of frame count requiring conversion between the two. We assume 2 descriptors per frame, one for the metadata and one for the data segment. When rx-frames are not configured, we set the RX descriptor count to half the ring size as a fail safe. Default configuration: ethtool -c eth0 | grep -E "rx-usecs:|tx-usecs:|rx-frames:" rx-usecs: 30 rx-frames: 0 tx-usecs: 35 IRQ rate test: With single iperf flow we monitor IRQ rate while changing the tx-usesc and rx-usecs to high and low values. ethtool -C eth0 rx-frames 8192 rx-usecs 150 tx-usecs 150 irq/sec 13k irq/sec 14k irq/sec 14k ethtool -C eth0 rx-frames 8192 rx-usecs 10 tx-usecs 10 irq/sec 27k irq/sec 28k irq/sec 28k Validating the use of extack: ethtool -C eth0 rx-frames 16384 netlink error: fbnic: rx_frames is above device max netlink error: Invalid argument Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20250218023520.2038010-1-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: ngbe: Add support for 1PPS and TODJiawen Wu7-4/+250
Implement support for generating a 1pps output signal on SDP0. And support custom firmware to output TOD. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/20250218023432.146536-5-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: wangxun: Add periodic checks for overflow and errorsJiawen Wu4-0/+109
Implement watchdog task to detect SYSTIME overflow and error cases of Rx/Tx timestamp. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20250218023432.146536-4-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: wangxun: Support to get ts infoJiawen Wu4-0/+58
Implement the function get_ts_info and get_ts_stats in ethtool_ops to get the HW capabilities and statistics for timestamping. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20250218023432.146536-3-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: wangxun: Add support for PTP clockJiawen Wu11-6/+780
Implement support for PTP clock on Wangxun NICs. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20250218023432.146536-2-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>