summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)AuthorFilesLines
2025-02-15bnxt_en: Refactor TX ring free logicMichael Chan1-14/+19
Add a new bnxt_hwrm_tx_ring_free() function to handle freeing a HW transmit ring. The new function will also be used in the next patch to free the TX ring in queue_stop. Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250213011240.1640031-10-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15bnxt_en: Reallocate RX completion ring for TPH supportSomnath Kotur2-6/+34
In order to program the correct Steering Tag during an IRQ affinity change, we need to free/re-allocate the RX completion ring during queue_restart. If TPH is enabled, call FW to free the Rx completion ring and clear the ring entries in queue_stop(). Re-allocate it in queue_start() if TPH is enabled. Note that TPH mode is not enabled in this patch and will be enabled later in the patch series. While modifying bnxt_queue_start(), remove the unnecessary zeroing of rxr->rx_next_cons. It gets overwritten by the clone in bnxt_queue_start(). Remove the rx_reset counter increment since restart is not reset. Add comment to clarify that the ring allocations in queue_start should never fail. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250213011240.1640031-9-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG ringsMichael Chan1-1/+3
Newer firmware can use the NQ ring ID associated with each RX/RX AGG ring to enable PCIe Steering Tags on P5_PLUS chips. When allocating RX/RX AGG rings, pass along NQ ring ID for the firmware to use. This information helps optimize DMA writes by directing them to the cache closer to the CPU consuming the data, potentially improving the processing speed. This change is backward-compatible with older firmware, which will simply disregard the information. Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250213011240.1640031-8-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15bnxt_en: Refactor RX/RX AGG ring parameters setup for P5_PLUSMichael Chan1-30/+28
There is some common code for setting up RX and RX AGG ring allocation parameters for P5_PLUS chips. Refactor the logic into a new function. Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250213011240.1640031-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15bnxt_en: Refactor bnxt_free_tx_rings() to free per TX ringSomnath Kotur1-54/+61
Modify bnxt_free_tx_rings() to free the skbs per TX ring. This will be useful later in the series. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250213011240.1640031-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15bnxt_en: Refactor completion ring free routineSomnath Kotur1-10/+16
Add a wrapper routine to free L2 completion rings. This will be useful later in the series. Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250213011240.1640031-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15bnxt_en: Refactor TX ring allocation logicMichael Chan1-7/+15
Add a new bnxt_hwrm_tx_ring_alloc() function to handle allocating a transmit ring. This will be useful later in the series. Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250213011240.1640031-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15bnxt_en: Refactor completion ring allocation logic for P5_PLUS chipsMichael Chan1-23/+21
Add a new bnxt_hwrm_cp_ring_alloc_p5() function to handle allocating one completion ring on P5_PLUS chips. This simplifies the existing code and will be useful later in the series. Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250213011240.1640031-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15bnxt_en: Set NPAR 1.2 support when registering with firmwareMichael Chan2-0/+6
NPAR (Network interface card partitioning)[1] 1.2 adds a transparent VLAN tag for all packets between the NIC and the switch. Because of that, RX VLAN acceleration cannot be supported for any additional host configured VLANs. The driver has to acknowledge that it can support no RX VLAN acceleration and set the NPAR 1.2 supported flag when registering with the FW. Otherwise, the FW call will fail and the driver will abort on these NPAR 1.2 NICs with this error: bnxt_en 0000:26:00.0 (unnamed net_device) (uninitialized): hwrm req_type 0x1d seq id 0xb error 0x2 [1] https://techdocs.broadcom.com/us/en/storage-and-ethernet-connectivity/ethernet-nic-controllers/bcm957xxx/adapters/introduction/features/network-partitioning-npar.html Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250213011240.1640031-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15net/mlx4_core: Avoid impossible mlx4_db_alloc() order valueKees Cook1-3/+3
GCC can see that the value range for "order" is capped, but this leads it to consider that it might be negative, leading to a false positive warning (with GCC 15 with -Warray-bounds -fdiagnostics-details): ../drivers/net/ethernet/mellanox/mlx4/alloc.c:691:47: error: array subscript -1 is below array bounds of 'long unsigned int *[2]' [-Werror=array-bounds=] 691 | i = find_first_bit(pgdir->bits[o], MLX4_DB_PER_PAGE >> o); | ~~~~~~~~~~~^~~ 'mlx4_alloc_db_from_pgdir': events 1-2 691 | i = find_first_bit(pgdir->bits[o], MLX4_DB_PER_PAGE >> o); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (2) out of array bounds here | (1) when the condition is evaluated to true In file included from ../drivers/net/ethernet/mellanox/mlx4/mlx4.h:53, from ../drivers/net/ethernet/mellanox/mlx4/alloc.c:42: ../include/linux/mlx4/device.h:664:33: note: while referencing 'bits' 664 | unsigned long *bits[2]; | ^~~~ Switch the argument to unsigned int, which removes the compiler needing to consider negative values. Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20250210174504.work.075-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15ice: Fix signedness bug in ice_init_interrupt_scheme()Dan Carpenter1-2/+2
If pci_alloc_irq_vectors() can't allocate the minimum number of vectors then it returns -ENOSPC so there is no need to check for that in the caller. In fact, because pf->msix.min is an unsigned int, it means that any negative error codes are type promoted to high positive values and treated as success. So here, the "return -ENOMEM;" is unreachable code. Check for negatives instead. Now that we're only dealing with error codes, it's easier to propagate the error code from pci_alloc_irq_vectors() instead of hardcoding -ENOMEM. Fixes: 79d97b8cf9a8 ("ice: remove splitting MSI-X between features") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/b16e4f01-4c85-46e2-b602-fce529293559@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15net: remove phylink_pcs .neg_mode booleanRussell King (Oracle)9-11/+0
As all PCS are using the neg_mode parameter rather than the legacy an_mode, remove the ability to use the legacy an_mode. We remove the tests in the phylink code, unconditionally passing the PCS neg_mode parameter to PCS methods, and remove setting the flag from drivers. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tidPn-0040hd-2R@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15r8169: add PHY c45 ops for MDIO_MMD_VENDOR2 registersHeiner Kallweit1-0/+32
The integrated PHYs on chip versions from RTL8168g allow to address MDIO_MMD_VEND2 registers. All c22 standard registers are mapped to MDIO_MMD_VEND2 registers. So far the paging mechanism is used to address PHY registers. Add support for c45 ops to address MDIO_MMD_VEND2 registers directly, w/o the paging. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/d6f97eaa-0f13-468f-89cb-75a41087bc4a@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15net: stmmac: remove calls to xpcs_config_eee()Russell King (Oracle)1-7/+0
Remove the explicit calls to xpcs_config_eee() from the stmmac driver, preferring instead for phylink to manage the EEE configuration at the PCS. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1thRQO-003w7I-Ap@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15net: stmmac: call xpcs_config_eee_mult_fact()Russell King (Oracle)1-0/+2
Arrange for stmmac to call the new xpcs_config_eee_mult_fact() function to configure the EEE clock multiplying factor. This will allow the removal of the xpcs_config_eee() calls in the next patch. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1thRQE-003w76-3C@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15net: stmmac: refactor clock management in EQoS driverSwathi K S1-99/+32
Refactor clock management in EQoS driver for code reuse and to avoid redundancy. This way, only minimal changes are required when a new platform is added. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Swathi K S <swathi.ks@samsung.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250213041559.106111-1-swathi.ks@samsung.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-15net: airoha: Fix TSO support for header cloned skbsLorenzo Bianconi1-5/+5
For GSO packets, skb_cow_head() will reallocate the skb for TSO header cloned skbs in airoha_dev_xmit(). For this reason, sinfo pointer can be no more valid. Fix the issue relying on skb_shinfo() macro directly in airoha_dev_xmit(). The problem exists since commit 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC") but it is not a user visible, since we can't currently enable TSO for DSA user ports since we are missing to initialize net_device vlan_features field. Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250213-airoha-en7581-flowtable-offload-v4-1-b69ca16d74db@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-14iavf: add support for Rx timestamps to hotpathJacob Keller5-0/+125
Add support for receive timestamps to the Rx hotpath. This support only works when using the flexible descriptor format, so make sure that we request this format by default if we have receive timestamp support available in the PTP capabilities. In order to report the timestamps to userspace, we need to perform timestamp extension. The Rx descriptor does actually contain the "40 bit" timestamp. However, upper 32 bits which contain nanoseconds are conveniently stored separately in the descriptor. We could extract the 32bits and lower 8 bits, then perform a bitwise OR to calculate the 40bit value. This makes no sense, because the timestamp extension algorithm would simply discard the lower 8 bits anyways. Thus, implement timestamp extension as iavf_ptp_extend_32b_timestamp(), and extract and forward only the 32bits of nominal nanoseconds. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Sunil Goutham <sgoutham@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14iavf: handle set and get timestamps opsJacob Keller4-0/+131
Add handlers for the .ndo_hwtstamp_get and .ndo_hwtstamp_set ops which allow userspace to request timestamp enablement for the device. This support allows standard Linux applications to request the timestamping desired. As with other devices that support timestamping all packets, the driver will upgrade any request for timestamping of a specific type of packet to HWTSTAMP_FILTER_ALL. The current configuration is stored, so that it can be retrieved by calling .ndo_hwtstamp_get The Tx timestamps are not implemented yet so calling set ops for Tx path will end with EOPNOTSUPP error code. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14iavf: Implement checking DD desc fieldMateusz Polchlopek3-29/+43
Rx timestamping introduced in PF driver caused the need of refactoring the VF driver mechanism to check packet fields. The function to check errors in descriptor has been removed and from now only previously set struct fields are being checked. The field DD (descriptor done) needs to be checked at the very beginning, before extracting other fields. Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14iavf: refactor iavf_clean_rx_irq to support legacy and flex descriptorsJacob Keller2-154/+327
Using VIRTCHNL_VF_OFFLOAD_FLEX_DESC, the iAVF driver is capable of negotiating to enable the advanced flexible descriptor layout. Add the flexible NIC layout (RXDID=2) as a member of the Rx descriptor union. Also add bit position definitions for the status and error indications that are needed. The iavf_clean_rx_irq function needs to extract a few fields from the Rx descriptor, including the size, rx_ptype, and vlan_tag. Move the extraction to a separate function that decodes the fields into a structure. This will reduce the burden for handling multiple descriptor types by keeping the relevant extraction logic in one place. To support handling an additional descriptor format with minimal code duplication, refactor Rx checksum handling so that the general logic is separated from the bit calculations. Introduce an iavf_rx_desc_decoded structure which holds the relevant bits decoded from the Rx descriptor. This will enable implementing flexible descriptor handling without duplicating the general logic twice. Introduce an iavf_extract_flex_rx_fields, iavf_flex_rx_hash, and iavf_flex_rx_csum functions which operate on the flexible NIC descriptor format instead of the legacy 32 byte format. Based on the negotiated RXDID, select the correct function for processing the Rx descriptors. With this change, the Rx hot path should be functional when using either the default legacy 32byte format or when we switch to the flexible NIC layout. Modify the Rx hot path to add support for the flexible descriptor format and add request enabling Rx timestamps for all queues. As in ice, make sure we bump the checksum level if the hardware detected a packet type which could have an outer checksum. This is important because hardware only verifies the inner checksum. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14iavf: define Rx descriptors as qwordsMateusz Polchlopek5-111/+77
The union iavf_32byte_rx_desc consists of two unnamed structs defined inside. One of them represents legacy 32 byte descriptor and second the 16 byte descriptor (extended to 32 byte). Each of them consists of bunch of unions, structs and __le fields that represent specific fields in descriptor. This commit changes the representation of iavf_32byte_rx_desc union to store four __le64 fields (qw0, qw1, qw2, qw3) that represent quad-words. Those quad-words will be then accessed by calling leXY_get_bits macros in upcoming commits. Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14libeth: move idpf_rx_csum_decoded and idpf_rx_extractedMateusz Polchlopek3-51/+35
Structs idpf_rx_csum_decoded and idpf_rx_extracted are used both in idpf and iavf Intel drivers. Change the prefix from idpf_* to libeth_* and move mentioned structs to libeth's rx.h header file. Adjust usage in idpf driver. Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14iavf: periodically cache PHC timeJacob Keller1-0/+53
The Rx timestamps reported by hardware may only have 32 bits of storage for nanosecond time. These timestamps cannot be directly reported to the Linux stack, as it expects 64bits of time. To handle this, the timestamps must be extended using an algorithm that calculates the corrected 64bit timestamp by comparison between the PHC time and the timestamp. This algorithm requires the PHC time to be captured within ~2 seconds of when the timestamp was captured. Instead of trying to read the PHC time in the Rx hotpath, the algorithm relies on a cached value that is periodically updated. Keep this cached time up to date by using the PTP .do_aux_work kthread function. The iavf_ptp_do_aux_work will reschedule itself about twice a second, and will check whether or not the cached PTP time needs to be updated. If so, it issues a VIRTCHNL_OP_1588_PTP_GET_TIME to request the time from the PF. The jitter and latency involved with this command aren't important, because the cached time just needs to be kept up to date within about ~2 seconds. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14iavf: add support for indirect access to PHC timeJacob Keller4-2/+257
Implement support for reading the PHC time indirectly via the VIRTCHNL_OP_1588_PTP_GET_TIME operation. Based on some simple tests with ftrace, the latency of the indirect clock access appears to be about ~110 microseconds. This is due to the cost of preparing a message to send over the virtchnl queue. This is expected, due to the increased jitter caused by sending messages over virtchnl. It is not easy to control the precise time that the message is sent by the VF, or the time that the message is responded to by the PF, or the time that the message sent from the PF is received by the VF. For sending the request, note that many PTP related operations will require sending of VIRTCHNL messages. Instead of adding a separate AQ flag and storage for each operation, setup a simple queue mechanism for queuing up virtchnl messages. Each message will be converted to a iavf_ptp_aq_cmd structure which ends with a flexible array member. A single AQ flag is added for processing messages from this queue. In principle this could be extended to handle arbitrary virtchnl messages. For now it is kept to PTP-specific as the need is primarily for handling PTP-related commands. Use this to implement .gettimex64 using the indirect method via the virtchnl command. The response from the PF is processed and stored into the cached_phc_time. A wait queue is used to allow the PTP clock gettime request to sleep until the message is sent from the PF. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14iavf: add initial framework for registering PTP clockJacob Keller7-0/+155
Add the iavf_ptp.c file and fill it in with a skeleton framework to allow registering the PTP clock device. Add implementation of helper functions to check if a PTP capability is supported and handle change in PTP capabilities. Enabling virtual clock would be possible, though it would probably perform poorly due to the lack of direct time access. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Sai Krishna <saikrishnag@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Co-developed-by: Ahmed Zaki <ahmed.zaki@intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14iavf: negotiate PTP capabilitiesJacob Keller5-2/+178
Add a new extended capabilities negotiation to exchange information from the PF about what PTP capabilities are supported by this VF. This requires sending a VIRTCHNL_OP_1588_PTP_GET_CAPS message, and waiting for the response from the PF. Handle this early on during the VF initialization. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14iavf: add support for negotiating flexible RXDID formatJacob Keller5-11/+178
Enable support for VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC, to enable the VF driver the ability to determine what Rx descriptor formats are available. This requires sending an additional message during initialization and reset, the VIRTCHNL_OP_GET_SUPPORTED_RXDIDS. This operation requests the supported Rx descriptor IDs available from the PF. This is treated the same way that VLAN V2 capabilities are handled. Add a new set of extended capability flags, used to process send and receipt of the VIRTCHNL_OP_GET_SUPPORTED_RXDIDS message. This ensures we finish negotiating for the supported descriptor formats prior to beginning configuration of receive queues. This change stores the supported format bitmap into the iavf_adapter structure. Additionally, if VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC is enabled by the PF, we need to make sure that the Rx queue configuration specifies the format. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14ice: support Rx timestamp on flex descriptorSimei Su8-10/+102
To support Rx timestamp offload, VIRTCHNL_OP_1588_PTP_CAPS is sent by the VF to request PTP capability and responded by the PF what capability is enabled for that VF. Hardware captures timestamps which contain only 32 bits of nominal nanoseconds, as opposed to the 64bit timestamps that the stack expects. To convert 32b to 64b, we need a current PHC time. VIRTCHNL_OP_1588_PTP_GET_TIME is sent by the VF and responded by the PF with the current PHC time. Signed-off-by: Simei Su <simei.su@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski8-35/+60
Cross-merge networking fixes after downstream PR (net-6.14-rc3). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13mlxsw: Add return value check for mlxsw_sp_port_get_stats_raw()Wentao Liang1-1/+3
Add a check for the return value of mlxsw_sp_port_get_stats_raw() in __mlxsw_sp_port_get_stats(). If mlxsw_sp_port_get_stats_raw() returns an error, exit the function to prevent further processing with potentially invalid data. Fixes: 614d509aa1e7 ("mlxsw: Move ethtool_ops to spectrum_ethtool.c") Cc: stable@vger.kernel.org # 5.9+ Signed-off-by: Wentao Liang <vulab@iscas.ac.cn> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250212152311.1332-1-vulab@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13r8169: add support for Intel Killer E5000Heiner Kallweit1-0/+1
This adds support for the Intel Killer E5000 which seems to be a rebranded RTL8126. Copied from r8126 vendor driver. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/9db73e9b-e2e8-45de-97a5-041c5f71d774@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13ixgene-v2: prepare for phylib stop exporting phy_10_100_features_arrayHeiner Kallweit1-12/+6
As part of phylib cleanup we plan to stop exporting the feature arrays. So explicitly remove the modes not supported by the MAC. The media type bits don't have any impact on kernel behavior, so don't touch them. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Link: https://patch.msgid.link/be356a21-5a1a-45b3-9407-3a97f3af4600@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13net: ethernet: ti: am65_cpsw: fix tx_cleanup for XDP caseRoger Quadros1-3/+11
For XDP transmit case, swdata doesn't contain SKB but the XDP Frame. Infer the correct swdata based on buffer type and return the XDP Frame for XDP transmit case. Signed-off-by: Roger Quadros <rogerq@kernel.org> Fixes: 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support") Link: https://patch.msgid.link/20250210-am65-cpsw-xdp-fixes-v1-3-ec6b1f7f1aca@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13net: ethernet: ti: am65-cpsw: fix RX & TX statistics for XDP_TX caseRoger Quadros1-3/+7
For successful XDP_TX and XDP_REDIRECT cases, the packet was received successfully so update RX statistics. Use original received packet length for that. TX packets statistics are incremented on TX completion so don't update it while TX queueing. If xdp_convert_buff_to_frame() fails, increment tx_dropped. Signed-off-by: Roger Quadros <rogerq@kernel.org> Fixes: 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support") Link: https://patch.msgid.link/20250210-am65-cpsw-xdp-fixes-v1-2-ec6b1f7f1aca@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13net: ethernet: ti: am65-cpsw: fix memleak in certain XDP casesRoger Quadros1-13/+13
If the XDP program doesn't result in XDP_PASS then we leak the memory allocated by am65_cpsw_build_skb(). It is pointless to allocate SKB memory before running the XDP program as we would be wasting CPU cycles for cases other than XDP_PASS. Move the SKB allocation after evaluating the XDP program result. This fixes the memleak. A performance boost is seen for XDP_DROP test. XDP_DROP test: Before: 460256 rx/s 0 err/s After: 784130 rx/s 0 err/s Fixes: 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support") Signed-off-by: Roger Quadros <rogerq@kernel.org> Link: https://patch.msgid.link/20250210-am65-cpsw-xdp-fixes-v1-1-ec6b1f7f1aca@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13net: stmmac: dwmac-loongson: Set correct {tx,rx}_fifo_sizeHuacai Chen1-0/+3
Now for dwmac-loongson {tx,rx}_fifo_size are uninitialised, which means zero. This means dwmac-loongson doesn't support changing MTU because in stmmac_change_mtu() it requires the fifo size be no less than MTU. Thus, set the correct tx_fifo_size and rx_fifo_size for it (16KB multiplied by queue counts). Here {tx,rx}_fifo_size is initialised with the initial value (also the maximum value) of {tx,rx}_queues_to_use. So it will keep as 16KB if we don't change the queue count, and will be larger than 16KB if we change (decrease) the queue count. However stmmac_change_mtu() still work well with current logic (MTU cannot be larger than 16KB for stmmac). Note: the Fixes tag picked here is the oldest commit and key commit of the dwmac-loongson series "stmmac: Add Loongson platform support". Acked-by: Yanteng Si <si.yanteng@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Chong Qiao <qiaochong@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://patch.msgid.link/20250210134328.2755328-1-chenhuacai@loongson.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13Merge branch '200GbE' of ↵Jakub Kicinski4-14/+20
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-02-11 (idpf, ixgbe, igc) For idpf: Sridhar fixes a couple issues in handling of RSC packets. Josh adds a call to set_real_num_queues() to keep queue count in sync. For ixgbe: Piotr removes missed IS_ERR() removal when ERR_PTR usage was removed. For igc: Zdenek Bouska fixes reporting of Rx timestamp with AF_XDP. Siang sets buffer type on empty frame to ensure proper handling. * '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igc: Set buffer type for empty frames in igc_init_empty_frame igc: Fix HW RX timestamp when passed by ZC XDP ixgbe: Fix possible skb NULL pointer dereference idpf: call set_real_num_queues in idpf_open idpf: record rx queue in skb for RSC packets idpf: fix handling rsc packet with a single segment ==================== Link: https://patch.msgid.link/20250211214343.4092496-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13eth: fbnic: re-sort the objects in the MakefileJakub Kicinski1-1/+2
Looks like recent commit broke the sort order, fix it. Acked-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20250211181356.580800-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13eth: fbnic: report software Tx queue statsJakub Kicinski3-4/+32
Gather and report software Tx queue stats - checksum stats and queue stop / start. Acked-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20250211181356.580800-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13eth: fbnic: report software Rx queue statsJakub Kicinski3-9/+51
Gather and report software Rx queue stats - checksum stats and allocation failures. Acked-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20250211181356.580800-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13eth: fbnic: wrap tx queue stats in a structJakub Kicinski3-10/+16
The queue stats struct is used for Rx and Tx queues. Wrap the Tx stats in a struct and a union, so that we can reuse the same space for Rx stats on Rx queues. This also makes it easy to add an assert to the stat handling code to catch new stats not being aggregated on shutdown. Acked-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20250211181356.580800-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-12net/mlx5: XDP, Enable TX side XDP multi-buffer supportAlexei Lazar6-61/+21
In XDP scenarios, fragmented packets can occur if the MTU is larger than the page size, even when the packet size fits within the linear part. If XDP multi-buffer support is disabled, the fragmented part won't be handled in the TX flow, leading to packet drops. Since XDP multi-buffer support is always available, this commit removes the conditional check for enabling it. This ensures that XDP multi-buffer support is always enabled, regardless of the `is_xdp_mb` parameter, and guarantees the handling of fragmented packets in such scenarios. Signed-off-by: Alexei Lazar <alazar@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250209101716.112774-16-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-12net/mlx5: Extend Ethtool loopback selftest to support non-linear SKBAlexei Lazar1-0/+3
Current loopback test validation ignores non-linear SKB case in the SKB access, which can lead to failures in scenarios such as when HW GRO is enabled. Linearize the SKB so both cases will be handled. Signed-off-by: Alexei Lazar <alazar@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250209101716.112774-15-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-12net/mlx5e: Expose RSS via devlink rx reporter diagnoseAmir Tzin7-2/+99
Underneath "rx resources" tag expose RSS diagnostic information. For each RSS expose its rqtn, TIRs and inner TIRs. $ devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter rx ....... RSS: Index: 0 rqtn: 0 TIRs Numbers: tt: TT_IPV4_TCP tirn: 0 tt: TT_IPV6_TCP tirn: 1 tt: TT_IPV4_UDP tirn: 2 tt: TT_IPV6_UDP tirn: 3 tt: TT_IPV4_IPSEC_AH tirn: 4 tt: TT_IPV6_IPSEC_AH tirn: 5 tt: TT_IPV4_IPSEC_ESP tirn: 6 tt: TT_IPV6_IPSEC_ESP tirn: 7 tt: TT_IPV4 tirn: 8 tt: TT_IPV6 tirn: 9 Inner TIRs Numbers: tt: TT_IPV4_TCP tirn: 10 tt: TT_IPV6_TCP tirn: 11 tt: TT_IPV4_UDP tirn: 12 tt: TT_IPV6_UDP tirn: 13 tt: TT_IPV4_IPSEC_AH tirn: 14 tt: TT_IPV6_IPSEC_AH tirn: 15 tt: TT_IPV4_IPSEC_ESP tirn: 16 tt: TT_IPV6_IPSEC_ESP tirn: 17 tt: TT_IPV4 tirn: 18 tt: TT_IPV6 tirn: 19 Index: 2 rqtn: 27 TIRs Numbers: tt: TT_IPV6_TCP tirn: 46 Signed-off-by: Amir Tzin <amirtz@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250209101716.112774-14-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-12net/mlx5e: Add direct TIRs to devlink rx reporter diagnoseAmir Tzin3-1/+40
Add "RX resources" tag to the output of rx reporter diagnose callback. Underneath add tag for direct TIRs, for each TIR expose its tirn and the corresponding rqtn. $ sudo devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter rx .... rx resources: Direct TIRs: ix: 0 tirn: 20 rqtn: 1 ix: 1 tirn: 21 rqtn: 2 ix: 2 tirn: 22 rqtn: 3 ix: 3 tirn: 23 rqtn: 4 ix: 4 tirn: 24 rqtn: 5 ix: 5 tirn: 25 rqtn: 6 ix: 6 tirn: 26 rqtn: 7 ix: 7 tirn: 27 rqtn: 8 ix: 8 tirn: 28 rqtn: 9 ix: 9 tirn: 29 rqtn: 10 ix: 10 tirn: 30 rqtn: 11 ix: 11 tirn: 31 rqtn: 12 ix: 12 tirn: 32 rqtn: 13 ix: 13 tirn: 33 rqtn: 14 ix: 14 tirn: 34 rqtn: 15 ix: 15 tirn: 35 rqtn: 16 ix: 16 tirn: 36 rqtn: 17 ix: 17 tirn: 37 rqtn: 18 ix: 18 tirn: 38 rqtn: 19 ix: 19 tirn: 39 rqtn: 20 ix: 20 tirn: 40 rqtn: 21 ix: 21 tirn: 41 rqtn: 22 ix: 22 tirn: 42 rqtn: 23 ix: 23 tirn: 43 rqtn: 24 Signed-off-by: Amir Tzin <amirtz@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250209101716.112774-13-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-12net/mlx5e: Move RQs diagnose to a dedicated functionAmir Tzin1-13/+18
Move rx reporter RQs diagnose from mlx5e_rx_reporter_diagnose() to a dedicated function. This change is a preparation for the following series which extends diagnose output for the rx reporter. While at it, also pass a mlx5e_priv pointer to mlx5e_rx_reporter_diagnose_common_config() as this is the argument the latter actually needs. Signed-off-by: Amir Tzin <amirtz@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250209101716.112774-12-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-12net/mlx5: Expose ICM consumption per functionAkiva Goldberger1-0/+46
ICM is a portion of the host's memory assigned to a function by the OS through requests made by the NIC's firmware. PF ICM consumption can be accessed directly, while VF/SF ICM consumption can be accessed through their representors in switchdev mode. The value is exposed to the user in granularity of 4KB through the vnic health reporter as follows: $ devlink health diagnose pci/0000:08:00.0 reporter vnic vNIC env counters: total_error_queues: 0 send_queue_priority_update_flow: 0 comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0 invalid_command: 0 quota_exceeded_command: 0 nic_receive_steering_discard: 0 icm_consumption: 1032 Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250209101716.112774-11-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-12net/mlx5: Rename and move mlx5_esw_query_vport_vhca_idAkiva Goldberger3-27/+29
Rename mlx5_esw_query_vport_vhca_id to mlx5_vport_get_vhca_id and move it to vport file. Also, add function declaration to mlx5_core header file. This better represents the function's usage and allows for it to be called from other parts of the mlx5_core driver. Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250209101716.112774-10-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-12net/mlx5e: set the tx_queue_len for pfifo_fastWilliam Tu1-0/+2
By default, the mq netdev creates a pfifo_fast qdisc. On a system with 16 core, the pfifo_fast with 3 bands consumes 16 * 3 * 8 (size of pointer) * 1024 (default tx queue len) = 393KB. The patch sets the tx qlen to representor default value, 128 (1<<MLX5E_REP_PARAMS_DEF_LOG_SQ_SIZE), which consumes 16 * 3 * 8 * 128 = 49KB, saving 344KB for each representor at ECPF. Signed-off-by: William Tu <witu@nvidia.com> Reviewed-by: Daniel Jurgens <danielj@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250209101716.112774-9-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>