summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/broadcom/bnxt
AgeCommit message (Collapse)AuthorFilesLines
2023-11-15bnxt_en: Add completion ring pointer in TX and RX ring structuresMichael Chan3-25/+19
From the TX or RX ring structure, we need to find the corresponding completion ring during initialization. On P5 chips, we use the MSIX/napi entry to locate the completion ring because there is only one RX/TX ring per MSIX. To allow multiple TX rings for each MSIX, we need to add a direct pointer from the TX ring and RX ring structures. This also simplifies the existing logic. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-15bnxt_en: Restructure cp_ring_arr in struct bnxt_cp_ring_infoMichael Chan3-73/+66
The cp_ring_arr is currently a fixed array of 2 pointers for the TX and RX completion rings. These pointers are allocated during ring initialization. Currntly, we support up to 2 completion rings for each MSIX. In order to support more completion rings, we change this fixed array to a pointer and allocate the required entries during ring initialization. This patch keeps the current scheme of allocating only 2 entries when needed. Later patches will expand and allocate more entries when required. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-15bnxt_en: Add completion ring pointer in TX and RX ring structuresMichael Chan3-26/+40
From the TX or RX ring structure, we need to find the corresponding completion ring during initialization. On P5 chips, we use the MSIX/napi entry to locate the completion ring because there is only one RX/TX ring per MSIX. To allow multiple TX rings for each MSIX, we need to add a direct pointer from the TX ring and RX ring structures. This also simplifies the existing logic. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-15bnxt_en: Put the TX producer information in the TX BD opaque fieldMichael Chan3-3/+11
Currently, the opaque field in the TX BD is only used for debugging. The TX completion logic relies on getting one TX completion for each packet and they always complete in order. Improve this scheme by putting the producer information (ring index plus number of BDs for the packet) in the opaque field. This way, we can handle TX completion processing by looking at the last TX completion instead of counting the number of completions. Since we no longer need to count the exact number of completions, we can optimize xmit_more by disabling TX completion when the xmit_more condition is true. This will be done in later patches. This patch is only initializing the opaque field in the TX BD and is not changing the driver's TX completion logic yet. Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27bnxt_en: Fix 2 stray ethtool -S countersMichael Chan1-6/+22
The recent firmware interface change has added 2 counters in struct rx_port_stats_ext. This caused 2 stray ethtool counters to be displayed. Since new counters are added from time to time, fix it so that the ethtool logic will only display up to the maximum known counters. These 2 counters are not used by production firmware yet. Fixes: 754fbf604ff6 ("bnxt_en: Update firmware interface to 1.10.2.171") Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231026013231.53271-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-24page_pool: remove PP_FLAG_PAGE_FRAGYunsheng Lin1-2/+0
PP_FLAG_PAGE_FRAG is not really needed after pp_frag_count handling is unified and page_pool_alloc_frag() is supported in 32-bit arch with 64-bit DMA, so remove it. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> CC: Lorenzo Bianconi <lorenzo@kernel.org> CC: Alexander Duyck <alexander.duyck@gmail.com> CC: Liang Chen <liangchen.linux@gmail.com> CC: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20231020095952.11055-3-linyunsheng@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-22bnxt_en: extend media types to supported and autoneg modesEdwin Peer1-120/+153
The current driver code does not accurately report the supported and advertised link modes. It basically always assumes the media type is copper for any particular speed. Utilize the recently added link mode mappings to accurately report fully qualified ethtool link modes for advertised and supported speeds. If the media type is known, we will report the supported link modes for that media only. If the media is not known, we will report all possible supported link modes. The user can now specify any supported link modes (including NRZ and PAM4) to advertise for autoneg. It used to only accept copper NRZ modes. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22bnxt_en: convert to linkmode_set_bit() APIEdwin Peer1-24/+24
Barring the BNXT_FW_TO_ETHTOOL speed macros, which will be removed in the next patch, update code to use the newer API. No functional change. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22bnxt_en: Refactor NRZ/PAM4 link speed related logicMichael Chan1-31/+67
Refactor some NRZ/PAM4 link speed related logic into helper functions. The NRZ and PAM4 link parameters are stored in separate structure fields. The driver logic has to check whether it is in NRZ or PAM4 mode and then use the appropriate field. Refactor this logic into helper functions for better readability. Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22bnxt_en: refactor speed independent ethtool modesEdwin Peer1-38/+40
A future patch in this series will change the algorithm used to determine ethtool speed and media modes. Extract the handling of the unrelated pause, autoneg modes into an independent function. Also separate FEC handling out of bnxt_fw_to_ethtool_*_spds(). No functional change. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22bnxt_en: support lane configuration via ethtoolEdwin Peer1-7/+25
Recent kernels support changing the number of link lanes via ethtool. This is useful for determining the appropriate signal mode to use when a given link speed can be achieved using different lane configurations. Accept the ethtool lanes parameter when configuring forced speed. If there is no lanes parameter, select a default. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22bnxt_en: add infrastructure to lookup ethtool link modeEdwin Peer2-20/+258
Add infrastructure to look up the enum ethtool_link_mode_bit_indices from link information provided by the firmware. The link speed, signal mode, and media type returned by firmware will be used to look up the ethtool link mode. The immediate benefit is that once the link mode is determined, we can now use ethtool_params_from_link_mode() to fill the basic ethtool parameters including the number of lanes. Lanes will be fully supported in the next patch. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22bnxt_en: Fix invoking hwmon_notify_eventKalesh AP1-5/+11
FW sends the async event to the driver when the device temperature goes above or below the threshold values. Only notify hwmon if the temperature is increasing to the next alert level, not when it is decreasing. Cc: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22bnxt_en: Do not call sleeping hwmon_notify_event() from NAPIKalesh AP4-9/+18
Defer hwmon_notify_event() to bnxt_sp_task() workqueue because hwmon_notify_event() can try to acquire a mutex shown in the stack trace below. Modify bnxt_event_error_report() to return true if we need to schedule bnxt_sp_task() to notify hwmon. __schedule+0x68/0x520 hwmon_notify_event+0xe8/0x114 schedule+0x60/0xe0 schedule_preempt_disabled+0x28/0x40 __mutex_lock.constprop.0+0x534/0x550 __mutex_lock_slowpath+0x18/0x20 mutex_lock+0x5c/0x70 kobject_uevent_env+0x2f4/0x3d0 kobject_uevent+0x10/0x20 hwmon_notify_event+0x94/0x114 bnxt_hwmon_notify_event+0x40/0x70 [bnxt_en] bnxt_event_error_report+0x260/0x290 [bnxt_en] bnxt_async_event_process.isra.0+0x250/0x850 [bnxt_en] bnxt_hwrm_handler.isra.0+0xc8/0x120 [bnxt_en] bnxt_poll_p5+0x150/0x350 [bnxt_en] __napi_poll+0x3c/0x210 net_rx_action+0x308/0x3b0 __do_softirq+0x120/0x3e0 Cc: Guenter Roeck <linux@roeck-us.net> Fixes: a19b4801457b ("bnxt_en: Event handler for Thermal event") Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20bnxt_en: devlink health: use retained error fmsg APIPrzemek Kitszel1-61/+32
Drop unneeded error checking. devlink_fmsg_*() family of functions is now retaining errors, so there is no need to check for them after each call. Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18eth: bnxt: fix backward compatibility with older devicesJakub Kicinski5-12/+28
Recent FW interface update bumped the size of struct hwrm_func_cfg_input above 128B which is the max some devices support. Probe on Stratus (BCM957452) with FW 20.8.3.11 fails with: bnxt_en ...: Unable to reserve tx rings bnxt_en ...: 2nd rings reservation failed. bnxt_en ...: Not enough rings available. Once probe is fixed other errors pop up: bnxt_en ...: Failed to set async event completion ring. This is because __hwrm_send() rejects requests larger than bp->hwrm_max_ext_req_len with -E2BIG. Since the driver doesn't actually access any of the new fields, yet, trim the length. It should be safe. Similar workaround exists for backing_store_cfg_input. Although that one mins() to a constant of 256, not 128 we'll effectively use here. Michael explains: "the backing store cfg command is supported by relatively newer firmware that will accept 256 bytes at least." To make debugging easier in the future add a warning for oversized requests. Fixes: 754fbf604ff6 ("bnxt_en: Update firmware interface to 1.10.2.171") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231016171640.1481493-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04Revert "bnxt_en: Support QOS and TPID settings for the SRIOV VLAN"Jakub Kicinski3-15/+12
This reverts commit e76d44fe722761f5480b908e38c5ce1a2c2cb6d6. We no longer accept drivers extending their use of the legacy SR-IOV configuration APIs. Users should move to bridge offload. Link: https://lore.kernel.org/r/20231004112243.41cb6351@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04bnxt_en: Update VNIC resource calculation for VFsVikas Gupta3-2/+29
Newer versions of firmware will pre-reserve 1 VNIC for every possible PF and VF function. Update the driver logic to take this into account when assigning VNICs to the VFs. These pre-reserved VNICs for the inactive VFs should be subtracted from the global pool before assigning them to the active VFs. Not doing so may cause discrepancies that ultimately may cause some VFs to have insufficient VNICs to support features such as aRFS. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04bnxt_en: Support QOS and TPID settings for the SRIOV VLANSreekanth Reddy3-12/+16
Add these missing settings in the .ndo_set_vf_vlan() method. Older firmware does not support the TPID setting so check for proper support. Remove the unused BNXT_VF_QOS flag. Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04bnxt_en: Event handler for Thermal eventKalesh AP3-0/+82
Newer FW will send a new async event when it detects that the chip's temperature has crossed the configured threshold value. The driver will now notify hwmon and will log a warning message. Link: https://lore.kernel.org/netdev/20230815045658.80494-13-michael.chan@broadcom.com/ Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: linux-hwmon@vger.kernel.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04bnxt_en: Use non-standard attribute to expose shutdown temperatureKalesh AP1-1/+53
Implement the sysfs attributes directly in the driver for shutdown threshold temperature and pass an extra attribute group to the hwmon core when registering the hwmon device. Link: https://lore.kernel.org/netdev/20230815045658.80494-12-michael.chan@broadcom.com/ Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: linux-hwmon@vger.kernel.org Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04bnxt_en: Expose threshold temperatures through hwmonKalesh AP2-7/+57
HWRM_TEMP_MONITOR_QUERY response now indicates various threshold temperatures. Expose these threshold temperatures through the hwmon sysfs using this mapping: hwmon_temp_max : bp->warn_thresh_temp hwmon_temp_crit : bp->crit_thresh_temp hwmon_temp_emergency : bp->fatal_thresh_temp hwmon_temp_max_alarm : temp >= bp->warn_thresh_temp hwmon_temp_crit_alarm : temp >= bp->crit_thresh_temp hwmon_temp_emergency_alarm : temp >= bp->fatal_thresh_temp Link: https://lore.kernel.org/netdev/20230815045658.80494-12-michael.chan@broadcom.com/ Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: linux-hwmon@vger.kernel.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04bnxt_en: Modify the driver to use hwmon_device_register_with_infoKalesh AP1-16/+55
The use of hwmon_device_register_with_groups() is deprecated. Modified the driver to use hwmon_device_register_with_info(). Driver currently exports only temp1_input through hwmon sysfs interface. But FW has been modified to report more threshold temperatures and driver want to report them through the hwmon interface. Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: linux-hwmon@vger.kernel.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04bnxt_en: Move hwmon functions into a dedicated fileKalesh AP4-75/+109
This is in preparation for upcoming patches in the series. Driver has to expose more threshold temperatures through the hwmon sysfs interface. More code will be added and do not want to overload bnxt.c. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: linux-hwmon@vger.kernel.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04bnxt_en: Enhance hwmon temperature reportingKalesh AP1-7/+8
Driver currently does hwmon device register and unregister in open and close() respectively. As a result, user will not be able to query hwmon temperature when interface is in ifdown state. Enhance it by moving the hwmon register/unregister to the probe/remove functions. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-04bnxt_en: Update firmware interface to 1.10.2.171Michael Chan1-178/+367
The main changes are the additional thermal thresholds in hwrm_temp_monitor_query_output and the new async event to report thermal errors. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-21bnxt_en: Flush XDP for bnxt_poll_nitroa0()'s NAPISebastian Andrzej Siewior1-0/+5
bnxt_poll_nitroa0() invokes bnxt_rx_pkt() which can run a XDP program which in turn can return XDP_REDIRECT. bnxt_rx_pkt() is also used by __bnxt_poll_work() which flushes (xdp_do_flush()) the packets after each round. bnxt_poll_nitroa0() lacks this feature. xdp_do_flush() should be invoked before leaving the NAPI callback. Invoke xdp_do_flush() after a redirect in bnxt_poll_nitroa0() NAPI. Cc: Michael Chan <michael.chan@broadcom.com> Fixes: f18c2b77b2e4e ("bnxt_en: optimized XDP_REDIRECT support") Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds3-1/+56
Pull rdma updates from Jason Gunthorpe: "Many small changes across the subystem, some highlights: - Usual driver cleanups in qedr, siw, erdma, hfi1, mlx4/5, irdma, mthca, hns, and bnxt_re - siw now works over tunnel and other netdevs with a MAC address by removing assumptions about a MAC/GID from the connection manager - "Doorbell Pacing" for bnxt_re - this is a best effort scheme to allow userspace to slow down the doorbell rings if the HW gets full - irdma egress VLAN priority, better QP/WQ sizing - rxe bug fixes in queue draining and srq resizing - Support more ethernet speed options in the core layer - DMABUF support for bnxt_re - Multi-stage MTT support for erdma to allow much bigger MR registrations - A irdma fix with a CVE that came in too late to go to -rc, missing bounds checking for 0 length MRs" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (87 commits) IB/hfi1: Reduce printing of errors during driver shut down RDMA/hfi1: Move user SDMA system memory pinning code to its own file RDMA/hfi1: Use list_for_each_entry() helper RDMA/mlx5: Fix trailing */ formatting in block comment RDMA/rxe: Fix redundant break statement in switch-case. RDMA/efa: Fix wrong resources deallocation order RDMA/siw: Call llist_reverse_order in siw_run_sq RDMA/siw: Correct wrong debug message RDMA/siw: Balance the reference of cep->kref in the error path Revert "IB/isert: Fix incorrect release of isert connection" RDMA/bnxt_re: Fix kernel doc errors RDMA/irdma: Prevent zero-length STAG registration RDMA/erdma: Implement hierarchical MTT RDMA/erdma: Refactor the storage structure of MTT entries RDMA/erdma: Renaming variable names and field names of struct erdma_mem RDMA/hns: Support hns HW stats RDMA/hns: Dump whole QP/CQ/MR resource in raw RDMA/irdma: Add missing kernel-doc in irdma_setup_umode_qp() RDMA/mlx4: Copy union directly RDMA/irdma: Drop unused kernel push code ...
2023-08-23bnxt: use the NAPI skb allocation cacheJakub Kicinski1-3/+3
All callers of build_skb() (*) in bnxt are in NAPI context. The budget checking is somewhat convoluted because in the shared completion queue cases Rx packets are discarded by netpoll by forcing an error (E). But that happens before skb allocation. Only a call chain starting at __bnxt_poll_work() can lead to an skb allocation and it checks budget (b). * bnxt_rx_multi_page_skb * bnxt_rx_skb ` bp->rx_skb_func * bnxt_tpa_end ` bnxt_rx_pkt E bnxt_force_rx_discard E bnxt_poll_nitroa0 b __bnxt_poll_work Use napi_build_skb() to take advantage of the skb cache. In iperf tests with HW-GRO enabled it barely makes a difference but in cases where HW-GRO is not as effective (or disabled) it can give even a >10% boost (20.7Gbps -> 23.1Gbps). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-19bnxt_en: Add tx_resets ring counterMichael Chan3-0/+10
Add a new tx_resets ring counter. This counter will be saved as tx_total_resets across any reset. Since we currently do a full reset in bnxt_sched_reset_txr(), the per ring counter will always be cleared during reset. Only the tx_total_resets count will be meaningful and we only display this under ethtool -S. Link: https://lore.kernel.org/netdev/CACKFLimD-bKmJ1tGZOLYRjWzEwxkri-Mw7iFme1x2Dr0twdCeg@mail.gmail.com/ Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230817231911.165035-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19bnxt_en: Display the ring error counters under ethtool -SMichael Chan1-25/+23
The existing driver displays the sum of 4 ring counters under ethtool -S. These counters are in the array bnxt_sw_func_stats. These counters are summed at the time of ethtool -S and will be lost when the device is reset. Replace these counters with the new total ring error counters added in the last patch. These new counters are saved before reset. ethtool -S will now display the sum of the saved counters plus the current counters. Link: https://lore.kernel.org/netdev/CACKFLimD-bKmJ1tGZOLYRjWzEwxkri-Mw7iFme1x2Dr0twdCeg@mail.gmail.com/ Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230817231911.165035-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19bnxt_en: Save ring error counters across resetMichael Chan2-1/+46
Currently, the ring counters are stored in the per ring datastructure. During reset, all the rings are freed together with the associated datastructures. As a result, all the ring error counters will be reset to zero. Add logic to keep track of the total error counts of all the rings and save them before reset (including ifdown). The next patch will display these total ring error counters under ethtool -S. Link: https://lore.kernel.org/netdev/CACKFLimD-bKmJ1tGZOLYRjWzEwxkri-Mw7iFme1x2Dr0twdCeg@mail.gmail.com/ Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230817231911.165035-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19bnxt_en: Increment rx_resets counter in bnxt_disable_napi()Michael Chan1-5/+7
If we are doing a complete reset with irq_re_init set to true in bnxt_close_nic(), all the ring structures will be freed. New structures will be allocated in bnxt_open_nic(). The current code increments rx_resets counter in bnxt_enable_napi() if bnapi->in_reset is true. In a complete reset, bnapi->in_reset will never be true since the structure is just allocated. Increment the rx_resets counter in bnxt_disable_napi() instead. This will allow us to save all the ring error counters including the rx_resets counters in the next patch. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230817231911.165035-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19bnxt_en: Let the page pool manage the DMA mappingSomnath Kotur1-22/+10
Use the page pool's ability to maintain DMA mappings for us. This avoids re-mapping of the recycled pages. Link: https://lore.kernel.org/netdev/20230728231829.235716-4-michael.chan@broadcom.com/ Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230817231911.165035-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19bnxt_en: Use the unified RX page pool buffers for XDP and non-XDPSomnath Kotur2-60/+14
Convert to use the page pool buffers for the aggregation ring when running in non-XDP mode. This simplifies the driver and we benefit from the recycling of pages. Adjust the page pool size to account for the aggregation ring size. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230817231911.165035-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09rtnetlink: remove redundant checks for nlattr IFLA_BRIDGE_MODELin Ma1-3/+0
The commit d73ef2d69c0d ("rtnetlink: let rtnl_bridge_setlink checks IFLA_BRIDGE_MODE length") added the nla_len check in rtnl_bridge_setlink, which is the only caller for ndo_bridge_setlink handlers defined in low-level driver codes. Hence, this patch cleanups the redundant checks in each ndo_bridge_setlink handler function. Suggested-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Lin Ma <linma@zju.edu.cn> Acked-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230807091347.3804523-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09bnxt_en: Fix W=stringop-overflow warning in bnxt_dcb.cMichael Chan2-284/+49
Fix the following warning: drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c: In function ‘bnxt_hwrm_queue_cos2bw_cfg’: cc1: error: writing 12 bytes into a region of size 1 [-Werror=stringop-overflow ] In file included from drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c:19: drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h:6045:17: note: destination object ‘unused_0’ of size 1 6045 | u8 unused_0; Fix it by modifying struct hwrm_queue_cos2bw_cfg_input to use an array of sub struct similar to the previous patch. This will eliminate the pointer arithmetc to calculate the destination pointer passed to memcpy(). Link: https://lore.kernel.org/netdev/CACKFLinikvXmKcxr4kjWO9TPYxTd2cb5agT1j=w9Qyj5-24s5A@mail.gmail.com/ Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230807145720.159645-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09bnxt_en: Fix W=1 warning in bnxt_dcb.c from fortify memcpy()Michael Chan2-285/+52
Fix the following warning: inlined from ‘bnxt_hwrm_queue_cos2bw_qcfg’ at drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c:165:3, ./include/linux/fortify-string.h:592:4: error: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror] __read_overflow2_field(q_size_field, size); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Modify the FW interface defintion of struct hwrm_queue_cos2bw_qcfg_output to use an array of sub struct for the queue1 to queue7 fields. Note that the layout of the queue0 fields are different and these are not part of the array. This makes the code much cleaner by removing the pointer arithmetic for memcpy(). Link: https://lore.kernel.org/netdev/20230727190726.1859515-2-kuba@kernel.org/ Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230807145720.159645-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-07page_pool: split types and declarations from page_pool.hYunsheng Lin2-2/+2
Split types and pure function declarations from page_pool.h and add them in page_page/types.h, so that C sources can include page_pool.h and headers should generally only include page_pool/types.h as suggested by jakub. Rename page_pool.h to page_pool/helpers.h to have both in one place. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230804180529.2483231-2-aleksander.lobakin@intel.com [Jakub: change microsoft/mana, fix kdoc paths in Documentation] Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski4-40/+63
Cross-merge networking fixes after downstream PR. Conflicts: net/dsa/port.c 9945c1fb03a3 ("net: dsa: fix older DSA drivers using phylink") a88dd7538461 ("net: dsa: remove legacy_pre_march2020 detection") https://lore.kernel.org/all/20230731102254.2c9868ca@canb.auug.org.au/ net/xdp/xsk.c 3c5b4d69c358 ("net: annotate data-races around sk->sk_mark") b7f72a30e9ac ("xsk: introduce wrappers and helpers for supporting multi-buffer in Tx path") https://lore.kernel.org/all/20230731102631.39988412@canb.auug.org.au/ drivers/net/ethernet/broadcom/bnxt/bnxt.c 37b61cda9c16 ("bnxt: don't handle XDP in netpoll") 2b56b3d99241 ("eth: bnxt: handle invalid Tx completions more gracefully") https://lore.kernel.org/all/20230801101708.1dc7faac@canb.auug.org.au/ Adjacent changes: drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c 62da08331f1a ("net/mlx5e: Set proper IPsec source port in L4 selector") fbd517549c32 ("net/mlx5e: Add function to get IPsec offload namespace") drivers/net/ethernet/sfc/selftest.c 55c1528f9b97 ("sfc: fix field-spanning memcpy in selftest") ae9d445cd41f ("sfc: Miscellaneous comment removals") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-02bnxt_en: Fix max_mtu setting for multi-buf XDPMichael Chan1-7/+10
The existing code does not allow the MTU to be set to the maximum even after an XDP program supporting multiple buffers is attached. Fix it to set the netdev->max_mtu to the maximum value if the attached XDP program supports mutiple buffers, regardless of the current MTU value. Also use a local variable dev instead of repeatedly using bp->dev. Fixes: 1dc4c557bfed ("bnxt: adding bnxt_xdp_build_skb to build skb from multibuffer xdp_buff") Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230731142043.58855-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-02bnxt_en: Fix page pool logic for page size >= 64KSomnath Kotur2-19/+29
The RXBD length field on all bnxt chips is 16-bit and so we cannot support a full page when the native page size is 64K or greater. The non-XDP (non page pool) code path has logic to handle this but the XDP page pool code path does not handle this. Add the missing logic to use page_pool_dev_alloc_frag() to allocate 32K chunks if the page size is 64K or greater. Fixes: 9f4b28301ce6 ("bnxt: XDP multibuffer enablement") Link: https://lore.kernel.org/netdev/20230728231829.235716-2-michael.chan@broadcom.com/ Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230731142043.58855-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-01bnxt: don't handle XDP in netpollJakub Kicinski4-14/+24
Similarly to other recently fixed drivers make sure we don't try to access XDP or page pool APIs when NAPI budget is 0. NAPI budget of 0 may mean that we are in netpoll. This may result in running software IRQs in hard IRQ context, leading to deadlocks or crashes. To make sure bnapi->tx_pkts don't get wiped without handling the events, move clearing the field into the handler itself. Remember to clear tx_pkts after reset (bnxt_enable_napi()) as it's technically possible that netpoll will accumulate some tx_pkts and then a reset will happen, leaving tx_pkts out of sync with reality. Fixes: 322b87ca55f2 ("bnxt_en: add page_pool support") Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230728205020.2784844-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-31net: flow_dissector: Use 64bits for used_keysRatheesh Kannoth1-3/+3
As 32bits of dissector->used_keys are exhausted, increase the size to 64bits. This is base change for ESP/AH flow dissector patch. Please find patch and discussions at https://lore.kernel.org/netdev/ZMDNjD46BvZ5zp5I@corigine.com/T/#t Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Petr Machata <petrm@nvidia.com> # for mlxsw Tested-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-28eth: bnxt: fix warning for define in struct_groupJakub Kicinski1-1/+2
Fix C=1 warning with sparse 0.6.4: drivers/net/ethernet/broadcom/bnxt/bnxt.c: note: in included file: drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h:30:1: warning: directive in macro's argument list Don't put defines in a struct_group(). Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230727190726.1859515-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-28eth: bnxt: fix one of the W=1 warnings about fortified memcpy()Jakub Kicinski1-1/+1
Fix a W=1 warning with gcc 13.1: In function ‘fortify_memcpy_chk’, inlined from ‘bnxt_hwrm_queue_cos2bw_cfg’ at drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c:133:3: include/linux/fortify-string.h:592:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 592 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The field group is already defined and starts at queue_id: struct bnxt_cos2bw_cfg { u8 pad[3]; struct_group_attr(cfg, __packed, u8 queue_id; __le32 min_bw; Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230727190726.1859515-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-21bnxt_en: Share the bar0 address with the RoCE driverChandramohan Akula2-1/+2
Add a parameter in the bnxt_en_dev structure to share the bar0 address with RoCE driver. Link: https://lore.kernel.org/r/1689742977-9128-3-git-send-email-selvin.xavier@broadcom.com CC: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21bnxt_en: Update HW interface headersChandramohan Akula1-0/+54
Updating the HW structures for the doorbell pacing related information. Newly added interface structures will be used in the followup patches. Link: https://lore.kernel.org/r/1689742977-9128-2-git-send-email-selvin.xavier@broadcom.com CC: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21eth: bnxt: handle invalid Tx completions more gracefullyJakub Kicinski3-1/+31
Invalid Tx completions should never happen (tm) but when they do they crash the host, because driver blindly trusts that there is a valid skb pointer on the ring. The completions I've seen appear to be some form of FW / HW miscalculation or staleness, they have typical (small) values (<100), but they are most often higher than number of queued descriptors. They usually happen after boot. Instead of crashing, print a warning and schedule a reset. Link: https://lore.kernel.org/r/20230720010440.1967136-4-kuba@kernel.org Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-21eth: bnxt: take the bit to set as argument of bnxt_queue_sp_work()Jakub Kicinski1-35/+26
Most callers of bnxt_queue_sp_work() set a bit to indicate what work to perform right before calling it. Pass it to the function instead. Link: https://lore.kernel.org/r/20230720010440.1967136-3-kuba@kernel.org Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>