summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)AuthorFilesLines
2022-11-24net/mlx5e: MACsec, fix update Rx secure channel active fieldRaed Salem1-1/+1
The main functionality for this operation is to update the active state of the Rx security channel (SC) if the new active setting is different from the current active state of this Rx SC, however the relevant active state check is done post updating the current active state to match the new active state, effectively blocks any offload state update for the Rx SC in question. Fix by delay the assignment to be post the relevant check. Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Emeel Hakim <ehakim@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-24net/mlx5e: MACsec, fix memory leak when MACsec device is deletedRaed Salem1-44/+33
When the MACsec netdevice is deleted, all related Rx/Tx HW/SW states should be released/deallocated, however currently part of the Rx security channel association data is not cleaned properly, hence the memory leaks. Fix by make sure all related Rx Sc resources are cleaned/freed, while at it improve code by grouping release SC context in a function so it can be used in both delete MACsec device and delete Rx SC operations. Fixes: 5a39816a75e5 ("net/mlx5e: Add MACsec offload SecY support") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Emeel Hakim <ehakim@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-24net/mlx5e: MACsec, fix RX data path 16 RX security channel limitRaed Salem3-7/+14
Currently the data path metadata flow id mask wrongly limits the number of different RX security channels (SC) to 16, whereas in adding RX SC the limit is "2^16 - 1" this cause an overlap in metadata flow id once more than 16 RX SCs is added, this corrupts MACsec RX offloaded flow handling. Fix by using the correct mask, while at it improve code to use this mask when adding the Rx rule and improve visibility of such errors by adding debug massage. Fixes: b7c9400cbc48 ("net/mlx5e: Implement MACsec Rx data path using MACsec skb_metadata_dst") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Emeel Hakim <ehakim@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-24net/mlx5e: Use kvfree() in mlx5e_accel_fs_tcp_create()YueHaibing1-2/+2
'accel_tcp' is allocated by kvzalloc(), which should freed by kvfree(). Fixes: f52f2faee581 ("net/mlx5e: Introduce flow steering API") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-24net/mlx5e: Fix a couple error codesDan Carpenter1-4/+8
If kvzalloc() fails then return -ENOMEM. Don't return success. Fixes: 3b20949cb21b ("net/mlx5e: Add MACsec RX steering rules") Fixes: e467b283ffd5 ("net/mlx5e: Add MACsec TX steering rules") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-24net/mlx5e: Fix use-after-free when reverting termination tableRoi Dayan1-0/+2
When having multiple dests with termination tables and second one or afterwards fails the driver reverts usage of term tables but doesn't reset the assignment in attr->dests[num_vport_dests].termtbl which case a use-after-free when releasing the rule. Fix by resetting the assignment of termtbl to null. Fixes: 10caabdaad5a ("net/mlx5e: Use termination table for VLAN push actions") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-24net/mlx5: Fix uninitialized variable bug in outlen_write()YueHaibing1-2/+2
If sscanf() return 0, outlen is uninitialized and used in kzalloc(), this is unexpected. We should return -EINVAL if the string is invalid. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-24net/mlx5: E-switch, Fix duplicate lag creationChris Mi2-2/+11
If creating bond first and then enabling sriov in switchdev mode, will hit the following syndrome: mlx5_core 0000:08:00.0: mlx5_cmd_out_err:778:(pid 25543): CREATE_LAG(0x840) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x7d49cb), err(-22) The reason is because the offending patch removes eswitch mode none. In vf lag, the checking of eswitch mode none is replaced by checking if sriov is enabled. But when driver enables sriov, it triggers the bond workqueue task first and then setting sriov number in pci_enable_sriov(). So the check fails. Fix it by checking if sriov is enabled using eswitch internal counter that is set before triggering the bond workqueue task. Fixes: f019679ea5f2 ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-24net/mlx5: E-switch, Destroy legacy fdb table when neededChris Mi2-0/+10
The cited commit removes eswitch mode none. But when disabling sriov in legacy mode or changing from switchdev to legacy mode without sriov enabled, the legacy fdb table is not destroyed. It is not the right behavior. Destroy legacy fdb table in above two caes. Fixes: f019679ea5f2 ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-24net/mlx5: DR, Fix uninitialized var warningYueHaibing1-1/+4
Smatch warns this: drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c:81 mlx5dr_table_set_miss_action() error: uninitialized symbol 'ret'. Initializing ret with -EOPNOTSUPP and fix missing action case. Fixes: 7838e1725394 ("net/mlx5: DR, Expose steering table functionality") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-24octeontx2-pf: Add check for devm_kcallocJiasheng Jiang1-0/+2
As the devm_kcalloc may return NULL pointer, it should be better to add check for the return value, as same as the others. Fixes: e8e095b3b370 ("octeontx2-af: cn10k: Bandwidth profiles config support") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20221122055449.31247-1-jiasheng@iscas.ac.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-24net: enetc: preserve TX ring priority across reconfigurationVladimir Oltean3-11/+19
In the blamed commit, a rudimentary reallocation procedure for RX buffer descriptors was implemented, for the situation when their format changes between normal (no PTP) and extended (PTP). enetc_hwtstamp_set() calls enetc_close() and enetc_open() in a sequence, and this sequence loses information which was previously configured in the TX BDR Mode Register, specifically via the enetc_set_bdr_prio() call. The TX ring priority is configured by tc-mqprio and tc-taprio, and affects important things for TSN such as the TX time of packets. The issue manifests itself most visibly by the fact that isochron --txtime reports premature packet transmissions when PTP is first enabled on an enetc interface. Save the TX ring priority in a new field in struct enetc_bdr (occupies a 2 byte hole on arm64) in order to make this survive a ring reconfiguration. Fixes: 434cebabd3a2 ("enetc: Add dynamic allocation of extended Rx BD rings") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Link: https://lore.kernel.org/r/20221122130936.1704151-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-24net: marvell: prestera: add missing unregister_netdev() in ↵Zhang Changzhong1-0/+1
prestera_port_create() If prestera_port_sfp_bind() fails, unregister_netdev() should be called in error handling path. Compile tested only. Fixes: 52323ef75414 ("net: marvell: prestera: add phylink support") Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/1669115432-36841-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-23e100: Fix possible use after free in e100_xmit_prepareWang Hai1-4/+1
In e100_xmit_prepare(), if we can't map the skb, then return -ENOMEM, so e100_xmit_frame() will return NETDEV_TX_BUSY and the upper layer will resend the skb. But the skb is already freed, which will cause UAF bug when the upper layer resends the skb. Remove the harmful free. Fixes: 5e5d49422dfb ("e100: Release skb when DMA mapping is failed in e100_xmit_prepare") Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-23iavf: Fix error handling in iavf_init_module()Yuan Can1-1/+8
The iavf_init_module() won't destroy workqueue when pci_register_driver() failed. Call destroy_workqueue() when pci_register_driver() failed to prevent the resource leak. Similar to the handling of u132_hcd_init in commit f276e002793c ("usb: u132-hcd: fix resource leak") Fixes: 2803b16c10ea ("i40e/i40evf: Use private workqueue") Signed-off-by: Yuan Can <yuancan@huawei.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-23fm10k: Fix error handling in fm10k_init_module()Yuan Can1-1/+9
A problem about modprobe fm10k failed is triggered with the following log given: Intel(R) Ethernet Switch Host Interface Driver Copyright(c) 2013 - 2019 Intel Corporation. debugfs: Directory 'fm10k' with parent '/' already present! The reason is that fm10k_init_module() returns fm10k_register_pci_driver() directly without checking its return value, if fm10k_register_pci_driver() failed, it returns without removing debugfs and destroy workqueue, resulting the debugfs of fm10k can never be created later and leaks the workqueue. fm10k_init_module() alloc_workqueue() fm10k_dbg_init() # create debugfs fm10k_register_pci_driver() pci_register_driver() driver_register() bus_add_driver() priv = kzalloc(...) # OOM happened # return without remove debugfs and destroy workqueue Fix by remove debugfs and destroy workqueue when fm10k_register_pci_driver() returns error. Fixes: 7461fd913afe ("fm10k: Add support for debugfs") Fixes: b382bb1b3e2d ("fm10k: use separate workqueue for fm10k driver") Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-23i40e: Fix error handling in i40e_init_module()Shang XiaoJing1-1/+10
i40e_init_module() won't free the debugfs directory created by i40e_dbg_init() when pci_register_driver() failed. Add fail path to call i40e_dbg_exit() to remove the debugfs entries to prevent the bug. i40e: Intel(R) Ethernet Connection XL710 Network Driver i40e: Copyright (c) 2013 - 2019 Intel Corporation. debugfs: Directory 'i40e' with parent '/' already present! Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-23ixgbevf: Fix resource leak in ixgbevf_init_module()Shang XiaoJing1-1/+9
ixgbevf_init_module() won't destroy the workqueue created by create_singlethread_workqueue() when pci_register_driver() failed. Add destroy_workqueue() in fail path to prevent the resource leak. Similar to the handling of u132_hcd_init in commit f276e002793c ("usb: u132-hcd: fix resource leak") Fixes: 40a13e2493c9 ("ixgbevf: Use a private workqueue to avoid certain possible hangs") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Reviewed-by: Saeed Mahameed <saeed@kernel.org> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-23net: dm9051: Fix missing dev_kfree_skb() in dm9051_loop_rx()Yuan Can1-1/+3
The dm9051_loop_rx() returns without release skb when dm9051_stop_mrcmd() returns error, free the skb to avoid this leak. Fixes: 2dc95a4d30ed ("net: Add dm9051 driver") Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-23Merge tag 'mlx5-fixes-2022-11-21' of ↵Jakub Kicinski17-118/+280
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2022-11-21 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2022-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Fix possible race condition in macsec extended packet number update routine net/mlx5e: Fix MACsec update SecY net/mlx5e: Fix MACsec SA initialization routine net/mlx5e: Remove leftovers from old XSK queues enumeration net/mlx5e: Offload rule only when all encaps are valid net/mlx5e: Fix missing alignment in size of MTT/KLM entries net/mlx5: Fix sync reset event handler error flow net/mlx5: E-Switch, Set correctly vport destination net/mlx5: Lag, avoid lockdep warnings net/mlx5: Fix handling of entry refcount when command is not issued to FW net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint net/mlx5: SF: Fix probing active SFs during driver probe phase net/mlx5: Fix FW tracer timestamp calculation net/mlx5: Do not query pci info while pci disabled ==================== Link: https://lore.kernel.org/r/20221122022559.89459-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-23net: ethernet: mtk_eth_soc: fix memory leak in error pathYan Cangang3-6/+23
In mtk_ppe_init(), when dmam_alloc_coherent() or devm_kzalloc() failed, the rhashtable ppe->l2_flows isn't destroyed. Fix it. In mtk_probe(), when mtk_ppe_init() or mtk_eth_offload_init() or register_netdev() failed, have the same problem. Fix it. Fixes: 33fc42de3327 ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries") Signed-off-by: Yan Cangang <nalanzeyu@gmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-23net: ethernet: mtk_eth_soc: fix resource leak in error pathYan Cangang1-2/+2
In mtk_probe(), when mtk_ppe_init() or mtk_eth_offload_init() failed, mtk_mdio_cleanup() isn't called. Fix it. Fixes: ba37b7caf1ed ("net: ethernet: mtk_eth_soc: add support for initializing the PPE") Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support") Signed-off-by: Yan Cangang <nalanzeyu@gmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-23net: ethernet: mtk_eth_soc: fix potential memory leak in mtk_rx_alloc()Ziyang Xuan1-1/+3
When fail to dma_map_single() in mtk_rx_alloc(), it returns directly. But the memory allocated for local variable data is not freed, and local variabel data has not been attached to ring->data[i] yet, so the memory allocated for local variable data will not be freed outside mtk_rx_alloc() too. Thus memory leak would occur in this scenario. Add skb_free_frag(data) when dma_map_single() failed. Fixes: 23233e577ef9 ("net: ethernet: mtk_eth_soc: rely on page_pool for single page buffers") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Link: https://lore.kernel.org/r/20221120035405.1464341-1-william.xuanziyang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-22tsnep: Fix rotten packetsGerhard Engleder1-1/+56
If PTP synchronisation is done every second, then sporadic the interval is higher than one second: ptp4l[696.582]: master offset -17 s2 freq -1891 path delay 573 ptp4l[697.582]: master offset -22 s2 freq -1901 path delay 573 ptp4l[699.368]: master offset -1 s2 freq -1887 path delay 573 ^^^^^^^ Should be 698.582! This problem is caused by rotten packets, which are received after polling but before interrupts are enabled again. This can be fixed by checking for pending work and rescheduling if necessary after interrupts has been enabled again. Fixes: 403f69bbdbad ("tsnep: Add TSN endpoint Ethernet MAC driver") Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Link: https://lore.kernel.org/r/20221119211825.81805-1-gerhard@engleder-embedded.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22octeontx2-pf: Remove duplicate MACSEC settingZheng Bin1-1/+0
Commit 4581dd480c9e ("net: octeontx2-pf: mcs: consider MACSEC setting") has already added "depends on MACSEC || !MACSEC", so remove it. Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20221119133616.3583538-1-zhengbin13@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22bnx2x: fix pci device refcount leak in bnx2x_vf_is_pcie_pending()Yang Yingliang1-4/+8
As comment of pci_get_domain_bus_and_slot() says, it returns a pci device with refcount increment, when finish using it, the caller must decrement the reference count by calling pci_dev_put(). Call pci_dev_put() before returning from bnx2x_vf_is_pcie_pending() to avoid refcount leak. Fixes: b56e9670ffa4 ("bnx2x: Prepare device and initialize VF database") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20221119070202.1407648-1-yangyingliang@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22net: sparx5: fix error handling in sparx5_port_open()Liu Jian1-2/+12
If phylink_of_phy_connect() fails, the port should be disabled. If sparx5_serdes_set()/phy_power_on() fails, the port should be disabled and the phylink should be stopped and disconnected. Fixes: 946e7fd5053a ("net: sparx5: add port module support") Fixes: f3cad2611a77 ("net: sparx5: add hostmode with phylink support") Signed-off-by: Liu Jian <liujian56@huawei.com> Tested-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Reviewed-by: Steen Hegelund <steen.hegelund@microchip.com> Link: https://lore.kernel.org/r/20221117125918.203997-1-liujian56@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22sfc: fix potential memleak in __ef100_hard_start_xmit()Zhang Changzhong1-0/+1
The __ef100_hard_start_xmit() returns NETDEV_TX_OK without freeing skb in error handling case, add dev_kfree_skb_any() to fix it. Fixes: 51b35a454efd ("sfc: skeleton EF100 PF driver") Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/1668671409-10909-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22ice: fix handling of burst Tx timestampsJacob Keller2-16/+16
Commit 1229b33973c7 ("ice: Add low latency Tx timestamp read") refactored PTP timestamping logic to use a threaded IRQ instead of a separate kthread. This implementation introduced ice_misc_intr_thread_fn and redefined the ice_ptp_process_ts function interface to return a value of whether or not the timestamp processing was complete. ice_misc_intr_thread_fn would take the return value from ice_ptp_process_ts and convert it into either IRQ_HANDLED if there were no more timestamps to be processed, or IRQ_WAKE_THREAD if the thread should continue processing. This is not correct, as the kernel does not re-schedule threaded IRQ functions automatically. IRQ_WAKE_THREAD can only be used by the main IRQ function. This results in the ice_ptp_process_ts function (and in turn the ice_ptp_tx_tstamp function) from only being called exactly once per interrupt. If an application sends a burst of Tx timestamps without waiting for a response, the interrupt will trigger for the first timestamp. However, later timestamps may not have arrived yet. This can result in dropped or discarded timestamps. Worse, on E822 hardware this results in the interrupt logic getting stuck such that no future interrupts will be triggered. The result is complete loss of Tx timestamp functionality. Fix this by modifying the ice_misc_intr_thread_fn to perform its own polling of the ice_ptp_process_ts function. We sleep for a few microseconds between attempts to avoid wasting significant CPU time. The value was chosen to allow time for the Tx timestamps to complete without wasting so much time that we overrun application wait budgets in the worst case. The ice_ptp_process_ts function also currently returns false in the event that the Tx tracker is not initialized. This would result in the threaded IRQ handler never exiting if it gets started while the tracker is not initialized. Fix the function to appropriately return true when the tracker is not initialized. Note that this will not reproduce with default ptp4l behavior, as the program always synchronously waits for a timestamp response before sending another timestamp request. Reported-by: Siddaraju DH <siddaraju.dh@intel.com> Fixes: 1229b33973c7 ("ice: Add low latency Tx timestamp read") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20221118222729.1565317-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-22Merge branch '40GbE' of ↵Jakub Kicinski2-19/+23
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-11-18 (iavf) Ivan Vecera resolves issues related to reset by adding back call to netif_tx_stop_all_queues() and adding calls to dev_close() to ensure device is properly closed during reset. Stefan Assmann removes waiting for setting of MAC address as this breaks ARP. Slawomir adds setting of __IAVF_IN_REMOVE_TASK bit to prevent deadlock between remove and shutdown. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: Fix race condition between iavf_shutdown and iavf_remove iavf: remove INITIAL_MAC_SET to allow gARP to work properly iavf: Do not restart Tx queues after reset task failure iavf: Fix a crash during reset task ==================== Link: https://lore.kernel.org/r/20221118222439.1565245-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-22net/mlx5e: Fix possible race condition in macsec extended packet number ↵Emeel Hakim1-0/+3
update routine Currenty extended packet number (EPN) update routine is accessing macsec object without holding the general macsec lock hence facing a possible race condition when an EPN update occurs while updating or deleting the SA. Fix by holding the general macsec lock before accessing the object. Fixes: 4411a6c0abd3 ("net/mlx5e: Support MACsec offload extended packet number (EPN)") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5e: Fix MACsec update SecYEmeel Hakim1-1/+1
Currently updating SecY destroys and re-creates RX SA objects, the re-created RX SA objects are not identical to the destroyed objects and it disagree on the encryption enabled property which holds the value false after recreation, this value is not supported with offload which leads to no traffic after an update. Fix by recreating an identical objects. Fixes: 5a39816a75e5 ("net/mlx5e: Add MACsec offload SecY support") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5e: Fix MACsec SA initialization routineEmeel Hakim1-7/+7
Currently as part of MACsec SA initialization routine extended packet number (EPN) object attribute is always being set without checking if EPN is actually enabled, the above could lead to a NULL dereference. Fix by adding such a check. Fixes: 4411a6c0abd3 ("net/mlx5e: Support MACsec offload extended packet number (EPN)") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5e: Remove leftovers from old XSK queues enumerationTariq Toukan1-18/+0
Before the cited commit, for N channels, a dedicated set of N queues was created to support XSK, in indices [N, 2N-1], doubling the number of queues. In addition, changing the number of channels was prohibited, as it would shift the indices. Remove these two leftovers, as we moved XSK to a new queueing scheme, starting from index 0. Fixes: 3db4c85cde7a ("net/mlx5e: xsk: Use queue indices starting from 0 for XSK queues") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5e: Offload rule only when all encaps are validChris Mi3-17/+9
The cited commit adds a for loop to support multiple encapsulations. But it only checks if the last encap is valid. Fix it by setting slow path flag when one of the encap is invalid. Fixes: f493f15534ec ("net/mlx5e: Move flow attr reformat action bit to per dest flags") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5e: Fix missing alignment in size of MTT/KLM entriesTariq Toukan1-2/+3
In the cited patch, an alignment required by the HW spec was mistakenly dropped. Bring it back to fix error completions like the below: mlx5_core 0000:00:08.0 eth2: Error cqe on cqn 0x40b, ci 0x0, qn 0x104f, opcode 0xd, syndrome 0x2, vendor syndrome 0x68 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000030: 00 00 00 00 86 00 68 02 25 00 10 4f 00 00 bb d2 WQE DUMP: WQ size 1024 WQ cur size 0, WQE index 0x0, len: 192 00000000: 00 00 00 25 00 10 4f 0c 00 00 00 00 00 18 2e 00 00000010: 90 00 00 00 00 02 00 00 00 00 00 00 20 00 00 00 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000080: 08 00 00 00 48 6a 00 02 08 00 00 00 0e 10 00 02 00000090: 08 00 00 00 0c db 00 02 08 00 00 00 0e 82 00 02 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Fixes: 9f123f740428 ("net/mlx5e: Improve MTT/KSM alignment") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5: Fix sync reset event handler error flowMoshe Shemesh1-2/+7
When sync reset now event handling fails on mlx5_pci_link_toggle() then no reset was done. However, since mlx5_cmd_fast_teardown_hca() was already done, the firmware function is closed and the driver is left without firmware functionality. Fix it by setting device error state and reopen the firmware resources. Reopening is done by the thread that was called for devlink reload fw_activate as it already holds the devlink lock. Fixes: 5ec697446f46 ("net/mlx5: Add support for devlink reload action fw activate") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5: E-Switch, Set correctly vport destinationRoi Dayan2-5/+7
The cited commit moved from using reformat_id integer to packet_reformat pointer which introduced the possibility to null pointer dereference. When setting packet reformat flag and pkt_reformat pointer must exists so checking MLX5_ESW_DEST_ENCAP is not enough, we need to make sure the pkt_reformat is valid and check for MLX5_ESW_DEST_ENCAP_VALID. If the dest encap valid flag does not exists then pkt_reformat can be either invalid address or null. Also, to make sure we don't try to access invalid pkt_reformat set it to null when invalidated and invalidate it before calling add flow code as its logically more correct and to be safe. Fixes: 2b688ea5efde ("net/mlx5: Add flow steering actions to fs_cmd shim layer") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Chris Mi <cmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5: Lag, avoid lockdep warningsEli Cohen4-40/+78
ldev->lock is used to serialize lag change operations. Since multiport eswtich functionality was added, we now change the mode dynamically. However, acquiring ldev->lock is not allowed as it could possibly lead to a deadlock as reported by the lockdep mechanism. [ 836.154963] WARNING: possible circular locking dependency detected [ 836.155850] 5.19.0-rc5_net_56b7df2 #1 Not tainted [ 836.156549] ------------------------------------------------------ [ 836.157418] handler1/12198 is trying to acquire lock: [ 836.158178] ffff888187d52b58 (&ldev->lock){+.+.}-{3:3}, at: mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core] [ 836.159575] [ 836.159575] but task is already holding lock: [ 836.160474] ffff8881d4de2930 (&block->cb_lock){++++}-{3:3}, at: tc_setup_cb_add+0x5b/0x200 [ 836.161669] which lock already depends on the new lock. [ 836.162905] [ 836.162905] the existing dependency chain (in reverse order) is: [ 836.164008] -> #3 (&block->cb_lock){++++}-{3:3}: [ 836.164946] down_write+0x25/0x60 [ 836.165548] tcf_block_get_ext+0x1c6/0x5d0 [ 836.166253] ingress_init+0x74/0xa0 [sch_ingress] [ 836.167028] qdisc_create.constprop.0+0x130/0x5e0 [ 836.167805] tc_modify_qdisc+0x481/0x9f0 [ 836.168490] rtnetlink_rcv_msg+0x16e/0x5a0 [ 836.169189] netlink_rcv_skb+0x4e/0xf0 [ 836.169861] netlink_unicast+0x190/0x250 [ 836.170543] netlink_sendmsg+0x243/0x4b0 [ 836.171226] sock_sendmsg+0x33/0x40 [ 836.171860] ____sys_sendmsg+0x1d1/0x1f0 [ 836.172535] ___sys_sendmsg+0xab/0xf0 [ 836.173183] __sys_sendmsg+0x51/0x90 [ 836.173836] do_syscall_64+0x3d/0x90 [ 836.174471] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 836.175282] [ 836.175282] -> #2 (rtnl_mutex){+.+.}-{3:3}: [ 836.176190] __mutex_lock+0x6b/0xf80 [ 836.176830] register_netdevice_notifier+0x21/0x120 [ 836.177631] rtnetlink_init+0x2d/0x1e9 [ 836.178289] netlink_proto_init+0x163/0x179 [ 836.178994] do_one_initcall+0x63/0x300 [ 836.179672] kernel_init_freeable+0x2cb/0x31b [ 836.180403] kernel_init+0x17/0x140 [ 836.181035] ret_from_fork+0x1f/0x30 [ 836.181687] -> #1 (pernet_ops_rwsem){+.+.}-{3:3}: [ 836.182628] down_write+0x25/0x60 [ 836.183235] unregister_netdevice_notifier+0x1c/0xb0 [ 836.184029] mlx5_ib_roce_cleanup+0x94/0x120 [mlx5_ib] [ 836.184855] __mlx5_ib_remove+0x35/0x60 [mlx5_ib] [ 836.185637] mlx5_eswitch_unregister_vport_reps+0x22f/0x440 [mlx5_core] [ 836.186698] auxiliary_bus_remove+0x18/0x30 [ 836.187409] device_release_driver_internal+0x1f6/0x270 [ 836.188253] bus_remove_device+0xef/0x160 [ 836.188939] device_del+0x18b/0x3f0 [ 836.189562] mlx5_rescan_drivers_locked+0xd6/0x2d0 [mlx5_core] [ 836.190516] mlx5_lag_remove_devices+0x69/0xe0 [mlx5_core] [ 836.191414] mlx5_do_bond_work+0x441/0x620 [mlx5_core] [ 836.192278] process_one_work+0x25c/0x590 [ 836.192963] worker_thread+0x4f/0x3d0 [ 836.193609] kthread+0xcb/0xf0 [ 836.194189] ret_from_fork+0x1f/0x30 [ 836.194826] -> #0 (&ldev->lock){+.+.}-{3:3}: [ 836.195734] __lock_acquire+0x15b8/0x2a10 [ 836.196426] lock_acquire+0xce/0x2d0 [ 836.197057] __mutex_lock+0x6b/0xf80 [ 836.197708] mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core] [ 836.198575] tc_act_parse_mirred+0x25b/0x800 [mlx5_core] [ 836.199467] parse_tc_actions+0x168/0x5a0 [mlx5_core] [ 836.200340] __mlx5e_add_fdb_flow+0x263/0x480 [mlx5_core] [ 836.201241] mlx5e_configure_flower+0x8a0/0x1820 [mlx5_core] [ 836.202187] tc_setup_cb_add+0xd7/0x200 [ 836.202856] fl_hw_replace_filter+0x14c/0x1f0 [cls_flower] [ 836.203739] fl_change+0xbbe/0x1730 [cls_flower] [ 836.204501] tc_new_tfilter+0x407/0xd90 [ 836.205168] rtnetlink_rcv_msg+0x406/0x5a0 [ 836.205877] netlink_rcv_skb+0x4e/0xf0 [ 836.206535] netlink_unicast+0x190/0x250 [ 836.207217] netlink_sendmsg+0x243/0x4b0 [ 836.207915] sock_sendmsg+0x33/0x40 [ 836.208538] ____sys_sendmsg+0x1d1/0x1f0 [ 836.209219] ___sys_sendmsg+0xab/0xf0 [ 836.209878] __sys_sendmsg+0x51/0x90 [ 836.210510] do_syscall_64+0x3d/0x90 [ 836.211137] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 836.211954] other info that might help us debug this: [ 836.213174] Chain exists of: [ 836.213174] &ldev->lock --> rtnl_mutex --> &block->cb_lock 836.214650] Possible unsafe locking scenario: [ 836.214650] [ 836.215574] CPU0 CPU1 [ 836.216255] ---- ---- [ 836.216943] lock(&block->cb_lock); [ 836.217518] lock(rtnl_mutex); [ 836.218348] lock(&block->cb_lock); [ 836.219212] lock(&ldev->lock); [ 836.219758] [ 836.219758] *** DEADLOCK *** [ 836.219758] [ 836.220747] 2 locks held by handler1/12198: [ 836.221390] #0: ffff8881d4de2930 (&block->cb_lock){++++}-{3:3}, at: tc_setup_cb_add+0x5b/0x200 [ 836.222646] #1: ffff88810c9a92c0 (&esw->mode_lock){++++}-{3:3}, at: mlx5_esw_hold+0x39/0x50 [mlx5_core] [ 836.224063] stack backtrace: [ 836.224799] CPU: 6 PID: 12198 Comm: handler1 Not tainted 5.19.0-rc5_net_56b7df2 #1 [ 836.225923] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 836.227476] Call Trace: [ 836.227929] <TASK> [ 836.228332] dump_stack_lvl+0x57/0x7d [ 836.228924] check_noncircular+0x104/0x120 [ 836.229562] __lock_acquire+0x15b8/0x2a10 [ 836.230201] lock_acquire+0xce/0x2d0 [ 836.230776] ? mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core] [ 836.231614] ? find_held_lock+0x2b/0x80 [ 836.232221] __mutex_lock+0x6b/0xf80 [ 836.232799] ? mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core] [ 836.233636] ? mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core] [ 836.234451] ? xa_load+0xc3/0x190 [ 836.234995] mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core] [ 836.235803] tc_act_parse_mirred+0x25b/0x800 [mlx5_core] [ 836.236636] ? tc_act_can_offload_mirred+0x135/0x210 [mlx5_core] [ 836.237550] parse_tc_actions+0x168/0x5a0 [mlx5_core] [ 836.238364] __mlx5e_add_fdb_flow+0x263/0x480 [mlx5_core] [ 836.239202] mlx5e_configure_flower+0x8a0/0x1820 [mlx5_core] [ 836.240076] ? lock_acquire+0xce/0x2d0 [ 836.240668] ? tc_setup_cb_add+0x5b/0x200 [ 836.241294] tc_setup_cb_add+0xd7/0x200 [ 836.241917] fl_hw_replace_filter+0x14c/0x1f0 [cls_flower] [ 836.242709] fl_change+0xbbe/0x1730 [cls_flower] [ 836.243408] tc_new_tfilter+0x407/0xd90 [ 836.244043] ? tc_del_tfilter+0x880/0x880 [ 836.244672] rtnetlink_rcv_msg+0x406/0x5a0 [ 836.245310] ? netlink_deliver_tap+0x7a/0x4b0 [ 836.245991] ? if_nlmsg_stats_size+0x2b0/0x2b0 [ 836.246675] netlink_rcv_skb+0x4e/0xf0 [ 836.258046] netlink_unicast+0x190/0x250 [ 836.258669] netlink_sendmsg+0x243/0x4b0 [ 836.259288] sock_sendmsg+0x33/0x40 [ 836.259857] ____sys_sendmsg+0x1d1/0x1f0 [ 836.260473] ___sys_sendmsg+0xab/0xf0 [ 836.261064] ? lock_acquire+0xce/0x2d0 [ 836.261669] ? find_held_lock+0x2b/0x80 [ 836.262272] ? __fget_files+0xb9/0x190 [ 836.262871] ? __fget_files+0xd3/0x190 [ 836.263462] __sys_sendmsg+0x51/0x90 [ 836.264064] do_syscall_64+0x3d/0x90 [ 836.264652] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 836.265425] RIP: 0033:0x7fdbe5e2677d [ 836.266012] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 ba ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 ee ee ff ff 48 [ 836.268485] RSP: 002b:00007fdbe48a75a0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e [ 836.269598] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fdbe5e2677d [ 836.270576] RDX: 0000000000000000 RSI: 00007fdbe48a7640 RDI: 000000000000003c [ 836.271565] RBP: 00007fdbe48a8368 R08: 0000000000000000 R09: 0000000000000000 [ 836.272546] R10: 00007fdbe48a84b0 R11: 0000000000000293 R12: 0000557bd17dc860 [ 836.273527] R13: 0000000000000000 R14: 0000557bd17dc860 R15: 00007fdbe48a7640 [ 836.274521] </TASK> To avoid using mode holding ldev->lock in the configure flow, we queue a work to the lag workqueue and cease wait on a completion object. In addition, we remove the lock from mlx5_lag_do_mirred() since it is not really protecting anything. It should be noted that an actual deadlock has not been observed. Signed-off-by: Eli Cohen <elic@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5: Fix handling of entry refcount when command is not issued to FWMoshe Shemesh1-3/+3
In case command interface is down, or the command is not allowed, driver did not increment the entry refcount, but might have decrement as part of forced completion handling. Fix that by always increment and decrement the refcount to make it symmetric for all flows. Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler") Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reported-by: Jack Wang <jinpu.wang@ionos.com> Tested-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5: cmdif, Print info on any firmware cmd failure to tracepointMoshe Shemesh2-19/+67
While moving to new CMD API (quiet API), some pre-existing flows may call the new API function that in case of error, returns the error instead of printing it as previously done. For such flows we bring back the print but to tracepoint this time for sys admins to have the ability to check for errors especially for commands using the new quiet API. Tracepoint output example: devlink-1333 [001] ..... 822.746922: mlx5_cmd: ACCESS_REG(0x805) op_mod(0x0) failed, status bad resource(0x5), syndrome (0xb06e1f), err(-22) Fixes: f23519e542e5 ("net/mlx5: cmdif, Add new api for command execution") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5: SF: Fix probing active SFs during driver probe phaseShay Drory1-0/+88
When SF devices and SF port representors are located on different functions, unloading and reloading of SF parent driver doesn't recreate the existing SF present in the device. Fix it by querying SFs and probe active SFs during driver probe phase. Fixes: 90d010b8634b ("net/mlx5: SF, Add auxiliary device support") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5: Fix FW tracer timestamp calculationMoshe Shemesh1-1/+1
Fix a bug in calculation of FW tracer timestamp. Decreasing one in the calculation should effect only bits 52_7 and not effect bits 6_0 of the timestamp, otherwise bits 6_0 are always set in this calculation. Fixes: 70dd6fdb8987 ("net/mlx5: FW tracer, parse traces and kernel tracing support") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Feras Daoud <ferasda@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-22net/mlx5: Do not query pci info while pci disabledRoy Novich1-3/+6
The driver should not interact with PCI while PCI is disabled. Trying to do so may result in being unable to get vital signs during PCI reset, driver gets timed out and fails to recover. Fixes: fad1783a6d66 ("net/mlx5: Print more info on pci error handlers") Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-21octeontx2-af: cn10k: mcs: Fix copy and paste bug in mcs_bbe_intr_handler()Dan Carpenter1-1/+1
This code accidentally uses the RX macro twice instead of the RX and TX. Fixes: 6c635f78c474 ("octeontx2-af: cn10k: mcs: Handle MCS block interrupts") Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21net: microchip: sparx5: Fix return value in sparx5_tc_setup_qdisc_ets()Lu Wei1-5/+2
Function sparx5_tc_setup_qdisc_ets() always returns negative value because it return -EOPNOTSUPP in the end. This patch returns the rersult of sparx5_tc_ets_add() and sparx5_tc_ets_del() directly. Fixes: 211225428d65 ("net: microchip: sparx5: add support for offloading ets qdisc") Signed-off-by: Lu Wei <luwei32@huawei.com> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21cnic: don't pass bogus GFP_ flags to dma_alloc_coherentChristoph Hellwig1-4/+2
dma_alloc_coherent is an opaque allocator that only uses the GFP_ flags for allocation context control. Don't pass __GFP_COMP which makes no sense for an allocation that can't in any way be converted to a page pointer. Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-11-19nfp: add port from netdev validation for EEPROM accessJaco Coetzee1-0/+3
Setting of the port flag `NFP_PORT_CHANGED`, introduced to ensure the correct reading of EEPROM data, causes a fatal kernel NULL pointer dereference in cases where the target netdev type cannot be determined. Add validation of port struct pointer before attempting to set the `NFP_PORT_CHANGED` flag. Return that operation is not supported if the netdev type cannot be determined. Fixes: 4ae97cae07e1 ("nfp: ethtool: fix the display error of `ethtool -m DEVNAME`") Signed-off-by: Jaco Coetzee <jaco.coetzee@corigine.com> Reviewed-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-19nfp: fill splittable of devlink_port_attrs correctlyDiana Wang1-1/+1
The error is reflected in that it shows wrong splittable status of port when executing "devlink port show". The reason which leads the error is that the assigned operation of splittable is just a simple negation operation of split and it does not consider port lanes quantity. A splittable port should have several lanes that can be split(lanes quantity > 1). If without the judgement, it will show wrong message for some firmware, such as 2x25G, 2x10G. Fixes: a0f49b548652 ("devlink: Add a new devlink port split ability attribute and pass to netlink") Signed-off-by: Diana Wang <na.wang@corigine.com> Reviewed-by: Louis Peens <louis.peens@corigine.com> Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-19net: pch_gbe: fix pci device refcount leak while module exitingYang Yingliang1-1/+4
As comment of pci_get_domain_bus_and_slot() says, it returns a pci device with refcount increment, when finish using it, the caller must decrement the reference count by calling pci_dev_put(). In pch_gbe_probe(), pci_get_domain_bus_and_slot() is called, so in error path in probe() and remove() function, pci_dev_put() should be called to avoid refcount leak. Compile tested only. Fixes: 1a0bdadb4e36 ("net/pch_gbe: supports eg20t ptp clock") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221117135148.301014-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>