summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/mellanox
AgeCommit message (Collapse)AuthorFilesLines
2022-01-05net/mlx5e: Fix wrong features assignment in case of errorGal Pressman1-6/+5
[ Upstream commit 992d8a4e38f0527f24e273ce3a9cd6dea1a6a436 ] In case of an error in mlx5e_set_features(), 'netdev->features' must be updated with the correct state of the device to indicate which features were updated successfully. To do that we maintain a copy of 'netdev->features' and update it after successful feature changes, so we can assign it to back to 'netdev->features' if needed. However, since not all netdev features are handled by the driver (e.g. GRO/TSO/etc), some features may not be updated correctly in case of an error updating another feature. For example, while requesting to disable TSO (feature which is not handled by the driver) and enable HW-GRO, if an error occurs during HW-GRO enable, 'oper_features' will be assigned with 'netdev->features' and HW-GRO turned off. TSO will remain enabled in such case, which is a bug. To solve that, instead of using 'netdev->features' as the baseline of 'oper_features' and changing it on set feature success, use 'features' instead and update it in case of errors. Fixes: 75b81ce719b7 ("net/mlx5e: Don't override netdev features field unless in error flow") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-22net/mlx4_en: Update reported link modes for 1/10GErik Ekman1-3/+3
[ Upstream commit 2191b1dfef7d45f44b5008d2148676d9f2c82874 ] When link modes were initially added in commit 2c762679435dc ("net/mlx4_en: Use PTYS register to query ethtool settings") and later updated for the new ethtool API in commit 3d8f7cc78d0eb ("net: mlx4: use new ETHTOOL_G/SSETTINGS API") the only 1/10G non-baseT link modes configured were 1000baseKX, 10000baseKX4 and 10000baseKR. It looks like these got picked to represent other modes since nothing better was available. Switch to using more specific link modes added in commit 5711a98221443 ("net: ethtool: add support for 1000BaseX and missing 10G link modes"). Tested with MCX311A-XCAT connected via DAC. Before: % sudo ethtool enp3s0 Settings for enp3s0: Supported ports: [ FIBRE ] Supported link modes: 1000baseKX/Full 10000baseKR/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: No Supported FEC modes: Not reported Advertised link modes: 1000baseKX/Full 10000baseKR/Full Advertised pause frame use: Symmetric Advertised auto-negotiation: No Advertised FEC modes: Not reported Speed: 10000Mb/s Duplex: Full Auto-negotiation: off Port: Direct Attach Copper PHYAD: 0 Transceiver: internal Supports Wake-on: d Wake-on: d Current message level: 0x00000014 (20) link ifdown Link detected: yes With this change: % sudo ethtool enp3s0 Settings for enp3s0: Supported ports: [ FIBRE ] Supported link modes: 1000baseX/Full 10000baseCR/Full 10000baseSR/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: No Supported FEC modes: Not reported Advertised link modes: 1000baseX/Full 10000baseCR/Full 10000baseSR/Full Advertised pause frame use: Symmetric Advertised auto-negotiation: No Advertised FEC modes: Not reported Speed: 10000Mb/s Duplex: Full Auto-negotiation: off Port: Direct Attach Copper PHYAD: 0 Transceiver: internal Supports Wake-on: d Wake-on: d Current message level: 0x00000014 (20) link ifdown Link detected: yes Tested-by: Michael Stapelberg <michael@stapelberg.ch> Signed-off-by: Erik Ekman <erik@kryo.se> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-08net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()Zhou Qingyang1-2/+7
commit addad7643142f500080417dd7272f49b7a185570 upstream. In mlx4_en_try_alloc_resources(), mlx4_en_copy_priv() is called and tmp->tx_cq will be freed on the error path of mlx4_en_copy_priv(). After that mlx4_en_alloc_resources() is called and there is a dereference of &tmp->tx_cq[t][i] in mlx4_en_alloc_resources(), which could lead to a use after free problem on failure of mlx4_en_copy_priv(). Fix this bug by adding a check of mlx4_en_copy_priv() This bug was found by a static analyzer. The analysis employs differential checking to identify inconsistent security operations (e.g., checks or kfrees) between two code paths and confirms that the inconsistent operations are not recovered in the current function or the callers, so they constitute bugs. Note that, as a bug found by static analysis, it can be a false positive or hard to trigger. Multiple researchers have cross-reviewed the bug. Builds with CONFIG_MLX4_EN=m show no new warnings, and our static analyzer no longer warns about this code. Fixes: ec25bc04ed8e ("net/mlx4_en: Add resilience in low memory systems") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20211130164438.190591-1-zhou1615@umn.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-06net/mlx4_en: Don't allow aRFS for encapsulated packetsAya Levin1-0/+3
[ Upstream commit fdbccea419dc782079ce5881d2705cc9e3881480 ] Driver doesn't support aRFS for encapsulated packets, return early error in such a case. Fixes: 1eb8c695bda9 ("net/mlx4_en: Add accelerated RFS support") Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-22net/mlx5: Fix potential sleeping in atomic contextMaor Gottlieb1-3/+2
commit ee27e330a953595903979ffdb84926843595a9fe upstream. Fixes the below flow of sleeping in atomic context by releasing the RCU lock before calling to free_match_list. build_match_list() <- disables preempt -> free_match_list() -> tree_put_node() -> down_write_ref_node() <- take write lock Fixes: 693c6883bbc4 ("net/mlx5: Add hash table for flow groups in flow table") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-04net/mlx5: Fix flow table chainingMaor Gottlieb1-4/+6
[ Upstream commit 8b54874ef1617185048029a3083d510569e93751 ] Fix a bug when flow table is created in priority that already has other flow tables as shown in the below diagram. If the new flow table (FT-B) has the lowest level in the priority, we need to connect the flow tables from the previous priority (p0) to this new table. In addition when this flow table is destroyed (FT-B), we need to connect the flow tables from the previous priority (p0) to the next level flow table (FT-C) in the same priority of the destroyed table (if exists). --------- |root_ns| --------- | -------------------------------- | | | ---------- ---------- --------- |p(prio)-x| | p-y | | p-n | ---------- ---------- --------- | | ---------------- ------------------ |ns(e.g bypass)| |ns(e.g. kernel) | ---------------- ------------------ | | | ------- ------ ---- | p0 | | p1 | |p2| ------- ------ ---- | | \ -------- ------- ------ | FT-A | |FT-B | |FT-C| -------- ------- ------ Fixes: f90edfd279f3 ("net/mlx5_core: Connect flow tables") Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-04mlx4: Fix missing error code in mlx4_load_one()Jiapeng Chong1-0/+1
[ Upstream commit 7e4960b3d66d7248b23de3251118147812b42da2 ] The error code is missing in this code scenario, add the error code '-EINVAL' to the return value 'err'. Eliminate the follow smatch warning: drivers/net/ethernet/mellanox/mlx4/main.c:3538 mlx4_load_one() warn: missing error code 'err'. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Fixes: 7ae0e400cd93 ("net/mlx4_core: Flexible (asymmetric) allocation of EQs and MSI-X vectors for PF/VFs") Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-30net/mlx5e: Block offload of outer header csum for UDP tunnelsAya Levin1-5/+2
[ Upstream commit 6d6727dddc7f93fcc155cb8d0c49c29ae0e71122 ] The device is able to offload either the outer header csum or inner header csum. The driver utilizes the inner csum offload. Hence, block setting of tx-udp_tnl-csum-segmentation and set it to off[fixed]. Fixes: b49663c8fb49 ("net/mlx5e: Add support for UDP tunnel segmentation with outer checksum offload") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-30net/mlx5e: Remove dependency in IPsec initialization flowsHuy Nguyen1-3/+0
[ Upstream commit 8ad893e516a77209a1818a2072d2027d87db809f ] Currently, IPsec feature is disabled because mlx5e_build_nic_netdev is required to be called after mlx5e_ipsec_init. This requirement is invalid as mlx5e_build_nic_netdev and mlx5e_ipsec_init initialize independent resources. Remove ipsec pointer check in mlx5e_build_nic_netdev so that the two functions can be called at any order. Fixes: 547eede070eb ("net/mlx5e: IPSec, Innova IPSec offload infrastructure") Signed-off-by: Huy Nguyen <huyn@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-16RDMA/mlx4: Do not map the core_clock page to user space unless enabledShay Drory3-0/+10
commit 404e5a12691fe797486475fe28cc0b80cb8bef2c upstream. Currently when mlx4 maps the hca_core_clock page to the user space there are read-modifiable registers, one of which is semaphore, on this page as well as the clock counter. If user reads the wrong offset, it can modify the semaphore and hang the device. Do not map the hca_core_clock page to the user space unless the device has been put in a backwards compatibility mode to support this feature. After this patch, mlx4 core_clock won't be mapped to user space on the majority of existing devices and the uverbs device time feature in ibv_query_rt_values_ex() will be disabled. Fixes: 52033cfb5aab ("IB/mlx4: Add mmap call to map the hardware clock") Link: https://lore.kernel.org/r/9632304e0d6790af84b3b706d8c18732bc0d5e27.1622726305.git.leonro@nvidia.com Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03drivers/net/ethernet: clean up unused assignmentsJesse Brandeburg1-1/+1
commit 7c8c0291f84027558bd5fca5729cbcf288c510f4 upstream. As part of the W=1 compliation series, these lines all created warnings about unused variables that were assigned a value. Most of them are from register reads, but some are just picking up a return value from a function and never doing anything with it. Fixed warnings: .../ethernet/brocade/bna/bnad.c:3280:6: warning: variable ‘rx_count’ set but not used [-Wunused-but-set-variable] .../ethernet/brocade/bna/bnad.c:3280:6: warning: variable ‘rx_count’ set but not used [-Wunused-but-set-variable] .../ethernet/cortina/gemini.c:512:6: warning: variable ‘val’ set but not used [-Wunused-but-set-variable] .../ethernet/cortina/gemini.c:2110:21: warning: variable ‘config0’ set but not used [-Wunused-but-set-variable] .../ethernet/cavium/liquidio/octeon_device.c:1327:6: warning: variable ‘val32’ set but not used [-Wunused-but-set-variable] .../ethernet/cavium/liquidio/octeon_device.c:1358:6: warning: variable ‘val32’ set but not used [-Wunused-but-set-variable] .../ethernet/dec/tulip/media.c:322:8: warning: variable ‘setup’ set but not used [-Wunused-but-set-variable] .../ethernet/dec/tulip/de4x5.c:4928:13: warning: variable ‘r3’ set but not used [-Wunused-but-set-variable] .../ethernet/micrel/ksz884x.c:1652:7: warning: variable ‘dummy’ set but not used [-Wunused-but-set-variable] .../ethernet/micrel/ksz884x.c:1652:7: warning: variable ‘dummy’ set but not used [-Wunused-but-set-variable] .../ethernet/micrel/ksz884x.c:1652:7: warning: variable ‘dummy’ set but not used [-Wunused-but-set-variable] .../ethernet/micrel/ksz884x.c:1652:7: warning: variable ‘dummy’ set but not used [-Wunused-but-set-variable] .../ethernet/micrel/ksz884x.c:4981:6: warning: variable ‘rx_status’ set but not used [-Wunused-but-set-variable] .../ethernet/micrel/ksz884x.c:6510:6: warning: variable ‘rc’ set but not used [-Wunused-but-set-variable] .../ethernet/micrel/ksz884x.c:6087: warning: cannot understand function prototype: 'struct hw_regs ' .../ethernet/microchip/lan743x_main.c:161:6: warning: variable ‘int_en’ set but not used [-Wunused-but-set-variable] .../ethernet/microchip/lan743x_main.c:1702:6: warning: variable ‘int_sts’ set but not used [-Wunused-but-set-variable] .../ethernet/microchip/lan743x_main.c:3041:6: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable] .../ethernet/natsemi/ns83820.c:603:6: warning: variable ‘tbisr’ set but not used [-Wunused-but-set-variable] .../ethernet/natsemi/ns83820.c:1207:11: warning: variable ‘tanar’ set but not used [-Wunused-but-set-variable] .../ethernet/marvell/mvneta.c:754:6: warning: variable ‘dummy’ set but not used [-Wunused-but-set-variable] .../ethernet/neterion/vxge/vxge-traffic.c:33:6: warning: variable ‘val64’ set but not used [-Wunused-but-set-variable] .../ethernet/neterion/vxge/vxge-traffic.c:160:6: warning: variable ‘val64’ set but not used [-Wunused-but-set-variable] .../ethernet/neterion/vxge/vxge-traffic.c:490:6: warning: variable ‘val32’ set but not used [-Wunused-but-set-variable] .../ethernet/neterion/vxge/vxge-traffic.c:2378:6: warning: variable ‘val64’ set but not used [-Wunused-but-set-variable] .../ethernet/packetengines/yellowfin.c:1063:18: warning: variable ‘yf_size’ set but not used [-Wunused-but-set-variable] .../ethernet/realtek/8139cp.c:1242:6: warning: variable ‘rc’ set but not used [-Wunused-but-set-variable] .../ethernet/mellanox/mlx4/en_tx.c:858:6: warning: variable ‘ring_cons’ set but not used [-Wunused-but-set-variable] .../ethernet/sis/sis900.c:792:6: warning: variable ‘status’ set but not used [-Wunused-but-set-variable] .../ethernet/sfc/falcon/farch.c:878:11: warning: variable ‘rx_ev_pkt_type’ set but not used [-Wunused-but-set-variable] .../ethernet/sfc/falcon/farch.c:877:23: warning: variable ‘rx_ev_mcast_pkt’ set but not used [-Wunused-but-set-variable] .../ethernet/sfc/falcon/farch.c:877:7: warning: variable ‘rx_ev_hdr_type’ set but not used [-Wunused-but-set-variable] .../ethernet/sfc/falcon/farch.c:876:7: warning: variable ‘rx_ev_other_err’ set but not used [-Wunused-but-set-variable] .../ethernet/sfc/falcon/farch.c:1646:21: warning: variable ‘buftbl_min’ set but not used [-Wunused-but-set-variable] .../ethernet/sfc/falcon/farch.c:2535:32: warning: variable ‘spec’ set but not used [-Wunused-but-set-variable] .../ethernet/via/via-velocity.c:880:6: warning: variable ‘curr_status’ set but not used [-Wunused-but-set-variable] .../ethernet/ti/tlan.c:656:6: warning: variable ‘rc’ set but not used [-Wunused-but-set-variable] .../ethernet/ti/davinci_emac.c:1230:6: warning: variable ‘num_tx_pkts’ set but not used [-Wunused-but-set-variable] .../ethernet/synopsys/dwc-xlgmac-common.c:516:8: warning: variable ‘str’ set but not used [-Wunused-but-set-variable] .../ethernet/ti/cpsw_new.c:1662:22: warning: variable ‘priv’ set but not used [-Wunused-but-set-variable] The register reads should be OK, because the current implementation of readl and friends will always execute even without an lvalue. When it makes sense, just remove the lvalue assignment and the local. Other times, just remove the offending code, and occasionally, just mark the variable as maybe unused since it could be used in an ifdef or debug scenario. Only compile tested with W=1. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> [fixes gcc-11 build warnings - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03net/mlx4: Fix EEPROM dump supportVladyslav Tarasiuk2-7/+104
commit db825feefc6868896fed5e361787ba3bee2fd906 upstream. Fix SFP and QSFP* EEPROM queries by setting i2c_address, offset and page number correctly. For SFP set the following params: - I2C address for offsets 0-255 is 0x50. For 256-511 - 0x51. - Page number is zero. - Offset is 0-255. At the same time, QSFP* parameters are different: - I2C address is always 0x50. - Page number is not limited to zero. - Offset is 0-255 for page zero and 128-255 for others. To set parameters accordingly to cable used, implement function to query module ID and implement respective helper functions to set parameters correctly. Fixes: 135dd9594f12 ("net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query") Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-22mlxsw: spectrum_mr: Update egress RIF list before route's actionIdo Schimmel1-15/+15
commit cbaf3f6af9c268caf558c8e7ec52bcb35c5455dd upstream. Each multicast route that is forwarding packets (as opposed to trapping them) points to a list of egress router interfaces (RIFs) through which packets are replicated. A route's action can transition from trap to forward when a RIF is created for one of the route's egress virtual interfaces (eVIF). When this happens, the route's action is first updated and only later the list of egress RIFs is committed to the device. This results in the route pointing to an invalid list. In case the list pointer is out of range (due to uninitialized memory), the device will complain: mlxsw_spectrum2 0000:06:00.0: EMAD reg access failed (tid=5733bf490000905c,reg_id=300f(pefa),type=write,status=7(bad parameter)) Fix this by first committing the list of egress RIFs to the device and only later update the route's action. Note that a fix is not needed in the reverse function (i.e., mlxsw_sp_mr_route_evif_unresolve()), as there the route's action is first updated and only later the RIF is removed from the list. Cc: stable@vger.kernel.org Fixes: c011ec1bbfd6 ("mlxsw: spectrum: Add the multicast routing offloading logic") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/20210506072308.3834303-1-idosch@idosch.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30net/mlx5e: Fix error path for ethtool set-priv-flagAya Levin1-1/+5
[ Upstream commit 4eacfe72e3e037e3fc019113df32c39a705148c2 ] Expose error value when failing to comply to command: $ ethtool --set-priv-flags eth2 rx_cqe_compress [on/off] Fixes: be7e87f92b58 ("net/mlx5e: Fail safe cqe compressing/moderation mode setting") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-17net/mlx4_en: update moderation when config resetKevin(Yudong) Yang3-1/+4
commit 00ff801bb8ce6711e919af4530b6ffa14a22390a upstream. This patch fixes a bug that the moderation config will not be applied when calling mlx4_en_reset_config. For example, when turning on rx timestamping, mlx4_en_reset_config() will be called, causing the NIC to forget previous moderation config. This fix is in phase with a previous fix: commit 79c54b6bbf06 ("net/mlx4_en: Fix TX moderation info loss after set_ringparam is called") Tested: Before this patch, on a host with NIC using mlx4, run netserver and stream TCP to the host at full utilization. $ sar -I SUM 1 INTR intr/s 14:03:56 sum 48758.00 After rx hwtstamp is enabled: $ sar -I SUM 1 14:10:38 sum 317771.00 We see the moderation is not working properly and issued 7x more interrupts. After the patch, and turned on rx hwtstamp, the rate of interrupts is as expected: $ sar -I SUM 1 14:52:11 sum 49332.00 Fixes: 79c54b6bbf06 ("net/mlx4_en: Fix TX moderation info loss after set_ringparam is called") Signed-off-by: Kevin(Yudong) Yang <yyd@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> CC: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04net/mlx4_core: Add missed mlx4_free_cmd_mailbox()Chuhong Yuan1-0/+1
[ Upstream commit 8eb65fda4a6dbd59cd5de24b106a10b6ee0d2176 ] mlx4_do_mirror_rule() forgets to call mlx4_free_cmd_mailbox() to free the memory region allocated by mlx4_alloc_cmd_mailbox() before an exit. Add the missed call to fix it. Fixes: 78efed275117 ("net/mlx4_core: Support mirroring VF DMFS rules on both ports") Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20210221143559.390277-1-hslester96@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-04net/mlx5: Fix memory leak on flow table creation error flowRoi Dayan1-0/+1
[ Upstream commit 487c6ef81eb98d0a43cb08be91b1fcc9b4250626 ] When we create the ft object we also init rhltable in ft->fgs_hash. So in error flow before kfree of ft we need to destroy that rhltable. Fixes: 693c6883bbc4 ("net/mlx5: Add hash table for flow groups in flow table") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-17net/mlx5e: Fix two double free casesDinghao Liu1-0/+2
commit 7a6eb072a9548492ead086f3e820e9aac71c7138 upstream. mlx5e_create_ttc_table_groups() frees ft->g on failure of kvzalloc(), but such failure will be caught by its caller in mlx5e_create_ttc_table() and ft->g will be freed again in mlx5e_destroy_flow_table(). The same issue also occurs in mlx5e_create_ttc_table_groups(). Set ft->g to NULL after kfree() to avoid double free. Fixes: 7b3722fa9ef6 ("net/mlx5e: Support RSS for GRE tunneled packets") Fixes: 33cfaaa8f36f ("net/mlx5e: Split the main flow steering table") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-17net/mlx5e: Fix memleak in mlx5e_create_l2_table_groupsDinghao Liu1-0/+1
commit 5b0bb12c58ac7d22e05b5bfdaa30a116c8c32e32 upstream. When mlx5_create_flow_group() fails, ft->g should be freed just like when kvzalloc() fails. The caller of mlx5e_create_l2_table_groups() does not catch this issue on failure, which leads to memleak. Fixes: 33cfaaa8f36f ("net/mlx5e: Split the main flow steering table") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30net/mlx5: Properly convey driver version to firmwareLeon Romanovsky1-1/+5
[ Upstream commit 907af0f0cab4ee5d5604f182ecec2c5b5119d294 ] mlx5 firmware expects driver version in specific format X.X.X, so make it always correct and based on real kernel version aligned with the driver. Fixes: 012e50e109fd ("net/mlx5: Set driver version into firmware") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30net/mlx4_en: Handle TX error CQEMoshe Shemesh3-7/+39
[ Upstream commit ba603d9d7b1215c72513d7c7aa02b6775fd4891b ] In case error CQE was found while polling TX CQ, the QP is in error state and all posted WQEs will generate error CQEs without any data transmitted. Fix it by reopening the channels, via same method used for TX timeout handling. In addition add some more info on error CQE and WQE for debug. Fixes: bd2f631d7c60 ("net/mlx4_en: Notify user when TX ring in error state") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30net/mlx4_en: Avoid scheduling restart task if it is already runningMoshe Shemesh2-8/+19
[ Upstream commit fed91613c9dd455dd154b22fa8e11b8526466082 ] Add restarting state flag to avoid scheduling another restart task while such task is already running. Change task name from watchdog_task to restart_task to better fit the task role. Fixes: 1e338db56e5a ("mlx4_en: Fix a race at restart task") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-08net/mlx5: Fix wrong address reclaim when command interface is downEran Ben Elisha1-2/+19
[ Upstream commit 1d2bb5ad89f47d8ce8aedc70ef85059ab3870292 ] When command interface is down, driver to reclaim all 4K page chucks that were hold by the Firmeware. Fix a bug for 64K page size systems, where driver repeatedly released only the first chunk of the page. Define helper function to fill 4K chunks for a given Firmware pages. Iterate over all unreleased Firmware pages and call the hepler per each. Fixes: 5adff6a08862 ("net/mlx5: Fix incorrect page count when in internal error") Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-24net/mlx5: Disable QoS when min_rates on all VFs are zeroVladyslav Tarasiuk1-7/+8
[ Upstream commit 470b74758260e4abc2508cf1614573c00a00465c ] Currently when QoS is enabled for VF and any min_rate is configured, the driver sets bw_share value to at least 1 and doesn’t allow to set it to 0 to make minimal rate unlimited. It means there is always a minimal rate configured for every VF, even if user tries to remove it. In order to make QoS disable possible, check whether all vports have configured min_rate = 0. If this is true, set their bw_share to 0 to disable min_rate limitations. Fixes: c9497c98901c ("net/mlx5: Add support for setting VF min rate") Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-24net/mlx4_core: Fix init_hca fields offsetAya Levin2-5/+5
[ Upstream commit 6d9c8d15af0ef20a66a0b432cac0d08319920602 ] Slave function read the following capabilities from the wrong offset: 1. log_mc_entry_sz 2. fs_log_entry_sz 3. log_mc_hash_sz Fix that by adjusting these capabilities offset to match firmware layout. Due to the wrong offset read, the following issues might occur: 1+2. Negative value reported at max_mcast_qp_attach. 3. Driver to init FW with multicast hash size of zero. Fixes: a40ded604365 ("net/mlx4_core: Add masking for a few queries on HCA caps") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20201118081922.553-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-24mlxsw: core: Use variable timeout for EMAD retriesIdo Schimmel1-1/+2
[ Upstream commit 1f492eab67bced119a0ac7db75ef2047e29a30c6 ] The driver sends Ethernet Management Datagram (EMAD) packets to the device for configuration purposes and waits for up to 200ms for a reply. A request is retried up to 5 times. When the system is under heavy load, replies are not always processed in time and EMAD transactions fail. Make the process more robust to such delays by using exponential backoff. First wait for up to 200ms, then retransmit and wait for up to 400ms and so on. Fixes: caf7297e7ab5 ("mlxsw: core: Introduce support for asynchronous EMAD register access") Reported-by: Denis Yulevich <denisyu@nvidia.com> Tested-by: Denis Yulevich <denisyu@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-05mlxsw: core: Fix use-after-free in mlxsw_emad_trans_finish()Amit Cohen1-0/+3
[ Upstream commit 0daf2bf5a2dcf33d446b76360908f109816e2e21 ] Each EMAD transaction stores the skb used to issue the EMAD request ('trans->tx_skb') so that the request could be retried in case of a timeout. The skb can be freed when a corresponding response is received or as part of the retry logic (e.g., failed retransmit, exceeded maximum number of retries). The two tasks (i.e., response processing and retransmits) are synchronized by the atomic 'trans->active' field which ensures that responses to inactive transactions are ignored. In case of a failed retransmit the transaction is finished and all of its resources are freed. However, the current code does not mark it as inactive. Syzkaller was able to hit a race condition in which a concurrent response is processed while the transaction's resources are being freed, resulting in a use-after-free [1]. Fix the issue by making sure to mark the transaction as inactive after a failed retransmit and free its resources only if a concurrent task did not already do that. [1] BUG: KASAN: use-after-free in consume_skb+0x30/0x370 net/core/skbuff.c:833 Read of size 4 at addr ffff88804f570494 by task syz-executor.0/1004 CPU: 0 PID: 1004 Comm: syz-executor.0 Not tainted 5.8.0-rc7+ #68 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xf6/0x16e lib/dump_stack.c:118 print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 check_memory_region_inline mm/kasan/generic.c:186 [inline] check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192 instrument_atomic_read include/linux/instrumented.h:56 [inline] atomic_read include/asm-generic/atomic-instrumented.h:27 [inline] refcount_read include/linux/refcount.h:147 [inline] skb_unref include/linux/skbuff.h:1044 [inline] consume_skb+0x30/0x370 net/core/skbuff.c:833 mlxsw_emad_trans_finish+0x64/0x1c0 drivers/net/ethernet/mellanox/mlxsw/core.c:592 mlxsw_emad_process_response drivers/net/ethernet/mellanox/mlxsw/core.c:651 [inline] mlxsw_emad_rx_listener_func+0x5c9/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:672 mlxsw_core_skb_receive+0x4df/0x770 drivers/net/ethernet/mellanox/mlxsw/core.c:2063 mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline] mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651 tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550 __do_softirq+0x223/0x964 kernel/softirq.c:292 asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711 Allocated by task 1006: save_stack+0x1b/0x40 mm/kasan/common.c:48 set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc mm/kasan/common.c:494 [inline] __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467 slab_post_alloc_hook mm/slab.h:586 [inline] slab_alloc_node mm/slub.c:2824 [inline] slab_alloc mm/slub.c:2832 [inline] kmem_cache_alloc+0xcd/0x2e0 mm/slub.c:2837 __build_skb+0x21/0x60 net/core/skbuff.c:311 __netdev_alloc_skb+0x1e2/0x360 net/core/skbuff.c:464 netdev_alloc_skb include/linux/skbuff.h:2810 [inline] mlxsw_emad_alloc drivers/net/ethernet/mellanox/mlxsw/core.c:756 [inline] mlxsw_emad_reg_access drivers/net/ethernet/mellanox/mlxsw/core.c:787 [inline] mlxsw_core_reg_access_emad+0x1ab/0x1420 drivers/net/ethernet/mellanox/mlxsw/core.c:1817 mlxsw_reg_trans_query+0x39/0x50 drivers/net/ethernet/mellanox/mlxsw/core.c:1831 mlxsw_sp_sb_pm_occ_clear drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c:260 [inline] mlxsw_sp_sb_occ_max_clear+0xbff/0x10a0 drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c:1365 mlxsw_devlink_sb_occ_max_clear+0x76/0xb0 drivers/net/ethernet/mellanox/mlxsw/core.c:1037 devlink_nl_cmd_sb_occ_max_clear_doit+0x1ec/0x280 net/core/devlink.c:1765 genl_family_rcv_msg_doit net/netlink/genetlink.c:669 [inline] genl_family_rcv_msg net/netlink/genetlink.c:714 [inline] genl_rcv_msg+0x617/0x980 net/netlink/genetlink.c:731 netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2470 genl_rcv+0x24/0x40 net/netlink/genetlink.c:742 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg+0x150/0x190 net/socket.c:671 ____sys_sendmsg+0x6d8/0x840 net/socket.c:2359 ___sys_sendmsg+0xff/0x170 net/socket.c:2413 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2446 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 73: save_stack+0x1b/0x40 mm/kasan/common.c:48 set_track mm/kasan/common.c:56 [inline] kasan_set_free_info mm/kasan/common.c:316 [inline] __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455 slab_free_hook mm/slub.c:1474 [inline] slab_free_freelist_hook mm/slub.c:1507 [inline] slab_free mm/slub.c:3072 [inline] kmem_cache_free+0xbe/0x380 mm/slub.c:3088 kfree_skbmem net/core/skbuff.c:622 [inline] kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:616 __kfree_skb net/core/skbuff.c:679 [inline] consume_skb net/core/skbuff.c:837 [inline] consume_skb+0xe1/0x370 net/core/skbuff.c:831 mlxsw_emad_trans_finish+0x64/0x1c0 drivers/net/ethernet/mellanox/mlxsw/core.c:592 mlxsw_emad_transmit_retry.isra.0+0x9d/0xc0 drivers/net/ethernet/mellanox/mlxsw/core.c:613 mlxsw_emad_trans_timeout_work+0x43/0x50 drivers/net/ethernet/mellanox/mlxsw/core.c:625 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415 kthread+0x355/0x470 kernel/kthread.c:291 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293 The buggy address belongs to the object at ffff88804f5703c0 which belongs to the cache skbuff_head_cache of size 224 The buggy address is located 212 bytes inside of 224-byte region [ffff88804f5703c0, ffff88804f5704a0) The buggy address belongs to the page: page:ffffea00013d5c00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x100000000000200(slab) raw: 0100000000000200 dead000000000100 dead000000000122 ffff88806c625400 raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88804f570380: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb ffff88804f570400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff88804f570480: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff88804f570500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88804f570580: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc Fixes: caf7297e7ab5f ("mlxsw: core: Introduce support for asynchronous EMAD register access") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-05mlxsw: core: Fix memory leak on module removalIdo Schimmel1-0/+2
[ Upstream commit adc80b6cfedff6dad8b93d46a5ea2775fd5af9ec ] Free the devlink instance during the teardown sequence in the non-reload case to avoid the following memory leak. unreferenced object 0xffff888232895000 (size 2048): comm "modprobe", pid 1073, jiffies 4295568857 (age 164.871s) hex dump (first 32 bytes): 00 01 00 00 00 00 ad de 22 01 00 00 00 00 ad de ........"....... 10 50 89 32 82 88 ff ff 10 50 89 32 82 88 ff ff .P.2.....P.2.... backtrace: [<00000000c704e9a6>] __kmalloc+0x13a/0x2a0 [<00000000ee30129d>] devlink_alloc+0xff/0x760 [<0000000092ab3e5d>] 0xffffffffa042e5b0 [<000000004f3f8a31>] 0xffffffffa042f6ad [<0000000092800b4b>] 0xffffffffa0491df3 [<00000000c4843903>] local_pci_probe+0xcb/0x170 [<000000006993ded7>] pci_device_probe+0x2c2/0x4e0 [<00000000a8e0de75>] really_probe+0x2c5/0xf90 [<00000000d42ba75d>] driver_probe_device+0x1eb/0x340 [<00000000bcc95e05>] device_driver_attach+0x294/0x300 [<000000000e2bc177>] __driver_attach+0x167/0x2f0 [<000000007d44cd6e>] bus_for_each_dev+0x148/0x1f0 [<000000003cd5a91e>] driver_attach+0x45/0x60 [<000000000041ce51>] bus_add_driver+0x3b8/0x720 [<00000000f5215476>] driver_register+0x230/0x4e0 [<00000000d79356f5>] __pci_register_driver+0x190/0x200 Fixes: a22712a96291 ("mlxsw: core: Fix devlink unregister flow") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reported-by: Vadim Pasternak <vadimp@nvidia.com> Tested-by: Oleksandr Shamray <oleksandrs@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-29net/mlx5: Don't call timecounter cyc2time directly from 1PPS flowEran Ben Elisha1-2/+3
[ Upstream commit 0d2ffdc8d4002a62de31ff7aa3bef28c843c3cbe ] Before calling timecounter_cyc2time(), clock->lock must be taken. Use mlx5_timecounter_cyc2time instead which guarantees a safe access. Fixes: afc98a0b46d8 ("net/mlx5: Update ptp_clock_event foreach PPS event") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29mlx4: handle non-napi callers to napi_pollJonathan Lemon2-1/+4
[ Upstream commit b2b8a92733b288128feb57ffa694758cf475106c ] netcons calls napi_poll with a budget of 0 to transmit packets. Handle this by: - skipping RX processing - do not try to recycle TX packets to the RX cache Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-14net/mlx5e: Fix VLAN create flowAya Levin1-2/+4
[ Upstream commit d4a16052bccdd695982f89d815ca075825115821 ] When interface is attached while in promiscuous mode and with VLAN filtering turned off, both configurations are not respected and VLAN filtering is performed. There are 2 flows which add the any-vid rules during interface attach: VLAN creation table and set rx mode. Each is relaying on the other to add any-vid rules, eventually non of them does. Fix this by adding any-vid rules on VLAN creation regardless of promiscuous mode. Fixes: 9df30601c843 ("net/mlx5e: Restore vlan filter after seamless reset") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-14net/mlx5e: Fix VLAN cleanup flowAya Levin1-2/+6
[ Upstream commit 8c7353b6f716436ad0bfda2b5c5524ab2dde5894 ] Prior to this patch unloading an interface in promiscuous mode with RX VLAN filtering feature turned off - resulted in a warning. This is due to a wrong condition in the VLAN rules cleanup flow, which left the any-vid rules in the VLAN steering table. These rules prevented destroying the flow group and the flow table. The any-vid rules are removed in 2 flows, but none of them remove it in case both promiscuous is set and VLAN filtering is off. Fix the issue by changing the condition of the VLAN table cleanup flow to clean also in case of promiscuous mode. mlx5_core 0000:00:08.0: mlx5_destroy_flow_group:2123:(pid 28729): Flow group 20 wasn't destroyed, refcount > 1 mlx5_core 0000:00:08.0: mlx5_destroy_flow_group:2123:(pid 28729): Flow group 19 wasn't destroyed, refcount > 1 mlx5_core 0000:00:08.0: mlx5_destroy_flow_table:2112:(pid 28729): Flow table 262149 wasn't destroyed, refcount > 1 ... ... ------------[ cut here ]------------ FW pages counter is 11560 after reclaiming all pages WARNING: CPU: 1 PID: 28729 at drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:660 mlx5_reclaim_startup_pages+0x178/0x230 [mlx5_core] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: mlx5_function_teardown+0x2f/0x90 [mlx5_core] mlx5_unload_one+0x71/0x110 [mlx5_core] remove_one+0x44/0x80 [mlx5_core] pci_device_remove+0x3e/0xc0 device_release_driver_internal+0xfb/0x1c0 device_release_driver+0x12/0x20 pci_stop_bus_device+0x68/0x90 pci_stop_and_remove_bus_device+0x12/0x20 hv_eject_device_work+0x6f/0x170 [pci_hyperv] ? __schedule+0x349/0x790 process_one_work+0x206/0x400 worker_thread+0x34/0x3f0 ? process_one_work+0x400/0x400 kthread+0x126/0x140 ? kthread_park+0x90/0x90 ret_from_fork+0x22/0x30 ---[ end trace 6283bde8d26170dc ]--- Fixes: 9df30601c843 ("net/mlx5e: Restore vlan filter after seamless reset") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-12net/mlx5e: Don't support phys switch id if not in switchdev modeRoi Dayan1-1/+1
Support for phys switch id ndo added for representors and if we do not have representors there is no need to support it. Since each port return different switch id supporting this block support for creating bond over PFs and attaching to bridge in legacy mode. This bug doesn't exist upstream as the code got refactored and the netdev api is totally different. Fixes: cb67b832921c ("net/mlx5e: Introduce SRIOV VF representors") Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-09net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init()Shung-Hsi Yu1-1/+1
[ Upstream commit cbedcb044e9cc4e14bbe6658111224bb923094f4 ] On machines with much memory (> 2 TByte) and log_mtts_per_seg == 0, a max_order of 31 will be passed to mlx_buddy_init(), which results in s = BITS_TO_LONGS(1 << 31) becoming a negative value, leading to kvmalloc_array() failure when it is converted to size_t. mlx4_core 0000:b1:00.0: Failed to initialize memory region table, aborting mlx4_core: probe of 0000:b1:00.0 failed with error -12 Fix this issue by changing the left shifting operand from a signed literal to an unsigned one. Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-05net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rqXin Xiong1-2/+2
[ Upstream commit e692139e6af339a1495ef401b2d95f7f9d1c7a44 ] The function invokes bpf_prog_inc(), which increases the reference count of a bpf_prog object "rq->xdp_prog" if the object isn't NULL. The refcount leak issues take place in two error handling paths. When either mlx5_wq_ll_create() or mlx5_wq_cyc_create() fails, the function simply returns the error code and forgets to drop the reference count increased earlier, causing a reference count leak of "rq->xdp_prog". Fix this issue by jumping to the error handling path err_rq_wq_destroy while either function fails. Fixes: 422d4c401edd ("net/mlx5e: RX, Split WQ objects for different RQ types") Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn> Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-05mlxsw: core: Free EMAD transactions using kfree_rcu()Ido Schimmel1-1/+1
[ Upstream commit 3c8ce24b037648a5a15b85888b259a74b05ff97d ] The lifetime of EMAD transactions (i.e., 'struct mlxsw_reg_trans') is managed using RCU. They are freed using kfree_rcu() once the transaction ends. However, in case the transaction failed it is freed immediately after being removed from the active transactions list. This is problematic because it is still possible for a different CPU to dereference the transaction from an RCU read-side critical section while traversing the active transaction list in mlxsw_emad_rx_listener_func(). In which case, a use-after-free is triggered [1]. Fix this by freeing the transaction after a grace period by calling kfree_rcu(). [1] BUG: KASAN: use-after-free in mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671 Read of size 8 at addr ffff88800b7964e8 by task syz-executor.2/2881 CPU: 0 PID: 2881 Comm: syz-executor.2 Not tainted 5.8.0-rc4+ #44 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xf6/0x16e lib/dump_stack.c:118 print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671 mlxsw_core_skb_receive+0x571/0x700 drivers/net/ethernet/mellanox/mlxsw/core.c:2061 mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline] mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651 tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550 __do_softirq+0x223/0x964 kernel/softirq.c:292 asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711 </IRQ> __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline] run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline] do_softirq_own_stack+0x109/0x140 arch/x86/kernel/irq_64.c:77 invoke_softirq kernel/softirq.c:387 [inline] __irq_exit_rcu kernel/softirq.c:417 [inline] irq_exit_rcu+0x16f/0x1a0 kernel/softirq.c:429 sysvec_apic_timer_interrupt+0x4e/0xd0 arch/x86/kernel/apic/apic.c:1091 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:587 RIP: 0010:arch_local_irq_restore arch/x86/include/asm/irqflags.h:85 [inline] RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline] RIP: 0010:_raw_spin_unlock_irqrestore+0x3b/0x40 kernel/locking/spinlock.c:191 Code: e8 2a c3 f4 fc 48 89 ef e8 12 96 f5 fc f6 c7 02 75 11 53 9d e8 d6 db 11 fd 65 ff 0d 1f 21 b3 56 5b 5d c3 e8 a7 d7 11 fd 53 9d <eb> ed 0f 1f 00 55 48 89 fd 65 ff 05 05 21 b3 56 ff 74 24 08 48 8d RSP: 0018:ffff8880446ffd80 EFLAGS: 00000286 RAX: 0000000000000006 RBX: 0000000000000286 RCX: 0000000000000006 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa94ecea9 RBP: ffff888012934408 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: fffffbfff57be301 R12: 1ffff110088dffc1 R13: ffff888037b817c0 R14: ffff88802442415a R15: ffff888024424000 __do_sys_perf_event_open+0x1b5d/0x2bd0 kernel/events/core.c:11874 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x473dbd Code: Bad RIP value. RSP: 002b:00007f21e5e9cc28 EFLAGS: 00000246 ORIG_RAX: 000000000000012a RAX: ffffffffffffffda RBX: 000000000057bf00 RCX: 0000000000473dbd RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000040 RBP: 000000000057bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000003 R11: 0000000000000246 R12: 000000000057bf0c R13: 00007ffd0493503f R14: 00000000004d0f46 R15: 00007f21e5e9cd80 Allocated by task 871: save_stack+0x1b/0x40 mm/kasan/common.c:48 set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc mm/kasan/common.c:494 [inline] __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467 kmalloc include/linux/slab.h:555 [inline] kzalloc include/linux/slab.h:669 [inline] mlxsw_core_reg_access_emad+0x70/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1812 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415 kthread+0x355/0x470 kernel/kthread.c:291 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293 Freed by task 871: save_stack+0x1b/0x40 mm/kasan/common.c:48 set_track mm/kasan/common.c:56 [inline] kasan_set_free_info mm/kasan/common.c:316 [inline] __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455 slab_free_hook mm/slub.c:1474 [inline] slab_free_freelist_hook mm/slub.c:1507 [inline] slab_free mm/slub.c:3072 [inline] kfree+0xe6/0x320 mm/slub.c:4052 mlxsw_core_reg_access_emad+0xd45/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1819 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415 kthread+0x355/0x470 kernel/kthread.c:291 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293 The buggy address belongs to the object at ffff88800b796400 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 232 bytes inside of 512-byte region [ffff88800b796400, ffff88800b796600) The buggy address belongs to the page: page:ffffea00002de500 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea00002de500 order:2 compound_mapcount:0 compound_pincount:0 flags: 0x100000000010200(slab|head) raw: 0100000000010200 dead000000000100 dead000000000122 ffff88806c402500 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88800b796380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88800b796400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff88800b796480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88800b796500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88800b796580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: caf7297e7ab5 ("mlxsw: core: Introduce support for asynchronous EMAD register access") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-05mlxsw: core: Increase scope of RCU read-side critical sectionIdo Schimmel1-2/+4
[ Upstream commit 7d8e8f3433dc8d1dc87c1aabe73a154978fb4c4d ] The lifetime of the Rx listener item ('rxl_item') is managed using RCU, but is dereferenced outside of RCU read-side critical section, which can lead to a use-after-free. Fix this by increasing the scope of the RCU read-side critical section. Fixes: 93c1edb27f9e ("mlxsw: Introduce Mellanox switch driver core") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-05mlx4: disable device on shutdownJakub Kicinski1-0/+2
[ Upstream commit 3cab8c65525920f00d8f4997b3e9bb73aecb3a8e ] It appears that not disabling a PCI device on .shutdown may lead to a Hardware Error with particular (perhaps buggy) BIOS versions: mlx4_en: eth0: Close port called mlx4_en 0000:04:00.0: removed PHC reboot: Restarting system {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1 {1}[Hardware Error]: event severity: fatal {1}[Hardware Error]: Error 0, type: fatal {1}[Hardware Error]: section_type: PCIe error {1}[Hardware Error]: port_type: 4, root port {1}[Hardware Error]: version: 1.16 {1}[Hardware Error]: command: 0x4010, status: 0x0143 {1}[Hardware Error]: device_id: 0000:00:02.2 {1}[Hardware Error]: slot: 0 {1}[Hardware Error]: secondary_bus: 0x04 {1}[Hardware Error]: vendor_id: 0x8086, device_id: 0x2f06 {1}[Hardware Error]: class_code: 000604 {1}[Hardware Error]: bridge: secondary_status: 0x2000, control: 0x0003 {1}[Hardware Error]: aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00000000 {1}[Hardware Error]: aer_uncor_severity: 0x00062030 {1}[Hardware Error]: TLP Header: 40000018 040000ff 791f4080 00000000 [hw error repeats] Kernel panic - not syncing: Fatal hardware error! CPU: 0 PID: 2189 Comm: reboot Kdump: loaded Not tainted 5.6.x-blabla #1 Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 05/05/2017 Fix the mlx4 driver. This is a very similar problem to what had been fixed in: commit 0d98ba8d70b0 ("scsi: hpsa: disable device during shutdown") to address https://bugzilla.kernel.org/show_bug.cgi?id=199779. Fixes: 2ba5fbd62b25 ("net/mlx4_core: Handle AER flow properly") Reported-by: Jake Lawrence <lawja@fb.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-05net/mlx5: Verify Hardware supports requested ptp function on a given pinEran Ben Elisha1-1/+22
[ Upstream commit 071995c877a8646209d55ff8edddd2b054e7424c ] Fix a bug where driver did not verify Hardware pin capabilities for PTP functions. Fixes: ee7f12205abc ("net/mlx5e: Implement 1PPS support") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-29mlxsw: destroy workqueue when trap_register in mlxsw_emad_initLiu Jian1-1/+2
[ Upstream commit 5dbaeb87f2b309936be0aeae00cbc9e7f20ab296 ] When mlxsw_core_trap_register fails in mlxsw_emad_init, destroy_workqueue() shouled be called to destroy mlxsw_core->emad_wq. Fixes: d965465b60ba ("mlxsw: core: Fix possible deadlock") Signed-off-by: Liu Jian <liujian56@huawei.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-16mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON()Ido Schimmel1-1/+1
[ Upstream commit d9d5420273997664a1c09151ca86ac993f2f89c1 ] We should not trigger a warning when a memory allocation fails. Remove the WARN_ON(). The warning is constantly triggered by syzkaller when it is injecting faults: [ 2230.758664] FAULT_INJECTION: forcing a failure. [ 2230.758664] name failslab, interval 1, probability 0, space 0, times 0 [ 2230.762329] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28 ... [ 2230.898175] WARNING: CPU: 3 PID: 1407 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:6265 mlxsw_sp_router_fib_event+0xfad/0x13e0 [ 2230.898179] Kernel panic - not syncing: panic_on_warn set ... [ 2230.898183] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28 [ 2230.898190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Fixes: 3057224e014c ("mlxsw: spectrum_router: Implement FIB offload in deferred work") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-22net/mlx5e: IPoIB, Drop multicast packets that this interface sentErez Shitrit1-3/+12
[ Upstream commit 8b46d424a743ddfef8056d5167f13ee7ebd1dcad ] After enabled loopback packets for IPoIB, we need to drop these packets that this HCA has replicated and came back to the same interface that sent them. Fixes: 4c6c615e3f30 ("net/mlx5e: IPoIB, Add PKEY child interface nic profile") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-10Revert "net/mlx5: Annotate mutex destroy for root ns"Greg Kroah-Hartman1-6/+0
This reverts commit 95fde2e46860c183f6f47a99381a3b9bff488bd5 which is commit 9ca415399dae133b00273a4283ef31d003a6818d upstream. It was backported incorrectly, Paul writes at: https://lore.kernel.org/r/20200607203425.GD23662@windriver.com I happened to notice this commit: 9ca415399dae - "net/mlx5: Annotate mutex destroy for root ns" ...was backported to 4.19 and 5.4 and v5.6 in linux-stable. It patches del_sw_root_ns() - which only exists after v5.7-rc7 from: 6eb7a268a99b - "net/mlx5: Don't maintain a case of del_sw_func being null" which creates the one line del_sw_root_ns stub function around kfree(node) by breaking it out of tree_put_node(). In the absense of del_sw_root_ns - the backport finds an identical one line kfree stub fcn - named del_sw_prio from this earlier commit: 139ed6c6c46a - "net/mlx5: Fix steering memory leak" [in v4.15-rc5] and then puts the mutex_destroy() into that (wrong) function, instead of putting it into tree_put_node where the root ns case used to be hand Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Roi Dayan <roid@mellanox.com> Cc: Mark Bloch <markb@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03mlxsw: spectrum: Fix use-after-free of split/unsplit/type_set in case reload ↵Jiri Pirko2-2/+20
fails commit 4340f42f207eacb81e7a6b6bb1e3b6afad9a2e26 upstream. In case of reload fail, the mlxsw_sp->ports contains a pointer to a freed memory (either by reload_down() or reload_up() error path). Fix this by initializing the pointer to NULL and checking it before dereferencing in split/unsplit/type_set callpaths. Fixes: 24cc68ad6c46 ("mlxsw: core: Add support for reload") Reported-by: Danielle Ratson <danieller@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03net/mlx4_core: fix a memory leak bug.Qiushi Wu1-1/+1
commit febfd9d3c7f74063e8e630b15413ca91b567f963 upstream. In function mlx4_opreq_action(), pointer "mailbox" is not released, when mlx4_cmd_box() return and error, causing a memory leak bug. Fix this issue by going to "out" label, mlx4_free_cmd_mailbox() can free this pointer. Fixes: fe6f700d6cbb ("net/mlx4_core: Respond to operation request by firmware") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03net/mlx5: Annotate mutex destroy for root nsRoi Dayan1-0/+6
commit 9ca415399dae133b00273a4283ef31d003a6818d upstream. Invoke mutex_destroy() to catch any errors. Fixes: 2cc43b494a6c ("net/mlx5_core: Managing root flow table") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03net/mlx5e: Update netdev txq on completions during closureMoshe Shemesh1-1/+5
[ Upstream commit 5e911e2c06bd8c17df29147a5e2d4b17fafda024 ] On sq closure when we free its descriptors, we should also update netdev txq on completions which would not arrive. Otherwise if we reopen sqs and attach them back, for example on fw fatal recovery flow, we may get tx timeout. Fixes: 29429f3300a3 ("net/mlx5e: Timeout if SQ doesn't flush during close") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03net/mlx5: Add command entry handling completionMoshe Shemesh1-0/+14
[ Upstream commit 17d00e839d3b592da9659c1977d45f85b77f986a ] When FW response to commands is very slow and all command entries in use are waiting for completion we can have a race where commands can get timeout before they get out of the queue and handled. Timeout completion on uninitialized command will cause releasing command's buffers before accessing it for initialization and then we will get NULL pointer exception while trying access it. It may also cause releasing buffers of another command since we may have timeout completion before even allocating entry index for this command. Add entry handling completion to avoid this race. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-14net/mlx5: Fix command entry leak in Internal Error StateMoshe Shemesh1-0/+4
[ Upstream commit cece6f432cca9f18900463ed01b97a152a03600a ] Processing commands by cmd_work_handler() while already in Internal Error State will result in entry leak, since the handler process force completion without doorbell. Forced completion doesn't release the entry and event completion will never arrive, so entry should be released. Fixes: 73dd3a4839c1 ("net/mlx5: Avoid using pending command interface slots") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-14net/mlx5: Fix forced completion access non initialized command entryMoshe Shemesh1-1/+1
[ Upstream commit f3cb3cebe26ed4c8036adbd9448b372129d3c371 ] mlx5_cmd_flush() will trigger forced completions to all valid command entries. Triggered by an asynch event such as fast teardown it can happen at any stage of the command, including command initialization. It will trigger forced completion and that can lead to completion on an uninitialized command entry. Setting MLX5_CMD_ENT_STATE_PENDING_COMP only after command entry is initialized will ensure force completion is treated only if command entry is initialized. Fixes: 73dd3a4839c1 ("net/mlx5: Avoid using pending command interface slots") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>