summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)AuthorFilesLines
2022-03-27net: stmmac: use GFP_DMA32Matteo Croce1-2/+2
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
2022-03-27net: stmmac: Configure gtxclk based on speedTom1-0/+47
2022-03-23iavf: Fix hang during reboot/shutdownIvan Vecera1-0/+7
[ Upstream commit b04683ff8f0823b869c219c78ba0d974bddea0b5 ] Recent commit 974578017fc1 ("iavf: Add waiting so the port is initialized in remove") adds a wait-loop at the beginning of iavf_remove() to ensure that port initialization is finished prior unregistering net device. This causes a regression in reboot/shutdown scenario because in this case callback iavf_shutdown() is called and this callback detaches the device, makes it down if it is running and sets its state to __IAVF_REMOVE. Later shutdown callback of associated PF driver (e.g. ice_shutdown) is called. That callback calls among other things sriov_disable() that calls indirectly iavf_remove() (see stack trace below). As the adapter state is already __IAVF_REMOVE then the mentioned loop is end-less and shutdown process hangs. The patch fixes this by checking adapter's state at the beginning of iavf_remove() and skips the rest of the function if the adapter is already in remove state (shutdown is in progress). Reproducer: 1. Create VF on PF driven by ice or i40e driver 2. Ensure that the VF is bound to iavf driver 3. Reboot [52625.981294] sysrq: SysRq : Show Blocked State [52625.988377] task:reboot state:D stack: 0 pid:17359 ppid: 1 f2 [52625.996732] Call Trace: [52625.999187] __schedule+0x2d1/0x830 [52626.007400] schedule+0x35/0xa0 [52626.010545] schedule_hrtimeout_range_clock+0x83/0x100 [52626.020046] usleep_range+0x5b/0x80 [52626.023540] iavf_remove+0x63/0x5b0 [iavf] [52626.027645] pci_device_remove+0x3b/0xc0 [52626.031572] device_release_driver_internal+0x103/0x1f0 [52626.036805] pci_stop_bus_device+0x72/0xa0 [52626.040904] pci_stop_and_remove_bus_device+0xe/0x20 [52626.045870] pci_iov_remove_virtfn+0xba/0x120 [52626.050232] sriov_disable+0x2f/0xe0 [52626.053813] ice_free_vfs+0x7c/0x340 [ice] [52626.057946] ice_remove+0x220/0x240 [ice] [52626.061967] ice_shutdown+0x16/0x50 [ice] [52626.065987] pci_device_shutdown+0x34/0x60 [52626.070086] device_shutdown+0x165/0x1c5 [52626.074011] kernel_restart+0xe/0x30 [52626.077593] __do_sys_reboot+0x1d2/0x210 [52626.093815] do_syscall_64+0x5b/0x1a0 [52626.097483] entry_SYSCALL_64_after_hwframe+0x65/0xca Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://lore.kernel.org/r/20220317104524.2802848-1-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-23net: mscc: ocelot: fix backwards compatibility with single-chain tc-flower ↵Vladimir Oltean1-1/+15
offload [ Upstream commit 8e0341aefcc9133f3f48683873284b169581315b ] ACL rules can be offloaded to VCAP IS2 either through chain 0, or, since the blamed commit, through a chain index whose number encodes a specific PAG (Policy Action Group) and lookup number. The chain number is translated through ocelot_chain_to_pag() into a PAG, and through ocelot_chain_to_lookup() into a lookup number. The problem with the blamed commit is that the above 2 functions don't have special treatment for chain 0. So ocelot_chain_to_pag(0) returns filter->pag = 224, which is in fact -32, but the "pag" field is an u8. So we end up programming the hardware with VCAP IS2 entries having a PAG of 224. But the way in which the PAG works is that it defines a subset of VCAP IS2 filters which should match on a packet. The default PAG is 0, and previous VCAP IS1 rules (which we offload using 'goto') can modify it. So basically, we are installing filters with a PAG on which no packet will ever match. This is the hardware equivalent of adding filters to a chain which has no 'goto' to it. Restore the previous functionality by making ACL filters offloaded to chain 0 go to PAG 0 and lookup number 0. The choice of PAG is clearly correct, but the choice of lookup number isn't "as before" (which was to leave the lookup a "don't care"). However, lookup 0 should be fine, since even though there are ACL actions (policers) which have a requirement to be used in a specific lookup, that lookup is 0. Fixes: 226e9cd82a96 ("net: mscc: ocelot: only install TCAM entries into a specific lookup and PAG") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220316192117.2568261-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-23net: bcmgenet: skip invalid partial checksumsDoug Berger1-2/+4
[ Upstream commit 0f643c88c8d240eba0ea25c2e095a46515ff46e9 ] The RXCHK block will return a partial checksum of 0 if it encounters a problem while receiving a packet. Since a 1's complement sum can only produce this result if no bits are set in the received data stream it is fair to treat it as an invalid partial checksum and not pass it up the stack. Fixes: 810155397890 ("net: bcmgenet: use CHECKSUM_COMPLETE for NETIF_F_RXCSUM") Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220317012812.1313196-1-opendmb@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-23bnx2x: fix built-in kernel driver load failureManish Chopra3-26/+19
[ Upstream commit 424e7834e293936a54fcf05173f2884171adc5a3 ] Commit b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0") added request_firmware() logic in probe() which caused load failure when firmware file is not present in initrd (below), as access to firmware file is not feasible during probe. Direct firmware load for bnx2x/bnx2x-e2-7.13.15.0.fw failed with error -2 Direct firmware load for bnx2x/bnx2x-e2-7.13.21.0.fw failed with error -2 This patch fixes this issue by - 1. Removing request_firmware() logic from the probe() such that .ndo_open() handle it as it used to handle it earlier 2. Given request_firmware() is removed from probe(), so driver has to relax FW version comparisons a bit against the already loaded FW version (by some other PFs of same adapter) to allow different compatible/close enough FWs with which multiple PFs may run with (in different environments), as the given PF who is in probe flow has no idea now with which firmware file version it is going to initialize the device in ndo_open() Link: https://lore.kernel.org/all/46f2d9d9-ae7f-b332-ddeb-b59802be2bab@molgen.mpg.de/ Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Fixes: b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0") Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Link: https://lore.kernel.org/r/20220316214613.6884-1-manishc@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-23iavf: Fix double free in iavf_reset_taskPrzemyslaw Patynowski1-1/+7
[ Upstream commit 16b2dd8cdf6f4e0597c34899de74b4d012b78188 ] Fix double free possibility in iavf_disable_vf, as crit_lock is freed in caller, iavf_reset_task. Add kernel-doc for iavf_disable_vf. Remove mutex_unlock in iavf_disable_vf. Without this patch there is double free scenario, when calling iavf_reset_task. Fixes: e85ff9c631e1 ("iavf: Fix deadlock in iavf_reset_task") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-23ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats()Maciej Fijalkowski1-2/+3
[ Upstream commit f153546913bada41a811722f2c6d17c3243a0333 ] It is possible to do NULL pointer dereference in routine that updates Tx ring stats. Currently only stats and bytes are updated when ring pointer is valid, but later on ring is accessed to propagate gathered Tx stats onto VSI stats. Change the existing logic to move to next ring when ring is NULL. Fixes: e72bba21355d ("ice: split ice_ring onto Tx/Rx separate structs") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-23alx: acquire mutex for alx_reinit in alx_change_mtuNiels Dossche1-1/+4
[ Upstream commit 46b348fd2d81a341b15fb3f3f986204b038f5c42 ] alx_reinit has a lockdep assertion that the alx->mtx mutex must be held. alx_reinit is called from two places: alx_reset and alx_change_mtu. alx_reset does acquire alx->mtx before calling alx_reinit. alx_change_mtu does not acquire this mutex, nor do its callers or any path towards alx_change_mtu. Acquire the mutex in alx_change_mtu. The issue was introduced when the fine-grained locking was introduced to the code to replace the RTNL. The same commit also introduced the lockdep assertion. Fixes: 4a5fe57e7751 ("alx: use fine-grained locking instead of RTNL") Signed-off-by: Niels Dossche <dossche.niels@gmail.com> Link: https://lore.kernel.org/r/20220310232707.44251-1-dossche.niels@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-19ice: Fix race condition during interface enslaveIvan Vecera2-2/+21
commit 5cb1ebdbc4342b1c2ce89516e19808d64417bdbc upstream. Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating auxiliary device") changes a process of re-creation of aux device so ice_plug_aux_dev() is called from ice_service_task() context. This unfortunately opens a race window that can result in dead-lock when interface has left LAG and immediately enters LAG again. Reproducer: ``` #!/bin/sh ip link add lag0 type bond mode 1 miimon 100 ip link set lag0 for n in {1..10}; do echo Cycle: $n ip link set ens7f0 master lag0 sleep 1 ip link set ens7f0 nomaster done ``` This results in: [20976.208697] Workqueue: ice ice_service_task [ice] [20976.213422] Call Trace: [20976.215871] __schedule+0x2d1/0x830 [20976.219364] schedule+0x35/0xa0 [20976.222510] schedule_preempt_disabled+0xa/0x10 [20976.227043] __mutex_lock.isra.7+0x310/0x420 [20976.235071] enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core] [20976.251215] ib_enum_roce_netdev+0xa4/0xe0 [ib_core] [20976.256192] ib_cache_setup_one+0x33/0xa0 [ib_core] [20976.261079] ib_register_device+0x40d/0x580 [ib_core] [20976.266139] irdma_ib_register_device+0x129/0x250 [irdma] [20976.281409] irdma_probe+0x2c1/0x360 [irdma] [20976.285691] auxiliary_bus_probe+0x45/0x70 [20976.289790] really_probe+0x1f2/0x480 [20976.298509] driver_probe_device+0x49/0xc0 [20976.302609] bus_for_each_drv+0x79/0xc0 [20976.306448] __device_attach+0xdc/0x160 [20976.310286] bus_probe_device+0x9d/0xb0 [20976.314128] device_add+0x43c/0x890 [20976.321287] __auxiliary_device_add+0x43/0x60 [20976.325644] ice_plug_aux_dev+0xb2/0x100 [ice] [20976.330109] ice_service_task+0xd0c/0xed0 [ice] [20976.342591] process_one_work+0x1a7/0x360 [20976.350536] worker_thread+0x30/0x390 [20976.358128] kthread+0x10a/0x120 [20976.365547] ret_from_fork+0x1f/0x40 ... [20976.438030] task:ip state:D stack: 0 pid:213658 ppid:213627 flags:0x00004084 [20976.446469] Call Trace: [20976.448921] __schedule+0x2d1/0x830 [20976.452414] schedule+0x35/0xa0 [20976.455559] schedule_preempt_disabled+0xa/0x10 [20976.460090] __mutex_lock.isra.7+0x310/0x420 [20976.464364] device_del+0x36/0x3c0 [20976.467772] ice_unplug_aux_dev+0x1a/0x40 [ice] [20976.472313] ice_lag_event_handler+0x2a2/0x520 [ice] [20976.477288] notifier_call_chain+0x47/0x70 [20976.481386] __netdev_upper_dev_link+0x18b/0x280 [20976.489845] bond_enslave+0xe05/0x1790 [bonding] [20976.494475] do_setlink+0x336/0xf50 [20976.502517] __rtnl_newlink+0x529/0x8b0 [20976.543441] rtnl_newlink+0x43/0x60 [20976.546934] rtnetlink_rcv_msg+0x2b1/0x360 [20976.559238] netlink_rcv_skb+0x4c/0x120 [20976.563079] netlink_unicast+0x196/0x230 [20976.567005] netlink_sendmsg+0x204/0x3d0 [20976.570930] sock_sendmsg+0x4c/0x50 [20976.574423] ____sys_sendmsg+0x1eb/0x250 [20976.586807] ___sys_sendmsg+0x7c/0xc0 [20976.606353] __sys_sendmsg+0x57/0xa0 [20976.609930] do_syscall_64+0x5b/0x1a0 [20976.613598] entry_SYSCALL_64_after_hwframe+0x65/0xca 1. Command 'ip link ... set nomaster' causes that ice_plug_aux_dev() is called from ice_service_task() context, aux device is created and associated device->lock is taken. 2. Command 'ip link ... set master...' calls ice's notifier under RTNL lock and that notifier calls ice_unplug_aux_dev(). That function tries to take aux device->lock but this is already taken by ice_plug_aux_dev() in step 1 3. Later ice_plug_aux_dev() tries to take RTNL lock but this is already taken in step 2 4. Dead-lock The patch fixes this issue by following changes: - Bit ICE_FLAG_PLUG_AUX_DEV is kept to be set during ice_plug_aux_dev() call in ice_service_task() - The bit is checked in ice_clear_rdma_cap() and only if it is not set then ice_unplug_aux_dev() is called. If it is set (in other words plugging of aux device was requested and ice_plug_aux_dev() is potentially running) then the function only clears the bit - Once ice_plug_aux_dev() call (in ice_service_task) is finished the bit ICE_FLAG_PLUG_AUX_DEV is cleared but it is also checked whether it was already cleared by ice_clear_rdma_cap(). If so then aux device is unplugged. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Co-developed-by: Petr Oros <poros@redhat.com> Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Dave Ertman <david.m.ertman@intel.com> Link: https://lore.kernel.org/r/20220310171641.3863659-1-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-19bnx2: Fix an error messageChristophe JAILLET1-1/+1
[ Upstream commit 8ccffe9ac3239e549beaa0a9d5e1a1eac94e866c ] Fix an error message and report the correct failing function. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-19sfc: extend the locking on mcdi->seqnoNiels Dossche1-1/+1
[ Upstream commit f1fb205efb0ccca55626fd4ef38570dd16b44719 ] seqno could be read as a stale value outside of the lock. The lock is already acquired to protect the modification of seqno against a possible race condition. Place the reading of this value also inside this locking to protect it against a possible race condition. Signed-off-by: Niels Dossche <dossche.niels@gmail.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLEDima Chumak1-3/+0
commit 39bab83b119faac4bf7f07173a42ed35be95147e upstream. Only prio 1 is supported for nic mode when there is no ignore flow level support in firmware. But for switchdev mode, which supports fixed number of statically pre-allocated prios, this restriction is not relevant so it can be relaxed. Fixes: d671e109bd85 ("net/mlx5: Fix tc max supported prio for nic mode") Signed-off-by: Dima Chumak <dchumak@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16net: macb: Fix lost RX packet wakeup race in NAPI receiveRobert Hancock1-1/+24
commit 0bf476fc3624e3a72af4ba7340d430a91c18cd67 upstream. There is an oddity in the way the RSR register flags propagate to the ISR register (and the actual interrupt output) on this hardware: it appears that RSR register bits only result in ISR being asserted if the interrupt was actually enabled at the time, so enabling interrupts with RSR bits already set doesn't trigger an interrupt to be raised. There was already a partial fix for this race in the macb_poll function where it checked for RSR bits being set and re-triggered NAPI receive. However, there was a still a race window between checking RSR and actually enabling interrupts, where a lost wakeup could happen. It's necessary to check again after enabling interrupts to see if RSR was set just prior to the interrupt being enabled, and re-trigger receive in that case. This issue was noticed in a point-to-point UDP request-response protocol which periodically saw timeouts or abnormally high response times due to received packets not being processed in a timely fashion. In many applications, more packets arriving, including TCP retransmissions, would cause the original packet to be processed, thus masking the issue. Fixes: 02f7a34f34e3 ("net: macb: Re-enable RX interrupt only when RX is done") Cc: stable@vger.kernel.org Co-developed-by: Scott McNutt <scott.mcnutt@siriusxm.com> Signed-off-by: Scott McNutt <scott.mcnutt@siriusxm.com> Signed-off-by: Robert Hancock <robert.hancock@calian.com> Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16net: bcmgenet: Don't claim WOL when its not availableJeremy Linton1-0/+7
[ Upstream commit 00b022f8f876a3a036b0df7f971001bef6398605 ] Some of the bcmgenet platforms don't correctly support WOL, yet ethtool returns: "Supports Wake-on: gsf" which is false. Ideally if there isn't a wol_irq, or there is something else that keeps the device from being able to wakeup it should display: "Supports Wake-on: d" This patch checks whether the device can wakup, before using the hard-coded supported flags. This corrects the ethtool reporting, as well as the WOL configuration because ethtool verifies that the mode is supported before attempting it. Fixes: c51de7f3976b ("net: bcmgenet: add Wake-on-LAN support code") Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Tested-by: Peter Robinson <pbrobinson@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220310045535.224450-1-jeremy.linton@arm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16net: arc_emac: Fix use after free in arc_mdio_probe()Jianglei Nie1-2/+3
[ Upstream commit bc0e610a6eb0d46e4123fafdbe5e6141d9fff3be ] If bus->state is equal to MDIOBUS_ALLOCATED, mdiobus_free(bus) will free the "bus". But bus->name is still used in the next line, which will lead to a use after free. We can fix it by putting the name in a local variable and make the bus->name point to the rodata section "name",then use the name in the error message without referring to bus to avoid the uaf. Fixes: 95b5fc03c189 ("net: arc_emac: Make use of the helper function dev_err_probe()") Signed-off-by: Jianglei Nie <niejianglei2021@163.com> Link: https://lore.kernel.org/r/20220309121824.36529-1-niejianglei2021@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16gianfar: ethtool: Fix refcount leak in gfar_get_ts_infoMiaoqian Lin1-0/+1
[ Upstream commit 2ac5b58e645c66932438bb021cb5b52097ce70b0 ] The of_find_compatible_node() function returns a node pointer with refcount incremented, We should use of_node_put() on it when done Add the missing of_node_put() to release the refcount. Fixes: 7349a74ea75c ("net: ethernet: gianfar_ethtool: get phc index through drvdata") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20220310015313.14938-1-linmq006@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16net/mlx5e: SHAMPO, reduce TIR indicationBen Ben-Ishay2-5/+1
[ Upstream commit 99a2b9be077ae3a5d97fbf5f7782e0f2e9812978 ] SHAMPO is an RQ / WQ feature, an indication was added to the TIR in the first place to enforce suitability between connected TIR and RQ, this enforcement does not exist in current the Firmware implementation and was redundant in the first place. Fixes: 83439f3c37aa ("net/mlx5e: Add HW-GRO offload") Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16net/mlx5e: Lag, Only handle events from highest priority multipath entryRoi Dayan1-3/+8
[ Upstream commit ad11c4f1d8fd1f03639460e425a36f7fd0ea83f5 ] There could be multiple multipath entries but changing the port affinity for each one doesn't make much sense and there should be a default one. So only track the entry with lowest priority value. The commit doesn't affect existing users with a single entry. Fixes: 544fe7c2e654 ("net/mlx5e: Activate HW multipath and handle port affinity based on FIB events") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16net/mlx5: Fix a race on command flush flowMoshe Shemesh1-7/+8
[ Upstream commit 063bd355595428750803d8736a9bb7c8db67d42d ] Fix a refcount use after free warning due to a race on command entry. Such race occurs when one of the commands releases its last refcount and frees its index and entry while another process running command flush flow takes refcount to this command entry. The process which handles commands flush may see this command as needed to be flushed if the other process released its refcount but didn't release the index yet. Fix it by adding the needed spin lock. It fixes the following warning trace: refcount_t: addition on 0; use-after-free. WARNING: CPU: 11 PID: 540311 at lib/refcount.c:25 refcount_warn_saturate+0x80/0xe0 ... RIP: 0010:refcount_warn_saturate+0x80/0xe0 ... Call Trace: <TASK> mlx5_cmd_trigger_completions+0x293/0x340 [mlx5_core] mlx5_cmd_flush+0x3a/0xf0 [mlx5_core] enter_error_state+0x44/0x80 [mlx5_core] mlx5_fw_fatal_reporter_err_work+0x37/0xe0 [mlx5_core] process_one_work+0x1be/0x390 worker_thread+0x4d/0x3d0 ? rescuer_thread+0x350/0x350 kthread+0x141/0x160 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x1f/0x30 </TASK> Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16net: marvell: prestera: Add missing of_node_put() in ↵Miaoqian Lin1-0/+1
prestera_switch_set_base_mac_addr [ Upstream commit c9ffa3e2bc451816ce0295e40063514fabf2bd36 ] This node pointer is returned by of_find_compatible_node() with refcount incremented. Calling of_node_put() to aovid the refcount leak. Fixes: 501ef3066c89 ("net: marvell: prestera: Add driver for Prestera family ASIC devices") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16net: ethernet: lpc_eth: Handle error for clk_enableJiasheng Jiang1-1/+4
[ Upstream commit 2169b79258c8be803d2595d6456b1e77129fe154 ] As the potential failure of the clk_enable(), it should be better to check it and return error if fails. Fixes: b7370112f519 ("lpc32xx: Added ethernet driver") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16net: ethernet: ti: cpts: Handle error for clk_enableJiasheng Jiang1-1/+3
[ Upstream commit 6babfc6e6fab068018c36e8f6605184b8c0b349d ] As the potential failure of the clk_enable(), it should be better to check it and return error if fails. Fixes: 8a2c9a5ab4b9 ("net: ethernet: ti: cpts: rework initialization/deinitialization") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16ethernet: Fix error handling in xemaclite_of_probeMiaoqian Lin1-1/+3
[ Upstream commit b19ab4b38b06aae12442b2de95ccf58b5dc53584 ] This node pointer is returned by of_parse_phandle() with refcount incremented in this function. Calling of_node_put() to avoid the refcount leak. As the remove function do. Fixes: 5cdaaa12866e ("net: emaclite: adding MDIO and phy lib support") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220308024751.2320-1-linmq006@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16ice: Fix curr_link_speed advertised speedJedrzej Jagielski1-1/+1
[ Upstream commit ad35ffa252af67d4cc7c744b9377a2b577748e3f ] Change curr_link_speed advertised speed, due to link_info.link_speed is not equal phy.curr_user_speed_req. Without this patch it is impossible to set advertised speed to same as link_speed. Testing Hints: Try to set advertised speed to 25G only with 25G default link (use ethtool -s 0x80000000) Fixes: 48cb27f2fd18 ("ice: Implement handlers for ethtool PHY/link operations") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16ice: Don't use GFP_KERNEL in atomic contextChristophe JAILLET1-1/+1
[ Upstream commit 3d97f1afd8d831e0c0dc1157418f94b8faa97b54 ] ice_misc_intr() is an irq handler. It should not sleep. Use GFP_ATOMIC instead of GFP_KERNEL when allocating some memory. Fixes: 348048e724a0 ("ice: Implement iidc operations") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16ice: Fix error with handling of bonding MTUDave Ertman2-15/+15
[ Upstream commit 97b0129146b1544bbb0773585327896da3bb4e0a ] When a bonded interface is destroyed, .ndo_change_mtu can be called during the tear-down process while the RTNL lock is held. This is a problem since the auxiliary driver linked to the LAN driver needs to be notified of the MTU change, and this requires grabbing a device_lock on the auxiliary_device's dev. Currently this is being attempted in the same execution context as the call to .ndo_change_mtu which is causing a dead-lock. Move the notification of the changed MTU to a separate execution context (watchdog service task) and eliminate the "before" notification. Fixes: 348048e724a0e ("ice: Implement iidc operations") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Jonathan Toppins <jtoppins@redhat.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16ice: stop disabling VFs due to PF error responsesJacob Keller2-21/+0
[ Upstream commit 79498d5af8e458102242d1667cf44df1f1564e63 ] The ice_vc_send_msg_to_vf function has logic to detect "failure" responses being sent to a VF. If a VF is sent more than ICE_DFLT_NUM_INVAL_MSGS_ALLOWED then the VF is marked as disabled. Almost identical logic also existed in the i40e driver. This logic was added to the ice driver in commit 1071a8358a28 ("ice: Implement virtchnl commands for AVF support") which itself copied from the i40e implementation in commit 5c3c48ac6bf5 ("i40e: implement virtual device interface"). Neither commit provides a proper explanation or justification of the check. In fact, later commits to i40e changed the logic to allow bypassing the check in some specific instances. The "logic" for this seems to be that error responses somehow indicate a malicious VF. This is not really true. The PF might be sending an error for any number of reasons such as lack of resources, etc. Additionally, this causes the PF to log an info message for every failed VF response which may confuse users, and can spam the kernel log. This behavior is not documented as part of any requirement for our products and other operating system drivers such as the FreeBSD implementation of our drivers do not include this type of check. In fact, the change from dev_err to dev_info in i40e commit 18b7af57d9c1 ("i40e: Lower some message levels") explains that these messages typically don't actually indicate a real issue. It is quite likely that a user who hits this in practice will be very confused as the VF will be disabled without an obvious way to recover. We already have robust malicious driver detection logic using actual hardware detection mechanisms that detect and prevent invalid device usage. Remove the logic since its not a documented requirement and the behavior is not intuitive. Fixes: 1071a8358a28 ("ice: Implement virtchnl commands for AVF support") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16i40e: stop disabling VFs due to PF error responsesJacob Keller3-59/+9
[ Upstream commit 5710ab79166504013f7c0ae6a57e7d2fd26e5c43 ] The i40e_vc_send_msg_to_vf_ex (and its wrapper i40e_vc_send_msg_to_vf) function has logic to detect "failure" responses sent to the VF. If a VF is sent more than I40E_DEFAULT_NUM_INVALID_MSGS_ALLOWED, then the VF is marked as disabled. In either case, a dev_info message is printed stating that a VF opcode failed. This logic originates from the early implementation of VF support in commit 5c3c48ac6bf5 ("i40e: implement virtual device interface"). That commit did not go far enough. The "logic" for this behavior seems to be that error responses somehow indicate a malicious VF. This is not really true. The PF might be sending an error for any number of reasons such as lacking resources, an unsupported operation, etc. This does not indicate a malicious VF. We already have a separate robust malicious VF detection which relies on hardware logic to detect and prevent a variety of behaviors. There is no justification for this behavior in the original implementation. In fact, a later commit 18b7af57d9c1 ("i40e: Lower some message levels") reduced the opcode failure message from a dev_err to a dev_info. In addition, recent commit 01cbf50877e6 ("i40e: Fix to not show opcode msg on unsuccessful VF MAC change") changed the logic to allow quieting it for expected failures. That commit prevented this logic from kicking in for specific circumstances. This change did not go far enough. The behavior is not documented nor is it part of any requirement for our products. Other operating systems such as the FreeBSD implementation of our driver do not include this logic. It is clear this check does not make sense, and causes problems which led to ugly workarounds. Fix this by just removing the entire logic and the need for the i40e_vc_send_msg_to_vf_ex function. Fixes: 01cbf50877e6 ("i40e: Fix to not show opcode msg on unsuccessful VF MAC change") Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16iavf: Fix handling of vlan strip virtual channel messagesMichal Maloszewski1-0/+40
[ Upstream commit 2cf29e55894886965722e6625f6a03630b4db31d ] Modify netdev->features for vlan stripping based on virtual channel messages received from the PF. Change is needed to synchronize vlan strip status between PF sysfs and iavf ethtool. Fixes: 5951a2b9812d ("iavf: Fix VLAN feature flags after VFR") Signed-off-by: Norbert Ciosek <norbertx.ciosek@intel.com> Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16qed: return status of qed_iov_get_linkTom Rix1-7/+11
[ Upstream commit d9dc0c84ad2d4cc911ba252c973d1bf18d5eb9cf ] Clang static analysis reports this issue qed_sriov.c:4727:19: warning: Assigned value is garbage or undefined ivi->max_tx_rate = tx_rate ? tx_rate : link.speed; ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ link is only sometimes set by the call to qed_iov_get_link() qed_iov_get_link fails without setting link or returning status. So change the decl to return status. Fixes: 73390ac9d82b ("qed*: support ndo_get_vf_config") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16net: qlogic: check the return value of dma_alloc_coherent() in ↵Jia-Ju Bai1-0/+7
qed_vf_hw_prepare() [ Upstream commit e0058f0fa80f6e09c4d363779c241c45a3c56b94 ] The function dma_alloc_coherent() in qed_vf_hw_prepare() can fail, so its return value should be checked. Fixes: 1408cc1fa48c ("qed: Introduce VFs") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08net: stmmac: perserve TX and RX coalesce value during XDP setupOng Boon Leong1-2/+3
commit 61da6ac715700bcfeef50d187e15c6cc7c9d079b upstream. When XDP program is loaded, it is desirable that the previous TX and RX coalesce values are not re-inited to its default value. This prevents unnecessary re-configurig the coalesce values that were working fine before. Fixes: ac746c8520d9 ("net: stmmac: enhance XDP ZC driver level switching performance") Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> Link: https://lore.kernel.org/r/20211124114019.3949125-1-boon.leong.ong@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-08e1000e: Fix possible HW unit hang after an s0ix exitSasha Neftin4-0/+32
[ Upstream commit 1866aa0d0d6492bc2f8d22d0df49abaccf50cddd ] Disable the OEM bit/Gig Disable/restart AN impact and disable the PHY LAN connected device (LCD) reset during power management flows. This fixes possible HW unit hangs on the s0ix exit on some corporate ADL platforms. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=214821 Fixes: 3e55d231716e ("e1000e: Add handshake with the CSME to support S0ix") Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Suggested-by: Nir Efrati <nir.efrati@intel.com> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08iavf: Fix __IAVF_RESETTING state usageSlawomir Laba1-7/+6
[ Upstream commit 14756b2ae265d526b8356e86729090b01778fdf6 ] The setup of __IAVF_RESETTING state in watchdog task had no effect and could lead to slow resets in the driver as the task for __IAVF_RESETTING state only requeues watchdog. Till now the __IAVF_RESETTING was interpreted by reset task as running state which could lead to errors with allocating and resources disposal. Make watchdog_task queue the reset task when it's necessary. Do not update the state to __IAVF_RESETTING so the reset task knows exactly what is the current state of the adapter. Fixes: 898ef1cb1cb2 ("iavf: Combine init and watchdog state machines") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08iavf: Fix race in init stateSlawomir Laba1-1/+2
[ Upstream commit a472eb5cbaebb5774672c565e024336c039e9128 ] When iavf_init_version_check sends VIRTCHNL_OP_GET_VF_RESOURCES message, the driver will wait for the response after requeueing the watchdog task in iavf_init_get_resources call stack. The logic is implemented this way that iavf_init_get_resources has to be called in order to allocate adapter->vf_res. It is polling for the AQ response in iavf_get_vf_config function. Expect a call trace from kernel when adminq_task worker handles this message first. adapter->vf_res will be NULL in iavf_virtchnl_completion. Make the watchdog task not queue the adminq_task if the init process is not finished yet. Fixes: 898ef1cb1cb2 ("iavf: Combine init and watchdog state machines") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08iavf: Fix locking for VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPSSlawomir Laba3-14/+17
[ Upstream commit 0579fafd37fb7efe091f0e6c8ccf968864f40f3e ] iavf_virtchnl_completion is called under crit_lock but when the code for VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS is called, this lock is released in order to obtain rtnl_lock to avoid ABBA deadlock with unregister_netdev. Along with the new way iavf_remove behaves, there exist many risks related to the lock release and attmepts to regrab it. The driver faces crashes related to races between unregister_netdev and netdev_update_features. Yet another risk is that the driver could already obtain the crit_lock in order to destroy it and iavf_virtchnl_completion could crash or block forever. Make iavf_virtchnl_completion never relock crit_lock in it's call paths. Extract rtnl_lock locking logic to the driver for unregister_netdev in order to set the netdev_registered flag inside the lock. Introduce a new flag that will inform adminq_task to perform the code from VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS right after it finishes processing messages. Guard this code with remove flags so it's never called when the driver is in remove state. Fixes: 5951a2b9812d ("iavf: Fix VLAN feature flags after VFR") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08iavf: Fix init state closure on removeSlawomir Laba2-1/+27
[ Upstream commit 3ccd54ef44ebfa0792c5441b6d9c86618f3378d1 ] When init states of the adapter work, the errors like lack of communication with the PF might hop in. If such events occur the driver restores previous states in order to retry initialization in a proper way. When remove task kicks in, this situation could lead to races with unregistering the netdevice as well as resources cleanup. With the commit introducing the waiting in remove for init to complete, this problem turns into an endless waiting if init never recovers from errors. Introduce __IAVF_IN_REMOVE_TASK bit to indicate that the remove thread has started. Make __IAVF_COMM_FAILED adapter state respect the __IAVF_IN_REMOVE_TASK bit and set the __IAVF_INIT_FAILED state and return without any action instead of trying to recover. Make __IAVF_INIT_FAILED adapter state respect the __IAVF_IN_REMOVE_TASK bit and return without any further actions. Make the loop in the remove handler break when adapter has __IAVF_INIT_FAILED state set. Fixes: 898ef1cb1cb2 ("iavf: Combine init and watchdog state machines") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08iavf: Add waiting so the port is initialized in removeSlawomir Laba1-11/+16
[ Upstream commit 974578017fc1fdd06cea8afb9dfa32602e8529ed ] There exist races when port is being configured and remove is triggered. unregister_netdev is not and can't be called under crit_lock mutex since it is calling ndo_stop -> iavf_close which requires this lock. Depending on init state the netdev could be still unregistered so unregister_netdev never cleans up, when shortly after that the device could become registered. Make iavf_remove wait until port finishes initialization. All critical state changes are atomic (under crit_lock). Crashes that come from iavf_reset_interrupt_capability and iavf_free_traffic_irqs should now be solved in a graceful manner. Fixes: 605ca7c5c6707 ("iavf: Fix kernel BUG in free_msi_irqs") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08iavf: Rework mutexes for better synchronisationSlawomir Laba2-31/+36
[ Upstream commit fc2e6b3b132a907378f6af08356b105a4139c4fb ] The driver used to crash in multiple spots when put to stress testing of the init, reset and remove paths. The user would experience call traces or hangs when creating, resetting, removing VFs. Depending on the machines, the call traces are happening in random spots, like reset restoring resources racing with driver remove. Make adapter->crit_lock mutex a mandatory lock for guarding the operations performed on all workqueues and functions dealing with resource allocation and disposal. Make __IAVF_REMOVE a final state of the driver respected by workqueues that shall not requeue, when they fail to obtain the crit_lock. Make the IRQ handler not to queue the new work for adminq_task when the __IAVF_REMOVE state is set. Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08iavf: Add trace while removing deviceJedrzej Jagielski1-0/+1
[ Upstream commit bdb9e5c7aec73a7b8b5acab37587b6de1203e68d ] Add kernel trace that device was removed. Currently there is no such information. I.e. Host admin removes a PCI device from a VM, than on VM shall be info about the event. This patch adds info log to iavf_remove function. Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08net: sparx5: Fix add vlan when invalid operationCasper Andersson1-10/+10
[ Upstream commit b3a34dc362c03215031b268fcc0b988e69490231 ] Check if operation is valid before changing any settings in hardware. Otherwise it results in changes being made despite it not being a valid operation. Fixes: 78eab33bb68b ("net: sparx5: add vlan support") Signed-off-by: Casper Andersson <casper.casan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08net: chelsio: cxgb3: check the return value of pci_find_capability()Jia-Ju Bai1-0/+2
[ Upstream commit 767b9825ed1765894e569a3d698749d40d83762a ] The function pci_find_capability() in t3_prep_adapter() can fail, so its return value should be checked. Fixes: 4d22de3e6cc4 ("Add support for the latest 1G/10G Chelsio adapter, T3") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08ibmvnic: Allow queueing resets during probeSukadev Bhattiprolu2-10/+98
[ Upstream commit fd98693cb0721317f27341951593712c580c36a1 ] We currently don't allow queuing resets when adapter is in VNIC_PROBING state - instead we throw away the reset and return EBUSY. The reasoning is probably that during ibmvnic_probe() the ibmvnic_adapter itself is being initialized so performing a reset during this time can lead us to accessing fields in the ibmvnic_adapter that are not fully initialized. A review of the code shows that all the adapter state neede to process a reset is initialized before registering the CRQ so that should no longer be a concern. Further the expectation is that if we do get a reset (transport event) during probe, the do..while() loop in ibmvnic_probe() will handle this by reinitializing the CRQ. While that is true to some extent, it is possible that the reset might occur _after_ the CRQ is registered and CRQ_INIT message was exchanged but _before_ the adapter state is set to VNIC_PROBED. As mentioned above, such a reset will be thrown away. While the client assumes that the adapter is functional, the vnic server will wait for the client to reinit the adapter. This disconnect between the two leaves the adapter down needing manual intervention. Because ibmvnic_probe() has other work to do after initializing the CRQ (such as registering the netdev at a minimum) and because the reset event can occur at any instant after the CRQ is initialized, there will always be a window between initializing the CRQ and considering the adapter ready for resets (ie state == PROBED). So rather than discarding resets during this window, allow queueing them - but only process them after the adapter is fully initialized. To do this, introduce a new completion state ->probe_done and have the reset worker thread wait on this before processing resets. This change brings up two new situations in or just after ibmvnic_probe(). First after one or more resets were queued, we encounter an error and decide to retry the initialization. At that point the queued resets are no longer relevant since we could be talking to a new vnic server. So we must purge/flush the queued resets before restarting the initialization. As a side note, since we are still in the probing stage and we have not registered the netdev, it will not be CHANGE_PARAM reset. Second this change opens up a potential race between the worker thread in __ibmvnic_reset(), the tasklet and the ibmvnic_open() due to the following sequence of events: 1. Register CRQ 2. Get transport event before CRQ_INIT completes. 3. Tasklet schedules reset: a) add rwi to list b) schedule_work() to start worker thread which runs and waits for ->probe_done. 4. ibmvnic_probe() decides to retry, purges rwi_list 5. Re-register crq and this time rest of probe succeeds - register netdev and complete(->probe_done). 6. Worker thread resumes in __ibmvnic_reset() from 3b. 7. Worker thread sets ->resetting bit 8. ibmvnic_open() comes in, notices ->resetting bit, sets state to IBMVNIC_OPEN and returns early expecting worker thread to finish the open. 9. Worker thread finds rwi_list empty and returns without opening the interface. If this happens, the ->ndo_open() call is effectively lost and the interface remains down. To address this, ensure that ->rwi_list is not empty before setting the ->resetting bit. See also comments in __ibmvnic_reset(). Fixes: 6a2fb0e99f9c ("ibmvnic: driver initialization for kdump/kexec") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08ibmvnic: clear fop when retrying probeSukadev Bhattiprolu1-0/+5
[ Upstream commit f628ad531b4f34fdba0984255b4a2850dd369513 ] Clear ->failover_pending flag that may have been set in the previous pass of registering CRQ. If we don't clear, a subsequent ibmvnic_open() call would be misled into thinking a failover is pending and assuming that the reset worker thread would open the adapter. If this pass of registering the CRQ succeeds (i.e there is no transport event), there wouldn't be a reset worker thread. This would leave the adapter unconfigured and require manual intervention to bring it up during boot. Fixes: 5a18e1e0c193 ("ibmvnic: Fix failover case for non-redundant configuration") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08ibmvnic: init init_done_rc earlierSukadev Bhattiprolu1-5/+21
[ Upstream commit ae16bf15374d8b055e040ac6f3f1147ab1c9bb7d ] We currently initialize the ->init_done completion/return code fields before issuing a CRQ_INIT command. But if we get a transport event soon after registering the CRQ the taskslet may already have recorded the completion and error code. If we initialize here, we might overwrite/ lose that and end up issuing the CRQ_INIT only to timeout later. If that timeout happens during probe, we will leave the adapter in the DOWN state rather than retrying to register/init the CRQ. Initialize the completion before registering the CRQ so we don't lose the notification. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08ibmvnic: Update driver return codesDany Madden1-30/+34
[ Upstream commit b6ee566cf3940883d67c0d142fae8d410e975f47 ] Update return codes to be more informative. Signed-off-by: Jacob Root <otis@otisroot.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08ibmvnic: complete init_done on transport eventsSukadev Bhattiprolu1-0/+7
[ Upstream commit 36491f2df9ad2501e5a4ec25d3d95d72bafd2781 ] If we get a transport event, set the error and mark the init as complete so the attempt to send crq-init or login fail sooner rather than wait for the timeout. Fixes: bbd669a868bb ("ibmvnic: Fix completion structure initialization") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08ibmvnic: define flush_reset_queue helperSukadev Bhattiprolu1-8/+16
[ Upstream commit 83da53f7e4bd86dca4b2edc1e2bb324fb3c033a1 ] Define and use a helper to flush the reset queue. Fixes: 2770a7984db5 ("ibmvnic: Introduce hard reset recovery") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08ibmvnic: initialize rc before completing waitSukadev Bhattiprolu1-1/+1
[ Upstream commit 765559b10ce514eb1576595834f23cdc92125fee ] We should initialize ->init_done_rc before calling complete(). Otherwise the waiting thread may see ->init_done_rc as 0 before we have updated it and may assume that the CRQ was successful. Fixes: 6b278c0cb378 ("ibmvnic delay complete()") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>