summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)AuthorFilesLines
2022-04-08Revert "iavf: Fix deadlock occurrence during resetting VF interface"Mateusz Palczewski1-5/+2
This change caused a regression with resetting while changing network namespaces. By clearing the IFF_UP flag, the kernel now thinks it has fully closed the device. This reverts commit 0cc318d2e8408bc0ffb4662a0c3e5e57005ac6ff. Fixes: 0cc318d2e840 ("iavf: Fix deadlock occurrence during resetting VF interface") Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-08ice: arfs: fix use-after-free when freeing @rx_cpu_rmapAlexander Lobakin3-18/+14
The CI testing bots triggered the following splat: [ 718.203054] BUG: KASAN: use-after-free in free_irq_cpu_rmap+0x53/0x80 [ 718.206349] Read of size 4 at addr ffff8881bd127e00 by task sh/20834 [ 718.212852] CPU: 28 PID: 20834 Comm: sh Kdump: loaded Tainted: G S W IOE 5.17.0-rc8_nextqueue-devqueue-02643-g23f3121aca93 #1 [ 718.219695] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0012.070720200218 07/07/2020 [ 718.223418] Call Trace: [ 718.227139] [ 718.230783] dump_stack_lvl+0x33/0x42 [ 718.234431] print_address_description.constprop.9+0x21/0x170 [ 718.238177] ? free_irq_cpu_rmap+0x53/0x80 [ 718.241885] ? free_irq_cpu_rmap+0x53/0x80 [ 718.245539] kasan_report.cold.18+0x7f/0x11b [ 718.249197] ? free_irq_cpu_rmap+0x53/0x80 [ 718.252852] free_irq_cpu_rmap+0x53/0x80 [ 718.256471] ice_free_cpu_rx_rmap.part.11+0x37/0x50 [ice] [ 718.260174] ice_remove_arfs+0x5f/0x70 [ice] [ 718.263810] ice_rebuild_arfs+0x3b/0x70 [ice] [ 718.267419] ice_rebuild+0x39c/0xb60 [ice] [ 718.270974] ? asm_sysvec_apic_timer_interrupt+0x12/0x20 [ 718.274472] ? ice_init_phy_user_cfg+0x360/0x360 [ice] [ 718.278033] ? delay_tsc+0x4a/0xb0 [ 718.281513] ? preempt_count_sub+0x14/0xc0 [ 718.284984] ? delay_tsc+0x8f/0xb0 [ 718.288463] ice_do_reset+0x92/0xf0 [ice] [ 718.292014] ice_pci_err_resume+0x91/0xf0 [ice] [ 718.295561] pci_reset_function+0x53/0x80 <...> [ 718.393035] Allocated by task 690: [ 718.433497] Freed by task 20834: [ 718.495688] Last potentially related work creation: [ 718.568966] The buggy address belongs to the object at ffff8881bd127e00 which belongs to the cache kmalloc-96 of size 96 [ 718.574085] The buggy address is located 0 bytes inside of 96-byte region [ffff8881bd127e00, ffff8881bd127e60) [ 718.579265] The buggy address belongs to the page: [ 718.598905] Memory state around the buggy address: [ 718.601809] ffff8881bd127d00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ 718.604796] ffff8881bd127d80: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc [ 718.607794] >ffff8881bd127e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ 718.610811] ^ [ 718.613819] ffff8881bd127e80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc [ 718.617107] ffff8881bd127f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc This is due to that free_irq_cpu_rmap() is always being called *after* (devm_)free_irq() and thus it tries to work with IRQ descs already freed. For example, on device reset the driver frees the rmap right before allocating a new one (the splat above). Make rmap creation and freeing function symmetrical with {request,free}_irq() calls i.e. do that on ifup/ifdown instead of device probe/remove/resume. These operations can be performed independently from the actual device aRFS configuration. Also, make sure ice_vsi_free_irq() clears IRQ affinity notifiers only when aRFS is disabled -- otherwise, CPU rmap sets and clears its own and they must not be touched manually. Fixes: 28bf26724fdb0 ("ice: Implement aRFS") Co-developed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-08Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/nexDavid S. Miller2-290/+211
t-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2022-04-07 Alexander Lobakin says: This hunts down several places around packet templates/dummies for switch rules which are either repetitive, fragile or just not really readable code. It's a common need to add new packet templates and to review such changes as well, try to simplify both with the help of a pair macros and aliases. ice_find_dummy_packet() became very complex at this point with tons of nested if-elses. It clearly showed this approach does not scale, so convert its logics to the simple mask-match + static const array. bloat-o-meter is happy about that (built w/ LLVM 13): add/remove: 0/1 grow/shrink: 1/1 up/down: 2/-1058 (-1056) Function old new delta ice_fill_adv_dummy_packet 289 291 +2 ice_adv_add_update_vsi_list 201 - -201 ice_add_adv_rule 2950 2093 -857 Total: Before=414512, After=413456, chg -0.25% add/remove: 53/52 grow/shrink: 0/0 up/down: 4660/-3988 (672) RO Data old new delta ice_dummy_pkt_profiles - 672 +672 Total: Before=37895, After=38567, chg +1.77% Diffstat also looks nice, and adding new packet templates now takes less lines. We'll probably come out with dynamic template crafting in a while, but for now let's improve what we have currently. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-07ice: switch: convert packet template match code to rodataAlexander Lobakin1-107/+108
Trade text size for rodata size and replace tons of nested if-elses to the const mask match based structs. The almost entire ice_find_dummy_packet() now becomes just one plain while-increment loop. The order in ice_dummy_pkt_profiles[] should be same with the if-elses order previously, as masks become less and less strict through the array to follow the original code flow. Apart from removing 80 locs of 4-level if-elses, it brings a solid text size optimization: add/remove: 0/1 grow/shrink: 1/1 up/down: 2/-1058 (-1056) Function old new delta ice_fill_adv_dummy_packet 289 291 +2 ice_adv_add_update_vsi_list 201 - -201 ice_add_adv_rule 2950 2093 -857 Total: Before=414512, After=413456, chg -0.25% add/remove: 53/52 grow/shrink: 0/0 up/down: 4660/-3988 (672) RO Data old new delta ice_dummy_pkt_profiles - 672 +672 Total: Before=37895, After=38567, chg +1.77% Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-07ice: switch: use convenience macros to declare dummy pkt templatesAlexander Lobakin1-71/+62
Declarations of dummy/template packet headers and offsets can be minified to improve readability and simplify adding new templates. Move all the repetitive constructions into two macros and let them do the name and type expansions. Linewrap removal is yet another positive side effect. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-07ice: switch: use a struct to pass packet template paramsAlexander Lobakin1-174/+94
ice_find_dummy_packet() contains a lot of boilerplate code and a nice room for copy-paste mistakes. Instead of passing 3 separate pointers back and forth to get packet template (dummy) params, directly return a structure containing them. Then, use a macro to compose compound literals and avoid code duplication on return path. Now, dummy packet type/name is needed only once to return a full correct triple pkt-pkt_len-offsets, and those are all one-liners. dummy_ipv4_gtpu_ipv4_packet_offsets is just moved around and renamed (as well as dummy_ipv6_gtp_packet_offsets) with no function changes. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-07ice: switch: unobscurify bitops loop in ice_fill_adv_dummy_packet()Alexander Lobakin1-7/+9
A loop performing header modification according to the provided mask in ice_fill_adv_dummy_packet() is very cryptic (and error-prone). Replace two identical cast-deferences with a variable. Replace three struct-member-array-accesses with a variable. Invert the condition, reduce the indentation by one -> eliminate line wraps. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-07ice: switch: add and use u16[] aliases to ice_adv_lkup_elem::{h, m}_uAlexander Lobakin2-10/+17
ice_adv_lkup_elem fields h_u and m_u are being accessed as raw u16 arrays in several places. To reduce cast and braces burden, add permanent array-of-u16 aliases with the same size as the `union ice_prot_hdr` itself via anonymous unions to the actual struct declaration, and just access them directly. This: - removes the need to cast the union to u16[] and then dereference it each time -> reduces the horizon for potential bugs; - improves -Warray-bounds coverage -- the array size is now known at compilation time; - addresses cppcheck complaints. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-05ice: clear cmd_type_offset_bsz for TX ringsMaciej Fijalkowski1-1/+1
Currently when XDP rings are created, each descriptor gets its DD bit set, which turns out to be the wrong approach as it can lead to a situation where more descriptors get cleaned than it was supposed to, e.g. when AF_XDP busy poll is run with a large batch size. In this situation, the driver would request for more buffers than it is able to handle. Fix this by not setting the DD bits in ice_xdp_alloc_setup_rings(). They should be initialized to zero instead. Fixes: 9610bd988df9 ("ice: optimize XDP_TX workloads") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Shwetha Nagaraju <shwetha.nagaraju@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-05ice: xsk: fix VSI state check in ice_xsk_wakeup()Maciej Fijalkowski1-1/+1
ICE_DOWN is dedicated for pf->state. Check for ICE_VSI_DOWN being set on vsi->state in ice_xsk_wakeup(). Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Shwetha Nagaraju <shwetha.nagaraju@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-05ice: synchronize_rcu() when terminating ringsMaciej Fijalkowski3-3/+7
Unfortunately, the ice driver doesn't respect the RCU critical section that XSK wakeup is surrounded with. To fix this, add synchronize_rcu() calls to paths that destroy resources that might be in use. This was addressed in other AF_XDP ZC enabled drivers, for reference see for example commit b3873a5be757 ("net/i40e: Fix concurrency issues between config flow and XSK") Fixes: efc2214b6047 ("ice: Add support for XDP") Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Shwetha Nagaraju <shwetha.nagaraju@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-05ice: Do not skip not enabled queues in ice_vc_dis_qs_msgAnatolii Gerasymenko1-2/+2
Disable check for queue being enabled in ice_vc_dis_qs_msg, because there could be a case when queues were created, but were not enabled. We still need to delete those queues. Normal workflow for VF looks like: Enable path: VIRTCHNL_OP_ADD_ETH_ADDR (opcode 10) VIRTCHNL_OP_CONFIG_VSI_QUEUES (opcode 6) VIRTCHNL_OP_ENABLE_QUEUES (opcode 8) Disable path: VIRTCHNL_OP_DISABLE_QUEUES (opcode 9) VIRTCHNL_OP_DEL_ETH_ADDR (opcode 11) The issue appears only in stress conditions when VF is enabled and disabled very fast. Eventually there will be a case, when queues are created by VIRTCHNL_OP_CONFIG_VSI_QUEUES, but are not enabled by VIRTCHNL_OP_ENABLE_QUEUES. In turn, these queues are not deleted by VIRTCHNL_OP_DISABLE_QUEUES, because there is a check whether queues are enabled in ice_vc_dis_qs_msg. When we bring up the VF again, we will see the "Failed to set LAN Tx queue context" error during VIRTCHNL_OP_CONFIG_VSI_QUEUES step. This happens because old 16 queues were not deleted and VF requests to create 16 more, but ice_sched_get_free_qparent in ice_ena_vsi_txq would fail to find a parent node for first newly requested queue (because all nodes are allocated to 16 old queues). Testing Hints: Just enable and disable VF fast enough, so it would be disabled before reaching VIRTCHNL_OP_ENABLE_QUEUES. while true; do ip link set dev ens785f0v0 up sleep 0.065 # adjust delay value for you machine ip link set dev ens785f0v0 down done Fixes: 77ca27c41705 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap") Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-05ice: Set txq_teid to ICE_INVAL_TEID on ring creationAnatolii Gerasymenko1-0/+1
When VF is freshly created, but not brought up, ring->txq_teid value is by default set to 0. But 0 is a valid TEID. On some platforms the Root Node of Tx scheduler has a TEID = 0. This can cause issues as shown below. The proper way is to set ring->txq_teid to ICE_INVAL_TEID (0xFFFFFFFF). Testing Hints: echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs ip link set dev ens785f0v0 up ip link set dev ens785f0v0 down If we have freshly created VF and quickly turn it on and off, so there would be no time to reach VIRTCHNL_OP_CONFIG_VSI_QUEUES stage, then VIRTCHNL_OP_DISABLE_QUEUES stage will fail with error: [ 639.531454] disable queue 89 failed 14 [ 639.532233] Failed to disable LAN Tx queues, error: ICE_ERR_AQ_ERROR [ 639.533107] ice 0000:02:00.0: Failed to stop Tx ring 0 on VSI 5 The reason for the fail is that we are trying to send AQ command to delete queue 89, which has never been created and receive an "invalid argument" error from firmware. As this queue has never been created, it's teid and ring->txq_teid have default value 0. ice_dis_vsi_txq has a check against non-existent queues: node = ice_sched_find_node_by_teid(pi->root, q_teids[i]); if (!node) continue; But on some platforms the Root Node of Tx scheduler has a teid = 0. Hence, ice_sched_find_node_by_teid finds a node with teid = 0 (it is pi->root), and we go further to submit an erroneous request to firmware. Fixes: 37bb83901286 ("ice: Move common functions out of ice_main.c part 7/7") Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-01ice: Fix broken IFF_ALLMULTI handlingIvan Vecera3-38/+121
Handling of all-multicast flag and associated multicast promiscuous mode is broken in ice driver. When an user switches allmulticast flag on or off the driver checks whether any VLANs are configured over the interface (except default VLAN 0). If any extra VLANs are registered it enables multicast promiscuous mode for all these VLANs (including default VLAN 0) using ICE_SW_LKUP_PROMISC_VLAN look-up type. In this situation all multicast packets tagged with known VLAN ID or untagged are received and multicast packets tagged with unknown VLAN ID ignored. If no extra VLANs are registered (so only VLAN 0 exists) it enables multicast promiscuous mode for VLAN 0 and uses ICE_SW_LKUP_PROMISC look-up type. In this situation any multicast packets including tagged ones are received. The driver handles IFF_ALLMULTI in ice_vsi_sync_fltr() this way: ice_vsi_sync_fltr() { ... if (changed_flags & IFF_ALLMULTI) { if (netdev->flags & IFF_ALLMULTI) { if (vsi->num_vlans > 1) ice_set_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS); else ice_set_promisc(..., ICE_MCAST_PROMISC_BITS); } else { if (vsi->num_vlans > 1) ice_clear_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS); else ice_clear_promisc(..., ICE_MCAST_PROMISC_BITS); } } ... } The code above depends on value vsi->num_vlan that specifies number of VLANs configured over the interface (including VLAN 0) and this is problem because that value is modified in NDO callbacks ice_vlan_rx_add_vid() and ice_vlan_rx_kill_vid(). Scenario 1: 1. ip link set ens7f0 allmulticast on 2. ip link add vlan10 link ens7f0 type vlan id 10 3. ip link set ens7f0 allmulticast off 4. ip link set ens7f0 allmulticast on [1] In this scenario IFF_ALLMULTI is enabled and the driver calls ice_set_promisc(..., ICE_MCAST_PROMISC_BITS) that installs multicast promisc rule with non-VLAN look-up type. [2] Then VLAN with ID 10 is added and vsi->num_vlan incremented to 2 [3] Command switches IFF_ALLMULTI off and the driver calls ice_clear_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS) but this call is effectively NOP because it looks for multicast promisc rules for VLAN 0 and VLAN 10 with VLAN look-up type but no such rules exist. So the all-multicast remains enabled silently in hardware. [4] Command tries to switch IFF_ALLMULTI on and the driver calls ice_clear_promisc(..., ICE_MCAST_PROMISC_BITS) but this call fails (-EEXIST) because non-VLAN multicast promisc rule already exists. Scenario 2: 1. ip link add vlan10 link ens7f0 type vlan id 10 2. ip link set ens7f0 allmulticast on 3. ip link add vlan20 link ens7f0 type vlan id 20 4. ip link del vlan10 ; ip link del vlan20 5. ip link set ens7f0 allmulticast off [1] VLAN with ID 10 is added and vsi->num_vlan==2 [2] Command switches IFF_ALLMULTI on and driver installs multicast promisc rules with VLAN look-up type for VLAN 0 and 10 [3] VLAN with ID 20 is added and vsi->num_vlan==3 but no multicast promisc rules is added for this new VLAN so the interface does not receive MC packets from VLAN 20 [4] Both VLANs are removed but multicast rule for VLAN 10 remains installed so interface receives multicast packets from VLAN 10 [5] Command switches IFF_ALLMULTI off and because vsi->num_vlan is 1 the driver tries to remove multicast promisc rule for VLAN 0 with non-VLAN look-up that does not exist. All-multicast looks disabled from user point of view but it is partially enabled in HW (interface receives all multicast packets either untagged or tagged with VLAN ID 10) To resolve these issues the patch introduces these changes: 1. Adds handling for IFF_ALLMULTI to ice_vlan_rx_add_vid() and ice_vlan_rx_kill_vid() callbacks. So when VLAN is added/removed and IFF_ALLMULTI is enabled an appropriate multicast promisc rule for that VLAN ID is added/removed. 2. In ice_vlan_rx_add_vid() when first VLAN besides VLAN 0 is added so (vsi->num_vlan == 2) and IFF_ALLMULTI is enabled then look-up type for existing multicast promisc rule for VLAN 0 is updated to ICE_MCAST_VLAN_PROMISC_BITS. 3. In ice_vlan_rx_kill_vid() when last VLAN besides VLAN 0 is removed so (vsi->num_vlan == 1) and IFF_ALLMULTI is enabled then look-up type for existing multicast promisc rule for VLAN 0 is updated to ICE_MCAST_PROMISC_BITS. 4. Both ice_vlan_rx_{add,kill}_vid() have to run under ICE_CFG_BUSY bit protection to avoid races with ice_vsi_sync_fltr() that runs in ice_service_task() context. 5. Bit ICE_VSI_VLAN_FLTR_CHANGED is use-less and can be removed. 6. Error messages added to ice_fltr_*_vsi_promisc() helper functions to avoid them in their callers 7. Small improvements to increase readability Fixes: 5eda8afd6bcc ("ice: Add support for PF/VF promiscuous mode") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-01ice: Fix MAC address settingIvan Vecera1-2/+5
Commit 2ccc1c1ccc671b ("ice: Remove excess error variables") merged the usage of 'status' and 'err' variables into single one in function ice_set_mac_address(). Unfortunately this causes a regression when call of ice_fltr_add_mac() returns -EEXIST because this return value does not indicate an error in this case but value of 'err' remains to be -EEXIST till the end of the function and is returned to caller. Prior mentioned commit this does not happen because return value of ice_fltr_add_mac() was stored to 'status' variable first and if it was -EEXIST then 'err' remains to be zero. Fix the problem by reset 'err' to zero when ice_fltr_add_mac() returns -EEXIST. Fixes: 2ccc1c1ccc671b ("ice: Remove excess error variables") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-01ice: Clear default forwarding VSI during VSI releaseIvan Vecera1-0/+2
VSI is set as default forwarding one when promisc mode is set for PF interface, when PF is switched to switchdev mode or when VF driver asks to enable allmulticast or promisc mode for the VF interface (when vf-true-promisc-support priv flag is off). The third case is buggy because in that case VSI associated with VF remains as default one after VF removal. Reproducer: 1. Create VF echo 1 > sys/class/net/ens7f0/device/sriov_numvfs 2. Enable allmulticast or promisc mode on VF ip link set ens7f0v0 allmulticast on ip link set ens7f0v0 promisc on 3. Delete VF echo 0 > sys/class/net/ens7f0/device/sriov_numvfs 4. Try to enable promisc mode on PF ip link set ens7f0 promisc on Although it looks that promisc mode on PF is enabled the opposite is true because ice_vsi_sync_fltr() responsible for IFF_PROMISC handling first checks if any other VSI is set as default forwarding one and if so the function does not do anything. At this point it is not possible to enable promisc mode on PF without re-probe device. To resolve the issue this patch clear default forwarding VSI during ice_vsi_release() when the VSI to be released is the default one. Fixes: 01b5e89aab49 ("ice: Add VF promiscuous support") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-29ice: xsk: Fix indexing in ice_tx_xsk_pool()Maciej Fijalkowski1-1/+1
Ice driver tries to always create XDP rings array to be num_possible_cpus() sized, regardless of user's queue count setting that can be changed via ethtool -L for example. Currently, ice_tx_xsk_pool() calculates the qid by decrementing the ring->q_index by the count of XDP queues, but ring->q_index is set to 'i + vsi->alloc_txq'. When user did ethtool -L $IFACE combined 1, alloc_txq is 1, but vsi->num_xdp_txq is still num_possible_cpus(). Then, ice_tx_xsk_pool() will do OOB access and in the final result ring would not get xsk_pool pointer assigned. Then, each ice_xsk_wakeup() call will fail with error and it will not be possible to get into NAPI and do the processing from driver side. Fix this by decrementing vsi->alloc_txq instead of vsi->num_xdp_txq from ring-q_index in ice_tx_xsk_pool() so the calculation is reflected to the setting of ring->q_index. Fixes: 22bf877e528f ("ice: introduce XDP_TX fallback path") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220328142123.170157-5-maciej.fijalkowski@intel.com
2022-03-29ice: xsk: Stop Rx processing when ntc catches ntuMaciej Fijalkowski1-0/+3
This can happen with big budget values and some breakage of re-filling descriptors as we do not clear the entry that ntu is pointing at the end of ice_alloc_rx_bufs_zc. So if ntc is at ntu then it might be the case that status_error0 has an old, uncleared value and ntc would go over with processing which would result in false results. Break Rx loop when ntc == ntu to avoid broken behavior. Fixes: 3876ff525de7 ("ice: xsk: Handle SW XDP ring wrap and bump tail more often") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220328142123.170157-4-maciej.fijalkowski@intel.com
2022-03-29ice: xsk: Eliminate unnecessary loop iterationMagnus Karlsson1-1/+1
The NIC Tx ring completion routine cleans entries from the ring in batches. However, it processes one more batch than it is supposed to. Note that this does not matter from a functionality point of view since it will not find a set DD bit for the next batch and just exit the loop. But from a performance perspective, it is faster to terminate the loop before and not issue an expensive read over PCIe to get the DD bit. Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220328142123.170157-3-maciej.fijalkowski@intel.com
2022-03-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-10/+20
Merge in overtime fixes, no conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-23ice: don't allow to run ice_send_event_to_aux() in atomic ctxAlexander Lobakin1-0/+3
ice_send_event_to_aux() eventually descends to mutex_lock() (-> might_sched()), so it must not be called under non-task context. However, at least two fixes have happened already for the bug splats occurred due to this function being called from atomic context. To make the emergency landings softer, bail out early when executed in non-task context emitting a warn splat only once. This way we trade some events being potentially lost for system stability and avoid any related hangs and crashes. Fixes: 348048e724a0e ("ice: Implement iidc operations") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-23ice: fix 'scheduling while atomic' on aux critical err interruptAlexander Lobakin2-10/+17
There's a kernel BUG splat on processing aux critical error interrupts in ice_misc_intr(): [ 2100.917085] BUG: scheduling while atomic: swapper/15/0/0x00010000 ... [ 2101.060770] Call Trace: [ 2101.063229] <IRQ> [ 2101.065252] dump_stack+0x41/0x60 [ 2101.068587] __schedule_bug.cold.100+0x4c/0x58 [ 2101.073060] __schedule+0x6a4/0x830 [ 2101.076570] schedule+0x35/0xa0 [ 2101.079727] schedule_preempt_disabled+0xa/0x10 [ 2101.084284] __mutex_lock.isra.7+0x310/0x420 [ 2101.088580] ? ice_misc_intr+0x201/0x2e0 [ice] [ 2101.093078] ice_send_event_to_aux+0x25/0x70 [ice] [ 2101.097921] ice_misc_intr+0x220/0x2e0 [ice] [ 2101.102232] __handle_irq_event_percpu+0x40/0x180 [ 2101.106965] handle_irq_event_percpu+0x30/0x80 [ 2101.111434] handle_irq_event+0x36/0x53 [ 2101.115292] handle_edge_irq+0x82/0x190 [ 2101.119148] handle_irq+0x1c/0x30 [ 2101.122480] do_IRQ+0x49/0xd0 [ 2101.125465] common_interrupt+0xf/0xf [ 2101.129146] </IRQ> ... As Andrew correctly mentioned previously[0], the following call ladder happens: ice_misc_intr() <- hardirq ice_send_event_to_aux() device_lock() mutex_lock() might_sleep() might_resched() <- oops Add a new PF state bit which indicates that an aux critical error occurred and serve it in ice_service_task() in process context. The new ice_pf::oicr_err_reg is read-write in both hardirq and process contexts, but only 3 bits of non-critical data probably aren't worth explicit synchronizing (and they're even in the same byte [31:24]). [0] https://lore.kernel.org/all/YeSRUVmrdmlUXHDn@lunn.ch Fixes: 348048e724a0e ("ice: Implement iidc operations") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Michal Kubiak <michal.kubiak@intel.com> Acked-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-19Merge branch '40GbE' of ↵Jakub Kicinski2-6/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 40GbE Intel Wired LAN Driver Updates 2022-03-17 This series contains updates to i40e and igb drivers. Tom Rix moves a conversion to little endian to occur only when the value is used for i40e. He also zeros out a structure to resolve possible use of garbage value for igb as reported by clang. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: igb: zero hwtstamp by default i40e: little endian only valid checksums ==================== Link: https://lore.kernel.org/r/20220317160236.3534321-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18Merge branch '100GbE' of ↵Jakub Kicinski4-3/+35
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2022-03-16 This series contains updates to gtp and ice driver. Wojciech fixes smatch reported inconsistent indenting for gtp and ice. Yang Yingliang fixes a couple of return value checks for GNSS to IS_PTR instead of null. Jacob adds support for trace events on tx timestamps. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: add trace events for tx timestamps ice: fix return value check in ice_gnss.c ice: Fix inconsistent indenting in ice_switch gtp: Fix inconsistent indenting ==================== Link: https://lore.kernel.org/r/20220316204024.3201500-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-4/+18
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17iavf: Fix hang during reboot/shutdownIvan Vecera1-0/+7
Recent commit 974578017fc1 ("iavf: Add waiting so the port is initialized in remove") adds a wait-loop at the beginning of iavf_remove() to ensure that port initialization is finished prior unregistering net device. This causes a regression in reboot/shutdown scenario because in this case callback iavf_shutdown() is called and this callback detaches the device, makes it down if it is running and sets its state to __IAVF_REMOVE. Later shutdown callback of associated PF driver (e.g. ice_shutdown) is called. That callback calls among other things sriov_disable() that calls indirectly iavf_remove() (see stack trace below). As the adapter state is already __IAVF_REMOVE then the mentioned loop is end-less and shutdown process hangs. The patch fixes this by checking adapter's state at the beginning of iavf_remove() and skips the rest of the function if the adapter is already in remove state (shutdown is in progress). Reproducer: 1. Create VF on PF driven by ice or i40e driver 2. Ensure that the VF is bound to iavf driver 3. Reboot [52625.981294] sysrq: SysRq : Show Blocked State [52625.988377] task:reboot state:D stack: 0 pid:17359 ppid: 1 f2 [52625.996732] Call Trace: [52625.999187] __schedule+0x2d1/0x830 [52626.007400] schedule+0x35/0xa0 [52626.010545] schedule_hrtimeout_range_clock+0x83/0x100 [52626.020046] usleep_range+0x5b/0x80 [52626.023540] iavf_remove+0x63/0x5b0 [iavf] [52626.027645] pci_device_remove+0x3b/0xc0 [52626.031572] device_release_driver_internal+0x103/0x1f0 [52626.036805] pci_stop_bus_device+0x72/0xa0 [52626.040904] pci_stop_and_remove_bus_device+0xe/0x20 [52626.045870] pci_iov_remove_virtfn+0xba/0x120 [52626.050232] sriov_disable+0x2f/0xe0 [52626.053813] ice_free_vfs+0x7c/0x340 [ice] [52626.057946] ice_remove+0x220/0x240 [ice] [52626.061967] ice_shutdown+0x16/0x50 [ice] [52626.065987] pci_device_shutdown+0x34/0x60 [52626.070086] device_shutdown+0x165/0x1c5 [52626.074011] kernel_restart+0xe/0x30 [52626.077593] __do_sys_reboot+0x1d2/0x210 [52626.093815] do_syscall_64+0x5b/0x1a0 [52626.097483] entry_SYSCALL_64_after_hwframe+0x65/0xca Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://lore.kernel.org/r/20220317104524.2802848-1-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17igb: zero hwtstamp by defaultTom Rix1-4/+2
Clang static analysis reports this representative issue igb_ptp.c:997:3: warning: The left operand of '+' is a garbage value ktime_add_ns(shhwtstamps.hwtstamp, adjust); ^ ~~~~~~~~~~~~~~~~~~~~ shhwtstamps.hwtstamp is set by a call to igb_ptp_systim_to_hwtstamp(). In the switch-statement for the hw type, the hwtstamp is zeroed for matches but not the default case. Move the memset out of switch-statement. This degarbages the default case and reduces the size. Some whitespace cleanup of empty lines Signed-off-by: Tom Rix <trix@redhat.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-17i40e: little endian only valid checksumsTom Rix1-2/+3
The calculation of the checksum can fail. So move converting the checksum to little endian to inside the return status check. Signed-off-by: Tom Rix <trix@redhat.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-16ice: add trace events for tx timestampsJacob Keller2-0/+32
We've previously run into many issues related to the latency of a Tx timestamp completion with the ice hardware. It can be difficult to determine the root cause of a slow Tx timestamp. To aid in this, introduce new trace events which capture timing data about when the driver reaches certain points while processing a transmit timestamp * ice_tx_tstamp_request: Trace when the stack initiates a new timestamp request. * ice_tx_tstamp_fw_req: Trace when the driver begins a read of the timestamp register in the work thread. * ice_tx_tstamp_fw_done: Trace when the driver finishes reading a timestamp register in the work thread. * ice_tx_tstamp_complete: Trace when the driver submits the skb back to the stack with a completed Tx timestamp. These trace events can be enabled using the standard trace event subsystem exposed by the ice driver. If they are disabled, they become no-ops with no run time cost. The following is a simple GNU AWK script which can highlight one potential way to use the trace events to capture latency data from the trace buffer about how long the driver takes to process a timestamp: ----- BEGIN { PREC=256 } # Detect requests /tx_tstamp_request/ { time=strtonum($4) skb=$7 # Store the time of request for this skb requests[skb] = time printf("skb %s: idx %d at %.6f\n", skb, idx, time) } # Detect completions /tx_tstamp_complete/ { time=strtonum($4) skb=$7 idx=$9 if (skb in requests) { latency = (time - requests[skb]) * 1000 printf("skb %s: %.3f to complete\n", skb, latency) if (latency > 4) { printf(">>> HIGH LATENCY <<<\n") } printf("\n") } else { printf("!!! skb %s (idx %d) at %.6f\n", skb, idx, time) } } ----- Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-16ice: fix return value check in ice_gnss.cYang Yingliang1-2/+2
kthread_create_worker() and tty_alloc_driver() return ERR_PTR() and never return NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: 43113ff73453 ("ice: add TTY for GNSS module for E810T device") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-16ice: Fix inconsistent indenting in ice_switchWojciech Drewek1-1/+1
Fix the following warning as reported by smatch: smatch warnings: drivers/net/ethernet/intel/ice/ice_switch.c:5568 ice_find_dummy_packet() warn: inconsistent indenting Fixes: 9a225f81f540 ("ice: Support GTP-U and GTP-C offload in switchdev") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15iavf: Fix double free in iavf_reset_taskPrzemyslaw Patynowski1-1/+7
Fix double free possibility in iavf_disable_vf, as crit_lock is freed in caller, iavf_reset_task. Add kernel-doc for iavf_disable_vf. Remove mutex_unlock in iavf_disable_vf. Without this patch there is double free scenario, when calling iavf_reset_task. Fixes: e85ff9c631e1 ("iavf: Fix deadlock in iavf_reset_task") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: destroy flow director filter mutex after releasing VSIsSudheer Mogilappagari1-1/+1
Currently fdir_fltr_lock is accessed in ice_vsi_release_all() function after it is destroyed. Instead destroy mutex after ice_vsi_release_all. Fixes: 40319796b732 ("ice: Add flow director support for channel mode") Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats()Maciej Fijalkowski1-2/+3
It is possible to do NULL pointer dereference in routine that updates Tx ring stats. Currently only stats and bytes are updated when ring pointer is valid, but later on ring is accessed to propagate gathered Tx stats onto VSI stats. Change the existing logic to move to next ring when ring is NULL. Fixes: e72bba21355d ("ice: split ice_ring onto Tx/Rx separate structs") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: remove PF pointer from ice_check_vf_initJacob Keller3-16/+14
The ice_check_vf_init function takes both a PF and a VF pointer. Every caller looks up the PF pointer from the VF structure. Some callers only use of the PF pointer is call this function. Move the lookup inside ice_check_vf_init and drop the unnecessary argument. Cleanup the callers to drop the now unnecessary local variables. In particular, replace the local PF pointer with a HW structure pointer in ice_vc_get_vf_res_msg which simplifies a few accesses to the HW structure in that function. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: introduce ice_virtchnl.c and ice_virtchnl.hJacob Keller5-3825/+3869
Just as we moved the generic virtualization library logic into ice_vf_lib.c, move the virtchnl message handling into ice_virtchnl.c Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: cleanup long lines in ice_sriov.cJacob Keller1-12/+27
Before we move the virtchnl message handling from ice_sriov.c into ice_virtchnl.c, cleanup some long line warnings to avoid checkpatch.pl complaints. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: introduce ICE_VF_RESET_LOCK flagJacob Keller4-16/+19
The ice_reset_vf function performs actions which must be taken only while holding the VF configuration lock. Some flows already acquired the lock, while other flows must acquire it just for the reset function. Add the ICE_VF_RESET_LOCK flag to the function so that it can handle taking and releasing the lock instead at the appropriate scope. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: introduce ICE_VF_RESET_NOTIFY flagJacob Keller3-39/+34
In some cases of resetting a VF, the PF would like to first notify the VF that a reset is impending. This is currently done via ice_vc_notify_vf_reset. A wrapper to ice_reset_vf, ice_vf_reset_vf, is used to call this function first before calling ice_reset_vf. In fact, every single call to ice_vc_notify_vf_reset occurs just prior to a call to ice_vc_reset_vf. Now that ice_reset_vf has flags, replace this separate call with an ICE_VF_RESET_NOTIFY flag. This removes an unnecessary exported function of ice_vc_notify_vf_reset, and also makes there be a single function to reset VFs (ice_reset_vf). Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: convert ice_reset_vf to take flagsJacob Keller4-9/+17
The ice_reset_vf function takes a boolean parameter which indicates whether or not the reset is due to a VFLR event. This is somewhat confusing to read because readers must interpret what "true" and "false" mean when seeing a line of code like "ice_reset_vf(vf, false)". We will want to add another toggle to the ice_reset_vf in a following change. To avoid proliferating many arguments, convert this function to take flags instead. ICE_VF_RESET_VFLR will indicate if this is a VFLR reset. A value of 0 indicates no flags. One could argue that "ice_reset_vf(vf, 0)" is no more readable than "ice_reset_vf(vf, false)".. However, this type of flags interface is somewhat common and using 0 to mean "no flags" makes sense in this context. We could bother to add a define for "ICE_VF_RESET_PLAIN" or something similar, but this can be confusing since its not an actual bit flag. This paves the way to add another flag to the function in a following change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: convert ice_reset_vf to standard error codesJacob Keller2-10/+11
The ice_reset_vf function returns a boolean value indicating whether or not the VF reset. This is a bit confusing since it means that callers need to know how to interpret the return value when needing to indicate an error. Refactor the function and call sites to report a regular error code. We still report success (i.e. return 0) in cases where the reset is in progress or is disabled. Existing callers don't care because they do not check the return value. We keep the error code anyways instead of a void return because we expect future code which may care about or at least report the error value. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: make ice_reset_all_vfs voidJacob Keller2-8/+5
The ice_reset_all_vfs function returns true if any VFs were reset, and false otherwise. However, no callers check the return value. Drop this return value and make the function void since the callers do not care about this. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: drop is_vflr parameter from ice_reset_all_vfsJacob Keller3-7/+6
The ice_reset_all_vfs function takes a parameter to handle whether its operating after a VFLR event or not. This is not necessary as every caller always passes true. Simplify the interface by removing the parameter. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: move reset functionality into ice_vf_lib.cJacob Keller5-490/+488
Now that the reset functions do not rely on Single Root specific behavior, move the ice_reset_vf, ice_reset_all_vfs, and ice_vf_rebuild_host_cfg functions and their dependent helper functions out of ice_sriov.c and into ice_vf_lib.c Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: fix a long line warning in ice_reset_vfJacob Keller1-1/+2
We're about to move ice_reset_vf out of ice_sriov.c and into ice_vf_lib.c One of the dev_err statements has a checkpatch.pl violation due to putting the vf->vf_id on the same line as the dev_err. Fix this style issue first before moving the code. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: introduce VF operations structure for reset flowsJacob Keller3-139/+196
The ice driver currently supports virtualization using Single Root IOV, with code in the ice_sriov.c file. In the future, we plan to also implement support for Scalable IOV, which uses slightly different hardware implementations for some functionality. To eventually allow this, we introduce a new ice_vf_ops structure which will contain the basic operations that are different between the two IOV implementations. This primarily includes logic for how to handle the VF reset registers, as well as what to do before and after rebuilding the VF's VSI. Implement these ops structures and call the ops table instead of directly calling the SR-IOV specific function. This will allow us to easily add the Scalable IOV implementation in the future. Additionally, it helps separate the generalized VF logic from SR-IOV specifics. This change allows us to move the reset logic out of ice_sriov.c and into ice_vf_lib.c without placing any Single Root specific details into the generic file. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: fix incorrect dev_dbg print mistaking 'i' for vf->vf_idJacob Keller1-1/+2
If we fail to clear the malicious VF indication after a VF reset, the dev_dbg message which is printed uses the local variable 'i' when it meant to use vf->vf_id. Fix this. Fixes: 0891c89674e8 ("ice: warn about potentially malicious VFs") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: introduce ice_vf_lib.c, ice_vf_lib.h, and ice_vf_lib_private.hJacob Keller8-711/+823
Introduce the ice_vf_lib.c file along with the ice_vf_lib.h and ice_vf_lib_private.h header files. These files will house the generic VF structures and access functions. Move struct ice_vf and its dependent definitions into this new header file. The ice_vf_lib.c is compiled conditionally on CONFIG_PCI_IOV. Some of its functionality is required by all driver files. However, some of its functionality will only be required by other files also conditionally compiled based on CONFIG_PCI_IOV. Declaring these functions used only in CONFIG_PCI_IOV files in ice_vf_lib.h is verbose. This is because we must provide a fallback implementation for each function in this header since it is included in files which may not be compiled with CONFIG_PCI_IOV. Instead, introduce a new ice_vf_lib_private.h header which verifies that CONFIG_PCI_IOV is enabled. This header is intended to be directly included in .c files which are CONFIG_PCI_IOV only. Add a #error indication that will complain if the file ever gets included by another C file on a kernel with CONFIG_PCI_IOV disabled. Add a comment indicating the nature of the file and why it is useful. This makes it so that we can easily define functions exposed from ice_vf_lib.c into other virtualization files without needing to add fallback implementations for every single function. This begins the path to separate out generic code which will be reused by other virtualization implementations from ice_sriov.h and ice_sriov.c Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: use ice_is_vf_trusted helper functionJacob Keller1-10/+10
The ice_vc_cfg_promiscuous_mode_msg function directly checks ICE_VIRTCHNL_VF_CAP_PRIVILEGE, instead of using the existing helper function ice_is_vf_trusted. Switch this to use the helper function so that all trusted checks are consistent. This aids in any potential future refactor by ensuring consistent code. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15ice: log an error message when eswitch fails to configureJacob Keller1-1/+3
When ice_eswitch_configure fails, print an error message to make it more obvious why VF initialization did not succeed. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>