summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)AuthorFilesLines
2022-09-02ice: use bitmap_free instead of devm_kfreeMichal Swiatkowski1-1/+1
pf->avail_txqs was allocated using bitmap_zalloc, bitmap_free should be used to free this memory. Fixes: 78b5713ac1241 ("ice: Alloc queue management bitmaps and arrays dynamically") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-02ice: Fix DMA mappings leakPrzemyslaw Patynowski4-17/+79
Fix leak, when user changes ring parameters. During reallocation of RX buffers, new DMA mappings are created for those buffers. New buffers with different RX ring count should substitute older ones, but those buffers were freed in ice_vsi_cfg_rxq and reallocated again with ice_alloc_rx_buf. kfree on rx_buf caused leak of already mapped DMA. Reallocate ZC with xdp_buf struct, when BPF program loads. Reallocate back to rx_buf, when BPF program unloads. If BPF program is loaded/unloaded and XSK pools are created, reallocate RX queues accordingly in XDP_SETUP_XSK_POOL handler. Steps for reproduction: while : do for ((i=0; i<=8160; i=i+32)) do ethtool -G enp130s0f0 rx $i tx $i sleep 0.5 ethtool -g enp130s0f0 done done Fixes: 617f3e1b588c ("ice: xsk: allocate separate memory for XDP SW ring") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Chandan <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-01net: ethernet: move from strlcpy with unused retval to strscpyWolfram Sang16-40/+40
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Petr Machata <petrm@nvidia.com> # For drivers/net/ethernet/mellanox/mlxsw Acked-by: Geoff Levand <geoff@infradead.org> # For ps3_gelic_net and spider_net_ethtool Acked-by: Tom Lendacky <thomas.lendacky@amd.com> # For drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c Acked-by: Marcin Wojtas <mw@semihalf.com> # For drivers/net/ethernet/marvell/mvpp2 Reviewed-by: Leon Romanovsky <leonro@nvidia.com> # For drivers/net/ethernet/mellanox/mlx{4|5} Reviewed-by: Shay Agroskin <shayagr@amazon.com> # For drivers/net/ethernet/amazon/ena Acked-by: Krzysztof Hałasa <khalasa@piap.pl> # For IXP4xx Ethernet Link: https://lore.kernel.org/r/20220830201457.7984-3-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-26Merge branch '100GbE' of ↵David S. Miller9-27/+768
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-24 (ice) This series contains updates to ice driver only. Marcin adds support for TC parsing on TTL and ToS fields. Anatolli adds support for devlink port split command to allow configuration of various port configurations. Jake allows for passing and writing an additional NVM write activate field by expanding current cmd_flag. Ani makes PHY debug output more readable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski6-43/+101
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c 21234e3a84c7 ("net/mlx5e: Fix use after free in mlx5e_fs_init()") c7eafc5ed068 ("net/mlx5e: Convert ethtool_steering member of flow_steering struct to pointer") https://lore.kernel.org/all/20220825104410.67d4709c@canb.auug.org.au/ https://lore.kernel.org/all/20220823055533.334471-1-saeed@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-24i40e: Fix incorrect address type for IPv6 flow rulesSylwester Dziedziuch1-1/+1
It was not possible to create 1-tuple flow director rule for IPv6 flow type. It was caused by incorrectly checking for source IP address when validating user provided destination IP address. Fix this by changing ip6src to correct ip6dst address in destination IP address validation for IPv6 flow type. Fixes: efca91e89b67 ("i40e: Add flow director support for IPv6") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-24ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounterJacob Keller1-13/+46
The ixgbe_ptp_start_cyclecounter is intended to be called whenever the cyclecounter parameters need to be changed. Since commit a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices"), this function has cleared the SYSTIME registers and reset the TSAUXC DISABLE_SYSTIME bit. While these need to be cleared during ixgbe_ptp_reset, it is wrong to clear them during ixgbe_ptp_start_cyclecounter. This function may be called during both reset and link status change. When link changes, the SYSTIME counter is still operating normally, but the cyclecounter should be updated to account for the possibly changed parameters. Clearing SYSTIME when link changes causes the timecounter to jump because the cycle counter now reads zero. Extract the SYSTIME initialization out to a new function and call this during ixgbe_ptp_reset. This prevents the timecounter adjustment and avoids an unnecessary reset of the current time. This also restores the original SYSTIME clearing that occurred during ixgbe_ptp_reset before the commit above. Reported-by: Steve Payne <spayne@aurora.tech> Reported-by: Ilya Evenbach <ievenbach@aurora.tech> Fixes: a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-24ice: Print human-friendly PHY typesAnirudh Venkataramanan1-18/+140
Provide human readable description of PHY capabilities and report_mode. Sample output: Old: [ 286.130405] ice 0000:16:00.0: get phy caps - report_mode = 0x2 [ 286.130409] ice 0000:16:00.0: phy_type_low = 0x108021020502000 [ 286.130412] ice 0000:16:00.0: phy_type_high = 0x0 [ 286.130415] ice 0000:16:00.0: caps = 0xc8 [ 286.130419] ice 0000:16:00.0: low_power_ctrl_an = 0x4 [ 286.130421] ice 0000:16:00.0: eee_cap = 0x0 [ 286.130424] ice 0000:16:00.0: eeer_value = 0x0 [ 286.130427] ice 0000:16:00.0: link_fec_options = 0xdf [ 286.130430] ice 0000:16:00.0: module_compliance_enforcement = 0x0 [ 286.130433] ice 0000:16:00.0: extended_compliance_code = 0xb [ 286.130435] ice 0000:16:00.0: module_type[0] = 0x11 [ 286.130438] ice 0000:16:00.0: module_type[1] = 0x1 [ 286.130441] ice 0000:16:00.0: module_type[2] = 0x0 New: [ 1128.297347] ice 0000:16:00.0: get phy caps dump [ 1128.297351] ice 0000:16:00.0: phy_caps_active: phy_type_low: 0x0108021020502000 [ 1128.297355] ice 0000:16:00.0: phy_caps_active: bit(13): 10G_SFI_DA [ 1128.297359] ice 0000:16:00.0: phy_caps_active: bit(20): 25GBASE_CR [ 1128.297362] ice 0000:16:00.0: phy_caps_active: bit(22): 25GBASE_CR1 [ 1128.297365] ice 0000:16:00.0: phy_caps_active: bit(29): 25G_AUI_C2C [ 1128.297368] ice 0000:16:00.0: phy_caps_active: bit(36): 50GBASE_CR2 [ 1128.297371] ice 0000:16:00.0: phy_caps_active: bit(41): 50G_LAUI2 [ 1128.297374] ice 0000:16:00.0: phy_caps_active: bit(51): 100GBASE_CR4 [ 1128.297377] ice 0000:16:00.0: phy_caps_active: bit(56): 100G_CAUI4 [ 1128.297380] ice 0000:16:00.0: phy_caps_active: phy_type_high: 0x0000000000000000 [ 1128.297383] ice 0000:16:00.0: phy_caps_active: report_mode = 0x4 [ 1128.297386] ice 0000:16:00.0: phy_caps_active: caps = 0xc8 [ 1128.297389] ice 0000:16:00.0: phy_caps_active: low_power_ctrl_an = 0x4 [ 1128.297392] ice 0000:16:00.0: phy_caps_active: eee_cap = 0x0 [ 1128.297394] ice 0000:16:00.0: phy_caps_active: eeer_value = 0x0 [ 1128.297397] ice 0000:16:00.0: phy_caps_active: link_fec_options = 0xdf [ 1128.297400] ice 0000:16:00.0: phy_caps_active: module_compliance_enforcement = 0x0 [ 1128.297402] ice 0000:16:00.0: phy_caps_active: extended_compliance_code = 0xb [ 1128.297405] ice 0000:16:00.0: phy_caps_active: module_type[0] = 0x11 [ 1128.297408] ice 0000:16:00.0: phy_caps_active: module_type[1] = 0x1 [ 1128.297411] ice 0000:16:00.0: phy_caps_active: module_type[2] = 0x0 Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Co-developed-by: Lukasz Plachno <lukasz.plachno@intel.com> Signed-off-by: Lukasz Plachno <lukasz.plachno@intel.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-24ice: Implement devlink port split operationsAnatolii Gerasymenko2-0/+290
Allow to configure port split options using the devlink port split interface. Support port splitting only for port 0, as the FW has a predefined set of available port split options for the whole device. Add ice_devlink_port_options_print() function to print the table with all available FW port split options. It will be printed after each port split and unsplit command. Add documentation for devlink port split interface usage for the ice driver. Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-24ice: Add additional flags to ice_nvm_write_activateJacob Keller3-5/+16
The ice_nvm_write_activate function is used to issue AdminQ command 0x0707 which sends a request to firmware to activate a flash bank. For basic operations, this command takes an 8bit flag value which defines the flags to control the activation process. There are some additional flags that are stored in a second 8bit flag field. We can simplify the interface by using a u16 cmd_flags variable. Split this over the two bytes of flag storage in the structure. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-24ice: Add port option admin queue commandsAnatolii Gerasymenko3-0/+178
Implement support for Get/Set Port Options admin queue commands (0x06EA/0x06EB). These firmware commands allow the driver to change port specific options and will be used in the next patch. Co-developed-by: Lev Faerman <lev.faerman@intel.com> Signed-off-by: Lev Faerman <lev.faerman@intel.com> Co-developed-by: Damian Milosek <damian.milosek@intel.com> Signed-off-by: Damian Milosek <damian.milosek@intel.com> Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-24ice: Add support for ip TTL & ToS offloadMarcin Szycik2-4/+144
Add support for parsing TTL and ToS (Hop Limit and Traffic Class) tc fields and matching on those fields in filters. Incomplete part of implementation was already in place (getting enc_ip and enc_tos from flow_match_ip and writing them to filter header). Note: matching on ipv6 ip_ttl, enc_ttl and enc_tos is currently not supported by the DDP package. Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-23Merge branch '10GbE' of ↵Jakub Kicinski3-6/+57
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-18 (ixgbe) This series contains updates to ixgbe driver only. Fabio M. De Francesco replaces kmap() call to page_address() for rx_buffer->page(). Jeff Daly adds a manual AN-37 restart to help resolve issues with some link partners. * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ixgbe: Manual AN-37 for troublesome link partners for X550 SFI ixgbe: Don't call kmap() on page allocated with GFP_ATOMIC ==================== Link: https://lore.kernel.org/r/20220818223402.1294091-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-23Merge branch '100GbE' of ↵Jakub Kicinski13-184/+136
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-18 (ice) This series contains updates to ice driver only. Jesse and Anatolii add support for controlling FCS/CRC stripping via ethtool. Anirudh allows for 100M speeds on devices which support it. Sylwester removes ucast_shared field and the associated dead code related to it. Mikael removes non-inclusive language from the driver. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: remove non-inclusive language ice: Remove ucast_shared ice: Allow 100M speeds for some devices ice: Implement FCS/CRC and VLAN stripping co-existence policy ice: Implement control of FCS/CRC stripping ==================== Link: https://lore.kernel.org/r/20220818155207.996297-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-22ice: xsk: use Rx ring's XDP ring when picking NAPI contextMaciej Fijalkowski4-29/+48
Ice driver allocates per cpu XDP queues so that redirect path can safely use smp_processor_id() as an index to the array. At the same time though, XDP rings are used to pick NAPI context to call napi_schedule() or set NAPIF_STATE_MISSED. When user reduces queue count, say to 8, and num_possible_cpus() of underlying platform is 44, then this means queue vectors with correlated NAPI contexts will carry several XDP queues. This in turn can result in a broken behavior where NAPI context of interest will never be scheduled and AF_XDP socket will not process any traffic. To fix this, let us change the way how XDP rings are assigned to Rx rings and use this information later on when setting ice_tx_ring::xsk_pool pointer. For each Rx ring, grab the associated queue vector and walk through Tx ring's linked list. Once we stumble upon XDP ring in it, assign this ring to ice_rx_ring::xdp_ring. Previous [0] approach of fixing this issue was for txonly scenario because of the described grouping of XDP rings across queue vectors. So, relying on Rx ring meant that NAPI context could be scheduled with a queue vector without XDP ring with associated XSK pool. [0]: https://lore.kernel.org/netdev/20220707161128.54215-1-maciej.fijalkowski@intel.com/ Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Fixes: 22bf877e528f ("ice: introduce XDP_TX fallback path") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-22ice: xsk: prohibit usage of non-balanced queue idMaciej Fijalkowski1-0/+6
Fix the following scenario: 1. ethtool -L $IFACE rx 8 tx 96 2. xdpsock -q 10 -t -z Above refers to a case where user would like to attach XSK socket in txonly mode at a queue id that does not have a corresponding Rx queue. At this moment ice's XSK logic is tightly bound to act on a "queue pair", e.g. both Tx and Rx queues at a given queue id are disabled/enabled and both of them will get XSK pool assigned, which is broken for the presented queue configuration. This results in the splat included at the bottom, which is basically an OOB access to Rx ring array. To fix this, allow using the ids only in scope of "combined" queues reported by ethtool. However, logic should be rewritten to allow such configurations later on, which would end up as a complete rewrite of the control path, so let us go with this temporary fix. [420160.558008] BUG: kernel NULL pointer dereference, address: 0000000000000082 [420160.566359] #PF: supervisor read access in kernel mode [420160.572657] #PF: error_code(0x0000) - not-present page [420160.579002] PGD 0 P4D 0 [420160.582756] Oops: 0000 [#1] PREEMPT SMP NOPTI [420160.588396] CPU: 10 PID: 21232 Comm: xdpsock Tainted: G OE 5.19.0-rc7+ #10 [420160.597893] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [420160.609894] RIP: 0010:ice_xsk_pool_setup+0x44/0x7d0 [ice] [420160.616968] Code: f3 48 83 ec 40 48 8b 4f 20 48 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 8d 04 ed 00 00 00 00 48 01 c1 48 8b 11 <0f> b7 92 82 00 00 00 48 85 d2 0f 84 2d 75 00 00 48 8d 72 ff 48 85 [420160.639421] RSP: 0018:ffffc9002d2afd48 EFLAGS: 00010282 [420160.646650] RAX: 0000000000000050 RBX: ffff88811d8bdd00 RCX: ffff888112c14ff8 [420160.655893] RDX: 0000000000000000 RSI: ffff88811d8bdd00 RDI: ffff888109861000 [420160.665166] RBP: 000000000000000a R08: 000000000000000a R09: 0000000000000000 [420160.674493] R10: 000000000000889f R11: 0000000000000000 R12: 000000000000000a [420160.683833] R13: 000000000000000a R14: 0000000000000000 R15: ffff888117611828 [420160.693211] FS: 00007fa869fc1f80(0000) GS:ffff8897e0880000(0000) knlGS:0000000000000000 [420160.703645] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [420160.711783] CR2: 0000000000000082 CR3: 00000001d076c001 CR4: 00000000007706e0 [420160.721399] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [420160.731045] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [420160.740707] PKRU: 55555554 [420160.745960] Call Trace: [420160.750962] <TASK> [420160.755597] ? kmalloc_large_node+0x79/0x90 [420160.762703] ? __kmalloc_node+0x3f5/0x4b0 [420160.769341] xp_assign_dev+0xfd/0x210 [420160.775661] ? shmem_file_read_iter+0x29a/0x420 [420160.782896] xsk_bind+0x152/0x490 [420160.788943] __sys_bind+0xd0/0x100 [420160.795097] ? exit_to_user_mode_prepare+0x20/0x120 [420160.802801] __x64_sys_bind+0x16/0x20 [420160.809298] do_syscall_64+0x38/0x90 [420160.815741] entry_SYSCALL_64_after_hwframe+0x63/0xcd [420160.823731] RIP: 0033:0x7fa86a0dd2fb [420160.830264] Code: c3 66 0f 1f 44 00 00 48 8b 15 69 8b 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bc 0f 1f 44 00 00 f3 0f 1e fa b8 31 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 8b 0c 00 f7 d8 64 89 01 48 [420160.855410] RSP: 002b:00007ffc1146f618 EFLAGS: 00000246 ORIG_RAX: 0000000000000031 [420160.866366] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa86a0dd2fb [420160.876957] RDX: 0000000000000010 RSI: 00007ffc1146f680 RDI: 0000000000000003 [420160.887604] RBP: 000055d7113a0520 R08: 00007fa868fb8000 R09: 0000000080000000 [420160.898293] R10: 0000000000008001 R11: 0000000000000246 R12: 000055d7113a04e0 [420160.909038] R13: 000055d7113a0320 R14: 000000000000000a R15: 0000000000000000 [420160.919817] </TASK> [420160.925659] Modules linked in: ice(OE) af_packet binfmt_misc nls_iso8859_1 ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mei_me coretemp ioatdma mei ipmi_si wmi ipmi_msghandler acpi_pad acpi_power_meter ip_tables x_tables autofs4 ixgbe i40e crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci mdio dca libahci lpc_ich [last unloaded: ice] [420160.977576] CR2: 0000000000000082 [420160.985037] ---[ end trace 0000000000000000 ]--- [420161.097724] RIP: 0010:ice_xsk_pool_setup+0x44/0x7d0 [ice] [420161.107341] Code: f3 48 83 ec 40 48 8b 4f 20 48 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 8d 04 ed 00 00 00 00 48 01 c1 48 8b 11 <0f> b7 92 82 00 00 00 48 85 d2 0f 84 2d 75 00 00 48 8d 72 ff 48 85 [420161.134741] RSP: 0018:ffffc9002d2afd48 EFLAGS: 00010282 [420161.144274] RAX: 0000000000000050 RBX: ffff88811d8bdd00 RCX: ffff888112c14ff8 [420161.155690] RDX: 0000000000000000 RSI: ffff88811d8bdd00 RDI: ffff888109861000 [420161.168088] RBP: 000000000000000a R08: 000000000000000a R09: 0000000000000000 [420161.179295] R10: 000000000000889f R11: 0000000000000000 R12: 000000000000000a [420161.190420] R13: 000000000000000a R14: 0000000000000000 R15: ffff888117611828 [420161.201505] FS: 00007fa869fc1f80(0000) GS:ffff8897e0880000(0000) knlGS:0000000000000000 [420161.213628] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [420161.223413] CR2: 0000000000000082 CR3: 00000001d076c001 CR4: 00000000007706e0 [420161.234653] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [420161.245893] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [420161.257052] PKRU: 55555554 Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski12-32/+140
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19igc: add xdp frags support to ndo_xdp_xmitLorenzo Bianconi1-45/+83
Add the capability to map non-linear xdp frames in XDP_TX and ndo_xdp_xmit callback. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220817173628.109102-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19ixgbe: Manual AN-37 for troublesome link partners for X550 SFIJeff Daly2-3/+56
Some (Juniper MX5) SFP link partners exhibit a disinclination to autonegotiate with X550 configured in SFI mode. This patch enables a manual AN-37 restart to work around the problem. Signed-off-by: Jeff Daly <jeffd@silicom-usa.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18igb: Add lock to avoid data raceLin Ma2-1/+13
The commit c23d92b80e0b ("igb: Teardown SR-IOV before unregister_netdev()") places the unregister_netdev() call after the igb_disable_sriov() call to avoid functionality issue. However, it introduces several race conditions when detaching a device. For example, when .remove() is called, the below interleaving leads to use-after-free. (FREE from device detaching) | (USE from netdev core) igb_remove | igb_ndo_get_vf_config igb_disable_sriov | vf >= adapter->vfs_allocated_count? kfree(adapter->vf_data) | adapter->vfs_allocated_count = 0 | | memcpy(... adapter->vf_data[vf] Moreover, the igb_disable_sriov() also suffers from data race with the requests from VF driver. (FREE from device detaching) | (USE from requests) igb_remove | igb_msix_other igb_disable_sriov | igb_msg_task kfree(adapter->vf_data) | vf < adapter->vfs_allocated_count adapter->vfs_allocated_count = 0 | To this end, this commit first eliminates the data races from netdev core by using rtnl_lock (similar to commit 719479230893 ("dpaa2-eth: add MAC/PHY support through phylink")). And then adds a spinlock to eliminate races from driver requests. (similar to commit 1e53834ce541 ("ixgbe: Add locking to prevent panic when setting sriov_numvfs to zero") Fixes: c23d92b80e0b ("igb: Teardown SR-IOV before unregister_netdev()") Signed-off-by: Lin Ma <linma@zju.edu.cn> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220817184921.735244-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18ixgbe: Don't call kmap() on page allocated with GFP_ATOMICFabio M. De Francesco1-3/+1
Pages allocated with GFP_ATOMIC cannot come from Highmem. This is why there is no need to call kmap() on them. Therefore, don't call kmap() on rx_buffer->page() and instead use a plain page_address() to get the kernel address. Suggested-by: Ira Weiny <ira.weiny@intel.com> Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: remove non-inclusive languageMikael Barsehyan2-9/+9
Remove non-inclusive language from the driver where possible; replace "master" with "primary"; replace "slave" with "secondary". Signed-off-by: Mikael Barsehyan <mikael.barsehyan@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: Remove ucast_sharedSylwester Dziedziuch3-165/+5
Remove ucast_shared as it was always true. Remove the code depending on ucast_shared from ice_add_mac and ice_remove_mac. Remove ice_find_ucast_rule_entry function as it was only used when ucast_shared was set to false. Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: Allow 100M speeds for some devicesAnirudh Venkataramanan3-4/+28
For certain devices, 100M speeds are supported. Do not mask off 100M speed for these devices. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Co-developed-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Mikael Barsehyan <mikael.barsehyan@intel.com> Tested-by: Kavya AV <kavyax.av@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: Implement FCS/CRC and VLAN stripping co-existence policyAnatolii Gerasymenko1-0/+25
Make sure that only the valid combinations of FCS/CRC stripping and VLAN stripping offloads are allowed. You cannot have FCS/CRC stripping disabled while VLAN stripping is enabled - this breaks the correctness of the FCS/CRC. If administrator tries to enable VLAN stripping when FCS/CRC stripping is disabled, the request should be rejected. If administrator tries to disable FCS/CRC stripping when VLAN stripping is enabled, the request should be rejected if VLANs are configured. If there is no VLAN configured, then both FCS/CRC and VLAN stripping should be disabled. Testing Hints: The default settings after driver load are: - VLAN C-Tag offloads are enabled - VLAN S-Tag offloads are disabled - FCS/CRC stripping is enabled Restore the default settings before each test with the command: ethtool -K eth0 rx-fcs off rxvlan on txvlan on rx-vlan-stag-hw-parse off tx-vlan-stag-hw-insert off Test 1: Disable FCS/CRC and VLAN stripping: ethtool -K eth0 rx-fcs on rxvlan off Try to enable VLAN stripping: ethtool -K eth0 rxvlan on Expected: VLAN stripping request is rejected Test 2: Try to disable FCS/CRC stripping: ethtool -K eth0 rx-fcs on Expected: VLAN stripping is also disabled, as there are no VLAN configured Test 3: Add a VLAN: ip link add link eth0 eth0.42 type vlan id 42 ip link set eth0 up Try to disable FCS/CRC stripping: ethtool -K eth0 rx-fcs on Expected: FCS/CRC stripping request is rejected Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: Implement control of FCS/CRC strippingJesse Brandeburg7-6/+69
The driver can allow the user to configure whether the CRC aka the FCS (Frame Check Sequence) is DMA'd to the host as part of the receive buffer. The driver usually wants this feature disabled so that the hardware checks the FCS and strips it in order to save PCI bandwidth. Control the reception of FCS to the host using the command: ethtool -K eth0 rx-fcs <on|off> The default shown in ethtool -k eth0 | grep fcs; should be "off", as the hardware will drop any frame with a bad checksum, and DMA of the checksum is useless overhead especially for small packets. Testing Hints: test the FCS/CRC arrives with received packets using tcpdump -nnpi eth0 -xxxx and it should show crc data as the last 4 bytes of the packet. Can also use wireshark to turn on CRC checking and check the data is correct. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Co-developed-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com> Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com> Co-developed-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Fix VF not able to send tagged traffic with no VLAN filtersSylwester Dziedziuch2-11/+57
VF was not able to send tagged traffic when it didn't have any VLAN interfaces and VLAN anti-spoofing was enabled. Fix this by allowing VFs with no VLAN filters to send tagged traffic. After VF adds a VLAN interface it will be able to send tagged traffic matching VLAN filters only. Testing hints: 1. Spawn VF 2. Send tagged packet from a VF 3. The packet should be sent out and not dropped 4. Add a VLAN interface on VF 5. Send tagged packet on that VLAN interface 6. Packet should be sent out and not dropped 7. Send tagged packet with id different than VLAN interface 8. Packet should be dropped Fixes: daf4dd16438b ("ice: Refactor spoofcheck configuration functions") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Ignore error message when setting same promiscuous modeBenjamin Mikailenko1-4/+4
Commit 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling") introduced new checks when setting/clearing promiscuous mode. But if the requested promiscuous mode setting already exists, an -EEXIST error message would be printed. This is incorrect because promiscuous mode is either on/off and shouldn't print an error when the requested configuration is already set. This can happen when removing a bridge with two bonded interfaces and promiscuous most isn't fully cleared from VLAN VSI in hardware. Fix this by ignoring cases where requested promiscuous mode exists. Fixes: 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling") Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com> Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Fix clearing of promisc mode with bridge over bondGrzegorz Siwik2-2/+16
When at least two interfaces are bonded and a bridge is enabled on the bond, an error can occur when the bridge is removed and re-added. The reason for the error is because promiscuous mode was not fully cleared from the VLAN VSI in the hardware. With this change, promiscuous mode is properly removed when the bridge disconnects from bonding. [ 1033.676359] bond1: link status definitely down for interface enp95s0f0, disabling it [ 1033.676366] bond1: making interface enp175s0f0 the new active one [ 1033.676369] device enp95s0f0 left promiscuous mode [ 1033.676522] device enp175s0f0 entered promiscuous mode [ 1033.676901] ice 0000:af:00.0 enp175s0f0: Error setting Multicast promiscuous mode on VSI 6 [ 1041.795662] ice 0000:af:00.0 enp175s0f0: Error setting Multicast promiscuous mode on VSI 6 [ 1041.944826] bond1: link status definitely down for interface enp175s0f0, disabling it [ 1041.944874] device enp175s0f0 left promiscuous mode [ 1041.944918] bond1: now running without any active interface! Fixes: c31af68a1b94 ("ice: Add outer_vlan_ops and VSI specific VLAN ops implementations") Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Igor Raits <igor@gooddata.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Ignore EEXIST when setting promisc modeGrzegorz Siwik1-1/+1
Ignore EEXIST error when setting promiscuous mode. This fix is needed because the driver could set promiscuous mode when it still has not cleared properly. Promiscuous mode could be set only once, so setting it second time will be rejected. Fixes: 5eda8afd6bcc ("ice: Add support for PF/VF promiscuous mode") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Igor Raits <igor@gooddata.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Fix double VLAN error when entering promisc modeGrzegorz Siwik1-0/+7
Avoid enabling or disabling VLAN 0 when trying to set promiscuous VLAN mode if double VLAN mode is enabled. This fix is needed because the driver tries to add the VLAN 0 filter twice (once for inner and once for outer) when double VLAN mode is enabled. The filter program is rejected by the firmware when double VLAN is enabled, because the promiscuous filter only needs to be set once. This issue was missed in the initial implementation of double VLAN mode. Fixes: 5eda8afd6bcc ("ice: Add support for PF/VF promiscuous mode") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Igor Raits <igor@gooddata.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: introduce ice_ptp_reset_cached_phctime functionJacob Keller1-23/+76
If the PTP hardware clock is adjusted, the ice driver must update the cached PHC timestamp. This is required in order to perform timestamp extension on the shorter timestamps captured by the PHY. Currently, we simply call ice_ptp_update_cached_phctime in the settime and adjtime callbacks. This has a few issues: 1) if ICE_CFG_BUSY is set because another thread is updating the Rx rings, we will exit with an error. This is not checked, and the functions do not re-schedule the update. This could leave the cached timestamp incorrect until the next scheduled work item execution. 2) even if we did handle an update, any currently outstanding Tx timestamp would be extended using the wrong cached PHC time. This would produce incorrect results. To fix these issues, introduce a new ice_ptp_reset_cached_phctime function. This function calls the ice_ptp_update_cached_phctime, and discards outstanding Tx timestamps. If the ice_ptp_update_cached_phctime function fails because ICE_CFG_BUSY is set, we log a warning and schedule the thread to execute soon. The update function is modified so that it always updates the cached copy in the PF regardless. This ensures we have the most up to date values possible and minimizes the risk of a packet timestamp being extended with the wrong value. It would be nice if we could skip reporting Rx timestamps until the cached values are up to date. However, we can't access the Rx rings while ICE_CFG_BUSY is set because they are actively being updated by another thread. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: re-arrange some static functions in ice_ptp.cJacob Keller1-336/+336
A following change is going to want to make use of ice_ptp_flush_tx_tracker earlier in the ice_ptp.c file. To make this work, move the Tx timestamp tracking functions higher up in the file, and pull the ice_ptp_update_cached_timestamp function below them. This should have no functional change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: track and warn when PHC update is lateJacob Keller3-3/+34
The ice driver requires a cached copy of the PHC time in order to perform timestamp extension on Tx and Rx hardware timestamp values. This cached PHC time must always be updated at least once every 2 seconds. Otherwise, the math used to perform the extension would produce invalid results. The updates are supposed to occur periodically in the PTP kthread work item, which is scheduled to run every half second. Thus, we do not expect an update to be delayed for so long. However, there are error conditions which can cause the update to be delayed. Track this situation by using jiffies to determine approximately how long ago the last update occurred. Add a new statistic and a dev_warn when we have failed to update the cached PHC time. This makes the error case more obvious. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: track Tx timestamp stats similar to other Intel driversJacob Keller4-4/+20
Several Intel networking drivers which support PTP track when Tx timestamps are skipped or when they timeout without a timestamp from hardware. The conditions which could cause these events are rare, but it can be useful to know when and how often they occur. Implement similar statistics for the ice driver, tx_hwtstamp_skipped, tx_hwtstamp_timeouts, and tx_hwtstamp_flushed. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: initialize cached_phctime when creating Rx ringsJacob Keller2-0/+2
When we create new Rx rings, the cached_phctime field is zero initialized. This could result in incorrect timestamp reporting due to the cached value not yet being updated. Although a background task will periodically update the cached value, ensure it matches the existing cached value in the PF structure at ring initialization. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: set tx_tstamps when creating new Tx rings via ethtoolJacob Keller1-0/+1
When the user changes the number of queues via ethtool, the driver allocates new rings. This allocation did not initialize tx_tstamps. This results in the tx_tstamps field being zero (due to kcalloc allocation), and would result in a NULL pointer dereference when attempting a transmit timestamp on the new ring. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16i40e: Fix to stop tx_timeout recovery if GLOBR failsAlan Brady1-1/+3
When a tx_timeout fires, the PF attempts to recover by incrementally resetting. First we try a PFR, then CORER and finally a GLOBR. If the GLOBR fails, then we keep hitting the tx_timeout and incrementing the recovery level and issuing dmesgs, which is both annoying to the user and accomplishes nothing. If the GLOBR fails, then we're pretty much totally hosed, and there's not much else we can do to recover, so this makes it such that we just kill the VSI and stop hitting the tx_timeout in such a case. Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16i40e: Fix tunnel checksum offload with fragmented trafficPrzemyslaw Patynowski1-3/+5
Fix checksum offload on VXLAN tunnels. In case, when mpls protocol is not used, set l4 header to transport header of skb. This fixes case, when user tries to offload checksums of VXLAN tunneled traffic. Steps for reproduction (requires link partner with tunnels): ip l s enp130s0f0 up ip a f enp130s0f0 ip a a 10.10.110.2/24 dev enp130s0f0 ip l s enp130s0f0 mtu 1600 ip link add vxlan12_sut type vxlan id 12 group 238.168.100.100 dev \ enp130s0f0 dstport 4789 ip l s vxlan12_sut up ip a a 20.10.110.2/24 dev vxlan12_sut iperf3 -c 20.10.110.1 #should connect Without this patch, TX descriptor was using wrong data, due to l4 header pointing wrong address. NIC would then drop those packets internally, due to incorrect TX descriptor data, which increased GLV_TEPC register. Fixes: b4fb2d33514a ("i40e: Add support for MPLS + TSO") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16Merge branch '40GbE' of ↵Jakub Kicinski2-7/+30
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-12 (iavf) This series contains updates to iavf driver only. Przemyslaw frees memory for admin queues in initialization error paths, prevents freeing of vf_res which is causing null pointer dereference, and adjusts calls in error path of reset to avoid iavf_close() which could cause deadlock. Ivan Vecera avoids deadlock that can occur when driver if part of failover. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: Fix deadlock in initialization iavf: Fix reset error handling iavf: Fix NULL pointer dereference in iavf_get_link_ksettings iavf: Fix adminq error handling ==================== Link: https://lore.kernel.org/r/20220812172309.853230-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-15Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/netDavid S. Miller2-2/+4
-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-11 (ice) This series contains updates to ice driver only. Benjamin corrects a misplaced parenthesis for a WARN_ON check. Michal removes WARN_ON from a check as its recoverable and not warranting of a call trace. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-12iavf: Fix deadlock in initializationIvan Vecera1-1/+10
Fix deadlock that occurs when iavf interface is a part of failover configuration. 1. Mutex crit_lock is taken at the beginning of iavf_watchdog_task() 2. Function iavf_init_config_adapter() is called when adapter state is __IAVF_INIT_CONFIG_ADAPTER 3. iavf_init_config_adapter() calls register_netdevice() that emits NETDEV_REGISTER event 4. Notifier function failover_event() then calls net_failover_slave_register() that calls dev_open() 5. dev_open() calls iavf_open() that tries to take crit_lock in end-less loop Stack trace: ... [ 790.251876] usleep_range_state+0x5b/0x80 [ 790.252547] iavf_open+0x37/0x1d0 [iavf] [ 790.253139] __dev_open+0xcd/0x160 [ 790.253699] dev_open+0x47/0x90 [ 790.254323] net_failover_slave_register+0x122/0x220 [net_failover] [ 790.255213] failover_slave_register.part.7+0xd2/0x180 [failover] [ 790.256050] failover_event+0x122/0x1ab [failover] [ 790.256821] notifier_call_chain+0x47/0x70 [ 790.257510] register_netdevice+0x20f/0x550 [ 790.258263] iavf_watchdog_task+0x7c8/0xea0 [iavf] [ 790.259009] process_one_work+0x1a7/0x360 [ 790.259705] worker_thread+0x30/0x390 To fix the situation we should check the current adapter state after first unsuccessful mutex_trylock() and return with -EBUSY if it is __IAVF_INIT_CONFIG_ADAPTER. Fixes: 226d528512cf ("iavf: fix locking of critical sections") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-12iavf: Fix reset error handlingPrzemyslaw Patynowski1-3/+6
Do not call iavf_close in iavf_reset_task error handling. Doing so can lead to double call of napi_disable, which can lead to deadlock there. Removing VF would lead to iavf_remove task being stuck, because it requires crit_lock, which is held by iavf_close. Call iavf_disable_vf if reset fail, so that driver will clean up remaining invalid resources. During rapid VF resets, HW can fail to setup VF mailbox. Wrong error handling can lead to iavf_remove being stuck with: [ 5218.999087] iavf 0000:82:01.0: Failed to init adminq: -53 ... [ 5267.189211] INFO: task repro.sh:11219 blocked for more than 30 seconds. [ 5267.189520] Tainted: G S E 5.18.0-04958-ga54ce3703613-dirty #1 [ 5267.189764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 5267.190062] task:repro.sh state:D stack: 0 pid:11219 ppid: 8162 flags:0x00000000 [ 5267.190347] Call Trace: [ 5267.190647] <TASK> [ 5267.190927] __schedule+0x460/0x9f0 [ 5267.191264] schedule+0x44/0xb0 [ 5267.191563] schedule_preempt_disabled+0x14/0x20 [ 5267.191890] __mutex_lock.isra.12+0x6e3/0xac0 [ 5267.192237] ? iavf_remove+0xf9/0x6c0 [iavf] [ 5267.192565] iavf_remove+0x12a/0x6c0 [iavf] [ 5267.192911] ? _raw_spin_unlock_irqrestore+0x1e/0x40 [ 5267.193285] pci_device_remove+0x36/0xb0 [ 5267.193619] device_release_driver_internal+0xc1/0x150 [ 5267.193974] pci_stop_bus_device+0x69/0x90 [ 5267.194361] pci_stop_and_remove_bus_device+0xe/0x20 [ 5267.194735] pci_iov_remove_virtfn+0xba/0x120 [ 5267.195130] sriov_disable+0x2f/0xe0 [ 5267.195506] ice_free_vfs+0x7d/0x2f0 [ice] [ 5267.196056] ? pci_get_device+0x4f/0x70 [ 5267.196496] ice_sriov_configure+0x78/0x1a0 [ice] [ 5267.196995] sriov_numvfs_store+0xfe/0x140 [ 5267.197466] kernfs_fop_write_iter+0x12e/0x1c0 [ 5267.197918] new_sync_write+0x10c/0x190 [ 5267.198404] vfs_write+0x24e/0x2d0 [ 5267.198886] ksys_write+0x5c/0xd0 [ 5267.199367] do_syscall_64+0x3a/0x80 [ 5267.199827] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 5267.200317] RIP: 0033:0x7f5b381205c8 [ 5267.200814] RSP: 002b:00007fff8c7e8c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 5267.201981] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f5b381205c8 [ 5267.202620] RDX: 0000000000000002 RSI: 00005569420ee900 RDI: 0000000000000001 [ 5267.203426] RBP: 00005569420ee900 R08: 000000000000000a R09: 00007f5b38180820 [ 5267.204327] R10: 000000000000000a R11: 0000000000000246 R12: 00007f5b383c06e0 [ 5267.205193] R13: 0000000000000002 R14: 00007f5b383bb880 R15: 0000000000000002 [ 5267.206041] </TASK> [ 5267.206970] Kernel panic - not syncing: hung_task: blocked tasks [ 5267.207809] CPU: 48 PID: 551 Comm: khungtaskd Kdump: loaded Tainted: G S E 5.18.0-04958-ga54ce3703613-dirty #1 [ 5267.208726] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.11.0 11/02/2019 [ 5267.209623] Call Trace: [ 5267.210569] <TASK> [ 5267.211480] dump_stack_lvl+0x33/0x42 [ 5267.212472] panic+0x107/0x294 [ 5267.213467] watchdog.cold.8+0xc/0xbb [ 5267.214413] ? proc_dohung_task_timeout_secs+0x30/0x30 [ 5267.215511] kthread+0xf4/0x120 [ 5267.216459] ? kthread_complete_and_exit+0x20/0x20 [ 5267.217505] ret_from_fork+0x22/0x30 [ 5267.218459] </TASK> Fixes: f0db78928783 ("i40evf: use netdev variable in reset task") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-12iavf: Fix NULL pointer dereference in iavf_get_link_ksettingsPrzemyslaw Patynowski1-1/+1
Fix possible NULL pointer dereference, due to freeing of adapter->vf_res in iavf_init_get_resources. Previous commit introduced a regression, where receiving IAVF_ERR_ADMIN_QUEUE_NO_WORK from iavf_get_vf_config would free adapter->vf_res. However, netdev is still registered, so ethtool_ops can be called. Calling iavf_get_link_ksettings with no vf_res, will result with: [ 9385.242676] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 9385.242683] #PF: supervisor read access in kernel mode [ 9385.242686] #PF: error_code(0x0000) - not-present page [ 9385.242690] PGD 0 P4D 0 [ 9385.242696] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI [ 9385.242701] CPU: 6 PID: 3217 Comm: pmdalinux Kdump: loaded Tainted: G S E 5.18.0-04958-ga54ce3703613-dirty #1 [ 9385.242708] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.11.0 11/02/2019 [ 9385.242710] RIP: 0010:iavf_get_link_ksettings+0x29/0xd0 [iavf] [ 9385.242745] Code: 00 0f 1f 44 00 00 b8 01 ef ff ff 48 c7 46 30 00 00 00 00 48 c7 46 38 00 00 00 00 c6 46 0b 00 66 89 46 08 48 8b 87 68 0e 00 00 <f6> 40 08 80 75 50 8b 87 5c 0e 00 00 83 f8 08 74 7a 76 1d 83 f8 20 [ 9385.242749] RSP: 0018:ffffc0560ec7fbd0 EFLAGS: 00010246 [ 9385.242755] RAX: 0000000000000000 RBX: ffffc0560ec7fc08 RCX: 0000000000000000 [ 9385.242759] RDX: ffffffffc0ad4550 RSI: ffffc0560ec7fc08 RDI: ffffa0fc66674000 [ 9385.242762] RBP: 00007ffd1fb2bf50 R08: b6a2d54b892363ee R09: ffffa101dc14fb00 [ 9385.242765] R10: 0000000000000000 R11: 0000000000000004 R12: ffffa0fc66674000 [ 9385.242768] R13: 0000000000000000 R14: ffffa0fc66674000 R15: 00000000ffffffa1 [ 9385.242771] FS: 00007f93711a2980(0000) GS:ffffa0fad72c0000(0000) knlGS:0000000000000000 [ 9385.242775] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9385.242778] CR2: 0000000000000008 CR3: 0000000a8e61c003 CR4: 00000000003706e0 [ 9385.242781] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 9385.242784] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 9385.242787] Call Trace: [ 9385.242791] <TASK> [ 9385.242793] ethtool_get_settings+0x71/0x1a0 [ 9385.242814] __dev_ethtool+0x426/0x2f40 [ 9385.242823] ? slab_post_alloc_hook+0x4f/0x280 [ 9385.242836] ? kmem_cache_alloc_trace+0x15d/0x2f0 [ 9385.242841] ? dev_ethtool+0x59/0x170 [ 9385.242848] dev_ethtool+0xa7/0x170 [ 9385.242856] dev_ioctl+0xc3/0x520 [ 9385.242866] sock_do_ioctl+0xa0/0xe0 [ 9385.242877] sock_ioctl+0x22f/0x320 [ 9385.242885] __x64_sys_ioctl+0x84/0xc0 [ 9385.242896] do_syscall_64+0x3a/0x80 [ 9385.242904] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 9385.242918] RIP: 0033:0x7f93702396db [ 9385.242923] Code: 73 01 c3 48 8b 0d ad 57 38 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 7d 57 38 00 f7 d8 64 89 01 48 [ 9385.242927] RSP: 002b:00007ffd1fb2bf18 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 9385.242932] RAX: ffffffffffffffda RBX: 000055671b1d2fe0 RCX: 00007f93702396db [ 9385.242935] RDX: 00007ffd1fb2bf20 RSI: 0000000000008946 RDI: 0000000000000007 [ 9385.242937] RBP: 00007ffd1fb2bf20 R08: 0000000000000003 R09: 0030763066307330 [ 9385.242940] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd1fb2bf80 [ 9385.242942] R13: 0000000000000007 R14: 0000556719f6de90 R15: 00007ffd1fb2c1b0 [ 9385.242948] </TASK> [ 9385.242949] Modules linked in: iavf(E) xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nft_compat nf_nat_tftp nft_objref nf_conntrack_tftp bridge stp llc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables rfkill nfnetlink vfat fat irdma ib_uverbs ib_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support ice irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl i40e pcspkr intel_cstate joydev mei_me intel_uncore mxm_wmi mei ipmi_ssif lpc_ich ipmi_si acpi_power_meter xfs libcrc32c mgag200 i2c_algo_bit drm_shmem_helper drm_kms_helper sd_mod t10_pi crc64_rocksoft crc64 syscopyarea sg sysfillrect sysimgblt fb_sys_fops drm ixgbe ahci libahci libata crc32c_intel mdio dca wmi dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse [ 9385.243065] [last unloaded: iavf] Dereference happens in if (ADV_LINK_SUPPORT(adapter)) statement Fixes: 209f2f9c7181 ("iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiation") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-12iavf: Fix adminq error handlingPrzemyslaw Patynowski1-2/+13
iavf_alloc_asq_bufs/iavf_alloc_arq_bufs allocates with dma_alloc_coherent memory for VF mailbox. Free DMA regions for both ASQ and ARQ in case error happens during configuration of ASQ/ARQ registers. Without this change it is possible to see when unloading interface: 74626.583369: dma_debug_device_change: device driver has pending DMA allocations while released from device [count=32] One of leaked entries details: [device address=0x0000000b27ff9000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as coherent] Fixes: d358aa9a7a2d ("i40evf: init code and hardware support") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-11ice: Fix call trace with null VSI during VF resetMichal Jaron1-1/+3
During stress test with attaching and detaching VF from KVM and simultaneously changing VFs spoofcheck and trust there was a call trace in ice_reset_vf that VF's VSI is null. [145237.352797] WARNING: CPU: 46 PID: 840629 at drivers/net/ethernet/intel/ice/ice_vf_lib.c:508 ice_reset_vf+0x3d6/0x410 [ice] [145237.352851] Modules linked in: ice(E) vfio_pci vfio_pci_core vfio_virqfd vfio_iommu_type1 vfio iavf dm_mod xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun bridge stp llc sunrpc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTC O_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl ipmi_si intel_cstate ipmi_devintf joydev intel_uncore m ei_me ipmi_msghandler i2c_i801 pcspkr mei lpc_ich ioatdma i2c_smbus acpi_pad acpi_power_meter ip_tables xfs libcrc32c i2c_algo_bit drm_sh mem_helper drm_kms_helper sd_mod t10_pi crc64_rocksoft syscopyarea crc64 sysfillrect sg sysimgblt fb_sys_fops drm i40e ixgbe ahci libahci libata crc32c_intel mdio dca wmi fuse [last unloaded: ice] [145237.352917] CPU: 46 PID: 840629 Comm: kworker/46:2 Tainted: G S W I E 5.19.0-rc6+ #24 [145237.352921] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015 [145237.352923] Workqueue: ice ice_service_task [ice] [145237.352948] RIP: 0010:ice_reset_vf+0x3d6/0x410 [ice] [145237.352984] Code: 30 ec f3 cc e9 28 fd ff ff 0f b7 4b 50 48 c7 c2 48 19 9c c0 4c 89 ee 48 c7 c7 30 fe 9e c0 e8 d1 21 9d cc 31 c0 e9 a 9 fe ff ff <0f> 0b b8 ea ff ff ff e9 c1 fc ff ff 0f 0b b8 fb ff ff ff e9 91 fe [145237.352987] RSP: 0018:ffffb453e257fdb8 EFLAGS: 00010246 [145237.352990] RAX: ffff8bd0040181c0 RBX: ffff8be68db8f800 RCX: 0000000000000000 [145237.352991] RDX: 000000000000ffff RSI: 0000000000000000 RDI: ffff8be68db8f800 [145237.352993] RBP: ffff8bd0040181c0 R08: 0000000000001000 R09: ffff8bcfd520e000 [145237.352995] R10: 0000000000000000 R11: 00008417b5ab0bc0 R12: 0000000000000005 [145237.352996] R13: ffff8bcee061c0d0 R14: ffff8bd004019640 R15: 0000000000000000 [145237.352998] FS: 0000000000000000(0000) GS:ffff8be5dfb00000(0000) knlGS:0000000000000000 [145237.353000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [145237.353002] CR2: 00007fd81f651d68 CR3: 0000001a0fe10001 CR4: 00000000001726e0 [145237.353003] Call Trace: [145237.353008] <TASK> [145237.353011] ice_process_vflr_event+0x8d/0xb0 [ice] [145237.353049] ice_service_task+0x79f/0xef0 [ice] [145237.353074] process_one_work+0x1c8/0x390 [145237.353081] ? process_one_work+0x390/0x390 [145237.353084] worker_thread+0x30/0x360 [145237.353087] ? process_one_work+0x390/0x390 [145237.353090] kthread+0xe8/0x110 [145237.353094] ? kthread_complete_and_exit+0x20/0x20 [145237.353097] ret_from_fork+0x22/0x30 [145237.353103] </TASK> Remove WARN_ON() from check if VSI is null in ice_reset_vf. Add "VF is already removed\n" in dev_dbg(). This WARN_ON() is unnecessary and causes call trace, despite that call trace, driver still works. There is no need for this warn because this piece of code is responsible for disabling VF's Tx/Rx queues when VF is disabled, but when VF is already removed there is no need to do reset or disable queues. Fixes: efe41860008e ("ice: Fix memory corruption in VF driver") Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-11ice: Fix VSI rebuild WARN_ON check for VFBenjamin Mikailenko1-1/+1
In commit b03d519d3460 ("ice: store VF pointer instead of VF ID") WARN_ON checks were added to validate the vsi->vf pointer and catch programming errors. However, one check to vsi->vf was missed. This caused a call trace when resetting VFs. Fix ice_vsi_rebuild by encompassing VF pointer in WARN_ON check. Fixes: b03d519d3460 ("ice: store VF pointer instead of VF ID") Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-08Merge tag 'bitmap-6.0-rc1' of https://github.com/norov/linuxLinus Torvalds1-1/+1
Pull bitmap updates from Yury Norov: - fix the duplicated comments on bitmap_to_arr64() (Qu Wenruo) - optimize out non-atomic bitops on compile-time constants (Alexander Lobakin) - cleanup bitmap-related headers (Yury Norov) - x86/olpc: fix 'logical not is only applied to the left hand side' (Alexander Lobakin) - lib/nodemask: inline wrappers around bitmap (Yury Norov) * tag 'bitmap-6.0-rc1' of https://github.com/norov/linux: (26 commits) lib/nodemask: inline next_node_in() and node_random() powerpc: drop dependency on <asm/machdep.h> in archrandom.h x86/olpc: fix 'logical not is only applied to the left hand side' lib/cpumask: move some one-line wrappers to header file headers/deps: mm: align MANITAINERS and Docs with new gfp.h structure headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h> headers/deps: mm: Optimize <linux/gfp.h> header dependencies lib/cpumask: move trivial wrappers around find_bit to the header lib/cpumask: change return types to unsigned where appropriate cpumask: change return types to bool where appropriate lib/bitmap: change type of bitmap_weight to unsigned long lib/bitmap: change return types to bool where appropriate arm: align find_bit declarations with generic kernel iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE) lib/test_bitmap: test the tail after bitmap_to_arr64() lib/bitmap: fix off-by-one in bitmap_to_arr64() lib: test_bitmap: add compile-time optimization/evaluations assertions bitmap: don't assume compiler evaluates small mem*() builtins calls net/ice: fix initializing the bitmap in the switch code bitops: let optimize out non-atomic bitops on compile-time constants ...
2022-08-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni3-3/+51
Conflicts: net/ax25/af_ax25.c d7c4c9e075f8c ("ax25: fix incorrect dev_tracker usage") d62607c3fe459 ("net: rename reference+tracking helpers") drivers/net/netdevsim/fib.c 180a6a3ee60a ("netdevsim: fib: Fix reference count leak on route deletion failure") 012ec02ae441 ("netdevsim: convert driver to use unlocked devlink API during init/fini") Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-01net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr()Jian Shen1-1/+1
vsi->current_netdev_flags is used store the current net device flags, not the active netdevice features. So it should use vsi->netdev->featurs, rather than vsi->current_netdev_flags to check NETIF_F_HW_VLAN_CTAG_FILTER. Fixes: 1babaf77f49d ("ice: Advertise 802.1ad VLAN filtering and offloads for PF netdev") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Acked-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>