Age | Commit message (Collapse) | Author | Files | Lines |
|
commit e6ba5273d4ede03d075d7a116b8edad1f6115f4d upstream.
The VF can be configured via the PF's ndo ops at the same time the PF is
receiving/handling virtchnl messages. This has many issues, with
one of them being the ndo op could be actively resetting a VF (i.e.
resetting it to the default state and deleting/re-adding the VF's VSI)
while a virtchnl message is being handled. The following error was seen
because a VF ndo op was used to change a VF's trust setting while the
VIRTCHNL_OP_CONFIG_VSI_QUEUES was ongoing:
[35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
[35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
[35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
Fix this by making sure the virtchnl handling and VF ndo ops that
trigger VF resets cannot run concurrently. This is done by adding a
struct mutex cfg_lock to each VF structure. For VF ndo ops, the mutex
will be locked around the critical operations and VFR. Since the ndo ops
will trigger a VFR, the virtchnl thread will use mutex_trylock(). This
is done because if any other thread (i.e. VF ndo op) has the mutex, then
that means the current VF message being handled is no longer valid, so
just ignore it.
This issue can be seen using the following commands:
for i in {0..50}; do
rmmod ice
modprobe ice
sleep 1
echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
ip link set ens785f1 vf 0 trust on
ip link set ens785f0 vf 0 trust on
sleep 2
echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
sleep 1
echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
ip link set ens785f1 vf 0 trust on
ip link set ens785f0 vf 0 trust on
done
Fixes: 7c710869d64e ("ice: Add handlers for VF netdevice operations")
Cc: <stable@vger.kernel.org>
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
[I had to fix the cherry-pick manually as the patch added a line around
some context that was missing.]
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5950bdc88dd1d158f2845fdff8fb1de86476806c upstream.
Clang static analysis reports this issues
ice_common.c:5008:21: warning: The left expression of the compound
assignment is an uninitialized value. The computed value will
also be garbage
ldo->phy_type_low |= ((u64)buf << (i * 16));
~~~~~~~~~~~~~~~~~ ^
When called from ice_cfg_phy_fec() ldo is the uninitialized local
variable tlv. So initialize.
Fixes: ea78ce4dab05 ("ice: add link lenient and default override support")
Signed-off-by: Tom Rix <trix@redhat.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ed22d9c8d128293fc7b0b086c7d3654bcb99a8dd upstream.
Clang static analysis reports this issue
time64.h:69:50: warning: The left operand of '+'
is a garbage value
set_normalized_timespec64(&ts_delta, lhs.tv_sec + rhs.tv_sec,
~~~~~~~~~~ ^
In ice_ptp_adjtime_nonatomic(), the timespec64 variable 'now'
is set by ice_ptp_gettimex64(). This function can fail
with -EBUSY, so 'now' can have a gargbage value.
So check the return.
Fixes: 06c16d89d2cb ("ice: register 1588 PTP clock device object for E810 devices")
Signed-off-by: Tom Rix <trix@redhat.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit fe20371578ef640069e6ae9fa8038f60e7908565 upstream.
Revert of a patch that instead of fixing a AQ error when trying
to reset BW limit introduced several regressions related to
creation and managing TC. Currently there are errors when creating
a TC on both PF and VF.
Error log:
[17428.783095] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14
[17428.783107] i40e 0000:3b:00.1: Failed configuring TC map 0 for VSI 391
[17428.783254] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14
[17428.783259] i40e 0000:3b:00.1: Unable to configure TC map 0 for VSI 391
This reverts commit 3d2504663c41104b4359a15f35670cfa82de1bbf.
Fixes: 3d2504663c41 (i40e: Fix reset bw limit when DCB enabled with 1 TC)
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220223175347.1690692-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 86006f996346e8a5a1ea80637ec949ceeea4ecbc upstream.
The COMMS package can enable the hardware parser to recognize IPSEC
frames with ESP header and SPI identifier. If this package is available
and configured for loading in /lib/firmware, then the driver will
succeed in enabling this protocol type for RSS.
This in turn allows the hardware to hash over the SPI and use it to pick
a consistent receive queue for the same secure flow. Without this all
traffic is steered to the same queue for multiple traffic threads from
the same IP address. For that reason this is marked as a fix, as the
driver supports the model, but it wasn't enabled.
If the package is not available, adding this type will fail, but the
failure is ignored on purpose as it has no negative affect.
Fixes: c90ed40cefe1 ("ice: Enable writing hardware filtering tables")
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5dbbbd01cbba831233c6ea9a3e6bfa133606d3c0 ]
If a call to re-create the auxiliary device happens in a context that has
already taken the RTNL lock, then the call flow that recreates auxiliary
device can hang if there is another attempt to claim the RTNL lock by the
auxiliary driver.
To avoid this, any call to re-create auxiliary devices that comes from
an source that is holding the RTNL lock (e.g. netdev notifier when
interface exits a bond) should execute in a separate thread. To
accomplish this, add a flag to the PF that will be evaluated in the
service task and dealt with there.
Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bea1898f65b9b7096cb4e73e97c83b94718f1fa1 ]
Currently, the same handler is called for both a NETDEV_BONDING_INFO
LAG unlink notification as for a NETDEV_UNREGISTER call. This is
causing a problem though, since the netdev_notifier_info passed has
a different structure depending on which event is passed. The problem
manifests as a call trace from a BUG: KASAN stack-out-of-bounds error.
Fix this by creating a handler specific to NETDEV_UNREGISTER that only
is passed valid elements in the netdev_notifier_info struct for the
NETDEV_UNREGISTER event.
Also included is the removal of an unbalanced dev_put on the peer_netdev
and related braces.
Fixes: 6a8b357278f5 ("ice: Respond to a NETDEV_UNREGISTER event for LAG")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 46b699c50c0304cdbd725d7740073a7f9d5edb10 ]
The driver was avoiding offload for IPIP (at least) frames due to
parsing the inner header offsets incorrectly when trying to check
lengths.
This length check works for VXLAN frames but fails on IPIP frames
because skb_transport_offset points to the inner header in IPIP
frames, which meant the subtraction of transport_header from
inner_network_header returns a negative value (-20).
With the code before this patch, everything continued to work, but GSO
was being used to segment, causing throughputs of 1.5Gb/s per thread.
After this patch, throughput is more like 10Gb/s per thread for IPIP
traffic.
Fixes: e94d44786693 ("ice: Implement filter sync, NDO operations and bump version")
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 21338d58736ef70eaae5fd75d567a358ff7902f9 ]
Propagate the error code from ice_get_link_default_override() instead
of returning success.
Fixes: ea78ce4dab05 ("ice: add link lenient and default override support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit fe68195daf34d5dddacd3f93dd3eafc4beca3a0e ]
From 4.17 onwards the ixgbevf driver uses build_skb() to build an skb
around new data in the page buffer shared with the ixgbe PF.
This uses either a 2K or 3K buffer, and offsets the DMA mapping by
NET_SKB_PAD + NET_IP_ALIGN. When using a smaller buffer RXDCTL is set to
ensure the PF does not write a full 2K bytes into the buffer, which is
actually 2K minus the offset.
However on the 82599 virtual function, the RXDCTL mechanism is not
available. The driver attempts to work around this by using the SET_LPE
mailbox method to lower the maximm frame size, but the ixgbe PF driver
ignores this in order to keep the PF and all VFs in sync[0].
This means the PF will write up to the full 2K set in SRRCTL, causing it
to write NET_SKB_PAD + NET_IP_ALIGN bytes past the end of the buffer.
With 4K pages split into two buffers, this means it either writes
NET_SKB_PAD + NET_IP_ALIGN bytes past the first buffer (and into the
second), or NET_SKB_PAD + NET_IP_ALIGN bytes past the end of the DMA
mapping.
Avoid this by only enabling build_skb when using "large" buffers (3K).
These are placed in each half of an order-1 page, preventing the PF from
writing past the end of the mapping.
[0]: Technically it only ever raises the max frame size, see
ixgbe_set_vf_lpe() in ixgbe_sriov.c
Fixes: f15c5ba5b6cd ("ixgbevf: add support for using order 1 pages to receive large frames")
Signed-off-by: Samuel Mendoza-Jonas <samjonas@amazon.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 68defd528f94ed1cf11f49a75cc1875dccd781fa upstream.
We have the same LAN controller on different PCH's. Separate ADP board
type from a TGP which will allow for specific fixes to be applied for
ADP platforms.
Suggested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cad014b7b5a6897d8c4fad13e2888978bfb7a53f upstream.
Handshake with CSME/AMT on none provisioned platforms during S0ix flow
is not supported on TGL platform and can cause to HW unit hang. Update
the handshake with CSME flow to start from the ADL platform.
Fixes: 3e55d231716e ("e1000e: Add handshake with the CSME to support S0ix")
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6533e558c6505e94c3e0ed4281ed5e31ec985f4d upstream.
Fix the crash in kernel while dereferencing the NULL pointer,
when the driver is unloaded and simultaneously the VSI rings
are being stopped.
The hardware requires 50msec in order to finish RX queues
disable. For this purpose the driver spins in mdelay function
for the operation to be completed.
For example changing number of queues which requires reset would
fail in the following call stack:
1) i40e_prep_for_reset
2) i40e_pf_quiesce_all_vsi
3) i40e_quiesce_vsi
4) i40e_vsi_close
5) i40e_down
6) i40e_vsi_stop_rings
7) i40e_vsi_control_rx -> disable requires the delay of 50msecs
8) continue back in i40e_down function where
i40e_clean_tx_ring(vsi->tx_rings[i]) is going to crash
When the driver was spinning vsi_release called
i40e_vsi_free_arrays where the vsi->tx_rings resources
were freed and the pointer was set to NULL.
Fixes: 5b6d4a7f20b0 ("i40e: Fix crash during removing i40e driver")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3d2504663c41104b4359a15f35670cfa82de1bbf upstream.
There was an AQ error I40E_AQ_RC_EINVAL when trying
to reset bw limit as part of bw allocation setup.
This was caused by trying to reset bw limit with
DCB enabled. Bw limit should not be reset when
DCB is enabled. The code was relying on the pf->flags
to check if DCB is enabled but if only 1 TC is available
this flag will not be set even though DCB is enabled.
Add a check for number of TC and if it is 1
don't try to reset bw limit even if pf->flags shows
DCB as disabled.
Fixes: fa38e30ac73f ("i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled")
Suggested-by: Alexander Lobakin <alexandr.lobakin@intel.com> # Flatten the condition
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Imam Hassan Reza Biswas <imam.hassan.reza.biswas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3b8428b84539c78fdc8006c17ebd25afd4722d51 upstream.
Change i40e_update_vsi_stats and struct i40e_vsi to use u64 fields to match
the width of the stats counters in struct i40e_rx_queue_stats.
Update debugfs code to use the correct format specifier for u64.
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reported-by: kernel test robot <lkp@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0f344c8129a5337dae50e31b817dd50a60ff238c upstream.
Fix for failed to init adminq: -53 while VF is resetting via MAC
address changing procedure.
Added sync module to avoid reading deadbeef value in reinit adminq
during software reset.
Without this patch it is possible to trigger VF reset procedure
during reinit adminq. This resulted in an incorrect reading of
value from the AQP registers and generated the -53 error.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 92947844b8beee988c0ce17082b705c2f75f0742 upstream.
When XDP was configured on a system with large number of CPUs
and X722 NIC there was a call trace with NULL pointer dereference.
i40e 0000:87:00.0: failed to get tracking for 256 queues for VSI 0 err -12
i40e 0000:87:00.0: setup of MAIN VSI failed
BUG: kernel NULL pointer dereference, address: 0000000000000000
RIP: 0010:i40e_xdp+0xea/0x1b0 [i40e]
Call Trace:
? i40e_reconfig_rss_queues+0x130/0x130 [i40e]
dev_xdp_install+0x61/0xe0
dev_xdp_attach+0x18a/0x4c0
dev_change_xdp_fd+0x1e6/0x220
do_setlink+0x616/0x1030
? ahci_port_stop+0x80/0x80
? ata_qc_issue+0x107/0x1e0
? lock_timer_base+0x61/0x80
? __mod_timer+0x202/0x380
rtnl_setlink+0xe5/0x170
? bpf_lsm_binder_transaction+0x10/0x10
? security_capable+0x36/0x50
rtnetlink_rcv_msg+0x121/0x350
? rtnl_calcit.isra.0+0x100/0x100
netlink_rcv_skb+0x50/0xf0
netlink_unicast+0x1d3/0x2a0
netlink_sendmsg+0x22a/0x440
sock_sendmsg+0x5e/0x60
__sys_sendto+0xf0/0x160
? __sys_getsockname+0x7e/0xc0
? _copy_from_user+0x3c/0x80
? __sys_setsockopt+0xc8/0x1a0
__x64_sys_sendto+0x20/0x30
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f83fa7a39e0
This was caused by PF queue pile fragmentation due to
flow director VSI queue being placed right after main VSI.
Because of this main VSI was not able to resize its
queue allocation for XDP resulting in no queues allocated
for main VSI when XDP was turned on.
Fix this by always allocating last queue in PF queue pile
for a flow director VSI.
Fixes: 41c445ff0f48 ("i40e: main driver core")
Fixes: 74608d17fe29 ("i40e: add support for XDP_TX action")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d701658a50a471591094b3eb3961b4926cc8f104 upstream.
Before this patch VF interface vanished when
maximum queue number was exceeded. Driver tried
to add next queues even if there was not enough
space. PF sent incorrect number of queues to
the VF when there were not enough of them.
Add an additional condition introduced to check
available space in 'qp_pile' before proceeding.
This condition makes it impossible to add queues
if they number is greater than the number resulting
from available space.
Also add the search for free space in PF queue
pair piles.
Without this patch VF interfaces are not seen
when available space for queues has been
exceeded and following logs appears permanently
in dmesg:
"Unable to get VF config (-32)".
"VF 62 failed opcode 3, retval: -5"
"Unable to get VF config due to PF error condition, not retrying"
Fixes: 7daa6bf3294e ("i40e: driver core headers")
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Jaroslaw Gawin <jaroslawx.gawin@intel.com>
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9b13bd53134c9ddd544a790125199fdbdb505e67 upstream.
Recently simplified i40e_rebuild causes that FW sometimes
is not ready after NVM update, the ping does not return.
Increase the delay in case of EMP reset.
Old delay of 300 ms was introduced for specific cards for 710 series.
Now it works for all the cards and delay was increased.
Fixes: 1fa51a650e1d ("i40e: Add delay after EMP reset for firmware to recover")
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 4fa8fcd3440101dbacf4fae91de69877ef751977 ]
Driver already implicitly supports XDP metadata access in AF_XDP
zero-copy mode, as xsk_buff_pool's xp_alloc() naturally set xdp_buff
data_meta equal data.
This works fine for XDP and AF_XDP, but if a BPF-prog adjust via
bpf_xdp_adjust_meta() and choose to call XDP_PASS, then igc function
igc_construct_skb_zc() will construct an invalid SKB packet. The
function correctly include the xdp->data_meta area in the memcpy, but
forgot to pull header to take metasize into account.
Fixes: fc9df2a0b520 ("igc: Enable RX via AF_XDP zero-copy")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit e738451d78b2f8a9635d66c6a87f304b4d965f7a upstream.
There was a wrong queues representation in sysfs during
driver's reinitialization in case of online cpus number is
less than combined queues. It was caused by stopped
NetworkManager, which is responsible for calling vsi_open
function during driver's initialization.
In specific situation (ex. 12 cpus online) there were 16 queues
in /sys/class/net/<iface>/queues. In case of modifying queues with
value higher, than number of online cpus, then it caused write
errors and other errors.
Add updating of sysfs's queues representation during driver
initialization.
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Lukasz Cieplicki <lukaszx.cieplicki@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 40feded8a247f95957a0de9abd100085fb320a2f upstream.
When loading the i40e driver, it prints a message like: 'The driver for the
device detected a newer version of the NVM image v1.x than expected v1.y.
Please install the most recent version of the network driver.' This is
misleading as the driver is working as expected.
Fix that by removing the second part of message and changing it from
dev_info to dev_dbg.
Fixes: 4fb29bddb57f ("i40e: The driver now prints the API version in error message")
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3116f59c12bd24c513194cd3acb3ec1f7d468954 upstream.
Using ifconfig command to delete the ipv6 address will cause
the i40e network card driver to delete its internal mac_filter and
i40e_service_task kernel thread will concurrently access the mac_filter.
These two processes are not protected by lock
so causing the following use-after-free problems.
print_address_description+0x70/0x360
? vprintk_func+0x5e/0xf0
kasan_report+0x1b2/0x330
i40e_sync_vsi_filters+0x4f0/0x1850 [i40e]
i40e_sync_filters_subtask+0xe3/0x130 [i40e]
i40e_service_task+0x195/0x24c0 [i40e]
process_one_work+0x3f5/0x7d0
worker_thread+0x61/0x6c0
? process_one_work+0x7d0/0x7d0
kthread+0x1c3/0x1f0
? kthread_park+0xc0/0xc0
ret_from_fork+0x35/0x40
Allocated by task 2279810:
kasan_kmalloc+0xa0/0xd0
kmem_cache_alloc_trace+0xf3/0x1e0
i40e_add_filter+0x127/0x2b0 [i40e]
i40e_add_mac_filter+0x156/0x190 [i40e]
i40e_addr_sync+0x2d/0x40 [i40e]
__hw_addr_sync_dev+0x154/0x210
i40e_set_rx_mode+0x6d/0xf0 [i40e]
__dev_set_rx_mode+0xfb/0x1f0
__dev_mc_add+0x6c/0x90
igmp6_group_added+0x214/0x230
__ipv6_dev_mc_inc+0x338/0x4f0
addrconf_join_solict.part.7+0xa2/0xd0
addrconf_dad_work+0x500/0x980
process_one_work+0x3f5/0x7d0
worker_thread+0x61/0x6c0
kthread+0x1c3/0x1f0
ret_from_fork+0x35/0x40
Freed by task 2547073:
__kasan_slab_free+0x130/0x180
kfree+0x90/0x1b0
__i40e_del_filter+0xa3/0xf0 [i40e]
i40e_del_mac_filter+0xf3/0x130 [i40e]
i40e_addr_unsync+0x85/0xa0 [i40e]
__hw_addr_sync_dev+0x9d/0x210
i40e_set_rx_mode+0x6d/0xf0 [i40e]
__dev_set_rx_mode+0xfb/0x1f0
__dev_mc_del+0x69/0x80
igmp6_group_dropped+0x279/0x510
__ipv6_dev_mc_dec+0x174/0x220
addrconf_leave_solict.part.8+0xa2/0xd0
__ipv6_ifa_notify+0x4cd/0x570
ipv6_ifa_notify+0x58/0x80
ipv6_del_addr+0x259/0x4a0
inet6_addr_del+0x188/0x260
addrconf_del_ifaddr+0xcc/0x130
inet6_ioctl+0x152/0x190
sock_do_ioctl+0xd8/0x2b0
sock_ioctl+0x2e5/0x4c0
do_vfs_ioctl+0x14e/0xa80
ksys_ioctl+0x7c/0xa0
__x64_sys_ioctl+0x42/0x50
do_syscall_64+0x98/0x2c0
entry_SYSCALL_64_after_hwframe+0x65/0xca
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Di Zhu <zhudi2@huawei.com>
Signed-off-by: Rui Zhang <zhangrui182@huawei.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b712941c8085e638bb92456e866ed3de4404e3d5 upstream.
In the absence of this validation, if the user requests to
configure queues more than the enabled queues, it results in
sending the requested number of queues to the kernel stack
(due to the asynchronous nature of VF response), in which
case the stack might pick a queue to transmit that is not
enabled and result in Tx hang. Fix this bug by
limiting the total number of queues allocated for VF to
active queues of VF.
Fixes: d5b33d024496 ("i40evf: add ndo_setup_tc callback to i40evf")
Signed-off-by: Ashwin Vijayavel <ashwin.vijayavel@intel.com>
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 01cbf50877e602e2376af89e4a51c30bc574c618 upstream.
Hide i40e opcode information sent during response to VF in case when
untrusted VF tried to change MAC on the VF interface.
This is implemented by adding an additional parameter 'hide' to the
response sent to VF function that hides the display of error
information, but forwards the error code to VF.
Previously it was not possible to send response with some error code
to VF without displaying opcode information.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Reviewed-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f85846bbf43de38fb2c89fe7d2a085608c4eb25a ]
Time synchronization was not properly enabled on non-MSI-X platforms.
Fixes: 2c344ae24501 ("igc: Add support for TX timestamping")
Signed-off-by: James McLaughlin <james.mclaughlin@qsc.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1e81dcc1ab7de7a789e60042ce82d5a612632599 ]
It was reported that when PCIe PTM is enabled, some lockups could
be observed with some integrated i225-V models.
While the issue is investigated, we can disable crosstimestamp for
those models and see no loss of functionality, because those models
don't have any support for time synchronization.
Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()")
Link: https://lore.kernel.org/all/924175a188159f4e03bd69908a91e606b574139b.camel@gmx.de/
Reported-by: Stefan Dietrich <roots@gmx.de>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ac8c58f5b535d6272324e2b8b4a0454781c9147e ]
Recent net core changes caused an issue with few Intel drivers
(reportedly igb), where taking RTNL in RPM resume path results in a
deadlock. See [0] for a bug report. I don't think the core changes
are wrong, but taking RTNL in RPM resume path isn't needed.
The Intel drivers are the only ones doing this. See [1] for a
discussion on the issue. Following patch changes the RPM resume path
to not take RTNL.
[0] https://bugzilla.kernel.org/show_bug.cgi?id=215129
[1] https://lore.kernel.org/netdev/20211125074949.5f897431@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/t/
Fixes: bd869245a3dc ("net: core: try to runtime-resume detached device in __dev_open")
Fixes: f32a21376573 ("ethtool: runtime-resume netdev parent before ethtool ioctl ops")
Tested-by: Martin Stolpe <martin.stolpe@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20211220201844.2714498-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit afe8a3ba85ec2a6b6849367e25c06a2f8e0ddd05 ]
Currently we only NULL the xdp_buff pointer in the internal SW ring but
we never give it back to the xsk buffer pool. This means that buffers
can be leaked out of the buff pool and never be used again.
Add missing xsk_buff_free() call to the routine that is supposed to
clean the entries that are left in the ring so that these buffers in the
umem can be used by other sockets.
Also, only go through the space that is actually left to be cleaned
instead of a whole ring.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 57f7f8b6bc0bc80d94443f94fe5f21f266499a2b ]
In order to use the new xsk batched buffer allocation interface, a
pointer to an array of struct xsk_buff pointers need to be provided so
that the function can put the result of the allocation there. In the
ice driver, we already have a ring that stores pointers to
xdp_buffs. This is only used for the xsk zero-copy driver and is a
union with the structure that is used for the regular non zero-copy
path. Unfortunately, that structure is larger than the xdp_buffs
pointers which mean that there will be a stride (of 20 bytes) between
each xdp_buff pointer. And feeding this into the xsk_buff_alloc_batch
interface will not work since it assumes a regular array of xdp_buff
pointers (each 8 bytes with 0 bytes in-between them on a 64-bit
system).
To fix this, remove the xdp_buff pointer from the rx_buf union and
move it one step higher to the union above which only has pointers to
arrays in it. This solves the problem and we can directly feed the SW
ring of xdp_buff pointers straight into the allocation function in the
next patch when that interface is used. This will improve performance.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-4-magnus.karlsson@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bf0a375055bd1afbbf02a0ef45f7655da7b71317 ]
The MDIO bus speed must be initialized before talking to the PHY the first
time in order to avoid talking to it using a speed that the PHY doesn't
support.
This fixes HW initialization error -17 (IXGBE_ERR_PHY_ADDR_INVALID) on
Denverton CPUs (a.k.a. the Atom C3000 family) on ports with a 10Gb network
plugged in. On those devices, HLREG0[MDCSPD] resets to 1, which combined
with the 10Gb network results in a 24MHz MDIO speed, which is apparently
too fast for the connected PHY. PHY register reads over MDIO bus return
garbage, leading to initialization failure.
Reproduced with Linux kernel 4.19 and 5.15-rc7. Can be reproduced using
the following setup:
* Use an Atom C3000 family system with at least one X552 LAN on the SoC
* Disable PXE or other BIOS network initialization if possible
(the interface must not be initialized before Linux boots)
* Connect a live 10Gb Ethernet cable to an X550 port
* Power cycle (not reset, doesn't always work) the system and boot Linux
* Observe: ixgbe interfaces w/ 10GbE cables plugged in fail with error -17
Fixes: e84db7272798 ("ixgbe: Introduce function to control MDIO speed")
Signed-off-by: Cyril Novikov <cnovikov@lynx.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 271225fd57c2f1e0b3f8826df51be6c634affefe ]
Commit a296d665eae1 ("ixgbe: Add ethtool support to enable 2.5 and 5.0
Gbps support") introduced suppression of the advertisement of NBASE-T
speeds by default, according to Todd Fujinaka to accommodate customers
with network switches which could not cope with advertised NBASE-T
speeds, as posted in the E1000-devel mailing list:
https://sourceforge.net/p/e1000/mailman/message/37106269/
However, the suppression was not documented at all, nor was how to
enable NBASE-T support.
Properly document the NBASE-T suppression and how to enable NBASE-T
support.
Fixes: a296d665eae1 ("ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support")
Reported-by: Robert Schlabbach <robert_s@gmx.net>
Signed-off-by: Robert Schlabbach <robert_s@gmx.net>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0182d1f3fa640888a2ed7e3f6df2fdb10adee7c8 ]
The LTR maximum value was incorrectly written using the scale from
the LTR minimum value. This would cause incorrect values to be sent,
in cases where the initial calculation lead to different min/max scales.
Fixes: 707abf069548 ("igc: Add initial LTR support")
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b6d335a60dc624c0d279333b22c737faa765b028 ]
In `igbvf_probe`, if register_netdev() fails, the program will go to
label err_hw_init, and then to label err_ioremap. In free_netdev() which
is just below label err_ioremap, there is `list_for_each_entry_safe` and
`netif_napi_del` which aims to delete all entries in `dev->napi_list`.
The program has added an entry `adapter->rx_ring->napi` which is added by
`netif_napi_add` in igbvf_alloc_queues(). However, adapter->rx_ring has
been freed below label err_hw_init. So this a UAF.
In terms of how to patch the problem, we can refer to igbvf_remove() and
delete the entry before `adapter->rx_ring`.
The KASAN logs are as follows:
[ 35.126075] BUG: KASAN: use-after-free in free_netdev+0x1fd/0x450
[ 35.127170] Read of size 8 at addr ffff88810126d990 by task modprobe/366
[ 35.128360]
[ 35.128643] CPU: 1 PID: 366 Comm: modprobe Not tainted 5.15.0-rc2+ #14
[ 35.129789] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[ 35.131749] Call Trace:
[ 35.132199] dump_stack_lvl+0x59/0x7b
[ 35.132865] print_address_description+0x7c/0x3b0
[ 35.133707] ? free_netdev+0x1fd/0x450
[ 35.134378] __kasan_report+0x160/0x1c0
[ 35.135063] ? free_netdev+0x1fd/0x450
[ 35.135738] kasan_report+0x4b/0x70
[ 35.136367] free_netdev+0x1fd/0x450
[ 35.137006] igbvf_probe+0x121d/0x1a10 [igbvf]
[ 35.137808] ? igbvf_vlan_rx_add_vid+0x100/0x100 [igbvf]
[ 35.138751] local_pci_probe+0x13c/0x1f0
[ 35.139461] pci_device_probe+0x37e/0x6c0
[ 35.165526]
[ 35.165806] Allocated by task 366:
[ 35.166414] ____kasan_kmalloc+0xc4/0xf0
[ 35.167117] foo_kmem_cache_alloc_trace+0x3c/0x50 [igbvf]
[ 35.168078] igbvf_probe+0x9c5/0x1a10 [igbvf]
[ 35.168866] local_pci_probe+0x13c/0x1f0
[ 35.169565] pci_device_probe+0x37e/0x6c0
[ 35.179713]
[ 35.179993] Freed by task 366:
[ 35.180539] kasan_set_track+0x4c/0x80
[ 35.181211] kasan_set_free_info+0x1f/0x40
[ 35.181942] ____kasan_slab_free+0x103/0x140
[ 35.182703] kfree+0xe3/0x250
[ 35.183239] igbvf_probe+0x1173/0x1a10 [igbvf]
[ 35.184040] local_pci_probe+0x13c/0x1f0
Fixes: d4e0fe01a38a0 (igbvf: add new driver to support 82576 virtual functions)
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Letu Ren <fantasquex@gmail.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 584af82154f56e6b2740160fcc84a2966d969e15 ]
Move checking condition of VF MAC filter before clearing
or adding MAC filter to VF to prevent potential blackout caused
by removal of necessary and working VF's MAC filter.
Fixes: 1b8b062a99dc ("igb: add VF trust infrastructure")
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 37e738b6fdb14529534dca441e0222313688fde3 ]
The driver has to check if it does not accidentally put the timestamp in
the SKB before previous timestamp gets overwritten.
Timestamp values in the PHY are read only and do not get cleared except
at hardware reset or when a new timestamp value is captured.
The cached_tstamp field is used to detect the case where a new timestamp
has not yet been captured, ensuring that we avoid sending stale
timestamp data to the stack.
Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0013881c1145d36bf26165bb70fdd7560a5507a3 ]
Change the division in ice_ptp_adjfine from div_u64 to div64_u64.
div_u64 is used when the divisor is 32 bit but in this case incval is
64 bit and it caused incorrect calculations and incval adjustments.
Fixes: 06c16d89d2cb ("ice: register 1588 PTP clock device object for E810 devices")
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f23ab04dd6f703e282bb2d51fe3ae14f4b88a628 ]
When VF is being reset, ice_reset_vf() will be called and FDIR
resource should be released and initialized again.
Fixes: 1f7ea1cd6a37 ("ice: Enable FDIR Configure for AVF")
Signed-off-by: Yahui Cao <yahui.cao@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 23ec111bf3549aae37140330c31a16abfc172421 upstream.
When trying to dump VFs VSI RX/TX descriptors
using debugfs there was a crash
due to NULL pointer dereference in i40e_dbg_dump_desc.
Added a check to i40e_dbg_dump_desc that checks if
VSI type is correct for dumping RX/TX descriptors.
Fixes: 02e9c290814c ("i40e: debugfs interface")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8aa55ab422d9d0d825ebfb877702ed661e96e682 upstream.
After setting pre-set combined to 16 queues and reserving 16 queues by
tc qdisc, pre-set maximum combined queues returned to default value
after VF reset being 4 and this generated errors during removing tc.
Fixed by removing clear num_req_queues before reset VF.
Fixes: e284fc280473 (i40e: Add and delete cloud filter)
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Bindushree P <Bindushree.p@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 61125b8be85dfbc7e9c7fe1cc6c6d631ab603516 upstream.
Fix failed operation code appearing if handling messages from VF.
Implemented by waiting for VF appropriate state if request starts
handle while VF reset.
Without this patch the message handling request while VF is in
a reset state ends with error -5 (I40E_ERR_PARAM).
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1a1aa356ddf3f16539f5962c01c5f702686dfc15 upstream.
iavf_set_ringparams doesn't communicate to the user that
1. The user requested descriptor count is out of range. Instead it
just quietly sets descriptors to the "clamped" value and calls it
done. This makes it look an invalid value was successfully set as
the descriptor count when this isn't actually true.
2. The user provided descriptor count needs to be inflated for alignment
reasons.
This behavior is confusing. The ice driver has already addressed this
by rejecting invalid values for descriptor count and
messaging for alignment adjustments.
Do the same thing here by adding the error and info messages.
Fixes: fbb7ddfef253 ("i40evf: core ethtool functionality")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7e4dcc13965c57869684d57a1dc6dd7be589488c upstream.
If the PF experiences an FLR, the VF's MSI and MSI-X configuration will
be conveniently and silently removed in the process. When this happens,
reset recovery will appear to complete normally but no traffic will
pass. The netdev watchdog will helpfully notify everyone of this issue.
To prevent such public embarrassment, restore MSI configuration at every
reset. For normal resets, this will do no harm, but for VF resets
resulting from a PF FLR, this will keep the VF working.
Fixes: 5eae00c57f5e ("i40evf: main driver core")
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 28dc1b86f8ea9fd6f4c9e0b363db73ecabf84e22 upstream.
If the hardware is constantly receiving unicast or broadcast packets
during driver load, the device previously counted many GLV_RDPC (VSI
dropped packets) events during init. This causes confusing dropped
packet statistics during driver load. The dropped packets counter
incrementing does stop once the driver finishes loading.
Avoid this problem by baselining our statistics at the end of driver
open instead of the end of probe.
Fixes: cdedef59deb0 ("ice: Configure VSIs for Tx/Rx")
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit eaeace60778e524a2820d0c0ad60bf80289e292c ]
Oleksandr brought a bug report where netpoll causes trace
messages in the log on igb.
Danielle brought this back up as still occurring, so we'll try
again.
[22038.710800] ------------[ cut here ]------------
[22038.710801] igb_poll+0x0/0x1440 [igb] exceeded budget in poll
[22038.710802] WARNING: CPU: 12 PID: 40362 at net/core/netpoll.c:155 netpoll_poll_dev+0x18a/0x1a0
As Alex suggested, change the driver to return work_done at the
exit of napi_poll, which should be safe to do in this driver
because it is not polling multiple queues in this single napi
context (multiple queues attached to one MSI-X vector). Several
other drivers contain the same simple sequence, so I hope
this will not create new problems.
Fixes: 16eb8815c235 ("igb: Refactor clean_rx_irq to reduce overhead and improve performance")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reported-by: Danielle Ratson <danieller@nvidia.com>
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Danielle Ratson <danieller@nvidia.com>
Link: https://lore.kernel.org/r/20211123204000.1597971-1-jesse.brandeburg@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f65ee535df775a13a1046c0a0b2d72db342f8a5b ]
Ice driver has the routines for managing XDP resources that are shared
between ndo_bpf op and VSI rebuild flow. The latter takes place for
example when user changes queue count on an interface via ethtool's
set_channels().
There is an issue around the bpf_prog refcounting when VSI is being
rebuilt - since ice_prepare_xdp_rings() is called with vsi->xdp_prog as
an argument that is used later on by ice_vsi_assign_bpf_prog(), same
bpf_prog pointers are swapped with each other. Then it is also
interpreted as an 'old_prog' which in turn causes us to call
bpf_prog_put on it that will decrement its refcount.
Below splat can be interpreted in a way that due to zero refcount of a
bpf_prog it is wiped out from the system while kernel still tries to
refer to it:
[ 481.069429] BUG: unable to handle page fault for address: ffffc9000640f038
[ 481.077390] #PF: supervisor read access in kernel mode
[ 481.083335] #PF: error_code(0x0000) - not-present page
[ 481.089276] PGD 100000067 P4D 100000067 PUD 1001cb067 PMD 106d2b067 PTE 0
[ 481.097141] Oops: 0000 [#1] PREEMPT SMP PTI
[ 481.101980] CPU: 12 PID: 3339 Comm: sudo Tainted: G OE 5.15.0-rc5+ #1
[ 481.110840] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016
[ 481.122021] RIP: 0010:dev_xdp_prog_id+0x25/0x40
[ 481.127265] Code: 80 00 00 00 00 0f 1f 44 00 00 89 f6 48 c1 e6 04 48 01 fe 48 8b 86 98 08 00 00 48 85 c0 74 13 48 8b 50 18 31 c0 48 85 d2 74 07 <48> 8b 42 38 8b 40 20 c3 48 8b 96 90 08 00 00 eb e8 66 2e 0f 1f 84
[ 481.148991] RSP: 0018:ffffc90007b63868 EFLAGS: 00010286
[ 481.155034] RAX: 0000000000000000 RBX: ffff889080824000 RCX: 0000000000000000
[ 481.163278] RDX: ffffc9000640f000 RSI: ffff889080824010 RDI: ffff889080824000
[ 481.171527] RBP: ffff888107af7d00 R08: 0000000000000000 R09: ffff88810db5f6e0
[ 481.179776] R10: 0000000000000000 R11: ffff8890885b9988 R12: ffff88810db5f4bc
[ 481.188026] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 481.196276] FS: 00007f5466d5bec0(0000) GS:ffff88903fb00000(0000) knlGS:0000000000000000
[ 481.205633] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 481.212279] CR2: ffffc9000640f038 CR3: 000000014429c006 CR4: 00000000003706e0
[ 481.220530] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 481.228771] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 481.237029] Call Trace:
[ 481.239856] rtnl_fill_ifinfo+0x768/0x12e0
[ 481.244602] rtnl_dump_ifinfo+0x525/0x650
[ 481.249246] ? __alloc_skb+0xa5/0x280
[ 481.253484] netlink_dump+0x168/0x3c0
[ 481.257725] netlink_recvmsg+0x21e/0x3e0
[ 481.262263] ____sys_recvmsg+0x87/0x170
[ 481.266707] ? __might_fault+0x20/0x30
[ 481.271046] ? _copy_from_user+0x66/0xa0
[ 481.275591] ? iovec_from_user+0xf6/0x1c0
[ 481.280226] ___sys_recvmsg+0x82/0x100
[ 481.284566] ? sock_sendmsg+0x5e/0x60
[ 481.288791] ? __sys_sendto+0xee/0x150
[ 481.293129] __sys_recvmsg+0x56/0xa0
[ 481.297267] do_syscall_64+0x3b/0xc0
[ 481.301395] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 481.307238] RIP: 0033:0x7f5466f39617
[ 481.311373] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb bd 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2f 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[ 481.342944] RSP: 002b:00007ffedc7f4308 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
[ 481.361783] RAX: ffffffffffffffda RBX: 00007ffedc7f5460 RCX: 00007f5466f39617
[ 481.380278] RDX: 0000000000000000 RSI: 00007ffedc7f5360 RDI: 0000000000000003
[ 481.398500] RBP: 00007ffedc7f53f0 R08: 0000000000000000 R09: 000055d556f04d50
[ 481.416463] R10: 0000000000000077 R11: 0000000000000246 R12: 00007ffedc7f5360
[ 481.434131] R13: 00007ffedc7f5350 R14: 00007ffedc7f5344 R15: 0000000000000e98
[ 481.451520] Modules linked in: ice(OE) af_packet binfmt_misc nls_iso8859_1 ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mxm_wmi mei_me coretemp mei ipmi_si ipmi_msghandler wmi acpi_pad acpi_power_meter ip_tables x_tables autofs4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel ahci crypto_simd cryptd libahci lpc_ich [last unloaded: ice]
[ 481.528558] CR2: ffffc9000640f038
[ 481.542041] ---[ end trace d1f24c9ecf5b61c1 ]---
Fix this by only calling ice_vsi_assign_bpf_prog() inside
ice_prepare_xdp_rings() when current vsi->xdp_prog pointer is NULL.
This way set_channels() flow will not attempt to swap the vsi->xdp_prog
pointers with itself.
Also, sprinkle around some comments that provide a reasoning about
correlation between driver and kernel in terms of bpf_prog refcount.
Fixes: efc2214b6047 ("ice: Add support for XDP")
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com>
Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 792b2086584f25d84081a526beee80d103c2a913 ]
The approach of having XDP queue per CPU regardless of user's setting
exposed a hidden bug that could occur in case when Rx queue count differ
from Tx queue count. Currently vsi->txq_map's size is equal to the
doubled vsi->alloc_txq, which is not correct due to the fact that XDP
rings were previously based on the Rx queue count. Below splat can be
seen when ethtool -L is used and XDP rings are configured:
[ 682.875339] BUG: kernel NULL pointer dereference, address: 000000000000000f
[ 682.883403] #PF: supervisor read access in kernel mode
[ 682.889345] #PF: error_code(0x0000) - not-present page
[ 682.895289] PGD 0 P4D 0
[ 682.898218] Oops: 0000 [#1] PREEMPT SMP PTI
[ 682.903055] CPU: 42 PID: 2878 Comm: ethtool Tainted: G OE 5.15.0-rc5+ #1
[ 682.912214] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016
[ 682.923380] RIP: 0010:devres_remove+0x44/0x130
[ 682.928527] Code: 49 89 f4 55 48 89 fd 4c 89 ff 53 48 83 ec 10 e8 92 b9 49 00 48 8b 9d a8 02 00 00 48 8d 8d a0 02 00 00 49 89 c2 48 39 cb 74 0f <4c> 3b 63 10 74 25 48 8b 5b 08 48 39 cb 75 f1 4c 89 ff 4c 89 d6 e8
[ 682.950237] RSP: 0018:ffffc90006a679f0 EFLAGS: 00010002
[ 682.956285] RAX: 0000000000000286 RBX: ffffffffffffffff RCX: ffff88908343a370
[ 682.964538] RDX: 0000000000000001 RSI: ffffffff81690d60 RDI: 0000000000000000
[ 682.972789] RBP: ffff88908343a0d0 R08: 0000000000000000 R09: 0000000000000000
[ 682.981040] R10: 0000000000000286 R11: 3fffffffffffffff R12: ffffffff81690d60
[ 682.989282] R13: ffffffff81690a00 R14: ffff8890819807a8 R15: ffff88908343a36c
[ 682.997535] FS: 00007f08c7bfa740(0000) GS:ffff88a03fd00000(0000) knlGS:0000000000000000
[ 683.006910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 683.013557] CR2: 000000000000000f CR3: 0000001080a66003 CR4: 00000000003706e0
[ 683.021819] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 683.030075] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 683.038336] Call Trace:
[ 683.041167] devm_kfree+0x33/0x50
[ 683.045004] ice_vsi_free_arrays+0x5e/0xc0 [ice]
[ 683.050380] ice_vsi_rebuild+0x4c8/0x750 [ice]
[ 683.055543] ice_vsi_recfg_qs+0x9a/0x110 [ice]
[ 683.060697] ice_set_channels+0x14f/0x290 [ice]
[ 683.065962] ethnl_set_channels+0x333/0x3f0
[ 683.070807] genl_family_rcv_msg_doit+0xea/0x150
[ 683.076152] genl_rcv_msg+0xde/0x1d0
[ 683.080289] ? channels_prepare_data+0x60/0x60
[ 683.085432] ? genl_get_cmd+0xd0/0xd0
[ 683.089667] netlink_rcv_skb+0x50/0xf0
[ 683.094006] genl_rcv+0x24/0x40
[ 683.097638] netlink_unicast+0x239/0x340
[ 683.102177] netlink_sendmsg+0x22e/0x470
[ 683.106717] sock_sendmsg+0x5e/0x60
[ 683.110756] __sys_sendto+0xee/0x150
[ 683.114894] ? handle_mm_fault+0xd0/0x2a0
[ 683.119535] ? do_user_addr_fault+0x1f3/0x690
[ 683.134173] __x64_sys_sendto+0x25/0x30
[ 683.148231] do_syscall_64+0x3b/0xc0
[ 683.161992] entry_SYSCALL_64_after_hwframe+0x44/0xae
Fix this by taking into account the value that num_possible_cpus()
yields in addition to vsi->alloc_txq instead of doubling the latter.
Fixes: efc2214b6047 ("ice: Add support for XDP")
Fixes: 22bf877e528f ("ice: introduce XDP_TX fallback path")
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5951a2b9812d8227d33f20d1899fae60e4f72c04 ]
When a VF goes through a reset, it's possible for the VF's feature set
to change. For example it may lose the VIRTCHNL_VF_OFFLOAD_VLAN
capability after VF reset. Unfortunately, the driver doesn't correctly
deal with this situation and errors are seen from downing/upping the
interface and/or moving the interface in/out of a network namespace.
When setting the interface down/up we see the following errors after the
VIRTCHNL_VF_OFFLOAD_VLAN capability was taken away from the VF:
ice 0000:51:00.1: VF 1 failed opcode 12, retval: -64 iavf 0000:51:09.1:
Failed to add VLAN filter, error IAVF_NOT_SUPPORTED ice 0000:51:00.1: VF
1 failed opcode 13, retval: -64 iavf 0000:51:09.1: Failed to delete VLAN
filter, error IAVF_NOT_SUPPORTED
These add/delete errors are happening because the VLAN filters are
tracked internally to the driver and regardless of the VLAN_ALLOWED()
setting the driver tries to delete/re-add them over virtchnl.
Fix the delete failure by making sure to delete any VLAN filter tracking
in the driver when a removal request is made, while preventing the
virtchnl request. This makes it so the driver's VLAN list is up to date
and the errors are
Fix the add failure by making sure the check for VLAN_ALLOWED() during
reset is done after the VF receives its capability list from the PF via
VIRTCHNL_OP_GET_VF_RESOURCES. If VLAN functionality is not allowed, then
prevent requesting re-adding the filters over virtchnl.
When moving the interface into a network namespace we see the following
errors after the VIRTCHNL_VF_OFFLOAD_VLAN capability was taken away from
the VF:
iavf 0000:51:09.1 enp81s0f1v1: NIC Link is Up Speed is 25 Gbps Full Duplex
iavf 0000:51:09.1 temp_27: renamed from enp81s0f1v1
iavf 0000:51:09.1 mgmt: renamed from temp_27
iavf 0000:51:09.1 dev27: set_features() failed (-22); wanted 0x020190001fd54833, left 0x020190001fd54bb3
These errors are happening because we aren't correctly updating the
netdev capabilities and dealing with ndo_fix_features() and
ndo_set_features() correctly.
Fix this by only reporting errors in the driver's ndo_set_features()
callback when VIRTCHNL_VF_OFFLOAD_VLAN is not allowed and any attempt to
enable the VLAN features is made. Also, make sure to disable VLAN
insertion, filtering, and stripping since the VIRTCHNL_VF_OFFLOAD_VLAN
flag applies to all of them and not just VLAN stripping.
Also, after we process the capabilities in the VF reset path, make sure
to call netdev_update_features() in case the capabilities have changed
in order to update the netdev's feature set to match the VF's actual
capabilities.
Lastly, make sure to always report success on VLAN filter delete when
VIRTCHNL_VF_OFFLOAD_VLAN is not supported. The changed flow in
iavf_del_vlans() allows the stack to delete previosly existing VLAN
filters even if VLAN filtering is not allowed. This makes it so the VLAN
filter list is up to date.
Fixes: 8774370d268f ("i40e/i40evf: support for VF VLAN tag stripping control")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3b5bdd18eb76e7570d9bacbcab6828a9b26ae121 ]
Currently iavf adapter statistics are refreshed only in a
watchdog task, triggered approximately every two seconds,
which causes some ethtool requests to return outdated values.
Add explicit statistics refresh when requested by ethtool -S.
Fixes: b476b0030e61 ("iavf: Move commands processing to the separate function")
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e792779e6b639c182df91b46ac1e5803460b0b15 ]
Resolve being able to change static values on VF when adaptive interrupt
moderation is enabled.
This problem is fixed by checking the interrupt settings is not
a combination of change of static value while adaptive interrupt
moderation is turned on.
Without this fix, the user would be able to change static values
on VF with adaptive moderation enabled.
Fixes: 65e87c0398f5 ("i40evf: support queue-specific settings for interrupt moderation")
Signed-off-by: Nitesh B Venkatesh <nitesh.b.venkatesh@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|