Age | Commit message (Collapse) | Author | Files | Lines |
|
encapsulation
commit c65d638ab39034cbaa36773b980d28106cfc81fa upstream.
Current code wrongly uses the skb->protocol field which reflects the
outer l3 protocol to set the inner l3 type in Software Parser (SWP)
fields settings in the ethernet segment (eseg) in flows where inner
l3 exists like in Vxlan over ESP flow, the above method wrongly use
the outer protocol type instead of the inner one. thus breaking cases
where inner and outer headers have different protocols.
Fix by setting the inner l3 type in SWP according to the inner l3 ip
header version.
Fixes: 2ac9cfe78223 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e2dabc4f7e7b60299c20a36d6a7b24ed9bf8e572 upstream.
In qlcnic_83xx_add_rings(), the indirect function of
ahw->hw_ops->alloc_mbx_args will be called to allocate memory for
cmd.req.arg, and there is a dereference of it in qlcnic_83xx_add_rings(),
which could lead to a NULL pointer dereference on failure of the
indirect function like qlcnic_83xx_alloc_mbx_args().
Fix this bug by adding a check of alloc_mbx_args(), this patch
imitates the logic of mbx_cmd()'s failure handling.
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_QLCNIC=m show no new warnings, and our
static analyzer no longer warns about this code.
Fixes: 7f9664525f9c ("qlcnic: 83xx memory map and HW access routine")
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
Link: https://lore.kernel.org/r/20211130110848.109026-1-zhou1615@umn.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b0f38e15979fa8851e88e8aa371367f264e7b6e9 upstream.
Fix section mismatch warnings in xtsonic. The first one appears to be
bogus and after fixing the second one, the first one is gone.
WARNING: modpost: vmlinux.o(.text+0x529adc): Section mismatch in reference from the function sonic_get_stats() to the function .init.text:set_reset_devices()
The function sonic_get_stats() references
the function __init set_reset_devices().
This is often because sonic_get_stats lacks a __init
annotation or the annotation of set_reset_devices is wrong.
WARNING: modpost: vmlinux.o(.text+0x529b3b): Section mismatch in reference from the function xtsonic_probe() to the function .init.text:sonic_probe1()
The function xtsonic_probe() references
the function __init sonic_probe1().
This is often because xtsonic_probe lacks a __init
annotation or the annotation of sonic_probe1 is wrong.
Fixes: 74f2a5f0ef64 ("xtensa: Add support for the Sonic Ethernet device for the XT2000 board.")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Finn Thain <fthain@telegraphics.com.au>
Cc: Chris Zankel <chris@zankel.net>
Cc: linux-xtensa@linux-xtensa.org
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Link: https://lore.kernel.org/r/20211130063947.7529-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
type3_infoblock()
[ Upstream commit 0fa68da72c3be09e06dd833258ee89c33374195f ]
The definition of macro MOTO_SROM_BUG is:
#define MOTO_SROM_BUG (lp->active == 8 && (get_unaligned_le32(
dev->dev_addr) & 0x00ffffff) == 0x3e0008)
and the if statement
if (MOTO_SROM_BUG) lp->active = 0;
using this macro indicates lp->active could be 8. If lp->active is 8 and
the second comparison of this macro is false. lp->active will remain 8 in:
lp->phy[lp->active].gep = (*p ? p : NULL); p += (2 * (*p) + 1);
lp->phy[lp->active].rst = (*p ? p : NULL); p += (2 * (*p) + 1);
lp->phy[lp->active].mc = get_unaligned_le16(p); p += 2;
lp->phy[lp->active].ana = get_unaligned_le16(p); p += 2;
lp->phy[lp->active].fdx = get_unaligned_le16(p); p += 2;
lp->phy[lp->active].ttm = get_unaligned_le16(p); p += 2;
lp->phy[lp->active].mci = *p;
However, the length of array lp->phy is 8, so array overflows can occur.
To fix these possible array overflows, we first check lp->active and then
return -EINVAL if it is greater or equal to ARRAY_SIZE(lp->phy) (i.e. 8).
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Teng Qi <starmiku1207184332@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
bound
[ Upstream commit 61217be886b5f7402843677e4be7e7e83de9cb41 ]
In line 5001, if all id in the array 'lp->phy[8]' is not 0, when the
'for' end, the 'k' is 8.
At this time, the array 'lp->phy[8]' may be out of bound.
Signed-off-by: zhangyue <zhangyue1@kylinos.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
hns_dsaf_ge_srst_by_port()
[ Upstream commit a66998e0fbf213d47d02813b9679426129d0d114 ]
The if statement:
if (port >= DSAF_GE_NUM)
return;
limits the value of port less than DSAF_GE_NUM (i.e., 8).
However, if the value of port is 6 or 7, an array overflow could occur:
port_rst_off = dsaf_dev->mac_cb[port]->port_rst_off;
because the length of dsaf_dev->mac_cb is DSAF_MAX_PORT_NUM (i.e., 6).
To fix this possible array overflow, we first check port and if it is
greater than or equal to DSAF_MAX_PORT_NUM, the function returns.
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Teng Qi <starmiku1207184332@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b922f622592af76b57cbc566eaeccda0b31a3496 ]
This bug report shows up when running our research tools. The
reports is SOOB read, but it seems SOOB write is also possible
a few lines below.
In details, fw.len and sw.len are inputs coming from io. A len
over the size of self->rpc triggers SOOB. The patch fixes the
bugs by adding sanity checks.
The bugs are triggerable with compromised/malfunctioning devices.
They are potentially exploitable given they first leak up to
0xffff bytes and able to overwrite the region later.
The patch is tested with QEMU emulater.
This is NOT tested with a real device.
Attached is the log we found by fuzzing.
BUG: KASAN: slab-out-of-bounds in
hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
Read of size 4 at addr ffff888016260b08 by task modprobe/213
CPU: 0 PID: 213 Comm: modprobe Not tainted 5.6.0 #1
Call Trace:
dump_stack+0x76/0xa0
print_address_description.constprop.0+0x16/0x200
? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
__kasan_report.cold+0x37/0x7c
? aq_hw_read_reg_bit+0x60/0x70 [atlantic]
? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
kasan_report+0xe/0x20
hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
hw_atl_utils_fw_rpc_call+0x95/0x130 [atlantic]
hw_atl_utils_fw_rpc_wait+0x176/0x210 [atlantic]
hw_atl_utils_mpi_create+0x229/0x2e0 [atlantic]
? hw_atl_utils_fw_rpc_wait+0x210/0x210 [atlantic]
? hw_atl_utils_initfw+0x9f/0x1c8 [atlantic]
hw_atl_utils_initfw+0x12a/0x1c8 [atlantic]
aq_nic_ndev_register+0x88/0x650 [atlantic]
? aq_nic_ndev_init+0x235/0x3c0 [atlantic]
aq_pci_probe+0x731/0x9b0 [atlantic]
? aq_pci_func_init+0xc0/0xc0 [atlantic]
local_pci_probe+0xd3/0x160
pci_device_probe+0x23f/0x3e0
Reported-by: Brendan Dolan-Gavitt <brendandg@nyu.edu>
Signed-off-by: Zekun Shen <bruceshenzk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c49a35eedfef08bffd46b53c25dbf9d6016a86ff ]
The driver doesn't support RX timestamping for non-PTP packets, but it
declares that it does. Restrict the reported RX filters to PTP v2 over
L2 and over L4.
Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8a075464d1e9317ffae0973dfe538a7511291a06 ]
The ocelot driver, when asked to timestamp all receiving packets, 1588
v1 or NTP, says "nah, here's 1588 v2 for you".
According to this discussion:
https://patchwork.kernel.org/project/netdevbpf/patch/20211104133204.19757-8-martin.kaistra@linutronix.de/#24577647
drivers that downgrade from a wider request to a narrower response (or
even a response where the intersection with the request is empty) are
buggy, and should return -ERANGE instead. This patch fixes that.
Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support")
Suggested-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 82229c4dbb8a2780f05fa1bab29c97ef7bcd21bb ]
Currently, HNS3 driver doesn't clear the reset flags of components after
successfully executing reset, it causes userspace info of
"Components reset" and "Components not reset" is incorrect.
So fix this problem by clear corresponding reset flag after reset process.
Fixes: ddccc5e368a3 ("net: hns3: add support for triggering reset by ethtool")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8d2ad993aa05c0768f00c886c9d369cd97a337ac ]
When PF is set to multi-TCs and configured mapping relationship between
priorities and TCs, the hardware will active these settings for this PF
and its VFs.
In this case when VF just uses one TC and its rx packets contain priority,
and if the priority is not mapped to TC0, as other TCs of VF is not valid,
hardware always put this kind of packets to the queue 0. It cause this kind
of packets of VF can not be used RSS function.
To fix this problem, set tc mode of all unused TCs of VF to the setting of
TC0, then rx packet with priority which map to unused TC will be direct to
TC0.
Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b270bfe697367776eca2e6759a71d700fb8d82a2 ]
The Tx queues were not disabled in situations where the driver needed to
stop the interface to apply a new configuration. This could result in a
kernel panic when doing any of the 3 following actions:
* reconfiguring the number of queues (ethtool -L)
* reconfiguring the size of the ring buffers (ethtool -G)
* installing/removing an XDP program (ip l set dev ethX xdp)
Prevent the panic by making sure netif_tx_disable is called when stopping
an interface.
Without this patch, the following kernel panic can be observed when doing
any of the actions above:
Unable to handle kernel paging request at virtual address ffff80001238d040
[....]
Call trace:
dwmac4_set_addr+0x8/0x10
dev_hard_start_xmit+0xe4/0x1ac
sch_direct_xmit+0xe8/0x39c
__dev_queue_xmit+0x3ec/0xaf0
dev_queue_xmit+0x14/0x20
[...]
[ end trace 0000000000000002 ]---
Fixes: 5fabb01207a2d ("net: stmmac: Add initial XDP support")
Fixes: aa042f60e4961 ("net: stmmac: Add support to Ethtool get/set ring parameters")
Fixes: 0366f7e06a6be ("net: stmmac: add ethtool support for get/set channels")
Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com>
Link: https://lore.kernel.org/r/20211124154731.1676949-1-yannick.vignon@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit eaeace60778e524a2820d0c0ad60bf80289e292c ]
Oleksandr brought a bug report where netpoll causes trace
messages in the log on igb.
Danielle brought this back up as still occurring, so we'll try
again.
[22038.710800] ------------[ cut here ]------------
[22038.710801] igb_poll+0x0/0x1440 [igb] exceeded budget in poll
[22038.710802] WARNING: CPU: 12 PID: 40362 at net/core/netpoll.c:155 netpoll_poll_dev+0x18a/0x1a0
As Alex suggested, change the driver to return work_done at the
exit of napi_poll, which should be safe to do in this driver
because it is not polling multiple queues in this single napi
context (multiple queues attached to one MSI-X vector). Several
other drivers contain the same simple sequence, so I hope
this will not create new problems.
Fixes: 16eb8815c235 ("igb: Refactor clean_rx_irq to reduce overhead and improve performance")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reported-by: Danielle Ratson <danieller@nvidia.com>
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Danielle Ratson <danieller@nvidia.com>
Link: https://lore.kernel.org/r/20211123204000.1597971-1-jesse.brandeburg@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ddb826c2c92d461f290a7bab89e7c28696191875 ]
Usage of phy_ethtool_get_link_ksettings() in the link status change
handler isn't needed, and in combination with the referenced change
it results in a deadlock. Simply remove the call and replace it with
direct access to phydev->speed. The duplex argument of
lan743x_phy_update_flowcontrol() isn't used and can be removed.
Fixes: c10a485c3de5 ("phy: phy_ethtool_ksettings_get: Lock the phy for consistency")
Reported-by: Alessandro B Maurici <abmaurici@gmail.com>
Tested-by: Alessandro B Maurici <abmaurici@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/40e27f76-0ba3-dcef-ee32-a78b9df38b0f@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7b1b62bc1e6a7b2fd5ee7a4296268eb291d23aeb ]
Currently mvpp2_xdp_setup won't allow attaching XDP program if
mtu > ETH_DATA_LEN (1500).
The mvpp2_change_mtu on the other hand checks whether
MVPP2_RX_PKT_SIZE(mtu) > MVPP2_BM_LONG_PKT_SIZE.
These two checks are semantically different.
Moreover this limit can be increased to MVPP2_MAX_RX_BUF_SIZE, since in
mvpp2_rx we have
xdp.data = data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM;
xdp.frame_sz = PAGE_SIZE;
Change the checks to check whether
mtu > MVPP2_MAX_RX_BUF_SIZE
Fixes: 07dd0a7aae7f ("mvpp2: add basic XDP support")
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 63b08b1f6834bbb0b4f7783bf63b80c8c8e9a047 ]
When processing port up/down events generated by the device's firmware,
the driver protects itself from events reported for non-existent local
ports, but not the CPU port (local port 0), which exists, but lacks a
netdev.
This can result in a NULL pointer dereference when calling
netif_carrier_{on,off}().
Fix this by bailing early when processing an event reported for the CPU
port. Problem was only observed when running on top of a buggy emulator.
Fixes: 28b1987ef506 ("mlxsw: spectrum: Register CPU port with devlink")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f65ee535df775a13a1046c0a0b2d72db342f8a5b ]
Ice driver has the routines for managing XDP resources that are shared
between ndo_bpf op and VSI rebuild flow. The latter takes place for
example when user changes queue count on an interface via ethtool's
set_channels().
There is an issue around the bpf_prog refcounting when VSI is being
rebuilt - since ice_prepare_xdp_rings() is called with vsi->xdp_prog as
an argument that is used later on by ice_vsi_assign_bpf_prog(), same
bpf_prog pointers are swapped with each other. Then it is also
interpreted as an 'old_prog' which in turn causes us to call
bpf_prog_put on it that will decrement its refcount.
Below splat can be interpreted in a way that due to zero refcount of a
bpf_prog it is wiped out from the system while kernel still tries to
refer to it:
[ 481.069429] BUG: unable to handle page fault for address: ffffc9000640f038
[ 481.077390] #PF: supervisor read access in kernel mode
[ 481.083335] #PF: error_code(0x0000) - not-present page
[ 481.089276] PGD 100000067 P4D 100000067 PUD 1001cb067 PMD 106d2b067 PTE 0
[ 481.097141] Oops: 0000 [#1] PREEMPT SMP PTI
[ 481.101980] CPU: 12 PID: 3339 Comm: sudo Tainted: G OE 5.15.0-rc5+ #1
[ 481.110840] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016
[ 481.122021] RIP: 0010:dev_xdp_prog_id+0x25/0x40
[ 481.127265] Code: 80 00 00 00 00 0f 1f 44 00 00 89 f6 48 c1 e6 04 48 01 fe 48 8b 86 98 08 00 00 48 85 c0 74 13 48 8b 50 18 31 c0 48 85 d2 74 07 <48> 8b 42 38 8b 40 20 c3 48 8b 96 90 08 00 00 eb e8 66 2e 0f 1f 84
[ 481.148991] RSP: 0018:ffffc90007b63868 EFLAGS: 00010286
[ 481.155034] RAX: 0000000000000000 RBX: ffff889080824000 RCX: 0000000000000000
[ 481.163278] RDX: ffffc9000640f000 RSI: ffff889080824010 RDI: ffff889080824000
[ 481.171527] RBP: ffff888107af7d00 R08: 0000000000000000 R09: ffff88810db5f6e0
[ 481.179776] R10: 0000000000000000 R11: ffff8890885b9988 R12: ffff88810db5f4bc
[ 481.188026] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 481.196276] FS: 00007f5466d5bec0(0000) GS:ffff88903fb00000(0000) knlGS:0000000000000000
[ 481.205633] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 481.212279] CR2: ffffc9000640f038 CR3: 000000014429c006 CR4: 00000000003706e0
[ 481.220530] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 481.228771] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 481.237029] Call Trace:
[ 481.239856] rtnl_fill_ifinfo+0x768/0x12e0
[ 481.244602] rtnl_dump_ifinfo+0x525/0x650
[ 481.249246] ? __alloc_skb+0xa5/0x280
[ 481.253484] netlink_dump+0x168/0x3c0
[ 481.257725] netlink_recvmsg+0x21e/0x3e0
[ 481.262263] ____sys_recvmsg+0x87/0x170
[ 481.266707] ? __might_fault+0x20/0x30
[ 481.271046] ? _copy_from_user+0x66/0xa0
[ 481.275591] ? iovec_from_user+0xf6/0x1c0
[ 481.280226] ___sys_recvmsg+0x82/0x100
[ 481.284566] ? sock_sendmsg+0x5e/0x60
[ 481.288791] ? __sys_sendto+0xee/0x150
[ 481.293129] __sys_recvmsg+0x56/0xa0
[ 481.297267] do_syscall_64+0x3b/0xc0
[ 481.301395] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 481.307238] RIP: 0033:0x7f5466f39617
[ 481.311373] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb bd 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2f 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[ 481.342944] RSP: 002b:00007ffedc7f4308 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
[ 481.361783] RAX: ffffffffffffffda RBX: 00007ffedc7f5460 RCX: 00007f5466f39617
[ 481.380278] RDX: 0000000000000000 RSI: 00007ffedc7f5360 RDI: 0000000000000003
[ 481.398500] RBP: 00007ffedc7f53f0 R08: 0000000000000000 R09: 000055d556f04d50
[ 481.416463] R10: 0000000000000077 R11: 0000000000000246 R12: 00007ffedc7f5360
[ 481.434131] R13: 00007ffedc7f5350 R14: 00007ffedc7f5344 R15: 0000000000000e98
[ 481.451520] Modules linked in: ice(OE) af_packet binfmt_misc nls_iso8859_1 ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mxm_wmi mei_me coretemp mei ipmi_si ipmi_msghandler wmi acpi_pad acpi_power_meter ip_tables x_tables autofs4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel ahci crypto_simd cryptd libahci lpc_ich [last unloaded: ice]
[ 481.528558] CR2: ffffc9000640f038
[ 481.542041] ---[ end trace d1f24c9ecf5b61c1 ]---
Fix this by only calling ice_vsi_assign_bpf_prog() inside
ice_prepare_xdp_rings() when current vsi->xdp_prog pointer is NULL.
This way set_channels() flow will not attempt to swap the vsi->xdp_prog
pointers with itself.
Also, sprinkle around some comments that provide a reasoning about
correlation between driver and kernel in terms of bpf_prog refcount.
Fixes: efc2214b6047 ("ice: Add support for XDP")
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com>
Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 792b2086584f25d84081a526beee80d103c2a913 ]
The approach of having XDP queue per CPU regardless of user's setting
exposed a hidden bug that could occur in case when Rx queue count differ
from Tx queue count. Currently vsi->txq_map's size is equal to the
doubled vsi->alloc_txq, which is not correct due to the fact that XDP
rings were previously based on the Rx queue count. Below splat can be
seen when ethtool -L is used and XDP rings are configured:
[ 682.875339] BUG: kernel NULL pointer dereference, address: 000000000000000f
[ 682.883403] #PF: supervisor read access in kernel mode
[ 682.889345] #PF: error_code(0x0000) - not-present page
[ 682.895289] PGD 0 P4D 0
[ 682.898218] Oops: 0000 [#1] PREEMPT SMP PTI
[ 682.903055] CPU: 42 PID: 2878 Comm: ethtool Tainted: G OE 5.15.0-rc5+ #1
[ 682.912214] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016
[ 682.923380] RIP: 0010:devres_remove+0x44/0x130
[ 682.928527] Code: 49 89 f4 55 48 89 fd 4c 89 ff 53 48 83 ec 10 e8 92 b9 49 00 48 8b 9d a8 02 00 00 48 8d 8d a0 02 00 00 49 89 c2 48 39 cb 74 0f <4c> 3b 63 10 74 25 48 8b 5b 08 48 39 cb 75 f1 4c 89 ff 4c 89 d6 e8
[ 682.950237] RSP: 0018:ffffc90006a679f0 EFLAGS: 00010002
[ 682.956285] RAX: 0000000000000286 RBX: ffffffffffffffff RCX: ffff88908343a370
[ 682.964538] RDX: 0000000000000001 RSI: ffffffff81690d60 RDI: 0000000000000000
[ 682.972789] RBP: ffff88908343a0d0 R08: 0000000000000000 R09: 0000000000000000
[ 682.981040] R10: 0000000000000286 R11: 3fffffffffffffff R12: ffffffff81690d60
[ 682.989282] R13: ffffffff81690a00 R14: ffff8890819807a8 R15: ffff88908343a36c
[ 682.997535] FS: 00007f08c7bfa740(0000) GS:ffff88a03fd00000(0000) knlGS:0000000000000000
[ 683.006910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 683.013557] CR2: 000000000000000f CR3: 0000001080a66003 CR4: 00000000003706e0
[ 683.021819] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 683.030075] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 683.038336] Call Trace:
[ 683.041167] devm_kfree+0x33/0x50
[ 683.045004] ice_vsi_free_arrays+0x5e/0xc0 [ice]
[ 683.050380] ice_vsi_rebuild+0x4c8/0x750 [ice]
[ 683.055543] ice_vsi_recfg_qs+0x9a/0x110 [ice]
[ 683.060697] ice_set_channels+0x14f/0x290 [ice]
[ 683.065962] ethnl_set_channels+0x333/0x3f0
[ 683.070807] genl_family_rcv_msg_doit+0xea/0x150
[ 683.076152] genl_rcv_msg+0xde/0x1d0
[ 683.080289] ? channels_prepare_data+0x60/0x60
[ 683.085432] ? genl_get_cmd+0xd0/0xd0
[ 683.089667] netlink_rcv_skb+0x50/0xf0
[ 683.094006] genl_rcv+0x24/0x40
[ 683.097638] netlink_unicast+0x239/0x340
[ 683.102177] netlink_sendmsg+0x22e/0x470
[ 683.106717] sock_sendmsg+0x5e/0x60
[ 683.110756] __sys_sendto+0xee/0x150
[ 683.114894] ? handle_mm_fault+0xd0/0x2a0
[ 683.119535] ? do_user_addr_fault+0x1f3/0x690
[ 683.134173] __x64_sys_sendto+0x25/0x30
[ 683.148231] do_syscall_64+0x3b/0xc0
[ 683.161992] entry_SYSCALL_64_after_hwframe+0x44/0xae
Fix this by taking into account the value that num_possible_cpus()
yields in addition to vsi->alloc_txq instead of doubling the latter.
Fixes: efc2214b6047 ("ice: Add support for XDP")
Fixes: 22bf877e528f ("ice: introduce XDP_TX fallback path")
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a6da2bbb0005e6b4909472962c9d0af29e75dd06 ]
Currently, when user space emits SIOCSHWTSTAMP ioctl calls such as
enabling/disabling timestamping or changing filter settings, the driver
reads the current CLOCK_REALTIME value and programming this into the
NIC's hardware clock. This might be necessary during system
initialization, but at runtime, when the PTP clock has already been
synchronized to a grandmaster, a reset of the timestamp settings might
result in a clock jump. Furthermore, if the clock is also controlled by
phc2sys in automatic mode (where the UTC offset is queried from ptp4l),
that UTC-to-TAI offset (currently 37 seconds in 2021) would be
temporarily reset to 0, and it would take a long time for phc2sys to
readjust so that CLOCK_REALTIME and the PHC are apart by 37 seconds
again.
To address the issue, we introduce a new function called
stmmac_init_tstamp_counter(), which gets called during ndo_open().
It contains the code snippet moved from stmmac_hwtstamp_set() that
manages the time synchronization. Besides, the sub second increment
configuration is also moved here since the related values are hardware
dependent and runtime invariant.
Furthermore, the hardware clock must be kept running even when no time
stamping mode is selected in order to retain the synchronized time base.
That way, timestamping can be enabled again at any time only with the
need to compensate the clock's natural drifting.
As a side effect, this patch fixes the issue that ptp_clock_info::enable
can be called before SIOCSHWTSTAMP and the driver (which looks at
priv->systime_flags) was not prepared to handle that ordering.
Fixes: 92ba6888510c ("stmmac: add the support for PTP hw clock driver")
Reported-by: Michael Olbrich <m.olbrich@pengutronix.de>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Holger Assmann <h.assmann@pengutronix.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3bd6b2a838ba6a3b86d41b077f570b1b61174def ]
Use nn->tlv_caps.me_freq_mhz instead of nn->me_freq_mhz to check whether
rx-usecs/tx-usecs is valid.
This is because nn->tlv_caps.me_freq_mhz represents the clock_freq (MHz) of
the flow processing cores (FPC) on the NIC. While nn->me_freq_mhz is not
be set.
Fixes: ce991ab6662a ("nfp: read ME frequency from vNIC ctrl memory")
Signed-off-by: Diana Wang <na.wang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5951a2b9812d8227d33f20d1899fae60e4f72c04 ]
When a VF goes through a reset, it's possible for the VF's feature set
to change. For example it may lose the VIRTCHNL_VF_OFFLOAD_VLAN
capability after VF reset. Unfortunately, the driver doesn't correctly
deal with this situation and errors are seen from downing/upping the
interface and/or moving the interface in/out of a network namespace.
When setting the interface down/up we see the following errors after the
VIRTCHNL_VF_OFFLOAD_VLAN capability was taken away from the VF:
ice 0000:51:00.1: VF 1 failed opcode 12, retval: -64 iavf 0000:51:09.1:
Failed to add VLAN filter, error IAVF_NOT_SUPPORTED ice 0000:51:00.1: VF
1 failed opcode 13, retval: -64 iavf 0000:51:09.1: Failed to delete VLAN
filter, error IAVF_NOT_SUPPORTED
These add/delete errors are happening because the VLAN filters are
tracked internally to the driver and regardless of the VLAN_ALLOWED()
setting the driver tries to delete/re-add them over virtchnl.
Fix the delete failure by making sure to delete any VLAN filter tracking
in the driver when a removal request is made, while preventing the
virtchnl request. This makes it so the driver's VLAN list is up to date
and the errors are
Fix the add failure by making sure the check for VLAN_ALLOWED() during
reset is done after the VF receives its capability list from the PF via
VIRTCHNL_OP_GET_VF_RESOURCES. If VLAN functionality is not allowed, then
prevent requesting re-adding the filters over virtchnl.
When moving the interface into a network namespace we see the following
errors after the VIRTCHNL_VF_OFFLOAD_VLAN capability was taken away from
the VF:
iavf 0000:51:09.1 enp81s0f1v1: NIC Link is Up Speed is 25 Gbps Full Duplex
iavf 0000:51:09.1 temp_27: renamed from enp81s0f1v1
iavf 0000:51:09.1 mgmt: renamed from temp_27
iavf 0000:51:09.1 dev27: set_features() failed (-22); wanted 0x020190001fd54833, left 0x020190001fd54bb3
These errors are happening because we aren't correctly updating the
netdev capabilities and dealing with ndo_fix_features() and
ndo_set_features() correctly.
Fix this by only reporting errors in the driver's ndo_set_features()
callback when VIRTCHNL_VF_OFFLOAD_VLAN is not allowed and any attempt to
enable the VLAN features is made. Also, make sure to disable VLAN
insertion, filtering, and stripping since the VIRTCHNL_VF_OFFLOAD_VLAN
flag applies to all of them and not just VLAN stripping.
Also, after we process the capabilities in the VF reset path, make sure
to call netdev_update_features() in case the capabilities have changed
in order to update the netdev's feature set to match the VF's actual
capabilities.
Lastly, make sure to always report success on VLAN filter delete when
VIRTCHNL_VF_OFFLOAD_VLAN is not supported. The changed flow in
iavf_del_vlans() allows the stack to delete previosly existing VLAN
filters even if VLAN filtering is not allowed. This makes it so the VLAN
filter list is up to date.
Fixes: 8774370d268f ("i40e/i40evf: support for VF VLAN tag stripping control")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3b5bdd18eb76e7570d9bacbcab6828a9b26ae121 ]
Currently iavf adapter statistics are refreshed only in a
watchdog task, triggered approximately every two seconds,
which causes some ethtool requests to return outdated values.
Add explicit statistics refresh when requested by ethtool -S.
Fixes: b476b0030e61 ("iavf: Move commands processing to the separate function")
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e792779e6b639c182df91b46ac1e5803460b0b15 ]
Resolve being able to change static values on VF when adaptive interrupt
moderation is enabled.
This problem is fixed by checking the interrupt settings is not
a combination of change of static value while adaptive interrupt
moderation is turned on.
Without this fix, the user would be able to change static values
on VF with adaptive moderation enabled.
Fixes: 65e87c0398f5 ("i40evf: support queue-specific settings for interrupt moderation")
Signed-off-by: Nitesh B Venkatesh <nitesh.b.venkatesh@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e8d032507cb7912baf1d3e0af54516f823befefd ]
fix error path handling in prestera_bridge_port_join() that
cases prestera driver to crash (see below).
Trace:
Internal error: Oops: 96000044 [#1] SMP
Modules linked in: prestera_pci prestera uio_pdrv_genirq
CPU: 1 PID: 881 Comm: ip Not tainted 5.15.0 #1
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : prestera_bridge_destroy+0x2c/0xb0 [prestera]
lr : prestera_bridge_port_join+0x2cc/0x350 [prestera]
sp : ffff800011a1b0f0
...
x2 : ffff000109ca6c80 x1 : dead000000000100 x0 : dead000000000122
Call trace:
prestera_bridge_destroy+0x2c/0xb0 [prestera]
prestera_bridge_port_join+0x2cc/0x350 [prestera]
prestera_netdev_port_event.constprop.0+0x3c4/0x450 [prestera]
prestera_netdev_event_handler+0xf4/0x110 [prestera]
raw_notifier_call_chain+0x54/0x80
call_netdevice_notifiers_info+0x54/0xa0
__netdev_upper_dev_link+0x19c/0x380
Fixes: e1189d9a5fbe ("net: marvell: prestera: Add Switchdev driver implementation")
Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 253e9b4d11e577bb8cbc77ef68a9ff46438065ca ]
Return NOTIFY_DONE (dont't care) for switchdev notifications
that prestera driver don't know how to handle them.
With introduction of SWITCHDEV_BRPORT_[UN]OFFLOADED switchdev
events, the driver rejects adding swport to bridge operation
which is handled by prestera_bridge_port_join() func. The root
cause of this is that prestera driver returns error (EOPNOTSUPP)
in prestera_switchdev_blk_event() handler for unknown swdev
events. This causes switchdev_bridge_port_offload() to fail
when adding port to bridge in prestera_bridge_port_join().
Fixes: 957e2235e526 ("net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge")
Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 2ff04286a9569675948f39cec2c6ad47c3584633 upstream.
PF pointer is always valid when PCI core calls its .shutdown() and
.remove() callbacks. There is no need to check it again.
Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1a8c7778bcde5981463a5b9f9b2caa44a327ff93 upstream.
When a VF requests promiscuous mode and it's trusted and true promiscuous
mode is enabled the PF driver attempts to enable unicast and/or
multicast promiscuous mode filters based on the request. This is fine,
but there are a couple issues with the current code.
[1] The define to configure the unicast promiscuous mode mask also
includes bits to configure the multicast promiscuous mode mask, which
causes multicast to be set/cleared unintentionally.
[2] All 4 cases for enable/disable unicast/multicast mode are not
handled in the promiscuous mode message handler, which causes
unexpected results regarding the current promiscuous mode settings.
To fix [1] make sure any promiscuous mask defines include the correct
bits for each of the promiscuous modes.
To fix [2] make sure that all 4 cases are handled since there are 2 bits
(FLAG_VF_UNICAST_PROMISC and FLAG_VF_MULTICAST_PROMISC) that can be
either set or cleared. Also, since either unicast and/or multicast
promiscuous configuration can fail, introduce two separate error values
to handle each of these cases.
Fixes: 01b5e89aab49 ("ice: Add VF promiscuous support")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3751c3d34cd5a750c86d1c8eaf217d8faf7f9325 upstream.
The recent addition of timestamp correction to compensate the CDC error
introduced a subtle signed/unsigned bug in stmmac_get_tx_hwtstamp() while
it managed for some obscure reason to avoid that in stmmac_get_rx_hwtstamp().
The issue is:
s64 adjust = 0;
u64 ns;
adjust += -(2 * (NSEC_PER_SEC / priv->plat->clk_ptp_rate));
ns += adjust;
works by chance on 64bit, but falls apart on 32bit because the compiler
knows that adjust fits into 32bit and then treats the addition as a u64 +
u32 resulting in an off by ~2 seconds failure.
The RX variant uses an u64 for adjust and does the adjustment via
ns -= adjust;
because consistency is obviously overrated.
Get rid of the pointless zero initialized adjust variable and do:
ns -= (2 * NSEC_PER_SEC) / priv->plat->clk_ptp_rate;
which is obviously correct and spares the adjust obfuscation. Aside of that
it yields a more accurate result because the multiplication takes place
before the integer divide truncation and not afterwards.
Stick the calculation into an inline so it can't be accidentally
disimproved. Return an u32 from that inline as the result is guaranteed
to fit which lets the compiler optimize the substraction.
Cc: stable@vger.kernel.org
Fixes: 3600be5f58c1 ("net: stmmac: add timestamp correction to rid CDC sync error")
Reported-by: Benedikt Spranger <b.spranger@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Benedikt Spranger <b.spranger@linutronix.de>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de> # Intel EHL
Link: https://lore.kernel.org/r/87mtm578cs.ffs@tglx
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9119570039481d56350af1c636f040fb300b8cf3 upstream.
According to upstream commit 5ec55823438e("net: stmmac:
add clocks management for gmac driver"), it improve clocks
management for stmmac driver. So, it is necessary to implement
the runtime callback in dwmac-socfpga driver because it doesn't
use the common stmmac_pltfr_pm_ops instance. Otherwise, clocks
are not disabled when system enters suspend status.
Fixes: 5ec55823438e ("net: stmmac: add clocks management for gmac driver")
Cc: stable@vger.kernel.org
Signed-off-by: Meng Li <Meng.Li@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5d2ca2e12dfb2aff3388ca57b06f570fa6206ced ]
As reported in [1], e100 was no longer working for suspend/resume
cycles. The previous commit mentioned in the fixes appears to have
broken things and this attempts to practice best known methods for
device power management and keep wake-up working while allowing
suspend/resume to work. To do this, I reorder a little bit of code
and fix the resume path to make sure the device is enabled.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=214933
Fixes: 69a74aef8a18 ("e100: use generic power management")
Cc: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Reported-by: Alexey Kuznetsov <axet@me.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Alexey Kuznetsov <axet@me.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5aff430d4e33a0b48a6b3d5beb06f79da23f9916 ]
Fix misleading display error in dmesg if tc filter return fail.
Only i40e status error code should be converted to string, not linux
error code. Otherwise, we return false information about the error.
Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2e6d218c1ec6fb9cd70693b78134cbc35ae0b5a9 ]
Reject TCs creation with proper message if the first queue
assignment is not equal to the power of two.
The first queue number was checked too late in the second queue
iteration, if second queue was configured at all. Now if first queue value
is not a power of two, then trying to create qdisc will be rejected.
Fixes: 8f88b3034db3 ("i40e: Add infrastructure for queue channel support")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3a3b311e3881172fc8e019b6508f04bc40c92d9d ]
Restore part of reset functionality used when reset is called
from the VF to reset itself. Without this fix warning message
is displayed when VF is being removed via sysfs.
Fix the crash of the VF during reset by ensuring
that the PF receives the reset message successfully.
Refactor code to use one function instead of two.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9e0a603cb7dce2a19d98116d42de84b6db26d716 ]
Properly reconfigure VF VSIs after VF request ADQ.
Created new function to update queue mapping and queue pairs per TC
with AQ update VSI. This sets proper RSS size on NIC.
VFs num_queue_pairs should not be changed during setup of queue maps.
Previously, VF main VSI in ADQ had configured too many queues and had
wrong RSS size, which lead to packets not being consumed and drops in
connectivity.
Fixes: bc6d33c8d93f ("i40e: Fix the number of queues available to be mapped for use")
Co-developed-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Eryk Rybak <eryk.roch.rybak@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d2a69fefd75683004ffe87166de5635b3267ee07 ]
Currently, the i40e_vsi_setup_queue_map is basing the count of queues in
TCs on a VSI's alloc_queue_pairs member which is not changed throughout
any user's action (for example via ethtool's set_channels callback).
This implies that vsi->tc_config.tc_info[n].qcount value that is given
to the kernel via netdev_set_tc_queue() that notifies about the count of
queues per particular traffic class is constant even if user has changed
the total count of queues.
This in turn caused the kernel warning after setting the queue count to
the lower value than the initial one:
$ ethtool -l ens801f0
Channel parameters for ens801f0:
Pre-set maximums:
RX: 0
TX: 0
Other: 1
Combined: 64
Current hardware settings:
RX: 0
TX: 0
Other: 1
Combined: 64
$ ethtool -L ens801f0 combined 40
[dmesg]
Number of in use tx queues changed invalidating tc mappings. Priority
traffic classification disabled!
Reason was that vsi->alloc_queue_pairs stayed at 64 value which was used
to set the qcount on TC0 (by default only TC0 exists so all of the
existing queues are assigned to TC0). we update the offset/qcount via
netdev_set_tc_queue() back to the old value but then the
netif_set_real_num_tx_queues() is using the vsi->num_queue_pairs as a
value which got set to 40.
Fix it by using vsi->req_queue_pairs as a queue count that will be
distributed across TCs. Do it only for non-zero values, which implies
that user actually requested the new count of queues.
For VSIs other than main, stay with the vsi->alloc_queue_pairs as we
only allow manipulating the queue count on main VSI.
Fixes: bc6d33c8d93f ("i40e: Fix the number of queues available to be mapped for use")
Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Co-developed-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Eryk Rybak <eryk.roch.rybak@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 37d9e304acd903a445df8208b8a13d707902dea6 ]
Remove the reason of null pointer dereference in sync VSI filters.
Added new I40E_VSI_RELEASING flag to signalize deleting and releasing
of VSI resources to sync this thread with sync filters subtask.
Without this patch it is possible to start update the VSI filter list
after VSI is removed, that's causing a kernel oops.
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com>
Reviewed-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Reviewed-by: Witold Fijalkowski <witoldx.fijalkowski@intel.com>
Reviewed-by: Jaroslaw Gawin <jaroslawx.gawin@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6afbd7b3c53cb7417189f476e99d431daccb85b0 ]
Setting VLAN port increasing RX queue max_pkt_size
by 4 bytes to take VLAN tag into account.
Trigger the VF reset when setting port VLAN for
VF to renegotiate its capabilities and reinitialize.
Fixes: ba4e003d29c1 ("i40e: don't hold spinlock while resetting VF")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Eryk Rybak <eryk.roch.rybak@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9b5a333272a48c2f8b30add7a874e46e8b26129c ]
Access to netdev after free_netdev() will cause use-after-free bug.
Move debug log before free_netdev() call to avoid it.
Fixes: 7472dd9f6499 ("staging: fsl-dpaa2/eth: Move print message")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2460386bef0b9b98b71728d3c173e15558b78d82 ]
The kernel test robot reported a following issue:
>> drivers/net/ethernet/marvell/mvmdio.c:426:36: warning:
unused variable 'orion_mdio_acpi_match' [-Wunused-const-variable]
static const struct acpi_device_id orion_mdio_acpi_match[] = {
^
1 warning generated.
Fix that by surrounding the variable by appropriate ifdef.
Fixes: c54da4c1acb1 ("net: mvmdio: add ACPI support")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20211115153024.209083-1-mw@semihalf.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c4c3176739dfa6efcc5b1d1de4b3fd2b51b048c7 ]
On regular ConnectX HCAs getting encap mode isn't supported when the
E-Switch is in NONE mode. Current code would return no error code when
trying to get encap mode in such case which is wrong.
Fix by returning error value to indicate failure to caller in such case.
Fixes: 8e0aa4bc959c ("net/mlx5: E-switch, Protect eswitch mode changes")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ae396d85c01c7bdc9eeceecde1f493d03f793465 ]
Currently, In NETDEV_CHANGELOWERSTATE/NETDEV_CHANGEUPPERSTATE events
handling, tracking is not fully completed if the LAG device is not ready
at the time the events occur. But, we must keep track of the upper and
lower states after receiving the events because RoCE needs this info in
mlx5_lag_get_roce_netdev() - in order to return the corresponding port
that its running on. Returning the wrong (not most recent) port will lead
to gids table being incorrect.
For example: If during the attachment of a slave to the bond, the other
non-attached port performs pci_reload, then the LAG device is not ready,
but that should not result in dismissing attached slave tracker update
automatically (which is performed in mlx5_handle_changelowerstate()), Since
these events might not come later, which can lead to both bond ports
having tx_enabled=0 - which is not a valid state of LAG bond.
Fixes: 9b412cc35f00 ("net/mlx5e: Add LAG warning if bond slave is not lag master")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 806401c20a0f9c51b6c8fd7035671e6ca841f6c2 ]
CT clear action offload adds additional mod hdr actions to the
flow's original mod actions in order to clear the registers which
hold ct_state.
When such flow also includes encap action, a neigh update event
can cause the driver to unoffload the flow and then reoffload it.
Each time this happens, the ct clear handling adds that same set
of mod hdr actions to reset ct_state until the max of mod hdr
actions is reached.
Also the driver never releases the allocated mod hdr actions and
causing a memleak.
Fix above two issues by moving CT clear mod acts allocation
into the parsing actions phase and only use it when offloading the rule.
The release of mod acts will be done in the normal flow_put().
backtrace:
[<000000007316e2f3>] krealloc+0x83/0xd0
[<00000000ef157de1>] mlx5e_mod_hdr_alloc+0x147/0x300 [mlx5_core]
[<00000000970ce4ae>] mlx5e_tc_match_to_reg_set_and_get_id+0xd7/0x240 [mlx5_core]
[<0000000067c5fa17>] mlx5e_tc_match_to_reg_set+0xa/0x20 [mlx5_core]
[<00000000d032eb98>] mlx5_tc_ct_entry_set_registers.isra.0+0x36/0xc0 [mlx5_core]
[<00000000fd23b869>] mlx5_tc_ct_flow_offload+0x272/0x1f10 [mlx5_core]
[<000000004fc24acc>] mlx5e_tc_offload_fdb_rules.part.0+0x150/0x620 [mlx5_core]
[<00000000dc741c17>] mlx5e_tc_encap_flows_add+0x489/0x690 [mlx5_core]
[<00000000e92e49d7>] mlx5e_rep_update_flows+0x6e4/0x9b0 [mlx5_core]
[<00000000f60f5602>] mlx5e_rep_neigh_update+0x39a/0x5d0 [mlx5_core]
Fixes: 1ef3018f5af3 ("net/mlx5e: CT: Support clear action")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2eb0cb31bc4ce2ede5460cf3ef433b40cf5f040d ]
A user can enable VFs without changing E-Switch mode, this can happen
when a user moves straight to switchdev mode and only once in switchdev
VFs are enabled via the sysfs interface.
The cited commit assumed this isn't possible and exposed a single
API function where the E-switch calls into the lag code, breaks the lag
and prevents any other lag operations to take place until the
E-switch update has ended.
Breaking the hardware lag when it isn't needed can make it such that
hardware lag can't be enabled again.
In the sysfs call path check if the current E-Switch mode is NONE,
in the context of the function it can only mean the E-Switch is moving
out of NONE mode and the hardware lag should be disabled and enabled
once the mode change has ended. If the mode isn't NONE it means
VFs are about to be enabled and such operation doesn't require
toggling the hardware lag.
Fixes: cac1eb2cf2e3 ("net/mlx5: Lag, properly lock eswitch if needed")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ba50cd9451f6c49cf0841c0a4a146ff6a2822699 ]
In the fast unload flow, the device state is set to internal error,
which indicates that the driver started the destroy process.
In this case, when a destroy command is being executed, it should return
MLX5_CMD_STAT_OK.
Fix MLX5_CMD_OP_DESTROY_UCTX and MLX5_CMD_OP_DESTROY_UMEM to return OK
instead of EIO.
This fixes a call trace in the umem release process -
[ 2633.536695] Call Trace:
[ 2633.537518] ib_uverbs_remove_one+0xc3/0x140 [ib_uverbs]
[ 2633.538596] remove_client_context+0x8b/0xd0 [ib_core]
[ 2633.539641] disable_device+0x8c/0x130 [ib_core]
[ 2633.540615] __ib_unregister_device+0x35/0xa0 [ib_core]
[ 2633.541640] ib_unregister_device+0x21/0x30 [ib_core]
[ 2633.542663] __mlx5_ib_remove+0x38/0x90 [mlx5_ib]
[ 2633.543640] auxiliary_bus_remove+0x1e/0x30 [auxiliary]
[ 2633.544661] device_release_driver_internal+0x103/0x1f0
[ 2633.545679] bus_remove_device+0xf7/0x170
[ 2633.546640] device_del+0x181/0x410
[ 2633.547606] mlx5_rescan_drivers_locked.part.10+0x63/0x160 [mlx5_core]
[ 2633.548777] mlx5_unregister_device+0x27/0x40 [mlx5_core]
[ 2633.549841] mlx5_uninit_one+0x21/0xc0 [mlx5_core]
[ 2633.550864] remove_one+0x69/0xe0 [mlx5_core]
[ 2633.551819] pci_device_remove+0x3b/0xc0
[ 2633.552731] device_release_driver_internal+0x103/0x1f0
[ 2633.553746] unbind_store+0xf6/0x130
[ 2633.554657] kernfs_fop_write+0x116/0x190
[ 2633.555567] vfs_write+0xa5/0x1a0
[ 2633.556407] ksys_write+0x4f/0xb0
[ 2633.557233] do_syscall_64+0x5b/0x1a0
[ 2633.558071] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 2633.559018] RIP: 0033:0x7f9977132648
[ 2633.559821] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[ 2633.562332] RSP: 002b:00007fffb1a83888 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 2633.563472] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f9977132648
[ 2633.564541] RDX: 000000000000000c RSI: 000055b90546e230 RDI: 0000000000000001
[ 2633.565596] RBP: 000055b90546e230 R08: 00007f9977406860 R09: 00007f9977a54740
[ 2633.566653] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f99774056e0
[ 2633.567692] R13: 000000000000000c R14: 00007f9977400880 R15: 000000000000000c
[ 2633.568725] ---[ end trace 10b4fe52945e544d ]---
Fixes: 6a6fabbfa3e8 ("net/mlx5: Update pci error handler entries and command translation")
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 76ded29d3fcda4928da8849ffc446ea46871c1c2 ]
Prior to this patch in case mlx5_core_destroy_cq() failed it proceeds
to rest of destroy operations. mlx5_core_destroy_cq() could be called again
by user and cause additional call of mlx5_debug_cq_remove().
cq->dbg was not nullify in previous call and cause the crash.
Fix it by nullify cq->dbg pointer after removal.
Also proceed to destroy operations only if FW return 0
for MLX5_CMD_OP_DESTROY_CQ command.
general protection fault, probably for non-canonical address 0x2000300004058: 0000 [#1] SMP PTI
CPU: 5 PID: 1228 Comm: python Not tainted 5.15.0-rc5_for_upstream_min_debug_2021_10_14_11_06 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:lockref_get+0x1/0x60
Code: 5d e9 53 ff ff ff 48 8d 7f 70 e8 0a 2e 48 00 c7 85 d0 00 00 00 02
00 00 00 c6 45 70 00 fb 5d c3 c3 cc cc cc cc cc cc cc cc 53 <48> 8b 17
48 89 fb 85 d2 75 3d 48 89 d0 bf 64 00 00 00 48 89 c1 48
RSP: 0018:ffff888137dd7a38 EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff888107d5f458 RCX: 00000000fffffffe
RDX: 000000000002c2b0 RSI: ffffffff8155e2e0 RDI: 0002000300004058
RBP: ffff888137dd7a88 R08: 0002000300004058 R09: ffff8881144a9f88
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881141d4000
R13: ffff888137dd7c68 R14: ffff888137dd7d58 R15: ffff888137dd7cc0
FS: 00007f4644f2a4c0(0000) GS:ffff8887a2d40000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055b4500f4380 CR3: 0000000114f7a003 CR4: 0000000000170ea0
Call Trace:
simple_recursive_removal+0x33/0x2e0
? debugfs_remove+0x60/0x60
debugfs_remove+0x40/0x60
mlx5_debug_cq_remove+0x32/0x70 [mlx5_core]
mlx5_core_destroy_cq+0x41/0x1d0 [mlx5_core]
devx_obj_cleanup+0x151/0x330 [mlx5_ib]
? __pollwait+0xd0/0xd0
? xas_load+0x5/0x70
? xa_load+0x62/0xa0
destroy_hw_idr_uobject+0x20/0x80 [ib_uverbs]
uverbs_destroy_uobject+0x3b/0x360 [ib_uverbs]
uobj_destroy+0x54/0xa0 [ib_uverbs]
ib_uverbs_cmd_verbs+0xaf2/0x1160 [ib_uverbs]
? uverbs_finalize_object+0xd0/0xd0 [ib_uverbs]
ib_uverbs_ioctl+0xc4/0x1b0 [ib_uverbs]
__x64_sys_ioctl+0x3e4/0x8e0
Fixes: 94b960b9deff ("net/mlx5e: Fix memory leak in mlx5_core_destroy_cq() error path")
Signed-off-by: Valentine Fatiev <valentinef@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d7751d6476185ff754b9dad2cba0c0a6e43ecadc ]
E-Switch encap mode is relevant only when in switchdev mode.
The RDMA driver can query the encap configuration via
mlx5_eswitch_get_encap_mode(). Make sure it returns the currently
used mode and not the set one.
This reverts the cited commit which reset the encap mode
on entering switchdev and fixes the original issue properly.
Fixes: 9a64144d683a ("net/mlx5: E-Switch, Fix default encap mode")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 362980eada85b5ea691e5e0d9257a991aa7ade47 ]
Function mlx5e_take_tmp_flow() skips flows with zero reference count. This
can cause syndrome 0x179e84 when the called from neigh or route update code
and the skipped flow is not removed from the hardware by the time
underlying encap/decap resource is deleted. Add new completion
'del_hw_done' that is completed when flow is unoffloaded. This is safe to
do because flow with reference count zero needs to be detached from
encap/decap entry before its memory is deallocated, which requires taking
the encap_tbl_lock mutex that is held by the event handlers code.
Fixes: 8914add2c9e5 ("net/mlx5e: Handle FIB events to update tunnel endpoint device")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cc4a9cc03faa6d8db1a6954bb536f2c1e63bdff6 ]
For the TLS RX resync flow, we maintain a list of TLS contexts
that require some attention, to communicate their resync information
to the HW.
Here we fix list corruptions, by protecting the entries against
movements coming from resync_handle_seq_match(), until their resync
handling in napi is fully completed.
Fixes: e9ce991bce5b ("net/mlx5e: kTLS, Add resiliency to RX resync failures")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4293014230b887d94b68aa460ff00153454a3709 ]
Restore VLAN filters after the link is brought down, and up - since all
filters are deleted from HW during the netdev link down routine.
Fixes: ed1f5b58ea01 ("i40evf: remove VLAN filters on close")
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9a6e9e483a9684a34573fd9f9e30ecfb047cb8cb ]
Now setting combine to 0 will be rejected with the
appropriate error code.
This has been implemented by adding a condition that checks
the value of combine equal to zero.
Without this patch, when the user requested it, no error was
returned and combine was set to the default value for VF.
Fixes: 5520deb15326 ("iavf: Enable support for up to 16 queues")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|