Age | Commit message (Collapse) | Author | Files | Lines |
|
In the nft_offload there is the mate flow_dissector with no
ingress_ifindex but with ingress_iftype that only be used
in the software. So if the mask of ingress_ifindex in meta is
0, this meta check should be bypass.
Fixes: 6d65bc64e232 ("net/mlx5e: Add mlx5e_flower_parse_meta support")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Change register setting from bit number to bit mask.
Fixes: b5ede32d3329 ("net/mlx5e: Add support for FEC modes based on 50G per lane links")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Prevent setting of devlink traps on the uplink while in switchdev mode.
In this mode, it is the SW switch responsibility to handle both packets
with a mismatch in destination MAC or VLAN ID. Therefore, there are no
flow steering tables to trap undesirable packets and driver crashes upon
setting a trap.
Fixes: 241dc159391f ("net/mlx5: Notify on trap action by blocking event")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Some drivers clear the 'ethtool_link_ksettings' struct in their
get_link_ksettings() callback, before populating it with actual values.
Such drivers will set the new 'link_mode' field to zero, resulting in
user space receiving wrong link mode information given that zero is a
valid value for the field.
Another problem is that some drivers (notably tun) can report random
values in the 'link_mode' field. This can result in a general protection
fault when the field is used as an index to the 'link_mode_params' array
[1].
This happens because such drivers implement their set_link_ksettings()
callback by simply overwriting their private copy of
'ethtool_link_ksettings' struct with the one they get from the stack,
which is not always properly initialized.
Fix these problems by removing 'link_mode' from 'ethtool_link_ksettings'
and instead have drivers call ethtool_params_from_link_mode() with the
current link mode. The function will derive the link parameters (e.g.,
speed) from the link mode and fill them in the 'ethtool_link_ksettings'
struct.
v3:
* Remove link_mode parameter and derive the link parameters in
the driver instead of passing link_mode parameter to ethtool
and derive it there.
v2:
* Introduce 'cap_link_mode_supported' instead of adding a
validity field to 'ethtool_link_ksettings' struct.
[1]
general protection fault, probably for non-canonical address 0xdffffc00f14cc32c: 0000 [#1] PREEMPT SMP KASAN
KASAN: probably user-memory-access in range [0x000000078a661960-0x000000078a661967]
CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446
Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03
+38 d0 7c 08 84 d2 0f 85 b9
RSP: 0018:ffffc900019df7a0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff888026136008 RCX: 0000000000000000
RDX: 00000000f14cc32c RSI: ffffffff873439ca RDI: 000000078a661960
RBP: 00000000ffff8880 R08: 00000000ffffffff R09: ffff88802613606f
R10: ffffffff873439bc R11: 0000000000000000 R12: 0000000000000000
R13: ffff88802613606c R14: ffff888011d0c210 R15: ffff888011d0c210
FS: 0000000000749300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004b60f0 CR3: 00000000185c2000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37
ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586
ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656
ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620
dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842
dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440
sock_do_ioctl+0x148/0x2d0 net/socket.c:1060
sock_ioctl+0x477/0x6a0 net/socket.c:1177
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and duplex parameters")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Memory allocated by kvzalloc() should be freed by kvfree().
Fixes: 34ca65352ddf2 ("net/mlx5: E-Switch, Indirect table infrastructur")
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Make sure to modify uplink port to follow only if the uplink_follow
capability is set as required by the HW spec. Failure to do so causes
traffic to the uplink representor net device to cease after switching to
switchdev mode.
Fixes: 7d0314b11cdd ("net/mlx5e: Modify uplink state on interface up/down")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
XSK wakeup flow triggers an IRQ by posting a NOP WQE and hitting
the doorbell on the async ICOSQ.
It maintains its state so that it doesn't issue another NOP WQE
if it has an outstanding one already.
For this flow to work properly, the NOP post must not fail.
Make sure to reserve room for the NOP WQE in all WQE posts to the
async ICOSQ.
Fixes: 8d94b590f1e4 ("net/mlx5e: Turn XSK ICOSQ into a general asynchronous one")
Fixes: 1182f3659357 ("net/mlx5e: kTLS, Add kTLS RX HW offload support")
Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Current algorithm for encap keys is legacy from initial vxlan
implementation and doesn't take into account all possible fields of a
tunnel. For example, for a Geneve tunnel, which may have additional TLV
options, they are ignored when comparing encap keys and a rule can be
attached to an incorrect encap entry.
Fix that by introducing encap_info_equal() operation in
struct mlx5e_tc_tunnel. Geneve tunnel type uses custom implementation,
which extends generic algorithm and considers options if they are set.
Fixes: 7f1a546e3222 ("net/mlx5e: Consider tunnel type for encap contexts")
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Calculating the number of compeltion EQs based on the number of
available IRQ vectors doesn't work now that all async EQs share one IRQ.
Thus the max number of EQs can be exceeded on systems with more than
approximately 256 CPUs. Take this into account when calculating the
number of available completion EQs.
Fixes: 81bfa206032a ("net/mlx5: Use a single IRQ for all async EQs")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Some TLS RX counters increment per socket/connection, and are not
protected against parallel modifications from several cores.
Switch them to atomic counters by taking them out of the RQ stats into
the global atomic TLS stats.
In this patch, we touch 'rx_tls_ctx/del' that count the number of
device-offloaded RX TLS connections added/deleted.
These counters are updated in the add/del callbacks, out of the fast
data-path.
This change is not needed for counters that increment only in NAPI
context, as they are protected by the NAPI mechanism.
Keep them as tls_* counters under 'struct mlx5e_rq_stats'.
Fixes: 76c1e1ac2aae ("net/mlx5e: kTLS, Add kTLS RX stats")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Some TLS TX counters increment per socket/connection, and are not
protected against parallel modifications from several cores.
Switch them to atomic counters by taking them out of the SQ stats into
the global atomic TLS stats.
In this patch, we touch a single counter 'tx_tls_ctx' that counts the
number of device-offloaded TX TLS connections added.
Now that this counter can be increased without the for having the SQ
context in hand, move it to the mlx5e_ktls_add_tx() callback where it
really belongs, out of the fast data-path.
This change is not needed for counters that increment only in NAPI
context or under the TX lock, as they are already protected.
Keep them as tls_* counters under 'struct mlx5e_sq_stats'.
Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Create send to vport miss group was added in order to support traffic
recirculation to root table with metadata source rewrite.
This group is created also in case source rewrite isn't supported.
Fixed by creating send to vport miss group only if source rewrite is
supported by FW.
Fixes: 8e404fefa58b ("net/mlx5e: Match recirculated packet miss in slow table using reg_c1")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Use connector_type read from PTYS register when it's valid, based on
corresponding capability bit.
Fixes: 5b4793f81745 ("net/mlx5e: Add support for reading connector type from PTYS")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Delete auxiliary bus drivers flow deletes the eth driver
first and then the eth-reps driver but eth-reps devices resources
are depend on eth device.
Fixed by changing the delete order of auxiliary bus drivers to delete
the eth-rep driver first and after it the eth driver.
Fixes: 601c10c89cbb ("net/mlx5: Delete custom device management logic")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
ct_label 0 is a default label each flow has and therefore
there can be rules that match on ct_label=0 without a prior
rule that set the ct_label to this value.
The ct_label value is not used directly in the HW rules and
instead it is mapped to some id within a defined range and this
id is used to set and match the metadata register which carries
the ct_label.
If we have a rule that matches on ct_label=0, the hw rule will
perform matching on a value that is != 0 because of the mapping
from label to id. Since the metadata register default value is
0 and it was never set before to anything else by an action that
sets the ct_label, there will always be a mismatch between that
register and the value in the rule.
To support such rule, a forced mapping of ct_label 0 to id=0
is done so that it will match the metadata register default
value of 0.
Fixes: 54b154ecfb8c ("net/mlx5e: CT: Map 128 bits labels to 32 bit map ID")
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Cited commit changed the behavior of the software data path with regards
to the ECN marking of decapsulated packets. However, the commit did not
change other callers of __INET_ECN_decapsulate(), namely mlxsw. The
driver is using the function in order to ensure that the hardware and
software data paths act the same with regards to the ECN marking of
decapsulated packets.
The discrepancy was uncovered by commit 5aa3c334a449 ("selftests:
forwarding: vxlan_bridge_1d: Fix vxlan ecn decapsulate value") that
aligned the selftest to the new behavior. Without this patch the
selftest passes when used with veth pairs, but fails when used with
mlxsw netdevs.
Fix this by instructing the device to propagate the ECT(1) mark from the
outer header to the inner header when the inner header is ECT(0), for
both NVE and IP-in-IP tunnels.
A helper is added in order not to duplicate the code between both tunnel
types.
Fixes: b723748750ec ("tunnel: Propagate ECT(1) when decapsulating as recommended by RFC6040")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Device firmware doesn't handle ecpu bit for vhca state processing
events and commands. Instead device firmware refers to the unique
function id to distinguish SF of different PCI functions.
When ecpu bit is used, firmware returns a syndrome.
mlx5_cmd_check:780:(pid 872): MODIFY_VHCA_STATE(0xb0e) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x263211)
mlx5_sf_dev_table_create:248:(pid 872): SF DEV table create err = -22
Hence, avoid using ecpu bit.
Fixes: 8f0105418668 ("net/mlx5: SF, Add port add delete functionality")
Fixes: 90d010b8634b ("net/mlx5: SF, Add auxiliary device support")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
mlx5e_select_queue compares num_tc_x_num_ch to real_num_tx_queues to
determine if HTB and/or PTP offloads are active. If they are, it
calculates netdev_pick_tx() % num_tc_x_num_ch to prevent it from
selecting HTB and PTP queues for regular traffic. However, before the
channels are first activated, num_tc_x_num_ch is zero. If
ndo_select_queue gets called at this point, the HTB/PTP check will pass,
and mlx5e_select_queue will attempt to take a modulo by num_tc_x_num_ch,
which equals to zero.
This commit fixes the bug by assigning num_tc_x_num_ch to a non-zero
value before registering the netdev.
Fixes: 214baf22870c ("net/mlx5e: Support HTB offload")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Expose error value when failing to comply to command:
$ ethtool --set-priv-flags eth2 rx_cqe_compress [on/off]
Fixes: be7e87f92b58 ("net/mlx5e: Fail safe cqe compressing/moderation mode setting")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Setting connection tracking OVS flows and then setting non-CT flows that
use tuple rewrite action (e.g. mod_tp_dst), causes the latter flows not
being offloaded.
Fix by using a stricter condition in modify_header_match_supported() to
check tuple rewrite support only for flows with CT action. The check is
factored out into standalone modify_tuple_supported() function to aid
readability.
Fixes: 7e36feeb0467 ("net/mlx5e: CT: Don't offload tuple rewrites for established tuples")
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Currently, we support hardware offload only for MPLS over UDP.
However, rules matching on MPLS parameters are now wrongly offloaded
for regular MPLS, without actually taking the parameters into
consideration when doing the offload.
Fix it by rejecting such unsupported rules.
Fixes: 72046a91d134 ("net/mlx5e: Allow to match on mpls parameters")
Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The multicast counter got removed from uplink representor due to the
cited patch.
Fixes: 47c97e6b10a1 ("net/mlx5e: Fix multicast counter not up-to-date in "ip -s"")
Signed-off-by: Huy Nguyen <huyn@nvidia.com>
Reviewed-by: Daniel Jurgens <danielj@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Fix 32-bit variable shift wrapping in dr_ste_v1_get_miss_addr.
Fixes: a6098129c781 ("net/mlx5: DR, Add STEv1 setters and getters")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When SF id is unavailable, code jumps to wrong label that accesses
sw id array outside of its range.
Hence, when SF id is not allocated, avoid accessing such array.
Fixes: 8f0105418668 ("net/mlx5: SF, Add port add delete functionality")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Cited patch in the fixes tag missed to free the allocated work.
Fix it by freeing the work after work execution.
Fixes: f3196bb0f14c ("net/mlx5: Introduce vhca state event notifier")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Fix vhca context size as defined by device interface specification.
Fixes: f3196bb0f14c ("net/mlx5: Introduce vhca state event notifier")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
do_div() returns reminder, while cited patch wanted to use
quotient.
Fix it by using quotient.
Fixes: 0e22bfb7c046 ("net/mlx5e: E-switch, Fix rate calculation for overflow")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
QPs which don't care from timestamp mode, should set the ts_format
to default, otherwise the QP creation could be failed if the timestamp
mode is not supported.
Fixes: 2fe8d4b87802 ("RDMA/mlx5: Fail QP creation if the device can not support the CQE TS")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Move priv memset from init to cleanup to avoid double priv cleanup
that can happen on profile change if also roolback fails.
Add missing cleanup flow in mlx5e_netdev_attach_profile().
Fixes: c4d7eb57687f ("net/mxl5e: Add change profile method")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
VF tunnel TX traffic offload is adding flow which forward to flow
tables with lower level, which isn't support on all FW versions
and may cause firmware to fail with syndrome.
Fixed by enabling VF tunnel TX offload only if flow table capability
ignore_flow_level is enabled.
Fixes: 10742efc20a4 ("net/mlx5e: VF tunnel TX traffic offloading")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
flow_attr->ip_version has the matching that should be done inner/outer.
When working with chains, decapsulation is done on chain0 and next chain
match on outer header which is the original inner which could be ipv4.
So in tunnel route resolution we cannot use that to know which ip version
we are at so save tun_ip_version when parsing the tunnel match and use
that.
Fixes: a508728a4c8b ("net/mlx5e: VF tunnel RX traffic offloading")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Dmytro Linkin <dlinkin@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Fix a bug of uninitialized pin index when trying to turn off PPS out.
Fixes: de19cd6cc977 ("net/mlx5: Move some PPS logic into helper functions")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The cited change added offload support for Geneve options without verifying
the validity of the options masks, this caused offload of rules with match
on Geneve options with class,type and data masks which are zero to fail.
Fix by ignoring the match on Geneve options in case option masks are
all zero.
Fixes: 9272e3df3023 ("net/mlx5e: Geneve, Add support for encap/decap flows offload")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Port timestamping for PTP can be enabled/disabled while the channels are
closed. In that case mlx5e_safe_switch_channels is skipped, and the
preactivate hook is called directly. However, if that hook returns an
error, the channel parameters must be reverted back to their old values.
This commit adds missing handling on this case.
Fixes: 145e5637d941 ("net/mlx5e: Add TX PTP port object support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Each RQ (including XSK RQs) takes a reference to the XDP program. When
an XDP program is attached or detached, the channels and queues are
recreated, however, there is a special flow for changing an active XDP
program to another one. In that flow, channels and queues stay alive,
but the refcounts of the old and new XDP programs are adjusted. This
flow didn't increment refcount by the number of active XSK RQs, and this
commit fixes it.
Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When closing the PTP channel, set its pointer explicitly to NULL. PTP
channel is opened on demand, the code verify the pointer validity before
access. Nullify it when closing the PTP channel to avoid unexpected
behavior.
Fixes: 145e5637d941 ("net/mlx5e: Add TX PTP port object support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
In addition to .get_ethtool_stats, add port PTP TX stats to
.ndo_get_stats64.
Fixes: 145e5637d941 ("net/mlx5e: Add TX PTP port object support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Since cited patch, MLX5E_REQUIRED_WQE_MTTS is not a power of two.
Hence, usage of MLX5E_LOG_ALIGNED_MPWQE_PPW should be replaced,
as it lost some accuracy. Use the designated macro to calculate
the number of required MTTs.
This makes sure the solution in cited patch works properly.
While here, un-inline mlx5e_get_mpwqe_offset(), and remove the
unused RQ parameter.
Fixes: c3c9402373fe ("net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTU")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The ICOSQ size should not go below MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE.
Enforce this where it's missing.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
This patch fixes a bug that the moderation config will not be
applied when calling mlx4_en_reset_config. For example, when
turning on rx timestamping, mlx4_en_reset_config() will be called,
causing the NIC to forget previous moderation config.
This fix is in phase with a previous fix:
commit 79c54b6bbf06 ("net/mlx4_en: Fix TX moderation info loss
after set_ringparam is called")
Tested: Before this patch, on a host with NIC using mlx4, run
netserver and stream TCP to the host at full utilization.
$ sar -I SUM 1
INTR intr/s
14:03:56 sum 48758.00
After rx hwtstamp is enabled:
$ sar -I SUM 1
14:10:38 sum 317771.00
We see the moderation is not working properly and issued 7x more
interrupts.
After the patch, and turned on rx hwtstamp, the rate of interrupts
is as expected:
$ sar -I SUM 1
14:52:11 sum 49332.00
Fixes: 79c54b6bbf06 ("net/mlx4_en: Fix TX moderation info loss after set_ringparam is called")
Signed-off-by: Kevin(Yudong) Yang <yyd@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
CC: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Routes are currently processed from a workqueue whereas nexthop objects
are processed in system call context. This can result in the driver not
finding a suitable nexthop group for a route and issuing a warning [1].
Fix this by ignoring such routes earlier in the process. The subsequent
deletion notification will be ignored as well.
[1]
WARNING: CPU: 2 PID: 7754 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:4853 mlxsw_sp_router_fib_event_work+0x1112/0x1e00 [mlxsw_spectrum]
[...]
CPU: 2 PID: 7754 Comm: kworker/u8:0 Not tainted 5.11.0-rc6-cq-20210207-1 #16
Hardware name: Mellanox Technologies Ltd. MSN2100/SA001390, BIOS 5.6.5 05/24/2018
Workqueue: mlxsw_core_ordered mlxsw_sp_router_fib_event_work [mlxsw_spectrum]
RIP: 0010:mlxsw_sp_router_fib_event_work+0x1112/0x1e00 [mlxsw_spectrum]
Fixes: cdd6cfc54c64 ("mlxsw: spectrum_router: Allow programming routes with nexthop objects")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reported-by: Alex Veber <alexve@nvidia.com>
Tested-by: Alex Veber <alexve@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, only external bits are added to the PTYS register, whereas
there is one external bit that is wrongly marked as internal, and so was
recently removed from the register.
Add that bit to the PTYS register again, as this bit is no longer
internal.
Its removal resulted in '100000baseLR4_ER4/Full' link mode no longer
being supported, causing a regression on some setups.
Fixes: 5bf01b571cf4 ("mlxsw: spectrum_ethtool: Remove internal speeds from PTYS register")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reported-by: Eddie Shklaer <eddies@nvidia.com>
Tested-by: Eddie Shklaer <eddies@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Rather small batch this time.
Current release - regressions:
- bcm63xx_enet: fix sporadic kernel panic due to queue length
mis-accounting
Current release - new code bugs:
- bcm4908_enet: fix RX path possible mem leak
- bcm4908_enet: fix NAPI poll returned value
- stmmac: fix missing spin_lock_init in visconti_eth_dwmac_probe()
- sched: cls_flower: validate ct_state for invalid and reply flags
Previous releases - regressions:
- net: introduce CAN specific pointer in the struct net_device to
prevent mis-interpreting memory
- phy: micrel: set soft_reset callback to genphy_soft_reset for
KSZ8081
- psample: fix netlink skb length with tunnel info
Previous releases - always broken:
- icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending
- wireguard: device: do not generate ICMP for non-IP packets
- mptcp: provide subflow aware release function to avoid a mem leak
- hsr: add support for EntryForgetTime
- r8169: fix jumbo packet handling on RTL8168e
- octeontx2-af: fix an off by one in rvu_dbg_qsize_write()
- i40e: fix flow for IPv6 next header (extension header)
- phy: icplus: call phy_restore_page() when phy_select_page() fails
- dpaa_eth: fix the access method for the dpaa_napi_portal"
* tag 'net-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits)
r8169: fix jumbo packet handling on RTL8168e
net: phy: micrel: set soft_reset callback to genphy_soft_reset for KSZ8081
net: psample: Fix netlink skb length with tunnel info
net: broadcom: bcm4908_enet: fix NAPI poll returned value
net: broadcom: bcm4908_enet: fix RX path possible mem leak
net: hsr: add support for EntryForgetTime
net: dsa: sja1105: Remove unneeded cast in sja1105_crc32()
ibmvnic: fix a race between open and reset
net: stmmac: Fix missing spin_lock_init in visconti_eth_dwmac_probe()
net: introduce CAN specific pointer in the struct net_device
net: usb: qmi_wwan: support ZTE P685M modem
wireguard: kconfig: use arm chacha even with no neon
wireguard: queueing: get rid of per-peer ring buffers
wireguard: device: do not generate ICMP for non-IP packets
wireguard: peer: put frequently used members above cache lines
wireguard: selftests: test multiple parallel streams
wireguard: socket: remove bogus __be32 annotation
wireguard: avoid double unlikely() notation when using IS_ERR()
net: qrtr: Fix memory leak in qrtr_tun_open
vxlan: move debug check after netdev unregister
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Fix false-positive build warnings for ARCH=ia64 builds
- Optimize dictionary size for module compression with xz
- Check the compiler and linker versions in Kconfig
- Fix misuse of extra-y
- Support DWARF v5 debug info
- Clamp SUBLEVEL to 255 because stable releases 4.4.x and 4.9.x
exceeded the limit
- Add generic syscall{tbl,hdr}.sh for cleanups across arches
- Minor cleanups of genksyms
- Minor cleanups of Kconfig
* tag 'kbuild-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (38 commits)
initramfs: Remove redundant dependency of RD_ZSTD on BLK_DEV_INITRD
kbuild: remove deprecated 'always' and 'hostprogs-y/m'
kbuild: parse C= and M= before changing the working directory
kbuild: reuse this-makefile to define abs_srctree
kconfig: unify rule of config, menuconfig, nconfig, gconfig, xconfig
kconfig: omit --oldaskconfig option for 'make config'
kconfig: fix 'invalid option' for help option
kconfig: remove dead code in conf_askvalue()
kconfig: clean up nested if-conditionals in check_conf()
kconfig: Remove duplicate call to sym_get_string_value()
Makefile: Remove # characters from compiler string
Makefile: reuse CC_VERSION_TEXT
kbuild: check the minimum linker version in Kconfig
kbuild: remove ld-version macro
scripts: add generic syscallhdr.sh
scripts: add generic syscalltbl.sh
arch: syscalls: remove $(srctree)/ prefix from syscall tables
arch: syscalls: add missing FORCE and fix 'targets' to make if_changed work
gen_compile_commands: prune some directories
kbuild: simplify access to the kernel's version
...
|
|
mlx4_do_mirror_rule() forgets to call mlx4_free_cmd_mailbox() to
free the memory region allocated by mlx4_alloc_cmd_mailbox() before
an exit.
Add the missed call to fix it.
Fixes: 78efed275117 ("net/mlx4_core: Support mirroring VF DMFS rules on both ports")
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20210221143559.390277-1-hslester96@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull rdma updates from Jason Gunthorpe:
"This is quite a small cycle, if not for Lee's 70 patches cleaning the
kdocs it would be well below typical for patch count.
Most of the interesting work here was in the HNS and rxe drivers which
got fairly major internal changes.
Summary:
- Driver updates and bug fixes: siw, hns, bnxt_re, mlx5, efa
- Significant rework in rxe to get it ready to have XRC support added
- Several rts bug fixes
- Big series to get to 'make W=1' cleanness, primarily updating kdocs
- Support for creating a RDMA MR from a DMABUF fd to allow PCI peer
to peer transfers to GPU VRAM
- Device disassociation now works properly with umad
- Work to support more than 255 ports on a RDMA device
- Further support for the new HNS HIP09 hardware
- Coding style cleanups: comma to semicolon, unneded semicolon/blank
lines, remove 'h' printk format, don't check for NULL before kfree,
use true/false for bool"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (205 commits)
RDMA/rtrs-srv: Do not pass a valid pointer to PTR_ERR()
RDMA/srp: Fix support for unpopulated and unbalanced NUMA nodes
RDMA/mlx5: Fail QP creation if the device can not support the CQE TS
RDMA/mlx5: Allow CQ creation without attached EQs
RDMA/rtrs-srv-sysfs: fix missing put_device
RDMA/rtrs-srv: fix memory leak by missing kobject free
RDMA/rtrs: Only allow addition of path to an already established session
RDMA/rtrs-srv: Fix stack-out-of-bounds
RDMA/rxe: Remove unused pkt->offset
RDMA/ucma: Fix use-after-free bug in ucma_create_uevent
RDMA/core: Fix kernel doc warnings for ib_port_immutable_read()
RDMA/qedr: Use true and false for bool variable
RDMA/hns: Adjust definition of FRMR fields
RDMA/hns: Refactor process of posting CMDQ
RDMA/hns: Adjust fields and variables about CMDQ tail/head
RDMA/hns: Remove redundant operations on CMDQ
RDMA/hns: Fixes missing error code of CMDQ
RDMA/hns: Remove unused member and variable of CMDQ
RDMA/ipoib: Remove racy Subnet Manager sendonly join checks
RDMA/mlx5: Support 400Gbps IB rate in mlx5 driver
...
|
|
Linux 5.11
Merged to resolve conflicts with RDMA rc commits
- drivers/infiniband/sw/rxe/rxe_net.c
The final logic is to call rxe_get_dev_from_net() again with the master
netdev if the packet was rx'd on a vlan. To keep the elimination of the
local variables requires a trivial edit to the code in -rc
Link: https://lore.kernel.org/r/20210210131542.215ea67c@canb.auug.org.au
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
pull-request: mlx5-next 2021-02-16
The patches in this pr are already submitted and reviewed through the
netdev and rdma mailing lists.
The series includes mlx5 HW bits and definitions for mlx5 real time clock
translation and handling in the mlx5 driver clock module to enable and
support such mode [1]
[1] https://patchwork.kernel.org/project/netdevbpf/patch/20210212223042.449816-7-saeed@kernel.org/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|