Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit da78fe5fb35737058de52364484ffed74e7d329b ]
Fix the below crash when deleting a slave from the unaffiliated list
twice. First time when the slave is bound to the master and the second
when the slave is unloaded.
Fix it by checking if slave is unaffiliated (doesn't have ib device)
before removing from the list.
RIP: 0010:mlx5r_mp_remove+0x4e/0xa0 [mlx5_ib]
Call Trace:
auxiliary_bus_remove+0x18/0x30
__device_release_driver+0x177/x220
device_release_driver+0x24/0x30
bus_remove_device+0xd8/0x140
device_del+0x18a/0x3e0
mlx5_rescan_drivers_locked+0xa9/0x210 [mlx5_core]
mlx5_unregister_device+0x34/0x60 [mlx5_core]
mlx5_uninit_one+0x32/0x100 [mlx5_core]
remove_one+0x6e/0xe0 [mlx5_core]
pci_device_remove+0x36/0xa0
__device_release_driver+0x177/0x220
device_driver_detach+0x3c/0xa0
unbind_store+0x113/0x130
kernfs_fop_write_iter+0x110/0x1a0
new_sync_write+0x116/0x1a0
vfs_write+0x1ba/0x260
ksys_write+0x5f/0xe0
do_syscall_64+0x3d/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes: 93f8244431ad ("RDMA/mlx5: Convert mlx5_ib to use auxiliary bus")
Link: https://lore.kernel.org/r/17ec98989b0ba88f7adfbad68eb20bce8d567b44.1628587493.git.leonro@nvidia.com
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 563476ae0c5e48a028cbfa38fa9d2fc0418eb88f ]
The CQ destroy is performed based on the IRQ number that is stored in
cq->irqn. That number wasn't set explicitly during CQ creation and as
expected some of the API users of mlx5_core_create_cq() forgot to update
it.
This caused to wrong synchronization call of the wrong IRQ with a number
0 instead of the real one.
As a fix, set the IRQ number directly in the mlx5_core_create_cq() and
update all users accordingly.
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Fixes: ef1659ade359 ("IB/mlx5: Add DEVX support for CQ events")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d6793ca97b76642b77629dd0783ec64782a50bdb ]
Fixing a typo that causes a cache entry to shrink immediately after adding
to it new MRs if the entry size exceeds the high limit. In doing so, the
cache misses its purpose to prevent the creation of new mkeys on the
runtime by using the cached ones.
Fixes: b9358bdbc713 ("RDMA/mlx5: Fix locking in MR cache work queue")
Link: https://lore.kernel.org/r/fcb546986be346684a016f5ca23a0567399145fa.1627370131.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4a754d7637026b42b0c9ba5787ad5ee3bc2ff77f ]
The "dev->port[i].mp.mpi" is set to NULL during mlx5_ib_unbind_slave_port()
execution, however that field is needed to add device to unaffiliated list.
Such flow causes to the following kernel panic while unloading mlx5_ib
module in multi-port mode, hence the device should be added to the list
prior to unbind call.
RPC: Unregistered rdma transport module.
RPC: Unregistered rdma backchannel transport module.
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] SMP NOPTI
CPU: 4 PID: 1904 Comm: modprobe Not tainted 5.13.0-rc7_for_upstream_min_debug_2021_06_24_12_08 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:mlx5_ib_cleanup_multiport_master+0x18b/0x2d0 [mlx5_ib]
Code: 00 04 0f 85 c4 00 00 00 48 89 df e8 ef fa ff ff 48 8b 83 40 0d 00 00 48 8b 15 b9 e8 05 00 4a 8b 44 28 20 48 89 05 ad e8 05 00 <48> c7 00 d0 57 c5 a0 48 89 50 08 48 89 02 39 ab 88 0a 00 00 0f 86
RSP: 0018:ffff888116ee3df8 EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffff8881154f6000 RCX: 0000000000000080
RDX: ffffffffa0c557d0 RSI: ffff88810b69d200 RDI: 000000000002d8a0
RBP: 0000000000000002 R08: ffff888110780408 R09: 0000000000000000
R10: ffff88812452e1c0 R11: fffffffffff7e028 R12: 0000000000000000
R13: 0000000000000080 R14: ffff888102c58000 R15: 0000000000000000
FS: 00007f884393a740(0000) GS:ffff8882f5a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001249f6004 CR4: 0000000000370ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
mlx5_ib_stage_init_cleanup+0x16/0xd0 [mlx5_ib]
__mlx5_ib_remove+0x33/0x90 [mlx5_ib]
mlx5r_remove+0x22/0x30 [mlx5_ib]
auxiliary_bus_remove+0x18/0x30
__device_release_driver+0x177/0x220
driver_detach+0xc4/0x100
bus_remove_driver+0x58/0xd0
auxiliary_driver_unregister+0x12/0x20
mlx5_ib_cleanup+0x13/0x897 [mlx5_ib]
__x64_sys_delete_module+0x154/0x230
? exit_to_user_mode_prepare+0x104/0x140
do_syscall_64+0x3f/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f8842e095c7
Code: 73 01 c3 48 8b 0d d9 48 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 48 2c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc68f6e758 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 00005638207929c0 RCX: 00007f8842e095c7
RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000563820792a28
RBP: 00005638207929c0 R08: 00007ffc68f6d701 R09: 0000000000000000
R10: 00007f8842e82880 R11: 0000000000000206 R12: 0000563820792a28
R13: 0000000000000001 R14: 0000563820792a28 R15: 00007ffc68f6fb40
Modules linked in: xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_ipoib ib_cm ib_umad mlx5_ib(-) mlx4_ib ib_uverbs ib_core mlx4_en mlx4_core mlx5_core ptp pps_core [last unloaded: rpcrdma]
CR2: 0000000000000000
---[ end trace a0bb7e20804e9e9b ]---
Fixes: 7ce6095e3bff ("RDMA/mlx5: Don't add slave port to unaffiliated list")
Link: https://lore.kernel.org/r/899ac1b33a995be5ec0e16a4765c4e43c2b1ba5b.1624956444.git.leonro@nvidia.com
Reviewed-by: Itay Aveksis <itayav@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7ce6095e3bff8e20ce018b050960b527e298f7df ]
The mlx5_ib_bind_slave_port() doesn't remove multiport device from the
unaffiliated list, but mlx5_ib_unbind_slave_port() did it. This unbalanced
flow caused to the situation where mlx5_ib_unaffiliated_port_list was
changed during iteration.
Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE")
Link: https://lore.kernel.org/r/2726e6603b1e6ecfe76aa5a12a063af72173bcf7.1622477058.git.leonro@nvidia.com
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f97442887275d11c88c2899e720fe945c1f61488 ]
The mlx4 and mlx5 implemented differently the WQ input checks. Instead of
duplicating mlx4 logic in the mlx5, let's prepare the input in the central
place.
The mlx5 implementation didn't check for validity of state input. It is
not real bug because our FW checked that, but still worth to fix.
Fixes: f213c0527210 ("IB/uverbs: Add WQ support")
Link: https://lore.kernel.org/r/ac41ad6a81b095b1a8ad453dcf62cf8d3c5da779.1621413310.git.leonro@nvidia.com
Reported-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
The function init_cq_frag_buf() can be called to initialize the current CQ
fragments buffer cq->buf, or the temporary cq->resize_buf that is filled
during CQ resize operation.
However, the offending commit started to use function get_cqe() for
getting the CQEs, the issue with this change is that get_cqe() always
returns CQEs from cq->buf, which leads us to initialize the wrong buffer,
and in case of enlarging the CQ we try to access elements beyond the size
of the current cq->buf and eventually hit a kernel panic.
[exception RIP: init_cq_frag_buf+103]
[ffff9f799ddcbcd8] mlx5_ib_resize_cq at ffffffffc0835d60 [mlx5_ib]
[ffff9f799ddcbdb0] ib_resize_cq at ffffffffc05270df [ib_core]
[ffff9f799ddcbdc0] llt_rdma_setup_qp at ffffffffc0a6a712 [llt]
[ffff9f799ddcbe10] llt_rdma_cc_event_action at ffffffffc0a6b411 [llt]
[ffff9f799ddcbe98] llt_rdma_client_conn_thread at ffffffffc0a6bb75 [llt]
[ffff9f799ddcbec8] kthread at ffffffffa66c5da1
[ffff9f799ddcbf50] ret_from_fork_nospec_begin at ffffffffa6d95ddd
Fix it by getting the needed CQE by calling mlx5_frag_buf_get_wqe() that
takes the correct source buffer as a parameter.
Fixes: 388ca8be0037 ("IB/mlx5: Implement fragmented completion queue (CQ)")
Link: https://lore.kernel.org/r/90a0e8c924093cfa50a482880ad7e7edb73dc19a.1623309971.git.leonro@nvidia.com
Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The value mr->sig is stored in the entry upon mr allocation, however, ibmr
is wrongly entered here as "old", therefore, xa_cmpxchg() does not replace
the entry with NULL, which leads to the following trace:
WARNING: CPU: 28 PID: 2078 at drivers/infiniband/hw/mlx5/main.c:3643 mlx5_ib_stage_init_cleanup+0x4d/0x60 [mlx5_ib]
Modules linked in: nvme_rdma nvme_fabrics nvme_core 8021q garp mrp bonding bridge stp llc rfkill rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_tad
CPU: 28 PID: 2078 Comm: reboot Tainted: G X --------- --- 5.13.0-0.rc2.19.el9.x86_64 #1
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.9.1 12/07/2018
RIP: 0010:mlx5_ib_stage_init_cleanup+0x4d/0x60 [mlx5_ib]
Code: 8d bb 70 1f 00 00 be 00 01 00 00 e8 9d 94 ce da 48 3d 00 01 00 00 75 02 5b c3 0f 0b 5b c3 0f 0b 48 83 bb b0 20 00 00 00 74 d5 <0f> 0b eb d1 4
RSP: 0018:ffffa8db06d33c90 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff97f890a44000 RCX: ffff97f900ec0160
RDX: 0000000000000000 RSI: 0000000080080001 RDI: ffff97f890a44000
RBP: ffffffffc0c189b8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000300 R12: ffff97f890a44000
R13: ffffffffc0c36030 R14: 00000000fee1dead R15: 0000000000000000
FS: 00007f0d5a8a3b40(0000) GS:ffff98077fb80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555acbf4f450 CR3: 00000002a6f56002 CR4: 00000000001706e0
Call Trace:
mlx5r_remove+0x39/0x60 [mlx5_ib]
auxiliary_bus_remove+0x1b/0x30
__device_release_driver+0x17a/0x230
device_release_driver+0x24/0x30
bus_remove_device+0xdb/0x140
device_del+0x18b/0x3e0
mlx5_detach_device+0x59/0x90 [mlx5_core]
mlx5_unload_one+0x22/0x60 [mlx5_core]
shutdown+0x31/0x3a [mlx5_core]
pci_device_shutdown+0x34/0x60
device_shutdown+0x15b/0x1c0
__do_sys_reboot.cold+0x2f/0x5b
? vfs_writev+0xc7/0x140
? handle_mm_fault+0xc5/0x290
? do_writev+0x6b/0x110
do_syscall_64+0x40/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes: e6fb246ccafb ("RDMA/mlx5: Consolidate MR destruction to mlx5_ib_dereg_mr()")
Link: https://lore.kernel.org/r/f3f585ea0db59c2a78f94f65eedeafc5a2374993.1623309971.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Validate port value provided by the user and with that remove no longer
needed validation by the driver. The missing check in the mlx5_ib driver
could cause to the below oops.
Call trace:
_create_flow_rule+0x2d4/0xf28 [mlx5_ib]
mlx5_ib_create_flow+0x2d0/0x5b0 [mlx5_ib]
ib_uverbs_ex_create_flow+0x4cc/0x624 [ib_uverbs]
ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xd4/0x150 [ib_uverbs]
ib_uverbs_cmd_verbs.isra.7+0xb28/0xc50 [ib_uverbs]
ib_uverbs_ioctl+0x158/0x1d0 [ib_uverbs]
do_vfs_ioctl+0xd0/0xaf0
ksys_ioctl+0x84/0xb4
__arm64_sys_ioctl+0x28/0xc4
el0_svc_common.constprop.3+0xa4/0x254
el0_svc_handler+0x84/0xa0
el0_svc+0x10/0x26c
Code: b9401260 f9615681 51000400 8b001c20 (f9403c1a)
Fixes: 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through uverbs")
Link: https://lore.kernel.org/r/faad30dc5219a01727f47db3dc2f029d07c82c00.1623309971.git.leonro@nvidia.com
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Allow creating FDB steering rules only when in switchdev mode.
The only software model where a userspace application can manipulate
FDB entries is when it manages the eswitch. This is only possible in
switchdev mode where we expose a single RDMA device with representors
for all the vports that are connected to the eswitch.
Fixes: 52438be44112 ("RDMA/mlx5: Allow inserting a steering rule to the FDB")
Link: https://lore.kernel.org/r/e928ae7c58d07f104716a2a8d730963d1bd01204.1623052923.git.leonro@nvidia.com
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
In a fork scenario, the parent and child can have same virtual address and
also share the uverbs fd. That causes to the list_for_each_entry search
return same doorbell physical page for all processes, even though that
page has been COW' or copied.
This patch takes the mm_struct into consideration during search, to make
sure that VA's belonging to different processes are not intermixed.
Resolves the malfunction of uverbs after fork in some specific cases.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Link: https://lore.kernel.org/r/feacc23fe0bc6e1088c6824d5583798745e72405.1622726212.git.leonro@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.13-rc4, including fixes from bpf, netfilter,
can and wireless trees. Notably including fixes for the recently
announced "FragAttacks" WiFi vulnerabilities. Rather large batch,
touching some core parts of the stack, too, but nothing hair-raising.
Current release - regressions:
- tipc: make node link identity publish thread safe
- dsa: felix: re-enable TAS guard band mode
- stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid()
- stmmac: fix system hang if change mac address after interface
ifdown
Current release - new code bugs:
- mptcp: avoid OOB access in setsockopt()
- bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers
- ethtool: stats: fix a copy-paste error - init correct array size
Previous releases - regressions:
- sched: fix packet stuck problem for lockless qdisc
- net: really orphan skbs tied to closing sk
- mlx4: fix EEPROM dump support
- bpf: fix alu32 const subreg bound tracking on bitwise operations
- bpf: fix mask direction swap upon off reg sign change
- bpf, offload: reorder offload callback 'prepare' in verifier
- stmmac: Fix MAC WoL not working if PHY does not support WoL
- packetmmap: fix only tx timestamp on request
- tipc: skb_linearize the head skb when reassembling msgs
Previous releases - always broken:
- mac80211: address recent "FragAttacks" vulnerabilities
- mac80211: do not accept/forward invalid EAPOL frames
- mptcp: avoid potential error message floods
- bpf, ringbuf: deny reserve of buffers larger than ringbuf to
prevent out of buffer writes
- bpf: forbid trampoline attach for functions with variable arguments
- bpf: add deny list of functions to prevent inf recursion of tracing
programs
- tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT
- can: isotp: prevent race between isotp_bind() and
isotp_setsockopt()
- netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check,
fallback to non-AVX2 version
Misc:
- bpf: add kconfig knob for disabling unpriv bpf by default"
* tag 'net-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (172 commits)
net: phy: Document phydev::dev_flags bits allocation
mptcp: validate 'id' when stopping the ADD_ADDR retransmit timer
mptcp: avoid error message on infinite mapping
mptcp: drop unconditional pr_warn on bad opt
mptcp: avoid OOB access in setsockopt()
nfp: update maintainer and mailing list addresses
net: mvpp2: add buffer header handling in RX
bnx2x: Fix missing error code in bnx2x_iov_init_one()
net: zero-initialize tc skb extension on allocation
net: hns: Fix kernel-doc
sctp: fix the proc_handler for sysctl encap_port
sctp: add the missing setting for asoc encap_port
bpf, selftests: Adjust few selftest result_unpriv outcomes
bpf: No need to simulate speculative domain for immediates
bpf: Fix mask direction swap upon off reg sign change
bpf: Wrap aux data inside bpf_sanitize_info container
bpf: Fix BPF_LSM kconfig symbol dependency
selftests/bpf: Add test for l3 use of bpf_redirect_peer
bpftool: Add sock_release help info for cgroup attach/prog load command
net: dsa: microchip: enable phy errata workaround on 9567
...
|
|
When executing DEVX command to query QP object, we need to take the QP
type from the mlx5_ib_qp struct which hold the driver specific QP types as
well, such as DC.
Fixes: 34613eb1d2ad ("IB/mlx5: Enable modify and query verbs objects via DEVX")
Link: https://lore.kernel.org/r/6eee15d63f09bb70787488e0cf96216e2957f5aa.1621413654.git.leonro@nvidia.com
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
mlx5_core_dev holds pointer to static profile, hence when the
log_max_qp of the profile is override by some device, then it
effect all other mlx5 devices that share the same profile.
Fix it by having a profile instance for every mlx5 device.
Fixes: 883371c453b9 ("net/mlx5: Check FW limitations on log_max_qp before setting it")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When there is fatal event on the slave port, the device is marked as not
active. We need to mark it as active again when the slave is recovered to
regain full functionality.
Fixes: d69a24e03659 ("IB/mlx5: Move IB event processing onto a workqueue")
Link: https://lore.kernel.org/r/8906754455bb23019ef223c725d2c0d38acfb80b.1620711734.git.leonro@nvidia.com
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Fix the complaint from smatch by verifing that the user requested DM
operation is not greater than 31.
divers/infiniband/hw/mlx5/dm.c:220 mlx5_ib_handler_MLX5_IB_METHOD_DM_MAP_OP_ADDR()
error: undefined (user controlled) shift '(((1))) << op'
Fixes: cea85fa5dbc2 ("RDMA/mlx5: Add support in MEMIC operations")
Link: https://lore.kernel.org/r/458b1d7710c3cf01360c8771893f483665569786.1620711734.git.leonro@nvidia.com
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Pull rdma updates from Jason Gunthorpe:
"This is significantly bug fixes and general cleanups. The noteworthy
new features are fairly small:
- XRC support for HNS and improves RQ operations
- Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw
and qib
- Quite a few general cleanups on spelling, error handling, static
checker detections, etc
- Increase the number of device ports supported beyond 255. High port
count software switches now exist
- Several bug fixes for rtrs
- mlx5 Device Memory support for host controlled atomics
- Report SRQ tables through to rdma-tool"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (145 commits)
IB/qib: Remove redundant assignment to ret
RDMA/nldev: Add copy-on-fork attribute to get sys command
RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
RDMA/siw: Fix a use after free in siw_alloc_mr
IB/hfi1: Remove redundant variable rcd
RDMA/nldev: Add QP numbers to SRQ information
RDMA/nldev: Return SRQ information
RDMA/restrack: Add support to get resource tracking for SRQ
RDMA/nldev: Return context information
RDMA/core: Add CM to restrack after successful attachment to a device
RDMA/cma: Skip device which doesn't support CM
RDMA/rxe: Fix a bug in rxe_fill_ip_info()
RDMA/mlx5: Expose private query port
RDMA/mlx4: Remove an unused variable
RDMA/mlx5: Fix type assignment for ICM DM
IB/mlx5: Set right RoCE l3 type and roce version while deleting GID
RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails
RDMA/cxgb4: add missing qpid increment
IB/ipoib: Remove unnecessary struct declaration
RDMA/bnxt_re: Get rid of custom module reference counting
...
|
|
Expose a non standard query port via IOCTL that will be used to expose
port attributes that are specific to mlx5 devices.
The new interface receives a port number to query and returns a structure
that contains the available attributes for that port. This will be used
to fill the gap between pure DEVX use cases and use cases where a kernel
needs to inform userspace about various kernel driver configurations that
userspace must use in order to work correctly.
Flags is used to indicate which fields are valid on return.
MLX5_IB_UAPI_QUERY_PORT_VPORT:
The vport number of the queered port.
MLX5_IB_UAPI_QUERY_PORT_VPORT_VHCA_ID:
The VHCA ID of the vport of the queered port.
MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_RX:
The vport's RX ICM address used for sw steering.
MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_TX:
The vport's TX ICM address used for sw steering.
MLX5_IB_UAPI_QUERY_PORT_VPORT_REG_C0:
The metadata used to tag egress packets of the vport.
MLX5_IB_UAPI_QUERY_PORT_ESW_OWNER_VHCA_ID:
The E-Switch owner vhca id of the vport.
Link: https://lore.kernel.org/r/6e2ef13e5a266a6c037eb0105eb1564c7bb52f23.1618743394.git.leonro@nvidia.com
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
We should hold the UAPI DM type in the base struct and not the internal
mlx5 type.
Fixes: 251b9d788750 ("RDMA/mlx5: Re-organize the DM code")
Link: https://lore.kernel.org/r/58dedbd5c132660f808e59166d434e2eaa6ecf7a.1618753425.git.leonro@nvidia.com
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Currently when GID is deleted, it zero out all the fields of the RoCE
address in the SET_ROCE_ADDRESS command for a specified index.
roce_version = 0 means RoCEv1 in the SET_ROCE_ADDRESS command.
This assumes that device has RoCEv1 always enabled which is not always
correct. For example Subfunction does not support RoCEv1.
Due to this assumption a previously added RoCEv2 GID is always deleted as
RoCEv1 GID. This results in a below syndrome:
mlx5_core.sf mlx5_core.sf.4: mlx5_cmd_check:777:(pid 4256): SET_ROCE_ADDRESS(0x761) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x12822d)
Hence set the right RoCE version during GID deletion provided by the core.
Link: https://lore.kernel.org/r/d3f54129c90ca329caf438dbe31875d8ad08d91a.1618753425.git.leonro@nvidia.com
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Maor Gottlieb says:
====================
This series from Maor extends MEMIC to support atomic operations from the
host in addition to already supported regular read/write.
====================
* 'memic_ops':
RDMA/mlx5: Expose UAPI to query DM
RDMA/mlx5: Add support in MEMIC operations
RDMA/mlx5: Add support to MODIFY_MEMIC command
RDMA/mlx5: Re-organize the DM code
RDMA/mlx5: Move all DM logic to separate file
RDMA/uverbs: Make UVERBS_OBJECT_METHODS to consider line number
net/mlx5: Add MEMIC operations related bits
|
|
Expose UAPI to query MEMIC DM, this will let user space application
that didn't allocate the DM but has access to by owning the matching
command FD to retrieve its information.
Link: https://lore.kernel.org/r/20210411122924.60230-8-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
MEMIC buffer, in addition to regular read and write operations, can
support atomic operations from the host.
Introduce and implement new UAPI to allocate address space for MEMIC
operations such as atomic. This includes:
1. Expose new IOCTL for request mapping of MEMIC operation.
2. Hold the operations address in a list, so same operation to same DM
will be allocated only once.
3. Manage refcount on the mlx5_ib_dm object, so it would be keep valid
until all addresses were unmapped.
Link: https://lore.kernel.org/r/20210411122924.60230-7-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add two functions to allocate and deallocate MEMIC operations
by using the MODIFY_MEMIC command.
Link: https://lore.kernel.org/r/20210411122924.60230-6-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
1. Inline the checks from check_dm_type_support() into their
respective allocation functions.
2. Fix use after free when driver fails to copy the MEMIC address to the
user by moving the allocation code into their respective functions,
hence we avoid the explicit call to free the DM in the error flow.
3. Split mlx5_ib_dm struct to memic and icm proper typesafety
throughout.
Fixes: dc2316eba73f ("IB/mlx5: Fix device memory flows")
Link: https://lore.kernel.org/r/20210411122924.60230-5-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Move all device memory related code to a separate file.
Link: https://lore.kernel.org/r/20210411122924.60230-4-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
This pr contains changes from mlx5-next branch,
already reviewed on netdev and rdma mailing lists, links below.
1) From Leon, Dynamically assign MSI-X vectors count
Already Acked by Bjorn Helgaas.
https://patchwork.kernel.org/project/netdevbpf/cover/20210314124256.70253-1-leon@kernel.org/
2) Cleanup series:
https://patchwork.kernel.org/project/netdevbpf/cover/20210311070915.321814-1-saeed@kernel.org/
From Mark, E-Switch cleanups and refactoring, and the addition
of single FDB mode needed HW bits.
From Mikhael, Remove unused struct field
From Saeed, Cleanup W=1 prototype warning
From Zheng, Esw related cleanup
From Tariq, User order-0 page allocation for EQs
====================
* mlx5-next:
net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks
net/mlx5: Dynamically assign MSI-X vectors count
net/mlx5: Add dynamic MSI-X capabilities bits
PCI/IOV: Add sysfs MSI-X vector assignment interface
net/mlx5: Use order-0 allocations for EQs
net/mlx5: Add IFC bits needed for single FDB mode
net/mlx5: E-Switch, Refactor send to vport to be more generic
RDMA/mlx5: Use representor E-Switch when getting netdev and metadata
net/mlx5: E-Switch, Add eswitch pointer to each representor
net/mlx5: E-Switch, Add match on vhca id to default send rules
net/mlx5: Remove unused mlx5_core_health member recover_work
net/mlx5: simplify the return expression of mlx5_esw_offloads_pair()
net/mlx5: Cleanup prototype warning
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
mlx5-next 2021-04-09
This pr contains changes from mlx5-next branch,
already reviewed on netdev and rdma mailing lists, links below.
1) From Leon, Dynamically assign MSI-X vectors count
Already Acked by Bjorn Helgaas.
https://patchwork.kernel.org/project/netdevbpf/cover/20210314124256.70253-1-leon@kernel.org/
2) Cleanup series:
https://patchwork.kernel.org/project/netdevbpf/cover/20210311070915.321814-1-saeed@kernel.org/
From Mark, E-Switch cleanups and refactoring, and the addition
of single FDB mode needed HW bits.
From Mikhael, Remove unused struct field
From Saeed, Cleanup W=1 prototype warning
From Zheng, Esw related cleanup
From Tariq, User order-0 page allocation for EQs
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks
net/mlx5: Dynamically assign MSI-X vectors count
net/mlx5: Add dynamic MSI-X capabilities bits
PCI/IOV: Add sysfs MSI-X vector assignment interface
net/mlx5: Use order-0 allocations for EQs
net/mlx5: Add IFC bits needed for single FDB mode
net/mlx5: E-Switch, Refactor send to vport to be more generic
RDMA/mlx5: Use representor E-Switch when getting netdev and metadata
net/mlx5: E-Switch, Add eswitch pointer to each representor
net/mlx5: E-Switch, Add match on vhca id to default send rules
net/mlx5: Remove unused mlx5_core_health member recover_work
net/mlx5: simplify the return expression of mlx5_esw_offloads_pair()
net/mlx5: Cleanup prototype warning
====================
Link: https://lore.kernel.org/r/20210409200704.10886-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To update xlt (during mlx5_ib_reg_user_mr()), the driver can request up to
1 MB (order-8) memory, depending on the size of the MR. This costly
allocation can sometimes take very long to return (a few seconds). This
causes the calling application to hang for a long time, especially when
the system is fragmented. To avoid these long latency spikes, the calls
the higher order allocations need to fail faster in case they are not
available.
In order to acheive this we need __GFP_NORETRY flag in the gfp_mask before
during fetching the free pages. Allow the algorithm to automatically fall
back to smaller page sizes.
Link: https://lore.kernel.org/r/1617425635-35631-1-git-send-email-praveen.kannoju@oracle.com
Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Currently, ODP caps are set during the init stage of mlx5_ib_dev,
regardless of whether the device profile supports ODP or not. There is no
point in setting ODP caps if the device profile doesn't support
ODP. Hence, move setting the ODP caps to the odp_init stage.
Link: https://lore.kernel.org/r/20210318135259.681264-1-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Initial drop action support missed that drop action can be added to egress
flow tables as well. Add the missing support.
This requires making sure that dest_type isn't set to PORT which in turn
exposes a possibility of passing dst while indicating number of dsts as
zero. Explicitly check for number of dsts and pass the appropriate
pointer.
Fixes: f29de9eee782 ("RDMA/mlx5: Add support for drop action in DV steering")
Link: https://lore.kernel.org/r/20210318135123.680759-1-leon@kernel.org
Reviewed-by: Mark Bloch <markb@nvidia.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Current code uses many different types when dealing with a port of a RDMA
device: u8, unsigned int and u32. Switch to u32 to clean up the logic.
This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8 so keep those places in order not to break
UAPIs. HW/Spec defined values must also not be changed.
With the switch to u32 we now can support devices with more than 255
ports. U32_MAX is reserved to make control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon this seems like a non issue.
When a device with more than 255 ports is created uverbs will report the
RDMA device as having 255 ports as this is the max currently supported.
The verbs interface is not changed yet because the IBTA spec limits the
port size in too many places to be u8 and all applications that relies in
verbs won't be able to cope with this change. At this stage, we are
extending the interfaces that are using vendor channel solely
Once the limitation is lifted mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. As the only instance of an RDMA
device that reports more than 255 ports will be a representor device and
it exposes itself as a RAW Ethernet only device CM/MAD/IPoIB and other
ULPs aren't effected by this change and their sysfs/interfaces that are
exposes to userspace can remain unchanged.
While here cleanup some alignment issues and remove unneeded sanity
checks (mainly in rdmavt),
Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Pull rdma fixes from Jason Gunthorpe:
"Not much going on, just some small bug fixes:
- Typo causing a regression in mlx5 devx
- Regression in the recent hns rework causing the HW to get out of
sync
- Long-standing cxgb4 adaptor crash when destroying cm ids"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server
RDMA/hns: Fix bug during CMDQ initialization
RDMA/mlx5: Fix typo in destroy_mkey inbox
|
|
There is no need to create the ODP EQ if the user doesn't use ODP MRs.
Hence, create it only when the first ODP MR is created. This EQ will be
destroyed only when the device is unloaded.
This will decrease the number of EQs created per device. for example: If
we creates 1K devices (SF/VF/etc'), than we will decrease the num of EQs
by 1K.
Link: https://lore.kernel.org/r/20210314125418.179716-1-leon@kernel.org
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Fix the following smatch error:
drivers/infiniband/hw/mlx5/mr.c:1950 mlx5_ib_dereg_mr() error: uninitialized symbol 'rc'.
Fixes: e6fb246ccafb ("RDMA/mlx5: Consolidate MR destruction to mlx5_ib_dereg_mr()")
Link: https://lore.kernel.org/r/20210314082250.10143-1-leon@kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Now that each representor stores a pointer to the managing E-Switch
use that information when creating the send-to-vport rules.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Now that a pointer to the managing E-Switch is stored in the representor
use it.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The umem DMA list calculation was locked at 4k pages due to confusion
around how this API works and is used when larger pages are present.
The conclusion is:
- umem's cannot extend past what is mapped into the process, so creating
a lage page size and referring to a sub-range is not allowed
- umem's must always have a page offset of zero, except for sub PAGE_SIZE
umems
- The feature of umem_offset to create multiple objects inside a umem
is buggy and isn't used anyplace. Thus we can assume all users of the
current API have umem_offset == 0 as well
Provide a new page size calculator that limits the DMA list to the VA
range and enforces umem_offset == 0.
Allow user space to specify the page sizes which it can accept, this
bitmap must be derived from the intended use of the umem, based on
per-usage HW limitations.
Link: https://lore.kernel.org/r/20210304130501.1102577-4-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Change uverbs_get_const/uverbs_get_const_default to work properly with
both signed/unsigned parameters.
Current APIs mix s64 and u64 which leads to incorrect check when u64
value was supplied and its upper bit was set. In that case
uverbs_get_const() / uverbs_get_const_default() lower bound check may
fail unexpectedly, target is unsigned (lower bound is 0) but value
became negative as of the s64 usage.
Split to have two different APIs, no change to callers as the required
API will be called internally according to the target type.
Link: https://lore.kernel.org/r/20210304130501.1102577-3-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Correct the map between mlx5 rates and corresponding ib rates, as they
don't always have a fixed offset between them.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Link: https://lore.kernel.org/r/20210304124517.1100608-4-leon@kernel.org
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
mlx5_is_roce_enabled returns the devlink RoCE init value, therefore it
should be used only when driver is loaded. Instead we just need to read
the roce_en field.
In addition, rename mlx5_is_roce_enabled to mlx5_is_roce_init_enabled.
Fixes: 7a58779edd75 ("IB/mlx5: Improve query port for representor port")
Link: https://lore.kernel.org/r/20210304124517.1100608-2-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Now that this is only used in a few places in mr.c give it a sensible
name. It has nothing to do with the cache and can be invoked on any
MR. DMA is stopped and the user cannot touch the MR any further once it
completes.
Link: https://lore.kernel.org/r/20210304120745.1090751-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Now that the SRCU stuff has been removed the entire MR destroy logic can
be made a lot simpler. Currently there are many different ways to destroy a
MR and it makes it really hard to do this task correctly. Route all
destruction through mlx5_ib_dereg_mr() and make it work for all
situations.
Since it turns out all the different MR types do basically the same thing
this removes a lot of knowledge of MR internals from ODP and leaves ODP
just exporting an operation to clean up children.
This fixes a few weird corner cases bugs and firmly uses the correct
ordering of the MR destruction:
- Stop parallel access to the mkey via the ODP xarray
- Stop DMA
- Release the umem
- Clean up ODP children
- Free/Recycle the MR
Link: https://lore.kernel.org/r/20210304120745.1090751-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The struct mlx5_ib_mr can be used for three different things, but only one
at a time:
- In the user MR cache
- As a kernel MR
- As a user MR
Overlay the three things into a single union with the following rules:
- If the mr is found on the cache_ent->head list then it is a cache MR
and umem == NULL. The entire union is zero after the MR is removed from
the cache.
- If umem != NULL or type == IB_MR_TYPE_USER then it is a user MR.
- If umem == NULL then it is a kernel MR
This reduces the size of struct mlx5_ib_mr to 552 bytes from 702.
The only place the three flows overlap in the code is during dereg, so add
a few extra checks along there.
Link: https://lore.kernel.org/r/20210304120745.1090751-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
All of the ODP code assumes when it calls mlx5_mr_cache_alloc() the ODP
related fields are zero'd. This is true if the MR was just allocated, but
if the MR is recycled through the cache then the values are never zero'd.
This causes a bug in the odp_stats, they don't reset when the MR is
reallocated, also is_odp_implicit is never 0'd.
So we can use memset on a block of the mlx5_ib_mr reorganize the structure
to put all the data that can be zero'd by the cache at the end.
It is organized as an anonymous struct because the next patch will make
this a union.
Delete the unused smr_info. Don't set the kernel only desc_size on the
user path. No longer any need to zero mr->parent before freeing it, the
memset() will get it now.
Fixes: a3de94e3d61e ("IB/mlx5: Introduce ODP diagnostic counters")
Link: https://lore.kernel.org/r/20210304120745.1090751-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Set mkey_index to the correct place when preparing the destroy_mkey inbox,
this will fix the following syndrome:
mlx5_core 0000:08:00.0: mlx5_cmd_check:782:(pid 980716): DEALLOC_PD(0x801)
op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x31b635)
Link: https://lore.kernel.org/r/20210311142024.1282004-1-leon@kernel.org
Fixes: 1368ead04c36 ("RDMA/mlx5: Use strict get/set operations for obj_id")
Reviewed-by: Ido Kalir <idok@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
1. Don't set the ts_format bit to default when it reserved - device is
running in the old mode (free running).
2. XRC doesn't have a CQ therefore the ts format in the QP
context should be default / free running.
3. Set ts_format to WQ.
Fixes: 2fe8d4b87802 ("RDMA/mlx5: Fail QP creation if the device can not support the CQE TS")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The W=1 allmodconfig build produces the following warning:
drivers/infiniband/hw/mlx5/odp.c:1086: warning: wrong kernel-doc identifier on line:
* Parse a series of data segments for page fault handling.
Fix it by changing /** to be /* as it is written in kernel-doc
documentation.
Fixes: 5e769e444d26 ("RDMA/hw/mlx5/odp: Fix formatting and add missing descriptions in 'pagefault_data_segments()'")
Link: https://lore.kernel.org/r/20210302074214.1054299-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Set err to -ENOMEM if kzalloc fails instead of 0.
Fixes: 759738537142 ("IB/mlx5: Enable subscription for device events over DEVX")
Link: https://lore.kernel.org/r/20210222122343.19720-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Linux 5.11
Merged to resolve conflicts with RDMA rc commits
- drivers/infiniband/sw/rxe/rxe_net.c
The final logic is to call rxe_get_dev_from_net() again with the master
netdev if the packet was rx'd on a vlan. To keep the elimination of the
local variables requires a trivial edit to the code in -rc
Link: https://lore.kernel.org/r/20210210131542.215ea67c@canb.auug.org.au
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|