Age | Commit message (Collapse) | Author | Files | Lines |
|
commit 22bb13653410424d9fce8d447506a41f8292f22f upstream.
There is no reason for a different pad buffer for the two
packet types.
Expand the current buffer allocation to allow for both
packet types.
Fixes: f8195f3b14a0 ("IB/hfi1: Eliminate allocation while atomic")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20191004204934.26838.13099.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a9c3c4c597704b3a1a2b9bef990e7d8a881f6533 upstream.
If an hfi1 card is inserted in a Gen4 systems, the driver will avoid the
gen3 speed bump and the card will operate at half speed.
This is because the driver avoids the gen3 speed bump when the parent bus
speed isn't identical to gen3, 8.0GT/s. This is not compatible with gen4
and newer speeds.
Fix by relaxing the test to explicitly look for the lower capability
speeds which inherently allows for gen4 and all future speeds.
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20191101192059.106248.1699.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: James Erwin <james.erwin@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ce8e8087cf3b5b4f19d29248bfc7deef95525490 upstream.
Normal RDMA WRITE request never returns IB_WC_RNR_RETRY_EXC_ERR to ULPs
because it does not need post receive buffer on the responder side.
Consequently, as an enhancement to normal RDMA WRITE request inside the
hfi1 driver, TID RDMA WRITE request should not return such an error status
to ULPs, although it does receive RNR NAKs from the responder when TID
resources are not available. This behavior is violated when
qp->s_rnr_retry_cnt is set in current hfi1 implementation.
This patch enforces these semantics by avoiding any reaction to the updates
of the RNR QP attributes.
Fixes: 3c6cb20a0d17 ("IB/hfi1: Add TID RDMA WRITE functionality into RDMA verbs")
Link: https://lore.kernel.org/r/20191025195842.106825.71532.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c2be3865a1763c4be39574937e1aae27e917af4d upstream.
For a TID RDMA WRITE request, a QP on the responder side could be put into
a queue when a hardware flow is not available. A RNR NAK will be returned
to the requester with a RNR timeout value based on the position of the QP
in the queue. The tid_rdma_flow_wt variable is used to calculate the
timeout value and is determined by using a MTU of 4096 at the module
loading time. This could reduce the timeout value by half from the desired
value, leading to excessive RNR retries.
This patch fixes the issue by calculating the flow weight with the real
MTU assigned to the QP.
Fixes: 07b923701e38 ("IB/hfi1: Add functions to receive TID RDMA WRITE request")
Link: https://lore.kernel.org/r/20191025195836.106825.77769.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c1abd865bd125015783286b353abb8da51644f59 upstream.
The index r_tid_ack is used to indicate the next TID RDMA WRITE request to
acknowledge in the ring s_ack_queue[] on the responder side and should be
set to a valid index other than its initial value before r_tid_tail is
advanced to the next TID RDMA WRITE request and particularly before a TID
RDMA ACK is built. Otherwise, a NULL pointer dereference may result:
BUG: unable to handle kernel paging request at ffff9a32d27abff8
IP: [<ffffffffc0d87ea6>] hfi1_make_tid_rdma_pkt+0x476/0xcb0 [hfi1]
PGD 2749032067 PUD 0
Oops: 0000 1 SMP
Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) ib_ipoib(OE) hfi1(OE) rdmavt(OE) nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_isert iscsi_target_mod target_core_mod ib_ucm dm_mirror dm_region_hash dm_log mlx5_ib dm_mod zfs(POE) rpcrdma sunrpc rdma_ucm ib_uverbs opa_vnic ib_iser zunicode(POE) ib_umad zavl(POE) icp(POE) sb_edac intel_powerclamp coretemp rdma_cm intel_rapl iosf_mbi iw_cm libiscsi scsi_transport_iscsi kvm ib_cm iTCO_wdt mxm_wmi iTCO_vendor_support irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd zcommon(POE) znvpair(POE) pcspkr spl(OE) mei_me
sg mei ioatdma lpc_ich joydev i2c_i801 shpchp ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 mlx5_core drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ixgbe ahci ttm mlxfw ib_core libahci devlink mdio crct10dif_pclmul crct10dif_common drm ptp libata megaraid_sas crc32c_intel i2c_algo_bit pps_core i2c_core dca [last unloaded: rdmavt]
CPU: 15 PID: 68691 Comm: kworker/15:2H Kdump: loaded Tainted: P W OE ------------ 3.10.0-862.2.3.el7_lustre.x86_64 #1
Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016
Workqueue: hfi0_0 _hfi1_do_tid_send [hfi1]
task: ffff9a01f47faf70 ti: ffff9a11776a8000 task.ti: ffff9a11776a8000
RIP: 0010:[<ffffffffc0d87ea6>] [<ffffffffc0d87ea6>] hfi1_make_tid_rdma_pkt+0x476/0xcb0 [hfi1]
RSP: 0018:ffff9a11776abd08 EFLAGS: 00010002
RAX: ffff9a32d27abfc0 RBX: ffff99f2d27aa000 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000220 RDI: ffff99f2ffc05300
RBP: ffff9a11776abd88 R08: 000000000001c310 R09: ffffffffc0d87ad4
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9a117a423c00
R13: ffff9a117a423c00 R14: ffff9a03500c0000 R15: ffff9a117a423cb8
FS: 0000000000000000(0000) GS:ffff9a117e9c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff9a32d27abff8 CR3: 0000002748a0e000 CR4: 00000000001607e0
Call Trace:
[<ffffffffc0d88874>] _hfi1_do_tid_send+0x194/0x320 [hfi1]
[<ffffffffaf0b2dff>] process_one_work+0x17f/0x440
[<ffffffffaf0b3ac6>] worker_thread+0x126/0x3c0
[<ffffffffaf0b39a0>] ? manage_workers.isra.24+0x2a0/0x2a0
[<ffffffffaf0bae31>] kthread+0xd1/0xe0
[<ffffffffaf0bad60>] ? insert_kthread_work+0x40/0x40
[<ffffffffaf71f5f7>] ret_from_fork_nospec_begin+0x21/0x21
[<ffffffffaf0bad60>] ? insert_kthread_work+0x40/0x40
hfi1 0000:05:00.0: hfi1_0: reserved_op: opcode 0xf2, slot 2, rsv_used 1, rsv_ops 1
Code: 00 00 41 8b 8d d8 02 00 00 89 c8 48 89 45 b0 48 c1 65 b0 06 48 8b 83 a0 01 00 00 48 01 45 b0 48 8b 45 b0 41 80 bd 10 03 00 00 00 <48> 8b 50 38 4c 8d 7a 50 74 45 8b b2 d0 00 00 00 85 f6 0f 85 72
RIP [<ffffffffc0d87ea6>] hfi1_make_tid_rdma_pkt+0x476/0xcb0 [hfi1]
RSP <ffff9a11776abd08>
CR2: ffff9a32d27abff8
This problem can happen if a RESYNC request is received before r_tid_ack
is modified.
This patch fixes the issue by making sure that r_tid_ack is set to a valid
value before a TID RDMA ACK is built. Functions are defined to simplify
the code.
Fixes: 07b923701e38 ("IB/hfi1: Add functions to receive TID RDMA WRITE request")
Fixes: 7cf0ad679de4 ("IB/hfi1: Add a function to receive TID RDMA RESYNC packet")
Link: https://lore.kernel.org/r/20191025195830.106825.44022.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b681a0529968d2261aa15d7a1e78801b2c06bb07 ]
eq->buf_list->buf and eq->buf_list should also be freed when eqe_hop_num
is set to 0, or there will be memory leaks.
Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08")
Link: https://lore.kernel.org/r/1572072995-11277-3-git-send-email-liweihang@hisilicon.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d4934f45693651ea15357dd6c7c36be28b6da884 ]
_put_ep_safe() and _put_pass_ep_safe() free the skb before it is freed by
process_work(). fix double free by freeing the skb only in process_work().
Fixes: 1dad0ebeea1c ("iw_cxgb4: Avoid touch after free error in ARP failure handlers")
Link: https://lore.kernel.org/r/1572006880-5800-1-git-send-email-bharat@chelsio.com
Signed-off-by: Dakshaja Uppalapati <dakshaja@chelsio.com>
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a15542bb72a48042f5df7475893d46f725f5f9fb ]
The counter resource should return -EAGAIN if it was requested for a
different port, this is similar to how QP works if the users provides a
port filter.
Otherwise port filtering in netlink will return broken counter nests.
Fixes: c4ffee7c9bdb ("RDMA/netlink: Implement counter dumpit calback")
Link: https://lore.kernel.org/r/20191020062800.8065-1-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a9018adfde809d44e71189b984fa61cc89682b5e ]
The issue is in drivers/infiniband/core/uverbs_std_types_cq.c in the
UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE) function. We check that:
if (attr.comp_vector >= attrs->ufile->device->num_comp_vectors) {
But we don't check if "attr.comp_vector" is negative. It could
potentially lead to an array underflow. My concern would be where
cq->vector is used in the create_cq() function from the cxgb4 driver.
And really "attr.comp_vector" is appears as a u32 to user space so that's
the right type to use.
Fixes: 9ee79fce3642 ("IB/core: Add completion queue (cq) object actions")
Link: https://lore.kernel.org/r/20191011133419.GA22905@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 777a8b32bc0f9bb25848a025f72a9febc30d9033 ]
Current code tries to derive VLAN ID and compares it with GID
attribute for matching entry. This raw search fails on macvlan
netdevice as its not a VLAN device, but its an upper device of a VLAN
netdevice.
Due to this limitation, incoming QP1 packets fail to match in the
GID table. Such packets are dropped.
Hence, to support it, use the existing rdma_read_gid_l2_fields()
that takes care of diffferent device types.
Fixes: dbf727de7440 ("IB/core: Use GID table in AH creation and dmac resolution")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20191002121750.17313-1-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b806c94ee44e53233b8ce6c92d9078d9781786a5 ]
Remove spaces from the reported firmware version string.
Actual value:
$ cat /sys/class/infiniband/qedr0/fw_ver
8. 37. 7. 0
Expected value:
$ cat /sys/class/infiniband/qedr0/fw_ver
8.37.7.0
Fixes: ec72fce401c6 ("qedr: Add support for RoCE HW init")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Michal KalderonĀ <michal.kalderon@marvell.com>
Link: https://lore.kernel.org/r/20191007210730.7173-1-kamalheib1@gmail.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e17fa5c95ef2434a08e0be217969d246d037f0c2 ]
As siw_free_qp() is the last routine to access 'siw_base_qp' structure,
freeing this structure early in siw_destroy_qp() could cause
touch-after-free issue.
Hence, moved kfree(siw_base_qp) from siw_destroy_qp() to siw_free_qp().
Fixes: 303ae1cdfdf7 ("rdma/siw: application interface")
Signed-off-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Link: https://lore.kernel.org/r/20191007104229.29412-1-krishna2@chelsio.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 612e0486ad0845c41ac10492e78144f99e326375 ]
pass_accept_req() is using the same skb for handling accept request and
sending accept reply to HW. Here req and rpl structures are pointing to
same skb->data which is over written by INIT_TP_WR() and leads to
accessing corrupt req fields in accept_cr() while checking for ECN flags.
Reordered code in accept_cr() to fetch correct req fields.
Fixes: 92e7ae7172 ("iw_cxgb4: Choose appropriate hw mtu index and ISS for iWARP connections")
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Link: https://lore.kernel.org/r/20191003104353.11590-1-bharat@chelsio.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c8973df2da677f375f8b12b6eefca2f44c8884d5 ]
Before QP is closed it changes to ERROR state, when this happens
the QP was left with old rate limit that was already removed from
the table.
Fixes: 7d29f349a4b9 ("IB/mlx5: Properly adjust rate limit on QP state transitions")
Signed-off-by: Rafi Wiener <rafiw@mellanox.com>
Signed-off-by: Oleg Kuporosov <olegk@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20191002120243.16971-1-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1524b12a6e02a85264af4ed208b034a2239ef374 ]
The mkey_table xarray is touched by the reg_mr_callback() function which
is called from a hard irq. Thus all other uses of xa_lock must use the
_irq variants.
WARNING: inconsistent lock state
5.4.0-rc1 #12 Not tainted
--------------------------------
inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
python3/343 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff888182be1d40 (&(&xa->xa_lock)->rlock#3){?.-.}, at: xa_erase+0x12/0x30
{IN-HARDIRQ-W} state was registered at:
lock_acquire+0xe1/0x200
_raw_spin_lock_irqsave+0x35/0x50
reg_mr_callback+0x2dd/0x450 [mlx5_ib]
mlx5_cmd_exec_cb_handler+0x2c/0x70 [mlx5_core]
mlx5_cmd_comp_handler+0x355/0x840 [mlx5_core]
[..]
Possible unsafe locking scenario:
CPU0
----
lock(&(&xa->xa_lock)->rlock#3);
<Interrupt>
lock(&(&xa->xa_lock)->rlock#3);
*** DEADLOCK ***
2 locks held by python3/343:
#0: ffff88818eb4bd38 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_ioctl+0xe5/0x1e0 [ib_uverbs]
#1: ffff888176c76d38 (&file->hw_destroy_rwsem){++++}, at: uobj_destroy+0x2d/0x90 [ib_uverbs]
stack backtrace:
CPU: 3 PID: 343 Comm: python3 Not tainted 5.4.0-rc1 #12
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x86/0xca
print_usage_bug.cold.50+0x2e5/0x355
mark_lock+0x871/0xb50
? match_held_lock+0x20/0x250
? check_usage_forwards+0x240/0x240
__lock_acquire+0x7de/0x23a0
? __kasan_check_read+0x11/0x20
? mark_lock+0xae/0xb50
? mark_held_locks+0xb0/0xb0
? find_held_lock+0xca/0xf0
lock_acquire+0xe1/0x200
? xa_erase+0x12/0x30
_raw_spin_lock+0x2a/0x40
? xa_erase+0x12/0x30
xa_erase+0x12/0x30
mlx5_ib_dealloc_mw+0x55/0xa0 [mlx5_ib]
uverbs_dealloc_mw+0x3c/0x70 [ib_uverbs]
uverbs_free_mw+0x1a/0x20 [ib_uverbs]
destroy_hw_idr_uobject+0x49/0xa0 [ib_uverbs]
[..]
Fixes: 0417791536ae ("RDMA/mlx5: Add missing synchronize_srcu() for MW cases")
Link: https://lore.kernel.org/r/20191024234910.GA9038@ziepe.ca
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 9ed5bd7d22241ad232fd3a5be404e83eb6cadc04 upstream.
A TID RDMA READ request could be retried under one of the following
conditions:
- The RC retry timer expires;
- A later TID RDMA READ RESP packet is received before the next
expected one.
For the latter, under normal conditions, the PSN in IB space is used
for comparison. More specifically, the IB PSN in the incoming TID RDMA
READ RESP packet is compared with the last IB PSN of a given TID RDMA
READ request to determine if the request should be retried. This is
similar to the retry logic for noraml RDMA READ request.
However, if a TID RDMA READ RESP packet is lost due to congestion,
header suppresion will be disabled and each incoming packet will raise
an interrupt until the hardware flow is reloaded. Under this condition,
each packet KDETH PSN will be checked by software against r_next_psn
and a retry will be requested if the packet KDETH PSN is later than
r_next_psn. Since each TID RDMA READ segment could have up to 64
packets and each TID RDMA READ request could have many segments, we
could make far more retries under such conditions, and thus leading to
RETRY_EXC_ERR status.
This patch fixes the issue by removing the retry when the incoming
packet KDETH PSN is later than r_next_psn. Instead, it resorts to
RC timer and normal IB PSN comparison for any request retry.
Fixes: 9905bf06e890 ("IB/hfi1: Add functions to receive TID RDMA READ response")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20191004204035.26542.41684.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0417791536ae1e28d7f0418f1d20048ec4d3c6cf ]
While MR uses live as the SRCU 'update', the MW case uses the xarray
directly, xa_erase() causes the MW to become inaccessible to the pagefault
thread.
Thus whenever a MW is removed from the xarray we must synchronize_srcu()
before freeing it.
This must be done before freeing the mkey as re-use of the mkey while the
pagefault thread is using the stale mkey is undesirable.
Add the missing synchronizes to MW and DEVX indirect mkey and delete the
bogus protection against double destroy in mlx5_core_destroy_mkey()
Fixes: 534fd7aac56a ("IB/mlx5: Manage indirection mkey upon DEVX flow for ODP")
Fixes: 6aec21f6a832 ("IB/mlx5: Page faults handling infrastructure")
Link: https://lore.kernel.org/r/20191001153821.23621-7-jgg@ziepe.ca
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit aa116b810ac9077a263ed8679fb4d595f180e0eb ]
During destroy setting live = 0 and then synchronize_srcu() prevents
num_pending_prefetch from incrementing, and also, ensures that all work
holding that count is queued on the WQ. Testing before causes races of the
form:
CPU0 CPU1
dereg_mr()
mlx5_ib_advise_mr_prefetch()
srcu_read_lock()
num_pending_prefetch_inc()
if (!live)
live = 0
atomic_read() == 0
// skip flush_workqueue()
atomic_inc()
queue_work();
srcu_read_unlock()
WARN_ON(atomic_read()) // Fails
Swap the order so that the synchronize_srcu() prevents this.
Fixes: a6bc3875f176 ("IB/mlx5: Protect against prefetch of invalid MR")
Link: https://lore.kernel.org/r/20191001153821.23621-5-jgg@ziepe.ca
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 880505cfef1d086d18b59d2920eb2160429ffa1f ]
This code is completely broken, the umem of a ODP MR simply cannot be
discarded without a lot more locking, nor can an ODP mkey be blithely
destroyed via destroy_mkey().
Fixes: 6aec21f6a832 ("IB/mlx5: Page faults handling infrastructure")
Link: https://lore.kernel.org/r/20191001153821.23621-2-jgg@ziepe.ca
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 594e6c5d41ed2471ab0b90f3f0b66cdf618b7ac9 ]
Properly unwind QP counter rebinding in case of failure.
Trying to rebind the counter after unbiding it is not going to work
reliably, move the unbind to the end so it doesn't have to be unwound.
Fixes: b389327df905 ("RDMA/nldev: Allow counter manual mode configration through RDMA netlink")
Link: https://lore.kernel.org/r/20191002115627.16740-1-leon@kernel.org
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 94635c36f3854934a46d9e812e028d4721bbb0e6 ]
In the process of moving the debug counters sysfs entries, the commit
mentioned below eliminated the cm_infiniband sysfs directory.
This sysfs directory was tied to the cm_port object allocated in procedure
cm_add_one().
Before the commit below, this cm_port object was freed via a call to
kobject_put(port->kobj) in procedure cm_remove_port_fs().
Since port no longer uses its kobj, kobject_put(port->kobj) was eliminated.
This, however, meant that kfree was never called for the cm_port buffers.
Fix this by adding explicit kfree(port) calls to functions cm_add_one()
and cm_remove_one().
Note: the kfree call in the first chunk below (in the cm_add_one error
flow) fixes an old, undetected memory leak.
Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device")
Link: https://lore.kernel.org/r/20190916071154.20383-2-leon@kernel.org
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ab59ca3eb4e7059727df85eee68bda169d26c8f8 ]
According to surrounding error paths, it is likely that 'goto err_get;' is
expected here. Otherwise, a call to 'rdma_restrack_put(res);' would be
missing.
Fixes: c5dfe0ea6ffa ("RDMA/nldev: Add resource tracker doit callback")
Link: https://lore.kernel.org/r/20190818091044.8845-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b66f31efbdad95ec274345721d99d1d835e6de01 ]
This patch fixes the lock inversion complaint:
============================================
WARNING: possible recursive locking detected
5.3.0-rc7-dbg+ #1 Not tainted
--------------------------------------------
kworker/u16:6/171 is trying to acquire lock:
00000000035c6e6c (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x78/0x4a0 [rdma_cm]
but task is already holding lock:
00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&id_priv->handler_mutex);
lock(&id_priv->handler_mutex);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by kworker/u16:6/171:
#0: 00000000e2eaa773 ((wq_completion)iw_cm_wq){+.+.}, at: process_one_work+0x472/0xac0
#1: 000000001efd357b ((work_completion)(&work->work)#3){+.+.}, at: process_one_work+0x476/0xac0
#2: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm]
stack backtrace:
CPU: 3 PID: 171 Comm: kworker/u16:6 Not tainted 5.3.0-rc7-dbg+ #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: iw_cm_wq cm_work_handler [iw_cm]
Call Trace:
dump_stack+0x8a/0xd6
__lock_acquire.cold+0xe1/0x24d
lock_acquire+0x106/0x240
__mutex_lock+0x12e/0xcb0
mutex_lock_nested+0x1f/0x30
rdma_destroy_id+0x78/0x4a0 [rdma_cm]
iw_conn_req_handler+0x5c9/0x680 [rdma_cm]
cm_work_handler+0xe62/0x1100 [iw_cm]
process_one_work+0x56d/0xac0
worker_thread+0x7a/0x5d0
kthread+0x1bc/0x210
ret_from_fork+0x24/0x30
This is not a bug as there are actually two lock classes here.
Link: https://lore.kernel.org/r/20190930231707.48259-3-bvanassche@acm.org
Fixes: de910bd92137 ("RDMA/cma: Simplify locking needed for serialization of callbacks")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 91724c1e5afe45b64970036170659726e7dc5cff ]
dump_qp() is wrongly trying to dump SRQ structures as QP when SRQ is used
by the application. This patch matches the QPID before dumping them. Also
removes unwanted SRQ id addition to QP id xarray.
Fixes: 2f43129127e6 ("cxgb4: Convert qpidr to XArray")
Link: https://lore.kernel.org/r/20190930074119.20046-1-bharat@chelsio.com
Signed-off-by: Rahul Kundu <rahul.kundu@chelsio.com>
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 34b3be18a04ecdc610aae4c48e5d1b799d8689f6 ]
In sdma_init if rhashtable_init fails the allocated memory for
tmp_sdma_rht should be released.
Fixes: 5a52a7acf7e2 ("IB/hfi1: NULL pointer dereference when freeing rhashtable")
Link: https://lore.kernel.org/r/20190925144543.10141-1-navid.emamdoost@gmail.com
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit df791c54d627bae53c9be3be40a69594c55de487 ]
In siw_qp_llp_write_space(), 'sock' members should be accessed with
sk_callback_lock held, otherwise, it could race with
siw_sk_restore_upcalls(). And this could cause "NULL deref" panic. Below
panic is due to the NULL cep returned from sk_to_cep(sk):
Call Trace:
<IRQ> siw_qp_llp_write_space+0x11/0x40 [siw]
tcp_check_space+0x4c/0xf0
tcp_rcv_established+0x52b/0x630
tcp_v4_do_rcv+0xf4/0x1e0
tcp_v4_rcv+0x9b8/0xab0
ip_protocol_deliver_rcu+0x2c/0x1c0
ip_local_deliver_finish+0x44/0x50
ip_local_deliver+0x6b/0xf0
? ip_protocol_deliver_rcu+0x1c0/0x1c0
ip_rcv+0x52/0xd0
? ip_rcv_finish_core.isra.14+0x390/0x390
__netif_receive_skb_one_core+0x83/0xa0
netif_receive_skb_internal+0x73/0xb0
napi_gro_frags+0x1ff/0x2b0
t4_ethrx_handler+0x4a7/0x740 [cxgb4]
process_responses+0x2c9/0x590 [cxgb4]
? t4_sge_intr_msix+0x1d/0x30 [cxgb4]
? handle_irq_event_percpu+0x51/0x70
? handle_irq_event+0x41/0x60
? handle_edge_irq+0x97/0x1a0
napi_rx_handler+0x14/0xe0 [cxgb4]
net_rx_action+0x2af/0x410
__do_softirq+0xda/0x2a8
do_softirq_own_stack+0x2a/0x40
</IRQ>
do_softirq+0x50/0x60
__local_bh_enable_ip+0x50/0x60
ip_finish_output2+0x18f/0x520
ip_output+0x6e/0xf0
? __ip_finish_output+0x1f0/0x1f0
__ip_queue_xmit+0x14f/0x3d0
? __slab_alloc+0x4b/0x58
__tcp_transmit_skb+0x57d/0xa60
tcp_write_xmit+0x23b/0xfd0
__tcp_push_pending_frames+0x2e/0xf0
tcp_sendmsg_locked+0x939/0xd50
tcp_sendmsg+0x27/0x40
sock_sendmsg+0x57/0x80
siw_tx_hdt+0x894/0xb20 [siw]
? find_busiest_group+0x3e/0x5b0
? common_interrupt+0xa/0xf
? common_interrupt+0xa/0xf
? common_interrupt+0xa/0xf
siw_qp_sq_process+0xf1/0xe60 [siw]
? __wake_up_common_lock+0x87/0xc0
siw_sq_resume+0x33/0xe0 [siw]
siw_run_sq+0xac/0x190 [siw]
? remove_wait_queue+0x60/0x60
kthread+0xf8/0x130
? siw_sq_resume+0xe0/0xe0 [siw]
? kthread_bind+0x10/0x10
ret_from_fork+0x35/0x40
Fixes: f29dd55b0236 ("rdma/siw: queue pair methods")
Link: https://lore.kernel.org/r/20190923101112.32685-1-krishna2@chelsio.com
Signed-off-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 3840c5b78803b2b6cc1ff820100a74a092c40cbb upstream.
Nicolas pointed out that the cxgb4 driver is doing dma off of the stack,
which is generally considered a very bad thing. On some architectures it
could be a security problem, but odds are none of them actually run this
driver, so it's just a "normal" bug.
Resolve this by allocating the memory for a message off of the heap
instead of the stack. kmalloc() always will give us a proper memory
location that DMA will work correctly from.
Link: https://lore.kernel.org/r/20191001165611.GA3542072@kroah.com
Reported-by: Nicolas Waisman <nico@semmle.com>
Tested-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 18545e8b6871d21aa3386dc42867138da9948a33 upstream.
An extra kfree cleanup was missed since these are now deallocated by core.
Link: https://lore.kernel.org/r/1568848066-12449-1-git-send-email-aditr@vmware.com
Cc: <stable@vger.kernel.org>
Fixes: 68e326dea1db ("RDMA: Handle SRQ allocations by IB/core")
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1cbe866cbcb53338de33cf67262e73f9315a9725 upstream.
rdma_for_each_port is already incrementing the iterator's value it
receives therefore, after the first iteration the iterator is increased by
2 which eventually causing wrong queries and possible traces.
Fix the above by removing the old redundant incrementation that was used
before rdma_for_each_port() macro.
Cc: <stable@vger.kernel.org>
Fixes: ea1075edcbab ("RDMA: Add and use rdma_for_each_port")
Link: https://lore.kernel.org/r/20191002122127.17571-1-leon@kernel.org
Signed-off-by: Mohamad Heib <mohamadh@mellanox.com>
Reviewed-by: Erez Alfasi <ereza@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3eca7fc2d8d1275d9cf0c709f0937becbfcf6d96 upstream.
The cited commit introduced a double-free of the srq buffer in the error
flow of procedure __uverbs_create_xsrq().
The problem is that ib_destroy_srq_user() called in the error flow also
frees the srq buffer.
Thus, if uverbs_response() fails in __uverbs_create_srq(), the srq buffer
will be freed twice.
Cc: <stable@vger.kernel.org>
Fixes: 68e326dea1db ("RDMA: Handle SRQ allocations by IB/core")
Link: https://lore.kernel.org/r/20190916071154.20383-5-leon@kernel.org
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b2590bdd0b1dfb91737e6cb07ebb47bd74957f7e upstream.
When a KDETH packet is subject to fault injection during transmission,
HCRC is supposed to be omitted from the packet so that the hardware on the
receiver side would drop the packet. When creating pbc, the PbcInsertHcrc
field is set to be PBC_IHCRC_NONE if the KDETH packet is subject to fault
injection, but overwritten with PBC_IHCRC_LKDETH when update_hcrc() is
called later.
This problem is fixed by not calling update_hcrc() when the packet is
subject to fault injection.
Fixes: 6b6cf9357f78 ("IB/hfi1: Set PbcInsertHcrc for TID RDMA packets")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190715164546.74174.99296.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f8659d68e2bee5b86a1beaf7be42d942e1fc81f4 upstream.
Define the working variables to be unsigned long to be compatible with
for_each_set_bit and change types as needed.
While we are at it remove unused variables from a couple of functions.
This was found because of the following KASAN warning:
==================================================================
BUG: KASAN: stack-out-of-bounds in find_first_bit+0x19/0x70
Read of size 8 at addr ffff888362d778d0 by task kworker/u308:2/1889
CPU: 21 PID: 1889 Comm: kworker/u308:2 Tainted: G W 5.3.0-rc2-mm1+ #2
Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.02.04.0003.102320141138 10/23/2014
Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
Call Trace:
dump_stack+0x9a/0xf0
? find_first_bit+0x19/0x70
print_address_description+0x6c/0x332
? find_first_bit+0x19/0x70
? find_first_bit+0x19/0x70
__kasan_report.cold.6+0x1a/0x3b
? find_first_bit+0x19/0x70
kasan_report+0xe/0x12
find_first_bit+0x19/0x70
pma_get_opa_portstatus+0x5cc/0xa80 [hfi1]
? ret_from_fork+0x3a/0x50
? pma_get_opa_port_ectrs+0x200/0x200 [hfi1]
? stack_trace_consume_entry+0x80/0x80
hfi1_process_mad+0x39b/0x26c0 [hfi1]
? __lock_acquire+0x65e/0x21b0
? clear_linkup_counters+0xb0/0xb0 [hfi1]
? check_chain_key+0x1d7/0x2e0
? lock_downgrade+0x3a0/0x3a0
? match_held_lock+0x2e/0x250
ib_mad_recv_done+0x698/0x15e0 [ib_core]
? clear_linkup_counters+0xb0/0xb0 [hfi1]
? ib_mad_send_done+0xc80/0xc80 [ib_core]
? mark_held_locks+0x79/0xa0
? _raw_spin_unlock_irqrestore+0x44/0x60
? rvt_poll_cq+0x1e1/0x340 [rdmavt]
__ib_process_cq+0x97/0x100 [ib_core]
ib_cq_poll_work+0x31/0xb0 [ib_core]
process_one_work+0x4ee/0xa00
? pwq_dec_nr_in_flight+0x110/0x110
? do_raw_spin_lock+0x113/0x1d0
worker_thread+0x57/0x5a0
? process_one_work+0xa00/0xa00
kthread+0x1bb/0x1e0
? kthread_create_on_node+0xc0/0xc0
ret_from_fork+0x3a/0x50
The buggy address belongs to the page:
page:ffffea000d8b5dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x17ffffc0000000()
raw: 0017ffffc0000000 0000000000000000 ffffea000d8b5dc8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
addr ffff888362d778d0 is located in stack of task kworker/u308:2/1889 at offset 32 in frame:
pma_get_opa_portstatus+0x0/0xa80 [hfi1]
this frame has 1 object:
[32, 36) 'vl_select_mask'
Memory state around the buggy address:
ffff888362d77780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff888362d77800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff888362d77880: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 00 00
^
ffff888362d77900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff888362d77980: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2
==================================================================
Cc: <stable@vger.kernel.org>
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20190911113053.126040.47327.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5d44adebbb7e785939df3db36ac360f5e8b73e44 upstream.
ib_add_slave_port() allocates a multiport struct but never frees it.
Don't leak memory, free the allocated mpi struct during driver unload.
Cc: <stable@vger.kernel.org>
Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE")
Link: https://lore.kernel.org/r/20190916064818.19823-3-leon@kernel.org
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 77d5bc7e6a6cf8bbeca31aab7f0c5449a5eee762 ]
Julian noted that rt_uses_gateway has a more subtle use than 'is gateway
set':
https://lore.kernel.org/netdev/alpine.LFD.2.21.1909151104060.2546@ja.home.ssi.bg/
Revert that part of the commit referenced in the Fixes tag.
Currently, there are no u8 holes in 'struct rtable'. There is a 4-byte hole
in the second cacheline which contains the gateway declaration. So move
rt_gw_family down to the gateway declarations since they are always used
together, and then re-use that u8 for rt_uses_gateway. End result is that
rtable size is unchanged.
Fixes: 1550c171935d ("ipv4: Prepare rtable for IPv6 gateway")
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Walking the address list of an inet6_dev requires
appropriate locking. Since the called function
siw_listen_address() may sleep, we have to use
rtnl_lock() instead of read_lock_bh().
Also introduces sanity checks if we got a device
from in_dev_get() or in6_dev_get().
Reported-by: Bart Van Assche <bvanassche@acm.org>
Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190828130355.22830-1-bmt@zurich.ibm.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Fixes improper casting between addresses and unsigned types.
Changes siw_pbl_get_buffer() function to return appropriate
dma_addr_t, and not u64.
Also fixes debug prints. Now any potentially kernel private
pointers are printed formatted as '%pK', to allow keeping that
information secret.
Fixes: d941bfe500be ("RDMA/siw: Change CQ flags from 64->32 bits")
Fixes: b0fff7317bb4 ("rdma/siw: completion queue methods")
Fixes: 8b6a361b8c48 ("rdma/siw: receive path")
Fixes: b9be6f18cf9e ("rdma/siw: transmit path")
Fixes: f29dd55b0236 ("rdma/siw: queue pair methods")
Fixes: 2251334dcac9 ("rdma/siw: application buffer management")
Fixes: 303ae1cdfdf7 ("rdma/siw: application interface")
Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Fixes: a531975279f3 ("rdma/siw: main include file")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Jason Gunthorpe <jgg@ziepe.ca>
Reported-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190822173738.26817-1-bmt@zurich.ibm.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
All user level and most in-kernel applications submit WQEs
where the SG list entries are all of a single type.
iSER in particular, however, will send us WQEs with mixed SG
types: sge[0] = kernel buffer, sge[1] = PBL region.
Check and set is_kva on each SG entry individually instead of
assuming the first SGE type carries through to the last.
This fixes iSER over siw.
Fixes: b9be6f18cf9e ("rdma/siw: transmit path")
Reported-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Tested-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190822150741.21871-1-bmt@zurich.ibm.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Driver copies FW commands to the HW queue as units of 16 bytes. Some
of the command structures are not exact multiple of 16. So while copying
the data from those structures, the stack out of bounds messages are
reported by KASAN. The following error is reported.
[ 1337.530155] ==================================================================
[ 1337.530277] BUG: KASAN: stack-out-of-bounds in bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530413] Read of size 16 at addr ffff888725477a48 by task rmmod/2785
[ 1337.530540] CPU: 5 PID: 2785 Comm: rmmod Tainted: G OE 5.2.0-rc6+ #75
[ 1337.530541] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[ 1337.530542] Call Trace:
[ 1337.530548] dump_stack+0x5b/0x90
[ 1337.530556] ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530560] print_address_description+0x65/0x22e
[ 1337.530568] ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530575] ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530577] __kasan_report.cold.3+0x37/0x77
[ 1337.530581] ? _raw_write_trylock+0x10/0xe0
[ 1337.530588] ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530590] kasan_report+0xe/0x20
[ 1337.530592] memcpy+0x1f/0x50
[ 1337.530600] bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530608] ? bnxt_qplib_creq_irq+0xa0/0xa0 [bnxt_re]
[ 1337.530611] ? xas_create+0x3aa/0x5f0
[ 1337.530613] ? xas_start+0x77/0x110
[ 1337.530615] ? xas_clear_mark+0x34/0xd0
[ 1337.530623] bnxt_qplib_free_mrw+0x104/0x1a0 [bnxt_re]
[ 1337.530631] ? bnxt_qplib_destroy_ah+0x110/0x110 [bnxt_re]
[ 1337.530633] ? bit_wait_io_timeout+0xc0/0xc0
[ 1337.530641] bnxt_re_dealloc_mw+0x2c/0x60 [bnxt_re]
[ 1337.530648] bnxt_re_destroy_fence_mr+0x77/0x1d0 [bnxt_re]
[ 1337.530655] bnxt_re_dealloc_pd+0x25/0x60 [bnxt_re]
[ 1337.530677] ib_dealloc_pd_user+0xbe/0xe0 [ib_core]
[ 1337.530683] srpt_remove_one+0x5de/0x690 [ib_srpt]
[ 1337.530689] ? __srpt_close_all_ch+0xc0/0xc0 [ib_srpt]
[ 1337.530692] ? xa_load+0x87/0xe0
...
[ 1337.530840] do_syscall_64+0x6d/0x1f0
[ 1337.530843] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1337.530845] RIP: 0033:0x7ff5b389035b
[ 1337.530848] Code: 73 01 c3 48 8b 0d 2d 0b 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fd 0a 2c 00 f7 d8 64 89 01 48
[ 1337.530849] RSP: 002b:00007fff83425c28 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 1337.530852] RAX: ffffffffffffffda RBX: 00005596443e6750 RCX: 00007ff5b389035b
[ 1337.530853] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005596443e67b8
[ 1337.530854] RBP: 0000000000000000 R08: 00007fff83424ba1 R09: 0000000000000000
[ 1337.530856] R10: 00007ff5b3902960 R11: 0000000000000206 R12: 00007fff83425e50
[ 1337.530857] R13: 00007fff8342673c R14: 00005596443e6260 R15: 00005596443e6750
[ 1337.530885] The buggy address belongs to the page:
[ 1337.530962] page:ffffea001c951dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 1337.530964] flags: 0x57ffffc0000000()
[ 1337.530967] raw: 0057ffffc0000000 0000000000000000 ffffffff1c950101 0000000000000000
[ 1337.530970] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 1337.530970] page dumped because: kasan: bad access detected
[ 1337.530996] Memory state around the buggy address:
[ 1337.531072] ffff888725477900: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 f2 f2 f2
[ 1337.531180] ffff888725477980: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
[ 1337.531288] >ffff888725477a00: 00 f2 f2 f2 f2 f2 f2 00 00 00 f2 00 00 00 00 00
[ 1337.531393] ^
[ 1337.531478] ffff888725477a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1337.531585] ffff888725477b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1337.531691] ==================================================================
Fix this by passing the exact size of each FW command to
bnxt_qplib_rcfw_send_message as req->cmd_size. Before sending
the command to HW, modify the req->cmd_size to number of 16 byte units.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1566468170-489-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In fault_opcodes_write(), 'data' is allocated through kcalloc(). However,
it is not deallocated in the following execution if an error occurs,
leading to memory leaks. To fix this issue, introduce the 'free_data' label
to free 'data' before returning the error.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/1566154486-3713-1-git-send-email-wenwen@cs.uga.edu
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In fault_opcodes_read(), 'data' is not deallocated if debugfs_file_get()
fails, leading to a memory leak. To fix this bug, introduce the 'free_data'
label to free 'data' before returning the error.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/1566156571-4335-1-git-send-email-wenwen@cs.uga.edu
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In mlx4_ib_alloc_pv_bufs(), 'tun_qp->tx_ring' is allocated through
kcalloc(). However, it is not always deallocated in the following execution
if an error occurs, leading to memory leaks. To fix this issue, free
'tun_qp->tx_ring' whenever an error occurs.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/1566159781-4642-1-git-send-email-wenwen@cs.uga.edu
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In cma_init, if cma_configfs_init fails, need to free the
previously memory and return fail, otherwise will trigger
null-ptr-deref Read in cma_cleanup.
cma_cleanup
cma_configfs_exit
configfs_unregister_subsystem
Fixes: 045959db65c6 ("IB/cma: Add configfs for rdma_cm")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Link: https://lore.kernel.org/r/1566188859-103051-1-git-send-email-zhengbin13@huawei.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Check conditions that are mandatory to post_send UMR WQEs.
1. Modifying page size.
2. Modifying remote atomic permissions if atomic access is required.
If either condition is not fulfilled then fail to post_send() flow.
Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-9-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The UMR WQE in the MR re-registration flow requires that
modify_atomic and modify_entity_size capabilities are enabled.
Therefore, check that the these capabilities are present before going to
umr flow and go through slow path if not.
Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-8-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
ODP depends on the several device capabilities, among them is the ability
to send UMR WQEs with that modify atomic and entity size of the MR.
Therefore, only if all conditions to send such a UMR WQE are met then
driver can report that ODP is supported. Use this check of conditions
in all places where driver needs to know about ODP support.
Also, implicit ODP support depends on ability of driver to send UMR WQEs
for an indirect mkey. Therefore, verify that all conditions to do so are
met when reporting support.
Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-7-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Introduce helper function to unify various use_umr checks.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-6-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
task_active_pid_ns() is wrong API to check PID namespace because it
posses some restrictions and return PID namespace where the process
was allocated. It created mismatches with current namespace, which
can be different.
Rewrite whole rdma_is_visible_in_pid_ns() logic to provide reliable
results without any relation to allocated PID namespace.
Fixes: 8be565e65fa9 ("RDMA/nldev: Factor out the PID namespace check")
Fixes: 6a6c306a09b5 ("RDMA/restrack: Make is_visible_in_pid_ns() as an API")
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-4-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
"Auto" configuration mode is called for visible in that PID
namespace and it ensures that all counters and QPs are coexist
in the same namespace and belong to same PID.
Fixes: 99fa331dc862 ("RDMA/counter: Add "auto" configuration mode support")
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-3-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
If QP is not visible to the pid, then we try to decrease its reference
count and return from the function before the QP pointer is
initialized. This lead to NULL pointer dereference.
Fix it by pass directly the res to the rdma_restract_put as arg instead of
&qp->res.
This fixes below call trace:
[ 5845.110329] BUG: kernel NULL pointer dereference, address:
00000000000000dc
[ 5845.120482] Oops: 0002 [#1] SMP PTI
[ 5845.129119] RIP: 0010:rdma_restrack_put+0x5/0x30 [ib_core]
[ 5845.169450] Call Trace:
[ 5845.170544] rdma_counter_get_qp+0x5c/0x70 [ib_core]
[ 5845.172074] rdma_counter_bind_qpn_alloc+0x6f/0x1a0 [ib_core]
[ 5845.173731] nldev_stat_set_doit+0x314/0x330 [ib_core]
[ 5845.175279] rdma_nl_rcv_msg+0xeb/0x1d0 [ib_core]
[ 5845.176772] ? __kmalloc_node_track_caller+0x20b/0x2b0
[ 5845.178321] rdma_nl_rcv+0xcb/0x120 [ib_core]
[ 5845.179753] netlink_unicast+0x179/0x220
[ 5845.181066] netlink_sendmsg+0x2d8/0x3d0
[ 5845.182338] sock_sendmsg+0x30/0x40
[ 5845.183544] __sys_sendto+0xdc/0x160
[ 5845.184832] ? syscall_trace_enter+0x1f8/0x2e0
[ 5845.186209] ? __audit_syscall_exit+0x1d9/0x280
[ 5845.187584] __x64_sys_sendto+0x24/0x30
[ 5845.188867] do_syscall_64+0x48/0x120
[ 5845.190097] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 1bd8e0a9d0fd1 ("RDMA/counter: Allow manual mode configuration support")
Signed-off-by: Ido Kalir <idok@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-2-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In a congested fabric with adaptive routing enabled, traces show that
packets could be delivered out of order. A stale TID RDMA data packet
could lead to TidErr if the TID entries have been released by duplicate
data packets generated from retries, and subsequently erroneously force
the qp into error state in the current implementation.
Since the payload has already been dropped by hardware, the packet can
be simply dropped and it is no longer necessary to put the qp into
error state.
Fixes: 9905bf06e890 ("IB/hfi1: Add functions to receive TID RDMA READ response")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192058.105923.72324.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
|