summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)AuthorFilesLines
2020-08-05IB/rdmavt: Fix RQ counting issues causing use of an invalid RWQEMike Marciniszyn2-32/+5
commit 54a485e9ec084da1a4b32dcf7749c7d760ed8aa5 upstream. The lookaside count is improperly initialized to the size of the Receive Queue with the additional +1. In the traces below, the RQ size is 384, so the count was set to 385. The lookaside count is then rarely refreshed. Note the high and incorrect count in the trace below: rvt_get_rwqe: [hfi1_0] wqe ffffc900078e9008 wr_id 55c7206d75a0 qpn c qpt 2 pid 3018 num_sge 1 head 1 tail 0, count 385 rvt_get_rwqe: (hfi1_rc_rcv+0x4eb/0x1480 [hfi1] <- rvt_get_rwqe) ret=0x1 The head,tail indicate there is only one RWQE posted although the count says 385 and we correctly return the element 0. The next call to rvt_get_rwqe with the decremented count: rvt_get_rwqe: [hfi1_0] wqe ffffc900078e9058 wr_id 0 qpn c qpt 2 pid 3018 num_sge 0 head 1 tail 1, count 384 rvt_get_rwqe: (hfi1_rc_rcv+0x4eb/0x1480 [hfi1] <- rvt_get_rwqe) ret=0x1 Note that the RQ is empty (head == tail) yet we return the RWQE at tail 1, which is not valid because of the bogus high count. Best case, the RWQE has never been posted and the rc logic sees an RWQE that is too small (all zeros) and puts the QP into an error state. In the worst case, a server slow at posting receive buffers might fool rvt_get_rwqe() into fetching an old RWQE and corrupt memory. Fix by deleting the faulty initialization code and creating an inline to fetch the posted count and convert all callers to use new inline. Fixes: f592ae3c999f ("IB/rdmavt: Fracture single lock used for posting and processing RWQEs") Link: https://lore.kernel.org/r/20200728183848.22226.29132.stgit@awfm-01.aw.intel.com Reported-by: Zhaojuan Guo <zguo@redhat.com> Cc: <stable@vger.kernel.org> # 5.4.x Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Tested-by: Honggang Li <honli@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-29RDMA/mlx5: Use xa_lock_irq when access to SRQ tableMaor Gottlieb1-2/+2
[ Upstream commit c3d6057e07a5d15be7c69ea545b3f91877808c96 ] SRQ table is accessed both from interrupt and process context, therefore we must use xa_lock_irq. inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. kworker/u17:9/8573 takes: ffff8883e3503d30 (&xa->xa_lock#13){?...}-{2:2}, at: mlx5_cmd_get_srq+0x18/0x70 [mlx5_ib] {IN-HARDIRQ-W} state was registered at: lock_acquire+0xb9/0x3a0 _raw_spin_lock+0x25/0x30 srq_event_notifier+0x2b/0xc0 [mlx5_ib] notifier_call_chain+0x45/0x70 __atomic_notifier_call_chain+0x69/0x100 forward_event+0x36/0xc0 [mlx5_core] notifier_call_chain+0x45/0x70 __atomic_notifier_call_chain+0x69/0x100 mlx5_eq_async_int+0xc5/0x160 [mlx5_core] notifier_call_chain+0x45/0x70 __atomic_notifier_call_chain+0x69/0x100 mlx5_irq_int_handler+0x19/0x30 [mlx5_core] __handle_irq_event_percpu+0x43/0x2a0 handle_irq_event_percpu+0x30/0x70 handle_irq_event+0x34/0x60 handle_edge_irq+0x7c/0x1b0 do_IRQ+0x60/0x110 ret_from_intr+0x0/0x2a default_idle+0x34/0x160 do_idle+0x1ec/0x220 cpu_startup_entry+0x19/0x20 start_secondary+0x153/0x1a0 secondary_startup_64+0xa4/0xb0 irq event stamp: 20907 hardirqs last enabled at (20907): _raw_spin_unlock_irq+0x24/0x30 hardirqs last disabled at (20906): _raw_spin_lock_irq+0xf/0x40 softirqs last enabled at (20746): __do_softirq+0x2c9/0x436 softirqs last disabled at (20681): irq_exit+0xb3/0xc0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&xa->xa_lock#13); <Interrupt> lock(&xa->xa_lock#13); *** DEADLOCK *** 2 locks held by kworker/u17:9/8573: #0: ffff888295218d38 ((wq_completion)mlx5_ib_page_fault){+.+.}-{0:0}, at: process_one_work+0x1f1/0x5f0 #1: ffff888401647e78 ((work_completion)(&pfault->work)){+.+.}-{0:0}, at: process_one_work+0x1f1/0x5f0 stack backtrace: CPU: 0 PID: 8573 Comm: kworker/u17:9 Tainted: GO 5.7.0_for_upstream_min_debug_2020_06_14_11_31_46_41 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Workqueue: mlx5_ib_page_fault mlx5_ib_eqe_pf_action [mlx5_ib] Call Trace: dump_stack+0x71/0x9b mark_lock+0x4f2/0x590 ? print_shortest_lock_dependencies+0x200/0x200 __lock_acquire+0xa00/0x1eb0 lock_acquire+0xb9/0x3a0 ? mlx5_cmd_get_srq+0x18/0x70 [mlx5_ib] _raw_spin_lock+0x25/0x30 ? mlx5_cmd_get_srq+0x18/0x70 [mlx5_ib] mlx5_cmd_get_srq+0x18/0x70 [mlx5_ib] mlx5_ib_eqe_pf_action+0x257/0xa30 [mlx5_ib] ? process_one_work+0x209/0x5f0 process_one_work+0x27b/0x5f0 ? __schedule+0x280/0x7e0 worker_thread+0x2d/0x3c0 ? process_one_work+0x5f0/0x5f0 kthread+0x111/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x24/0x30 Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Link: https://lore.kernel.org/r/20200712102641.15210-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-22RDMA/mlx5: Verify that QP is created with RQ or SQAharon Landau1-0/+2
commit 0eacc574aae7300bf46c10c7116c3ba5825505b7 upstream. RAW packet QP and underlay QP must be created with either RQ or SQ, check that. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Link: https://lore.kernel.org/r/20200427154636.381474-37-leon@kernel.org Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-16IB/hfi1: Do not destroy link_wq when the device is shut downKaike Wan1-5/+5
commit 2315ec12ee8e8257bb335654c62e0cae71dc278d upstream. The workqueue link_wq should only be destroyed when the hfi1 driver is unloaded, not when the device is shut down. Fixes: 71d47008ca1b ("IB/hfi1: Create workqueue for link events") Link: https://lore.kernel.org/r/20200623204053.107638.70315.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-16IB/hfi1: Do not destroy hfi1_wq when the device is shut downKaike Wan3-6/+31
commit 28b70cd9236563e1a88a6094673fef3c08db0d51 upstream. The workqueue hfi1_wq is destroyed in function shutdown_device(), which is called by either shutdown_one() or remove_one(). The function shutdown_one() is called when the kernel is rebooted while remove_one() is called when the hfi1 driver is unloaded. When the kernel is rebooted, hfi1_wq is destroyed while all qps are still active, leading to a kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000102 IP: [<ffffffff94cb7b02>] __queue_work+0x32/0x3e0 PGD 0 Oops: 0000 [#1] SMP Modules linked in: dm_round_robin nvme_rdma(OE) nvme_fabrics(OE) nvme_core(OE) ib_isert iscsi_target_mod target_core_mod ib_ucm mlx4_ib iTCO_wdt iTCO_vendor_support mxm_wmi sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm rpcrdma sunrpc irqbypass crc32_pclmul ghash_clmulni_intel rdma_ucm aesni_intel ib_uverbs lrw gf128mul opa_vnic glue_helper ablk_helper ib_iser cryptd ib_umad rdma_cm iw_cm ses enclosure libiscsi scsi_transport_sas pcspkr joydev ib_ipoib(OE) scsi_transport_iscsi ib_cm sg ipmi_ssif mei_me lpc_ich i2c_i801 mei ioatdma ipmi_si dm_multipath ipmi_devintf ipmi_msghandler wmi acpi_pad acpi_power_meter hangcheck_timer ip_tables ext4 mbcache jbd2 mlx4_en sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm hfi1(OE) crct10dif_pclmul crct10dif_common crc32c_intel drm ahci mlx4_core libahci rdmavt(OE) igb megaraid_sas ib_core libata drm_panel_orientation_quirks ptp pps_core devlink dca i2c_algo_bit dm_mirror dm_region_hash dm_log dm_mod CPU: 19 PID: 0 Comm: swapper/19 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.x86_64 #1 Hardware name: Phegda X2226A/S2600CW, BIOS SE5C610.86B.01.01.0024.021320181901 02/13/2018 task: ffff8a799ba0d140 ti: ffff8a799bad8000 task.ti: ffff8a799bad8000 RIP: 0010:[<ffffffff94cb7b02>] [<ffffffff94cb7b02>] __queue_work+0x32/0x3e0 RSP: 0018:ffff8a90dde43d80 EFLAGS: 00010046 RAX: 0000000000000082 RBX: 0000000000000086 RCX: 0000000000000000 RDX: ffff8a90b924fcb8 RSI: 0000000000000000 RDI: 000000000000001b RBP: ffff8a90dde43db8 R08: ffff8a799ba0d6d8 R09: ffff8a90dde53900 R10: 0000000000000002 R11: ffff8a90dde43de8 R12: ffff8a90b924fcb8 R13: 000000000000001b R14: 0000000000000000 R15: ffff8a90d2890000 FS: 0000000000000000(0000) GS:ffff8a90dde40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000102 CR3: 0000001a70410000 CR4: 00000000001607e0 Call Trace: [<ffffffff94cb8105>] queue_work_on+0x45/0x50 [<ffffffffc03f781e>] _hfi1_schedule_send+0x6e/0xc0 [hfi1] [<ffffffffc03f78a2>] hfi1_schedule_send+0x32/0x70 [hfi1] [<ffffffffc02cf2d9>] rvt_rc_timeout+0xe9/0x130 [rdmavt] [<ffffffff94ce563a>] ? trigger_load_balance+0x6a/0x280 [<ffffffffc02cf1f0>] ? rvt_free_qpn+0x40/0x40 [rdmavt] [<ffffffff94ca7f58>] call_timer_fn+0x38/0x110 [<ffffffffc02cf1f0>] ? rvt_free_qpn+0x40/0x40 [rdmavt] [<ffffffff94caa3bd>] run_timer_softirq+0x24d/0x300 [<ffffffff94ca0f05>] __do_softirq+0xf5/0x280 [<ffffffff9537832c>] call_softirq+0x1c/0x30 [<ffffffff94c2e675>] do_softirq+0x65/0xa0 [<ffffffff94ca1285>] irq_exit+0x105/0x110 [<ffffffff953796c8>] smp_apic_timer_interrupt+0x48/0x60 [<ffffffff95375df2>] apic_timer_interrupt+0x162/0x170 <EOI> [<ffffffff951adfb7>] ? cpuidle_enter_state+0x57/0xd0 [<ffffffff951ae10e>] cpuidle_idle_call+0xde/0x230 [<ffffffff94c366de>] arch_cpu_idle+0xe/0xc0 [<ffffffff94cfc3ba>] cpu_startup_entry+0x14a/0x1e0 [<ffffffff94c57db7>] start_secondary+0x1f7/0x270 [<ffffffff94c000d5>] start_cpu+0x5/0x14 The solution is to destroy the workqueue only when the hfi1 driver is unloaded, not when the device is shut down. In addition, when the device is shut down, no more work should be scheduled on the workqueues and the workqueues are flushed. Fixes: 8d3e71136a08 ("IB/{hfi1, qib}: Add handling of kernel restart") Link: https://lore.kernel.org/r/20200623204047.107638.77646.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-16IB/mlx5: Fix 50G per lane indicationAya Levin1-1/+1
[ Upstream commit 530c8632b547ff72f11ff83654b22462a73f1f7b ] Some released FW versions mistakenly don't set the capability that 50G per lane link-modes are supported for VFs (ptys_extended_ethernet capability bit). Use PTYS.ext_eth_proto_capability instead, as this indication is always accurate. If PTYS.ext_eth_proto_capability is valid (has a non-zero value) conclude that the HCA supports 50G per lane. Otherwise, conclude that the HCA doesn't support 50G per lane. Fixes: 08e8676f1607 ("IB/mlx5: Add support for 50Gbps per lane link modes") Link: https://lore.kernel.org/r/20200707110612.882962-3-leon@kernel.org Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-16RDMA/siw: Fix reporting vendor_part_idKamal Heib1-1/+2
[ Upstream commit 04340645f69ab7abb6f9052688a60f0213b3f79c ] Move the initialization of the vendor_part_id to be before calling ib_register_device(), this is needed because the query_device() callback is called from the context of ib_register_device() before initializing the vendor_part_id, so the reported value is wrong. Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface") Link: https://lore.kernel.org/r/20200707130931.444724-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-16IB/sa: Resolv use-after-free in ib_nl_make_request()Divya Indi1-21/+17
[ Upstream commit f427f4d6214c183c474eeb46212d38e6c7223d6a ] There is a race condition where ib_nl_make_request() inserts the request data into the linked list but the timer in ib_nl_request_timeout() can see it and destroy it before ib_nl_send_msg() is done touching it. This could happen, for instance, if there is a long delay allocating memory during nlmsg_new() This causes a use-after-free in the send_mad() thread: [<ffffffffa02f43cb>] ? ib_pack+0x17b/0x240 [ib_core] [ <ffffffffa032aef1>] ib_sa_path_rec_get+0x181/0x200 [ib_sa] [<ffffffffa0379db0>] rdma_resolve_route+0x3c0/0x8d0 [rdma_cm] [<ffffffffa0374450>] ? cma_bind_port+0xa0/0xa0 [rdma_cm] [<ffffffffa040f850>] ? rds_rdma_cm_event_handler_cmn+0x850/0x850 [rds_rdma] [<ffffffffa040f22c>] rds_rdma_cm_event_handler_cmn+0x22c/0x850 [rds_rdma] [<ffffffffa040f860>] rds_rdma_cm_event_handler+0x10/0x20 [rds_rdma] [<ffffffffa037778e>] addr_handler+0x9e/0x140 [rdma_cm] [<ffffffffa026cdb4>] process_req+0x134/0x190 [ib_addr] [<ffffffff810a02f9>] process_one_work+0x169/0x4a0 [<ffffffff810a0b2b>] worker_thread+0x5b/0x560 [<ffffffff810a0ad0>] ? flush_delayed_work+0x50/0x50 [<ffffffff810a68fb>] kthread+0xcb/0xf0 [<ffffffff816ec49a>] ? __schedule+0x24a/0x810 [<ffffffff816ec49a>] ? __schedule+0x24a/0x810 [<ffffffff810a6830>] ? kthread_create_on_node+0x180/0x180 [<ffffffff816f25a7>] ret_from_fork+0x47/0x90 [<ffffffff810a6830>] ? kthread_create_on_node+0x180/0x180 The ownership rule is once the request is on the list, ownership transfers to the list and the local thread can't touch it any more, just like for the normal MAD case in send_mad(). Thus, instead of adding before send and then trying to delete after on errors, move the entire thing under the spinlock so that the send and update of the lists are atomic to the conurrent threads. Lightly reoganize things so spinlock safe memory allocations are done in the final NL send path and the rest of the setup work is done before and outside the lock. Fixes: 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list before sending") Link: https://lore.kernel.org/r/1592964789-14533-1-git-send-email-divya.indi@oracle.com Signed-off-by: Divya Indi <divya.indi@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-09RDMA/counter: Query a counter before releaseMark Zhang1-1/+3
[ Upstream commit c1d869d64a1955817c4d6fff08ecbbe8e59d36f8 ] Query a dynamically-allocated counter before release it, to update it's hwcounters and log all of them into history data. Otherwise all values of these hwcounters will be lost. Fixes: f34a55e497e8 ("RDMA/core: Get sum value of all counters when perform a sysfs stat read") Link: https://lore.kernel.org/r/20200621110000.56059-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30RDMA/mad: Fix possible memory leak in ib_mad_post_receive_mads()Fan Guo1-0/+1
[ Upstream commit a17f4bed811c60712d8131883cdba11a105d0161 ] If ib_dma_mapping_error() returns non-zero value, ib_mad_post_receive_mads() will jump out of loops and return -ENOMEM without freeing mad_priv. Fix this memory-leak problem by freeing mad_priv in this case. Fixes: 2c34e68f4261 ("IB/mad: Check and handle potential DMA mapping errors") Link: https://lore.kernel.org/r/20200612063824.180611-1-guofan5@huawei.com Signed-off-by: Fan Guo <guofan5@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30RDMA/cma: Protect bind_list and listen_list while finding matching cm idMark Zhang1-0/+18
[ Upstream commit 730c8912484186d4623d0c76509066d285c3a755 ] The bind_list and listen_list must be accessed under a lock, add the missing locking around the access in cm_ib_id_from_event() In addition add lockdep asserts to make it clearer what the locking semantic is here. general protection fault: 0000 [#1] SMP NOPTI CPU: 226 PID: 126135 Comm: kworker/226:1 Tainted: G OE 4.12.14-150.47-default #1 SLE15 Hardware name: Cray Inc. Windom/Windom, BIOS 0.8.7 01-10-2020 Workqueue: ib_cm cm_work_handler [ib_cm] task: ffff9c5a60a1d2c0 task.stack: ffffc1d91f554000 RIP: 0010:cma_ib_req_handler+0x3f1/0x11b0 [rdma_cm] RSP: 0018:ffffc1d91f557b40 EFLAGS: 00010286 RAX: deacffffffffff30 RBX: 0000000000000001 RCX: ffff9c2af5bb6000 RDX: 00000000000000a9 RSI: ffff9c5aa4ed2f10 RDI: ffffc1d91f557b08 RBP: ffffc1d91f557d90 R08: ffff9c340cc80000 R09: ffff9c2c0f901900 R10: 0000000000000000 R11: 0000000000000001 R12: deacffffffffff30 R13: ffff9c5a48aeec00 R14: ffffc1d91f557c30 R15: ffff9c5c2eea3688 FS: 0000000000000000(0000) GS:ffff9c5c2fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00002b5cc03fa320 CR3: 0000003f8500a000 CR4: 00000000003406e0 Call Trace: ? rdma_addr_cancel+0xa0/0xa0 [ib_core] ? cm_process_work+0x28/0x140 [ib_cm] cm_process_work+0x28/0x140 [ib_cm] ? cm_get_bth_pkey.isra.44+0x34/0xa0 [ib_cm] cm_work_handler+0xa06/0x1a6f [ib_cm] ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __switch_to+0x7c/0x4b0 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 process_one_work+0x1da/0x400 worker_thread+0x2b/0x3f0 ? process_one_work+0x400/0x400 kthread+0x118/0x140 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x22/0x40 Code: 00 66 83 f8 02 0f 84 ca 05 00 00 49 8b 84 24 d0 01 00 00 48 85 c0 0f 84 68 07 00 00 48 2d d0 01 00 00 49 89 c4 0f 84 59 07 00 00 <41> 0f b7 44 24 20 49 8b 77 50 66 83 f8 0a 75 9e 49 8b 7c 24 28 Fixes: 4c21b5bcef73 ("IB/cma: Add net_dev and private data checks to RDMA CM") Link: https://lore.kernel.org/r/20200616104304.2426081-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30RDMA/qedr: Fix KASAN: use-after-free in ucma_event_handler+0x532Michal Kalderon1-2/+11
[ Upstream commit 0dfbd5ecf28cbcb81674c49d34ee97366db1be44 ] Private data passed to iwarp_cm_handler is copied for connection request / response, but ignored otherwise. If junk is passed, it is stored in the event and used later in the event processing. The driver passes an old junk pointer during connection close which leads to a use-after-free on event processing. Set private data to NULL for events that don 't have private data. BUG: KASAN: use-after-free in ucma_event_handler+0x532/0x560 [rdma_ucm] kernel: Read of size 4 at addr ffff8886caa71200 by task kworker/u128:1/5250 kernel: kernel: Workqueue: iw_cm_wq cm_work_handler [iw_cm] kernel: Call Trace: kernel: dump_stack+0x8c/0xc0 kernel: print_address_description.constprop.0+0x1b/0x210 kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm] kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm] kernel: __kasan_report.cold+0x1a/0x33 kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm] kernel: kasan_report+0xe/0x20 kernel: check_memory_region+0x130/0x1a0 kernel: memcpy+0x20/0x50 kernel: ucma_event_handler+0x532/0x560 [rdma_ucm] kernel: ? __rpc_execute+0x608/0x620 [sunrpc] kernel: cma_iw_handler+0x212/0x330 [rdma_cm] kernel: ? iw_conn_req_handler+0x6e0/0x6e0 [rdma_cm] kernel: ? enqueue_timer+0x86/0x140 kernel: ? _raw_write_lock_irq+0xd0/0xd0 kernel: cm_work_handler+0xd3d/0x1070 [iw_cm] Fixes: e411e0587e0d ("RDMA/qedr: Add iWARP connection management functions") Link: https://lore.kernel.org/r/20200616093408.17827-1-michal.kalderon@marvell.com Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30RDMA/rvt: Fix potential memory leak caused by rvt_alloc_rqAditya Pakki1-2/+4
[ Upstream commit 90a239ee25fa3a483facec3de7c144361a3d3a51 ] In case of failure of alloc_ud_wq_attr(), the memory allocated by rvt_alloc_rq() is not freed. Fix it by calling rvt_free_rq() using the existing clean-up code. Fixes: d310c4bf8aea ("IB/{rdmavt, hfi1, qib}: Remove AH refcount for UD QPs") Link: https://lore.kernel.org/r/20200614041148.131983-1-pakki001@umn.edu Signed-off-by: Aditya Pakki <pakki001@umn.edu> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30RDMA/siw: Fix pointer-to-int-cast warning in siw_rx_pbl()Tom Seewald1-1/+2
[ Upstream commit 6769b275a313c76ddcd7d94c632032326db5f759 ] The variable buf_addr is type dma_addr_t, which may not be the same size as a pointer. To ensure it is the correct size, cast to a uintptr_t. Fixes: c536277e0db1 ("RDMA/siw: Fix 64/32bit pointer inconsistency") Link: https://lore.kernel.org/r/20200610174717.15932-1-tseewald@gmail.com Signed-off-by: Tom Seewald <tseewald@gmail.com> Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30IB/hfi1: Fix module use count flaw due to leftover module put callsDennis Dalessandro1-17/+2
commit 822fbd37410639acdae368ea55477ddd3498651d upstream. When the try_module_get calls were removed from opening and closing of the i2c debugfs file, the corresponding module_put calls were missed. This results in an inaccurate module use count that requires a power cycle to fix. Fixes: 09fbca8e6240 ("IB/hfi1: No need to use try_module_get for debugfs") Link: https://lore.kernel.org/r/20200623203230.106975.76240.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Kaike Wan <kaike.wan@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30IB/mad: Fix use after free when destroying MAD agentShay Drory1-1/+1
commit 116a1b9f1cb769b83e5adff323f977a62b1dcb2e upstream. Currently, when RMPP MADs are processed while the MAD agent is destroyed, it could result in use after free of rmpp_recv, as decribed below: cpu-0 cpu-1 ----- ----- ib_mad_recv_done() ib_mad_complete_recv() ib_process_rmpp_recv_wc() unregister_mad_agent() ib_cancel_rmpp_recvs() cancel_delayed_work() process_rmpp_data() start_rmpp() queue_delayed_work(rmpp_recv->cleanup_work) destroy_rmpp_recv() free_rmpp_recv() cleanup_work()[1] spin_lock_irqsave(&rmpp_recv->agent->lock) <-- use after free [1] cleanup_work() == recv_cleanup_handler Fix it by waiting for the MAD agent reference count becoming zero before calling to ib_cancel_rmpp_recvs(). Fixes: 9a41e38a467c ("IB/mad: Use IDR for agent IDs") Link: https://lore.kernel.org/r/20200621104738.54850-2-leon@kernel.org Signed-off-by: Shay Drory <shayd@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24RDMA/iw_cxgb4: cleanup device debugfs entries on ULD removePotnuri Bharat Teja1-0/+1
[ Upstream commit 49ea0c036ede81f126f1a9389d377999fdf5c5a1 ] Remove device specific debugfs entries immediately if LLD detaches a particular ULD device in case of fatal PCI errors. Link: https://lore.kernel.org/r/20200524190814.17599-1-bharat@chelsio.com Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24IB/cma: Fix ports memory leak in cma_configfsMaor Gottlieb1-0/+13
[ Upstream commit 63a3345c2d42a9b29e1ce2d3a4043689b3995cea ] The allocated ports structure in never freed. The free function should be called by release_cma_ports_group, but the group is never released since we don't remove its default group. Remove default groups when device group is deleted. Fixes: 045959db65c6 ("IB/cma: Add configfs for rdma_cm") Link: https://lore.kernel.org/r/20200521072650.567908-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24RDMA/hns: Fix cmdq parameter of querying pf timer resourceLang Cheng1-20/+12
[ Upstream commit 441c88d5b3ff80108ff536c6cf80591187015403 ] The firmware has reduced the number of descriptions of command HNS_ROCE_OPC_QUERY_PF_TIMER_RES to 1. The driver needs to adapt, otherwise the hardware will report error 4(CMD_NEXT_ERR). Fixes: 0e40dc2f70cd ("RDMA/hns: Add timer allocation support for hip08") Link: https://lore.kernel.org/r/1588931159-56875-3-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24RDMA/hns: Bugfix for querying qkeyLijun Ou1-1/+1
[ Upstream commit 349be276509455ac2f19fa4051ed773082c6a27e ] The qkey queried through the query ud qp verb is a fixed value and it should be read from qp context. Fixes: 926a01dc000d ("RDMA/hns: Add QP operations support for hip08 SoC") Link: https://lore.kernel.org/r/1588931159-56875-2-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24RDMA/mlx5: Fix udata response upon SRQ creationYishai Hadas1-2/+8
[ Upstream commit cf26deff9036cd3270af562dbec545239e5c7f07 ] Fix udata response upon SRQ creation to use the UAPI structure (i.e. mlx5_ib_create_srq_resp). It did not zero the reserved field in userspace. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Link: https://lore.kernel.org/r/20200406173540.1466477-1-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24RDMA/core: Fix several reference count leaks.Qiushi Wu1-5/+5
[ Upstream commit 0b8e125e213204508e1b3c4bdfe69713280b7abd ] kobject_init_and_add() takes reference even when it fails. If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object. Previous commit b8eb718348b8 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject") fixed a similar problem. Link: https://lore.kernel.org/r/20200528030231.9082-1-wu000273@umn.edu Signed-off-by: Qiushi Wu <wu000273@umn.edu> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24IB/mlx5: Fix DEVX support for MLX5_CMD_OP_INIT2INIT_QP commandMark Zhang1-0/+4
[ Upstream commit d246a3061528be6d852156d25c02ea69d6db7e65 ] The commit citied in the Fixes line wasn't complete and solved only part of the problems. Update the mlx5_ib to properly support MLX5_CMD_OP_INIT2INIT_QP command in the DEVX, that is required when modify the QP tx_port_affinity. Fixes: 819f7427bafd ("RDMA/mlx5: Add init2init as a modify command") Link: https://lore.kernel.org/r/20200527135703.482501-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24RDMA/mlx5: Add init2init as a modify commandAharon Landau1-0/+1
[ Upstream commit 819f7427bafd494ef7ca4942ec6322db20722d7b ] Missing INIT2INIT entry in the list of modify commands caused DEVX applications to be unable to modify_qp for this transition state. Add the MLX5_CMD_OP_INIT2INIT_QP opcode to the list of allowed DEVX opcodes. Fixes: e662e14d801b ("IB/mlx5: Add DEVX support for modify and query commands") Link: https://lore.kernel.org/r/20200513095550.211345-1-leon@kernel.org Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-17RDMA/uverbs: Make the event_queue fds return POLLERR when disassociatedJason Gunthorpe1-0/+2
[ Upstream commit eb356e6dc15a30af604f052cd0e170450193c254 ] If is_closed is set, and the event list is empty, then read() will return -EIO without blocking. After setting is_closed in ib_uverbs_free_event_queue(), we do trigger a wake_up on the poll_wait, but the fops->poll() function does not check it, so poll will continue to sleep on an empty list. Fixes: 14e23bd6d221 ("RDMA/core: Fix locking in ib_uverbs_event_read") Link: https://lore.kernel.org/r/0-v1-ace813388969+48859-uverbs_poll_fix%25jgg@mellanox.com Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-07RDMA/qedr: Fix synchronization methods and memory leaks in qedrMichal Kalderon3-75/+158
[ Upstream commit 82af6d19d8d9227c22a53ff00b40fb2a4f9fce69 ] Re-design of the iWARP CM related objects reference counting and synchronization methods, to ensure operations are synchronized correctly and that memory allocated for "ep" is properly released. Also makes sure QP memory is not released before ep is finished accessing it. Where as the QP object is created/destroyed by external operations, the ep is created/destroyed by internal operations and represents the tcp connection associated with the QP. QP destruction flow: - needs to wait for ep establishment to complete (either successfully or with error) - needs to wait for ep disconnect to be fully posted to avoid a race condition of disconnect being called after reset. - both the operations above don't always happen, so we use atomic flags to indicate whether the qp destruction flow needs to wait for these completions or not, if the destroy is called before these operations began, the flows will check the flags and not execute them ( connect / disconnect). We use completion structure for waiting for the completions mentioned above. The QP refcnt was modified to kref object. The EP has a kref added to it to handle additional worker thread accessing it. Memory Leaks - https://www.spinics.net/lists/linux-rdma/msg83762.html Concurrency not managed correctly - https://www.spinics.net/lists/linux-rdma/msg67949.html Fixes: de0089e692a9 ("RDMA/qedr: Add iWARP connection management qp related callbacks") Link: https://lore.kernel.org/r/20191027200451.28187-4-michal.kalderon@marvell.com Reported-by: Chuck Lever <chuck.lever@oracle.com> Reported-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-07RDMA/qedr: Fix qpids xarray api usedMichal Kalderon3-4/+4
[ Upstream commit 5fdff18b4dc64e2d1e912ad2b90495cd487f791b ] The qpids xarray isn't accessed from irq context and therefore there is no need to use the xa_XXX_irq version of the apis. Remove the _irq. Fixes: b6014f9e5f39 ("qedr: Convert qpidr to XArray") Link: https://lore.kernel.org/r/20191027200451.28187-3-michal.kalderon@marvell.com Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03IB/ipoib: Fix double free of skb in case of multicast traffic in CM modeValentine Fatiev4-12/+26
[ Upstream commit 1acba6a817852d4aa7916d5c4f2c82f702ee9224 ] When connected mode is set, and we have connected and datagram traffic in parallel, ipoib might crash with double free of datagram skb. The current mechanism assumes that the order in the completion queue is the same as the order of sent packets for all QPs. Order is kept only for specific QP, in case of mixed UD and CM traffic we have few QPs (one UD and few CM's) in parallel. The problem: ---------------------------------------------------------- Transmit queue: ----------------- UD skb pointer kept in queue itself, CM skb kept in spearate queue and uses transmit queue as a placeholder to count the number of total transmitted packets. 0 1 2 3 4 5 6 7 8 9 10 11 12 13 .........127 ------------------------------------------------------------ NL ud1 UD2 CM1 ud3 cm2 cm3 ud4 cm4 ud5 NL NL NL ........... ------------------------------------------------------------ ^ ^ tail head Completion queue (problematic scenario) - the order not the same as in the transmit queue: 1 2 3 4 5 6 7 8 9 ------------------------------------ ud1 CM1 UD2 ud3 cm2 cm3 ud4 cm4 ud5 ------------------------------------ 1. CM1 'wc' processing - skb freed in cm separate ring. - tx_tail of transmit queue increased although UD2 is not freed. Now driver assumes UD2 index is already freed and it could be used for new transmitted skb. 0 1 2 3 4 5 6 7 8 9 10 11 12 13 .........127 ------------------------------------------------------------ NL NL UD2 CM1 ud3 cm2 cm3 ud4 cm4 ud5 NL NL NL ........... ------------------------------------------------------------ ^ ^ ^ (Bad)tail head (Bad - Could be used for new SKB) In this case (due to heavy load) UD2 skb pointer could be replaced by new transmitted packet UD_NEW, as the driver assumes its free. At this point we will have to process two 'wc' with same index but we have only one pointer to free. During second attempt to free the same skb we will have NULL pointer exception. 2. UD2 'wc' processing - skb freed according the index we got from 'wc', but it was already overwritten by mistake. So actually the skb that was released is the skb of the new transmitted packet and not the original one. 3. UD_NEW 'wc' processing - attempt to free already freed skb. NUll pointer exception. The fix: ----------------------------------------------------------------------- The fix is to stop using the UD ring as a placeholder for CM packets, the cyclic ring variables tx_head and tx_tail will manage the UD tx_ring, a new cyclic variables global_tx_head and global_tx_tail are introduced for managing and counting the overall outstanding sent packets, then the send queue will be stopped and waken based on these variables only. Note that no locking is needed since global_tx_head is updated in the xmit flow and global_tx_tail is updated in the NAPI flow only. A previous attempt tried to use one variable to count the outstanding sent packets, but it did not work since xmit and NAPI flows can run at the same time and the counter will be updated wrongly. Thus, we use the same simple cyclic head and tail scheme that we have today for the UD tx_ring. Fixes: 2c104ea68350 ("IB/ipoib: Get rid of the tx_outstanding variable in all modes") Link: https://lore.kernel.org/r/20200527134705.480068-1-leon@kernel.org Signed-off-by: Valentine Fatiev <valentinef@mellanox.com> Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03RDMA/core: Fix double destruction of uobjectJason Gunthorpe1-7/+13
[ Upstream commit c85f4abe66bea0b5db8d28d55da760c4fe0a0301 ] Fix use after free when user user space request uobject concurrently for the same object, within the RCU grace period. In that case, remove_handle_idr_uobject() is called twice and we will have an extra put on the uobject which cause use after free. Fix it by leaving the uobject write locked after it was removed from the idr. Call to rdma_lookup_put_uobject with UVERBS_LOOKUP_DESTROY instead of UVERBS_LOOKUP_WRITE will do the work. refcount_t: underflow; use-after-free. WARNING: CPU: 0 PID: 1381 at lib/refcount.c:28 refcount_warn_saturate+0xfe/0x1a0 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 1381 Comm: syz-executor.0 Not tainted 5.5.0-rc3 #8 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x94/0xce panic+0x234/0x56f __warn+0x1cc/0x1e1 report_bug+0x200/0x310 fixup_bug.part.11+0x32/0x80 do_error_trap+0xd3/0x100 do_invalid_op+0x31/0x40 invalid_op+0x1e/0x30 RIP: 0010:refcount_warn_saturate+0xfe/0x1a0 Code: 0f 0b eb 9b e8 23 f6 6d ff 80 3d 6c d4 19 03 00 75 8d e8 15 f6 6d ff 48 c7 c7 c0 02 55 bd c6 05 57 d4 19 03 01 e8 a2 58 49 ff <0f> 0b e9 6e ff ff ff e8 f6 f5 6d ff 80 3d 42 d4 19 03 00 0f 85 5c RSP: 0018:ffffc90002df7b98 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88810f6a193c RCX: ffffffffba649009 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88811b0283cc RBP: 0000000000000003 R08: ffffed10236060e3 R09: ffffed10236060e3 R10: 0000000000000001 R11: ffffed10236060e2 R12: ffff88810f6a193c R13: ffffc90002df7d60 R14: 0000000000000000 R15: ffff888116ae6a08 uverbs_uobject_put+0xfd/0x140 __uobj_perform_destroy+0x3d/0x60 ib_uverbs_close_xrcd+0x148/0x170 ib_uverbs_write+0xaa5/0xdf0 __vfs_write+0x7c/0x100 vfs_write+0x168/0x4a0 ksys_write+0xc8/0x200 do_syscall_64+0x9c/0x390 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x465b49 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f759d122c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000073bfa8 RCX: 0000000000465b49 RDX: 000000000000000c RSI: 0000000020000080 RDI: 0000000000000003 RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f759d1236bc R13: 00000000004ca27c R14: 000000000070de40 R15: 00000000ffffffff Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: 0x39400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Fixes: 7452a3c745a2 ("IB/uverbs: Allow RDMA_REMOVE_DESTROY to work concurrently with disassociate") Link: https://lore.kernel.org/r/20200527135534.482279-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03RDMA/pvrdma: Fix missing pci disable in pvrdma_pci_probe()Qiushi Wu1-1/+1
[ Upstream commit db857e6ae548f0f4f4a0f63fffeeedf3cca21f9d ] In function pvrdma_pci_probe(), pdev was not disabled in one error path. Thus replace the jump target “err_free_device” by "err_disable_pdev". Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver") Link: https://lore.kernel.org/r/20200523030457.16160-1-wu000273@umn.edu Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03IB/qib: Call kobject_put() when kobject_init_and_add() failsKaike Wan1-4/+5
[ Upstream commit a35cd6447effd5c239b564c80fa109d05ff3d114 ] When kobject_init_and_add() returns an error in the function qib_create_port_files(), the function kobject_put() is not called for the corresponding kobject, which potentially leads to memory leak. This patch fixes the issue by calling kobject_put() even if kobject_init_and_add() fails. In addition, the ppd->diagc_kobj is released along with other kobjects when the sysfs is unregistered. Fixes: f931551bafe1 ("IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters") Link: https://lore.kernel.org/r/20200512031328.189865.48627.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Suggested-by: Lin Yi <teroincn@gmail.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03IB/i40iw: Remove bogus call to netdev_master_upper_dev_get()Denis V. Lunev1-8/+0
[ Upstream commit 856ec7f64688387b100b7083cdf480ce3ac41227 ] Local variable netdev is not used in these calls. It should be noted, that this change is required to work in bonded mode. Otherwise we would get the following assert: "RTNL: assertion failed at net/core/dev.c (5665)" With the calltrace as follows: dump_stack+0x19/0x1b netdev_master_upper_dev_get+0x61/0x70 i40iw_addr_resolve_neigh+0x1e8/0x220 i40iw_make_cm_node+0x296/0x700 ? i40iw_find_listener.isra.10+0xcc/0x110 i40iw_receive_ilq+0x3d4/0x810 i40iw_puda_poll_completion+0x341/0x420 i40iw_process_ceq+0xa5/0x280 i40iw_ceq_dpc+0x1e/0x40 tasklet_action+0x83/0x140 __do_softirq+0x125/0x2bb call_softirq+0x1c/0x30 do_softirq+0x65/0xa0 irq_exit+0x105/0x110 do_IRQ+0x56/0xf0 common_interrupt+0x16a/0x16a ? cpuidle_enter_state+0x57/0xd0 cpuidle_idle_call+0xde/0x230 arch_cpu_idle+0xe/0xc0 cpu_startup_entry+0x14a/0x1e0 start_secondary+0x1f7/0x270 start_cpu+0x5/0x14 Link: https://lore.kernel.org/r/20200428131511.11049-1-den@openvz.org Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20RDMA/iw_cxgb4: Fix incorrect function parametersPotnuri Bharat Teja1-4/+3
[ Upstream commit c8b1f340e54158662acfa41d6dee274846370282 ] While reading the TCB field in t4_tcb_get_field32() the wrong mask is passed as a parameter which leads the driver eventually to a kernel panic/app segfault from access to an illegal SRQ index while flushing the SRQ completions during connection teardown. Fixes: 11a27e2121a5 ("iw_cxgb4: complete the cached SRQ buffers") Link: https://lore.kernel.org/r/20200511185608.5202-1-bharat@chelsio.com Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20RDMA/core: Fix double put of resourceSasha Levin1-1/+1
[ Upstream commit 50bbe3d34fea74b7c0fabe553c40c2f4a48bb9c3 ] Do not decrease the reference count of resource tracker object twice in the error flow of res_get_common_doit. Fixes: c5dfe0ea6ffa ("RDMA/nldev: Add resource tracker doit callback") Link: https://lore.kernel.org/r/20200507062942.98305-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20IB/core: Fix potential NULL pointer dereference in pkey cacheJack Morgenstein1-2/+5
[ Upstream commit 1901b91f99821955eac2bd48fe25ee983385dc00 ] The IB core pkey cache is populated by procedure ib_cache_update(). Initially, the pkey cache pointer is NULL. ib_cache_update allocates a buffer and populates it with the device's pkeys, via repeated calls to procedure ib_query_pkey(). If there is a failure in populating the pkey buffer via ib_query_pkey(), ib_cache_update does not replace the old pkey buffer cache with the updated one -- it leaves the old cache as is. Since initially the pkey buffer cache is NULL, when calling ib_cache_update the first time, a failure in ib_query_pkey() will cause the pkey buffer cache pointer to remain NULL. In this situation, any calls subsequent to ib_get_cached_pkey(), ib_find_cached_pkey(), or ib_find_cached_pkey_exact() will try to dereference the NULL pkey cache pointer, causing a kernel panic. Fix this by checking the ib_cache_update() return value. Fixes: 8faea9fd4a39 ("RDMA/cache: Move the cache per-port data into the main ib_port_data") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/r/20200507071012.100594-1-leon@kernel.org Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20IB/mlx4: Test return value of calls to ib_get_cached_pkeyJack Morgenstein1-3/+11
[ Upstream commit 6693ca95bd4330a0ad7326967e1f9bcedd6b0800 ] In the mlx4_ib_post_send() flow, some functions call ib_get_cached_pkey() without checking its return value. If ib_get_cached_pkey() returns an error code, these functions should return failure. Fixes: 1ffeb2eb8be9 ("IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support") Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Fixes: e622f2f4ad21 ("IB: split struct ib_send_wr") Link: https://lore.kernel.org/r/20200426075921.130074-1-leon@kernel.org Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20RDMA/rxe: Always return ERR_PTR from rxe_create_mmap_info()Sudip Mukherjee2-5/+8
[ Upstream commit bb43c8e382e5da0ee253e3105d4099820ff4d922 ] The commit below modified rxe_create_mmap_info() to return ERR_PTR's but didn't update the callers to handle them. Modify rxe_create_mmap_info() to only return ERR_PTR and fix all error checking after rxe_create_mmap_info() is called. Ensure that all other exit paths properly set the error return. Fixes: ff23dfa13457 ("IB: Pass only ib_udata in function prototypes") Link: https://lore.kernel.org/r/20200425233545.17210-1-sudipm.mukherjee@gmail.com Link: https://lore.kernel.org/r/20200511183742.GB225608@mwanda Cc: stable@vger.kernel.org [5.4+] Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20i40iw: Fix error handling in i40iw_manage_arp_cache()Dan Carpenter1-1/+1
[ Upstream commit 37e31d2d26a4124506c24e95434e9baf3405a23a ] The i40iw_arp_table() function can return -EOVERFLOW if i40iw_alloc_resource() fails so we can't just test for "== -1". Fixes: 4e9042e647ff ("i40iw: add hw and utils files") Link: https://lore.kernel.org/r/20200422092211.GA195357@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20IB/hfi1: Fix another case where pq is left on waitlistMike Marciniszyn1-4/+0
[ Upstream commit fa8dac3968635dec8518a13ac78d662f2aa88e4d ] The commit noted below fixed a case where a pq is left on the sdma wait list. It however missed another case. user_sdma_send_pkts() has two calls from hfi1_user_sdma_process_request(). If the first one fails as indicated by -EBUSY, the pq will be placed on the waitlist as by design. If the second call then succeeds, the pq is still on the waitlist setting up a race with the interrupt handler if a subsequent request uses a different SDMA engine Fix by deleting the first call. The use of pcount and the intent to send a short burst of packets followed by the larger balance of packets was never correctly implemented, because the two calls always send pcount packets no matter what. A subsequent patch will correct that issue. Fixes: 9a293d1e21a6 ("IB/hfi1: Ensure pq is not left on waitlist") Link: https://lore.kernel.org/r/20200504130917.175613.43231.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-06RDMA/cm: Fix an error check in cm_alloc_id_priv()Dan Carpenter1-1/+1
commit 983653515849fb56b78ce55d349bb384d43030f6 upstream. The xa_alloc_cyclic_irq() function returns either 0 or 1 on success and negatives on error. This code treats 1 as an error and returns ERR_PTR(1) which will cause an Oops in the caller. Fixes: ae78ff3a0f0c ("RDMA/cm: Convert local_id_table to XArray") Link: https://lore.kernel.org/r/20200407093714.GA80285@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06RDMA/cm: Fix ordering of xa_alloc_cyclic() in ib_create_cm_id()Jason Gunthorpe1-16/+11
commit e8dc4e885c459343970b25acd9320fe9ee5492e7 upstream. xa_alloc_cyclic() is a SMP release to be paired with some later acquire during xa_load() as part of cm_acquire_id(). As such, xa_alloc_cyclic() must be done after the cm_id is fully initialized, in particular, it absolutely must be after the refcount_set(), otherwise the refcount_inc() in cm_acquire_id() may not see the set. As there are several cases where a reader will be able to use the id.local_id after cm_acquire_id in the IB_CM_IDLE state there needs to be an unfortunate split into a NULL allocate and a finalizing xa_store. Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation") Link: https://lore.kernel.org/r/20200310092545.251365-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06RDMA/core: Fix race between destroy and release FD objectLeon Romanovsky1-1/+1
commit f0abc761bbb9418876cc4d1ebc473e4ea6352e42 upstream. The call to ->lookup_put() was too early and it caused an unlock of the read/write protection of the uobject after the FD was put. This allows a race: CPU1 CPU2 rdma_lookup_put_uobject() lookup_put_fd_uobject() fput() fput() uverbs_uobject_fd_release() WARN_ON(uverbs_try_lock_object(uobj, UVERBS_LOOKUP_WRITE)); atomic_dec(usecnt) Fix the code by changing the order, first unlock and call to ->lookup_put() after that. Fixes: 3832125624b7 ("IB/core: Add support for idr types") Link: https://lore.kernel.org/r/20200423060122.6182-1-leon@kernel.org Suggested-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06RDMA/core: Prevent mixed use of FDs between shared ufilesLeon Romanovsky1-1/+1
commit 0fb00941dc63990a10951146df216fc7b0e20bc2 upstream. FDs can only be used on the ufile that created them, they cannot be mixed to other ufiles. We are lacking a check to prevent it. BUG: KASAN: null-ptr-deref in atomic64_sub_and_test include/asm-generic/atomic-instrumented.h:1547 [inline] BUG: KASAN: null-ptr-deref in atomic_long_sub_and_test include/asm-generic/atomic-long.h:460 [inline] BUG: KASAN: null-ptr-deref in fput_many+0x1a/0x140 fs/file_table.c:336 Write of size 8 at addr 0000000000000038 by task syz-executor179/284 CPU: 0 PID: 284 Comm: syz-executor179 Not tainted 5.5.0-rc5+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x94/0xce lib/dump_stack.c:118 __kasan_report+0x18f/0x1b7 mm/kasan/report.c:510 kasan_report+0xe/0x20 mm/kasan/common.c:639 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x15d/0x1b0 mm/kasan/generic.c:192 atomic64_sub_and_test include/asm-generic/atomic-instrumented.h:1547 [inline] atomic_long_sub_and_test include/asm-generic/atomic-long.h:460 [inline] fput_many+0x1a/0x140 fs/file_table.c:336 rdma_lookup_put_uobject+0x85/0x130 drivers/infiniband/core/rdma_core.c:692 uobj_put_read include/rdma/uverbs_std_types.h:96 [inline] _ib_uverbs_lookup_comp_file drivers/infiniband/core/uverbs_cmd.c:198 [inline] create_cq+0x375/0xba0 drivers/infiniband/core/uverbs_cmd.c:1006 ib_uverbs_create_cq+0x114/0x140 drivers/infiniband/core/uverbs_cmd.c:1089 ib_uverbs_write+0xaa5/0xdf0 drivers/infiniband/core/uverbs_main.c:769 __vfs_write+0x7c/0x100 fs/read_write.c:494 vfs_write+0x168/0x4a0 fs/read_write.c:558 ksys_write+0xc8/0x200 fs/read_write.c:611 do_syscall_64+0x9c/0x390 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x44ef99 Code: 00 b8 00 01 00 00 eb e1 e8 74 1c 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c4 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffc0b74c028 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007ffc0b74c030 RCX: 000000000044ef99 RDX: 0000000000000040 RSI: 0000000020000040 RDI: 0000000000000005 RBP: 00007ffc0b74c038 R08: 0000000000401830 R09: 0000000000401830 R10: 00007ffc0b74c038 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00000000006be018 R15: 0000000000000000 Fixes: cf8966b3477d ("IB/core: Add support for fd objects") Link: https://lore.kernel.org/r/20200421082929.311931-2-leon@kernel.org Suggested-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06RDMA/siw: Fix potential siw_mem refcnt leak in siw_fastreg_mr()Jason Gunthorpe1-4/+11
commit 6e051971b0e2eeb0ce7ec65d3cc8180450512d36 upstream. siw_fastreg_mr() invokes siw_mem_id2obj(), which returns a local reference of the siw_mem object to "mem" with increased refcnt. When siw_fastreg_mr() returns, "mem" becomes invalid, so the refcount should be decreased to keep refcount balanced. The issue happens in one error path of siw_fastreg_mr(). When "base_mr" equals to NULL but "mem" is not NULL, the function forgets to decrease the refcnt increased by siw_mem_id2obj() and causes a refcnt leak. Reorganize the flow so that the goto unwind can be used as expected. Fixes: b9be6f18cf9e ("rdma/siw: transmit path") Link: https://lore.kernel.org/r/1586939949-69856-1-git-send-email-xiyuyang19@fudan.edu.cn Reported-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06RDMA/mlx4: Initialize ib_spec on the stackAlaa Hleihel1-1/+2
commit c08cfb2d8d78bfe81b37cc6ba84f0875bddd0d5c upstream. Initialize ib_spec on the stack before using it, otherwise we will have garbage values that will break creating default rules with invalid parsing error. Fixes: a37a1a428431 ("IB/mlx4: Add mechanism to support flow steering over IB links") Link: https://lore.kernel.org/r/20200413132235.930642-1-leon@kernel.org Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06RDMA/mlx5: Set GRH fields in query QP on RoCEAharon Landau1-1/+3
commit 2d7e3ff7b6f2c614eb21d0dc348957a47eaffb57 upstream. GRH fields such as sgid_index, hop limit, et. are set in the QP context when QP is created/modified. Currently, when query QP is performed, we fill the GRH fields only if the GRH bit is set in the QP context, but this bit is not set for RoCE. Adjust the check so we will set all relevant data for the RoCE too. Since this data is returned to userspace, the below is an ABI regression. Fixes: d8966fcd4c25 ("IB/core: Use rdma_ah_attr accessor functions") Link: https://lore.kernel.org/r/20200413132028.930109-1-leon@kernel.org Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06IB/rdmavt: Always return ERR_PTR from rvt_create_mmap_info()Sudip Mukherjee4-8/+8
commit 47c370c1a5eea9b2f6f026d49e060c3748c89667 upstream. The commit below modified rvt_create_mmap_info() to return ERR_PTR's but didn't update the callers to handle them. Modify rvt_create_mmap_info() to only return ERR_PTR and fix all error checking after rvt_create_mmap_info() was called. Fixes: ff23dfa13457 ("IB: Pass only ib_udata in function prototypes") Link: https://lore.kernel.org/r/20200424173146.10970-1-sudipm.mukherjee@gmail.com Cc: stable@vger.kernel.org [5.4+] Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13RDMA/cm: Update num_paths in cma_resolve_iboe_route error flowAvihai Horon1-0/+1
commit 987914ab841e2ec281a35b54348ab109b4c0bb4e upstream. After a successful allocation of path_rec, num_paths is set to 1, but any error after such allocation will leave num_paths uncleared. This causes to de-referencing a NULL pointer later on. Hence, num_paths needs to be set back to 0 if such an error occurs. The following crash from syzkaller revealed it. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI CPU: 0 PID: 357 Comm: syz-executor060 Not tainted 4.18.0+ #311 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:ib_copy_path_rec_to_user+0x94/0x3e0 Code: f1 f1 f1 f1 c7 40 0c 00 00 f4 f4 65 48 8b 04 25 28 00 00 00 48 89 45 c8 31 c0 e8 d7 60 24 ff 48 8d 7b 4c 48 89 f8 48 c1 e8 03 <42> 0f b6 14 30 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 RSP: 0018:ffff88006586f980 EFLAGS: 00010207 RAX: 0000000000000009 RBX: 0000000000000000 RCX: 1ffff1000d5fe475 RDX: ffff8800621e17c0 RSI: ffffffff820d45f9 RDI: 000000000000004c RBP: ffff88006586fa50 R08: ffffed000cb0df73 R09: ffffed000cb0df72 R10: ffff88006586fa70 R11: ffffed000cb0df73 R12: 1ffff1000cb0df30 R13: ffff88006586fae8 R14: dffffc0000000000 R15: ffff88006aff2200 FS: 00000000016fc880(0000) GS:ffff88006d000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000040 CR3: 0000000063fec000 CR4: 00000000000006b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? ib_copy_path_rec_from_user+0xcc0/0xcc0 ? __mutex_unlock_slowpath+0xfc/0x670 ? wait_for_completion+0x3b0/0x3b0 ? ucma_query_route+0x818/0xc60 ucma_query_route+0x818/0xc60 ? ucma_listen+0x1b0/0x1b0 ? sched_clock_cpu+0x18/0x1d0 ? sched_clock_cpu+0x18/0x1d0 ? ucma_listen+0x1b0/0x1b0 ? ucma_write+0x292/0x460 ucma_write+0x292/0x460 ? ucma_close_id+0x60/0x60 ? sched_clock_cpu+0x18/0x1d0 ? sched_clock_cpu+0x18/0x1d0 __vfs_write+0xf7/0x620 ? ucma_close_id+0x60/0x60 ? kernel_read+0x110/0x110 ? time_hardirqs_on+0x19/0x580 ? lock_acquire+0x18b/0x3a0 ? finish_task_switch+0xf3/0x5d0 ? _raw_spin_unlock_irq+0x29/0x40 ? _raw_spin_unlock_irq+0x29/0x40 ? finish_task_switch+0x1be/0x5d0 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? security_file_permission+0x172/0x1e0 vfs_write+0x192/0x460 ksys_write+0xc6/0x1a0 ? __ia32_sys_read+0xb0/0xb0 ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe ? do_syscall_64+0x1d/0x470 do_syscall_64+0x9e/0x470 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: 3c86aa70bf67 ("RDMA/cm: Add RDMA CM support for IBoE devices") Link: https://lore.kernel.org/r/20200318101741.47211-1-leon@kernel.org Signed-off-by: Avihai Horon <avihaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13RDMA/siw: Fix passive connection establishmentBernard Metzler1-114/+31
commit 33fb27fd54465c74cbffba6315b2f043e90cec4c upstream. Holding the rtnl_lock while iterating a devices interface address list potentially causes deadlocks with the cma_netdev_callback. While this was implemented to limit the scope of a wildcard listen to addresses of the current device only, a better solution limits the scope of the socket to the device. This completely avoiding locking, and also results in significant code simplification. Fixes: c421651fa229 ("RDMA/siw: Add missing rtnl_lock around access to ifa") Link: https://lore.kernel.org/r/20200228173534.26815-1-bmt@zurich.ibm.com Reported-by: syzbot+55de90ab5f44172b0c90@syzkaller.appspotmail.com Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13RDMA/cma: Teach lockdep about the order of rtnl and lockJason Gunthorpe1-0/+13
commit 32ac9e4399b12d3e54d312a0e0e30ed5cd19bd4e upstream. This lock ordering only happens when bonding is enabled and a certain bonding related event fires. However, since it can happen this is a global restriction on lock ordering. Teach lockdep about the order directly and unconditionally so bugs here are found quickly. See https://syzkaller.appspot.com/bug?extid=55de90ab5f44172b0c90 Link: https://lore.kernel.org/r/20200227203651.GA27185@ziepe.ca Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>