<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/infiniband, branch v4.19.39</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.19.39</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.19.39'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2019-05-02T07:58:54+00:00</updated>
<entry>
<title>RDMA/mlx5: Do not allow the user to write to the clock page</title>
<updated>2019-05-02T07:58:54+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2019-04-16T11:07:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1af34bf97eca3624b5f2ec786527c8609e1a68b8'/>
<id>urn:sha1:1af34bf97eca3624b5f2ec786527c8609e1a68b8</id>
<content type='text'>
commit c660133c339f9ab684fdf568c0d51b9ae5e86002 upstream.

The intent of this VMA was to be read-only from user space, but the
VM_MAYWRITE masking was missed, so mprotect could make it writable.

Cc: stable@vger.kernel.org
Fixes: 5c99eaecb1fc ("IB/mlx5: Mmap the HCA's clock info to user-space")
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Reviewed-by: Haggai Eran &lt;haggaie@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>IB/rdmavt: Fix frwr memory registration</title>
<updated>2019-05-02T07:58:54+00:00</updated>
<author>
<name>Josh Collier</name>
<email>josh.d.collier@intel.com</email>
</author>
<published>2019-04-15T18:34:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8ac57d1bdc255f07c1cec130e8a860457e6bb7d3'/>
<id>urn:sha1:8ac57d1bdc255f07c1cec130e8a860457e6bb7d3</id>
<content type='text'>
commit 7c39f7f671d2acc0a1f39ebbbee4303ad499bbfa upstream.

Current implementation was not properly handling frwr memory
registrations. This was uncovered by commit 27f26cec761das ("xprtrdma:
Plant XID in on-the-wire RDMA offset (FRWR)") in which xprtrdma, which is
used for NFS over RDMA, started failing as it was the first ULP to modify
the ib_mr iova resulting in the NFS server getting REMOTE ACCESS ERROR
when attempting to perform RDMA Writes to the client.

The fix is to properly capture the true iova, offset, and length in the
call to ib_map_mr_sg, and then update the iova when processing the
IB_WR_REG_MEM on the send queue.

Fixes: a41081aa5936 ("IB/rdmavt: Add support for ib_map_mr_sg")
Cc: stable@vger.kernel.org
Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Reviewed-by: Michael J. Ruhl &lt;michael.j.ruhl@intel.com&gt;
Signed-off-by: Josh Collier &lt;josh.d.collier@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>IB/hfi1: Failed to drain send queue when QP is put into error state</title>
<updated>2019-04-20T07:16:05+00:00</updated>
<author>
<name>Kaike Wan</name>
<email>kaike.wan@intel.com</email>
</author>
<published>2019-03-18T16:55:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7a4626798276af0f8da71d4c9a5822991a9300a1'/>
<id>urn:sha1:7a4626798276af0f8da71d4c9a5822991a9300a1</id>
<content type='text'>
commit 662d66466637862ef955f7f6e78a286d8cf0ebef upstream.

When a QP is put into error state, all pending requests in the send work
queue should be drained. The following sequence of events could lead to a
failure, causing a request to hang:

(1) The QP builds a packet and tries to send through SDMA engine.
    However, PIO engine is still busy. Consequently, this packet is put on
    the QP's tx list and the QP is put on the PIO waiting list. The field
    qp-&gt;s_flags is set with HFI1_S_WAIT_PIO_DRAIN;

(2) The QP is put into error state by the user application and
    notify_error_qp() is called, which removes the QP from the PIO waiting
    list and the packet from the QP's tx list. In addition, qp-&gt;s_flags is
    cleared of RVT_S_ANY_WAIT_IO bits, which does not include
    HFI1_S_WAIT_PIO_DRAIN bit;

(3) The hfi1_schdule_send() function is called to drain the QP's send
    queue. Subsequently, hfi1_do_send() is called. Since the flag bit
    HFI1_S_WAIT_PIO_DRAIN is set in qp-&gt;s_flags, hfi1_send_ok() fails.  As
    a result, hfi1_do_send() bails out without draining any request from
    the send queue;

(4) The PIO engine completes the sending and tries to wake up any QP on
    its waiting list. But the QP has been removed from the PIO waiting
    list and therefore is kept in sleep forever.

The fix is to clear qp-&gt;s_flags of HFI1_S_ANY_WAIT_IO bits in step (2).
HFI1_S_ANY_WAIT_IO includes RVT_S_ANY_WAIT_IO and HFI1_S_WAIT_PIO_DRAIN.

Fixes: 2e2ba09e48b7 ("IB/rdmavt, IB/hfi1: Create device dependent s_flags")
Cc: &lt;stable@vger.kernel.org&gt; # 4.19.x+
Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Reviewed-by: Alex Estrin &lt;alex.estrin@intel.com&gt;
Signed-off-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</content>
</entry>
<entry>
<title>i40iw: Avoid panic when handling the inetdev event</title>
<updated>2019-04-20T07:15:55+00:00</updated>
<author>
<name>Feng Tang</name>
<email>feng.tang@intel.com</email>
</author>
<published>2019-03-14T10:37:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f596ad9ea8a436fe2ebd902125d9c474a995d018'/>
<id>urn:sha1:f596ad9ea8a436fe2ebd902125d9c474a995d018</id>
<content type='text'>
[ Upstream commit ec4fe4bcc584b55e24e8d1768f5510a62c0fd619 ]

There is a panic reported that on a system with x722 ethernet, when doing
the operations like:

	# ip link add br0 type bridge
	# ip link set eno1 master br0
	# systemctl restart systemd-networkd

The system will panic "BUG: unable to handle kernel null pointer
dereference at 0000000000000034", with call chain:

	i40iw_inetaddr_event
	notifier_call_chain
	blocking_notifier_call_chain
	notifier_call_chain
	__inet_del_ifa
	inet_rtm_deladdr
	rtnetlink_rcv_msg
	netlink_rcv_skb
	rtnetlink_rcv
	netlink_unicast
	netlink_sendmsg
	sock_sendmsg
	__sys_sendto

It is caused by "local_ipaddr = ntohl(in-&gt;ifa_list-&gt;ifa_address)", while
the in-&gt;ifa_list is NULL.

So add a check for the "in-&gt;ifa_list == NULL" case, and skip the ARP
operation accordingly.

Signed-off-by: Feng Tang &lt;feng.tang@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>IB/mlx4: Fix race condition between catas error reset and aliasguid flows</title>
<updated>2019-04-20T07:15:55+00:00</updated>
<author>
<name>Jack Morgenstein</name>
<email>jackm@dev.mellanox.co.il</email>
</author>
<published>2019-03-06T17:17:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=74ed2226e65a94ea2492d6a99bb2d5d4ec84c4dd'/>
<id>urn:sha1:74ed2226e65a94ea2492d6a99bb2d5d4ec84c4dd</id>
<content type='text'>
[ Upstream commit 587443e7773e150ae29e643ee8f41a1eed226565 ]

Code review revealed a race condition which could allow the catas error
flow to interrupt the alias guid query post mechanism at random points.
Thiis is fixed by doing cancel_delayed_work_sync() instead of
cancel_delayed_work() during the alias guid mechanism destroy flow.

Fixes: a0c64a17aba8 ("mlx4: Add alias_guid mechanism")
Signed-off-by: Jack Morgenstein &lt;jackm@dev.mellanox.co.il&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iw_cxgb4: fix srqidx leak during connection abort</title>
<updated>2019-04-05T20:33:09+00:00</updated>
<author>
<name>Raju Rangoju</name>
<email>rajur@chelsio.com</email>
</author>
<published>2019-02-06T17:24:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5203cf8e29e57599138e7bfb61b786a7d84d3988'/>
<id>urn:sha1:5203cf8e29e57599138e7bfb61b786a7d84d3988</id>
<content type='text'>
[ Upstream commit f368ff188ae4b3ef6f740a15999ea0373261b619 ]

When an application aborts the connection by moving QP from RTS to ERROR,
then iw_cxgb4's modify_rc_qp() RTS-&gt;ERROR logic sets the
*srqidxp to 0 via t4_set_wq_in_error(&amp;qhp-&gt;wq, 0), and aborts the
connection by calling c4iw_ep_disconnect().

c4iw_ep_disconnect() does the following:
 1. sends up a close_complete_upcall(ep, -ECONNRESET) to libcxgb4.
 2. sends abort request CPL to hw.

But, since the close_complete_upcall() is sent before sending the
ABORT_REQ to hw, libcxgb4 would fail to release the srqidx if the
connection holds one. Because, the srqidx is passed up to libcxgb4 only
after corresponding ABORT_RPL is processed by kernel in abort_rpl().

This patch handle the corner-case by moving the call to
close_complete_upcall() from c4iw_ep_disconnect() to abort_rpl().  So that
libcxgb4 is notified about the -ECONNRESET only after abort_rpl(), and
libcxgb4 can relinquish the srqidx properly.

Signed-off-by: Raju Rangoju &lt;rajur@chelsio.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>IB/mlx4: Increase the timeout for CM cache</title>
<updated>2019-04-05T20:33:03+00:00</updated>
<author>
<name>Håkon Bugge</name>
<email>haakon.bugge@oracle.com</email>
</author>
<published>2019-02-17T14:45:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d3ec442d609b2476efb5b6556e1c176b543690f4'/>
<id>urn:sha1:d3ec442d609b2476efb5b6556e1c176b543690f4</id>
<content type='text'>
[ Upstream commit 2612d723aadcf8281f9bf8305657129bd9f3cd57 ]

Using CX-3 virtual functions, either from a bare-metal machine or
pass-through from a VM, MAD packets are proxied through the PF driver.

Since the VF drivers have separate name spaces for MAD Transaction Ids
(TIDs), the PF driver has to re-map the TIDs and keep the book keeping
in a cache.

Following the RDMA Connection Manager (CM) protocol, it is clear when
an entry has to evicted form the cache. But life is not perfect,
remote peers may die or be rebooted. Hence, it's a timeout to wipe out
a cache entry, when the PF driver assumes the remote peer has gone.

During workloads where a high number of QPs are destroyed concurrently,
excessive amount of CM DREQ retries has been observed

The problem can be demonstrated in a bare-metal environment, where two
nodes have instantiated 8 VFs each. This using dual ported HCAs, so we
have 16 vPorts per physical server.

64 processes are associated with each vPort and creates and destroys
one QP for each of the remote 64 processes. That is, 1024 QPs per
vPort, all in all 16K QPs. The QPs are created/destroyed using the
CM.

When tearing down these 16K QPs, excessive CM DREQ retries (and
duplicates) are observed. With some cat/paste/awk wizardry on the
infiniband_cm sysfs, we observe as sum of the 16 vPorts on one of the
nodes:

cm_rx_duplicates:
      dreq  2102
cm_rx_msgs:
      drep  1989
      dreq  6195
       rep  3968
       req  4224
       rtu  4224
cm_tx_msgs:
      drep  4093
      dreq 27568
       rep  4224
       req  3968
       rtu  3968
cm_tx_retries:
      dreq 23469

Note that the active/passive side is equally distributed between the
two nodes.

Enabling pr_debug in cm.c gives tons of:

[171778.814239] &lt;mlx4_ib&gt; mlx4_ib_multiplex_cm_handler: id{slave:
1,sl_cm_id: 0xd393089f} is NULL!

By increasing the CM_CLEANUP_CACHE_TIMEOUT from 5 to 30 seconds, the
tear-down phase of the application is reduced from approximately 90 to
50 seconds. Retries/duplicates are also significantly reduced:

cm_rx_duplicates:
      dreq  2460
[]
cm_tx_retries:
      dreq  3010
       req    47

Increasing the timeout further didn't help, as these duplicates and
retries stems from a too short CMA timeout, which was 20 (~4 seconds)
on the systems. By increasing the CMA timeout to 22 (~17 seconds), the
numbers fell down to about 10 for both of them.

Adjustment of the CMA timeout is not part of this commit.

Signed-off-by: Håkon Bugge &lt;haakon.bugge@oracle.com&gt;
Acked-by: Jack Morgenstein &lt;jackm@dev.mellanox.co.il&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/cma: Rollback source IP address if failing to acquire device</title>
<updated>2019-03-27T05:14:42+00:00</updated>
<author>
<name>Myungho Jung</name>
<email>mhjungk@gmail.com</email>
</author>
<published>2019-01-10T06:27:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9dd5053c8cd5f35848705ef9310d31089e4fdd0b'/>
<id>urn:sha1:9dd5053c8cd5f35848705ef9310d31089e4fdd0b</id>
<content type='text'>
commit 5fc01fb846bce8fa6d5f95e2625b8ce0f8e86810 upstream.

If cma_acquire_dev_by_src_ip() returns error in addr_handler(), the
device state changes back to RDMA_CM_ADDR_BOUND but the resolved source
IP address is still left. After that, if rdma_destroy_id() is called
after rdma_listen(), the device is freed without removed from
listen_any_list in cma_cancel_operation(). Revert to the previous IP
address if acquiring device fails.

Reported-by: syzbot+f3ce716af730c8f96637@syzkaller.appspotmail.com
Signed-off-by: Myungho Jung &lt;mhjungk@gmail.com&gt;
Reviewed-by: Parav Pandit &lt;parav@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>IB/hfi1: Close race condition on user context disable and close</title>
<updated>2019-03-23T19:10:02+00:00</updated>
<author>
<name>Michael J. Ruhl</name>
<email>michael.j.ruhl@intel.com</email>
</author>
<published>2019-02-26T16:45:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=54674984d4780a8d7f2a938e69918b09e02fa79f'/>
<id>urn:sha1:54674984d4780a8d7f2a938e69918b09e02fa79f</id>
<content type='text'>
commit bc5add09764c123f58942a37c8335247e683d234 upstream.

When disabling and removing a receive context, it is possible for an
asynchronous event (i.e IRQ) to occur.  Because of this, there is a race
between cleaning up the context, and the context being used by the
asynchronous event.

cpu 0  (context cleanup)
    rc-&gt;ref_count-- (ref_count == 0)
    hfi1_rcd_free()
cpu 1  (IRQ (with rcd index))
	rcd_get_by_index()
	lock
	ref_count+++     &lt;-- reference count race (WARNING)
	return rcd
	unlock
cpu 0
    hfi1_free_ctxtdata() &lt;-- incorrect free location
    lock
    remove rcd from array
    unlock
    free rcd

This race will cause the following WARNING trace:

WARNING: CPU: 0 PID: 175027 at include/linux/kref.h:52 hfi1_rcd_get_by_index+0x84/0xa0 [hfi1]
CPU: 0 PID: 175027 Comm: IMB-MPI1 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.x86_64 #1
Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0076.C4.111920150602 11/19/2015
Call Trace:
  dump_stack+0x19/0x1b
  __warn+0xd8/0x100
  warn_slowpath_null+0x1d/0x20
  hfi1_rcd_get_by_index+0x84/0xa0 [hfi1]
  is_rcv_urgent_int+0x24/0x90 [hfi1]
  general_interrupt+0x1b6/0x210 [hfi1]
  __handle_irq_event_percpu+0x44/0x1c0
  handle_irq_event_percpu+0x32/0x80
  handle_irq_event+0x3c/0x60
  handle_edge_irq+0x7f/0x150
  handle_irq+0xe4/0x1a0
  do_IRQ+0x4d/0xf0
  common_interrupt+0x162/0x162

The race can also lead to a use after free which could be similar to:

general protection fault: 0000 1 SMP
CPU: 71 PID: 177147 Comm: IMB-MPI1 Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.el7.x86_64 #1
Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0076.C4.111920150602 11/19/2015
task: ffff9962a8098000 ti: ffff99717a508000 task.ti: ffff99717a508000 __kmalloc+0x94/0x230
Call Trace:
  ? hfi1_user_sdma_process_request+0x9c8/0x1250 [hfi1]
  hfi1_user_sdma_process_request+0x9c8/0x1250 [hfi1]
  hfi1_aio_write+0xba/0x110 [hfi1]
  do_sync_readv_writev+0x7b/0xd0
  do_readv_writev+0xce/0x260
  ? handle_mm_fault+0x39d/0x9b0
  ? pick_next_task_fair+0x5f/0x1b0
  ? sched_clock_cpu+0x85/0xc0
  ? __schedule+0x13a/0x890
  vfs_writev+0x35/0x60
  SyS_writev+0x7f/0x110
  system_call_fastpath+0x22/0x27

Use the appropriate kref API to verify access.

Reorder context cleanup to ensure context removal before cleanup occurs
correctly.

Cc: stable@vger.kernel.org # v4.14.0+
Fixes: f683c80ca68e ("IB/hfi1: Resolve kernel panics by reference counting receive contexts")
Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Michael J. Ruhl &lt;michael.j.ruhl@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>IB/ipoib: Fix for use-after-free in ipoib_cm_tx_start</title>
<updated>2019-03-13T21:02:28+00:00</updated>
<author>
<name>Feras Daoud</name>
<email>ferasda@mellanox.com</email>
</author>
<published>2019-01-24T12:33:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1ee82160e2eff77606a70e72a2017c16d88876a7'/>
<id>urn:sha1:1ee82160e2eff77606a70e72a2017c16d88876a7</id>
<content type='text'>
[ Upstream commit 6ab4aba00f811a5265acc4d3eb1863bb3ca60562 ]

The following BUG was reported by kasan:

 BUG: KASAN: use-after-free in ipoib_cm_tx_start+0x430/0x1390 [ib_ipoib]
 Read of size 80 at addr ffff88034c30bcd0 by task kworker/u16:1/24020

 Workqueue: ipoib_wq ipoib_cm_tx_start [ib_ipoib]
 Call Trace:
  dump_stack+0x9a/0xeb
  print_address_description+0xe3/0x2e0
  kasan_report+0x18a/0x2e0
  ? ipoib_cm_tx_start+0x430/0x1390 [ib_ipoib]
  memcpy+0x1f/0x50
  ipoib_cm_tx_start+0x430/0x1390 [ib_ipoib]
  ? kvm_clock_read+0x1f/0x30
  ? ipoib_cm_skb_reap+0x610/0x610 [ib_ipoib]
  ? __lock_is_held+0xc2/0x170
  ? process_one_work+0x880/0x1960
  ? process_one_work+0x912/0x1960
  process_one_work+0x912/0x1960
  ? wq_pool_ids_show+0x310/0x310
  ? lock_acquire+0x145/0x440
  worker_thread+0x87/0xbb0
  ? process_one_work+0x1960/0x1960
  kthread+0x314/0x3d0
  ? kthread_create_worker_on_cpu+0xc0/0xc0
  ret_from_fork+0x3a/0x50

 Allocated by task 0:
  kasan_kmalloc+0xa0/0xd0
  kmem_cache_alloc_trace+0x168/0x3e0
  path_rec_create+0xa2/0x1f0 [ib_ipoib]
  ipoib_start_xmit+0xa98/0x19e0 [ib_ipoib]
  dev_hard_start_xmit+0x159/0x8d0
  sch_direct_xmit+0x226/0xb40
  __dev_queue_xmit+0x1d63/0x2950
  neigh_update+0x889/0x1770
  arp_process+0xc47/0x21f0
  arp_rcv+0x462/0x760
  __netif_receive_skb_core+0x1546/0x2da0
  netif_receive_skb_internal+0xf2/0x590
  napi_gro_receive+0x28e/0x390
  ipoib_ib_handle_rx_wc_rss+0x873/0x1b60 [ib_ipoib]
  ipoib_rx_poll_rss+0x17d/0x320 [ib_ipoib]
  net_rx_action+0x427/0xe30
  __do_softirq+0x28e/0xc42

 Freed by task 26680:
  __kasan_slab_free+0x11d/0x160
  kfree+0xf5/0x360
  ipoib_flush_paths+0x532/0x9d0 [ib_ipoib]
  ipoib_set_mode_rss+0x1ad/0x560 [ib_ipoib]
  set_mode+0xc8/0x150 [ib_ipoib]
  kernfs_fop_write+0x279/0x440
  __vfs_write+0xd8/0x5c0
  vfs_write+0x15e/0x470
  ksys_write+0xb8/0x180
  do_syscall_64+0x9b/0x420
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

 The buggy address belongs to the object at ffff88034c30bcc8
                which belongs to the cache kmalloc-512 of size 512
 The buggy address is located 8 bytes inside of
                512-byte region [ffff88034c30bcc8, ffff88034c30bec8)
 The buggy address belongs to the page:

The following race between change mode and xmit flow is the reason for
this use-after-free:

Change mode     Send packet 1 to GID XX      Send packet 2 to GID XX
     |                    |                             |
   start                  |                             |
     |                    |                             |
     |                    |                             |
     |         Create new path for GID XX               |
     |           and update neigh path                  |
     |                    |                             |
     |                    |                             |
     |                    |                             |
 flush_paths              |                             |
                          |                             |
               queue_work(cm.start_task)                |
                          |                 Path for GID XX not found
                          |                      create new path
                          |
                          |
               start_task runs with old
                    released path

There is no locking to protect the lifetime of the path through the
ipoib_cm_tx struct, so delete it entirely and always use the newly looked
up path under the priv-&gt;lock.

Fixes: 546481c2816e ("IB/ipoib: Fix memory corruption in ipoib cm mode connect flow")
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Reviewed-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
