<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/block/nbd.c, branch v4.19.112</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.19.112</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.19.112'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2020-02-24T07:34:39+00:00</updated>
<entry>
<title>nbd: add a flush_workqueue in nbd_start_device</title>
<updated>2020-02-24T07:34:39+00:00</updated>
<author>
<name>Sun Ke</name>
<email>sunke32@huawei.com</email>
</author>
<published>2020-01-22T03:18:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7ce9e00a4a1dc570e3ac7d83fe9cc4f8a3fadbf7'/>
<id>urn:sha1:7ce9e00a4a1dc570e3ac7d83fe9cc4f8a3fadbf7</id>
<content type='text'>
[ Upstream commit 5c0dd228b5fc30a3b732c7ae2657e0161ec7ed80 ]

When kzalloc fail, may cause trying to destroy the
workqueue from inside the workqueue.

If num_connections is m (2 &lt; m), and NO.1 ~ NO.n
(1 &lt; n &lt; m) kzalloc are successful. The NO.(n + 1)
failed. Then, nbd_start_device will return ENOMEM
to nbd_start_device_ioctl, and nbd_start_device_ioctl
will return immediately without running flush_workqueue.
However, we still have n recv threads. If nbd_release
run first, recv threads may have to drop the last
config_refs and try to destroy the workqueue from
inside the workqueue.

To fix it, add a flush_workqueue in nbd_start_device.

Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
Signed-off-by: Sun Ke &lt;sunke32@huawei.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nbd: fix shutdown and recv work deadlock v2</title>
<updated>2019-12-31T15:36:36+00:00</updated>
<author>
<name>Mike Christie</name>
<email>mchristi@redhat.com</email>
</author>
<published>2019-12-08T22:51:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e83a26a49356a3dbd4f54102abe17fc594643698'/>
<id>urn:sha1:e83a26a49356a3dbd4f54102abe17fc594643698</id>
<content type='text'>
commit 1c05839aa973cfae8c3db964a21f9c0eef8fcc21 upstream.

This fixes a regression added with:

commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
Author: Mike Christie &lt;mchristi@redhat.com&gt;
Date:   Sun Aug 4 14:10:06 2019 -0500

    nbd: fix max number of supported devs

where we can deadlock during device shutdown. The problem occurs if
the recv_work's nbd_config_put occurs after nbd_start_device_ioctl has
returned and the userspace app has droppped its reference via closing
the device and running nbd_release. The recv_work nbd_config_put call
would then drop the refcount to zero and try to destroy the config which
would try to do destroy_workqueue from the recv work.

This patch just has nbd_start_device_ioctl do a flush_workqueue when it
wakes so we know after the ioctl returns running works have exited. This
also fixes a possible race where we could try to reuse the device while
old recv_works are still running.

Cc: stable@vger.kernel.org
Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
Signed-off-by: Mike Christie &lt;mchristi@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>nbd: prevent memory leak</title>
<updated>2019-12-01T08:17:38+00:00</updated>
<author>
<name>Navid Emamdoost</name>
<email>navid.emamdoost@gmail.com</email>
</author>
<published>2019-09-23T20:09:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=344966da99c962bea479298e4d3744e0c6a513f1'/>
<id>urn:sha1:344966da99c962bea479298e4d3744e0c6a513f1</id>
<content type='text'>
commit 03bf73c315edca28f47451913177e14cd040a216 upstream.

In nbd_add_socket when krealloc succeeds, if nsock's allocation fail the
reallocted memory is leak. The correct behaviour should be assigning the
reallocted memory to config-&gt;socks right after success.

Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Navid Emamdoost &lt;navid.emamdoost@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>nbd:fix memory leak in nbd_get_socket()</title>
<updated>2019-12-01T08:16:09+00:00</updated>
<author>
<name>Sun Ke</name>
<email>sunke32@huawei.com</email>
</author>
<published>2019-11-19T06:09:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=65d153c8ed6505075740f005122d01298f6e474a'/>
<id>urn:sha1:65d153c8ed6505075740f005122d01298f6e474a</id>
<content type='text'>
commit dff10bbea4be47bdb615b036c834a275b7c68133 upstream.

Before returning NULL, put the sock first.

Cc: stable@vger.kernel.org
Fixes: cf1b2326b734 ("nbd: verify socket is supported during setup")
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Reviewed-by: Mike Christie &lt;mchristi@redhat.com&gt;
Signed-off-by: Sun Ke &lt;sunke32@huawei.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>nbd: handle racing with error'ed out commands</title>
<updated>2019-11-10T10:27:35+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2019-10-21T19:56:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1f032ca298dda0cc16eb121bce900f15373f346e'/>
<id>urn:sha1:1f032ca298dda0cc16eb121bce900f15373f346e</id>
<content type='text'>
[ Upstream commit 7ce23e8e0a9cd38338fc8316ac5772666b565ca9 ]

We hit the following warning in production

print_req_error: I/O error, dev nbd0, sector 7213934408 flags 80700
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 25 PID: 32407 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
Workqueue: knbd-recv recv_work [nbd]
RIP: 0010:refcount_sub_and_test_checked+0x53/0x60
Call Trace:
 blk_mq_free_request+0xb7/0xf0
 blk_mq_complete_request+0x62/0xf0
 recv_work+0x29/0xa1 [nbd]
 process_one_work+0x1f5/0x3f0
 worker_thread+0x2d/0x3d0
 ? rescuer_thread+0x340/0x340
 kthread+0x111/0x130
 ? kthread_create_on_node+0x60/0x60
 ret_from_fork+0x1f/0x30
---[ end trace b079c3c67f98bb7c ]---

This was preceded by us timing out everything and shutting down the
sockets for the device.  The problem is we had a request in the queue at
the same time, so we completed the request twice.  This can actually
happen in a lot of cases, we fail to get a ref on our config, we only
have one connection and just error out the command, etc.

Fix this by checking cmd-&gt;status in nbd_read_stat.  We only change this
under the cmd-&gt;lock, so we are safe to check this here and see if we've
already error'ed this command out, which would indicate that we've
completed it as well.

Reviewed-by: Mike Christie &lt;mchristi@redhat.com&gt;
Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nbd: protect cmd-&gt;status with cmd-&gt;lock</title>
<updated>2019-11-10T10:27:34+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2019-10-21T19:56:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=82b7c99ee1410a50b52283f4b80dca31fa7591b6'/>
<id>urn:sha1:82b7c99ee1410a50b52283f4b80dca31fa7591b6</id>
<content type='text'>
[ Upstream commit de6346ecbc8f5591ebd6c44ac164e8b8671d71d7 ]

We already do this for the most part, except in timeout and clear_req.
For the timeout case we take the lock after we grab a ref on the config,
but that isn't really necessary because we're safe to touch the cmd at
this point, so just move the order around.

For the clear_req cause this is initiated by the user, so again is safe.

Reviewed-by: Mike Christie &lt;mchristi@redhat.com&gt;
Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nbd: verify socket is supported during setup</title>
<updated>2019-11-06T12:06:11+00:00</updated>
<author>
<name>Mike Christie</name>
<email>mchristi@redhat.com</email>
</author>
<published>2019-10-17T21:27:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=083322455c67d278c56a66b73f1221f004ee600a'/>
<id>urn:sha1:083322455c67d278c56a66b73f1221f004ee600a</id>
<content type='text'>
[ Upstream commit cf1b2326b734896734c6e167e41766f9cee7686a ]

nbd requires socket families to support the shutdown method so the nbd
recv workqueue can be woken up from its sock_recvmsg call. If the socket
does not support the callout we will leave recv works running or get hangs
later when the device or module is removed.

This adds a check during socket connection/reconnection to make sure the
socket being passed in supports the needed callout.

Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com
Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
Tested-by: Richard W.M. Jones &lt;rjones@redhat.com&gt;
Signed-off-by: Mike Christie &lt;mchristi@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nbd: fix possible sysfs duplicate warning</title>
<updated>2019-11-06T12:06:07+00:00</updated>
<author>
<name>Xiubo Li</name>
<email>xiubli@redhat.com</email>
</author>
<published>2019-09-19T06:14:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cad4448dfc9cefe6eb4c73714b4e99220b22e13d'/>
<id>urn:sha1:cad4448dfc9cefe6eb4c73714b4e99220b22e13d</id>
<content type='text'>
[ Upstream commit 862488105b84ca744b3d8ff131e0fcfe10644be1 ]

1. nbd_put takes the mutex and drops nbd-&gt;ref to 0. It then does
idr_remove and drops the mutex.

2. nbd_genl_connect takes the mutex. idr_find/idr_for_each fails
to find an existing device, so it does nbd_dev_add.

3. just before the nbd_put could call nbd_dev_remove or not finished
totally, but if nbd_dev_add try to add_disk, we can hit:

debugfs: Directory 'nbd1' with parent 'block' already present!

This patch will make sure all the disk add/remove stuff are done
by holding the nbd_index_mutex lock.

Reported-by: Mike Christie &lt;mchristi@redhat.com&gt;
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Xiubo Li &lt;xiubli@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nbd: fix crash when the blksize is zero</title>
<updated>2019-10-11T16:21:26+00:00</updated>
<author>
<name>Xiubo Li</name>
<email>xiubli@redhat.com</email>
</author>
<published>2019-05-29T20:16:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c688982ffaeb71cbf0c6a16a232d037baf21e472'/>
<id>urn:sha1:c688982ffaeb71cbf0c6a16a232d037baf21e472</id>
<content type='text'>
[ Upstream commit 553768d1169a48c0cd87c4eb4ab57534ee663415 ]

This will allow the blksize to be set zero and then use 1024 as
default.

Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Xiubo Li &lt;xiubli@redhat.com&gt;
[fix to use goto out instead of return in genl_connect]
Signed-off-by: Mike Christie &lt;mchristi@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nbd: fix max number of supported devs</title>
<updated>2019-10-11T16:20:46+00:00</updated>
<author>
<name>Mike Christie</name>
<email>mchristi@redhat.com</email>
</author>
<published>2019-08-04T19:10:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9f0f39c92e4f50189155dfb13bb5524372e40eba'/>
<id>urn:sha1:9f0f39c92e4f50189155dfb13bb5524372e40eba</id>
<content type='text'>
commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4 upstream.

This fixes a bug added in 4.10 with commit:

commit 9561a7ade0c205bc2ee035a2ac880478dcc1a024
Author: Josef Bacik &lt;jbacik@fb.com&gt;
Date:   Tue Nov 22 14:04:40 2016 -0500

    nbd: add multi-connection support

that limited the number of devices to 256. Before the patch we could
create 1000s of devices, but the patch switched us from using our
own thread to using a work queue which has a default limit of 256
active works.

The problem is that our recv_work function sits in a loop until
disconnection but only handles IO for one connection. The work is
started when the connection is started/restarted, but if we end up
creating 257 or more connections, the queue_work call just queues
connection257+'s recv_work and that waits for connection 1 - 256's
recv_work to be disconnected and that work instance completing.

Instead of reverting back to kthreads, this has us allocate a
workqueue_struct per device, so we can block in the work.

Cc: stable@vger.kernel.org
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Mike Christie &lt;mchristi@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
