<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/block/drbd, branch v4.4.235</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.4.235</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.4.235'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2020-01-29T09:21:47+00:00</updated>
<entry>
<title>signal: Allow cifs and drbd to receive their terminating signals</title>
<updated>2020-01-29T09:21:47+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2019-08-16T17:33:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a1800bbb992418c5572d7136b59624a967ccf055'/>
<id>urn:sha1:a1800bbb992418c5572d7136b59624a967ccf055</id>
<content type='text'>
[ Upstream commit 33da8e7c814f77310250bb54a9db36a44c5de784 ]

My recent to change to only use force_sig for a synchronous events
wound up breaking signal reception cifs and drbd.  I had overlooked
the fact that by default kthreads start out with all signals set to
SIG_IGN.  So a change I thought was safe turned out to have made it
impossible for those kernel thread to catch their signals.

Reverting the work on force_sig is a bad idea because what the code
was doing was very much a misuse of force_sig.  As the way force_sig
ultimately allowed the signal to happen was to change the signal
handler to SIG_DFL.  Which after the first signal will allow userspace
to send signals to these kernel threads.  At least for
wake_ack_receiver in drbd that does not appear actively wrong.

So correct this problem by adding allow_kernel_signal that will allow
signals whose siginfo reports they were sent by the kernel through,
but will not allow userspace generated signals, and update cifs and
drbd to call allow_kernel_signal in an appropriate place so that their
thread can receive this signal.

Fixing things this way ensures that userspace won't be able to send
signals and cause problems, that it is clear which signals the
threads are expecting to receive, and it guarantees that nothing
else in the system will be affected.

This change was partly inspired by similar cifs and drbd patches that
added allow_signal.

Reported-by: ronnie sahlberg &lt;ronniesahlberg@gmail.com&gt;
Reported-by: Christoph Böhmwalder &lt;christoph.boehmwalder@linbit.com&gt;
Tested-by: Christoph Böhmwalder &lt;christoph.boehmwalder@linbit.com&gt;
Cc: Steve French &lt;smfrench@gmail.com&gt;
Cc: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Cc: David Laight &lt;David.Laight@ACULAB.COM&gt;
Fixes: 247bc9470b1e ("cifs: fix rmmod regression in cifs.ko caused by force_sig changes")
Fixes: 72abe3bcf091 ("signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig")
Fixes: fee109901f39 ("signal/drbd: Use send_sig not force_sig")
Fixes: 3cf5d076fb4d ("signal: Remove task parameter from force_sig")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drbd: fix print_st_err()'s prototype to match the definition</title>
<updated>2019-12-05T14:26:59+00:00</updated>
<author>
<name>Luc Van Oostenryck</name>
<email>luc.vanoostenryck@gmail.com</email>
</author>
<published>2018-12-20T16:23:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d04e4286035c99279dcb5309e7b52a7860ba4ff7'/>
<id>urn:sha1:d04e4286035c99279dcb5309e7b52a7860ba4ff7</id>
<content type='text'>
[ Upstream commit 2c38f035117331eb78d0504843c79ea7c7fabf37 ]

print_st_err() is defined with its 4th argument taking an
'enum drbd_state_rv' but its prototype use an int for it.

Fix this by using 'enum drbd_state_rv' in the prototype too.

Signed-off-by: Luc Van Oostenryck &lt;luc.vanoostenryck@gmail.com&gt;
Signed-off-by: Roland Kammerer &lt;roland.kammerer@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drbd: reject attach of unsuitable uuids even if connected</title>
<updated>2019-12-05T14:26:59+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2018-12-20T16:23:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bdf15725f68a4413f4f1884a90f5740e64627518'/>
<id>urn:sha1:bdf15725f68a4413f4f1884a90f5740e64627518</id>
<content type='text'>
[ Upstream commit fe43ed97bba3b11521abd934b83ed93143470e4f ]

Multiple failure scenario:
a) all good
   Connected Primary/Secondary UpToDate/UpToDate
b) lose disk on Primary,
   Connected Primary/Secondary Diskless/UpToDate
c) continue to write to the device,
   changes only make it to the Secondary storage.
d) lose disk on Secondary,
   Connected Primary/Secondary Diskless/Diskless
e) now try to re-attach on Primary

This would have succeeded before, even though that is clearly the
wrong data set to attach to (missing the modifications from c).
Because we only compared our "effective" and the "to-be-attached"
data generation uuid tags if (device-&gt;state.conn &lt; C_CONNECTED).

Fix: change that constraint to (device-&gt;state.pdsk != D_UP_TO_DATE)
compare the uuids, and reject the attach.

This patch also tries to improve the reverse scenario:
first lose Secondary, then Primary disk,
then try to attach the disk on Secondary.

Before this patch, the attach on the Secondary succeeds, but since commit
drbd: disconnect, if the wrong UUIDs are attached on a connected peer
the Primary will notice unsuitable data, and drop the connection hard.

Though unfortunately at a point in time during the handshake where
we cannot easily abort the attach on the peer without more
refactoring of the handshake.

We now reject any attach to "unsuitable" uuids,
as long as we can see a Primary role,
unless we already have access to "good" data.

Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: drbd: remove a stray unlock in __drbd_send_protocol()</title>
<updated>2019-12-05T14:26:37+00:00</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@oracle.com</email>
</author>
<published>2019-11-07T07:48:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2c2e31f88e1a1d64eec3fe6786198e03790961b7'/>
<id>urn:sha1:2c2e31f88e1a1d64eec3fe6786198e03790961b7</id>
<content type='text'>
[ Upstream commit 8e9c523016cf9983b295e4bc659183d1fa6ef8e0 ]

There are two callers of this function and they both unlock the mutex so
this ends up being a double unlock.

Fixes: 44ed167da748 ("drbd: rcu_read_lock() and rcu_dereference() for tconn-&gt;net_conf")
Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drbd: skip spurious timeout (ping-timeo) when failing promote</title>
<updated>2019-02-20T09:13:11+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2018-12-20T16:23:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=eb15351d9c125d6f4e34ffbc327f4f2d7d7de4d2'/>
<id>urn:sha1:eb15351d9c125d6f4e34ffbc327f4f2d7d7de4d2</id>
<content type='text'>
[ Upstream commit 9848b6ddd8c92305252f94592c5e278574e7a6ac ]

If you try to promote a Secondary while connected to a Primary
and allow-two-primaries is NOT set, we will wait for "ping-timeout"
to give this node a chance to detect a dead primary,
in case the cluster manager noticed faster than we did.

But if we then are *still* connected to a Primary,
we fail (after an additional timeout of ping-timout).

This change skips the spurious second timeout.

Most people won't notice really,
since "ping-timeout" by default is half a second.

But in some installations, ping-timeout may be 10 or 20 seconds or more,
and spuriously delaying the error return becomes annoying.

Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drbd: disconnect, if the wrong UUIDs are attached on a connected peer</title>
<updated>2019-02-20T09:13:11+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2018-12-20T16:23:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5c857f014b05c3565becd447d94d0d9e7e88f96a'/>
<id>urn:sha1:5c857f014b05c3565becd447d94d0d9e7e88f96a</id>
<content type='text'>
[ Upstream commit b17b59602b6dcf8f97a7dc7bc489a48388d7063a ]

With "on-no-data-accessible suspend-io", DRBD requires the next attach
or connect to be to the very same data generation uuid tag it lost last.

If we first lost connection to the peer,
then later lost connection to our own disk,
we would usually refuse to re-connect to the peer,
because it presents the wrong data set.

However, if the peer first connects without a disk,
and then attached its disk, we accepted that same wrong data set,
which would be "unexpected" by any user of that DRBD
and cause "undefined results" (read: very likely data corruption).

The fix is to forcefully disconnect as soon as we notice that the peer
attached to the "wrong" dataset.

Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drbd: narrow rcu_read_lock in drbd_sync_handshake</title>
<updated>2019-02-20T09:13:11+00:00</updated>
<author>
<name>Roland Kammerer</name>
<email>roland.kammerer@linbit.com</email>
</author>
<published>2018-12-20T16:23:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3b5d41253b54509853bfde93b6e2f17427e07908'/>
<id>urn:sha1:3b5d41253b54509853bfde93b6e2f17427e07908</id>
<content type='text'>
[ Upstream commit d29e89e34952a9ad02c77109c71a80043544296e ]

So far there was the possibility that we called
genlmsg_new(GFP_NOIO)/mutex_lock() while holding an rcu_read_lock().

This included cases like:

drbd_sync_handshake (acquire the RCU lock)
  drbd_asb_recover_1p
    drbd_khelper
      drbd_bcast_event
        genlmsg_new(GFP_NOIO) --&gt; may sleep

drbd_sync_handshake (acquire the RCU lock)
  drbd_asb_recover_1p
    drbd_khelper
      notify_helper
        genlmsg_new(GFP_NOIO) --&gt; may sleep

drbd_sync_handshake (acquire the RCU lock)
  drbd_asb_recover_1p
    drbd_khelper
      notify_helper
        mutex_lock --&gt; may sleep

While using GFP_ATOMIC whould have been possible in the first two cases,
the real fix is to narrow the rcu_read_lock.

Reported-by: Jia-Ju Bai &lt;baijiaju1990@163.com&gt;
Reviewed-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Roland Kammerer &lt;roland.kammerer@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drbd: fix access after free</title>
<updated>2018-07-11T14:03:48+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2018-06-25T09:39:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=df0fe72e2e1f18da3d045f0211b2c833d22578ff'/>
<id>urn:sha1:df0fe72e2e1f18da3d045f0211b2c833d22578ff</id>
<content type='text'>
commit 64dafbc9530c10300acffc57fae3269d95fa8f93 upstream.

We have
  struct drbd_requests { ... struct bio *private_bio;  ... }
to hold a bio clone for local submission.

On local IO completion, we put that bio, and in case we want to use the
result later, we overload that member to hold the ERR_PTR() of the
completion result,

Which, before v4.3, used to be the passed in "int error",
so we could first bio_put(), then assign.

v4.3-rc1~100^2~21 4246a0b63bd8 block: add a bi_error field to struct bio
changed that:
  	bio_put(req-&gt;private_bio);
 -	req-&gt;private_bio = ERR_PTR(error);
 +	req-&gt;private_bio = ERR_PTR(bio-&gt;bi_error);

Which introduces an access after free,
because it was non obvious that req-&gt;private_bio == bio.

Impact of that was mostly unnoticable, because we only use that value
in a multiple-failure case, and even then map any "unexpected" error
code to EIO, so worst case we could potentially mask a more specific
error with EIO in a multiple failure case.

Unless the pointed to memory region was unmapped, as is the case with
CONFIG_DEBUG_PAGEALLOC, in which case this results in

  BUG: unable to handle kernel paging request

v4.13-rc1~70^2~75 4e4cbee93d56 block: switch bios to blk_status_t
changes it further to
  	bio_put(req-&gt;private_bio);
  	req-&gt;private_bio = ERR_PTR(blk_status_to_errno(bio-&gt;bi_status));

And blk_status_to_errno() now contains a WARN_ON_ONCE() for unexpected
values, which catches this "sometimes", if the memory has been reused
quickly enough for other things.

Should also go into stable since 4.3, with the trivial change around 4.13.

Cc: stable@vger.kernel.org
Fixes: 4246a0b63bd8 block: add a bi_error field to struct bio
Reported-by: Sarah Newman &lt;srn@prgmr.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>drbd: avoid redefinition of BITS_PER_PAGE</title>
<updated>2017-05-08T05:46:01+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2015-03-20T14:47:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ee6b88767e872fcc7af8e786a05d29d1ba229e75'/>
<id>urn:sha1:ee6b88767e872fcc7af8e786a05d29d1ba229e75</id>
<content type='text'>
commit 2630628b2dbc3fc320aafaf84836119e4e3d62f1 upstream.

Apparently we now implicitly get definitions for BITS_PER_PAGE and
BITS_PER_PAGE_MASK from the pid_namespace.h

Instead of renaming our defines, I chose to define only if not yet
defined, but to double check the value if already defined.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>drbd: Fix kernel_sendmsg() usage - potential NULL deref</title>
<updated>2016-11-18T09:48:35+00:00</updated>
<author>
<name>Richard Weinberger</name>
<email>richard@nod.at</email>
</author>
<published>2016-11-09T21:52:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f523deedff82342859cb3628e5d164ceccfdbc7e'/>
<id>urn:sha1:f523deedff82342859cb3628e5d164ceccfdbc7e</id>
<content type='text'>
commit d8e9e5e80e882b4f90cba7edf1e6cb7376e52e54 upstream.

Don't pass a size larger than iov_len to kernel_sendmsg().
Otherwise it will cause a NULL pointer deref when kernel_sendmsg()
returns with rv &lt; size.

DRBD as external module has been around in the kernel 2.4 days already.
We used to be compatible to 2.4 and very early 2.6 kernels,
we used to use
 rv = sock_sendmsg(sock, &amp;msg, iov.iov_len);
then later changed to
 rv = kernel_sendmsg(sock, &amp;msg, &amp;iov, 1, size);
when we should have used
 rv = kernel_sendmsg(sock, &amp;msg, &amp;iov, 1, iov.iov_len);

tcp_sendmsg() used to totally ignore the size parameter.
 57be5bd ip: convert tcp_sendmsg() to iov_iter primitives
changes that, and exposes our long standing error.

Even with this error exposed, to trigger the bug, we would need to have
an environment (config or otherwise) causing us to not use sendpage()
for larger transfers, a failing connection, and have it fail "just at the
right time".  Apparently that was unlikely enough for most, so this went
unnoticed for years.

Still, it is known to trigger at least some of these,
and suspected for the others:
[0] http://lists.linbit.com/pipermail/drbd-user/2016-July/023112.html
[1] http://lists.linbit.com/pipermail/drbd-dev/2016-March/003362.html
[2] https://forums.grsecurity.net/viewtopic.php?f=3&amp;t=4546
[3] https://ubuntuforums.org/showthread.php?t=2336150
[4] http://e2.howsolveproblem.com/i/1175162/

This should go into 4.9,
and into all stable branches since and including v4.0,
which is the first to contain the exposing change.

It is correct for all stable branches older than that as well
(which contain the DRBD driver; which is 2.6.33 and up).

It requires a small "conflict" resolution for v4.4 and earlier, with v4.5
we dropped the comment block immediately preceding the kernel_sendmsg().

Fixes: b411b3637fa7 ("The DRBD driver")
Cc: viro@zeniv.linux.org.uk
Cc: christoph.lechleitner@iteg.at
Cc: wolfgang.glas@iteg.at
Reported-by: Christoph Lechleitner &lt;christoph.lechleitner@iteg.at&gt;
Tested-by: Christoph Lechleitner &lt;christoph.lechleitner@iteg.at&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
[changed oneliner to be "obvious" without context; more verbose message]
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
