summaryrefslogtreecommitdiff
path: root/drivers/block/drbd/drbd_nl.c
diff options
context:
space:
mode:
authorPhilipp Reisner <philipp.reisner@linbit.com>2014-11-10 19:21:11 +0300
committerJens Axboe <axboe@fb.com>2014-11-10 19:27:35 +0300
commita88215312c5ed74697973f6c9f0fce718bcf18ad (patch)
tree831cef10aa7728cff1fe109ba5f6ee8dad4e3225 /drivers/block/drbd/drbd_nl.c
parentf221f4bcc5f40e2967e4596ef167bdbc987c8e9d (diff)
downloadlinux-a88215312c5ed74697973f6c9f0fce718bcf18ad.tar.xz
drbd: fix race between role change and handshake
Symptoms: If DRBD was "cleanly shut down" (all in sync, both Secondary before disconnect, identical data generation uuids), and then one side was promoted *during* the next connection handshake, the role change could confuse the handshake. The Primary would get stuck in WFBitmapS, the Secondary would log unexpected cstate (Connected) in receive_bitmap and get stuck in WFBitmapT. Fix: The test in is_valid_soft_transition wrong. It works because the not allowed actions (promote/attach) do not touch the cstate. The previous condition failed to demand a cstate change in one clause. In order to avoid deadlocks give up the state_mutex while waiting for the transient state to go away. Conflicts: drbd/drbd_state.c drbd/drbd_state.h drbd/drbd_wrappers.h Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@fb.com>
Diffstat (limited to 'drivers/block/drbd/drbd_nl.c')
-rw-r--r--drivers/block/drbd/drbd_nl.c2
1 files changed, 1 insertions, 1 deletions
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 4782d074c8cd..74df8cfad414 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -588,7 +588,7 @@ drbd_set_role(struct drbd_device *const device, enum drbd_role new_role, int for
val.i = 0; val.role = new_role;
while (try++ < max_tries) {
- rv = _drbd_request_state(device, mask, val, CS_WAIT_COMPLETE);
+ rv = _drbd_request_state_holding_state_mutex(device, mask, val, CS_WAIT_COMPLETE);
/* in case we first succeeded to outdate,
* but now suddenly could establish a connection */