diff options
author | Lars Ellenberg <lars.ellenberg@linbit.com> | 2017-08-29 11:20:43 +0300 |
---|---|---|
committer | Jens Axboe <axboe@kernel.dk> | 2017-08-30 00:34:45 +0300 |
commit | 33d32fa7120ed184efc9be1ea3c016109b4fea84 (patch) | |
tree | 30c0fdf900ecdfbf51207795eebc0d985a06effe /drivers/block/drbd/drbd_nl.c | |
parent | 427fd2bee0a33a670de186387e79d280a6808a66 (diff) | |
download | linux-33d32fa7120ed184efc9be1ea3c016109b4fea84.tar.xz |
drbd: fix potential deadlock when trying to detach during handshake
When requesting a detach, we first suspend IO, and also inhibit meta-data IO
by means of drbd_md_get_buffer(), because we don't want to "fail" the disk
while there is IO in-flight: the transition into D_FAILED for detach purposes
may get misinterpreted as actual IO error in a confused endio function.
We wrap it all into wait_event(), to retry in case the drbd_req_state()
returns SS_IN_TRANSIENT_STATE, as it does for example during an ongoing
connection handshake.
In that example, the receiver thread may need to grab drbd_md_get_buffer()
during the handshake to make progress. To avoid potential deadlock with
detach, detach needs to grab and release the meta data buffer inside of
that wait_event retry loop. To avoid lock inversion between
mutex_lock(&device->state_mutex) and drbd_md_get_buffer(device),
introduce a new enum chg_state_flag CS_INHIBIT_MD_IO, and move the
call to drbd_md_get_buffer() inside the state_mutex grabbed in
drbd_req_state().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Diffstat (limited to 'drivers/block/drbd/drbd_nl.c')
-rw-r--r-- | drivers/block/drbd/drbd_nl.c | 25 |
1 files changed, 2 insertions, 23 deletions
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c index c383b6cf272a..6bb58a6836ed 100644 --- a/drivers/block/drbd/drbd_nl.c +++ b/drivers/block/drbd/drbd_nl.c @@ -2149,34 +2149,13 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info) static int adm_detach(struct drbd_device *device, int force) { - enum drbd_state_rv retcode; - void *buffer; - int ret; - if (force) { set_bit(FORCE_DETACH, &device->flags); drbd_force_state(device, NS(disk, D_FAILED)); - retcode = SS_SUCCESS; - goto out; + return SS_SUCCESS; } - drbd_suspend_io(device); /* so no-one is stuck in drbd_al_begin_io */ - buffer = drbd_md_get_buffer(device, __func__); /* make sure there is no in-flight meta-data IO */ - if (buffer) { - retcode = drbd_request_state(device, NS(disk, D_FAILED)); - drbd_md_put_buffer(device); - } else /* already <= D_FAILED */ - retcode = SS_NOTHING_TO_DO; - /* D_FAILED will transition to DISKLESS. */ - drbd_resume_io(device); - ret = wait_event_interruptible(device->misc_wait, - device->state.disk != D_FAILED); - if ((int)retcode == (int)SS_IS_DISKLESS) - retcode = SS_NOTHING_TO_DO; - if (ret) - retcode = ERR_INTR; -out: - return retcode; + return drbd_request_detach_interruptible(device); } /* Detaching the disk is a process in multiple stages. First we need to lock |