diff options
author | piaojun <piaojun@huawei.com> | 2016-08-03 00:02:19 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2016-08-03 00:31:41 +0300 |
commit | ee8f7fcbe638b07e8d1c3dc98e8be35e56306d05 (patch) | |
tree | 25b25d9131e4a07e700302c01a5f3924d9c8ae2b /fs/ocfs2/dlm/dlmcommon.h | |
parent | 309e91911daede6adde0364f489e69909c3f6894 (diff) | |
download | linux-ee8f7fcbe638b07e8d1c3dc98e8be35e56306d05.tar.xz |
ocfs2/dlm: continue to purge recovery lockres when recovery master goes down
We found a dlm-blocked situation caused by continuous breakdown of
recovery masters described below. To solve this problem, we should
purge recovery lock once detecting recovery master goes down.
N3 N2 N1(reco master)
go down
pick up recovery lock and
begin recoverying for N2
go down
pick up recovery
lock failed, then
purge it:
dlm_purge_lockres
->DROPPING_REF is set
send deref to N1 failed,
recovery lock is not purged
find N1 go down, begin
recoverying for N1, but
blocked in dlm_do_recovery
as DROPPING_REF is set:
dlm_do_recovery
->dlm_pick_recovery_master
->dlmlock
->dlm_get_lock_resource
->__dlm_wait_on_lockres_flags(tmpres,
DLM_LOCK_RES_DROPPING_REF);
Fixes: 8c0343968163 ("ocfs2/dlm: clear DROPPING_REF flag when the master goes down")
Link: http://lkml.kernel.org/r/578453AF.8030404@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'fs/ocfs2/dlm/dlmcommon.h')
-rw-r--r-- | fs/ocfs2/dlm/dlmcommon.h | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/fs/ocfs2/dlm/dlmcommon.h b/fs/ocfs2/dlm/dlmcommon.h index 8107d0d0c3f6..e9f3705c4c9f 100644 --- a/fs/ocfs2/dlm/dlmcommon.h +++ b/fs/ocfs2/dlm/dlmcommon.h @@ -1004,6 +1004,8 @@ int dlm_finalize_reco_handler(struct o2net_msg *msg, u32 len, void *data, int dlm_do_master_requery(struct dlm_ctxt *dlm, struct dlm_lock_resource *res, u8 nodenum, u8 *real_master); +void __dlm_do_purge_lockres(struct dlm_ctxt *dlm, + struct dlm_lock_resource *res); int dlm_dispatch_assert_master(struct dlm_ctxt *dlm, struct dlm_lock_resource *res, |