author | Joseph Qi <joseph.qi@huawei.com> | 2014-10-10 02:25:13 +0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2014-10-10 06:25:47 +0400 |
commit | 70e82a12dbfa3acbff41be08a36e8be4578878c9 (patch) | |
tree | 2f5462c4d1ded93f1710d8c0d83ae67473402a87 /fs/ocfs2/dlm | |
parent | 5046f18d5bd9ad7638b32c3b304ff39a74c064df (diff) | |
ocfs2: fix deadlock between o2hb thread and o2net_wq
The following case may lead to a deadlock between o2net_wq and the o2hb
thread on o2hb_callback_sem.
Suppose the cluster currently has two nodes, N1 and N2. N2 goes down
and, at the same time, N3 tries to join the cluster, so N1 handles the
node-down event (N2) and the join request (N3) simultaneously.
    o2hb                                 o2net_wq
    ->o2hb_do_disk_heartbeat
    ->o2hb_check_slot
    ->o2hb_run_event_list
    ->o2hb_fire_callbacks
    ->down_write(&o2hb_callback_sem)
    ->o2net_hb_node_down_cb
    ->flush_workqueue(o2net_wq)
                                         ->o2net_process_message
                                         ->dlm_query_join_handler
                                         ->o2hb_check_node_heartbeating
                                         ->o2hb_fill_node_map
                                         ->down_read(&o2hb_callback_sem)
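The o2hb thread holds o2hb_callback_sem for writing while
flush_workqueue() waits for o2net_wq, and the o2net_wq work item blocks
in down_read() on the same semaphore, so neither side can make progress.
There is no need to take o2hb_callback_sem in dlm_query_join_handler;
o2hb_live_lock is enough to protect the live node map.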
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: jiangyiwen <jiangyiwen@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'fs/ocfs2/dlm')
-rw-r--r-- | fs/ocfs2/dlm/dlmdomain.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/fs/ocfs2/dlm/dlmdomain.c b/fs/ocfs2/dlm/dlmdomain.c
index 257a6dfe3f13..02d315fef432 100644
--- a/fs/ocfs2/dlm/dlmdomain.c
+++ b/fs/ocfs2/dlm/dlmdomain.c
@@ -839,7 +839,7 @@ static int dlm_query_join_handler(struct o2net_msg *msg, u32 len, void *data,
 	 * to back off and try again. This gives heartbeat a chance
 	 * to catch up.
 	 */
-	if (!o2hb_check_node_heartbeating(query->node_idx)) {
+	if (!o2hb_check_node_heartbeating_no_sem(query->node_idx)) {
 		mlog(0, "node %u is not in our live map yet\n",
 		     query->node_idx);