summaryrefslogtreecommitdiff
path: root/fs/ocfs2
diff options
context:
space:
mode:
authorSrinivas Eeda <srinivas.eeda@oracle.com>2014-12-11 02:41:48 +0300
committerLinus Torvalds <torvalds@linux-foundation.org>2014-12-11 04:41:03 +0300
commitcb79662bc2f83f7b3b60970ad88df43085f96514 (patch)
treef4a154486a41682ed7a1288d73ced7e8d2814a6d /fs/ocfs2
parentf5425fcea72eec084c50f36db4d25f697227c602 (diff)
downloadlinux-cb79662bc2f83f7b3b60970ad88df43085f96514.tar.xz
ocfs2: o2dlm: fix a race between purge and master query
Node A sends master query request to node B which is the master. At this time lockres happens to be on purgelist. dlm_master_request_handler gets the dlm spinlock, finds the resource and releases the dlm spin lock. Right at this dlm_thread on this node could purge the lockres. dlm_master_request_handler can then acquire lockres spinlock and reply to Node A that node B is the master even though lockres on node B is purged. The above scenario will now make node A falsely think node B is the master which is inconsistent. Further if another node C tries to master the same resource, every node will respond they are not the master. Node C then masters the resource and sends assert master to all nodes. This will now make node A crash with the following message. dlm_assert_master_handler:1831 ERROR: DIE! Mastery assert from 9, but current owner is 10! Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com> Tested-by: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'fs/ocfs2')
-rw-r--r--fs/ocfs2/dlm/dlmmaster.c12
1 files changed, 12 insertions, 0 deletions
diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 215e41abf101..3689b3592042 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -1460,6 +1460,18 @@ way_up_top:
/* take care of the easy cases up front */
spin_lock(&res->spinlock);
+
+ /*
+ * Right after dlm spinlock was released, dlm_thread could have
+ * purged the lockres. Check if lockres got unhashed. If so
+ * start over.
+ */
+ if (hlist_unhashed(&res->hash_node)) {
+ spin_unlock(&res->spinlock);
+ dlm_lockres_put(res);
+ goto way_up_top;
+ }
+
if (res->state & (DLM_LOCK_RES_RECOVERING|
DLM_LOCK_RES_MIGRATING)) {
spin_unlock(&res->spinlock);