path: root/fs/ocfs2/dlm
Age | Commit message | Author | Files | Lines
2006-06-27 | ocfs2: fix incorrect error returns | Kurt Hackel | 1 | -2/+2
Use DLM_REJECTED instead of DLM_RECOVERING. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: tune down some noisy messages during dlm recovery | Kurt Hackel | 2 | -6/+7
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: display message before waiting for recovery to complete | Kurt Hackel | 1 | -0/+7
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: mlog in dlm_convert_lock_handler() should be ML_ERROR | Kurt Hackel | 1 | -1/+1
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: retry operations when a lock is marked in recovery | Kurt Hackel | 1 | -0/+20
Before checking for a nonexistent lock, make sure the lockres is not marked RECOVERING. The caller will just retry and the state should be fixed up when recovery completes. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
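The ordering described in this commit message can be sketched as follows. This is only an illustration: every name here (`sk_status`, `SK_RES_RECOVERING`, `sk_lockres`, `sk_find_lock`) is hypothetical and not taken from the ocfs2 source. The point is that the recovery flag is tested first, so a lock that is only temporarily missing during recovery produces a retry instead of a "no such lock" error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch. */
enum sk_status { SK_OK, SK_RETRY, SK_NO_SUCH_LOCK };

/* Hypothetical state bit marking a lock resource as being recovered. */
#define SK_RES_RECOVERING 0x1u

struct sk_lockres {
    unsigned int state;           /* bitmask of state flags */
    const unsigned long *cookies; /* identifiers of locks on this resource */
    size_t n_locks;
};

/*
 * Look up a lock by cookie. The recovery check comes first: while the
 * resource is being recovered its lock list may be incomplete, so a
 * missing lock is not yet evidence of a bad request.
 */
static enum sk_status sk_find_lock(const struct sk_lockres *res,
                                   unsigned long cookie)
{
    size_t i;

    assert(res != NULL);

    if (res->state & SK_RES_RECOVERING)
        return SK_RETRY; /* caller is expected to retry later */

    for (i = 0; i < res->n_locks; i++)
        if (res->cookies[i] == cookie)
            return SK_OK;

    return SK_NO_SUCH_LOCK;
}
```

A caller would loop on `SK_RETRY` until recovery clears the flag, and only then treat `SK_NO_SUCH_LOCK` as a real error.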
2006-06-27 | ocfs2: use cond_resched() in dlm_thread() | Kurt Hackel | 1 | -1/+1
yield() does not yield. cond_resched() does. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: use GFP_NOFS in some dlm operations | Kurt Hackel | 5 | -19/+19
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: wait for recovery when starting lock mastery | Kurt Hackel | 3 | -0/+34
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: continue recovery when a dead node is encountered | Kurt Hackel | 1 | -0/+1
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: remove unneccesary spin_unlock() in dlm_remaster_locks() | Kurt Hackel | 1 | -1/+0
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: dlm_remaster_locks() should never exit without completing | Kurt Hackel | 1 | -54/+62
We cannot restart recovery. Once we begin to recover a node, keep the state of the recovery intact and follow through, regardless of any other node deaths that may occur. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: special case recovery lock in dlmlock_remote() | Kurt Hackel | 2 | -10/+27
If the previous master of the recovery lock dies, let calc_usage take it down completely and let the caller completely redo the dlmlock() call. Otherwise, there will never be an opportunity to re-master the lockres and recovery won't be able to progress. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: pending mastery asserts and migrations should block each other | Kurt Hackel | 1 | -0/+21
Use the existing structure for blocking migrations when ASTs are pending to achieve the same result. If we can catch the assert before it goes on the wire, just cancel it and let the migration continue. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: temporarily disable automatic lock migration | Kurt Hackel | 2 | -5/+23
Now we never change the owner of a lock resource until unmount or node death. This will be re-enabled once some issues in the algorithm used have been resolved. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: do not unconditionally purge the lockres in dlmlock_remote() | Kurt Hackel | 1 | -1/+7
In dlmlock_remote(), do not call purge_lockres until the lock resource actually changes. Otherwise, the mastery info on the lockres will go away underneath the caller. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: increase backoff before waiting for recovery | Kurt Hackel | 1 | -1/+1
When mastering non-recovery lock resources, additional time was frequently needed to allow the disk heartbeat to catch up with the network timeout. The recovery lock resource is time critical and avoids this path. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: have dlm_pre_master_reco_lockres() ignore dead nodes | Kurt Hackel | 1 | -0/+1
Recovery will spin in dlm_pre_master_reco_lockres if we do not ignore timed-out network responses from dead nodes. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: give the dlm dirty list a reference on the lockres | Kurt Hackel | 2 | -3/+17
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: teach dlm_restart_lock_mastery() to wait on recovery | Kurt Hackel | 1 | -56/+44
Change behavior of dlm_restart_lock_mastery() when a node goes down. Dump all responses that have been collected and start over. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: gracefully handle stale create_lock messages. | Kurt Hackel | 1 | -3/+16
This is an error on the sending side, so gracefully error out on the receiving end. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: update lvb immediately during recovery | Kurt Hackel | 1 | -18/+26
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: do not send master requests to localhost | Kurt Hackel | 1 | -6/+8
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: purge lockres' sooner | Kurt Hackel | 1 | -2/+35
Immediately purge a lockres that the local node is not the master of. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: dump mismatching migrated lvbs before BUG() | Kurt Hackel | 1 | -2/+13
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: make dlm recovery finalization 2 stage | Kurt Hackel | 2 | -19/+99
Makes it easier for the recovery process to deal with node death. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: dlm recovery / lockres reference count fix | Kurt Hackel | 3 | -4/+15
Take a reference on lockres structures while they are on the recovery list. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
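The idea of a list holding its own reference can be sketched like this. It is a single-threaded illustration under stated assumptions: the names (`sk_res`, `sk_res_get`, `sk_res_put`, `sk_recovery_add`, `sk_recovery_pop`) are hypothetical, a plain integer stands in for the kernel's reference counter, and no locking is shown.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical lock resource with a plain integer reference count. */
struct sk_res {
    int refs;
    struct sk_res *next_recovering; /* link on the recovery list */
};

/* Allocate a resource holding one reference for the creator. */
static struct sk_res *sk_res_new(void)
{
    struct sk_res *r = calloc(1, sizeof(*r));

    if (r)
        r->refs = 1;
    return r;
}

static void sk_res_get(struct sk_res *r)
{
    assert(r != NULL && r->refs > 0);
    r->refs++;
}

/* Drop one reference; returns 1 if this freed the resource. */
static int sk_res_put(struct sk_res *r)
{
    assert(r != NULL && r->refs > 0);
    if (--r->refs == 0) {
        free(r);
        return 1;
    }
    return 0;
}

/* The list takes its own reference, so the entry cannot vanish while queued. */
static void sk_recovery_add(struct sk_res **head, struct sk_res *r)
{
    sk_res_get(r);
    r->next_recovering = *head;
    *head = r;
}

/* Remove the head entry; the caller inherits the list's reference. */
static struct sk_res *sk_recovery_pop(struct sk_res **head)
{
    struct sk_res *r = *head;

    if (r) {
        *head = r->next_recovering;
        r->next_recovering = NULL;
    }
    return r;
}
```

With this arrangement, another path dropping its reference while the resource is queued does not free it; the final `sk_res_put()` after `sk_recovery_pop()` does.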
2006-06-27 | ocfs2: better error handling during assert master message | Kurt Hackel | 1 | -4/+14
Handle errors during lock assert master by killing either the local node or the other node. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: dump lockres info before we BUG() on a bad reference | Kurt Hackel | 1 | -0/+22
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: do LVB puts in place | Mark Fasheh | 2 | -5/+10
Don't wait until the AST will be fired to do the LVB copy into the lock resource. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: mle ref count debugging | Kurt Hackel | 1 | -9/+20
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: allow for an assert message during lock mastery | Kurt Hackel | 1 | -1/+2
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: take mle reference during migration | Kurt Hackel | 1 | -0/+17
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: properly initialize the mle structure | Kurt Hackel | 1 | -4/+1
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: detach mle from heartbeat events | Kurt Hackel | 1 | -0/+2
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: mle ref counting fixes | Kurt Hackel | 1 | -19/+90
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: better mle debugging | Kurt Hackel | 1 | -5/+28
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: clean up recovery related messages | Kurt Hackel | 1 | -12/+90
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: handle network errors during recovery | Kurt Hackel | 1 | -17/+36
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: only recover one dead node at a time | Kurt Hackel | 1 | -3/+35
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: Better tracking for recovery state changes | Kurt Hackel | 1 | -9/+28
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: Fix empty lvb check | Kurt Hackel | 2 | -5/+16
The check for an empty lvb should check the entire buffer, not just the first byte. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
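The difference between the two checks can be shown with a small sketch. The names `SKETCH_LVB_LEN`, `sk_lvb_is_empty_first_byte`, and `sk_lvb_is_empty` are hypothetical, and the 64-byte length is an assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed LVB size for this sketch only. */
#define SKETCH_LVB_LEN 64

/* First-byte-only check: misses data that starts with a zero byte. */
static int sk_lvb_is_empty_first_byte(const char *lvb)
{
    assert(lvb != NULL);
    return lvb[0] == 0;
}

/* Whole-buffer check: the LVB is empty only if every byte is zero. */
static int sk_lvb_is_empty(const char *lvb)
{
    size_t i;

    assert(lvb != NULL);
    for (i = 0; i < SKETCH_LVB_LEN; i++)
        if (lvb[i] != 0)
            return 0;
    return 1;
}
```

A value block whose first byte happens to be zero but which carries data later in the buffer is reported as empty by the first function and as non-empty by the second.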
2006-06-27 | ocfs2: fix inverted logic in dlm_is_node_dead | Kurt Hackel | 1 | -1/+1
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: recheck lockres master before sending an unlock request. | Kurt Hackel | 1 | -0/+10
Recovery may have happened and it may now be mastered locally. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: add a small delay after a failed migration | Kurt Hackel | 1 | -2/+5
Otherwise we risk starving other threads. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: silence a compile warning in dlm_alloc_pagevec() | Mark Fasheh | 1 | -2/+2
Reported by Andrew Morton. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | [PATCH] ocfs2: Alloc at least a page for the DLM hash | Joel Becker | 2 | -2/+9
The OCFS2 DLM allocates a number of pages for a hash to look up locks. There was a bug where a PAGE_SIZE bigger than the hash size (e.g., 64K pages) would result in zero pages allocated. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
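The arithmetic behind this bug can be sketched as below. The function names are hypothetical and the sizes used in the usage note are illustrative, not values from the ocfs2 source. The sketch clamps the page count to one, which is one way to get the behaviour the commit title describes; the actual patch may do it differently.

```c
#include <assert.h>
#include <stddef.h>

/* Truncating division: yields 0 pages when page_size exceeds hash_bytes. */
static size_t sk_hash_pages_truncating(size_t hash_bytes, size_t page_size)
{
    assert(page_size > 0);
    return hash_bytes / page_size;
}

/* Sketch of a fix: never ask for fewer than one page. */
static size_t sk_hash_pages_at_least_one(size_t hash_bytes, size_t page_size)
{
    size_t pages;

    assert(page_size > 0);
    pages = hash_bytes / page_size;
    return pages ? pages : 1;
}
```

For example, with an assumed 16 KiB hash and 64 KiB pages, the first function returns 0 and the second returns 1; with 4 KiB pages both return 4.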
2006-06-27 | ocfs2: allocate lockres hash pages in an array | Daniel Phillips | 4 | -13/+46
This allows us to have a hash table greater than a single page which greatly improves dlm performance on some tests. Signed-off-by: Daniel Phillips <phillips@google.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
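A hash table spread over an array of page-sized chunks can be sketched in userspace as follows. All names (`sk_bucket`, `sk_paged_hash`, and the three functions) are hypothetical, and `calloc()` stands in for whatever page allocator the kernel code uses. A bucket index is split into a page number and an offset within that page.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bucket: head pointer of a collision chain. */
struct sk_bucket {
    void *first;
};

struct sk_paged_hash {
    struct sk_bucket **pages; /* array of separately allocated pages */
    size_t n_pages;
    size_t buckets_per_page;
};

/* Allocate n_pages chunks of page_size bytes each; returns 0 on success. */
static int sk_paged_hash_init(struct sk_paged_hash *h, size_t n_pages,
                              size_t page_size)
{
    size_t i;

    assert(h != NULL && n_pages > 0);
    assert(page_size >= sizeof(struct sk_bucket));

    h->n_pages = n_pages;
    h->buckets_per_page = page_size / sizeof(struct sk_bucket);
    h->pages = calloc(n_pages, sizeof(*h->pages));
    if (!h->pages)
        return -1;

    for (i = 0; i < n_pages; i++) {
        h->pages[i] = calloc(h->buckets_per_page, sizeof(struct sk_bucket));
        if (!h->pages[i]) {
            while (i--)
                free(h->pages[i]);
            free(h->pages);
            h->pages = NULL;
            return -1;
        }
    }
    return 0;
}

/* Map a hash value to its bucket: page index first, then offset in page. */
static struct sk_bucket *sk_paged_hash_bucket(struct sk_paged_hash *h,
                                              unsigned int hash)
{
    size_t total = h->n_pages * h->buckets_per_page;
    size_t idx = hash % total;

    return &h->pages[idx / h->buckets_per_page][idx % h->buckets_per_page];
}

static void sk_paged_hash_free(struct sk_paged_hash *h)
{
    size_t i;

    for (i = 0; i < h->n_pages; i++)
        free(h->pages[i]);
    free(h->pages);
    h->pages = NULL;
}
```

Because each page is allocated separately, the table can grow past one page without needing a single large contiguous allocation.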
2006-06-27 | ocfs2: inline dlm_lockres_get() | Mark Fasheh | 2 | -6/+6
It's called on every lookup so this might help performance a bit. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | [PATCH] Clean up ocfs2 hash probe and make it faster | Daniel Phillips | 1 | -15/+14
Signed-Off-By: Daniel Phillips <phillips@google.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-27 | ocfs2: calculate lockid hash values outside of the spinlock | Mark Fasheh | 4 | -19/+30
Fixes a performance bug - pointed out by Andrew. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
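The general pattern named in the commit title, hashing before the lock is taken, can be sketched in userspace like this. Everything here is an assumption for illustration: the names are hypothetical, a C11 `atomic_flag` spin loop stands in for a kernel spinlock, and the hash function is a toy. The hash depends only on the caller's key, so computing it under the lock would only lengthen the critical section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define SK_N_BUCKETS 64

struct sk_entry {
    const char *name;
    unsigned int hash; /* cached at insert time */
    struct sk_entry *next;
};

struct sk_table {
    atomic_flag lock; /* stand-in for a spinlock */
    struct sk_entry *buckets[SK_N_BUCKETS];
};

static void sk_table_init(struct sk_table *t)
{
    assert(t != NULL);
    atomic_flag_clear(&t->lock);
    memset(t->buckets, 0, sizeof(t->buckets));
}

static void sk_lock(struct sk_table *t)
{
    while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
        ; /* spin until the flag is released */
}

static void sk_unlock(struct sk_table *t)
{
    atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Toy hash; touches only the caller's key, so it needs no lock. */
static unsigned int sk_name_hash(const char *name, size_t len)
{
    unsigned int h = 0;
    size_t i;

    for (i = 0; i < len; i++)
        h = h * 31u + (unsigned char)name[i];
    return h;
}

static void sk_table_insert(struct sk_table *t, struct sk_entry *e)
{
    /* Hash first, outside the critical section. */
    e->hash = sk_name_hash(e->name, strlen(e->name));

    sk_lock(t);
    e->next = t->buckets[e->hash % SK_N_BUCKETS];
    t->buckets[e->hash % SK_N_BUCKETS] = e;
    sk_unlock(t);
}

static struct sk_entry *sk_table_lookup(struct sk_table *t, const char *name)
{
    /* Hash first, then hold the lock only for the chain walk. */
    unsigned int hash = sk_name_hash(name, strlen(name));
    struct sk_entry *e;

    sk_lock(t);
    for (e = t->buckets[hash % SK_N_BUCKETS]; e; e = e->next)
        if (e->hash == hash && strcmp(e->name, name) == 0)
            break;
    sk_unlock(t);
    return e;
}
```

In real code the returned entry would also need a reference taken before the lock is dropped; that is omitted here to keep the sketch focused on where the hash is computed.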