kernel/linux.git/fs/dlm/lock.c, branch v6.18.21

dlm: validate length in dlm_search_rsb_tree

2026-03-04T12:19:31+00:00

[ Upstream commit 080e5563f878c64e697b89e7439d730d0daad882 ] The len parameter in dlm_dump_rsb_name() is not validated and comes from network messages. When it exceeds DLM_RESNAME_MAXLEN, it can cause out-of-bounds write in dlm_search_rsb_tree(). Add length validation to prevent potential buffer overflow. Signed-off-by: Ezrak1e Signed-off-by: Alexander Aring Signed-off-by: David Teigland Signed-off-by: Sasha Levin

dlm: fix recovery pending middle conversion

2026-03-04T12:19:30+00:00

[ Upstream commit 1416bd508c78bdfdb9ae0b4511369e5581f348ea ] During a workload involving conversions between lock modes PR and CW, lock recovery can create a "conversion deadlock" state between locks that have been recovered. When this occurs, kernel warning messages are logged, e.g. "dlm: WARN: pending deadlock 1e node 0 2 1bf21" "dlm: receive_rcom_lock_args 2e middle convert gr 3 rq 2 remote 2 1e" After this occurs, the deadlocked conversions both appear on the convert queue of the resource being locked, and the conversion requests do not complete. Outside of recovery, conversions that would produce a deadlock are resolved immediately, and return -EDEADLK. The locks are not placed on the convert queue in the deadlocked state. To fix this problem, an lkb under conversion between PR/CW is rebuilt during recovery on a new master's granted queue, with the currently granted mode, rather than being rebuilt on the new master's convert queue, with the currently granted mode and the newly requested mode. The in-progress convert is then resent to the new master after recovery, so the conversion deadlock will be processed outside of the recovery context and handled as described above. Signed-off-by: Alexander Aring Signed-off-by: David Teigland Signed-off-by: Sasha Levin

dlm: move to rinfo for all middle conversion cases

2025-08-14T20:16:04+00:00

Since commit f74dacb4c8116 ("dlm: fix recovery of middle conversions") we introduced additional debugging information if we hit the middle conversion by using log_limit(). The DLM log_limit() functionality requires a DLM debug option being enabled. As this case is so rarely and excempt any potential introduced new issue with recovery we switching it to log_rinfo() ad this is ratelimited under normal DLM loglevel. Signed-off-by: Alexander Aring Signed-off-by: David Teigland

treewide, timers: Rename from_timer() to timer_container_of()

2025-06-08T07:07:37+00:00

Move this API to the canonical timer_*() namespace. [ tglx: Redone against pre rc1 ] Signed-off-by: Ingo Molnar Signed-off-by: Thomas Gleixner Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com

dlm: fix error if active rsb is not hashed

2025-03-01T00:24:21+00:00

If an active rsb is not hashed anymore and this could occur because we releases and acquired locks we need to signal the followed code that the lookup failed. Since the lookup was successful, but it isn't part of the rsb hash anymore we need to signal it by setting error to -EBADR as dlm_search_rsb_tree() does it. Cc: stable@vger.kernel.org Fixes: 5be323b0c64d ("dlm: move dlm_search_rsb_tree() out of lock") Signed-off-by: Alexander Aring Signed-off-by: David Teigland

dlm: fix error if inactive rsb is not hashed

2025-03-01T00:24:21+00:00

If an inactive rsb is not hashed anymore and this could occur because we releases and acquired locks we need to signal the followed code that the lookup failed. Since the lookup was successful, but it isn't part of the rsb hash anymore we need to signal it by setting error to -EBADR as dlm_search_rsb_tree() does it. Cc: stable@vger.kernel.org Fixes: 01fdeca1cc2d ("dlm: use rcu to avoid an extra rsb struct lookup") Signed-off-by: Alexander Aring Signed-off-by: David Teigland

dlm: fix removal of rsb struct that is master and dir record

2024-12-19T19:07:02+00:00

An rsb struct was not being removed in the case where it was both the master and the dir record. This case (master and dir node) was missed in the condition for doing add_scan() from deactivate_rsb(). Fixing this triggers a related WARN_ON that needs to be fixed, and requires adjusting where two del_scan() calls are made. Fixes: c217adfc8caa ("dlm: fix add_scan and del_scan usage") Signed-off-by: Alexander Aring Signed-off-by: David Teigland

dlm: fix recovery of middle conversions

2024-11-15T19:39:36+00:00

In one special case, recovery is unable to reliably rebuild lock state by simply recreating lkb structs as sent from the lock holders. That case is when the lkb's include conversions between PR and CW modes. The recovery code has always recognized this special case, but the implemention has always been broken, and would set invalid modes in recovered lkb's. Unpredictable or bogus errors could then be returned for further locking calls on these locks. This bug has gone unnoticed for so long due to some combination of: - applications never or infrequently converting between PR/CW - recovery not occuring during these conversions - if the recovery bug does occur, the caller may not notice, depending on what further locking calls are made, e.g. if the lock is simply unlocked it may go unnoticed However, a core analysis from a recent gfs2 bug report points to this broken code. PR = Protected Read CW = Concurrent Write PR and CW are incompatible PR and PR are compatible CW and CW are compatible Example 1 node C, resource R granted: PR node A granted: PR node B granted: NL node C granted: NL node D - A sends convert PR->CW to C - C fails before A gets a reply - recovery occurs At this point, A does not know if it still holds the lock in PR, or if its conversion to CW was granted: - If A's conversion to CW was granted, then another node's CW lock may also have been granted. - If A's conversion to CW was not granted, it still holds a PR lock, and other nodes may also hold PR locks. So, the new master of R cannot simply recreate the lock from A using granted mode PR and requested mode CW. The new master must look at all the recovered locks to determine the correct granted modes, and ensure that all the recovered locks are recreated in compatible states. The correct lock recovery steps in this example are: - node D becomes the new master of R - node B sends D its lkb, granted PR - node A sends D its lkb, convert PR->CW - D determines the correct lock state is: granted: PR node B convert: PR->CW node A The lkb sent by each node was recreated without any change on the new master node. Example 2 node C, resource R granted: PR node A granted: NL node C granted: NL node D waiting: CW node B - A sends convert PR->CW to C - C grants the conversion to CW for A - C grants the waiting request for CW to B - C sends granted message to B, but fails before it can send the granted message to A - B receives the granted message from C At this point: - A believes it is converting PR->CW - B believes it is holding a CW lock The correct lock recovery steps in this example are: - node D becomes the new master of R - node A sends D its lkb, convert PR->CW - node B sends D its lkb, granted CW - D determins the correct lock state is: granted: CW node B granted: CW node A The lkb sent by B is recreated without change, but the lkb sent by A is changed because the granted mode was not compatible. Fixes to make this work correctly: recover_convert_waiter: should not make any changes to a converting lkb that is still waiting for a reply message. It was previously setting grmode to IV, which is invalid state, so the lkb would not be handled correctly by other code. receive_rcom_lock_args: was checking the wrong lkb field (wait_type instead of status) to determine if the lkb is being converted, and in need of inspection for this special recovery. It was also setting grmode to IV in the lkb, causing it to be mishandled by other code. Now, this function just puts the lkb, directly as sent, onto the convert queue of the resource being recovered, and corrects it in recover_conversion() later, if needed. recover_conversion: the job of this function is to detect and correct lkb states for the special PR/CW conversions. The new code now checks for recovered lkbs on the granted queue with grmode PR or CW, and takes the real grmode from that. Then it looks for lkbs on the convert queue with an incompatible grmode (i.e. grmode PR when the real grmode is CW, or v.v.) These converting lkbs need to be fixed. They are fixed by temporarily setting their grmode to NL, so that grmodes are not incompatible and won't confuse other locking code. The converting lkb will then be granted at the end of recovery, replacing the temporary NL grmode. Signed-off-by: Alexander Aring Signed-off-by: David Teigland

dlm: make add_to_waiters() that it can't fail

2024-10-04T15:31:31+00:00

If add_to_waiters() fails we have a problem because the previous called functions such as validate_lock_args() or validate_unlock_args() sets specific lkb values that are set for a request, there exists no way back to revert those changes. When there is a pending lock request the original request arguments will be overwritten with unknown consequences. The good news are that I believe those cases that we fail in add_to_waiters() can't happen or very unlikely to happen (only if the DLM user does stupid API things), but if so we have the above mentioned problem. There are two conditions that will be removed here. The first one is the -EINVAL case which contains is_overlap_unlock() or (is_overlap_cancel() and mstype == DLM_MSG_CANCEL). The is_overlap_unlock() is missing for the normal UNLOCK case which is moved to validate_unlock_args(). The is_overlap_cancel() already happens in validate_unlock_args() when DLM_LKF_CANCEL is set. In case of validate_lock_args() we check on is_overlap() when it is not a new request, on a new request the lkb is always new and does not have those values set. The -EBUSY check can't happen in case as for non new lock requests (when DLM_LKF_CONVERT is set) we already check in validate_lock_args() for lkb_wait_type and is_overlap(). Then there is only validate_unlock_args() that will never hit the default case because dlm_unlock() will produce DLM_MSG_UNLOCK and DLM_MSG_CANCEL messages. Signed-off-by: Alexander Aring Signed-off-by: David Teigland

dlm: fix possible lkb_resource null dereference

2024-10-04T15:31:31+00:00

This patch fixes a possible null pointer dereference when this function is called from request_lock() as lkb->lkb_resource is not assigned yet, only after validate_lock_args() by calling attach_lkb(). Another issue is that a resource name could be a non printable bytearray and we cannot assume to be ASCII coded. The log functionality is probably never being hit when DLM is used in normal way and no debug logging is enabled. The null pointer dereference can only occur on a new created lkb that does not have the resource assigned yet, it probably never hits the null pointer dereference but we should be sure that other changes might not change this behaviour and we actually can hit the mentioned null pointer dereference. In this patch we just drop the printout of the resource name, the lkb id is enough to make a possible connection to a resource name if this exists. Signed-off-by: Alexander Aring Signed-off-by: David Teigland