<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/dlm/lock.c, branch v6.18.21</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-04T12:19:31+00:00</updated>
<entry>
<title>dlm: validate length in dlm_search_rsb_tree</title>
<updated>2026-03-04T12:19:31+00:00</updated>
<author>
<name>Ezrak1e</name>
<email>ezrakiez@gmail.com</email>
</author>
<published>2026-01-20T15:35:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=082083c9fbd99422a0370fe2102144a231c9f5d6'/>
<id>urn:sha1:082083c9fbd99422a0370fe2102144a231c9f5d6</id>
<content type='text'>
[ Upstream commit 080e5563f878c64e697b89e7439d730d0daad882 ]

The len parameter in dlm_dump_rsb_name() is not validated and comes
from network messages. When it exceeds DLM_RESNAME_MAXLEN, it can
cause out-of-bounds write in dlm_search_rsb_tree().

Add length validation to prevent potential buffer overflow.

Signed-off-by: Ezrak1e &lt;ezrakiez@gmail.com&gt;
Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>dlm: fix recovery pending middle conversion</title>
<updated>2026-03-04T12:19:30+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2026-01-20T15:35:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=75903591f869a463d034638361743695fe613eb9'/>
<id>urn:sha1:75903591f869a463d034638361743695fe613eb9</id>
<content type='text'>
[ Upstream commit 1416bd508c78bdfdb9ae0b4511369e5581f348ea ]

During a workload involving conversions between lock modes PR and CW,
lock recovery can create a "conversion deadlock" state between locks
that have been recovered.  When this occurs, kernel warning messages
are logged, e.g.

  "dlm: WARN: pending deadlock 1e node 0 2 1bf21"

  "dlm: receive_rcom_lock_args 2e middle convert gr 3 rq 2 remote 2 1e"

After this occurs, the deadlocked conversions both appear on the convert
queue of the resource being locked, and the conversion requests do not
complete.

Outside of recovery, conversions that would produce a deadlock are
resolved immediately, and return -EDEADLK.  The locks are not placed
on the convert queue in the deadlocked state.

To fix this problem, an lkb under conversion between PR/CW is rebuilt
during recovery on a new master's granted queue, with the currently
granted mode, rather than being rebuilt on the new master's convert
queue, with the currently granted mode and the newly requested mode.
The in-progress convert is then resent to the new master after
recovery, so the conversion deadlock will be processed outside of
the recovery context and handled as described above.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>dlm: move to rinfo for all middle conversion cases</title>
<updated>2025-08-14T20:16:04+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2025-08-14T15:22:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a8abcff174f7f9ce4587c6451b1a2450d01f52c9'/>
<id>urn:sha1:a8abcff174f7f9ce4587c6451b1a2450d01f52c9</id>
<content type='text'>
Since commit f74dacb4c8116 ("dlm: fix recovery of middle conversions")
we introduced additional debugging information if we hit the middle
conversion by using log_limit(). The DLM log_limit() functionality
requires a DLM debug option being enabled. As this case is so rarely and
excempt any potential introduced new issue with recovery we switching it
to log_rinfo() ad this is ratelimited under normal DLM loglevel.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
</entry>
<entry>
<title>treewide, timers: Rename from_timer() to timer_container_of()</title>
<updated>2025-06-08T07:07:37+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2025-05-09T05:51:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=41cb08555c4164996d67c78b3bf1c658075b75f1'/>
<id>urn:sha1:41cb08555c4164996d67c78b3bf1c658075b75f1</id>
<content type='text'>
Move this API to the canonical timer_*() namespace.

[ tglx: Redone against pre rc1 ]

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com

</content>
</entry>
<entry>
<title>dlm: fix error if active rsb is not hashed</title>
<updated>2025-03-01T00:24:21+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2025-02-28T22:48:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a3672304abf2a847ac0c54c84842c64c5bfba279'/>
<id>urn:sha1:a3672304abf2a847ac0c54c84842c64c5bfba279</id>
<content type='text'>
If an active rsb is not hashed anymore and this could occur because we
releases and acquired locks we need to signal the followed code that
the lookup failed. Since the lookup was successful, but it isn't part of
the rsb hash anymore we need to signal it by setting error to -EBADR as
dlm_search_rsb_tree() does it.

Cc: stable@vger.kernel.org
Fixes: 5be323b0c64d ("dlm: move dlm_search_rsb_tree() out of lock")
Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
</entry>
<entry>
<title>dlm: fix error if inactive rsb is not hashed</title>
<updated>2025-03-01T00:24:21+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2025-02-28T22:48:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=94e6e889a786dd16542fc8f2a45405fa13e3bbb5'/>
<id>urn:sha1:94e6e889a786dd16542fc8f2a45405fa13e3bbb5</id>
<content type='text'>
If an inactive rsb is not hashed anymore and this could occur because we
releases and acquired locks we need to signal the followed code that the
lookup failed. Since the lookup was successful, but it isn't part of the
rsb hash anymore we need to signal it by setting error to -EBADR as
dlm_search_rsb_tree() does it.

Cc: stable@vger.kernel.org
Fixes: 01fdeca1cc2d ("dlm: use rcu to avoid an extra rsb struct lookup")
Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
</entry>
<entry>
<title>dlm: fix removal of rsb struct that is master and dir record</title>
<updated>2024-12-19T19:07:02+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2024-11-19T20:56:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=134129520beaf3339482c557361ea0bde709cf36'/>
<id>urn:sha1:134129520beaf3339482c557361ea0bde709cf36</id>
<content type='text'>
An rsb struct was not being removed in the case where it
was both the master and the dir record.  This case (master
and dir node) was missed in the condition for doing add_scan()
from deactivate_rsb().  Fixing this triggers a related WARN_ON
that needs to be fixed, and requires adjusting where two
del_scan() calls are made.

Fixes: c217adfc8caa ("dlm: fix add_scan and del_scan usage")
Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
</entry>
<entry>
<title>dlm: fix recovery of middle conversions</title>
<updated>2024-11-15T19:39:36+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2024-11-13T16:46:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f74dacb4c81164d7578a11d5f8b660ad87059e6a'/>
<id>urn:sha1:f74dacb4c81164d7578a11d5f8b660ad87059e6a</id>
<content type='text'>
In one special case, recovery is unable to reliably rebuild
lock state by simply recreating lkb structs as sent from the
lock holders.  That case is when the lkb's include conversions
between PR and CW modes.

The recovery code has always recognized this special case,
but the implemention has always been broken, and would set
invalid modes in recovered lkb's.  Unpredictable or bogus
errors could then be returned for further locking calls on
these locks.

This bug has gone unnoticed for so long due to some
combination of:
- applications never or infrequently converting between PR/CW
- recovery not occuring during these conversions
- if the recovery bug does occur, the caller may not notice,
  depending on what further locking calls are made, e.g. if
  the lock is simply unlocked it may go unnoticed

However, a core analysis from a recent gfs2 bug report points
to this broken code.

PR = Protected Read
CW = Concurrent Write
PR and CW are incompatible
PR and PR are compatible
CW and CW are compatible

Example 1

node C, resource R
granted: PR node A
granted: PR node B
granted: NL node C
granted: NL node D

- A sends convert PR-&gt;CW to C
- C fails before A gets a reply
- recovery occurs

At this point, A does not know if it still holds
the lock in PR, or if its conversion to CW was granted:
- If A's conversion to CW was granted, then another
  node's CW lock may also have been granted.
- If A's conversion to CW was not granted, it still
  holds a PR lock, and other nodes may also hold PR locks.

So, the new master of R cannot simply recreate the lock
from A using granted mode PR and requested mode CW.
The new master must look at all the recovered locks to
determine the correct granted modes, and ensure that all
the recovered locks are recreated in compatible states.

The correct lock recovery steps in this example are:
- node D becomes the new master of R
- node B sends D its lkb, granted PR
- node A sends D its lkb, convert PR-&gt;CW
- D determines the correct lock state is:
  granted: PR node B
  convert: PR-&gt;CW node A

The lkb sent by each node was recreated without
any change on the new master node.

Example 2

node C, resource R
granted: PR node A
granted: NL node C
granted: NL node D
waiting: CW node B

- A sends convert PR-&gt;CW to C
- C grants the conversion to CW for A
- C grants the waiting request for CW to B
- C sends granted message to B, but fails
  before it can send the granted message to A
- B receives the granted message from C

At this point:
- A believes it is converting PR-&gt;CW
- B believes it is holding a CW lock

The correct lock recovery steps in this example are:
- node D becomes the new master of R
- node A sends D its lkb, convert PR-&gt;CW
- node B sends D its lkb, granted CW
- D determins the correct lock state is:
  granted: CW node B
  granted: CW node A

The lkb sent by B is recreated without change,
but the lkb sent by A is changed because the
granted mode was not compatible.

Fixes to make this work correctly:

recover_convert_waiter: should not make any changes
to a converting lkb that is still waiting for a reply
message.  It was previously setting grmode to IV, which
is invalid state, so the lkb would not be handled
correctly by other code.

receive_rcom_lock_args: was checking the wrong lkb field
(wait_type instead of status) to determine if the lkb is
being converted, and in need of inspection for this special
recovery.  It was also setting grmode to IV in the lkb,
causing it to be mishandled by other code.
Now, this function just puts the lkb, directly as sent,
onto the convert queue of the resource being recovered,
and corrects it in recover_conversion() later, if needed.

recover_conversion: the job of this function is to detect
and correct lkb states for the special PR/CW conversions.
The new code now checks for recovered lkbs on the granted
queue with grmode PR or CW, and takes the real grmode from
that.  Then it looks for lkbs on the convert queue with an
incompatible grmode (i.e. grmode PR when the real grmode is
CW, or v.v.)  These converting lkbs need to be fixed.
They are fixed by temporarily setting their grmode to NL,
so that grmodes are not incompatible and won't confuse other
locking code.  The converting lkb will then be granted at
the end of recovery, replacing the temporary NL grmode.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
</entry>
<entry>
<title>dlm: make add_to_waiters() that it can't fail</title>
<updated>2024-10-04T15:31:31+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2024-10-04T15:13:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dfe5a6cc42043d8a9433aea6d2e49887b72bf3ed'/>
<id>urn:sha1:dfe5a6cc42043d8a9433aea6d2e49887b72bf3ed</id>
<content type='text'>
If add_to_waiters() fails we have a problem because the previous called
functions such as validate_lock_args() or validate_unlock_args() sets
specific lkb values that are set for a request, there exists no way back
to revert those changes. When there is a pending lock request the
original request arguments will be overwritten with unknown
consequences.

The good news are that I believe those cases that we fail in
add_to_waiters() can't happen or very unlikely to happen (only if the DLM
user does stupid API things), but if so we have the above mentioned
problem.

There are two conditions that will be removed here. The first one is the
-EINVAL case which contains is_overlap_unlock() or (is_overlap_cancel()
and mstype == DLM_MSG_CANCEL).

The is_overlap_unlock() is missing for the normal UNLOCK case which is
moved to validate_unlock_args(). The is_overlap_cancel() already happens
in validate_unlock_args() when DLM_LKF_CANCEL is set. In case of
validate_lock_args() we check on is_overlap() when it is not a new request,
on a new request the lkb is always new and does not have those values set.

The -EBUSY check can't happen in case as for non new lock requests (when
DLM_LKF_CONVERT is set) we already check in validate_lock_args() for
lkb_wait_type and is_overlap(). Then there is only
validate_unlock_args() that will never hit the default case because
dlm_unlock() will produce DLM_MSG_UNLOCK and DLM_MSG_CANCEL messages.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
</entry>
<entry>
<title>dlm: fix possible lkb_resource null dereference</title>
<updated>2024-10-04T15:31:31+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2024-10-04T15:13:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b98333c67daf887c724cd692e88e2db9418c0861'/>
<id>urn:sha1:b98333c67daf887c724cd692e88e2db9418c0861</id>
<content type='text'>
This patch fixes a possible null pointer dereference when this function is
called from request_lock() as lkb-&gt;lkb_resource is not assigned yet,
only after validate_lock_args() by calling attach_lkb(). Another issue
is that a resource name could be a non printable bytearray and we cannot
assume to be ASCII coded.

The log functionality is probably never being hit when DLM is used in
normal way and no debug logging is enabled. The null pointer dereference
can only occur on a new created lkb that does not have the resource
assigned yet, it probably never hits the null pointer dereference but we
should be sure that other changes might not change this behaviour and we
actually can hit the mentioned null pointer dereference.

In this patch we just drop the printout of the resource name, the lkb id
is enough to make a possible connection to a resource name if this
exists.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
</entry>
</feed>
