<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/btrfs/locking.c, branch v5.10.257</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.10.257</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.10.257'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-03-13T11:47:44+00:00</updated>
<entry>
<title>btrfs: bring back the incorrectly removed extent buffer lock recursion support</title>
<updated>2025-03-13T11:47:44+00:00</updated>
<author>
<name>Filipe Manana</name>
<email>fdmanana@suse.com</email>
</author>
<published>2025-03-02T12:47:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=474c73d5c9228cfb209c823ec4991b744df5a056'/>
<id>urn:sha1:474c73d5c9228cfb209c823ec4991b744df5a056</id>
<content type='text'>
Commit 51b03b7473a0 ("btrfs: locking: remove the recursion handling code")
from the 5.10.233 stable tree removed the support for extent buffer lock
recursion, but we need that code because in 5.10.x we load the free space
cache synchronously - while modifying the extent tree and holding a write
lock on some extent buffer, we may need to load the free space cache,
which requires acquiring read locks on the extent tree and therefore
result in a deadlock in case we need to read lock an extent buffer we had
write locked while modifying the extent tree.

Backporting that commit from Linus' tree is therefore wrong, and was done
so in order to backport upstream commit 97e86631bccd ("btrfs: don't set
lock_owner when locking extent buffer for reading"). However we should
have instead had the commit adapted to the 5.10 stable tree instead.

Note that the backport of that dependency is ok only for stable trees
5.11+, because in those tree the space cache loading code is not
synchronous anymore, so there is no need to have the lock recursion
and indeed there are no users of the extent buffer lock recursion
support. In other words, the backport is only valid for kernel releases
that have the asynchrounous free space cache loading support, which
was introduced in kernel 5.11 with commit e747853cae3a ("btrfs: load
free space cache asynchronously").

This was causing deadlocks and reported by a user (see below Link tag).

So revert commit 51b03b7473a0 ("btrfs: locking: remove the recursion
handling code") while not undoing what commit d5a30a6117ea ("btrfs: don't
set lock_owner when locking extent buffer for reading") from the 5.10.x
stable tree did.

Reported-by: pk &lt;pkoroau@gmail.com&gt;
Link: https://lore.kernel.org/linux-btrfs/CAMNwjEKH6znTHE5hMc5er2dFs5ypw4Szx6TMDMb0H76yFq5DGQ@mail.gmail.com/
Signed-off-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>btrfs: don't set lock_owner when locking extent buffer for reading</title>
<updated>2025-01-09T12:25:05+00:00</updated>
<author>
<name>Zygo Blaxell</name>
<email>ce3g8jdj@umail.furryterror.org</email>
</author>
<published>2022-06-09T02:39:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d5a30a6117eaee25a1aadf09ab3ae15223581b51'/>
<id>urn:sha1:d5a30a6117eaee25a1aadf09ab3ae15223581b51</id>
<content type='text'>
[ Upstream commit 97e86631bccddfbbe0c13f9a9605cdef11d31296 ]

In 196d59ab9ccc "btrfs: switch extent buffer tree lock to rw_semaphore"
the functions for tree read locking were rewritten, and in the process
the read lock functions started setting eb-&gt;lock_owner = current-&gt;pid.
Previously lock_owner was only set in tree write lock functions.

Read locks are shared, so they don't have exclusive ownership of the
underlying object, so setting lock_owner to any single value for a
read lock makes no sense.  It's mostly harmless because write locks
and read locks are mutually exclusive, and none of the existing code
in btrfs (btrfs_init_new_buffer and print_eb_refs_lock) cares what
nonsense is written in lock_owner when no writer is holding the lock.

KCSAN does care, and will complain about the data race incessantly.
Remove the assignments in the read lock functions because they're
useless noise.

Fixes: 196d59ab9ccc ("btrfs: switch extent buffer tree lock to rw_semaphore")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Zygo Blaxell &lt;ce3g8jdj@umail.furryterror.org&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: locking: remove the recursion handling code</title>
<updated>2025-01-09T12:25:05+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2020-11-06T21:27:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=51b03b7473a0d9dc4bc05ffed6c2c34fff26dc35'/>
<id>urn:sha1:51b03b7473a0d9dc4bc05ffed6c2c34fff26dc35</id>
<content type='text'>
[ Upstream commit 4048daedb910f83f080c6bb03c78af794aebdff5 ]

Now that we're no longer using recursion, rip out all of the supporting
code.  Follow up patches will clean up the callers of these functions.

The extent_buffer::lock_owner is still retained as it allows safety
checks in btrfs_init_new_buffer for the case that the free space cache
is corrupted and we try to allocate a block that we are currently using
and have locked in the path.

Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Stable-dep-of: 97e86631bccd ("btrfs: don't set lock_owner when locking extent buffer for reading")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: locking: remove all the blocking helpers</title>
<updated>2025-01-09T12:25:04+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2020-08-20T15:46:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1817e3e849f6578c479e011fa898dcdc6e3dff61'/>
<id>urn:sha1:1817e3e849f6578c479e011fa898dcdc6e3dff61</id>
<content type='text'>
[ Upstream commit ac5887c8e013d6754d36e6d51dc03448ee0b0065 ]

Now that we're using a rw_semaphore we no longer need to indicate if a
lock is blocking or not, nor do we need to flip the entire path from
blocking to spinning.  Remove these helpers and all the places they are
called.

Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Stable-dep-of: 44f52bbe96df ("btrfs: fix use-after-free when COWing tree bock and tracing is enabled")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: switch extent buffer tree lock to rw_semaphore</title>
<updated>2025-01-09T12:25:04+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2020-08-20T15:46:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4da6be8eb61894978c4bbf48a11a3be5a4b0c404'/>
<id>urn:sha1:4da6be8eb61894978c4bbf48a11a3be5a4b0c404</id>
<content type='text'>
[ Upstream commit 196d59ab9ccc975d8d29292845d227cdf4423ef8 ]

Historically we've implemented our own locking because we wanted to be
able to selectively spin or sleep based on what we were doing in the
tree.  For instance, if all of our nodes were in cache then there's
rarely a reason to need to sleep waiting for node locks, as they'll
likely become available soon.  At the time this code was written the
rw_semaphore didn't do adaptive spinning, and thus was orders of
magnitude slower than our home grown locking.

However now the opposite is the case.  There are a few problems with how
we implement blocking locks, namely that we use a normal waitqueue and
simply wake everybody up in reverse sleep order.  This leads to some
suboptimal performance behavior, and a lot of context switches in highly
contended cases.  The rw_semaphores actually do this properly, and also
have adaptive spinning that works relatively well.

The locking code is also a bit of a bear to understand, and we lose the
benefit of lockdep for the most part because the blocking states of the
lock are simply ad-hoc and not mapped into lockdep.

So rework the locking code to drop all of this custom locking stuff, and
simply use a rw_semaphore for everything.  This makes the locking much
simpler for everything, as we can now drop a lot of cruft and blocking
transitions.  The performance numbers vary depending on the workload,
because generally speaking there doesn't tend to be a lot of contention
on the btree.  However, on my test system which is an 80 core single
socket system with 256GiB of RAM and a 2TiB NVMe drive I get the
following results (with all debug options off):

  dbench 200 baseline
  Throughput 216.056 MB/sec  200 clients  200 procs  max_latency=1471.197 ms

  dbench 200 with patch
  Throughput 737.188 MB/sec  200 clients  200 procs  max_latency=714.346 ms

Previously we also used fs_mark to test this sort of contention, and
those results are far less impressive, mostly because there's not enough
tasks to really stress the locking

  fs_mark -d /d[0-15] -S 0 -L 20 -n 100000 -s 0 -t 16

  baseline
    Average Files/sec:     160166.7
    p50 Files/sec:         165832
    p90 Files/sec:         123886
    p99 Files/sec:         123495

    real    3m26.527s
    user    2m19.223s
    sys     48m21.856s

  patched
    Average Files/sec:     164135.7
    p50 Files/sec:         171095
    p90 Files/sec:         122889
    p99 Files/sec:         113819

    real    3m29.660s
    user    2m19.990s
    sys     44m12.259s

Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Stable-dep-of: 44f52bbe96df ("btrfs: fix use-after-free when COWing tree bock and tracing is enabled")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: add nesting tags to the locking helpers</title>
<updated>2020-10-07T10:12:16+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2020-08-20T15:46:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fd7ba1c1202d15ac96d7b42256550973e12686b5'/>
<id>urn:sha1:fd7ba1c1202d15ac96d7b42256550973e12686b5</id>
<content type='text'>
We will need these when we switch to an rwsem, so plumb in the
infrastructure here to use later on.  I violate the 80 character limit
some here because it'll be cleaned up later.

Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: introduce btrfs_path::recurse</title>
<updated>2020-10-07T10:12:16+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2020-08-20T15:46:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=51899412dd95b2d9f50297c527b22ada3da0000c'/>
<id>urn:sha1:51899412dd95b2d9f50297c527b22ada3da0000c</id>
<content type='text'>
Our current tree locking stuff allows us to recurse with read locks if
we're already holding the write lock.  This is necessary for the space
cache inode, as we could be holding a lock on the root_tree root when we
need to cache a block group, and thus need to be able to read down the
root_tree to read in the inode cache.

We can get away with this in our current locking, but we won't be able
to with a rwsem.  Handle this by purposefully annotating the places
where we require recursion, so that in the future we can maybe come up
with a way to avoid the recursion.  In the case of the free space inode,
this will be superseded by the free space tree.

Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: rename extent_buffer::lock_nested to extent_buffer::lock_recursed</title>
<updated>2020-10-07T10:12:15+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2020-08-20T15:46:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=329ced799be881feab445b68163cd3869340cc08'/>
<id>urn:sha1:329ced799be881feab445b68163cd3869340cc08</id>
<content type='text'>
Nested locking with lockdep and everything else refers to lock hierarchy
within the same lock map.  This is how we indicate the same locks for
different objects are ok to take in a specific order, for our use case
that would be to take the lock on a leaf and then take a lock on an
adjacent leaf.

What -&gt;lock_nested _actually_ refers to is if we happen to already be
holding the write lock on the extent buffer and we're allowing a read
lock to be taken on that extent buffer, which is recursion.  Rename this
so we don't get confused when we switch to a rwsem and have to start
using the _nested helpers.

Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: add missing annotation for btrfs_tree_lock()</title>
<updated>2020-05-25T09:25:16+00:00</updated>
<author>
<name>Jules Irenge</name>
<email>jbi.octave@gmail.com</email>
</author>
<published>2020-03-31T20:46:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=78d933c79cb649906577715af15400c7724ca633'/>
<id>urn:sha1:78d933c79cb649906577715af15400c7724ca633</id>
<content type='text'>
Sparse reports a warning at btrfs_tree_lock()

warning: context imbalance in btrfs_tree_lock() - wrong count at exit

The root cause is the missing annotation at btrfs_tree_lock()
Add the missing __acquires(&amp;eb-&gt;lock) annotation

Signed-off-by: Jules Irenge &lt;jbi.octave@gmail.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: Implement DREW lock</title>
<updated>2020-03-23T16:01:43+00:00</updated>
<author>
<name>Nikolay Borisov</name>
<email>nborisov@suse.com</email>
</author>
<published>2020-01-30T12:59:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2992df73268f78ec9281692b9b44ae92f3933b54'/>
<id>urn:sha1:2992df73268f78ec9281692b9b44ae92f3933b54</id>
<content type='text'>
A (D)ouble (R)eader (W)riter (E)xclustion lock is a locking primitive
that allows to have multiple readers or multiple writers but not
multiple readers and writers holding it concurrently.

The code is factored out from the existing open-coded locking scheme
used to exclude pending snapshots from nocow writers and vice-versa.
Current implementation actually favors Readers (that is snapshot
creaters) to writers (nocow writers of the filesystem).

The API provides lock/unlock/trylock for reads and writes.

Formal specification for TLA+ provided by Valentin Schneider is at
https://lore.kernel.org/linux-btrfs/2dcaf81c-f0d3-409e-cb29-733d8b3b4cc9@arm.com/

Signed-off-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
</feed>
