<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/btrfs, branch v7.0.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-06-01T15:54:45+00:00</updated>
<entry>
<title>btrfs: fix squota accounting during enable generation</title>
<updated>2026-06-01T15:54:45+00:00</updated>
<author>
<name>Boris Burkov</name>
<email>boris@bur.io</email>
</author>
<published>2026-05-12T02:53:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5c277bd21e83e67c35fa775b8f1afd47f923183e'/>
<id>urn:sha1:5c277bd21e83e67c35fa775b8f1afd47f923183e</id>
<content type='text'>
[ Upstream commit d7c600554816b8ef70adffe078a0e360c055d82b ]

The first transaction that enables squotas is special and a bit tricky.
We have to set BTRFS_FS_QUOTA_ENABLED after the transaction to avoid a
deadlock, so any delayed refs that run before we set the bit are not
squota accounted. For data this is fine, we don't get an owner_ref, so
there is no real harm, it's as if the extent predated squotas. However
for metadata, the tree block will have gen == enable_gen so when we free
it later, we will decrement the squota accounting, which can result in
an underflow. Before it is freed, btrfs check shows errors, as we have
mismatched usage between the node generations/owners and the squota
values.

There are two angles to this fix:

1. For extents that come in delayed_refs that run during the
   enable_gen transaction, we must actually set enable_gen to the *next*
   transaction. That is the first transaction that we can really
   properly account in any way.
2. For extents that come in between the end of our transaction handle
   and the time we set the BTRFS_FS_QUOTA_ENABLED bit, we need an
   additional bit, BTRFS_FS_SQUOTA_ENABLING which only affects recording
   squota deltas, so we do pick up those extents. Otherwise, we would
   miss them, even for enable_gen + 1.

Fixes: bd7c1ea3a302 ("btrfs: qgroup: check generation when recording simple quota delta")
Reviewed-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: Boris Burkov &lt;boris@bur.io&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: check for subvolume before deleting squota qgroup</title>
<updated>2026-06-01T15:54:45+00:00</updated>
<author>
<name>Boris Burkov</name>
<email>boris@bur.io</email>
</author>
<published>2026-05-11T20:07:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ce337b5a5e376e1df1aa8c704c5a73a5b93f3020'/>
<id>urn:sha1:ce337b5a5e376e1df1aa8c704c5a73a5b93f3020</id>
<content type='text'>
[ Upstream commit 1e92637722ae4bd417f7a37e8d1485dc23b93935 ]

The invariant that we want to maintain with subvolume qgroups is that
the qgroup can only be deleted if there is no root. With squotas, we
thought that it was sufficient to just check the usage, because we
assumed that deleting a subvolume will drive it's qgroups usage to 0,
and thus 0 usage implies no subvolume.

However, this is false, for two reasons:

- A subvol whose extents are all from before squotas was enabled.
- A subvol that was created in this transaction and for which we have
  not yet run any delayed refs.

In both cases, deleting the qgroup breaks the desired invariant and we
are left with a subvolume with no qgroup but squotas are enabled.

Fix this by unifying the deletion check logic between full qgroups and
squotas. Squotas do all the same checks *and* the additional usage == 0
check, which is the one extra rule peculiar to squotas.

Link: https://lore.kernel.org/linux-btrfs/adnBhWfJQ1n3hZC8@merlins.org/
Fixes: a8df35619948 ("btrfs: forbid deleting live subvol qgroup")
Reviewed-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: Boris Burkov &lt;boris@bur.io&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: only release the dirty pages io tree after successful writes</title>
<updated>2026-05-23T11:09:40+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2026-04-30T01:07:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=df03d67dc63722845cb9fe59d815d1225b04fd54'/>
<id>urn:sha1:df03d67dc63722845cb9fe59d815d1225b04fd54</id>
<content type='text'>
commit 4066c55e109475a06d18a1f127c939d551211956 upstream.

[WARNING]
With extra warning on dirty extent buffers at umount (aka, the next
patch in the series), test case generic/388 can trigger the following
warning about dirty extent buffers at unmount time:

  BTRFS critical (device dm-2 state E): emergency shutdown
  BTRFS error (device dm-2 state E): error while writing out transaction: -30
  BTRFS warning (device dm-2 state E): Skipping commit of aborted transaction.
  BTRFS error (device dm-2 state EA): Transaction 9 aborted (error -30)
  BTRFS: error (device dm-2 state EA) in cleanup_transaction:2068: errno=-30 Readonly filesystem
  BTRFS info (device dm-2 state EA): forced readonly
  BTRFS info (device dm-2 state EA): last unmount of filesystem 4fbf2e15-f941-49a0-bc7c-716315d2777c
  ------------[ cut here ]------------
  WARNING: disk-io.c:3311 at invalidate_and_check_btree_folios+0xfd/0x1ca [btrfs], CPU#8: umount/914368
  CPU: 8 UID: 0 PID: 914368 Comm: umount Tainted: G           OE       7.1.0-rc1-custom+ #372 PREEMPT(full)  2de38db8d1deae71fde295430a0ff3ab98ccf596
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
  RIP: 0010:invalidate_and_check_btree_folios+0xfd/0x1ca [btrfs]
  Call Trace:
   &lt;TASK&gt;
   close_ctree+0x52e/0x574 [btrfs d2f0b1cd330d1287e7a9919d112eadfc0e914efd]
   generic_shutdown_super+0x89/0x1a0
   kill_anon_super+0x16/0x40
   btrfs_kill_super+0x16/0x20 [btrfs d2f0b1cd330d1287e7a9919d112eadfc0e914efd]
   deactivate_locked_super+0x2d/0xb0
   cleanup_mnt+0xdc/0x140
   task_work_run+0x5a/0xa0
   exit_to_user_mode_loop+0x123/0x4b0
   do_syscall_64+0x243/0x7c0
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
   &lt;/TASK&gt;
  ---[ end trace 0000000000000000 ]---
  BTRFS warning (device dm-2 state EA): unable to release extent buffer 30539776 owner 9 gen 9 refs 2 flags 0x7
  BTRFS warning (device dm-2 state EA): unable to release extent buffer 30621696 owner 257 gen 9 refs 2 flags 0x7
  BTRFS warning (device dm-2 state EA): unable to release extent buffer 30638080 owner 258 gen 9 refs 2 flags 0x7
  BTRFS warning (device dm-2 state EA): unable to release extent buffer 30654464 owner 7 gen 9 refs 2 flags 0x7
  BTRFS warning (device dm-2 state EA): unable to release extent buffer 30703616 owner 2 gen 9 refs 2 flags 0x7
  BTRFS warning (device dm-2 state EA): unable to release extent buffer 30720000 owner 10 gen 9 refs 2 flags 0x7
  BTRFS warning (device dm-2 state EA): unable to release extent buffer 30736384 owner 4 gen 9 refs 2 flags 0x7
  BTRFS warning (device dm-2 state EA): unable to release extent buffer 30752768 owner 11 gen 9 refs 2 flags 0x7

I'm using a stripped down version, which seems to trigger the warning
more reliably:

  _fsstress_pid=""
  workload()
  {
  	dmesg -C
  	mkfs.btrfs -f -K $dev &gt; /dev/null
  	echo 1 &gt; /sys/kernel/debug/clear_warn_once
  	mount $dev $mnt
  	$fsstress -w -n 1024 -p 4 -d $mnt &amp;
  	_fsstress_pid=$!
  	sleep 0
  	$godown $mnt
  	pkill --echo -PIPE fsstress &gt; /dev/null
  	wait $_fsstress_pid
  	unset _fsstress_pid
  	umount $mnt

  	if dmesg | grep -q "WARNING"; then
  		fail
  	fi
  }

  for (( i = 0; i &lt; $runtime; i++ )); do
  	echo "=== $i/$runtime ==="
  	workload
  done

[CAUSE]
Inside btrfs_write_and_wait_transaction(), we first try to write all
dirty ebs, then wait for them to finish.

After that we call btrfs_extent_io_tree_release() to free all
extent states from dirty_pages io tree.

However if we hit an error from btrfs_write_marked_extent(), then we
still call btrfs_extent_io_tree_release() to clear that dirty_pages io
tree, which may contain dirty records that we haven't yet submitted.

Furthermore, the later transaction cleanup path will utilize that
dirty_pages io tree to properly cleanup those dirty ebs, but since it's
already empty, no dirty ebs are properly cleaned up, thus will later
trigger the warnings inside invalidate_btree_folios().

[FIX]
Normally such dirty ebs won't cause problems, as when the iput() is
called on the btree inode, the dirty ebs will be forcibly written back,
and since the fs is already in an error status, such writeback will not
reach disk and finish immediately.

But it's still better to get rid of such dirty ebs, if we ended up with
dirty ebs but the fs is not in an error status, then such writeback at
iput() time will be too late, as all workers are already stopped but
writeback will utilize workers, which will lead to NULL pointer
dereferences.

Instead of unconditionally calling btrfs_extent_io_tree_release(), only
call it if btrfs_write_and_wait_transaction() finished successfully, so
that @dirty_pages extent io tree is kept untouched for transaction
cleanup.

CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>btrfs: fix double-decrement of bytes_may_use in submit_one_async_extent()</title>
<updated>2026-05-23T11:09:26+00:00</updated>
<author>
<name>Mark Harmstone</name>
<email>mark@harmstone.com</email>
</author>
<published>2026-04-16T17:15:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0210d68c60bc4777d00faf98ba51628b90771b56'/>
<id>urn:sha1:0210d68c60bc4777d00faf98ba51628b90771b56</id>
<content type='text'>
[ Upstream commit 82323b1a7088b7a5c3e528a5d634bff447fa286f ]

submit_one_async_extent() calls btrfs_reserve_extent(), which decrements
bytes_may_use. If the call btrfs_create_io_em() fails, we jump to
out_free_reserve, which calls extent_clear_unlock_delalloc().

Because we're specifying EXTENT_DO_ACCOUNTING, i.e.
EXTENT_CLEAR_META_RESV | EXTENT_CLEAR_DATA_RESV, this decreases
bytes_may_use again. This can lead to problems later on, as an initial
write can fail only for the writeback to silently ENOSPC.

Fix this by replacing EXTENT_DO_ACCOUNTING with EXTENT_CLEAR_META_RESV.
This parallels a4fe134fc1d8eb ("btrfs: fix a double release on reserved
extents in cow_one_range()"), which is the same fix in cow_one_range().

Fixes: 151a41bc46df ("Btrfs: fix what bits we clear when erroring out from delalloc")
Reviewed-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: Mark Harmstone &lt;mark@harmstone.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: don't clobber errors in add_remap_tree_entries()</title>
<updated>2026-05-23T11:09:26+00:00</updated>
<author>
<name>Mark Harmstone</name>
<email>maharmstone@fb.com</email>
</author>
<published>2026-03-23T17:17:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=01cc967c6378d008da316dcf1126b09b53cefba4'/>
<id>urn:sha1:01cc967c6378d008da316dcf1126b09b53cefba4</id>
<content type='text'>
[ Upstream commit 44366af74061793ee5ceef455a4f0e465892d0de ]

In add_remap_tree_entries(), we only process a certain number of entries
at a time, meaning we may need to loop.

But because we weren't checking the return value of btrfs_insert_empty_items()
within the loop, this meant that if the last iteration of the loop
succeeded but a previous iteration failed, we were erroneously returning
0.

Fix this by breaking the loop early if btrfs_insert_empty_items() fails.

Fixes: b56f35560b82 ("btrfs: handle setting up relocation of block group with remap-tree")
Signed-off-by: Mark Harmstone &lt;mark@harmstone.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: fix bytes_may_use leak in do_remap_reloc_trans()</title>
<updated>2026-05-23T11:09:26+00:00</updated>
<author>
<name>Mark Harmstone</name>
<email>mark@harmstone.com</email>
</author>
<published>2026-03-23T12:59:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=806aeb061baf0c2d48e68515e0440264cd7a86c2'/>
<id>urn:sha1:806aeb061baf0c2d48e68515e0440264cd7a86c2</id>
<content type='text'>
[ Upstream commit 9b8824533d75fb199a3fb0f6147ffcca64b5caf8 ]

If the call to btrfs_reserve_extent() in do_remap_reloc_trans() returns
a smaller extent than we asked for, currently we're not undoing the
bytes_may_use change that we made. Fix this by calling
btrfs_space_info_update_bytes_may_use() again for the difference.

Fixes: fd6594b1446c ("btrfs: replace identity remaps with actual remaps when doing relocations")
Reviewed-by: Boris Burkov &lt;boris@bur.io&gt;
Signed-off-by: Mark Harmstone &lt;mark@harmstone.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: fix bytes_may_use leak in move_existing_remap()</title>
<updated>2026-05-23T11:09:26+00:00</updated>
<author>
<name>Mark Harmstone</name>
<email>mark@harmstone.com</email>
</author>
<published>2026-03-23T12:59:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=662f4a8bd1db4dd552c0c4df900f9f7de848bec1'/>
<id>urn:sha1:662f4a8bd1db4dd552c0c4df900f9f7de848bec1</id>
<content type='text'>
[ Upstream commit 68a135013bf73dfd6a277f76fc4e088b0f3dfa79 ]

If the call to btrfs_reserve_extent() in move_existing_remap() returns a
smaller extent than we asked for, currently we're not undoing the
bytes_may_use change that we made. Fix this by calling
btrfs_space_info_update_bytes_may_use() again for the difference.

Fixes: bbea42dfb91f ("btrfs: move existing remaps before relocating block group")
Reviewed-by: Boris Burkov &lt;boris@bur.io&gt;
Signed-off-by: Mark Harmstone &lt;mark@harmstone.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: do not reject a valid running dev-replace</title>
<updated>2026-05-23T11:08:27+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2026-03-31T23:02:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2d9e16ab83f3f98a3eeebe9d01314d628089b581'/>
<id>urn:sha1:2d9e16ab83f3f98a3eeebe9d01314d628089b581</id>
<content type='text'>
[ Upstream commit 3c0c45a4dff73845ba93d41365fc14e45ee32bd7 ]

[BUG]
There is a bug report that a btrfs with running dev-replace got rejected
with the following messages:

  BTRFS error (device sdk1): devid 0 path /dev/sdk1 is registered but not found in chunk tree
  BTRFS error (device sdk1): remove the above devices or use 'btrfs device scan --forget &lt;dev&gt;' to unregister them before mount
  BTRFS error (device sdk1): open_ctree failed: -117

[CAUSE]
The tree and super block dumps show the fs is completely sane, except
one thing, there is no dev item for devid 0 in chunk tree.

However this is not a bug, as we do not insert dev item for devid 0 in
the first place.
Since the devid 0 is only there temporarily we do not really need to
insert a dev item for it and then later remove it again.

It is the commit 34308187395f ("btrfs: add extra device item checks at
mount") adding a overly strict check that triggers a false alert and
rejected the valid filesystem.

[FIX]
Add a special handling for devid 0, and doesn't require devid 0 to
have a device item in chunk tree.

Reported-by: Jaron Viëtor &lt;jaron@vietors.com&gt;
Link: https://lore.kernel.org/linux-btrfs/CAF1bhLVYLZvD=j2XyuxXDKD-NWNJAwDnpVN+UYeQW-HbzNRn1A@mail.gmail.com/
Fixes: 34308187395f ("btrfs: add extra device item checks at mount")
Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: fix deadlock between reflink and transaction commit when using flushoncommit</title>
<updated>2026-05-23T11:08:27+00:00</updated>
<author>
<name>Filipe Manana</name>
<email>fdmanana@suse.com</email>
</author>
<published>2026-03-23T15:50:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6f0f9c0a368aa1fe078109091322d3b0632d9380'/>
<id>urn:sha1:6f0f9c0a368aa1fe078109091322d3b0632d9380</id>
<content type='text'>
[ Upstream commit b48c980b6a7e409050bb3067165db31cc6205e3e ]

When using the flushoncommit mount option, we can have a deadlock between
a transaction commit and a reflink operation that copied an inline extent
to an offset beyond the current i_size of the destination node.

The deadlock happens like this:

1) Task A clones an inline extent from inode X to an offset of inode Y
   that is beyond Y's current i_size. This means we copied the inline
   extent's data to a folio of inode Y that is beyond its EOF, using a
   call to copy_inline_to_page();

2) Task B starts a transaction commit and calls
   btrfs_start_delalloc_flush() to flush delalloc;

3) The delalloc flushing sees the new dirty folio of inode Y and when it
   attempts to flush it, it ends up at extent_writepage() and sees that
   the offset of the folio is beyond the i_size of inode Y, so it attempts
   to invalidate the folio by calling folio_invalidate(), which ends up at
   btrfs' folio invalidate callback - btrfs_invalidate_folio(). There it
   tries to lock the folio's range in inode Y's extent io tree, but it
   blocks since it's currently locked by task A - during a reflink we lock
   the inodes and the source and destination ranges after flushing all
   delalloc and waiting for ordered extent completion - after that we
   don't expect to have dirty folios in the ranges, the exception is if
   we have to copy an inline extent's data (because the destination offset
   is not zero);

4) Task A then attempts to start a transaction to update the inode item,
   and then it's blocked since the current transaction is in the
   TRANS_STATE_COMMIT_START state. Therefore task A has to wait for the
   current transaction to become unblocked (its state &gt;=
   TRANS_STATE_UNBLOCKED).

   So task A is waiting for the transaction commit done by task B, and
   the later waiting on the extent lock of inode Y that is currently
   held by task A.

Syzbot recently reported this with the following stack traces:

  INFO: task kworker/u8:7:1053 blocked for more than 143 seconds.
        Not tainted syzkaller #0
  "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  task:kworker/u8:7    state:D stack:23520 pid:1053  tgid:1053  ppid:2      task_flags:0x4208060 flags:0x00080000
  Workqueue: writeback wb_workfn (flush-btrfs-46)
  Call Trace:
   &lt;TASK&gt;
   context_switch kernel/sched/core.c:5298 [inline]
   __schedule+0x1553/0x5240 kernel/sched/core.c:6911
   __schedule_loop kernel/sched/core.c:6993 [inline]
   schedule+0x164/0x360 kernel/sched/core.c:7008
   wait_extent_bit fs/btrfs/extent-io-tree.c:811 [inline]
   btrfs_lock_extent_bits+0x59c/0x700 fs/btrfs/extent-io-tree.c:1914
   btrfs_lock_extent fs/btrfs/extent-io-tree.h:152 [inline]
   btrfs_invalidate_folio+0x43d/0xc40 fs/btrfs/inode.c:7704
   extent_writepage fs/btrfs/extent_io.c:1852 [inline]
   extent_write_cache_pages fs/btrfs/extent_io.c:2580 [inline]
   btrfs_writepages+0x12ff/0x2440 fs/btrfs/extent_io.c:2713
   do_writepages+0x32e/0x550 mm/page-writeback.c:2554
   __writeback_single_inode+0x133/0x11a0 fs/fs-writeback.c:1750
   writeback_sb_inodes+0x995/0x19d0 fs/fs-writeback.c:2042
   wb_writeback+0x456/0xb70 fs/fs-writeback.c:2227
   wb_do_writeback fs/fs-writeback.c:2374 [inline]
   wb_workfn+0x41a/0xf60 fs/fs-writeback.c:2414
   process_one_work kernel/workqueue.c:3276 [inline]
   process_scheduled_works+0xb6e/0x18c0 kernel/workqueue.c:3359
   worker_thread+0xa53/0xfc0 kernel/workqueue.c:3440
   kthread+0x388/0x470 kernel/kthread.c:436
   ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
   &lt;/TASK&gt;
  INFO: task syz.4.64:6910 blocked for more than 143 seconds.
        Not tainted syzkaller #0
  "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  task:syz.4.64        state:D stack:22752 pid:6910  tgid:6905  ppid:5944   task_flags:0x400140 flags:0x00080002
  Call Trace:
   &lt;TASK&gt;
   context_switch kernel/sched/core.c:5298 [inline]
   __schedule+0x1553/0x5240 kernel/sched/core.c:6911
   __schedule_loop kernel/sched/core.c:6993 [inline]
   schedule+0x164/0x360 kernel/sched/core.c:7008
   wait_current_trans+0x39f/0x590 fs/btrfs/transaction.c:535
   start_transaction+0x6a7/0x1650 fs/btrfs/transaction.c:705
   clone_copy_inline_extent fs/btrfs/reflink.c:299 [inline]
   btrfs_clone+0x128a/0x24d0 fs/btrfs/reflink.c:529
   btrfs_clone_files+0x271/0x3f0 fs/btrfs/reflink.c:750
   btrfs_remap_file_range+0x76b/0x1320 fs/btrfs/reflink.c:903
   vfs_copy_file_range+0xda7/0x1390 fs/read_write.c:1600
   __do_sys_copy_file_range fs/read_write.c:1683 [inline]
   __se_sys_copy_file_range+0x2fb/0x480 fs/read_write.c:1650
   do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
   do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
  RIP: 0033:0x7f5f73afc799
  RSP: 002b:00007f5f7315e028 EFLAGS: 00000246 ORIG_RAX: 0000000000000146
  RAX: ffffffffffffffda RBX: 00007f5f73d75fa0 RCX: 00007f5f73afc799
  RDX: 0000000000000005 RSI: 0000000000000000 RDI: 0000000000000005
  RBP: 00007f5f73b92c99 R08: 0000000000000863 R09: 0000000000000000
  R10: 00002000000000c0 R11: 0000000000000246 R12: 0000000000000000
  R13: 00007f5f73d76038 R14: 00007f5f73d75fa0 R15: 00007fff138a5068
   &lt;/TASK&gt;
  INFO: task syz.4.64:6975 blocked for more than 143 seconds.
        Not tainted syzkaller #0
  "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  task:syz.4.64        state:D stack:24736 pid:6975  tgid:6905  ppid:5944   task_flags:0x400040 flags:0x00080002
  Call Trace:
   &lt;TASK&gt;
   context_switch kernel/sched/core.c:5298 [inline]
   __schedule+0x1553/0x5240 kernel/sched/core.c:6911
   __schedule_loop kernel/sched/core.c:6993 [inline]
   schedule+0x164/0x360 kernel/sched/core.c:7008
   wb_wait_for_completion+0x3e8/0x790 fs/fs-writeback.c:227
   __writeback_inodes_sb_nr+0x24c/0x2d0 fs/fs-writeback.c:2838
   try_to_writeback_inodes_sb+0x9a/0xc0 fs/fs-writeback.c:2886
   btrfs_start_delalloc_flush fs/btrfs/transaction.c:2175 [inline]
   btrfs_commit_transaction+0x82e/0x31a0 fs/btrfs/transaction.c:2364
   btrfs_ioctl+0xca7/0xd00 fs/btrfs/ioctl.c:5206
   vfs_ioctl fs/ioctl.c:51 [inline]
   __do_sys_ioctl fs/ioctl.c:597 [inline]
   __se_sys_ioctl+0xff/0x170 fs/ioctl.c:583
   do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
   do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
  RIP: 0033:0x7f5f73afc799
  RSP: 002b:00007f5f7313d028 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 00007f5f73d76090 RCX: 00007f5f73afc799
  RDX: 0000000000000000 RSI: 0000000000009408 RDI: 0000000000000004
  RBP: 00007f5f73b92c99 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
  R13: 00007f5f73d76128 R14: 00007f5f73d76090 R15: 00007fff138a5068
   &lt;/TASK&gt;

Fix this by updating the i_size of the destination inode of a reflink
operation after we copy an inline extent's data to an offset beyond the
i_size and before attempting to start a transaction to update the inode's
item.

Reported-by: syzbot+63056bf627663701bbbf@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/69bba3fe.050a0220.227207.002f.GAE@google.com/
Fixes: 05a5a7621ce6 ("Btrfs: implement full reflink support for inline extents")
Reviewed-by: Boris Burkov &lt;boris@bur.io&gt;
Signed-off-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: fix the inline compressed extent check in inode_need_compress()</title>
<updated>2026-05-23T11:08:27+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2026-02-09T00:21:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=639d20e0230085f0796a38de70ab69964efd05d9'/>
<id>urn:sha1:639d20e0230085f0796a38de70ab69964efd05d9</id>
<content type='text'>
[ Upstream commit 883adb6dcff0f96dbbdb6488842a38b121ebd68c ]

[BUG]
Since commit 59615e2c1f63 ("btrfs: reject single block sized compression
early"), the following script will result the inode to have NOCOMPRESS
flag, meanwhile old kernels don't:

  # mkfs.btrfs -f $dev
  # mount $dev $mnt -o max_inline=2k,compress=zstd
  # truncate -s 8k $mnt/foobar
  # xfs_io -f -c "pwrite 0 2k" $mnt/foobar
  # sync

Before that commit, the inode will not have NOCOMPRESS flag:

	item 4 key (257 INODE_ITEM 0) itemoff 15879 itemsize 160
		generation 9 transid 9 size 8192 nbytes 4096
		block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0
		sequence 3 flags 0x0(none)

But after that commit, the inode will have NOCOMPRESS flag:

	item 4 key (257 INODE_ITEM 0) itemoff 15879 itemsize 160
		generation 9 transid 10 size 8192 nbytes 4096
		block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0
		sequence 3 flags 0x8(NOCOMPRESS)

This will make a lot of files no longer to be compressed.

[CAUSE]
The old compressed inline check looks like this:

	if (total_compressed &lt;= blocksize &amp;&amp;
	   (start &gt; 0 || end + 1 &lt; inode-&gt;disk_i_size))
		goto cleanup_and_bail_uncompressed;

That inline part check is equal to "!(start == 0 &amp;&amp; end + 1 &gt;=
inode-&gt;disk_i_size)", but the new check no longer has that disk_i_size
check.

Thus it means any single block sized write at file offset 0 will pass
the inline check, which is wrong.

Furthermore, since we have merged the old check into
inode_need_compress(), there is no disk_i_size based inline check
anymore, we will always try compressing that single block at file offset
0, then later find out it's not a net win and go to the
mark_incompressible tag.

This results the inode to have NOCOMPRESS flag.

[FIX]
Add back the missing disk_i_size based check into inode_need_compress().

Now the same script will no longer cause NOCOMPRESS flag.

Fixes: 59615e2c1f63 ("btrfs: reject single block sized compression early")
Reported-by: Chris Mason &lt;clm@meta.com&gt;
Link: https://lore.kernel.org/linux-btrfs/20260208183840.975975-1-clm@meta.com/
Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
