<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/btrfs/ordered-data.c, branch v7.1-rc5</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.1-rc5</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.1-rc5'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-04-07T17:43:50+00:00</updated>
<entry>
<title>btrfs: fix silent IO error loss in encoded writes and zoned split</title>
<updated>2026-04-07T17:43:50+00:00</updated>
<author>
<name>Michal Grzedzicki</name>
<email>mge@meta.com</email>
</author>
<published>2026-03-30T16:06:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3cd181cc46d36aa7bd4af85f14639d86a25beaec'/>
<id>urn:sha1:3cd181cc46d36aa7bd4af85f14639d86a25beaec</id>
<content type='text'>
can_finish_ordered_extent() and btrfs_finish_ordered_zoned() set
BTRFS_ORDERED_IOERR via bare set_bit(). Later,
btrfs_mark_ordered_extent_error() in btrfs_finish_one_ordered() uses
test_and_set_bit(), finds it already set, and skips
mapping_set_error(). The error is never recorded on the inode's
address_space, making it invisible to fsync. For encoded writes this
causes btrfs receive to silently produce files with zero-filled holes.

Fix: replace bare set_bit(BTRFS_ORDERED_IOERR) with
btrfs_mark_ordered_extent_error() which pairs test_and_set_bit() with
mapping_set_error(), guaranteeing the error is recorded exactly once.

Reviewed-by: Qu Wenruo &lt;wqu@suse.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Mark Harmstone &lt;mark@harmstone.com&gt;
Signed-off-by: Michal Grzedzicki &lt;mge@meta.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: output more info when duplicated ordered extent is found</title>
<updated>2026-04-07T16:56:02+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2026-03-14T00:00:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3c53ad7549ed80f4d27b6bee89425bb022ecfd32'/>
<id>urn:sha1:3c53ad7549ed80f4d27b6bee89425bb022ecfd32</id>
<content type='text'>
During development of a new feature, I triggered that btrfs_panic()
inside insert_ordered_extent() and spent quite some unnecessary before
noticing I'm passing incorrect flags when creating a new ordered extent.

Unfortunately the existing error message is not providing much help.

Enhance the output to provide file offset, num bytes and flags of both
existing and new ordered extents.

Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: check type flags in alloc_ordered_extent()</title>
<updated>2026-04-07T16:56:02+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2026-03-14T00:00:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=232770bcf3aa65ae83fd7389961fa651f09f7b3a'/>
<id>urn:sha1:232770bcf3aa65ae83fd7389961fa651f09f7b3a</id>
<content type='text'>
Unlike other flags used in btrfs, BTRFS_ORDERED_* macros are different as
they cannot be directly used as flags.

They are defined as bit values, thus they should be utilized with
bit operations, not directly with logical operations.

Unfortunately sometimes I forgot this and passed the incorrect flags
to alloc_ordered_extent() and hit weird bugs.

Enhance the type checks in alloc_ordered_extent():

- Make sure there is one and only one bit set for exclusive type flags
  There are four exclusive type flags, REGULAR, NOCOW, PREALLOC and
  COMPRESSED.
  So introduce a new macro, BTRFS_ORDERED_EXCLUSIVE_FLAGS, to cover
  above flags.

  Add an ASSERT() to check one and only one of those exclusive flags can
  be set for alloc_ordered_extent().

- Re-order the type bit numbers to the end of the enum
  This is make it much harder to get a valid false negative.

  E.g., with the old code BTRFS_ORDERED_REGULAR starts at zero, we can
  have the following flags passing the bit uniqueness check:

  * BTRFS_ORDERED_NOCOW
    Be treated as BTRFS_ORDERED_REGULAR (1 == 1UL &lt;&lt; 0).

  * BTRFS_ORDERED_PREALLOC
    Be treated as BTRFS_ORDERED_NOCOW (2 == 1UL &lt;&lt; 1).

  * BTRFS_ORDERED_DIRECT
    Be treated as BTRFS_ORDERED_PREALLOC (4 == 1UL &lt;&lt; 2).

  Now all those types start at 8, passing any of those bit numbers as
  flags directly will not pass the ASSERT().

- Add a static assert to avoid overflow
  To make sure all BTRFS_ORDERED_* flags can fit into an unsigned long.

Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: remove folio parameter from ordered io related functions</title>
<updated>2026-04-07T16:55:57+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2026-02-12T09:13:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a11d6912fdd9e57aff889ec97256b1d6b4e5bf06'/>
<id>urn:sha1:a11d6912fdd9e57aff889ec97256b1d6b4e5bf06</id>
<content type='text'>
Both functions btrfs_finish_ordered_extent() and
btrfs_mark_ordered_io_finished() are accepting an optional folio
parameter.

That @folio is passed into can_finish_ordered_extent(), which later will
test and clear the ordered flag for the involved range.

However I do not think there is any other call site that can clear
ordered flags of an page cache folio and can affect
can_finish_ordered_extent().

There are limited *_clear_ordered() callers out of
can_finish_ordered_extent() function:

- btrfs_migrate_folio()
  This is completely unrelated, it's just migrating the ordered flag to
  the new folio.

- btrfs_cleanup_ordered_extents()
  We manually clean the ordered flags of all involved folios, then call
  btrfs_mark_ordered_io_finished() without a @folio parameter.
  So it doesn't need and didn't pass a @folio parameter in the first
  place.

- btrfs_writepage_fixup_worker()
  This function is going to be removed soon, and we should not hit that
  function anymore.

- btrfs_invalidate_folio()
  This is the real call site we need to bother with.

  If we already have a bio running, btrfs_finish_ordered_extent() in
  end_bbio_data_write() will be executed first, as
  btrfs_invalidate_folio() will wait for the writeback to finish.

  Thus if there is a running bio, it will not see the range has
  ordered flags, and just skip to the next range.

  If there is no bio running, meaning the ordered extent is created but
  the folio is not yet submitted.

  In that case btrfs_invalidate_folio() will manually clear the folio
  ordered range, but then manually finish the ordered extent with
  btrfs_dec_test_ordered_pending() without bothering the folio ordered
  flags.

  Meaning if the OE range with folio ordered flags will be finished
  manually without the need to call can_finish_ordered_extent().

This means all can_finish_ordered_extent() call sites should get a range
that has folio ordered flag set, thus the old "return false" branch
should never be triggered.

Now we can:

- Remove the @folio parameter from involved functions
  * btrfs_mark_ordered_io_finished()
  * btrfs_finish_ordered_extent()

  For call sites passing a @folio into those functions, let them
  manually clear the ordered flag of involved folios.

- Move btrfs_finish_ordered_extent() out of the loop in
  end_bbio_data_write()

  We only need to call btrfs_finish_ordered_extent() once per bbio,
  not per folio.

- Add an ASSERT() to make sure all folio ranges have ordered flags
  It's only for end_bbio_data_write().

  And we already have enough safe nets to catch over-accounting of ordered
  extents.

Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: remove the btrfs_inode parameter from btrfs_remove_ordered_extent()</title>
<updated>2026-04-07T16:55:57+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2026-02-11T22:49:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=afe60cdb3cb9495472b7feb10c5f2b31b7429956'/>
<id>urn:sha1:afe60cdb3cb9495472b7feb10c5f2b31b7429956</id>
<content type='text'>
We already have btrfs_ordered_extent::inode, thus there is no need to
pass a btrfs_inode parameter to btrfs_remove_ordered_extent().

Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: rename btrfs_ordered_extent::list to csum_list</title>
<updated>2026-04-07T16:55:56+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2026-02-10T03:24:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=09664971b33718680c20f94aaf5d18fc8d7b0a3c'/>
<id>urn:sha1:09664971b33718680c20f94aaf5d18fc8d7b0a3c</id>
<content type='text'>
That list head records all pending checksums for that ordered
extent. And unlike other lists, we just use the name "list", which can
be very confusing for readers.

Rename it to "csum_list" which follows the remaining lists, showing the
purpose of the list.

And since we're here, remove a comment inside
btrfs_finish_ordered_zoned() where we have
"ASSERT(!list_empty(&amp;ordered-&gt;csum_list))" to make sure the OE has
pending csums.

That comment is only here to make sure we do not call list_first_entry()
before checking BTRFS_ORDERED_PREALLOC.
But since we already have that bit checked and even have a dedicated
ASSERT(), there is no need for that comment anymore.

Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-6.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux</title>
<updated>2025-12-04T04:03:46+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-12-04T04:03:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7696286034ac72cf9b46499be1715ac62fd302c3'/>
<id>urn:sha1:7696286034ac72cf9b46499be1715ac62fd302c3</id>
<content type='text'>
Pull btrfs updates from David Sterba:
 "Features:

   - shutdown ioctl support (needs CONFIG_BTRFS_EXPERIMENTAL for now):
      - set filesystem state as being shut down (also named going down
        in other filesystems), where all active operations return EIO
        and this cannot be changed until unmount
      - pending operations are attempted to be finished but error
        messages may still show up depending on where exactly the
        shutdown happened

   - scrub (and device replace) vs suspend/hibernate:
      - a running scrub will prevent suspend, which can be annoying as
        suspend is an immediate request and scrub is not critical
      - filesystem freezing before suspend was not sufficient as the
        problem was in process freezing
      - behaviour change: on suspend scrub and device replace are
        cancelled, where scrub can record the last state and continue
        from there; the device replace has to be restarted from the
        beginning

   - zone stats exported in sysfs, from the perspective of the
     filesystem this includes active, reclaimable, relocation etc zones

  Performance:

   - improvements when processing space reservation tickets by
     optimizing locking and shrinking critical sections, cumulative
     improvements in lockstat numbers show +15%

  Notable fixes:

   - use vmalloc fallback when allocating bios as high order allocations
     can happen with wide checksums (like sha256)

   - scrub will always track the last position of progress so it's not
     starting from zero after an error

  Core:

   - under experimental config, checksum calculations are offloaded to
     process context, simplifies locking and allows to remove
     compression write worker kthread(s):
      - speed improvement in direct IO throughput with buffered IO
        fallback is +15% when not offloaded but this is more related to
        internal crypto subsystem improvements
      - this will be probably default in the future removing the sysfs
        tunable

   - (experimental) block size &gt; page size updates:
      - support more operations when not using large folios (encoded
        read/write and send)
      - raid56

   - more preparations for fscrypt support

  Other:

   - more conversions to auto-cleaned variables

   - parameter cleanups and removals

   - extended warning fixes

   - improved printing of structured values like keys

   - lots of other cleanups and refactoring"

* tag 'for-6.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (147 commits)
  btrfs: remove unnecessary inode key in btrfs_log_all_parents()
  btrfs: remove redundant zero/NULL initializations in btrfs_alloc_root()
  btrfs: remaining BTRFS_PATH_AUTO_FREE conversions
  btrfs: send: do not allocate memory for xattr data when checking it exists
  btrfs: send: add unlikely to all unexpected overflow checks
  btrfs: reduce arguments to btrfs_del_inode_ref_in_log()
  btrfs: remove root argument from btrfs_del_dir_entries_in_log()
  btrfs: use test_and_set_bit() in btrfs_delayed_delete_inode_ref()
  btrfs: don't search back for dir inode item in INO_LOOKUP_USER
  btrfs: don't rewrite ret from inode_permission
  btrfs: add orig_logical to btrfs_bio for encryption
  btrfs: disable verity on encrypted inodes
  btrfs: disable various operations on encrypted inodes
  btrfs: remove redundant level reset in btrfs_del_items()
  btrfs: simplify leaf traversal after path release in btrfs_next_old_leaf()
  btrfs: optimize balance_level() path reference handling
  btrfs: factor out root promotion logic into promote_child_to_root()
  btrfs: raid56: remove the "_step" infix
  btrfs: raid56: enable bs &gt; ps support
  btrfs: raid56: prepare finish_parity_scrub() to support bs &gt; ps cases
  ...
</content>
</entry>
<entry>
<title>btrfs: relax btrfs_inode::ordered_tree_lock IRQ locking context</title>
<updated>2025-11-24T21:42:21+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2025-10-23T22:02:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=39bc80216a3656d54d65cdda994f406aeb27c3da'/>
<id>urn:sha1:39bc80216a3656d54d65cdda994f406aeb27c3da</id>
<content type='text'>
We used IRQ version of spinlock for ordered_tree_lock, as
btrfs_finish_ordered_extent() can be called in end_bbio_data_write()
which was in IRQ context.

However since we're moving all the btrfs_bio::end_io() calls into task
context, there is no more need to support IRQ context thus we can relax
to regular spin_lock()/spin_unlock() for btrfs_inode::ordered_tree_lock.

Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: avoid repeated computations in btrfs_mark_ordered_io_finished()</title>
<updated>2025-11-24T20:59:09+00:00</updated>
<author>
<name>Filipe Manana</name>
<email>fdmanana@suse.com</email>
</author>
<published>2025-10-12T16:48:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f1ae05b8eaf5b2049ef0f6bfff4376f793adeb83'/>
<id>urn:sha1:f1ae05b8eaf5b2049ef0f6bfff4376f793adeb83</id>
<content type='text'>
We're computing a few values several times:

1) The current ordered extent's end offset inside the while loop, we have
   computed it and stored it in the 'entry_end' variable but then we
   compute it again later as the first argument to the min() macro;

2) The end file offset, open coded 3 times;

3) The current length (stored in variable 'len') computed 2 times, one
   inside an assertion and the other when assigning to the 'len' variable.

So use existing variables and add new ones to prevent repeating these
expressions and reduce the source code.

We were also subtracting one from the result of min() macro call and
then adding 1 back in the next line, making both operations pointless.
So just remove the decrement and increment by 1.

This also reduces very slightly the object code.

Before:

  $ size fs/btrfs/btrfs.ko
     text	   data	    bss	    dec	    hex	filename
  1916576	 161679	  15592	2093847	 1ff317	fs/btrfs/btrfs.ko

After:

  $ size fs/btrfs/btrfs.ko
     text	   data	    bss	    dec	    hex	filename
  1916556	 161679	  15592	2093827	 1ff303	fs/btrfs/btrfs.ko

Reviewed-by: Qu Wenruo &lt;wqu@suse.com&gt;
Reviewed-by: Anand Jain &lt;asj@kernel.org&gt;
Signed-off-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: truncate ordered extent when skipping writeback past i_size</title>
<updated>2025-11-24T20:59:08+00:00</updated>
<author>
<name>Filipe Manana</name>
<email>fdmanana@suse.com</email>
</author>
<published>2025-10-10T15:50:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=18de34daa7c62c830be533aace6b7c271e8e95cf'/>
<id>urn:sha1:18de34daa7c62c830be533aace6b7c271e8e95cf</id>
<content type='text'>
While running test case btrfs/192 from fstests with support for large
folios (needs CONFIG_BTRFS_EXPERIMENTAL=y) I ended up getting very sporadic
btrfs check failures reporting that csum items were missing. Looking into
the issue it turned out that btrfs check searches for csum items of a file
extent item with a range that spans beyond the i_size of a file and we
don't have any, because the kernel's writeback code skips submitting bios
for ranges beyond eof. It's not expected however to find a file extent item
that crosses the rounded up (by the sector size) i_size value, but there is
a short time window where we can end up with a transaction commit leaving
this small inconsistency between the i_size and the last file extent item.

Example btrfs check output when this happens:

  $ btrfs check /dev/sdc
  Opening filesystem to check...
  Checking filesystem on /dev/sdc
  UUID: 69642c61-5efb-4367-aa31-cdfd4067f713
  [1/8] checking log skipped (none written)
  [2/8] checking root items
  [3/8] checking extents
  [4/8] checking free space tree
  [5/8] checking fs roots
  root 5 inode 332 errors 1000, some csum missing
  ERROR: errors found in fs roots
  (...)

Looking at a tree dump of the fs tree (root 5) for inode 332 we have:

   $ btrfs inspect-internal dump-tree -t 5 /dev/sdc
   (...)
        item 28 key (332 INODE_ITEM 0) itemoff 2006 itemsize 160
                generation 17 transid 19 size 610969 nbytes 86016
                block group 0 mode 100666 links 1 uid 0 gid 0 rdev 0
                sequence 11 flags 0x0(none)
                atime 1759851068.391327881 (2025-10-07 16:31:08)
                ctime 1759851068.410098267 (2025-10-07 16:31:08)
                mtime 1759851068.410098267 (2025-10-07 16:31:08)
                otime 1759851068.391327881 (2025-10-07 16:31:08)
        item 29 key (332 INODE_REF 340) itemoff 1993 itemsize 13
                index 2 namelen 3 name: f1f
        item 30 key (332 EXTENT_DATA 589824) itemoff 1940 itemsize 53
                generation 19 type 1 (regular)
                extent data disk byte 21745664 nr 65536
                extent data offset 0 nr 65536 ram 65536
                extent compression 0 (none)
   (...)

We can see that the file extent item for file offset 589824 has a length of
64K and its number of bytes is 64K. Looking at the inode item we see that
its i_size is 610969 bytes which falls within the range of that file extent
item [589824, 655360[.

Looking into the csum tree:

  $ btrfs inspect-internal dump-tree /dev/sdc
  (...)
        item 15 key (EXTENT_CSUM EXTENT_CSUM 21565440) itemoff 991 itemsize 200
                range start 21565440 end 21770240 length 204800
           item 16 key (EXTENT_CSUM EXTENT_CSUM 1104576512) itemoff 983 itemsize 8
                range start 1104576512 end 1104584704 length 8192
  (..)

We see that the csum item number 15 covers the first 24K of the file extent
item - it ends at offset 21770240 and the extent's disk_bytenr is 21745664,
so we have:

   21770240 - 21745664 = 24K

We see that the next csum item (number 16) is completely outside the range,
so the remaining 40K of the extent doesn't have csum items in the tree.

If we round up the i_size to the sector size, we get:

   round_up(610969, 4096) = 614400

If we subtract from that the file offset for the extent item we get:

   614400 - 589824 = 24K

So the missing 40K corresponds to the end of the file extent item's range
minus the rounded up i_size:

   655360 - 614400 = 40K

Normally we don't expect a file extent item to span over the rounded up
i_size of an inode, since when truncating, doing hole punching and other
operations that trim a file extent item, the number of bytes is adjusted.

There is however a short time window where the kernel can end up,
temporarily,persisting an inode with an i_size that falls in the middle of
the last file extent item and the file extent item was not yet trimmed (its
number of bytes reduced so that it doesn't cross i_size rounded up by the
sector size).

The steps (in the kernel) that lead to such scenario are the following:

 1) We have inode I as an empty file, no allocated extents, i_size is 0;

 2) A buffered write is done for file range [589824, 655360[ (length of
    64K) and the i_size is updated to 655360. Note that we got a single
    large folio for the range (64K);

 3) A truncate operation starts that reduces the inode's i_size down to
    610969 bytes. The truncate sets the inode's new i_size at
    btrfs_setsize() by calling truncate_setsize() and before calling
    btrfs_truncate();

 4) At btrfs_truncate() we trigger writeback for the range starting at
    610304 (which is the new i_size rounded down to the sector size) and
    ending at (u64)-1;

 5) During the writeback, at extent_write_cache_pages(), we get from the
    call to filemap_get_folios_tag(), the 64K folio that starts at file
    offset 589824 since it contains the start offset of the writeback
    range (610304);

 6) At writepage_delalloc() we find the whole range of the folio is dirty
    and therefore we run delalloc for that 64K range ([589824, 655360[),
    reserving a 64K extent, creating an ordered extent, etc;

 7) At extent_writepage_io() we submit IO only for subrange [589824, 614400[
    because the inode's i_size is 610969 bytes (rounded up by sector size
    is 614400). There, in the while loop we intentionally skip IO beyond
    i_size to avoid any unnecessay work and just call
    btrfs_mark_ordered_io_finished() for the range [614400, 655360[ (which
    has a 40K length);

 8) Once the IO finishes we finish the ordered extent by ending up at
    btrfs_finish_one_ordered(), join transaction N, insert a file extent
    item in the inode's subvolume tree for file offset 589824 with a number
    of bytes of 64K, and update the inode's delayed inode item or directly
    the inode item with a call to btrfs_update_inode_fallback(), which
    results in storing the new i_size of 610969 bytes;

 9) Transaction N is committed either by the transaction kthread or some
    other task committed it (in response to a sync or fsync for example).

    At this point we have inode I persisted with an i_size of 610969 bytes
    and file extent item that starts at file offset 589824 and has a number
    of bytes of 64K, ending at an offset of 655360 which is beyond the
    i_size rounded up to the sector size (614400).

    --&gt; So after a crash or power failure here, the btrfs check program
        reports that error about missing checksum items for this inode, as
	it tries to lookup for checksums covering the whole range of the
	extent;

10) Only after transaction N is committed that at btrfs_truncate() the
    call to btrfs_start_transaction() starts a new transaction, N + 1,
    instead of joining transaction N. And it's with transaction N + 1 that
    it calls btrfs_truncate_inode_items() which updates the file extent
    item at file offset 589824 to reduce its number of bytes from 64K down
    to 24K, so that the file extent item's range ends at the i_size
    rounded up to the sector size (614400 bytes).

Fix this by truncating the ordered extent at extent_writepage_io() when we
skip writeback because the current offset in the folio is beyond i_size.
This ensures we don't ever persist a file extent item with a number of
bytes beyond the rounded up (by sector size) value of the i_size.

Reviewed-by: Qu Wenruo &lt;wqu@suse.com&gt;
Reviewed-by: Anand Jain &lt;asj@kernel.org&gt;
Signed-off-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
</feed>
