| author | Robbie Ko <robbieko@synology.com> | 2026-05-01 05:41:56 +0300 |
|---|---|---|
| committer | David Sterba <dsterba@suse.com> | 2026-05-08 01:32:08 +0300 |
| commit | c562ba61fc5e11798720acc1b172862158f1fa0b (patch) | |
| tree | fef7083b9ef768db64309ff1fb3483ac676c3c8c /drivers/gpu/tests/git@radix-linux.su:pub | |
| parent | 4066c55e109475a06d18a1f127c939d551211956 (diff) | |
| download | linux-c562ba61fc5e11798720acc1b172862158f1fa0b.tar.xz | |
btrfs: fix incorrect i_size after remount caused by KEEP_SIZE prealloc gap
When fallocate() with FALLOC_FL_KEEP_SIZE preallocates an extent past the
current i_size, the file_extent_tree of the inode is updated to cover
that range. However, on the next mount, btrfs_read_locked_inode() only
re-populates file_extent_tree with [0, round_up(i_size, sectorsize)),
losing the marks that belonged to the KEEP_SIZE prealloc extent beyond
i_size.
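
The loss on inode load can be illustrated with a small user-space model. This is a sketch, not kernel code: the helper names `ranges_restored_on_load` and `lost_on_load` are hypothetical, the tree is modeled as a list of half-open byte ranges, and a 4096-byte sector size is assumed.

```python
SECTORSIZE = 4096      # assumed sector size, for illustration only
MIB = 1024 * 1024


def round_up(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def ranges_restored_on_load(i_size, sectorsize=SECTORSIZE):
    """Model of inode load: only [0, round_up(i_size, sectorsize)) is re-marked."""
    end = round_up(i_size, sectorsize)
    return [(0, end)] if end > 0 else []


def lost_on_load(marked, i_size, sectorsize=SECTORSIZE):
    """Parts of previously marked ranges that fall beyond the restored range."""
    limit = round_up(i_size, sectorsize)
    return [(max(start, limit), end) for start, end in marked if end > limit]


# State modeled after a KEEP_SIZE prealloc of [4M, 8M) on an empty file:
# i_size stays 0, but the tree covers the prealloc extent.
before_umount = [(4 * MIB, 8 * MIB)]

print(ranges_restored_on_load(0))        # []
print(lost_on_load(before_umount, 0))    # [(4194304, 8388608)]
```

In this model the whole prealloc range is missing after the remount, because nothing beyond the rounded-up i_size is restored.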
Later, when a non-KEEP_SIZE fallocate() extends i_size into or past that
old prealloc extent, the reservation loop in btrfs_fallocate() skips the
segments that are already preallocated and never reaches the path that
marks them in file_extent_tree, so a gap remains in file_extent_tree across
[old_aligned_i_size, start_of_new_alloc). Then __btrfs_prealloc_file_range()
calls btrfs_inode_safe_disk_i_size_write(), which uses
find_contiguous_extent_bit() starting at offset 0 to derive disk_i_size.
The walk stops at the gap, so disk_i_size ends up smaller than i_size and
that smaller value is persisted. After the next mount, the file shows the
wrong (smaller) size.
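
The effect of the gap on the contiguous walk can be sketched as follows. This is a user-space model with hypothetical function names, not the kernel implementation. The marked ranges `[0, 7M)` and `[8M, 9M)` are an assumption, chosen to be consistent with a gap that starts at 7M and a new i_size of 9M.

```python
MIB = 1024 * 1024


def contiguous_end_from_zero(marked):
    """End of the contiguous marked run that starts at offset 0.

    marked is a list of half-open (start, end) byte ranges.
    Returns 0 when offset 0 itself is not marked.
    """
    end = 0
    for start, stop in sorted(marked):
        if start > end:        # first gap reached: the walk stops here
            break
        end = max(end, stop)
    return end


def safe_disk_i_size(marked, i_size):
    """Model of deriving disk_i_size: never beyond the contiguous run."""
    return min(i_size, contiguous_end_from_zero(marked))


# Assumed tree contents with a stale gap at [7M, 8M).
marked = [(0, 7 * MIB), (8 * MIB, 9 * MIB)]
i_size = 9 * MIB

print(safe_disk_i_size(marked, i_size) // MIB)   # 7
```

The in-memory i_size is 9M, but the value derived for persistence is cut off at the first unmarked byte.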
The following reproducer triggers the problem:

   $ cat test.sh
   MNT=/mnt/sdi
   DEV=/dev/sdi

   mkdir -p $MNT
   mkfs.btrfs -f -O ^no-holes $DEV
   mount $DEV $MNT

   touch $MNT/file1

   # KEEP_SIZE prealloc beyond i_size (i_size stays 0)
   fallocate -n -o 4M -l 4M $MNT/file1

   umount $MNT
   mount $DEV $MNT

   # non-KEEP_SIZE fallocate that overlaps the previous prealloc tail
   # and extends past it
   fallocate -o 7M -l 2M $MNT/file1
   ls -lh $MNT/file1

   umount $MNT
   mount $DEV $MNT

   ls -lh $MNT/file1

   umount $MNT
Running the reproducer gives the following result:

   $ ./test.sh
   (...)
   -rw-rw-r-- 1 root root 9.0M May 4 16:35 /mnt/sdi/file1
   -rw-rw-r-- 1 root root 7.0M May 4 16:35 /mnt/sdi/file1

The size reported before the final remount is correct (9M), but after
that remount it drops to 7M, i.e. the start of the gap inside
file_extent_tree.

Fix this in __btrfs_prealloc_file_range() by marking the entire range
[round_down(old_i_size, sectorsize), round_up(new_i_size, sectorsize))
in file_extent_tree before updating i_size and calling
btrfs_inode_safe_disk_i_size_write(). This ensures the contiguous bit
search starting from 0 is not truncated by a stale gap left behind by a
previous KEEP_SIZE prealloc that was not restored on inode load.
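
A minimal model of the fix, under the same assumptions as the description above (user-space Python, hypothetical helper names, half-open byte ranges, 4096-byte sectors). The stale tree contents `[0, 7M)` and `[8M, 9M)` and the `old_i_size` of 7M are illustrative values, not taken from a trace.

```python
SECTORSIZE = 4096      # assumed sector size, for illustration only
MIB = 1024 * 1024


def round_down(value, align):
    return value // align * align


def round_up(value, align):
    return (value + align - 1) // align * align


def mark_range(marked, start, end):
    """Add [start, end) to the model tree and merge touching ranges."""
    ranges = sorted(marked + ([(start, end)] if start < end else []))
    merged = []
    for s, e in ranges:
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def prealloc_update(marked, old_i_size, new_i_size, sectorsize=SECTORSIZE):
    """Mark the whole old-to-new i_size span first, then derive disk_i_size."""
    marked = mark_range(marked,
                        round_down(old_i_size, sectorsize),
                        round_up(new_i_size, sectorsize))
    # After merging, the run from offset 0 (if any) is the first range.
    run_end = marked[0][1] if marked and marked[0][0] == 0 else 0
    return marked, min(new_i_size, run_end)


stale = [(0, 7 * MIB), (8 * MIB, 9 * MIB)]          # gap at [7M, 8M)
marked, disk_i_size = prealloc_update(stale, 7 * MIB, 9 * MIB)

print(marked)                 # [(0, 9437184)]
print(disk_i_size // MIB)     # 9
```

Because the marked span starts at the rounded-down old i_size, any stale gap between the old and new i_size is covered before the search from offset 0 runs.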
The fix has no effect when the NO_HOLES feature is enabled because
btrfs_inode_safe_disk_i_size_write() and btrfs_inode_set_file_extent_range()
both take the fast path that directly tracks disk_i_size without
consulting file_extent_tree.
Fixes: 9ddc959e802b ("btrfs: use the file extent tree infrastructure")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Robbie Ko <robbieko@synology.com>
[ Minor updates to the change log ]
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Diffstat (limited to 'drivers/gpu/tests/git@radix-linux.su:pub')
0 files changed, 0 insertions, 0 deletions
