<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/ext4/indirect.c, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-10-19T14:33:59+00:00</updated>
<entry>
<title>ext4: add ext4_sb_bread_nofail() helper function for ext4_free_branches()</title>
<updated>2025-10-19T14:33:59+00:00</updated>
<author>
<name>Baokun Li</name>
<email>libaokun1@huawei.com</email>
</author>
<published>2025-08-21T13:38:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9169ef838d0c05e32397247a17694902423df69c'/>
<id>urn:sha1:9169ef838d0c05e32397247a17694902423df69c</id>
<content type='text'>
commit d8b90e6387a74bcb1714c8d1e6a782ff709de9a9 upstream.

The implicit __GFP_NOFAIL flag in ext4_sb_bread() was removed in commit
8a83ac54940d ("ext4: call bdev_getblk() from sb_getblk_gfp()"), meaning
the function can now fail under memory pressure.

Most callers of ext4_sb_bread() propagate the error to userspace and do not
remount the filesystem read-only. However, ext4_free_branches() handles
ext4_sb_bread() failure by remounting the filesystem read-only.

This implies that an ext3 filesystem (mounted via the ext4 driver) could be
forcibly remounted read-only due to a transient page allocation failure,
which is unacceptable.

To mitigate this, introduce a new helper function, ext4_sb_bread_nofail(),
which explicitly uses __GFP_NOFAIL, and use it in ext4_free_branches().

Fixes: 8a83ac54940d ("ext4: call bdev_getblk() from sb_getblk_gfp()")
Cc: stable@kernel.org
Signed-off-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ext4: fix hole length calculation overflow in non-extent inodes</title>
<updated>2025-08-28T14:30:56+00:00</updated>
<author>
<name>Zhang Yi</name>
<email>yi.zhang@huawei.com</email>
</author>
<published>2025-08-11T06:45:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=275c88b797f284c272bde74f1cc6331277212880'/>
<id>urn:sha1:275c88b797f284c272bde74f1cc6331277212880</id>
<content type='text'>
commit 02c7f7219ac0e2277b3379a3a0e9841ef464b6d4 upstream.

In a filesystem with a block size larger than 4KB, the hole length
calculation for a non-extent inode in ext4_ind_map_blocks() can easily
exceed INT_MAX. Then it could return a zero length hole and trigger the
following waring and infinite in the iomap infrastructure.

  ------------[ cut here ]------------
  WARNING: CPU: 3 PID: 434101 at fs/iomap/iter.c:34 iomap_iter_done+0x148/0x190
  CPU: 3 UID: 0 PID: 434101 Comm: fsstress Not tainted 6.16.0-rc7+ #128 PREEMPT(voluntary)
  Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
  pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : iomap_iter_done+0x148/0x190
  lr : iomap_iter+0x174/0x230
  sp : ffff8000880af740
  x29: ffff8000880af740 x28: ffff0000db8e6840 x27: 0000000000000000
  x26: 0000000000000000 x25: ffff8000880af830 x24: 0000004000000000
  x23: 0000000000000002 x22: 000001bfdbfa8000 x21: ffffa6a41c002e48
  x20: 0000000000000001 x19: ffff8000880af808 x18: 0000000000000000
  x17: 0000000000000000 x16: ffffa6a495ee6cd0 x15: 0000000000000000
  x14: 00000000000003d4 x13: 00000000fa83b2da x12: 0000b236fc95f18c
  x11: ffffa6a4978b9c08 x10: 0000000000001da0 x9 : ffffa6a41c1a2a44
  x8 : ffff8000880af5c8 x7 : 0000000001000000 x6 : 0000000000000000
  x5 : 0000000000000004 x4 : 000001bfdbfa8000 x3 : 0000000000000000
  x2 : 0000000000000000 x1 : 0000004004030000 x0 : 0000000000000000
  Call trace:
   iomap_iter_done+0x148/0x190 (P)
   iomap_iter+0x174/0x230
   iomap_fiemap+0x154/0x1d8
   ext4_fiemap+0x110/0x140 [ext4]
   do_vfs_ioctl+0x4b8/0xbc0
   __arm64_sys_ioctl+0x8c/0x120
   invoke_syscall+0x6c/0x100
   el0_svc_common.constprop.0+0x48/0xf0
   do_el0_svc+0x24/0x38
   el0_svc+0x38/0x120
   el0t_64_sync_handler+0x10c/0x138
   el0t_64_sync+0x198/0x1a0
  ---[ end trace 0000000000000000 ]---

Cc: stable@kernel.org
Fixes: facab4d9711e ("ext4: return hole from ext4_map_blocks()")
Reported-by: Qu Wenruo &lt;wqu@suse.com&gt;
Closes: https://lore.kernel.org/linux-ext4/9b650a52-9672-4604-a765-bb6be55d1e4a@gmx.com/
Tested-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Link: https://patch.msgid.link/20250811064532.1788289-1-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ext4: fix race in buffer_head read fault injection</title>
<updated>2024-12-05T13:01:15+00:00</updated>
<author>
<name>Long Li</name>
<email>leo.lilong@huawei.com</email>
</author>
<published>2024-09-06T09:17:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=61832ee7fa2fbd569d129379e795038abfb0d128'/>
<id>urn:sha1:61832ee7fa2fbd569d129379e795038abfb0d128</id>
<content type='text'>
[ Upstream commit 2f3d93e210b9c2866c8b3662adae427d5bf511ec ]

When I enabled ext4 debug for fault injection testing, I encountered the
following warning:

  EXT4-fs error (device sda): ext4_read_inode_bitmap:201: comm fsstress:
         Cannot read inode bitmap - block_group = 8, inode_bitmap = 1051
  WARNING: CPU: 0 PID: 511 at fs/buffer.c:1181 mark_buffer_dirty+0x1b3/0x1d0

The root cause of the issue lies in the improper implementation of ext4's
buffer_head read fault injection. The actual completion of buffer_head
read and the buffer_head fault injection are not atomic, which can lead
to the uptodate flag being cleared on normally used buffer_heads in race
conditions.

[CPU0]           [CPU1]         [CPU2]
ext4_read_inode_bitmap
  ext4_read_bh()
  &lt;bh read complete&gt;
                 ext4_read_inode_bitmap
                   if (buffer_uptodate(bh))
                     return bh
                               jbd2_journal_commit_transaction
                                 __jbd2_journal_refile_buffer
                                   __jbd2_journal_unfile_buffer
                                     __jbd2_journal_temp_unlink_buffer
  ext4_simulate_fail_bh()
    clear_buffer_uptodate
                                      mark_buffer_dirty
                                        &lt;report warning&gt;
                                        WARN_ON_ONCE(!buffer_uptodate(bh))

The best approach would be to perform fault injection in the IO completion
callback function, rather than after IO completion. However, the IO
completion callback function cannot get the fault injection code in sb.

Fix it by passing the result of fault injection into the bh read function,
we simulate faults within the bh read function itself. This requires adding
an extra parameter to the bh read functions that need fault injection.

Fixes: 46f870d690fe ("ext4: simulate various I/O and checksum errors when reading metadata")
Signed-off-by: Long Li &lt;leo.lilong@huawei.com&gt;
Link: https://patch.msgid.link/20240906091746.510163-1-leo.lilong@huawei.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>ext4: update delalloc data reserve spcae in ext4_es_insert_extent()</title>
<updated>2024-09-02T19:26:14+00:00</updated>
<author>
<name>Zhang Yi</name>
<email>yi.zhang@huawei.com</email>
</author>
<published>2024-08-13T12:34:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c543e2429640293d9eda8c7841d4b5d5e8682826'/>
<id>urn:sha1:c543e2429640293d9eda8c7841d4b5d5e8682826</id>
<content type='text'>
Now that we update data reserved space for delalloc after allocating
new blocks in ext4_{ind|ext}_map_blocks(), and if bigalloc feature is
enabled, we also need to query the extents_status tree to calculate the
exact reserved clusters. This is complicated now and it appears that
it's better to do this job in ext4_es_insert_extent(), because
__es_remove_extent() have already count delalloc blocks when removing
delalloc extents and __revise_pending() return new adding pending count,
we could update the reserved blocks easily in ext4_es_insert_extent().

We direct reduce the reserved cluster count when replacing a delalloc
extent. However, thers are two special cases need to concern about the
quota claiming when doing direct block allocation (e.g. from fallocate).

A),
fallocate a range that covers a delalloc extent but start with
non-delayed allocated blocks, e.g. a hole.

  hhhhhhh+ddddddd+ddddddd
  ^^^^^^^^^^^^^^^^^^^^^^^  fallocate this range

Current ext4_map_blocks() can't always trim the extent since it may
release i_data_sem before calling ext4_map_create_blocks() and raced by
another delayed allocation. Hence the EXT4_GET_BLOCKS_DELALLOC_RESERVE
may not set even when we are replacing a delalloc extent, without this
flag set, the quota has already been claimed by ext4_mb_new_blocks(), so
we should release the quota reservations instead of claim them again.

B),
bigalloc feature is enabled, fallocate a range that contains non-delayed
allocated blocks.

  |&lt;         one cluster       &gt;|
  hhhhhhh+hhhhhhh+hhhhhhh+ddddddd
  ^^^^^^^  fallocate this range

This case is similar to above case, the EXT4_GET_BLOCKS_DELALLOC_RESERVE
flag is also not set.

Hence we should release the quota reservations if we replace a delalloc
extent but without EXT4_GET_BLOCKS_DELALLOC_RESERVE set.

Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Link: https://patch.msgid.link/20240813123452.2824659-7-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: remove unnecessary parameter "needed" in ext4_discard_preallocations</title>
<updated>2024-01-18T15:52:45+00:00</updated>
<author>
<name>Kemeng Shi</name>
<email>shikemeng@huaweicloud.com</email>
</author>
<published>2024-01-05T09:21:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2ffd2a6ad1d3a8213bb5805f45f49098fe615db1'/>
<id>urn:sha1:2ffd2a6ad1d3a8213bb5805f45f49098fe615db1</id>
<content type='text'>
The "needed" controls the number of ext4_prealloc_space to discard in
ext4_discard_preallocations. Function ext4_discard_preallocations is
supposed to discard all non-used preallocated blocks when "needed"
is 0 and now ext4_discard_preallocations is always called with "needed"
= 0. Remove unnecessary parameter "needed" and remove all non-used
preallocated spaces in ext4_discard_preallocations to simplify the
code.

Note: If count of non-used preallocated spaces could be more than
UINT_MAX, there was a memory leak as some non-used preallocated
spaces are left ununsed and this commit will fix it. Otherwise,
there is no behavior change.

Signed-off-by: Kemeng Shi &lt;shikemeng@huaweicloud.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20240105092102.496631-9-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: only update i_reserved_data_blocks on successful block allocation</title>
<updated>2023-06-26T23:34:56+00:00</updated>
<author>
<name>Baokun Li</name>
<email>libaokun1@huawei.com</email>
</author>
<published>2023-04-24T03:38:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=de25d6e9610a8b30cce9bbb19b50615d02ebca02'/>
<id>urn:sha1:de25d6e9610a8b30cce9bbb19b50615d02ebca02</id>
<content type='text'>
In our fault injection test, we create an ext4 file, migrate it to
non-extent based file, then punch a hole and finally trigger a WARN_ON
in the ext4_da_update_reserve_space():

EXT4-fs warning (device sda): ext4_da_update_reserve_space:369:
ino 14, used 11 with only 10 reserved data blocks

When writing back a non-extent based file, if we enable delalloc, the
number of reserved blocks will be subtracted from the number of blocks
mapped by ext4_ind_map_blocks(), and the extent status tree will be
updated. We update the extent status tree by first removing the old
extent_status and then inserting the new extent_status. If the block range
we remove happens to be in an extent, then we need to allocate another
extent_status with ext4_es_alloc_extent().

       use old    to remove   to add new
    |----------|------------|------------|
              old extent_status

The problem is that the allocation of a new extent_status failed due to a
fault injection, and __es_shrink() did not get free memory, resulting in
a return of -ENOMEM. Then do_writepages() retries after receiving -ENOMEM,
we map to the same extent again, and the number of reserved blocks is again
subtracted from the number of blocks in that extent. Since the blocks in
the same extent are subtracted twice, we end up triggering WARN_ON at
ext4_da_update_reserve_space() because used &gt; ei-&gt;i_reserved_data_blocks.

For non-extent based file, we update the number of reserved blocks after
ext4_ind_map_blocks() is executed, which causes a problem that when we call
ext4_ind_map_blocks() to create a block, it doesn't always create a block,
but we always reduce the number of reserved blocks. So we move the logic
for updating reserved blocks to ext4_ind_map_blocks() to ensure that the
number of reserved blocks is updated only after we do succeed in allocating
some new blocks.

Fixes: 5f634d064c70 ("ext4: Fix quota accounting error with fallocate")
Cc: stable@kernel.org
Signed-off-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20230424033846.4732-2-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: fix error code return to user-space in ext4_get_branch()</title>
<updated>2022-12-09T02:49:24+00:00</updated>
<author>
<name>Luís Henriques</name>
<email>lhenriques@suse.de</email>
</author>
<published>2022-11-09T18:14:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=26d75a16af285a70863ba6a81f85d81e7e65da50'/>
<id>urn:sha1:26d75a16af285a70863ba6a81f85d81e7e65da50</id>
<content type='text'>
If a block is out of range in ext4_get_branch(), -ENOMEM will be returned
to user-space.  Obviously, this error code isn't really useful.  This
patch fixes it by making sure the right error code (-EFSCORRUPTED) is
propagated to user-space.  EUCLEAN is more informative than ENOMEM.

Signed-off-by: Luís Henriques &lt;lhenriques@suse.de&gt;
Link: https://lore.kernel.org/r/20221109181445.17843-1-lhenriques@suse.de
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
</content>
</entry>
<entry>
<title>ext4: use ext4_debug() instead of jbd_debug()</title>
<updated>2022-08-03T03:52:19+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2022-06-08T11:23:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4978c659e7b5c1926cdb4b556e4ca1fd2de8ad42'/>
<id>urn:sha1:4978c659e7b5c1926cdb4b556e4ca1fd2de8ad42</id>
<content type='text'>
We use jbd_debug() in some places in ext4. It seems a bit strange to use
jbd2 debugging output function for ext4 code. Also these days
ext4_debug() uses dynamic printk so each debug message can be enabled /
disabled on its own so the time when it made some sense to have these
combined (to allow easier common selecting of messages to report) has
passed. Just convert all jbd_debug() uses in ext4 to ext4_debug().

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-by: Lukas Czerner &lt;lczerner@redhat.com&gt;
Link: https://lore.kernel.org/r/20220608112355.4397-1-jack@suse.cz
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>fs/ext4: fix comments mentioning i_mutex</title>
<updated>2022-02-03T15:57:53+00:00</updated>
<author>
<name>hongnanli</name>
<email>hongnan.li@linux.alibaba.com</email>
</author>
<published>2022-01-21T07:06:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f340b3d9027485945d59f9c04f1e33070b02cae2'/>
<id>urn:sha1:f340b3d9027485945d59f9c04f1e33070b02cae2</id>
<content type='text'>
inode-&gt;i_mutex has been replaced with inode-&gt;i_rwsem long ago. Fix
comments still mentioning i_mutex.

Signed-off-by: hongnanli &lt;hongnan.li@linux.alibaba.com&gt;
Link: https://lore.kernel.org/r/20220121070611.21618-1-hongnan.li@linux.alibaba.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: Support for checksumming from journal triggers</title>
<updated>2021-08-31T03:36:50+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2021-08-16T09:57:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=188c299e2a26cc33747187f87c9e044dfd85a782'/>
<id>urn:sha1:188c299e2a26cc33747187f87c9e044dfd85a782</id>
<content type='text'>
JBD2 layer support triggers which are called when journaling layer moves
buffer to a certain state. We can use the frozen trigger, which gets
called when buffer data is frozen and about to be written out to the
journal, to compute block checksums for some buffer types (similarly as
does ocfs2). This avoids unnecessary repeated recomputation of the
checksum (at the cost of larger window where memory corruption won't be
caught by checksumming) and is even necessary when there are
unsynchronized updaters of the checksummed data.

So add superblock and journal trigger type arguments to
ext4_journal_get_write_access() and ext4_journal_get_create_access() so
that frozen triggers can be set accordingly. Also add inode argument to
ext4_walk_page_buffers() and all the callbacks used with that function
for the same purpose. This patch is mostly only a change of prototype of
the above mentioned functions and a few small helpers. Real checksumming
will come later.

Reviewed-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20210816095713.16537-1-jack@suse.cz
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
</feed>
