<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/ext4/fast_commit.c, branch v7.0-rc7</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0-rc7</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0-rc7'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-28T03:38:01+00:00</updated>
<entry>
<title>ext4: fix iloc.bh leak in ext4_fc_replay_inode() error paths</title>
<updated>2026-03-28T03:38:01+00:00</updated>
<author>
<name>Baokun Li</name>
<email>libaokun@linux.alibaba.com</email>
</author>
<published>2026-03-23T06:08:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ec0a7500d8eace5b4f305fa0c594dd148f0e8d29'/>
<id>urn:sha1:ec0a7500d8eace5b4f305fa0c594dd148f0e8d29</id>
<content type='text'>
During code review, Joseph found that ext4_fc_replay_inode() calls
ext4_get_fc_inode_loc() to get the inode location, which holds a
reference to iloc.bh that must be released via brelse().

However, several error paths jump to the 'out' label without
releasing iloc.bh:

 - ext4_handle_dirty_metadata() failure
 - sync_dirty_buffer() failure
 - ext4_mark_inode_used() failure
 - ext4_iget() failure

Fix this by introducing an 'out_brelse' label placed just before
the existing 'out' label to ensure iloc.bh is always released.

Additionally, make ext4_fc_replay_inode() propagate errors
properly instead of always returning 0.

Reported-by: Joseph Qi &lt;joseph.qi@linux.alibaba.com&gt;
Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
Signed-off-by: Baokun Li &lt;libaokun@linux.alibaba.com&gt;
Reviewed-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://patch.msgid.link/20260323060836.3452660-1-libaokun@linux.alibaba.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
</content>
</entry>
<entry>
<title>ext4: publish jinode after initialization</title>
<updated>2026-03-28T03:32:07+00:00</updated>
<author>
<name>Li Chen</name>
<email>me@linux.beauty</email>
</author>
<published>2026-02-25T08:26:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1aec30021edd410b986c156f195f3d23959a9d11'/>
<id>urn:sha1:1aec30021edd410b986c156f195f3d23959a9d11</id>
<content type='text'>
ext4_inode_attach_jinode() publishes ei-&gt;jinode to concurrent users.
It used to set ei-&gt;jinode before jbd2_journal_init_jbd_inode(),
allowing a reader to observe a non-NULL jinode with i_vfs_inode
still unset.

The fast commit flush path can then pass this jinode to
jbd2_wait_inode_data(), which dereferences i_vfs_inode-&gt;i_mapping and
may crash.

Below is the crash I observe:
```
BUG: unable to handle page fault for address: 000000010beb47f4
PGD 110e51067 P4D 110e51067 PUD 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 1 UID: 0 PID: 4850 Comm: fc_fsync_bench_ Not tainted 6.18.0-00764-g795a690c06a5 #1 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-2-2 04/01/2014
RIP: 0010:xas_find_marked+0x3d/0x2e0
Code: e0 03 48 83 f8 02 0f 84 f0 01 00 00 48 8b 47 08 48 89 c3 48 39 c6 0f 82 fd 01 00 00 48 85 c9 74 3d 48 83 f9 03 77 63 4c 8b 0f &lt;49&gt; 8b 71 08 48 c7 47 18 00 00 00 00 48 89 f1 83 e1 03 48 83 f9 02
RSP: 0018:ffffbbee806e7bf0 EFLAGS: 00010246
RAX: 000000000010beb4 RBX: 000000000010beb4 RCX: 0000000000000003
RDX: 0000000000000001 RSI: 0000002000300000 RDI: ffffbbee806e7c10
RBP: 0000000000000001 R08: 0000002000300000 R09: 000000010beb47ec
R10: ffff9ea494590090 R11: 0000000000000000 R12: 0000002000300000
R13: ffffbbee806e7c90 R14: ffff9ea494513788 R15: ffffbbee806e7c88
FS: 00007fc2f9e3e6c0(0000) GS:ffff9ea6b1444000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000010beb47f4 CR3: 0000000119ac5000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
&lt;TASK&gt;
filemap_get_folios_tag+0x87/0x2a0
__filemap_fdatawait_range+0x5f/0xd0
? srso_alias_return_thunk+0x5/0xfbef5
? __schedule+0x3e7/0x10c0
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
? srso_alias_return_thunk+0x5/0xfbef5
? cap_safe_nice+0x37/0x70
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
? srso_alias_return_thunk+0x5/0xfbef5
filemap_fdatawait_range_keep_errors+0x12/0x40
ext4_fc_commit+0x697/0x8b0
? ext4_file_write_iter+0x64b/0x950
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
? srso_alias_return_thunk+0x5/0xfbef5
? vfs_write+0x356/0x480
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
ext4_sync_file+0xf7/0x370
do_fsync+0x3b/0x80
? syscall_trace_enter+0x108/0x1d0
__x64_sys_fdatasync+0x16/0x20
do_syscall_64+0x62/0x2c0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
...
```

Fix this by initializing the jbd2_inode first.
Use smp_wmb() and WRITE_ONCE() to publish ei-&gt;jinode after
initialization. Readers use READ_ONCE() to fetch the pointer.

Fixes: a361293f5fede ("jbd2: Fix oops in jbd2_journal_file_inode()")
Cc: stable@vger.kernel.org
Signed-off-by: Li Chen &lt;me@linux.beauty&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://patch.msgid.link/20260225082617.147957-1-me@linux.beauty
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
</content>
</entry>
<entry>
<title>ext4: fast commit: make s_fc_lock reclaim-safe</title>
<updated>2026-01-20T03:46:05+00:00</updated>
<author>
<name>Li Chen</name>
<email>me@linux.beauty</email>
</author>
<published>2026-01-06T12:06:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=491f2927ae097e2d405afe0b3fe841931ab8aad2'/>
<id>urn:sha1:491f2927ae097e2d405afe0b3fe841931ab8aad2</id>
<content type='text'>
s_fc_lock can be acquired from inode eviction and thus is
reclaim unsafe. Since the fast commit path holds s_fc_lock while writing
the commit log, allocations under the lock can enter reclaim and invert
the lock order with fs_reclaim. Add ext4_fc_lock()/ext4_fc_unlock()
helpers which acquire s_fc_lock under memalloc_nofs_save()/restore()
context and use them everywhere so allocations under the lock cannot
recurse into filesystem reclaim.

Fixes: 6593714d67ba ("ext4: hold s_fc_lock while during fast commit")
Signed-off-by: Li Chen &lt;me@linux.beauty&gt;
Reviewed-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Reviewed-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://patch.msgid.link/20260106120621.440126-1-me@linux.beauty
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: mark move extents fast-commit ineligible</title>
<updated>2026-01-20T00:26:35+00:00</updated>
<author>
<name>Li Chen</name>
<email>me@linux.beauty</email>
</author>
<published>2025-12-11T11:51:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=690558921d9f9388c6bc83610451d8cb393e4d88'/>
<id>urn:sha1:690558921d9f9388c6bc83610451d8cb393e4d88</id>
<content type='text'>
Fast commits only log operations that have dedicated replay support.
EXT4_IOC_MOVE_EXT swaps extents between regular files and may copy
data, rewriting the affected inodes' block mapping layout without
going through the fast commit tracking paths.
In practice these operations are rare and usually followed by further
updates, but mixing them into a fast commit makes the overall
semantics harder to reason about and risks replay gaps if new call
sites appear.

Teach ext4 to mark the filesystem fast-commit ineligible for the
journal transactions used by move_extent_per_page() when
EXT4_IOC_MOVE_EXT runs.
This forces those transactions to fall back to a full commit,
ensuring that these multi-inode extent swaps are captured by the
normal journal rather than partially encoded in fast commit TLVs.
This change should not affect common workloads but makes online
defragmentation safer and easier to reason about under fast commit.

Testing:
1. prepare:
        dd if=/dev/zero of=/root/fc_move.img bs=1M count=0 seek=256
        mkfs.ext4 -O fast_commit -F /root/fc_move.img
        mkdir -p /mnt/fc_move &amp;&amp; mount -t ext4 -o loop \
/root/fc_move.img /mnt/fc_move
2. Created two files, ran EXT4_IOC_MOVE_EXT via e4defrag, and checked
   the ineligible reason statistics:
        fallocate -l 64M /mnt/fc_move/file1
        cp /mnt/fc_move/file1 /mnt/fc_move/file2
        e4defrag /mnt/fc_move/file1
        cat /proc/fs/ext4/loop0/fc_info
   shows "Move extents": &gt; 0 and fc stats ineligible &gt; 0.

Signed-off-by: Li Chen &lt;me@linux.beauty&gt;
Link: https://patch.msgid.link/20251211115146.897420-4-me@linux.beauty
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: mark fs-verity enable fast-commit ineligible</title>
<updated>2026-01-20T00:26:35+00:00</updated>
<author>
<name>Li Chen</name>
<email>me@linux.beauty</email>
</author>
<published>2025-12-11T11:51:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=16d43b9748c655b36a675cc55789f40fd827e9b1'/>
<id>urn:sha1:16d43b9748c655b36a675cc55789f40fd827e9b1</id>
<content type='text'>
Fast commits only log operations that have dedicated replay support.
Enabling fs-verity builds a Merkle tree and updates inode and orphan
state in ways that are not described by the fast commit replay tags.
In practice these operations are rare and usually followed by further
updates, but mixing them into a fast commit makes the overall
semantics harder to reason about and risks replay gaps if new call
sites appear.

Teach ext4 to mark the filesystem fast-commit ineligible when
ext4_end_enable_verity() starts its journal transaction.
This forces that transaction to fall back to a full commit, ensuring
that the fs-verity enable changes are captured by the normal journal
rather than partially encoded in fast commit TLVs.
This change should not affect common workloads but makes fs-verity
enable safer and easier to reason about under fast commit.

Testing:
1. prepare:
    dd if=/dev/zero of=/root/fc_verity.img bs=1M count=0 seek=128
    mkfs.ext4 -O fast_commit,verity -F /root/fc_verity.img
    mkdir -p /mnt/fc_verity &amp;&amp; mount -t ext4 -o loop /root/fc_verity.img /mnt/fc_verity
2. Enabled fs-verity on a file and verified reason accounting:
    echo "data" &gt; /mnt/fc_verity/verityfile
    /root/enable_verity /mnt/fc_verity/verityfile
    sync
    tail -n 1 /proc/fs/ext4/loop0/fc_info
    "fs-verity enable":     1
3. Enabled fs-verity on a second file, fsynced it, and checked that the
   ineligible commit counter is updated too:
    echo "data2" &gt; /mnt/fc_verity/verityfile2
    /root/enable_verity /mnt/fc_verity/verityfile2
    /root/fsync_file /mnt/fc_verity/verityfile2
    sync
    /proc/fs/ext4/loop0/fc_info shows "fs-verity enable" incremented and
    fc stats ineligible increased accordingly.

Signed-off-by: Li Chen &lt;me@linux.beauty&gt;
Link: https://patch.msgid.link/20251211115146.897420-3-me@linux.beauty
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: mark inode format migration fast-commit ineligible</title>
<updated>2026-01-20T00:26:35+00:00</updated>
<author>
<name>Li Chen</name>
<email>me@linux.beauty</email>
</author>
<published>2025-12-11T11:51:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=87e79fa122bc9a6576f1690ee264fcbd77d3ab58'/>
<id>urn:sha1:87e79fa122bc9a6576f1690ee264fcbd77d3ab58</id>
<content type='text'>
Fast commits only log operations that have dedicated replay support.
Inode format migration (indirect&lt;-&gt;extent layout changes via
EXT4_IOC_MIGRATE or toggling EXT4_EXTENTS_FL) rewrites the block mapping
representation without going through the fast commit tracking paths.
In practice these migrations are rare and usually followed by further
updates, but mixing them into a fast commit makes the overall semantics
harder to reason about and risks replay gaps if new call sites appear.

Teach ext4 to mark the filesystem fast-commit ineligible when
ext4_ext_migrate() or ext4_ind_migrate() start their journal transactions.
This forces those transactions to fall back to a full commit, ensuring
that the entire inode layout change is captured by the normal journal
rather than partially encoded in fast commit TLVs. This change should
not affect common workloads but makes format migrations safer and easier
to reason about under fast commit.

Testing:
1. prepare:
    dd if=/dev/zero of=/root/fc.img bs=1M count=0 seek=128
    mkfs.ext4 -O fast_commit -F /root/fc.img
    mkdir -p /mnt/fc &amp;&amp; mount -t ext4 -o loop /root/fc.img /mnt/fc
2.  Created a test file and toggled the extents flag to exercise both
    ext4_ind_migrate() and ext4_ext_migrate():
    touch /mnt/fc/migtest
    chattr -e /mnt/fc/migtest
    chattr +e /mnt/fc/migtest
3. Verified fast-commit ineligible statistics:
    tail -n 1 /proc/fs/ext4/loop0/fc_info
    "Inode format migration":       2

Signed-off-by: Li Chen &lt;me@linux.beauty&gt;
Link: https://patch.msgid.link/20251211115146.897420-2-me@linux.beauty
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: increase IO priority of fastcommit</title>
<updated>2025-09-25T18:56:31+00:00</updated>
<author>
<name>Julian Sun</name>
<email>sunjunchao@bytedance.com</email>
</author>
<published>2025-08-27T12:18:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=46e75c56dfeafb6756773b71cabe187a6886859a'/>
<id>urn:sha1:46e75c56dfeafb6756773b71cabe187a6886859a</id>
<content type='text'>
The following code paths may result in high latency or even task hangs:
   1. fastcommit io is throttled by wbt.
   2. jbd2_fc_wait_bufs() might wait for a long time while
JBD2_FAST_COMMIT_ONGOING is set in journal-&gt;flags, and then
jbd2_journal_commit_transaction() waits for the
JBD2_FAST_COMMIT_ONGOING bit for a long time while holding the write
lock of j_state_lock.
   3. start_this_handle() waits for read lock of j_state_lock which
results in high latency or task hang.

Given the fact that ext4_fc_commit() already modifies the current
process' IO priority to match that of the jbd2 thread, it should be
reasonable to match jbd2's IO submission flags as well.

Suggested-by: Ritesh Harjani (IBM) &lt;ritesh.list@gmail.com&gt;
Signed-off-by: Julian Sun &lt;sunjunchao@bytedance.com&gt;
Reviewed-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Message-ID: &lt;20250827121812.1477634-1-sunjunchao@bytedance.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: remove sbi argument from ext4_chksum()</title>
<updated>2025-05-20T14:31:12+00:00</updated>
<author>
<name>Eric Biggers</name>
<email>ebiggers@google.com</email>
</author>
<published>2025-05-13T05:38:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6cbab5f95e49ec8a9f21784fae3ff0ee09b2dfbc'/>
<id>urn:sha1:6cbab5f95e49ec8a9f21784fae3ff0ee09b2dfbc</id>
<content type='text'>
Since ext4_chksum() no longer uses its sbi argument, remove it.

Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Reviewed-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Link: https://patch.msgid.link/20250513053809.699974-2-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: prevent stale extent cache entries caused by concurrent I/O writeback</title>
<updated>2025-05-14T14:42:12+00:00</updated>
<author>
<name>Zhang Yi</name>
<email>yi.zhang@huawei.com</email>
</author>
<published>2025-04-23T08:52:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=402e38e6b71f5739119ca3107f375e112d63c7c5'/>
<id>urn:sha1:402e38e6b71f5739119ca3107f375e112d63c7c5</id>
<content type='text'>
Currently, in the I/O writeback path, ext4_map_blocks() may attempt to
cache additional unrelated extents in the extent status tree without
holding the inode's i_rwsem and the mapping's invalidate_lock. This can
lead to stale extent status entries remaining in certain scenarios,
potentially causing data corruption.

For example, when performing a collapse range in ext4_collapse_range(),
it clears the extent cache and dirty pages before removing blocks and
shifting extents. It also holds the i_data_sem during these two
operations. However, both ext4_ext_remove_space() and
ext4_ext_shift_extents() may briefly release the i_data_sem if journal
credits are insufficient (ext4_datasem_ensure_credits()). If another
writeback process writes dirty pages from other regions during this
interval, it may cache extents that are about to be modified. Unless
ext4_collapse_range() explicitly clears the extent cache again, these
cached entries can become stale and inconsistent with the actual
extents.

     0 a  n       b      c         m
     | |  |       |      |         |
    [www][wwwwww][wwwwwwww]...[wwwww][wwww]...
          |                           |
          N                           M

Assume that block a is dirty. The collapse range operation is removing
data from n to m and drops i_data_sem immediately after removing the
extent from b to c. At the same time, a concurrent writeback begins to
write back block a; it will reloads the extent from [n, b) into the
extent status tree since it does not hold the i_rwsem or the
invalidate_lock. After the collapse range operation, it left the stale
extent [n, b), which points logical block n to N, but the actual
physical block of n should be M.

Similarly, both ext4_insert_range() and ext4_truncate() have the same
problem. ext4_punch_hole() survived since it re-add a hole extent entry
after removing space since commit 9f1118223aa0 ("ext4: add a hole extent
entry in cache after punch").

In most cases, during dirty page writeback, the block mapping
information is likely to be found in the extent cache, making it less
necessary to search for physical extents. Consequently, loading
unrelated extent caches during writeback appears to be ineffective.
Therefore, fix this by adds EXT4_EX_NOCACHE in the writeback path to
prevent caching of unrelated extents, eliminating this potential source
of corruption.

Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Link: https://patch.msgid.link/20250423085257.122685-4-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: generalize EXT4_GET_BLOCKS_IO_SUBMIT flag usage</title>
<updated>2025-05-14T14:42:12+00:00</updated>
<author>
<name>Zhang Yi</name>
<email>yi.zhang@huawei.com</email>
</author>
<published>2025-04-23T08:52:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=86b349ce0312a397a6961e457108556e44a3d211'/>
<id>urn:sha1:86b349ce0312a397a6961e457108556e44a3d211</id>
<content type='text'>
Currently, the EXT4_GET_BLOCKS_IO_SUBMIT flag is only used during data
writeback to indicate that in ordered mode, the journal commit thread
should skip re-submitting data and simply wait for I/O completion.

To prepare for later patches that need to detect I/O submission context
in ext4_map_blocks(), generalizes the meaning of
EXT4_GET_BLOCKS_IO_SUBMIT. This flag will be set during:
 1) data I/O writeback,
 2) I/O completion extents conversion,
 3) journal performing commit in fast_commit.

This change doesn't affect current usage of this flag and provides a
clear way to identify I/O submission context.

Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Link: https://patch.msgid.link/20250423085257.122685-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
</feed>
