<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/jbd2/checkpoint.c, branch v6.6.141</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.141</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.141'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-04-02T11:07:26+00:00</updated>
<entry>
<title>jbd2: gracefully abort on checkpointing state corruptions</title>
<updated>2026-04-02T11:07:26+00:00</updated>
<author>
<name>Milos Nikic</name>
<email>nikic.milos@gmail.com</email>
</author>
<published>2026-03-11T04:15:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d536a00f1b4512fc99098404a054eb6dd27f80cc'/>
<id>urn:sha1:d536a00f1b4512fc99098404a054eb6dd27f80cc</id>
<content type='text'>
commit bac3190a8e79beff6ed221975e0c9b1b5f2a21da upstream.

This patch targets two internal state machine invariants in checkpoint.c
residing inside functions that natively return integer error codes.

- In jbd2_cleanup_journal_tail(): A blocknr of 0 indicates a severely
corrupted journal superblock. Replaced the J_ASSERT with a WARN_ON_ONCE
and a graceful journal abort, returning -EFSCORRUPTED.

- In jbd2_log_do_checkpoint(): Replaced the J_ASSERT_BH checking for
an unexpected buffer_jwrite state. If the warning triggers, we
explicitly drop the just-taken get_bh() reference and call __flush_batch()
to safely clean up any previously queued buffers in the j_chkpt_bhs array,
preventing a memory leak before returning -EFSCORRUPTED.

Signed-off-by: Milos Nikic &lt;nikic.milos@gmail.com&gt;
Reviewed-by: Andreas Dilger &lt;adilger@dilger.ca&gt;
Reviewed-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Baokun Li &lt;libaokun@linux.alibaba.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://patch.msgid.link/20260311041548.159424-1-nikic.milos@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>jbd2: prevent softlockup in jbd2_log_do_checkpoint()</title>
<updated>2025-08-28T14:28:36+00:00</updated>
<author>
<name>Baokun Li</name>
<email>libaokun1@huawei.com</email>
</author>
<published>2025-08-12T06:37:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=41f40038de62e8306897cf6840791b268996432a'/>
<id>urn:sha1:41f40038de62e8306897cf6840791b268996432a</id>
<content type='text'>
commit 9d98cf4632258720f18265a058e62fde120c0151 upstream.

Both jbd2_log_do_checkpoint() and jbd2_journal_shrink_checkpoint_list()
periodically release j_list_lock after processing a batch of buffers to
avoid long hold times on the j_list_lock. However, since both functions
contend for j_list_lock, the combined time spent waiting and processing
can be significant.

jbd2_journal_shrink_checkpoint_list() explicitly calls cond_resched() when
need_resched() is true to avoid softlockups during prolonged operations.
But jbd2_log_do_checkpoint() only exits its loop when need_resched() is
true, relying on potentially sleeping functions like __flush_batch() or
wait_on_buffer() to trigger rescheduling. If those functions do not sleep,
the kernel may hit a softlockup.

watchdog: BUG: soft lockup - CPU#3 stuck for 156s! [kworker/u129:2:373]
CPU: 3 PID: 373 Comm: kworker/u129:2 Kdump: loaded Not tainted 6.6.0+ #10
Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.27 06/13/2017
Workqueue: writeback wb_workfn (flush-7:2)
pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : native_queued_spin_lock_slowpath+0x358/0x418
lr : jbd2_log_do_checkpoint+0x31c/0x438 [jbd2]
Call trace:
 native_queued_spin_lock_slowpath+0x358/0x418
 jbd2_log_do_checkpoint+0x31c/0x438 [jbd2]
 __jbd2_log_wait_for_space+0xfc/0x2f8 [jbd2]
 add_transaction_credits+0x3bc/0x418 [jbd2]
 start_this_handle+0xf8/0x560 [jbd2]
 jbd2__journal_start+0x118/0x228 [jbd2]
 __ext4_journal_start_sb+0x110/0x188 [ext4]
 ext4_do_writepages+0x3dc/0x740 [ext4]
 ext4_writepages+0xa4/0x190 [ext4]
 do_writepages+0x94/0x228
 __writeback_single_inode+0x48/0x318
 writeback_sb_inodes+0x204/0x590
 __writeback_inodes_wb+0x54/0xf8
 wb_writeback+0x2cc/0x3d8
 wb_do_writeback+0x2e0/0x2f8
 wb_workfn+0x80/0x2a8
 process_one_work+0x178/0x3e8
 worker_thread+0x234/0x3b8
 kthread+0xf0/0x108
 ret_from_fork+0x10/0x20

So explicitly call cond_resched() in jbd2_log_do_checkpoint() to avoid
softlockup.

Cc: stable@kernel.org
Signed-off-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Link: https://patch.msgid.link/20250812063752.912130-1-libaokun@huaweicloud.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error</title>
<updated>2024-10-10T09:57:50+00:00</updated>
<author>
<name>Baokun Li</name>
<email>libaokun1@huawei.com</email>
</author>
<published>2024-07-18T11:53:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1c62dc0d82c62f0dc8fcdc4843208e522acccaf5'/>
<id>urn:sha1:1c62dc0d82c62f0dc8fcdc4843208e522acccaf5</id>
<content type='text'>
commit f5cacdc6f2bb2a9bf214469dd7112b43dd2dd68a upstream.

In __jbd2_log_wait_for_space(), we might call jbd2_cleanup_journal_tail()
to recover some journal space. But if an error occurs while executing
jbd2_cleanup_journal_tail() (e.g., an EIO), we don't stop waiting for free
space right away. Instead we try other branches, and if
j_committing_transaction is NULL (i.e., the tid is 0), we get the following
warning:

============================================
JBD2: I/O error when updating journal superblock for sdd-8.
__jbd2_log_wait_for_space: needed 256 blocks and only had 217 space available
__jbd2_log_wait_for_space: no way to get more journal space in sdd-8
------------[ cut here ]------------
WARNING: CPU: 2 PID: 139804 at fs/jbd2/checkpoint.c:109 __jbd2_log_wait_for_space+0x251/0x2e0
Modules linked in:
CPU: 2 PID: 139804 Comm: kworker/u8:3 Not tainted 6.6.0+ #1
RIP: 0010:__jbd2_log_wait_for_space+0x251/0x2e0
Call Trace:
 &lt;TASK&gt;
 add_transaction_credits+0x5d1/0x5e0
 start_this_handle+0x1ef/0x6a0
 jbd2__journal_start+0x18b/0x340
 ext4_dirty_inode+0x5d/0xb0
 __mark_inode_dirty+0xe4/0x5d0
 generic_update_time+0x60/0x70
[...]
============================================

So only if jbd2_cleanup_journal_tail() returns 1, i.e., there is nothing to
clean up at the moment, continue to try to reclaim free space in other ways.

Note that this fix relies on commit 6f6a6fda2945 ("jbd2: fix ocfs2 corrupt
when updating journal superblock fails") to make jbd2_cleanup_journal_tail
return the correct error code.

Fixes: 8c3f25d8950c ("jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space")
Cc: stable@kernel.org
Signed-off-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://patch.msgid.link/20240718115336.2554501-1-libaokun@huaweicloud.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ext4: fix incorrect tid assumption in jbd2_journal_shrink_checkpoint_list()</title>
<updated>2024-10-10T09:57:48+00:00</updated>
<author>
<name>Luis Henriques (SUSE)</name>
<email>luis.henriques@linux.dev</email>
</author>
<published>2024-07-24T16:11:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1552199ace59dd0bd7685d82af6f438cc6a2e7e7'/>
<id>urn:sha1:1552199ace59dd0bd7685d82af6f438cc6a2e7e7</id>
<content type='text'>
commit 7a6443e1dad70281f99f0bd394d7fd342481a632 upstream.

Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
valid value for transaction IDs, which is incorrect.  Don't assume that and
use two extra boolean variables to control the loop iterations and keep
track of the first and last tid.

Signed-off-by: Luis Henriques (SUSE) &lt;luis.henriques@linux.dev&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://patch.msgid.link/20240724161119.13448-4-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space()</title>
<updated>2024-10-10T09:57:47+00:00</updated>
<author>
<name>Luis Henriques (SUSE)</name>
<email>luis.henriques@linux.dev</email>
</author>
<published>2024-07-24T16:11:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=93051d16b31256d4d1f88b8088846487d98846f0'/>
<id>urn:sha1:93051d16b31256d4d1f88b8088846487d98846f0</id>
<content type='text'>
commit 972090651ee15e51abfb2160e986fa050cfc7a40 upstream.

Function __jbd2_log_wait_for_space() assumes that '0' is not a valid value
for transaction IDs, which is incorrect.  Don't assume that and invoke
jbd2_log_wait_commit() if the journal had a committing transaction instead.

Signed-off-by: Luis Henriques (SUSE) &lt;luis.henriques@linux.dev&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://patch.msgid.link/20240724161119.13448-3-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
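
Editorial illustration (not part of the upstream commit): a minimal Python
sketch of why 0 cannot serve as a "no transaction" marker. It assumes 32-bit
transaction IDs that wrap around; all names below are hypothetical and do not
correspond to kernel functions.

```python
# Hypothetical model only; this is not the kernel code.
TID_MODULUS = 2 ** 32  # assumption: tids are 32-bit counters that wrap


def next_tid(tid):
    """Return the transaction ID that follows tid, wrapping at 2**32."""
    return (tid + 1) % TID_MODULUS


def wait_target_broken(committing_tid):
    """Old-style logic: treats tid 0 as 'no committing transaction'."""
    return committing_tid if committing_tid != 0 else None


def wait_target_fixed(has_committing, committing_tid):
    """Fixed-style logic: a separate flag records whether one exists."""
    return committing_tid if has_committing else None
```

With these definitions, next_tid(TID_MODULUS - 1) yields 0, and only
wait_target_fixed(True, 0) still reports a transaction to wait for.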
</content>
</entry>
<entry>
<title>jbd2: remove unused function '__cp_buffer_busy'</title>
<updated>2023-08-04T20:32:29+00:00</updated>
<author>
<name>Yang Li</name>
<email>yang.lee@linux.alibaba.com</email>
</author>
<published>2023-07-14T02:55:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5f02a30eac5cc1c081cbdb42d19fd0ded00b0618'/>
<id>urn:sha1:5f02a30eac5cc1c081cbdb42d19fd0ded00b0618</id>
<content type='text'>
The code calling function '__cp_buffer_busy' has been removed, so the
function should also be removed.
Silence the warning:
fs/jbd2/checkpoint.c:48:20: warning: unused function '__cp_buffer_busy'

Reported-by: Abaci Robot &lt;abaci@linux.alibaba.com&gt;
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5518
Signed-off-by: Yang Li &lt;yang.lee@linux.alibaba.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20230714025528.564988-4-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>jbd2: check 'jh-&gt;b_transaction' before removing it from checkpoint</title>
<updated>2023-08-04T20:32:23+00:00</updated>
<author>
<name>Zhihao Cheng</name>
<email>chengzhihao1@huawei.com</email>
</author>
<published>2023-07-14T02:55:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=590a809ff743e7bd890ba5fb36bc38e20a36de53'/>
<id>urn:sha1:590a809ff743e7bd890ba5fb36bc38e20a36de53</id>
<content type='text'>
The following process will corrupt an ext4 image:
Step 1:
jbd2_journal_commit_transaction
 __jbd2_journal_insert_checkpoint(jh, commit_transaction)
 // Put jh into trans1-&gt;t_checkpoint_list
 journal-&gt;j_checkpoint_transactions = commit_transaction
 // Put trans1 into journal-&gt;j_checkpoint_transactions

Step 2:
do_get_write_access
 test_clear_buffer_dirty(bh) // clear buffer dirty, set jbd dirty
 __jbd2_journal_file_buffer(jh, transaction) // jh belongs to trans2

Step 3:
drop_cache
 journal_shrink_one_cp_list
  jbd2_journal_try_remove_checkpoint
   if (!trylock_buffer(bh))  // lock bh, true
   if (buffer_dirty(bh))     // buffer is not dirty
   __jbd2_journal_remove_checkpoint(jh)
   // remove jh from trans1-&gt;t_checkpoint_list

Step 4:
jbd2_log_do_checkpoint
 trans1 = journal-&gt;j_checkpoint_transactions
 // jh is not in trans1-&gt;t_checkpoint_list
 jbd2_cleanup_journal_tail(journal)  // trans1 is done

Step 5: Power cut, trans2 is not committed, jh is lost on the next mount.

Fix it by checking 'jh-&gt;b_transaction' before removing it from checkpoint.

Cc: stable@kernel.org
Fixes: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy")
Signed-off-by: Zhihao Cheng &lt;chengzhihao1@huawei.com&gt;
Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20230714025528.564988-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>jbd2: fix checkpoint cleanup performance regression</title>
<updated>2023-08-04T20:32:05+00:00</updated>
<author>
<name>Zhang Yi</name>
<email>yi.zhang@huawei.com</email>
</author>
<published>2023-07-14T02:55:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=373ac521799d9e97061515aca6ec6621789036bb'/>
<id>urn:sha1:373ac521799d9e97061515aca6ec6621789036bb</id>
<content type='text'>
journal_clean_one_cp_list() has been merged into
journal_shrink_one_cp_list(), but checkpoint buffer cleanup from the
committing process is only a best effort: it should stop scanning once it
meets a busy buffer, or else it causes a lot of wasted buffer scans and
checks. We caught a performance regression when running the fs_mark test
below.

Test cmd:
 ./fs_mark  -d  scratch  -s  1024  -n  10000  -t  1  -D  100  -N  100

Before merging checkpoint buffer cleanup:
 FSUse%        Count         Size    Files/sec     App Overhead
     95        10000         1024       8304.9            49033

After merging checkpoint buffer cleanup:
 FSUse%        Count         Size    Files/sec     App Overhead
     95        10000         1024       7649.0            50012
 FSUse%        Count         Size    Files/sec     App Overhead
     95        10000         1024       2107.1            50871

After merging checkpoint buffer cleanup, the total loop count in
journal_shrink_one_cp_list() could be up to 6,261,600+ (50,000+ ~
100,000+ in general), and most of the iterations are wasted. This patch
fixes it by passing 'shrink_type' into journal_shrink_one_cp_list() and
adding a new 'SHRINK_BUSY_STOP' type to indicate that it should stop once
it meets a busy buffer. After the fix, the loop count drops back to
10,000+.

After this fix:
 FSUse%        Count         Size    Files/sec     App Overhead
     95        10000         1024       8558.4            49109

Cc: stable@kernel.org
Fixes: b98dba273a0e ("jbd2: remove journal_clean_one_cp_list()")
Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20230714025528.564988-2-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
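
Editorial illustration (not part of the upstream commit): a minimal Python
sketch of the stop-on-first-busy-buffer idea described above. The list of
booleans, the function name and the stop_on_busy flag are hypothetical
stand-ins for the checkpoint list and the shrink type.

```python
def shrink_one_cp_list(buffers, stop_on_busy):
    """Scan a modelled checkpoint list and return (released, scanned).

    buffers is a list of booleans, where True means the buffer is busy.
    """
    released = 0
    scanned = 0
    for busy in buffers:
        scanned += 1
        if busy:
            if stop_on_busy:
                break  # best effort: give up at the first busy buffer
            continue   # otherwise keep scanning past busy buffers
        released += 1
    return released, scanned
```

For a list of two clean buffers followed by 1000 busy ones, the stop-on-busy
scan visits 3 entries while the full scan visits 1002, and both release the
same two buffers.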
</content>
</entry>
<entry>
<title>jbd2: fix a race when checking checkpoint buffer busy</title>
<updated>2023-07-11T03:09:21+00:00</updated>
<author>
<name>Zhang Yi</name>
<email>yi.zhang@huawei.com</email>
</author>
<published>2023-06-06T13:59:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=46f881b5b1758dc4a35fba4a643c10717d0cf427'/>
<id>urn:sha1:46f881b5b1758dc4a35fba4a643c10717d0cf427</id>
<content type='text'>
Before removing a checkpoint buffer from the t_checkpoint_list, we have to
check the BH_Dirty and BH_Lock bits together to distinguish buffers that
have not been written back from those that are being written back. But
__cp_buffer_busy() checks them separately: it first checks the lock state
and then the dirty state, and the window between these two checks can be
raced by the writeback procedure, which locks the buffer and clears the
dirty bit before I/O completes. So it cannot guarantee that checkpointing
buffers have been written back to disk if some error happens later.
Eventually, it may clean checkpoint transactions and lead to an
inconsistent filesystem.

jbd2_journal_forget() and __journal_try_to_free_buffer() have the same
problem (journal_unmap_buffer() escapes this issue since it runs under
the buffer lock), so fix them by introducing a new helper that tries to
hold the buffer lock and removes only really clean buffers.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217490
Cc: stable@vger.kernel.org
Suggested-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20230606135928.434610-6-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint</title>
<updated>2023-07-11T03:09:21+00:00</updated>
<author>
<name>Zhihao Cheng</name>
<email>chengzhihao1@huawei.com</email>
</author>
<published>2023-06-06T13:59:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e34c8dd238d0c9368b746480f313055f5bab5040'/>
<id>urn:sha1:e34c8dd238d0c9368b746480f313055f5bab5040</id>
<content type='text'>
Following process,

jbd2_journal_commit_transaction
// there are several dirty buffer heads in transaction-&gt;t_checkpoint_list
          P1                   wb_workfn
jbd2_log_do_checkpoint
 if (buffer_locked(bh)) // false
                            __block_write_full_page
                             trylock_buffer(bh)
                             test_clear_buffer_dirty(bh)
 if (!buffer_dirty(bh))
  __jbd2_journal_remove_checkpoint(jh)
   if (buffer_write_io_error(bh)) // false
                             &gt;&gt; bh IO error occurs &lt;&lt;
 jbd2_cleanup_journal_tail
  __jbd2_update_log_tail
   jbd2_write_superblock
   // The bh won't be replayed in next mount.
The process above could corrupt the ext4 image; a reproducer is available at [Link].

Since the writeback process clears the buffer dirty bit after locking the
buffer head, we can fix it by trylocking the buffer and checking dirtiness
while the buffer is locked. The buffer head can be removed only if it is
neither dirty nor locked.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217490
Fixes: 470decc613ab ("[PATCH] jbd2: initial copy of files from jbd")
Signed-off-by: Zhihao Cheng &lt;chengzhihao1@huawei.com&gt;
Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20230606135928.434610-5-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
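
Editorial illustration (not part of the upstream commit): a minimal Python
sketch of the "trylock, then check dirtiness under the lock" pattern the fix
describes. The Buffer class and try_remove_checkpoint() are hypothetical
user-space stand-ins and do not model the real buffer_head or jbd2 API.

```python
import threading


class Buffer:
    """Hypothetical stand-in for a buffer head: a lock plus a dirty flag."""

    def __init__(self, dirty):
        self.lock = threading.Lock()
        self.dirty = dirty


def try_remove_checkpoint(buf, checkpoint_list):
    """Remove buf only if it is unlocked and clean; return True on removal."""
    if not buf.lock.acquire(blocking=False):
        return False  # another party (e.g. writeback) holds the lock
    try:
        if buf.dirty:
            return False  # not written back yet, keep it on the list
        checkpoint_list.remove(buf)
        return True
    finally:
        buf.lock.release()
```

Because the dirty flag is read only while the lock is held, a writer that
takes the lock before clearing the flag can no longer slip in between the
two checks.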
</content>
</entry>
</feed>
