diff options
| author | Robbie Ko <robbieko@synology.com> | 2025-12-11 08:30:33 +0300 |
|---|---|---|
| committer | Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 2026-01-23 13:21:36 +0300 |
| commit | 9ac63333d600732a56b35ee1fa46836da671eb50 (patch) | |
| tree | 629dcc3ea0d68b3435fd25952e94b7c4c9dbcd1c | |
| parent | 7e58addb8e05379e437e4722534a7cb1cabd767b (diff) | |
| download | linux-9ac63333d600732a56b35ee1fa46836da671eb50.tar.xz | |
btrfs: fix deadlock in wait_current_trans() due to ignored transaction type
commit 5037b342825df7094a4906d1e2a9674baab50cb2 upstream.
When wait_current_trans() is called during start_transaction(), it
currently waits for a blocked transaction without considering whether
the given transaction type actually needs to wait for that particular
transaction state. The btrfs_blocked_trans_types[] array already defines
which transaction types should wait for which transaction states, but
this check was missing in wait_current_trans().
This can lead to a deadlock scenario involving two transactions and
pending ordered extents:
1. Transaction A is in TRANS_STATE_COMMIT_DOING state
2. A worker processing an ordered extent calls start_transaction()
with TRANS_JOIN
3. join_transaction() returns -EBUSY because Transaction A is in
TRANS_STATE_COMMIT_DOING
4. Transaction A moves to TRANS_STATE_UNBLOCKED and completes
5. A new Transaction B is created (TRANS_STATE_RUNNING)
6. The ordered extent from step 2 is added to Transaction B's
pending ordered extents
7. Transaction B immediately starts commit by another task and
enters TRANS_STATE_COMMIT_START
8. The worker finally reaches wait_current_trans(), sees Transaction B
in TRANS_STATE_COMMIT_START (a blocked state), and waits
unconditionally
9. However, TRANS_JOIN should NOT wait for TRANS_STATE_COMMIT_START
according to btrfs_blocked_trans_types[]
10. Transaction B is waiting for pending ordered extents to complete
11. Deadlock: Transaction B waits for ordered extent, ordered extent
waits for Transaction B
This can be illustrated by the following call stacks:
CPU0 CPU1
btrfs_finish_ordered_io()
start_transaction(TRANS_JOIN)
join_transaction()
# -EBUSY (Transaction A is
# TRANS_STATE_COMMIT_DOING)
# Transaction A completes
# Transaction B created
# ordered extent added to
# Transaction B's pending list
btrfs_commit_transaction()
# Transaction B enters
# TRANS_STATE_COMMIT_START
# waiting for pending ordered
# extents
wait_current_trans()
# waits for Transaction B
# (should not wait!)
Task bstore_kv_sync in btrfs_commit_transaction waiting for ordered
extents:
__schedule+0x2e7/0x8a0
schedule+0x64/0xe0
btrfs_commit_transaction+0xbf7/0xda0 [btrfs]
btrfs_sync_file+0x342/0x4d0 [btrfs]
__x64_sys_fdatasync+0x4b/0x80
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Task kworker in wait_current_trans waiting for transaction commit:
Workqueue: btrfs-syno_nocow btrfs_work_helper [btrfs]
__schedule+0x2e7/0x8a0
schedule+0x64/0xe0
wait_current_trans+0xb0/0x110 [btrfs]
start_transaction+0x346/0x5b0 [btrfs]
btrfs_finish_ordered_io.isra.0+0x49b/0x9c0 [btrfs]
btrfs_work_helper+0xe8/0x350 [btrfs]
process_one_work+0x1d3/0x3c0
worker_thread+0x4d/0x3e0
kthread+0x12d/0x150
ret_from_fork+0x1f/0x30
Fix this by passing the transaction type to wait_current_trans() and
checking btrfs_blocked_trans_types[cur_trans->state] against the given
type before deciding to wait. This ensures that transaction types which
are allowed to join during certain blocked states will not unnecessarily
wait and cause deadlocks.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Robbie Ko <robbieko@synology.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Cc: Motiejus Jakštys <motiejus@jakstys.lt>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| -rw-r--r-- | fs/btrfs/transaction.c | 11 |
1 files changed, 6 insertions, 5 deletions
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 89ae0c7a610a..c457316c2788 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -518,13 +518,14 @@ static inline int is_transaction_blocked(struct btrfs_transaction *trans) * when this is done, it is safe to start a new transaction, but the current * transaction might not be fully on disk. */ -static void wait_current_trans(struct btrfs_fs_info *fs_info) +static void wait_current_trans(struct btrfs_fs_info *fs_info, unsigned int type) { struct btrfs_transaction *cur_trans; spin_lock(&fs_info->trans_lock); cur_trans = fs_info->running_transaction; - if (cur_trans && is_transaction_blocked(cur_trans)) { + if (cur_trans && is_transaction_blocked(cur_trans) && + (btrfs_blocked_trans_types[cur_trans->state] & type)) { refcount_inc(&cur_trans->use_count); spin_unlock(&fs_info->trans_lock); @@ -699,12 +700,12 @@ again: sb_start_intwrite(fs_info->sb); if (may_wait_transaction(fs_info, type)) - wait_current_trans(fs_info); + wait_current_trans(fs_info, type); do { ret = join_transaction(fs_info, type); if (ret == -EBUSY) { - wait_current_trans(fs_info); + wait_current_trans(fs_info, type); if (unlikely(type == TRANS_ATTACH || type == TRANS_JOIN_NOSTART)) ret = -ENOENT; @@ -1001,7 +1002,7 @@ out: void btrfs_throttle(struct btrfs_fs_info *fs_info) { - wait_current_trans(fs_info); + wait_current_trans(fs_info, TRANS_START); } bool btrfs_should_end_transaction(struct btrfs_trans_handle *trans) |
