<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md/raid10.c, branch v5.15.209</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.209</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.209'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-06-01T15:35:23+00:00</updated>
<entry>
<title>md/raid10: fix divide-by-zero in setup_geo() with zero far_copies</title>
<updated>2026-06-01T15:35:23+00:00</updated>
<author>
<name>Junrui Luo</name>
<email>moonafterrain@outlook.com</email>
</author>
<published>2026-04-16T03:39:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c716ab3034f84f8a6c226814247b8c5ac9f95da1'/>
<id>urn:sha1:c716ab3034f84f8a6c226814247b8c5ac9f95da1</id>
<content type='text'>
commit 9aa6d860b0930e2f72795665c42c44252a558a0c upstream.

setup_geo() extracts near_copies (nc) and far_copies (fc) from the
user-provided layout parameter without checking for zero. When fc=0
with the "improved" far set layout selected, 'geo-&gt;far_set_size =
disks / fc' triggers a divide-by-zero.

Validate nc and fc immediately after extraction, returning -1 if
either is zero.

Fixes: 475901aff158 ("MD RAID10: Improve redundancy for 'far' and 'offset' algorithms (part 1)")
Cc: stable@vger.kernel.org
Signed-off-by: Junrui Luo &lt;moonafterrain@outlook.com&gt;
Link: https://lore.kernel.org/linux-raid/SYBPR01MB7881A5E2556806CC1D318582AF232@SYBPR01MB7881.ausprd01.prod.outlook.com
Signed-off-by: Yu Kuai &lt;yukuai@fnnas.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>md/raid10: fix deadlock with check operation and nowait requests</title>
<updated>2026-06-01T15:35:15+00:00</updated>
<author>
<name>Josh Hunt</name>
<email>johunt@akamai.com</email>
</author>
<published>2026-03-03T00:56:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2249983d971e6839b36284e6610390b2c217dfa1'/>
<id>urn:sha1:2249983d971e6839b36284e6610390b2c217dfa1</id>
<content type='text'>
commit 7d96f3120a7fb7210d21b520c5b6f495da6ba436 upstream.

When an array check is running it will raise the barrier at which point
normal requests will become blocked and increment the nr_pending value to
signal there is work pending inside of wait_barrier(). NOWAIT requests
do not block and so will return immediately with an error, and additionally
do not increment nr_pending in wait_barrier(). Upstream change commit
43806c3d5b9b ("raid10: cleanup memleak at raid10_make_request") added a
call to raid_end_bio_io() to fix a memory leak when NOWAIT requests hit
this condition. raid_end_bio_io() eventually calls allow_barrier() and
it will unconditionally do an atomic_dec_and_test(&amp;conf-&gt;nr_pending) even
though the corresponding increment on nr_pending didn't happen in the
NOWAIT case.

This can be easily seen by starting a check operation while an application
is doing nowait IO on the same array. This results in a deadlocked state
due to nr_pending value underflowing and so the md resync thread gets stuck
waiting for nr_pending to == 0.

Output of r10conf state of the array when we hit this condition:

crash&gt; struct r10conf
	barrier = 1,
        nr_pending = {
          counter = -41
        },
        nr_waiting = 15,
        nr_queued = 0,

Example of md_sync thread stuck waiting on raise_barrier() and other
requests stuck in wait_barrier():

md1_resync
[&lt;0&gt;] raise_barrier+0xce/0x1c0
[&lt;0&gt;] raid10_sync_request+0x1ca/0x1ed0
[&lt;0&gt;] md_do_sync+0x779/0x1110
[&lt;0&gt;] md_thread+0x90/0x160
[&lt;0&gt;] kthread+0xbe/0xf0
[&lt;0&gt;] ret_from_fork+0x34/0x50
[&lt;0&gt;] ret_from_fork_asm+0x1a/0x30

kworker/u1040:2+flush-253:4
[&lt;0&gt;] wait_barrier+0x1de/0x220
[&lt;0&gt;] regular_request_wait+0x30/0x180
[&lt;0&gt;] raid10_make_request+0x261/0x1000
[&lt;0&gt;] md_handle_request+0x13b/0x230
[&lt;0&gt;] __submit_bio+0x107/0x1f0
[&lt;0&gt;] submit_bio_noacct_nocheck+0x16f/0x390
[&lt;0&gt;] ext4_io_submit+0x24/0x40
[&lt;0&gt;] ext4_do_writepages+0x254/0xc80
[&lt;0&gt;] ext4_writepages+0x84/0x120
[&lt;0&gt;] do_writepages+0x7a/0x260
[&lt;0&gt;] __writeback_single_inode+0x3d/0x300
[&lt;0&gt;] writeback_sb_inodes+0x1dd/0x470
[&lt;0&gt;] __writeback_inodes_wb+0x4c/0xe0
[&lt;0&gt;] wb_writeback+0x18b/0x2d0
[&lt;0&gt;] wb_workfn+0x2a1/0x400
[&lt;0&gt;] process_one_work+0x149/0x330
[&lt;0&gt;] worker_thread+0x2d2/0x410
[&lt;0&gt;] kthread+0xbe/0xf0
[&lt;0&gt;] ret_from_fork+0x34/0x50
[&lt;0&gt;] ret_from_fork_asm+0x1a/0x30

Fixes: 43806c3d5b9b ("raid10: cleanup memleak at raid10_make_request")
Cc: stable@vger.kernel.org
Signed-off-by: Josh Hunt &lt;johunt@akamai.com&gt;
Link: https://lore.kernel.org/linux-raid/20260303005619.1352958-1-johunt@akamai.com
Signed-off-by: Yu Kuai &lt;yukuai@fnnas.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>md/raid10: fix any_working flag handling in raid10_sync_request</title>
<updated>2026-03-04T12:19:22+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2026-01-05T11:02:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c640eea74e9b1841f6d700a6bf115bcd245bf79f'/>
<id>urn:sha1:c640eea74e9b1841f6d700a6bf115bcd245bf79f</id>
<content type='text'>
[ Upstream commit 99582edb3f62e8ee6c34512021368f53f9b091f2 ]

In raid10_sync_request(), 'any_working' indicates if any IO will
be submitted. When there's only one In_sync disk with badblocks,
'any_working' might be set to 1 but no IO is submitted. Fix it by
setting 'any_working' after badblock checks.

Link: https://lore.kernel.org/linux-raid/20260105110300.1442509-11-linan666@huaweicloud.com
Fixes: e875ecea266a ("md/raid10 record bad blocks as needed during recovery.")
Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Yu Kuai &lt;yukuai@fnnas.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>raid10: cleanup memleak at raid10_make_request</title>
<updated>2025-07-17T16:30:52+00:00</updated>
<author>
<name>Nigel Croxon</name>
<email>ncroxon@redhat.com</email>
</author>
<published>2025-07-03T15:23:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=10c6021a609deb95f23f0cc2f89aa9d4bffb14c7'/>
<id>urn:sha1:10c6021a609deb95f23f0cc2f89aa9d4bffb14c7</id>
<content type='text'>
[ Upstream commit 43806c3d5b9bb7d74ba4e33a6a8a41ac988bde24 ]

If raid10_read_request or raid10_write_request registers a new
request and the REQ_NOWAIT flag is set, the code does not
free the malloc from the mempool.

unreferenced object 0xffff8884802c3200 (size 192):
   comm "fio", pid 9197, jiffies 4298078271
   hex dump (first 32 bytes):
     00 00 00 00 00 00 00 00 88 41 02 00 00 00 00 00  .........A......
     08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   backtrace (crc c1a049a2):
     __kmalloc+0x2bb/0x450
     mempool_alloc+0x11b/0x320
     raid10_make_request+0x19e/0x650 [raid10]
     md_handle_request+0x3b3/0x9e0
     __submit_bio+0x394/0x560
     __submit_bio_noacct+0x145/0x530
     submit_bio_noacct_nocheck+0x682/0x830
     __blkdev_direct_IO_async+0x4dc/0x6b0
     blkdev_read_iter+0x1e5/0x3b0
     __io_read+0x230/0x1110
     io_read+0x13/0x30
     io_issue_sqe+0x134/0x1180
     io_submit_sqes+0x48c/0xe90
     __do_sys_io_uring_enter+0x574/0x8b0
     do_syscall_64+0x5c/0xe0
     entry_SYSCALL_64_after_hwframe+0x76/0x7e

V4: changing backing tree to see if CKI tests will pass.
The patch code has not changed between any versions.

Fixes: c9aa889b035f ("md: raid10 add nowait support")
Signed-off-by: Nigel Croxon &lt;ncroxon@redhat.com&gt;
Link: https://lore.kernel.org/linux-raid/c0787379-9caa-42f3-b5fc-369aed784400@redhat.com
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>md/raid10: fix missing discard IO accounting</title>
<updated>2025-05-02T05:44:08+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2025-03-25T01:57:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e2c440b23f219f7cd795f6041a992cf7931e6104'/>
<id>urn:sha1:e2c440b23f219f7cd795f6041a992cf7931e6104</id>
<content type='text'>
[ Upstream commit d05af90d6218e9c8f1c2026990c3f53c1b41bfb0 ]

md_account_bio() is not called from raid10_handle_discard(), now that we
handle bitmap inside md_account_bio(), also fix missing
bitmap_startwrite for discard.

Test whole disk discard for 20G raid10:

Before:
Device   d/s     dMB/s   drqm/s  %drqm d_await dareq-sz
md0    48.00     16.00     0.00   0.00    5.42   341.33

After:
Device   d/s     dMB/s   drqm/s  %drqm d_await dareq-sz
md0    68.00  20462.00     0.00   0.00    2.65 308133.65

Link: https://lore.kernel.org/linux-raid/20250325015746.3195035-1-yukuai1@huaweicloud.com
Fixes: 528bc2cf2fcc ("md/raid10: enable io accounting")
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Acked-by: Coly Li &lt;colyli@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>md/raid10: improve code of mrdev in raid10_sync_request</title>
<updated>2024-11-17T14:06:25+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2023-05-27T07:22:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1f4fd53ffb7c94e9b961534d4f8763bcbbbb6f8e'/>
<id>urn:sha1:1f4fd53ffb7c94e9b961534d4f8763bcbbbb6f8e</id>
<content type='text'>
commit 59f8f0b54c8ffb4521f6bbd1cb6f4dfa5022e75e upstream.

'need_recover' and 'mrdev' are equivalent in raid10_sync_request(), and
inc mrdev-&gt;nr_pending is unreasonable if don't need recovery. Replace
'need_recover' with 'mrdev', and only inc nr_pending when needed.

Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230527072218.2365857-3-linan666@huaweicloud.com
Cc: Hagar Gamal Halim &lt;hagarhem@amazon.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>md/raid10: prevent soft lockup while flush writes</title>
<updated>2024-03-01T12:21:55+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2023-05-29T13:11:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=84a578961b2566e475bfa8740beaf0abcc781a6f'/>
<id>urn:sha1:84a578961b2566e475bfa8740beaf0abcc781a6f</id>
<content type='text'>
[ Upstream commit 010444623e7f4da6b4a4dd603a7da7469981e293 ]

Currently, there is no limit for raid1/raid10 plugged bio. While flushing
writes, raid1 has cond_resched() while raid10 doesn't, and too many
writes can cause soft lockup.

Follow up soft lockup can be triggered easily with writeback test for
raid10 with ramdisks:

watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
Call Trace:
 &lt;TASK&gt;
 call_rcu+0x16/0x20
 put_object+0x41/0x80
 __delete_object+0x50/0x90
 delete_object_full+0x2b/0x40
 kmemleak_free+0x46/0xa0
 slab_free_freelist_hook.constprop.0+0xed/0x1a0
 kmem_cache_free+0xfd/0x300
 mempool_free_slab+0x1f/0x30
 mempool_free+0x3a/0x100
 bio_free+0x59/0x80
 bio_put+0xcf/0x2c0
 free_r10bio+0xbf/0xf0
 raid_end_bio_io+0x78/0xb0
 one_write_done+0x8a/0xa0
 raid10_end_write_request+0x1b4/0x430
 bio_endio+0x175/0x320
 brd_submit_bio+0x3b9/0x9b7 [brd]
 __submit_bio+0x69/0xe0
 submit_bio_noacct_nocheck+0x1e6/0x5a0
 submit_bio_noacct+0x38c/0x7e0
 flush_pending_writes+0xf0/0x240
 raid10d+0xac/0x1ed0

Fix the problem by adding cond_resched() to raid10 like what raid1 did.

Note that unlimited plugged bio still need to be optimized, for example,
in the case of lots of dirty pages writeback, this will take lots of
memory and io will spend a long time in plug, hence io latency is bad.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230529131106.2123367-2-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: Set MD_BROKEN for RAID1 and RAID10</title>
<updated>2023-09-19T10:22:39+00:00</updated>
<author>
<name>Mariusz Tkaczyk</name>
<email>mariusz.tkaczyk@linux.intel.com</email>
</author>
<published>2022-03-22T15:23:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d8b6adb84041ba1e327f9df30f731fb222f2db32'/>
<id>urn:sha1:d8b6adb84041ba1e327f9df30f731fb222f2db32</id>
<content type='text'>
[ Upstream commit 9631abdbf406c764f2a5d8305eac063bc3396a0a ]

There is no direct mechanism to determine raid failure outside
personality. It is done by checking rdev-&gt;flags after executing
md_error(). If "faulty" flag is not set then -EBUSY is returned to
userspace. -EBUSY means that array will be failed after drive removal.

Mdadm has special routine to handle the array failure and it is executed
if -EBUSY is returned by md.

There are at least two known reasons to not consider this mechanism
as correct:
1. drive can be removed even if array will be failed[1].
2. -EBUSY seems to be wrong status. Array is not busy, but removal
   process cannot proceed safe.

-EBUSY expectation cannot be removed without breaking compatibility
with userspace. In this patch first issue is resolved by adding support
for MD_BROKEN flag for RAID1 and RAID10. Support for RAID456 is added in
next commit.

The idea is to set the MD_BROKEN if we are sure that raid is in failed
state now. This is done in each error_handler(). In md_error() MD_BROKEN
flag is checked. If is set, then -EBUSY is returned to userspace.

As in previous commit, it causes that #mdadm --set-faulty is able to
fail array. Previously proposed workaround is valid if optional
functionality[1] is disabled.

[1] commit 9a567843f7ce("md: allow last device to be forcibly removed from
    RAID1/RAID10.")

Reviewd-by: Xiao Ni &lt;xni@redhat.com&gt;
Signed-off-by: Mariusz Tkaczyk &lt;mariusz.tkaczyk@linux.intel.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Stable-dep-of: 319ff40a5427 ("md/raid0: Fix performance regression for large sequential writes")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>md/raid10: use dereference_rdev_and_rrdev() to get devices</title>
<updated>2023-09-19T10:22:38+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2023-07-01T08:05:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a3d36107ee487dae0010eea10280c3a3f12b1a2c'/>
<id>urn:sha1:a3d36107ee487dae0010eea10280c3a3f12b1a2c</id>
<content type='text'>
[ Upstream commit 673643490b9a0eb3b25633abe604f62b8f63dba1 ]

Commit 2ae6aaf76912 ("md/raid10: fix io loss while replacement replace
rdev") reads replacement first to prevent io loss. However, there are same
issue in wait_blocked_dev() and raid10_handle_discard(), too. Fix it by
using dereference_rdev_and_rrdev() to get devices.

Fixes: d30588b2731f ("md/raid10: improve raid10 discard request")
Fixes: f2e7e269a752 ("md/raid10: pull the code that wait for blocked dev into one function")
Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Link: https://lore.kernel.org/r/20230701080529.2684932-4-linan666@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>md/raid10: factor out dereference_rdev_and_rrdev()</title>
<updated>2023-09-19T10:22:38+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2023-07-01T08:05:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=94ca5eed95307d665472ff6aaa17694cc43d35fb'/>
<id>urn:sha1:94ca5eed95307d665472ff6aaa17694cc43d35fb</id>
<content type='text'>
[ Upstream commit b99f8fd2d91eb734f13098aa1cf337edaca454b7 ]

Factor out a helper to get 'rdev' and 'replacement' from config-&gt;mirrors.
Just to make code cleaner and prepare to fix the bug of io loss while
'replacement' replace 'rdev'.

There is no functional change.

Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Link: https://lore.kernel.org/r/20230701080529.2684932-3-linan666@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Stable-dep-of: 673643490b9a ("md/raid10: use dereference_rdev_and_rrdev() to get devices")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
