<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md/raid10.c, branch v7.1-rc5</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.1-rc5</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.1-rc5'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-04-28T12:44:37+00:00</updated>
<entry>
<title>md/raid10: fix divide-by-zero in setup_geo() with zero far_copies</title>
<updated>2026-04-28T12:44:37+00:00</updated>
<author>
<name>Junrui Luo</name>
<email>moonafterrain@outlook.com</email>
</author>
<published>2026-04-16T03:39:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9aa6d860b0930e2f72795665c42c44252a558a0c'/>
<id>urn:sha1:9aa6d860b0930e2f72795665c42c44252a558a0c</id>
<content type='text'>
setup_geo() extracts near_copies (nc) and far_copies (fc) from the
user-provided layout parameter without checking for zero. When fc=0
with the "improved" far set layout selected, 'geo-&gt;far_set_size =
disks / fc' triggers a divide-by-zero.

Validate nc and fc immediately after extraction, returning -1 if
either is zero.

Fixes: 475901aff158 ("MD RAID10: Improve redundancy for 'far' and 'offset' algorithms (part 1)")
Cc: stable@vger.kernel.org
Signed-off-by: Junrui Luo &lt;moonafterrain@outlook.com&gt;
Link: https://lore.kernel.org/linux-raid/SYBPR01MB7881A5E2556806CC1D318582AF232@SYBPR01MB7881.ausprd01.prod.outlook.com
Signed-off-by: Yu Kuai &lt;yukuai@fnnas.com&gt;
</content>
</entry>
<entry>
<title>md/raid10: fix deadlock with check operation and nowait requests</title>
<updated>2026-03-15T17:24:59+00:00</updated>
<author>
<name>Josh Hunt</name>
<email>johunt@akamai.com</email>
</author>
<published>2026-03-03T00:56:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7d96f3120a7fb7210d21b520c5b6f495da6ba436'/>
<id>urn:sha1:7d96f3120a7fb7210d21b520c5b6f495da6ba436</id>
<content type='text'>
When an array check is running it will raise the barrier at which point
normal requests will become blocked and increment the nr_pending value to
signal there is work pending inside of wait_barrier(). NOWAIT requests
do not block and so will return immediately with an error, and additionally
do not increment nr_pending in wait_barrier(). Upstream change commit
43806c3d5b9b ("raid10: cleanup memleak at raid10_make_request") added a
call to raid_end_bio_io() to fix a memory leak when NOWAIT requests hit
this condition. raid_end_bio_io() eventually calls allow_barrier() and
it will unconditionally do an atomic_dec_and_test(&amp;conf-&gt;nr_pending) even
though the corresponding increment on nr_pending didn't happen in the
NOWAIT case.

This can be easily seen by starting a check operation while an application
is doing nowait IO on the same array. This results in a deadlocked state
due to nr_pending value underflowing and so the md resync thread gets stuck
waiting for nr_pending to == 0.

Output of r10conf state of the array when we hit this condition:

crash&gt; struct r10conf
	barrier = 1,
        nr_pending = {
          counter = -41
        },
        nr_waiting = 15,
        nr_queued = 0,

Example of md_sync thread stuck waiting on raise_barrier() and other
requests stuck in wait_barrier():

md1_resync
[&lt;0&gt;] raise_barrier+0xce/0x1c0
[&lt;0&gt;] raid10_sync_request+0x1ca/0x1ed0
[&lt;0&gt;] md_do_sync+0x779/0x1110
[&lt;0&gt;] md_thread+0x90/0x160
[&lt;0&gt;] kthread+0xbe/0xf0
[&lt;0&gt;] ret_from_fork+0x34/0x50
[&lt;0&gt;] ret_from_fork_asm+0x1a/0x30

kworker/u1040:2+flush-253:4
[&lt;0&gt;] wait_barrier+0x1de/0x220
[&lt;0&gt;] regular_request_wait+0x30/0x180
[&lt;0&gt;] raid10_make_request+0x261/0x1000
[&lt;0&gt;] md_handle_request+0x13b/0x230
[&lt;0&gt;] __submit_bio+0x107/0x1f0
[&lt;0&gt;] submit_bio_noacct_nocheck+0x16f/0x390
[&lt;0&gt;] ext4_io_submit+0x24/0x40
[&lt;0&gt;] ext4_do_writepages+0x254/0xc80
[&lt;0&gt;] ext4_writepages+0x84/0x120
[&lt;0&gt;] do_writepages+0x7a/0x260
[&lt;0&gt;] __writeback_single_inode+0x3d/0x300
[&lt;0&gt;] writeback_sb_inodes+0x1dd/0x470
[&lt;0&gt;] __writeback_inodes_wb+0x4c/0xe0
[&lt;0&gt;] wb_writeback+0x18b/0x2d0
[&lt;0&gt;] wb_workfn+0x2a1/0x400
[&lt;0&gt;] process_one_work+0x149/0x330
[&lt;0&gt;] worker_thread+0x2d2/0x410
[&lt;0&gt;] kthread+0xbe/0xf0
[&lt;0&gt;] ret_from_fork+0x34/0x50
[&lt;0&gt;] ret_from_fork_asm+0x1a/0x30

Fixes: 43806c3d5b9b ("raid10: cleanup memleak at raid10_make_request")
Cc: stable@vger.kernel.org
Signed-off-by: Josh Hunt &lt;johunt@akamai.com&gt;
Link: https://lore.kernel.org/linux-raid/20260303005619.1352958-1-johunt@akamai.com
Signed-off-by: Yu Kuai &lt;yukuai@fnnas.com&gt;
</content>
</entry>
<entry>
<title>block: remove bdev_nonrot()</title>
<updated>2026-03-09T20:30:00+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>dlemoal@kernel.org</email>
</author>
<published>2026-02-26T07:54:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ecd92cfec5349876d6a80f8188ea98c5920094b6'/>
<id>urn:sha1:ecd92cfec5349876d6a80f8188ea98c5920094b6</id>
<content type='text'>
bdev_nonrot() is simply the negative return value of bdev_rot().
So replace all call sites of bdev_nonrot() with calls to bdev_rot()
and remove bdev_nonrot().

Signed-off-by: Damien Le Moal &lt;dlemoal@kernel.org&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Reviewed-by: Paul Menzel &lt;pmenzel@molgen.mpg.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses</title>
<updated>2026-02-22T16:26:33+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-22T07:46:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=189f164e573e18d9f8876dbd3ad8fcbe11f93037'/>
<id>urn:sha1:189f164e573e18d9f8876dbd3ad8fcbe11f93037</id>
<content type='text'>
Conversion performed via this Coccinelle script:

  // SPDX-License-Identifier: GPL-2.0-only
  // Options: --include-headers-for-types --all-includes --include-headers --keep-comments
  virtual patch

  @gfp depends on patch &amp;&amp; !(file in "tools") &amp;&amp; !(file in "samples")@
  identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex,
 		    kzalloc_obj,kzalloc_objs,kzalloc_flex,
		    kvmalloc_obj,kvmalloc_objs,kvmalloc_flex,
		    kvzalloc_obj,kvzalloc_objs,kvzalloc_flex};
  @@

  	ALLOC(...
  -		, GFP_KERNEL
  	)

  $ make coccicheck MODE=patch COCCI=gfp.cocci

Build and boot tested x86_64 with Fedora 42's GCC and Clang:

Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Convert 'alloc_obj' family to use the new default GFP_KERNEL argument</title>
<updated>2026-02-22T01:09:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T00:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43'/>
<id>urn:sha1:bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43</id>
<content type='text'>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>urn:sha1:69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: remove recovery_disabled</title>
<updated>2026-01-26T05:17:38+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2026-01-05T11:03:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5d1dd57929be2158fb5a8bc74817cc08b10b0118'/>
<id>urn:sha1:5d1dd57929be2158fb5a8bc74817cc08b10b0118</id>
<content type='text'>
'recovery_disabled' logic is complex and confusing, originally intended to
preserve raid in extreme scenarios. It was used in following cases:
- When sync fails and setting badblocks also fails, kick out non-In_sync
  rdev and block spare rdev from joining to preserve raid [1]
- When last backup is unavailable, prevent repeated add-remove of spares
  triggering recovery [2]

The original issues are now resolved:
- Error handlers in all raid types prevent last rdev from being kicked out
- Disks with failed recovery are marked Faulty and can't re-join

Therefore, remove 'recovery_disabled' as it's no longer needed.

[1] 5389042ffa36 ("md: change managed of recovery_disabled.")
[2] 4044ba58dd15 ("md: don't retry recovery of raid1 that fails due to error on source drive.")

Link: https://lore.kernel.org/linux-raid/20260105110300.1442509-13-linan666@huaweicloud.com
Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Signed-off-by: Yu Kuai &lt;yukuai@fnnas.com&gt;
</content>
</entry>
<entry>
<title>md/raid10: cleanup skip handling in raid10_sync_request</title>
<updated>2026-01-26T05:16:26+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2026-01-05T11:02:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7435b73f05fbb40c07b087fefd3d40bfd759519c'/>
<id>urn:sha1:7435b73f05fbb40c07b087fefd3d40bfd759519c</id>
<content type='text'>
Skip a sector in raid10_sync_request() when it needs no syncing or no
readable device exists. Current skip handling is unnecessary:

- Use 'skip' label to reissue the next sector instead of return directly
- Complete sync and return 'max_sectors' when multiple sectors are skipped
  due to badblocks

The first is error-prone. For example, commit bc49694a9e8f ("md: pass in
max_sectors for pers-&gt;sync_request()") removed redundant max_sector
assignments. Since skip modifies max_sectors, `goto skip` leaves
max_sectors equal to sector_nr after the jump, which is incorrect.

The second causes sync to complete erroneously when no actual sync occurs.
For recovery, recording badblocks and continue syncing subsequent sectors
is more suitable. For resync, just skip bad sectors and syncing subsequent
sectors.

Clean up complex and unnecessary skip code. Return immediately when a
sector should be skipped. Reduce code paths and lower regression risk.

Link: https://lore.kernel.org/linux-raid/20260105110300.1442509-12-linan666@huaweicloud.com
Fixes: bc49694a9e8f ("md: pass in max_sectors for pers-&gt;sync_request()")
Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai@fnnas.com&gt;
Signed-off-by: Yu Kuai &lt;yukuai@fnnas.com&gt;
</content>
</entry>
<entry>
<title>md/raid10: fix any_working flag handling in raid10_sync_request</title>
<updated>2026-01-26T05:16:25+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2026-01-05T11:02:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=99582edb3f62e8ee6c34512021368f53f9b091f2'/>
<id>urn:sha1:99582edb3f62e8ee6c34512021368f53f9b091f2</id>
<content type='text'>
In raid10_sync_request(), 'any_working' indicates if any IO will
be submitted. When there's only one In_sync disk with badblocks,
'any_working' might be set to 1 but no IO is submitted. Fix it by
setting 'any_working' after badblock checks.

Link: https://lore.kernel.org/linux-raid/20260105110300.1442509-11-linan666@huaweicloud.com
Fixes: e875ecea266a ("md/raid10 record bad blocks as needed during recovery.")
Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Yu Kuai &lt;yukuai@fnnas.com&gt;
</content>
</entry>
<entry>
<title>md: mark rdev Faulty when badblocks setting fails</title>
<updated>2026-01-26T05:16:15+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2026-01-05T11:02:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fd4d44c14ff6a0e815eefd5d87bbba2b2668b18f'/>
<id>urn:sha1:fd4d44c14ff6a0e815eefd5d87bbba2b2668b18f</id>
<content type='text'>
Currently when sync read fails and badblocks set fails (exceeding
512 limit), rdev isn't immediately marked Faulty. Instead
'recovery_disabled' is set and non-In_sync rdevs are removed later.
This preserves array availability if bad regions aren't read, but bad
sectors might be read by users before rdev removal. This occurs due
to incorrect resync/recovery_offset updates that include these bad
sectors.

When badblocks exceed 512, keeping the disk provides little benefit
while adding complexity. Prompt disk replacement is more important.
Therefore when badblocks set fails, directly call md_error to mark rdev
Faulty immediately, preventing potential data access issues.

After this change, cleanup of offset update logic and 'recovery_disabled'
handling will follow.

Link: https://lore.kernel.org/linux-raid/20260105110300.1442509-6-linan666@huaweicloud.com
Fixes: 5e5702898e93 ("md/raid10: Handle read errors during recovery better.")
Fixes: 3a9f28a5117e ("md/raid1: improve handling of read failure during recovery.")
Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Signed-off-by: Yu Kuai &lt;yukuai@fnnas.com&gt;
</content>
</entry>
</feed>
