<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md/md.c, branch v6.1.124</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.124</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.124'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-08-29T15:30:36+00:00</updated>
<entry>
<title>md: clean up invalid BUG_ON in md_ioctl</title>
<updated>2024-08-29T15:30:36+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2024-02-26T03:14:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=86ea3c4e71f35a68a2a1c31d5180be6ba2cf6510'/>
<id>urn:sha1:86ea3c4e71f35a68a2a1c31d5180be6ba2cf6510</id>
<content type='text'>
[ Upstream commit 9dd8702e7cd28ebf076ff838933f29cf671165ec ]

'disk-&gt;private_data' is set to mddev in md_alloc() and never set to NULL,
and users need to open mddev before submitting ioctl. So mddev must not
have been freed during ioctl, and there is no need to check mddev here.
Clean up it.

Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240226031444.3606764-4-linan666@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: do not delete safemode_timer in mddev_suspend</title>
<updated>2024-08-14T11:52:45+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2024-05-08T09:20:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=38ea2243a7b2b1217d073bf1965cd86d6c71d617'/>
<id>urn:sha1:38ea2243a7b2b1217d073bf1965cd86d6c71d617</id>
<content type='text'>
[ Upstream commit a8768a134518e406d41799a3594aeb74e0889cf7 ]

The deletion of safemode_timer in mddev_suspend() is redundant and
potentially harmful now. If timer is about to be woken up but gets
deleted, 'in_sync' will remain 0 until the next write, causing array
to stay in the 'active' state instead of transitioning to 'clean'.

Commit 0d9f4f135eb6 ("MD: Add del_timer_sync to mddev_suspend (fix
nasty panic))" introduced this deletion for dm, because if timer fired
after dm is destroyed, the resource which the timer depends on might
have been freed.

However, commit 0dd84b319352 ("md: call __md_stop_writes in md_stop")
added __md_stop_writes() to md_stop(), which is called before freeing
resource. Timer is deleted in __md_stop_writes(), and the origin issue
is resolved. Therefore, delete safemode_timer can be removed safely now.

Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240508092053.1447930-1-linan666@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: fix deadlock between mddev_suspend and flush bio</title>
<updated>2024-08-03T06:48:52+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2024-05-25T18:52:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=32226070813140234b6c507084738e8e8385c5c6'/>
<id>urn:sha1:32226070813140234b6c507084738e8e8385c5c6</id>
<content type='text'>
[ Upstream commit 611d5cbc0b35a752e657a83eebadf40d814d006b ]

Deadlock occurs when mddev is being suspended while some flush bio is in
progress. It is a complex issue.

T1. the first flush is at the ending stage, it clears 'mddev-&gt;flush_bio'
    and tries to submit data, but is blocked because mddev is suspended
    by T4.
T2. the second flush sets 'mddev-&gt;flush_bio', and attempts to queue
    md_submit_flush_data(), which is already running (T1) and won't
    execute again if on the same CPU as T1.
T3. the third flush inc active_io and tries to flush, but is blocked because
    'mddev-&gt;flush_bio' is not NULL (set by T2).
T4. mddev_suspend() is called and waits for active_io dec to 0 which is inc
    by T3.

  T1		T2		T3		T4
  (flush 1)	(flush 2)	(third 3)	(suspend)
  md_submit_flush_data
   mddev-&gt;flush_bio = NULL;
   .
   .	 	md_flush_request
   .	  	 mddev-&gt;flush_bio = bio
   .	  	 queue submit_flushes
   .		 .
   .		 .		md_handle_request
   .		 .		 active_io + 1
   .		 .		 md_flush_request
   .		 .		  wait !mddev-&gt;flush_bio
   .		 .
   .		 .				mddev_suspend
   .		 .				 wait !active_io
   .		 .
   .		 submit_flushes
   .		 queue_work md_submit_flush_data
   .		 //md_submit_flush_data is already running (T1)
   .
   md_handle_request
    wait resume

The root issue is non-atomic inc/dec of active_io during flush process.
active_io is dec before md_submit_flush_data is queued, and inc soon
after md_submit_flush_data() run.
  md_flush_request
    active_io + 1
    submit_flushes
      active_io - 1
      md_submit_flush_data
        md_handle_request
        active_io + 1
          make_request
        active_io - 1

If active_io is dec after md_handle_request() instead of within
submit_flushes(), make_request() can be called directly intead of
md_handle_request() in md_submit_flush_data(), and active_io will
only inc and dec once in the whole flush process. Deadlock will be
fixed.

Additionally, the only difference between fixing the issue and before is
that there is no return error handling of make_request(). But after
previous patch cleaned md_write_start(), make_requst() only return error
in raid5_make_request() by dm-raid, see commit 41425f96d7aa ("dm-raid456,
md/raid456: fix a deadlock for dm-raid456 while io concurrent with
reshape)". Since dm always splits data and flush operation into two
separate io, io size of flush submitted by dm always is 0, make_request()
will not be called in md_submit_flush_data(). To prevent future
modifications from introducing issues, add WARN_ON to ensure
make_request() no error is returned in this context.

Fixes: fa2bbff7b0b4 ("md: synchronize flush io with array reconfiguration")
Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240525185257.3896201-3-linan666@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: fix kmemleak of rdev-&gt;serial</title>
<updated>2024-05-17T09:56:24+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2024-02-08T08:55:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=beaf11969fd5cbe6f09cefaa34df1ce8578e8dd9'/>
<id>urn:sha1:beaf11969fd5cbe6f09cefaa34df1ce8578e8dd9</id>
<content type='text'>
commit 6cf350658736681b9d6b0b6e58c5c76b235bb4c4 upstream.

If kobject_add() is fail in bind_rdev_to_array(), 'rdev-&gt;serial' will be
alloc not be freed, and kmemleak occurs.

unreferenced object 0xffff88815a350000 (size 49152):
  comm "mdadm", pid 789, jiffies 4294716910
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc f773277a):
    [&lt;0000000058b0a453&gt;] kmemleak_alloc+0x61/0xe0
    [&lt;00000000366adf14&gt;] __kmalloc_large_node+0x15e/0x270
    [&lt;000000002e82961b&gt;] __kmalloc_node.cold+0x11/0x7f
    [&lt;00000000f206d60a&gt;] kvmalloc_node+0x74/0x150
    [&lt;0000000034bf3363&gt;] rdev_init_serial+0x67/0x170
    [&lt;0000000010e08fe9&gt;] mddev_create_serial_pool+0x62/0x220
    [&lt;00000000c3837bf0&gt;] bind_rdev_to_array+0x2af/0x630
    [&lt;0000000073c28560&gt;] md_add_new_disk+0x400/0x9f0
    [&lt;00000000770e30ff&gt;] md_ioctl+0x15bf/0x1c10
    [&lt;000000006cfab718&gt;] blkdev_ioctl+0x191/0x3f0
    [&lt;0000000085086a11&gt;] vfs_ioctl+0x22/0x60
    [&lt;0000000018b656fe&gt;] __x64_sys_ioctl+0xba/0xe0
    [&lt;00000000e54e675e&gt;] do_syscall_64+0x71/0x150
    [&lt;000000008b0ad622&gt;] entry_SYSCALL_64_after_hwframe+0x6c/0x74

Fixes: 963c555e75b0 ("md: introduce mddev_create/destroy_wb_pool for the change of member device")
Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240208085556.2412922-1-linan666@huaweicloud.com
[ mddev_destroy_serial_pool third parameter was removed in mainline,
  where there is no need to suspend within this function anymore. ]
Signed-off-by: Jeremy Bongio &lt;jbongio@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>md: Don't clear MD_CLOSING when the raid is about to stop</title>
<updated>2024-03-26T22:20:28+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2024-02-26T03:14:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d6c28aefe9b46583767b706dbf53e9bcf7552b72'/>
<id>urn:sha1:d6c28aefe9b46583767b706dbf53e9bcf7552b72</id>
<content type='text'>
[ Upstream commit 9674f54e41fffaf06f6a60202e1fa4cc13de3cf5 ]

The raid should not be opened anymore when it is about to be stopped.
However, other processes can open it again if the flag MD_CLOSING is
cleared before exiting. From now on, this flag will not be cleared when
the raid will be stopped.

Fixes: 065e519e71b2 ("md: MD_CLOSING needs to be cleared after called md_set_readonly or do_md_stop")
Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240226031444.3606764-6-linan666@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: fix data corruption for raid456 when reshape restart while grow up</title>
<updated>2024-03-26T22:20:22+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2023-05-12T01:56:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7512a70376f584589ada6363a8918fadd90189ed'/>
<id>urn:sha1:7512a70376f584589ada6363a8918fadd90189ed</id>
<content type='text'>
[ Upstream commit 873f50ece41aad5c4f788a340960c53774b5526e ]

Currently, if reshape is interrupted, echo "reshape" to sync_action will
restart reshape from scratch, for example:

echo frozen &gt; sync_action
echo reshape &gt; sync_action

This will corrupt data before reshape_position if the array is growing,
fix the problem by continue reshape from reshape_position.

Reported-by: Peter Neuwirth &lt;reddunur@online.de&gt;
Link: https://lore.kernel.org/linux-raid/e2f96772-bfbc-f43b-6da1-f520e5164536@online.de/
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230512015610.821290-3-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: Fix missing release of 'active_io' for flush</title>
<updated>2024-03-01T12:26:32+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-02-01T09:25:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6b2ff10390b19a2364af622b6666b690443f9f3f'/>
<id>urn:sha1:6b2ff10390b19a2364af622b6666b690443f9f3f</id>
<content type='text'>
commit 855678ed8534518e2b428bcbcec695de9ba248e8 upstream.

submit_flushes
 atomic_set(&amp;mddev-&gt;flush_pending, 1);
 rdev_for_each_rcu(rdev, mddev)
  atomic_inc(&amp;mddev-&gt;flush_pending);
  bi-&gt;bi_end_io = md_end_flush
  submit_bio(bi);
                        /* flush io is done first */
                        md_end_flush
                         if (atomic_dec_and_test(&amp;mddev-&gt;flush_pending))
                          percpu_ref_put(&amp;mddev-&gt;active_io)
                          -&gt; active_io is not released

 if (atomic_dec_and_test(&amp;mddev-&gt;flush_pending))
  -&gt; missing release of active_io

For consequence, mddev_suspend() will wait for 'active_io' to be zero
forever.

Fix this problem by releasing 'active_io' in submit_flushes() if
'flush_pending' is decreased to zero.

Fixes: fa2bbff7b0b4 ("md: synchronize flush io with array reconfiguration")
Cc: stable@vger.kernel.org # v6.1+
Reported-by: Blazej Kucman &lt;blazej.kucman@linux.intel.com&gt;
Closes: https://lore.kernel.org/lkml/20240130172524.0000417b@linux.intel.com/
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240201092559.910982-7-yukuai1@huaweicloud.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>md: bypass block throttle for superblock update</title>
<updated>2024-02-23T08:12:48+00:00</updated>
<author>
<name>Junxiao Bi</name>
<email>junxiao.bi@oracle.com</email>
</author>
<published>2023-11-08T18:22:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4bf19cef220aaff4b026143518916355806006e7'/>
<id>urn:sha1:4bf19cef220aaff4b026143518916355806006e7</id>
<content type='text'>
[ Upstream commit d6e035aad6c09991da1c667fb83419329a3baed8 ]

commit 5e2cf333b7bd ("md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d")
introduced a hung bug and will be reverted in next patch, since the issue
that commit is fixing is due to md superblock write is throttled by wbt,
to fix it, we can have superblock write bypass block layer throttle.

Fixes: 5e2cf333b7bd ("md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d")
Cc: stable@vger.kernel.org # v5.19+
Suggested-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Junxiao Bi &lt;junxiao.bi@oracle.com&gt;
Reviewed-by: Logan Gunthorpe &lt;logang@deltatee.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231108182216.73611-1-junxiao.bi@oracle.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: Whenassemble the array, consult the superblock of the freshest device</title>
<updated>2024-02-05T20:12:53+00:00</updated>
<author>
<name>Alex Lyakas</name>
<email>alex.lyakas@zadara.com</email>
</author>
<published>2023-12-13T12:24:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fd9a2c700323f6e9e2f201f6adfae079f740c9ff'/>
<id>urn:sha1:fd9a2c700323f6e9e2f201f6adfae079f740c9ff</id>
<content type='text'>
[ Upstream commit dc1cc22ed58f11d58d8553c5ec5f11cbfc3e3039 ]

Upon assembling the array, both kernel and mdadm allow the devices to have event
counter difference of 1, and still consider them as up-to-date.
However, a device whose event count is behind by 1, may in fact not be up-to-date,
and array resync with such a device may cause data corruption.
To avoid this, consult the superblock of the freshest device about the status
of a device, whose event counter is behind by 1.

Signed-off-by: Alex Lyakas &lt;alex.lyakas@zadara.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/1702470271-16073-1-git-send-email-alex.lyakas@zadara.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: synchronize flush io with array reconfiguration</title>
<updated>2024-01-25T23:27:25+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2023-11-29T02:02:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f9f2d957a8ea93c73182aebf7de30935a58c027d'/>
<id>urn:sha1:f9f2d957a8ea93c73182aebf7de30935a58c027d</id>
<content type='text'>
[ Upstream commit fa2bbff7b0b4e211fec5e5686ef96350690597b5 ]

Currently rcu is used to protect iterating rdev from submit_flushes():

submit_flushes			remove_and_add_spares
				synchronize_rcu
				pers-&gt;hot_remove_disk()
 rcu_read_lock()
 rdev_for_each_rcu
  if (rdev-&gt;raid_disk &gt;= 0)
				rdev-&gt;radi_disk = -1;
   atomic_inc(&amp;rdev-&gt;nr_pending)
   rcu_read_unlock()
   bi = bio_alloc_bioset()
   bi-&gt;bi_end_io = md_end_flush
   bi-&gt;private = rdev
   submit_bio
   // issue io for removed rdev

Fix this problem by grabbing 'acive_io' before iterating rdev, make sure
that remove_and_add_spares() won't concurrent with submit_flushes().

Fixes: a2826aa92e2e ("md: support barrier requests on all personalities.")
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231129020234.1586910-1-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
