kernel/linux.git/drivers/md/raid1.c, branch v4.14.263

md/raid1: properly indicate failure when ending a failed write request

2021-05-22T08:57:21+00:00

commit 2417b9869b81882ab90fd5ed1081a1cb2d4db1dd upstream. This patch addresses a data corruption bug in raid1 arrays using bitmaps. Without this fix, the bitmap bits for the failed I/O end up being cleared. Since we are in the failure leg of raid1_end_write_request, the request either needs to be retried (R1BIO_WriteError) or failed (R1BIO_Degraded). Fixes: eeba6809d8d5 ("md/raid1: end bio when the device faulty") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Paul Clements Signed-off-by: Song Liu Signed-off-by: Greg Kroah-Hartman

md: raid1: check rdev before reference in raid1_sync_request func

2020-01-09T09:17:52+00:00

[ Upstream commit 028288df635f5a9addd48ac4677b720192747944 ] In raid1_sync_request func, rdev should be checked before reference. Signed-off-by: Zhiqiang Liu Signed-off-by: Song Liu Signed-off-by: Sasha Levin

md/raid1: fail run raid1 array when active disk less than one

2019-10-05T10:48:00+00:00

[ Upstream commit 07f1a6850c5d5a65c917c3165692b5179ac4cb6b ] When run test case: mdadm -CR /dev/md1 -l 1 -n 4 /dev/sd[a-d] --assume-clean --bitmap=internal mdadm -S /dev/md1 mdadm -A /dev/md1 /dev/sd[b-c] --run --force mdadm --zero /dev/sda mdadm /dev/md1 -a /dev/sda echo offline > /sys/block/sdc/device/state echo offline > /sys/block/sdb/device/state sleep 5 mdadm -S /dev/md1 echo running > /sys/block/sdb/device/state echo running > /sys/block/sdc/device/state mdadm -A /dev/md1 /dev/sd[a-c] --run --force mdadm run fail with kernel message as follow: [ 172.986064] md: kicking non-fresh sdb from array! [ 173.004210] md: kicking non-fresh sdc from array! [ 173.022383] md/raid1:md1: active with 0 out of 4 mirrors [ 173.022406] md1: failed to create bitmap (-5) In fact, when active disk in raid1 array less than one, we need to return fail in raid1_run(). Reviewed-by: NeilBrown Signed-off-by: Yufen Yu Signed-off-by: Song Liu Signed-off-by: Sasha Levin

md/raid1: end bio when the device faulty

2019-10-05T10:47:49+00:00

[ Upstream commit eeba6809d8d58908b5ed1b5ceb5fcb09a98a7cad ] When write bio return error, it would be added to conf->retry_list and wait for raid1d thread to retry write and acknowledge badblocks. In narrow_write_error(), the error bio will be split in the unit of badblock shift (such as one sector) and raid1d thread issues them one by one. Until all of the splited bio has finished, raid1d thread can go on processing other things, which is time consuming. But, there is a scene for error handling that is not necessary. When the device has been set faulty, flush_bio_list() may end bios in pending_bio_list with error status. Since these bios has not been issued to the device actually, error handlding to retry write and acknowledge badblocks make no sense. Even without that scene, when the device is faulty, badblocks info can not be written out to the device. Thus, we also no need to handle the error IO. Signed-off-by: Yufen Yu Signed-off-by: Song Liu Signed-off-by: Sasha Levin

md/raid1: don't clear bitmap bits on interrupted recovery.

2019-02-20T09:20:54+00:00

commit dfcc34c99f3ebc16b787b118763bf9cb6b1efc7a upstream. sync_request_write no longer submits writes to a Faulty device. This has the unfortunate side effect that bitmap bits can be incorrectly cleared if a recovery is interrupted (previously, end_sync_write would have prevented this). This means the next recovery may not copy everything it should, potentially corrupting data. Add a function for doing the proper md_bitmap_end_sync, called from end_sync_write and the Faulty case in sync_request_write. backport note to 4.14: s/md_bitmap_end_sync/bitmap_end_sync Cc: stable@vger.kernel.org 4.14+ Fixes: 0c9d5b127f69 ("md/raid1: avoid reusing a resync bio after error handling.") Reviewed-by: Jack Wang Tested-by: Jack Wang Signed-off-by: Nate Dailey Signed-off-by: Song Liu Signed-off-by: Greg Kroah-Hartman

MD: fix invalid stored role for a disk - try2

2018-11-13T19:15:18+00:00

commit 9e753ba9b9b405e3902d9f08aec5f2ea58a0c317 upstream. Commit d595567dc4f0 (MD: fix invalid stored role for a disk) broke linear hotadd. Let's only fix the role for disks in raid1/10. Based on Guoqing's original patch. Reported-by: kernel test robot Cc: Gioh Kim Cc: Guoqing Jiang Signed-off-by: Shaohua Li Signed-off-by: Greg Kroah-Hartman

md/raid1: add error handling of read error from FailFast device

2018-08-03T05:50:33+00:00

[ Upstream commit b33d10624fdc15cdf1495f3f00481afccec76783 ] Current handle_read_error() function calls fix_read_error() only if md device is RW and rdev does not include FailFast flag. It does not handle a read error from a RW device including FailFast flag. I am not sure it is intended. But I found that write IO error sets rdev faulty. The md module should handle the read IO error and write IO error equally. So I think read IO error should set rdev faulty. Signed-off-by: Gioh Kim Reviewed-by: Jack Wang Signed-off-by: Shaohua Li Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman

md: remove special meaning of ->quiesce(.., 2)

2018-07-08T13:30:50+00:00

commit b03e0ccb5ab9df3efbe51c87843a1ffbecbafa1f upstream. The '2' argument means "wake up anything that is waiting". This is an inelegant part of the design and was added to help support management of suspend_lo/suspend_hi setting. Now that suspend_lo/hi is managed in mddev_suspend/resume, that need is gone. These is still a couple of places where we call 'quiesce' with an argument of '2', but they can safely be changed to call ->quiesce(.., 1); ->quiesce(.., 0) which achieve the same result at the small cost of pausing IO briefly. This removes a small "optimization" from suspend_{hi,lo}_store, but it isn't clear that optimization served a useful purpose. The code now is a lot clearer. Suggested-by: Shaohua Li Signed-off-by: NeilBrown Signed-off-by: Shaohua Li Signed-off-by: Jack Wang Signed-off-by: Greg Kroah-Hartman

md: move suspend_hi/lo handling into core md code

2018-07-08T13:30:50+00:00

commit b3143b9a38d5039bcd1f2d1c94039651bfba8043 upstream. responding to ->suspend_lo and ->suspend_hi is similar to responding to ->suspended. It is best to wait in the common core code without incrementing ->active_io. This allows mddev_suspend()/mddev_resume() to work while requests are waiting for suspend_lo/hi to change. This is will be important after a subsequent patch which uses mddev_suspend() to synchronize updating for suspend_lo/hi. So move the code for testing suspend_lo/hi out of raid1.c and raid5.c, and place it in md.c Signed-off-by: NeilBrown Signed-off-by: Shaohua Li Signed-off-by: Jack Wang Signed-off-by: Greg Kroah-Hartman

md/raid1: fix NULL pointer dereference

2018-05-30T05:52:03+00:00

[ Upstream commit 3de59bb9d551428cbdc76a9ea57883f82e350b4d ] In handle_write_finished(), if r1_bio->bios[m] != NULL, it thinks the corresponding conf->mirrors[m].rdev is also not NULL. But, it is not always true. Even if some io hold replacement rdev(i.e. rdev->nr_pending.count > 0), raid1_remove_disk() can also set the rdev as NULL. That means, bios[m] != NULL, but mirrors[m].rdev is NULL, resulting in NULL pointer dereference in handle_write_finished and sync_request_write. This patch can fix BUGs as follows: BUG: unable to handle kernel NULL pointer dereference at 0000000000000140 IP: [] raid1d+0x2bd/0xfc0 PGD 12ab52067 PUD 12f587067 PMD 0 Oops: 0000 [#1] SMP CPU: 1 PID: 2008 Comm: md3_raid1 Not tainted 4.1.44+ #130 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014 Call Trace: ? schedule+0x37/0x90 ? prepare_to_wait_event+0x83/0xf0 md_thread+0x144/0x150 ? wake_atomic_t_function+0x70/0x70 ? md_start_sync+0xf0/0xf0 kthread+0xd8/0xf0 ? kthread_worker_fn+0x160/0x160 ret_from_fork+0x42/0x70 ? kthread_worker_fn+0x160/0x160 BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8 IP: sync_request_write+0x9e/0x980 PGD 800000007c518067 P4D 800000007c518067 PUD 8002b067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 24 PID: 2549 Comm: md3_raid1 Not tainted 4.15.0+ #118 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014 Call Trace: ? sched_clock+0x5/0x10 ? sched_clock_cpu+0xc/0xb0 ? flush_pending_writes+0x3a/0xd0 ? pick_next_task_fair+0x4d5/0x5f0 ? __switch_to+0xa2/0x430 raid1d+0x65a/0x870 ? find_pers+0x70/0x70 ? find_pers+0x70/0x70 ? md_thread+0x11c/0x160 md_thread+0x11c/0x160 ? finish_wait+0x80/0x80 kthread+0x111/0x130 ? kthread_create_worker_on_cpu+0x70/0x70 ? do_syscall_64+0x6f/0x190 ? SyS_exit_group+0x10/0x10 ret_from_fork+0x35/0x40 Reviewed-by: NeilBrown Signed-off-by: Yufen Yu Signed-off-by: Shaohua Li Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman