kernel/linux.git/drivers/md/raid5.c, branch v3.0.35

md/raid5: fix bug that could result in reads from a failed device.

2011-12-21T20:57:42+00:00

commit 355840e7a7e56bb2834fd3b0da64da5465f8aeaa upstream. commit a847627709b3402163d99f7c6fda4a77bcd6b51b in linux-3.0.9 attempted to backport this to 3.0 but only made one change were two were necessary. This add the second change. This bug was introduced in 415e72d034c50520ddb7ff79e7d1792c1306f0c9 which was in 2.6.36. There is a small window of time between when a device fails and when it is removed from the array. During this time we might still read from it, but we won't write to it - so it is possible that we could read stale data. We didn't need the test of 'Faulty' before because the test on In_sync is sufficient. Since we started allowing reads from the early part of non-In_sync devices we need a test on Faulty too. This is suitable for any kernel from 2.6.36 onwards, though the patch might need a bit of tweaking in 3.0 and earlier. Signed-off-by: NeilBrown Signed-off-by: Greg Kroah-Hartman

md/raid5: abort any pending parity operations when array fails.

2011-11-21T22:31:22+00:00

commit 9a3f530f39f4490eaa18b02719fb74ce5f4d2d86 upstream. When the number of failed devices exceeds the allowed number we must abort any active parity operations (checks or updates) as they are no longer meaningful, and can lead to a BUG_ON in handle_parity_checks6. This bug was introduce by commit 6c0069c0ae9659e3a91b68eaed06a5c6c37f45c8 in 2.6.29. Reported-by: Manish Katiyar Tested-by: Manish Katiyar Acked-by: Dan Williams Signed-off-by: NeilBrown Signed-off-by: Greg Kroah-Hartman

md/raid5: fix bug that could result in reads from a failed device.

2011-11-11T17:35:53+00:00

commit 355840e7a7e56bb2834fd3b0da64da5465f8aeaa upstream. This bug was introduced in 415e72d034c50520ddb7ff79e7d1792c1306f0c9 which was in 2.6.36. There is a small window of time between when a device fails and when it is removed from the array. During this time we might still read from it, but we won't write to it - so it is possible that we could read stale data. We didn't need the test of 'Faulty' before because the test on In_sync is sufficient. Since we started allowing reads from the early part of non-In_sync devices we need a test on Faulty too. This is suitable for any kernel from 2.6.36 onwards, though the patch might need a bit of tweaking in 3.0 and earlier. Signed-off-by: NeilBrown Signed-off-by: Greg Kroah-Hartman

md: Avoid waking up a thread after it has been freed.

2011-10-16T21:14:53+00:00

commit 01f96c0a9922cd9919baf9d16febdf7016177a12 upstream. Two related problems: 1/ some error paths call "md_unregister_thread(mddev->thread)" without subsequently clearing ->thread. A subsequent call to mddev_unlock will try to wake the thread, and crash. 2/ Most calls to md_wakeup_thread are protected against the thread disappeared either by: - holding the ->mutex - having an active request, so something else must be keeping the array active. However mddev_unlock calls md_wakeup_thread after dropping the mutex and without any certainty of an active request, so the ->thread could theoretically disappear. So we need a spinlock to provide some protections. So change md_unregister_thread to take a pointer to the thread pointer, and ensure that it always does the required locking, and clears the pointer properly. Reported-by: "Moshe Melnikov" Signed-off-by: NeilBrown Signed-off-by: Greg Kroah-Hartman

md/raid5: remove unusual use of bio_iovec_idx()

2011-06-14T04:23:57+00:00

In the bio_for_each_segment loop, bvl always points current bio_vec, so the same as bio_iovec_idx(, i). Let's get rid of it. Cc: Dan Williams Signed-off-by: Namhyung Kim Signed-off-by: NeilBrown

md/raid5: fix FUA request handling in ops_run_io()

2011-06-14T04:20:19+00:00

Commit e9c7469bb4f5 ("md: implment REQ_FLUSH/FUA support") introduced R5_WantFUA flag and set rw to WRITE_FUA in that case. However remaining code still checks whether rw is exactly same as WRITE or not, so FUAed-write ends up with being treated as READ. Fix it. This bug has been present since 2.6.37 and the fix is suitable for any -stable kernel since then. It is not clear why this has not caused more problems. Cc: Tejun Heo Cc: stable@kernel.org Signed-off-by: Namhyung Kim Signed-off-by: NeilBrown

md/raid5: fix raid5_set_bi_hw_segments

2011-06-14T04:09:41+00:00

The @bio->bi_phys_segments consists of active stripes count in the lower 16 bits and processed stripes count in the upper 16 bits. So logical-OR operator should be bitwise one. This bug has been present since 2.6.27 and the fix is suitable for any -stable kernel since then. Fortunately the bad code is only used on error paths and is relatively unlikely to be hit. Cc: stable@kernel.org Signed-off-by: Namhyung Kim Signed-off-by: NeilBrown

MD: raid5 do not set fullsync

2011-06-09T01:42:29+00:00

Add check to determine if a device needs full resync or if partial resync will do RAID 5 was assuming that if a device was not In_sync, it must undergo a full resync. We add a check to see if 'saved_raid_disk' is the same as 'raid_disk'. If it is, we can safely skip the full resync and rely on the bitmap for partial recovery instead. This is the legitimate purpose of 'saved_raid_disk', from md.h: int saved_raid_disk; /* role that device used to have in the * array and could again if we did a partial * resync from the bitmap */ Signed-off-by: Jonathan Brassow Signed-off-by: NeilBrown

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial

2011-05-23T16:12:26+00:00

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) b43: fix comment typo reqest -> request Haavard Skinnemoen has left Atmel cris: typo in mach-fs Makefile Kconfig: fix copy/paste-ism for dell-wmi-aio driver doc: timers-howto: fix a typo ("unsgined") perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course'). treewide: fix a few typos in comments regulator: change debug statement be consistent with the style of the rest Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations" audit: acquire creds selectively to reduce atomic op overhead rtlwifi: don't touch with treewide double semicolon removal treewide: cleanup continuations and remove logging message whitespace ath9k_hw: don't touch with treewide double semicolon removal include/linux/leds-regulator.h: fix syntax in example code tty: fix typo in descripton of tty_termios_encode_baud_rate xtensa: remove obsolete BKL kernel option from defconfig m68k: fix comment typo 'occcured' arch:Kconfig.locks Remove unused config option. treewide: remove extra semicolons ...

md: allow resync_start to be set while an array is active.

2011-05-11T05:52:21+00:00

The sysfs attribute 'resync_start' (known internally as recovery_cp), records where a resync is up to. A value of 0 means the array is not known to be in-sync at all. A value of MaxSector means the array is believed to be fully in-sync. When the size of member devices of an array (RAID1,RAID4/5/6) is increased, the array can be increased to match. This process sets resync_start to the old end-of-device offset so that the new part of the array gets resynced. However with RAID1 (and RAID6) a resync is not technically necessary and may be undesirable. So it would be good if the implied resync after the array is resized could be avoided. So: change 'resync_start' so the value can be changed while the array is active, and as a precaution only allow it to be changed while resync/recovery is 'frozen'. Changing it once resync has started is not going to be useful anyway. This allows the array to be resized without a resync by: write 'frozen' to 'sync_action' write new size to 'component_size' (this will set resync_start) write 'none' to 'resync_start' write 'idle' to 'sync_action'. Also slightly improve some tests on recovery_cp when resizing raid1/raid5. Now that an arbitrary value could be set we should be more careful in our tests. Signed-off-by: NeilBrown