summaryrefslogtreecommitdiff
path: root/drivers/md
AgeCommit message (Collapse)AuthorFilesLines
2012-04-03md/raid1,raid10: don't compare excess byte during consistency check.NeilBrown2-2/+2
When comparing two pages read from different legs of a mirror, only compare the bytes that were read, not the whole page. In most cases we read a whole page, but in some cases with bad blocks or odd sizes devices we might read fewer than that. This bug has been present "forever" but at worst it might cause a report of two many mismatches and generate a little bit extra resync IO, so there is no need to back-port to -stable kernels. Reported-by: majianpeng <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-03md/raid5: Fix a bug about judging if the operation is syncing or replacingmajianpeng1-1/+3
When create a raid5 using assume-clean and echo check or repair to sync_action.Then component disks did not operated IO but the raid check/resync faster than normal. Because the judgement in function analyse_stripe(): if (do_recovery || sh->sector >= conf->mddev->recovery_cp) s->syncing = 1; else s->replacing = 1; When check or repair,the recovery_cp == MaxSectore,so syncing equal zero not one. This bug was introduced by commit 9a3e1101b827 md/raid5: detect and handle replacements during recovery. so this patch is suitable for 3.3-stable. Cc: stable@vger.kernel.org Signed-off-by: majianpeng <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-03md/raid1:Remove unnecessary rcu_dereference(conf->mirrors[i].rdev).majianpeng1-2/+1
Because rde->nr_pending > 0,so can not remove this disk. And in any case, we aren't holding rcu_read_lock() Signed-off-by: majianpeng <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-03md: Avoid OOPS when reshaping raid1 to raid0Jes Sorensen1-1/+17
raid1 arrays do not have the notion of chunk size. Calculate the largest chunk sector size we can use to avoid a divide by zero OOPS when aligning the size of the new array to the chunk size. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-03md/raid5: fix handling of bad blocks during recovery.NeilBrown1-26/+29
1/ We can only treat a known-bad-block like a read-error if we have the data that belongs in that block. So fix that test. 2/ If we cannot recovery a stripe due to insufficient data, don't tell "md_done_sync" that the sync failed unless we really did fail something. If we successfully record bad blocks, that is success. Reported-by: "majianpeng" <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-02md/raid1: If md_integrity_register() failed,run() must free the memmajianpeng1-1/+7
Signed-off-by: majianpeng <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-02md/raid0: If md_integrity_register() fails, raid0_run() must free the mem.majianpeng1-1/+8
Signed-off-by: majianpeng <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
2012-04-02md/linear: If md_integrity_register() fails, linear_run() must free the mem.majianpeng1-1/+8
Signed-off-by: majianpeng <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-28dm: add verity targetMikulas Patocka3-0/+934
This device-mapper target creates a read-only device that transparently validates the data on one underlying device against a pre-generated tree of cryptographic checksums stored on a second device. Two checksum device formats are supported: version 0 which is already shipping in Chromium OS and version 1 which incorporates some improvements. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Signed-off-by: Will Drewry <wad@chromium.org> Signed-off-by: Elly Jones <ellyjones@chromium.org> Cc: Milan Broz <mbroz@redhat.com> Cc: Olof Johansson <olofj@chromium.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm bufio: prefetchMikulas Patocka2-26/+90
This patch introduces a new function dm_bufio_prefetch. It prefetches the specified range of blocks into dm-bufio cache without waiting for i/o completion. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm thin: add pool target flags to control discardJoe Thornber1-27/+108
Add dm thin target arguments to control discard support. ignore_discard: Disables discard support no_discard_passdown: Don't pass discards down to the underlying data device, but just remove the mapping within the thin provisioning target. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm thin: support discardsJoe Thornber1-14/+158
Support discards in the thin target. On discard the corresponding mapping(s) are removed from the thin device. If the associated block(s) are no longer shared the discard is passed to the underlying device. All bios other than discards now have an associated deferred_entry that is saved to the 'all_io_entry' in endio_hook. When non-discard IO completes and associated mappings are quiesced any discards that were deferred, via ds_add_work() in process_discard(), will be queued for processing by the worker thread. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> drivers/md/dm-thin.c | 173 ++++++++++++++++++++++++++++++++++++++++++++++---- drivers/md/dm-thin.c | 172 ++++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 158 insertions(+), 14 deletions(-)
2012-03-28dm thin: prepare to support discardJoe Thornber1-53/+72
This patch contains the ground work needed for dm-thin to support discard. - Adds endio function that replaces shared_read_endio. - Introduce an explicit 'quiesced' flag into the new_mapping structure. Before, this was implicitly indicated by m->list being empty. - The map_info->ptr remains constant for the duration of a bio's trip through the thin target. Make it easier to reason about it. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm thin: use dm_target_offsetAlasdair G Kergon1-1/+1
Use dm_target_offset wrapper instead of referencing the awkward ti->begin explicitly. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm thin: support read only external snapshot originsJoe Thornber1-14/+71
Support the use of an external _read only_ device as an origin for a thin device. Any read to an unprovisioned area of the thin device will be passed through to the origin. Writes trigger allocation of new blocks as usual. One possible use case for this would be VM hosts that want to run guests on thinly-provisioned volumes but have the base image on another device (possibly shared between many VMs). Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm thin: relax hard limit on the maximum size of a metadata deviceMike Snitzer3-15/+20
The thin metadata format can only make use of a device that is <= THIN_METADATA_MAX_SECTORS (currently 15.9375 GB). Therefore, there is no practical benefit to using a larger device. However, it may be that other factors impose a certain granularity for the space that is allocated to a device (E.g. lvm2 can impose a coarse granularity through the use of large, >= 1 GB, physical extents). Rather than reject a larger metadata device, during thin-pool device construction, switch to allowing it but issue a warning if a device larger than THIN_METADATA_MAX_SECTORS_WARNING (16 GB) is provided. Any space over 15.9375 GB will not be used. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm persistent data: remove space map ref_count entries if redundantJoe Thornber1-3/+0
Save space by removing entries from the space map ref_count tree if they're no longer needed. Ref counts are stored in two places: a bitmap if the ref_count is below 3, or a btree of uint32_t if 3 or above. When a ref_count that was above 3 drops below we can remove it from the tree and save some metadata space. This removal was commented out before because I was unsure why this was causing under-populated btree nodes. Earlier patches have fixed this issue. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm thin: commit outstanding data every secondJoe Thornber1-2/+26
Commit unwritten data every second to prevent too much building up. Released blocks don't become available until after the next commit (for crash resilience). Prior to this patch commits were only triggered by a message to the target or a REQ_{FLUSH,FUA} bio. This allowed far too big a position to build up. The interval is hard-coded to 1 second. This is a sensible setting. I'm not making this user configurable, since there isn't much to be gained by tweaking this - and a lot lost by setting it far too high. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm: reject trailing characters in sccanf inputMikulas Patocka13-25/+44
Device mapper uses sscanf to convert arguments to numbers. The problem is that the way we use it ignores additional unmatched characters in the scanned string. For example, this `if (sscanf(string, "%d", &number) == 1)' will match a number, but also it will match number with some garbage appended, like "123abc". As a result, device mapper accepts garbage after some numbers. For example the command `dmsetup create vg1-new --table "0 16384 linear 254:1bla 34816bla"' will pass without an error. This patch fixes all sscanf uses in device mapper. It appends "%c" with a pointer to a dummy character variable to every sscanf statement. The construct `if (sscanf(string, "%d%c", &number, &dummy) == 1)' succeeds only if string is a null-terminated number (optionally preceded by some whitespace characters). If there is some character appended after the number, sscanf matches "%c", writes the character to the dummy variable and returns 2. We check the return value for 1 and consequently reject numbers with some garbage appended. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm raid: handle failed devices during start upJonathan E Brassow1-2/+51
The dm-raid code currently fails to create a RAID array if any of the superblocks cannot be read. This was an oversight as there is already code to handle this case if the values ('- -') were provided for the failed array position. With this patch, if a superblock cannot be read, the array position's fields are initialized as though '- -' was set in the table. That is, the device is failed and the position should not be used, but if there is sufficient redundancy, the array should still be activated. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm thin metadata: pass correct space map to dm_sm_root_sizeJoe Thornber1-1/+1
Fix a harmless typo. The root is a chunk of data that gets written to the superblock. This data is used to recreate the space map when opening a metadata area. We have two space maps; one tracking space on the metadata device and one of the data device. Both of these use the same format for their root, so this typo was harmless. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm persistent data: remove redundant value_size arg from value_ptrJoe Thornber3-33/+29
Now that the value_size is held within every node of the btrees we can remove this argument from value_ptr(). For the last few months a BUG_ON has been checking this argument is the same as that held in the node. No issues were reported. So this is a safe change. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm mpath: detect invalid map_contextJun'ichi Nomura1-14/+32
The map_context pointer should always be set. However, we have reports that upon requeuing it is not set correctly. So add set and clear functions with a BUG_ON() to track the issue properly. Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Mike Snitzer <snitzer@redhat.com> Acked-by: Hannes Reinecke <hare@suse.de> Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm: clear bi_end_io on remapping failureHannes Reinecke1-0/+1
As a precaution, set bi_end_io to NULL when failing to remap. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm table: simplify call to free_devicesHannes Reinecke1-2/+1
free_devices in dm_table.c already uses list_for_each(), so we don't need to check if the list is empty. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm thin: correct commentsJoe Thornber1-1/+1
Remove documentation for unimplemented 'trim' message. I'd planned a 'trim' target message for shrinking thin devices, but this is better handled via the discard ioctl. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm raid: no longer experimentalAlasdair G Kergon1-2/+2
The dm raid module (using md) is becoming the preferred way of creating long-lived mirrors through userspace LVM so remove the EXPERIMENTAL tag. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm uevent: no longer experimentalAlasdair G Kergon1-2/+2
Drop EXPERIMENTAL tag from dm-uevent. It's not changed for a while and some userspace tools are relying upon it. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm persistent data: fix btree rebalancing after removeJoe Thornber1-75/+99
When we remove an entry from a node we sometimes rebalance with it's two neighbours. This wasn't being done correctly; in some cases entries have to move all the way from the right neighbour to the left neighbour, or vice versa. This patch pretty much re-writes the balancing code to fix it. This code is barely used currently; only when you delete a thin device, and then only if you have hundreds of them in the same pool. Once we have discard support, which removes mappings, this will be used much more heavily. Signed-off-by: Joe Thornber <ejt@redhat.com> Cc: stable@kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm thin: fix stacked bi_next usageJoe Thornber1-51/+73
Avoid using the bi_next field for the holder of a cell when deferring bios because a stacked device below might change it. Store the holder in a new field in struct cell instead. When a cell is created, the bio that triggered creation (the holder) was added to the same bio list as subsequent bios. In some cases we pass this holder bio directly to devices underneath. If those devices use the bi_next field there will be trouble... This also simplifies some code that had to work out which bio was the holder. Signed-off-by: Joe Thornber <ejt@redhat.com> Cc: stable@kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm crypt: add missing error handlingMikulas Patocka1-12/+16
Always set io->error to -EIO when an error is detected in dm-crypt. There were cases where an error code would be set only if we finish processing the last sector. If there were other encryption operations in flight, the error would be ignored and bio would be returned with success as if no error happened. This bug is present in kcryptd_crypt_write_convert, kcryptd_crypt_read_convert and kcryptd_async_done. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@kernel.org Reviewed-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm crypt: fix mempool deadlockMikulas Patocka1-6/+4
This patch fixes a possible deadlock in dm-crypt's mempool use. Currently, dm-crypt reserves a mempool of MIN_BIO_PAGES reserved pages. It allocates first MIN_BIO_PAGES with non-failing allocation (the allocation cannot fail and waits until the mempool is refilled). Further pages are allocated with different gfp flags that allow failing. Because allocations may be done in parallel, this code can deadlock. Example: There are two processes, each tries to allocate MIN_BIO_PAGES and the processes run simultaneously. It may end up in a situation where each process allocates (MIN_BIO_PAGES / 2) pages. The mempool is exhausted. Each process waits for more pages to be freed to the mempool, which never happens. To avoid this deadlock scenario, this patch changes the code so that only the first page is allocated with non-failing gfp mask. Allocation of further pages may fail. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@kernel.org Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm exception store: fix init error pathAndrei Warkentin1-1/+1
Call the correct exit function on failure in dm_exception_store_init. Signed-off-by: Andrei Warkentin <andrey.warkentin@gmail.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@kernel.org Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-22Merge tag 'md-3.4' of git://neil.brown.name/mdLinus Torvalds13-373/+491
Pull md updates for 3.4 from Neil Brown: "Mostly tidying up code in preparation for some bigger changes next time. A few bug fixes tagged for -stable. Main functionality change is that some RAID10 arrays can now grow to use extra space that may have been made available on the individual devices." Fixed up trivial conflicts with the k[un]map_atomic() cleanups in drivers/md/bitmap.c. * tag 'md-3.4' of git://neil.brown.name/md: (22 commits) md: Add judgement bb->unacked_exist in function md_ack_all_badblocks(). md: fix clearing of the 'changed' flags for the bad blocks list. md/bitmap: discard CHUNK_BLOCK_SHIFT macro md/bitmap: remove unnecessary indirection when allocating. md/bitmap: remove some pointless locking. md/bitmap: change a 'goto' to a normal 'if' construct. md/bitmap: move printing of bitmap status to bitmap.c md/bitmap: remove some unused noise from bitmap.h md/raid10 - support resizing some RAID10 arrays. md/raid1: handle merge_bvec_fn in member devices. md/raid10: handle merge_bvec_fn in member devices. md: add proper merge_bvec handling to RAID0 and Linear. md: tidy up rdev_for_each usage. md/raid1,raid10: avoid deadlock during resync/recovery. md/bitmap: ensure to load bitmap when creating via sysfs. md: don't set md arrays to readonly on shutdown. md: allow re-add to failed arrays. md/raid5: use atomic_dec_return() instead of atomic_dec() and atomic_read(). md: Use existed macros instead of numbers md/raid5: removed unused 'added_devices' variable. ...
2012-03-21Merge branch 'kmap_atomic' of git://github.com/congwang/linuxLinus Torvalds2-25/+25
Pull kmap_atomic cleanup from Cong Wang. It's been in -next for a long time, and it gets rid of the (no longer used) second argument to k[un]map_atomic(). Fix up a few trivial conflicts in various drivers, and do an "evil merge" to catch some new uses that have come in since Cong's tree. * 'kmap_atomic' of git://github.com/congwang/linux: (59 commits) feature-removal-schedule.txt: schedule the deprecated form of kmap_atomic() for removal highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename] drbd: remove the second argument of k[un]map_atomic() zcache: remove the second argument of k[un]map_atomic() gma500: remove the second argument of k[un]map_atomic() dm: remove the second argument of k[un]map_atomic() tomoyo: remove the second argument of k[un]map_atomic() sunrpc: remove the second argument of k[un]map_atomic() rds: remove the second argument of k[un]map_atomic() net: remove the second argument of k[un]map_atomic() mm: remove the second argument of k[un]map_atomic() lib: remove the second argument of k[un]map_atomic() power: remove the second argument of k[un]map_atomic() kdb: remove the second argument of k[un]map_atomic() udf: remove the second argument of k[un]map_atomic() ubifs: remove the second argument of k[un]map_atomic() squashfs: remove the second argument of k[un]map_atomic() reiserfs: remove the second argument of k[un]map_atomic() ocfs2: remove the second argument of k[un]map_atomic() ntfs: remove the second argument of k[un]map_atomic() ...
2012-03-21Merge branch 'for-linus' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree from Jiri Kosina: "It's indeed trivial -- mostly documentation updates and a bunch of typo fixes from Masanari. There are also several linux/version.h include removals from Jesper." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (101 commits) kcore: fix spelling in read_kcore() comment constify struct pci_dev * in obvious cases Revert "char: Fix typo in viotape.c" init: fix wording error in mm_init comment usb: gadget: Kconfig: fix typo for 'different' Revert "power, max8998: Include linux/module.h just once in drivers/power/max8998_charger.c" writeback: fix fn name in writeback_inodes_sb_nr_if_idle() comment header writeback: fix typo in the writeback_control comment Documentation: Fix multiple typo in Documentation tpm_tis: fix tis_lock with respect to RCU Revert "media: Fix typo in mixer_drv.c and hdmi_drv.c" Doc: Update numastat.txt qla4xxx: Add missing spaces to error messages compiler.h: Fix typo security: struct security_operations kerneldoc fix Documentation: broken URL in libata.tmpl Documentation: broken URL in filesystems.tmpl mtd: simplify return logic in do_map_probe() mm: fix comment typo of truncate_inode_pages_range power: bq27x00: Fix typos in comment ...
2012-03-20dm: remove the second argument of k[un]map_atomic()Cong Wang1-4/+4
Acked-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20md: remove the second argument of k[un]map_atomic()Cong Wang1-21/+21
Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-19md: Add judgement bb->unacked_exist in function md_ack_all_badblocks().majianpeng1-1/+1
If there are no unacked bad blocks, then there is no point searching for them to acknowledge them. Signed-off-by: majianpeng <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-19md: fix clearing of the 'changed' flags for the bad blocks list.NeilBrown1-1/+2
In super_1_sync (the first hunk) we need to clear 'changed' before checking read_seqretry(), otherwise we might race with other code adding a bad block and so won't retry later. In md_update_sb (the second hunk), in the case where there is no metadata (neither persistent nor external), we treat any bad blocks as an error. However we need to clear the 'changed' flag before calling md_ack_all_badblocks, else it won't do anything. This patch is suitable for -stable release 3.0 and later. Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-19md/bitmap: discard CHUNK_BLOCK_SHIFT macroNeilBrown2-19/+19
Be redefining ->chunkshift as the shift from sectors to chunks rather than bytes to chunks, we can just use "bitmap->chunkshift" which is shorter than the macro call, and less indirect. Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-19md/bitmap: remove unnecessary indirection when allocating.NeilBrown1-28/+3
These funcitons don't add anything useful except possibly the trace points, and I don't think they are worth the extra indirection. So remove them. Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-19md/bitmap: remove some pointless locking.NeilBrown1-12/+2
There is nothing gained by holding a lock while we check if a pointer is NULL or not. If there could be a race, then it could become NULL immediately after the unlock - but there is no race here. So just remove the locking. Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-19md/bitmap: change a 'goto' to a normal 'if' construct.NeilBrown1-19/+21
The use of a goto makes the control flow more obscure here. So make it a normal: if (x) { Y; } No functional change. Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-19md/bitmap: move printing of bitmap status to bitmap.cNeilBrown3-22/+30
The part of /proc/mdstat which describes the bitmap should really be generated by code in bitmap.c. So move it there. Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-19md/bitmap: remove some unused noise from bitmap.hNeilBrown1-18/+0
Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-19md/raid10 - support resizing some RAID10 arrays.NeilBrown1-0/+38
'resizing' an array in this context means making use of extra space that has become available in component devices, not adding new devices. It also includes shrinking the array to take up less space of component devices. This is not supported for array with a 'far' layout. However for 'near' and 'offset' layout arrays, adding and removing space at the end of the devices is easy to support, and this patch provides that support. Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-19md/raid1: handle merge_bvec_fn in member devices.NeilBrown1-21/+56
Currently we don't honour merge_bvec_fn in member devices so if there is one, we force all requests to be single-page at most. This is not ideal. So create a raid1 merge_bvec_fn to check that function in children as well. This introduces a small problem. There is no locking around calls the ->merge_bvec_fn and subsequent calls to ->make_request. So a device added between these could end up getting a request which violates its merge_bvec_fn. Currently the best we can do is synchronize_sched(). This will work providing no preemption happens. If there is is preemption, we just have to hope that new devices are largely consistent with old devices. Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-19md/raid10: handle merge_bvec_fn in member devices.NeilBrown3-41/+90
Currently we don't honour merge_bvec_fn in member devices so if there is one, we force all requests to be single-page at most. This is not ideal. So enhance the raid10 merge_bvec_fn to check that function in children as well. This introduces a small problem. There is no locking around calls the ->merge_bvec_fn and subsequent calls to ->make_request. So a device added between these could end up getting a request which violates its merge_bvec_fn. Currently the best we can do is synchronize_sched(). This will work providing no preemption happens. If there is preemption, we just have to hope that new devices are largely consistent with old devices. Signed-off-by: NeilBrown <neilb@suse.de>
2012-03-19md: add proper merge_bvec handling to RAID0 and Linear.NeilBrown3-88/+107
These personalities currently set a max request size of one page when any member device has a merge_bvec_fn because they don't bother to call that function. This causes extra works in splitting and combining requests. So make the extra effort to call the merge_bvec_fn when it exists so that we end up with larger requests out the bottom. Signed-off-by: NeilBrown <neilb@suse.de>