<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md, branch v3.4.98</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v3.4.98</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v3.4.98'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2014-07-09T17:51:21+00:00</updated>
<entry>
<title>md: flush writes before starting a recovery.</title>
<updated>2014-07-09T17:51:21+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-07-02T02:04:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d3ef6557f0130b07ff09c962593b9cf51f9f6346'/>
<id>urn:sha1:d3ef6557f0130b07ff09c962593b9cf51f9f6346</id>
<content type='text'>
commit 133d4527eab8d199a62eee6bd433f0776842df2e upstream.

When we write to a degraded array which has a bitmap, we
make sure the relevant bit in the bitmap remains set when
the write completes (so a 're-add' can quickly rebuilt a
temporarily-missing device).

If, immediately after such a write starts, we incorporate a spare,
commence recovery, and skip over the region where the write is
happening (because the 'needs recovery' flag isn't set yet),
then that write will not get to the new device.

Once the recovery finishes the new device will be trusted, but will
have incorrect data, leading to possible corruption.

We cannot set the 'needs recovery' flag when we start the write as we
do not know easily if the write will be "degraded" or not.  That
depends on details of the particular raid level and particular write
request.

This patch fixes a corruption issue of long standing and so it
suitable for any -stable kernel.  It applied correctly to 3.0 at
least and will minor editing to earlier kernels.

Reported-by: Bill &lt;billstuff2001@sbcglobal.net&gt;
Tested-by: Bill &lt;billstuff2001@sbcglobal.net&gt;
Link: http://lkml.kernel.org/r/53A518BB.60709@sbcglobal.net
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md: always set MD_RECOVERY_INTR when aborting a reshape or other "resync".</title>
<updated>2014-06-11T19:04:11+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-05-28T03:39:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0b78a4287b8cc7c0817b7bc4894d8c9c177bb751'/>
<id>urn:sha1:0b78a4287b8cc7c0817b7bc4894d8c9c177bb751</id>
<content type='text'>
commit 3991b31ea072b070081ca3bfa860a077eda67de5 upstream.

If mddev-&gt;ro is set, md_to_sync will (correctly) abort.
However in that case MD_RECOVERY_INTR isn't set.

If a RESHAPE had been requested, then -&gt;finish_reshape() will be
called and it will think the reshape was successful even though
nothing happened.

Normally a resync will not be requested if -&gt;ro is set, but if an
array is stopped while a reshape is on-going, then when the array is
started, the reshape will be restarted.  If the array is also set
read-only at this point, the reshape will instantly appear to success,
resulting in data corruption.

Consequently, this patch is suitable for any -stable kernel.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm thin: fix discard corruption</title>
<updated>2014-06-07T23:02:05+00:00</updated>
<author>
<name>Joe Thornber</name>
<email>ejt@redhat.com</email>
</author>
<published>2013-03-20T17:21:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=87dba703d0d9eb4352506dbd8f8878e3e2ee92f0'/>
<id>urn:sha1:87dba703d0d9eb4352506dbd8f8878e3e2ee92f0</id>
<content type='text'>
commit f046f89a99ccfd9408b94c653374ff3065c7edb3 upstream.

Fix a bug in dm_btree_remove that could leave leaf values with incorrect
reference counts.  The effect of this was that removal of a shared block
could result in the space maps thinking the block was no longer used.
More concretely, if you have a thin device and a snapshot of it, sending
a discard to a shared region of the thin could corrupt the snapshot.

Thinp uses a 2-level nested btree to store it's mappings.  This first
level is indexed by thin device, and the second level by logical
block.

Often when we're removing an entry in this mapping tree we need to
rebalance nodes, which can involve shadowing them, possibly creating a
copy if the block is shared.  If we do create a copy then children of
that node need to have their reference counts incremented.  In this
way reference counts percolate down the tree as shared trees diverge.

The rebalance functions were incrementing the children at the
appropriate time, but they were always assuming the children were
internal nodes.  This meant the leaf values (in our case packed
block/flags entries) were not being incremented.

Signed-off-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
[bwh: Backported to 3.2: bump target version numbers from 1.0.1 to 1.0.2]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
[xr: Backported to 3.4: bump target version numbers to 1.1.1]
Signed-off-by: Rui Xiang &lt;rui.xiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>dm mpath: fix race condition between multipath_dtr and pg_init_done</title>
<updated>2014-06-07T23:02:05+00:00</updated>
<author>
<name>Shiva Krishna Merla</name>
<email>shivakrishna.merla@netapp.com</email>
</author>
<published>2013-10-30T03:26:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5e301eba0a521b3682aec467eb13a94dd6c4027a'/>
<id>urn:sha1:5e301eba0a521b3682aec467eb13a94dd6c4027a</id>
<content type='text'>
commit 954a73d5d3073df2231820c718fdd2f18b0fe4c9 upstream.

Whenever multipath_dtr() is happening we must prevent queueing any
further path activation work.  Implement this by adding a new
'pg_init_disabled' flag to the multipath structure that denotes future
path activation work should be skipped if it is set.  By disabling
pg_init and then re-enabling in flush_multipath_work() we also avoid the
potential for pg_init to be initiated while suspending an mpath device.

Without this patch a race condition exists that may result in a kernel
panic:

1) If after pg_init_done() decrements pg_init_in_progress to 0, a call
   to wait_for_pg_init_completion() assumes there are no more pending path
   management commands.
2) If pg_init_required is set by pg_init_done(), due to retryable
   mode_select errors, then process_queued_ios() will again queue the
   path activation work.
3) If free_multipath() completes before activate_path() work is called a
   NULL pointer dereference like the following can be seen when
   accessing members of the recently destructed multipath:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
RIP: 0010:[&lt;ffffffffa003db1b&gt;]  [&lt;ffffffffa003db1b&gt;] activate_path+0x1b/0x30 [dm_multipath]
[&lt;ffffffff81090ac0&gt;] worker_thread+0x170/0x2a0
[&lt;ffffffff81096c80&gt;] ? autoremove_wake_function+0x0/0x40

[switch to disabling pg_init in flush_multipath_work &amp; header edits by Mike Snitzer]
Signed-off-by: Shiva Krishna Merla &lt;shivakrishna.merla@netapp.com&gt;
Reviewed-by: Krishnasamy Somasundaram &lt;somasundaram.krishnasamy@netapp.com&gt;
Tested-by: Speagle Andy &lt;Andy.Speagle@netapp.com&gt;
Acked-by: Junichi Nomura &lt;j-nomura@ce.jp.nec.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
[bwh: Backported to 3.2:
 - Adjust context
 - Bump version to 1.3.2 not 1.6.0]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
[xr: Backported to 3.4: Adjust context]
Signed-off-by: Rui Xiang &lt;rui.xiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>dm snapshot: avoid snapshot space leak on crash</title>
<updated>2014-06-07T23:02:05+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-11-29T23:13:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d110fd5113953b6f539dee75dbabd9a86dce790f'/>
<id>urn:sha1:d110fd5113953b6f539dee75dbabd9a86dce790f</id>
<content type='text'>
commit 230c83afdd9cd384348475bea1e14b80b3b6b1b8 upstream.

There is a possible leak of snapshot space in case of crash.

The reason for space leaking is that chunks in the snapshot device are
allocated sequentially, but they are finished (and stored in the metadata)
out of order, depending on the order in which copying finished.

For example, supposed that the metadata contains the following records
SUPERBLOCK
METADATA (blocks 0 ... 250)
DATA 0
DATA 1
DATA 2
...
DATA 250

Now suppose that you allocate 10 new data blocks 251-260. Suppose that
copying of these blocks finish out of order (block 260 finished first
and the block 251 finished last). Now, the snapshot device looks like
this:
SUPERBLOCK
METADATA (blocks 0 ... 250, 260, 259, 258, 257, 256)
DATA 0
DATA 1
DATA 2
...
DATA 250
DATA 251
DATA 252
DATA 253
DATA 254
DATA 255
METADATA (blocks 255, 254, 253, 252, 251)
DATA 256
DATA 257
DATA 258
DATA 259
DATA 260

Now, if the machine crashes after writing the first metadata block but
before writing the second metadata block, the space for areas DATA 250-255
is leaked, it contains no valid data and it will never be used in the
future.

This patch makes dm-snapshot complete exceptions in the same order they
were allocated, thus fixing this bug.

Note: when backporting this patch to the stable kernel, change the version
field in the following way:
* if version in the stable kernel is {1, 11, 1}, change it to {1, 12, 0}
* if version in the stable kernel is {1, 10, 0} or {1, 10, 1}, change it
  to {1, 10, 2}
Userspace reads the version to determine if the bug was fixed, so the
version change is needed.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
[xr: Backported to 3.4: adjust version]
Signed-off-by: Rui Xiang &lt;rui.xiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>md/raid10: fix "enough" function for detecting if array is failed.</title>
<updated>2014-06-07T23:02:05+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2012-09-27T02:35:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=352f526f91dca8673d9be1ba3166411fd8fea918'/>
<id>urn:sha1:352f526f91dca8673d9be1ba3166411fd8fea918</id>
<content type='text'>
commit 80b4812407c6b1f66a4f2430e69747a13f010839 upstream.

The 'enough' function is written to work with 'near' arrays only
in that is implicitly assumes that the offset from one 'group' of
devices to the next is the same as the number of copies.
In reality it is the number of 'near' copies.

So change it to make this number explicit.

This bug makes it possible to run arrays without enough drives
present, which is dangerous.
It is appropriate for an -stable kernel, but will almost certainly
need to be modified for some of them.

Reported-by: Jakub Husák &lt;jakub@gooseman.cz&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
[bwh: Backported to 3.2: s/geo-&gt;/conf-&gt;/]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Rui Xiang &lt;rui.xiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm snapshot: add missing module aliases</title>
<updated>2014-06-07T23:02:05+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-03-01T22:45:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e4bf93019b63da30c8f9e9edb436862998865865'/>
<id>urn:sha1:e4bf93019b63da30c8f9e9edb436862998865865</id>
<content type='text'>
commit 23cb21092eb9dcec9d3604b68d95192b79915890 upstream.

Add module aliases so that autoloading works correctly if the user
tries to activate "snapshot-origin" or "snapshot-merge" targets.

Reference: https://bugzilla.redhat.com/889973

Reported-by: Chao Yang &lt;chyang@redhat.com&gt;
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Rui Xiang &lt;rui.xiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm bufio: avoid a possible __vmalloc deadlock</title>
<updated>2014-06-07T23:02:05+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-05-10T13:37:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bed74df4fd8380c87adb576c121f40cb1037435b'/>
<id>urn:sha1:bed74df4fd8380c87adb576c121f40cb1037435b</id>
<content type='text'>
commit 502624bdad3dba45dfaacaf36b7d83e39e74b2d2 upstream.

This patch uses memalloc_noio_save to avoid a possible deadlock in
dm-bufio.  (it could happen only with large block size, at most
PAGE_SIZE &lt;&lt; MAX_ORDER (typically 8MiB).

__vmalloc doesn't fully respect gfp flags. The specified gfp flags are
used for allocation of requested pages, structures vmap_area, vmap_block
and vm_struct and the radix tree nodes.

However, the kernel pagetables are allocated always with GFP_KERNEL.
Thus the allocation of pagetables can recurse back to the I/O layer and
cause a deadlock.

This patch uses the function memalloc_noio_save to set per-process
PF_MEMALLOC_NOIO flag and the function memalloc_noio_restore to restore
it. When this flag is set, all allocations in the process are done with
implied GFP_NOIO flag, thus the deadlock can't happen.

This should be backported to stable kernels, but they don't have the
PF_MEMALLOC_NOIO flag and memalloc_noio_save/memalloc_noio_restore
functions. So, PF_MEMALLOC should be set and restored instead.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
[bwh: Backported to 3.2 as recommended]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Rui Xiang &lt;rui.xiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md: avoid possible spinning md thread at shutdown.</title>
<updated>2014-06-07T23:02:01+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-05-05T23:36:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=546c518fb111a753a00f55af9052ea275ab69f12'/>
<id>urn:sha1:546c518fb111a753a00f55af9052ea275ab69f12</id>
<content type='text'>
commit 0f62fb220aa4ebabe8547d3a9ce4a16d3c045f21 upstream.

If an md array with externally managed metadata (e.g. DDF or IMSM)
is in use, then we should not set safemode==2 at shutdown because:

1/ this is ineffective: user-space need to be involved in any 'safemode' handling,
2/ The safemode management code doesn't cope with safemode==2 on external metadata
   and md_check_recover enters an infinite loop.

Even at shutdown, an infinite-looping process can be problematic, so this
could cause shutdown to hang.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm thin: fix dangling bio in process_deferred_bios error path</title>
<updated>2014-05-13T12:11:32+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2014-03-28T06:15:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=53b67ae8d3d01efbdfd7ae431d8d46cba70084b9'/>
<id>urn:sha1:53b67ae8d3d01efbdfd7ae431d8d46cba70084b9</id>
<content type='text'>
commit fe76cd88e654124d1431bb662a0fc6e99ca811a5 upstream.

If unable to ensure_next_mapping() we must add the current bio, which
was removed from the @bios list via bio_list_pop, back to the
deferred_bios list before all the remaining @bios.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Acked-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
