<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md, branch v3.4.105</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v3.4.105</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v3.4.105'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2014-12-01T10:02:43+00:00</updated>
<entry>
<title>dm crypt: fix access beyond the end of allocated space</title>
<updated>2014-12-01T10:02:43+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2014-08-28T15:09:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=be73621bd21353b97ba55fb5ad9b4c35aa09271a'/>
<id>urn:sha1:be73621bd21353b97ba55fb5ad9b4c35aa09271a</id>
<content type='text'>
commit d49ec52ff6ddcda178fc2476a109cf1bd1fa19ed upstream.

The DM crypt target accesses memory beyond allocated space resulting in
a crash on 32 bit x86 systems.

This bug is very old (it dates back to 2.6.25 commit 3a7f6c990ad04 "dm
crypt: use async crypto").  However, this bug was masked by the fact
that kmalloc rounds the size up to the next power of two.  This bug
wasn't exposed until 3.17-rc1 commit 298a9fa08a ("dm crypt: use per-bio
data").  By switching to using per-bio data there was no longer any
padding beyond the end of a dm-crypt allocated memory block.

To minimize allocation overhead dm-crypt puts several structures into one
block allocated with kmalloc.  The block holds struct ablkcipher_request,
cipher-specific scratch pad (crypto_ablkcipher_reqsize(any_tfm(cc))),
struct dm_crypt_request and an initialization vector.

The variable dmreq_start is set to offset of struct dm_crypt_request
within this memory block.  dm-crypt allocates the block with this size:
cc-&gt;dmreq_start + sizeof(struct dm_crypt_request) + cc-&gt;iv_size.

When accessing the initialization vector, dm-crypt uses the function
iv_of_dmreq, which performs this calculation: ALIGN((unsigned long)(dmreq
+ 1), crypto_ablkcipher_alignmask(any_tfm(cc)) + 1).

dm-crypt allocated "cc-&gt;iv_size" bytes beyond the end of dm_crypt_request
structure.  However, when dm-crypt accesses the initialization vector, it
takes a pointer to the end of dm_crypt_request, aligns it, and then uses
it as the initialization vector.  If the end of dm_crypt_request is not
aligned on a crypto_ablkcipher_alignmask(any_tfm(cc)) boundary the
alignment causes the initialization vector to point beyond the allocated
space.

Fix this bug by calculating the variable iv_size_padding and adding it
to the allocated size.

Also correct the alignment of dm_crypt_request.  struct dm_crypt_request
is specific to dm-crypt (it isn't used by the crypto subsystem at all),
so it is aligned on __alignof__(struct dm_crypt_request).

Also align per_bio_data_size on ARCH_KMALLOC_MINALIGN, so that it is
aligned as if the block was allocated with kmalloc.

Reported-by: Krzysztof Kolasa &lt;kkolasa@winsoft.pl&gt;
Tested-by: Milan Broz &lt;gmazyland@gmail.com&gt;
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
[lizf: Backported by Mikulas]
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
</content>
</entry>
<entry>
<title>md/raid6: avoid data corruption during recovery of double-degraded RAID6</title>
<updated>2014-09-25T03:49:11+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-08-12T23:57:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a0e5b9d2c3cdaf1f980409ec1a84db22abfd6958'/>
<id>urn:sha1:a0e5b9d2c3cdaf1f980409ec1a84db22abfd6958</id>
<content type='text'>
commit 9c4bdf697c39805078392d5ddbbba5ae5680e0dd upstream.

During recovery of a double-degraded RAID6 it is possible for
some blocks not to be recovered properly, leading to corruption.

If a write happens to one block in a stripe that would be written to a
missing device, and at the same time that stripe is recovering data
to the other missing device, then that recovered data may not be written.

This patch skips, in the double-degraded case, an optimisation that is
only safe for single-degraded arrays.

Bug was introduced in 2.6.32 and fix is suitable for any kernel since
then.  In an older kernel with separate handle_stripe5() and
handle_stripe6() functions the patch must change handle_stripe6().

Fixes: 6c0069c0ae9659e3a91b68eaed06a5c6c37f45c8
Cc: Yuri Tikhonov &lt;yur@emcraft.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reported-by: "Manibalan P" &lt;pmanibalan@amiindia.co.in&gt;
Tested-by: "Manibalan P" &lt;pmanibalan@amiindia.co.in&gt;
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1090423
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Acked-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
</content>
</entry>
<entry>
<title>md: flush writes before starting a recovery.</title>
<updated>2014-07-09T17:51:21+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-07-02T02:04:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d3ef6557f0130b07ff09c962593b9cf51f9f6346'/>
<id>urn:sha1:d3ef6557f0130b07ff09c962593b9cf51f9f6346</id>
<content type='text'>
commit 133d4527eab8d199a62eee6bd433f0776842df2e upstream.

When we write to a degraded array which has a bitmap, we
make sure the relevant bit in the bitmap remains set when
the write completes (so a 're-add' can quickly rebuilt a
temporarily-missing device).

If, immediately after such a write starts, we incorporate a spare,
commence recovery, and skip over the region where the write is
happening (because the 'needs recovery' flag isn't set yet),
then that write will not get to the new device.

Once the recovery finishes the new device will be trusted, but will
have incorrect data, leading to possible corruption.

We cannot set the 'needs recovery' flag when we start the write as we
do not know easily if the write will be "degraded" or not.  That
depends on details of the particular raid level and particular write
request.

This patch fixes a corruption issue of long standing and so it
suitable for any -stable kernel.  It applied correctly to 3.0 at
least and will minor editing to earlier kernels.

Reported-by: Bill &lt;billstuff2001@sbcglobal.net&gt;
Tested-by: Bill &lt;billstuff2001@sbcglobal.net&gt;
Link: http://lkml.kernel.org/r/53A518BB.60709@sbcglobal.net
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md: always set MD_RECOVERY_INTR when aborting a reshape or other "resync".</title>
<updated>2014-06-11T19:04:11+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-05-28T03:39:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0b78a4287b8cc7c0817b7bc4894d8c9c177bb751'/>
<id>urn:sha1:0b78a4287b8cc7c0817b7bc4894d8c9c177bb751</id>
<content type='text'>
commit 3991b31ea072b070081ca3bfa860a077eda67de5 upstream.

If mddev-&gt;ro is set, md_to_sync will (correctly) abort.
However in that case MD_RECOVERY_INTR isn't set.

If a RESHAPE had been requested, then -&gt;finish_reshape() will be
called and it will think the reshape was successful even though
nothing happened.

Normally a resync will not be requested if -&gt;ro is set, but if an
array is stopped while a reshape is on-going, then when the array is
started, the reshape will be restarted.  If the array is also set
read-only at this point, the reshape will instantly appear to success,
resulting in data corruption.

Consequently, this patch is suitable for any -stable kernel.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm thin: fix discard corruption</title>
<updated>2014-06-07T23:02:05+00:00</updated>
<author>
<name>Joe Thornber</name>
<email>ejt@redhat.com</email>
</author>
<published>2013-03-20T17:21:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=87dba703d0d9eb4352506dbd8f8878e3e2ee92f0'/>
<id>urn:sha1:87dba703d0d9eb4352506dbd8f8878e3e2ee92f0</id>
<content type='text'>
commit f046f89a99ccfd9408b94c653374ff3065c7edb3 upstream.

Fix a bug in dm_btree_remove that could leave leaf values with incorrect
reference counts.  The effect of this was that removal of a shared block
could result in the space maps thinking the block was no longer used.
More concretely, if you have a thin device and a snapshot of it, sending
a discard to a shared region of the thin could corrupt the snapshot.

Thinp uses a 2-level nested btree to store it's mappings.  This first
level is indexed by thin device, and the second level by logical
block.

Often when we're removing an entry in this mapping tree we need to
rebalance nodes, which can involve shadowing them, possibly creating a
copy if the block is shared.  If we do create a copy then children of
that node need to have their reference counts incremented.  In this
way reference counts percolate down the tree as shared trees diverge.

The rebalance functions were incrementing the children at the
appropriate time, but they were always assuming the children were
internal nodes.  This meant the leaf values (in our case packed
block/flags entries) were not being incremented.

Signed-off-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
[bwh: Backported to 3.2: bump target version numbers from 1.0.1 to 1.0.2]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
[xr: Backported to 3.4: bump target version numbers to 1.1.1]
Signed-off-by: Rui Xiang &lt;rui.xiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>dm mpath: fix race condition between multipath_dtr and pg_init_done</title>
<updated>2014-06-07T23:02:05+00:00</updated>
<author>
<name>Shiva Krishna Merla</name>
<email>shivakrishna.merla@netapp.com</email>
</author>
<published>2013-10-30T03:26:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5e301eba0a521b3682aec467eb13a94dd6c4027a'/>
<id>urn:sha1:5e301eba0a521b3682aec467eb13a94dd6c4027a</id>
<content type='text'>
commit 954a73d5d3073df2231820c718fdd2f18b0fe4c9 upstream.

Whenever multipath_dtr() is happening we must prevent queueing any
further path activation work.  Implement this by adding a new
'pg_init_disabled' flag to the multipath structure that denotes future
path activation work should be skipped if it is set.  By disabling
pg_init and then re-enabling in flush_multipath_work() we also avoid the
potential for pg_init to be initiated while suspending an mpath device.

Without this patch a race condition exists that may result in a kernel
panic:

1) If after pg_init_done() decrements pg_init_in_progress to 0, a call
   to wait_for_pg_init_completion() assumes there are no more pending path
   management commands.
2) If pg_init_required is set by pg_init_done(), due to retryable
   mode_select errors, then process_queued_ios() will again queue the
   path activation work.
3) If free_multipath() completes before activate_path() work is called a
   NULL pointer dereference like the following can be seen when
   accessing members of the recently destructed multipath:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
RIP: 0010:[&lt;ffffffffa003db1b&gt;]  [&lt;ffffffffa003db1b&gt;] activate_path+0x1b/0x30 [dm_multipath]
[&lt;ffffffff81090ac0&gt;] worker_thread+0x170/0x2a0
[&lt;ffffffff81096c80&gt;] ? autoremove_wake_function+0x0/0x40

[switch to disabling pg_init in flush_multipath_work &amp; header edits by Mike Snitzer]
Signed-off-by: Shiva Krishna Merla &lt;shivakrishna.merla@netapp.com&gt;
Reviewed-by: Krishnasamy Somasundaram &lt;somasundaram.krishnasamy@netapp.com&gt;
Tested-by: Speagle Andy &lt;Andy.Speagle@netapp.com&gt;
Acked-by: Junichi Nomura &lt;j-nomura@ce.jp.nec.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
[bwh: Backported to 3.2:
 - Adjust context
 - Bump version to 1.3.2 not 1.6.0]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
[xr: Backported to 3.4: Adjust context]
Signed-off-by: Rui Xiang &lt;rui.xiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>dm snapshot: avoid snapshot space leak on crash</title>
<updated>2014-06-07T23:02:05+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-11-29T23:13:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d110fd5113953b6f539dee75dbabd9a86dce790f'/>
<id>urn:sha1:d110fd5113953b6f539dee75dbabd9a86dce790f</id>
<content type='text'>
commit 230c83afdd9cd384348475bea1e14b80b3b6b1b8 upstream.

There is a possible leak of snapshot space in case of crash.

The reason for space leaking is that chunks in the snapshot device are
allocated sequentially, but they are finished (and stored in the metadata)
out of order, depending on the order in which copying finished.

For example, supposed that the metadata contains the following records
SUPERBLOCK
METADATA (blocks 0 ... 250)
DATA 0
DATA 1
DATA 2
...
DATA 250

Now suppose that you allocate 10 new data blocks 251-260. Suppose that
copying of these blocks finish out of order (block 260 finished first
and the block 251 finished last). Now, the snapshot device looks like
this:
SUPERBLOCK
METADATA (blocks 0 ... 250, 260, 259, 258, 257, 256)
DATA 0
DATA 1
DATA 2
...
DATA 250
DATA 251
DATA 252
DATA 253
DATA 254
DATA 255
METADATA (blocks 255, 254, 253, 252, 251)
DATA 256
DATA 257
DATA 258
DATA 259
DATA 260

Now, if the machine crashes after writing the first metadata block but
before writing the second metadata block, the space for areas DATA 250-255
is leaked, it contains no valid data and it will never be used in the
future.

This patch makes dm-snapshot complete exceptions in the same order they
were allocated, thus fixing this bug.

Note: when backporting this patch to the stable kernel, change the version
field in the following way:
* if version in the stable kernel is {1, 11, 1}, change it to {1, 12, 0}
* if version in the stable kernel is {1, 10, 0} or {1, 10, 1}, change it
  to {1, 10, 2}
Userspace reads the version to determine if the bug was fixed, so the
version change is needed.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
[xr: Backported to 3.4: adjust version]
Signed-off-by: Rui Xiang &lt;rui.xiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>md/raid10: fix "enough" function for detecting if array is failed.</title>
<updated>2014-06-07T23:02:05+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2012-09-27T02:35:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=352f526f91dca8673d9be1ba3166411fd8fea918'/>
<id>urn:sha1:352f526f91dca8673d9be1ba3166411fd8fea918</id>
<content type='text'>
commit 80b4812407c6b1f66a4f2430e69747a13f010839 upstream.

The 'enough' function is written to work with 'near' arrays only
in that is implicitly assumes that the offset from one 'group' of
devices to the next is the same as the number of copies.
In reality it is the number of 'near' copies.

So change it to make this number explicit.

This bug makes it possible to run arrays without enough drives
present, which is dangerous.
It is appropriate for an -stable kernel, but will almost certainly
need to be modified for some of them.

Reported-by: Jakub Husák &lt;jakub@gooseman.cz&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
[bwh: Backported to 3.2: s/geo-&gt;/conf-&gt;/]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Rui Xiang &lt;rui.xiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm snapshot: add missing module aliases</title>
<updated>2014-06-07T23:02:05+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-03-01T22:45:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e4bf93019b63da30c8f9e9edb436862998865865'/>
<id>urn:sha1:e4bf93019b63da30c8f9e9edb436862998865865</id>
<content type='text'>
commit 23cb21092eb9dcec9d3604b68d95192b79915890 upstream.

Add module aliases so that autoloading works correctly if the user
tries to activate "snapshot-origin" or "snapshot-merge" targets.

Reference: https://bugzilla.redhat.com/889973

Reported-by: Chao Yang &lt;chyang@redhat.com&gt;
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Rui Xiang &lt;rui.xiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm bufio: avoid a possible __vmalloc deadlock</title>
<updated>2014-06-07T23:02:05+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-05-10T13:37:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bed74df4fd8380c87adb576c121f40cb1037435b'/>
<id>urn:sha1:bed74df4fd8380c87adb576c121f40cb1037435b</id>
<content type='text'>
commit 502624bdad3dba45dfaacaf36b7d83e39e74b2d2 upstream.

This patch uses memalloc_noio_save to avoid a possible deadlock in
dm-bufio.  (it could happen only with large block size, at most
PAGE_SIZE &lt;&lt; MAX_ORDER (typically 8MiB).

__vmalloc doesn't fully respect gfp flags. The specified gfp flags are
used for allocation of requested pages, structures vmap_area, vmap_block
and vm_struct and the radix tree nodes.

However, the kernel pagetables are allocated always with GFP_KERNEL.
Thus the allocation of pagetables can recurse back to the I/O layer and
cause a deadlock.

This patch uses the function memalloc_noio_save to set per-process
PF_MEMALLOC_NOIO flag and the function memalloc_noio_restore to restore
it. When this flag is set, all allocations in the process are done with
implied GFP_NOIO flag, thus the deadlock can't happen.

This should be backported to stable kernels, but they don't have the
PF_MEMALLOC_NOIO flag and memalloc_noio_save/memalloc_noio_restore
functions. So, PF_MEMALLOC should be set and restored instead.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
[bwh: Backported to 3.2 as recommended]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Rui Xiang &lt;rui.xiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
