<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md, branch linux-2.6.35.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-2.6.35.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-2.6.35.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2011-08-01T20:54:58+00:00</updated>
<entry>
<title>md: avoid endless recovery loop when waiting for fail device to complete.</title>
<updated>2011-08-01T20:54:58+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-06-28T06:59:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=04bf190d9339a6cd5adef1a4b789517ff9838fdb'/>
<id>urn:sha1:04bf190d9339a6cd5adef1a4b789517ff9838fdb</id>
<content type='text'>
commit 4274215d24633df7302069e51426659d4759c5ed upstream.

If a device fails in a way that causes pending request to take a while
to complete, md will not be able to immediately remove it from the
array in remove_and_add_spares.
It will then incorrectly look like a spare device and md will try to
recover it even though it is failed.
This leads to a recovery process starting and instantly aborting over
and over again.

We should check if the device is faulty before considering it to be a
spare.  This will avoid trying to start a recovery that cannot
proceed.

This bug was introduced in 2.6.26 so that patch is suitable for any
kernel since then.

Reported-by: Jim Paradis &lt;james.paradis@stratus.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;

</content>
</entry>
<entry>
<title>md/raid5: fix raid5_set_bi_hw_segments</title>
<updated>2011-08-01T20:54:56+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@gmail.com</email>
</author>
<published>2011-06-13T05:48:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f5404e8ba2873b04e010c6cd2750ce8058736386'/>
<id>urn:sha1:f5404e8ba2873b04e010c6cd2750ce8058736386</id>
<content type='text'>
commit 9b2dc8b665932a8e681a7ab3237f60475e75e161 upstream.

The @bio-&gt;bi_phys_segments consists of active stripes count in the
lower 16 bits and processed stripes count in the upper 16 bits. So
logical-OR operator should be bitwise one.

This bug has been present since 2.6.27 and the fix is suitable for any
-stable kernel since then.  Fortunately the bad code is only used on
error paths and is relatively unlikely to be hit.

Signed-off-by: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;

</content>
</entry>
<entry>
<title>md: check -&gt;hot_remove_disk when removing disk</title>
<updated>2011-08-01T20:54:56+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@gmail.com</email>
</author>
<published>2011-06-09T01:42:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8a3c154695b81f55ae4f38b46b86a9db6af73cbc'/>
<id>urn:sha1:8a3c154695b81f55ae4f38b46b86a9db6af73cbc</id>
<content type='text'>
commit 01393f3d5836b7d62e925e6f4658a7eb22b83a11 upstream.

Check pers-&gt;hot_remove_disk instead of pers-&gt;hot_add_disk in slot_store()
during disk removal. The linear personality only has -&gt;hot_add_disk and
no -&gt;hot_remove_disk, so that removing disk in the array resulted to
following kernel bug:

$ sudo mdadm --create /dev/md0 --level=linear --raid-devices=4 /dev/loop[0-3]
$ echo none | sudo tee /sys/block/md0/md/dev-loop2/slot
 BUG: unable to handle kernel NULL pointer dereference at           (null)
 IP: [&lt;          (null)&gt;]           (null)
 PGD c9f5d067 PUD 8575a067 PMD 0
 Oops: 0010 [#1] SMP
 CPU 2
 Modules linked in: linear loop bridge stp llc kvm_intel kvm asus_atk0110 sr_mod cdrom sg

 Pid: 10450, comm: tee Not tainted 3.0.0-rc1-leonard+ #173 System manufacturer System Product Name/P5G41TD-M PRO
 RIP: 0010:[&lt;0000000000000000&gt;]  [&lt;          (null)&gt;]           (null)
 RSP: 0018:ffff880085757df0  EFLAGS: 00010282
 RAX: ffffffffa00168e0 RBX: ffff8800d1431800 RCX: 000000000000006e
 RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff88008543c000
 RBP: ffff880085757e48 R08: 0000000000000002 R09: 000000000000000a
 R10: 0000000000000000 R11: ffff88008543c2e0 R12: 00000000ffffffff
 R13: ffff8800b4641000 R14: 0000000000000005 R15: 0000000000000000
 FS:  00007fe8c9e05700(0000) GS:ffff88011fa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000000 CR3: 00000000b4502000 CR4: 00000000000406e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process tee (pid: 10450, threadinfo ffff880085756000, task ffff8800c9f08000)
 Stack:
  ffffffff8138496a ffff8800b4641000 ffff88008543c268 0000000000000000
  ffff8800b4641000 ffff88008543c000 ffff8800d1431868 ffffffff81a78a90
  ffff8800b4641000 ffff88008543c000 ffff8800d1431800 ffff880085757e98
 Call Trace:
  [&lt;ffffffff8138496a&gt;] ? slot_store+0xaa/0x265
  [&lt;ffffffff81384bae&gt;] rdev_attr_store+0x89/0xa8
  [&lt;ffffffff8115a96a&gt;] sysfs_write_file+0x108/0x144
  [&lt;ffffffff81106b87&gt;] vfs_write+0xb1/0x10d
  [&lt;ffffffff8106e6c0&gt;] ? trace_hardirqs_on_caller+0x111/0x135
  [&lt;ffffffff81106cac&gt;] sys_write+0x4d/0x77
  [&lt;ffffffff814fe702&gt;] system_call_fastpath+0x16/0x1b
 Code:  Bad RIP value.
 RIP  [&lt;          (null)&gt;]           (null)
  RSP &lt;ffff880085757df0&gt;
 CR2: 0000000000000000
 ---[ end trace ba5fc64319a826fb ]---

Signed-off-by: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;

</content>
</entry>
<entry>
<title>dm table: reject devices without request fns</title>
<updated>2011-08-01T20:54:53+00:00</updated>
<author>
<name>Milan Broz</name>
<email>mbroz@redhat.com</email>
</author>
<published>2011-05-29T12:02:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1e293bba2a05ab9b38d696c467de02b3b2b45331'/>
<id>urn:sha1:1e293bba2a05ab9b38d696c467de02b3b2b45331</id>
<content type='text'>
commit f4808ca99a203f20b4475601748e44b25a65bdec upstream.

This patch adds a check that a block device has a request function
defined before it is used.  Otherwise, misconfiguration can cause an oops.

Because we are allowing devices with zero size e.g. an offline multipath
device as in commit 2cd54d9bedb79a97f014e86c0da393416b264eb3
("dm: allow offline devices") there needs to be an additional check
to ensure devices are initialised.  Some block devices, like a loop
device without a backing file, exist but have no request function.

Reproducer is trivial: dm-mirror on unbound loop device
(no backing file on loop devices)

dmsetup create x --table "0 8 mirror core 2 8 sync 2 /dev/loop0 0 /dev/loop1 0"

and mirror resync will immediatelly cause OOps.

BUG: unable to handle kernel NULL pointer dereference at   (null)
 ? generic_make_request+0x2bd/0x590
 ? kmem_cache_alloc+0xad/0x190
 submit_bio+0x53/0xe0
 ? bio_add_page+0x3b/0x50
 dispatch_io+0x1ca/0x210 [dm_mod]
 ? read_callback+0x0/0xd0 [dm_mirror]
 dm_io+0xbb/0x290 [dm_mod]
 do_mirror+0x1e0/0x748 [dm_mirror]

Signed-off-by: Milan Broz &lt;mbroz@redhat.com&gt;
Reported-by: Zdenek Kabelac &lt;zkabelac@redhat.com&gt;
Acked-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;

</content>
</entry>
<entry>
<title>md: Fix - again - partition detection when array becomes active</title>
<updated>2011-03-31T18:58:54+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-02-24T06:26:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ecbde8e81e12c89a4d19ba4c66afa4f8784187d3'/>
<id>urn:sha1:ecbde8e81e12c89a4d19ba4c66afa4f8784187d3</id>
<content type='text'>
[ upstream commit f0b4f7e2f29af678bd9af43422c537dcb6008603 ]

Revert
    b821eaa572fd737faaf6928ba046e571526c36c6
and
    f3b99be19ded511a1bf05a148276239d9f13eefa

When I wrote the first of these I had a wrong idea about the
lifetime of 'struct block_device'.  It can disappear at any time that
the block device is not open if it falls out of the inode cache.

So relying on the 'size' recorded with it to detect when the
device size has changed and so we need to revalidate, is wrong.

Rather, we really do need the 'changed' attribute stored directly in
the mddev and set/tested as appropriate.

Without this patch, a sequence of:
   mknod / open / close / unlink

(which can cause a block_device to be created and then destroyed)
will result in a rescan of the partition table and consequence removal
and addition of partitions.
Several of these in a row can get udev racing to create and unlink and
other code can get confused.

With the patch, the rescan is only performed when needed and so there
are no races.

This is suitable for any stable kernel from 2.6.35.

Reported-by: "Wojcik, Krzysztof" &lt;krzysztof.wojcik@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: stable@kernel.org

</content>
</entry>
<entry>
<title>md: correctly handle probe of an 'mdp' device.</title>
<updated>2011-03-31T18:58:13+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-02-16T02:58:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e918f6a71094520d8a7db5a2c6aa94f3859e7faa'/>
<id>urn:sha1:e918f6a71094520d8a7db5a2c6aa94f3859e7faa</id>
<content type='text'>
commit 8f5f02c460b7ca74ce55ce126ce0c1e58a3f923d upstream.

'mdp' devices are md devices with preallocated device numbers
for partitions. As such it is possible to mknod and open a partition
before opening the whole device.

this causes  md_probe() to be called with a device number of a
partition, which in-turn calls mddev_find with such a number.

However mddev_find expects the number of a 'whole device' and
does the wrong thing with partition numbers.

So add code to mddev_find to remove the 'partition' part of
a device number and just work with the 'whole device'.

This patch addresses https://bugzilla.kernel.org/show_bug.cgi?id=28652

Reported-by: hkmaly@bigfoot.com
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;

</content>
</entry>
<entry>
<title>dm mpath: disable blk_abort_queue</title>
<updated>2011-03-31T18:57:55+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2011-01-13T19:59:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=940da4c89846182301e4cef616f62e921a979369'/>
<id>urn:sha1:940da4c89846182301e4cef616f62e921a979369</id>
<content type='text'>
commit 09c9d4c9b6a2b5909ae3c6265e4cd3820b636863 upstream.

Revert commit 224cb3e981f1b2f9f93dbd49eaef505d17d894c2
  dm: Call blk_abort_queue on failed paths

Multipath began to use blk_abort_queue() to allow for
lower latency path deactivation.  This was found to
cause list corruption:

   the cmd gets blk_abort_queued/timedout run on it and the scsi eh
   somehow is able to complete and run scsi_queue_insert while
   scsi_request_fn is still trying to process the request.

   https://www.redhat.com/archives/dm-devel/2010-November/msg00085.html

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Mike Anderson &lt;andmike@linux.vnet.ibm.com&gt;
Cc: Mike Christie &lt;michaelc@cs.wisc.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>dm: dont take i_mutex to change device size</title>
<updated>2011-03-31T18:57:55+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2011-01-13T19:53:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=909e776f2916d9197efd789d52c9f34c41415453'/>
<id>urn:sha1:909e776f2916d9197efd789d52c9f34c41415453</id>
<content type='text'>
commit c217649bf2d60ac119afd71d938278cffd55962b upstream.

No longer needlessly hold md-&gt;bdev-&gt;bd_inode-&gt;i_mutex when changing the
size of a DM device.  This additional locking is unnecessary because
i_size_write() is already protected by the existing critical section in
dm_swap_table().  DM already has a reference on md-&gt;bdev so the
associated bd_inode may be changed without lifetime concerns.

A negative side-effect of having held md-&gt;bdev-&gt;bd_inode-&gt;i_mutex was
that a concurrent DM device resize and flush (via fsync) would deadlock.
Dropping md-&gt;bdev-&gt;bd_inode-&gt;i_mutex eliminates this potential for
deadlock.  The following reproducer no longer deadlocks:
  https://www.redhat.com/archives/dm-devel/2009-July/msg00284.html

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;

</content>
</entry>
<entry>
<title>md: fix regression with re-adding devices to arrays with no metadata</title>
<updated>2011-03-31T18:57:53+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-01-11T22:03:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=863591babfdd78d107f54d0b3cff1c55841f902b'/>
<id>urn:sha1:863591babfdd78d107f54d0b3cff1c55841f902b</id>
<content type='text'>
commit bf572541ab44240163eaa2d486b06f306a31d45a upstream.

Commit 1a855a0606 (2.6.37-rc4) fixed a problem where devices were
re-added when they shouldn't be but caused a regression in a less
common case that means sometimes devices cannot be re-added when they
should be.

In particular, when re-adding a device to an array without metadata
we should always access the device, but after the above commit we
didn't.

This patch sets the In_sync flag in that case so that the re-add
succeeds.

This patch is suitable for any -stable kernel to which 1a855a0606 was
applied.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;

</content>
</entry>
<entry>
<title>block: Deprecate QUEUE_FLAG_CLUSTER and use queue_limits instead</title>
<updated>2011-02-06T19:03:52+00:00</updated>
<author>
<name>Martin K. Petersen</name>
<email>martin.petersen@oracle.com</email>
</author>
<published>2010-12-01T18:41:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3b5dd0d60f277a0f991e548894defeb6665d2c20'/>
<id>urn:sha1:3b5dd0d60f277a0f991e548894defeb6665d2c20</id>
<content type='text'>
commit e692cb668fdd5a712c6ed2a2d6f2a36ee83997b4 upstream.

When stacking devices, a request_queue is not always available. This
forced us to have a no_cluster flag in the queue_limits that could be
used as a carrier until the request_queue had been set up for a
metadevice.

There were several problems with that approach. First of all it was up
to the stacking device to remember to set queue flag after stacking had
completed. Also, the queue flag and the queue limits had to be kept in
sync at all times. We got that wrong, which could lead to us issuing
commands that went beyond the max scatterlist limit set by the driver.

The proper fix is to avoid having two flags for tracking the same thing.
We deprecate QUEUE_FLAG_CLUSTER and use the queue limit directly in the
block layer merging functions. The queue_limit 'no_cluster' is turned
into 'cluster' to avoid double negatives and to ease stacking.
Clustering defaults to being enabled as before. The queue flag logic is
removed from the stacking function, and explicitly setting the cluster
flag is no longer necessary in DM and MD.

Reported-by: Ed Lin &lt;ed.lin@promise.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Acked-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;

</content>
</entry>
</feed>
