<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md/raid5.c, branch v4.11.5</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.11.5</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.11.5'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2017-05-25T13:46:12+00:00</updated>
<entry>
<title>md: update slab_cache before releasing new stripes when stripes resizing</title>
<updated>2017-05-25T13:46:12+00:00</updated>
<author>
<name>Dennis Yang</name>
<email>dennisyang@qnap.com</email>
</author>
<published>2017-03-29T07:46:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1800fde3093e035ab37a68597b850c23437ae964'/>
<id>urn:sha1:1800fde3093e035ab37a68597b850c23437ae964</id>
<content type='text'>
commit 583da48e388f472e8818d9bb60ef6a1d40ee9f9d upstream.

When growing raid5 device on machine with small memory, there is chance that
mdadm will be killed and the following bug report can be observed. The same
bug could also be reproduced in linux-4.10.6.

[57600.075774] BUG: unable to handle kernel NULL pointer dereference at           (null)
[57600.083796] IP: [&lt;ffffffff81a6aa87&gt;] _raw_spin_lock+0x7/0x20
[57600.110378] PGD 421cf067 PUD 4442d067 PMD 0
[57600.114678] Oops: 0002 [#1] SMP
[57600.180799] CPU: 1 PID: 25990 Comm: mdadm Tainted: P           O    4.2.8 #1
[57600.187849] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS QV05AR66 03/06/2013
[57600.197490] task: ffff880044e47240 ti: ffff880043070000 task.ti: ffff880043070000
[57600.204963] RIP: 0010:[&lt;ffffffff81a6aa87&gt;]  [&lt;ffffffff81a6aa87&gt;] _raw_spin_lock+0x7/0x20
[57600.213057] RSP: 0018:ffff880043073810  EFLAGS: 00010046
[57600.218359] RAX: 0000000000000000 RBX: 000000000000000c RCX: ffff88011e296dd0
[57600.225486] RDX: 0000000000000001 RSI: ffffe8ffffcb46c0 RDI: 0000000000000000
[57600.232613] RBP: ffff880043073878 R08: ffff88011e5f8170 R09: 0000000000000282
[57600.239739] R10: 0000000000000005 R11: 28f5c28f5c28f5c3 R12: ffff880043073838
[57600.246872] R13: ffffe8ffffcb46c0 R14: 0000000000000000 R15: ffff8800b9706a00
[57600.253999] FS:  00007f576106c700(0000) GS:ffff88011e280000(0000) knlGS:0000000000000000
[57600.262078] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[57600.267817] CR2: 0000000000000000 CR3: 00000000428fe000 CR4: 00000000001406e0
[57600.274942] Stack:
[57600.276949]  ffffffff8114ee35 ffff880043073868 0000000000000282 000000000000eb3f
[57600.284383]  ffffffff81119043 ffff880043073838 ffff880043073838 ffff88003e197b98
[57600.291820]  ffffe8ffffcb46c0 ffff88003e197360 0000000000000286 ffff880043073968
[57600.299254] Call Trace:
[57600.301698]  [&lt;ffffffff8114ee35&gt;] ? cache_flusharray+0x35/0xe0
[57600.307523]  [&lt;ffffffff81119043&gt;] ? __page_cache_release+0x23/0x110
[57600.313779]  [&lt;ffffffff8114eb53&gt;] kmem_cache_free+0x63/0xc0
[57600.319344]  [&lt;ffffffff81579942&gt;] drop_one_stripe+0x62/0x90
[57600.324915]  [&lt;ffffffff81579b5b&gt;] raid5_cache_scan+0x8b/0xb0
[57600.330563]  [&lt;ffffffff8111b98a&gt;] shrink_slab.part.36+0x19a/0x250
[57600.336650]  [&lt;ffffffff8111e38c&gt;] shrink_zone+0x23c/0x250
[57600.342039]  [&lt;ffffffff8111e4f3&gt;] do_try_to_free_pages+0x153/0x420
[57600.348210]  [&lt;ffffffff8111e851&gt;] try_to_free_pages+0x91/0xa0
[57600.353959]  [&lt;ffffffff811145b1&gt;] __alloc_pages_nodemask+0x4d1/0x8b0
[57600.360303]  [&lt;ffffffff8157a30b&gt;] check_reshape+0x62b/0x770
[57600.365866]  [&lt;ffffffff8157a4a5&gt;] raid5_check_reshape+0x55/0xa0
[57600.371778]  [&lt;ffffffff81583df7&gt;] update_raid_disks+0xc7/0x110
[57600.377604]  [&lt;ffffffff81592b73&gt;] md_ioctl+0xd83/0x1b10
[57600.382827]  [&lt;ffffffff81385380&gt;] blkdev_ioctl+0x170/0x690
[57600.388307]  [&lt;ffffffff81195238&gt;] block_ioctl+0x38/0x40
[57600.393525]  [&lt;ffffffff811731c5&gt;] do_vfs_ioctl+0x2b5/0x480
[57600.399010]  [&lt;ffffffff8115e07b&gt;] ? vfs_write+0x14b/0x1f0
[57600.404400]  [&lt;ffffffff811733cc&gt;] SyS_ioctl+0x3c/0x70
[57600.409447]  [&lt;ffffffff81a6ad97&gt;] entry_SYSCALL_64_fastpath+0x12/0x6a
[57600.415875] Code: 00 00 00 00 55 48 89 e5 8b 07 85 c0 74 04 31 c0 5d c3 ba 01 00 00 00 f0 0f b1 17 85 c0 75 ef b0 01 5d c3 90 31 c0 ba 01 00 00 00 &lt;f0&gt; 0f b1 17 85 c0 75 01 c3 55 89 c6 48 89 e5 e8 85 d1 63 ff 5d
[57600.435460] RIP  [&lt;ffffffff81a6aa87&gt;] _raw_spin_lock+0x7/0x20
[57600.441208]  RSP &lt;ffff880043073810&gt;
[57600.444690] CR2: 0000000000000000
[57600.448000] ---[ end trace cbc6b5cc4bf9831d ]---

The problem is that resize_stripes() releases new stripe_heads before assigning new
slab cache to conf-&gt;slab_cache. If the shrinker function raid5_cache_scan() gets called
after resize_stripes() starting releasing new stripes but right before new slab cache
being assigned, it is possible that these new stripe_heads will be freed with the old
slab_cache which was already been destoryed and that triggers this bug.

Signed-off-by: Dennis Yang &lt;dennisyang@qnap.com&gt;
Fixes: edbe83ab4c27 ("md/raid5: allow the stripe_cache to grow and shrink.")
Reviewed-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md/r5cache: fix set_syndrome_sources() for data in cache</title>
<updated>2017-03-14T16:57:10+00:00</updated>
<author>
<name>Song Liu</name>
<email>songliubraving@fb.com</email>
</author>
<published>2017-03-13T20:44:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0977762f6d15f13caccc20d71a5dec47d098907d'/>
<id>urn:sha1:0977762f6d15f13caccc20d71a5dec47d098907d</id>
<content type='text'>
Before this patch, device InJournal will be included in prexor
(SYNDROME_SRC_WANT_DRAIN) but not in reconstruct (SYNDROME_SRC_WRITTEN). So it
will break parity calculation. With srctype == SYNDROME_SRC_WRITTEN, we need
include both dev with non-null -&gt;written and dev with R5_InJournal. This fixes
logic in 1e6d690(md/r5cache: caching phase of r5cache)

Cc: stable@vger.kernel.org (v4.10+)
Signed-off-by: Song Liu &lt;songliubraving@fb.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>md: move funcs from pers-&gt;resize to update_size</title>
<updated>2017-03-09T17:02:18+00:00</updated>
<author>
<name>Guoqing Jiang</name>
<email>gqjiang@suse.com</email>
</author>
<published>2017-02-24T03:15:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c94836342192b05d599d6aa3397f732f7a238689'/>
<id>urn:sha1:c94836342192b05d599d6aa3397f732f7a238689</id>
<content type='text'>
raid1_resize and raid5_resize should also check the
mddev-&gt;queue if run underneath dm-raid.

And both set_capacity and revalidate_disk are used in
pers-&gt;resize such as raid1, raid10 and raid5. So
move them from personality file to common code.

Reviewed-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Guoqing Jiang &lt;gqjiang@suse.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>sched/headers: Prepare for new header dependencies before moving code to &lt;linux/sched/signal.h&gt;</title>
<updated>2017-03-02T07:42:29+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-08T17:51:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3f07c0144132e4f59d88055ac8ff3e691a5fa2b8'/>
<id>urn:sha1:3f07c0144132e4f59d88055ac8ff3e691a5fa2b8</id>
<content type='text'>
We are going to split &lt;linux/sched/signal.h&gt; out of &lt;linux/sched.h&gt;, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder &lt;linux/sched/signal.h&gt; file that just
maps to &lt;linux/sched.h&gt; to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md</title>
<updated>2017-02-24T22:42:19+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-02-24T22:42:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a682e0035494c449e53a57d039f86f75b9e2fe67'/>
<id>urn:sha1:a682e0035494c449e53a57d039f86f75b9e2fe67</id>
<content type='text'>
Pull md updates from Shaohua Li:
 "Mainly fixes bugs and improves performance:

   - Improve scalability for raid1 from Coly

   - Improve raid5-cache read performance, disk efficiency and IO
     pattern from Song and me

   - Fix a race condition of disk hotplug for linear from Coly

   - A few cleanup patches from Ming and Byungchul

   - Fix a memory leak from Neil

   - Fix WRITE SAME IO failure from me

   - Add doc for raid5-cache from me"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: (23 commits)
  md/raid1: fix write behind issues introduced by bio_clone_bioset_partial
  md/raid1: handle flush request correctly
  md/linear: shutup lockdep warnning
  md/raid1: fix a use-after-free bug
  RAID1: avoid unnecessary spin locks in I/O barrier code
  RAID1: a new I/O barrier implementation to remove resync window
  md/raid5: Don't reinvent the wheel but use existing llist API
  md: fast clone bio in bio_clone_mddev()
  md: remove unnecessary check on mddev
  md/raid1: use bio_clone_bioset_partial() in case of write behind
  md: fail if mddev-&gt;bio_set can't be created
  block: introduce bio_clone_bioset_partial()
  md: disable WRITE SAME if it fails in underlayer disks
  md/raid5-cache: exclude reclaiming stripes in reclaim check
  md/raid5-cache: stripe reclaim only counts valid stripes
  MD: add doc for raid5-cache
  Documentation: move MD related doc into a separate dir
  md: ensure md devices are freed before module is unloaded.
  md/r5cache: improve journal device efficiency
  md/r5cache: enable chunk_aligned_read with write back cache
  ...
</content>
</entry>
<entry>
<title>Merge branch 'for-4.11/next' into for-4.11/linus-merge</title>
<updated>2017-02-17T21:08:19+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2017-02-17T21:08:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=818551e2b2c662a1b26de6b4f7d6b8411a838d18'/>
<id>urn:sha1:818551e2b2c662a1b26de6b4f7d6b8411a838d18</id>
<content type='text'>
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>md/raid5: Don't reinvent the wheel but use existing llist API</title>
<updated>2017-02-16T22:49:05+00:00</updated>
<author>
<name>Byungchul Park</name>
<email>byungchul.park@lge.com</email>
</author>
<published>2017-02-14T07:26:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=eae8263fb1f4256460270dd8f42334604dcdfac6'/>
<id>urn:sha1:eae8263fb1f4256460270dd8f42334604dcdfac6</id>
<content type='text'>
Although llist provides proper APIs, they are not used. Make them used.

Signed-off-by: Byungchul Park &lt;byungchul.park@lge.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>md: fast clone bio in bio_clone_mddev()</title>
<updated>2017-02-15T19:24:54+00:00</updated>
<author>
<name>Ming Lei</name>
<email>tom.leiming@gmail.com</email>
</author>
<published>2017-02-14T15:29:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d7a1030839d35c04a620e841f406b9b2a8600041'/>
<id>urn:sha1:d7a1030839d35c04a620e841f406b9b2a8600041</id>
<content type='text'>
Firstly bio_clone_mddev() is used in raid normal I/O and isn't
in resync I/O path.

Secondly all the direct access to bvec table in raid happens on
resync I/O except for write behind of raid1, in which we still
use bio_clone() for allocating new bvec table.

So this patch replaces bio_clone() with bio_clone_fast()
in bio_clone_mddev().

Also kill bio_clone_mddev() and call bio_clone_fast() directly, as
suggested by Christoph Hellwig.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ming Lei &lt;tom.leiming@gmail.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>md/raid5-cache: exclude reclaiming stripes in reclaim check</title>
<updated>2017-02-13T17:20:05+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-02-11T00:18:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e33fbb9cc73d6502e69eaf1c178e0c39059763ea'/>
<id>urn:sha1:e33fbb9cc73d6502e69eaf1c178e0c39059763ea</id>
<content type='text'>
stripes which are being reclaimed are still accounted into cached
stripes. The reclaim takes time. r5c_do_reclaim isn't aware of the
stripes and does unnecessary stripe reclaim. In practice, I saw one
stripe is reclaimed one time. This will cause bad IO pattern. Fixing
this by excluding the reclaing stripes in the check.

Cc: Song Liu &lt;songliubraving@fb.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>md/r5cache: improve journal device efficiency</title>
<updated>2017-02-13T17:17:52+00:00</updated>
<author>
<name>Song Liu</name>
<email>songliubraving@fb.com</email>
</author>
<published>2017-01-24T22:08:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=39b99586b321a9cbd4fe9abb5ea9346d2c4ca9c6'/>
<id>urn:sha1:39b99586b321a9cbd4fe9abb5ea9346d2c4ca9c6</id>
<content type='text'>
It is important to be able to flush all stripes in raid5-cache.
Therefore, we need reserve some space on the journal device for
these flushes. If flush operation includes pending writes to the
stripe, we need to reserve (conf-&gt;raid_disk + 1) pages per stripe
for the flush out. This reduces the efficiency of journal space.
If we exclude these pending writes from flush operation, we only
need (conf-&gt;max_degraded + 1) pages per stripe.

With this patch, when log space is critical (R5C_LOG_CRITICAL=1),
pending writes will be excluded from stripe flush out. Therefore,
we can reduce reserved space for flush out and thus improve journal
device efficiency.

Signed-off-by: Song Liu &lt;songliubraving@fb.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
</feed>
