<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md/bcache/request.c, branch v4.4.235</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.4.235</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.4.235'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2018-05-30T05:49:02+00:00</updated>
<entry>
<title>bcache: fix kcrashes with fio in RAID5 backend dev</title>
<updated>2018-05-30T05:49:02+00:00</updated>
<author>
<name>Tang Junhui</name>
<email>tang.junhui@zte.com.cn</email>
</author>
<published>2018-02-27T17:49:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3d0dca33a2346608db5c3466f2f654e1e2e9e76e'/>
<id>urn:sha1:3d0dca33a2346608db5c3466f2f654e1e2e9e76e</id>
<content type='text'>
[ Upstream commit 60eb34ec5526e264c2bbaea4f7512d714d791caf ]

Kernel crashed when run fio in a RAID5 backend bcache device, the call
trace is bellow:
[  440.012034] kernel BUG at block/blk-ioc.c:146!
[  440.012696] invalid opcode: 0000 [#1] SMP NOPTI
[  440.026537] CPU: 2 PID: 2205 Comm: md127_raid5 Not tainted 4.15.0 #8
[  440.027441] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 07/16
/2015
[  440.028615] RIP: 0010:put_io_context+0x8b/0x90
[  440.029246] RSP: 0018:ffffa8c882b43af8 EFLAGS: 00010246
[  440.029990] RAX: 0000000000000000 RBX: ffffa8c88294fca0 RCX: 0000000000
0f4240
[  440.031006] RDX: 0000000000000004 RSI: 0000000000000286 RDI: ffffa8c882
94fca0
[  440.032030] RBP: ffffa8c882b43b10 R08: 0000000000000003 R09: ffff949cb8
0c1700
[  440.033206] R10: 0000000000000104 R11: 000000000000b71c R12: 00000000000
01000
[  440.034222] R13: 0000000000000000 R14: ffff949cad84db70 R15: ffff949cb11
bd1e0
[  440.035239] FS:  0000000000000000(0000) GS:ffff949cba280000(0000) knlGS:
0000000000000000
[  440.060190] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  440.084967] CR2: 00007ff0493ef000 CR3: 00000002f1e0a002 CR4: 00000000001
606e0
[  440.110498] Call Trace:
[  440.135443]  bio_disassociate_task+0x1b/0x60
[  440.160355]  bio_free+0x1b/0x60
[  440.184666]  bio_put+0x23/0x30
[  440.208272]  search_free+0x23/0x40 [bcache]
[  440.231448]  cached_dev_write_complete+0x31/0x70 [bcache]
[  440.254468]  closure_put+0xb6/0xd0 [bcache]
[  440.277087]  request_endio+0x30/0x40 [bcache]
[  440.298703]  bio_endio+0xa1/0x120
[  440.319644]  handle_stripe+0x418/0x2270 [raid456]
[  440.340614]  ? load_balance+0x17b/0x9c0
[  440.360506]  handle_active_stripes.isra.58+0x387/0x5a0 [raid456]
[  440.380675]  ? __release_stripe+0x15/0x20 [raid456]
[  440.400132]  raid5d+0x3ed/0x5d0 [raid456]
[  440.419193]  ? schedule+0x36/0x80
[  440.437932]  ? schedule_timeout+0x1d2/0x2f0
[  440.456136]  md_thread+0x122/0x150
[  440.473687]  ? wait_woken+0x80/0x80
[  440.491411]  kthread+0x102/0x140
[  440.508636]  ? find_pers+0x70/0x70
[  440.524927]  ? kthread_associate_blkcg+0xa0/0xa0
[  440.541791]  ret_from_fork+0x35/0x40
[  440.558020] Code: c2 48 00 5b 41 5c 41 5d 5d c3 48 89 c6 4c 89 e7 e8 bb c2
48 00 48 8b 3d bc 36 4b 01 48 89 de e8 7c f7 e0 ff 5b 41 5c 41 5d 5d c3 &lt;0f&gt; 0b
0f 1f 00 0f 1f 44 00 00 55 48 8d 47 b8 48 89 e5 41 57 41
[  440.610020] RIP: put_io_context+0x8b/0x90 RSP: ffffa8c882b43af8
[  440.628575] ---[ end trace a1fd79d85643a73e ]--

All the crash issue happened when a bypass IO coming, in such scenario
s-&gt;iop.bio is pointed to the s-&gt;orig_bio. In search_free(), it finishes the
s-&gt;orig_bio by calling bio_complete(), and after that, s-&gt;iop.bio became
invalid, then kernel would crash when calling bio_put(). Maybe its upper
layer's faulty, since bio should not be freed before we calling bio_put(),
but we'd better calling bio_put() first before calling bio_complete() to
notify upper layer ending this bio.

This patch moves bio_complete() under bio_put() to avoid kernel crash.

[mlyle: fixed commit subject for character limits]

Reported-by: Matthias Ferdinand &lt;bcache@mfedv.net&gt;
Tested-by: Matthias Ferdinand &lt;bcache@mfedv.net&gt;
Signed-off-by: Tang Junhui &lt;tang.junhui@zte.com.cn&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>bcache: fix wrong cache_misses statistics</title>
<updated>2017-12-20T09:04:59+00:00</updated>
<author>
<name>tang.junhui</name>
<email>tang.junhui@zte.com.cn</email>
</author>
<published>2017-10-30T21:46:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=230c4ba404d35ca035e76ebb268a1d86187a581b'/>
<id>urn:sha1:230c4ba404d35ca035e76ebb268a1d86187a581b</id>
<content type='text'>
[ Upstream commit c157313791a999646901b3e3c6888514ebc36d62 ]

Currently, Cache missed IOs are identified by s-&gt;cache_miss, but actually,
there are many situations that missed IOs are not assigned a value for
s-&gt;cache_miss in cached_dev_cache_miss(), for example, a bypassed IO
(s-&gt;iop.bypass = 1), or the cache_bio allocate failed. In these situations,
it will go to out_put or out_submit, and s-&gt;cache_miss is null, which leads
bch_mark_cache_accounting() to treat this IO as a hit IO.

[ML: applied by 3-way merge]

Signed-off-by: tang.junhui &lt;tang.junhui@zte.com.cn&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Reviewed-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>bcache: recover data from backing when data is clean</title>
<updated>2017-12-09T17:42:37+00:00</updated>
<author>
<name>Rui Hua</name>
<email>huarui.dev@gmail.com</email>
</author>
<published>2017-11-24T23:14:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3f7477e64478c7d22642a97c25eb43649d2a480c'/>
<id>urn:sha1:3f7477e64478c7d22642a97c25eb43649d2a480c</id>
<content type='text'>
commit e393aa2446150536929140739f09c6ecbcbea7f0 upstream.

When we send a read request and hit the clean data in cache device, there
is a situation called cache read race in bcache(see the commit in the tail
of cache_look_up(), the following explaination just copy from there):
The bucket we're reading from might be reused while our bio is in flight,
and we could then end up reading the wrong data. We guard against this
by checking (in bch_cache_read_endio()) if the pointer is stale again;
if so, we treat it as an error (s-&gt;iop.error = -EINTR) and reread from
the backing device (but we don't pass that error up anywhere)

It should be noted that cache read race happened under normal
circumstances, not the circumstance when SSD failed, it was counted
and shown in  /sys/fs/bcache/XXX/internal/cache_read_races.

Without this patch, when we use writeback mode, we will never reread from
the backing device when cache read race happened, until the whole cache
device is clean, because the condition
(s-&gt;recoverable &amp;&amp; (dc &amp;&amp; !atomic_read(&amp;dc-&gt;has_dirty))) is false in
cached_dev_read_error(). In this situation, the s-&gt;iop.error(= -EINTR)
will be passed up, at last, user will receive -EINTR when it's bio end,
this is not suitable, and wield to up-application.

In this patch, we use s-&gt;read_dirty_data to judge whether the read
request hit dirty data in cache device, it is safe to reread data from
the backing device when the read request hit clean data. This can not
only handle cache read race, but also recover data when failed read
request from cache device.

[edited by mlyle to fix up whitespace, commit log title, comment
spelling]

Fixes: d59b23795933 ("bcache: only permit to recovery read error when cache device is clean")
Signed-off-by: Hua Rui &lt;huarui.dev@gmail.com&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Reviewed-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bcache: only permit to recovery read error when cache device is clean</title>
<updated>2017-12-09T17:42:37+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2017-10-30T21:46:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f80f34d8ba92b29f84228f91f9b6c0e0fca5c641'/>
<id>urn:sha1:f80f34d8ba92b29f84228f91f9b6c0e0fca5c641</id>
<content type='text'>
commit d59b23795933678c9638fd20c942d2b4f3cd6185 upstream.

When bcache does read I/Os, for example in writeback or writethrough mode,
if a read request on cache device is failed, bcache will try to recovery
the request by reading from cached device. If the data on cached device is
not synced with cache device, then requester will get a stale data.

For critical storage system like database, providing stale data from
recovery may result an application level data corruption, which is
unacceptible.

With this patch, for a failed read request in writeback or writethrough
mode, recovery a recoverable read request only happens when cache device
is clean. That is to say, all data on cached device is up to update.

For other cache modes in bcache, read request will never hit
cached_dev_read_error(), they don't need this patch.

Please note, because cache mode can be switched arbitrarily in run time, a
writethrough mode might be switched from a writeback mode. Therefore
checking dc-&gt;has_data in writethrough mode still makes sense.

Changelog:
V4: Fix parens error pointed by Michael Lyle.
v3: By response from Kent Oversteet, he thinks recovering stale data is a
    bug to fix, and option to permit it is unnecessary. So this version
    the sysfs file is removed.
v2: rename sysfs entry from allow_stale_data_on_failure  to
    allow_stale_data_on_failure, and fix the confusing commit log.
v1: initial patch posted.

[small change to patch comment spelling by mlyle]

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Reported-by: Arne Wolf &lt;awolf@lenovo.com&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Nix &lt;nix@esperi.org.uk&gt;
Cc: Kai Krakow &lt;hurikhan77@gmail.com&gt;
Cc: Eric Wheeler &lt;bcache@lists.ewheeler.net&gt;
Cc: Junhui Tang &lt;tang.junhui@zte.com.cn&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bcache: do not subtract sectors_to_gc for bypassed IO</title>
<updated>2017-09-27T09:00:16+00:00</updated>
<author>
<name>Tang Junhui</name>
<email>tang.junhui@zte.com.cn</email>
</author>
<published>2017-09-06T06:25:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0471f58e18e60dcdbd89f882dc0fa6370b94cd48'/>
<id>urn:sha1:0471f58e18e60dcdbd89f882dc0fa6370b94cd48</id>
<content type='text'>
commit 69daf03adef5f7bc13e0ac86b4b8007df1767aab upstream.

Since bypassed IOs use no bucket, so do not subtract sectors_to_gc to
trigger gc thread.

Signed-off-by: tang.junhui &lt;tang.junhui@zte.com.cn&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Eric Wheeler &lt;bcache@linux.ewheeler.net&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bcache: Make gc wakeup sane, remove set_task_state()</title>
<updated>2017-02-23T16:43:10+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@gmail.com</email>
</author>
<published>2016-10-27T03:31:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6f26f0ba2435a49179b5f9c452afde0600e24987'/>
<id>urn:sha1:6f26f0ba2435a49179b5f9c452afde0600e24987</id>
<content type='text'>
commit be628be09563f8f6e81929efbd7cf3f45c344416 upstream.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>block: change -&gt;make_request_fn() and users to return a queue cookie</title>
<updated>2015-11-07T17:40:46+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-11-05T17:41:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dece16353ef47d8d33f5302bc158072a9d65e26f'/>
<id>urn:sha1:dece16353ef47d8d33f5302bc158072a9d65e26f</id>
<content type='text'>
No functional changes in this patch, but it prepares us for returning
a more useful cookie related to the IO that was queued up.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Keith Busch &lt;keith.busch@intel.com&gt;
</content>
</entry>
<entry>
<title>bcache: remove driver private bio splitting code</title>
<updated>2015-08-13T18:31:40+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@gmail.com</email>
</author>
<published>2013-11-24T07:11:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=749b61dab30736eb95b1ee23738cae90973d4fc3'/>
<id>urn:sha1:749b61dab30736eb95b1ee23738cae90973d4fc3</id>
<content type='text'>
The bcache driver has always accepted arbitrarily large bios and split
them internally.  Now that every driver must accept arbitrarily large
bios this code isn't nessecary anymore.

Cc: linux-bcache@vger.kernel.org
Signed-off-by: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
[dpark: add more description in commit message]
Signed-off-by: Dongsu Park &lt;dpark@posteo.net&gt;
Signed-off-by: Ming Lin &lt;ming.l@ssi.samsung.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>block: add a bi_error field to struct bio</title>
<updated>2015-07-29T14:55:15+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2015-07-20T13:29:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4246a0b63bd8f56a1469b12eafeb875b1041a451'/>
<id>urn:sha1:4246a0b63bd8f56a1469b12eafeb875b1041a451</id>
<content type='text'>
Currently we have two different ways to signal an I/O error on a BIO:

 (1) by clearing the BIO_UPTODATE flag
 (2) by returning a Linux errno value to the bi_end_io callback

The first one has the drawback of only communicating a single possible
error (-EIO), and the second one has the drawback of not beeing persistent
when bios are queued up, and are not passed along from child to parent
bio in the ever more popular chaining scenario.  Having both mechanisms
available has the additional drawback of utterly confusing driver authors
and introducing bugs where various I/O submitters only deal with one of
them, and the others have to add boilerplate code to deal with both kinds
of error returns.

So add a new bi_error field to store an errno value directly in struct
bio and remove the existing mechanisms to clean all this up.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>bcache: don't embed 'return' statements in closure macros</title>
<updated>2015-07-11T15:57:32+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-03-06T15:37:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=77b5a08427e87514c33730afc18cd02c9475e2c3'/>
<id>urn:sha1:77b5a08427e87514c33730afc18cd02c9475e2c3</id>
<content type='text'>
This is horribly confusing, it breaks the flow of the code without
it being apparent in the caller.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
</feed>
