<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md/bcache/alloc.c, branch v4.19.77</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.19.77</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.19.77'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2019-07-26T07:14:13+00:00</updated>
<entry>
<title>bcache: check CACHE_SET_IO_DISABLE in allocator code</title>
<updated>2019-07-26T07:14:13+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-06-28T11:59:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=57cfb755c356a256daae7a78d6358f3d4f0d0e75'/>
<id>urn:sha1:57cfb755c356a256daae7a78d6358f3d4f0d0e75</id>
<content type='text'>
[ Upstream commit e775339e1ae1205b47d94881db124c11385e597c ]

If CACHE_SET_IO_DISABLE of a cache set flag is set by too many I/O
errors, currently allocator routines can still continue allocate
space which may introduce inconsistent metadata state.

This patch checkes CACHE_SET_IO_DISABLE bit in following allocator
routines,
- bch_bucket_alloc()
- __bch_bucket_alloc_set()
Once CACHE_SET_IO_DISABLE is set on cache set, the allocator routines
may reject allocation request earlier to avoid potential inconsistent
metadata.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bcache: avoid clang -Wunintialized warning</title>
<updated>2019-05-31T13:46:15+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2019-04-24T16:48:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=06740892db92f85322bb0cc509b69df92a663963'/>
<id>urn:sha1:06740892db92f85322bb0cc509b69df92a663963</id>
<content type='text'>
[ Upstream commit 78d4eb8ad9e1d413449d1b7a060f50b6efa81ebd ]

clang has identified a code path in which it thinks a
variable may be unused:

drivers/md/bcache/alloc.c:333:4: error: variable 'bucket' is used uninitialized whenever 'if' condition is false
      [-Werror,-Wsometimes-uninitialized]
                        fifo_pop(&amp;ca-&gt;free_inc, bucket);
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/md/bcache/util.h:219:27: note: expanded from macro 'fifo_pop'
 #define fifo_pop(fifo, i)       fifo_pop_front(fifo, (i))
                                ^~~~~~~~~~~~~~~~~~~~~~~~~
drivers/md/bcache/util.h:189:6: note: expanded from macro 'fifo_pop_front'
        if (_r) {                                                       \
            ^~
drivers/md/bcache/alloc.c:343:46: note: uninitialized use occurs here
                        allocator_wait(ca, bch_allocator_push(ca, bucket));
                                                                  ^~~~~~
drivers/md/bcache/alloc.c:287:7: note: expanded from macro 'allocator_wait'
                if (cond)                                               \
                    ^~~~
drivers/md/bcache/alloc.c:333:4: note: remove the 'if' if its condition is always true
                        fifo_pop(&amp;ca-&gt;free_inc, bucket);
                        ^
drivers/md/bcache/util.h:219:27: note: expanded from macro 'fifo_pop'
 #define fifo_pop(fifo, i)       fifo_pop_front(fifo, (i))
                                ^
drivers/md/bcache/util.h:189:2: note: expanded from macro 'fifo_pop_front'
        if (_r) {                                                       \
        ^
drivers/md/bcache/alloc.c:331:15: note: initialize the variable 'bucket' to silence this warning
                        long bucket;
                                   ^

This cannot happen in practice because we only enter the loop
if there is at least one element in the list.

Slightly rearranging the code makes this clearer to both the
reader and the compiler, which avoids the warning.

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bcache: style fix to add a blank line after declarations</title>
<updated>2018-08-11T21:46:41+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2018-08-11T05:19:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1fae7cf05293d3a2c9e59c1bc59372322386467c'/>
<id>urn:sha1:1fae7cf05293d3a2c9e59c1bc59372322386467c</id>
<content type='text'>
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Shenghui Wang &lt;shhuiw@foxmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: style fix to replace 'unsigned' by 'unsigned int'</title>
<updated>2018-08-11T21:46:41+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2018-08-11T05:19:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6f10f7d1b02b1bbc305f88d7696445dd38b13881'/>
<id>urn:sha1:6f10f7d1b02b1bbc305f88d7696445dd38b13881</id>
<content type='text'>
This patch fixes warning reported by checkpatch.pl by replacing 'unsigned'
with 'unsigned int'.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Shenghui Wang &lt;shhuiw@foxmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: add wait_for_kthread_stop() in bch_allocator_thread()</title>
<updated>2018-05-03T14:35:13+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2018-05-03T10:51:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ecb2ba8cb83549f1bb06bc7e693ae8fed43c0e4f'/>
<id>urn:sha1:ecb2ba8cb83549f1bb06bc7e693ae8fed43c0e4f</id>
<content type='text'>
When CACHE_SET_IO_DISABLE is set on cache set flags, bcache allocator
thread routine bch_allocator_thread() may stop the while-loops and
exit. Then it is possible to observe the following kernel oops message,

[  631.068366] bcache: bch_btree_insert() error -5
[  631.069115] bcache: cached_dev_detach_finish() Caching disabled for sdf
[  631.070220] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[  631.070250] PGD 0 P4D 0
[  631.070261] Oops: 0002 [#1] SMP PTI
[snipped]
[  631.070578] Workqueue: events cache_set_flush [bcache]
[  631.070597] RIP: 0010:exit_creds+0x1b/0x50
[  631.070610] RSP: 0018:ffffc9000705fe08 EFLAGS: 00010246
[  631.070626] RAX: 0000000000000001 RBX: ffff880a622ad300 RCX: 000000000000000b
[  631.070645] RDX: 0000000000000601 RSI: 000000000000000c RDI: 0000000000000000
[  631.070663] RBP: ffff880a622ad300 R08: ffffea00190c66e0 R09: 0000000000000200
[  631.070682] R10: ffff880a48123000 R11: ffff880000000000 R12: 0000000000000000
[  631.070700] R13: ffff880a4b160e40 R14: ffff880a4b160000 R15: 0ffff880667e2530
[  631.070719] FS:  0000000000000000(0000) GS:ffff880667e00000(0000) knlGS:0000000000000000
[  631.070740] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  631.070755] CR2: 0000000000000000 CR3: 000000000200a001 CR4: 00000000003606e0
[  631.070774] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  631.070793] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  631.070811] Call Trace:
[  631.070828]  __put_task_struct+0x55/0x160
[  631.070845]  kthread_stop+0xee/0x100
[  631.070863]  cache_set_flush+0x11d/0x1a0 [bcache]
[  631.070879]  process_one_work+0x146/0x340
[  631.070892]  worker_thread+0x47/0x3e0
[  631.070906]  kthread+0xf5/0x130
[  631.070917]  ? max_active_store+0x60/0x60
[  631.070930]  ? kthread_bind+0x10/0x10
[  631.070945]  ret_from_fork+0x35/0x40
[snipped]
[  631.071017] RIP: exit_creds+0x1b/0x50 RSP: ffffc9000705fe08
[  631.071033] CR2: 0000000000000000
[  631.071045] ---[ end trace 011c63a24b22c927 ]---
[  631.071085] bcache: bcache_device_free() bcache0 stopped

The reason is when cache_set_flush() tries to call kthread_stop() to stop
allocator thread, but it exits already due to cache device I/O errors.

This patch adds wait_for_kthread_stop() at tail of bch_allocator_thread(),
to prevent the thread routine exiting directly. Then the allocator thread
can be blocked at wait_for_kthread_stop() and wait for cache_set_flush()
to stop it by calling kthread_stop().

changelog:
v3: add Reviewed-by from Hannnes.
v2: not directly return from allocator_wait(), move 'return 0' to tail of
    bch_allocator_thread().
v1: initial version.

Fixes: 771f393e8ffc ("bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags")
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags</title>
<updated>2018-03-19T02:15:20+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2018-03-19T00:36:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=771f393e8ffc9b3066e4830ee5f7391b8e8874f1'/>
<id>urn:sha1:771f393e8ffc9b3066e4830ee5f7391b8e8874f1</id>
<content type='text'>
When too many I/Os failed on cache device, bch_cache_set_error() is called
in the error handling code path to retire whole problematic cache set. If
new I/O requests continue to come and take refcount dc-&gt;count, the cache
set won't be retired immediately, this is a problem.

Further more, there are several kernel thread and self-armed kernel work
may still running after bch_cache_set_error() is called. It needs to wait
quite a while for them to stop, or they won't stop at all. They also
prevent the cache set from being retired.

The solution in this patch is, to add per cache set flag to disable I/O
request on this cache and all attached backing devices. Then new coming I/O
requests can be rejected in *_make_request() before taking refcount, kernel
threads and self-armed kernel worker can stop very fast when flags bit
CACHE_SET_IO_DISABLE is set.

Because bcache also do internal I/Os for writeback, garbage collection,
bucket allocation, journaling, this kind of I/O should be disabled after
bch_cache_set_error() is called. So closure_bio_submit() is modified to
check whether CACHE_SET_IO_DISABLE is set on cache_set-&gt;flags. If set,
closure_bio_submit() will set bio-&gt;bi_status to BLK_STS_IOERR and
return, generic_make_request() won't be called.

A sysfs interface is also added to set or clear CACHE_SET_IO_DISABLE bit
from cache_set-&gt;flags, to disable or enable cache set I/O for debugging. It
is helpful to trigger more corner case issues for failed cache device.

Changelog
v4, add wait_for_kthread_stop(), and call it before exits writeback and gc
    kernel threads.
v3, change CACHE_SET_IO_DISABLE from 4 to 3, since it is bit index.
    remove "bcache: " prefix when printing out kernel message.
v2, more changes by previous review,
- Use CACHE_SET_IO_DISABLE of cache_set-&gt;flags, suggested by Junhui.
- Check CACHE_SET_IO_DISABLE in bch_btree_gc() to stop a while-loop, this
  is reported and inspired from origal patch of Pavel Vazharov.
v1, initial version.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Cc: Junhui Tang &lt;tang.junhui@zte.com.cn&gt;
Cc: Michael Lyle &lt;mlyle@lyle.org&gt;
Cc: Pavel Vazharov &lt;freakpv@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: properly set task state in bch_writeback_thread()</title>
<updated>2018-02-07T19:50:01+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2018-02-07T19:41:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=99361bbf26337186f02561109c17a4c4b1a7536a'/>
<id>urn:sha1:99361bbf26337186f02561109c17a4c4b1a7536a</id>
<content type='text'>
Kernel thread routine bch_writeback_thread() has the following code block,

447         down_write(&amp;dc-&gt;writeback_lock);
448~450     if (check conditions) {
451                 up_write(&amp;dc-&gt;writeback_lock);
452                 set_current_state(TASK_INTERRUPTIBLE);
453
454                 if (kthread_should_stop())
455                         return 0;
456
457                 schedule();
458                 continue;
459         }

If condition check is true, its task state is set to TASK_INTERRUPTIBLE
and call schedule() to wait for others to wake up it.

There are 2 issues in current code,
1, Task state is set to TASK_INTERRUPTIBLE after the condition checks, if
   another process changes the condition and call wake_up_process(dc-&gt;
   writeback_thread), then at line 452 task state is set back to
   TASK_INTERRUPTIBLE, the writeback kernel thread will lose a chance to be
   waken up.
2, At line 454 if kthread_should_stop() is true, writeback kernel thread
   will return to kernel/kthread.c:kthread() with TASK_INTERRUPTIBLE and
   call do_exit(). It is not good to enter do_exit() with task state
   TASK_INTERRUPTIBLE, in following code path might_sleep() is called and a
   warning message is reported by __might_sleep(): "WARNING: do not call
   blocking ops when !TASK_RUNNING; state=1 set at [xxxx]".

For the first issue, task state should be set before condition checks.
Ineed because dc-&gt;writeback_lock is required when modifying all the
conditions, calling set_current_state() inside code block where dc-&gt;
writeback_lock is hold is safe. But this is quite implicit, so I still move
set_current_state() before all the condition checks.

For the second issue, frankley speaking it does not hurt when kernel thread
exits with TASK_INTERRUPTIBLE state, but this warning message scares users,
makes them feel there might be something risky with bcache and hurt their
data.  Setting task state to TASK_RUNNING before returning fixes this
problem.

In alloc.c:allocator_wait(), there is also a similar issue, and is also
fixed in this patch.

Changelog:
v3: merge two similar fixes into one patch
v2: fix the race issue in v1 patch.
v1: initial buggy fix.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Cc: Michael Lyle &lt;mlyle@lyle.org&gt;
Cc: Junhui Tang &lt;tang.junhui@zte.com.cn&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: segregate flash only volume write streams</title>
<updated>2018-01-08T20:29:00+00:00</updated>
<author>
<name>Tang Junhui</name>
<email>tang.junhui@zte.com.cn</email>
</author>
<published>2018-01-08T20:21:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4eca1cb28d8b0574ca4f1f48e9331c5f852d43b9'/>
<id>urn:sha1:4eca1cb28d8b0574ca4f1f48e9331c5f852d43b9</id>
<content type='text'>
In such scenario that there are some flash only volumes
, and some cached devices, when many tasks request these devices in
writeback mode, the write IOs may fall to the same bucket as bellow:
| cached data | flash data | cached data | cached data| flash data|
then after writeback of these cached devices, the bucket would
be like bellow bucket:
| free | flash data | free | free | flash data |

So, there are many free space in this bucket, but since data of flash
only volumes still exists, so this bucket cannot be reclaimable,
which would cause waste of bucket space.

In this patch, we segregate flash only volume write streams from
cached devices, so data from flash only volumes and cached devices
can store in different buckets.

Compare to v1 patch, this patch do not add a additionally open bucket
list, and it is try best to segregate flash only volume write streams
from cached devices, sectors of flash only volumes may still be mixed
with dirty sectors of cached device, but the number is very small.

[mlyle: fixed commit log formatting, permissions, line endings]

Signed-off-by: Tang Junhui &lt;tang.junhui@zte.com.cn&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Signed-off-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: Fix building error on MIPS</title>
<updated>2017-11-24T23:22:58+00:00</updated>
<author>
<name>Huacai Chen</name>
<email>chenhc@lemote.com</email>
</author>
<published>2017-11-24T23:14:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cf33c1ee5254c6a430bc1538232b49c3ea13e613'/>
<id>urn:sha1:cf33c1ee5254c6a430bc1538232b49c3ea13e613</id>
<content type='text'>
This patch try to fix the building error on MIPS. The reason is MIPS
has already defined the PTR macro, which conflicts with the PTR macro
in include/uapi/linux/bcache.h.

[fixed by mlyle: corrected a line-length issue]

Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen &lt;chenhc@lemote.com&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Signed-off-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block</title>
<updated>2017-11-14T23:32:19+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-11-14T23:32:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e2c5923c349c1738fe8fda980874d93f6fb2e5b6'/>
<id>urn:sha1:e2c5923c349c1738fe8fda980874d93f6fb2e5b6</id>
<content type='text'>
Pull core block layer updates from Jens Axboe:
 "This is the main pull request for block storage for 4.15-rc1.

  Nothing out of the ordinary in here, and no API changes or anything
  like that. Just various new features for drivers, core changes, etc.
  In particular, this pull request contains:

   - A patch series from Bart, closing the whole on blk/scsi-mq queue
     quescing.

   - A series from Christoph, building towards hidden gendisks (for
     multipath) and ability to move bio chains around.

   - NVMe
        - Support for native multipath for NVMe (Christoph).
        - Userspace notifications for AENs (Keith).
        - Command side-effects support (Keith).
        - SGL support (Chaitanya Kulkarni)
        - FC fixes and improvements (James Smart)
        - Lots of fixes and tweaks (Various)

   - bcache
        - New maintainer (Michael Lyle)
        - Writeback control improvements (Michael)
        - Various fixes (Coly, Elena, Eric, Liang, et al)

   - lightnvm updates, mostly centered around the pblk interface
     (Javier, Hans, and Rakesh).

   - Removal of unused bio/bvec kmap atomic interfaces (me, Christoph)

   - Writeback series that fix the much discussed hundreds of millions
     of sync-all units. This goes all the way, as discussed previously
     (me).

   - Fix for missing wakeup on writeback timer adjustments (Yafang
     Shao).

   - Fix laptop mode on blk-mq (me).

   - {mq,name} tupple lookup for IO schedulers, allowing us to have
     alias names. This means you can use 'deadline' on both !mq and on
     mq (where it's called mq-deadline). (me).

   - blktrace race fix, oopsing on sg load (me).

   - blk-mq optimizations (me).

   - Obscure waitqueue race fix for kyber (Omar).

   - NBD fixes (Josef).

   - Disable writeback throttling by default on bfq, like we do on cfq
     (Luca Miccio).

   - Series from Ming that enable us to treat flush requests on blk-mq
     like any other request. This is a really nice cleanup.

   - Series from Ming that improves merging on blk-mq with schedulers,
     getting us closer to flipping the switch on scsi-mq again.

   - BFQ updates (Paolo).

   - blk-mq atomic flags memory ordering fixes (Peter Z).

   - Loop cgroup support (Shaohua).

   - Lots of minor fixes from lots of different folks, both for core and
     driver code"

* 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits)
  nvme: fix visibility of "uuid" ns attribute
  blk-mq: fixup some comment typos and lengths
  ide: ide-atapi: fix compile error with defining macro DEBUG
  blk-mq: improve tag waiting setup for non-shared tags
  brd: remove unused brd_mutex
  blk-mq: only run the hardware queue if IO is pending
  block: avoid null pointer dereference on null disk
  fs: guard_bio_eod() needs to consider partitions
  xtensa/simdisk: fix compile error
  nvme: expose subsys attribute to sysfs
  nvme: create 'slaves' and 'holders' entries for hidden controllers
  block: create 'slaves' and 'holders' entries for hidden gendisks
  nvme: also expose the namespace identification sysfs files for mpath nodes
  nvme: implement multipath access to nvme subsystems
  nvme: track shared namespaces
  nvme: introduce a nvme_ns_ids structure
  nvme: track subsystems
  block, nvme: Introduce blk_mq_req_flags_t
  block, scsi: Make SCSI quiesce and resume work reliably
  block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
  ...
</content>
</entry>
</feed>
