summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-08-16tracing/blktrace: Fix to allow setting same valueSteven Rostedt (VMware)1-0/+4
Masami Hiramatsu reported: Current trace-enable attribute in sysfs returns an error if user writes the same setting value as current one, e.g. # cat /sys/block/sda/trace/enable 0 # echo 0 > /sys/block/sda/trace/enable bash: echo: write error: Invalid argument # echo 1 > /sys/block/sda/trace/enable # echo 1 > /sys/block/sda/trace/enable bash: echo: write error: Device or resource busy But this is not a preferred behavior, it should ignore if new setting is same as current one. This fixes the problem as below. # cat /sys/block/sda/trace/enable 0 # echo 0 > /sys/block/sda/trace/enable # echo 1 > /sys/block/sda/trace/enable # echo 1 > /sys/block/sda/trace/enable Link: http://lkml.kernel.org/r/20180816103802.08678002@gandalf.local.home Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Cc: stable@vger.kernel.org Fixes: cd649b8bb830d ("blktrace: remove sysfs_blk_trace_enable_show/store()") Reported-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-16pktcdvd: fix setting of 'ret' error return for a few casesJens Axboe1-0/+1
We initialize it to -ENOMEM, but then later overwrite it. After overwriting, we don't set it again for two later failure cases. Reported-by: Jason Wood <jasonwood2031@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-16block: change return type to boolChengguang Xu1-1/+1
Because blk_do_io_stat() only does a judgement about the request contributes to IO statistics, it better changes return type to bool. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-16block, bfq: return nbytes and not zero from struct cftype .write() methodMaciej S. Szmigiero1-1/+2
The value that struct cftype .write() method returns is then directly returned to userspace as the value returned by write() syscall, so it should be the number of bytes actually written (or consumed) and not zero. Returning zero from write() syscall makes programs like /bin/echo or bash spin. Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support") Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-16block, bfq: improve code of bfq_bfqq_charge_timePaolo Valente1-9/+5
bfq_bfqq_charge_time contains some lengthy and redundant code. This commit trims and condenses that code. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-16block, bfq: reduce write overchargePaolo Valente1-14/+19
When a sync request is dispatched, the queue that contains that request, and all the ancestor entities of that queue, are charged with the number of sectors of the request. In constrast, if the request is async, then the queue and its ancestor entities are charged with the number of sectors of the request, multiplied by an overcharge factor. This throttles the bandwidth for async I/O, w.r.t. to sync I/O, and it is done to counter the tendency of async writes to steal I/O throughput to reads. On the opposite end, the lower this parameter, the stabler I/O control, in the following respect. The lower this parameter is, the less the bandwidth enjoyed by a group decreases - when the group does writes, w.r.t. to when it does reads; - when other groups do reads, w.r.t. to when they do writes. The fixes "block, bfq: always update the budget of an entity when needed" and "block, bfq: readd missing reset of parent-entity service" improved I/O control in bfq to such an extent that it has been possible to revise this overcharge factor downwards. This commit introduces the resulting, new value. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-16block, bfq: always update the budget of an entity when neededPaolo Valente1-2/+6
When the next child entity to serve changes for a given parent entity, the budget of that parent entity must be updated accordingly. Unfortunately, this update is not performed, by mistake, for the entities that happen to switch from having no child entity to serve, to having one child entity to serve. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-16block, bfq: readd missing reset of parent-entity servicePaolo Valente1-0/+21
The received-service counter needs to be equal to 0 when an entity is set in service. Unfortunately, commit "block, bfq: fix service being wrongly set to zero in case of preemption" mistakenly removed the resetting of this counter for the parent entities of the bfq_queue being set in service. This commit fixes this issue by resetting service for parent entities, directly on the expiration of the in-service bfq_queue. Fixes: 9fae8dd59ff3 ("block, bfq: fix service being wrongly set to zero in case of preemption") Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-14blk-wbt: fix IO hang in wbt_wait()Ming Lei1-5/+1
On wbt invariant is that if one IO is tracked via WBT_TRACKED, rqw->inflight should be updated for tracking this IO. But commit c1c80384c8f ("block: remove external dependency on wbt_flags") forgets to remove the early handling of !rwb_enabled(rwb) inside wbt_wait(), then the inflight counter may not be increased in wbt_wait(), but decreased in wbt_done() for this kind of IO, so this counter may become negative, then wbt_wait() may wait forever. This patch fixes the report in the following link: https://marc.info/?l=linux-block&m=153221542021033&w=2 Fixes: c1c80384c8f ("block: remove external dependency on wbt_flags") Cc: Josef Bacik <jbacik@fb.com> Reported-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-14block: don't warn for flush on read-only deviceJens Axboe1-1/+3
Don't warn for a flush issued to a read-only device. It's not strictly a writable command, as it doesn't change any on-media data by itself. Reported-by: Stefan Agner <stefan@agner.ch> Fixes: 721c7fc701c7 ("block: fail op_is_write() requests to read-only partitions") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: add the missing comments for smp_mb()/smp_wmb()Coly Li2-2/+4
Checkpatch.pl warns there are 2 locations of smp_mb() and smp_wmb() without code comment. This patch adds the missing code comments for these memory barrier calls. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: remove unnecessary space before ioctl function pointer argumentsColy Li1-2/+2
This is warned by checkpatch.pl, this patch removes the extra space. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: add missing SPDX headerColy Li3-0/+3
The SPDX header is missing fro closure.c, super.c and util.c, this patch adds SPDX header for GPL-2.0 into these files. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: move open brace at end of function definitions to next lineColy Li1-3/+6
This is not a preferred style to place open brace '{' at the end of function definition, checkpatch.pl reports error for such coding style. This patch moves them into the start of the next new line. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: add static const prefix to char * array declarationsColy Li1-1/+1
This patch declares char * array with const prefix in sysfs.c, which is suggested by checkpatch.pl. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: fix code comments styleColy Li3-13/+21
This patch fixes 3 style issues warned by checkpatch.pl, - Comment lines are not aligned - Comments use "/*" on subsequent lines - Comment lines use a trailing "*/" Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: do not check NULL pointer before calling kmem_cache_destroyColy Li1-2/+1
kmem_cache_destroy() is safe for NULL pointer as input, the NULL pointer checking is unncessary. This patch just removes the NULL pointer checking to make code simpler. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: prefer 'help' in KconfigColy Li1-3/+3
Current bcache Kconfig uses '---help---' as header of help information, for now 'help' is prefered. This patch fixes this style by replacing '---help---' by 'help' in bcache Kconfig file. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: fix typo 'succesfully' to 'successfully'Coly Li2-2/+2
This patch fixes typo 'succesfully' to correct 'successfully', which is suggested by checkpatch.pl. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: replace '%pF' by '%pS' in seq_printf()Coly Li1-2/+2
'%pF' and '%pf' are deprecated vsprintf pointer extensions, this patch replace them by '%pS', which is suggested by checkpatch.pl. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: fix indent by replacing blank by tabsColy Li1-2/+2
bch_btree_insert_check_key() has unaligned indent, or indent by blank characters. This patch makes the indent aligned and replace blank by tabs. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: replace printk() by pr_*() routinesColy Li4-15/+15
There are still many places in bcache use printk to display kernel message, which are suggested to be preplaced by pr_*() routines like pr_err(), pr_info(), or pr_notice(). This patch replaces all printk() with a proper pr_*() routine for bcache code. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: replace Symbolic permissions by octal permission numbersColy Li2-5/+5
Symbolic permission names are used in bcache, for now octal permission numbers are encouraged to use for readability. This patch replaces all symbolic permissions by octal permission numbers. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: style fixes for lines over 80 charactersColy Li13-28/+59
This patch fixes the lines over 80 characters into more lines, to minimize warnings by checkpatch.pl. There are still some lines exceed 80 characters, but it is better to be a single line and I don't change them. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: add identifier names to arguments of function definitionsColy Li14-185/+215
There are many function definitions do not have identifier argument names, scripts/checkpatch.pl complains warnings like this, WARNING: function definition argument 'struct bcache_device *' should also have an identifier name #16735: FILE: writeback.h:120: +void bch_sectors_dirty_init(struct bcache_device *); This patch adds identifier argument names to all bcache function definitions to fix such warnings. Signed-off-by: Coly Li <colyli@suse.de> Reviewed: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: style fix to add a blank line after declarationsColy Li17-7/+57
Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12bcache: style fix to replace 'unsigned' by 'unsigned int'Coly Li23-296/+309
This patch fixes warning reported by checkpatch.pl by replacing 'unsigned' with 'unsigned int'. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-12blkcg: Make blkg_root_lookup() work for queues in bypass modeBart Van Assche2-11/+6
For legacy queues the only call of blkg_root_lookup() happens after bypass mode has been enabled. Since blkg_lookup() returns NULL for queues in bypass mode, modify the blkg_root_lookup() such that it no longer depends on bypass mode. Rename the function into blk_queue_root_blkg() as suggested by Tejun. Suggested-by: Tejun Heo <tj@kernel.org> Fixes: 6bad9b210a22 ("blkcg: Introduce blkg_root_lookup()") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-10bcache: fix error setting writeback_rate through sysfs interfaceColy Li1-3/+10
Commit ea8c5356d390 ("bcache: set max writeback rate when I/O request is idle") changes struct bch_ratelimit member rate from uint32_t to atomic_long_t and uses atomic_long_set() in drivers/md/bcache/sysfs.c to set new writeback rate, after the input is converted from memory buf to long int by sysfs_strtoul_clamp(). The above change has a problem because there is an implicit return inside sysfs_strtoul_clamp() so the following atomic_long_set() won't be called. This error is detected by 0day system with following snipped smatch warnings: drivers/md/bcache/sysfs.c:271 __cached_dev_store() error: uninitialized symbol 'v'. 270 sysfs_strtoul_clamp(writeback_rate, v, 1, INT_MAX); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @271 atomic_long_set(&dc->writeback_rate.rate, v); This patch fixes the above error by using strtoul_safe_clamp() to convert the input buffer into a long int type result. Fixes: ea8c5356d390 ("bcache: set max writeback rate when I/O request is idle") Cc: Kai Krakow <kai@kaishome.de> Cc: Stefan Priebe <s.priebe@profihost.ag> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09null_blk: add lock drop/acquire annotationJens Axboe1-1/+3
sparse complains: drivers/block/null_blk_main.c:816:24: sparse: context imbalance in 'null_insert_page' - unexpected unlock Fix it by adding the necessary annotations to the function. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09Blk-throttle: reduce tail io latency when iops limit is enforcedLiu Bo1-6/+1
When an application's iops has exceeded its cgroup's iops limit, surely it is throttled and kernel will set a timer for dispatching, thus IO latency includes the delay. However, the dispatch delay which is calculated by the limit and the elapsed jiffies is suboptimal. As the dispatch delay is only calculated once the application's iops is (iops limit + 1), it doesn't need to wait any longer than the remaining time of the current slice. The difference can be proved by the following fio job and cgroup iops setting, ----- $ echo 4 > /mnt/config/nullb/disk1/mbps # limit nullb's bandwidth to 4MB/s for testing. $ echo "253:1 riops=100 rbps=max" > /sys/fs/cgroup/unified/cg1/io.max $ cat r2.job [global] name=fio-rand-read filename=/dev/nullb1 rw=randread bs=4k direct=1 numjobs=1 time_based=1 runtime=60 group_reporting=1 [file1] size=4G ioengine=libaio iodepth=1 rate_iops=50000 norandommap=1 thinktime=4ms ----- wo patch: file1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1 fio-3.7-66-gedfc Starting 1 process read: IOPS=99, BW=400KiB/s (410kB/s)(23.4MiB/60001msec) slat (usec): min=10, max=336, avg=27.71, stdev=17.82 clat (usec): min=2, max=28887, avg=5929.81, stdev=7374.29 lat (usec): min=24, max=28901, avg=5958.73, stdev=7366.22 clat percentiles (usec): | 1.00th=[ 4], 5.00th=[ 4], 10.00th=[ 4], 20.00th=[ 4], | 30.00th=[ 4], 40.00th=[ 4], 50.00th=[ 6], 60.00th=[11731], | 70.00th=[11863], 80.00th=[11994], 90.00th=[12911], 95.00th=[22676], | 99.00th=[23725], 99.50th=[23987], 99.90th=[23987], 99.95th=[25035], | 99.99th=[28967] w/ patch: file1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1 fio-3.7-66-gedfc Starting 1 process read: IOPS=100, BW=400KiB/s (410kB/s)(23.4MiB/60005msec) slat (usec): min=10, max=155, avg=23.24, stdev=16.79 clat (usec): min=2, max=12393, avg=5961.58, stdev=5959.25 lat (usec): min=23, max=12412, avg=5985.91, stdev=5951.92 clat percentiles (usec): | 1.00th=[ 3], 5.00th=[ 3], 10.00th=[ 4], 20.00th=[ 4], | 30.00th=[ 4], 40.00th=[ 5], 50.00th=[ 47], 60.00th=[11863], | 70.00th=[11994], 80.00th=[11994], 90.00th=[11994], 95.00th=[11994], | 99.00th=[11994], 99.50th=[11994], 99.90th=[12125], 99.95th=[12125], | 99.99th=[12387] Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09block: paride: pd: mark expected switch fall-throughsGustavo A. R. Silva1-0/+2
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 1056543 ("Missing break in switch") Addresses-Coverity-ID: 1056544 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09block: Ensure that a request queue is dissociated from the cgroup controllerBart Van Assche1-0/+15
Several block drivers call alloc_disk() followed by put_disk() if something fails before device_add_disk() is called without calling blk_cleanup_queue(). Make sure that also for this scenario a request queue is dissociated from the cgroup controller. This patch avoids that loading the parport_pc, paride and pf drivers triggers the following kernel crash: BUG: KASAN: null-ptr-deref in pi_init+0x42e/0x580 [paride] Read of size 4 at addr 0000000000000008 by task modprobe/744 Call Trace: dump_stack+0x9a/0xeb kasan_report+0x139/0x350 pi_init+0x42e/0x580 [paride] pf_init+0x2bb/0x1000 [pf] do_one_initcall+0x8e/0x405 do_init_module+0xd9/0x2f2 load_module+0x3ab4/0x4700 SYSC_finit_module+0x176/0x1a0 do_syscall_64+0xee/0x2b0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 Reported-by: Alexandru Moise <00moses.alexander00@gmail.com> Fixes: a063057d7c73 ("block: Fix a race between request queue removal and the block cgroup controller") # v4.17 Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Tested-by: Alexandru Moise <00moses.alexander00@gmail.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Alexandru Moise <00moses.alexander00@gmail.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: <stable@vger.kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09block: Introduce blk_exit_queue()Bart Van Assche2-24/+31
This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Alexandru Moise <00moses.alexander00@gmail.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: <stable@vger.kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09blkcg: Introduce blkg_root_lookup()Bart Van Assche1-0/+18
This new function will be used in a later patch to verify whether a queue has been dissociated from the cgroup controller before being released. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Alexandru Moise <00moses.alexander00@gmail.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: <stable@vger.kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09block: Remove two superfluous #include directivesBart Van Assche1-2/+0
Commit 12f5b9314545 ("blk-mq: Remove generation seqeunce") removed the only seqcount_t and u64_stats_sync instances from <linux/blkdev.h> but did not remove the corresponding #include directives. Since these include directives are no longer needed, remove them. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Keith Busch <keith.busch@intel.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Jianchao Wang <jianchao.w.wang@oracle.com> Cc: Hannes Reinecke <hare@suse.com>, Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09blk-mq: count the hctx as active before allocating tagJianchao Wang2-2/+9
Currently, we count the hctx as active after allocate driver tag successfully. If a previously inactive hctx try to get tag first time, it may fails and need to wait. However, due to the stale tag ->active_queues, the other shared-tags users are still able to occupy all driver tags while there is someone waiting for tag. Consequently, even if the previously inactive hctx is waked up, it still may not be able to get a tag and could be starved. To fix it, we count the hctx as active before try to allocate driver tag, then when it is waiting the tag, the other shared-tag users will reserve budget for it. Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09block: bvec_nr_vecs() returns value for wrong slabGreg Edwards1-1/+1
In commit ed996a52c868 ("block: simplify and cleanup bvec pool handling"), the value of the slab index is incremented by one in bvec_alloc() after the allocation is done to indicate an index value of 0 does not need to be later freed. bvec_nr_vecs() was not updated accordingly, and thus returns the wrong value. Decrement idx before performing the lookup. Fixes: ed996a52c868 ("block: simplify and cleanup bvec pool handling") Signed-off-by: Greg Edwards <gedwards@ddn.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09Merge branch 'nvme-4.19' of git://git.infradead.org/nvme into for-4.19/blockJens Axboe9-10/+138
Pull NVMe updates from Christoph: "This should be the last round of NVMe updates before the 4.19 merge window opens. It conatins support for write protected (aka read-only) namespaces from Chaitanya, two ANA fixes from Hannes and a fabrics fix from Tal Shorer." * 'nvme-4.19' of git://git.infradead.org/nvme: nvme-fabrics: fix ctrl_loss_tmo < 0 to reconnect forever nvmet: add ns write protect support nvme: set gendisk read only based on nsattr nvme.h: add support for ns write protect definitions nvme.h: fixup ANA group descriptor format nvme: fixup crash on failed discovery
2018-08-09bcache: trivial - remove tailing backslash in macro BTREE_FLAGShenghui Wang1-1/+1
Remove the tailing backslash in macro BTREE_FLAG in btree.h Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09bcache: make the pr_err statement used for ENOENT only in sysfs_attatch sectionShenghui Wang1-2/+2
The pr_err statement in the code for sysfs_attatch section would run for various error codes, which maybe confusing. E.g, Run the command twice: echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach [the backing dev got attached on the first run] echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach In dmesg, after the command run twice, we can get: bcache: bch_cached_dev_attach() Can't attach sda6: already attached bcache: __cached_dev_store() Can't attach 796b5c05-b03c-4bc7-9cbd-\ a8df5e8be891 : cache set not found The first statement in the message was right, but the second was confusing. bch_cached_dev_attach has various pr_ statements for various error codes, except ENOENT. After the change, rerun above command twice: echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach In dmesg we only got: bcache: bch_cached_dev_attach() Can't attach sda6: already attached No confusing "cache set not found" message anymore. And for some not exist SET-UUID: echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be898 > \ /sys/block/bcache0/bcache/attach In dmesg we can get: bcache: __cached_dev_store() Can't attach 796b5c05-b03c-4bc7-9cbd-\ a8df5e8be898 : cache set not found Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09bcache: set max writeback rate when I/O request is idleColy Li7-45/+138
Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle") allows the writeback rate to be faster if there is no I/O request on a bcache device. It works well if there is only one bcache device attached to the cache set. If there are many bcache devices attached to a cache set, it may introduce performance regression because multiple faster writeback threads of the idle bcache devices will compete the btree level locks with the bcache device who have I/O requests coming. This patch fixes the above issue by only permitting fast writebac when all bcache devices attached on the cache set are idle. And if one of the bcache devices has new I/O request coming, minimized all writeback throughput immediately and let PI controller __update_writeback_rate() to decide the upcoming writeback rate for each bcache device. Also when all bcache devices are idle, limited wrieback rate to a small number is wast of thoughput, especially when backing devices are slower non-rotation devices (e.g. SATA SSD). This patch sets a max writeback rate for each backing device if the whole cache set is idle. A faster writeback rate in idle time means new I/Os may have more available space for dirty data, and people may observe a better write performance then. Please note bcache may change its cache mode in run time, and this patch still works if the cache mode is switched from writeback mode and there is still dirty data on cache. Fixes: Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle") Cc: stable@vger.kernel.org #4.16+ Signed-off-by: Coly Li <colyli@suse.de> Tested-by: Kai Krakow <kai@kaishome.de> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Cc: Michael Lyle <mlyle@lyle.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09bcache: add code comments for bset.cColy Li1-0/+63
This patch tries to add code comments in bset.c, to make some tricky code and designment to be more comprehensible. Most information of this patch comes from the discussion between Kent and I, he offers very informative details. If there is any mistake of the idea behind the code, no doubt that's from me misrepresentation. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09bcache: fix mistaken comments in request.cColy Li1-1/+1
This patch updates code comment in bch_keylist_realloc() by fixing incorrected function names, to make the code to be more comprehennsible. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09bcache: fix mistaken code comments in bcache.hColy Li1-3/+3
This patch updates the code comment in struct cache with correct array names, to make the code to be more comprehensible. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09bcache: add a comment in super.cColy Li1-0/+1
This patch adds a line of code comment in super.c:register_bdev(), to make code to be more comprehensible. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09bcache: avoid unncessary cache prefetch bch_btree_node_get()Coly Li1-7/+7
In bch_btree_node_get() the read-in btree node will be partially prefetched into L1 cache for following bset iteration (if there is). But if the btree node read is failed, the perfetch operations will waste L1 cache space. This patch checkes whether read operation and only does cache prefetch when read I/O succeeded. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09bcache: display rate debug parameters to 0 when writeback is not runningColy Li1-10/+16
When writeback is not running, writeback rate should be 0, other value is misleading. And the following dyanmic writeback rate debug parameters should be 0 too, rate, proportional, integral, change otherwise they are misleading when writeback is not running. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09bcache: do not check return value of debugfs_create_dir()Coly Li5-13/+21
Greg KH suggests that normal code should not care about debugfs. Therefore no matter successful or failed of debugfs_create_dir() execution, it is unncessary to check its return value. There are two functions called debugfs_create_dir() and check the return value, which are bch_debug_init() and closure_debug_init(). This patch changes these two functions from int to void type, and ignore return values of debugfs_create_dir(). This patch does not fix exact bug, just makes things work as they should. Signed-off-by: Coly Li <colyli@suse.de> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable@vger.kernel.org Cc: Kai Krakow <kai@kaishome.de> Cc: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-08drivers/block/drbd: remove the null check for kmem_cache_destroyzhong jiang1-8/+4
kmem_cache_destroy has taken null pointer into account. So it is safe to drop the null check before calling the function. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>