From a33801e8b4735b8d473f963e5854172f9cde3e8b Mon Sep 17 00:00:00 2001 From: Luca Miccio Date: Mon, 13 Nov 2017 07:34:10 +0100 Subject: block, bfq: move debug blkio stats behind CONFIG_DEBUG_BLK_CGROUP BFQ currently creates, and updates, its own instance of the whole set of blkio statistics that cfq creates. Yet, from the comments of Tejun Heo in [1], it turned out that most of these statistics are meant/useful only for debugging. This commit makes BFQ create the latter, debugging statistics only if the option CONFIG_DEBUG_BLK_CGROUP is set. By doing so, this commit also enables BFQ to enjoy a high perfomance boost. The reason is that, if CONFIG_DEBUG_BLK_CGROUP is not set, then BFQ has to update far fewer statistics, and, in particular, not the heaviest to update. To give an idea of the benefits, if CONFIG_DEBUG_BLK_CGROUP is not set, then, on an Intel i7-4850HQ, and with 8 threads doing random I/O in parallel on null_blk (configured with 0 latency), the throughput of BFQ grows from 310 to 400 KIOPS (+30%). We have measured similar or even much higher boosts with other CPUs: e.g., +45% with an ARM CortexTM-A53 Octa-core. Our results have been obtained and can be reproduced very easily with the script in [1]. [1] https://www.spinics.net/lists/linux-block/msg18943.html Suggested-by: Tejun Heo Suggested-by: Ulf Hansson Tested-by: Lee Tibbert Tested-by: Oleksandr Natalenko Signed-off-by: Luca Miccio Signed-off-by: Paolo Valente Signed-off-by: Jens Axboe --- Documentation/block/bfq-iosched.txt | 38 +++++++++++++++++++++++++++++++------ 1 file changed, 32 insertions(+), 6 deletions(-) (limited to 'Documentation') diff --git a/Documentation/block/bfq-iosched.txt b/Documentation/block/bfq-iosched.txt index 7fad6c061470..8d8d8f06cab2 100644 --- a/Documentation/block/bfq-iosched.txt +++ b/Documentation/block/bfq-iosched.txt @@ -20,12 +20,22 @@ for that device, by setting low_latency to 0. See Section 3 for details on how to configure BFQ for the desired tradeoff between latency and throughput, or on how to maximize throughput. -BFQ has a non-null overhead, which limits the maximum IOPS that the -CPU can process for a device scheduled with BFQ. To give an idea of -the limits on slow or average CPUs, here are BFQ limits for three -different CPUs, on, respectively, an average laptop, an old desktop, -and a cheap embedded system, in case full hierarchical support is -enabled (i.e., CONFIG_BFQ_GROUP_IOSCHED is set): +BFQ has a non-null overhead, which limits the maximum IOPS that a CPU +can process for a device scheduled with BFQ. To give an idea of the +limits on slow or average CPUs, here are, first, the limits of BFQ for +three different CPUs, on, respectively, an average laptop, an old +desktop, and a cheap embedded system, in case full hierarchical +support is enabled (i.e., CONFIG_BFQ_GROUP_IOSCHED is set), but +CONFIG_DEBUG_BLK_CGROUP is not set (Section 4-2): +- Intel i7-4850HQ: 400 KIOPS +- AMD A8-3850: 250 KIOPS +- ARM CortexTM-A53 Octa-core: 80 KIOPS + +If CONFIG_DEBUG_BLK_CGROUP is set (and of course full hierarchical +support is enabled), then the sustainable throughput with BFQ +decreases, because all blkio.bfq* statistics are created and updated +(Section 4-2). For BFQ, this leads to the following maximum +sustainable throughputs, on the same systems as above: - Intel i7-4850HQ: 310 KIOPS - AMD A8-3850: 200 KIOPS - ARM CortexTM-A53 Octa-core: 56 KIOPS @@ -505,6 +515,22 @@ BFQ-specific files is "blkio.bfq." or "io.bfq." For example, the group parameter to set the weight of a group with BFQ is blkio.bfq.weight or io.bfq.weight. +As for cgroups-v1 (blkio controller), the exact set of stat files +created, and kept up-to-date by bfq, depends on whether +CONFIG_DEBUG_BLK_CGROUP is set. If it is set, then bfq creates all +the stat files documented in +Documentation/cgroup-v1/blkio-controller.txt. If, instead, +CONFIG_DEBUG_BLK_CGROUP is not set, then bfq creates only the files +blkio.bfq.io_service_bytes +blkio.bfq.io_service_bytes_recursive +blkio.bfq.io_serviced +blkio.bfq.io_serviced_recursive + +The value of CONFIG_DEBUG_BLK_CGROUP greatly influences the maximum +throughput sustainable with bfq, because updating the blkio.bfq.* +stats is rather costly, especially for some of the stats enabled by +CONFIG_DEBUG_BLK_CGROUP. + Parameters to set ----------------- -- cgit v1.2.3