<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/blk-cgroup.h, branch linux-5.8.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-5.8.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-5.8.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2020-04-30T21:54:45+00:00</updated>
<entry>
<title>blk-iocost: switch to fixed non-auto-decaying use_delay</title>
<updated>2020-04-30T21:54:45+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2020-04-13T16:27:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=54c52e10dc9b939084a7e6e3d32ce8fd8dee7898'/>
<id>urn:sha1:54c52e10dc9b939084a7e6e3d32ce8fd8dee7898</id>
<content type='text'>
The use_delay mechanism was introduced by blk-iolatency to hold memory
allocators accountable for the reclaim and other shared IOs they cause. The
duration of the delay is dynamically balanced between iolatency increasing the
value on each target miss and it auto-decaying as time passes and threads get
delayed on it.

While this works well for iolatency, iocost's control model isn't compatible
with it. There is no repeated "violation" events which can be balanced against
auto-decaying. iocost instead knows how much a given cgroup is over budget and
wants to prevent that cgroup from issuing IOs while over budget. Until now,
iocost has been adding the cost of force-issued IOs. However, this doesn't
reflect the amount which is already over budget and is simply not enough to
counter the auto-decaying allowing anon-memory leaking low priority cgroup to
go over its alloted share of IOs.

As auto-decaying doesn't make much sense for iocost, this patch introduces a
different mode of operation for use_delay - when blkcg_set_delay() are used
insted of blkcg_add/use_delay(), the delay duration is not auto-decayed until it
is explicitly cleared with blkcg_clear_delay(). iocost is updated to keep the
delay duration synchronized to the budget overage amount.

With this change, iocost can effectively police cgroups which generate
significant amount of force-issued IOs.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: replace BIO_QUEUE_ENTERED with BIO_CGROUP_ACCT</title>
<updated>2020-04-29T15:33:26+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-04-28T11:27:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0376e9efe18388bd486a65edbc16d34b84bddc8f'/>
<id>urn:sha1:0376e9efe18388bd486a65edbc16d34b84bddc8f</id>
<content type='text'>
BIO_QUEUE_ENTERED is only used for cgroup accounting now, so rename
the flag and move setting it into the cgroup code.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blkcg: don't offline parent blkcg first</title>
<updated>2020-04-01T20:56:44+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2019-07-24T17:37:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4308a434e5e08c78676aa66bc626ef78cbef0883'/>
<id>urn:sha1:4308a434e5e08c78676aa66bc626ef78cbef0883</id>
<content type='text'>
blkcg-&gt;cgwb_refcnt is used to delay blkcg offlining so that blkgs
don't get offlined while there are active cgwbs on them.  However, it
ends up making offlining unordered sometimes causing parents to be
offlined before children.

Let's fix this by making child blkcgs pin the parents' online states.

Note that pin/unpin names are chosen over get/put intentionally
because css uses get/put online for something different.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blkcg: rename blkcg-&gt;cgwb_refcnt to -&gt;online_pin and always use it</title>
<updated>2020-04-01T20:56:42+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2019-07-24T17:37:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d866dbf6178713e37d2fec2870af00b345684e1a'/>
<id>urn:sha1:d866dbf6178713e37d2fec2870af00b345684e1a</id>
<content type='text'>
blkcg-&gt;cgwb_refcnt is used to delay blkcg offlining so that blkgs
don't get offlined while there are active cgwbs on them.  However, it
ends up making offlining unordered sometimes causing parents to be
offlined before children.

To fix it, we want child blkcgs to pin the parents' online states
turning the refcnt into a more generic online pinning mechanism.

In prepartion,

* blkcg-&gt;cgwb_refcnt -&gt; blkcg-&gt;online_pin
* blkcg_cgwb_get/put() -&gt; blkcg_pin/unpin_online()
* Take them out of CONFIG_CGROUP_WRITEBACK

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-cgroup: remove blkcg_drain_queue</title>
<updated>2019-12-12T16:26:55+00:00</updated>
<author>
<name>Guoqing Jiang</name>
<email>guoqing.jiang@cloud.ionos.com</email>
</author>
<published>2019-12-12T15:52:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5addeae1bedc4c126b179f61e43e039bb373581f'/>
<id>urn:sha1:5addeae1bedc4c126b179f61e43e039bb373581f</id>
<content type='text'>
Since blk_drain_queue had already been removed, so this function
is not needed anymore.

Signed-off-by: Guoqing Jiang &lt;guoqing.jiang@cloud.ionos.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-cgroup: cgroup_rstat_updated() shouldn't be called on cgroup1</title>
<updated>2019-11-18T15:40:41+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2019-11-14T22:31:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=496074f94b19574c77240d3b3f84cfb1097de51d'/>
<id>urn:sha1:496074f94b19574c77240d3b3f84cfb1097de51d</id>
<content type='text'>
Currently, cgroup rstat is supported only on cgroup2 hierarchy and
rstat functions shouldn't be called on cgroup1 cgroups.  While
converting blk-cgroup core statistics to rstat, f73316482977
("blk-cgroup: reimplement basic IO stats using cgroup rstat")
accidentally ended up calling cgroup_rstat_updated() on cgroup1
cgroups causing crashes.

Longer term, we probably should add cgroup1 support to rstat but for
now let's mask the call directly.

Fixes: f73316482977 ("blk-cgroup: reimplement basic IO stats using cgroup rstat")
Tested-by: Faiz Abbas &lt;faiz_abbas@ti.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT</title>
<updated>2019-11-07T19:28:13+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2019-11-07T19:18:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1d156646e0d8ec390e5d5ac288137df02d4207be'/>
<id>urn:sha1:1d156646e0d8ec390e5d5ac288137df02d4207be</id>
<content type='text'>
blkg_rwstat is now only used by bfq-iosched and blk-throtl when on
cgroup1.  Let's move it into its own files and gate it behind a config
option.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-cgroup: reimplement basic IO stats using cgroup rstat</title>
<updated>2019-11-07T19:28:13+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2019-11-07T19:18:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f73316482977ac401ac37245c9df48079d4e11f3'/>
<id>urn:sha1:f73316482977ac401ac37245c9df48079d4e11f3</id>
<content type='text'>
blk-cgroup has been using blkg_rwstat to track basic IO stats.
Unfortunately, reading recursive stats scales badly as itinvolves
walking all descendants.  On systems with a huge number of cgroups
(dead or alive), this can lead to substantial CPU cost when reading IO
stats.

This patch reimplements basic IO stats using cgroup rstat which uses
more memory but makes recursive stat reading O(# descendants which
have been active since last reading) instead of O(# descendants).

* blk-cgroup core no longer uses sync/async stats.  Introduce new stat
  enums - BLKG_IOSTAT_{READ|WRITE|DISCARD}.

* Add blkg_iostat[_set] which encapsulates byte and io stats, last
  values for propagation delta calculation and u64_stats_sync for
  correctness on 32bit archs.

* Update the new percpu stat counters directly and implement
  blkcg_rstat_flush() to implement propagation.

* blkg_print_stat() can now bring the stats up to date by calling
  cgroup_rstat_flush() and print them instead of directly summing up
  all descendants.

* It now allocates 96 bytes per cpu.  It used to be 40 bytes.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Dan Schatzberg &lt;dschatzberg@fb.com&gt;
Cc: Daniel Xu &lt;dlxu@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-cgroup: remove now unused blkg_print_stat_{bytes|ios}_recursive()</title>
<updated>2019-11-07T19:28:13+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2019-11-07T19:18:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8a80d5d6638b7d58480a83aef49d587de63d4cbb'/>
<id>urn:sha1:8a80d5d6638b7d58480a83aef49d587de63d4cbb</id>
<content type='text'>
These don't have users anymore.  Remove them.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blkcg: separate blkcg_conf_get_disk() out of blkg_conf_prep()</title>
<updated>2019-08-29T03:17:04+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2019-08-28T22:05:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=015d254cb02b6d8eec4b3366274bf4672f9e0b64'/>
<id>urn:sha1:015d254cb02b6d8eec4b3366274bf4672f9e0b64</id>
<content type='text'>
Separate out blkcg_conf_get_disk() so that it can be used by blkcg
policy interface file input parsers before the policy is actually
enabled.  This doesn't introduce any functional changes.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
