<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/block/bfq-iosched.c, branch v7.0.10</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0.10</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0.10'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-02-21T09:02:28+00:00</updated>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>urn:sha1:69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>block, bfq: convert to use request_queue-&gt;async_depth</title>
<updated>2026-02-03T14:45:36+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai@fnnas.com</email>
</author>
<published>2026-02-03T08:19:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2110858c5178176d0d535b7762b20cb9c0d03146'/>
<id>urn:sha1:2110858c5178176d0d535b7762b20cb9c0d03146</id>
<content type='text'>
The default limits is unchanged, and user can configure async_depth now.

Signed-off-by: Yu Kuai &lt;yukuai@fnnas.com&gt;
Reviewed-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq-sched: unify elevators checking for async requests</title>
<updated>2026-02-03T14:45:36+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai@fnnas.com</email>
</author>
<published>2026-02-03T08:19:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1db61b0afdd7e8aa9289c423fdff002603b520b5'/>
<id>urn:sha1:1db61b0afdd7e8aa9289c423fdff002603b520b5</id>
<content type='text'>
bfq and mq-deadline consider sync writes as async requests and only
reserve tags for sync reads by async_depth, however, kyber doesn't
consider sync writes as async requests for now.

Consider the case there are lots of dirty pages, and user use fsync to
flush dirty pages. In this case sched_tags can be exhausted by sync writes
and sync reads can stuck waiting for tag. Hence let kyber follow what
mq-deadline and bfq did, and unify async requests checking for all
elevators.

Signed-off-by: Yu Kuai &lt;yukuai@fnnas.com&gt;
Reviewed-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: introduce blk_queue_rot()</title>
<updated>2026-01-29T20:15:50+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>dlemoal@kernel.org</email>
</author>
<published>2026-01-29T07:27:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2719bd1ee1a1cd0535bc62e89b52822f2bbd14eb'/>
<id>urn:sha1:2719bd1ee1a1cd0535bc62e89b52822f2bbd14eb</id>
<content type='text'>
To check if a request queue is for a rotational device, a double
negation is needed with the pattern "!blk_queue_nonrot(q)". Simplify
this with the introduction of the helper blk_queue_rot() which tests
if a requests queue limit has the BLK_FEAT_ROTATIONAL feature set.
All call sites of blk_queue_nonrot() are modified to use blk_queue_rot()
and blk_queue_nonrot() definition removed.

No functional changes.

Signed-off-by: Damien Le Moal &lt;dlemoal@kernel.org&gt;
Reviewed-by: Nitesh Shetty &lt;nj.shetty@samsung.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: fix race between wbt_enable_default and IO submission</title>
<updated>2025-12-12T19:51:11+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2025-12-12T14:35:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9869d3a6fed381f3b98404e26e1afc75d680cbf9'/>
<id>urn:sha1:9869d3a6fed381f3b98404e26e1afc75d680cbf9</id>
<content type='text'>
When wbt_enable_default() is moved out of queue freezing in elevator_change(),
it can cause the wbt inflight counter to become negative (-1), leading to hung
tasks in the writeback path. Tasks get stuck in wbt_wait() because the counter
is in an inconsistent state.

The issue occurs because wbt_enable_default() could race with IO submission,
allowing the counter to be decremented before proper initialization. This manifests
as:

  rq_wait[0]:
    inflight:             -1
    has_waiters:        True

rwb_enabled() checks the state, which can be updated exactly between wbt_wait()
(rq_qos_throttle()) and wbt_track()(rq_qos_track()), then the inflight counter
will become negative.

And results in hung task warnings like:
  task:kworker/u24:39 state:D stack:0 pid:14767
  Call Trace:
    rq_qos_wait+0xb4/0x150
    wbt_wait+0xa9/0x100
    __rq_qos_throttle+0x24/0x40
    blk_mq_submit_bio+0x672/0x7b0
    ...

Fix this by:

1. Splitting wbt_enable_default() into:
   - __wbt_enable_default(): Returns true if wbt_init() should be called
   - wbt_enable_default(): Wrapper for existing callers (no init)
   - wbt_init_enable_default(): New function that checks and inits WBT

2. Using wbt_init_enable_default() in blk_register_queue() to ensure
   proper initialization during queue registration

3. Move wbt_init() out of wbt_enable_default() which is only for enabling
   disabled wbt from bfq and iocost, and wbt_init() isn't needed. Then the
   original lock warning can be avoided.

4. Removing the ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT flag and its handling
   code since it's no longer needed

This ensures WBT is properly initialized before any IO can be submitted,
preventing the counter from going negative.

Cc: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Cc: Yu Kuai &lt;yukuai@fnnas.com&gt;
Cc: Guangwu Zhang &lt;guazhang@redhat.com&gt;
Fixes: 78c271344b6f ("block: move wbt_enable_default() out of queue freezing from sched -&gt;exit()")
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: fix elevator depth_updated method</title>
<updated>2025-09-05T19:52:52+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2025-08-21T06:06:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7d337eef4affc5e26e0570513168c69ddbc40f92'/>
<id>urn:sha1:7d337eef4affc5e26e0570513168c69ddbc40f92</id>
<content type='text'>
Current depth_updated has some problems:

1) depth_updated() will be called for each hctx, while all elevators
will update async_depth for the disk level, this is not related to hctx;
2) In blk_mq_update_nr_requests(), if previous hctx update succeed and
this hctx update failed, q-&gt;nr_requests will not be updated, while
async_depth is already updated with new nr_reqeuests in previous
depth_updated();
3) All elevators are using q-&gt;nr_requests to calculate async_depth now,
however, q-&gt;nr_requests is still the old value when depth_updated() is
called from blk_mq_update_nr_requests();

Those problems are first from error path, then mq-deadline, and recently
for bfq and kyber, fix those problems by:

- pass in request_queue instead of hctx;
- move depth_updated() after q-&gt;nr_requests is updated in
  blk_mq_update_nr_requests();
- add depth_updated() call inside init_sched() method to initialize
  async_depth;
- remove init_hctx() method for mq-deadline and bfq that is useless now;

Fixes: 77f1e0a52d26 ("bfq: update internal depth state when queue depth changes")
Fixes: 39823b47bbd4 ("block/mq-deadline: Fix the tag reservation code")
Fixes: 42e6c6ce03fd ("lib/sbitmap: convert shallow_depth from one word to the whole sbitmap")
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Li Nan &lt;linan122@huawei.com&gt;
Reviewed-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20250821060612.1729939-2-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block, bfq: remove redundant __GFP_NOWARN</title>
<updated>2025-08-11T13:59:22+00:00</updated>
<author>
<name>Qianfeng Rong</name>
<email>rongqianfeng@vivo.com</email>
</author>
<published>2025-08-11T08:11:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8f3e4e87b0945aeea8b5a5aa43c419f4a1b4ca6a'/>
<id>urn:sha1:8f3e4e87b0945aeea8b5a5aa43c419f4a1b4ca6a</id>
<content type='text'>
Commit 16f5dfbc851b ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made
GFP_NOWAIT implicitly include __GFP_NOWARN.

Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g.,
`GFP_NOWAIT | __GFP_NOWARN`) is now redundant.  Let's clean up these
redundant flags across subsystems.

Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Qianfeng Rong &lt;rongqianfeng@vivo.com&gt;
Link: https://lore.kernel.org/r/20250811081135.374315-1-rongqianfeng@vivo.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>lib/sbitmap: convert shallow_depth from one word to the whole sbitmap</title>
<updated>2025-08-07T12:30:17+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2025-08-07T03:24:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=42e6c6ce03fd3e41e39a0f93f9b1a1d9fa664338'/>
<id>urn:sha1:42e6c6ce03fd3e41e39a0f93f9b1a1d9fa664338</id>
<content type='text'>
Currently elevators will record internal 'async_depth' to throttle
asynchronous requests, and they both calculate shallow_dpeth based on
sb-&gt;shift, with the respect that sb-&gt;shift is the available tags in one
word.

However, sb-&gt;shift is not the availbale tags in the last word, see
__map_depth:

if (index == sb-&gt;map_nr - 1)
  return sb-&gt;depth - (index &lt;&lt; sb-&gt;shift);

For consequence, if the last word is used, more tags can be get than
expected, for example, assume nr_requests=256 and there are four words,
in the worst case if user set nr_requests=32, then the first word is
the last word, and still use bits per word, which is 64, to calculate
async_depth is wrong.

One the ohter hand, due to cgroup qos, bfq can allow only one request
to be allocated, and set shallow_dpeth=1 will still allow the number
of words request to be allocated.

Fix this problems by using shallow_depth to the whole sbitmap instead
of per word, also change kyber, mq-deadline and bfq to follow this,
a new helper __map_depth_with_shallow() is introduced to calculate
available bits in each word.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20250807032413.1469456-2-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: move elevator queue allocation logic into blk_mq_init_sched</title>
<updated>2025-07-30T12:20:51+00:00</updated>
<author>
<name>Nilay Shroff</name>
<email>nilay@linux.ibm.com</email>
</author>
<published>2025-07-30T07:46:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=49811586be373e26a3ab52f54e0dfa663c02fddd'/>
<id>urn:sha1:49811586be373e26a3ab52f54e0dfa663c02fddd</id>
<content type='text'>
In preparation for allocating sched_tags before freezing the request
queue and acquiring -&gt;elevator_lock, move the elevator queue allocation
logic from the elevator ops -&gt;init_sched callback into blk_mq_init_sched.
As elevator_alloc is now only invoked from block layer core, we don't
need to export it, so unexport elevator_alloc function.

This refactoring provides a centralized location for elevator queue
initialization, which makes it easier to store pre-allocated sched_tags
in the struct elevator_queue during later changes.

Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20250730074614.2537382-2-nilay@linux.ibm.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-ioc: don't hold queue_lock for ioc_lookup_icq()</title>
<updated>2025-07-29T12:26:34+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2025-07-29T02:32:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5421681bc3ef13476bd9443379cd69381e8760fa'/>
<id>urn:sha1:5421681bc3ef13476bd9443379cd69381e8760fa</id>
<content type='text'>
Currently issue io can grab queue_lock three times from bfq_bio_merge(),
bfq_limit_depth() and bfq_prepare_request(), the queue_lock is not
necessary if icq is already created because both queue and ioc can't be
freed before io issuing is done, hence remove the unnecessary queue_lock
and use rcu to protect radix tree lookup.

Noted this is also a prep patch to support request batch dispatching[1].

[1] https://lore.kernel.org/all/20250722072431.610354-1-yukuai1@huaweicloud.com/

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Reviewed-by: Damien Le Moal &lt;dlemoal@kernel.org&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20250729023229.2944898-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
