<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/block/elevator.c, branch v5.15.208</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.208</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.208'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-11-17T14:06:24+00:00</updated>
<entry>
<title>block: Fix elevator_get_default() checking for NULL q-&gt;tag_set</title>
<updated>2024-11-17T14:06:24+00:00</updated>
<author>
<name>SurajSonawane2415</name>
<email>surajsonawane0215@gmail.com</email>
</author>
<published>2024-10-07T11:14:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f8b7e56b7555328ecd5edec99f4868cef884151a'/>
<id>urn:sha1:f8b7e56b7555328ecd5edec99f4868cef884151a</id>
<content type='text'>
[ Upstream commit b402328a24ee7193a8ab84277c0c90ae16768126 ]

elevator_get_default() and elv_support_iosched() both check for whether
or not q-&gt;tag_set is non-NULL, however it's not possible for them to be
NULL. This messes up some static checkers, as the checking of tag_set
isn't consistent.

Remove the checks, which both simplifies the logic and avoids checker
errors.

Signed-off-by: SurajSonawane2415 &lt;surajsonawane0215@gmail.com&gt;
Link: https://lore.kernel.org/r/20241007111416.13814-1-surajsonawane0215@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block/wbt: fix negative inflight counter when remove scsi device</title>
<updated>2022-02-23T11:03:15+00:00</updated>
<author>
<name>Laibin Qiu</name>
<email>qiulaibin@huawei.com</email>
</author>
<published>2022-01-22T11:10:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4b9c861a589e3486fb3806c96953d27853401d04'/>
<id>urn:sha1:4b9c861a589e3486fb3806c96953d27853401d04</id>
<content type='text'>
commit e92bc4cd34de2ce454bdea8cd198b8067ee4e123 upstream.

Now that we disable wbt by set WBT_STATE_OFF_DEFAULT in
wbt_disable_default() when switch elevator to bfq. And when
we remove scsi device, wbt will be enabled by wbt_enable_default.
If it become false positive between wbt_wait() and wbt_track()
when submit write request.

The following is the scenario that triggered the problem.

T1                          T2                           T3
                            elevator_switch_mq
                            bfq_init_queue
                            wbt_disable_default &lt;= Set
                            rwb-&gt;enable_state (OFF)
Submit_bio
blk_mq_make_request
rq_qos_throttle
&lt;= rwb-&gt;enable_state (OFF)
                                                         scsi_remove_device
                                                         sd_remove
                                                         del_gendisk
                                                         blk_unregister_queue
                                                         elv_unregister_queue
                                                         wbt_enable_default
                                                         &lt;= Set rwb-&gt;enable_state (ON)
q_qos_track
&lt;= rwb-&gt;enable_state (ON)
^^^^^^ this request will mark WBT_TRACKED without inflight add and will
lead to drop rqw-&gt;inflight to -1 in wbt_done() which will trigger IO hung.

Fix this by move wbt_enable_default() from elv_unregister to
bfq_exit_queue(). Only re-enable wbt when bfq exit.

Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")

Remove oneline stale comment, and kill one oneshot local variable.

Signed-off-by: Ming Lei &lt;ming.lei@rehdat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/linux-block/20211214133103.551813-1-qiulaibin@huawei.com/
Signed-off-by: Laibin Qiu &lt;qiulaibin@huawei.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block: avoid to quiesce queue in elevator_init_mq</title>
<updated>2021-12-01T08:04:56+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2021-11-17T11:55:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=db8ed1e61b4922b841b47c989a0000e06acac0fe'/>
<id>urn:sha1:db8ed1e61b4922b841b47c989a0000e06acac0fe</id>
<content type='text'>
commit 245a489e81e13dd55ae46d27becf6d5901eb7828 upstream.

elevator_init_mq() is only called before adding disk, when there isn't
any FS I/O, only passthrough requests can be queued, so freezing queue
plus canceling dispatch work is enough to drain any dispatch activities,
then we can avoid synchronize_srcu() in blk_mq_quiesce_queue().

Long boot latency issue can be fixed in case of lots of disks added
during booting.

Fixes: 737eb78e82d5 ("block: Delay default elevator initialization")
Reported-by: yangerkun &lt;yangerkun@huawei.com&gt;
Cc: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Link: https://lore.kernel.org/r/20211117115502.1600950-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block: return ELEVATOR_DISCARD_MERGE if possible</title>
<updated>2021-08-09T20:37:47+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2021-07-29T03:42:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=866663b7b52d2da267b28e12eed89ee781b8fed1'/>
<id>urn:sha1:866663b7b52d2da267b28e12eed89ee781b8fed1</id>
<content type='text'>
When merging one bio to request, if they are discard IO and the queue
supports multi-range discard, we need to return ELEVATOR_DISCARD_MERGE
because both block core and related drivers(nvme, virtio-blk) doesn't
handle mixed discard io merge(traditional IO merge together with
discard merge) well.

Fix the issue by returning ELEVATOR_DISCARD_MERGE in this situation,
so both blk-mq and drivers just need to handle multi-range discard.

Reported-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Tested-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Fixes: 2705dfb20947 ("block: fix discard request merge")
Link: https://lore.kernel.org/r/20210729034226.1591070-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: remove support for delayed queue registrations</title>
<updated>2021-08-09T17:50:43+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2021-08-04T09:41:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d1254a8749711e0d7441036a74ce592341f89697'/>
<id>urn:sha1:d1254a8749711e0d7441036a74ce592341f89697</id>
<content type='text'>
Now that device mapper has been changed to register the disk once
it is fully ready all this code is unused.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Link: https://lore.kernel.org/r/20210804094147.459763-9-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: Introduce the BLK_MQ_F_NO_SCHED_BY_DEFAULT flag</title>
<updated>2021-08-05T17:49:21+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2021-08-05T17:41:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=90b7198001f23ea37d3b46dc631bdaa2357a20b1'/>
<id>urn:sha1:90b7198001f23ea37d3b46dc631bdaa2357a20b1</id>
<content type='text'>
elevator_get_default() uses the following algorithm to select an I/O
scheduler from inside add_disk():
- In case of a single hardware queue or if sharing hardware queues across
  multiple request queues (BLK_MQ_F_TAG_HCTX_SHARED), use mq-deadline.
- Otherwise, use 'none'.

This is a good choice for most but not for all block drivers. Make it
possible to override the selection of mq-deadline with a new flag,
namely BLK_MQ_F_NO_SCHED_BY_DEFAULT.

Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Ming Lei &lt;ming.lei@redhat.com&gt;
Cc: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Cc: Martijn Coenen &lt;maco@android.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Link: https://lore.kernel.org/r/20210805174200.3250718-2-bvanassche@acm.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk: Fix lock inversion between ioc lock and bfqd lock</title>
<updated>2021-06-25T00:43:55+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2021-06-23T09:36:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fd2ef39cc9a6b9c4c41864ac506906c52f94b06a'/>
<id>urn:sha1:fd2ef39cc9a6b9c4c41864ac506906c52f94b06a</id>
<content type='text'>
Lockdep complains about lock inversion between ioc-&gt;lock and bfqd-&gt;lock:

bfqd -&gt; ioc:
 put_io_context+0x33/0x90 -&gt; ioc-&gt;lock grabbed
 blk_mq_free_request+0x51/0x140
 blk_put_request+0xe/0x10
 blk_attempt_req_merge+0x1d/0x30
 elv_attempt_insert_merge+0x56/0xa0
 blk_mq_sched_try_insert_merge+0x4b/0x60
 bfq_insert_requests+0x9e/0x18c0 -&gt; bfqd-&gt;lock grabbed
 blk_mq_sched_insert_requests+0xd6/0x2b0
 blk_mq_flush_plug_list+0x154/0x280
 blk_finish_plug+0x40/0x60
 ext4_writepages+0x696/0x1320
 do_writepages+0x1c/0x80
 __filemap_fdatawrite_range+0xd7/0x120
 sync_file_range+0xac/0xf0

ioc-&gt;bfqd:
 bfq_exit_icq+0xa3/0xe0 -&gt; bfqd-&gt;lock grabbed
 put_io_context_active+0x78/0xb0 -&gt; ioc-&gt;lock grabbed
 exit_io_context+0x48/0x50
 do_exit+0x7e9/0xdd0
 do_group_exit+0x54/0xc0

To avoid this inversion we change blk_mq_sched_try_insert_merge() to not
free the merged request but rather leave that upto the caller similarly
to blk_mq_sched_try_merge(). And in bfq_insert_requests() we make sure
to free all the merged requests after dropping bfqd-&gt;lock.

Fixes: aee69d78dec0 ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler")
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Acked-by: Paolo Valente &lt;paolo.valente@linaro.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20210623093634.27879-3-jack@suse.cz
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Remove unnecessary elevator operation checks</title>
<updated>2021-06-18T14:51:48+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@wdc.com</email>
</author>
<published>2021-06-18T01:59:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e42cfb1da0bf33c313318da201730324c423351d'/>
<id>urn:sha1:e42cfb1da0bf33c313318da201730324c423351d</id>
<content type='text'>
The insert_requests and dispatch_request elevator operations are
mandatory for the correct execution of an elevator, and all implemented
elevators (bfq, kyber and mq-deadline) implement them. As a result,
there is no need to check for these operations before calling them when
a queue has an elevator set. This simplifies the code in
__blk_mq_sched_dispatch_requests() and blk_mq_sched_insert_request().

To avoid out-of-tree elevators to crash the kernel in case of bad
implementation, add a check in elv_register() to verify that these
operations are implemented.

A small, probably not significant, IOPS improvement of 0.1% is observed
with this patch applied (4.117 MIOPS to 4.123 MIOPS, average of 20 fio
runs doing 4K random direct reads with psync and 32 jobs).

Signed-off-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Link: https://lore.kernel.org/r/20210618015922.713999-1-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: improve the blk_mq_init_allocated_queue interface</title>
<updated>2021-06-11T17:53:02+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2021-06-02T06:53:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=26a9750aa875126e4b7fc5ee6de652a529c5b7ee'/>
<id>urn:sha1:26a9750aa875126e4b7fc5ee6de652a529c5b7ee</id>
<content type='text'>
Don't return the passed in request_queue but a normal error code, and
drop the elevator_init argument in favor of just calling elevator_init_mq
directly from dm-rq.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Chaitanya Kulkarni &lt;chaitanya.kulkarni@wdc.com&gt;
Link: https://lore.kernel.org/r/20210602065345.355274-3-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: set default elevator as deadline in case of hctx shared tagset</title>
<updated>2021-04-07T16:15:23+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2021-04-06T03:19:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=580dca8143d215977811bd2ff881e1e4f6ff39f0'/>
<id>urn:sha1:580dca8143d215977811bd2ff881e1e4f6ff39f0</id>
<content type='text'>
Yanhui found that write performance is degraded a lot after applying
hctx shared tagset on one test machine with megaraid_sas. And turns out
it is caused by none scheduler which becomes default elevator caused by
hctx shared tagset patchset.

Given more scsi HBAs will apply hctx shared tagset, and the similar
performance exists for them too.

So keep previous behavior by still using default mq-deadline for queues
which apply hctx shared tagset, just like before.

Fixes: 32bc15afed04 ("blk-mq: Facilitate a shared sbitmap per tagset")
Reported-by: Yanhui Ma &lt;yama@redhat.com&gt;
Cc: John Garry &lt;john.garry@huawei.com&gt;
Cc: Hannes Reinecke &lt;hare@suse.de&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Reviewed-by: John Garry &lt;john.garry@huawei.com&gt;
Link: https://lore.kernel.org/r/20210406031933.767228-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
