<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/blk-mq.h, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-12-04T14:19:26+00:00</updated>
<entry>
<title>blk-mq: add blk_rq_nr_bvec() helper</title>
<updated>2025-12-04T14:19:26+00:00</updated>
<author>
<name>Chaitanya Kulkarni</name>
<email>ckulkarnilinux@gmail.com</email>
</author>
<published>2025-12-03T03:58:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=71075d25ca5cae732fb57da065fbf14aeb3bcfc7'/>
<id>urn:sha1:71075d25ca5cae732fb57da065fbf14aeb3bcfc7</id>
<content type='text'>
Add a new helper function blk_rq_nr_bvec() that returns the number of
bvecs in a request. This count represents the number of iterations
rq_for_each_bvec() would perform on a request.

Drivers need to pre-allocate bvec arrays before iterating through
a request's bvecs. Currently, they manually count bvecs using
rq_for_each_bvec() in a loop, which is repetitive. The new helper
centralizes this logic.

This pattern exists in loop and zloop drivers, where multi-bio requests
require copying bvecs into a contiguous array before creating
an iov_iter for file operations.

Update loop and zloop drivers to use the new helper, eliminating
duplicate code.

This patch also provides a clear API that avoids misusing
blk_rq_nr_phys_segments() to count bvecs: one bvec can span more than
one segment, so sizing the array by blk_rq_nr_phys_segments() can
over-allocate memory:

[ 6155.673749] nullb_bio: 128K bio as ONE bvec: sector=0, size=131072
[ 6155.673846] null_blk: #### null_handle_data_transfer:1375
[ 6155.673850] null_blk: nr_bvec=1 blk_rq_nr_phys_segments=2
[ 6155.674263] null_blk: #### null_handle_data_transfer:1375
[ 6155.674267] null_blk: nr_bvec=1 blk_rq_nr_phys_segments=1
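
The difference between the two counts in the log above can be sketched
with a small userspace model. This is only an illustration, not the
kernel implementation: the model_* names, the fixed-size array, and the
64 KiB maximum segment size used in the note below are assumptions, and
real segment splitting also depends on other queue limits.

```c
/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_MAX_BVECS 4

struct model_bio {
	unsigned int nr_bvecs;			/* bvecs held by this bio */
	unsigned int len[MODEL_MAX_BVECS];	/* length of each bvec in bytes */
};

/* One count per bvec, no matter how long each bvec is. */
static unsigned int model_rq_nr_bvec(const struct model_bio *bios,
				     unsigned int nr_bios)
{
	unsigned int i, total = 0;

	for (i = 0; i != nr_bios; i++)
		total += bios[i].nr_bvecs;
	return total;
}

/*
 * Assumed rule: a bvec longer than max_seg bytes is split into
 * several segments (length divided by max_seg, rounded up).
 */
static unsigned int model_rq_nr_segments(const struct model_bio *bios,
					 unsigned int nr_bios,
					 unsigned int max_seg)
{
	unsigned int i, j, total = 0;

	for (i = 0; i != nr_bios; i++)
		for (j = 0; j != bios[i].nr_bvecs; j++)
			total += (bios[i].len[j] + max_seg - 1) / max_seg;
	return total;
}
```

For a request holding a single 128 KiB bvec, model_rq_nr_bvec() yields 1,
while model_rq_nr_segments() with a 65536-byte limit yields 2, so sizing
a bvec array by the segment count would allocate one slot too many.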

Reviewed-by: Niklas Cassel &lt;cassel@kernel.org&gt;
Signed-off-by: Chaitanya Kulkarni &lt;ckulkarnilinux@gmail.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: fix potential uaf for 'queue_hw_ctx'</title>
<updated>2025-11-28T16:09:19+00:00</updated>
<author>
<name>Fengnan Chang</name>
<email>fengnanchang@gmail.com</email>
</author>
<published>2025-11-28T08:53:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=89e1fb7ceffd898505ad7fa57acec0585bfaa2cc'/>
<id>urn:sha1:89e1fb7ceffd898505ad7fa57acec0585bfaa2cc</id>
<content type='text'>
This just applies Kuai's patch in [1] with minor changes.

blk_mq_realloc_hw_ctxs() will free 'queue_hw_ctx' (e.g. when updating
submit_queues through configfs for null_blk), while it might still be
used from another context (e.g. when switching the elevator to none):

t1					t2
elevator_switch
 blk_mq_unquiesce_queue
  blk_mq_run_hw_queues
   queue_for_each_hw_ctx
    // assembly code for hctx = (q)-&gt;queue_hw_ctx[i]
    mov    0x48(%rbp),%rdx -&gt; read old queue_hw_ctx

					__blk_mq_update_nr_hw_queues
					 blk_mq_realloc_hw_ctxs
					  hctxs = q-&gt;queue_hw_ctx
					  q-&gt;queue_hw_ctx = new_hctxs
					  kfree(hctxs)
    movslq %ebx,%rax
    mov    (%rdx,%rax,8),%rdi -&gt;uaf

This problem was found by code review. I confirmed that the concurrent
scenario does exist (specifically, 'q-&gt;queue_hw_ctx' can be changed during
blk_mq_run_hw_queues()); however, the uaf hasn't been reproduced yet
without hacking the kernel.

Since the queue is frozen in __blk_mq_update_nr_hw_queues(), fix the
problem by protecting 'queue_hw_ctx' through rcu where it can be accessed
without grabbing 'q_usage_counter'.

[1] https://lore.kernel.org/all/20220225072053.2472431-1-yukuai3@huawei.com/

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Fengnan Chang &lt;changfengnan@bytedance.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: use array manage hctx map instead of xarray</title>
<updated>2025-11-28T16:09:19+00:00</updated>
<author>
<name>Fengnan Chang</name>
<email>fengnanchang@gmail.com</email>
</author>
<published>2025-11-28T08:53:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d0c98769ee7d5db8d699a270690639cde1766cd4'/>
<id>urn:sha1:d0c98769ee7d5db8d699a270690639cde1766cd4</id>
<content type='text'>
After commit 4e5cc99e1e48 ("blk-mq: manage hctx map via xarray"), an
xarray is used instead of an array to store hctxs. In poll mode, however,
every call to blk_mq_poll needs xa_load to find the corresponding hctx,
which adds overhead. In my test, xa_load may cost 3.8% cpu.

This patch reverts the previous change, eliminating the overhead of
xa_load, and can result in a 3% performance improvement.

Signed-off-by: Fengnan Chang &lt;changfengnan@bytedance.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: accumulate memory segment gaps per bio</title>
<updated>2025-11-07T01:11:58+00:00</updated>
<author>
<name>Keith Busch</name>
<email>kbusch@kernel.org</email>
</author>
<published>2025-10-14T15:04:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2f6b2565d43cdb5087cac23d530cca84aa3d897e'/>
<id>urn:sha1:2f6b2565d43cdb5087cac23d530cca84aa3d897e</id>
<content type='text'>
The blk-mq dma iterator has an optimization for requests that align to
the device's iommu merge boundary. This boundary may be larger than the
device's virtual boundary, but the code had been depending on that queue
limit to know ahead of time if the request is guaranteed to align to
that optimization.

Rather than rely on that queue limit, which many devices may not report,
save the lowest set bit of any boundary gap between each segment in the
bio while checking the segments. The request stores the value for
merging and quickly checking per io if the request can use iova
optimizations.

Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: Document tags_srcu member in blk_mq_tag_set structure</title>
<updated>2025-09-09T13:37:05+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2025-09-09T12:33:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=199c9a8d26638845f509b76e3c176c27e7baafd7'/>
<id>urn:sha1:199c9a8d26638845f509b76e3c176c27e7baafd7</id>
<content type='text'>
Add missing documentation for the tags_srcu member that was introduced
to defer freeing of tags page_list to prevent use-after-free when
iterating tags.

Fixes htmldocs warning:
WARNING: include/linux/blk-mq.h:536 struct member 'tags_srcu' not described in 'blk_mq_tag_set'

Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: Defer freeing of tags page_list to SRCU callback</title>
<updated>2025-09-08T14:05:32+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2025-08-30T02:18:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ad0d05dbddc1bf86e92220fea873176de6b12f78'/>
<id>urn:sha1:ad0d05dbddc1bf86e92220fea873176de6b12f78</id>
<content type='text'>
Tag iterators can race with the freeing of the request pages (tags-&gt;page_list),
potentially leading to use-after-free issues.

Defer the freeing of the page list and the tags structure itself until
after an SRCU grace period has passed. This ensures that any concurrent
tag iterators have completed before the memory is released. This lets us
replace the big tags-&gt;lock in the tag iterator code path with srcu to
solve the issue.

This is achieved by:
- Adding a new `srcu_struct tags_srcu` to `blk_mq_tag_set` to protect
  tag map iteration.
- Adding an `rcu_head` to `struct blk_mq_tags` to be used with
  `call_srcu`.
- Moving the page list freeing logic and the `kfree(tags)` call into a
  new callback function, `blk_mq_free_tags_callback`.
- In `blk_mq_free_tags`, invoking `call_srcu` to schedule the new
  callback for deferred execution.

The read-side protection for the tag iterators will be added in a
subsequent patch.

Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: add number of queue calc helper</title>
<updated>2025-07-01T16:24:19+00:00</updated>
<author>
<name>Daniel Wagner</name>
<email>wagi@kernel.org</email>
</author>
<published>2025-06-17T13:43:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3f27c1de5df265f9d8edf0cc5d75dc92e328484a'/>
<id>urn:sha1:3f27c1de5df265f9d8edf0cc5d75dc92e328484a</id>
<content type='text'>
Add two variants of helper functions that calculate the correct number
of queues to use. Two variants are needed because some drivers base
their maximum number of queues on the possible CPU mask, while others
use the online CPU mask.
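
The two variants can be pictured with the following userspace sketch.
It is not the kernel code: the model_* names and the struct are
hypothetical, and treating a max_queues of 0 as "no driver limit" is an
assumption made for this illustration.

```c
/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_cpus {
	unsigned int possible;	/* CPUs that could ever be brought up */
	unsigned int online;	/* CPUs currently online */
};

/* Cap the CPU count by the driver limit; 0 is assumed to mean no limit. */
static unsigned int model_num_queues(unsigned int nr_cpus,
				     unsigned int max_queues)
{
	if (max_queues != 0 && nr_cpus > max_queues)
		return max_queues;
	return nr_cpus;
}

/* Variant for drivers that size their queues by the possible CPU mask. */
static unsigned int model_num_possible_queues(struct model_cpus cpus,
					      unsigned int max_queues)
{
	return model_num_queues(cpus.possible, max_queues);
}

/* Variant for drivers that size their queues by the online CPU mask. */
static unsigned int model_num_online_queues(struct model_cpus cpus,
					    unsigned int max_queues)
{
	return model_num_queues(cpus.online, max_queues);
}
```

With 16 possible and 8 online CPUs, the possible variant returns 16 and
the online variant returns 8 when no limit is given; a limit of 4 caps
both at 4.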

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Daniel Wagner &lt;wagi@kernel.org&gt;
Reviewed-by: Chaitanya Kulkarni &lt;kch@nvidia.com&gt;
Link: https://lore.kernel.org/r/20250617-isolcpus-queue-counters-v1-2-13923686b54b@kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: remove the q argument from blk_rq_map_kern</title>
<updated>2025-05-07T13:31:07+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2025-05-07T12:04:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=af78428ed3f3eebad7be9d0463251046e9582cf6'/>
<id>urn:sha1:af78428ed3f3eebad7be9d0463251046e9582cf6</id>
<content type='text'>
Remove the q argument from blk_rq_map_kern and the internal helpers
called by it as the queue can trivially be derived from the request.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Damien Le Moal &lt;dlemoal@kernel.org&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Link: https://lore.kernel.org/r/20250507120451.4000627-6-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: fix warning on 'make htmldocs'</title>
<updated>2025-05-07T11:37:39+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2025-05-07T09:25:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f31acff017b1d80e649f674450d3b64d3265848c'/>
<id>urn:sha1:f31acff017b1d80e649f674450d3b64d3265848c</id>
<content type='text'>
Fix the following warning when running 'make htmldocs':

+WARNING: include/linux/blk-mq.h:532 struct member 'update_nr_hwq_lock' not described in 'blk_mq_tag_set'

Fixes: 98e68f67020c ("block: prevent adding/deleting disk during updating nr_hw_queues")
Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Closes: https://lore.kernel.org/all/20250507163220.00141d77@canb.auug.org.au/
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Link: https://lore.kernel.org/r/20250507092537.3009112-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: prevent adding/deleting disk during updating nr_hw_queues</title>
<updated>2025-05-06T13:43:42+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2025-05-05T14:17:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=98e68f67020ce30e1a4d8e2d05d85a453738dfb8'/>
<id>urn:sha1:98e68f67020ce30e1a4d8e2d05d85a453738dfb8</id>
<content type='text'>
The code paths that add and delete a disk are both readers of
`nr_hw_queues`, so they must not run while nr_hw_queues is being updated;
a kernel panic and a kasan report have been reported in [1].

Prevent adding/deleting a disk while nr_hw_queues is being updated by
adding an rw_semaphore to the tagset: the write lock is grabbed in
blk_mq_update_nr_hw_queues(), and the read lock is acquired when
adding/deleting a disk.

Also mark a GFP_NOIO allocation scope for adding/deleting a disk because
blk_mq_update_nr_hw_queues() is part of some drivers' error handlers.

This avoids a lot of trouble.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Suggested-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Reported-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Closes: https://lore.kernel.org/linux-block/a5896cdb-a59a-4a37-9f99-20522f5d2987@linux.ibm.com/
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Link: https://lore.kernel.org/r/20250505141805.2751237-9-ming.lei@redhat.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
