<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/blk-mq.h, branch v7.2-rc1</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.2-rc1</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.2-rc1'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-05-28T15:35:38+00:00</updated>
<entry>
<title>block: export passthrough stats enabled</title>
<updated>2026-05-28T15:35:38+00:00</updated>
<author>
<name>Keith Busch</name>
<email>kbusch@kernel.org</email>
</author>
<published>2026-05-28T01:00:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b7f40ab50190e2500c3c297d15e00040dca47feb'/>
<id>urn:sha1:b7f40ab50190e2500c3c297d15e00040dca47feb</id>
<content type='text'>
A user can enable io accounting for passthrough requests, so export the
helper that checks if the request should be tracked. This will enable
stacking drivers to to report iostats for passthrough workloads. Since
the stacking request_queue may not be the one providing the request, the
API has to add a parameter for the caller to specify which one to check.

Reviewed-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Reviewed-by: Nitesh Shetty &lt;nj.shetty@samsung.com&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://patch.msgid.link/20260528010041.1533124-2-kbusch@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: switch numa_node to int in blk_mq_hw_ctx and init_request</title>
<updated>2026-05-26T17:01:55+00:00</updated>
<author>
<name>Mateusz Nowicki</name>
<email>mateusz.nowicki@posteo.net</email>
</author>
<published>2026-05-23T12:52:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b040a1a4523d99a935cb6566b1e2a753c84733cd'/>
<id>urn:sha1:b040a1a4523d99a935cb6566b1e2a753c84733cd</id>
<content type='text'>
numa_node in blk_mq_hw_ctx and the matching argument of
blk_mq_ops::init_request can be NUMA_NO_NODE (-1).  Declared as
unsigned int, NUMA_NO_NODE becomes UINT_MAX and walks off
nvme_dev::descriptor_pools[] on CONFIG_NUMA=n [1].

Switch the field and the callback prototype to int and update all
in-tree init_request implementations.  No functional change:
cpu_to_node(), kmalloc_node() and blk_alloc_flush_queue() already
take int.

Link: https://lore.kernel.org/linux-nvme/20260522150628.399288-1-mateusz.nowicki@posteo.net/ [1]
Link: https://lore.kernel.org/linux-nvme/20260309062840.2937858-2-iam@sung-woo.kim/
Suggested-by: Caleb Sander Mateos &lt;csander@purestorage.com&gt;
Suggested-by: Sung-woo Kim &lt;iam@sung-woo.kim&gt;
Signed-off-by: Mateusz Nowicki &lt;mateusz.nowicki@posteo.net&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://patch.msgid.link/20260523125210.272274-1-mateusz.nowicki@posteo.net
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: introduce blk_rq_has_data()</title>
<updated>2026-05-22T14:05:33+00:00</updated>
<author>
<name>Caleb Sander Mateos</name>
<email>csander@purestorage.com</email>
</author>
<published>2026-05-13T21:18:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=999722b34441b4ab65b7ca7fb16dd4b62fc3c354'/>
<id>urn:sha1:999722b34441b4ab65b7ca7fb16dd4b62fc3c354</id>
<content type='text'>
Add blk_rq_has_data(), an analogue of bio_has_data() for struct request.
This skips one dereference relative to bio_has_data(rq-&gt;bio).

Signed-off-by: Caleb Sander Mateos &lt;csander@purestorage.com&gt;
Reviewed-by: Ming Lei &lt;tom.leiming@gmail.com&gt;
Link: https://patch.msgid.link/20260513211846.1956810-2-csander@purestorage.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: pass io_comp_batch to rq_end_io_fn callback</title>
<updated>2026-01-20T17:12:54+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2026-01-16T07:46:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5e2fde1a9433efc484a5feec36f748aa3ea58c85'/>
<id>urn:sha1:5e2fde1a9433efc484a5feec36f748aa3ea58c85</id>
<content type='text'>
Add a third parameter 'const struct io_comp_batch *' to the rq_end_io_fn
callback signature. This allows end_io handlers to access the completion
batch context when requests are completed via blk_mq_end_request_batch().

The io_comp_batch is passed from blk_mq_end_request_batch(), while NULL
is passed from __blk_mq_end_request() and blk_mq_put_rq_ref() which don't
have batch context.

This infrastructure change enables drivers to detect whether they're
being called from a batched completion path (like iopoll) and access
additional context stored in the io_comp_batch.

Update all rq_end_io_fn implementations:
- block/blk-mq.c: blk_end_sync_rq
- block/blk-flush.c: flush_end_io, mq_flush_data_end_io
- drivers/nvme/host/ioctl.c: nvme_uring_cmd_end_io
- drivers/nvme/host/core.c: nvme_keep_alive_end_io
- drivers/nvme/host/pci.c: abort_endio, nvme_del_queue_end, nvme_del_cq_end
- drivers/nvme/target/passthru.c: nvmet_passthru_req_done
- drivers/scsi/scsi_error.c: eh_lock_door_done
- drivers/scsi/sg.c: sg_rq_end_io
- drivers/scsi/st.c: st_scsi_execute_end
- drivers/target/target_core_pscsi.c: pscsi_req_done
- drivers/md/dm-rq.c: end_clone_request

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Kanchan Joshi &lt;joshi.k@samsung.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: add blk_rq_nr_bvec() helper</title>
<updated>2025-12-04T14:19:26+00:00</updated>
<author>
<name>Chaitanya Kulkarni</name>
<email>ckulkarnilinux@gmail.com</email>
</author>
<published>2025-12-03T03:58:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=71075d25ca5cae732fb57da065fbf14aeb3bcfc7'/>
<id>urn:sha1:71075d25ca5cae732fb57da065fbf14aeb3bcfc7</id>
<content type='text'>
Add a new helper function blk_rq_nr_bvec() that returns the number of
bvecs in a request. This count represents the number of iterations
rq_for_each_bvec() would perform on a request.

Drivers need to pre-allocate bvec arrays before iterating through
a request's bvecs. Currently, they manually count bvecs using
rq_for_each_bvec() in a loop, which is repetitive. The new helper
centralizes this logic.

This pattern exists in loop and zloop drivers, where multi-bio requests
require copying bvecs into a contiguous array before creating
an iov_iter for file operations.

Update loop and zloop drivers to use the new helper, eliminating
duplicate code.

This patch also provides a clear API to avoid any potential misuse of
blk_nr_phys_segments() for calculating the bvecs since, one bvec can
have more than one segments and use of blk_nr_phys_segments() can
lead to extra memory allocation :-

[ 6155.673749] nullb_bio: 128K bio as ONE bvec: sector=0, size=131072
[ 6155.673846] null_blk: #### null_handle_data_transfer:1375
[ 6155.673850] null_blk: nr_bvec=1 blk_rq_nr_phys_segments=2
[ 6155.674263] null_blk: #### null_handle_data_transfer:1375
[ 6155.674267] null_blk: nr_bvec=1 blk_rq_nr_phys_segments=1

Reviewed-by: Niklas Cassel &lt;cassel@kernel.org&gt;
Signed-off-by: Chaitanya Kulkarni &lt;ckulkarnilinux@gmail.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: fix potential uaf for 'queue_hw_ctx'</title>
<updated>2025-11-28T16:09:19+00:00</updated>
<author>
<name>Fengnan Chang</name>
<email>fengnanchang@gmail.com</email>
</author>
<published>2025-11-28T08:53:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=89e1fb7ceffd898505ad7fa57acec0585bfaa2cc'/>
<id>urn:sha1:89e1fb7ceffd898505ad7fa57acec0585bfaa2cc</id>
<content type='text'>
This is just apply Kuai's patch in [1] with mirror changes.

blk_mq_realloc_hw_ctxs() will free the 'queue_hw_ctx'(e.g. undate
submit_queues through configfs for null_blk), while it might still be
used from other context(e.g. switch elevator to none):

t1					t2
elevator_switch
 blk_mq_unquiesce_queue
  blk_mq_run_hw_queues
   queue_for_each_hw_ctx
    // assembly code for hctx = (q)-&gt;queue_hw_ctx[i]
    mov    0x48(%rbp),%rdx -&gt; read old queue_hw_ctx

					__blk_mq_update_nr_hw_queues
					 blk_mq_realloc_hw_ctxs
					  hctxs = q-&gt;queue_hw_ctx
					  q-&gt;queue_hw_ctx = new_hctxs
					  kfree(hctxs)
    movslq %ebx,%rax
    mov    (%rdx,%rax,8),%rdi -&gt;uaf

This problem was found by code review, and I comfirmed that the concurrent
scenario do exist(specifically 'q-&gt;queue_hw_ctx' can be changed during
blk_mq_run_hw_queues()), however, the uaf problem hasn't been repoduced yet
without hacking the kernel.

Sicne the queue is freezed in __blk_mq_update_nr_hw_queues(), fix the
problem by protecting 'queue_hw_ctx' through rcu where it can be accessed
without grabbing 'q_usage_counter'.

[1] https://lore.kernel.org/all/20220225072053.2472431-1-yukuai3@huawei.com/

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Fengnan Chang &lt;changfengnan@bytedance.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: use array manage hctx map instead of xarray</title>
<updated>2025-11-28T16:09:19+00:00</updated>
<author>
<name>Fengnan Chang</name>
<email>fengnanchang@gmail.com</email>
</author>
<published>2025-11-28T08:53:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d0c98769ee7d5db8d699a270690639cde1766cd4'/>
<id>urn:sha1:d0c98769ee7d5db8d699a270690639cde1766cd4</id>
<content type='text'>
After commit 4e5cc99e1e48 ("blk-mq: manage hctx map via xarray"), we use
an xarray instead of array to store hctx, but in poll mode, each time
in blk_mq_poll, we need use xa_load to find corresponding hctx, this
introduce some costs. In my test, xa_load may cost 3.8% cpu.

This patch revert previous change, eliminates the overhead of xa_load
and can result in a 3% performance improvement.

Signed-off-by: Fengnan Chang &lt;changfengnan@bytedance.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: accumulate memory segment gaps per bio</title>
<updated>2025-11-07T01:11:58+00:00</updated>
<author>
<name>Keith Busch</name>
<email>kbusch@kernel.org</email>
</author>
<published>2025-10-14T15:04:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2f6b2565d43cdb5087cac23d530cca84aa3d897e'/>
<id>urn:sha1:2f6b2565d43cdb5087cac23d530cca84aa3d897e</id>
<content type='text'>
The blk-mq dma iterator has an optimization for requests that align to
the device's iommu merge boundary. This boundary may be larger than the
device's virtual boundary, but the code had been depending on that queue
limit to know ahead of time if the request is guaranteed to align to
that optimization.

Rather than rely on that queue limit, which many devices may not report,
save the lowest set bit of any boundary gap between each segment in the
bio while checking the segments. The request stores the value for
merging and quickly checking per io if the request can use iova
optimizations.

Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: Document tags_srcu member in blk_mq_tag_set structure</title>
<updated>2025-09-09T13:37:05+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2025-09-09T12:33:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=199c9a8d26638845f509b76e3c176c27e7baafd7'/>
<id>urn:sha1:199c9a8d26638845f509b76e3c176c27e7baafd7</id>
<content type='text'>
Add missing documentation for the tags_srcu member that was introduced
to defer freeing of tags page_list to prevent use-after-free when
iterating tags.

Fixes htmldocs warning:
WARNING: include/linux/blk-mq.h:536 struct member 'tags_srcu' not described in 'blk_mq_tag_set'

Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: Defer freeing of tags page_list to SRCU callback</title>
<updated>2025-09-08T14:05:32+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2025-08-30T02:18:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ad0d05dbddc1bf86e92220fea873176de6b12f78'/>
<id>urn:sha1:ad0d05dbddc1bf86e92220fea873176de6b12f78</id>
<content type='text'>
Tag iterators can race with the freeing of the request pages(tags-&gt;page_list),
potentially leading to use-after-free issues.

Defer the freeing of the page list and the tags structure itself until
after an SRCU grace period has passed. This ensures that any concurrent
tag iterators have completed before the memory is released. With this
way, we can replace the big tags-&gt;lock in tags iterator code path with
srcu for solving the issue.

This is achieved by:
- Adding a new `srcu_struct tags_srcu` to `blk_mq_tag_set` to protect
  tag map iteration.
- Adding an `rcu_head` to `struct blk_mq_tags` to be used with
  `call_srcu`.
- Moving the page list freeing logic and the `kfree(tags)` call into a
  new callback function, `blk_mq_free_tags_callback`.
- In `blk_mq_free_tags`, invoking `call_srcu` to schedule the new
  callback for deferred execution.

The read-side protection for the tag iterators will be added in a
subsequent patch.

Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
