<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/io_uring/kbuf.c, branch v6.18.21</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-25T10:10:35+00:00</updated>
<entry>
<title>io_uring/kbuf: propagate BUF_MORE through early buffer commit path</title>
<updated>2026-03-25T10:10:35+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-03-19T20:29:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=19e46bf37943bc26bd81045558bcd38b01f06f23'/>
<id>urn:sha1:19e46bf37943bc26bd81045558bcd38b01f06f23</id>
<content type='text'>
commit 418eab7a6f3c002d8e64d6e95ec27118017019af upstream.

When io_should_commit() returns true (eg for non-pollable files), buffer
commit happens at buffer selection time and sel-&gt;buf_list is set to
NULL. When __io_put_kbufs() generates CQE flags at completion time, it
calls __io_put_kbuf_ring() which finds a NULL buffer_list and hence
cannot determine whether the buffer was consumed or not. This means that
IORING_CQE_F_BUF_MORE is never set for non-pollable input with
incrementally consumed buffers.

Likewise for io_buffers_select(), which always commits upfront and
discards the return value of io_kbuf_commit().

Add REQ_F_BUF_MORE to store the result of io_kbuf_commit() during early
commit. Then __io_put_kbuf_ring() can check this flag and set
IORING_F_BUF_MORE accordingy.

Reported-by: Martin Michaelis &lt;code@mgjm.de&gt;
Cc: stable@vger.kernel.org
Fixes: ae98dbf43d75 ("io_uring/kbuf: add support for incremental buffer consumption")
Link: https://github.com/axboe/liburing/issues/1553
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>io_uring/kbuf: fix missing BUF_MORE for incremental buffers at EOF</title>
<updated>2026-03-25T10:10:35+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-03-19T20:29:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f98405afd2a7017f2d1bea769299c55ab41f1384'/>
<id>urn:sha1:f98405afd2a7017f2d1bea769299c55ab41f1384</id>
<content type='text'>
commit 3ecd3e03144b38a21a3b70254f1b9d2e16629b09 upstream.

For a zero length transfer, io_kbuf_inc_commit() is called with !len.
Since we never enter the while loop to consume the buffers,
io_kbuf_inc_commit() ends up returning true, consuming the buffer. But
if no data was consumed, by definition it cannot have consumed the
buffer. Return false for that case.

Reported-by: Martin Michaelis &lt;code@mgjm.de&gt;
Cc: stable@vger.kernel.org
Fixes: ae98dbf43d75 ("io_uring/kbuf: add support for incremental buffer consumption")
Link: https://github.com/axboe/liburing/issues/1553
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>io_uring/kbuf: check if target buffer list is still legacy on recycle</title>
<updated>2026-03-19T15:08:42+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-03-12T14:59:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=50ad880db3013c6fee0ef13781762a39e2e7ef83'/>
<id>urn:sha1:50ad880db3013c6fee0ef13781762a39e2e7ef83</id>
<content type='text'>
commit c2c185be5c85d37215397c8e8781abf0a69bec1f upstream.

There's a gap between when the buffer was grabbed and when it
potentially gets recycled, where if the list is empty, someone could've
upgraded it to a ring provided type. This can happen if the request
is forced via io-wq. The legacy recycling is missing checking if the
buffer_list still exists, and if it's of the correct type. Add those
checks.

Cc: stable@vger.kernel.org
Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
Reported-by: Keenan Dong &lt;keenanat2000@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>io_uring/kbuf: fix memory leak if io_buffer_add_list fails</title>
<updated>2026-02-26T22:59:01+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-02-05T15:38:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=445d35742d8636225d01e26f90fb49fa0327d8c1'/>
<id>urn:sha1:445d35742d8636225d01e26f90fb49fa0327d8c1</id>
<content type='text'>
[ Upstream commit 442ae406603a94f1a263654494f425302ceb0445 ]

io_register_pbuf_ring() ignores the return value of io_buffer_add_list(),
which can fail if xa_store() returns an error (e.g., -ENOMEM). When this
happens, the function returns 0 (success) to the caller, but the
io_buffer_list structure is neither added to the xarray nor freed.

In practice this requires failure injection to hit, hence not a real
issue. But it should get fixed up none the less.

Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>io_uring/kbuf: use READ_ONCE() for userspace-mapped memory</title>
<updated>2025-12-18T13:03:36+00:00</updated>
<author>
<name>Caleb Sander Mateos</name>
<email>csander@purestorage.com</email>
</author>
<published>2025-12-04T22:43:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0e777da54152fabd626828356f37505344c0ebb2'/>
<id>urn:sha1:0e777da54152fabd626828356f37505344c0ebb2</id>
<content type='text'>
[ Upstream commit 78385c7299f7514697d196b3233a91bd5e485591 ]

The struct io_uring_buf elements in a buffer ring are in a memory region
accessible from userspace. A malicious/buggy userspace program could
therefore write to them at any time, so they should be accessed with
READ_ONCE() in the kernel. Commit 98b6fa62c84f ("io_uring/kbuf: always
use READ_ONCE() to read ring provided buffer lengths") already switched
the reads of the len field to READ_ONCE(). Do the same for bid and addr.

Signed-off-by: Caleb Sander Mateos &lt;csander@purestorage.com&gt;
Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
Cc: Joanne Koong &lt;joannelkoong@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>io_uring: fix buffer auto-commit for multishot uring_cmd</title>
<updated>2025-10-24T01:41:31+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2025-10-24T01:34:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6f1cbf6d6fd13fc169dde14e865897924cdc4bbd'/>
<id>urn:sha1:6f1cbf6d6fd13fc169dde14e865897924cdc4bbd</id>
<content type='text'>
Commit 620a50c92700 ("io_uring: uring_cmd: add multishot support") added
multishot uring_cmd support with explicit buffer upfront commit via
io_uring_mshot_cmd_post_cqe(). However, the buffer selection path in
io_ring_buffer_select() was auto-committing buffers for non-pollable files,
which conflicts with uring_cmd's explicit upfront commit model.

This way consumes the whole selected buffer immediately, and causes
failure on the following buffer selection.

Fix this by checking uring_cmd to identify operations that handle buffer
commit explicitly, and skip auto-commit for these operations.

Cc: Caleb Sander Mateos &lt;csander@purestorage.com&gt;
Fixes: 620a50c92700 ("io_uring: uring_cmd: add multishot support")
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-6.18/io_uring-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux</title>
<updated>2025-10-02T16:56:23+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-10-02T16:56:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5832d26433f2bd0d28f8b12526e3c2fdb203507f'/>
<id>urn:sha1:5832d26433f2bd0d28f8b12526e3c2fdb203507f</id>
<content type='text'>
Pull io_uring updates from Jens Axboe:

 - Store ring provided buffers locally for the users, rather than stuff
   them into struct io_kiocb.

   These types of buffers must always be fully consumed or recycled in
   the current context, and leaving them in struct io_kiocb is hence not
   a good ideas as that struct has a vastly different life time.

   Basically just an architecture cleanup that can help prevent issues
   with ring provided buffers in the future.

 - Support for mixed CQE sizes in the same ring.

   Before this change, a CQ ring either used the default 16b CQEs, or it
   was setup with 32b CQE using IORING_SETUP_CQE32. For use cases where
   a few 32b CQEs were needed, this caused everything else to use big
   CQEs. This is wasteful both in terms of memory usage, but also memory
   bandwidth for the posted CQEs.

   With IORING_SETUP_CQE_MIXED, applications may use request types that
   post both normal 16b and big 32b CQEs on the same ring.

 - Add helpers for async data management, to make it harder for opcode
   handlers to mess it up.

 - Add support for multishot for uring_cmd, which ublk can use. This
   helps improve efficiency, by providing a persistent request type that
   can trigger multiple CQEs.

 - Add initial support for ring feature querying.

   We had basic support for probe operations, but the API isn't great.
   Rather than expand that, add support for QUERY which is easily
   expandable and can cover a lot more cases than the existing probe
   support. This will help applications get a better idea of what
   operations are supported on a given host.

 - zcrx improvements from Pavel:
        - Improve refill entry alignment for better caching
        - Various cleanups, especially around deduplicating normal
          memory vs dmabuf setup.
        - Generalisation of the niov size (Patch 12). It's still hard
          coded to PAGE_SIZE on init, but will let the user to specify
          the rx buffer length on setup.
        - Syscall / synchronous bufer return. It'll be used as a slow
          fallback path for returning buffers when the refill queue is
          full. Useful for tolerating slight queue size misconfiguration
          or with inconsistent load.
        - Accounting more memory to cgroups.
        - Additional independent cleanups that will also be useful for
          mutli-area support.

 - Various fixes and cleanups

* tag 'for-6.18/io_uring-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (68 commits)
  io_uring/cmd: drop unused res2 param from io_uring_cmd_done()
  io_uring: fix nvme's 32b cqes on mixed cq
  io_uring/query: cap number of queries
  io_uring/query: prevent infinite loops
  io_uring/zcrx: account niov arrays to cgroup
  io_uring/zcrx: allow synchronous buffer return
  io_uring/zcrx: introduce io_parse_rqe()
  io_uring/zcrx: don't adjust free cache space
  io_uring/zcrx: use guards for the refill lock
  io_uring/zcrx: reduce netmem scope in refill
  io_uring/zcrx: protect netdev with pp_lock
  io_uring/zcrx: rename dma lock
  io_uring/zcrx: make niov size variable
  io_uring/zcrx: set sgt for umem area
  io_uring/zcrx: remove dmabuf_offset
  io_uring/zcrx: deduplicate area mapping
  io_uring/zcrx: pass ifq to io_zcrx_alloc_fallback()
  io_uring/zcrx: check all niovs filled with dma addresses
  io_uring/zcrx: move area reg checks into io_import_area
  io_uring/zcrx: don't pass slot to io_zcrx_create_area
  ...
</content>
</entry>
<entry>
<title>io_uring/kbuf: always use READ_ONCE() to read ring provided buffer lengths</title>
<updated>2025-08-28T11:48:34+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2025-08-27T21:27:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=98b6fa62c84f2e129161e976a5b9b3cb4ccd117b'/>
<id>urn:sha1:98b6fa62c84f2e129161e976a5b9b3cb4ccd117b</id>
<content type='text'>
Since the buffers are mapped from userspace, it is prudent to use
READ_ONCE() to read the value into a local variable, and use that for
any other actions taken. Having a stable read of the buffer length
avoids worrying about it changing after checking, or being read multiple
times.

Similarly, the buffer may well change in between it being picked and
being committed. Ensure the looping for incremental ring buffer commit
stops if it hits a zero sized buffer, as no further progress can be made
at that point.

Fixes: ae98dbf43d75 ("io_uring/kbuf: add support for incremental buffer consumption")
Link: https://lore.kernel.org/io-uring/tencent_000C02641F6250C856D0C26228DE29A3D30A@qq.com/
Reported-by: Qingyue Zhang &lt;chunzhennn@qq.com&gt;
Reported-by: Suoxing Zhang &lt;aftern00n@qq.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/kbuf: fix signedness in this_len calculation</title>
<updated>2025-08-27T14:39:44+00:00</updated>
<author>
<name>Qingyue Zhang</name>
<email>chunzhennn@qq.com</email>
</author>
<published>2025-08-27T11:43:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c64eff368ac676e8540344d27a3de47e0ad90d21'/>
<id>urn:sha1:c64eff368ac676e8540344d27a3de47e0ad90d21</id>
<content type='text'>
When importing and using buffers, buf-&gt;len is considered unsigned.
However, buf-&gt;len is converted to signed int when committing. This can
lead to unexpected behavior if the buffer is large enough to be
interpreted as a negative value. Make min_t calculation unsigned.

Fixes: ae98dbf43d75 ("io_uring/kbuf: add support for incremental buffer consumption")
Co-developed-by: Suoxing Zhang &lt;aftern00n@qq.com&gt;
Signed-off-by: Suoxing Zhang &lt;aftern00n@qq.com&gt;
Signed-off-by: Qingyue Zhang &lt;chunzhennn@qq.com&gt;
Link: https://lore.kernel.org/r/tencent_4DBB3674C0419BEC2C0C525949DA410CA307@qq.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/kbuf: switch to storing struct io_buffer_list locally</title>
<updated>2025-08-24T17:41:12+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2025-08-21T02:03:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5fda51255439addd1c9059098e30847a375a1008'/>
<id>urn:sha1:5fda51255439addd1c9059098e30847a375a1008</id>
<content type='text'>
Currently the buffer list is stored in struct io_kiocb. The buffer list
can be of two types:

1) Classic/legacy buffer list. These don't need to get referenced after
   a buffer pick, and hence storing them in struct io_kiocb is perfectly
   fine.

2) Ring provided buffer lists. These DO need to be referenced after the
   initial buffer pick, as they need to get consumed later on. This can
   be either just incrementing the head of the ring, or it can be
   consuming parts of a buffer if incremental buffer consumptions has
   been configured.

For case 2, io_uring needs to be careful not to access the buffer list
after the initial pick-and-execute context. The core does recycling of
these, but it's easy to make a mistake, because it's stored in the
io_kiocb which does persist across multiple execution contexts. Either
because it's a multishot request, or simply because it needed some kind
of async trigger (eg poll) for retry purposes.

Add a struct io_buffer_list to struct io_br_sel, which is always on
stack for the various users of it. This prevents the buffer list from
leaking outside of that execution context, and additionally it enables
kbuf to not even pass back the struct io_buffer_list if the given
context isn't appropriately locked already.

This doesn't fix any bugs, it's simply a defensive measure to prevent
any issues with reuse of a buffer list.

Link: https://lore.kernel.org/r/20250821020750.598432-12-axboe@kernel.dk
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
