kernel/linux.git/io_uring, branch master

io_uring: take page references for NOMMU pbuf_ring mmaps

2026-04-22T02:14:39+00:00

Under !CONFIG_MMU, io_uring_get_unmapped_area() returns the kernel virtual address of the io_mapped_region's backing pages directly; the user's VMA aliases the kernel allocation. io_uring_mmap() then just returns 0 -- it takes no page references. The CONFIG_MMU path uses vm_insert_pages(), which takes a reference on each inserted page. Those references are released when the VMA is torn down (zap_pte_range -> put_page). io_free_region() -> release_pages() drops the io_uring-side references, but the pages survive until munmap drops the VMA-side references. Under NOMMU there are no VMA-side references. io_unregister_pbuf_ring -> io_put_bl -> io_free_region -> release_pages drops the only references and the pages return to the buddy allocator while the user's VMA still has vm_start pointing into them. The user can then write into whatever the allocator hands out next. Mirror the MMU lifetime: take get_page references in io_uring_mmap() and release them via vm_ops->close. NOMMU's delete_vma() calls vma_close() which runs ->close on munmap. This also incidentally addresses the duplicate-vm_start case: two mmaps of SQ_RING and CQ_RING resolve to the same ctx->ring_region pointer. With page refs taken per mmap, the second mmap takes its own refs and the pages survive until both mmaps are closed. The nommu rb-tree BUG_ON on duplicate vm_start is a separate mm/nommu.c concern (it should share the existing region rather than BUG), but the page lifetime is now correct. Cc: Jens Axboe Reported-by: Anthropic Assisted-by: gkh_clanker_t1000 Signed-off-by: Greg Kroah-Hartman Link: https://patch.msgid.link/2026042115-body-attention-d15b@gregkh [axboe: get rid of region lookup, just iterate pages in vma] Signed-off-by: Jens Axboe

io_uring/poll: ensure EPOLL_ONESHOT is propagated for EPOLL_URING_WAKE

2026-04-22T01:18:34+00:00

Commit: aacf2f9f382c ("io_uring: fix req->apoll_events") fixed an issue where poll->events and req->apoll_events weren't synchronized, but then when the commit referenced in Fixes got added, it didn't ensure the same thing. If we mask in EPOLLONESHOT in the regular EPOLL_URING_WAKE path, then ensure it's done for both. Including a link to the original report below, even though it's mostly nonsense. But it includes a reproducer that does show that IORING_CQE_F_MORE is set in the previous CQE, while no more CQEs will be generated for this request. Just ignore anything that pretends this is security related in any way, it's just the typical AI nonsense. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/io-uring/CAM0zi7yQzF3eKncgHo4iVM5yFLAjsiob_ucqyWKs=hyd_GqiMg@mail.gmail.com/ Reported-by: Azizcan Daştan Fixes: 4464853277d0 ("io_uring: pass in EPOLL_URING_WAKE for eventfd signaling and wakeups") Signed-off-by: Jens Axboe

io_uring/zcrx: warn on freelist violations

2026-04-21T18:19:11+00:00

The freelist is appropriately sized to always be able to take a free niov, but let's be more defensive and check the invariant with a warning. That should help to catch any double-free issues. Suggested-by: Kai Aizen Signed-off-by: Pavel Begunkov Link: https://patch.msgid.link/2f3cea363b04649755e3b6bb9ab66485a95936d5.1776760901.git.asml.silence@gmail.com Signed-off-by: Jens Axboe

io_uring/zcrx: clear RQ headers on init

2026-04-21T18:19:11+00:00

It might be unexpected to users if the RQ head/tail after a ring creation are not zeroed, fix that. Cc: stable@vger.kernel.org Fixes: 6f377873cb239 ("io_uring/zcrx: add interface queue and refill queue") Signed-off-by: Pavel Begunkov Link: https://patch.msgid.link/331f94663c3e8f021ffa3cb770ca2844a07d4855.1776760911.git.asml.silence@gmail.com Signed-off-by: Jens Axboe

io_uring/zcrx: fix user_struct uaf

2026-04-21T18:19:11+00:00

io_free_rbuf_ring() usees a struct user_struct, which io_zcrx_ifq_free() puts it down before destroying the ring. Cc: stable@vger.kernel.org Fixes: 5c686456a4e83 ("io_uring/zcrx: add user_struct and mm_struct to io_zcrx_ifq") Signed-off-by: Pavel Begunkov Link: https://patch.msgid.link/e560ae00960d27a810522a7efc0e201c82dff351.1776760917.git.asml.silence@gmail.com Signed-off-by: Jens Axboe

io_uring/register: fix ring resizing with mixed/large SQEs/CQEs

2026-04-21T18:19:08+00:00

The ring resizing only properly handles "normal" sized SQEs or CQEs, if there are pending entries around a resize. This normally should not be the case, but the code is supposed to handle this regardless. For the mixed SQE/CQE cases, the current copying works fine as they are indexed in the same way. Each half is just copied separately. But for fixed large SQEs and CQEs, the iteration and copy need to take that into account. Cc: stable@kernel.org Fixes: 79cfe9e59c2a ("io_uring/register: add IORING_REGISTER_RESIZE_RINGS") Reviewed-by: Gabriel Krisman Bertazi Signed-off-by: Jens Axboe

io_uring/futex: ensure partial wakes are appropriately dequeued

2026-04-21T18:19:06+00:00

If a FUTEX_WAITV vectored operation is only partially woken, we should call __futex_wake_mark() on the queue to account for that. If not, then a later wakeup will wake the same entry, rather than the next one in line. Fixes: 8f350194d5cfd ("io_uring: add support for vectored futex waits") Reviewed-by: Gabriel Krisman Bertazi Signed-off-by: Jens Axboe

io_uring/rw: add defensive hardening for negative kbuf lengths

2026-04-21T18:19:03+00:00

No real bug here, just being a bit defensive in ensuring that whatever gets passed into io_put_kbuf() is always >= 0 and not some random error value. Reviewed-by: Gabriel Krisman Bertazi Signed-off-by: Jens Axboe

io_uring/rsrc: use kvfree() for the imu cache

2026-04-21T18:19:01+00:00

Currently anything that requires kvmalloc_flex() for allocations will not get re-cached, and hence the cache freeing path is correct in that it always uses kfree() to free the allocated memory. But this seems a bit fragile as it's something that could get mix should that situation change, so switch io_free_imu() and io_alloc_cache_free() to use kvfree as the desctructor. Signed-off-by: Jens Axboe

io_uring/rsrc: unify nospec indexing for direct descriptors

2026-04-21T18:18:54+00:00

For file updates, the node reset isn't capping the value via array_index_nospec() like the other paths do. Ensure it's all sane and have the update path do the proper capping as well. Reviewed-by: Gabriel Krisman Bertazi Signed-off-by: Jens Axboe