summaryrefslogtreecommitdiff
path: root/io_uring
AgeCommit message (Collapse)AuthorFilesLines
2025-03-07io_uring: introduce struct iou_vecPavel Begunkov4-12/+34
I need a convenient way to pass around and work with iovec+size pair, put them into a structure and makes use of it in rw.c Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d39fadafc9e9047b0a292e5be6db3cf2f48bb1f7.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07Merge branch 'for-6.15/io_uring-epoll-wait' into for-6.15/io_uring-reg-vecJens Axboe4-6/+54
* for-6.15/io_uring-epoll-wait: io_uring/epoll: add support for IORING_OP_EPOLL_WAIT io_uring/epoll: remove CONFIG_EPOLL guards eventpoll: add epoll_sendevents() helper eventpoll: abstract out ep_try_send_events() helper eventpoll: abstract out parameter sanity checking
2025-03-07Merge branch 'for-6.15/io_uring-rx-zc' into for-6.15/io_uring-reg-vecJens Axboe13-1/+1173
* for-6.15/io_uring-rx-zc: (80 commits) io_uring/zcrx: add selftest case for recvzc with read limit io_uring/zcrx: add a read limit to recvzc requests io_uring: add missing IORING_MAP_OFF_ZCRX_REGION in io_uring_mmap io_uring: Rename KConfig to Kconfig io_uring/zcrx: fix leaks on failed registration io_uring/zcrx: recheck ifq on shutdown io_uring/zcrx: add selftest net: add documentation for io_uring zcrx io_uring/zcrx: add copy fallback io_uring/zcrx: throttle receive requests io_uring/zcrx: set pp memory provider for an rx queue io_uring/zcrx: add io_recvzc request io_uring/zcrx: dma-map area for the device io_uring/zcrx: implement zerocopy receive pp memory provider io_uring/zcrx: grab a net device io_uring/zcrx: add io_zcrx_area io_uring/zcrx: add interface queue and refill queue net: add helpers for setting a memory provider on an rx queue net: page_pool: add memory provider helpers net: prepare for non devmem TCP memory providers ...
2025-03-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-1/+3
Cross-merge networking fixes after downstream PR (net-6.14-rc6). Conflicts: net/ethtool/cabletest.c 2bcf4772e45a ("net: ethtool: try to protect all callback with netdev instance lock") 637399bf7e77 ("net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device") No Adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-06io_uring/rw: ensure reissue path is correctly handled for IOPOLLJens Axboe1-4/+3
The IOPOLL path posts CQEs when the io_kiocb is marked as completed, so it cannot rely on the usual retry that non-IOPOLL requests do for read/write requests. If -EAGAIN is received and the request should be retried, go through the normal completion path and let the normal flush logic catch it and reissue it, like what is done for !IOPOLL reads or writes. Fixes: d803d123948f ("io_uring/rw: handle -EAGAIN retry at IO completion time") Reported-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/io-uring/2b43ccfa-644d-4a09-8f8f-39ad71810f41@oracle.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-05io_uring: introduce io_cache_free() helperCaleb Sander Mateos4-15/+13
Add a helper function io_cache_free() that returns an allocation to a io_alloc_cache, falling back on kfree() if the io_alloc_cache is full. This is the inverse of io_cache_alloc(), which takes an allocation from an io_alloc_cache and falls back on kmalloc() if the cache is empty. Convert 4 callers to use the helper. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Suggested-by: Li Zetao <lizetao1@huawei.com> Link: https://lore.kernel.org/r/20250304194814.2346705-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04io_uring/rsrc: skip NULL file/buffer checks in io_free_rsrc_node()Caleb Sander Mateos1-4/+2
io_rsrc_node's of type IORING_RSRC_FILE always have a file attached immediately after they are allocated. IORING_RSRC_BUFFER nodes won't be returned from io_sqe_buffer_register()/io_buffer_register_bvec() until they have a io_mapped_ubuf attached. So remove the checks for a NULL file/buffer in io_free_rsrc_node(). Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228235916.670437-5-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04io_uring/rsrc: avoid NULL node check on io_sqe_buffer_register() failureCaleb Sander Mateos1-2/+1
The done: label is only reachable if node is non-NULL. So don't bother checking, just call io_free_node(). Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228235916.670437-4-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04io_uring/rsrc: call io_free_node() on io_sqe_buffer_register() failureCaleb Sander Mateos1-2/+1
io_sqe_buffer_register() currently calls io_put_rsrc_node() if it fails to fully set up the io_rsrc_node. io_put_rsrc_node() is more involved than necessary, since we already know the reference count will reach 0 and no io_mapped_ubuf has been attached to the node yet. So just call io_free_node() to release the node's memory. This also avoids the need to temporarily set the node's buf pointer to NULL. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228235916.670437-3-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04io_uring/rsrc: free io_rsrc_node using kfree()Caleb Sander Mateos1-1/+1
io_rsrc_node_alloc() calls io_cache_alloc(), which uses kmalloc() to allocate the node. So it can be freed with kfree() instead of kvfree(). Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228235916.670437-2-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04io_uring/rsrc: split out io_free_node() helperCaleb Sander Mateos1-2/+7
Split the freeing of the io_rsrc_node from io_free_rsrc_node(), for use with nodes that haven't been fully initialized. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228235916.670437-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04io_uring/rsrc: include io_uring_types.h in rsrc.hCaleb Sander Mateos1-0/+1
io_uring/rsrc.h uses several types from include/linux/io_uring_types.h. Include io_uring_types.h explicitly in rsrc.h to avoid depending on users of rsrc.h including io_uring_types.h first. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Li Zetao <lizetao1@huawei.com> Link: https://lore.kernel.org/r/20250301183612.937529-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-01io_uring/nop: use io_find_buf_node()Caleb Sander Mateos1-11/+2
Call io_find_buf_node() to avoid duplicating it in io_nop(). Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250301001610.678223-2-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-01io_uring/rsrc: declare io_find_buf_node() in header fileCaleb Sander Mateos2-2/+4
Declare io_find_buf_node() in io_uring/rsrc.h so it can be called from other files. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250301001610.678223-1-csander@purestorage.com [axboe: keep the inline for local hot path usage] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-01io_uring/ublk: report error when unregister operation failsCaleb Sander Mateos1-4/+14
Indicate to userspace applications if a UBLK_IO_UNREGISTER_IO_BUF command specifies an invalid buffer index by returning an error code. Return -EINVAL if no buffer is registered with the given index, and -EBUSY if the registered buffer is not a kernel bvec. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228231432.642417-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-01io_uring/uring_cmd: specify io_uring_cmd_import_fixed() pointer typeCaleb Sander Mateos1-1/+2
io_uring_cmd_import_fixed() takes a struct io_uring_cmd *, but the type of the ioucmd parameter is void *. Make the pointer type explicit so the compiler can type check it. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228221514.604350-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-01io_uring/rsrc: use rq_data_dir() to compute bvec dirCaleb Sander Mateos1-5/+1
The macro rq_data_dir() already computes a request's data direction. Use it in place of the if-else to set imu->dir. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228223057.615284-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28Merge tag 'io_uring-6.14-20250228' of git://git.kernel.dk/linuxLinus Torvalds1-1/+3
Pull io_uring fix from Jens Axboe: "Just a single fix headed for stable, ensuring that msg_control is properly saved in compat mode as well" * tag 'io_uring-6.14-20250228' of git://git.kernel.dk/linux: io_uring/net: save msg_control for compat
2025-02-28io_uring: cache nodes and mapped buffersKeith Busch4-16/+63
Frequent alloc/free cycles on these is pretty costly. Use an io cache to more efficiently reuse these buffers. Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250227223916.143006-7-kbusch@meta.com [axboe: fix imu leak] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28io_uring: add support for kernel registered bvecsKeith Busch4-7/+131
Provide an interface for the kernel to leverage the existing pre-registered buffers that io_uring provides. User space can reference these later to achieve zero-copy IO. User space must register an empty fixed buffer table with io_uring in order for the kernel to make use of it. Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250227223916.143006-5-kbusch@meta.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28io_uring/rw: move fixed buffer import to issue pathKeith Busch3-11/+34
Registered buffers may depend on a linked command, which makes the prep path too early to import. Move to the issue path when the node is actually needed like all the other users of fixed buffers. Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250227223916.143006-3-kbusch@meta.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28io_uring/rw: move buffer_select outside generic prepKeith Busch1-17/+28
Cleans up the generic rw prep to not require the do_import flag. Use a different prep function for callers that might need buffer select. Based-on-a-patch-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250227223916.143006-2-kbusch@meta.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski4-20/+41
Cross-merge networking fixes after downstream PR (net-6.14-rc5). Conflicts: drivers/net/ethernet/cadence/macb_main.c fa52f15c745c ("net: cadence: macb: Synchronize stats calculations") 75696dd0fd72 ("net: cadence: macb: Convert to get_stats64") https://lore.kernel.org/20250224125848.68ee63e5@canb.auug.org.au Adjacent changes: drivers/net/ethernet/intel/ice/ice_sriov.c 79990cf5e7ad ("ice: Fix deinitializing VF in error path") a203163274a4 ("ice: simplify VF MSI-X managing") net/ipv4/tcp.c 18912c520674 ("tcp: devmem: don't write truncated dmabuf CMSGs to userspace") 297d389e9e5b ("net: prefix devmem specific helpers") net/mptcp/subflow.c 8668860b0ad3 ("mptcp: reset when MPTCP opts are dropped after join") c3349a22c200 ("mptcp: consolidate subflow cleanup") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-27io_uring/net: fix build warning for !CONFIG_COMPATArnd Bergmann1-6/+0
A code rework resulted in an uninitialized return code when COMPAT mode is disabled: io_uring/net.c:722:6: error: variable 'ret' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized] 722 | if (io_is_compat(req->ctx)) { | ^~~~~~~~~~~~~~~~~~~~~~ io_uring/net.c:736:15: note: uninitialized use occurs here 736 | if (unlikely(ret)) | ^~~ Since io_is_compat() turns into a compile-time 'false', the #ifdef here is completely unnecessary, and removing it avoids the warning. Fixes: 51e158d40589 ("io_uring/net: unify *mshot_prep calls with compat") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20250227132018.1111094-1-arnd@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring: rearrange opdef flags by use patternPavel Begunkov1-6/+6
Keep all flags that we use in the generic req init path close together. That saves a load for x86 because apparently some compilers prefer reading single bytes. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ef03b6ce4a0c2a5234cd4037fa07e9e4902dcc9e.1740602793.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring/net: extract iovec import into a helperPavel Begunkov1-34/+28
Deduplicate iovec imports between compat and !compat by introducing a helper function. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6a5f8c526f6732c4249a7fa0213b49e1a3ecccf0.1740569495.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring/net: unify *mshot_prep calls with compatPavel Begunkov1-7/+7
Instead of duplicating a io_recvmsg_mshot_prep() call in the compat path, let the common code handle it. For that, copy necessary compat fields into struct user_msghdr. Note, it zeroes user_msghdr to be on the safe side as compat is not that interesting and overhead shouldn't be high. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/94e62386dec570f83b4a4270a46ac60bc415fb71.1740569495.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring/net: derive iovec storage laterPavel Begunkov1-22/+21
Don't read free_iov until right before we need it to import the iovec. The only place that uses it before that is provided buffer selection, but it only serves as temporary storage and iovec content is not reused afterwards, so use a local variable for that. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8bfa7d74c33e37860a724f4e0e96660c25cd4c02.1740569495.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring/net: verify msghdr before copying iovecPavel Begunkov1-25/+18
Normally, net/ would verify msghdr before importing iovec, for example see copy_msghdr_from_user(), which further assumed by __copy_msghdr() validating msg->msg_iovlen. io_uring does it in reverse order, which is fine, but it'll be more convenient for flip it so that the iovec business is done at the end and eventually can be nicely pulled out of msghdr parsing section and thought as a sepaarate step. That also makes structure accesses more localised, which should be better for caches. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/cd35dc1b48d4e6e31f59ae7304c037fbe8a3fd3d.1740569495.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring/net: isolate msghdr copying codePavel Begunkov1-20/+25
The user access section in io_msg_copy_hdr() is overextended by covering selected buffers. It's hard to work with and prone to errors. Limit the section to msghdr import only, selected buffers will do a separate copy_from_user() call, and then move it into its own function. This should be fine, selected buffer single shots are not important, for multishots the overhead should be non-existent, and it's not that expensive overall. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d3eb1f81c8cfbea9f1aa57dab90c472d2aa6e371.1740569495.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring/net: simplify compat selbuf iov parsingPavel Begunkov1-8/+4
Use copy_from_user() instead of open coded access_ok() + get_user(), that's simpler and we don't care about compat that much. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e51f9c323a3cd4ad7c8da656559bdf6237f052fb.1740569495.git.asml.silence@gmail.com [axboe: fold in bogus < 0 check for tmp_iov.iov_len] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring/net: remove unnecessary REQ_F_NEED_CLEANUPPavel Begunkov1-9/+2
REQ_F_NEED_CLEANUP in io_recvmsg_prep_setup() and in io_sendmsg_setup() are relics of the past and don't do anything useful, the flag should be and are set earlier on iovec and async_data allocation. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6aedc3141c1fc027128a4503656cfd686a6980ef.1740569495.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27Merge branch 'io_uring-6.14' into for-6.15/io_uringJens Axboe5-21/+44
Merge mainline fixes into 6.15 branch, as upcoming patches depend on fixes that went into the 6.14 mainline branch. * io_uring-6.14: io_uring/net: save msg_control for compat io_uring/rw: clean up mshot forced sync mode io_uring/rw: move ki_complete init into prep io_uring/rw: don't directly use ki_complete io_uring/rw: forbid multishot async reads io_uring/rsrc: remove unused constants io_uring: fix spelling error in uapi io_uring.h io_uring: prevent opcode speculation io-wq: backoff when retrying worker creation
2025-02-27io_uring: combine buffer lookup and importPavel Begunkov5-51/+42
Registered buffer are currently imported in two steps, first we lookup a rsrc node and then use it to set up the iterator. The first part is usually done at the prep stage, and import happens whenever it's needed. As we want to defer binding to a node so that it works with linked requests, combine both steps into a single helper. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250224213116.3509093-6-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring/nvme: pass issue_flags to io_uring_cmd_import_fixed()Pavel Begunkov1-1/+2
io_uring_cmd_import_fixed() will need to know the io_uring execution state in following commits, for now just pass issue_flags into it without actually using. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250224213116.3509093-5-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring/net: reuse req->buf_index for sendzcPavel Begunkov1-3/+2
There is already a field in io_kiocb that can store a registered buffer index, use that instead of stashing the value into struct io_sr_msg. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250224213116.3509093-4-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring/nop: reuse req->buf_indexKeith Busch1-5/+2
There is already a field in io_kiocb that can store a registered buffer index, use that instead of stashing the value into struct io_nop. Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250224213116.3509093-3-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring/rsrc: remove redundant check for valid imuKeith Busch1-11/+8
The only caller to io_buffer_unmap already checks if the node's buf is not null, so no need to check again. Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250224213116.3509093-2-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27io_uring/rw: open code io_prep_rw_setup()Pavel Begunkov1-16/+9
Open code io_prep_rw_setup() into its only caller, it doesn't provide any meaningful abstraction anymore. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/61ba72e2d46119db71f27ab908018e6a6cd6c064.1740425922.git.asml.silence@gmail.com [axboe: fold in 'ret' being unused fix] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-25io_uring/net: save msg_control for compatPavel Begunkov1-1/+3
Match the compat part of io_sendmsg_copy_hdr() with its counterpart and save msg_control. Fixes: c55978024d123 ("io_uring/net: move receive multishot out of the generic msghdr path") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2a8418821fe83d3b64350ad2b3c0303e9b732bbd.1740498502.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-25io_uring/rw: extract helper for iovec importPavel Begunkov1-26/+31
Split out a helper out of __io_import_rw_buffer() that handles vectored buffers. I'll need it for registered vectored buffers, but it also looks cleaner, especially with parameters being properly named. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/075470cfb24be38709d946815f35ec846d966f41.1740425922.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-25io_uring/rw: rename io_import_iovec()Pavel Begunkov1-7/+7
io_import_iovec() is not limited to iovecs but also imports buffers for normal reads and selected buffers, rename it for clarity. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/91cea59340b61a8f52dc7b8e720274577a25188c.1740425922.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-25io_uring/rw: allocate async data in io_prep_rw()Pavel Begunkov1-3/+3
rw always allocates async_data, so instead of doing that deeper in prep calls inside of io_prep_rw_setup(), be a bit more explicit and do that early on in io_prep_rw(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5ead621051bc3374d1e8d96f816454906a6afd71.1740425922.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-24io_uring/zcrx: add a read limit to recvzc requestsDavid Wei3-9/+24
Currently multishot recvzc requests have no read limit and will remain active so as long as the socket remains open. But, there are sometimes a need to do a fixed length read e.g. peeking at some data in the socket. Add a length limit to recvzc requests `len`. A value of 0 means no limit which is the previous behaviour. A positive value N specifies how many bytes to read from the socket. Data will still be posted in aux completions, as before. This could be split across multiple frags. But the primary recvzc request will now complete once N bytes have been read. The completion of the recvzc request will have res and cflags both set to 0. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20250224041319.2389785-2-dw@davidwei.uk [axboe: fixup io_zcrx_recv() for !CONFIG_NET] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-24io_uring: make io_poll_issue() sturdierPavel Begunkov1-7/+33
io_poll_issue() forwards the call to io_issue_sqe() and thus inherits some of the handling. That's not particularly failure resistant, as for example returning an innocently looking IOU_OK from a multishot issue will lead to severe bugs. Reimplement io_poll_issue() without io_issue_sqe()'s request completion logic. Remove extra checks as we know that req->file is already set, linked timeout are armed, and iopoll is not supported. Also cover it with warnings for now. The patch should be useful by itself, but it's also preparing the codebase for other future clean ups. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/3096d7b1026d9a52426a598bdfc8d9d324555545.1740331076.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-24io_uring/net: canonise accept mshot handlingPavel Begunkov1-9/+4
Use a more recognisable pattern for mshot accept, first try to post an mshot cqe if needed and after do terminating handling. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/daf5c0df7e2966deb0a115021c065fc6161a52d7.1740331076.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-24io_uring/net: fix accept multishot handlingPavel Begunkov1-0/+2
REQ_F_APOLL_MULTISHOT doesn't guarantee it's executed from the multishot context, so a multishot accept may get executed inline, fail io_req_post_cqe(), and ask the core code to kill the request with -ECANCELED by returning IOU_STOP_MULTISHOT even when a socket has been accepted and installed. Cc: stable@vger.kernel.org Fixes: 390ed29b5e425 ("io_uring: add IORING_ACCEPT_MULTISHOT for accept") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/51c6deb01feaa78b08565ca8f24843c017f5bc80.1740331076.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-24io_uring/net: use io_is_compat()Pavel Begunkov1-11/+8
Use io_is_compat() for consistency. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Link: https://lore.kernel.org/r/fff93d9d08243284c5db5d546be766a82e85c130.1740400452.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-24io_uring/waitid: use io_is_compat()Pavel Begunkov1-5/+1
Use io_is_compat() for consistency. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Link: https://lore.kernel.org/r/28c5b5f1f1bf7f4d18869dafe6e4147ce1bbf0f5.1740400452.git.asml.silence@gmail.com Link: https://lore.kernel.org/r/20250224172337.2009871-1-csander@purestorage.com [axboe: fold in improvement from Caleb, see link] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-24io_uring/rw: shrink io_iov_compat_buffer_select_prepPavel Begunkov1-10/+4
Compat performance is not important and simplicity is more appreciated. Let's not be smart about it and use simpler copy_from_user() instead of access + __get_user pair. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b334a3a5040efa424ded58e4d8a6ef2554324266.1740400452.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>