<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/io_uring/io-wq.c, branch linux-7.1.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-7.1.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-7.1.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-05-13T19:06:37+00:00</updated>
<entry>
<title>io-wq: check that the predecessor is hashed in io_wq_remove_pending()</title>
<updated>2026-05-13T19:06:37+00:00</updated>
<author>
<name>Nicholas Carlini</name>
<email>nicholas@carlini.com</email>
</author>
<published>2026-05-11T18:02:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d6a2d7b04b5a093021a7a0e2e69e9d5237dfa8cc'/>
<id>urn:sha1:d6a2d7b04b5a093021a7a0e2e69e9d5237dfa8cc</id>
<content type='text'>
io_wq_remove_pending() needs to fix up wq-&gt;hash_tail[] if the cancelled
work was the tail of its hash bucket. When doing this, it checks whether
the preceding entry in acct-&gt;work_list has the same hash value, but
never checks that the predecessor is hashed at all. io_get_work_hash()
is simply atomic_read(&amp;work-&gt;flags) &gt;&gt; IO_WQ_HASH_SHIFT, and the hash
bits are never set for non-hashed work, so it returns 0. Thus, when a
hashed bucket-0 work is cancelled while a non-hashed work is its list
predecessor, the check spuriously passes and a pointer to the non-hashed
io_kiocb is stored in wq-&gt;hash_tail[0].

Because non-hashed work is dequeued via the fast path in
io_get_next_work(), which never touches hash_tail[], the stale pointer
is never cleared. Therefore, after the non-hashed io_kiocb completes and
is freed back to req_cachep, wq-&gt;hash_tail[0] is a dangling pointer. The
io_wq is per-task (tctx-&gt;io_wq) and survives ring open/close, so the
dangling pointer persists for the lifetime of the task; the next hashed
bucket-0 enqueue dereferences it in io_wq_insert_work() and
wq_list_add_after() writes through freed memory.

Add the missing io_wq_is_hashed() check so a non-hashed predecessor
never inherits a hash_tail[] slot.

Cc: stable@vger.kernel.org
Fixes: 204361a77f40 ("io-wq: fix hang after cancelling pending hashed work")
Signed-off-by: Nicholas Carlini &lt;nicholas@carlini.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Convert 'alloc_obj' family to use the new default GFP_KERNEL argument</title>
<updated>2026-02-22T01:09:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T00:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43'/>
<id>urn:sha1:bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43</id>
<content type='text'>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>urn:sha1:69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-7.0/io_uring-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux</title>
<updated>2026-02-10T01:22:00+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-10T01:22:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f5d4feed174ce9fb3c42886a3c36038fd5a43e25'/>
<id>urn:sha1:f5d4feed174ce9fb3c42886a3c36038fd5a43e25</id>
<content type='text'>
Pull io_uring updates from Jens Axboe:

 - Clean up the IORING_SETUP_R_DISABLED and submitter task checking,
   mostly just in preparation for relaxing the locking for SINGLE_ISSUER
   in the future.

 - Improve IOPOLL by using a doubly linked list to manage completions.

   Previously it was singly listed, which meant that to complete request
   N in the chain 0..N-1 had to have completed first. With a doubly
   linked list we can complete whatever request completes in that order,
   rather than need to wait for a consecutive range to be available.
   This reduces latencies.

 - Improve the restriction setup and checking. Mostly in preparation for
   adding further features on top of that. Coming in a separate pull
   request.

 - Split out task_work and wait handling into separate files. These are
   mostly nicely abstracted already, but still remained in the
   io_uring.c file which is on the larger side.

 - Use GFP_KERNEL_ACCOUNT in a few more spots, where appropriate.

 - Ensure even the idle io-wq worker exits if a task no longer has any
   rings open.

 - Add support for a non-circular submission queue.

   By default, the SQ ring keeps moving around, even if only a few
   entries are used for each submission. This can be wasteful in terms
   of cachelines.

   If IORING_SETUP_SQ_REWIND is set for the ring when created, each
   submission will start at offset 0 instead of where we last left off
   doing submissions.

 - Various little cleanups

* tag 'for-7.0/io_uring-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (30 commits)
  io_uring/kbuf: fix memory leak if io_buffer_add_list fails
  io_uring: Add SPDX id lines to remaining source files
  io_uring: allow io-wq workers to exit when unused
  io_uring/io-wq: add exit-on-idle state
  io_uring/net: don't continue send bundle if poll was required for retry
  io_uring/rsrc: use GFP_KERNEL_ACCOUNT consistently
  io_uring/futex: use GFP_KERNEL_ACCOUNT for futex data allocation
  io_uring/io-wq: handle !sysctl_hung_task_timeout_secs
  io_uring: fix bad indentation for setup flags if statement
  io_uring/rsrc: take unsigned index in io_rsrc_node_lookup()
  io_uring: introduce non-circular SQ
  io_uring: split out CQ waiting code into wait.c
  io_uring: split out task work code into tw.c
  io_uring/io-wq: don't trigger hung task for syzbot craziness
  io_uring: add IO_URING_EXIT_WAIT_MAX definition
  io_uring/sync: validate passed in offset
  io_uring/eventfd: remove unused ctx-&gt;evfd_last_cq_tail member
  io_uring/timeout: annotate data race in io_flush_timeouts()
  io_uring/uring_cmd: explicitly disallow cancelations for IOPOLL
  io_uring: fix IOPOLL with passthrough I/O
  ...
</content>
</entry>
<entry>
<title>io_uring/io-wq: add exit-on-idle state</title>
<updated>2026-02-02T15:10:23+00:00</updated>
<author>
<name>Li Chen</name>
<email>me@linux.beauty</email>
</author>
<published>2026-02-02T14:37:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=38aa434ab9335ce2d178b7538cdf01d60b2014c3'/>
<id>urn:sha1:38aa434ab9335ce2d178b7538cdf01d60b2014c3</id>
<content type='text'>
io-wq uses an idle timeout to shrink the pool, but keeps the last worker
around indefinitely to avoid churn.

For tasks that used io_uring for file I/O and then stop using io_uring,
this can leave an iou-wrk-* thread behind even after all io_uring
instances are gone. This is unnecessary overhead and also gets in the
way of process checkpoint/restore.

Add an exit-on-idle state that makes all io-wq workers exit as soon as
they become idle, and provide io_wq_set_exit_on_idle() to toggle it.

Signed-off-by: Li Chen &lt;me@linux.beauty&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/io-wq: handle !sysctl_hung_task_timeout_secs</title>
<updated>2026-01-23T20:58:03+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-01-23T20:58:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=816095894c0f44aaba4372d92c874121af687596'/>
<id>urn:sha1:816095894c0f44aaba4372d92c874121af687596</id>
<content type='text'>
If the hung_task_timeout sysctl is set to 0, then we'll end up busy
looping inside io_wq_exit_workers() after an earlier commit switched to
using wait_for_completion_timeout(). Use the maximum schedule timeout
value for that case.

Fixes: 1f293098a313 ("io_uring/io-wq: don't trigger hung task for syzbot craziness")
Reported-by: Chris Mason &lt;clm@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/io-wq: don't trigger hung task for syzbot craziness</title>
<updated>2026-01-22T14:25:35+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-01-21T17:40:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1f293098a31368a5d5cfab215a56f4cbed3f657a'/>
<id>urn:sha1:1f293098a31368a5d5cfab215a56f4cbed3f657a</id>
<content type='text'>
Use the same trick that blk_io_schedule() does to avoid triggering the
hung task warning (and potential reboot/panic, depending on system
settings), and only wait for half the hung task timeout at the time.
If we exceed the default IO_URING_EXIT_WAIT_MAX period where we expect
things to certainly have finished unless there's a bug, then throw a
WARN_ON_ONCE() for that case.

Reported-by: syzbot+4eb282331cab6d5b6588@syzkaller.appspotmail.com
Tested-by: syzbot+4eb282331cab6d5b6588@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/io-wq: check IO_WQ_BIT_EXIT inside work run loop</title>
<updated>2026-01-20T14:55:59+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-01-20T14:42:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=10dc959398175736e495f71c771f8641e1ca1907'/>
<id>urn:sha1:10dc959398175736e495f71c771f8641e1ca1907</id>
<content type='text'>
Currently this is checked before running the pending work. Normally this
is quite fine, as work items either end up blocking (which will create a
new worker for other items), or they complete fairly quickly. But syzbot
reports an issue where io-wq takes seemingly forever to exit, and with a
bit of debugging, this turns out to be because it queues a bunch of big
(2GB - 4096b) reads with a /dev/msr* file. Since this file type doesn't
support -&gt;read_iter(), loop_rw_iter() ends up handling them. Each read
returns 16MB of data read, which takes 20 (!!) seconds. With a bunch of
these pending, processing the whole chain can take a long time. Easily
longer than the syzbot uninterruptible sleep timeout of 140 seconds.
This then triggers a complaint off the io-wq exit path:

INFO: task syz.4.135:6326 blocked for more than 143 seconds.
      Not tainted syzkaller #0
      Blocked by coredump.
"echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.4.135       state:D stack:26824 pid:6326  tgid:6324  ppid:5957   task_flags:0x400548 flags:0x00080000
Call Trace:
 &lt;TASK&gt;
 context_switch kernel/sched/core.c:5256 [inline]
 __schedule+0x1139/0x6150 kernel/sched/core.c:6863
 __schedule_loop kernel/sched/core.c:6945 [inline]
 schedule+0xe7/0x3a0 kernel/sched/core.c:6960
 schedule_timeout+0x257/0x290 kernel/time/sleep_timeout.c:75
 do_wait_for_common kernel/sched/completion.c:100 [inline]
 __wait_for_common+0x2fc/0x4e0 kernel/sched/completion.c:121
 io_wq_exit_workers io_uring/io-wq.c:1328 [inline]
 io_wq_put_and_exit+0x271/0x8a0 io_uring/io-wq.c:1356
 io_uring_clean_tctx+0x10d/0x190 io_uring/tctx.c:203
 io_uring_cancel_generic+0x69c/0x9a0 io_uring/cancel.c:651
 io_uring_files_cancel include/linux/io_uring.h:19 [inline]
 do_exit+0x2ce/0x2bd0 kernel/exit.c:911
 do_group_exit+0xd3/0x2a0 kernel/exit.c:1112
 get_signal+0x2671/0x26d0 kernel/signal.c:3034
 arch_do_signal_or_restart+0x8f/0x7e0 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:41 [inline]
 exit_to_user_mode_loop+0x8c/0x540 kernel/entry/common.c:75
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
 syscall_exit_to_user_mode_work include/linux/entry-common.h:159 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:194 [inline]
 do_syscall_64+0x4ee/0xf80 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fa02738f749
RSP: 002b:00007fa0281ae0e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00007fa0275e6098 RCX: 00007fa02738f749
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007fa0275e6098
RBP: 00007fa0275e6090 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fa0275e6128 R14: 00007fff14e4fcb0 R15: 00007fff14e4fd98

There's really nothing wrong here, outside of processing these reads
will take a LONG time. However, we can speed up the exit by checking the
IO_WQ_BIT_EXIT inside the io_worker_handle_work() loop, as syzbot will
exit the ring after queueing up all of these reads. Then once the first
item is processed, io-wq will simply cancel the rest. That should avoid
syzbot running into this complaint again.

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/68a2decc.050a0220.e29e5.0099.GAE@google.com/
Reported-by: syzbot+4eb282331cab6d5b6588@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/io-wq: remove io_wq_for_each_worker() return value</title>
<updated>2026-01-05T22:39:20+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-01-05T18:42:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e4fdbca2dc774366aca6532b57bfcdaae29aaf63'/>
<id>urn:sha1:e4fdbca2dc774366aca6532b57bfcdaae29aaf63</id>
<content type='text'>
The only use of this helper is to iterate all of the workers, and
hence all callers will pass in a func that always returns false to do
that. As none of the callers use the return value, get rid of it.

Reviewed-by: Gabriel Krisman Bertazi &lt;krisman@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/io-wq: fix incorrect io_wq_for_each_worker() termination logic</title>
<updated>2026-01-05T21:37:33+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-01-05T14:42:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e0392a10c9e80a3991855a81317da3039fcbe32c'/>
<id>urn:sha1:e0392a10c9e80a3991855a81317da3039fcbe32c</id>
<content type='text'>
A previous commit added this helper, and had it terminate if false is
returned from the handler. However, that is completely opposite, it
should abort the loop if true is returned.

Fix this up by having io_wq_for_each_worker() keep iterating as long
as false is returned, and only abort if true is returned.

Cc: stable@vger.kernel.org
Fixes: 751eedc4b4b7 ("io_uring/io-wq: move worker lists to struct io_wq_acct")
Reported-by: Lewis Campbell &lt;info@lewiscampbell.tech&gt;
Reviewed-by: Gabriel Krisman Bertazi &lt;krisman@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
