<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/eventpoll.c, branch v7.2-rc1</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.2-rc1</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.2-rc1'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-06-04T11:53:50+00:00</updated>
<entry>
<title>eventpoll: restore EP_UNACTIVE_PTR sentinel for ctx-&gt;tfile_check_list</title>
<updated>2026-06-04T11:53:50+00:00</updated>
<author>
<name>Zhan Wei</name>
<email>zhanwei919@gmail.com</email>
</author>
<published>2026-05-29T14:25:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a1e9718b406bc4e6d0c63c7b999d06febbdc4091'/>
<id>urn:sha1:a1e9718b406bc4e6d0c63c7b999d06febbdc4091</id>
<content type='text'>
Commit e09c77d94003 ("eventpoll: hoist CTL_ADD scratch state into
struct ep_ctl_ctx") moved tfile_check_list from a file-scope global into
the stack-allocated struct ep_ctl_ctx, and in doing so replaced the
EP_UNACTIVE_PTR sentinel with NULL on the grounds that "NULL is the
obvious 'empty' value and zero-init handles it for free", describing the
change as "No functional change". It is not.

epitems_head-&gt;next is overloaded with two roles:

  1. the "next" pointer that threads a head onto ctx-&gt;tfile_check_list;
  2. a membership flag: ep_remove_file() uses
     !smp_load_acquire(&amp;v-&gt;next) to mean "this head is not on any
     pending ctx-&gt;tfile_check_list and is therefore safe to free".

Before that change the EP_UNACTIVE_PTR sentinel kept the two roles
disjoint: a head on the list always had a non-NULL -&gt;next (another head,
or the sentinel at the tail), so -&gt;next == NULL was equivalent to "never
listed". With the sentinel gone the list is NULL-terminated, so the tail
head's -&gt;next is NULL as well. ep_remove_file()'s gate can no longer
distinguish "never listed" from "listed at the tail", and misfires on
the tail head.

The reader (reverse_path_check_proc) holds epnested_mutex +
rcu_read_lock; the freer (ep_remove_file) holds ep-&gt;mtx + file-&gt;f_lock.
The two sides share no mutex -- the sentinel was the invariant the gate
relied on to know it could skip the read side. With it gone,
ep_remove_file() frees the tail head while reverse_path_check_proc() is
still walking it, producing the slab-use-after-free read. The syzbot
reproducer hits this within seconds on a multi-CPU VM.

Restore the sentinel: initialize ctx.tfile_check_list to EP_UNACTIVE_PTR
in do_epoll_ctl_file(), and terminate the walk on "!= EP_UNACTIVE_PTR"
in reverse_path_check() and clear_tfile_check_list(). The tail head's
-&gt;next becomes the sentinel again rather than NULL, so
ep_remove_file()'s gate regains its exclusivity and stops misfiring on
the tail. ep_remove_file() itself is unchanged.

This restores the invariant the file-scope tfile_check_list relied on
before that change while preserving the ctx packaging it introduced.
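
The overloading described above can be reproduced in a small userspace
model. The sketch below is not fs/eventpoll.c: struct head, struct
ctl_ctx, TAIL_SENTINEL and all three helpers are hypothetical names,
and the sentinel value is only an assumption chosen to be non-null and
never dereferenced. It shows that the "next is null" gate is exclusive
only when the list is terminated by something other than null.

```c
/* Toy userspace model; all names are hypothetical, not kernel code. */

struct head {
	struct head *next;	/* null: never listed; anything else: listed */
};

/* Assumed marker value: non-null, compared against, never dereferenced. */
#define TAIL_SENTINEL ((struct head *)-1L)

struct ctl_ctx {
	struct head *check_list;	/* initial value doubles as terminator */
};

/* Thread h onto the front of the pending list. */
static void list_head_push(struct ctl_ctx *ctx, struct head *h)
{
	h->next = ctx->check_list;
	ctx->check_list = h;
}

/* The freer's gate: a head may be freed only if it was never listed. */
static int head_may_be_freed(const struct head *h)
{
	return !h->next;
}

/* Walk the list up to the given terminator and count the heads on it. */
static int list_length(const struct ctl_ctx *ctx, const struct head *end)
{
	const struct head *h;
	int n = 0;

	for (h = ctx->check_list; h != end; h = h->next)
		n++;
	return n;
}
```

With check_list initialized to TAIL_SENTINEL, every pushed head has a
non-null next pointer and the gate rejects all of them. With check_list
initialized to null, the first head pushed keeps a null next pointer
while it sits at the tail, so the gate wrongly reports it as free-able.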

Reported-by: syzbot+e70e1b6cba8714543f7c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e70e1b6cba8714543f7c
Fixes: e09c77d94003 ("eventpoll: hoist CTL_ADD scratch state into struct ep_ctl_ctx")
Suggested-by: Christian Brauner &lt;brauner@kernel.org&gt;
Link: https://lore.kernel.org/all/20260528-rotwild-summt-kuhhandel-7276ef4c33b7@brauner.io/
Signed-off-by: Zhan Wei &lt;zhanwei919@gmail.com&gt;
Link: https://patch.msgid.link/20260529142533.23696-1-zhanwei919@gmail.com
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>eventpoll: Fix epoll_wait() report false negative</title>
<updated>2026-06-04T08:25:10+00:00</updated>
<author>
<name>Nam Cao</name>
<email>namcao@linutronix.de</email>
</author>
<published>2026-06-02T17:51:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0c4aefe3c2d0f272a2ad73699a12d4446ffdbe7b'/>
<id>urn:sha1:0c4aefe3c2d0f272a2ad73699a12d4446ffdbe7b</id>
<content type='text'>
ep_events_available() checks for available events by looking at
ep-&gt;rdllist and ep_is_scanning(). However, this is done without a lock
and can report a false negative if ep_start_scan() or ep_done_scan() is
executed concurrently by another task. For example:
_________________________________________________________________________
                                   |ep_start_scan()
                                   |  list_splice_init(&amp;ep-&gt;rdllist, ...)
ep_events_available()              |
  !list_empty_careful(&amp;ep-&gt;rdllist)|
  || ep_is_scanning(ep)            |
                                   |  ep_enter_scan(ep)
___________________________________|_____________________________________

Another example:
_________________________________________________________________________
ep_events_available()              |
                                   |ep_start_scan()
                                   |  list_splice_init(&amp;ep-&gt;rdllist, ...)
                                   |  ep_enter_scan(ep)
  !list_empty_careful(&amp;ep-&gt;rdllist)|
                                   |ep_done_scan()
                                   |  ep_exit_scan(ep)
                                   |  list_splice(..., &amp;ep-&gt;rdllist)
  || ep_is_scanning(ep)            |
___________________________________|_____________________________________

In the above examples, ep_events_available() sees no events despite
events being available. If epoll_wait() is called with timeout=0, it
will wrongly return "no event" to the user.

Introduce a sequence lock to resolve this issue.
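
The retry idea behind a sequence lock can be sketched in a
single-threaded toy model that replays the second interleaving above.
This is an illustration under stated assumptions, not the patch itself:
struct toy_ep and every function below are hypothetical names, the
kernel's own seqlock helpers are not used, and the memory barriers a
real implementation needs are omitted. The "between" callback stands in
for another task running in the middle of the reader.

```c
/* Toy single-threaded model; hypothetical names, no memory barriers. */

struct toy_ep {
	unsigned int seq;	/* odd while a writer is mid-update */
	int rdllist_len;	/* stand-in for "rdllist is non-empty" */
	int scanning;		/* stand-in for ep_is_scanning() */
};

typedef void (*interleave_fn)(struct toy_ep *ep);

static void write_begin(struct toy_ep *ep) { ep->seq++; }
static void write_end(struct toy_ep *ep)   { ep->seq++; }

/* Simulated ep_done_scan(): leave scan mode, splice events back. */
static void toy_done_scan(struct toy_ep *ep)
{
	write_begin(ep);
	ep->scanning = 0;
	ep->rdllist_len = 1;
	write_end(ep);
}

/* Unprotected check: two reads with a window between them. */
static int events_available_racy(struct toy_ep *ep, interleave_fn between)
{
	int have = ep->rdllist_len != 0;

	if (between)
		between(ep);	/* the other task runs here */
	return have || ep->scanning;
}

/* Retry if a writer was active at the start or the counter moved. */
static int read_retry(const struct toy_ep *ep, unsigned int start)
{
	return (start % 2) || ep->seq != start;
}

/* Protected check: repeat the reads until they saw a stable state. */
static int events_available(struct toy_ep *ep, interleave_fn between)
{
	unsigned int start;
	int avail;

	do {
		start = ep->seq;
		avail = events_available_racy(ep, between);
		between = 0;	/* the simulated writer runs only once */
	} while (read_retry(ep, start));
	return avail;
}
```

Starting from a state where a scan is in progress (empty list, scanning
set), the racy check misses the events that toy_done_scan() splices
back, while the protected check notices the counter change, re-reads,
and reports them.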

Timing 10 million loop iterations of epoll_wait() shows the following
performance drop:

   timeout  #event  before    after    diff
     0ms      0     3727ms   3974ms   +6.6%
     0ms      1     8099ms   9134ms    +13%
     1ms      1    13525ms  13586ms  +0.45%

Considering the typical use of epoll_wait() (wait for events, do
something with the events, repeat), it should contribute only a small
portion of the user's CPU consumption. Therefore this performance drop
is not alarming.

Fixes: c5a282e9635e ("fs/epoll: reduce the scope of wq lock in epoll_wait()")
Suggested-by: Mateusz Guzik &lt;mjguzik@gmail.com&gt;
Signed-off-by: Nam Cao &lt;namcao@linutronix.de&gt;
Link: https://patch.msgid.link/4363cd8e34a21d4f0d257be1b33e84dc25030fdf.1780422138.git.namcao@linutronix.de
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>eventpoll: add missing kernel-doc for @ctx function parameters</title>
<updated>2026-05-22T12:32:06+00:00</updated>
<author>
<name>Randy Dunlap</name>
<email>rdunlap@infradead.org</email>
</author>
<published>2026-05-19T04:23:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=70a03a385de2b8f0fa54dbc70bdc3ed176853d1c'/>
<id>urn:sha1:70a03a385de2b8f0fa54dbc70bdc3ed176853d1c</id>
<content type='text'>
Add the missing kernel-doc comments to prevent kernel-doc build
warnings while building the documentation.

WARNING: fs/eventpoll.c:1684 function parameter 'ctx' not described in 'reverse_path_check'
WARNING: fs/eventpoll.c:2349 function parameter 'ctx' not described in 'ep_loop_check_proc'

Fixes: e09c77d94003 ("eventpoll: hoist CTL_ADD scratch state into struct ep_ctl_ctx")
Signed-off-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Link: https://patch.msgid.link/20260519042314.124041-1-rdunlap@infradead.org
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge patch series "io_uring related epoll cleanups"</title>
<updated>2026-05-15T15:41:05+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2026-05-15T15:41:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6ece1a31c58c8c8293ecbbe79d7f92d52e1b0022'/>
<id>urn:sha1:6ece1a31c58c8c8293ecbbe79d7f92d52e1b0022</id>
<content type='text'>
Jens Axboe &lt;axboe@kernel.dk&gt; says:

One of the nastier things about epoll is how it allows nesting contexts
inside each other, leading to the necessity of loop detection and the
issues that have come with that.

I don't believe there's any reason to support nesting on the io_uring
side, in fact IORING_OP_EPOLL_CTL is a historical mistake, imho. But
let's at least try and contain the damage and disallow nested contexts
from our side.

Christian Brauner &lt;brauner@kernel.org&gt; says:

Bring in the eventpoll specific io_uring changes together with the
eventpoll cleanup I did this cycle. The io_uring changes can go on top
of both through the block tree.

* patches from https://patch.msgid.link/20260514140817.623026-1-axboe@kernel.dk:
  eventpoll: rename struct epoll_filefd to epoll_key
  eventpoll: add file based control interface
  eventpoll: export is_file_epoll()
  eventpoll: pass struct epoll_filefd through ep_find() and ep_insert()

Link: https://patch.msgid.link/20260514140817.623026-1-axboe@kernel.dk
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>eventpoll: rename struct epoll_filefd to epoll_key</title>
<updated>2026-05-15T15:34:26+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-05-14T14:07:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=463aba8090738bcd956297f7341b6874e76ae664'/>
<id>urn:sha1:463aba8090738bcd956297f7341b6874e76ae664</id>
<content type='text'>
This more accurately describes what purpose this structure serves, as
a lookup key.

Suggested-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Link: https://patch.msgid.link/20260514140817.623026-5-axboe@kernel.dk
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>eventpoll: add file based control interface</title>
<updated>2026-05-15T15:34:26+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-05-14T14:07:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0995baf261ce8fc141768911d97ba76e854e26cf'/>
<id>urn:sha1:0995baf261ce8fc141768911d97ba76e854e26cf</id>
<content type='text'>
Add do_epoll_ctl_file(), which takes a pre-resolved epoll file and a
struct epoll_filefd for the target rather than two integer file
descriptors. do_epoll_ctl() remains as a thin wrapper.

In preparation for using the file based interface from io_uring.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Link: https://patch.msgid.link/20260514140817.623026-4-axboe@kernel.dk
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>eventpoll: export is_file_epoll()</title>
<updated>2026-05-15T15:34:26+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-05-14T14:07:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5de2759f2b7c925f187e552cae47775acd5f4b40'/>
<id>urn:sha1:5de2759f2b7c925f187e552cae47775acd5f4b40</id>
<content type='text'>
Make is_file_epoll() available outside of epoll. This is in preparation
for using it from io_uring.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Link: https://patch.msgid.link/20260514140817.623026-3-axboe@kernel.dk
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>eventpoll: pass struct epoll_filefd through ep_find() and ep_insert()</title>
<updated>2026-05-15T15:34:26+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-05-14T14:07:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9caef5b2a1fc22c2e24159a3d0fd74a47aed94ab'/>
<id>urn:sha1:9caef5b2a1fc22c2e24159a3d0fd74a47aed94ab</id>
<content type='text'>
Have ep_find() and ep_insert() take a struct epoll_filefd rather
than a file/fd tuple. Kill off ep_set_ffd() as it's now no longer
needed.

No functional change. This is a prep patch for adding a file based
do_epoll_ctl() variant.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Link: https://patch.msgid.link/20260514140817.623026-2-axboe@kernel.dk
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>eventpoll: hoist CTL_ADD scratch state into struct ep_ctl_ctx</title>
<updated>2026-04-28T15:27:29+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2026-04-24T13:46:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e09c77d940034e3f0c4ae1c18cc655ea9ec2cc41'/>
<id>urn:sha1:e09c77d940034e3f0c4ae1c18cc655ea9ec2cc41</id>
<content type='text'>
Three globals were shared between the loop check and the path check
paths: tfile_check_list (chain of epitems_head to walk afterwards),
path_count[] (per-depth wakeup-path tally) and inserting_into
(cycle-detection sentinel). All three are scratch state used only
during a single EPOLL_CTL_ADD full_check, yet they sit at file
scope and rely on epnested_mutex for exclusion.

The area has had three bugs in the last year -- CVE-2025-38349,
f2e467a48287 ("eventpoll: Fix semi-unbounded recursion"), and
fdcfce93073d ("eventpoll: Fix integer overflow in
ep_loop_check_proc()") -- all rooted in the shared-mutable-global
pattern being hard to reason about.

Collect the three into a stack-allocated struct ep_ctl_ctx:

    struct ep_ctl_ctx {
        struct eventpoll   *inserting_into;
        struct epitems_head *tfile_check_list;
        int                 path_count[PATH_ARR_SIZE];
    };

do_epoll_ctl() zero-initializes one on its stack and plumbs it
through ep_ctl_lock() / ep_ctl_unlock() / ep_insert() /
ep_register_epitem() / list_file() / ep_loop_check() /
ep_loop_check_proc() / reverse_path_check() /
reverse_path_check_proc() / path_count_inc() / path_count_init() /
clear_tfile_check_list(). Non-nested inserts leave the ctx zeroed
and skip the machinery entirely.
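
As a rough illustration of the pattern, scratch state passed by pointer
instead of living at file scope can look like the sketch below. Only
the struct layout is taken from this message; the PATH_ARR_SIZE value,
the limit parameter and both toy_ helpers are assumptions made for the
example and do not mirror the real path_count_inc().

```c
/* Toy sketch; only the struct layout comes from the commit message. */

#define PATH_ARR_SIZE 5		/* assumed value, for illustration only */

struct eventpoll;		/* opaque in this sketch */
struct epitems_head;		/* opaque in this sketch */

struct ep_ctl_ctx {
	struct eventpoll    *inserting_into;
	struct epitems_head *tfile_check_list;
	int                  path_count[PATH_ARR_SIZE];
};

/* Hypothetical helper: tally one wakeup path at the given depth. */
static int toy_path_count_inc(struct ep_ctl_ctx *ctx, unsigned int depth,
			      int limit)
{
	if (depth >= PATH_ARR_SIZE)
		return -1;
	if (++ctx->path_count[depth] > limit)
		return -1;
	return 0;
}

/* Hypothetical check: did this call ever touch its scratch state? */
static int toy_ctx_is_untouched(const struct ep_ctl_ctx *ctx)
{
	unsigned int i;

	if (ctx->inserting_into || ctx->tfile_check_list)
		return 0;
	for (i = 0; i != PATH_ARR_SIZE; i++)
		if (ctx->path_count[i])
			return 0;
	return 1;
}
```

A caller would declare a zero-initialized struct ep_ctl_ctx on its own
stack and pass its address down the call chain, so two calls can never
observe each other's tallies.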

With the scratch state in ctx:
  - tfile_check_list no longer has an EP_UNACTIVE_PTR sentinel --
    NULL is the obvious "empty" value and the zero-init handles it
    for free;
  - path_count[] is no longer a global array that could be touched
    in unexpected orderings;
  - inserting_into is scoped to the exact call that set it.

loop_check_gen stays as a file-scope monotonic counter, because the
stamp left on ep-&gt;gen by a completed walk must not equal the stamp
of a future walk -- something a stack-local value cannot guarantee
across calls. It remains protected by epnested_mutex for the bump
and is read locklessly for the "do we need a full check" trigger in
ep_ctl_lock().

Every bail-out that existed before (the ELOOP on cycle, the path
limit check, the unbounded-recursion cap, the +1 overflow guard) is
preserved verbatim; only the data they operate on moved from file
scope to the stack ctx.

No functional change.

Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-17-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>eventpoll: use bool for predicate helpers</title>
<updated>2026-04-28T15:27:29+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2026-04-24T13:46:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=23431ae93a21d485f66a177304de353113dcdd28'/>
<id>urn:sha1:23431ae93a21d485f66a177304de353113dcdd28</id>
<content type='text'>
Three inline predicates -- is_file_epoll(), ep_is_linked(),
ep_events_available() -- were declared to return int even though
their only use is as a truthy test and their bodies are already
boolean expressions. ep_has_wakeup_source(), in the same file,
returns bool, so the convention was already inconsistent.

Convert all three to return bool. Rewrite ep_events_available()'s
verbose kerneldoc to the same one-line style the rest of the
predicates use now.

ep_poll()'s local eavail variable stores the result of
ep_events_available() (already boolean), ep_busy_loop() (returns
bool), and list_empty() (int but tested as boolean). Split it out
of the combined int declaration and give it bool type; replace the
one "eavail = 1" after a wakeup with "eavail = true" to match.

No functional change.

Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-16-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
</feed>
