<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/select.c, branch v7.1-rc5</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.1-rc5</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.1-rc5'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-05-12T12:41:40+00:00</updated>
<entry>
<title>fs/select: reject negative timeval components in kern_select()</title>
<updated>2026-05-12T12:41:40+00:00</updated>
<author>
<name>Breno Leitao</name>
<email>leitao@debian.org</email>
</author>
<published>2026-04-29T13:09:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=859c199bb3a90ec49a678cc0846694b06703bdde'/>
<id>urn:sha1:859c199bb3a90ec49a678cc0846694b06703bdde</id>
<content type='text'>
kern_select() normalises the user-supplied struct __kernel_old_timeval
with

	tv.tv_sec + (tv.tv_usec / USEC_PER_SEC)
	(tv.tv_usec % USEC_PER_SEC) * NSEC_PER_USEC

before calling poll_select_set_timeout() -&gt; timespec64_valid().  Both
operands of the seconds sum are unbounded user-controlled signed long.
A crafted pair where tv_usec is a negative multiple of USEC_PER_SEC
drives the sum across the wrap boundary - e.g.

	{ .tv_sec = LONG_MIN, .tv_usec = -1000000 }

yields sec = LONG_MAX, nsec = 0, which passes timespec64_valid() and
then flows through timespec64_add_safe(), which saturates the absolute
deadline to TIME64_MAX (clamped further to KTIME_MAX downstream).
select(2) therefore blocks effectively forever instead of returning
-EINVAL as POSIX requires for a negative timeout.
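
The wrap can be checked with a small Python model of the C arithmetic.
This is only a sketch under stated assumptions: long is taken to be
64 bits wide, signed overflow is assumed to wrap in two's complement,
and the helper names wrap_s64, c_div, c_rem, normalise and
timespec64_valid_model are hypothetical, not kernel symbols.

```python
# Model of the seconds/nanoseconds normalisation shown above.
# Assumptions: 64-bit long, two's-complement wrap on signed overflow.

LONG_MIN = -(2 ** 63)
LONG_MAX = 2 ** 63 - 1
USEC_PER_SEC = 1_000_000
NSEC_PER_USEC = 1_000
NSEC_PER_SEC = 1_000_000_000


def wrap_s64(value):
    """Reduce an integer to the signed 64-bit range, as a wrapping add would."""
    return (value + 2 ** 63) % 2 ** 64 - 2 ** 63


def c_div(a, b):
    """C-style division: truncate toward zero (b is assumed positive)."""
    quotient = abs(a) // b
    return quotient if a == abs(a) else -quotient


def c_rem(a, b):
    """C-style remainder: takes the sign of the dividend."""
    return a - b * c_div(a, b)


def normalise(tv_sec, tv_usec):
    """Apply the two normalisation expressions to a timeval pair."""
    sec = wrap_s64(tv_sec + c_div(tv_usec, USEC_PER_SEC))
    nsec = c_rem(tv_usec, USEC_PER_SEC) * NSEC_PER_USEC
    return sec, nsec


def timespec64_valid_model(sec, nsec):
    """Accept only non-negative seconds and nanoseconds in [0, NSEC_PER_SEC)."""
    return sec == abs(sec) and nsec in range(NSEC_PER_SEC)


sec, nsec = normalise(LONG_MIN, -1_000_000)
# The crafted pair lands on LONG_MAX with nsec 0 and passes validation.
print(sec == LONG_MAX, nsec, timespec64_valid_model(sec, nsec))
```

A pair such as { 0, -1 } does not wrap and leaves nsec negative, so the
model's validity check rejects it; only negative multiples of
USEC_PER_SEC combined with a wrapping seconds value slip through.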

Only the legacy __NR_select syscall takes this path.  pselect6, ppoll,
poll and epoll_pwait2 all hand the user's two fields directly to
poll_select_set_timeout(), which validates *before* doing any
arithmetic:

	/* fs/select.c:271 -- the validator */
	int poll_select_set_timeout(struct timespec64 *to, time64_t sec, long nsec)
	{
		struct timespec64 ts = {.tv_sec = sec, .tv_nsec = nsec};
		if (!timespec64_valid(&amp;ts))
			return -EINVAL;
		...
	}

	/* include/linux/time64.h:97 -- timespec64_valid */
	if (ts-&gt;tv_sec &lt; 0)                              return false;
	if ((unsigned long)ts-&gt;tv_nsec &gt;= NSEC_PER_SEC)  return false;

	/* fs/select.c:744  do_pselect() (pselect6, pselect6_time32) */
	if (get_timespec64(&amp;ts, tsp)) return -EFAULT;
	if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec)) return -EINVAL;

	/* fs/select.c:1097 ppoll */
	if (get_timespec64(&amp;ts, tsp)) return -EFAULT;
	if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec)) return -EINVAL;

	/* fs/select.c:1065 poll -- timeout_msecs is int; &gt;= 0 gates the math */
	if (timeout_msecs &gt;= 0)
		poll_select_set_timeout(to, timeout_msecs / MSEC_PER_SEC,
		                        NSEC_PER_MSEC * (timeout_msecs % MSEC_PER_SEC));

	/* fs/eventpoll.c:2512 epoll_pwait2 */
	if (get_timespec64(&amp;ts, timeout)) return -EFAULT;
	if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec)) return -EINVAL;

In every one of these the wrap-prone arithmetic from kern_select()
simply does not exist; the user fields reach timespec64_valid()
unmodified.  glibc routes the C-library select() through pselect6,
so the bug is reachable only via a direct syscall(__NR_select, ...).

The pre-validation negative check that used to live here was lost
when the syscall was switched to the poll_select_set_timeout() helper.
Restore it: reject tv_sec &lt; 0 || tv_usec &lt; 0 up front, mirroring what
glibc does in userspace.  do_compat_select() has the same arithmetic
pattern but is only reachable on 32-bit compat and from a different
syscall entry; left for a follow-up so this change stays minimal.
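
The ordering the fix restores can be sketched as follows, again as a
Python model and not the actual patch: negative components are rejected
before any arithmetic runs. The names is_negative and
kern_select_timeout_model are hypothetical, EINVAL is taken from
Python's errno module, and overflow of very large positive values is
not modelled here.

```python
import errno

USEC_PER_SEC = 1_000_000
NSEC_PER_USEC = 1_000
LONG_MIN = -(2 ** 63)


def is_negative(value):
    """True for any value below zero."""
    return value != abs(value)


def kern_select_timeout_model(tv_sec, tv_usec):
    """Return (sec, nsec), or -EINVAL if either component is negative."""
    # Reject up front, so no negative value ever reaches the arithmetic.
    if is_negative(tv_sec) or is_negative(tv_usec):
        return -errno.EINVAL
    # Both operands are non-negative here, so // and % match C's / and %.
    sec = tv_sec + tv_usec // USEC_PER_SEC
    nsec = (tv_usec % USEC_PER_SEC) * NSEC_PER_USEC
    return sec, nsec


# The crafted pair from the commit message is now refused outright.
print(kern_select_timeout_model(LONG_MIN, -1_000_000) == -errno.EINVAL)
# An ordinary timeout is still normalised as before.
print(kern_select_timeout_model(1, 2_500_000))
```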

Reproducer (returns -1/EINVAL on a fixed kernel; blocks indefinitely
on an unfixed one):

	struct timeval tv = { .tv_sec = LONG_MIN, .tv_usec = -1000000 };
	fd_set r;
	int pfd[2];
	pipe(pfd);
	FD_ZERO(&amp;r);
	FD_SET(pfd[0], &amp;r);
	syscall(__NR_select, pfd[0] + 1, &amp;r, NULL, NULL, &amp;tv);

Fixes: 4d36a9e65d49 ("select: deal with math overflow from borderline valid userland data")
Signed-off-by: Breno Leitao &lt;leitao@debian.org&gt;
Link: https://patch.msgid.link/20260429-timeval-v1-1-4448e2588bbf@debian.org
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs</title>
<updated>2026-04-13T21:20:11+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-04-13T21:20:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ef3da345ccb1fd70e2288b821301698483c6c35a'/>
<id>urn:sha1:ef3da345ccb1fd70e2288b821301698483c6c35a</id>
<content type='text'>
Pull misc vfs updates from Christian Brauner:
 "Features:
   - coredump: add tracepoint for coredump events
   - fs: hide file and bfile caches behind runtime const machinery

  Fixes:
   - fix architecture-specific compat_ftruncate64 implementations
   - dcache: Limit the minimal number of bucket to two
   - fs/omfs: reject s_sys_blocksize smaller than OMFS_DIR_START
   - fs/mbcache: cancel shrink work before destroying the cache
   - dcache: permit dynamic_dname()s up to NAME_MAX

  Cleanups:
   - remove or unexport unused fs_context infrastructure
   - trivial -&gt;setattr cleanups
   - selftests/filesystems: Assume that TIOCGPTPEER is defined
   - writeback: fix kernel-doc function name mismatch for wb_put_many()
   - autofs: replace manual symlink buffer allocation in autofs_dir_symlink
   - init/initramfs.c: trivial fix: FSM -&gt; Finite-state machine
   - fs: remove stale and duplicate forward declarations
   - readdir: Introduce dirent_size()
   - fs: Replace user_access_{begin/end} by scoped user access
   - kernel: acct: fix duplicate word in comment
   - fs: write a better comment in step_into() concerning .mnt assignment
   - fs: attr: fix comment formatting and spelling issues"

* tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
  dcache: permit dynamic_dname()s up to NAME_MAX
  fs: attr: fix comment formatting and spelling issues
  fs: hide file and bfile caches behind runtime const machinery
  fs: write a better comment in step_into() concerning .mnt assignment
  proc: rename proc_notify_change to proc_setattr
  proc: rename proc_setattr to proc_nochmod_setattr
  affs: rename affs_notify_change to affs_setattr
  adfs: rename adfs_notify_change to adfs_setattr
  hfs: update comments on hfs_inode_setattr
  kernel: acct: fix duplicate word in comment
  fs: Replace user_access_{begin/end} by scoped user access
  readdir: Introduce dirent_size()
  coredump: add tracepoint for coredump events
  fs: remove do_sys_truncate
  fs: pass on FTRUNCATE_* flags to do_truncate
  fs: fix archiecture-specific compat_ftruncate64
  fs: remove stale and duplicate forward declarations
  init/initramfs.c: trivial fix: FSM -&gt; Finite-state machine
  autofs: replace manual symlink buffer allocation in autofs_dir_symlink
  fs/mbcache: cancel shrink work before destroying the cache
  ...
</content>
</entry>
<entry>
<title>fs: Replace user_access_{begin/end} by scoped user access</title>
<updated>2026-03-24T13:44:02+00:00</updated>
<author>
<name>Christophe Leroy (CS GROUP)</name>
<email>chleroy@kernel.org</email>
</author>
<published>2026-03-24T11:41:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b98f7363f72ff83b9f5194d26e7f9fe74f45b46a'/>
<id>urn:sha1:b98f7363f72ff83b9f5194d26e7f9fe74f45b46a</id>
<content type='text'>
Scoped user access reduces code complexity and seamlessly brings
masked user access to architectures that support it.

Replace user_access_begin/user_access_end blocks by
scoped user access.

Signed-off-by: Christophe Leroy (CS GROUP) &lt;chleroy@kernel.org&gt;
Link: https://patch.msgid.link/16daf33a8190a771a93e294d050bd8153521ffca.1774350128.git.chleroy@kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>Convert more 'alloc_obj' cases to default GFP_KERNEL arguments</title>
<updated>2026-02-22T04:03:00+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T04:03:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=32a92f8c89326985e05dce8b22d3f0aa07a3e1bd'/>
<id>urn:sha1:32a92f8c89326985e05dce8b22d3f0aa07a3e1bd</id>
<content type='text'>
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones where the resulting diff is
easy to verify because just that final GFP_KERNEL argument sits on the
next line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that for a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>urn:sha1:69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>select: store end_time as timespec64 in restart block</title>
<updated>2025-12-24T13:01:57+00:00</updated>
<author>
<name>Thomas Weißschuh</name>
<email>thomas.weissschuh@linutronix.de</email>
</author>
<published>2025-12-23T07:00:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0f166bf1d6d82701cc1d94445cc2a9107d1790df'/>
<id>urn:sha1:0f166bf1d6d82701cc1d94445cc2a9107d1790df</id>
<content type='text'>
Storing the end time seconds as 'unsigned long' can lead to truncation
on 32-bit architectures if assigned from the 64-bit timespec64::tv_sec.
As the select() core uses timespec64 consistently, also use that in the
restart block.

This also allows the simplification of the accessors.

Signed-off-by: Thomas Weißschuh &lt;thomas.weissschuh@linutronix.de&gt;
Link: https://patch.msgid.link/20251223-restart-block-expiration-v2-1-8e33e5df7359@linutronix.de
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>select: Convert to scoped user access</title>
<updated>2025-11-04T07:28:34+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2025-10-27T08:44:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3ce17e6909944b3f83b54915e36f5957f1327712'/>
<id>urn:sha1:3ce17e6909944b3f83b54915e36f5957f1327712</id>
<content type='text'>
Replace the open coded implementation with the scoped user access guard.

No functional change intended.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: https://patch.msgid.link/20251027083745.862419776@linutronix.de
</content>
</entry>
<entry>
<title>fs: annotate suspected data race between poll_schedule_timeout() and pollwake()</title>
<updated>2025-06-23T10:36:51+00:00</updated>
<author>
<name>Dmitry Antipov</name>
<email>dmantipov@yandex.ru</email>
</author>
<published>2025-06-20T06:30:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2b7c9664c3ce9d8da1113a1ca2546046de8eb93e'/>
<id>urn:sha1:2b7c9664c3ce9d8da1113a1ca2546046de8eb93e</id>
<content type='text'>
When running almost any sufficiently intense select()/poll() workload,
KCSAN is likely to report data races around the use of the 'triggered'
flag of 'struct poll_wqueues'. For example, running 'find /' on a tty
console may trigger the following:

BUG: KCSAN: data-race in poll_schedule_timeout / pollwake

write to 0xffffc900030cfb90 of 4 bytes by task 97 on cpu 5:
 pollwake+0xd1/0x130
 __wake_up_common_lock+0x7f/0xd0
 n_tty_receive_buf_common+0x776/0xc30
 n_tty_receive_buf2+0x3d/0x60
 tty_ldisc_receive_buf+0x6b/0x100
 tty_port_default_receive_buf+0x63/0xa0
 flush_to_ldisc+0x169/0x3c0
 process_scheduled_works+0x6fe/0xf40
 worker_thread+0x53b/0x7b0
 kthread+0x4f8/0x590
 ret_from_fork+0x28c/0x450
 ret_from_fork_asm+0x1a/0x30

read to 0xffffc900030cfb90 of 4 bytes by task 5802 on cpu 4:
 poll_schedule_timeout+0x96/0x160
 do_sys_poll+0x966/0xb30
 __se_sys_ppoll+0x1c3/0x210
 __x64_sys_ppoll+0x71/0x90
 x64_sys_call+0x3079/0x32b0
 do_syscall_64+0xfa/0x3b0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

According to Jan, "there's no practical issue here because it is hard
to imagine how the compiler could compile the above code using some
intermediate values stored into 'triggered' or multiple fetches from
'triggered'". Nevertheless, silence KCSAN by using WRITE_ONCE() in
__pollwake() and READ_ONCE() in poll_schedule_timeout(), respectively.

Link: https://lore.kernel.org/linux-fsdevel/bwx72orsztfjx6aoftzzkl7wle3hi4syvusuwc7x36nw6t235e@bjwrosehblty
Signed-off-by: Dmitry Antipov &lt;dmantipov@yandex.ru&gt;
Link: https://lore.kernel.org/20250620063059.1800689-1-dmantipov@yandex.ru
Acked-by: Marco Elver &lt;elver@google.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>select: core_sys_select add unlikely branch hint on return path</title>
<updated>2025-04-21T08:27:58+00:00</updated>
<author>
<name>Colin Ian King</name>
<email>colin.i.king@gmail.com</email>
</author>
<published>2025-04-14T09:24:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6b24a702ecf167ab61456276bb72133d84ccca45'/>
<id>urn:sha1:6b24a702ecf167ab61456276bb72133d84ccca45</id>
<content type='text'>
Adding an unlikely() hint on the n &lt; 0 comparison return path improves
run-time performance of the select() system call, since a negative
value of n is very uncommon in normal select usage.

Benchmarking on a Debian-based Intel(R) Core(TM) Ultra 9 285K with
a 6.15-rc1 kernel built with 14.2.0 using a select of 1000 file
descriptors with zero timeout shows a consistent per-call reduction from
258 ns down to 254 ns, which is a ~1.5% performance improvement.

Results based on running 25 tests with turbo disabled (to reduce clock
freq turbo changes), with 30 second run per test and comparing the number
of select() calls per second. The % standard deviation of the 25 tests
was 0.24%, so results are reliable.

Signed-off-by: Colin Ian King &lt;colin.i.king@gmail.com&gt;
Link: https://lore.kernel.org/20250414092426.53529-1-colin.i.king@gmail.com
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>select: do_pollfd: add unlikely branch hint return path</title>
<updated>2025-04-11T13:56:54+00:00</updated>
<author>
<name>Colin Ian King</name>
<email>colin.i.king@gmail.com</email>
</author>
<published>2025-04-09T15:55:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5730609ffd7e558e1e3305d0c6839044e8f6591b'/>
<id>urn:sha1:5730609ffd7e558e1e3305d0c6839044e8f6591b</id>
<content type='text'>
Adding an unlikely() hint on the fd &lt; 0 comparison return path improves
run-time performance of the poll() system call. gcov-based coverage
analysis based on running stress-ng and a kernel build shows that this
return path is highly unlikely.

Benchmarking on a Debian-based Intel(R) Core(TM) Ultra 9 285K with
a 6.15-rc1 kernel and a poll of 1024 file descriptors with zero timeout
shows a per-call reduction from 32818 ns down to 32635 ns, which is a ~0.5%
performance improvement.

Results based on running 25 tests with turbo disabled (to reduce clock
freq turbo changes), with 30 second run per test and comparing the number
of poll() calls per second. The % standard deviation of the 25 tests
was 0.08%, so results are reliable.

Signed-off-by: Colin Ian King &lt;colin.i.king@gmail.com&gt;
Link: https://lore.kernel.org/20250409155510.577490-1-colin.i.king@gmail.com
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
</feed>
