<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/select.c, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-02-08T08:56:53+00:00</updated>
<entry>
<title>select: Fix unbalanced user_access_end()</title>
<updated>2025-02-08T08:56:53+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@csgroup.eu</email>
</author>
<published>2025-01-13T08:37:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=68303b5d382bf83f64abcb57e71f11d945d54c41'/>
<id>urn:sha1:68303b5d382bf83f64abcb57e71f11d945d54c41</id>
<content type='text'>
[ Upstream commit 344af27715ddbf357cf76978d674428b88f8e92d ]

While working on implementing user access validation on powerpc
I got the following warnings on a pmac32_defconfig build:

	  CC      fs/select.o
	fs/select.o: warning: objtool: sys_pselect6+0x1bc: redundant UACCESS disable
	fs/select.o: warning: objtool: sys_pselect6_time32+0x1bc: redundant UACCESS disable

On powerpc/32s, user_read_access_begin/end() are no-ops, but the
failure path has a user_access_end() instead of user_read_access_end()
which means an access end without any prior access begin.

Replace that user_access_end() by user_read_access_end().

Fixes: 7e71609f64ec ("pselect6() and friends: take handling the combined 6th/7th args into helper")
Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Link: https://lore.kernel.org/r/a7139e28d767a13e667ee3c79599a8047222ef36.1736751221.git.christophe.leroy@csgroup.eu
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'pull-stable-struct_fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2024-09-23T16:35:36+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-09-23T16:35:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f8ffbc365f703d74ecca8ca787318d05bbee2bf7'/>
<id>urn:sha1:f8ffbc365f703d74ecca8ca787318d05bbee2bf7</id>
<content type='text'>
Pull 'struct fd' updates from Al Viro:
 "Just the 'struct fd' layout change, with conversion to accessor
  helpers"

* tag 'pull-stable-struct_fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  add struct fd constructors, get rid of __to_fd()
  struct fd: representation change
  introduce fd_file(), convert all accessors to it.
</content>
</entry>
<entry>
<title>Merge branch 'address-masking'</title>
<updated>2024-09-22T18:19:35+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-09-22T18:19:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=de5cb0dcb74c294ec527eddfe5094acfdb21ff21'/>
<id>urn:sha1:de5cb0dcb74c294ec527eddfe5094acfdb21ff21</id>
<content type='text'>
Merge user access fast validation using address masking.

This allows architectures to optionally use a data dependent address
masking model instead of a conditional branch for validating user
accesses.  That avoids the Spectre-v1 speculation barriers.

Right now only x86-64 takes advantage of this, and not all architectures
will be able to do it.  It requires a guard region between the user and
kernel address spaces (so that you can't overflow from one to the
other), and an easy way to generate a guaranteed-to-fault address for
invalid user pointers.

Also note that this currently assumes that there is no difference
between user read and write accesses.  If extended to architectures like
powerpc, we'll also need to separate out the user read-vs-write cases.

* address-masking:
  x86: make the masked_user_access_begin() macro use its argument only once
  x86: do the user address masking outside the user access area
  x86: support user address masking instead of non-speculative conditional
</content>
</entry>
<entry>
<title>Merge tag 'timers-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2024-09-17T05:25:37+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-09-17T05:25:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9ea925c806dbb8fee6797f59148daaf7f648832e'/>
<id>urn:sha1:9ea925c806dbb8fee6797f59148daaf7f648832e</id>
<content type='text'>
Pull timer updates from Thomas Gleixner:
 "Core:

   - Overhaul of posix-timers in preparation of removing the workaround
     for periodic timers which have signal delivery ignored.

   - Remove the historical extra jiffie in msleep()

     msleep() adds an extra jiffie to the timeout value to ensure
     minimal sleep time. The timer wheel ensures minimal sleep time
     since the large rewrite to a non-cascading wheel, but the extra
     jiffie in msleep() remained unnoticed. Remove it.

   - Make the timer slack handling correct for realtime tasks.

     The procfs interface is inconsistent and does neither reflect
     reality nor conforms to the man page. Show the correct 0 slack for
     real time tasks and enforce it at the core level instead of having
     inconsistent individual checks in various timer setup functions.

   - The usual set of updates and enhancements all over the place.

  Drivers:

   - Allow the ACPI PM timer to be turned off during suspend

   - No new drivers

   - The usual updates and enhancements in various drivers"

* tag 'timers-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
  ntp: Make sure RTC is synchronized when time goes backwards
  treewide: Fix wrong singular form of jiffies in comments
  cpu: Use already existing usleep_range()
  timers: Rename next_expiry_recalc() to be unique
  platform/x86:intel/pmc: Fix comment for the pmc_core_acpi_pm_timer_suspend_resume function
  clocksource/drivers/jcore: Use request_percpu_irq()
  clocksource/drivers/cadence-ttc: Add missing clk_disable_unprepare in ttc_setup_clockevent
  clocksource/drivers/asm9260: Add missing clk_disable_unprepare in asm9260_timer_init
  clocksource/drivers/qcom: Add missing iounmap() on errors in msm_dt_timer_init()
  clocksource/drivers/ingenic: Use devm_clk_get_enabled() helpers
  platform/x86:intel/pmc: Enable the ACPI PM Timer to be turned off when suspended
  clocksource: acpi_pm: Add external callback for suspend/resume
  clocksource/drivers/arm_arch_timer: Using for_each_available_child_of_node_scoped()
  dt-bindings: timer: rockchip: Add rk3576 compatible
  timers: Annotate possible non critical data race of next_expiry
  timers: Remove historical extra jiffie for timeout in msleep()
  hrtimer: Use and report correct timerslack values for realtime tasks
  hrtimer: Annotate hrtimer_cpu_base_.*_expiry() for sparse.
  timers: Add sparse annotation for timer_sync_wait_running().
  signal: Replace BUG_ON()s
  ...
</content>
</entry>
<entry>
<title>fs/select: Annotate struct poll_list with __counted_by()</title>
<updated>2024-08-30T06:22:35+00:00</updated>
<author>
<name>Thorsten Blum</name>
<email>thorsten.blum@toblux.com</email>
</author>
<published>2024-08-08T15:00:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=193b72792fdba54aa2af6c3ad72ce16c3f06b4f7'/>
<id>urn:sha1:193b72792fdba54aa2af6c3ad72ce16c3f06b4f7</id>
<content type='text'>
Add the __counted_by compiler attribute to the flexible array member
entries to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and
CONFIG_FORTIFY_SOURCE.

Signed-off-by: Thorsten Blum &lt;thorsten.blum@toblux.com&gt;
Link: https://lore.kernel.org/r/20240808150023.72578-2-thorsten.blum@toblux.com
Reviewed-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>hrtimer: Use and report correct timerslack values for realtime tasks</title>
<updated>2024-08-23T18:13:02+00:00</updated>
<author>
<name>Felix Moessbauer</name>
<email>felix.moessbauer@siemens.com</email>
</author>
<published>2024-08-14T12:10:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ed4fb6d7ef68111bb539283561953e5c6e9a6e38'/>
<id>urn:sha1:ed4fb6d7ef68111bb539283561953e5c6e9a6e38</id>
<content type='text'>
The timerslack_ns setting is used to specify how much the hardware
timers should be delayed, to potentially dispatch multiple timers in a
single interrupt. This is a performance optimization. Timers of
realtime tasks (having a realtime scheduling policy) should not be
delayed.

This logic was inconsitently applied to the hrtimers, leading to delays
of realtime tasks which used timed waits for events (e.g. condition
variables). Due to the downstream override of the slack for rt tasks,
the procfs reported incorrect (non-zero) timerslack_ns values.

This is changed by setting the timer_slack_ns task attribute to 0 for
all tasks with a rt policy. By that, downstream users do not need to
specially handle rt tasks (w.r.t. the slack), and the procfs entry
shows the correct value of "0". Setting non-zero slack values (either
via procfs or PR_SET_TIMERSLACK) on tasks with a rt policy is ignored,
as stated in "man 2 PR_SET_TIMERSLACK":

  Timer slack is not applied to threads that are scheduled under a
  real-time scheduling policy (see sched_setscheduler(2)).

The special handling of timerslack on rt tasks in downstream users
is removed as well.

Signed-off-by: Felix Moessbauer &lt;felix.moessbauer@siemens.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/all/20240814121032.368444-2-felix.moessbauer@siemens.com

</content>
</entry>
<entry>
<title>x86: support user address masking instead of non-speculative conditional</title>
<updated>2024-08-19T18:31:18+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-04-09T03:04:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2865baf54077aa98fcdb478cefe6a42c417b9374'/>
<id>urn:sha1:2865baf54077aa98fcdb478cefe6a42c417b9374</id>
<content type='text'>
The Spectre-v1 mitigations made "access_ok()" much more expensive, since
it has to serialize execution with the test for a valid user address.

All the normal user copy routines avoid this by just masking the user
address with a data-dependent mask instead, but the fast
"unsafe_user_read()" kind of patterms that were supposed to be a fast
case got slowed down.

This introduces a notion of using

	src = masked_user_access_begin(src);

to do the user address sanity using a data-dependent mask instead of the
more traditional conditional

	if (user_read_access_begin(src, len)) {

model.

This model only works for dense accesses that start at 'src' and on
architectures that have a guard region that is guaranteed to fault in
between the user space and the kernel space area.

With this, the user access doesn't need to be manually checked, because
a bad address is guaranteed to fault (by some architecture masking
trick: on x86-64 this involves just turning an invalid user address into
all ones, since we don't map the top of address space).

This only converts a couple of examples for now.  Example x86-64 code
generation for loading two words from user space:

        stac
        mov    %rax,%rcx
        sar    $0x3f,%rcx
        or     %rax,%rcx
        mov    (%rcx),%r13
        mov    0x8(%rcx),%r14
        clac

where all the error handling and -EFAULT is now purely handled out of
line by the exception path.

Of course, if the micro-architecture does badly at 'clac' and 'stac',
the above is still pitifully slow.  But at least we did as well as we
could.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>introduce fd_file(), convert all accessors to it.</title>
<updated>2024-08-13T02:00:43+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2024-05-31T18:12:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1da91ea87aefe2c25b68c9f96947a9271ba6325d'/>
<id>urn:sha1:1da91ea87aefe2c25b68c9f96947a9271ba6325d</id>
<content type='text'>
	For any changes of struct fd representation we need to
turn existing accesses to fields into calls of wrappers.
Accesses to struct fd::flags are very few (3 in linux/file.h,
1 in net/socket.c, 3 in fs/overlayfs/file.c and 3 more in
explicit initializers).
	Those can be dealt with in the commit converting to
new layout; accesses to struct fd::file are too many for that.
	This commit converts (almost) all of f.file to
fd_file(f).  It's not entirely mechanical ('file' is used as
a member name more than just in struct fd) and it does not
even attempt to distinguish the uses in pointer context from
those in boolean context; the latter will be eventually turned
into a separate helper (fd_empty()).

	NOTE: mass conversion to fd_empty(), tempting as it
might be, is a bad idea; better do that piecewise in commit
that convert from fdget...() to CLASS(...).

[conflicts in fs/fhandle.c, kernel/bpf/syscall.c, mm/memcontrol.c
caught by git; fs/stat.c one got caught by git grep]
[fs/xattr.c conflict]

Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>fs/select: rework stack allocation hack for clang</title>
<updated>2024-02-20T08:23:52+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2024-02-16T20:23:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ddb9fd7a544088ed70eccbb9f85e9cc9952131c1'/>
<id>urn:sha1:ddb9fd7a544088ed70eccbb9f85e9cc9952131c1</id>
<content type='text'>
A while ago, we changed the way that select() and poll() preallocate
a temporary buffer just under the size of the static warning limit of
1024 bytes, as clang was frequently going slightly above that limit.

The warnings have recently returned and I took another look. As it turns
out, clang is not actually inherently worse at reserving stack space,
it just happens to inline do_select() into core_sys_select(), while gcc
never inlines it.

Annotate do_select() to never be inlined and in turn remove the special
case for the allocation size. This should give the same behavior for
both clang and gcc all the time and once more avoids those warnings.

Fixes: ad312f95d41c ("fs/select: avoid clang stack usage warning")
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Link: https://lore.kernel.org/r/20240216202352.2492798-1-arnd@kernel.org
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>select: Avoid wrap-around instrumentation in do_sys_poll()</title>
<updated>2024-02-02T12:11:49+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2024-01-29T18:40:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c67ef897fe08effad98f0c7fb9223fa1f771d09e'/>
<id>urn:sha1:c67ef897fe08effad98f0c7fb9223fa1f771d09e</id>
<content type='text'>
The mix of int, unsigned int, and unsigned long used by struct
poll_list::len, todo, len, and j meant that the signed overflow
sanitizer got worried it needed to instrument several places where
arithmetic happens between these variables. Since all of the variables
are always positive and bounded by unsigned int, use a single type in
all places. Additionally expand the zero-test into an explicit range
check before updating "todo".

This keeps sanitizer instrumentation[1] out of a UACCESS path:

vmlinux.o: warning: objtool: do_sys_poll+0x285: call to __ubsan_handle_sub_overflow() with UACCESS enabled

Link: https://github.com/KSPP/linux/issues/26 [1]
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: &lt;linux-fsdevel@vger.kernel.org&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20240129184014.work.593-kees@kernel.org
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
</feed>
