<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/lib/strnlen_user.c, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-11-18T14:27:35+00:00</updated>
<entry>
<title>lib/strn*,uaccess: Use masked_user_{read/write}_access_begin when required</title>
<updated>2025-11-18T14:27:35+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@csgroup.eu</email>
</author>
<published>2025-11-17T16:43:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4322c8f81c58da493a3c46eda32f0e7534a350a0'/>
<id>urn:sha1:4322c8f81c58da493a3c46eda32f0e7534a350a0</id>
<content type='text'>
Properly use masked_user_read_access_begin() and
masked_user_write_access_begin() instead of masked_user_access_begin() in
order to match user_read_access_end() and user_write_access_end().  This is
important for architectures like PowerPC that enable separately user reads
and user writes.

That means masked_user_read_access_begin() is used when user memory is
exclusively read during the window and masked_user_write_access_begin()
is used when user memory is exclusively writen during the window.
masked_user_access_begin() remains and is used when both reads and
writes are performed during the open window. Each of them is expected
to be terminated by the matching user_read_access_end(),
user_write_access_end() and user_access_end().

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://patch.msgid.link/cb5e4b0fa49ea9c740570949d5e3544423389757.1763396724.git.christophe.leroy@csgroup.eu
</content>
</entry>
<entry>
<title>x86: support user address masking instead of non-speculative conditional</title>
<updated>2024-08-19T18:31:18+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-04-09T03:04:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2865baf54077aa98fcdb478cefe6a42c417b9374'/>
<id>urn:sha1:2865baf54077aa98fcdb478cefe6a42c417b9374</id>
<content type='text'>
The Spectre-v1 mitigations made "access_ok()" much more expensive, since
it has to serialize execution with the test for a valid user address.

All the normal user copy routines avoid this by just masking the user
address with a data-dependent mask instead, but the fast
"unsafe_user_read()" kind of patterms that were supposed to be a fast
case got slowed down.

This introduces a notion of using

	src = masked_user_access_begin(src);

to do the user address sanity using a data-dependent mask instead of the
more traditional conditional

	if (user_read_access_begin(src, len)) {

model.

This model only works for dense accesses that start at 'src' and on
architectures that have a guard region that is guaranteed to fault in
between the user space and the kernel space area.

With this, the user access doesn't need to be manually checked, because
a bad address is guaranteed to fault (by some architecture masking
trick: on x86-64 this involves just turning an invalid user address into
all ones, since we don't map the top of address space).

This only converts a couple of examples for now.  Example x86-64 code
generation for loading two words from user space:

        stac
        mov    %rax,%rcx
        sar    $0x3f,%rcx
        or     %rax,%rcx
        mov    (%rcx),%r13
        mov    0x8(%rcx),%r14
        clac

where all the error handling and -EFAULT is now purely handled out of
line by the exception path.

Of course, if the micro-architecture does badly at 'clac' and 'stac',
the above is still pitifully slow.  But at least we did as well as we
could.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>lib/strn*,objtool: Enforce user_access_begin() rules</title>
<updated>2022-04-19T19:58:47+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2022-04-08T09:45:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=226d44acf6dfe71c9df5804b82364e93cf908b53'/>
<id>urn:sha1:226d44acf6dfe71c9df5804b82364e93cf908b53</id>
<content type='text'>
Apparently GCC can fail to inline a 'static inline' single caller
function:

  lib/strnlen_user.o: warning: objtool: strnlen_user()+0x33: call to do_strnlen_user() with UACCESS enabled
  lib/strncpy_from_user.o: warning: objtool: strncpy_from_user()+0x33: call to do_strncpy_from_user() with UACCESS enabled

Reported-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lore.kernel.org/r/20220408094718.262932488@infradead.org
</content>
</entry>
<entry>
<title>uaccess: remove CONFIG_SET_FS</title>
<updated>2022-02-25T08:36:06+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2022-02-11T20:42:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=967747bbc084b93b54e66f9047d342232314cd25'/>
<id>urn:sha1:967747bbc084b93b54e66f9047d342232314cd25</id>
<content type='text'>
There are no remaining callers of set_fs(), so CONFIG_SET_FS
can be removed globally, along with the thread_info field and
any references to it.

This turns access_ok() into a cheaper check against TASK_SIZE_MAX.

As CONFIG_SET_FS is now gone, drop all remaining references to
set_fs()/get_fs(), mm_segment_t, user_addr_max() and uaccess_kernel().

Acked-by: Sam Ravnborg &lt;sam@ravnborg.org&gt; # for sparc32 changes
Acked-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Tested-by: Sergey Matyukevich &lt;sergey.matyukevich@synopsys.com&gt; # for arc changes
Acked-by: Stafford Horne &lt;shorne@gmail.com&gt; # [openrisc, asm-generic]
Acked-by: Dinh Nguyen &lt;dinguyen@kernel.org&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
</content>
</entry>
<entry>
<title>uaccess: Selectively open read or write user access</title>
<updated>2020-05-01T02:35:21+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2020-04-03T07:20:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=41cd780524674082b037e7c8461f90c5e42103f0'/>
<id>urn:sha1:41cd780524674082b037e7c8461f90c5e42103f0</id>
<content type='text'>
When opening user access to only perform reads, only open read access.
When opening user access to only perform writes, only open write
access.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/2e73bc57125c2c6ab12a587586a4eed3a47105fc.1585898438.git.christophe.leroy@c-s.fr
</content>
</entry>
<entry>
<title>lib: Reduce user_access_begin() boundaries in strncpy_from_user() and strnlen_user()</title>
<updated>2020-01-24T17:27:34+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2020-01-23T08:34:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ab10ae1c3bef56c29bac61e1201c752221b87b41'/>
<id>urn:sha1:ab10ae1c3bef56c29bac61e1201c752221b87b41</id>
<content type='text'>
The range passed to user_access_begin() by strncpy_from_user() and
strnlen_user() starts at 'src' and goes up to the limit of userspace
although reads will be limited by the 'count' param.

On 32 bits powerpc (book3s/32) access has to be granted for each
256Mbytes segment and the cost increases with the number of segments to
unlock.

Limit the range with 'count' param.

Fixes: 594cc251fdd0 ("make 'user_access_begin()' do 'access_ok()'")
Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>lib: introduce copy_struct_from_user() helper</title>
<updated>2019-10-01T13:45:03+00:00</updated>
<author>
<name>Aleksa Sarai</name>
<email>cyphar@cyphar.com</email>
</author>
<published>2019-10-01T01:10:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f5a1a536fa14895ccff4e94e6a5af90901ce86aa'/>
<id>urn:sha1:f5a1a536fa14895ccff4e94e6a5af90901ce86aa</id>
<content type='text'>
A common pattern for syscall extensions is increasing the size of a
struct passed from userspace, such that the zero-value of the new fields
result in the old kernel behaviour (allowing for a mix of userspace and
kernel vintages to operate on one another in most cases).

While this interface exists for communication in both directions, only
one interface is straightforward to have reasonable semantics for
(userspace passing a struct to the kernel). For kernel returns to
userspace, what the correct semantics are (whether there should be an
error if userspace is unaware of a new extension) is very
syscall-dependent and thus probably cannot be unified between syscalls
(a good example of this problem is [1]).

Previously there was no common lib/ function that implemented
the necessary extension-checking semantics (and different syscalls
implemented them slightly differently or incompletely[2]). Future
patches replace common uses of this pattern to make use of
copy_struct_from_user().

Some in-kernel selftests that insure that the handling of alignment and
various byte patterns are all handled identically to memchr_inv() usage.

[1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
     robustify sched_read_attr() ABI logic and code")

[2]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do
     similar checks to copy_struct_from_user() while rt_sigprocmask(2)
     always rejects differently-sized struct arguments.

Suggested-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Signed-off-by: Aleksa Sarai &lt;cyphar@cyphar.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Link: https://lore.kernel.org/r/20191001011055.19283-2-cyphar@cyphar.com
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
</content>
</entry>
<entry>
<title>lib: untag user pointers in strn*_user</title>
<updated>2019-09-26T00:51:41+00:00</updated>
<author>
<name>Andrey Konovalov</name>
<email>andreyknvl@google.com</email>
</author>
<published>2019-09-25T23:48:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=903f433f8f7a33e292a319259483adece8cc6674'/>
<id>urn:sha1:903f433f8f7a33e292a319259483adece8cc6674</id>
<content type='text'>
Patch series "arm64: untag user pointers passed to the kernel", v19.

=== Overview

arm64 has a feature called Top Byte Ignore, which allows to embed pointer
tags into the top byte of each pointer.  Userspace programs (such as
HWASan, a memory debugging tool [1]) might use this feature and pass
tagged user pointers to the kernel through syscalls or other interfaces.

Right now the kernel is already able to handle user faults with tagged
pointers, due to these patches:

1. 81cddd65 ("arm64: traps: fix userspace cache maintenance emulation on a
             tagged pointer")
2. 7dcd9dd8 ("arm64: hw_breakpoint: fix watchpoint matching for tagged
	      pointers")
3. 276e9327 ("arm64: entry: improve data abort handling of tagged
	      pointers")

This patchset extends tagged pointer support to syscall arguments.

As per the proposed ABI change [3], tagged pointers are only allowed to be
passed to syscalls when they point to memory ranges obtained by anonymous
mmap() or sbrk() (see the patchset [3] for more details).

For non-memory syscalls this is done by untaging user pointers when the
kernel performs pointer checking to find out whether the pointer comes
from userspace (most notably in access_ok).  The untagging is done only
when the pointer is being checked, the tag is preserved as the pointer
makes its way through the kernel and stays tagged when the kernel
dereferences the pointer when perfoming user memory accesses.

The mmap and mremap (only new_addr) syscalls do not currently accept
tagged addresses.  Architectures may interpret the tag as a background
colour for the corresponding vma.

Other memory syscalls (mprotect, etc.) don't do user memory accesses but
rather deal with memory ranges, and untagged pointers are better suited to
describe memory ranges internally.  Thus for memory syscalls we untag
pointers completely when they enter the kernel.

=== Other approaches

One of the alternative approaches to untagging that was considered is to
completely strip the pointer tag as the pointer enters the kernel with
some kind of a syscall wrapper, but that won't work with the countless
number of different ioctl calls.  With this approach we would need a
custom wrapper for each ioctl variation, which doesn't seem practical.

An alternative approach to untagging pointers in memory syscalls prologues
is to inspead allow tagged pointers to be passed to find_vma() (and other
vma related functions) and untag them there.  Unfortunately, a lot of
find_vma() callers then compare or subtract the returned vma start and end
fields against the pointer that was being searched.  Thus this approach
would still require changing all find_vma() callers.

=== Testing

The following testing approaches has been taken to find potential issues
with user pointer untagging:

1. Static testing (with sparse [2] and separately with a custom static
   analyzer based on Clang) to track casts of __user pointers to integer
   types to find places where untagging needs to be done.

2. Static testing with grep to find parts of the kernel that call
   find_vma() (and other similar functions) or directly compare against
   vm_start/vm_end fields of vma.

3. Static testing with grep to find parts of the kernel that compare
   user pointers with TASK_SIZE or other similar consts and macros.

4. Dynamic testing: adding BUG_ON(has_tag(addr)) to find_vma() and running
   a modified syzkaller version that passes tagged pointers to the kernel.

Based on the results of the testing the requried patches have been added
to the patchset.

=== Notes

This patchset is meant to be merged together with "arm64 relaxed ABI" [3].

This patchset is a prerequisite for ARM's memory tagging hardware feature
support [4].

This patchset has been merged into the Pixel 2 &amp; 3 kernel trees and is
now being used to enable testing of Pixel phones with HWASan.

Thanks!

[1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html

[2] https://github.com/lucvoo/sparse-dev/commit/5f960cb10f56ec2017c128ef9d16060e0145f292

[3] https://lkml.org/lkml/2019/6/12/745

[4] https://community.arm.com/processors/b/blog/posts/arm-a-profile-architecture-2018-developments-armv85a

This patch (of 11)

This patch is a part of a series that extends kernel ABI to allow to pass
tagged user pointers (with the top byte set to something else other than
0x00) as syscall arguments.

strncpy_from_user and strnlen_user accept user addresses as arguments, and
do not go through the same path as copy_from_user and others, so here we
need to handle the case of tagged user addresses separately.

Untag user pointers passed to these functions.

Note, that this patch only temporarily untags the pointers to perform
validity checks, but then uses them as is to perform user memory accesses.

[andreyknvl@google.com: fix sparc4 build]
 Link: http://lkml.kernel.org/r/CAAeHK+yx4a-P0sDrXTUxMvO2H0CJZUFPffBrg_cU7oJOZyC7ew@mail.gmail.com
Link: http://lkml.kernel.org/r/c5a78bcad3e94d6cda71fcaa60a423231ae71e4c.1563904656.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Reviewed-by: Vincenzo Frascino &lt;vincenzo.frascino@arm.com&gt;
Reviewed-by: Khalid Aziz &lt;khalid.aziz@oracle.com&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Eric Auger &lt;eric.auger@redhat.com&gt;
Cc: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Cc: Jens Wiklander &lt;jens.wiklander@linaro.org&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab+samsung@kernel.org&gt;
Cc: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/uaccess: Use 'unsigned long' to placate UBSAN warnings on older GCC versions</title>
<updated>2019-04-24T10:19:45+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-04-24T07:19:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=29da93fea3ea39ab9b12270cc6be1b70ef201c9e'/>
<id>urn:sha1:29da93fea3ea39ab9b12270cc6be1b70ef201c9e</id>
<content type='text'>
Randy reported objtool triggered on his (GCC-7.4) build:

  lib/strncpy_from_user.o: warning: objtool: strncpy_from_user()+0x315: call to __ubsan_handle_add_overflow() with UACCESS enabled
  lib/strnlen_user.o: warning: objtool: strnlen_user()+0x337: call to __ubsan_handle_sub_overflow() with UACCESS enabled

This is due to UBSAN generating signed-overflow-UB warnings where it
should not. Prior to GCC-8 UBSAN ignored -fwrapv (which the kernel
uses through -fno-strict-overflow).

Make the functions use 'unsigned long' throughout.

Reported-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Randy Dunlap &lt;rdunlap@infradead.org&gt; # build-tested
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: luto@kernel.org
Link: http://lkml.kernel.org/r/20190424072208.754094071@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>make 'user_access_begin()' do 'access_ok()'</title>
<updated>2019-01-04T20:56:09+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-01-04T20:56:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=594cc251fdd0d231d342d88b2fdff4bc42fb0690'/>
<id>urn:sha1:594cc251fdd0d231d342d88b2fdff4bc42fb0690</id>
<content type='text'>
Originally, the rule used to be that you'd have to do access_ok()
separately, and then user_access_begin() before actually doing the
direct (optimized) user access.

But experience has shown that people then decide not to do access_ok()
at all, and instead rely on it being implied by other operations or
similar.  Which makes it very hard to verify that the access has
actually been range-checked.

If you use the unsafe direct user accesses, hardware features (either
SMAP - Supervisor Mode Access Protection - on x86, or PAN - Privileged
Access Never - on ARM) do force you to use user_access_begin().  But
nothing really forces the range check.

By putting the range check into user_access_begin(), we actually force
people to do the right thing (tm), and the range check vill be visible
near the actual accesses.  We have way too long a history of people
trying to avoid them.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
