<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/arch/x86/entry/syscall_64.c, branch linux-7.1.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-7.1.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-7.1.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-25T04:12:03+00:00</updated>
<entry>
<title>randomize_kstack: Unify random source across arches</title>
<updated>2026-03-25T04:12:03+00:00</updated>
<author>
<name>Ryan Roberts</name>
<email>ryan.roberts@arm.com</email>
</author>
<published>2026-03-03T15:08:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a96ef5848cb096226bf6aff31a90d8b136d99b71'/>
<id>urn:sha1:a96ef5848cb096226bf6aff31a90d8b136d99b71</id>
<content type='text'>
Previously different architectures were using random sources of
differing strength and cost to decide the random kstack offset. A number
of architectures (loongarch, powerpc, s390, x86) were using their
timestamp counter, at whatever the frequency happened to be. Other
arches (arm64, riscv) were using entropy from the crng via
get_random_u16().

There have been concerns that in some cases the timestamp counters may
be too weak, because they can be easily guessed or influenced by user
space. And get_random_u16() has been shown to be too costly for the
level of protection kstack offset randomization provides.

So let's use a common, architecture-agnostic source of entropy; a
per-cpu prng, seeded at boot-time from the crng. This has a few
benefits:

  - We can remove choose_random_kstack_offset(); That was only there to
    try to make the timestamp counter value a bit harder to influence
    from user space [*].

  - The architecture code is simplified. All it has to do now is call
    add_random_kstack_offset() in the syscall path.

  - The strength of the randomness can be reasoned about independently
    of the architecture.

  - Arches previously using get_random_u16() now have much faster
    syscall paths, see below results.

[*] Additionally, this gets rid of some redundant work on s390 and x86.
Before this patch, those architectures called
choose_random_kstack_offset() under arch_exit_to_user_mode_prepare(),
which is also called for exception returns to userspace which were *not*
syscalls (e.g. regular interrupts). Getting rid of
choose_random_kstack_offset() avoids a small amount of redundant work
for the non-syscall cases.

In some configurations, add_random_kstack_offset() will now call
instrumentable code, so for a couple of arches, I have moved the call a
bit later to the first point where instrumentation is allowed. This
doesn't impact the efficacy of the mechanism.

There have been some claims that a prng may be less strong than the
timestamp counter if not regularly reseeded. But the prng has a period
of about 2^113. So as long as the prng state remains secret, it should
not be possible to guess. If the prng state can be accessed, we have
bigger problems.

Additionally, we are only consuming 6 bits to randomize the stack, so
there are only 64 possible random offsets. I assert that it would be
trivial for an attacker to brute force by repeating their attack and
waiting for the random stack offset to be the desired one. The prng
approach seems entirely proportional to this level of protection.

Performance data are provided below. The baseline is v6.18 with rndstack
on for each respective arch. (I)/(R) indicate statistically significant
improvement/regression. arm64 platform is AWS Graviton3 (m7g.metal).
x86_64 platform is AWS Sapphire Rapids (m7i.24xlarge):

+-----------------+--------------+---------------+---------------+
| Benchmark       | Result Class |  per-cpu-prng |  per-cpu-prng |
|                 |              | arm64 (metal) |   x86_64 (VM) |
+=================+==============+===============+===============+
| syscall/getpid  | mean (ns)    |    (I) -9.50% |   (I) -17.65% |
|                 | p99 (ns)     |   (I) -59.24% |   (I) -24.41% |
|                 | p99.9 (ns)   |   (I) -59.52% |   (I) -28.52% |
+-----------------+--------------+---------------+---------------+
| syscall/getppid | mean (ns)    |    (I) -9.52% |   (I) -19.24% |
|                 | p99 (ns)     |   (I) -59.25% |   (I) -25.03% |
|                 | p99.9 (ns)   |   (I) -59.50% |   (I) -28.17% |
+-----------------+--------------+---------------+---------------+
| syscall/invalid | mean (ns)    |   (I) -10.31% |   (I) -18.56% |
|                 | p99 (ns)     |   (I) -60.79% |   (I) -20.06% |
|                 | p99.9 (ns)   |   (I) -61.04% |   (I) -25.04% |
+-----------------+--------------+---------------+---------------+

I tested an earlier version of this change on x86 bare metal and it
showed a smaller but still significant improvement. The bare metal
system wasn't available this time around so testing was done in a VM
instance. I'm guessing the cost of rdtsc is higher for VMs.

Acked-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Link: https://patch.msgid.link/20260303150840.3789438-3-ryan.roberts@arm.com
Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>x86/syscall: Remove stray semicolons</title>
<updated>2025-03-19T10:19:18+00:00</updated>
<author>
<name>Brian Gerst</name>
<email>brgerst@gmail.com</email>
</author>
<published>2025-03-14T15:12:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=604939552231730c52b823b178e9080877b20851'/>
<id>urn:sha1:604939552231730c52b823b178e9080877b20851</id>
<content type='text'>
No functional change.

Suggested-by: Sohil Mehta &lt;sohil.mehta@intel.com&gt;
Signed-off-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: Sohil Mehta &lt;sohil.mehta@intel.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lore.kernel.org/r/20250314151220.862768-7-brgerst@gmail.com
</content>
</entry>
<entry>
<title>x86/syscall/x32: Move x32 syscall table</title>
<updated>2025-03-19T10:19:15+00:00</updated>
<author>
<name>Brian Gerst</name>
<email>brgerst@gmail.com</email>
</author>
<published>2025-03-14T15:12:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=21832247f2df4f636b0f2ae6939819e8dd2ed35f'/>
<id>urn:sha1:21832247f2df4f636b0f2ae6939819e8dd2ed35f</id>
<content type='text'>
Since commit:

  2e958a8a510d ("x86/entry/x32: Rename __x32_compat_sys_* to __x64_compat_sys_*")

the ABI prefix for x32 syscalls is the same as native 64-bit
syscalls.  Move the x32 syscall table to syscall_64.c

No functional changes.

Signed-off-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: Sohil Mehta &lt;sohil.mehta@intel.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lore.kernel.org/r/20250314151220.862768-5-brgerst@gmail.com
</content>
</entry>
<entry>
<title>x86/syscall/64: Move 64-bit syscall dispatch code</title>
<updated>2025-03-19T10:19:04+00:00</updated>
<author>
<name>Brian Gerst</name>
<email>brgerst@gmail.com</email>
</author>
<published>2025-03-14T15:12:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=01dfb4805420291966b75b208d66c88c62579649'/>
<id>urn:sha1:01dfb4805420291966b75b208d66c88c62579649</id>
<content type='text'>
Move the 64-bit syscall dispatch code to syscall_64.c.

No functional changes.

Signed-off-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: Sohil Mehta &lt;sohil.mehta@intel.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lore.kernel.org/r/20250314151220.862768-4-brgerst@gmail.com
</content>
</entry>
<entry>
<title>x86/syscall: Mark exit[_group] syscall handlers __noreturn</title>
<updated>2024-06-28T13:23:38+00:00</updated>
<author>
<name>Josh Poimboeuf</name>
<email>jpoimboe@kernel.org</email>
</author>
<published>2024-06-26T06:02:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9142be9e6443fd641ca37f820efe00d9cd890eb1'/>
<id>urn:sha1:9142be9e6443fd641ca37f820efe00d9cd890eb1</id>
<content type='text'>
The direct-call syscall dispatch function doesn't know that the exit()
and exit_group() syscall handlers don't return, so the call sites aren't
optimized accordingly.

Fix that by marking the exit syscall declarations __noreturn.

Fixes the following warnings:

  vmlinux.o: warning: objtool: x64_sys_call+0x2804: __x64_sys_exit() is missing a __noreturn annotation
  vmlinux.o: warning: objtool: ia32_sys_call+0x29b6: __ia32_sys_exit_group() is missing a __noreturn annotation

Fixes: 1e3ad78334a6 ("x86/syscall: Don't force use of indirect calls for system calls")
Closes: https://lkml.kernel.org/lkml/6dba9b32-db2c-4e6d-9500-7a08852f17a3@paulmck-laptop
Reported-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Tested-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Link: https://lore.kernel.org/r/5d8882bc077d8eadcc7fd1740b56dfb781f12288.1719381528.git.jpoimboe@kernel.org
</content>
</entry>
<entry>
<title>x86/syscall: Don't force use of indirect calls for system calls</title>
<updated>2024-04-08T17:27:05+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-04-03T23:36:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1e3ad78334a69b36e107232e337f9d693dcc9df2'/>
<id>urn:sha1:1e3ad78334a69b36e107232e337f9d693dcc9df2</id>
<content type='text'>
Make &lt;asm/syscall.h&gt; build a switch statement instead, and the compiler can
either decide to generate an indirect jump, or - more likely these days due
to mitigations - just a series of conditional branches.

Yes, the conditional branches also have branch prediction, but the branch
prediction is much more controlled, in that it just causes speculatively
running the wrong system call (harmless), rather than speculatively running
possibly wrong random less controlled code gadgets.

This doesn't mitigate other indirect calls, but the system call indirection
is the first and most easily triggered case.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Daniel Sneddon &lt;daniel.sneddon@linux.intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;

</content>
</entry>
<entry>
<title>x86/syscalls: Stop filling syscall arrays with *_sys_ni_syscall</title>
<updated>2021-05-20T13:03:59+00:00</updated>
<author>
<name>Masahiro Yamada</name>
<email>masahiroy@kernel.org</email>
</author>
<published>2021-05-17T07:38:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=44fe4895f47cbe9f4692e1d3cdc2ef8352f4d88e'/>
<id>urn:sha1:44fe4895f47cbe9f4692e1d3cdc2ef8352f4d88e</id>
<content type='text'>
This is a follow-up cleanup after switching to the generic syscalltbl.sh.

The old x86 specific script skipped non-existing syscalls. So, the
generated syscalls_64.h, for example, had a big hole in the syscall numbers
335-423 range. That is why there exists [0 ... __NR_*_syscall_max] =
&amp;__*_sys_ni_cyscall.

The new script, scripts/syscalltbl.sh automatically fills holes
with __SYSCALL(&lt;nr&gt;, sys_ni_syscall), hence such ugly code can
go away. The designated initializers, '[nr] =' are also unneeded.

Also, there is no need to give __NR_*_syscall_max+1 because the array
size is implied by the number of syscalls in the generated headers.
Hence, there is no need to include &lt;asm/unistd.h&gt;, either.

Signed-off-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/r/20210517073815.97426-4-masahiroy@kernel.org

</content>
</entry>
<entry>
<title>x86/syscalls: Switch to generic syscalltbl.sh</title>
<updated>2021-05-20T13:03:58+00:00</updated>
<author>
<name>Masahiro Yamada</name>
<email>masahiroy@kernel.org</email>
</author>
<published>2021-05-17T07:38:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6218d0f6b8dece1f2e82f0a47a0e6b8ecb631ef6'/>
<id>urn:sha1:6218d0f6b8dece1f2e82f0a47a0e6b8ecb631ef6</id>
<content type='text'>
Many architectures duplicate similar shell scripts.

Convert x86 and UML to use scripts/syscalltbl.sh. The generic script
generates seperate headers for x86/64 and x86/x32 syscalls, while the x86
specific script coalesced them into one. Adjust the code accordingly.

Signed-off-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/r/20210517073815.97426-3-masahiroy@kernel.org
</content>
</entry>
<entry>
<title>x86/entry: Drop asmlinkage from syscalls</title>
<updated>2020-03-21T15:03:25+00:00</updated>
<author>
<name>Brian Gerst</name>
<email>brgerst@gmail.com</email>
</author>
<published>2020-03-13T19:51:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0f78ff17112d8b3469b805ff4ea9780cc1e5c93b'/>
<id>urn:sha1:0f78ff17112d8b3469b805ff4ea9780cc1e5c93b</id>
<content type='text'>
asmlinkage is no longer required since the syscall ABI is now fully under
x86 architecture control.  This makes the 32-bit native syscalls a bit more
effecient by passing in regs via EAX instead of on the stack.

Signed-off-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Dominik Brodowski &lt;linux@dominikbrodowski.net&gt;
Reviewed-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Link: https://lkml.kernel.org/r/20200313195144.164260-18-brgerst@gmail.com

</content>
</entry>
<entry>
<title>x86/entry: Remove ABI prefixes from functions in syscall tables</title>
<updated>2020-03-21T15:03:23+00:00</updated>
<author>
<name>Brian Gerst</name>
<email>brgerst@gmail.com</email>
</author>
<published>2020-03-13T19:51:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cab56d3484d4bb8b21e4d9500392ac1ce99af026'/>
<id>urn:sha1:cab56d3484d4bb8b21e4d9500392ac1ce99af026</id>
<content type='text'>
Move the ABI prefixes to the __SYSCALL_[abi]() macros.  This allows removal
of the need to strip the prefix for UML.

Signed-off-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20200313195144.164260-13-brgerst@gmail.com

</content>
</entry>
</feed>
