diff options
author | Peter Zijlstra <peterz@infradead.org> | 2023-11-20 17:33:45 +0300 |
---|---|---|
committer | Ingo Molnar <mingo@kernel.org> | 2023-11-21 15:57:30 +0300 |
commit | c516213726fb572700cce4a5909aa8d82b77192a (patch) | |
tree | 58e59e4af6716d5276dd928e75e4a0d3fbd1a048 /arch/x86/entry/calling.h | |
parent | 98b1cc82c4affc16f5598d4fa14b1858671b2263 (diff) | |
download | linux-c516213726fb572700cce4a5909aa8d82b77192a.tar.xz |
x86/entry: Optimize common_interrupt_return()
The code in common_interrupt_return() does a bunch of unconditional
work that is really only needed on PTI kernels. Specifically it
unconditionally copies the IRET frame back onto the entry stack,
swizzles onto the entry stack and does IRET from there.
However, without PTI we can simply IRET from whatever stack we're on.
ivb-ep, mitigations=off, gettid-1m:
PRE:
140,118,538 cycles:k ( +- 0.01% )
236,692,878 instructions:k # 1.69 insn per cycle ( +- 0.00% )
POST:
140,026,608 cycles:k ( +- 0.01% )
236,696,176 instructions:k # 1.69 insn per cycle ( +- 0.00% )
(this is with --repeat 100 and the run-to-run variance is bigger than
the difference shown)
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20231120143626.638107480@infradead.org
Diffstat (limited to 'arch/x86/entry/calling.h')
-rw-r--r-- | arch/x86/entry/calling.h | 12 |
1 files changed, 9 insertions, 3 deletions
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h index f6907627172b..9f1d94790a54 100644 --- a/arch/x86/entry/calling.h +++ b/arch/x86/entry/calling.h @@ -175,8 +175,7 @@ For 32-bit we have the following conventions - kernel is built with #define THIS_CPU_user_pcid_flush_mask \ PER_CPU_VAR(cpu_tlbstate) + TLB_STATE_user_pcid_flush_mask -.macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req - ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI +.macro SWITCH_TO_USER_CR3 scratch_reg:req scratch_reg2:req mov %cr3, \scratch_reg ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID @@ -206,13 +205,20 @@ For 32-bit we have the following conventions - kernel is built with /* Flip the PGD to the user version */ orq $(PTI_USER_PGTABLE_MASK), \scratch_reg mov \scratch_reg, %cr3 +.endm + +.macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req + ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI + SWITCH_TO_USER_CR3 \scratch_reg \scratch_reg2 .Lend_\@: .endm .macro SWITCH_TO_USER_CR3_STACK scratch_reg:req + ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI pushq %rax - SWITCH_TO_USER_CR3_NOSTACK scratch_reg=\scratch_reg scratch_reg2=%rax + SWITCH_TO_USER_CR3 scratch_reg=\scratch_reg scratch_reg2=%rax popq %rax +.Lend_\@: .endm .macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req |