summaryrefslogtreecommitdiff
path: root/arch/powerpc/include/asm/processor.h
diff options
context:
space:
mode:
authorChristophe Leroy <christophe.leroy@csgroup.eu>2021-10-19 10:29:17 +0300
committerMichael Ellerman <mpe@ellerman.id.au>2021-12-09 14:41:17 +0300
commit526d4a4c77aedf1b7df1133e5cced29c70232e6e (patch)
treeeae41ed69c6e5b71842206dcda4a9bc7e961bf51 /arch/powerpc/include/asm/processor.h
parentdf415cd758261bceff27f34a145dd8328bbfb018 (diff)
downloadlinux-526d4a4c77aedf1b7df1133e5cced29c70232e6e.tar.xz
powerpc/32s: Do kuep_lock() and kuep_unlock() in assembly
When interrupt and syscall entries where converted to C, KUEP locking and unlocking was also converted. It improved performance by unrolling the loop, and allowed easily implementing boot time deactivation of KUEP. However, null_syscall selftest shows that KUEP is still heavy (361 cycles with KUEP, 212 cycles without). A way to improve more is to group 'mtsr's together, instead of repeating 'addi' + 'mtsr' several times. In order to do that, more registers need to be available. In C, GCC will always be able to provide the requested number of registers, but at the cost of saving some data on the stack, which is counter performant here. So let's do it in assembly, when we have full control of which register can be used. It also has the advantage of locking earlier and unlocking later and it helps GCC generating less tricky code. The only drawback is to make boot time deactivation less straight forward and require 'hand' instruction patching. Group 'mtsr's by 4. With this change, null_syscall selftest reports 336 cycles. Without the change it was 361 cycles, that's a 7% reduction. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/115cb279e9b9948dfd93a065e047081c59e3a2a6.1634627931.git.christophe.leroy@csgroup.eu
Diffstat (limited to 'arch/powerpc/include/asm/processor.h')
0 files changed, 0 insertions, 0 deletions