author | Peter Zijlstra <peterz@infradead.org> | 2025-09-02 12:20:35 +0300 |
---|---|---|
committer | Peter Zijlstra <peterz@infradead.org> | 2025-09-04 22:59:09 +0300 |
commit | 4a1e02b15ac174c3c6d5e358e67c4ba980e7b336 (patch) | |
tree | 1b324bdca09051f329179fd0eb0cfb4826e9748b /tools/perf/scripts/python/parallel-perf.py | |
parent | 85a2d4a890dce3cfc9c14aa91afc3dd7af8e3bf5 (diff) | |
x86,retpoline: Optimize patch_retpoline()

Currently the very common retpoline: "CS CALL __x86_indirect_thunk_r11"
is transformed into "CALL *R11; NOP3" for eIBRS/BHI_NO parts.

Similarly, paranoid fineibt has: "CALL *R11; NOP".

Recognise that CS stuffing can avoid the extra NOP. However, due to
prefix decode penalties, make sure not to emit too many CS prefixes.
Notably: "CS CALL __x86_indirect_thunk_rax" must not become "CS CS CS
CS CALL *RAX". Prefix decode penalties typically cost many more cycles
than decoding an extra NOP.

Additionally, if the retpoline is a tail-call, the "JMP *%\reg" should
be followed by INT3 for straight-line-speculation mitigation. Since
emit_indirect() now has a length argument, move this into
emit_indirect() such that other users (paranoid-fineibt) also do this.
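
The padding rules described above can be illustrated with a small user-space sketch. This is not the kernel's emit_indirect(); the names sketch_emit_indirect(), enum sketch_op and SKETCH_MAX_CS are hypothetical. The cap of three CS prefixes is an assumption inferred from the two examples in the message (three prefixes fit the r11 case, four are rejected in the rax case), and filling the whole tail after a JMP with INT3 is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum sketch_op { SKETCH_CALL, SKETCH_JMP };

/* Assumed cap: more CS prefixes than this and a NOP is the better padding. */
#define SKETCH_MAX_CS 3

/*
 * Encode "CALL *%reg" or "JMP *%reg" into a site of @len bytes.
 * Returns the number of bytes written; the caller is expected to
 * fill any remaining (len - return value) bytes with NOPs.
 */
static size_t sketch_emit_indirect(enum sketch_op op, unsigned int reg,
                                   uint8_t *bytes, size_t len)
{
    size_t insn_len = 2 + (reg >= 8);   /* opcode + ModRM (+ REX.B) */
    size_t excess, cs = 0, int3 = 0, i = 0, n;
    uint8_t modrm;

    assert(reg < 16 && len >= insn_len);
    excess = len - insn_len;

    if (op == SKETCH_CALL) {
        modrm = 0xd0;                   /* Mod = 3, Reg = 2: CALL r/m */
        if (excess <= SKETCH_MAX_CS)
            cs = excess;                /* stuff CS prefixes instead of a NOP */
    } else {
        modrm = 0xe0;                   /* Mod = 3, Reg = 4: JMP r/m */
        int3 = excess;                  /* INT3 after the tail-call JMP (SLS) */
    }

    for (n = 0; n < cs; n++)
        bytes[i++] = 0x2e;              /* CS segment-override prefix */

    if (reg >= 8)
        bytes[i++] = 0x41;              /* REX.B selects r8..r15 */

    bytes[i++] = 0xff;                  /* opcode group 5 */
    bytes[i++] = modrm | (reg & 7);

    for (n = 0; n < int3; n++)
        bytes[i++] = 0xcc;              /* INT3 */

    return i;
}
```

With a 6-byte site, the r11 call fills the site completely as "2e 2e 2e 41 ff d3", while the rax call emits only "ff d0" and leaves four bytes for the caller to pad with a NOP, which matches the "must not become CS CS CS CS CALL *RAX" constraint.
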
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250902104627.GM4068168@noisy.programming.kicks-ass.net