author: Peter Zijlstra <peterz@infradead.org> 2025-09-02 12:20:35 +0300
committer: Peter Zijlstra <peterz@infradead.org> 2025-09-04 22:59:09 +0300
commit: 4a1e02b15ac174c3c6d5e358e67c4ba980e7b336
tree: 1b324bdca09051f329179fd0eb0cfb4826e9748b /tools/perf/scripts/python/flamegraph.py
parent: 85a2d4a890dce3cfc9c14aa91afc3dd7af8e3bf5
x86,retpoline: Optimize patch_retpoline()

Currently the very common retpoline: "CS CALL __x86_indirect_thunk_r11"
is transformed into "CALL *R11; NOP3" for eIBRS/BHI_NO parts. Similarly,
paranoid fineibt has: "CALL *R11; NOP".

Recognise that CS stuffing can avoid the extra NOP. However, due to
prefix decode penalties, make sure not to emit too many CS prefixes.
Notably: "CS CALL __x86_indirect_thunk_rax" must not become
"CS CS CS CS CALL *RAX". Prefix decode penalties are typically many more
cycles than decoding an extra NOP.

Additionally, if the retpoline is a tail-call, the "JMP *%\reg" should
be followed by INT3 for straight-line-speculation mitigation. Since
emit_indirect() now has a length argument, move this into
emit_indirect() such that other users (paranoid-fineibt) also do this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250902104627.GM4068168@noisy.programming.kicks-ass.net
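
The byte-level effect described above can be illustrated with a small userspace sketch. This is not the kernel's emit_indirect(); the name sketch_emit_indirect(), the OP_CALL/OP_JMP constants, and the SKETCH_MAX_CS threshold of three prefixes are assumptions made for illustration. The commit message only says "too many" CS prefixes must be avoided and does not state a number. Filling every leftover byte after a JMP with INT3 is likewise a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OP_CALL 0xe8    /* opcode of the direct CALL being replaced */
#define OP_JMP  0xe9    /* opcode of the direct JMP being replaced */

/* Assumed limit; the commit message does not give the actual threshold. */
#define SKETCH_MAX_CS 3

/*
 * Hypothetical helper: write "CALL *reg" or "JMP *reg" into bytes[],
 * where reg is 0 (RAX) .. 15 (R15) and len is the size of the original
 * call site. Returns the number of bytes written; the caller is
 * expected to NOP-pad whatever remains of the site.
 */
static size_t sketch_emit_indirect(uint8_t op, unsigned int reg,
                                   uint8_t *bytes, size_t len)
{
    size_t insn_len = 2 + (reg >= 8);   /* FF /r, plus REX.B for R8-R15 */
    size_t excess, i = 0, n, cs = 0, int3 = 0;
    uint8_t modrm;

    assert(reg < 16 && len >= insn_len);
    excess = len - insn_len;

    if (op == OP_CALL) {
        modrm = 0xd0;                   /* Mod = 3, Reg = 2: CALL r/m64 */
        /* Stuff CS prefixes only while the count stays small. */
        if (excess <= SKETCH_MAX_CS)
            cs = excess;
    } else {
        modrm = 0xe0;                   /* Mod = 3, Reg = 4: JMP r/m64 */
        /* Tail-call: trap straight-line speculation past the JMP. */
        int3 = excess;
    }

    for (n = 0; n < cs; n++)
        bytes[i++] = 0x2e;              /* CS segment-override prefix */
    if (reg >= 8)
        bytes[i++] = 0x41;              /* REX.B */
    bytes[i++] = 0xff;                  /* opcode */
    bytes[i++] = modrm | (reg & 7);
    for (n = 0; n < int3; n++)
        bytes[i++] = 0xcc;              /* INT3 */

    return i;
}
```

Under these assumptions, a 6-byte site targeting R11 is filled completely as "2e 2e 2e 41 ff d3", while a 6-byte site targeting RAX would need four prefixes, so only "ff d0" is written and the remaining four bytes are left for NOP padding.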
Diffstat (limited to 'tools/perf/scripts/python/flamegraph.py')
0 files changed, 0 insertions, 0 deletions