author: Kuan-Wei Chiu <visitorckw@gmail.com> 2025-09-30 07:30:55 +0300
committer: Georgi Djakov <djakov@kernel.org> 2025-10-24 18:02:26 +0300
commit: 245f14f5fe283c782b16143280f283bee29dbb5f
tree: 0b8400cea590b061a12e7f71d3b4c35d7dd3f4d5
parent: 3a8660878839faadb4f1a6dd72c3179c1df56787
interconnect: Optimize kbps_to_icc() macro

The current expansion of kbps_to_icc() introduces unnecessary logic when
compiled from a general expression. Rewriting it allows compilers to emit
shorter and more efficient code across architectures.

For example, with gcc -O2:

arm64:
  old:
    tst     x0, 7
    add     w1, w0, 7
    cset    w2, ne
    cmp     w0, 0
    csel    w0, w1, w0, lt
    add     w0, w2, w0, asr 3
  new:
    add     w1, w0, 14
    adds    w0, w0, 7
    csel    w0, w1, w0, mi
    asr     w0, w0, 3

x86-64:
  old:
    xor     eax, eax
    test    dil, 7
    lea     edx, [rdi+7]
    setne   al
    test    edi, edi
    cmovns  edx, edi
    sar     edx, 3
    add     eax, edx
  new:
    lea     eax, [rdi+14]
    add     edi, 7
    cmovns  eax, edi
    sar     eax, 3

In both cases the old form relies on extra test and compare instructions
(tst, test, cmp) combined with conditional moves or sets, while the new
form uses fewer instructions by folding the addition and flag update
together (adds on arm64, add on x86).

This reduces the instruction sequence, prevents multiple evaluations of x
when it is an expression or a function call, and keeps the macro simpler.

Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Link: https://lore.kernel.org/r/20250930043055.2200322-1-visitorckw@gmail.com
Signed-off-by: Georgi Djakov <djakov@kernel.org>
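
The diff itself is not shown on this page, so the following is only a sketch of the two macro shapes that the assembly listings appear to correspond to: a divide plus a remainder test for the old form, and a single add-then-divide for the new one. The names KBPS_TO_ICC_OLD, KBPS_TO_ICC_NEW, counted(), and the helper functions are hypothetical and exist only for illustration; the macro bodies are inferred from the generated code, not copied from the patch. The sketch compares the two forms over non-negative inputs only, since under C's truncating division they can differ for negative values (for example, -1 gives 1 with the first form and 0 with the second).

```c
/* Assumed shapes, inferred from the assembly in the commit message. */
#define KBPS_TO_ICC_OLD(x) ((x) / 8 + ((x) % 8 ? 1 : 0))
#define KBPS_TO_ICC_NEW(x) (((x) + 7) / 8)

/* Counts how many times counted() has been called. */
static int calls;

/* Returns v unchanged and records that the argument was evaluated. */
static int counted(int v)
{
	calls++;
	return v;
}

/* Returns 1 if both forms agree for every x in [0, limit], else 0. */
static int forms_agree_up_to(int limit)
{
	for (int x = 0; x <= limit; x++)
		if (KBPS_TO_ICC_OLD(x) != KBPS_TO_ICC_NEW(x))
			return 0;
	return 1;
}

/* Number of times the old form evaluates its argument. */
static int evaluations_old(int v)
{
	calls = 0;
	(void)KBPS_TO_ICC_OLD(counted(v));
	return calls;
}

/* Number of times the new form evaluates its argument. */
static int evaluations_new(int v)
{
	calls = 0;
	(void)KBPS_TO_ICC_NEW(counted(v));
	return calls;
}
```

Under these assumptions, evaluations_old() reports two evaluations of the argument and evaluations_new() reports one, which is the double-evaluation hazard the message refers to when x is a function call or an expression with side effects.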