| author | I Hsin Cheng <richard120310@gmail.com> | 2025-02-26 09:56:23 +0300 | 
|---|---|---|
| committer | Yury Norov <yury.norov@gmail.com> | 2025-02-28 20:36:11 +0300 | 
| commit | 1e7933a575ed8af4a64dd5089c2f6912da66dd79 (patch) | |
| tree | b9df5dd724949ae5c564cff12f2497d423ab8a23 /tools/perf/scripts/python/mem-phys-addr.py | |
| parent | 14c384131ea09fb70e9e01b0a3f2c3d3cd56d832 (diff) | |
uapi: Revert "bitops: avoid integer overflow in GENMASK(_ULL)"
This patch reverts commit c32ee3d9abd2 ("bitops: avoid integer overflow
in GENMASK(_ULL)").
Reverting this commit shrinks the generated code by over 1KB.
The original commit claimed that clang emitted warnings for the
implementation in use at that time.
The patch was applied and tested against numerous compilers, including
gcc-13, gcc-12, gcc-11 cross-compiler, clang-17, clang-18 and clang-19.
Various warning levels were set (-W=0, -W=1, -W=2) and CONFIG_WERROR
disabled to complete the compilation. The results show that no compilation
errors or warnings were generated due to the patch.
The results of code size reduction are summarized in the following table.
The code size changes for clang are all zero across different versions,
so they're not listed in the table.
For NR_CPUS=64 on x86_64.
----------------------------------------------
|           |   gcc-13 |   gcc-12 |   gcc-11 |
----------------------------------------------
|       old | 22438085 | 22453915 | 22302033 |
----------------------------------------------
|       new | 22436816 | 22452913 | 22300826 |
----------------------------------------------
| new - old |    -1269 |    -1002 |    -1207 |
----------------------------------------------
For NR_CPUS=1024 on x86_64.
----------------------------------------------
|           |   gcc-13 |   gcc-12 |   gcc-11 |
----------------------------------------------
|       old | 22493682 | 22509812 | 22357661 |
----------------------------------------------
|       new | 22493230 | 22509487 | 22357250 |
----------------------------------------------
| new - old |     -452 |     -325 |     -411 |
----------------------------------------------
For the arm64 architecture, a gcc cross-compiler was used, and a VM
running under QEMU executed a CPU-heavy workload to ensure no side
effects and that functionality remained correct. The test also showed
a reduction in code size:
* Before: 31660668
* After: 31658724
* Difference (After - Before): -1944
An analysis of multiple functions compiled with gcc-13 on x86_64 was
performed. In summary, the patch eliminates one negation in almost
every use case. However, negative effects may occur in some cases,
such as the generation of an additional "mov" instruction or increased
register usage. The use of "~_UL(0) << (l)" may even result in the
allocation of "%r*" registers instead of "%e*" registers (which are
32-bit registers) because the compiler cannot assume that the higher
bits are zero.
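
To make the comparison concrete, the two ways of building the low side
of the mask can be sketched as below. This is a standalone illustration,
not the kernel header: GENMASK_SHIFT, GENMASK_SUB, BITS_PER_LONG_SKETCH
and genmask_forms_agree() are hypothetical names, and the subtraction
form is an assumption about the shape of the reverted commit that
should be checked against the tree. The shift form follows the
"~_UL(0) << (l)" expression quoted above.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define BITS_PER_LONG_SKETCH ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Shift form: the low side of the mask is "~0UL << l". */
#define GENMASK_SHIFT(h, l) \
	((~0UL << (l)) & (~0UL >> (BITS_PER_LONG_SKETCH - 1 - (h))))

/*
 * Subtraction form (assumed shape of the reverted commit): the low side
 * is "~0UL - (1UL << l) + 1", which is -(1UL << l) in modular arithmetic
 * and so may cost a negation on top of the shift.
 */
#define GENMASK_SUB(h, l) \
	((~0UL - (1UL << (l)) + 1) & \
	 (~0UL >> (BITS_PER_LONG_SKETCH - 1 - (h))))

/* Returns 1 if both forms agree for every valid pair with l <= h. */
static int genmask_forms_agree(void)
{
	for (int h = 0; h < BITS_PER_LONG_SKETCH; h++)
		for (int l = 0; l <= h; l++)
			if (GENMASK_SHIFT(h, l) != GENMASK_SUB(h, l))
				return 0;
	return 1;
}
```

Both forms yield the same mask for every valid (h, l) pair, so any
difference shows up only in the instructions the compiler selects.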
Yury:
We limit GENMASK() usage with the const_true(l > h) condition, and
most users just call it with constant parameters. For those, the
actual implementation of the macro doesn't matter, and since it
triggered clang warnings back then, it was reasonable to work around
the warnings on the kernel side.
Now that some find_bit() functions call GENMASK() with runtime
parameters (although the const_true() condition holds), this ended up
hurting the generated code, as I Hsin discovered. This is especially
bad because it hurts the small_const_nbits() optimization, where people
are most concerned about generated code quality. So, revert it to the
original version for good.
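
As an illustration of a runtime-parameter caller, a single-word bit
search in the style of the small_const_nbits() fast path might look
like the sketch below. It is not the kernel's find_bit code:
find_next_bit_one_word(), GENMASK_SKETCH and BITS_PER_LONG_SKETCH are
hypothetical names, size is assumed to be at most the number of bits in
an unsigned long, and __builtin_ctzl() is the GCC/Clang builtin used
here in place of the kernel's own helper.

```c
#include <limits.h>

/* Hypothetical stand-ins for the kernel's BITS_PER_LONG and GENMASK(). */
#define BITS_PER_LONG_SKETCH \
	((unsigned long)(sizeof(unsigned long) * CHAR_BIT))
#define GENMASK_SKETCH(h, l) \
	((~0UL << (l)) & (~0UL >> (BITS_PER_LONG_SKETCH - 1 - (h))))

/*
 * Find the first set bit at or after 'offset' in a bitmap that fits in
 * one word. Returns 'size' if no such bit exists. Both 'size' and
 * 'offset' are runtime values, so the mask expression is compiled into
 * real instructions instead of being folded to a constant.
 */
static unsigned long find_next_bit_one_word(unsigned long word,
					    unsigned long size,
					    unsigned long offset)
{
	unsigned long val;

	if (offset >= size)
		return size;

	/* Keep only bits offset..size-1; here l <= h always holds. */
	val = word & GENMASK_SKETCH(size - 1, offset);
	return val ? (unsigned long)__builtin_ctzl(val) : size;
}
```

In a caller like this, every extra instruction in the mask computation
lands directly on the fast path.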
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
