powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 - starfive-tech/linux.git - StarFive Tech Linux Kernel for VisionFive (JH7110) boards (mirror)

diff options

author	Christophe Leroy <christophe.leroy@csgroup.eu>	2021-04-13 19:38:09 +0300
committer	Michael Ellerman <mpe@ellerman.id.au>	2021-08-14 15:00:01 +0300
commit	db87a7199229b75c9996bf78117eceb81854fce2 (patch)
tree	2d09c0cd1575509900a2c25f9f4c8e6ec32fbb15 /scripts/mod/modpost.c
parent	1c6b5a7e74052768977855f95d6b8812f6e7772c (diff)
download	linux-db87a7199229b75c9996bf78117eceb81854fce2.tar.xz

powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32

powerpc BUG_ON() and WARN_ON() are based on using twnei instruction. For catching simple conditions like a variable having value 0, this is efficient because it does the test and the trap at the same time. But most conditions used with BUG_ON or WARN_ON are more complex and forces GCC to format the condition into a 0 or 1 value in a register. This will usually require 2 to 3 instructions. The most efficient solution would be to use __builtin_trap() because GCC is able to optimise the use of the different trap instructions based on the requested condition, but this is complex if not impossible for the following reasons: - __builtin_trap() is a non-recoverable instruction, so it can't be used for WARN_ON - Knowing which line of code generated the trap would require the analysis of DWARF information. This is not a feature we have today. As mentioned in commit 8d4fbcfbe0a4 ("Fix WARN_ON() on bitfield ops") the way WARN_ON() is implemented is suboptimal. That commit also mentions an issue with 'long long' condition. It fixed it for WARN_ON() but the same problem still exists today with BUG_ON() on PPC32. It will be fixed by using the generic implementation. By using the generic implementation, gcc will naturally generate a branch to the unconditional trap generated by BUG(). As modern powerpc implement zero-cycle branch, that's even more efficient. And for the functions using WARN_ON() and its return, the test on return from WARN_ON() is now also used for the WARN_ON() itself. On PPC64 we don't want it because we want to be able to use CFAR register to track how we entered the code that trapped. The CFAR register would be clobbered by the branch. A simple test function: unsigned long test9w(unsigned long a, unsigned long b) { if (WARN_ON(!b)) return 0; return a / b; } Before the patch: 0000046c <test9w>: 46c: 7c 89 00 34 cntlzw r9,r4 470: 55 29 d9 7e rlwinm r9,r9,27,5,31 474: 0f 09 00 00 twnei r9,0 478: 2c 04 00 00 cmpwi r4,0 47c: 41 82 00 0c beq 488 <test9w+0x1c> 480: 7c 63 23 96 divwu r3,r3,r4 484: 4e 80 00 20 blr 488: 38 60 00 00 li r3,0 48c: 4e 80 00 20 blr After the patch: 00000468 <test9w>: 468: 2c 04 00 00 cmpwi r4,0 46c: 41 82 00 0c beq 478 <test9w+0x10> 470: 7c 63 23 96 divwu r3,r3,r4 474: 4e 80 00 20 blr 478: 0f e0 00 00 twui r0,0 47c: 38 60 00 00 li r3,0 480: 4e 80 00 20 blr So we see before the patch we need 3 instructions on the likely path to handle the WARN_ON(). With the patch the trap goes on the unlikely path. See below the difference at the entry of system_call_exception where we have several BUG_ON(), allthough less impressing. With the patch: 00000000 <system_call_exception>: 0: 81 6a 00 84 lwz r11,132(r10) 4: 90 6a 00 88 stw r3,136(r10) 8: 71 60 00 02 andi. r0,r11,2 c: 41 82 00 70 beq 7c <system_call_exception+0x7c> 10: 71 60 40 00 andi. r0,r11,16384 14: 41 82 00 6c beq 80 <system_call_exception+0x80> 18: 71 6b 80 00 andi. r11,r11,32768 1c: 41 82 00 68 beq 84 <system_call_exception+0x84> 20: 94 21 ff e0 stwu r1,-32(r1) 24: 93 e1 00 1c stw r31,28(r1) 28: 7d 8c 42 e6 mftb r12 ... 7c: 0f e0 00 00 twui r0,0 80: 0f e0 00 00 twui r0,0 84: 0f e0 00 00 twui r0,0 Without the patch: 00000000 <system_call_exception>: 0: 94 21 ff e0 stwu r1,-32(r1) 4: 93 e1 00 1c stw r31,28(r1) 8: 90 6a 00 88 stw r3,136(r10) c: 81 6a 00 84 lwz r11,132(r10) 10: 69 60 00 02 xori r0,r11,2 14: 54 00 ff fe rlwinm r0,r0,31,31,31 18: 0f 00 00 00 twnei r0,0 1c: 69 60 40 00 xori r0,r11,16384 20: 54 00 97 fe rlwinm r0,r0,18,31,31 24: 0f 00 00 00 twnei r0,0 28: 69 6b 80 00 xori r11,r11,32768 2c: 55 6b 8f fe rlwinm r11,r11,17,31,31 30: 0f 0b 00 00 twnei r11,0 34: 7d 8c 42 e6 mftb r12 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b286e07fb771a664b631cd07a40b09c06f26e64b.1618331881.git.christophe.leroy@csgroup.eu

Diffstat (limited to 'scripts/mod/modpost.c')

0 files changed, 0 insertions, 0 deletions


context:
space:
mode: