diff options
author | gao xu <gaoxu2@honor.com> | 2025-02-19 04:56:28 +0300 |
---|---|---|
committer | Andrew Morton <akpm@linux-foundation.org> | 2025-03-06 08:36:14 +0300 |
commit | c3e998398de48a7528842f05858a3a6bb21002e6 (patch) | |
tree | 4b126fd89bf969fadc1c1039bfef44054076bd4d /tools/perf/util/scripting-engines/trace-event-python.c | |
parent | 19fac3c93991502a22c5132824c40b6a2e64b136 (diff) | |
download | linux-c3e998398de48a7528842f05858a3a6bb21002e6.tar.xz |
mm: fix possible NULL pointer dereference in __swap_duplicate
Add a NULL check on the return value of swp_swap_info in __swap_duplicate
to prevent crashes caused by NULL pointer dereference.
The reason why swp_swap_info() returns NULL is unclear; it may be due
to CPU cache issues or DDR bit flips. The probability of this issue is
very small - it has been observed to occur approximately 1 in 500,000
times per week. The stack info we encountered is as follows:
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000058
[RB/E]rb_sreason_str_set: sreason_str set null_pointer
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
Data abort info:
ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 39-bit VAs, pgdp=00000008a80e5000
[0000000000000058] pgd=0000000000000000, p4d=0000000000000000,
pud=0000000000000000
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Skip md ftrace buffer dump for: 0x1609e0
...
pc : swap_duplicate+0x44/0x164
lr : copy_page_range+0x508/0x1e78
sp : ffffffc0f2a699e0
x29: ffffffc0f2a699e0 x28: ffffff8a5b28d388 x27: ffffff8b06603388
x26: ffffffdf7291fe70 x25: 0000000000000006 x24: 0000000000100073
x23: 00000000002d2d2f x22: 0000000000000008 x21: 0000000000000000
x20: 00000000002d2d2f x19: 18000000002d2d2f x18: ffffffdf726faec0
x17: 0000000000000000 x16: 0010000000000001 x15: 0040000000000001
x14: 0400000000000001 x13: ff7ffffffffffb7f x12: ffeffffffffffbff
x11: ffffff8a5c7e1898 x10: 0000000000000018 x9 : 0000000000000006
x8 : 1800000000000000 x7 : 0000000000000000 x6 : ffffff8057c01f10
x5 : 000000000000a318 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000006daf200000 x1 : 0000000000000001 x0 : 18000000002d2d2f
Call trace:
swap_duplicate+0x44/0x164
copy_page_range+0x508/0x1e78
copy_process+0x1278/0x21cc
kernel_clone+0x90/0x438
__arm64_sys_clone+0x5c/0x8c
invoke_syscall+0x58/0x110
do_el0_svc+0x8c/0xe0
el0_svc+0x38/0x9c
el0t_64_sync_handler+0x44/0xec
el0t_64_sync+0x1a8/0x1ac
Code: 9139c35a 71006f3f 54000568 f8797b55 (f9402ea8)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception
SMP: stopping secondary CPUs
The patch seems to only provide a workaround, but there are no more
effective software solutions to handle the bit flips problem. This path
will change the issue from a system crash to a process exception, thereby
reducing the impact on the entire machine.
akpm: this is probably a kernel bug, but this patch keeps the system
running and doesn't reduce that bug's debuggability.
Link: https://lkml.kernel.org/r/e223b0e6ba2f4924984b1917cc717bd5@honor.com
Signed-off-by: gao xu <gaoxu2@honor.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'tools/perf/util/scripting-engines/trace-event-python.c')
0 files changed, 0 insertions, 0 deletions