summaryrefslogtreecommitdiff
path: root/tools/perf/scripts/python/task-analyzer.py
diff options
context:
space:
mode:
authorKrister Johansen <kjlx@templeofstupid.com>2025-01-08 02:24:58 +0300
committerMikulas Patocka <mpatocka@redhat.com>2025-01-08 17:29:39 +0300
commit80f130bfad1dab93b95683fc39b87235682b8f72 (patch)
tree9dc2f9a9821cffb4541eb66101c6c411def717f3 /tools/perf/scripts/python/task-analyzer.py
parent47f33c27fc9565fb0bc7dfb76be08d445cd3d236 (diff)
downloadlinux-80f130bfad1dab93b95683fc39b87235682b8f72.tar.xz
dm thin: make get_first_thin use rcu-safe list first function
The documentation in rculist.h explains the absence of list_empty_rcu() and cautions programmers against relying on a list_empty() -> list_first() sequence in RCU safe code. This is because each of these functions performs its own READ_ONCE() of the list head. This can lead to a situation where the list_empty() sees a valid list entry, but the subsequent list_first() sees a different view of list head state after a modification. In the case of dm-thin, this author had a production box crash from a GP fault in the process_deferred_bios path. This function saw a valid list head in get_first_thin() but when it subsequently dereferenced that and turned it into a thin_c, it got the inside of the struct pool, since the list was now empty and referring to itself. The kernel on which this occurred printed both a warning about a refcount_t being saturated, and a UBSAN error for an out-of-bounds cpuid access in the queued spinlock, prior to the fault itself. When the resulting kdump was examined, it was possible to see another thread patiently waiting in thin_dtr's synchronize_rcu. The thin_dtr call managed to pull the thin_c out of the active thins list (and have it be the last entry in the active_thins list) at just the wrong moment which lead to this crash. Fortunately, the fix here is straight forward. Switch get_first_thin() function to use list_first_or_null_rcu() which performs just a single READ_ONCE() and returns NULL if the list is already empty. This was run against the devicemapper test suite's thin-provisioning suites for delete and suspend and no regressions were observed. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Fixes: b10ebd34ccca ("dm thin: fix rcu_read_lock being held in code that can sleep") Cc: stable@vger.kernel.org Acked-by: Ming-Hung Tsai <mtsai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Diffstat (limited to 'tools/perf/scripts/python/task-analyzer.py')
0 files changed, 0 insertions, 0 deletions