summaryrefslogtreecommitdiff
path: root/scripts/bash-completion/make
diff options
context:
space:
mode:
authorVitaly Prosyak <vitaly.prosyak@amd.com>2026-04-14 06:07:55 +0300
committerAlex Deucher <alexander.deucher@amd.com>2026-04-17 21:48:37 +0300
commitd91db49e97382efe38b51f5943c1a7073c0f06cd (patch)
tree88ddadc585d1682741c69398dd857a249bdf4ff5 /scripts/bash-completion/make
parent97284621c5bf513fa013f9a1c67b42f267732ce4 (diff)
downloadlinux-d91db49e97382efe38b51f5943c1a7073c0f06cd.tar.xz
drm/amdgpu: fix NULL pointer dereference in amdgpu_devcoredump_format
A race condition in the devcoredump code causes a NULL pointer dereference in amdgpu_devcoredump_format() when multiple GPU resets occur in quick succession. The sequence of events: 1. First reset calls amdgpu_coredump(), creates coredump1, sets adev->coredump = coredump1, and queues the deferred work. 2. The deferred work begins executing (work_pending() returns false since the work is now running, not just queued). 3. A second reset calls amdgpu_coredump(). work_pending() returns false because the work is running, so amdgpu_coredump() proceeds: creates coredump2, overwrites adev->coredump = coredump2, and re-queues the deferred work with queue_work(). 4. The first deferred work finishes and unconditionally sets adev->coredump = NULL, destroying the reference to coredump2. 5. The re-queued deferred work starts and reads adev->coredump = NULL. It then passes this NULL into amdgpu_devcoredump_format() which dereferences coredump->adev (offset 0 in the struct), triggering: KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] RIP: 0010:amdgpu_devcoredump_format+0xa6/0x36b0 [amdgpu] This was observed during the amd_deadlock IGT test where multiple subtests trigger rapid ring resets. The dmesg log shows four coredumps created within 120ms (at 102.377s, 104.424s, 104.492s, and 104.497s), with the crash occurring 13ms after the last one. Fix this with two changes: - Replace work_pending() with work_busy() in amdgpu_coredump() to also reject new coredumps while the deferred work is executing, not just when it is queued. This closes the main race window. - Add a defensive NULL check for adev->coredump at the start of amdgpu_devcoredump_deferred_work() to prevent the crash if the race still occurs (work_busy() is advisory, not a full barrier). v2: Drop the job->pasid NULL guard -- that fix was independently submitted and merged as commit 4c1f0a162da5 ("drm/amdgpu: add job->pasid in check as amdgpu_job could be NULL") by Sunil Khatri, reviewed by Christian König. Integrate with that patch as suggested by Christian. Fixes: 4bbba79a7f1d ("drm/amdgpu: move devcoredump generation to a worker") Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Diffstat (limited to 'scripts/bash-completion/make')
0 files changed, 0 insertions, 0 deletions