summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm
diff options
context:
space:
mode:
authorMario Limonciello <mario.limonciello@amd.com>2026-02-05 19:42:54 +0300
committerSasha Levin <sashal@kernel.org>2026-03-12 14:09:32 +0300
commit378dff71efddd15f34124bf9d7c98cd69cd05286 (patch)
treefb973e02155e6d78190dc79c60c65be9ce8d162e /drivers/gpu/drm
parent6c80b35076bcf4e2d43be0f81d61926211b8959d (diff)
downloadlinux-378dff71efddd15f34124bf9d7c98cd69cd05286.tar.xz
drm/amd: Fix hang on amdgpu unload by using pci_dev_is_disconnected()
[ Upstream commit f7afda7fcd169a9168695247d07ad94cf7b9798f ] The commit 6a23e7b4332c ("drm/amd: Clean up kfd node on surprise disconnect") introduced early KFD cleanup when drm_dev_is_unplugged() returns true. However, this causes hangs during normal module unload (rmmod amdgpu). The issue occurs because drm_dev_unplug() is called in amdgpu_pci_remove() for all removal scenarios, not just surprise disconnects. This was done intentionally in commit 39934d3ed572 ("Revert "drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled"") to fix IGT PCI software unplug test failures. As a result, drm_dev_is_unplugged() returns true even during normal module unload, triggering the early KFD cleanup inappropriately. The correct check should distinguish between: - Actual surprise disconnect (eGPU unplugged): pci_dev_is_disconnected() returns true - Normal module unload (rmmod): pci_dev_is_disconnected() returns false Replace drm_dev_is_unplugged() with pci_dev_is_disconnected() to ensure the early cleanup only happens during true hardware disconnect events. Cc: stable@vger.kernel.org Reported-by: Cal Peake <cp@absolutedigital.net> Closes: https://lore.kernel.org/all/b0c22deb-c0fa-3343-33cf-fd9a77d7db99@absolutedigital.net/ Fixes: 6a23e7b4332c ("drm/amd: Clean up kfd node on surprise disconnect") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Diffstat (limited to 'drivers/gpu/drm')
-rw-r--r--drivers/gpu/drm/amd/amdgpu/amdgpu_device.c4
1 files changed, 2 insertions, 2 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index fb096bf551ef..dbcd55611a37 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4991,7 +4991,7 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
* before ip_fini_early to prevent kfd locking refcount issues by calling
* amdgpu_amdkfd_suspend()
*/
- if (drm_dev_is_unplugged(adev_to_drm(adev)))
+ if (pci_dev_is_disconnected(adev->pdev))
amdgpu_amdkfd_device_fini_sw(adev);
amdgpu_device_ip_fini_early(adev);
@@ -5003,7 +5003,7 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
amdgpu_gart_dummy_page_fini(adev);
- if (drm_dev_is_unplugged(adev_to_drm(adev)))
+ if (pci_dev_is_disconnected(adev->pdev))
amdgpu_device_unmap_mmio(adev);
}