kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c, branch v5.15.3

drm/amdgpu: fix a potential memory leak in amdgpu_device_fini_sw()

2021-11-18T18:16:43+00:00

[ Upstream commit a5c5d8d50ecf5874be90a76e1557279ff8a30c9e ] amdgpu_fence_driver_sw_fini() should be executed before amdgpu_device_ip_fini(), otherwise fence driver resource won't be properly freed as adev->rings have been tore down. Fixes: 72c8c97b1522 ("drm/amdgpu: Split amdgpu_device_fini into early and late") Signed-off-by: Lang Yu Reviewed-by: Andrey Grodzovsky Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin

drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stage

2021-11-18T18:16:25+00:00

[ Upstream commit 6effad8abe0ba4db3d9c58ed585127858a990f35 ] adev->rmmio is set to be NULL in amdgpu_device_unmap_mmio to prevent access after pci_remove, however, in SRIOV case, amdgpu_virt_release_full_gpu will still use adev->rmmio for access after amdgpu_device_unmap_mmio. The patch is to move such SRIOV calling earlier to fini_early stage. Fixes: 07775fc13878 ("drm/amdgpu: Unmap all MMIO mappings") Cc: Andrey Grodzovsky Signed-off-by: Leslie Shi Signed-off-by: Guchun Chen Reviewed-by: Andrey Grodzovsky Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin

drm/amdgpu: move iommu_resume before ip init/resume

2021-11-18T18:16:09+00:00

[ Upstream commit 9cec53c18a3170c7e5673c414da56aeecee94832 ] Separate iommu_resume from kfd_resume, and move it before other amdgpu ip init/resume. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin

drm/amdgpu: revert "Add autodump debugfs node for gpu reset v8"

2021-11-06T13:13:31+00:00

commit c8365dbda056578eebe164bf110816b1a39b4b7f upstream. This reverts commit 728e7e0cd61899208e924472b9e641dbeb0775c4. Further discussion reveals that this feature is severely broken and needs to be reverted ASAP. GPU reset can never be delayed by userspace even for debugging or otherwise we can run into in kernel deadlocks. Signed-off-by: Christian König Acked-by: Alex Deucher Acked-by: Nirmoy Das Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman

drm/amdkfd: fix boot failure when iommu is disabled in Picasso.

2021-11-06T13:13:30+00:00

commit afd18180c07026f94a80ff024acef5f4159084a4 upstream. When IOMMU disabled in sbios and kfd in iommuv2 path, iommuv2 init will fail. But this failure should not block amdgpu driver init. Reported-by: youling Tested-by: youling Signed-off-by: Yifan Zhang Reviewed-by: James Zhu Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman

drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume

2021-10-05T17:02:31+00:00

In current code, when a PCI error state pci_channel_io_normal is detectd, it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI driver will continue the execution of PCI resume callback report_resume by pci_walk_bridge, and the callback will go into amdgpu_pci_resume finally, where write lock is releasd unconditionally without acquiring such lock first. In this case, a deadlock will happen when other threads start to acquire the read lock. To fix this, add a member in amdgpu_device strucutre to cache pci_channel_state, and only continue the execution in amdgpu_pci_resume when it's pci_channel_io_frozen. Fixes: c9a6b82f45e2 ("drm/amdgpu: Implement DPC recovery") Suggested-by: Andrey Grodzovsky Signed-off-by: Guchun Chen Reviewed-by: Andrey Grodzovsky Signed-off-by: Alex Deucher

drm/amdgpu: init iommu after amdkfd device init

2021-10-05T17:02:20+00:00

This patch is to fix clinfo failure in Raven/Picasso: Number of platforms: 1 Platform Profile: FULL_PROFILE Platform Version: OpenCL 2.2 AMD-APP (3364.0) Platform Name: AMD Accelerated Parallel Processing Platform Vendor: Advanced Micro Devices, Inc. Platform Extensions: cl_khr_icd cl_amd_event_callback Platform Name: AMD Accelerated Parallel Processing Number of devices: 0 Signed-off-by: Yifan Zhang Reviewed-by: James Zhu Tested-by: James Zhu Acked-by: Felix Kuehling Signed-off-by: Alex Deucher

drm/amdgpu: move iommu_resume before ip init/resume

2021-09-16T13:56:24+00:00

Separate iommu_resume from kfd_resume, and move it before other amdgpu ip init/resume. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org

drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-20T16:09:44+00:00

schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when the disable count transitions from 0 to 1, and only schedule the delayed work on the reverse transition, not if the disable count was already 0. This makes sure the delayed work doesn't run at unexpected times, and allows it to be lock-free. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. v3: * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König) v4: * Fix race condition between amdgpu_gfx_off_ctrl incrementing adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off checking for it to be 0 (Evan Quan) Cc: stable@vger.kernel.org Reviewed-by: Evan Quan Reviewed-by: Lijo Lazar # v3 Acked-by: Christian König # v3 Signed-off-by: Michel Dänzer Signed-off-by: Alex Deucher

drm/amd/amdgpu:flush ttm delayed work before cancel_sync

2021-08-18T22:23:00+00:00

[Why] In some cases when we unload driver, warning call trace will show up in vram_mgr_fini which claims that LRU is not empty, caused by the ttm bo inside delay deleted queue. [How] We should flush delayed work to make sure the delay deleting is done. Signed-off-by: YuBiao Wang Reviewed-by: Andrey Grodzovsky Signed-off-by: Alex Deucher