kernel/linux.git/drivers/gpu/drm/amd, branch v4.18.18

drm/amd/display: Signal hw_done() after waiting for flip_done()

2018-11-04T13:51:47+00:00

[ Upstream commit 987bf116445db5d63a5c2ed94c4479687d9c9973 ] In amdgpu_dm_commit_tail(), wait until flip_done() is signaled before we signal hw_done(). [Why] This is to temporarily address a paging error that occurs when a nonblocking commit contends with another commit, particularly in a mirrored display configuration where at least 2 CRTCs are updated. The error occurs in drm_atomic_helper_wait_for_flip_done(), when we attempt to access the contents of new_crtc_state->commit. Here's the sequence for a mirrored 2 display setup (irrelevant steps left out for clarity): **THREAD 1** | **THREAD 2** | Initialize atomic state for flip | | Queue worker | ... | Do work for flip | | Signal hw_done() on CRTC 1 | Signal hw_done() on CRTC 2 | | Wait for flip_done() on CRTC 1 <---- **PREEMPTED BY THREAD 1** Initialize atomic state for cursor | update (1) | | Do cursor update work on both CRTCs | | Clear atomic state (2) | **DONE** | ... | | Wait for flip_done() on CRTC 2 | *ERROR* | The issue starts with (1). When the atomic state is initialized, the current CRTC states are duplicated to be the new_crtc_states, and referenced to be the old_crtc_states. (The new_crtc_states are to be filled with update data.) Some things to note: * Due to the mirrored configuration, the cursor updates on both CRTCs. * At this point, the pflip IRQ has already been handled, and flip_done signaled on all CRTCs. The cursor commit can therefore continue. * The old_crtc_states used by the cursor update are the **same states** as the new_crtc_states used by the flip worker. At (2), the old_crtc_state is freed (*), and the cursor commit completes. We then context switch back to the flip worker, where we attempt to access the new_crtc_state->commit object. This is problematic, as this state has already been freed. (*) Technically, 'state->crtcs[i].state' is freed, which was made to reference old_crtc_state in drm_atomic_helper_swap_state() [How] By moving hw_done() after wait_for_flip_done(), we're guaranteed that the new_crtc_state (from the flip worker's perspective) still exists. This is because any other commit will be blocked, waiting for the hw_done() signal. Note that both the i915 and imx drivers have this sequence flipped already, masking this problem. Signed-off-by: Shirish S Signed-off-by: Leo Li Reviewed-by: Harry Wentland Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin

drm/amdkfd: Fix ATS capablity was not reported correctly on some APUs

2018-10-18T07:18:17+00:00

[ Upstream commit 44d8cc6f1a905e4bb1d4221a898abb0d7e9d100a ] Because CRAT_CU_FLAGS_IOMMU_PRESENT was not set in some BIOS crat, we need to workaround this. For future compatibility, we also overwrite the bit in capability according to the value of needs_iommu_device. Acked-by: Alex Deucher Signed-off-by: Yong Zhao Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman

drm/amdkfd: Change the control stack MTYPE from UC to NC on GFX9

2018-10-18T07:18:17+00:00

[ Upstream commit 15426dbb65c5b37680d27e84d58a1ed3b8532518 ] CWSR fails on Raven if the control stack is MTYPE_UC, which is used for regular GART mappings. As a workaround we map it using MTYPE_NC. The MEC firmware expects the control stack at one page offset from the start of the MQD so it is part of the MQD allocation on GFXv9. AMDGPU added a memory allocation flag just for this purpose. Acked-by: Alex Deucher Signed-off-by: Yong Zhao Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman

drm/amdgpu: Fix SDMA HQD destroy error on gfx_v7

2018-10-18T07:18:17+00:00

[ Upstream commit caaa4c8a6be2a275bd14f2369ee364978ff74704 ] A wrong register bit was examinated for checking SDMA status so it reports false failures. This typo only appears on gfx_v7. gfx_v8 checks the correct bit. Acked-by: Alex Deucher Signed-off-by: Amber Lin Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman

drm/amdgpu: Fix vce work queue was not cancelled when suspend

2018-10-13T07:33:11+00:00

commit 61ea6f5831974ebd1a57baffd7cc30600a2e26fc upstream. The vce cancel_delayed_work_sync never be called. driver call the function in error path. This caused the A+A suspend hang when runtime pm enebled. As we will visit the smu in the idle queue. this will cause smu hang because the dgpu has been suspend, and the dgpu also will be waked up. As the smu has been hang, so the dgpu resume will failed. Reviewed-by: Christian König Reviewed-by: Feifei Xu Signed-off-by: Rex Zhu Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman

Revert "drm/amd/pp: Send khz clock values to DC for smu7/8"

2018-10-10T06:56:08+00:00

This reverts commit 93b100ddda3be284be160e9ccba28c7f8f21ab73 which was commit c3cb424a086921f6bb0449b10d998352a756d6d5 upstream. It was not needed for 4.18.y and caused problems there. Reported-by: Alexander Deucher Cc: Harry Wentland Cc: Sasha Levin Signed-off-by: Greg Kroah-Hartman

drm/amdgpu: fix error handling in amdgpu_cs_user_fence_chunk

2018-10-10T06:56:03+00:00

[ Upstream commit 0165de983272d1fae0809ed9db47c46a412279bc ] Slowly leaking memory one page at a time :) Signed-off-by: Christian König Reviewed-by: Andrey Grodzovsky Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman

drm/amdgpu: Fix SDMA hang in prt mode v2

2018-10-10T06:56:03+00:00

[ Upstream commit 68ebc13ea40656fddd3803735d621921a2d74a5e ] Fix SDMA hang in prt mode, clear XNACK_WATERMARK in reg SDMA0_UTCL1_WATERMK to avoid the issue Affected ASICs: VEGA10 VEGA12 RV1 RV2 v2: add reg clear for SDMA1 Signed-off-by: Tao Zhou Tested-by: Yukun Li Reviewed-by: Hawking Zhang Acked-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman

drm/amdgpu: Need to set moved to true when evict bo

2018-10-03T23:59:26+00:00

[ Upstream commit 6ddd9769db4fc11a98bd7e58be1764e47fdb8384 ] Fix the VMC page fault when the running sequence is as below: 1.amdgpu_gem_create_ioctl 2.ttm_bo_swapout->amdgpu_vm_bo_invalidate, as not called amdgpu_vm_bo_base_init, so won't called list_add_tail(&base->bo_list, &bo->va). Even the bo was evicted, it won't set the bo_base->moved. 3.drm_gem_open_ioctl->amdgpu_vm_bo_base_init, here only called list_move_tail(&base->vm_status, &vm->evicted), but not set the bo_base->moved. 4.amdgpu_vm_bo_map->amdgpu_vm_bo_insert_map, as the bo_base->moved is not set true, the function amdgpu_vm_bo_insert_map will call list_move(&bo_va->base.vm_status, &vm->moved) 5.amdgpu_cs_ioctl won't validate the swapout bo, as it is only in the moved list, not in the evict list. So VMC page fault occurs. Signed-off-by: Emily Deng Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman

drm/amdgpu: Update power state at the end of smu hw_init.

2018-10-03T23:59:26+00:00

[ Upstream commit 2ab4d0e74256fc49b7b270f63c1d1e47c2455abc ] For SI/Kv, the power state is managed by function amdgpu_pm_compute_clocks. when dpm enabled, we should call amdgpu_pm_compute_clocks to update current power state instand of set boot state. this change can fix the oops when kfd driver was enabled on Kv. Reviewed-by: Alex Deucher Tested-by: Michel Dänzer Signed-off-by: Rex Zhu Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman