summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)AuthorFilesLines
2018-07-13drm/amd/display: Move common GPIO registers into a common defineCharlene Liu1-2/+5
Signed-off-by: Charlene Liu <charlene.liu@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amd/display: Linux Set/Read link rate and lane count through debugfsHersen Wu2-0/+82
expose dc function to be called by linux dm Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Reviewed-by: Sun peng Li <Sunpeng.Li@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amd/display: Implement cursor multiplierKrunoslav Kovac3-4/+24
DCN allows cursor multiplier when blending FP16 surface. Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Reviewed-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amd/display: support access ddc for mst branchEric Yang1-0/+4
[Why] Megachip dockings accesses ddc line through display driver when installing FW. Previously, we would fail every transaction because link attached to mst branch did not have their ddc transaction type set. [How] Set ddc transaction type when mst branch is connected. Signed-off-by: Eric Yang <Eric.Yang2@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amd/display: Add avoid_vbios_exec_table debug bitTony Cheng1-0/+1
Signed-off-by: Tony Cheng <tony.cheng@amd.com> Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amd/display: Separate HUBP surface size and rotation/mirror programmingEric Bernstein2-13/+23
Separate HUBP surface size and rotation/mirror programming so that HUBP revision without mirror/rotation do not access those register fields. Signed-off-by: Eric Bernstein <eric.bernstein@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13Revert "drm/amd/display: make dm_dp_aux_transfer return payload bytes ↵Harry Wentland5-13/+21
instead of size" This reverts commit cc195141133ac3e767d930bedd8294ceebf1f10b. This commit was problematic on other OSes. The real solution is to leave all the error checking to DRM and don't do it in DC, which is addressed by "Return aux replies directly to DRM" later in this patchset. v2: Add reason for revert. Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13Revert "drm/amd/display: Don't return ddc result and read_bytes in same ↵Harry Wentland3-22/+13
return value" This reverts commit 8a61bc085ffab3071c59efcbeff4044c034e7490. Need to revert "make dm_dp_aux_transfer return payload bytes instead of size", which this commit is based on. That commit was problematic on other OSes. The real solution is to leave all the error checking to DRM and don't do it in DC, which is addressed by "Return aux replies directly to DRM" later in this patchset. v2: Add reason for revert. Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amdgpu: Warn and update pin_size values when destroying a pinned BOMichel Dänzer1-7/+25
This shouldn't happen, but if it does, we'll get a backtrace of the caller, and update the pin_size values as needed. v2: * Check bo->pin_count instead of placement flags (Christian König) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amdgpu: Make pin_size values atomicMichel Dänzer4-21/+23
Concurrent execution of the non-atomic arithmetic could result in completely bogus values. v2: * Rebased on v2 of the previous patch Cc: stable@vger.kernel.org Bugzilla: https://bugs.freedesktop.org/106872 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amdgpu: Keep track of amount of pinned CPU visible VRAMMichel Dänzer5-19/+14
Instead of CPU invisible VRAM. Preparation for the following, no functional change intended. v2: * Also change amdgpu_vram_mgr_bo_invisible_size to amdgpu_vram_mgr_bo_visible_size, allowing further simplification (Christian König) Cc: stable@vger.kernel.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/scheduler: modify args of drm_sched_entity_initNayan Deshmukh7-14/+11
replace run queue by a list of run queues and remove the sched arg as that is part of run queue itself Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amdgpu: fix TTM move entity init orderChristian König1-16/+21
We are initializing the entity before the scheduler is actually initialized. This can lead to all kind of problem, but especially NULL pointer deref because of Nayan's scheduler work. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amd: Use newly added interrupt source defs for SOC15.Andrey Grodzovsky7-15/+29
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amd: Add interrupt source definitions for SOC15 v3.Andrey Grodzovsky9-0/+359
Stop using 'magic numbers' when registering interrupt sources. v2: Switch to kernel style comments. v3: Rebase. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amd: Use newly added interrupt source defs for VI v3.Andrey Grodzovsky12-26/+46
v2: Rebase v3: Use defines for CP_SQ and CP_ECC_ERROR interrupts. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amd: Add interrupt source definitions for VI v3.Andrey Grodzovsky1-0/+98
Stop using 'magic numbers' when registering interrupt sources. v2: Clean redundant comments. Switch to kernel style comments. v3: Add CP_ECC_ERROR define Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amd/powerplay: convert the sclk/mclk into Mhz for comparationEvan Quan1-2/+2
Convert the clocks into right Mhz unit. Otherwise, it will miss the equal situation. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amd/powerplay: no need to mask workable gfxoff feature for vega12Evan Quan1-1/+1
Gfxoff feature for vega12 is workable. So, there is no need to mask it any more. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amd/powerplay: add vega12 SMU gfxoff support v3Evan Quan3-0/+46
Export apis for enabling/disabling SMU gfxoff support. v2: fit the latest gfxoff support framework v3: add feature_mask control Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Huang Rui <ray.huang at amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amdgpu: reduce the idle period that RLC has to wait before request CGCGEvan Quan1-4/+7
Gfxoff feature may depends on the CGCG(on vega12, that's the case). This change will help to enable gfxoff feature more frequently. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amdgpu: no touch for the reserved bit of RLC_CGTT_MGCG_OVERRIDEEvan Quan1-4/+11
On vega12, the bit0 of RLC_CGTT_MGCG_OVERRIDE is reserved. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amdgpu: drop mmRLC_PG_CNTL clear v2Evan Quan1-3/+0
SMU owns this register so the driver should not set it to avoid breaking gfxoff. v2: update description Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher at amd.com> Reviewed-by: Huang Rui <ray.huang at amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amdgpu: correct rlc save restore list initialization for v2_1Evan Quan1-6/+12
The save restore list initialization does not have to be pg guarded. And for some asic(e.g. Vega12), it does not have cntl/gpm/srm lists. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amdgpu: init CSIB regardless of rlc version and pg statusEvan Quan1-1/+2
CSIB init has no relation with rlc version and pg status. It should be needed regardless of them. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amdgpu: pin the csb buffer on hw init v2Evan Quan1-0/+40
Without this pin, the csb buffer will be filled with inconsistent data after S3 resume. And that will causes gfx hang on gfxoff exit since this csb will be executed then. v2: fit amdgpu_bo_pin change(take one less argument) Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13drm/amdgpu/pp/smu7: use a local variable for toc indexingAlex Deucher1-11/+12
Rather than using the index variable stored in vram. If the device fails to come back online after a resume cycle, reads from vram will return all 1s which will cause a segfault. Based on a patch from Thomas Martitz <kugel@rockbox.org>. This avoids the segfault, but we still need to sort out why the GPU does not come back online after a resume. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=105760 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2018-07-13drm: drop _mode_ from remaining connector functionsDaniel Vetter1-3/+3
Since there's very few callers of these I've decided to do them all in one patch. With this the unecessarily long drm_mode_connector_ prefix is gone from the codebase! The only exception being struct drm_mode_connector_set_property, which is part of the uapi so can't be renamed. Again done with sed+some manual fixups for indent issues. Reviewed-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180709084016.23750-8-daniel.vetter@ffwll.ch
2018-07-13drm: drop _mode_ from drm_mode_connector_attach_encoderDaniel Vetter3-3/+3
Again to align with the usual prefix of just drm_connector_. Again done with sed + manual fixup for indent issues. Reviewed-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180709084016.23750-7-daniel.vetter@ffwll.ch
2018-07-13drm: drop _mode_ from update_edit_property()Daniel Vetter3-6/+6
Just makes it longer, and for most things in drm_connector.[hc] we just use the drm_connector_ prefix. Done with sed + a bit of manual fixup for the indenting. Reviewed-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180709084016.23750-6-daniel.vetter@ffwll.ch
2018-07-12amd/dc/dce100: On dce100, set clocks to 0 on suspendDavid Francis1-3/+16
[Why] When a dce100 asic was suspended, the clocks were not set to 0. Upon resume, the new clock was compared to the existing clock, they were found to be the same, and so the clock was not set. This resulted in a pernicious blackscreen. [How] In atomic commit, check to see if there are any active pipes. If no, set clocks to 0 Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-12drm/amd/display: Convert 10kHz clks from PPLib into kHz for VegaHarry Wentland1-2/+3
The driver is expecting clock frequency in kHz, while SMU returns the values in 10kHz, which causes the bandwidth validation to fail 4.18 has the faulty clock assignment in pp_to_dc_clock_levels_with_latency only, which is only used by Vega. Make sure we multiply these values by 10 here, as we do for other ASICs as powerplay assigned them wrong. 4.19 has the proper fix in powerplay. v2: Add Fixes tag v3: Fixes -> Bugzilla, with simplified link Bugzilla: https://bugs.freedesktop.org/107082 Signed-off-by: Mikita Lipski <mikita.lipski@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-12drm/amdkfd: Clean up reference of radeonYong Zhao5-6/+41
Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdkfd: Replace mqd with mqd_mgr as the variable name for mqd_managerYong Zhao5-66/+68
This will make reading code much easier. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdkfd: Use module parameters noretry as the internal variable nameYong Zhao3-8/+10
This makes all module parameters use the same form. Meanwhile clean up the surrounding code. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdkfd: Introduce KFD module parameter halt_if_hws_hangYong Zhao3-0/+16
This avoids triggering a GPU reset or otherwise changing the HW state. Instead KFD will hang, which allows HW debugging tools to analyze the problem. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdkfd: Add debugfs interface to trigger HWS hangShaoyun Liu5-0/+113
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdgpu: Avoid destroy hqd when GPU is on resetShaoyun Liu3-0/+9
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdgpu: Avoid invalidate tlbs when gpu is on resetShaoyun Liu3-0/+9
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdkfd: Fix kernel queue 64 bit doorbell offset calculationShaoyun Liu1-4/+5
The bitmap index calculation should reverse the logic used on allocation so it will clear the same bit used on allocation Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdgpu: Check NULL pointer for job before reset job's ringShaoyun Liu1-1/+1
job could be NULL when amdgpu_device_gpu_recover is called Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdgpu: Don't use shadow BO for compute contextShaoyun Liu1-5/+8
Compute contexts cannot keep going after a GPU reset. Currently the process must terminate. In the future a process may be able recreate its context from scratch. Either way, there is no need to restore the GPUVM page table from shadow BOs. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdkfd: Implement hang detection in KFD and call amdgpuShaoyun Liu2-1/+24
The reset will be performed in a new hw_exception work thread to handle HWS hang without blocking the thread that detected the hang. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdgpu: Enable the gpu reset from KFDShaoyun Liu5-2/+14
Hook up the gpu_recover callback from KFD to amdgpu to enable handling of GPU hangs detected by KFD. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdkfd: Implement GPU reset handlers in KFDShaoyun Liu5-3/+75
Lock KFD and evict existing queues on reset. Notify user mode by signaling hw_exception events. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdgpu: Call KFD reset handlers during GPU resetShaoyun Liu3-0/+29
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdkfd: Add gpu reset interface and place holderShaoyun Liu3-0/+16
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amd: Add gpu reset interfaces between amdgpu and amdkfdShaoyun Liu1-0/+10
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdkfd: fix zero reading of VMID and PASID for HawaiiLan Xiao7-10/+77
Upon VM Fault, the VMID and PASID written by HW are zeros in Hawaii. Instead of reading from ih_ring_entry, read directly from the registers. This workaround fix the soft hang issues caused by mishandled VM Fault in Hawaii. Signed-off-by: Lan Xiao <Lan.Xiao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-12drm/amdkfd: Handle VM faults in KFDshaoyunl6-6/+97
1. Pre-GFX9 the amdgpu ISR saves the vm-fault status and address per per-vmid. amdkfd needs to get the information from amdgpu through the new get_vm_fault_info interface. On GFX9 and later, all the required information is in the IH ring 2. amdkfd unmaps all queues from the faulting process and create new run-list without the guilty process 3. amdkfd notifies the runtime of the vm fault trap via EVENT_TYPE_MEMORY Signed-off-by: shaoyun liu <shaoyun.liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>