summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)AuthorFilesLines
2022-11-15drm/amd/display: Fix prefetch calculations for dcn32Dillon Varone1-0/+2
[Description] Prefetch calculation loop was not exiting until utilizing all of vstartup if it failed once. Locals need to be reset on each iteration of the loop. Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Dillon Varone <Dillon.Varone@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-15drm/amd/display: Fix optc2_configure warning on dcn314Roman Li1-1/+1
[Why] dcn314 uses optc2_configure_crc() that wraps optc1_configure_crc() + set additional registers not applicable to dcn314. It's not critical but when used leads to warning like: WARNING: drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c Call Trace: <TASK> generic_reg_set_ex+0x6d/0xe0 [amdgpu] optc2_configure_crc+0x60/0x80 [amdgpu] dc_stream_configure_crc+0x129/0x150 [amdgpu] amdgpu_dm_crtc_configure_crc_source+0x5d/0xe0 [amdgpu] [How] Use optc1_configure_crc() directly Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-15drm/amd/display: Fix calculation for cursor CAB allocationGeorge Shen1-9/+5
[Why] The cursor size (in memory) is currently incorrectly calculated, resulting not enough CAB being allocated for static screen cursor in MALL refresh. This results in cursor image corruption. [How] Use cursor pitch instead of cursor width when calculating cursor size. Update num cache lines calculation to use the result of the cursor size calculation instead of manually recalculating again. Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-15drm/amd/display: Support parsing VRAM info v3.0 from VBIOSGeorge Shen1-0/+30
[Why] For DCN3.2 and DCN3.21, VBIOS has switch to using v3.0 of the VRAM info struct. We should read and override the VRAM info in driver with values provided by VBIOS to support memory downbin cases. Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-15drm/amd/display: Fix invalid DPIA AUX reply causing system hangStylon Wang2-6/+20
[Why] Some DPIA AUX replies have incorrect data length from original request. This could lead to overwriting of destination buffer if reply length is larger, which could cause invalid access to stack since many destination buffers are declared as local variables. [How] Check for invalid length from DPIA AUX replies and trigger a retry if reply length is not the same as original request. A DRM_WARN() dmesg log is also produced. Reviewed-by: Roman Li <Roman.Li@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-15drm/amdgpu: Add psp_13_0_10_ta firmware to modinfoCandice Li1-0/+1
TA firmware loaded on psp v13_0_10, but it is missing in modinfo. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15drm/amd/display: Add HUBP surface flip interrupt handlerRodrigo Siqueira1-0/+1
On IGT, there is a test named amd_hotplug, and when the subtest basic is executed on DCN31, we get the following error: [drm] *ERROR* [CRTC:71:crtc-0] flip_done timed out [drm] *ERROR* flip_done timed out [drm] *ERROR* [CRTC:71:crtc-0] commit wait timed out [drm] *ERROR* flip_done timed out [drm] *ERROR* [CONNECTOR:88:DP-1] commit wait timed out [drm] *ERROR* flip_done timed out [drm] *ERROR* [PLANE:59:plane-3] commit wait timed out After enable the page flip log with the below command: echo -n 'format "[PFLIP]" +p' > /sys/kernel/debug/dynamic_debug/control It is possible to see that the flip was submitted, but DC never replied back, which generates time-out issues. This is an indication that the HUBP surface flip is missing. This commit fixes this issue by adding hubp1_set_flip_int to DCN31. Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-11-15drm/amd/display: Fix access timeout to DPIA AUX at boot timeStylon Wang1-6/+6
[Why] Since introduction of patch "Query DPIA HPD status.", link detection at boot could be accessing DPIA AUX, which will not succeed until DMUB outbox messaging is enabled and results in below dmesg logs: [ 160.840227] [drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout! [How] Enable DMUB outbox messaging before link detection at boot time. Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-15drm/amdgpu: Fix memory leak in amdgpu_cs_pass1Dong Chenchen1-2/+4
When p->gang_size equals 0, amdgpu_cs_pass1() will return directly without freeing chunk_array, which will cause a memory leak issue, this patch fixes it. Fixes: 4624459c84d7 ("drm/amdgpu: add gang submit frontend v6") Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15drm/amdgpu: use the last IB as gang leader v2Christian König2-7/+17
It turned out that not the last IB specified is the gang leader, but instead the last job allocated. This is a bit unfortunate and not very intuitive for the CS interface, so try to fix this. Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221115094206.6181-1-christian.koenig@amd.com Tested-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Fixes: 4624459c84d7 ("drm/amdgpu: add gang submit frontend v6")
2022-11-10Merge tag 'drm-misc-fixes-2022-11-09' of ↵Dave Airlie1-1/+1
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v6.1-rc5: - HDMI fixes to vc4. - Make panfrost's uapi header compile with C++. - Add rotation quirks for 2 panels. - Fix s/r in amdgpu_vram_mgr_new - Handle 1 gb boundary correctly in panfrost mmu code. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e02de501-4b85-28a0-3f6e-751ca13f5f9d@linux.intel.com
2022-11-10drm/amd/display: only fill dirty rectangles when PSR is enabledHamza Mahfooz1-3/+4
Currently, we are calling fill_dc_dirty_rects() even if PSR isn't supported by the relevant link in amdgpu_dm_commit_planes(), this is undesirable especially because when drm.debug is enabled we are printing messages in fill_dc_dirty_rects() that are only useful for debugging PSR (and confusing otherwise). So, we can instead limit the filling of dirty rectangles to only when PSR is enabled. Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-10drm/amdgpu: disable BACO on special BEIGE_GOBY cardGuchun Chen1-1/+3
Still avoid intermittent failure. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-11-10drm/amdgpu: Drop eviction lock when allocating PT BOPhilip Yang3-26/+28
Re-take the eviction lock immediately again after the allocation is completed, to fix circular locking warning with drm_buddy allocator. Move amdgpu_vm_eviction_lock/unlock/trylock to amdgpu_vm.h as they are called from multiple files. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-10drm/amdgpu: Unlock bo_list_mutex after error handlingPhilip Yang1-0/+1
Get below kernel WARNING backtrace when pressing ctrl-C to kill kfdtest application. If amdgpu_cs_parser_bos returns error after taking bo_list_mutex, as caller amdgpu_cs_ioctl will not unlock bo_list_mutex, this generates the kernel WARNING. Add unlock bo_list_mutex after amdgpu_cs_parser_bos error handling to cleanup bo_list userptr bo. WARNING: kfdtest/2930 still has locks held! 1 lock held by kfdtest/2930: (&list->bo_list_mutex){+.+.}-{3:3}, at: amdgpu_cs_ioctl+0xce5/0x1f10 [amdgpu] stack backtrace: dump_stack_lvl+0x44/0x57 get_signal+0x79f/0xd00 arch_do_signal_or_restart+0x36/0x7b0 exit_to_user_mode_prepare+0xfd/0x1b0 syscall_exit_to_user_mode+0x19/0x40 do_syscall_64+0x40/0x80 Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-10Revert "drm/amdgpu: Revert "drm/amdgpu: getting fan speed pwm for vega10 ↵Asher Song1-13/+12
properly"" This reverts commit 4545ae2ed3f2f7c3f615a53399c9c8460ee5bca7. The origin patch "drm/amdgpu: getting fan speed pwm for vega10 properly" works fine. Test failure is caused by test case self. Signed-off-by: Asher Song <Asher.Song@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-10drm/amd/display: Enforce minimum prefetch time for low memclk on DCN32Dillon Varone10-2/+26
[WHY?] Data return times when using lowest memclk can be <= 60us, which can cause underflow on high bandwidth displays with a workload. [HOW?] Enforce a minimum prefetch time during validation for low memclk modes. Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Dillon Varone <Dillon.Varone@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-10drm/amd/display: Fix gpio port mapping issueSteve Su2-3/+20
[Why] 1. Port of gpio has different mapping. [How] 1. Add a dummy entry in mapping table. 2. Fix incorrect mask bit field access. Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Steve Su <steve.su@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-10drm/amd/display: Fix reg timeout in enc314_enable_fifoNicholas Kazlauskas1-6/+18
[Why] The link enablement sequence can end up resetting the encoder while the PHY symclk isn't yet on. This means that waiting for symclk on will timeout, along with the reset bit never asserting high. This causes unnecessary delay when enabling the link and produces a warning affecting multiple IGT tests. [How] Don't wait for the symclk to be on here because firmware already does. Don't wait for reset if we know the symclk isn't on. Split the reset into a helper function that checks the bit and decides whether or not a delay is sufficient. Reviewed-by: Roman Li <Roman.Li@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-10drm/amd/display: Fix FCLK deviation and tool compile issuesChaitanya Dhere2-2/+2
[Why] Recent backports from open source do not have header inclusion pattern that is consistent with inclusion style in the rest of the file. This breaks the internal tool builds as well. A recent commit erronously modified the original DML formula for calculating ActiveClockChangeLatencyHidingY. This resulted in a FCLK deviation from the golden values. [How] Change the way in which display_mode_vba.h is included so that it is consistent with the inclusion style in rest of the file which also fixes the tool build. Restore the DML formula to its original state to fix the FCLK deviation. Reviewed-by: Aurabindo Pillai <Aurabindo.Pillai@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-10drm/amd/display: Zeromem mypipe heap struct before using itAurabindo Pillai1-0/+1
[Why&How] Bug was caused when moving variable from stack to heap because it was reusable and garbage was left over, so we need to zero mem. Reviewed-by: Martin Leung <Martin.Leung@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Martin Leung <Martin.Leung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-10drm/amd/display: Update SR watermarks for DCN314Nicholas Kazlauskas2-18/+18
[Why & How] New values requested by hardware after fine-tuning. Update for all memory types. Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-10drm/amdgpu: workaround for TLB seq raceChristian König1-0/+15
It can happen that we query the sequence value before the callback had a chance to run. Workaround that by grabbing the fence lock and releasing it again. Should be replaced by hw handling soon. Signed-off-by: Christian König <christian.koenig@amd.com> CC: stable@vger.kernel.org # 5.19+ Fixes: 5255e146c99a6 ("drm/amdgpu: rework TLB flushing") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2113 Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Stefan Springer <stefanspr94@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-10drm/amdkfd: Fix error handling in criu_checkpointFelix Kuehling1-19/+15
Checkpoint BOs last. That way we don't need to close dmabuf FDs if something else fails later. This avoids problematic access to user mode memory in the error handling code path. criu_checkpoint_bos has its own error handling and cleanup that does not depend on access to user memory. In the private data, keep BOs before the remaining objects. This is necessary to restore things in the correct order as restoring events depends on the events-page BO being restored first. Fixes: be072b06c739 ("drm/amdkfd: CRIU export BOs as prime dmabuf objects") Reported-by: Jann Horn <jannh@google.com> CC: Rajneesh Bhardwaj <Rajneesh.Bhardwaj@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-and-tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-11-10drm/amdkfd: Fix error handling in kfd_criu_restore_eventsFelix Kuehling1-2/+1
mutex_unlock before the exit label because all the error code paths that jump there didn't take that lock. This fixes unbalanced locking errors in case of restore errors. Fixes: 40e8a766a761 ("drm/amdkfd: CRIU checkpoint and restore events") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-11-10drm/amd/pm: update SMU IP v13.0.4 msg interface headerTim Huang1-8/+7
Some of the unused messages that were used earlier in development have been freed up as spare messages, no intended functional changes. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tim Huang <tim.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-08drm/amdgpu: Fix the lpfn checking condition in drm buddyMa Jun1-1/+1
Because the value of man->size is changed during suspend/resume process, use mgr->mm.size instead of man->size here for lpfn checking. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220914125331.2467162-1-Jun.Ma2@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>
2022-11-03drm/amdkfd: update GFX11 CWSR trap handlerJay Cornwall2-381/+389
With corresponding FW change fixes issue where triggering CWSR on a workgroup with waves in s_barrier wouldn't lead to a back-off and therefore cause a hang. Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Tested-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Graham Sider <Graham.Sider@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-03drm/amd/display: Investigate tool reported FCLK P-state deviationsNevenko Stupar1-1/+2
[Why] Fix for some of the tool reported modes for FCLK P-state deviations and UCLK P-state deviations that are coming from DSC terms and/or Scaling terms causing MinActiveFCLKChangeLatencySupported and MaxActiveDRAMClockChangeLatencySupported incorrectly calculated in DML for these configurations. Reviewed-by: Chaitanya Dhere <Chaitanya.Dhere@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Nevenko Stupar <Nevenko.Stupar@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-03drm/amd/display: Add DSC delay factor workaroundGeorge Shen7-7/+19
[Why] Certain 4K high refresh rate modes requiring DSC are exhibiting top of screen underflow corruption. Increasing the DSC delay by a factor of 6 percent stops the underflow for most use cases. [How] Multiply DSC delay requirement in DML by a factor. Add debug option to make this DSC delay factor configurable. Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-03drm/amd/display: Round up DST_after_scaler to nearest intGeorge Shen1-2/+2
[Why] The DST_after_scaler value that DML spreadsheet outputs is generally the driver value round up to the nearest int. Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-03drm/amd/display: Use forced DSC bpp in DMLGeorge Shen2-2/+2
[Why] DSC config is calculated separately from DML calculations. DML should use these separately calculated DSC params. The issue is that the calculated bpp is not properly propagated into DML. [How] Correctly used forced_bpp value in DML. Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-03drm/amd/display: Fix DCN32 DSC delay calculationGeorge Shen1-1/+1
[Why] DCN32 DSC delay calculation had an unintentional integer division, resulting in a mismatch against the DML spreadsheet. [How] Cast numerator to double before performing the division. Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-03drm/amdgpu: Disable GPU reset on SRIOV before remove pci.Gavin Wan1-1/+2
The recent change brought a bug on SRIOV envrionment. It caused unloading amdgpu failed on Guest VM. The reason is that the VF FLR was requested while unloading amdgpu driver, but the VF FLR of SRIOV sequence is wrong while removing PCI device. For SRIOV, the guest driver should not trigger the whole XGMI hive to do the reset. Host driver control how the device been reset. Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2") Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-03drm/amdgpu: disable GFXOFF during compute for GFX11Graham Sider1-0/+7
Temporary workaround to fix issues observed in some compute applications when GFXOFF is enabled on GFX11. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-03drm/amd: Fail the suspend if resources can't be evictedMario Limonciello1-5/+10
If a system does not have swap and memory is under 100% usage, amdgpu will fail to evict resources. Currently the suspend carries on proceeding to reset the GPU: ``` [drm] evicting device resources failed [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <vcn_v3_0> failed -12 [drm] free PSP TMR buffer [TTM] Failed allocating page table [drm] evicting device resources failed amdgpu 0000:03:00.0: amdgpu: MODE1 reset amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset ``` At this point if the suspend actually succeeded I think that amdgpu would have recovered because the GPU would have power cut off and restored. However the kernel fails to continue the suspend from the memory pressure and amdgpu fails to run the "resume" from the aborted suspend. ``` ACPI: PM: Preparing to enter system sleep state S3 SLUB: Unable to allocate memory on node -1, gfp=0xdc0(GFP_KERNEL|__GFP_ZERO) cache: Acpi-State, object size: 80, buffer size: 80, default order: 0, min order: 0 node 0: slabs: 22, objs: 1122, free: 0 ACPI Error: AE_NO_MEMORY, Could not update object reference count (20210730/utdelete-651) [drm:psp_hw_start [amdgpu]] *ERROR* PSP load kdb failed! [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62 amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-62). PM: dpm_run_callback(): pci_pm_resume+0x0/0x100 returns -62 amdgpu 0000:03:00.0: PM: failed to resume async: error -62 ``` To avoid this series of unfortunate events, fail amdgpu's suspend when the memory eviction fails. This will let the system gracefully recover and the user can try suspend again when the memory pressure is relieved. Reported-by: post@davidak.de Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2223 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-02drm/amdkfd: Fix NULL pointer dereference in svm_migrate_to_ram()Yang Li1-3/+1
./drivers/gpu/drm/amd/amdkfd/kfd_migrate.c:985:58-62: ERROR: p is NULL but dereferenced. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2549 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-11-02drm/amdgpu: correct MES debugfs versionsGraham Sider1-4/+6
Use mes.sched_version, mes.kiq_version for debugfs as mes.ucode_fw_version does not contain correct versioning information. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-02drm/amdgpu: set fb_modifiers_not_supported in vkmsYifan Zhang1-0/+2
This patch to fix the gdm3 start failure with virual display: /usr/libexec/gdm-x-session[1711]: (II) AMDGPU(0): Setting screen physical size to 270 x 203 /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): Failed to make import prime FD as pixmap: 22 /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): failed to set mode: Invalid argument /usr/libexec/gdm-x-session[1711]: (WW) AMDGPU(0): Failed to set mode on CRTC 0 /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): Failed to enable any CRTC gnome-shell[1840]: Running GNOME Shell (using mutter 42.2) as a X11 window and compositing manager /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): failed to set mode: Invalid argument vkms doesn't have modifiers support, set fb_modifiers_not_supported to bring the gdm back. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-02drm/amd/display: cursor update command incompleteMax Tseng1-0/+4
Missing send cursor_rect width & Height into DMUB. PSR-SU would use these information. But missing these assignment in last refactor commit Reported-by: Timur Kristóf <timur.kristof@gmail.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2227 Fixes: b73353f7f3d4 ("drm/amd/display: Use the same cursor info across features") Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Max Tseng <max.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-02drm/amd/display: Enable timing sync on DCN32Alvin Lee1-0/+1
Missed enabling timing sync on DCN32 because DCN32 has a different DML param. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Martin Leung <Martin.Leung@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-02drm/amd/display: Set memclk levels to be at least 1 for dcn32Dillon Varone1-0/+3
[Why] Cannot report 0 memclk levels even when SMU does not provide any. [How] When memclk levels reported by SMU is 0, set levels to 1. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Martin Leung <Martin.Leung@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Dillon Varone <Dillon.Varone@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-02drm/amd/display: Update latencies on DCN321Dillon Varone1-5/+5
Update DF related latencies based on new measurements. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Dillon Varone <Dillon.Varone@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-02drm/amd/display: Limit dcn32 to 1950Mhz display clockJun Lei1-4/+4
[why] Hardware team recommends we limit dispclock to 1950Mhz for all DCN3.2.x [how] Limit to 1950 when initializing clocks. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-02drm/amd/display: Ignore Cable ID FeatureFangzhi Zuo1-0/+3
Ignore cable ID for DP2 receivers that does not support the feature. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Roman Li <Roman.Li@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-02drm/amd/display: Update DSC capabilitie for DCN314Leo Chen1-1/+1
dcn314 has 4 DSC - conflicted hardware document updated and confirmed. Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Leo Chen <sancchen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-10-27drm/amdgpu: disallow gfxoff until GC IP blocks complete s2idle resumePrike Liang1-0/+16
In the S2idle suspend/resume phase the gfxoff is keeping functional so some IP blocks will be likely to reinitialize at gfxoff entry and that will result in failing to program GC registers.Therefore, let disallow gfxoff until AMDGPU IPs reinitialized completely. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.15.x
2022-10-24drm/amd/display: Revert logic for plane modifiersJoaquín Ignacio Aramendía1-43/+7
This file was split in commit 5d945cbcd4b16a29d6470a80dfb19738f9a4319f ("drm/amd/display: Create a file dedicated to planes") and the logic in dm_plane_format_mod_supported() function got changed by a switch logic. That change broke drm_plane modifiers setting on series 5000 APUs (tested on OXP mini AMD 5800U and HP Dev One 5850U PRO) leading to Gamescope not working as reported on GitHub[1] To reproduce the issue, enter a TTY and run: $ gamescope -- vkcube With said commit applied it will abort. This one restores the old logic, fixing the issue that affects Gamescope. [1](https://github.com/Plagman/gamescope/issues/624) Cc: <stable@vger.kernel.org> # 6.0.x Signed-off-by: Joaquín Ignacio Aramendía <samsagax@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amdkfd: correct the cache info for gfx1036Jesse Zhang1-1/+52
correct the cache information for gfx1036 Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-10-24drm/amdkfd: update gfx1037 Lx cache settingPrike Liang1-1/+52
Update the gfx1037 L1/L2 cache setting. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org