summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)AuthorFilesLines
2021-06-24Merge tag 'drm-misc-fixes-2021-06-24' of ↵Dave Airlie1-1/+13
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A DMA address check for nouveau, an error code return fix for kmb, fixes to wait for a moving fence after pinning the BO for amdgpu, nouveau and radeon, a crtc and async page flip fix for atmel-hlcdc and a cpu hang fix for vc4. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210624190353.wyizoil3wqrrxz5d@gilmour
2021-06-22drm/amdgpu: wait for moving fence after pinningChristian König1-1/+13
We actually need to wait for the moving fence after pinning the BO to make sure that the pin is completed. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> References: https://lore.kernel.org/dri-devel/20210621151758.2347474-1-daniel.vetter@ffwll.ch/ CC: stable@kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20210622114506.106349-3-christian.koenig@amd.com
2021-06-22Revert "drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue."Yifan Zhang1-5/+1
This reverts commit 4cbbe34807938e6e494e535a68d5ff64edac3f20. Reason for revert: side effect of enlarging CP_MEC_DOORBELL_RANGE may cause some APUs fail to enter gfxoff in certain user cases. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-06-22Revert "drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full ↵Yifan Zhang1-5/+1
doorbell." This reverts commit 1c0b0efd148d5b24c4932ddb3fa03c8edd6097b3. Reason for revert: Side effect of enlarging CP_MEC_DOORBELL_RANGE may cause some APUs fail to enter gfxoff in certain user cases. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-06-22drm/amdgpu: Call drm_framebuffer_init last for framebuffer initMichel Dänzer1-5/+7
Once drm_framebuffer_init has returned 0, the framebuffer is hooked up to the reference counting machinery and can no longer be destroyed with a simple kfree. Therefore, it must be called last. If drm_framebuffer_init returns 0 but its caller then returns non-0, there will likely be memory corruption fireworks down the road. The following lead me to this fix: [ 12.891228] kernel BUG at lib/list_debug.c:25! [...] [ 12.891263] RIP: 0010:__list_add_valid+0x4b/0x70 [...] [ 12.891324] Call Trace: [ 12.891330] drm_framebuffer_init+0xb5/0x100 [drm] [ 12.891378] amdgpu_display_gem_fb_verify_and_init+0x47/0x120 [amdgpu] [ 12.891592] ? amdgpu_display_user_framebuffer_create+0x10d/0x1f0 [amdgpu] [ 12.891794] amdgpu_display_user_framebuffer_create+0x126/0x1f0 [amdgpu] [ 12.891995] drm_internal_framebuffer_create+0x378/0x3f0 [drm] [ 12.892036] ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm] [ 12.892075] drm_mode_addfb2+0x34/0xd0 [drm] [ 12.892115] ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm] [ 12.892153] drm_ioctl_kernel+0xe2/0x150 [drm] [ 12.892193] drm_ioctl+0x3da/0x460 [drm] [ 12.892232] ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm] [ 12.892274] amdgpu_drm_ioctl+0x43/0x80 [amdgpu] [ 12.892475] __se_sys_ioctl+0x72/0xc0 [ 12.892483] do_syscall_64+0x33/0x40 [ 12.892491] entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: f258907fdd835e "drm/amdgpu: Verify bo size can fit framebuffer size on init." Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-16drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell.Yifan Zhang1-1/+5
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC. Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-06-16drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue.Yifan Zhang1-1/+5
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC. Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-06-08drm/amd/pm: Fix fall-through warning for ClangGustavo A. R. Silva1-0/+1
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08drm/amdgpu: Fix incorrect register offsets for Sienna CichlidRohit Khaire1-5/+21
RLC_CP_SCHEDULERS and RLC_SPARE_INT0 have different offsets for Sienna Cichlid Signed-off-by: Rohit Khaire <rohit.khaire@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08drm/amdgpu: Use drm_dbg_kms for reporting failure to get a GEM FBMichel Dänzer1-2/+2
drm_err meant broken user space could spam dmesg. Fixes: f258907fdd835e "drm/amdgpu: Verify bo size can fit framebuffer size on init." Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08drm/amdgpu: switch kzalloc to kvzalloc in amdgpu_bo_createChangfeng1-2/+2
It will cause error when alloc memory larger than 128KB in amdgpu_bo_create->kzalloc. So it needs to switch kzalloc to kvzalloc. Call Trace: alloc_pages_current+0x6a/0xe0 kmalloc_order+0x32/0xb0 kmalloc_order_trace+0x1e/0x80 __kmalloc+0x249/0x2d0 amdgpu_bo_create+0x102/0x500 [amdgpu] ? xas_create+0x264/0x3e0 amdgpu_bo_create_vm+0x32/0x60 [amdgpu] amdgpu_vm_pt_create+0xf5/0x260 [amdgpu] amdgpu_vm_init+0x1fd/0x4d0 [amdgpu] Signed-off-by: Changfeng <Changfeng.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-03amd/display: convert DRM_DEBUG_ATOMIC to drm_dbg_atomicSimon Ser1-1/+1
This allows to tie the log message to a specific DRM device. Signed-off-by: Simon Ser <contact@emersion.fr> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Harry Wentland <hwentlan@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-03drm/amdgpu: make sure we unpin the UVD BONirmoy Das1-0/+1
Releasing pinned BOs is illegal now. UVD 6 was missing from: commit 2f40801dc553 ("drm/amdgpu: make sure we unpin the UVD BO") Fixes: 2f40801dc553 ("drm/amdgpu: make sure we unpin the UVD BO") Cc: stable@vger.kernel.org Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-03drm/amd/amdgpu:save psp ring wptr to avoid attackVictor Zhao3-2/+5
[Why] When some tools performing psp mailbox attack, the readback value of register can be a random value which may break psp. [How] Use a psp wptr cache machanism to aovid the change made by attack. v2: unify change and add detailed reason Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-03drm/amd/display: Fix potential memory leak in DMUB hw_initRoman Li1-2/+2
[Why] On resume we perform DMUB hw_init which allocates memory: dm_resume->dm_dmub_hw_init->dc_dmub_srv_create->kzalloc That results in memory leak in suspend/resume scenarios. [How] Allocate memory for the DC wrapper to DMUB only if it was not allocated before. No need to reallocate it on suspend/resume. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-03drm/amdgpu: Don't query CE and UE errorsLuben Tuikov1-16/+0
On QUERY2 IOCTL don't query counts of correctable and uncorrectable errors, since when RAS is enabled and supported on Vega20 server boards, this takes insurmountably long time, in O(n^3), which slows the system down to the point of it being unusable when we have GUI up. Fixes: ae363a212b14 ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2") Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-03drm/amd/display: Fix overlay validation by considering cursorsRodrigo Siqueira1-1/+9
A few weeks ago, we saw a two cursor issue in a ChromeOS system. We fixed it in the commit: drm/amd/display: Fix two cursor duplication when using overlay (read the commit message for more details) After this change, we noticed that some IGT subtests related to kms_plane and kms_plane_scaling started to fail. After investigating this issue, we noticed that all subtests that fail have a primary plane covering the overlay plane, which is currently rejected by amdgpu dm. Fail those IGT tests highlight that our verification was too broad and compromises the overlay usage in our drive. This patch fixes this issue by ensuring that we only reject commits where the primary plane is not fully covered by the overlay when the cursor hardware is enabled. With this fix, all IGT tests start to pass again, which means our overlay support works as expected. Cc: Tianci.Yin <tianci.yin@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Nicholas Choi <nicholas.choi@amd.com> Cc: Bhawanpreet Lakha <bhawanpreet.lakha@amd.com> Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Cc: Mark Yacoub <markyacoub@google.com> Cc: Daniel Wheeler <daniel.wheeler@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-03drm/amdgpu: refine amdgpu_fru_get_product_infoJiansong Chen1-19/+23
1. eliminate potential array index out of bounds. 2. return meaningful value for failure. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Jack Gui <Jack.Gui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-03drm/amdgpu: add judgement for dc supportAsher Song1-1/+3
Drop DC initialization when DCN is harvested in VBIOS. The way doesn't affect virtual display ip initialization. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Asher Song <Asher.Song@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-03drm/amd/display: Fix GPU scaling regression by FS video supportNicholas Kazlauskas1-7/+7
[Why] FS video support regressed GPU scaling and the scaled buffer ends up stuck in the top left of the screen at native size - full, aspect, center scaling modes do not function. This is because decide_crtc_timing_for_drm_display_mode() does not get called when scaling is enabled. [How] Split recalculate timing and scaling into two different flags. We don't want to call drm_mode_set_crtcinfo() for scaling, but we do want to call it for FS video. Optimize and move preferred_refresh calculation next to decide_crtc_timing_for_drm_display_mode() like it used to be since that's not used for FS video. We don't need to copy over the VIC or polarity in the case of FS video modes because those don't change. Fixes: 6f59f229f8ed7a ("drm/amd/display: Skip modeset for front porch change") Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-03drm/amd/display: Allow bandwidth validation for 0 streams.Bindu Ramamurthy1-1/+1
[Why] Bandwidth calculations are triggered for non zero streams, and in case of 0 streams, these calculations were skipped with pstate status not being updated. [How] As the pstate status is applicable for non zero streams, check added for allowing 0 streams inline with dcn internal bandwidth validations. Signed-off-by: Bindu Ramamurthy <bindu.r@amd.com> Reviewed-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21drm/amdgpu/jpeg3: add cancel_delayed_work_sync before power gateJames Zhu1-2/+2
Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-21drm/amdgpu/jpeg2.5: add cancel_delayed_work_sync before power gateJames Zhu1-2/+2
Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-21drm/amdgpu/jpeg2.0: add cancel_delayed_work_sync before power gateJames Zhu1-0/+2
Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-21drm/amdgpu/vcn3: add cancel_delayed_work_sync before power gateJames Zhu1-3/+2
Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-21drm/amdgpu/vcn2.5: add cancel_delayed_work_sync before power gateJames Zhu1-0/+2
Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-21drm/amdgpu/vcn2.0: add cancel_delayed_work_sync before power gateJames Zhu1-0/+2
Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-21drm/amdgpu/vcn1: add cancel_delayed_work_sync before power gateJames Zhu1-1/+5
Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-21drm/amdkfd: correct sienna_cichlid SDMA RLC register offset errorKevin Wang1-6/+6
1.correct KFD SDMA RLC queue register offset error. (all sdma rlc register offset is base on SDMA0.RLC0_RLC0_RB_CNTL) 2.HQD_N_REGS (19+6+7+12) 12: the 2 more resgisters than navi1x (SDMAx_RLCy_MIDCMD_DATA{9,10}) the patch also can be fixed NULL pointer issue when read /sys/kernel/debug/kfd/hqds on sienna_cichlid chip. Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-20drm/amd/pm: correct MGpuFanBoost settingEvan Quan2-0/+19
No MGpuFanBoost setting for those ASICs which do not support it. Otherwise, it may breaks their fan control feature. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1580 Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-20drm/amdgpu: stop touching sched.ready in the backendChristian König4-16/+1
This unfortunately comes up in regular intervals and breaks GPU reset for the engine in question. The sched.ready flag controls if an engine can't get working during hw_init, but should never be set to false during hw_fini. v2: squash in unused variable fix (Alex) Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-20drm/amd/amdgpu: fix a potential deadlock in gpu resetLang Yu1-1/+0
When amdgpu_ib_ring_tests failed, the reset logic called amdgpu_device_ip_suspend twice, then deadlock occurred. Deadlock log: [ 805.655192] amdgpu 0000:04:00.0: amdgpu: ib ring test failed (-110). [ 806.290952] [drm] free PSP TMR buffer [ 806.319406] ============================================ [ 806.320315] WARNING: possible recursive locking detected [ 806.321225] 5.11.0-custom #1 Tainted: G W OEL [ 806.322135] -------------------------------------------- [ 806.323043] cat/2593 is trying to acquire lock: [ 806.323825] ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.325668] but task is already holding lock: [ 806.326664] ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.328430] other info that might help us debug this: [ 806.329539] Possible unsafe locking scenario: [ 806.330549] CPU0 [ 806.330983] ---- [ 806.331416] lock(&adev->dm.dc_lock); [ 806.332086] lock(&adev->dm.dc_lock); [ 806.332738] *** DEADLOCK *** [ 806.333747] May be due to missing lock nesting notation [ 806.334899] 3 locks held by cat/2593: [ 806.335537] #0: ffff888100d3f1b8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_read+0x4e/0x110 [ 806.337009] #1: ffff888136b1fd78 (&adev->reset_sem){++++}-{3:3}, at: amdgpu_device_lock_adev+0x42/0x94 [amdgpu] [ 806.339018] #2: ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.340869] stack backtrace: [ 806.341621] CPU: 6 PID: 2593 Comm: cat Tainted: G W OEL 5.11.0-custom #1 [ 806.342921] Hardware name: AMD Celadon-CZN/Celadon-CZN, BIOS WLD0C23N_Weekly_20_12_2 12/23/2020 [ 806.344413] Call Trace: [ 806.344849] dump_stack+0x93/0xbd [ 806.345435] __lock_acquire.cold+0x18a/0x2cf [ 806.346179] lock_acquire+0xca/0x390 [ 806.346807] ? dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.347813] __mutex_lock+0x9b/0x930 [ 806.348454] ? dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.349434] ? amdgpu_device_indirect_rreg+0x58/0x70 [amdgpu] [ 806.350581] ? _raw_spin_unlock_irqrestore+0x47/0x50 [ 806.351437] ? dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.352437] ? rcu_read_lock_sched_held+0x4f/0x80 [ 806.353252] ? rcu_read_lock_sched_held+0x4f/0x80 [ 806.354064] mutex_lock_nested+0x1b/0x20 [ 806.354747] ? mutex_lock_nested+0x1b/0x20 [ 806.355457] dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.356427] ? soc15_common_set_clockgating_state+0x17d/0x19 [amdgpu] [ 806.357736] amdgpu_device_ip_suspend_phase1+0x78/0xd0 [amdgpu] [ 806.360394] amdgpu_device_ip_suspend+0x21/0x70 [amdgpu] [ 806.362926] amdgpu_device_pre_asic_reset+0xb3/0x270 [amdgpu] [ 806.365560] amdgpu_device_gpu_recover.cold+0x679/0x8eb [amdgpu] Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Christian KÃnig <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-20drm/amdgpu: update sdma golden setting for Navi12Guchun Chen1-0/+4
Current golden setting is out of date. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-20drm/amdgpu: update gc golden setting for Navi12Guchun Chen1-2/+4
Current golden setting is out of date. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-20drm/amdgpu: Fix a use-after-freexinhui pan1-0/+1
looks like we forget to set ttm->sg to NULL. Hit panic below [ 1235.844104] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b7b4b: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI [ 1235.989074] Call Trace: [ 1235.991751] sg_free_table+0x17/0x20 [ 1235.995667] amdgpu_ttm_backend_unbind.cold+0x4d/0xf7 [amdgpu] [ 1236.002288] amdgpu_ttm_backend_destroy+0x29/0x130 [amdgpu] [ 1236.008464] ttm_tt_destroy+0x1e/0x30 [ttm] [ 1236.013066] ttm_bo_cleanup_memtype_use+0x51/0xa0 [ttm] [ 1236.018783] ttm_bo_release+0x262/0xa50 [ttm] [ 1236.023547] ttm_bo_put+0x82/0xd0 [ttm] [ 1236.027766] amdgpu_bo_unref+0x26/0x50 [amdgpu] [ 1236.032809] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x7aa/0xd90 [amdgpu] [ 1236.040400] kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu] [ 1236.046912] kfd_ioctl+0x463/0x690 [amdgpu] Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-20drm/amdgpu: add video_codecs query support for aldebaranJames Zhu1-0/+1
Add video_codecs query support for aldebaran. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-20drm/amd/amdgpu: fix refcount leakJingwen Chen1-0/+3
[Why] the gem object rfb->base.obj[0] is get according to num_planes in amdgpufb_create, but is not put according to num_planes [How] put rfb->base.obj[0] in amdgpu_fbdev_destroy according to num_planes Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-20drm/amd/display: Disconnect non-DP with no EDIDChris Park1-0/+18
[Why] Active DP dongles return no EDID when dongle is connected, but VGA display is taken out. Current driver behavior does not remove the active display when this happens, and this is a gap between dongle DTP and dongle behavior. [How] For active DP dongles and non-DP scenario, disconnect sink on detection when no EDID is read due to timeout. Signed-off-by: Chris Park <Chris.Park@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-20drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hangChangfeng2-5/+7
There is problem with 3DCGCG firmware and it will cause compute test hang on picasso/raven1. It needs to disable 3DCGCG in driver to avoid compute hang. Signed-off-by: Changfeng <Changfeng.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-20drm/amdgpu: Fix GPU TLB update error when PAGE_SIZE > AMDGPU_PAGE_SIZEYi Li1-1/+1
When PAGE_SIZE is larger than AMDGPU_PAGE_SIZE, the number of GPU TLB entries which need to update in amdgpu_map_buffer() should be multiplied by AMDGPU_GPU_PAGES_IN_CPU_PAGE (PAGE_SIZE / AMDGPU_PAGE_SIZE). Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yi Li <liyi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-20drm/amd/display: Use the correct max downscaling value for DCN3.x familyNikola Cornij3-9/+12
[why] As per spec, DCN3.x can do 6:1 downscaling and DCN2.x can do 4:1. The max downscaling limit value for DCN2.x is 250, which means it's calculated as 1000 / 4 = 250. For DCN3.x this then gives 1000 / 6 = 167. [how] Set maximum downscaling limit to 167 for DCN3.x Signed-off-by: Nikola Cornij <nikola.cornij@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Reviewed-by: Harry Wentland <Harry.Wentland@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-13drm/amdgpu: update vcn1.0 Non-DPG suspend sequenceSathishkumar S1-4/+9
update suspend register settings in Non-DPG mode. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-13drm/amdgpu: set vcn mgcg flag for picassoSathishkumar S1-1/+2
enable vcn mgcg flag for picasso. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-13drm/amdgpu: update the method for harvest IP for specific SKULikun Gao1-14/+16
Update the method of disabling VCN IP for specific SKU for navi1x ASIC, it will judge whether should add the related IP at the function of amdgpu_device_ip_block_add(). Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-13drm/amdgpu: add judgement when add ip blocks (v2)Likun GAO6-2/+57
Judgement whether to add an sw ip according to the harvest info. v2: fix indentation (Alex) Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-13drm/amd/display: Initialize attribute for hdcp_srm sysfs fileDavid Ward1-0/+1
It is stored in dynamically allocated memory, so sysfs_bin_attr_init() must be called to initialize it. (Note: "initialization" only sets the .attr.key member in this struct; it does not change the value of any other members.) Otherwise, when CONFIG_DEBUG_LOCK_ALLOC=y this message appears during boot: BUG: key ffff9248900cd148 has not been registered! Fixes: 9037246bb2da ("drm/amd/display: Add sysfs interface for set/get srm") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1586 Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: David Ward <david.ward@gatech.edu> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-13drm/amd/pm: Fix out-of-bounds bugGustavo A. R. Silva2-99/+109
Create new structure SISLANDS_SMC_SWSTATE_SINGLE, as initialState.levels and ACPIState.levels are never actually used as flexible arrays. Those arrays can be used as simple objects of type SISLANDS_SMC_HW_PERFORMANCE_LEVEL, instead. Currently, the code fails because flexible array _levels_ in struct SISLANDS_SMC_SWSTATE doesn't allow for code that accesses the first element of initialState.levels and ACPIState.levels arrays: drivers/gpu/drm/amd/pm/powerplay/si_dpm.c: 4820: table->initialState.levels[0].mclk.vDLL_CNTL = 4821: cpu_to_be32(si_pi->clock_registers.dll_cntl); ... 5021: table->ACPIState.levels[0].mclk.vDLL_CNTL = 5022: cpu_to_be32(dll_cntl); because such element cannot be accessed without previously allocating enough dynamic memory for it to exist (which never actually happens). So, there is an out-of-bounds bug in this case. That's why struct SISLANDS_SMC_SWSTATE should only be used as type for object driverState and new struct SISLANDS_SMC_SWSTATE_SINGLE is created as type for objects initialState, ACPIState and ULVState. Also, with the change from one-element array to flexible-array member in commit 0e1aa13ca3ff ("drm/amd/pm: Replace one-element array with flexible-array in struct SISLANDS_SMC_SWSTATE"), the size of dpmLevels in struct SISLANDS_SMC_STATETABLE should be fixed to be SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE instead of SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE - 1. Fixes: 0e1aa13ca3ff ("drm/amd/pm: Replace one-element array with flexible-array in struct SISLANDS_SMC_SWSTATE") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-09Merge tag 'drm-next-2021-05-10' of git://anongit.freedesktop.org/drm/drmLinus Torvalds13-37/+310
Pull drm fixes from Dave Airlie: "Bit later than usual, I queued them all up on Friday then promptly forgot to write the pull request email. This is mainly amdgpu fixes, with some radeon/msm/fbdev and one i915 gvt fix thrown in. amdgpu: - MPO hang workaround - Fix for concurrent VM flushes on vega/navi - dcefclk is not adjustable on navi1x and newer - MST HPD debugfs fix - Suspend/resumes fixes - Register VGA clients late in case driver fails to load - Fix GEM leak in user framebuffer create - Add support for polaris12 with 32 bit memory interface - Fix duplicate cursor issue when using overlay - Fix corruption with tiled surfaces on VCN3 - Add BO size and stride check to fix BO size verification radeon: - Fix off-by-one in power state parsing - Fix possible memory leak in power state parsing msm: - NULL ptr dereference fix fbdev: - procfs disabled warning fix i915: - gvt: Fix a possible division by zero in vgpu display rate calculation" * tag 'drm-next-2021-05-10' of git://anongit.freedesktop.org/drm/drm: drm/amdgpu: Use device specific BO size & stride check. drm/amdgpu: Init GFX10_ADDR_CONFIG for VCN v3 in DPG mode. drm/amd/pm: initialize variable drm/radeon: Avoid power table parsing memory leaks drm/radeon: Fix off-by-one power_state index heap overwrite drm/amd/display: Fix two cursor duplication when using overlay drm/amdgpu: add new MC firmware for Polaris12 32bit ASIC fbmem: Mark proc_fb_seq_ops as __maybe_unused drm/msm/dpu: Delete bonkers code drm/i915/gvt: Prevent divided by zero when calculating refresh rate amdgpu: fix GEM obj leak in amdgpu_display_user_framebuffer_create drm/amdgpu: Register VGA clients after init can no longer fail drm/amdgpu: Handling of amdgpu_device_resume return value for graceful teardown drm/amdgpu: fix r initial values drm/amd/display: fix wrong statement in mst hpd debugfs amdgpu/pm: set pp_dpm_dcefclk to readonly on NAVI10 and newer gpus amdgpu/pm: Prevent force of DCEFCLK on NAVI10 and SIENNA_CICHLID drm/amdgpu: fix concurrent VM flushes on Vega/Navi v2 drm/amd/display: Reject non-zero src_y and src_x for video planes
2021-05-07Merge tag 'amd-drm-fixes-5.13-2021-05-05' of ↵Dave Airlie13-37/+310
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-fixes-5.13-2021-05-05: amdgpu: - MPO hang workaround - Fix for concurrent VM flushes on vega/navi - dcefclk is not adjustable on navi1x and newer - MST HPD debugfs fix - Suspend/resumes fixes - Register VGA clients late in case driver fails to load - Fix GEM leak in user framebuffer create - Add support for polaris12 with 32 bit memory interface - Fix duplicate cursor issue when using overlay - Fix corruption with tiled surfaces on VCN3 - Add BO size and stride check to fix BO size verification radeon: - Fix off-by-one in power state parsing - Fix possible memory leak in power state parsing Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210506033929.3875-1-alexander.deucher@amd.com
2021-05-06drm/amdgpu: Use device specific BO size & stride check.Bas Nieuwenhuizen1-6/+175
The builtin size check isn't really the right thing for AMD modifiers due to a couple of reasons: 1) In the format structs we don't do set any of the tilesize / blocks etc. to avoid having format arrays per modifier/GPU 2) The pitch on the main plane is pixel_pitch * bytes_per_pixel even for tiled ... 3) The pitch for the DCC planes is really the pixel pitch of the main surface that would be covered by it ... Note that we only handle GFX9+ case but we do this after converting the implicit modifier to an explicit modifier, so on GFX9+ all framebuffers should be checked here. There is a TODO about DCC alignment, but it isn't worse than before and I'd need to dig a bunch into the specifics. Getting this out in a reasonable timeframe to make sure it gets the appropriate testing seemed more important. Finally as I've found that debugging addfb2 failures is a pita I was generous adding explicit error messages to every failure case. Fixes: f258907fdd83 ("drm/amdgpu: Verify bo size can fit framebuffer size on init.") Tested-by: Simon Ser <contact@emersion.fr> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>