summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)AuthorFilesLines
2023-07-19drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memoryChristian König1-30/+39
commit e2ad8e2df432498b1cee2af04df605723f4d75e6 upstream. We need to grab the lock of the BO or otherwise can run into a crash when we try to inspect the current location. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-19Revert "drm/amd/display: Move DCN314 DOMAIN power control to DMCUB"Daniel Miess3-26/+1
[ Upstream commit bf0097c5c9aec528da75e2b5fcede472165322bb ] This reverts commit e383b12709e32d6494c948422070c2464b637e44. Controling hubp power gating using the DMCUB isn't stable so we are reverting this change to move control back into the driver. Cc: stable@vger.kernel.org # 6.3+ Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Daniel Miess <daniel.miess@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19drm/amd: Don't try to enable secure display TA multiple timesMario Limonciello1-0/+2
[ Upstream commit 5c6d52ff4b61e5267b25be714eb5a9ba2a338199 ] If the securedisplay TA failed to load the first time, it's unlikely to work again after a suspend/resume cycle or reset cycle and it appears to be causing problems in futher attempts. Fixes: e42dfa66d592 ("drm/amdgpu: Add secure display TA load for Renoir") Reported-by: Filip Hejsek <filip.hejsek@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2633 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19drm/amdgpu: fix number of fence calculationsChristian König1-5/+6
[ Upstream commit 570b295248b00c3cf4cf59e397de5cb2361e10c2 ] Since adding gang submit we need to take the gang size into account while reserving fences. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: 4624459c84d7 ("drm/amdgpu: add gang submit frontend v6") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19drm/amdgpu: Fix usage of UMC fill record in RASLuben Tuikov1-2/+1
[ Upstream commit 71344a718a9fda8c551cdc4381d354f9a9907f6f ] The fixed commit listed in the Fixes tag below, introduced a bug in amdgpu_ras.c::amdgpu_reserve_page_direct(), in that when introducing the new amdgpu_umc_fill_error_record() and internally in that new function the physical address (argument "uint64_t retired_page"--wrong name) is right-shifted by AMDGPU_GPU_PAGE_SHIFT. Thus, in amdgpu_reserve_page_direct() when we pass "address" to that new function, we should NOT right-shift it, since this results, erroneously, in the page address to be 0 for first 2^(2*AMDGPU_GPU_PAGE_SHIFT) memory addresses. This commit fixes this bug. Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Fixes: 400013b268cb ("drm/amdgpu: add umc_fill_error_record to make code more simple") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://lore.kernel.org/r/20230610113536.10621-1-luben.tuikov@amd.com Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19drm/amdgpu: Fix memcpy() in sienna_cichlid_append_powerplay_table function.Srinivasan Shanmugam1-4/+14
[ Upstream commit d50dc746ff72b9c48812dac3344fa87fbde940a3 ] Fixes the following gcc with W=1: In file included from ./include/linux/string.h:253, from ./include/linux/bitmap.h:11, from ./include/linux/cpumask.h:12, from ./arch/x86/include/asm/cpumask.h:5, from ./arch/x86/include/asm/msr.h:11, from ./arch/x86/include/asm/processor.h:22, from ./arch/x86/include/asm/cpufeature.h:5, from ./arch/x86/include/asm/thread_info.h:53, from ./include/linux/thread_info.h:60, from ./arch/x86/include/asm/preempt.h:7, from ./include/linux/preempt.h:78, from ./include/linux/spinlock.h:56, from ./include/linux/mmzone.h:8, from ./include/linux/gfp.h:7, from ./include/linux/firmware.h:7, from drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/sienna_cichlid_ppt.c:26: In function ‘fortify_memcpy_chk’, inlined from ‘sienna_cichlid_append_powerplay_table’ at drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/sienna_cichlid_ppt.c:444:2, inlined from ‘sienna_cichlid_setup_pptable’ at drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/sienna_cichlid_ppt.c:506:8, inlined from ‘sienna_cichlid_setup_pptable’ at drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/sienna_cichlid_ppt.c:494:12: ./include/linux/fortify-string.h:413:4: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 413 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ the compiler complains about the size calculation in the memcpy() - "sizeof(*smc_dpm_table) - sizeof(smc_dpm_table->table_header)" is much larger than what fits into table_member. Hence, reuse 'smu_memcpy_trailing' for nv1x Fixes: 7077b19a38240 ("drm/amd/pm: use macro to get pptable members") Suggested-by: Evan Quan <Evan.Quan@amd.com> Cc: Evan Quan <Evan.Quan@amd.com> Cc: Chengming Gui <Jack.Gui@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19amdgpu: validate offset_in_bo of drm_amdgpu_gem_vaChia-I Wu1-8/+8
[ Upstream commit 9f0bcf49e9895cb005d78b33a5eebfa11711b425 ] This is motivated by OOB access in amdgpu_vm_update_range when offset_in_bo+map_size overflows. v2: keep the validations in amdgpu_vm_bo_map v3: add the validations to amdgpu_vm_bo_map/amdgpu_vm_bo_replace_map rather than to amdgpu_gem_va_ioctl Fixes: 9f7eb5367d00 ("drm/amdgpu: actually use the VM map parameters") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19drm/amd/display: Fix artifacting on eDP panels when engaging freesync video modeAurabindo Pillai1-0/+2
[ Upstream commit b18f05a0666aecd5cb19c26a8305bcfa4e9d6502 ] [Why] When freesync video mode is enabled, switching resolution from native mode to one of the freesync video compatible modes can trigger continous artifacts on some eDP panels when running under KDE. The articating can be seen in the attached bug report. [How] Fix this by restricting updates that require full commit by using the same checks for stream and scaling changes in the the enable pass of dm_update_crtc_state() along with the check for compatible timings for freesync vide mode. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2162 Fixes: da5e14909776 ("drm/amd/display: Fix hang when skipping modeset") Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19drm/amdkfd: Fix potential deallocation of previously deallocated memory.Daniil Dulov1-6/+7
[ Upstream commit cabbdea1f1861098991768d7bbf5a49ed1608213 ] Pointer mqd_mem_obj can be deallocated in kfd_gtt_sa_allocate(). The function then returns non-zero value, which causes the second deallocation. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: d1f8f0d17d40 ("drm/amdkfd: Move non-sdma mqd allocation out of init_mqd") Signed-off-by: Daniil Dulov <d.dulov@aladdin.ru> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19drm/amd/display: Fix a test dml32_rq_dlg_get_rq_reg()Christophe JAILLET1-1/+1
[ Upstream commit bafc31166aa7df5fa26ae0ad8196d1717e6cdea9 ] It is likely p1_min_meta_chunk_bytes was expected here, instead of min_meta_chunk_bytes. Test the correct variable. Fixes: dda4fb85e433 ("drm/amd/display: DML changes for DCN32/321") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19drm/amd/display: Fix a test CalculatePrefetchSchedule()Christophe JAILLET1-1/+1
[ Upstream commit 960e27a5741cd3001996ff6ddfb3eb0ed3a4909d ] It is likely Height was expected here, instead of Width. Test the correct variable. Fixes: 17529ea2acfa ("drm/amd/display: Optimizations for DML math") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19drm/amd/display: Explicitly specify update type per plane info changeNicholas Kazlauskas1-3/+0
[ Upstream commit 710cc1e7cd461446a9325c9bd1e9a54daa462952 ] [Why] The bit for flip addr is being set causing the determination for FAST vs MEDIUM to always return MEDIUM when plane info is provided as a surface update. This causes extreme stuttering for the typical atomic update path on Linux. [How] Don't use update_flags->raw for determining FAST vs MEDIUM. It's too fragile to changes like this. Explicitly specify the update type per update flag instead. It's not as clever as checking the bits itself but at least it's correct. Fixes: aa5fdb1ab5b6 ("drm/amd/display: Explicitly specify update type per plane info change") Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19drm/amd/display: fix is_timing_changed() prototypeArnd Bergmann3-6/+8
[ Upstream commit 3306ba4b60b2f3d9ac6bddc587a4d702e1ba2224 ] Three functions in the amdgpu display driver cause -Wmissing-prototype warnings: drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_resource.c:1858:6: error: no previous prototype for 'is_timing_changed' [-Werror=missing-prototypes] is_timing_changed() is actually meant to be a global symbol, but needs a proper name and prototype. Fixes: 17ce8a6907f7 ("drm/amd/display: Add dsc pre-validation in atomic check") Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19drm/amd/display: Add logging for display MALL refresh settingWesley Chalmers1-0/+3
[ Upstream commit cd8f067a46d34dee3188da184912ae3d64d98444 ] [WHY] Add log entry for when display refresh from MALL settings are sent to SMU. Fixes: 1664641ea946 ("drm/amd/display: Add logger for SMU msg") Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19drm/amd/display: Unconditionally print when DP sink power state failsSrinivasan Shanmugam1-3/+1
[ Upstream commit e4dfd94d5e3851df607b26ab5b20ad8d94f5ccff ] The previous 'commit ca9beb8aac68 ("drm/amd/display: Add logging when setting DP sink power state fails")', it is better to unconditionally print "failed to power up sink", because we are returning DC_ERROR_UNEXPECTED. Fixes: ca9beb8aac68 ("drm/amd/display: Add logging when setting DP sink power state fails") Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19Revert "drm/amd/display: edp do not add non-edid timings"Hersen Wu1-7/+1
commit d6149086b45e150c170beaa4546495fd1880724c upstream. This change causes regression when eDP and external display in mirror mode. When external display supports low resolution than eDP, use eDP timing to driver external display may cause corruption on external display. This reverts commit e749dd10e5f292061ad63d2b030194bf7d7d452c. Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2655 Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-05drm/amdgpu: Validate VM ioctl flags.Bas Nieuwenhuizen1-0/+4
commit a2b308044dcaca8d3e580959a4f867a1d5c37fac upstream. None have been defined yet, so reject anybody setting any. Mesa sets it to 0 anyway. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-14drm/amd/display: limit DPIA link rate to HBR3Peichen Huang1-0/+5
[Why] DPIA doesn't support UHBR, driver should not enable UHBR for dp tunneling [How] limit DPIA link rate to HBR3 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Peichen Huang <peichen.huang@amd.com> Reviewed-by: Mustapha Ghaddar <Mustapha.Ghaddar@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-14drm/amd/display: fix the system hang while disable PSRTom Chung1-4/+6
[Why] When the PSR enabled. If you try to adjust the timing parameters, it may cause system hang. Because the timing mismatch with the DMCUB settings. [How] Disable the PSR before adjusting timing parameters. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-14drm/amd/display: edp do not add non-edid timingsHersen Wu1-1/+7
[Why] most edp support only timings from edid. applying non-edid timings, especially those timings out of edp bandwidth, may damage edp. [How] do not add non-edid timings for edp. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-14Revert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar ↵Arunpravin Paneer Selvam1-1/+1
system" This reverts commit c105518679b6e87232874ffc989ec403bee59664. This patch disables the TOPDOWN flag for APU and few dGPU cards which has the VRAM size equal to the BAR size. When we enable the TOPDOWN flag, we get the free blocks at the highest available memory region and we don't split the lower order blocks. This change is required to keep off the fragmentation related issues particularly in ASIC which has VRAM space <= 500MiB Hence, we are reverting this patch. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2270 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-06-14drm/amdgpu: vcn_4_0 set instance 0 init sched score to 1Sonny Jiang1-1/+5
Only vcn0 can process AV1 codecx. In order to use both vcn0 and vcn1 in h264/265 transcode to AV1 cases, set vcn0 sched score to 1 at initialization time. Signed-off-by: Sonny Jiang <sonjiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-06-14drm/amd/pm: workaround for compute workload type on some skusKenneth Feng1-2/+31
On smu 13.0.0, the compute workload type cannot be set on all the skus due to some other problems. This workaround is to make sure compute workload type can also run on some specific skus. v2: keep the variable consistent Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-06-14drm/amd: Tighten permissions on VBIOS flashing attributesMario Limonciello1-2/+2
Non-root users shouldn't be able to try to trigger a VBIOS flash or query the flashing status. This should be reserved for users with the appropriate permissions. Cc: stable@vger.kernel.org Fixes: 8424f2ccb3c0 ("drm/amdgpu/psp: Add vbflash sysfs interface support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-14drm/amd: Make sure image is written to trigger VBIOS image update flowMario Limonciello1-0/+3
The VBIOS image update flow requires userspace to: 1) Write the image to `psp_vbflash` 2) Read `psp_vbflash` 3) Poll `psp_vbflash_status` to check for completion If userspace reads `psp_vbflash` before writing an image, it's possible that it causes problems that can put the dGPU into an invalid state. Explicitly check that an image has been written before letting a read succeed. Cc: stable@vger.kernel.org Fixes: 8424f2ccb3c0 ("drm/amdgpu/psp: Add vbflash sysfs interface support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-14drm/amdgpu: add missing radeon secondary PCI IDAlex Deucher1-0/+1
0x5b70 is a missing RV370 secondary id. Add it so we don't try and probe it with amdgpu. Cc: michel@daenzer.net Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Tested-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-06-13drm/amdgpu: Implement gfx9 patch functions for resubmissionJiadong Zhu1-0/+80
Patch the packages including CONTEXT_CONTROL and WRITE_DATA for gfx9 during the resubmission scenario. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x
2023-06-13drm/amdgpu: Modify indirect buffer packages for resubmissionJiadong Zhu4-0/+102
When the preempted IB frame resubmitted to cp, we need to modify the frame data including: 1. set PRE_RESUME 1 in CONTEXT_CONTROL. 2. use meta data(DE and CE) read from CSA in WRITE_DATA. Add functions to save the location the first time IBs emitted and callback to patch the package when resubmission happens. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x
2023-06-13drm/amdgpu: Program gds backup address as zero if no gds allocatedJiadong Zhu1-5/+8
It is firmware requirement to set gds_backup_addrlo and gds_backup_addrhi of DE meta both zero if no gds partition is allocated for the frame. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x
2023-06-13drm/amdgpu: Reset CP_VMID_PREEMPT after trailing fence signaledJiadong Zhu1-4/+4
When MEC executes unmap_queue for mid command buffer preemption, it will kick the write pointer of the gfx ring, set CP_VMID_PREEMPT to trigger the preemption and wait for CP_VMID_PREEMPT becomes zero after the preemption done. There is a race condition that PFP may excute the resetting command before MEC set CP_VMID_PREEMPT. As a result, hang happens as CP_VMID_PREEMPT is always 0xffff. To avoid this, we send resetting CP_VMID_PREEMPT command after the trailing fence is siganled and update gfx write pointer explicitly. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2535
2023-06-08drm/amd/display: Reduce sdp bw after urgent to 90%Alvin Lee1-1/+1
[Description] Reduce expected SDP bandwidth due to poor QoS and arbitration issues on high bandwidth configs Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Reviewed-by: Nevenko Stupar <Nevenko.Stupar@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-08drm/amdgpu: change reserved vram info printYiPeng Chai1-3/+4
The link object of mgr->reserved_pages is the blocks variable in struct amdgpu_vram_reservation, not the link variable in struct drm_buddy_block. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-06-08drm/amdgpu: fix xclk freq on CHIP_STONEYChia-I Wu1-2/+9
According to Alex, most APUs from that time seem to have the same issue (vbios says 48Mhz, actual is 100Mhz). I only have a CHIP_STONEY so I limit the fixup to CHIP_STONEY Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-06-07Revert "drm/amdgpu: switch to golden tsc registers for raven/raven2"Alex Deucher1-40/+0
This reverts commit f03eb1d26c2739b75580f58bbab4ab2d5d3eba46. This results in inconsistent timing reported via asynchronous GPU queries. Link: https://lists.freedesktop.org/archives/amd-gfx/2023-May/093731.html Cc: Jesse.Zhang@amd.com Cc: michel@daenzer.net Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-07Revert "drm/amdgpu: Differentiate between Raven2 and Raven/Picasso according ↵Alex Deucher1-14/+19
to revision id" This reverts commit 9d2d1827af295fd6971786672c41c4dba3657154. This results in inconsistent timing reported via asynchronous GPU queries. Link: https://lists.freedesktop.org/archives/amd-gfx/2023-May/093731.html Cc: Jesse.Zhang@amd.com Cc: michel@daenzer.net Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-07Revert "drm/amdgpu: change the reference clock for raven/raven2"Alex Deucher1-3/+4
This reverts commit fbc24293ca16b3b9ef891fe32ccd04735a6f8dc1. This results in inconsistent timing reported via asynchronous GPU queries. Link: https://lists.freedesktop.org/archives/amd-gfx/2023-May/093731.html Cc: Jesse.Zhang@amd.com Cc: michel@daenzer.net Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-07drm/amd/display: add ODM case when looking for first split pipeSamson Tam2-1/+55
[Why] When going from ODM 2:1 single display case to max displays, second odm pipe needs to be repurposed for one of the new single displays. However, acquire_first_split_pipe() only handles MPC case and not ODM case [How] Add ODM conditions in acquire_first_split_pipe() Add commit_minimal_transition_state() in commit_streams() to handle odm 2:1 exit first, and then process new streams Handle ODM condition in commit_minimal_transition_state() Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Samson Tam <samson.tam@amd.com> Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-07drm/amd: Make lack of `ACPI_FADT_LOW_POWER_S0` or `CONFIG_AMD_PMC` louder ↵Mario Limonciello1-2/+2
during suspend path Users have reported that s2idle wasn't working on OEM Phoenix systems, but it was root caused to be because `CONFIG_AMD_PMC` wasn't set in the distribution kernel config. To make this more apparent, raise the messaging to err instead of warn. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217497 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-07drm/amd/pm: conditionally disable pcie lane switching for some ↵Evan Quan1-18/+74
sienna_cichlid SKUs Disable the pcie lane switching for some sienna_cichlid SKUs since it might not work well on some platforms. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-06-07drm/amd/pm: Fix power context allocation in SMU13Lijo Lazar1-2/+2
Use the right data structure for allocation. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-06-07drm/amdgpu: fix Null pointer dereference error in amdgpu_device_recover_vramHoratio Zhang2-7/+4
Use the function of amdgpu_bo_vm_destroy to handle the resource release of shadow bo. During the amdgpu_mes_self_test, shadow bo released, but vmbo->shadow_list was not, which caused a null pointer reference error in amdgpu_device_recover_vram when GPU reset. Fixes: 6c032c37ac3e ("drm/amdgpu: Fix vram recover doesn't work after whole GPU reset (v2)") Signed-off-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Acked-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-07drm/amd: Disallow s0ix without BIOS support againMario Limonciello1-2/+6
commit cf488dcd0ab7 ("drm/amd: Allow s0ix without BIOS support") showed improvements to power consumption over suspend when s0ix wasn't enabled in BIOS and the system didn't support S3. This patch however was misguided because the reason the system didn't support S3 was because SMT was disabled in OEM BIOS setup. This prevented the BIOS from allowing S3. Also allowing GPUs to use the s2idle path actually causes problems if they're invoked on systems that may not support s2idle in the platform firmware. `systemd` has a tendency to try to use `s2idle` if `deep` fails for any reason, which could lead to unexpected flows. The original commit also fixed a problem during resume from suspend to idle without hardware support, but this is no longer necessary with commit ca4751866397 ("drm/amd: Don't allow s0ix on APUs older than Raven") Revert commit cf488dcd0ab7 ("drm/amd: Allow s0ix without BIOS support") to make it match the expected behavior again. Cc: Rafael Ávila de Espíndola <rafael@espindo.la> Link: https://github.com/torvalds/linux/blob/v6.1/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c#L1060 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2599 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-06-01drm/amdgpu: enable tmz by default for GC 11.0.1Ikshwaku Chauhan1-1/+2
Add IP GC 11.0.1 in the list of target to have tmz enabled by default. Signed-off-by: Ikshwaku Chauhan <ikshwaku.chauhan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-06-01drm/amd/pm: resolve reboot exception for si olandGuchun Chen1-29/+0
During reboot test on arm64 platform, it may failure on boot. The error message are as follows: [ 1.706570][ 3] [ T273] [drm:si_thermal_enable_alert [amdgpu]] *ERROR* Could not enable thermal interrupts. [ 1.716547][ 3] [ T273] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block <si_dpm> failed -22 [ 1.727064][ 3] [ T273] amdgpu 0000:02:00.0: amdgpu_device_ip_late_init failed [ 1.734367][ 3] [ T273] amdgpu 0000:02:00.0: Fatal error during GPU init v2: squash in built warning fix (Alex) Signed-off-by: Zhenneng Li <lizhenneng@kylinos.cn> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-06-01drm/amdgpu: add RAS POISON interrupt funcs for jpeg_v4_0Horatio Zhang1-7/+21
Add ras_poison_irq and functions. And fix the amdgpu_irq_put call trace in jpeg_v4_0_hw_fini. [ 50.497562] RIP: 0010:amdgpu_irq_put+0xa4/0xc0 [amdgpu] [ 50.497619] RSP: 0018:ffffaa2400fcfcb0 EFLAGS: 00010246 [ 50.497620] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [ 50.497621] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 50.497621] RBP: ffffaa2400fcfcd0 R08: 0000000000000000 R09: 0000000000000000 [ 50.497622] R10: 0000000000000000 R11: 0000000000000000 R12: ffff99b2105242d8 [ 50.497622] R13: 0000000000000000 R14: ffff99b210500000 R15: ffff99b210500000 [ 50.497623] FS: 0000000000000000(0000) GS:ffff99b518480000(0000) knlGS:0000000000000000 [ 50.497623] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 50.497624] CR2: 00007f9d32aa91e8 CR3: 00000001ba210000 CR4: 0000000000750ee0 [ 50.497624] PKRU: 55555554 [ 50.497625] Call Trace: [ 50.497625] <TASK> [ 50.497627] jpeg_v4_0_hw_fini+0x43/0xc0 [amdgpu] [ 50.497693] jpeg_v4_0_suspend+0x13/0x30 [amdgpu] [ 50.497751] amdgpu_device_ip_suspend_phase2+0x240/0x470 [amdgpu] [ 50.497802] amdgpu_device_ip_suspend+0x41/0x80 [amdgpu] [ 50.497854] amdgpu_device_pre_asic_reset+0xd9/0x4a0 [amdgpu] [ 50.497905] amdgpu_device_gpu_recover.cold+0x548/0xcf1 [amdgpu] [ 50.498005] amdgpu_debugfs_reset_work+0x4c/0x80 [amdgpu] [ 50.498060] process_one_work+0x21f/0x400 [ 50.498063] worker_thread+0x200/0x3f0 [ 50.498064] ? process_one_work+0x400/0x400 [ 50.498065] kthread+0xee/0x120 [ 50.498067] ? kthread_complete_and_exit+0x20/0x20 [ 50.498068] ret_from_fork+0x22/0x30 Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-01drm/amdgpu: add RAS POISON interrupt funcs for jpeg_v2_6Horatio Zhang1-6/+22
Add ras_poison_irq and functions. Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-01drm/amdgpu: separate ras irq from jpeg instance irq for UVD_POISONHoratio Zhang2-1/+29
Separate jpegbRAS poison consumption handling from the instance irq, and register dedicated ras_poison_irq src and funcs for UVD_POISON. v2: - Separate ras irq from jpeg instance irq - Improve the subject and code comments v3: - Split the patch into three parts - Improve the code comments Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-01drm/amdgpu: add RAS POISON interrupt funcs for vcn_v4_0Horatio Zhang1-6/+30
Add ras_poison_irq and functions. And fix the amdgpu_irq_put call trace in vcn_v4_0_hw_fini. [ 44.563572] RIP: 0010:amdgpu_irq_put+0xa4/0xc0 [amdgpu] [ 44.563629] RSP: 0018:ffffb36740edfc90 EFLAGS: 00010246 [ 44.563630] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [ 44.563630] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 44.563631] RBP: ffffb36740edfcb0 R08: 0000000000000000 R09: 0000000000000000 [ 44.563631] R10: 0000000000000000 R11: 0000000000000000 R12: ffff954c568e2ea8 [ 44.563631] R13: 0000000000000000 R14: ffff954c568c0000 R15: ffff954c568e2ea8 [ 44.563632] FS: 0000000000000000(0000) GS:ffff954f584c0000(0000) knlGS:0000000000000000 [ 44.563632] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 44.563633] CR2: 00007f028741ba70 CR3: 000000026ca10000 CR4: 0000000000750ee0 [ 44.563633] PKRU: 55555554 [ 44.563633] Call Trace: [ 44.563634] <TASK> [ 44.563634] vcn_v4_0_hw_fini+0x62/0x160 [amdgpu] [ 44.563700] vcn_v4_0_suspend+0x13/0x30 [amdgpu] [ 44.563755] amdgpu_device_ip_suspend_phase2+0x240/0x470 [amdgpu] [ 44.563806] amdgpu_device_ip_suspend+0x41/0x80 [amdgpu] [ 44.563858] amdgpu_device_pre_asic_reset+0xd9/0x4a0 [amdgpu] [ 44.563909] amdgpu_device_gpu_recover.cold+0x548/0xcf1 [amdgpu] [ 44.564006] amdgpu_debugfs_reset_work+0x4c/0x80 [amdgpu] [ 44.564061] process_one_work+0x21f/0x400 [ 44.564062] worker_thread+0x200/0x3f0 [ 44.564063] ? process_one_work+0x400/0x400 [ 44.564064] kthread+0xee/0x120 [ 44.564065] ? kthread_complete_and_exit+0x20/0x20 [ 44.564066] ret_from_fork+0x22/0x30 Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-01drm/amdgpu: add RAS POISON interrupt funcs for vcn_v2_6Horatio Zhang1-4/+21
Add ras_poison_irq and functions. Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-01drm/amdgpu: separate ras irq from vcn instance irq for UVD_POISONHoratio Zhang2-1/+29
Separate vcn RAS poison consumption handling from the instance irq, and register dedicated ras_poison_irq src and funcs for UVD_POISON. v2: - Separate ras irq from vcn instance irq - Improve the subject and code comments v3: - Split the patch into three parts - Improve the code comments Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>