summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)AuthorFilesLines
2021-12-22drm/amd/pm: fix reading SMU FW version from amdgpu_firmware_info on YCMario Limonciello1-0/+3
commit dcd10d879a9d1d4e929d374c2f24aba8fac3252b upstream. This value does not get cached into adev->pm.fw_version during startup for smu13 like it does for other SMU like smu12. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-22drm/amdgpu: don't override default ECO_BITs settingHawking Zhang8-9/+0
commit 841933d5b8aa853abe68e63827f68f50fab37226 upstream. Leave this bit as hardware default setting Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-22drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORELe Ma1-2/+2
commit f3a8076eb28cae1553958c629aecec479394bbe2 upstream. should count on GC IP base address Signed-off-by: Le Ma <le.ma@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-22drm/amd/pm: fix a potential gpu_metrics_table memory leakLang Yu1-0/+3
[ Upstream commit aa464957f7e660abd554f2546a588f6533720e21 ] Memory is allocated for gpu_metrics_table in renoir_init_smc_tables(), but not freed in int smu_v12_0_fini_smc_tables(). Free it! Fixes: 95868b85764a ("drm/amd/powerplay: add Renoir support for gpu metrics export") Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-22drm/amd/display: Set exit_optimized_pwr_state for DCN31Nicholas Kazlauskas1-0/+1
[ Upstream commit 7e4d2f30df3fb48f75ce9e96867d42bdddab83ac ] [Why] SMU now respects the PHY refclk disable request from driver. This causes a hang during hotplug when PHY refclk was disabled because it's not being re-enabled and the transmitter control starts on dc_link_detect. [How] We normally would re-enable the clk with exit_optimized_pwr_state but this is only set on DCN21 and DCN301. Set it for dcn31 as well. This fixes DMCUB timeouts in the PHY. Fixes: 64b1d0e8d500 ("drm/amd/display: Add DCN3.1 HWSEQ") Reviewed-by: Eric Yang <Eric.Yang2@amd.com> Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-17drm/amdkfd: process_info lock not needed for svmPhilip Yang1-9/+0
[ Upstream commit 3abfe30d803e62cc75dec254eefab3b04d69219b ] process_info->lock is used to protect kfd_bo_list, vm_list_head, n_vms and userptr valid/inval list, svm_range_restore_work and svm_range_set_attr don't access those, so do not need to take process_info lock. This will avoid potential circular locking issue. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-17drm/amd/display: add connector type check for CRC source setPerry Yuan1-0/+8
[ Upstream commit 2da34b7bb59e1caa9a336e0e20a76b8b6a4abea2 ] [Why] IGT bypass test will set crc source as DPRX,and display DM didn`t check connection type, it run the test on the HDMI connector ,then the kernel will be crashed because aux->transfer is set null for HDMI connection. This patch will skip the invalid connection test and fix kernel crash issue. [How] Check the connector type while setting the pipe crc source as DPRX or auto,if the type is not DP or eDP, the crtc crc source will not be set and report error code to IGT test,IGT will show the this subtest as no valid crtc/connector combinations found. 116.779714] [IGT] amd_bypass: starting subtest 8bpc-bypass-mode [ 117.730996] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 117.731001] #PF: supervisor instruction fetch in kernel mode [ 117.731003] #PF: error_code(0x0010) - not-present page [ 117.731004] PGD 0 P4D 0 [ 117.731006] Oops: 0010 [#1] SMP NOPTI [ 117.731009] CPU: 11 PID: 2428 Comm: amd_bypass Tainted: G OE 5.11.0-34-generic #36~20.04.1-Ubuntu [ 117.731011] Hardware name: AMD CZN/, BIOS AB.FD 09/07/2021 [ 117.731012] RIP: 0010:0x0 [ 117.731015] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [ 117.731016] RSP: 0018:ffffa8d64225bab8 EFLAGS: 00010246 [ 117.731017] RAX: 0000000000000000 RBX: 0000000000000020 RCX: ffffa8d64225bb5e [ 117.731018] RDX: ffff93151d921880 RSI: ffffa8d64225bac8 RDI: ffff931511a1a9d8 [ 117.731022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 117.731023] CR2: ffffffffffffffd6 CR3: 000000010d5a4000 CR4: 0000000000750ee0 [ 117.731023] PKRU: 55555554 [ 117.731024] Call Trace: [ 117.731027] drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper] [ 117.731036] drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper] [ 117.731040] drm_dp_start_crc+0x38/0xb0 [drm_kms_helper] [ 117.731047] amdgpu_dm_crtc_set_crc_source+0x1ae/0x3e0 [amdgpu] [ 117.731149] crtc_crc_open+0x174/0x220 [drm] [ 117.731162] full_proxy_open+0x168/0x1f0 [ 117.731165] ? open_proxy_open+0x100/0x100 BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1546 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-17drm/amdkfd: fix double free mem structurePhilip Yang1-3/+5
[ Upstream commit 494f2e42ce4a9ddffb5d8c5b2db816425ef90397 ] drm_gem_object_put calls release_notify callback to free the mem structure and unreserve_mem_limit, move it down after the last access of mem and make it conditional call. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-17drm/amd/display: Fix for the no Audio bug with Tiled DisplaysMustapha Ghaddar1-0/+4
[ Upstream commit 5ceaebcda9061c04f439c93961f0819878365c0f ] [WHY] It seems like after a series of plug/unplugs we end up in a situation where tiled display doesnt support Audio. [HOW] The issue seems to be related to when we check streams changed after an HPD, we should be checking the audio_struct as well to see if any of its values changed. Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: Mustapha Ghaddar <mustapha.ghaddar@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-17drm/amdgpu: check atomic flag to differeniate with legacy pathFlora Cui1-2/+2
[ Upstream commit 1053b9c948e614473819a1a5bcaff6d44e680dcf ] since vkms support atomic KMS interface Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <aleander.deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-17drm/amdgpu: cancel the correct hrtimer on exitFlora Cui1-2/+2
[ Upstream commit 3e467e478ed3a9701bb588d648d6e0ccb82ced09 ] Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-14drm/amd/display: Fix DPIA outbox timeout after S3/S4/resetNicholas Kazlauskas1-1/+6
commit af6902ec415655236adea91826bd96ed0ab16f42 upstream. [Why] The HW interrupt gets disabled after S3/S4/reset so we don't receive notifications for HPD or AUX from DMUB - leading to timeout and black screen with (or without) DPIA links connected. [How] Re-enable the interrupt after S3/S4/reset like we do for the other DC interrupts. Guard both instances of the outbox interrupt enable or we'll hang during restore on ASIC that don't support it. Fixes: 6eff272dbee7ad ("drm/amd/display: Fix DPIA outbox timeout after GPU reset") Reviewed-by: Jude Shih <Jude.Shih@amd.com> Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08drm/amd/display: Allow DSC on supported MST branch devicesNicholas Kazlauskas1-4/+16
commit 94ebc035456a4ccacfbbef60c444079a256623ad upstream. [Why] When trying to lightup two 4k60 non-DSC displays behind a branch device that supports DSC we can't lightup both at once due to bandwidth limitations - each requires 48 VCPI slots but we only have 63. [How] The workaround already exists in the code but is guarded by a CONFIG that cannot be set by the user and shouldn't need to be. Check for specific branch device IDs to device whether to enable the workaround for multiple display scenarios. Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08drm/amd/amdgpu: fix potential memleakBernard Zhao1-0/+1
[ Upstream commit 27dfaedc0d321b4ea4e10c53e4679d6911ab17aa ] In function amdgpu_get_xgmi_hive, when kobject_init_and_add failed There is a potential memleak if not call kobject_put. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Bernard Zhao <bernard@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-08drm/amd/amdkfd: Fix kernel panic when reset failed and been triggered againshaoyunl1-0/+5
[ Upstream commit 2cf49e00d40d5132e3d067b5aa6d84791929ab15 ] In SRIOV configuration, the reset may failed to bring asic back to normal but stop cpsch already been called, the start_cpsch will not be called since there is no resume in this case. When reset been triggered again, driver should avoid to do uninitialization again. Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-08drm/amd/pm: Remove artificial freq level on Navi1xLijo Lazar1-5/+8
[ Upstream commit be83a5676767c99c2417083c29d42aa1e109a69d ] Print Navi1x fine grained clocks in a consistent manner with other SOCs. Don't show aritificial DPM level when the current clock equals min or max. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-01drm/amdgpu/gfx9: switch to golden tsc registers for renoir+Alex Deucher1-11/+35
commit 53af98c091bc42fd9ec64cfabc40da4e5f3aae93 upstream. Renoir and newer gfx9 APUs have new TSC register that is not part of the gfxoff tile, so it can be read without needing to disable gfx off. Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-01drm/amdgpu/gfx10: add wraparound gpu counter check for APUs as wellAlex Deucher1-2/+13
commit 244ee398855df2adc7d3ac5702b58424a5f684cc upstream. Apply the same check we do for dGPUs for APUs as well. Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-01drm/amd/display: Set plane update flags for all planes in resetNicholas Kazlauskas1-2/+2
[ Upstream commit 21431f70f6014f81b0d118ff4fcee12b00b9dd70 ] [Why] We're only setting the flags on stream[0]'s planes so this logic fails if we have more than one stream in the state. This can cause a page flip timeout with multiple displays in the configuration. [How] Index into the stream_status array using the stream index - it's a 1:1 mapping. Fixes: cdaae8371aa9 ("drm/amd/display: Handle GPU reset for DC block") Reviewed-by: Harry Wentland <Harry.Wentland@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-01drm/amd/display: Fix DPIA outbox timeout after GPU resetNicholas Kazlauskas1-0/+2
[ Upstream commit 6eff272dbee7ad444c491c9a96d49e78e91e2161 ] [Why] The HW interrupt gets disabled after GPU reset so we don't receive notifications for HPD or AUX from DMUB - leading to timeout and black screen with (or without) DPIA links connected. [How] Re-enable the interrupt after GPU reset like we do for the other DC interrupts. Fixes: 81927e2808be ("drm/amd/display: Support for DMUB AUX") Reviewed-by: Jude Shih <Jude.Shih@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-01drm/amd/display: Fix OLED brightness control on eDPRoman Li1-0/+3
commit dab60582685aabdae2d4ff7ce716456bd0dc7a0f upstream. [Why] After commit ("drm/amdgpu/display: add support for multiple backlights") number of eDPs is defined while registering backlight device. However the panel's extended caps get updated once before register call. That leads to regression with extended caps like oled brightness control. [How] Update connector ext caps after register_backlight_device Fixes: 7fd13baeb7a3a4 ("drm/amdgpu/display: add support for multiple backlights") Link: https://www.reddit.com/r/AMDLaptops/comments/qst0fm/after_updating_to_linux_515_my_brightness/ Signed-off-by: Roman Li <Roman.Li@amd.com> Tested-by: Samuel Čavoj <samuel@cavoj.net> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jasdeep Dhillon <Jasdeep.Dhillon@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-01drm/amdgpu/pm: fix powerplay OD interfaceAlex Deucher6-79/+67
commit d5c7255dc7ff6e1239d794b9c53029d83ced04ca upstream. The overclocking interface currently appends data to a string. Revert back to using sprintf(). Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1774 Fixes: 6db0c87a0a8ee1 ("amdgpu/pm: Replace hwmgr smu usage of sprintf with sysfs_emit") Acked-by: Evan Quan <evan.quan@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-01drm/amdgpu: IH process reset count when restartPhilip Yang1-1/+2
commit 4d62555f624582e60be416fbc4772cd3fcd12b1a upstream. Otherwise when IH process restart, count is zero, the loop will not exit to wake_up_all after processing AMDGPU_IH_MAX_NUM_IVS interrupts. Cc: stable@vger.kernel.org Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-25drm/amd/pm: avoid duplicate powergate/ungate settingEvan Quan4-1/+23
commit 6ee27ee27ba8b2e725886951ba2d2d87f113bece upstream. Just bail out if the target IP block is already in the desired powergate/ungate state. This can avoid some duplicate settings which sometimes may cause unexpected issues. Link: https://lore.kernel.org/all/YV81vidWQLWvATMM@zn.tnic/ Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214921 Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215025 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1789 Fixes: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend") Signed-off-by: Evan Quan <evan.quan@amd.com> Tested-by: Borislav Petkov <bp@suse.de> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-25drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga ↵hongao1-0/+1
and dvi connectors commit bf552083916a7f8800477b5986940d1c9a31b953 upstream. amdgpu_connector_vga_get_modes missed function amdgpu_get_native_mode which assign amdgpu_encoder->native_mode with *preferred_mode result in amdgpu_encoder->native_mode.clock always be 0. That will cause amdgpu_connector_set_property returned early on: if ((rmx_type != DRM_MODE_SCALE_NONE) && (amdgpu_encoder->native_mode.clock == 0)) when we try to set scaling mode Full/Full aspect/Center. Add the missing function to amdgpu_connector_vga_get_mode can fix this. It also works on dvi connectors because amdgpu_connector_dvi_helper_funcs.get_mode use the same method. Signed-off-by: hongao <hongao@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-25drm/amd/display: Limit max DSC target bpp for specific monitorsRoman Li1-0/+35
commit 55eea8ef98641f6e1e1c202bd3a49a57c1dd4059 upstream. [Why] Some monitors exhibit corruption at 16bpp DSC. [How] - Add helpers for patching edid caps. - Use it for limiting DSC target bitrate to 15bpp for known monitors Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Cc: stable@vger.kernel.org Tested-by: Daniel Wheeler <Daniel.Wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-25drm/amd/display: Update swizzle mode enumsAlvin Lee2-3/+5
commit 58065a1e524de30df9a2d8214661d5d7eed0a2d9 upstream. [Why] Swizzle mode enum for DC_SW_VAR_R_X was existing, but not mapped correctly. [How] Update mapping and conversion for DC_SW_VAR_R_X. Reviewed-by: XiangBing Foo <XiangBing.Foo@amd.com> Reviewed-by: Martin Leung <Martin.Leung@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com> Cc: stable@vger.kernel.org Tested-by: Daniel Wheeler <Daniel.Wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18drm/amd/display: Look at firmware version to determine using dmub on dcn21Mario Limonciello1-1/+8
commit 91adec9e07097e538691daed5d934e7886dd1dc3 upstream. commit 652de07addd2 ("drm/amd/display: Fully switch to dmub for all dcn21 asics") switched over to using dmub on Renoir to fix Gitlab 1735, but this implied a new dependency on newer firmware which might not be met on older kernel versions. Since sw_init runs before hw_init, there is an opportunity to determine whether or not the firmware version is new to adjust the behavior. Cc: Roman.Li@amd.com BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1772 BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1735 Fixes: 652de07addd2 ("drm/amd/display: Fully switch to dmub for all dcn21 asics") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18drm/amdgpu: fix uvd crash on Polaris12 during driver unloadingEvan Quan1-11/+13
[ Upstream commit 4fc30ea780e0a5c1c019bc2e44f8523e1eed9051 ] There was a change(below) target for such issue: d82e2c249c8f ("drm/amdgpu: Fix crash on device remove/driver unload") But the fix for VI ASICs was missing there. This is a supplement for that. Fixes: d82e2c249c8f ("drm/amdgpu: Fix crash on device remove/driver unload") Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdgpu/powerplay: fix sysfs_emit/sysfs_emit_at handlingAlex Deucher7-12/+51
[ Upstream commit e9c76719c1e99caf95e70de74170291b9457bbc1 ] sysfs_emit and sysfs_emit_at requrie a page boundary aligned buf address. Make them happy! v2: fix sysfs_emit -> sysfs_emit_at missed conversions Cc: Lang Yu <lang.yu@amd.com> Cc: Darren Powell <darren.powell@amd.com> Fixes: 6db0c87a0a8e ("amdgpu/pm: Replace hwmgr smu usage of sprintf with sysfs_emit") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1774 Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/ttm: remove ttm_bo_vm_insert_huge()Jason Gunthorpe1-1/+1
[ Upstream commit 0d979509539ed1df883a30d442177ca7be609565 ] The huge page functionality in TTM does not work safely because PUD and PMD entries do not have a special bit. get_user_pages_fast() considers any page that passed pmd_huge() as usable: if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) || pmd_devmap(pmd))) { And vmf_insert_pfn_pmd_prot() unconditionally sets entry = pmd_mkhuge(pfn_t_pmd(pfn, prot)); eg on x86 the page will be _PAGE_PRESENT | PAGE_PSE. As such gup_huge_pmd() will try to deref a struct page: head = try_grab_compound_head(pmd_page(orig), refs, flags); and thus crash. Thomas further notices that the drivers are not expecting the struct page to be used by anything - in particular the refcount incr above will cause them to malfunction. Thus everything about this is not able to fully work correctly considering GUP_fast. Delete it entirely. It can return someday along with a proper PMD/PUD_SPECIAL bit in the page table itself to gate GUP_fast. Fixes: 314b6580adc5 ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Thomas Hellström <thomas.helllstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> [danvet: Update subject per Thomas' &Christian's review] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/0-v2-a44694790652+4ac-ttm_pmd_jgg@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bitsAlex Deucher1-2/+2
[ Upstream commit 403475be6d8b122c3e6b8a47e075926d7299e5ef ] The DMA mask on SI parts is 40 bits not 44. Copy paste typo. Fixes: 244511f386ccb9 ("drm/amdgpu: simplify and cleanup setting the dma mask") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1762 Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdgpu: fix a potential memory leak in amdgpu_device_fini_sw()Lang Yu1-1/+1
[ Upstream commit a5c5d8d50ecf5874be90a76e1557279ff8a30c9e ] amdgpu_fence_driver_sw_fini() should be executed before amdgpu_device_ip_fini(), otherwise fence driver resource won't be properly freed as adev->rings have been tore down. Fixes: 72c8c97b1522 ("drm/amdgpu: Split amdgpu_device_fini into early and late") Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdkfd: Fix an inappropriate error handling in allloc memory of gpuLang Yu1-1/+1
[ Upstream commit 5aeeac6fa38fca450faed9770f75b1470c0e2073 ] We should unreference a gem object instead of an amdgpu bo here. Fixes: fd9a9f8801de ("drm/amdgpu: Use GEM obj reference for KFD BOs") Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdgpu: fix warning for overflow checkArnd Bergmann2-2/+2
[ Upstream commit 335aea75b0d95518951cad7c4c676e6f1c02c150 ] The overflow check in amdgpu_bo_list_create() causes a warning with clang-14 on 64-bit architectures, since the limit can never be exceeded. drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c:74:18: error: result of comparison of constant 256204778801521549 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (num_entries > (SIZE_MAX - sizeof(struct amdgpu_bo_list)) ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The check remains useful for 32-bit architectures, so just avoid the warning by using size_t as the type for the count. Fixes: 920990cb080a ("drm/amdgpu: allocate the bo_list array after the list") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stageGuchun Chen1-4/+5
[ Upstream commit 6effad8abe0ba4db3d9c58ed585127858a990f35 ] adev->rmmio is set to be NULL in amdgpu_device_unmap_mmio to prevent access after pci_remove, however, in SRIOV case, amdgpu_virt_release_full_gpu will still use adev->rmmio for access after amdgpu_device_unmap_mmio. The patch is to move such SRIOV calling earlier to fini_early stage. Fixes: 07775fc13878 ("drm/amdgpu: Unmap all MMIO mappings") Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amd/display: Pass display_pipe_params_st as const in DMLHarry Wentland12-120/+120
[ Upstream commit 22667e6ec6b2ce9ca706e9061660b059725d009c ] [Why] This neither needs to be on the stack nor passed by value to each function call. In fact, when building with clang it seems to break the Linux's default 1024 byte stack frame limit. [How] We can simply pass this as a const pointer. This patch fixes these Coverity IDs Addresses-Coverity-ID: 1424031: ("Big parameter passed by value") Addresses-Coverity-ID: 1423970: ("Big parameter passed by value") Addresses-Coverity-ID: 1423941: ("Big parameter passed by value") Addresses-Coverity-ID: 1451742: ("Big parameter passed by value") Addresses-Coverity-ID: 1451887: ("Big parameter passed by value") Addresses-Coverity-ID: 1454146: ("Big parameter passed by value") Addresses-Coverity-ID: 1454152: ("Big parameter passed by value") Addresses-Coverity-ID: 1454413: ("Big parameter passed by value") Addresses-Coverity-ID: 1466144: ("Big parameter passed by value") Addresses-Coverity-ID: 1487237: ("Big parameter passed by value") Signed-off-by: Harry Wentland <harry.wentland@amd.com> Fixes: 3fe617ccafd6 ("Enable '-Werror' by default for all kernel builds") Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: amd-gfx@lists.freedesktop.org Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Leo Li <sunpeng.li@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Xinhui Pan <Xinhui.Pan@amd.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: llvm@lists.linux.dev Acked-by: Christian König <christian.koenig@amd.com> Build-tested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdgpu: Fix crash on device remove/driver unloadAndrey Grodzovsky7-90/+105
[ Upstream commit d82e2c249c8ffaec20fa618611ea2ab4dcfd4d01 ] Crash: BUG: unable to handle page fault for address: 00000000000010e1 RIP: 0010:vega10_power_gate_vce+0x26/0x50 [amdgpu] Call Trace: pp_set_powergating_by_smu+0x16a/0x2b0 [amdgpu] amdgpu_dpm_set_powergating_by_smu+0x92/0xf0 [amdgpu] amdgpu_dpm_enable_vce+0x2e/0xc0 [amdgpu] vce_v4_0_hw_fini+0x95/0xa0 [amdgpu] amdgpu_device_fini_hw+0x232/0x30d [amdgpu] amdgpu_driver_unload_kms+0x5c/0x80 [amdgpu] amdgpu_pci_remove+0x27/0x40 [amdgpu] pci_device_remove+0x3e/0xb0 device_release_driver_internal+0x103/0x1d0 device_release_driver+0x12/0x20 pci_stop_bus_device+0x79/0xa0 pci_stop_and_remove_bus_device_locked+0x1b/0x30 remove_store+0x7b/0x90 dev_attr_store+0x17/0x30 sysfs_kf_write+0x4b/0x60 kernfs_fop_write_iter+0x151/0x1e0 Why: VCE/UVD had dependency on SMC block for their suspend but SMC block is the first to do HW fini due to some constraints How: Since the original patch was dealing with suspend issues move the SMC block dependency back into suspend hooks as was done in V1 of the original patches. Keep flushing idle work both in suspend and HW fini seuqnces since it's essential in both cases. Fixes: 859e4659273f1d ("drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend") Fixes: bf756fb833cbe8 ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend") Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amd/display: dcn20_resource_construct reduce scope of FPU enabledAnson Jacob1-7/+9
[ Upstream commit bc39a69a2ac484e6575a958567c162ef56c9f278 ] Limit when FPU is enabled to only functions that does FPU operations for dcn20_resource_construct, which gets called during driver initialization. Enabling FPU operation disables preemption. Sleeping functions(mutex (un)lock, memory allocation using GFP_KERNEL, etc.) should not be called when preemption is disabled. Fixes the following case caught by enabling CONFIG_DEBUG_ATOMIC_SLEEP in kernel config [ 1.338434] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281 [ 1.347395] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 197, name: systemd-udevd [ 1.356356] CPU: 7 PID: 197 Comm: systemd-udevd Not tainted 5.13.0+ #3 [ 1.356358] Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 3405 02/01/2021 [ 1.356360] Call Trace: [ 1.356361] dump_stack+0x6b/0x86 [ 1.356366] ___might_sleep.cold+0x87/0x98 [ 1.356370] __might_sleep+0x4b/0x80 [ 1.356372] mutex_lock+0x21/0x50 [ 1.356376] smu_get_uclk_dpm_states+0x3f/0x80 [amdgpu] [ 1.356538] pp_nv_get_uclk_dpm_states+0x35/0x50 [amdgpu] [ 1.356711] init_soc_bounding_box+0xf9/0x210 [amdgpu] [ 1.356892] ? create_object+0x20d/0x340 [ 1.356897] ? dcn20_resource_construct+0x46f/0xd30 [amdgpu] [ 1.357077] dcn20_resource_construct+0x4b1/0xd30 [amdgpu] ... Tested on: 5700XT (NAVI10 0x1002:0x731F 0x1DA2:0xE410 0xC1) Cc: Christian König <christian.koenig@amd.com> Cc: Hersen Wu <hersenxs.wu@amd.com> Cc: Anson Jacob <Anson.Jacob@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Acked-by: Agustin Gutierrez <agustin.gutierrez@amd.com> Signed-off-by: Anson Jacob <Anson.Jacob@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdgpu/pm: properly handle sclk for profiling modes on vangoghAlex Deucher1-60/+29
[ Upstream commit 68e3871dcd6e547f6c47454492bc452356cb9eac ] When selecting between levels in the force performance levels interface sclk (gfxclk) was not set correctly for all levels. Select the proper sclk settings for all levels. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1726 Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdkfd: fix resume error when iommu disabled in PicassoYifan Zhang1-0/+1
[ Upstream commit 6f4b590aae217da16cfa44039a2abcfb209137ab ] When IOMMU disabled in sbios and kfd in iommuv2 path, IOMMU resume failure blocks system resume. Don't allow kfd to use iommu v2 when iommu is disabled. Reported-by: youling <youling257@gmail.com> Tested-by: youling <youling257@gmail.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amd/display: fix null pointer deref when plugging in displayAurabindo Pillai1-1/+2
[ Upstream commit 1f3b22e4eb162e0b1d423106a47484943a22a309 ] [Why&How] When system boots in headless mode, connecting a 4k display creates a null pointer dereference due to hubp for a certain plane being null. Add a condition to check for null hubp before dereferencing it. Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdkfd: rm BO resv on validation to avoid deadlockAlex Sierra1-6/+1
[ Upstream commit ec6abe831a843208e99a59adf108adba22166b3f ] This fix the deadlock with the BO reservations during SVM_BO evictions while allocations in VRAM are concurrently performed. More specific, while the ttm waits for the fence to be signaled (ttm_bo_wait), it already has the BO reserved. In parallel, the restore worker might be running, prefetching memory to VRAM. This also requires to reserve the BO, but blocks the mmap semaphore first. The deadlock happens when the SVM_BO eviction worker kicks in and waits for the mmap semaphore held in restore worker. Preventing signal the fence back, causing the deadlock until the ttm times out. We don't need to hold the BO reservation anymore during validation and mapping. Now the physical addresses are taken from hmm_range_fault. We also take migrate_mutex to prevent range migration while validate_and_map update GPU page table. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amd/display: Fix null pointer dereference for encodersJimmy Kizito2-2/+2
[ Upstream commit 60f39edd897ea134a4ddb789a6795681691c3183 ] [Why] Links which are dynamically assigned link encoders have their link encoder set to NULL. [How] Check that a pointer to a link_encoder object is non-NULL before using it. Reviewed-by: Aric Cyr <aric.cyr@amd.com> Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdgpu: Fix MMIO access page faultAndrey Grodzovsky2-8/+17
[ Upstream commit c03509cbc01559549700e14c4a6239f2572ab4ba ] Add more guards to MMIO access post device unbind/unplug Bug: https://bugs.archlinux.org/task/72092?project=1&order=dateopened&sort=desc&pagenum=1 Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdgpu: move iommu_resume before ip init/resumeJames Zhu1-0/+4
[ Upstream commit 9cec53c18a3170c7e5673c414da56aeecee94832 ] Separate iommu_resume from kfd_resume, and move it before other amdgpu ip init/resume. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-06drm/amd/display: Revert "Directly retrain link from debugfs"Anson Jacob1-1/+2
commit 1131cadfd7563975f3a4efcc6f7c1fdc872db38b upstream. This reverts commit f5b6a20c7ef40599095c796b0500d842ffdbc639. This patch broke new settings from taking effect. Hotplug is required for new settings to take effect. Reviewed-by: Mikita Lipski <mikita.lipski@amd.com> Acked-by: Mikita Lipski <mikita.lipski@amd.com> Signed-off-by: Anson Jacob <Anson.Jacob@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-06drm/amdgpu: revert "Add autodump debugfs node for gpu reset v8"Christian König4-91/+0
commit c8365dbda056578eebe164bf110816b1a39b4b7f upstream. This reverts commit 728e7e0cd61899208e924472b9e641dbeb0775c4. Further discussion reveals that this feature is severely broken and needs to be reverted ASAP. GPU reset can never be delayed by userspace even for debugging or otherwise we can run into in kernel deadlocks. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-06drm/amdkfd: fix boot failure when iommu is disabled in Picasso.Yifan Zhang2-4/+3
commit afd18180c07026f94a80ff024acef5f4159084a4 upstream. When IOMMU disabled in sbios and kfd in iommuv2 path, iommuv2 init will fail. But this failure should not block amdgpu driver init. Reported-by: youling <youling257@gmail.com> Tested-by: youling <youling257@gmail.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-28drm/amd/display: Fix deadlock when falling back to v2 from v3Nicholas Kazlauskas1-4/+2
[Why] A deadlock in the kernel occurs when we fallback from the V3 to V2 add_topology_to_display or remove_topology_to_display because they both try to acquire the dtm_mutex but recursive locking isn't supported on mutex_lock(). [How] Make the mutex_lock/unlock more fine grained and move them up such that they're only required for the psp invocation itself. Fixes: bf62221e9d0e ("drm/amd/display: Add DCN3.1 HDCP support") Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org