summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/pm
AgeCommit message (Collapse)AuthorFilesLines
2024-12-09drm/amd/pm: update current_socclk and current_uclk in gpu_metrics on smu v13.0.7Umio Yasuno1-0/+2
commit 2abf2f7032df4c4e7f6cf7906da59d0e614897d6 upstream. These were missed before. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3751 Signed-off-by: Umio Yasuno <coelacanth_dream@protonmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-22drm/amd/pm: Vangogh: Fix kernel memory out of bounds writeTvrtko Ursulin1-4/+3
commit 4aa923a6e6406b43566ef6ac35a3d9a3197fa3e8 upstream. KASAN reports that the GPU metrics table allocated in vangogh_tables_init() is not large enough for the memset done in smu_cmn_init_soft_gpu_metrics(). Condensed report follows: [ 33.861314] BUG: KASAN: slab-out-of-bounds in smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu] [ 33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067 ... [ 33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G W 6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544 [ 33.861816] Tainted: [W]=WARN [ 33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023 [ 33.861822] Call Trace: [ 33.861826] <TASK> [ 33.861829] dump_stack_lvl+0x66/0x90 [ 33.861838] print_report+0xce/0x620 [ 33.861853] kasan_report+0xda/0x110 [ 33.862794] kasan_check_range+0xfd/0x1a0 [ 33.862799] __asan_memset+0x23/0x40 [ 33.862803] smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.863306] vangogh_get_gpu_metrics_v2_4+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.864257] vangogh_common_get_gpu_metrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.865682] amdgpu_dpm_get_gpu_metrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.866160] amdgpu_get_gpu_metrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.867135] dev_attr_show+0x43/0xc0 [ 33.867147] sysfs_kf_seq_show+0x1f1/0x3b0 [ 33.867155] seq_read_iter+0x3f8/0x1140 [ 33.867173] vfs_read+0x76c/0xc50 [ 33.867198] ksys_read+0xfb/0x1d0 [ 33.867214] do_syscall_64+0x90/0x160 ... [ 33.867353] Allocated by task 378 on cpu 7 at 22.794876s: [ 33.867358] kasan_save_stack+0x33/0x50 [ 33.867364] kasan_save_track+0x17/0x60 [ 33.867367] __kasan_kmalloc+0x87/0x90 [ 33.867371] vangogh_init_smc_tables+0x3f9/0x840 [amdgpu] [ 33.867835] smu_sw_init+0xa32/0x1850 [amdgpu] [ 33.868299] amdgpu_device_init+0x467b/0x8d90 [amdgpu] [ 33.868733] amdgpu_driver_load_kms+0x19/0xf0 [amdgpu] [ 33.869167] amdgpu_pci_probe+0x2d6/0xcd0 [amdgpu] [ 33.869608] local_pci_probe+0xda/0x180 [ 33.869614] pci_device_probe+0x43f/0x6b0 Empirically we can confirm that the former allocates 152 bytes for the table, while the latter memsets the 168 large block. Root cause appears that when GPU metrics tables for v2_4 parts were added it was not considered to enlarge the table to fit. The fix in this patch is rather "brute force" and perhaps later should be done in a smarter way, by extracting and consolidating the part version to size logic to a common helper, instead of brute forcing the largest possible allocation. Nevertheless, for now this works and fixes the out of bounds write. v2: * Drop impossible v3_0 case. (Mario) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: 41cec40bc9ba ("drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Evan Quan <evan.quan@amd.com> Cc: Wenyou Yang <WenYou.Yang@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20241025145639.19124-1-tursulin@igalia.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 0880f58f9609f0200483a49429af0f050d281703) Cc: stable@vger.kernel.org # v6.6+ Signed-off-by: Bin Lan <bin.lan.cn@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-22drm/amdgpu/swsmu: Only force workload setup on initAlex Deucher1-3/+3
commit cb07c8338fc2b9d5f949a19d4a07ee4d5ecf8793 upstream. Needed to set the workload type at init time so that we can apply the navi3x margin optimization. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131 Fixes: c50fe289ed72 ("drm/amdgpu/swsmu: always force a state reprogram on init") Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 580ad7cbd4b7be8d2cb5ab5c1fca6bb76045eb0e) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-10drm/amd/pm: ensure the fw_info is not null before using itTim Huang1-0/+2
[ Upstream commit 186fb12e7a7b038c2710ceb2fb74068f1b5d55a4 ] This resolves the dereference null return value warning reported by Coverity. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12Revert "drm/amdgpu: align pp_power_profile_mode with kernel docs"Alex Deucher1-2/+4
commit 1a8d845470941f1b6de1b392227530c097dc5e0c upstream. This reverts commit 8f614469de248a4bc55fb07e55d5f4c340c75b11. This breaks some manual setting of the profile mode in certain cases. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3600 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7a199557643e993d4e7357860624b8aa5d8f4340) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-08drm/amd/pm: check negtive return for table entriesJesse Zhang1-5/+8
[ Upstream commit f76059fe14395b37ba8d997eb0381b1b9e80a939 ] Function hwmgr->hwmgr_func->get_num_of_pp_table_entries(hwmgr) returns a negative number Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Suggested-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amd/pm: check specific index for smu13Jesse Zhang1-0/+2
[ Upstream commit a3ac9d1c9751f00026c2d98b802ec8a98626c3ed ] Check for specific indexes that may be invalid values. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Suggested-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amd/pm: check specific index for aldebaranJesse Zhang1-1/+2
[ Upstream commit 0ce8ef2639c112ae203c985b758389e378630aac ] Check for specific indexes that may be invalid values. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu/pm: Check input value for CUSTOM profile mode setting on legacy SOCsMa Jun2-3/+7
[ Upstream commit df0a9bd92fbbd3fcafcb2bce6463c9228a3e6868 ] Check the input value for CUSTOM profile mode setting on legacy SOCs. Otherwise we may use uninitalized value of input[] Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu/pm: Fix uninitialized variable agc_btc_responseMa Jun1-2/+7
[ Upstream commit df4409d8a04dd39d7f2aa0c5f528a56b99eaaa13 ] Assign an default value to agc_btc_response in failed case Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu/pm: Fix uninitialized variable warning for smu10Ma Jun3-16/+48
[ Upstream commit 336c8f558d596699d3d9814a45600139b2f23f27 ] Check return value of smum_send_msg_to_smc to fix uninitialized variable varning Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amd/pm: fix uninitialized variable warnings for vangogh_pptTim Huang1-0/+14
[ Upstream commit b2871de6961d24d421839fbfa4aa3008ec9170d5 ] 1. Fix a issue that using uninitialized mask to get the ultimate frequency. 2. Check return of smu_cmn_send_smc_msg_with_param to avoid using uninitialized variable residency. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amd/pm: fix uninitialized variable warnings for vega10_hwmgrTim Huang2-13/+39
[ Upstream commit 5fa7d540d95d97ddc021a74583f6b3da4df9c93a ] Clear warnings that using uninitialized variable when fails to get the valid value from SMU. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amd/pm: fix the Out-of-bounds read warningJesse Zhang1-2/+3
[ Upstream commit 12c6967428a099bbba9dfd247bb4322a984fcc0b ] using index i - 1U may beyond element index for mc_data[] when i = 0. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amd/pm: Fix negative array index readJesse Zhang1-6/+21
[ Upstream commit c8c19ebf7c0b202a6a2d37a52ca112432723db5f ] Avoid using the negative values for clk_idex as an index into an array pptable->DpmDescriptor. V2: fix clk_index return check (Tim Huang) Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amd/pm: fix warning using uninitialized value of max_vid_stepJesse Zhang1-1/+4
[ Upstream commit 17e3bea65cdc453695b2fe4ff26d25d17f5339e9 ] Check the return of pp_atomfwctrl_get_Voltage_table_v4 as it may fail to initialize max_vid_step V2: change the check condition (Tim Huang) Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amd/pm: fix uninitialized variable warning for smu8_hwmgrTim Huang1-3/+12
[ Upstream commit 86df36b934640866eb249a4488abb148b985a0d9 ] Clear warnings that using uninitialized value level when fails to get the value from SMU. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amd/pm: fix uninitialized variable warningJesse Zhang1-1/+1
[ Upstream commit 7c836905520703dbc8b938993b6d4d718bc739f3 ] Check the return of function smum_send_msg_to_smc as it may fail to initialize the variable. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu/pm: Check the return value of smum_send_msg_to_smcMa Jun1-2/+6
[ Upstream commit 579f0c21baec9e7506b6bb3f60f0a9b6d07693b4 ] Check the return value of smum_send_msg_to_smc, otherwise we might use an uninitialized variable "now" Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-04drm/amdgpu/swsmu: always force a state reprogram on initAlex Deucher1-6/+9
commit d420c857d85777663e8d16adfc24463f5d5c2dbc upstream. Always reprogram the hardware state on init. This ensures the PMFW state is explicitly programmed and we are not relying on the default PMFW state. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3131 Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c50fe289ed7207f71df3b5f1720512a9620e84fb) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04drm/amdgpu: align pp_power_profile_mode with kernel docsAlex Deucher1-4/+2
commit 8f614469de248a4bc55fb07e55d5f4c340c75b11 upstream. The kernel doc says you need to select manual mode to adjust this, but the code only allows you to adjust it when manual mode is not selected. Remove the manual mode check. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bbb05f8a9cd87f5046d05a0c596fddfb714ee457) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29drm/amd/pm: fix error flow in sensor fetchingAlex Deucher1-7/+7
[ Upstream commit a5600853167aeba5cade81f184a382a0d1b14641 ] Sensor fetching functions should return an signed int to handle errors properly. Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reported-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-14drm/amd/pm: Fix the null pointer dereference for vega10_hwmgrBob Zhou1-4/+25
[ Upstream commit 50151b7f1c79a09117837eb95b76c2de76841dab ] Check return value and conduct null pointer handling to avoid null pointer dereference. Signed-off-by: Bob Zhou <bob.zhou@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-14drm/amdgpu/pm: Fix the null pointer dereference in apply_state_adjust_rulesMa Jun3-10/+18
[ Upstream commit d19fb10085a49b77578314f69fff21562f7cd054 ] Check the pointer value to fix potential null pointer dereference Acked-by: Yang Wang<kevinyang.wang@amd.com> Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-14drm/amdgpu/pm: Fix the null pointer dereference for smu7Ma Jun1-26/+24
[ Upstream commit c02c1960c93eede587576625a1221205a68a904f ] optimize the code to avoid pass a null pointer (hwmgr->backend) to function smu7_update_edc_leakage_table. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-14drm/amdgpu/pm: Fix the param type of set_power_profile_modeMa Jun3-16/+16
[ Upstream commit f683f24093dd94a831085fe0ea8e9dc4c6c1a2d1 ] Function .set_power_profile_mode need an array as input parameter. So define variable workload as an array to fix the below coverity warning. "Passing &workload to function hwmgr->hwmgr_func->set_power_profile_mode which uses it as an array. This might corrupt or misinterpret adjacent memory locations" Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03drm/amd/pm: Fix aldebaran pcie speed reportingLijo Lazar1-2/+2
[ Upstream commit b6420021e17e262c57bb289d0556ee181b014f9c ] Fix the field definitions for LC_CURRENT_DATA_RATE. Fixes: c05d1c401572 ("drm/amd/swsmu: add aldebaran smu13 ip support (v3)") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-27drm/amdgpu: fix UBSAN warning in kv_dpm.cAlex Deucher1-0/+2
commit f0d576f840153392d04b2d52cf3adab8f62e8cb6 upstream. Adds bounds check for sumo_vid_mapping_entry. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3392 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16drm/amd: Fix shutdown (again) on some SMU v13.0.4/11 platformsMario Limonciello1-9/+11
commit 267cace556e8a53d703119f7435ab556209e5b6a upstream. commit cd94d1b182d2 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") attempted to fix shutdown issues that were reported since commit 31729e8c21ec ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11") but caused issues for some people. Adjust the workaround flow to properly only apply in the S4 case: -> For shutdown go through SMU_MSG_PrepareMp1ForUnload -> For S4 go through SMU_MSG_GfxDeviceDriverReset and SMU_MSG_PrepareMp1ForUnload Reported-and-tested-by: lectrode <electrodexsnet@gmail.com> Closes: https://github.com/void-linux/void-packages/issues/50417 Cc: stable@vger.kernel.org Fixes: cd94d1b182d2 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12drm/amd/pm: Restore config space after resetLijo Lazar1-0/+25
[ Upstream commit 30d1cda8ce31ab49051ff7159280c542a738b23d ] During mode-2 reset, pci config space registers are affected at device side. However, certain platforms have switches which assign virtual BAR addresses and returns the same even after device is reset. This affects pci_restore_state() as it doesn't issue another config write, if the value read is same as the saved value. Add a workaround to write saved config space values from driver side. Presently, these switches are in platforms with SMU v13.0.6 SOCs, hence restrict the workaround only to those. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 usersMario Limonciello1-1/+1
[ Upstream commit cd94d1b182d2986378550c9087571991bfee01d4 ] Limit the workaround introduced by commit 31729e8c21ec ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11") to only run in the s4 path. Cc: Tim Huang <Tim.Huang@amd.com> Fixes: 31729e8c21ec ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3351 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-17drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11Tim Huang1-1/+11
commit 31729e8c21ecfd671458e02b6511eb68c2225113 upstream. While doing multiple S4 stress tests, GC/RLC/PMFW get into an invalid state resulting into hard hangs. Adding a GFX reset as workaround just before sending the MP1_UNLOAD message avoids this failure. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Mario Limonciello <superm1@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-03drm/amdgpu/pm: Fix the error of pwm1_enable settingMa Jun1-1/+11
commit 0dafaf659cc463f2db0af92003313a8bc46781cd upstream. Fix the pwm_mode value error which used for pwm1_enable setting Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-27drm/amd/pm: Fix esm reg mask use to get pcie speedAsad Kamal3-6/+6
[ Upstream commit b485b899e5b8f83723833feca30a1a1e3df778df ] Fix mask used for esm ctrl register to get pcie link speed on smu_v11_0_3, smu_v13_0_2 & smu_v13_0_6 Fixes: 511a95552ec8 ("drm/amd/pm: Add SMU 13.0.6 support") Fixes: c05d1c401572 ("drm/amd/swsmu: add aldebaran smu13 ip support (v3)") Fixes: f1c378593153 ("drm/amd/powerplay: add Arcturus support for gpu metrics export") Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-06Revert "drm/amd/pm: resolve reboot exception for si oland"Alex Deucher1-0/+29
commit 955558030954b9637b41c97b730f9b38c92ac488 upstream. This reverts commit e490d60a2f76bff636c68ce4fe34c1b6c34bbd86. This causes hangs on SI when DC is enabled and errors on driver reboot and power off cycles. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3216 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2755 Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-05drm/amd/powerplay: Fix kzalloc parameter 'ATOM_Tonga_PPM_Table' in ↵Srinivasan Shanmugam1-1/+1
'get_platform_power_management_table()' [ Upstream commit 6616b5e1999146b1304abe78232af810080c67e3 ] In 'struct phm_ppm_table *ptr' allocation using kzalloc, an incorrect structure type is passed to sizeof() in kzalloc, larger structure types were used, thus using correct type 'struct phm_ppm_table' fixes the below: drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/process_pptables_v1_0.c:203 get_platform_power_management_table() warn: struct type mismatch 'phm_ppm_table vs _ATOM_Tonga_PPM_Table' Cc: Eric Huang <JinHuiEric.Huang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-02-05drm/amdgpu: fix avg vs input power reporting on smu7Alex Deucher1-1/+16
[ Upstream commit 25852d4b97572ff62ffee574cb8bb4bc551af23a ] Hawaii, Bonaire, Fiji, and Tonga support average power, the others support current power. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-02-01drm/amdgpu/pm: Fix the power source flag errorMa Jun3-10/+7
commit ca1ffb174f16b699c536734fc12a4162097c49f4 upstream. The power source flag should be updated when [1] System receives an interrupt indicating that the power source has changed. [2] System resumes from suspend or runtime suspend Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-26drm/amd/pm/smu7: fix a memleak in smu7_hwmgr_backend_initZhipeng Lu1-1/+5
[ Upstream commit 2f3be3ca779b11c332441b10e00443a2510f4d7b ] The hwmgr->backend, (i.e. data) allocated by kzalloc is not freed in the error-handling paths of smu7_get_evv_voltages and smu7_update_edc_leakage_table. However, it did be freed in the error-handling of phm_initializa_dynamic_state_adjustment_rule_settings, by smu7_hwmgr_backend_fini. So the lack of free in smu7_get_evv_voltages and smu7_update_edc_leakage_table is considered a memleak in this patch. Fixes: 599a7e9fe1b6 ("drm/amd/powerplay: implement smu7 hwmgr to manager asics with smu ip version 7.") Fixes: 8f0804c6b7d0 ("drm/amd/pm: add edc leakage controller setting") Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-26drm/amd/pm: fix a double-free in amdgpu_parse_extended_power_tableZhipeng Lu1-39/+13
[ Upstream commit a6582701178a47c4d0cb2188c965c59c0c0647c8 ] The amdgpu_free_extended_power_table is called in every error-handling paths of amdgpu_parse_extended_power_table. However, after the following call chain of returning: amdgpu_parse_extended_power_table |-> kv_dpm_init / si_dpm_init (the only two caller of amdgpu_parse_extended_power_table) |-> kv_dpm_sw_init / si_dpm_sw_init (the only caller of kv_dpm_init / si_dpm_init, accordingly) |-> kv_dpm_fini / si_dpm_fini (goto dpm_failed in xx_dpm_sw_init) |-> amdgpu_free_extended_power_table As above, the amdgpu_free_extended_power_table is called twice in this returning chain and thus a double-free is triggered. Similarily, the last kfree in amdgpu_parse_extended_power_table also cause a double free with amdgpu_free_extended_power_table in kv_dpm_fini. Fixes: 84176663e70d ("drm/amd/pm: create a new holder for those APIs used only by legacy ASICs(si/kv)") Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-26drivers/amd/pm: fix a use-after-free in kv_parse_power_tableZhipeng Lu1-3/+1
[ Upstream commit 28dd788382c43b330480f57cd34cde0840896743 ] When ps allocated by kzalloc equals to NULL, kv_parse_power_table frees adev->pm.dpm.ps that allocated before. However, after the control flow goes through the following call chains: kv_parse_power_table |-> kv_dpm_init |-> kv_dpm_sw_init |-> kv_dpm_fini The adev->pm.dpm.ps is used in the for loop of kv_dpm_fini after its first free in kv_parse_power_table and causes a use-after-free bug. Fixes: a2e73f56fa62 ("drm/amdgpu: Add support for CIK parts") Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-26drm/amd/pm: fix a double-free in si_dpm_initZhipeng Lu1-3/+2
[ Upstream commit ac16667237a82e2597e329eb9bc520d1cf9dff30 ] When the allocation of adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries fails, amdgpu_free_extended_power_table is called to free some fields of adev. However, when the control flow returns to si_dpm_sw_init, it goes to label dpm_failed and calls si_dpm_fini, which calls amdgpu_free_extended_power_table again and free those fields again. Thus a double-free is triggered. Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)") Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-08drm/amd/pm: fix a memleak in aldebaran_tables_initDinghao Liu1-1/+4
[ Upstream commit 7a88f23e768491bae653b444a96091d2aaeb0818 ] When kzalloc() for smu_table->ecc_table fails, we should free the previously allocated resources to prevent memleak. Fixes: edd794208555 ("drm/amd/pm: add message smu to get ecc_table v2") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28drm/amdgpu/smu13: drop compute workload workaroundAlex Deucher1-30/+2
commit 23170863ea0a0965d224342c0eb2ad8303b1f267 upstream. This was fixed in PMFW before launch and is no longer required. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28drm/amd/pm: Fix error of MACO flag setting codeMa Jun2-8/+9
commit 7f3e6b840fa8b0889d776639310a5dc672c1e9e1 upstream. MACO only works if BACO is supported Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28drm/amd/pm: Handle non-terminated overdrive commands.Bas Nieuwenhuizen1-2/+6
commit 08e9ebc75b5bcfec9d226f9e16bab2ab7b25a39a upstream. The incoming strings might not be terminated by a newline or a 0. (found while testing a program that just wrote the string itself, causing a crash) Cc: stable@vger.kernel.org Fixes: e3933f26b657 ("drm/amd/pp: Add edit/commit/show OD clock/voltage support in sysfs") Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28drm/amd: check num of link levels when update pcie paramLin.Cao1-0/+3
[ Upstream commit 406e8845356d18bdf3d3a23b347faf67706472ec ] In SR-IOV environment, the value of pcie_table->num_of_link_levels will be 0, and num_of_levels - 1 will cause array index out of bounds Signed-off-by: Lin.Cao <lincao12@amd.com> Acked-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28drm/amd: Disable PP_PCIE_DPM_MASK when dynamic speed switching not supportedMario Limonciello3-5/+3
[ Upstream commit fbf1035b033a51eee48d5f42e781b02fff272ca0 ] Rather than individual ASICs checking for the quirk, set the quirk at the driver level. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28drm/amd: Fix UBSAN array-index-out-of-bounds for Polaris and TongaMario Limonciello1-6/+6
[ Upstream commit 0f0e59075b5c22f1e871fbd508d6e4f495048356 ] For pptable structs that use flexible array sizes, use flexible arrays. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28drm/amd: Fix UBSAN array-index-out-of-bounds for SMU7Mario Limonciello1-2/+2
[ Upstream commit 760efbca74a405dc439a013a5efaa9fadc95a8c3 ] For pptable structs that use flexible array sizes, use flexible arrays. Suggested-by: Felix Held <felix.held@amd.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2874 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>