summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)AuthorFilesLines
2025-08-28drm/i915/lnl+/tc: Fix max lane count HW readoutImre Deak1-0/+9
commit c87514a0bb0a64507412a2d98264060dc0c1562a upstream. On LNL+ for a disconnected sink the pin assignment value gets cleared by the HW/FW as soon as the sink gets disconnected, even if the PHY ownership got acquired already by the BIOS/driver (and hence the PHY itself is still connected and used by the display). During HW readout this can result in detecting the PHY's max lane count as 0 - matching the above cleared aka NONE pin assignment HW state. For a connected PHY the driver in general (outside of intel_tc.c) expects the max lane count value to be valid for the video mode enabled on the corresponding output (1, 2 or 4). Ensure this by setting the max lane count to 4 in this case. Note, that it doesn't matter if this lane count happened to be more than the max lane count with which the PHY got connected and enabled, since the only thing the driver can do with such an output - where the DP-alt sink is disconnected - is to disable the output. v2: Rebased on change reading out the pin configuration only if the PHY is connected. Cc: stable@vger.kernel.org # v6.8+ Reported-by: Charlton Lin <charlton.lin@intel.com> Tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com> Reviewed-by: Mika Kahola <mika.kahola@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://lore.kernel.org/r/20250811080152.906216-4-imre.deak@intel.com (cherry picked from commit 33cf70bc0fe760224f892bc1854a33665f27d482) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/i915/icl+/tc: Cache the max lane count valueImre Deak1-9/+48
commit 5fd35236546abe780eaadb7561e09953719d4fc3 upstream. The PHY's pin assignment value in the TCSS_DDI_STATUS register - as set by the HW/FW based on the connected DP-alt sink's TypeC/PD pin assignment negotiation - gets cleared by the HW/FW on LNL+ as soon as the sink gets disconnected, even if the PHY ownership got acquired already by the driver (and hence the PHY itself is still connected and used by the display). This is similar to how the PHY Ready flag gets cleared on LNL+ in the same register. To be able to query the max lane count value on LNL+ - which is based on the above pin assignment - at all times even after the sink gets disconnected, the max lane count must be determined and cached during the PHY's HW readout and connect sequences. Do that here, leaving the actual use of the cached value to a follow-up change. v2: Don't read out the pin configuration if the PHY is disconnected. Cc: stable@vger.kernel.org # v6.8+ Reported-by: Charlton Lin <charlton.lin@intel.com> Tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com> Reviewed-by: Mika Kahola <mika.kahola@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://lore.kernel.org/r/20250811080152.906216-3-imre.deak@intel.com (cherry picked from commit 3e32438fc406761f81b1928d210b3d2a5e7501a0) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/i915/lnl+/tc: Fix handling of an enabled/disconnected dp-alt sinkImre Deak1-6/+11
commit f52d6aa98379842fc255d93282655566f2114e0c upstream. The TypeC PHY HW readout during driver loading and system resume determines which TypeC mode the PHY is in (legacy/DP-alt/TBT-alt) and whether the PHY is connected, based on the PHY's Owned and Ready flags. For the PHY to be in DP-alt or legacy mode and for the PHY to be in the connected state in these modes, both the Owned (set by the BIOS/driver) and the Ready (set by the HW) flags should be set. On ICL-MTL the HW kept the PHY's Ready flag set after the driver connected the PHY by acquiring the PHY ownership (by setting the Owned flag), until the driver disconnected the PHY by releasing the PHY ownership (by clearing the Owned flag). On LNL+ this has changed, in that the HW clears the Ready flag as soon as the sink gets disconnected, even if the PHY ownership was acquired already and hence the PHY is being used by the display. When inheriting the HW state from BIOS for a PHY connected in DP-alt mode on which the sink got disconnected - i.e. in a case where the sink was connected while BIOS/GOP was running and so the sink got enabled connecting the PHY, but the user disconnected the sink by the time the driver loaded - the PHY Owned but not Ready state must be accounted for on LNL+ according to the above. Do that by assuming on LNL+ that the PHY is connected in DP-alt mode whenever the PHY Owned flag is set, regardless of the PHY Ready flag. This fixes a problem on LNL+, where the PHY TypeC mode / connected state was detected incorrectly for a DP-alt sink, which got connected and then disconnected by the user in the above way. v2: Rename tc_phy_in_legacy_or_dp_alt_mode() to tc_phy_owned_by_display(). (Luca, Jani) Cc: Jani Nikula <jani.nikula@intel.com> Cc: stable@vger.kernel.org # v6.8+ Reported-by: Charlton Lin <charlton.lin@intel.com> Tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com> Reviewed-by: Mika Kahola <mika.kahola@intel.com> Reviewed-by: Luca Coelho <luciano.coelho@intel.com> [Imre: Add one-liner function documentation for tc_phy_owned_by_display()] Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://lore.kernel.org/r/20250811080152.906216-2-imre.deak@intel.com (cherry picked from commit 89f4b196ee4b056e0e8c179b247b29d4a71a4e7e) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/i915/gt: Relocate compression repacking WA for JSL/EHLSebastian Brzezinka1-9/+11
commit 8236820fd767f400d1baefb71bc7e36e37730a1e upstream. CACHE_MODE_0 registers should be saved and restored as part of the context, not during engine reset. Move the related workaround (Disable Repacking for Compression) from rcs_engine_wa_init() to icl_ctx_workarounds_init() for Jasper Lake and Elkhart Lake platforms. This ensures the WA is applied during context initialisation. BSPEC: 11322 Fixes: 0ddae025ab6c ("drm/i915: Disable compression tricks on JSL") Closes: Fixes: 0ddae025ab6c ("drm/i915: Disable compression tricks on JSL") Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com> Cc: stable@vger.kernel.org # v6.13+ Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/4feaa24094e019e000ceb6011d8cd419b0361b3f.1754902406.git.sebastian.brzezinka@intel.com (cherry picked from commit c9932f0d604e4c8f2c6018e598a322acb43c68a2) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/nouveau/gsp: fix mismatched alloc/free for kvmalloc()Qianfeng Rong1-2/+2
commit 989fe6771266bdb82a815d78802c5aa7c918fdfd upstream. Replace kfree() with kvfree() for memory allocated by kvmalloc(). Compile-tested only. Cc: stable@vger.kernel.org Fixes: 8a8b1ec5261f ("drm/nouveau/gsp: split rpc handling out on its own") Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Reviewed-by: Timur Tabi <ttabi@nvidia.com> Acked-by: Zhi Wang <zhiw@nvidia.com> Link: https://lore.kernel.org/r/20250813125412.96178-1-rongqianfeng@vivo.com Signed-off-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/i915: silence rpm wakeref asserts on GEN11_GU_MISC_IIR accessJani Nikula1-0/+4
commit ff646d033783068cc5b38924873cab4a536b17c1 upstream. Commit 8d9908e8fe9c ("drm/i915/display: remove small micro-optimizations in irq handling") not only removed the optimizations, it also enabled wakeref asserts for the GEN11_GU_MISC_IIR access. Silence the asserts by wrapping the access inside intel_display_rpm_assert_{block,unblock}(). Reported-by: "Jason A. Donenfeld" <Jason@zx2c4.com> Closes: https://lore.kernel.org/r/aG0tWkfmxWtxl_xc@zx2c4.com Fixes: 8d9908e8fe9c ("drm/i915/display: remove small micro-optimizations in irq handling") Cc: stable@vger.kernel.org # v6.13+ Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Link: https://lore.kernel.org/r/20250805115656.832235-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit cbd3baeffbc08052ce7dc53f11bf5524b4411056) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu/swm14: Update power limit logicAlex Deucher1-5/+25
commit 79e25cd06e85105c75701ef1773c6c64bb304091 upstream. Take into account the limits from the vbios. Ported from the SMU13 code. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4352 Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 203cc7f1dd86f2c8de5c3c6182f19adac7c9c206) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: Don't overwrite dce60_clk_mgrTimur Kristóf1-1/+0
commit 4db9cd554883e051df1840d4d58d636043101034 upstream. dc_clk_mgr_create accidentally overwrites the dce60_clk_mgr with the dce_clk_mgr, causing incorrect behaviour on DCE6. Fix it by removing the extra dce_clk_mgr_construct. Fixes: 62eab49faae7 ("drm/amd/display: hide VGH asic specific structs") Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bbddcbe36a686af03e91341b9bbfcca94bd45fb6) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: Revert "drm/amd/display: Fix AMDGPU_MAX_BL_LEVEL value"Mario Limonciello1-4/+4
commit 8e6a18cbf3ee2c1e3d0afd8d3debd0ba8738ad0c upstream. This reverts commit 66abb996999de0d440a02583a6e70c2c24deab45. This broke custom brightness curves but it wasn't obvious because of other related changes. Custom brightness curves are always from a 0-255 input signal. The correct fix was to fix the default value which was done by [1]. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4412 Link: https://lore.kernel.org/amd-gfx/0f094c4b-d2a3-42cd-824c-dc2858a5618d@kernel.org/T/#m69f875a7e69aa22df3370b3e3a9e69f4a61fdaf2 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6ec8a5cbec751625133461600d0d4950ffd3a214) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: Pass up errors for reset GPU that fails to init HWMario Limonciello1-1/+3
commit 2b6943df54136f40aff8a6d7ba7c26724d89a0bd upstream. [Why] If a GPU is in reset and the hardware fails to initialize the rest of the resume sequence shouldn't be run. [How] Pass error code up to caller of dm_resume(). Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: fix initial backlight brightness calculationLauri Tirkkonen1-2/+2
commit 9c2883057b3c861879b647f34e8bc448954e8729 upstream. DIV_ROUND_CLOSEST(x, 100) returns either 0 or 1 if 0<x<=100, so the division needs to be performed after the multiplication and not the other way around, to properly scale the value. Fixes: 8b5f3a229a70 ("drm/amd/display: Fix default DC and AC levels") Signed-off-by: Lauri Tirkkonen <lauri@hacktheplanet.fi> Cc: stable@vger.kernel.org Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/aH2Q_HJvxKbW74vU@hacktheplanet.fi Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: Fix DCE 6.0 and 6.4 PLL programming.Timur Kristóf2-14/+25
commit 1c8dc3e088e09531bcdfc9fe348204abc3decb6c upstream. Apparently, both DCE 6.0 and 6.4 have 3 PLLs, but PLL0 can only be used for DP. Make sure to initialize the correct amount of PLLs in DC for these DCE versions and use PLL0 only for DP. Also, on DCE 6.0 and 6.4, the PLL0 needs to be powered on at initialization as opposed to DCE 6.1 and 7.x which use a different clock source for DFS. The following functions were used as reference from the old radeon driver implementation of DCE 6.x: - radeon_atom_pick_pll - atombios_crtc_set_disp_eng_pll Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 35222b5934ec8d762473592ece98659baf6bc48e) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: fix a Null pointer dereference vulnerabilitySiyang Liu1-9/+10
commit 1bcf63a44381691d6192872801f830ce3250e367 upstream. [Why] A null pointer dereference vulnerability exists in the AMD display driver's (DC module) cleanup function dc_destruct(). When display control context (dc->ctx) construction fails (due to memory allocation failure), this pointer remains NULL. During subsequent error handling when dc_destruct() is called, there's no NULL check before dereferencing the perf_trace member (dc->ctx->perf_trace), causing a kernel null pointer dereference crash. [How] Check if dc->ctx is non-NULL before dereferencing. Link: https://lore.kernel.org/r/tencent_54FF4252EDFB6533090A491A25EEF3EDBF06@qq.com Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> (Updated commit text and removed unnecessary error message) Signed-off-by: Siyang Liu <Security@tencent.com> Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9dd8e2ba268c636c240a918e0a31e6feaee19404) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: Add primary plane to commits for correct VRR handlingMichel Dänzer1-0/+9
commit 3477c1b0972dc1c8a46f78e8fb1fa6966095b5ec upstream. amdgpu_dm_commit_planes calls update_freesync_state_on_stream only for the primary plane. If a commit affects a CRTC but not its primary plane, it would previously not trigger a refresh cycle or affect LFC, violating current UAPI semantics. Fixes e.g. atomic commits affecting only the cursor plane being limited to the minimum refresh rate. Don't do this for the legacy cursor ioctls though, it would break the UAPI semantics for those. Suggested-by: Xaver Hugl <xaver.hugl@kde.org> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3034 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit cc7bfba95966251b254cb970c21627124da3b7f4) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdkfd: Fix checkpoint-restore on multi-xccDavid Yat Sin3-16/+67
commit f6c0f3d24478a0792e50a64c2eba9f34d65519f2 upstream. GPUs with multi-xcc have multiple MQDs per queue. This patch saves and restores all the MQDs within the partition. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a578f2a58c3ab38f0643b1b6e7534af860233cb1) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdkfd: Destroy KFD debugfs after destroy KFD wqAmber Lin1-1/+1
commit 2e58401a24e7b2d4ec619104e1a76590c1284a4c upstream. Since KFD proc content was moved to kernel debugfs, we can't destroy KFD debugfs before kfd_process_destroy_wq. Move kfd_process_destroy_wq prior to kfd_debugfs_fini to fix a kernel NULL pointer problem. It happens when /sys/kernel/debug/kfd was already destroyed in kfd_debugfs_fini but kfd_process_destroy_wq calls kfd_debugfs_remove_process. This line debugfs_remove_recursive(entry->proc_dentry); tries to remove /sys/kernel/debug/kfd/proc/<pid> while /sys/kernel/debug/kfd is already gone. It hangs the kernel by kernel NULL pointer. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Eric Huang <jinhuieric.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 0333052d90683d88531558dcfdbf2525cc37c233) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: Update supported modes for GC v9.5.0Lijo Lazar1-1/+4
commit 389d79a195a9f71a103b39097ee8341a7ca60927 upstream. For GC v9.5.0 SOCs, both CPX and QPX compute modes are also supported in NPS2 mode. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9d1ac25c7f830e0132aa816393b1e9f140e71148) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: update mmhub 4.1.0 client id mappingsAlex Deucher1-21/+13
commit a0b34e4c8663b13e45c78267b4de3004b1a72490 upstream. Update the client id mapping so the correct clients get printed when there is a mmhub page fault. Tested-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: update mmhub 3.3 client id mappingsAlex Deucher1-1/+104
commit 9f9bddfa31d87b084700a6e9eca1a8b4f8ddcdf6 upstream. Update the client id mapping so the correct clients get printed when there is a mmhub page fault. v2: fix typos spotted by David Wu. v3: fix additional typo spotted by David. Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e932f4779a2d329841bb9ca70bb80a4bb2d707b6) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: update mmhub 3.0.1 client id mappingsAlex Deucher1-25/+32
commit 0bae62cc989fa99ac9cb564eb573aad916d1eb61 upstream. Update the client id mapping so the correct clients get printed when there is a mmhub page fault. Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2a2681eda73b99a2c1ee8cdb006099ea5d0c2505) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: Update external revid for GC v9.5.0Lijo Lazar1-0/+2
commit 05c8b690511854ba31d8d1bff7139a13ec66b9e7 upstream. Use different external revid for GC v9.5.0 SOCs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 21c6764ed4bfaecad034bc4fd15dd64c5a436325) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: track whether a queue is a kernel queue in amdgpu_mqd_propAlex Deucher2-0/+2
commit 284d4dfe850e665f0e7d4dfaf4d3d3da76d11fb0 upstream. Used to to set the MQD appropriately for each queue type. Kernel queues have additional privileges. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.16.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: Retain job->vm in amdgpu_job_prepare_jobYuanShang1-7/+0
commit c00d8b79fd2167c6ac65e096619535acdf8678d5 upstream. The field job->vm is used in function amdgpu_job_run to get the page table re-generation counter and decide whether the job should be skipped. Specifically, function amdgpu_vm_generation checks if the VM is valid for this job to use. For instance, if a gfx job depends on a cancelled sdma job from entity vm->delayed, then the gfx job should be skipped. Fixes: 26c95e838e63 ("drm/amdgpu: set the VM pointer to NULL in amdgpu_job_prepare") Signed-off-by: YuanShang <YuanShang.Mao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ed76936c6b10b547c6df4ca75412331e9ef6d339) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: Initialize data to NULL in imu_v12_0_program_rlc_ram()Nathan Chancellor1-1/+1
commit c90f2e1172c51fa25492471dc9910e2d7c1444b9 upstream. After a recent change in clang to expose uninitialized warnings from const variables and pointers [1], there is a warning in imu_v12_0_program_rlc_ram() because data is passed uninitialized to program_imu_rlc_ram(): drivers/gpu/drm/amd/amdgpu/imu_v12_0.c:374:30: error: variable 'data' is uninitialized when used here [-Werror,-Wuninitialized] 374 | program_imu_rlc_ram(adev, data, (const u32)size); | ^~~~ As this warning happens early in clang's frontend, it does not realize that due to the assignment of r to -EINVAL, program_imu_rlc_ram() is never actually called, and even if it were, data would not be dereferenced because size is 0. Just initialize data to NULL to silence the warning, as the commit that added program_imu_rlc_ram() mentioned it would eventually be used over the old method, at which point data can be properly initialized and used. Cc: stable@vger.kernel.org Closes: https://github.com/ClangBuiltLinux/linux/issues/2107 Fixes: 56159fffaab5 ("drm/amdgpu: use new method to program rlc ram") Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cdee01472c7 [1] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: check if hubbub is NULL in debugfs/amdgpu_dm_capabilitiesPeter Shkenev1-1/+1
commit b4a69f7f29c8a459ad6b4d8a8b72450f1d9fd288 upstream. HUBBUB structure is not initialized on DCE hardware, so check if it is NULL to avoid null dereference while accessing amdgpu_dm_capabilities file in debugfs. Signed-off-by: Peter Shkenev <mustela@erminea.space> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: Avoid extra evict-restore process.Gang Ba1-4/+2
commit 1f02f2044bda1db1fd995bc35961ab075fa7b5a2 upstream. If vm belongs to another process, this is fclose after fork, wait may enable signaling KFD eviction fence and cause parent process queue evicted. [677852.634569] amdkfd_fence_enable_signaling+0x56/0x70 [amdgpu] [677852.634814] __dma_fence_enable_signaling+0x3e/0xe0 [677852.634820] dma_fence_wait_timeout+0x3a/0x140 [677852.634825] amddma_resv_wait_timeout+0x7f/0xf0 [amdkcl] [677852.634831] amdgpu_vm_wait_idle+0x2d/0x60 [amdgpu] [677852.635026] amdgpu_flush+0x34/0x50 [amdgpu] [677852.635208] filp_flush+0x38/0x90 [677852.635213] filp_close+0x14/0x30 [677852.635216] do_close_on_exec+0xdd/0x130 [677852.635221] begin_new_exec+0x1da/0x490 [677852.635225] load_elf_binary+0x307/0xea0 [677852.635231] ? srso_alias_return_thunk+0x5/0xfbef5 [677852.635235] ? ima_bprm_check+0xa2/0xd0 [677852.635240] search_binary_handler+0xda/0x260 [677852.635245] exec_binprm+0x58/0x1a0 [677852.635249] bprm_execve.part.0+0x16f/0x210 [677852.635254] bprm_execve+0x45/0x80 [677852.635257] do_execveat_common.isra.0+0x190/0x200 Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Gang Ba <Gang.Ba@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: add missing vram lost check for LEGACY RESETAlex Deucher1-0/+1
commit 81699fe81b0be287fb28b6210324db48e8458d9f upstream. Legacy resets reset the memory controllers so VRAM contents may be unreliable after reset. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit aae94897b6661a2a4b1de2d328090fc388b3e0af) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: add kicker fws loading for gfx12/smu14/psp14Frank Min5-10/+29
commit 0395cde08e1f7eee810b5799466e41635a21e599 upstream. 1. Add kicker firmwares loading for gfx12/smu14/psp14 2. Register additional MODULE_FIRMWARE entries for kicker fws - gc_12_0_1_rlc_kicker.bin - gc_12_0_1_imu_kicker.bin - psp_14_0_3_sos_kicker.bin - psp_14_0_3_ta_kicker.bin - smu_14_0_3_kicker.bin Signed-off-by: Frank Min <Frank.Min@amd.com> Reviewed-by: Gui Chengming <Jack.Gui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd: Restore cached power limit during resumeMario Limonciello1-0/+6
commit ed4efe426a49729952b3dc05d20e33b94409bdd1 upstream. The power limit will be cached in smu->current_power_limit but if the ASIC goes into S3 this value won't be restored. Restore the value during SMU resume. Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250725031222.3015095-2-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 26a609e053a6fc494403e95403bc6a2470383bec) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu/discovery: fix fw based ip discoveryAlex Deucher2-36/+41
commit 514678da56da089b756b4d433efd964fa22b2079 upstream. We only need the fw based discovery table for sysfs. No need to parse it. Additionally parsing some of the board specific tables may result in incorrect data on some boards. just load the binary and don't parse it on those boards. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4441 Fixes: 80a0e8282933 ("drm/amdgpu/discovery: optionally use fw based ip discovery") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 62eedd150fa11aefc2d377fc746633fdb1baeb55) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/amdgpu: fix missing lock for cper.ring->rptr/wptr accessYang Wang1-2/+4
commit 8e0d1edb5c16732b695eaf4bd7096b1569817cf0 upstream. Add lock protection for 'ring->wptr'/'ring->rptr' to ensure the correct execution. Fixes: 8652920d2c00 ("drm/amdgpu: add mutex lock for cper ring") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/xe: Defer buffer object shrinker write-backs and GPU waitsThomas Hellström1-4/+47
commit 2dd7a47669ae6c1da18c55f8e89c4a44418c7006 upstream. When the xe buffer-object shrinker allows GPU waits and write-back, (typically from kswapd), perform multiple passes, skipping subsequent passes if the shrinker number of scanned objects target is reached. 1) Without GPU waits and write-back 2) Without write-back 3) With both GPU-waits and write-back This is to avoid stalls and costly write- and readbacks unless they are really necessary. v2: - Don't test for scan completion twice. (Stuart Summers) - Update tags. Reported-by: melvyn <melvyn2@dnsense.pub> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5557 Cc: Summers Stuart <stuart.summers@intel.com> Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos") Cc: <stable@vger.kernel.org> # v6.15+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250805074842.11359-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 80944d334182ce5eb27d00e2bf20a88bfc32dea1) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28Mark xe driver as BROKEN if kernel page size is not 4kBSimon Richter1-0/+1
commit 022906afdf90327bce33d52fb4fb41b6c7d618fb upstream. This driver, for the time being, assumes that the kernel page size is 4kB, so it fails on loong64 and aarch64 with 16kB pages, and ppc64el with 64kB pages. Signed-off-by: Simon Richter <Simon.Richter@hogyros.de> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250802024152.3021-1-Simon.Richter@hogyros.de (cherry picked from commit 0521a868222ffe636bf202b6e9d29292c1e19c62) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28amdgpu/amdgpu_discovery: increase timeout limit for IFWI initXaver Hugl1-2/+2
commit 928587381b54b1b6c62736486b1dc6cb16c568c2 upstream. With a timeout of only 1 second, my rx 5700XT fails to initialize, so this increases the timeout to 2s. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3697 Signed-off-by: Xaver Hugl <xaver.hugl@kde.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9ed3d7bdf2dcdf1a1196630fab89a124526e9cc2) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-20drm/amd/display: Allow DCN301 to clear update flagsIvan Lipski1-1/+2
commit 2d418e4fd9f1eca7dfce80de86dd702d36a06a25 upstream. [Why & How] Not letting DCN301 to clear after surface/stream update results in artifacts when switching between active overlay planes. The issue is known and has been solved initially. See below: (https://gitlab.freedesktop.org/drm/amd/-/issues/3441) Fixes: f354556e29f4 ("drm/amd/display: limit clear_update_flags t dcn32 and above") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-20drm/amdgpu: fix incorrect vm flags to map boJack Xiao1-2/+2
[ Upstream commit 040bc6d0e0e9c814c9c663f6f1544ebaff6824a8 ] It should use vm flags instead of pte flags to specify bo vm attributes. Fixes: 7946340fa389 ("drm/amdgpu: Move csa related code to separate file") Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b08425fa77ad2f305fe57a33dceb456be03b653f) Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/amdgpu: fix vram reservation issueYiPeng Chai1-2/+1
[ Upstream commit 10ef476aad1c848449934e7bec2ab2374333c7b6 ] The vram block allocation flag must be cleared before making vram reservation, otherwise reserving addresses within the currently freed memory range will always fail. Fixes: c9cad937c0c5 ("drm/amdgpu: add drm buddy support to amdgpu") Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d38eaf27de1b8584f42d6fb3f717b7ec44b3a7a1) Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/xe/hwmon: Add SW clamp for power limits writesKarthik Poosa1-0/+29
[ Upstream commit 55d49f06162e45686399df4ae6292167f0deab7c ] Clamp writes to power limits powerX_crit/currX_crit, powerX_cap, powerX_max, to the maximum supported by the pcode mailbox when sysfs-provided values exceed this limit. Although the pcode already performs clamping, values beyond the pcode mailbox's supported range get truncated, leading to incorrect critical power settings. This patch ensures proper clamping to prevent such truncation. v2: - Address below review comments. (Riana) - Split comments into multiple sentences. - Use local variables for readability. - Add a debug log. - Use u64 instead of unsigned long. v3: - Change drm_dbg logs to drm_info. (Badal) v4: - Rephrase the drm_info log. (Rodrigo, Riana) - Rename variable max_mbx_power_limit to max_supp_power_limit, as limit is same for platforms with and without mailbox power limit support. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: 92d44a422d0d ("drm/xe/hwmon: Expose card reactive critical power") Fixes: fb1b70607f73 ("drm/xe/hwmon: Expose power attributes") Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://lore.kernel.org/r/20250808185310.3466529-1-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit d301eb950da59f962bafe874cf5eb6d61a85b2c2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/xe/migrate: prevent potential UAFMatthew Auld1-4/+5
[ Upstream commit 145832fbdd17b1d77ffd6cdd1642259e101d1b7e ] If we hit the error path, the previous fence (if there is one) has already been put() prior to this, so doing a fence_wait could lead to UAF. Tweak the flow to do to the put() until after we do the wait. Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250731093807.207572-8-matthew.auld@intel.com (cherry picked from commit 9b7ca35ed28fe5fad86e9d9c24ebd1271e4c9c3e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/xe/migrate: don't overflow max copy sizeMatthew Auld1-0/+6
[ Upstream commit 4126cb327a2e3273c81fcef1c594c5b7b645c44c ] With non-page aligned copy, we need to use 4 byte aligned pitch, however the size itself might still be close to our maximum of ~8M, and so the dimensions of the copy can easily exceed the S16_MAX limit of the copy command leading to the following assert: xe 0000:03:00.0: [drm] Assertion `size / pitch <= ((s16)(((u16)~0U) >> 1))` failed! platform: BATTLEMAGE subplatform: 1 graphics: Xe2_HPG 20.01 step A0 media: Xe2_HPM 13.01 step A1 tile: 0 VRAM 10.0 GiB GT: 0 type 1 WARNING: CPU: 23 PID: 10605 at drivers/gpu/drm/xe/xe_migrate.c:673 emit_copy+0x4b5/0x4e0 [xe] To fix this account for the pitch when calculating the number of current bytes to copy. Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250731093807.207572-7-matthew.auld@intel.com (cherry picked from commit 8c2d61e0e916e077fda7e7b8e67f25ffe0f361fc) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/xe/migrate: prevent infinite recursionMatthew Auld1-12/+17
[ Upstream commit 9d7a1cbebbb691891671def57407ba2f8ee914e8 ] If the buf + offset is not aligned to XE_CAHELINE_BYTES we fallback to using a bounce buffer. However the bounce buffer here is allocated on the stack, and the only alignment requirement here is that it's naturally aligned to u8, and not XE_CACHELINE_BYTES. If the bounce buffer is also misaligned we then recurse back into the function again, however the new bounce buffer might also not be aligned, and might never be until we eventually blow through the stack, as we keep recursing. Instead of using the stack use kmalloc, which should respect the power-of-two alignment request here. Fixes a kernel panic when triggering this path through eudebug. v2 (Stuart): - Add build bug check for power-of-two restriction - s/EINVAL/ENOMEM/ Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250731093807.207572-6-matthew.auld@intel.com (cherry picked from commit 38b34e928a08ba594c4bbf7118aa3aadacd62fff) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/i915/psr: Do not trigger Frame Change events from frontbuffer flushJouni Högander1-5/+9
[ Upstream commit 184889dfe0568528fd6d14bba864dd57ed45bbf2 ] We want to get rid of triggering "Frame Change" events from frontbuffer flush calls. We are about to move using TRANS_PUSH register for this on LunarLake and onwards. Touching TRANS_PUSH register from fronbuffer flush would be problematic as it's written by DSB as well. Fix this by using intel_psr_exit when flush or invalidate is done on LunarLake and onwards. This is not possible on AlderLake and MeteorLake due to HW bug in PSR2 disable. This patch is also fixing problems with cursor plane where cursor is disappearing or duplicate cursor is seen on the screen. v2: Commit message updated Bspec: 68927, 68934, 66624 Reported-by: Janna Martl <janna.martl109@gmail.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5522 Fixes: 411ad63877bb ("drm/i915/psr: Use SFF_CTL on invalidate/flush for LunarLake onwards") Tested-by: Janna Martl <janna.martl109@gmail.com> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/20250801062905.564453-1-jouni.hogander@intel.com (cherry picked from commit 46fb38cb20c0d185a6391ab524b23e0e0219c41f) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/i915/fbc: fix the implementation of wa_18038517565Vinod Govindapillai1-4/+4
[ Upstream commit fd56b9c9507f32b16159f9a922e1af5628254567 ] As per the wa_18038517565, we need to disable FBC compressor clock gating before enabling FBC and enable after disabling FBC. Placing the enabling of clock gating in the fbc deactivate function can make the above wa logic go wrong in case of frontbuffer rendering FBC mechanism. FBC deactivate can get called during fb invalidate and then the corresponding FBC activate can get called without properly disabling the clock gating and can result in compression stalled. So move the enable clock gating at the end of one FBC session after FBC is completely disabled for a pipe. Bspec: 74212, 72197, 69741, 65555 Fixes: 010363c46189 ("drm/i915/display: implement wa_18038517565") Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Link: https://lore.kernel.org/r/20250729124648.288497-1-vinod.govindapillai@intel.com (cherry picked from commit 82dde0407ab126f8413fd6c51429e5057ced5ba2) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/amd/display: Disable dsc_power_gate for dcn314 by defaultRoman Li1-0/+1
[ Upstream commit 02f3ec53177243d32ee8b6f8ba99136d7887ee3a ] [Why] "REG_WAIT timeout 1us * 1000 tries - dcn314_dsc_pg_control line" warnings seen after resuming from s2idle. DCN314 has issues with DSC power gating that cause REG_WAIT timeouts when attempting to power down DSC blocks. [How] Disable dsc_power_gate for dcn314 by default. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/amd/display: Avoid configuring PSR granularity if PSR-SU not supportedMario Limonciello1-2/+4
[ Upstream commit a5ce8695d6d1b40d6960d2d298b579042c158f25 ] [Why] If PSR-SU is disabled on the link, then configuring su_y granularity in mod_power_calc_psr_configs() can lead to assertions in psr_su_set_dsc_slice_height(). [How] Check the PSR version in amdgpu_dm_link_setup_psr() to determine whether or not to configure granularity. Reviewed-by: Sun peng (Leo) Li <sunpeng.li@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/amd/display: Only finalize atomic_obj if it was initializedMario Limonciello1-1/+2
[ Upstream commit b174084b3fe15ad1acc69530e673c1535d2e4f85 ] [Why] If amdgpu_dm failed to initalize before amdgpu_dm_initialize_drm_device() completed then freeing atomic_obj will lead to list corruption. [How] Check if atomic_obj state is initialized before trying to free. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/ttm: Respect the shrinker core free targetTvrtko Ursulin1-3/+5
[ Upstream commit eac21f8ebeb4f84d703cf41dc3f81d16fa9dc00a ] Currently the TTM shrinker aborts shrinking as soon as it frees pages from any of the page order pools and by doing so it can fail to respect the freeing target which was configured by the shrinker core. We use the wording "can fail" because the number of freed pages will depend on the presence of pages in the pools and the order of the pools on the LRU list. For example if there are no free pages in the high order pools the shrinker core may require multiple passes over the TTM shrinker before it will free the default target of 128 pages (assuming there are free pages in the low order pools). This inefficiency can be compounded by the pool LRU where multiple further calls into the TTM shrinker are required to end up looking at the pool with pages. Improve this by never freeing less than the shrinker core has requested. At the same time we start reporting the number of scanned pages (freed in this case), which prevents the core shrinker from giving up on the TTM shrinker too soon and moving on. v2: * Simplify loop logic. (Christian) * Improve commit message. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Christian König <christian.koenig@amd.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://lore.kernel.org/r/20250603112750.34997-2-tvrtko.ursulin@igalia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/amd/display: Avoid trying AUX transactions on disconnected portsWayne Lin1-1/+2
[ Upstream commit deb24e64c8881c462b29e2c69afd9e6669058be5 ] [Why & How] Observe that we try to access DPCD 0x600h of disconnected DP ports. In order not to wasting time on retrying these ports, call dpcd_write_rx_power_ctrl() after checking its connection status. Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/amd/display: Update DMCUB loading sequence for DCN3.5Nicholas Kazlauskas1-13/+3
[ Upstream commit d42b2331e158fa6bcdc89e4c8c470dc5da20be1f ] [Why] New sequence from HW for reset and firmware reloading has been provided that aims to stabilize the reload sequence in the case the firmware is hung or has outstanding requests. [How] Update the sequence to remove the DMUIF reset and the redundant writes in the release. Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20drm/ttm: Should to return the evict errorEmily Deng1-0/+3
[ Upstream commit 4e16a9a00239db5d819197b9a00f70665951bf50 ] For the evict fail case, the evict error should be returned. v2: Consider ENOENT case. v3: Abort directly when the eviction failed for some reason (except for -ENOENT) and not wait for the move to finish Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250603091154.3472646-1-Emily.Deng@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>