summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)AuthorFilesLines
2 daysdrm/i915/power: fix size for for_each_set_bit() in abox iterationJani Nikula1-3/+3
[ Upstream commit cfa7b7659757f8d0fc4914429efa90d0d2577dd7 ] for_each_set_bit() expects size to be in bits, not bytes. The abox mask iteration uses bytes, but it works by coincidence, because the local variable holding the mask is unsigned long, and the mask only ever has bit 2 as the highest bit. Using a smaller type could lead to subtle and very hard to track bugs. Fixes: 62afef2811e4 ("drm/i915/rkl: RKL uses ABOX0 for pixel transfers") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: stable@vger.kernel.org # v5.9+ Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250905104149.1144751-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 7ea3baa6efe4bb93d11e1c0e6528b1468d7debf6) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> [ adapted struct intel_display *display parameters to struct drm_i915_private *dev_priv ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 daysdrm/amdgpu: fix a memory leak in fence cleanup when unloadingAlex Deucher1-3/+0
[ Upstream commit 7838fb5f119191403560eca2e23613380c0e425e ] Commit b61badd20b44 ("drm/amdgpu: fix usage slab after free") reordered when amdgpu_fence_driver_sw_fini() was called after that patch, amdgpu_fence_driver_sw_fini() effectively became a no-op as the sched entities we never freed because the ring pointers were already set to NULL. Remove the NULL setting. Reported-by: Lin.Cao <lincao12@amd.com> Cc: Vitaly Prosyak <vitaly.prosyak@amd.com> Cc: Christian König <christian.koenig@amd.com> Fixes: b61badd20b44 ("drm/amdgpu: fix usage slab after free") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a525fa37aac36c4591cc8b07ae8957862415fbd5) Cc: stable@vger.kernel.org [ Adapt to conditional check ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
12 daysdrm/amd/amdgpu: Fix missing error return on kzalloc failureColin Ian King1-1/+1
[ Upstream commit 467e00b30dfe75c4cfc2197ceef1fddca06adc25 ] Currently the kzalloc failure check just sets reports the failure and sets the variable ret to -ENOMEM, which is not checked later for this specific error. Fix this by just returning -ENOMEM rather than setting ret. Fixes: 4fb930715468 ("drm/amd/amdgpu: remove redundant host to psp cmd buf allocations") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1ee9d1a0962c13ba5ab7e47d33a80e3b8dc4b52e) Signed-off-by: Sasha Levin <sashal@kernel.org>
12 daysdrm/amdgpu: Replace DRM_* with dev_* in amdgpu_psp.cHawking Zhang1-69/+75
[ Upstream commit ac3ff8a90637e813005404a0110802aa384af4aa ] So kernel message has the device pcie bdf information, which helps issue debugging especially in multiple GPU system. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: 467e00b30dfe ("drm/amd/amdgpu: Fix missing error return on kzalloc failure") Signed-off-by: Sasha Levin <sashal@kernel.org>
12 daysdrm/amd: Make flashing messages quieterMario Limonciello1-4/+4
[ Upstream commit 1cc506f08b4c4688c729e48d7c665910ed330724 ] Debug messages related to the kernel process of flashing an updated IFWI are needlessly noisy and also confusing. Downgrade them to debug instead and clarify what they are actually doing. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: 467e00b30dfe ("drm/amd/amdgpu: Fix missing error return on kzalloc failure") Signed-off-by: Sasha Levin <sashal@kernel.org>
12 daysdrm/amdgpu: Skip TMR allocation if not requiredLijo Lazar1-8/+26
[ Upstream commit 5b03127d4745d6848f208463390e6a76d489eb03 ] On ASICs with PSPv13.0.6, TMR is reserved at boot time. There is no need to allocate TMR region by driver. However, it's still required to send SETUP_TMR command to PSP. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: 467e00b30dfe ("drm/amd/amdgpu: Fix missing error return on kzalloc failure") Signed-off-by: Sasha Levin <sashal@kernel.org>
12 daysdrm/amd/amdgpu: Fix style problems in amdgpu_psp.cSrinivasan Shanmugam1-31/+20
[ Upstream commit f14c8c3e1fc9e10c6d54999a96acb2b5087374df ] Fix the following checkpatch warnings & error in amdgpu_psp.c WARNING: Comparisons should place the constant on the right side of the test WARNING: braces {} are not necessary for single statement blocks WARNING: please, no space before tabs WARNING: braces {} are not necessary for single statement blocks ERROR: that open brace { should be on the previous line Suggested-by: Christian König <christian.koenig@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: 467e00b30dfe ("drm/amd/amdgpu: Fix missing error return on kzalloc failure") Signed-off-by: Sasha Levin <sashal@kernel.org>
12 daysdrm/amdgpu: remove the check of init status in psp_ras_initializeTao Zhou1-5/+3
[ Upstream commit 3e931368091f7d5d7902cee9d410eb6db2eea419 ] The initialized status indicates RAS TA is loaded, but in some cases (such as RAS fatal error) RAS TA could be destroyed although it's not unloaded. Hence we load RAS TA unconditionally here. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: 467e00b30dfe ("drm/amd/amdgpu: Fix missing error return on kzalloc failure") Signed-off-by: Sasha Levin <sashal@kernel.org>
12 daysdrm/amdgpu: Optimize RAS TA initialization and TA unload funcsCandice Li1-2/+8
[ Upstream commit bf7d777289d106963fd2080d298e6b88b7263b66 ] 1. Save TA unload psp response status 2. Add RAS TA loading status check for initializaiton 3. Drop RAS context teardown to allow RAS TA to be reloaded Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: 467e00b30dfe ("drm/amd/amdgpu: Fix missing error return on kzalloc failure") Signed-off-by: Sasha Levin <sashal@kernel.org>
12 daysdrm/bridge: ti-sn65dsi86: fix REFCLK settingMichael Walle1-0/+11
[ Upstream commit bdd5a14e660062114bdebaef9ad52adf04970a89 ] The bridge has three bootstrap pins which are sampled to determine the frequency of the external reference clock. The driver will also (over)write that setting. But it seems this is racy after the bridge is enabled. It was observed that although the driver write the correct value (by sniffing on the I2C bus), the register has the wrong value. The datasheet states that the GPIO lines have to be stable for at least 5us after asserting the EN signal. Thus, there seems to be some logic which samples the GPIO lines and this logic appears to overwrite the register value which was set by the driver. Waiting 20us after asserting the EN line resolves this issue. Fixes: a095f15c00e2 ("drm/bridge: add support for sn65dsi86 bridge driver") Signed-off-by: Michael Walle <mwalle@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20250821122341.1257286-1-mwalle@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
12 daysRevert "drm/amdgpu: Avoid extra evict-restore process."Alex Deucher1-2/+4
This reverts commit 71598a5a7797f0052aaa7bcff0b8d4b8f20f1441 which is commit 1f02f2044bda1db1fd995bc35961ab075fa7b5a2 upstream. This commit introduced a regression, however the fix for the regression: aa5fc4362fac ("drm/amdgpu: fix task hang from failed job submission during process kill") depends on things not yet present in 6.12.y and older kernels. Since this commit is more of an optimization, just revert it for 6.12.y and older stable kernels. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x - 6.12.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
12 daysdrm/amd/display: Check link_res->hpo_dp_link_enc before using itAlex Hung1-0/+7
commit 0beca868cde8742240cd0038141c30482d2b7eb8 upstream. [WHAT & HOW] Functions dp_enable_link_phy and dp_disable_link_phy can pass link_res without initializing hpo_dp_link_enc and it is necessary to check for null before dereferencing. This fixes 2 FORWARD_NULL issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> [ Minor context change fixed. ] Signed-off-by: Alva Lan <alvalan9@foxmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
12 daysdrm/amdgpu: drop hw access in non-DC audio finiAlex Deucher4-20/+0
commit 71403f58b4bb6c13b71c05505593a355f697fd94 upstream. We already disable the audio pins in hw_fini so there is no need to do it again in sw_fini. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4481 Cc: oushixiong <oushixiong1025@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5eeb16ca727f11278b2917fd4311a7d7efb0bbd6) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
12 daysdrm/amd/display: Don't warn when missing DCE encoder capsTimur Kristóf1-4/+4
[ Upstream commit 8246147f1fbaed522b8bcc02ca34e4260747dcfb ] On some GPUs the VBIOS just doesn't have encoder caps, or maybe not for every encoder. This isn't really a problem and it's handled well, so let's not litter the logs with it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 33e0227ee96e62d034781e91f215e32fd0b1d512) Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-09-04Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"Imre Deak1-1/+1
This reverts commit 4f546a77671076a0a49c08b4c6a7d0888d72f7d2 which is commit a40c5d727b8111b5db424a1e43e14a1dcce1e77f upstream. The upstream commit a40c5d727b8111b5db424a1e43e14a1dcce1e77f ("drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS") the reverted commit backported causes a regression, on one eDP panel at least resulting in display flickering, described in detail at the Link: below. The issue fixed by the upstream commit will need a different solution, revert the backport for now. Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: Sasha Levin <sashal@kernel.org> Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14558 Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-04drm/nouveau/disp: Always accept linear modifierJames Jones1-0/+4
commit e2fe0c54fb7401e6ecd3c10348519ab9e23bd639 upstream. On some chipsets, which block-linear modifiers are supported is format-specific. However, linear modifiers are always be supported. The prior modifier filtering logic was not accounting for the linear case. Cc: stable@vger.kernel.org Fixes: c586f30bf74c ("drm/nouveau/kms: Add format mod prop to base/ovly/nvdisp") Signed-off-by: James Jones <jajones@nvidia.com> Link: https://lore.kernel.org/r/20250811220017.1337-3-jajones@nvidia.com Signed-off-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-04Revert "drm/amdgpu: fix incorrect vm flags to map bo"Alex Deucher1-2/+2
commit ac4ed2da4c1305a1a002415058aa7deaf49ffe3e upstream. This reverts commit b08425fa77ad2f305fe57a33dceb456be03b653f. Revert this to align with 6.17 because the fixes tag was wrong on this commit. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit be33e8a239aac204d7e9e673c4220ef244eb1ba3) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-04drm/msm: Defer fd_install in SUBMIT ioctlRob Clark1-7/+7
[ Upstream commit f22853435bbd1e9836d0dce7fd99c040b94c2bf1 ] Avoid fd_install() until there are no more potential error paths, to avoid put_unused_fd() after the fd is made visible to userspace. Fixes: 68dc6c2d5eec ("drm/msm: Fix submit error-path leaks") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/665363/ Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28drm/amd/display: Add null pointer check in mod_hdcp_hdcp1_create_session()Chenyuan Yang1-0/+3
[ Upstream commit 7a2ca2ea64b1b63c8baa94a8f5deb70b2248d119 ] The function mod_hdcp_hdcp1_create_session() calls the function get_first_active_display(), but does not check its return value. The return value is a null pointer if the display list is empty. This will lead to a null pointer dereference. Add a null pointer check for get_first_active_display() and return MOD_HDCP_STATUS_DISPLAY_NOT_FOUND if the function return null. This is similar to the commit c3e9826a2202 ("drm/amd/display: Add null pointer check for get_first_active_display()"). Fixes: 2deade5ede56 ("drm/amd/display: Remove hdcp display state with mst fix") Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5e43eb3cd731649c4f8b9134f857be62a416c893) Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28drm/hisilicon/hibmc: fix the hibmc loaded failed bugBaihan Li1-2/+2
[ Upstream commit 93a08f856fcc5aaeeecad01f71bef3088588216a ] When hibmc loaded failed, the driver use hibmc_unload to free the resource, but the mutexes in mode.config are not init, which will access an NULL pointer. Just change goto statement to return, because hibnc_hw_init() doesn't need to free anything. Fixes: b3df5e65cc03 ("drm/hibmc: Drop drm_vblank_cleanup") Signed-off-by: Baihan Li <libaihan@huawei.com> Signed-off-by: Yongbang Shi <shiyongbang@huawei.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250813094238.3722345-5-shiyongbang@huawei.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28drm/amd/display: Don't overclock DCE 6 by 15%Timur Kristóf1-5/+3
[ Upstream commit cb7b7ae53b557d168b4af5cd8549f3eff920bfb5 ] The extra 15% clock was added as a workaround for a Polaris issue which uses DCE 11, and should not have been used on DCE 6 which is already hardcoded to the highest possible display clock. Unfortunately, the extra 15% was mistakenly copied and kept even on code paths which don't affect Polaris. This commit fixes that and also adds a check to make sure not to exceed the maximum DCE 6 display clock. Fixes: 8cd61c313d8b ("drm/amd/display: Raise dispclk value for Polaris") Fixes: dc88b4a684d2 ("drm/amd/display: make clk mgr soc specific") Fixes: 3ecb3b794e2c ("drm/amd/display: dc/clk_mgr: add support for SI parts (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 427980c1cbd22bb256b9385f5ce73c0937562408) Cc: stable@vger.kernel.org [ `MIN` => `min` ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUSImre Deak1-1/+1
[ Upstream commit a40c5d727b8111b5db424a1e43e14a1dcce1e77f ] Reading DPCD registers has side-effects in general. In particular accessing registers outside of the link training register range (0x102-0x106, 0x202-0x207, 0x200c-0x200f, 0x2216) is explicitly forbidden by the DP v2.1 Standard, see 3.6.5.1 DPTX AUX Transaction Handling Mandates 3.6.7.4 128b/132b DP Link Layer LTTPR Link Training Mandates Based on my tests, accessing the DPCD_REV register during the link training of an UHBR TBT DP tunnel sink leads to link training failures. Solve the above by using the DP_LANE0_1_STATUS (0x202) register for the DPCD register access quirk. Cc: <stable@vger.kernel.org> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://lore.kernel.org/r/20250605082850.65136-2-imre.deak@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: Fill display clock and vblank time in ↵Timur Kristóf3-11/+3
dce110_fill_display_configs commit 7d07140d37f792f01cfdb8ca9a6a792ab1d29126 upstream. Also needed by DCE 6. This way the code that gathers this info can be shared between different DCE versions and doesn't have to be repeated. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8107432dff37db26fcb641b6cebeae8981cd73a0) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: Find first CRTC and its line time in ↵Timur Kristóf1-10/+20
dce110_fill_display_configs commit 669f73a26f6112eedbadac53a2f2707ac6d0b9c8 upstream. dce110_fill_display_configs is shared between DCE 6-11, and finding the first CRTC and its line time is relevant to DCE 6 too. Move the code to find it from DCE 11 specific code. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4ab09785f8d5d03df052827af073d5c508ff5f63) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: Fix DP audio DTO1 clock source on DCE 6.Timur Kristóf1-15/+6
commit 297a4833a68aac3316eb808b4123eb016ef242d7 upstream. On DCE 6, DP audio was not working. However, it worked when an HDMI monitor was also plugged in. Looking at dce_aud_wall_dto_setup it seems that the main difference is that we use DTO1 when only DP is plugged in. When programming DTO1, it uses audio_dto_source_clock_in_khz which is set from get_dp_ref_freq_khz The dce60_get_dp_ref_freq_khz implementation looks incorrect, because DENTIST_DISPCLK_CNTL seems to be always zero on DCE 6, so it isn't usable. I compared dce60_get_dp_ref_freq_khz to the legacy display code, specifically dce_v6_0_audio_set_dto, and it turns out that in case of DCE 6, it needs to use the display clock. With that, DP audio started working on Pitcairn, Oland and Cape Verde. However, it still didn't work on Tahiti. Despite having the same DCE version, Tahiti seems to have a different audio device. After some trial and error I realized that it works with the default display clock as reported by the VBIOS, not the current display clock. The patch was tested on all four SI GPUs: * Pitcairn (DCE 6.0) * Oland (DCE 6.4) * Cape Verde (DCE 6.0) * Tahiti (DCE 6.0 but different) The testing was done on Samsung Odyssey G7 LS28BG700EPXEN on each of the above GPUs, at the following settings: * 4K 60 Hz * 1080p 60 Hz * 1080p 144 Hz Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 645cc7863da5de700547d236697dffd6760cf051) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: Fix fractional fb divider in set_pixel_clock_v3Timur Kristóf1-1/+1
commit 10507478468f165ea681605d133991ed05cdff62 upstream. For later VBIOS versions, the fractional feedback divider is calculated as the remainder of dividing the feedback divider by a factor, which is set to 1000000. For reference, see: - calculate_fb_and_fractional_fb_divider - calc_pll_max_vco_construct However, in case of old VBIOS versions that have set_pixel_clock_v3, they only have 1 byte available for the fractional feedback divider, and it's expected to be set to the remainder from dividing the feedback divider by 10. For reference see the legacy display code: - amdgpu_pll_compute - amdgpu_atombios_crtc_program_pll This commit fixes set_pixel_clock_v3 by dividing the fractional feedback divider passed to the function by 100000. Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 027e7acc7e17802ebf28e1edb88a404836ad50d6) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: Avoid a NULL pointer dereferenceMario Limonciello1-0/+3
commit 07b93a5704b0b72002f0c4bd1076214af67dc661 upstream. [WHY] Although unlikely drm_atomic_get_new_connector_state() or drm_atomic_get_old_connector_state() can return NULL. [HOW] Check returns before dereference. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1e5e8d672fec9f2ab352be121be971877bff2af9) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/sched: Remove optimization that causes hang when killing dependent jobsLin.Cao1-19/+2
[ Upstream commit 15f77764e90a713ee3916ca424757688e4f565b9 ] When application A submits jobs and application B submits a job with a dependency on A's fence, the normal flow wakes up the scheduler after processing each job. However, the optimization in drm_sched_entity_add_dependency_cb() uses a callback that only clears dependencies without waking up the scheduler. When application A is killed before its jobs can run, the callback gets triggered but only clears the dependency without waking up the scheduler, causing the scheduler to enter sleep state and application B to hang. Remove the optimization by deleting drm_sched_entity_clear_dep() and its usage, ensuring the scheduler is always woken up when dependencies are cleared. Fixes: 777dbd458c89 ("drm/amdgpu: drop a dummy wakeup scheduler") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250717084453.921097-1-lincao12@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd/display: Don't overwrite dce60_clk_mgrTimur Kristóf1-1/+0
commit 4db9cd554883e051df1840d4d58d636043101034 upstream. dc_clk_mgr_create accidentally overwrites the dce60_clk_mgr with the dce_clk_mgr, causing incorrect behaviour on DCE6. Fix it by removing the extra dce_clk_mgr_construct. Fixes: 62eab49faae7 ("drm/amd/display: hide VGH asic specific structs") Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bbddcbe36a686af03e91341b9bbfcca94bd45fb6) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdkfd: Destroy KFD debugfs after destroy KFD wqAmber Lin1-1/+1
commit 2e58401a24e7b2d4ec619104e1a76590c1284a4c upstream. Since KFD proc content was moved to kernel debugfs, we can't destroy KFD debugfs before kfd_process_destroy_wq. Move kfd_process_destroy_wq prior to kfd_debugfs_fini to fix a kernel NULL pointer problem. It happens when /sys/kernel/debug/kfd was already destroyed in kfd_debugfs_fini but kfd_process_destroy_wq calls kfd_debugfs_remove_process. This line debugfs_remove_recursive(entry->proc_dentry); tries to remove /sys/kernel/debug/kfd/proc/<pid> while /sys/kernel/debug/kfd is already gone. It hangs the kernel by kernel NULL pointer. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Eric Huang <jinhuieric.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 0333052d90683d88531558dcfdbf2525cc37c233) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: update mmhub 3.0.1 client id mappingsAlex Deucher1-25/+32
commit 0bae62cc989fa99ac9cb564eb573aad916d1eb61 upstream. Update the client id mapping so the correct clients get printed when there is a mmhub page fault. Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2a2681eda73b99a2c1ee8cdb006099ea5d0c2505) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: Avoid extra evict-restore process.Gang Ba1-4/+2
commit 1f02f2044bda1db1fd995bc35961ab075fa7b5a2 upstream. If vm belongs to another process, this is fclose after fork, wait may enable signaling KFD eviction fence and cause parent process queue evicted. [677852.634569] amdkfd_fence_enable_signaling+0x56/0x70 [amdgpu] [677852.634814] __dma_fence_enable_signaling+0x3e/0xe0 [677852.634820] dma_fence_wait_timeout+0x3a/0x140 [677852.634825] amddma_resv_wait_timeout+0x7f/0xf0 [amdkcl] [677852.634831] amdgpu_vm_wait_idle+0x2d/0x60 [amdgpu] [677852.635026] amdgpu_flush+0x34/0x50 [amdgpu] [677852.635208] filp_flush+0x38/0x90 [677852.635213] filp_close+0x14/0x30 [677852.635216] do_close_on_exec+0xdd/0x130 [677852.635221] begin_new_exec+0x1da/0x490 [677852.635225] load_elf_binary+0x307/0xea0 [677852.635231] ? srso_alias_return_thunk+0x5/0xfbef5 [677852.635235] ? ima_bprm_check+0xa2/0xd0 [677852.635240] search_binary_handler+0xda/0x260 [677852.635245] exec_binprm+0x58/0x1a0 [677852.635249] bprm_execve.part.0+0x16f/0x210 [677852.635254] bprm_execve+0x45/0x80 [677852.635257] do_execveat_common.isra.0+0x190/0x200 Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Gang Ba <Gang.Ba@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amd: Restore cached power limit during resumeMario Limonciello1-0/+6
commit ed4efe426a49729952b3dc05d20e33b94409bdd1 upstream. The power limit will be cached in smu->current_power_limit but if the ASIC goes into S3 this value won't be restored. Restore the value during SMU resume. Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250725031222.3015095-2-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 26a609e053a6fc494403e95403bc6a2470383bec) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/amdgpu: fix incorrect vm flags to map boJack Xiao1-2/+2
[ Upstream commit 040bc6d0e0e9c814c9c663f6f1544ebaff6824a8 ] It should use vm flags instead of pte flags to specify bo vm attributes. Fixes: 7946340fa389 ("drm/amdgpu: Move csa related code to separate file") Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b08425fa77ad2f305fe57a33dceb456be03b653f) Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28drm/amd/display: Only finalize atomic_obj if it was initializedMario Limonciello1-1/+2
[ Upstream commit b174084b3fe15ad1acc69530e673c1535d2e4f85 ] [Why] If amdgpu_dm failed to initalize before amdgpu_dm_initialize_drm_device() completed then freeing atomic_obj will lead to list corruption. [How] Check if atomic_obj state is initialized before trying to free. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28drm/ttm: Respect the shrinker core free targetTvrtko Ursulin1-3/+5
[ Upstream commit eac21f8ebeb4f84d703cf41dc3f81d16fa9dc00a ] Currently the TTM shrinker aborts shrinking as soon as it frees pages from any of the page order pools and by doing so it can fail to respect the freeing target which was configured by the shrinker core. We use the wording "can fail" because the number of freed pages will depend on the presence of pages in the pools and the order of the pools on the LRU list. For example if there are no free pages in the high order pools the shrinker core may require multiple passes over the TTM shrinker before it will free the default target of 128 pages (assuming there are free pages in the low order pools). This inefficiency can be compounded by the pool LRU where multiple further calls into the TTM shrinker are required to end up looking at the pool with pages. Improve this by never freeing less than the shrinker core has requested. At the same time we start reporting the number of scanned pages (freed in this case), which prevents the core shrinker from giving up on the TTM shrinker too soon and moving on. v2: * Simplify loop logic. (Christian) * Improve commit message. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Christian König <christian.koenig@amd.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://lore.kernel.org/r/20250603112750.34997-2-tvrtko.ursulin@igalia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28drm/ttm: Should to return the evict errorEmily Deng1-0/+3
[ Upstream commit 4e16a9a00239db5d819197b9a00f70665951bf50 ] For the evict fail case, the evict error should be returned. v2: Consider ENOENT case. v3: Abort directly when the eviction failed for some reason (except for -ENOENT) and not wait for the move to finish Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250603091154.3472646-1-Emily.Deng@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28drm/amd: Allow printing VanGogh OD SCLK levels without setting dpm to manualMario Limonciello1-22/+15
[ Upstream commit 2d1ec1e955414e8e8358178011c35afca1a1c0b1 ] Several other ASICs allow printing OD SCLK levels without setting DPM control to manual. When OD is disabled it will show the range the hardware supports. When OD is enabled it will show what values have been programmed. Adjust VanGogh to work the same. Cc: Pierre-Loup A. Griffais <pgriffais@valvesoftware.com> Reported-by: Vicki Pfau <vi@endrift.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250609031227.479079-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28drm/amd/display: Fix 'failed to blank crtc!'Wen Chen1-1/+1
[ Upstream commit 01f60348d8fb6b3fbcdfc7bdde5d669f95b009a4 ] [why] DCN35 is having “DC: failed to blank crtc!” when running HPO test cases. It's caused by not having sufficient udelay time. [how] Replace the old wait_for_blank_complete function with fsleep function to sleep just until the next frame should come up. This way it doesn't poll in case the pixel clock or other clock was bugged or until vactive and the vblank are hit again. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Wen Chen <Wen.Chen3@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28drm/amd/display: Separate set_gsl from set_gsl_source_selectIlya Bakoulin1-5/+4
[ Upstream commit 660a467a5e7366cd6642de61f1aaeaf0d253ee68 ] [Why/How] Separate the checks for set_gsl and set_gsl_source_select, since source_select may not be implemented/necessary. Reviewed-by: Nevenko Stupar <nevenko.stupar@amd.com> Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28drm/msm: use trylock for debugfsRob Clark2-1/+8
[ Upstream commit 0a1ff88ec5b60b41ba830c5bf08b6cd8f45ab411 ] This resolves a potential deadlock vs msm_gem_vm_close(). Otherwise for _NO_SHARE buffers msm_gem_describe() could be trying to acquire the shared vm resv, while already holding priv->obj_lock. But _vm_close() might drop the last reference to a GEM obj while already holding the vm resv, and msm_gem_free_object() needs to grab priv->obj_lock, a locking inversion. OTOH this is only for debugfs and it isn't critical if we undercount by skipping a locked obj. So just use trylock() and move along if we can't get the lock. Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Tested-by: Antonino Maniscalco <antomani103@gmail.com> Reviewed-by: Antonino Maniscalco <antomani103@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/661525/ Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15drm/amd/pm/powerplay/hwmgr/smu_helper: fix order of mask and valueFedor Pchelkin1-1/+1
[ Upstream commit a54e4639c4ef37a0241bac7d2a77f2e6ffb57099 ] There is a small typo in phm_wait_on_indirect_register(). Swap mask and value arguments provided to phm_wait_on_register() so that they satisfy the function signature and actual usage scheme. Found by Linux Verification Center (linuxtesting.org) with Svace static analysis tool. In practice this doesn't fix any issues because the only place this function is used uses the same value for the value and mask. Fixes: 3bace3591493 ("drm/amd/powerplay: add hardware manager sub-component") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15drm/rockchip: cleanup fb when drm_gem_fb_afbc_init failedAndy Yan1-8/+1
[ Upstream commit 099593a28138b48feea5be8ce700e5bc4565e31d ] In the function drm_gem_fb_init_with_funcs, the framebuffer (fb) and its corresponding object ID have already been registered. So we need to cleanup the drm framebuffer if the subsequent execution of drm_gem_fb_afbc_init fails. Directly call drm_framebuffer_put to ensure that all fb related resources are cleanup. Fixes: 7707f7227f09 ("drm/rockchip: Add support for afbc") Signed-off-by: Andy Yan <andy.yan@rock-chips.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20250509031607.2542187-1-andyshrk@163.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15drm/i915/dp: Fix 2.7 Gbps DP_LINK_BW value on g4xVille Syrjälä1-0/+6
commit 9e0c433d0c05fde284025264b89eaa4ad59f0a3e upstream. On g4x we currently use the 96MHz non-SSC refclk, which can't actually generate an exact 2.7 Gbps link rate. In practice we end up with 2.688 Gbps which seems to be close enough to actually work, but link training is currently failing due to miscalculating the DP_LINK_BW value (we calcualte it directly from port_clock which reflects the actual PLL outpout frequency). Ideas how to fix this: - nudge port_clock back up to 270000 during PLL computation/readout - track port_clock and the nominal link rate separately so they might differ a bit - switch to the 100MHz refclk, but that one should be SSC so perhaps not something we want While we ponder about a better solution apply some band aid to the immediate issue of miscalculated DP_LINK_BW value. With this I can again use 2.7 Gbps link rate on g4x. Cc: stable@vger.kernel.org Fixes: 665a7b04092c ("drm/i915: Feed the DPLL output freq back into crtc_state") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250710201718.25310-2-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com> (cherry picked from commit a8b874694db5cae7baaf522756f87acd956e6e66) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [ changed display->platform.g4x to IS_G4X(i915) ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15drm/amdkfd: Don't call mmput from MMU notifier callbackPhilip Yang1-25/+22
commit cf234231fcbc7d391e2135b9518613218cc5347f upstream. If the process is exiting, the mmput inside mmu notifier callback from compactd or fork or numa balancing could release the last reference of mm struct to call exit_mmap and free_pgtable, this triggers deadlock with below backtrace. The deadlock will leak kfd process as mmu notifier release is not called and cause VRAM leaking. The fix is to take mm reference mmget_non_zero when adding prange to the deferred list to pair with mmput in deferred list work. If prange split and add into pchild list, the pchild work_item.mm is not used, so remove the mm parameter from svm_range_unmap_split and svm_range_add_child. The backtrace of hung task: INFO: task python:348105 blocked for more than 64512 seconds. Call Trace: __schedule+0x1c3/0x550 schedule+0x46/0xb0 rwsem_down_write_slowpath+0x24b/0x4c0 unlink_anon_vmas+0xb1/0x1c0 free_pgtables+0xa9/0x130 exit_mmap+0xbc/0x1a0 mmput+0x5a/0x140 svm_range_cpu_invalidate_pagetables+0x2b/0x40 [amdgpu] mn_itree_invalidate+0x72/0xc0 __mmu_notifier_invalidate_range_start+0x48/0x60 try_to_unmap_one+0x10fa/0x1400 rmap_walk_anon+0x196/0x460 try_to_unmap+0xbb/0x210 migrate_page_unmap+0x54d/0x7e0 migrate_pages_batch+0x1c3/0xae0 migrate_pages_sync+0x98/0x240 migrate_pages+0x25c/0x520 compact_zone+0x29d/0x590 compact_zone_order+0xb6/0xf0 try_to_compact_pages+0xbe/0x220 __alloc_pages_direct_compact+0x96/0x1a0 __alloc_pages_slowpath+0x410/0x930 __alloc_pages_nodemask+0x3a9/0x3e0 do_huge_pmd_anonymous_page+0xd7/0x3e0 __handle_mm_fault+0x5e3/0x5f0 handle_mm_fault+0xf7/0x2e0 hmm_vma_fault.isra.0+0x4d/0xa0 walk_pmd_range.isra.0+0xa8/0x310 walk_pud_range+0x167/0x240 walk_pgd_range+0x55/0x100 __walk_page_range+0x87/0x90 walk_page_range+0xf6/0x160 hmm_range_fault+0x4f/0x90 amdgpu_hmm_range_get_pages+0x123/0x230 [amdgpu] amdgpu_ttm_tt_get_user_pages+0xb1/0x150 [amdgpu] init_user_pages+0xb1/0x2a0 [amdgpu] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x543/0x7d0 [amdgpu] kfd_ioctl_alloc_memory_of_gpu+0x24c/0x4e0 [amdgpu] kfd_ioctl+0x29d/0x500 [amdgpu] Fixes: fa582c6f3684 ("drm/amdkfd: Use mmget_not_zero in MMU notifier") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a29e067bd38946f752b0ef855f3dfff87e77bec7) Cc: stable@vger.kernel.org [ updated additional svm_range_add_child calls in svm_range_split_by_granularity to remove mm parameter ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15drm/bridge: ti-sn65dsi86: Remove extra semicolon in ti_sn_bridge_probe()Douglas Anderson1-1/+1
[ Upstream commit 15a7ca747d9538c2ad8b0c81dd4c1261e0736c82 ] As reported by the kernel test robot, a recent patch introduced an unnecessary semicolon. Remove it. Fixes: 55e8ff842051 ("drm/bridge: ti-sn65dsi86: Add HPD for DisplayPort connector type") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506301704.0SBj6ply-lkp@intel.com/ Reviewed-by: Devarsh Thakkar <devarsht@ti.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20250714130631.1.I1cfae3222e344a3b3c770d079ee6b6f7f3b5d636@changeid Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-17drm/tegra: nvdec: Fix dma_alloc_coherent error checkMikko Perttunen1-4/+2
[ Upstream commit 44306a684cd1699b8562a54945ddc43e2abc9eab ] Check for NULL return value with dma_alloc_coherent, in line with Robin's fix for vic.c in 'drm/tegra: vic: Fix DMA API misuse'. Fixes: 46f226c93d35 ("drm/tegra: Add NVDEC driver") Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20250702-nvdec-dma-error-check-v1-1-c388b402c53a@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-17drm/gem: Fix race in drm_gem_handle_create_tail()Simona Vetter1-1/+9
commit bd46cece51a36ef088f22ef0416ac13b0a46d5b0 upstream. Object creation is a careful dance where we must guarantee that the object is fully constructed before it is visible to other threads, and GEM buffer objects are no difference. Final publishing happens by calling drm_gem_handle_create(). After that the only allowed thing to do is call drm_gem_object_put() because a concurrent call to the GEM_CLOSE ioctl with a correctly guessed id (which is trivial since we have a linear allocator) can already tear down the object again. Luckily most drivers get this right, the very few exceptions I've pinged the relevant maintainers for. Unfortunately we also need drm_gem_handle_create() when creating additional handles for an already existing object (e.g. GETFB ioctl or the various bo import ioctl), and hence we cannot have a drm_gem_handle_create_and_put() as the only exported function to stop these issues from happening. Now unfortunately the implementation of drm_gem_handle_create() isn't living up to standards: It does correctly finishe object initialization at the global level, and hence is safe against a concurrent tear down. But it also sets up the file-private aspects of the handle, and that part goes wrong: We fully register the object in the drm_file.object_idr before calling drm_vma_node_allow() or obj->funcs->open, which opens up races against concurrent removal of that handle in drm_gem_handle_delete(). Fix this with the usual two-stage approach of first reserving the handle id, and then only registering the object after we've completed the file-private setup. Jacek reported this with a testcase of concurrently calling GEM_CLOSE on a freshly-created object (which also destroys the object), but it should be possible to hit this with just additional handles created through import or GETFB without completed destroying the underlying object with the concurrent GEM_CLOSE ioctl calls. Note that the close-side of this race was fixed in f6cd7daecff5 ("drm: Release driver references to handle before making it available again"), which means a cool 9 years have passed until someone noticed that we need to make this symmetry or there's still gaps left :-/ Without the 2-stage close approach we'd still have a race, therefore that's an integral part of this bugfix. More importantly, this means we can have NULL pointers behind allocated id in our drm_file.object_idr. We need to check for that now: - drm_gem_handle_delete() checks for ERR_OR_NULL already - drm_gem.c:object_lookup() also chekcs for NULL - drm_gem_release() should never be called if there's another thread still existing that could call into an IOCTL that creates a new handle, so cannot race. For paranoia I added a NULL check to drm_gem_object_release_handle() though. - most drivers (etnaviv, i915, msm) are find because they use idr_find(), which maps both ENOENT and NULL to NULL. - drivers using idr_for_each_entry() should also be fine, because idr_get_next does filter out NULL entries and continues the iteration. - The same holds for drm_show_memory_stats(). v2: Use drm_WARN_ON (Thomas) Reported-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Tested-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: stable@vger.kernel.org Cc: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Signed-off-by: Simona Vetter <simona.vetter@intel.com> Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20250707151814.603897-1-simona.vetter@ffwll.ch Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-17drm/ttm: fix error handling in ttm_buffer_object_transferChristian König1-6/+7
commit 97e000acf2e20a86a50a0ec8c2739f0846f37509 upstream. Unlocking the resv object was missing in the error path, additionally to that we should move over the resource only after the fence slot was reserved. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Fixes: c8d4c18bfbc4a ("dma-buf/drivers: make reserving a shared slot mandatory v4") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20250616130726.22863-3-christian.koenig@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-17drm/exynos: exynos7_drm_decon: add vblank check in IRQ handlingKaustabh Chakraborty1-0/+4
commit b846350aa272de99bf6fecfa6b08e64ebfb13173 upstream. If there's support for another console device (such as a TTY serial), the kernel occasionally panics during boot. The panic message and a relevant snippet of the call stack is as follows: Unable to handle kernel NULL pointer dereference at virtual address 000000000000000 Call trace: drm_crtc_handle_vblank+0x10/0x30 (P) decon_irq_handler+0x88/0xb4 [...] Otherwise, the panics don't happen. This indicates that it's some sort of race condition. Add a check to validate if the drm device can handle vblanks before calling drm_crtc_handle_vblank() to avoid this. Cc: stable@vger.kernel.org Fixes: 96976c3d9aff ("drm/exynos: Add DECON driver") Signed-off-by: Kaustabh Chakraborty <kauschluss@disroot.org> Signed-off-by: Inki Dae <inki.dae@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>