summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm
AgeCommit message (Collapse)AuthorFilesLines
2024-07-27drm/amdgpu: Fix signedness bug in sdma_v4_0_process_trap_irq()Dan Carpenter1-1/+1
commit 6769a23697f17f9bf9365ca8ed62fe37e361a05a upstream. The "instance" variable needs to be signed for the error handling to work. Fixes: 8b2faf1a4f3b ("drm/amdgpu: add error handle to avoid out-of-bounds") Reviewed-by: Bob Zhou <bob.zhou@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Siddh Raman Pant <siddh.raman.pant@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-25drm/radeon: check bo_va->bo is non-NULL before using itPierre-Eric Pelloux-Prayer1-1/+1
[ Upstream commit 6fb15dcbcf4f212930350eaee174bb60ed40a536 ] The call to radeon_vm_clear_freed might clear bo_va->bo, so we have to check it before dereferencing it. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/amd/display: Fix array-index-out-of-bounds in dml2/FCLKChangeSupportRoman Li1-1/+1
[ Upstream commit 0ad4b4a2f6357c45fbe444ead1a929a0b4017d03 ] [Why] Potential out of bounds access in dml2_calculate_rq_and_dlg_params() because the value of out_lowest_state_idx used as an index for FCLKChangeSupport array can be greater than 1. [How] Currently dml2 core specifies identical values for all FCLKChangeSupport elements. Always use index 0 in the condition to avoid out of bounds access. Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/amd/display: Update efficiency bandwidth for dcn351Fangzhi Zuo1-0/+1
[ Upstream commit 7ae37db29a8bc4d3d116a409308dd98fc3a0b1b3 ] Fix 4k240 underflow on dcn351 Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/amd/display: Fix refresh rate range for some panelTom Chung1-0/+48
[ Upstream commit 9ef1548aeaa8858e7aee2152bf95cc71cdcd6dff ] [Why] Some of the panels does not have the refresh rate range info in base EDID and only have the refresh rate range info in DisplayID block. It will cause the max/min freesync refresh rate set to 0. [How] Try to parse the refresh rate range info from DisplayID if the max/min refresh rate is 0. Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/amd/display: Account for cursor prefetch BW in DML1 mode supportAlvin Lee1-0/+3
[ Upstream commit 074b3a886713f69d98d30bb348b1e4cb3ce52b22 ] [Description] We need to ensure to take into account cursor prefetch BW in mode support or we may pass ModeQuery but fail an actual flip which will cause a hang. Flip may fail because the cursor_pre_bw is populated during mode programming (and mode programming is never called prior to ModeQuery). Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Reviewed-by: Nevenko Stupar <nevenko.stupar@amd.com> Signed-off-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/amd/display: Add refresh rate range checkTom Chung1-1/+3
[ Upstream commit 74ad26b36d303ac233eccadc5c3a8d7ee4709f31 ] [Why] We only enable the VRR while monitor usable refresh rate range is greater than 10 Hz. But we did not check the range in DRM_EDID_FEATURE_CONTINUOUS_FREQ case. [How] Add a refresh rate range check before set the freesync_capable flag in DRM_EDID_FEATURE_CONTINUOUS_FREQ case. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/amd/swsmu: add MALL init support workaround for smu_v14_0_1Li Ma5-3/+96
[ Upstream commit c223376b3019a00a0241faea0bc8c966738d1cc5 ] [Why] SMU firmware has not supported MALL PG. [How] Disable MALL PG and make it always on until SMU firmware is ready. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/amdgpu: init TA fw for psp v14Likun Gao1-0/+5
[ Upstream commit ed5a4484f074aa2bfb1dad99ff3628ea8da4acdc ] Add support to init TA firmware for psp v14. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/amd/display: change dram_clock_latency to 34us for dcn35Paul Hsieh1-1/+1
[ Upstream commit 6071607bfefefc50a3907c0ba88878846960d29a ] [Why & How] Current DRAM setting would cause underflow on customer platform. Modify dram_clock_change_latency_us from 11.72 to 34.0 us as per recommendation from HW team Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Signed-off-by: Paul Hsieh <paul.hsieh@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/amd/display: Change dram_clock_latency to 34us for dcn351Daniel Miess1-1/+1
[ Upstream commit c60e20f13c27662de36cd5538d6299760780db52 ] [Why] Intermittent underflow observed when using 4k144 display on dcn351 [How] Update dram_clock_change_latency_us from 11.72us to 34us Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Signed-off-by: Daniel Miess <daniel.miess@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/amdgpu: Indicate CU havest info to CPHarish Kasiviswanathan1-2/+13
[ Upstream commit 49c9ffabde555c841392858d8b9e6cf58998a50c ] To achieve full occupancy CP hardware needs to know if CUs in SE are symmetrically or asymmetrically harvested v2: Reset is_symmetric_cus for each loop Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/vmwgfx: Fix missing HYPERVISOR_GUEST dependencyAlexey Makhalov1-1/+1
[ Upstream commit 8c4d6945fe5bd04ff847c3c788abd34ca354ecee ] VMWARE_HYPERCALL alternative will not work as intended without VMware guest code initialization. [ bp: note that this doesn't reproduce with newer gccs so it must be something gcc-9-specific. ] Closes: https://lore.kernel.org/oe-kbuild-all/202406152104.FxakP1MB-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240616012511.198243-1-alexey.makhalov@broadcom.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/mediatek: Call drm_atomic_helper_shutdown() at shutdown timeDouglas Anderson1-0/+8
[ Upstream commit c38896ca6318c2df20bbe6c8e3f633e071fda910 ] Based on grepping through the source code this driver appears to be missing a call to drm_atomic_helper_shutdown() at system shutdown time. Among other things, this means that if a panel is in use that it won't be cleanly powered off at system shutdown time. The fact that we should call drm_atomic_helper_shutdown() in the case of OS shutdown/restart comes straight out of the kernel doc "driver instance overview" in drm_drv.c. This driver users the component model and shutdown happens in the base driver. The "drvdata" for this driver will always be valid if shutdown() is called and as of commit 2a073968289d ("drm/atomic-helper: drm_atomic_helper_shutdown(NULL) should be a noop") we don't need to confirm that "drm" is non-NULL. Suggested-by: Maxime Ripard <mripard@kernel.org> Reviewed-by: Maxime Ripard <mripard@kernel.org> Reviewed-by: Fei Shao <fshao@chromium.org> Tested-by: Fei Shao <fshao@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240611102744.v2.1.I2b014f90afc4729b6ecc7b5ddd1f6dedcea4625b@changeid Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm: renesas: shmobile: Call drm_atomic_helper_shutdown() at shutdown timeDouglas Anderson1-0/+8
[ Upstream commit 0320ca14c6fb68ad19aa72e55a1a21c061b2946b ] Based on grepping through the source code, this driver appears to be missing a call to drm_atomic_helper_shutdown() at system shutdown time. This is important because drm_atomic_helper_shutdown() will cause panels to get disabled cleanly which may be important for their power sequencing. Future changes will remove any custom powering off in individual panel drivers so the DRM drivers need to start getting this right. The fact that we should call drm_atomic_helper_shutdown() in the case of OS shutdown comes straight out of the kernel doc "driver instance overview" in drm_drv.c. [geert: shmob_drm_remove() already calls drm_atomic_helper_shutdown] Suggested-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20230901164111.RFT.15.Iaf638a1d4c8b3c307a6192efabb4cbb06b195f15@changeid [geert: s/drm_helper_force_disable_all/drm_atomic_helper_shutdown/] Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Sui Jingfeng <sui.jingfeng@linux.dev> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/17c6a5a668e5975f871b77fb1fca6711a0799d9e.1718176895.git.geert+renesas@glider.be Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm: panel-orientation-quirks: Add quirk for Aya Neo KUNTobias Jakobi1-0/+6
[ Upstream commit f74fb5df429ebc6a614dc5aa9e44d7194d402e5a ] Similar to the other Aya Neo devices this one features again a portrait screen, here with a native resolution of 1600x2560. Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240310220401.895591-1-tjakobi@math.uni-bielefeld.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/exynos: dp: drop driver owner initializationKrzysztof Kozlowski1-1/+0
[ Upstream commit 1f3512cdf8299f9edaea9046d53ea324a7730bab ] Core in platform_driver_register() already sets the .owner, so driver does not need to. Whatever is set here will be anyway overwritten by main driver calling platform_driver_register(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Inki Dae <inki.dae@samsung.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-25drm/amdgpu/pptable: Fix UBSAN array-index-out-of-boundsTasos Sahanidis1-42/+49
[ Upstream commit c6c4dd54012551cce5cde408b35468f2c62b0cce ] Flexible arrays used [1] instead of []. Replace the former with the latter to resolve multiple UBSAN warnings observed on boot with a BONAIRE card. In addition, use the __counted_by attribute where possible to hint the length of the arrays to the compiler and any sanitizers. Signed-off-by: Tasos Sahanidis <tasos@tasossah.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amdgpu: silence UBSAN warningAlex Deucher1-1/+1
[ Upstream commit 05d9e24ddb15160164ba6e917a88c00907dc2434 ] Convert a variable sized array from [1] to []. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amdgpu: correct hbm field in boot statusHawking Zhang1-1/+1
[ Upstream commit ec58991054e899c9d86f7e3c8a96cb602d4b5938 ] hbm filed takes bit 13 and bit 14 in boot status. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amdkfd: Let VRAM allocations go to GTT domain on small APUsLang Yu5-13/+23
[ Upstream commit eb853413d02c8d9b27942429b261a9eef228f005 ] Small APUs(i.e., consumer, embedded products) usually have a small carveout device memory which can't satisfy most compute workloads memory allocation requirements. We can't even run a Basic MNIST Example with a default 512MB carveout. https://github.com/pytorch/examples/tree/main/mnist. Error Log: "torch.cuda.OutOfMemoryError: HIP out of memory. Tried to allocate 84.00 MiB. GPU 0 has a total capacity of 512.00 MiB of which 0 bytes is free. Of the allocated memory 103.83 MiB is allocated by PyTorch, and 22.17 MiB is reserved by PyTorch but unallocated" Though we can change BIOS settings to enlarge carveout size, which is inflexible and may bring complaint. On the other hand, the memory resource can't be effectively used between host and device. The solution is MI300A approach, i.e., let VRAM allocations go to GTT. Then device and host can flexibly and effectively share memory resource. v2: Report local_mem_size_private as 0. (Felix) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm: panel-orientation-quirks: Add quirk for Valve GalileoJohn Schoenick1-0/+7
commit 26746ed40bb0e4ebe2b2bd61c04eaaa54e263c14 upstream. Valve's Steam Deck Galileo revision has a 800x1280 OLED panel Cc: stable@vger.kernel.org # 6.1+ Signed-off-by: John Schoenick <johns@valvesoftware.com> Signed-off-by: Matthew Schwartz <mattschwartz@gwu.edu> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240628205822.348402-2-mattschwartz@gwu.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-11drm/amdgpu/atomfirmware: silence UBSAN warningAlex Deucher1-1/+1
commit d0417264437a8fa05f894cabba5a26715b32d78e upstream. This is a variable sized array. Link: https://lists.freedesktop.org/archives/amd-gfx/2024-June/110420.html Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-11drm/nouveau: fix null pointer dereference in nouveau_connector_get_modesMa Ke1-0/+3
commit 80bec6825b19d95ccdfd3393cf8ec15ff2a749b4 upstream. In nouveau_connector_get_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd. Cc: stable@vger.kernel.org Fixes: 6ee738610f41 ("drm/nouveau: Add DRM driver for NVIDIA GPUs") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627074204.3023776-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-11drm/ttm: Always take the bo delayed cleanup path for imported bosThomas Hellström1-0/+1
commit d99fbd9aab624fc030934e21655389ab1765dc94 upstream. Bos can be put with multiple unrelated dma-resv locks held. But imported bos attempt to grab the bo dma-resv during dma-buf detach that typically happens during cleanup. That leads to lockde splats similar to the below and a potential ABBA deadlock. Fix this by always taking the delayed workqueue cleanup path for imported bos. Requesting stable fixes from when the Xe driver was introduced, since its usage of drm_exec and wide vm dma_resvs appear to be the first reliable trigger of this. [22982.116427] ============================================ [22982.116428] WARNING: possible recursive locking detected [22982.116429] 6.10.0-rc2+ #10 Tainted: G U W [22982.116430] -------------------------------------------- [22982.116430] glxgears:sh0/5785 is trying to acquire lock: [22982.116431] ffff8c2bafa539a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: dma_buf_detach+0x3b/0xf0 [22982.116438] but task is already holding lock: [22982.116438] ffff8c2d9aba6da8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_exec_lock_obj+0x49/0x2b0 [drm_exec] [22982.116442] other info that might help us debug this: [22982.116442] Possible unsafe locking scenario: [22982.116443] CPU0 [22982.116444] ---- [22982.116444] lock(reservation_ww_class_mutex); [22982.116445] lock(reservation_ww_class_mutex); [22982.116447] *** DEADLOCK *** [22982.116447] May be due to missing lock nesting notation [22982.116448] 5 locks held by glxgears:sh0/5785: [22982.116449] #0: ffff8c2d9aba58c8 (&xef->vm.lock){+.+.}-{3:3}, at: xe_file_close+0xde/0x1c0 [xe] [22982.116507] #1: ffff8c2e28cc8480 (&vm->lock){++++}-{3:3}, at: xe_vm_close_and_put+0x161/0x9b0 [xe] [22982.116578] #2: ffff8c2e31982970 (&val->lock){.+.+}-{3:3}, at: xe_validation_ctx_init+0x6d/0x70 [xe] [22982.116647] #3: ffffacdc469478a8 (reservation_ww_class_acquire){+.+.}-{0:0}, at: xe_vma_destroy_unlocked+0x7f/0xe0 [xe] [22982.116716] #4: ffff8c2d9aba6da8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_exec_lock_obj+0x49/0x2b0 [drm_exec] [22982.116719] stack backtrace: [22982.116720] CPU: 8 PID: 5785 Comm: glxgears:sh0 Tainted: G U W 6.10.0-rc2+ #10 [22982.116721] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023 [22982.116723] Call Trace: [22982.116724] <TASK> [22982.116725] dump_stack_lvl+0x77/0xb0 [22982.116727] __lock_acquire+0x1232/0x2160 [22982.116730] lock_acquire+0xcb/0x2d0 [22982.116732] ? dma_buf_detach+0x3b/0xf0 [22982.116734] ? __lock_acquire+0x417/0x2160 [22982.116736] __ww_mutex_lock.constprop.0+0xd0/0x13b0 [22982.116738] ? dma_buf_detach+0x3b/0xf0 [22982.116741] ? dma_buf_detach+0x3b/0xf0 [22982.116743] ? ww_mutex_lock+0x2b/0x90 [22982.116745] ww_mutex_lock+0x2b/0x90 [22982.116747] dma_buf_detach+0x3b/0xf0 [22982.116749] drm_prime_gem_destroy+0x2f/0x40 [drm] [22982.116775] xe_ttm_bo_destroy+0x32/0x220 [xe] [22982.116818] ? __mutex_unlock_slowpath+0x3a/0x290 [22982.116821] drm_exec_unlock_all+0xa1/0xd0 [drm_exec] [22982.116823] drm_exec_fini+0x12/0xb0 [drm_exec] [22982.116824] xe_validation_ctx_fini+0x15/0x40 [xe] [22982.116892] xe_vma_destroy_unlocked+0xb1/0xe0 [xe] [22982.116959] xe_vm_close_and_put+0x41a/0x9b0 [xe] [22982.117025] ? xa_find+0xe3/0x1e0 [22982.117028] xe_file_close+0x10a/0x1c0 [xe] [22982.117074] drm_file_free+0x22a/0x280 [drm] [22982.117099] drm_release_noglobal+0x22/0x70 [drm] [22982.117119] __fput+0xf1/0x2d0 [22982.117122] task_work_run+0x59/0x90 [22982.117125] do_exit+0x330/0xb40 [22982.117127] do_group_exit+0x36/0xa0 [22982.117129] get_signal+0xbd2/0xbe0 [22982.117131] arch_do_signal_or_restart+0x3e/0x240 [22982.117134] syscall_exit_to_user_mode+0x1e7/0x290 [22982.117137] do_syscall_64+0xa1/0x180 [22982.117139] ? lock_acquire+0xcb/0x2d0 [22982.117140] ? __set_task_comm+0x28/0x1e0 [22982.117141] ? find_held_lock+0x2b/0x80 [22982.117144] ? __set_task_comm+0xe1/0x1e0 [22982.117145] ? lock_release+0xca/0x290 [22982.117147] ? __do_sys_prctl+0x245/0xab0 [22982.117149] ? lockdep_hardirqs_on_prepare+0xde/0x190 [22982.117150] ? syscall_exit_to_user_mode+0xb0/0x290 [22982.117152] ? do_syscall_64+0xa1/0x180 [22982.117154] ? __lock_acquire+0x417/0x2160 [22982.117155] ? reacquire_held_locks+0xd1/0x1f0 [22982.117156] ? do_user_addr_fault+0x30c/0x790 [22982.117158] ? lock_acquire+0xcb/0x2d0 [22982.117160] ? find_held_lock+0x2b/0x80 [22982.117162] ? do_user_addr_fault+0x357/0x790 [22982.117163] ? lock_release+0xca/0x290 [22982.117164] ? do_user_addr_fault+0x361/0x790 [22982.117166] ? trace_hardirqs_off+0x4b/0xc0 [22982.117168] ? clear_bhb_loop+0x45/0xa0 [22982.117170] ? clear_bhb_loop+0x45/0xa0 [22982.117172] ? clear_bhb_loop+0x45/0xa0 [22982.117174] entry_SYSCALL_64_after_hwframe+0x76/0x7e [22982.117176] RIP: 0033:0x7f943d267169 [22982.117192] Code: Unable to access opcode bytes at 0x7f943d26713f. [22982.117193] RSP: 002b:00007f9430bffc80 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca [22982.117195] RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007f943d267169 [22982.117196] RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00005622f89579d0 [22982.117197] RBP: 00007f9430bffcb0 R08: 0000000000000000 R09: 00000000ffffffff [22982.117198] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [22982.117199] R13: 0000000000000000 R14: 0000000000000000 R15: 00005622f89579d0 [22982.117202] </TASK> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: intel-xe@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240628153848.4989-1-thomas.hellstrom@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-11drm/xe: fix error handling in xe_migrate_update_pgtablesMatthew Auld1-4/+4
commit fc932f51926698488f874ddf7d8f18483ca10271 upstream. Don't call drm_suballoc_free with sa_bo pointing to PTR_ERR. References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2120 Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240620102025.127699-2-matthew.auld@intel.com (cherry picked from commit ce6b63336f79ec5f3996de65f452330e395f99ae) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-11drm/xe/mcr: Avoid clobbering DSS steeringMatt Roper1-3/+3
[ Upstream commit 1f006470284598060ca1307355352934400b37ca ] A couple copy/paste mistakes in the code that selects steering targets for OADDRM and INSTANCE0 unintentionally clobbered the steering target for DSS ranges in some cases. The OADDRM/INSTANCE0 values were also not assigned as intended, although that mistake wound up being harmless since the desired values for those specific ranges were '0' which the kzalloc of the GT structure should have already taken care of implicitly. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626210536.1620176-2-matthew.d.roper@intel.com (cherry picked from commit 4f82ac6102788112e599a6074d2c1f2afce923df) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/fbdev-generic: Fix framebuffer on big endian devicesThomas Huth1-1/+2
[ Upstream commit 740b8dad05bee39e1e3b926f05bb4a8274b8ba49 ] Starting with kernel 6.7, the framebuffer text console is not working anymore with the virtio-gpu device on s390x hosts. Such big endian fb devices are usinga different pixel ordering than little endian devices, e.g. DRM_FORMAT_BGRX8888 instead of DRM_FORMAT_XRGB8888. This used to work fine as long as drm_client_buffer_addfb() was still calling drm_mode_addfb() which called drm_driver_legacy_fb_format() internally to get the right format. But drm_client_buffer_addfb() has recently been reworked to call drm_mode_addfb2() instead with the format value that has been passed to it as a parameter (see commit 6ae2ff23aa43 ("drm/client: Convert drm_client_buffer_addfb() to drm_mode_addfb2()"). That format parameter is determined in drm_fbdev_generic_helper_fb_probe() via the drm_mode_legacy_fb_format() function - which only generates formats suitable for little endian devices. So to fix this issue switch to drm_driver_legacy_fb_format() here instead to take the device endianness into consideration. Fixes: 6ae2ff23aa43 ("drm/client: Convert drm_client_buffer_addfb() to drm_mode_addfb2()") Closes: https://issues.redhat.com/browse/RHEL-45158 Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240627173530.460615-1-thuth@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amdgpu: fix the warning about the expression (int)size - lenJesse Zhang1-2/+3
[ Upstream commit ea686fef5489ef7a2450a9fdbcc732b837fb46a8 ] Converting size from size_t to int may overflow. v2: keep reverse xmas tree order (Christian) Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amdgpu: fix uninitialized scalar variable warningTim Huang1-1/+2
[ Upstream commit 9a5f15d2a29d06ce5bd50919da7221cda92afb69 ] Clear warning that uses uninitialized value fw_size. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amd/display: Fix uninitialized variables in DMAlex Hung2-6/+6
[ Upstream commit f95bcb041f213a5da3da5fcaf73269bd13dba945 ] This fixes 11 UNINIT issues reported by Coverity. Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amd/display: ASSERT when failing to find index by plane/stream idAlex Hung1-2/+4
[ Upstream commit 01eb50e53c1ce505bf449348d433181310288765 ] [WHY] find_disp_cfg_idx_by_plane_id and find_disp_cfg_idx_by_stream_id returns an array index and they return -1 when not found; however, -1 is not a valid index number. [HOW] When this happens, call ASSERT(), and return a positive number (which is fewer than callers' array size) instead. This fixes 4 OVERRUN and 2 NEGATIVE_RETURNS issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amd/display: Do not return negative stream id for arrayAlex Hung1-0/+7
[ Upstream commit 3ac31c9a707dd1c7c890b95333182f955e9dcb57 ] [WHY] resource_stream_to_stream_idx returns an array index and it return -1 when not found; however, -1 is not a valid array index number. [HOW] When this happens, call ASSERT(), and return a zero instead. This fixes an OVERRUN and an NEGATIVE_RETURNS issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amd/display: update pipe topology log to support subvpWenjing Liu1-31/+65
[ Upstream commit 5db346c256bbacc634ff515a1a9202cd4b61d8c7 ] [why] There is an ambiguity in subvp pipe topology log. The log doesn't show subvp relation to main stream and it is not clear that certain stream is an internal stream for subvp pipes. [how] Separate subvp pipe topology logging from main pipe topology. Log main stream indices instead of the internal stream for subvp pipes. The following is a sample log showing 2 streams with subvp enabled on both: pipe topology update ________________________ | plane0 slice0 stream0| |DPP1----OPP1----OTG1----| | plane0 slice0 stream1| |DPP0----OPP0----OTG0----| | (phantom pipes) | | plane0 slice0 stream0| |DPP3----OPP3----OTG3----| | plane0 slice0 stream1| |DPP2----OPP2----OTG2----| |________________________| Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Acked-by: Roman Li <roman.li@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: 3ac31c9a707d ("drm/amd/display: Do not return negative stream id for array") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amd/display: Fix overlapping copy within dml_core_mode_programmingHersen Wu1-1/+3
[ Upstream commit f1fd8a0a54e6d23a6d16ee29159f247862460fd1 ] [WHY] &mode_lib->mp.Watermark and &locals->Watermark are the same address. memcpy may lead to unexpected behavior. [HOW] memmove should be used. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amd/display: Skip finding free audio for unknown engine_idAlex Hung1-0/+3
[ Upstream commit 1357b2165d9ad94faa4c4a20d5e2ce29c2ff29c3 ] [WHY] ENGINE_ID_UNKNOWN = -1 and can not be used as an array index. Plus, it also means it is uninitialized and does not need free audio. [HOW] Skip and return NULL. This fixes 2 OVERRUN issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amd/display: Check pipe offset before setting vblankAlex Hung1-2/+6
[ Upstream commit 5396a70e8cf462ec5ccf2dc8de103c79de9489e6 ] pipe_ctx has a size of MAX_PIPES so checking its index before accessing the array. This fixes an OVERRUN issue reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amd/display: Check index msg_id before read or writeAlex Hung1-0/+8
[ Upstream commit 59d99deb330af206a4541db0c4da8f73880fba03 ] [WHAT] msg_id is used as an array index and it cannot be a negative value, and therefore cannot be equal to MOD_HDCP_MESSAGE_ID_INVALID (-1). [HOW] Check whether msg_id is valid before reading and setting. This fixes 4 OVERRUN issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amd/display: Add NULL pointer check for kzallocHersen Wu11-0/+44
[ Upstream commit 8e65a1b7118acf6af96449e1e66b7adbc9396912 ] [Why & How] Check return pointer of kzalloc before using it. Reviewed-by: Alex Hung <alex.hung@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amdgpu: fix double free err_addr pointer warningsBob Zhou1-0/+1
[ Upstream commit 506c245f3f1cd989cb89811a7f06e04ff8813a0d ] In amdgpu_umc_bad_page_polling_timeout, the amdgpu_umc_handle_bad_pages will be run many times so that double free err_addr in some special case. So set the err_addr to NULL to avoid the warnings. Signed-off-by: Bob Zhou <bob.zhou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amdgpu: Initialize timestamp for some legacy SOCsMa Jun1-0/+8
[ Upstream commit 2e55bcf3d742a4946d862b86e39e75a95cc6f1c0 ] Initialize the interrupt timestamp for some legacy SOCs to fix the coverity issue "Uninitialized scalar variable" Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amdgpu: Using uninitialized value *size when calling amdgpu_vce_cs_relocJesse Zhang1-1/+2
[ Upstream commit 88a9a467c548d0b3c7761b4fd54a68e70f9c0944 ] Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x03000001. V2: To really improve the handling we would actually need to have a separate value of 0xffffffff.(Christian) Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amdgpu: Fix uninitialized variable warningsMa Jun2-2/+2
[ Upstream commit 60c448439f3b5db9431e13f7f361b4074d0e8594 ] return 0 to avoid returning an uninitialized variable r Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/xe: Add outer runtime_pm protection to xe_live_ktest@xe_dma_bufRodrigo Vivi1-0/+3
[ Upstream commit f9116f658a6217b101e3b4e89f845775b6fb05d9 ] Any kunit doing any memory access should get their own runtime_pm outer references since they don't use the standard driver API entries. In special this dma_buf from the same driver. Found by pre-merge CI on adding WARN calls for unprotected inner callers: <6> [318.639739] # xe_dma_buf_kunit: running xe_test_dmabuf_import_same_driver <4> [318.639957] ------------[ cut here ]------------ <4> [318.639967] xe 0000:4d:00.0: Missing outer runtime PM protection <4> [318.640049] WARNING: CPU: 117 PID: 3832 at drivers/gpu/drm/xe/xe_pm.c:533 xe_pm_runtime_get_noresume+0x48/0x60 [xe] Cc: Matthew Auld <matthew.auld@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240417203952.25503-10-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/lima: fix shared irq handling on driver removeErico Nunes3-0/+11
[ Upstream commit a6683c690bbfd1f371510cb051e8fa49507f3f5e ] lima uses a shared interrupt, so the interrupt handlers must be prepared to be called at any time. At driver removal time, the clocks are disabled early and the interrupts stay registered until the very end of the remove process due to the devm usage. This is potentially a bug as the interrupts access device registers which assumes clocks are enabled. A crash can be triggered by removing the driver in a kernel with CONFIG_DEBUG_SHIRQ enabled. This patch frees the interrupts at each lima device finishing callback so that the handlers are already unregistered by the time we fully disable clocks. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240401224329.1228468-2-nunes.erico@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05drm/amdgpu/atomfirmware: fix parsing of vram_infoAlex Deucher1-1/+1
commit f6f49dda49db72e7a0b4ca32c77391d5ff5ce232 upstream. v3.x changed the how vram width was encoded. The previous implementation actually worked correctly for most boards. Fix the implementation to work correctly everywhere. This fixes the vram width reported in the kernel log on some boards. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-05drm/amd/display: Send DP_TOTAL_LTTPR_CNT during detection if LTTPR is presentMichael Strauss2-1/+14
commit 2ec6c7f802332d1eff16f03e7c757f1543ee1183 upstream. [WHY] New register field added in DP2.1 SCR, needed for auxless ALPM [HOW] Echo value read from 0xF0007 back to sink Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Michael Strauss <michael.strauss@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-05drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_hd_modesMa Ke1-0/+4
commit 6d411c8ccc0137a612e0044489030a194ff5c843 upstream. In nv17_tv_get_hd_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). The same applies to drm_cvt_mode(). Add a check to avoid null pointer dereference. Cc: stable@vger.kernel.org Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625081029.2619437-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-05drm/i915/gt: Fix potential UAF by revoke of fence registersJanusz Krzysztofik1-0/+1
commit 996c3412a06578e9d779a16b9e79ace18125ab50 upstream. CI has been sporadically reporting the following issue triggered by igt@i915_selftest@live@hangcheck on ADL-P and similar machines: <6> [414.049203] i915: Running intel_hangcheck_live_selftests/igt_reset_evict_fence ... <6> [414.068804] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled <6> [414.068812] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled <3> [414.070354] Unable to pin Y-tiled fence; err:-4 <3> [414.071282] i915_vma_revoke_fence:301 GEM_BUG_ON(!i915_active_is_idle(&fence->active)) ... <4>[ 609.603992] ------------[ cut here ]------------ <2>[ 609.603995] kernel BUG at drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c:301! <4>[ 609.604003] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI <4>[ 609.604006] CPU: 0 PID: 268 Comm: kworker/u64:3 Tainted: G U W 6.9.0-CI_DRM_14785-g1ba62f8cea9c+ #1 <4>[ 609.604008] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023 <4>[ 609.604010] Workqueue: i915 __i915_gem_free_work [i915] <4>[ 609.604149] RIP: 0010:i915_vma_revoke_fence+0x187/0x1f0 [i915] ... <4>[ 609.604271] Call Trace: <4>[ 609.604273] <TASK> ... <4>[ 609.604716] __i915_vma_evict+0x2e9/0x550 [i915] <4>[ 609.604852] __i915_vma_unbind+0x7c/0x160 [i915] <4>[ 609.604977] force_unbind+0x24/0xa0 [i915] <4>[ 609.605098] i915_vma_destroy+0x2f/0xa0 [i915] <4>[ 609.605210] __i915_gem_object_pages_fini+0x51/0x2f0 [i915] <4>[ 609.605330] __i915_gem_free_objects.isra.0+0x6a/0xc0 [i915] <4>[ 609.605440] process_scheduled_works+0x351/0x690 ... In the past, there were similar failures reported by CI from other IGT tests, observed on other platforms. Before commit 63baf4f3d587 ("drm/i915/gt: Only wait for GPU activity before unbinding a GGTT fence"), i915_vma_revoke_fence() was waiting for idleness of vma->active via fence_update(). That commit introduced vma->fence->active in order for the fence_update() to be able to wait selectively on that one instead of vma->active since only idleness of fence registers was needed. But then, another commit 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal") replaced the call to fence_update() in i915_vma_revoke_fence() with only fence_write(), and also added that GEM_BUG_ON(!i915_active_is_idle(&fence->active)) in front. No justification was provided on why we might then expect idleness of vma->fence->active without first waiting on it. The issue can be potentially caused by a race among revocation of fence registers on one side and sequential execution of signal callbacks invoked on completion of a request that was using them on the other, still processed in parallel to revocation of those fence registers. Fix it by waiting for idleness of vma->fence->active in i915_vma_revoke_fence(). Fixes: 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal") Closes: https://gitlab.freedesktop.org/drm/intel/issues/10021 Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: stable@vger.kernel.org # v5.8+ Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240603195446.297690-2-janusz.krzysztofik@linux.intel.com (cherry picked from commit 24bb052d3dd499c5956abad5f7d8e4fd07da7fb1) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-05drm/amdgpu: avoid using null object of framebufferJulia Zhang1-2/+16
commit bcfa48ff785bd121316592b131ff6531e3e696bb upstream. Instead of using state->fb->obj[0] directly, get object from framebuffer by calling drm_gem_fb_get_obj() and return error code when object is null to avoid using null object of framebuffer. Reported-by: Fusheng Huang <fusheng.huang@ecarxgroup.com> Signed-off-by: Julia Zhang <Julia.Zhang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>