summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)AuthorFilesLines
40 hoursdrm/amdgpu/pm: drop SMU driver if version not matched messagesAlex Deucher3-3/+0
commit a3ffaa5b397f4df9d6ac16b10583e9df8e6fa471 upstream. It just leads to user confusion. Cc: Yang Wang <kevinyang.wang@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e471627d56272a791972f25e467348b611c31713) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
40 hoursdrm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KBDonet Tom2-3/+3
commit 4487571ef17a30d274600b3bd6965f497a881299 upstream. Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with 4K pages, both values match (8KB), so allocation and reserved space are consistent. However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB, while the reserved trap area remains 8KB. This mismatch causes the kernel to crash when running rocminfo or rccl unit tests. Kernel attempted to read user page (2) - exploit attempt? (uid: 1001) BUG: Kernel NULL pointer dereference on read at 0x00000002 Faulting instruction address: 0xc0000000002c8a64 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E 6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY Tainted: [E]=UNSIGNED_MODULE Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006 of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries NIP: c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730 REGS: c0000001e0957580 TRAP: 0300 Tainted: G E MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24008268 XER: 00000036 CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000 IRQMASK: 1 GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100 c00000013d814540 GPR04: 0000000000000002 c00000013d814550 0000000000000045 0000000000000000 GPR08: c00000013444d000 c00000013d814538 c00000013d814538 0000000084002268 GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff 0000000000020000 GPR16: 0000000000000000 0000000000000002 c00000015f653000 0000000000000000 GPR20: c000000138662400 c00000013d814540 0000000000000000 c00000013d814500 GPR24: 0000000000000000 0000000000000002 c0000001e0957888 c0000001e0957878 GPR28: c00000013d814548 0000000000000000 c00000013d814540 c0000001e0957888 NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0 LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00 Call Trace: 0xc0000001e0957890 (unreliable) __mutex_lock.constprop.0+0x58/0xd00 amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu] kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu] kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu] kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu] kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu] kfd_ioctl+0x514/0x670 [amdgpu] sys_ioctl+0x134/0x180 system_call_exception+0x114/0x300 system_call_vectored_common+0x15c/0x2ec This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and KFD_CWSR_TBA_TMA_SIZE to the AMD GPU page size. This means we reserve 64 KB for the trap in the address space, but only allocate 8 KB within it. With this approach, the allocation size never exceeds the reserved area. Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole") Reviewed-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 31b8de5e55666f26ea7ece5f412b83eab3f56dbb) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
40 hoursdrm/amdgpu: validate doorbell_offset in user queue creationJunrui Luo1-0/+7
commit a018d1819f158991b7308e4f74609c6c029b670c upstream. amdgpu_userq_get_doorbell_index() passes the user-provided doorbell_offset to amdgpu_doorbell_index_on_bar() without bounds checking. An arbitrarily large doorbell_offset can cause the calculated doorbell index to fall outside the allocated doorbell BO, potentially corrupting kernel doorbell space. Validate that doorbell_offset falls within the doorbell BO before computing the BAR index, using u64 arithmetic to prevent overflow. Fixes: f09c1e6077ab ("drm/amdgpu: generate doorbell index for userqueue") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit de1ef4ffd70e1d15f0bf584fd22b1f28cbd5e2ec) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
40 hoursdrm/amdgpu: Fix wait after reset sequence in S4Lijo Lazar2-3/+8
commit daf470b8882b6f7f53cbfe9ec2b93a1b21528cdc upstream. For a mode-1 reset done at the end of S4 on PSPv11 dGPUs, only check if TOS is unloaded. Fixes: 32f73741d6ee ("drm/amdgpu: Wait for bootloader after PSPv11 reset") Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4853 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2fb4883b884a437d760bd7bdf7695a7e5a60bba3) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
40 hoursdrm/i915/cdclk: Do the full CDCLK dance for min_voltage_level changesVille Syrjälä1-0/+54
commit e08e0754e690e4909cab83ac43fd2c93c6200514 upstream. Apparently I forgot about the pipe min_voltage_level when I decoupled the CDCLK calculations from modesets. Even if the CDCLK frequency doesn't need changing we may still need to bump the voltage level to accommodate an increase in the port clock frequency. Currently, even if there is a full modeset, we won't notice the need to go through the full CDCLK calculations/programming, unless the set of enabled/active pipes changes, or the pipe/dbuf min CDCLK changes. Duplicate the same logic we use the pipe's min CDCLK frequency to also deal with its min voltage level. Note that the 'allow_voltage_level_decrease' stuff isn't really useful here since the min voltage level can only change during a full modeset. But I think sticking to the same approach in the three similar parts (pipe min cdclk, pipe min voltage level, dbuf min cdclk) is a good idea. Cc: stable@vger.kernel.org Tested-by: Mikhail Rudenko <mike.rudenko@gmail.com> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15826 Fixes: ba91b9eecb47 ("drm/i915/cdclk: Decouple cdclk from state->modeset") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260325135849.12603-2-ville.syrjala@linux.intel.com Reviewed-by: Michał Grzelak <michal.grzelak@intel.com> (cherry picked from commit 0f21a14987ebae3c05ad1184ea872e7b7a7b8695) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
40 hoursdrm/i915/dp: Use crtc_state->enhanced_framing properly on ivb/hsw CPU eDPVille Syrjälä1-1/+1
commit 9c9a57e4e337f94e23ddf69263fd0685c91155fb upstream. Looks like I missed the drm_dp_enhanced_frame_cap() in the ivb/hsw CPU eDP code when I introduced crtc_state->enhanced_framing. Fix it up so that the state we program to the hardware is guaranteed to match what we computed earlier. Cc: stable@vger.kernel.org Fixes: 3072a24c778a ("drm/i915: Introduce crtc_state->enhanced_framing") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260325135849.12603-3-ville.syrjala@linux.intel.com Reviewed-by: Michał Grzelak <michal.grzelak@intel.com> (cherry picked from commit 799fe8dc2af52f35c78c4ac97f8e34994dfd8760) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
40 hoursdrm/i915/dsi: Don't do DSC horizontal timing adjustments in command modeVille Syrjälä1-2/+2
commit 4dfce79e098915d8e5fc2b9e1d980bc3251dd32c upstream. Stop adjusting the horizontal timing values based on the compression ratio in command mode. Bspec seems to be telling us to do this only in video mode, and this is also how the Windows driver does things. This should also fix a div-by-zero on some machines because the adjusted htotal ends up being so small that we end up with line_time_us==0 when trying to determine the vtotal value in command mode. Note that this doesn't actually make the display on the Huawei Matebook E work, but at least the kernel no longer explodes when the driver loads. Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12045 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260326111814.9800-2-ville.syrjala@linux.intel.com Fixes: 53693f02d80e ("drm/i915/dsi: account for DSC in horizontal timings") Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 0b475e91ecc2313207196c6d7fd5c53e1a878525) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
40 hoursdrm/ast: dp501: Fix initialization of SCU2CThomas Zimmermann1-1/+1
commit 2f42c1a6161646cbd29b443459fd635d29eda634 upstream. Ast's DP501 initialization reads the register SCU2C at offset 0x1202c and tries to set it to source data from VGA. But writes the update to offset 0x0, with unknown results. Write the result to SCU instead. The bug only happens in ast_init_analog(). There's similar code in ast_init_dvo(), which works correctly. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 83c6620bae3f ("drm/ast: initial DP501 support (v0.2)") Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Jocelyn Falempe <jfalempe@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v3.16+ Link: https://patch.msgid.link/20260327133532.79696-2-tzimmermann@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
40 hoursdrm/amdgpu: fix the idr allocation flagsPrike Liang1-1/+4
commit 62f553d60a801384336f5867967c26ddf3b17038 upstream. Fix the IDR allocation flags by using atomic GFP flags in non‑sleepable contexts to avoid the __might_sleep() complaint. 268.290239] [drm] Initialized amdgpu 3.64.0 for 0000:03:00.0 on minor 0 [ 268.294900] BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:323 [ 268.295355] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1744, name: modprobe [ 268.295705] preempt_count: 1, expected: 0 [ 268.295886] RCU nest depth: 0, expected: 0 [ 268.296072] 2 locks held by modprobe/1744: [ 268.296077] #0: ffff8c3a44abd1b8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0xe4/0x210 [ 268.296100] #1: ffffffffc1a6ea78 (amdgpu_pasid_idr_lock){+.+.}-{3:3}, at: amdgpu_pasid_alloc+0x26/0xe0 [amdgpu] [ 268.296494] CPU: 12 UID: 0 PID: 1744 Comm: modprobe Tainted: G U OE 6.19.0-custom #16 PREEMPT(voluntary) [ 268.296498] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 268.296499] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS RMJ1009A 06/13/2021 [ 268.296501] Call Trace: Fixes: 8f1de51f49be ("drm/amdgpu: prevent immediate PASID reuse case") Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ea56aa2625708eaf96f310032391ff37746310ef) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
40 hoursdrm/amd/display: Fix NULL pointer dereference in dcn401_init_hw()Srinivasan Shanmugam1-6/+11
commit e927b36ae18b66b49219eaa9f46edc7b4fdbb25e upstream. dcn401_init_hw() assumes that update_bw_bounding_box() is valid when entering the update path. However, the existing condition: ((!fams2_enable && update_bw_bounding_box) || freq_changed) does not guarantee this, as the freq_changed branch can evaluate to true independently of the callback pointer. This can result in calling update_bw_bounding_box() when it is NULL. Fix this by separating the update condition from the pointer checks and ensuring the callback, dc->clk_mgr, and bw_params are validated before use. Fixes the below: ../dc/hwss/dcn401/dcn401_hwseq.c:367 dcn401_init_hw() error: we previously assumed 'dc->res_pool->funcs->update_bw_bounding_box' could be null (see line 362) Fixes: ca0fb243c3bb ("drm/amd/display: Underflow Seen on DCN401 eGPU") Cc: Daniel Sa <Daniel.Sa@amd.com> Cc: Alvin Lee <alvin.lee2@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 86117c5ab42f21562fedb0a64bffea3ee5fcd477) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
40 hoursdrm/ioc32: stop speculation on the drm_compat_ioctl pathGreg Kroah-Hartman1-0/+2
commit f8995c2df519f382525ca4bc90553ad2ec611067 upstream. The drm compat ioctl path takes a user controlled pointer, and then dereferences it into a table of function pointers, the signature method of spectre problems. Fix this up by calling array_index_nospec() on the index to the function pointer list. Fixes: 505b5240329b ("drm/ioctl: Fix Spectre v1 vulnerabilities") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: stable <stable@kernel.org> Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Maxime Ripard <mripard@kernel.org> Reviewed-by: Simona Vetter <simona@ffwll.ch> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/2026032451-playing-rummage-8fa2@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
40 hoursdrm/sysfb: Fix efidrm error handling and memory type mismatchChen Ni1-15/+31
[ Upstream commit 5e77923a3eb39cce91bf08ed7670f816bf86d4af ] Fix incorrect error checking and memory type confusion in efidrm_device_create(). devm_memremap() returns error pointers, not NULL, and returns system memory while devm_ioremap() returns I/O memory. The code incorrectly passes system memory to iosys_map_set_vaddr_iomem(). Restructure to handle each memory type separately. Use devm_ioremap*() with ERR_PTR(-ENXIO) for WC/UC, and devm_memremap() with ERR_CAST() for WT/WB. Fixes: 32ae90c66fb6 ("drm/sysfb: Add efidrm for EFI displays") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260311064652.2903449-1-nichen@iscas.ac.cn Signed-off-by: Sasha Levin <sashal@kernel.org>
40 hoursdrm/xe/pxp: Clear restart flag in pxp_start after jumping backDaniele Ceraolo Spurio1-1/+3
[ Upstream commit 76903b2057c8677c2c006e87fede15f496555dc0 ] If we don't clear the flag we'll keep jumping back at the beginning of the function once we reach the end. Fixes: ccd3c6820a90 ("drm/xe/pxp: Decouple queue addition from PXP start") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patch.msgid.link/20260324153718.3155504-9-daniele.ceraolospurio@intel.com (cherry picked from commit 0850ec7bb2459602351639dccf7a68a03c9d1ee0) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
40 hoursdrm/xe/pxp: Remove incorrect handling of impossible state during suspendDaniele Ceraolo Spurio1-6/+0
[ Upstream commit 4fed244954c2dc9aafa333d08f66b14345225e03 ] The default case of the PXP suspend switch is incorrectly exiting without releasing the lock. However, this case is impossible to hit because we're switching on an enum and all the valid enum values have their own cases. Therefore, we can just get rid of the default case and rely on the compiler to warn us if a new enum value is added and we forget to add it to the switch. Fixes: 51462211f4a9 ("drm/xe/pxp: add PXP PM support") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn Teres Alexis <alan.previn.teres.alexis@intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patch.msgid.link/20260324153718.3155504-8-daniele.ceraolospurio@intel.com (cherry picked from commit f1b5a77fc9b6a90cd9a5e3db9d4c73ae1edfcfac) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
40 hoursdrm/xe/pxp: Clean up termination status on failureDaniele Ceraolo Spurio1-0/+1
[ Upstream commit e2628e670bb0923fcdc00828bfcd67b26a7df020 ] If the PXP HW termination fails during PXP start, the normal completion code won't be called, so the termination will remain uncomplete. To avoid unnecessary waits, mark the termination as completed from the error path. Note that we already do this if the termination fails when handling a termination irq from the HW. Fixes: f8caa80154c4 ("drm/xe/pxp: Add PXP queue tracking and session start") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn Teres Alexis <alan.previn.teres.alexis@intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patch.msgid.link/20260324153718.3155504-7-daniele.ceraolospurio@intel.com (cherry picked from commit 5d9e708d2a69ab1f64a17aec810cd7c70c5b9fab) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
40 hoursdrm/xe/xe_pagefault: Disallow writes to read-only VMAsJonathan Cavitt1-0/+6
[ Upstream commit 6d192b4f2d644d15d9a9f1d33dab05af936f6540 ] The page fault handler should reject write/atomic access to read only VMAs. Add code to handle this in xe_pagefault_service after the VMA lookup. v2: - Apply max line length (Matthew) Fixes: fb544b844508 ("drm/xe: Implement xe_pagefault_queue_work") Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260324152935.72444-7-jonathan.cavitt@intel.com (cherry picked from commit 714ee6754ac5fa3dc078856a196a6b124cd797a0) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
40 hoursdrm/bridge: Fix refcount shown via debugfs for encoder_bridges_show()Liu Ying1-5/+11
[ Upstream commit f078634c184a9b5ccaa056e8b8d6cd32f7bff1b6 ] A typical bridge refcount value is 3 after a bridge chain is formed: - devm_drm_bridge_alloc() initializes the refcount value to be 1. - drm_bridge_add() gets an additional reference hence 2. - drm_bridge_attach() gets the third reference hence 3. This typical refcount value aligns with allbridges_show()'s behaviour. However, since encoder_bridges_show() uses drm_for_each_bridge_in_chain_scoped() to automatically get/put the bridge reference while iterating, a bogus reference is accidentally got when showing the wrong typical refcount value as 4 to users via debugfs. Fix this by caching the refcount value returned from kref_read() while iterating and explicitly decreasing the cached refcount value by 1 before showing it to users. Fixes: bd57048e4576 ("drm/bridge: use drm_for_each_bridge_in_chain_scoped()") Signed-off-by: Liu Ying <victor.liu@nxp.com> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Tested-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Link: https://patch.msgid.link/20260318-drm-misc-next-2026-03-05-fix-encoder-bridges-refcount-v3-1-147fea581279@nxp.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
40 hoursRevert "drm: Fix use-after-free on framebuffers and property blobs when ↵Maarten Lankhorst2-10/+4
calling drm_dev_unplug" commit 45ebe43ea00d6b9f5b3e0db9c35b8ca2a96b7e70 upstream. This reverts commit 6bee098b91417654703e17eb5c1822c6dfd0c01d. Den 2026-03-25 kl. 22:11, skrev Simona Vetter: > On Wed, Mar 25, 2026 at 10:26:40AM -0700, Guenter Roeck wrote: >> Hi, >> >> On Fri, Mar 13, 2026 at 04:17:27PM +0100, Maarten Lankhorst wrote: >>> When trying to do a rather aggressive test of igt's "xe_module_load >>> --r reload" with a full desktop environment and game running I noticed >>> a few OOPSes when dereferencing freed pointers, related to >>> framebuffers and property blobs after the compositor exits. >>> >>> Solve this by guarding the freeing in drm_file with drm_dev_enter/exit, >>> and immediately put the references from struct drm_file objects during >>> drm_dev_unplug(). >>> >> >> With this patch in v6.18.20, I get the warning backtraces below. >> The backtraces are gone with the patch reverted. > > Yeah, this needs to be reverted, reasoning below. Maarten, can you please > take care of that and feed the revert through the usual channels? I don't > think it's critical enough that we need to fast-track this into drm.git > directly. > > Quoting the patch here again: > >> drivers/gpu/drm/drm_file.c | 5 ++++- >> drivers/gpu/drm/drm_mode_config.c | 9 ++++++--- >> 2 files changed, 10 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c >> index ec820686b3021..f52141f842a1f 100644 >> --- a/drivers/gpu/drm/drm_file.c >> +++ b/drivers/gpu/drm/drm_file.c >> @@ -233,6 +233,7 @@ static void drm_events_release(struct drm_file *file_priv) >> void drm_file_free(struct drm_file *file) >> { >> struct drm_device *dev; >> + int idx; >> >> if (!file) >> return; >> @@ -249,9 +250,11 @@ void drm_file_free(struct drm_file *file) >> >> drm_events_release(file); >> >> - if (drm_core_check_feature(dev, DRIVER_MODESET)) { >> + if (drm_core_check_feature(dev, DRIVER_MODESET) && >> + drm_dev_enter(dev, &idx)) { > > This is misplaced for two reasons: > > - Even if we'd want to guarantee that we hold a drm_dev_enter/exit > reference during framebuffer teardown, we'd need to do this > _consistently over all callsites. Not ad-hoc in just one place that a > testcase hits. This also means kerneldoc updates of the relevant hooks > and at least a bunch of acks from other driver people to document the > consensus. > > - More importantly, this is driver responsibilities in general unless we > have extremely good reasons to the contrary. Which means this must be > placed in xe. > >> drm_fb_release(file); >> drm_property_destroy_user_blobs(dev, file); >> + drm_dev_exit(idx); >> } >> >> if (drm_core_check_feature(dev, DRIVER_SYNCOBJ)) >> diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c >> index 84ae8a23a3678..e349418978f79 100644 >> --- a/drivers/gpu/drm/drm_mode_config.c >> +++ b/drivers/gpu/drm/drm_mode_config.c >> @@ -583,10 +583,13 @@ void drm_mode_config_cleanup(struct drm_device *dev) >> */ >> WARN_ON(!list_empty(&dev->mode_config.fb_list)); >> list_for_each_entry_safe(fb, fbt, &dev->mode_config.fb_list, head) { >> - struct drm_printer p = drm_dbg_printer(dev, DRM_UT_KMS, "[leaked fb]"); >> + if (list_empty(&fb->filp_head) || drm_framebuffer_read_refcount(fb) > 1) { >> + struct drm_printer p = drm_dbg_printer(dev, DRM_UT_KMS, "[leaked fb]"); > > This is also wrong: > > - Firstly, it's a completely independent bug, we do not smash two bugfixes > into one patch. > > - Secondly, it's again a driver bug: drm_mode_cleanup must be called when > the last drm_device reference disappears (hence the existence of > drmm_mode_config_init), not when the driver gets unbound. The fact that > this shows up in a callchain from a devres cleanup means the intel > driver gets this wrong (like almost everyone else because historically > we didn't know better). > > If we don't follow this rule, then we get races with this code here > running concurrently with drm_file fb cleanups, which just does not > work. Review pointed that out, but then shrugged it off with a confused > explanation: > > https://lore.kernel.org/all/e61e64c796ccfb17ae673331a3df4b877bf42d82.camel@linux.intel.com/ > > Yes this also means a lot of the other drm_device teardown that drivers > do happens way too early. There is a massive can of worms here of a > magnitude that most likely is much, much bigger than what you can > backport to stable kernels. Hotunplug is _hard_. Back to the drawing board, and fixing it in the intel display driver instead. Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: 6bee098b9141 ("drm: Fix use-after-free on framebuffers and property blobs when calling drm_dev_unplug") Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Simona Vetter <simona.vetter@ffwll.ch> Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> Link: https://patch.msgid.link/20260326082217.39941-2-dev@lankhorst.se Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
40 hoursdrm/amd/display: Fix gamma 2.2 colorop TFsAlex Hung1-3/+3
[ Upstream commit b49814033cb5224c818cfb04dccb3260da10cc4f ] Use GAMMA22 for degamma/blend and GAMMA22_INV for shaper so curves match the color pipeline. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5016 Tested-by: Xaver Hugl <xaver.hugl@kde.org> Reviewed-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d8f9f42effd767ffa7bbcd7e05fbd6b20737e468) Signed-off-by: Sasha Levin <sashal@kernel.org>
40 hoursdrm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13Yang Wang2-2/+64
[ Upstream commit 3e6dd28a11083e83e11a284d99fcc9eb748c321c ] Forcibly disable the OD_FAN_CURVE feature when temperature or PWM range is invalid, otherwise PMFW will reject this configuration on smu v13.0.x example: $ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve OD_FAN_CURVE: 0: 0C 0% 1: 0C 0% 2: 0C 0% 3: 0C 0% 4: 0C 0% OD_RANGE: FAN_CURVE(hotspot temp): 0C 0C FAN_CURVE(fan speed): 0% 0% $ echo "0 50 40" | sudo tee fan_curve kernel log: [ 756.442527] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! [ 777.345800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! Closes: https://github.com/ROCm/amdgpu/issues/208 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 470891606c5a97b1d0d937e0aa67a3bed9fcb056) Cc: stable@vger.kernel.org [ adapted forward declaration placement to existing FEATURE_MASK macro ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysdrm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6Asad Kamal1-1/+1
commit 2f0e491faee43181b6a86e90f34016b256042fe1 upstream. When SET_UCLK_MAX capability is absent, return -EOPNOTSUPP from smu_v13_0_6_emit_clk_levels() for OD_MCLK instead of 0. This makes unsupported OD_MCLK reporting consistent with other clock types and allows callers to skip the entry cleanly. Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d82e0a72d9189e8acd353988e1a57f85ce479e37) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysdrm/i915: Unlink NV12 planes earlierVille Syrjälä1-2/+9
commit bfa71b7a9dc6b5b8af157686e03308291141d00c upstream. unlink_nv12_plane() will clobber parts of the plane state potentially already set up by plane_atomic_check(), so we must make sure not to call the two in the wrong order. The problem happens when a plane previously selected as a Y plane is now configured as a normal plane by user space. plane_atomic_check() will first compute the proper plane state based on the userspace request, and unlink_nv12_plane() later clears some of the state. This used to work on account of unlink_nv12_plane() skipping the state clearing based on the plane visibility. But I removed that check, thinking it was an impossible situation. Now when that situation happens unlink_nv12_plane() will just WARN and proceed to clobber the state. Rather than reverting to the old way of doing things, I think it's more clear if we unlink the NV12 planes before we even compute the new plane state. Cc: stable@vger.kernel.org Reported-by: Khaled Almahallawy <khaled.almahallawy@intel.com> Closes: https://lore.kernel.org/intel-gfx/20260212004852.1920270-1-khaled.almahallawy@intel.com/ Tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com> Fixes: 6a01df2f1b2a ("drm/i915: Remove pointless visible check in unlink_nv12_plane()") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260316163953.12905-2-ville.syrjala@linux.intel.com Reviewed-by: Uma Shankar <uma.shankar@intel.com> (cherry picked from commit 017ecd04985573eeeb0745fa2c23896fb22ee0cc) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysdrm/i915: Order OP vs. timeout correctly in __wait_for()Ville Syrjälä1-1/+1
commit 6ad2a661ff0d3d94884947d2a593311ba46d34c2 upstream. Put the barrier() before the OP so that anything we read out in OP and check in COND will actually be read out after the timeout has been evaluated. Currently the only place where we use OP is __intel_wait_for_register(), but the use there is precisely susceptible to this reordering, assuming the ktime_*() stuff itself doesn't act as a sufficient barrier: __intel_wait_for_register(...) { ... ret = __wait_for(reg_value = intel_uncore_read_notrace(...), (reg_value & mask) == value, ...); ... } Cc: stable@vger.kernel.org Fixes: 1c3c1dc66a96 ("drm/i915: Add compiler barrier to wait_for") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260313110740.24620-1-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit a464bace0482aa9a83e9aa7beefbaf44cd58e6cf) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysdrm/i915/dp_tunnel: Fix error handling when clearing stream BW in atomic stateImre Deak3-11/+28
commit 77fcf58df15edcf3f5b5421f24814fb72796def9 upstream. Clearing the DP tunnel stream BW in the atomic state involves getting the tunnel group state, which can fail. Handle the error accordingly. This fixes at least one issue where drm_dp_tunnel_atomic_set_stream_bw() failed to get the tunnel group state returning -EDEADLK, which wasn't handled. This lead to the ctx->contended warn later in modeset_lock() while taking a WW mutex for another object in the same atomic state, and thus within the same already contended WW context. Moving intel_crtc_state_alloc() later would avoid freeing saved_state on the error path; this stable patch leaves that simplification for a follow-up. Cc: Uma Shankar <uma.shankar@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.9+ Fixes: a4efae87ecb2 ("drm/i915/dp: Compute DP tunnel BW during encoder state computation") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7617 Reviewed-by: Michał Grzelak <michal.grzelak@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patch.msgid.link/20260320092900.13210-1-imre.deak@intel.com (cherry picked from commit fb69d0076e687421188bc8103ab0e8e5825b1df1) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysdrm/amd/display: check if ext_caps is valid in BL setupAlex Deucher1-1/+1
commit 9da4f9964abcaeb6e19797d5e3b10faad338a786 upstream. LVDS connectors don't have extended backlight caps so check if the pointer is valid before accessing it. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5012 Fixes: 1454642960b0 ("drm/amd: Re-introduce property to control adaptive backlight modulation") Cc: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3f797396d7f4eb9bb6eded184bbc6f033628a6f6) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysdrm/amd/display: Fix drm_edid leak in amdgpu_dmAlex Hung1-1/+2
commit 37c2caa167b0b8aca4f74c32404c5288b876a2a3 upstream. [WHAT] When a sink is connected, aconnector->drm_edid was overwritten without freeing the previous allocation, causing a memory leak on resume. [HOW] Free the previous drm_edid before updating it. Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 52024a94e7111366141cfc5d888b2ef011f879e5) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysdrm/amd/display: Fix DCE LVDS handlingAlex Deucher6-22/+19
commit 90d239cc53723c1a3f89ce08eac17bf3a9e9f2d4 upstream. LVDS does not use an HPD pin so it may be invalid. Handle this case correctly in link encoder creation. Fixes: 7c8fb3b8e9ba ("drm/amd/display: Add hpd_source index check for DCE60/80/100/110/112/120 link encoders") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5012 Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Cc: Roman Li <roman.li@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3b5620f7ee688177fcf65cf61588c5435bce1872) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysdrm/amdgpu: fix strsep() corrupting lockup_timeout on multi-GPU (v3)Ruijing Dong1-2/+11
commit 2d300ebfc411205fa31ba7741c5821d381912381 upstream. amdgpu_device_get_job_timeout_settings() passes a pointer directly to the global amdgpu_lockup_timeout[] buffer into strsep(). strsep() destructively replaces delimiter characters with '\0' in-place. On multi-GPU systems, this function is called once per device. When a multi-value setting like "0,0,0,-1" is used, the first GPU's call transforms the global buffer into "0\00\00\0-1". The second GPU then sees only "0" (terminated at the first '\0'), parses a single value, hits the single-value fallthrough (index == 1), and applies timeout=0 to all rings — causing immediate false job timeouts. Fix this by copying into a stack-local array before calling strsep(), so the global module parameter buffer remains intact across calls. The buffer is AMDGPU_MAX_TIMEOUT_PARAM_LENGTH (256) bytes, which is safe for the stack. v2: wrap commit message to 72 columns, add Assisted-by tag. v3: use stack array with strscpy() instead of kstrdup()/kfree() to avoid unnecessary heap allocation (Christian). This patch was developed with assistance from Claude (claude-opus-4-6). Assisted-by: Claude:claude-opus-4-6 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 94d79f51efecb74be1d88dde66bdc8bfcca17935) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysdrm/amdgpu: prevent immediate PASID reuse caseEric Huang3-13/+34
commit 14b81abe7bdc25f8097906fc2f91276ffedb2d26 upstream. PASID resue could cause interrupt issue when process immediately runs into hw state left by previous process exited with the same PASID, it's possible that page faults are still pending in the IH ring buffer when the process exits and frees up its PASID. To prevent the case, it uses idr cyclic allocator same as kernel pid's. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8f1de51f49be692de137c8525106e0fce2d1912d) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysdrm/xe: always keep track of remap prev/nextMatthew Auld3-10/+28
commit bfe9e314d7574d1c5c851972e7aee342733819d2 upstream. During 3D workload, user is reporting hitting: [ 413.361679] WARNING: drivers/gpu/drm/xe/xe_vm.c:1217 at vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe], CPU#7: vkd3d_queue/9925 [ 413.361944] CPU: 7 UID: 1000 PID: 9925 Comm: vkd3d_queue Kdump: loaded Not tainted 7.0.0-070000rc3-generic #202603090038 PREEMPT(lazy) [ 413.361949] RIP: 0010:vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe] [ 413.362074] RSP: 0018:ffffd4c25c3df930 EFLAGS: 00010282 [ 413.362077] RAX: 0000000000000000 RBX: ffff8f3ee817ed10 RCX: 0000000000000000 [ 413.362078] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 413.362079] RBP: ffffd4c25c3df980 R08: 0000000000000000 R09: 0000000000000000 [ 413.362081] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f41fbf99380 [ 413.362082] R13: ffff8f3ee817e968 R14: 00000000ffffffef R15: ffff8f43d00bd380 [ 413.362083] FS: 00000001040ff6c0(0000) GS:ffff8f4696d89000(0000) knlGS:00000000330b0000 [ 413.362085] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 [ 413.362086] CR2: 00007ddfc4747000 CR3: 00000002e6262005 CR4: 0000000000f72ef0 [ 413.362088] PKRU: 55555554 [ 413.362089] Call Trace: [ 413.362092] <TASK> [ 413.362096] xe_vm_bind_ioctl+0xa9a/0xc60 [xe] Which seems to hint that the vma we are re-inserting for the ops unwind is either invalid or overlapping with something already inserted in the vm. It shouldn't be invalid since this is a re-insertion, so must have worked before. Leaving the likely culprit as something already placed where we want to insert the vma. Following from that, for the case where we do something like a rebind in the middle of a vma, and one or both mapped ends are already compatible, we skip doing the rebind of those vma and set next/prev to NULL. As well as then adjust the original unmap va range, to avoid unmapping the ends. However, if we trigger the unwind path, we end up with three va, with the two ends never being removed and the original va range in the middle still being the shrunken size. If this occurs, one failure mode is when another unwind op needs to interact with that range, which can happen with a vector of binds. For example, if we need to re-insert something in place of the original va. In this case the va is still the shrunken version, so when removing it and then doing a re-insert it can overlap with the ends, which were never removed, triggering a warning like above, plus leaving the vm in a bad state. With that, we need two things here: 1) Stop nuking the prev/next tracking for the skip cases. Instead relying on checking for skip prev/next, where needed. That way on the unwind path, we now correctly remove both ends. 2) Undo the unmap va shrinkage, on the unwind path. With the two ends now removed the unmap va should expand back to the original size again, before re-insertion. v2: - Update the explanation in the commit message, based on an actual IGT of triggering this issue, rather than conjecture. - Also undo the unmap shrinkage, for the skip case. With the two ends now removed, the original unmap va range should expand back to the original range. v3: - Track the old start/range separately. vma_size/start() uses the va info directly. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7602 Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260318100208.78097-2-matthew.auld@intel.com (cherry picked from commit aec6969f75afbf4e01fd5fb5850ed3e9c27043ac) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysdrm/amdgpu: Fix fence put before wait in amdgpu_amdkfd_submit_ibSrinivasan Shanmugam1-2/+2
[ Upstream commit 7150850146ebfa4ca998f653f264b8df6f7f85be ] amdgpu_amdkfd_submit_ib() submits a GPU job and gets a fence from amdgpu_ib_schedule(). This fence is used to wait for job completion. Currently, the code drops the fence reference using dma_fence_put() before calling dma_fence_wait(). If dma_fence_put() releases the last reference, the fence may be freed before dma_fence_wait() is called. This can lead to a use-after-free. Fix this by waiting on the fence first and releasing the reference only after dma_fence_wait() completes. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c:697 amdgpu_amdkfd_submit_ib() warn: passing freed memory 'f' (line 696) Fixes: 9ae55f030dc5 ("drm/amdgpu: Follow up change to previous drm scheduler change.") Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8b9e5259adc385b61a6590a13b82ae0ac2bd3482) Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysdrm/xe: Implement recent spec updates to Wa_16025250150Matt Roper2-1/+3
[ Upstream commit 56781a4597706cd25185b1dedc38841ec6c31496 ] The hardware teams noticed that the originally documented workaround steps for Wa_16025250150 may not be sufficient to fully avoid a hardware issue. The workaround documentation has been augmented to suggest programming one additional register; make the corresponding change in the driver. Fixes: 7654d51f1fd8 ("drm/xe/xe2hpg: Add Wa_16025250150") Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patch.msgid.link/20260319-wa_16025250150_part2-v1-1-46b1de1a31b2@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit a31566762d4075646a8a2214586158b681e94305) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysdrm/amd/display: Do not skip unrelated mode changes in DSC validationYussuf Khalil3-1/+9
[ Upstream commit aed3d041ab061ec8a64f50a3edda0f4db7280025 ] Starting with commit 17ce8a6907f7 ("drm/amd/display: Add dsc pre-validation in atomic check"), amdgpu resets the CRTC state mode_changed flag to false when recomputing the DSC configuration results in no timing change for a particular stream. However, this is incorrect in scenarios where a change in MST/DSC configuration happens in the same KMS commit as another (unrelated) mode change. For example, the integrated panel of a laptop may be configured differently (e.g., HDR enabled/disabled) depending on whether external screens are attached. In this case, plugging in external DP-MST screens may result in the mode_changed flag being dropped incorrectly for the integrated panel if its DSC configuration did not change during precomputation in pre_validate_dsc(). At this point, however, dm_update_crtc_state() has already created new streams for CRTCs with DSC-independent mode changes. In turn, amdgpu_dm_commit_streams() will never release the old stream, resulting in a memory leak. amdgpu_dm_atomic_commit_tail() will never acquire a reference to the new stream either, which manifests as a use-after-free when the stream gets disabled later on: BUG: KASAN: use-after-free in dc_stream_release+0x25/0x90 [amdgpu] Write of size 4 at addr ffff88813d836524 by task kworker/9:9/29977 Workqueue: events drm_mode_rmfb_work_fn Call Trace: <TASK> dump_stack_lvl+0x6e/0xa0 print_address_description.constprop.0+0x88/0x320 ? dc_stream_release+0x25/0x90 [amdgpu] print_report+0xfc/0x1ff ? srso_alias_return_thunk+0x5/0xfbef5 ? __virt_addr_valid+0x225/0x4e0 ? dc_stream_release+0x25/0x90 [amdgpu] kasan_report+0xe1/0x180 ? dc_stream_release+0x25/0x90 [amdgpu] kasan_check_range+0x125/0x200 dc_stream_release+0x25/0x90 [amdgpu] dc_state_destruct+0x14d/0x5c0 [amdgpu] dc_state_release.part.0+0x4e/0x130 [amdgpu] dm_atomic_destroy_state+0x3f/0x70 [amdgpu] drm_atomic_state_default_clear+0x8ee/0xf30 ? drm_mode_object_put.part.0+0xb1/0x130 __drm_atomic_state_free+0x15c/0x2d0 atomic_remove_fb+0x67e/0x980 Since there is no reliable way of figuring out whether a CRTC has unrelated mode changes pending at the time of DSC validation, remember the value of the mode_changed flag from before the point where a CRTC was marked as potentially affected by a change in DSC configuration. Reset the mode_changed flag to this earlier value instead in pre_validate_dsc(). Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5004 Fixes: 17ce8a6907f7 ("drm/amd/display: Add dsc pre-validation in atomic check") Signed-off-by: Yussuf Khalil <dev@pp3345.net> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit cc7c7121ae082b7b82891baa7280f1ff2608f22b) Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysdrm/xe/pf: Fix use-after-free in migration restoreMichał Winiarski1-0/+2
[ Upstream commit 87997b6c6516e049cbaf2fc6810b213d587a06b1 ] When an error is returned from xe_sriov_pf_migration_restore_produce(), the data pointer is not set to NULL, which can trigger use-after-free in subsequent .write() calls. Set the pointer to NULL upon error to fix the problem. Fixes: 1ed30397c0b92 ("drm/xe/pf: Add support for encap/decap of bitstream to/from packet") Reported-by: Sebastian Österlund <sebastian.osterlund@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7230 Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20260217154118.176902-1-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> (cherry picked from commit 4f53d8c6d23527d734fe3531d08e15cb170a0819) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysdrm/i915/gmbus: fix spurious timeout on 512-byte burst readsSamasth Norway Ananda1-1/+3
[ Upstream commit 08441f10f4dc09fdeb64529953ac308abc79dd38 ] When reading exactly 512 bytes with burst read enabled, the extra_byte_added path breaks out of the inner do-while without decrementing len. The outer while(len) then re-enters and gmbus_wait() times out since all data has been delivered. Decrement len before the break so the outer loop terminates correctly. Fixes: d5dc0f43f268 ("drm/i915/gmbus: Enable burst read") Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patch.msgid.link/20260316231920.135438-2-samasth.norway.ananda@oracle.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 4ab0f09ee73fc853d00466682635f67c531f909c) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysdrm/mediatek: dsi: Store driver data before invoking mipi_dsi_host_registerLuca Leonardo Scorcia1-4/+5
[ Upstream commit 4cfdfeb6ac06079f92fccd977fa742d6c5b8dd3a ] The call to mipi_dsi_host_register triggers a callback to mtk_dsi_bind, which uses dev_get_drvdata to retrieve the mtk_dsi struct, so this structure needs to be stored inside the driver data before invoking it. As drvdata is currently uninitialized it leads to a crash when registering the DSI DRM encoder right after acquiring the mode_config.idr_mutex, blocking all subsequent DRM operations. Fixes the following crash during mediatek-drm probe (tested on Xiaomi Smart Clock x04g): Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040 [...] Modules linked in: mediatek_drm(+) drm_display_helper cec drm_client_lib drm_dma_helper drm_kms_helper panel_simple [...] Call trace: drm_mode_object_add+0x58/0x98 (P) __drm_encoder_init+0x48/0x140 drm_encoder_init+0x6c/0xa0 drm_simple_encoder_init+0x20/0x34 [drm_kms_helper] mtk_dsi_bind+0x34/0x13c [mediatek_drm] component_bind_all+0x120/0x280 mtk_drm_bind+0x284/0x67c [mediatek_drm] try_to_bring_up_aggregate_device+0x23c/0x320 __component_add+0xa4/0x198 component_add+0x14/0x20 mtk_dsi_host_attach+0x78/0x100 [mediatek_drm] mipi_dsi_attach+0x2c/0x50 panel_simple_dsi_probe+0x4c/0x9c [panel_simple] mipi_dsi_drv_probe+0x1c/0x28 really_probe+0xc0/0x3dc __driver_probe_device+0x80/0x160 driver_probe_device+0x40/0x120 __device_attach_driver+0xbc/0x17c bus_for_each_drv+0x88/0xf0 __device_attach+0x9c/0x1cc device_initial_probe+0x54/0x60 bus_probe_device+0x34/0xa0 device_add+0x5b0/0x800 mipi_dsi_device_register_full+0xdc/0x16c mipi_dsi_host_register+0xc4/0x17c mtk_dsi_probe+0x10c/0x260 [mediatek_drm] platform_probe+0x5c/0xa4 really_probe+0xc0/0x3dc __driver_probe_device+0x80/0x160 driver_probe_device+0x40/0x120 __driver_attach+0xc8/0x1f8 bus_for_each_dev+0x7c/0xe0 driver_attach+0x24/0x30 bus_add_driver+0x11c/0x240 driver_register+0x68/0x130 __platform_register_drivers+0x64/0x160 mtk_drm_init+0x24/0x1000 [mediatek_drm] do_one_initcall+0x60/0x1d0 do_init_module+0x54/0x240 load_module+0x1838/0x1dc0 init_module_from_file+0xd8/0xf0 __arm64_sys_finit_module+0x1b4/0x428 invoke_syscall.constprop.0+0x48/0xc8 do_el0_svc+0x3c/0xb8 el0_svc+0x34/0xe8 el0t_64_sync_handler+0xa0/0xe4 el0t_64_sync+0x198/0x19c Code: 52800022 941004ab 2a0003f3 37f80040 (29005a80) Fixes: e4732b590a77 ("drm/mediatek: dsi: Register DSI host after acquiring clocks and PHY") Signed-off-by: Luca Leonardo Scorcia <l.scorcia@gmail.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20260225094047.76780-1-l.scorcia@gmail.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysdrm/amdgpu: fix gpu idle power consumption issue for gfx v12Yang Wang1-1/+4
[ Upstream commit a6571045cf06c4aa749b4801382ae96650e2f0e1 ] Older versions of the MES firmware may cause abnormal GPU power consumption. When performing inference tasks on the GPU (e.g., with Ollama using ROCm), the GPU may show abnormal power consumption in idle state and incorrect GPU load information. This issue has been fixed in firmware version 0x8b and newer. Closes: https://github.com/ROCm/ROCm/issues/5706 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4e22a5fe6ea6e0b057e7f246df4ac3ff8bfbc46a) Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysdrm/ttm/tests: Fix build failure on PREEMPT_RTMaarten Lankhorst1-2/+2
[ Upstream commit a58d487fb1a52579d3c37544ea371da78ed70c45 ] Fix a compile error in the kunit tests when CONFIG_PREEMPT_RT is enabled, and the normal mutex is converted into a rtmutex. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202602261547.3bM6yVAS-lkp@intel.com/ Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Link: https://patch.msgid.link/20260304085616.1216961-1-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25drm/xe/guc: Fail immediately on GuC load errorDaniele Ceraolo Spurio1-3/+3
[ Upstream commit 9b72283ec9b8685acdb3467de8fbc3352fdb70bb ] By using the same variable for both the return of poll_timeout_us and the return of the polled function guc_wait_ucode, the return value of the latter is overwritten and lost after exiting the polling loop. Since guc_wait_ucode returns -1 on GuC load failure, we lose that information and always continue as if the GuC had been loaded correctly. This is fixed by simply using 2 separate variables. Fixes: a4916b4da448 ("drm/xe/guc: Refactor GuC load to use poll_timeout_us()") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patch.msgid.link/20260303001732.2540493-2-daniele.ceraolospurio@intel.com (cherry picked from commit c85ec5c5753a46b5c2aea1292536487be9470ffe) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25drm/i915/gt: Check set_default_submission() before deferencingRahul Bukte1-1/+2
[ Upstream commit 0162ab3220bac870e43e229e6e3024d1a21c3f26 ] When the i915 driver firmware binaries are not present, the set_default_submission pointer is not set. This pointer is dereferenced during suspend anyways. Add a check to make sure it is set before dereferencing. [ 23.289926] PM: suspend entry (deep) [ 23.293558] Filesystems sync: 0.000 seconds [ 23.298010] Freezing user space processes [ 23.302771] Freezing user space processes completed (elapsed 0.000 seconds) [ 23.309766] OOM killer disabled. [ 23.313027] Freezing remaining freezable tasks [ 23.318540] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) [ 23.342038] serial 00:05: disabled [ 23.345719] serial 00:02: disabled [ 23.349342] serial 00:01: disabled [ 23.353782] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 23.358993] sd 1:0:0:0: [sdb] Synchronizing SCSI cache [ 23.361635] ata1.00: Entering standby power mode [ 23.368863] ata2.00: Entering standby power mode [ 23.445187] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 23.452194] #PF: supervisor instruction fetch in kernel mode [ 23.457896] #PF: error_code(0x0010) - not-present page [ 23.463065] PGD 0 P4D 0 [ 23.465640] Oops: Oops: 0010 [#1] SMP NOPTI [ 23.469869] CPU: 8 UID: 0 PID: 211 Comm: kworker/u48:18 Tainted: G S W 6.19.0-rc4-00020-gf0b9d8eb98df #10 PREEMPT(voluntary) [ 23.482512] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN [ 23.496511] Workqueue: async async_run_entry_fn [ 23.501087] RIP: 0010:0x0 [ 23.503755] Code: Unable to access opcode bytes at 0xffffffffffffffd6. [ 23.510324] RSP: 0018:ffffb4a60065fca8 EFLAGS: 00010246 [ 23.515592] RAX: 0000000000000000 RBX: ffff9f428290e000 RCX: 000000000000000f [ 23.522765] RDX: 0000000000000000 RSI: 0000000000000282 RDI: ffff9f428290e000 [ 23.529937] RBP: ffff9f4282907070 R08: ffff9f4281130428 R09: 00000000ffffffff [ 23.537111] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9f42829070f8 [ 23.544284] R13: ffff9f4282906028 R14: ffff9f4282900000 R15: ffff9f4282906b68 [ 23.551457] FS: 0000000000000000(0000) GS:ffff9f466b2cf000(0000) knlGS:0000000000000000 [ 23.559588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 23.565365] CR2: ffffffffffffffd6 CR3: 000000031c230001 CR4: 0000000000f70ef0 [ 23.572539] PKRU: 55555554 [ 23.575281] Call Trace: [ 23.577770] <TASK> [ 23.579905] intel_engines_reset_default_submission+0x42/0x60 [ 23.585695] __intel_gt_unset_wedged+0x191/0x200 [ 23.590360] intel_gt_unset_wedged+0x20/0x40 [ 23.594675] gt_sanitize+0x15e/0x170 [ 23.598290] i915_gem_suspend_late+0x6b/0x180 [ 23.602692] i915_drm_suspend_late+0x35/0xf0 [ 23.607008] ? __pfx_pci_pm_suspend_late+0x10/0x10 [ 23.611843] dpm_run_callback+0x78/0x1c0 [ 23.615817] device_suspend_late+0xde/0x2e0 [ 23.620037] async_suspend_late+0x18/0x30 [ 23.624082] async_run_entry_fn+0x25/0xa0 [ 23.628129] process_one_work+0x15b/0x380 [ 23.632182] worker_thread+0x2a5/0x3c0 [ 23.635973] ? __pfx_worker_thread+0x10/0x10 [ 23.640279] kthread+0xf6/0x1f0 [ 23.643464] ? __pfx_kthread+0x10/0x10 [ 23.647263] ? __pfx_kthread+0x10/0x10 [ 23.651045] ret_from_fork+0x131/0x190 [ 23.654837] ? __pfx_kthread+0x10/0x10 [ 23.658634] ret_from_fork_asm+0x1a/0x30 [ 23.662597] </TASK> [ 23.664826] Modules linked in: [ 23.667914] CR2: 0000000000000000 [ 23.671271] ------------[ cut here ]------------ Signed-off-by: Rahul Bukte <rahul.bukte@sony.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patch.msgid.link/20260203044839.1555147-1-suraj.kandpal@intel.com (cherry picked from commit daa199abc3d3d1740c9e3a2c3e9216ae5b447cad) Fixes: ff44ad51ebf8 ("drm/i915: Move engine->submit_request selection to a vfunc") Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25drm/bridge: dw-hdmi-qp: fix multi-channel audio outputJonas Karlman1-1/+1
[ Upstream commit cffcb42c57686e9a801dfcf37a3d0c62e51c1c3e ] Channel Allocation (PB4) and Level Shift Information (PB5) are configured with values from PB1 and PB2 due to the wrong offset being used. This results in missing audio channels or incorrect speaker placement when playing multi-channel audio. Use the correct offset to fix multi-channel audio output. Fixes: fd0141d1a8a2 ("drm/bridge: synopsys: Add audio support for dw-hdmi-qp") Reported-by: Christian Hewitt <christianshewitt@gmail.com> Signed-off-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Link: https://patch.msgid.link/20260228112822.4056354-1-christianshewitt@gmail.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25drm/amd: fix dcn 2.01 checkAndy Nguyen1-4/+4
[ Upstream commit 39f44f54afa58661ecae9c27e15f5dbce2372892 ] The ASICREV_IS_BEIGE_GOBY_P check always took precedence, because it includes all chip revisions upto NV_UNKNOWN. Fixes: 54b822b3eac3 ("drm/amd/display: Use dce_version instead of chip_id") Signed-off-by: Andy Nguyen <theofficialflow1996@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9c7be0efa6f0daa949a5f3e3fdf9ea090b0713cb) Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25drm/amd/display: Fix DisplayID not-found handling in parse_edid_displayid_vrr()Srinivasan Shanmugam1-2/+2
[ Upstream commit 2323b019651ad81c20a0f7f817c63392b3110652 ] parse_edid_displayid_vrr() searches the EDID extension blocks for a DisplayID extension before parsing the dynamic video timing range. The code previously checked whether edid_ext was NULL after the search loop. However, edid_ext is assigned during each iteration of the loop, so it will never be NULL once the loop has executed. If no DisplayID extension is found, edid_ext ends up pointing to the last extension block, and the NULL check does not correctly detect the failure case. Instead, check whether the loop completed without finding a matching DisplayID block by testing "i == edid->extensions". This ensures the function exits early when no DisplayID extension is present and avoids parsing an unrelated EDID extension block. Also simplify the EDID validation check using "!edid || !edid->extensions". Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:13079 parse_edid_displayid_vrr() warn: variable dereferenced before check 'edid_ext' (see line 13075) Fixes: a638b837d0e6 ("drm/amd/display: Fix refresh rate range for some panel") Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Jerry Zuo <jerry.zuo@amd.com> Cc: Sun peng Li <sunpeng.li@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 91c7e6342e98c846b259c57273436fdea4c043f2) Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25drm/vmwgfx: Don't overwrite KMS surface dirty trackerIan Forbes1-1/+2
[ Upstream commit c6cb77c474a32265e21c4871c7992468bf5e7638 ] We were overwriting the surface's dirty tracker here causing a memory leak. Reported-by: Mika Penttilä <mpenttil@redhat.com> Closes: https://lore.kernel.org/dri-devel/8c53f3c6-c6de-46fe-a8ca-d98dd52b3abe@redhat.com/ Fixes: 965544150d1c ("drm/vmwgfx: Refactor cursor handling") Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Reviewed-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patch.msgid.link/20260302200330.66763-1-ian.forbes@broadcom.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25drm/xe: Open-code GGTT MMIO access protectionMatthew Brost2-7/+8
commit 01f2557aa684e514005541e71a3d01f4cd45c170 upstream. GGTT MMIO access is currently protected by hotplug (drm_dev_enter), which works correctly when the driver loads successfully and is later unbound or unloaded. However, if driver load fails, this protection is insufficient because drm_dev_unplug() is never called. Additionally, devm release functions cannot guarantee that all BOs with GGTT mappings are destroyed before the GGTT MMIO region is removed, as some BOs may be freed asynchronously by worker threads. To address this, introduce an open-coded flag, protected by the GGTT lock, that guards GGTT MMIO access. The flag is cleared during the dev_fini_ggtt devm release function to ensure MMIO access is disabled once teardown begins. Cc: stable@vger.kernel.org Fixes: 919bb54e989c ("drm/xe: Fix missing runtime outer protection for ggtt_remove_node") Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260310225039.1320161-8-zhanjun.dong@intel.com (cherry picked from commit 4f3a998a173b4325c2efd90bdadc6ccd3ad9a431) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25drm/xe: Fix missing runtime PM reference in ccs_mode_storeSanjay Yadav1-0/+2
commit 65d046b2d8e0d6d855379a981869005fd6b6a41b upstream. ccs_mode_store() calls xe_gt_reset() which internally invokes xe_pm_runtime_get_noresume(). That function requires the caller to already hold an outer runtime PM reference and warns if none is held: [46.891177] xe 0000:03:00.0: [drm] Missing outer runtime PM protection [46.891178] WARNING: drivers/gpu/drm/xe/xe_pm.c:885 at xe_pm_runtime_get_noresume+0x8b/0xc0 Fix this by protecting xe_gt_reset() with the scope-based guard(xe_pm_runtime)(xe), which is the preferred form when the reference lifetime matches a single scope. v2: - Use scope-based guard(xe_pm_runtime)(xe) (Shuicheng) - Update commit message accordingly Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7593 Fixes: 480b358e7d8e ("drm/xe: Do not wake device during a GT reset") Cc: <stable@vger.kernel.org> # v6.19+ Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Suggested-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260313071608.3459480-2-sanjay.kumar.yadav@intel.com (cherry picked from commit 7937ea733f79b3f25e802a0c8360bf7423856f36) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25drm/xe: Always kill exec queues in xe_guc_submit_pause_abortMatthew Brost1-2/+1
commit 26c638d5602e329e0b26281a74c6ec69dee12f23 upstream. xe_guc_submit_pause_abort is intended to be called after something disastrous occurs (e.g., VF migration fails, device wedging, or driver unload) and should immediately trigger the teardown of remaining submission state. With that, kill any remaining queues in this function. Fixes: 7c4b7e34c83b ("drm/xe/vf: Abort VF post migration recovery on failure") Cc: stable@vger.kernel.org Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260310225039.1320161-2-zhanjun.dong@intel.com (cherry picked from commit 78f3bf00be4f15daead02ba32d4737129419c902) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25drm/xe/oa: Allow reading after disabling OA streamAshutosh Dixit1-2/+5
commit 9be6fd9fbd2032b683e51374497768af9aaa228a upstream. Some OA data might be present in the OA buffer when OA stream is disabled. Allow UMD's to retrieve this data, so that all data till the point when OA stream is disabled can be retrieved. v2: Update tail pointer after disable (Umesh) Fixes: efb315d0a013 ("drm/xe/oa/uapi: Read file_operation") Cc: stable@vger.kernel.org Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Umesh Nerlige Ramappa<umesh.nerlige.ramappa@intel.com> Link: https://patch.msgid.link/20260313053630.3176100-1-ashutosh.dixit@intel.com (cherry picked from commit 4ff57c5e8dbba23b5457be12f9709d5c016da16e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25drm/xe/guc: Ensure CT state transitions via STOP before DISABLEDZhanjun Dong1-0/+1
commit 7838dd8367419e9fc43b79c038321cb3c04de2a2 upstream. The GuC CT state transition requires moving to the STOP state before entering the DISABLED state. Update the driver teardown sequence to make the proper state machine transitions. Fixes: ee4b32220a6b ("drm/xe/guc: Add devm release action to safely tear down CT") Cc: stable@vger.kernel.org Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260310225039.1320161-6-zhanjun.dong@intel.com (cherry picked from commit dace8cb0032f57ea67c87b3b92ad73c89dd2db44) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25drm/i915/psr: Disable PSR on update_m_n and update_lrrJouni Högander1-0/+2
commit b0a4dba7b623aa7cbc9efcc56b4af2ec8b274f3e upstream. PSR/PR parameters might change based on update_m_n or update_lrr. Disable on update_m_n and update_lrr to ensure proper parameters are taken into use on next PSR enable in intel_psr_post_plane_update. Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15771 Fixes: 2bc98c6f97af ("drm/i915/alpm: Compute ALPM parameters into crtc_state->alpm_state") Cc: <stable@vger.kernel.org> # v6.19+ Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patch.msgid.link/20260312083710.1593781-2-jouni.hogander@intel.com (cherry picked from commit 65852b56bfa929f99e28c96fd98b02058959da7f) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>