summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2026-02-18drm/xe: Add bounds check on pat_index to prevent OOB kernel read in madviseJia Yao1-1/+6
When user provides a bogus pat_index value through the madvise IOCTL, the xe_pat_index_get_coh_mode() function performs an array access without validating bounds. This allows a malicious user to trigger an out-of-bounds kernel read from the xe->pat.table array. The vulnerability exists because the validation in madvise_args_are_sane() directly calls xe_pat_index_get_coh_mode(xe, args->pat_index.val) without first checking if pat_index is within [0, xe->pat.n_entries). Although xe_pat_index_get_coh_mode() has a WARN_ON to catch this in debug builds, it still performs the unsafe array access in production kernels. v2(Matthew Auld) - Using array_index_nospec() to mitigate spectre attacks when the value is used v3(Matthew Auld) - Put the declarations at the start of the block Fixes: ada7486c5668 ("drm/xe: Implement madvise ioctl for xe") Reviewed-by: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.18+ Cc: Matthew Brost <matthew.brost@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Jia Yao <jia.yao@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260205161529.1819276-1-jia.yao@intel.com (cherry picked from commit 944a3329b05510d55c69c2ef455136e2fc02de29) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-18drm/xe/configfs: Fix 'parameter name omitted' errorsMichal Wajdeczko1-4/+8
On some configs and old compilers we can get following build errors: ../drivers/gpu/drm/xe/xe_configfs.h: In function 'xe_configfs_get_ctx_restore_mid_bb': ../drivers/gpu/drm/xe/xe_configfs.h:40:76: error: parameter name omitted static inline u32 xe_configfs_get_ctx_restore_mid_bb(struct pci_dev *pdev, enum xe_engine_class, ^~~~~~~~~~~~~~~~~~~~ ../drivers/gpu/drm/xe/xe_configfs.h: In function 'xe_configfs_get_ctx_restore_post_bb': ../drivers/gpu/drm/xe/xe_configfs.h:42:77: error: parameter name omitted static inline u32 xe_configfs_get_ctx_restore_post_bb(struct pci_dev *pdev, enum xe_engine_class, ^~~~~~~~~~~~~~~~~~~~ when trying to define our configfs stub functions. Fix that. Fixes: 7a4756b2fd04 ("drm/xe/lrc: Allow to add user commands mid context switch") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20260203193745.576-1-michal.wajdeczko@intel.com (cherry picked from commit f59cde8a2452b392115d2af8f1143a94725f4827) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-18drm/xe/pf: Fix sysfs initializationMichal Wajdeczko1-28/+26
In case of devm_add_action_or_reset() failure the provided cleanup action will be run immediately on the not yet initialized kobject. This may lead to errors like: [ ] kobject: '(null)' (ff110001393608e0): is not initialized, yet kobject_put() is being called. [ ] WARNING: lib/kobject.c:734 at kobject_put+0xd9/0x250, CPU#0: kworker/0:0/9 [ ] RIP: 0010:kobject_put+0xdf/0x250 [ ] Call Trace: [ ] xe_sriov_pf_sysfs_init+0x21/0x100 [xe] [ ] xe_sriov_pf_init_late+0x87/0x2b0 [xe] [ ] xe_sriov_init_late+0x5f/0x2c0 [xe] [ ] xe_device_probe+0x5f2/0xc20 [xe] [ ] xe_pci_probe+0x396/0x610 [xe] [ ] local_pci_probe+0x47/0xb0 [ ] refcount_t: underflow; use-after-free. [ ] WARNING: lib/refcount.c:28 at refcount_warn_saturate+0x68/0xb0, CPU#0: kworker/0:0/9 [ ] RIP: 0010:refcount_warn_saturate+0x68/0xb0 [ ] Call Trace: [ ] kobject_put+0x174/0x250 [ ] xe_sriov_pf_sysfs_init+0x21/0x100 [xe] [ ] xe_sriov_pf_init_late+0x87/0x2b0 [xe] [ ] xe_sriov_init_late+0x5f/0x2c0 [xe] [ ] xe_device_probe+0x5f2/0xc20 [xe] [ ] xe_pci_probe+0x396/0x610 [xe] [ ] local_pci_probe+0x47/0xb0 Fix that by calling kobject_init() and kobject_add() separately and register cleanup action after the kobject is initialized. Also make this cleanup registration a part of the create helper to fix another mistake, as in the loop we were wrongly passing parent kobject while registering cleanup action, and this resulted in some undetected leaks. Fixes: 5c170a4d9c53 ("drm/xe/pf: Prepare sysfs for SR-IOV admin attributes") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20260203235332.1350-1-michal.wajdeczko@intel.com (cherry picked from commit 98b16727f07e26a5d4de84d88805ce7ffcfdd324) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-06Merge tag 'drm-xe-next-fixes-2026-02-05' of ↵Dave Airlie8-11/+20
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next - Fix CFI violation in debugfs access (Daniele) - Kernel-doc fixes (Chaitanya, Shuicheng) - Disable D3Cold for BMG only on specific platforms (Karthik) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/aYStaLZVJWwKCDZt@intel.com
2026-02-06Merge tag 'drm-misc-next-fixes-2026-02-05' of ↵Dave Airlie13-80/+103
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next Several fixes for amdxdna around PM handling, error reporting and memory safety, a compilation fix for ilitek-ili9882t, a NULL pointer dereference fix for imx8qxp-pixel-combiner and several PTE fixes for nouveau Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patch.msgid.link/20260205-refreshing-natural-vole-4c73af@houat
2026-02-06Merge tag 'drm-intel-next-fixes-2026-02-05' of ↵Dave Airlie4-24/+26
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next - Fix the pixel normalization handling for xe3p_lpd display Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patch.msgid.link/aYROngKfyUIyoQW0@jlahtine-mobl
2026-02-05drm/xe/pm: Disable D3Cold for BMG only on specific platformsKarthik Poosa1-3/+10
Restrict D3Cold disablement for BMG to unsupported NUC platforms, instead of disabling it on all platforms. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: 3e331a6715ee ("drm/xe/pm: Temporarily disable D3Cold on BMG") Link: https://patch.msgid.link/20260123173238.1642383-1-karthik.poosa@intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 39125eaf8863ab09d70c4b493f58639b08d5a897) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_depShuicheng Lin1-1/+1
Correct the function name in the kerneldoc. It is for below warning: "Warning: drivers/gpu/drm/xe/xe_tlb_inval_job.c:210 expecting prototype for xe_tlb_inval_alloc_dep(). Prototype was for xe_tlb_inval_job_alloc_dep() instead" Fixes: 15366239e2130 ("drm/xe: Decouple TLB invalidations from GT") Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260129233834.419977-8-shuicheng.lin@intel.com (cherry picked from commit 9f9c117ac566cb567dd56cc5b7564c45653f7a2a) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_earlyShuicheng Lin1-1/+1
Correct the function name in the kerneldoc. It is for below warning: "Warning: drivers/gpu/drm/xe/xe_tlb_inval.c:136 expecting prototype for xe_gt_tlb_inval_init(). Prototype was for xe_gt_tlb_inval_init_early() instead" v2: add () for the function. (Michal) Fixes: db16f9d90c1d9 ("drm/xe: Split TLB invalidation code in frontend and backend") Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260129233834.419977-7-shuicheng.lin@intel.com (cherry picked from commit 0651dbb9d6a72e99569576fbec4681fd8160d161) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05drm/xe: Fix kerneldoc for xe_migrate_exec_queueShuicheng Lin1-1/+1
Correct the function name in the kerneldoc. It is for below warning: "Warning: drivers/gpu/drm/xe/xe_migrate.c:1262 expecting prototype for xe_get_migrate_exec_queue(). Prototype was for xe_migrate_exec_queue() instead" Fixes: 916ee4704a865 ("drm/xe/vf: Register CCS read/write contexts with Guc") Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260129233834.419977-6-shuicheng.lin@intel.com (cherry picked from commit 9fd8da717934f05125b9ba6782622c459a368dc0) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05drm/xe/query: Fix topology query pointer advanceShuicheng Lin1-1/+1
The topology query helper advanced the user pointer by the size of the pointer, not the size of the structure. This can misalign the output blob and corrupt the following mask. Fix the increment to use sizeof(*topo). There is no issue currently, as sizeof(*topo) happens to be equal to sizeof(topo) on 64-bit systems (both evaluate to 8 bytes). Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260130043907.465128-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit c2a6859138e7f73ad904be17dd7d1da6cc7f06b3) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05drm/xe/guc: Fix kernel-doc warning in GuC scheduler ABI headerChaitanya Kumar Borah1-1/+1
The GuC scheduler ABI header contains a file-level comment that is not intended to document a kernel-doc symbol. Using kernel-doc comment syntax (/** */) triggers kernel-doc warnings. With "-Werror", this causes the build to fail. Convert the comment to a regular block comment. HDRTEST drivers/gpu/drm/xe/abi/guc_scheduler_abi.h Warning: drivers/gpu/drm/xe/abi/guc_scheduler_abi.h:11 This comment starts with '/**', but isn't a kernel-doc comment. Refer to Documentation/doc-guide/kernel-doc.rst * Generic defines required for registration with and submissions to the GuC 1 warnings as errors make[6]: *** [drivers/gpu/drm/xe/Makefile:377: drivers/gpu/drm/xe/abi/guc_scheduler_abi.hdrtest] Error 3 make[5]: *** [scripts/Makefile.build:544: drivers/gpu/drm/xe] Error 2 make[4]: *** [scripts/Makefile.build:544: drivers/gpu/drm] Error 2 make[3]: *** [scripts/Makefile.build:544: drivers/gpu] Error 2 make[2]: *** [scripts/Makefile.build:544: drivers] Error 2 make[1]: *** [/home/kbuild2/kernel/Makefile:2088: .] Error 2 make: *** [Makefile:248: __sub-make] Error 2 v2: - Add Fixes tag (Daniele) Fixes: b0c5cf4f5917 ("drm/gt/guc: extract scheduler-related defines from guc_fwif.h") Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patch.msgid.link/20260130135210.2659200-1-chaitanya.kumar.borah@intel.com (cherry picked from commit f89dbe14a0c8854b7aaf960dd842c10698b3ff19) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05drm/xe/guc: Fix CFI violation in debugfs access.Daniele Ceraolo Spurio2-3/+5
xe_guc_print_info is void-returning, but the function pointer it is assigned to expects an int-returning function, leading to the following CFI error: [ 206.873690] CFI failure at guc_debugfs_show+0xa1/0xf0 [xe] (target: xe_guc_print_info+0x0/0x370 [xe]; expected type: 0xbe3bc66a) Fix this by updating xe_guc_print_info to return an integer. Fixes: e15826bb3c2c ("drm/xe/guc: Refactor GuC debugfs initialization") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: George D Sworo <george.d.sworo@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260129182547.32899-2-daniele.ceraolospurio@intel.com (cherry picked from commit dd8ea2f2ab71b98887fdc426b0651dbb1d1ea760) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05accel/amdxdna: Move RPM resume into job run functionLizhi Hou1-10/+9
Currently, amdxdna_pm_resume_get() is called during job creation, and amdxdna_pm_suspend_put() is called when the hardware notifies job completion. If a job is canceled before it is run, no hardware completion notification is generated, resulting in an unbalanced runtime PM resume/suspend pair. Fix this by moving amdxdna_pm_resume_get() to the job run path, ensuring runtime PM is only resumed for jobs that are actually executed. Fixes: 063db451832b ("accel/amdxdna: Enhance runtime power management") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260204171118.3165607-1-lizhi.hou@amd.com
2026-02-05accel/amdxdna: Fix incorrect DPM level after suspend/resumeLizhi Hou2-2/+3
The suspend routine sets the DPM level to 0, which unintentionally overwrites the previously saved DPM level. As a result, the device always resumes with DPM level 0 instead of restoring the original value. Fix this by ensuring the suspend path does not overwrite the saved DPM level, allowing the correct DPM level to be restored during resume. Fixes: f4d7b8a6bc8c ("accel/amdxdna: Enhance power management settings") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260204171048.3165580-1-lizhi.hou@amd.com
2026-02-04nouveau/vmm: start tracking if the LPT PTE is valid. (v6)Dave Airlie3-10/+36
When NVK enabled large pages userspace tests were seeing fault reports at a valid address. There was a case where an address moving from 64k page to 4k pages could expose a race between unmapping the 4k page, mapping the 64k page and unref the 4k pages. Unref 4k pages would cause the dual-page table handling to always set the LPTE entry to SPARSE or INVALID, but if we'd mapped a valid LPTE in the meantime, it would get trashed. Keep track of when a valid LPTE has been referenced, and don't reset in that case. This adds an lpte valid tracker and lpte reference count. Whenever an lpte is referenced, it gets made valid and the ref count increases, whenever it gets unreference the refcount is tracked. Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14610 Reviewed-by: Mary Guillemard <mary@mary.zone> Tested-by: Mary Guillemard <mary@mary.zone> Tested-by: Mel Henning <mhenning@darkrefraction.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patch.msgid.link/20260204030208.2313241-4-airlied@gmail.com
2026-02-04nouveau/vmm: increase size of vmm pte tracker struct to u32 (v2)Dave Airlie2-7/+8
We need to tracker large counts of spte than previously due to unref getting delayed sometimes. This doesn't fix LPT tracking yet, it just creates space for it. Reviewed-by: Mary Guillemard <mary@mary.zone> Tested-by: Mary Guillemard <mary@mary.zone> Tested-by: Mel Henning <mhenning@darkrefraction.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patch.msgid.link/20260204030208.2313241-3-airlied@gmail.com
2026-02-04nouveau/vmm: rewrite pte tracker using a struct and bitfields.Dave Airlie2-24/+31
I want to increase the counters here and start tracking LPTs as well as there are certain situations where userspace with mixed page sizes can cause ref/unrefs to live longer so need better reference counting. This should be entirely non-functional. Reviewed-by: Mary Guillemard <mary@mary.zone> Tested-by: Mary Guillemard <mary@mary.zone> Tested-by: Mel Henning <mhenning@darkrefraction.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patch.msgid.link/20260204030208.2313241-2-airlied@gmail.com
2026-02-03accel/amdxdna: Fix incorrect error code returned for failed chain commandLizhi Hou1-1/+1
The driver currently returns an incorrect error code when a chain command fails. In this case, ERT_CMD_STATE_ERROR is expected to be reported for failed chain commands. Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260203184037.2751889-1-lizhi.hou@amd.com
2026-02-03accel/amdxdna: Remove hardware context statusLizhi Hou3-28/+5
One newly supported command does not require hardware context configuration to be performed upfront. As a result, checking hardware context status causes this command to fail incorrectly. Remove hardware context status handling entirely. For other commands, if userspace submits a request without configuring the hardware context first, the firmware will report an error or time out as appropriate. Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260202212450.2681273-1-lizhi.hou@amd.com
2026-02-03drm/bridge: imx8qxp-pixel-combiner: Fix bailout for imx8qxp_pc_bridge_probe()Liu Ying1-1/+1
In case the channel0 is unavailable and bailing out from free_child is needed when we fail to add a DRM bridge for the available channel1, pointer pc->ch[0] in the bailout path would be NULL and it would be dereferenced as pc->ch[0]->bridge.next_bridge. Fix this by checking pc->ch[0] before dereferencing it. Fixes: ae754f049ce1 ("drm/bridge: imx8qxp-pixel-combiner: get/put the next bridge") Fixes: 99764593528f ("drm/bridge: imx8qxp-pixel-combiner: convert to devm_drm_bridge_alloc() API") Signed-off-by: Liu Ying <victor.liu@nxp.com> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260123-imx8qxp-drm-bridge-fixes-v1-3-8bb85ada5866@nxp.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
2026-02-03drm/panel: ilitek-ili9882t: Remove duplicate initializers in tianma_il79900a_dscNathan Chancellor1-4/+0
Clang warns (or errors with CONFIG_WERROR=y / W=e): drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:95:16: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides] 95 | .vbr_enable = 0, | ^ drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:90:16: note: previous initialization is here 90 | .vbr_enable = false, | ^~~~~ drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:97:19: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides] 97 | .rc_model_size = DSC_RC_MODEL_SIZE_CONST, | ^~~~~~~~~~~~~~~~~~~~~~~ include/drm/display/drm_dsc.h:22:38: note: expanded from macro 'DSC_RC_MODEL_SIZE_CONST' 22 | #define DSC_RC_MODEL_SIZE_CONST 8192 | ^~~~ drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:91:19: note: previous initialization is here 91 | .rc_model_size = DSC_RC_MODEL_SIZE_CONST, | ^~~~~~~~~~~~~~~~~~~~~~~ include/drm/display/drm_dsc.h:22:38: note: expanded from macro 'DSC_RC_MODEL_SIZE_CONST' 22 | #define DSC_RC_MODEL_SIZE_CONST 8192 | ^~~~ drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:132:25: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides] 132 | .initial_scale_value = 32, | ^~ drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:126:25: note: previous initialization is here 126 | .initial_scale_value = 32, | ^~ drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:133:20: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides] 133 | .nfl_bpg_offset = 3511, | ^~~~ drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:108:20: note: previous initialization is here 108 | .nfl_bpg_offset = 1402, | ^~~~ GCC would warn about this in the same manner but its version, -Woverride-init, is disabled for a normal kernel build in scripts/Makefile.warn. For clang, -Wextra in drivers/gpu/drm/Makefile turns it back but GCC respects turning it off earlier in the command line. Of all the duplicate fields in the initializer, only nfl_bpg_offset is a different value. Clear up the duplicate initializers, keeping the 'false' value for .vbr_enable, as it is bool, and the second value for .nfl_bpg_offset, assuming it is the correct one since it was the one tested in the original change. Fixes: 65ce1f5834e9 ("drm/panel: ilitek-ili9882t: Switch Tianma TL121BVMS07 to DSC 120Hz mode") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://patch.msgid.link/20260114-panel-ilitek-ili9882t-fix-override-init-v1-1-1d69a2b096df@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2026-02-02drm/i915/display: fix the pixel normalization handling for xe3p_lpdVinod Govindapillai4-24/+26
Pixel normalizer is enabled with normalization factor as 1.0 for FP16 formats in order to support FBC for those formats in xe3p_lpd. Previously pixel normalizer gets disabled during the plane disable routine. But there could be plane format settings without explicitly calling the plane disable in-between and we could endup keeping the pixel normalizer enabled for formats which we don't require that. This is causing crc mismatches in yuv formats and FIFO underruns in planar formats like NV12. Fix this by updating the pixel normalizer configuration based on the pixel formats explicitly during the plane settings arm calls itself - enable it for FP16 and disable it for other formats in HDR capable planes. v2: avoid redundant pixel normalization setting updates v3: moved the normalization factor definition to intel_fbc.c and some updates to comments v4: simplified the pixel normalizer setting handling Fixes: 5298eea7ed20 ("drm/i915/xe3p_lpd: use pixel normalizer for fp16 formats for FBC") Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Link: https://patch.msgid.link/20260130095919.107805-1-vinod.govindapillai@intel.com (cherry picked from commit c0dc68f4e2aa7eddb9ec6d95931f9576d8fe7334) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2026-02-02Merge tag 'exynos-drm-next-for-v6.20' of ↵Dave Airlie2-11/+64
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next Fix three regressions . Fix a regression where vidi_connection_ioctl() used the wrong device to look up the vidi context. It stores the vidi device in exynos_drm_private and uses it in ioctl(), preventing invalid pointer access and related bugs. . Fix a security regression where vidi_connection_ioctl() directly dereferenced a user pointer for EDID data. It copies EDID from user space with copy_from_user() into kernel memory before use, preventing arbitrary kernel memory access. . Fix a concurrency regression where vidi_context members related to EDID memory were accessed without locking. It protects alloc/free and state updates with ctx->lock, preventing race conditions and use-after-free bugs. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Inki Dae <inki.dae@samsung.com> Link: https://patch.msgid.link/20260201143939.27074-1-inki.dae@samsung.com
2026-02-01Merge tag 'amd-drm-next-6.20-2026-01-30' of ↵Dave Airlie75-576/+743
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.20-2026-01-30: amdgpu: - Misc cleanups - SMU 13 fixes - SMU 14 fixes - GPUVM fault filter fix - USB4 fixes - DC FP guard fixes - Powergating fix - JPEG ring reset fix - RAS fixes - Xclk fix for soc21 APUs - Fix COND_EXEC handling for GC 11 - UserQ fixes - MQD size alignment fixes - SMU feature interface cleanup - GC 10-12 KGQ init fixes - GC 11-12 KGQ reset fixes amdkfd: - Fix device snapshot reporting - GC 12.1 trap handler fixes - MQD size alignment fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patch.msgid.link/20260130183257.28879-1-alexander.deucher@amd.com
2026-02-01drm/exynos: vidi: use ctx->lock to protect struct vidi_context member ↵Jeongjun Park1-6/+32
variables related to memory alloc/free Exynos Virtual Display driver performs memory alloc/free operations without lock protection, which easily causes concurrency problem. For example, use-after-free can occur in race scenario like this: ``` CPU0 CPU1 CPU2 ---- ---- ---- vidi_connection_ioctl() if (vidi->connection) // true drm_edid = drm_edid_alloc(); // alloc drm_edid ... ctx->raw_edid = drm_edid; ... drm_mode_getconnector() drm_helper_probe_single_connector_modes() vidi_get_modes() if (ctx->raw_edid) // true drm_edid_dup(ctx->raw_edid); if (!drm_edid) // false ... vidi_connection_ioctl() if (vidi->connection) // false drm_edid_free(ctx->raw_edid); // free drm_edid ... drm_edid_alloc(drm_edid->edid) kmemdup(edid); // UAF!! ... ``` To prevent these vulns, at least in vidi_context, member variables related to memory alloc/free should be protected with ctx->lock. Cc: <stable@vger.kernel.org> Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2026-02-01drm/exynos: vidi: fix to avoid directly dereferencing user pointerJeongjun Park1-4/+18
In vidi_connection_ioctl(), vidi->edid(user pointer) is directly dereferenced in the kernel. This allows arbitrary kernel memory access from the user space, so instead of directly accessing the user pointer in the kernel, we should modify it to copy edid to kernel memory using copy_from_user() and use it. Cc: <stable@vger.kernel.org> Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2026-02-01drm/exynos: vidi: use priv->vidi_dev for ctx lookup in vidi_connection_ioctl()Jeongjun Park2-1/+14
vidi_connection_ioctl() retrieves the driver_data from drm_dev->dev to obtain a struct vidi_context pointer. However, drm_dev->dev is the exynos-drm master device, and the driver_data contained therein is not the vidi component device, but a completely different device. This can lead to various bugs, ranging from null pointer dereferences and garbage value accesses to, in unlucky cases, out-of-bounds errors, use-after-free errors, and more. To resolve this issue, we need to store/delete the vidi device pointer in exynos_drm_private->vidi_dev during bind/unbind, and then read this exynos_drm_private->vidi_dev within ioctl() to obtain the correct struct vidi_context pointer. Cc: <stable@vger.kernel.org> Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2026-01-30accel/amdxdna: Fix memory leak in amdxdna_ubuf_mapZishun Yi1-2/+8
The amdxdna_ubuf_map() function allocates memory for sg and internal sg table structures, but it fails to free them if subsequent operations (sg_alloc_table_from_pages or dma_map_sgtable) fail. Fixes: bd72d4acda10 ("accel/amdxdna: Support user space allocated buffer") Signed-off-by: Zishun Yi <zishun.yi.dev@gmail.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Min Ma <mamin506@gmail.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260129171022.68578-1-zishun.yi.dev@gmail.com
2026-01-30accel/amdxdna: Stop job scheduling across aie2_release_resource()Lizhi Hou1-0/+6
Running jobs on a hardware context while it is in the process of releasing resources can lead to use-after-free and crashes. Fix this by stopping job scheduling before calling aie2_release_resource() and restarting it after the release completes. Additionally, aie2_sched_job_run() now checks whether the hardware context is still active. Fixes: 4fd6ca90fc7f ("accel/amdxdna: Refactor hardware context destroy routine") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260130003255.2083255-1-lizhi.hou@amd.com
2026-01-30accel/amdxdna: Hold mm structure across iommu_sva_unbind_device()Lizhi Hou2-0/+4
Some tests trigger a crash in iommu_sva_unbind_device() due to accessing iommu_mm after the associated mm structure has been freed. Fix this by taking an explicit reference to the mm structure after successfully binding the device, and releasing it only after the device is unbound. This ensures the mm remains valid for the entire SVA bind/unbind lifetime. Fixes: be462c97b7df ("accel/amdxdna: Add hardware context") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260128002356.1858122-1-lizhi.hou@amd.com
2026-01-30Merge tag 'drm-xe-next-fixes-2026-01-29' of ↵Dave Airlie3-3/+43
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next - Reduce LRC timestamp stuck message on VFs to notice (Brost) - Disable GuC Power DCC strategy on PTL (Vinay) - Unregister drm device on probe error (Lin) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/aXuyrtsnlAOmj_OB@intel.com
2026-01-30Merge tag 'drm-misc-next-fixes-2026-01-29' of ↵Dave Airlie3-5/+14
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next Two fixes for NULL pointer dereference in imx8 following the bridge refcounting conversions, and one for the bridge connector following the HDMI audio reworks. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patch.msgid.link/20260129-efficient-jerboa-of-ecstasy-822832@houat
2026-01-30Merge tag 'drm-intel-next-fixes-2026-01-29' of ↵Dave Airlie1-8/+21
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next - Prevent u64 underflow in intel_fbc_stolen_end Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patch.msgid.link/aXsWGWjacEJ03rTs@jlahtine-mobl
2026-01-29drm/amdgpu/gfx12: adjust KGQ reset sequenceAlex Deucher1-10/+13
Kernel gfx queues do not need to be reinitialized or remapped after a reset. Align with gfx11. v2: preserve init and remap for MMIO case. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amdgpu/gfx11: adjust KGQ reset sequenceAlex Deucher1-10/+13
Kernel gfx queues do not need to be reinitialized or remapped after a reset. This fixes queue reset failures on APUs. v2: preserve init and remap for MMIO case. Fixes: b3e9bfd86658 ("drm/amdgpu/gfx11: add ring reset callbacks") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4789 Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amdgpu/gfx12: fix wptr reset in KGQ initAlex Deucher1-1/+1
wptr is a 64 bit value and we need to update the full value, not just 32 bits. Align with what we already do for KCQs. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amdgpu/gfx11: fix wptr reset in KGQ initAlex Deucher1-1/+1
wptr is a 64 bit value and we need to update the full value, not just 32 bits. Align with what we already do for KCQs. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amdgpu/gfx10: fix wptr reset in KGQ initAlex Deucher1-1/+1
wptr is a 64 bit value and we need to update the full value, not just 32 bits. Align with what we already do for KCQs. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amdkfd: Use AMDGPU_MQD_SIZE_ALIGN in gfx11+ kfd mqd managerLang Yu4-47/+17
MES is enabled by default from gfx11+, use AMDGPU_MQD_SIZE_ALIGN unconditionally for gfx11+. Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: David Belanger <david.belanger@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amdkfd: Adjust parameter of allocate_mqdLang Yu11-15/+24
Make allocate_mqd consistent with other callbacks. Prepare for next patch to use mqd_manager->mqd_size. Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: David Belanger <david.belanger@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amdgpu: Use AMDGPU_MQD_SIZE_ALIGN in KGDLang Yu3-10/+13
Use AMDGPU_MQD_SIZE_ALIGN for both kernel and user queue. Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: David Belanger <david.belanger@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amd/pm: Initialize allowed feature listLijo Lazar11-226/+158
Instead of returning feature bit mask of allowed features, initialize the allowed features in the callback implementation itself. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amd/pm: Remove unused logic in SMUv14.0.2Lijo Lazar1-39/+0
Remove commented and redundant logic in get_allowed_feature_mask implementation. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amd/pm: Add smu feature interface functionsLijo Lazar7-27/+120
Instead of using bitmap operations, add wrapper interface functions to operate on smu features. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amd/pm: Add smu feature bits data structLijo Lazar1-0/+81
Add a bitmap struct to represent smu feature bits and functions to set/clear features. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amdgpu: Add a helper macro to align mqd sizeLang Yu1-0/+8
MES FW uses address(mqd_addr + sizeof(struct mqd) + 3*sizeof(uint32_t)) as fence address and writes a 32 bit fence value to this address. Driver needs to allocate some extra memory(at least 4 DWs) in addition to sizeof(struct mqd) as mqd memory(limited to gfx/compute/sdma queue). For gfx11/12, sizeof(struct mqd) < PAGE_SIZE, KGD allocates mqd memory with PAGE_SIZE aligned works. For gfx12.1, sizeof(struct mqd) == PAGE_SIZE, it doesn't work. KFD mqd manager hardcodes mqd size to PAGE_SIZE/MQD_SIZE across different IP versions to solve this issue. To avoid hardcoding in differnet places and across different IP versions. Let's use AMDGPU_MQD_SIZE_ALIGN instead. It is used in two places. 1. mqd memory alloction 2. mqd stride handling for multi xcc config v2: Use AMDGPU_GPU_PAGE_ALIGN. (Mukul) Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: David Belanger <david.belanger@amd.com> (v1) Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amdgpu: validate user queue size constraintsJesse.Zhang1-0/+11
Add validation to ensure user queue sizes meet hardware requirements: - Size must be a power of two for efficient ring buffer wrapping - Size must be at least AMDGPU_GPU_PAGE_SIZE to prevent undersized allocations This prevents invalid configurations that could lead to GPU faults or unexpected behavior. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amdgpu: Fix cond_exec handling in amdgpu_ib_schedule()Alex Deucher1-2/+3
The EXEC_COUNT field must be > 0. In the gfx shadow handling we always emit a cond_exec packet after the gfx_shadow packet, but the EXEC_COUNT never gets patched. This leads to a hang when we try and reset queues on gfx11 APUs. Fixes: c68cbbfd54c6 ("drm/amdgpu: cleanup conditional execution") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4789 Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29drm/amdgpu/soc21: fix xclk for APUsAlex Deucher1-1/+7
The reference clock is supposed to be 100Mhz, but it appears to actually be slightly lower (99.81Mhz). Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14451 Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>