path: root/drivers/gpu
Age | Commit message | Author | Files | Lines
5 days | Merge tag 'drm-fixes-2026-06-27' of https://gitlab.freedesktop.org/drm/kernel | Linus Torvalds | 10 | -21/+38
Pull drm fixes from Dave Airlie:
 "These are just the fixes from our fixes branch, all pretty small and scattered.

  sysfb:
   - drm/sysfb truncation and alignment fixes

  edid:
   - fix edid OOB read in tile parsing
   - increase displayid topology id to correct size

  nouveau:
   - fix error handling paths in nouveau

  amdxdna:
   - get_bo_info fix

  ivpu:
   - fix leak when error handling in ivpu"

* tag 'drm-fixes-2026-06-27' of https://gitlab.freedesktop.org/drm/kernel:
  drm/sysfb: Avoid truncating maximum stride
  drm/sysfb: Return errno code from drm_sysfb_get_visible_size()
  drm/sysfb: Avoid possible truncation with calculating visible size
  drm/sysfb: Do not page-align visible size of the framebuffer
  drm/edid: fix OOB read in drm_parse_tiled_block()
  drm/nouveau: fix reversed error cleanup order in ucopy functions
  drm/nouveau/acr: fix missing nvkm_done() in error path of nvkm_acr_oneinit()
  accel/amdxdna: Use caller client for debug BO sync
  drm/displayid: fix Tiled Display Topology ID size
  accel/ivpu: fix HWS command queue leak on registration failure
5 days | Merge tag 'drm-next-2026-06-27' of https://gitlab.freedesktop.org/drm/kernel | Linus Torvalds | 44 | -457/+766
Pull drm merge window fixes from Dave Airlie: "This is the merge window fixes from our next tree, i915/xe and amdgpu make up all of it. I've got a separate fixes pull from our fixes branch arriving after this. i915: - Fix corrupted display output on GLK, #16209 - Add missing Spectre mitigation for parallel submit IOCTL - MTL+ fix for DP resume - clear CRTC blobs after dropping refs - fix sharpness filter on DP MST xe: - Set TTM beneficial order to 9 in Xe - Several error path cleanups - Fix TDR for unstarted jobs on kernel queues - Several TLB invalidation fixes related to suspending LR queues - Some small RAS fixes - Multi-queue suspend fix for LR queues - Revert inclusion of NVL_S firmware amdgpu: - devcoredump fixes - SMU15 fix - Various irq put/get imbalance cleanup fixes - 8K panel fix - DCN3.5 fix - lockdep fix - Cleaner shader sysfs IB overflow fix - Async flip fixes - GET_MAPPING_INFO fix - CP_GFX_SHADOW fix - Ctx pstate handling fix - GTT bo move handling fixes - Old UVD BO placement fixes - GC9 mode2 reset fix - IH6.1 version fix - Soft IH ring fix amdkfd: - Fix doorbell/mmio double unpin on free - CRIU fixes - SMI event fixes - Sysfs teardown fix - Various boundary checking fixes - Various error checking fixes - SVM fix" * tag 'drm-next-2026-06-27' of https://gitlab.freedesktop.org/drm/kernel: (52 commits) drm/i915/cdclk: Fix up CDCLK_FREQ_DECIMAL without a full PLL re-enable drm/i915/gem: Add missing nospec on parallel submit slot drm/amdgpu: Use system unbound workqueue for soft IH ring amdgpu/ih6.1: Fix minor version drm/amdkfd: Use exclusive bounds for SVM split alignment checks drm/amdgpu/gfx9: Fix Ring and IB test fail after mode2 drm/amdgpu/uvd: Fix forcing MSG, FB BOs into VCPU segment when it isn't at 0 (v2) drm/amdgpu/uvd: Place VCPU BO only in VRAM for UVD 4.x and older drm/amdgpu: Fix amdgpu_bo_move() when old_mem and new_mem are both GTT drm/amdgpu: Respect placement requirements in amdgpu_gtt_mgr functions drm/amdgpu: Fix context pstate 
override handling drm/amdkfd: Use memdup_array_user to copy data from/to user space at kfd ioctls drm/amdkfd: check find_first_zero_bit before __set_bit on kfd->doorbell_bitmap drm/amdkfd: Let driver decide buffer size at AMDKFD_IOC_GET_DMABUF_INFO ioctl drm/amdgpu: fix recursive ww_mutex acquire in amdgpu_devcoredump_format drm/amdgpu: convert amdgpu_vm_lock_by_pasid() to drm_exec drm/amdgpu: Don't use UTS_RELEASE directly drm/amdkfd: Fix NULL deref during sysfs teardown drm/amdgpu: validate CP_GFX_SHADOW chunk size in CS pass1 drm/amdgpu: check amdgpu_vm_bo_find() result in GET_MAPPING_INFO ...
6 days | Merge tag 'drm-misc-fixes-2026-06-25' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes | Dave Airlie | 10 | -21/+38
drm-misc-fixes for v7.2:
- drm/sysfb truncation and alignment fixes.
- fix edid OOB read.
- fix error handling paths in nouveau
- amdxdna get_bo_info fix.
- increase displayid topology id to correct size.
- fix leak when error handling in ivpu.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/2d17f718-43f5-4772-9c04-a975c9ad4bc3@linux.intel.com
6 days | Merge tag 'drm-intel-next-fixes-2026-06-25-1' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next | Dave Airlie | 2 | -7/+35
- Fix corrupted display output on GLK, #16209 (Ville)
- Add missing Spectre mitigation for parallel submit IOCTL (Joonas)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patch.msgid.link/ajzIhInnHnGCwMlu@jlahtine-mobl
7 days | drm/i915/cdclk: Fix up CDCLK_FREQ_DECIMAL without a full PLL re-enable | Ville Syrjälä | 1 | -7/+34
The GOP (and even Bspec on some platforms) is a bit inconsistent on what the CDCLK_FREQ_DECIMAL divider should be. Currently any mismatch there causes a full CDCLK PLL disable+re-enable, which we really don't want to do if any displays are currently active. Let's instead just reprogram CDCLK_FREQ_DECIMAL when that is the only thing amiss. For any other (more serious) mismatch we still punt to the full PLL reprogramming. We also need to tweak the bxt_cdclk_cd2x_pipe() stuff a bit to consistently select pipe==NONE since we have no idea which pipes are enabled at this point. Since we're not actually changing the CDCLK frequency here we don't need to sync the update to any pipe. Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/work_items/16209 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260612173653.7830-2-ville.syrjala@linux.intel.com Reviewed-by: Michał Grzelak <michal.grzelak@intel.com> (cherry picked from commit 3f9de66f8acbf8ff45a91b4920605ed10c6b7c06) Fixes: ba91b9eecb47 ("drm/i915/cdclk: Decouple cdclk from state->modeset") Fixes: d66a21947e21 ("drm/i915/bxt: Sanitize CDCLK to fix breakage during S4 resume") Fixes: c73666f394fc ("drm/i915/skl: If needed sanitize bios programmed cdclk") Cc: <stable@vger.kernel.org> # v4.5+ Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
8 days | drm/i915/gem: Add missing nospec on parallel submit slot | Joonas Lahtinen | 1 | -0/+1
Add missing Spectre mitigation for userspace controlled parallel submission slot. Discovered using AI-assisted static analysis confirmed by Intel Product Security. Reported-by: Martin Hodo <martin.hodo@intel.com> Fixes: e5e32171a2cf ("drm/i915/guc: Connect UAPI to GuC multi-lrc interface") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: <stable@vger.kernel.org> # v5.16+ Link: https://patch.msgid.link/20260622132539.165558-1-joonas.lahtinen@linux.intel.com (cherry picked from commit 15b9353deff3cf72331c387780de3cf9c316b643) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
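For readers unfamiliar with this class of fix: the kernel provides array_index_nospec() in <linux/nospec.h> to clamp a user-controlled index so that it stays in bounds even under speculative execution. The following is a minimal user-space sketch of the branchless masking idea only. The helper names index_mask() and clamp_index() are hypothetical, and the code assumes arithmetic right shift of negative values and size <= LONG_MAX, which holds on GCC and Clang but is implementation-defined in C. It is not the i915 change itself.

```c
#include <limits.h>

/*
 * Hypothetical helper: returns all-ones when index < size and zero
 * otherwise, without a conditional branch on the comparison.
 * Assumes size <= LONG_MAX and arithmetic right shift of negatives.
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (sizeof(long) * CHAR_BIT - 1));
}

/* Hypothetical helper: in-bounds indices pass through, others become 0. */
static unsigned long clamp_index(unsigned long index, unsigned long size)
{
	return index & index_mask(index, size);
}
```

With an 8-entry table, clamp_index(3, 8) leaves the index unchanged, while any index of 8 or more is forced to 0 before it can be used to address the table.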
9 days | drm/sysfb: Avoid truncating maximum stride | Thomas Zimmermann | 1 | -1/+7
Passing a maximum as 64-bit type to drm_sysfb_get_validated_int0() can truncate the value to 32 bits. Use drm_sysfb_get_validated_size0(), which uses 64-bit arithmetics. Then test the returned stride against the limits of int to avoid truncations in the returned value. A valid stride is in the range of [1, INT_MAX] inclusive. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reported-by: Sashiko <sashiko-bot@kernel.org> Closes: https://lore.kernel.org/dri-devel/20260617114016.5A5991F000E9@smtp.kernel.org/ Fixes: 32ae90c66fb6 ("drm/sysfb: Add efidrm for EFI displays") Fixes: a84eb6abe2b6 ("drm/sysfb: Add vesadrm for VESA displays") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.16+ Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patch.msgid.link/20260618084327.46567-5-tzimmermann@suse.de
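The range check described above can be sketched in plain C. This is an illustration under assumptions, not the drm/sysfb code: checked_stride() is a hypothetical name, and the only behaviour taken from the commit message is that a valid stride lies in [1, INT_MAX] and is validated in 64-bit arithmetic before being narrowed.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: validate a stride held in a 64-bit variable and
 * narrow it to int only when it lies in [1, INT_MAX].
 */
static int checked_stride(uint64_t stride64)
{
	if (stride64 < 1 || stride64 > (uint64_t)INT_MAX)
		return -EINVAL;
	return (int)stride64;
}
```

Comparing before the cast is the point of the sketch: a value such as 0x100000000 would silently become 0 if it were narrowed first and checked afterwards.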
9 days | drm/sysfb: Return errno code from drm_sysfb_get_visible_size() | Thomas Zimmermann | 4 | -8/+9
Change the return type of drm_sysfb_get_visible_size() to s64 so that it returns a possible errno code from _get_validated_size0(). Fix callers to handle the errno code. The currently returned unsigned type converts an errno code to a very large size value, which drivers interpret as visible size of the system framebuffer. Later efforts to reserve the framebuffer resource fail. The bug has been present since efidrm and vesadrm got merged. It was then part of each driver. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 32ae90c66fb6 ("drm/sysfb: Add efidrm for EFI displays") Fixes: a84eb6abe2b6 ("drm/sysfb: Add vesadrm for VESA displays") Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.16+ Link: https://patch.msgid.link/20260618084327.46567-4-tzimmermann@suse.de
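The failure mode is easy to reproduce outside the kernel. In this sketch the two functions are hypothetical stand-ins for the old unsigned and new signed return types. They only show how a negative errno value looks to a caller in each case.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the old behaviour: unsigned return type. */
static uint64_t visible_size_unsigned(int64_t validated)
{
	return (uint64_t)validated;	/* a negative errno becomes a huge size */
}

/* Hypothetical stand-in for the new behaviour: signed return type. */
static int64_t visible_size_signed(int64_t validated)
{
	return validated;		/* the caller can test for < 0 */
}
```

Passing -EINVAL through the unsigned variant yields a value above INT64_MAX, which a caller cannot tell apart from a very large framebuffer. The signed variant keeps the error visible.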
9 days | drm/sysfb: Avoid possible truncation with calculating visible size | Thomas Zimmermann | 1 | -1/+2
Calculating the visible size of the system framebuffer can result in truncation of the result. The calculation uses 32-bit arithmetics, which can overflow if the values for height and stride are large. Fix the issue by multiplying with mul_u32_u32(). Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 32ae90c66fb6 ("drm/sysfb: Add efidrm for EFI displays") Fixes: a84eb6abe2b6 ("drm/sysfb: Add vesadrm for VESA displays") Reported-by: Sashiko <sashiko-bot@kernel.org> Closes: https://lore.kernel.org/dri-devel/20260617114027.1F2A71F000E9@smtp.kernel.org/ Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.16+ Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patch.msgid.link/20260618084327.46567-3-tzimmermann@suse.de
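A small user-space sketch of the overflow, assuming a platform where int is 32 bits wide. The function names are hypothetical. The second function widens one operand before multiplying, which is the effect the commit obtains with mul_u32_u32().

```c
#include <stdint.h>

/* 32-bit product: wraps modulo 2^32 before the result is widened. */
static uint64_t size_32bit(uint32_t height, uint32_t stride)
{
	return height * stride;
}

/* Widen first, so the multiplication itself is done in 64 bits. */
static uint64_t size_64bit(uint32_t height, uint32_t stride)
{
	return (uint64_t)height * stride;
}
```

For height = stride = 65536 the 32-bit product wraps to 0, whereas the widened product is 2^32. For ordinary display sizes both functions agree, which is why the bug only shows up with large values.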
9 days | drm/sysfb: Do not page-align visible size of the framebuffer | Thomas Zimmermann | 1 | -1/+1
Only return the actually visible size of the system framebuffer in drm_sysfb_get_visible_size_si(). Drivers use this size value for reserving access to framebuffer memory. Increasing the value can make later attempts to do so fail. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 32ae90c66fb6 ("drm/sysfb: Add efidrm for EFI displays") Fixes: a84eb6abe2b6 ("drm/sysfb: Add vesadrm for VESA displays") Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.16+ Link: https://patch.msgid.link/20260618084327.46567-2-tzimmermann@suse.de
9 days | drm/edid: fix OOB read in drm_parse_tiled_block() | Xiang Mei | 1 | -0/+8
drm_parse_tiled_block() casts the DisplayID block to a struct displayid_tiled_block and reads the full fixed layout up to tile->topology_id[7] without checking block->num_bytes. The DisplayID iterator only validates the declared payload length, so a crafted EDID can advertise a tiled-display block (tag DATA_BLOCK_TILED_DISPLAY, or DATA_BLOCK_2_TILED_DISPLAY_TOPOLOGY for v2.0) with a small num_bytes at the end of a DisplayID extension. The read then runs past the end of the exact-sized kmemdup()'d EDID allocation, a heap out-of-bounds read.

Reject blocks shorter than the spec's 22-byte tiled payload before reading the fixed struct, as drm_parse_vesa_mso_data() already does.

BUG: KASAN: slab-out-of-bounds in drm_edid_connector_update
Read of size 2 at addr ffff888010077700 by task exploit/147
 dump_stack_lvl (lib/dump_stack.c:94 ...)
 print_report (mm/kasan/report.c:378 ...)
 kasan_report (mm/kasan/report.c:595)
 drm_edid_connector_update (drivers/gpu/drm/drm_edid.c:7581)
 bochs_connector_helper_get_modes (drivers/gpu/drm/tiny/bochs.c:574)
 drm_helper_probe_single_connector_modes (drivers/gpu/drm/drm_probe_helper.c:426)
 status_store (drivers/gpu/drm/drm_sysfs.c:219)
 ...
 vfs_write (fs/read_write.c:595 fs/read_write.c:688)
 ksys_write (fs/read_write.c:740)

Fixes: 40d9b043a89e ("drm/connector: store tile information from displayid (v3)")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patch.msgid.link/20260615184737.899892-1-xmei5@asu.edu
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
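The general pattern behind this fix, checking the declared length before reading a fixed layout, can be sketched as follows. Everything here is an assumption made for illustration: struct block_hdr is a simplified three-byte header, tiled_payload() is a hypothetical name, and only the 22-byte minimum is taken from the commit message. It is not the drm_edid.c code.

```c
#include <stddef.h>
#include <stdint.h>

#define TILED_PAYLOAD_LEN 22	/* payload size named in the commit message */

/* Hypothetical, simplified block header: tag, revision, payload length. */
struct block_hdr {
	uint8_t tag;
	uint8_t rev;
	uint8_t num_bytes;
};

/*
 * Hypothetical helper: return a pointer to the payload, or NULL when
 * the block declares too few bytes or does not fit inside the buffer.
 */
static const uint8_t *tiled_payload(const uint8_t *buf, size_t len, size_t off)
{
	const struct block_hdr *hdr;

	if (off > len || len - off < sizeof(*hdr))
		return NULL;
	hdr = (const struct block_hdr *)(buf + off);
	if (hdr->num_bytes < TILED_PAYLOAD_LEN)
		return NULL;	/* too short for the fixed layout */
	if (len - off - sizeof(*hdr) < hdr->num_bytes)
		return NULL;	/* declared payload overruns the buffer */
	return buf + off + sizeof(*hdr);
}
```

The second check is the one that corresponds to the fix: a block that declares fewer than 22 payload bytes is rejected before any field of the fixed layout is read.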
10 days | Merge tag 'amd-drm-fixes-7.2-2026-06-19' of https://gitlab.freedesktop.org/agd5f/linux into drm-next | Dave Airlie | 34 | -386/+618
amd-drm-fixes-7.2-2026-06-19:

amdgpu:
- devcoredump fixes
- SMU15 fix
- Various irq put/get imbalance cleanup fixes
- 8K panel fix
- DCN3.5 fix
- lockdep fix
- Cleaner shader sysfs IB overflow fix
- Async flip fixes
- GET_MAPPING_INFO fix
- CP_GFX_SHADOW fix
- Ctx pstate handling fix
- GTT bo move handling fixes
- Old UVD BO placement fixes
- GC9 mode2 reset fix
- IH6.1 version fix
- Soft IH ring fix

amdkfd:
- Fix doorbell/mmio double unpin on free
- CRIU fixes
- SMI event fixes
- Sysfs teardown fix
- Various boundary checking fixes
- Various error checking fixes
- SVM fix

radeon:
- r100_copy_blit fix for large BOs

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260619152610.776982-1-alexander.deucher@amd.com
10 days | Merge tag 'drm-xe-next-fixes-2026-06-17' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next | Dave Airlie | 13 | -202/+175
- Set TTM beneficial order to 9 in Xe
- Several error path cleanups
- Fix TDR for unstarted jobs on kernel queues
- Several TLB invalidation fixes related to suspending LR queues
- Some small RAS fixes
- Multi-queue suspend fix for LR queues
- Revert inclusion of NVL_S firmware

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/ajLy2brwvOZEFNNN@gsse-cloud1.jf.intel.com
10 days | Merge tag 'mm-nonmm-stable-2026-06-21-10-22' of ↵ | Linus Torvalds | 1 | -1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "taskstats: fix TGID dead-thread stat retention" (Yiyang Chen) Fix a taskstats TGID aggregation bug where fields added in the TGID query path were not preserved after thread exit, and adds a kselftest covering the regression. - "lib/tests: string_helpers: Slight improvements" (Andy Shevchenko) Improve lib/tests/string_helpers_kunit.c a little - "lib/base64: decode fixes" (Josh Law) Address minor issues in lib/base64.c - "selftests/filelock: Make output more kselftestish" (Mark Brown) Make the output from the ofdlocks test a bit easier for tooling to work with. Also ignore the generated file - "uaccess: unify inline vs outline copy_{from,to}_user() selection" (Yury Norov) Simplify the usercopy code by removing the selectability of inlining copy_{from,to}_user(). - "ocfs2: validate inline xattr header consumers" (ZhengYuan Huang) Fix a number of possible issues in the ocfs2 xattr code - "lib and lib/cmdline enhancements" (Dmitry Antipov) Provide additional robustness checking in the cmdline handling code and its in-kernel testing and selftests - "cleanup the RAID6 P/Q library" (Christoph Hellwig) Clean up the RAID6 P/Q library to match the recent updates to the RAID 5 XOR library and other CRC/crypto libraries - "ocfs2: harden inode validators against forged metadata" (Michael Bommarito) Add three structural checks to OCFS2 dinode validation so malformed on-disk fields are rejected before ocfs2_populate_inode() copies them into the in-core inode - "lib/raid: replace __get_free_pages() call with kmalloc()" (Mike Rapoport) Clean up the lib/raid code by using kmalloc() in more places * tag 'mm-nonmm-stable-2026-06-21-10-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (108 commits) ocfs2: fix circular locking dependency in ocfs2_dio_end_io_write ocfs2: fix NULL h_transaction deref in ocfs2_assure_trans_credits lib: interval_tree_test: validate benchmark 
parameters ocfs2: avoid moving extents to occupied clusters treewide: fix transposed "sign" typos and update spelling.txt ocfs2: fix UBSAN array-index-out-of-bounds in ocfs2_sum_rightmost_rec fat: reject BPB volumes whose data area starts beyond total sectors selftests/uevent: increase __UEVENT_BUFFER_SIZE to avoid ENOBUFS on busy systems lib/test_firmware: allocate the configured into_buf size fs: efs: remove unneeded debug prints checkpatch: cuppress warnings when Reported-by: is followed by Link: MAINTAINERS: add Alexander as a kcov reviewer mailmap: update Alexander Sverdlin's Email addresses fs: fat: inode: replace sprintf() with scnprintf() ocfs2: fix out-of-bounds write in ocfs2_remove_refcount_extent ocfs2: fix race between ocfs2_control_install_private() and ocfs2_control_release() ocfs2/dlm: require a ref for locking_state debugfs open ocfs2: reject FITRIM ranges shorter than a cluster ocfs2: validate fast symlink target during inode read ocfs2: add journal NULL check in ocfs2_checkpoint_inode() ...
10 days | drm/nouveau: fix reversed error cleanup order in ucopy functions | Junrui Luo | 2 | -4/+4
nouveau_uvmm_vm_bind_ucopy() and nouveau_exec_ucopy() place their error cleanup labels in allocation order rather than reverse allocation order. On a u_memcpya() failure for in_sync.s, the goto to err_free_ops (or err_free_pushs) frees the first allocation and then falls through to err_free_ins, which calls u_free() on args->in_sync.s. Since args->in_sync.s still holds the ERR_PTR returned by the failed u_memcpya(), and ERR_PTR values are not caught by ZERO_OR_NULL_PTR(), kvfree() proceeds to dereference it, which can result in a kernel oops. A failure for out_sync.s instead jumps to err_free_ins and skips freeing the first allocation, leading to a memory leak. Fix by swapping the cleanup label order so resources are freed in the correct reverse allocation sequence. Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Link: https://patch.msgid.link/SYBPR01MB7881484D91A6F80271415F71AF1A2@SYBPR01MB7881.ausprd01.prod.outlook.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
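The reverse-order rule for goto-based cleanup can be shown with a self-contained sketch. All names are hypothetical (get(), put(), setup_three(), the live counter), and the code only illustrates label ordering. It is not the nouveau code.

```c
#include <errno.h>
#include <stdlib.h>

static int live;	/* outstanding allocations, tracked for illustration */

/* Hypothetical allocator wrapper that can be told to fail. */
static void *get(int fail)
{
	void *p = fail ? NULL : malloc(16);

	if (p)
		live++;
	return p;
}

static void put(void *p)
{
	free(p);
	live--;
}

/* Hypothetical setup with three resources; fail_at picks the failing step. */
static int setup_three(int fail_at)
{
	void *a, *b, *c;

	a = get(fail_at == 1);
	if (!a)
		return -ENOMEM;
	b = get(fail_at == 2);
	if (!b)
		goto err_free_a;
	c = get(fail_at == 3);
	if (!c)
		goto err_free_b;

	/* Success: release everything so the demo leaves nothing behind. */
	put(c);
	put(b);
	put(a);
	return 0;

	/*
	 * Labels are listed in reverse allocation order, so each one falls
	 * through only to resources that were acquired before it.
	 */
err_free_b:
	put(b);
err_free_a:
	put(a);
	return -ENOMEM;
}
```

If the two labels were swapped, a failure at step 2 would call put() on a pointer that was never successfully allocated, and a failure at step 3 would skip one of the frees. Those are the two symptoms the commit message describes.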
10 days | drm/nouveau/acr: fix missing nvkm_done() in error path of nvkm_acr_oneinit() | Wentao Liang | 1 | -0/+1
In nvkm_acr_oneinit(), nvkm_kmap(acr->wpr) is invoked unconditionally at line 309 to obtain a mapping reference. Additionally, when both acr->wpr_fw and acr->wpr_comp are present, a second nvkm_kmap() is called inside the conditional block. Both mappings are expected to be released by nvkm_done(acr->wpr) at line 320 before the function returns successfully. However, when a mismatch is detected during the loop within the conditional block, the function returns -EINVAL at line 318 without calling nvkm_done(). This results in a leak of the kmap reference(s) acquired earlier. Fix the issue by invoking nvkm_done(acr->wpr) prior to the early return to ensure proper release of the mapping references. Fixes: 22dcda45a3d1 ("drm/nouveau/acr: implement new subdev to replace "secure boot"") Cc: stable@vger.kernel.org Signed-off-by: Wentao Liang <vulab@iscas.ac.cn> Link: https://patch.msgid.link/20260606155606.77593-1-vulab@iscas.ac.cn Signed-off-by: Danilo Krummrich <dakr@kernel.org>
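The shape of this bug, an early return that skips a release call, can be sketched with a simple counter. The names map_get(), map_done(), map_refs, and check_tables() are hypothetical stand-ins. The sketch does not model nvkm_kmap() or nvkm_done() beyond the fact that every acquire needs a matching release on every exit path.

```c
#include <errno.h>

static int map_refs;	/* hypothetical mapping reference count */

static void map_get(void)
{
	map_refs++;
}

static void map_done(void)
{
	map_refs--;
}

/* Hypothetical check loop: compare two tables while a mapping is held. */
static int check_tables(const int *a, const int *b, int n)
{
	int i;

	map_get();
	for (i = 0; i < n; i++) {
		if (a[i] != b[i]) {
			map_done();	/* release before the early return */
			return -EINVAL;
		}
	}
	map_done();
	return 0;
}
```

Removing the map_done() call inside the loop reproduces the leak: map_refs stays at 1 after a mismatch.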
12 days | Merge tag 'mm-stable-2026-06-18-09-26' of ↵ | Linus Torvalds | 1 | -2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "selftests/mm: clean up build output and verbosity" (Li Wang) Remove some noise from the MM selftests build - "mm: Free contiguous order-0 pages efficiently" (Ryan Roberts) Speed up the freeing of a batch of 0-order pages by first scanning them for coalescing opportunities. This is applicable to vfree() and to the releasing of frozen pages - "mm/damon: introduce DAMOS failed region quota charge ratio" (SeongJae Park) Address a DAMOS usability issue: The DAMOS quota often exhausts prematurely because it charges for all memory attempted, causing slow and inconsistent performance when actions fail on unreclaimable memory. To fix this, a new feature lets users set a smaller, flexible quota charge ratio (via a numerator and denominator) for failed regions. Since failed actions cause less overhead, reducing their quota cost ensures more predictable and efficient DAMOS processing - "selftests/cgroup: improve zswap tests robustness and support large page sizes" (Li Wang) Fix various spurious failures and improves the overall robustness of the cgroup zswap selftests - "fix MAP_DROPPABLE not supported errno" (Anthony Yznaga) Fix an issue in the mlock selftests on arm32 - "mm: huge_memory: clean up defrag sysfs with shared" (Breno Leitao) Some maintenance work in the huge_memory code - "treewide: fixup gfp_t printks" (Brendan Jackman) Use the special vprintf() gfp_t conversion in various places - "mm: Fix vmemmap optimization accounting and initialization" (Muchun Song) Fix several bugs in the vmemmap optimization, mainly around incorrect page accounting and memmap initialization in the DAX and memory hotplug paths. 
It also fixes pageblock migratetype initialization and struct page initialization for ZONE_DEVICE compound pages - "mm/damon: repost non-hotfix reviewed patches in damon/next tree" A sprinkle of unrelated minor bugfixes for DAMON - "mm: remove page_mapped()" (David Hildenbrand) Remove this function from the tree, replacing it with folio_mapped() - "mm/damon: let DAMON be paused and resumed" (SeongJae Park) Allow DAMON to be paused and resumed without losing its current state - "kasan: hw_tags: Disable tagging for stack and page-tables" (Muhammad Usama Anjum) Simplify and speed up kasan by removing its ineffective tagging of stacks and page tables - "mm/damon/reclaim,lru_sort: monitor all system rams by default" (SeongJae Park) Simplify deployment on diverse hardware like NUMA systems by updating DAMON_RECLAIM and DAMON_LRU_SORT to automatically monitor the physical address range covering all System RAM areas by default, replacing the overly restrictive behavior that only targeted the single largest memory block to save on negligible overhead - "mm/damon/sysfs: document filters/ directory as deprecated" (SeongJae Park) Update some DAMON docs - "mm: use spinlock guards for zone lock" (Dmitry Ilvokhin) Switch zone->lock handling over to using the guard() mechanisms - "mm/filemap: tighten mmap_miss hit accounting" (fujunjie) Fix a flaw where the mmap_miss counter over-credited page cache hits during fault-arounds and page-fault retries. 
This results in significant reduction of redundant synchronous mmap readahead I/O, drastically cutting down execution time and gigabytes read for sparse random or strided memory access workloads - "selftests/cgroup: Fix false positive failures in test_percpu_basic" (Li Wang) Fix a couple of false-positives in the cgroup kmem selftests - "mm/damon/reclaim: support monitoring intervals auto-tuning" (SeongJae Park) Add a new parameter to DAMON permitting DAMON_RECLAIM to automatically tune DAMON's sampling and aggregation intervals - "mm/damon/stat: add kdamond_pid parameter" (SeongJae Park) Change DAMON_STAT to provide the pid of its kdamond - "mm/kmemleak: dedupe verbose scan output" (Breno Leitao) Remove large amounts of duplicated backtraces from the verbose-mode kmemleak output - "mm: remove CONFIG_HAVE_BOOTMEM_INFO_NODE (Part 1)" (David Hildenbrand) Reduce our use of CONFIG_HAVE_BOOTMEM_INFO_NODE, with a view to removing it entirely in a later series - "mm/damon: validate min_region_size to be power of 2" (Liew Rui Yan) Prevent users from passing a non-power-of-2 value of `addr_unit', as this later results in undesirable behavior - "mm: document read_pages and simplify usage" (Frederick Mayle) - "tools/mm/page-types: Fix misc bugs" (Ye Liu) Fix three issues in tools/mm/page-types.c - "mm: misc cleanups from __GFP_UNMAPPED series" (Brendan Jackman) Implement several cleanups in the page allocator and related code - "mm, swap: swap table phase IV: unify allocation" (Kairui Song) Unify the allocation and charging of anon and shmem swap in folios, provides better synchronization, consolidates the metadata management, hence dropping the static array and map, and improves performance - "mm/damon: introduce data attributes monitoring" (SeongJae Park( Extend DAMON to monitor general data attributes other than accesses - "mm/vmalloc: free unused pages on vrealloc() shrink" (Shivam Kalra) Implement the TODO in vrealloc() to unmap and free unused pages when shrinking 
across a page boundary - "mm/damon: documentation and comment fixes" (niecheng) - "remove mmap_action success, error hooks" (Lorenzo Stoakes) Eliminate custom hooks from mmap_action by removing the problematic success_hook which allowed drivers to improperly access uninitialized VMAs. It replaces the error_hook with a simple error-code field and updates the memory char driver accordingly - "mm/damon: minor improvements for code readability and tests" (SeongJae Park) - "mm/damon: fix macro arguments and clarify quota goals doc" (Maksym Shcherba) - "userfaultfd: merge fs/userfaultfd.c into mm/userfaultfd.c" (Mike Rapoport) - "mm/mglru: improve reclaim loop and dirty folio" (Kairui Song and others) Clean up and slightly improves MGLRU's reclaim loop and dirty writeback handling. Large performance improvements are measured - "use vma locks for proc/pid/{smaps|numa_maps} reads" (Suren Baghdasaryan) Use per-vma locks when reading /proc/pid/smaps and numa_maps similar to reduce contention on central mmap_lock - "refactors thpsize_shmem_enabled_store() and thpsize_shmem_enabled_show()" (Ran Xiaokai) Some cleanup work in the THP code - "selftests/memfd: fix compilation warnings" (Konstantin Khorenko) Fix a few build glitches in the memfd selftest code. - "memcg: shrink obj_stock_pcp and cache multiple objcgs" (Shakeel Butt) Resolve a 68% performance regression caused by NUMA-node cache thrashing around struct obj_stock_pcp by shrinking its existing fields and expanding it into a multi-slot array that caches up to five obj_cgroup pointers per CPU, allowing per-node variants of the same memcg to coexist within a single 64-byte cache line. 
- "zram: writeback fixes" (Sergey Senozhatsky) address a couple of unrelated zram writeback issues - "mm: switch THP shrinker to list_lru" (Johannes Weiner) Resolve NUMA-awareness issues and streamlines callsite interaction by refactoring and extending the list_lru API to completely replace the complex, open-coded deferred split queue for Transparent Huge Pages - "mm: improve large folio readahead for exec memory" (Usama Arif) Improve large-folio readahead on systems like 64K-page arm64 by preventing the mmap_miss check from permanently disabling target-oriented VM_EXEC readahead, and by generalizing the force_thp_readahead gate to support mappings with any usefully large maximum folio order under the cache cap. - "userfaultfd/pagemap: pre-existing fixes" (Kiryl Shutsemau) Fix a bunch of minor issues in the userfaultfd/pagemap, all of which were flagged by Sashiko review of proposed new material - "mm/sparse-vmemmap: Provide generic vmemmap_set_pmd() and vmemmap_check_pmd()" (Muchun Song) Provide generic versions of these two functions so the four arch-specific implementations can be removed. - "mm/swap, PM: hibernate: fix swapoff race in uswsusp by pinning swap device" (Youngjun Park) Address a uswsusp-vs-swapoff race and reduces the swap device reference taking/releasing frequency. 
- "mm/hmm: A fix and a selftest" (Dev Jain) * tag 'mm-stable-2026-06-18-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits) selftests/mm/hmm-tests: test pagemap reads of PMD device-private entries fs/proc/task_mmu: do not warn on seeing non-migration pmd entry lib/test_hmm: check alloc_page_vma() return value and handle OOM mm/compaction: cap compact_gap() at COMPACT_CLUSTER_MAX mm/swap: remove redundant swap device reference in alloc/free mm/swap, PM: hibernate: fix swapoff race in uswsusp by pinning swap device mm/filemap: use folio_next_index() for start vmalloc: fix NULL pointer dereference in is_vm_area_hugepages() sparc/mm: drop vmemmap_check_pmd helper and use generic code loongarch/mm: drop vmemmap_check_pmd helper and use generic code riscv/mm: drop vmemmap_pmd helpers and use generic code arm64/mm: drop vmemmap_pmd helpers and use generic code mm/sparse-vmemmap: provide generic vmemmap_set_pmd() and vmemmap_check_pmd() rust: page: mark Page::nid as inline userfaultfd: build __VMA_UFFD_FLAGS from config-gated masks userfaultfd: gate must_wait writability check on pte_present() mm/huge_memory: preserve pmd_swp_uffd_wp on device-private PMD downgrade fs/proc/task_mmu: fix hugetlb self-deadlock in pagemap_scan_pte_hole() fs/proc/task_mmu: use huge_page_size() in pagemap_scan_hugetlb_entry() fs/proc/task_mmu: fix make_uffd_wp_huge_pte() prot-update race ...
13 days | Merge tag 'media/v7.2-1' of ↵ | Linus Torvalds | 2 | -4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - v4l2: - core: fix subdev sensor ownership - subdev: Allow accessing routes with STREAMS client capability - ctrls: Add validation for HEVC active reference counts and background detection control - common: Add YUV24 format info and has_alpha helper - vb2: Change vb2_read() and vb2_write() return types to ssize_t - i2c: cvs: Add driver of Intel Computer Vision Sensing Controller(CVS) - atmel-isc: remove deprecated driver - cec: Add CEC Latency Indication Protocol (LIP) support - imon: Add iMON VFD HID OEM v1.2 key mappings - AVMatrix: new HWS capture driver - isp4: new AMD capture driver - qcom: - iris: Add hierarchical coding, B-frame, and Long-Term Reference support for encoder - camss: Add SM6350 platform support - venus: Add SM6115 platform support - chips-media: wave5: Add support for Packed YUV422, CBP profile, and background detection - csi2rx: Add multistream support and 32 dma chans - Several cleanups and fixes * tag 'media/v7.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (394 commits) media: v4l2-fwnode: Fix subdev owner overwritten in v4l2_async_register_subdev_sensor() media: qcom: iris: vdec: allow GEN2 decoding into 10bit format media: qcom: iris: vdec: update find_format to handle 8bit and 10bit formats media: qcom: iris: vdec: update size and stride calculations for 10bit formats media: qcom: iris: gen2: add support for 10bit decoding media: qcom: iris: add QC10C & P010 buffer size calculations media: qcom: iris: add helpers for 8bit and 10bit formats media: qcom: iris: Fix FPS calculation and VPP FW overhead media: qcom: camss: vfe-340: Support for PIX client media: qcom: camss: vfe-340: Proper client handling media: qcom: camss: csid-340: Enable PIX interface routing media: qcom: camss: csid-340: Add port-to-interface mapping media: qcom: camss: csid-340: Switch to generic CSID_CFG/CTRL registers media: 
iris: Initialize HFI ops after firmware load in core init media: iris: drop struct iris_fmt media: iris: Add platform data for X1P42100 media: iris: Add hardware power on/off ops for X1P42100 media: iris: optimize COMV buffer allocation for VPU3x and VPU4x media: iris: add FPS calculation and VPP FW overhead in frequency formula media: qcom: iris: Simplify COMV size calculation ...
14 days | treewide: fix transposed "sign" typos and update spelling.txt | Shardul Deshpande | 1 | -1/+1
Several comments transpose the letters in "assigned" and "unsigned", spelling them with "sing" instead of "sign". Correct all of them. Of these, the misspelling of "assigned" is not yet flagged by checkpatch, so also add it to scripts/spelling.txt. The remaining matches of `grep -ri singed` are RISINGEDGE register and enum names, not typos. Link: https://lore.kernel.org/20260612181633.734458-1-iamsharduld@gmail.com Signed-off-by: Shardul Deshpande <iamsharduld@gmail.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
14 days | drm/amdgpu: Use system unbound workqueue for soft IH ring | Timur Kristóf | 1 | -1/+1
Allow the kernel to dispatch the soft IH work on other CPUs. Otherwise it can happen that the soft IH ring fills up before it actually starts processing anything, which can easily happen with retry page faults, in which case the CP repeatedly spams the CPU with a lot of interrupts. This significantly improves retry page fault handling on GPUs that don't have the filter CAM and must rely on software based filtering. Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3cdff3c8b93c2834977224d9c2b201fc334dd184)
14 days | amdgpu/ih6.1: Fix minor version | Timur Kristóf | 1 | -1/+1
Report the correct version of IH v6.1 (previously it showed v6.0). Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 940d33ebbcdebaf095fade86e9c981ad8789aee2)
14 days | drm/amdkfd: Use exclusive bounds for SVM split alignment checks | Gerhard Schwanzer | 1 | -4/+4
SVM ranges use inclusive page indices: prange->last is the last page in the range. The split-remap logic introduced by commit 448ee45353ef ("drm/amdkfd: Use huge page size to check split svm range alignment") uses ALIGN_DOWN(prange->last, 512) to determine whether the original range can contain a 2MB huge-page mapping. That aligns the last page itself down. Thus a range ending one page before the next 2MB boundary is classified as if the final 2MB block did not exist. When such a range is split inside that final block, the split head or tail can be left off the remap list even though it was derived from an original range that may have PMD mappings. Use prange->last + 1 as the exclusive upper bound when computing the original range's last 2MB-aligned boundary. Then use the actual split boundary for the head and tail alignment checks: tail->start for a tail split, and new_start for a head split. new_start is equivalent to head->last + 1 and directly names the exclusive end of the split head. Using head->last for the head-side check can both remap a head that ends exactly one page before a 2MB boundary and miss a head whose split boundary is one page after such a boundary. Philip Yang pointed out in the review of the original change that this condition should use head->last + 1 or new_start. Xiaogang Chen identified the inclusive-last cause and posted the candidate fix in the regression thread. With the culprit change active and the local revert not applied, the unchanged C/HSA reproducer completes 10/10 runs with this change on an RX 7600 XT. 
Fixes: 448ee45353ef ("drm/amdkfd: Use huge page size to check split svm range alignment") Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4914 Link: https://lore.kernel.org/stable/IA1PR12MB85172F7FE9157C092EDA46A0E3112@IA1PR12MB8517.namprd12.prod.outlook.com/ Link: https://lore.kernel.org/all/32ce2b72-aa16-4202-9f99-92e3cd4408bc@amd.com/ Suggested-by: Xiaogang Chen <xiaogang.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Gerhard Schwanzer <geschw@pm.me> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a60ea15807126b148a328051636977a33ad0e9bb) Cc: stable@vger.kernel.org
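The off-by-one described in the commit above can be illustrated with a small user-space sketch. It is not the driver code: the helper names (`align_down`, `last_boundary_inclusive`, `last_boundary_exclusive`) are hypothetical, and only the constant 512 and the "inclusive last page" convention are taken from the commit text.

```c
#include <assert.h>
#include <stdint.h>

/* 2 MiB huge page expressed in 4 KiB pages: the 512 used in the commit text. */
#define PAGES_PER_HUGE_PAGE 512u

/* Round x down to a multiple of a; a must be a power of two. */
static uint64_t align_down(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

/* Old computation: aligns the inclusive last page index itself. */
static uint64_t last_boundary_inclusive(uint64_t last)
{
	return align_down(last, PAGES_PER_HUGE_PAGE);
}

/* Fixed computation: aligns the exclusive end of the range, last + 1. */
static uint64_t last_boundary_exclusive(uint64_t last)
{
	return align_down(last + 1, PAGES_PER_HUGE_PAGE);
}
```

For a range whose last page is 1023 (one page before the boundary at 1024), the inclusive form yields 512 and so ignores the final block [512, 1023], while the exclusive form yields 1024.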
14 days | drm/amdgpu/gfx9: Fix Ring and IB test fail after mode2 | Jiqian Chen | 1 | -0/+39
For Renoir APU with gfx9, in some test scenarios with ring_reset disabled, such as accessing an unmapped invalid address, a GPU job timeout event can be triggered. The driver then uses Mode2 reset to reset the GPU, but after Mode2 the compute Ring test and IB test fail randomly. This is because the HQDs of the MECs are always active before and after Mode2, which causes the MECs to use stale HQDs when they are unhalted before the driver restores the MQDs, and leaves CPC and CPF stuck after Mode2, so the compute Ring and IB tests fail. So, add sequences to deactivate the HQDs of the MECs in the suspend IP function of the reset process. v2: Move all sequences into a new function gfx_v9_0_cp_mode2_clear_state (Ray Huang) Check for the Mode2 reset method in the if condition (Ray Huang) v3: Move all sequences before Mode2 instead of after Mode2 (Timur Kristóf) v4: Call amdgpu_gfx_rlc_enter/exit_safe_mode at the beginning and end of gfx_v9_0_deactivate_kcq_hqd (Alex Deucher) Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c3988a7ad4799514447294f04f063b422e0551df) Cc: stable@vger.kernel.org
14 days | drm/amdgpu/uvd: Fix forcing MSG, FB BOs into VCPU segment when it isn't at 0 (v2) | Timur Kristóf | 1 | -9/+24
UVD 4.x and older can only access MSG, FEEDBACK buffers from a specific 256M VRAM segment that the VCPU BO is also located in. We already modify all placements of the given BO to ensure the BO is placed within this segment. Previously, it always assumed that the VCPU segment is the first 256M of VRAM, even though under some conditions the VCPU BO could be allocated outside this segment, which made UVD non-functional as the BOs were not inside the same segment as the UVD VCPU BO. Solve that by using the segment where the VCPU BO actually is. This fixes an issue with UVD failing to initialize on SI/CIK when resizable BAR is enabled and the VCPU BO is allocated in a different segment. v2: - For other BOs, keep using the same UVD segment as before. Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/3851 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit cbfd4d3fc2061a1ec8e9d36e65973ac3e813358a) Cc: stable@vger.kernel.org
14 days | drm/amdgpu/uvd: Place VCPU BO only in VRAM for UVD 4.x and older | Timur Kristóf | 1 | -6/+11
These UVD versions don't fully support GPUVM and are only validated to work when their VCPU BO is placed in VRAM. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 01b8dfc0660db5d6cdd62c22dc20f774a26ce853) Cc: stable@vger.kernel.org
14 days | drm/amdgpu: Fix amdgpu_bo_move() when old_mem and new_mem are both GTT | Timur Kristóf | 1 | -0/+18
The UVD code relies on GTT to GTT moves in order to ensure that its BOs don't cross 256M segments. Fixes: bfe5e585b44f ("drm/ttm: move last binding into the drivers.") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 21fd45e5e2628d00b478590bcc3d14d3de5d45b6) Cc: stable@vger.kernel.org
14 days | drm/amdgpu: Respect placement requirements in amdgpu_gtt_mgr functions | Timur Kristóf | 1 | -2/+28
When testing intersection and compatibility, respect the actual placement requirements. This is a pre-requisite for ensuring that UVD CS BOs do not cross 256M segments. Fixes: ded910f368a5 ("drm/amdgpu: Implement intersect/compatible functions") Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bc06579ca29dee9c245a41b12e39c7bb6938af5d) Cc: stable@vger.kernel.org
14 days | drm/amdgpu: Fix context pstate override handling | Tvrtko Ursulin | 1 | -29/+42
There are several problems in the context pstate handling code. The most serious ones are potential use-after-free and NULL pointer dereferences at context initialization time. Both are due to amdgpu_ctx_init() not holding the adev->pm.stable_pstate_ctx_lock, which is otherwise used from both sysfs and the context code itself for modifying and clearing the stored context pointer. The second issue is that context fini can trample over the pstate configuration set via sysfs. This is due to the restore state (ctx->stable_pstate) being saved at context init time, and not if, or when, the context actually changes the pstate. As the context exits it will therefore incorrectly restore to what was set before the sysfs override was requested. The simplest fix is to drastically simplify how the state is tracked, by clearly defining the points at which pstate ownership is taken and released, and to handle all transitions under the correct lock. Instead of at context init time, the previous state is saved only at the point the context overrides the current state, and is restored on context exit only if the context is still the owner of the current override state. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: 79610d304133 ("drm/amdgpu: fix pstate setting issue") Cc: Chengming Gui <Jack.Gui@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1b5e413713c0a93bc1818394d0ce49aaad21bd27) Cc: <stable@vger.kernel.org> # v6.1+
14 days | drm/amdkfd: Use memdup_array_user to copy data from/to user space at kfd ioctls | Xiaogang Chen | 1 | -34/+12
Several kfd ioctls need to transfer array data from/to user space. The kfd driver uses kmalloc_array with a user-provided size, which can over-allocate or wrap around in 32 bits with a hostile value. Replace it with memdup_array_user, which does overflow checking and allocates through dedicated slab caches, and is also physically contiguous like kmalloc. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4eca4742eb215951f9739ffe0122d179d545a7a4)
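As a rough user-space illustration of the two failure modes named in the commit above (a 32-bit wrap and an unchecked product), the following sketch contrasts a wrapping size computation with an overflow-checked one. Both function names are hypothetical and the code does not reproduce memdup_array_user itself; it only shows the kind of check such a helper is described as performing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Size computed in 32 bits: the product is silently truncated modulo 2^32. */
static uint32_t array_bytes_wrapping_u32(uint32_t n, uint32_t elem_size)
{
	return (uint32_t)((uint64_t)n * elem_size);
}

/*
 * Overflow-checked size: returns false instead of producing a wrapped
 * value when n * elem_size does not fit in size_t.
 */
static bool array_bytes_checked(size_t n, size_t elem_size, size_t *bytes)
{
	assert(bytes != NULL);
	if (elem_size != 0 && n > SIZE_MAX / elem_size)
		return false;
	*bytes = n * elem_size;
	return true;
}
```

With n = 0x10000001 and 16-byte elements, the 32-bit product collapses to 16 bytes, which is how a hostile count can lead to an undersized buffer when the check is missing.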
14 days | drm/amdkfd: check find_first_zero_bit before __set_bit on kfd->doorbell_bitmap | Xiaogang Chen | 1 | -3/+5
If inx returned by find_first_zero_bit() is beyond the valid range, there is no need to set a bit in doorbell_bitmap. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2664ce9143d174651a793d96a6a2326050c4f45a)
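The pattern in the commit above, checking the search result before setting a bit, can be sketched in user space as follows. The single 64-bit word and the names `first_zero_bit` and `claim_slot` are assumptions made for illustration; the only behavior borrowed from the kernel is that find_first_zero_bit() returns the bitmap size when no zero bit exists.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Index of the first clear bit among the low nbits of map, or nbits
 * when every bit is set (the "not found" convention).
 */
static unsigned int first_zero_bit(uint64_t map, unsigned int nbits)
{
	unsigned int i;

	assert(nbits <= 64);
	for (i = 0; i < nbits; i++)
		if (!((map >> i) & 1u))
			return i;
	return nbits;
}

/* Claim a free slot; returns its index, or -1 if the bitmap is full. */
static int claim_slot(uint64_t *map, unsigned int nbits)
{
	unsigned int inx = first_zero_bit(*map, nbits);

	if (inx >= nbits)
		return -1;	/* out of range: leave the bitmap untouched */
	*map |= (uint64_t)1 << inx;
	return (int)inx;
}
```

Without the range check, a full bitmap would cause a bit to be set one position past the end of the tracked range.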
14 days | drm/amdkfd: Let driver decide buffer size at AMDKFD_IOC_GET_DMABUF_INFO ioctl | Xiaogang Chen | 3 | -13/+22
The amdkfd driver needs to allocate a buffer to return BO metadata to user space. The buffer size is currently controlled by the user. This is a potential security issue: a hostile value (e.g. 2 GiB) lets any render-group user trigger an order-MAX allocation/OOM in kernel context. This patch first finds the BO metadata size. If the size is smaller than the user-provided value, the driver can safely allocate a buffer in kernel space and copy it to the user space buffer. If not, the driver lets the user know and does not allocate or copy; the user then retries with a new buffer in user space. This lets the driver decide the buffer allocation size, avoiding a potentially hostile size from user space. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f54ce9e8cbd3abe0eda3a285f54dc4f572fe589a)
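The size negotiation described in the commit above might look roughly like the following user-space sketch. The function name `copy_metadata`, the `needed` output parameter and the choice of -ENOSPC are all assumptions for illustration; the commit text does not say which error code or interface the driver actually uses.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy metadata into the caller's buffer only if it fits. The required
 * size is always reported, so a caller whose buffer was too small can
 * retry. A kernel implementation would size any temporary buffer from
 * meta_size, never from the caller-supplied user_size.
 */
static int copy_metadata(const void *meta, size_t meta_size,
			 void *user_buf, size_t user_size, size_t *needed)
{
	assert(meta != NULL && user_buf != NULL && needed != NULL);
	*needed = meta_size;
	if (meta_size > user_size)
		return -ENOSPC;	/* hypothetical error code */
	memcpy(user_buf, meta, meta_size);
	return 0;
}
```

The design point is that the amount of memory touched depends only on the real metadata size, so a hostile user_size cannot inflate the allocation.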
14 days | drm/amdgpu: fix recursive ww_mutex acquire in amdgpu_devcoredump_format | Mikhail Gavrilov | 1 | -89/+126
When dumping IB contents from a hung job, amdgpu_devcoredump_format() acquired the VM root PD's reservation via amdgpu_vm_lock_by_pasid() and then, for each IB, called amdgpu_bo_reserve() on the BO backing the IB. Both reservations are reservation_ww_class_mutex objects and neither used a ww_acquire_ctx, which trips lockdep: WARNING: possible recursive locking detected -------------------------------------------- kworker/u128:0 is trying to acquire lock: ffff88838b16e1f0 (reservation_ww_class_mutex){+.+.}-{4:4}, at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu] but task is already holding lock: ffff8882f82681f0 (reservation_ww_class_mutex){+.+.}-{4:4}, at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu] Possible unsafe locking scenario: CPU0 ---- lock(reservation_ww_class_mutex); lock(reservation_ww_class_mutex); *** DEADLOCK *** May be due to missing lock nesting notation Workqueue: events_unbound amdgpu_devcoredump_deferred_work [amdgpu] Call Trace: __ww_mutex_lock.constprop.0 ww_mutex_lock amdgpu_bo_reserve amdgpu_devcoredump_format+0x1594 [amdgpu] amdgpu_devcoredump_deferred_work+0xea [amdgpu] The two reservations are on different BOs in the captured trace, so the splat is a lockdep-correctness warning, not an observed deadlock. It becomes a real self-deadlock whenever the IB BO shares its dma_resv with the root PD (the always-valid case, see amdgpu_vm_is_bo_always_valid()): amdgpu_bo_reserve(abo) re-acquires the same ww_mutex without a ticket and blocks forever. With amdgpu.gpu_recovery=0 the timeout handler refires every ~2 s and each invocation produces this splat, drowning the kernel ring buffer. Now that amdgpu_vm_lock_by_pasid() takes a drm_exec context, move the IB dumping into a separate helper that locks the root PD and every IB BO together in a single drm_exec ticket. DRM_EXEC_IGNORE_DUPLICATES handles IB BOs that share a dma_resv (e.g. always-valid BOs, or two IBs backed by the same BO). 
Every lock is now a top-level acquire under one ww_acquire_ctx, so the recursive ww_mutex condition is gone, and the per-IB amdgpu_bo_reserve()/amdgpu_bo_unref() dance -- including a BO refcount leak on the amdgpu_bo_reserve() failure path -- is removed. Fixes: 7b15fc2d1f1a ("drm/amdgpu: dump job ibs in the devcoredump") Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d6bf4242731219ee08ce54c365631e395486651e)
14 days | drm/amdgpu: convert amdgpu_vm_lock_by_pasid() to drm_exec | Mikhail Gavrilov | 2 | -35/+58
amdgpu_vm_lock_by_pasid() looks up a VM by PASID and reserves its root PD with a bare amdgpu_bo_reserve(), returning the still-reserved root to the caller. A caller that then needs to reserve further BOs (for example the devcoredump IB dump) ends up nesting reservation_ww_class_mutex acquires without a ww_acquire_ctx, which lockdep flags as recursive locking. Convert the helper to take a drm_exec context and lock the root PD with drm_exec_lock_obj(). Callers now run it inside a drm_exec_until_all_locked() loop and can lock additional BOs in the same ww ticket, so there is no nested ww_mutex acquire. The drm_exec context holds its own reference on the locked root BO, so the helper no longer hands a root reference back to the caller: the root output parameter is dropped, and the transient reference taken across the PASID lookup is released before returning. The only existing caller, amdgpu_vm_handle_fault(), is updated accordingly. Its is_compute_context path, which previously dropped the root reservation around svm_range_restore_pages() and re-took it, now finalises the drm_exec context and re-initialises a fresh one; behaviour is otherwise unchanged. No functional change intended for the page-fault path. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 14682de8ad377bf13ea66e47c26dcfea0b19a21d)
14 days | drm/amdgpu: Don't use UTS_RELEASE directly | Uwe Kleine-König (The Capable Hub) | 1 | -2/+2
UTS_RELEASE evaluates to a static string and changes quite easily (e.g. uncommitted changes in the source tree or new commits). So when checking if a patch introduces changes to the resulting binary, each usage of UTS_RELEASE is a source of annoyance. Instead of using UTS_RELEASE directly, use init_utsname()->release, which evaluates to the same string but ensures that a change of UTS_RELEASE doesn't affect amdgpu_dev_coredump.o. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com> Link: https://patch.msgid.link/20260428144704.1114562-2-u.kleine-koenig@baylibre.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d785df5598fd1d1cc2f2f45c05448271b6d490b7)
14 days | drm/amdkfd: Fix NULL deref during sysfs teardown | Geoffrey McRae | 1 | -16/+24
Move kfd_process_remove_sysfs() earlier in kfd_process_wq_release() so that all sysfs/procfs entries are removed before tearing down PDDs and dropping lead_thread. The per-process sysfs attributes are backed by struct kfd_process_device, and their show/store callbacks dereference PDD fields. Since sysfs removal waits for active callbacks to complete, removing these entries first closes a race where userspace reads sdma_* and stats_* files after PDD teardown. Previously this cleanup ran after kfd_process_destroy_pdds(), which resets p->n_pdds to 0. This meant kfd_process_remove_sysfs() could no longer walk the PDD array, so the per-PDD sysfs cleanup did not run as intended. This race caused NULL pointer dereferences observed in kfd_sdma_activity_worker and kfd_procfs_stats_show. Also harden kfd_process_remove_sysfs() against partially initialized or already-freed objects: - Check kobj_queues before removing PASID and deleting it - Guard kobj_stats and kobj_counters before use These checks prevent invalid dereferences during cleanup. Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Geoffrey McRae <geoffrey.mcrae@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 674c692702341fed321720b4b92036c5934fb485)
14 days | drm/amdgpu: validate CP_GFX_SHADOW chunk size in CS pass1 | Mario Limonciello | 1 | -1/+5
Add a minimum-length check for the AMDGPU_CHUNK_ID_CP_GFX_SHADOW chunk in amdgpu_cs_pass1(), matching the gate already present for the IB, FENCE and BO_HANDLES chunk types. The CP_GFX_SHADOW case previously shared a bare break with the dependency and syncobj chunk types, which do not dereference a fixed-size struct. When userspace submits this chunk with length_dw == 0, vmemdup_array_user() is called with size 0 and returns ZERO_SIZE_PTR, which passes the IS_ERR() check. amdgpu_cs_p2_shadow() then dereferences chunk->kdata as a struct drm_amdgpu_cs_chunk_cp_gfx_shadow (reading shadow->flags), faulting on the ZERO_SIZE_PTR and causing a NULL-pointer dereference. This is reachable by an unprivileged process in the render group. Reject undersized chunks with -EINVAL during pass1 so the bad submission is rejected before pass2 ever dereferences the data. Fixes: ac9287055ff1 ("drm/amdgpu: add gfx shadow CS IOCTL support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7f61b2eef7415eccdb40850aca0de94211948657) Cc: stable@vger.kernel.org
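A minimum-length gate of the kind the commit above describes can be sketched as below. The struct `shadow_payload` is a hypothetical stand-in for the fixed-size chunk payload, and `check_fixed_chunk` is an illustrative name, not the function in amdgpu_cs_pass1(); only the idea (reject a chunk whose length in dwords is smaller than the struct that will later be read) comes from the commit text.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a fixed-size chunk payload (32 bytes). */
struct shadow_payload {
	uint64_t shadow_va;
	uint64_t csa_va;
	uint64_t gds_va;
	uint64_t flags;
};

/*
 * Reject a chunk whose length, given in 32-bit dwords, cannot hold the
 * fixed-size payload that a later pass will dereference.
 */
static int check_fixed_chunk(uint32_t length_dw, size_t payload_size)
{
	assert(payload_size > 0);
	if ((uint64_t)length_dw * sizeof(uint32_t) < payload_size)
		return -EINVAL;
	return 0;
}
```

A length of 0 dwords is rejected here before any zero-sized copy can hand a non-dereferenceable pointer to later code.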
14 days | drm/amdgpu: check amdgpu_vm_bo_find() result in GET_MAPPING_INFO | Mario Limonciello | 1 | -0/+5
The AMDGPU_GEM_OP_GET_MAPPING_INFO path of amdgpu_gem_op_ioctl() looks up the bo_va for the buffer object in the caller's VM via amdgpu_vm_bo_find(), but uses the returned pointer without checking it. amdgpu_vm_bo_find() returns NULL when the BO has no bo_va in that VM, which is the normal case for a BO that has never been mapped. The result is fed straight into amdgpu_vm_bo_va_for_each_valid_mapping(), which expands to list_for_each_entry(mapping, &(bo_va)->valids, list) and dereferences bo_va, causing a NULL pointer dereference. This is reachable by any process able to issue the ioctl (render group) simply by requesting mapping info for an unmapped BO. Return -ENOENT when no bo_va is found, jumping to out_exec so the drm_exec context and GEM object reference are released. Fixes: 4d82724f7f2b ("drm/amdgpu: Add mapping info option for GEM_OP ioctl") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 528b19377affc1cc7362a70a254c1dda793595f9) Cc: stable@vger.kernel.org
14 days | drm/amdgpu: initialize irq.lock spinlock earlier | Thadeu Lima de Souza Cascardo | 2 | -2/+2
If there is an early failure during amdgpu probe, like missing firmware, it will end up calling amdgpu_irq_disable_all, which takes irq.lock spinlock without it being initialized. Initializing irq.lock earlier at amdgpu_device_init fixes the issue. [ 79.334079] INFO: trying to register non-static key. [ 79.334081] The code is fine but needs lockdep annotation, or maybe [ 79.334083] you didn't initialize this object before use? [ 79.334084] turning off the locking correctness validator. [ 79.334088] CPU: 2 UID: 0 PID: 1819 Comm: bash Not tainted 7.1.0-rc5-gfd06300b2348 #96 PREEMPT 8e8f461221633dae3c832d6689eaf0546c0ed4cd [ 79.334092] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0133 08/05/2024 [ 79.334094] Call Trace: [ 79.334095] <TASK> [ 79.334097] dump_stack_lvl+0x5d/0x80 [ 79.334103] register_lock_class+0x7af/0x7c0 [ 79.334109] __lock_acquire+0x416/0x2610 [ 79.334114] lock_acquire+0xcf/0x310 [ 79.334117] ? amdgpu_irq_disable_all+0x3b/0xf0 [amdgpu c88bab43d391d519ad0d5c8e5a099b4aceefa180] [ 79.334503] ? _raw_spin_lock_irqsave+0x53/0x60 [ 79.334508] _raw_spin_lock_irqsave+0x3f/0x60 [ 79.334510] ? amdgpu_irq_disable_all+0x3b/0xf0 [amdgpu c88bab43d391d519ad0d5c8e5a099b4aceefa180] [ 79.334881] amdgpu_irq_disable_all+0x3b/0xf0 [amdgpu c88bab43d391d519ad0d5c8e5a099b4aceefa180] [ 79.335240] amdgpu_device_fini_hw+0x90/0x32c [amdgpu c88bab43d391d519ad0d5c8e5a099b4aceefa180] [ 79.335704] amdgpu_driver_load_kms.cold+0x22/0x44 [amdgpu c88bab43d391d519ad0d5c8e5a099b4aceefa180] [ 79.336159] amdgpu_pci_probe+0x204/0x440 [amdgpu c88bab43d391d519ad0d5c8e5a099b4aceefa180] [ 79.336494] local_pci_probe+0x3c/0x80 [ 79.336500] pci_call_probe+0x55/0x2e0 [ 79.336505] ? _raw_spin_unlock+0x2d/0x50 [ 79.336508] ? 
pci_match_device+0x157/0x180 [ 79.336512] pci_device_probe+0x9b/0x170 [ 79.336516] really_probe+0xd5/0x370 [ 79.336521] __driver_probe_device+0x84/0x150 [ 79.336525] device_driver_attach+0x47/0xb0 [ 79.336528] bind_store+0x73/0xc0 [ 79.336531] kernfs_fop_write_iter+0x176/0x250 [ 79.336536] vfs_write+0x24d/0x560 [ 79.336542] ksys_write+0x71/0xe0 [ 79.336546] do_syscall_64+0x122/0x710 [ 79.336550] ? do_syscall_64+0xd1/0x710 [ 79.336553] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 79.336557] RIP: 0033:0x7f92fd675006 [ 79.336561] Code: 5d e8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 75 19 83 e2 39 83 fa 08 75 11 e8 26 ff ff ff 66 0f 1f 44 00 00 48 8b 45 10 0f 05 <48> 8b 5d f8 c9 c3 0f 1f 40 00 f3 0f 1e fa 55 48 89 e5 48 83 ec 08 [ 79.336562] RSP: 002b:00007ffe4fa867a0 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 [ 79.336565] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f92fd675006 [ 79.336567] RDX: 000000000000000d RSI: 000055b2dfce59b0 RDI: 0000000000000001 [ 79.336568] RBP: 00007ffe4fa867c0 R08: 0000000000000000 R09: 0000000000000000 [ 79.336569] R10: 0000000000000000 R11: 0000000000000202 R12: 000000000000000d [ 79.336570] R13: 000055b2dfce59b0 R14: 00007f92fd7ca5c0 R15: 000055b2dfdbaf70 [ 79.336574] </TASK> Fixes: 9950cda2a018 ("drm/amdgpu: drop the drm irq pre/post/un install callbacks") Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7dba3e10ecdeec85208e255853fcd3890880b10e)
14 days | drm/amdkfd: fix list_del corruption in kfd_criu_resume_svm | Mario Limonciello | 1 | -0/+1
The cleanup tail of kfd_criu_resume_svm() walks svms->criu_svm_metadata_list and kfree()s each struct criu_svm_metadata without removing it from the list. The list head is left pointing at freed kmalloc-96 objects. A second AMDKFD_IOC_CRIU_OP from the same process re-enters: list_empty() reads the dangling ->next (use-after-free), the loop walks freed entries, and each is kfree()'d again (double-free). This is reachable by an unprivileged render-group user via /dev/kfd with no capabilities required. Add list_del() before the kfree() so the list is properly emptied. The list_for_each_entry_safe() iterator already caches the next pointer, so unlinking during the walk is safe. Fixes: 2a909ae71871 ("drm/amdkfd: CRIU resume shared virtual memory ranges") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6322d278a298e2c1430b9d2697743d3a04b788b1)
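The unlink-before-free pattern from the commit above can be demonstrated with a minimal user-space circular list. Everything here (`struct list_node`, `list_unlink`, `struct metadata`, `fill`, `free_all`) is a hypothetical re-implementation for illustration, not the kernel's list.h API; the walk caches the successor before freeing, in the spirit of list_for_each_entry_safe().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct list_node { struct list_node *prev, *next; };

static void list_init(struct list_node *h) { h->prev = h->next = h; }
static bool list_is_empty(const struct list_node *h) { return h->next == h; }

static void list_append(struct list_node *n, struct list_node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_unlink(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* The list node is the first member, so a node pointer is also the entry. */
struct metadata { struct list_node node; int payload; };

/* Append up to count freshly allocated entries; returns how many were added. */
static int fill(struct list_node *head, int count)
{
	int i;

	for (i = 0; i < count; i++) {
		struct metadata *m = malloc(sizeof(*m));

		if (!m)
			break;
		m->payload = i;
		list_append(&m->node, head);
	}
	return i;
}

/* Unlink and free every entry, leaving the head as a valid empty list. */
static int free_all(struct list_node *head)
{
	struct list_node *pos, *next;
	int freed = 0;

	assert(head != NULL);
	for (pos = head->next; pos != head; pos = next) {
		next = pos->next;	/* cache the successor before freeing */
		list_unlink(pos);	/* the step the fix adds */
		free((struct metadata *)pos);
		freed++;
	}
	return freed;
}
```

Because each entry is unlinked first, a second call to `free_all()` on the same head finds an empty list instead of walking freed memory.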
14 days | drm/radeon: fix r100_copy_blit for large BOs | Pavel Ondračka | 1 | -4/+9
r100_copy_blit() copies BOs as 1024-pixel-wide ARGB8888 blits, so one GPU page becomes one blit row. Large copies are split into chunks of at most 8191 rows. The kernel register header names the packet coordinate dwords SRC_Y_X and DST_Y_X. In the BITBLT_MULTI description in R5xx_Acceleration_v1.5.pdf docs, these correspond to [SRC_X1 | SRC_Y1] and [DST_X1 | DST_Y1], which are signed 13-bit coordinates in the -8192..8191 range. The old code kept SRC/DST_PITCH_OFFSET at the BO base and used SRC_Y_X/DST_Y_X as the chunk address, so large BO moves could exceed that coordinate range. Compute per-chunk SRC/DST_PITCH_OFFSET bases and emit zero source and destination coordinates. r100_copy_blit() already packs SRC/DST_PITCH_OFFSET as pitch plus base offset, so large chunk addresses belong there rather than in the coordinate fields. This fixes Prison Architect corruption with 4096x4096 mipped textures after they are evicted to GTT under memory pressure on RV530. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/6716 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 87be26aee76239c6da03e599f238a426897f78ad) Cc: stable@vger.kernel.org
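The chunking arithmetic described in the commit above can be sketched as follows. This is a simplified model under stated assumptions: a 4 KiB GPU page (1024 ARGB8888 pixels per row), hypothetical names (`struct blit_chunk`, `plan_blits`), and plain byte addresses in place of the packed SRC/DST_PITCH_OFFSET register encoding, which is not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ROW_BYTES         4096u	/* 1024 pixels * 4 bytes = one 4 KiB page */
#define MAX_ROWS_PER_BLIT 8191u	/* chunk limit stated in the commit text */

struct blit_chunk {
	uint64_t src_base;	/* per-chunk source base address */
	uint64_t dst_base;	/* per-chunk destination base address */
	uint32_t rows;		/* rows (pages) copied by this chunk */
};

/*
 * Split a copy of num_pages pages into chunks. Each chunk carries its
 * own base address, so the in-packet coordinates can stay at zero.
 * Returns the number of chunks written to out (at most max_out).
 */
static size_t plan_blits(uint64_t src, uint64_t dst, uint32_t num_pages,
			 struct blit_chunk *out, size_t max_out)
{
	size_t n = 0;

	assert(out != NULL);
	while (num_pages > 0 && n < max_out) {
		uint32_t rows = num_pages > MAX_ROWS_PER_BLIT ?
				MAX_ROWS_PER_BLIT : num_pages;

		out[n].src_base = src;
		out[n].dst_base = dst;
		out[n].rows = rows;
		src += (uint64_t)rows * ROW_BYTES;
		dst += (uint64_t)rows * ROW_BYTES;
		num_pages -= rows;
		n++;
	}
	return n;
}
```

A 20000-page copy then becomes three chunks of 8191, 8191 and 3618 rows, with the row offset folded into each chunk's base address instead of growing past the signed 13-bit coordinate range.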
14 days | drm/amd/display: Fix mem_type change detection for async flips | Matthew Schwartz | 1 | -6/+4
[Why] amdgpu_dm_crtc_mem_type_changed() fetches the "old" and "new" plane state with two drm_atomic_get_plane_state() calls, which both return the new state. It compares a state against itself, so it never detects a mem_type change and never rejects the async flip. On DCN 3.0.1, this shows up as intermittent corruption when a single DCC plane is scanned out with immediate flips under gamescope and its buffer moves between the VRAM carveout and GTT. [How] Use drm_atomic_get_old_plane_state() and drm_atomic_get_new_plane_state() to compare the actual old and new states. These return NULL rather than an error pointer for a plane that is not part of the commit, so the IS_ERR() check becomes a NULL check that skips those planes, such as an unmodified cursor still in the CRTC's plane_mask. Fixes: 4caacd1671b7 ("drm/amd/display: Do not elevate mem_type change to full update") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Matthew Schwartz <matthew.schwartz@linux.dev> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 13158e5dbd896281f3e9982b5437cffa5fd621b2)
14 days | drm/amd/display: Add IN_FORMATS_ASYNC support for planes | James Lin | 1 | -0/+1
[Why] The DRM core exposes an IN_FORMATS_ASYNC plane property describing the set of format/modifier pairs that are valid for asynchronous (immediate) page flips. amdgpu already advertises async page flip support via mode_config.async_page_flip = true, but never implemented the .format_mod_supported_async plane callback, so the IN_FORMATS_ASYNC property was not created. This inconsistency (advertising async flips while exposing IN_FORMATS but no IN_FORMATS_ASYNC) causes userspace, such as igt-gpu-tools, to emit a repeated warning during plane initialization, which in turn demotes many otherwise passing KMS subtests to a WARN result. [How] Wire up .format_mod_supported_async to the existing amdgpu_dm_plane_format_mod_supported callback so the async format list is populated. amdgpu does not restrict async flips at the format/modifier level: the async flip constraints are enforced at atomic check and commit time and only require a fast update (no change to FB pitch, DCC state, rotation or memory type) between the old and new buffers. Therefore the set of formats/modifiers valid for async flips is identical to the regular IN_FORMATS set, and the same callback can be reused. Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: James Lin <PingLei.Lin@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8e2d7bbd6b184c0c1b0fe7cb404c9b5214d89931)
14 days | drm/amdgpu/gfx: fix cleaner shader IB buffer overflow | Asad Kamal | 1 | -5/+5
The cleaner shader sysfs path allocates a 16-dword (64 byte) IB but incorrectly fills (align_mask + 1) dwords. On GFX rings align_mask is 0xff, so the loop wrote 256 dwords into a 64-byte buffer, causing a kernel page fault. The IB only needs to be a minimal NOP shell to schedule the job; the cleaner shader itself is emitted on the ring via emit_cleaner_shader(). Fill 16 dwords to match the allocation. v2: Use ib_size_dw variable (Lijo) Fixes: d361ad5d2fc0 ("drm/amdgpu: Add sysfs interface for running cleaner shader") Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bf21af331ebf72d0935fd70c73192414a422c03a) CC: stable@vger.kernel.org
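The size mismatch in the commit above is simple arithmetic, sketched here in user space. The names `old_fill_count` and `fill_ib` and the NOP value passed by the caller are hypothetical; the numbers (a 16-dword IB and an align_mask of 0xff) are taken from the commit text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IB_SIZE_DW 16u	/* allocated IB size in dwords (64 bytes) */

/* Number of dwords the old loop wrote: align_mask + 1. */
static size_t old_fill_count(uint32_t align_mask)
{
	return (size_t)align_mask + 1;
}

/* Fixed behavior: fill exactly as many dwords as were allocated. */
static void fill_ib(uint32_t *ib, size_t ib_size_dw, uint32_t nop)
{
	size_t i;

	assert(ib != NULL);
	for (i = 0; i < ib_size_dw; i++)
		ib[i] = nop;
}
```

With align_mask = 0xff the old count is 256 dwords (1024 bytes), sixteen times the 64-byte allocation, whereas bounding the loop by the allocation size keeps every write inside the buffer.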
14 days | drm/amdgpu: allocate lockdep mutex on the heap to fix stack overflow | Prike Liang | 1 | -50/+53
Replace the stack-allocated amdgpu_lockdep mutex with a heap allocation via kmalloc to fix a stack overflow caused by the large struct size. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit dbae980eefb2f46f31cee12f1f8540d0d79f61ae)
14 days | drm/amdkfd: Fix SMI event PID reporting for containers | Andrew Martin | 5 | -56/+77
SMI events were reporting incorrect PIDs in containerized environments, causing test failures where container processes expected to see their namespace-local PIDs but instead received global host PIDs. The issue had two root causes: 1. Event functions were called from kernel context (page fault handlers, migration workers) where 'current' refers to the kernel worker thread, not the userspace GPU process that triggered the event. 2. PID conversion used task_tgid_vnr() which returns the PID in the caller's namespace (init namespace for kernel threads), not the task's own namespace. This patch updates the SMI event interface: - Change 8 event function signatures to accept task_struct pointer instead of pid_t, allowing proper namespace-aware PID conversion - Convert PIDs using task_tgid_nr_ns(task, task_active_pid_ns(task)) which returns the PID as the process sees it via getpid() - Update 10 call sites to pass p->lead_thread (the GPU process) instead of p->lead_thread->pid or current (kernel worker) This ensures SMI events report container-local PIDs, which is critical for containerized GPU workloads to correctly correlate events with their processes. Tested-by: Andrew Martin <andmarti@amd.com> Assisted-by: Claude:Sonnet 4-5 Signed-off-by: Andrew Martin <andrew.martin@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 60271ec06e04ba5d69d68714f3abdf637d86c257)
14 daysdrm/amd/display: Restore periodic detection for DCN35Ivan Lipski3-4/+2
[Why&How] Periodic detection callbacks were removed from DCN35 for higher IPS residency, causing some displays to fail to recover after DPMS sleep. The affected monitors bounce HPD ~1.2s after link training, and without periodic detection the system enters IPS with no mechanism to wake and rediscover the display. Restore the periodic detection calls in dcn35_clk_mgr for now; long term they should be replaced with a proper IPS-aware solution using DMUB. Also remove it from dcn31 and dcn314_clk_mgr.c; they do not have IPS, so the change should not affect them. Fixes: 3f6c060846be ("drm/amd/display: Remove periodic detection callbacks from dcn35+") Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5318 Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 0c300e6a76916e944b6b18a64c73f7895a0fee87) Cc: stable@vger.kernel.org
14 daysdrm/amd/display: Skip PHY SSC reduction on some 8K panelsRoman Li1-3/+10
[Why] Some 8K displays cannot tolerate the reduced phy ssc value at high link utilization and show corruption or black screen. [How] Add an EDID panel-id quirk to utilize existing skip_phy_ssc_reduction flag. To pass the link into the quirk handler, change the signature of apply_edid_quirks() to take link as an argument. The dev local in dm_helpers_parse_edid_caps() becomes unused and is removed. Fixes: 5fa62c87cffd ("drm/amd/display: Add option to disable PHY SSC reduction on transmitter enable") Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 144169e7be0831e09958a906d08d1856751aa6c6)
14 daysdrm/amdgpu: skip already suspended IP blocks in ip_suspend_phase2Yunxiang Li1-1/+1
The GPU reload test (S3 / mode1 reset / module reload) triggers a WARN_ON in amdgpu_irq_put() on gfx10 when unloading amdgpu: WARNING: CPU: 0 PID: 2314 at amd/amdgpu/amdgpu_irq.c:676 amdgpu_irq_put+0xc3/0xe0 [amdgpu] Call Trace: gfx_v10_0_hw_fini+0x41/0x150 [amdgpu] amdgpu_ip_block_hw_fini+0x29/0xc0 [amdgpu] amdgpu_device_fini_hw+0x315/0x610 [amdgpu] amdgpu_driver_unload_kms+0x7c/0x90 [amdgpu] amdgpu_pci_remove+0x51/0x90 [amdgpu] amdgpu_device_ip_resume_phase2() skips IP blocks whose status.hw is already set, but amdgpu_device_ip_suspend_phase2() never had the matching guard, so a block can be suspended twice (e.g. a reset or recovery issued while the device is already suspended). The second suspend runs hw_fini again, which now releases the gfx fault IRQs unconditionally, dropping a refcount that is already zero and tripping the WARN_ON in amdgpu_irq_put(). The fault/EOP IRQ get/put were balanced through late_init/hw_fini before, which masked the double-suspend; moving the get into hw_init made the suspend/resume asymmetry visible as an IRQ refcount underflow. Honor status.hw in ip_suspend_phase2() so suspend mirrors resume and a block is only torn down once. Fixes: 9117d8be850b ("drm/amdgpu/gfx: move fault and EOP IRQ get/put to hw_init/hw_fini") Fixes: 482f0e538580 ("drm/amdgpu: fix double ucode load by PSP(v3)") Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f44f2af13c418969be358b15743f939d705de998)
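The refcount underflow described above, and the effect of honoring status.hw on suspend, can be reproduced in a toy userspace model. This is a sketch under assumptions rather than the actual patch: `struct toy_ip_block` and the `toy_*` functions are hypothetical, the `underflows` counter stands in for the WARN_ON in amdgpu_irq_put(), and the guard is assumed to skip blocks whose hw flag is already clear.

```c
#include <stdbool.h>

/* Hypothetical model of an IP block; not the amdgpu structures. */
struct toy_ip_block {
	bool hw;          /* models status.hw */
	int irq_refcount; /* models the fault IRQ reference taken in hw_init */
	int underflows;   /* puts attempted at refcount zero (kernel would WARN) */
};

/* hw_init takes the IRQ reference and marks the block as up. */
static void toy_hw_init(struct toy_ip_block *b)
{
	b->irq_refcount++;
	b->hw = true;
}

/* hw_fini drops the IRQ reference unconditionally, as in the report. */
static void toy_hw_fini(struct toy_ip_block *b)
{
	if (b->irq_refcount == 0)
		b->underflows++; /* refcount already zero */
	else
		b->irq_refcount--;
	b->hw = false;
}

/* Suspend without the guard: a second call runs hw_fini again. */
static void toy_suspend_unguarded(struct toy_ip_block *b)
{
	toy_hw_fini(b);
}

/* Suspend that mirrors resume: skip blocks that are already down. */
static void toy_suspend_guarded(struct toy_ip_block *b)
{
	if (!b->hw)
		return;
	toy_hw_fini(b);
}
```

Calling `toy_suspend_unguarded()` twice after one `toy_hw_init()` records an underflow, whereas two calls to `toy_suspend_guarded()` tear the block down once and leave the counter at zero.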
14 daysdrm/amdkfd: Properly acquire queue buffers in CRIU restoreDavid Francis1-2/+10
When kfd_queue_acquire_buffers() was split off from set_queue_properties_from_user(), set_queue_properties_from_criu() was missed. As a result, set_queue_properties_from_criu() does not fill in the buffer fields of queue_properties, which becomes a problem when subsequent code expects them to be non-null. Add the proper call to kfd_queue_acquire_buffers(), and also use the right cast types in set_queue_properties_from_criu() (which were missed at the same time). Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 88ed96abbbe27b70193544fbc1ee06448c274714)
14 daysdrm/amd/pm: re-enable MC access after PrepareMp1ForUnload on SMU V15 APUsShubhankar Milind Sardeshpande1-1/+6
During smu_v15_0_0_system_features_control(), the driver sends a PrepareMp1ForUnload message to PMFW. PMFW then performs nBIF and SYSHUB function-level resets (FLR), disabling PCIe CFG space reset, which clears the framebuffer enable bit to zero and disables MC (memory controller) access from the host. Re-enable MC access via the nbio mc_access_enable callback right after PrepareMp1ForUnload completes in smu_v15_0_0_system_features_control(). Signed-off-by: Shubhankar Milind Sardeshpande <Shubhankar.MilindSardeshpande@amd.com> Signed-off-by: Suresh Guttula <Suresh.Guttula@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 840a3c5aeae779a3bc75d7f747c3ed18b1af6507) Cc: stable@vger.kernel.org