kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c, branch v7.0-rc7

drm/amd: Fix MQD and control stack alignment for non-4K

2026-03-30T20:12:27+00:00

For gfxV9, due to a hardware bug (based on the comments in the code here [1]), the control stack of a user-mode compute queue must be allocated immediately after the page boundary of its regular MQD buffer. To handle this, we allocate an enlarged MQD buffer where the first page is used as the MQD and the remaining pages store the control stack. Although these regions share the same BO, they require different memory types: the MQD must be UC (uncached), while the control stack must be NC (non-coherent), matching the behavior when the control stack is allocated in user space.

This logic works correctly on systems where the CPU page size matches the GPU page size (4K). However, the current implementation aligns both the MQD and the control stack to the CPU PAGE_SIZE. On systems with a larger CPU page size, the entire first CPU page is marked UC, even though that page may contain multiple GPU pages. The GPU treats the second 4K GPU page inside that CPU page as part of the control stack, but it is incorrectly mapped as UC.

This patch fixes the issue by aligning both the MQD and control stack sizes to the GPU page size (4K). The first 4K page is correctly marked as UC for the MQD, and the remaining GPU pages are marked NC for the control stack. This ensures proper memory type assignment on systems with larger CPU page sizes.

[1]: https://elixir.bootlin.com/linux/v6.18/source/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c#L118

Acked-by: Felix Kuehling
Signed-off-by: Donet Tom
Signed-off-by: Alex Deucher
(cherry picked from commit 998d6781410de1c4b787fdbf6c56e851ea7fa553)
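The effect described above can be illustrated with a small userspace sketch. It is not the driver code: the names GPU_PAGE_SIZE, enum mem_type, align_up(), type_cpu_aligned() and type_gpu_aligned() are hypothetical, and the sketch assumes the MQD fits in a single 4K GPU page, as the commit message implies. It only models which memory type each 4K GPU page of the combined buffer would receive under the old (CPU-page) and new (GPU-page) granularity.

```c
#include <assert.h>
#include <stdint.h>

/* GPU page size as stated in the commit message (4K). */
#define GPU_PAGE_SIZE 4096u

/* Hypothetical labels for the two memory types named in the commit. */
enum mem_type { MT_UC, MT_NC };

/* Round size up to a multiple of align (align must be a power of two). */
static inline uint64_t align_up(uint64_t size, uint64_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/*
 * Old behavior (sketch): the MQD region is sized in CPU pages, so every
 * GPU page that falls inside the first CPU page is treated as UC.
 */
static inline enum mem_type type_cpu_aligned(uint64_t gpu_page,
					     uint64_t cpu_page_size)
{
	uint64_t gpu_pages_per_cpu_page = cpu_page_size / GPU_PAGE_SIZE;

	assert(cpu_page_size % GPU_PAGE_SIZE == 0);
	return gpu_page < gpu_pages_per_cpu_page ? MT_UC : MT_NC;
}

/*
 * Fixed behavior (sketch): only the first 4K GPU page (the MQD) is UC,
 * every following GPU page belongs to the control stack and is NC.
 */
static inline enum mem_type type_gpu_aligned(uint64_t gpu_page)
{
	return gpu_page == 0 ? MT_UC : MT_NC;
}
```

With a 4K CPU page both functions agree. With a 64K CPU page, type_cpu_aligned(1, 65536) yields MT_UC for the second GPU page, which is the mismatch the patch removes, while type_gpu_aligned(1) yields MT_NC.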

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-22T01:09:51+00:00

This was done entirely with mindless brute force, using git grep -l '\

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21T09:02:28+00:00

This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances:

Single allocations:     kmalloc(sizeof(TYPE), ...)
are replaced with:      kmalloc_obj(TYPE, ...)

Array allocations:      kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:      kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:      kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning "TYPE *".

Signed-off-by: Kees Cook
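The idea of a typed allocation helper can be sketched in userspace C. The macros below, alloc_obj() and alloc_objs(), are hypothetical stand-ins built on calloc(); they are not the kernel's kmalloc_obj() family and take no GFP argument. They only show how folding the cast into the macro gives the result the type "TYPE *" rather than "void *".

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-ins, NOT the kernel macros.
 * Each expands to a calloc() call whose result is cast to TYPE *,
 * so assigning it to a mismatched pointer type draws a compiler diagnostic.
 */
#define alloc_obj(TYPE)         ((TYPE *)calloc(1, sizeof(TYPE)))
#define alloc_objs(TYPE, COUNT) ((TYPE *)calloc((COUNT), sizeof(TYPE)))

/* Example struct used only for this illustration. */
struct point {
	int x;
	int y;
};

/* Allocate an array of n zero-initialized points; caller must free() it. */
static inline struct point *make_points(size_t n)
{
	return alloc_objs(struct point, n);
}
```

A caller would write, for example, `struct point *p = alloc_obj(struct point);` and release it with free(p). Note that calloc() zero-fills the memory, which plain kmalloc() does not, so this is an analogy for the typing only.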

drm/amdgpu: Use correct address to setup gart page table for vram access

2026-01-10T19:06:39+00:00

Use the dst input parameter to set up the GART page table entries instead of using a fixed location.

Fixes: 237d623ae659 ("drm/amdgpu/gart: Add helper to bind VRAM pages (v2)")
Signed-off-by: Xiaogang Chen
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher

drm/amd: Convert DRM_() to drm_()

2026-01-05T21:59:55+00:00

The drm_*() macros include the device, which is helpful for debugging issues in multi-GPU systems.

Signed-off-by: Mario Limonciello (AMD)
Reviewed-by: Aurabindo Pillai
Signed-off-by: Alex Deucher

drm/amdgpu/gart: Add helper to bind VRAM pages (v2)

2025-11-12T02:54:17+00:00

Binds pages that are located in VRAM to the GART page table. Useful when a kernel BO is located in VRAM but needs to be accessed from the GART address space, for example to give a kernel BO a 32-bit address when GART is placed in LOW address space.

v2:
- Refactor function to be more reusable

Signed-off-by: Timur Kristóf
Reviewed-by: Christian König
Signed-off-by: Alex Deucher

drm/amd: Fix set but not used warnings

2025-10-20T22:25:58+00:00

There are many set but not used warnings under drivers/gpu/drm/amd when compiling with the latest upstream mainline GCC:

  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:305:18: warning: variable ‘p’ set but not used [-Wunused-but-set-variable=]
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h:103:26: warning: variable ‘internal_reg_offset’ set but not used [-Wunused-but-set-variable=]
  ...
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h:164:26: warning: variable ‘internal_reg_offset’ set but not used [-Wunused-but-set-variable=]
  ...
  drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:445:13: warning: variable ‘pipe_idx’ set but not used [-Wunused-but-set-variable=]
  drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:875:21: warning: variable ‘pipe_idx’ set but not used [-Wunused-but-set-variable=]

Remove the variables that are actually not used, or add the __maybe_unused attribute to the variables that are actually used, to fix them. Compile tested only.

Signed-off-by: Tiezhu Yang
Signed-off-by: Alex Deucher

drm/amdgpu: Fix dummy_read_page overlapping mappings

2024-11-04T17:05:30+00:00

Use dma_map_page_attrs() with the DMA_ATTR_SKIP_CPU_SYNC attribute to handle the overlapping mappings of the dummy page.

Signed-off-by: Prike Liang
Suggested-by: Christian König
Reviewed-by: Christian König
Signed-off-by: Alex Deucher

drm/amdgpu: add lock in amdgpu_gart_invalidate_tlb

2024-06-14T20:15:59+00:00

We need to take the reset domain lock before flushing HDP. We can't put the lock inside amdgpu_device_flush_hdp itself because it is used during reset, where we already take the write side lock.

Signed-off-by: Yunxiang Li
Reviewed-by: Christian König
Signed-off-by: Alex Deucher

drm/amdgpu: use helper in amdgpu_gart_unbind