kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c, branch v7.0-rc7

drm/amdgpu: validate doorbell_offset in user queue creation

2026-03-30T20:08:17+00:00

amdgpu_userq_get_doorbell_index() passes the user-provided doorbell_offset to amdgpu_doorbell_index_on_bar() without bounds checking. An arbitrarily large doorbell_offset can cause the calculated doorbell index to fall outside the allocated doorbell BO, potentially corrupting kernel doorbell space.

Validate that doorbell_offset falls within the doorbell BO before computing the BAR index, using u64 arithmetic to prevent overflow.

Fixes: f09c1e6077ab ("drm/amdgpu: generate doorbell index for userqueue")
Reported-by: Yuhao Jiang
Signed-off-by: Junrui Luo
Signed-off-by: Alex Deucher
(cherry picked from commit de1ef4ffd70e1d15f0bf584fd22b1f28cbd5e2ec)
Cc: stable@vger.kernel.org
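The bounds check described in this commit can be sketched in plain userspace C. This is only an illustration of the idea, not the code from the patch: the helper name doorbell_offset_in_bo() and its parameters (db_size as the size of one doorbell in bytes, bo_size as the size of the doorbell BO in bytes) are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel function): returns true when the
 * doorbell slot selected by doorbell_offset lies entirely inside a
 * doorbell BO of bo_size bytes. db_size is the size of one doorbell in
 * bytes.
 */
static bool doorbell_offset_in_bo(uint32_t doorbell_offset, uint32_t db_size,
                                  uint64_t bo_size)
{
    /*
     * Widen to 64 bits before multiplying so that a large 32-bit offset
     * cannot wrap around and appear to be in range.
     */
    uint64_t end = (uint64_t)doorbell_offset * db_size + db_size;

    return end <= bo_size;
}
```

With 8-byte doorbells and a 4096-byte BO, offsets 0 through 511 pass and 512 is rejected. An offset such as 0x20000000 would wrap to 0 in 32-bit arithmetic, which is the case the widening cast is meant to catch.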

drm/amdgpu/userq: refcount userqueues to avoid any race conditions

2026-03-04T18:15:00+00:00

To avoid race conditions and use-after-free (UAF) cases, implement kref-based queues and protect the operations below using the xa lock:
a. Getting a queue from the xarray
b. Incrementing/decrementing its refcount

Every time someone wants to access a queue, always get it via amdgpu_userq_get to make sure we have locks in place and get the object only if it is active.

A userqueue is destroyed when the last refcount is dropped, which typically happens via IOCTL or during fini.

v2: Add the missing drop in one of the conditions in the signal ioctl [Alex]
v3: Remove the queue from the xarray first in the free queue ioctl path [Christian]
- Pass the queue to amdgpu_userq_put directly.
- Make amdgpu_userq_put xa_lock free, since we do a put for each get only, the final put is done via destroy, and we remove the queue from the xa with the lock held.
- Use userq_put in fini too so cleanup is done fully.
v4: Use xa_erase directly rather than doing load and erase in the free ioctl. Also remove some of the error logs which could be exploited by the user to flood the logs [Christian]

Signed-off-by: Sunil Khatri
Reviewed-by: Christian König
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
(cherry picked from commit 4952189b284d4d847f92636bb42dd747747129c0)
Cc: # 048c1c4e5171: drm/amdgpu/userq: Consolidate wait ioctl exit path
Cc:
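The get/put pattern described in this commit can be sketched in userspace C. This is a simplified model under stated assumptions, not the driver code: a fixed-size table and a C11 spinlock stand in for the xarray and its lock, an atomic counter stands in for struct kref, and every name here (userq_create, userq_get, userq_put, userq_destroy, destroyed_count) is hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_QUEUES 8

struct userq {
    atomic_int refcount; /* stands in for struct kref */
    int id;
};

/* Stand-ins for the xarray and the xa lock. */
static struct userq *queue_table[MAX_QUEUES];
static atomic_flag table_lock = ATOMIC_FLAG_INIT;
static atomic_int destroyed_count; /* illustration only: queues freed so far */

static void table_lock_acquire(void)
{
    while (atomic_flag_test_and_set(&table_lock))
        ; /* spin until the lock is free */
}

static void table_lock_release(void)
{
    atomic_flag_clear(&table_lock);
}

/* Create a queue; the table holds the initial reference. */
static int userq_create(int id)
{
    struct userq *q;
    int ret = -1;

    if (id < 0 || id >= MAX_QUEUES)
        return -1;
    q = malloc(sizeof(*q));
    if (!q)
        return -1;
    atomic_init(&q->refcount, 1);
    q->id = id;

    table_lock_acquire();
    if (!queue_table[id]) {
        queue_table[id] = q;
        ret = 0;
    }
    table_lock_release();

    if (ret)
        free(q);
    return ret;
}

/* Lookup and refcount increment happen under the same lock. */
static struct userq *userq_get(int id)
{
    struct userq *q;

    if (id < 0 || id >= MAX_QUEUES)
        return NULL;
    table_lock_acquire();
    q = queue_table[id];
    if (q)
        atomic_fetch_add(&q->refcount, 1);
    table_lock_release();
    return q;
}

/* Lock-free put: whoever drops the last reference frees the queue. */
static void userq_put(struct userq *q)
{
    if (q && atomic_fetch_sub(&q->refcount, 1) == 1) {
        atomic_fetch_add(&destroyed_count, 1);
        free(q);
    }
}

/* Free path: erase from the table first, then drop the table's reference. */
static int userq_destroy(int id)
{
    struct userq *q;

    if (id < 0 || id >= MAX_QUEUES)
        return -1;
    table_lock_acquire();
    q = queue_table[id];
    queue_table[id] = NULL;
    table_lock_release();
    if (!q)
        return -1;
    userq_put(q);
    return 0;
}
```

Because the entry is erased before the table's reference is dropped, a concurrent userq_get() either finds the queue and takes a reference under the lock, or finds nothing. A caller that already holds a reference keeps the object alive until its own userq_put().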

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-22T01:09:51+00:00

This was done entirely with mindless brute force, using git grep -l '\

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21T09:02:28+00:00

This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances:

Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)

Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning "TYPE *".

Signed-off-by: Kees Cook
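To illustrate why a typed return value is useful, here is a userspace analogue of the three forms. It is a sketch only and makes no claim about how the kernel macros are implemented: alloc_obj, alloc_objs and alloc_flex are hypothetical names, they wrap malloc/calloc rather than kmalloc, they take no GFP argument, and they rely on the GNU __typeof__ extension. The structs are made up for the example.

```c
#include <stddef.h>
#include <stdlib.h>

/* One object; T may be a type name or an expression such as *var. */
#define alloc_obj(T) ((__typeof__(T) *)malloc(sizeof(T)))

/* Array of n objects; calloc checks n * size for overflow (and zeroes). */
#define alloc_objs(T, n) ((__typeof__(T) *)calloc((n), sizeof(T)))

/*
 * Struct with a trailing flexible array member FAM of COUNT elements.
 * Unlike the kernel's struct_size(), this sketch does not guard the
 * size computation against overflow.
 */
#define alloc_flex(T, FAM, COUNT) \
    ((__typeof__(T) *)malloc(offsetof(__typeof__(T), FAM) + \
                             (size_t)(COUNT) * sizeof(((__typeof__(T) *)0)->FAM[0])))

/* Example types, invented for this sketch. */
struct point {
    int x;
    int y;
};

struct msg {
    size_t len;
    char data[];
};
```

Since each macro evaluates to a pointer of the requested type instead of "void *", assigning the result to a pointer of a different struct type produces a compiler diagnostic rather than silently compiling.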

drm/amdgpu: validate user queue size constraints

2026-01-29T17:26:15+00:00

Add validation to ensure user queue sizes meet hardware requirements:
- Size must be a power of two for efficient ring buffer wrapping
- Size must be at least AMDGPU_GPU_PAGE_SIZE to prevent undersized allocations

This prevents invalid configurations that could lead to GPU faults or unexpected behavior.

Reviewed-by: Christian König
Signed-off-by: Jesse Zhang
Signed-off-by: Alex Deucher
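The two constraints can be expressed as a small standalone check. This is a sketch, not the driver code: userq_size_valid() is a hypothetical name, and GPU_PAGE_SIZE is defined locally as 4096 on the assumption that it matches AMDGPU_GPU_PAGE_SIZE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match AMDGPU_GPU_PAGE_SIZE; defined locally for the sketch. */
#define GPU_PAGE_SIZE 4096u

/* Hypothetical helper: true when size satisfies both constraints. */
static bool userq_size_valid(uint64_t size)
{
    /* Reject undersized queues; this also rejects zero. */
    if (size < GPU_PAGE_SIZE)
        return false;

    /* A power of two has exactly one bit set, so size & (size - 1) is 0. */
    return (size & (size - 1)) == 0;
}
```

The power-of-two requirement is what makes wrapping cheap: a ring position can be reduced with `pos & (size - 1)` instead of a division.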

drm/amd/amdgpu: Add independent hang detect work for user queue fence

2026-01-20T22:16:12+00:00

In error scenarios (e.g., malformed commands), user queue fences may never be signaled, causing processes to wait indefinitely. To address this while preserving the requirement of infinite fence waits, implement an independent timeout detection mechanism:
1. Initialize a hang detect work when creating a user queue (one-time setup)
2. Start the work with queue-type-specific timeout (gfx/compute/sdma) when the last fence is created via amdgpu_userq_signal_ioctl (per-fence timing)
3. Trigger queue reset logic if the timer expires before the fence is signaled

v2: make timeout per queue type (adev->gfx_timeout vs adev->compute_timeout vs adev->sdma_timeout) to be consistent with kernel queues. (Alex)
v3: The timeout detection must be independent from the fence, e.g. you don't wait for a timeout on the fence but rather have the timeout start as soon as the fence is initialized. (Christian)
v4: replace the timer with the `hang_detect_work` delayed work.

Reviewed-by: Alex Deucher
Acked-by: Christian König
Signed-off-by: Jesse Zhang
Signed-off-by: Alex Deucher
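The three steps above can be modelled with a small state machine driven by an explicit clock value. This is a userspace sketch under several assumptions, not the driver implementation: there is no delayed work here, the caller passes the current time in milliseconds, all names are hypothetical, and the timeout values are placeholders (the commit takes them from adev->gfx_timeout, adev->compute_timeout and adev->sdma_timeout).

```c
#include <stdbool.h>
#include <stdint.h>

enum queue_type { QUEUE_GFX, QUEUE_COMPUTE, QUEUE_SDMA };

struct hang_detect {
    enum queue_type type;
    bool armed;           /* true while a fence is outstanding */
    uint64_t deadline_ms; /* absolute time at which a hang is declared */
};

/* Placeholder per-type timeouts; the real values are device settings. */
static uint64_t queue_timeout_ms(enum queue_type type)
{
    switch (type) {
    case QUEUE_GFX:
        return 10000;
    case QUEUE_COMPUTE:
        return 60000;
    case QUEUE_SDMA:
        return 10000;
    }
    return 10000;
}

/* Step 1: one-time setup when the queue is created. */
static void hang_detect_init(struct hang_detect *hd, enum queue_type type)
{
    hd->type = type;
    hd->armed = false;
    hd->deadline_ms = 0;
}

/* Step 2: (re)start the timeout when the latest fence is created. */
static void hang_detect_start(struct hang_detect *hd, uint64_t now_ms)
{
    hd->deadline_ms = now_ms + queue_timeout_ms(hd->type);
    hd->armed = true;
}

/* The fence signaled in time: stop watching. */
static void hang_detect_cancel(struct hang_detect *hd)
{
    hd->armed = false;
}

/* Step 3: true means the caller would trigger the queue reset path. */
static bool hang_detect_expired(const struct hang_detect *hd, uint64_t now_ms)
{
    return hd->armed && now_ms >= hd->deadline_ms;
}
```

The deadline is tracked separately from any wait on the fence, so a waiter can still block indefinitely while the detector decides on its own when to reset the queue.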

drm/amdgpu: make sure userqs are enabled in userq IOCTLs

2026-01-10T19:21:52+00:00

These IOCTLs shouldn't be called when userqs are not enabled. Make sure they are enabled before executing the IOCTLs.

Reviewed-by: Christian König
Signed-off-by: Alex Deucher

drm/amdgpu: do not use amdgpu_bo_gpu_offset_no_check individually

2025-12-16T18:27:13+00:00

amdgpu_bo_gpu_offset_no_check should not be used individually; use amdgpu_bo_gpu_offset with the BO reserved.

v3 - unpin bo in queue destroy (Christian)
v2 - pin bo so that offset returned won't change after unlock (Christian)

Signed-off-by: Saleemkhan Jamadar
Suggested-by: Christian König
Reviewed-by: Christian König
Signed-off-by: Alex Deucher

drm/amdgpu: Rename userq_mgr_xa to userq_xa

2025-12-08T18:56:39+00:00

Rename since it is an xarray of userq pointers.

Signed-off-by: Lijo Lazar
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher

drm/amdgpu: Clean up userq helper functions

2025-12-08T18:56:39+00:00

Remove userq manager from function signatures. Get the associated manager from userq itself.

Signed-off-by: Lijo Lazar
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher