kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c, branch v6.19.11

drm/amdgpu: Fix CPER ring debugfs read buffer overflow risk

2025-11-26T16:50:43+00:00

The CPER ring debugfs read code always writes a 12-byte header when the file is read for the first time (*offset == 0):

    copy_to_user(buf, ring_header, 12);

But the code never checks whether the user buffer (@size) is at least 12 bytes long.

After writing the 12-byte header, the code then gives the full original @size to the CPER payload handler:

    record_req->buf_size = size;

This means the function can write:

    12 bytes (header) + payload bytes (up to @size)

into a buffer that is only @size bytes big. In other words, the kernel may write more data than the user asked for. This can overflow the user buffer.

The fix is:

- If the user buffer is smaller than 12 bytes on the first read, return -EINVAL instead of copying the header.
- After writing the 12-byte header, subtract 12 from @size and pass the reduced size to record_req->buf_size. This ensures the CPER payload only uses the remaining free space in the buffer.

Reads after the first one (*offset != 0) do not write the header, so their behavior stays exactly the same. The only user-visible change is that tiny buffers now fail safely instead of risking an overflow.

Fixes: drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:523 amdgpu_ras_cper_debugfs_read() warn: userbuf overflow? is 'ring_header_size' <= 'size'
Fixes: 527e3d40339b ("drm/amd/ras: Add CPER ring read for uniras")
Reported by: Dan Carpenter
Cc: Xiang Liu
Cc: Tao Zhou
Cc: Yang Wang
Cc: Christian König
Cc: Alex Deucher
Signed-off-by: Srinivasan Shanmugam
Reviewed-by: Tao Zhou
Signed-off-by: Alex Deucher
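
The bounds check described in this commit message can be modelled in plain userspace C. The following is only a sketch under stated assumptions, not the driver code: cper_read_model() and fill_payload() are hypothetical names, memcpy() stands in for copy_to_user(), the header contents are zero-filled placeholders, and tracking of the payload position across reads is omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define RING_HEADER_SIZE 12 /* header length stated in the commit message */

/* Hypothetical stand-in for the CPER payload handler: copies at most
 * buf_size bytes and returns how many bytes it wrote. */
static size_t fill_payload(char *dst, size_t buf_size,
                           const char *payload, size_t payload_len)
{
    size_t n = payload_len < buf_size ? payload_len : buf_size;

    memcpy(dst, payload, n);
    return n;
}

/* Userspace model of the fixed read path (assumed structure). */
static ssize_t cper_read_model(char *buf, size_t size, long *offset,
                               const char *payload, size_t payload_len)
{
    static const char ring_header[RING_HEADER_SIZE]; /* placeholder bytes */
    size_t written = 0;

    if (*offset == 0) {
        /* First read: refuse buffers that cannot hold the header. */
        if (size < RING_HEADER_SIZE)
            return -EINVAL;

        memcpy(buf, ring_header, RING_HEADER_SIZE);
        buf += RING_HEADER_SIZE;
        size -= RING_HEADER_SIZE; /* payload gets only the remaining space */
        written = RING_HEADER_SIZE;
    }

    written += fill_payload(buf, size, payload, payload_len);
    *offset += (long)written;
    return (ssize_t)written;
}
```

With a 16-byte buffer on the first read, the model writes the 12-byte header and at most 4 payload bytes, so the total never exceeds the size the caller passed in. An 8-byte buffer is rejected with -EINVAL and the offset is left untouched.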

drm/amd/ras: Add CPER ring read for uniras

2025-11-04T16:33:54+00:00

Read CPER raw data from debugfs node "/sys/kernel/debug/dri/*/amdgpu_ring_cper".

Signed-off-by: Xiang Liu
Reviewed-by: Tao Zhou
Reviewed-by: Yang Wang
Signed-off-by: Alex Deucher

drm/amdgpu: move reset debug disable handling

2025-11-04T16:33:54+00:00

Move everything to the supported resets masks rather than having explicit misc checks for this.

Reviewed-by: Jesse Zhang
Signed-off-by: Alex Deucher

drm/amdgpu: Use memset32 for IB padding

2025-10-20T22:25:10+00:00

Use memset32 instead of open coding it, just because it is that bit nicer.

Signed-off-by: Tvrtko Ursulin
Signed-off-by: Alex Deucher
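
To illustrate the kind of change this commit describes, here is a minimal userspace sketch comparing an open-coded padding loop with a single fill call. It is not the driver's code: fill32() is a hypothetical stand-in for the kernel's memset32() (which is not available in userspace), and the function names, parameters, and NOP value are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's memset32(): store v into
 * count consecutive 32-bit words starting at s. */
static uint32_t *fill32(uint32_t *s, uint32_t v, size_t count)
{
    for (size_t i = 0; i < count; i++)
        s[i] = v;
    return s;
}

/* Open-coded variant: pad the unused tail of an IB with NOP words. */
static void pad_ib_open_coded(uint32_t *ib, size_t used_dw,
                              size_t total_dw, uint32_t nop)
{
    while (used_dw < total_dw)
        ib[used_dw++] = nop;
}

/* Same result expressed as one fill call over the tail. */
static void pad_ib_fill(uint32_t *ib, size_t used_dw,
                        size_t total_dw, uint32_t nop)
{
    if (used_dw < total_dw)
        fill32(ib + used_dw, nop, total_dw - used_dw);
}
```

Both variants leave the first used_dw words untouched and set the rest to the NOP value; the second simply states the intent in one call.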

drm/amdgpu: set an error on all fences from a bad context

2025-10-13T18:14:15+00:00

When we back up ring contents to reemit after a queue reset, we don't back up ring contents from the bad context. When we signal the fences, we should set an error on those fences as well.

v2: misc cleanups
v3: add locking for fence error, fix comment (Christian)
v4: fix wrap around, locking (Christian)

Fixes: 77cc0da39c7c ("drm/amdgpu: track ring state associated with a fence")
Reviewed-by: Christian König
Signed-off-by: Alex Deucher

drm/amdgpu: Fix allocating extra dwords for rings (v2)

2025-09-15T20:52:52+00:00

Rename extra_dw to extra_bytes and document what it's for.

The value is already used as if it were bytes in vcn_v4_0.c and in amdgpu_ring_init. Just adjust the dword count in jpeg_v1_0.c so that it becomes a byte count.

v2: Rename extra_dw to extra_bytes as discussed during review.

Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for jpeg ring")
Signed-off-by: Timur Kristóf
Reviewed-by: Christian König
Signed-off-by: Alex Deucher

drm/amdgpu: fix a memory leak in fence cleanup when unloading

2025-09-09T20:10:10+00:00

Commit b61badd20b44 ("drm/amdgpu: fix usage slab after free") reordered when amdgpu_fence_driver_sw_fini() was called. After that patch, amdgpu_fence_driver_sw_fini() effectively became a no-op, as the sched entities were never freed because the ring pointers were already set to NULL. Remove the NULL setting.

Reported-by: Lin.Cao
Cc: Vitaly Prosyak
Cc: Christian König
Fixes: b61badd20b44 ("drm/amdgpu: fix usage slab after free")
Reviewed-by: Christian König
Signed-off-by: Alex Deucher

drm/amdgpu: track whether a queue is a kernel queue in amdgpu_mqd_prop

2025-07-28T20:25:04+00:00

Used to set the MQD appropriately for each queue type. Kernel queues have additional privileges.

Acked-by: Christian König
Reviewed-by: Lijo Lazar
Signed-off-by: Alex Deucher
Cc: stable@vger.kernel.org # 6.16.x

drm/amdgpu: move reset support type checks into the caller

2025-07-17T16:36:56+00:00

Rather than checking in the callbacks, check if the reset type is supported in the caller.

Reviewed-by: Lijo Lazar
Signed-off-by: Alex Deucher

drm/amdgpu: Increase reset counter only on success

2025-07-16T20:14:44+00:00

Increment the reset counter only if soft recovery succeeded. This is consistent with ring hard reset behaviour, where the counter is incremented only if the hard reset succeeded.

Signed-off-by: Lijo Lazar
Reviewed-by: Hawking Zhang
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
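
The behaviour described here can be sketched in a few lines of userspace C. This is an illustration only, not the driver's implementation: struct ring_model, soft_recover_model(), and the boolean outcome parameter are hypothetical names introduced for the example.

```c
#include <stdbool.h>

/* Hypothetical model of the per-device state that tracks resets. */
struct ring_model {
    unsigned int reset_count;
};

/* Count a reset only when the recovery attempt actually succeeded.
 * 'recovered' models the outcome of the soft recovery attempt. */
static bool soft_recover_model(struct ring_model *ring, bool recovered)
{
    if (recovered)
        ring->reset_count++;
    return recovered;
}
```

A failed attempt leaves the counter unchanged, so the counter reflects completed recoveries rather than attempts.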