<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h, branch v7.0-rc7</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0-rc7</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0-rc7'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-30T20:14:11+00:00</updated>
<entry>
<title>drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB</title>
<updated>2026-03-30T20:14:11+00:00</updated>
<author>
<name>Donet Tom</name>
<email>donettom@linux.ibm.com</email>
</author>
<published>2026-03-26T12:21:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4487571ef17a30d274600b3bd6965f497a881299'/>
<id>urn:sha1:4487571ef17a30d274600b3bd6965f497a881299</id>
<content type='text'>
Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while
KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with
4K pages, both values match (8KB), so allocation and reserved space
are consistent.

However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB,
while the reserved trap area remains 8KB. This mismatch causes the
kernel to crash when running rocminfo or rccl unit tests.

Kernel attempted to read user page (2) - exploit attempt? (uid: 1001)
BUG: Kernel NULL pointer dereference on read at 0x00000002
Faulting instruction address: 0xc0000000002c8a64
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E
6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY
Tainted: [E]=UNSIGNED_MODULE
Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006
of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries
NIP:  c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730
REGS: c0000001e0957580 TRAP: 0300 Tainted: G E
MSR:  8000000000009033 &lt;SF,EE,ME,IR,DR,RI,LE&gt; CR: 24008268
XER: 00000036
CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000
IRQMASK: 1
GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100
c00000013d814540
GPR04: 0000000000000002 c00000013d814550 0000000000000045
0000000000000000
GPR08: c00000013444d000 c00000013d814538 c00000013d814538
0000000084002268
GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff
0000000000020000
GPR16: 0000000000000000 0000000000000002 c00000015f653000
0000000000000000
GPR20: c000000138662400 c00000013d814540 0000000000000000
c00000013d814500
GPR24: 0000000000000000 0000000000000002 c0000001e0957888
c0000001e0957878
GPR28: c00000013d814548 0000000000000000 c00000013d814540
c0000001e0957888
NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0
LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00
Call Trace:
0xc0000001e0957890 (unreliable)
__mutex_lock.constprop.0+0x58/0xd00
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu]
kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu]
kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu]
kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu]
kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu]
kfd_ioctl+0x514/0x670 [amdgpu]
sys_ioctl+0x134/0x180
system_call_exception+0x114/0x300
system_call_vectored_common+0x15c/0x2ec

This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and
KFD_CWSR_TBA_TMA_SIZE to the AMD GPU page size. This means we reserve
64 KB for the trap in the address space, but only allocate 8 KB within
it. With this approach, the allocation size never exceeds the reserved
area.

Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole")
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Suggested-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Donet Tom &lt;donettom@linux.ibm.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 31b8de5e55666f26ea7ece5f412b83eab3f56dbb)
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>drm/amdgpu: rework how we handle TLB fences</title>
<updated>2026-03-17T22:03:09+00:00</updated>
<author>
<name>Alex Deucher</name>
<email>alexander.deucher@amd.com</email>
</author>
<published>2026-03-16T15:04:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e9f58ff991dd4be13fd7a651bbf64329c090af09'/>
<id>urn:sha1:e9f58ff991dd4be13fd7a651bbf64329c090af09</id>
<content type='text'>
Add a new VM flag to indicate whether or not we need
a TLB fence.  Userqs (KFD or KGD) require a TLB fence.
A TLB fence is not strictly required for kernel queues,
but it shouldn't hurt.  That said, enabling this
unconditionally should be fine, but it seems to tickle
some issues in KIQ/MES.  Only enable them for KFD,
or when KGD userq queues are enabled (currently via module
parameter).

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4798
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4749
Fixes: f3854e04b708 ("drm/amdgpu: attach tlb fence to the PTs update")
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Prike Liang &lt;Prike.Liang@amd.com&gt;
Reviewed-by: Prike Liang &lt;Prike.Liang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 69c5fbd2b93b5ced77c6e79afe83371bca84c788)
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>drm/amdgpu: fix 4-level paging if GMC supports 57-bit VA v2</title>
<updated>2026-02-19T17:16:12+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2026-02-11T12:28:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=aa25c111a782e711d9eaad0adc3d5e086d175a29'/>
<id>urn:sha1:aa25c111a782e711d9eaad0adc3d5e086d175a29</id>
<content type='text'>
It turned that using 4 level page tables on GMC generations which support
57bit VAs actually doesn't work at all.

Background is that the GMC actually can't switch between 4 and 5 levels,
but rather just uses a subset of address space when less than 5 levels are
selected.

Philip already removed the automatically switch to 4levels, now fix it as
well should it be enabled by module parameters.

v2: fix AMDGPU_GMC_HOLE_MASK as well, fix off by one issue pointed out
    by Philip

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Philip Yang &lt;philip.yang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: GPU vm support 5-level page table</title>
<updated>2025-12-08T18:56:30+00:00</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2025-04-25T14:55:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f6b1c1f5fd7237f77fc3880603ea54dcf0371a20'/>
<id>urn:sha1:f6b1c1f5fd7237f77fc3880603ea54dcf0371a20</id>
<content type='text'>
If GPU supports 5-level page table, but CPU disable 5-level page table
by using boot option no5lvl or CPU feature not available, the virtual
address will be 48bit, not needed to enable 5-level page table on GPU
vm.

If adev-&gt;vm_manager.num_level, number of pde levels, set to 4, then
gfxhub and mmhub register VM_CONTEXTx_CNTL/PAGE_TABLE_DEPTH will set
to 4 to enable 5-level page table in page table walker.

Set vm_manager.root_level to AMDGPU_VM_PDE3, then update GPU mapping
will allocate and update PDE3/PDE2/PDE1/PDE0/PTB 5-level page tables.

If max_level is not 4, no change for the logic to support features
needed by old ASICs.

v2: squash in CONFIG fix

Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Acked-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: add PCIe atomics bit in PTE</title>
<updated>2025-12-08T18:56:29+00:00</updated>
<author>
<name>Mukul Joshi</name>
<email>mukul.joshi@amd.com</email>
</author>
<published>2025-09-10T20:57:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2718942f48d41c4dc1c19a7a20720633191afd2f'/>
<id>urn:sha1:2718942f48d41c4dc1c19a7a20720633191afd2f</id>
<content type='text'>
To enable atomic access to memory, setup the new PCIe atomics bit
in PTE.

Signed-off-by: Mukul Joshi &lt;mukul.joshi@amd.com&gt;
Reviewed-by: Alex Sierra &lt;alex.sierra@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: avoid memory allocation in the critical code path v3</title>
<updated>2025-11-14T16:27:46+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2025-10-29T14:36:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=20459c098d688971237e56e30263ddce467c8c9b'/>
<id>urn:sha1:20459c098d688971237e56e30263ddce467c8c9b</id>
<content type='text'>
When we run out of VMIDs we need to wait for some to become available.
Previously we were using a dma_fence_array for that, but this means that
we have to allocate memory.

Instead just wait for the first not signaled fence from the least recently
used VMID to signal. That is not as efficient since we end up in this
function multiple times again, but allocating memory can easily fail or
deadlock if we have to wait for memory to become available.

v2: remove now unused VM manager fields
v3: fix dma_fence reference

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4258
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: partially revert "revert to old status lock handling v3"</title>
<updated>2025-10-07T18:09:19+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2025-10-06T10:45:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a107aeb6a2150dd552673caefc771e2222d584de'/>
<id>urn:sha1:a107aeb6a2150dd552673caefc771e2222d584de</id>
<content type='text'>
The CI systems are pointing out list corruptions, so we still need to
fix something here.

Keep the asserts, but revert the lock changes for now.

Fixes: 59e4405e9ee2 ("drm/amdgpu: revert to old status lock handling v3")
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Merge amdgpu_vm_set_pasid into amdgpu_vm_init</title>
<updated>2025-10-07T18:09:07+00:00</updated>
<author>
<name>Jesse.Zhang</name>
<email>Jesse.Zhang@amd.com</email>
</author>
<published>2025-09-25T10:02:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b809ca91a5b7eccba065460200512f83cc87987a'/>
<id>urn:sha1:b809ca91a5b7eccba065460200512f83cc87987a</id>
<content type='text'>
As KFD no longer uses a separate PASID, the global amdgpu_vm_set_pasid()function is no longer necessary.
Merge its functionality directly intoamdgpu_vm_init() to simplify code flow and eliminate redundant locking.

v2: remove superflous check
  adjust amdgpu_vm_fin and remove amdgpu_vm_set_pasid (Chritian)

v3: drop amdgpu_vm_assert_locked (Chritian)

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4614
Fixes: 59e4405e9ee2 ("drm/amdgpu: revert to old status lock handling v3")
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;Jesse.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: revert "rework reserved VMID handling" v2</title>
<updated>2025-09-25T19:39:00+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2025-09-19T07:27:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=90e09ea4cfd4aaaf07ababa6d8c880035587e7e9'/>
<id>urn:sha1:90e09ea4cfd4aaaf07ababa6d8c880035587e7e9</id>
<content type='text'>
This reverts commit e44a0fe630c58b0a87d8281f5c1077a3479e5fce.

Initially we used VMID reservation to enforce isolation between
processes. That has now been replaced by proper fence handling.

Both OpenGL, RADV and ROCm developers requested a way to reserve a VMID
for SPM, so restore that approach by reverting back to only allowing a
single process to use the reserved VMID.

Only compile tested for now.

v2: use -ENOENT instead of -EINVAL if VMID is not available

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'amd-drm-next-6.18-2025-09-19' of https://gitlab.freedesktop.org/agd5f/linux into drm-next</title>
<updated>2025-09-21T22:45:51+00:00</updated>
<author>
<name>Dave Airlie</name>
<email>airlied@redhat.com</email>
</author>
<published>2025-09-21T22:44:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=342f141ba9f4c9e39de342d047a5245e8f4cda19'/>
<id>urn:sha1:342f141ba9f4c9e39de342d047a5245e8f4cda19</id>
<content type='text'>
amd-drm-next-6.18-2025-09-19:

amdgpu:
- Fence drv clean up fix
- DPC fixes
- Misc display fixes
- Support the MMIO remap page as a ttm pool
- JPEG parser updates
- UserQ updates
- VCN ctx handling fixes
- Documentation updates
- Misc cleanups
- SMU 13.0.x updates
- SI DPM updates
- GC 11.x cleaner shader updates
- DMCUB updates
- DML fixes
- Improve fallback handling for pixel encoding
- VCN reset improvements
- DCE6 DC updates
- DSC fixes
- Use devm for i2c buses
- GPUVM locking updates
- GPUVM documentation improvements
- Drop non-DC DCE11 code
- S0ix fixes
- Backlight fix
- SR-IOV fixes

amdkfd:
- SVM updates

Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;

From: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Link: https://lore.kernel.org/r/20250919193354.2989255-1-alexander.deucher@amd.com
</content>
</entry>
</feed>
