<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-19T15:15:23+00:00</updated>
<entry>
<title>drm/amdgpu: Fix use-after-free race in VM acquire</title>
<updated>2026-03-19T15:15:23+00:00</updated>
<author>
<name>Alysa Liu</name>
<email>Alysa.Liu@amd.com</email>
</author>
<published>2026-02-05T16:21:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=94b7782d0c8024f5b88454241c8d4777076c3786'/>
<id>urn:sha1:94b7782d0c8024f5b88454241c8d4777076c3786</id>
<content type='text'>
commit 2c1030f2e84885cc58bffef6af67d5b9d2e7098f upstream.

Replace non-atomic vm-&gt;process_info assignment with cmpxchg()
to prevent race when parent/child processes sharing a drm_file
both try to acquire the same VM after fork().

Reviewed-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Signed-off-by: Alysa Liu &lt;Alysa.Liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit c7c573275ec20db05be769288a3e3bb2250ec618)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix double deletion of validate_list</title>
<updated>2026-02-03T22:24:21+00:00</updated>
<author>
<name>Harish Kasiviswanathan</name>
<email>Harish.Kasiviswanathan@amd.com</email>
</author>
<published>2026-01-09T20:26:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6b61a54e684006ca0d92d684a1d3c3a00f077d8f'/>
<id>urn:sha1:6b61a54e684006ca0d92d684a1d3c3a00f077d8f</id>
<content type='text'>
If amdgpu_amdkfd_gpuvm_free_memory_of_gpu() fails after kgd_mem is
removed from validate_list, the mem handle still lingers in the KFD idr.
This means when process is terminated,
kfd_process_free_outstanding_kfd_bos() will call
amdgpu_amdkfd_gpuvm_free_memory_of_gpu() again resulting in double
deletion.

To avoid this -
 (a) Check if list is empty before deleting it
 (b) Rearragne amdgpu_amdkfd_gpuvm_free_memory_of_gpu() such that it can
     be safely called again if it returns failure the first time.

Signed-off-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Reviewed-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 6ba60345f45eaf7cb4f89105d26083a4b9fd1cba)
</content>
</entry>
<entry>
<title>drm/amdkfd: Don't clear PT after process killed</title>
<updated>2025-11-04T16:53:22+00:00</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2025-10-31T14:50:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=10c382ec6c6d1e11975a11962bec21cba6360391'/>
<id>urn:sha1:10c382ec6c6d1e11975a11962bec21cba6360391</id>
<content type='text'>
If process is killed. the vm entity is stopped, submit pt update job
will trigger the error message "*ERROR* Trying to push to a killed
entity", job will not execute.

Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix vram_usage underflow</title>
<updated>2025-10-20T22:25:22+00:00</updated>
<author>
<name>Alysa Liu</name>
<email>Alysa.Liu@amd.com</email>
</author>
<published>2025-10-10T21:18:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bd07b3f08a0c9c91a33143215eda8a57f575dc2a'/>
<id>urn:sha1:bd07b3f08a0c9c91a33143215eda8a57f575dc2a</id>
<content type='text'>
vram_usage was subtracting non-vram memory size,
which caused it to become negative.

Signed-off-by: Alysa Liu &lt;Alysa.Liu@amd.com&gt;
Reviewed-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: update the functions to use amdgpu version of hmm</title>
<updated>2025-10-13T18:14:36+00:00</updated>
<author>
<name>Sunil Khatri</name>
<email>sunil.khatri@amd.com</email>
</author>
<published>2025-10-10T12:39:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=737da5363cc07c96d59f2ebaf9f9f87235becf1d'/>
<id>urn:sha1:737da5363cc07c96d59f2ebaf9f9f87235becf1d</id>
<content type='text'>
At times we need a bo reference for hmm and for that add
a new struct amdgpu_hmm_range which will hold an optional
bo member and hmm_range.

Use amdgpu_hmm_range instead of hmm_range and let the bo
as an optional argument for the caller if they want to
the bo reference to be taken or they want to handle that
explicitly.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: clean up amdgpu hmm range functions</title>
<updated>2025-10-13T18:14:28+00:00</updated>
<author>
<name>Sunil Khatri</name>
<email>sunil.khatri@amd.com</email>
</author>
<published>2025-09-30T08:15:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b1dd0db1c668a33112bfb26618c090163700e368'/>
<id>urn:sha1:b1dd0db1c668a33112bfb26618c090163700e368</id>
<content type='text'>
Clean up the amdgpu hmm range functions for clearer
definition of each.

a. Split amdgpu_ttm_tt_get_user_pages_done into two:
   1. amdgpu_hmm_range_valid: To check if the user pages
      are valid and update seq num
   2. amdgpu_hmm_range_free: Clean up the hmm range
      and pfn memory.

b. amdgpu_ttm_tt_get_user_pages_done and
   amdgpu_ttm_tt_discard_user_pages are similar function so remove
   discard and directly use amdgpu_hmm_range_free to clean up the
   hmm range and pfn memory.

Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: use user provided hmm_range buffer in amdgpu_ttm_tt_get_user_pages</title>
<updated>2025-10-13T18:14:28+00:00</updated>
<author>
<name>Sunil Khatri</name>
<email>sunil.khatri@amd.com</email>
</author>
<published>2025-09-24T06:53:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e095b55155ef69a8ae0eb114a7fd2a381c012f33'/>
<id>urn:sha1:e095b55155ef69a8ae0eb114a7fd2a381c012f33</id>
<content type='text'>
update the amdgpu_ttm_tt_get_user_pages and all dependent function
along with it callers to use a user allocated hmm_range buffer instead
hmm layer allocates the buffer.

This is a need to get hmm_range pointers easily accessible
without accessing the bo and that is a requirement for the
userqueue to lock the userptrs effectively.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: use atomic functions with memory barriers for vm fault info</title>
<updated>2025-10-13T18:14:15+00:00</updated>
<author>
<name>Gui-Dong Han</name>
<email>hanguidong02@gmail.com</email>
</author>
<published>2025-10-08T03:43:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6df8e84aa6b5b1812cc2cacd6b3f5ccbb18cda2b'/>
<id>urn:sha1:6df8e84aa6b5b1812cc2cacd6b3f5ccbb18cda2b</id>
<content type='text'>
The atomic variable vm_fault_info_updated is used to synchronize access to
adev-&gt;gmc.vm_fault_info between the interrupt handler and
get_vm_fault_info().

The default atomic functions like atomic_set() and atomic_read() do not
provide memory barriers. This allows for CPU instruction reordering,
meaning the memory accesses to vm_fault_info and the vm_fault_info_updated
flag are not guaranteed to occur in the intended order. This creates a
race condition that can lead to inconsistent or stale data being used.

The previous implementation, which used an explicit mb(), was incomplete
and inefficient. It failed to account for all potential CPU reorderings,
such as the access of vm_fault_info being reordered before the atomic_read
of the flag. This approach is also more verbose and less performant than
using the proper atomic functions with acquire/release semantics.

Fix this by switching to atomic_set_release() and atomic_read_acquire().
These functions provide the necessary acquire and release semantics,
which act as memory barriers to ensure the correct order of operations.
It is also more efficient and idiomatic than using explicit full memory
barriers.

Fixes: b97dfa27ef3a ("drm/amdgpu: save vm fault information for amdkfd")
Cc: stable@vger.kernel.org
Signed-off-by: Gui-Dong Han &lt;hanguidong02@gmail.com&gt;
Signed-off-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix kfd process ref leaking when userptr unmapping</title>
<updated>2025-10-07T18:09:06+00:00</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2025-05-27T15:09:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=58e6fc2fb94f0f409447e5d46cf6a417b6397fbc'/>
<id>urn:sha1:58e6fc2fb94f0f409447e5d46cf6a417b6397fbc</id>
<content type='text'>
kfd_lookup_process_by_pid hold the kfd process reference to ensure it
doesn't get destroyed while sending the segfault event to user space.

Calling kfd_lookup_process_by_pid as function parameter leaks the kfd
process refcount and miss the NULL pointer check if app process is
already destroyed.

Fixes: 2d274bf7099b ("amd/amdkfd: Trigger segfault for early userptr unmmapping")
Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: use hmm_pfns instead of array of pages</title>
<updated>2025-09-23T14:22:31+00:00</updated>
<author>
<name>Sunil Khatri</name>
<email>sunil.khatri@amd.com</email>
</author>
<published>2025-09-17T14:42:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c5b3cc417b0260abc74ed32f6baa626c9de917c0'/>
<id>urn:sha1:c5b3cc417b0260abc74ed32f6baa626c9de917c0</id>
<content type='text'>
we dont need to allocate local array of pages to hold
the pages returned by the hmm, instead we could use
the hmm_range structure itself to get to hmm_pfn
and get the required pages directly.

This avoids call to alloc/free quite a lot.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
