<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c, branch v6.1.168</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.168</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.168'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-04-11T12:16:12+00:00</updated>
<entry>
<title>drm/amdgpu: Fix fence put before wait in amdgpu_amdkfd_submit_ib</title>
<updated>2026-04-11T12:16:12+00:00</updated>
<author>
<name>Srinivasan Shanmugam</name>
<email>srinivasan.shanmugam@amd.com</email>
</author>
<published>2026-03-23T08:11:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bc7760c107dc08ef3e231d72c492e67b0a86848b'/>
<id>urn:sha1:bc7760c107dc08ef3e231d72c492e67b0a86848b</id>
<content type='text'>
[ Upstream commit 7150850146ebfa4ca998f653f264b8df6f7f85be ]

amdgpu_amdkfd_submit_ib() submits a GPU job and gets a fence
from amdgpu_ib_schedule(). This fence is used to wait for job
completion.

Currently, the code drops the fence reference using dma_fence_put()
before calling dma_fence_wait().

If dma_fence_put() releases the last reference, the fence may be
freed before dma_fence_wait() is called. This can lead to a
use-after-free.

Fix this by waiting on the fence first and releasing the reference
only after dma_fence_wait() completes.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c:697 amdgpu_amdkfd_submit_ib() warn: passing freed memory 'f' (line 696)

Fixes: 9ae55f030dc5 ("drm/amdgpu: Follow up change to previous drm scheduler change.")
Cc: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Cc: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Srinivasan Shanmugam &lt;srinivasan.shanmugam@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 8b9e5259adc385b61a6590a13b82ae0ac2bd3482)
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: cleanup scheduler job initialization v2</title>
<updated>2026-01-11T14:19:22+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2026-01-07T11:54:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=49aa99f05dbc75b9ae360a74648c420a80f7ee49'/>
<id>urn:sha1:49aa99f05dbc75b9ae360a74648c420a80f7ee49</id>
<content type='text'>
[ Upstream commit f7d66fb2ea43a3016e78a700a2ca6c77a74579f9 ]

Init the DRM scheduler base class while allocating the job.

This makes the whole handling much more cleaner.

v2: fix coding style

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-7-christian.koenig@amd.com
Stable-dep-of: 4fa944255be5 ("drm/amdgpu: add missing lock to amdgpu_ttm_access_memory_sdma")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: amdkfd_free_gtt_mem clear the correct pointer</title>
<updated>2024-11-17T14:07:22+00:00</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2024-07-14T15:11:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e7831613cbbcd9058d3658fbcdc5d5884ceb2e0c'/>
<id>urn:sha1:e7831613cbbcd9058d3658fbcdc5d5884ceb2e0c</id>
<content type='text'>
commit c86ad39140bbcb9dc75a10046c2221f657e8083b upstream.

Pass pointer reference to amdgpu_bo_unref to clear the correct pointer,
otherwise amdgpu_bo_unref clear the local variable, the original pointer
not set to NULL, this could cause use-after-free bug.

Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Vamsi Krishna Brahmajosyula &lt;vamsi-krishna.brahmajosyula@broadcom.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: disable GFXOFF during compute for GFX11</title>
<updated>2022-11-02T21:16:11+00:00</updated>
<author>
<name>Graham Sider</name>
<email>Graham.Sider@amd.com</email>
</author>
<published>2022-10-26T19:08:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a3e5ce56f3d260f2ec8e5242c33f57e60ae9eba7'/>
<id>urn:sha1:a3e5ce56f3d260f2ec8e5242c33f57e60ae9eba7</id>
<content type='text'>
Temporary workaround to fix issues observed in some compute applications
when GFXOFF is enabled on GFX11.

Signed-off-by: Graham Sider &lt;Graham.Sider@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org # 6.0.x
</content>
</entry>
<entry>
<title>Revert "drm/amdgpu: let mode2 reset fallback to default when failure"</title>
<updated>2022-10-19T02:08:33+00:00</updated>
<author>
<name>Victor Zhao</name>
<email>Victor.Zhao@amd.com</email>
</author>
<published>2022-10-13T03:06:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a340847b0214aa9b8fd9839f7b2822ccc607edab'/>
<id>urn:sha1:a340847b0214aa9b8fd9839f7b2822ccc607edab</id>
<content type='text'>
This reverts commit dac6b80818ac2353631c5a33d140d8d5508e2957.

This commit reverted the AMDGPU_SKIP_MODE2_RESET as it conflicts with
the original design of reset handler. Will redesign it.

Fixes: dac6b80818ac23 ("drm/amdgpu: let mode2 reset fallback to default when failure")
Signed-off-by: Victor Zhao &lt;Victor.Zhao@amd.com&gt;
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Correct amdgpu_amdkfd_total_mem_size calculation</title>
<updated>2022-10-06T16:08:18+00:00</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2022-10-03T21:53:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2302d507149f0ae7cc697089ab5675a2d4cf9d2a'/>
<id>urn:sha1:2302d507149f0ae7cc697089ab5675a2d4cf9d2a</id>
<content type='text'>
amdkfd_total_mem_size is the size of total GPUs vram plus system memory
to estimate page tables memory usage and leave enough VRAM room for page
tables allocation.

Calculate amdkfd_total_mem_size in amdgpu_amdkfd_device_probe is
incorrect because adev-&gt;gmc.real_vram_size is still 0 called from
amdgpu_device_ip_early_init. Move the calculation
to amdgpu_amdkfd_device_init to get the correct VRAM size.

Do reverse calculation in amdgpu_amdkfd_device_fini_sw to support
hot-unplugging GPUs.

Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: add page retirement handling for CPU RAS</title>
<updated>2022-09-29T13:42:36+00:00</updated>
<author>
<name>Tao Zhou</name>
<email>tao.zhou1@amd.com</email>
</author>
<published>2022-09-21T09:03:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5e1fdf76cf9b1b764d6061d78c29901d774fc061'/>
<id>urn:sha1:5e1fdf76cf9b1b764d6061d78c29901d774fc061</id>
<content type='text'>
Do RAS page retirement in poison consumption handler unconditionally.

Signed-off-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: add gang submit frontend v6</title>
<updated>2022-09-20T16:40:46+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2022-03-02T15:39:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4624459c84d71e0d5f94ea6a7b2c4eec4f1d122b'/>
<id>urn:sha1:4624459c84d71e0d5f94ea6a7b2c4eec4f1d122b</id>
<content type='text'>
Allows submitting jobs as gang which needs to run on multiple engines at the
same time.

All members of the gang get the same implicit, explicit and VM dependencies. So
no gang member will start running until everything else is ready.

The last job is considered the gang leader (usually a submission to the GFX
ring) and used for signaling output dependencies.

Each job is remembered individually as user of a buffer object, so there is no
joining of work at the end.

v2: rebase and fix review comments from Andrey and Yogesh
v3: use READ instead of BOOKKEEP for now because of VM unmaps, set gang
    leader only when necessary
v4: fix order of pushing jobs and adding fences found by Trigger.
v5: fix job index calculation and adding IBs to jobs
v6: fix typo found by Alex

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: cleanup coding style in amdgpu_amdkfd.c</title>
<updated>2022-09-13T18:32:58+00:00</updated>
<author>
<name>Jingyu Wang</name>
<email>jingyuwang_vip@163.com</email>
</author>
<published>2022-09-05T07:56:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2f4ca1ba6c9e7a4c2eea2ed8a378817ec1946f4f'/>
<id>urn:sha1:2f4ca1ba6c9e7a4c2eea2ed8a378817ec1946f4f</id>
<content type='text'>
Fix everything checkpatch.pl complained about in amdgpu_amdkfd.c

Signed-off-by: Jingyu Wang &lt;jingyuwang_vip@163.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: let mode2 reset fallback to default when failure</title>
<updated>2022-08-16T22:14:31+00:00</updated>
<author>
<name>Victor Zhao</name>
<email>Victor.Zhao@amd.com</email>
</author>
<published>2022-07-28T02:39:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dac6b80818ac2353631c5a33d140d8d5508e2957'/>
<id>urn:sha1:dac6b80818ac2353631c5a33d140d8d5508e2957</id>
<content type='text'>
- introduce AMDGPU_SKIP_MODE2_RESET flag
- let mode2 reset fallback to default reset method if failed

v2: move this part out from the asic specific part

Signed-off-by: Victor Zhao &lt;Victor.Zhao@amd.com&gt;
Acked-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
