<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdkfd, branch v6.6.12</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.12</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.12'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2023-12-13T17:45:11+00:00</updated>
<entry>
<title>drm/amdkfd: get doorbell's absolute offset based on the db_size</title>
<updated>2023-12-13T17:45:11+00:00</updated>
<author>
<name>Arvind Yadav</name>
<email>Arvind.Yadav@amd.com</email>
</author>
<published>2023-10-09T17:13:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=45d72eadf21ab807f477e7fffd5767b17c7a2aaf'/>
<id>urn:sha1:45d72eadf21ab807f477e7fffd5767b17c7a2aaf</id>
<content type='text'>
[ Upstream commit 367a0af43373d4f791cc8b466a659ecf5aa52377 ]

Here, Adding db_size in byte to find the doorbell's
absolute offset for both 32-bit and 64-bit doorbell sizes.
So that doorbell offset will be aligned based on the doorbell
size.

v2:
- Addressed the review comment from Felix.
v3:
- Adding doorbell_size as parameter to get db absolute offset.
v4:
  Squash the two patches into one.

Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Shashank Sharma &lt;shashank.sharma@amd.com&gt;
Signed-off-by: Arvind Yadav &lt;Arvind.Yadav@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix shift out-of-bounds issue</title>
<updated>2023-11-28T17:19:41+00:00</updated>
<author>
<name>Jesse Zhang</name>
<email>jesse.zhang@amd.com</email>
</author>
<published>2023-10-20T01:43:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=56649c43d40ce0147465a2d5756d300e87f9ee1c'/>
<id>urn:sha1:56649c43d40ce0147465a2d5756d300e87f9ee1c</id>
<content type='text'>
[ Upstream commit 282c1d793076c2edac6c3db51b7e8ed2b41d60a5 ]

[  567.613292] shift exponent 255 is too large for 64-bit type 'long unsigned int'
[  567.614498] CPU: 5 PID: 238 Comm: kworker/5:1 Tainted: G           OE      6.2.0-34-generic #34~22.04.1-Ubuntu
[  567.614502] Hardware name: AMD Splinter/Splinter-RPL, BIOS WS43927N_871 09/25/2023
[  567.614504] Workqueue: events send_exception_work_handler [amdgpu]
[  567.614748] Call Trace:
[  567.614750]  &lt;TASK&gt;
[  567.614753]  dump_stack_lvl+0x48/0x70
[  567.614761]  dump_stack+0x10/0x20
[  567.614763]  __ubsan_handle_shift_out_of_bounds+0x156/0x310
[  567.614769]  ? srso_alias_return_thunk+0x5/0x7f
[  567.614773]  ? update_sd_lb_stats.constprop.0+0xf2/0x3c0
[  567.614780]  svm_range_split_by_granularity.cold+0x2b/0x34 [amdgpu]
[  567.615047]  ? srso_alias_return_thunk+0x5/0x7f
[  567.615052]  svm_migrate_to_ram+0x185/0x4d0 [amdgpu]
[  567.615286]  do_swap_page+0x7b6/0xa30
[  567.615291]  ? srso_alias_return_thunk+0x5/0x7f
[  567.615294]  ? __free_pages+0x119/0x130
[  567.615299]  handle_pte_fault+0x227/0x280
[  567.615303]  __handle_mm_fault+0x3c0/0x720
[  567.615311]  handle_mm_fault+0x119/0x330
[  567.615314]  ? lock_mm_and_find_vma+0x44/0x250
[  567.615318]  do_user_addr_fault+0x1a9/0x640
[  567.615323]  exc_page_fault+0x81/0x1b0
[  567.615328]  asm_exc_page_fault+0x27/0x30
[  567.615332] RIP: 0010:__get_user_8+0x1c/0x30

Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Suggested-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Yifan Zhang &lt;yifan1.zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix a race condition of vram buffer unref in svm code</title>
<updated>2023-11-28T17:19:40+00:00</updated>
<author>
<name>Xiaogang Chen</name>
<email>xiaogang.chen@amd.com</email>
</author>
<published>2023-09-27T16:20:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fc0210720127cc6302e6d6f3de48f49c3fcf5659'/>
<id>urn:sha1:fc0210720127cc6302e6d6f3de48f49c3fcf5659</id>
<content type='text'>
[ Upstream commit 709c348261618da7ed89d6c303e2ceb9e453ba74 ]

prange-&gt;svm_bo unref can happen in both mmu callback and a callback after
migrate to system ram. Both are async call in different tasks. Sync svm_bo
unref operation to avoid random "use-after-free".

Signed-off-by: Xiaogang Chen &lt;xiaogang.chen@amd.com&gt;
Reviewed-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Jesse Zhang &lt;Jesse.Zhang@amd.com&gt;
Tested-by: Jesse Zhang &lt;Jesse.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: ratelimited SQ interrupt messages</title>
<updated>2023-11-28T17:19:39+00:00</updated>
<author>
<name>Harish Kasiviswanathan</name>
<email>Harish.Kasiviswanathan@amd.com</email>
</author>
<published>2023-08-10T16:10:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=de1ce1081b7881044c46012ca27c1f2d5bb27f2d'/>
<id>urn:sha1:de1ce1081b7881044c46012ca27c1f2d5bb27f2d</id>
<content type='text'>
[ Upstream commit 37fb87910724f21a1f27a75743d4f9accdee77fb ]

No functional change. Use ratelimited version of pr_ to avoid
overflowing of dmesg buffer

Signed-off-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Reviewed-by: Philip Yang &lt;philip.yang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Handle errors from svm validate and map</title>
<updated>2023-11-20T10:59:10+00:00</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2023-09-13T14:10:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=49ca8b9035ff43ce0673cc5d1b4e5c34451172b3'/>
<id>urn:sha1:49ca8b9035ff43ce0673cc5d1b4e5c34451172b3</id>
<content type='text'>
[ Upstream commit eb3c357bcb286e89386e89302061fe717fe4e562 ]

If new range is splited to multiple pranges with max_svm_range_pages
alignment and added to update_list, svm validate and map should keep
going after error to make sure prange-&gt;mapped_to_gpu flag is up to date
for the whole range.

svm validate and map update set prange-&gt;mapped_to_gpu after mapping to
GPUs successfully, otherwise clear prange-&gt;mapped_to_gpu flag (for
update mapping case) instead of setting error flag, we can remove
the redundant error flag to simpliy code.

Refactor to remove goto and update prange-&gt;mapped_to_gpu flag inside
svm_range_lock, to guarant we always evict queues or unmap from GPUs if
there are invalid ranges.

After svm validate and map return error -EAGIN, the caller retry will
update the mapping for the whole range again.

Fixes: c22b04407097 ("drm/amdkfd: flag added to handle errors from svm validate and map")
Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Tested-by: James Zhu &lt;james.zhu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Remove svm range validated_once flag</title>
<updated>2023-11-20T10:59:10+00:00</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2023-08-01T15:38:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=33b9170fef2b0fa31fb2a5dba922f95d512b4522'/>
<id>urn:sha1:33b9170fef2b0fa31fb2a5dba922f95d512b4522</id>
<content type='text'>
[ Upstream commit c99b16128082de519975aa147d9da3e40380de67 ]

The validated_once flag is not used after the prefault was removed, The
prefault was needed to ensure validate all system memory pages at least
once before mapping or migrating the range to GPU.

Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Stable-dep-of: eb3c357bcb28 ("drm/amdkfd: Handle errors from svm validate and map")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: fix some race conditions in vram buffer alloc/free of svm code</title>
<updated>2023-11-20T10:59:10+00:00</updated>
<author>
<name>Xiaogang Chen</name>
<email>xiaogang.chen@amd.com</email>
</author>
<published>2023-09-20T16:02:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=da27d73ec90b6ebb6587fabc372d1466cc622313'/>
<id>urn:sha1:da27d73ec90b6ebb6587fabc372d1466cc622313</id>
<content type='text'>
[ Upstream commit 7bfaa160caed8192f8262c4638f552cad94bcf5a ]

This patch fixes:
1: ref number of prange's svm_bo got decreased by an async call from hmm. When
wait svm_bo of prange got released we shoul also wait prang-&gt;svm_bo become NULL,
otherwise prange-&gt;svm_bo may be set to null after allocate new vram buffer.

2: During waiting svm_bo of prange got released in a while loop should reschedule
current task to give other tasks oppotunity to run, specially the the workque
task that handles svm_bo ref release, otherwise we may enter to softlock.

Signed-off-by: Xiaogang.Chen &lt;xiaogang.chen@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Use gpu_offset for user queue's wptr</title>
<updated>2023-09-20T21:30:42+00:00</updated>
<author>
<name>YuBiao Wang</name>
<email>YuBiao.Wang@amd.com</email>
</author>
<published>2023-09-15T02:47:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cc39f9ccb82426e576734b493e1777ea01b144a8'/>
<id>urn:sha1:cc39f9ccb82426e576734b493e1777ea01b144a8</id>
<content type='text'>
Directly use tbo's start address will miss the domain start offset. Need
to use gpu_offset instead.

Signed-off-by: YuBiao Wang &lt;YuBiao.Wang@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>drm/amdkfd: Insert missing TLB flush on GFX10 and later</title>
<updated>2023-09-12T21:45:40+00:00</updated>
<author>
<name>Harish Kasiviswanathan</name>
<email>Harish.Kasiviswanathan@amd.com</email>
</author>
<published>2023-09-11T18:49:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=edcfe22985d09ee8e2346c9217f5a52ab150099f'/>
<id>urn:sha1:edcfe22985d09ee8e2346c9217f5a52ab150099f</id>
<content type='text'>
Heavy-weight TLB flush is required after unmap on all GPUs for
correctness and security.

Signed-off-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>drm/amdkfd: Checkpoint and restore queues on GFX11</title>
<updated>2023-09-11T22:22:38+00:00</updated>
<author>
<name>David Francis</name>
<email>David.Francis@amd.com</email>
</author>
<published>2022-11-22T20:14:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9296da8c40900b4dae3d973aa22be306e2a77671'/>
<id>urn:sha1:9296da8c40900b4dae3d973aa22be306e2a77671</id>
<content type='text'>
The code in kfd_mqd_manager_v11.c to support criu dump and
restore of queue state was missing.

Added it; should be equivalent to kfd_mqd_manager_v10.c.

CC: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Reviewed-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: David Francis &lt;David.Francis@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
