<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdkfd, branch v6.0.14</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.0.14</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.0.14'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2022-12-02T16:43:11+00:00</updated>
<entry>
<title>drm/amdkfd: update GFX11 CWSR trap handler</title>
<updated>2022-12-02T16:43:11+00:00</updated>
<author>
<name>Jay Cornwall</name>
<email>jay.cornwall@amd.com</email>
</author>
<published>2022-10-14T02:41:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a53e0dccc3236f60359d93544a70a8c0b3a2bc36'/>
<id>urn:sha1:a53e0dccc3236f60359d93544a70a8c0b3a2bc36</id>
<content type='text'>
[ Upstream commit 6640f8e5adb69a0550fe1d224d3ac64c10f00eef ]

With corresponding FW change fixes issue where triggering CWSR on a
workgroup with waves in s_barrier wouldn't lead to a back-off and
therefore cause a hang.

Signed-off-by: Jay Cornwall &lt;jay.cornwall@amd.com&gt;
Tested-by: Graham Sider &lt;Graham.Sider@amd.com&gt;
Acked-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Acked-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Graham Sider &lt;Graham.Sider@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Enable SA software trap.</title>
<updated>2022-12-02T16:43:11+00:00</updated>
<author>
<name>David Belanger</name>
<email>david.belanger@amd.com</email>
</author>
<published>2022-08-25T14:39:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a254daf7a473a9d9578b82d34a6c1b65a2a067f1'/>
<id>urn:sha1:a254daf7a473a9d9578b82d34a6c1b65a2a067f1</id>
<content type='text'>
[ Upstream commit 585a82618bc422508c0c8ae0dfe2f76f22c28361 ]

Enables support for software trap for MES &gt;= 4.
Adapted from implementation from Jay Cornwall.

v2: Add IP version check in conditions.
v3: Remove debugger code changes.

Signed-off-by: Jay Cornwall &lt;Jay.Cornwall@amd.com&gt;
Signed-off-by: David Belanger &lt;david.belanger@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Stable-dep-of: 6640f8e5adb6 ("drm/amdkfd: update GFX11 CWSR trap handler")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Migrate in CPU page fault use current mm</title>
<updated>2022-11-16T09:04:14+00:00</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2022-09-08T21:56:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=128e284c6cccf5875261569fa3bb07558870c17f'/>
<id>urn:sha1:128e284c6cccf5875261569fa3bb07558870c17f</id>
<content type='text'>
commit 3a876060892ba52dd67d197c78b955e62657d906 upstream.

migrate_vma_setup shows below warning because we don't hold another
process mm mmap_lock. We should use current vmf-&gt;vma-&gt;vm_mm instead, the
caller already hold current mmap lock inside CPU page fault handler.

 WARNING: CPU: 10 PID: 3054 at include/linux/mmap_lock.h:155 find_vma
 Call Trace:
  walk_page_range+0x76/0x150
  migrate_vma_setup+0x18a/0x640
  svm_migrate_vram_to_ram+0x245/0xa10 [amdgpu]
  svm_migrate_to_ram+0x36f/0x470 [amdgpu]
  do_swap_page+0xcfe/0xec0
  __handle_mm_fault+0x96b/0x15e0
  handle_mm_fault+0x13f/0x3e0
  do_user_addr_fault+0x1e7/0x690

Fixes: e1f84eef313f ("drm/amdkfd: handle CPU fault on COW mapping")
Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix error handling in kfd_criu_restore_events</title>
<updated>2022-11-16T09:04:08+00:00</updated>
<author>
<name>Felix Kuehling</name>
<email>Felix.Kuehling@amd.com</email>
</author>
<published>2022-11-03T21:01:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0a35e62c408a4f43fd0efc59ebbaf470cd9ff94b'/>
<id>urn:sha1:0a35e62c408a4f43fd0efc59ebbaf470cd9ff94b</id>
<content type='text'>
commit 66f7903779fbbc620bf1040017e4833ef6a0b541 upstream.

mutex_unlock before the exit label because all the error code paths that
jump there didn't take that lock. This fixes unbalanced locking errors
in case of restore errors.

Fixes: 40e8a766a761 ("drm/amdkfd: CRIU checkpoint and restore events")
Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Rajneesh Bhardwaj &lt;rajneesh.bhardwaj@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix error handling in criu_checkpoint</title>
<updated>2022-11-16T09:04:08+00:00</updated>
<author>
<name>Felix Kuehling</name>
<email>Felix.Kuehling@amd.com</email>
</author>
<published>2022-11-01T19:02:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=182476d29def13b52d6df423550384ec5f8d728b'/>
<id>urn:sha1:182476d29def13b52d6df423550384ec5f8d728b</id>
<content type='text'>
commit b91c23e099f0b65d62159da13458c5eefa76083f upstream.

Checkpoint BOs last. That way we don't need to close dmabuf FDs if
something else fails later. This avoids problematic access to user mode
memory in the error handling code path.

criu_checkpoint_bos has its own error handling and cleanup that does not
depend on access to user memory.

In the private data, keep BOs before the remaining objects. This is
necessary to restore things in the correct order as restoring events
depends on the events-page BO being restored first.

Fixes: be072b06c739 ("drm/amdkfd: CRIU export BOs as prime dmabuf objects")
Reported-by: Jann Horn &lt;jannh@google.com&gt;
CC: Rajneesh Bhardwaj &lt;Rajneesh.Bhardwaj@amd.com&gt;
Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-and-tested-by: Rajneesh Bhardwaj &lt;rajneesh.bhardwaj@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix NULL pointer dereference in svm_migrate_to_ram()</title>
<updated>2022-11-16T09:03:49+00:00</updated>
<author>
<name>Yang Li</name>
<email>yang.lee@linux.alibaba.com</email>
</author>
<published>2022-10-26T02:00:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=613d5a9a440828970f1543b962779401ac2c9c62'/>
<id>urn:sha1:613d5a9a440828970f1543b962779401ac2c9c62</id>
<content type='text'>
[ Upstream commit 5b994354af3cab770bf13386469c5725713679af ]

./drivers/gpu/drm/amd/amdkfd/kfd_migrate.c:985:58-62: ERROR: p is NULL but dereferenced.

Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2549
Reported-by: Abaci Robot &lt;abaci@linux.alibaba.com&gt;
Signed-off-by: Yang Li &lt;yang.lee@linux.alibaba.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: handle CPU fault on COW mapping</title>
<updated>2022-11-16T09:03:49+00:00</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2022-09-07T16:30:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=883584df581987e7176bfae1c8b276f73c2ca28d'/>
<id>urn:sha1:883584df581987e7176bfae1c8b276f73c2ca28d</id>
<content type='text'>
[ Upstream commit e1f84eef313f4820cca068a238c645d0a38c6a9b ]

If CPU page fault in a page with zone_device_data svm_bo from another
process, that means it is COW mapping in the child process and the
range is migrated to VRAM by parent process. Migrate the parent
process range back to system memory to recover the CPU page fault.

Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Stable-dep-of: 5b994354af3c ("drm/amdkfd: Fix NULL pointer dereference in svm_migrate_to_ram()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: correct the cache info for gfx1036</title>
<updated>2022-11-03T15:00:21+00:00</updated>
<author>
<name>Jesse Zhang</name>
<email>jesse.zhang@amd.com</email>
</author>
<published>2022-10-11T05:23:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f01acf19b8935e1db4892417202a707b87ddb739'/>
<id>urn:sha1:f01acf19b8935e1db4892417202a707b87ddb739</id>
<content type='text'>
commit 969758bbf5e9360b63bbb2328ac3fda46bbbc9f5 upstream.

correct the cache information for gfx1036

Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Yifan Zhang &lt;yifan1.zhang@amd.com&gt;
Signed-off-by: Yifan Zhang &lt;yifan1.zhang@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: update gfx1037 Lx cache setting</title>
<updated>2022-11-03T15:00:21+00:00</updated>
<author>
<name>Prike Liang</name>
<email>Prike.Liang@amd.com</email>
</author>
<published>2022-10-20T06:44:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e60a87e401e5c06cc492bd91b54d579c78df46b3'/>
<id>urn:sha1:e60a87e401e5c06cc492bd91b54d579c78df46b3</id>
<content type='text'>
commit 9656db1b933caf6ffaaef10322093fe018359090 upstream.

Update the gfx1037 L1/L2 cache setting.

Signed-off-by: Prike Liang &lt;Prike.Liang@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix UBSAN shift-out-of-bounds warning</title>
<updated>2022-10-21T10:39:17+00:00</updated>
<author>
<name>Felix Kuehling</name>
<email>Felix.Kuehling@amd.com</email>
</author>
<published>2022-09-21T21:45:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=447524eea46e8466228b33da144a405af2973274'/>
<id>urn:sha1:447524eea46e8466228b33da144a405af2973274</id>
<content type='text'>
[ Upstream commit b292cafe2dd02d96a07147e4b160927e8399d5cc ]

This was fixed in initialize_cpsch before, but not in initialize_nocpsch.
Factor sdma bitmap initialization into a helper function to apply the
correct implementation in both cases without duplicating it.

v2: Added a range check

Reported-by: Ellis Michael &lt;ellis@ellismichael.com&gt;
Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Graham Sider &lt;Graham.Sider@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
