<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdgpu, branch v6.4.15</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.4.15</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.4.15'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2023-08-23T15:32:55+00:00</updated>
<entry>
<title>drm/amdgpu: keep irq count in amdgpu_irq_disable_all</title>
<updated>2023-08-23T15:32:55+00:00</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2023-05-25T09:24:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5b62847596566617f1e7fa1d31a791f0a97b8e46'/>
<id>urn:sha1:5b62847596566617f1e7fa1d31a791f0a97b8e46</id>
<content type='text'>
commit 8ffd6f0442674f32c048ec8dffdbc5ec67829beb upstream.

This can clean up all irq warnings because of unbalanced
amdgpu_irq_get/put when unplugging/unbinding device, and leave
irq count decrease in each ip fini function.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix</title>
<updated>2023-08-23T15:32:55+00:00</updated>
<author>
<name>Tim Huang</name>
<email>Tim.Huang@amd.com</email>
</author>
<published>2023-08-14T07:13:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=11c187026cce4295e2e7c71981f0a1e3b29ffbe0'/>
<id>urn:sha1:11c187026cce4295e2e7c71981f0a1e3b29ffbe0</id>
<content type='text'>
commit f1740b1ab2703b2a057da7cf33b03297e0381aa0 upstream.

GFX v11.0.1 reported fence fallback timer expired issue on
SDMA and GFX rings after S0ix resume. This is generated by
EOP interrupts are disabled when S0ix suspend but fails to
re-enable when resume because of the GFX is in GFXOFF.

[  203.349571] [drm] Fence fallback timer expired on ring sdma0
[  203.349572] [drm] Fence fallback timer expired on ring gfx_0.0.0
[  203.861635] [drm] Fence fallback timer expired on ring gfx_0.0.0

For S0ix, GFX is in GFXOFF state, avoid to touch the GFX registers
to configure the fence driver interrupts for rings that belong to GFX.
The interrupts configuration will be restored by GFXOFF exit.

Signed-off-by: Tim Huang &lt;Tim.Huang@amd.com&gt;
Reviewed-by: Mario Limonciello &lt;mario.limonciello@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amd: flush any delayed gfxoff on suspend entry</title>
<updated>2023-08-23T15:32:55+00:00</updated>
<author>
<name>Mario Limonciello</name>
<email>mario.limonciello@amd.com</email>
</author>
<published>2023-05-18T16:52:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=333f1edde0ad1700541b20e62c7e4d7be23e771a'/>
<id>urn:sha1:333f1edde0ad1700541b20e62c7e4d7be23e771a</id>
<content type='text'>
commit a7b7d9e8aee4f71b4c7151702fd74237b8cef989 upstream.

DCN 3.1.4 is reported to hang on s2idle entry if graphics activity
is happening during entry.  This is because GFXOFF was scheduled as
delayed but RLC gets disabled in s2idle entry sequence which will
hang GFX IP if not already in GFXOFF.

To help this problem, flush any delayed work for GFXOFF early in
s2idle entry sequence to ensure that it's off when RLC is changed.

commit 4b31b92b143f ("drm/amdgpu: complete gfxoff allow signal during
suspend without delay") modified power gating flow so that if called
in s0ix that it ensured that GFXOFF wasn't put in work queue but
instead processed immediately.

This is dead code due to commit 10cb67eb8a1b ("drm/amdgpu: skip
CG/PG for gfx during S0ix") because GFXOFF will now not be explicitly
called as part of the suspend entry code.  Remove that dead code.

Signed-off-by: Mario Limonciello &lt;mario.limonciello@amd.com&gt;
Signed-off-by: Tim Huang &lt;tim.huang@amd.com&gt;
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix potential fence use-after-free v2</title>
<updated>2023-08-23T15:32:32+00:00</updated>
<author>
<name>shanzhulig</name>
<email>shanzhulig@gmail.com</email>
</author>
<published>2023-06-28T01:10:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f5db29ce2502c4fc8a33ed7989950ba43875e322'/>
<id>urn:sha1:f5db29ce2502c4fc8a33ed7989950ba43875e322</id>
<content type='text'>
[ Upstream commit 2e54154b9f27262efd0cb4f903cc7d5ad1fe9628 ]

fence Decrements the reference count before exiting.
Avoid Race Vulnerabilities for fence use-after-free.

v2 (chk): actually fix the use after free and not just move it.

Signed-off-by: shanzhulig &lt;shanzhulig@gmail.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: unmap and remove csa_va properly</title>
<updated>2023-08-23T15:32:28+00:00</updated>
<author>
<name>Lang Yu</name>
<email>Lang.Yu@amd.com</email>
</author>
<published>2023-05-06T04:01:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a3a96bf843c356d1d9b2d7f6d0784b6ee28ca9d0'/>
<id>urn:sha1:a3a96bf843c356d1d9b2d7f6d0784b6ee28ca9d0</id>
<content type='text'>
[ Upstream commit 5daff15cd013422bc6d1efcfe82b586800025384 ]

Root PD BO should be reserved before unmap and remove
a bo_va from VM otherwise lockdep will complain.

v2: check fpriv-&gt;csa_va is not NULL instead of amdgpu_mcbp (christian)

[14616.936827] WARNING: CPU: 6 PID: 1711 at drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1762 amdgpu_vm_bo_del+0x399/0x3f0 [amdgpu]
[14616.937096] Call Trace:
[14616.937097]  &lt;TASK&gt;
[14616.937102]  amdgpu_driver_postclose_kms+0x249/0x2f0 [amdgpu]
[14616.937187]  drm_file_free+0x1d6/0x300 [drm]
[14616.937207]  drm_close_helper.isra.0+0x62/0x70 [drm]
[14616.937220]  drm_release+0x5e/0x100 [drm]
[14616.937234]  __fput+0x9f/0x280
[14616.937239]  ____fput+0xe/0x20
[14616.937241]  task_work_run+0x61/0x90
[14616.937246]  exit_to_user_mode_prepare+0x215/0x220
[14616.937251]  syscall_exit_to_user_mode+0x2a/0x60
[14616.937254]  do_syscall_64+0x48/0x90
[14616.937257]  entry_SYSCALL_64_after_hwframe+0x63/0xcd

Signed-off-by: Lang Yu &lt;Lang.Yu@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: install stub fence into potential unused fence pointers</title>
<updated>2023-08-23T15:32:28+00:00</updated>
<author>
<name>Lang Yu</name>
<email>Lang.Yu@amd.com</email>
</author>
<published>2023-05-05T12:14:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=aa9e9ba5748c524eb0925a2ef6984b78793646d6'/>
<id>urn:sha1:aa9e9ba5748c524eb0925a2ef6984b78793646d6</id>
<content type='text'>
[ Upstream commit 187916e6ed9d0c3b3abc27429f7a5f8c936bd1f0 ]

When using cpu to update page tables, vm update fences are unused.
Install stub fence into these fence pointers instead of NULL
to avoid NULL dereference when calling dma_fence_wait() on them.

Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Lang Yu &lt;Lang.Yu@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix memory leak in mes self test</title>
<updated>2023-08-23T15:32:24+00:00</updated>
<author>
<name>Jack Xiao</name>
<email>Jack.Xiao@amd.com</email>
</author>
<published>2023-04-21T06:20:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8d8c96efcec95736622381b2afc0fe9e317f88aa'/>
<id>urn:sha1:8d8c96efcec95736622381b2afc0fe9e317f88aa</id>
<content type='text'>
[ Upstream commit 31d7c3a4fc3d312a0646990767647925d5bde540 ]

The fences associated with mes queue have to be freed
up during amdgpu_ring_fini.

Signed-off-by: Jack Xiao &lt;Jack.Xiao@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix integer overflow in amdgpu_cs_pass1</title>
<updated>2023-08-23T15:32:24+00:00</updated>
<author>
<name>hackyzh002</name>
<email>hackyzh002@gmail.com</email>
</author>
<published>2023-04-19T12:22:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c3deb091398e9e469d08dd1599b6d76fd6b29df8'/>
<id>urn:sha1:c3deb091398e9e469d08dd1599b6d76fd6b29df8</id>
<content type='text'>
[ Upstream commit 87c2213e85bd81e4a9a4d0880c256568794ae388 ]

The type of size is unsigned int, if size is 0x40000000, there will
be an integer overflow, size will be zero after size *= sizeof(uint32_t),
will cause uninitialized memory to be referenced later.

Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: hackyzh002 &lt;hackyzh002@gmail.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix calltrace warning in amddrm_buddy_fini</title>
<updated>2023-08-23T15:32:24+00:00</updated>
<author>
<name>Longlong Yao</name>
<email>Longlong.Yao@amd.com</email>
</author>
<published>2023-04-13T05:21:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=756d674117f5c451f415d1c4046b927052a90c14'/>
<id>urn:sha1:756d674117f5c451f415d1c4046b927052a90c14</id>
<content type='text'>
[ Upstream commit 01382501509871d0799bab6bd412c228486af5bf ]

The following call trace is observed when removing the amdgpu driver, which
is caused by that BOs allocated for psp are not freed until removing.

[61811.450562] RIP: 0010:amddrm_buddy_fini.cold+0x29/0x47 [amddrm_buddy]
[61811.450577] Call Trace:
[61811.450577]  &lt;TASK&gt;
[61811.450579]  amdgpu_vram_mgr_fini+0x135/0x1c0 [amdgpu]
[61811.450728]  amdgpu_ttm_fini+0x207/0x290 [amdgpu]
[61811.450870]  amdgpu_bo_fini+0x27/0xa0 [amdgpu]
[61811.451012]  gmc_v9_0_sw_fini+0x4a/0x60 [amdgpu]
[61811.451166]  amdgpu_device_fini_sw+0x117/0x520 [amdgpu]
[61811.451306]  amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
[61811.451447]  devm_drm_dev_init_release+0x4d/0x80 [drm]
[61811.451466]  devm_action_release+0x15/0x20
[61811.451469]  release_nodes+0x40/0xb0
[61811.451471]  devres_release_all+0x9b/0xd0
[61811.451473]  __device_release_driver+0x1bb/0x2a0
[61811.451476]  driver_detach+0xf3/0x140
[61811.451479]  bus_remove_driver+0x6c/0xf0
[61811.451481]  driver_unregister+0x31/0x60
[61811.451483]  pci_unregister_driver+0x40/0x90
[61811.451486]  amdgpu_exit+0x15/0x447 [amdgpu]

For smu v13_0_2, if the GPU supports xgmi, refer to

commit f5c7e7797060 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2"),

it will run gpu recover in AMDGPU_RESET_FOR_DEVICE_REMOVE mode when removing,
which makes all devices in hive list have hw reset but no resume except the
basic ip blocks, then other ip blocks will not call .hw_fini according to
ip_block.status.hw.

Since psp_free_shared_bufs just includes some software operations, so move
it to psp_sw_fini.

Reviewed-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Feifei Xu &lt;Feifei.Xu@amd.com&gt;
Signed-off-by: Longlong Yao &lt;Longlong.Yao@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amd/pm: avoid unintentional shutdown due to temperature momentary fluctuation</title>
<updated>2023-08-16T16:32:20+00:00</updated>
<author>
<name>Evan Quan</name>
<email>evan.quan@amd.com</email>
</author>
<published>2023-05-04T09:09:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=737d519e7b38b75f052f46708cfe7e6bccd733fe'/>
<id>urn:sha1:737d519e7b38b75f052f46708cfe7e6bccd733fe</id>
<content type='text'>
commit b75efe88b20c2be28b67e2821a794cc183e32374 upstream.

An intentional delay is added on soft ctf triggered. Then there will
be a double check for the GPU temperature before taking further
action. This can avoid unintended shutdown due to temperature
momentary fluctuation.

Signed-off-by: Evan Quan &lt;evan.quan@amd.com&gt;
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
[ Hand-modified because XCP support added to amdgpu.h in kernel 6.5
  and is not necessary for this fix. ]
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1267
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2779
Signed-off-by: Mario Limonciello &lt;mario.limonciello@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
