<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdgpu, branch v6.6.47</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.47</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.47'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-08-14T11:58:53+00:00</updated>
<entry>
<title>drm/amdgpu: Forward soft recovery errors to userspace</title>
<updated>2024-08-14T11:58:53+00:00</updated>
<author>
<name>Joshua Ashton</name>
<email>joshua@froggi.es</email>
</author>
<published>2024-03-07T19:04:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0da0b06165d83a8ecbb6582d9d5a135f9d38a52a'/>
<id>urn:sha1:0da0b06165d83a8ecbb6582d9d5a135f9d38a52a</id>
<content type='text'>
commit 829798c789f567ef6ba4b084c15b7b5f3bd98d51 upstream.

As we discussed before[1], soft recovery should be
forwarded to userspace, or we can get into a really
bad state where apps will keep submitting hanging
command buffers cascading us to a hard reset.

1: https://lore.kernel.org/all/bf23d5ed-9a6b-43e7-84ee-8cbfd0d60f18@froggi.es/
Signed-off-by: Joshua Ashton &lt;joshua@froggi.es&gt;
Reviewed-by: Marek Olšák &lt;marek.olsak@amd.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 434967aadbbbe3ad9103cc29e9a327de20fdba01)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Add lock around VF RLCG interface</title>
<updated>2024-08-14T11:58:45+00:00</updated>
<author>
<name>Victor Skvortsov</name>
<email>victor.skvortsov@amd.com</email>
</author>
<published>2024-05-27T20:10:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1adb5ebe205e96af77a93512e2d5b8c437548787'/>
<id>urn:sha1:1adb5ebe205e96af77a93512e2d5b8c437548787</id>
<content type='text'>
[ Upstream commit e864180ee49b4d30e640fd1e1d852b86411420c9 ]

flush_gpu_tlb may be called from another thread while
device_gpu_recover is running.

Both of these threads access registers through the VF
RLCG interface during VF Full Access. Add a lock around this interface
to prevent race conditions between these threads.

Signed-off-by: Victor Skvortsov &lt;victor.skvortsov@amd.com&gt;
Reviewed-by: Zhigang Luo &lt;zhigang.luo@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/admgpu: fix dereferencing null pointer context</title>
<updated>2024-08-14T11:58:45+00:00</updated>
<author>
<name>Jesse Zhang</name>
<email>jesse.zhang@amd.com</email>
</author>
<published>2024-05-09T02:57:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=641dac64178ccdb9e45c92b67120316896294d05'/>
<id>urn:sha1:641dac64178ccdb9e45c92b67120316896294d05</id>
<content type='text'>
[ Upstream commit 030ffd4d43b433bc6671d9ec34fc12c59220b95d ]

When user space sets an invalid ta type, the pointer context will be empty.
So it need to check the pointer context before using it

Signed-off-by: Jesse Zhang &lt;Jesse.Zhang@amd.com&gt;
Suggested-by: Tim Huang &lt;Tim.Huang@amd.com&gt;
Reviewed-by: Tim Huang &lt;Tim.Huang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix the null pointer dereference to ras_manager</title>
<updated>2024-08-14T11:58:45+00:00</updated>
<author>
<name>Ma Jun</name>
<email>Jun.Ma2@amd.com</email>
</author>
<published>2024-05-11T07:48:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b89616333979114bb0da5fa40fb6e4a2f5294ca2'/>
<id>urn:sha1:b89616333979114bb0da5fa40fb6e4a2f5294ca2</id>
<content type='text'>
[ Upstream commit 4c11d30c95576937c6c35e6f29884761f2dddb43 ]

Check ras_manager before using it

Signed-off-by: Ma Jun &lt;Jun.Ma2@amd.com&gt;
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix potential resource leak warning</title>
<updated>2024-08-14T11:58:44+00:00</updated>
<author>
<name>Tim Huang</name>
<email>Tim.Huang@amd.com</email>
</author>
<published>2024-04-25T03:09:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bc93cfde69b7c786fcdc70463e153fdf633caff9'/>
<id>urn:sha1:bc93cfde69b7c786fcdc70463e153fdf633caff9</id>
<content type='text'>
[ Upstream commit 22a5daaec0660dd19740c4c6608b78f38760d1e6 ]

Clear resource leak warning that when the prepare fails,
the allocated amdgpu job object will never be released.

Signed-off-by: Tim Huang &lt;Tim.Huang@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amd/amdgpu: Fix uninitialized variable warnings</title>
<updated>2024-08-03T06:54:29+00:00</updated>
<author>
<name>Ma Ke</name>
<email>make24@iscas.ac.cn</email>
</author>
<published>2024-07-18T14:17:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e150f0171c0c0c45a373a658358c51c940ed4fd9'/>
<id>urn:sha1:e150f0171c0c0c45a373a658358c51c940ed4fd9</id>
<content type='text'>
commit df65aabef3c0327c23b840ab5520150df4db6b5f upstream.

Return 0 to avoid returning an uninitialized variable r.

Cc: stable@vger.kernel.org
Fixes: 230dd6bb6117 ("drm/amd/amdgpu: implement mode2 reset on smu_v13_0_10")
Signed-off-by: Ma Ke &lt;make24@iscas.ac.cn&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 6472de66c0aa18d50a4b5ca85f8272e88a737676)
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: reset vm state machine after gpu reset(vram lost)</title>
<updated>2024-08-03T06:54:29+00:00</updated>
<author>
<name>ZhenGuo Yin</name>
<email>zhenguo.yin@amd.com</email>
</author>
<published>2024-07-19T08:10:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=323790535237927e1b6a7bc35ddf662c6e7c25cc'/>
<id>urn:sha1:323790535237927e1b6a7bc35ddf662c6e7c25cc</id>
<content type='text'>
commit 5659b0c93a1ea02c662a030b322093203f299185 upstream.

[Why]
Page table of compute VM in the VRAM will lost after gpu reset.
VRAM won't be restored since compute VM has no shadows.

[How]
Use higher 32-bit of vm-&gt;generation to record a vram_lost_counter.
Reset the VM state machine when vm-&gt;genertaion is not equal to
the new generation token.

v2: Check vm-&gt;generation instead of calling drm_sched_entity_error
in amdgpu_vm_validate.
v3: Use new generation token instead of vram_lost_counter for check.

Signed-off-by: ZhenGuo Yin &lt;zhenguo.yin@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
(cherry picked from commit 47c0388b0589cb481c294dcb857d25a214c46eb3)
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell</title>
<updated>2024-08-03T06:54:28+00:00</updated>
<author>
<name>Alex Deucher</name>
<email>alexander.deucher@amd.com</email>
</author>
<published>2024-07-09T21:54:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9d74e50098492e89f319ac6922db3c2062f69340'/>
<id>urn:sha1:9d74e50098492e89f319ac6922db3c2062f69340</id>
<content type='text'>
commit a03ebf116303e5d13ba9a2b65726b106cb1e96f6 upstream.

We seem to have a case where SDMA will sometimes miss a doorbell
if GFX is entering the powergating state when the doorbell comes in.
To workaround this, we can update the wptr via MMIO, however,
this is only safe because we disallow gfxoff in begin_ring() for
SDMA 5.2 and then allow it again in end_ring().

Enable this workaround while we are root causing the issue with
the HW team.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440
Tested-by: Friedrich Vock &lt;friedrich.vock@gmx.de&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
(cherry picked from commit f2ac52634963fc38e4935e11077b6f7854e5d700)
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Remove GC HW IP 9.3.0 from noretry=1</title>
<updated>2024-08-03T06:53:46+00:00</updated>
<author>
<name>Tim Van Patten</name>
<email>timvp@google.com</email>
</author>
<published>2024-05-16T17:57:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6d72626808325c1986cbf90b0ce27a59b4291876'/>
<id>urn:sha1:6d72626808325c1986cbf90b0ce27a59b4291876</id>
<content type='text'>
[ Upstream commit 1446226d32a45bb7c4f63195a59be8c08defe658 ]

The following commit updated gmc-&gt;noretry from 0 to 1 for GC HW IP
9.3.0:

    commit 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1")

This causes the device to hang when a page fault occurs, until the
device is rebooted. Instead, revert back to gmc-&gt;noretry=0 so the device
is still responsive.

Fixes: 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1")
Signed-off-by: Tim Van Patten &lt;timvp@google.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Check if NBIO funcs are NULL in amdgpu_device_baco_exit</title>
<updated>2024-08-03T06:53:46+00:00</updated>
<author>
<name>Friedrich Vock</name>
<email>friedrich.vock@gmx.de</email>
</author>
<published>2024-05-14T07:06:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=67b4592a7d74e57a5a0929eaf3ae30414ebd39ae'/>
<id>urn:sha1:67b4592a7d74e57a5a0929eaf3ae30414ebd39ae</id>
<content type='text'>
[ Upstream commit 0cdb3f9740844b9d95ca413e3fcff11f81223ecf ]

The special case for VM passthrough doesn't check adev-&gt;nbio.funcs
before dereferencing it. If GPUs that don't have an NBIO block are
passed through, this leads to a NULL pointer dereference on startup.

Signed-off-by: Friedrich Vock &lt;friedrich.vock@gmx.de&gt;
Fixes: 1bece222eabe ("drm/amdgpu: Clear doorbell interrupt status for Sienna Cichlid")
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
