<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c, branch v5.15.3</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.3</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.3'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2021-11-18T18:16:43+00:00</updated>
<entry>
<title>drm/amdgpu: fix a potential memory leak in amdgpu_device_fini_sw()</title>
<updated>2021-11-18T18:16:43+00:00</updated>
<author>
<name>Lang Yu</name>
<email>lang.yu@amd.com</email>
</author>
<published>2021-10-21T06:36:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=76eb974438109209b02758d7a3ac726c9c653439'/>
<id>urn:sha1:76eb974438109209b02758d7a3ac726c9c653439</id>
<content type='text'>
[ Upstream commit a5c5d8d50ecf5874be90a76e1557279ff8a30c9e ]

amdgpu_fence_driver_sw_fini() should be executed before
amdgpu_device_ip_fini(), otherwise fence driver resource
won't be properly freed as adev-&gt;rings have been tore down.

Fixes: 72c8c97b1522 ("drm/amdgpu: Split amdgpu_device_fini into early and late")

Signed-off-by: Lang Yu &lt;lang.yu@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stage</title>
<updated>2021-11-18T18:16:25+00:00</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2021-09-18T05:43:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=adce47a6405cc894204c2bf92e4f87b39c894c45'/>
<id>urn:sha1:adce47a6405cc894204c2bf92e4f87b39c894c45</id>
<content type='text'>
[ Upstream commit 6effad8abe0ba4db3d9c58ed585127858a990f35 ]

adev-&gt;rmmio is set to be NULL in amdgpu_device_unmap_mmio to prevent
access after pci_remove, however, in SRIOV case, amdgpu_virt_release_full_gpu
will still use adev-&gt;rmmio for access after amdgpu_device_unmap_mmio.
The patch is to move such SRIOV calling earlier to fini_early stage.

Fixes: 07775fc13878 ("drm/amdgpu: Unmap all MMIO mappings")
Cc: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Leslie Shi &lt;Yuliang.Shi@amd.com&gt;
Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: move iommu_resume before ip init/resume</title>
<updated>2021-11-18T18:16:09+00:00</updated>
<author>
<name>James Zhu</name>
<email>James.Zhu@amd.com</email>
</author>
<published>2021-09-07T15:32:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=55bf61828ab33b959ecc41b66c6a3725fb921ec1'/>
<id>urn:sha1:55bf61828ab33b959ecc41b66c6a3725fb921ec1</id>
<content type='text'>
[ Upstream commit 9cec53c18a3170c7e5673c414da56aeecee94832 ]

Separate iommu_resume from kfd_resume, and move it before
other amdgpu ip init/resume.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu &lt;James.Zhu@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: revert "Add autodump debugfs node for gpu reset v8"</title>
<updated>2021-11-06T13:13:31+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2021-09-30T09:22:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6ecad8906f05a44d3ce15ec3cba6871f2de93095'/>
<id>urn:sha1:6ecad8906f05a44d3ce15ec3cba6871f2de93095</id>
<content type='text'>
commit c8365dbda056578eebe164bf110816b1a39b4b7f upstream.

This reverts commit 728e7e0cd61899208e924472b9e641dbeb0775c4.

Further discussion reveals that this feature is severely broken
and needs to be reverted ASAP.

GPU reset can never be delayed by userspace even for debugging or
otherwise we can run into in kernel deadlocks.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Acked-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: fix boot failure when iommu is disabled in Picasso.</title>
<updated>2021-11-06T13:13:30+00:00</updated>
<author>
<name>Yifan Zhang</name>
<email>yifan1.zhang@amd.com</email>
</author>
<published>2021-10-11T12:37:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f17dca0ab3f38b19c0f1b935f417f62d4a528723'/>
<id>urn:sha1:f17dca0ab3f38b19c0f1b935f417f62d4a528723</id>
<content type='text'>
commit afd18180c07026f94a80ff024acef5f4159084a4 upstream.

When IOMMU disabled in sbios and kfd in iommuv2 path, iommuv2
init will fail. But this failure should not block amdgpu driver init.

Reported-by: youling &lt;youling257@gmail.com&gt;
Tested-by: youling &lt;youling257@gmail.com&gt;
Signed-off-by: Yifan Zhang &lt;yifan1.zhang@amd.com&gt;
Reviewed-by: James Zhu &lt;James.Zhu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume</title>
<updated>2021-10-05T17:02:31+00:00</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2021-10-01T01:48:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=248b061689a40f4fed05252ee2c89f87cf26d7d8'/>
<id>urn:sha1:248b061689a40f4fed05252ee2c89f87cf26d7d8</id>
<content type='text'>
In current code, when a PCI error state pci_channel_io_normal is detectd,
it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI
driver will continue the execution of PCI resume callback report_resume by
pci_walk_bridge, and the callback will go into amdgpu_pci_resume
finally, where write lock is releasd unconditionally without acquiring
such lock first. In this case, a deadlock will happen when other threads
start to acquire the read lock.

To fix this, add a member in amdgpu_device strucutre to cache
pci_channel_state, and only continue the execution in amdgpu_pci_resume
when it's pci_channel_io_frozen.

Fixes: c9a6b82f45e2 ("drm/amdgpu: Implement DPC recovery")
Suggested-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: init iommu after amdkfd device init</title>
<updated>2021-10-05T17:02:20+00:00</updated>
<author>
<name>Yifan Zhang</name>
<email>yifan1.zhang@amd.com</email>
</author>
<published>2021-09-28T07:42:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=714d9e4574d54596973ee3b0624ee4a16264d700'/>
<id>urn:sha1:714d9e4574d54596973ee3b0624ee4a16264d700</id>
<content type='text'>
This patch is to fix clinfo failure in Raven/Picasso:

Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.2 AMD-APP (3364.0)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback

  Platform Name: AMD Accelerated Parallel Processing Number of devices: 0

Signed-off-by: Yifan Zhang &lt;yifan1.zhang@amd.com&gt;
Reviewed-by: James Zhu &lt;James.Zhu@amd.com&gt;
Tested-by: James Zhu &lt;James.Zhu@amd.com&gt;
Acked-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: move iommu_resume before ip init/resume</title>
<updated>2021-09-16T13:56:24+00:00</updated>
<author>
<name>James Zhu</name>
<email>James.Zhu@amd.com</email>
</author>
<published>2021-09-07T15:32:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f02abeb0779700c308e661a412451b38962b8a0b'/>
<id>urn:sha1:f02abeb0779700c308e661a412451b38962b8a0b</id>
<content type='text'>
Separate iommu_resume from kfd_resume, and move it before
other amdgpu ip init/resume.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu &lt;James.Zhu@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>drm/amdgpu: Cancel delayed work when GFXOFF is disabled</title>
<updated>2021-08-20T16:09:44+00:00</updated>
<author>
<name>Michel Dänzer</name>
<email>mdaenzer@redhat.com</email>
</author>
<published>2021-08-17T08:23:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=90a9266269eb9f71af1f323c33e1dca53527bd22'/>
<id>urn:sha1:90a9266269eb9f71af1f323c33e1dca53527bd22</id>
<content type='text'>
schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync &amp; mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
  adev-&gt;gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
  checking for it to be 0 (Evan Quan)

Cc: stable@vger.kernel.org
Reviewed-by: Evan Quan &lt;evan.quan@amd.com&gt;
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt; # v3
Acked-by: Christian König &lt;christian.koenig@amd.com&gt; # v3
Signed-off-by: Michel Dänzer &lt;mdaenzer@redhat.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/amdgpu:flush ttm delayed work before cancel_sync</title>
<updated>2021-08-18T22:23:00+00:00</updated>
<author>
<name>YuBiao Wang</name>
<email>YuBiao.Wang@amd.com</email>
</author>
<published>2021-08-17T09:36:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=691191a2f458e0176414cb5b3993b0c018cdc58c'/>
<id>urn:sha1:691191a2f458e0176414cb5b3993b0c018cdc58c</id>
<content type='text'>
[Why]
In some cases when we unload driver, warning call trace
will show up in vram_mgr_fini which claims that LRU is not empty, caused
by the ttm bo inside delay deleted queue.

[How]
We should flush delayed work to make sure the delay deleting is done.

Signed-off-by: YuBiao Wang &lt;YuBiao.Wang@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
