<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdgpu, branch v5.15.3</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.3</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.3'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2021-11-18T18:17:12+00:00</updated>
<entry>
<title>drm/amdgpu: fix uvd crash on Polaris12 during driver unloading</title>
<updated>2021-11-18T18:17:12+00:00</updated>
<author>
<name>Evan Quan</name>
<email>evan.quan@amd.com</email>
</author>
<published>2021-10-09T09:35:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f17e9e81004e197161956213e0c4f9bc3787d6b9'/>
<id>urn:sha1:f17e9e81004e197161956213e0c4f9bc3787d6b9</id>
<content type='text'>
[ Upstream commit 4fc30ea780e0a5c1c019bc2e44f8523e1eed9051 ]

There was a change(below) target for such issue:
d82e2c249c8f ("drm/amdgpu: Fix crash on device remove/driver unload")
But the fix for VI ASICs was missing there. This is a supplement for
that.

Fixes: d82e2c249c8f ("drm/amdgpu: Fix crash on device remove/driver unload")

Signed-off-by: Evan Quan &lt;evan.quan@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/ttm: remove ttm_bo_vm_insert_huge()</title>
<updated>2021-11-18T18:17:08+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2021-10-19T23:27:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=71fb40ae9b07578b3aeebf78fcbb3394810acd9a'/>
<id>urn:sha1:71fb40ae9b07578b3aeebf78fcbb3394810acd9a</id>
<content type='text'>
[ Upstream commit 0d979509539ed1df883a30d442177ca7be609565 ]

The huge page functionality in TTM does not work safely because PUD and
PMD entries do not have a special bit.

get_user_pages_fast() considers any page that passed pmd_huge() as
usable:

	if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
		     pmd_devmap(pmd))) {

And vmf_insert_pfn_pmd_prot() unconditionally sets

	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));

eg on x86 the page will be _PAGE_PRESENT | PAGE_PSE.

As such gup_huge_pmd() will try to deref a struct page:

	head = try_grab_compound_head(pmd_page(orig), refs, flags);

and thus crash.

Thomas further notices that the drivers are not expecting the struct page
to be used by anything - in particular the refcount incr above will cause
them to malfunction.

Thus everything about this is not able to fully work correctly considering
GUP_fast. Delete it entirely. It can return someday along with a proper
PMD/PUD_SPECIAL bit in the page table itself to gate GUP_fast.

Fixes: 314b6580adc5 ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults")
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Thomas Hellström &lt;thomas.helllstrom@linux.intel.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
[danvet: Update subject per Thomas' &amp;Christian's review]
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/0-v2-a44694790652+4ac-ttm_pmd_jgg@nvidia.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits</title>
<updated>2021-11-18T18:16:43+00:00</updated>
<author>
<name>Alex Deucher</name>
<email>alexander.deucher@amd.com</email>
</author>
<published>2021-10-27T17:26:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e3e3c222c306b59334c704e4098256a8aaf3e9d6'/>
<id>urn:sha1:e3e3c222c306b59334c704e4098256a8aaf3e9d6</id>
<content type='text'>
[ Upstream commit 403475be6d8b122c3e6b8a47e075926d7299e5ef ]

The DMA mask on SI parts is 40 bits not 44.  Copy
paste typo.

Fixes: 244511f386ccb9 ("drm/amdgpu: simplify and cleanup setting the dma mask")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1762
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Tested-by: Paul Menzel &lt;pmenzel@molgen.mpg.de&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix a potential memory leak in amdgpu_device_fini_sw()</title>
<updated>2021-11-18T18:16:43+00:00</updated>
<author>
<name>Lang Yu</name>
<email>lang.yu@amd.com</email>
</author>
<published>2021-10-21T06:36:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=76eb974438109209b02758d7a3ac726c9c653439'/>
<id>urn:sha1:76eb974438109209b02758d7a3ac726c9c653439</id>
<content type='text'>
[ Upstream commit a5c5d8d50ecf5874be90a76e1557279ff8a30c9e ]

amdgpu_fence_driver_sw_fini() should be executed before
amdgpu_device_ip_fini(), otherwise fence driver resource
won't be properly freed as adev-&gt;rings have been tore down.

Fixes: 72c8c97b1522 ("drm/amdgpu: Split amdgpu_device_fini into early and late")

Signed-off-by: Lang Yu &lt;lang.yu@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix an inappropriate error handling in allloc memory of gpu</title>
<updated>2021-11-18T18:16:35+00:00</updated>
<author>
<name>Lang Yu</name>
<email>lang.yu@amd.com</email>
</author>
<published>2021-10-11T05:57:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5a9bd1b941f4418371ca932cf47f304caca18426'/>
<id>urn:sha1:5a9bd1b941f4418371ca932cf47f304caca18426</id>
<content type='text'>
[ Upstream commit 5aeeac6fa38fca450faed9770f75b1470c0e2073 ]

We should unreference a gem object instead of an amdgpu bo here.

Fixes: fd9a9f8801de ("drm/amdgpu: Use GEM obj reference for KFD BOs")

Signed-off-by: Lang Yu &lt;lang.yu@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix warning for overflow check</title>
<updated>2021-11-18T18:16:27+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2021-09-27T12:58:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=798419802590614d3987ab9346edcebfb7da40ad'/>
<id>urn:sha1:798419802590614d3987ab9346edcebfb7da40ad</id>
<content type='text'>
[ Upstream commit 335aea75b0d95518951cad7c4c676e6f1c02c150 ]

The overflow check in amdgpu_bo_list_create() causes a warning with
clang-14 on 64-bit architectures, since the limit can never be
exceeded.

drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c:74:18: error: result of comparison of constant 256204778801521549 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
        if (num_entries &gt; (SIZE_MAX - sizeof(struct amdgpu_bo_list))
            ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The check remains useful for 32-bit architectures, so just avoid the
warning by using size_t as the type for the count.

Fixes: 920990cb080a ("drm/amdgpu: allocate the bo_list array after the list")
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stage</title>
<updated>2021-11-18T18:16:25+00:00</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2021-09-18T05:43:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=adce47a6405cc894204c2bf92e4f87b39c894c45'/>
<id>urn:sha1:adce47a6405cc894204c2bf92e4f87b39c894c45</id>
<content type='text'>
[ Upstream commit 6effad8abe0ba4db3d9c58ed585127858a990f35 ]

adev-&gt;rmmio is set to be NULL in amdgpu_device_unmap_mmio to prevent
access after pci_remove, however, in SRIOV case, amdgpu_virt_release_full_gpu
will still use adev-&gt;rmmio for access after amdgpu_device_unmap_mmio.
The patch is to move such SRIOV calling earlier to fini_early stage.

Fixes: 07775fc13878 ("drm/amdgpu: Unmap all MMIO mappings")
Cc: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Leslie Shi &lt;Yuliang.Shi@amd.com&gt;
Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix crash on device remove/driver unload</title>
<updated>2021-11-18T18:16:24+00:00</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2021-09-15T19:27:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=24b4d71021256f6ab5e25fe9322d5f15b7d3d810'/>
<id>urn:sha1:24b4d71021256f6ab5e25fe9322d5f15b7d3d810</id>
<content type='text'>
[ Upstream commit d82e2c249c8ffaec20fa618611ea2ab4dcfd4d01 ]

Crash:
BUG: unable to handle page fault for address: 00000000000010e1
RIP: 0010:vega10_power_gate_vce+0x26/0x50 [amdgpu]
Call Trace:
pp_set_powergating_by_smu+0x16a/0x2b0 [amdgpu]
amdgpu_dpm_set_powergating_by_smu+0x92/0xf0 [amdgpu]
amdgpu_dpm_enable_vce+0x2e/0xc0 [amdgpu]
vce_v4_0_hw_fini+0x95/0xa0 [amdgpu]
amdgpu_device_fini_hw+0x232/0x30d [amdgpu]
amdgpu_driver_unload_kms+0x5c/0x80 [amdgpu]
amdgpu_pci_remove+0x27/0x40 [amdgpu]
pci_device_remove+0x3e/0xb0
device_release_driver_internal+0x103/0x1d0
device_release_driver+0x12/0x20
pci_stop_bus_device+0x79/0xa0
pci_stop_and_remove_bus_device_locked+0x1b/0x30
remove_store+0x7b/0x90
dev_attr_store+0x17/0x30
sysfs_kf_write+0x4b/0x60
kernfs_fop_write_iter+0x151/0x1e0

Why:
VCE/UVD had dependency on SMC block for their suspend but
SMC block is the first to do HW fini due to some constraints

How:
Since the original patch was dealing with suspend issues
move the SMC block dependency back into suspend hooks as
was done in V1 of the original patches.
Keep flushing idle work both in suspend and HW fini seuqnces
since it's essential in both cases.

Fixes: 859e4659273f1d ("drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend")
Fixes: bf756fb833cbe8 ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")
Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix MMIO access page fault</title>
<updated>2021-11-18T18:16:11+00:00</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2021-09-16T16:54:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=351601af203da6404d2c0d87e0c10406bdc151bd'/>
<id>urn:sha1:351601af203da6404d2c0d87e0c10406bdc151bd</id>
<content type='text'>
[ Upstream commit c03509cbc01559549700e14c4a6239f2572ab4ba ]

Add more guards to MMIO access post device
unbind/unplug

Bug: https://bugs.archlinux.org/task/72092?project=1&amp;order=dateopened&amp;sort=desc&amp;pagenum=1
Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: James Zhu &lt;James.Zhu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: move iommu_resume before ip init/resume</title>
<updated>2021-11-18T18:16:09+00:00</updated>
<author>
<name>James Zhu</name>
<email>James.Zhu@amd.com</email>
</author>
<published>2021-09-07T15:32:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=55bf61828ab33b959ecc41b66c6a3725fb921ec1'/>
<id>urn:sha1:55bf61828ab33b959ecc41b66c6a3725fb921ec1</id>
<content type='text'>
[ Upstream commit 9cec53c18a3170c7e5673c414da56aeecee94832 ]

Separate iommu_resume from kfd_resume, and move it before
other amdgpu ip init/resume.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu &lt;James.Zhu@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
