<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c, branch v5.5.13</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.5.13</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.5.13'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2020-02-24T07:38:17+00:00</updated>
<entry>
<title>drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV</title>
<updated>2020-02-24T07:38:17+00:00</updated>
<author>
<name>Monk Liu</name>
<email>Monk.Liu@amd.com</email>
</author>
<published>2019-12-17T10:16:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7429e7ff873c77044f4d714633e9bd8578063c06'/>
<id>urn:sha1:7429e7ff873c77044f4d714633e9bd8578063c06</id>
<content type='text'>
[ Upstream commit 5a7489a7e189ee2be889485f90c8cf24ea4b9a40 ]

issues:
MEC is ruined by the amdkfd_pre_reset after VF FLR done

fix:
amdkfd_pre_reset() would ruin MEC after hypervisor finished the VF FLR,
the correct sequence is do amdkfd_pre_reset before VF FLR but there is
a limitation to block this sequence:
if we do pre_reset() before VF FLR, it would go KIQ way to do register
access and stuck there, because KIQ probably won't work by that time
(e.g. you already made GFX hang)

so the best way right now is to simply remove it.

Signed-off-by: Monk Liu &lt;Monk.Liu@amd.com&gt;
Reviewed-by: Emily Deng &lt;Emily.Deng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma</title>
<updated>2019-11-30T18:33:14+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-11-30T18:33:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=aa32f1169148beb90d71494e2f2a1999ba7b5366'/>
<id>urn:sha1:aa32f1169148beb90d71494e2f2a1999ba7b5366</id>
<content type='text'>
Pull hmm updates from Jason Gunthorpe:
 "This is another round of bug fixing and cleanup. This time the focus
  is on the driver pattern to use mmu notifiers to monitor a VA range.
  This code is lifted out of many drivers and hmm_mirror directly into
  the mmu_notifier core and written using the best ideas from all the
  driver implementations.

  This removes many bugs from the drivers and has a very pleasing
  diffstat. More drivers can still be converted, but that is for another
  cycle.

   - A shared branch with RDMA reworking the RDMA ODP implementation

   - New mmu_interval_notifier API. This is focused on the use case of
     monitoring a VA and simplifies the process for drivers

   - A common seq-count locking scheme built into the
     mmu_interval_notifier API usable by drivers that call
     get_user_pages() or hmm_range_fault() with the VA range

   - Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen
     GntDev drivers to the new API. This deletes a lot of wonky driver
     code.

   - Two improvements for hmm_range_fault(), from testing done by Ralph"

* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  mm/hmm: remove hmm_range_dma_map and hmm_range_dma_unmap
  mm/hmm: make full use of walk_page_range()
  xen/gntdev: use mmu_interval_notifier_insert
  mm/hmm: remove hmm_mirror and related
  drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror
  drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror
  drm/amdgpu: Call find_vma under mmap_sem
  nouveau: use mmu_interval_notifier instead of hmm_mirror
  nouveau: use mmu_notifier directly for invalidate_range_start
  drm/radeon: use mmu_interval_notifier_insert
  RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv
  RDMA/odp: Use mmu_interval_notifier_insert()
  mm/hmm: define the pre-processor related parts of hmm.h even if disabled
  mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or hmm_mirror
  mm/mmu_notifier: add an interval tree notifier
  mm/mmu_notifier: define the header pre-processor parts even if disabled
  mm/hmm: allow snapshot of the special zero page
</content>
</entry>
<entry>
<title>drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror</title>
<updated>2019-11-23T23:56:45+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2019-11-12T20:22:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=62914a99dee5ac51253a84e7d4a05c18f0c77535'/>
<id>urn:sha1:62914a99dee5ac51253a84e7d4a05c18f0c77535</id>
<content type='text'>
Remove the interval tree in the driver and rely on the tree maintained by
the mmu_notifier for delivering mmu_notifier invalidation callbacks.

For some reason amdgpu has a very complicated arrangement where it tries
to prevent duplicate entries in the interval_tree, this is not necessary,
each amdgpu_bo can be its own stand alone entry. interval_tree already
allows duplicates and overlaps in the tree.

Also, there is no need to remove entries upon a release callback, the
mmu_interval API safely allows objects to remain registered beyond the
lifetime of the mm. The driver only has to stop touching the pages during
release.

Link: https://lore.kernel.org/r/20191112202231.3856-12-jgg@ziepe.ca
Reviewed-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Tested-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: put flush_delayed_work at first</title>
<updated>2019-11-22T19:35:10+00:00</updated>
<author>
<name>Yintian Tao</name>
<email>yttao@amd.com</email>
</author>
<published>2019-11-18T08:06:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c0e21ea1d0b557bdedd5b54d529162f74e7ef407'/>
<id>urn:sha1:c0e21ea1d0b557bdedd5b54d529162f74e7ef407</id>
<content type='text'>
There is one regression from 042f3d7b745cd76aa
To put flush_delayed_work after adev-&gt;shutdown = true
which will make amdgpu_ih_process not response the irq
At last, all ib ring tests will be failed just like below

[drm] amdgpu: finishing device.
[drm] Fence fallback timer expired on ring gfx
[drm] Fence fallback timer expired on ring comp_1.0.0
[drm] Fence fallback timer expired on ring comp_1.1.0
[drm] Fence fallback timer expired on ring comp_1.2.0
[drm] Fence fallback timer expired on ring comp_1.3.0
[drm] Fence fallback timer expired on ring comp_1.0.1
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.2.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma0 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd_enc_0.0 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vce0 (-110).
[drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110).

v2: replace cancel_delayed_work_sync() with flush_delayed_work()

Signed-off-by: Yintian Tao &lt;yttao@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/amdgpu: finish delay works before release resources</title>
<updated>2019-11-11T22:38:13+00:00</updated>
<author>
<name>Jesse Zhang</name>
<email>zhexi.zhang@amd.com</email>
</author>
<published>2019-11-08T10:06:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9f87516764a9f2a6a6a9552844efb41d9748afec'/>
<id>urn:sha1:9f87516764a9f2a6a6a9552844efb41d9748afec</id>
<content type='text'>
flush/cancel delayed works before doing finalization
to avoid concurrently requests.

Signed-off-by: Jesse Zhang &lt;zhexi.zhang@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: perform p-state switch after the whole hive initialized</title>
<updated>2019-11-06T21:27:48+00:00</updated>
<author>
<name>Evan Quan</name>
<email>evan.quan@amd.com</email>
</author>
<published>2019-11-05T07:15:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=60599a03638a611574a177fa6e6d7394abad0d7f'/>
<id>urn:sha1:60599a03638a611574a177fa6e6d7394abad0d7f</id>
<content type='text'>
P-state switch should be performed after all devices from the hive
get initialized.

Signed-off-by: Evan Quan &lt;evan.quan@amd.com&gt;
Reviewed-by: Feifei Xu &lt;Feifei.Xu@amd.com&gt;
Reviewed-by:  Jonathan Kim &lt;Jonathan.Kim@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: register gpu instance before fan boost feature enablment</title>
<updated>2019-11-06T21:27:47+00:00</updated>
<author>
<name>Evan Quan</name>
<email>evan.quan@amd.com</email>
</author>
<published>2019-11-05T10:13:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b0adca4d50169d9a05912e208a89cdd615da40d4'/>
<id>urn:sha1:b0adca4d50169d9a05912e208a89cdd615da40d4</id>
<content type='text'>
Otherwise, the feature enablement will be skipped due to wrong count.

Fixes: beff74bc6e0fa91 ("drm/amdgpu: fix a race in GPU reset with IB test (v2)")
Signed-off-by: Evan Quan &lt;evan.quan@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: change pstate only after all XGMI device initialized</title>
<updated>2019-11-06T21:27:46+00:00</updated>
<author>
<name>Evan Quan</name>
<email>evan.quan@amd.com</email>
</author>
<published>2019-10-31T06:10:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=39ea6e5f9e2d6e094b7a48592ecae6c69a82e471'/>
<id>urn:sha1:39ea6e5f9e2d6e094b7a48592ecae6c69a82e471</id>
<content type='text'>
Pstate settings should be performed after all device of the
XGMI setup get initialized.

Signed-off-by: Evan Quan &lt;evan.quan@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: bypass some cleanup work after err_event_athub (v2)</title>
<updated>2019-10-30T15:06:52+00:00</updated>
<author>
<name>Le Ma</name>
<email>le.ma@amd.com</email>
</author>
<published>2019-10-25T09:48:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bff77e86a3776fab6859bc168ecda6ccf56bfbd2'/>
<id>urn:sha1:bff77e86a3776fab6859bc168ecda6ccf56bfbd2</id>
<content type='text'>
PSP lost connection when err_event_athub occurs. These cleanup work can be
skipped in BACO reset.

v2: squash in missing include (Alex)

Signed-off-by: Le Ma &lt;le.ma@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;hawking.zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd: correct "_LENTH" mispelling in constant</title>
<updated>2019-10-28T15:19:17+00:00</updated>
<author>
<name>Wambui Karuga</name>
<email>wambui.karugax@gmail.com</email>
</author>
<published>2019-10-28T09:20:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f440ff44b101796beec8ac63252f254de7f2de1c'/>
<id>urn:sha1:f440ff44b101796beec8ac63252f254de7f2de1c</id>
<content type='text'>
Correct the "_LENTH" mispelling in the AMDGPU_MAX_TIMEOUT_PARAM_LENGTH
constant.

Signed-off-by: Wambui Karuga &lt;wambui.karugax@gmail.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
