<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/scheduler, branch v5.9.10</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.9.10</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.9.10'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2020-11-05T10:51:17+00:00</updated>
<entry>
<title>drm/scheduler: Scheduler priority fixes (v2)</title>
<updated>2020-11-05T10:51:17+00:00</updated>
<author>
<name>Luben Tuikov</name>
<email>luben.tuikov@amd.com</email>
</author>
<published>2020-08-11T23:59:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c1918dd9f20cb5449d172e0ab43f6bdd16a46ac0'/>
<id>urn:sha1:c1918dd9f20cb5449d172e0ab43f6bdd16a46ac0</id>
<content type='text'>
[ Upstream commit e2d732fdb7a9e421720a644580cd6a9400f97f60 ]

Remove DRM_SCHED_PRIORITY_LOW, as it was used
in only one place.

Rename and separate by a line
DRM_SCHED_PRIORITY_MAX to DRM_SCHED_PRIORITY_COUNT
as it represents a (total) count of said
priorities and it is used as such in loops
throughout the code. (0-based indexing is the
the count number.)

Remove redundant word HIGH in priority names,
and rename *KERNEL* to *HIGH*, as it really
means that, high.

v2: Add back KERNEL and remove SW and HW,
    in lieu of a single HIGH between NORMAL and KERNEL.

Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'sched-fifo-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2020-08-06T18:55:43+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-08-06T18:55:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6d2b84a4e5b954bd2587e06c29577256f59e0030'/>
<id>urn:sha1:6d2b84a4e5b954bd2587e06c29577256f59e0030</id>
<content type='text'>
Pull sched/fifo updates from Ingo Molnar:
 "This adds the sched_set_fifo*() encapsulation APIs to remove static
  priority level knowledge from non-scheduler code.

  The three APIs for non-scheduler code to set SCHED_FIFO are:

   - sched_set_fifo()
   - sched_set_fifo_low()
   - sched_set_normal()

  These are two FIFO priority levels: default (high), and a 'low'
  priority level, plus sched_set_normal() to set the policy back to
  non-SCHED_FIFO.

  Since the changes affect a lot of non-scheduler code, we kept this in
  a separate tree"

* tag 'sched-fifo-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  sched,tracing: Convert to sched_set_fifo()
  sched: Remove sched_set_*() return value
  sched: Remove sched_setscheduler*() EXPORTs
  sched,psi: Convert to sched_set_fifo_low()
  sched,rcutorture: Convert to sched_set_fifo_low()
  sched,rcuperf: Convert to sched_set_fifo_low()
  sched,locktorture: Convert to sched_set_fifo()
  sched,irq: Convert to sched_set_fifo()
  sched,watchdog: Convert to sched_set_fifo()
  sched,serial: Convert to sched_set_fifo()
  sched,powerclamp: Convert to sched_set_fifo()
  sched,ion: Convert to sched_set_normal()
  sched,powercap: Convert to sched_set_fifo*()
  sched,spi: Convert to sched_set_fifo*()
  sched,mmc: Convert to sched_set_fifo*()
  sched,ivtv: Convert to sched_set_fifo*()
  sched,drm/scheduler: Convert to sched_set_fifo*()
  sched,msm: Convert to sched_set_fifo*()
  sched,psci: Convert to sched_set_fifo*()
  sched,drbd: Convert to sched_set_fifo*()
  ...
</content>
</entry>
<entry>
<title>Backmerge remote-tracking branch 'drm/drm-next' into drm-misc-next</title>
<updated>2020-06-29T10:16:26+00:00</updated>
<author>
<name>Maarten Lankhorst</name>
<email>maarten.lankhorst@linux.intel.com</email>
</author>
<published>2020-06-29T10:15:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=60e9eabf41fa916d2ef68c5bf929197975917578'/>
<id>urn:sha1:60e9eabf41fa916d2ef68c5bf929197975917578</id>
<content type='text'>
Some conflicts with ttm_bo-&gt;offset removal, but drm-misc-next needs updating to v5.8.

Signed-off-by: Maarten Lankhorst &lt;maarten.lankhorst@linux.intel.com&gt;
</content>
</entry>
<entry>
<title>drm/scheduler: improve job distribution with multiple queues</title>
<updated>2020-06-26T12:16:29+00:00</updated>
<author>
<name>Nirmoy Das</name>
<email>nirmoy.aiemd@gmail.com</email>
</author>
<published>2020-06-25T12:07:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d41a39dda1407ff50c7c4bfbd307c188f1ae364b'/>
<id>urn:sha1:d41a39dda1407ff50c7c4bfbd307c188f1ae364b</id>
<content type='text'>
This patch uses score to select a new drm scheduler for better
loadbalance between multiple drm schedulers instead of num_jobs.

Below are test results after running amdgpu_test for ~10 times.

Before this patch:

sched_name     num of many times it got schedule
=========      ==================================
sdma0          1463
sdma1          198
comp_1.0.1     280

After this patch:

sched_name     num of many times it got schedule
=========      ==================================
sdma0          925
sdma1          928
comp_1.0.1     177
comp_1.1.1     44
comp_1.2.1     43
comp_1.3.1     44

Signed-off-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/373000/
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
</content>
</entry>
<entry>
<title>sched,drm/scheduler: Convert to sched_set_fifo*()</title>
<updated>2020-06-15T12:10:21+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-04-21T10:09:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7b31e940b17bc47ca2f5e332c166231f01317469'/>
<id>urn:sha1:7b31e940b17bc47ca2f5e332c166231f01317469</id>
<content type='text'>
Because SCHED_FIFO is a broken scheduler model (see previous patches)
take away the priority field, the kernel can't possibly make an
informed decision.

In this case, use fifo_low, because it only cares about being above
SCHED_NORMAL. Effectively no change in behaviour.

Cc: alexander.deucher@amd.com
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/scheduler: fix drm_sched_get_cleanup_job</title>
<updated>2020-04-15T09:09:13+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2020-04-11T09:54:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8623b5255ae7ccaf276aac3920787bf575fa6b37'/>
<id>urn:sha1:8623b5255ae7ccaf276aac3920787bf575fa6b37</id>
<content type='text'>
We are racing to initialize sched-&gt;thread here, just always check the
current thread.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Kent Russell &lt;kent.russell@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/361303/
</content>
</entry>
<entry>
<title>drm/scheduler: fix rare NULL ptr race</title>
<updated>2020-03-25T21:00:11+00:00</updated>
<author>
<name>Yintian Tao</name>
<email>yttao@amd.com</email>
</author>
<published>2020-03-23T11:19:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=77bb2f204f1f0a53a602a8fd15816d6826212077'/>
<id>urn:sha1:77bb2f204f1f0a53a602a8fd15816d6826212077</id>
<content type='text'>
There is one one corner case at dma_fence_signal_locked
which will raise the NULL pointer problem just like below.
-&gt;dma_fence_signal
    -&gt;dma_fence_signal_locked
	-&gt;test_and_set_bit
here trigger dma_fence_release happen due to the zero of fence refcount.

-&gt;dma_fence_put
    -&gt;dma_fence_release
	-&gt;drm_sched_fence_release_scheduled
	    -&gt;call_rcu
here make the union fled “cb_list” at finished fence
to NULL because struct rcu_head contains two pointer
which is same as struct list_head cb_list

Therefore, to hold the reference of finished fence at drm_sched_process_job
to prevent the null pointer during finished fence dma_fence_signal

[  732.912867] BUG: kernel NULL pointer dereference, address: 0000000000000008
[  732.914815] #PF: supervisor write access in kernel mode
[  732.915731] #PF: error_code(0x0002) - not-present page
[  732.916621] PGD 0 P4D 0
[  732.917072] Oops: 0002 [#1] SMP PTI
[  732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G           OE     5.4.0-rc7 #1
[  732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[  732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
[  732.938569] Call Trace:
[  732.939003]  &lt;IRQ&gt;
[  732.939364]  dma_fence_signal+0x29/0x50
[  732.940036]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[  732.940996]  drm_sched_process_job+0x34/0xa0 [gpu_sched]
[  732.941910]  dma_fence_signal_locked+0x85/0x100
[  732.942692]  dma_fence_signal+0x29/0x50
[  732.943457]  amdgpu_fence_process+0x99/0x120 [amdgpu]
[  732.944393]  sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]

v2: hold the finished fence at drm_sched_process_job instead of
    amdgpu_fence_process
v3: resume the blank line

Signed-off-by: Yintian Tao &lt;yttao@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/sched: implement and export drm_sched_pick_best</title>
<updated>2020-03-16T20:21:32+00:00</updated>
<author>
<name>Nirmoy Das</name>
<email>nirmoy.das@amd.com</email>
</author>
<published>2020-03-13T10:39:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ec2edcc2796c892aa0cc4740ce00a22fe57d2a1c'/>
<id>urn:sha1:ec2edcc2796c892aa0cc4740ce00a22fe57d2a1c</id>
<content type='text'>
Remove drm_sched_entity_get_free_sched() and use the logic of picking
the least loaded drm scheduler from a drm scheduler list to implement
drm_sched_pick_best(). This patch also exports drm_sched_pick_best() so
that it can be utilized by other drm drivers.

Signed-off-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>Revert "drm/scheduler: improve job distribution with multiple queues"</title>
<updated>2020-03-16T20:21:32+00:00</updated>
<author>
<name>changzhu</name>
<email>Changfeng.Zhu@amd.com</email>
</author>
<published>2020-03-11T11:12:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d164bebb95516c9dd2a63cf8c8e9fe0b13d7474e'/>
<id>urn:sha1:d164bebb95516c9dd2a63cf8c8e9fe0b13d7474e</id>
<content type='text'>
It needs to revert this patch to avoid amdgpu_test compute hang problem
on picasso.

This reverts commit 56822db194232c089601728d68ed078dccb97f8b.

Signed-off-by: changzhu &lt;Changfeng.Zhu@amd.com&gt;
Reviewed-by: Feifei Xu &lt;Feifei.Xu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/scheduler: fix inconsistent locking of job_list_lock</title>
<updated>2020-03-13T15:52:36+00:00</updated>
<author>
<name>Lucas Stach</name>
<email>l.stach@pengutronix.de</email>
</author>
<published>2020-01-20T10:51:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a7fbb630c5485f5095146df46f04c2ca1a24c299'/>
<id>urn:sha1:a7fbb630c5485f5095146df46f04c2ca1a24c299</id>
<content type='text'>
1db8c142b6c5 (drm/scheduler: Add drm_sched_suspend/resume_timeout()) made
the job_list_lock IRQ safe in as the suspend/resume calls were expected to
be called from IRQ context. This usage never materialized in upstream.
Instead amdgpu started locking the job_list_lock in an IRQ unsafe way in
amdgpu_ib_preempt_mark_partial_job() and amdgpu_ib_preempt_job_recovery(),
which leads to potential deadlock if one would actually start to call the
drm_sched_suspend/resume_timeout functions from IRQ context.

As no current user needs the locking to be IRQ safe, the local IRQ
disable/enable is pure overhead. Fix the inconsistent locking by changing
all uses of job_list_lock to use the IRQ unsafe locking primitives.

Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
