<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/scheduler, branch v5.15.89</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.89</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.89'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2021-07-01T06:53:25+00:00</updated>
<entry>
<title>drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr</title>
<updated>2021-07-01T06:53:25+00:00</updated>
<author>
<name>Boris Brezillon</name>
<email>boris.brezillon@collabora.com</email>
</author>
<published>2021-06-30T06:27:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=78efe21b6f8e6f4d39fceaf0cc5c534c11f9dd60'/>
<id>urn:sha1:78efe21b6f8e6f4d39fceaf0cc5c534c11f9dd60</id>
<content type='text'>
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize timeout
works with the reset work. One solution to address that is to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. Thanks to the serialization
provided by the ordered workqueue we are guaranteed that timeout
handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.

v5:
* Add a new paragraph to the timedout_job() method

v3:
* New patch

v4:
* Actually use the timeout_wq to queue the timeout work

Suggested-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Signed-off-by: Boris Brezillon &lt;boris.brezillon@collabora.com&gt;
Reviewed-by: Steven Price &lt;steven.price@arm.com&gt;
Reviewed-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Acked-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Qiang Yu &lt;yuq825@gmail.com&gt;
Cc: Emma Anholt &lt;emma@anholt.net&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: "Christian König" &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com
</content>
</entry>
<entry>
<title>drm/sched: Declare entity idle only after HW submission</title>
<updated>2021-06-28T11:16:49+00:00</updated>
<author>
<name>Boris Brezillon</name>
<email>boris.brezillon@collabora.com</email>
</author>
<published>2021-06-24T14:08:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3b5ac97ad468f6cfd31346821a3a2b9f13d23015'/>
<id>urn:sha1:3b5ac97ad468f6cfd31346821a3a2b9f13d23015</id>
<content type='text'>
The panfrost driver tries to kill in-flight jobs on FD close after
destroying the FD scheduler entities. For this to work properly, we
need to make sure the jobs popped from the scheduler entities have
been queued at the HW level before declaring the entity idle, otherwise
we might iterate over a list that doesn't contain those jobs.

Suggested-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Signed-off-by: Boris Brezillon &lt;boris.brezillon@collabora.com&gt;
Cc: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Reviewed-by: Steven Price &lt;steven.price@arm.com&gt;
Reviewed-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210624140850.2229697-1-boris.brezillon@collabora.com
</content>
</entry>
<entry>
<title>drm/sched: Avoid data corruptions</title>
<updated>2021-05-20T03:50:28+00:00</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2021-05-19T14:14:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0b10ab80695d61422337ede6ff496552d8ace99d'/>
<id>urn:sha1:0b10ab80695d61422337ede6ff496552d8ace99d</id>
<content type='text'>
Wait for all dependencies of a job  to complete before
killing it to avoid data corruptions.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210519141407.88444-1-andrey.grodzovsky@amd.com
</content>
</entry>
<entry>
<title>drm/scheduler: Fix hang when sched_entity released</title>
<updated>2021-05-20T03:50:28+00:00</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2021-05-12T14:26:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c61cdbdbffc169dc7f1e6fe94dfffaf574fe672a'/>
<id>urn:sha1:c61cdbdbffc169dc7f1e6fe94dfffaf574fe672a</id>
<content type='text'>
Problem: If scheduler is already stopped by the time sched_entity
is released and entity's job_queue not empty I encountred
a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle
never becomes false.

Fix: In drm_sched_fini detach all sched_entities from the
scheduler's run queues. This will satisfy drm_sched_entity_is_idle.
Also wakeup all those processes stuck in sched_entity flushing
as the scheduler main thread which wakes them up is stopped by now.

v2:
Reverse order of drm_sched_rq_remove_entity and marking
s_entity as stopped to prevent reinserion back to rq due
to race.

v3:
Drop drm_sched_rq_remove_entity, only modify entity-&gt;stopped
and check for it in drm_sched_entity_is_idle

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210512142648.666476-14-andrey.grodzovsky@amd.com
</content>
</entry>
<entry>
<title>drm/sched: Make timeout timer rearm conditional.</title>
<updated>2021-05-20T03:50:28+00:00</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2021-05-12T14:26:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=75973e5802afece679d3936328e23a1891a9badc'/>
<id>urn:sha1:75973e5802afece679d3936328e23a1891a9badc</id>
<content type='text'>
We don't want to rearm the timer if driver hook reports
that the device is gone.

v5: Update drm_gpu_sched_stat values in code.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210512142648.666476-11-andrey.grodzovsky@amd.com
</content>
</entry>
<entry>
<title>drm/scheduler: Change scheduled fence track v2</title>
<updated>2021-05-05T07:26:36+00:00</updated>
<author>
<name>Roy Sun</name>
<email>Roy.Sun@amd.com</email>
</author>
<published>2021-04-26T06:27:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1774baa64f9395fa884ea9ed494bcb043f3b83f5'/>
<id>urn:sha1:1774baa64f9395fa884ea9ed494bcb043f3b83f5</id>
<content type='text'>
Update the timestamp of scheduled fence on HW
completion of the previous fences

This allow more accurate tracking of the fence
execution in HW

v2 (chk): drop the flag check and improve the comment

Signed-off-by: David M Nieto &lt;david.nieto@amd.com&gt;
Signed-off-by: Roy Sun &lt;Roy.Sun@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210426062701.39732-1-Roy.Sun@amd.com
</content>
</entry>
<entry>
<title>Merge drm/drm-next into drm-misc-next</title>
<updated>2021-04-26T12:03:09+00:00</updated>
<author>
<name>Maxime Ripard</name>
<email>maxime@cerno.tech</email>
</author>
<published>2021-04-26T12:03:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=355b60296143a090039211c5f0e1463f84aab65a'/>
<id>urn:sha1:355b60296143a090039211c5f0e1463f84aab65a</id>
<content type='text'>
Christian needs some patches from drm/next

Signed-off-by: Maxime Ripard &lt;maxime@cerno.tech&gt;
</content>
</entry>
<entry>
<title>drm/scheduler/sched_entity: Fix some function name disparity</title>
<updated>2021-04-22T12:05:17+00:00</updated>
<author>
<name>Lee Jones</name>
<email>lee.jones@linaro.org</email>
</author>
<published>2021-04-16T14:37:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=04be0c5b40a3dce3e4790aa507d0029a7d31c0b2'/>
<id>urn:sha1:04be0c5b40a3dce3e4790aa507d0029a7d31c0b2</id>
<content type='text'>
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/scheduler/sched_entity.c:204: warning: expecting prototype for drm_sched_entity_kill_jobs(). Prototype was for drm_sched_entity_kill_jobs_cb() instead
 drivers/gpu/drm/scheduler/sched_entity.c:262: warning: expecting prototype for drm_sched_entity_cleanup(). Prototype was for drm_sched_entity_fini() instead
 drivers/gpu/drm/scheduler/sched_entity.c:305: warning: expecting prototype for drm_sched_entity_fini(). Prototype was for drm_sched_entity_destroy() instead

Cc: David Airlie &lt;airlied@linux.ie&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Cc: "Christian König" &lt;christian.koenig@amd.com&gt;
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210416143725.2769053-25-lee.jones@linaro.org
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/amdgpu implement tdr advanced mode</title>
<updated>2021-04-09T20:45:45+00:00</updated>
<author>
<name>Jack Zhang</name>
<email>Jack.Zhang1@amd.com</email>
</author>
<published>2021-03-08T04:41:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e6c6338f393b74ac0b303d567bb918b44ae7ad75'/>
<id>urn:sha1:e6c6338f393b74ac0b303d567bb918b44ae7ad75</id>
<content type='text'>
[Why]
Previous tdr design treats the first job in job_timeout as the bad job.
But sometimes a later bad compute job can block a good gfx job and
cause an unexpected gfx job timeout because gfx and compute ring share
internal GC HW mutually.

[How]
This patch implements an advanced tdr mode.It involves an additinal
synchronous pre-resubmit step(Step0 Resubmit) before normal resubmit
step in order to find the real bad job.

1. At Step0 Resubmit stage, it synchronously submits and pends for the
first job being signaled. If it gets timeout, we identify it as guilty
and do hw reset. After that, we would do the normal resubmit step to
resubmit left jobs.

2. For whole gpu reset(vram lost), do resubmit as the old way.

v2: squash in build fix (Alex)

Signed-off-by: Jack Zhang &lt;Jack.Zhang1@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/sched: select new rq even if there is only one v3</title>
<updated>2021-03-08T13:06:17+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2021-02-20T08:50:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ac4eb83ab255de9c31184df51fd1534ba36fd212'/>
<id>urn:sha1:ac4eb83ab255de9c31184df51fd1534ba36fd212</id>
<content type='text'>
This is necessary when changing priorities of an entity.

v2: test the sched_list instead of num_sched.
v3: set the sched_list to NULL when there is only one entry

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Sonny Jiang &lt;sonny.jiang@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210305125155.2312-1-christian.koenig@amd.com
</content>
</entry>
</feed>
