<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/drm/gpu_scheduler.h, branch v5.15.209</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.209</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.209'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2021-07-01T06:53:25+00:00</updated>
<entry>
<title>drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr</title>
<updated>2021-07-01T06:53:25+00:00</updated>
<author>
<name>Boris Brezillon</name>
<email>boris.brezillon@collabora.com</email>
</author>
<published>2021-06-30T06:27:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=78efe21b6f8e6f4d39fceaf0cc5c534c11f9dd60'/>
<id>urn:sha1:78efe21b6f8e6f4d39fceaf0cc5c534c11f9dd60</id>
<content type='text'>
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize timeout
works with the reset work. One solution to address that is to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. Thanks to the serialization
provided by the ordered workqueue we are guaranteed that timeout
handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.

v5:
* Add a new paragraph to the timedout_job() method

v3:
* New patch

v4:
* Actually use the timeout_wq to queue the timeout work

Suggested-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Signed-off-by: Boris Brezillon &lt;boris.brezillon@collabora.com&gt;
Reviewed-by: Steven Price &lt;steven.price@arm.com&gt;
Reviewed-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Acked-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Qiang Yu &lt;yuq825@gmail.com&gt;
Cc: Emma Anholt &lt;emma@anholt.net&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: "Christian König" &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com
</content>
</entry>
<entry>
<title>drm/sched: Document what the timedout_job method should do</title>
<updated>2021-07-01T06:53:25+00:00</updated>
<author>
<name>Boris Brezillon</name>
<email>boris.brezillon@collabora.com</email>
</author>
<published>2021-06-30T06:27:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1fad1b7ed1ebfcfb5a1d0d21b0c47f7af5f49a6c'/>
<id>urn:sha1:1fad1b7ed1ebfcfb5a1d0d21b0c47f7af5f49a6c</id>
<content type='text'>
The documentation is a bit vague and doesn't really describe what the
-&gt;timedout_job() is expected to do. Let's add a few more details.

v5:
* New patch

Suggested-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Signed-off-by: Boris Brezillon &lt;boris.brezillon@collabora.com&gt;
Reviewed-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-2-boris.brezillon@collabora.com
</content>
</entry>
<entry>
<title>drm/sched: Fix inverted comment for hang_limit</title>
<updated>2021-06-03T08:32:49+00:00</updated>
<author>
<name>Alyssa Rosenzweig</name>
<email>alyssa.rosenzweig@collabora.com</email>
</author>
<published>2021-05-28T23:51:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=95b2151fec3e62ba0033c61bd388ff0111884972'/>
<id>urn:sha1:95b2151fec3e62ba0033c61bd388ff0111884972</id>
<content type='text'>
The hang_limit is the threshold after which the kernel no longer
attempts to schedule a job. Its documentation stated the opposite due to
a typo. Correct the wording to indicate the actual purpose of the field.

Signed-off-by: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: David Airlie &lt;airlied@linux.ie&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210528235152.38447-1-alyssa.rosenzweig@collabora.com
</content>
</entry>
<entry>
<title>drm/amd/amdgpu implement tdr advanced mode</title>
<updated>2021-04-09T20:45:45+00:00</updated>
<author>
<name>Jack Zhang</name>
<email>Jack.Zhang1@amd.com</email>
</author>
<published>2021-03-08T04:41:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e6c6338f393b74ac0b303d567bb918b44ae7ad75'/>
<id>urn:sha1:e6c6338f393b74ac0b303d567bb918b44ae7ad75</id>
<content type='text'>
[Why]
Previous tdr design treats the first job in job_timeout as the bad job.
But sometimes a later bad compute job can block a good gfx job and
cause an unexpected gfx job timeout because gfx and compute ring share
internal GC HW mutually.

[How]
This patch implements an advanced tdr mode.It involves an additinal
synchronous pre-resubmit step(Step0 Resubmit) before normal resubmit
step in order to find the real bad job.

1. At Step0 Resubmit stage, it synchronously submits and pends for the
first job being signaled. If it gets timeout, we identify it as guilty
and do hw reset. After that, we would do the normal resubmit step to
resubmit left jobs.

2. For whole gpu reset(vram lost), do resubmit as the old way.

v2: squash in build fix (Alex)

Signed-off-by: Jack Zhang &lt;Jack.Zhang1@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/sched: add missing member documentation</title>
<updated>2021-04-08T12:59:45+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2021-04-01T12:50:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=be318fd85bf2c73c10850a6ce50a87e6f0068926'/>
<id>urn:sha1:be318fd85bf2c73c10850a6ce50a87e6f0068926</id>
<content type='text'>
Just fix a warning.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Fixes: f2f12eb9c32b ("drm/scheduler: provide scheduler score externally")
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210401125213.138855-1-christian.koenig@amd.com
</content>
</entry>
<entry>
<title>drm/scheduler: provide scheduler score externally</title>
<updated>2021-02-05T09:47:11+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2021-02-02T11:40:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f2f12eb9c32bc7a714276d8862efac2e7c41bcbe'/>
<id>urn:sha1:f2f12eb9c32bc7a714276d8862efac2e7c41bcbe</id>
<content type='text'>
Allow multiple schedulers to share the load balancing score.

This is useful when one engine has different hw rings.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-and-Tested-by: Leo Liu &lt;leo.liu@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210204144405.2737-1-christian.koenig@amd.com
</content>
</entry>
<entry>
<title>drm/scheduler: Job timeout handler returns status (v3)</title>
<updated>2021-01-29T10:30:22+00:00</updated>
<author>
<name>Luben Tuikov</name>
<email>luben.tuikov@amd.com</email>
</author>
<published>2021-01-20T20:09:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a6a1f036c74e3d2a3a56b3140492c7c3ecb879f3'/>
<id>urn:sha1:a6a1f036c74e3d2a3a56b3140492c7c3ecb879f3</id>
<content type='text'>
This patch does not change current behaviour.

The driver's job timeout handler now returns
status indicating back to the DRM layer whether
the device (GPU) is no longer available, such as
after it's been unplugged, or whether all is
normal, i.e. current behaviour.

All drivers which make use of the
drm_sched_backend_ops' .timedout_job() callback
have been accordingly renamed and return the
would've-been default value of
DRM_GPU_SCHED_STAT_NOMINAL to restart the task's
timeout timer--this is the old behaviour, and is
preserved by this patch.

v2: Use enum as the status of a driver's job
    timeout callback method.

v3: Return scheduler/device information, rather
    than task information.

Cc: Alexander Deucher &lt;Alexander.Deucher@amd.com&gt;
Cc: Andrey Grodzovsky &lt;Andrey.Grodzovsky@amd.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Cc: Russell King &lt;linux+etnaviv@armlinux.org.uk&gt;
Cc: Christian Gmeiner &lt;christian.gmeiner@gmail.com&gt;
Cc: Qiang Yu &lt;yuq825@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Eric Anholt &lt;eric@anholt.net&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Acked-by: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Steven Price &lt;steven.price@arm.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/415095/
</content>
</entry>
<entry>
<title>drm/sched: Add missing structure comment</title>
<updated>2020-12-10T12:33:41+00:00</updated>
<author>
<name>Luben Tuikov</name>
<email>luben.tuikov@amd.com</email>
</author>
<published>2020-12-09T22:31:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c365d304d69ab2a38e1431323d17a216b7930e32'/>
<id>urn:sha1:c365d304d69ab2a38e1431323d17a216b7930e32</id>
<content type='text'>
Add a missing structure comment for the recently
added @list member.

Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Fixes:  8935ff00e3b1 ("drm/scheduler: "node" --&gt; "list"")
Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/406546/
</content>
</entry>
<entry>
<title>gpu/drm: ring_mirror_list --&gt; pending_list</title>
<updated>2020-12-08T13:38:03+00:00</updated>
<author>
<name>Luben Tuikov</name>
<email>luben.tuikov@amd.com</email>
</author>
<published>2020-12-04T03:17:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6efa4b465cfdaa9be251ba043253e302ddb9976f'/>
<id>urn:sha1:6efa4b465cfdaa9be251ba043253e302ddb9976f</id>
<content type='text'>
Rename "ring_mirror_list" to "pending_list",
to describe what something is, not what it does,
how it's used, or how the hardware implements it.

This also abstracts the actual hardware
implementation, i.e. how the low-level driver
communicates with the device it drives, ring, CAM,
etc., shouldn't be exposed to DRM.

The pending_list keeps jobs submitted, which are
out of our control. Usually this means they are
pending execution status in hardware, but the
latter definition is a more general (inclusive)
definition.

Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/405573/

Cc: Alexander Deucher &lt;Alexander.Deucher@amd.com&gt;
Cc: Andrey Grodzovsky &lt;Andrey.Grodzovsky@amd.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/scheduler: "node" --&gt; "list"</title>
<updated>2020-12-08T13:37:55+00:00</updated>
<author>
<name>Luben Tuikov</name>
<email>luben.tuikov@amd.com</email>
</author>
<published>2020-12-04T03:17:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8935ff00e3b110e47e3a291ff11951eb1066b1ae'/>
<id>urn:sha1:8935ff00e3b110e47e3a291ff11951eb1066b1ae</id>
<content type='text'>
Rename "node" to "list" in struct drm_sched_job,
in order to make it consistent with what we see
being used throughout gpu_scheduler.h, for
instance in struct drm_sched_entity, as well as
the rest of DRM and the kernel.

Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/403515/

Cc: Alexander Deucher &lt;Alexander.Deucher@amd.com&gt;
Cc: Andrey Grodzovsky &lt;Andrey.Grodzovsky@amd.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
</content>
</entry>
</feed>
