<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/scheduler, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-13T16:20:43+00:00</updated>
<entry>
<title>drm/sched: Fix kernel-doc warning for drm_sched_job_done()</title>
<updated>2026-03-13T16:20:43+00:00</updated>
<author>
<name>Yujie Liu</name>
<email>yujie.liu@intel.com</email>
</author>
<published>2026-02-27T08:24:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1e6eaf2150ef4594c458638a6cc077ffe28a31ac'/>
<id>urn:sha1:1e6eaf2150ef4594c458638a6cc077ffe28a31ac</id>
<content type='text'>
[ Upstream commit 61ded1083b264ff67ca8c2de822c66b6febaf9a8 ]

There is a kernel-doc warning for the scheduler:

Warning: drivers/gpu/drm/scheduler/sched_main.c:367 function parameter 'result' not described in 'drm_sched_job_done'

Fix the warning by describing the undocumented error code.

Fixes: 539f9ee4b52a ("drm/scheduler: properly forward fence errors")
Signed-off-by: Yujie Liu &lt;yujie.liu@intel.com&gt;
[phasta: Flesh out commit message]
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patch.msgid.link/20260227082452.1802922-1-yujie.liu@intel.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb</title>
<updated>2025-11-13T20:34:39+00:00</updated>
<author>
<name>Pierre-Eric Pelloux-Prayer</name>
<email>pierre-eric.pelloux-prayer@amd.com</email>
</author>
<published>2025-11-04T09:53:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0d63031ee4a57be0252cb9a4e09ae921c75cece9'/>
<id>urn:sha1:0d63031ee4a57be0252cb9a4e09ae921c75cece9</id>
<content type='text'>
commit 487df8b698345dd5a91346335f05170ed5f29d4e upstream.

The Mesa issue referenced below pointed out a possible deadlock:

[ 1231.611031]  Possible interrupt unsafe locking scenario:

[ 1231.611033]        CPU0                    CPU1
[ 1231.611034]        ----                    ----
[ 1231.611035]   lock(&amp;xa-&gt;xa_lock#17);
[ 1231.611038]                                local_irq_disable();
[ 1231.611039]                                lock(&amp;fence-&gt;lock);
[ 1231.611041]                                lock(&amp;xa-&gt;xa_lock#17);
[ 1231.611044]   &lt;Interrupt&gt;
[ 1231.611045]     lock(&amp;fence-&gt;lock);
[ 1231.611047]
                *** DEADLOCK ***

In this example, CPU0 would be any function accessing job-&gt;dependencies
through the xa_* functions that don't disable interrupts (e.g.
drm_sched_job_add_dependency(), drm_sched_entity_kill_jobs_cb()).

CPU1 is executing drm_sched_entity_kill_jobs_cb() as a fence signalling
callback, and therefore in interrupt context. It will deadlock when trying
to grab the xa_lock, which is already held by CPU0.

Replacing all xa_* usage by their xa_*_irq counterparts would fix
this issue, but Christian pointed out another issue: dma_fence_signal
takes fence.lock and so does dma_fence_add_callback.

  dma_fence_signal() // locks f1.lock
  -&gt; drm_sched_entity_kill_jobs_cb()
  -&gt; foreach dependencies
     -&gt; dma_fence_add_callback() // locks f2.lock

This will deadlock if f1 and f2 share the same spinlock.

To fix both issues, the code iterating on dependencies and re-arming them
is moved out to drm_sched_entity_kill_jobs_work().
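
For illustration only (this is not kernel code), the shared-lock case can be
modelled in Python, with a non-reentrant threading.Lock standing in for the
fence spinlock. The helpers add_callback(), signal_inline() and
signal_deferred() are hypothetical names; the deferred variant only mirrors
the idea of moving the re-arming into a work item.

```python
import threading

# One non-reentrant lock stands in for a spinlock shared by fences f1 and f2.
shared_lock = threading.Lock()


def add_callback(lock):
    """Toy stand-in for dma_fence_add_callback(): needs the fence lock.

    Returns False instead of blocking when the lock is already held.
    """
    if not lock.acquire(blocking=False):
        return False  # a real spinlock would spin here forever
    lock.release()
    return True


def signal_inline(lock):
    """The callback re-arms a dependency while the signalling lock is held."""
    with lock:
        return add_callback(lock)


def signal_deferred(lock, work):
    """The callback only queues work; re-arming runs after the lock is dropped."""
    with lock:
        work.append(lambda: add_callback(lock))
    return [item() for item in work]


inline_ok = signal_inline(shared_lock)
deferred_ok = signal_deferred(shared_lock, [])
print(inline_ok, deferred_ok)
```

In this model the inline variant cannot take the lock a second time, while
the deferred variant succeeds because the lock has been released before the
queued work runs.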

Cc: stable@vger.kernel.org # v6.2+
Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13908
Reported-by: Mikhail Gavrilov &lt;mikhail.v.gavrilov@gmail.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Pierre-Eric Pelloux-Prayer &lt;pierre-eric.pelloux-prayer@amd.com&gt;
[phasta: commit message nits]
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patch.msgid.link/20251104095358.15092-1-pierre-eric.pelloux-prayer@amd.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/sched: Fix race in drm_sched_entity_select_rq()</title>
<updated>2025-11-13T20:34:01+00:00</updated>
<author>
<name>Philipp Stanner</name>
<email>phasta@kernel.org</email>
</author>
<published>2025-11-03T12:44:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6be3d7f731dcf2e0ae8b88a4f7e7cefb0e1fb991'/>
<id>urn:sha1:6be3d7f731dcf2e0ae8b88a4f7e7cefb0e1fb991</id>
<content type='text'>
[ Upstream commit d25e3a610bae03bffc5c14b5d944a5d0cd844678 ]

In a past bug fix it was forgotten that entity access must be protected
by the entity lock. That's a data race and potentially UB.

Move the spin_unlock() to the appropriate position.

Cc: stable@vger.kernel.org # v5.13+
Fixes: ac4eb83ab255 ("drm/sched: select new rq even if there is only one v3")
Reviewed-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patch.msgid.link/20251022063402.87318-2-phasta@kernel.org
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/sched: Re-group and rename the entity run-queue lock</title>
<updated>2025-11-13T20:34:01+00:00</updated>
<author>
<name>Tvrtko Ursulin</name>
<email>tvrtko.ursulin@igalia.com</email>
</author>
<published>2025-11-03T12:44:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f78da4cad3b112375881244a3538e2ba0feb2220'/>
<id>urn:sha1:f78da4cad3b112375881244a3538e2ba0feb2220</id>
<content type='text'>
[ Upstream commit f93126f5d55920d1447ef00a3fbe6706f40f53de ]

When writing to a drm_sched_entity's run-queue, writers are protected
through the lock drm_sched_entity.rq_lock. This naming, however,
frequently collides with the separate internal lock of struct
drm_sched_rq, resulting in uses like this:

	spin_lock(&amp;entity-&gt;rq_lock);
	spin_lock(&amp;entity-&gt;rq-&gt;lock);

Rename drm_sched_entity.rq_lock to improve readability. While at it,
re-order that struct's members to make it more obvious what the lock
protects.

v2:
 * Rename some rq_lock stragglers in kerneldoc, improve commit text. (Philipp)

Signed-off-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: Luben Tuikov &lt;ltuikov89@gmail.com&gt;
Cc: Matthew Brost &lt;matthew.brost@intel.com&gt;
Cc: Philipp Stanner &lt;pstanner@redhat.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
[pstanner: Fix typo in docstring]
Signed-off-by: Philipp Stanner &lt;pstanner@redhat.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20241016122013.7857-5-tursulin@igalia.com
Stable-dep-of: d25e3a610bae ("drm/sched: Fix race in drm_sched_entity_select_rq()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/sched: Optimise drm_sched_entity_push_job</title>
<updated>2025-11-13T20:34:01+00:00</updated>
<author>
<name>Tvrtko Ursulin</name>
<email>tvrtko.ursulin@igalia.com</email>
</author>
<published>2025-11-03T12:44:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=431b4e8c7bfdc7c5535aaa8bf7f1dabf848fca1e'/>
<id>urn:sha1:431b4e8c7bfdc7c5535aaa8bf7f1dabf848fca1e</id>
<content type='text'>
[ Upstream commit d42a254633c773921884a19e8a1a0f53a31150c3 ]

In FIFO mode (which is the default), both drm_sched_entity_push_job() and
drm_sched_rq_update_fifo(), where the former calls the latter, are
currently taking and releasing the same entity-&gt;rq_lock.

We can avoid that design inelegance, and also have a minuscule
efficiency improvement on the submit from idle path, by introducing a new
drm_sched_rq_update_fifo_locked() helper and pulling up the lock taking to
its callers.

v2:
 * Remove drm_sched_rq_update_fifo() altogether. (Christian)

v3:
 * Improved commit message. (Philipp)
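
As a rough sketch of the lock hoisting described above (a Python toy model,
not the scheduler code), a counting lock shows the difference between a
helper that takes the lock itself and a "_locked" helper whose caller
already holds it. CountingLock and the push_job_old() / push_job_new()
functions are hypothetical names used only for this illustration.

```python
import threading


class CountingLock:
    """Non-reentrant lock that counts how often it is taken."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, *exc):
        self._lock.release()
        return False


def update_fifo(lock, queue, item):
    """Old style: the helper takes the lock itself."""
    with lock:
        queue.append(item)


def update_fifo_locked(queue, item):
    """New style: the caller must already hold the lock."""
    queue.append(item)


def push_job_old(lock, queue, item):
    with lock:
        first = not queue  # decide under the lock
    if first:
        update_fifo(lock, queue, item)  # takes the same lock again


def push_job_new(lock, queue, item):
    with lock:
        if not queue:
            update_fifo_locked(queue, item)  # single critical section


old_lock, new_lock = CountingLock(), CountingLock()
push_job_old(old_lock, [], "job")
push_job_new(new_lock, [], "job")
print(old_lock.acquisitions, new_lock.acquisitions)
```

The old path takes the lock twice for one submission and the new path takes
it once, which is the small saving the commit message refers to.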

Signed-off-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: Luben Tuikov &lt;ltuikov89@gmail.com&gt;
Cc: Matthew Brost &lt;matthew.brost@intel.com&gt;
Cc: Philipp Stanner &lt;pstanner@redhat.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Philipp Stanner &lt;pstanner@redhat.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20241016122013.7857-2-tursulin@igalia.com
Stable-dep-of: d25e3a610bae ("drm/sched: Fix race in drm_sched_entity_select_rq()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/sched: avoid killing parent entity on child SIGKILL</title>
<updated>2025-11-13T20:34:00+00:00</updated>
<author>
<name>David Rosca</name>
<email>david.rosca@amd.com</email>
</author>
<published>2025-10-15T14:01:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3ec3d47e3a03d5f1fe663edafcf0051cf0460788'/>
<id>urn:sha1:3ec3d47e3a03d5f1fe663edafcf0051cf0460788</id>
<content type='text'>
commit 9e8b3201c7302d5b522ba3535630bed21cc03c27 upstream.

The DRM scheduler tracks which process last used an entity and, when that
process is killed, blocks all further submissions to that entity.

The problem is that we didn't track who initially created an entity, so
when a process accidentally leaked its file descriptor to a child and
that child got killed, we killed the parent's entities.

Avoid that and instead initialize the entity's last user on entity
creation. This also allows dropping the extra NULL check.

Signed-off-by: David Rosca &lt;david.rosca@amd.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4568
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
CC: stable@vger.kernel.org
Acked-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://lore.kernel.org/r/20251015140128.1470-1-christian.koenig@amd.com
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patch.msgid.link/20251015140128.1470-1-christian.koenig@amd.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/sched: Fix potential double free in drm_sched_job_add_resv_dependencies</title>
<updated>2025-10-23T14:20:21+00:00</updated>
<author>
<name>Tvrtko Ursulin</name>
<email>tvrtko.ursulin@igalia.com</email>
</author>
<published>2025-10-15T08:40:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e5e3eb2aff92994ee81ce633f1c4e73bd4b87e11'/>
<id>urn:sha1:e5e3eb2aff92994ee81ce633f1c4e73bd4b87e11</id>
<content type='text'>
commit 5801e65206b065b0b2af032f7f1eef222aa2fd83 upstream.

When adding dependencies with drm_sched_job_add_dependency(), that
function consumes the fence reference both on success and failure, so in
the latter case the dma_fence_put() on the error path (xarray failed to
expand) is a double free.

Interestingly this bug appears to have been present ever since
commit ebd5f74255b9 ("drm/sched: Add dependency tracking"), since the code
back then looked like this:

drm_sched_job_add_implicit_dependencies():
...
       for (i = 0; i &lt; fence_count; i++) {
               ret = drm_sched_job_add_dependency(job, fences[i]);
               if (ret)
                       break;
       }

       for (; i &lt; fence_count; i++)
               dma_fence_put(fences[i]);

Which means for the failing 'i' the dma_fence_put was already a double
free. Possibly there were no users at that time, or the test cases were
insufficient to hit it.

The bug was then only noticed and fixed after
commit 9c2ba265352a ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2")
landed, with its fixup of
commit 4eaf02d6076c ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies").

At that point it was a slightly different flavour of a double free, which
commit 963d0b356935 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder")
noticed and attempted to fix.

But it only moved the double free from happening inside the
drm_sched_job_add_dependency(), when releasing the reference not yet
obtained, to the caller, when releasing the reference already released by
the former in the failure case.

As such, it is not easy to identify the right target for the fixes tag, so
let's keep it simple and just continue the chain.

While fixing we also improve the comment and explain the reason for taking
the reference and not dropping it.
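
The ownership rule can be illustrated with a toy reference count in Python
(not kernel code). The Fence class and the add_dependency(), buggy_caller()
and fixed_caller() functions are hypothetical stand-ins; the only property
taken from the text above is that the callee consumes the reference on both
success and failure.

```python
class Fence:
    """Toy reference-counted object standing in for a dma_fence."""

    def __init__(self):
        self.refs = 1  # reference held by whoever iterates the fences

    def get(self):
        self.refs += 1
        return self

    def put(self):
        if self.refs == 0:
            raise RuntimeError("put on a fence with no references")
        self.refs -= 1


def add_dependency(fence, fail):
    """Consumes the caller's reference on success and on failure."""
    if fail:
        fence.put()  # the error path drops the reference it was handed
        return -1
    return 0  # success: the reference is kept by the job


def buggy_caller(fence):
    ret = add_dependency(fence.get(), fail=True)
    if ret:
        fence.put()  # second put of the same reference
    return fence.refs


def fixed_caller(fence):
    add_dependency(fence.get(), fail=True)
    return fence.refs  # no extra put: the callee already consumed it


buggy_refs = buggy_caller(Fence())
fixed_refs = fixed_caller(Fence())
print(buggy_refs, fixed_refs)
```

In the buggy variant the count reaches zero although the iterator still
believes it holds a reference; in the fixed variant that reference survives.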

Signed-off-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Fixes: 963d0b356935 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder")
Reported-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Closes: https://lore.kernel.org/dri-devel/aNFbXq8OeYl3QSdm@stanley.mountain/
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Rob Clark &lt;robdclark@chromium.org&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Matthew Brost &lt;matthew.brost@intel.com&gt;
Cc: Danilo Krummrich &lt;dakr@kernel.org&gt;
Cc: Philipp Stanner &lt;phasta@kernel.org&gt;
Cc: Christian König &lt;ckoenig.leichtzumerken@gmail.com&gt;
Cc: dri-devel@lists.freedesktop.org
Cc: stable@vger.kernel.org # v5.16+
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://lore.kernel.org/r/20251015084015.6273-1-tvrtko.ursulin@igalia.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/sched: Remove optimization that causes hang when killing dependent jobs</title>
<updated>2025-08-01T08:48:42+00:00</updated>
<author>
<name>Lin.Cao</name>
<email>lincao12@amd.com</email>
</author>
<published>2025-07-17T08:44:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f5ee8a39f03e51a17c42f93708f9d3251519988c'/>
<id>urn:sha1:f5ee8a39f03e51a17c42f93708f9d3251519988c</id>
<content type='text'>
commit 15f77764e90a713ee3916ca424757688e4f565b9 upstream.

When application A submits jobs and application B submits a job with a
dependency on A's fence, the normal flow wakes up the scheduler after
processing each job. However, the optimization in
drm_sched_entity_add_dependency_cb() uses a callback that only clears
dependencies without waking up the scheduler.

When application A is killed before its jobs can run, the callback gets
triggered but only clears the dependency without waking up the scheduler,
causing the scheduler to enter sleep state and application B to hang.

Remove the optimization by deleting drm_sched_entity_clear_dep() and its
usage, ensuring the scheduler is always woken up when dependencies are
cleared.

Fixes: 777dbd458c89 ("drm/amdgpu: drop a dummy wakeup scheduler")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Lin.Cao &lt;lincao12@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://lore.kernel.org/r/20250717084453.921097-1-lincao12@amd.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/scheduler: signal scheduled fence when kill job</title>
<updated>2025-07-06T09:01:34+00:00</updated>
<author>
<name>Lin.Cao</name>
<email>lincao12@amd.com</email>
</author>
<published>2025-05-15T02:07:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=aefd0a935625165a6ca36d0258d2d053901555df'/>
<id>urn:sha1:aefd0a935625165a6ca36d0258d2d053901555df</id>
<content type='text'>
[ Upstream commit 471db2c2d4f80ee94225a1ef246e4f5011733e50 ]

When an entity from application B is killed, drm_sched_entity_kill()
removes all jobs belonging to that entity through
drm_sched_entity_kill_jobs_work(). If application A's job depends on a
scheduled fence from application B's job, and that fence is not properly
signaled during the killing process, application A's dependency cannot be
cleared.

This leads to application A hanging indefinitely while waiting for a
dependency that will never be resolved. Fix this issue by ensuring that
scheduled fences are properly signaled when an entity is killed, allowing
dependent applications to continue execution.

Signed-off-by: Lin.Cao &lt;lincao12@amd.com&gt;
Reviewed-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://lore.kernel.org/r/20250515020713.1110476-1-lincao12@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/sched: Fix fence reference count leak</title>
<updated>2025-03-28T21:03:31+00:00</updated>
<author>
<name>qianyi liu</name>
<email>liuqianyi125@gmail.com</email>
</author>
<published>2025-03-11T06:02:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1135a9431160575466ea9ac37ebd756ecbe35fff'/>
<id>urn:sha1:1135a9431160575466ea9ac37ebd756ecbe35fff</id>
<content type='text'>
commit a952f1ab696873be124e31ce5ef964d36bce817f upstream.

The last_scheduled fence leaks when an entity is being killed and adding
the cleanup callback fails.

Decrement the reference count of prev when dma_fence_add_callback()
fails, ensuring proper balance.

Cc: stable@vger.kernel.org	# v6.2+
[phasta: add git tag info for stable kernel]
Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
Signed-off-by: qianyi liu &lt;liuqianyi125@gmail.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250311060251.4041101-1-liuqianyi125@gmail.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
