<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/scheduler, branch v6.18.21</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-12T11:09:56+00:00</updated>
<entry>
<title>drm/sched: Fix kernel-doc warning for drm_sched_job_done()</title>
<updated>2026-03-12T11:09:56+00:00</updated>
<author>
<name>Yujie Liu</name>
<email>yujie.liu@intel.com</email>
</author>
<published>2026-02-27T08:24:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=da09dfc90cb7ed1ab40d675234382f151eeb0563'/>
<id>urn:sha1:da09dfc90cb7ed1ab40d675234382f151eeb0563</id>
<content type='text'>
[ Upstream commit 61ded1083b264ff67ca8c2de822c66b6febaf9a8 ]

There is a kernel-doc warning for the scheduler:

Warning: drivers/gpu/drm/scheduler/sched_main.c:367 function parameter 'result' not described in 'drm_sched_job_done'

Fix the warning by describing the undocumented error code.

Fixes: 539f9ee4b52a ("drm/scheduler: properly forward fence errors")
Signed-off-by: Yujie Liu &lt;yujie.liu@intel.com&gt;
[phasta: Flesh out commit message]
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patch.msgid.link/20260227082452.1802922-1-yujie.liu@intel.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb</title>
<updated>2025-11-05T11:29:52+00:00</updated>
<author>
<name>Pierre-Eric Pelloux-Prayer</name>
<email>pierre-eric.pelloux-prayer@amd.com</email>
</author>
<published>2025-11-04T09:53:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=487df8b698345dd5a91346335f05170ed5f29d4e'/>
<id>urn:sha1:487df8b698345dd5a91346335f05170ed5f29d4e</id>
<content type='text'>
The Mesa issue referenced below pointed out a possible deadlock:

[ 1231.611031]  Possible interrupt unsafe locking scenario:

[ 1231.611033]        CPU0                    CPU1
[ 1231.611034]        ----                    ----
[ 1231.611035]   lock(&amp;xa-&gt;xa_lock#17);
[ 1231.611038]                                local_irq_disable();
[ 1231.611039]                                lock(&amp;fence-&gt;lock);
[ 1231.611041]                                lock(&amp;xa-&gt;xa_lock#17);
[ 1231.611044]   &lt;Interrupt&gt;
[ 1231.611045]     lock(&amp;fence-&gt;lock);
[ 1231.611047]
                *** DEADLOCK ***

In this example, CPU0 would be any function accessing job-&gt;dependencies
through the xa_* functions that don't disable interrupts (e.g.
drm_sched_job_add_dependency(), drm_sched_entity_kill_jobs_cb()).

CPU1 is executing drm_sched_entity_kill_jobs_cb() as a fence signalling
callback so in an interrupt context. It will deadlock when trying to
grab the xa_lock which is already held by CPU0.

Replacing all xa_* usage by their xa_*_irq counterparts would fix
this issue, but Christian pointed out another issue: dma_fence_signal
takes fence.lock and so does dma_fence_add_callback.

  dma_fence_signal() // locks f1.lock
  -&gt; drm_sched_entity_kill_jobs_cb()
  -&gt; foreach dependencies
     -&gt; dma_fence_add_callback() // locks f2.lock

This will deadlock if f1 and f2 share the same spinlock.

To fix both issues, the code iterating on dependencies and re-arming them
is moved out to drm_sched_entity_kill_jobs_work().
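
The second issue can be sketched in plain userspace C. This is a toy
model, not the scheduler code: every toy_* name is hypothetical, and the
lock is a flag rather than a spinlock so that a self-deadlock becomes a
detectable failure. It only illustrates why re-arming from a callback
fails when two fences share a lock, and why deferring to a work item
avoids it.

<![CDATA[
```c
#include <stdbool.h>

/* Toy lock: a flag instead of a spinlock, so a self-deadlock is detectable. */
struct toy_lock {
	bool held;
};

/* Returns false where a real non-recursive spinlock would deadlock. */
static bool toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

struct toy_fence {
	struct toy_lock *lock;           /* may be shared between fences */
	void (*cb)(struct toy_fence *f); /* single callback, for brevity */
	void *cb_data;
};

struct toy_job {
	struct toy_fence *next_dep; /* dependency still to be waited on */
	bool work_queued;
	bool deadlocked;
};

/* Models adding a callback: needs the fence lock. */
static bool toy_add_callback(struct toy_fence *f,
			     void (*cb)(struct toy_fence *), void *data)
{
	if (!toy_lock_acquire(f->lock))
		return false;
	f->cb = cb;
	f->cb_data = data;
	toy_lock_release(f->lock);
	return true;
}

/* Models signalling: runs the callback with the fence lock held. */
static void toy_signal(struct toy_fence *f)
{
	if (!toy_lock_acquire(f->lock))
		return;
	if (f->cb)
		f->cb(f);
	toy_lock_release(f->lock);
}

/* Problematic shape: re-arm on the next dependency inside the callback. */
static void toy_cb_rearm_inline(struct toy_fence *f)
{
	struct toy_job *job = f->cb_data;

	if (!toy_add_callback(job->next_dep, toy_cb_rearm_inline, job))
		job->deadlocked = true;
}

/* Deferred shape: the callback only queues work. */
static void toy_cb_defer(struct toy_fence *f)
{
	struct toy_job *job = f->cb_data;

	job->work_queued = true;
}

/* Runs later, outside any fence lock, so taking the lock is safe. */
static bool toy_run_work(struct toy_job *job)
{
	if (!job->work_queued)
		return false;
	job->work_queued = false;
	return toy_add_callback(job->next_dep, toy_cb_defer, job);
}

/* Returns true if the chosen callback shape hit the self-deadlock. */
static bool toy_demo(bool defer)
{
	struct toy_lock shared = { .held = false };
	struct toy_fence f1 = { .lock = &shared };
	struct toy_fence f2 = { .lock = &shared };
	struct toy_job job = { .next_dep = &f2 };

	toy_add_callback(&f1, defer ? toy_cb_defer : toy_cb_rearm_inline, &job);
	toy_signal(&f1);
	if (defer)
		toy_run_work(&job);
	return job.deadlocked;
}
```
]]>

In this model toy_demo(false) reports the deadlock because f1 and f2
share one lock, while toy_demo(true) does not, since the re-arm happens
after the signalling path has released it.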

Cc: stable@vger.kernel.org # v6.2+
Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13908
Reported-by: Mikhail Gavrilov &lt;mikhail.v.gavrilov@gmail.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Pierre-Eric Pelloux-Prayer &lt;pierre-eric.pelloux-prayer@amd.com&gt;
[phasta: commit message nits]
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patch.msgid.link/20251104095358.15092-1-pierre-eric.pelloux-prayer@amd.com
</content>
</entry>
<entry>
<title>drm/sched: avoid killing parent entity on child SIGKILL</title>
<updated>2025-10-28T13:11:42+00:00</updated>
<author>
<name>David Rosca</name>
<email>david.rosca@amd.com</email>
</author>
<published>2025-10-15T14:01:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9e8b3201c7302d5b522ba3535630bed21cc03c27'/>
<id>urn:sha1:9e8b3201c7302d5b522ba3535630bed21cc03c27</id>
<content type='text'>
The DRM scheduler tracks who last used an entity and, when that process
is killed, blocks all further submissions to that entity.

The problem is that we didn't track who initially created an entity, so
when a process accidentally leaked its file descriptor to a child and
that child got killed, we killed the parent's entities.

Avoid that by initializing the entity's last user on entity creation.
This also allows dropping the extra NULL check.
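
The change in behaviour can be sketched with a toy userspace model. All
toy_* names are hypothetical, pids stand in for process pointers, and
the matching rule is an assumption simplified from the description
above rather than the scheduler's actual check.

<![CDATA[
```c
#include <stdbool.h>

#define TOY_NO_USER 0 /* stands in for a NULL last-user pointer */

struct toy_entity {
	int last_user; /* pid of the last submitter */
	bool stopped;
};

/* Old shape: nobody is recorded until the first submission. */
static void toy_entity_init_old(struct toy_entity *e)
{
	e->last_user = TOY_NO_USER;
	e->stopped = false;
}

/* New shape: the creator is recorded as the initial last user. */
static void toy_entity_init_new(struct toy_entity *e, int creator_pid)
{
	e->last_user = creator_pid;
	e->stopped = false;
}

/* Old flush for a killed process: an unset last user also matched. */
static void toy_flush_killed_old(struct toy_entity *e, int dying_pid)
{
	if (e->last_user == TOY_NO_USER || e->last_user == dying_pid)
		e->stopped = true;
}

/* New flush for a killed process: only the recorded last user matches. */
static void toy_flush_killed_new(struct toy_entity *e, int dying_pid)
{
	if (e->last_user == dying_pid)
		e->stopped = true;
}

/*
 * A parent creates an entity and has not submitted yet; a child that
 * inherited the file descriptor is killed. Returns whether the parent's
 * entity ends up stopped.
 */
static bool toy_child_kill_stops_parent_entity(bool fixed)
{
	const int parent = 100, child = 101; /* arbitrary pids */
	struct toy_entity e;

	if (fixed) {
		toy_entity_init_new(&e, parent);
		toy_flush_killed_new(&e, child);
	} else {
		toy_entity_init_old(&e);
		toy_flush_killed_old(&e, child);
	}
	return e.stopped;
}
```
]]>

In this model the old shape stops the parent's entity when the child
dies, and the new shape leaves it alone.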

Signed-off-by: David Rosca &lt;david.rosca@amd.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4568
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
CC: stable@vger.kernel.org
Acked-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://lore.kernel.org/r/20251015140128.1470-1-christian.koenig@amd.com
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patch.msgid.link/20251015140128.1470-1-christian.koenig@amd.com
</content>
</entry>
<entry>
<title>drm/sched: Fix race in drm_sched_entity_select_rq()</title>
<updated>2025-10-27T12:48:46+00:00</updated>
<author>
<name>Philipp Stanner</name>
<email>phasta@kernel.org</email>
</author>
<published>2025-10-22T06:34:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d25e3a610bae03bffc5c14b5d944a5d0cd844678'/>
<id>urn:sha1:d25e3a610bae03bffc5c14b5d944a5d0cd844678</id>
<content type='text'>
In a past bug fix it was forgotten that entity access must be protected
by the entity lock. That's a data race and potentially UB.

Move the spin_unlock() to the appropriate position.

Cc: stable@vger.kernel.org # v5.13+
Fixes: ac4eb83ab255 ("drm/sched: select new rq even if there is only one v3")
Reviewed-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patch.msgid.link/20251022063402.87318-2-phasta@kernel.org
</content>
</entry>
<entry>
<title>drm/sched: Fix potential double free in drm_sched_job_add_resv_dependencies</title>
<updated>2025-10-16T12:26:05+00:00</updated>
<author>
<name>Tvrtko Ursulin</name>
<email>tvrtko.ursulin@igalia.com</email>
</author>
<published>2025-10-15T08:40:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5801e65206b065b0b2af032f7f1eef222aa2fd83'/>
<id>urn:sha1:5801e65206b065b0b2af032f7f1eef222aa2fd83</id>
<content type='text'>
When adding dependencies with drm_sched_job_add_dependency(), that
function consumes the fence reference both on success and failure, so in
the latter case the dma_fence_put() on the error path (xarray failed to
expand) is a double free.

Interestingly this bug appears to have been present ever since
commit ebd5f74255b9 ("drm/sched: Add dependency tracking"), since the code
back then looked like this:

drm_sched_job_add_implicit_dependencies():
...
       for (i = 0; i &lt; fence_count; i++) {
               ret = drm_sched_job_add_dependency(job, fences[i]);
               if (ret)
                       break;
       }

       for (; i &lt; fence_count; i++)
               dma_fence_put(fences[i]);

Which means for the failing 'i' the dma_fence_put was already a double
free. Possibly there were no users at that time, or the test cases were
insufficient to hit it.

The bug was then only noticed and fixed after
commit 9c2ba265352a ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2")
landed, with its fixup of
commit 4eaf02d6076c ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies").

At that point it was a slightly different flavour of a double free, which
commit 963d0b356935 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder")
noticed and attempted to fix.

But it only moved the double free from happening inside the
drm_sched_job_add_dependency(), when releasing the reference not yet
obtained, to the caller, when releasing the reference already released by
the former in the failure case.

As such it is not easy to identify the right target for the fixes tag so
let's keep it simple and just continue the chain.

While fixing we also improve the comment and explain the reason for taking
the reference and not dropping it.
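
The ownership rule can be sketched with a toy reference count in
userspace C. All toy_* names are hypothetical stand-ins, and the slot
replaces the real xarray. The sketch assumes only what is stated above:
the add call consumes the passed reference on both success and failure.

<![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a fence: only a reference count. */
struct toy_fence {
	int refcount;
};

static struct toy_fence *toy_fence_get(struct toy_fence *f)
{
	f->refcount++;
	return f;
}

static void toy_fence_put(struct toy_fence *f)
{
	assert(f->refcount > 0); /* going below zero would be a double free */
	f->refcount--;
}

/*
 * Consuming add: after the call the passed reference belongs to the
 * callee, whether it stored the fence or failed.
 */
static int toy_add_dependency(struct toy_fence **slot, struct toy_fence *f,
			      bool fail)
{
	if (fail) {
		toy_fence_put(f); /* consumed on failure */
		return -1;
	}
	*slot = f; /* consumed on success: the slot now holds it */
	return 0;
}

/* Buggy caller: puts the reference again after a failed add. */
static int toy_add_resv_dependency_buggy(struct toy_fence **slot,
					 struct toy_fence *f, bool fail)
{
	int ret = toy_add_dependency(slot, toy_fence_get(f), fail);

	if (ret)
		toy_fence_put(f); /* second put of the same reference */
	return ret;
}

/* Fixed caller: takes one reference for the callee and never drops it. */
static int toy_add_resv_dependency(struct toy_fence **slot,
				   struct toy_fence *f, bool fail)
{
	return toy_add_dependency(slot, toy_fence_get(f), fail);
}

/* Returns the refcount after one add; the owner starts with 1. */
static int toy_refcount_after_add(bool buggy, bool fail)
{
	struct toy_fence fence = { .refcount = 1 };
	struct toy_fence *slot = NULL;

	if (buggy)
		toy_add_resv_dependency_buggy(&slot, &fence, fail);
	else
		toy_add_resv_dependency(&slot, &fence, fail);
	return fence.refcount;
}
```
]]>

With the buggy caller a failed add leaves the count at 0 although the
owner still believes it holds a reference. With the fixed caller the
count is 1 after a failure and 2 after a success.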

Signed-off-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Fixes: 963d0b356935 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder")
Reported-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Closes: https://lore.kernel.org/dri-devel/aNFbXq8OeYl3QSdm@stanley.mountain/
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Rob Clark &lt;robdclark@chromium.org&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Matthew Brost &lt;matthew.brost@intel.com&gt;
Cc: Danilo Krummrich &lt;dakr@kernel.org&gt;
Cc: Philipp Stanner &lt;phasta@kernel.org&gt;
Cc: Christian König &lt;ckoenig.leichtzumerken@gmail.com&gt;
Cc: dri-devel@lists.freedesktop.org
Cc: stable@vger.kernel.org # v5.16+
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://lore.kernel.org/r/20251015084015.6273-1-tvrtko.ursulin@igalia.com
</content>
</entry>
<entry>
<title>drm/sched: Fix racy access to drm_sched_entity.dependency</title>
<updated>2025-09-02T07:25:51+00:00</updated>
<author>
<name>Pierre-Eric Pelloux-Prayer</name>
<email>pierre-eric.pelloux-prayer@amd.com</email>
</author>
<published>2025-09-01T12:40:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b2b8af21fec35be417a3199b5a6c354605dd222a'/>
<id>urn:sha1:b2b8af21fec35be417a3199b5a6c354605dd222a</id>
<content type='text'>
The drm_sched_job_unschedulable trace point can access
entity-&gt;dependency after it was cleared by the callback
installed in drm_sched_entity_add_dependency_cb, causing:

BUG: kernel NULL pointer dereference, address: 0000000000000020
[...]
Workqueue: comp_1.1.0 drm_sched_run_job_work [gpu_sched]
RIP: 0010:trace_event_raw_event_drm_sched_job_unschedulable+0x70/0xd0 [gpu_sched]

To fix this we either need to keep a reference to the fence before
setting up the callbacks, or move the trace_drm_sched_job_unschedulable
calls into drm_sched_entity_add_dependency_cb where they can be
done earlier.

Fixes: 76d97c870f29 ("drm/sched: Trace dependencies for GPU jobs")

Signed-off-by: Pierre-Eric Pelloux-Prayer &lt;pierre-eric.pelloux-prayer@amd.com&gt;
Reviewed-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://lore.kernel.org/r/20250901124032.1955-1-pierre-eric.pelloux-prayer@amd.com
</content>
</entry>
<entry>
<title>drm/sched: Document race condition in drm_sched_fini()</title>
<updated>2025-08-28T08:27:18+00:00</updated>
<author>
<name>Philipp Stanner</name>
<email>phasta@kernel.org</email>
</author>
<published>2025-08-13T08:56:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f4c75f975cf50fa2e1fd96c5aafe5aa62e55fbe4'/>
<id>urn:sha1:f4c75f975cf50fa2e1fd96c5aafe5aa62e55fbe4</id>
<content type='text'>
In drm_sched_fini() all entities are marked as stopped - without taking
the appropriate lock, because that would deadlock. That means that
drm_sched_fini() and drm_sched_entity_push_job() can race against each
other.

This should most likely be fixed by establishing the rule that all
entities associated with a scheduler must be torn down first. Then,
however, the locking should be removed from drm_sched_fini() altogether
with an appropriate comment.

Reported-by: James Flowers &lt;bold.zone2373@fastmail.com&gt;
Link: https://lore.kernel.org/dri-devel/20250720235748.2798-1-bold.zone2373@fastmail.com/
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://lore.kernel.org/r/20250813085654.102504-2-phasta@kernel.org
</content>
</entry>
<entry>
<title>drm/sched/tests: Remove redundant header files</title>
<updated>2025-08-28T08:13:56+00:00</updated>
<author>
<name>Liao Yuanhong</name>
<email>liaoyuanhong@vivo.com</email>
</author>
<published>2025-08-19T14:26:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=77a62e557f54ecf6305e7ed6fb05d02e8748bffb'/>
<id>urn:sha1:77a62e557f54ecf6305e7ed6fb05d02e8748bffb</id>
<content type='text'>
The header file &lt;linux/atomic.h&gt; is already included on line 8. Remove the
redundant include.

Fixes: 5a99350794fec ("drm/sched: Add scheduler unit testing infrastructure and some basic tests")
Signed-off-by: Liao Yuanhong &lt;liaoyuanhong@vivo.com&gt;
Reviewed-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://lore.kernel.org/r/20250819142630.368796-1-liaoyuanhong@vivo.com
</content>
</entry>
<entry>
<title>drm/sched: Remove mention of indirect buffers</title>
<updated>2025-08-28T08:09:12+00:00</updated>
<author>
<name>Tvrtko Ursulin</name>
<email>tvrtko.ursulin@igalia.com</email>
</author>
<published>2025-08-14T13:36:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ec035aba7d4a9439d7fa6d54b9f9de6ce3fe1716'/>
<id>urn:sha1:ec035aba7d4a9439d7fa6d54b9f9de6ce3fe1716</id>
<content type='text'>
Indirect buffers are an AMD term describing essentially a job submitted to
the scheduler, just a lower level one. Since the scheduler was promoted to
be generic long ago, let's replace those references with jobs.

Signed-off-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Danilo Krummrich &lt;dakr@kernel.org&gt;
Cc: Matthew Brost &lt;matthew.brost@intel.com&gt;
Cc: Philipp Stanner &lt;phasta@kernel.org&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://lore.kernel.org/r/20250814133627.2550-1-tvrtko.ursulin@igalia.com
</content>
</entry>
<entry>
<title>Merge drm/drm-next into drm-misc-n</title>
<updated>2025-08-11T12:37:45+00:00</updated>
<author>
<name>Thomas Zimmermann</name>
<email>tzimmermann@suse.de</email>
</author>
<published>2025-08-11T12:37:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=08c51f5bddc81c8c97c1eb11861b0dc009e5ccd8'/>
<id>urn:sha1:08c51f5bddc81c8c97c1eb11861b0dc009e5ccd8</id>
<content type='text'>
Updating drm-misc-next to the state of v6.17-rc1. Begins a new release
cycle.

Signed-off-by: Thomas Zimmermann &lt;tzimmermann@suse.de&gt;
</content>
</entry>
</feed>
