<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/scheduler, branch v6.1.90</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.90</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.90'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2023-10-19T21:08:57+00:00</updated>
<entry>
<title>dma-buf: add dma_fence_timestamp helper</title>
<updated>2023-10-19T21:08:57+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2023-09-08T08:27:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d67b5a2b97b6e8959117ff71fc2daa30e66c8290'/>
<id>urn:sha1:d67b5a2b97b6e8959117ff71fc2daa30e66c8290</id>
<content type='text'>
commit b83ce9cb4a465b8f9a3fa45561b721a9551f60e3 upstream.

When a fence signals there is a very small race window where the timestamp
isn't updated yet. sync_file solves this by busy waiting for the
timestamp to appear, but on other ocassions didn't handled this
correctly.

Provide a dma_fence_timestamp() helper function for this and use it in
all appropriate cases.

Another alternative would be to grab the spinlock when that happens.

v2 by teddy: add a wait parameter to wait for the timestamp to show up, in case
   the accurate timestamp is needed and/or the timestamp is not based on
   ktime (e.g. hw timestamp)
v3 chk: drop the parameter again for unified handling

Signed-off-by: Yunxiang Li &lt;Yunxiang.Li@amd.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Fixes: 1774baa64f93 ("drm/scheduler: Change scheduled fence track v2")
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
CC: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20230929104725.2358-1-christian.koenig@amd.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/scheduler: fix fence ref counting</title>
<updated>2022-10-25T11:14:36+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2022-09-27T16:43:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b3af84383e7abdc5e63435817bb73a268e7c3637'/>
<id>urn:sha1:b3af84383e7abdc5e63435817bb73a268e7c3637</id>
<content type='text'>
We leaked dependency fences when processes were beeing killed.

Additional to that grab a reference to the last scheduled fence.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20220929180151.139751-1-christian.koenig@amd.com
</content>
</entry>
<entry>
<title>Merge drm/drm-fixes into drm-misc-fixes</title>
<updated>2022-10-20T07:09:00+00:00</updated>
<author>
<name>Thomas Zimmermann</name>
<email>tzimmermann@suse.de</email>
</author>
<published>2022-10-20T07:09:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1aca5ce036e3499336d1a2ace3070f908381c055'/>
<id>urn:sha1:1aca5ce036e3499336d1a2ace3070f908381c055</id>
<content type='text'>
Backmerging to get v6.1-rc1.

Signed-off-by: Thomas Zimmermann &lt;tzimmermann@suse.de&gt;
</content>
</entry>
<entry>
<title>drm/sched: add DRM_SCHED_FENCE_DONT_PIPELINE flag</title>
<updated>2022-10-19T10:42:51+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2022-10-07T07:51:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7b476affcccfc7e644541a0a719f53fc7bd34c53'/>
<id>urn:sha1:7b476affcccfc7e644541a0a719f53fc7bd34c53</id>
<content type='text'>
Setting this flag on a scheduler fence prevents pipelining of jobs
depending on this fence. In other words we always insert a full CPU
round trip before dependent jobs are pushed to the pipeline.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2113#note_1579296
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Acked-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20221014081553.114899-1-christian.koenig@amd.com
</content>
</entry>
<entry>
<title>Revert "drm/sched: Use parent fence instead of finished"</title>
<updated>2022-10-07T02:58:39+00:00</updated>
<author>
<name>Dave Airlie</name>
<email>airlied@redhat.com</email>
</author>
<published>2022-10-07T02:40:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bafaf67c42f4b547bf4fb329ac6dcb28b05de15e'/>
<id>urn:sha1:bafaf67c42f4b547bf4fb329ac6dcb28b05de15e</id>
<content type='text'>
This reverts commit e4dc45b1848bc6bcac31eb1b4ccdd7f6718b3c86.

This is causing instability on Linus' desktop, and I'm seeing
oops with VK CTS runs.

netconsole got me the following oops:
[ 1234.778760] BUG: kernel NULL pointer dereference, address: 0000000000000088
[ 1234.778782] #PF: supervisor read access in kernel mode
[ 1234.778787] #PF: error_code(0x0000) - not-present page
[ 1234.778791] PGD 0 P4D 0
[ 1234.778798] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1234.778803] CPU: 7 PID: 805 Comm: systemd-journal Not tainted 6.0.0+ #2
[ 1234.778809] Hardware name: System manufacturer System Product
Name/PRIME X370-PRO, BIOS 5603 07/28/2020
[ 1234.778813] RIP: 0010:drm_sched_job_done.isra.0+0xc/0x140 [gpu_sched]
[ 1234.778828] Code: aa 0f 1d ce e9 57 ff ff ff 48 89 d7 e8 9d 8f 3f
ce e9 4a ff ff ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 55 53
48 89 fb &lt;48&gt; 8b af 88 00 00 00 f0 ff 8d f0 00 00 00 48 8b 85 80 01 00
00 f0
[ 1234.778834] RSP: 0000:ffffabe680380de0 EFLAGS: 00010087
[ 1234.778839] RAX: ffffffffc04e9230 RBX: 0000000000000000 RCX: 0000000000000018
[ 1234.778897] RDX: 00000ba278e8977a RSI: ffff953fb288b460 RDI: 0000000000000000
[ 1234.778901] RBP: ffff953fb288b598 R08: 00000000000000e0 R09: ffff953fbd98b808
[ 1234.778905] R10: 0000000000000000 R11: ffffabe680380ff8 R12: ffffabe680380e00
[ 1234.778908] R13: 0000000000000001 R14: 00000000ffffffff R15: ffff953fbd9ec458
[ 1234.778912] FS:  00007f35e7008580(0000) GS:ffff95428ebc0000(0000)
knlGS:0000000000000000
[ 1234.778916] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1234.778919] CR2: 0000000000000088 CR3: 000000010147c000 CR4: 00000000003506e0
[ 1234.778924] Call Trace:
[ 1234.778981]  &lt;IRQ&gt;
[ 1234.778989]  dma_fence_signal_timestamp_locked+0x6a/0xe0
[ 1234.778999]  dma_fence_signal+0x2c/0x50
[ 1234.779005]  amdgpu_fence_process+0xc8/0x140 [amdgpu]
[ 1234.779234]  sdma_v3_0_process_trap_irq+0x70/0x80 [amdgpu]
[ 1234.779395]  amdgpu_irq_dispatch+0xa9/0x1d0 [amdgpu]
[ 1234.779609]  amdgpu_ih_process+0x80/0x100 [amdgpu]
[ 1234.779783]  amdgpu_irq_handler+0x1f/0x60 [amdgpu]
[ 1234.779940]  __handle_irq_event_percpu+0x46/0x190
[ 1234.779946]  handle_irq_event+0x34/0x70
[ 1234.779949]  handle_edge_irq+0x9f/0x240
[ 1234.779954]  __common_interrupt+0x66/0x100
[ 1234.779960]  common_interrupt+0xa0/0xc0
[ 1234.779965]  &lt;/IRQ&gt;
[ 1234.779968]  &lt;TASK&gt;
[ 1234.779971]  asm_common_interrupt+0x22/0x40
[ 1234.779976] RIP: 0010:finish_mkwrite_fault+0x22/0x110
[ 1234.779981] Code: 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 41 55 41
54 55 48 89 fd 53 48 8b 07 f6 40 50 08 0f 84 eb 00 00 00 48 8b 45 30
48 8b 18 &lt;48&gt; 89 df e8 66 bd ff ff 48 85 c0 74 0d 48 89 c2 83 e2 01 48
83 ea
[ 1234.779985] RSP: 0000:ffffabe680bcfd78 EFLAGS: 00000202

Revert it for now and figure it out later.

Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
</content>
</entry>
<entry>
<title>drm/sched: Use parent fence instead of finished</title>
<updated>2022-09-16T13:53:25+00:00</updated>
<author>
<name>Arvind Yadav</name>
<email>Arvind.Yadav@amd.com</email>
</author>
<published>2022-09-14T16:43:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e4dc45b1848bc6bcac31eb1b4ccdd7f6718b3c86'/>
<id>urn:sha1:e4dc45b1848bc6bcac31eb1b4ccdd7f6718b3c86</id>
<content type='text'>
Using the parent fence instead of the finished fence
to get the job status. This change is to avoid GPU
scheduler timeout error which can cause GPU reset.

Signed-off-by: Arvind Yadav &lt;Arvind.Yadav@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20220914164321.2156-6-Arvind.Yadav@amd.com
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/scheduler: quieten kernel-doc warnings</title>
<updated>2022-09-06T20:14:28+00:00</updated>
<author>
<name>Randy Dunlap</name>
<email>rdunlap@infradead.org</email>
</author>
<published>2022-04-04T21:30:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f8ad757e40c9c776a13eaa56d73e8e62381517b6'/>
<id>urn:sha1:f8ad757e40c9c776a13eaa56d73e8e62381517b6</id>
<content type='text'>
Fix kernel-doc warnings in gpu_scheduler.h and sched_main.c.

Quashes these warnings:

include/drm/gpu_scheduler.h:332: warning: missing initial short description on line:
 * struct drm_sched_backend_ops
include/drm/gpu_scheduler.h:412: warning: missing initial short description on line:
 * struct drm_gpu_scheduler
include/drm/gpu_scheduler.h:461: warning: Function parameter or member 'dev' not described in 'drm_gpu_scheduler'

drivers/gpu/drm/scheduler/sched_main.c:201: warning: missing initial short description on line:
 * drm_sched_dependency_optimized
drivers/gpu/drm/scheduler/sched_main.c:995: warning: Function parameter or member 'dev' not described in 'drm_sched_init'

Fixes: 2d33948e4e00 ("drm/scheduler: add documentation")
Fixes: 8ab62eda177b ("drm/sched: Add device pointer to drm_gpu_scheduler")
Fixes: 542cff7893a3 ("drm/sched: Avoid lockdep spalt on killing a processes")
Signed-off-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: David Airlie &lt;airlied@linux.ie&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Cc: Nayan Deshmukh &lt;nayan26deshmukh@gmail.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Jiawei Gu &lt;Jiawei.Gu@amd.com&gt;
Cc: dri-devel@lists.freedesktop.org
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20220404213040.12912-1-rdunlap@infradead.org
</content>
</entry>
<entry>
<title>drm/sched: move calling drm_sched_entity_select_rq</title>
<updated>2022-07-19T15:22:25+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2022-07-13T16:14:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6d602e031103fb78dbe50dbf57a5f29737494c6f'/>
<id>urn:sha1:6d602e031103fb78dbe50dbf57a5f29737494c6f</id>
<content type='text'>
We already discussed that the call to drm_sched_entity_select_rq() needs
to move to drm_sched_job_arm() to be able to set a new scheduler list
between _init() and _arm(). This was just not applied for some reason.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20220714103902.7084-2-christian.koenig@amd.com
</content>
</entry>
<entry>
<title>Merge tag 'amd-drm-next-5.20-2022-07-05' of https://gitlab.freedesktop.org/agd5f/linux into drm-next</title>
<updated>2022-07-12T01:07:32+00:00</updated>
<author>
<name>Dave Airlie</name>
<email>airlied@redhat.com</email>
</author>
<published>2022-07-12T01:07:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=344feb7ccf764756937cfd74fa4ac5caba069c99'/>
<id>urn:sha1:344feb7ccf764756937cfd74fa4ac5caba069c99</id>
<content type='text'>
amd-drm-next-5.20-2022-07-05:

amdgpu:
- Various spelling and grammer fixes
- Various eDP fixes
- Various DMCUB fixes
- VCN fixes
- GMC 11 fixes
- RAS fixes
- TMZ support for GC 10.3.7
- GPUVM TLB flush fixes
- SMU 13.0.x updates
- DCN 3.2 Support
- DCN 3.2.1 Support
- MES updates
- GFX11 modifiers support
- USB-C fixes
- MMHUB 3.0.1 support
- SDMA 6.0 doorbell fixes
- Initial devcoredump support
- Enable high priority gfx queue on asics which support it
- Enable GPU reset for SMU 13.0.4
- OLED display fixes
- MPO fixes
- DC frame size fixes
- ASPM support for PCIE 7.4/7.6
- GPU reset support for SMU 13.0.0
- GFX11 updates
- VCN JPEG fix
- BACO support for SMU 13.0.7
- VCN instance handling fix
- GFX8 GPUVM TLB flush fix
- GPU reset rework
- VCN 4.0.2 support
- GTT size fixes
- DP link training fixes
- LSDMA 6.0.1 support
- Various backlight fixes
- Color encoding fixes
- Backlight config cleanup
- VCN 4.x unified queue cleanup

amdkfd:
- MMU notifier fixes
- Updates for GC 10.3.6 and 10.3.7
- P2P DMA support using dma-buf
- Add available memory IOCTL
- SDMA 6.0.1 fix
- MES fixes
- HMM profiler support

radeon:
- License fix
- Backlight config cleanup

UAPI:
- Add available memory IOCTL to amdkfd
  Proposed userspace: https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg75743.html
- HMM profiler support for amdkfd
  Proposed userspace: https://lists.freedesktop.org/archives/amd-gfx/2022-June/080805.html

Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
From: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20220705212633.6037-1-alexander.deucher@amd.com
</content>
</entry>
<entry>
<title>drm/sched: Partial revert of 'drm/sched: Keep s_fence-&gt;parent pointer'</title>
<updated>2022-06-28T15:24:31+00:00</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2022-06-20T20:39:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=45ecaea738830b9d521c93520c8f201359dcbd95'/>
<id>urn:sha1:45ecaea738830b9d521c93520c8f201359dcbd95</id>
<content type='text'>
Problem:
This patch caused negative refcount as described in [1] because
for that case parent fence did not signal by the time of drm_sched_stop and hence
kept in pending list the assumption was they will not signal and
so fence was put to account for the s_fence-&gt;parent refcount but for
amdgpu which has embedded HW fence (always same parent fence)
drm_sched_fence_release_scheduled was always called and would
still drop the count for parent fence once more. For jobs that
never signaled this imbalance was masked by refcount bug in
amdgpu_fence_driver_clear_job_fences that would not drop
refcount on the fences that were removed from fence drive
fences array (against prevois insertion into the array in
get in amdgpu_fence_emit).

Fix:
Revert this patch and by setting s_job-&gt;s_fence-&gt;parent to NULL
as before prevent the extra refcount drop in amdgpu when
drm_sched_fence_release_scheduled is called on job release.

Also - align behaviour in drm_sched_resubmit_jobs_ext with that of
drm_sched_main when submitting jobs - take a refcount for the
new parent fence pointer and drop refcount for original kref_init
for new HW fence creation (or fake new HW fence in amdgpu - see next patch).

[1] - https://lore.kernel.org/all/731b7ff1-3cc9-e314-df2a-7c51b76d4db0@amd.com/t/#r00c728fcc069b1276642c325bfa9d82bf8fa21a3

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Tested-by: Yiqing Yao &lt;yiqing.yao@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
