<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/host1x/hw, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-11-14T17:27:19+00:00</updated>
<entry>
<title>gpu: host1x: Syncpoint interrupt performance optimization</title>
<updated>2025-11-14T17:27:19+00:00</updated>
<author>
<name>Mikko Perttunen</name>
<email>mperttunen@nvidia.com</email>
</author>
<published>2025-09-17T01:48:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bfe68975768a983d4e59c7fb465301999b863a7e'/>
<id>urn:sha1:bfe68975768a983d4e59c7fb465301999b863a7e</id>
<content type='text'>
Optimize performance of syncpoint interrupt handling by reading
the status register in 64-bit chunks when possible, and skipping
processing when the read value is zero.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://patch.msgid.link/20250917-host1x-syncpt-irq-perf-v2-1-736ef69b1347@nvidia.com
</content>
</entry>
<entry>
<title>gpu: host1x: Wait prefences outside MLOCK</title>
<updated>2025-09-11T16:56:35+00:00</updated>
<author>
<name>Mikko Perttunen</name>
<email>mperttunen@nvidia.com</email>
</author>
<published>2025-07-08T11:25:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=63d47cc6eeb27fa0f5b2d9e2e9b950d728b6ca24'/>
<id>urn:sha1:63d47cc6eeb27fa0f5b2d9e2e9b950d728b6ca24</id>
<content type='text'>
The current submission opcode sequence first takes the engine MLOCK,
and then switches to HOST1X class to wait prefences. This is fine
while we only use a single channel per engine and there is no
virtualization, since jobs are serialized on that one channel anyway.
However, when that assumption doesn't hold, we are keeping the
engine locked while not running anything on it while waiting for
prefences to complete.

To resolve this, execute wait commands in the beginning of the job
outside the engine MLOCK. We still take the HOST1X MLOCK because
recent hardware requires register opcodes to be executed within some
MLOCK, but the hardware also allows unlimited channels to take the
HOST1X MLOCK at the same time.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://lore.kernel.org/r/20250708-host1x-wait-prefences-outside-mlock-v1-1-13e98044e35a@nvidia.com
</content>
</entry>
<entry>
<title>gpu: host1x: Add MLOCK recovery for rest of engines</title>
<updated>2024-08-29T18:14:29+00:00</updated>
<author>
<name>Mikko Perttunen</name>
<email>mperttunen@nvidia.com</email>
</author>
<published>2024-04-25T05:02:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=57d298bdb46bc240498675a7f887fc71089da2c0'/>
<id>urn:sha1:57d298bdb46bc240498675a7f887fc71089da2c0</id>
<content type='text'>
Add class IDs / MLOCKs for MLOCK recovery for rest of engines
present on Tegra234.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20240425050238.2943404-4-cyndis@kapsi.fi
</content>
</entry>
<entry>
<title>gpu: host1x: Handle CDMA wraparound when debug printing</title>
<updated>2024-08-29T18:14:29+00:00</updated>
<author>
<name>Mikko Perttunen</name>
<email>mperttunen@nvidia.com</email>
</author>
<published>2024-04-25T05:02:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e436a40830f0fda1a1e5e51d3102644f3e0645fa'/>
<id>urn:sha1:e436a40830f0fda1a1e5e51d3102644f3e0645fa</id>
<content type='text'>
During channel debug information dump, when printing CDMA
opcodes, the circular nature of the CDMA pushbuffer wasn't being
taken into account, sometimes accessing past the end. Change
the printing to take this into account.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20240425050238.2943404-2-cyndis@kapsi.fi
</content>
</entry>
<entry>
<title>gpu: host1x: Request syncpoint IRQs only during probe</title>
<updated>2024-08-28T15:28:48+00:00</updated>
<author>
<name>Mikko Perttunen</name>
<email>mperttunen@nvidia.com</email>
</author>
<published>2024-05-31T07:07:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4c27ac45e6224ea0ca2d2e5dce64e3df122d27c7'/>
<id>urn:sha1:4c27ac45e6224ea0ca2d2e5dce64e3df122d27c7</id>
<content type='text'>
Syncpoint IRQs are currently requested in a code path that runs
during resume. Due to this, we get multiple overlapping registered
interrupt handlers as host1x is suspended and resumed.

Rearrange interrupt code to only request IRQs during initialization.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20240531070719.2138-1-cyndis@kapsi.fi
</content>
</entry>
<entry>
<title>gpu: host1x: Syncpoint interrupt sharding</title>
<updated>2023-10-11T20:52:44+00:00</updated>
<author>
<name>Mikko Perttunen</name>
<email>mperttunen@nvidia.com</email>
</author>
<published>2023-09-01T11:40:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f017f1e9cb3458a86f586a171e284e2ec46286db'/>
<id>urn:sha1:f017f1e9cb3458a86f586a171e284e2ec46286db</id>
<content type='text'>
Support sharded syncpoint interrupts on Tegra234+. This feature
allows specifying one of eight interrupt lines for each syncpoint
to lower processing latency of syncpoint threshold
interrupts.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20230901114008.672433-1-cyndis@kapsi.fi
</content>
</entry>
<entry>
<title>gpu: host1x: Use tegra_dev_iommu_get_stream_id()</title>
<updated>2023-01-27T16:41:49+00:00</updated>
<author>
<name>Thierry Reding</name>
<email>treding@nvidia.com</email>
</author>
<published>2022-11-17T12:36:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9026ba722360ce5fb72ff7bbc445c49dc9a736ac'/>
<id>urn:sha1:9026ba722360ce5fb72ff7bbc445c49dc9a736ac</id>
<content type='text'>
Use the newly implemented tegra_dev_iommu_get_stream_id() helper to
encapsulate and centralize the IOMMU stream ID access.

Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
</content>
</entry>
<entry>
<title>gpu: host1x: External timeout/cancellation for fences</title>
<updated>2023-01-26T14:55:38+00:00</updated>
<author>
<name>Mikko Perttunen</name>
<email>mperttunen@nvidia.com</email>
</author>
<published>2023-01-19T13:09:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d5179020f5ce44fd449790a9c12ef6c1a90a2ca7'/>
<id>urn:sha1:d5179020f5ce44fd449790a9c12ef6c1a90a2ca7</id>
<content type='text'>
Currently all fences have a 30 second timeout to ensure they are
cleaned up if the fence never completes otherwise. However, this
one size fits all solution doesn't actually fit in every case,
such as syncpoint waiting where we want to be able to have timeouts
longer than 30 seconds. As such, we want to be able to give control
over fence cancellation to the caller (and maybe eventually get rid
of the internal timeout altogether).

Here we add this cancellation mechanism by essentially adding a
function for entering the timeout path by function call, and changing
the syncpoint wait function to use it.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
</content>
</entry>
<entry>
<title>gpu: host1x: Rewrite syncpoint interrupt handling</title>
<updated>2023-01-26T14:55:38+00:00</updated>
<author>
<name>Mikko Perttunen</name>
<email>mperttunen@nvidia.com</email>
</author>
<published>2023-01-19T13:09:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=625d4ffb438cacc9b1ebaa48748cdc7171587cdc'/>
<id>urn:sha1:625d4ffb438cacc9b1ebaa48748cdc7171587cdc</id>
<content type='text'>
Move from the old, complex intr handling code to a new implementation
based on dma_fences. While there is a fair bit of churn to get there,
the new implementation is much simpler and likely faster as well due
to allowing signaling directly from interrupt context.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
</content>
</entry>
<entry>
<title>gpu: host1x: Implement job tracking using DMA fences</title>
<updated>2023-01-26T14:55:38+00:00</updated>
<author>
<name>Mikko Perttunen</name>
<email>mperttunen@nvidia.com</email>
</author>
<published>2023-01-19T13:09:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c24973ed795fec5c12d8a822a0de99a4b7bab394'/>
<id>urn:sha1:c24973ed795fec5c12d8a822a0de99a4b7bab394</id>
<content type='text'>
In anticipation of removal of the intr API, implement job tracking
using DMA fences instead. The main two things about this are
making cdma_update schedule the work since fence completion can
now be called from interrupt context, and some complication in
ensuring the callback is not running when we free the fence.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
</content>
</entry>
</feed>
