| author | Tanmay Patil <tanmayp@nvidia.com> | 2026-05-14 13:31:52 +0300 |
|---|---|---|
| committer | Thierry Reding <treding@nvidia.com> | 2026-05-28 18:19:28 +0300 |
| commit | 6fea41ff3642483cd6d968c3a9bcdb47a8c35312 (patch) | |
| tree | 262fee336c074b02055f3dd1ceb2c4250c12226a /include/linux/timerqueue.h | |
| parent | 3cbf5e3c46e66d9b3b6b91099bb720c6cb1be3bc (diff) | |
gpu: host1x: Skip redundant syncpoint loads in host1x_syncpt_wait()

In host1x_syncpt_wait(), the hardware syncpoint value was loaded once
for the initial expiry check and then a second time to populate the
caller's value pointer. Reuse a single load for both purposes.

After dma_fence_wait_timeout(), the previous code reloaded the syncpoint
value for the expiry check, which is only required in the timeout case.
On success (i.e., return value > 0, or return value == 0 with zero
jiffies remaining), the ISR has already cached the value before
signaling the fence. The value pointer can therefore be populated from
the cached value via host1x_syncpt_read_min(), without MMIO access.
Only the timeout path requires a fresh load, so move
host1x_syncpt_load() under that path.

Measured syncpoint wait latency (50000 samples):
  Average latency:   12.2 us  -> 10.6 us
  99.99 pct latency: 62.96 us -> 51.90 us

Signed-off-by: Tanmay Patil <tanmayp@nvidia.com>
Acked-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20260514103153.766343-2-tanmayp@nvidia.com
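
The load-once pattern described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver code: `struct mock_syncpt`, `mock_load()`, `mock_read_min()`, `mock_isr()`, and `mock_wait()` are hypothetical stand-ins for the host1x structures and for host1x_syncpt_load(), host1x_syncpt_read_min(), the interrupt handler, and host1x_syncpt_wait(). The `isr_fires` flag stands in for a successful dma_fence_wait_timeout(), and the `-1` return stands in for whatever error the real function reports on timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a syncpoint; not the real host1x structure. */
struct mock_syncpt {
	uint32_t hw_value;   /* what a register read would return */
	uint32_t min_cached; /* value last cached in memory */
	unsigned mmio_reads; /* number of simulated register reads */
};

/* Stand-in for host1x_syncpt_load(): read hardware, refresh the cache. */
static uint32_t mock_load(struct mock_syncpt *sp)
{
	sp->mmio_reads++;
	sp->min_cached = sp->hw_value;
	return sp->min_cached;
}

/* Stand-in for host1x_syncpt_read_min(): cached value, no MMIO. */
static uint32_t mock_read_min(const struct mock_syncpt *sp)
{
	return sp->min_cached;
}

/* Wraparound-safe "value has reached thresh" check. */
static bool mock_expired(uint32_t value, uint32_t thresh)
{
	return (int32_t)(value - thresh) >= 0;
}

/* Simulated ISR: hardware advanced, and the handler caches the value. */
static void mock_isr(struct mock_syncpt *sp, uint32_t new_value)
{
	sp->hw_value = new_value;
	sp->min_cached = new_value;
}

/*
 * Simplified wait. If isr_fires is true, the ISR runs with isr_value
 * while we "sleep", modelling a signaled fence. Returns 0 on success
 * and -1 on timeout (assumed error convention for this sketch).
 */
static int mock_wait(struct mock_syncpt *sp, uint32_t thresh,
		     bool isr_fires, uint32_t isr_value, uint32_t *value)
{
	/* One load serves both the expiry check and the caller's value. */
	uint32_t v = mock_load(sp);

	if (mock_expired(v, thresh)) {
		if (value)
			*value = v;
		return 0;
	}

	if (isr_fires) {
		mock_isr(sp, isr_value);
		/* Success: the ISR already cached the value, skip MMIO. */
		if (value)
			*value = mock_read_min(sp);
		return 0;
	}

	/* Timeout: only this path pays for a fresh load. */
	v = mock_load(sp);
	if (value)
		*value = v;
	return mock_expired(v, thresh) ? 0 : -1;
}
```

In this model, the already-expired and fence-signaled paths each cost one simulated register read, while only the timeout path costs two.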
