path: root/tools/perf/scripts/python/stackcollapse.py
author: Thomas Zimmermann <tzimmermann@suse.de> 2022-02-23 22:38:01 +0300
committer: Thomas Zimmermann <tzimmermann@suse.de> 2022-03-02 22:20:46 +0300
commit: 6f29e04938bf509fccfad490a74284cf158891ce (patch)
tree: 29ee3fde14a288d51e34cf9bdbd03702bee641fa /tools/perf/scripts/python/stackcollapse.py
parent: 7dbc515f5ca4b7867e34c0c4379591cb8b47d64f (diff)
download: linux-6f29e04938bf509fccfad490a74284cf158891ce.tar.xz

fbdev: Improve performance of sys_imageblit()

Improve the performance of sys_imageblit() by manually unrolling
the inner blitting loop and moving some invariants out. The compiler
failed to do this automatically. The resulting binary code was even
slower than the cfb_imageblit() helper, which uses the same algorithm,
but operates on I/O memory.

A microbenchmark measures the average number of CPU cycles
for sys_imageblit() after a stabilizing period of a few minutes
(i7-4790, FullHD, simpledrm, kernel with debugging). The value
for CFB is given as a reference.

  sys_imageblit(), new: 25934 cycles
  sys_imageblit(), old: 35944 cycles
  cfb_imageblit():      30566 cycles

In the optimized case, sys_imageblit() is now ~30% faster than before
and ~20% faster than cfb_imageblit().

v2:
	* move switch out of inner loop (Gerd)
	* remove test for alignment of dst1 (Sam)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220223193804.18636-3-tzimmermann@suse.de

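
The technique named in the commit message (unrolling the inner loop by
hand and hoisting loop invariants) can be illustrated with a minimal
sketch. This is not the code from the patch: the function names
blit_mono_naive() and blit_mono_unrolled(), the 1-bit-per-pixel source
layout (most significant bit first) and the 32-bit destination pixels
are all assumptions chosen for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical baseline: one pixel per iteration. The byte index and
 * the shift count are recomputed on every pass through the loop.
 */
static void blit_mono_naive(uint32_t *dst, const uint8_t *src, size_t width,
                            uint32_t fg, uint32_t bg)
{
    for (size_t x = 0; x < width; x++) {
        uint8_t bit = (src[x / 8] >> (7 - (x % 8))) & 1;

        dst[x] = bit ? fg : bg;
    }
}

/* Expand one bit to a pixel: all-ones mask selects fg, zero selects bg. */
#define PIXEL(b, shift) (bg ^ (eor & -(uint32_t)(((b) >> (shift)) & 1)))

/*
 * Hypothetical optimized variant: fg ^ bg is computed once outside the
 * loop, and each source byte is expanded into eight pixels with no
 * inner loop and no per-pixel index arithmetic.
 */
static void blit_mono_unrolled(uint32_t *dst, const uint8_t *src, size_t width,
                               uint32_t fg, uint32_t bg)
{
    const uint32_t eor = fg ^ bg;   /* loop invariant, hoisted */
    const size_t full = width / 8;  /* number of complete source bytes */

    for (size_t i = 0; i < full; i++) {
        const uint8_t b = src[i];

        dst[0] = PIXEL(b, 7);
        dst[1] = PIXEL(b, 6);
        dst[2] = PIXEL(b, 5);
        dst[3] = PIXEL(b, 4);
        dst[4] = PIXEL(b, 3);
        dst[5] = PIXEL(b, 2);
        dst[6] = PIXEL(b, 1);
        dst[7] = PIXEL(b, 0);
        dst += 8;
    }

    /* Remaining pixels of a partial trailing byte, if any. */
    for (size_t x = 0; x < width % 8; x++)
        dst[x] = PIXEL(src[full], 7 - x);
}
```

Both functions produce identical output; the second only restructures
the work so that less of it is repeated per pixel. Whether this pays off
in practice depends on the compiler and the target, which is why the
commit reports measured cycle counts.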
Diffstat (limited to 'tools/perf/scripts/python/stackcollapse.py')
0 files changed, 0 insertions, 0 deletions