summaryrefslogtreecommitdiff
path: root/scripts/generate_rust_analyzer.py
diff options
context:
space:
mode:
authorZsolt Kajtar <soci@c64.rulez.org>2025-03-09 21:47:15 +0300
committerHelge Deller <deller@gmx.de>2025-03-27 00:39:21 +0300
commiteabb0329308729a9073785705224b4efeed081de (patch)
tree7b875cba2ee45daa57d1e4f6cecf0fbb2e86352f /scripts/generate_rust_analyzer.py
parent1a78d9a34b8d15f55f7639253af40c27888112c3 (diff)
downloadlinux-eabb0329308729a9073785705224b4efeed081de.tar.xz
fbdev: Refactoring the fbcon packed pixel drawing routines
The original version duplicated more or less the same algorithms for both system and i/o memory. In this version the drawing algorithms (copy/fill/blit) are separate from the memory access (system and i/o). The two parts are getting combined in the loadable module sources. This also makes it more robust against wrong memory access type or alignment mistakes as there's no direct pointer access or arithmetic in the algorithm sources anymore. Due to liberal use of inlining the compiled result is a single function in all 6 cases, without unnecessary function calls. Unlike earlier the use of macros could be minimized as apparently both gcc and clang is capable now to do the same with inline functions just as well. What wasn't quite the same in the two variants is the support for pixel order reversing. This version is capable to do that for both system and I/O memory, and not only for the latter. As demand for low bits per pixel modes isn't high there's a configuration option to enable this separately for the CFB and SYS modules. The pixel reversing algorithm is different than earlier and was designed so that it can take advantage of bit order reversing instructions on architectures which have them. And even for higher bits per pixel modes like four bpp. One of the shortcomings of the earlier version was the incomplete support for foreign endian framebuffers. Now all three drawing algorithms produce correct output on both endians with native and foreign framebuffers. This is one of the important differences even if otherwise the algorithms don't look too different than before. All three routines work now with aligned native word accesses. As a consequence blitting isn't limited to 32 bits on 64 bit architectures as it was before. The old routines silently assumed that rows are a multiple of the word size. Due to how the new routines function this isn't a requirement any more and access will be done aligned regardless. However if the framebuffer is configured like that then some of the fast paths won't be available. As this code is supposed to be running on all supported architectures it wasn't optimized for a particular one. That doesn't mean I haven't looked at the disassembly. That's where I noticed that it isn't a good idea to use the fallback bitreversing code for example. The low bits per pixel modes should be faster than before as the new routines can blit 4 pixels at a time. On the higher bits per pixel modes I retained the specialized aligned routines so it should be more or less the same, except on 64 bit architectures. There the blitting word size is double now which means 32 BPP isn't done a single pixel a time now. The code was tested on x86, amd64, mips32 and mips64. The latter two in big endian configuration. Originally thought I can get away with the first two, but with such bit twisting code byte ordering is tricky and not really possible to get right without actually verifying it. While writing such routines isn't rocket science a lot of time was spent on making sure that pixel ordering, foreign byte order, various bits per pixels, cpu endianness and word size will give the expected result in all sorts of combinations without making it overly complicated or full with special cases. Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org> Signed-off-by: Helge Deller <deller@gmx.de>
Diffstat (limited to 'scripts/generate_rust_analyzer.py')
0 files changed, 0 insertions, 0 deletions