summaryrefslogtreecommitdiff
path: root/scripts/gen-crc-consts.py
AgeCommit message (Collapse)AuthorFilesLines
2025-03-10riscv/crc: add "template" for Zbc optimized CRC functionsEric Biggers1-1/+54
Add a "template" crc-clmul-template.h that can generate RISC-V Zbc optimized CRC functions. Each generated CRC function is parameterized by CRC length and bit order, and it accepts a pointer to the constants struct required for the specific CRC polynomial desired. Update gen-crc-consts.py to support generating the needed constants structs. This makes it possible to easily wire up a Zbc optimized implementation of almost any CRC. The design generally follows what I did for x86, but it is simplified by using RISC-V's scalar carryless multiplication Zbc, which has no equivalent on x86. RISC-V's clmulr instruction is also helpful. A potential switch to Zvbc (or support for Zvbc alongside Zbc) is left for future work. For long messages Zvbc should be fastest, but it would need to be shown to be worthwhile over just using Zbc which is significantly more convenient to use, especially in the kernel context. Compared to the existing Zbc-optimized CRC32 code and the earlier proposed Zbc-optimized CRC-T10DIF code (https://lore.kernel.org/r/20250211071101.181652-1-zhihang.shao.iscas@gmail.com), this submission deduplicates the code among CRC variants and is significantly more optimized. It uses "folding" to take better advantage of instruction-level parallelism (to a more limited extent than x86 for now, but it could be extended to more), it reworks the Barrett reduction to eliminate unnecessary instructions, and it documents all the math used and makes all the constants reproducible. Tested-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250216225530.306980-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-10scripts/gen-crc-consts: add gen-crc-consts.pyEric Biggers1-0/+238
Add a Python script that generates constants for computing the given CRC variant(s) using x86's pclmulqdq or vpclmulqdq instructions. This is specifically tuned for x86's crc-pclmul-template.S. However, other architectures with a 64x64 => 128-bit carryless multiplication instruction should be able to use the generated constants too. (Some tweaks may be warranted based on the exact instructions available on each arch, so the script may grow an arch argument in the future.) The script also supports generating the tables needed for table-based CRC computation. Thus, it can also be used to reproduce the tables like t10_dif_crc_table[] and crc16_table[] that are currently hardcoded in the source with no generation script explicitly documented. Python is used rather than C since it enables implementing the CRC math in the simplest way possible, using arbitrary precision integers. The outputs of this script are intended to be checked into the repo, so Python will continue to not be required to build the kernel, and the script has been optimized for simplicity rather than performance. Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250210174540.161705-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>