<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/lib/raid6/algos.c, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-07-10T05:57:54+00:00</updated>
<entry>
<title>lib/raid6: replace custom zero page with ZERO_PAGE</title>
<updated>2025-07-10T05:57:54+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2025-03-17T09:02:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1857fcc847443b0238cb64584b43d8c3a9049a0a'/>
<id>urn:sha1:1857fcc847443b0238cb64584b43d8c3a9049a0a</id>
<content type='text'>
Use the system-wide zero page instead of a custom zero page.

[herbert@gondor.apana.org.au: update lib/raid6/recov_rvv.c, per Klara]
  Link: https://lkml.kernel.org/r/aFkUnXWtxcgOTVkw@gondor.apana.org.au
Link: https://lkml.kernel.org/r/Z9flJNkWQICx0PXk@gondor.apana.org.au
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Yu Kuai &lt;yukuai3@huawei.com&gt;
Cc: Klara Modin &lt;klarasmodin@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux</title>
<updated>2025-06-07T01:05:18+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-06-07T01:05:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=119b1e61a769aa98e68599f44721661a4d8c55f3'/>
<id>urn:sha1:119b1e61a769aa98e68599f44721661a4d8c55f3</id>
<content type='text'>
Pull RISC-V updates from Palmer Dabbelt:

 - Support for the FWFT SBI extension, which is part of SBI 3.0 and a
   dependency for many new SBI and ISA extensions

 - Support for getrandom() in the VDSO

 - Support for mseal

 - Optimized routines for raid6 syndrome and recovery calculations

 - kexec_file() supports loading Image-formatted kernel binaries

 - Improvements to the instruction patching framework to allow for
   atomic instruction patching, along with rules as to how systems need
   to behave in order to function correctly

 - Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha,
   some SiFive vendor extensions

 - Various fixes and cleanups, including: misaligned access handling,
   perf symbol mangling, module loading, PUD THPs, and improved uaccess
   routines

* tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits)
  riscv: uaccess: Only restore the CSR_STATUS SUM bit
  RISC-V: vDSO: Wire up getrandom() vDSO implementation
  riscv: enable mseal sysmap for RV64
  raid6: Add RISC-V SIMD syndrome and recovery calculations
  riscv: mm: Add support for Svinval extension
  RISC-V: Documentation: Add enough title underlines to CMODX
  riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE
  MAINTAINERS: Update Atish's email address
  riscv: uaccess: do not do misaligned accesses in get/put_user()
  riscv: process: use unsigned int instead of unsigned long for put_user()
  riscv: make unsafe user copy routines use existing assembly routines
  riscv: hwprobe: export Zabha extension
  riscv: Make regs_irqs_disabled() more clear
  perf symbols: Ignore mapping symbols on riscv
  RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND
  riscv: module: Optimize PLT/GOT entry counting
  riscv: Add support for PUD THP
  riscv: xchg: Prefetch the destination word for sc.w
  riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop
  riscv: Add support for Zicbop
  ...
</content>
</entry>
<entry>
<title>raid6: Add RISC-V SIMD syndrome and recovery calculations</title>
<updated>2025-06-05T21:03:07+00:00</updated>
<author>
<name>Chunyan Zhang</name>
<email>zhangchunyan@iscas.ac.cn</email>
</author>
<published>2025-03-05T08:37:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6093faaf9593fca92f96f165c95ff4b53353b1f4'/>
<id>urn:sha1:6093faaf9593fca92f96f165c95ff4b53353b1f4</id>
<content type='text'>
The assembly is originally based on the ARM NEON and int.uc, but uses
RISC-V vector instructions to implement the RAID6 syndrome and
recovery calculations.

The functions are tested on QEMU running with the option "-icount shift=0":

  raid6: rvvx1    gen()  1008 MB/s
  raid6: rvvx2    gen()  1395 MB/s
  raid6: rvvx4    gen()  1584 MB/s
  raid6: rvvx8    gen()  1694 MB/s
  raid6: int64x8  gen()   113 MB/s
  raid6: int64x4  gen()   116 MB/s
  raid6: int64x2  gen()   272 MB/s
  raid6: int64x1  gen()   229 MB/s
  raid6: using algorithm rvvx8 gen() 1694 MB/s
  raid6: .... xor() 1000 MB/s, rmw enabled
  raid6: using rvv recovery algorithm

[Charlie: - Fixup vector options]

Signed-off-by: Charlie Jenkins &lt;charlie@rivosinc.com&gt;
Signed-off-by: Chunyan Zhang &lt;zhangchunyan@iscas.ac.cn&gt;
Reviewed-by: Charlie Jenkins &lt;charlie@rivosinc.com&gt;
Tested-by: Charlie Jenkins &lt;charlie@rivosinc.com&gt;
Link: https://lore.kernel.org/r/20250305083707.74218-1-zhangchunyan@iscas.ac.cn
Signed-off-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
</content>
</entry>
<entry>
<title>raid6: skip avx512 checks</title>
<updated>2025-04-30T19:53:48+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2025-03-28T20:49:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5f5305dea066deb8a299cf9a00ac47b031332723'/>
<id>urn:sha1:5f5305dea066deb8a299cf9a00ac47b031332723</id>
<content type='text'>
It is no longer necessary to check for CONFIG_AS_AVX512, since the minimum
assembler version is now from binutils-2.30 and this always supports it.

Acked-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
</content>
</entry>
<entry>
<title>lib/raid6: Drop IA64 support</title>
<updated>2023-09-11T08:13:18+00:00</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ardb@kernel.org</email>
</author>
<published>2023-01-13T17:08:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b089ea3cc30de85ea7e20aa66500feb4082dfbf7'/>
<id>urn:sha1:b089ea3cc30de85ea7e20aa66500feb4082dfbf7</id>
<content type='text'>
Drop Itanium support from the RAID6 code, and along with it, the 16x and
32x unrolled versions, which were only used by IA64.

Signed-off-by: Ard Biesheuvel &lt;ardb@kernel.org&gt;
</content>
</entry>
<entry>
<title>raid6: Add LoongArch SIMD recovery implementation</title>
<updated>2023-09-06T14:53:55+00:00</updated>
<author>
<name>WANG Xuerui</name>
<email>git@xen0n.name</email>
</author>
<published>2023-09-06T14:53:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f2091321044d9fbcadb93dfc1c9cf23e563ea40c'/>
<id>urn:sha1:f2091321044d9fbcadb93dfc1c9cf23e563ea40c</id>
<content type='text'>
Similar to the syndrome calculation, the recovery algorithms also work
on 64 bytes at a time to align with the L1 cache line size of current
and future LoongArch cores (that we care about), which means
unrolled-by-4 LSX and unrolled-by-2 LASX code.

The assembly is originally based on the x86 SSSE3/AVX2 ports, but
register allocation has been redone to take advantage of LSX/LASX's 32
vector registers, and the instruction sequence has been optimized to suit
(e.g. LoongArch can perform per-byte srl and andi on vectors, but x86
cannot).

Performance numbers measured by instrumenting the raid6test code, on a
3A5000 system clocked at 2.5GHz:

&gt; lasx  2data: 354.987 MiB/s
&gt; lasx  datap: 350.430 MiB/s
&gt; lsx   2data: 340.026 MiB/s
&gt; lsx   datap: 337.318 MiB/s
&gt; intx1 2data: 164.280 MiB/s
&gt; intx1 datap: 187.966 MiB/s

Because recovery algorithms are chosen solely based on priority and
availability, lasx is marked as priority 2 and lsx priority 1. At least
for the current generation of LoongArch micro-architectures, LASX should
always be faster than LSX whenever supported, and have similar power
consumption characteristics (because the only known LASX-capable uarch,
the LA464, always computes the full 256-bit result for vector ops).
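
A minimal sketch of selection by priority and availability, written in
Python rather than the kernel's C. The tuple layout, the function name
choose_recov and the intx1 priority of 0 are assumptions made for
illustration; only the lasx and lsx priorities come from the text above.

```python
def choose_recov(algos):
    # algos: list of (name, priority, valid) tuples, where valid() reports
    # whether the CPU supports the implementation (hypothetical layout).
    usable = [a for a in algos if a[2]()]
    # No benchmarking: the highest priority among usable entries wins.
    return max(usable, key=lambda a: a[1])[0]


# Example table; intx1 stands in for the always-available fallback.
table = [
    ("lasx", 2, lambda: False),   # pretend LASX is not supported here
    ("lsx", 1, lambda: True),
    ("intx1", 0, lambda: True),
]
print(choose_recov(table))        # prints: lsx
```

With LASX reported as available, the same call would return lasx without
timing anything.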

Acked-by: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: WANG Xuerui &lt;git@xen0n.name&gt;
Signed-off-by: Huacai Chen &lt;chenhuacai@loongson.cn&gt;
</content>
</entry>
<entry>
<title>raid6: Add LoongArch SIMD syndrome calculation</title>
<updated>2023-09-06T14:53:55+00:00</updated>
<author>
<name>WANG Xuerui</name>
<email>git@xen0n.name</email>
</author>
<published>2023-09-06T14:53:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8f3f06dfd6873135068ccf1a0b386308e8c4da38'/>
<id>urn:sha1:8f3f06dfd6873135068ccf1a0b386308e8c4da38</id>
<content type='text'>
The algorithms work on 64 bytes at a time, which is the L1 cache line
size of all current and future LoongArch cores (that we care about), as
confirmed by Huacai. The code is based on the generic int.uc algorithm,
unrolled 4 times for LSX and 2 times for LASX. Further unrolling does
not meaningfully improve the performance according to experiments.

Performance numbers measured during system boot on a 3A5000 @ 2.5GHz:

&gt; raid6: lasx     gen() 12726 MB/s
&gt; raid6: lsx      gen() 10001 MB/s
&gt; raid6: int64x8  gen()  2876 MB/s
&gt; raid6: int64x4  gen()  3867 MB/s
&gt; raid6: int64x2  gen()  2531 MB/s
&gt; raid6: int64x1  gen()  1945 MB/s

Comparison of xor() speeds (from different boots but meaningful anyway):

&gt; lasx:    11226 MB/s
&gt; lsx:     6395 MB/s
&gt; int64x4: 2147 MB/s

Performance as measured by raid6test:

&gt; raid6: lasx     gen() 25109 MB/s
&gt; raid6: lsx      gen() 13233 MB/s
&gt; raid6: int64x8  gen()  4164 MB/s
&gt; raid6: int64x4  gen()  6005 MB/s
&gt; raid6: int64x2  gen()  5781 MB/s
&gt; raid6: int64x1  gen()  4119 MB/s
&gt; raid6: using algorithm lasx gen() 25109 MB/s
&gt; raid6: .... xor() 14439 MB/s, rmw enabled

Acked-by: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: WANG Xuerui &lt;git@xen0n.name&gt;
Signed-off-by: Huacai Chen &lt;chenhuacai@loongson.cn&gt;
</content>
</entry>
<entry>
<title>lib/raid6: drop RAID6_USE_EMPTY_ZERO_PAGE</title>
<updated>2022-11-14T17:35:50+00:00</updated>
<author>
<name>Giulio Benetti</name>
<email>giulio.benetti@benettiengineering.com</email>
</author>
<published>2022-10-19T16:04:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=42271ca389edb0446b9e492858b4c38083b0b9f8'/>
<id>urn:sha1:42271ca389edb0446b9e492858b4c38083b0b9f8</id>
<content type='text'>
RAID6_USE_EMPTY_ZERO_PAGE is unused and hardcoded to 0, so let's drop it.

Signed-off-by: Giulio Benetti &lt;giulio.benetti@benettiengineering.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
</entry>
<entry>
<title>lib/raid6: Use strict priority ranking for pq gen() benchmarking</title>
<updated>2022-01-06T16:37:03+00:00</updated>
<author>
<name>Dirk Müller</name>
<email>dmueller@suse.de</email>
</author>
<published>2022-01-05T16:38:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=36dacddbf0bdba86cd00f066b4d724157eeb63f1'/>
<id>urn:sha1:36dacddbf0bdba86cd00f066b4d724157eeb63f1</id>
<content type='text'>
On x86_64, currently 3 variants of AVX512, 3 variants of AVX2
and 3 variants of SSE2 are benchmarked on initialization, taking
between 144 and 153 jiffies. Testing across a hardware pool of
various generations of Intel CPUs, I could not find a single
case where SSE2 won over AVX2 or AVX512. There are, however, cases
where AVX2 wins over AVX512.

Change "prefer" into an integer priority field (similar to
how recov selection works) to have more than one ranking level
available, which is backwards compatible with existing behavior.

Give AVX2/512 variants higher priority over SSE2 in order to skip
SSE testing when AVX is available. In an AVX2/x86_64/HZ=250 case this
saves on the order of 200ms of initialization time.
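
The effect of the priority field can be sketched as follows. This is a
simplified Python model, not the kernel loop: the GenAlgo class, the
choose_gen function, the priority values and the throughput numbers are
all hypothetical and only illustrate that lower-ranked candidates are
never benchmarked once a higher-ranked one is usable.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GenAlgo:
    name: str
    priority: int                 # higher value means preferred
    valid: Callable[[], bool]     # is the CPU feature available?
    bench: Callable[[], int]      # hypothetical: measured MB/s


def choose_gen(algos):
    usable = [a for a in algos if a.valid()]
    top = max(a.priority for a in usable)
    # Only the highest-ranked usable candidates are benchmarked.
    contenders = [a for a in usable if a.priority == top]
    results = [(a.bench(), a) for a in contenders]
    best = max(results, key=lambda r: r[0])[1]
    return best, [a.name for a in contenders]


algos = [
    GenAlgo("avx512x2", 2, lambda: False, lambda: 0),   # not supported
    GenAlgo("avx2x4", 2, lambda: True, lambda: 30000),
    GenAlgo("avx2x2", 2, lambda: True, lambda: 28000),
    GenAlgo("sse2x4", 1, lambda: True, lambda: 15000),  # never timed
]
best, tested = choose_gen(algos)
print(best.name, tested)   # prints: avx2x4 ['avx2x4', 'avx2x2']
```

Candidates that share the top priority are still compared by measured
throughput, which matches the observation that AVX2 sometimes beats
AVX512.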

Signed-off-by: Dirk Müller &lt;dmueller@suse.de&gt;
Acked-by: Paul Menzel &lt;pmenzel@molgen.mpg.de&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
</entry>
<entry>
<title>lib/raid6: skip benchmark of non-chosen xor_syndrome functions</title>
<updated>2022-01-06T16:37:03+00:00</updated>
<author>
<name>Dirk Müller</name>
<email>dmueller@suse.de</email>
</author>
<published>2022-01-05T16:38:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=38640c480939d56cc8b03d58642fc5261761a697'/>
<id>urn:sha1:38640c480939d56cc8b03d58642fc5261761a697</id>
<content type='text'>
In commit fe5cbc6e06c7 ("md/raid6 algorithms: delta syndrome functions")
xor_syndrome() benchmarking was also added to the raid6_choose_gen()
function. However, the results of that benchmarking were intentionally
discarded and did not influence the choice. The function picked the
xor_syndrome() variant belonging to the best-performing gen_syndrome().

Reduce runtime of raid6_choose_gen() without modifying its outcome by
only benchmarking the xor_syndrome() of the best gen_syndrome() variant.

For a HZ=250 x86_64 system with avx2 and without avx512 this removes
5 out of 6 xor() benchmarks, saving 340ms of raid6 initialization time.
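
The change can be illustrated with a small Python model. The function
choose_gen_then_xor, the helper make_xor, the variant names and the
throughput numbers are hypothetical; the sketch assumes every variant
provides an xor_syndrome() and only shows that a single xor() benchmark
runs, for the gen() winner.

```python
def choose_gen_then_xor(variants):
    # variants: list of (name, gen_bench, xor_bench) tuples. The bench
    # callables are hypothetical stand-ins that return throughput in MB/s.
    gen_results = [(gen_bench(), name, xor_bench)
                   for name, gen_bench, xor_bench in variants]
    best_perf, best_name, best_xor = max(gen_results, key=lambda r: r[0])
    # xor_syndrome() is timed once, for the winning variant only.
    xor_perf = best_xor()
    return best_name, best_perf, xor_perf


xor_calls = []


def make_xor(name, perf):
    def bench():
        xor_calls.append(name)   # record which xor() benchmarks actually ran
        return perf
    return bench


variants = [
    ("avx2x1", lambda: 20000, make_xor("avx2x1", 9000)),
    ("avx2x2", lambda: 28000, make_xor("avx2x2", 12000)),
    ("avx2x4", lambda: 30000, make_xor("avx2x4", 14000)),
]
print(choose_gen_then_xor(variants), xor_calls)
# prints: ('avx2x4', 30000, 14000) ['avx2x4']
```

The selected pair is the same as before the change, since the xor()
timings never influenced the choice.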

Signed-off-by: Dirk Müller &lt;dmueller@suse.de&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
</entry>
</feed>
