diff options
author | Jussi Kivilinna <jussi.kivilinna@mbnet.fi> | 2012-08-28 15:24:49 +0400 |
---|---|---|
committer | Herbert Xu <herbert@gondor.apana.org.au> | 2012-09-07 00:17:04 +0400 |
commit | ddaea7869d29beb9e0042c96ea52c9cca2afd68a (patch) | |
tree | d4d6d6e71ae0d1d451c58e7910ae3a05f55d0ad8 /arch/x86/crypto/cast6-avx-x86_64-asm_64.S | |
parent | f94a73f8dd5644f45f9d2e3139608ca83b932d93 (diff) | |
download | linux-ddaea7869d29beb9e0042c96ea52c9cca2afd68a.tar.xz |
crypto: cast5-avx - tune assembler code for more performance
Patch replaces 'movb' instructions with 'movzbl' to break false register
dependencies, interleaves instructions better for out-of-order scheduling
and merges constant 16-bit rotation with round-key variable rotation.
tcrypt ECB results (128bit key):
Intel Core i5-2450M:
size old-vs-new new-vs-generic old-vs-generic
enc dec enc dec enc dec
256 1.18x 1.18x 2.45x 2.47x 2.08x 2.10x
1k 1.20x 1.20x 2.73x 2.73x 2.28x 2.28x
8k 1.20x 1.19x 2.73x 2.73x 2.28x 2.29x
[v2]
- Do instruction interleaving another way to avoid adding new FPU<=>CPU
register moves as these cause performance drop on Bulldozer.
- Improvements to round-key variable rotation handling.
- Further interleaving improvements for better out-of-order scheduling.
Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Diffstat (limited to 'arch/x86/crypto/cast6-avx-x86_64-asm_64.S')
0 files changed, 0 insertions, 0 deletions