International Association for Cryptologic Research

IACR News item: 05 September 2022

Weiji Guo
ePrint Report
Constant-time SM4 implementations have lagged behind AES in efficiency for most internet traffic and data encryption scenarios. Before our work, the best reported performance was 3.77 cycles per byte (cpb) on x86 (AESNI + AVX2) and 8.62 cpb on Arm (NEON), whereas state-of-the-art constant-time AES implementations reach 0.63 cpb. Dedicated SM4 instruction set extensions, such as those optionally available in Armv8.2, can achieve cpb comparable to AES, but they are available only on a limited set of processors and therefore have little impact on real-world use. To fill this gap, we explored novel techniques using the Intel GFNI instruction set extension and Arm NEON. We achieved 1.51 cpb with GFNI + AVX512 and 2.62 cpb with GFNI + AVX2 on Intel processors, and 6.74 cpb with NEON. In addition, we simplified the algebraic expression of the SM4 S-Box. Our technique for exploiting the L1 cache could also be applied to other applications and hardware platforms where conditions permit.
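To put the cpb figures above in perspective, the following minimal sketch converts them into speedups over the earlier results and into an approximate throughput. The cpb values are taken from the abstract; the 3.0 GHz clock frequency and the helper names (`speedup`, `throughput_gb_per_s`) are illustrative assumptions, not part of the report, and real throughput depends on the actual processor and workload.

```python
# cpb (cycles per byte) figures quoted in the abstract.
PRIOR_CPB = {"x86": 3.77, "arm": 8.62}   # best results before this work
NEW_CPB = {
    "GFNI + AVX512": ("x86", 1.51),
    "GFNI + AVX2": ("x86", 2.62),
    "NEON": ("arm", 6.74),
}
AES_CPB = 0.63                           # constant-time AES reference point

# Assumption for illustration only: a core running at a fixed 3.0 GHz.
ASSUMED_CLOCK_HZ = 3.0e9


def speedup(old_cpb: float, new_cpb: float) -> float:
    """Ratio by which the cycle cost per byte shrinks (higher is better)."""
    return old_cpb / new_cpb


def throughput_gb_per_s(cpb: float, clock_hz: float = ASSUMED_CLOCK_HZ) -> float:
    """Bytes processed per second at the given clock, in GB/s (1 GB = 1e9 bytes)."""
    return clock_hz / cpb / 1e9


if __name__ == "__main__":
    for name, (platform, cpb) in NEW_CPB.items():
        print(
            f"{name}: {cpb} cpb, "
            f"{speedup(PRIOR_CPB[platform], cpb):.2f}x over prior {platform} result, "
            f"{cpb / AES_CPB:.2f}x the cycle cost of AES, "
            f"about {throughput_gb_per_s(cpb):.2f} GB/s at the assumed clock"
        )
```

Because cpb is independent of clock frequency, the speedup ratios hold for any clock; only the GB/s figures depend on the assumed 3.0 GHz.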