International Association for Cryptologic Research

IACR News item: 28 August 2020

Yihong Zhu, Min Zhu, Bohan Yang, Wenping Zhu, Chenchen Deng, Chen Chen, Shaojun Wei, Leibo Liu
ePrint Report
Although many hardware and software implementations have been proposed to accelerate lattice-based cryptography, Saber, a module-LWR-based algorithm that has advanced to the second round of the NIST standardization process, is not adequately supported by current solutions. Motivated by this gap, this paper proposes a high-performance crypto-processor based on algorithm-hardware co-design. First, a hierarchical Karatsuba calculation framework, a hardware-efficient Karatsuba scheduling strategy, and an optimized circuit structure enable high-throughput polynomial multiplication. Furthermore, a task-level pipeline and truncated multipliers enable algorithm-specific fine-grained processing. With these optimizations, our processor takes 943, 1156, and 408 clock cycles for key generation, encryption, and decryption of Saber768, respectively, achieving 5.4x, 5.2x, and 4.2x reductions compared with state-of-the-art FPGA solutions. The post-layout simulation of our design is implemented in a TSMC 40 nm CMOS process within an area of 0.35 mm². The throughput for Saber768 is up to 346k encryption operations per second and the energy efficiency is 0.12 µJ per encryption when operating at 400 MHz, improvements of nearly 52x and 30x, respectively, over current PQC hardware solutions.
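For readers unfamiliar with the underlying arithmetic, the following is a minimal software sketch of Karatsuba polynomial multiplication in Saber's ring, Z_q[x]/(x^256 + 1) with q = 2^13. It is not the authors' hierarchical hardware framework or scheduling strategy, which the abstract does not detail. The function names and the recursion cutoff `threshold` are illustrative assumptions, and the input length is assumed to be a power of two.

```python
N = 256       # Saber polynomial degree: ring is Z_q[x] / (x^N + 1)
Q = 1 << 13   # Saber modulus q = 2^13


def schoolbook(a, b):
    """Plain O(n^2) product of two coefficient lists, without reduction."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def karatsuba(a, b, threshold=16):
    """Karatsuba product of two equal-length coefficient lists.

    Assumes len(a) == len(b) and that the length is a power of two.
    `threshold` is a hypothetical cutoff below which schoolbook is used.
    """
    n = len(a)
    if n <= threshold:
        return schoolbook(a, b)
    h = n // 2
    a_lo, a_hi = a[:h], a[h:]
    b_lo, b_hi = b[:h], b[h:]
    # Three half-size products instead of four.
    lo = karatsuba(a_lo, b_lo, threshold)
    hi = karatsuba(a_hi, b_hi, threshold)
    mid = karatsuba(
        [x + y for x, y in zip(a_lo, a_hi)],
        [x + y for x, y in zip(b_lo, b_hi)],
        threshold,
    )
    out = [0] * (2 * n - 1)
    for i in range(2 * h - 1):
        out[i] += lo[i]
        out[i + h] += mid[i] - lo[i] - hi[i]   # cross term a_lo*b_hi + a_hi*b_lo
        out[i + 2 * h] += hi[i]
    return out


def reduce_negacyclic(c, n=N, q=Q):
    """Fold a product modulo x^n + 1 (so x^n = -1) and reduce coefficients mod q."""
    out = [0] * n
    for i, ci in enumerate(c):
        if i < n:
            out[i] += ci
        else:
            out[i - n] -= ci
    return [x % q for x in out]


def polymul(a, b):
    """Multiply two degree-<N polynomials in Z_q[x] / (x^N + 1)."""
    return reduce_negacyclic(karatsuba(a, b))
```

As a consistency check on the reported figures, 1156 cycles per encryption at 400 MHz corresponds to 400,000,000 / 1156 ≈ 346,000 encryptions per second, which matches the stated throughput.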