IACR News item: 03 August 2025
Florian Krieger, Florian Hirner, Ahmet Can Mert, Sujoy Sinha Roy
Fully Homomorphic Encryption (FHE) and Post-Quantum Cryptography (PQC) involve polynomial multiplications, which are a common performance bottleneck. To resolve this bottleneck, polynomial multiplications are often accelerated in hardware using the Number-Theoretic Transformation (NTT) or the Fast Fourier Transformation (FFT). In particular, NTT operates over modular rings while FFT operates over complex numbers. NTT and FFT are widely deployed in applications with diverse parameter sets, leading to long design times for hardware accelerators. Existing hardware generation tools have limited functionality since they do not support generic on-the-fly twiddle factor generation or different memory-related optimizations. This paper improves the hardware design process and presents a generic and flexible tool to generate FFT and NTT architectures. In contrast to prior work, we combine on-the-fly twiddle factor generation and stall-free memory accesses. Moreover, we enhance hardware design flexibility through memory-optimized or routing-optimized design strategies. While our memory-optimized strategy minimizes twiddle factors in ROM, our routing-optimized strategy allows significantly higher clock frequencies on FPGAs. These optimization strategies allow effective customization of NTT/FFT architectures, spanning from low-end PQC to high-end FHE accelerators. Compared to existing works, we reach up to 15.9x lower latency and up to 7.4x improved ATP for FFT applications such as the Falcon signature scheme. Considering other NTT tools, we decrease latency by up to 1.8x and 2x for PQC and FHE parameter sets, respectively.
Additional news items may be found on the IACR news page.