International Association for Cryptologic Research

CryptoDB

Generation of Fast Finite Field Arithmetic for Cortex-M4 with ECDH and SQIsign Applications

Authors:
Felix Carvalho Rodrigues
Décio Gazzoni Filho
Gora Adj
Isaac A. Canales-Martínez
Jorge Chávez-Saab
Julio López
Michael Scott
Francisco Rodríguez-Henríquez
Download:
DOI: 10.46586/tches.v2025.i4.588-620
URL: https://tches.iacr.org/index.php/TCHES/article/view/12422
Abstract: Finite field arithmetic is central to several cryptographic algorithms on embedded devices like the ARM Cortex-M4, particularly for elliptic curve and isogeny-based cryptography. However, rapid algorithm evolution, driven by initiatives such as NIST’s post-quantum standardization, might frequently render hand-optimized implementations obsolete. We address this challenge with m4-modarith, a library generating C code with inline assembly for the Cortex-M4 that rivals custom-tuned assembly, enabling agile development in this ever-changing landscape. Our generated modular multiplications achieve fast performance, competitive with hand-optimized assembly implementations published in the literature, and even outperform some of them for Curve25519. Two contributions are pivotal to this success. First, we introduce a novel multiplication strategy that matches the memory access complexity of the operand caching method while being applicable to a larger cache size for Cortex-M4 implementations. Second, we generalize an efficient pseudo-Mersenne reduction strategy, and formally prove its correctness and applicability for most primes of cryptographic interest. Our generator allowed agile optimization of SQIsign’s NIST PQC Round 2 submission, improving level 1 verification from 123 Mcycles to only 54 Mcycles, a 2.3x speedup. As an additional case study, we use our generator to improve performance of portable implementations of RFC 7748 by up to 2.2x.
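Illustrative note: the pseudo-Mersenne reduction mentioned in the abstract builds on the identity 2^k ≡ c (mod p) for a prime p = 2^k - c, which lets the high part of a product be folded back into the low part with a small multiplication. The Python sketch below shows only this textbook folding idea. It is not the paper's generalized strategy, nor the C code that m4-modarith generates, and the function name `pseudo_mersenne_reduce` is hypothetical.

```python
def pseudo_mersenne_reduce(x: int, k: int, c: int) -> int:
    """Reduce a non-negative integer x modulo p = 2**k - c by folding.

    Textbook sketch only; assumes 0 < c < 2**(k - 1) so that a single
    final subtraction is enough. Not constant-time.
    """
    p = (1 << k) - c
    mask = (1 << k) - 1
    # Fold: x = hi * 2**k + lo, and 2**k is congruent to c mod p,
    # so x is congruent to hi * c + lo. Each fold strictly shrinks x.
    while x >> k:
        x = (x >> k) * c + (x & mask)
    # Here 0 <= x < 2**k, so x - p < c < p: one subtraction suffices.
    if x >= p:
        x -= p
    return x


# Usage with the Curve25519 prime p = 2**255 - 19
p25519 = (1 << 255) - 19
a = p25519 - 2
b = p25519 - 3
product_mod_p = pseudo_mersenne_reduce(a * b, 255, 19)
```

An embedded implementation would operate on 32-bit limbs with a fixed number of folds and a branch-free final correction; the loop and branch here are kept only for readability.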
BibTeX
@article{tches-2025-35986,
  title={Generation of Fast Finite Field Arithmetic for Cortex-M4 with ECDH and SQIsign Applications},
  journal={IACR Transactions on Cryptographic Hardware and Embedded Systems},
  publisher={Ruhr-Universität Bochum},
  volume={2025},
  pages={588-620},
  url={https://tches.iacr.org/index.php/TCHES/article/view/12422},
  doi={10.46586/tches.v2025.i4.588-620},
  author={Felix Carvalho Rodrigues and Décio Gazzoni Filho and Gora Adj and Isaac A. Canales-Martínez and Jorge Chávez-Saab and Julio López and Michael Scott and Francisco Rodríguez-Henríquez},
  year=2025
}