CryptoDB
Décio Gazzoni Filho
Publications and invited talks
Year
Venue
Title
2025
TCHES
Generation of Fast Finite Field Arithmetic for Cortex-M4 with ECDH and SQIsign Applications
Abstract
Finite field arithmetic is central to several cryptographic algorithms on embedded devices like the ARM Cortex-M4, particularly for elliptic curve and isogeny-based cryptography. However, rapid algorithm evolution, driven by initiatives such as NIST’s post-quantum standardization, might frequently render hand-optimized implementations obsolete. We address this challenge with m4-modarith, a library generating C code with inline assembly for the Cortex-M4 that rivals custom-tuned assembly, enabling agile development in this ever-changing landscape. Our generated modular multiplications are fast: they are competitive with hand-optimized assembly implementations published in the literature, and even outperform some of them for Curve25519. Two contributions are pivotal to this success. First, we introduce a novel multiplication strategy that matches the memory access complexity of the operand caching method while being applicable to a larger cache size for Cortex-M4 implementations. Second, we generalize an efficient pseudo-Mersenne reduction strategy, and formally prove its correctness and applicability for most primes of cryptographic interest. Our generator allowed agile optimization of SQIsign’s NIST PQC Round 2 submission, improving level 1 verification from 123 Mcycles to only 54 Mcycles, a 2.3x speedup. As an additional case study, we use our generator to improve performance of portable implementations of RFC 7748 by up to 2.2x.
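For readers unfamiliar with the pseudo-Mersenne reduction mentioned in the abstract, the underlying idea can be illustrated with a minimal sketch. This is the textbook folding technique for a prime of the form p = 2^k - c, not the generalized strategy of the paper and not code produced by m4-modarith; the function name `pseudo_mersenne_reduce` and its parameters are hypothetical, and Python integers stand in for the multi-word arithmetic a Cortex-M4 implementation would use.

```python
def pseudo_mersenne_reduce(x: int, k: int, c: int) -> int:
    """Reduce a non-negative x modulo p = 2**k - c (illustrative sketch only).

    Assumes 0 < c < 2**(k - 1), so that 2**k < 2 * p.
    """
    p = (1 << k) - c
    mask = (1 << k) - 1
    # Fold the high part down: since 2**k is congruent to c modulo p,
    # hi * 2**k + lo is congruent to hi * c + lo.
    # Each fold lowers x by hi * p, so the loop terminates.
    while x >> k:
        x = (x >> k) * c + (x & mask)
    # Now x < 2**k < 2 * p, so a single conditional subtraction suffices.
    if x >= p:
        x -= p
    return x


# Example with the Curve25519 field prime p = 2**255 - 19.
K, C = 255, 19
P = (1 << K) - C
product = (P - 1) * (P - 2)  # a double-width product of two field elements
reduced = pseudo_mersenne_reduce(product, K, C)
```

The fold replaces a general division by a multiplication with the small constant c plus an addition, which is why primes of this shape are attractive on small cores. A production implementation would additionally need to run in constant time, which this sketch does not attempt.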
Coauthors
- Gora Adj (1)
- Isaac A. Canales-Martínez (1)
- Jorge Chavez-Saab (1)
- Décio Gazzoni Filho (1)
- Julio López (1)
- Felix Carvalho Rodrigues (1)
- Francisco Rodríguez-Henríquez (1)
- Michael Scott (1)