International Association for Cryptologic Research

International Association
for Cryptologic Research

IACR News

Updates on the COVID-19 situation are on the Announcement channel.

Here you can see all recent updates to the IACR webpage. These updates are also available:

via RSS feed
via Twitter
via Weibo
via Facebook

23 October 2020

Akinori Hosoyamada, Tetsu Iwata
ePrint Report
Recent results on quantum cryptanalysis show that some symmetric key schemes can be broken in polynomial time even if they are proven to be secure in the classical setting. Liskov, Rivest, and Wagner showed that secure tweakable block ciphers can be constructed from secure block ciphers in the classical setting. However, Kaplan et al. showed that their scheme can be broken by polynomial time quantum superposition attacks, even if the underlying block ciphers are quantum-secure. Since then, it has remained open whether there exists a mode of operation on block ciphers that yields quantum-secure tweakable block ciphers. This paper settles the problem in the reduction-based provable security paradigm. We show the first design of quantum-secure tweakable block ciphers based on quantum-secure block ciphers, and present a provable security bound. Our construction is simple, and when instantiated with a quantum-secure $n$-bit block cipher, it is secure against attacks that query arbitrary quantum superpositions of plaintexts and tweaks up to $O(2^{n/6})$ quantum queries. Our security proofs use the compressed oracle technique introduced by Zhandry. More precisely, we use an alternative formalization of the technique introduced by Hosoyamada and Iwata.
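For orientation, here is the classical Liskov-Rivest-Wagner construction referenced above, in its common form with an XOR-universal hash $h$ of the tweak; this is the mode that Kaplan et al. broke under superposition queries, not the new construction of this paper.

```latex
% LRW: a tweakable block cipher from a block cipher E_K and a hash h
% of the tweak T; the tweak-dependent mask is XORed around E_K.
\[
  \widetilde{E}_{K,h}(T, M) \;=\; E_K\bigl(M \oplus h(T)\bigr) \oplus h(T)
\]
```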
Subhadeep Banik, Zhenzhen Bao, Takanori Isobe, Hiroyasu Kubo, Fukang Liu, Kazuhiko Minematsu, Kosei Sakamoto, Nao Shibata, Maki Shigeri
ePrint Report
In this article, we present WARP, a lightweight 128-bit block cipher with a 128-bit key. It aims at a small-footprint circuit in the field of 128-bit block ciphers, possibly with a unified encryption and decryption functionality. The overall structure of WARP is a variant of the 32-nibble Type-2 Generalized Feistel Network (GFN), with a permutation over nibbles designed to optimize security and efficiency. We conduct a thorough security analysis and report comprehensive hardware and software implementation results. Our hardware results show that WARP is the smallest 128-bit block cipher under most typical hardware implementation strategies. A serialized circuit of WARP achieves around 800 Gate Equivalents (GEs), which is much smaller than previous state-of-the-art implementations of lightweight 128-bit ciphers (which need more than $1,000$ GEs). While our primary metric is hardware size, WARP also enjoys several other features, most notably low energy consumption. This is somewhat surprising, since a GFN generally needs more rounds than a substitution permutation network (SPN), and GFNs have thus been considered less advantageous in this regard. We show that a multi-round implementation of WARP has quite low energy consumption. Moreover, WARP also performs well in software: our SIMD implementation is quite competitive with known hardware-oriented 128-bit lightweight ciphers for long inputs, and even much better for small inputs due to the small number of parallel blocks. On 8-bit microcontrollers, the results of our assembly implementations show that WARP is flexible enough to achieve various performance characteristics.
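As a rough structural illustration (not WARP itself), a minimal sketch of one round of a 32-nibble Type-2 GFN follows; the S-box, nibble permutation, and round-key layout are placeholders, since WARP's actual tables are specified in the paper.

```python
# One round of a 32-nibble Type-2 Generalized Feistel Network.
# SBOX and PERM are placeholder bijections, NOT the WARP tables.
SBOX = [(7 * x + 3) % 16 for x in range(16)]   # placeholder 4-bit S-box
PERM = [(5 * i + 1) % 32 for i in range(32)]   # placeholder nibble shuffle

def type2_gfn_round(state, round_key):
    """state: 32 nibbles; round_key: 16 nibbles, one per Feistel pair."""
    assert len(state) == 32 and len(round_key) == 16
    out = list(state)
    # Each even-position nibble goes through the S-box and, XORed with a
    # key nibble, is mixed into its odd-position partner.
    for i in range(16):
        out[2 * i + 1] ^= SBOX[state[2 * i]] ^ round_key[i]
    # Nibble permutation; in WARP this shuffle is chosen to optimize
    # diffusion and resistance against standard attacks.
    return [out[PERM[i]] for i in range(32)]
```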
Ohad Barta, Yuval Ishai, Rafail Ostrovsky, David J. Wu
ePrint Report
Succinct non-interactive arguments (SNARGs) enable proofs of NP statements with very low communication. Recently, there has been significant work in both theory and practice on constructing SNARGs with very short proofs. Currently, the state-of-the-art in succinctness is due to Groth (Eurocrypt 2016), who constructed a SNARG from bilinear maps where the proof consists of just 3 group elements.

In this work, we first construct a concretely-efficient designated-verifier (preprocessing) SNARG with inverse polynomial soundness, where the proof consists of just 2 group elements in a standard (generic) group. This leads to a 50% reduction in concrete proof size compared to Groth's construction. We follow the approach of Bitansky et al. (TCC 2013) who describe a compiler from linear PCPs to SNARGs in the preprocessing model. Our improvement is based on a new linear PCP packing technique that allows us to construct 1-query linear PCPs which can then be compiled into a SNARG (using ElGamal encryption over a generic group). An appealing feature of our new SNARG is that the verifier can precompute a statement-independent lookup table in an offline phase; verifying proofs then only requires 2 exponentiations and a single table lookup. This makes our new designated-verifier SNARG appealing in settings that demand fast verification and minimal communication.
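The offline/online verification pattern described above can be pictured with a toy sketch: the designated verifier precomputes a statement-independent table of accepting group elements, after which verifying an ElGamal-encrypted proof reduces to exponentiation plus a single lookup. Everything below (the toy modulus, secret key, and accepting set) is hypothetical and only illustrates this pattern, not the paper's actual linear PCP decision test.

```python
# Toy "decrypt, then table lookup" verification over Z_p^* (NOT a
# cryptographically sized group; all values below are hypothetical).
P = 2**127 - 1            # toy prime modulus
G = 3                     # toy generator
SK = 123456789            # designated verifier's secret key

# Offline phase: statement-independent lookup table of accepting values.
ACCEPT = {pow(G, v, P) for v in range(1000)}   # hypothetical accepting set

def verify(c1: int, c2: int) -> bool:
    """Proof = ElGamal ciphertext (c1, c2): strip the randomness with
    one exponentiation, then accept iff the result is in the table."""
    m = (c2 * pow(c1, P - 1 - SK, P)) % P      # m = c2 / c1^SK mod P
    return m in ACCEPT
```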

We then turn to the question of constructing arguments where the proof consists of a single group element. Here, we first show that any (possibly interactive) argument for a language L where the verification algorithm is "generic" (i.e., only performs generic group operations) and the proof consists of a single group element, implies a witness encryption scheme for L. We then show that under a yet-unproven, but highly plausible, hypothesis on the hardness of approximating the minimal distance of linear codes, we can construct a 2-message laconic argument for NP where the proof consists of a single group element. Under the same hypothesis, we obtain a witness encryption scheme for NP in the generic group model. Along the way, we show that under a conceptually-similar but proven hardness of approximation result, there is a 2-message laconic argument for NP with negligible soundness error where the prover's message consists of just 2 group elements. In both settings, we obtain laconic arguments (and linear PCPs) with linear decision procedures. Our constructions circumvent a previous lower bound by Groth on such argument systems with linear decision procedures by relying on imperfect completeness. Namely, our constructions have vanishing but not negligible completeness error, while the lower bound of Groth implicitly assumes negligible completeness error of the underlying argument. Our techniques thus highlight new avenues for designing linear PCPs, succinct arguments, and witness encryption schemes.
Abhiram Kothapalli, Elisaweta Masserova, Bryan Parno
ePrint Report
We present the first direct construction of a zero-knowledge argument system for general computation that features a linear-time prover and a constant-time verifier (after a single linear-time public setup) in terms of the number of field and group operations. Our scheme utilizes a universal linear-size structured reference string (SRS) that allows a single trusted setup to be used across all computation instances of a bounded size. Concretely, for computations of size $n$, our prover's cost is dominated by $35$ multi-exponentiations of size $n$ and our verifier's cost is dominated by $34$ pairings. To achieve the stated asymptotics, we first construct a nearly-optimal zkSNARK with a logarithmic verifier in the random oracle model. We then show how to achieve a constant-time verifier using proof composition. Along the way we design (1) a new polynomial commitment scheme for evaluation-based representations of polynomials, (2) an asymptotically optimal inner-product argument system, (3) an asymptotically optimal multi-Hadamard-product argument system, and (4) a new constraint system for NP that is particularly well-suited for our bundle of techniques.
Hosein Hadipour, Nasour Bagheri, Ling Song
ePrint Report
The boomerang and rectangle attacks are adaptations of differential cryptanalysis in which the attacker divides a block cipher $E$ into two sub-ciphers, i.e., $E = E_{1}\circ E_{0}$, to construct a distinguisher for $E$ with probability $p^{2}q^{2}$ by concatenating two short differential trails for $E_{0}$ and $E_{1}$ with probability $p$ and $q$, respectively. According to previous research, the dependency between these two differential characteristics has a great impact on the probability of boomerang and rectangle distinguishers. Dunkelman et al. proposed the sandwich attack to formalise such dependency; it regards $E$ as three parts, i.e., $E = \widetilde{E}_{1}\circ E_{m}\circ \widetilde{E}_{0}$, where $E_{m}$ contains the dependency between the two differential trails and satisfies some differential propagation with probability $r$. Accordingly, the entire probability is $p^{2}q^{2}r$. Recently, Song et al. proposed a general framework to identify the actual boundaries of $E_{m}$ and systematically evaluate the probability of $E_{m}$ with any number of rounds, and applied their method to improve the best boomerang distinguishers of SKINNY. In this paper, using a more advanced method to search for boomerang distinguishers, we show that the best previous boomerang distinguishers for SKINNY can be significantly improved. SKINNY is a very important lightweight tweakable block cipher that serves as a basic module in many candidates of NIST's Lightweight Cryptography (LWC) standardization project, and the rectangle attack is one of the most efficient attacks on its reduced-round versions; using our boomerang distinguishers, we therefore improve the related-tweakey rectangle attack on SKINNY to investigate the security of this cipher more accurately. CRAFT is another lightweight tweakable block cipher, for which we provide the first security analysis against the rectangle attack. Following previous research on evaluating the switching effect over multiple rounds of boomerang distinguishers, we also introduce new tools called the \textit{Double Boomerang Connectivity Table} (DBCT), $BDT^{\star}$, and $DBT^{\star}$ to evaluate the boomerang switch through multiple rounds more accurately. Using these new tools, we provide theoretical proofs for our boomerang distinguishers for CRAFT and SKINNY.
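For readers unfamiliar with the attack, here is the textbook boomerang distinguisher in code form (a generic sketch with chosen-plaintext and chosen-ciphertext oracle access, independent of SKINNY or CRAFT): a right quartet returns with probability about $p^{2}q^{2}r$ for the cipher, versus $2^{-n}$ for a random permutation.

```python
# Generic boomerang distinguisher: alpha is the input difference for E0,
# delta the output difference for E1; enc/dec give oracle access to the
# cipher and its inverse under the secret key.
def boomerang_returns(enc, dec, p1: int, alpha: int, delta: int) -> bool:
    p2 = p1 ^ alpha
    c1, c2 = enc(p1), enc(p2)                  # encrypt the chosen pair
    p3, p4 = dec(c1 ^ delta), dec(c2 ^ delta)  # shift ciphertexts, decrypt
    return (p3 ^ p4) == alpha                  # does the boomerang return?
```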
Juliane Krämer, Patrick Struck
ePrint Report
Ciphertext indistinguishability under chosen plaintext attacks is a standard security notion for public key encryption. It crucially relies on the usage of good randomness and is trivially unachievable if the randomness is known by the adversary. Yilek (CT-RSA'10) defined security against resetting attacks, where randomness might be reused but remains unknown to the adversary. Furthermore, Yilek claimed that security against adversaries making a single query to the challenge oracle implies security against adversaries making multiple queries to the challenge oracle. This is a typical simplification for indistinguishability security notions proven via a standard hybrid argument. The given proof, however, was pointed out to be flawed by Paterson, Schuldt, and Sibborn (PKC'14). Prior to this work, it had been unclear whether this simplification of the security notion also holds in the case of resetting attacks. We remedy this state of affairs as follows. First, we demonstrate the strength of resetting attacks by showing that many public key encryption schemes are susceptible to them. As our main contribution, we show that the simplification to adversaries making only one query to the challenge oracle also holds in the light of resetting attacks. More precisely, we show that the existing proof cannot be fixed and give a different proof for the claim. Finally, we define real-or-random security against resetting attacks and prove it equivalent to the notion by Yilek, which is of the left-or-right form.
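To see why good randomness is essential, here is a minimal sketch of the trivial attack when the adversary knows the encryption coins; `Enc` is a hypothetical stand-in for any public-key encryption algorithm that takes its randomness explicitly.

```python
# With the challenger's coins known, encryption becomes effectively
# deterministic: the adversary re-encrypts a candidate message itself
# and compares against the challenge ciphertext.
def break_ind_cpa(Enc, pk, challenge_ct, m0, m1, coins) -> int:
    """Returns the bit b such that challenge_ct = Enc(pk, m_b; coins)."""
    return 0 if Enc(pk, m0, coins) == challenge_ct else 1
```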
Steven D. Galbraith, Robert Granger, Simon-Philipp Merz, Christophe Petit
ePrint Report
In this paper we further the study of index calculus methods for solving the elliptic curve discrete logarithm problem (ECDLP). We focus on the index calculus for subfield curves, also called Koblitz curves, defined over $\mathbb{F}_q$ with ECDLP in $\mathbb{F}_{q^n}$. Instead of accelerating the solution of polynomial systems during index calculus as was predominantly done in previous work, we define factor bases that are invariant under the $q$-power Frobenius automorphism of the field $\mathbb{F}_{q^n}$, reducing the number of polynomial systems that need to be solved. A reduction by a factor of $1/n$ is the best one could hope for. We show how to choose factor bases to achieve this, while simultaneously accelerating the linear algebra step of the index calculus method for Koblitz curves by a factor $n^2$. Furthermore, we show how to use the Frobenius endomorphism to improve symmetry breaking for Koblitz curves. We provide constructions of factor bases with the desired properties, and we study their impact on the polynomial system solving costs experimentally. This work gives an answer to the problem raised in the literature on how the Frobenius endomorphism can be used to speed up index calculus on subfield curves.
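Concretely, the symmetry being exploited is the $q$-power Frobenius; choosing the factor base $\mathcal{F}$ to be Frobenius-invariant collapses each orbit of size $n$ to a single representative, which is the source of the $1/n$ reduction discussed above:

```latex
% q-power Frobenius on a subfield (Koblitz) curve E defined over F_q,
% acting on points over F_{q^n}, and the invariance condition imposed
% on the factor base:
\[
  \phi : E(\mathbb{F}_{q^n}) \to E(\mathbb{F}_{q^n}), \qquad
  \phi(x, y) = (x^{q},\, y^{q}), \qquad
  \phi(\mathcal{F}) = \mathcal{F}.
\]
```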
Fuyuki Kitagawa, Ryo Nishimaki, Takashi Yamakawa
ePrint Report
Secure software leasing (SSL) is a quantum cryptographic primitive that enables users to execute software only while the software is leased. It prevents users from executing leased software after they return it to its owner. SSL can make software distribution more flexible and controllable.

Although SSL is an attractive cryptographic primitive, the existing SSL scheme is based on public key quantum money, which has not been instantiated from standard cryptographic assumptions so far. Moreover, the existing SSL scheme supports only a subclass of evasive functions.

In this work, we present SSL schemes based on the learning with errors assumption (LWE). Specifically, our contributions consist of the following.

- We construct an SSL scheme for pseudorandom functions from the LWE assumption against quantum adversaries.
- We construct an SSL scheme for a subclass of evasive functions from the LWE assumption against sub-exponential quantum adversaries.
- We construct SSL schemes for the functionalities above with classical communication from the LWE assumption against (sub-exponential) quantum adversaries.

SSL with classical communication means that entities exchange only classical information, although they run quantum computations locally.

Our crucial tool is two-tier quantum lightning, which we introduce in this work as a relaxed version of quantum lightning. A two-tier quantum lightning scheme has a public verification algorithm called semi-verification and a private verification algorithm called full-verification. An adversary cannot generate two (possibly entangled) quantum states with the same serial number such that one passes semi-verification and the other passes full-verification. We show that we can construct a two-tier quantum lightning scheme from the LWE assumption.
Maxim Jourenko, Mario Larangeira, Keisuke Tanaka
ePrint Report
The security of blockchain based decentralized ledgers relies on consensus protocols executed between mutually distrustful parties. Such protocols incur delays which severely limit the throughput of such ledgers. Payment and state channels enable execution of offchain protocols that allow interaction between parties without involving the consensus protocol. Protocols such as Hashed Timelock Contracts (HTLC) and Sprites (FC'19) connect channels into Payment Channel Networks (PCN), allowing payments across a path of payment channels. Such a payment requires each party to lock away funds for an amount of time. The product of funds and locktime is the collateral of the party, i.e., their opportunity cost for forwarding a payment. In the case of HTLC, the locktime is linear in the length of the path, making the total collateral invested across the path quadratic in its length. Sprites improved on this by reducing the locktime to a constant, utilizing smart contracts. Atomic Multi-Channel Updates (AMCU), published at CCS'19, introduced constant collateral payments without smart contracts. In this work we present the Channel Closure attack on AMCU that allows a malicious adversary to make honest parties lose funds. Furthermore, we propose the Payment Trees protocol, which allows payments across a PCN with linear total collateral without the aid of smart contracts, achieving performance competitive with Sprites while remaining compatible with Bitcoin.
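A back-of-the-envelope version of the collateral comparison, assuming each of the $\ell$ hops forwards amount $a$ and HTLC locktimes shrink by one time unit $\Delta$ per hop along the path:

```latex
% Hop i locks amount a for (\ell - i + 1)\Delta time, so total HTLC
% collateral along the path is quadratic in \ell, versus O(\ell) for
% constant-locktime designs such as Sprites or Payment Trees.
\[
  \mathrm{Collateral}_{\mathrm{HTLC}}
  = \sum_{i=1}^{\ell} a\,(\ell - i + 1)\,\Delta
  = a\,\Delta\,\frac{\ell(\ell+1)}{2}
  = O(\ell^{2}).
\]
```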

20 October 2020

Yi Deng
ePrint Report
We develop an individual simulation technique that explicitly makes use of particular properties/structures of a given adversary's functionality. Using this simulation technique, we obtain the following results.

1. We construct the first protocols that break previous black-box barriers of [Xiao, TCC'11] and [Alwen et al., Crypto'05] under the standard hardness of factoring, both of which are polynomial-time simulatable against all a-priori bounded polynomial-size distinguishers:
-- Two-round selective opening secure commitment scheme.
-- Three-round concurrent zero knowledge and concurrent witness hiding argument for NP in the bare public-key model.

2. We present a simpler two-round weak zero knowledge and witness hiding argument for NP in the plain model under the sub-exponential hardness of factoring. Our technique also yields a significantly simpler proof that existing distinguisher-dependent simulatable zero knowledge protocols are also polynomial-time simulatable against all distinguishers of a-priori bounded polynomial size.

The core conceptual idea underlying our individual simulation technique is an observation of the existence of nearly optimal extractors for all hard distributions: For any NP-instance(s) sampling algorithm, there exists a polynomial-size witness extractor (depending on the sampler's functionality) that almost outperforms any circuit of a-priori bounded polynomial size in terms of the success probability.
Orr Dunkelman, Abhishek Kumar, Eran Lambooij, Somitra Kumar Sanadhya
ePrint Report
Format-Preserving Encryption (FPE) is a method to encrypt non-standard domains, allowing one to securely encrypt not only binary strings, but also special domains, e.g., social security numbers into social security numbers. The need for such schemes resulted in a few standardized constructions such as the NIST standardized FF1 and FF3-1 and the Korean standards FEA-1 and FEA-2. Moreover, there are currently efforts both in ANSI and in ISO to include such block ciphers in standards (e.g., ANSI X9.124 on encryption for financial services). Most of the proposed FPE schemes, including FF1, FF3-1, FEA-1, and FEA-2, are based on a Feistel construction with pseudo-random round functions. Moreover, to mitigate enumeration attacks against the possibly small domains, they all employ tweaks, which enrich the actual domain sizes. In this paper we present distinguishing attacks against Feistel-based FPEs. We show a distinguishing attack against the full FF1 with data complexity of $2^{60}$ 20-bit plaintexts, and against the full FF3-1 with data complexity of $2^{40}$ 20-bit plaintexts. For FEA-1 with 128-bit, 192-bit, and 256-bit keys, the data complexity of the distinguishing attack is $2^{32}$, $2^{40}$, and $2^{48}$ 8-bit plaintexts, respectively. The data complexity of the distinguishing attack against the full FEA-2 with 128-bit, 192-bit, and 256-bit keys is $2^{56}$, $2^{68}$, and $2^{80}$ 8-bit plaintexts, respectively. Moreover, we show how to extend the distinguishing attacks on FEA-1 and FEA-2 with 192-bit and 256-bit keys into key recovery attacks with time complexity $2^{136}$ (for both attacks).
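To make the Feistel-based FPE structure concrete, here is a toy alternating-Feistel FPE on decimal strings. It is only a structural sketch: real schemes like FF1, FF3-1, FEA-1, and FEA-2 use AES-based round functions, carefully specified round counts, and tweak schedules, none of which are modeled by the placeholder round function below.

```python
import hashlib

# Toy Feistel-based FPE on decimal digit strings (format in = format out).
# SHA-256 here is a placeholder round function, NOT what FF1/FF3-1 use.
def feistel_fpe_encrypt(key: bytes, tweak: bytes, digits: str,
                        rounds: int = 8) -> str:
    assert digits.isdigit() and len(digits) >= 2
    u = len(digits) // 2
    A, B = digits[:u], digits[u:]
    for i in range(rounds):
        # Round function: hash of key, tweak, round index, and half B.
        h = hashlib.sha256(key + tweak + bytes([i]) + B.encode()).digest()
        f = int.from_bytes(h, "big") % 10 ** len(A)
        # Modular addition keeps the result inside the decimal domain.
        A, B = B, str((int(A) + f) % 10 ** len(A)).zfill(len(A))
    return A + B

# Example: a 9-digit number encrypts to another 9-digit number.
print(feistel_fpe_encrypt(b"k" * 16, b"tweak", "123456789"))
```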
Karim Belabas, Thorsten Kleinjung, Antonio Sanso, Benjamin Wesolowski
ePrint Report
In this short note we analyze the low order assumption in imaginary quadratic number fields. We show how this assumption is broken for Mersenne primes. We also describe how one might attack this assumption for other classes of prime numbers, leveraging new mathematical tools coming from higher (cubic) number fields.
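For reference, a standard formulation of the assumption in question, stated for a generic group $G$ of unknown order (here the class group of an imaginary quadratic field; $\lambda$ is the security parameter):

```latex
% Low order assumption: it should be infeasible to find a nontrivial
% element of known low order in G, i.e., to output
\[
  (\mu, d) \quad \text{with} \quad \mu \in G,\; \mu \neq 1,\;
  1 < d < 2^{\lambda},\; \mu^{d} = 1.
\]
```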
Noel Danz, Oliver Derwisch, Anja Lehmann, Wenzel Puenter, Marvin Stolle, Joshua Ziemann
ePrint Report
Automated contact tracing leverages the ubiquity of smartphones to warn users about an increased exposure risk to COVID-19. In the course of only a few weeks, several cryptographic protocols have been proposed that aim to achieve such contact tracing in a decentralized and privacy-preserving way. Roughly, they let users' phones exchange random-looking pseudonyms that are derived from locally stored keys. If a user is diagnosed, her phone uploads the keys, which allows other users to check for any contact matches. Ultimately this line of work led to Google and Apple including a variant of these protocols in their phones, which is currently used by millions of users. Due to the obvious urgency, these schemes were pushed to deployment without a formal analysis of the achieved security and privacy features. In this work we address this gap and provide the first formal treatment of such decentralized cryptographic contact tracing. We formally define three main properties in a game-based manner: pseudonym and trace unlinkability to guarantee the privacy of users during healthy and infectious periods, and integrity ensuring that triggering false positive alarms is infeasible. A particular focus of our work is on the timed aspects of these schemes, as both keys and pseudonyms are rotated regularly, and we specify different variants of the aforementioned properties depending on the time granularity for which they hold. We analyze a selection of practical protocols (DP-3T, TCN, GAEN) and prove their security under well-defined assumptions.
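A schematic of the key-and-pseudonym life cycle analyzed in the paper; this only illustrates the common pattern, with a hypothetical HMAC-based derivation, not the exact key schedules of DP-3T, TCN, or GAEN, which differ in their KDFs and rotation intervals.

```python
import hmac, hashlib, os

def fresh_day_key() -> bytes:
    # Locally stored key, rotated regularly (e.g., daily).
    return os.urandom(16)

def pseudonym(day_key: bytes, interval: int) -> bytes:
    # Rolling broadcast pseudonym for one interval (e.g., 10-15 minutes);
    # the derivation function here is a hypothetical stand-in.
    msg = b"broadcast-id" + interval.to_bytes(4, "big")
    return hmac.new(day_key, msg, hashlib.sha256).digest()[:16]

def exposure_match(uploaded_keys, observed, intervals_per_day=96) -> bool:
    # After a diagnosis, keys are uploaded; every phone re-derives all
    # pseudonyms and matches them against those it overheard locally.
    return any(pseudonym(k, i) in observed
               for k in uploaded_keys for i in range(intervals_per_day))
```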
Eamonn W. Postlethwaite, Fernando Virdia
ePrint Report
As lattice-based key encapsulation, digital signature, and fully homomorphic encryption schemes near standardisation, ever more focus is being directed to the precise estimation of the security of these schemes. The primal attack reduces key recovery against such schemes to instances of the unique Shortest Vector Problem (uSVP). Dachman-Soled et al. (Crypto 2020) recently proposed a new approach for fine-grained estimation of the cost of the primal attack when using Progressive BKZ for the lattice reduction step. In this paper we review and extend their technique to BKZ 2.0 and provide extensive experimental evidence of its accuracy. Using this technique we also explain results from previous primal attack experiments by Albrecht et al. (Asiacrypt 2017) where attacks succeeded with smaller than expected block sizes. Finally, we use our simulators to reestimate the cost of attacking the three lattice finalists of the NIST Post Quantum Standardisation Process.
Pedro Branco, Nico Döttling, Sihang Pu
ePrint Report
Threshold Private Set Intersection (PSI) allows multiple parties to compute the intersection of their input sets if and only if the intersection is larger than $n-t$, where $n$ is the size of the sets and $t$ is some threshold. The main appeal of this primitive is that, in contrast to standard PSI, known upper-bounds on the communication complexity only depend on the threshold $t$ and not on the sizes of the input sets. In this work, we present a Threshold PSI scheme for $N$ parties with communication complexity $\tilde O(Nt^2)$. At the heart of our construction is a new cardinality-test protocol, which allows the parties to determine if the intersection of their sets is sufficiently large.
Karim Baghery, Zaira Pindado, Carla Ràfols
ePrint Report
Among various NIZK arguments, zk-SNARKs are the most efficient constructions in terms of proof size and verification, which are two critical criteria for large scale applications. Currently, Groth's construction, $\mathsf{Groth16}$, from Eurocrypt'16 is the most efficient and widely deployed one. However, it is proven to achieve only knowledge soundness, which does not prevent attacks from adversaries who have seen simulated proofs. There has been considerable progress in modifying $\mathsf{Groth16}$ to achieve simulation extractability to guarantee the non-malleability of proofs. We revise the Simulation Extractable (SE) version of $\mathsf{Groth16}$ proposed by Bowe and Gabizon that has the most efficient prover and $\mathsf{crs}$ size among the candidates, although it adds a Random Oracle (RO) to the original construction. We present a new version which requires 4 pairings in the verification, instead of 5. We also get rid of the RO at the cost of a collision resistant hash function and a single new element in the $\mathsf{crs}$. Our construction is proven in the $\textit{generic group model}$ and seems to result in the most efficient SE variant of $\mathsf{Groth16}$ in most dimensions.
Kai-Min Chung, Serge Fehr, Yu-Hsuan Huang, Tai-Ning Liao
ePrint Report
We revisit the so-called compressed oracle technique, introduced by Zhandry for analyzing quantum algorithms in the quantum random oracle model (QROM). This technique has proven to be very powerful for reproving known lower bound results, but also for proving new results that seemed to be out of reach before. Despite being very useful, it is still quite cumbersome to actually employ the compressed oracle technique.

To start off with, we offer a concise yet mathematically rigorous exposition of the compressed oracle technique. We adopt a more abstract view than other descriptions found in the literature, which allows us to keep the focus on the relevant aspects. Our exposition easily extends to the parallel-query QROM, where in each query-round the considered quantum oracle algorithm may make several queries to the QROM in parallel. This variant of the QROM allows for a more fine-grained query-complexity analysis of quantum oracle algorithms.

Our main technical contribution is a framework that simplifies the use of (the parallel-query generalization of) the compressed oracle technique for proving query complexity results. With our framework in place, whenever applicable, it is possible to prove quantum query complexity lower bounds by means of purely classical reasoning. More than that, we show that, for typical examples, the crucial classical observations that give rise to the classical bounds are sufficient to conclude the corresponding quantum bounds.

We demonstrate this on a few examples, recovering known results (like the optimality of parallel Grover), but also obtaining new results (like the optimality of parallel BHT collision search). Our main application is to prove hardness of finding a $q$-chain, i.e., a sequence $x_0,x_1,\ldots,x_q$ with the property that $x_i = H(x_{i-1})$ for all $1 \leq i \leq q$, with fewer than $q$ parallel queries.
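A $q$-chain is straightforward to produce classically with $q$ sequential hash evaluations, which is exactly the cost the bound says cannot be beaten; a minimal sketch, with SHA-256 standing in for the random oracle $H$:

```python
import hashlib

# Computes the q-chain x_0, x_1, ..., x_q with x_i = H(x_{i-1}):
# q sequential calls, one per element. The paper's bound says even a
# quantum attacker needs q parallel query-rounds (up to small error).
def q_chain(x0: bytes, q: int) -> list:
    chain = [x0]
    for _ in range(q):
        chain.append(hashlib.sha256(chain[-1]).digest())
    return chain
```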

The above problem of producing a hash chain is of fundamental importance in the context of proofs of sequential work. Indeed, as a concrete application of our new bound, we prove that the ``Simple Proofs of Sequential Work'' proposed by Cohen and Pietrzak remain secure against quantum attacks. Such a proof is not simply a matter of plugging in our new bound; the entire protocol needs to be analyzed in light of a quantum attack, and substantial additional work is necessary. Thanks to our framework, this can now be done with purely classical reasoning.
Ritam Bhaumik, Xavier Bonnetain, André Chailloux, Gaëtan Leurent, María Naya-Plasencia, André Schrottenloher, Yannick Seurin
ePrint Report
It was long thought that symmetric cryptography was only mildly affected by quantum attacks, and that doubling the key length was sufficient to restore security. However, recent works have shown that Simon's quantum period finding algorithm breaks a large number of MAC and authenticated encryption algorithms when the adversary can query the MAC/encryption oracle with a quantum superposition of messages. In particular, the OCB authenticated encryption mode is broken in this setting, and no quantum-secure mode is known with the same efficiency (rate-one and parallelizable).

In this paper we generalize the previous attacks, show that a large class of OCB-like schemes is unsafe against superposition queries, and discuss the quantum security notions for authenticated encryption modes. We propose a new rate-one parallelizable mode named QCB, inspired by TAE and OCB, and prove its security against quantum superposition queries.
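For context, the quantum subroutine behind these superposition attacks is Simon's period-finding problem:

```latex
% Simon's problem: f : \{0,1\}^n \to \{0,1\}^n hides a period s \neq 0,
\[
  f(x) = f(y) \;\Longleftrightarrow\; y \in \{x,\; x \oplus s\},
\]
% and Simon's algorithm recovers s with O(n) superposition queries,
% while classical algorithms require exponentially many.
```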

19 October 2020

Joppe W. Bos, Joost Renes, Christine van Vredendaal
ePrint Report
We discuss how to efficiently utilize contemporary co-processors designed for public-key cryptography for polynomial multiplication in lattice-based cryptography. More precisely, we focus on polynomial multiplication in rings of the form $\mathbb{Z}[X]/(X^n + 1)$. We make use of the roots of unity in this ring and construct the Kronecker+ algorithm, a variant of Nussbaumer designed to combine especially well with the Kronecker substitution method. Kronecker+ is a symbolic NTT of variable depth that can be considered a generalization of Harvey's multipoint Kronecker substitution. Compared to straightforwardly combining Kronecker substitution with the state-of-the-art symbolic NTT algorithms Nussbaumer or Schönhage-Strassen, we halve the number or the bit size of the multiplied integers, respectively. Kronecker+ directly applies to the polynomial multiplication operations in the lattice-based cryptographic schemes Kyber and Saber, and we provide a detailed implementation analysis for the latter. This analysis highlights the potential of the Kronecker+ technique for commonly available multiplier lengths on contemporary co-processors.
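For readers new to the underlying trick, plain Kronecker substitution packs polynomial coefficients into one large integer so that a single big-integer multiplier computes the whole product; a minimal sketch, assuming nonnegative coefficients and `ell` chosen large enough that product coefficients cannot overflow their slots:

```python
# Kronecker substitution: evaluate both polynomials at X = 2^ell,
# multiply once as integers, then unpack coefficients from the slots.
def kronecker_mul(a, b, ell):
    A = sum(c << (ell * i) for i, c in enumerate(a))   # A = a(2^ell)
    B = sum(c << (ell * i) for i, c in enumerate(b))   # B = b(2^ell)
    C = A * B                                          # one integer product
    mask = (1 << ell) - 1
    return [(C >> (ell * i)) & mask for i in range(len(a) + len(b) - 1)]

# (1 + 2X)(3 + 4X) = 3 + 10X + 8X^2, using 16-bit coefficient slots.
assert kronecker_mul([1, 2], [3, 4], 16) == [3, 10, 8]
```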
İrem Keskinkurt Paksoy, Murat Cenk
ePrint Report
Lattice-based NIST PQC finalists need efficient multiplication in $\mathbb{Z}_q[x]/(f(x))$. Multiplication in this ring can be performed very efficiently via the number theoretic transform (NTT), as done in CRYSTALS-Kyber, if the parameters of the scheme allow it. If NTT is not supported, other multiplication algorithms must be employed. For example, if the modulus $q$ of the scheme is a power of two, as in Saber and NTRU, then NTT can not be used directly. In this case, Karatsuba and Toom-Cook methods together with modular reduction are commonly used for multiplication in this ring. In this paper, we show that the Toeplitz matrix-vector product (TMVP) representation of modular polynomial multiplication yields better results than Karatsuba and Toom-Cook methods. We present three- and four-way TMVP formulas that we derive from three- and four-way Toom-Cook algorithms, respectively. We use the four-way TMVP formula to develop an algorithm for multiplication in the ring $\mathbb{Z}_{2^m}[x]/(x^{256}+1)$. We implement the proposed algorithm on the ARM Cortex-M4 microcontroller and apply it to Saber, which is one of the lattice-based finalists of the NIST PQC competition. We compare the results to previous implementations. The TMVP-based multiplication algorithm we propose is $20.83\%$ faster than the previous algorithm that uses a combination of Toom-Cook, Karatsuba, and schoolbook methods. Our algorithm also speeds up the key generation, encapsulation, and decapsulation algorithms of all variants of Saber. The speedups vary between $4.3\%$ and $39.8\%$. Moreover, our algorithm requires less memory than the others, except for the memory-optimized implementation of Saber.
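To see why TMVP applies here, note that multiplication by a fixed polynomial modulo $x^{n}+1$ is a negacyclic convolution, whose matrix is Toeplitz; a naive sketch follows (the paper's contribution is then to evaluate such products with three- and four-way Toom-Cook-derived TMVP formulas instead of the schoolbook loop below):

```python
# Multiplication in Z_q[x]/(x^n + 1) as a Toeplitz matrix-vector product:
# row i, column j of the matrix is a[i-j], negated when the index wraps.
def negacyclic_mul_tmvp(a, b, q):
    n = len(a)
    T = [[a[i - j] if j <= i else -a[n + i - j] for j in range(n)]
         for i in range(n)]                     # Toeplitz: depends on i - j
    return [sum(T[i][j] * b[j] for j in range(n)) % q for i in range(n)]

# Check: x * x = x^2 = -1 mod (x^2 + 1), i.e., q - 1 in the constant term.
assert negacyclic_mul_tmvp([0, 1], [0, 1], 7) == [6, 0]
```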