## CryptoDB

### Daniel J. Bernstein

#### Affiliation: University of Illinois at Chicago

#### Publications

**Year**

**Venue**

**Title**

2019

TCHES

Fast constant-time gcd computation and modular inversion
📺
Abstract

This paper introduces streamlined constant-time variants of Euclid’s algorithm, both for polynomial inputs and for integer inputs. As concrete applications, this paper saves time in (1) modular inversion for Curve25519, which was previously believed to be handled much more efficiently by Fermat’s method, and (2) key generation for the ntruhrss701 and sntrup4591761 lattice-based cryptosystems.

2019

EUROCRYPT

Quantum Circuits for the CSIDH: Optimizing Quantum Evaluation of Isogenies
📺
Abstract

Choosing safe post-quantum parameters for the new CSIDH isogeny-based key-exchange system requires concrete analysis of the cost of quantum attacks. The two main contributions to attack cost are the number of queries in hidden-shift algorithms and the cost of each query. This paper analyzes algorithms for each query, introducing several new speedups while showing that some previous claims were too optimistic for the attacker. This paper includes a full computer-verified simulation of its main algorithm down to the bit-operation level.

2019

ASIACRYPT

Decisional Second-Preimage Resistance: When Does SPR Imply PRE?
Abstract

There is a well-known gap between second-preimage resistance and preimage resistance for length-preserving hash functions. This paper introduces a simple concept that fills this gap. One consequence of this concept is that tight reductions can remove interactivity for multi-target length-preserving preimage problems, such as the problems that appear in analyzing hash-based signature systems. Previous reduction techniques applied to only a negligible fraction of all length-preserving hash functions, presumably excluding all off-the-shelf hash functions.

2017

CHES

Gimli : A Cross-Platform Permutation
Abstract

This paper presents Gimli, a 384-bit permutation designed to achieve high security with high performance across a broad range of platforms, including 64-bit Intel/AMD server CPUs, 64-bit and 32-bit ARM smartphone CPUs, 32-bit ARM microcontrollers, 8-bit AVR microcontrollers, FPGAs, ASICs without side-channel protection, and ASICs with side-channel protection.

2017

CHES

Sliding Right into Disaster: Left-to-Right Sliding Windows Leak
Abstract

It is well known that constant-time implementations of modular exponentiation cannot use sliding windows. However, software libraries such as Libgcrypt, used by GnuPG, continue to use sliding windows. It is widely believed that, even if the complete pattern of squarings and multiplications is observed through a side-channel attack, the number of exponent bits leaked is not sufficient to carry out a full key-recovery attack against RSA. Specifically, 4-bit sliding windows leak only 40% of the bits, and 5-bit sliding windows leak only 33% of the bits.In this paper we demonstrate a complete break of RSA-1024 as implemented in Libgcrypt. Our attack makes essential use of the fact that Libgcrypt uses the left-to-right method for computing the sliding-window expansion. We show for the first time that the direction of the encoding matters: the pattern of squarings and multiplications in left-to-right sliding windows leaks significantly more information about the exponent than right-to-left. We show how to extend the Heninger-Shacham algorithm for partial key reconstruction to make use of this information and obtain a very efficient full key recovery for RSA-1024. For RSA-2048 our attack is efficient for 13% of keys.

2010

EPRINT

Starfish on Strike
Abstract

This paper improves the price-performance ratio of ECM, the elliptic-curve method of integer factorization. In particular, this paper constructs "a = -1" twisted Edwards curves having Q-torsion group Z/2 x Z/4, Z/8, or Z/6 and having a known non-torsion point; demonstrates that, compared to the curves used in previous ECM implementations, some of the new curves are more effective at finding small primes despite being faster; and precomputes particularly effective curves for several specific sizes of primes.

2010

EPRINT

Wild McEliece
Abstract

The original McEliece cryptosystem uses length-n codes over F_2 with
dimension >=n-mt efficiently correcting t errors where 2^m>=n.
This paper presents a generalized cryptosystem that uses length-n codes
over small finite fields F_q with dimension >=n-m(q-1)t efficiently
correcting floor(qt/2) errors where q^m>=n.
Previously proposed cryptosystems with the same length and dimension
corrected only floor((q-1)t/2) errors for q>=3.
This paper also presents list-decoding algorithms that efficiently
correct even more errors for the same codes over F_q.
Finally, this paper shows that the increase from floor((q-1)t/2) errors
to more than floor(qt/2) errors allows considerably smaller keys
to achieve the same security level against all known attacks.

2010

EPRINT

Type-II Optimal Polynomial Bases
Abstract

In the 1990s and early 2000s several papers investigated the relative merits of polynomial-basis and normal-basis computations for $\F_{2^n}$. Even for particularly squaring-friendly applications, such as implementations of Koblitz curves, normal bases fell behind in performance unless a type-I normal basis existed for $\F_{2^n}$.
In 2007 Shokrollahi proposed a new method of multiplying in a type-II normal basis. Shokrollahi's method efficiently transforms the normal-basis multiplication into a single multiplication of two size-$(n+1)$ polynomials.
This paper speeds up Shokrollahi's method in several ways. It first presents a simpler algorithm that uses only size-$n$ polynomials. It then explains how to reduce the transformation cost by dynamically switching to a `type-II optimal polynomial basis' and by using a new reduction strategy for multiplications that produce output in type-II polynomial basis.
As an illustration of its improvements, this paper explains in detail how the multiplication overhead in Shokrollahi's original method has been reduced by a factor of $1.4$ in a major cryptanalytic computation, the ongoing attack on the ECC2K-130 Certicom challenge. The resulting overhead is also considerably smaller than the overhead in a traditional low-weight-polynomial-basis approach. This is the first state-of-the-art binary-elliptic-curve computation in which type-II bases have been shown to outperform traditional low-weight polynomial bases.

2008

EPRINT

Twisted Edwards Curves
Abstract

This paper introduces ``twisted Edwards curves,''
a generalization of the recently introduced Edwards curves;
shows that twisted Edwards curves include more curves over finite fields,
and in particular every elliptic curve in Montgomery form;
shows how to cover even more curves via isogenies;
presents fast explicit formulas for twisted Edwards curves
in projective and inverted coordinates;
and shows that twisted Edwards curves save time
for many curves that were already expressible as Edwards curves.

2008

EPRINT

ECM using Edwards curves
Abstract

This paper introduces GMP-EECM, a fast implementation of the elliptic-curve
method of factoring integers.
GMP-EECM is based on, but faster than, the well-known GMP-ECM software.
The main changes are as follows:
(1) use Edwards curves instead of Montgomery curves;
(2) use twisted inverted Edwards coordinates;
(3) use signed-sliding-window addition chains;
(4) batch primes to increase the window size;
(5) choose curves with small parameters $a,d,X_1,Y_1,Z_1$;
(6) choose curves with larger torsion.

2008

EPRINT

New AES software speed records
Abstract

This paper presents new speed records for AES software,taking advantage of (1) architecture-dependent reduction of instructions used to compute AES and (2) microarchitecture-dependent reduction of cycles used for those instructions.
A wide variety of common CPU architectures---amd64, ppc32, sparcv9, and x86---are discussed in detail, along with several specific microarchitectures.

2008

EPRINT

Binary Edwards Curves
Abstract

This paper presents a new shape for ordinary elliptic curves over fields of characteristic 2. Using the new shape, this paper presents the first complete addition formulas for binary elliptic curves, i.e., addition formulas that work for all pairs of input points, with no exceptional cases. If n >= 3 then the complete curves cover all isomorphism classes of ordinary elliptic curves over F_2^n.
This paper also presents dedicated doubling formulas for these curves using 2M + 6S + 3D, where M is the cost of a field multiplication, S is the cost of a field squaring, and D is the cost of multiplying by a curve parameter. These doubling formulas are also the first complete doubling formulas in the literature, with no exceptions for the neutral element, points of order 2, etc.
Finally, this paper presents complete formulas for differential addition, i.e., addition of points with known difference. A differential addition and doubling, the basic step in a Montgomery
ladder, uses 5M + 4S + 2D when the known difference is given in affine form.

2008

EPRINT

Attacking and defending the McEliece cryptosystem
Abstract

This paper presents several improvements to Stern's attack on the McEliece cryptosystem and achieves results considerably better than Canteaut et al. This paper shows that the system with the originally proposed parameters can be broken in just 1400 days by a single 2.4GHz Core 2 Quad CPU,or 7 days by a cluster of 200 CPUs. This attack has been implemented and is now in progress.
This paper proposes new parameters for the McEliece and Niederreiter cryptosystems achieving standard levels of security against all known attacks. The new parameters take account of the improved attack; the recent introduction of list decoding for binary Goppa codes; and the possibility of choosing code lengths that are not a power of 2. The resulting public-key sizes are considerably smaller than previous parameter choices for the same level of security.

2008

EPRINT

ECM on Graphics Cards
Abstract

This paper reports record-setting performance for the elliptic-curve method of integer factorization: for example, 926.11 curves/second
for ECM stage 1 with B1=8192 for 280-bit integers on a single PC.The state-of-the-art GMP-ECM software handles 124.71 curves/second
for ECM stage 1 with B1=8192 for 280-bit integers using all four cores of a 2.4 GHz Core 2 Quad Q6600.
The extra speed takes advantage of extra hardware,specifically two NVIDIA GTX 295 graphics cards,using a new ECM implementation introduced in this paper.Our implementation uses Edwards curves, relies on new parallel addition formulas, and is carefully tuned for the highly parallel GPU architecture.On a single GTX 295 the implementation performs 41.88 million modular multiplications per second for a general 280-bit modulus.GMP-ECM, using all four cores of a Q6600, performs 13.03 million modular multiplications per second.
This paper also reports speeds on other graphics processors: for example, 2414 280-bit elliptic-curve scalar multiplications per second
on an older NVIDIA 8800 GTS (G80), again for a general 280-bit modulus.For comparison, the CHES 2008 paper ``Exploiting the Power of GPUs for Asymmetric Cryptography'' reported 1412 elliptic-curve scalar multiplications per second on the same graphics processor
despite having fewer bits in the scalar (224 instead of 280),
fewer bits in the modulus (224 instead of 280), and a special modulus (2^{224}-2^{96}+1).

2007

EPRINT

Inverted Edwards coordinates
Abstract

Edwards curves have attracted great interest for several reasons.
When curve parameters are chosen properly, the addition formulas use
only $10M+1S$. The formulas are {\it strongly unified}, i.e., work
without change for doublings; even better, they are {\it complete},
i.e., work without change for all inputs. Dedicated doubling formulas
use only $3M+4S$, and dedicated tripling formulas use only $9M+4S$.
This paper introduces {\it inverted Edwards coordinates}. Inverted
Edwards coordinates $(X_1:Y_1:Z_1)$ represent the affine point
$(Z_1/X_1,Z_1/Y_1)$ on an Edwards curve; for comparison, standard
Edwards coordinates $(X_1:Y_1:Z_1)$ represent the affine point
$(X_1/Z_1,Y_1/Z_1)$.
This paper presents addition formulas for inverted Edwards coordinates
using only $9M+1S$. The formulas are not complete but still are
strongly unified. Dedicated doubling formulas use only $3M+4S$, and
dedicated tripling formulas use only $9M+4S$. Inverted Edwards
coordinates thus save $1M$ for each addition, without slowing down
doubling or tripling.

2007

EPRINT

Optimizing double-base elliptic-curve single-scalar multiplication
Abstract

This paper analyzes the best speeds that can be obtained for single-scalar multiplication with variable base point by combining a huge range of options:
? many choices of coordinate systems and formulas for individual group operations, including new formulas for tripling on Edwards curves;
? double-base chains with many different doubling/tripling ratios, including standard base-2 chains as an extreme case;
? many precomputation strategies, going beyond Dimitrov, Imbert, Mishra (Asiacrypt 2005) and Doche and Imbert (Indocrypt 2006).
The analysis takes account of speedups such as S-M tradeoffs and includes recent advances such as inverted Edwards coordinates.
The main conclusions are as follows. Optimized precomputations and triplings save time for single-scalar multiplication in Jacobian coordinates, Hessian curves, and tripling-oriented Doche/Icart/Kohel curves. However, even faster single-scalar multiplication is possible in Jacobi intersections, Edwards curves, extended Jacobi-quartic coordinates, and inverted Edwards coordinates, thanks to extremely fast doublings and additions; there is no evidence that double-base chains are worthwhile for the fastest curves. Inverted Edwards coordinates are the speed leader.

2007

EPRINT

Analysis and optimization of elliptic-curve single-scalar multiplication
Abstract

Let $P$ be a point on an elliptic curve over a finite field of large characteristic.
Exactly how many points $2P,3P,5P,7P,9P,\ldots,mP$
should be precomputed in a sliding-window computation of $nP$?
Should some or all of the points be converted to affine form,
and at which moments during the precomputation should these conversions take place?
Exactly how many field multiplications are required for the resulting computation of $nP$?
The answers depend on the size of $n$,
the $\inversions/\mults$ ratio,
the choice of curve shape,
the choice of coordinate system,
and the choice of addition formulas.
This paper presents answers that, compared to previous analyses,
are more carefully optimized and cover a much wider range of situations.

#### Program Committees

- CHES 2019
- FSE 2019
- CHES 2018
- Asiacrypt 2018
- CHES 2017
- CHES 2016
- FSE 2015
- Asiacrypt 2015
- CHES 2015
- FSE 2014
- Asiacrypt 2014
- CHES 2014
- CHES 2013
- CHES 2012
- Eurocrypt 2011
- CHES 2011
- CHES 2010
- FSE 2010
- CHES 2009
- Asiacrypt 2009
- FSE 2009
- Asiacrypt 2008
- CHES 2008

#### Coauthors

- Pol Van Aubel (1)
- Jens Bauch (1)
- Mihir Bellare (1)
- Peter Birkner (4)
- Joachim Breitner (1)
- Leon Groot Bruinderink (1)
- Yun-An Chang (1)
- Owen Chia-Hsin Chen (1)
- Tien-Ren Chen (2)
- Jiun-Ming Chen (1)
- Chen-Mou Cheng (3)
- Li-Ping Chou (1)
- Tung Chou (3)
- Chitchanok Chuengsatiansup (6)
- Niels Duif (1)
- Reza Rezaeian Farashahi (2)
- Daniel Genkin (1)
- Nadia Heninger (2)
- Daira Hopwood (1)
- Andreas Hülsing (4)
- Simon Josefsson (1)
- Marc Joye (1)
- David Kohel (1)
- Stefan Kölbl (1)
- Tanja Lange (34)
- Stefan Lucks (1)
- Chloe Martindale (1)
- Pedro Maat Costa Massolino (1)
- Florian Mendel (1)
- Kashif Nawaz (1)
- Ruben Niederhagen (5)
- Lorenz Panny (1)
- Louiza Papachristodoulou (1)
- Christiane Peters (6)
- Tobias Schneider (1)
- Michael Schneider (1)
- Peter Schwabe (11)
- Nicko van Someren (1)
- François-Xavier Standaert (1)
- Stefano Tessaro (1)
- Yosuke Todo (1)
- Henry de Valence (1)
- Christine van Vredendaal (1)
- Benoît Viguier (1)
- Christine van Vredendaal (3)
- Zooko Wilcox-O'Hearn (1)
- Bo-Yin Yang (6)
- Yuval Yarom (1)