## CryptoDB

### Jean-Luc Beuchat

#### Publications

Year
Venue
Title
2010
EPRINT
We propose compact architectures of the SHA-$3$ candidates BLAKE-32 and BLAKE-64 for several FPGA families. We harness the intrinsic parallelism of the algorithm to interleave the computation of four instances of the $G_i$ function. This approach allows us to design an Arithmetic and Logic Unit with four pipeline stages and to achieve high clock frequencies. With careful scheduling, we completely avoid pipeline bubbles. For the time being, the designs presented in this work are the most compact ones for any of the SHA-3 candidates. We show for instance that a fully autonomous implementation of BLAKE-32 on a Xilinx Virtex-5 device requires 56 slices and two memory blocks.
2010
EPRINT
We propose a compact architecture of the SHA-3 candidate ECHO for the Virtex-5 FPGA family. Our architecture is built around a 8-bit datapath. We show that a careful organization of the chaining variable and the message block in the register file allows one to design a compact control unit based on a 4-bit counter, an 8-bit counter, and a simple Finite State Machine. A fully autonomous implementation of ECHO on a Xilinx Virtex-5 FPGA requires $127$ slices and a single memory block to store the internal state, and achieves a throughput of $72$Mbps.
2010
EPRINT
This paper describes the design of a fast software library for the computation of the optimal ate pairing on a Barreto--Naehrig elliptic curve. Our library is able to compute the optimal ate pairing over a $254$-bit prime field $\mathbb{F}_{p}$, in just $2.63$ million of clock cycles on a single core of an Intel Core i7 $2.8$GHz processor, which implies that the pairing computation takes $0.942$msec. We are able to achieve this performance by a careful implementation of the base field arithmetic through the usage of the customary Montgomery multiplier for prime fields. The prime field is constructed via the Barreto--Naehrig polynomial parametrization of the prime $p$ given as, $p = 36t^4 +36t^3 +24t^2 +6t+1$, with $t = 2^{62} - 2^{54} + 2^{44}$. This selection of $t$ allows us to obtain important savings for both the Miller loop as well as the final exponentiation steps of the optimal ate pairing.
2009
CHES
2008
EPRINT
In this article we propose a study of the modified Tate pairing in characteristics two and three. Starting from the $\eta_T$ pairing introduced by Barreto {\em et al.} (Des Codes Crypt, 2007), we detail various algorithmic improvements in the case of characteristic two. As far as characteristic three is concerned, we refer to the survey by Beuchat {\em et al.} (ePrint 2007-417). We then show how to get back to the modified Tate pairing at almost no extra cost. Finally, we explore the trade-offs involved in the hardware implementation of this pairing for both characteristics two and three. From our experiments, characteristic three appears to have a slight advantage over characteristic two.
2008
EPRINT
We present a subquadratic ternary field multiplier based on the combination of several variants of the Karatsuba-Ofman scheme recently published. Since one of the most relevant applications for this kind of multipliers is pairing computation, where several field multiplications need to be computed at once, we decided to design a $k$-stage pipeline structure for $k=1,\ldots,4$, where each stage is composed of a 49-trit polynomial multiplier unit. That architecture can compute an average of $k$ field multiplications every three clock cycles, which implies that our four-stage pipeline design can perform more than one field multiplication per clock cycle. When implemented in a Xilinx Virtex V XC5VLX330 FPGA device, this multiplier can compute one field multiplication over \gf($3^{97}$) in just $11.47$ns.
2008
EPRINT
Since their introduction in constructive cryptographic applications, pairings over (hyper)elliptic curves are at the heart of an ever increasing number of protocols. As they rely critically on efficient algorithms and implementations of pairing primitives, the study of hardware accelerators became an active research area. In this paper, we propose two coprocessors for the reduced $\eta_T$ pairing introduced by Barreto {\it et al.} as an alternative means of computing the Tate pairing on supersingular elliptic curves. We prototyped our architectures on FPGAs. According to our place-and-route results, our coprocessors compare favorably with other solutions described in the open literature. We also present the first ASIC implementation of the reduced $\eta_T$ pairing.
2007
CHES
2007
EPRINT
Since the introduction of pairings over (hyper)elliptic curves in constructive cryptographic applications, an ever increasing number of protocols based on pairings have appeared in the literature. Software implementations being rather slow, the study of hardware architectures became an active research area. Beuchat et al. proposed for instance a coprocessor which computes the characteristic three $\eta_T$ pairing, from which the Tate pairing can easily be derived, in $33$\,$\mu$s on a Cyclone II FPGA. However, a final exponentiation is required to ensure a unique output value and the authors proposed to supplement their $\eta_T$ pairing accelerator with a coprocessor for exponentiation. Thus, the challenge consists in designing the smallest possible piece of hardware able to perform this task in less than $33$\,$\mu$s on a Cyclone~II device. In this paper, we propose a novel arithmetic operator implementing addition, cubing, and multiplication over $\mathbb{F}_{3^{97}}$ and show that a coprocessor based on a single such operator meets this timing constraint.
2007
EPRINT
Since their introduction in constructive cryptographic applications, pairings over (hyper)elliptic curves are at the heart of an ever increasing number of protocols. Software implementations being rather slow, the study of hardware architectures became an active research area. In this paper, we first study an accelerator for the $\eta_T$ pairing over $\mathbb{F}_3[x]/(x^{97}+x^{12}+2)$. Our architecture is based on a unified arithmetic operator which performs addition, multiplication, and cubing over $\mathbb{F}_{3^{97}}$. This design methodology allows us to design a compact coprocessor ($1888$ slices on a Virtex-II Pro~$4$ FPGA) which compares favorably with other solutions described in the open literature. We then describe ways to extend our approach to any characteristic and any extension field.
2007
EPRINT
We describe further improvements of the $\eta_T$ pairing algorithm in characteristic three. Our approach combines the loop unrolling technique introduced by Granger {\em et. al} for the Duursma-Lee algorithm, and a novel algorithm for multiplication over $\mathbb{F}_{3^{6m}}$ proposed by Gorla {\em et al.} at SAC 2007. For $m=97$, the refined algorithm reduces the number of multiplications over $\mathbb{F}_{3^m}$ from $815$ to $692$.
2007
EPRINT
Since their introduction in constructive cryptographic applications, pairings over (hyper)elliptic curves are at the heart of an ever increasing number of protocols. Software implementations being rather slow, the study of hardware architectures became an active research area. In this paper, we discuss several algorithms to compute the $\eta_T$ pairing in characteristic three and suggest further improvements. These algorithms involve addition, multiplication, cubing, inversion, and sometimes cube root extraction over $\mathbb{F}_{3^m}$. We propose a hardware accelerator based on a unified arithmetic operator able to perform the operations required by a given algorithm. We describe the implementation of a compact coprocessor for the field $\mathbb{F}_{3^{97}}$ given by $\mathbb{F}_3[x]/(x^{97}+x^{12}+2)$, which compares favorably with other solutions described in the open literature.
2006
EPRINT
In this paper, we propose a modified $\eta_T$ pairing algorithm in characteristic three which does not need any cube root extraction. We also discuss its implementation on a low cost platform which hosts an Altera Cyclone~II FPGA device. Our pairing accelerator is ten times faster than previous known FPGA implementations in characteristic three.

Asiacrypt 2014
CHES 2010
CHES 2009