## CryptoDB

### Jean-Luc Beuchat

#### Publications

**Year**

**Venue**

**Title**

2010

EPRINT

Compact Implementations of BLAKE-32 and BLAKE-64 on FPGA
Abstract

We propose compact architectures of the SHA-$3$ candidates BLAKE-32 and BLAKE-64 for several FPGA families. We harness the intrinsic parallelism of the algorithm to interleave the computation of four instances of the $G_i$ function. This approach allows us to design an Arithmetic and Logic Unit with four pipeline stages and to achieve high clock frequencies. With careful scheduling, we completely avoid pipeline bubbles. For the time being, the designs presented in this work are the most compact ones for any of the SHA-3 candidates. We show for instance that a fully autonomous implementation of BLAKE-32 on a Xilinx Virtex-5 device requires 56 slices and two memory blocks.

2010

EPRINT

A Compact FPGA Implementation of the SHA-3 Candidate ECHO
Abstract

We propose a compact architecture of the SHA-3 candidate ECHO for the Virtex-5 FPGA family. Our architecture is built around a 8-bit datapath. We show that a careful organization of the chaining variable and the message block in the register file allows one to design a compact control unit based on a 4-bit counter, an 8-bit counter, and a simple Finite State Machine.
A fully autonomous implementation of ECHO on a Xilinx Virtex-5 FPGA requires $127$ slices and a single memory block to store the internal state, and achieves a throughput of $72$Mbps.

2010

EPRINT

High-Speed Software Implementation of the Optimal Ate Pairing over Barreto-Naehrig Curves
Abstract

This paper describes the design of a fast software library for the computation of the optimal ate pairing on a Barreto--Naehrig elliptic curve. Our library is able to compute the optimal ate pairing over a $254$-bit prime field $\mathbb{F}_{p}$, in just $2.63$ million of clock cycles on a single core of an Intel Core i7 $2.8$GHz processor, which implies that the pairing computation takes $0.942$msec. We are able to achieve this performance by a careful implementation of the base field arithmetic through the usage of the customary Montgomery multiplier for prime fields. The prime field is constructed via the Barreto--Naehrig polynomial parametrization of the prime $p$ given as, $p = 36t^4 +36t^3 +24t^2 +6t+1$, with $t = 2^{62} - 2^{54} + 2^{44}$. This selection of $t$ allows us to obtain important savings for both the Miller loop as well as the final exponentiation steps of the optimal ate pairing.

2009

CHES

2008

EPRINT

A Comparison Between Hardware Accelerators for the Modified Tate Pairing over $\mathbb{F}_{2^m}$ and $\mathbb{F}_{3^m}$
Abstract

In this article we propose a study of the modified Tate pairing in characteristics two and three. Starting from the $\eta_T$ pairing introduced by Barreto {\em et al.} (Des Codes Crypt, 2007), we detail various algorithmic improvements in the case of characteristic two. As far as characteristic three is concerned, we refer to the survey by Beuchat {\em et al.} (ePrint 2007-417). We then show how to get back to the modified Tate pairing at almost no extra cost. Finally, we explore the trade-offs involved in the hardware implementation of this pairing for both characteristics two and three. From our experiments, characteristic three appears to have a slight advantage over characteristic two.

2008

EPRINT

A Pipelined Karatsuba-Ofman Multiplier over GF($3^{97}$) Amenable for Pairing Computation
Abstract

We present a subquadratic ternary field multiplier based on the combination of several variants of the Karatsuba-Ofman scheme
recently published. Since one of the most relevant applications for this kind of multipliers is pairing computation,
where several field multiplications need to be computed at once, we decided to design a $k$-stage pipeline
structure for $k=1,\ldots,4$, where each stage is composed of a 49-trit polynomial multiplier unit. That
architecture can compute an average of $k$ field multiplications every three clock cycles, which implies that our
four-stage pipeline design can perform more than one field multiplication per clock cycle. When implemented in
a Xilinx Virtex V XC5VLX330 FPGA device, this multiplier can compute one field multiplication over \gf($3^{97}$)
in just $11.47$ns.

2008

EPRINT

FPGA and ASIC Implementations of the $\eta_T$ Pairing in Characteristic Three
Abstract

Since their introduction in constructive cryptographic applications, pairings over (hyper)elliptic curves are at the heart of an ever increasing number of protocols. As they rely critically on efficient algorithms and implementations of pairing primitives, the study of hardware accelerators became an active research area.
In this paper, we propose two coprocessors for the reduced $\eta_T$ pairing introduced by Barreto {\it et al.} as an alternative means of computing the Tate pairing on supersingular elliptic curves. We prototyped our architectures on FPGAs. According to our place-and-route results, our coprocessors compare favorably with other solutions described in the open literature. We also present the first ASIC implementation of the reduced $\eta_T$ pairing.

2007

EPRINT

A Coprocessor for the Final Exponentiation of the $\eta_T$ Pairing in Characteristic Three
Abstract

Since the introduction of pairings over (hyper)elliptic curves in
constructive cryptographic applications, an ever increasing number of
protocols based on pairings have appeared in the literature. Software
implementations being rather slow, the study of hardware architectures
became an active research area. Beuchat et al. proposed for
instance a coprocessor which computes the characteristic three
$\eta_T$ pairing, from which the Tate pairing can easily be derived,
in $33$\,$\mu$s on a Cyclone II FPGA. However, a final exponentiation
is required to ensure a unique output value and the authors proposed
to supplement their $\eta_T$ pairing accelerator with a coprocessor
for exponentiation. Thus, the challenge consists in designing the
smallest possible piece of hardware able to perform this task in less
than $33$\,$\mu$s on a Cyclone~II device. In this paper, we propose a
novel arithmetic operator implementing addition, cubing, and
multiplication over $\mathbb{F}_{3^{97}}$ and show that a coprocessor
based on a single such operator meets this timing constraint.

2007

EPRINT

Arithmetic Operators for Pairing-Based Cryptography
Abstract

Since their introduction in constructive cryptographic applications,
pairings over (hyper)elliptic curves are at the heart of an ever
increasing number of protocols. Software implementations being rather
slow, the study of hardware architectures became an active research
area. In this paper, we first study an accelerator for the $\eta_T$
pairing over $\mathbb{F}_3[x]/(x^{97}+x^{12}+2)$. Our architecture is
based on a unified arithmetic operator which performs addition,
multiplication, and cubing over $\mathbb{F}_{3^{97}}$. This design
methodology allows us to design a compact coprocessor ($1888$ slices
on a Virtex-II Pro~$4$ FPGA) which compares favorably with other
solutions described in the open literature. We then describe ways to
extend our approach to any characteristic and any extension field.

2007

EPRINT

A Refined Algorithm for the $\eta_T$ Pairing Calculation in Characteristic Three
Abstract

We describe further improvements of the $\eta_T$ pairing algorithm in
characteristic three. Our approach combines the loop unrolling
technique introduced by Granger {\em et. al} for the Duursma-Lee
algorithm, and a novel algorithm for multiplication over
$\mathbb{F}_{3^{6m}}$ proposed by Gorla {\em et al.} at SAC 2007. For
$m=97$, the refined algorithm reduces the number of multiplications
over $\mathbb{F}_{3^m}$ from $815$ to $692$.

2007

EPRINT

Algorithms and Arithmetic Operators for Computing the $\eta_T$ Pairing in Characteristic Three
Abstract

Since their introduction in constructive cryptographic applications,
pairings over (hyper)elliptic curves are at the heart of an ever
increasing number of protocols. Software implementations being rather
slow, the study of hardware architectures became an active research
area.
In this paper, we discuss several algorithms to compute the $\eta_T$
pairing in characteristic three and suggest further improvements.
These algorithms involve addition, multiplication, cubing, inversion,
and sometimes cube root extraction over $\mathbb{F}_{3^m}$. We propose
a hardware accelerator based on a unified arithmetic operator able to
perform the operations required by a given algorithm. We describe the
implementation of a compact coprocessor for the field
$\mathbb{F}_{3^{97}}$ given by $\mathbb{F}_3[x]/(x^{97}+x^{12}+2)$,
which compares favorably with other solutions described in the open
literature.

2006

EPRINT

An Algorithm for the $\eta_T$ Pairing Calculation in Characteristic Three and its Hardware Implementation
Abstract

In this paper, we propose a modified $\eta_T$ pairing algorithm in
characteristic three which does not need any cube root extraction. We
also discuss its implementation on a low cost platform which hosts an
Altera Cyclone~II FPGA device. Our pairing accelerator is ten times
faster than previous known FPGA implementations in characteristic
three.

#### Program Committees

- Asiacrypt 2014
- CHES 2010
- CHES 2009

#### Coauthors

- Nicolas Brisebarre (5)
- Nidia Cortez-Duarte (1)
- Jérémie Detrey (5)
- Jorge Enrique González Díaz (1)
- Hiroshi Doi (1)
- Nicolas Estibals (1)
- Kaoru Fujita (1)
- Atsuo Inomata (1)
- Akira Kanaoka (1)
- Masayoshi Katouno (1)
- Masahiro Mambo (1)
- Shigeo Mitsunari (1)
- Takeshi Okamoto (1)
- Eiji Okamoto (13)
- Francisco Rodríguez-Henríquez (4)
- Takaaki Shiga (1)
- Masaaki Shirase (5)
- Ryuji Soga (1)
- Tsuyoshi Takagi (5)
- Tadanori Teruya (1)
- Ananda Vithanage (1)
- Hiroyasu Yamamoto (1)
- Teppei Yamazaki (2)